This page contains responses that various class members have made to the assigned readings. Only responses that have been explicitly shared with the class are shown.
Of the two papers, I decided to focus more on the Intelligent Systems paper because of its focus on ACL comparisons. The author makes a clear distinction between "CORBA" and ACL as
Comments on Agent Communication Languages: First, I want to point out that I was amazed the moment I realized how important language has been in the interactions among the members of a community, and even more so when I read about how complicated and demanding it has been to come to agreements on communication among agents. Through this article I got a sense of how much effort has gone into the standardization of ACLs, and of the role that many different public and private organizations have played in these enterprises. However, it is sad, though no surprise, that the detonator of such efforts is the military industry. This article, together with the research I have conducted, brought me to the conclusion that FIPA ACL will keep gaining preference over KQML with every passing day.

Concerning the first layer of the common-language problem, through this article I realized the importance of the OMG, not only in the "syntactic translation between languages in the same family", as stated in the article, but also in its consequences for the integration of legacy systems and for another standardization effort, the UML (Unified Modeling Language). Regarding the second layer of the problem, I clearly understood the need for ontologies and how important efforts like those conducted at Stanford are for reaching a common agreement on ontology definitions. It is clear that ACLs far surpass Remote Procedure Call and Remote Method Invocation, thanks to their capability to describe desired states in a declarative language rather than as a procedure or method, and to the consequences of this for elaborating conversations, exchanging shared experiences, and pursuing long-term strategies to alter interlocutors' BDI states.

Comments on Team Formation: This article was by far more complicated than any other I have read so far for this course. The first part is very clear when elaborating on joint action from a team perspective and on its difference from purely coordinated actions. As Brooks stated, theories need more elaboration when they are applied to the real world, and, as the authors of this article state, it is the exposure to the uncertainties of the real world that on the one hand complicates joint activities but on the other makes their study more interesting. In the second part, on joint actions, they write "our notion of commitment in turn, was specified as a goal that persists over time"; from my viewpoint this does not square with their previous definition of individual intentions as "internal commitments to perform an action while in a certain mental state". The four major challenges to any theory of joint action that they pose before starting their definitions are very interesting. The definition of goal is very unclear to me, because I do not understand what they mean when they say that "goals are the propositions that are true in all these worlds" (the most desirable worlds, a subset of the belief-accessible ones). Could we elaborate more on this in class, as well as on what "action expressions" are? The formal definitions are not difficult to understand when I read them separately, and the theorems they cite completely make sense; however, when I tried to follow the authors' discussion of them, it turned out to be very difficult to understand all their points. I would like to know how to program agents' goals following their definitions.
Starting with the paper by Cohen, Levesque, and Smith, I concluded that the following key points were of the utmost importance:
Thomas Nelson: KQML et al. seem to be overkill for our soccer agents. In general, it seems to me very useful to divide MAS into two broad categories:
The model for "establishing and discharging joint commitment" presented in "On Team Building" was very interesting, and I enjoyed it very much. Every premise seemed reasonable, and every point made sense. Despite its abstractness, the description of this model was complete enough that one could implement such a system for team-building, given a sufficiently oriented BDI architecture. Granted, all that was presented was a semantic specification to form and break teams, without regard to the conditions or actions before, during, and after the forming of those teams. However, it seems to me that those are the domain of agents' designers. Having said that, I do have some issues with this paper. First, of the authors assumptions, including agent sincerity, I felt that of perfect memory (Section 3, used in the definitions of belief, goal, and mutual belief) was the hardest to justify, and was made with the most ease. Of course, this is obviously an issue with humans, but I feel that this is just as much an issue with artificial agents. Be they robotic or software agents, they are susceptible to memory alteration and degradation. Consider their domains. One motivation behind building robotic agents is that they can operate in environments unsafe and unnatural for humans. Factors in these extreme environments--temperature, radiation, gravity, etc.--could all make perfect memory impossible. Many software agents are made to operate on the Internet. Whether it's referred to as the colloquial "information superhighway" or my "information public swimming pool", the Internet poses well known threats to the integrity of any system accessing it--agent or otherwise. Second, insistence on mutual belief among all parties in the team seems unscalable. Even allowing broadcasts, each time an agent attempts to join the team, it must receive an assertion message from every current team member. This could be easily alleviated, however, by organizing the team into a hierarchy (partially ordered, such that 'representatives' are responsible for their constituents, or totally ordered, like the convoy analogy), or even partitioning the team (possibly following the organization of experts from the Sycara reading) such that MB within the partition is as defined, and then as a partition held with respect to other partitions. Thirdly, the specification doesn't allow for maintenance goals. This is an even smaller issue than that of scalability, and may be simply my ignorance of the formal language used. Which brings me to my last issue with the paper: the language. The definitions made me appreciate all the parentheses used in Lisp. While several readings did little for my immediate understanding, I felt I could intuit the meaning behind the more complex communicatives, like accede and assert. However, I think that all the KQML derivatives alluded to in the ACL article come from this tendency to intuit rather than comprehend. A more comprehensible formal language is necessary for this specification. Finally, I did have 2 questions. What is a non-institutional illocutionary act? It is 'defined' in section 5, paragraph 1, somewhat circularly. And what is the distinction between illocutionary and perlocutionary acts? The authors assume the usual distinction between them, but I don't see the difference between wanting and intending an effect, at least not how it's described in the paper.
The first article, which seemed the more interesting to me, is the one on team evolving.
I do agree that, at some level, teams only play the strategies proposed by their programmers, just as in our assignments; however, citing an article and stating that the intent of their entire development is to demonstrate its incongruence seems very audacious to me.
I think that their idea of having a reward that accounts for events other than goals is pretty straightforward, and it seems very similar to my team's idea of evolving skills or non-basic behaviors (called ADFs) and positions (formations). I feel the same about having the evolving teams play against each other and leaving out the bad performers. What seems less straightforward is the idea of having three types of tests: the empty field, against kick posts, and against the Humboldt University team.
The article is easily understandable. Even though I am not familiar with the use of libsclient, the way they explain their implementation is helpful in case you want to use their approach.
What did not entirely make sense to me is:
The article that I liked the most was the one on Action Selection, where the authors very clearly expose their implementations of:
All the implementations are explained clearly, even when they contain soft Boolean expressions that are difficult to understand at first but that completely make sense.
Finally, the way they present the results of their experiments is easily understandable, and the figure of the components of the force fields is very explanatory.
Doubt: What is a hill climbing approach to program development?
For this week's readings, I selected Reactive Deliberation: An Architecture for Real-Time Intelligent Control in Dynamic Environments by Michael K. Sahota and Multi-Robot Decision Making Using Coordination Graphs by Jelle R. Kok, Matthijs T. J. Spaan, and Nikos Vlassis. I selected these two readings because I thought they would be best suited for a project area that will closely resemble the goals of my graduate research.
For my project, I am tentatively planning to focus on response time and decision-making in intelligent control. I feel that an investigation into this topic for the RoboCup tournament will help me in my graduate research, which centers around real-time response for multi-agent systems. By developing algorithms that allow for the players in the game to respond faster and with better decisions, I hope to be able to find methods that can be applied to the problem facing my area of study.
Upon completing the readings, I came up with the following summaries and questions.
Starting with the paper by Sahota, I concluded that the following key points were of the utmost importance:
My response to this paper was that it was very well written and very easy to understand. I also agreed with many of the author's arguments, especially those referring to the need for robot architectures to be able to ask the questions "what to do" and "how to do it" all in real time. Also, I appreciated this paper's relation to past readings, and even its references to them, e.g., Brooks' subsumption architecture.
Some questions I had, though, include:
Also, an important question that I asked myself after reading the paper was: Will this paper help with either of my projects? The answer is probably 'no' to both, unfortunately. I thought when reading the abstract that it might help me to gain a better understanding of how to create agents with quick response times that are still focused on achieving a goal. Unfortunately, it looks as though the paper was more concerned with robotics than I wanted it to be.
I felt the article by Kok, Spaan, and Vlassis was much better-suited towards my projects, but was more difficult to understand.
The key points I came across were:
This paper was particularly difficult for me to understand, especially given its mathematical terminology and mapping functions. One question that I had was:
Is there a better explanation for a "Nash equilibrium" than the one presented in the paper?
Upon reading the paper, I decided that it might be helpful to me on either of my projects. Because it focuses more on inter-agent communication and computational algorithms, it definitely has more potential than the other paper. However, that doesn't necessarily guarantee that I will be able to apply this knowledge either.
The two articles I read are "Evolving Team Darwin United" by Andre and Teller, and "Effective methods for reinforcement learning in large multi-agent domains" by Riedmiller and Withopf. I thought both articles were interesting and thought-provoking. In the "Evolving Team Darwin United" paper, the authors make a claim about when a system can be effectively "evolved from scratch". It's an interesting question: when does human knowledge help solve a problem, and when might it get in the way? It seems that one of the big advantages of ML in this domain is that it won't fall for any false parallels between real soccer and simulated soccer. But this method definitely has its limitations; in particular, I noticed the Darwin United team did not try to evolve world-modeling behavior. I don't know what aspects of world modeling make it unsuitable to evolved behavior; maybe it's just that for humans the algorithm seems clear or manageable, and so they see no point in asking a computer to evolve it?
In "Effective methods for reinforcement learning in large multi-agent domains", I was struck by the rigorous mathematics used. They proved that their SDQ algorithm solves a SDMDP in a finite number of steps. I don't know enough about markov decision processes to know if that's an impressive result or not, but it seems good to me. I wonder if they came up with SDQ first, and then tried to prove a result about it, or started by asking "what algorithm can gives a guaranteed optimal behavior in a finite number of steps?" and derived the algorithm from there. It's also worth noting that they don't attempt to evolve their attack strategy "from scratch" as did darwin united; they begin with a rule that the player closest to the ball must move towards it.
Starting with the paper by Parunak, I concluded that the following key points were of the utmost importance:
An example of a system that could benefit from a simpler, less centralized system of smaller agents might be a floor-cleaning system of robots. With many smaller robots, the job could get done a lot faster. However, a system of many simple robots could be more expensive, and produce work of lower quality, than a single robot that meticulously goes over the whole floor, takes longer, but ensures a quality performance.
My response to this paper was that it was actually very intriguing. I thought it to be my favorite so far this year. This may be because it was easy to understand and focused on fun issues like discussing fascinating animal/insect behavior rather than just bombarding the reader with technical jargon. However, I will add that I had difficulty (and often a lack of patience) with some of the equations designed to model the systems.
I felt the article by Svennebring and Koenig titled "Trail-Laying Robots for Robust Terrain Coverage" was also a very interesting read.
The key points I came across were:
A question that occurred to me while reading the article was: what real-life mechanisms provided for "node counting"? Was this method, since it is considered the best real-time search method, ever employed by Pebbles? I get the impression that the sensors Pebbles has simply detect trails, not how heavily trafficked each cell on a grid may be. Was this the only method investigated?
Following are my thoughts on each of this week's assigned articles. As always, I permit the posting of my response. First, I respond to "Go To The Ant".
I found an informative article discussing and comparing the AI and philosophy Frame Problem: http://plato.stanford.edu/entries/frame-problem/
Referring to 'discrete event' agents and environments: "This model... leads to an (unrealistic) identity between an agent's actions and the resulting change in the environment, which in turn contributes to classical AI conundrums such as the Frame Problem." (sec 2.4.1, par 1) I disagree that such an identity is unrealistic. Granted, the issue of imperfect (fallible) action must be and is taken into account in the design of most of the agents I've seen. While most interesting environments do not allow perfect action, it is reasonable to assume certain changes in the environment follow from certain actions; I have little reason to doubt that the light will turn on if I flip on the wall switch.
This paper brought up many interesting points, but the one I felt was most important was the need for diversity. Diversity is the basis for the fault tolerance and self-organizing (emergent) behaviour of a swarm MAS.
This diversity doesn't necessarily come from design, however. "The important observation is that the advantage of the larger population lies not merely in numbers, but in the diversity that results from physical exclusion laws." (sec 4.5.1, par 2) Being in different locations provides even identical agents with unique perspectives, and thus increases diversity.
It's good to be prepared; that's how diversity improves fault tolerance. "One price of being able to thrive in a constantly changing environment is a willingness to support a diverse set of resources that go beyond the needs of the immediate environment." (sec 4.5.1, par 6)
I appreciate and understand the notion of entropy presented in this paper. "The system can reduce entropy at the macro level by generating more than enough entropy at the micro level to pay its second-law [of thermodynamics] debt." (sec 4.6, par 3) However, I'm not confident in the validity of this metaphor. It feels forced. I'm reminded of the misapplication of the theory of natural selection to coin 'Social Darwinism'; natural selection, as the name suggests, applies to the natural mechanisms of survivability, and was not meant for artificial social structures.
Trying hard to tie "Trail Laying Robots" to the RoboCup problem, I did come up with one thing. The concept of node counting can be implemented in a RoboCup team to increase ball distribution. The ball takes the role of the trail-laying agent, with each player as a node on an undirected graph. In this case, the node with the ball would determine which node the ball goes to next, and each node keeps track of its node count, incrementing the count when it kicks the ball. But, since agents can display no persistent indication of their node count, the only way for agents to detect the counts of their neighboring teammates is for everyone to say their count repeatedly. "Disconnecting" parts of the graph can keep the ball in a particular part of the field. While this method may ensure a more even distribution of the ball, its actual benefit is unclear without testing. (A thought: perhaps players only announce their node counts when they are ready to receive the ball: when they don't have the ball already, or when they're "open". See the sketch below.)
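To make the idea concrete, here is a minimal sketch of the pass-selection rule I have in mind (all names are mine, nothing from the paper): the ball holder passes to the "open" teammate with the lowest announced count, then bumps its own count.

    # Node counting applied to ball distribution (hypothetical sketch).
    def choose_receiver(announced_counts):
        """announced_counts: teammate number -> last count heard from them."""
        return min(announced_counts, key=announced_counts.get)

    counts_heard = {7: 3, 9: 1, 11: 5}        # counts said by "open" teammates
    receiver = choose_receiver(counts_heard)  # -> 9, the least-visited node
    my_count = 4
    my_count += 1                             # increment own count when kicking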
The self-updating probability in the agents described in "Self-Organized Task Allocation" seems similar to the transfer of "force" in the wasp task-differentiation example given in "Go To The Ant". I imagine that, considering the need for an "entropy leak" as stated in "Ant", substituting a force-transfer system for the "Variable Delta Rule" (sec 3 par 2) would improve performance. If nothing else, one could use this to test the importance of entropy leaks. A sketch of the comparison follows.
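Here is roughly how I picture the two update rules side by side (parameter names and the exact forms are my own guesses, not taken from either paper):

    # Two hypothetical ways to update a robot's task-engagement probability.
    DELTA, P_MIN, P_MAX = 0.05, 0.05, 0.95

    def variable_delta(p, success, streak):
        # Step grows with consecutive successes/failures, as I read sec 3 par 2.
        step = DELTA * streak
        return min(P_MAX, max(P_MIN, p + step if success else p - step))

    def force_transfer(p, my_force, other_force):
        # Wasp-style alternative: shift probability toward the agent with the
        # greater accumulated "force" (a made-up rule for the comparison).
        return min(P_MAX, max(P_MIN, p + 0.1 * (my_force - other_force)))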
I really enjoyed the Parunak paper. It seems like designing swarm algorithms requires a similar type of thinking to designing recursive algorithms. In both, the task isn't designing an algorithm that will solve the problem, but designing an algorithm that will get us a few steps closer to solving the problem, and is guaranteed to fit well with other iterations of itself. In my introductory CS classes, we spent a lot of time studying object-oriented design. If Parunak is right, perhaps the next generation of programmers will study agent-oriented design, although for many algorithms it's probably overkill.
One part of the paper that puzzled me was principle (5), that agents should leak entropy. It isn't clear what this means. This idea of dissipating disorder seems nice, but it doesn't match very well with Shannon or Boltzmann entropy as I understand them. "Insect colonies leak entropy by depositing pheromones whose molecules, evaporating and spreading through the environment under Brownian motion, generate entropy." The fact that the molecules dissipate has only a minor effect on their primary function, which is to serve as information transmissions to other agents. This doesn't really generate entropy; if anything, it counters it, since under Shannon's model entropy is directly related to unpredictability, and the messages increase the predictability of food source locations. What am I missing? These ideas, and the relationships between evolution, entropy, and chaotic or dynamical systems, seem fundamental to the problem of multiagent systems. But at present I think we don't really have the vocabulary or framework to properly understand the concepts. Perhaps a paradigm shift is in order.
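For reference, the Shannon quantity I have in mind is H = -sum_i p_i log2 p_i (the standard formula, not anything from the paper). A quick computation shows why sharper beliefs about food locations mean lower, not higher, entropy:

    import math

    def shannon_entropy(probs):
        # H = -sum(p * log2 p); zero-probability outcomes contribute nothing.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: no idea where food is
    print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits: after pheromone signals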
Reading critically is causing me to vacillate quite a bit in my opinions on approaches to building multiagent systems. My response to many of the previous readings sounded similar to this:
"This {concept|algorithm|model|architecture} is a clever solution to the particular problem described. However, I think in the larger picture, the approach is flawed. A simple and elegant implementation of a truly artificially intelligent system will solve this problem and many others. Some system modeled after what happens in nature will be a cure-all for issues in multiagent systems."
However, after reading "Go to the Ant", which makes some claims similar to mine, I actually saw more value in all the previous readings I had somewhat discredited. It could be argued that one of the greatest advantages of computers is the ability to program them with exactness and provability, unlike an ant colony. I am certainly still in favor of attempts to develop a massively parallel multiagent system with tiny, individually "stupid" agents that amalgamate into something with real intelligence. However, I am definitely going to have to think about this more before I decide which basket to put all my eggs in.
The "Go to the Ant" article is extremely enjoyable, interesting and clear from the beginning; moreover, incredibly cunning, its author took the name from a biblical statement!
Starting with a brief history of programming, it expresses very clearly the differences between agents and previous programming paradigms. An agent is characterized basically as containing code, state, and control, and lately organizational capabilities; however, for more explicitness, more elaborate definitions of MASs, agents, environments, and their composing elements, as well as their relations, are presented.
Homodynamic and heterodynamic systems, the models of coupling between agent and environment, present very different problems to be addressed, as described in the article.
The next part of the article presents very interesting examples of natural MASs related to ants, termites, wasps, birds, fish, and wolves, all of them sharing principles of self-organization. The author clearly specifies behavior, responsibilities, and integration for each of them.
In section four, views from different authors are presented and, most importantly, seven principles for designing MASs. Some of the most important are: correspondence to things, smallness, decentralization, diversity, dissipativeness, and concurrency.
Of particular importance to me are the industrial applications of agents and the insight into the AARIA project mentioned.
Among the things I won't ever forget from this article is: "We used to think that bugs were the problem in manufacturing software. Now we suspect they may be the solution!"
This is the second paper by Parunak I have read; the other, related to the AARIA project, "Manufacturing over the Internet and into Your Living Room", is also very clear and interesting, and I feel like reading these two articles again to follow the enjoyable pattern in which they are written.
The second article I chose was the one on Self-Organised Task Allocation in a Group of Robots, where the goal is to find the optimum number of robots such that they do not interfere with each other, which would increase the time needed to bring objects to a target location. This is called "prey retrieval" using task allocation.
First they describe the Lego robots' hardware and behavior, used in a circular arena; then they explain the experiments they ran to show that individual adaptation during a lifetime leads to self-organized task allocation in a colony, and that better-fit individuals, in this case mechanically, are more likely to be selected for the tasks.
In "BDI Agents", the authors claim to argue the necessity, but not the sufficiency, of belief, desire and intention. What other attributes, if any, can be used to supplement BDI? Further, did they actually argue necessity? It seems to me that they asserted it by including belief, desire and intention into their architecture.
"As the agent has no direct control over it's beliefs and desires, there is no way that it can adopt or effectively realize a commitment strategy over these attitudes." (BDI Logics, par 9) It would be interesting to see where removing this constraint would lead. The idea is not without precedent; in psychology it's called cognitive restructuring. A simple example of this is a student who is bad at math because he hates it. It's often suggested that the student can improve his performance by adopting a more positive attitude towards math.
"The conditions under which a plan can be chosen as an option are specified by an invocation condition and a precondition..." (Abstract Architecture, par 8) Why distinguish between those two conditions? If a plan can't be invoked without a precondition holding, why isn't that precondition entailed by the invocation condition, so as to merit a separate precondition?
Initially, my impression of the reservation system as presented in "Traffic" is that if at any time a vehicle fails to make a reservation, then the probability of it getting a reservation in the future is decreased, because it must decelerate in anticipation of stopping. Was this phenomenon observed? I'd imagine that consciously attempting to reduce this decrease in probability (should it even be the case) would significantly improve the performance of the reservation system.
I was fortunate enough to see Kurt deliver a talk to Surge (Science undergraduate research group), where we could see the cool simulations run. It's an interesting idea, and I think about it a lot when I'm stuck in traffic. From an MAS perspective, it's an excellent example of the great things communication can accomplish. I think maybe he gave the talk after this was published, because there are a few things he mentioned that I didn't see in the paper. One was setting the controller so that no agent can get a reservation while a closer agent is still waiting. Another was having a priority system for ambulances, etc. This could be modified so people could pay for higher light priority, like paying for first-class shipping on mail. Or it could be built so that cars with more passengers have higher priority, to encourage carpooling. There's still a long way to go before we can implement his intersections, though. I think people would be more willing to trust computer drivers on highways first, where there aren't pedestrians and things like that.
The other article is also interesting, although I wish it had gone into a little more detail about the methods used, like we saw for the Cohen "On team formation" article. They say their method can implement any of 12 other methods, but I guess to know what those methods are, I'd have to go read the cited papers.
First I discuss the BDI Agents article:
3rd paragraph, 1st column, page 2: I do not see why they say that the system (referring to the control) is nondeterministic; if, using the decision trees and considering the values obtained from them, the system will "determine" the actions to take, then the system is deterministic!
6th paragraph, 1st column, page 2: They claim that "the actions or procedures that achieve the objectives are dependent on the state of the environment and are independent of the internal state of the system". I also disagree here, because the system is influencing the environment, and thus, indirectly, affecting what the actions need to be.
2nd paragraph, 2nd column, page 2: I think a brief introduction to branching trees before assigning this reading would have been very useful. I am not sure about this, because students in CS are probably very familiar with them; in my case, however, I spent some time reading up on the basics of how these trees work. First I downloaded the article by Emerson, who I found works, or worked, in the CS Department at UT (it surprised me that the now-famous BDI is based on his research). But his chapter was very long, so I quit soon. I ended up reading some slides from Andrew ... from CMU. I still do not know what is meant in the 2nd paragraph, 2nd column, page 3 by "a payoff function that maps terminal nodes to real numbers".
4th paragraph, 2nd column, page 3: I think a figure of the branching tree before and after the transformation they apply would have been very illustrative in this case.
5th paragraph, 2nd column, page 3: What is an accessibility relation?
4th paragraph, 1st column, page 4: Why would you want to change a real-number probability and payoff to dichotomous ones?
5th paragraph, 1st column, page 4: What does "closed under implication" mean?
2nd paragraph, 2nd column, page 5: Why not use objects instead of structures to represent beliefs, desires, and intentions?
5th paragraph, 1st column, page 6: When they say "one way of tailoring and thus improving the process of option generation is to insert an additional procedure", why would you need to generate options if you already had plans, or `deliberated options'?
While looking for additional information on BDI, I found the same article assigned for reading in a completely different format. I wonder if you can send the exact same article to be published in different places.
This article left me very interested in decision theory and in the work of Bratman. And though I did not understand many things, the topic definitely fascinates me, and I will continue researching it.
The Traffic Reservation article: The other article was very easy and good; reading it took me 1/6 the time of the first. It was interesting, and I found the animations on the web very enjoyable; they made it easy to understand the research and its results. However, the article seemed to be written in a complete rush, just like trying to submit your homework before 10pm on Monday.
I chose to read Chapter Six from the textbook for this week's reading assignment. The chapter was very easy to understand and the content was actually very easy to get through. I found that any questions I had were answered as I went.
One example of a situation that can be modeled as a game matrix is that of two people potentially becoming romantically involved. If one person asks out the other (i.e., cooperating), the other has the option of either accepting the offer (cooperating) or rejecting it (defecting). The same goes for if the other person asks first. However, both people still have the option of defecting, and hence remaining friends. The game matrix would be as follows:
                   i cooperates    i defects
    j cooperates       5, 5           2, 0
    j defects          0, 2           3, 3

    (each cell lists i's payoff first, then j's)
In this matrix, the points are assigned to the following outcomes: 5 for mutual romantic involvement, 3 for mutually remaining friends, 2 for rejecting the other's offer, and 0 for being rejected. Such is the game of love.
The Nash equilibria and how a person should play the game depend on how interested i believes j is. If i believes j is uninterested, then i would be better off remaining friends with j. However, if i believes the feelings are mutual, i should attempt to cooperate romantically with j, thereby achieving the highest number of points for both. The problem arises when i (or j) mistakenly believes that j (or i) is interested when, in fact, this isn't the case. Here, one person ends up rejected and embarrassed, and the other is deprived of a friend. This is the worst-case scenario.
The Nash equilibria depend on the beliefs of the people involved. If i cooperates, j can do no better than to cooperate (assuming the feelings are mutual, i.e., j believes i to be a suitable match). If this isn't the case, j can actually do no better than to defect. But then, if j were to defect, i could do no better than to defect. Hence, there are two true equilibria.
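To check the reasoning, here is a small sketch (payoffs as in the table above; the code and names are mine) that enumerates the pure-strategy equilibria of the matrix:

    # Enumerate pure-strategy Nash equilibria of a 2x2 game.
    # payoffs[(a_i, a_j)] = (payoff to i, payoff to j); C = cooperate, D = defect.
    payoffs = {
        ('C', 'C'): (5, 5), ('C', 'D'): (0, 2),
        ('D', 'C'): (2, 0), ('D', 'D'): (3, 3),
    }

    def pure_nash(payoffs, actions=('C', 'D')):
        eqs = []
        for ai in actions:
            for aj in actions:
                best_i = all(payoffs[(ai, aj)][0] >= payoffs[(alt, aj)][0]
                             for alt in actions)
                best_j = all(payoffs[(ai, aj)][1] >= payoffs[(ai, alt)][1]
                             for alt in actions)
                if best_i and best_j:
                    eqs.append((ai, aj))
        return eqs

    print(pure_nash(payoffs))   # [('C', 'C'), ('D', 'D')] -- the two equilibria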
Based on the author's explanations, it seems clear how the Nash equilibria arise. However, I would like to understand more thoroughly why, in the table of symmetric scenarios presented, we sometimes have only one Nash equilibrium point and other times two.
There is a sentence I did not quite understand: page 166: "For the result seems to imply that cooperation can only arise as a result of irrational behavior, and that cooperative behavior can be exploited by those who behave rationally"
The reading in general was very interesting, and especially the examples and some remarks made by the author make it very enjoyable. Besides, I think he is very clear in the use and explanation of the notation. Nevertheless, I found some errors, like in the table presented for the Stag Hunt, where the lower-left corner is wrong.
This was a very philosophical, enjoyable reading.
In this response, I chose to identify an application not mentioned in the reading that can be modeled as a matrix game. As always, I permit the posting of this response.
I had trouble coming up with more abstract situations that can be modeled as a matrix game, but I do remember a game show called "Friend or Foe" that is similar to the prisoners' dilemma. Teams of two would compete with each other for cash, and each round a team would be eliminated, finally leaving one. Then, the members of the final team would indicate whether they were friend or foe by making a gesture of an open palm or a closed fist, respectively. This gesture was made in secret, and the gestures were revealed simultaneously. If both indicated friend, they'd split the cash winnings evenly. If one was a friend and the other a foe, then the foe received all the cash. And if both were foes, neither would get any cash. So, let C represent the cash winnings for the team, with players 1 and 2. Then the game matrix would look as follows:
                         Player 1
                     Friend        Foe
    Player 2
      Friend       C/2, C/2       C, 0
      Foe            0, C         0, 0

    (each cell lists Player 1's winnings first, then Player 2's)
If a player chooses to be a friend, he can gain half the cash or none of it, and if he chooses to be a foe, then he can gain all the cash or none of it. Since every outcome of one strategy is not preferable to every outcome of the other strategy, there is no dominant strategy.
However, since the total gain is C if at least one player is a friend and 0 otherwise, it would be rational for either player to choose to be a friend. Knowing this, a player could become a foe and take the whole cash pot. But, once a player has chosen to be a foe, the other can't improve his position by changing his choice; he will get no money regardless. Therefore, there are three Nash equilibria: when either player is a foe and the other a friend, and when both are foes. (The little check below agrees.)
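As a sanity check, plugging these payoffs into the same kind of pure-strategy enumeration as before (C = 100 just for concreteness; the code is mine):

    # Pure-strategy Nash equilibria of Friend or Foe.
    C = 100
    payoffs = {
        ('Friend', 'Friend'): (C / 2, C / 2), ('Foe', 'Friend'): (C, 0),
        ('Friend', 'Foe'):    (0, C),         ('Foe', 'Foe'):    (0, 0),
    }

    def pure_nash(payoffs, actions=('Friend', 'Foe')):
        return [(a1, a2) for a1 in actions for a2 in actions
                if all(payoffs[(a1, a2)][0] >= payoffs[(alt, a2)][0] for alt in actions)
                and all(payoffs[(a1, a2)][1] >= payoffs[(a1, alt)][1] for alt in actions)]

    print(pure_nash(payoffs))
    # [('Friend', 'Foe'), ('Foe', 'Friend'), ('Foe', 'Foe')] -- the three equilibria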
I always wondered why, more often than not, both players chose to be foes. It makes sense after this analysis, and further exemplifies the rule that "the house always wins".
I had heard about many of the Game Theory models and example problems discussed in McCain's book, but I had never read a consolidated explanation using consistent language. So, I found the reading very interesting. However, I am having a little bit of a problem understanding how this is going to tie in with the other aspects of multiagent systems we have discussed. The generic question "How can I maximize my benefit?", asked often in the text, sounds like maximizing the fitness function in genetic programming. However, the approaches are entirely different. Game Theory describes ways to optimize certain classes of fitness functions, whereas genetic programming can be used given any well-defined fitness function. It is the same as the difference between optimization approaches in mathematics. One way to find extrema of a function is to find the points at which the derivative is zero. However, we don't know how to symbolically find derivatives for all functions. There are numeric approaches that are more generic but perhaps less effective, such as the simplex algorithm. Game Theory, in this regard, seems to be a layer of abstraction on discrete calculus.
After completing the readings for this week, I came up with the following ideas for experiments for my team's final project:
For each of the experiments, I decided to focus on the aspect of player positioning in our project. Since our project is centered around the coaching program, the use of statistics is crucial. The program makes use of different log files to identify patterns in game play. For my experiments, I propose comparing the live game play to log files where a pattern has been implemented and attempting to find a correlation.
The null hypothesis in each case is that there will be a correlation. If this proves to be incorrect, then the program should note that the pattern has not been implemented, and will therefore not make a declaration.
T-Test: For the t-test, I propose using the idea of grid spaces and a matrix that will be incremented by one each time the observed player enters a particular space. (Our group is actually using this method, with the grid spaces referred to as "bins.") Then, histograms will be made of both the X and Y grid spaces to see where the player happens to position himself most frequently. Since there is no telling how often a player will move (or if he will at all), the t-test must be used here. Otherwise, comparing the live gameplay log file and the pattern log file will prove almost impossible because movement cannot be paired. Once the data is taken, the t-test can be used to see if there is, in fact, a correlation between the player's positioning during the live game and that of the pattern log file.
Paired T-Test: For the paired t-test, I propose recording the player's position at each cycle. This way, assuming that both the live game file and the pattern file have the same number of cycles, the two files can be paired such that they may be compared in this manner. Through this test, the average positions as well as the standard deviations may be analyzed to identify a correlation. If one in fact exists, then the program should make the proper declaration.
Chi-Squared Test: For the chi-squared test, I propose first analyzing the data from the pattern files and allowing them to be the "expected" results. Then, the program should analyze the positioning of the player (with either the grid spaces method or the cycle method) in the live gameplay and treat this data as the "observed" data. To calculate the chi squared value, I would suggest using the "expected" standard deviation, but the "observed" x values and mean. This should allow for a valid way of analyzing the chi squared value. If there is a correlation between the two log files, then the chi squared value should approach the number of cycles or the number of recorded repositionings (depending on which approach was used) and the program should make the proper declaration. In other words, the difference between each position and the mean in the "observed" data should be close to the standard deviation of the "expected" data. If no correlation exists, the chi squared value will be "much" bigger than the number of cycles or recorded repositionings and no declaration should be made.
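A rough sketch of how the three comparisons might be computed with scipy (the arrays are invented placeholders for whatever the program would extract from the two log files):

    from scipy import stats
    import numpy as np

    live = np.array([12, 30, 7, 51, 22], dtype=float)     # observed bin counts
    pattern = np.array([10, 28, 9, 48, 25], dtype=float)  # counts from pattern log

    # Unpaired t-test: bin occupancies need not line up cycle by cycle.
    t, p_unpaired = stats.ttest_ind(live, pattern)

    # Paired t-test: valid when both logs record a position every cycle.
    t2, p_paired = stats.ttest_rel(live, pattern)

    # Chi-squared: treat the pattern log as the "expected" distribution.
    expected = pattern * live.sum() / pattern.sum()       # match the totals
    chi2, p_chi2 = stats.chisquare(live, expected)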
First experiment: A t-test could be used to prove that after using our genetic algorithm, our team has effectively improved its playing capabilities.
The null hypothesis is that the mean score of the unevolved team is equal to that of the evolved team.
And the alternative hypothesis is that the mean score of the evolved team is higher.
The experiment would be conducted like this:
Second experiment: an analysis of variance (ANOVA) to determine against which of the opponent teams the listos team is best fitted.
The null hypothesis would be that the listos team is equally fitted to play against any of the opponent teams.
The experiment would be run like this:
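In any case, a minimal sketch of the two statistical computations (the scores are invented placeholders; scipy assumed):

    from scipy import stats

    unevolved = [0, 1, 2, 0, 1]    # goals per game before evolution (made up)
    evolved   = [2, 3, 1, 4, 2]    # goals per game after evolution (made up)

    # t-test; halving the p-value gives the one-sided test of "evolved is higher".
    t, p_two_sided = stats.ttest_ind(evolved, unevolved)
    p_one_sided = p_two_sided / 2 if t > 0 else 1 - p_two_sided / 2

    # One-way ANOVA: does listos' score vary across the opponent teams?
    vs_opp1, vs_opp2, vs_opp3 = [1, 2, 0], [3, 2, 4], [0, 0, 1]
    f, p_anova = stats.f_oneway(vs_opp1, vs_opp2, vs_opp3)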
For our experiment, we plan to develop good positions for our team. We can do a t-test where we play many games with the old positions against a particular team or suite of teams, and then play the same games with our new positions, and compare the average score differential between the two teams. We would test to see if the results are statistically better. For our experiments, a paired t-test doesn't really make sense. However, we could do chi-squared tests, where we compare our new and old teams against a set of opposition teams, something like:
           opp1   opp2   opp3
    old      x      y      z
    new      u      v      w
Then we could learn if our team seems to have improved particularly against one opponent.
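In scipy terms this would be a contingency-table test, something like the sketch below (the win counts are invented stand-ins for u, v, w, x, y, z):

    from scipy.stats import chi2_contingency
    import numpy as np

    table = np.array([[4, 7, 2],    # old positions: wins vs opp1, opp2, opp3
                      [9, 8, 3]])   # new positions: wins vs opp1, opp2, opp3
    chi2, p, dof, expected = chi2_contingency(table)
    # A small p would suggest the improvement is not uniform across opponents.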
Upon completing the reading for this week, I came up with the following summaries and questions. I concluded that the following key points were of the utmost importance:
My response to this paper was that it did not grab my attention very well. I was also very frustrated while reading it, because I thought it was far too loaded with confusing equations and terminology and had not nearly enough real-world examples. I felt as though this paper was extremely difficult to understand, to the point where I could barely conjure intelligent questions about it. But the best I could do would include the following:
As you can see, this paper left me thoroughly confused. Sorry, but I think this is my least favorite reading this semester.
This is a very interesting chapter by a researcher who just came to UT this semester. It presents a nice summary of the state of the art in multiagent negotiation for self-interested agents.
The first section, after a brief introduction, presents different criteria for evaluating negotiation protocols, one of the most interesting being stability. Given my particular interest in efficiency, I find the "computational efficiency" criterion particularly appealing, since, as presented by the author, it provides an incredibly interesting approach when analyzing the trade-off between the cost of the process and the solution quality. Finally, regarding this section, where the author states that some games do not present a Nash equilibrium, I would like to ask for one example.
The next section deals with different interaction protocols. The first of them, voting, leaves me with a doubt: how are irrelevant alternatives defined, such that in the mentioned desiderata for an ideal social choice rule they do not alter consumer rankings, while under the relaxation of the third desideratum they can split the majority? Moreover, is there a typo? I believe it is not the relaxation of asymmetry and transitivity but of independence of irrelevant alternatives: the fifth criterion instead of the third. In the same section, the Borda count protocol and its paradox are quite interesting.
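For my own reference, this is the Borda rule as I understand it (a toy example of my own, not from the chapter): with k candidates, each ballot gives k-1 points to its first choice, k-2 to its second, and so on, and the highest total wins.

    def borda(ballots, candidates):
        k = len(candidates)
        scores = {c: 0 for c in candidates}
        for ballot in ballots:                # ballot = ranking, best first
            for rank, c in enumerate(ballot):
                scores[c] += k - 1 - rank
        return scores

    ballots = [['a', 'b', 'c'], ['b', 'a', 'c'], ['b', 'c', 'a']]
    print(borda(ballots, ['a', 'b', 'c']))    # {'a': 3, 'b': 5, 'c': 1} -> b wins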
Insincere voters are a special case, and I would like Dr. Stone to elaborate more on: "A protocol is said to implement a particular social choice function if the protocol has an equilibrium whose outcome is the same as the outcome the social choice function would give if the agents revealed their types truthfully".
From 5.3, I do not understand how, if each agent's type has its own preference ordering over outcomes, it can happen that some agent gets its most preferred outcome chosen no matter what types the others reveal. How can this happen if the social choice function is truthfully implementable in dominant strategies?
I consider the next part, on auctions, easier to read, and though the author considers interesting problems that arise in these protocols, including bidder collusion, I would like to pose these two cases: when all the bidders on a contract agree on who is going to win and agree to round-robin different contracts, and when one bidder on a contract buys (bribes) the auctioneer so as to be made to appear the winner even when it is not.
The last part of the reading, on bargaining, is also appealing. In particular, I would like to explore more in class the formula of Theorem 5.8 for axiomatic bargaining theory. The strategic counterpart is easier to understand, especially after two weeks of game theory.
The authors lay out six criteria for a good voting system. Of these, only the first four seem interesting and useful to me. The last two don't seem necessary. Yes, in human terms we prefer not to have dictators, but if the system is constructed in a way that provides equal-opportunity dictatorship, this seems OK to me. Also, that the scheme should be independent of irrelevant alternatives seems dubious to me.
Also, I think it's interesting that the authors don't mention bidding as a type of voting, i.e., each person paying for the weight of their vote; since they discuss bidding, it seems like a natural extension, and probably the closest to what happens in real life.