Teaching Multiagent Systems using RoboCup and Biter
José M. Vidal, University of South Carolina
Paul Buhler, College of Charleston
This paper presents our experiences
using RoboCup and Biter - our agent framework - to teach three consecutive
graduate-level classes in multiagent systems. We argue that the combination
of RoboCup and Biter forms an effective platform for the teaching of multiagent
systems and the distributed mindset. Having students build RoboCup teams
exposes them to the non-intuitive emergent consequences of simple local
actions. The exercise also exposes the students to the often frustrating
complexities of debugging distributed applications and teaches them strategies
for overcoming these diffculties.
2. The Distributed
Resnick argues that these misconceptions should be addressed as early as possible in a child's education. Our experience indicates that distributed, as opposed to hierarchical, control is a concept that is initially difficult for most computer science and engineering students to grasp. We believe that a combination of instruction and hands-on experimentation is required to help students develop an understanding of distributed systems. These experiences allow students to develop the appropriate intuitions that will let them design and debug distributed applications that exhibit complex emergent behaviors. This need is even more pronounced in today's highly networked world where web based and peer-to-peer applications are quickly becoming the norm .
||Figure 1. The soccerserver display.|
3. A RoboCup Overview
The soccerserver updates its world model every 100ms. Each player can send at most one action to the server every 100ms. All communications are over UDP and use the soccerserver's communication protocol. The actions available to the players are dash, turn, kick, and say. Every 150ms each player receives a message from the server telling it what it sees, i.e., what is in its viewcone, as well as what it hears. This information is given in polar coordinates with the player as the origin, and the zero angle points in the direction that the agent is facing. For example, the server might tell a player that is "sees" the ball at a distance of 1.3 meters and an angle of 10 degrees from where it is currently looking. Since the information from the server is all relative to the player's current location, the only way a player can determine its location is by triangulating it from some fixed objects whose location is known. These fixed objects are a series of flags placed around the field, boundary lines, and the goal posts.
The player's dash and kick actions can be given a force parameter to indicate the force that should be applied. In the case of the dash command the server updates the player's position by taking into account its current momentum and force. The turn command turns the player in the appropriate direction. For the kick command, the server takes into account the player's and the ball's momentum as well as the relative angle of the ball to the player - it is easier to kick a ball that is directly in front of the player. In both cases the server adds some noise that is proportional to the diffculty of the action. In other words, neither player nor ball consistently end up where expected, thus complicating the task of implementing reliable players. The say command is used to shout any arbitrary string which will be heard by some of the nearby players.
4. Using RoboCup
While all these characteristics make RoboCup a great platform, there are several aspects that make it hard to use for instructional purposes.
These drawbacks forced students to spend most of their time writing code to handle the UDP message parsing and the construction of a world model, as we detail in Section 6. Therefore, we designed a basic RoboCup client, called Biter, that implements all the features the students will need in order to quickly get started testing new behaviors and coordination protocols.
5. The Biter Platform
As sensory information about dynamic objects is placed into Biter's world model it is time stamped and the world model is updated. We first calculate the player's absolute position using some of the closest static objects as guide. We then use the player's position to calculate the absolute position of all dynamic objects. All information is discarded after its age exceeds the user-defined limit. Users can experiment with this limit. A small value leads to a purely reactive agent; a large value leads to the agent seeing "ghosts" of players that are not there anymore.
Access to world model data should be simple; however, approaching this extraction problem too simplistically leads to undesirable cluttering of code. This code obfuscation occurs with access strategies that litter loop and test logic within every routine that accesses the world model. Biter utilizes a decorator pattern  which is used to augment the capabilities of Java's ArrayList iterator. The underlying technique used is that of a filtering iterator. This filtering iterator traverses another iterator, only returning objects that satisfy a given criteria. Biter utilizes regular expressions for the selection criteria. For example, depending on proximity, the soccer ball's identity is sometimes reported as "ball" and other times as "Ball." If our processing algorithm calls for the retrieval of the soccer ball from the world model, we would initialize the filtering iterator with the criteria [bB]all to reliably locate the object. Since the filtering criterion is regular expression based, we are able to construct powerful extraction routines without incurring the complexity of coding error-prone compound conditionals.
Although access to the world model has been streamlined, creating more concise and algorithm-revealing code, it remains difficult to fully understand the behavior of the players. At times it seems the only way to understand moments of unexplainable behavior is to have access to the player's world model. Dumping the contents of the world model to a file for later interpretation is unnecessarily complex and unwieldy. To attack this problem, Biter provides a run-time visual display of a player's internal view of his environment. When a Biter agent is started, a command-line parameter is used to enable the graphical display of the world model. The display is served by an independent thread and utilizes double buffering for smooth animation. The overhead view of the field shows all static objects and the dynamic objects currently found in the player's world model. Whenever stale elements are encountered, an algorithm is run which merges its display color with the background color of the field. Visually, this has the effect of having stale elements fade away as they age. The graphical display of a player's world model can be compared to the soccer monitor's display for purposes of independent verification and validation of the player's world model contents. This powerful debugging feature has saved users countless hours of fruitless troubleshooting and helps them focus on other multiagent system implementation issues.
5.2 The Generic
Biter implements a GAA that provides the structure needed to guide users in the development of a solid object-oriented agent architecture. The GAA is designed for agents that receive input from the environment at discrete intervals and take discrete actions. That is, we envision an agent that receives readings from its sensors and takes actions using its effectors. This is a common method for modeling autonomous agents [15, Chapter 1] and captures many agent applications.
The GAA provides a mechanism for scheduling activities each time the agent receives some form of input. An activity is defined as a set of actions to be performed over time. The action chosen at any particular time might depend on the state of the world and the agent's internal state. The two types of activities we have defined are conversations and behaviors. Conversations are series of messages exchanged between agents. Behaviors are actions taken over a series of time steps. The ActivityManager determines which activity should be called to handle any new input. A general overview of the system can be seen in Figure 2.
|Figure 2. Biter's UML class diagram. We omit many of the operations and attributes for brevity. Italic class names denote abstract classes.|
An agent is propelled to act only after receiving some form of input, that is, after the activity manager receives a new object of the Input class. This class has three sub-classes: SensorInput, Message, and Event. A SensorInput is a set of inputs that come directly from the agent's sensors. Biter provides a parsing function that transforms the input from its original format - a list - into an object of this class. In most implementations a class hierarchy should be created under this class in order to differentiate between the various types of sensor inputs. Biter defines ObjectInfo and ObjectInfoContainer as extensions of this class. The Message class represents a message from another agent. Robocup players can use a broadcast mechanism ("say") to send messages to all nearby players. Finally, the Event class is a special form of input that represents an event the agent itself created. Events function as alarms set to go off at a certain time. They are important because they provide a way to implement timeouts. Timeouts are used when waiting for a reply to a message, when waiting for some input to arrive, or when repeatedly taking an action in the hope of generating some effect.
Biter implements a special instance of Event, which we call the act event. This event fires when the time window for sending an action to the soccer server opens. That is, it tries to fire every 100ms, in accordance with the soccerserver's main loop. Since the messages between Biter and the soccerserver can be delayed, and their clocks can get skewed over time, the actual firing time of the act event needs to be constantly monitored. Biter uses an algorithm similar to the one used in  for keeping these events synchronized with the soccerserver.
The Activity class represents our basic building block. Biter agents are defined by creating a number of activities and letting the activity manager schedule them as needed. The Activity class has three main member functions: canHandle, handle, and inhibits.
The canHandle member function receives an input object as an argument and returns true if the activity can handle the input, that is, if it can execute as a consequence of receiving that input. This function could not only consider the contents of the input, but it could also consider the agent's current internal state, the agent's world model, etc. Since this is a generic framework, we do not constrain the canHandle function to only access a certain subset of the available data. That decision is left to the software engineer who wants to refine the architecture. The only requirement we make is for the function to be speedy since it will need to be called after each new input has arrived.
The handle member function is called when the activity is chosen to handle that input. It is called when the activity manager wants the activity to execute with the given input. This function usually generates one or more atomic actions, sets some member variables, and returns. A call to the handle function executes the next step in the activity, the step that corresponds to the received input. The function can set member variables as a way to maintain a state between successive invocations. This state allows the activity to implement multistep plans and other complex long-term behaviors. The handle function will return true when the activity is done, at which point it will be deleted. We expect that in most agents there will be a set of persistent activities that are never done and always return false.
Finally, the inhibits member function receives an Activity object as a parameter and returns true if that activity is inhibited by the current one. This function implements the control knowledge which the activity manager will use to determine which activity to execute. The use of this function mirrors the use of subsuming behaviors in the subsumption architecture . However, the function can also consult state variables in order to calculate its value, thereby extending the functionality. Since the activities are organized in a hierarchy, this function is able to easily inhibit whole subtrees of that hierarchy. This allows users to add new activities without having to modify all existing ones.
A significant advantage of representing each activity by its own class, and with the required member functions, is that we enforce a clear separation between the behavior knowledge and the control knowledge. That is, the handle function implements the knowledge about how to accomplish certain tasks or goals. The canHandle function tells us under which conditions this activity represents a suitable solution. Meanwhile the inhibits function incorporates some control knowledge that tells us when this activity should be executed. This separation is a necessary requirement of a modular and easily expandable agent architecture.
The Behavior class is an abstract class that groups all long-term behaviors of the agent. We define these behaviors as series of atomic actions. For example, a robotic behavior might be to "avoid obstacles," while a software agent might have a "gather data from sources" behavior. Behaviors can, like all activities, create new activities and add them to the set of activities.
Biter defines its own behavior hierarchy by extending the Behavior class, starting with the abstract class RobocupBehavior which implements many useful functions. The hierarchy continues with standard behaviors such as DashToBall, IncorporateObservation, and DribbleToGoal. For example, a basic Biter agent can be created by simply adding these three behaviors to a player's activity manager. The resulting player would always run to the ball and then dribble it towards the goal.
The Conversation class is an abstract class that serves as the base class for all the agent's conversations. In general, we define a conversation as a set of messages sent between one agent and other agents for the purpose of achieving some goal, e.g, the purchase of an item, the delegation of a task, etc. A GAA implementation defines its own set of conversations as classes that inherit from the general Conversation class. For example, if an agent wanted to use the contract-net protocol, it would implement a contract-net class that inherits from Conversation.
Conversations implement protocols. Most protocols can be represented with a finite state machine where the states represent the current status of the conversation and the edges represent the messages sent between agents (see  for specific proposal that extends UML to cover agent conversations). In some protocols each agent will play one of the available "roles." For example, in the contract-net protocol agents can play the role of contractor or contractee. The conversations will, therefore, implement a finite state machine.
Multiple conversations can be handled by having the existing conversation add a new one to the set of activities. For example, if a message that starts a new conversation (e.g., a request-for-bids) is received by an agent the canHandle function of the appropriate conversation will return true even if the conversation is already busy, that is, even if it is not in its starting state. When the handle function is called with the new message the conversation will recognize that it is busy and create a new conversation, add it to the action manager, call the new conversation's handle method with the new input, and return. In this way, a new conversation object is created to handle the new message. Behaviors can use the same method to initialize a conversation.
For example, a "move to point" behavior might realize that another agent is blocking the path and start a conversation with that agent in an effort to convince it to move out of the way. If only one instance of a conversation is desired the user can implement a conversation factory [5, Factory Method] in order to dynamically limit the number and type of conversations.
We are also planning to add error handling and verification functions to the top-level Conversation. Specifically, many conversations will want to implement some timeout mechanism for expected replies, as well as a method for determining what the next action should be (e.g., resend the message, send another message, fail). Given the commonality of this functionality, it makes sense for us to implement it on the base class.
The ActivityManager picks one of the activities to execute for each input the agent receives. It implements the agent's control loop. The manager runs in its own thread, where it receives input from the sensors and dispatches it to the appropriate activity. The dispatching is done by the handle function, shown in Figure 3, which determines which of the activities will actually handle the input. The algorithm it implements echoes the type of control mechanism implemented by subsumption and Belief, Desire, Intention  architectures. The function first finds all activities that can handle the input; from this group it chooses one which is not inhibited by any other one in the group and asks it to handle the input. Since the inhibition function can be arbitrarily defined by its activity, the ordering becomes very flexible. That is, the user of the GAA has options ranging from no organization (no activity inhibits any other activity), to a static organization (activities inhibit a fixed type of activities), to a dynamic organization (activities inhibit based on many other factors). As an agent matures, the user can choose to increase the organizational complexity without re-implementing the architecture.
Figure 3. ActivityManager.handle (Input i).
All agent implementations that extend Biter must follow a series of steps. First the agent must instantiate an activity manager object. The agent then adds the desired activities to this object. These are the activities that define its overall behavior. It then calls the run method on the activity manager in order to start it running. At this point the manager takes complete control and enters its infinite loop, choosing which behavior to execute every time. In general, the user should not need to modify the manager. All the control knowledge is stored in the canHandle and inhibits methods which are defined by the user.
In the Spring 2000 semester we gave students a very basic Java client whose only functionality was the ability to parse and exchange messages with the soccerserver. We also made available to them the source code for the CMUnited team , authored at Carnegie Mellon and written in "C". The CMUnited team had won the previous two international RoboCup competitions. Due to the complexity of the CMU code, the students unanimously chose to use the simple Java client as their basic framework and to peruse the CMUnited code for ideas.
The final results for the first semester were encouraging. All groups were able to build working teams and participate in the final tournament. Their strategies, however, did not reflect the coordination protocols or behavior selection and planning algorithms we had studied in class. Several groups resorted to a simple "everyone go to the ball and try to kick it towards the goal" strategy. In fact, it was this strategy which won the tournament. A couple of groups implemented very rudimentary "zone" strategies, but these were incomplete. For example, a player dribbling the ball towards the goal would suddenly stop when it reached the end of its zone. Moreover, the code written by many of the groups lacked any structured design and resorted to the use of one large nested if-then-else statement. That is, the students did not do a thorough job at implementing any of the agent architectures described in class.
We believe that part of the reason for this last omission was the lack of well-documented object-oriented designs for agent architectures.* Our textbook  describes the architectures using very high-level box diagrams. Three quarters into the semester we attempted to remedy this problem by providing the students with a set of UML agent architectures, but by then it was too late. The groups were already too committed to their own designs to pay attention to a better alternative.
*Available at www.multiagent.com/arch/
As a result of these experiences, we developed Biter with the expectation that it would allow the student groups to concentrate more on using the coordination strategies studied in class and help them develop good agent designs.
During the Fall 2000 semester the students were given the version of Biter described in Section 5, with a default behavior of going to the ball and dribbling it towards the goal. The last problem set before the final competition asked the students to implement a team that could beat a team of Biter agents with the default behavior. All groups were able to achieve this goal, with varying degrees of success.
The results of the second tournament were impressive. All of the teams implemented complex strategies. Many of the teams utilized flexible zones, stigmergy , and broadcast communications. For example, some groups were able to have the players switch between a set of modes that determined the overall strategy being used (e.g., aggressive versus protective). The players would achieve this without any explicit communication, using only cues from the environment, thereby implementing a form of stigmergy. An example of communication employed by several teams was having a player announce its pass. That is, a player would shout "I am passing to P3" just before making the pass. All the other players, especially P3, that heard the announcement could then behave accordingly.
The quality of the resulting architectures also improved. The player code was no longer a long and hard to understand if-then statement with global mode variables. Instead, the groups encapsulated behavior functionality in the various behavior classes. The behavior to use was chosen based on some pre-determined method such as the priority of the behavior or based on certain features of the current world state. This new modularity also allowed the groups to quickly test new behavior combinations and reject behaviors that actually resulted in worse team performance. We believe that this flexibility contributed to the quality of the final teams.
In the Fall of 2001 class we again used the same Biter code, this time with a few new low-level behaviors and bug-fixes. Once again we noticed improvements in the team performance, albeit not as drastic as the previous year. All the teams exhibited true team behavior. They engaged in a lot of passing and, on occasion, had pass sequences that involved three players - from first, to second, to third, to the goal. Overall the students' experiences continued to be positive.
There were still, however, some areas left for improvement. The teams showed a poor ability to resolve deadlocks. For example, sometimes several players from the same team would all try to take control of the ball at the same time, annulling each other's actions in the process. In general, the strategies were brittle in that they would stop working even after small changes were made to the soccerserver parameters. Finally, the real-time performance of the teams was still not up to par with that of competition teams - they were missing action and coordination opportunities.
Six screen capture movies of selected class tournament playback, with the authors' comments on each team's design and strategy in programming their players' behaviors.
Although our experiences with RoboCup and Biter have been at the graduate level, we fully expect that they will be useful tools for undergraduate education. The STEELMAN draft of the Computing Curricula 2001 (CC2001) recognizes that distributed systems topics need to be introduced with more rigor in an undergraduate CSE education. Topics related to distributed systems are present in each of the following CS body of knowledge core areas, as defined in CC2001: Algorithms and Complexity, Architecture and Organization, Operating Systems, Intelligent Systems, and Net-Centric Computing . Each of these core areas is further subdivided into topics and units. The STEELMAN draft of CC2001 presents sample curricular components that demonstrate possible strategies for integrating topic and unit coverage into an undergraduate CSE educational experience. Our work couples topically with the proposed intermediate course CS240 - Intelligent Systems. The RoboCup simulation league, with the aid of the Biter framework, could easily serve as a project-based component for this Intelligent Systems course.
Biter continues to evolve. New features and behaviors are being added and we expect the pace to quicken as more users start to employ it for pedagogical and research purposes. We are currently working on the addition of a GUI for the visual development of agents using Java Beans, as well as more low-level behaviors. We envision a system which will allow users to draw graphs with the basic behaviors as the vertices and "inhibits" links as the directed edges. These edges could be annotated with some code. Our system would then generate the Java code that implements the agent. That is, the behaviors we have defined can be seen as components  which the programmer can wire together to form aggregate behaviors. This system will allow inexperienced users to experiment with multiagent systems' design, both at the agent and the multi-agent levels. We also believe the system will prove to be useful to experienced multiagent researchers because it will allow them to quickly prototype and test new coordination algorithms.
 Robocup initiative. http://www.robocup.org/
 Computing curricula 2001, steelman draft, August 2001. http://www.computer.org/education/cc2001/steelman/cc2001.
 Brooks, R. A.
Intelligence without representation. Artificial Intelligence
47 (1991), 139-159.
 Gamma, E., Helm, R., Johnson, R., and Vlissides, J. Design Patterns : Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
 Georgeff, M.,
Pell, B., Pollack, M., Tambe, M., and Wooldridge, M. The Belief-Desire-Intention
model of agency. In Proceedings of Agents, Theories, Architectures,
and Languages (1999).
 Jennings, N. R.
On agent-based software engineering. Artificial Intelligence
117 (2000), 277-296.
 Kube, C. R., and
Bonabeau, E. Cooperative transport by ants and robots. Santa Fe 99-01-008.
 Oram, A., Ed.
Peer-to-Peer. O'Reilly, 2001.
 Resnick, M. Turtles,
Termites and Traffic Jams. The MIT Press, 1994.
 Stone, P. Layered
Learning in Multiagent Systems. MIT Press, 2000.
 Szypersky, C. Component Software: Beyond Object-Oriented Programming. Addison-Wesley, 1999.
 Vidal, J. M.,
Buhler, P. A., and Huhns, M. N. Inside an agent. IEEE Internet Computing
5, 1 (January-February 2001).
 Weiss, G., Ed.
Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence.
MIT Press, 1999.