8. Multi-Agent Systems
Introduction [1]
- We explored the differences in decision-making when we have perfect information (or when we know with certainty what the outcome of an action will be) as well as partial information (when we do not have certainty of the outcome of an action and we must estimate the probability of the outcome).
- We introduce the concept of multiple agents within a particular environment. Each agent acts autonomously to achieve its own goals. What makes this a bit different is the fact that each agent can have its own goals and will make decisions based upon its evaluation of the environment, its own utility, and its specific goals.
- This adds a layer of uncertainty and complexity to the problem: as part of its planning process, an agent must anticipate or take into account the actions of the other agents.
- In uncertain environments, this means that an agent must develop a set of beliefs on how the other agent will act and develop a set of contingencies based upon those actions.
Ch. 14: Multiagent Systems [2]
14.1 Multiagent Framework
- A mechanism specifies what actions are available to each agent and how the actions of the agents lead to outcomes.
- An agent acts strategically when it decides what to do based on its goals or utilities.
- Sometimes nature is treated as an agent. Nature is defined as being a special agent that does not have preferences and does not act strategically. It just acts, perhaps stochastically.
- Agents that are not acting strategically are treated as part of nature.
- A strategic agent should not treat other strategic agents as part of nature, but rather should be open to coordination, cooperation, and perhaps negotiation with other strategic agents.
- Agents can be cooperative or competitive.
- Cooperative agents are those that work together to achieve a common goal. They may share information, resources, and strategies to maximize their collective utility.
- Competitive agents are those that work against each other to achieve their own goals. They may try to outsmart or outmaneuver each other to maximize their individual utility.
14.2 Representations of Games
- Three representations are presented; two of these are classic representations from economics:
- The first abstracts away all structure of the policies of the agents (Normal Form Games).
- The second models the sequential structure of games and is the foundation for much work on representing board games (Extensive Form Games).
- The third representation moves away from the state-based representation to allow the representation of games in terms of features (Multiagent Decision Networks).
- Normal Form Games:
- Aka, strategic form games.
- It consists of:
- Finite set of agents (I).
- Set of actions (A_i) for each agent (i) in I.
- Utility function (u_i) for each agent (i) in I, which maps the joint action of all agents to a real number.
- A zero-sum game is a special case in which the agents' utilities sum to zero for every joint action: one agent wins only what the other loses.
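The components above (agents, actions, utilities over joint actions) can be sketched in a few lines of Python. This is an illustrative example, not from the source: Matching Pennies, a classic two-agent zero-sum game in which agent 0 wins on a match and agent 1 wins on a mismatch.

```python
# Matching Pennies as a normal-form game (illustrative example).
# Agents: I = {0, 1}. Actions: A_i = {"Heads", "Tails"} for each agent.
# Utility: u maps the joint action to one real number per agent.

ACTIONS = ["Heads", "Tails"]

def utility(action_0: str, action_1: str) -> tuple[int, int]:
    """Map a joint action to a utility for each agent: (u_0, u_1)."""
    if action_0 == action_1:
        return (1, -1)   # match: agent 0 wins
    return (-1, 1)       # mismatch: agent 1 wins

# The utilities always sum to zero, which is what makes the game zero-sum.
for a0 in ACTIONS:
    for a1 in ACTIONS:
        u0, u1 = utility(a0, a1)
        print(f"({a0}, {a1}) -> u_0={u0:+d}, u_1={u1:+d}")
```

Enumerating the joint actions like this is exactly the payoff matrix of the strategic-form representation.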
- Extensive Form Games
- Aka, game trees.
- This is an extension of the single-agent decision tree to multiple agents.
- It consists of:
- A tree, where each node represents a game state and each edge represents a possible move (action).
- Each internal node is labeled with a player (agent) who is to move at that node (that agent controls the node).
- Each arc from a labelled node is an action of the agent who controls the node.
- Each internal node with the nature label is a chance node and has a probability distribution over its children that specifies the probability of each child being selected.
- Each leaf node is labelled with a utility value for each agent.
- This form represents the sequential structure of games, that is, how the game unfolds as time progresses.
- A strategy for agent i is a function from nodes controlled by agent i into actions. That is, a strategy selects a child for each node that agent i controls.
- A strategy profile consists of a strategy for each agent.
Minimax Algorithm [3]
- Two players compete against each other in a game of perfect information: each has complete (closed-world) knowledge of the state of the game.
- We will consider two teams: white and black. The white team is the maximizer, and the black team is the minimizer.
- Positive utility values are good for the maximizer and bad for the minimizer. Negative utility values are bad for the maximizer and good for the minimizer.
- Positive infinity means win for the maximizer, and negative infinity means win for the minimizer.
- The minimax algorithm is a recursive algorithm that traverses the game tree to find the optimal move for the maximizer.
- The algorithm works by recursively evaluating the utility of each possible move and selecting the move that maximizes the minimum utility (hence the name “minimax”).
- The algorithm assumes that the opponent will always play optimally, meaning that they will choose the move that minimizes the maximizer’s utility.
- The algorithm can be represented as a tree structure, where each node represents a game state and each edge represents a possible move.
- The leaves of the tree represent terminal states, which have a utility value associated with them.
- The algorithm starts at the root node and recursively evaluates the utility of each child node until it reaches the leaves.
- The utility values of the leaves are then propagated back up the tree, with the maximizer selecting the child node with the highest utility value and the minimizer selecting the child node with the lowest utility value.
- When the full tree is too large to traverse, a heuristic (evaluation) function can be used to estimate the utility of non-terminal states, allowing the search to be cut off at a fixed depth and those estimates to be treated as leaf values.
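The recursive evaluation described above can be sketched in a few lines. The nested-tuple tree encoding is an assumption made for brevity (internal nodes are tuples of children, leaves are numeric utilities for the maximizer); it is not the representation used in the source.

```python
# Minimax over a game tree encoded as nested tuples (illustrative encoding).
# Internal node: a tuple of children. Leaf: a terminal utility for the maximizer.

def minimax(node, maximizing: bool) -> float:
    """Value of `node` when both players play optimally from here on."""
    if isinstance(node, (int, float)):        # leaf: terminal utility
        return node
    values = (minimax(child, not maximizing) for child in node)
    # Maximizer picks the highest child value; minimizer picks the lowest.
    return max(values) if maximizing else min(values)

# Maximizer moves at the root, minimizer at the next level down.
tree = ((3, 5), (2, 9))
print(minimax(tree, True))   # -> 3
```

At the root the maximizer compares min(3, 5) = 3 against min(2, 9) = 2 and picks the first branch, which is the back-propagation of leaf values the notes describe.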
Alpha-Beta Pruning [4]
- Each level in the tree represents a different player (maximizer or minimizer), where the maximizer is trying to maximize the utility value and the minimizer is trying to minimize it (for the maximizer).
- Alpha-beta pruning is an optimization technique for the minimax algorithm that reduces the number of nodes that need to be evaluated in the game tree.
- The algorithm maintains two values, alpha and beta, which represent the minimum score that the maximizer is assured of and the maximum score that the minimizer is assured of, respectively.
- If, at a minimizer node, a move is found whose value is at most the current alpha, the remaining branches at that node can be pruned: the maximizer already has a better option elsewhere and will never let play reach this node.
- Similarly, if, at a maximizer node, a move is found whose value is at least the current beta, the remaining branches can be pruned, since the minimizer will never allow play to reach this node.
- This allows the algorithm to skip evaluating branches of the tree that will not affect the final decision, significantly reducing the number of nodes that need to be evaluated.
- The alpha-beta pruning algorithm can be implemented using a depth-first search approach, where the algorithm explores the tree in a depth-first manner and prunes branches as soon as it determines that they will not affect the final decision.
References

1. Learning Guide Unit 8: Introduction | Home. (2025). Uopeople.edu. https://my.uopeople.edu/mod/book/view.php?id=454721&chapterid=555072
2. Poole, D. L., & Mackworth, A. K. (2017). Artificial intelligence: Foundations of computational agents (2nd ed.). Cambridge University Press. https://artint.info/2e/html/ArtInt2e.html (Chapter 14: Multiagent Systems)
3. Sen, G. (2017, March 6). What is the minimax algorithm? - Artificial intelligence [Video]. YouTube. https://youtu.be/KU9Ch59-4vw
4. Levine, J. (2017, March 15). Minimax with alpha beta pruning [Video]. YouTube. https://youtu.be/zp3VMe0Jpf8