Up until now all the searches that we have studied there were only one person or agent searching the solution space to find the goal or the solution. In many applications, there might be multiple agents or persons searching for solutions in the same solution space.
Such scenarios usually occur in-game playing where two opponents also called adversaries are searching for a goal. Their goals are usually contrary to each other. For example, in a game of tic-tac-toe player, one might want that he should complete a line with crosses while at the same time player two wants to complete a line of zeros.
Hence both have different goals. Notice further that if player one puts across in any box, player-two will intelligently try to make a move that would leave player-one with a minimum chance to win, that is, he will try to stop player one from completing a line of crosses and at the same time will try to complete his line of zeros.
Searches in which two or more players with contrary goals are trying to explore the same solution space in search of the solution are called adversarial searches.
In adversarial searches, one player tries to cater for the opponent’s moves by intelligently deciding that what will be the impact of his own move on the overall configuration of the game. To develop this stance he uses a look ahead thinking strategy.
That is, before making a move he looks a few levels down the game tree to see that what can be the impact of his move and what options will be open to the opponent once he has made this move. To clarify the concept of adversarial search let us discuss a procedure called the minimax procedure.
The opponents in a game are referred to as MIN and MAX. Although this is partly for historical reasons, the significance of these names is straightforward: MAX represents the player trying to win or to MAXimize her advantage.
MIN is the opponent who attempts to MINImize MAX’s score. We assume that MIN uses the same information and always attempts to move to a state that is worst for MAX. The maximizer has to keep in view that what choices will be available to the minimizer on the next step. The minimizer has to keep in view that what choices will be available to the maximizer on the next step. e.g.
Standing at node A the maximizer wants to decide which node to visit next, that is, choose between B or C. The maximizer wishes to maximize the score so apparently 7 being the maximum score, the maximizer should go to C and then to G. But when the maximizer will reach C the next turn to select the node will be of the minimizer, which will force the game to reach configuration/node F with a score of 2.
Hence maximizer will end up with a score of 2 if he goes to C from A. On the other hand if the maximizer goes to B from A the worst which the minimizer can do is that he will force the maximizer to a score of 3. Now, since the choice is between scores of 3 or 2, the maximizer will go to node B from A.
To fully understand the Minimax procedure consider a game “nim”. To play nim, a number of tokens are placed on a table between the two opponents; at each move, the player must dive a pile of tokens into two nonempty piles of different sizes. Thus, 6 tokens can be divided into piles of 5 and 1 or 4 and 2, but not 3 and 3. The player who can no longer make a move loses the game.
In this game, one player is represented as MAX and the opponent is represented as MIN. Let say MIN is allowed to move first as shown in the figure below. Each leaf node is given a value of 1 or 0, depending on whether it is a win for MAX or for MIN. Minimax propagates these values up in the graph through successive parent nodes according to the rule:
- If the parent state is a MAX node, give it the maximum value among its children
- If the parent is a MIN node, give it the minimum value of its children.
The value that is thus assigned to each state indicates the value of the best state that this player can hope to achieve. These derived values are used to choose among possible moves. The above diagram shows the resulting win path for Max in bold arrows. In starting state, all of MIN’s possible first moves lead to nodes with a derived value of 1, the second play MAX can always try to win the game, regardless of MIN’s first move. MIN can pick any first move. By selecting the most favorable option MAX is trying MIN to lose the game.
Although depending upon the problem and search space, it is not always possible to exhaustively search the state space. In order to cope with such space, we use heuristics.
Minimaxing to Fixed Ply depth: In applying minimax to more complicates games, it is seldom possible to expand the state space graph out to the leaf nodes. Instead, the state space is searched to a predefined number of levels, as determined by the available resources of time and memory. This strategy is known as n-move look-ahead, where n is the number of levels explored. As the leaves of this subgraph are not final states of the game, it is not possible to give them values that reflect a win or loss.
Also Read: Shell Scripting – Variables
Instead, each node is given a value according to some heuristic evaluation function. These values are propagated back to the root node of that subgraph. If the parent is on a MIN level, the minimum value of the children is backed up. If the parent is a MAX node, minimax assigns it the maximum value of its children. However, these values do not indicate win or lose.
These values can be propagated upward depending on the MIN or MAX Parents.