http://www.hamedahmadi.com/gametree/
An Introduction to Game Tree Algorithmsby Hamed Ahmadi (www.hamedahmadi.com)
Preface
Please remember that the methods presented here serve only as an introduction, and are not the only methods. On the contrary, the alphabeta pruning described here can be improved, and a number of other algorithms also exist.
Introduction
I will be assuming that the game is a board game, and that the two players are blue and red, with blue moving first. The general approach to writing a program to play such a game is to look ahead a few moves, and analyze different possibilities. Thus we have two major concerns; devising an analysis function, and searching through different positions.
The Analysis Function
The analysis function is a heuristic; in actuality it is very difficult to find a precise value. However, that doesn't stop us from trying! A good analysis function takes into effect many different aspects of the board position, such as the number of pieces, control of the board, control of certain key squares on the board, and also mobility. Mobility indicates how freely each player can move his/her pieces, and how many moves she/he has. Much of the work in writing a game program involves writing a good analysis function. Usually, after you implement a good search procedure, the best way to improve your program's gameplay is to improve your analysis function! This stage is so important that often complicated methods, like genetic algorithms, are used to tune the analysis function. However, I recommend that you not spend too much time on analysis until you write the actual search procedure.
The Game Tree
Therefore, we need a way to "look forward" a few moves. This is accomplished by searching in a game tree, in which the board positions are organized. In a game tree, every node corresponds to a board position. The children of each node N are the different positions that result from a single move in N, made by the player whose turn it is at N. The root of the tree is the starting position, or the position from which we want to find a move. The leaves (nodes which have no children) in the game tree correspond to terminating positions; positions in which the game has ended. A portion of the game tree for "Tictactoe" is shown in Figure 1. Note that each level in the tree pertains to the moves made by one particular player.
The MiniMax Algorithm
Each node that is not a leaf corresponds to a decision made by the player whose turn it is at the node's level. The player making a move at the node must decide which of that node's children it should choose as the next board position. It is clear that if the player is blue, he should choose the child with the largest value. Likewise, red should choose the position which minimizes her node's value. In this manner, we can determine the value for each node, and in doing so we find the best move possible from each node. Figure 2 illustrates this method. For a small game like Tictactoe, we can build and search the entire tree, all the way down to the actual leaves. However, this is clearly impossible for a more complicated game like reversi or chess. Here is where we finally use the analysis function. Since we can't search the entire tree, we only consider the first few levels. In this manner we pretend that the nodes below a certain level are leaves, and determine their value using the analysis function. We can implement the procedure recursively, using two functions, one for red and one for blue. int BlueValue(Board b, int depth) { if ((GameOver(b) or depth>MaxDepth) return Analysis(b) int max = infinity for each legal move m in board b { copy b to c make move m in board c int x = RedValue(c, depth+1) if (x>max) max = x } return max } int RedValue(Board b, int depth) { if ((GameOver(b) or depth>MaxDepth) return Analysis(b) int min = infinity for each legal move m in board b { copy b to c make move m in board c int x = BlueValue(c, depth+1) if (x<min) min = x } return min } Where MaxDepth is a constant defining the maximum depth. Each of these functions is initially called with depth=0. The value returned is the value of the node corresponding to board b. Now, in order to find the best move from any position, we just need to choose the child with the best (minimum or maximum depending on color) value.
The NegaMax Algorithm
One way to implement negamax is shown below. 1 const int sign[2]={1,1} //0 is blue, 1 is red 2 3 int NegaMax(Board b, int depth, int color) { 4 if (GameOver(b) or depth>MaxDepth) 5 return sign[color]*Analysis(b) 6 int max = infinity 7 for each legal move m in board b { 8 copy b to c 9 make move m in board c 10 int x=  NegaMax(c, depth+1, 1color) //Note the "" before "NegaMax" 11 if (x>max) max = x 12 } 13 return max 14 }The function NegaMax returns the value of the node b, as seen by the player indicated by color. That is, the better the position is for color, the greater the value that it returns. Note that this is exactly the opposite of what we see in Figure 3. (In Figure 3, nodes were considered from their parent's point of view.) However, this is fixed in line 10, where the value of the child is reversed (multiplied by 1). In this manner, node values are processed as in Figure 3, with the exception of the root. Reversing the root, as mentioned before, is not important. In fact, it can be better if we don't reverse it. This way, the true value (the value it would have had in minimax) of a node b with color c, is returned by NegaMax(b,0,c). Note also that we multiply the analysis value of leaves by sign[color] in line 5. We need to be careful not to leave out things like that, or make mistakes about them. A common mistake in negamax is to get the positive and negative mixed up. It is so easy to make a mistake and end up with a program that always finds the worst possible move for each player. Even worse, you might end up with a program that always finds the best move for one color, but the worse move for the other color.
AlphaBeta Pruning and Branching
Thus we need a way to speed up our search procedure. How? Consider how a human would play a game. She would say, "OK, if I move my queen there, it would be captured immediately by my opponent's knight. So I won't move my queen there." But what would our program do in the same situation? It would consider moving the queen to where it could easily be captured, and then waste time search through thousands of nodes which result from moving pawns on the other side of the board! Eventually it would find the move in which the knight captures the queen, but it didn't need to take so long. AlphaBeta pruning uses the above ideas to drastically reduce the number of nodes we search. Pruning is the elimination of nodes found to be unnecessary to process while searching. Consider the tree in Figure 4. Note that for simplicity, this tree has not been "negamaxed". Let's follow our algorithm, assuming that nodes are processed from left to right. While trying to find the value of A, we go down to B and determine its value to be 6, the minimum of 6 and 9. Then control is passed to C. From there we compute E to be 4, the maximum of 4 and 2. Since C is a minimizer, we now know that the value of C is less than or equal to 4. Therefore, C will not be chosen by A as the maximum, since A already has the child B with value 6, which will be greater than any value C may later have. Hence, searching F and G is useless, and we can immediately stop processing (prune) C. Control then passes to D. The same thing can happen if A is a minimizer. In general, we can define lower and upper bounds for the value of a node, called alpha and beta, respectively. One of these bounds (depending on the player's color) does not change within the node, and if the node's value crosses this bound, we can prune the node. The other bound, however, changes as the node's value is updated. This can be implemented using minimax as below. int BlueValue(Board b, int depth, int alpha, int beta) { if ((GameOver(b) or depth>MaxDepth) return Analysis(b) int max = infinity for each legal move m in board b { copy b to c make move m in board c int x = RedValue(c, depth+1, alpha, beta) if (x>max) max = x if (x>alpha) alpha = x if (alpha>=beta) return alpha } return max } int RedValue(Board b, int depth, int alpha, int beta) { if ((GameOver(b) or depth>MaxDepth) return Analysis(b) int min = infinity for each legal move m in board b { copy b to c make move m in board c int x = BlueValue(c, depth+1, alpha, beta) if (x<min) min = x if (x<beta) beta = x if (alpha>=beta) return beta } return min } Note that we can write alpha>=beta instead of alpha>beta, which slightly increases our program's speed. These functions are intially called with depth=0, alpha=∞, and beta=∞. The negamax version can be written as below. const int sign[2]={1,1} //0 is blue, 1 is red int NegaMax(Board b, int depth, int color, int alpha, int beta) { if ((GameOver(b) or depth>MaxDepth) return sign[color]*Analysis(b) int max = infinity for each legal move m in board b { copy b to c make move m in board c int x=  NegaMax(c, depth+1, 1color, beta, alpha) if (x>max) max = x if (x>alpha) alpha = x if (alpha>=beta) return alpha } return max }Note that alpha and beta are replaced with beta and alpha, respectively, when NegaMax is called within itself. AlphaBeta pruning, used properly, will usually allow us to search at least 23 extra levels. However, pruning alone is not enough! Let's look back for a moment at Figure 4. Why was C pruned? C was pruned because B, which was searched before C, had a value better than any value C could possibly have. We knew that C will not be chosen, because we'd already found a better move in A. Thus, in order to maximize the number of nodes that get pruned, we try to check the moves that "look better" first. That is, we process the children of a node in the order of how good we expect the move to be. This is called branching. Here, we use another heuristic to try to compare moves. Some of the things we could use are:
The best way to find which method to use is to experiment! See which works best for your program. After we find the move values, we can sort them in decreasing order, and process them in that order. However, a heap data structure may prove to be more efficient than sorting, because when pruning takes place, we leave the function and thus don't need to have the rest of the moves sorted. Another method sometimes used to increase speed is to divide the number returned by the analysis function by a constant value. (For example divide it by 10.) This reduces the range of values that the nodes in the game tree can have, and can therefore result in more pruning. However, this makes the analysis less accurate, so it should be used with care.
The Horizon Effect
The fact is, the problem is still there, though it is less drastic, because we often have time to fix things before they get too bad. The problem is called the horizon effect, and happens when we use a constant maximum depth. In this situation, there is always a "horizon" over which we cannot see. At every move, the horizon moves back a bit, revealing something that we couldn't see before. How can we prevent being surprised by something bad suddenly appearing after the horizon moves back? Usually such events occur when there is a node at our bottom level (at the horizon), in which a strong move exists, but is not considered since we are at our maximum depth. In order to avoid this situation, we allow our program to go beyond the maximum depth under certain circumstances, when the position is "hot". For example, good chess programs usually process capturing do a great depth. That is, we can extend the maximum depth for a node in which the capture of a piece other than a pawn exists, or if the king is in check.
References
