What is Alpha Beta Pruning all about?
Alpha Beta Pruning technique makes the minimax algorithm leaner and fitter.
This means that minimax is doing some extra work - which can be afforded to loose as it will not impact its expected output.
What exactly does Mini-Max do?
MiniMax does the following 4 things:
1. Find the best move for each node.
2. Compare the best move out of all the legal moves.
3. Return it to father who has opposite intentions.
4. The father then picks what is the best for him based on the values returned.
The point is that, Minimax is interested in finding the best value for each and every node of the search tree.
However, in alpha-beta, we also use the relations between the nodes of a tree and use the advantage that, the best move need not be found at each node.
If the SON finds out that there is a move which is too good for him, he should stop searching further.
This is because, the FATHER will never pick him, if he already has a better SON than him.
In chess terms, this is called as refutation.
See this game:
http://www.chessgames.com/perl/chessgame?gid=1795602
At move number 21,
The engine first calculates Na5 as say 70 centi-pawns better for White (MAXIMISER).
White then evaluates next move Rh3.
He sends to MINIMISER, the score of 70 centi pawns saying if you find that you have any move less than 70 c.p., stop searching further.
This is because, you being a minimiser, will return me that move or any move which is lesser than that because you are a minimiser.
I as a maximiser will never select you because I already have 21. Na5 with 70 centi-pawns.
Black calculates that 21. Rh3 leads to Qf1#.
This is score of -INFINITY.
The point is that black now need not search moves like 21. ... 0-0 because, White will anyways not choose to play 21. Rh3.
If White plays 21 Rh3, black has atleast 21. ... Qf1# .
It could be that maybe black has a better move than Qf1# (ofcourse, it is not in this case).
But the point is that black need not find the best move now as white is anyways not going to play it.
We can say that 21. Rh3 gets refuted by atleast 21. ... Qf1#.
Hence, 21 Rh3 need not be searched more.
Theory
A complete line gets refuted if atleast 1 response is found that is less profitable for the previous depth player compared to another line.
This is because, in the search tree, the FATHER and SON have exactly opposite intentions.
The SON having opposite intentions, will definitely choose that move or a move which can only be further worse for the FATHER.
For alpha beta pruning to work, we need to search to a depth of atleast 2.
Lets put it in a story.
The concept is that lets suppose root position is MAX.
He has 2 children.
Both will be MIN.
Child 1 will has returned a value of 5.
For MAX, this will be Alpha. Initially this was -INFINITY.
He will send this value 5 to his second son and tell him -
"Son, you are a MIN node and I am a MAX node.
Your brother has already given me a value of 5.
If you find even 1 child of yours having value < 5, then please don't look at your other children.
This is because, since you are a MIN, and you already found out that 1 of your sons is worth 3, you will pick him or you will pick any other son of yours who is lesser than him.
You will never return me a value greater than 3 even if you search all of your children.
So, don't waste your time in calculating who is the best son for you."
Code
int AlphaBetaMax(int alpha, int beta, int depth )
{
pv_length[ply] = ply;
if(depth == 0)
return eval();
int i , value;
gen();
for (i = first_move[ply]; i < first_move[ply + 1]; ++i) {
if (!makemove(gen_dat[i].m.b))
continue;
value = AlphaBetaMin(alpha, beta, depth - 1 );
takeback();
if(value >= beta)
return beta;
if(value > alpha){
alpha = value;
pv[ply][ply] = gen_dat[i].m;
for ( int j = ply + 1; j < pv_length[ply + 1]; ++j)
pv[ply][j] = pv[ply + 1][j];
pv_length[ply] = pv_length[ply + 1];
}
}
return alpha;
}
int AlphaBetaMin(int alpha, int beta, int depth)
{
pv_length[ply] = ply;
if(depth == 0)
return -eval();
int i , value;
gen();
for (i = first_move[ply]; i < first_move[ply + 1]; ++i) {
if (!makemove(gen_dat[i].m.b))
continue;
value = AlphaBetaMax(alpha, beta, depth - 1 );
takeback();
if(value <= alpha)
return alpha;
if(value < beta){
beta = value;
pv[ply][ply] = gen_dat[i].m;
for ( int j = ply + 1; j < pv_length[ply + 1]; ++j)
pv[ply][j] = pv[ply + 1][j];
pv_length[ply] = pv_length[ply + 1];
}
}
return beta;
}
Note the following:
1. Alpha and Beta are both received from FATHER node.
2. A MAX node never updates beta. He receives the value of beta from above and uses it to check for refutation. He only updates alpha and sends it to his sons.
3. Similarly, a MIN node never updates alpha.He receives the value of alpha from above and uses it to check for refutation.He only updates beta and sends it to his sons.
4. Alpha and beta are thus the 2 bounds and a node can exist in 3 places.
<= alpha OR >= beta OR between alpha and beta.
5. For a MAX node, beta is always constant and his intention is to increase alpha.
However, he can never increase alpha to more than beta because he will be refuted.
6. Similarly, for a MIN node, alpha is always constant and his intention is to decrease beta.
However, he can never decrease beta to less than alpha because he will be refuted.
Alpha Beta pruning at depth >= 3
Also note that in Alpha Beta pruning at depth >= 3, something more happens.
Consider the continuation of the story above.
Previously the depth was 2.
Hence the son was just running eval on his sons.
Now, since depth = 3, the MIN son will send this value of 5 as alpha to his MAX son.
And tell him this: "Son, your grandfather is a MAX node and my brother has already given him a value of 5.
He told me that if I have even 1 son whose value is less than 5 then he will not pick me, because then i'll never be as good as my brother.
Since depth is > 2, I cannot run eval on you to find your value and you will have to run eval on your sons.
Also, you, like your grandfather, are a MAX node.
If you find that none of your sons are greater than 5, do not waste time in updating the PV.
This is because, for me to return a value > 5, you should have atleast 1 son whose value is greater than 5.
If you don't have it, then I will never have it and we will never make it to the PV.
So finally, what will happen is that, you being a MAX node will not have a value > 5.
Suppose you have the max value as 4.
You will return to me the value of 5.
I being a MIN guy would never pick any son who is more than you.
(if (value <= alpha) return alpha;)
My Father would know that he cannot pick me without me checking my other sons."
Thus, a node is categorised into the following:
1. ALL node:
If a MAX node has all values of his SONS as less than or equal to alpha OR
If a MIN node has all values of his SONS as greater than or equal to beta
Then such a node is an ALL node.
A MAX ALL node will always return alpha and a MIN ALL node will always return beta.
2. CUT node:
If a MAX node has atleast 1 SON of value >= beta OR
If a MIN node has atleast 1 SON of value <= alpha
Then such a node is a CUT node.
A MAX CUT node will always return beta and a MIN CUT node will always return alpha.
It is obvious that, an ALL node's father will always be a CUT node.
3. PV node:
If a MAX node OR a MIN node has atleast 1 SON of value > alpha and value < beta
i.e. alpha < value < beta
Then such a node is a PV node.
Please visit the following link for an awesome explanation of alpha beta algorithm:
http://web.cs.ucla.edu/~rosen/161/notes/alphabeta.html
A few comments:
The smaller the window size, lesser number of nodes to be searched.
Also, the efffectiveness of alpha beta lies in move ordering.
Currently the moves are searched according to the order in which they are generated.