Such a BST would have all gaps (represented as red squares) in the leaf positions and all keys (represented as blue circles) in the interior positions:
Our goal is to arrange the keys and gaps such that
E[search_time] = Σ(prob(v_i) * search_time(v_i)) over all vertices v_i
= Σ(prob(v_i) * depth(v_i)) over all vertices v_i
is minimized.
By recognizing that any optimal tree must contain optimal subtrees, we can use dynamic programming to solve the problem.
Suppose you had 5 keys, K_1 . . . K_5, with corresponding search probabilities of P_1 . . . P_5. Also suppose that the probability of searching a value between K_i and K_j = Q_(i-1) , for j-i = 1:
We can then generate the weights of all possible subtrees:
Next we can calculate the optimal search time of each subtree by following the formula:
OPT(i, j) = min{OPT(i, f-1) + W_i,j + OPT(f+1, j)} for all f such that i<=f<=j
Then, by noting which subproblems combine to make the OPT, we can construct the tree:
Below is a downloadable file containing an interactive implementation for optimal (static) binary search trees. Included in the download, is the code, a readme, a script to allow the program to be run with an input file, and an example input file that corresponds to the above example.