Optimal Static BST

The Problem: Given a list of keys, their associated search probabilities, and the probabilities of searching for a value in the gaps between keys, we want to construct a BST that minimized expected search time.

What would the structure of such a BST be?

Such a BST would have all gaps (represented as red squares) in the leaf positions and all keys (represented as blue circles) in the interior positions:

Our goal is to arrange the keys and gaps such that

E[search_time] = Σ(prob(v_i) * search_time(v_i)) over all vertices v_i

= Σ(prob(v_i) * depth(v_i)) over all vertices v_i

is minimized.

By recognizing that any optimal tree must contain optimal subtrees, we can use dynamic programming to solve the problem.

An Example

Suppose you had 5 keys, K_1 . . . K_5, with corresponding search probabilities of P_1 . . . P_5. Also suppose that the probability of searching a value between K_i and K_j = Q_(i-1) , for j-i = 1:

We can then generate the weights of all possible subtrees:

Next we can calculate the optimal search time of each subtree by following the formula:

OPT(i, j) = min{OPT(i, f-1) + W_i,j + OPT(f+1, j)} for all f such that i<=f<=j

Then, by noting which subproblems combine to make the OPT, we can construct the tree:

My Code

Below is a downloadable file containing an interactive implementation for optimal (static) binary search trees. Included in the download, is the code, a readme, a script to allow the program to be run with an input file, and an example input file that corresponds to the above example.

optimalBST.zipShared with Dropbox

Google Sites

Report abuse