A polymorphic C implementation of a basic binary search tree is available here: https://github.com/Tyresius92/binary-search-tree
A Binary Search Tree (BST) is a data structure which has the following characteristics:
Building a standard BST is similar to the search above. Simply follow the same path you would while searching for your number, and then insert it when you hit an empty spot. For instance, if inserting 5, it would become the right child of 4, which is where we fell off the tree when searching for 5.
In a relatively balanced BST (like the one at left, where all the nodes are as close to the root as possible), you would perform approximately O(lg n) comparisons (all logarithms are base 2). For instance, in the tree at left, there are 9 nodes. The log of 9 is ~3.16. We therefore expect to do 3 to 4 comparisons on average. In this tree, we end up doing 4 comparisons when searching for 5, but only 3 comparisons when searching for 2.
In order to build a BST, you have to do n insertions, and each one takes O(lg n) time on average. Therefore, the expected time to build a binary search tree is O(n lg n). Similarly, doing n searches also takes O(n lg n) time.
If you would like a more formal discussion of this information, there is a slide deck here, and a video lecture here.
If you need a refresher on runtimes, please click here.
The depth d of a node refers to how many levels you need to go through in order to reach the node. For instance, in the tree above, 10 has a depth of d = 2, since you have to go through 8 at the root, as well as 13 on the next level. Similarly, 7 has a depth of d = 3. We say that the root has a depth of zero (0).
You can store each node's depth as an additional bit of data in each node. It is trivial to maintain this data, even with rotations (discussed later).
Standard binary search trees have a few weaknesses. First, although they have an expected runtime of O(log n) for a single search, they actually have a worst case runtime of O(n). Suppose that, when building the tree above, we had inserted the nodes in increasing order: 1, 3, 4, 6, 7, 8, 10, 13, 14. We would have put 1 at the root, then 3 as the right child of 1, then 4 as the right child of 3, and 6 as the right child of 4, and so on, and we would end up with just a single path of nodes, which is not any better than our original list.
In the scenario presented in the previous paragraph, we end up building one long list, where we always go right. In order to insert the n-th element, we would end up doing n - 1 comparisons. This evaluates to n x (n - 1) = O(n^2) work. Searching and other operations take a similar amount of work, by the same arguments.
This runtime defeats the entire point of building the BST in the first place. It would be nice to have a way to ensure that we always end up with a balanced binary search tree instead of just one long list of elements.
With (most) dynamic BSTs, the goal is to keep the tree relatively balanced. We don't necessarily need a perfectly balanced tree, where all nodes have minimum depth. We just need it to be reasonably balanced. If the tree becomes unbalanced, rebalancing is accomplished using a series of rotations.
In a Red Black tree, we color each node red or black, and maintain several properties to ensure we remain reasonably balanced. You can read more about Red Black Trees here.
Splay trees have a completely different access pattern that does not care about balance, but they still rely on rotations. You can read more about Splay Trees here.
There are two basic rotations - left rotate and right rotate. Since they are symmetric, we will describe only one here.
Looking at the gif at right, there are three triangles. By definition, all items in the leftmost triangle are smaller than items anywhere else in this tree, and all items in the rightmost triangle are larger. All items in the middle triangle must be between the two circle items.
We can therefore manipulate the relationships between the nodes without having to worry about invalidating the BST properties.
In a left rotation, node L is at the root of the subtree, and R is L's right child. To perform the rotation, we make the left child of R (the middle triangle) the right child of L. Then we make L the left child of R, and finally, point L's parent at R.
We will discuss the following kinds of Dynamic Binary Search Trees: