Runtime and Dynamic Optimality

Runtime Basics

Because there are so many variables to consider when comparing algorithms, computer scientists find it helpful to discard all but the most dominant factors when analyzing an algorithm's runtime. This tutorial provides the basic terminology needed to understand the discussions on the rest of the site.

Runtime in terms of number of elements

Due to variations in hardware, we do not talk about algorithms in terms of actual units of time (e.g. "this algorithm takes 4 seconds"). What takes 2 seconds on one machine might take 45 seconds on another. However, both machines would have in common the fact that they accessed the same elements the same number of times (assuming everything is held constant except the architecture of the machine itself).

This leads us to a key insight: we can analyze algorithms in terms of the size of the dataset. Additionally, since nearly any algorithm completes almost instantly on 10 elements, we care about the runtime as the number of elements n approaches infinity. Therefore, we generally analyze algorithms in terms of the number of elements, although we sometimes use other variables as well.

Putting Bounds on Runtime

Upper Bounds

Usually, we care about the absolute worst case runtime (called the "upper bound") of an algorithm. Put differently, we want to know: "Suppose this algorithm executes its worst case. How bad would that be?"

For instance, suppose you have a list of numbers (unsorted), and your algorithm (let's call it "Scan") calls for you to look at the first number, then the second number, and then the third number, and so on, until you either find the number you are looking for, or you run out of numbers. In the worst case, you don't find the number, but you look at every element (so you look at all n elements). We notate that using Big-Oh notation: O(n). Since we touch every element only one time, we say that we are doing a "linear" amount of work.
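The "Scan" algorithm described above can be sketched in a few lines of Python (the function name and sample values are just for illustration):

```python
def scan(items, target):
    """Linear scan: look at each element in order until we find the target.

    Worst case (target absent): every one of the n elements is examined, so O(n).
    """
    for item in items:  # up to n iterations, one touch per element
        if item == target:
            return True
    return False

print(scan([4, 8, 15, 16, 23, 42], 16))  # found partway through -> True
print(scan([4, 8, 15, 16, 23, 42], 7))   # worst case: all 6 elements examined -> False
```

Note that the early `return` only helps in the best and average cases; the worst case still touches all n elements, which is why the upper bound is O(n).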

Given another algorithm that touches all n elements n times, we end up doing n x n work for an O(n^2) runtime.
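A classic example of touching all n elements n times is a naive duplicate check with two nested loops (a hypothetical sketch, not an algorithm from the text):

```python
def has_duplicate(items):
    """For each of the n elements, scan all n elements again: n x n = O(n^2) work."""
    for i in range(len(items)):        # n iterations
        for j in range(len(items)):    # n iterations for each i
            if i != j and items[i] == items[j]:
                return True
    return False

print(has_duplicate([3, 1, 4, 1, 5]))  # 1 appears twice -> True
print(has_duplicate([3, 1, 4, 5]))     # all distinct: worst case, n*n comparisons -> False
```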

Lower Bounds

Sometimes we also care about the lower limit of an algorithm (called the "lower bound"). In other words, we might want to know: "Suppose the algorithm executes its worst case. What's the best possible runtime we could get in that situation?" With the "Scan" algorithm, the best the worst case can do is linear. There is no scenario in which you avoid looking at every element in the worst case. The "Scan" algorithm is therefore Big-Omega of n, notated Ω(n).

Tight Bounds

Finally, if we happen to have an upper bound and a lower bound that are the same, we call this the "tight bound," and notate it with Theta notation: ϴ(n).

A note on constant factors

This notation still holds if (in the "Scan" algorithm) for some reason you need to touch every element 2 times, or 10 times. You could say these are O(2n) or O(10n), but generally speaking, we don't care about constant factors (at least, we don't care what they are), so these are also O(n). Similarly, if we need to touch all n elements 10n times, we would be doing n x 10n = 10n^2 = O(n^2) work.
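We can see why constant factors wash out by counting touches directly. In this hypothetical sketch, a variant of "Scan" makes two full passes over the list, so it performs exactly 2n touches:

```python
def scan_twice(items, target):
    """Touch every element twice (one counting pass, one searching pass):
    2n touches total -- O(2n), which we simplify to O(n)."""
    touches = 0
    for _ in items:          # first pass: n touches
        touches += 1
    found = False
    for item in items:       # second pass: n more touches (no early exit, to keep 2n exact)
        touches += 1
        if item == target:
            found = True
    return found, touches

# Doubling the input doubles the touches; the constant factor 2 never changes.
print(scan_twice([1, 2, 3, 4], 3))      # (True, 8)
print(scan_twice([1, 2, 3, 4] * 2, 3))  # (True, 16)
```

The growth is still proportional to n, which is all Big-Oh notation is meant to capture.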

Dynamic Optimality

One of the most interesting areas of current research involving BSTs is dynamic optimality.

The Quest for an O(OPT) BST

Computer scientists are interested in answering the question: is there a single best binary search tree? One that, no matter what sequence of operations you perform, does the minimum possible amount of work? (Or, perhaps, some constant factor times the minimum amount of work?)

This hypothetical tree is called OPT. A BST that performs within a constant factor of the optimal runtime on an arbitrary access sequence and arbitrary set of nodes is said to be O(OPT).

The same example tree, with 8 at the root, 3 below it, and 6 below 3, is used in all of the following discussions.

Offline Algorithms

In order to access 6 (the particular operation doesn't matter, so let's just consider searching), you would need to first access 8 and 3. But suppose you knew in advance that you're going to access the node containing 6 more often than any other node (perhaps we access 6 one million times, and all other nodes only 3 times each). Whenever we know all of the operations in advance, we call this offline.

If we already know that 6 is going to be accessed most frequently, then it would make the most sense to put 6 at the root, so that we don't need to needlessly access 8 and 3 over and over again. Given the search frequency of each value, along with the full set of items in the tree, we can construct a tree that requires the fewest possible accesses by placing the items with the highest search frequencies closer to the root.
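We can quantify the savings by computing the total access cost of each shape: the sum, over all nodes, of search frequency times node depth (counting the root as depth 1). The frequencies below are hypothetical, chosen to match the text's example:

```python
def total_cost(depths, freqs):
    """Total touches = sum of (search frequency x depth of that node)."""
    return sum(freqs[key] * depths[key] for key in freqs)

# Hypothetical frequencies: 6 is searched a million times, the others 3 times each.
freqs = {8: 3, 3: 3, 6: 1_000_000}

# Shape from the text: reaching 6 means touching 8 (depth 1), then 3, then 6.
original = {8: 1, 3: 2, 6: 3}

# Rearranged so the hottest node is the root: 6 at depth 1, with
# 3 as its left child and 8 as its right child (still a valid BST).
rearranged = {6: 1, 3: 2, 8: 2}

print(total_cost(original, freqs))    # 3,000,009 touches
print(total_cost(rearranged, freqs))  # 1,000,012 touches
```

Moving the most frequently searched node to the root cuts the total work by roughly a factor of three in this example, which is exactly the intuition behind the optimal offline construction.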

The actual algorithm to do this is a bit more complicated (see Knuth's original paper on the topic), but that is the underlying principle. However, we are not particularly interested in offline algorithms in this project.

Online Algorithms

Much more interesting is the possibility of creating an optimal BST when we don't know the operations beforehand.

Red Black Trees: O(lg n * OPT) = O(lg n)-competitive

Red Black Trees are guaranteed to find a value in O(lg n) time, while the best possible runtime would be constant. The competitive bound is therefore quite trivial: O(lg n) * O(1) = O(lg n)-competitive.

Tango Trees: O(lg (lg (n)) * OPT) = O(lg lg n)-competitive

The runtime of a single search in a Tango Tree is bounded by the number of non-preferred edges traversed (call this number E) times the amount of work done in a Tango operation. This equals O(E * lg lg n).

Another way to describe the number of non-preferred edges is the Interleave Lower Bound, also known as the Wilber 1 bound. The Interleave Lower Bound is also described in Demaine, Harmon, Iacono, and Patrascu's paper introducing Tango Trees.

If we can find a tree that requires only some constant factor times the interleave lower bound (E) to access any arbitrary element, we will have discovered OPT. Tango Trees fall short of this by a doubly logarithmic factor: their runtime is O(E * lg lg n) = O(OPT * lg lg n) = O(lg lg n)-competitive.

Splay Trees: Conjectured to be O(OPT), but this is unproven.

So far, no one has managed to come up with a proof that splay trees are (or are not) dynamically optimal, despite the fact that they are simpler than Tango Trees.