Binary Search

Linear Search reminder

A reminder that a linear search will iterate through an array, checking if each element matches a goal. an example algorithm is shown below.

What is a Binary Search?

A Binary Search algorithm works by splitting the list in two and comparing the mid point. Similar to using a dictionary:

This is a Divide and Conquer algorithm. It is generally faster than linear search but the main drawback is the list has to be sorted first  (more info on sorting Bubble Sorts and Insertion Sorts).

A Binary Search requires the array of data to be sorted in order before it can be executed.

Binary Search Walkthrough

Binary Search Pseudocode

found = FALSE

goal = Input from User

REPEAT

mid = INTEGER((startpos + endpos)/2)

IF array(mid) = goal 

    Found = TRUE

ELSE IF array(mid) < goal 

    startpos = mid + 1

ELSE 

            endpos = mid - 1

UNTIL found = TRUE OR (startpos>endpos)


This example uses a REPEAT UNTIL loop so the condition isn't checked until there is one iteration through the array.

found = False

goal = Input from User

WHILE NOT FOUND and startpos <= endpos

mid = INTEGER ((startpos+ endpos)/2)

IF array(mid) = goal 

  found = True

  position TO mid

ELSE IF array(mid) < goal 

  startpos = mid + 1

ELSE 

           endpos = mid - 1

        END IF

END WHILE

This version uses a WHILE loop so the condition is checked before the array is traversed.

It is important to note that we cannot rely on just found = FALSE as an exit condition as it is a possibility that the goal value is not in the list. If the start pointer ever reaches the end pointer then we know we have searched the entire list.

Binary Search - Python Implementation

Note: The // operator in Python is integer division so 5//2 would equal 2. The int() function could easily have been used.

As the main function of a Binary search is usually to return the position it is important to consider what value can be interpreted as a no result

Operation of a Binary Search

For a list of N data items the average search length would be Log2N. So, for a list of 64 items, the average search length would be 6.

Further Reading

There can be a potential problem when using the binary search on extremely large lists

When the low + high value exceeds the storage limits for an integer

https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html

To correct a possible overflow you can change the midpoint calculation to:

 middle = low + ((high - low) / 2)

This will avoid any overflow.

Binary Search comparison against Linear Search

Binary Search Example Video