Comparing linear and binary search
In this step, you’re going to compare linear and binary search, and see how well they perform on a variety of sorted lists when searching for specific items.
Which search to use?
You have looked at two different options for searching a list this week, but how do you know which algorithm to use in any given situation?
The question may seem obvious, because I spent two whole steps telling you that binary search is much more efficient. However, there are certain situations where linear search is the only option.
The main one is when a list is unsorted. If the list you want to search is not in order, then the only option you have is linear search. Binary search relies on the idea that you can tell which way to look by examining the value at each midpoint. You could sort the list first and then run a binary search, but this would take longer than just running a linear search algorithm.
If the item is near the start of the list, then linear search will also be more efficient — even if the list is sorted in order. If you have control over the order of the list, you could place commonly searched for items at the front of the list to use linear search efficiently.
Comparing the speed of the algorithms
Just as you did in week three, you can use the time module in Python to find out how long it takes the two search algorithms to find a specific item.
from time import time
from random import randint
#######################################################################
# COPY AND PASTE YOUR linear_search AND binary_search functions below #
#######################################################################
#Generate a large list to search
list_to_search = []
for i in range(1000000):
list_to_search.append(i)
# Capture the current time, before the algorithm has run
start_time = time()
# Run the linear search algorithm
linear_search(list_to_search, randint(0,1000000))
# Calculate the running time by taking the start time away from the current time, and display it.
print("Linear took: " + str(time() - start_time))
# Repeat the process for binary search...
start_time = time() # Capture start time
# Run the algorithm
binary_search(list_to_search, randint(0,1000000))
print("Binary took: " + str(time() - start_time)) #Display the results
Run this comparison to see how long each algorithm takes. Once you have done this, have a go at answering the following questions:
How do the times vary when the functions are given smaller or larger lists?
What happens to the time when they are asked to search for numbers that are not in the list?
How about when searching for the last number in the list?
Adjust the code above to try these out, and share your thoughts in the comments.