apply linear search algorithms to search for specific information in array or ArrayList objects
Computers store vast amounts of data.
One of the strengths of computers is their ability to find things quickly.
This ability is called searching.
For the exam, you will need to know both linear (sequential) search and binary search algorithms.
The following video introduces the concept of searching including sequential search and binary search:
Sequential or Linear search typically starts at the first element in an array or ArrayList and looks through all the items one by one until it finds the desired value (or doesn't).
It will usually return the index if it finds the value or if it searches the entire array or list without finding the value it returns -1.
Binary search can only be used on data that has been sorted or stored in order.
Sorting algorithms will be covered in the next lesson.
It checks to see if the middle value is less than, equal, or greater than the desired value and then based on the results of that it narrows the search.
It cuts the search space in half each time.
LINEAR (SEQUENTIAL) SEARCH
Sequential or linear search is the only method that can be used to find a value in unsorted data.
It usually starts at the first element and walks through the array or list until it finds the value it is looking for and returns the index it found it at, or it loops until the end of the array or list and then it returns a -1 to show that it didn't find the value.
Check out this example for sequentialSearch for arrays below:
Here is the same search with an ArrayList.
The same algorithms can be used with arrays or ArrayLists, but notice that size() and get(i) are used with ArrayLists.
Length and [i] are used with arrays.
Of course you can also look for a string in an array or list.
Remember that == is only true when the two references refer to the same object, while .equals returns true of the characters in the two objects are the same.
So, when you look for a string be sure to use .equals instead of ==.
BINARY SEARCH
IMPORTANT: Binary search can only be used if the data is sorted.
Binary search keeps dividing the sorted search space into half.
It compares a target value to the value in the middle of a range of indices.
If the value isn't found it looks again in either the left or right half of the current range.
Each time through the loop it eliminates half the values in the search area until either the value is found or there is no more data to look at.
The following animation shows the difference between linear (sequential) and binary search:
Binary search calculates the middle index as (left + right) / 2 where left starts out at 0 and right starts at array length - 1 (the index of the last element).
Remember that integer division gives an integer result so 2.5 would be 2.
It compares the value at the middle index with the target value (the value you are searching for).
If the target value is less than the value at the middle it sets right to middle - 1.
If the target value is greater than the value at the middle it sets left to middle + 1.
Otherwise the values match and it returns the middle index.
It also stops when left is greater than right which indicates the value wasn't found and it returns -1.
You can also use binary search with a string array.
But, when you look for a string be sure to use the compareTo() method rather than < or > which can only be used with primitive data types.
Remember, the compareTo(String other) method returns a negative value if the current string is less than the other string, 0 if they have the same characters in the same order, and a positive value if the string is greater than the other string.
This example code shows binary search with a string array:
RUNTIMES
So, how do we choose between two algorithms that solve the same problem?
They usually have different characteristics and runtimes which measures how fast they run.
For a searching problem, it depends on your data.
Binary search is much faster than linear search, especially on large data sets, but as we know it can only be used on sorted data.
Often with runtimes, computer scientists think about the worst case behavior.
With searching, the worst case is usually if you cannot find the item.
With linear search, you would have to go through the whole array before realizing that it is not there, but binary search is much faster even in this case because it eliminates half of the data set in each step.
We can measure the informal runtime just by counting the number of steps taken to find (or not find) the item.
Here is a table that compares the worst case runtime of each search algorithm given an array of n elements:
The runtime above is measured as the number of times the loop runs in each algorithm or the number of elements checked in the worst case that the item is not found.
Notice that with linear search, the worst case runtime is the size of the array n, because it has to look through the whole array.
For the binary search runtime, we can calculate the number of times you can divide n in half until you get to 1.
So, for example 8 elements can be divided in half to narrow down to 4 elements, which can be further divided in half to narrow down to two elements, which can be further divided in half to get down to 1 element, and if that is wrong, to 0 elements.
In other words, 4 divisions or guesses are needed (4 then 2 then 1 then 0).
Binary search is much faster than linear search!
RUNTIME LOGARITHM
Runtimes can be described with mathematical functions.
For an array of size n, linear search runtime is a linear function, and binary search runtime is a function of log base 2 of n.
This is called the big-O runtime function in computer science, for example O(log n) vs. O(n).
You can compare the growth of functions as n, the data size, grows and see that binary search runs much faster for any n:
You don't need to know the log n runtime growth function for the exam, but you should be able to calculate how many steps binary search takes for a given n by counting how many times you can divide it in half.
One way you could do this is by starting at 1 and keep a count of how many times you can double it with the powers of two (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, etc.) until you reach a number that is slightly above n.
SUMMARY
There are standard algorithms for searching.
Sequential/linear search algorithms check each element in order until the desired value is found or all elements in the array or ArrayList have been checked.
The binary search algorithm starts at the middle of a sorted array or ArrayList and eliminates half of the array or ArrayList in each iteration until the desired value is found or all elements have been eliminated.
Data must be in sorted order to use the binary search algorithm. This algorithm will be covered more in Unit 10.
Informal run-time comparisons of program code segments can be made using statement execution counts.
EVIDENCE
1) Complete the following Google Form. This form must be 100% correct and includes released AP practice questions. To stop working and return later, hit submit! You can "edit your response" and continue where you left off.
Notice that get(i) is used instead of [ ] to get an element in the ArrayList dictionary at index i. The search also prints out the index where it found the word. This is an informal runtime that tells us how many words it had to check. Run the code with the following test cases for each word listed in this Google document (make sure to use File/Make a Copy to be able to record your results):
Make sure to "fork" the program so that you can save it as your own project. The SpellChecker.java file also has a binarySpellCheck(word) method defined, but it does not print out the number of words checked. Looking at the linearSpellCheck(word) method as a guide:
a) add in a counter variable
b) increment it in the binary search loop after finding the middle of the list
c) print out the value of your counter variable before returning true or false.
d) change the Main.java code to call the binarySpellCheck method instead of the linearSpellCheck method, and try all the same test case words again. Record the runtimes for binary search and compare with the linear search times.
3) Answer the questions at the bottom of the Google document. To earn credit for this objective, meet with me to show me your Google document and the program you edited.