Searching

Searching is obviously an important algorithm in its own right, however you will also need to use various versions of search to make other algorithms work, such as: selection sort; insertion sort; unique random numbers; etc...

The two basic search strategies you need to know are Linear Search and Binary Search

There are other search strategies that are not part of this course. For a good discussion, see the subpages on Geeks for Geeks: Searching Algorithms.

Back to:

Main Algorithm Page

Linear Search

Steps through the elements in a collection (array) one by one until it finds one that matches the search requirements.
Pros
- Easy to program
- Works on any iterable data structure
Cons
- Slow - it is an O(n) algorithm

Variants

Find max and find min
Find position of max and position of min
Linear search returning all matching elements
Linear search on a restricted subarray
Linear search on a sorted array that returns the correct position of an element that needs to be inserted

Pseudocode

In the version below, the array indexes from 1.

BEGIN LinearSearch(array, item)

  FOR idx = 1 TO length(array) STEP 1

    IF item = array(idx) THEN

      RETURN idx // item found

    END IF

  NEXT idx

  RETURN -1 // item not found

END LinearSearch

Java Code

/** * Linear search an integer array for item */public static int linearSearch(int[] array, int item) { for (int i = 0; i < array.length; i++) { if (array[i] == item) { return i; } } return -1;}

Find position of Max

This algorithm is needed for SelectionSort. It searches the array between the start and stop indices

START findMaxPos(array, start, stop)

  maxPos = start

  max = array(start)

  FOR idx = start+1 TO stop STEP 1

    IF array(idx) > max THEN

      max = array(idx)

      maxPos = idx

    END IF

  NEXT idx

  RETURN maxPos // could also return max

END findMaxPos

Binary Search

Uses the properties of a sorted list to halve the search space with each step. At each step the middle element is compared to the searched item and the result used determine which remaining half the item must be in.
Pros
- fast - it is an O(log(n)) algorithm
Cons
- Requires data to be sorted

Variants

Binary search returning all matching elements
Binary search on a restricted subarray
Binary search returning the position that the searched for element should be inserted to keep data sorted

Pseudocode

In the version below, the array indexes from 1.

BEGIN BinarySearch(array, item)

  left = 1

  right = length(array)

  WHILE (left < right) DO

    mid = (left + right) / 2

    IF item = array(mid) THEN

      RETURN mid // item found

    ELSE IF item < array(mid) THEN

      right = mid - 1

    ELSE

      left = mid + 1

    END IF

  END WHILE

  RETURN -1 // item not found

END BinarySearch

Java Code

/** * Search a sorted integer array for item */public static int binarySearch(int[] array, int item) { int L = 0, R = array.length, M; while (L < R) { M = (L + R - 1) / 2; // integer division, rounds down if (array[M] == item) { return M; } else if (array[M] < item) { L = M + 1; } else { R = M; } } return -1;}

Testing the speed of Linear and Binary Search

You should test your code to see if it has the correct asymptotic behaviour. Generate a random array then repeatedly search it for an item - the repetition helps make the time scale measurable and smooths out clock-time noise due to other system processes.

Linear Search

Running the code that searches an array of up to 1 million elements 100 thousand times at each step yields a nice linear result:

public static void timeLinearSearch() { int[] array = randomInts(1_000_000, 0, 1_000_000); int num_times = 100_000; System.out.println("array size\t time"); for (int size = 16; size < array.length; size *= 2) { long start = System.currentTimeMillis(); for (int i = 0; i < num_times; i++) { int out = linearSearch(item, 42, 0, size); } long stop = System.currentTimeMillis(); System.out.format("%12d\t%4d%n", size, stop - start); }}

On my MacBook Pro, I get the following results:

You can see that the time is clearly approximately proportional to the size. So, time = O(n)

Binary Search

Similar to the linear search testing code, but running 100 more times per step. Note that for last step it is searching the array approximately 3000 times faster.

public static void timeBinarySearch() { int[] array = randomInts(1_000_000, 0, 1_000_000); int num_times = 10_000_000; System.out.println("array size\t time"); for (int size = 16; size < array.length; size *= 2) { long start = System.currentTimeMillis(); for (int i = 0; i < num_times; i++) { int out = binarySearch(array, 42, 0, size); } long stop = System.currentTimeMillis(); System.out.format("%12d\t%4d%n", size, stop - start); }}

The plot above shows that the algorithm is far from linear. Plotting the time vs log(array size) gives a more linear result, so time = O(log2(n))

Extension: Recursive Versions

Binary search is naturally recursive - each step is just another binary search on the remaining part of the array after halving.

Linear search can also be thought of as recursive, as each step is just another linear search after removing the element just checked. But it is more naturally written as an iterative instead of recursive algorithm.

BEGIN BinarySearchR(array, item, left, right)

  IF (left > right) THEN

    // exit condition for item not found

    RETURN -1

  ELSE

    mid = (left + right) / 2

    IF item = array(mid) THEN

      RETURN mid // exit condition for item found

    ELSE IF item < array(mid) THEN

      RETURN BinarySearchR(array, item, left, mid - 1)

    ELSE

      RETURN BinarySearchR(array, item, mid + 1, right)

    END IF

  END IF

END BinarySearchR

To start this search call BinarySearchR(array, item, 1, length(array)) or overload the BinarySearchR function (in some modern languages you can also do this using default values for optional arguments):

BEGIN BinarySearchR(array, item)

  RETURN BinarySearchR(array, item, 1, length(array))

END BinarySearchR

Alternatively, if you have a function that can extract subarrays you can use something likeBEGIN BinarySearchR2(array, item) IF length(array) = 0 THEN // exit condition for item not found RETURN -1 ELSE mid = length(array) / 2 IF item = array(mid) THEN // exit condition for item found RETURN mid ELSE IF item < array(mid) THEN RETURN BinarySearchR2(subarray(array, 0, mid - 1), item) ELSE idx = BinarySearchR2(subarray(array, mid + 1, length(array)), item) IF idx > -1 THEN idx = mid + 1 + idx END IF RETURN idx END IF END IFEND BinarySearchR2

Report abuse