The process of arranging a set of data in a particular order is called sorting. This order can be in increasing or decreasing order, lexicographically order, alphabetical order, and so on.
Suppose you have an array of items that are not in order:
int[] a = {224,6,2,18,4,7,21,-6,3,13};
How can you order them?
Possible ideas include:
Move the smallest to position 0 and repeat to find the second smallest
Keep checking the smallest vs the largest.
All of those ideas include common mechanisms:
To go over all elements
To check/compare the magnitude
Load a value into a position and unload a value to a different position
Keep track of the current ordered value
In terms of code, we have:
a loop that goes to all the values in the array
for(int i = 0; i < a.length-1; i++)
an if-statement that checks if contagious values are in order. Noticed that we subtract 1 from the a.length because we will need to check for the next position of our current state (e.g., a[i + 1] as follows :
if(a[i] > a[i+1])
a swap, which consists in having a temporary variable to hold the value of one position, e.g., a[i] while positioning a[i+1] takes its place, and eventually replaces the position a[i+1] by a[i].
int tmp = a[i];
a[i] = a[i+1];
a[i+1] = tmp;
In conclusion, we have a procedure as follows:
public static void sort(int[] a){
for(int i = 0; i < a.length-1; i++){
if(a[i] > a[i+1]){
int tmp = a[i];
a[i] = a[i+1];
a[i+1] = tmp;
}
}
}
By running our program with the given array, we obtained the following output:
======BEFORE======
224 6 2 18 4 7 21 -6 3 13
======AFTER=======
-6 2 3 4 6 7 13 18 21 224
Notice that the largest value is indeed at its place, however, the rest of the array is not necessarily in order. If we run the algorithm 2 times consecutively
sort(a);
sort(a);
we get:
======BEFORE======
224 6 2 18 4 7 21 -6 3 13
======AFTER=======
-6 2 3 4 6 7 13 18 21 224
By running the program 3 times we get the 3 largest values in order; if we run it N times we got the N values. Therefore, we need an extra loop to process N times the loop, generating a transformation of the method as follows:
public static void sort(int[] a){
for(int i = 0; i < a.length-1; i++){
for(int j = 0; j < a.length-1; j++){
if(a[j] > a[j+1]){
int tmp = a[j];
a[j] = a[j+1];
a[j+1] = tmp;
}
}
}
}
We have discussed search in the past through sequential search. Related problems may include:
Given the following string, count how many specific characters appear in the string.
Given this array, return true if the target t was found in the array, false otherwise.
public static boolean sequential(int[] a, int t){
for(int i = 0; i < a.length; i++){
if(a[i] == t)
return true;
}
return false;
}
The issue is that the method, in the worst-case scenario, will take n.
For binary search, we will use the “divide and conquer” technique in a sorted array. The big assumption is that the array is in ascending order. We can use any of the five different flavors we provided in sorting to order them in ascending order.
Check if the mid-point is the value that we are looking for (figure 11.1).
If the target that we are looking for is at position mid-point, then return true, because we just found the element we are looking for
If the target that we are looking for is NOT at position mid-point, then we need to check our navigation path (which can be either left or right), and proceed to Step 2.
2. Check for navigation path:
If the value that we are looking for is smaller than the current mid-point, there is no need to waste time exploring the right side (figure 11.2).
Our next right has changed to the current m – 1, the left remains the same.
If the value that we are looking for is larger than the current mid-point, there is no need to waste time exploring the right side (figure 11.3).
3. As long as the current left is smaller than the right, we will repeat steps 1 and 2.
If the program succeeds to get out of the loop, only means one thing: the target was not found in the array, and your method shall return false.
To be consistent with the search method, we write a public method that takes the same number of arguments. However, internally, we invoke the private method that will allow us to update our continuous left and right.
public static boolean binary(int[] a, int t){
int l = 0; // initial stage of left
int r = a.length; // initial stage of the right
return binary(a,t,l,r); // invocation of private method
}
The code for the private method looks as follows:
private static boolean binary(int[]a, int t, int l, int r){
while(l <= r){
// check if the element at medium is what you are looking for
int m = l + (r - l) / 2;
if(a[m] == t)
return true;
// if the target we are looking for is smaller than our mid
// update your new right, ignore the left side
if(a[m] > t)
r = m - 1;
if(a[m] < t)
l = m + 1;
}
return false;
}
How about Recursively? Write the public method that will take an array of ints and a target. The private method shall take three additional parameters: the left, the right, and the mid-point.
Which one is more effective: Experimenting with random data
Let us test both implementations (i.e., sequential vs binary) and let us see how fast they run.
Java has the object System, which has a method called nanoTime(). Let us use this method as a “timer” to measure the time elapsed between one task and another task. The skeleton of this experiment looks as follows:
long start = System.nanoTime(); // measures the start time
... // Task to measure
long end = System.nanoTime(); // measures the stop time
long diff = end - start; // perform the difference
System.out.println(diff); // Report
For our example, it will not make a difference since the number of elements in the array is small.
We will implement a populate() method that will take an integer N representing the total number of elements in the array. The method shall return an array with length N of random numbers between [0,1000].
public static int[] populate(int N){
int[] a = new int[N];
for(int i = 0; i < N; i++)
a[i] = (int)(Math.random()*1001);
return a;
}
Now it makes more sense to test both methods. Only one last thing, remember that binary search only works when the data is sorted in ascending order.