Project 22 - performance

 deliverable SortTest.java
 array_performance.png
 linked_performance.png
 due 11:59PM on Thursday March 27th, 2013 
 projectid performance
 

ArrayList vs LinkedList

It is often the case that different classes are capable of accomplishing the same task.  It is important in computer science to know the proper context in which to use a class.  For example, one class might take less time than another class to do the same thing.   In a series of experiments we compare the performance of ArrayList versus LinkList in sorting integers using the sort algorithm from the previous lab.  ArrayList is just one of many possible implementations of the List interface. Another implementation of the List interface in the Java API is LinkedList. At the moment we don't know and we don't care what a LinkedLIst is or how it works.  All we need to know is that LinkedList, like ArrayList, implements the List interface in the Java API and so has the same methods as ArrayList.  Since the method sort

    public static void sort(List<Integer> data) ...

of the previous lab sorts all lists it works for ArrayList and LinkedList.

Will the performance of the two lists differ?  And, if so, why?

Task:

Use SelectionSort you implemented in last project. Create a program SortTest.java that uses the sort() method of the SelectionSort.java program. These are the steps to test selection sort:
  • Initialize the list to {1, 2, 3, ..., N}, shuffle it using the random number generator, and then sort it in ascending order using selection sort.
  • Repeat the shuffling and sorting 25 times (you can increase this number to increase accuracy).
  • Measure the average run time of sorting using LinkedList or ArrayList. Using the method from the StopWatch class provided in the link (add the jar file to your project and add it to the build path the same way you did for StdDraw.jar)
  • Print the average running time
  • The program should accepts 2 command line arguments:
    •  An integer indicating the size of the data to be sorted (this is the same as the upper bound of the range of integers)
    • A name of a class which implements the interface List, You can use sample code to choose the implementation class(ArrayList or Linkedlist) from a command line argument.

Create Class using its name

With reflection it is possible to instantiate a class using the name of the class as string. This enables us to take the choice of class to run in the experiment from the command line.  Copy the following code exactly:

final Class<?> clazz = Class.forName (args[1]);
@SuppressWarnings("unchecked")
final java.util.List<Integer> list =(java.util.List<Integer>) clazz.newInstance();

You will have to add a throws clause to the main method because of the possible exceptions.

Measure duration using stopWatch

Use the Stopwatch class to measure the runtime of your bubble sort for ArrayList versus LinkedList of size N.The library Stopwatch.jar provides implementation of a stop watch. To calculate the amount of time it takes to run a piece of code you will have to follow this

final StopWatch sw = new StopWatch(true);

// Initialize
for (int i=0; i<25; i++) {
     // Shuffle

     sw.start();

     // your code

     sw.stop();
}
final double averageTime = sw.getAverageTime();

The Experiments.
Record the performance of selection sort on linked lists vs. array lists of size N, ranging from 200 to 2000 (inclusive), incremented 200 in each iteration. (N = 200, 400, 600, 800, 1000, 1200, 1400 …….. ). Record the values in an excel sheet:

N                   ArrayListTime (seconds)            LinkedListTime (seconds)
200                          8.43E-4                                  1.30E-3
400                          3.80E-3                                  7.92E-3
600                          1.94E-2                                  5.04E-2
........

Use the following Sample inputs to your experiment:

 n=200,...,1_800, 2_000; seed=4321567; ArrayList
java -Dseed=4321567 SortTest 200 java.util.ArrayList
java -Dseed=4321567 SortTest 400 java.util.ArrayList ... java -Dseed=4321567 SortTest 200 java.util.LinkedList java -Dseed=4321567 SortTest 400 java.util.LinkedList
...

The program should print one line of output -- the average time in seconds.

Report.

After you recorded your finding in an Excel file, show the dependency of sorting time on the size of the list by creating a Scatter Chart with only markers from your data, plotting 1 series (values from ArrayListTime column or values from the LinkedListTime column). Use values from the N column to label the X axis.  Repeat this for both the ArrayList and LinkedList. Then try to figure out if the best fit for each of the plots is a linear, polynomial of order 2 or polynomial of order 3. To determine this, add a trend line with each regression type (one by one to test) and make sure you check the "Display R-squared value on the chart". The fit with larger R-squared value is the best one. Copy the figure with best trend lines for each plot (equation and R-squared should both be on the same figure) and paste it in paint (go to run and type paint on windows 7) and save it as array_performance.png and linked_performance.png. Make sure you are using png file type.