Data Structures

Lists

What is a Data Structure?

A data structure is any object that allows us to store multiple values, sometimes, according to criteria. In most languages, that criteria would be that the types being stored are all the same. Another property of data structures is that all of the data within it can be accessed or stored quickly and easily.

For example, in Java, we create an array of integers, like this:

int [] myArray = new int[size];

As you know, this allows us to store multiple integers, strictly according to the value in our size variable. If size = 10, then we can store 10 integers. In addition, storing data into and accessing elements of the array is quick and easy:

myArray[index] = value; // Store something into the array
System.out.println(myArray[index]); // Access something from the array

So, arrays are useful... To a point. But from a strict definition sense, arrays are definitely data structures:

  • We can hold multiple values. Check!
  • We can access and store values quickly and easily. Check!

So, in the end, a data structure is simply something that allows you to organize and access a bunch of data.

Lists

Arrays are a great starting place for introducing data structures, but they have some serious shortcomings. Let's say we're writing a program that holds the cost of a bunch of books at a store, and print the list of those prices. Shouldn't be a problem:

  • Ask them how many books there are
  • Have them enter the prices, one at a time

In the past you may have written a method like the one in the example to accomplish such a task. Now let's think about what would happen if a program like this existed in the real world: First, the manager would give you, a bunch of books. How many? Who knows! You'll have to count them first if you want to use THIS program. Obviously this is a problem.

Not to mention other issues with arrays:

  • Cannot be easily expanded
  • Removal of data leaves empty space with no data instead of collapsing the space

Both of these issues make arrays a problem to work with for large programs. This is where the List data structure comes in.

Formally, a list is a group of values that can be counted and are ordered according to some criteria. In lists, the same value may occur more than once. We will be discussing Lists as they pertain to Java specifically.

Java has an abstract interface called List, which can be found here (Java 8). If you notice, at the top of the document we see:

Interface List<E>

This syntax is mostly familiar:

  • Interface tells us that this document is specifying a contract that any implementing class must follow in order to be considered a list.
  • List is the name of the contract.

Generics

The last part <E> is called a generic. Java is a very strictly typed language, meaning that whenever you create a variable, it must have a type. This goes for arrays as well (see above). When we create a new List in Java we write the following:

List<E> myList;

The type of the variable myList is List. Java needs to know what types of variables are going to be stored in myList, which is where the generic comes in. Here are some examples:

List<String> myList; // A list of strings
List<Student> myList; // A list of Student objects
List<Integer> myList; // A list of Integer objects

The first two are easy to understand: A list of strings and a list of created Student objects. The third, however, is a little odd. Why did we write <Integer> rather than <int> to create a list of integers? Java does not allow primitive types to be the generic type. This means you cannot create a list of:

  • int
  • float
  • double
  • char
  • short
  • long
  • boolean
  • and any other type that shows up in purple, or has a small first letter

Because of this, Java has created object wrappers for each of the primitive types. When we talk about them we identify them as, for example, "Big-I Integer" or "Big-B Boolean." These wrappers turn primitive types into objects. (They also have useful functions like type conversion, but that's a different topic.)

This video is a brief discussion about the differences between ArrayLists and Arrays in Java. It discusses the relative benefits of ArrayLists, but does not mention its drawbacks: increased memory consumption and that it is slower (minimally) than using an array.

ArrayLists

Now, back to talking about Lists. Remember, a Java List is just an interface. That means there must be some implementing class that agrees to that contract. Java has provided several different implementing classes:

  • ArrayList
  • LinkedList
  • Stack

But for now, we are going to focus on one of them: ArrayList. An ArrayList object is a dynamic list, meaning it grows in size to fit the data. In addition, adding or removing data shifts the data as necessary so you can do things like insert or delete elements into or from the middle of your array easily. Under the hood, an ArrayList uses arrays to accomplish all of these amazing features.

Remember earlier when we said that "a list is a group of values that can be counted and are ordered according to some criteria"? Well the criteria for an array list is insertion order. This means you, the programmer have ultimate control of the order of your elements.

Useful Features

Some of the more useful commands for Lists are:

  • add(E e) - Adds a new element e of type E to the end of the List.
  • add(int i, E e) - Inserts a new element e of type E at index i. If there was an element already at index i, then the old element, and every element after it, gets moved by one index to create space for the new element.
  • set(int i, E e) - Replaces an existing element at index i with the new element e of type E.
  • get(int i) - Returns the element of the List at index i.
  • contains(Object o) - Searches the List for any object.
  • isEmpty() - Simple check to see if there is anything in the List.
  • size() - Returns the number of elements in the List.
  • remove(int i) - Removes the element located at index i.
  • clear() - Removes all elements in the list.
  • subList(int fromIndex, int toIndex) - Returns a list containing only the elements in the range [fromIndex, toIndex).

All of these commands that are part of every List object, are incredibly useful and make arrays much easier to work with. In order to get a full understanding of what it takes to implement these methods, you are going to do exactly that:

Exercises

1. Aside from insertion order, what other ways can you think of that a list could be ordered?

2. You are going to implement the following interface to create your own, smaller, version of the ArrayList. Your implementation should be backed by a single array. That array should grow as necessary (i.e. you run out of space).

public interface ISimpleList<E> {
    
    /**
     * Appeds an element e to the end of the list. Returns success status.
     */
    boolean add(E e);
    
    /**
     * Inserts an element, e, into the list without losing or reordering existing data.
     * Returns success status.
     */
    boolean insert(int i, E e);
    
    /**
     * Replaces the element at index i with new element e. Returns the old element.
     */
    E replace(int i, E e);
    
    /**
     * Removes the element at index i and returns it.
     */
    E remove(int i);
    
    /**
     * Returns the element at index i
     */
    E get(int index);
    
    /**
     * Removes all elements from the list
     */
    void clear();
    
    /**
     * Returns the number of elements currently in the list.
     */
    int size();
    
    /**
     * Returns true if and only if the list has more than zero elements.
     */
    boolean isEmpty();
    
    /**
     * Returns a new SimpleList which contains the elements starting at
     * startIndex inclusive and ending at endIndex exclusive.
     */
    ISimpleList<E> subList(int startIndex, int endIndex);
}