Sets

What is a Set?

Previously we talked about how a List could have multiple of the same object. That is the first way a set differs from a list. A set cannot have multiple of the same object. By looking at the Java documentation for the Set Interface, we can see that a set uses the objects equals method to do the comparison.

The second major difference between a List and a Set is that a set does not guarantee iteration order. This means two very important things:

  • That in order to do any operations on every element of a set, you must use an iterator. More on that later.
  • The order you process elements can be different every time you iterate.

Looking at the Set interface, we notice that a few methods that exist for the List do not exist for the set. There is no:

  • get(int i)
  • add(int i, E e)
  • remove(int i)
  • set(int i, E e)
  • sublist(int startIndex, int endIndex)

In fact, any method that takes an integer used for an index will not exist for a set. Sets do not have indexes because, by default, we cannot guarantee order in a set.

How Does a Set Work?

Every object, by default, has an equals method. The signature for the equals method is:

public boolean equals(Object o);

The objective is to perform some type of comparison on the object being passed with the current object in order to determine if the objects are equal. Take, for example the Circle class below:

public class Circle extends Shape {
    
    private double radius;
    
    public Circle() {
        this(1.0);
    }
    
    public Circle (double radius){
        this.radius = radius;
    }

    public double getRadius() {
        return radius;
    }
    
    public void setRadius(double radius) {
        if (radius > 0.0) {
            this.radius = radius;
        }
    }
    
    public double getArea() {
        return Math.PI * Math.pow(radius, 2);
    }
    
    public double getPerimeter() {
        return Math.PI * 2 * radius;
    }
    
    @Override
    public String toString() {
        return "Circle[radius="+radius+"]";
    }
    
    @Override
    public boolean equals (Object o) {
        if (o instanceof Circle) {
            double otherRadius = ((Circle)o).radius;
            return radius == otherRadius;
        } 
        return false;
    }
}

It is not enough to simply check this == o. Remember a comparison like this == o only compares the 2 memory addresses assigned to the variables, it does nothing with regards to the content of the objects themselves. In the Circle example we need to check that the radius of both circles are equal in order to check if 2 circles are the equivalent.

Because the equals method is a part of the object class, we must override it, which means that we will need to cast the Object o into a Circle to access its data. If the passed in object is NOT a circle, then it is clearly not equal to this circle. The instanceof operator allows us to check that the object is, in fact, a circle before we try to cast it.

Moral of the story: When you create objects that are going to be put into a set, seriously consider what makes two of those objects equal, and if necessary override the equals method.

Iterators

Typically, when we store data into a data structure, there comes a time where we want to perform some operation on every item stored. When using a List, we can simply count from 0 to the size of the list and use the Lists get(int index) method to retrieve an item in order to process it. Sets don't have a get method.

An iterator is an object that allows us to access each element in any collection (List/Set/etc.) one at a time. The only thing an iterator guarantees is that you will be able to access each element exactly one time during the iterator's lifespan. This means you won't access the same element twice, and you can't go back to a previous element.

Iterators have 3 important methods:

  • hasNext() - Returns a Boolean. This must be called before next() to ensure that there is an element to consume.
  • next() - Returns the next element if one exists. If you try to call next() at the end of the iterator, an exception will be thrown.
  • remove() - Removes the last element that was consumed from the underlying collection (List/Set/etc.) without affecting this iterator.

Here is a typical loop created to perform operations on successive elements of any collection:

Iterator<E> iter = collection.iterator();

while (iter.hasNext()){
    E next = iter.next();
    // Do things to the object
    
    if (CONDITION) { // Optional. You don't always want to remove objects.
        iter.remove();
    }
}

Exercise (From HERE)

In mathematics, several operations are defined on sets. The union of two sets A and B is a set that contains all the elements that are in A together with all the elements that are in B. The intersection of A and B is the set that contains elements that are in both A and B. The difference of A and B is the set that contains all the elements of A except for those elements that are also in B.

Suppose that A and B are variables of type set in Java. The mathematical operations on A and B can be computed using methods from the Set interface. In particular: A.addAll(B) computes the union of A and B; A.retainAll(B) computes the intersection of A and B; and A.removeAll(B) computes the difference of A and B. (These operations change the contents of the set A, while the mathematical operations create a new set without changing A, but that difference is not relevant to this exercise.)

For this exercise, you should write a program that can be used as a "set calculator" for simple operations on sets of non-negative integers. (Negative integers are not allowed.) A set of such integers will be represented as a list of integers, separated by commas and, optionally, spaces and enclosed in square brackets. For example: [1,2,3] or [17, 42, 9, 53, 108]. The characters +, *, and - will be used for the union, intersection, and difference operations. The user of the program will type in lines of input containing two sets, separated by an operator. The program should perform the operation and print the resulting set. Here are some examples:

Input

[1, 2, 3] + [3, 5, 7]

[10,9,8,7] * [2,4,6,8]

[ 5, 10, 15, 20 ] - [ 0, 10, 20 ]

Output

[1, 2, 3, 5, 7]

[8]

[5, 15]

To represent sets of non-negative integers, use sets of type TreeSet<Integer>. Read the user's input, create two TreeSets, and use the appropriate TreeSet method to perform the requested operation on the two sets. Your program should be able to read and process any number of lines of input. If a line contains a syntax error, your program should not crash. It should report the error and move on to the next line of input. (Note: To print out a Set, A, of Integers, you can just say System.out.println(A). I've chosen the syntax for sets to be the same as that used by the system for outputting a set.)