Friday, November 21, 2014

Collections Part 1: List

Let's talk about Collections.  I don't mean baseball cards...  or dust.  I'm talking about the Collections framework, which is a powerful set of classes included with your Java distribution.

This is not meant to be an in-depth tutorial.  I'm merely going to scratch the surface and give a few guidelines regarding basic use of the framework.

The Collections framework is incredibly important to the Java developer.  You will spend a great deal of time working with the data structures within it.  The two that you will most likely use more than any other are List and Map.

List and Map are not classes.  They are interfaces.  An interface can be thought of as the set of controls for a range of classes.  Whether you have a gas powered or an electric powered oven, you still turn a dial and set a desired temperature.  The interface is the same, although the implementation is different.  This is a key concept:  'Code to the interface' is commonly said, and slightly less commonly actually done.  I will do other posts on this idea; it's very important.

Let's start with List.  We'll do Map in another post.

Suppose I wanted to keep track of a few numbers read in from a file.  Each line contains one number, but I don't know how big the file is.  While it would be possible to load the numbers into an array, it would be a bit messy, as I would have to manage the size of the array.  Enter the 'ArrayList'.  This is a Collections-based object that works very much like an array with a turbocharger on it.  I don't have any hard numbers on this, but my suspicion is that it is the most commonly used Collections class out there.

Let's say my pseudocode looks something like this:

Open a file

For each line in the file:

    Read an integer number from the file
    Store the integer in a list for later use

Generate a sum of all the integers in the list as SumTotal
Display SumTotal to the user

There are multiple paths to success of course, and it's hard to say that any given solution is exactly 'wrong'.  However, unless we know in advance how many values we're going to store, or we wildly oversize the initial allocation, using an array would require us to modify the length of the array regularly.  It's not that this is particularly hard, but it's even easier not to do it.
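
To make that bookkeeping concrete, here is a rough sketch of what growing an array by hand might look like.  The class name, the method names and the 'double it when full' strategy are all my own assumptions for illustration, not the one true way to do it:

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only: stores ints in a plain array
// and grows that array by hand whenever it fills up.
class ManualArrayGrowth {
    private int[] values = new int[2]; // deliberately tiny starting capacity
    private int count = 0;             // how many slots are actually in use

    void add(int value) {
        if (count == values.length) {
            // Out of room: copy everything into an array twice as long.
            values = Arrays.copyOf(values, values.length * 2);
        }
        values[count] = value;
        count++;
    }

    int size() {
        return count;
    }

    int get(int index) {
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("No value at index " + index);
        }
        return values[index];
    }

    public static void main(String[] args) {
        ManualArrayGrowth numbers = new ManualArrayGrowth();
        for (int i = 1; i <= 5; i++) {
            numbers.add(i * 10);
        }
        // prints "Stored 5 values, last is 50"
        System.out.println("Stored " + numbers.size() + " values, last is " + numbers.get(4));
    }
}
```

None of it is difficult, but every line of it is something you have to write, test and maintain yourself.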

And let's face it:  One of the secrets of successful programming is doing things the easy way.  It's a hard enough job; there's no reason to add unnecessary bells, whistles and epicycles to our code.

I'm not going to address the bits about reading the file right now, that will be in another post.  I just want to concentrate on 'Store the integer in a list for later use' and 'Generate a sum of all the integers in the list as SumTotal'.

First, we have to create the ArrayList object.  I'm going to do so in a way I haven't shown you before, but it's not too awful to understand:

List<Integer> numberList = new ArrayList<Integer>();

What?

Well, honestly Java can deal with this without having those '<Integer>' parts, but I thought you ought to get used to using them.  They're really handy for preventing certain classes of errors when programs get bigger.

What this is doing is creating a variable called 'numberList'.  It refers to an object of type 'List', and that 'List' is constrained to hold only Integer values.  That's the reason for the syntax:  it makes sure we don't do something silly like try to put a String or a PersonData in there.
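
If you'd like to see the kind of error the '<Integer>' part heads off, here is a small sketch.  The class and method names are made up for this example.  The first list leaves the type off, so a String sneaks in and the program only finds out when it runs:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: contrasts a list without <Integer> against one with it.
class RawListPitfall {

    // Returns true if adding up the untyped list fails at runtime.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List rawList = new ArrayList();  // no <Integer>: anything goes
        rawList.add(42);
        rawList.add("forty-two");        // compiles, but it's a bug waiting to happen
        try {
            int sum = 0;
            for (Object item : rawList) {
                sum = sum + (Integer) item;  // the String cannot be cast to Integer
            }
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // prints "Raw list failed at runtime: true"
        System.out.println("Raw list failed at runtime: " + rawListFailsAtRuntime());

        List<Integer> numberList = new ArrayList<Integer>();
        numberList.add(42);
        // numberList.add("forty-two");  // with <Integer>, the compiler rejects this line
        System.out.println("Typed list holds " + numberList.size() + " value");
    }
}
```

With the typed version the mistake never makes it past the compiler, which is a much cheaper place to find it.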

List is NOT a class.  You cannot create a 'new List()' directly.  It is what's known as an interface, and essentially describes how to use a category of classes, rather than one single class.

ArrayList IS a class.  You could do this if you so choose:

ArrayList<Integer> sillyNumberList = new ArrayList<Integer>();

This is roughly akin to adding the name of the manufacturer before mentioning an appliance.  Every time.  You wouldn't tell someone to go get you a beer from the Acme refrigerator, you'd just say refrigerator.  The fact that it's an Acme was only important when you bought it (or when you need to have it repaired.  Don't ask me about refrigerator repair, it's a sore spot).  The fact that it's a refrigerator means that you will grab a handle, pull it, and find something yummy and cold inside.  The specific manufacturer does not matter.  Similarly, a List allows you to add items, remove items, look at items and such, all without having to know that it is specifically an ArrayList.
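
Here's a small sketch of why that matters in code.  The class and method names are invented for the example, and LinkedList is simply another List implementation that ships with Java.  The helper method only asks for 'a List', so it works with either brand:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class: one method, two different List implementations.
class CodeToTheInterface {

    // Only asks for "a List"; it neither knows nor cares who manufactured it.
    static void addSampleNumbers(List<Integer> numbers) {
        numbers.add(3);
        numbers.add(5);
        numbers.add(8);
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = new ArrayList<Integer>();
        List<Integer> linkBacked = new LinkedList<Integer>();

        addSampleNumbers(arrayBacked);
        addSampleNumbers(linkBacked);

        // Two Lists are equal when they hold the same items in the same order,
        // regardless of implementation, so this prints "Same contents: true".
        System.out.println("Same contents: " + arrayBacked.equals(linkBacked));
    }
}
```

Had addSampleNumbers asked for an ArrayList specifically, the second call would not have compiled.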

Let's imagine that we've now got a value we want to add as an Integer, in a variable called 'numberToAddToList' (imaginative, huh?)  We'd just do this:

numberList.add(numberToAddToList);

No muss, no fuss, no checking capacity.  Set it and forget it.  You can do this once, or a thousand times or more.  That one line is all you'll need to use.
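
A quick sketch to back that up.  The class name and the 'collect' method are just placeholders I made up; the point is that the add line is identical whether it runs once or a thousand times:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: the same add call, however many items there are.
class AddManyTimes {

    // Adds the numbers 0 through howMany - 1 to a fresh list and returns it.
    static List<Integer> collect(int howMany) {
        List<Integer> numberList = new ArrayList<Integer>();
        for (int i = 0; i < howMany; i++) {
            Integer numberToAddToList = i;
            numberList.add(numberToAddToList);  // the list grows itself as needed
        }
        return numberList;
    }

    public static void main(String[] args) {
        System.out.println(collect(1).size());     // prints 1
        System.out.println(collect(1000).size());  // prints 1000
    }
}
```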

The other bit we need to work on is generating a sum of the numbers.  The way to do this, of course, is to take all the numbers, one by one, and add them to a variable that started at zero.  I'm going to do this three times here.  The first time uses a for loop the way we've already done.  Next, I'll do so using an 'Iterator', which is a helper type designed to go through all the items in a Collection one by one, has been around forever, and is oftentimes a better choice than a standard for loop.  The last uses the 'enhanced' for loop that has been in the language since Java 5.  None of these ways is strictly wrong, but if I were reviewing your code I'd want to see or hear a good reason for not doing it the third way.

Integer sumTotal = 0;

1)
for (int i = 0; i < numberList.size(); i++ ) {
    sumTotal = sumTotal + numberList.get(i);
}

2)
Iterator<Integer> iter = numberList.iterator();
while(iter.hasNext()) {
    sumTotal = sumTotal + iter.next();
}

3)
for(Integer number : numberList) {
    sumTotal = sumTotal + number;
}


In all cases, we'd just then write:

System.out.println("The total is " + sumTotal);
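
If you want something you can paste into a file and run, here is one way the pieces might fit together.  The class name, the method names and the sample values are my own inventions, and I've given each approach its own method so that every running total starts from zero.  I've also used a plain int for the total, which is my choice rather than a requirement:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: the same sum computed three different ways.
class ThreeWaysToSum {

    // 1) Classic for loop using an index.
    static int sumWithIndex(List<Integer> numberList) {
        int sumTotal = 0;
        for (int i = 0; i < numberList.size(); i++) {
            sumTotal = sumTotal + numberList.get(i);
        }
        return sumTotal;
    }

    // 2) Explicit Iterator.
    static int sumWithIterator(List<Integer> numberList) {
        int sumTotal = 0;
        Iterator<Integer> iter = numberList.iterator();
        while (iter.hasNext()) {
            sumTotal = sumTotal + iter.next();
        }
        return sumTotal;
    }

    // 3) Enhanced for loop.
    static int sumWithEnhancedFor(List<Integer> numberList) {
        int sumTotal = 0;
        for (Integer number : numberList) {
            sumTotal = sumTotal + number;
        }
        return sumTotal;
    }

    public static void main(String[] args) {
        // Sample values standing in for the numbers read from the file.
        List<Integer> numberList = new ArrayList<Integer>();
        numberList.add(10);
        numberList.add(20);
        numberList.add(30);

        // Each line prints "The total is 60".
        System.out.println("The total is " + sumWithIndex(numberList));
        System.out.println("The total is " + sumWithIterator(numberList));
        System.out.println("The total is " + sumWithEnhancedFor(numberList));
    }
}
```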

As I think you can see, the new enhanced for loop makes life pretty easy.  You declare a variable and state where it comes from, and then just use it in the loop.

The iterator probably looks pretty awful to you, and to be fair, that's the cleaned up version available to us since we got to start using Generics.  Back before we could add that <Integer> tag it would have looked like this:

Iterator iter = numberList.iterator();
while(iter.hasNext()) {
    sumTotal = sumTotal + (Integer)(iter.next());
}

But it was still a useful construct quite often.  Naturally in small examples like these the full impact does not show itself.

List can be used for much more than numbers of course.  Any sort of Object can be stored in a list.  You can have a List of Strings, or a List of Lists, or a List of Maps...
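
As a small sketch of that idea, here is a List of Strings tucked inside a List of Lists.  The class name, method name and shopping items are placeholders I made up for the example:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: a List of Strings, and a List that holds Lists.
class ListsOfOtherThings {

    static List<List<String>> buildShoppingLists() {
        List<String> groceries = new ArrayList<String>();
        groceries.add("milk");
        groceries.add("eggs");

        List<String> hardware = new ArrayList<String>();
        hardware.add("nails");

        // The outer list's items are themselves lists.
        List<List<String>> allLists = new ArrayList<List<String>>();
        allLists.add(groceries);
        allLists.add(hardware);
        return allLists;
    }

    public static void main(String[] args) {
        // An enhanced for loop inside another one walks every item of every list.
        for (List<String> shoppingList : buildShoppingLists()) {
            for (String item : shoppingList) {
                System.out.println(item);
            }
        }
    }
}
```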

Next time, we'll discuss Maps, which instead of sequential access specialize in looking up specific objects based on a key value.