OOP University: arrays


Tuesday, November 10, 2015

I tried to print my array but all I got was gobbledegook!

This is inspired by a post at /r/javahelp.

A user posted that they were trying to print a sorted array to the console with System.out.println, and that instead of anything that made sense they got [I@4969dd64... So, they concluded, something was clearly wrong and the array was not being sorted.

But that's faulty troubleshooting, as one could tell by just trying to print the array before sorting. You would see basically the same thing.

The reason for this is that when you do a System.out.println on any object, what you're actually doing is calling 'toString()'. If you are printing your own objects for which you've created nice methods that override the default toString, you get nicely formatted output. If you're using an ArrayList you get a nice, comma separated list enclosed in square brackets.

Arrays are more primitive than that. Their default toString is just the one from Object, and it does nothing more than give you the type's internal name ('[I' means an array of int), an '@', and the object's hash code in hexadecimal. This is good for verifying that something isn't null, but useless for determining the content.
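If you want to convince yourself, here is a minimal sketch (the class and method names are my own, purely for illustration) that captures what an int array's toString returns before and after sorting, and compares it with a hand-built version of the default.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class DefaultArrayToString {

    // Rebuilds by hand what Object.toString() is documented to return:
    // the class name, an '@', and the hash code in hexadecimal.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        int[] numbers = {42, 7, 19};

        String before = numbers.toString();
        Arrays.sort(numbers);
        String after = numbers.toString();

        // Both strings start with "[I@" ("[I" is the JVM's name for int[]),
        // and sorting does not change them because it is still the same object.
        System.out.println(before);
        System.out.println(after);
        System.out.println(before.equals(after));             // true
        System.out.println(before.equals(describe(numbers))); // true
    }
}
```

The exact characters after the '@' will differ from run to run, so don't expect to see 4969dd64 on your machine.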

If you want to see the contents of the array, you've gotta do the work yourself, my friend. And it's not difficult at all. This, for instance, will work for any array of Objects that themselves support a reasonable version of 'toString':

public static String arrayToString(Object[] theArray) {
    StringBuilder output = new StringBuilder("[");
    for (Object o:theArray){
        if (output.length() > 1) {
            output.append(",");
        }
        output.append(o);
    }
    output.append("]");
    return output.toString();
}
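One catch: the method above takes an Object[], so it will accept a String[] or an Integer[], but not a primitive array such as the int[] the original poster had. For those, the standard library's java.util.Arrays class already has toString overloads, plus deepToString for arrays of arrays. A minimal sketch, with a made-up class name:

```java
import java.util.Arrays;

// Hypothetical demo class; only the java.util.Arrays calls are standard library.
class PrintPrimitiveArray {

    public static void main(String[] args) {
        int[] numbers = {42, 7, 19};
        Arrays.sort(numbers);

        // Arrays.toString has an overload for each primitive array type.
        System.out.println(Arrays.toString(numbers)); // [7, 19, 42]

        // For arrays of arrays, deepToString descends into the inner arrays.
        int[][] grid = {{1, 2}, {3}};
        System.out.println(Arrays.deepToString(grid)); // [[1, 2], [3]]
    }
}
```

Note that the library separates items with a comma and a space, where the hand-rolled version uses a bare comma.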


Tuesday, November 18, 2014

Arrays: A new dimension

Let's revisit arrays and take them quite literally to the next level.

To recap: An array is a means for storing multiple values of the same kind. Those kinds can be ints. Those kinds can be Strings. Those kinds can be... other arrays.

The simplest way to think of this is to picture a grid.

int [] [] twoDArray = new int [3][3];

Would generate something akin to this:

	0	1	2
0
1
2

Each of the empty boxes can hold an int value.

The type of twoDArray[0] is 'int []'.
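A short sketch of what that means in practice, assuming the 3 by 3 array declared above (the class name and the firstRow variable are mine, for illustration only): a row can be pulled out into an ordinary int[] variable, and it is still the same row.

```java
// Hypothetical demo class; the names other than twoDArray are illustrative.
class RowsAreArrays {

    public static void main(String[] args) {
        int[][] twoDArray = new int[3][3];

        // Each row is itself an ordinary int array.
        int[] firstRow = twoDArray[0];
        System.out.println(firstRow.length);  // 3

        // A freshly created int cell holds 0 until something is stored in it.
        System.out.println(twoDArray[1][2]);  // 0

        // firstRow and twoDArray[0] refer to the same row, so a write through
        // one is visible through the other.
        firstRow[1] = 7;
        System.out.println(twoDArray[0][1]);  // 7
    }
}
```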

To process ALL of the cells in this block you'd most likely write a nested set of for loops:

for (int i = 0; i < 3; i++) {
    for (int j = 0; j < 3; j++) {
        System.out.println(i + ", " + j + ": " + twoDArray[i][j]);
    }
}

Now the grid analogy breaks down just a bit when you make an array more complicated. For instance, each row could have a different number of columns. If you were building a more sophisticated application you would probably not want to hard-code the size of the arrays in your loops, but would be more likely to use:

for (int i = 0; i < twoDArray.length; i++) {
    for (int j = 0; j < twoDArray[i].length; j++) {
        // work with twoDArray[i][j] here
    }
}

But the principle is the same.
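To make the length-driven loops concrete, here is a minimal sketch that adds up every cell of an array whose rows have different lengths. The class name, the sumAll method, and the sample values are all made up for illustration.

```java
// Hypothetical helper class; sumAll is an illustrative name, not from the post.
class JaggedSum {

    // Adds up every cell, asking each row for its own length so that rows of
    // different sizes are handled without any hard-coded bounds.
    static int sumAll(int[][] grid) {
        int total = 0;
        for (int i = 0; i < grid.length; i++) {
            for (int j = 0; j < grid[i].length; j++) {
                total += grid[i][j];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] ragged = {{1, 2, 3}, {4}, {5, 6}};
        System.out.println(sumAll(ragged)); // 21
    }
}
```

Had the inner loop been hard-coded to run to 3, the second row would have blown up with an ArrayIndexOutOfBoundsException.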

By way of example, let's make a table of five rows, where each row has one more column than the last. We'll populate each cell with the product of the row and column indexes:

int [] [] twoDArray = new int [5][];

for (int i = 0; i < 5; i++) {
    // NOTE THIS! For each row we create a new row of boxes of the correct length.
    twoDArray[i] = new int[i + 1];
    for (int j = 0; j <= i; j++) {
        twoDArray[i][j] = i * j;
    }
}

	0	1	2	3	4
0	0
1	0	1
2	0	2	4
3	0	3	6	9
4	0	4	8	12	16

To print out the contents one could simply write this:

for (int i = 0; i < twoDArray.length; i++) {
    for (int j = 0; j < twoDArray[i].length; j++) {
        System.out.println(i + ", " + j + ": " + twoDArray[i][j]);
    }
}

Which would generate the following output:

0, 0: 0

1, 0: 0

1, 1: 1

2, 0: 0

2, 1: 2

2, 2: 4

3, 0: 0

3, 1: 3

3, 2: 6

3, 3: 9

4, 0: 0

4, 1: 4

4, 2: 8

4, 3: 12

4, 4: 16

It would certainly be possible to put this into a more readable format if one wished, of course.

For instance, if you changed the code to this:

for (int i = 0; i < twoDArray.length; i++) {
    for (int j = 0; j < twoDArray[i].length; j++) {
        System.out.print(twoDArray[i][j]);
        if (j < (twoDArray[i].length - 1)) {
            System.out.print(",");
        } else {
            System.out.println();
        }
    }
}

We've introduced something I haven't used before on this blog. While System.out.println should be familiar, we also have System.out.print. It's basically the same thing, but it doesn't end by creating a new line. This gives us the option to add more text after it. System.out.println() with no arguments just ends the current line of output.

The output from the above would therefore take the following form:

0

0,1

0,2,4

0,3,6,9

0,4,8,12,16

Which closely matches the table up above. Naturally, if the arrays had been created in a more complex manner the output would reflect that.
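If you'd rather have something you can reuse, one option is to build the text in a String and return it instead of printing as you go. This is only a sketch under my own naming (TriangleTable, build and format are not from the code above), but it follows the same two steps: create the jagged table, then write one row per line.

```java
// Hypothetical helper class; the method names are illustrative.
class TriangleTable {

    // Builds the jagged table from the example: row i has i + 1 columns,
    // and each cell holds the product of its row and column indexes.
    static int[][] build(int rows) {
        int[][] table = new int[rows][];
        for (int i = 0; i < rows; i++) {
            table[i] = new int[i + 1];
            for (int j = 0; j <= i; j++) {
                table[i][j] = i * j;
            }
        }
        return table;
    }

    // Formats one row per line, with commas between the cells of a row.
    static String format(int[][] table) {
        StringBuilder out = new StringBuilder();
        for (int[] row : table) {
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    out.append(",");
                }
                out.append(row[j]);
            }
            out.append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(build(5)));
    }
}
```

Because format only asks each row for its own length, it works just as well on a table that was built some other way.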

This concept can be extended to three, four, or 12 dimensions if you so choose. Mind you, it gets mighty hard to visualize past three. Just remember that the content of an array cell can be pretty much anything, including another array.
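For instance, here is a three-dimensional version, sketched with invented names (ThreeDimensions, countCells, cube). Think of it as two layers, each of which is a grid of three rows and four columns.

```java
// Hypothetical demo class; the names are illustrative.
class ThreeDimensions {

    // Counts the int cells in a three-dimensional array by walking every level.
    static int countCells(int[][][] cube) {
        int count = 0;
        for (int[][] layer : cube) {
            for (int[] row : layer) {
                count += row.length;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][][] cube = new int[2][3][4];
        cube[1][2][3] = 99; // the last cell in the last row of the last layer
        System.out.println(countCells(cube)); // 24
        System.out.println(cube[1][2][3]);    // 99
    }
}
```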

Monday, November 17, 2014

Arrays of Golden Sun

We've seen processing of simple integer values already, but what if you have to deal with a bunch of them? Perhaps you have 20 test results and want to do something with them.

I suppose we could write something like this:

int testResult1;
int testResult2;
int testResult3;
...

Then we could write a calculateAverage function...

int calculateAverage(int testResult1, int testResult2, int testResult3..

You know, I'm already tired of typing that, and I've got a whole lot more to go. And you just know that next year there will be 22 tests, right?

Never fear, we aren't stuck with such awfulness. Java (as with most languages) supports the concept of arrays. Don't fear the word, the concept is actually pretty simple to grasp.

First, let's go back and deal with testResult1. We can think of that variable as being a box that holds a number. We can put a number in the box, or we can ask the box what number is in it.

testResult1 = 97; //We've just put 97 into the box

System.out.println("Your first test result was " + testResult1 ); // We've just asked the box what number is in it.

All the operations boil down to one of those. If we do math, we're just asking what's in the box. If we reassign the number (like 'testResult1 ++') we're really just asking what's inside, making a new number and putting the new one into the box. Whatever was in the box before is gone once you do this; there's no room for two numbers. These are pretty simplistic boxes.

So, I can conceive of 'testResult1' through 'testResult20' as being a row of 20 boxes, each of which I can use to store or retrieve a number. But man, it's going to be annoying to do much useful work with individually addressable variables like that.

Enter the array.

Let's just take that same row of 20 boxes and relabel them. In fact, let's glue them together in a long line and give the whole spiel one single label. We'll call it 'testResults'.

Of course, testResults is not an int. It's a grouping of 20 ints. That's going to make it slightly more difficult to do math on it. Enter the square bracket and the array index.

If we want to refer to the item in the first box, first we have to get over a little hump. For reasons we don't need to get into now (just trust me that they're valid), the first box is number 0, not 1. So, what we previously would have called 'testResult1' can now be called 'testResults[0]', and the last of our 20 boxes is 'testResults[19]'. It is important to understand that 'testResults' is NOT an integer. However, testResults[0] (or testResults[13]) IS an integer and is just a marker for one of the boxes in our row.

I know, that seems a bit more difficult at first, but it turns out not to be. Here's why:

Remember before? We had to declare testResult1, testResult2, etc. on individual lines?

Well now we can just do:

int [] testResults = new int [ 20 ];

We've specified that testResults is not an int, but an array of ints with the '[]' notation. We've made it have 20 boxes with 'new int [ 20 ]'. (This can be done live at runtime, you could do 'new int [ someCalculatedValue ]'). In one fell swoop we've made storage space for 20 individual (but related) integer values. It gets better.
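Here is a small sketch of those 20 boxes in action. The class name, the isValidIndex helper, and the numbers are mine, purely for illustration. It shows a size computed at runtime, the first and last valid indexes, and what happens if you reach for a box that isn't there.

```java
// Hypothetical demo class; the names other than testResults are illustrative.
class TestResultBoxes {

    // Returns true if index is a usable position in the array.
    static boolean isValidIndex(int[] array, int index) {
        return index >= 0 && index < array.length;
    }

    public static void main(String[] args) {
        int numberOfTests = 4 * 5;               // the size can be computed at runtime
        int[] testResults = new int[numberOfTests];

        testResults[0] = 97;                     // first box
        testResults[19] = 88;                    // last box: length - 1
        System.out.println(testResults[0]);      // 97
        System.out.println(testResults[1]);      // 0, the default for an int box

        System.out.println(isValidIndex(testResults, 20)); // false
        try {
            testResults[20] = 50;                // there is no 21st box
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("Index 20 is out of range");
        }
    }
}
```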

The 'calculateAverage' function can now be declared like this:

float calculateAverage( int [] valuesToAverage ) {

Well isn't that a whole lot easier? And you can imagine that before, that calculateAverage method would have had to look something like this:

return (value1 + value2 + value3 + ... + value20) / 20;

Which seems like it takes up a lot of valuable real estate on screen, and, as mentioned before, is irritating to change.

Our new version takes advantage of the fact that the array is indexed (it could do more but I'll talk about that later):

float total = 0;

for (int i = 0; i < 20; i++) {
    total = total + valuesToAverage[i];
}
return total / 20f;

That's the whole thing. I didn't even get tired.

Of course, this can be improved further. What if there aren't 20 values in the array? What if there are 12? We could change the code, or we could let the array tell us how big it is:

float total = 0;
int valueCount = valuesToAverage.length;

for (int i = 0; i < valueCount; i++) {
    total = total + valuesToAverage[i];
}
return total / valueCount;

Now we have a completely generic method for averaging any number of integer values we want, that we will never have to change again. All we need to do is make sure before using it that the array we're passing in is sized correctly and populated. We can stash that into a utility class somewhere and use it for years.
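Put together as something you could actually drop into that utility class, it might look like the sketch below. The class name ArrayMath is invented, and the check for an empty or missing array is my own addition rather than part of the method as described; without it, an empty array would quietly give you NaN.

```java
// Hypothetical utility class; the guard for empty input is my assumption,
// not something the original method does.
class ArrayMath {

    static float calculateAverage(int[] valuesToAverage) {
        if (valuesToAverage == null || valuesToAverage.length == 0) {
            throw new IllegalArgumentException("Need at least one value to average");
        }
        float total = 0;
        for (int i = 0; i < valuesToAverage.length; i++) {
            total = total + valuesToAverage[i];
        }
        return total / valuesToAverage.length;
    }

    public static void main(String[] args) {
        int[] testScores = {100, 95, 22, 84};
        System.out.println(calculateAverage(testScores)); // 75.25
    }
}
```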

Naturally none of this is going to work without stashing numbers in the array in the first place.

One can initialize an array with hard coded values:

int [] testScores = { 100, 95, 22, 84 };

Or one can write code to set the values. Assuming we have appropriate helper functions in place, this kind of thing could work. It's more likely that you'd be getting input from the user or a file for your early experiments with the language. I think we'll talk about those soon.

int [] testScores = new int[ getNumberOfScores() ];
for ( int i = 0; i < testScores.length; i++ ) {
    testScores[i] = getScore(i);
}
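As a rough idea of what reading the values in might look like, here is a sketch that uses java.util.Scanner in place of the helper functions. The class name, the readScores method, and the input format (a count followed by that many scores) are all assumptions made for this example.

```java
import java.util.Scanner;

// Hypothetical helper class; the input format (a count followed by that many
// scores) is an assumption made for this sketch.
class ScoreReader {

    static int[] readScores(Scanner in) {
        int count = in.nextInt();
        int[] testScores = new int[count];
        for (int i = 0; i < testScores.length; i++) {
            testScores[i] = in.nextInt();
        }
        return testScores;
    }

    public static void main(String[] args) {
        // A Scanner can read from a String, which keeps the sketch self-contained;
        // new Scanner(System.in) would read what the user types instead.
        int[] testScores = readScores(new Scanner("4 100 95 22 84"));
        System.out.println(testScores.length); // 4
        System.out.println(testScores[2]);     // 22
    }
}
```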

Of course, that's just the tip of the array iceberg, but I think it gets across the main points:

Arrays:

Are just ordered, indexed lists of the data types we already understand
Can be passed as single labels but all the internal values are readily available
Contain 'meta' information about their contents, particularly how many items they hold
Can make many data processing tasks easier to code and understand

If you're having trouble with this concept, please let me know. It's good to have a firm grounding in basic aggregate data structures like this before we move on to more complex topics. Much of what we do in programming involves understanding some concept, then not using it directly, but instead using higher level structures that are better handled if you know what's going on underneath.

