Sunday, December 21, 2014

Old McDonald had a File, IO IO IO

One of the most basic and common functions we programmers have to implement involves reading and writing data files. So get used to doing so. The good news is that Java is well equipped with many library functions for doing this. The bad news is exactly the same. There are many library functions, so many that you can get lost in them. In fact, just when we thought we had a handle on the 'java.io' package, along came 'java.nio' (New I/O) to cloud the picture even more. Even an experienced programmer can get a little lost in the multiple classes that are in place for dealing with persistent data, and I certainly don't want to overwhelm my audience, so I'm going to stick to some basics.

First off, I'm going to pretend that 'new I/O' never happened. I'm also going to offer you only a subset of all the functionality that's out there even in the java.io library. When it comes down to it, when you've got a file to read or write, you want to be able to just do so, without a lot of muss or fuss.

I am going to have to assume that you're reasonably familiar with directories (or folders, if you must) and files as seen by your operating system. I'm going to assume that you understand what a text file is, and that you feel reasonably comfortable with viewing them or editing them with notepad, or UltraEdit, or whatever your choice of text editor might happen to be.

First, a note on file types. There are of course a huge number of file formats out there, to support applications from AutoCAD to PhotoShop to Visual Studio, but all files can be pretty much divided into two major categories: Text and Other. Text files are broken down into lines, and many programs deal with them on a line by line basis. Other files tend to consist of plain old blocks of information, whether those be pixels for an image, or a song or movie, or quite literally anything else you can save.

Java supports both major file types of course, and has objects specifically for dealing with each. When dealing with text files it's very convenient to read or write things one line at a time, so we have that ability in our libraries.

We've used an object we call 'PersonData' before, and we'll make use of it again now.  Suppose we had a file in which we had a bunch of last names and first names stored, for now we'll assume that last names and first names each appear on a line, so the file would look something like this:

PEOPLE.TXT

    Schmoe
    Joe
    Doe
    John
    Smith
    Bill

So we've got the information here to support three PersonData objects, all contained within a file called PEOPLE.TXT.

If we wanted to read the names from this file and create PersonData objects, it's conceptually pretty simple:

Open the file for reading
As long as there's something to read
    Read a last name
    Read a first name
    Create a PersonData object with those values

That makes sense, right?

Enter the Scanner.  The Scanner object is pretty new, and frankly I'm still not completely used to it.  But it's so darned handy I have no excuse not to use it.  The Scanner has several different constructors, but we're going to make use of the one that uses the File object right now.

First, let's define the file that we intend to read.  This sets us up for the next step:

File personFile = new File("PEOPLE.TXT");

Note that this isn't something that will fail, even if PEOPLE.TXT does not actually exist.  Maybe we intend to create it next, or we want to test for its existence.  The File object is handy that way.

Scanner personScanner = new Scanner(personFile);

This can fail.  If personFile refers to a non-existent file bad things can and will happen starting here.  Proper code will be wrapped up in try/catch blocks as I discussed when we talked about exception handling.

At this point we have covered the first line of our pseudo-code algorithm:  "Open the file for reading".  The next part was "As long as there's something to read", which a Scanner can handle quite easily:

while (personScanner.hasNextLine()) {

This loop will now run until we're out of lines.  That's pretty convenient.  Now let's extract the last and first names:

    String lastName = personScanner.nextLine();
    String firstName = personScanner.nextLine();

Yes, there's a weakness or three here.  We're making the assumption that our file is correctly formatted and complete, with two lines for every person.  Good coding practice would include checking 'hasNextLine' in between those two lines and dealing with it if that isn't the case.

We're actually almost done here.  Once we've got the last name and the first name, we can go ahead and create a PersonData object:

    PersonData personFromFile = new PersonData( firstName, lastName );

At this point, technically we can end the loop.  We've done the specified job, although realistically this is useless.  There's not much point in reading a data file only to toss away the data.  You'd want to do something with these 'personFromFile' objects, like adding them to a List or Map, or storing them in some other sort of data structure your program will use.

But for now, we'll just pretend we're doing that and close down the file:

}
personScanner.close();

And that's about all there is to reading text files since Scanner arrived in Java 5.  It was a bit more difficult before, when we had to create a couple of other objects in place of Scanner, and if the data included numbers or multiple items on one line it was even more involved.
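Here is a minimal sketch that puts the pieces above together in one place.  The PersonData class shown is only a hypothetical stand-in for the one from earlier posts, and the 'readPeople' helper is a name I made up for this example.  It also adds the 'hasNextLine' check between the two reads and uses try-with-resources (Java 7 and later) so the Scanner gets closed no matter what.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class PeopleReader {

    // Hypothetical stand-in for the PersonData class from earlier posts.
    public static class PersonData {
        final String firstName;
        final String lastName;

        PersonData(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public String toString() {
            return lastName + ", " + firstName;
        }
    }

    // Reads last-name/first-name line pairs until the input runs out.
    public static List<PersonData> readPeople(Scanner scanner) {
        List<PersonData> people = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String lastName = scanner.nextLine();
            if (!scanner.hasNextLine()) {
                // Incomplete record: a last name with no first name after it.
                System.err.println("No first name found for " + lastName);
                break;
            }
            String firstName = scanner.nextLine();
            people.add(new PersonData(firstName, lastName));
        }
        return people;
    }

    public static void main(String[] args) {
        File personFile = new File("PEOPLE.TXT");
        // try-with-resources closes the Scanner even if reading fails.
        try (Scanner personScanner = new Scanner(personFile)) {
            for (PersonData person : readPeople(personScanner)) {
                System.out.println(person);
            }
        } catch (FileNotFoundException e) {
            System.out.println("Could not open " + personFile + ": " + e.getMessage());
        }
    }
}
```

Keeping the loop in a method that accepts a Scanner is a design choice, not a requirement.  It means the same code can read from a file, from a String, or from the keyboard.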

Let's say that the file was formatted like this instead:

PEOPLE.TXT

    Schmoe Joe
    Doe John
    Smith Bill

That seems reasonable, right?  It is, and it's pretty easy to deal with.  All we'd need to do is change the following two lines from this:

    String lastName = personScanner.nextLine();
    String firstName = personScanner.nextLine();

To this:

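    String lastName = personScanner.next();
    // nextLine() returns the rest of the line, including the space before the first name, so trim it
    String firstName = personScanner.nextLine().trim();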

And we will have accomplished our goal.  We could also have done this:

    String lastName = personScanner.next();
    String firstName = personScanner.next();
    String anythingLeftover = personScanner.nextLine();

What is important is to realize that 'nextLine' is critical to advance us from one line to the next, so we have to use it before we go through the loop again.
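To see the one-line-per-person idea run from start to finish, here is a small sketch that feeds a Scanner from a String instead of a file (Scanner has a constructor for that, too).  The class and method names are just ones I picked for the example, and it assumes every line really does hold a last name followed by a first name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class OneLinePerPerson {

    // Returns names as "First Last" strings, one per input line.
    public static List<String> readNames(Scanner scanner) {
        List<String> names = new ArrayList<>();
        // hasNext() skips blank lines while looking for the next token.
        while (scanner.hasNext()) {
            String lastName = scanner.next();
            // nextLine() hands back the rest of the line (leading space included)
            // and moves the Scanner on to the following line.
            String firstName = scanner.nextLine().trim();
            names.add(firstName + " " + lastName);
        }
        return names;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("Schmoe Joe\nDoe John\nSmith Bill\n");
        for (String name : readNames(scanner)) {
            System.out.println(name);
        }
        scanner.close();
    }
}
```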

Before I go, I also want to mention that Scanner is capable of reading not just Strings, but also any of the primitive data types as well.  It supports nextInt(), nextDouble(), etc.  It's also capable of reading more complicated input via the use of Regular Expressions, which are a huge topic in their own right.  Regular Expressions essentially provide an entire language for matching complicated text, and an entire series of articles could no doubt be written just on that topic alone.
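A quick sketch of the typed reads, assuming a made-up input format of "quantity price" pairs.  The explicit Locale is there because nextDouble() follows the Scanner's locale, and not every machine expects a period as the decimal separator.

```java
import java.util.Locale;
import java.util.Scanner;

public class TypedScannerDemo {

    // Sums "quantity price" pairs, e.g. "3 1.50" contributes 4.50 to the total.
    public static double totalCost(String text) {
        Scanner scanner = new Scanner(text);
        scanner.useLocale(Locale.US); // so "1.50" parses whatever the default locale is
        double total = 0.0;
        while (scanner.hasNextInt()) {
            int quantity = scanner.nextInt();
            double price = scanner.nextDouble();
            total += quantity * price;
        }
        scanner.close();
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalCost("3 1.50\n2 0.25\n"));
    }
}
```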

In our next episode, we'll talk about writing text files, which is just the reverse of what we've done here.

Monday, December 15, 2014

Inheritance Part 2: That's Rather Abstract, Isn't it?

I hope you've been browsing through Java source code outside of your own and the small samples I've provided on this site.  Reading and trying to understand code can only help you in the long run.  You will encounter different code styles, you will see interesting data structures, API calls you didn't know existed, all sorts of stuff.

Today I'm going to talk about something you might have seen, but if you haven't, it's no big deal.  It is the keyword abstract.

We use abstract as part of the definition of a class, which indicates that while it is in fact a valid class file, nobody is allowed to create an object of that type.  That doesn't mean that it's useless though.  Remember, we have inheritance on our side.  You can create concrete classes (concrete is just the opposite of abstract, indicating a class from which you can create objects.  No special keyword is needed for that) that extend abstract classes all you want.

We also use abstract as part of method signatures when we want to create the idea of a particular method call but wish to defer the actual implementation of said method to our concrete classes.  This is similar in concept to interfaces, and sometimes it's more of a personal decision as to which you will use.

You see a lot of abstract classes in use when you look through frameworks, which are basically bodies of code designed to accomplish some function but with the details left up to the final developer.  There are business frameworks, logging frameworks, GUI frameworks, reporting frameworks, the list is very long.  Usually the trigger for creating a framework is when someone realizes they've just written the same thing for the fifth or tenth time, and that the differences between the various programs are relatively minor.  Maybe you query a database, format a report and write it to a file, and it's always the same except for the details of which fields you look for and print.

This is a great application for abstract classes.  Let's take a look at a simple one:

public abstract class Report {
}

Well, that's about as simple as it gets.  To be fair it's also about as useless as it gets but never fear, we'll fill in some details later.  Actually it's not entirely useless.  You might want to have a set of classes that all share a common base class so that you can collect them into a Collection of some sort, although you could also make this work with an interface.  Abstract classes really come into their own when you include some code within them.  So let's write something.

public abstract class Report {
    public abstract void GenerateOutputLine(String dataLine);
    public abstract String ReadData();

    public void runReport() {

        String dataLine = null;
        while( null != ( dataLine = ReadData() ) ) {
            GenerateOutputLine(dataLine);
        }
    }
    
}

OK, that was simple enough.  We have defined not only our abstract Report class, but made it do a little bit of work for us.  It calls ReadData until that method returns a null, and then takes the value returned from ReadData and passes it to GenerateOutputLine.  Of course, you can't actually run this report, you need to create at least one concrete class based on it.  That could be a simple test driver designed to make absolutely certain that your framework behaves properly with known data.  It could be something that reads from one file and writes to another, which would basically give you a 'copy file' command.  It could be something that reads one file, does some calculations or modifications, and then spits out the changed data.  The options are nearly endless, despite the fact that this is a highly simplified example.

Let's write a simple test driver:

public class ReportTester extends Report {

    public static void main(String[] args) {
        ReportTester rt = new ReportTester();
        rt.runReport();
    }

    String [] dataLines = {
        "one", "two", "three", "four", "five"
    };
    int lineNumber = 0;

    public void GenerateOutputLine(String dataLine) {
        System.out.println(dataLine);
    }

    public String ReadData() {
        String toReturn = null;
        if (lineNumber < dataLines.length) {
            toReturn = dataLines[ lineNumber++ ] ;
        }
        return toReturn;
    } 
}

Now if we were to run 'ReportTester' we'd get the output:

one
two
three
four
five

The main driver of the program is still in that 'Report' superclass, but the implementation details have been kept in ReportTester.  We run the code in Report, and it calls the methods it finds in ReportTester to do the work.  ReportTester itself is quite simple.  In addition to a simple main that does nothing more than create ReportTester and run it (actually running that method from the superclass Report), it defines the specific implementations for our two abstract methods.  One method does nothing more than write to the console and I hope I don't need to explain that.  The other goes through an array of Strings, returning the next one in the array and increasing the index value each time it's invoked until it reaches the end of the list, at which time it returns a null which triggers the end of the loop in 'runReport'.
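To show that the same framework can drive a report that behaves differently, here is a sketch of a second concrete class.  'ShoutingReport' is purely hypothetical: it upper-cases each line and collects the results in a List instead of printing them.  The Report class is repeated only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// The Report framework class from above, repeated so this sketch stands alone.
abstract class Report {
    public abstract void GenerateOutputLine(String dataLine);
    public abstract String ReadData();

    public void runReport() {
        String dataLine = null;
        while (null != (dataLine = ReadData())) {
            GenerateOutputLine(dataLine);
        }
    }
}

// Hypothetical second concrete report: upper-cases every line it is given.
public class ShoutingReport extends Report {
    private final String[] dataLines;
    private int lineNumber = 0;
    private final List<String> output = new ArrayList<>();

    public ShoutingReport(String... dataLines) {
        this.dataLines = dataLines;
    }

    @Override
    public void GenerateOutputLine(String dataLine) {
        output.add(dataLine.toUpperCase());
    }

    @Override
    public String ReadData() {
        // Return the next line, or null once the data runs out.
        return lineNumber < dataLines.length ? dataLines[lineNumber++] : null;
    }

    public List<String> getOutput() {
        return output;
    }

    public static void main(String[] args) {
        ShoutingReport report = new ShoutingReport("one", "two", "three");
        report.runReport();
        System.out.println(report.getOutput());
    }
}
```

Notice that runReport() did not change at all.  Only the two abstract methods were filled in differently.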

This is of course tremendous overkill for a program of this size, but as your projects get more involved, the ratio of abstract code to concrete code is likely to change quite a bit.  System maintenance gets easier and faster, and your software gets more robust.  After all, a bug fix to a framework may take care of issues across several dozen different specific implementations of different reports.

I will of course revisit this topic later, because we've only scratched the surface of the possibilities brought on by object orientation.  With enough small components intelligently built, one can assemble new programs almost like putting together Tinker Toys or Lego.  One's systems can be defined in large part by configuration files, and changed easily at will, all without writing, testing or deploying new code.


Thursday, December 11, 2014

Good Practices: Coding to the Interface

You will sometimes hear someone recommend that you 'code to the interface'.  What the heck does that mean?

Well, I'm going to get deeper into just what interfaces provide for you, and why you should both create and use them in a later post, but a simple example can be shown with the very common Collections interface "List".

Using List is very common, although a lot of people will begin using ArrayList directly.  It's not that there's anything precisely wrong with doing this.  In fact, probably nine or more out of ten times the only thing it will cost you is a few extra keystrokes.  However, it serves as a good example in a section of code with which many of us get familiar early.

Suppose I wanted to have a list of active elements for a game.  I could write this code to have it available within my GameData class:

    private ArrayList<GameElement> gameElements = new ArrayList<GameElement>();

And then I could have a getter method:

    public ArrayList<GameElement> getGameElements() {
        return gameElements;
    }

That's simple enough, right?  So I go ahead and write a bunch of code within the rest of my game, happily accessing all the elements they need to function:

    public void addElementToGame(GameElement newElement) {
        ArrayList<GameElement> allElements = myGameData.getGameElements();
        if (false == allElements.contains(newElement)) {
            allElements.add(newElement);
        }
    }
    ...
    public void walkAllElements(ElementChecker myChecker) {
        ArrayList<GameElement> allElements = myGameData.getGameElements(); 

        for(GameElement element : allElements ) {
            myChecker.checkElement(element);
        }
    }

etc. etc. etc.

It's not that there is anything wrong with this.  In fact, you're feeling pretty good, you even made a method that accepts a worker object as a parameter and invokes some code on that worker object for each item in the list.

Then the reports start coming in:  Sometimes the game gets very slow.  It's generally after someone has been playing for a long time.  Eventually, you pin it down to your element management code.  It's fine with a few elements but once you get to having hundreds, certain operations get bogged down.  At the same time, you realize that you'd love to prioritize these game elements by allowing inserts to happen at the start of the list for faster processing.  That's got you down all weekend as you try to figure out what to do.

The light at the end of the tunnel comes when you review the documentation for the Collections framework and discover LinkedList.  It might be a bit slower with smaller amounts of data, but it sounds like it will behave better for you with the large data sets you're processing and it offers the side benefit of being able to easily handle quick inserts at the start of the list.

That's when you suddenly get annoyed.  You make the change in GameData and from all the red markers that show up on your screen you realize that you've got seven hundred and fifty places in your code base where you have to do nothing more than change the word 'ArrayList' to the word 'LinkedList'.  This is all fixable, but it's going to take up a significant chunk of your time.  Most of the code does not even have to change in any other way, maybe you've got two or five places where you will need to explicitly refer to methods that are specific to LinkedList, but the rest are just mechanical updates of the code.

This is where coding to the interface could have saved you some time and effort.  If, instead of referring to 'ArrayList' everywhere, you had just used 'List', all or at least most of your code would have otherwise remained exactly the same.

The declaration of gameElements could have been written this way:

    private List<GameElement> gameElements = new ArrayList<GameElement>();

Your walkAllElements could have looked like this instead:

    public void walkAllElements(ElementChecker myChecker) {
        List<GameElement> allElements = myGameData.getGameElements(); 

        for(GameElement element : allElements ) {
            myChecker.checkElement(element);
        }
    }

Note that nothing has really changed here except for changing 'ArrayList' to 'List' in some places.  This is because 'List' is an interface, which is really a kind of contract.  It forces an object that implements it to support a set of defined methods, and both ArrayList and LinkedList do this.

    public class ArrayList implements List {

    public class LinkedList implements List {
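Here's a small sketch of the payoff.  Since GameElement isn't defined in this post, plain Strings stand in for game elements, and the names are invented for the example.  The 'describe' method is written against List, so it works unchanged with either implementation.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class InterfaceDemo {

    // Written against List, so it neither knows nor cares which implementation it gets.
    public static String describe(List<String> elements) {
        StringBuilder sb = new StringBuilder();
        for (String element : elements) {
            sb.append(element).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Switching implementations means changing only the right-hand side of this line.
        List<String> elements = new ArrayList<>();
        elements.add("dragon");
        elements.add("knight");
        elements.add(0, "wizard"); // insert at the start; every List supports this
        System.out.println(describe(elements));

        // The same method accepts a LinkedList without any changes.
        List<String> linked = new LinkedList<>(elements);
        System.out.println(describe(linked));
    }
}
```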

Now this just scratches the surface of interface usage.  Later on we'll get into creating and using your own, and how that can save you even more time and energy, and bring your coding to a new level.  For now though, just be aware that classes can implement interfaces, and that whenever possible, it's better to write your code to care only about said interfaces, and not to give a hang about the specific object type it's dealing with.  Get used to it, let it become your natural way of operating.  It will serve you well later on.  Eventually you will be creating your own interfaces, which can serve as the control points for widely varying objects.  I can drive a car from any manufacturer, because they all offer me the same set of controls.  I do not need to be licensed to drive a Subaru, separately licensed to drive a Volvo, and have a learner's permit for a Ford.  Interfaces are like specifying 'gas pedal, brake pedal, steering wheel'.  Oh, and objects can support arbitrary numbers of interfaces, so in reality my car is defined something like this:

    public class S60 implements Driveable, GasPowered, Geartronic, PushbuttonStart, KeylessEntry {

There would no doubt be a lot more if I sat down to think about it all.  The point is, once I understand how to drive a car, I can drive pretty much any car, because they ALL implement Driveable.  If I go to my local Avis and rent something, I may not know about all the features, the navigation system may be a complete mystery to me, but man, I know how to work a steering wheel.
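The car analogy translates into code fairly directly.  This is only a sketch: the Driveable interface and its single 'steer' method are my own invention for illustration, and a real one would obviously have more controls.

```java
public class DriveableDemo {

    // Hypothetical interface: the controls every car promises to offer.
    public interface Driveable {
        String steer(String direction);
    }

    public static class Volvo implements Driveable {
        @Override
        public String steer(String direction) {
            return "Volvo turns " + direction;
        }
    }

    public static class Subaru implements Driveable {
        @Override
        public String steer(String direction) {
            return "Subaru turns " + direction;
        }
    }

    // The driver only knows about Driveable, never about a specific make.
    public static String turnLeft(Driveable car) {
        return car.steer("left");
    }

    public static void main(String[] args) {
        Driveable[] rentals = { new Volvo(), new Subaru() };
        for (Driveable car : rentals) {
            System.out.println(turnLeft(car));
        }
    }
}
```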

Wednesday, December 10, 2014

I Came Here for a Good Argument

I was inspired this morning by another post at /r/javahelp.  A guy taking a programming class got a note from his teacher indicating the teacher wasn't happy with a hard coded path to a data file.  I am not saying I think the professor's tone was helpful, but I do see the point of that.

So how do we get around hard coded paths in main?

I think you can probably guess by now that if you have a method that opens a file for reading you don't have to embed the file name into the method itself.  It's simple enough to add a String parameter to your method and open up whatever is passed in:

    void readFile(String fileToRead) {
        File forReading = new File(fileToRead);

Well we can do the same sort of thing in main, too.  Let's go back to how main is defined:

    public static void main(String[] args) {

I haven't mentioned that args parameter before, but it can come in really handy.  That's where your command line arguments can be found.

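By the way, you will see the term arguments and the term parameters applied to method calls time and again.  Strictly speaking, parameters are the names in a method's declaration and arguments are the values you pass when you call it, but people use the two interchangeably, so don't get too hung up on which one is which.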

When you run a java program, you can do all sorts of things beyond just calling the main method.  I won't discuss JVM Parameters for now, but you can tell java itself to modify its behavior, request extra memory, etc.  More importantly for us right now, you can give your program additional information.  For now I'll discuss running from a command line prompt, but the same concepts apply if you're running inside an IDE.  You just get to them a bit differently.

Let's write a small program:

    public class CommandLineDemo {
        public static void main(String [] args) {
            for(int i = 0; i < args.length; i++) {
                System.out.println(args[i]);
            }
        }
    }


See?  I told you it was small.  But it does something which will prove useful once you've absorbed this concept.

If we just run the program it looks like it's doing nothing:

    $java CommandLineDemo
    $

But if we provide arguments things change

    $java CommandLineDemo Doe a deer
    Doe
    a
    deer
    $

Well now, that changes things, doesn't it?  Many programs become a lot more useful if you can tell them exactly what you want done.  Let's face it, this workflow:

1) Update file name in program
2) Compile program
3) Run program

Is rather less efficient than this workflow:

1) Run program with file name as an argument

Of course, file names just scratch the surface of this new capability.  You can analyze your command line for multiple values.  You can pass command line switches that control your program's behavior.  If you need numbers instead of String values, never fear, because it's always possible to convert Strings to numbers via readily available parsing methods.  In short, command line arguments give you and your users a great deal of flexibility that would be at the least inconvenient if they weren't available.  Sure, you could force a user of a program that searches for text in a file to update another file containing the file name and the value to search...  But that's when they go looking for someone else's program to use.
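As a rough sketch of the switches-and-numbers idea: the '--count' and '-v' options below are made up for this example, not part of any standard.  Integer.parseInt does the String-to-number conversion and throws NumberFormatException when the text isn't a valid integer, so that case is handled too.

```java
public class ArgParsingDemo {

    // Returns the value following "--count" as an int,
    // or defaultValue if the switch is absent or its value is not a number.
    public static int parseCount(String[] args, int defaultValue) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("--count".equals(args[i])) {
                try {
                    return Integer.parseInt(args[i + 1]);
                } catch (NumberFormatException e) {
                    System.err.println("Not a number: " + args[i + 1]);
                    return defaultValue;
                }
            }
        }
        return defaultValue;
    }

    // Reports whether a simple on/off switch such as "-v" is present.
    public static boolean hasSwitch(String[] args, String name) {
        for (String arg : args) {
            if (name.equals(arg)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("count = " + parseCount(args, 1));
        System.out.println("verbose = " + hasSwitch(args, "-v"));
    }
}
```

Running it as 'java ArgParsingDemo -v --count 5' would report a count of 5 with verbose turned on.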

So let's combine the two bits of code we had above into one.  Note that because args is an array, we are free to deal with MANY input file names instead of just one as in the original hard coded demonstration:

    public class CommandLineFileReader {

        public static void main(String [] args) {
            for(int i = 0; i < args.length; i++) {
                readFile(args[i]);
            }
        }
   
        public static void readFile(String fileName) {
        ...
        }
    }

Of course, in a real world program you'd probably want to do some exception handling, as outlined in a recent post.  That might be embedded within readFile, or maybe main might look more like this:

        public static void main(String [] args) {
            int successCount = 0;
            int failCount = 0;
            for(int i = 0; i < args.length; i++) {
                try {
                    readFile(args[i]);
                    successCount ++;
                    System.out.println("I was able to read " + args[i]);
                }
                catch(IOException e) {
                    failCount ++;
                    System.out.println("Unable to read " + args[i] + ": " + e);
                }
            }
            System.out.println("Successfully read " + successCount + " files.");
            System.out.println("Failed to read " + failCount + " files.");
        }

Note that because I catch this exception inside of the for loop, I can notify the user that a file failed and keep track of how many have done so but continue on to process the rest.  It should take something much more catastrophic than a misspelling to break your system.
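For anyone who wants something they can compile and run, here is one way the whole thing might fit together.  The body of readFile is an assumption on my part (it just counts lines, and returns that count rather than being void), and the 'readAll' helper is a name I invented so the counting logic lives outside of main.

```java
import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class CommandLineFileReader {

    // One possible readFile body (an assumption): count the lines in the file.
    public static int readFile(String fileName) throws IOException {
        int lineCount = 0;
        // new Scanner(File) throws FileNotFoundException, which is a kind of IOException.
        try (Scanner scanner = new Scanner(new File(fileName))) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                lineCount++;
            }
        }
        return lineCount;
    }

    // Tries every file and keeps going after a failure; returns {successCount, failCount}.
    public static int[] readAll(String[] fileNames) {
        int successCount = 0;
        int failCount = 0;
        for (String fileName : fileNames) {
            try {
                int lines = readFile(fileName);
                successCount++;
                System.out.println("Read " + lines + " lines from " + fileName);
            } catch (IOException e) {
                failCount++;
                System.out.println("Unable to read " + fileName + ": " + e);
            }
        }
        return new int[] { successCount, failCount };
    }

    public static void main(String[] args) {
        int[] counts = readAll(args);
        System.out.println("Successfully read " + counts[0] + " files.");
        System.out.println("Failed to read " + counts[1] + " files.");
    }
}
```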

I hope this makes some of the capabilities of command line arguments clear, with a side order of more robust programming practices.  As always, comments and questions are welcome.

Tuesday, December 9, 2014

Everything is Awesome... Well, with one Exception

Let's talk about my old buddy Murphy.  I've been developing software for...  is it really a quarter of a century now?  And his law is ironclad.  Things WILL go wrong.  Things will always be going wrong, I suppose, unless and until a perfect computer is combined with a perfect programmer.  I don't think I'll worry about my day job on that account just yet.

When a program encounters an error condition, it can give up in a panic or it can attempt to handle the situation.  If you try to open a picture in your favorite photo editor, and it turns out that despite the name of the file, it was actually a word processing document, you probably wouldn't want the software to crash.  If you deploy a new program at work and it chokes because you forgot to feed it all the command line parameters it needs, you probably want it to tell the operations staff what went wrong, rather than giving them the precious gift of a crash report with no supporting information.

Back in the old days, we had several methods of dealing with these sorts of errors.  Sometimes there would be a particular value that would only come back from a function if something had gone wrong.  Maybe you expected a result of 0 or higher, so if it came back as -1 you knew you had a problem:

    returnCode = deployLandingGear();
    if (returnCode == -1) {
      printf("ERROR:  deployLandingGear failed.  Please check operations manual.");
      return;
    }
    // deployLandingGear worked.  We can go ahead and attempt to land on Mars
    land();

Now maybe your code is not as mission critical as that, but the point remains that we wound up sticking our error handling there into the main body of our code.  At best this was annoying.  At worst we failed to account for possible errors, but of course not because the project manager was pressuring us to get code shipped.

Enter the Exception.

Exceptions are pretty much what they sound like.  Conditions that are outside the normal expectations you have for what your program ought to be doing.  Your 'deployLandingGear' method ought to work pretty much every time, unless there's something physically wrong with your lander.  Exceptions give us a method of keeping our code nice and clean, while giving us an effective 'hook' to deal with problems as they arise.

Exceptions require just a bit of coordination between calling and called methods.  Any method that wants to be able to notify callers of problems should specify this fact in its method signature.

So the header for the method 'deployLandingGear' might be written like this:

    public void deployLandingGear() throws HardwareException {
      

'HardwareException' is an arbitrary name I just made up, but it has to refer to a valid class that extends Exception.  I could have had it just say 'throws Exception', actually, but it's generally better to have it be more specific than that.  This is particularly the case if you wanted to have the constructor for the HardwareException accept parameters such as how far the gear had gone before the problem occurred, which subsystem failed, etc.

So HardwareException might be declared like this:

    public class HardwareException extends Exception {

And perhaps it has a constructor like this:

        public HardwareException(Subsystem failureIn, int percentComplete, boolean canContinueWithFailure) {

Of course, when it comes right down to it, I don't work for either NASA or SpaceX and I have no idea what their protocols might be in the event of such a failure.  Let's assume though, that the landing gear subsystem is considered deployed 'enough' if it's at least 85% deployed, and that it reports its position as an integer of said percent deployed.

Let's see what happens in one case of deployLandingGear failing.

    public void deployLandingGear() throws HardwareException {
      ...
      if (false == landingGear.isCycleComplete()) {
        throw new HardwareException(landingGear, landingGear.getPosition(), landingGear.getPosition() >= 85);
      }

Here we have introduced the keyword 'throw', which fulfills the contract specified by the 'throws' clause in the method signature.  We call this 'throwing an exception' and it pretty much is exactly that.  When we throw an exception it's kind of like returning from the method, except that there is no return type specified, required, or allowed.  Also, the control flow in the calling program will be different.
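Here is a self-contained sketch of the declare-and-throw pattern.  To keep it runnable I've simplified things: the subsystem is a plain String rather than a Subsystem object, the gear position is passed in as a parameter, and 'tryDeploy' is a helper I added purely to show the result.  None of that is meant as the one true design.

```java
public class LandingGearDemo {

    // Hypothetical exception class, loosely following the constructor sketched above.
    public static class HardwareException extends Exception {
        private final int percentComplete;
        private final boolean canContinueWithFailure;

        public HardwareException(String failureIn, int percentComplete, boolean canContinueWithFailure) {
            super(failureIn + " stopped at " + percentComplete + "%");
            this.percentComplete = percentComplete;
            this.canContinueWithFailure = canContinueWithFailure;
        }

        public int getPercentComplete() {
            return percentComplete;
        }

        public boolean canContinueWithFailure() {
            return canContinueWithFailure;
        }
    }

    // Throws if the gear did not finish its cycle; position is the percent deployed.
    public static void deployLandingGear(int position) throws HardwareException {
        boolean cycleComplete = position >= 100;
        if (!cycleComplete) {
            // 85% or more counts as deployed 'enough' to continue.
            throw new HardwareException("landing gear", position, position >= 85);
        }
    }

    // Helper for the demo: turns the outcome into a readable status line.
    public static String tryDeploy(int position) {
        try {
            deployLandingGear(position);
            return "deployed";
        } catch (HardwareException e) {
            return e.getMessage() + ", can continue: " + e.canContinueWithFailure();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDeploy(100));
        System.out.println(tryDeploy(90));
        System.out.println(tryDeploy(40));
    }
}
```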

If we try to write code to deploy the landing gear and fail to deal with the HardwareException, the Java compiler will complain.  Your code will simply not compile.  In order to call it you will have to use a special construct called a try/catch block.  This syntax is often confusing when you first encounter it, but basically, we are going to try to do something, and if it fails and throws an exception, we're going to catch that exception and deal with it.

It looks like this:

    try {
      deployLandingGear();
      land();
    }
    catch(HardwareException landingError) {
      System.out.println("Hardware failure.  Landing may not be possible.  Error was: " + landingError );
    }

OK, now we see that the actual code block for 'deployLandingGear' and 'land' is a bit cleaner than it was using the older style.  Of course, this would be much more the case if we had more than two lines of code in there.  We also see (and this is actually not a great way to write the code) that in the event of a failure, control passes down into the catch block, eliminating all chance that you might ignore the error due to a lack of coffee and try to land anyway.

In reality it's probably better to allow the HardwareException to pass further up to a higher level master control function.  You can do this automatically by not having the try/catch structure at this level and declaring that this method also throws such a beast.  You can also do it manually by writing 'throw landingError' inside the catch block.
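Both routes look something like this in practice.  It's a sketch with invented method names ('landAutomatically', 'landWithLogging', 'masterControl') and a bare-bones HardwareException, and the gear always fails here so there is something to propagate.

```java
public class PropagationDemo {

    // Bare-bones hypothetical exception for this example.
    public static class HardwareException extends Exception {
        public HardwareException(String message) {
            super(message);
        }
    }

    // Always fails, so there is something to pass upward.
    static void deployLandingGear() throws HardwareException {
        throw new HardwareException("gear jammed");
    }

    // Automatic: no try/catch here, the 'throws' clause lets the exception pass through.
    static void landAutomatically() throws HardwareException {
        deployLandingGear();
    }

    // Manual: catch, do something local, then rethrow the same exception.
    static void landWithLogging() throws HardwareException {
        try {
            deployLandingGear();
        } catch (HardwareException landingError) {
            System.out.println("Logging: " + landingError.getMessage());
            throw landingError;
        }
    }

    // The higher-level 'master control' that finally deals with the problem.
    public static String masterControl(boolean logFirst) {
        try {
            if (logFirst) {
                landWithLogging();
            } else {
                landAutomatically();
            }
            return "landed";
        } catch (HardwareException e) {
            return "aborted: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(masterControl(false));
        System.out.println(masterControl(true));
    }
}
```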

You can have multiple 'catch' blocks one after the other, if you need to account for several different exception types.  You can have a 'catch' block catch a superclass of your exception and still catch it, so 'catch (Exception e)' would have worked fine here.  It gets involved, and there's no one right answer for all situations.  You must let your system requirements and sometimes your personal preferences guide you.
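A short sketch of several catch blocks in a row, using exception classes from the standard library.  The 'failureKind' parameter is just a device to trigger each case.  Note the ordering: the most specific type has to come first, because the compiler rejects a catch block that could never be reached.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class MultiCatchDemo {

    // Throws a different exception depending on failureKind and reports which catch handled it.
    public static String classify(int failureKind) {
        try {
            if (failureKind == 1) {
                throw new FileNotFoundException("missing.txt");
            }
            if (failureKind == 2) {
                throw new IOException("disk error");
            }
            if (failureKind == 3) {
                throw new IllegalStateException("not ready");
            }
            return "no failure";
        } catch (FileNotFoundException e) {
            // Most specific type first.
            return "file not found";
        } catch (IOException e) {
            // Superclass of FileNotFoundException: catches the remaining I/O problems.
            return "other I/O problem";
        } catch (Exception e) {
            // Superclass of everything above: the catch-all.
            return "something else";
        }
    }

    public static void main(String[] args) {
        for (int kind = 0; kind <= 3; kind++) {
            System.out.println(kind + ": " + classify(kind));
        }
    }
}
```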

Finally (ha ha) there is the 'finally' block.  This is optional and can be placed after all catch blocks.  This is code that WILL be run, whether you hit an exception or not.  Once either the try block is finished, or any catch block is finished, your finally block will be run.  This can be handy if you absolutely must release some resources or post a message or something.

    try {
     doSomethingThatCouldFail();
     System.out.println("Succeeded.");
    }
    catch(Exception e) {
      System.out.println("This is why you fail: " + e);
    }
    finally {
      System.out.println("Do or do not, there is no try.");
    }
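If you want to watch the order in which things happen, this little sketch records each step in a list.  The class and method names are mine, and the 'fail' flag simply decides whether an exception gets thrown.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyDemo {

    // Records which blocks ran, in order, with and without a failure.
    public static List<String> run(boolean fail) {
        List<String> steps = new ArrayList<>();
        try {
            steps.add("try");
            if (fail) {
                throw new IllegalStateException("boom");
            }
            steps.add("succeeded");
        } catch (IllegalStateException e) {
            steps.add("catch");
        } finally {
            // Runs on both paths.
            steps.add("finally");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [try, succeeded, finally]
        System.out.println(run(true));  // [try, catch, finally]
    }
}
```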

You can't ignore exception handling, but it's possible to do it poorly.  The better you deal with things that go wrong, though, the more robust and resilient your software will be.  If you can change some setting and make another attempt, so much the better.

Importing and Packages - Customs won't like this.

Alright, I've been ignoring it, but the time has come to take on a fairly important issue when it comes to organizing your code.  The topic of the day is packaging.

Packages are really not much more than a set of directories (or folders if you prefer) into which your .class files are placed.  This becomes important once your projects get beyond two or three .java files, because you really want to avoid namespace clashes.  You don't even know what they are yet but I'll bet you're agreeing that they sound like something to avoid.  More importantly, proper packaging will help you to keep your workspace and brain organized.  Programming is hard enough, you don't need to make it worse on yourself.  I may keep a messy desk but I do try to have my source code laid out in a logical fashion.

Most of the time, we bundle up our .class files into something called a Jar.  This is just a file with a .jar extension, and it's actually in technical terms a .zip archive file.  This is convenient because a sizeable system can contain dozens or hundreds of .class files, and managing them individually will make you want to gnaw your foot off.

Now, we could go ahead and just put all of the .class files into one giant list.  But that's not a list you want to be looking at or searching through.  It is almost always possible to separate out the various parts of a project into logical subsections, and you should do this.

At the most basic level, you can control packaging by putting a line at the top of your source file like this one:

    package com.oopuniversity.packages;

It dictates that the compiled .class file should be placed in a subdirectory path that mirrors the package name, in this case 'com/oopuniversity/packages'.  When you want to refer to a class called 'PackageDemo' in your java code, you can do it either the hard way:

    com.oopuniversity.packages.PackageDemo demo = new com.oopuniversity.packages.PackageDemo();

Or you can do it the easy way.  Add an 'import' line up above your class declaration like this:

    import com.oopuniversity.packages.PackageDemo;

This applies to any class that is not in the same package as the class on which you're working.  For classes within the same package the import statement is unnecessary.
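
If you'd like to see the two approaches side by side in something you can compile without setting up your own packages, here is a small sketch that uses 'java.util.ArrayList' from the standard library.  The class name 'ImportDemo' and its method are invented for the example.

```java
import java.util.ArrayList;

class ImportDemo {
    static int countNames() {
        // The hard way: spell out the full package name every time.
        java.util.ArrayList<String> hardWay = new java.util.ArrayList<String>();
        hardWay.add("Joe");

        // The easy way: ArrayList was imported at the top of the file.
        ArrayList<String> easyWay = new ArrayList<String>();
        easyWay.add("Blow");

        // Both variables are exactly the same type; only the spelling differs.
        return hardWay.size() + easyWay.size();
    }

    public static void main(String[] args) {
        System.out.println("Names stored: " + countNames());
    }
}
```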

IDEs make this whole process really easy.  You tell them to create a new package, and they do so, creating all of the necessary directories for both java and class files.  You try to create an object and they can automatically figure out that you need an import statement and add it for you.  When they build the code, it's placed in the correct location.

At first this may seem like a needless complication, but it pays off as soon as you're doing any work that is not extremely basic in nature.

The rules for package names are simple.  While it's not hard and fast, stick to lower case letters.  Separate individual words with periods.  You can call them anything you like, but it's generally a good idea to go from less specific to more specific.  Most of the standard packages that come in the Java libraries start with 'java.' and most third party packages start off with the reversed domain name of the developer (thus 'com.oopuniversity') and are then followed by as many identifiers as needed for your purposes.  It is a good idea to organize this way, because if you use the same package names as someone else does you could badly confuse your ClassLoader and this leads to really irritating situations when you try to get programs to actually run.  In short, 'util' is a pretty bad package name, but 'org.mycompany.util' is better.  'org.mycompany.secretproject.util' is even better yet.

In summary:

Organize your code into packages.
Specify the package name at the top of your source file.
Import the classes you need by specifying their packages in import statements.

There are specific packaging rules I like to follow when building systems, related to how the code itself is structured.  I'll talk about those rules when it's appropriate.  Those are going to be more on the order of recommendations, but it won't hurt you (much) to go along with them.

Friday, December 5, 2014

Encapsulation - Protecting You From Yourself

Encapsulation sounds like the central plot point of some weird science fiction movie, but actually it's a fundamental technique for improving the reliability and predictability of object oriented programs. Let's pull out our old friend PersonData and see what's ailing him today: 

public class PersonData { 
    public String givenName; 
    public String surName; 
   
    public PersonData() { } 

    public PersonData(String given, String sur) { 
        givenName = given; 
        surName = sur; 
    } 

    public String toString() { 
        return givenName + " " + surName; 
    }
}

Let me start by saying that there is nothing fundamentally wrong with this class. It is properly formed, is created with a valid constructor, and has a fully functional toString override method that formats the name in a reasonable manner. Is it object oriented? Well, it's an object at any rate, but we can do better.

 Here's the key issue I'd like to address today. Suppose we had a method that we wanted to use to ensure that a person's name begins with a capital letter. The logic for this is simple enough, but I'm going to deliberately break it for this example, so don't use this method as-is:

public static String ensureFirstCharacterIsUpperCase(String input) { 
    String upperCaseString = null; 
    if (Character.isUpperCase(input.charAt(0))) { 
        return input; 
    } 
    upperCaseString = input.substring(0,1).toUpperCase() + input.substring(1); 
    return input; //Bad programmer! No Pizza!
}

 Ignoring for the moment the fact that this method will not quite do what we actually want, how would one go about using it? Well, we could go ahead and write something like this:

PersonData joeBlow = new PersonData();
joeBlow.givenName=ensureFirstCharacterIsUpperCase("joe");
joeBlow.surName=ensureFirstCharacterIsUpperCase("blow");
System.out.println(joeBlow);

That would work just fine, the first, second, and eight hundredth time you do it. But why would you want to? What if you decided that you wanted to capitalize all surnames? What if you wanted to plug in some library that understands how to capitalize special names like MacNeil or something? Do you really want to search out every instance of setting the last name and change it?

There is a better way, my friend, and it gets to one of the other key aspects of object oriented programming. Encapsulation is basically protecting your data from your own programs by making it inaccessible except through a very clearly defined path that you strictly control.

We start by making the fields surName and givenName private. This indicates that they cannot be changed by code in any class except PersonData.

private String givenName; 
private String surName; 

Of course, we aren't quite finished at this point, because we can't change or even see these values. That could make for a remarkably useless class if we did not find a way around it. The way around it is by creating 'accessor' and 'mutator' methods. Those are horrible names so we usually just call them 'getters' and 'setters'. These labels actually make a lot more sense, because the naming convention is to prefix the field name with 'get' and 'set'. 

The required methods for PersonData could look like this:

public void setGivenName(String newGivenName) {
    givenName = newGivenName;
}

public String getGivenName() {
    return givenName;
}

public void setSurName(String newSurName) {
    surName = newSurName;
}

public String getSurName() {
    return surName;
}

 With the fields marked private, and with our getters and setters in place, we would now write the code above more like this:

 PersonData joeBlow = new PersonData(); 
 joeBlow.setGivenName (ensureFirstCharacterIsUpperCase("joe")); 
 joeBlow.setSurName (ensureFirstCharacterIsUpperCase("blow")); 
 System.out.println(joeBlow); 

 But wait, there's more!

 Why should we take a chance that we (or some other developer) fail to use that method to make sure our first character is correct? We can do better than that, and make sure it always happens no matter what kind of night someone had. For now, we'll move 'ensureFirstCharacterIsUpperCase' into PersonData (we'll talk about a better way later) and we'll change our setters a bit:

public void setGivenName(String newGivenName) {
    givenName = ensureFirstCharacterIsUpperCase(newGivenName);
}

public void setSurName(String newSurName) {
    surName = ensureFirstCharacterIsUpperCase(newSurName);
}

 With this done, we can now do this instead:

 PersonData joeBlow = new PersonData(); 
 joeBlow.setGivenName ("joe"); 
 joeBlow.setSurName ("blow"); 
 System.out.println(joeBlow); 

 And still get everything looking the way we expect. We've minimized the amount of code we need to write, and we've encapsulated both field access and a bit of business logic in our PersonData class. Malicious or misinformed programmers will not be able to bypass our hard and fast rules. IDEs won't even let you see the fields in their helpful pop-up dialogs. Dogs will love us.
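
Pulling the pieces together, the encapsulated class might look something like the sketch below.  This is my own assembly of the fragments from this post, not the promised gist, and it assumes a repaired version of the helper that returns the capitalized string.  The helper still doesn't cope with null or empty input.

```java
class PersonData {
    // Private: only code inside PersonData can touch these directly.
    private String givenName;
    private String surName;

    public PersonData() { }

    public void setGivenName(String newGivenName) {
        givenName = ensureFirstCharacterIsUpperCase(newGivenName);
    }

    public String getGivenName() {
        return givenName;
    }

    public void setSurName(String newSurName) {
        surName = ensureFirstCharacterIsUpperCase(newSurName);
    }

    public String getSurName() {
        return surName;
    }

    public String toString() {
        return givenName + " " + surName;
    }

    // Repaired helper: returns the capitalized string rather than the original input.
    // Assumes input is non-null and has at least one character.
    private static String ensureFirstCharacterIsUpperCase(String input) {
        if (Character.isUpperCase(input.charAt(0))) {
            return input;
        }
        return input.substring(0, 1).toUpperCase() + input.substring(1);
    }

    public static void main(String[] args) {
        PersonData joeBlow = new PersonData();
        joeBlow.setGivenName("joe");
        joeBlow.setSurName("blow");
        System.out.println(joeBlow);
    }
}
```

If the capitalization rule ever changes, say to handle names like MacNeil, the helper is the only place that has to know about it.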

Of course, we can (and really should) do more than this. First off, we have a constructor that accepts the two names, and we need to make sure it's also performing this task. That's probably most easily done by also having it call the setters. Second, we can think about other business rules we can enforce, like "A user's first name cannot be null or zero length". The setter can check for this and refuse to accept a bad value. Of course, if it does so it also needs to complain to the caller that they've done it wrong. This would be done via an Exception, which we'll need to talk about soon.
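
Here is a rough sketch of both ideas: a constructor that goes through the setters, and a setter that refuses a bad value.  The class name 'CheckedPersonData' is made up to keep it separate from the real PersonData, and throwing the standard 'IllegalArgumentException' is my assumption about how the complaint would be raised.  Only the given name is checked here; the surname would follow the same pattern.

```java
class CheckedPersonData {
    private String givenName;
    private String surName;

    public CheckedPersonData(String given, String sur) {
        // Route through the setters so the constructor enforces the same rules.
        setGivenName(given);
        setSurName(sur);
    }

    public void setGivenName(String newGivenName) {
        // Business rule from the text: the first name cannot be null or zero length.
        if (newGivenName == null || newGivenName.length() == 0) {
            throw new IllegalArgumentException("Given name cannot be null or zero length");
        }
        givenName = newGivenName;
    }

    public String getGivenName() {
        return givenName;
    }

    public void setSurName(String newSurName) {
        surName = newSurName;
    }

    public String getSurName() {
        return surName;
    }

    public static void main(String[] args) {
        try {
            new CheckedPersonData("", "Blow");
        }
        catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```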

Get used to writing your code this way. Protect your data from yourself. Use accessor and mutator methods regularly and uniformly. It's a small amount of extra work, but you will thank yourself later. It's actually a VERY small amount of extra work, given that any decent IDE will create the methods for you if you just ask, and it will certainly save you time and energy later on.

I'll update this post later with a link to a gist containing the fully updated PersonData Object.

Wednesday, December 3, 2014

Diamonds are not an Object's Best Friend

Once again, I turn to a post I made on reddit for inspiration.

Some people wonder why inheritance trees branch but never join.  Why not tie two different families of objects together and extend both of 'em?

I can see the point of that, one could conceivably build objects that join several disparate branches of inheritance into large and versatile classes.  Conceivably.  One could also get oneself into a world of trouble, though, and this is most easily explained via the so-called diamond inheritance problem.

Please note that you cannot do this and expect your code to compile, it's just a conceptual demonstration.
Diamond.java
public class Diamond extends Class1, Class2 {
}

Class1.java
public class Class1 {
    public void doSomething() {
         System.out.println("Class1.doSomething()");
    }
}

Class2.java
public class Class2 {
    public void doSomething() {
         System.out.println("Class2.doSomething()");
    }
}
Now you have a main that uses the above classes:
public static void main(String[] args) {
    Diamond diamond = new Diamond();
    diamond.doSomething();
}
So, when you run main, what happens? This is the essence of the diamond problem. You wind up having two lines of inheritance converging on one and run into a quite serious issue.

This basically just scratches the surface of the problem, of course.  We can also think about what happens when you try to invoke something in super.  The question would have to become 'which super'?  There are many potential negative ramifications from the interactions that could be created by allowing this kind of inheritance, and the creators of Java decided that the best way to avoid them was by preventing the issue altogether.
One can achieve most of the positive effects of multiple inheritance via Interfaces and a few helpful design patterns such as Adapter, Composite and Decorator.
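
As a small sketch of the interface route, here is the Diamond example reworked so that it does compile.  The interface names 'Action1' and 'Action2' are invented stand-ins for Class1 and Class2, and the method returns a String rather than printing so the result is easy to check.

```java
// Each interface only promises that the method exists. Neither supplies a body.
interface Action1 {
    String doSomething();
}

interface Action2 {
    String doSomething();
}

// Diamond satisfies both promises with a single implementation of its own,
// so there is never any question about which inherited version to run.
class Diamond implements Action1, Action2 {
    public String doSomething() {
        return "Diamond.doSomething()";
    }
}

class DiamondDemo {
    public static void main(String[] args) {
        Action1 first = new Diamond();
        Action2 second = new Diamond();
        System.out.println(first.doSomething());
        System.out.println(second.doSomething());
    }
}
```

A Diamond can be handed to any code that expects either interface, which covers much of what people want from multiple inheritance in the first place.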

Inheritance Part 1 - That Class is Super!

I would like to discuss the concept of inheritance, and no, I don't mean the vase that Aunt Gladys left you in her will.

When we talk about inheritance in object oriented programming languages, we are talking about how one class can be based on another class.  In Java, we use the keyword extends to do this.

However, in order to make this as simple as possible, let's ignore even that keyword for a moment.  EVERY single Java class automatically extends the class Object without any effort at all on your part.

Let's look at an example Person data object, which I will continue to use as an example for future posts.

public class Person {
    private String firstName;
    private String lastName;

    public Person(String newFirstName, String newLastName) {
        firstName = newFirstName;
        lastName = newLastName;
    }
    ...
}

As I mentioned above, Person automatically extends Object, so in reality the class definition is really more like this:
    public class Person extends Object {

But we just don't bother writing it.  We call Object the superclass of Person and Person is called a subclass of Object.

As far as what this brings to the table, let's use what is perhaps simultaneously one of the most common features and one of the most common issues people have with objects when they are starting out.  Printing them out.

If I create a new Person object and want to see what it looks like, I might be tempted to do this:

    Person joe = new Person("Joe", "Blow");
    System.out.println(joe);

If I do this, I will be sorely disappointed when I run my program, because while I might expect to see this:

    Joe Blow

I will instead see something like this:

    Person@4f1d0d

At this point, I begin to tremble and sweat.  In a panic I turn to stackoverflow.com or /r/javahelp and complain that my object is broken and my printouts are garbled.  The regulars there probably make fun of me for not actually posting my code correctly or something.  My dog bites me and my girlfriend refuses to go out with me.  OK, I may have exaggerated a bit.

But what's happening here is that when you call 'System.out.println(joe)' you are (again, invisibly) actually in effect calling 'System.out.println(joe.toString())'.  After all, you need to turn your object into something that can be displayed on the console.  

You might at this point be saying to yourself "But wait, I don't have a 'toString' in Person" and in a sense you're right.  You certainly didn't create such a thing.  But you did in fact inherit a toString method from Object.  Object is pretty ignorant about your code, though, and has no idea how you intend to format your printouts, so it just tells you the type of object followed by an '@' and the object's hash code in hexadecimal (which may or may not have anything to do with where in your JVM's memory that object is stored).  In general, if a superclass contains a method, a subclass can call it as though it were a part of the subclass itself.
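
If you're curious where that odd-looking text comes from, the documentation for Object's toString describes it as the class name, an '@', and the hash code in hexadecimal.  The sketch below rebuilds it by hand to show there's no magic involved.  'PlainThing' and 'DefaultToStringDemo' are names I made up for the demonstration.

```java
class PlainThing {
    // No toString here, so the one inherited from Object gets used.
}

class DefaultToStringDemo {
    // Rebuilds by hand what Object.toString() is documented to return.
    static String rebuild(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        PlainThing thing = new PlainThing();
        System.out.println(thing);          // println calls the inherited toString
        System.out.println(rebuild(thing)); // the same text, assembled manually
    }
}
```

The part after the '@' will usually differ from one run to the next, so don't read too much into it.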

This is all inheritance really is, in the end.  If we try to call a method or refer to a variable in an object, and it isn't mentioned in that object, the compiler tries to see if it's available via inheritance.  If so, you avoid compile errors and things 'work'.  They may not work the way you want, but at least something interesting happens.
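
The same lookup works for superclasses you write yourself, not just Object.  In this little sketch (the 'Animal' and 'Dog' classes are hypothetical), Dog declares nothing at all and still answers to 'speak'.

```java
class Animal {
    public String speak() {
        return "Some generic animal noise";
    }
}

// Dog has no methods of its own. Everything it can do comes from Animal
// (and, further up the line, from Object).
class Dog extends Animal {
}

class InheritDemo {
    public static void main(String[] args) {
        Dog rex = new Dog();
        // speak() isn't mentioned in Dog, so the compiler finds it in Animal.
        System.out.println(rex.speak());
    }
}
```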

Let's fix the problem now.  I could take the cheap and easy way out like this:

    System.out.println(joe.firstName + " " + joe.lastName);

And that would work, provided the code lives somewhere that can see those private fields (inside Person itself, say).  But honestly, all we really need to do is create our own 'toString' method in Person that does things the way we want.  This will save us the effort of having to write the above code everywhere that wants to just display someone's name.  You'll really appreciate that when you decide that you'd rather display that guy as "Blow, Joe" instead and have to change code in 23 different places.

Add the following block of code to Person:

public String toString() {
    return firstName + " " + lastName;
}

And just like that, the original program will print the name out the way you wanted it to in the first place.

There, you've done it.  You've made use of an inherited method, and then you created an override for it.  An override is nothing more or less than specifying that instead of using the method found somewhere up the line of inheritance, you want to use this specific method instead.
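
Here is the whole thing in one place, as a sketch, with two small liberties taken.  I've added the optional '@Override' annotation, which asks the compiler to confirm that the method really does replace something inherited, and I've switched the format to "Blow, Joe" to show that the change happens in exactly one spot.

```java
class Person {
    private String firstName;
    private String lastName;

    public Person(String newFirstName, String newLastName) {
        firstName = newFirstName;
        lastName = newLastName;
    }

    // Replaces the toString inherited from Object.
    // If the name or signature were misspelled, @Override would turn that into a compile error.
    @Override
    public String toString() {
        return lastName + ", " + firstName;
    }

    public static void main(String[] args) {
        Person joe = new Person("Joe", "Blow");
        System.out.println(joe); // println picks up the override automatically
    }
}
```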

When it comes down to it, overriding toString is just the tip of the iceberg, and I will of course be revisiting this later on.  There are some extremely interesting things that you can do with inheritance, and it forms the basis for many important language structures.  For now though, just remember that one class can extend another one, and that when it does so it has the option of replacing methods in its superclass with its own more specific version.  There are a few other rules you'll need to follow, but most of them are not too horrible to deal with.