http://www.developer.com/

Java Needs to Get a Pair (and a Triple...)


April 1, 2009

Options for multiple return parameters in Java are somewhat limited. A method may only return one object, array or primitive, and unlike many other languages, Java does not offer an easy facility to use out parameters in method calls. In effect, your options are to return an array of Objects, return a Collection, create a class just for the return parameters, or pass in objects that you intend to alter. All of these have their drawbacks:

Using an Array of Objects

If you are lucky enough to have a homogeneous set of return parameters, then an array of Objects is a fine option with the exception that you have to remember which parameter is which when you unpack them. If, on the other hand, you are returning multiple different types of parameters, you will need to use an Array of a superclass of all of the objects—most likely Object itself. You will then need to cast each parameter as you unpack it. You have lost type safety and raised the chance of getting the order of the return parameters wrong as well.
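As a rough illustration of the casting problem, here is a minimal sketch. The class and method names (ArrayReturnSketch, lookupNameAndDob) are invented for this example and are not part of any real API:

```java
import java.util.Date;

class ArrayReturnSketch {

    // Hypothetical lookup that packs a name and a date of birth into an Object[].
    static Object[] lookupNameAndDob() {
        return new Object[] {"Fred Jones", new Date(0L)};
    }

    public static void main(String[] args) {
        Object[] result = lookupNameAndDob();

        // The caller must remember the order and cast each element.
        String name = (String) result[0];
        Date dob = (Date) result[1];
        System.out.println(name + " / " + dob.getTime());

        // Swapping the indexes still compiles; the error appears only at run time.
        try {
            Date wrong = (Date) result[0];
            System.out.println(wrong);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: wrong position");
        }
    }
}
```

The compiler cannot help with either the order or the types here, which is exactly the loss of type safety described above.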

Using a Collection

Similar to using an Array, you could create a collection to return. The main reason to use an Array over a Collection is the amount of code necessary to create a collection, which is much higher than using array initializers:

return new Object[] {string1, num2, object3};

is a lot shorter than

List<Object> retVal = new ArrayList<Object>();
retVal.add(string1);
retVal.add(num2);
retVal.add(object3);
return retVal;

There are no real advantages to using a collection over an array unless you decide to use a map in order to organize return values by a name or other key.
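The map variant might look something like the following sketch. The key names "name" and "dob" and the method lookupNameAndDob are assumptions made up for illustration:

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

class MapReturnSketch {

    // Hypothetical lookup that labels each return value with a key.
    static Map<String, Object> lookupNameAndDob() {
        Map<String, Object> retVal = new HashMap<String, Object>();
        retVal.put("name", "Fred Jones");
        retVal.put("dob", new Date(0L));
        return retVal;
    }

    public static void main(String[] args) {
        Map<String, Object> result = lookupNameAndDob();

        // Order no longer matters, but each value still needs a cast.
        String name = (String) result.get("name");
        Date dob = (Date) result.get("dob");
        System.out.println(name + " / " + dob.getTime());

        // A misspelled key compiles and quietly yields null.
        System.out.println(result.get("nmae"));
    }
}
```

Named keys remove the ordering problem, but the casts remain and a typo in a key is only discovered at run time.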

When Java was first created, its simplicity was a reversal of the increasing complexity (and, let's be fair, flexibility) of C++. Pointers and memory management were simplified, including elimination of parameter indirection, const, function pointers and other powerful but often confusing features. In C++ you can pass parameters by value (like in Java) or by reference, allowing the reference to be reassigned in the method and providing you with out parameters, or a way of returning more values from a method than the one return value allowed by the syntax. The signatures that can end up with char** and & dereferences throughout the code might be unsightly to some, but they are quite useful.

Using JavaBeans

C++ also supports structs, which allow lightweight structured data packaging. Of course, Java classes can do double duty as structs easily enough, but often the conventions (like JavaBeans) end up making the source larger with a lot of boilerplate.

Another problem with using classes and the JavaBeans conventions is that the objects are inherently mutable in nature. Given that these objects may end up being shared between the caller of the method and the class of the method called, this can lead to shared mutable state, which can be bad news in a multi-threaded system.

In Java, what you are left with is method parameters that are passed by value and cannot act as out parameters, and methods that can return only one value. This certainly works (look at any significant codebase and you will see plenty of examples), but it is not particularly efficient in developer time.
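To make the pass-by-value point concrete, here is a small sketch (the class and method names are invented for illustration). Reassigning a parameter has no effect on the caller, so the only way to get a value "out" through a parameter is to mutate an object the caller already holds:

```java
class OutParamSketch {

    // Reassigning the parameter only changes this method's copy of the reference.
    static void reassign(StringBuilder out) {
        out = new StringBuilder("replaced");
    }

    // Mutating the object the parameter refers to is visible to the caller.
    static void mutate(StringBuilder out) {
        out.append("Fred Jones");
    }

    public static void main(String[] args) {
        StringBuilder name = new StringBuilder();

        reassign(name);
        System.out.println("[" + name + "]"); // prints "[]"

        mutate(name);
        System.out.println("[" + name + "]"); // prints "[Fred Jones]"
    }
}
```

This is the "pass in objects which you intend to alter" option from the introduction. It works, but it relies on mutable arguments and hides the fact that the method is producing output.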

Improving the Java Beans Way

So what can be done? Well, the Java classes option is really the only typesafe solution to be had, and by improving style, these classes can be a better substitute for structs, and with a few advantages of their own.

Let's take a return class with two arguments - say a name and a date of birth:

public class PersonNameDOB {
    private String name;
    private Date dob;

    public Date getDob() {
        return dob;
    }

    public void setDob(Date dob) {
        this.dob = dob;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}

Obviously this is a manufactured example, and the chances are you probably have a Person class already defined that you could maybe return instead. I would also bet that you have examples of your own where you want to return two different objects from a method but don't have a class already defined for them, or maybe you end up returning a class with far more information than necessary in order to just get a couple of items out of it. Depending on the circumstance that might be even worse. For example, what if the person calling your method starts to use or even modify values in that returned object when you had no intention of letting that happen?

The above is more code than is needed anyway. This is meant to be a lightweight way of returning some values from a method, so let's make some simple changes:

public class PersonNameDOB {
    public final String name;
    public final Date dob;

    public PersonNameDOB(String name, Date dob) {
        this.name = name;
        this.dob = dob;
    }
}

The result is shorter and more fit for the task. Values are being returned, so there is no need for setters; just set up the values when the return object is created. They don't need to change, and since they are set in the constructor, they can be made final. Now that they are final, there is no risk in making the class attributes themselves public, since they cannot be reassigned, so you can get rid of the getters as well as the setters. The result is shorter and easier to use:

PersonNameDOB personNameDOB = SSNLookup.lookupBySSN("123-45-6789");
System.out.println(personNameDOB.name);
System.out.println(personNameDOB.dob);

And the lookupBySSN method:

public static PersonNameDOB lookupBySSN(String ssn) {
    // ... Find the person record in the DB, etc. ...

    return new PersonNameDOB(person.getName(), person.getDOB());
}

If this seems totally obvious then great, just bear with me as I take things a bit further.

I like this approach to lightweight return objects. It is typesafe, so there is no need to cast objects out of arrays after return. Even better, the final modifier on the attributes means that these return objects cannot be abused - they are simply for transfer of data.

Taking that safety a step further, I recommend that you take copies of objects or use immutable objects where possible, since doing otherwise risks unexpected modification of the values in your donor object by a calling method. In our example, String is immutable, but the Date should be copied:

public static PersonNameDOB lookupBySSN(String ssn) {
    // ... Find the person record in the DB, etc. ...

    return new PersonNameDOB(person.getName(), new Date(person.getDOB().getTime()));
}

This will prevent a caller doing the following:

PersonNameDOB personNameDOB = SSNLookup.lookupBySSN("123-45-6789");
personNameDOB.dob.setTime(0);

from changing the original dob value as a side effect, which is otherwise a huge risk. Immutable values rock, and if you can't use those, then be sure to take copies and return those copies in your results instead.
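The difference between sharing and copying can be seen in a small self-contained sketch. The helper methods withSharedDate and withCopiedDate are hypothetical stand-ins for the two versions of lookupBySSN, and the nested PersonNameDOB simply mirrors the class shown earlier:

```java
import java.util.Date;

class DefensiveCopySketch {

    // Stand-in for the PersonNameDOB return class shown earlier.
    static final class PersonNameDOB {
        public final String name;
        public final Date dob;

        PersonNameDOB(String name, Date dob) {
            this.name = name;
            this.dob = dob;
        }
    }

    // Hands the stored Date straight to the caller (risky).
    static PersonNameDOB withSharedDate(String name, Date storedDob) {
        return new PersonNameDOB(name, storedDob);
    }

    // Hands the caller a private copy of the stored Date (safe).
    static PersonNameDOB withCopiedDate(String name, Date storedDob) {
        return new PersonNameDOB(name, new Date(storedDob.getTime()));
    }

    public static void main(String[] args) {
        Date storedDob = new Date(86400000L);

        // Changing the copy leaves the stored value alone.
        withCopiedDate("Fred Jones", storedDob).dob.setTime(0);
        System.out.println(storedDob.getTime()); // prints 86400000

        // Changing the shared instance silently alters the stored value.
        withSharedDate("Fred Jones", storedDob).dob.setTime(0);
        System.out.println(storedDob.getTime()); // prints 0
    }
}
```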

The Need for a Pair

The pattern above is one I now use a lot as a struct replacement in Java API calls, but creating these classes is still overhead if all you want to do is return two typed objects, something that is, in my experience, really common (many finder algorithms can be made much more efficient by simply returning a pair of related objects instead of one, for example a key/value pair that will be added to a map).

Something that seems to be low hanging fruit for such a situation, but which is still mysteriously missing from the Java SE standard distribution, is a genericized Pair class. Take a look at how you could build it from the pattern above.

Firstly, the values would be more general than name and dob. The most universal would seem to be fields named first and second:

public class Pair {
    public final String first;
    public final Date second;

    public Pair(String first, Date second) {
        this.first = first;
        this.second = second;
    }
}

So far so good. You now have a general class for returning pairs of Strings and Dates, but not other types. Bring on the Generics:

public class Pair<A, B> {
    public final A first;
    public final B second;

    public Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

This is much better. There is no need to worry about wildcards for something that is just a quick way of representing a couple of return types. This class can now be used for general type pairs, for example:

public static Pair<String, Date> lookupBySSN(String ssn) {
    // find the person in the DB....
    return new Pair<String, Date>(person.getName(), new Date(person.getDOB().getTime()));
}

and to use it:

Pair<String, Date> personNameDOB = SSNLookup.lookupBySSN("123-45-6789");
System.out.println(personNameDOB.first);
System.out.println(personNameDOB.second);

You aren't finished yet with the Pair class though. Some of the things you need to consider if this thing is going to be truly universal:

  • You don't want someone extending the Pair class and changing what it does - that might break the original intent of the class.
  • The new Pair() is okay, but it is a little clumsy looking. You can do better than that.
  • Pair is pretty useful just for returning values, but with a little effort it could, for example, be used as the key in a map.
  • It would be nice to have a pretty form of the Pair string representation for debugging or other toString() usage.
  • Finally, perhaps (and this is debatable), it would be nice to allow the Pair and contained objects to be serializable assuming the contents are serializable too.

So, let's see how that changes the pair implementation up:

public final class Pair<A,B> implements Serializable {

    private static final long serialVersionUID = 1L;  // shouldn't 
                                                      // need to change

    public final A first;
    public final B second;

    private Pair (A first, B second) {
        this.first = first;
        this.second = second;
    }

    public static <A,B> Pair<A,B> of (A first, B second) {
       return new Pair<A,B>(first,second);
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        final Pair other = (Pair) obj;
        if (this.first != other.first && 
                (this.first == null || !this.first.equals(other.first))) {
            return false;
        }
        if (this.second != other.second && 
                (this.second == null || !this.second.equals(other.second))) {
            return false;
        }
        return true;
    }

    @Override
    public int hashCode() {
        int hash = 7;
        hash = 37 * hash + (this.first != null ? 
                              this.first.hashCode() : 0);
        hash = 37 * hash + (this.second != null ? this.second.hashCode() : 0);
        return hash;
    }

    @Override
    public String toString () {
        return String.format("Pair[%s,%s]", first,second);
    }
}

You have made the constructor private, but provided a static of() method, which I think reads more nicely:

return Pair.of(person.getName(), new Date(person.getDOB().getTime()));

You have made the Pair class final, so no one will override it and change the original intent of the class. It might seem strict, but with something that is intended for wide usage, it is sensible to be quite defensive about things like this. If someone wants Pair to work differently, they should write their own implementation, and justify it to their peers.

The equals() and hashCode() methods mean that this class can be used for more purposes than just returning values. It can be used, for example, as the key to a map. A recommendation here, for any kind of return object using this pattern, is to let your IDE create the equals and hashCode methods for you. These are simple implementations, but even here there are some things to remember about the contract with hashCode and equals, for example null checks and different types (see the excellent description in Josh Bloch's Effective Java 2nd ed. for the definitive writeup on the subject). IDEs tend to have boilerplate insertions with the best practices already defined, so I just had NetBeans create these for me. Perhaps they could be streamlined a little, but these implementations are safe. I did remove the generic signature from the NetBeans equals() implementation though, as it is not necessary and could create confusion.
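A short sketch of the map-key use follows. The nested Pair here is a condensed stand-in for the class above, not the NetBeans-generated code: it assumes Java 7 or later so that java.util.Objects can shorten equals() and hashCode(), and the "row"/seat data is invented for the example:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class PairKeySketch {

    // Condensed stand-in for the article's Pair (assumes Java 7+ for Objects).
    static final class Pair<A, B> {
        public final A first;
        public final B second;

        private Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        static <A, B> Pair<A, B> of(A first, B second) {
            return new Pair<A, B>(first, second);
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            Pair<?, ?> other = (Pair<?, ?>) obj;
            return Objects.equals(first, other.first)
                    && Objects.equals(second, other.second);
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    public static void main(String[] args) {
        Map<Pair<String, Integer>, String> seats =
                new HashMap<Pair<String, Integer>, String>();
        seats.put(Pair.of("row", 3), "Fred Jones");

        // A separately built but equal Pair finds the same entry.
        System.out.println(seats.get(Pair.of("row", 3))); // prints "Fred Jones"
        System.out.println(seats.get(Pair.of("row", 4))); // prints "null"
    }
}
```

Without the equals() and hashCode() overrides, the first lookup would fall back to identity comparison and return null as well.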

The toString() override just prints a pretty form of the pair, like "Pair[Fred Jones,Sun Mar 22 12:55:44 PDT 2009]". This is particularly useful for, say, debugging popups.

The class now implements Serializable. I believe that this is the point where the choices become more dubious, but since collections are serializable, Pair should be as well in my opinion. If the classes used in the Pair are not themselves serializable, that is no worse than classes in collections not being serializable, and Pair should not be the thing that prevents serialization used in a class.
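The behavior described here can be tried with a small round trip through a byte array. This is only a sketch: the roundTrip helper is invented for the example, the nested Pair is a trimmed copy of the class above, and returning null on failure is a simplification that keeps the demo short:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class PairSerializationSketch {

    // Trimmed copy of the article's Pair, keeping only what serialization needs.
    static final class Pair<A, B> implements Serializable {
        private static final long serialVersionUID = 1L;
        public final A first;
        public final B second;

        private Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        static <A, B> Pair<A, B> of(A first, B second) {
            return new Pair<A, B>(first, second);
        }
    }

    // Writes the value to a byte array and reads it back. Returns null if
    // writing or reading fails, for example on a non-serializable field.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()));
            return in.readObject();
        } catch (IOException e) {
            return null;
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Pair<?, ?> copy = (Pair<?, ?>) roundTrip(Pair.of("Fred Jones", 42));
        System.out.println(copy.first + "," + copy.second); // prints "Fred Jones,42"

        // java.lang.Object is not Serializable, so this Pair cannot be written.
        System.out.println(roundTrip(Pair.of("lock", new Object()))); // prints "null"
    }
}
```

As with collections, the Pair itself never blocks serialization; only its contents can.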

Is It Complete?

Yes, and no! This class is in use at Navigenics, where it performs well. Doubtless there are similar implementations in other Java applications out there, which probably offer the same feature set and are just as useful. Is it complete then? From a "we can use it as it is for most foreseeable needs" perspective, yes, it is complete. From a "will it be the only pair we will ever need" perspective, possibly not.

For example, Pairs represented by this class cannot be compared. I thought about adding a compareTo() method and making it Comparable but here you start to hit the complexity that even simple general class design brings up. In very general usage, what is the correct behavior for it? It's easy to say that you compare the first value, and if they are equal, compare the second value. This is likely the most consistent behavior, but is it right for every usage of Pair? You would also need to check if the classes being compared are themselves comparable, and if not use what? The object reference comparison?

I sidestepped the issue. I know enough from writing that less is usually more, or as my thesis tutor used to say: "Omit Unnecessary Words". If someone wants their Pair usage to be comparable, they can supply a comparator to do it. After all, Lists are not Comparable in the Java SE libraries either.
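For instance, a caller who wants "first, then second" ordering could supply something like the comparator below. This is one possible sketch, not part of the Pair class: the firstThenSecond factory is a hypothetical helper, it only accepts Pairs whose two types are themselves Comparable, and it makes no attempt to handle null values:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class PairComparatorSketch {

    // Trimmed copy of the article's Pair; equals/hashCode omitted for brevity.
    static final class Pair<A, B> {
        public final A first;
        public final B second;

        private Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }

        static <A, B> Pair<A, B> of(A first, B second) {
            return new Pair<A, B>(first, second);
        }
    }

    // Orders by first and uses second only to break ties.
    static <A extends Comparable<? super A>, B extends Comparable<? super B>>
            Comparator<Pair<A, B>> firstThenSecond() {
        return new Comparator<Pair<A, B>>() {
            @Override
            public int compare(Pair<A, B> x, Pair<A, B> y) {
                int byFirst = x.first.compareTo(y.first);
                return byFirst != 0 ? byFirst : x.second.compareTo(y.second);
            }
        };
    }

    public static void main(String[] args) {
        List<Pair<String, Integer>> pairs = new ArrayList<Pair<String, Integer>>();
        pairs.add(Pair.of("b", 1));
        pairs.add(Pair.of("a", 2));
        pairs.add(Pair.of("a", 1));

        Collections.sort(pairs, PairComparatorSketch.<String, Integer>firstThenSecond());

        // Prints "a 1", "a 2", "b 1" on separate lines.
        for (Pair<String, Integer> p : pairs) {
            System.out.println(p.first + " " + p.second);
        }
    }
}
```

Keeping the ordering outside the class leaves each caller free to choose a different rule without touching Pair.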

Pair also isn't Cloneable, and this would be easy enough to add. Perhaps it should be, but it's also really easy to clone a Pair (just create a new Pair of the first and second from another Pair). It seems unnecessary.

So, yes I would claim that this is complete enough. Simple wins every time in my opinion.

What About Triple?

The bigger question here is where do you stop.

If Pair is twice as good as a single return type, surely Triple is three times as good, right?

In truth I believe there is a reasonable case for a triple implementation (I have one in my codebase) since I can see use cases where I want to return three objects instead of two. Some libraries implement a Quadruple as well, but I would argue (strictly from my own opinion) that four or more returns is less common, and when you get to returning above three objects, perhaps it is time to implement a custom lightweight transfer object anyway, if only to remember which object is in what position (e.g. was name the third or fourth object now?).

Other people I have spoken to claim that Pair is all you need, since you can make a triple like this:

Pair<String, Pair<Integer, Date>>

But this leads to uses like the following:

Pair<String, Pair<Integer, Date>> triple = methodReturningTriple();
System.out.println(triple.first + ", " + triple.second.first + ", " + triple.second.second);

Now I don't know about you, but I have some style issues with references like triple.second.second from a readability perspective compared with a simple triple.third. Not to mention the definition Triple<String, Integer, Date> is a lot cleaner than Pair<String, Pair<Integer, Date>>.

The implementation of Triple is simple enough - take the Pair class and add a third attribute with a generic class C. Adjust the equals, hashCode, of and toString methods plus the constructor accordingly, and you are done.
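Following those steps, a Triple might look roughly like the sketch below. It is not the version from my codebase: it is nested inside a demo class only to keep the example in one file, and it assumes Java 7 or later so that java.util.Objects can stand in for the longer generated equals() and hashCode() bodies:

```java
import java.io.Serializable;
import java.util.Objects;

class TripleSketch {

    // Pair with a third attribute of generic type C added.
    static final class Triple<A, B, C> implements Serializable {
        private static final long serialVersionUID = 1L;

        public final A first;
        public final B second;
        public final C third;

        private Triple(A first, B second, C third) {
            this.first = first;
            this.second = second;
            this.third = third;
        }

        public static <A, B, C> Triple<A, B, C> of(A first, B second, C third) {
            return new Triple<A, B, C>(first, second, third);
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == null || getClass() != obj.getClass()) {
                return false;
            }
            Triple<?, ?, ?> other = (Triple<?, ?, ?>) obj;
            return Objects.equals(first, other.first)
                    && Objects.equals(second, other.second)
                    && Objects.equals(third, other.third);
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second, third);
        }

        @Override
        public String toString() {
            return String.format("Triple[%s,%s,%s]", first, second, third);
        }
    }

    public static void main(String[] args) {
        Triple<String, Integer, String> t = Triple.of("Fred Jones", 42, "PDT");
        System.out.println(t.third); // prints "PDT"
        System.out.println(t);       // prints "Triple[Fred Jones,42,PDT]"
    }
}
```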

Conclusion

Creating a Pair class for Java is hardly in the realm of rocket science, but even here, in the implementation of a simple generalized library class, one which requires no language changes to Java and one with a very small feature list, some questions still exist. Generality is troublesome. Imagine, then, the questions raised by the "small" language changes to be added to Java 7 with Project Coin. I do not envy the work ahead for the language designers and would-be language designers in implementing these changes without (hopefully) disrupting anything important in the general uses of the language.

Still, I believe the Pair class described here is at least a good starting point for a standardized Pair implementation in the SE libraries. I hope this addition will be in the Java 7 release. Even though it is really a poor substitute for the Tuples available in other languages, the inclusion of a standard Pair (and maybe Triple) will hopefully provide a great deal of utility, an improvement in readability and removal of a lot of boilerplate code, all of which are aims for the Project Coin small language changes, and all this without needing to change the language at all.

Finally, if you dispute the need for a standardized Pair class, just take a look at the multitude of implementations already out there in the open source world (and probably many times more in proprietary code bases).

I would also like to say thanks in particular to Tomas Zezula, a NetBeans engineer whose Pair implementation in the open source NetBeans codebase provided valuable considerations in the construction of this article. Interestingly, Pair appears (as far as I can tell) in 11 different locations in the NetBeans codebase alone. Yes, I would say one is needed in the Java SE libraries...

About the Author

Dick Wall is a Software Engineer for Navigenics, Inc., a Bay Area startup specializing in genetic personalized medicine services at http://www.navigenics.com, and is also co-host of the Java Posse podcast, a developer-centric podcast all about the news, happenings and technology in the Java development world.
