Extending the Java 2 Collections Framework


Sun introduced a much-improved framework for working with collections of objects in Java 2. It was pretty clear that the Vector and Hashtable workhorse collections in JDK 1.1 and earlier were not well-suited to all tasks. The main thrust of the framework, however, is not to provide a better set of concrete collection class implementations — which it does — but instead to provide a standard interface for collections.

Based on my Smalltalk background, I wanted to improve the Collections Framework in order to simplify the process of iterating across a collection. Key considerations for me included:

  • minimal effort
  • reuse of existing code
  • easy migration of my existing application code.

This article demonstrates a simple extension to the Java 2 Collections Framework.


The Trouble with Iteration


Iterating over a collection in Java is an external iteration: the collection class itself is not responsible for stepping through the objects it contains. There are benefits to external iteration, one being that you can easily replace an iterator with a better-performing solution. Another is that you can have multiple threads each perform separate iterations against the same collection simultaneously.
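To make the second benefit concrete, here is a minimal sketch in which two iterators walk the same list, each keeping its own position. The class name IndependentIterators and the method lastAndFirst are made up for illustration. The sketch is single-threaded to stay short, and it uses raw, pre-generics collection types to match the rest of this article, so a modern compiler will report unchecked warnings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IndependentIterators
{
    // Advances one iterator to the last element while a second iterator,
    // created on the same list, is still at the start.
    public static String lastAndFirst()
    {
        List list = new ArrayList();
        list.add("a");
        list.add("b");
        list.add("c");

        Iterator first = list.iterator();
        Iterator second = list.iterator();

        first.next(); // "a"
        first.next(); // "b"

        // first is now positioned at "c"; second has not moved yet.
        return (String)first.next() + (String)second.next();
    }

    public static void main(String[] args)
    {
        System.out.println(lastAndFirst()); // prints "ca"
    }
}
```

Because the position lives in the iterator rather than in the collection, neither traversal disturbs the other.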

There are also negatives to external iteration, the chief one being that all collection clients must manage the iteration process themselves. This means you have to code the same lines over and over. For example:


Iterator iterator = list.iterator();
while (iterator.hasNext())
    process((String)iterator.next());

After coding iterations like this a few thousand times, it starts to become tedious. Tasks that are tedious are soon taken for granted, ignored, and then when they are not done properly, they are last suspected as the source of a problem.

In Smalltalk, the collection classes support internal iteration by default. In the following line of Smalltalk:


collection do: [:each | each doSomething]

the message doSomething is sent in turn to each object in collection. An analogous line of code in Java, if Java supported method pointers, might be:


collection.do(something(each));

Smalltalk uses the concept of a block of code. Blocks, or block closures, are essentially what I like to call deferred code: code contained within a block is not executed until something sends the value message to it. The benefit is that you can treat code just as if it were an object, passing it around from place to place and executing it when necessary.

Java does not directly support block closures. Anonymous inner classes get us pretty close, however, to supporting the concept of code objects. The example below uses anonymous inner classes to best emulate the block concept.
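As a rough illustration of deferred code, the sketch below wraps a statement in an anonymous inner class and runs it only when asked. It uses the standard java.lang.Runnable interface rather than the Block interface developed later in this article. The names DeferredCode and defer are hypothetical.

```java
public class DeferredCode
{
    // Returns a code object; nothing is appended to log until run() is called.
    public static Runnable defer(final StringBuffer log)
    {
        return new Runnable()
        {
            public void run()
            {
                log.append("ran");
            }
        };
    }

    public static void main(String[] args)
    {
        StringBuffer log = new StringBuffer();
        Runnable deferred = defer(log);

        System.out.println("before: [" + log + "]"); // prints "before: []"
        deferred.run();
        System.out.println("after: [" + log + "]");  // prints "after: [ran]"
    }
}
```

The object returned by defer can be stored or passed around like any other object, which is the property the Smalltalk block provides. The local variable must be declared final for the inner class to use it.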

The intent of this article is not to show how much Java can be made to look like Smalltalk, but to discuss what issues there are with extending the collections framework.


The Collections Framework


Sun’s Java 2 Collections Framework is designed to allow you to easily create your own concrete collection classes. There is a short example at java.sun.com that shows you how to create a custom implementation. However, in the following example, I wasn’t interested in creating a new implementation for an interface. Instead, I just wanted to add a small piece of functionality. The downside of the Sun technique is that you end up having to provide code that duplicates functionality already in the collection superclass. The example below demonstrates an alternative technique, one that minimizes code duplication.


Building Blocks


Before we can use an anonymous inner class to support our internal iterators, we must first get Java to emulate Smalltalk blocks. Our initial definition for Block.java is fairly simple:


package com.langr.collection;

public interface Block
{
    public void exec(Object each);
}

For now, we will build Block as an interface. It will act as a contract that insists the exec(Object) method be implemented for anyone interested in using a Block.

Note that I’ve chosen the package name com.langr.collection for my collection subclasses and other related classes.


Working with Blocks


Now that we have a block, how is it used? The following bit of code demonstrates a simple example. In the example, list represents an object of the collection class that we will be building shortly. We invoke the method do(Block) on this object, passing it a Block object representing what we want to happen for each object. A dynamic declaration for Block’s exec(Object) method is provided in the form of an anonymous inner class. In the body of the exec(Object) method, we simply take the each parameter (which represents each object in turn from list) and pass it off to another method called show(Object).


list.do(
    new Block() {
        public void exec(Object each) {
            show(each);
        }});

A slight problem: do is a reserved word in Java. We’ll modify this in a later version; my preference for a clear method name is forEachDo.

Okay, you’re thinking, this looks like more work than the classic iteration. The benefit is that the iteration logic is not repeated. We are replacing potentially three operations in code (initialize iterator, test for end of iteration, extract element from collection) with a single operation (invoke method on collection).

Now that we know how we want this to look, all we have to do is provide the behavior for the do(Block) method. This code is pretty straightforward:


package com.langr.collection;

import java.util.Iterator;

public class ArrayList
    extends java.util.ArrayList
{
    // Internal iterator: passes each element in turn to the block.
    public void forEachDo(Block block)
    {
        Iterator iterator = iterator();
        while (iterator.hasNext())
            block.exec(iterator.next());
    }
}

(I’ve already renamed do(Block) to forEachDo(Block) here.) Most importantly, note that we are creating a new class called ArrayList by extending the existing java.util.ArrayList. Thus we get all of the behavior of an ArrayList, plus the functionality we have added in forEachDo(Block).

The body of forEachDo(Block) is simply the three-step iteration that used to be externalized. We now have encapsulated these three steps in the ArrayList object itself.

Are we done? Let’s write a simple test program to try using our new ArrayList subclass.


import java.util.List;
import com.langr.collection.ArrayList;
import com.langr.collection.Block;

public class TestArray
{
    public static void main(String[] args)
    {
        List x = new ArrayList();
        x.add("x");
        x.add("t");
        x.add("c");
        x.forEachDo(
            new Block()
            {
                public void exec(Object each)
                {
                    show((String)each);
                }
            });
    }

    public static void show(String x)
    {
        System.out.println("x: " + x);
    }
}

If you try to compile this code, you receive the following error message:


Listing 1. Sample error message.

Our com.langr.collection.ArrayList inherits from java.util.ArrayList, which in turn implements the java.util.List interface. This gives us polymorphic flexibility: we can use a com.langr.collection.ArrayList object anywhere a java.util.List is expected. Good practice states that you should assign concrete classes to a variable of the most generic interface or superclass possible. This allows for easy maintenance of code later. In this example, it even allows us to make a single modification to swap out the Sun class with our custom collection. In this case, List is probably the most useful assignment we can make.

The problem, though, is that java.util.List knows nothing about our forEachDo(Block) method. We have two choices: either accept this fact and assign the new ArrayList instance to a com.langr.collection.ArrayList variable, or create our own List interface that does recognize the forEachDo(Block) method. Creating the List interface is easy: it only needs to extend java.util.List and declare forEachDo(Block).
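The listing for the interface did not survive in this copy of the article, so what follows is a reconstruction under stated assumptions, not the original code. It assumes the interface extends java.util.List (the test program calls add on a List variable) and declares forEachDo(Block). To keep the sketch compilable as a single file, Block, List, and ArrayList appear as nested types inside a hypothetical ListInterfaceSketch class. In the article's layout each would be a public top-level type in its own file in com.langr.collection. The join helper is also made up for illustration.

```java
import java.util.Iterator;

public class ListInterfaceSketch
{
    // Same contract as the article's Block interface.
    public interface Block
    {
        void exec(Object each);
    }

    // Stand-in for com.langr.collection.List: everything java.util.List
    // declares, plus the internal iterator.
    public interface List extends java.util.List
    {
        void forEachDo(Block block);
    }

    // Stand-in for com.langr.collection.ArrayList.
    public static class ArrayList
        extends java.util.ArrayList
        implements List
    {
        public void forEachDo(Block block)
        {
            Iterator iterator = iterator();
            while (iterator.hasNext())
                block.exec(iterator.next());
        }
    }

    // Calls forEachDo through a List-typed variable, which java.util.List
    // alone would not allow.
    public static String join(List list)
    {
        final StringBuffer out = new StringBuffer();
        list.forEachDo(
            new Block()
            {
                public void exec(Object each)
                {
                    out.append(each).append(';');
                }
            });
        return out.toString();
    }

    public static void main(String[] args)
    {
        List names = new ArrayList();
        names.add("Jeff");
        names.add("Joe");
        System.out.println(join(names)); // prints "Jeff;Joe;"
    }
}
```

Because the custom List extends java.util.List, an object held in a variable of the new type can still be passed to any code that expects the standard interface.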

Our ArrayList implementation now needs to implement com.langr.collection.List:


package com.langr.collection;

import java.util.Iterator;

public class ArrayList
    extends java.util.ArrayList
    implements com.langr.collection.List
{
    public void forEachDo(Block block)
    {
        Iterator iterator = iterator();
        while (iterator.hasNext())
            block.exec(iterator.next());
    }
}

And finally, our test program now must import com.langr.collection.List instead of java.util.List:


import com.langr.collection.ArrayList;
import com.langr.collection.Block;
import com.langr.collection.List;

public class TestArray
{
    public static void main(String[] args)
    {
        List x = new ArrayList();
        x.add("Jeff");
        x.add("Joe");
        x.add("jane doe");
        x.forEachDo(
            new Block()
            {
                public void exec(Object each)
                {
                    show((String)each);
                }
            });
    }

    public static void show(String x)
    {
        System.out.println("x: " + x);
    }
}

You will want to modify all your code to use the com.langr.collection package. Since our ArrayList class uses the same name as the java.util class, you will only need to make a simple package name change. Usually this can be done in the import statements at the top of each source file.

Because both the java.util and com.langr.collection packages define ArrayList and List, conflicts might arise if you need to use classes from each. There is a simple technique to ensure that your code uses the com.langr.collection class definitions for ArrayList and List. Just include explicit class imports along with the wildcard import; an explicit single-class import takes precedence over a wildcard import:


import java.util.*;
import com.langr.collection.ArrayList;
import com.langr.collection.List;
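The same rule can be tried out with JDK classes alone. The sketch below is an analogy, not part of the com.langr.collection code: both java.util and java.sql define a class named Date, and the explicit import decides which one the simple name refers to. The class name ImportPrecedence and the method resolvedDateClass are hypothetical.

```java
import java.util.*;
import java.sql.Date; // explicit import: the simple name Date now means java.sql.Date

public class ImportPrecedence
{
    // Reports which Date class the compiler bound the simple name to.
    public static String resolvedDateClass()
    {
        // valueOf(String) is declared by java.sql.Date, not java.util.Date.
        Date date = Date.valueOf("2000-01-01");
        return date.getClass().getName();
    }

    public static void main(String[] args)
    {
        // List and ArrayList still come from the java.util.* wildcard import.
        List names = new ArrayList();
        names.add(resolvedDateClass());
        System.out.println(names.get(0)); // prints "java.sql.Date"
    }
}
```

Without the explicit import, Date would resolve to java.util.Date and the call to valueOf would not compile.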


About the Author


Jeff Langr is a software developer with over 17 years of experience. He is the author of the book Essential Java Style: Patterns for Implementation (Prentice Hall, 1999). Jeff currently lives in Colorado Springs and works as a consultant for ChannelPoint, Inc.
