The Java Collections Framework: Implementations
December 7, 1999
In part one of this article, we covered the basic interfaces provided by the collections framework. Next, we'll examine the data structure implementations that the new collections framework provides, and some examples of their usage. Implementations of the and Interfaces There are many implementations of the and interfaces. Some you'll find familiar ( and ), and others are new. The following table introduces the new classes, and shows which each implements. | Implementations | | Set | SortedSet | List | Map | SortedMap | | HashSet | TreeSet | Vector / Stack | Hashtable | TreeMap | | | ArrayList | HashMap | | | | LinkedList | WeakHashMap | | Figure 1. Non-abstract classes that implement and interfaces. Selecting a Collection or Map implementation When using the new classes, it is important to remember that some implementations impose additional restrictions on what elements, keys and values may be safely stored and retrieved. For example, both and require that elements implement the new interface. Some additional restrictions may include that elements are not , while other implementations may permit elements. The and interfaces (discussed earlier in Part One) provide a consistent mechanism for accessing these new data structures, so in many situations the choice of data structure is transparent. However, for performance reasons, one data structure may favor a particular set of circumstances over another. Additionally, using a over a will allow duplicate storage of objects. ArrayList The class is a resizable array, and very similar in nature to a . The chief difference is that, like many other implementations, it is not synchronized. This should give a mild performance increase in single-threaded applications. LinkedList offers methods for access ( ), retrieval ( ) and deletion ( ) of the first and last elements of the list, in addition to the standard methods. This functionality is extremely important, as it makes the more versatile than an . For example, a stack which is a last-in-first-out (LIFO) queue needs the ability to insert data into the beginning of a list, and then remove it. HashSet The class offers a implementation, which uses a hash table implementation to offer good addition, removal and access performance. For the benefit of those who have never written a hash table implementation before, an array index value (hash code) is calculated. This is determined by using a hashing function which attempts to calculate a unique value for each object. We can remove and access elements, as well as add new ones, in what is termed constant computing time (the time to calculate the hash code and add/remove/select the element is almost uniform, even as the number of elements increase). TreeSet Unlike , the class sorts its elements in ascending order. This gives a slightly slower speed a hash table implementation (log(n) time rather than constant computing time), but in many situations this is worth it, for the guarantee that the set is sorted. HashMap differs very little from the Hashtable class. The chief difference is that its methods are unsynchronized, and it also accepts values. WeakHashMap This class is extremely interesting, in that it uses weak keys. When there is no object reference to a particular key, it may be reclaimed by the automatic garbage collector. In other implementations, the key will be protected from automatic reclamation, as the maintains a reference to it. There is no such protection with a , so keys may be deleted (and hence the mapping to a value) without warning. This functionality is actually quite useful, as it saves having to manually remove unwanted entries. TreeMap implements the SortedMap interface, which sorts mapping by key. For this reason, data inserted into a must implement the interface. If sorting is important, then use , which offers log(n) time for and . If not, one of the implementations may be more appropriate. Retrieving data from a collection Adding data to a collection, or checking if an element exists, is relatively straightforward, particularly for those already familiar with the class. A big change though, is the way in which data elements are accessed. Readers may be familiar with the interface, which offers a way to iterate through a series of data elements. The collection framework introduces a replacement: the . The makes it easier to traverse through the elements of a collection. It also has an extra feature not present in the older Enumeration interface - the ability to remove elements. This makes it easy to perform a search through a collection, and strip out unwanted entries. The following example demonstrates the in action. It adds every command line parameter to a , then uses an to strip out parameters not beginning with a hyphen ('-'). import java.util.*;
public class IteratorDemo
{
public static void main(String args[])
{
Collection c = new ArrayList();
// Add every parameter to our array list
for (int indx = 0; indx < args.length; indx++)
{
c.add(args[indx]);
}
// Examine only those parameters that start with -
Iterator i = c.iterator();
// PRE : Collection has all parameters
while (i.hasNext())
{
String param = (String) i.next();
// Use the remove method of iterator
if (! param.startsWith("-") )
i.remove();
}
// POST: Collection only has parameters that start with -
// Demonstrate by dumping collection contents
Iterator i2 = c.iterator();
while (i2.hasNext())
{
System.out.println ("Param : " + i2.next());
}
}
} Synchronizing access to a Collection or Map The and interfaces don't offer synchronized methods. However, collections are just as relevant in multi-threaded applications as they are in single-threaded ones! Fortunately, there is an easy solution. Included in the collections framework are some support classes. One feature you'll find particularly handy is the ability to produce a synchronized version of a collection or map, using the class (not to be confused with the interface). This is best done when first constructing it, to prevent a thread accessing an unsynchronized version. Methods are provided to create either a Collection, List, Set, SortedSet, Map | or . For example, to created a synchronized , you could do the following: - SortedSet set = Collections.synchronizedSortedSet(new TreeSet()); This produces a synchronization wrapper, which is a class that provides synchronized access to a collection. Failure to use synchronization wrappers can lead to concurrency problems, and a ConcurrentModificationException | being thrown. Converting from a Collection to an array - and back again The interface defines the method, which returns an array of objects. If you want to convert back to a implementation, you could manually traverse each element of the array and add it using the method. // Convert from a collection to an array
Object[] array = c.toArray();
// Convert back to a collection
Collection c2 = new HashSet();
for (int i = 0; i < array.length; i++) { c2.add(array[i]; } That's a little bit of work though. If you're happy enough to use a , rather than a , the support class has a method which converts from an array to a list. // Convert from a collection to an array
Object[] array = c.toArray();
// Convert back to a collection
Collection c2 = ArrayList.asList(array); Either way, converting between collection and array (or vice versa) is relatively straightforward. Putting the Collections framework to use The collections framework makes it easy to create data structures with specialized behavior. When writing new data-types, you can implement or extend a (for an is-a relationship), or use an existing collection as a member variable (for a has-a relationship). For example, if you wanted to create a stack, you could use the class. Though a stack implementation already exists in the form of , the exercise of creating a stack is a useful learning aid. As you can see, the code is extremely compact and clear. makes a good base for any type of queue, stack or double-ended queue. For example, a custom queue might accept only a certain type of object (such as instances for a thread pool). import java.util.*;
public final class SimpleStack
{
private LinkedList list = new LinkedList();
public void push( Object o )
{
list.addFirst( o );
}
public Object pop()
{
Object o = list.getFirst();
list.removeFirst();
return o;
}
public boolean isEmpty()
{
return ( list.size() >0 ? false : true );
}
public int size() { return list.size(); }
} Let's look a little further at some of the implementations. As mentioned above, some implementations guarantee the sorting of data elements, whereas others do not. Some guarantee only one occurrence of an element, others do not. Now's the time to see the differences in action! Take a look at the following listing, which adds random numbers to a collection. A will not allow duplicates, whereas a will. A doesn't guarantee order, but a does. By commenting and un-commenting the lines highlighted in bold, you can see the difference between them. import java.util.*;
public class CollectionDemo
{
public static void main(String args[])
{
Collection c;
Random r = new Random();
c = new ArrayList();
//c = new HashSet();
//c = new TreeSet();
for (int count = 1; count < 100; count++)
{
// Add a number between one and ten
int num = r.nextInt();
num = num < 0 ? -num : num;
c.add ( new Integer (num % 10 +1) );
}
Iterator i = c.iterator();
while (i.hasNext())
{
System.out.println (i.next());
}
}
} Here's an extra tip for when you want to sort a . Since there isn't a interface, and a isn't always appropriate, you must rely on the class. The static method will automatically sort a for you. Try the following changes to the example, to demonstrate sorting with a (which will then contain duplicate but ordered elements). import java.util.*;
public class CollectionDemo
{
public static void main(String args[])
{
Collection c;
Random r = new Random();
c = new ArrayList();
for (int count = 1; count < 100; count++)
{
// Add a number between one and ten
int num = r.nextInt();
num = num < 0 ? -num : num;
c.add ( new Integer (num % 10 +1) );
}
Collections.sort ( (List) c ); Iterator i = c.iterator();
while (i.hasNext())
{
System.out.println (i.next());
}
}
} Summary The collections framework, and the new data structure implementations it provides, is a powerful tool for aggregation of objects. Almost any type of application or applet can benefit from using collections, and they offer much greater versatility than the simple and classes. However, this functionality is part of the Java 2 platform - so it is currently incompatible with many browsers and JVMs. This will change in time though, and it's wise to become familiar with the collections framework now. Finding a use for collections won't be difficult, and will save many hours of coding custom data-structures. About the author David Reilly is a software engineer and freelance technical writer living in Australia. A Sun Certified Java 1.1 Programmer, his research interests include the Java programming language, and networking & distributed systems. He can be reached via email at java@davidreilly.com, or at his personal website.
|