JavaEnterprise JavaData Structures in Java: Part 4, Purpose of Implementations and Algorithms

Data Structures in Java: Part 4, Purpose of Implementations and Algorithms

Java Programming, Lecture Notes #1356


Preface

This is the fourth lesson in a miniseries on Java data structures and the
Java Collections Framework.  The first lesson in the miniseries was
entitled Data
Structures in Java: Part 1, Getting Started
.

The purpose of this miniseries is to help you learn the essential features
of Object-Oriented data structures in Java using the Collections Framework.

Supplementary material

I recommend that you also study the other lessons in my extensive collection
of online Java tutorials.  You will find those lessons published at
Gamelan.com
However, as of the date of this writing, Gamelan doesn’t maintain a consolidated
index of my Java tutorial lessons, and sometimes they are difficult to
locate there.  You will find a consolidated index at
Baldwin’s
Java Programming Tutorials
.

The index on my site provides links to the lessons
at
Gamelan.com
.

Preview


At least three things are included in a collections framework:

  • interfaces
  • implementations
  • algorithms

The previous lesson discussed the purpose of the interfaces.  This
lesson will discuss the purpose of the implementations and the algorithms
in the Collections Framework.

Introduction

In the previous lesson, entitled
Data
Structures in Java: Part 3, Purpose of Framework Interfaces
,
we learned that the framework provides nine concrete implementations of the interfaces in the
framework.  These nine implementation classes are available for immediate
instantiation to produce objects to satisfy your collection needs.

We also learned that the framework provides three incomplete implementations. 
These classes are available for you to use as a starting point in defining
your own implementations.  Default implementations of many of the
interface methods are provided in the incomplete implementations.

Discussion
and Sample Program

Purpose of implementations

The implementations in the Java Collections Framework are the
concrete definitions of the classes that implement the core collection
interfaces
.  For example, as of JDK 1.3, the concrete implementations
in the Java Collections Framework are provided by the following nine classes.

  • HashSet
  • TreeSet
  • LinkedList
  • ArrayList
  • Vector
  • HashMap
  • WeakHashMap
  • TreeMap
  • Hashtable

Available for immediate use

These classes are available for immediate use to instantiate collection
objects.

As you can see, there are two classes that obviously fall into the set
category, two that obviously fall into the list category, and three
that obviously fall into the map category.  I will have more
to say about the details of these classes in subsequent lessons.

This leaves two additional classes whose names don’t readily divulge
the category in which they belong.

Vector and Hashtable classes

The classes Vector and Hashtable were part of Java even
before the Java Collections Framework became available.  The Vector
class can be used to instantiate objects that fall in the general list
category.

The Hashtable class can be used to instantiate objects that fall
in the map category.

These two classes have been upgraded to make them compatible with the
Collections Framework.

Abstract implementations

In addition to the concrete implementations listed above, the following
three classes partially implement the interfaces, but are not intended
for instantiation.  Rather, they are intended to be extended into
new concrete classes that you define.

  • AbstractSet
  • AbstractList
  • AbstractMap

Therefore, by either using one of the three classes listed above as a starting
point, or by starting from scratch and fully implementing one or more of
the interfaces, you can provide new concrete implementations to augment
the framework to include collections that meet your special needs.

Purpose of algorithms

Algorithms are methods (not necessarily exposed) that provide
useful capabilities, such as searching and sorting.  For example,
the Collection interface declares an exposed method named contains.

The contains method

  • Receives an incoming reference of type Object as a parameter
  • Searches the collection looking for an element that matches the incoming
    reference
  • Returns true if the collection on which the method is invoked contains
    the specified element

Different classes, different implementations

You can safely invoke the contains method on any object instantiated
from a class that properly implements the Collection interface,
even if you don’t know the actual type of the collection object.

The manner in which the search will be performed will probably differ
from one concrete implementation of the interface to the next.  For
example, a TreeSet object will perform the search very rapidly with
a time cost of only log(n) where n is the number of elements.  On
the other hand, for the same number of elements, because of a different
underlying data structure, a search on an ArrayList object will
probably require more time than a search on a TreeSet object. 
As the number of elements increases, the difference in time cost between
the two will also increase.

Consider the sample program shown in Listing 1.  This program compares
the search speed of the ArrayList class and the TreeSet class. 
A detailed discussion of the program follows Listing 1.

 

/*File SpeedTest01
Copyright 2001 R.G.Baldwin
**************************************/

import java.util.*;

public class SpeedTest01{
  public static void main(
                        String args[]){
    new Worker().doIt();
  }//end main()
}//end class SpeedTest01

class Worker{
  public void doIt(){
    int size = 500000;
    //Create a TreeSet object
    Collection aTree = new TreeSet();
    
    //Populate the TreeSet object with
    // random values.  The add() method
    // for a set rejects duplicates.
    Random rnGen = new Random();
    for(int ct = 0; ct < size; ct++){
      aTree.add(new Double(
                  rnGen.nextDouble()));
    }//end for loop
    
    //Create and populate an ArrayList
    // object with the same random
    // values
    Collection aList = 
                  new ArrayList(aTree);
                  
    //Extract a value near the center
    // of the ArrayList object to use
    // as a test case.
    Object testVal = 
        ((List)aList).get(size/2);
    
    //Search for the test value in each
    // of the collection objects.  
    // Measure and display the time 
    // required to perform the search
    // in each case.
    long start = new Date().getTime();
    boolean found = 
               aList.contains(testVal);
    long stop = new Date().getTime();
    System.out.println(
         found + " " + (stop - start));
    
    start = new Date().getTime();
    for(int x = 0; x < 10000; x++){
      found = aTree.contains(testVal);
    }//end for loop
    stop = new Date().getTime();
    System.out.println(
         found + " " + (stop - start));

  }//end doIt()
}// end class Worker

Listing 1

The program begins by instantiating a TreeSet object and populating
it with approximately 500,000 elements as shown in Listing 2 below. 
The values encapsulated in the objects referred to by the elements in the
collection are produced by a random number generator.

Recall that the add method of a Set object rejects duplicate
elements, so there may be fewer than 500,000 elements in the object after
it is populated.

 

  public void doIt(){
    int size = 500000;

    Collection aTree = new TreeSet();

    Random rnGen = new Random();
    for(int ct = 0; ct < size; ct++){
      aTree.add(new Double(
                  rnGen.nextDouble()));
    }//end for loop

Listing 2

One of the capabilities of the Collection Framework is to create a new
Collection object and populate it with the contents of an existing
Collection object of a different (or the same) actual type.

The code in Listing 3 instantiates an ArrayList object and populates
it with the contents of the existing TreeSet object.  As a
result, we then have two different Collection objects of different
actual types containing the same elements.

 

    Collection aList = 
                  new ArrayList(aTree);

Listing 3

The objective of this program is to compare the times required to search
for and find an element in each of the collections.  Thus, we need
a target element to search for.

The code in Listing 4 below extracts a value near the center of the
ArrayList object using an index to find and extract the value. 
This is a very fast operation on a List object.  This value
is saved in testVal to be used later for test purposes.

Note that the reference to the ArrayList object was saved as
type Collection in Listing 3 above.

Note also that it was necessary to cast that reference to type List
in Listing 4 below in order to invoke the get method on the reference. 
This is because the Collection interface does not declare a method
named get.  Rather, the get method is added to the List
interface to define a more specialized form of collection.

 

    Object testVal = 
        ((List)aList).get(size/2);

Listing 4

The code in Listing 5 below invokes the contains method to search
for the test value in each of the collections.  It uses the system
clock to measure the time required to find the element in each case. 
I will assume that you understand how to use the Date class for
this purpose, and won’t provide a detailed explanation.

 

    long start = new Date().getTime();
    boolean found = 
               aList.contains(testVal);
    long stop = new Date().getTime();
    System.out.println(
         found + " " + (stop - start));
    
    start = new Date().getTime();
    for(int x = 0; x < 10000; x++){
      found = aTree.contains(testVal);
    }//end for loop
    stop = new Date().getTime();
    System.out.println(
         found + " " + (stop - start));

  }//end doIt()

Listing 5

The output from the program was:

true 171
true 30

The first line of output applies to the ArrayList object, and
the second line applies to the TreeSet object.

As we would expect, the test value was successfully found in both cases;
hence the display of true in both cases.

Time required to search ArrayList

The output indicates that approximately 171 milliseconds were required
to find the test value in the ArrayList object.

Time required to search TreeSet object

On the other hand, the time required to find the test value in the TreeSet
object was so small that it wasn’t even measurable within the granularity
of the system clock (other experiments have caused me to believe that
the granularity of the system clock on this machine is at least ten milliseconds)

Hence, the original reported time required to find the test value in the
TreeSet object was zero.

In order to get a measurable time value to search the TreeSet
object, I had to wrap the invocation of the contains method in a
for-loop and search for the same value 10,000 times in succession. 
Thus, the time required to find the test value in the TreeSet object
was approximately 0.0030 milliseconds as compared to 171 milliseconds for
the ArrayList object.

(I’ll let you do the arithmetic to see if this makes sense in terms
of the expected time cost to search the two different types of objects. 
Don’t forget the extra overhead of the for-loop.)

Different implementations

This is a graphic demonstration that even though both objects can be
treated as type Collection, and the contains method can be
invoked on either object in a polymorphic manner, the actual implementations
of the two objects and the implementations of the contains methods
in those two objects are different.

Each type of collection has advantages and disadvantages, depending
on your needs.

Polymorphic behavior applies

The important point is that if you receive a reference to the collection
object as type Collection, you can invoke the contains method
on that reference without regard to the underlying structure of the collection
object.  This is because polymorphic behavior applies.

Very briefly, polymorphic behavior means that the actual method that
is executed is the appropriate method for that type of object regardless
of the type of the reference to the object.  This is one of the great
advantages of using the Java Collections Framework and passing collection
objects among methods as interface types.

Sorting algorithms

Some of the implementations of the Java Collection Framework maintain
their elements in a random order, and other implementations maintain their
elements in a sorted order.  Thus, the framework also provides sorting
algorithms.  However, the sorting algorithms used to maintain the
order of the collections are not exposed in the way that the search algorithm
is exposed (via the contains() method).
Rather, the sorting algorithms
are implicit in those implementations that need them, and are absent from
those implementations that don’t need them.

Now for a little quiz

Let’s see if you are still awake.  Select the words in one pair
of parentheses in the following statement that causes the statement to
be true.

The interfaces in the Collections Framework make it possible to manipulate
the contents of collections in a manner that is (dependent on)
(independent
of)
the underlying implementation of each collection.

And the answer is …

The interfaces in the Collections Framework make it possible to manipulate
the contents of collections in a manner that is independent of the underlying
implementation of each collection.  That is the beauty of basing the
framework on interfaces that declare polymorphic methods.

I placed this question here to drive home the point that the methods
declared in the Collection interface can be invoked on collection
objects in a polymorphic manner.

That is to say, as a user of an object instantiated from a class that
properly implements the Collection interface, you can invoke the
methods declared in that interface on a reference to the object and be
confident that the actual method that is invoked will be the version that
is appropriate for the class from which the object was instantiated. 
This is polymorphic behavior.

In the event that you need to invoke a method that is not declared in
the Collection interface (such as the get() method in Listing
4 above),
you can pass the reference as one of the more specialized
sub-interfaces of Collection, such as Set.

Benefits of using the Collections Framework

The Java
Tutorial
from Sun lists and explains the following benefits of using
the Java Collections Framework.

  • It reduces programming effort
  • It increases program speed and quality
  • It allows interoperability among unrelated APIs
  • It reduces the effort to learn and use new APIs
  • It reduces effort to design new APIs
  • It fosters software reuse

For a detailed explanation of these benefits, I am simply going to refer
you directly to this excellent online book.

Summary

So, let’s recap some of what we have learned in this and the previous lessons.

The core collection interfaces in the Collections Framework are shown
in Listing 6.

 

  • Collection
    • Set
      • SortedSet
    • List
  • Map
    • SortedMap

Listing 6

The basic purpose of the core collection interfaces in the Java Collections
Framework is to allow collections to be manipulated without regard for
how the collections are implemented, provided of course that the implementations
comply with the contracts.

The framework provides the following nine concrete implementations (classes)
of the interfaces shown in Listing 1:

  • HashSet
  • TreeSet
  • LinkedList
  • ArrayList
  • Vector
  • HashMap
  • WeakHashMap
  • TreeMap
  • Hashtable

For example, the classes TreeSet and ArrayList are concrete
implementations of the Collection interface as shown in the above
list.  (Actually, they are concrete implementations of sub-interfaces
of Collection.  JDK 1.3 doesn’t provide any direct implementations
of Collection).

A collection object instantiated from the class TreeSet and a
collection object instantiated from the class ArrayList can each
be viewed as being of the interface type Collection.

Methods having the same signatures can be used to manipulate either
collection with confidence that the behavior of the method will be appropriate
for the actual type of collection involved.

The framework also provides the following incomplete implementations
of the core interfaces:

  • AbstractSet
  • AbstractList
  • AbstractMap

The purpose of these implementations is to provide you with a starting
point for defining your own concrete implementations for more specialized
collections.

What’s Next?

In the next lesson, I will dig a little deeper into the details of the
Java Collections Framework.  I will begin the next lesson with a sample
program that illustrates the basic purpose of the core collection interfaces,
which is to allow collections to be manipulated without regard for how
the collections are implemented, as long as the implementation meets the
contract of the interface.


Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in
part in any form or medium without express written permission from Richard
Baldwin is prohibited.

About the author

Richard Baldwin
is a college professor and private consultant whose primary focus is a
combination of Java and XML. In addition to the many platform-independent
benefits of Java applications, he believes that a combination of Java and
XML will become the primary driving force in the delivery of structured
information on the Web.

Richard has participated in numerous consulting projects involving
Java, XML, or a combination of the two.  He frequently provides onsite
Java and/or XML training at the high-tech companies located in and around
Austin, Texas.  He is the author of Baldwin’s Java Programming
Tutorials,
which has gained a worldwide following among experienced and aspiring Java
programmers. He has also published articles on Java Programming in Java
Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and
has many years of experience in the application of computer technology
to real-world problems.

[email protected]

Latest Posts

Related Stories