Java Enterprise Java Data Structures in Java: Part 1, Getting Started

Data Structures in Java: Part 1, Getting Started

Java Programming, Lecture Notes #1350


Preface


This is the first lesson in a new miniseries on Java data structures and
the Java Collections Framework.

The purpose of this miniseries is to help you learn the essential features
of Object-Oriented data structures in Java using the Collections Framework.

Viewing tip

You may find it useful to open another copy of this lesson in a separate
browser window.  That will make it easier for you to scroll back and
forth among the different listings while you are reading about them.

Recommended supplementary reading

I recommend that you also study the other lessons in my extensive collection
of online Java tutorials.  You will find those lessons published at
Gamelan.com
However, as of the date of this writing, Gamelan doesn’t maintain a consolidated
index of my Java tutorial lessons, and sometimes my lessons are difficult
to locate there.  You will find a consolidated index at
Baldwin’s
Java Programming Tutorials
.

The index on my site provides links to the lessons
at
Gamelan.com.

Preview


This lesson provides a brief introduction to the use of the Java Collections
Framework.  The framework is designed to encourage you to reuse rather
than to reinvent collections and maps.

A collection represents a group of objects, known as its elements. 
Some collections allow duplicate elements while others do not. Some collections
are ordered and others are not.  (Maps will be discussed in subsequent
lessons.)

The Collections Framework is defined by a set of interfaces and associated
contracts, and provides concrete implementations of the interfaces for
the most common data structures.  In addition, the framework also
provides several abstract implementations, which are designed to make it
easier for you to create new and different implementations while still
maintaining the structural polymorphic integrity of the framework.

Introduction


A little quiz

Let’s begin with a little quiz to establish your baseline knowledge
of the Collections Framework.  Take a look at the program in Listing
6 near the end of this lesson.  Which of the following is the output
produced by that program?

  • A.  Compiler Error
  • B.  Runtime Error
  • C.  44321
  • D. 12344
  • E. 1234
  • F.  None of the above.

If your answer was 1234 then you already know
quite a lot about the use of the Collections Framework.  If not, keep
reading to begin learning about the framework.

Elements of the Framework are easy to use

This simple introductory program is not intended to do anything particularly
useful.  Rather, it was designed to illustrate several important features
of the framework, including the ease with which elements of the framework
can be reused in your programs.

Don’t reinvent the wheel

As many of you already know, I am a college professor.  I specialize
in teaching OOP using Java.  In the past, many college courses in
Data Structures (often referred to as CS2 courses) have emphasized
the concept of reinventing the wheel.  Students were required
to learn how to reinvent a variety of complex data structures in order
to successfully complete the course.

Hopefully a change is in the works

Hopefully, with the conversion of these CS2 courses to Java OOP, the
emphasis will change to reuse instead of reinvent.

Unfortunately, most of the CS2 Java textbooks that I have seen so far
look like warmed-over textbooks from the days of Pascal, C, and C++. 
Many look like they were written using cut-and-paste technology to update
an old Pascal, C, or C++ textbook to incorporate a little Java OOP. 
Most of the textbooks that I have seen include only enough Java OOP to
avoid violating truth in advertising requirements.  They still
emphasize
reinvent instead of reuse.

Collections Framework encourages reuse

The Java Collections Framework is designed to encourage programmers
to reuse existing interfaces and classes instead of inventing new ones. 
In the event that it is necessary to invent a new class or interface, the
programmer is encouraged to integrate it into the Framework in a polymorphic
manner.

Sample Program


I am going to provide a brief discussion of the sample
program (shown in Listing 6) in this lesson.  Later, I will
provide more detailed discussions of many of the features used in that
program.

Interesting
Code Fragments


I will break this program down and discuss it in fragments.

An object of the TreeSet class

The code fragment in Listing 1 instantiates an object of the TreeSet
class and stores the object’s reference in a reference variable of type
Collection.

 

class Worker{
  public void doIt(){
    Collection ref = new TreeSet();

Listing 1

Collection is an interface

The TreeSet class implements the SortedSet interface,
which extends the Set interface, which in turn extends the Collection
interface.  Thus, a TreeSet object is a Collection
Therefore, a reference to a TreeSet object can be stored in a reference
variable of type Collection, and can be treated as the generic type
Collection.

What is a TreeSet object?

Among other things, in CS2 courses, we worry about the time and memory
cost of a collection.  According to Sun, the TreeSet class
guarantees that the sorted set will be in ascending element order, and
provides guaranteed log(n) time cost for the basic operations (add,
remove
and contains).

What does ascending element order mean?

Again, according to Sun, the elements will be sorted according to the
natural
order
of the elements (see the Comparable interface) or by a
comparator (see the Comparator interface) provided at the time the
set is created.  This depends on which overloaded constructor is used. 
I will have more to say about these alternatives in a subsequent lesson.

What does log(n) time cost mean?

I’m not going to try to explain the details of log(n) time cost here. 
Suffice it to say that the add, remove, and contains
methods execute very fast.  (I will have more to say about this
is a subsequent lesson.)

A TreeSet object is a Set

An object of the TreeSet class also is a Set
One of the characteristics of a Java Set (an object that implements
the Set interface)
is that it can contain no duplicate elements. 
Therefore, a TreeSet object can contain no duplicate elements. 
If the add() method of a TreeSet object is invoked in an
attempt to add a duplicate element, the element will not be added.

A TreeSet object is a SortedSet

The TreeSet class also implements the SortedSet interface. 
This guarantees that the contents of a TreeSet object will be in
ascending element order, regardless of the order in which the elements
are added.  (In a subsequent lesson, I will discuss how comparisons
are made to enforce the ordering of the elements.)

A TreeSet object is a Collection

Because an object of the TreeSet class is a Collection,
a reference to such an object can be passed to any method that requires
an incoming parameter of type Collection.  The receiving method
can invoke any method on that reference that is declared in the Collection
interface.
(I will discuss such methods in detail in subsequent
lessons.)

Populate the Collection

The statement in Listing 2 below passes the TreeSet object’s
reference to a method named fillIt(), which is a class method of
the Populator class.  (The Populator class is a class of
my own design whose only purpose is to illustrate the polymorphic behavior
achieved using the Collections Framework.)
  The behavior of this
method is to add elements to the incoming Collection object without
regard for the actual type of the object (the class from which the object
was instantiated).


 

    Populator.fillIt(ref);

Listing 2

At this point, I am going to discuss the fillIt() method of the
Populator
class invoked in the above fragment.  The entire class definition
of the Populator class is shown in Listing 3 below.

 

class Populator{
  public static void fillIt(
                       Collection ref){
    ref.add(new Integer(4));
    ref.add(new Integer(4));
    ref.add(new Integer(3));
    ref.add(new Integer(2));
    ref.add(new Integer(1));
  }//end fillIt()
}//end class populator

Listing 3

Don’t know, don’t care

As you can see in the above fragment, the fillIt() method receives
the reference to the TreeSet object as type Collection
This method doesn’t know, and doesn’t care, what the actual type of the
object is.  All it cares about is that the object is a Collection
object.  (Otherwise, the object’s reference couldn’t be passed
in as a parameter.  A type mismatch would occur.)

Because the incoming parameter is a reference to a Collection
object, the fillIt() method can invoke the add() method on
the object, with confidence that the behavior of the add() method
will be appropriate for the specific type of object involved.  (For
example, the behavior of the add() method for an object of the TreeSet
class will probably be different from the behavior of the add() method
for an object of some other class that implements the Collection interface.)

Polymorphism is great!

The great thing about polymorphic behavior is that the author of the
fillIt() method doesn’t need to be concerned about the implementation
details of the add() method.

Add five elements with some duplicates

The code in the fillIt() method adds five elements to the object. 
Each element is a reference to a new object of type Integer
Two of the objects encapsulate the int value 4, and thus are duplicates.

The int values encapsulated in the Integer objects are
not in ascending order.  Rather, they are added to the object in descending
order.  (They could be added in any order and the end result would
be the same.)

Filter out the duplicates

Unknown to the author of the fillIt() method, the add()
method filters out the duplicate element in order to satisfy the contract
of the Collection interface.  In this case, the author didn’t
care what happens in the case of duplicate elements.

Notification of duplicates

If the author of the fillIt() method does care what happens in
the case of duplicates, she can find out when an object is a duplicate.

According to the contract of the Collection interface, the add()
method must return
true if the invocation of the method modifies
the contents of the object and must return false if the collection
does not permit duplicates and the collection already contains the specified
element.

Sort the elements

Also unknown to the author of the fillIt() method, even though
the elements are added in descending order (or could be added in any
other order for that matter),
they are stored and maintained in the
TreeSet object in such a way that they can later be accessed in
ascending order.

The TreeSet object is now populated

When the fillIt() method returns, the TreeSet object contains
four (not five) elements with no duplicates.  Each element
is a reference to an object of type Integer.  Those references
are maintained in such a way as to make them accessible in ascending order,
based on the
int values encapsulated in each of the Integer
objects.

Get an Iterator object

Returning now to the flow in the doIt() method, the following
statement invokes the iterator() method on the TreeSet object’s
reference that is stored in the reference variable of type
Collection
This is shown in Listing 4 below.

 

    Iterator iter = ref.iterator();

Listing 4

Invocation of the iterator() method on any Collection
object returns an instance of a class that implements the Iterator
interface.  The Iterator object can be used to traverse the
collection, gaining access to each element in order.  (The concept
of in order means different things for different kinds of collections. 
For a collection instantiated from the TreeSet class, in order means in
ascending order.)

Again, don’t know, don’t care

Again, the author of the method that uses the Collection object
doesn’t need to know or care about the internal implementation of the collection,
or the implementation of the methods of the Iterator object. 
They simply do what they do, and can be used for their intended purpose.

An Iterator object acts as a doorkeeper

The Iterator interface declares three methods:

  • hasNext()
  • next()
  • remove()

You might say that an Iterator object acts as a doorkeeper for the
collection object that it represents, providing access to the contents
of the collection in a very specific manner.

The code fragment in Listing 5 below shows how the first two of the
above methods can be used to

  • Traverse the collection, accessing each of the object’s elements in succession
  • Display the value encapsulated in the object referred to by each element

As mentioned earlier, when the collection is an object instantiated from
the TreeSet class, access to the elements is provided in ascending
order.

 

    while(iter.hasNext()){
      System.out.print(iter.next());
    }//end while loop

Listing 5

Four elements with no duplicates

At this point, the TreeSet object contains four elements, with
no duplicates.  Each of the elements is a reference to an object of
type Integer.  The code in the above loop causes each of those
elements to be accessed and displayed in ascending order.  This causes
the following text to appear on the screen:

1234

An editorial opinion

In my opinion, this is the kind of knowledge that a computer science
student in a modern data structures course should be learning.  This
is a far departure from courses of the past where CS2 students were required
to memorize the intricate details of how to implement various data structures.

What kind of knowledge is needed?

Does an architect need to understand the detailed inner workings of
an air conditioning compressor in order to design a cooling system into
a building?  Of course not!

However, the architect does need to know the tradeoffs among the available
cooling systems in terms of initial cost, operating cost, size, efficiency,
etc.

Does an audio technician need to understand the detailed inner workings
of an electronic audio equalizer in order to construct an integrated audio
system?  Absolutely not!  If that were a requirement, there would
likely be very few audio systems in existence.

However, the audio technician does need to understand the tradeoffs
among the various available audio equalizers.

The same concept applies to software design

Does an OOP software designer need to know the detailed inner workings
of the various kinds of collection objects in order to use them effectively? 
No!

However, the software designer does need to know the tradeoffs among
the various types of collection objects in terms of their operational behavior.

Modern CS2 students should be learning about the performance and operational
differences among the different types of collections, and how to use available
frameworks to create and use those collections.  They should not be
wasting their time learning how to reinvent them.  They have more
important ways to spend their valuable time, and they have more important
things to learn.

An analogy

Frankly, I don’t care how the programmers at Sun implemented the TreeSet
class, so long as the behavior of objects instantiated from that class
meets the published specifications.

As an analogy, I also don’t care how they implemented the Random
class, so long as objects instantiated from the Random class provide
the pseudo random values that I need in my programs.

I see no conceptual differences between the TreeSet class and
the Random class from a software reuse viewpoint.

  • I can instantiate an object of the Random class to produce pseudo
    random values, without caring how those values are actually generated. 
    However, if I am working in cryptography, I might need to know how many
    such values can be generated before the sequence repeats.
  • I can use any of the thirty or so methods of the Math class to produce
    a variety of complex mathematical values without caring about how those
    values are actually produced.  However, since many of those values
    are approximations, I might need to know something about the quality of
    the approximation.
  • I can instantiate an object of the TreeSet class to create a collection
    object, which guarantees that the sorted set will be in ascending element
    order, and provide log(n) time cost for the basic operations of
    add,
    remove,
    and contains.  As long as I know that, I have very little need
    to know exactly how the collection object is implemented.

Its time to reinvent the CS2 curriculum

I’m confident that the future employers of most students share my opinion
on this.  I don’t know of any employer who wants their programmers
to expend time and dollars reinventing the classical data structures. 
What those employers are looking for is a staff of programmers who understand
the tradeoffs among the data structures, and when it is appropriate to
use each of the different structures.

It is time to reinvent the curriculum in CS2 courses by

  • Encouraging the understanding of techniques for software reuse
  • Teaching when, why, and how each of the different structures should be
    used
  • Discouraging the reinvention of those structures
  • Summary


    In this lesson, I have provided a brief introduction to the use of the
    Java Collections Framework.  The framework is designed to encourage
    you to reuse rather than to reinvent collections and maps (I will have
    more to say about maps in a subsequent lesson).

    A collection represents a group of objects, known as its elements.

    While some collections allow duplicate elements, others do not. Some
    collections are ordered and others are not ordered.

    The Collections Framework is defined by a set of interfaces and associated
    contracts.  The framework provides concrete implementations of the
    interfaces (classes) for the most common data structures. 
    In addition, the framework also provides several abstract implementations,
    which are designed to make it easier for you to create new and different
    concrete implementations.

    The TreeSet class is a concrete implementation of the SortedSet
    interface.  The SortedSet interface extends Set, which
    extends Collection.  Thus, a TreeSet object is a
    SortedSet.  Also it is a Set, and it is
    a
    Collection.

    The TreeSet class guarantees that the sorted set will be in ascending
    element order, and provides guaranteed log(n) time cost for the basic operations
    (add, remove and contains).

    TreeSet objects can be treated as the generic type Collection
    Methods declared in the Collection interface can be invoked on a
    TreeSet
    object without regard for the actual class from which the object was instantiated.

    (This is polymorphic behavior.)

    When such methods are invoked, the author of the program can have confidence
    that the behavior of the method will be appropriate for an
    object of the class from which the object was instantiated.  In my
    opinion, this is the true essence of object-oriented behavior.

    What’s Next?


    This is the first lesson in a miniseries on the Collection Framework. 
    Subsequent lessons will teach you how to use the framework for creating
    and using various types of collections and maps.

    Once you learn how to use the framework, it is unlikely that you will
    need to reinvent classical data structures, search algorithms, or sorting
    algorithms, because those capabilities are neatly packaged within the framework.

    Complete Program Listing


    A complete listing of the program is provided in Listing
    6 below
    .

     

    import java.util.TreeSet;
    import java.util.Collection;
    import java.util.Iterator;
    
    public class AP400{
      public static void main(
                            String args[]){
        new Worker().doIt();
      }//end main()
    }//end class AP400
    
    class Worker{
      public void doIt(){
        Collection ref = new TreeSet();
        Populator.fillIt(ref);
        Iterator iter = ref.iterator();
        while(iter.hasNext()){
          System.out.print(iter.next());
        }//end while loop
        System.out.println();
        
      }//end doIt()
    }// end class Worker
    
    class Populator{
      public static void fillIt(
                           Collection ref){
        ref.add(new Integer(4));
        ref.add(new Integer(4));
        ref.add(new Integer(3));
        ref.add(new Integer(2));
        ref.add(new Integer(1));
      }//end fillIt()
    }//end class populator
    
    Listing 6

    Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in
    part in any form or medium without express written permission from Richard
    Baldwin is prohibited.

    About the author

    Richard Baldwin
    is a college professor and private consultant whose primary focus is a
    combination of Java and XML. In addition to the many platform-independent
    benefits of Java applications, he believes that a combination of Java and
    XML will become the primary driving force in the delivery of structured
    information on the Web.

    Richard has participated in numerous consulting projects involving
    Java, XML, or a combination of the two.  He frequently provides onsite
    Java and/or XML training at the high-tech companies located in and around
    Austin, Texas.  He is the author of Baldwin’s Java Programming Tutorials,
    which has gained a worldwide following among experienced and aspiring Java
    programmers. He has also published articles on Java Programming in Java
    Pro magazine.

    Richard holds an MSEE degree from Southern Methodist University and
    has many years of experience in the application of computer technology
    to real-world problems.

    [email protected]

Latest Posts

Related Stories