Java Programming, Lecture Notes #1350
- Preface
- Preview
- Introduction
- Sample Program
- Interesting Code Fragments
- Summary
- What’s Next
- Complete Program Listing
Preface
This is the first lesson in a new miniseries on Java data structures and
the Java Collections Framework.
The purpose of this miniseries is to help you learn the essential features
of Object-Oriented data structures in Java using the Collections Framework.
Viewing tip
You may find it useful to open another copy of this lesson in a separate
browser window. That will make it easier for you to scroll back and
forth among the different listings while you are reading about them.
Recommended supplementary reading
I recommend that you also study the other lessons in my extensive collection
of online Java tutorials. You will find those lessons published at
Gamelan.com.
However, as of the date of this writing, Gamelan doesn’t maintain a consolidated
index of my Java tutorial lessons, and sometimes my lessons are difficult
to locate there. You will find a consolidated index at
Baldwin’s
Java Programming Tutorials.
The index on my site provides links to the lessons
at Gamelan.com.
Preview
This lesson provides a brief introduction to the use of the Java Collections
Framework. The framework is designed to encourage you to reuse rather
than to reinvent collections and maps.
A collection represents a group of objects, known as its elements.
Some collections allow duplicate elements while others do not. Some collections
are ordered and others are not. (Maps will be discussed in subsequent
lessons.)
The Collections Framework is defined by a set of interfaces and associated
contracts, and provides concrete implementations of the interfaces for
the most common data structures. In addition, the framework also
provides several abstract implementations, which are designed to make it
easier for you to create new and different implementations while still
maintaining the structural polymorphic integrity of the framework.
Introduction
A little quiz
Let’s begin with a little quiz to establish your baseline knowledge
of the Collections Framework. Take a look at the program in Listing
6 near the end of this lesson. Which of the following is the output
produced by that program?
- A. Compiler Error
- B. Runtime Error
- C. 44321
- D. 12344
- E. 1234
- F. None of the above.
If your answer was 1234 then you already know
quite a lot about the use of the Collections Framework. If not, keep
reading to begin learning about the framework.
Elements of the Framework are easy to use
This simple introductory program is not intended to do anything particularly
useful. Rather, it was designed to illustrate several important features
of the framework, including the ease with which elements of the framework
can be reused in your programs.
Don’t reinvent the wheel
As many of you already know, I am a college professor. I specialize
in teaching OOP using Java. In the past, many college courses in
Data Structures (often referred to as CS2 courses) have emphasized
the concept of reinventing the wheel. Students were required
to learn how to reinvent a variety of complex data structures in order
to successfully complete the course.
Hopefully a change is in the works
Hopefully, with the conversion of these CS2 courses to Java OOP, the
emphasis will change to reuse instead of reinvent.
Unfortunately, most of the CS2 Java textbooks that I have seen so far
look like warmed-over textbooks from the days of Pascal, C, and C++.
Many look like they were written using cut-and-paste technology to update
an old Pascal, C, or C++ textbook to incorporate a little Java OOP.
Most of the textbooks that I have seen include only enough Java OOP to
avoid violating truth in advertising requirements. They still
emphasize
reinvent instead of reuse.
Collections Framework encourages reuse
The Java Collections Framework is designed to encourage programmers
to reuse existing interfaces and classes instead of inventing new ones.
In the event that it is necessary to invent a new class or interface, the
programmer is encouraged to integrate it into the Framework in a polymorphic
manner.
Sample Program
I am going to provide a brief discussion of the sample
program (shown in Listing 6) in this lesson. Later, I will
provide more detailed discussions of many of the features used in that
program.
Interesting
Code Fragments
I will break this program down and discuss it in fragments.
An object of the TreeSet class
The code fragment in Listing 1 instantiates an object of the TreeSet
class and stores the object’s reference in a reference variable of type
Collection.
class Worker{ public void doIt(){ Collection ref = new TreeSet(); Listing 1 |
Collection is an interface
The TreeSet class implements the SortedSet interface,
which extends the Set interface, which in turn extends the Collection
interface. Thus, a TreeSet object is a Collection.
Therefore, a reference to a TreeSet object can be stored in a reference
variable of type Collection, and can be treated as the generic type
Collection.
What is a TreeSet object?
Among other things, in CS2 courses, we worry about the time and memory
cost of a collection. According to Sun, the TreeSet class
guarantees that the sorted set will be in ascending element order, and
provides guaranteed log(n) time cost for the basic operations (add,
remove
and contains).
What does ascending element order mean?
Again, according to Sun, the elements will be sorted according to the
natural
order of the elements (see the Comparable interface) or by a
comparator (see the Comparator interface) provided at the time the
set is created. This depends on which overloaded constructor is used.
I will have more to say about these alternatives in a subsequent lesson.
What does log(n) time cost mean?
I’m not going to try to explain the details of log(n) time cost here.
Suffice it to say that the add, remove, and contains
methods execute very fast. (I will have more to say about this
is a subsequent lesson.)
A TreeSet object is a Set
An object of the TreeSet class also is a Set.
One of the characteristics of a Java Set (an object that implements
the Set interface) is that it can contain no duplicate elements.
Therefore, a TreeSet object can contain no duplicate elements.
If the add() method of a TreeSet object is invoked in an
attempt to add a duplicate element, the element will not be added.
A TreeSet object is a SortedSet
The TreeSet class also implements the SortedSet interface.
This guarantees that the contents of a TreeSet object will be in
ascending element order, regardless of the order in which the elements
are added. (In a subsequent lesson, I will discuss how comparisons
are made to enforce the ordering of the elements.)
A TreeSet object is a Collection
Because an object of the TreeSet class is a Collection,
a reference to such an object can be passed to any method that requires
an incoming parameter of type Collection. The receiving method
can invoke any method on that reference that is declared in the Collection
interface.
(I will discuss such methods in detail in subsequent
lessons.)
Populate the Collection
The statement in Listing 2 below passes the TreeSet object’s
reference to a method named fillIt(), which is a class method of
the Populator class. (The Populator class is a class of
my own design whose only purpose is to illustrate the polymorphic behavior
achieved using the Collections Framework.) The behavior of this
method is to add elements to the incoming Collection object without
regard for the actual type of the object (the class from which the object
was instantiated).
Populator.fillIt(ref); Listing 2 |
At this point, I am going to discuss the fillIt() method of the
Populator
class invoked in the above fragment. The entire class definition
of the Populator class is shown in Listing 3 below.
class Populator{ public static void fillIt( Collection ref){ ref.add(new Integer(4)); ref.add(new Integer(4)); ref.add(new Integer(3)); ref.add(new Integer(2)); ref.add(new Integer(1)); }//end fillIt() }//end class populator Listing 3 |
Don’t know, don’t care
As you can see in the above fragment, the fillIt() method receives
the reference to the TreeSet object as type Collection.
This method doesn’t know, and doesn’t care, what the actual type of the
object is. All it cares about is that the object is a Collection
object. (Otherwise, the object’s reference couldn’t be passed
in as a parameter. A type mismatch would occur.)
Because the incoming parameter is a reference to a Collection
object, the fillIt() method can invoke the add() method on
the object, with confidence that the behavior of the add() method
will be appropriate for the specific type of object involved. (For
example, the behavior of the add() method for an object of the TreeSet
class will probably be different from the behavior of the add() method
for an object of some other class that implements the Collection interface.)
Polymorphism is great!
The great thing about polymorphic behavior is that the author of the
fillIt() method doesn’t need to be concerned about the implementation
details of the add() method.
Add five elements with some duplicates
The code in the fillIt() method adds five elements to the object.
Each element is a reference to a new object of type Integer.
Two of the objects encapsulate the int value 4, and thus are duplicates.
The int values encapsulated in the Integer objects are
not in ascending order. Rather, they are added to the object in descending
order. (They could be added in any order and the end result would
be the same.)
Filter out the duplicates
Unknown to the author of the fillIt() method, the add()
method filters out the duplicate element in order to satisfy the contract
of the Collection interface. In this case, the author didn’t
care what happens in the case of duplicate elements.
Notification of duplicates
If the author of the fillIt() method does care what happens in
the case of duplicates, she can find out when an object is a duplicate.
According to the contract of the Collection interface, the add()
method must return
true if the invocation of the method modifies
the contents of the object and must return false if the collection
does not permit duplicates and the collection already contains the specified
element.
Sort the elements
Also unknown to the author of the fillIt() method, even though
the elements are added in descending order (or could be added in any
other order for that matter), they are stored and maintained in the
TreeSet object in such a way that they can later be accessed in
ascending order.
The TreeSet object is now populated
When the fillIt() method returns, the TreeSet object contains
four (not five) elements with no duplicates. Each element
is a reference to an object of type Integer. Those references
are maintained in such a way as to make them accessible in ascending order,
based on the
int values encapsulated in each of the Integer
objects.
Get an Iterator object
Returning now to the flow in the doIt() method, the following
statement invokes the iterator() method on the TreeSet object’s
reference that is stored in the reference variable of type
Collection.
This is shown in Listing 4 below.
Iterator iter = ref.iterator(); Listing 4 |
Invocation of the iterator() method on any Collection
object returns an instance of a class that implements the Iterator
interface. The Iterator object can be used to traverse the
collection, gaining access to each element in order. (The concept
of in order means different things for different kinds of collections.
For a collection instantiated from the TreeSet class, in order means in
ascending order.)
Again, don’t know, don’t care
Again, the author of the method that uses the Collection object
doesn’t need to know or care about the internal implementation of the collection,
or the implementation of the methods of the Iterator object.
They simply do what they do, and can be used for their intended purpose.
An Iterator object acts as a doorkeeper
The Iterator interface declares three methods:
- hasNext()
- next()
- remove()
You might say that an Iterator object acts as a doorkeeper for the
collection object that it represents, providing access to the contents
of the collection in a very specific manner.
The code fragment in Listing 5 below shows how the first two of the
above methods can be used to
- Traverse the collection, accessing each of the object’s elements in succession
- Display the value encapsulated in the object referred to by each element
As mentioned earlier, when the collection is an object instantiated from
the TreeSet class, access to the elements is provided in ascending
order.
while(iter.hasNext()){ System.out.print(iter.next()); }//end while loop Listing 5 |
Four elements with no duplicates
At this point, the TreeSet object contains four elements, with
no duplicates. Each of the elements is a reference to an object of
type Integer. The code in the above loop causes each of those
elements to be accessed and displayed in ascending order. This causes
the following text to appear on the screen:
1234
An editorial opinion
In my opinion, this is the kind of knowledge that a computer science
student in a modern data structures course should be learning. This
is a far departure from courses of the past where CS2 students were required
to memorize the intricate details of how to implement various data structures.
What kind of knowledge is needed?
Does an architect need to understand the detailed inner workings of
an air conditioning compressor in order to design a cooling system into
a building? Of course not!
However, the architect does need to know the tradeoffs among the available
cooling systems in terms of initial cost, operating cost, size, efficiency,
etc.
Does an audio technician need to understand the detailed inner workings
of an electronic audio equalizer in order to construct an integrated audio
system? Absolutely not! If that were a requirement, there would
likely be very few audio systems in existence.
However, the audio technician does need to understand the tradeoffs
among the various available audio equalizers.
The same concept applies to software design
Does an OOP software designer need to know the detailed inner workings
of the various kinds of collection objects in order to use them effectively?
No!
However, the software designer does need to know the tradeoffs among
the various types of collection objects in terms of their operational behavior.
Modern CS2 students should be learning about the performance and operational
differences among the different types of collections, and how to use available
frameworks to create and use those collections. They should not be
wasting their time learning how to reinvent them. They have more
important ways to spend their valuable time, and they have more important
things to learn.
An analogy
Frankly, I don’t care how the programmers at Sun implemented the TreeSet
class, so long as the behavior of objects instantiated from that class
meets the published specifications.
As an analogy, I also don’t care how they implemented the Random
class, so long as objects instantiated from the Random class provide
the pseudo random values that I need in my programs.
I see no conceptual differences between the TreeSet class and
the Random class from a software reuse viewpoint.
-
I can instantiate an object of the Random class to produce pseudo
random values, without caring how those values are actually generated.
However, if I am working in cryptography, I might need to know how many
such values can be generated before the sequence repeats. -
I can use any of the thirty or so methods of the Math class to produce
a variety of complex mathematical values without caring about how those
values are actually produced. However, since many of those values
are approximations, I might need to know something about the quality of
the approximation. -
I can instantiate an object of the TreeSet class to create a collection
object, which guarantees that the sorted set will be in ascending element
order, and provide log(n) time cost for the basic operations of
add,
remove,
and contains. As long as I know that, I have very little need
to know exactly how the collection object is implemented.
Its time to reinvent the CS2 curriculum
I’m confident that the future employers of most students share my opinion
on this. I don’t know of any employer who wants their programmers
to expend time and dollars reinventing the classical data structures.
What those employers are looking for is a staff of programmers who understand
the tradeoffs among the data structures, and when it is appropriate to
use each of the different structures.
It is time to reinvent the curriculum in CS2 courses by
- Encouraging the understanding of techniques for software reuse
-
Teaching when, why, and how each of the different structures should be
used - Discouraging the reinvention of those structures
Summary
In this lesson, I have provided a brief introduction to the use of the
Java Collections Framework. The framework is designed to encourage
you to reuse rather than to reinvent collections and maps (I will have
more to say about maps in a subsequent lesson).
A collection represents a group of objects, known as its elements.
While some collections allow duplicate elements, others do not. Some
collections are ordered and others are not ordered.
The Collections Framework is defined by a set of interfaces and associated
contracts. The framework provides concrete implementations of the
interfaces (classes) for the most common data structures.
In addition, the framework also provides several abstract implementations,
which are designed to make it easier for you to create new and different
concrete implementations.
The TreeSet class is a concrete implementation of the SortedSet
interface. The SortedSet interface extends Set, which
extends Collection. Thus, a TreeSet object is a
SortedSet. Also it is a Set, and it is
a Collection.
The TreeSet class guarantees that the sorted set will be in ascending
element order, and provides guaranteed log(n) time cost for the basic operations
(add, remove and contains).
TreeSet objects can be treated as the generic type Collection.
Methods declared in the Collection interface can be invoked on a
TreeSet
object without regard for the actual class from which the object was instantiated.
(This is polymorphic behavior.)
When such methods are invoked, the author of the program can have confidence
that the behavior of the method will be appropriate for an
object of the class from which the object was instantiated. In my
opinion, this is the true essence of object-oriented behavior.
What’s Next?
This is the first lesson in a miniseries on the Collection Framework.
Subsequent lessons will teach you how to use the framework for creating
and using various types of collections and maps.
Once you learn how to use the framework, it is unlikely that you will
need to reinvent classical data structures, search algorithms, or sorting
algorithms, because those capabilities are neatly packaged within the framework.
Complete Program Listing
A complete listing of the program is provided in Listing
6 below.
import java.util.TreeSet; import java.util.Collection; import java.util.Iterator; public class AP400{ public static void main( String args[]){ new Worker().doIt(); }//end main() }//end class AP400 class Worker{ public void doIt(){ Collection ref = new TreeSet(); Populator.fillIt(ref); Iterator iter = ref.iterator(); while(iter.hasNext()){ System.out.print(iter.next()); }//end while loop System.out.println(); }//end doIt() }// end class Worker class Populator{ public static void fillIt( Collection ref){ ref.add(new Integer(4)); ref.add(new Integer(4)); ref.add(new Integer(3)); ref.add(new Integer(2)); ref.add(new Integer(1)); }//end fillIt() }//end class populator Listing 6 |
Copyright 2000, Richard G. Baldwin. Reproduction in whole or in
part in any form or medium without express written permission from Richard
Baldwin is prohibited.
About the author
Richard Baldwin
is a college professor and private consultant whose primary focus is a
combination of Java and XML. In addition to the many platform-independent
benefits of Java applications, he believes that a combination of Java and
XML will become the primary driving force in the delivery of structured
information on the Web.
Richard has participated in numerous consulting projects involving
Java, XML, or a combination of the two. He frequently provides onsite
Java and/or XML training at the high-tech companies located in and around
Austin, Texas. He is the author of Baldwin’s Java Programming Tutorials,
which has gained a worldwide following among experienced and aspiring Java
programmers. He has also published articles on Java Programming in Java
Pro magazine.
Richard holds an MSEE degree from Southern Methodist University and
has many years of experience in the application of computer technology
to real-world problems.