Java Enterprise Java Data Structures in Java: Part 6, Duplicate Elements, Ordered Collections, and More

Data Structures in Java: Part 6, Duplicate Elements, Ordered Collections, and More

Java Programming, Lecture Notes #1360


Preface


This is the sixth lesson in a miniseries on Java data structures and the
Java Collections Framework.  The first lesson in the miniseries was
entitled
Data
Structures in Java: Part 1, Getting Started
.  The previous lesson
was entitled Data
Structures in Java: Part 5, The Core Collection Interfaces
.

The purpose of this miniseries is to help you learn the essential features
of Object-Oriented data structures in Java using the Collections Framework.

Viewing tip

You may find it useful to open another copy of this lesson in a separate
browser window.  That will make it easier for you to scroll back and
forth among the different figures and listings while you are reading about
them.

Supplementary material

I recommend that you also study the other lessons in my extensive collection
of online Java tutorials.  You will find those lessons published at
Gamelan.com
However, as of the date of this writing, Gamelan doesn’t maintain a consolidated
index of my Java tutorial lessons, and sometimes they are difficult to
locate there.  You will find a consolidated index at
Baldwin’s
Java Programming Tutorials
.

The index on my site provides links to the lessons
at
Gamelan.com.

Preview


In the previous lesson, entitled Data
Structures in Java: Part 5, The Core Collection Interfaces
, you learned
that the Java Collections Framework defines six core interfaces, in two
distinct trees.  You learned the names and the inheritance structure
of those interfaces.  You also learned about their purpose. 
You saw how the interfaces declare polymorphic methods that apply to implementations
of the interfaces, and you learned about the optional methods of the Collection
interface.

In this lesson you will learn that all of the
implementations of the interfaces in the Java Collections Framework implement
one of the subinterfaces of the Collection interface.  You
will learn that a Set object cannot contain duplicate elements,
but a List object can contain duplicate elements.

You will learn about the difference between ordered collections
and sorted collections.  You will also learn about ascending
order
and the natural ordering of objects.  In addition,
you will learn how more specialized stipulations are placed on interfaces
as you progress down the interface inheritance hierarchy of the Java Collections
Framework.

Discussion


Let’s start with a quiz

As is often the case, I am going to begin this lesson with a little
quiz just to make sure that you are awake.  Is the following statement
True or False?

The TreeSet class is a direct implementation of the
Collection
interface.

The answer is False.  The TreeSet class is not a direct implementation
of the Collection interface.  Rather, the TreeSet class
is a direct implementation of the SortedSet interface.  The
SortedSet
interface extends the Set interface, and the
Set interface
extends the Collection interface.

The root of the collection hierarchy

The Collection interface is the root of the collection hierarchy. 
JDK 1.3 doesn’t provide any direct implementations of the Collection
interface.  All of the implementations of the interfaces in the Java
Collections Framework implement one of the subinterfaces of the Collection
interface.

What does Sun say about this?

Here is what the Sun documentation has to say on the topic of the Collection
interface:

“The SDK does not provide any direct implementations of this interface:
it provides implementations of more specific subinterfaces like Set and
List. This interface is typically used to pass collections around and manipulate
them where maximum generality is desired.”

The Sun documentation also states:

“Bags or multisets (unordered collections that may contain duplicate
elements) should implement this interface directly.”

However, JDK 1.3 does not provide any implementations for bags or multisets. 
If you need collections of these types, you will need to define the implementation
classes yourself.

What about duplicate elements?

Some implementations of Collection allow
duplicate elements, and others do not.  Implementations of the List
interface
(such as ArrayList) allow duplicate elements.  Implements
of
Set and SortedSet (such as TreeSet) do not allow
duplicate elements.  This was illustrated in a previous lesson entitled
Data
Structures in Java: Part 5, The Core Collection Interfaces
.

A sample program in that lesson created two collection objects and applied
the polymorphic add() method to add the same elements to each collection. 
One of the collection objects was of type ArrayList, and the other
collection object was of type TreeSet.  The elements added
to each collection contained one pair of duplicate elements.  The
duplicate element was automatically excluded from the TreeSet object,
but was retained in the ArrayList object.

So, what is a set?

According to Sun, a Set is a “collection
that contains no duplicate elements … this interface models the mathematical
set abstraction.”
An object of type Set is typically used to
model collections such as Social Security numbers where duplicates are
not allowed.

And, what is a list?

Also according to Sun, a List is “An
ordered collection (also known as a sequence). The user of this interface
has precise control over where in the list each element is inserted. The
user can access elements by their integer index (position in the list),
and search for elements in the list.”

Ordered is not the same as sorted

Note that an ordered collection is not
the same as a sorted collection.  The fact that the collection
is ordered derives from the fact that each element in the collection has
a specific position specified by an index.  In a sorted collection,
the position of each element is determined by its value relative to the
values of its predecessors and successors.

Sun goes on to say, “Unlike sets, lists typically
allow duplicate elements. More formally, lists typically allow pairs of
elements e1 and e2 such that e1.equals(e2), and they typically allow multiple
null elements if they allow null elements at all.”

Is ascending sort order always required?

Not all implementations of the Collection interface maintain
the elements in ascending sort order.  Some may, and others do not. 
For example, as discussed above, implementations of the List interface
(such
as ArrayList)
do not maintain their elements in sorted order at all. 
In other words, the position of an element in an ArrayList does
not depend on the value of the element.

On the other hand, implementations of the interfaces named SortedSet
(such
as TreeSet)
and SortedMap
do maintain their elements in sorted
order.  However, that order is not necessarily ascending.  When
an object is instantiated from a class that implements the SortedSet
interface, the sorting order for that object can be established by providing
an object instantiated from a class that implements the Comparator
interface.  In that case, the author of the implementing class determines
the order imposed on the elements in the collection.

Does case matter in String objects?

For example, if your SortedSet object contains references to
String
objects, the natural ascending sort would take the difference between upper
case and lower case characters into account.  However, you might prefer
that case be ignored when establishing the sorted order.  You can
accomplish this by providing an object of a class that implements the Comparator
interface and which defines the compare() and equals() methods
in such a way as to eliminate case considerations for comparisons of String
objects.  (The lesson entitled
Swing from
A to Z:  Analyzing Swing Components, Part 1, Concepts
contains
a sample program named Introspect03 that implements the Comparator interface
for exactly this purpose.)

Subinterfaces have more stipulations

As you progress down the inheritance hierarchy,
you find that additional stipulations apply at each level of inheritance. 
As an example, according to Sun, “The Set interface places additional
stipulations, beyond those inherited from the Collection interface, on
the contracts of all constructors and on the contracts of the add,
equals and hashCode methods.”

The important point is that specific subinterfaces
of the Collection interface can define requirements that do not
apply to all subinterfaces of the Collection interface.  For
example, the add() method of the Set interface stipulates
the following:

“Adds the specified element to this
set if it is not already present.”

On the other hand, the add() method of the
Collection
interface simply states:

“Ensures that this collection contains
the specified element.”

Thus, the contract for the add() method of
an object of a class that implements the Set interface is more specialized
than the contract for the add() method of an object of a class that
implements Collection interface.

An additional stipulation on the constructor for a Set object
is that all constructors must create a set that contains no duplicate elements.

Stipulations on SortedSet

The SortedSet interface extends the Set interface. The
SortedSet
interface contains the following stipulation that makes it more specialized
than a Set.

“A set that further guarantees that its iterator will traverse the
set in ascending element order, sorted according to the natural ordering
of its elements (see Comparable), or by a Comparator provided at sorted
set creation time.”

Let’s end with a quiz

I’m going to finish this lesson with several questions in the form of
a quiz.  To ensure that this is a learning experience, I will provide
an explanation in addition to the answer for each question.

Q1  True or False?  A
collection that implements the List interface maintains its elements
in ascending alphanumeric order.

The answer to question 1 is false.  Unlike collections that implement
the SortedSet interface, the order of the elements in a collection
that implements the List interface is not based on the values of
the objects referred to by the elements in the list.

Q2  True or False?  A
collection that implements the List interface is an unordered collection.

The answer to question 2 is also false.  A collection that implements
the List interface is an ordered collection
(also known as a
sequence)
.  According to Sun, “The user of the interface has
precise control over where in the list each element is inserted.”
Elements
can be inserted and retrieved on the basis of their integer index (position
in the list)
using the following methods:

add(int index, Object element)

get(int index)

Valid index values are positive integers that begin with zero. 
When the add method is used to insert an element at a specific position
in the sequence, the element currently at that position (if any)
and any subsequent elements are shifted toward higher index values to make
room for the new element.

Another version of the add method takes a reference to an object
as an incoming parameter and appends the specified element to the end of
the collection.

The get method simply returns the element at the specified position
in the collection.

The List interface also declares various other methods that can
be used to manipulate the contents of the collection.

Q3  True or False?  A
collection that implements the List interface is allowed to contain
duplicate values.

The answer to question 3 is true.  Unlike
a collection that implements the Set interface, a
collection
that implements the List interface is typically allowed to contain
duplicate values.  More formally, according to Sun, “lists typically
allow pairs of elements e1 and e2 such that e1.equals(e2),
and they typically allow multiple null elements if they allow null elements
at all.”

Q4  True or False?  The
contracts of the methods in the List interface are the same as the
contracts of the methods inherited from the Collection interface.

The answer to question 4 is false.  According
to Sun, “The List interface places additional stipulations, beyond
those specified in the Collection interface, on the contracts of
the iterator, add, remove,
equals,
and hashCode methods.”

For example, the iterator() method (for
both the List and Collection interfaces)
returns an iterator
over the elements in the collection.  For the Collection interface,
there are no guarantees concerning the order in which the elements are
returned by the methods of the Iterator object.

On the other hand, the iterator() method
for the List interface returns an iterator over the elements in
the collection in proper sequence, where the sequence is determined by
the numeric index.  In other words, when you invoke the methods of
the Iterator object on a List, the elements will be returned
in the proper sequence as determined by a numeric index.

Similarly, according to Sun, the SortedSet
interface “guarantees that its iterator will traverse the set in ascending
element order, sorted according to the natural ordering of its elements
(see Comparable), or by a Comparator provided at sorted set creation time.”

Summary


In this lesson you learned that all of the
implementations of the interfaces in the Java Collections Framework implement
one of the subinterfaces of the Collection interface.  You
learned that a Set object cannot contain duplicate elements, but
a List object can contain duplicate elements.

You learned about the difference between ordered collections
and sorted collections.  You also learned about ascending
order
and the natural ordering of objects.  In addition,
you learned how more specialized stipulations are placed on interfaces
as you progress down the interface inheritance hierarchy of the Java Collections
Framework.

What’s Next?


The SortedSet interface “guarantees that
its iterator will traverse the set in ascending element order, sorted according
to the natural ordering of its elements (see Comparable), or by a Comparator
provided at sorted set creation time.” 
In the next lesson, I
will show you how to use the Comparator interface to control the
sorted order of your collections.


Copyright 2000, Richard G. Baldwin.  Reproduction in whole or in
part in any form or medium without express written permission from Richard
Baldwin is prohibited.

About the author

Richard Baldwin
is a college professor and private consultant whose primary focus is a
combination of Java and XML. In addition to the many platform-independent
benefits of Java applications, he believes that a combination of Java and
XML will become the primary driving force in the delivery of structured
information on the Web.

Richard has participated in numerous consulting projects involving
Java, XML, or a combination of the two.  He frequently provides onsite
Java and/or XML training at the high-tech companies located in and around
Austin, Texas.  He is the author of Baldwin’s Java Programming Tutorials,
which has gained a worldwide following among experienced and aspiring Java
programmers. He has also published articles on Java Programming in Java
Pro magazine.

Richard holds an MSEE degree from Southern Methodist University and
has many years of experience in the application of computer technology
to real-world problems.

[email protected]

Latest Posts

Related Stories