
 |
Requirements and Test Management kit
For effective test automation IBM Rational offers a team-based, test management solution set that includes requirements, test management and change management for streamlined test planning, test execution, and results analysis. In this kit you will find demos and articles on Rational requirements and test management tools and best practices.
»
Use Rational's Change and Release Management Solution in Your IDE
This demonstration illustrates how Rational® ClearCase®, ClearQuest® and Build Forge® can be used to help developers develop, build, and test their applications in the comfort of their IDE.
»
Using IBM Rational Tester for SOA Quality: Testing SOAP Secured Web Services
Learn how to test SOAP-secured Web services by using the IBM Rational Tester for SOA quality.
»
IBM Rational Testing Ekits
Access five complimentary kits for testers, developers and business domain experts. Each kit provides a collection of materials that can help you understand and use IBM Rational testing tools and best practices.
»
|
 |
|
  |
|
|
|
|

Data Structures in Java: Part 1, Getting Started
By Richard G. Baldwin
Go to page: 1 2 Next
Java Programming, Lecture Notes #1350
Preface
This is the first lesson in a new miniseries on Java data structures and
the Java Collections Framework.
The purpose of this miniseries is to help you learn the essential features
of Object-Oriented data structures in Java using the Collections Framework.
Viewing tip
You may find it useful to open another copy of this lesson in a separate
browser window. That will make it easier for you to scroll back and
forth among the different listings while you are reading about them.
Recommended supplementary reading
I recommend that you also study the other lessons in my extensive collection
of online Java tutorials. You will find those lessons published at
Gamelan.com.
However, as of the date of this writing, Gamelan doesn't maintain a consolidated
index of my Java tutorial lessons, and sometimes my lessons are difficult
to locate there. You will find a consolidated index at
Baldwin's
Java Programming Tutorials.
The index on my site provides links to the lessons
at Gamelan.com.
Preview
This lesson provides a brief introduction to the use of the Java Collections
Framework. The framework is designed to encourage you to reuse rather
than to reinvent collections and maps.
A collection represents a group of objects, known as its elements.
Some collections allow duplicate elements while others do not. Some collections
are ordered and others are not. (Maps will be discussed in subsequent
lessons.)
The Collections Framework is defined by a set of interfaces and associated
contracts, and provides concrete implementations of the interfaces for
the most common data structures. In addition, the framework also
provides several abstract implementations, which are designed to make it
easier for you to create new and different implementations while still
maintaining the structural polymorphic integrity of the framework.
Introduction
A little quiz
Let's begin with a little quiz to establish your baseline knowledge
of the Collections Framework. Take a look at the program in Listing
6 near the end of this lesson. Which of the following is the output
produced by that program?
-
A. Compiler Error
-
B. Runtime Error
-
C. 44321
-
D. 12344
-
E. 1234
-
F. None of the above.
If your answer was 1234 then you already know
quite a lot about the use of the Collections Framework. If not, keep
reading to begin learning about the framework.
Elements of the Framework are easy to use
This simple introductory program is not intended to do anything particularly
useful. Rather, it was designed to illustrate several important features
of the framework, including the ease with which elements of the framework
can be reused in your programs.
Don't reinvent the wheel
As many of you already know, I am a college professor. I specialize
in teaching OOP using Java. In the past, many college courses in
Data Structures (often referred to as CS2 courses) have emphasized
the concept of reinventing the wheel. Students were required
to learn how to reinvent a variety of complex data structures in order
to successfully complete the course.
Hopefully a change is in the works
Hopefully, with the conversion of these CS2 courses to Java OOP, the
emphasis will change to reuse instead of reinvent.
Unfortunately, most of the CS2 Java textbooks that I have seen so far
look like warmed-over textbooks from the days of Pascal, C, and C++.
Many look like they were written using cut-and-paste technology to update
an old Pascal, C, or C++ textbook to incorporate a little Java OOP.
Most of the textbooks that I have seen include only enough Java OOP to
avoid violating truth in advertising requirements. They still
emphasize
reinvent instead of reuse.
Collections Framework encourages reuse
The Java Collections Framework is designed to encourage programmers
to reuse existing interfaces and classes instead of inventing new ones.
In the event that it is necessary to invent a new class or interface, the
programmer is encouraged to integrate it into the Framework in a polymorphic
manner.
Sample Program
I am going to provide a brief discussion of the sample
program (shown in Listing 6) in this lesson. Later, I will
provide more detailed discussions of many of the features used in that
program.
Interesting
Code Fragments
I will break this program down and discuss it in fragments.
An object of the TreeSet class
The code fragment in Listing 1 instantiates an object of the TreeSet
class and stores the object's reference in a reference variable of type
Collection.
class Worker{
public void doIt(){
Collection ref = new TreeSet();
Listing 1
|
Collection is an interface
The TreeSet class implements the SortedSet interface,
which extends the Set interface, which in turn extends the Collection
interface. Thus, a TreeSet object is a Collection.
Therefore, a reference to a TreeSet object can be stored in a reference
variable of type Collection, and can be treated as the generic type
Collection.
What is a TreeSet object?
Among other things, in CS2 courses, we worry about the time and memory
cost of a collection. According to Sun, the TreeSet class
guarantees that the sorted set will be in ascending element order, and
provides guaranteed log(n) time cost for the basic operations (add,
remove
and contains).
What does ascending element order mean?
Again, according to Sun, the elements will be sorted according to the
natural
order of the elements (see the Comparable interface) or by a
comparator (see the Comparator interface) provided at the time the
set is created. This depends on which overloaded constructor is used.
I will have more to say about these alternatives in a subsequent lesson.
What does log(n) time cost mean?
I'm not going to try to explain the details of log(n) time cost here.
Suffice it to say that the add, remove, and contains
methods execute very fast. (I will have more to say about this
is a subsequent lesson.)
A TreeSet object is a Set
An object of the TreeSet class also is a Set.
One of the characteristics of a Java Set (an object that implements
the Set interface) is that it can contain no duplicate elements.
Therefore, a TreeSet object can contain no duplicate elements.
If the add() method of a TreeSet object is invoked in an
attempt to add a duplicate element, the element will not be added.
A TreeSet object is a SortedSet
The TreeSet class also implements the SortedSet interface.
This guarantees that the contents of a TreeSet object will be in
ascending element order, regardless of the order in which the elements
are added. (In a subsequent lesson, I will discuss how comparisons
are made to enforce the ordering of the elements.)
A TreeSet object is a Collection
Because an object of the TreeSet class is a Collection,
a reference to such an object can be passed to any method that requires
an incoming parameter of type Collection. The receiving method
can invoke any method on that reference that is declared in the Collection
interface.
(I will discuss such methods in detail in subsequent
lessons.)
Populate the Collection
The statement in Listing 2 below passes the TreeSet object's
reference to a method named fillIt(), which is a class method of
the Populator class. (The Populator class is a class of
my own design whose only purpose is to illustrate the polymorphic behavior
achieved using the Collections Framework.) The behavior of this
method is to add elements to the incoming Collection object without
regard for the actual type of the object (the class from which the object
was instantiated).
Populator.fillIt(ref);
Listing 2
|
At this point, I am going to discuss the fillIt() method of the
Populator
class invoked in the above fragment. The entire class definition
of the Populator class is shown in Listing 3 below.
class Populator{
public static void fillIt(
Collection ref){
ref.add(new Integer(4));
ref.add(new Integer(4));
ref.add(new Integer(3));
ref.add(new Integer(2));
ref.add(new Integer(1));
}//end fillIt()
}//end class populator
Listing 3
|
Don't know, don't care
As you can see in the above fragment, the fillIt() method receives
the reference to the TreeSet object as type Collection.
This method doesn't know, and doesn't care, what the actual type of the
object is. All it cares about is that the object is a Collection
object. (Otherwise, the object's reference couldn't be passed
in as a parameter. A type mismatch would occur.)
Because the incoming parameter is a reference to a Collection
object, the fillIt() method can invoke the add() method on
the object, with confidence that the behavior of the add() method
will be appropriate for the specific type of object involved. (For
example, the behavior of the add() method for an object of the TreeSet
class will probably be different from the behavior of the add() method
for an object of some other class that implements the Collection interface.)
Polymorphism is great!
The great thing about polymorphic behavior is that the author of the
fillIt() method doesn't need to be concerned about the implementation
details of the add() method.
Add five elements with some duplicates
The code in the fillIt() method adds five elements to the object.
Each element is a reference to a new object of type Integer.
Two of the objects encapsulate the int value 4, and thus are duplicates.
The int values encapsulated in the Integer objects are
not in ascending order. Rather, they are added to the object in descending
order. (They could be added in any order and the end result would
be the same.)
Filter out the duplicates
Unknown to the author of the fillIt() method, the add()
method filters out the duplicate element in order to satisfy the contract
of the Collection interface. In this case, the author didn't
care what happens in the case of duplicate elements.
Notification of duplicates
If the author of the fillIt() method does care what happens in
the case of duplicates, she can find out when an object is a duplicate.
According to the contract of the Collection interface, the add()
method must return
true if the invocation of the method modifies
the contents of the object and must return false if the collection
does not permit duplicates and the collection already contains the specified
element.
Sort the elements
Also unknown to the author of the fillIt() method, even though
the elements are added in descending order (or could be added in any
other order for that matter), they are stored and maintained in the
TreeSet object in such a way that they can later be accessed in
ascending order.
The TreeSet object is now populated
When the fillIt() method returns, the TreeSet object contains
four (not five) elements with no duplicates. Each element
is a reference to an object of type Integer. Those references
are maintained in such a way as to make them accessible in ascending order,
based on the
int values encapsulated in each of the Integer
objects.
Get an Iterator object
Returning now to the flow in the doIt() method, the following
statement invokes the iterator() method on the TreeSet object's
reference that is stored in the reference variable of type
Collection.
This is shown in Listing 4 below.
Iterator iter = ref.iterator();
Listing 4
|
Invocation of the iterator() method on any Collection
object returns an instance of a class that implements the Iterator
interface. The Iterator object can be used to traverse the
collection, gaining access to each element in order. (The concept
of in order means different things for different kinds of collections.
For a collection instantiated from the TreeSet class, in order means in
ascending order.)
Again, don't know, don't care
Again, the author of the method that uses the Collection object
doesn't need to know or care about the internal implementation of the collection,
or the implementation of the methods of the Iterator object.
They simply do what they do, and can be used for their intended purpose.
An Iterator object acts as a doorkeeper
The Iterator interface declares three methods:
-
hasNext()
-
next()
-
remove()
You might say that an Iterator object acts as a doorkeeper for the
collection object that it represents, providing access to the contents
of the collection in a very specific manner.
The code fragment in Listing 5 below shows how the first two of the
above methods can be used to
-
Traverse the collection, accessing each of the object's elements in succession
-
Display the value encapsulated in the object referred to by each element
As mentioned earlier, when the collection is an object instantiated from
the TreeSet class, access to the elements is provided in ascending
order.
while(iter.hasNext()){
System.out.print(iter.next());
}//end while loop
Listing 5
|
Four elements with no duplicates
At this point, the TreeSet object contains four elements, with
no duplicates. Each of the elements is a reference to an object of
type Integer. The code in the above loop causes each of those
elements to be accessed and displayed in ascending order. This causes
the following text to appear on the screen:
1234
An editorial opinion
In my opinion, this is the kind of knowledge that a computer science
student in a modern data structures course should be learning. This
is a far departure from courses of the past where CS2 students were required
to memorize the intricate details of how to implement various data structures.
What kind of knowledge is needed?
Does an architect need to understand the detailed inner workings of
an air conditioning compressor in order to design a cooling system into
a building? Of course not!
However, the architect does need to know the tradeoffs among the available
cooling systems in terms of initial cost, operating cost, size, efficiency,
etc.
Does an audio technician need to understand the detailed inner workings
of an electronic audio equalizer in order to construct an integrated audio
system? Absolutely not! If that were a requirement, there would
likely be very few audio systems in existence.
However, the audio technician does need to understand the tradeoffs
among the various available audio equalizers.
The same concept applies to software design
Does an OOP software designer need to know the detailed inner workings
of the various kinds of collection objects in order to use them effectively?
No!
However, the software designer does need to know the tradeoffs among
the various types of collection objects in terms of their operational behavior.
Modern CS2 students should be learning about the performance and operational
differences among the different types of collections, and how to use available
frameworks to create and use those collections. They should not be
wasting their time learning how to reinvent them. They have more
important ways to spend their valuable time, and they have more important
things to learn.
An analogy
Frankly, I don't care how the programmers at Sun implemented the TreeSet
class, so long as the behavior of objects instantiated from that class
meets the published specifications.
As an analogy, I also don't care how they implemented the Random
class, so long as objects instantiated from the Random class provide
the pseudo random values that I need in my programs.
I see no conceptual differences between the TreeSet class and
the Random class from a software reuse viewpoint.
-
I can instantiate an object of the Random class to produce pseudo
random values, without caring how those values are actually generated.
However, if I am working in cryptography, I might need to know how many
such values can be generated before the sequence repeats.
-
I can use any of the thirty or so methods of the Math class to produce
a variety of complex mathematical values without caring about how those
values are actually produced. However, since many of those values
are approximations, I might need to know something about the quality of
the approximation.
-
I can instantiate an object of the TreeSet class to create a collection
object, which guarantees that the sorted set will be in ascending element
order, and provide log(n) time cost for the basic operations of
add,
remove,
and contains. As long as I know that, I have very little need
to know exactly how the collection object is implemented.
Its time to reinvent the CS2 curriculum
I'm confident that the future employers of most students share my opinion
on this. I don't know of any employer who wants their programmers
to expend time and dollars reinventing the classical data structures.
What those employers are looking for is a staff of programmers who understand
the tradeoffs among the data structures, and when it is appropriate to
use each of the different structures.
It is time to reinvent the curriculum in CS2 courses by
-
Encouraging the understanding of techniques for software reuse
-
Teaching when, why, and how each of the different structures should be
used
-
Discouraging the reinvention of those structures
Go to page: 1 2 Next
|