Continuing the theme of my last article, Java EE 5 Performance Management and Optimization, have you ever heard anyone ask the following question: “When developers are building their
individual components before a single use case is implemented, isn’t it premature to start performance testing?”
Let me ask a similar question: When building a car, is it premature to test the performance
of your alternator before the car is assembled and you try to start it? The answer to this question is
obviously “No, it’s not premature. I want to make sure that the alternator works before building my
car!” If you would never assemble a car from untested parts, why would you assemble an enterprise
application from untested components? Furthermore, because you integrate performance
criteria into use cases, use cases will fail testing if they do not meet their performance criteria.
In short, performance matters!
In development, components are tested in unit tests. A unit test is designed to test the
functionality and performance of an individual component, independently of the other components
that it will eventually interact with. The most common unit testing framework is an open
source initiative called JUnit. JUnit’s underlying premise is that alongside the development
of your components, you should write tests to validate each piece of functionality of your components.
A relatively new development paradigm, Extreme Programming (www.xprogramming.com),
promotes building test cases prior to building the components themselves, which forces you to
better understand how your components will be used prior to writing them.
JUnit focuses on functional testing, but side projects spawned from JUnit include performance
and scalability testing. Performance tests measure expected response time, and scalability
tests measure functional integrity under load. Formal performance unit test criteria should do
the following:
- Identify memory issues
- Identify poorly performing methods and algorithms
- Measure the coverage of unit tests to ensure that the majority of code is being tested
Memory leaks are among the most dangerous and difficult-to-diagnose problems in enterprise
Java applications. The best way to avoid memory leaks at a code level is to run your components
through a memory profiler. A memory profiler takes a snapshot of your heap (after first
running garbage collection), allows you to run your tests, takes another snapshot of your heap
(after garbage collection again), and shows you all of the objects that remain in the heap. The
analysis of the heap differences identifies objects abandoned in memory. Your task is then to
look at these objects and decide if they should remain in the heap or if they were left there by
mistake. Another danger of memory misuse is object cycling, which is the rapid creation
and destruction of objects. Because it increases the frequency of garbage collection, excessive
object cycling may result in the premature tenuring of short-lived objects, necessitating a
major garbage collection to reclaim these objects.
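To make this concrete, the following is a minimal sketch of the kind of code that leaves loitering objects behind; the class and method names here are hypothetical, not from any real application:

import java.util.ArrayList;
import java.util.List;

public class AuditTrail
{
    // Static collections are a classic source of loitering objects:
    // entries are added on every request but never removed, so each
    // object added here (and everything it references) survives
    // garbage collection and appears in the heap snapshot difference
    private static List entries = new ArrayList();

    public static void record( Object stateSnapshot )
    {
        entries.add( stateSnapshot );   // leak: no eviction or cleanup
    }
}

A heap-snapshot comparison taken before and after repeated calls to record() would show these entries still reachable after garbage collection, which is exactly the signature to look for.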
After considering memory issues, you need to quantify the performance of methods and
algorithms. Because SLAs are defined at the use case level but not at the component level,
measuring response times may be premature in the development phase. Rather, the strategy is
to run your components through a code profiler. A code profiler reveals the most frequently
executed sections of your code and those that account for the majority of the components’
execution times. The resulting relative weighting of hot spots in the code allows for intelligent
tuning and code refactoring. You should run code profiling on your components while executing
your unit tests, because your unit tests attempt to mimic end-user actions and alternate user
scenarios. Code profiling your unit tests should give you a good idea about how your component
will react to real user interactions.
Coverage profiling reports the percentage of classes, methods, and lines of code that were
executed during a test or use case. Coverage profiling is important in assessing the efficacy of
unit tests. If both the code and memory profiling of your code are good, but you are exercising
only 20 percent of your code, then your confidence in your tests should be minimal. Not only
do you need to receive favorable results from your functional unit tests and your code and
memory performance unit tests, but you also need to ensure that you are effectively testing
your components.
This level of testing can be further extended to any code that you outsource. You should
require your outsourcing company to provide you with unit tests for all components it develops,
and then execute a performance test against those unit tests to measure the quality of the
components you are receiving. By combining code and memory profiling with coverage profiling,
you can quickly determine whether the unit tests are written properly and have acceptable results.
Once the criteria for tests are met, the final key step to effectively implementing this level
of testing is automation. You need to integrate functional and performance unit testing into
your build process—only by doing so can you establish a repeatable and trackable procedure.
Because running performance unit tests can burden memory resources, you might try executing
functional tests during nightly builds and executing performance unit tests on Friday-night
builds, so that you can come in on Monday to review the test result reports without impacting developer
productivity. This suggestion’s success depends a great deal on the size and complexity of your
environment, so, as always, adapt this plan to serve your application’s needs.
When performance unit tests are written prior to, or at least concurrently with, component
development, then component performance can be assessed at each build. If such extensive
assessment is not realistic, then the reports need to be evaluated at each major development
milestone. For the developer, milestones are probably at the completion of the component or
a major piece of functionality for the component. But at minimum, performance unit tests
need to be performed prior to the integration of components. Again, building a high-performance
car from tested and proven high-performance parts is far more effective than building one from
scraps gathered from the junkyard.
Unit Testing
I thought this section would be a good opportunity to talk a little about unit testing tools and
methods, though this discussion is not meant to be exhaustive. JUnit is, again, the tool of
choice for unit testing. JUnit is a simple regression-testing framework that enables you to write
repeatable tests. Originally written by Erich Gamma and Kent Beck, JUnit has been embraced
by thousands of developers and has grown into a collection of unit testing frameworks for a
plethora of technologies. The JUnit Web site (www.junit.org) hosts support information and links to the other JUnit derivations.
JUnit offers the following benefits to your unit testing:
- Faster coding: How many times have you written debug code inside your classes to verify values or test functionality? JUnit eliminates this by allowing you to write test cases in closely related, but centralized and external, classes.
- Simplicity: If you have to spend too much time implementing your test cases, then you won’t do it. Therefore, the creators of JUnit made it as simple as possible.
- Single result reports: Rather than generating loads of reports, JUnit gives you a single pass/fail result, and, for any failure, shows you the exact point where the application failed.
- Hierarchical testing structure: Test cases exercise specific functionality, and test suites execute multiple test cases. JUnit supports test suites of test suites, so when developers build test cases for their classes, they can easily assemble them into a test suite at the package level, and then incorporate that into parent packages and so forth. The result is that a single, top-level test execution can exercise hundreds of unit test cases.
- Developer-written tests: These tests are written by the same person who wrote the code, so they accurately target the intricacies of the code that the developer knows can be problematic. Where a QA-written test exercises the external functionality of the component or use case, a developer-written test exercises its internal functionality.
- Seamless integration: Tests are written in Java, which makes the integration of test cases and code seamless.
- Free: JUnit is open source and licensed under the Common Public License Version 1.0, so you are free to use it in your applications.
From an architectural perspective, JUnit can be described by looking at two primary components:
TestCase and TestSuite. All code that tests the functionality of your class or classes must
extend junit.framework.TestCase. The test class can implement one or more tests by defining
public void methods that start with test and accept no parameters, for example:
public void testMyFunctionality() { ... }
For multiple tests, you have the option of initializing and cleaning up the environment
before and between tests by implementing the following two methods: setUp() and tearDown().
In setUp() you initialize the environment, and in tearDown() you clean up the environment.
Note that these methods are called between each test to eliminate side effects between test
cases; this makes each test case truly independent.
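As a quick illustration (my own, not from the book’s metric example), the following test class passes regardless of the order in which its tests run, because setUp() rebuilds the fixture before each test method:

import junit.framework.TestCase;
import java.util.ArrayList;
import java.util.List;

public class IndependenceTest extends TestCase
{
    private List items;

    protected void setUp()
    {
        // Rebuilt before every test method
        items = new ArrayList();
    }

    public void testAddOne()
    {
        items.add( "a" );
        assertEquals( 1, items.size() );
    }

    public void testStartsEmpty()
    {
        // Passes even if testAddOne() ran first, because setUp()
        // created a fresh list for this test
        assertEquals( 0, items.size() );
    }
}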
Inside each TestCase “test” method, you can create objects, execute functionality, and
then test the return values of those functional elements against expected results. If the return
values are not as expected, then the test fails; otherwise, it passes. The mechanism that JUnit
provides to validate actual values against expected values is a set of assert methods, including
the following:

- assertEquals(): Compares an expected value to an actual value, with an optional tolerance for floating-point comparisons
- assertTrue() and assertFalse(): Validate a boolean condition
- assertNull() and assertNotNull(): Validate whether an object reference is null
- assertSame() and assertNotSame(): Validate whether two references point to the same object

In addition, JUnit offers a fail() method that you can call anywhere in your test case to
immediately mark a test as failing.
JUnit tests are executed by one of the TestRunner instances (there is one for command-line
execution and one for a GUI execution), and each version implements the following steps:
- It opens your TestCase class instance.
- It uses reflection to discover all methods that start with “test”.
- It repeatedly calls setUp(), executes the test method, and calls tearDown().
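To make the discovery step concrete, here is a minimal sketch, in the spirit of JUnit’s runners but not taken from JUnit’s source, of how test methods can be found and invoked through reflection:

import java.lang.reflect.Method;

public class SimpleTestRunner
{
    /**
     * Invokes every public no-argument method whose name starts
     * with "test" on a fresh instance of the supplied class.
     */
    public static void run( Class testClass ) throws Exception
    {
        Method[] methods = testClass.getMethods();
        for( int i = 0; i < methods.length; i++ )
        {
            Method m = methods[ i ];
            if( m.getName().startsWith( "test" ) &&
                m.getParameterTypes().length == 0 )
            {
                Object test = testClass.newInstance();
                // A real runner also calls setUp() before, and
                // tearDown() after, each test method
                m.invoke( test, new Object[ 0 ] );
            }
        }
    }
}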
As an example, I have a set of classes that model data metrics. A metric contains a set of
data points, where each data point represents an individual sample, such as the size of the
heap at a given time. I purposely do not list the code for the metric or data point classes; rather,
I list the JUnit tests. Recall that according to one of the tenets of Extreme Programming, we
write test cases before writing code. Listing 1 shows the test case for the DataPoint class, and
Listing 2 shows the test case for the Metric class.
Listing 1. DataPointTest.java

package com.javasrc.metric;

import junit.framework.TestCase;
import java.util.*;

/**
 * Tests the core functionality of a DataPoint
 */
public class DataPointTest extends TestCase
{
    /**
     * Maintains our reference DataPoint
     */
    private DataPoint dp;

    /**
     * Create a DataPoint for use in this test
     */
    protected void setUp()
    {
        dp = new DataPoint( new Date(), 5.0, 1.0, 10.0 );
    }

    /**
     * Clean up: do nothing for now
     */
    protected void tearDown()
    {
    }

    /**
     * Test the range of the DataPoint
     */
    public void testRange()
    {
        assertEquals( 9.0, dp.getRange(), 0.001 );
    }

    /**
     * See if the DataPoint scales properly
     */
    public void testScale()
    {
        dp.scale( 10.0 );
        assertEquals( 50.0, dp.getValue(), 0.001 );
        assertEquals( 10.0, dp.getMin(), 0.001 );
        assertEquals( 100.0, dp.getMax(), 0.001 );
    }

    /**
     * Try to add a new DataPoint to our existing one
     */
    public void testAdd()
    {
        DataPoint other = new DataPoint( new Date(), 4.0, 0.5, 20.0 );
        dp.add( other );
        assertEquals( 9.0, dp.getValue(), 0.001 );
        assertEquals( 0.5, dp.getMin(), 0.001 );
        assertEquals( 20.0, dp.getMax(), 0.001 );
    }

    /**
     * Test the compare functionality of our DataPoint to ensure that
     * when we construct Sets of DataPoints they are properly ordered
     */
    public void testCompareTo()
    {
        try
        {
            // Sleep for 100ms so we can be sure that the time of
            // the new data point is later than the first
            Thread.sleep( 100 );
        }
        catch( Exception e )
        {
        }

        // Construct a new DataPoint
        DataPoint other = new DataPoint( new Date(), 4.0, 0.5, 20.0 );

        // Should return -1 because other occurs after dp
        int result = dp.compareTo( other );
        assertEquals( -1, result );

        // Should return 1 because dp occurs before other
        result = other.compareTo( dp );
        assertEquals( 1, result );

        // Should return 0 because dp == dp
        result = dp.compareTo( dp );
        assertEquals( 0, result );
    }
}

Listing 2. MetricTest.java
package com.javasrc.metric;

import junit.framework.TestCase;
import java.util.*;

public class MetricTest extends TestCase
{
    private Metric sampleHeap;

    protected void setUp()
    {
        this.sampleHeap = new Metric( "Test Metric",
                                      "Value/Min/Max",
                                      "megabytes" );
        double heapValue = 100.0;
        double heapMin = 50.0;
        double heapMax = 150.0;
        for( int i=0; i<10; i++ )
        {
            DataPoint dp = new DataPoint( new Date(), heapValue, heapMin, heapMax );
            this.sampleHeap.addDataPoint( dp );
            try
            {
                Thread.sleep( 50 );
            }
            catch( Exception e )
            {
            }

            // Update the heap values
            heapMin -= 1.0;
            heapMax += 1.0;
            heapValue += 1.0;
        }
    }

    public void testMin()
    {
        assertEquals( 41.0, this.sampleHeap.getMin(), 0.001 );
    }

    public void testMax()
    {
        assertEquals( 159.0, this.sampleHeap.getMax(), 0.001 );
    }

    public void testAve()
    {
        assertEquals( 104.5, this.sampleHeap.getAve(), 0.001 );
    }

    public void testMaxRange()
    {
        assertEquals( 118.0, this.sampleHeap.getMaxRange(), 0.001 );
    }

    public void testRange()
    {
        assertEquals( 118.0, this.sampleHeap.getRange(), 0.001 );
    }

    public void testSD()
    {
        assertEquals( 3.03, this.sampleHeap.getStandardDeviation(), 0.01 );
    }

    public void testVariance()
    {
        assertEquals( 9.17, this.sampleHeap.getVariance(), 0.01 );
    }

    public void testDataPointCount()
    {
        assertEquals( 10, this.sampleHeap.getDataPoints().size() );
    }
}

In Listing 1, you can see that the DataPoint class, in addition to maintaining the observed
value for a point in time, supports minimum and maximum values for the time period, computes
the range, and supports scaling and adding data points. The sample test case creates a DataPoint
object in the setUp() method and then exercises each piece of functionality.

Listing 2 shows the test case for the Metric class. The Metric class aggregates the
DataPoint objects and provides access to the collective minimum, maximum, average, range,
standard deviation, and variance. In the setUp() method, the test creates a set of data points
and builds the metric to contain them. Each subsequent test case uses this metric and validates
values computed by hand against those computed by the Metric class.

Listing 3 rolls both of these test cases into a test suite that can be executed as one test.
Listing 3. MetricTestSuite.java
package com.javasrc.metric;

import junit.framework.Test;
import junit.framework.TestSuite;

public class MetricTestSuite
{
    public static Test suite()
    {
        TestSuite suite = new TestSuite();
        suite.addTestSuite( DataPointTest.class );
        suite.addTestSuite( MetricTest.class );
        return suite;
    }
}

A TestSuite exercises all tests in all classes added to it by calling the addTestSuite()
method. A TestSuite can contain TestCases or TestSuites, so once you build a suite of test
cases for your classes, a master test suite can include your suite and inherit all of your test cases.

The final step in this example is to execute either an individual test case or a test suite. After
downloading JUnit from www.junit.org, add the junit.jar file to your CLASSPATH and then invoke either its command-line interface or GUI interface. The three classes that execute these tests
are as follows:
- junit.textui.TestRunner
- junit.swingui.TestRunner
- junit.awtui.TestRunner
As these package names imply, textui is the command-line interface, swingui is the
Swing-based graphical interface, and awtui is an older AWT-based graphical interface. You can pass an
individual test case or an entire test suite as an argument to the TestRunner class. For example,
to execute the test suite that we created earlier, you would use this:
java junit.swingui.TestRunner com.javasrc.metric.MetricTestSuite
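The text-based runner can also be invoked programmatically, which is handy once you start automating tests in your build. Here is a brief sketch; the MetricTestMain class is mine, while junit.textui.TestRunner.run() and TestResult are standard JUnit:

package com.javasrc.metric;

import junit.framework.TestResult;

public class MetricTestMain
{
    public static void main( String[] args )
    {
        // Runs the suite and prints the pass/fail summary to the console
        TestResult result = junit.textui.TestRunner.run( MetricTestSuite.suite() );

        // A nonzero exit code lets an automated build fail the job
        System.exit( result.wasSuccessful() ? 0 : 1 );
    }
}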
Unit Performance Testing
Unit performance testing has three aspects:

- Memory profiling
- Code profiling
- Coverage profiling

This section explores each facet of performance profiling. I provide examples of what to
look for and the step-by-step process to implement each type of testing.
Memory Profiling
Let’s first look at memory profiling. To illustrate how to determine if you do, in fact, have a
memory leak, I modified the BEA MedRec application to capture the state of the environment
every time an administrator logs in and to store that information in memory. My intent is to
demonstrate how a simple tracking change left to its own devices can introduce a memory leak.
The steps you need to perform on your code for each use case are as follows:
- Request a garbage collection and take a snapshot of your heap.
- Perform your use case.
- Request a garbage collection and take another snapshot of your heap.
- Compare the two snapshots (the difference between them includes all objects
remaining in the heap) and identify any unexpected loitering objects.
- For each suspect object, open the heap snapshot and track down where the object
was created.
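A memory profiler automates these snapshots by recording the entire object graph. As a rough, JDK-only approximation of steps 1 through 3 (my own sketch, with the caveat that System.gc() merely requests a collection), you can at least measure overall heap growth around a use case:

public class HeapDeltaCheck
{
    /**
     * Requests a garbage collection, then returns the approximate
     * number of bytes currently in use on the heap.
     */
    private static long usedHeap()
    {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        try { Thread.sleep( 250 ); } catch( InterruptedException e ) { }
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main( String[] args )
    {
        long before = usedHeap();

        // Perform the use case here, ideally many times, so that a
        // real leak dwarfs measurement noise

        long after = usedHeap();
        System.out.println( "Heap growth: " + ( after - before ) + " bytes" );
    }
}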
Note: A memory leak can be detected with a single execution of a use case or through a plethora of executions.
In this scenario, I performed steps 1 through 3 with a load tester that executed the MedRec
administration login use case almost 500 times. Figure 1 shows the difference between the
two heap snapshots.
Figure 1. The snapshot difference between the heaps before and after executing the use case
Figure 1 shows that my use case yielded 8,679 new objects added to the heap. Most of
these objects are collection classes, and I suspect they are part of BEA’s infrastructure. I scanned
this list looking for my code, which in this case consists of any class in the com.bea.medrec package.
Filtering on those classes, I was interested to see a large number of com.bea.medrec.actions.
SystemSnapShot instances, as shown in Figure 2.
Note: The screen shots in this article are from Quest Software’s JProbe and PerformaSure products.
Figure 2. The snapshot difference between the heaps, filtered on my application packages
Realize that rarely is a loitering object a single simple object; rather, it is typically a subgraph
that maintains its own references. In this case, the SystemSnapShot class is a dummy class that
holds a set of primitive type arrays with the names timestamp, memoryInfo, jdbcInfo, and
threadDumps, but in a real-world scenario these arrays would be objects that reference other objects
and so forth. By opening the second heap snapshot and looking at one of the SystemSnapShot
instances, you can see all objects that it references. As shown in Figure 3, the SystemSnapShot
class references four objects: timestamp, memoryInfo, jdbcInfo, and threadDumps. A loitering
object, then, has a far greater impact than the object itself.
Next, let’s look at the referrer tree. We repeatedly ask the following questions: What class
is referencing the SystemSnapShot? What class is referencing that class? Eventually, we find
one of our classes. Figure 4 shows that the SystemSnapShot class is referenced by an Object array
that is referenced by an ArrayList that is finally referenced by the AdminLoginAction.
Figure 3. The SystemSnapShot class references four objects: timestamp, memoryInfo, jdbcInfo, and
threadDumps.
Figure 4. Here we can see that the AdminLoginAction class created the SystemSnapShot, and that it
stored it in an ArrayList.
Finally, we can look into the AdminLoginAction code to see that it creates the new
SystemSnapShot instance we are looking at and adds it to its cache on line 66, as shown in
Figure 5.
You need to perform this type of memory profiling test on your components during your
performance unit testing. For each object that is left in the heap, you need to ask yourself
whether or not you intended to leave it there. It’s OK to leave things on the heap as long as you
know that they are there and you want them to be there. The purpose of this test is to identify
and document potentially troublesome objects and objects that you forgot to clean up.
Figure 5. The AdminLoginAction source code
Code Profiling
The purpose of code profiling is to identify sections of your code that are running slowly and
then determine why. The perfect example I have to demonstrate the effectiveness of code profiling
is a project that I gave to my Data Structures and Algorithm Analysis class—compare and quantify
the differences among the following sorting algorithms for various values of n (where n
represents the sample size of the data being sorted):
- Bubble sort
- Selection sort
- Insertion sort
- Shell sort
- Heap sort
- Merge sort
- Quick sort
As a quick primer on sorting algorithms, each of the aforementioned algorithms has its
strengths and weaknesses. The first four algorithms run in O(N²) time, meaning that the run
time grows quadratically as the number of items to sort, N, increases: as N increases, the
amount of time required for the sorting algorithm to complete grows in proportion to N².
The last three algorithms run in O(N log N) time: as N increases, the amount of time
required for the sorting algorithm to complete grows in proportion to N log N. The difference
is dramatic: for N = 5,000, N² is 25,000,000, while N log₂ N is only about 61,000. Achieving
O(N log N) performance requires additional overhead that may cause the last three algorithms
to actually run slower than the first four for a small number of items. My recommendation is
to always examine both the nature of the data you want to sort today and the projected nature
of the data throughout the life cycle of the product prior to selecting your sorting algorithm.
With that foundation in place, I provided my students with a class that implements the
aforementioned sorting algorithms. I really wanted to drive home the dramatic difference
between executing these sorting algorithms on 10 items as opposed to 10,000 items, or even
1,000,000 items. For this exercise, I think it would be useful to profile this application against
5,000 randomly generated integers, which is enough to show the differences between the
algorithms, but not so excessive that I have to leave my computer running overnight.
Figure 6 shows the results of this execution, sorting each method by its cumulative
run time.
Figure 6. The profiled methods used to sort 5,000 random integers using the seven sorting algorithms
We view the method response times sorted by cumulative time, because some of the algorithms
make repeated calls to other methods to perform their sorting (for example, the
quickSort() method makes 5,000 calls to q_sort()). We have to ignore the main() method,
because it calls all seven sorting methods. (Its cumulative time is almost 169 seconds, but its
exclusive method time is only 90 milliseconds, demonstrating that most of its time is spent in
other method calls—namely, all of the sorting method calls.) The slowest method by far is the
bubbleSort() method, accounting for 80 seconds in total time and 47.7 percent of total run
time for the program.
The next question is, why did it take so long? Two pieces of information can give us insight
into the length of time: the number of external calls the method makes and the amount of time
spent on each line of code. Figure 7 shows the number of external calls that the bubbleSort()
method makes.
Figure 7. The number of external calls that the bubbleSort() method makes
This observation is significant—in order to sort 5,000 items, the bubble sort algorithm
required almost 12.5 million comparisons. It immediately alerts us to the fact that if we have a
considerable number of items to sort, bubble sort is not the best algorithm to use. Taking this
example a step further, Figure 8 shows a line-by-line breakdown of call counts and time
spent inside the bubbleSort() method.
Figure 8. Profiling the bubbleSort() method
By profiling the bubbleSort() method, we see that 45 percent of its time is spent comparing
items, and 25 percent is spent managing a for loop; these two lines account for 56 cumulative
seconds. Figure 8 clearly illustrates the core issue of the bubble sort algorithm: on line 15 it
executes the for loop 12,502,500 times, which resolves to 12,497,500 comparisons.
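The sorting class itself is not listed in this article, so the following is a representative bubble sort of my own that matches the call counts above; the marked comparison is the hot spot to which the profiler attributes most of the time:

public static void bubbleSort( int[] items )
{
    for( int i = items.length - 1; i > 0; i-- )
    {
        for( int j = 0; j < i; j++ )
        {
            // For n items this comparison runs n(n-1)/2 times:
            // 12,497,500 times for n = 5,000
            if( items[ j ] > items[ j + 1 ] )
            {
                int tmp = items[ j ];
                items[ j ] = items[ j + 1 ];
                items[ j + 1 ] = tmp;
            }
        }
    }
}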
To be successful in deploying high-performance components and applications, you need
to apply this level of profiling to your code.
Coverage Profiling
Identifying and rectifying memory issues and slow-running algorithms gives you confidence in
the quality of your components, but that confidence is meaningful only as long as you are exercising
all—or at least most—of your code. That is where coverage profiling comes in; coverage
profiling reveals the percentage of classes, methods, and lines of code that are executed by a
test. Coverage profiling can provide strong validation that your unit and integration tests are
effectively exercising your components.
In this section, I’ll show a test of a graphical application that I built to manage my digital
pictures, running inside a coverage profiler that is filtered to my classes. I purposely
chose not to test the application extensively in order to present an interesting example. Figure 9 shows a
class summary of the code that I tested, with six profiled classes in three packages displayed in
the browser window and the methods of the JThumbnailPalette class with missed lines in the
pane below.
Figure 9. Coverage profile of a graphical application
The test exercised all six classes, but missed a host of methods and lines of code. For example, in
the JThumbnailPalette class, the test completely failed to call the methods getBackgroundColor(),
setBackgroundColor(), setTopRow(), and others. Furthermore, even though the paint() method
was called, the test missed 16.7 percent of the lines. Figure 10 shows the specific lines of code
within the paint() method that the test did not execute.
Figure 10 reveals that most lines of code were executed 17 times, but the code that handles
painting a scrolled set of thumbnails was skipped. With this information in hand, the tester
needs to move the scroll bar, or configure an automated test script to move it, to ensure that
this piece of code is executed.
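The pattern generalizes to any branch guarded by a condition the test never triggers. The following hypothetical sketch (the real JThumbnailPalette source is not listed here) shows the shape of the code that coverage profiling flagged:

import java.awt.Graphics;
import javax.swing.JComponent;

public class ThumbnailPaletteSketch extends JComponent
{
    private int topRow = 0;   // index of the first visible thumbnail row

    public void paint( Graphics g )
    {
        // Executed on every repaint, so coverage reports these lines as hit
        paintVisibleThumbnails( g );

        // Reached only after the palette has been scrolled; a test that
        // never moves the scroll bar leaves these lines uncovered
        if( topRow > 0 )
        {
            paintScrolledRows( g, topRow );
        }
    }

    private void paintVisibleThumbnails( Graphics g ) { /* ... */ }
    private void paintScrolledRows( Graphics g, int firstRow ) { /* ... */ }
}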
Coverage is a powerful profiling tool, because without it, you may miss code that your
users will encounter when they use your application in a way that you do not expect (and rest
assured, they definitely will).
Figure 10. A look inside the JThumbnailPalette’s paint() method
Summary
As components are built, performance unit tests are
performed alongside functional unit tests. These performance tests include testing for memory
issues and code issues, as well as validating test coverage to ensure that the majority of
component code is being tested.
About the Author
Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical editing countless software publications, he is also the Java Host on InformIT.com. As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.
Source of this material
Pro Java EE 5 Performance Management and
Optimization
By Steven Haines
Published: May 2006, Paperback: 424 pages
Published by Apress
ISBN: 1590596102
Retail price: $49.99
This material is from Chapter 5 of the book.