Building in J2EE Performance during the Development Phase

Continuing on the same theme of my last article, Java EE 5 Performance Management and Optimization, have you ever heard anyone ask the following question: “When developers are building their
individual components before a single use case is implemented, isn’t it premature to start performance testing?”

Let me ask a similar question: When building a car, is it premature to test the performance
of your alternator before the car is assembled and you try to start it? The answer to this question is
obviously “No, it’s not premature. I want to make sure that the alternator works before building my
car!” If you would never assemble a car from untested parts, why would you assemble an enterprise
application from untested components? Furthermore, because performance criteria are built
into your use cases, a use case will fail testing if it does not meet them.
In short, performance matters!

In development, components are tested in unit tests. A unit test is designed to test the
functionality and performance of an individual component, independently from other components
that it will eventually interact with. The most common unit testing framework is an open
source initiative called JUnit. JUnit’s underlying premise is that alongside the development
of your components, you should write tests to validate each piece of functionality of your components.
A relatively new development paradigm, Extreme Programming (www.xprogramming.com),
promotes building test cases prior to building the components themselves, which forces you to
better understand how your components will be used prior to writing them.

JUnit focuses on functional testing, but side projects spawned from JUnit include performance
and scalability testing. Performance tests measure expected response time, and scalability
tests measure functional integrity under load. Formal performance unit test criteria should do
the following:

  • Identify memory issues
  • Identify poorly performing methods and algorithms
  • Measure the coverage of unit tests to ensure that the majority of code is being tested
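
To make the first two criteria concrete, here is a minimal sketch of a response-time assertion written as an ordinary JUnit test. The ReportParser class and the 50 ms budget are hypothetical stand-ins for your own component and its derived performance criterion.

import junit.framework.TestCase;

/**
 * A performance unit test: fails if the operation exceeds its
 * response-time budget. ReportParser is a hypothetical component.
 */
public class ParserPerformanceTest extends TestCase
{
  private static final long BUDGET_MILLIS = 50;

  public void testParseWithinBudget()
  {
    long start = System.currentTimeMillis();
    new ReportParser().parse( "sample-input" );  // component under test
    long elapsed = System.currentTimeMillis() - start;
    assertTrue( "parse() took " + elapsed + "ms; budget is "
                + BUDGET_MILLIS + "ms", elapsed <= BUDGET_MILLIS );
  }
}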

Memory leaks are the most dangerous and difficult to diagnose problems in enterprise
Java applications. The best way to avoid memory leaks at a code level is to run your components
through a memory profiler. A memory profiler takes a snapshot of your heap (after first
running garbage collection), allows you to run your tests, takes another snapshot of your heap
(after garbage collection again), and shows you all of the objects that remain in the heap. The
analysis of the heap differences identifies objects abandoned in memory. Your task is then to
look at these objects and decide if they should remain in the heap or if they were left there by
mistake. Another danger of memory misuse is object cycling: the rapid creation and destruction
of objects. Because it increases the frequency of garbage collection, excessive
object cycling may result in the premature tenuring of short-lived objects, necessitating a
major garbage collection to reclaim them.
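
As a contrived sketch of object cycling, consider the two loops below; consume() is a placeholder for real work.

public class ObjectCyclingDemo
{
  public static void main( String[] args )
  {
    // Object cycling: a temporary StringBuffer is created and
    // abandoned on every iteration, inflating garbage collection
    // frequency and pushing short-lived objects toward tenure.
    for( int i = 0; i < 1000000; i++ )
    {
      StringBuffer buf = new StringBuffer();
      buf.append( "row " ).append( i );
      consume( buf.toString() );
    }

    // Reusing a single instance avoids most of the churn.
    StringBuffer reused = new StringBuffer();
    for( int i = 0; i < 1000000; i++ )
    {
      reused.setLength( 0 );  // reset rather than reallocate
      reused.append( "row " ).append( i );
      consume( reused.toString() );
    }
  }

  private static void consume( String s ) { /* stand-in for real work */ }
}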

After considering memory issues, you need to quantify the performance of methods and
algorithms. Because SLAs are defined at the use case level, not at the component level,
measuring response times may be premature in the development phase. Rather, the strategy is
to run your components through a code profiler. A code profiler reveals the most frequently
executed sections of your code and those that account for the majority of the components’
execution times. The resulting relative weighting of hot spots in the code allows for intelligent
tuning and code refactoring. You should run code profiling on your components while executing
your unit tests, because your unit tests attempt to mimic end-user actions and alternate user
scenarios. Code profiling your unit tests should give you a good idea about how your component
will react to real user interactions.
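
A code profiler gathers these timings automatically and per line; as a crude manual illustration of the idea, you can weigh a suspect code path yourself. Here suspectMethod() is a hypothetical placeholder.

public class HotSpotTimer
{
  public static void main( String[] args )
  {
    long start = System.nanoTime();
    for( int i = 0; i < 10000; i++ )
    {
      suspectMethod();  // the code path under suspicion
    }
    long elapsedMillis = ( System.nanoTime() - start ) / 1000000;
    System.out.println( "10,000 calls took " + elapsedMillis + " ms" );
  }

  private static void suspectMethod() { /* hypothetical workload */ }
}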

Coverage profiling reports the percentage of classes, methods, and lines of code that were
executed during a test or use case. Coverage profiling is important in assessing the efficacy of
unit tests. If both the code and memory profiling of your code are good, but you are exercising
only 20 percent of your code, then your confidence in your tests should be minimal. Not only
do you need to receive favorable results from your functional unit tests and your code and
memory performance unit tests, but you also need to ensure that you are effectively testing
your components.

This level of testing can be further extended to any code that you outsource. You should
require your outsourcing company to provide you with unit tests for all components it develops,
and then execute a performance test against those unit tests to measure the quality of the
components you are receiving. By combining code and memory profiling with coverage profiling,
you can quickly determine whether the unit tests are written properly and have acceptable results.

Once the criteria for tests are met, the final key step to effectively implementing this level
of testing is automation. You need to integrate functional and performance unit testing into
your build process—only by doing so can you establish a repeatable and trackable procedure.
Because running performance unit tests can burden memory resources, you might try executing
functional tests during nightly builds and executing performance unit tests on Friday-night
builds, so that you can come in on Monday to test result reports without impacting developer
productivity. This suggestion’s success depends a great deal on the size and complexity of your
environment, so, as always, adapt this plan to serve your application’s needs.
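
One way to wire unit tests into such a build is a small driver around JUnit's text-based runner that your build script can invoke and check; this is a sketch, and AllTests is a hypothetical master suite class.

import junit.framework.Test;
import junit.framework.TestResult;
import junit.textui.TestRunner;

/**
 * Nightly-build driver: runs the master suite and returns a nonzero
 * exit code on failure so the build script can flag the run.
 */
public class NightlyTestDriver
{
  public static void main( String[] args )
  {
    Test suite = AllTests.suite();  // hypothetical master suite
    TestResult result = TestRunner.run( suite );
    System.exit( result.wasSuccessful() ? 0 : 1 );
  }
}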

When performance unit tests are written prior to, or at least concurrently with, component
development, then component performance can be assessed at each build. If such extensive
assessment is not realistic, then the reports need to be evaluated at each major development
milestone. For the developer, milestones are probably at the completion of the component or
a major piece of functionality for the component. But at minimum, performance unit tests
need to be performed prior to the integration of components. Again, building a high-performance
car from tested and proven high-performance parts is far more effective than building one from
scraps gathered from the junkyard.

Unit Testing

I thought this section would be a good opportunity to talk a little about unit testing tools and
methods, though this discussion is not meant to be exhaustive. JUnit is, again, the tool of
choice for unit testing. JUnit is a simple regression-testing framework that enables you to write
repeatable tests. Originally written by Erich Gamma and Kent Beck, JUnit has been embraced
by thousands of developers and has grown into a collection of unit testing frameworks for a
plethora of technologies. The JUnit Web site (www.junit.org) hosts support information and links to the other JUnit derivations.

JUnit offers the following benefits to your unit testing:

  • Faster coding: How many times have you written debug code inside your classes to verify
    values or test functionality? JUnit eliminates this by allowing you to write test cases in
    closely related, but centralized and external, classes.

  • Simplicity: If you have to spend too much time implementing your test cases, then you
    won’t do it. Therefore, the creators of JUnit made it as simple as possible.

  • Single result reports: Rather than generating loads of reports, JUnit will give you a single
    pass/fail result, and, for any failure, show you the exact point where the application failed.

  • Hierarchical testing structure: Test cases exercise specific functionality, and test suites
    execute multiple test cases. JUnit supports test suites of test suites, so when developers
    build test cases for their classes, they can easily assemble them into a test suite at the
    package level, and then incorporate that into parent packages and so forth. The result is
    that a single, top-level test execution can exercise hundreds of unit test cases.

  • Developer-written tests: These tests are written by the same person who wrote the code,
    so the tests accurately target the intricacies of the code that the developer knows can be
    problematic. This test differs from a QA-written one, which exercises the external functionality
    of the component or use case—instead, this test exercises the internal functionality.

  • Seamless integration: Tests are written in Java, which makes the integration of test cases
    and code seamless.

  • Free: JUnit is open source and licensed under the Common Public License Version 1.0,
    so you are free to use it in your applications.

From an architectural perspective, JUnit can be described by looking at two primary components:
TestCase and TestSuite. All code that tests the functionality of your class or classes must
extend junit.framework.TestCase. The test class can implement one or more tests by defining
public void methods that start with test and accept no parameters, for example:

public void testMyFunctionality() { ... }

For multiple tests, you have the option of initializing and cleaning up the environment
before and between tests by implementing the following two methods: setUp() and tearDown().
In setUp() you initialize the environment, and in tearDown() you clean up the environment.
Note that these methods are called between each test to eliminate side effects between test
cases; this makes each test case truly independent.

Inside each TestCase “test” method, you can create objects, execute functionality, and
then test the return values of those functional elements against expected results. If the return
values are not as expected, then the test fails; otherwise, it passes. The mechanism that JUnit
provides to validate actual values against expected values is a set of assert methods:

  • assertEquals() methods test primitive types.
  • assertTrue() and assertFalse() test Boolean values.
  • assertNull() and assertNotNull() test whether or not an object is null.
  • assertSame() and assertNotSame() test object identity, that is, whether two references point to the same object.

    In addition, JUnit offers a fail() method that you can call anywhere in your test case to
    immediately mark a test as failing.
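
    A contrived test method showing several of these assertions in one place (the values are made up purely for illustration):

    import junit.framework.TestCase;

    /** Illustrates the assert methods; the values are contrived. */
    public class AssertExamplesTest extends TestCase
    {
      public void testAsserts()
      {
        assertEquals( 42, 40 + 2 );                // primitive equality
        assertEquals( 3.14, 22.0 / 7.0, 0.01 );    // doubles take a tolerance
        assertTrue( "".length() == 0 );
        assertFalse( "abc".length() == 0 );
        assertNotNull( new Object() );
        assertNull( System.getProperty( "no.such.property" ) );

        String s = "hello";
        assertSame( s, s );                        // same reference
        assertNotSame( new String( "s" ), new String( "s" ) );

        // fail( "marks the test as failed immediately" );
      }
    }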

    JUnit tests are executed by one of the TestRunner classes (there is a text-based version
    for command-line execution and graphical versions), and each implements the following steps:

    1. It opens your TestCase class instance.
    2. It uses reflection to discover all methods that start with “test”.
    3. It repeatedly calls setUp(), executes the test method, and calls tearDown().
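
    Step 2 is plain core reflection; a simplified sketch of the discovery logic looks like this:

    import java.lang.reflect.Method;

    /**
     * A simplified look at how a TestRunner discovers tests: find the
     * public, no-argument methods whose names start with "test".
     */
    public class TestDiscovery
    {
      public static void main( String[] args ) throws Exception
      {
        Class c = Class.forName( args[0] );  // e.g., a TestCase subclass
        Method[] methods = c.getMethods();
        for( int i = 0; i < methods.length; i++ )
        {
          if( methods[i].getName().startsWith( "test" )
              && methods[i].getParameterTypes().length == 0 )
          {
            System.out.println( "Found test: " + methods[i].getName() );
          }
        }
      }
    }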

    As an example, I have a set of classes that model data metrics. A metric contains a set of
    data points, where each data point represents an individual sample, such as the size of the
    heap at a given time. I purposely do not list the code for the metric or data point classes; rather,
    I list the JUnit tests. Recall that according to one of the tenets of Extreme Programming, we
    write test cases before writing code. Listing 1 shows the test case for the DataPoint class, and
    Listing 2 shows the test case for the Metric class.

    Listing 1. DataPointTest.java

    package com.javasrc.metric;

    import junit.framework.TestCase;
    import java.util.*;

    /**
     * Tests the core functionality of a DataPoint
     */
    public class DataPointTest extends TestCase
    {
      /**
       * Maintains our reference DataPoint
       */
      private DataPoint dp;

      /**
       * Create a DataPoint for use in this test
       */
      protected void setUp()
      {
        dp = new DataPoint( new Date(), 5.0, 1.0, 10.0 );
      }

      /**
       * Clean up: do nothing for now
       */
      protected void tearDown()
      {
      }

      /**
       * Test the range of the DataPoint
       */
      public void testRange()
      {
        assertEquals( 9.0, dp.getRange(), 0.001 );
      }

      /**
       * See if the DataPoint scales properly
       */
      public void testScale()
      {
        dp.scale( 10.0 );
        assertEquals( 50.0, dp.getValue(), 0.001 );
        assertEquals( 10.0, dp.getMin(), 0.001 );
        assertEquals( 100.0, dp.getMax(), 0.001 );
      }

      /**
       * Try to add a new DataPoint to our existing one
       */
      public void testAdd()
      {
        DataPoint other = new DataPoint( new Date(), 4.0, 0.5, 20.0 );
        dp.add( other );
        assertEquals( 9.0, dp.getValue(), 0.001 );
        assertEquals( 0.5, dp.getMin(), 0.001 );
        assertEquals( 20.0, dp.getMax(), 0.001 );
      }

      /**
       * Test the compare functionality of our DataPoint to ensure that
       * when we construct Sets of DataPoints they are properly ordered
       */
      public void testCompareTo()
      {
        try
        {
          // Sleep for 100ms so we can be sure that the time of
          // the new data point is later than the first
          Thread.sleep( 100 );
        }
        catch( Exception e )
        {
        }

        // Construct a new DataPoint
        DataPoint other = new DataPoint( new Date(), 4.0, 0.5, 20.0 );

        // Should return -1 because other occurs after dp
        int result = dp.compareTo( other );
        assertEquals( -1, result );

        // Should return 1 because dp occurs before other
        result = other.compareTo( dp );
        assertEquals( 1, result );

        // Should return 0 because dp == dp
        result = dp.compareTo( dp );
        assertEquals( 0, result );
      }
    }
    
    

    Listing 2. MetricTest.java

    package com.javasrc.metric;

    import junit.framework.TestCase;
    import java.util.*;

    public class MetricTest extends TestCase
    {
      private Metric sampleHeap;

      protected void setUp()
      {
        this.sampleHeap = new Metric( "Test Metric",
                                      "Value/Min/Max",
                                      "megabytes" );
        double heapValue = 100.0;
        double heapMin = 50.0;
        double heapMax = 150.0;

        for( int i=0; i<10; i++ )
        {
          DataPoint dp = new DataPoint( new Date(),
                                        heapValue,
                                        heapMin,
                                        heapMax );
          this.sampleHeap.addDataPoint( dp );
          try
          {
            Thread.sleep( 50 );
          }
          catch( Exception e )
          {
          }
          // Update the heap values
          heapMin -= 1.0;
          heapMax += 1.0;
          heapValue += 1.0;
        }
      }

      public void testMin()
      {
        assertEquals( 41.0, this.sampleHeap.getMin(), 0.001 );
      }

      public void testMax()
      {
        assertEquals( 159.0, this.sampleHeap.getMax(), 0.001 );
      }

      public void testAve()
      {
        assertEquals( 104.5, this.sampleHeap.getAve(), 0.001 );
      }

      public void testMaxRange()
      {
        assertEquals( 118.0, this.sampleHeap.getMaxRange(), 0.001 );
      }

      public void testRange()
      {
        assertEquals( 118.0, this.sampleHeap.getRange(), 0.001 );
      }

      public void testSD()
      {
        assertEquals( 3.03, this.sampleHeap.getStandardDeviation(), 0.01 );
      }

      public void testVariance()
      {
        assertEquals( 9.17, this.sampleHeap.getVariance(), 0.01 );
      }

      public void testDataPointCount()
      {
        assertEquals( 10, this.sampleHeap.getDataPoints().size() );
      }
    }
    

    In Listing 1, you can see that the DataPoint class, in addition to maintaining the observed
    value for a point in time, supports minimum and maximum values for the time period, computes
    the range, and supports scaling and adding data points. The sample test case creates a DataPoint
    object in the setUp() method and then exercises each piece of functionality.

    Listing 2 shows the test case for the Metric class. The Metric class aggregates the
    DataPoint objects and provides access to the collective minimum, maximum, average, range,
    standard deviation, and variance. In the setUp() method, the test creates a set of data points
    and builds the metric to contain them. Each subsequent test case uses this metric and validates
    values computed by hand against those computed by the Metric class.

    Listing 3 rolls both of these test cases into a test suite that can be executed as one test.

    Listing 3. MetricTestSuite.java

    package com.javasrc.metric;
    
    import junit.framework.Test;
    import junit.framework.TestSuite;
    
    public class MetricTestSuite
    {
      public static Test suite()
      {
        TestSuite suite = new TestSuite();
        suite.addTestSuite( DataPointTest.class );
        suite.addTestSuite( MetricTest.class );
        return suite;
      }
    }
    

    A TestSuite executes all tests in every class added to it through the addTestSuite()
    method. A TestSuite can contain TestCases or other TestSuites, so once you build a suite of test
    cases for your classes, a master test suite can include your suite and thereby inherit all of your test cases.

    The final step in this example is to execute either an individual test case or a test suite. After
    downloading JUnit from www.junit.org, add the junit.jar file to your CLASSPATH and then invoke either its command-line interface or GUI interface. The three classes that execute these tests
    are as follows:

    • junit.textui.TestRunner
    • junit.swingui.TestRunner
    • junit.awtui.TestRunner

    As these package names imply, textui is the command-line interface, swingui is the
    Swing-based graphical interface, and awtui is an older AWT-based graphical interface. You can pass an
    individual test case or an entire test suite as an argument to the TestRunner class. For example,
    to execute the test suite that we created earlier, you would use this:

    java junit.swingui.TestRunner com.javasrc.metric.MetricTestSuite
    

    Unit Performance Testing

    Unit performance testing has three aspects:

  • Memory profiling
  • Code profiling
  • Coverage profiling

    This section explores each facet of performance profiling. I provide examples of what to
    look for and the step-by-step process to implement each type of testing.

    Memory Profiling

    Let’s first look at memory profiling. To illustrate how to determine if you do, in fact, have a
    memory leak, I modified the BEA MedRec application to capture the state of the environment
    every time an administrator logs in and to store that information in memory. My intent is to
    demonstrate how a simple tracking change left to its own devices can introduce a memory leak.

    The steps you need to perform on your code for each use case are as follows:

    1. Request a garbage collection and take a snapshot of your heap.
    2. Perform your use case.
    3. Request a garbage collection and take another snapshot of your heap.
    4. Compare the two snapshots (the difference between them includes all objects
      remaining in the heap) and identify any unexpected loitering objects.

    5. For each suspect object, open the heap snapshot and track down where the object
      was created.
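
    A real memory profiler reports per-object detail, but steps 1 and 3 can be crudely
    approximated with the standard API alone. The sketch below shows only the aggregate heap
    drift across a use case; runUseCase() is a placeholder for your own code.

    public class HeapDrift
    {
      public static void main( String[] args ) throws Exception
      {
        long before = usedHeap();
        runUseCase();  // placeholder for the use case under test
        long after = usedHeap();
        System.out.println( "Heap grew by " + ( after - before ) + " bytes" );
      }

      private static long usedHeap() throws InterruptedException
      {
        Runtime rt = Runtime.getRuntime();
        rt.gc();             // request (not force) a garbage collection
        Thread.sleep( 250 ); // give the collector a moment to finish
        return rt.totalMemory() - rt.freeMemory();
      }

      private static void runUseCase() { /* exercise the component here */ }
    }
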
    Note

    A memory leak can be detected with a single execution of a use case or through a plethora of executions
    of a use case. In the latter case, the memory leak will scream out at you. So, while analyzing individual use
    cases is worthwhile, when searching for subtle memory leaks, executing your use case multiple times makes
    finding them easier.

    In this scenario, I performed steps 1 through 3 with a load tester that executed the MedRec
    administration login use case almost 500 times. Figure 1 shows the difference between the
    two heap snapshots.

    Figure 1. The snapshot difference between the heaps before and after executing the use case

    Figure 1 shows that my use case yielded 8,679 new objects added to the heap. Most of
    these objects are collection classes, and I suspect they are part of BEA’s infrastructure. I scanned
    this list looking for my code, which in this case consists of any class in the com.bea.medrec package.
    Filtering on those classes, I was interested to see a large number of
    com.bea.medrec.actions.SystemSnapShot instances, as shown in Figure 2.

    Note

    The screen shots in this article are from Quest Software’s JProbe and PerformaSure products.

    Figure 2. The snapshot difference between the heaps, filtered on my application packages

    Realize that rarely is a loitering object a single simple object; rather, it is typically a subgraph
    that maintains its own references. In this case, the SystemSnapShot class is a dummy class that
    holds a set of primitive type arrays with the names timestamp, memoryInfo, jdbcInfo, and
    threadDumps, but in a real-world scenario these arrays would be objects that reference other objects
    and so forth. By opening the second heap snapshot and looking at one of the SystemSnapShot
    instances, you can see all of the objects that it references. As shown in Figure 3, the SystemSnapShot
    class references four objects: timestamp, memoryInfo, jdbcInfo, and threadDumps. A loitering
    object, then, has a far greater impact than the object itself.

    Next, let’s look at the referrer tree. We repeatedly ask the following questions: What class
    is referencing the SystemSnapShot? What class is referencing that class? Eventually, we find
    one of our own classes. Figure 4 shows that the SystemSnapShot is referenced by an Object array,
    which is referenced by an ArrayList, which in turn is referenced by the AdminLoginAction.

    Figure 3. The SystemSnapShot class references four objects: timestamp, memoryInfo, jdbcInfo, and
    threadDumps.

    Figure 4. Here we can see that the AdminLoginAction class created the SystemSnapShot, and that it
    stored it in an ArrayList.

    Finally, we can look into the AdminLoginAction code to see that it creates the new
    SystemSnapShot instance we are looking at and adds it to its cache on line 66, as shown in
    Figure 5.
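
    The pattern at fault looks roughly like the following reconstruction (a sketch, not the
    actual MedRec source): the collection grows on every login and is never trimmed.

    import java.util.ArrayList;
    import java.util.List;

    /** Reconstructed sketch of the leak: an unbounded, long-lived cache. */
    public class AdminLoginAction
    {
      // Lives as long as the application does, so every element
      // (and the subgraph it references) is pinned in the heap.
      private static List snapshots = new ArrayList();

      public void execute()
      {
        // ... authenticate the administrator ...
        snapshots.add( new SystemSnapShot() );  // added, never removed
      }
    }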

    You need to perform this type of memory profiling test on your components during your
    performance unit testing. For each object that is left in the heap, you need to ask yourself
    whether or not you intended to leave it there. It’s OK to leave things on the heap as long as you
    know that they are there and you want them to be there. The purpose of this test is to identify
    and document potentially troublesome objects and objects that you forgot to clean up.

    Figure 5. The AdminLoginAction source code

    Code Profiling

    The purpose of code profiling is to identify sections of your code that are running slowly and
    then determine why. The perfect example I have to demonstrate the effectiveness of code profiling
    is a project that I gave to my Data Structures and Algorithm Analysis class: compare and quantify
    the differences among the following sorting algorithms for various values of N (where N
    represents the number of items being sorted):

    • Bubble sort
    • Selection sort
    • Insertion sort
    • Shell sort
    • Heap sort
    • Merge sort
    • Quick sort

    As a quick primer on sorting algorithms, each of the aforementioned algorithms has its
    strengths and weaknesses. The first four algorithms run in O(N²) time, meaning that the run
    time grows quadratically: as the number of items to sort, N, increases, the amount of time
    required for the sorting algorithm to complete grows in proportion to N². The last three
    algorithms run in O(N log N) time, meaning that as N increases, the amount of time required
    grows in proportion to N log N. Achieving O(N log N) performance requires additional overhead that
    may cause the last three algorithms to actually run slower than the first four for a small number
    of items. My recommendation is to always examine both the nature of the data you want to sort
    today and the projected nature of the data throughout the life cycle of the product prior to
    selecting your sorting algorithm.
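
    For reference, a textbook bubble sort has the shape below; this is a generic version, not
    the exact class I handed to my students, but the nested loop and comparison are precisely
    what the profiler flags in the discussion that follows.

    public class Sorts
    {
      /** Textbook bubble sort: the nested loops drive the O(N^2) cost. */
      public static void bubbleSort( int[] a )
      {
        for( int i = 0; i < a.length - 1; i++ )
        {
          for( int j = 0; j < a.length - 1 - i; j++ )  // the hot loop
          {
            if( a[j] > a[j + 1] )                      // the hot comparison
            {
              int tmp = a[j];
              a[j] = a[j + 1];
              a[j + 1] = tmp;
            }
          }
        }
      }
    }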

    With that foundation in place, I provided my students with a class that implements the
    aforementioned sorting algorithms. I really wanted to drive home the dramatic difference
    between executing these sorting algorithms on 10 items as opposed to 10,000 items, or even
    1,000,000 items. For this exercise, I think it would be useful to profile this application against
    5,000 randomly generated integers, which is enough to show the differences between the
    algorithms, but not so excessive that I have to leave my computer running overnight.

    Figure 6 shows the results of this execution, sorting each method by its cumulative
    run time.

    Figure 6. The profiled methods used to sort 5,000 random integers using the seven sorting algorithms

    We view the method response times sorted by cumulative time, because some of the algorithms
    make repeated calls to other methods to perform their sorting (for example, the
    quickSort() method makes 5,000 calls to q_sort()). We have to ignore the main() method,
    because it calls all seven sorting methods. (Its cumulative time is almost 169 seconds, but its
    exclusive method time is only 90 milliseconds, demonstrating that most of its time is spent in
    other method calls—namely, all of the sorting method calls.) The slowest method by far is the
    bubbleSort() method, accounting for 80 seconds in total time and 47.7 percent of total run
    time for the program.

    The next question is, why did it take so long? Two pieces of information can give us insight
    into the length of time: the number of external calls the method makes and the amount of time
    spent on each line of code. Figure 7 shows the number of external calls that the bubbleSort()
    method makes.

    Figure 7. The number of external calls that the bubbleSort() method makes

    This observation is significant: in order to sort 5,000 items, the bubble sort algorithm
    required almost 12.5 million comparisons (in the worst case, bubble sort performs N(N-1)/2
    comparisons, and 5,000 × 4,999 / 2 = 12,497,500). It immediately alerts us to the fact that if we have a
    considerable number of items to sort, bubble sort is not the best algorithm to use. Taking this
    example a step further, Figure 8 shows a line-by-line breakdown of call counts and time
    spent inside the bubbleSort() method.

    Figure 8. Profiling the bubbleSort() method

    By profiling the bubbleSort() method, we see that 45 percent of its time is spent comparing
    items, and 25 percent is spent managing a for loop; these two lines account for 56 cumulative
    seconds. Figure 8 clearly illustrates the core issue of the bubble sort algorithm: on line 15 it
    executes the for loop 12,502,500 times, which resolves to 12,497,500 comparisons.

    To be successful in deploying high-performance components and applications, you need
    to apply this level of profiling to your code.

    Coverage Profiling

    Identifying and rectifying memory issues and slow-running algorithms gives you confidence in
    the quality of your components, but that confidence is meaningful only as long as you are exercising
    all—or at least most—of your code. That is where coverage profiling comes in; coverage
    profiling reveals the percentage of classes, methods, and lines of code that are executed by a
    test. Coverage profiling can provide strong validation that your unit and integration tests are
    effectively exercising your components.

    In this section, I’ll show a test of a graphical application that I built to manage my digital
    pictures, run inside a coverage profiler and filtered to my own classes. I purposely
    chose not to test the application extensively, in order to present an interesting example. Figure 9 shows a
    class summary of the code that I tested, with six profiled classes in three packages displayed in
    the browser window and the methods of the JThumbnailPalette class with missed lines in the
    pane below.

    Figure 9. Coverage profile of a graphical application

    The test exercised all six classes, but missed a host of methods and classes. For example, in
    the JThumbnailPalette class, the test completely failed to call the methods getBackgroundColor(),
    setBackgroundColor(), setTopRow(), and others. Furthermore, even though the paint() method
    was called, the test missed 16.7 percent of the lines. Figure 10 shows the specific lines of code
    within the paint() method that the test did not execute.

    Figure 10 reveals that most lines of code were executed 17 times, but the code that handles
    painting a scrolled set of thumbnails was skipped. With this information in hand, the tester
    needs to move the scroll bar, or configure an automated test script to move it, to ensure that
    this piece of code is executed.

    Coverage is a powerful profiling tool, because without it, you may miss code that your
    users will encounter when they use your application in a way that you do not expect (and rest
    assured, they definitely will).

    Figure 10. A look inside the JThumbnailPalette’s paint() method

    Summary

    As components are built, performance unit tests should be performed alongside functional
    unit tests. These performance tests include testing for memory issues and code issues, as well
    as validating test coverage to ensure that the majority of component code is being exercised.

    About the Author

    Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical editing countless software publications, he is also the Java Host on InformIT.com. As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.

    Source of this material

    Pro Java EE 5 Performance Management and
    Optimization

    By Steven Haines

    Published: May 2006, Paperback: 424 pages
    Published by Apress
    ISBN: 1590596102
    Retail price: $49.99

    This material is from Chapter 5 of the book.
