Java EE 5 Performance Management and Optimization


A discussion held during a recent client visit…

“Okay, I understand how to gather metrics, but now what do I do with them?” John asked,
looking confounded. “If I have application response time, instrumentation, and application
server metrics, what should I have my developers do to ensure that the next deployment will be

“That is a very good question. At its core, it involves a change of mind-set by your entire
development organization, geared toward performance. You’ll most likely feel resistance from
your developers, but if they follow these steps and embrace performance testing from the outset,
then you’ll better your chances of success more than a hundredfold,” I said.

“I can deal with upset developers,” John responded. “The important thing is that the application
meets performance criteria when it goes live. I’ll make sure that they follow the proper
testing procedures; they have to understand the importance of application performance. I just
can’t face the idea of calling the CEO and telling him that we failed again!”

“Don’t worry, I’ve helped several customers implement this methodology into their development
life cycle, and each one has been successful. It is a discipline that, once adopted, becomes
second nature. The key is to get started now!”

“Tell me more,” John stated calmly, in contrast with his stressed demeanor. I knew that
John had seen the light and was destined for success in the future.

Performance Overview

All too often in application development, performance is an afterthought. I once worked for a
company that fully embraced the Rational Unified Process (RUP) but took it to an extreme. The
application the company built spent years in architecture, and the first of ten iterations took
nearly nine months to complete. The company learned much through its efforts and became
increasingly efficient in subsequent iterations, but one thing that the organization did not
learn until very late in the game was the importance of application performance. In the last
couple of iterations, it started implementing performance testing and learned that part of the
core architecture was flawed—specifically, the data model needed to be rearchitected. Because
object models are built on top of data models, the object model also had to change. In addition,
all components that interact with the object model had to change, and so on. Finally, the application
had to go through another lengthy QA cycle that uncovered new bugs as well as the
reemergence of former bugs.

That company learned the hard way that the later in the development life cycle performance
issues are identified, the more expensive they are to fix. Figure 1 illustrates this idea graphically. You can see that a performance issue identified during the application’s development is inexpensive to fix, but one found later can cause the
cost to balloon. Thus, you must ensure the performance of your application from the early
stages of its architecture and test it at each milestone to preserve your efforts.

Figure 1. The relationship between the time taken to identify performance issues and the
repair costs

A common theme has emerged from those customer sites I visit in which few or no performance
issues are identified: these customers kept in mind the performance of the application
when designing the application architecture. At these engagements, the root causes of most of
the application problems were related to load or application server configuration—the applications
had very few problems.

This is the first in a series of three articles that formalizes the methodology you should implement to ensure the performance
of your application at each of the development, QA, and deployment stages.
I have helped customers implement this methodology into their organizations and roll out
their applications to production successfully.

Performance in Architecture

The first step in developing any application of consequence is to perform an architectural analysis
of a business problem domain. To review, application business owners work with application
technical owners to define the requirements of the system. Application business owners are
responsible for ensuring that when the application is complete it meets the needs of the end
users, while application technical owners are responsible for determining the feasibility of
options and defining the best architecture to solve the business needs. Together, these two
groups design the functionality of the application.

In most organizations, the architecture discussions end at this analysis stage; the next step
is usually the design of the actual solution. And this stage is where the architectural process
needs to be revolutionized. Specifically, these groups need to define intelligent SLAs for each
use case, they need to define the life cycles of major objects, and they need to address requirements
for sessions.


An intelligent SLA maintains three core traits. It is

  • Reasonable
  • Specific
  • Flexible

An SLA must satisfy end-user expectations but still be reasonable enough to be implemented.
An unreasonable SLA will be ignored by all parties until end users complain. This is why SLAs
need to be defined by both the application business owner and the application technical owner:
the business owner pushes for the best SLAs for his users, while the application technical owner
impresses upon the business owner the reality of what the business requirement presents. If the
business requirement cannot be satisfied in a way acceptable to the application business owner,
then the application technical owner needs to present all options and the cost of each (in terms
of effort). The business requirement may need to be changed or divided into subprocesses that
can be satisfied reasonably.

An intelligent SLA needs to be specific and measurable. In this requirement, you are looking
for a hard and fast number, not a statement such as “The search functionality will respond
within a reasonable user tolerance threshold.” How do you test “reasonable”? You need to
remove all subjectivity from this exercise. After all, what is the point in defining an SLA if you
cannot verify it?

Finally, an intelligent SLA needs to be flexible. It needs to account for variations in behavior
as a result of unforeseen factors, but define a hard threshold for how flexible it is allowed to be.
For example, an SLA may read “The search functionality will respond within three seconds
(specific) for 95 percent of requests (flexible).” The occasional seven-second response time is
acceptable, as long as the integrity of the application is preserved—it responds well most of the
time. By defining concrete values for the specific value as well as the limitations of the flexible
value, you can quantify what “most of the time” means to the performance of the application,
and you have a definite value with which to evaluate and verify the SLA.
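
As a minimal sketch of how such an SLA could be verified, the following Java helper computes the fraction of measured response times that fall within the specific value and compares it with the flexible value. The class and method names are hypothetical, and the sketch assumes response times have already been collected in milliseconds.

```java
import java.util.Arrays;

/** Hypothetical helper: checks samples against an SLA such as "within 3 seconds for 95 percent of requests". */
final class SlaCompliance {

    /** Returns the fraction (0.0 to 1.0) of samples at or under the threshold. */
    static double fractionWithin(long[] responseTimesMs, long thresholdMs) {
        if (responseTimesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long within = Arrays.stream(responseTimesMs)
                            .filter(t -> t <= thresholdMs)
                            .count();
        return (double) within / responseTimesMs.length;
    }

    /** True when at least requiredFraction of the samples meet the threshold. */
    static boolean meetsSla(long[] responseTimesMs, long thresholdMs, double requiredFraction) {
        return fractionWithin(responseTimesMs, thresholdMs) >= requiredFraction;
    }
}
```

With the search SLA above, a call such as `SlaCompliance.meetsSla(samples, 3000L, 0.95)` would tolerate the occasional seven-second response as long as 95 percent of requests stay within three seconds.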


Although you define specific performance criteria and a measure of flexibility, it is also a good idea to define either a hard
upper limit of tolerance or a relative upper limit. I prefer to specify a relative upper limit,
measured in the number of standard deviations from the mean. The reason for defining an SLA in this way is
that on paper a 3-second response time for 95 percent of requests is tolerable, but it says nothing about drastically
divergent response times, such as a 30-second response time. Statistically, such outliers should be
rare, but an upper limit is a good safeguard against them.
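
A relative upper limit of this kind can be sketched in a few lines of Java. The helper below is hypothetical: it assumes the mean and standard deviation are computed from the observed samples themselves, uses the population standard deviation (dividing by n), and only flags samples on the slow side of the mean.

```java
/** Hypothetical helper: flags response times that stray too far above the observed mean. */
final class ResponseTimeSpread {

    static double mean(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Population standard deviation (divides by n); an assumption of this sketch. */
    static double stdDev(double[] samples) {
        double m = mean(samples);
        double sumOfSquares = 0.0;
        for (double s : samples) {
            sumOfSquares += (s - m) * (s - m);
        }
        return Math.sqrt(sumOfSquares / samples.length);
    }

    /** Counts samples more than maxStdDevs standard deviations above the mean. */
    static int countAboveLimit(double[] samples, double maxStdDevs) {
        double limit = mean(samples) + maxStdDevs * stdDev(samples);
        int count = 0;
        for (double s : samples) {
            if (s > limit) {
                count++;
            }
        }
        return count;
    }
}
```

Because a very slow request inflates both the mean and the standard deviation of the sample it belongs to, a baseline taken from an earlier test run may be a better reference in practice; that choice is left open here.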

An important aspect of defining intelligent SLAs is tracking them. The best way to do this
is to integrate them into your application use cases. A use case is built from a general thought,
such as “The application must provide search functionality for its patient medical records,”
but then the use case is divided into scenarios. Each scenario defines a path that the use case
may follow given varying user actions. For example, what does the application do when the
patient exists? What does it do when the patient does not exist? What if the search criterion
returns more than one patient record? Each of these business processes needs to be explicitly
called out in the use case, and each needs to have an SLA associated with it.
The following exercise demonstrates the format that a proper use case containing intelligent
SLAs should follow.


Use Case

The Patient Management System must provide functionality to search for specific patient medical history

Scenarios

Scenario 1: The Patient Management System returns one distinct record.
Scenario 2: The Patient Management System returns more than one match.
Scenario 3: The Patient Management System does not find any users meeting the specified criteria.

Preconditions

The user has successfully logged in to the application.

Triggers

The user enters search criteria and submits data using the Web interface.

Processing Steps

Scenario 1:
1. The Patient Management
2. . . .

Scenario 2:
3. . . .

Postconditions

The Patient Management System displays the results to the user.

SLAs

Scenario 1: The Patient Management System will return a specific patient matching the specified criteria in
less than three seconds for 95 percent of requests. The response time will at no point stray more than two
standard deviations from the mean.

Scenario 2: The Patient Management System will return a collection of patients matching the specified criteria
in less than five seconds for 95 percent of requests. The response time will at no point stray more than two
standard deviations from the mean.

Scenario 3: When the Patient Management System cannot find a user matching the specified criteria, it will
inform the user in less than two seconds for 95 percent of requests. The response time will at no point stray
more than two standard deviations from the mean.

The format of this use case varies from traditional use cases with the addition of the SLA
component. In the SLA component, you explicitly call out the performance requirements for
each scenario. The performance criteria include the following:

  • The expected tolerance level: Respond in less than three seconds.
  • The measure of flexibility: Meet the tolerance level for 95 percent of requests.
  • The upper threshold: Do not stray more than two standard deviations from the
    observed mean.
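
One way to keep these three facets attached to their scenarios is to record them as data that tests and monitoring code can look up. The sketch below is only an illustration: the class names are hypothetical, and the values are copied from the SLA section of the patient search use case above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** The three performance facets called out for one use case scenario. */
final class ScenarioSla {
    final long toleranceMs;        // expected tolerance level
    final double requiredFraction; // measure of flexibility
    final double maxStdDevs;       // upper threshold, in standard deviations from the mean

    ScenarioSla(long toleranceMs, double requiredFraction, double maxStdDevs) {
        this.toleranceMs = toleranceMs;
        this.requiredFraction = requiredFraction;
        this.maxStdDevs = maxStdDevs;
    }
}

/** Hypothetical registry of the SLAs for the patient search use case. */
final class PatientSearchSlas {

    static Map<String, ScenarioSla> byScenario() {
        Map<String, ScenarioSla> slas = new LinkedHashMap<String, ScenarioSla>();
        slas.put("Scenario 1", new ScenarioSla(3000L, 0.95, 2.0)); // one distinct record
        slas.put("Scenario 2", new ScenarioSla(5000L, 0.95, 2.0)); // more than one match
        slas.put("Scenario 3", new ScenarioSla(2000L, 0.95, 2.0)); // no match found
        return slas;
    }
}
```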

With each of these performance facets explicitly defined, the developers implementing code to
satisfy the use case understand their expectations and can structure unit tests accordingly. The QA
team has a specific value to test and measure the quality of the application against. Next, when the
QA team, or a delegated performance capacity assessor, performs a formal capacity assessment, an
extremely accurate assessment can be built and a proper degradation model constructed.
Finally, when the application reaches production, enterprise Java system administrators have
values from which to determine if the application is meeting its requirements.

All of this specific assessment is possible because the application business owner and
application technical owner took time to carefully determine these values in the architecture
phase. My aim here is to impress upon you the importance of up-front research and a solid
communication channel between the business and technical representatives.

Object Life Cycle Management

The most significant problem plaguing production enterprise Java applications is memory
management. The root cause of 90 percent of my customers’ problems is memory related and
can manifest in one of two ways:

  • Object cycling
  • Loitering objects (lingering object references)

Recall that object cycling is the rapid creation and deletion of objects in a short period of
time, which increases the frequency of garbage collection and may result in tenuring
short-lived objects prematurely. Loitering objects are the result of poor object management: the
application developer does not know when an object should be released from memory and so fails
to release its reference at the correct time. This reflects a failure to understand the impact
of reference management on application performance. The condition results in an overabundance
of objects residing in memory, which can have the following effects:

  • Garbage collection may run slower, because more live objects must be examined.
  • Garbage collection can become less effective at reclaiming objects.
  • Swapping on the physical machine can result, because less physical memory is available
    for other processes to use.

Neglecting object life cycle management can result in memory leaks and eventually application
server crashes. I discuss techniques for detecting and avoiding object cycling later in
this article, because it is a development or design issue, but object life cycle management is
an architectural issue.
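
To make the idea of a loitering object concrete, here is a small, hypothetical Java sketch. It assumes a long-lived map that holds a working buffer for each request: the first handler never removes its entry, so the buffer stays reachable after the request ends, while the second releases the reference when the work is done.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a loitering reference versus a managed one. */
final class LoiteringDemo {

    // Long-lived map: anything left in here stays reachable and cannot be garbage collected.
    static final Map<Integer, byte[]> IN_FLIGHT = new HashMap<Integer, byte[]>();

    /** Loitering version: the entry is never removed once the request finishes. */
    static void handleLoitering(int requestId) {
        IN_FLIGHT.put(requestId, new byte[1024]);
        // ... work with the buffer ...
    }

    /** Managed version: the reference is released as soon as the request is done. */
    static void handleManaged(int requestId) {
        IN_FLIGHT.put(requestId, new byte[1024]);
        try {
            // ... work with the buffer ...
        } finally {
            IN_FLIGHT.remove(requestId);
        }
    }
}
```

After 100 calls to `handleLoitering`, the map still holds 100 buffers; after 100 calls to `handleManaged`, it holds none.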

To avoid loitering objects, take control of the management of object life cycles by defining
object life cycles inside use cases. I am not advocating that each use case should define every
int, boolean, and float that will be created in the code to satisfy the use case; rather, each use
case needs to define the major application-level components upon which it depends. For
example, in the Patient Management System, daily summary reports may be generated every
evening that detail patient metrics such as the number of cases of heart disease identified this
year and the common patient profile attributes for each. This report would be costly to build
on a per-request basis, so the architects of the system may dictate that the report needs to be
cached at the application level (or in the application scope so that all requests can access it).

Defining use case dependencies and application-level object life cycles provides a deeper
understanding of what should and should not be in the heap at any given time. Here are some
guidelines to help you identify application-level objects that need to be explicitly called out
and mapped to use cases in a dependency matrix:

  • Expensive objects, in terms of both allocated size as well as allocation time, that will be
    accessed by multiple users
  • Commonly accessed data
  • Nontemporal user session objects
  • Global counters and statistics management objects
  • Global configuration options

The most common examples of application-level object candidates are frequently accessed
business objects, such as those stored in a cache. If your application uses entity beans, then
you need to carefully determine the size of the entity bean cache by examining use cases; this
can be extrapolated to apply to any caching infrastructure. The point is that if you are caching
data in the heap to satisfy specific use cases, then you need to determine how much data is
required to satisfy the use cases. And if anyone questions the memory footprint, then you can
trace it directly back to the use cases.

The other half of the practice of object life cycle management is defining when objects
should be removed from memory. In the previous example, the medical summary report is
updated every evening, so at that point the old report should be removed from memory to
make room for the new report. Knowing when to remove objects is probably more important
than knowing when to create objects. If an object is not already in memory, then you can create
it, but if it is in memory and no one needs it anymore, then that memory is wasted for as long as the reference is held.
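
The nightly report replacement can be sketched as a single application-level holder whose contents are swapped in one step, so that the old report loses its last reference as soon as the new one is published. This is a minimal sketch with hypothetical names; it assumes the report is built elsewhere and leaves the scheduling of the nightly rebuild out.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical application-scope holder for one cached object, such as the daily summary report. */
final class ApplicationScopedHolder<T> {

    private final AtomicReference<T> current = new AtomicReference<T>();

    /** Returns the cached object, or null if none has been published yet. */
    T get() {
        return current.get();
    }

    /**
     * Publishes a new object and returns the one it replaced.
     * Once the caller drops the returned reference, the old object is eligible for collection.
     */
    T replace(T replacement) {
        return current.getAndSet(replacement);
    }
}
```

The design choice here is that the holder keeps exactly one reference, so the memory footprint of the cached report is bounded by one report (two, briefly, while the replacement is being built).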

Application Session Management

Just as memory mismanagement is the most prevalent issue impacting the performance of
enterprise Java applications, HTTP sessions are by far the biggest culprit in memory abuse.
HTTP is a stateless protocol, and as such the conversation between the Web client and Web
server terminates at the conclusion of a single request: the Web client submits a request to the
Web server (most commonly GET or POST), and then the Web server performs its business logic,
constructs a response, and returns the response to the Web client. This ends the Web conversation
and terminates the relationship between client and server.

In order to sustain a long-term conversation between a Web client and Web server, the Web
server constructs a unique identifier for the client and includes it with its response to the request;
internally the Web server maintains all user data and associates it with that identifier. On
subsequent requests, the client submits this unique identifier to identify itself to the Web server.

This sounds like a good idea, but it creates the following problem: if the HTTP protocol is
truly stateless and the conversation between Web client and Web server can only be renewed
by a client interaction, then what does the Web server do with the client’s information if that
client never returns? Obviously, the Web server throws the information away, but the real
question relates to how long the Web server should keep the information.

All application servers provide a session time-out value that constrains the amount of time
user data is maintained. When the user makes any request from the server, the user’s time-out
is reset, and once the time-out has been exceeded, the user’s stateful information is discarded.
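
The reset-on-request behavior can be illustrated with a simplified sketch. It is not how any particular application server implements sessions; the class below is hypothetical, tracks only the time of each session's last request, and takes the current time as a parameter so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch of idle time-out tracking for sessions. */
final class SessionTimeouts {

    private final long timeoutMs;
    private final Map<String, Long> lastAccessMs = new HashMap<String, Long>();

    SessionTimeouts(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Records a request for the session, which resets its idle clock. */
    void touch(String sessionId, long nowMs) {
        lastAccessMs.put(sessionId, nowMs);
    }

    /** Discards every session idle for longer than the time-out; returns how many were removed. */
    int purgeExpired(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastAccessMs.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMs - it.next().getValue() > timeoutMs) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean isActive(String sessionId) {
        return lastAccessMs.containsKey(sessionId);
    }
}
```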
A practical example of this is logging in to your online banking application. You can view your
account balances, transfer funds, and pay bills, but if you sit idle for too long, you are forced to
log in again. The session time-out period for a banking application is usually quite short for
security reasons (for example, if you log in to your bank account and then leave your computer
unattended to go to a meeting, you do not want someone else who wanders by your desk to be
able to access your account). On the other hand, when you shop at a large online retailer, you can add
items to your shopping cart and return six months later to see that old book on DNA synthesis
and methylation that you still do not have time to read sitting there. Such a retailer uses a more
advanced infrastructure to support this feature (and a heck of a lot of hardware and memory),
but the question remains: how long should you hold on to data between user requests before
discarding it?

The definitive time-out value must come from the application business owner. He or she
may have specific, legally binding commitments with end users and business partners. But an
application technical owner can control the quantity of data that is held resident in memory for
each user. In the aforementioned example, do you think that the retailer maintains everyone’s
shopping cart in memory for all time? I suspect that shopping cart data is maintained in memory
for a fixed session length, and afterward persisted to a database for later retrieval.

As a general guideline, sessions should be as small as possible while still realizing the benefits
of being resident in memory. I usually maintain temporal data describing what the user does in
a particular session, such as the page the user came from, the options the user has enabled, and
so on. More significant data, such as objects stored in a shopping cart, opened reports, or partial
result sets, are best stored in stateful session beans, because rather than being maintained in
a hash map that can conceivably grow indefinitely like HTTP session objects, stateful session
beans are stored in predefined caches. The size of stateful session bean caches can be defined
upon deployment, on a per-bean basis, and hence assert an upper limit on memory consumption.
When the cache is full, to add a new bean to it, an existing bean must be selected and
written out to persistent storage. The danger is that if the cache is sized too small, the maintenance
of the cache can outweigh the benefits of having the cache in the first place. If your sessions are
heavy and your user load is large, then this upper limit can prevent your application servers
from crashing.
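
The bounded-cache behavior described above can be approximated in plain Java to show the idea. The sketch below is not an application server's stateful session bean cache: the class name is hypothetical, an in-memory map stands in for persistent storage, and the least recently used entry is the one chosen for writing out, which is an assumption of this example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded cache that moves its least recently used entry to a backing store when full. */
final class BoundedBeanCache<K, V> {

    // Stand-in for persistent storage; a real container would serialize to disk or a database.
    private final Map<K, V> passivated = new HashMap<K, V>();
    private final LinkedHashMap<K, V> active;

    BoundedBeanCache(final int maxActive) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.active = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maxActive) {
                    passivated.put(eldest.getKey(), eldest.getValue());
                    return true; // evict from memory after writing out
                }
                return false;
            }
        };
    }

    void put(K key, V value) {
        passivated.remove(key);
        active.put(key, value);
    }

    /** Returns the entry, reloading it from the backing store if it was written out. */
    V get(K key) {
        V value = active.get(key);
        if (value == null && passivated.containsKey(key)) {
            value = passivated.remove(key);
            active.put(key, value); // reactivation may push out another entry
        }
        return value;
    }

    int activeSize() { return active.size(); }

    int passivatedSize() { return passivated.size(); }
}
```

The in-memory footprint never exceeds `maxActive` entries, which is the upper limit the text refers to; the cost is the extra write and reload work whenever the limit is set too low for the working set.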


Here you learned how to integrate proactive performance testing throughout the
development life cycle. The process begins by integrating performance criteria into use cases,
which involves modifying use cases to include specific SLA sections that include performance
criteria for each use case scenario.

About the Author

Steven Haines is the author of three Java books: The Java Reference Guide (InformIT/Pearson, 2005), Java 2 Primer Plus (SAMS, 2002), and Java 2 From Scratch (QUE, 1999). In addition to contributing chapters and coauthoring other books, as well as technical editing countless software publications, he is also the Java Host on As an educator, he has taught all aspects of Java at Learning Tree University as well as at the University of California, Irvine. By day he works as a Java EE 5 Performance Architect at Quest Software, defining performance tuning and monitoring software as well as managing and performing Java EE 5 performance tuning engagements for large-scale Java EE 5 deployments, including those of several Fortune 500 companies.

Source of this material

Pro Java EE 5 Performance Management and Optimization

By Steven Haines

Published: May 2006, Paperback: 424 pages
Published by Apress
ISBN: 1590596102
Retail price: $49.99

This material is from Chapter 5 of the book.
