March 8, 2021

Reduce String Literal Overhead with Eclipse

  • By Scott Nelson

The ideas for these articles come from the successes, failures, and discoveries in real-world projects. In this particular case, all three inspirations are at play. The unconventional process described in the first section has proven successful, with success measured by everyone going home on time and the projects delivered on time and under budget. The failure that led to the specific problem and solution covered here was an application that continually required more memory and additional hardware at regular release intervals. A random check of code while looking for re-usable assets revealed that a good percentage of the problem was the inefficient use of String. The discovery of the Eclipse feature that makes this issue easy to track and address was the result of randomly trying different settings (which is less efficient than reading the documentation but much more fun).

Three Costs of String Literals

FUD: Fear, Uncertainty, and Doubt. These are the three roadblocks to improvement. You will probably experience all three until you have read this article all the way through. Some readers may even have to try this themselves before they are cured, and some need an incentive to keep reading to get over their FUD. So, as motivation to get through the next sections with an open mind, let's look at three problems that almost every application suffers from.

The first cost of a String literal is the overhead of creating the String. There are few more expensive operations than creating a new object. The use of interfaces is a common approach because developers are aware of this overhead. Yet, because String is so ubiquitous, it is often forgotten that creating a String is creating a new object. The failure alluded to earlier is a perfect example. One class had a large number of Strings declared. They were all declared as static final, a good practice to reduce the cost of String creation. However, a random search for one of these declarations revealed the exact same value being declared in 70 different files. The JVM does pool identical compile-time literals, but each class still carries its own copy in its constant pool, and any String built at run time is a separate object. A minimum footprint for a String is about 40 bytes, with the exact figure depending on the JVM. That adds up quickly to more hardware expense and a good amount of labor spent looking for performance improvements.

The second cost is in the processing overhead of String comparison, a frequent reason for declaring a String in the first place. Because == compares references rather than contents, it can be relied upon only when both sides are known to be the same object. When the same value is declared or built in many places, that guarantee is easy to lose, and the more intensive .equals() comparison is required.
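The difference between the two comparisons can be seen in a minimal sketch. The class and method names here are illustrative assumptions, not code from the project described above, and the run-time copy stands in for a value such as a request parameter.

```java
// Minimal sketch; StringComparisonDemo and its methods are hypothetical names.
final class StringComparisonDemo {

    static final String USER_ID = "userId";

    // Identical compile-time literals are pooled by the JVM,
    // so both references point at the same object.
    static boolean literalsShareIdentity() {
        String other = "userId";
        return USER_ID == other;
    }

    // A String built at run time is a distinct object, so == is false
    // even though the characters match.
    static boolean runtimeCopySharesIdentity() {
        String fromRequest = new String("userId");
        return USER_ID == fromRequest;
    }

    // equals() compares contents, so it is true for the run-time copy too.
    static boolean runtimeCopyHasEqualContent() {
        String fromRequest = new String("userId");
        return USER_ID.equals(fromRequest);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareIdentity());      // true
        System.out.println(runtimeCopySharesIdentity());  // false
        System.out.println(runtimeCopyHasEqualContent()); // true
    }
}
```

The sketch suggests why a single shared declaration matters: == is only safe when every participant is guaranteed to hold the same reference.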

The third cost is maintenance. Tracking down every instance of a String is much more tedious and prone to error than changing a single instance.

An Hour of Prevention is Worth a Weekend of Cure

There are some processes that are rarely used that every project can benefit from. One of these is daily code reviews by either a build master or technical lead. This is rarely done for many reasons, two of which are the misconceptions that it takes more time than it is worth and that it is hard to do.

Looking at the time versus value concept, a daily review should take no longer than an hour. That estimate assumes the reviewer is responsible for no more than eight developers (more than that and re-configuring the teams should be considered) and that reviews start at the beginning of the project rather than waiting for a hard-to-trace performance issue to occur. In a six-month project, this would total about 125 hours (one hour on each of roughly 125 working days). When compared with how long it takes to tune applications in either QA or production, project savings will generally average 100%.

Daily code reviews should be easy. Every source control application includes a report of which files have changed since the last update and a comparison tool to view differences between versions. Setting aside simple beans and other classes that can (and should) be generated by an IDE, the total lines of code output by a team on a daily basis is far less than one might think. This is not because the team is not productive; it is because producing a line of code consists of thinking about what the line should be, writing the line, testing the line, and correcting the current line or previously written lines based on test results.

The time taken to review the code can be greatly reduced through the use of code analysis tools both native to an IDE and available as plug-ins. By providing feedback from these daily reviews to team members and having them make the corrections themselves, the team will reduce their code standard variations. Code that is clean to begin with takes even less time to review. That hour per day can quickly drop to an average of 30 minutes a day.

These daily code reviews should not be full peer reviews. They only need to be cursory reviews looking for what can be found quickly (once the review becomes a habit). One Eclipse feature that can speed this process is the Errors/Warnings page under the Java Compiler preferences. While the default settings are very useful, there is one non-default setting that every Java development team can benefit from: setting Non-externalized strings (listed in the Code style group) to Warning.
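What that warning reports can be sketched in a few lines. The class and strings below are hypothetical; the only Eclipse-specific element is the //$NON-NLS-1$ comment tag, which marks the first literal on a line as intentionally left in the code.

```java
// Minimal sketch; LoginLabels and its strings are hypothetical examples.
final class LoginLabels {

    // With Non-externalized strings set to Warning, Eclipse flags this
    // literal, which makes hard-coded Strings easy to spot during a review.
    static String prompt() {
        return "Please sign in";
    }

    // The trailing tag tells Eclipse that the first literal on this line is
    // deliberately not externalized, so no warning is reported for it.
    static String parameterName() {
        return "userId"; //$NON-NLS-1$
    }

    public static void main(String[] args) {
        System.out.println(prompt());
        System.out.println(parameterName());
    }
}
```

A reviewer scanning the Problems view then sees every untagged literal, and repeated values such as "userId" become candidates for a shared constant.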

Design for String Performance

Every developer is familiar with the same String values being used repeatedly in a web application. Because example code is rife with String declarations, it is a common (and expensive) habit to declare Strings often. A much more efficient approach is to create a single interface to hold the String values that will be re-used in more than one class (JSPs included).

A simple example is:

public interface StaticConstants {
   // Shared keys, declared once and referenced by every class that needs them.
   public static final String USER_ID  =   "userId";
   public static final String PASSWORD =   "pword";
}
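As a rough illustration of how such an interface might be shared, the sketch below has two classes refer to the same constants. LoginForm and LoginValidator are hypothetical names introduced only for this example, and the interface is repeated (without the public modifier) so the snippet compiles as a single file.

```java
import java.util.HashMap;
import java.util.Map;

// Repeated here so the sketch is self-contained; fields declared in an
// interface are implicitly public static final.
interface StaticConstants {
    String USER_ID = "userId";
    String PASSWORD = "pword";
}

// Hypothetical producer: stores the submitted values under the shared keys.
final class LoginForm {
    static Map<String, String> submit(String user, String password) {
        Map<String, String> params = new HashMap<>();
        params.put(StaticConstants.USER_ID, user);
        params.put(StaticConstants.PASSWORD, password);
        return params;
    }
}

// Hypothetical consumer: reads the same keys without redeclaring them.
final class LoginValidator {
    static boolean hasCredentials(Map<String, String> params) {
        return params.containsKey(StaticConstants.USER_ID)
                && params.containsKey(StaticConstants.PASSWORD);
    }
}
```

Neither class contains a String literal of its own, so renaming a key means changing one line in the interface.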

The two Strings above will be familiar to anyone who has ever worked on a web application. If you were to look at your last application, how many times would you find these Strings declared? Multiply that by the memory each one requires and you will see that, while you sat in meetings discussing how to reduce the time it takes a user to log in, a simple (if minor) reduction was available immediately. Then add in the processing time wherever the .equals() method is used instead of the faster == comparison operator.

I have used this design approach for many years; I was fortunate enough to be introduced to it on my first Java project. In my experience, the average number of such Strings in a web application is 120, counting only Strings used by more than one object. Frequently, these Strings are used by four or more objects. Assume an average of three objects per String and 60 bytes per String: 120 x 3 x 60 = 21,600 bytes. Gosh, that is only .02 MB. Hardly worth it, eh? Ah, but these Strings are rarely declared as static final, so the copies are repeated for each user; if you expect 1000 concurrent users, you are now at 20 MB. I'm prone to kill processes on my machine that use anything more than 500k if they aren't critical because I know they slow my 4GB machine down.
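The estimate above can be checked with a short calculation. The four input figures are the article's own; the class and method names are assumptions made for this sketch, and a megabyte is taken as 1024 x 1024 bytes.

```java
// Minimal sketch of the arithmetic; StringOverheadEstimate is a hypothetical name.
final class StringOverheadEstimate {

    static final int SHARED_STRINGS = 120;     // Strings used by more than one object
    static final int OBJECTS_PER_STRING = 3;   // average objects declaring each String
    static final int BYTES_PER_STRING = 60;    // average footprint per String
    static final int CONCURRENT_USERS = 1000;  // expected concurrent users

    // 120 * 3 * 60 = 21,600 bytes of duplicated Strings per user.
    static long bytesPerUser() {
        return (long) SHARED_STRINGS * OBJECTS_PER_STRING * BYTES_PER_STRING;
    }

    // 21,600 * 1000 = 21,600,000 bytes when the copies are repeated per user.
    static long bytesForAllUsers() {
        return bytesPerUser() * CONCURRENT_USERS;
    }

    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println(bytesPerUser() + " bytes per user");
        System.out.println(toMegabytes(bytesForAllUsers()) + " MB for all users");
    }
}
```

Under these assumptions the per-user figure is about 0.02 MB and the total is a little over 20 MB, which matches the rounded numbers in the text.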

Although the use of String.intern() would also reduce overhead, that particular approach is much more useful at a class level than an application level.
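For comparison, here is a minimal sketch of what String.intern() does. The class name and the run-time copy are illustrative assumptions; intern() itself is the standard java.lang.String method, which returns the pooled instance with the same contents.

```java
// Minimal sketch; InternDemo is a hypothetical name.
final class InternDemo {

    static final String USER_ID = "userId";

    // A run-time copy is a different object from the pooled literal.
    static boolean rawCopyIsSameObject() {
        String fromRequest = new String("userId");
        return fromRequest == USER_ID;
    }

    // intern() returns the pooled instance, which is the same object
    // the literal refers to, so == succeeds.
    static boolean internedCopyIsSameObject() {
        String fromRequest = new String("userId");
        return fromRequest.intern() == USER_ID;
    }

    public static void main(String[] args) {
        System.out.println(rawCopyIsSameObject());      // false
        System.out.println(internedCopyIsSameObject()); // true
    }
}
```

Because every caller has to remember to invoke intern(), this works best inside one class that controls its own Strings, while a shared constants interface covers the whole application.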


This article was originally published on October 29, 2008
