The original Design Patterns book contains 23 patterns that identify named solutions to common software development problems. Over the years, I’ve found a need for many of these patterns over and over again. I’m continually recognizing patterns such as the command and template method. I’ve found only minimal use for some of the patterns, however, such as flyweight. Yet, flyweight is a design pattern that Java itself heavily depends upon.
Wikipedia indicates that the flyweight pattern is appropriate when “many objects must be manipulated and these cannot afford to have extraneous data.” In Java, String objects are managed as flyweight. Java puts all fixed String literals into a literal pool. For redundant literals, Java keeps only one copy in the pool. I can demonstrate this with a simple “language test” written in JUnit:
@Test public void pool() { String author = "Henry David Thoreau"; String authorCopy = "Henry David Thoreau"; assertTrue(author.equals(authorCopy)); assertTrue(author == authorCopy); }
The two String objects are, of course, semantically equal because their contents are the same. I demonstrate the equality by calling .equals() in the first assertion. The two objects are also located in the same location in memory! The second assertion compares the references (addresses) of author and authorCopy.
Even though the two String objects are created separately, under the covers Java is storing them in the same location, to save space. Most applications heavily use Strings. The payback can be significant: Profiling reveals that String objects often represent anywhere from a third to a half of all objects created in a typical application.
As an aside, the use of flyweight in Java to minimize String storage often causes beginning programmers to make a dangerous mistake. When comparing two references to determine whether or not they hold the same character sequences, novice Java programmers will often code:
if (author == authorCopy)
As our test demonstrates, the comparison can work—it can return true even if the two objects were created at separate times during execution of the Java application. Unfortunately, Java does not automatically put dynamically created Strings into the literal pool. The last assertion in the following test fails:
@Test public void pool() { String author = "Henry David Thoreau"; String authorCopy = "Henry David Thoreau"; assertTrue(author.equals(authorCopy)); assertTrue(author == authorCopy); StringBuilder builder = new StringBuilder("Henry"); builder.append(" David Thoreau"); assertTrue(author.equals(builder.toString())); // this assertion fails! assertTrue(author == builder.toString()); }
The String literal “Henry David Thoreau” appears in the literal pool because Java was able to extract it during compile time. However, at compile time, Java cannot guarantee that execution of the code would result in ” David Thoreau” getting concatenated to the “Henry” literal. Theoretically, the compiler could figure things out in this case, but that would dramatically increase compile time. And, such a tactic would work only in the simplest cases. (You can force a dynamically created string into the literal pool by creating a new String using String’s .intern() method.)
The chief lesson? You should almost always compare String references using .equals.
Time Objects
I’ve been working on a scheduling application that supports appointments. It contains a Time class that encapsulates a time of day, expressed using hour (presuming a 24-hour clock) and minute.
public class Time { private byte hour; private byte minute; public Time(byte hour, byte minute) { this.hour = hour; this.minute = minute; } public byte hour() { return hour; } public byte minute() { return minute; } public String toString() { return hour + ":" + minute; } }
My application must support very large numbers of appointments, perhaps millions, and I’m concerned about the memory requirement. The reality is, however, that appointments during a business day typically start either on the hour, or at quarter hour intervals past the hour. If I use the flyweight pattern so that I only store unique Time instances, I can reduce the maximum number of typical time objects to 96 (24 hours times four possible times per hour). The appointment application will still support odd times, but predictable use suggests that flyweight will dramatically minimize object creation and memory usage.
The key to making flyweight work is by controlling object instantiation using a factory method. The job of a factory method is simply to create objects: given input criteria, return an object of appropriate type. I design a TimeFactory class by writing a simple unit test:
import static org.junit.Assert.*; import org.junit.*; public class TimeFactoryTest { private static final byte HOUR = 23; private static final byte MINUTE = 59; @Test public void create() { Time time = TimeFactory.create(HOUR, MINUTE); assertEquals("23:59", time.toString()); } }
The code to get this test passing is trivial:
public class TimeFactory { public static Time create(byte hour, byte minute) { return new Time(hour, minute); } }
The stage is set for a second test:
@Test public void reuseOfMemory() { Time time1 = TimeFactory.create(HOUR, MINUTE); Time time2 = TimeFactory.create(HOUR, MINUTE); assertSame(time1, time2); }
In other words: If I create two time objects, each with the same hour and minute, they should be the same object in memory. JUnit’s assertSame method is equivalent to stating:
assertTrue(time1 == time2);
in this example. The test fails because the create method returns distinct Time instances. The failing reuseOfMemory test drives me to change the implementation in TimeFactory:
import java.util.*; public class TimeFactory { private static Map<String,Time> times = new HashMap<String,Time>(); public static Time create(byte hour, byte minute) { String key = hour + ":" + minute; Time time = times.get(key); if (time == null) { time = new Time(hour, minute); times.put(key, time); } return time; } }
Now that I’ve encapsulated instantiation in the factory’s create method, I can control when Time objects get created. As the first step, I create a unique key by concatenating the hour and the minute into a time string. I use this key to extract an entry, if one exists, from a new HashMap named times. If an entry exists, I return it. If no entry exists, I create a new Time object, and stuff it into the times map at the key I created.
That’s it. Implementing flyweight can be this straightforward. The chief downsides are:
- Instantiation of all Time objects must be done through the factory. If not, you lose the benefit of flyweight.
- Using the factory requires the overhead of an additional method call and a hash table lookup. This rarely presents a performance problem in Java.
So, was it worth it? I hacked together a quick application class to see whether the flyweight was paying off:
import java.util.*; public class App extends Thread { private static Random random = new Random(); public static void main(String[] args) { new App().start(); } public void run() { int size = 1000000; List<Appointment> appointments = new ArrayList<Appointment>(size); for (int i = 0; i < size; i++) { appointments.add(createRandomAppt()); } System.out.println("created " + appointments.size()); while (true) { Thread.yield(); } } private Appointment createRandomAppt() { Time start = createRandomTime(); Time stop = createRandomTime(); // kind of bogus return new Appointment(start, stop); } private Time createRandomTime() { byte hour = (byte)random.nextInt(24); byte quartile = (byte)random.nextInt(4); byte minute = 0; switch (quartile) { case 0: minute = 0; break; case 1: minute = 15; break; case 2: minute = 30; break; case 3: minute = 45; break; } //return new Time(hour, minute); return TimeFactory.create(hour, minute); } }
Ignoring the fact that an appointment stop time probably should be some time later than its start time, the application puts together a number of appointments with random stop and start times. The times are constrained to quarter-hour marks. This is good enough for a quick test; in reality, appointment times during a typical business day are likely to be even more constrained.
The last two lines in createRandomTime control how the Time instances get created. For the first run, I uncommented the line that creates a Time directly via its constructor:
return new Time(hour, minute);
and commented out the line that uses the factory.
I ran jconsole (which ships as part of Java 5 and 6) to monitor the application. Heap profiling indicated that the application required between 52 and 56 million bytes.
Finally, I switched how Time objects got created, uncommenting the line that uses the factory:
return TimeFactory.create(hour, minute);
The second run of jconsole showed that the application now took between 20 and 22 million bytes. In this case, using the flyweight pattern has saved me over 30 million bytes!
As with any memory or performance optimization, profiling and tests are essential. It’s easy to optimize the wrong thing, or even to introduce defects during optimization. Used properly, patterns such as flyweight can quickly provide your application with significant improvement in memory usage.
About the Author
Jeff Langr is a veteran software developer with a score and more years of experience. He’s authored two books and dozens of published articles on software development, including Agile Java: Crafting Code With Test-Driven Development (Prentice Hall) in 2005. You can find out more about Jeff at his site, http://langrsoft.com, or you can contact him directly at jeff@langrsoft.com.