"The automatic garbage collector of the JVM makes life much simpler for programmers by removing the need to explicitly de-allocate objects."
One of Java’s coolest features is automatic reclamation of memory space, a technique known as garbage collection.
Languages like C++ force developers to manually allocate and de-allocate memory space for objects, creating extra work for developers and allowing the possibility of leaks. Memory leaks, for those who haven’t experienced them, occur when the memory for an object is never de-allocated by the programmer, leading to the gradual depletion of memory resources. Java takes all the pain out of memory management by automatically reclaiming memory. Understanding how automatic garbage collection works is important, as programmers can influence when garbage collection occurs and which objects are destroyed. Without a clear comprehension of garbage collection, your software may not be running at peak performance and may be consuming more memory than is needed.
Memory Management with the Heap
To understand how garbage collection works, you need to know a little about how the Java Virtual Machine (JVM) handles memory allocation. All objects, including arrays of primitive data types, are stored in the heap, a shared region of memory that all JVM threads have access to. When the JVM first starts, memory is allocated for the heap, and this memory may be contracted or expanded as required [1]. Whenever a new object is created, a portion of the heap is allocated for its storage.
Depending on the implementation, a JVM may give the user control over how much heap memory is allocated initially, through the use of command line parameters. Inevitably, however, memory will run short if objects are frequently allocated. Rather than forcing the programmer to decide which object’s memory storage must be freed, and when, the JVM takes the choice away from us. Beyond creating objects with new, there is no way to explicitly allocate or de-allocate memory, nor are there any pointers (direct memory references). This is in stark contrast to C++, which maintains the use of pointers as a hangover from the C language. Instead, the JVM offers automatic garbage collection.
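To see what your own JVM has set aside, here is a minimal sketch that reads the heap figures through java.lang.Runtime. The class name HeapReport is my own invention for illustration. On HotSpot-based JVMs the initial and maximum heap sizes are commonly set with the -Xms and -Xmx options, but treat that as an assumption about your particular JVM and check its documentation.

```java
// A minimal sketch: report the heap figures the running JVM exposes.
// HeapReport is a hypothetical name used only for this illustration.
class HeapReport {

    /** Returns a one-line summary of heap usage, in kilobytes. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of that reserved heap
        long max = rt.maxMemory();     // limit the heap is allowed to grow to
        return "used=" + (total - free) / 1024 + "K"
                + " total=" + total / 1024 + "K"
                + " max=" + max / 1024 + "K";
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Running it as, say, java -Xms64m -Xmx256m HeapReport (assuming your JVM accepts those options) should show the total and max figures moving with the settings.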
Automatic Garbage Collection
You’ll probably be wondering how the JVM knows when to junk an object, and when not to. Fear not — the JVM will never de-allocate memory for an object that you’re using. The way it decides which objects are no longer needed (and hence are garbage) is simple:
Any object which is not referenced by an active thread may be safely de-allocated.
Let’s consider that statement very carefully, for its meaning is crucial to understanding how garbage collection works. If an active thread does not reference an object (either directly, or indirectly through an object that references another object, and so on), then it is fair game for the garbage collector. For an object to be useful, we must have a reference to it so that we can access member variables or invoke methods. If that reference is destroyed, and no other object has a reference to it, the object immediately becomes lost and so we can never access it. Anything we can never access is no longer needed, and the garbage collector can safely reclaim its memory. Note, too, that even if the current thread (for example, the primary thread running the main method) holds no reference to an object, the object remains protected for as long as any other active thread references it.
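To watch the rule in action, here is a minimal sketch that uses a java.lang.ref.WeakReference as a probe. A weak reference does not keep its target alive, so it can tell us whether the collector has reclaimed the object. The class and method names are my own, and because System.gc() is only a request, the second method may well return false on some JVMs.

```java
import java.lang.ref.WeakReference;

// A minimal sketch of reachability; all names here are hypothetical.
class ReachabilityDemo {

    // A static field keeps its target reachable while the class is loaded.
    static Object held;

    /** An object that is still strongly referenced is never reclaimed. */
    static boolean survivesWhileReferenced() {
        held = new Object();
        WeakReference<Object> probe = new WeakReference<>(held);
        System.gc(); // a request only; either way the object must survive
        return probe.get() != null;
    }

    /** Once the last strong reference is gone, the object may be reclaimed. */
    static boolean collectedAfterRelease() {
        held = new Object();
        WeakReference<Object> probe = new WeakReference<>(held);
        held = null; // the object is now unreachable from any active thread
        for (int i = 0; i < 10 && probe.get() != null; i++) {
            System.gc(); // ask again; the JVM decides when (and if) to act
        }
        return probe.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("Survives while referenced: " + survivesWhileReferenced());
        System.out.println("Collected after release:   " + collectedAfterRelease());
    }
}
```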
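This logic is fairly simple to follow and sounds like common sense. Unfortunately, a lot of myths and misconceptions surround automatic garbage collection. As soon as you assign the null value to the last remaining reference to an object, that object becomes eligible for collection. Assigning null to an array reference, for example, releases the array: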
Object[] arr = new Object[100]; // Some array of objects, arr
arr = null; // Now the array can be reclaimed when needed by the gc
Let’s examine some common misconceptions to help illustrate this concept.
Misconception Number One: A Reference to Oneself
One big misconception is that an object that maintains a reference to itself will protect itself from being junked by the garbage collector. Some programmers place a reference to the object as a member variable, hoping that it will count. Let me assure you that it does not. Remember that an object already has an implicit reference to itself via the this keyword, so an explicit self-reference adds nothing.
Misconception Number Two: Cyclical References
The second biggest misconception concerns cyclical references, whereby one object links to another, which links back again. Sometimes these cyclical relationships may be referenced by other objects, and thus protected, which reinforces the misconception. Take the object references shown in Figure 2, for example. If there are no other references to Obj1, Obj2, or Obj3, then all three will be reclaimed, cycle or not. The same is true for data structures like vectors, hash tables, and collections: if you don’t maintain a reference to them, their contents are gone (unless something else still references those contents). This is a handy tip to remember. I’ve seen people manually assign all references in a collection to null, when assigning null to the collection reference itself would have done the job.
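If you would like to test the claim about cycles yourself, the following sketch builds a ring of objects, drops every outside reference to it, and then asks the collector to run. The names CycleDemo, Node, and ring are hypothetical, and a ring of size one is simply the self-reference case from the first misconception. As always, System.gc() is only a request, so the result is not guaranteed on every JVM.

```java
import java.lang.ref.WeakReference;

// A minimal sketch: a cycle of objects does not protect itself.
class CycleDemo {

    static class Node {
        Node next;
    }

    /** Builds a ring of the given size (at least 1) and returns one of its nodes. */
    static Node ring(int size) {
        Node head = new Node();
        Node tail = head;
        for (int i = 1; i < size; i++) {
            tail.next = new Node();
            tail = tail.next;
        }
        tail.next = head; // close the cycle
        return head;
    }

    /** Returns true if the JVM reclaimed the ring once nothing outside referenced it. */
    static boolean ringCollected(int size) {
        // Only the weak probe points into the ring, and it does not count.
        WeakReference<Node> probe = new WeakReference<>(ring(size));
        for (int i = 0; i < 10 && probe.get() != null; i++) {
            System.gc();
        }
        return probe.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("Ring of 3 collected: " + ringCollected(3));
        System.out.println("Self-reference collected: " + ringCollected(1));
    }
}
```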
Garbage Collection Timing
You should be aware that objects are not immediately destroyed once they are no longer referenced; the thread responsible for garbage collection can reclaim memory at any time. This time, of course, may not always be opportune: a time-sensitive calculation or GUI update might be trying to complete when the garbage collector kicks in. The garbage collector thread normally runs as a very low priority thread, but once started, it might not be suspended until its task is complete. At the wrong time, garbage collection could introduce a performance degradation.
Maximizing Performance Through Garbage Collection
There are two ways to improve system performance through garbage collection. The first is to explicitly invoke the garbage collector during idle periods, or before creating a large number of objects. The java.lang.System class provides a gc() method for this purpose, although a call is only a request that the JVM is free to defer:
// Invoke the garbage collector (may cause a delay)
System.gc();
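To get a rough feel for what such a call achieves, here is a small sketch that compares free heap before and after the request. IdleGc and requestCollection are names I have made up for illustration. The figure it reports is only indicative, since the JVM may ignore the request or resize the heap in the meantime.

```java
// A minimal sketch: request a collection at a quiet moment and see what changed.
class IdleGc {

    /** Requests a collection; returns the change in free heap bytes (may be negative). */
    static long requestCollection() {
        Runtime rt = Runtime.getRuntime();
        long before = rt.freeMemory();
        System.gc(); // may cause a delay, and the JVM is free to ignore it
        return rt.freeMemory() - before;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage first, then collect before the "real" work.
        for (int i = 0; i < 100_000; i++) {
            byte[] scratch = new byte[128];
        }
        long delta = requestCollection();
        System.out.println("Free heap changed by " + delta + " bytes");
    }
}
```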
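The second way to maximize performance is to always assign null to references you no longer need, so that the objects they refer to become eligible for collection.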
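One place where a forgotten reference commonly lingers is inside a container that manages its own storage. The sketch below is a bare-bones stack, written for illustration only (SimpleStack is a hypothetical class, not part of the Java library), showing where the null assignment belongs.

```java
import java.util.Arrays;

// A minimal sketch: clearing a stale reference so the collector can do its job.
class SimpleStack {
    private Object[] items = new Object[16];
    private int size = 0;

    void push(Object item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow when full
        }
        items[size++] = item;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object top = items[--size];
        items[size] = null; // without this, the array would still reference the popped object
        return top;
    }

    int size() {
        return size;
    }
}
```

Without the null assignment in pop(), every popped object would remain reachable through the backing array until its slot was overwritten, even though the stack would never hand it out again.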
Summary
The automatic garbage collector of the JVM makes life much simpler for programmers by removing the need to explicitly de-allocate objects. However, this does not let us off the hook entirely: we must still remember to assign null to references we no longer need.
References
[1] The Java Virtual Machine Specification (Second Edition), Lindholm & Yellin, Addison-Wesley, 1999.
About the Author
David Reilly is a software engineer and freelance technical writer living in Australia. A Sun Certified Java 1.1 Programmer, his research interests include the Java programming language, networking & distributed systems, and software agents. He can be reached via e-mail at java@davidreilly.com or through his personal site.