Eliminate Consecutive Full GCs in Java
By Ram Lakshmanan
Full GC is an important phase in the garbage collection process. During a full GC, garbage is collected from all the regions of JVM memory (Young, Old, and Perm or, since Java 8, Metaspace). During this phase, the JVM is paused. No customer transactions are processed, and the JVM uses all the available CPU cycles to perform garbage collection, so CPU consumption is quite high. In general, then, full GCs aren't desirable, and consecutive full GCs are even less so. Consecutive full GCs cause the following problems:
- CPU consumption will spike up.
- Because the JVM is paused, the response time of your application transactions will degrade. Thus, it will have an impact on your SLAs and cause poor customer experience.
In this article, let's learn about consecutive full GCs: what they are, what causes them, and how to fix them.
What Are Consecutive Full GCs in Java?
It's always easier to explain through an example. So, let's examine the GC log of a real-world application that suffered from this consecutive full GC problem. The following graphs were generated by a garbage collection log analysis tool. Notice the highlighted portion of the first graph (see Figure 1). You can see the full GCs running consecutively (red triangles in the graph indicate a full GC). In any application's life cycle, both young GCs and full GCs run. In healthy applications, young GCs run most of the time and full GCs run only occasionally. If full GCs run consecutively, it's indicative of a problem.
Even though full GCs were running consecutively, the system wasn't able to reclaim enough memory to continue. You can observe this in the second graph, which shows the reclaimed bytes. In this graph (see Figure 2), you can see that the memory reclaimed by these full GCs is very small. This is because most of the objects in memory are in active use, so the JVM isn't able to reclaim enough memory.
Figure 1: Consecutive full GCs
Figure 2: Poor reclamation of memory despite full GC
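The pattern in the two figures can also be checked programmatically. The following is a minimal sketch, assuming the GC log has already been parsed into a list of events; the GcEvent class and both method names are hypothetical and are not taken from any GC log analysis tool. It reports the longest run of back-to-back full GCs and the average memory those full GCs reclaimed.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: GcEvent and these method names are hypothetical,
// not part of any GC log analysis tool.
final class ConsecutiveFullGcCheck {

    /** One already-parsed GC event. */
    static final class GcEvent {
        final boolean fullGc;      // true for a full GC, false for a young GC
        final long reclaimedBytes; // bytes freed by this collection

        GcEvent(boolean fullGc, long reclaimedBytes) {
            this.fullGc = fullGc;
            this.reclaimedBytes = reclaimedBytes;
        }
    }

    /** Length of the longest run of back-to-back full GCs. */
    static int longestFullGcRun(List<GcEvent> events) {
        int longest = 0;
        int current = 0;
        for (GcEvent event : events) {
            current = event.fullGc ? current + 1 : 0; // a young GC resets the run
            longest = Math.max(longest, current);
        }
        return longest;
    }

    /** Average bytes reclaimed per full GC, or 0 if there were no full GCs. */
    static long averageFullGcReclaim(List<GcEvent> events) {
        long total = 0;
        int count = 0;
        for (GcEvent event : events) {
            if (event.fullGc) {
                total += event.reclaimedBytes;
                count++;
            }
        }
        return count == 0 ? 0 : total / count;
    }

    public static void main(String[] args) {
        // Made-up sample data: one young GC followed by three full GCs.
        List<GcEvent> events = Arrays.asList(
            new GcEvent(false, 400_000_000L),
            new GcEvent(true, 20_000_000L),
            new GcEvent(true, 15_000_000L),
            new GcEvent(true, 10_000_000L));
        System.out.println("Longest full GC run: " + longestFullGcRun(events));
        System.out.println("Average bytes reclaimed per full GC: "
            + averageFullGcReclaim(events));
    }
}
```

A long run combined with a small average reclaim matches the symptom described above. What counts as "long" or "small" depends on the application, so any threshold would be a judgment call.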
What Causes Consecutive Full GCs?
Consecutive full GCs have one single cause: under-allocation of the JVM heap. They indicate that the application needs more memory than you have allocated. In other words, you are trying to fit a truckload of objects into a small, compact car. So, the JVM has to work harder to clean the garbage objects out of the small, compact car to make room for the actively used objects.
Now, you might have a question: My application was running fine all along. Why, all of a sudden, do I see this consecutive full GC problem? That's a valid question. The answer to this question could be one of the following:
- Your application's traffic volume has started to grow since the last time you adjusted the JVM heap size. Maybe your business is improving, and more users have started to use your application.
- During peak traffic periods, more objects are created than at normal times. Maybe you didn't tune your JVM for peak traffic volume, or your peak traffic volume has surged since the last time you tuned the JVM heap size.
How to Solve Consecutive Full GCs
Consecutive full GCs can be solved through two approaches:
1. Increase JVM Heap Size
Because consecutive full GCs are caused by a lack of memory, increasing the JVM heap size should solve the problem. Suppose you have allocated a JVM heap size of 2.5 GB; try increasing it to 3 GB and see whether that resolves the problem. The maximum JVM heap size can be increased by passing the "-Xmx" argument. Example: -Xmx3g
This argument will set the maximum JVM heap size to 3 GB. If it still doesn't resolve the problem, keep increasing the JVM heap size step by step. Over-allocating the JVM heap is not good either; it might increase the GC pause time.
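To confirm that a new heap setting has actually taken effect, the running JVM can be asked for its maximum heap size. The following is a minimal sketch; the class name HeapSizeCheck and the helper toMiB are made up for illustration, while Runtime.maxMemory() is the standard Java API. Depending on the garbage collector in use, the reported value can be somewhat smaller than the configured "-Xmx" value.

```java
// Illustrative sketch: HeapSizeCheck and toMiB are hypothetical names.
final class HeapSizeCheck {

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // maxMemory() returns the maximum amount of memory the JVM will attempt to use.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap (MiB): " + toMiB(maxHeapBytes));
    }
}
```

Running this class with and without the "-Xmx" argument shows whether the setting reached the JVM, which is worth checking when startup scripts or containers override command-line options.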
2. Add More JVM Instances
Adding more JVM instances is another solution to this problem. When you add more JVM instances, the traffic volume is distributed among them, so the amount of traffic handled by any single JVM instance goes down. If less traffic is sent to a JVM instance, fewer objects are created in it. If fewer objects are created, you are less likely to run into the consecutive full GC problem.
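The arithmetic behind this approach can be sketched as follows. This is a rough estimate, not a sizing method: it assumes that traffic is spread evenly across instances and that every request allocates about the same amount of memory, and all the names and numbers in it are hypothetical.

```java
// Illustrative sketch: all names and figures here are hypothetical.
final class InstanceLoadEstimate {

    /** Requests per second reaching each instance, assuming an even split. */
    static double requestsPerInstance(double totalRequestsPerSecond, int instanceCount) {
        if (instanceCount <= 0) {
            throw new IllegalArgumentException("instanceCount must be positive");
        }
        return totalRequestsPerSecond / instanceCount;
    }

    /** Rough allocation rate, assuming a fixed number of bytes allocated per request. */
    static double allocationBytesPerSecond(double requestsPerSecond, long bytesPerRequest) {
        return requestsPerSecond * bytesPerRequest;
    }

    public static void main(String[] args) {
        double totalRequestsPerSecond = 1200.0; // made-up peak traffic
        long bytesPerRequest = 512L * 1024L;    // made-up allocation per request
        for (int instances = 1; instances <= 3; instances++) {
            double perInstance = requestsPerInstance(totalRequestsPerSecond, instances);
            System.out.println(instances + " instance(s): " + perInstance
                + " requests/s each, about "
                + allocationBytesPerSecond(perInstance, bytesPerRequest)
                + " bytes allocated per second each");
        }
    }
}
```

Under these assumptions, tripling the number of instances cuts the allocation rate per JVM to a third. Real traffic is rarely this even, so treat the result as a starting point.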
Validating the Fix
Irrespective of the approach you take to resolve the problem, validate the fix in a test environment before rolling out the change to production. Any change to JVM heap settings should be thoroughly tested and validated. To validate that the problem doesn't resurface with the new settings, study the new GC logs with your GC log analysis tool. A good tool can spot and report whether the application is suffering from a consecutive full GC problem.
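Alongside the GC log, the JVM exposes collection counts through the standard java.lang.management API, which allows a quick sanity check during a test run. The following is a minimal sketch; the class name GcCountSnapshot and the method totalCollectionCount are made up for illustration. Collector names differ from one garbage collector to another, so deciding which entry corresponds to full GCs depends on your JVM configuration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: GcCountSnapshot and totalCollectionCount are hypothetical names.
final class GcCountSnapshot {

    /** Sum of the collection counts reported by all collectors in this JVM. */
    static long totalCollectionCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Print one line per collector: name, number of collections, accumulated time.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, timeMs=%d%n",
                gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("Total collections so far: " + totalCollectionCount());
    }
}
```

Taking a snapshot before and after a load test and comparing the old-generation collector's count gives a rough indication of whether full GCs are still frequent. It complements the GC log analysis and does not replace it.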
About the Author
Every single day, millions of people in North America use the banking, travel, and commerce applications that Ram Lakshmanan has architected. Ram is an acclaimed speaker at major conferences on scalability, availability, and performance topics. He recently founded a startup that specializes in troubleshooting performance problems.