By Jevgeni Kabanov, ZeroTurnaround Founder/CEO
Classloaders are at the core of the Java language. Java EE containers, OSGi, various Web frameworks and other tools all rely on these complicated mechanisms. By understanding the inner workings of classloaders, a Java developer can avoid common programming pitfalls and increase his or her productivity.
How Java Classloaders Work
In negotiating the complex world of classloader mechanics, it’s important for Java developers to realize that each classloader is itself an object: an instance of a class that extends java.lang.ClassLoader. Every class is loaded by one of those instances, and developers can subclass java.lang.ClassLoader to extend the manner in which the JVM loads classes.
If every classloader is an instance of a class, and every class is loaded by a classloader, then which comes first? When instantiating a classloader, a parent classloader can be specified as a constructor argument. If the parent classloader isn’t specified explicitly, the JVM’s system classloader is assigned as the default parent. To understand the mechanics of a classloader more clearly, let’s look at the JVM classloader hierarchy more closely.
The classloader hierarchy isn’t an inheritance hierarchy, but a delegation hierarchy. A classloader loads only those classes that its parent cannot provide; classes loaded by a classloader higher in the hierarchy cannot refer to classes available only lower in the hierarchy. The three classloaders are:
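As a minimal sketch of the parent relationship described above, the following Java snippet creates one classloader with the default parent and one with an explicit parent. The names `ParentDemo` and `PlainLoader` are made up for illustration; only `java.lang.ClassLoader` and its methods come from the JDK.

```java
final class ParentDemo {

    /** A do-nothing subclass; ClassLoader is abstract, so some subclass is required. */
    static final class PlainLoader extends ClassLoader {
        PlainLoader() {
            super(); // no parent given: the system classloader becomes the parent
        }

        PlainLoader(ClassLoader parent) {
            super(parent); // parent passed explicitly as a constructor argument
        }
    }

    static boolean defaultParentIsSystemLoader() {
        return new PlainLoader().getParent() == ClassLoader.getSystemClassLoader();
    }

    static boolean explicitParentIsKept() {
        ClassLoader parent = new PlainLoader();
        return new PlainLoader(parent).getParent() == parent;
    }

    public static void main(String[] args) {
        System.out.println("default parent is system loader: " + defaultParentIsSystemLoader());
        System.out.println("explicit parent is kept: " + explicitParentIsKept());
        // A classloader is an object too, so its own class was loaded by some classloader.
        System.out.println("PlainLoader was loaded by: " + PlainLoader.class.getClassLoader());
    }
}
```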
- Bootstrap classloader: Whenever a new JVM is started, the bootstrap classloader is the first to run, loading key Java classes and other runtime classes into memory. It is the root of the hierarchy, an ancestor of all other classloaders, and the only one without a parent.
- Extension classloader: The extension classloader has the bootstrap classloader as its parent.
- System classpath classloader: From a developer’s perspective, this is the most important classloader.
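The hierarchy listed above can be inspected at runtime by following `getParent()` from the system classloader. This is only a sketch: `LoaderChain` is an illustrative name, and the exact loader classes printed depend on the JVM version (on Java 9 and later, for example, the extension classloader has been replaced by a platform classloader). The Java API represents the bootstrap classloader as `null`.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {

    /** Returns the delegation chain from the given loader up to the bootstrap loader. */
    static List<String> chainFrom(ClassLoader start) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        // getParent() returns null once the bootstrap classloader is reached.
        chain.add("bootstrap");
        return chain;
    }

    public static void main(String[] args) {
        chainFrom(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
        // Core classes such as String are loaded by the bootstrap classloader.
        System.out.println("String loader: " + String.class.getClassLoader());
    }
}
```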
Java EE Delegation Model
An application container’s classloader hierarchy is typically layered: the container itself has a classloader, every EAR module has its own classloader and every WAR has its own classloader as well.
This delegation model can cause some interesting classloading problems. Typically these are wrong class problems, which developers encounter when they assume a Java EE application uses one version of a class when it actually uses another, or when the class is loaded in a different way than required. Wrong class problems include:
- NoClassDefFoundError: This error is one of the most common problems faced when developing Java EE applications. It occurs when the JVM or a classloader instance tries to load the definition of a class and no definition can be found. Because it surfaces only at runtime, the IDE cannot help here.
- NoSuchMethodError: This error occurs when the class being referred to exists but an incorrect version of it is loaded, and the required method is subsequently not found.
- Other wrong class related problems include ClassCastException, LinkageError and IllegalAccessError.
Developers must understand the nature of these errors in order to effectively solve the problems.
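When chasing a wrong class problem, a common first step is to ask the JVM which classloader defined a class and where its bytes came from. The helper below is one possible sketch; `ClassOrigin` and `describe` are illustrative names, and the location is reported as unknown whenever the JVM supplies no code source (as is usual for bootstrap classes).

```java
import java.security.CodeSource;

final class ClassOrigin {

    /** Describes which classloader defined the class and, when known, from which location. */
    static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // A null loader means the class was defined by the bootstrap classloader.
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown location"
                : source.getLocation().toString();
        return type.getName() + " <- " + loaderName + " @ " + location;
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(ClassOrigin.class));
    }
}
```

Calling `describe` on the class named in a NoSuchMethodError or ClassCastException usually shows whether the unexpected version came from the container, the EAR or the WAR.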
Dynamic classloaders, used in Web applications and module systems, often cause headaches for developers. Dynamic classloading can improve the experience.
When discussing dynamic classloading, it’s important to first understand the relationship between classes and objects. All Java code is associated with methods contained in classes. The class is loaded into memory with all its methods and receives a unique identity. Every object created gets a reference to this identity, and when a method is called on an object, the JVM consults the class reference and calls the method of that particular class. Every class object is in turn associated with its classloader. The main role of the classloader is to define a class scope: where the class is visible and where it isn’t. This scoping allows classes with the same name to coexist as long as they’re loaded by different classloaders, and also allows a newer version of a class to be loaded in a different classloader.
The main issue with code reloading in Java is that although you can load a new version of a class, it will get a completely different identity and the existing objects will still refer to the previous version of the class. Therefore, when a method is called on those objects, it will execute the old version of the method. Unfortunately, the Java API offers no way to update the class of an existing object or to reliably copy its state. Instead, developers resort to complicated workarounds.
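The identity problem described above can be reproduced in a few lines. The sketch below loads the same class bytes through two fresh classloaders to simulate a reload; `ReloadDemo`, `Greeter` and `ReloadingLoader` are hypothetical names, and the code assumes the compiled class file is readable as a resource through the parent classloader (true for a normal classpath launch on Java 9 or later).

```java
import java.io.IOException;
import java.io.InputStream;

final class ReloadDemo {

    /** Hypothetical application class that gets "reloaded". */
    public static class Greeter {
        public String greet() { return "hello"; }
    }

    /** Defines one named class itself (child-first); delegates every other name to the parent. */
    static final class ReloadingLoader extends ClassLoader {
        private final String target;

        ReloadingLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private byte[] readBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Greeter through a fresh loader, simulating one deployment of the class. */
    static Class<?> loadFresh() {
        String name = Greeter.class.getName();
        try {
            return new ReloadingLoader(name, ReloadDemo.class.getClassLoader()).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if an object created before the "reload" is still bound to the old class. */
    static boolean oldObjectKeepsOldClass() {
        try {
            Class<?> v1 = loadFresh();
            Object existing = v1.getDeclaredConstructor().newInstance();
            Class<?> v2 = loadFresh(); // the "reloaded" class
            return v1.getName().equals(v2.getName()) // same name
                    && v1 != v2                      // but a different identity
                    && existing.getClass() == v1     // the old object still points at the old class
                    && !v2.isInstance(existing);     // and is not an instance of the new one
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("old object keeps old class: " + oldObjectKeepsOldClass());
    }
}
```

Casting `existing` to the class from the second loader would fail with a ClassCastException even though both classes carry the same name, which is one way the wrong class problems listed earlier arise.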
In Java, memory leaks are inevitable. They usually occur because a collection refers to objects that should have been cleared but never were. Unfortunately, with the current state of the Java platform, classloader leaks are also inevitable, and they’re also costly, causing errors in production applications after just a few redeploys.
As we’ve explained, every object has a reference to its class, which in turn has a reference to its classloader. However, every classloader in turn has a reference to each of the classes it has loaded, each of which holds static fields defined in the class. This means:
- Leaking a classloader can be expensive: If a classloader is leaked, it will hold on to all its classes and all their static fields. Static fields commonly hold caches, singleton objects and various configuration and application states.
- Leaking a classloader is quite common: All it takes to leak a classloader is to leave a reference to any object created from a class loaded by that classloader. Even if that object seems completely harmless, it will still hold on to its classloader and the entire application state. A single place in the application that survives the redeploy and doesn’t clean up properly is enough to sprout the leak, and a typical application will have several such places, some of which are almost impossible to fix.
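The reference chain behind such a leak (object, then class, then classloader) can be observed with a weak reference. The following is a sketch under stated assumptions: `LoaderLeakDemo` and `SURVIVORS` are hypothetical names, and a JDK dynamic proxy stands in for "any object created from a class loaded by that classloader", since its class is defined by the loader passed to `Proxy.newProxyInstance`.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class LoaderLeakDemo {

    /** Hypothetical long-lived registry that survives a redeploy. */
    static final List<Object> SURVIVORS = new ArrayList<>();

    /** Simulates a deployment that leaves one object behind; returns a weak handle to its loader. */
    static WeakReference<ClassLoader> deployAndForget() {
        ClassLoader appLoader = new ClassLoader(LoaderLeakDemo.class.getClassLoader()) { };
        // The proxy class is defined by appLoader, so the object references
        // its class, and the class references appLoader.
        Object harmless = Proxy.newProxyInstance(
                appLoader, new Class<?>[] { Runnable.class }, (proxy, method, args) -> null);
        SURVIVORS.add(harmless);
        return new WeakReference<>(appLoader);
    }

    /** True if the forgotten object still keeps its classloader reachable after a GC request. */
    static boolean survivorPinsLoader() {
        WeakReference<ClassLoader> handle = deployAndForget();
        System.gc(); // only a hint, but a strongly reachable loader cannot be collected
        ClassLoader pinned = handle.get();
        Object survivor = SURVIVORS.get(SURVIVORS.size() - 1);
        return pinned != null && survivor.getClass().getClassLoader() == pinned;
    }

    public static void main(String[] args) {
        System.out.println("survivor pins its classloader: " + survivorPinsLoader());
    }
}
```

Only after every such reference is cleared does the loader become eligible for collection, and even then the timing is up to the garbage collector.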
Dynamic Classloading in Java EE Web Applications
In order for a Java EE Web application to run, it has to be packaged into an archive with a .WAR extension and deployed to a servlet container like Tomcat. Servers and frameworks use dynamic classloaders to speed up the development cycle.
Before you can use dynamic classloaders, you must first create them. When deploying your application, the server will create one classloader for each application. In the servlet container (Tomcat, for example) each .WAR application is managed by an instance of the StandardContext class, which creates a new instance of the WebappClassLoader class to load the Web application classes.
Having a hierarchy of classloaders is not always enough; modern classloading schemes, including OSGi, take the approach of having one classloader per module. All classloaders may refer to each other and share a central repository, and each module declares the packages it imports and exports. Given a package name, the common repository is able to find the relevant classloaders.
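The one-classloader-per-application idea can be sketched with the JDK’s `URLClassLoader`. This is not Tomcat’s implementation: `MiniContainer`, `deploy` and the paths used below are hypothetical, and the sketch only shows that each deployment gets a fresh loader whose parent is the container’s loader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

final class MiniContainer {
    private final ClassLoader containerLoader = MiniContainer.class.getClassLoader();
    private final Map<String, URLClassLoader> appLoaders = new HashMap<>();

    /** Creates a fresh classloader for the application, replacing any previous one. */
    URLClassLoader deploy(String appName, Path classes) {
        try {
            URL[] urls = { classes.toUri().toURL() };
            URLClassLoader fresh = new URLClassLoader(urls, containerLoader);
            URLClassLoader previous = appLoaders.put(appName, fresh);
            if (previous != null) {
                // Releases resources held by the old loader; classes that are
                // still referenced elsewhere are not unloaded by this call.
                previous.close();
            }
            return fresh;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MiniContainer container = new MiniContainer();
        // Hypothetical locations; nothing is loaded from them in this sketch.
        ClassLoader shop = container.deploy("shop", Paths.get("webapps/shop/WEB-INF/classes"));
        ClassLoader blog = container.deploy("blog", Paths.get("webapps/blog/WEB-INF/classes"));
        ClassLoader shopAgain = container.deploy("shop", Paths.get("webapps/shop/WEB-INF/classes"));
        System.out.println("separate loaders per app: " + (shop != blog));
        System.out.println("redeploy creates a new loader: " + (shop != shopAgain));
    }
}
```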
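A heavily simplified model of this import/export scheme is sketched below. It is not the OSGi API: `ModuleDemo`, `PackageRegistry` and `ModuleLoader` are invented for illustration, and the system classloader "exporting" `java.util` is only a stand-in for a real module so that the example runs without extra class files.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ModuleDemo {

    /** Central repository: maps an exported package name to the classloader that owns it. */
    static final class PackageRegistry {
        private final Map<String, ClassLoader> exporters = new ConcurrentHashMap<>();

        void export(String packageName, ClassLoader owner) {
            exporters.put(packageName, owner);
        }

        Optional<ClassLoader> exporterOf(String packageName) {
            return Optional.ofNullable(exporters.get(packageName));
        }
    }

    /** One loader per module; imported packages are resolved through the registry. */
    static final class ModuleLoader extends ClassLoader {
        private final PackageRegistry registry;
        private final Set<String> importedPackages;

        ModuleLoader(ClassLoader parent, PackageRegistry registry, Set<String> importedPackages) {
            super(parent);
            this.registry = registry;
            this.importedPackages = importedPackages;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            int lastDot = name.lastIndexOf('.');
            String pkg = (lastDot < 0) ? "" : name.substring(0, lastDot);
            if (importedPackages.contains(pkg)) {
                // Imported package: ask the registry which classloader exports it.
                ClassLoader exporter = registry.exporterOf(pkg).orElseThrow(
                        () -> new ClassNotFoundException(name + " (no exporter for " + pkg + ")"));
                return exporter.loadClass(name);
            }
            return super.loadClass(name, resolve);
        }
    }

    static boolean canLoad(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        PackageRegistry registry = new PackageRegistry();
        // Stand-in exporter: the system classloader "exports" java.util for this demo.
        registry.export("java.util", ClassLoader.getSystemClassLoader());
        ModuleLoader module = new ModuleLoader(ClassLoader.getSystemClassLoader(), registry,
                Set.of("java.util", "com.example.missing"));
        System.out.println("exported import resolves: " + canLoad(module, "java.util.List"));
        System.out.println("unexported import resolves: " + canLoad(module, "com.example.missing.Thing"));
    }
}
```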
Troubleshooting issues with modern classloaders is similar to troubleshooting issues with a classloader hierarchy, except a developer must think in terms of export/import as well as the classpath.
Here are the problems with the modern approach:
- Import is a one-way street: if you want to use Hibernate, you import it but it cannot access your classes.
- Leaking is perhaps even more problematic: the more classloaders, the more references between them, the easier leaking becomes.
- Deadlocks may happen as the JVM enforces a global lock on
How Can We Fix Classloader Leaks?
Isolating modules or applications through classloaders is an illusion: leaks can and will happen. The natural abstraction for isolation is a process — this is understood widely outside of Java: .NET, dynamic languages and even browsers use processes for isolation.
A separate process for each application is the only approach to isolation that supports leakless updates.
The best possible solution for Java EE production applications is to run one Web application per application container and to use rolling restarts with session draining, so that the new version starts without any side effects from the previous one. This Continuous Delivery approach is found in tools that automatically run no-downtime rolling restarts for Web applications.