Memory Leaks in Java Programs

In nearly every Java book available today, usually within the first chapter, you'll find the claim that Java solves the dreaded problem of memory leaks through automatic garbage collection. A program with a memory leak takes memory from the operating system without ever giving it back or reusing it, so it consumes more and more memory over time.

Traditional languages such as C and C++ require programmers to explicitly release data structures that are allocated in heap memory, using constructs such as "free" or "delete." When programmers fail to call "free" or "delete," the memory is never reclaimed, which is the prime cause of memory leaks.

The idea of automatic garbage collection is that a utility thread or process running within the program will follow the program along and clean up all its unused memory, giving it back to the operating system or program to be reused. The developer then does not have to indicate explicitly when objects should be removed from memory—thus solving the main cause of memory leaks. Because Java uses automatic garbage collection, many think that Java programs are free from the possibility of memory leaks. Unfortunately, this is not necessarily the case. Although automatic garbage collection solves the main cause of memory leaks, you can still have memory leaks in a Java program. My focus is to identify the causes of these memory leaks and give some ways to avoid them or solve them when they do occur.

The three typical causes of memory leaks in a Java program are:

  1. Unknown or unwanted object references.
  2. Failure to clean up or free native system resources.
  3. Bugs in the Java Development Kit (JDK) or third-party libraries.

Before we can discuss these causes of memory leaks individually, we need to review how garbage collection is done in Java.

Garbage Collection in Java
The Java automatic garbage collection process typically operates in the form of a low-priority thread that constantly searches memory for "unreachable" objects. An object is unreachable when no live thread can get to it by following a chain of references.

This explanation may be a little confusing, so let's look at an example. In Listing 1, our program is simple; it instantiates a Customer object and then calls the getBalance() method on it.
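
A minimal sketch of such a program follows. Only Customer, getBalance(), fred, and balanceUtility come from the text; the BalanceUtility class, its computeBalance() method, and the placeholder calculation are assumptions made for illustration.

```java
// Hypothetical helper class; its name and behavior are assumptions.
class BalanceUtility {
    double computeBalance(String customerName) {
        // Placeholder calculation standing in for a real account lookup.
        return customerName.length() * 100.0;
    }
}

class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    double getBalance() {
        // The new BalanceUtility is referenced only by this local variable.
        BalanceUtility balanceUtility = new BalanceUtility();
        return balanceUtility.computeBalance(name);
        // After the method returns, nothing refers to the BalanceUtility
        // instance, so it is unreachable and eligible for collection.
    }
}

class BalanceSketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        System.out.println("Balance: " + fred.getBalance());
    }
}
```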

In this example, when we return from the getBalance() method, there are no longer any references to the balanceUtility instance. This object is now unreachable and eligible for garbage collection. One nice thing about the garbage collector is that it calls the finalize() method, if you have implemented it, just before recycling the memory used by a given object. So, by putting a System.out.println() statement in the finalize() method of your object, you can easily find out when and if your object is being garbage collected. Note, however, that being eligible for collection doesn't mean the balanceUtility instance will be collected immediately. In some cases, it may never be collected. Different Java Virtual Machines (VMs) use different algorithms to determine how to collect garbage most efficiently. For example, changes made to the standard Sun VM from version 1.1.x to version 2 made garbage collection less aggressive in general [1]. Another example is that in the Sun 1.1.x VMs, there was a system method (runFinalizersOnExit(true)) that could be called to guarantee that the VM would finalize all the eligible objects before exiting. Although this feature still exists in version 2, it has been deprecated [2].

Figure 1. The balanceUtility object is no longer being referenced and is eligible for garbage collection.

Figure 1 shows how the balanceUtility object is no longer being referenced and is eligible for garbage collection.

Class Garbage Collection and Unloading
The preceding section described how object-instance garbage collection is done, but there is another important area that is not discussed as often: class garbage collection and unloading. In our example, we had a class, Customer, with an instance of Customer called fred. The class Customer is also a heap-allocated object; however, there is only one instance of the class object per class loader. When you first try to instantiate a Customer instance, the Java VM must find the class using a tool called a class loader. You can write your own class loader in Java, but there is one built into the VM called the primordial class loader. Explaining exactly how class loaders work gets a bit complicated [3]. The important thing to understand is that class instances are different from regular instances in that special rules apply to their garbage collection. These rules are as follows:

  • If the primordial class loader loads class instances, they will stay in memory for the duration of your program. They are not eligible for garbage collection.

  • Classes loaded by a custom class loader are eligible for garbage collection and unloading, if the class loader that loads them is no longer referenced.
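
Which loader defined a class can be checked at run time. The sketch below assumes a VM on which Class.getClassLoader() returns null for classes defined by the built-in loader, which is the common convention but is left to the implementation; the ClassLoaderSketch class and its loaderOf() method are hypothetical names.

```java
class ClassLoaderSketch {
    // Describes which loader defined the given class.
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // Assumption: null stands for the built-in (primordial) loader.
        return loader == null ? "primordial" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // A core library class, defined by the built-in loader.
        System.out.println("String: " + loaderOf(String.class));
        // An application class, defined by some other loader.
        System.out.println("ClassLoaderSketch: " + loaderOf(ClassLoaderSketch.class));
    }
}
```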

The main reason for these special rules is that class instances can have static variables and native methods that need to maintain their state for the lifetime of the program, because the VM cannot determine that they may never be used again [4].

Singleton Pattern and Garbage Collection
Understanding the rules of garbage collection when you use static variables can be tricky. The basic rule of static variables is that once the variable is initialized with data (typically an object), the variable will stay in memory as long as the class that defines it stays in memory. Because the primordial class loader loads most classes, they will stay in memory for the life of the program, thus causing the static variables within them to stay in memory for the life of the program as well.

You might ask why garbage collecting static variables can be tricky. The reason will become clear as we introduce the Singleton pattern. One popular use of static variables is to implement the Singleton pattern. The Singleton pattern is an object-oriented design pattern that is used to ensure a class has only one instance and to provide a global point of access to that instance. For example, if we wanted to model a CustomerFactory class that produces Customer objects, we could use the Singleton pattern. In the CustomerFactory example, we define a static variable in the CustomerFactory class definition, which holds an instance of itself. We could then define a static method getCustomerFactory() that could be used to retrieve the customerFactory Singleton instance, thus providing a global point of access to it (see Figure 2).

Figure 2. getCustomerFactory() can be used to retrieve the customerFactory Singleton instance.

We could then get customers from the customerFactory Singleton instance by calling getCustomer(String name) to get a customer for a given name. Now let's say for performance reasons, we had the CustomerFactory hold onto a vector of customers to cache them to be reused. Each time we got a customer, we would check the cache. If the customer was not there, we could create it using some resource, such as a database or file, and add it to the cache.
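
A minimal sketch of such a factory follows. CustomerFactory, getCustomerFactory(), getCustomer(String name), and the vector used as a cache come from the text; the field names, the cachedCount() helper, and the way a Customer is created are assumptions.

```java
import java.util.Vector;

class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class CustomerFactory {
    // The static variable keeps the single instance reachable for as long
    // as the CustomerFactory class itself stays loaded.
    private static final CustomerFactory customerFactory = new CustomerFactory();

    // Cache of customers created so far; this sketch never trims it.
    private final Vector<Customer> cache = new Vector<>();

    private CustomerFactory() {
        // Private constructor: no other instance can be created.
    }

    static CustomerFactory getCustomerFactory() {
        return customerFactory;
    }

    synchronized Customer getCustomer(String name) {
        for (Customer cached : cache) {
            if (cached.getName().equals(name)) {
                return cached;
            }
        }
        // Stand-in for loading the customer from a database or file.
        Customer created = new Customer(name);
        cache.add(created);
        return created;
    }

    // Hypothetical helper for inspecting the cache size.
    synchronized int cachedCount() {
        return cache.size();
    }
}
```

Every Customer handed out by getCustomer() stays in the cache, so it remains reachable from the static variable even after the caller has dropped its own reference.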

Now take a step back and consider the memory implications of this example. We have a primordial class loader that references the CustomerFactory class object. The CustomerFactory class object has a static variable, which references an instance of a CustomerFactory. Finally, the instance of CustomerFactory references a vector of Customer objects. Figure 3 illustrates the situation. Because the first object in the reference graph, the primordial class loader, is not eligible for garbage collection, none of the objects in this entire chain can be collected.

Figure 3. All objects shown will be kept in memory for the lifetime of the Java program.

As you can see in Figure 3, we have a case in which all of the objects shown in the diagram will be kept in memory for the lifetime of the Java program. This scenario could be looked at as either good or bad depending on your design requirements. It's good in that these objects are kept in memory and can be reused quickly; so if speed is the goal, this scenario may be fine. It's bad for two reasons. First, memory is a precious resource and there is a finite amount of it, so objects that are never or seldom reused should be destroyed to free up memory. Second, developers may inadvertently reference objects from Singletons. These referenced objects would normally be garbage collected, but because Singletons stay in memory for as long as the program is running, they will keep these "other" objects in memory as well, causing a memory leak. This circumstance is known as unwanted or unknown references, which will be covered in detail later.

Unknown or Unwanted Object References
Now that we know a little bit about how objects are referenced and how they become unreachable, we can talk about unknown or unwanted references. The general idea of garbage collection is that when we are done with an object and want it to be removed from memory, we stop referencing it and expect it to go away. However, there are times when we think we have stopped referencing an object, but unknowingly we really do have another reference to it. Let's enhance the first simple example of the article to illustrate the problem. Our program now has a securityManager object that has the responsibility of maintaining security policies. The securityManager object is implemented as a Singleton. Our securityManager object needs to notify all customers immediately when a policy has changed, so it holds a reference to all customers in memory. The code might look like Listing 2.
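
A minimal sketch of what such code might look like follows. The SecurityManager Singleton, the vector of objects to notify, and the fred variable come from the text; the method names register(), policyChanged(), and registeredCount() are assumptions.

```java
import java.util.Vector;

// Named as in the text; within this file it shadows java.lang.SecurityManager.
class SecurityManager {
    private static final SecurityManager securityManager = new SecurityManager();

    // Strong references to every customer that must be notified.
    private final Vector<Customer> customersToNotify = new Vector<>();

    private SecurityManager() {
    }

    static SecurityManager getSecurityManager() {
        return securityManager;
    }

    void register(Customer customer) {
        customersToNotify.add(customer);
    }

    void policyChanged(String policy) {
        for (Customer customer : customersToNotify) {
            customer.policyChanged(policy);
        }
    }

    // Hypothetical helper for inspecting the list size.
    int registeredCount() {
        return customersToNotify.size();
    }
}

class Customer {
    private final String name;
    private String lastPolicy;

    Customer(String name) {
        this.name = name;
    }

    void policyChanged(String policy) {
        lastPolicy = policy;
    }

    String getLastPolicy() {
        return lastPolicy;
    }
}

class UnwantedReferenceSketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        SecurityManager.getSecurityManager().register(fred);
        // ... work with fred ...
        fred = null; // the Customer is still reachable through the Singleton
        System.out.println("Still registered: "
                + SecurityManager.getSecurityManager().registeredCount());
    }
}
```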

In this example, we set the fred variable to null, expecting the Customer object to be collected. However, because the securityManager object is holding onto it through its vector of objects to notify, it will not be collected. The SecurityManager class, which stays in memory because it was loaded by the primordial class loader, references the securityManager Singleton instance through a static variable. Although this example is fairly simple, we can quickly see how using the Singleton pattern can lead to unknown references quite easily. Because these types of references can occur through inherited or collaborating classes, we may not even know they are happening. Figure 4 shows us how the references would look.

Figure 4. How the references would look.

As you can see, even after removing the fred variable reference by assigning it to null (signified by the lightning bolt), it is still being held by the securityManager object through its vector of objects to notify. As stated before, because the securityManager object is held as a static variable by the SecurityManager class, it is kept in memory.

Solving Unknown or Unwanted References
Now that we know how unknown or unwanted references can occur, we can solve them by avoiding referencing objects from static variables or long-lasting objects; cleaning up the references when we are done with the object; or using weak references.

Avoidance means not getting into a situation where objects that need to be collected are referenced by Singletons or other long-lasting objects. Avoidance can be difficult and may require writing some "ugly" code. For example, if we wanted a customer to be notified when a security policy was changed, we wouldn't have many development choices. We could write code to poll the securityManager object from time to time, but that approach is not as clean as just letting the securityManager object tell you when a policy has changed.

The cleanup approach means that when the program is done with a given object, the program must make explicit calls to clean up the object itself. So in our example, we could implement a cleanUp() method on the Customer object that removes the object from the securityManager object's list, along with any other necessary cleanup. This solution is not ideal either because developers now need to remember to call cleanUp() when they are finished with an object.
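
The cleanup approach might be sketched as follows. The cleanUp() method on Customer comes from the text; the register(), unregister(), and registeredCount() methods on the Singleton are assumptions, and the try/finally block is one way to make the cleanup call harder to forget.

```java
import java.util.Vector;

// Named as in the text; within this file it shadows java.lang.SecurityManager.
class SecurityManager {
    private static final SecurityManager securityManager = new SecurityManager();
    private final Vector<Customer> customersToNotify = new Vector<>();

    private SecurityManager() {
    }

    static SecurityManager getSecurityManager() {
        return securityManager;
    }

    void register(Customer customer) {
        customersToNotify.add(customer);
    }

    void unregister(Customer customer) {
        customersToNotify.remove(customer);
    }

    int registeredCount() {
        return customersToNotify.size();
    }
}

class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Removes the references that would otherwise keep this object alive.
    void cleanUp() {
        SecurityManager.getSecurityManager().unregister(this);
    }
}

class CleanUpSketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        SecurityManager.getSecurityManager().register(fred);
        try {
            System.out.println("Working with " + fred.getName());
        } finally {
            fred.cleanUp(); // runs even if the work above throws
        }
        fred = null; // now nothing references the Customer
    }
}
```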

Neither of the previous two methods really solves the problem of unwanted references. They just work around it. What we really need is a way to reference an instance but not prevent it from being garbage collected. Fortunately, JDK 1.2 gives us a convenient way to do this by using weak references. A weak reference is a type of object reference that exists in such a way that it will not prevent the object from being garbage collected. The idea is that the garbage collector looks at the object to be collected and sees if it has any references; if it does, and they are only weak references, it recycles the object. In JDK 1.2, weak references are implemented through several classes contained in the java.lang.ref package. Let's take a look at how we could use one of these classes, WeakReference, in our example to fix our problem (see Listing 3).
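
A minimal sketch of the weak-reference version follows, assuming the same hypothetical register() and policyChanged() methods as before. The Singleton stores WeakReference wrappers instead of the customers themselves, and drops a wrapper once its customer has been collected.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class Customer {
    private final String name;
    private String lastPolicy;

    Customer(String name) {
        this.name = name;
    }

    void policyChanged(String policy) {
        lastPolicy = policy;
    }

    String getLastPolicy() {
        return lastPolicy;
    }
}

// Named as in the text; within this file it shadows java.lang.SecurityManager.
class SecurityManager {
    private static final SecurityManager securityManager = new SecurityManager();

    // Weak references do not keep the customers alive.
    private final List<WeakReference<Customer>> customersToNotify = new ArrayList<>();

    private SecurityManager() {
    }

    static SecurityManager getSecurityManager() {
        return securityManager;
    }

    synchronized void register(Customer customer) {
        customersToNotify.add(new WeakReference<>(customer));
    }

    synchronized void policyChanged(String policy) {
        Iterator<WeakReference<Customer>> iterator = customersToNotify.iterator();
        while (iterator.hasNext()) {
            Customer customer = iterator.next().get();
            if (customer == null) {
                iterator.remove(); // the customer has been collected
            } else {
                customer.policyChanged(policy);
            }
        }
    }

    // Counts wrappers, including any whose customer is already gone.
    synchronized int registeredCount() {
        return customersToNotify.size();
    }
}
```

When the collector actually clears a weak reference is up to the VM, so code like this should only assume that get() may return null at some point after the last strong reference is dropped.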

Now we have the best solution. Through the use of weak references, we can have a reference to our customers without keeping the customer in memory if it is no longer used.

Another very useful class that has been added to JDK 1.2 is the WeakHashMap, which is a combination of a HashMap and WeakReference. This class can be used for programming problems where you need to have a HashMap of information, but you would like that information to be garbage collected if you are the only one referencing it. To continue from our earlier example, if for some reason we needed to store per-customer information in a HashMap, we could use the WeakHashMap. In this case, we wouldn't need to wrap our customers in WeakReference objects; we could just use them as the keys of the WeakHashMap. Note that it is the keys that a WeakHashMap holds weakly: whenever the WeakHashMap becomes the only one referencing a key, that entry becomes eligible for removal. The values are still strongly referenced by the map, and a key such as the string literal "goodCustomers" would never be collected. Let's take a look at how we could implement the code (see Listing 4).
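
A minimal sketch of a WeakHashMap in use follows. Because WeakHashMap holds its keys weakly and its values strongly, this sketch uses the Customer objects as keys and a plain string as the value. The CustomerRatings class and its rate() and ratingOf() methods are hypothetical names.

```java
import java.util.Map;
import java.util.WeakHashMap;

class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

class CustomerRatings {
    // Keys are weakly held: an entry can disappear once its Customer is no
    // longer referenced anywhere else. Values must not refer back to the key,
    // or the key would stay strongly reachable through the map.
    private final Map<Customer, String> ratingByCustomer = new WeakHashMap<>();

    synchronized void rate(Customer customer, String rating) {
        ratingByCustomer.put(customer, rating);
    }

    synchronized String ratingOf(Customer customer) {
        return ratingByCustomer.get(customer);
    }
}

class WeakHashMapSketch {
    public static void main(String[] args) {
        CustomerRatings ratings = new CustomerRatings();
        Customer fred = new Customer("fred");
        ratings.rate(fred, "good");
        System.out.println(fred.getName() + ": " + ratings.ratingOf(fred));
        fred = null; // the entry for fred may now be discarded by the map
    }
}
```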

Failure to Clean Up or Free Native System Resources
The automatic garbage collector does a fairly good job of cleaning up unused object instances. Unfortunately, there are other ways to allocate memory from the Java VM that exist outside the bounds of Java instances—typically, in the form of native system resources. Native system resources are resources that are allocated by a function external to Java. These allocations are typically done through the Java Native Interface (JNI), and are implemented in C or C++.

One very simple example that frequently burns new Java developers is the use of Abstract Windowing Toolkit (AWT) resources. The Frame, Dialog, and Graphics classes require that the method dispose() be called on them when they are no longer used, to free up the system resources they reserve.
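
The dispose() pattern can be sketched with a Graphics object obtained from an off-screen image, which is an assumption made so the example runs without a display; the same try/finally shape applies to Frame and Dialog. The DisposeSketch class and paintRed() method are hypothetical names.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

class DisposeSketch {
    // Fills an off-screen image with red and releases the graphics context.
    static BufferedImage paintRed(int width, int height) {
        BufferedImage image =
                new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = image.createGraphics();
        try {
            graphics.setColor(Color.RED);
            graphics.fillRect(0, 0, width, height);
        } finally {
            // Release the system resources held by the graphics context,
            // even if the drawing code above throws.
            graphics.dispose();
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage image = paintRed(4, 4);
        System.out.println("Painted " + image.getWidth() + "x" + image.getHeight());
    }
}
```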

More complicated examples exist when you are writing your own native methods using JNI, or when accessing third-party methods that use JNI. In these cases, the native methods may need some explicit cleanup calls from Java. If these calls are not made, memory leaks can occur.

Cleaning Up and Freeing Native System Resources
When dealing with visual native resources, you can often improve performance and avoid the possibility of memory leaks by caching views. By caching and reusing each view, you don't have to worry about the view and its associated system resources being freed, because there are only a finite number of views in existence. However, this cache/reuse method may be difficult to apply if you have 40 or 50 different views. In these cases, you may need to explicitly dispose() your resources when done with them or develop some sort of hybrid caching scheme. When using frames or windows, you can clean them up fairly easily by having your window-close events trigger a call to dispose().

Another approach that can be used to free up other system resources is to put cleanup code in the finalize() methods of the Java classes that use these system resources. This approach can be helpful, but if timing is important, it is difficult to predict when or even if finalize() will be called. As stated before, there is no requirement for VMs to garbage collect unreferenced objects, so there may be a considerable delay before objects are collected and finalize() is called, or finalize() may never be called.

A better approach is to implement a register()/release() method set with your native wrapper class. When users of the class acquire a system resource, they use a register() method, and when they are done they must call a release() method. This method is inconvenient to developers at times, but it guarantees timely release of system resources.
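
A minimal sketch of such a register()/release() pair follows. The two method names come from the text; the NativeResourceWrapper class, the integer handle, and the allocateNative() and freeNative() stand-ins for JNI calls are assumptions.

```java
class NativeResourceWrapper {
    private static int nextHandle = 1;

    // -1 means that no native resource is currently held.
    private int handle = -1;

    // Acquires the system resource; must be paired with release().
    synchronized void register() {
        if (handle != -1) {
            throw new IllegalStateException("resource already registered");
        }
        handle = allocateNative();
    }

    // Frees the system resource; calling it again is harmless.
    synchronized void release() {
        if (handle == -1) {
            return;
        }
        freeNative(handle);
        handle = -1;
    }

    synchronized boolean isRegistered() {
        return handle != -1;
    }

    // Stand-in for a native method that allocates a system resource.
    private static synchronized int allocateNative() {
        return nextHandle++;
    }

    // Stand-in for a native method that frees a system resource.
    private static void freeNative(int handle) {
        // A real implementation would call into native code here.
    }
}

class RegisterReleaseSketch {
    public static void main(String[] args) {
        NativeResourceWrapper resource = new NativeResourceWrapper();
        resource.register();
        try {
            System.out.println("Registered: " + resource.isRegistered());
        } finally {
            resource.release(); // timely release, independent of the collector
        }
    }
}
```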

Bugs in the JDK or Third-Party Libraries
By far the most difficult type of memory leak to track is one not caused by code you have written. There are bugs in the visual widgets of various versions of the JDK that can cause memory leaks. The JDK includes AWT libraries as well as several versions of Swing libraries. To see bugs submitted to Sun, visit the Developer Connection [5] and go to the Bug Parade. Most bugs are isolated to specific uses of visual classes, so always make sure your code is not causing the leak before looking too hard in this database. Note that these memory leaks are no different from unwanted references and unreleased system resources; they just occur in classes you did not develop.

Detecting Memory Leaks
Now that you know about different ways that memory leaks occur in Java programs, you need to know how to detect them.

The first and easiest way is to use an operating system process monitor, which tells you how much memory a process is using. See if your Java process grows in memory over time. On Windows NT, I personally use the NT task manager and watch the memory usage while my Java program is running.

You can also use the totalMemory() and freeMemory() methods in the Java Runtime class, which tell you how much total heap memory is being controlled by the VM, along with how much is not in use at a particular time.
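
A minimal sketch of this kind of monitoring follows. The totalMemory() and freeMemory() calls come from the text; the HeapUsageSketch class, the usedHeapBytes() helper, and the sampling interval are assumptions.

```java
class HeapUsageSketch {
    // Heap currently in use: what the VM controls minus what is free.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; a figure that keeps climbing across many
        // samples of a long-running program suggests a leak.
        for (int sample = 0; sample < 3; sample++) {
            System.out.println("Used heap: " + (usedHeapBytes() / 1024) + " KB");
            Thread.sleep(500);
        }
    }
}
```

A single reading says little, because the figure also rises between collections in a healthy program; the trend over time is what matters.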

Finally, you can detect memory leaks using memory-profiling tools such as OptimizeIt, JProbe, or JInsight. These tools allow you to examine your running program, providing valuable information, such as the number of instances allocated for each type of class. The OptimizeIt tool allows you to see how each object in memory was allocated. It also shows you a trace of references to objects in memory. This tool can be very helpful in tracking down the cause of unwanted references. Once you have found the memory leaks, you can usually solve them easily by applying the techniques described earlier.

References

  1. For a description of Sun's garbage collection changes, see java.sun.com/products/jdk/1.2/compatibility.html#runtime #3.
  2. For a complete description of the rules of garbage collection, see the Java Language Specification (primarily sections 12.6, 12.7, 12.8) at java.sun.com/docs/books/jls/12.doc.html.
  3. For more information on class loading, unloading, and garbage collection, see Venners, B., Inside the Java Virtual Machine, McGraw-Hill, 1997.
  4. See java.sun.com/docs/books/jls/unloading-rationale.html for a rationale of the special rules for class instance garbage collection.
  5. For more information on bugs, visit the Sun Developer Connection: developer.java.sun.com.