In nearly every Java book available today, usually within the first chapter, you'll
find the claim that Java solves the dreaded problem of memory leaks through the use of automatic garbage
collection. A program with a memory leak takes memory from the operating system without ever giving it back
or reusing it, so it consumes more and more memory as it runs.
Traditional languages such as C and C++ explicitly require programmers to remove data structures that
are allocated in heap memory, using constructs such as "free" or "delete." Sometimes programmers fail
to call "free" or "delete," which is the prime cause of memory leaks.
The idea of automatic garbage collection is that a utility thread or process running within the program
will follow the program along and clean up all its unused memory, giving it back to the operating
system or program to be reused. The developer then does not have to indicate explicitly when objects
should be removed from memory—thus solving the main cause of memory leaks. Because Java uses
automatic garbage collection, many think that Java programs are free from the possibility of memory
leaks. Unfortunately, this is not necessarily the case. Although automatic garbage collection solves
the main cause of memory leaks, you can still have memory leaks in a Java program. My focus is to
identify the causes of these memory leaks and give some ways to avoid them or solve them when they do occur.
The three typical causes of memory leaks in a Java program are:
- Unknown or unwanted object references.
- Failure to clean up or free native system resources.
- Bugs in the Java Development Kit (JDK) or third-party libraries.
Before we can discuss these causes of memory leaks individually, we need to review how garbage collection
is done in Java.
Garbage Collection in Java
The Java automatic garbage collection process typically operates in the form of a low priority thread
that constantly searches memory for "unreachable" objects. Unreachable objects are objects that are not
referenced by any other object that is reachable by a live thread.
This explanation may be a little confusing, so let's look at an example.
In Listing 1, our program is simple; it instantiates a Customer object and then calls the getBalance() method on it.
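A minimal sketch of a program along these lines is shown here. The BalanceUtility class, its computeBalance() method, and the placeholder balance calculation are assumptions made for illustration; only Customer, getBalance(), and the balanceUtility variable come from the text.

```java
// Hypothetical helper; stands in for whatever getBalance() uses internally.
class BalanceUtility {
    double computeBalance(String customerName) {
        // Placeholder logic: a real implementation would consult an account store.
        return customerName.length() * 100.0;
    }
}

class Customer {
    private final String name;

    Customer(String name) {
        this.name = name;
    }

    double getBalance() {
        // The BalanceUtility instance is referenced only by this local variable.
        BalanceUtility balanceUtility = new BalanceUtility();
        return balanceUtility.computeBalance(name);
        // After the method returns, nothing references the instance any more,
        // so it is unreachable and eligible for garbage collection.
    }
}

class Listing1Sketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        System.out.println("Balance: " + fred.getBalance());
    }
}
```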
In this example, when we return from the getBalance() method, there are
no longer any references to the balanceUtility instance. This object
is now unreachable and eligible for garbage collection. One nice thing about the garbage collector is that
it calls the finalize() method, if you have implemented it, just before
recycling the memory used by a given object. So, by putting a System.out.println()
statement in the finalize() method of your object, you can easily find
out when and if your object is being garbage collected. Note, however, that the fact that the
balanceUtility instance is eligible for collection does not mean
that it will be collected immediately. In some cases, it may never be collected at all. Different Java Virtual
Machines (VMs) use different algorithms to determine how to collect garbage most efficiently. For example,
changes made to the standard Sun VM between version 1.1.x and Java 2 (JDK 1.2) made garbage collection less aggressive
in general [1]. Another example is that in the Sun 1.1.x VMs, there was a
system method (System.runFinalizersOnExit(true)) that could be called to
guarantee that the VM would finalize all the eligible objects before exiting. Although this feature
still exists in Java 2, it has been deprecated [2].
Figure 1. The balanceUtility object is no longer being referenced
and is eligible for garbage collection.
Figure 1 shows how the balanceUtility object is no longer being
referenced and is eligible for garbage collection.
Class Garbage Collection and Unloading
The preceding section described how object-instance garbage collection is done, but there is yet another
important area that is not discussed as often—class garbage collection and unloading. In our example,
we had a class, Customer, with an instance of
Customer called fred.
The class Customer is also a heap-allocated object; however, there
is only one instance of the class object per class loader. When you first try to instantiate a
Customer instance, the Java VM must find the class using a tool called
a class loader. You can write your own class loader in Java, but there is one built into the VM called a
primordial class loader. Explaining exactly how class loaders work gets a bit complicated [3].
The important thing to understand is that class instances are different
from regular instances in that special rules apply to their garbage collection. These rules are as follows:
- If the primordial class loader loads class instances, they will stay in memory for the duration of your
program. They are not eligible for garbage collection.
- Classes loaded by a custom class loader are eligible for garbage collection and unloading, if the
class loader that loads them is no longer referenced.
The main reason for these special rules is that class instances can have static variables and native
methods whose state must be maintained for the lifetime of the program, because the VM cannot determine
that they will never be used again [4].
Singleton Pattern and Garbage Collection
Understanding the rules of garbage collection when you use static variables can be tricky. The basic rule
of static variables is that once the variable is initialized with data (typically an object), the variable
will stay in memory as long as the class that defines it stays in memory. Because the primordial class
loader loads most classes, they will stay in memory for the life of the program, thus causing the static
variables within them to stay in memory for the life of the program as well.
You might ask why garbage collecting static variables can be tricky. The reason will become clear as we
introduce the Singleton pattern. One popular use of static variables
is to implement the Singleton pattern. The
Singleton pattern is an object-oriented design pattern that is used to ensure a class has only one
instance and to provide a global point of access to that instance. For example, if we wanted to model a
CustomerFactory class that produces
Customer objects, we could use the Singleton pattern. In the
CustomerFactory example, we define a static variable in the
CustomerFactory class definition, which holds an instance of itself.
We could then define a static method getCustomerFactory() that could
be used to retrieve the customerFactory Singleton instance, thus
providing a global point of access to it (see Figure 2).
Figure 2. getCustomerFactory() can be used to retrieve the
customerFactory Singleton instance.
We could then get customers from the customerFactory Singleton
instance by calling getCustomer(String name) to get a customer
for a given name. Now let's say for performance reasons, we had the
CustomerFactory hold onto a vector of customers to cache them to be reused. Each time we got
a customer, we would check the cache. If the customer was not there, we could create it using some
resource, such as a database or file, and add it to the cache.
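The factory described above might be sketched as follows. This is only an illustration: the Customer class body, the linear cache lookup, and the cachedCount() helper are assumptions, and the sketch uses generics, which were added to the language after JDK 1.2.

```java
import java.util.Vector;

class Customer {
    final String name;

    Customer(String name) {
        this.name = name;
    }
}

class CustomerFactory {
    // The class object holds this static reference for as long as the class stays loaded.
    private static final CustomerFactory customerFactory = new CustomerFactory();

    // Cache of customers; every entry stays reachable through the Singleton.
    private final Vector<Customer> cache = new Vector<>();

    private CustomerFactory() { }

    // Global point of access to the single instance.
    static CustomerFactory getCustomerFactory() {
        return customerFactory;
    }

    synchronized Customer getCustomer(String name) {
        for (Customer c : cache) {
            if (c.name.equals(name)) {
                return c; // cache hit: reuse the existing object
            }
        }
        // Assumption: a real factory would load the customer from a database or file.
        Customer created = new Customer(name);
        cache.add(created);
        return created;
    }

    // Hypothetical helper, used only to observe the cache size.
    synchronized int cachedCount() {
        return cache.size();
    }
}
```

Nothing in this sketch ever removes a customer from the cache, so the cache can only grow while the program runs.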
Now take a step back and consider the memory implications of this example. We have a
primordial class loader that references the
CustomerFactory class object. The
CustomerFactory class object has a static variable, which references an instance of a
CustomerFactory. Finally, the instance of
CustomerFactory references a vector of
Customer objects. Figure 3 illustrates the situation.
Because the first object in the reference graph, the primordial class loader, is not eligible
for garbage collection, none of the objects in this entire chain can be collected.
Figure 3. All objects shown will be kept in memory for the lifetime of the Java program.
As you can see in Figure 3, we have a case in which all of the objects shown in the diagram will be
kept in memory for the lifetime of the Java program. This scenario could be looked at as either
good or bad depending on your design requirements. It's good in that these objects are kept in
memory and can be reused quickly; so if speed is the goal, this scenario may be fine. It's bad
for two reasons. First, memory is a precious resource and there is a finite amount of it, so
objects that are never or seldom reused should be destroyed to free up memory. Second,
developers may inadvertently reference objects from Singletons.
These referenced objects would normally be garbage collected, but because
Singletons stay in memory for as long as the program is running, they will keep these "other"
objects in memory as well, causing a memory leak. This circumstance is known as unwanted or unknown
references, which will be covered in detail later.
Unknown or Unwanted Object References
Now that we know a little bit about how objects are referenced and how they become unreachable,
we can talk about unknown or unwanted references. The general idea of garbage collection is that
when we are done with an object and want it to be removed from memory, we stop referencing it and
expect it to go away. However, there are times when we think we have stopped referencing
an object, but unknowingly we really do have another reference to it. Let's enhance the first
simple example of the article to illustrate the problem. Our program now has a
securityManager object that has the responsibility of maintaining security policies. The
securityManager object is implemented as a
Singleton. Our securityManager object needs to notify all
customers immediately when a policy has changed, so it holds a reference to all customers in memory.
The code might look like Listing 2.
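A minimal sketch of such code follows. The method names addCustomer(), changePolicy(), policyChanged(), and registeredCount() are assumptions for illustration. The SecurityManager class here is the article's own example class, declared in the default package, and is unrelated to java.lang.SecurityManager.

```java
import java.util.Vector;

class Customer {
    final String name;

    Customer(String name) {
        this.name = name;
    }

    void policyChanged(String policy) {
        System.out.println(name + " notified of policy change: " + policy);
    }
}

// The article's example class, not java.lang.SecurityManager.
class SecurityManager {
    // Static reference: keeps the Singleton instance reachable while the class is loaded.
    private static final SecurityManager securityManager = new SecurityManager();

    // Strong references to every registered customer.
    private final Vector<Customer> customersToNotify = new Vector<>();

    private SecurityManager() { }

    static SecurityManager getSecurityManager() {
        return securityManager;
    }

    void addCustomer(Customer c) {
        customersToNotify.add(c);
    }

    void changePolicy(String policy) {
        for (Customer c : customersToNotify) {
            c.policyChanged(policy);
        }
    }

    int registeredCount() {
        return customersToNotify.size();
    }
}

class Listing2Sketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        SecurityManager.getSecurityManager().addCustomer(fred);

        // Dropping our own reference does not make the object unreachable:
        // the Singleton's vector still refers to it.
        fred = null;

        System.out.println("Customers still registered: "
                + SecurityManager.getSecurityManager().registeredCount());
    }
}
```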
In this example, we set the fred variable to
null, expecting it to be collected. However, because the
SecurityManager object is holding onto fred through
its vector of objects to notify, it will not be collected. The
SecurityManager class, which is not eligible for garbage collection because it's a
Singleton, is referencing the
securityManager object through a static variable. Although this example is fairly simple,
we can quickly see how using the Singleton pattern can
lead to unknown references quite easily. Because these types of references can occur through
inherited or collaborating classes, we may not even know they are happening. Figure 4 shows us
how the references would look.
Figure 4. How the references would look.
As you can see, even after removing the fred variable reference
by assigning it to null (signified by the lightning bolt), it is
still being held by the securityManager object through its vector
of objects to notify. As stated before, because the securityManager
object is held as a static variable by the SecurityManager class,
it is kept in memory.
Solving Unknown or Unwanted References
Now that we know how unknown or unwanted references can occur, we can solve them by avoiding referencing
objects from static variables or long-lasting objects; cleaning up the references when we are done with
the object; or using weak references.
Avoidance means not getting into a situation where objects that need to be collected are referenced
by Singletons or other long-lasting objects. Avoidance can
be difficult and may require writing some "ugly" code. For example, if we want a customer to be notified
when a security policy changes, we don't have many development choices. We could write code to poll
the securityManager object from time to time, but that method is
not as clean as just letting the securityManager object tell you
when a policy has changed.
The cleanup approach means that when the program is done with a given object, the program must make
explicit calls to clean up the object itself. So in our example, we could implement a
cleanUp() method on the
Customer object that removes the object from the securityManager
object's list, along with any other necessary cleanup. This solution is not ideal either because developers
now need to remember to call cleanUp() when they are finished with an
object.
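The cleanup approach could be sketched like this. The removeCustomer() and registeredCount() methods are hypothetical names, and SecurityManager is again the article's example class rather than java.lang.SecurityManager.

```java
import java.util.Vector;

// The article's example class, not java.lang.SecurityManager.
class SecurityManager {
    private static final SecurityManager securityManager = new SecurityManager();
    private final Vector<Customer> customersToNotify = new Vector<>();

    private SecurityManager() { }

    static SecurityManager getSecurityManager() {
        return securityManager;
    }

    void addCustomer(Customer c) {
        customersToNotify.add(c);
    }

    // Hypothetical counterpart to addCustomer(); drops the strong reference.
    void removeCustomer(Customer c) {
        customersToNotify.remove(c);
    }

    int registeredCount() {
        return customersToNotify.size();
    }
}

class Customer {
    final String name;

    Customer(String name) {
        this.name = name;
    }

    // Must be called explicitly when the program is finished with this customer.
    void cleanUp() {
        SecurityManager.getSecurityManager().removeCustomer(this);
    }
}

class CleanUpSketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        SecurityManager.getSecurityManager().addCustomer(fred);

        fred.cleanUp(); // forgetting this call reintroduces the leak
        fred = null;    // now no reference to the customer remains

        System.out.println("Customers still registered: "
                + SecurityManager.getSecurityManager().registeredCount());
    }
}
```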
Neither of the previous two methods really solves the problem of unwanted references. They just work around
it. What we really need is a way to reference an instance but not prevent it from being garbage collected.
Fortunately, JDK 1.2 gives us a convenient way to do this by using weak references. A weak reference is
a type of object reference that exists in such a way that it will not prevent the object from being
garbage-collected. The idea is that the garbage collector looks at the object to be collected and
sees if it has any references; if it does, and they are only weak references, it recycles the object.
In JDK 1.2, weak references are implemented through several classes contained in the
java.lang.ref package. Let's take a look at how we
could use one of these classes, WeakReference, in our
example to fix our problem (see Listing 3).
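One way the weak-reference version might look is sketched below. The method names, the lastPolicy field, and the decision to prune cleared references during notification are assumptions for illustration. SecurityManager is the article's example class, not java.lang.SecurityManager.

```java
import java.lang.ref.WeakReference;
import java.util.Iterator;
import java.util.Vector;

class Customer {
    final String name;
    String lastPolicy; // records the most recent notification

    Customer(String name) {
        this.name = name;
    }

    void policyChanged(String policy) {
        lastPolicy = policy;
    }
}

// The article's example class, not java.lang.SecurityManager.
class SecurityManager {
    private static final SecurityManager securityManager = new SecurityManager();

    // Weak references do not keep the customers alive on their own.
    private final Vector<WeakReference<Customer>> customersToNotify = new Vector<>();

    private SecurityManager() { }

    static SecurityManager getSecurityManager() {
        return securityManager;
    }

    synchronized void addCustomer(Customer c) {
        customersToNotify.add(new WeakReference<>(c));
    }

    // Notifies every customer that is still alive; returns how many were notified.
    synchronized int changePolicy(String policy) {
        int notified = 0;
        Iterator<WeakReference<Customer>> it = customersToNotify.iterator();
        while (it.hasNext()) {
            Customer c = it.next().get(); // null once the customer has been collected
            if (c == null) {
                it.remove();              // drop the stale reference object
            } else {
                c.policyChanged(policy);
                notified++;
            }
        }
        return notified;
    }
}

class Listing3Sketch {
    public static void main(String[] args) {
        Customer fred = new Customer("fred");
        SecurityManager.getSecurityManager().addCustomer(fred);

        fred = null;   // only the weak reference remains
        System.gc();   // a request only; the VM decides when to collect

        int notified = SecurityManager.getSecurityManager().changePolicy("new policy");
        System.out.println("Customers notified: " + notified);
    }
}
```

The WeakReference objects themselves are ordinary strongly held objects, which is why the sketch removes cleared ones while iterating.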
Now we have the best solution. Through the use of weak references, we can have a reference to our customers
without keeping the customer in memory if it is no longer used.
Another very useful class that has been added to JDK 1.2 is the
WeakHashMap, which is a combination of a HashMap
and WeakReference. This class can be used for programming
problems where you need to have a HashMap of information,
but you would like an entry to be discarded once nothing else references its key.
Note that only the keys are held through weak references; the values are held strongly,
so the object whose lifetime should control the entry must be the key. To continue from
our earlier example, if for some reason we needed to store information about our customers
in a HashMap, we could use a WeakHashMap keyed by the
Customer objects themselves. In this case, we wouldn't need to wrap our customers in
WeakReference objects; we could just use them as keys. A string literal such as
"goodCustomers" would not work as a key for this purpose, because string literals are
interned and stay reachable. Whenever the
WeakHashMap becomes the only one referencing a customer, that customer becomes eligible for
garbage collection and its entry is removed. Let's take a look at how we could implement the code
(see Listing 4).
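A minimal sketch of a WeakHashMap in this role follows. It relies on the documented behavior that WeakHashMap holds its keys weakly and its values strongly, so the Customer objects are used as keys. The CustomerNotes class and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.WeakHashMap;

class Customer {
    final String name;

    Customer(String name) {
        this.name = name;
    }
}

// Hypothetical holder for per-customer information.
class CustomerNotes {
    // Keys (customers) are weakly held; values (notes) are strongly held.
    // A value must not refer back to its key, or the key can never be collected.
    private final Map<Customer, String> notes = new WeakHashMap<>();

    void put(Customer customer, String note) {
        notes.put(customer, note);
    }

    String get(Customer customer) {
        return notes.get(customer);
    }

    int size() {
        return notes.size();
    }
}

class Listing4Sketch {
    public static void main(String[] args) {
        CustomerNotes notes = new CustomerNotes();
        Customer fred = new Customer("fred");
        notes.put(fred, "good customer");
        System.out.println("Note for fred: " + notes.get(fred));

        fred = null;   // the map's weak key is now the only reference
        System.gc();   // a request only; the entry disappears once the key is collected
        System.out.println("Entries after GC request: " + notes.size());
    }
}
```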
Failure to Clean Up or Free Native System Resources
The automatic garbage collector does a fairly good job of cleaning up unused object instances.
Unfortunately, there are other ways to allocate memory from the Java VM that exist outside the
bounds of Java instances—typically, in the form of native system resources. Native system
resources are resources that are allocated by a function external to Java. These allocations are
typically done through the Java Native Interface (JNI), and are implemented in C or C++.
One very simple example that frequently burns new Java developers is the use of Abstract Windowing
Toolkit (AWT) resources. The Frame, Dialog, and
Graphics classes require that the method
dispose() be called on them when they are no longer used,
to free up the system resources they reserve.
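As an illustration of the dispose() idiom, the sketch below draws into an off-screen image and releases the Graphics object in a finally block. Using a BufferedImage rather than an on-screen component is an assumption made so the example runs without a display; the drawSquare() helper is hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

class DisposeSketch {
    // Fills a size-by-size image with red and returns it.
    static BufferedImage drawSquare(int size) {
        BufferedImage image = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(Color.RED);
            g.fillRect(0, 0, size, size);
        } finally {
            // Release the resources held by the graphics context,
            // even if drawing throws an exception.
            g.dispose();
        }
        return image;
    }

    public static void main(String[] args) {
        BufferedImage image = drawSquare(16);
        System.out.println("Drew image of width " + image.getWidth());
    }
}
```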
More complicated examples exist when you are writing your own native methods using JNI, or when
accessing third-party methods that use JNI. In these cases, the native methods may need some
explicit cleanup calls from Java. If these calls are not made, memory leaks can occur.
Cleaning Up and Freeing Native System Resources
When dealing with visual native resources, you can often improve performance and avoid the possibility
of memory leaks by caching views. By caching and reusing each view, you don't have to worry about
the view and its associated system resources being freed, because there are only a finite number
of views in existence. However, this cache/reuse method may be difficult to do if you have 40 or
50 different views. In these cases, you may need to explicitly dispose()
your resources when done with them or develop some sort of hybrid caching scheme. When using frames
or windows, you can clean them up fairly easily by having your window-close events trigger a call
to dispose().
Another approach that can be used to free up other system resources is to put cleanup code in
the finalize() methods of the Java classes that use these
system resources. This approach can be helpful, but if timing is important, it is difficult to
predict when or even if finalize() will be called. As stated
before, there is no requirement for VMs to garbage collect unreachable objects, so there may be a
considerable delay before objects are collected and finalize()
is called, or finalize() may never be called at all.
A better approach is to implement a register()/release() method
set with your native wrapper class. When users of the class acquire a system resource, they use
a register() method, and when they are done they must
call a release() method. This approach is inconvenient to
developers at times, but it guarantees timely release of system resources.
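A sketch of the register()/release() idea, under stated assumptions: the NativeResource class is hypothetical, and a simple counter stands in for the native allocation and deallocation calls that a real JNI wrapper would make.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around a native system resource.
class NativeResource {
    // Stand-in for native bookkeeping: counts resources currently held.
    private static final AtomicInteger openHandles = new AtomicInteger();

    private boolean registered = false;

    synchronized void register() {
        if (registered) {
            throw new IllegalStateException("resource already registered");
        }
        // Assumption: a real wrapper would call a native allocation function here.
        openHandles.incrementAndGet();
        registered = true;
    }

    synchronized void release() {
        if (!registered) {
            return; // releasing twice is harmless
        }
        // Assumption: a real wrapper would call the matching native free function here.
        openHandles.decrementAndGet();
        registered = false;
    }

    static int openHandles() {
        return openHandles.get();
    }
}

class RegisterReleaseSketch {
    public static void main(String[] args) {
        NativeResource resource = new NativeResource();
        resource.register();
        try {
            System.out.println("Handles in use: " + NativeResource.openHandles());
        } finally {
            // The finally block ensures release() runs even if the work above fails.
            resource.release();
        }
        System.out.println("Handles in use after release: " + NativeResource.openHandles());
    }
}
```

Pairing register() with release() in a try/finally block keeps the release from depending on the garbage collector's timing.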
Bugs in the JDK or Third-Party Libraries
By far the most difficult type of memory leak to track is one not caused by code you have
written. There are bugs in various versions of the JDK in the visual widgets that can cause
memory leaks. The JDK includes AWT libraries as well as several versions of Swing libraries.
To see bugs submitted to Sun, visit the Developer Connection [5]
and go to the Bug Parade. Most bugs are isolated to specific uses of visual classes, so always
make sure your code is not causing the leak before looking too hard in this database. Note that
these memory leaks are no different from those caused by unwanted references or unreleased system
resources; they just occur in classes you did not develop.
Detecting Memory Leaks
Now that you know about different ways that memory leaks occur in Java programs, you need to
know how to detect them.
The first and easiest way is to use an operating system process monitor, which tells how much
memory a process is using. See if your Java process grows in memory over time. On Windows NT,
I personally use the NT task manager and watch the memory usage only while my Java program
is running.
You can also use the totalMemory() and
freeMemory() methods in the Java Runtime class, which tell
you how much total heap memory is being controlled by the VM, along with how much is not in use at
a particular time.
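For example, a program could log its own heap usage periodically, as in the following sketch. The MemorySnapshot class, the usedHeapBytes() helper, and the one-second sampling interval are assumptions for illustration.

```java
class MemorySnapshot {
    // Heap currently in use: total heap controlled by the VM minus the free part.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; a figure that keeps climbing over a long run
        // is a hint that something is holding onto objects.
        for (int i = 0; i < 5; i++) {
            System.out.println("Used heap: " + (usedHeapBytes() / 1024) + " KB");
            Thread.sleep(1000);
        }
    }
}
```

A single reading says little, because the figure also rises between collections in a healthy program; the trend across many samples is what matters.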
Finally, you can detect memory leaks using memory-profiling tools such as OptimizeIt, JProbe, or
JInsight. These tools allow you to examine your running program, providing valuable information,
such as the number of instances allocated for each type of class. The OptimizeIt tool allows you
to see how each object in memory was allocated. It also shows you a trace of references to objects
in memory. This tool can be very helpful in tracking down the cause of unwanted references. Once you
have found the memory leaks, you can usually solve them easily by applying the techniques described earlier.
References
1. For a description of Sun's garbage collection changes, see
java.sun.com/products/jdk/1.2/compatibility.html#runtime #3.
2. For a complete description of the rules of garbage collection, see the Java Language
Specification (primarily sections 12.6, 12.7, and 12.8) at
java.sun.com/docs/books/jls/12.doc.html.
3. For more information on class loading, unloading, and garbage collection, see Venners,
B., Inside the Java Virtual Machine, McGraw-Hill, 1997.
4. See java.sun.com/docs/books/jls/unloading-rationale.html for a rationale of the special rules
for class instance garbage collection.
5. For more information on bugs, visit the Sun Developer Connection:
developer.java.sun.com.