Helping objects to a tidy end

Steve Ball ([email protected]) is an independent contractor specializing in C++/Java/UNIX development.
John Miller Crawford ([email protected]) is Interactive Technology Manager for Dave Clark Design Associates.

THE CREATION OF objects happens in the full glare of the spotlights: wherever new appears, a class constructor is called to allocate resources. The end of an object is a much less publicized affair. Just when it occurs is often difficult to say. The elimination of an object and its re-absorption into the global heap happens at the behest of the runtime system and is largely out of the programmer's control.

Nonetheless, the complete relinquishment of an object's allocated resources is a critical issue and one that we need to keep tabs on. Java's automatic garbage collection ensures that all allocated memory is returned for recycling. But what of other resources acquired by objects? To avoid leaking these resources, programmers need to intervene in the destruction of resource-holding objects. We'll devote this column to exploring the steps you need to take to ensure that each of your objects meets a timely and tidy end. Fortunately, the garbage collector provides the necessary hooks to secure the control we'll need, so our first step is to examine what services are provided by the garbage collector. Consider what happens when the following statements are executed:

Object mayfly = new Object();
mayfly = null;

A new instance of the Object class is created and a reference to it is assigned to the variable mayfly, whereupon the object is immediately discarded. The object we created is no longer reachable from any variable and is thus a candidate for garbage collection. Alternatively, we can manufacture some refuse for the garbage collector to uplift with a single statement:

new Object();

Garbage collection is a background function of the Java virtual machine (JVM). Though it may be explicitly invoked with System.gc, it otherwise operates silently in a low priority thread where it continuously searches for unreachable objects. Those it finds are eliminated and their memory reclaimed—the memory is returned to the memory manager's free list where it may be recycled later when you invoke new.

The garbage collector identifies unreachable objects by tracing the references of all variables that are currently in use. The objects referenced by the program's variables may themselves contain references to other objects, so the garbage collector follows these references recursively. Any objects that cannot be reached by this traversal are deemed unreachable and become candidates for garbage collection.

Unlike a simpler reference-counting approach, this strategy, known as mark-and-sweep, is able to identify circular chains of objects that refer to each other but are not referenced by any other variables (including objects that are solely referred to by themselves):


void linkSiblings() {
	Object[] hansel = new Object[1];
	Object[] gretel = new Object[1];
	// each array holds the only reference to the other
	hansel[0] = gretel;
	gretel[0] = hansel;
}

When linkSiblings returns, the variables hansel and gretel go out of scope. The two Object[] instances are unreachable, but they are not unreferenced: they refer to each other. This doesn't fool the garbage collector though. They will be collected at the first opportunity.
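
The tracing described above can be modelled in a few lines. What follows is a toy simulation under stated assumptions, not the JVM's actual collector: ToyNode, ToyHeap, and their methods are hypothetical names invented for this illustration. It marks everything reachable from a set of root references and treats the rest as garbage, which is how a mutually referring pair such as hansel and gretel is caught.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: ToyNode and ToyHeap are hypothetical, not JVM APIs.
final class ToyNode {
    final String name;
    final List<ToyNode> refs = new ArrayList<ToyNode>(); // outgoing references

    ToyNode(String name) {
        this.name = name;
    }
}

final class ToyHeap {
    private final List<ToyNode> allocated = new ArrayList<ToyNode>();
    private final List<ToyNode> roots = new ArrayList<ToyNode>(); // stands in for live variables

    ToyNode allocate(String name) {
        ToyNode node = new ToyNode(name);
        allocated.add(node);
        return node;
    }

    void addRoot(ToyNode node) {
        roots.add(node);
    }

    // Mark phase: follow references outward from the roots.
    private Set<ToyNode> mark() {
        Set<ToyNode> marked =
            Collections.newSetFromMap(new IdentityHashMap<ToyNode, Boolean>());
        Deque<ToyNode> pending = new ArrayDeque<ToyNode>(roots);
        while (!pending.isEmpty()) {
            ToyNode node = pending.pop();
            if (marked.add(node)) {        // first visit: queue its referents
                pending.addAll(node.refs);
            }
        }
        return marked;
    }

    // Sweep phase: discard every allocated node that was not marked and
    // return the names of the discarded nodes in allocation order.
    List<String> sweep() {
        Set<ToyNode> marked = mark();
        List<String> collected = new ArrayList<String>();
        List<ToyNode> survivors = new ArrayList<ToyNode>();
        for (ToyNode node : allocated) {
            if (marked.contains(node)) {
                survivors.add(node);
            } else {
                collected.add(node.name);
            }
        }
        allocated.clear();
        allocated.addAll(survivors);
        return collected;
    }
}

final class ToyHeapDemo {
    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        ToyNode root = heap.allocate("root");
        ToyNode hansel = heap.allocate("hansel");
        ToyNode gretel = heap.allocate("gretel");
        heap.addRoot(root);
        hansel.refs.add(gretel); // the pair refer to each other...
        gretel.refs.add(hansel); // ...but no root leads to either
        System.out.println(heap.sweep()); // prints [hansel, gretel]
    }
}
```

A reference-counting scheme would see one incoming reference on each of the pair and keep both; the trace never visits them, so both are swept.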

Unreachable objects, like hansel and gretel, live on borrowed time until some undefined future instant when they meet a common fate at the hands of the predatory garbage collector. The exact moment when this occurs (as well as in which thread it occurs) is undefined. Fortunately, the garbage collector signals its intention to eliminate the objects by calling one of their methods before proceeding: drum roll, please, for the entry of the finalize method.

We can arrange to tidy up after our objects by hooking into the finalize method, which the garbage collector will invoke before their memory is relinquished. The memory that the object occupies may not be the only resource that the object has acquired, so here is the opportunity to release any other resources it holds.

Because assignments within the finalize method may render some unreachable objects reachable once more and other reachable objects unreachable, finalized objects may not be disposed of at the termination of the finalize method. Finalized objects are merely flagged as having had their finalize method executed. When the garbage collector performs a subsequent sweep and finds unreachable objects that have this flag set, they are eliminated on the spot with no warning given.

MAXIM 6: USE FINALIZE METHODS TO ENSURE EXTERNAL RESOURCES GET RECYCLED

Here is a definition of a temporary file class. It may be used in any place where a RandomAccessFile would be used, but has the distinction that when the file is closed, it is also deleted automatically.


import java.io.*;

class TemporaryFile extends RandomAccessFile
{
  // Opens the file name prepared earlier, then lines up a fresh one.
  TemporaryFile() throws IOException {
    super(next, "rw");
    file = next;
    next = getNext();
  }

  // Closes the file and then deletes it.
  public void close() throws IOException {
    super.close();
    file.delete();
  }

  private File file;

  // Builds a file name from the hash code of a throwaway object.
  private static File getNext() {
    return new File(
      new Object().hashCode() + ".tmp");
  }
  private static File next = getNext();
}
The class overrides the close method so that when the file is closed it can also delete the file. As the class stands, we rely on the file being closed explicitly with the close method to be able to remove the file.

To have the file automatically deleted, whether close is called explicitly or not, we will need to define a finalize method:

public void finalize() throws Throwable {
  close();
}
So now, when the TemporaryFile object is destroyed, its finalize method is first called, ensuring that the temporary file will always be removed, right? Unfortunately, no. To understand why, we need to take a closer look at the finalizer mechanism.

To recap, every object has a finalize method, which it inherited originally from the primordial Object superclass (whose finalize method is defined to do nothing). A class designer may override the finalize method to have cleanup code executed before the object is eliminated.

Take a look at the source code or online documentation for the Object class and you will find it contains this method:

protected void finalize() throws Throwable
Because regular polymorphism is used by the garbage collector to invoke the finalize method (dynamically resolved to the finalize method you wrote for your class), the compiler will not produce an error if the finalize method in your class doesn't quite match this one. Ensure that the method you define returns void, takes no arguments, is non-static, and is spelled correctly.* If you define your method with any other signature, it will not override Object.finalize.

At first encounter, finalizer methods sound like a housekeeper's dream come true. They appear to offer an easy way to tidy up and return borrowed resources after objects have left the party. Again, finalizers seem to promise cheap protection for the heedless programmer who forgets such tedious tasks as closing a file or disconnecting from the server. And all of this would be true, if only Java were to provide guarantees about when, or even whether, each unreachable object's finalize method will be called.

Quite apart from the delay of undefined duration between an object becoming unreachable and the garbage collector collecting it, there exists the possibility that the object will never be disposed of at all. By default, the garbage collector does not guarantee that all garbage will be collected before a main method returns or an applet terminates. Even calling System.gc as the last action of the program would not catch all objects: what of the objects referred to by class variables?

This is not quite the end of the story though. We're about to welcome onto the scene a housekeeper's aid that was introduced with Java 1.1. But those who have to stay behind with 1.0 (such as writers of applets for JVM version-challenged browsers) shouldn't design classes to depend on their finalize methods. All the same, keep in mind that, even though there are no guarantees that every object will be finalized, most will be.

Your classes should be designed to provide methods to perform cleanup and closure operations explicitly. You would expect users of the class to invoke these methods before discarding their objects—a close method for a File class, a disconnect method for an inter-applet communication class. But if they forget, you can call these methods from the finalize method. You could also log the fact that the methods that use your class need to be revisited to check up on their cleanup code—the finalizer mechanism was able to catch the resource leak this time, but next time it might not.
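
As a minimal sketch of this arrangement, consider the class below. ServerLink and all of its members are hypothetical names chosen for illustration, and the real resource handling is elided. The explicit disconnect method does the work; the finalizer only steps in, and complains, when the caller forgot. The sketch assumes a JVM on which finalize is still supported.

```java
// Hypothetical class for illustration; the real resource handling is elided.
class ServerLink {
    private boolean connected = true;
    private boolean leakReported = false;

    // Explicit cleanup that callers are expected to invoke. Safe to call twice.
    public synchronized void disconnect() {
        if (connected) {
            connected = false;
            // ... release the external resource here ...
        }
    }

    public synchronized boolean isConnected() {
        return connected;
    }

    public synchronized boolean wasLeakReported() {
        return leakReported;
    }

    // Backstop: cleans up only if the caller forgot, and logs the fact.
    // Returns true if it had to step in.
    synchronized boolean cleanUpIfForgotten() {
        if (!connected) {
            return false;
        }
        leakReported = true;
        System.err.println("ServerLink discarded without disconnect(); cleaning up late");
        disconnect();
        return true;
    }

    // Recent JDKs flag finalize as deprecated, hence the suppression.
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() throws Throwable {
        try {
            cleanUpIfForgotten();
        } finally {
            super.finalize();
        }
    }
}
```

Keeping the backstop in its own method means the logic can be exercised directly, without waiting for the garbage collector to get around to the object.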

And now, to achieve our objective with the TemporaryFile class of ensuring that the file is deleted every time, the rest of us will move on up to 1.1 and beyond.

MAXIM 7: TAKE ADVANTAGE OF runFinalizersOnExit WHERE YOU HAVE IT
The Java 1.1 runtime system provides programmers with a means of intervening before the final curtain is rung down. Use System.runFinalizersOnExit to ensure that all objects still on stage as the curtain falls will have a chance to say their last words through their finalize method. runFinalizersOnExit specifies that the finalizers of all objects that have finalizers that have not yet been automatically invoked are to be run before the Java runtime exits:

System.runFinalizersOnExit(true);
With the JVM now acting as guarantor that every object will have its final say, it is perfectly acceptable for finalize methods to perform any tidy up silently and for you to rely on all necessary cleanup actions being performed. We recommend that you invoke System.runFinalizersOnExit at the start of your programs and depend on all necessary finalize methods being invoked. The alternative, of ensuring that all termination methods (such as close) are called explicitly for all objects, can be very tedious, particularly where methods have multiple exit points or exceptions are involved. For some methods, ensuring that the catch clause tidies up objects that were constructed within the try block can require extensive reorganizations just to bring the names of the variables into an outer scope where their termination methods may be invoked. This can quickly disguise the structure and intent of the method. It is also error-prone and does not lend itself well to the introduction of additional resource-holding objects into the method.

The runFinalizersOnExit approach is simpler, cleaner, and comes with guarantees: rely on it where you have it available. One conceivable objection to using runFinalizersOnExit is the cost of finalizing all objects before the JVM may exit. The JVM designers have anticipated this objection. The Java Language Specification (JLS) specifies that any object that does not override Object.finalize (or overrides it only trivially, such as merely to call super.finalize) may be immediately expunged without having its finalizer invoked. This is possible because, as Object.finalize does nothing, such an object cannot resurrect itself by assigning its this reference to an active variable and so reacquiring its reachable status. The object's memory may be instantly reclaimed without the object having to be returned to the pool to be detected as unreachable a second time, as is the case with objects with non-trivial finalizers. For the vast majority of objects in the system, which do not have finalizers, no overhead is incurred (see Figure 1).

Figure 1. Life cycle of an object.

If you want to intervene at some point prior to exit, the System class also offers you runFinalization,† which visits unreachable objects waiting in limbo for the garbage collector to call and has them execute their finalize method there and then. Now that we have looked at the finalizer mechanism in detail and determined when you may rely on it being called, we can turn to the question of just what the body of your finalizer methods should contain.

MAXIM 8: ALWAYS DEFINE FINALIZE TO THROW THROWABLE
The close method lists IOException as its only checked exception, indicating that it may only throw an object of type IOException or a subclass of it:

public void close() throws IOException {
As TemporaryFile.finalize only calls close, it too could only throw an IOException object. Apprised of this, we could narrow down the types of objects thrown by our own finalize method to only those we know it could throw (subclasses of IOException only):
public void finalize() throws IOException {
   close();
}
The problem that this well-intentioned action introduces is not immediately apparent, but will lie dormant until the class is further extended. Suppose the new subclass is a multi-thread capable shared temporary file; its finalizer may throw an InterruptedException object instead of an IOException. How do we define the finalize method in this class?
public void finalize() throws IOException, InterruptedException {
	peer.stop();
	peer.join();
	close();
}
This will not compile because an overriding method definition may only narrow the types of exception thrown; it may not broaden them. This forces all subclasses to explicitly catch their own supplementary exceptions. This is an unnecessary burden:
public void finalize() throws IOException {
  try {
    peer.stop();
    peer.join();
    close();
  } catch (InterruptedException e) {}
}
An even more severe stricture is imposed on subclassers if a finalizer is defined to catch all its exceptions and has no throws qualifier:

public void finalize() {
  try {
    close();
  } catch (IOException e) {}
}
A method that specifies no checked exceptions may not throw any checked exception, and neither may any overriding subclass definition. Avoid encumbering subclass designers by defining your finalize methods with as broad an exception signature as the one they originally inherited from Object.
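
The override rule can be seen in isolation with an ordinary method. In the sketch below every class name is hypothetical, cleanup stands in for finalize, and the simulateInterrupt flag is an assumption used only to provoke the exception. A broad declaration in the base class leaves the subclass free; a narrow one forces the subclass to swallow what it cannot declare.

```java
import java.io.IOException;

// Hypothetical classes; cleanup() stands in for finalize().
class BroadBase {
    // Declared as broadly as Object.finalize, leaving subclasses free.
    void cleanup() throws Throwable {
    }
}

class SharedChild extends BroadBase {
    boolean simulateInterrupt;

    // Legal: IOException and InterruptedException both narrow Throwable.
    @Override
    void cleanup() throws IOException, InterruptedException {
        if (simulateInterrupt) {
            throw new InterruptedException("simulated");
        }
    }
}

class NarrowBase {
    // The well-intentioned narrowing described in the text.
    void cleanup() throws IOException {
    }
}

class StuckChild extends NarrowBase {
    boolean simulateInterrupt;
    boolean swallowed;

    // Declaring "throws IOException, InterruptedException" here would be
    // rejected by the compiler, so the exception has to be caught instead.
    @Override
    void cleanup() throws IOException {
        try {
            if (simulateInterrupt) {
                throw new InterruptedException("simulated");
            }
        } catch (InterruptedException e) {
            swallowed = true; // the burden imposed by the narrow base class
        }
    }
}
```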

MAXIM 9: ALWAYS CHAIN FINALIZE METHODS
When you define a finalize method in your own class, then (according to the normal rules for dynamic resolution for overridden methods) you conceal the finalize method of your class's superclass. This won't be a problem if you're not extending any other class, but if your class inherits a finalize method from a superclass other than Object then its superclass's finalize method will not be automatically invoked.

Because the constructors that give objects birth chain automatically from subclass to superclass all the way back to Object, it might be thought that finalizers would do likewise as they preside over objects' demise. Sadly not, and so an explicit call is required—preferably at the end of your finalizer—to the superclass's finalize method.

public void finalize() throws Throwable {
  close();
  super.finalize();
}
The qualification with super is required to avoid a recursive infinite loop where your finalize method would invoke itself.

There's no strict requirement to chain finalizers if none of the superclasses override finalize, as the one from the Object class is defined to do nothing. However, a finalize method may be added to one of the superclasses later. We recommend that you always invoke the superclass finalize method as the last line of your own finalize methods so that you will not need to revisit subclass methods if you add a finalize method to a superclass.

One final point: Although the garbage collector will catch and ignore any exception thrown in a finalize method, the exception still causes the method to return immediately. The superclass should be given an opportunity to free its resources even if your class should fail to free its own. As a matter of habit, enclose your cleanup code in a try block and execute the superclass finalizer in the try block's finally clause:

public void finalize() throws Throwable {
  try {
    close();
  } finally {
    super.finalize();
  }
}
If close fails and throws an exception, the finalize method will execute the superclass finalizer before re-throwing the IOException object.
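
To watch that ordering happen, the sketch below invokes finalize directly, which is possible because it is declared public. BaseResource, DerivedResource, and ChainDemo are hypothetical names, and the failing close is simulated. Because the collector may call finalize again later, the superclass guards its cleanup with a flag.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration. finalize is invoked directly so that
// the order of events is deterministic; normally the collector calls it.
class BaseResource {
    final List<String> log = new ArrayList<String>();
    private boolean released;

    @SuppressWarnings({"deprecation", "removal"})
    @Override
    public void finalize() throws Throwable {
        try {
            if (!released) {          // guard against a later repeat call
                released = true;
                log.add("base released");
            }
        } finally {
            super.finalize();
        }
    }
}

class DerivedResource extends BaseResource {
    void close() throws IOException {
        throw new IOException("close failed"); // simulated failure
    }

    @SuppressWarnings({"deprecation", "removal"})
    @Override
    public void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize(); // runs even though close() threw
        }
    }
}

final class ChainDemo {
    static List<String> run() {
        DerivedResource resource = new DerivedResource();
        try {
            resource.finalize();
        } catch (Throwable t) {
            resource.log.add("rethrown: " + t.getMessage());
        }
        return new ArrayList<String>(resource.log);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [base released, rethrown: close failed]
    }
}
```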

MAXIM 10: WRITE FINALIZE METHODS DEFENSIVELY AND KEEP THEM SIMPLE
Finalizer methods are best kept simple. They operate under a unique set of conditions:

  1. Unlike any other methods, finalize methods will be able to reach objects that are technically unreachable (the object referred to by this, for one) and, in the case of circularly linked objects, will have references to objects that have already had their finalize methods invoked.
  2. They may be called asynchronously by the garbage collector thread or synchronously by any other thread that calls System.gc or attempts to new an object when insufficient memory is available.
  3. If defined as public, the finalize method may be called explicitly against an object. If so, the garbage collector will still invoke it as part of its own cleanup.
  4. For sets of circularly linked unreachable objects, the garbage collector may execute the finalizer methods in any order, even in multiple concurrent threads (see Java Language Specification, Version 1.0, Gosling, J., B. Joy, and G. Steele, Addison-Wesley, 1996).

The environment in which the finalizer operates is uncertain, so complex finalize methods should be avoided. The more complex the method becomes, the more likely it is that the synchronization issues that the class's other methods must contend with will also apply to the finalize method. This is something you should eagerly avoid, and you'll have no hesitation in doing so after you've had to track down a deadlock condition occurring in the garbage collector thread. But, of course, you may not know straight away that the garbage collector thread has deadlocked: you may just run out of memory.

The finalize method may be called at any time, by any thread, any number of times, so it should be written defensively. Check that the actions your finalizers are about to perform still need doing. Maintain a consistent internal state in the object: if you test a flag that indicates whether the file should be deleted before deleting it, remember to set the flag afterwards. It is also possible for finalized objects to reappear on the scene as reachable objects. By assigning the this reference or any instance variable that refers to an unreachable object to some accessible variable, a once unreachable object may be resurrected.‡ For this reason, the finalize method should leave the object in a tidy self-consistent state—the object's other methods may yet be invoked.

Taking that advice, we'll now rewrite our TemporaryFile class's finalize method as defensively and simply as possible:


public void close() throws IOException {
	// file is null once it has been closed and deleted
	if (file != null) {
		try {
			super.close();
		} finally {
			file.delete();
			file = null;
		}
	}
}

public void finalize() throws Throwable {
	try {
		close();
	} finally {
		super.finalize();
	}
}
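
For readers who want to run the defensive version end to end, here is a self-contained sketch. It is a variant, not the column's class: SketchTemporaryFile is handed the File to use (an assumption made so that the example does not depend on the naming scheme), and SketchTemporaryFileDemo is a hypothetical harness that closes the file twice to show that the second call is harmless.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch only: a variant of TemporaryFile that is given its File.
class SketchTemporaryFile extends RandomAccessFile {
    private File file; // null once the file has been closed and deleted

    SketchTemporaryFile(File file) throws IOException {
        super(file, "rw");
        this.file = file;
    }

    @Override
    public void close() throws IOException {
        if (file != null) {
            try {
                super.close();
            } finally {
                file.delete();
                file = null; // record that the work is done
            }
        }
    }

    // Recent JDKs flag finalize as deprecated, hence the suppression.
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}

final class SketchTemporaryFileDemo {
    // Returns true if the file is gone after close() and a repeated
    // close() completes without complaint.
    static boolean run() throws IOException {
        File scratch = File.createTempFile("sketch", ".tmp");
        SketchTemporaryFile temp = new SketchTemporaryFile(scratch);
        temp.writeInt(42);
        temp.close();
        temp.close(); // finds file == null and does nothing
        return !scratch.exists();
    }
}
```
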
As a further motivation to keep finalizers simple, be aware that objects "which would naively be considered reachable" may find their finalizers called and their memory returned to the heap. An optimizing compiler may assign null to objects "to cause the storage for such an object to be potentially reclaimable sooner" (see Java Language Specification, Version 1.0, Gosling, J., B. Joy, and G. Steele). For this reason, try to avoid side-effect behavior in finalizer methods. The file variable in our TemporaryFile class is defined to be private to avoid just these problems, but imagine if it were public. In the following code, would the construction of the RandomAccessFile succeed or throw an exception?
public void peekAtTempFile() throws IOException {
	TemporaryFile tempFile = new TemporaryFile();

	// last reference to tempFile
	File file = tempFile.file;

	try {
		// take a peek at the contents of the
		// temporary file
		RandomAccessFile view =
			new RandomAccessFile(file, "r");
	} catch (IOException e) {
		// file already deleted!
	}
}
The compiler is free to reclaim the object referred to by tempFile at any point after its last use in the scope into which it was introduced. The TemporaryFile object may have been finalized at the point that the RandomAccessFile object is constructed, but the new object depends on the temporary file not having been deleted yet.§

FINAL WORDS
In our September column ("Multi-threaded assignment surprises," Java Report, 3(9), 1998) we said this of classes used to encapsulate locking mechanisms: "Using separate locking classes introduces its own problems—most notably that stand-alone locking objects are not automatically unlocked."

We can now see how this problem may be solved: using a finalize method to free its lock, and System.runFinalizersOnExit to ensure that the method is called, we can guarantee that a discarded Mutex object will not (paradoxically) hold the lock after it has been destroyed:

public void finalize() throws Throwable {
	try {
		if (locked)
			unlock();
	} finally {
		super.finalize();
	}
}
Although there are plenty of caveats to be aware of when using finalizers, we encourage you to make confident use of this useful feature. Our maxims will guide you around the pitfalls. Particularly with Java 1.1, contributing its support for guaranteed finalizers, the finalization mechanism is a powerful and effective tool for ensuring that your classes co-operate well, leak no resources, and contribute to more professional and robust system designs.

LATE-BREAKING DEVELOPMENTS
At the time of going to press, Sun's JDK 1.2 Beta 4 had just been released. It contains this disturbing annotation to runFinalizersOnExit: "Deprecated. This method is inherently unsafe. It may result in finalizers being called on live objects, resulting in erratic behavior or deadlock." It is unclear from this warning whether the idea of guaranteed finalization is itself unimplementable or simply implemented faultily in current JDKs.

As we have shown in our column, this method is an essential tool for programmers wishing to write robust systems. As developers committed to the Java platform, we feel alarmed by the apparently cavalier attitude that the deprecation of this method demonstrates.

runFinalizersOnExit has been a part of the standard java.lang package since the release of JDK 1.1.


*This is especially troublesome for those, like us, who live outside of the United States where finalize is spelled (as it should be) with an "s". Be careful, finalize is not a reserved Java keyword, just a ubiquitous method defined for every object, so syntax-highlighting editors won't bring your attention to the mistake if you misspell it.

† An alternative to the class method System.runFinalization is the instance method Runtime.runFinalization. The former is merely a wrapper for Runtime.getRuntime().runFinalization(). This is similarly true of System.runFinalizersOnExit.

‡ The garbage collector will only call an object's finalize method once. The next time that it detects that the resurrected object is unreachable the object will be dispatched without hesitation.

§ None of the Sun JDKs that we tested threw an exception when the RandomAccessFile object was constructed. The JVM from Microsoft's Visual J++ could be observed taking advantage of this optimization when a call to System.gc was made immediately after tempFile's last use.