Effective Java
Solutions for implementing dependable clone methods

In my previous column,¹ we took a whirlwind tour of Java's cloning mechanism. I highlighted the special relationship between the Object.clone method and the Cloneable interface and described how Object.clone is implemented within the Java Virtual Machine (JVM).

Unfortunately, the closer you look at Java's cloning mechanism, the more clumsy and inelegant it appears—implementers are required to catch and absorb exceptions that cannot possibly be thrown, and have no easy way to disable cloning for classes that inherit a clone method with no checked exceptions. Worse still, if you attempt to define a clone method in the presence of final members or inner classes, Java fights you all the way.

This would not be so bad if cloning weren't a necessary part of fully encapsulating class behavior. Where constructors wish to retain references to their arguments while staying immune to changes made to those objects in the calling method, or where accessors wish to return objects but do not wish their own member instances to be modified through the returned references, cloning is unavoidable (see Maxim 18 of my previous column¹).

In this column, I'll present workarounds that will enable you to implement dependable clone methods even in the presence of Java's most cloning-hostile language features.

This contention originates in Object.clone. To understand why resistance is futile, I'll begin by showing why this powerful (and dangerous) method cannot be avoided.

Maxim 23: Always Use super.clone to Perform the Cloning (Unless clone is Declared final)
Consider this simple class, to which you would like to add cloning support:


class Vec {
  Vec() {}
  Vec(double xx, double yy) { x = xx; y = yy; }
  double x, y;
}
Although a method of any name could be added to this class to provide cloning support, the best choice is to override Object.clone, where the overridden method will be available polymorphically and with a standardized name (Maxim 19¹). The clone method is responsible for returning an object that is identical to the one it was invoked against, which, apart from the ugliness with the CloneNotSupportedException, is just what Object.clone will do for you:

class Vec implements Cloneable {
  public Object clone() {
    try {
      return (Vec) super.clone();
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
  // as before...
}
The implementation of Vec's clone method shown here is virtually a canonical form for a clone method (remembering, though, that if the class contains any references it will likely need some extra code to ensure that the duplication performs an appropriately deep copy; see Maxim 20¹).
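As a minimal sketch of the "extra code" a deep copy calls for, consider a class that holds references to mutable Vec objects. The Segment class and the DeepCopyDemo driver are hypothetical names introduced only for this illustration; Vec is the class from above with its canonical clone method.

```java
// Vec as defined in the column, with the canonical clone method.
class Vec implements Cloneable {
  double x, y;
  Vec() {}
  Vec(double xx, double yy) { x = xx; y = yy; }

  public Object clone() {
    try {
      return super.clone();
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
}

// Hypothetical class (not from the column) that refers to mutable objects.
class Segment implements Cloneable {
  Vec start, end;
  Segment(Vec s, Vec e) { start = s; end = e; }

  public Object clone() {
    try {
      // Object.clone copies the two references, not the Vec objects.
      Segment copy = (Segment) super.clone();
      // The extra step: give the copy Vec objects of its own.
      copy.start = (Vec) start.clone();
      copy.end = (Vec) end.clone();
      return copy;
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
}

class DeepCopyDemo {
  public static void main(String[] args) {
    Segment a = new Segment(new Vec(0, 0), new Vec(1, 1));
    Segment b = (Segment) a.clone();
    b.end.x = 5;                     // modifies only the copy
    System.out.println(a.end.x);     // prints 1.0
  }
}
```

Without the two extra assignments, a and b would share the same Vec objects and the change made through b would show up in a.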

Object.clone is a native method implemented in the JVM, and as such, incurs the overhead of native method invocation. For large classes, the speed with which native machine instructions execute more than compensates for the one-off start-up overhead. For small classes such as Vec, and with state-of-the-art "Just In Time" compiling JVMs, this overhead cannot be recouped (see Table 1).

Table 1. Performance comparison of clone construction methods.

Construction method               JDK 1.0   JDK 1.1   JDK 1.2   Visual J++
new Vec(orig.x, orig.y)             4,280     1,980     1,370          550
new Vec(orig)                       4,230     1,980     1,320          600
(Vec) orig.clone()                  4,060     2,800     1,980        6,270
(Vec) Vec.class.newInstance()*     10,000     7,630    14,230       19,330

All times are specified in nanoseconds.

* Invoking the newInstance method returns an object that was constructed using the no-arg constructor. Further initialization is required to make it a clone (the measured times include this).

It is tempting, then, to implement the clone method with simple assignments and avoid the call to super.clone:


public Object clone() {
   return new Vec(x, y);
}
This clone method functions perfectly well for cloning Vec objects. The trouble is that sometimes it is called to clone something else, an object of a subclass:

class Vec3D extends Vec {
  Vec3D() {}
  Vec3D(double xx, double yy, double zz) {
    super(xx, yy);
    z = zz;
  }

  public Object clone() {
    return (Vec3D) super.clone();
  }

  double z;
}
In the following code, what would you expect the values of x, y, and z of the cloned object to be?

Vec3D v = new Vec3D(2, 3, 5);
Vec3D w = (Vec3D) v.clone();
Looking solely at the definition of the Vec3D class, you might reasonably expect that the cloned object has the same values as the original. But it doesn't. The Object.clone method produces an object that is identical in internal state to the object against which it was invoked. The call to super.clone in our first definition of Vec's clone method was actually calling Object.clone, which would have duplicated the z value in the Vec3D object (as well as the x and y values that it inherited from Vec).

Our second definition of Vec's clone method only duplicates the member variables of the Vec class, knowing nothing of the z value. So what is the z value of the cloned object? Zero, the default, perhaps?

Actually, it's a trick question: there is no cloned object. Whereas Object.clone returns an object with the same state and type as the object being cloned, our second implementation returns a mere Vec object, so the cast to Vec3D fails with a ClassCastException. Without examining the source for the Vec class, this isn't something that the writer of the Vec3D class could possibly have known or have been expected to anticipate. The Vec class's clone method was effectively booby trapped.
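The failure can be reproduced with a short, self-contained sketch. The Vec and Vec3D classes are the ones shown above, with Vec using the second (shortcut) clone method; the BoobyTrapDemo class and its cloneFails helper are hypothetical names added for the demonstration.

```java
class Vec {
  double x, y;
  Vec() {}
  Vec(double xx, double yy) { x = xx; y = yy; }

  // The tempting shortcut: bypasses super.clone, so it always builds a plain Vec.
  public Object clone() { return new Vec(x, y); }
}

class Vec3D extends Vec {
  double z;
  Vec3D() {}
  Vec3D(double xx, double yy, double zz) { super(xx, yy); z = zz; }

  // Looks canonical, but super.clone() hands back a Vec, not a Vec3D.
  public Object clone() { return (Vec3D) super.clone(); }
}

class BoobyTrapDemo {
  // Hypothetical helper: reports whether cloning a Vec3D fails with a ClassCastException.
  static boolean cloneFails() {
    try {
      new Vec3D(2, 3, 5).clone();
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(cloneFails());  // prints true
  }
}
```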

Recall Maxim 21,¹ "Remember that subclasses depend on a class's clone method too." Any class that calls super.clone expects that the values of its own instance members will also be duplicated in the cloned object returned from its superclass's clone method. This will be the case only when each clone method chains upward to its superclass's clone method, ultimately linking to the uniquely abled clone method present in the terminal Object superclass.

By capitalizing on the functionality provided by the clone method of the Object class, a class's clone method ensures that the same functionality is courteously extended to its subclasses.

An interesting exception to this rule is in the case of classes declared as final whose instances represent immutable objects.² Classes such as String and Color, and the wrapper classes Integer, Float, Void, etc., fall into this category. Classes that are final (or possess a clone method declared as final) may validly have a clone method that bypasses super.clone. Because the state of immutable objects cannot be changed after construction and also because, being final, this ability cannot be added by subclassing, there is no need for the clone to even be a distinct instance from the original object. These classes may have the simplest of clone methods:


// Currency objects are immutable
public final class Currency {
  public Object clone() { return this; }
  // other methods...
}
For all other classes, the clone method must make a call to super.clone, be declared final, or unconditionally throw an exception. Exposing a clone method that does not call super.clone and may be overridden in a subclass allows for the possibility that a subclass's clone method will call its super.clone—expecting its instance variables to be cloned also—and have its expectations disappointed.
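The third option, unconditionally throwing an exception, might look like the following sketch. It assumes a class that inherits clone directly from Object, so the checked CloneNotSupportedException is still available to it; the Session and NoCloneDemo names are hypothetical and serve only as an illustration.

```java
// Hypothetical class whose instances must never be duplicated.
class Session {
  // Overrides Object.clone and refuses unconditionally. A subclass that
  // calls super.clone() gets the exception rather than a half-copied object.
  protected Object clone() throws CloneNotSupportedException {
    throw new CloneNotSupportedException("Session objects cannot be cloned");
  }
}

class NoCloneDemo {
  // Hypothetical helper: reports whether an attempt to clone is refused.
  static boolean cloningRefused() {
    try {
      new Session().clone();   // accessible here because both classes share a package
      return false;
    } catch (CloneNotSupportedException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(cloningRefused());  // prints true
  }
}
```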

Unfortunately, there are two features that have been added to the Java language that mean that it is not always possible to call super.clone from some classes' clone methods, requiring one of the other options to be chosen.

Maxim 24: Resort to Copy Construction to Clone Objects With final Members
The first case involves final instance fields, which cannot be assigned new values (if they are of primitive type) or new objects (if they are of reference type) once their owning object has been constructed. Consider this simple mutual exclusion class:


class Mutex {
  Mutex() {}

  // Attempts to acquire the lock; if block is true, waits until it is free.
  boolean lock(boolean block) {
    synchronized (signal) {
      while (block && locked) {
        try {
          signal.wait();
        } catch (InterruptedException e) {}
      }
      return locked ? false : (locked = true);
    }
  }

  // Releases the lock and wakes one waiting thread.
  boolean unlock() {
    synchronized (signal) {
      if (!locked) return false;
      locked = false;
      signal.notify();
      return true;
    }
  }

  private static long nextId = 1;

  private final long id = nextId++;
  private final Object signal = new Object();
  private boolean locked = false;
}
The class has two final members: a unique id that is assigned once when the object is constructed, and a unique Object instance the Mutex object uses to signal between threads that the lock has been released. If it were not for the fact that these two members were declared final, we could implement a standard clone method containing an extra couple of lines to give the clone a unique id and signaling object of its own:

public Object clone() {
  try {
    Mutex clone = (Mutex) super.clone();
    clone.id = nextId++;          // won't compile
    clone.signal = new Object();  // won't compile
    return clone;
  } catch (CloneNotSupportedException e) {
    throw new InternalError(e.toString());
  }
}
Unfortunately, to quote the Java Language Specification (JLS),³ "Any attempt to assign to a final field results in a compile-time error." An amendment was made for Java 1.1 to allow the initialization of final fields to be deferred until the completion of instance initialization. Not that it would have helped us with the cloning problem, but we could have initialized the fields this way to the same effect:

class Mutex {
  Mutex() {
    id = nextId++;
    signal = new Object();
  }

  private final long id;
  private final Object signal;
  // as before...
}
This is a special privilege that constructors enjoy, but not one that is extended to the clone method. (By rights it should be, because cloning is a form of instance initialization. Interestingly, this inconsistency has opened up a hole in Java's Definite Assignment rules.*) In short, once we have a cloned object constructed, there is no way to alter the values of its final fields.

The simple solution to this conundrum is to remove the final status from the offending fields. But as the JLS³ also states, "Declaring a field final can serve as useful documentation that its value will not change, can help to avoid programming errors, and can make it easier for a compiler to generate efficient code."

These are reasonable motivations to want to keep fields that are "final" in nature actually declared final in the class. Keeping the fields final while adding a working clone method is one case where you may want to break the rule about always calling super.clone to perform the cloning:


private Mutex(Mutex m) { locked = m.locked; }
public final Object clone() { return new Mutex(this); }
Notice that the copy constructor is only responsible for initializing the non-final fields; the final fields are initialized before the call to the constructor because their declarators assign them (unique) initial values.

In compliance with Maxim 23, the clone method has itself been declared final so that subclasses of the Mutex class cannot go awry by calling super.clone.
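Assembled into one listing, the Mutex class with its copy constructor might look like the sketch below. The id() accessor and the MutexCloneDemo class are hypothetical additions, included only so that the effect of cloning can be observed from outside the class.

```java
class Mutex {
  private static long nextId = 1;

  private final long id = nextId++;            // fresh value for every instance
  private final Object signal = new Object();  // fresh object for every instance
  private boolean locked = false;

  Mutex() {}

  // Copy constructor: copies only the non-final state. The field
  // initializers above have already run by the time this body executes.
  private Mutex(Mutex m) { locked = m.locked; }

  // Bypasses super.clone, so it is declared final (Maxim 23).
  public final Object clone() { return new Mutex(this); }

  boolean lock(boolean block) {
    synchronized (signal) {
      while (block && locked) {
        try {
          signal.wait();
        } catch (InterruptedException e) {}
      }
      return locked ? false : (locked = true);
    }
  }

  boolean unlock() {
    synchronized (signal) {
      if (!locked) return false;
      locked = false;
      signal.notify();
      return true;
    }
  }

  long id() { return id; }  // hypothetical accessor, for the demonstration only
}

class MutexCloneDemo {
  public static void main(String[] args) {
    Mutex a = new Mutex();
    a.lock(false);
    Mutex b = (Mutex) a.clone();
    System.out.println(a.id() != b.id());  // prints true: the clone has its own id
    System.out.println(b.lock(false));     // prints false: the locked state was copied
    System.out.println(b.unlock());        // prints true
    System.out.println(a.lock(false));     // prints false: a is unaffected by b.unlock()
  }
}
```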

Maxim 25: Clone Inner Objects by Copy Construction
In addition to those fields explicitly declared final, a class may have other final fields that are not visible at all. Every instance of a non-static inner class will have one or more hidden private fields that are the this references to the object's enclosing outer instances. This is how the methods of an inner class are able to refer to the methods and fields of its outer classes.

Being implicitly final, these this values are unalterable aspects of the inner class object's existence. Not surprisingly then, the same problems that arise with explicit final fields also appear when attempting to clone inner objects.

Consider this simple scenario involving inner classes:


class Airplane {
  class Engine {
    void catchOnFire() {
      if (!onFire) {
        ++nbFaults;
        onFire = true;
      }
    }
    private boolean onFire = false;
  }

  Airplane() {
    for (int i = 0; i < engines.length; ++i)
      engines[i] = new Engine();
  }

  boolean warningLightOn() { return nbFaults > 0; }

  Engine[] engines = new Engine[4];
  private int nbFaults = 0;
}
In brief, every Airplane object contains an array of Engine objects. The Engine class is a non-static inner class of the Airplane class; so, there are two-way references between any Airplane object and each of its Engine objects. (Each array element refers to an Engine object, which in turn has a private hidden field that refers back to the Airplane.)

Adding cloning support to the Engine class is simple enough; it may be given a clone method in the canonical form:


class Engine implements Cloneable {
  public Object clone() {
    try {
      return super.clone();
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
  // as before...
}
The Airplane class can be given a straightforward clone method too, remembering to perform a deep copy by also cloning the array object and each of the Engine elements.

class Airplane implements Cloneable {
  public Object clone() {
    try {
      Airplane clone = (Airplane) super.clone();
      clone.engines = (Engine[]) clone.engines.clone();
      for (int i = 0; i < clone.engines.length; ++i)
        clone.engines[i] = (Engine) clone.engines[i].clone();
      return clone;
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
  // as before ...
}
All appears fine until one of the engines catches on fire:

Airplane a = new Airplane();
Airplane b = (Airplane) a.clone();
b.engines[0].catchOnFire();
It is one of b's engines that has caught on fire, but the warning light will come on for airplane a! The fault count was incremented for airplane a rather than airplane b (which experienced the failure) because the cloned Airplane object's inner objects refer to the original Airplane object and not the cloned one! The inner Engine objects were cloned by simply duplicating their in-memory images. This, of course, includes the private hidden outer this reference.
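The misbehavior can be reproduced by assembling the listings above into one runnable sketch. The Airplane and Engine classes are as shown in the column; only the WrongOwnerDemo driver is a hypothetical addition.

```java
class Airplane implements Cloneable {
  class Engine implements Cloneable {
    private boolean onFire = false;

    void catchOnFire() {
      if (!onFire) {
        ++nbFaults;      // shorthand for ++Airplane.this.nbFaults
        onFire = true;
      }
    }

    public Object clone() {
      try {
        return super.clone();  // also copies the hidden outer reference
      } catch (CloneNotSupportedException e) {
        throw new InternalError(e.toString());
      }
    }
  }

  Engine[] engines = new Engine[4];
  private int nbFaults = 0;

  Airplane() {
    for (int i = 0; i < engines.length; ++i)
      engines[i] = new Engine();
  }

  boolean warningLightOn() { return nbFaults > 0; }

  public Object clone() {
    try {
      Airplane clone = (Airplane) super.clone();
      clone.engines = (Engine[]) clone.engines.clone();
      for (int i = 0; i < clone.engines.length; ++i)
        clone.engines[i] = (Engine) clone.engines[i].clone();
      return clone;
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
}

class WrongOwnerDemo {
  public static void main(String[] args) {
    Airplane a = new Airplane();
    Airplane b = (Airplane) a.clone();
    b.engines[0].catchOnFire();
    System.out.println(a.warningLightOn());  // prints true: the wrong airplane
    System.out.println(b.warningLightOn());  // prints false
  }
}
```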

Although an inner object's outer this references may be referred to explicitly,


++Airplane.this.nbFaults;
like the regular this reference, it cannot be assigned to (it's final):

class Engine {
  void setAirplane(Airplane a) {
    Airplane.this = a;  // won't compile
  }
}
This produces a similar compiler error message to the one that you would get from attempting to assign to plain this.

We can solve the problem, though, the same way that we did with the Mutex class. First we replace the Engine class's clone method with a copy constructor (dispensing also with the Cloneable interface):


final class Engine {
  // replaces clone method
  Engine(Engine e) { onFire = e.onFire; }
  // as before...
}
To clone an Engine object in Airplane's clone method we call the copy constructor specifying the outer object that should "own" the new inner one:

for (int i = 0; i < clone.engines.length; ++i)
  clone.engines[i] = clone.new Engine(engines[i]);
Qualifying the new expression by invoking it as if it were a method of the clone Airplane object specifies that the outer object associated with the newly constructed Engine object should be clone. Again, in accordance with Maxim 23, the Engine class has been declared final. If it hadn't, imagine what would happen if an Airplane object had an engine replaced with an instance of a subclass of Engine. In this case, the Airplane cloning operation would perform object-slicing, copying only the superclass portion of the replacement engine and reverting its type back to plain Engine.
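Put together, the corrected classes might look like the following sketch. It follows the fragments above; the explicit no-argument Engine constructor is needed because declaring the copy constructor suppresses the default one, and the RightOwnerDemo driver is a hypothetical addition.

```java
class Airplane implements Cloneable {
  final class Engine {
    private boolean onFire = false;

    Engine() {}                              // no longer supplied by default
    Engine(Engine e) { onFire = e.onFire; }  // replaces the clone method

    void catchOnFire() {
      if (!onFire) {
        ++nbFaults;
        onFire = true;
      }
    }
  }

  Engine[] engines = new Engine[4];
  private int nbFaults = 0;

  Airplane() {
    for (int i = 0; i < engines.length; ++i)
      engines[i] = new Engine();
  }

  boolean warningLightOn() { return nbFaults > 0; }

  public Object clone() {
    try {
      Airplane clone = (Airplane) super.clone();
      clone.engines = (Engine[]) clone.engines.clone();
      // The qualified new expression makes the clone the outer object of each copy.
      for (int i = 0; i < clone.engines.length; ++i)
        clone.engines[i] = clone.new Engine(engines[i]);
      return clone;
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
  }
}

class RightOwnerDemo {
  public static void main(String[] args) {
    Airplane a = new Airplane();
    Airplane b = (Airplane) a.clone();
    b.engines[0].catchOnFire();
    System.out.println(a.warningLightOn());  // prints false
    System.out.println(b.warningLightOn());  // prints true
  }
}
```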

It is a shame that Java does not provide a way to alter the outer object associated with an inner object once it has been created, or to specify one when an inner object is being cloned. As in the case of final fields, this is another case when changes to the language have presented additional impediments to writing clone methods.

Defective Cloning
With these workarounds for cloning, where final fields or inner classes are involved, there remain only two other difficulty areas that deserve some examination. In a future column I'll address the first, by showing how to explicitly disable cloning for classes that have inherited a clone method with no checked exceptions.

The second trouble spot derives from a defect in the original design of the clone method: It is not possible to clone objects through Object references because the clone method is declared protected in Object and, furthermore, is not accessible through the Cloneable interface. Java's designers openly acknowledge the difficulties this imposes, stating:

"The sad fact of the matter is that Cloneable is broken, and always has been. The Cloneable interface should have a public clone method in it. This is a very annoying problem, and one for which there is no obvious fix."⁵

The inaccessibility of the clone method via Object references has implications for performing deep copying on container objects. Generic Object-based containers are prevented from cloning their elements because the clone method is inaccessible. In the words of the Java language developers again, "In fact, it's nearly impossible to do deep copies in Java, as the Cloneable interface lacks a public clone operation."⁴
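To make the restriction concrete, the sketch below shows the two calls that the compiler rejects, followed by one possible stopgap that looks up a public clone method by reflection. This stopgap is an assumption made for illustration and is not the mechanism promised for a future column. The ReflectiveClone class and its cloneOf and deepCopy methods are hypothetical names, and the approach only works for publicly accessible classes that declare a public clone method (it does not handle arrays, String, or the wrapper classes).

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class ReflectiveClone {
  // Hypothetical helper: calls the element's public clone() method by reflection.
  static Object cloneOf(Object o) {
    // Object copy = o.clone();               // won't compile: clone is protected in Object
    // Object copy = ((Cloneable) o).clone(); // won't compile: Cloneable declares no methods
    try {
      Method m = o.getClass().getMethod("clone");  // finds public methods only
      return m.invoke(o);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("no usable public clone(): " + o.getClass(), e);
    }
  }

  // Hypothetical helper: copies a list and clones each of its elements.
  static List<Object> deepCopy(List<?> source) {
    List<Object> copy = new ArrayList<>(source.size());
    for (Object element : source)
      copy.add(cloneOf(element));
    return copy;
  }

  public static void main(String[] args) {
    List<Object> dates = new ArrayList<>();
    dates.add(new Date(0));
    List<Object> copy = deepCopy(dates);
    ((Date) copy.get(0)).setTime(1000);                 // changes only the copy
    System.out.println(((Date) dates.get(0)).getTime()); // prints 0
  }
}
```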

Java's cloning mechanism has numerous defects and is unnecessarily awkward and error prone.¹ Not only that, but cloning has been even more "broken" by the addition of new language features such as final members and inner classes, with apparent disregard for their impact on a facility that already presents a number of problems.

Missing features in the language also make cloning more bothersome than it could be: Java does not support covariant return types, where an overriding method is permitted to return a subclass of the type returned by the method it overrides. This requires callers of the clone method to cast the result to retrieve the type information that was lost when the clone method returned.

The clone method is not supported for many common classes for which it would be useful: notably String and the wrapper classes Integer, Float, etc. Cloning instances of these classes through generic Object references is unnecessarily irregular.

Fortunately, all these problems may be overcome. Providing cloning support is an important part of designing classes of maximum usefulness and reliability. In a future column on cloning, I will present a mechanism that allows objects to be cloned via Object references, one that can be added to any of the Java API container classes that will cause them to perform a deep copy when cloned. You will then have all the weaponry you need to beat Java at its own game.

References

  1. Ball, S., "Effective Cloning," Java Report, Vol. 5, No. 1, Jan. 2000, pp. 60–67.
  2. Ball, S., "Supercharged Strings," Java Report, Vol. 4, No. 2, Feb. 1999, pp. 62–69.
  3. Gosling, J., B. Joy, and G. Steele, Java Language Specification, Addison–Wesley, 1996.
  4. Sun Microsystems Bug Parade, http://developer.java.sun.com/developer/bugParade/bugs4103477.html.
  5. http://developer.java.sun.com/developer/bugParade/bugs/4228604.html.
  6. Source code for the example classes, as well as the benchmarking application for timing the relative efficiency of construction vs. cloning, is available at http://effectivejava.com/column/d-cloning.

FOOTNOTES
* It is a language requirement that any so-called blank finals (final fields that are declared without initial values) are actually assigned values by the end of instance initialization. The JLS³ states that this is enforced both at compile-time and at runtime by the bytecode verifier. However, these checks may be bypassed via the clone method: Simply create a class with final fields that are assigned values in the constructor and call this.clone as the first line of the constructor. The object returned from clone will not have had its final fields initialized and can never have them initialized, since the object has already been constructed. It just goes to show what a kludge Java's cloning mechanism is, in that it behaves virtually indistinguishably from construction but neither enjoys the privileges of constructor methods nor operates under the same constraints. None of the compilers or verifiers of the ten JDKs I tested spotted this hole in the enforcement of the Definite Assignment rules.
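The loophole described in this footnote can be sketched as follows. The Leaky and BlankFinalDemo classes and the static escaped field are hypothetical names chosen for the illustration; the field is static only so that the prematurely cloned object can be inspected afterwards.

```java
class Leaky implements Cloneable {
  static Leaky escaped;  // hypothetical holder for the premature clone
  final int value;       // blank final, assigned in the constructor

  Leaky(int v) {
    try {
      // Cloned before the blank final has been assigned.
      escaped = (Leaky) this.clone();
    } catch (CloneNotSupportedException e) {
      throw new InternalError(e.toString());
    }
    value = v;  // satisfies definite assignment for this object only
  }
}

class BlankFinalDemo {
  public static void main(String[] args) {
    Leaky original = new Leaky(42);
    System.out.println(original.value);       // prints 42
    System.out.println(Leaky.escaped.value);  // prints 0: the final was never assigned
  }
}
```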

The clone method is implemented in C in the Sun JVMs. The method eventually boils down to a call to the C memcpy() function followed by a bit of patching up to register any created references with the garbage collector; so, it really is a very raw duplication of an object's state.