Tuesday, October 27, 2009

Item 57: Provide a readResolve method when necessary








Item 2 describes the Singleton pattern and gives the following example of a singleton class. This class restricts access to its constructor to ensure that only a single instance is ever created:





public class Elvis {
    public static final Elvis INSTANCE = new Elvis();

    private Elvis() {
        ...
    }

    ... // Remainder omitted
}


As noted in Item 2, this class would no longer be a singleton if the words "implements Serializable" were added to its declaration. It doesn't matter whether the class uses the default serialized form or a custom serialized form (Item 55), nor does it matter whether the class provides an explicit readObject method (Item 56). Any readObject method, whether explicit or default, returns a newly created instance, which will not be the same instance that was created at class initialization time. Prior to the 1.2 release, it was impossible to write a serializable singleton class.
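The loss of the singleton property can be observed with a small round trip through the serialization facility. The following is a minimal sketch, not code from the original text: the names BrokenElvis, BrokenSingletonDemo, and roundTrip are hypothetical, and the class deliberately omits a readResolve method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for Elvis with "implements Serializable" added
// and no readResolve method.
class BrokenElvis implements Serializable {
    private static final long serialVersionUID = 1L;

    public static final BrokenElvis INSTANCE = new BrokenElvis();

    private BrokenElvis() { }
}

class BrokenSingletonDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(BrokenElvis.INSTANCE);
        // Deserialization created a second instance without calling
        // the private constructor.
        System.out.println(copy == BrokenElvis.INSTANCE); // false
    }
}
```

The deserialized object is a distinct BrokenElvis, so any code that compares against INSTANCE by identity will no longer behave as intended.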



In the 1.2 release, the readResolve feature was added to the serialization facility [Serialization, 3.6]. If the class of an object being deserialized defines a readResolve method with the proper declaration, this method is invoked on the newly created object after it is deserialized. The object reference returned by this method is then returned in lieu of the newly created object. In most uses of this feature, no reference to the newly created object is retained; the object is effectively stillborn, immediately becoming eligible for garbage collection.



If the Elvis class is made to implement Serializable, the following readResolve method suffices to guarantee the singleton property:





private Object readResolve() throws ObjectStreamException {
    // Return the one true Elvis and let the garbage collector
    // take care of the Elvis impersonator.
    return INSTANCE;
}


This method ignores the deserialized object, simply returning the distinguished Elvis instance created when the class was initialized. Therefore the serialized form of an Elvis instance need not contain any real data; all instance fields should be marked transient. This applies not only to Elvis, but to all singletons.
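Putting the pieces together, a complete serializable singleton might look like the sketch below. The field favoriteSongs and the helper class ElvisDemo are assumptions added for illustration; only INSTANCE, the private constructor, and readResolve come from the text above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

class Elvis implements Serializable {
    private static final long serialVersionUID = 1L;

    public static final Elvis INSTANCE = new Elvis();

    // Hypothetical instance field. It is transient because the serialized
    // form of a singleton need not carry any real data.
    private transient String[] favoriteSongs = { "Hound Dog", "Heartbreak Hotel" };

    private Elvis() { }

    private Object readResolve() throws ObjectStreamException {
        // Return the one true Elvis and let the garbage collector
        // take care of the Elvis impersonator.
        return INSTANCE;
    }
}

class ElvisDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // readResolve replaces the freshly deserialized object with INSTANCE.
        System.out.println(roundTrip(Elvis.INSTANCE) == Elvis.INSTANCE); // true
    }
}
```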



A readResolve method is necessary not only for singletons, but for all other instance-controlled classes, that is, for all classes that strictly control instance creation to maintain some invariant. Another example of an instance-controlled class is a typesafe enum (Item 21), whose readResolve method must return the canonical instance representing the specified enumeration constant. As a rule of thumb, if you are writing a serializable class that contains no public or protected constructors, consider whether it requires a readResolve method.
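For a typesafe enum, one way to find the canonical instance is to serialize an ordinal and use it as an index into a private array of constants. The sketch below assumes a hypothetical Suit class in that style; the class, its fields, and the helper SuitDemo are illustrative rather than taken from Item 21.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical typesafe enum: a class with a fixed set of instances.
class Suit implements Serializable {
    private static final long serialVersionUID = 1L;

    // Declared before the constants so that it is initialized first.
    private static int nextOrdinal = 0;

    private final transient String name;       // not part of the serialized form
    private final int ordinal = nextOrdinal++; // the only serialized field

    private Suit(String name) { this.name = name; }

    public static final Suit CLUBS = new Suit("clubs");
    public static final Suit DIAMONDS = new Suit("diamonds");
    public static final Suit HEARTS = new Suit("hearts");
    public static final Suit SPADES = new Suit("spades");

    private static final Suit[] VALUES = { CLUBS, DIAMONDS, HEARTS, SPADES };

    @Override public String toString() { return name; }

    private Object readResolve() throws ObjectStreamException {
        // Map the deserialized ordinal back to the canonical constant.
        return VALUES[ordinal];
    }
}

class SuitDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Suit.HEARTS) == Suit.HEARTS); // true
    }
}
```

Because the canonical constant is returned, clients can keep comparing Suit values with the == operator after deserialization.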



A second use for the readResolve method is as a conservative alternative to the defensive readObject method recommended in Item 56. In this approach, all validity checks and defensive copying are eliminated from the readObject method in favor of the validity checks and defensive copying provided by a normal constructor. If the default serialized form is used, the readObject method may be eliminated entirely. As explained in Item 56, this allows a malicious client to create an instance with compromised invariants. However, the potentially compromised deserialized instance is never placed into active service; it is simply mined for inputs to a public constructor or static factory and discarded.



The beauty of this approach is that it virtually eliminates the extralinguistic component of serialization, making it impossible to violate any class invariants that were present before the class was made serializable. To make this technique concrete, the following readResolve method can be used in lieu of the defensive readObject method in the Period example in Item 56:





// The defensive readResolve idiom
private Object readResolve() throws ObjectStreamException {
    return new Period(start, end);
}


This readResolve method stops both of the attacks described in Item 56 dead in their tracks. The defensive readResolve idiom has several advantages over a defensive readObject. It is a mechanical technique for making a class serializable without putting its invariants at risk. It requires little code and little thought, and it is guaranteed to work. Finally, it eliminates the artificial restrictions that serialization places on the use of final fields.
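For context, here is how the idiom might sit inside a complete class. This is a sketch that assumes Period is a final class with two Date fields whose constructor makes defensive copies and checks that the start does not follow the end; the accessor names and the helper PeriodDemo are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Date;

final class Period implements Serializable {
    private static final long serialVersionUID = 1L;

    // The fields can stay final because no readObject method assigns them.
    private final Date start;
    private final Date end;

    public Period(Date start, Date end) {
        // Copy first, then validate the copies.
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.compareTo(this.end) > 0)
            throw new IllegalArgumentException(this.start + " after " + this.end);
    }

    public Date start() { return new Date(start.getTime()); }
    public Date end() { return new Date(end.getTime()); }

    // The defensive readResolve idiom: the deserialized object is only
    // mined for constructor arguments and then discarded.
    private Object readResolve() throws ObjectStreamException {
        return new Period(start, end);
    }
}

class PeriodDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Period original = new Period(new Date(0L), new Date(1000L));
        Period copy = (Period) roundTrip(original);
        System.out.println(copy != original);                           // true
        System.out.println(copy.start().equals(original.start()));      // true
    }
}
```

Every Period that leaves deserialization has passed through the public constructor, so it holds freshly copied Date objects that have been checked against the class invariant.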



While the defensive readResolve idiom is not widely used, it merits serious consideration. Its major disadvantage is that it is not suitable for classes that permit inheritance outside of their own package. This is not an issue for immutable classes, as they are generally final (Item 13). A minor disadvantage of the idiom is that it slightly reduces deserialization performance because it entails creating an extra object. On my machine, it slows the deserialization of Period instances by about one percent when compared to a defensive readObject method.



The accessibility of the readResolve method is significant. If you place a readResolve method on a final class, such as a singleton, it should be private. If you place a readResolve method on a nonfinal class, you must carefully consider its accessibility. If it is private, it will not apply to any subclasses. If it is package-private, it will apply only to subclasses in the same package. If it is protected or public, it will apply to all subclasses that do not override it. If a readResolve method is protected or public and a subclass does not override it, deserializing a serialized subclass instance will produce a superclass instance, which is probably not what you want.
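The difference between a private and a protected readResolve can be seen by deserializing subclass instances. The sketch below uses hypothetical classes (ProtectedBase, ProtectedDerived, PrivateBase, PrivateDerived) that exist only to illustrate the rules just described.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Nonfinal class whose readResolve is inherited by subclasses.
class ProtectedBase implements Serializable {
    private static final long serialVersionUID = 1L;
    final int value;

    ProtectedBase(int value) { this.value = value; }

    protected Object readResolve() throws ObjectStreamException {
        return new ProtectedBase(value);
    }
}

class ProtectedDerived extends ProtectedBase {
    private static final long serialVersionUID = 1L;
    ProtectedDerived(int value) { super(value); }
}

// Nonfinal class whose readResolve applies to this class only.
class PrivateBase implements Serializable {
    private static final long serialVersionUID = 1L;
    final int value;

    PrivateBase(int value) { this.value = value; }

    private Object readResolve() throws ObjectStreamException {
        return new PrivateBase(value);
    }
}

class PrivateDerived extends PrivateBase {
    private static final long serialVersionUID = 1L;
    PrivateDerived(int value) { super(value); }
}

class AccessibilityDemo {
    // Serializes an object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The inherited protected readResolve turns the subclass instance
        // into a superclass instance.
        Object a = roundTrip(new ProtectedDerived(7));
        System.out.println(a.getClass() == ProtectedBase.class);  // true

        // The private readResolve is not applied to the subclass.
        Object b = roundTrip(new PrivateDerived(7));
        System.out.println(b.getClass() == PrivateDerived.class); // true
    }
}
```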



The previous paragraph hints at the reason the readResolve method may not be substituted for a defensive readObject method in classes that permit inheritance. If the superclass's readResolve method were final, it would prevent subclass instances from being properly deserialized. If it were overridable, a malicious subclass could override it with a method returning a compromised instance.



To summarize, you must use a readResolve method to protect the "instance-control invariants" of singletons and other instance-controlled classes. In essence, the readResolve method turns the readObject method from a de facto public constructor into a de facto public static factory. The readResolve method is also useful as a simple alternative to a defensive readObject method for classes that prohibit inheritance outside their package.




