Tuesday, October 20, 2009

Implementation Inheritance








With an object-oriented language such as C++ or Java, you can provide an interface implementation in a base class. Anyone who wants to reuse the interface implementation then derives from this class, inheriting its data members and behavior. We refer to this technique as implementation inheritance rather than just inheritance, because inheritance also can be used in C++ to indicate that a class will implement a certain interface. In this case, such base classes are sometimes called protocol classes, or simply interfaces, and the mode of inheritance is called interface inheritance. The technique I'd like to discuss here involves the inheritance of a full or partial implementation of an interface, not just the interface contract.



Imagine we apply implementation inheritance to our CAD application example. We construct a base class that derives from IShade, and we implement all of IShade's methods. Some of these implementations will need to gather data from the particular shape object. To this end, we declare pure virtual functions that the shape object must override to deliver this data in some standard format. This solution is better than passing a copy of the data to the base class because it avoids at least two issues:




  • Replication of geometry data in multiple places and the resulting waste of space. Geometry data could occupy significant space.


  • Synchronization of this data in multiple places. What if the shape object were multithreaded? Better to avoid synchronization issues altogether.





A two-dimensional shape object that wants to reuse this implementation then derives a class (perhaps itself) from this base class. The new class implements the pure virtual functions and wires the implementation into its map of supported interfaces (if required by the implementation environment)—and voilà, reuse achieved! In C++, the interface and class definitions might be similar to the following code.





interface IShade : public IUnknown
{
    virtual HRESULT AddLightSource(...) = 0;
    virtual HRESULT AddGradient(...) = 0;

    virtual HRESULT ApplyShade(..., [out, retval] bool* pbResult) = 0;
};

class ATL_NO_VTABLE CShade : public IShade
{
// Construction/destruction
protected:
    CShade();
    ~CShade();  // Nonvirtual: ATL objects destroyed bottom up

// IShade interface
public:
    STDMETHOD(AddLightSource)(...);
    STDMETHOD(AddGradient)(...);

    STDMETHOD(ApplyShade)(..., [out, retval] bool* pbResult);

// Local interface contract for deriving reuser
private:
    virtual CGeometryData& GetGeometryData(...) = 0;


// Implementation
    std::vector<SLightSource> m_cLightSources;
    std::vector<SGradient> m_cGradients;

};

class ATL_NO_VTABLE CEllipse :
    public CComObjectRootEx<CComSingleThreadModel>,
    public CComCoClass<CEllipse, &CLSID_Ellipse>,
    public CShade
{
protected:
    CEllipse();

public:
DECLARE_REGISTRY_RESOURCEID(IDR_ELLIPSE)

BEGIN_COM_MAP(CEllipse)
    COM_INTERFACE_ENTRY(IShade)
END_COM_MAP()

private:
    virtual CGeometryData& GetGeometryData(...);

};



If a dual interface implementation is desired for IShade, CShade can derive from IDispatchImpl<IShade, &IID_IShade, &LIBID_CADDemo> instead of from IShade directly. The COM_INTERFACE_ENTRY(IDispatch) entry can then be added to the COM map of CEllipse. Now let's see how implementation inheritance rates with regard to each of our evaluation criteria for reuse techniques:




  • Reusability: very poor. Any technique that is limited to a single implementation language can't compete with the flexibility of binary compatibility. A base class is very reusable in projects that use the same implementation language, but in no others. See the sidebar "Combining Source and Binary Techniques" for ways to mitigate this drawback.


  • Setup: very good. As you can see in the example, all it takes to reuse the interface implementation is to declare it as a base class. It does not get any easier than this.


  • Adaptability: very good. Under normal circumstances, you expect deriving concrete classes to implement the pure virtual methods of the base to supply required behavior or data. However, a derived class can take charge at any time and implement a method from the interface. It can do so without delegating to the base at all, or it can perform preprocessing and postprocessing around a call to the base. Note that this is all done on a per-method basis, not a per-interface basis. A reuser can pick and choose an arbitrary set of methods from one or more interfaces to completely override. For implementation inheritance, adaptability of a prepackaged solution does not in any way interfere with the isolation of that solution. This very characteristic makes one of the most compelling arguments for source reuse over binary reuse.


  • Isolation: very good. An interface implementation is reused without any dependency on method signatures. Compile and link dependencies on the base class are much greater than is the case with binary reuse, but these dependencies turn out to be manageable in practice because taking care of them is fully automated by development environments. Changes to the declaration of a base class require recompilation and relinking of all derived classes, even if only private elements were changed. Changes to the implementation of methods in the base class merely require that all projects of derived classes be relinked. And even that step can be avoided by placing the base class into a DLL, instead of linking statically or compiling right into the project of each derived class.
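The per-method adaptability described in the list above can be sketched in portable C++. This is only an illustration under stated assumptions: ShadeBase and ClampedShade are hypothetical, COM-free stand-ins for CShade and a derived shape class, and the method parameters are invented because the original listing elides them.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, COM-free stand-in for the article's CShade base class.
class ShadeBase {
public:
    virtual ~ShadeBase() = default;

    // Default behavior that derived classes may reuse unchanged.
    virtual bool AddLightSource(double intensity) {
        if (intensity < 0.0) return false;  // reject invalid input
        lights_.push_back(intensity);
        return true;
    }
    virtual bool AddGradient(const std::string& name) {
        gradients_.push_back(name);
        return true;
    }

    std::size_t LightCount() const { return lights_.size(); }
    std::size_t GradientCount() const { return gradients_.size(); }

private:
    std::vector<double> lights_;
    std::vector<std::string> gradients_;
};

// Overrides one method only; AddGradient is inherited as is.
class ClampedShade : public ShadeBase {
public:
    bool AddLightSource(double intensity) override {
        // Preprocessing: clamp the argument into [0, 1].
        if (intensity < 0.0) intensity = 0.0;
        if (intensity > 1.0) intensity = 1.0;

        const bool ok = ShadeBase::AddLightSource(intensity);  // delegate to the base

        // Postprocessing: keep a count of accepted calls.
        if (ok) ++accepted_;
        return ok;
    }
    int Accepted() const { return accepted_; }

private:
    int accepted_ = 0;
};

// Usage: one overridden method, one inherited method.
inline void DemoPerMethodOverride() {
    ClampedShade shade;
    assert(shade.AddLightSource(5.0));    // clamped to 1.0, then accepted by the base
    assert(shade.AddGradient("linear"));  // untouched base behavior
    assert(shade.LightCount() == 1 && shade.Accepted() == 1);
}
```

The derived class touches only the method it cares about. Everything else in the base remains in force without any forwarding code.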



Even though our example does not show it, a base class can be given just as much control over the interfaces it wants to expose as an aggregatee. And a derived class has as much flexibility when it comes to deciding which interfaces to let a base expose as an aggregator. This includes exposing only individual interfaces; exposing all interfaces, whether or not they exist when the derived class is compiled; or perhaps blocking selected interfaces from being exposed via the base. The key to achieving this degree of isolation is inserting the macro COM_INTERFACE_ENTRY_CHAIN into the COM map of a derived class written with ATL. For it to work, the base class must contain a COM map of its own.
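The chaining idea can be illustrated without ATL. The sketch below is a portable analogy, not ATL code: interface IDs are plain strings, Supports is a hypothetical stand-in for an interface lookup, and the names IShadeDebug and IEllipse are invented for the example. The derived class answers for its own interfaces, blocks one selected interface, and passes every other request on to the base, including requests for interfaces the base might gain later.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical base class with its own "map" of interfaces it can expose.
class ShadeBase {
public:
    virtual ~ShadeBase() = default;
    virtual bool Supports(const std::string& iid) const {
        return iid == "IShade" || iid == "IShadeDebug";
    }
};

// The derived class decides what the outside world may reach.
class Ellipse : public ShadeBase {
public:
    bool Supports(const std::string& iid) const override {
        if (iid == "IEllipse") return true;          // the derived class's own interface
        if (blocked_.count(iid) != 0) return false;  // blocked before the base is asked
        return ShadeBase::Supports(iid);             // chain everything else to the base
    }

private:
    std::set<std::string> blocked_{"IShadeDebug"};
};

// Usage: the base exposes IShadeDebug, but not through Ellipse.
inline void DemoChaining() {
    Ellipse e;
    assert(e.Supports("IShade"));        // exposed via the chain
    assert(e.Supports("IEllipse"));      // exposed directly
    assert(!e.Supports("IShadeDebug"));  // blocked by the derived class
}
```

Because the derived class forwards unknown requests instead of listing base interfaces one by one, it does not need to be edited when the base adds an interface.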





Inheritance is a concept that is firmly entrenched in object-oriented analysis and design. Its conceptual and practical value is undisputed. But like most powerful techniques, inheritance is not completely without problems. The most significant issue involves the kind of coupling that can develop between a derived class and its base. A base class provides a service to a derived class. The reverse is frequently true as well; we saw an instance of this in our example, where the derived class supplies geometry data to the base. When the interface to this service and the need for supplementation by the derived class are specified in the base class, it is not always clear which methods will be overridden and in what manner. However, because the implementation of the base needs certain functionality immediately, developers place code in virtual functions that the base class calls virtually. But instead of precisely specifying what the base class needs from those functions, developers treat them as though they were private nonvirtual functions, mere extensions of the base class's virtual functions that call them. This leads to extremely tight coupling between virtual functions in the base class. The coupling arises because developers are trying to create flexibility by constructing separately overridable parts without understanding exactly what the adaptability needs of the deriving clients are. This vagueness expresses itself in the coupling between the different virtual functions and in their documentation.



The first sign of trouble rears its head when a developer who implements a derived class and tries to override a method in the base is unsure whether she should call the base implementation. And if she should, then should she do it before her overriding code, afterward, or perhaps somewhere in the middle? Ah, surely looking at the base class's source code will tell. And all of a sudden the derived class becomes coupled to implementation details in the base class. Oh, and it won't be long until we reach this conclusion: implementation inheritance requires source code to be available for inspection.



When software modules become tightly coupled and their interfaces are unclear, module boundaries vanish. The whole project becomes one large piece of code that must be maintained as a unit. In the case of implementation inheritance, this phenomenon is called the fragile base class problem. When implementation details of a base class change, its derived classes fail to function properly because they were coupled to those details. The reverse is true as well: derived classes implement virtual functions according to how their developer interpreted the needs of the base, on the basis of source code examination, of course. This puts restrictions on the manner in which the base can call its own virtual functions. Base class developers of course are unaware of these implicit restrictions, and objects break when developers violate them. Derived classes become inseparably coupled to implementation details of the base class, and the base class becomes coupled to implementation details of derived classes.
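A small, self-contained sketch makes the failure mode concrete. All names here (BagV1, BagV2, CountingBag) are hypothetical and not from the original example. BagV1 and BagV2 stand for two releases of the same base class with an identical public contract. The derived class was written after reading the source of the first release and silently depends on the fact that AddAll happens to call the virtual Add.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Release 1 of the base: AddAll happens to be implemented in terms of Add.
class BagV1 {
public:
    virtual ~BagV1() = default;
    virtual void Add(int v) { items_.push_back(v); }
    void AddAll(const std::vector<int>& vs) {
        for (int v : vs) Add(v);  // virtual call, an undocumented detail
    }
    std::size_t Size() const { return items_.size(); }

private:
    std::vector<int> items_;
};

// Release 2: same public contract, but AddAll no longer routes through Add.
class BagV2 {
public:
    virtual ~BagV2() = default;
    virtual void Add(int v) { items_.push_back(v); }
    void AddAll(const std::vector<int>& vs) {
        items_.insert(items_.end(), vs.begin(), vs.end());
    }
    std::size_t Size() const { return items_.size(); }

private:
    std::vector<int> items_;
};

// Derived class written against release 1. It counts insertions by overriding
// Add only, assuming every insertion passes through Add.
template <class Base>
class CountingBag : public Base {
public:
    void Add(int v) override {
        ++added_;
        Base::Add(v);
    }
    int Added() const { return added_; }

private:
    int added_ = 0;
};

// Usage: the same derived source behaves differently on the two releases.
inline void DemoFragileBase() {
    CountingBag<BagV1> before;
    before.AddAll({1, 2, 3});
    assert(before.Added() == 3);  // correct, but only by accident

    CountingBag<BagV2> after;
    after.AddAll({1, 2, 3});
    assert(after.Size() == 3);   // the base still does its job
    assert(after.Added() == 0);  // the derived class silently miscounts
}
```

Nothing in the base class's declared interface changed between the two releases, yet the derived class broke. The dependency lived entirely in an unstated implementation detail.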



The fragile base class problem has been quoted as the reason why COM did not and will not support implementation inheritance. As early as 1994, Sara Williams and Charlie Kindel of Microsoft made the following statement in a paper titled "The Component Object Model: A Technical Overview" (http://msdn.microsoft.com/library/techart/msdn_comppr.htm):



The problem with implementation inheritance is that the "contract" or relationship between components in an implementation hierarchy is not clearly defined; it is implicit and ambiguous. When the parent or child component changes its behavior unexpectedly, the behavior of related components may become undefined. This is not a problem when the implementation hierarchy is under the control of a defined group of programmers who can make updates to all components simultaneously. But it is precisely this ability to control and change a set of related components simultaneously that differentiates an application, even a complex application, from a true distributed object system. So while implementation inheritance can be a very good thing for building applications, it is not appropriate for a system object model that defines an architecture for component software.



Granted, building large, distributed systems from tightly coupled objects with weak interfaces is impossible. However, nothing is innately less clear about the interface contract between base and derived class than that between two arbitrary objects. The challenge in any system is to produce cohesive, loosely coupled abstractions with precise interfaces. This challenge can be met in inheritance hierarchies just as it can between otherwise unrelated objects.



A useful behavioral pattern for factoring out commonality and forming a clear contract between base and derived classes is the Template Method pattern [5]. Its authors suggest that leaving virtual functions called by the base unimplemented in the base can enhance the clarity of the base class's interface. I find that meeting the following conditions when building a base class interface leads to only the expected coupling with derived classes, as set forth in the interface contract:




  • A clear understanding of the abstraction the base class is supposed to provide, along with a crisp definition of its services.


  • An explicitly defined strategy for precisely where and how the Template Method pattern needs to be applied to supplement data and behavior to the base implementation.


  • A clear statement of what in the base implementation is overridable. A clear definition of what overrides must provide.


  • Rigorous, accurate, and precise documentation of the entire interface contract, in the same manner as you would document a COM+ interface.





As long as developers follow these rules, the encapsulation of base classes will be entirely preserved. Developers implementing overrides will be just as likely or unlikely to reach for the source code of the overridden class as they are to look into the source of objects that implement a COM+ interface. Coupling between base and derived classes therefore will not exceed the typical coupling between an aggregator and aggregatee, or a container and containee.
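The conditions listed above can be sketched as a small Template Method arrangement. This is an illustration under assumptions, not the author's code: ShadeBase, GetOutline, and Scale are hypothetical names, and the "shading" is reduced to a sum so that the example stays short. The public entry point is nonvirtual, so the base fixes the order of steps. The hooks are private, and their comments carry the contract that overrides must meet.

```cpp
#include <cassert>
#include <vector>

class ShadeBase {
public:
    virtual ~ShadeBase() = default;

    // Template method: nonvirtual, so derived classes cannot reorder the steps.
    bool ApplyShade() {
        const std::vector<double> outline = GetOutline();  // required hook
        if (outline.empty()) return false;                 // nothing to shade

        double sum = 0.0;
        for (double x : outline) sum += x;
        last_ = sum * Scale();                             // optional hook
        return true;
    }
    double LastResult() const { return last_; }

private:
    // Contract: called exactly once per ApplyShade. Must return the outline
    // samples; an empty vector means "no geometry". Must not call ApplyShade.
    virtual std::vector<double> GetOutline() const = 0;

    // Contract: optional. Returns a factor applied to the result. The default
    // leaves the result unchanged.
    virtual double Scale() const { return 1.0; }

    double last_ = 0.0;
};

// A reuser supplies exactly what the contract asks for and nothing more.
class Ellipse : public ShadeBase {
private:
    std::vector<double> GetOutline() const override { return {1.0, 2.0, 3.0}; }
    double Scale() const override { return 2.0; }
};

// Usage: the derived class never has to decide when to call the base.
inline void DemoTemplateMethod() {
    Ellipse e;
    assert(e.ApplyShade());
    assert(e.LastResult() == 12.0);  // (1 + 2 + 3) * 2
}
```

Because the hooks state what they must return and when they are called, an implementer has no reason to read the body of ApplyShade, and the question of whether to call the base before or after the override never arises.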



When reused services break, all reusers in a distributed system are affected. This tends to cause a chain reaction with fatal effects on the system as a whole. Tight coupling makes systems unmaintainable and therefore increases the likelihood of such failure. But this is equally true for services reused through implementation inheritance as for those reused through a binary mechanism. The mere fact that an interface exists between two COM+ objects does not mean that these objects are not coupled to each other's implementation details. Developers routinely read into a syntactical interface contract semantic details that were not intended by the publisher of the contract. And developers do examine the source of COM+ objects when they find that an interface's documentation is unclear. There is nothing inherently better about COM+ interfaces that prevents a system from becoming tightly coupled and fragile. Systems can break—and do, as I have frequently witnessed—when implementation details of COM+ objects change. The more heavily reused such components are, the more likely failure is and the more widespread the failure tends to be. To prevent this, the architect must specify the semantics of an interface as precisely as possible. COM+ interface architects and base class designers alike must meet this challenge.



I see absolutely no conceptual tension between implementation inheritance and large, loosely coupled, distributed systems—even those that have portions subject to completely isolated maintenance and development. I therefore am unconvinced by Williams' and Kindel's argument and find implementation inheritance appropriate for an object model that defines an architecture for component software, at least from the standpoint of a coupling argument. In fact, I encourage you to use implementation inheritance in your own large system architectures. Based on the analysis I performed earlier, I believe that the strategy will lead to the most maintainable architecture, which simultaneously exhibits the greatest amount of reuse possible. Removing implementation inheritance from an architecture for fear of tight coupling is a cure that's worse than the disease. Forcing all abstractions to be concrete ones leads to awkward designs and unmaintainable patterns of object interactions. In my own practical experience, implementation inheritance can't be beaten as a reuse technique.



In the end, however, I find myself agreeing that implementation inheritance indeed is not a concept COM+ should tackle—but for a reason other than coupling. The fact that COM+ specifies object relationships at the binary level is the source of its power and its flexibility. This specification makes it easy for implementation environments to join the COM+ world, and each new language option makes the technology stronger and more compelling. Sticking with the relatively simple run-time entity of a component object makes providing COM+ bindings an achievable task. But inheritance is an object-oriented source language concept whose binary manifestation can be very different from language to language. Precisely because implementation inheritance does not appear to have a canonical, pseudo-standard, binary representation (as was the case with virtual functions and protocol classes when COM was defined), it is best left unregulated by a binary compatibility standard. Doing otherwise would weaken the loving, accepting, accommodating character COM+ has become known for.



Before we move on to the next topic, I would like to cite an almost perverse but nevertheless useful application of implementation inheritance. It is common for implementers of certain interfaces to provide only partial implementations. This means the object will provide substantial implementations for only a subset of the interface's methods and return E_NOTIMPL from the rest. IStream is an example of an interface frequently treated in this fashion. An implementer might take care of Read, Write, and Seek but provide no transactional support at all. If you have more than one of these partial implementations, repeating the declaration of all methods can become tedious. This is especially true if the interface is your own and is still under development. You might then find yourself adding and deleting E_NOTIMPL implementations in various places, whenever you add or remove a method. In such a case, it is convenient to build one implementation of the interface that returns E_NOTIMPL from all methods and shim that class between your partial implementation and the interface by using implementation inheritance. You are now free to declare only those methods that you actually will provide implementations for.
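The shim described above might look like the following portable sketch. The names are assumptions made for illustration: IByteStream is a hypothetical, heavily simplified interface in the spirit of IStream (not its real signatures), and the Status enumeration stands in for HRESULT values such as S_OK and E_NOTIMPL so that the code compiles without the Windows SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for HRESULT codes (S_OK, E_NOTIMPL, E_INVALIDARG).
enum class Status { Ok, NotImpl, InvalidArg };

// Hypothetical interface; every method is pure virtual.
struct IByteStream {
    virtual ~IByteStream() = default;
    virtual Status Read(std::string& out, std::size_t count) = 0;
    virtual Status Write(const std::string& data) = 0;
    virtual Status Seek(std::size_t position) = 0;
    virtual Status Commit() = 0;
    virtual Status Revert() = 0;
};

// The shim: one class that reports "not implemented" from every method.
class NotImplByteStream : public IByteStream {
public:
    Status Read(std::string&, std::size_t) override { return Status::NotImpl; }
    Status Write(const std::string&) override { return Status::NotImpl; }
    Status Seek(std::size_t) override { return Status::NotImpl; }
    Status Commit() override { return Status::NotImpl; }
    Status Revert() override { return Status::NotImpl; }
};

// A partial implementation declares only the methods it actually provides.
class MemoryStream : public NotImplByteStream {
public:
    Status Read(std::string& out, std::size_t count) override {
        out = buffer_.substr(pos_, count);  // pos_ never exceeds buffer_.size()
        pos_ += out.size();
        return Status::Ok;
    }
    Status Write(const std::string& data) override {
        buffer_.replace(pos_, data.size(), data);  // overwrite, growing if needed
        pos_ += data.size();
        return Status::Ok;
    }
    Status Seek(std::size_t position) override {
        if (position > buffer_.size()) return Status::InvalidArg;
        pos_ = position;
        return Status::Ok;
    }

private:
    std::string buffer_;
    std::size_t pos_ = 0;
};

// Usage: transactional methods fall through to the shim.
inline void DemoNotImplShim() {
    MemoryStream memory;
    IByteStream& stream = memory;
    assert(stream.Write("hello") == Status::Ok);
    assert(stream.Seek(1) == Status::Ok);
    std::string out;
    assert(stream.Read(out, 3) == Status::Ok && out == "ell");
    assert(stream.Commit() == Status::NotImpl);  // inherited from the shim
}
```

When a method is added to or removed from the interface during development, only the shim has to change. The partial implementations stay as they are unless they choose to support the new method.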


