Saturday, October 31, 2009

The Power of Generic Programming





How do we make algorithms and data structures as reusable as possible? Classic object orientation offers one solution: interface inheritance and polymorphism. By isolating an algorithm from an implementation through an interface, we are free to change the implementation and apply the algorithm to any implementation without modification. This is by far the most common and effective reuse technique in modern software projects. Most of the reuse techniques I discuss in Chapter 6 involve polymorphism in one way or another.



Generic programming takes a different tack. It maximizes reuse by focusing on the potential usefulness and universal applicability of a piece of code. Such pieces of code, which are not objects in the traditional object-oriented sense, are often referred to as software building blocks. Instead of narrowing an algorithm or data structure to work through a specially defined interface—a conduit of reuse of sorts—software building blocks become reusable through widening: any information about the entities on which they operate is removed from their implementation. In particular, this means the types of these entities are unknown. In terms of reuse, lack of knowledge about an entity is better than specific yet encapsulated knowledge. The less you know about the entity your code eventually will need to work with, the more entities your code eventually will be able to work with. It's actually quite simple.



While object orientation and generic programming achieve some of the same goals, the two technologies are substantially different. An interface is a rather narrow channel of reuse—so narrow, in fact, that the compiler can generate code for a call to an interface method even though the implementation is not known at the call site. That's OK because the only piece of information that needs to be substituted at run time is the address of the method where the implementation resides. Everything else—parameter signatures, calling convention, and so on—must be known at compile time. Software building blocks are quite different: removing type information in its entirety makes it impossible for the compiler to generate the final binary code for the software building block. The compiler can generate this final code only once these types become known in the context of a particular client using the software building block. This is called instantiation of the software building block; the compiler generates the final code at instantiation time. Only when the type of Foo becomes known can an expression such as Foo->Bar() be evaluated. Foo could be a COM+ interface pointer, in which case the statement becomes an invocation of the vtable slot at which Bar resides. Or Foo could be a smart pointer, in which case its user-defined operator -> would be invoked. Or Bar could be a nonvirtual function of a structure, in which case the call compiles to a direct function call, passing Foo in the this pointer slot. And if Bar were a static method, no argument at all would be passed on the stack. I'm sure you've caught on to what I'm trying to express here: if type information is lacking, the compiler cannot possibly generate code for a statement as simple as Foo->Bar(). Therein lies a significant difference between generic programming and polymorphism.



Let's consider an example. Let's say you need a linked list for elements of a certain user-defined type, some sort of structure. But you recognize that a linked list has a generic quality to it, one that has nothing to do with the specific type of element (the struct) you want to store in it for now. You realize that later you might want to have linked lists of integers, strings, objects, and so on. Therefore, you remove knowledge of what you are storing in the linked list from your linked-list implementation. You widen the data structure's ability with respect to the storage elements' type. Every time you instantiate a new linked list (at compile time), you decide which element type to store in it. Thus, the compiler generates new code for each new specialization of your generic linked-list software building block. These versions can be significantly different. For one thing, the amount of space allocated by the compiler for each node varies with the size required for each element instantiation type.



Now let's say that you need an algorithm to sort your linked list. When first pondering your situation, you had planned to implement an algorithm that works by examining specific attributes of your user-defined type and basing sort order on the results of these examinations. You also had planned to make that algorithm a method of your linked-list data structure, in good object-oriented style. But now you pause. You've already decided to create a generic linked-list software building block. Shouldn't all instantiations of this building block be sortable, independent of element type? At first, your problem is that you can't quite see how to implement the algorithm if it has nothing to latch on to. How can you implement sorting if you don't know the sort order because you don't know what you are sorting? But then you realize that at the heart of a sorting algorithm lies the ability to impose a strict weak ordering on the elements being sorted. C++'s operator < is the idiom normally employed to determine such a relationship among elements. So instead of trying to reach into the element type in your sort algorithm, you simply apply operator < to pairs of elements of a type that's now unknown to determine their order. The genius behind this decision is that it works for built-in types such as integers and floats, and the operator can be declared for user-defined types, either as a member of the type or externally as a function. By widening your sort algorithm's interface this way, it suddenly becomes usable with all instantiations of your linked list. How useful and maintainable! And again we see how the compiler generates wildly different versions for each instantiation of your sort algorithm, based on the linked list's element type. In one case, operator < might translate into comparison code built into the compiler for numbers; in another, it will become an invocation of a function you implemented. Perhaps that function will be virtual, perhaps not. Perhaps it will be a member function of a class, perhaps not.



Having realized how generic programming has helped your project so far, you are ready to take your generalizations one step further. You start wondering if it was a good idea to make the sorting algorithm a member function of the generic linked list. Couldn't other data structures be sorted in the same way, by the same algorithm? What about queues, double-ended queues (deques), and stacks? You sense it would be wise to go in that direction, and so you decouple your sort algorithm from the linked list. The sort algorithm is not a member function any more. But you have to remove any instances of the generic linked-list type from the algorithm. Of course, you still need a way to communicate with the linked list. After all, you have to traverse the list, ask for its head, ask for its tail, navigate from one node to another, and so on. So you invent an intermediate generic construct to do this, one that can be supported by any of your data structures, not only the linked list. (STL calls this construct an iterator. We'll discuss this more in the next section.)



You feel like you're on a roll now. Generic programming makes it seem like you can hook up any piece of software with any other piece. (Remember the term software building block?) And it gets even better when you realize that the C++ standard prescribes a specific way of decoupling generic algorithms from generic containers. It achieves the same goals you achieved with the procedure we just stepped through, but by implementing the mechanics of the standard you allow your linked-list data structure to be operated upon by potentially any algorithm implemented by a third party. You might purchase a software library containing some compliant algorithms that you then can apply directly to your linked lists (of any element type). Similarly, you will be able to apply your generic sorting algorithm to any compliant data structure, not just your linked list. This includes data structures that haven't even been invented yet. The potential for reuse is dizzying.



In summary, the generic programming paradigm offers two major advantages. First, software building blocks show unprecedented flexibility and offer many useful combinations, which arise as follows:




  • Generic data structures can be instantiated with an unlimited set of built-in and user-defined types to form new data structures. Each new data structure forms a concrete type that can be used to instantiate other generic data structures. Some highly useful data structures are compositions of generic data structures. (For example, think of hash tables whose buckets must be searched linearly for items with colliding hash keys.)


  • Generic algorithms can be instantiated with an unlimited set of instantiated generic iterators. This quality lets algorithms work with a certain data structure regardless of the type the data structure has been instantiated with.


  • Because the interface to a generic data structure is uniform across data structures from an algorithm's point of view, you can apply almost any algorithm against almost any data structure. As the sorting example showed, this results in a vast number of useful combinations. Finally, it is possible to formulate the abstract essence of an algorithm, independent of any data structure.





Interfaces, polymorphism's reuse vehicle, allow for substitution of functionality through objects only. It takes an object to implement an interface. Generic programming, on the other hand, allows for combination of software building blocks involving any type. We can instantiate our generic linked list into a list of integers as well as one of objects. This allows us to take full advantage of the richness of our programming language with all of its syntactic and semantic features. In C++, this includes function overloading, operator overloading, arrays and the type system in general, inheritance and polymorphism, const-ness, and so on. Generic programming allows you to combine all these elements for maximum effect. And doing so does not require (but does not prevent) the use of inheritance hierarchies. Because multiple facets of behavior can be combined in a single class through manipulation of its syntactic appearance only, generic programming is especially attractive in environments that don't offer multiple implementation inheritance.



Recall some of the uses of C++ templates you have encountered so far in this book, all of them indications of their versatility. In Chapter 6, I used templates to prevent having to maintain the definition of an interface in multiple locations. I also used templates to implement interfaces that had not even been defined yet. In Chapter 7, I showed source code that was able to persist elements of a container to and from a stream, regardless of their type. This wide applicability, which discourages copy and paste while enhancing reuse, is characteristic of software building blocks wherever they appear.



The second major advantage of generic programming is that it produces compiled code that's about as efficient as compiled code gets. The compiler instantiates generic code with specific knowledge of all types involved. There are no layers of virtual function calls. And no code is generated for methods that evaluate to no-ops! For example, consider ATL's FinalRelease method. The method is called by one of the ATL templates CComObject, CComObjectCached, CComObjectNoLock, CComObjectGlobal, CComObjectStack, CComAggObject, CComPolyObject, CComTearOffObject, or CComCachedTearOffObject, all of which either derive from or contain the class that you implement. FinalRelease is called from the destructor of those templates. Now CComObjectRootBase, one of your class's base classes, declares and defines FinalRelease like this: void FinalRelease() {}. This means that if you do not implement FinalRelease in your class (or any of its other base classes) at all, the call to FinalRelease by the ATL template generates absolutely no code. It also means that you can "override" CComObjectRootBase's no-op version of FinalRelease without it being virtual. When your class does implement FinalRelease, the call ATL generates is a simple direct function call to a method with no parameters. No virtual dispatch is involved, yielding the maximum performance theoretically possible. If appropriate, the call might even be made inline! This is in stark contrast to a polymorphic solution, in which every virtual method call incurs the overhead of virtual function dispatching, whether or not an implementation of the method actually does any work.



Software building blocks are efficient because the compiler treats each new instantiation as though all lines of code that reference unknown types were written with the specific substituted types of the instantiation context in place. The removal of type information as an abstraction mechanism, which constitutes the widening of the software building block's interface, therefore imposes absolutely no run-time overhead. This is rather remarkable.



Generic programming is not really a new idea. Research has been performed on the technology for decades. But it turns out that the concepts are not necessarily easy to realize in programming languages with broad universal appeal. Such languages must support many different features to qualify for general-purpose use, and integrating generic programming functionality with the other aspects of a language can be tricky. In addition, any programming language with truly universal appeal will have compilers built and sold by many different vendors. Because those vendors have little experience implementing generic programming functionality, exploiting this feature might not always produce the desired outcome, and results might differ when compiling the same code under different compilers on different platforms. Thus, it takes a mature and stable programming language with years of compiler implementation experience and programmer feedback to build a reliable platform for generic programming. Generic programming in a research environment is rather different from generic programming in the real world and on real projects.



Two widely used programming languages currently incorporate good support for generic programming: Ada with its generic packages, and C++ with its templates. I witnessed the evolution of template support in C++ over the past six years with a mixture of anxiety and excitement. Early implementations in compilers showed so much promise, but real use often could not exceed carefully tested and restricted scenarios because of compiler instabilities. But now we have reached the promised land. Microsoft Visual C++ 6, for example, presents a superb generic programming platform, showing rock-solid stability and a number of advanced features, such as member templates, support for the typename keyword, explicit template instantiation, and default template parameters. Because of its massive user base, it is really C++ we have to thank for moving generic programming from the research laboratory and commercial and government niches into the mainstream of modern software engineering. Right now, at the beginning of the twenty-first century, generic programming is as viable and promising a reuse technology as our engineering discipline has encountered.



Be careful not to mistake polymorphism and generic programming as two alternate approaches to achieving the same goal. Polymorphism is absolutely required whenever you have a client that needs to deal with a set of heterogeneous objects that provide different behavior in a uniform and extensible manner. Perhaps the old shape example illustrates this best: the code rendering a canvas loops through a set of drawing objects and tells each object to draw itself. Different types of objects (rectangle, circle, triangle, and so on) accomplish this in different ways, isolating the caller from the details. This scenario absolutely requires polymorphism. There is nothing generic about shapes in the sense of generic programming. Data structures such as linked lists, hash tables, and AVL trees, on the other hand, should be made entirely generic, and so should the algorithms that operate on them. We simply do not encounter situations in which a client would want to interoperate with an AVL tree through a polymorphic AVL tree interface, interacting with one AVL tree implementation as it would with another. Instead, we want to create AVL trees of different types. No real alternatives to generic programming exist here. The alternatives that do exist are either cumbersome or unmaintainable: using macros, copying and pasting with type substitution, or limiting yourself to pointer types, all at the cost of type safety. You can try to accommodate differing types through explicit dynamic allocation of the right amount of memory for the element type, but doing so is difficult and error prone, and you still lose type safety. In some situations, clients might want to pick and choose from more than one AVL tree implementation, but these tree implementations usually are treated as entirely different data structures—just as a stack and a set are—and never polymorphically.



Similarly, when you need an algorithm to return, for example, the largest in an array of numbers and you need this algorithm to work with various numeric types (such as signed short, unsigned long, float, double, and complex), don't try polymorphism. I suppose you could define an interface with member function IsGreaterThan, create implementations for all the different numeric types, and then wrap each numeric value in your array with an instance of such an object and pass it to the Max algorithm. It would work, but it wouldn't be the best way, because you never needed a single invocation of the algorithm to deal with a heterogeneous set of numeric types. Instead, you would have been content with a different version of the algorithm for each of the types it should be applicable to. But to preserve maintainability, you of course didn't want to create such different versions through copy and paste. This is really where the generic solution, a generic Max algorithm parameterized here on the array element type, shines and displays all the benefits previously advertised. It is much more flexible, since it can be applied to built-in numeric types without needing to wrap them first. And it is vastly more efficient: not only are virtual function calls avoided in all cases, but instantiations for built-in types won't generate any function calls at all. Generic programming is the clear winner here. You should understand how the term generic applies to this example: software building blocks are generic because they can be tailored, automatically by the compiler and without programmer intervention, to work with elements of different and arbitrary types. Each tailoring or instantiation of the generic building block produces a distinct version of its own separate type, homogeneous with respect to the types in the instantiation context. There is no creation of a type through which a client uniformly can interact with different instantiations of the software building block, as with polymorphism. At their core, templates are an extremely sophisticated form of compiler-controlled, automated copy and paste of your source code. Polymorphism is not.



When devising a reuse strategy for a portion of your design, it is best to draw on your experience with the various technological alternatives before choosing an approach. Analyze the situation carefully before committing to inheritance and polymorphism, generic programming, or something entirely different. Sometimes combining technologies can yield the best results. For example, a generic linked list could be instantiated with a particular interface pointer type. A client dealing with such a list instance is taking advantage of a generic software building block. However, accessing each element of the linked list invokes polymorphic behavior facilitated by the element's inheritance structure. Reuse technologies complement one another naturally this way, and hybrid approaches are the rule rather than the exception.



As another example, say that you have some behavior with which you want to extend a number of different user interface controls. Because you are working with an application framework that wraps the Microsoft Windows user interface in an object model (such as MFC), you have implemented your functionality by overriding virtual functions of the window base class from which all controls are derived. But now you have a dilemma: you want to apply your functionality to more than just one type of window. Therefore, you cannot derive your class from just, say, CButton, because then it would not be available to CListBox. But the inheritance hierarchies of CButton and CListBox already are established and fixed, so you can't define your class as a new common base class. The solution to this dilemma is to turn your class into a template and parameterize the base class type. Then you can instantiate new versions of CButton and CListBox—CMyWndExtensions<CButton> and CMyWndExtensions<CListBox>, respectively—and all controls will exhibit the common behavior you have implemented. This is a perfect example of generic programming being used to augment inheritance and polymorphism and make them more reusable.



I like to characterize generic programming as the plug-and-play technology of software engineering. Different software building blocks can be hooked up to one another with results that are predictable and correct, but the particular combinations were not anticipated by the designers of any of the building blocks participating in the composition. In part, generic programming is so appealing because it offers a way to make the compiler do the heavy lifting when making different pieces of software work together. Traditionally this was the role of the programmer. In each new combination of software building blocks, the compiler generates the mortar that makes the building blocks work with one another and defines the characteristics of the new software solution. The technology is stunningly flexible and simultaneously efficient. What's more, it is elegant and functional.


