OOSC 2: 25.7 SUBTYPE INHERITANCE AND DESCENDANT HIDING

25.7 SUBTYPE INHERITANCE AND DESCENDANT HIDING

Some of the categories of inheritance are controversial, but not the first one on our list --- probably the only form on which everyone agrees, at least everyone who accepts inheritance: what we may call pure subtype inheritance.

Defining a subtype

As was pointed out in the introduction of inheritance, part of the power of the idea comes from its fusion of a type mechanism, the definition of new types as special cases of existing ones, with a module mechanism, the ability to define a module as an extension of one or more existing modules. Many of the controversial questions about inheritance come from perceived conflicts between these two views. With subtype inheritance there is no such question --- although, as we shall see, this does not mean that everything becomes easy.

Subtype inheritance is closely patterned after the taxonomical principles of natural and mathematical sciences. Every vertebrate is an animal; every mammal is a vertebrate; every elephant is a mammal. Every group (in mathematics) is a monoid; every ring is a group; every field is a ring. Similar examples, of which we saw many in earlier chapters, abound in object-oriented software:

FIGURE ← CLOSED_FIGURE ← POLYGON ← QUADRANGLE ← RECTANGLE ← SQUARE
DEVICE ← FILE ← TEXT_FILE
SHIP ← LEISURE_SHIP ← SAILBOAT
ACCOUNT ← SAVINGS_ACCOUNT ← FIXED_RATE_ACCOUNT

and so on. In any one of these subtype links, we have clearly identified the set of objects that the parent type describes; and we have spotted a subset of these objects, characterized by some properties which do not necessarily apply to all instances of the parent. For example a text file is a file, but it has the extra property of being made of a sequence of characters --- a property that some other files, such as executable binaries, do not possess.

A general rule of subtype inheritance is that the various heirs of a class represent sets of instances disjoint from each other. No closed figure, for example, is both a polygon and an ellipse.

Multiple views

Subtype inheritance is straightforward when a clear criterion exists to classify the variants of a certain notion. But sometimes several qualities vie for our attention. Even in a seemingly easy example such as the classification of polygons, doubt may arise: should we use the number of sides, leading to heirs such as TRIANGLE, QUADRANGLE etc., or should we divide our objects into regular polygons (EQUILATERAL_POLYGON, SQUARE and so on) and irregular ones?

Several strategies are available to address such conflicts. They will be reviewed as part of the study of view inheritance later in this chapter.

Enforcing the subtype view

A type is not just defined as a set of objects, of course: it is also characterized by the applicable operations (the features), and their semantic properties (the assertions: preconditions, postconditions, invariants). We expect the fate of features and assertions in the heir to be compatible with the concept of subtype --- meaning that it must allow us to view any instance of the heir also as an instance of the parent.

The rules on assertions indeed support the subtype view:

The parent's invariant is automatically part of the heir's invariant; so all the constraints that have been specified for instances of the parent also apply to instances of the heir.
A routine precondition applies, possibly weakened, to any redeclaration of the routine: so any call which satisfies the requirement specified for instances of the parent will also satisfy the (equal or weaker) requirement specified for instances of the heir.
A routine postcondition applies, possibly strengthened, to any redeclaration of the routine: so any property of the routine's outcome that has been specified for instances of the parent will be guaranteed to hold as a result of the (equal or stronger) properties specified for instances of the heir.
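
The last two rules can be illustrated by a minimal sketch. The classes PARENT_EXAMPLE and HEIR_EXAMPLE and the routine r are hypothetical names introduced only for this illustration; they do not come from the discussion above.

```eiffel
class PARENT_EXAMPLE feature
    r (x: INTEGER): INTEGER is
            -- Hypothetical routine used only to show the assertion rules.
        require
            x > 0
        do
            Result := x
        ensure
            Result >= 0
        end
end

class HEIR_EXAMPLE inherit
    PARENT_EXAMPLE
        redefine r end
feature
    r (x: INTEGER): INTEGER is
            -- Redeclaration with a weaker precondition
            -- and a stronger postcondition.
        require else
            x = 0
        do
            Result := x + 1
        ensure then
            Result > 0
        end
end
```

Under these assumptions the effective precondition in the heir is "x > 0 or else x = 0", and the effective postcondition is "Result >= 0 and then Result > 0". Any call that was valid for an instance of the parent remains valid for an instance of the heir, and delivers at least what the parent promised.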

For features, the situation is a little more subtle. The subtype view implies that all operations applicable to an instance of the parent should be applicable to an instance of the heir. Internally, this is always true: even in the inheritance of ARRAYED_STACK from ARRAY, which seems far from subtype inheritance, the features of ARRAY were still available to the heir, and indeed were essential to the implementation of its STACK features. But in that case we had hidden all these ARRAY features from the heir's clients, and for good reason (as noted, we do not want a client of a stack class to perform arbitrary operations on the representation, such as directly modifying an array element, since this would be a violation of the class interface).

For pure subtype inheritance we might expect a much stronger rule: that every feature that a client can apply to instances of the parent class also be applicable, by that same client, to instances of the heir. In other words, no descendant hiding: if B inherits f from A, then the export status of f in B is at least as generous as in A. (That is to say: if f was generally exported, it still is; and if it was selectively exported to some classes, it is still exported to them, although it may be exported to more.)
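
As a small sketch of what "at least as generous" could mean in practice, suppose that A exports f selectively to a class C. The client classes C and D are hypothetical and serve only as an illustration; A, B and f are the names used above.

```eiffel
class A feature {C}
    f is
            -- Some operation, exported to C only.
        do
        end
end

class B inherit
    A
            -- f remains available to C and becomes available to D as well.
        export {C, D} f end
end
```

An export clause that dropped C from the list would be a case of descendant hiding, which the stronger rule would forbid.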

The need for descendant hiding

In a perfect world we could indeed enforce the no-descendant-hiding rule; but not in the real world of software development. Inheritance must be usable even for classes written by people who do not have perfect foresight; some of the features they include in a class may not make sense in a descendant written by someone else, later and in a completely different context. We may call such cases taxonomy exceptions. (In a different context the word "exception" would suffice, but we do not want any confusion with the software notion of exception handling as studied in an earlier chapter.)

Should we renounce inheriting from an attractive and useful class simply because of a taxonomy exception, that is to say because one or two of its features are inapplicable to our own clients? This would be unreasonable. We just hide the features from our clients' view, and proceed with our work.

The alternatives have been studied as part of one of the founding principles of object technology --- the Open-Closed principle --- and they are not attractive:

We might modify the original class. This means we may invalidate large amounts of existing software that relied on it --- no, thanks. In most practical cases, anyway, the class will not be ours to modify; we may not even have access to its source form.
We might write a new version of the class (or, if we are lucky and do have access to its source code, make a copy), and modify it. This approach is the pathetic reverse of everything that the object-oriented method promotes; it makes a mockery of the idea of reusability, and of attempts to use a modern, systematic software process.

Avoiding descendant hiding

Before probing further why and when we may need descendant hiding it is essential to note that most of the time we do not. Descendant hiding should remain a technique of last resort. When you have a full grasp of the inheritance structure sufficiently early in the design process, preconditions are a better technique to handle apparent taxonomy exceptions.

Here is an example. Consider a class ELLIPSE. An ellipse has two focuses, with normally a line to connect them:

Class ELLIPSE might correspondingly have a feature focus_line.

It is quite normal to define class CIRCLE as an heir to ELLIPSE: every circle is also an ellipse. But for a circle the two focuses are the same point --- the circle's center --- so there is no focus line. (It is perhaps more accurate to say that there is an infinity of focus lines, including any line that passes through the center, but in practice the effect is the same).

Is this a good example of descendant hiding? In other words, should class CIRCLE make feature focus_line secret, as in


    class CIRCLE inherit
        ELLIPSE
            export {NONE} focus_line end
        ...

Probably not. In this case, the designer of the parent class has all the information at his disposal to determine that focus_line is not applicable to all ellipses. Assuming the feature is a routine, it should have a precondition:

    focus_line is
            -- The line through the two focuses
        require
            not equal (focus_1, focus_2)
        do
            ...
        end

(The precondition could also be abstract, using a function distinct_focuses; this has the advantage that CIRCLE can redefine that function once and for all to yield false.)
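
The abstract variant could be sketched as follows. This is only an illustration under stated assumptions: the types POINT and LINE, the attribute declarations and the empty bodies are hypothetical, and only the names ELLIPSE, CIRCLE, focus_line, focus_1, focus_2 and distinct_focuses come from the discussion.

```eiffel
class ELLIPSE feature
    focus_1, focus_2: POINT
            -- The two focuses (POINT is an assumed type)

    distinct_focuses: BOOLEAN is
            -- Are the two focuses different points?
        do
            Result := not equal (focus_1, focus_2)
        end

    focus_line: LINE is
            -- The line through the two focuses (LINE is an assumed type)
        require
            distinct_focuses
        do
            -- Construction of the line is left out of this sketch.
        end
end

class CIRCLE inherit
    ELLIPSE
        redefine distinct_focuses end
feature
    distinct_focuses: BOOLEAN is
            -- Always false: both focuses coincide with the center.
        do
            Result := False
        end
end
```

With this arrangement focus_line stays exported in CIRCLE, but no correct client can call it on a circle, since the precondition never holds there.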

In such a case the need to provide for ellipses without a focus line follows from a proper analysis of the problem. Writing an ellipse class with a function focus_line that has no precondition would simply be a design error; addressing such an error through descendant hiding would be attempting to cover up for that error. As was pointed out at the end of the presentation of the Open-Closed principle, erroneous designs must be fixed, not patched in descendants.

Using descendant hiding

The focus_line example is typical of taxonomy exceptions arising in application domains such as mathematics which can boast a solid theory with associated classifications, patiently refined over a long period. In such a context, the proper answer is to use a precondition, concrete or abstract, at the place where the original feature appears.

But that technique is not always applicable, especially in domains that are driven by human processes, with their attendant capriciousness that makes it difficult to foresee all possible exceptions.

Consider as an example a class hierarchy, rooted in a class MORTGAGE, in a software system for managing mortgages. The descendants have been organized according to various criteria, such as fixed-rate versus variable-rate, business versus personal or any other that has been found appropriate; we may assume for simplicity that this is a taxonomy of the pure subtype kind. Class MORTGAGE has a procedure redeem, which handles the mechanisms for paying off a mortgage at a certain time earlier than maturation.

Now assume that Congress, in a fit of generosity (or under the pressure of construction lobbies), introduces a new form of government-backed mortgage whose otherwise advantageous conditions carry a provision barring any early redeeming. We have found a proper place in the hierarchy for the corresponding class NEW_MORTGAGE; but what about procedure redeem?

We could use the technique illustrated with focus_line: a precondition. But what if there has never before in banker's memory existed a mortgage that could not be redeemed? Then procedure redeem probably does not have a precondition. (The situation is the same if the precondition existed but was concrete, so that it cannot be redefined in NEW_MORTGAGE.)

So if we decide to use a precondition we must modify class MORTGAGE. As usual, this assumes that we have access to its source code and the right to modify it --- often not true. Suppose, however, that this is not a problem. We will add to MORTGAGE a boolean-valued function redeemable and to redeem a clause

 require
  redeemable

But now we have changed the interface of the class. All the clients of the class and of its numerous descendants have instantly been made potentially incorrect; to observe the specification all calls m.redeem (...) should now be rewritten as

    if m.redeemable then
        m.redeem (...)
    else
        ... (What in the world do we say here?) ...
    end

Initially we do not urgently need to make this change since the incorrectness is potential only: existing software will only use the existing descendants of MORTGAGE, so no harm can result. But not fixing them means leaving a time bomb --- unprotected calls to a precondition-equipped routine --- ticking in our software. As soon as a client developer has the good idea of using a polymorphic attachment with a source of type NEW_MORTGAGE but forgets the test, we have a bug. And the compiler will not produce any diagnostic.
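
For reference, the modification of MORTGAGE discussed above, together with the redefinition that NEW_MORTGAGE would then need, might look like the following sketch. The routine bodies are assumptions made for illustration, and the arguments of redeem, which the text does not specify, are simply left out.

```eiffel
class MORTGAGE feature
    redeemable: BOOLEAN is
            -- May this mortgage be paid off before maturation?
            -- (True by default, matching all previously known mortgages)
        do
            Result := True
        end

    redeem is
            -- Pay off the mortgage before maturation.
            -- (Arguments omitted in this sketch)
        require
            redeemable
        do
            -- Details are left out of this sketch.
        end
end

class NEW_MORTGAGE inherit
    MORTGAGE
        redefine redeemable end
feature
    redeemable: BOOLEAN is
            -- Always false: early redeeming is barred.
        do
            Result := False
        end
end
```

The sketch makes the cost visible: the change is small in the two classes themselves, but every existing client call to redeem is now subject to a precondition it was never written to check.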

The absence of a precondition in the original version of redeem was not a design mistake on the part of the original designers: in their view of the world, until now correct, no precondition was needed. We cannot require every feature to have a precondition; imagine a world in which for every useful f there would be an accompanying boolean-valued function f_feasible serving as its bodyguard; then we would never be able to write a simple x.f for the rest of our lives; each call would be in an if ... or equivalent, as illustrated above for m.redeem. Not fun.

The redeem example is typical of taxonomy exceptions which, unlike focus_line and other cases from perfect-foresight classifications, cannot be addressed through careful a priori precondition design. The observation made earlier fully applies: it would be absurd to renounce inheritance --- the reuse of a rich class structure, lovingly developed and carefully validated --- because a feature or two, out of dozens of useful ones, do not apply to our goal of the moment. We should just use descendant hiding:

    class NEW_MORTGAGE inherit
        MORTGAGE
            export {NONE} redeem end
        ...

No error or anomaly will be introduced in existing software --- the existing class structure or its clients. If a client is modified to include a polymorphic attachment with source type NEW_MORTGAGE, and the target of that attachment is also used with redeem, as in

    m: MORTGAGE ; nm: NEW_MORTGAGE ;
    ...
    m := nm ;
    ...
    m.redeem (...)

then the call becomes a catcall, and the potential error will be caught statically by the extended typing mechanism described in our discussion of typing.

Taxonomies and their limitations

[Taxonomy exceptions are not specific to software examples. Even --- or perhaps especially --- in the most established areas of natural science, it sometimes seems impossible to find a statement of the form "members of the ABC phylum" (or genus, species etc.) "are characterized by property XYZ" that is not prefaced by "most", qualified by "usually", or followed by "except in a few cases". This is true at all levels of the hierarchy. If you think for example that the distinction between the animal and plant kingdoms is simple, just ponder its definition in a popular reference text (italics added):

[QUOTATION TO BE FILLED IN.]

... ... ... ...

Another set of examples is provided by the presentation of the "tree of life" Web archive, an Internet project to establish a general Linnaean classification of living beings (see the bibliographical references). The same comments apply to another area of study, cultural rather than natural, which has also contributed to the development of cladistics (evolution-based taxonomy): the historical classification of human languages.

In zoology a common example, so famous in Artificial Intelligence circles as to have become a cliché, still provides a good illustration of taxonomy exceptions. (Remember, however, that this is only an analogy, not a software example, and so cannot prove anything; it can only help us understand ideas whose relevance has been demonstrated otherwise.) Birds fly; in software terms class BIRD would have a procedure fly. Yet if we wanted a class OSTRICH we would have to admit that ostriches, although among the birdest of birds, do not fly.

We could think of classifying birds into flying and non-flying categories. But this would conflict with other possible criteria including, most importantly, the commonly retained one, shown below.

[NOTE TO THE READERS OF THIS DRAFT: THIS IS FROM A WEB ARCHIVE. PERMISSION TO REPRINT HAS BEEN REQUESTED BUT NOT OBTAINED YET.]

The OSTRICH example has an interesting twist. Although regrettably most of them do not seem to be aware of it, ostriches really should fly. Younger generations lost this ancestral skill through an accident of evolutionary history, but anatomically ostriches have retained most of the aeronautical machinery of birds. This property, which makes the job of the professional taxonomist a little harder (whereas it may facilitate that of his colleague the professional taxidermist), will not in the end prevent him from classifying ostriches among birds.

In software terms OSTRICH will simply inherit from BIRD and hide the inherited fly feature.
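
In the notation used earlier for CIRCLE and NEW_MORTGAGE, this could be sketched as follows. The body of fly is an assumption; only the names BIRD, OSTRICH and fly come from the discussion.

```eiffel
class BIRD feature
    fly is
            -- Take to the air.
        do
            -- Details are left out of this sketch.
        end
end

class OSTRICH inherit
    BIRD
            -- Taxonomy exception: fly is hidden from the clients of OSTRICH.
        export {NONE} fly end
end
```

BIRD and its other descendants are untouched; only clients that deal with an OSTRICH lose access to fly.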

Using descendant hiding

Goethe/XXX citation to be included here, from: Peter Stevens, The Development of Biological Systems --- Antoine-Laurent de Jussieu, Nature and the Natural Systems, Columbia University Press, 1990 [CHECK].]

Descendant hiding, it has already been noted, should remain a rare occurrence. If you design a taxonomy with taxonomy exceptions all over --- well, they are not exceptions any more, so you do not really have much of a taxonomy.

Both software practice and the example of natural science taxonomies, with their history of efforts by intellectual giants (including Aristotle, Buffon, Linnaeus and Darwin) over many centuries, suggest that taxonomy exceptions and the ensuing need for descendant hiding are not just the result of bad design decisions and insufficient foresight, but, more profoundly, a consequence of the intrinsic limitations of our intellectual tools for understanding the world and describing it rationally. Could this be related to the incompleteness results uncovered in the first part of the 20th century in both theoretical physics and foundational mathematics?

In software, for those few cases in which conflicting classification criteria or massive previous work precludes the production of a perfect subtype hierarchy, descendant hiding is more than a convenient facility: it will save your neck.
