This site contains older material on Eiffel. For the main Eiffel page, see http://www.eiffel.com.

Beware of polymorphic catcalls

Bertrand Meyer, ISE

This technical note refers to the OOPSLA and TOOLS PACIFIC presentation on static typing, whose text is now also available on-line. If you have not attended one of these talks you should most likely read their text first; otherwise the present document probably will not make much sense!

Disclaimer, copyright, acknolwledgments

This note is personal research work by the author and does not in any way commit ISE.

It is copyrighted and may not be reproduced or distributed without the author's permission.

Successive drafts benefited from comments and corrections from Michael Schweitzer, Eric Bezault, James McKim, Roger Browne, Rock Howard, Peter Wegner, Kim Bruce and Fabrice Franceschi.

Presentation

What follows is only the results, with some explanations. A more extensive discussion may be found in the paper mentioned above; the full discussion is in the book Object-Oriented Software Construction, second edition. Some readers might find the rules objectionable until they have seen that discussion.

Note that in spite of what one may think at first the mechanism seems implementable as part of an incremental compilation process (as opposed to system validity as defined in chapter 22 of the book Eiffel: The Language, hereafter ETL, which only works on an entire system).

The idea is, roughly speaking, to avoid the transitive closure algorithm of ETL-22 by being more pessimistic, while still not excluding any reasonable construction.

Informal explanation.
Rules and definitions.
Notes.
The case of external functions and precompiled libraries.

Part 1: Informal explanation

The idea is very simple; it states that for an entity p you can have at most one of the following two (each common and useful) situations:

p is polymorphic, i.e. there is an assignment of the form p := q where q is of a type other than that of p.
p is the target of a ``catcall'', that is to say a call of the form p.f (...) where some descendant of the class that defines f either hides f from C or changes a formal argument type covariantly.

The rules below are but a precise formalization of this principle, complemented by a similar rule avoiding potentially dangerous creations.

Part 2: Rules and definitions

System Validity Rule (see notes).

A system is valid if and only if it includes no polymorphic catcall, broken assignment or broken creation instruction.

Definition 1: Polymorphic entity or function (see notes).

An entity or function x of type T appearing in a class C is polymorphic if and only if T is not expanded and one of the following conditions holds:

T is a formal generic parameter of C.
x is a formal argument of a routine of C.
x is Current.
x is an external function.
One of the routines of C contains a creation instruction of targetx and creation type other than T.
One of the routines of C contains a polymorphic assignment of target x (or, if x is a function, of target Result in the body of x).

Definition 2: Polymorphic assignment (see notes).

An instruction is a polymorphic assignment if it is an assignment or assignment attempt, with target of type T and source y of type S, satisfying the following conditions:

S is not the same as T and is neither expanded, nor anchored, nor NONE.
y is a polymorphic entity or a call to a polymorphic function.

Definition 3: Polymorphic call (see notes).

A call of target t is polymorphic if and only if its target is a polymorphic entity or a call to a polymorphic function.

Definition 4: Catcall (see notes).

A call to a feature f of a class B, appearing in a class C, is a catcall if and only if one of the following properties holds for a proper descendant D of B other than NONE:

D changes the export status of f, making it unavailable to C.
D redeclares f such that the call's actual argument signature does not conform to the redeclared formal argument signature.
The version of f in D has a formal argument of a type anchored either to Current or to a feature that D redeclares to a type to which the corresponding actual argument's type does not conform.

Definition 5: Broken assignment

An assignment appearing in a routine r of a class C, with target t and source type S, is broken if and only if it satisfies the following conditions:

t is an attribute.
r is inherited, and C does not redefine it.
C redefines t to a new type to which S does not conform.

Definition 6: Polymorphic creation instruction (see notes).

A creation instruction of target t and creation type T appearing in a routine r of a class D is polymorphic if and only if it satisfies one of the following conditions:

T is anchored.
[[ T is a formal generic parameter.]]
r is an inherited routine not redefined in D, t is an attribute, and D redefines t.

Definition 7: Broken creation instruction (see notes).

A creation instruction i appearing in a routine r of a class D, with creation procedure p and target t of type T in D, is broken if and only if it satisfies the following conditions, where p is the creation procedure and C is the base class of the creation type [[or, if the creation type is a formal generic parameter, any effective proper descendant of its base class other than NONE]]:

i is polymorphic in D.
The version of p in C is not available to D for creation, or has a formal argument signature to which the actual argument signature of the Creation_call does not conform.

Part 3: Notes

Notes on the System Validity Rule

This rule eliminates the definition of system-level validity in ETL.

0.A - The general convention in ETL (General Validity Rule, VBGV, page 28) is that for any construct C a validity constraint of the form ``A specimen of C is valid if and only if XXX'' includes an implicit addition of the form ``... and all its constituents are valid''. So because ``system'' is the highest level structuring concept in Eiffel the above validity rule includes all the validity rules of ETL.

0.B - The definitions in the rest of this document introduce the concepts used in the rule - catcall, creation catcall, polymorphic call or creation instruction, broken assignment, broken creation instruction - and some needed auxiliary concepts such as polymorphic entity.

0.C - Definitions 1 to 5 address the system validity issues raised by attachments (assignment, argument passing, assignment attempt); definitions 6 and 7 address the issues raised by creation instructions.

Notes on definition 1: Polymorphic entity or function

1.A - ``Polymorphic assignment'' as used in case 1.6 is defined in definition 2. Definitions 1 and 2 are mutually recursive.

1.B - ``Polymorphic'' (here and in the other definitions using this term) is an abbreviation for ``potentially polymorphic''. In particular, in case 1.2 we treat all formal arguments of a routine r as potentially polymorphic because we may compile the class into a library and then clients can call the routine with actual arguments of arbitrary conforming types. And in case 1.1 we treat any entity of generic type as polymorphic because its actual type can be any type conforming to the generic constraint.

1.C - Current is considered polymorphic because it represents every different descendant of C.

1.D - An entity of anchored type may not be polymorphic (see 2.1).

1.E- In case 1.4 we treat any external function as polymorphic; this is inevitable since an Eiffel compiler will have no way to find out the exact type that an object may have if it is produced by an external function. See also the discussion of precompiled libraries in part 4.

1.F - Whether an entity is polymorphic can be entirely determined on the basis of the class's own text, complemented by the knowledge of what inherited features (attributes and functions) are polymorphic. This is fundamentally different from system-level validity in ETL chapter 22, which requires a system-wide transitive closure.

Notes on definition 2: Polymorphic assignment

2.A - Case 2.2 is a recursive use of Definition 1.

2.B - Eiffel semantics does not generally distinguish between assignment and argument passing, together called attachment. The notion of polymorphic assignment is an exception, due to the decision (1.2) to treat any formal argument ipso facto as polymorphic because of the bottom-up nature of the Eiffel method and its reliance on libraries.

Notes on definition 3: Polymorphic call

3.A - An unqualified call has Current as its target (see ETL page 340) and hence is always polymorphic.

3.B - The target of a call can itself be a call to a function, polymorphic or not, as in (some_function (a)).some_routine (b).

Notes on definition 4: Catcall

4.A - Since the General Validity rule enables us to assume that the call is valid, the export status of f in B must make it available to C.

4.B - For the same reason, the call's actual argument signature conforms to the argument signature of f in B.

4.C - Case 4.3 is a variant of case 4.2: the use of an anchored type, with an anchor redefinition (explicit, or implicit if the anchor is Current), is equivalent to a covariant redefinition.

4.D - ``Cat'' stands for ``Changed Availability or Type''.

Notes on definition 5: Broken assignment

Notes on definition 6: Polymorphic creation instruction

6.A - For creation instructions the notion of being polymorphic is less dynamic than for calls. A call is polymorphic as soon as its target is, somewhere, the target of a polymorphic assignment. This is rather difficult to trace exactly, so definitions 1 and 2 take a somewhat pessimistic view (for example any formal argument is considered polymorphic because we do not want to look - recursively - at the actual arguments in all possible calls). For creation instructions we can be more exact without performing a system-wide transitive closure: the instruction will be polymorphic only if its creation type is anchored, redefined somewhere, or (see next note) generic.

6.B - Case 6.2 extends beyond the current language definition, which prohibits the use of a creation type as formal generic parameter. I believe, however, that a number of people would like to see this restriction removed. 6.2 makes this change possible and safe (should it be decided). If generic creation types indeed become valid, the revised validity rule (the new VGCC, page 286) should require the Creation_call to use one of the creation procedures of the base class of the generic constraint. Because 6.2 is beyond the current language, I have written it, as well as 7.2 which provides the corresponding constraint, in [[double brackets]].

Notes on definition 7: Broken creation instruction

7.A - To take into account the case of creation instructions with no Creation_call (just create x rather than create x.p (...)), this rule assumes the following convention: create x is viewed as using a Creation_call to the fictitious procedure no_creation, and a class with no Creators clause is viewed as having a Creation clause of the form creation no_creation.

7.B - For a compiler implementor, the exclusion of broken creation instructions essentially means that the compiler must redo the type checking of a routine containing a creation instruction in any descendant that redefines the target's type, or in every descendant if that type is anchored or generic.

7.C - About 7.2 and the [[double brackets]] see note 6.B.

Part 4: The case of precompiled libraries

Many implementations will provide the ability to precompile a set of classes and distribute it without the full source code, but only with an interface specification - the ``flat-short form''.

In a class of such a library you may find a query returning results of a certain type. (A ``query'' is a function or an attribute; the flat-short form does not distinguish, for information hiding purposes, between functions and attributes.) If you write client software and do not have access to the library's source code, you cannot know whether the query is polymorphic.

This suggests two possible approaches for compiler developers:

9.A - Treat all queries of a precompiled library as polymorphic.
9.B - Adopt a convention, in generating the flat-short forms, for indicating that a query is polymorphic.

Approach 9.A, the more restrictive one, is easier to implement. It is in line with the treatment of external functions (see note 1.E), but may be incompatible with the expected increasing reliance on precompiled libraries.

If approach 9.B is used, the convention could be an asterisk following the type name to indicate a polymorphic result, as in:

spanning_tree: TREE* [NODE]

(To avoid any confusion, remember that this is not a language extension, simply a convention for the output of the Eiffel flat-short tool that produces class interfaces. Roger Browne has suggested using, rather than the asterisk, a form such as polymorphic TREE, perhaps more in line with other Eiffel conventions.)