bertrand@eiffel.UUCP (Bertrand Meyer) (07/30/90)
The following article, posted originally in June of 1989 (after a draft was posted in April of that year) is being reposted since it addresses some of the most frequently asked questions about the Eiffel language, and is not widely available. Four important comments about the reposting: 1. As noted in the original message, this is again an article meant for typesetter output, and pure character format does not do it full justice. 2. The article has not been edited at all. This means any error in the original is still there. Also, this article predated the book ``Eiffel: The Language'', made available (as an internal report from Interactive) in August of 1989. That book is the official reference on the language; if you find a discrepancy, believe the book, not this article. The article does adequately cover the spirit, if not always the letter, of the type system, and many users have told us that it does a better job at explaining the general picture than the book, which is too dry and analytic. (I have worked to correct this deficiency of the book, of course, but the result is not yet available.) 3. Both the book in its current form and this article (with the previous reservations) apply to the currently available version, 2.2, available since August 1989, as well as to the forthcoming version, 2.3. 4. There has been further thinking on some of the solutions described in the article, such as the ``repeat'' mechanism; some recent comments in this newsgroup (see contributions by Steve Tynor, Rick Jones and others) pointed out limitations of this mechanism. Obviously the discussion is not closed. This reposting should not be very useful to current Eiffel users, for whom the material described is old and well-known; it should, however, provide a better perspective to readers whose only exposure to Eiffel dates back to older versions of the language. EIFFEL TYPES Bertrand Meyer March 1989 Updated, 14 May 1989 and 2 July 1989 1 OVERVIEW This note presents a unified view of the Eiffel types. The language as described is what is available for version 2.2 and includes some extensions over the definition given by reference [1]. No incompatibility is introduced; any Eiffel class that was legal according to [1] remains legal, with the same exact semantics. The conceptual framework and extensions presented here introduce a remarkable number of benefits, both theoretical and practical: + Independently of any actual change to the language, the concept of type in Eiffel is presented in a clearer and more uniform way. All Eiffel types are now based on classes, and multiple inheritance is the basic mechanism for constructing new classes. This means in particular that basic types (integer, real, character and real) need no longer be treated according to special rules. In fact, the reserved words INTEGER, REAL, CHARACTER and BOOLEAN are theoretically no longer needed, as it is now possible to define the corresponding types fully within the language, based on new library classes. (These names remain in use, however, for reasons of convenience and efficiency.) + Double precision reals, which were not previously supported directly in Eiffel because of theoretical and practical problems relating in particular to genericity, are now available. In fact, any precision can be supported. + Limited support is offered for programmer-defined infix and prefix operators. + Programmers can improve the efficiency of their software by writing classes whose instances include other objects. Up to release 2.1, objects could refer to other objects through reference fields, but could not contain other objects. The new possibility avoids unneeded indirections. + Exchange of data between Eiffel software and external routines is made easier, particularly when the external routines are written in C. Eiffel objects may contain sub-objects which the C world views as C structures. + Operations on bit sequences of arbitrary length (previously supported by the library class BOOLEAN_SET) are handled simply within the language proper. Throughout the rest of this note, ``Eiffel'' refers to version 2.2 Eiffel. The language defined by [1] and implemented as version 2.1 is called ``pre-2.2 Eiffel''. 2 EXPANDED TYPES The most important new concept is that of expanded type, which makes it possible to declare entities that denote objects rather than references to objects. As a result, it is now possible to create ``composite'' objects - objects which contain sub-objects. ``Entities'' in Eiffel cover class attributes, routine arguments, function results and local routine variables. 2.1 Overview In pre-2.2 Eiffel, an entity declaration is of the form: ent: TYPE where TYPE is either a class type (possibly defined by ``association'') or one of the basic types INTEGER, REAL, BOOLEAN and CHARACTER. (In a generic class, TYPE can also be a formal generic parameter, representing either a class type or a basic type depending on the choice of actual parameter.) If TYPE is one of the basic types, ent at run-time represents a value of the corresponding type (an integer, a real, a boolean, a character). But if TYPE is a class type, the above declaration means that the run-time denotations of ent will be not objects but references to potential objects. The objects are ``potential'' because they are only allocated explicitly, through a Create operation. Until it becomes associated with an object, a reference is said to be ``void''. This remains true by default, but it is now possible to declare an entity as being of an ``expanded'' type. The syntax is: ent: expanded CLASS_NAME In such a case, ent will be said to be expanded (as an abbreviation for ``of expanded type'') and of base type CLASS_NAME. With such a declaration, ent will represent not a reference to potential objects of type EXP, but directly objects of type EXP. So if a class C contains an attribute declared as sub: expanded D then any instance of C will contain a sub-object of type D, accessible through attribute sub. A class such as C having at least one attribute of expanded type is called a composite class; its instances are called composite objects. Figure 1 pictures a composite object. If an entity is not explicitly declared as expanded, for example ref: E it keeps its pre-2.2 denotation of a references to potential objects of type E. Assuming class C contains both of the above attribute declarations, the structure of an instance of this class is illustrated on figure 1. -------------------- ref | | ---> To dynamic instances -------------------- of UNEXP | | | | | Object | sub | of type | | EXP | | | | | -------------------- | | -------------------- Other | | -------------------- fields | | -------------------- | | -------------------- Figure 1: Composite object Composite objects serve to model external objects that have non-simple components. The modeling of such objects was, of course, possible in pre-2.2 Eiffel, but only by making the objects contain references to their components, not the components themselves. Now it is possible to specify that the objects are composite, so that they actually contain sub-objects. The improvement is not functional - no data structure can be described which could not have also be modeled in pre-2.2 - but is one of performance: useless indirections may now be avoided. 2.2 Create, Void and Forget Consider an entity declared of class type (that is to say, unexpanded): ent: C The rules of the language (in 2.2 as well as pre-2.2) define the effect of a creation operation ent.Create (...) to be the following sequence of steps: 1 + Create a new instance of C. 2 + Associate this instance with ent. 3 + Initialize its fields, each of which corresponds to an attribute of the class. 4 + If C has a specific Create procedure, apply it to the new instance, using the actual arguments given if any. Now assume that ent is declared as expanded: ent: expanded C Then the call ent.Create is still correct. Here steps 1 and 2 are meaningless since ent denotes an instance of C, not a reference. So the effect of the Create is now limited to applying steps 3 and 4 to the object associated with ent. In all cases the effect of step 3 (initialization) depends for each field on whether the corresponding attribute is expanded. For a non-expanded attribute, the field represents a reference and is always initialized to a void reference. For an expanded attribute sub, the effect of the initialization is to apply the call sub.Create to the field. Because sub is expanded, this means applying steps 3 and 4: initialization of all fields, and application of the specific Create if present. Note that step 3 is recursive because the type of sub might contain expanded attributes. Two other predefined Eiffel features, Void and Forget, are also allowed on entities of expanded types. If sub is such an entity, sub.Void always returns false and sub.Forget has no effect. 2.3 Restrictions Three simple rules apply to expanded types as a result of the above discussion. Violation of these rules will lead to compile-time errors. The first rule applies to the Create procedure of the base classes of expanded attributes. Consider a declaration of the form sub: E As any other class, class E may or may not have a specific Create procedure. If it has one, however, this procedure may not have any arguments. This clearly required in light of the rule for Create: in the above example, there is no way to initialize an instance of the composite class C if the initialization of its expanded sub component requires specific information. The second rule prohibits C from being deferred. (For one thing, its Create procedure could contain a call to a deferred routine. Create is disallowed for deferred classes anyway.) The third rule prohibits expansion cycles. A class used as base for an expanded type may also be composite: in other words, E in the above example may itself have an attributed of expanded type. But the relation ``class A has an attribute of type expanded B'' may not contain any cycles. The reason for this rule is obvious: we cannot implement A and B in such a way that an instance of both of these classes contains a sub-object which is an instance of the other. Observance of these three restrictions is necessary and sufficient to ensure that an expanded declaration is correct. This is summarized in the following rule. _________________________________________________ Expanded rule: An entity may be declared of type expanded C if and only if the following three conditions are met: 1. Class C either has no Create procedure, or its Create procedure has no argument. 2. Class C is not deferred. 3. The declaration does not introduce any cycle in the relation between classes defined to hold between any two classes X and Y if and only if X has an attribute declared of type expanded Y. _________________________________________________ 2.4 The BITS types An unbounded set of predefined expanded types turns out to be convenient for several purposes. These types are written BITS M where M is any non-negative integer constant. For any M, the definition of type BITS M is that it describes values whose representation will fit in M bits. All BITS types are expanded types. By declaring an entity such as quadruple: BITS 128 the programmer is requiring the supporting Eiffel environment to devote at least 128 bits to the representation of the object associated with quadruple at run-time. Of course, the environment is free to allocate more space. The BITS types are essentially useful for low-level manipulations of bit strings, for machine-dependent definitions, and for interfacing with other languages. The last two applications are particularly significant: + Classes such as INT, FLOAT and DFLOAT (used to define the corresponding expanded types) will be defined below as classes with an attribute of the form value: BITS M for some M. The default M is 32 for INT or FLOAT and 64 for DOUBLE. Thanks to the BITS classes the machine dependencies can be recognized and isolated in a few classes such as these. + Pre-2.2 Eiffel made it possible to exchange data between Eiffel and other languages such as C, but the sole interface unit was the word (32 bits by default). Now by declaring entities of type BITS M for arbitrary M, Eiffel software can send and receive data of any size. This is explained in more detail below. The operators and, or, xor and not (defined in infix form, as explained in the next section) are applicable to entities of BITS M types. This makes it possible to use these types for performing operations on bit chains of arbitrary length. In pre- 2.2 Eiffel this was done through the kernel library classes BOOLEAN_SET and INTEGER_SET. These classes provide for variable- size bit chains and remain available. When one of the operators and, or, xor and not is applied to operands of types BITS M and BITS N, the type of the result is BITS max (M, N). For assignments involving the BITS types, the following rules apply: + A value of type BITS M may be assigned to an entity of type BITS N for M <_ N. Only the first M bits of the target are affected by such an assignment. + Values true and false may be assigned to an entity of type BITS M, resulting in the assignment of a chain of all ones or all zeros, respectively. + No other assignment involving BITS operands is permitted. These conventions are consistent with the use of BITS types for low-level manipulations. 3 ASSIGNMENT, EQUALITY AND ARGUMENT PASSING The introduction of expanded types makes it necessary to specify precisely the kind of association which occurs between entities and objects as a result of assignment and argument passing, and the meaning of equality tests. 3.1 Overview Assignment will be considered first. Three kinds of assignment are needed to cover all possible cases: assignment of references; object duplication; field-by- field object copy without duplication. When all three forms are available on a source b and a target a, they are expressed by the following respective syntax forms (the last one new): + a := b + a.Clone (b) + a.copy (b) Clone is written with an initial upper-case because it is a predefined language feature and hence a reserved word. The name copy, however, is not reserved; instead, copy is a routine of the new universal class ANY, automatically inherited by every class. The effect is determined by whether the target a is expanded or not. 3.2 Assignment to unexpanded target If the type of the target a is unexpanded (that is to say, it is a normal class type), then a denotes references to possible objects and all three forms make sense. The rules in this case are the same as in pre-2.2 Eiffel, with the addition of copy and the possibility of an expanded source. The effect of the three forms of assignment is as follows: + := denotes reference assignment. If b is unexpanded, the assignment will result in a referring to the same object as b, or being void if b was. If b is unexpanded, a will be a reference to the object denoted by b. + Clone represents object duplication. If b refers to an object (which is always the case if its type is expanded), the Clone operation creates a new copy of this object, and makes a refer to it. If b was void, a becomes void. + copy represents copy without allocation. The contents of the object associated with b are copied onto the object associated with a. Both a and b must be non-void. Since copy is a normal feature, from class ANY, applying it to a void a will raise an exception. The routine has not b.Void as a precondition; in precondition checking mode, an exception will be raised if b is void (which cannot occur if its type is expanded). As in pre-2.2 Eiffel, the assignment in any of its three forms, is permitted if and only if the base type of b ``conforms to'' the type of a. Roughly speaking, this means that the latter is a direct or indirect heir of, or identical to the former. The precise definition of conformance is given in section 11.3.1 of [1]. 3.3 Assignment to expanded target Now consider the case in which the target a is expanded. In this case, a directly denotes an object; as a consequence, the only semantics that makes sense is field-by-field copy. Then all three syntax forms will have this effect. The source b may be expanded or not. If it is expanded, the object denoted by b will be copied onto the object denoted a. If it is not expanded, b denotes a reference, which must be non-void for the assignment to proceed correctly. If so, the object referred to by this reference is copied to a. If not, an exception will result (as when a feature access b.f is attempted and b is a void reference). In this case exact type identity rather than conformance is required: the base type of the source must be identical to the base type of the target, with the extra tolerance introduced above for BITS types. 3.4 Argument passing An entity may become associated with a value as a result not just of an assignment of one of the above kinds, but also of argument passing. The rule for Eiffel routines is that if a is a formal argument to a routine, a call to the routine using b as the corresponding actual argument always produces the same association between a and b as an assignment a := b The semantics of argument passing then follows from the above discussion: for unexpanded a, the effect is pass-by- reference; for expanded a, the effect is pass-by-copy. Type conformance applies in the former case, identity in the latter case. If a is expanded and b is not, the value of b must be non- void at execution time, otherwise an exception will be raised. This rule applies to Eiffel routines. The rule for external routines (routines written in other languages) are given in 5.1 below. 3.5 Equality testing Corresponding to the different kinds of assignment, two forms of equality testing may be needed, written respectively: 1 + a = b 2 + a.Equal (b) The rule is the following. If both a and b are unexpanded, the form 1 expresses equality of references and form 2 expresses field-by-field object equality. If both a and b are expanded, both forms express field-by-field object equality. If one is expanded and the other is not, both tests are illegal and will be rejected by the compiler. 3.6 Deep copy and deep equality The copies performed by Clone and copy, and by := for an expanded target, are ``shallow'' copies; in other words they copy one entire object, but that object only, the reference fields being copied verbatim. Similarly, field-by-field equality as provided by Equal, and by = for expanded arguments, is a shallow test, comparing only first-level fields. The universal class ANY includes a procedure deep_copy which will duplicate an entire data structure, and the corresponding deep_equal function. 4 THE TYPE SYSTEM 4.1 Basic types The behavior of entities of expanded types, as it has been defined, generalizes the behavior of entities of basic types (INTEGER, REAL, BOOLEAN, CHARACTER) in pre-2.2 Eiffel. For these basic types, assignment (:= and Clone) had the exact semantics discussed in the previous section for expanded types. The introduction of expanded types indeed makes it possible to cease treating the basic types as special. Instead, they are just predefined expanded types. More precisely, the Kernel Eiffel Library now includes fully normal classes INT, FLOAT, BOOL, CHAR and DFLOAT whose instances are objects representing integer, floating-point, boolean, character and double precision values. (The last one corresponds to a new facility.) The classes in question will be given below (section 6.4). To obtain the effect of basic types in pre-2.2 Eiffel it suffices to declare entities of respective types expanded INT; expanded FLOAT; expanded BOOL; expanded CHAR; expanded DFLOAT The reserved words INTEGER, REAL, BOOLEAN, CHARACTER, complemented by DOUBLE, subsist in 2.2 Eiffel, where they are simply considered to be syntactical abbreviations for the above. The semantics of entities declared of any of these types is exactly the same as in pre-2.2. Of course, the Eiffel compiler knows about these types and is able to treat them without any loss of performance. 4.2 A complete picture of Eiffel types The above makes it possible to give a full definition of Eiffel types. Any Eiffel type is based on a class but may be either expanded or unexpanded. An unexpanded type is also called a class type. A class type is defined by a class name, possibly with actual generic parameters (see below). An expanded type is of the form expanded T, where t is a class type. Expanded types also include INTEGER, REAL, BOOLEAN, CHARACTER and DOUBLE, which are syntactical abbreviations are explained above, and BITS M for any non-negative integer M. 5 INTERFACE WITH OTHER LANGUAGES [This section is only for readers interested in detailed questions of Eiffel interface to other languages, particularly C. Others should skip to the next section.] 5.1 Passing arguments to external routines The semantics of argument passing for Eiffel routines was given above; it is the same as that of := assignment. For arguments to external routines, the rule cannot be as systematic; it must of necessity be adapted to the target language. Current C, for example, does not support passing of structures or arrays as arguments; only pointers to such elements may be passed. (ANSI C may be more liberal, but has yet shown little relevance to the real world of C programming.) So for C all arguments, whether expanded or not, are normally passed by reference. It is the responsibility of the target C routine to copy any structure or array if needed. An exception is made, however, for arguments of basic types (integer, real, boolean, character and the new type ``long real'' described below), which are passed by value for compatibility with tradition. For the results of external functions, the Eiffel compiler can take care of ensuring the proper semantics (copy for expanded, reference for unexpanded); a language-dependent convention is needed, however, to determine what the function will return. In C, the answer is the only portable one: the C function is assumed to return a pointer to the result for non- basic types, expanded or not. The code generated by the Eiffel compiler then takes care of performing a copy in the expanded case. An important practical caveat governs the exchange of arguments and function results between Eiffel and C (or another language). We may usually assume that the basic types are common to both languages, and hence that values of these types may be manipulated by both sides. Both C and Eiffel routines, for example, may perform additions on a given integer value. For values of any other type, however, only one side may safely execute operations other than parameter passing; the other side will simply be used as repository for the values, or to pass them along to further elements. There are exceptions to this rule, but they apply to special cases and require that the programmer be well versed in the implementation techniques for both languages. 5.2 Exchanging data between Eiffel and other languages The notion of expanded type also provides for a more flexible interface with the non-Eiffel world. In pre-2.2 Eiffel, Eiffel routines that needed to interface with routines written in other languages, such as C, could only pass them arguments of basic types or of class types; in the latter case the argument is actually a reference. This also applied to the types of the results of external functions. Furthermore, it was not possible for an Eiffel object to contain a sub-object modeled directly to a C structure type declaration, although this constraint was somewhat relieved by the presence of a routine copy_structure, in the basic library class INTERNAL, making it possible to copy the contents of a C structure into an Eiffel object. This concern is now addressed by composite classes. If each instance of C must contain a sub-object whose structure comes from external C software, then C should contain an attribute declaration of the form sub: EXP where EXP is an appropriate expanded type. Often, EXP will simply be BITS M, where the integer constant M is large enough to cover the size of instances of the required C structure type. It would usually be imprudent to give EXP a more precise Eiffel structure, since this would require making non-portable assumptions about the internal structure of both Eiffel and C objects. The above, however, will be sufficient in practice since entities of type EXP, such as sub, should only be used as arguments to external function calls. The purpose is not to have Eiffel do the job of C or the reverse, but only to facilitate communication between the two worlds. This facility is consistent with the Eiffel approach to interfacing with other languages, based on two observations. First, reusability implies that Eiffel software must be able to interface with existing software written in other languages, but not that the Eiffel language should be corrupted by compatibility with older, unrelated designs. A passage is provided through the border, but it remains a border. Second (for the special case of C): C plays with respect to Eiffel the role that assembly languages played for earlier high-level languages: that of a low-level vehicle to be used only when one cannot do otherwise. On a related topic, note that release 2.2 provides further support for more powerful interaction between Eiffel and other languages. In particular, a simple syntactical extension makes it possible to pass the address of an Eiffel routine to an external routine. The notation is @f, where f is a routine of the enclosing class; such an expression is only valid as actual argument to an external routine. 6 DEFINING THE BASE CLASSES FOR THE BASIC TYPES It was explained above that basic types are expanded types, whose base classes are new kernel library classes. These basic classes will now be introduced. The purpose of these definitions is to show that the Eiffel type system, entirely based on the notion of class, is complete: basic types may entirely be defined within the language and have no more ``magical'' properties. In practice, however, the basic types will continue to be treated in a special way by the compiler for obvious performance reasons. The classes as given below define the semantics of the basic types (each of which is an expanded form of the corresponding class), but are not used as such by the compiler and hence do not limit the efficiency of the associated operations. 6.1 Prefix and infix operators The classes underlying the basic types use the facility for defining routines by infix and prefix operators, as introduced by 2.2. The standard grammar for the declaration of routines (see [1], appendix C) is as follows (brackets introduce optional components: ___________________________________________________________________ Routine_declaration = Routine_name [Formal_arguments] [Type_mark] "is" Routine_text Formal_arguments = "(" Entity_declaration_list ")" Type_mark = ":" Type ___________________________________________________________________ The optional ``Type_mark'' gives the type of the result when the routine is a function. In pre-2.2 Eiffel, ``Routine_name'' is simply an identifier. We now allow the two new forms prefix '"'Operator'"' infix '"'Operator'"' In this syntax, ``Operator'' may only be one of the following: + - * / < > <= >= and or xor not div mod (Operator xor, for exclusive or, is new with 2.2.) All these except not may be used in infix declarations; only +, - and not are permitted for a prefix declaration. Thus only a small group of common operators may be used in prefix or infix form. Routines declared in this way must all be functions. Infix routines must have exactly one argument; prefix routines must have none. Calls to such routines must use the corresponding operators, in prefix or infix form. Assume for example a class MATRIX with function declarations of the form infix ("+") (other: like Current): like Current is... prefix ("-"): like Current is... Then, for m1 and m2 of type MATRIX, the expressions m1 + m2 and -m1, respectively, will denote calls to the corresponding functions. The precedence of infix and prefix operators is fixed and given by the standard Eiffel precedence table (see section C.3 of [1]). It is important to note that the above possibility is not ``the introduction of overloading into Eiffel''. A powerful form of overloading has always been available in Eiffel, since any two classes can have different features of the same name, and dynamic binding makes it possible to obtain run-time discrimination. What is new is a simple syntactic extension, making it possible to use the standard arithmetic operators for any class (and associated expanded types), not just the basic types. 6.2 COMPARABLE and NUMERIC Two deferred classes are needed in the basic Eiffel library. Class COMPARABLE already existed in pre-2.2; its features now use the infix facility. Note that only "<=" needs to be deferred. deferred class COMPARABLE export infix "<", infix "<=", infix ">", infix ">=" feature infix "<=" (other: like Current): BOOLEAN is -- Is current element less than or equal to other? deferred end; -- "<=" infix "<" (other: like Current): BOOLEAN is -- Is current element less than other? do Result := Current < other and not Current.Equal (other) end; -- "<" infix ">" (other: like Current): BOOLEAN is -- Is current element greater than other? do Result := other < Current end; -- ">" infix ">=" (other: like Current): BOOLEAN is -- Is current element less than or equal to other? do Result := other <= Current end -- ">=" end -- class COMPARABLE Any class that needs to manipulate elements connected by an order relation may inherit from COMPARABLE. Class NUMERIC describes any type with the basic arithmetic operations: deferred class NUMERIC export infix "+", infix "-", infix "*", infix "/", prefix "+", prefix "-" feature infix "+" (other: NUMERIC): NUMERIC is -- Sum of other and current element deferred end; -- infix "+" prefix "-" BOOLEAN is -- Is current element less than other? deferred end; -- prefix "-" (etc.) end -- class COMPARABLE Any class that needs to manipulate elements on which operations similar to the basic arithmetic operations are available, may inherit from NUMERIC (and from COMPARABLE if necessary). 6.3 Conversions To allow for mixed-mode computation, we will need conversion functions. For example, the following class describes values which may be converted to floating-point: deferred class FLOAT_CONVERTIBLE feature float_value: FLOAT is -- Representation of current object as a floating-point value deferred end -- float_value end -- class FLOAT_CONVERTIBLE A similar DFLOAT_CONVERTIBLE class may be used for conversions to double precision. 6.4 Basic types and double precision We can now introduce the base classes for the basic types. The definition is based on an assessment of the present state of computer hardware: on most platforms, an integer or (short) real will fit in 32 bits; a character will fit in 8 bits; a long real will fit in 64 bits. I believe it is preferable to explicitly incorporate these industry-standard lengths into the programming language definition, rather than to maintain a pretense of implementation-independence. In practice such a pretense can only prevent software developers from writing portable software that will perform in a (reasonably) consistent fashion across a range of platforms. (Another, more axiomatic approach has been followed by Ada, but its advantages over the policy of making simple assumptions about the minimum bit-size requirements of basic data types are not clear in the present state of hardware technology and software theory.) These conventions are not really ``wired in'' and it is possible to change the definitions below to adapt to different sizes. The basic classes are defined as follows. class FLOAT export repeat NUMERIC, repeat COMPARABLE, value {FLOAT} inherit NUMERIC; COMPARABLE; FLOAT_CONVERTIBLE feature value: BITS 32; -- The real number's internal representation set_value (v: BITS 32) is -- Use v as the bit code for the real number's value do value := v end; -- set_value float_value: FLOAT is -- Representation of current object as a floating-point value do -- Here of course no conversion is necessary Result := Current end; -- float_value infix "+" (other: FLOAT_CONVERTIBLE): FLOAT is -- Sum of other and current element external short_real_addition (v, w: BITS 32): BITS 32 language "..." do Result.Create; Result.set_value (short_real_addition (value, other.float_value)) end; -- "+" ... And similarly for other prefix and infix operations ... end -- class FLOAT So we consider that a short real number is an expanded 32- bit object, which has the properties of NUMERIC objects, including those of COMPARABLE objects. The repeat notation which appears in the export clause for FLOAT is a facility (new for 2.2) which makes it possible to copy conceptually the export clause of an ancestor class such as COMPARABLE and NUMERIC into a descendant such as FLOAT without copying it physically. (In addition, value is exported to the class itself.) _________________________________________________ The repeat facility addresses a problem that was mentioned by users who developed large systems with pre-2.2 Eiffel: the tediousness of managing export lists when many levels of inheritance are involved. The problem should now disappear; if you specify with a repeat clause that the interface of a class D should always include the interface of one of its ancestors A, then any addition to A's interface will automatically be reflected in D. _________________________________________________ The basic operations of class FLOAT operations are implemented through external routines, since these operations are to be carried out directly by the hardware (or through the lower-level implementation language used by the given implementation of Eiffel, such as assembly or C). The idea here is the same as with the way arrays are treated in pre-2.2 Eiffel. Arrays, and now real numbers, are viewed as instances of ``normal'' Eiffel classes in the Basic Library. From a practical standpoint, however, the implementation knows about these and implements the corresponding operations through shortcuts: no actual routine call is needed to access an array element, or to add two reals. So the usual efficiency of fundamental operations is achieved, but the conceptual framework (classes, objects, inheritance) remains consistent with the rest of the language. Class INT is declared in a way similar to FLOAT: class INT export repeat NUMERIC, repeat COMPARABLE, value {INT}, to_real inherit NUMERIC; COMPARABLE; FLOAT_CONVERTIBLE feature value: BITS 32; -- The integer's internal representation set_value (v: BITS 32) is -- Use v as the bit code for the integer's value do value := v end; -- set_value float_value: FLOAT is -- Representation of current object as a floating-point value external int_to_float (v: BITS 32): BITS 32 language "..." do Result.Create; Result.set_value (int_to_float (value)) end; -- float_value infix "+" (other: INT): INT is -- Sum of other and current element external integer_addition (v, w: BITS 32): BITS 32 language "..." do Result.set_value (integer_addition (value, other.value)) end; -- "+" ... And similarly for other prefix and infix operations ... end -- class INT The other basic types are defined similarly. For CHAR, attribute value is of type BITS 8; for BOOL, value is of type BITS 1. The class DFLOAT describing double-precision reals and serving as base class for the expanded type DOUBLE, new with 2.2, is almost identical to FLOAT but uses BITS 64 rather than BITS 32 for the declaration of value and the associated entities. If longer precision reals or integers are needed, the technique is easily generalized: just define new classes that inherit from NUMERIC and COMPARABLE and have an internal value attribute of the right BITS size. 7 GENERICITY It is possible in Eiffel to write general-purpose classes describing ``container'' data structures (such as lists, trees etc.) containing objects of arbitrary types. These are written as generic classes. Genericity is essential to reconcile static typing with the need for reusable container classes. The preceding discussion raises some new issues with respect to genericity. 7.1 Form of generic declarations A generic class is defined with one or more generic parameters, as T in class C [T, ...], ... Variations on the way T may be specified are described below. An entity declared in the class as being of type T or another generic parameter will be said to be of generic type. Then declarations of entities of types based on C require an actual generic parameter, such as ACTUAL in x: C [ACTUAL, ...] ACTUAL is an arbitrary type. It may be either a class type or an expanded type (of the form expanded BASE, with the special cases corresponding to the basic types, or BITS M). 7.2 Ensuring consistent semantics for generic parameters Consider a routine r of a generic class, acting on values of generic type T. The problem arises (both in pre-2.2 and current Eiffel) of how the author of the class can write r so that the routine has the same semantics regardless of whether the actual generic parameter corresponding to T is expanded or not. class. In light of the discussion on assignment and argument passing in section 3 above, the answer is clear. Only copy semantics is available for both expanded and unexpanded types. (For expanded types reference semantics does not make sense.) Writers of generic classes may thus guarantee uniform semantics by making sure that: + All equality tests between expressions of generic type use Equal rather than =. + All assignments use copy rather than :=. Clone may also be used if, for generic parameters which are unexpanded types, duplication is desired rather than mere copy; the difference between Clone and copy is irrelevant for expanded types. This rule is particularly important because basic types (INTEGER etc.) are defined as expanded. Using Equal and copy is thus the appropriate way to guarantee that a generic class will have uniform semantics for basic types and class types. It is important to note that this was essentially already true in pre-2.2 Eiffel: as a first step towards the more general solution described here, Equal and Clone were already available on basic types, with the respective semantics of = and := on these types. The rules described above simply generalize and systematize these conventions. Clearly, the above guidelines should only be followed if uniform semantics is desired; they do not mean that = and := should be banned from generic classes. One can conceive of perfectly valid generic classes for which reference semantics is desired when the actual generic parameter is a class type, and value semantics is desired when the generic parameter is an expanded type, in particular one of the basic types. Since field-by-field object comparison and object copy are the operations that make sense in all cases, it is legitimate to criticize the syntax retained: why are the simpler symbols (= and :=) used for operations whose semantics is not the same for expanded and unexpanded types, and more verbose notations (Equal, copy) used for the operations which have consistent semantics and hence appear ``cleaner''? This is a valid criticism. Indeed, I seriously considered, during the initial design of Eiffel, making := denote copy assignment and = denote object equality. But this idea was rejected out of fear that such a clash with the tradition of all previous languages supporting reference semantics (Pascal, Ada, PL/I, and many others) would result in numerous mistakes on the part of new Eiffel programmers, not all of whom may yet be expected to learn Eiffel as their first programming language. The relative clumsiness of the syntax for the more uniform operations was deemed less unpleasant than the prospect of massive mistakes by the cohorts of new converts, still recovering from the influence of older programming languages. As explained in section 5.8.3 of [1], using variants of the same notation for copy and reference semantics (such as the traditional operators := and = for copy, and the Simula operators :- and == for reference) would have been the worst possible solution, implying that minor syntactical or typing oversights could result in drastic changes of meaning. The notations had to be visibly different. New reserved words (Clone, copy, Equal) were deemed clearer than new operators. 7.3 Catering to generic parameters of various sizes One more technique is needed to benefit the full benefit of expanded classes. We need to be able to define generic classes, representing such abstractions as matrices, in such a way that the formal generic parameters can be expanded types such as DOUBLE representing objects of any given size. Consider the following general class: class MATRIX [T -> NUMERIC] export repeat NUMERIC inherit NUMERIC feature ... Matrix representation and operations, expressed in terms of the corresponding operations on objects of type T ... end -- class MATRIX This generic class will accept any unexpanded type as actual generic parameter corresponding to T. It will also accept expanded types such as BOOLEAN, CHARACTER, INTEGER and REAL whose ``size'', defined below, is less than 32. It will not, however, accept DOUBLE or other ``big'' expanded types. To describe matrices of bigger composite objects, you may define a different class, an heir to MATRIX with a very short definition such as: class D_MATRIX [T (DOUBLE) -> NUMERIC] export repeat MATRIX inherit MATRIX -- No need for a feature clause end -- class D_MATRIX The new facility here is the possibility to specify a type (DOUBLE in this example) as a bounding type on a formal generic parameter (T in this example). A bounding type is written in parentheses after the generic parameter. If none is given explicitly, the bounding type is taken to be BITS 32 by default. The effect of having a bounding type associated with a formal generic parameter T is given by the following two rules: + The size (as defined below) of any actual generic parameter corresponding to T must be no bigger than the size the bounding type. + If C is a type involving a generic class G and D is a descendant class, the rules for conformance between D and C (governing assignment and parameter passing between these two types) are complemented by the requirement that the bounding types for each generic parameter must be exactly the same in both cases. The size of a type is defined as follows: + The size of all class types is the same. It can be referred to as the size of the universal ancestor ANY. By default this is 32. + The size of a BITS M expanded type is M. + The size of any other expanded type expanded C is obtained by adding the sizes of the types of all the attributes of class C (those defined in the class itself, and those inherited from ancestors). From the above definitions, then, the size of INTEGER and REAL is 32 and the size of DOUBLE is 64. Class D_MATRIX above should be used to describe matrices whose elements are no bigger than DOUBLE objects. This includes any class type as well as BOOLEAN, CHARACTER, INTEGER, REAL. Note that the conformance rule precludes such improper combinations as assignment of an integer matrix or a matrix of references to a double-precision matrix. The convention that the size of ANY and other class type is 32 results from an assessment of the current state of hardware technology. values of unexpanded types are accessible through references, that is to say, at the implementation level, pointers. We thus accept that, for still some time: 1. + Limiting Eiffel applications to manipulate 2 ^ 31 (about two billion) dynamic objects during a given session is acceptable 2. + Computer memory is not cheap enough that it would be preferable to raise this limitation to, say, 2 ^ 63 , if this implied doubling the space occupied by the pointers generated during a session. 8 CONCLUSION The mechanisms described above may at first appear as a significant extension of pre-2.2 Eiffel. They are not. Instead, the work reported here should primarily be viewed as a reorganization and cleanup of the theoretical framework, which makes important improvements available at the cost of very little actual extension. As a matter of fact the above discussion has introduced only five new reserved words: BITS DOUBLE expanded infix prefix From a theoretical viewpoint, four notions have been removed: the basic types INTEGER, REAL, BOOLEAN, CHARACTER. DOUBLE would have been needed anyway. The starting point for this work was both practical and theoretical. From a practical viewpoint, I had known for a long time that it would be ultimately indispensable to support double precision; however double precision initially appeared extremely difficult in the presence of genericity. Clearly I did not want an implementation of genericity which would result (as most Ada compilers do for generic packages) in duplicating code for each separate use of a generic class; this was deemed unacceptable. At some point I realized that, from a practical perspective, the mechanism of constrained genericity (implemented for 2.1) contained the key. But then why stop at double precision? Eiffel strives for generality; what if someone wants, say, 128-bit reals? Another practical concern, which had been present for some time, was to avoid dynamic allocation whenever possible. In most applications, the overhead of pointers is acceptable; whether in Eiffel, Pascal, Ada or C, most non-trivial data structures are accessed through pointers anyway. Still, this overhead is annoying for applications that manipulate very large numbers of composite objects each of which contains several small but non- atomic components - say triangles, each containing three ``point'' attributes. Yet another motive was the desire to improve the flexibility of the interface of Eiffel with other languages, especially C. This requirement is not an internal one, since for our own developments the external interface of pre-2.2 Eiffel, based on external routines that essentially pass BITS 32 values to and from Eiffel, is more than enough. Eiffel users have increasingly requested support for more general interface mechanisms, however; the aim in particular is to facilitate cooperation between Eiffel software and existing packages (graphics, databases, expert systems etc.) The support for composite objects should be particularly useful here, since it covers a need often voiced by users, that of having Eiffel objects contain sub-objects described by C structures. Support for prefix and infix operators was never by itself a major concern. In fact I was (and remain) strongly against unbounded operator overloading, which opens the door to all kinds of notational abuses; I am much more interested in the semantically useful overloading permitted by redefinition and dynamic binding than by purely syntactical techniques which make it possible to call a two-argument routine under the form a @#$ b (say). The ability to define the precedence of new operators, in particular, has always struck me as a perfect example of programming language feature that is bad on all counts: it is difficult to find a good syntax for it (do you specify precedence by a number, forcing each program writer or reader to refer to the programming language reference manual each time, or do you say ``one less than the precedence of operator x''?); it makes it particularly easy to write tricky programs; it invites bugs; it makes the parser - normally the most trivial and worriless part of a modern compiler - considerably more difficult to implement (since each program may change the syntax); and to top the whole thing, it is only a superficial, syntactical, ``vanity'' extension. Not the most urgent feature to add to Eiffel. It was clear, however, that if we were to treat basic types formally as classes (see next), then we should be able to interpret the basic operators as denoting routines. Why not then make them available to writers of normal programmer-defined classes, such as MATRIX, VECTOR and the like? Since the concept would only apply to a few predefined operators, the extension would remain small and reasonable. I also had more theoretical concerns. The difference of treatment between basic types and class types in pre-2.2 Eiffel, explained in chapter 5 of [1], came directly from Simula. I knew that some difference was necessary: in most cases, class types need reference semantics, and usually they also need dynamic object creation, but no one would want to allocate integers dynamically. Still, this inconsistency was unpleasant in a language that emphasizes simplicity and uniformity. The concepts of class seemed to provide such a powerful basis for the discussion of types that it was sad to have to abandon it for a handful of basic types - the most fundamental at that. When I presented Eiffel concepts in public courses, listeners would often question the special treatment of basic types, asking why INTEGER, for example, was not a class. My answer usually included (among other arguments) that any theory of types needed to start somewhere, and thus to assume a few pre-existent, ``Bourbaki-given'' types. Why not, the argument would go, use for these the few types with which everybody is familiar, and which are readily available on all hardware? I would usually say something like ``In the most purist object- oriented approach, we would still need to take at least one type for granted: the bit''. In a way, the framework presented above implements this purist approach, although it considers not a single bit type, but an unbounded sequence of bit types, BITS M. Everything else is derived from these. Thanks to the notion of expanded type, which also addresses the problems that I have called ``practical concerns'', the basic types make perfect sense in this context; they cease to be special language elements with magical properties and become standard library classes. To make the solution clear and uniform, limited support for infix and prefix operators, useful in its own right, may be introduced. No loss in performance is implied because the compiler can be made smart enough to recognize these classes and apply special techniques to them; in fact, the pre-2.2 implementation of basic types remains entirely valid. Finally, the introduction of expanded types makes it possible to tackle the dual-semantics issue in a much more explicit, complete, and (I hope) convincing fashion than ever before. I feel that the general cleanup of the Eiffel type system is almost complete with the discussion of this article (and the companion discussion [2], which addresses details of the rules making Eiffel fully statically typed through constraints on polymorphism, not well covered in [1] and not yet fully supported by the implementation). To make the basic type into fully ``ordinary'' citizens, we would also need to deal with manifest constants, which could be treated as ``zerofix'' operators. Example of such operators would be 0 and 1, now introduced as deferred functions in class NUMERIC, corresponding to the neutral elements for operations infix "+" and infix "*". The other integers would be interpreted as syntactical abbreviations for 1+1, 1+1+1 etc. Such an extension would require that we be more formal, separating what has been called NUMERIC into different inheritance levels (GROUP, RING, FIELD etc.) characterized by the proper assertions. This is worth exploring further; there may be the root of a full type theory, perhaps even of a general model for computation. This extension, however, will not be part of Eiffel release 2.2. References [1] B. Meyer, Object-Oriented Software Construction, Prentice- Hall, 1988. [2] B. Meyer, Static typing in Eiffel, Interactive Software Engineering Inc., Technical Report TR-EI-18/ST, January 1989. Added reference (July 1990): ``Eiffel: The Language''; Technical Report TR-EI-17/RM, Version 2.2, August 1989. To be published by Prentice-Hall in 1990.