bertrand@eiffel.UUCP (Bertrand Meyer) (06/05/89)
The following is a revised version of the article quoted. Much of the text is common but there have been some important simplifications; in particular, inheritance is used in a much more limited way than in the original description. (Expanded classes cannot be used as parents any more). If anyone is keeping a record, please discard the original. Many thanks for all comments received on the previous posting. The text below was produced with nroff from input meant for troff (with italics and the like); I hope it is understandable. I use this opportunity to mention that in an earlier message about 2.2 syntax (<153@eiffel.UUCP>) an item was forgotten in the list of new keywords. The omitted keyword is ``repeat''. ------------------------------------------------------------- EIFFEL TYPES Bertrand Meyer Draft, March 1989 Revised, 14 May 1989 This note presents a unified view of the Eiffel types. The language as described is what is available for version 2.2 and includes some extensions over the definition given by reference [1]. No incompatibility is introduced; any Eiffel class that was legal according to [1] remains legal, with the same semantics. The conceptual framework and extensions presented here introduce a remarkable number of benefits, both theoretical and practical: + Independently of any actual change to the language, the concept of type in Eiffel is presented in a clearer and more uniform way. All Eiffel types are now classes, and multiple inheritance is the basic mechanism for constructing new classes. This means in particular that basic types (integer, real, character and real) need no longer be treated according to special rules. In fact, the names INTEGER, REAL, CHARACTER and REAL need no longer be treated as reserved words in the language: instead, they are names of classes in the Basic Eiffel Library. (For practical reasons, however, these names will remain reserved in version 2.2.) + Double precision reals, which were not previously supported directly in Eiffel because of theoretical and practical problems relating in particular to genericity, are now available. In fact, any precision can be supported. + Limited support is offered for programmer-defined infix and prefix operators. + Programmers can improve the efficiency of their software by writing classes whose instances include other objects. Up to release 2.1, objects could refer to other objects through reference fields, but could not contain other objects. The new possibility avoids unneeded indirections. + Exchange of data between Eiffel software and external routines is made easier, particularly when the external routines are written in C. Eiffel objects may contain sub-objects which the C world views as C structures. + Operations on bit sequences of arbitrary length (previously supported by the library class BOOLEAN_SET) are handled simply within the base language. Throughout the rest of this note, ``Eiffel'' refers to version 2.2 Eiffel. The language defined by [1] and implemented as version 2.1 is called ``pre-2.2 Eiffel''. 1 EXPANDED CLASSES The most important new concept is that of expanded class, which makes it possible to declare entities (variables) that denote objects rather than references to objects. As a result, it is now possible to create ``composite'' objects - objects which contain sub-objects. In pre-2.2 Eiffel, any entity of a class type, for example an attribute declared as ref: CLASS_NAME where CLASS_NAME is the name of a class, represents not an object but a reference to a potential object. The object is ``potential'' because it is only allocated explicitly, through a Create operation. Until it becomes associated with an object, a reference is said to be ``void''. This remains true by default, but the situation will be different if CLASS_NAME has been declared under a newly provided mode, called ``expanded''. The syntax for declaring an expanded class is expanded class CLASS_NAME ... The rest as before ... As will be seen below, an expanded class may not be deferred, so the problem of the respective order of the optional keywords expanded and deferred does not arise. The effect of declaring a class EXP as expanded is simple: any entity declared of type EXP will represent not a reference to potential objects of type EXP, but directly objects of type EXP. (Remember that ``entities'' in Eiffel cover class attributes, routine arguments, function results, local routine variables.) So if a class C contains an attribute declared as sub: EXP then any instance of C will contain a sub-object of type EXP, accessible through attribute sub. A class such as C having at least one attribute of expanded type is called a composite class; its instances are called composite objects. Figure 1 pictures a composite object. By default, classes are unexpanded. If a class UNEXP is not explicitly declared as expanded, an entity declaration ref: UNEXP keeps its pre-2.2 meaning: ref denotes a reference to a potential object of type UNEXP, to be created dynamically. Assuming class C contains both of the above attribute declarations, the structure of an instance of this class is illustrated on figure 1. -------------------- ref | | ---> To dynamic instances -------------------- of UNEXP | | | | | Object | sub | of type | | EXP | | | | | -------------------- | | -------------------- Other | | -------------------- fields | | -------------------- | | -------------------- Figure 1: Composite object If EXP had not been declared as expanded, attribute sub would have denoted an attribute which, in any instance of C, represents a reference to potential instances of EXP. Composite objects serve to model external objects that have non-simple components. The modeling of such objects was, of course, possible in pre-2.2 Eiffel, but only by making the objects contain references to their components, not the components themselves. Now it is possible to specify that the objects are composite, so that they actually contain sub-objects. The improvement is not functional - no data structure can be described which could not have also be modeled in pre-2.2 - but is one of performance: useless indirections may now be avoided. In most respects, expanded classes have the same properties as default (unexpanded) classes. An important difference is that instances of expanded classes do not need to be created explicitly. If the type of sub is an expanded class, the call sub.Create is still permitted; its sole effect, however, is to initialize the fields of the object associated with sub to their default values (based on their types) and then to call the specific Create procedure of the class, if present. (For unexpanded classes, Create also has these two effects, but it precedes them by allocating a new object of the appropriate type.) If C is a composite class (expanded or not) with an attribute sub of expanded type EXP, the application of Create to an instance of C implies that all fields of the instance will be initialized to their default values. This applies to field sub; when performing the default initializations, the Create of C will call the Create of EXP (default or specific) on this field. Two other predefined Eiffel features, Void and Forget, are also allowed on entities of expanded types. If sub is such an entity, sub.Void always returns false and sub.Forget has no effect. Three simple rules apply to expanded classes as a result of the above discussion. Violation of these rules will lead to compile-time errors. + As any other class, an expanded class may or may not have a specific Create procedure. If it has one, however, this procedure may not have any arguments. This clearly required in light of the rule for Create: in the above example, there is no way to initialize an instance of the composite class C if the initialization of its expanded sub component requires specific information. + Expanded classes may also be composite: in other words, an expanded class may itself have an attributed of expanded type. Among expanded classes, however, the relation ``A has an attribute of type B'' may not contain any cycles. The reason for this rule is obvious: we cannot implement A and B in such a way that an instance of any of these classes contains a sub-object which is an instance of the other. + An expanded class may not be deferred. (For one thing, its Create procedure could contain a call to a deferred routine. Create is disallowed for deferred classes anyway.) 2 INHERITANCE AND CONFORMANCE Inheritance is the fundamental mechanism for constructing new classes from existing ones. Inheritance serves to related purposes: + As a module enrichment mechanism, it allows a class to reuse the features defined in another. + As a type refinement mechanism, it restricts the scope of legal assignments: b may only be assigned to a if the type of b is a descendant (heir through any number of levels) of the type of a. If the types are different the assignment is said to introduce polymorphism, meaning that the target, a, may at run-time refer to objects of more than one type. The inheritance mechanism has been defined in [1] for unexpanded classes. Expanded classes introduce the following new rules. 1. + No class may inherit from an expanded class. 2. + An expanded class may inherit from an unexpanded class. 3. + The above type compatibility rule for assignments (the type of the target must be a descendant of the type of the source) remains valid. Because of rule 1, this means that if either source or target is of an expanded type then both must be of the same expanded type. An exception is made for the types BITS M, as described below. In other words, inheritance for expanded classes acts only in its module enrichment capacity. No polymorphism is permitted here. This is due to the nature of expanded classes: an entity of type EXP, where EXP is expanded, is meant to be directly associated with an object of type EXP; so its size is frozen. In contrast, entities of unexpanded types are associated with references to objects; the size of such an object does not need to be identical for all possible assignments. 3 ASSIGNMENT AND EQUALITY TESTING An assignment of b to a, as just discussed, can be of any of the following three forms (the last one new): + a := b + a.Clone (b) + a.copy (b) The previous rules apply to all three forms. As noted, if the type of either a or b is expanded, then both types must be expanded and (except for BITS M, as discussed below) identical. Clone is written with an initial upper-case because it is a predefined language feature and hence a reserved word. The name copy, however, is not reserved; instead, copy is a routine of the new universal class ANY, automatically inherited by every class. The effect of assignment under its three forms depends on whether the types are expanded or not. For unexpanded types, the rules are the same as in pre-2.2 Eiffel, with the addition of copy: + := denotes reference assignment. After the assignment a will refer to the same object as b, or will be void if b was. + Clone represents object duplication. If b refers to an object, the Clone operation creates a new copy of this object, and makes a refer to it. If b was void, a becomes void. + copy represents copy without allocation. The contents of the object associated with b are copied onto the object associated with a. If either a or b is void, an exception is raised. The copies performed by both Clone and copy are ``shallow'' copies; in other words they copy one entire object, but that object only, the reference fields being copied verbatim. For an expanded type, all three assignments have the effect just described for copy. Because of the rules given, neither a nor b may be void in this case, so no exception will ever be raised. A similar semantics is defined for equality tests. For unexpanded types, the boolean expression a = b denotes a test for equality of references; the expression a.Equal (b) denotes a test for (shallow) field-by-field equality, returning true if only the contents of the corresponding objects are field-by-field identical, or both references are void. For expanded types, both tests have the semantic of Equal. 4 ENSURING CONSISTENT SEMANTICS FOR GENERIC CLASSES An important design guideline for generic classes follows from the preceding specification of assignment and equality operators. It is possible in Eiffel to write general-purpose classes describing ``container'' data structures (such as lists, trees etc.) containing objects of arbitrary types. These are written as generic classes. Genericity is essential to reconcile static typing with the need for reusable container classes. Consider, however, a routine of a generic class, acting on values of generic type; by generic type, we mean a type defined by a formal generic parameter of the class, for example T in a routine of class C [T]. The problem arises of how programmers can write such a routine and be assured that it has the same semantics regardless of the types used as actual generic parameters for practical uses of the class. In light of the above discussion, the answer is clear. Only copy semantics is available for both expanded and unexpanded types. (For expanded types reference semantics does not make sense.) Writers of generic classes may thus guarantee uniform semantics by making sure that: + All equality tests between expressions of generic type use Equal rather than =. + All assignments use either copy rather than :=. Clone may also be used if, for generic parameters which are unexpanded classes, duplication is desired rather than mere copy; the difference between Clone and copy is irrelevant for expanded classes. This rule is particularly important because, as will be seen below, basic types (INTEGER, REAL and others) are defined as expanded classes. Using Equal and copy is thus the appropriate way to guarantee that a generic class will have uniform semantics for basic types and other class types. It is important to note that this was essentially already true in pre-2.2 Eiffel: as a first step towards the more general solution described here, Equal and Clone were already available on basic types, with the respective semantics of = and := on these types. (copy did not exist in pre-2.2 Eiffel.) The rules described above simply generalize and systematize these conventions. Clearly, the above guidelines should only be followed if uniform semantics is desired; they do not mean that = and := should be banned from generic classes. One can conceive of perfectly valid generic classes for which reference semantics is desired when the actual generic parameter is an unexpanded class, and value semantics is desired when the generic parameter is an expanded class, in particular a basic type. Since field-by-field object comparison and object copy are the operations that make sense in all cases, it is legitimate to criticize the syntax retained: why are the simpler symbols (= and :=) used for operations whose semantics is not the same for expanded and unexpanded types, and more verbose notations (Equal, copy) used for the operations which have consistent semantics and hence appear ``cleaner''? This is a valid criticism. Indeed, I seriously considered, during the initial design of Eiffel, making := denote copy assignment and = denote object equality. But this idea was rejected out of fear that such a clash with the tradition of all previous languages supporting reference semantics (Pascal, Ada, PL/I, and many others) would result in numerous mistakes on the part of new Eiffel programmers, not all of whom may yet be expected to learn Eiffel as their first programming language. The relative clumsiness of the syntax for the more uniform operations was deemed less unpleasant than the prospect of massive mistakes by the cohorts of new converts, still recovering from the influence of older programming languages. As explained in section 5.8.3 of [1], using variants of the same notation for copy and reference semantics (such as the traditional operators := and = for copy, and the Simula operators :- and == for reference) would have been the worst possible solution, implying that minor syntactical or typing oversights could result in drastic changes of meaning. The notations had to be visibly different. New reserved words (Clone, copy, Equal) were deemed clearer than new operators. 5 BITS TYPES The type system of Eiffel is entirely based on the notion of class: every Eiffel type is now defined as a class. As will be seen shortly, this includes the types previously considered as ``basic'' and hence special: INTEGER, BOOLEAN, REAL, CHARACTER. Most classes, including these, will be defined by class declarations, under the syntax and semantics introduced in reference [1]. Since any non-empty class declaration must refer to other classes (to give the types of attributes, routine arguments, function results, and to list parents if an inheritance clause is present), some classes must be considered as predefined. Eiffel offers an unbounded set of predefined classes, written BITS M where M is any non-negative integer constant. For any M, the definition of class BITS M is that it describes objects whose representation will fit in M bits. All BITS classes are expanded classes. In other words, by declaring an entity such as quadruple: BITS 128 the programmer is requiring the supporting Eiffel environment to devote at least 128 bits to the representation of any object that becomes associated with quadruple at run-time. Of course, the environment is free to allocate more space. The BITS classes are essentially useful for low-level manipulations of bit strings, for machine-dependent definitions, and for interfacing with other languages. The last two applications are particularly significant: + Classes such as INTEGER, REAL and DOUBLE will be defined below as expanded classes with an attribute of the form value: BITS M for some M. The default M is 32 for INTEGER or REAL and 64 for DOUBLE. Thanks to the BITS classes the machine dependencies can be recognized and isolated in a few classes such as these. + Pre-2.2 Eiffel made it possible to exchange data between Eiffel and other languages such as C, but the sole interface unit was the word (32 bits by default). Now by declaring entities of type BITS M for arbitrary M, Eiffel software can send and receive data of any size. This is explained in more detail below. Classes BITS M, for any M, are considered to include and export the boolean operations and, or, xor and not (defined in infix form, as explained in the next section) and the constants true and false. This makes it possible to use these types for performing operations on bit strings of arbitrary length. In pre-2.2 Eiffel this was done through the library classes BOOLEAN_SET and INTEGER_SET, which remain available. For assignments involving the BITS M types, the following rule applies: + A value of type BITS M may be assigned to an entity of type BITS N for M N. Only the first M bits of the target are affected by such an assignment. This is consistent with the use of BITS types for low- level manipulations. No inheritance structure exists on BITS classes since they are expanded. (The above assignment rule could be construed as meaning that BITS M from BITS M+1 for all M, but this would be a rather special case for which the benefits of referring to inheritance concepts appear doubtless.) 6 INTERFACE WITH OTHER LANGUAGES [Readers who are not interested in detailed questions of Eiffel interface to other languages, particularly C, should skip to the next section.] The notion of expanded class also provides for a more flexible interface with the non-Eiffel world. In pre-2.2 Eiffel, Eiffel routines that needed to interface with routines written in other languages, such as C, could only pass them arguments of basic types or of class types; in the latter case the argument is actually a reference. This also applied to the types of the results of external functions. Furthermore, it was not possible for an Eiffel object to contain a sub-object modeled directly to a C structure type declaration, although this constraint was somewhat relieved by the presence of a routine copy_structure, in the basic library class INTERNAL, making it possible to copy the contents of a C structure into an Eiffel object. This concern is now addressed by composite classes. If each instance of C must contain a sub-object whose structure comes from external C software, then C should contain an attribute declaration of the form sub: EXP where EXP is an appropriate expanded class. Usually, EXP will simply be declared as expanded class EXP feature value: BITS M end where M (an integer constant) is large enough to cover the size of instances of the required C structure type. It would usually be imprudent to give EXP a more precise Eiffel structure, since this would require making non-portable assumptions about the internal structure of both Eiffel and C objects. The above, however, will be sufficient in practice since entities of type EXP, such as sub, should only be used as arguments to external function calls. The purpose is not to have Eiffel do the job of C or the reverse, but only to facilitate communication between the two worlds. This facility is consistent with the Eiffel approach to interfacing with other languages, based on two observations. First, reusability implies that Eiffel software must be able to interface with existing software written in other languages, but not that the Eiffel language should be corrupted by compatibility with older, unrelated designs. A passage is provided through the border, but it remains a border. Second (for the special case of C): C plays with respect to Eiffel the role that assembly languages played for earlier high-level languages: that of a low-level vehicle to be used only when one cannot do otherwise. As an aside, note that release 2.2 provides further support for more powerful interaction between Eiffel and other languages. In particular, a simple syntactical extension makes it possible to pass the address of an Eiffel routine to an external routine. The notation is @f, where f is a routine of the enclosing class; such an expression is only valid as actual argument to an external routine. 7 ARGUMENT PASSING The rules for argument passing are a consequence of the above. When an Eiffel routine calls another Eiffel routine, the semantics of argument passing is the same as that of assignment from actual to formal argument: by copy for arguments of expanded types (including, as will be seen below, the basic types), and by reference for arguments of unexpanded types. [C-unfascinated readers, please skip to next section.] For arguments to external routines, the rule cannot be as systematic; they must of necessity be adapted to the target language. Current C, for example, does not support passing of structures or arrays as arguments; only pointers to such elements may be passed. (ANSI C may be more liberal, but has yet shown little relevance to the real world of C programming.) So for C all arguments, whether expanded or not, are normally passed by reference. It is the responsibility of the target C routine to copy any structure or array if needed. An exception is made, however, for arguments of basic types (integer, real, boolean, character and the new type ``long real'' described below), which are passed by value for compatibility with tradition. For the results of external functions, the Eiffel compiler can take care of ensuring the proper semantics (copy for expanded, reference for unexpanded); a language- dependent convention is needed, however, to determine what the function will return. In C, the answer is the only portable one: the C function is assumed to return a pointer to the result for non-basic types, expanded or not. The code generated by the Eiffel compiler then takes care of performing a copy in the expanded case. An important practical caveat governs the exchange of arguments and function results between Eiffel and C (or another language). We may usually assume that the basic types are common to both languages, and hence that values of these types may be manipulated by both sides. Both C and Eiffel routines, for example, may perform additions on a given integer value. For values of any other type, however, only one side may safely execute operations other than parameter passing; the other side will simply be used as repository for the values, or to pass them along to further elements. There are exceptions to this rule, but they apply to special cases and require that the programmer be well versed in the implementation techniques for both languages. 8 PREFIX AND INFIX OPERATORS We are now almost ready to explain how basic types can be interpreted as classes. A syntactical facility, although not essential, will prove convenient for this. To facilitate expressiveness, we allow some routines to be used in prefix or infix form. The standard grammar for the declaration of routines (see [1], appendix C) is as follows (brackets introduce optional components: Routine_declaration = Routine_name [Formal_arguments] [Type_mark] "is" Routine_text Formal_arguments = "(" Entity_declaration_list ")" Type_mark = ":" Type The optional ``Type_mark'' gives the type of the result when the routine is a function. In pre-2.2 Eiffel, ``Routine_name'' is simply an identifier. We now allow the two new forms prefix '"'Operator'"' infix '"'Operator'"' In this syntax, ``Operator'' may only be one of the following: + - * / < > <= >= and or xor implies not div mod (Operators xor, for exclusive or, and implies, for boolean implication, are new with 2.2.) All these except not may be used in infix declarations; only +, - and not are permitted for a prefix declaration. Thus only a small group of common operators may be used in prefix or infix form. Routines declared in this way must all be functions. Infix routines must have exactly one argument; prefix routines must have none. Calls to such routines must use the corresponding operators, in prefix or infix form. Assume for example a class MATRIX with function declarations of the form infix ("+") (other: like Current): like Current is... prefix ("-"): like Current is... Then, for m1 and m2 of type MATRIX, the expressions m1 + m2 and -m1, respectively, will denote calls to the corresponding functions. The precedence of infix and prefix operators is fixed and given by the standard Eiffel precedence table (see section C.3 of [1]). It is important to note that the above possibility is not ``the introduction of overloading into Eiffel''. A powerful form of overloading has always been available in Eiffel, since any two classes can have different features of the same name, and dynamic binding makes it possible to obtain run-time discrimination. What is new is a simple syntactic extension, making it possible to use the standard arithmetic operators for any class, not just the basic types. 9 COMPARABLE and NUMERIC Two deferred classes are needed in the basic Eiffel library. Class COMPARABLE already existed in pre-2.2; its features now use the infix facility. Note that only "<=" needs to be deferred. deferred class COMPARABLE export infix "<", infix "<=", infix ">", infix ">=" feature infix "<=" (other: like Current): BOOLEAN is -- Is current element less than or equal to other? deferred end; -- "<=" infix "<" (other: like Current): BOOLEAN is -- Is current element less than other? do Result := Current < other and not Current.Equal (other) end; -- "<" infix ">" (other: like Current): BOOLEAN is -- Is current element greater than other? do Result := other < Current end; -- ">" infix ">=" (other: like Current): BOOLEAN is -- Is current element less than or equal to other? do Result := other <= Current end -- ">=" end -- class COMPARABLE Any class that needs to manipulate elements connected by an order relation may inherit from COMPARABLE. Class NUMERIC describes any type with the basic arithmetic operations: deferred class NUMERIC export infix "+", infix "-", infix "*", infix "/", prefix "+", prefix "-" feature infix "+" (other: NUMERIC): NUMERIC is -- Sum of other and current element deferred end; -- infix "+" prefix "-" BOOLEAN is -- Is current element less than other? deferred end; -- prefix "-" (etc.) end -- class COMPARABLE Any class that needs to manipulate elements on which operations similar to the basic arithmetic operations are available, may inherit from NUMERIC (and from COMPARABLE if necessary). 10 BASIC TYPES AND DOUBLE PRECISION We can now introduce the basic types as classes. This definition is based on an assessment of the present state of computer hardware: on most platforms, an integer or (short) real will fit in 32 bits; a character will fit in 8 bits; a long real will fit in 64 bits. I believe it is preferable to explicitly incorporate these industry-standard lengths into the programming language definition, rather than to maintain a pretense of implementation-independence. In practice such a pretense can only prevent software developers from writing portable software that will perform in a (reasonably) consistent fashion across a range of platforms. (Another, more axiomatic approach has been followed by Ada, but its advantages over the policy of making simple assumptions about the minimum bit-size requirements of basic data types are not clear in the present state of hardware technology and software theory.) These conventions are not really ``wired in'' and it is possible to change the definitions below to adapt to different sizes. The basic types are defined as follows. expanded class REAL export repeat NUMERIC, repeat COMPARABLE, value {REAL} inherit NUMERIC; COMPARABLE feature value: BITS 32; -- The real number's internal representation set_value (v: BITS 32) is -- Use v as the bit code for the real number's value do value := v end; -- set_value infix "+" (other: INTEGER): INTEGER is -- Sum of other and current element external short_real_addition (v, w: BITS 32): BITS 32 language "..." do Result.set_value (short_real_addition (value, other.value)) end; -- "+" ... And similarly for other prefix and infix operations ... end -- class REAL So we consider that a short real number is an expanded 32-bit object, which has the properties of NUMERIC objects, including those of COMPARABLE objects. The repeat notation which appears in the export clause for REAL is a facility (new for 2.2) which makes it possible to copy conceptually the export clause of an ancestor class such as COMPARABLE and NUMERIC into a descendant such as REAL without copying it physically. (In addition, value is exported to the class itself.) The repeat facility addresses a problem that was mentioned by users who developed large systems with pre-2.2 Eiffel: the tediousness of managing export lists when many levels of inheritance are involved. The problem should now disappear; if you specify with a repeat clause that the interface of a class D should always include the interface of one of its ancestors A, then any addition to A's interface will automatically be reflected in . The basic operations of class REAL operations are implemented through external routines, since these operations are to be carried out directly by the hardware (or through the lower-level implementation language used by the given implementation of Eiffel, such as assembly or C). The idea here is the same as with the way arrays are treated in pre-2.2 Eiffel. Arrays, and now real numbers, are viewed as instances of ``normal'' Eiffel classes in the Basic Library. From a practical standpoint, however, the implementation knows about these and implements the corresponding operations through shortcuts: no actual routine call is needed to access an array element, or to add two reals. So the usual efficiency of fundamental operations is achieved, but the conceptual framework (classes, objects, inheritance) remains consistent with the rest of the language. Class INTEGER is declared in a way similar to REAL: expanded class INTEGER export repeat NUMERIC, repeat COMPARABLE, value {INTEGER}, to_real inherit NUMERIC; COMPARABLE feature value: BITS 32; -- The integer's internal representation set_value (v: BITS 32) is -- Use v as the bit code for the integer's value do value := v end; -- set_value infix "+" (other: INTEGER): INTEGER is -- Sum of other and current element external integer_addition (v, w: BITS 32): BITS 32 language "..." do Result.set_value (integer_addition (value, other.value)) end; -- "+" ... And similarly for other prefix and infix operations ... end -- class INTEGER To allow for mixed-mode computation class REAL should also include a conversion function for integers, defined as follows: set_value_from_integer (n: INTEGER) is -- Convert from value given as integer external int_to_real (v: BITS 32): BITS 32 language "..." do set_value (int_to_real (n.value)) end; -- set_value_from_integer (With value in INTEGER selectively exported to REAL). In version 2.2, however, it will remain possible to assign an integer value to a real entity, with the expected effect of a conversion. This is an exception to the generality of the type scheme presented here, for compatibility with programmers' current habits. The other basic types are defined similarly. For CHARACTER, attribute value is of type BITS 8; for BOOLEAN, value is of type BITS 1. The type DOUBLE for double-precision reals, new with 2.2, is also an expanded class, almost identical to REAL, but inheriting from BITS 64 rather than BITS 32. The classes just discussed are all expanded and hence there is no inheritance relation among them. Despite what one might think at first look, it would not be appropriate to consider INTEGER as an heir to REAL or REAL as an heir to DOUBLE. If longer precision reals or integers are needed, the technique is easily generalized: just define new classes that inherit from NUMERIC and COMPARABLE and have an internal value attribute of the right BITS size. 11 UNEXPANDED CLASSES AND GENERICITY One more technique is needed to benefit the full benefit of expanded classes. We need to be able to define generic classes, representing such abstractions as matrices, in such a way that the formal generic parameters can be expanded classes such as DOUBLE representing objects of any given size. Consider the following general class: class MATRIX [T -> NUMERIC] export repeat NUMERIC inherit NUMERIC feature ... Matrix representation and operations, expressed in terms of the corresponding operations on objects of type T ... end -- class MATRIX This generic class will accept any unexpanded class as actual generic parameter corresponding to T. It will also accept expanded classes such as BOOLEAN, CHARACTER, INTEGER and REAL whose ``size'', defined below, is less than 32. It will not, however, accept DOUBLE or other ``big'' expanded classes. To describe matrices of bigger composite objects, you may define a different class, an heir to MATRIX with a very short definition such as: class D_MATRIX [T (DOUBLE) -> NUMERIC] export repeat MATRIX inherit MATRIX -- No need for a feature clause end -- class D_MATRIX The new facility here is the possibility to specify a class name CLASS_NAME (DOUBLE in this example) in parentheses after the name of the formal generic parameter G (T in this example). This means: ``Actual generic parameters corresponding to G must be classes of size no bigger than the size of CLASS_NAME''. The size of a class is defined as follows: + The size of all unexpanded classes is the same. It can be referred to as the size of the universal ancestor ANY. By default this is 32. + The size of one of the BITS M expanded classes is M. + The size of any other expanded class is obtained by adding the sizes of the types of all the attributes of the class (those defined in the class itself, and those inherited from ancestors). From the above definitions, then, the size of INTEGER and REAL is 32 and the size of DOUBLE is 64. Class D_MATRIX above should be used to describe matrices whose elements are no bigger than DOUBLE objects. This includes any unexpanded class as well as BOOLEAN, CHARACTER, INTEGER, REAL. The convention that the size of ANY and other unexpanded class is 32 results from an assessment of the current state of hardware technology. Instances of unexpanded classes are accessible through references, that is to say, at the implementation level, pointers. We thus accept that, for still some time: 1. + Limiting Eiffel applications to manipulate 28319 (about two billion) dynamic objects during a given session is acceptable 2. + Computer memory is not cheap enough that it would be preferable to raise this limitation to, say, 2639 , if this implied doubling the space occupied by the pointers generated during a session. 12 CONCLUSION The mechanisms described above may at first appear as a significant extension of pre-2.2 Eiffel. They are not. Instead, the work reported here should primarily be viewed as a reorganization and cleanup of the theoretical framework, which makes important improvements available at the cost of very little actual extension. As a matter of fact the above discussion has introduced only four new reserved words: BITS expanded infix prefix From a theoretical viewpoint, four have been removed: INTEGER, REAL, BOOLEAN, CHARACTER. If we accept that DOUBLE would have been needed anyway, the reserved word count is actually decreased by one. (The names of the basic types will, however, remain reserved in release 2.2 for practical reasons.) The starting point for this work was both practical and theoretical. From a practical viewpoint, I had known for a long time that it would be ultimately indispensable to support double precision; however double precision initially appeared extremely difficult in the presence of genericity. Clearly I did not want an implementation of genericity which would result (as most Ada compilers do for generic packages) in duplicating code for each separate use of a generic class; this was deemed unacceptable. At some point I realized that, from a practical perspective, the mechanism of constrained genericity (implemented for 2.1) contained the key. But then why stop at double precision? Eiffel strives for generality; what if someone wants, say, 128-bit reals? Another practical concern, which had been present for some time, was to avoid dynamic allocation whenever possible. In most applications, the overhead of pointers is acceptable; whether in Eiffel, Pascal, Ada or C, most non- trivial data structures are accessed through pointers anyway. Still, this overhead is annoying for applications that manipulate very large numbers of composite objects each of which contains several small but non-atomic components - say triangles, each containing three ``point'' attributes. Yet another motive was the desire to improve the flexibility of the interface of Eiffel with other languages, especially C. This requirement is not an internal one, since for our own developments the external interface of pre-2.2 Eiffel, based on external routines that essentially pass BITS 32 values to and from Eiffel, is more than enough. Eiffel users have increasingly requested support for more general interface mechanisms, however; the aim in particular is to facilitate cooperation between Eiffel software and existing packages (graphics, databases, expert systems etc.) The support for composite objects should be particularly useful here, since it covers a need often voiced by users, that of having Eiffel objects contain sub-objects described by C structures. Support for prefix and infix operators was never by itself a major concern. In fact I was (and remain) strongly against unbounded operator overloading, which opens the door to all kinds of notational abuses; I am much more interested in the semantically useful overloading permitted by redefinition and dynamic binding than by purely syntactical techniques which make it possible to call a two-argument routine under the form a @#$ b (say). The ability to define the precedence of new operators, in particular, has always struck me as a perfect example of programming language feature that is bad on all counts: it is difficult to find a good syntax for it (do you specify precedence by a number, forcing each program writer or reader to refer to the programming language reference manual each time, or do you say ``one less than the precedence of operator xxx''?); it makes it particularly easy to write tricky programs; it invites bugs; it makes the parser - normally the most trivial and worriless part of a modern compiler - considerably more difficult to implement (since each program may change the syntax); and to top the whole thing, it is only a superficial, syntactical, ``vanity'' extension. Not the most urgent feature to add to Eiffel. It was clear, however, that if we were to treat basic types formally as classes (see next), then we should be able to interpret the basic prefix and infix operators as denoting routines. Why not then make them available to writers of normal programmer-defined classes, such as MATRIX, VECTOR and the like? Since the concept would only apply to a few predefined operators, the extension would remain small and reasonable. I also had more theoretical concerns. The difference of treatment between basic types and class types in pre-2.2 Eiffel, explained in chapter 5 of [1], came directly from Simula. I knew that some difference was necessary: in most cases, class types need reference semantics, and usually they also need dynamic object creation, but no one would want to allocate integers dynamically. Still, this inconsistency was unpleasant in a language that emphasizes simplicity and uniformity. The concepts of class seemed to provide such a powerful basis for the discussion of types that it was sad to have to abandon it for a handful of basic types - the most fundamental at that. When I presented Eiffel concepts in public courses, listeners would often question the special treatment of basic types, asking why INTEGER, for example, was not a class. My answer usually included (among other arguments) that any theory of types needed to start somewhere, and thus to assume a few pre-existent, ``Bourbaki-given'' types. Why not, the argument would go, use for these the few types with which everybody is familiar, and which are readily available on all hardware? I would usually say something like ``In the most purist object-oriented approach, we would still need to take at least one type for granted: the bit''. In a way, the framework presented above implements this purist approach, although it considers not a single bit type, but an unbounded sequence of bit types, BITS M. Everything else is derived from these. Thanks to the notion of expanded class, which also addresses the problems that I have called ``practical concerns'', the basic types make perfect sense in this context; they cease to be special language elements with magical properties and become standard library classes. To make the solution clear and uniform, limited support for infix and prefix operators, useful in its own right, may be introduced. No loss in performance is implied because the compiler can be made smart enough to recognize these classes and apply special techniques to them; in fact, the pre-2.2 implementation of basic types remains entirely valid. Finally, the introduction of expanded classes makes it possible to tackle the dual-semantics issue in a much more explicit, complete, and (I hope) convincing fashion than ever before. I feel that the general cleanup of the Eiffel type system is almost complete with the discussion of this article (and the companion discussion [2], which addresses details of the rules making Eiffel fully statically typed through constraints on polymorphism, not well covered in [1] and not yet fully supported by the implementation). To make the basic type into fully ``ordinary'' citizens, we would also need to deal with manifest constants, which could be treated as ``zerofix'' operators. Example of such operators would be 0 and 1, now introduced as deferred functions in class NUMERIC, corresponding to the neutral elements for operations infix "+" and infix "*". The other integers would be interpreted as syntactical abbreviations for 1+1, 1+1+1 etc. Such an extension would require that we be more formal, separating what has been called NUMERIC into different inheritance levels (GROUP, RING, FIELD etc.) characterized by the proper assertions. This is worth exploring further; there may be the root of a full type theory, perhaps even of a general model for computation. This extension, however, will not be part of Eiffel release 2.2. References [1] B. Meyer, Object-Oriented Software Construction, Prentice-Hall, 1988. [2] B. Meyer, The complete rules for static typing in Eiffel, unpublished. -- -- Bertrand Meyer bertrand@eiffel.com
jos@cs.vu.nl (Jos Warmer) (06/08/89)
In article <154@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes: > > As an aside, note that release 2.2 provides > further support for more powerful interaction > between Eiffel and other languages. In > particular, a simple syntactical extension makes > it possible to pass the address of an Eiffel > routine to an external routine. The notation is > @f, where f is a routine of the enclosing class; > such an expression is only valid as actual > argument to an external routine. > I guess/hope that this can be used as an efficient interface for calling Eiffel routines from C. But I don't understand the meaning of this. An Eiffel routine has meaning only when it is called on an Eiffel object as in 'object.routine'. An Eiffel routine can never be called as a standalone function as C routines can. Besides, because of the dynamic binding in Eiffel, you can never be sure what implementation of a routine is called. Even when the names of the routines are equal and they are called on an object of the same entity type, they can be different implementations. So, how can this address be used by the external routines ? They (or hopefully the Eiffel runtime system) must ensure that the routine will only be called on an Eiffel object of the correct (dynamic) type. If the routine address is used by the external routine to call it directly, this check cannot be performed by the Eiffel runtime system. This means that the external routines can easily corrupt internal Eiffel structures. If the routine address cannot be used by the external routine to call it directly, what use is it? ____ABOUT_THE_EFFICIENCY_ISSUE____ Some time ago, I mentioned to ISE that the way of calling Eiffel routines from C was very inefficient. In version 2.1b this is accomplished by the C routine `eif_rout', which is declared as follows: DATUM eif_rout (Objptr, r_name, arg1, ... OBJPTR Objptr; char *r_name; `Objptr' is a pointer to the eiffel-object and `r_name' is the name of the routine that must be performed on `Objptr'. In the implementation of `eif_rout', the array containing the names of all applyable routines for `Objptr' is searched for `r_name'. If `r_name' is found, the corresponding routine is called. This means that a number of string-comparisons will have to take place for each function call. I am not a fan for efficiency at each cost, but when this interface must be used heavily, this is prohibitive. Jos Warmer jos@cs.vu.nl ...uunet!mcvax!cs.vu.nl!jos -- Jos Warmer jos@cs.vu.nl ...uunet!mcvax!cs.vu.nl!jos
sakkinen@tukki.jyu.fi (Markku Sakkinen) (06/12/89)
The article referred to was entitled "EIFFEL TYPES" (by Bertrand Meyer). Note: this submission talks _only_ about the Eiffel-C interface (one short section of the full paper length article). In section 7 (Argument passing), Meyer writes: > For arguments to external routines, the rule cannot be > as systematic; they must of necessity be adapted to the > target language. Current C, for example, does not support > passing of structures or arrays as arguments; only pointers > to such elements may be passed. (ANSI C may be more liberal, > but has yet shown little relevance to the real world of C > programming.) So for C all arguments, whether expanded or > not, are normally passed by reference. It is the > responsibility of the target C routine to copy any structure > or array if needed. An exception is made, however, for > arguments of basic types (integer, real, boolean, character > and the new type ``long real'' described below), which are > passed by value for compatibility with tradition. > It must be a weird and antiquated C compiler indeed that cannot pass _structures_ themselves both as arguments and as function results. The above holds for _arrays_, though; that is a basic (mis)feature of C (and C++). Of course, pointers to structures are very often used as arguments in practice: either a call-by-reference effect is desired or copying large structures would appear inefficient. > For the results of external functions, the Eiffel > compiler can take care of ensuring the proper semantics > (copy for expanded, reference for unexpanded); a language- > dependent convention is needed, however, to determine what > the function will return. In C, the answer is the only > portable one: the C function is assumed to return a pointer > to the result for non-basic types, expanded or not. The > code generated by the Eiffel compiler then takes care of > performing a copy in the expanded case. > Returning a pointer (to a struct or an array) as the result of a C (or C++) function is very dangerous and should be done only if its referent (the _object pointed to_) fulfills one of the conditions: 1. its storage class is static or extern; 2. its storage class is auto, but it is defined in a calling function; 3. it has been allocated in dynamic memory before calling the function (and hopefully "someone" will delete it when it is no longer needed); 4. it has been allocated in dynamic memory within the function and its remaining allocated will not cause too much memory shortage (there is no garbage collection in the C or C++ realm). The most important point is: never pass a pointer to an _automatic_ variable of a function as the result of that function (cf. my short article in ACM SIGPLAN Notices, March 1989)! My recommendation for most cases is therefore: A. Pass any structure _itself_ (not a pointer to it) as the function's result. B. Since it is impossible to pass an _array_ itself, declare a structure that has the array as its only element; C (or C++) will let you pass the structure. This is only a compile-time trick that adds nothing to the runtime objects, so the Eiffel side should be able to handle the result directly as an (expanded?) array. Probably there _are_ cases in which one would rather like to pass a pointer as the result, especially with arrays. One must then be very careful. Markku Sakkinen Dept. of Computer Science, University of Jyvaskyla (imagine the a's in 'Jyvaskyla' with umlauts)