bertrand@eiffel.UUCP (Bertrand Meyer) (06/05/89)
The following is a revised version of the article quoted. Much of the text
is common but there have been some important simplifications; in
particular, inheritance is used in a much more limited way than in the
original description. (Expanded classes cannot be used as parents any
more).
If anyone is keeping a record, please discard the original.
Many thanks for all comments received on the previous posting.
The text below was produced with nroff from input meant for troff
(with italics and the like); I hope it is understandable.
I use this opportunity to mention that in an earlier message about 2.2
syntax (<153@eiffel.UUCP>) an item was forgotten in the list of new
keywords. The omitted keyword is ``repeat''.
-------------------------------------------------------------
EIFFEL TYPES
Bertrand Meyer
Draft, March 1989
Revised, 14 May 1989
This note presents a unified view of the Eiffel types.
The language as described is what is available for version
2.2 and includes some extensions over the definition given
by reference [1]. No incompatibility is introduced; any
Eiffel class that was legal according to [1] remains legal,
with the same semantics.
The conceptual framework and extensions presented here
introduce a remarkable number of benefits, both theoretical
and practical:
+
Independently of any actual change to the language,
the concept of type in Eiffel is presented in a
clearer and more uniform way. All Eiffel types are
now classes, and multiple inheritance is the basic
mechanism for constructing new classes. This means
in particular that basic types (integer, boolean,
character and real) need no longer be treated
according to special rules. In fact, the names
INTEGER, BOOLEAN, CHARACTER and REAL need no longer be
treated as reserved words in the language: instead,
they are names of classes in the Basic Eiffel
Library. (For practical reasons, however, these names
will remain reserved in version 2.2.)
+
Double precision reals, which were not previously
supported directly in Eiffel because of theoretical
and practical problems relating in particular to
genericity, are now available. In fact, any precision
can be supported.
+
Limited support is offered for programmer-defined
infix and prefix operators.
+
Programmers can improve the efficiency of their
software by writing classes whose instances include
other objects. Up to release 2.1, objects could refer
to other objects through reference fields, but could
not contain other objects. The new possibility
avoids unneeded indirections.
+
Exchange of data between Eiffel software and external
routines is made easier, particularly when the
external routines are written in C. Eiffel objects
may contain sub-objects which the C world views as C
structures.
+
Operations on bit sequences of arbitrary length
(previously supported by the library class
BOOLEAN_SET) are handled simply within the base
language.
Throughout the rest of this note, ``Eiffel'' refers to
version 2.2 Eiffel. The language defined by [1] and
implemented as version 2.1 is called ``pre-2.2 Eiffel''.
1 EXPANDED CLASSES
The most important new concept is that of expanded class,
which makes it possible to declare entities (variables) that
denote objects rather than references to objects. As a
result, it is now possible to create ``composite'' objects -
objects which contain sub-objects.
In pre-2.2 Eiffel, any entity of a class type, for
example an attribute declared as
ref: CLASS_NAME
where CLASS_NAME is the name of a class, represents not an
object but a reference to a potential object. The object is
``potential'' because it is only allocated explicitly,
through a Create operation. Until it becomes associated with
an object, a reference is said to be ``void''.
This remains true by default, but the situation will be
different if CLASS_NAME has been declared under a newly
provided mode, called ``expanded''. The syntax for
declaring an expanded class is
expanded class CLASS_NAME ... The rest as before ...
As will be seen below, an expanded class may not be
deferred, so the problem of the respective order of the
optional keywords expanded and deferred does not arise.
The effect of declaring a class EXP as expanded is
simple: any entity declared of type EXP will represent not a
reference to potential objects of type EXP, but directly
objects of type EXP. (Remember that ``entities'' in Eiffel
cover class attributes, routine arguments, function results,
local routine variables.)
So if a class C contains an attribute declared as
sub: EXP
then any instance of C will contain a sub-object of type
EXP, accessible through attribute sub. A class such as C
having at least one attribute of expanded type is called a
composite class; its instances are called composite objects.
Figure 1 pictures a composite object.
By default, classes are unexpanded. If a class UNEXP
is not explicitly declared as expanded, an entity
declaration
ref: UNEXP
keeps its pre-2.2 meaning: ref denotes a reference to a
potential object of type UNEXP, to be created dynamically.
Assuming class C contains both of the above attribute
declarations, the structure of an instance of this class is
illustrated on figure 1.
                --------------------
        ref     |         ---------|---> To dynamic instances
                --------------------     of UNEXP
                |                  |
                |                  |
                |      Object      |
        sub     |      of type     |
                |      EXP         |
                |                  |
                |                  |
                --------------------
                |                  |
                --------------------
        Other   |                  |
                --------------------
        fields  |                  |
                --------------------
                |                  |
                --------------------
                Figure 1: Composite object
If EXP had not been declared as expanded, attribute sub
would have denoted an attribute which, in any instance of C,
represents a reference to potential instances of EXP.
Composite objects serve to model external objects that
have non-simple components. The modeling of such objects
was, of course, possible in pre-2.2 Eiffel, but only by
making the objects contain references to their components,
not the components themselves. Now it is possible to
specify that the objects are composite, so that they
actually contain sub-objects. The improvement is not
functional - no data structure can be described which could
not also have been modeled in pre-2.2 - but is one of
performance: useless indirections may now be avoided.
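The difference between containing a sub-object and referring to it can be sketched outside Eiffel. The following is only an illustration in Python, using the standard ctypes module to lay out C-like records; the names EXP, CInline and CIndirect and their fields are hypothetical and say nothing about how an Eiffel implementation actually lays out objects.

```python
import ctypes

class EXP(ctypes.Structure):
    # Stand-in for an expanded class with two integer fields (hypothetical).
    _fields_ = [("x", ctypes.c_int32), ("y", ctypes.c_int32)]

class CInline(ctypes.Structure):
    # Composite object: the EXP sub-object is stored inside the enclosing object.
    _fields_ = [("ref", ctypes.c_void_p), ("sub", EXP)]

class CIndirect(ctypes.Structure):
    # Pre-2.2 style: the object only holds a reference to a separate EXP object.
    _fields_ = [("ref", ctypes.c_void_p), ("sub", ctypes.POINTER(EXP))]

def sub_is_inside(obj):
    """True if the storage of an EXP-typed field obj.sub lies within obj itself."""
    start = ctypes.addressof(obj)
    sub_start = ctypes.addressof(obj.sub)
    sub_end = sub_start + ctypes.sizeof(EXP)
    return start <= sub_start and sub_end <= start + ctypes.sizeof(obj)

inline = CInline()
inline.sub.x = 7                      # direct access to the contained sub-object

target = EXP(x=7, y=0)                # separately allocated object
indirect = CIndirect(sub=ctypes.pointer(target))
value = indirect.sub.contents.x       # one extra step: follow the reference first
```

In this model, reaching a field of the contained sub-object needs no dereference, whereas the indirect variant must first follow the pointer; this is the indirection that composite objects avoid.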
In most respects, expanded classes have the same
properties as default (unexpanded) classes. An important
difference is that instances of expanded classes do not need
to be created explicitly. If the type of sub is an expanded
class, the call
sub.Create
is still permitted; its sole effect, however, is to
initialize the fields of the object associated with sub to
their default values (based on their types) and then to call
the specific Create procedure of the class, if present.
(For unexpanded classes, Create also has these two effects,
but it precedes them by allocating a new object of the
appropriate type.)
If C is a composite class (expanded or not) with an
attribute sub of expanded type EXP, the application of
Create to an instance of C implies that all fields of the
instance will be initialized to their default values. This
applies to field sub; when performing the default
initializations, the Create of C will call the Create of EXP
(default or specific) on this field.
Two other predefined Eiffel features, Void and Forget,
are also allowed on entities of expanded types. If sub is
such an entity, sub.Void always returns false and sub.Forget
has no effect.
Three simple rules apply to expanded classes as a
result of the above discussion. Violation of these rules
will lead to compile-time errors.
+
As any other class, an expanded class may or may not
have a specific Create procedure. If it has one,
however, this procedure may not have any arguments.
This is clearly required in light of the rule for
Create: in the above example, there is no way to
initialize an instance of the composite class C if
the initialization of its expanded sub component
requires specific information.
+
Expanded classes may also be composite: in other
words, an expanded class may itself have an
attribute of expanded type. Among expanded classes,
however, the relation ``A has an attribute of type
B'' may not contain any cycles. The reason for this
rule is obvious: we cannot implement A and B in such
a way that an instance of any of these classes
contains a sub-object which is an instance of the
other.
+
An expanded class may not be deferred. (For one
thing, its Create procedure could contain a call to a
deferred routine. Create is disallowed for deferred
classes anyway.)
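The second rule (no cycles in the relation ``A has an attribute of type B'' among expanded classes) lends itself to a mechanical check. The sketch below, in Python, is one possible way to detect such a cycle by depth-first search; the function name and the dictionary representation of the relation are assumptions made for the illustration, not a description of what the Eiffel compiler does.

```python
def find_expanded_cycle(attributes):
    """Return a list of class names forming a cycle, or None if there is none.

    attributes maps each expanded class name to the names of the expanded
    classes used as attribute types in that class.
    """
    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    colour = {name: WHITE for name in attributes}
    path = []

    def visit(name):
        colour[name] = GREY
        path.append(name)
        for target in attributes.get(name, ()):
            state = colour.get(target, WHITE)
            if state == GREY:
                # target is already on the current path: a cycle is closed
                return path[path.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        colour[name] = BLACK
        return None

    for name in list(attributes):
        if colour[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

For example, two expanded classes A and B that each have an attribute of the other's type would be reported as the cycle A, B, A, while a class A with attributes of types B and C, where B has an attribute of type C, is accepted.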
2 INHERITANCE AND CONFORMANCE
Inheritance is the fundamental mechanism for constructing
new classes from existing ones. Inheritance serves two
related purposes:
+
As a module enrichment mechanism, it allows a class
to reuse the features defined in another.
+
As a type refinement mechanism, it restricts the
scope of legal assignments: b may only be assigned to
a if the type of b is a descendant (heir through any
number of levels) of the type of a. If the types are
different the assignment is said to introduce
polymorphism, meaning that the target, a, may at
run-time refer to objects of more than one type.
The inheritance mechanism has been defined in [1] for
unexpanded classes. Expanded classes introduce the
following new rules.
1.
No class may inherit from an expanded class.
2.
An expanded class may inherit from an unexpanded
class.
3.
The above type compatibility rule for assignments
(the type of the source must be a descendant of the
type of the target) remains valid. Because of rule 1,
this means that if either source or target is of an
expanded type then both must be of the same expanded
type. An exception is made for the types BITS M, as
described below.
In other words, inheritance for expanded classes acts
only in its module enrichment capacity. No polymorphism is
permitted here. This is due to the nature of expanded
classes: an entity of type EXP, where EXP is expanded, is
meant to be directly associated with an object of type EXP;
so its size is frozen. In contrast, entities of unexpanded
types are associated with references to objects; the size of
such an object does not need to be identical for all
possible assignments.
3 ASSIGNMENT AND EQUALITY TESTING
An assignment of b to a, as just discussed, can be of any of
the following three forms (the last one new):
+
a := b
+
a.Clone (b)
+
a.copy (b)
The previous rules apply to all three forms. As noted,
if the type of either a or b is expanded, then both types
must be expanded and (except for BITS M, as discussed below)
identical.
Clone is written with an initial upper-case because it
is a predefined language feature and hence a reserved word.
The name copy, however, is not reserved; instead, copy is a
routine of the new universal class ANY, automatically
inherited by every class.
The effect of assignment under its three forms depends
on whether the types are expanded or not. For unexpanded
types, the rules are the same as in pre-2.2 Eiffel, with the
addition of copy:
+
:= denotes reference assignment. After the assignment
a will refer to the same object as b, or will be void
if b was.
+
Clone represents object duplication. If b refers to
an object, the Clone operation creates a new copy of
this object, and makes a refer to it. If b was void,
a becomes void.
+
copy represents copy without allocation. The contents
of the object associated with b are copied onto the
object associated with a. If either a or b is void,
an exception is raised.
The copies performed by both Clone and copy are
``shallow'' copies; in other words they copy one entire
object, but that object only, the reference fields being
copied verbatim.
For an expanded type, all three assignments have the
effect just described for copy. Because of the rules given,
neither a nor b may be void in this case, so no exception
will ever be raised.
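The three forms of assignment for unexpanded types can be imitated in a language with reference semantics. The following Python sketch is only a model: the class Point and the functions assign, clone and copy_onto are hypothetical names chosen for the illustration, and Python's None plays the role of a void reference.

```python
import copy

class Point:
    """Stand-in for an unexpanded class; instances are reached through references."""
    def __init__(self, x, tag):
        self.x = x
        self.tag = tag                # a reference field (here, a list)

def assign(b):
    """Model of  a := b : the result denotes the very same object as b."""
    return b

def clone(b):
    """Model of  a.Clone (b) : a newly allocated shallow copy, or void if b is void."""
    return None if b is None else copy.copy(b)

def copy_onto(a, b):
    """Model of  a.copy (b) : overwrite the fields of the existing object a."""
    if a is None or b is None:
        raise ValueError("copy applied to a void reference")
    a.__dict__.clear()
    a.__dict__.update(b.__dict__)     # reference fields are copied verbatim
```

With these definitions, assign yields an alias, clone yields a distinct object whose reference fields are shared with the original, and copy_onto leaves the identity of the target unchanged. For an entity of expanded type, all three forms would behave like copy_onto.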
A similar semantics is defined for equality tests. For
unexpanded types, the boolean expression a = b denotes a
test for equality of references; the expression a.Equal (b)
denotes a test for (shallow) field-by-field equality,
returning true if and only if the contents of the corresponding
objects are field-by-field identical, or both references are
void. For expanded types, both tests have the semantics of
Equal.
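The two equality tests can be modeled in the same spirit. In this Python sketch (hypothetical names Cell, ref_equal and equal; None stands for a void reference), fields holding basic values are compared by value and fields holding references are compared by identity, which is one reading of ``shallow'' field-by-field equality.

```python
class Cell:
    """Stand-in for an unexpanded class with a basic field and a reference field."""
    def __init__(self, item, link):
        self.item = item
        self.link = link

BASIC = (bool, int, float, str)       # assumed counterparts of the basic types

def ref_equal(a, b):
    """Model of  a = b  on unexpanded types: same object, or both void."""
    return a is b

def _field_equal(x, y):
    # Basic values compare by value; anything else compares by reference.
    if isinstance(x, BASIC) and type(x) is type(y):
        return x == y
    return x is y

def equal(a, b):
    """Model of  a.Equal (b) : shallow field-by-field equality."""
    if a is None or b is None:
        return a is None and b is None
    if type(a) is not type(b):
        return False
    fa, fb = vars(a), vars(b)
    return fa.keys() == fb.keys() and all(_field_equal(fa[k], fb[k]) for k in fa)
```

Two distinct cells with the same item and the same link are equal under equal but not under ref_equal; if their links denote different objects with identical contents, the shallow test still reports them as different.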
4 ENSURING CONSISTENT SEMANTICS FOR GENERIC CLASSES
An important design guideline for generic classes follows
from the preceding specification of assignment and equality
operators.
It is possible in Eiffel to write general-purpose
classes describing ``container'' data structures (such as
lists, trees etc.) containing objects of arbitrary types.
These are written as generic classes. Genericity is
essential to reconcile static typing with the need for
reusable container classes.
Consider, however, a routine of a generic class, acting
on values of generic type; by generic type, we mean a type
defined by a formal generic parameter of the class, for
example T in a routine of class C [T]. The problem arises
of how programmers can write such a routine and be assured
that it has the same semantics regardless of the types used
as actual generic parameters for practical uses of the
class.
In light of the above discussion, the answer is clear.
Only copy semantics is available for both expanded and
unexpanded types. (For expanded types reference semantics
does not make sense.) Writers of generic classes may thus
guarantee uniform semantics by making sure that:
+
All equality tests between expressions of generic
type use Equal rather than =.
+
All assignments use copy rather than :=. Clone
may also be used if, for generic parameters which are
unexpanded classes, duplication is desired rather
than mere copy; the difference between Clone and copy
is irrelevant for expanded classes.
This rule is particularly important because, as will be
seen below, basic types (INTEGER, REAL and others) are
defined as expanded classes. Using Equal and copy is thus
the appropriate way to guarantee that a generic class will
have uniform semantics for basic types and other class
types.
It is important to note that this was essentially
already true in pre-2.2 Eiffel: as a first step towards the
more general solution described here, Equal and Clone were
already available on basic types, with the respective
semantics of = and := on these types. (copy did not exist in
pre-2.2 Eiffel.) The rules described above simply generalize
and systematize these conventions.
Clearly, the above guidelines should only be followed
if uniform semantics is desired; they do not mean that = and
:= should be banned from generic classes. One can conceive
of perfectly valid generic classes for which reference
semantics is desired when the actual generic parameter is an
unexpanded class, and value semantics is desired when the
generic parameter is an expanded class, in particular a
basic type.
Since field-by-field object comparison and object copy
are the operations that make sense in all cases, it is
legitimate to criticize the syntax retained: why are the
simpler symbols (= and :=) used for operations whose
semantics is not the same for expanded and unexpanded types,
and more verbose notations (Equal, copy) used for the
operations which have consistent semantics and hence appear
``cleaner''?
This is a valid criticism. Indeed, I seriously
considered, during the initial design of Eiffel, making :=
denote copy assignment and = denote object equality. But
this idea was rejected out of fear that such a clash with
the tradition of all previous languages supporting reference
semantics (Pascal, Ada, PL/I, and many others) would result
in numerous mistakes on the part of new Eiffel programmers,
not all of whom may yet be expected to learn Eiffel as their
first programming language.
The relative clumsiness of the syntax for the more
uniform operations was deemed less unpleasant than the
prospect of massive mistakes by the cohorts of new converts,
still recovering from the influence of older programming
languages.
As explained in section 5.8.3 of [1], using variants of
the same notation for copy and reference semantics (such as
the traditional operators := and = for copy, and the Simula
operators :- and == for reference) would have been the worst
possible solution, implying that minor syntactical or typing
oversights could result in drastic changes of meaning. The
notations had to be visibly different. New feature names
(Clone, copy, Equal) were deemed clearer than new operators.
5 BITS TYPES
The type system of Eiffel is entirely based on the notion of
class: every Eiffel type is now defined as a class. As will
be seen shortly, this includes the types previously
considered as ``basic'' and hence special: INTEGER, BOOLEAN,
REAL, CHARACTER.
Most classes, including these, will be defined by class
declarations, under the syntax and semantics introduced in
reference [1]. Since any non-empty class declaration must
refer to other classes (to give the types of attributes,
routine arguments, function results, and to list parents if
an inheritance clause is present), some classes must be
considered as predefined. Eiffel offers an unbounded set of
predefined classes, written
BITS M
where M is any non-negative integer constant.
For any M, the definition of class BITS M is that it
describes objects whose representation will fit in M bits.
All BITS classes are expanded classes.
In other words, by declaring an entity such as
quadruple: BITS 128
the programmer is requiring the supporting Eiffel
environment to devote at least 128 bits to the
representation of any object that becomes associated with
quadruple at run-time. Of course, the environment is free to
allocate more space.
The BITS classes are essentially useful for low-level
manipulations of bit strings, for machine-dependent
definitions, and for interfacing with other languages. The
last two applications are particularly significant:
+
Classes such as INTEGER, REAL and DOUBLE will be
defined below as expanded classes with an attribute
of the form value: BITS M for some M. The default M
is 32 for INTEGER or REAL and 64 for DOUBLE. Thanks
to the BITS classes the machine dependencies can be
recognized and isolated in a few classes such as
these.
+
Pre-2.2 Eiffel made it possible to exchange data
between Eiffel and other languages such as C, but the
sole interface unit was the word (32 bits by
default). Now by declaring entities of type BITS M
for arbitrary M, Eiffel software can send and receive
data of any size. This is explained in more detail
below.
Classes BITS M, for any M, are considered to include
and export the boolean operations and, or, xor and not
(defined in infix form, as explained in the next section)
and the constants true and false. This makes it possible to
use these types for performing operations on bit strings of
arbitrary length. In pre-2.2 Eiffel this was done through
the library classes BOOLEAN_SET and INTEGER_SET, which
remain available.
For assignments involving the BITS M types, the
following rule applies:
+
A value of type BITS M may be assigned to an entity
of type BITS N for M <= N. Only the first M bits of the
target are affected by such an assignment.
This is consistent with the use of BITS types for low-
level manipulations.
No inheritance structure exists on BITS classes since
they are expanded. (The above assignment rule could be
construed as meaning that BITS M inherits from BITS M+1 for
all M, but this would be a rather special case for which the
benefits of referring to inheritance concepts appear
doubtful.)
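The assignment rule and the boolean operations on BITS types can be illustrated by a small model. The Python class below is a sketch under stated assumptions: the name Bits and its methods are hypothetical, ``the first M bits'' is taken to mean positions 0 to M-1, and the boolean operations are only defined here between values of the same width.

```python
class Bits:
    """Fixed-width bit string, a rough model of an entity of type BITS N."""

    def __init__(self, width, bits=None):
        if width < 0:
            raise ValueError("width must be non-negative")
        self.width = width
        self.bits = [False] * width   # default initialization: all bits false
        if bits is not None:
            if len(bits) != width:
                raise ValueError("expected %d bits" % width)
            self.bits = [bool(b) for b in bits]

    def assign(self, source):
        """Model of  target := source  with source: BITS M, target: BITS N, M <= N."""
        if source.width > self.width:
            raise TypeError("BITS %d cannot be assigned to BITS %d"
                            % (source.width, self.width))
        # Only the first M bits of the target are affected.
        self.bits[:source.width] = list(source.bits)

    def _check(self, other):
        if other.width != self.width:
            raise TypeError("operands must have the same width")

    def __and__(self, other):
        self._check(other)
        return Bits(self.width, [a and b for a, b in zip(self.bits, other.bits)])

    def __or__(self, other):
        self._check(other)
        return Bits(self.width, [a or b for a, b in zip(self.bits, other.bits)])

    def __xor__(self, other):
        self._check(other)
        return Bits(self.width, [a != b for a, b in zip(self.bits, other.bits)])

    def __invert__(self):
        # Plays the role of the prefix operator "not".
        return Bits(self.width, [not a for a in self.bits])
```

Assigning a 3-bit value to an 8-bit target overwrites the first three positions and leaves the remaining five untouched; the reverse assignment is rejected.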
6 INTERFACE WITH OTHER LANGUAGES
[Readers who are not interested in detailed questions of
Eiffel interface to other languages, particularly C, should
skip to the next section.]
The notion of expanded class also provides for a more
flexible interface with the non-Eiffel world. In pre-2.2
Eiffel, Eiffel routines that needed to interface with
routines written in other languages, such as C, could only
pass them arguments of basic types or of class types; in the
latter case the argument is actually a reference. This also
applied to the types of the results of external functions.
Furthermore, it was not possible for an Eiffel object to
contain a sub-object modeled directly on a C structure type
declaration, although this constraint was somewhat relieved
by the presence of a routine copy_structure, in the basic
library class INTERNAL, making it possible to copy the
contents of a C structure into an Eiffel object.
This concern is now addressed by composite classes. If
each instance of C must contain a sub-object whose structure
comes from external C software, then C should contain an
attribute declaration of the form
sub: EXP
where EXP is an appropriate expanded class. Usually, EXP
will simply be declared as
expanded class EXP feature
value: BITS M
end
where M (an integer constant) is large enough to cover the
size of instances of the required C structure type.
It would usually be imprudent to give EXP a more
precise Eiffel structure, since this would require making
non-portable assumptions about the internal structure of
both Eiffel and C objects. The above, however, will be
sufficient in practice since entities of type EXP, such as
sub, should only be used as arguments to external function
calls. The purpose is not to have Eiffel do the job of C or
the reverse, but only to facilitate communication between
the two worlds.
This facility is consistent with the Eiffel approach
to interfacing with other languages, based on two
observations. First, reusability implies that Eiffel
software must be able to interface with existing software
written in other languages, but not that the Eiffel language
should be corrupted by compatibility with older, unrelated
designs. A passage is provided through the border, but it
remains a border. Second (for the special case of C): C
plays with respect to Eiffel the role that assembly
languages played for earlier high-level languages: that of a
low-level vehicle to be used only when one cannot do
otherwise.
As an aside, note that release 2.2 provides
further support for more powerful interaction
between Eiffel and other languages. In
particular, a simple syntactical extension makes
it possible to pass the address of an Eiffel
routine to an external routine. The notation is
@f, where f is a routine of the enclosing class;
such an expression is only valid as actual
argument to an external routine.
7 ARGUMENT PASSING
The rules for argument passing are a consequence of the
above.
When an Eiffel routine calls another Eiffel routine,
the semantics of argument passing is the same as that of
assignment from actual to formal argument: by copy for
arguments of expanded types (including, as will be seen
below, the basic types), and by reference for arguments of
unexpanded types.
[C-unfascinated readers, please skip to next section.]
For arguments to external routines, the rule cannot be
as systematic; they must of necessity be adapted to the
target language. Current C, for example, does not support
passing of structures or arrays as arguments; only pointers
to such elements may be passed. (ANSI C may be more liberal,
but has as yet shown little relevance to the real world of C
programming.) So for C all arguments, whether expanded or
not, are normally passed by reference. It is the
responsibility of the target C routine to copy any structure
or array if needed. An exception is made, however, for
arguments of basic types (integer, real, boolean, character
and the new type ``long real'' described below), which are
passed by value for compatibility with tradition.
For the results of external functions, the Eiffel
compiler can take care of ensuring the proper semantics
(copy for expanded, reference for unexpanded); a language-
dependent convention is needed, however, to determine what
the function will return. In C, the answer is the only
portable one: the C function is assumed to return a pointer
to the result for non-basic types, expanded or not. The
code generated by the Eiffel compiler then takes care of
performing a copy in the expanded case.
An important practical caveat governs the exchange of
arguments and function results between Eiffel and C (or
another language). We may usually assume that the basic
types are common to both languages, and hence that values of
these types may be manipulated by both sides. Both C and
Eiffel routines, for example, may perform additions on a
given integer value. For values of any other type, however,
only one side may safely execute operations other than
parameter passing; the other side will simply be used as
repository for the values, or to pass them along to further
elements. There are exceptions to this rule, but they apply
to special cases and require that the programmer be well
versed in the implementation techniques for both languages.
8 PREFIX AND INFIX OPERATORS
We are now almost ready to explain how basic types can be
interpreted as classes. A syntactical facility, although not
essential, will prove convenient for this.
To facilitate expressiveness, we allow some routines to
be used in prefix or infix form. The standard grammar for
the declaration of routines (see [1], appendix C) is as
follows (brackets introduce optional components):
Routine_declaration = Routine_name [Formal_arguments] [Type_mark] "is" Routine_text
Formal_arguments = "(" Entity_declaration_list ")"
Type_mark = ":" Type
The optional ``Type_mark'' gives the type of the result
when the routine is a function.
In pre-2.2 Eiffel, ``Routine_name'' is simply an
identifier. We now allow the two new forms
prefix '"'Operator'"'
infix '"'Operator'"'
In this syntax, ``Operator'' may only be one of the
following:
+ - * / < > <= >= and or xor implies not div mod
(Operators xor, for exclusive or, and implies, for boolean
implication, are new with 2.2.)
All these except not may be used in infix declarations;
only +, - and not are permitted for a prefix declaration.
Thus only a small group of common operators may be used
in prefix or infix form.
Routines declared in this way must all be functions.
Infix routines must have exactly one argument; prefix
routines must have none. Calls to such routines must use
the corresponding operators, in prefix or infix form.
Assume for example a class MATRIX with function
declarations of the form
infix "+" (other: like Current): like Current is ...
prefix "-": like Current is ...
Then, for m1 and m2 of type MATRIX, the expressions m1 + m2
and -m1, respectively, will denote calls to the
corresponding functions.
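Readers familiar with other languages may find it helpful to compare this with operator methods elsewhere. The following Python sketch is an analogy only, not Eiffel: the class Matrix and its list-of-rows representation are hypothetical, and Python's __add__ and __neg__ stand in for the declarations infix "+" and prefix "-".

```python
class Matrix:
    """Minimal matrix holding its elements as a list of rows."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __add__(self, other):
        # Counterpart of a function declared as infix "+" with one argument.
        return Matrix([[a + b for a, b in zip(r, s)]
                       for r, s in zip(self.rows, other.rows)])

    def __neg__(self):
        # Counterpart of a function declared as prefix "-" with no argument.
        return Matrix([[-a for a in r] for r in self.rows])

m1 = Matrix([[1, 2], [3, 4]])
m2 = Matrix([[10, 20], [30, 40]])
total = m1 + m2                       # calls m1.__add__(m2)
opposite = -m1                        # calls m1.__neg__()
```

As in the Eiffel scheme, the operator expression is just another notation for a call to a function of the class of its first operand.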
The precedence of infix and prefix operators is fixed
and given by the standard Eiffel precedence table (see
section C.3 of [1]).
It is important to note that the above possibility is
not ``the introduction of overloading into Eiffel''. A
powerful form of overloading has always been available in
Eiffel, since any two classes can have different features of
the same name, and dynamic binding makes it possible to
obtain run-time discrimination. What is new is a simple
syntactic extension, making it possible to use the standard
arithmetic operators for any class, not just the basic
types.
9 COMPARABLE and NUMERIC
Two deferred classes are needed in the basic Eiffel library.
Class COMPARABLE already existed in pre-2.2; its
features now use the infix facility. Note that only "<="
needs to be deferred.
deferred class COMPARABLE export
    infix "<", infix "<=", infix ">", infix ">="
feature
    infix "<=" (other: like Current): BOOLEAN is
            -- Is current element less than or equal to other?
        deferred
        end; -- "<="
    infix "<" (other: like Current): BOOLEAN is
            -- Is current element less than other?
        do
            Result := Current <= other and not Current.Equal (other)
        end; -- "<"
    infix ">" (other: like Current): BOOLEAN is
            -- Is current element greater than other?
        do
            Result := other < Current
        end; -- ">"
    infix ">=" (other: like Current): BOOLEAN is
            -- Is current element greater than or equal to other?
        do
            Result := other <= Current
        end -- ">="
end -- class COMPARABLE
Any class that needs to manipulate elements connected
by an order relation may inherit from COMPARABLE.
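The idea that a single deferred operation suffices, the others being derived from it, can be tried out in any language with abstract classes. The Python sketch below is an analogy with hypothetical names (Comparable, Version, equal); it derives "<" from "<=" together with field-by-field equality, and ">" and ">=" by swapping the operands.

```python
from abc import ABC, abstractmethod

class Comparable(ABC):
    """Only __le__ is left to descendants; the other comparisons are derived."""

    @abstractmethod
    def __le__(self, other):
        """Is current element less than or equal to other?"""

    def equal(self, other):
        # Field-by-field equality, standing in for Equal.
        return vars(self) == vars(other)

    def __lt__(self, other):
        return self <= other and not self.equal(other)

    def __gt__(self, other):
        return other < self

    def __ge__(self, other):
        return other <= self

class Version(Comparable):
    """Example descendant: versions ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __le__(self, other):
        return (self.major, self.minor) <= (other.major, other.minor)
```

A descendant such as Version supplies the order relation once and obtains all four comparison operators.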
Class NUMERIC describes any type with the basic
arithmetic operations:
deferred class NUMERIC export
    infix "+", infix "-", infix "*", infix "/",
    prefix "+", prefix "-"
feature
    infix "+" (other: NUMERIC): NUMERIC is
            -- Sum of other and current element
        deferred
        end; -- infix "+"
    prefix "-": NUMERIC is
            -- Negation of current element
        deferred
        end; -- prefix "-"
    (etc.)
end -- class NUMERIC
Any class that needs to manipulate elements on which
operations similar to the basic arithmetic operations are
available, may inherit from NUMERIC (and from COMPARABLE if
necessary).
10 BASIC TYPES AND DOUBLE PRECISION
We can now introduce the basic types as classes. This
definition is based on an assessment of the present state of
computer hardware: on most platforms, an integer or (short)
real will fit in 32 bits; a character will fit in 8 bits; a
long real will fit in 64 bits.
I believe it is preferable to explicitly incorporate
these industry-standard lengths into the programming
language definition, rather than to maintain a pretense of
implementation-independence. In practice such a pretense can
only prevent software developers from writing portable
software that will perform in a (reasonably) consistent
fashion across a range of platforms. (Another, more
axiomatic approach has been followed by Ada, but its
advantages over the policy of making simple assumptions
about the minimum bit-size requirements of basic data types
are not clear in the present state of hardware technology
and software theory.)
These conventions are not really ``wired in'' and it is
possible to change the definitions below to adapt to
different sizes.
The basic types are defined as follows.
expanded class REAL export
    repeat NUMERIC, repeat COMPARABLE, value {REAL}
inherit
    NUMERIC;
    COMPARABLE
feature
    value: BITS 32;
            -- The real number's internal representation
    set_value (v: BITS 32) is
            -- Use v as the bit code for the real number's value
        do
            value := v
        end; -- set_value
    infix "+" (other: REAL): REAL is
            -- Sum of other and current element
        external
            short_real_addition
                (v, w: BITS 32): BITS 32
                language "..."
        do
            Result.set_value (short_real_addition (value, other.value))
        end; -- "+"
    ... And similarly for other prefix and infix operations ...
end -- class REAL
So we consider that a short real number is an expanded
32-bit object, which has the properties of NUMERIC objects,
including those of COMPARABLE objects.
The repeat notation which appears in the export clause
for REAL is a facility (new for 2.2) which makes it possible
to copy conceptually the export clause of an ancestor class
such as COMPARABLE and NUMERIC into a descendant such as
REAL without copying it physically. (In addition, value is
exported to the class itself.)
The repeat facility addresses a problem that was
mentioned by users who developed large systems
with pre-2.2 Eiffel: the tediousness of managing
export lists when many levels of inheritance are
involved. The problem should now disappear; if you
specify with a repeat clause that the interface of
a class D should always include the interface of
one of its ancestors A, then any addition to A's
interface will automatically be reflected in D's.
The basic operations of class REAL are
implemented through external routines, since these
operations are to be carried out directly by the hardware
(or through the lower-level implementation language used by
the given implementation of Eiffel, such as assembly or C).
The idea here is the same as with the way arrays are treated
in pre-2.2 Eiffel. Arrays, and now real numbers, are viewed
as instances of ``normal'' Eiffel classes in the Basic
Library. From a practical standpoint, however, the
implementation knows about these and implements the
corresponding operations through shortcuts: no actual
routine call is needed to access an array element, or to add
two reals. So the usual efficiency of fundamental operations
is achieved, but the conceptual framework (classes, objects,
inheritance) remains consistent with the rest of the
language.
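To make the idea of an externally implemented operation more
concrete, here is a minimal C sketch of what a routine such as
short_real_addition could look like. It is an illustration only:
it assumes that the external language (elided above) is C, that a
BITS 32 value reaches C as a 32-bit unsigned integer, and that
short reals use the IEEE 754 single-precision format. The type
name bits32 and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a BITS 32 value is passed as a 32-bit unsigned integer. */
typedef uint32_t bits32;

/* Reinterpret a 32-bit pattern as a float (memcpy avoids aliasing issues). */
static float bits_to_float(bits32 b)
{
    float f;

    assert(sizeof f == sizeof b);   /* the sketch relies on 32-bit floats */
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Reinterpret a float as its 32-bit pattern. */
static bits32 float_to_bits(float f)
{
    bits32 b;

    assert(sizeof f == sizeof b);
    memcpy(&b, &f, sizeof b);
    return b;
}

/* Hypothetical C counterpart of
   short_real_addition (v, w: BITS 32): BITS 32 */
bits32 short_real_addition(bits32 v, bits32 w)
{
    return float_to_bits(bits_to_float(v) + bits_to_float(w));
}
```

As explained above, an implementation that knows about class REAL
would presumably skip such a call and emit the hardware addition
directly; the sketch only shows what the external declaration
stands for conceptually.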
Class INTEGER is declared in a way similar to REAL:
expanded class INTEGER export
    repeat NUMERIC, repeat COMPARABLE, value {INTEGER}, to_real
inherit
    NUMERIC;
    COMPARABLE
feature
    value: BITS 32;
            -- The integer's internal representation

    set_value (v: BITS 32) is
            -- Use v as the bit code for the integer's value
        do
            value := v
        end; -- set_value

    infix "+" (other: INTEGER): INTEGER is
            -- Sum of other and current element
        external
            integer_addition (v, w: BITS 32): BITS 32
                language "..."
        do
            Result.set_value (integer_addition (value, other.value))
        end; -- "+"

    ... And similarly for other prefix and infix operations ...

end -- class INTEGER
To allow for mixed-mode computation, class REAL should
also include a conversion function for integers, defined as
follows:
    set_value_from_integer (n: INTEGER) is
            -- Convert from value given as integer
        external
            int_to_real (v: BITS 32): BITS 32
                language "..."
        do
            set_value (int_to_real (n.value))
        end; -- set_value_from_integer
(This assumes that value in INTEGER is selectively exported to REAL.) In
version 2.2, however, it will remain possible to assign an
integer value to a real entity, with the expected effect of
a conversion. This is an exception to the generality of the
type scheme presented here, for compatibility with
programmers' current habits.
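The external routine int_to_real used above could be sketched in
C along the following lines. This is only a sketch under stated
assumptions: the external language is taken to be C, BITS 32 is
represented by the hypothetical type bits32, integers are 32-bit
two's-complement values and reals are IEEE 754 single-precision
floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a BITS 32 value is passed as a 32-bit unsigned integer. */
typedef uint32_t bits32;

/* Hypothetical C counterpart of int_to_real (v: BITS 32): BITS 32. */
bits32 int_to_real(bits32 v)
{
    int32_t n;
    float f;
    bits32 r;

    assert(sizeof f == sizeof r);   /* the sketch relies on 32-bit floats */
    memcpy(&n, &v, sizeof n);       /* bit pattern -> signed integer */
    f = (float) n;                  /* the numeric conversion itself */
    memcpy(&r, &f, sizeof r);       /* float -> bit pattern */
    return r;
}
```

Under these assumptions the integer 3 is mapped to the bit
pattern of 3.0, that is 0x40400000. The bit patterns differ even
though the values agree, which is why an explicit conversion
routine is needed at all.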
The other basic types are defined similarly. For
CHARACTER, attribute value is of type BITS 8; for BOOLEAN,
value is of type BITS 1.
The type DOUBLE for double-precision reals, new with
2.2, is also an expanded class, almost identical to REAL,
but with a value attribute of type BITS 64 rather than BITS 32.
The classes just discussed are all expanded and hence
there is no inheritance relation among them. Despite what
one might think at first look, it would not be appropriate
to consider INTEGER as an heir to REAL or REAL as an heir to
DOUBLE.
If longer precision reals or integers are needed, the
technique is easily generalized: just define new classes
that inherit from NUMERIC and COMPARABLE and have an
internal value attribute of the right BITS size.
11 UNEXPANDED CLASSES AND GENERICITY
One more technique is needed to reap the full benefit of
expanded classes. We need to be able to define generic
classes, representing such abstractions as matrices, in such
a way that the formal generic parameters can be expanded
classes such as DOUBLE representing objects of any given
size.
Consider the following general class:
class MATRIX [T -> NUMERIC] export
    repeat NUMERIC
inherit
    NUMERIC
feature
    ... Matrix representation and operations,
    expressed in terms of the corresponding operations on
    objects of type T ...
end -- class MATRIX
This generic class will accept any unexpanded class as
actual generic parameter corresponding to T. It will also
accept expanded classes such as BOOLEAN, CHARACTER, INTEGER
and REAL, whose ``size'', defined below, does not exceed 32. It
will not, however, accept DOUBLE or other ``big'' expanded
classes.
To describe matrices of bigger composite objects, you
may define a different class, an heir to MATRIX with a very
short definition such as:
class D_MATRIX [T (DOUBLE) -> NUMERIC] export
    repeat MATRIX
inherit
    MATRIX
    -- No need for a feature clause
end -- class D_MATRIX
The new facility here is the possibility to specify a
class name CLASS_NAME (DOUBLE in this example) in
parentheses after the name of the formal generic parameter G
(T in this example). This means:
``Actual generic parameters corresponding to G
must be classes of size no bigger than the size of
CLASS_NAME''.
The size of a class is defined as follows:
+
The size of all unexpanded classes is the same. It
can be referred to as the size of the universal
ancestor ANY. By default this is 32.
+
The size of one of the BITS M expanded classes is M.
+
The size of any other expanded class is obtained by
adding the sizes of the types of all the attributes
of the class (those defined in the class itself, and
those inherited from ancestors).
From the above definitions, then, the size of INTEGER
and REAL is 32 and the size of DOUBLE is 64.
Class D_MATRIX above should be used to describe
matrices whose elements are no bigger than DOUBLE objects.
This includes any unexpanded class as well as BOOLEAN,
CHARACTER, INTEGER, REAL.
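The size rules and the acceptance test for a formal generic
parameter written T (CLASS_NAME) can be sketched as follows. This
is an illustration in C, not a description of how the Eiffel
compiler proceeds; the descriptor type class_desc, the functions
class_size and fits, and the sample descriptors are all
hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

#define ANY_SIZE 32   /* default size of ANY, hence of every unexpanded class */

/* Hypothetical class descriptor, for illustration only. */
struct class_desc {
    int expanded;                            /* 0 for an unexpanded class */
    int bits;                                /* M for a BITS M class, else 0 */
    size_t n_attrs;                          /* attributes, inherited ones included */
    const struct class_desc *const *attrs;   /* types of those attributes */
};

/* Size of a class according to the three rules given in the text. */
int class_size(const struct class_desc *c)
{
    int total = 0;
    size_t i;

    assert(c != NULL);
    if (!c->expanded)
        return ANY_SIZE;                     /* all unexpanded classes */
    if (c->bits > 0)
        return c->bits;                      /* BITS M has size M */
    for (i = 0; i < c->n_attrs; i++)
        total += class_size(c->attrs[i]);    /* sum over attribute types */
    return total;
}

/* May `actual` be used where the formal parameter is declared T (bound)? */
int fits(const struct class_desc *actual, const struct class_desc *bound)
{
    return class_size(actual) <= class_size(bound);
}

/* Sample descriptors mirroring the classes discussed above. */
static const struct class_desc bits_32 = {1, 32, 0, NULL};
static const struct class_desc bits_64 = {1, 64, 0, NULL};
static const struct class_desc *const integer_attrs[] = {&bits_32};
static const struct class_desc *const double_attrs[] = {&bits_64};
static const struct class_desc integer_class = {1, 0, 1, integer_attrs};
static const struct class_desc double_class = {1, 0, 1, double_attrs};
static const struct class_desc any_class = {0, 0, 0, NULL};
```

With these descriptors, fits (&integer_class, &any_class) holds
while fits (&double_class, &any_class) does not, which mirrors
the difference between MATRIX and D_MATRIX.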
The convention that the size of ANY and other
unexpanded classes is 32 results from an assessment of the
current state of hardware technology. Instances of
unexpanded classes are accessible through references, that
is to say, at the implementation level, pointers. We thus
accept that, for still some time:
1.
Limiting Eiffel applications to manipulate 2^31 (about
two billion) dynamic objects during a given session
is acceptable.
2.
Computer memory is not cheap enough that it would be
preferable to raise this limitation to, say, 2^63, if
this implied doubling the space occupied by the
pointers generated during a session.
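The arithmetic behind these two assumptions can be checked with a
few lines of C. The function names are hypothetical, and the
sketch simply assumes that a reference occupies a whole number of
bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Number of objects that can be told apart with `bits` bits
   (meaningful for 0 < bits < 64). */
uint64_t max_objects(unsigned bits)
{
    assert(bits > 0 && bits < 64);
    return (uint64_t) 1 << bits;
}

/* Bytes occupied by n references that are ref_bits wide each. */
uint64_t reference_bytes(uint64_t n, unsigned ref_bits)
{
    assert(ref_bits % 8 == 0);
    return n * (ref_bits / 8);
}
```

For a given number of references, going from 32-bit to 64-bit
references doubles the value returned by reference_bytes; this is
the cost that the second assumption refuses to pay.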
12 CONCLUSION
The mechanisms described above may at first appear as a
significant extension of pre-2.2 Eiffel. They are not.
Instead, the work reported here should primarily be viewed
as a reorganization and cleanup of the theoretical
framework, which makes important improvements available at
the cost of very little actual extension. As a matter of
fact the above discussion has introduced only four new
reserved words:
BITS
expanded
infix
prefix
From a theoretical viewpoint, four have been removed:
INTEGER, REAL, BOOLEAN, CHARACTER. If we accept that DOUBLE
would have been needed anyway, the reserved word count is
actually decreased by one. (The names of the basic types
will, however, remain reserved in release 2.2 for practical
reasons.)
The starting point for this work was both practical and
theoretical. From a practical viewpoint, I had known for a
long time that it would be ultimately indispensable to
support double precision; however, double precision initially
appeared extremely difficult in the presence of genericity.
Clearly I did not want an implementation of genericity which
would result (as most Ada compilers do for generic packages)
in duplicating code for each separate use of a generic
class; this was deemed unacceptable. At some point I
realized that, from a practical perspective, the mechanism
of constrained genericity (implemented for 2.1) contained
the key. But then why stop at double precision? Eiffel
strives for generality; what if someone wants, say, 128-bit
reals?
Another practical concern, which had been present for
some time, was to avoid dynamic allocation whenever
possible. In most applications, the overhead of pointers is
acceptable; whether in Eiffel, Pascal, Ada or C, most non-
trivial data structures are accessed through pointers
anyway. Still, this overhead is annoying for applications
that manipulate very large numbers of composite objects each
of which contains several small but non-atomic components -
say triangles, each containing three ``point'' attributes.
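The difference between the two representations of such a triangle
can be pictured in C terms as follows. This is only an analogy:
the struct and field names are hypothetical, and C structures
stand in here for Eiffel objects.

```c
#include <assert.h>
#include <stdlib.h>

struct point { float x, y; };

/* Reference form: the triangle holds three pointers, and each point is
   a separate object that has to be allocated on its own. */
struct triangle_ref { struct point *a, *b, *c; };

/* Expanded form: the three points are sub-objects stored inside the
   triangle itself, with no pointers and no extra allocations. */
struct triangle_exp { struct point a, b, c; };

/* Allocate n zero-initialized triangles in expanded form with one call. */
struct triangle_exp *make_triangles(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(struct triangle_exp));
}
```

In the reference form, n triangles need up to 3n further
allocations for their points, plus the space of 3n pointers; in
the expanded form a single block holds everything.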
Yet another motive was the desire to improve the
flexibility of the interface of Eiffel with other languages,
especially C. This requirement is not an internal one, since
for our own developments the external interface of pre-2.2
Eiffel, based on external routines that essentially pass
BITS 32 values to and from Eiffel, is more than enough.
Eiffel users have increasingly requested support for more
general interface mechanisms, however; the aim in particular
is to facilitate cooperation between Eiffel software and
existing packages (graphics, databases, expert systems etc.).
The support for composite objects should be particularly
useful here, since it covers a need often voiced by users,
that of having Eiffel objects contain sub-objects described
by C structures.
Support for prefix and infix operators was never by
itself a major concern. In fact I was (and remain) strongly
against unbounded operator overloading, which opens the door
to all kinds of notational abuses; I am much more interested
in the semantically useful overloading permitted by
redefinition and dynamic binding than in purely syntactical
techniques which make it possible to call a two-argument
routine under the form a @#$ b (say). The ability to define
the precedence of new operators, in particular, has always
struck me as a perfect example of a programming language
feature that is bad on all counts: it is difficult to find a
good syntax for it (do you specify precedence by a number,
forcing each program writer or reader to refer to the
programming language reference manual each time, or do you
say ``one less than the precedence of operator xxx''?); it
makes it particularly easy to write tricky programs; it
invites bugs; it makes the parser - normally the most
trivial and worriless part of a modern compiler -
considerably more difficult to implement (since each program
may change the syntax); and to top the whole thing, it is
only a superficial, syntactical, ``vanity'' extension. Not
the most urgent feature to add to Eiffel. It was clear,
however, that if we were to treat basic types formally as
classes (see next), then we should be able to interpret the
basic prefix and infix operators as denoting routines. Why
not then make them available to writers of normal
programmer-defined classes, such as MATRIX, VECTOR and the
like? Since the concept would only apply to a few predefined
operators, the extension would remain small and reasonable.
I also had more theoretical concerns. The difference of
treatment between basic types and class types in pre-2.2
Eiffel, explained in chapter 5 of [1], came directly from
Simula. I knew that some difference was necessary: in most
cases, class types need reference semantics, and usually
they also need dynamic object creation, but no one would
want to allocate integers dynamically. Still, this
inconsistency was unpleasant in a language that emphasizes
simplicity and uniformity. The concept of class seemed to
provide such a powerful basis for the discussion of types
that it was sad to have to abandon it for a handful of basic
types - the most fundamental at that.
When I presented Eiffel concepts in public courses,
listeners would often question the special treatment of
basic types, asking why INTEGER, for example, was not a
class. My answer usually included (among other arguments)
that any theory of types needed to start somewhere, and thus
to assume a few pre-existent, ``Bourbaki-given'' types. Why
not, the argument would go, use for these the few types with
which everybody is familiar, and which are readily available
on all hardware? I would usually say something like ``In the
most purist object-oriented approach, we would still need to
take at least one type for granted: the bit''.
In a way, the framework presented above implements this
purist approach, although it considers not a single bit
type, but an unbounded sequence of bit types, BITS M.
Everything else is derived from these. Thanks to the notion
of expanded class, which also addresses the problems that I
have called ``practical concerns'', the basic types make
perfect sense in this context; they cease to be special
language elements with magical properties and become
standard library classes. To make the solution clear and
uniform, limited support for infix and prefix operators,
useful in its own right, may be introduced. No loss in
performance is implied because the compiler can be made
smart enough to recognize these classes and apply special
techniques to them; in fact, the pre-2.2 implementation of
basic types remains entirely valid. Finally, the
introduction of expanded classes makes it possible to tackle
the dual-semantics issue in a much more explicit, complete,
and (I hope) convincing fashion than ever before.
I feel that the general cleanup of the Eiffel type
system is almost complete with the discussion of this
article (and the companion discussion [2], which addresses
details of the rules making Eiffel fully statically typed
through constraints on polymorphism, not well covered in [1]
and not yet fully supported by the implementation). To make
the basic types into fully ``ordinary'' citizens, we would
also need to deal with manifest constants, which could be
treated as ``zerofix'' operators. Examples of such operators
would be 0 and 1, now introduced as deferred functions in
class NUMERIC, corresponding to the neutral elements for
operations infix "+" and infix "*". The other integers would
be interpreted as syntactical abbreviations for 1+1, 1+1+1
etc. Such an extension would require that we be more formal,
separating what has been called NUMERIC into different
inheritance levels (GROUP, RING, FIELD etc.) characterized
by the proper assertions. This is worth exploring further;
there may be the root of a full type theory, perhaps even of
a general model for computation. This extension, however,
will not be part of Eiffel release 2.2.
References
[1] B. Meyer, Object-Oriented Software Construction,
Prentice-Hall, 1988.
[2] B. Meyer, The complete rules for static typing in
Eiffel, unpublished.
--
-- Bertrand Meyer
bertrand@eiffel.com
jos@cs.vu.nl (Jos Warmer) (06/08/89)
In article <154@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:
>
> As an aside, note that release 2.2 provides
> further support for more powerful interaction
> between Eiffel and other languages. In
> particular, a simple syntactical extension makes
> it possible to pass the address of an Eiffel
> routine to an external routine. The notation is
> @f, where f is a routine of the enclosing class;
> such an expression is only valid as actual
> argument to an external routine.
>

I guess/hope that this can be used as an efficient interface
for calling Eiffel routines from C.

But I don't understand the meaning of this. An Eiffel routine
has meaning only when it is called on an Eiffel object, as in
'object.routine'. An Eiffel routine can never be called as a
standalone function as C routines can.

Besides, because of the dynamic binding in Eiffel, you can
never be sure what implementation of a routine is called. Even
when the names of the routines are equal and they are called on
an object of the same entity type, they can be different
implementations.

So, how can this address be used by the external routines?
They (or hopefully the Eiffel runtime system) must ensure that
the routine will only be called on an Eiffel object of the
correct (dynamic) type.

If the routine address is used by the external routine to call
it directly, this check cannot be performed by the Eiffel
runtime system. This means that the external routines can
easily corrupt internal Eiffel structures.

If the routine address cannot be used by the external routine
to call it directly, what use is it?

____ABOUT_THE_EFFICIENCY_ISSUE____

Some time ago, I mentioned to ISE that the way of calling
Eiffel routines from C was very inefficient. In version 2.1b
this is accomplished by the C routine `eif_rout', which is
declared as follows:

    DATUM eif_rout (Objptr, r_name, arg1, ...
    OBJPTR Objptr;
    char *r_name;

`Objptr' is a pointer to the Eiffel object and `r_name' is the
name of the routine that must be performed on `Objptr'.

In the implementation of `eif_rout', the array containing the
names of all applicable routines for `Objptr' is searched for
`r_name'. If `r_name' is found, the corresponding routine is
called. This means that a number of string comparisons will
have to take place for each function call.

I am not a fan of efficiency at any cost, but when this
interface must be used heavily, this is prohibitive.

--
Jos Warmer
jos@cs.vu.nl
...uunet!mcvax!cs.vu.nl!jos
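The cost described in the efficiency part of the message above
can be illustrated with a small C sketch. It is not the actual
implementation of `eif_rout', which is not shown in the thread;
the types datum and routine, the table layout, the function
find_routine and the sample entries are all hypothetical, and
routine arguments are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef long datum;                        /* stand-in for DATUM */
typedef datum (*routine)(void *object);    /* simplified: no further arguments */

struct routine_entry {
    const char *name;
    routine     code;
};

/* Linear search by name: up to n string comparisons for every lookup. */
routine find_routine(const struct routine_entry *table, size_t n,
                     const char *r_name)
{
    size_t i;

    assert(table != NULL && r_name != NULL);
    for (i = 0; i < n; i++)
        if (strcmp(table[i].name, r_name) == 0)
            return table[i].code;
    return NULL;                           /* no routine of that name */
}

/* Sample table, for illustration only. */
static datum sample_count(void *object) { (void) object; return 3; }
static datum sample_first(void *object) { (void) object; return 7; }
static const struct routine_entry sample_table[] = {
    {"count", sample_count},
    {"first", sample_first},
};
```

A caller that performs the lookup once and keeps the returned
pointer pays for the string comparisons only once. As the
message points out, however, such a cached pointer is only
meaningful for objects of the same dynamic type.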
sakkinen@tukki.jyu.fi (Markku Sakkinen) (06/12/89)
The article referred to was entitled "EIFFEL TYPES" (by Bertrand Meyer).
Note: this submission talks _only_ about the Eiffel-C interface
(one short section of the full paper length article).

In section 7 (Argument passing), Meyer writes:

> For arguments to external routines, the rule cannot be
> as systematic; they must of necessity be adapted to the
> target language. Current C, for example, does not support
> passing of structures or arrays as arguments; only pointers
> to such elements may be passed. (ANSI C may be more liberal,
> but has yet shown little relevance to the real world of C
> programming.) So for C all arguments, whether expanded or
> not, are normally passed by reference. It is the
> responsibility of the target C routine to copy any structure
> or array if needed. An exception is made, however, for
> arguments of basic types (integer, real, boolean, character
> and the new type ``long real'' described below), which are
> passed by value for compatibility with tradition.
>

It must be a weird and antiquated C compiler indeed that cannot
pass _structures_ themselves both as arguments and as function
results. The above holds for _arrays_, though; that is a basic
(mis)feature of C (and C++).

Of course, pointers to structures are very often used as
arguments in practice: either a call-by-reference effect is
desired or copying large structures would appear inefficient.

> For the results of external functions, the Eiffel
> compiler can take care of ensuring the proper semantics
> (copy for expanded, reference for unexpanded); a language-
> dependent convention is needed, however, to determine what
> the function will return. In C, the answer is the only
> portable one: the C function is assumed to return a pointer
> to the result for non-basic types, expanded or not. The
> code generated by the Eiffel compiler then takes care of
> performing a copy in the expanded case.
>

Returning a pointer (to a struct or an array) as the result of
a C (or C++) function is very dangerous and should be done only
if its referent (the _object pointed to_) fulfills one of the
following conditions:

1. its storage class is static or extern;
2. its storage class is auto, but it is defined in a calling
   function;
3. it has been allocated in dynamic memory before calling the
   function (and hopefully "someone" will delete it when it is
   no longer needed);
4. it has been allocated in dynamic memory within the function
   and leaving it allocated will not cause too much memory
   shortage (there is no garbage collection in the C or C++
   realm).

The most important point is: never pass a pointer to an
_automatic_ variable of a function as the result of that
function (cf. my short article in ACM SIGPLAN Notices, March
1989)!

My recommendation for most cases is therefore:

A. Pass any structure _itself_ (not a pointer to it) as the
   function's result.
B. Since it is impossible to pass an _array_ itself, declare a
   structure that has the array as its only element; C (or C++)
   will let you pass the structure. This is only a compile-time
   trick that adds nothing to the runtime objects, so the
   Eiffel side should be able to handle the result directly as
   an (expanded?) array.

Probably there _are_ cases in which one would rather like to
pass a pointer as the result, especially with arrays. One must
then be very careful.

Markku Sakkinen
Dept. of Computer Science, University of Jyvaskyla
(imagine the a's in 'Jyvaskyla' with umlauts)
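Recommendations A and B of the message above can be sketched in
C as follows. The type and function names (point, vec3,
midpoint, scale, component) are hypothetical and chosen only for
the illustration.

```c
#include <assert.h>

struct point { double x, y; };

/* Recommendation A: return the structure itself, not a pointer to it. */
struct point midpoint(struct point p, struct point q)
{
    struct point m;

    m.x = (p.x + q.x) / 2.0;
    m.y = (p.y + q.y) / 2.0;
    return m;                 /* copied out, so no dangling pointer */
}

/* Recommendation B: wrap an array in a structure so that it can be
   passed and returned by value. */
struct vec3 { double v[3]; };

struct vec3 scale(struct vec3 a, double k)
{
    int i;

    for (i = 0; i < 3; i++)
        a.v[i] *= k;          /* `a` is a private copy of the argument */
    return a;
}

/* Read one component of a wrapped array, with a bounds check. */
double component(struct vec3 a, int i)
{
    assert(i >= 0 && i < 3);
    return a.v[i];
}
```

Neither function returns the address of an automatic variable,
so none of the four conditions listed in the message has to be
checked by the caller. The price is a copy of the structure on
each call and each return.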