[comp.lang.eiffel] Eiffel type system <133@eiffel.UUCP>

bertrand@eiffel.UUCP (Bertrand Meyer) (06/05/89)

The following is a revised version of the article quoted. Much of the text
is common but there have been some important simplifications; in
particular, inheritance is used in a much more limited way than in the
original description. (Expanded classes cannot be used as parents any
more).

If anyone is keeping a record, please discard the original.

Many thanks for all comments received on the previous posting.

The text below was produced with nroff from input meant for troff
(with italics and the like); I hope it is understandable.

I use this opportunity to mention that in an earlier message about 2.2
syntax (<153@eiffel.UUCP>) an item was forgotten in the list of new
keywords. The omitted keyword is ``repeat''.



-------------------------------------------------------------


                        EIFFEL TYPES

                       Bertrand Meyer

                     Draft, March 1989

                    Revised, 14 May 1989

     This note presents a unified view of the Eiffel  types.
The  language  as described is what is available for version
2.2 and includes some extensions over the  definition  given
by  reference  [1].  No  incompatibility  is introduced; any
Eiffel class that was legal according to [1] remains  legal,
with the same semantics.

     The conceptual framework and extensions presented  here
introduce  a remarkable number of benefits, both theoretical
and practical:
     +
       Independently of any actual change to  the  language,
       the  concept  of  type  in  Eiffel  is presented in a
       clearer and more uniform way.  All Eiffel  types  are
       now  classes,  and  multiple inheritance is the basic
       mechanism for constructing new classes.   This  means
       in   particular  that  basic  types  (integer,  real,
       character  and  real)  need  no  longer  be   treated
       according  to  special  rules.  In  fact,  the  names
       INTEGER, REAL, CHARACTER and REAL need no  longer  be
       treated  as  reserved words in the language: instead,
       they  are  names  of  classes  in  the  Basic  Eiffel
       Library. (For practical reasons, however, these names
       will remain reserved in version 2.2.)
     +
       Double precision reals,  which  were  not  previously
       supported  directly  in Eiffel because of theoretical
       and practical  problems  relating  in  particular  to
       genericity, are now available. In fact, any precision
       can be supported.
     +
       Limited support  is  offered  for  programmer-defined
       infix and prefix operators.
     +
       Programmers  can  improve  the  efficiency  of  their
       software  by  writing classes whose instances include
       other objects. Up to release 2.1, objects could refer
       to  other objects through reference fields, but could
       not  contain  other  objects.   The  new  possibility
       avoids unneeded indirections.
     +
       Exchange of data between Eiffel software and external
       routines   is  made  easier,  particularly  when  the
       external routines are written in  C.  Eiffel  objects
       may  contain sub-objects which the C world views as C
       structures.
     +
       Operations  on  bit  sequences  of  arbitrary  length
       (previously    supported   by   the   library   class
       BOOLEAN_SET)  are  handled  simply  within  the  base
       language.

     Throughout the rest of this note, ``Eiffel'' refers  to
version   2.2  Eiffel.  The  language  defined  by  [1]  and
implemented as version 2.1 is called ``pre-2.2 Eiffel''.


1 EXPANDED CLASSES

The most important new concept is that  of  expanded  class,
which makes it possible to declare entities (variables) that
denote objects rather  than  references  to  objects.  As  a
result, it is now possible to create ``composite'' objects -
objects which contain sub-objects.

     In pre-2.2 Eiffel, any entity  of  a  class  type,  for
example an attribute declared as

     ref: CLASS_NAME

where CLASS_NAME is the name of a class, represents  not  an
object but a reference to a potential object.  The object is
``potential''  because  it  is  only  allocated  explicitly,
through a Create operation. Until it becomes associated with
an object, a reference is said to be ``void''.

     This remains true by default, but the situation will be
different  if  CLASS_NAME  has  been  declared under a newly
provided  mode,  called  ``expanded''.    The   syntax   for
declaring an expanded class is

     expanded class CLASS_NAME ... The rest as before ...

As will  be  seen  below,  an  expanded  class  may  not  be
deferred,  so  the  problem  of  the respective order of the
optional keywords expanded and deferred does not arise.

     The effect of declaring a  class  EXP  as  expanded  is
simple: any entity declared of type EXP will represent not a
reference to potential objects of  type  EXP,  but  directly
objects  of  type EXP. (Remember that ``entities'' in Eiffel
cover class attributes, routine arguments, function results,
local routine variables.)

     So if a class C contains an attribute declared as

     sub: EXP

then any instance of C will contain  a  sub-object  of  type
EXP,  accessible  through  attribute  sub. A class such as C
having at least one attribute of expanded type is  called  a
composite class; its instances are called composite objects.
Figure 1 pictures a composite object.

     By default, classes are unexpanded.  If a  class  UNEXP
is   not   explicitly   declared   as  expanded,  an  entity
declaration

     ref: UNEXP

keeps its pre-2.2 meaning: ref  denotes  a  reference  to  a
potential object of type UNEXP, to be created dynamically.

     Assuming class C contains both of the  above  attribute
declarations,  the structure of an instance of this class is
illustrated on figure 1.


        --------------------

    ref |                  |  ---> To dynamic instances

        --------------------       of UNEXP

        |                  |

        |                  |

        |      Object      |

    sub |      of type     |

        |       EXP        |

        |                  |

        |                  |

        --------------------

        |                  |

        --------------------
  Other |                  |

        --------------------
 fields |                  |

        --------------------

        |                  |

        --------------------


                 Figure 1: Composite object


     If EXP had not been declared as expanded, attribute sub
would have denoted an attribute which, in any instance of C,
represents a reference to potential instances of EXP.

     Composite objects serve to model external objects  that
have  non-simple  components.   The modeling of such objects
was, of course, possible in  pre-2.2  Eiffel,  but  only  by
making  the  objects contain references to their components,
not the  components  themselves.   Now  it  is  possible  to
specify  that  the  objects  are  composite,  so  that  they
actually  contain  sub-objects.  The  improvement   is   not
functional  - no data structure can be described which could
not have also  be  modeled  in  pre-2.2  -  but  is  one  of
performance: useless indirections may now be avoided.

     In  most  respects,  expanded  classes  have  the  same
properties  as  default  (unexpanded)  classes. An important
difference is that instances of expanded classes do not need
to  be created explicitly. If the type of sub is an expanded
class, the call

     sub.Create

is  still  permitted;  its  sole  effect,  however,  is   to
initialize  the  fields of the object associated with sub to
their default values (based on their types) and then to call
the  specific  Create  procedure  of  the class, if present.
(For unexpanded classes, Create also has these two  effects,
but  it  precedes  them  by  allocating  a new object of the
appropriate type.)

     If C is a composite class (expanded  or  not)  with  an
attribute  sub  of  expanded  type  EXP,  the application of
Create to an instance of C implies that all  fields  of  the
instance  will  be initialized to their default values. This
applies  to  field  sub;   when   performing   the   default
initializations, the Create of C will call the Create of EXP
(default or specific) on this field.

     Two other predefined Eiffel features, Void and  Forget,
are  also  allowed  on entities of expanded types. If sub is
such an entity, sub.Void always returns false and sub.Forget
has no effect.

     Three simple rules  apply  to  expanded  classes  as  a
result  of  the  above discussion.  Violation of these rules
will lead to compile-time errors.
     +
       As any other class, an expanded class may or may  not
       have  a  specific  Create  procedure.  If it has one,
       however, this procedure may not have  any  arguments.
       This  clearly  required  in  light  of  the  rule for
       Create: in the above example,  there  is  no  way  to
       initialize  an  instance  of the composite class C if
       the initialization  of  its  expanded  sub  component
       requires specific information.
     +
       Expanded classes may  also  be  composite:  in  other
       words,   an   expanded   class  may  itself  have  an
       attributed of expanded type. Among expanded  classes,
       however,  the  relation  ``A has an attribute of type
       B'' may not contain any cycles. The reason  for  this
       rule  is obvious: we cannot implement A and B in such
       a way that  an  instance  of  any  of  these  classes
       contains  a  sub-object  which  is an instance of the
       other.
     +
       An expanded class may  not  be  deferred.   (For  one
       thing, its Create procedure could contain a call to a
       deferred routine. Create is disallowed  for  deferred
       classes anyway.)


2 INHERITANCE AND CONFORMANCE

Inheritance is the fundamental  mechanism  for  constructing
new  classes  from  existing  ones.  Inheritance  serves  to
related purposes:
     +
       As a module enrichment mechanism, it allows  a  class
       to reuse the features defined in another.
     +
       As a type  refinement  mechanism,  it  restricts  the
       scope of legal assignments: b may only be assigned to
       a if the type of b is a descendant (heir through  any
       number  of levels) of the type of a. If the types are
       different  the  assignment  is  said   to   introduce
       polymorphism,  meaning  that  the  target,  a, may at
       run-time refer to objects of more than one type.
     The inheritance mechanism has been defined in  [1]  for
unexpanded   classes.    Expanded   classes   introduce  the
following new rules.
1.
     +
       No class may inherit from an expanded class.
2.
     +
       An expanded class  may  inherit  from  an  unexpanded
       class.
3.
     +
       The above type  compatibility  rule  for  assignments
       (the  type  of the target must be a descendant of the
       type of the source) remains valid. Because of rule 1,
       this  means  that if either source or target is of an
       expanded type then both must be of the same  expanded
       type.   An exception is made for the types BITS M, as
       described below.

     In other words, inheritance for expanded  classes  acts
only  in  its module enrichment capacity. No polymorphism is
permitted here. This  is  due  to  the  nature  of  expanded
classes:  an  entity  of type EXP, where EXP is expanded, is
meant to be directly associated with an object of type  EXP;
so  its  size is frozen. In contrast, entities of unexpanded
types are associated with references to objects; the size of
such  an  object  does  not  need  to  be  identical for all
possible assignments.


3 ASSIGNMENT AND EQUALITY TESTING

An assignment of b to a, as just discussed, can be of any of
the following three forms (the last one new):
     +
       a := b
     +
       a.Clone (b)
     +
       a.copy (b)

     The previous rules apply to all three forms. As  noted,
if  the  type  of either a or b is expanded, then both types
must be expanded and (except for BITS M, as discussed below)
identical.

     Clone is written with an initial upper-case because  it
is  a predefined language feature and hence a reserved word.
The name copy, however, is not reserved; instead, copy is  a
routine  of  the  new  universal  class  ANY,  automatically
inherited by every class.

     The effect of assignment under its three forms  depends
on  whether  the  types are expanded or not.  For unexpanded
types, the rules are the same as in pre-2.2 Eiffel, with the
addition of copy:
     +
       := denotes reference assignment. After the assignment
       a will refer to the same object as b, or will be void
       if b was.
     +
       Clone represents object duplication. If b  refers  to
       an  object, the Clone operation creates a new copy of
       this object, and makes a refer to it. If b was  void,
       a becomes void.
     +
       copy represents copy without allocation. The contents
       of  the  object associated with b are copied onto the
       object associated with a.  If either a or b is  void,
       an exception is raised.

     The  copies  performed  by  both  Clone  and  copy  are
``shallow''  copies;  in  other  words  they copy one entire
object, but that object only,  the  reference  fields  being
copied verbatim.

     For an expanded type, all three  assignments  have  the
effect  just described for copy. Because of the rules given,
neither a nor b may be void in this case,  so  no  exception
will ever be raised.

     A similar semantics is defined for equality tests.  For
unexpanded  types,  the  boolean  expression a = b denotes a
test for equality of references; the expression  a.Equal (b)
denotes   a  test  for  (shallow)  field-by-field  equality,
returning true if only the  contents  of  the  corresponding
objects are field-by-field identical, or both references are
void. For expanded types, both tests have  the  semantic  of
Equal.


4 ENSURING CONSISTENT SEMANTICS FOR GENERIC CLASSES

An important design guideline for  generic  classes  follows
from  the preceding specification of assignment and equality
operators.

     It is  possible  in  Eiffel  to  write  general-purpose
classes  describing  ``container''  data structures (such as
lists, trees etc.) containing objects  of  arbitrary  types.
These   are  written  as  generic  classes.   Genericity  is
essential to reconcile  static  typing  with  the  need  for
reusable container classes.

     Consider, however, a routine of a generic class, acting
on  values  of generic type; by generic type, we mean a type
defined by a formal generic  parameter  of  the  class,  for
example  T  in a routine of class C [T].  The problem arises
of how programmers can write such a routine and  be  assured
that  it has the same semantics regardless of the types used
as actual generic  parameters  for  practical  uses  of  the
class.

     In light of the above discussion, the answer is  clear.
Only  copy  semantics  is  available  for  both expanded and
unexpanded types. (For expanded  types  reference  semantics
does  not  make  sense.) Writers of generic classes may thus
guarantee uniform semantics by making sure that:

     +
       All equality tests  between  expressions  of  generic
       type use Equal rather than =.
     +
       All assignments use either copy rather than :=. Clone
       may also be used if, for generic parameters which are
       unexpanded classes,  duplication  is  desired  rather
       than mere copy; the difference between Clone and copy
       is irrelevant for expanded classes.

     This rule is particularly important because, as will be
seen  below,  basic  types  (INTEGER,  REAL  and others) are
defined as expanded classes. Using Equal and  copy  is  thus
the  appropriate  way to guarantee that a generic class will
have uniform semantics  for  basic  types  and  other  class
types.

     It is important  to  note  that  this  was  essentially
already  true in pre-2.2 Eiffel: as a first step towards the
more general solution described here, Equal and  Clone  were
already  available  on  basic  types,  with  the  respective
semantics of = and := on these types. (copy did not exist in
pre-2.2 Eiffel.) The rules described above simply generalize
and systematize these conventions.

     Clearly, the above guidelines should only  be  followed
if uniform semantics is desired; they do not mean that = and
:= should be banned from generic classes. One  can  conceive
of  perfectly  valid  generic  classes  for  which reference
semantics is desired when the actual generic parameter is an
unexpanded  class,  and  value semantics is desired when the
generic parameter is an  expanded  class,  in  particular  a
basic type.

     Since field-by-field object comparison and object  copy
are  the  operations  that  make  sense  in all cases, it is
legitimate to criticize the syntax  retained:  why  are  the
simpler  symbols  (=  and  :=)  used  for  operations  whose
semantics is not the same for expanded and unexpanded types,
and  more  verbose  notations  (Equal,  copy)  used  for the
operations which have consistent semantics and hence  appear
``cleaner''?

     This  is  a  valid  criticism.  Indeed,   I   seriously
considered,  during  the initial design of Eiffel, making :=
denote copy assignment and =  denote  object  equality.  But
this  idea  was  rejected out of fear that such a clash with
the tradition of all previous languages supporting reference
semantics  (Pascal, Ada, PL/I, and many others) would result
in numerous mistakes on the part of new Eiffel  programmers,
not all of whom may yet be expected to learn Eiffel as their
first programming language.

     The relative clumsiness of  the  syntax  for  the  more
uniform  operations  was  deemed  less  unpleasant  than the
prospect of massive mistakes by the cohorts of new converts,
still  recovering  from  the  influence of older programming
languages.

     As explained in section 5.8.3 of [1], using variants of
the  same notation for copy and reference semantics (such as
the traditional operators := and = for copy, and the  Simula
operators :- and == for reference) would have been the worst
possible solution, implying that minor syntactical or typing
oversights  could  result in drastic changes of meaning. The
notations had to be visibly different.  New  reserved  words
(Clone, copy, Equal) were deemed clearer than new operators.


5 BITS TYPES

The type system of Eiffel is entirely based on the notion of
class:  every Eiffel type is now defined as a class. As will
be  seen  shortly,  this  includes  the   types   previously
considered as ``basic'' and hence special: INTEGER, BOOLEAN,
REAL, CHARACTER.

     Most classes, including these, will be defined by class
declarations,  under  the syntax and semantics introduced in
reference [1]. Since any non-empty  class  declaration  must
refer  to  other  classes  (to give the types of attributes,
routine arguments, function results, and to list parents  if
an  inheritance  clause  is  present),  some classes must be
considered as predefined.  Eiffel offers an unbounded set of
predefined classes, written

     BITS M

where M is any non-negative integer constant.

     For any M, the definition of class BITS M  is  that  it
describes  objects  whose representation will fit in M bits.
All BITS classes are expanded classes.

     In other words, by declaring an entity such as

     quadruple: BITS 128

the  programmer   is   requiring   the   supporting   Eiffel
environment   to   devote   at   least   128   bits  to  the
representation of any object that  becomes  associated  with
quadruple at run-time. Of course, the environment is free to
allocate more space.

     The BITS classes are essentially useful  for  low-level
manipulations   of   bit   strings,   for  machine-dependent
definitions, and for interfacing with other  languages.  The
last two applications are particularly significant:
     +
       Classes such as INTEGER,  REAL  and  DOUBLE  will  be
       defined  below  as expanded classes with an attribute
       of the form value: BITS M for some M. The  default  M
       is  32  for INTEGER or REAL and 64 for DOUBLE. Thanks
       to the BITS classes the machine dependencies  can  be
       recognized  and  isolated  in  a  few classes such as
       these.
     +
       Pre-2.2 Eiffel made  it  possible  to  exchange  data
       between Eiffel and other languages such as C, but the
       sole  interface  unit  was  the  word  (32  bits   by
       default).  Now  by  declaring entities of type BITS M
       for arbitrary M, Eiffel software can send and receive
       data  of  any  size. This is explained in more detail
       below.

     Classes BITS M, for any M, are  considered  to  include
and  export  the  boolean  operations  and,  or, xor and not
(defined in infix form, as explained in  the  next  section)
and the constants true and false.  This makes it possible to
use these types for performing operations on bit strings  of
arbitrary  length.  In  pre-2.2 Eiffel this was done through
the  library  classes  BOOLEAN_SET  and  INTEGER_SET,  which
remain available.

     For  assignments  involving  the  BITS  M  types,   the
following rule applies:

     +
       A value of type BITS M may be assigned to  an  entity
       of type BITS N for M  N. Only the first M bits of the
       target are affected by such an assignment.

     This is consistent with the use of BITS types for  low-
level manipulations.

     No inheritance structure exists on BITS  classes  since
they  are  expanded.  (The  above  assignment  rule could be
construed as meaning that BITS M from BITS M+1  for  all  M,
but  this  would  be  a  rather  special  case for which the
benefits  of  referring  to  inheritance   concepts   appear
doubtless.)


6 INTERFACE WITH OTHER LANGUAGES

[Readers who are not interested  in  detailed  questions  of
Eiffel  interface to other languages, particularly C, should
skip to the next section.]

     The notion of expanded class also provides for  a  more
flexible  interface  with  the  non-Eiffel world. In pre-2.2
Eiffel,  Eiffel  routines  that  needed  to  interface  with
routines  written  in other languages, such as C, could only
pass them arguments of basic types or of class types; in the
latter  case the argument is actually a reference. This also
applied to the types of the results of  external  functions.
Furthermore,  it  was  not  possible for an Eiffel object to
contain a sub-object modeled directly to a C structure  type
declaration,  although this constraint was somewhat relieved
by the presence of a routine copy_structure,  in  the  basic
library  class  INTERNAL,  making  it  possible  to copy the
contents of a C structure into an Eiffel object.

     This concern is now addressed by composite classes.  If
each instance of C must contain a sub-object whose structure
comes from external C software, then  C  should  contain  an
attribute declaration of the form

     sub: EXP

where EXP is an appropriate  expanded  class.  Usually,  EXP
will simply be declared as

     expanded class EXP feature
          value: BITS M
     end

where M (an integer constant) is large enough to  cover  the
size of instances of the required C structure type.

     It would usually  be  imprudent  to  give  EXP  a  more
precise  Eiffel  structure,  since this would require making
non-portable assumptions about  the  internal  structure  of
both  Eiffel  and  C  objects.  The  above, however, will be
sufficient in practice since entities of type EXP,  such  as
sub,  should  only be used as arguments to external function
calls. The purpose is not to have Eiffel do the job of C  or
the  reverse,  but  only to facilitate communication between
the two worlds.

     This facility is  consistent with the  Eiffel  approach
to   interfacing   with   other   languages,  based  on  two
observations.  First,  reusability   implies   that   Eiffel
software  must  be  able to interface with existing software
written in other languages, but not that the Eiffel language
should  be  corrupted by compatibility with older, unrelated
designs. A passage is provided through the  border,  but  it
remains  a  border.  Second  (for  the special case of C): C
plays  with  respect  to  Eiffel  the  role  that   assembly
languages played for earlier high-level languages: that of a
low-level vehicle  to  be  used  only  when  one  cannot  do
otherwise.

     As  an  aside,  note  that  release  2.2  provides
     further  support  for  more  powerful  interaction
     between   Eiffel   and   other   languages.     In
     particular,  a  simple syntactical extension makes
     it possible to  pass  the  address  of  an  Eiffel
     routine  to  an external routine.  The notation is
     @f, where f is a routine of the  enclosing  class;
     such   an  expression  is  only  valid  as  actual
     argument to an external routine.


7 ARGUMENT PASSING

The rules for argument passing  are  a  consequence  of  the
above.

     When an Eiffel routine calls  another  Eiffel  routine,
the  semantics  of  argument  passing is the same as that of
assignment from actual  to  formal  argument:  by  copy  for
arguments  of  expanded  types  (including,  as will be seen
below, the basic types), and by reference for  arguments  of
unexpanded types.

     [C-unfascinated readers, please skip to next section.]

     For arguments to external routines, the rule cannot  be
as  systematic;  they  must  of  necessity be adapted to the
target language. Current C, for example,  does  not  support
passing  of structures or arrays as arguments; only pointers
to such elements may be passed. (ANSI C may be more liberal,
but  has  yet  shown little relevance to the real world of C
programming.) So for C all arguments,  whether  expanded  or
not,   are   normally   passed   by  reference.  It  is  the
responsibility of the target C routine to copy any structure
or  array  if  needed.  An  exception  is made, however, for
arguments of basic types (integer, real, boolean,  character
and  the  new type ``long real'' described below), which are
passed by value for compatibility with tradition.

     For the  results  of  external  functions,  the  Eiffel
compiler  can  take  care  of  ensuring the proper semantics
(copy for expanded, reference for unexpanded);  a  language-
dependent  convention  is needed, however, to determine what
the function will return. In  C,  the  answer  is  the  only
portable  one: the C function is assumed to return a pointer
to the result for non-basic types,  expanded  or  not.   The
code  generated  by  the  Eiffel compiler then takes care of
performing a copy in the expanded case.

     An important practical caveat governs the  exchange  of
arguments  and  function  results  between  Eiffel and C (or
another language). We may  usually  assume  that  the  basic
types are common to both languages, and hence that values of
these types may be manipulated by both  sides.  Both  C  and
Eiffel  routines,  for  example,  may perform additions on a
given integer value. For values of any other type,  however,
only  one  side  may  safely  execute  operations other than
parameter passing; the other side will  simply  be  used  as
repository  for the values, or to pass them along to further
elements.  There are exceptions to this rule, but they apply
to  special  cases  and  require that the programmer be well
versed in the implementation techniques for both languages.


8 PREFIX AND INFIX OPERATORS

We are now almost ready to explain how basic  types  can  be
interpreted as classes. A syntactical facility, although not
essential, will prove convenient for this.
     To facilitate expressiveness, we allow some routines to
be  used  in prefix or infix form.  The standard grammar for
the declaration of routines (see  [1],  appendix  C)  is  as
follows (brackets introduce optional components:
          Routine_declaration = Routine_name [Formal_arguments] [Type_mark] "is" Routine_text
          Formal_arguments = "(" Entity_declaration_list ")"
          Type_mark = ":" Type

     The optional ``Type_mark'' gives the type of the result
when the routine is a function.
     In  pre-2.2  Eiffel,  ``Routine_name''  is  simply   an
identifier. We now allow the two new forms

          prefix '"'Operator'"'
          infix '"'Operator'"'

In this syntax, ``Operator'' may only  be  one  of  the
following:
     +  -  *  /  <  >  <=  >=  and  or  xor implies  not div  mod

(Operators xor, for exclusive or, and implies, for boolean
implication, are new with 2.2.)

     All these except not may be used in infix declarations;
only +, - and not are permitted for a prefix declaration.

     Thus only a small group of common operators may be used
in prefix or infix form.

     Routines declared in this way must  all  be  functions.
Infix  routines  must  have  exactly  one  argument;  prefix
routines must have none.  Calls to such  routines  must  use
the corresponding operators, in prefix or infix form.

     Assume  for  example  a  class  MATRIX  with   function
declarations of the form

     infix ("+") (other: like Current): like Current is...
     prefix ("-"): like Current is...

Then, for m1 and m2 of type MATRIX, the expressions m1 +  m2
and   -m1,   respectively,   will   denote   calls   to  the
corresponding functions.

     The precedence of infix and prefix operators  is  fixed
and  given  by  the  standard  Eiffel  precedence table (see
section C.3 of [1]).

     It is important to note that the above  possibility  is
not  ``the  introduction  of  overloading  into  Eiffel''. A
powerful form of overloading has always  been  available  in
Eiffel, since any two classes can have different features of
the same name, and dynamic  binding  makes  it  possible  to
obtain  run-time  discrimination.  What  is  new is a simple
syntactic extension, making it possible to use the  standard
arithmetic  operators  for  any  class,  not  just the basic
types.


9 COMPARABLE and NUMERIC

Two deferred classes are needed in the basic Eiffel library.

     Class  COMPARABLE  already  existed  in  pre-2.2;   its
features  now  use  the  infix facility. Note that only "<="
needs to be deferred.

     deferred class COMPARABLE export
          infix "<", infix "<=", infix ">", infix ">="
     feature
          infix "<=" (other: like Current): BOOLEAN is
                    -- Is current element less than or equal to other?
               deferred
               end; -- "<="
          infix "<" (other: like Current): BOOLEAN is
                    -- Is current element less than other?
               do
                    Result := Current < other and not Current.Equal (other)
               end; -- "<"
          infix ">" (other: like Current): BOOLEAN is
                    -- Is current element greater than other?
               do
                    Result := other < Current
               end; -- ">"
          infix ">=" (other: like Current): BOOLEAN is
                    -- Is current element less than or equal to other?
               do
                    Result := other <= Current
               end -- ">="
     end -- class COMPARABLE

Any class that needs to manipulate  elements  connected
by an order relation may inherit from COMPARABLE.

     Class  NUMERIC  describes  any  type  with  the   basic
arithmetic operations:

     deferred class NUMERIC export
          infix "+", infix "-", infix "*", infix "/",
          prefix "+", prefix "-"
     feature
          infix "+" (other: NUMERIC): NUMERIC is
                    -- Sum of other and current element
               deferred
               end; -- infix "+"
          prefix "-" BOOLEAN is
                    -- Is current element less than other?
               deferred
               end; -- prefix "-"

          (etc.)

     end -- class COMPARABLE

Any class that needs to manipulate  elements  on  which
operations  similar  to  the basic arithmetic operations are
available, may inherit from NUMERIC (and from COMPARABLE  if
necessary).


10 BASIC TYPES AND DOUBLE PRECISION

We can now  introduce  the  basic  types  as  classes.  This
definition is based on an assessment of the present state of
computer hardware: on most platforms, an integer or  (short)
real  will fit in 32 bits; a character will fit in 8 bits; a
long real will fit in 64 bits.

     I believe it is preferable  to  explicitly  incorporate
these   industry-standard   lengths   into  the  programming
language definition, rather than to maintain a  pretense  of
implementation-independence. In practice such a pretense can
only  prevent  software  developers  from  writing  portable
software  that  will  perform  in  a (reasonably) consistent
fashion  across  a  range  of  platforms.   (Another,   more
axiomatic  approach  has  been  followed  by  Ada,  but  its
advantages over the  policy  of  making  simple  assumptions
about  the minimum bit-size requirements of basic data types
are not clear in the present state  of  hardware  technology
and software theory.)

     These conventions are not really ``wired in'' and it is
possible  to  change  the  definitions  below  to  adapt  to
different sizes.

     The basic types are defined as follows.

     expanded class REAL export
          repeat NUMERIC, repeat COMPARABLE, value {REAL}
     inherit
          NUMERIC;
          COMPARABLE
     feature
          value: BITS 32;
                    -- The real number's internal representation
          set_value (v: BITS 32) is
                    -- Use v as the bit code for the real number's value
               do
                    value := v
               end; -- set_value
          infix "+" (other: INTEGER): INTEGER is
                    -- Sum of other and current element
               external
                    short_real_addition
                         (v, w: BITS 32): BITS 32
                    language "..."
               do
                    Result.set_value (short_real_addition (value, other.value))
               end; -- "+"
... And similarly for other prefix and infix operations ...
     end -- class REAL

     So we consider that a short real number is an  expanded
32-bit  object, which has the properties of NUMERIC objects,
including those of COMPARABLE objects.

     The repeat notation which appears in the export  clause
for REAL is a facility (new for 2.2) which makes it possible
to copy conceptually the export clause of an ancestor  class
such  as  COMPARABLE  and  NUMERIC into a descendant such as
REAL without copying it physically.  (In addition, value  is
exported to the class itself.)

     The repeat facility addresses a problem  that  was
     mentioned  by  users  who  developed large systems
     with pre-2.2 Eiffel: the tediousness  of  managing
     export  lists  when many levels of inheritance are
     involved. The problem should now disappear; if you
     specify with a repeat clause that the interface of
     a class D should always include the  interface  of
     one  of  its ancestors A, then any addition to A's
     interface will automatically be reflected in .

     The basic  operations  of  class  REAL  operations  are
implemented   through   external   routines,   since   these
operations are to be carried out directly  by  the  hardware
(or  through the lower-level implementation language used by
the given implementation of Eiffel, such as assembly or  C).
The idea here is the same as with the way arrays are treated
in pre-2.2 Eiffel. Arrays, and now real numbers, are  viewed
as  instances  of  ``normal''  Eiffel  classes  in the Basic
Library.  From  a   practical   standpoint,   however,   the
implementation   knows   about   these  and  implements  the
corresponding  operations  through  shortcuts:   no   actual
routine call is needed to access an array element, or to add
two reals. So the usual efficiency of fundamental operations
is achieved, but the conceptual framework (classes, objects,
inheritance)  remains  consistent  with  the  rest  of   the
language.
     Class INTEGER is declared in a way similar to REAL:

     expanded class INTEGER export
          repeat NUMERIC, repeat COMPARABLE, value {INTEGER}, to_real
     inherit
          NUMERIC;
          COMPARABLE
     feature
          value: BITS 32;
                    -- The integer's internal representation
          set_value (v: BITS 32) is
                    -- Use v as the bit code for the integer's value
               do
                    value := v
               end; -- set_value
          infix "+" (other: INTEGER): INTEGER is
                    -- Sum of other and current element
               external
                    integer_addition (v, w: BITS 32): BITS 32
                    language "..."
               do
                    Result.set_value (integer_addition (value, other.value))
               end; -- "+"
          ... And similarly for other prefix and infix operations ...
     end -- class INTEGER

To allow for mixed-mode computation class  REAL  should
also  include a conversion function for integers, defined as
follows:

         set_value_from_integer (n: INTEGER) is
               -- Convert from value given as integer
          external
               int_to_real (v: BITS 32): BITS 32
                    language "..."
          do
               set_value (int_to_real (n.value))
          end; -- set_value_from_integer

(With value in INTEGER selectively exported  to  REAL).   In
version  2.2,  however, it will remain possible to assign an
integer value to a real entity, with the expected effect  of
a  conversion. This is an exception to the generality of the
type  scheme  presented   here,   for   compatibility   with
programmers' current habits.

     The  other  basic  types  are  defined  similarly.  For
CHARACTER,  attribute  value is of type BITS 8; for BOOLEAN,
value is of type BITS 1.

     The type DOUBLE for double-precision  reals,  new  with
2.2,  is  also  an expanded class, almost identical to REAL,
but inheriting from BITS 64 rather than BITS 32.

     The classes just discussed are all expanded  and  hence
there  is  no  inheritance relation among them. Despite what
one might think at first look, it would not  be  appropriate
to consider INTEGER as an heir to REAL or REAL as an heir to
DOUBLE.
     If longer precision reals or integers are  needed,  the
technique  is  easily  generalized:  just define new classes
that  inherit  from  NUMERIC  and  COMPARABLE  and  have  an
internal value attribute of the right BITS size.


11 UNEXPANDED CLASSES AND GENERICITY

One more technique is needed to benefit the full benefit  of
expanded  classes.  We  need  to  be  able to define generic
classes, representing such abstractions as matrices, in such
a  way  that  the  formal generic parameters can be expanded
classes such as DOUBLE representing  objects  of  any  given
size.

     Consider the following general class:

     class MATRIX [T -> NUMERIC] export
          repeat NUMERIC
     inherit
          NUMERIC
     feature
          ... Matrix representation and operations,
          expressed in terms of the corresponding operations on
          objects of type T ...
     end -- class MATRIX

This generic class will accept any unexpanded class  as
actual  generic  parameter  corresponding to T. It will also
accept expanded classes such as BOOLEAN, CHARACTER,  INTEGER
and  REAL whose ``size'', defined below, is less than 32. It
will not, however, accept DOUBLE or other  ``big''  expanded
classes.

     To describe matrices of bigger composite  objects,  you
may  define a different class, an heir to MATRIX with a very
short definition such as:

     class D_MATRIX [T (DOUBLE) -> NUMERIC] export
          repeat MATRIX
     inherit
          MATRIX
          -- No need for a feature clause
     end -- class D_MATRIX

The new facility here is the possibility to  specify  a
class   name   CLASS_NAME   (DOUBLE   in  this  example)  in
parentheses after the name of the formal generic parameter G
(T in this example). This means:

     ``Actual generic  parameters  corresponding  to  G
     must be classes of size no bigger than the size of
     CLASS_NAME''.

The size of a class is defined as follows:

     +
       The size of all unexpanded classes is  the  same.  It
       can  be  referred  to  as  the  size of the universal
       ancestor ANY. By default this is 32.
     +
       The size of one of the BITS M expanded classes is M.
     +
       The size of any other expanded class is  obtained  by
       adding  the  sizes of the types of all the attributes
       of the class (those defined in the class itself,  and
       those inherited from ancestors).
     From the above definitions, then, the size  of  INTEGER
and REAL is 32 and the size of DOUBLE is 64.
     Class  D_MATRIX  above  should  be  used  to   describe
matrices  whose  elements are no bigger than DOUBLE objects.
This includes any  unexpanded  class  as  well  as  BOOLEAN,
CHARACTER, INTEGER, REAL.
     The  convention  that  the  size  of  ANY   and   other
unexpanded  class  is  32  results from an assessment of the
current  state  of   hardware   technology.   Instances   of
unexpanded  classes  are accessible through references, that
is to say, at the implementation level,  pointers.  We  thus
accept that, for still some time:
1.
     +
       Limiting Eiffel applications to manipulate 28319 (about
       two  billion)  dynamic objects during a given session
       is acceptable
2.
     +
       Computer memory is not cheap enough that it would  be
       preferable to raise this limitation to, say, 2639 , if
       this implied  doubling  the  space  occupied  by  the
       pointers generated during a session.


12 CONCLUSION

The mechanisms described above may  at  first  appear  as  a
significant  extension  of  pre-2.2  Eiffel.  They  are not.
Instead, the work reported here should primarily  be  viewed
as   a   reorganization   and  cleanup  of  the  theoretical
framework, which makes important improvements  available  at
the  cost  of  very  little actual extension. As a matter of
fact the above  discussion  has  introduced  only  four  new
reserved words:

     BITS
     expanded
     infix
     prefix

     From a theoretical viewpoint, four have  been  removed:
INTEGER, REAL, BOOLEAN, CHARACTER.  If we accept that DOUBLE
would have been needed anyway, the reserved  word  count  is
actually  decreased  by  one.  (The names of the basic types
will, however, remain reserved in release 2.2 for  practical
reasons.)

     The starting point for this work was both practical and
theoretical.  From  a practical viewpoint, I had known for a
long time that  it  would  be  ultimately  indispensable  to
support double precision; however double precision initially
appeared extremely difficult in the presence of  genericity.
Clearly I did not want an implementation of genericity which
would result (as most Ada compilers do for generic packages)
in  duplicating  code  for  each  separate  use of a generic
class; this  was  deemed  unacceptable.   At  some  point  I
realized  that,  from a practical perspective, the mechanism
of constrained genericity (implemented  for  2.1)  contained
the  key.  But  then  why  stop  at double precision? Eiffel
strives for generality; what if someone wants, say,  128-bit
reals?

     Another practical concern, which had been  present  for
some   time,   was  to  avoid  dynamic  allocation  whenever
possible. In most applications, the overhead of pointers  is
acceptable;  whether  in Eiffel, Pascal, Ada or C, most non-
trivial  data  structures  are  accessed  through   pointers
anyway.  Still,  this  overhead is annoying for applications
that manipulate very large numbers of composite objects each
of  which contains several small but non-atomic components -
say triangles, each containing three ``point'' attributes.

     Yet another  motive  was  the  desire  to  improve  the
flexibility of the interface of Eiffel with other languages,
especially C. This requirement is not an internal one, since
for  our  own developments the external interface of pre-2.2
Eiffel, based on external  routines  that  essentially  pass
BITS  32  values  to  and  from Eiffel, is more than enough.
Eiffel users have increasingly requested  support  for  more
general interface mechanisms, however; the aim in particular
is to facilitate cooperation  between  Eiffel  software  and
existing packages (graphics, databases, expert systems etc.)
The support for composite  objects  should  be  particularly
useful  here,  since it covers a need often voiced by users,
that of having Eiffel objects contain sub-objects  described
by C structures.

     Support for prefix and infix  operators  was  never  by
itself a major concern.  In fact I was (and remain) strongly
against unbounded operator overloading, which opens the door
to all kinds of notational abuses; I am much more interested
in  the  semantically  useful   overloading   permitted   by
redefinition  and dynamic binding than by purely syntactical
techniques which make it possible  to  call  a  two-argument
routine under the form a @#$ b (say).  The ability to define
the precedence of new operators, in particular,  has  always
struck  me  as  a  perfect  example  of programming language
feature that is bad on all counts: it is difficult to find a
good  syntax  for it (do you specify precedence by a number,
forcing each program  writer  or  reader  to  refer  to  the
programming  language  reference manual each time, or do you
say ``one less than the precedence of operator  xxx''?);  it
makes  it  particularly  easy  to  write tricky programs; it
invites bugs; it  makes  the  parser  -  normally  the  most
trivial   and   worriless   part  of  a  modern  compiler  -
considerably more difficult to implement (since each program
may  change  the  syntax); and to top the whole thing, it is
only a superficial, syntactical, ``vanity''  extension.  Not
the  most  urgent  feature  to add to Eiffel.  It was clear,
however, that if we were to treat basic  types  formally  as
classes  (see next), then we should be able to interpret the
basic prefix and infix operators as denoting  routines.  Why
not   then   make   them  available  to  writers  of  normal
programmer-defined classes, such as MATRIX, VECTOR  and  the
like? Since the concept would only apply to a few predefined
operators, the extension would remain small and reasonable.

     I also had more theoretical concerns. The difference of
treatment  between  basic  types  and class types in pre-2.2
Eiffel, explained in chapter 5 of [1],  came  directly  from
Simula.  I  knew that some difference was necessary: in most
cases, class types need  reference  semantics,  and  usually
they  also  need  dynamic  object creation, but no one would
want  to  allocate  integers   dynamically.    Still,   this
inconsistency  was  unpleasant in a language that emphasizes
simplicity and uniformity. The concepts of class  seemed  to
provide  such  a  powerful basis for the discussion of types
that it was sad to have to abandon it for a handful of basic
types - the most fundamental at that.

     When I presented Eiffel  concepts  in  public  courses,
listeners  would  often  question  the  special treatment of
basic types, asking why INTEGER,  for  example,  was  not  a
class.  My  answer  usually included (among other arguments)
that any theory of types needed to start somewhere, and thus
to  assume a few pre-existent, ``Bourbaki-given'' types. Why
not, the argument would go, use for these the few types with
which everybody is familiar, and which are readily available
on all hardware? I would usually say something like ``In the
most purist object-oriented approach, we would still need to
take at least one type for granted: the bit''.

     In a way, the framework presented above implements this
purist  approach,  although  it  considers  not a single bit
type, but an  unbounded  sequence  of  bit  types,  BITS  M.
Everything  else is derived from these. Thanks to the notion
of expanded class, which also addresses the problems that  I
have  called  ``practical  concerns'',  the basic types make
perfect sense in this context;  they  cease  to  be  special
language   elements   with  magical  properties  and  become
standard library classes. To make  the  solution  clear  and
uniform,  limited  support  for  infix and prefix operators,
useful in its own right, may  be  introduced.   No  loss  in
performance  is  implied  because  the  compiler can be made
smart enough to recognize these classes  and  apply  special
techniques  to  them; in fact, the pre-2.2 implementation of
basic   types   remains   entirely   valid.   Finally,   the
introduction of expanded classes makes it possible to tackle
the dual-semantics issue in a much more explicit,  complete,
and (I hope) convincing fashion than ever before.

     I feel that the general  cleanup  of  the  Eiffel  type
system  is  almost  complete  with  the  discussion  of this
article (and the companion discussion [2],  which  addresses
details  of  the  rules making Eiffel fully statically typed
through constraints on polymorphism, not well covered in [1]
and not yet fully supported by the implementation).  To make
the basic type into fully ``ordinary''  citizens,  we  would
also  need  to  deal with manifest constants, which could be
treated as ``zerofix'' operators. Example of such  operators
would  be  0  and 1, now introduced as deferred functions in
class NUMERIC, corresponding to  the  neutral  elements  for
operations infix "+" and infix "*". The other integers would
be interpreted as syntactical abbreviations for  1+1,  1+1+1
etc. Such an extension would require that we be more formal,
separating what  has  been  called  NUMERIC  into  different
inheritance  levels  (GROUP, RING, FIELD etc.) characterized
by the proper assertions. This is worth  exploring  further;
there may be the root of a full type theory, perhaps even of
a general model for computation.  This  extension,  however,
will not be part of Eiffel release 2.2.

References
[1]  B.  Meyer,   Object-Oriented   Software   Construction,
Prentice-Hall, 1988.
[2] B. Meyer,  The  complete  rules  for  static  typing  in
Eiffel, unpublished.
-- 

-- Bertrand Meyer
bertrand@eiffel.com

jos@cs.vu.nl (Jos Warmer) (06/08/89)

In article <154@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:
>
>     As  an  aside,  note  that  release  2.2  provides
>     further  support  for  more  powerful  interaction
>     between   Eiffel   and   other   languages.     In
>     particular,  a  simple syntactical extension makes
>     it possible to  pass  the  address  of  an  Eiffel
>     routine  to  an external routine.  The notation is
>     @f, where f is a routine of the  enclosing  class;
>     such   an  expression  is  only  valid  as  actual
>     argument to an external routine.
>
I guess/hope that this can be used as an efficient interface for calling
Eiffel routines from C. But I don't understand the meaning of this.

An Eiffel routine has meaning only when it is called on an Eiffel
object as in 'object.routine'.  An Eiffel routine can never be
called as a standalone function as C routines can.
Besides, because of the dynamic binding in Eiffel, you can never
be sure what implementation of a routine is called.
Even when the names of the routines are equal and they are called on an
object of the same entity type, they can be different implementations.

So, how can this address be used by the external routines ?
They (or hopefully the Eiffel runtime system) must ensure that the routine
will only be called on an Eiffel object of the correct (dynamic) type.

If the routine address is used by the external routine to call
it directly, this check cannot be performed by the Eiffel runtime system.
This means that the external routines can easily corrupt internal Eiffel
structures.

If the routine address cannot be used by the external routine to call
it directly, what use is it?

____ABOUT_THE_EFFICIENCY_ISSUE____

Some time ago, I mentioned to ISE that the way of calling Eiffel routines 
from C was very inefficient.  In version 2.1b this is accomplished by the
C routine `eif_rout', which is declared as follows:

    DATUM eif_rout (Objptr, r_name, arg1, ...
    OBJPTR Objptr;
    char *r_name;

`Objptr' is a pointer to the eiffel-object and `r_name' is the
name of the routine that must be performed on `Objptr'.

In the implementation of `eif_rout', the array containing the names of
all applyable routines for `Objptr' is searched for `r_name'.
If `r_name' is found, the corresponding routine is called.

This means that a number of string-comparisons will have to take place
for each function call.  I am not a fan for efficiency at each cost,
but when this interface must be used heavily, this is prohibitive.

                                 Jos Warmer
				 jos@cs.vu.nl
				 ...uunet!mcvax!cs.vu.nl!jos
-- 
                                 Jos Warmer
				 jos@cs.vu.nl
				 ...uunet!mcvax!cs.vu.nl!jos

sakkinen@tukki.jyu.fi (Markku Sakkinen) (06/12/89)

The article referred to was entitled "EIFFEL TYPES" (by Bertrand Meyer).
Note: this submission talks _only_ about the Eiffel-C interface
(one short section of the full paper length article).

In section 7 (Argument passing), Meyer writes:

>      For arguments to external routines, the rule cannot  be
> as  systematic;  they  must  of  necessity be adapted to the
> target language. Current C, for example,  does  not  support
> passing  of structures or arrays as arguments; only pointers
> to such elements may be passed. (ANSI C may be more liberal,
> but  has  yet  shown little relevance to the real world of C
> programming.) So for C all arguments,  whether  expanded  or
> not,   are   normally   passed   by  reference.  It  is  the
> responsibility of the target C routine to copy any structure
> or  array  if  needed.  An  exception  is made, however, for
> arguments of basic types (integer, real, boolean,  character
> and  the  new type ``long real'' described below), which are
> passed by value for compatibility with tradition.
>

It must be a weird and antiquated C compiler indeed that cannot
pass _structures_ themselves both as arguments and as function
results. The above holds for _arrays_, though; that is a basic
(mis)feature of C (and C++). Of course, pointers to structures
are very often used as arguments in practice: either a call-by-reference
effect is desired or copying large structures would appear inefficient.
 
>      For the  results  of  external  functions,  the  Eiffel
> compiler  can  take  care  of  ensuring the proper semantics
> (copy for expanded, reference for unexpanded);  a  language-
> dependent  convention  is needed, however, to determine what
> the function will return. In  C,  the  answer  is  the  only
> portable  one: the C function is assumed to return a pointer
> to the result for non-basic types,  expanded  or  not.   The
> code  generated  by  the  Eiffel compiler then takes care of
> performing a copy in the expanded case.
>

Returning a pointer (to a struct or an array) as the result
of a C (or C++) function is very dangerous and should be done only if its
referent (the _object pointed to_) fulfills one of the conditions: 
1. its storage class is static or extern; 
2. its storage class is auto, but it is defined in a calling function;
3. it has been allocated in dynamic memory before calling the function
   (and hopefully "someone" will delete it when it is no longer needed);
4. it has been allocated in dynamic memory within the function 
   and its remaining allocated will not cause too much memory shortage 
   (there is no garbage collection in the C or C++ realm).
The most important point is: never pass a pointer to an _automatic_
variable of a function as the result of that function (cf. my short article
in ACM SIGPLAN Notices, March 1989)!

My recommendation for most cases is therefore:
A. Pass any structure _itself_ (not a pointer to it) as the function's result.
B. Since it is impossible to pass an _array_ itself, declare a structure
   that has the array as its only element; C (or C++) will let you pass
   the structure. This is only a compile-time trick that adds nothing
   to the runtime objects, so the Eiffel side should be able to handle
   the result directly as an (expanded?) array.
Probably there _are_ cases in which one would rather like to pass a pointer
as the result, especially with arrays. One must then be very careful.

Markku Sakkinen
Dept. of Computer Science, University of Jyvaskyla
(imagine the a's in 'Jyvaskyla' with umlauts)