[comp.lang.eiffel] Multi-branch instructions and unique values

bertrand@eiffel.UUCP (Bertrand Meyer) (03/19/89)

	Version 2.2 of Eiffel includes two new facilities for multi-branch
choices.  The following is extracted from an internal document describing
the purpose and form of these extensions and (under suitable form)
will be found in the documentation for the new release.
It seemed that readers of this news group would be interested
in seeing an advance version.

------------------- CUT HERE ----------------------------------

       MULTI-BRANCH CHOICES IN EIFFEL

           Bertrand Meyer

           February 21, 1989
    (Edited for posting to Usenet, March 18)


1 Overview
----------

The introduction of a ``test'' instruction and of a ``unique''
form of constant  attribute  declaration  is  meant  to  enable
programmers to write simple multi-branch choices.

   The extension is purposely simple and limited. Advanced
cases  of multi-branch discriminations should be handled not
by a test instruction (the equivalent of Pascal's  or  Ada's
case  instruction)  but  by  using  redefined  routines  and
dynamic binding. Dynamic binding ensures openness by  making
it  possible to add new cases at any time without disrupting
the structure of existing software.  In  contrast,  test  or
case  instructions  are  closed: they specify a fixed set of
choices. They may be needed, however,  in  simple  cases  of
choosing between a set of fixed alternatives.

   For this reason, there was no form of case instruction in
Eiffel until version 2.1. Although it came as a shock to some
users, this absence was not a major problem in actual use of
Eiffel once the inheritance techniques had been understood.
One user remarked that ``not having a case instruction for six
months was excellent for new object-oriented programmers''.

    Once the concepts have been understood, however, there
remain situations in which a simple multi-branch instruction is needed.
These include straightforward discriminations between a fixed set
of cases, where the context does not justify introducing one heir
class per case, and discriminating between ASCII characters on
input. The design reported here and implemented for version 2.2
of Eiffel addresses these cases. The resulting constructs should
be used with reason; great care should be taken not to destroy
the openness of object-oriented structures, which is such a key
factor in the extendibility and the reusability obtained through
proper application of the object-oriented method.

   In designing these constructs, it was strongly felt  that
the extension should remain modest, and should not introduce
any fundamentally new concept. In particular,  no  new  data
type  is  involved; no ``literal'' type is needed in Eiffel.
The extension involves two simple concepts:

   + A new form of constant value, written unique,  making
   it  possible  to  define  constant  attributes  whose
   values  are  chosen  by  the  compiler,  not  by  the
   programmer.

   + A new control  structure,  test...  end,  similar  to
   Pascal's case instruction.


2 Multi-branch choices in current Eiffel
----------------------------------------

In Eiffel  version  2.1,  as  described  in the book  ``Object-Oriented
Software  Construction'',  a  multi-branch  choice  can  be
implemented as an if... then... else... instruction:

   if e = v1 then i1
   elsif e = v2 then i2
   ...
   elsif e = vn then in

This works but has two disadvantages:

   + Performance:  when  the  values  v1,  v2  etc.  are
   consecutive  integer  or  character  values,  the
   implementation cannot easily take  advantage  of  the  well-
   known  constant-time  ``jump  table''  technique;
   instead, tests must be performed sequentially.

   + Convenience: It is often the case that the values v1,
   v2  are  symbolic  constants,  whose  value  is  not
   meaningful as long as all values are different.  This
   is  handled in Pascal or Ada by defining these values
   as leterals. In current Eiffel, the constants must be
   declared  individually  and  their  values  chosen
   explicitly by the programmer, as in

   blue: INTEGER is 1; white: INTEGER is 2; red: INTEGER is 3; ...

   This is tedious.

   The design described below removes  these  limitations.
This  is its sole aim; it should be viewed as a mechanism to
improve the  performance  and  convenience  of  the  current
Eiffel  techniques, NOT AS A NEW TECHNIQUE. No conceptually
new technique is needed here, only existing techniques made
simpler to use.


3 ``Unique'' constants
----------------------

The value  of  a  constant  integer  attribute  may  now  be
declared  not just as a literal value (such as, say, 347) but as
``unique'' (``unique'' is a new keyword). For example:

   blue: INTEGER is unique;
   white: INTEGER is unique;
   red: INTEGER is unique;

   A ``unique'' value is chosen by the compiler; all  attributes
declared as ``unique'' within a class are uaranteed to have
different values, hence the name.

   No further property is  guaranteed  to  hold  of  these
values;  programmers have no control over the policy used by
the compiler to assign  ``unique''  values.  Uniqueness  of
values  is guaranteed within each class; ``unique'' entities
declared in different  classes,  whether  or  not  they  are
related by inheritance, are not guaranteed to have different
values.

   Only integer constants may be declared as ``unique''.

   As  a  notational  facility,  two  or  more  ``unique''
declarations may be grouped, as in

   blue, white, red: INTEGER is unique;

   The meaning of such a combined declaration is the  same
as if the declarations had been written separately as above.



4 The ``test'' instruction
--------------------------

The test instruction is written as follows:

   test
     e
   when v11, v12, ... then
     instructions
   when v21, v22, ... then
     instructions
   ...
   when vn1, vn2, ... then
     instructions
   end

   In this notation: e is an expression of type INTEGER or
CHARACTER;  v11,  v21,  ...,  v21,  etc.,  called  ``test
constants'' in the sequel, are constants of the same type as
e;  instructions  represents  a  sequence  of  zero  or more
instruction separated by semicolons.

   The test constants may be explicit constant  values  or
constant  attributes;  in the latter case they may have been
declared as ``unique''.

   The following rules apply to the set of test constants:

   + If any of these constants are declared as ``unique'',
   then  all  of  them must be so declared; furthermore,
   they must have all been declared IN THE SAME CLASS.
   (This means REALLY in the same class: in particular,
   it is not permitted to have some of them declared in
   a class and some coming from proper ancestors of that class.)

   + If the test constants are not declared as ``unique'',
   that is to say if they have explicit values, then all
   of these values must be different.

   For  test  constants  declared  explicitly  (not
``unique''), a notational facility is offered: a sequence of
consecutive explicit values, such as 3, 4, 5  or  'a',  'b',
'c',  'd'  may be abbreviated as an interval, written in the
form min..max, as in one of the following:

   3..5

   'a'..'d'

The effect of a test instruction of the above  form  is defined as
follows.  The  value  of e is computed. By the above
rules, there can be at most one ``when'' branch such that one of  its
test  values  (after expansion of any intervals) is equal to
the value of e. If this is the case,  the  corresponding  ``then''
clause  is executed and this terminates the execution of the
test instruction.

   If no test value  is  equal  to  the  value  of  e,  an
exception is raised. The symbolic name for the corresponding
exception  code  in  the  library  class  EXCEPTIONS  is
Invalid_test_value.

   It is important to note that the test  instruction  has
no ``else'' clause or equivalent, and that its effect if none of
the test values matches that of the  expression  is  NOT  to
execute  an empty operation but to trigger an exception. The
test instruction (as its Pascal  and  Ada  counterparts)  is
potentially  dangerous  because it reflects the programmer's
expectation that the value of e will be one  of  the  stated
test values.  The least we can expect from the programmer is
that he precisely specify what these expected values are.
If  at  run-time  the expression does not evaluate to any of
these, then this is a serious abnormal case that  should  be
detected immediately, not quietly ignored.

sommar@enea.se (Erland Sommarskog) (03/19/89)

Bertrand Meyer (bertrand@eiffel.UUCP) writes:
>   It is important to note that the test  instruction  has
>no ``else'' clause or equivalent, and that its effect if none of
>the test values matches that of the  expression  is  NOT  to
>execute  an empty operation but to trigger an exception. The
>test instruction (as its Pascal  and  Ada  counterparts)  is
>potentially  dangerous  because it reflects the programmer's
>expectation that the value of e will be one  of  the  stated
>test values.  The least we can expect from the programmer is
>that he precisely specify what these expected values are.
>If  at  run-time  the expression does not evaluate to any of
>these, then this is a serious abnormal case that  should  be
>detected immediately, not quietly ignored.

I fully share the view that if the case selector is not matched,
this should be considered as an error and an exception should be
raised. However, I object to the omission of an OTHERWISE clause
and to to the comparison to Ada and Pascal.

Ada and Pascal first. I don't have any Pascal standard reference
at hand, but the behaviour I expect is to abandon execution if the
CASE selector is out of range, unless an OTHERWISE clause have
been given. (OTHERWISE is not standard Pascal.) A problem may
be if the compiler allows you to turn these checks off. (VAX-
Pascal does.)
  Ada have a different approach. The fragment below is illegal
Ada:
   ch : character;
   ...
   Case ch of   
      when 'A'..'Z' => Text_io.Put_line('Capital letter');
      when 'a'..'z' => Text_io.Put_line('Common letter');
   End case;
Ada requires you to either mention all elements of the selector's
subtype or give an OTHERS clause. The only possibility to get into
a CASE statement in Ada without a match is if the selector is outside 
its subtype due to defaulting from initiating it. This certainly is a 
problem, but hardly not one with CASE statements as such.

Why have an OTHERWISE clause? Say we have a simple interactive 
program where we read one character from the user and then chose
a command. Without OTHERWISE I would have to write: (Pascal syntax,
since I'm lazy.)
    CASE 'ch' OF
      'A', 'a' : Do_a_command;
      'F', 'f' : Do_f_command;
      'R', 'r' : Do_r_command;
      chr(0)..'@',
      'B'..'E',
      'G'..'Q',
      'S'..'`',
      'b'..'e',
      'g'..'q',
      's'..chr(255) : writeln('Illgeal command');
    END;
With an OTHERWISE it would be very cleaner. Not talking on how much
easier it would be to add one more command.
  There are other situations where an OTHERWISE nicely exhibits the
problem behind the logic. At least in Pascal and Ada programs. I
understand that Eiffel will take away some of these situations,
but I doubt that it will remove all of them.
  My suggestion is that the TEST statement is given an optoinal
OTHERWISE clause. If the clause is not present, an exception is
raised if there is no choice matching the selector.
-- 
Erland Sommarskog
ENEA Data, Stockholm              This signature is not to be quoted.
sommar@enea.se

bertrand@eiffel.UUCP (Bertrand Meyer) (03/23/89)

From <4377@enea.se>, sommar@enea.se (Erland Sommarskog):

> I object to [...] the comparison to Ada and Pascal.
> [... Discussion of Pascal and Ada mechanism ..]
    
    I checked my original message and I don't think anything it said
contradicted Mr. Sommarskog's discussion of the Pascal and
Ada mechanisms. My only precise reference to these languages in
the description of the multi-branch choice <122@eiffel.UUCP> was:

>> The test instruction (as its Pascal  and  Ada  counterparts)  is
>> potentially  dangerous  because it reflects the programmer's
>> expectation that the value of e will be one  of  the  stated test values.

    I think this statement is correct; in particular, the Pascal and Ada
techniques described by Mr. Sommarskog in his message do not invalidate it.

    I also considered in detail Mr. Sommarskog's arguments in favor of an
optional ``else'' clause and I think he is right. The mechanism will
be emended accordingly. An exception will still be raised if no ``else''
clause is present and the value of the discriminant matches none of the
values listed in the ``when'' clause. 

    There will also be one minor other change: the keyword for the
multi-branch instruction will be something else than ``test''.
It occurred to me that at least one out of every three new users
is bound to write a first trial class called TEST, implying that
his first encounter with the Eiffel tools would now be some message
of the form ``test is a reserved word and cannot be used as a class
name''. Better choose a less obvious keyword.

    I am grateful to Mr. Sommarskog for his observations.
-- 

-- Bertrand Meyer
bertrand@eiffel.com

day@grand.UUCP (Dave Yost) (03/24/89)

In article <130@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:
>    I also considered in detail Mr. Sommarskog's arguments in favor of an
>optional ``else'' clause and I think he is right. The mechanism will
>be emended accordingly. An exception will still be raised if no ``else''

My hat off to you for being so available, responsive,
candid, and open to suggestion in this public forum,
as in this random example.  Very refreshing.
May you not become *so* swamped with customers that
you can't keep it up.

 --dave yost

db@lfcs.ed.ac.uk (Dave Berry) (03/25/89)

In article <122@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:
>
>   For  test  constants  declared  explicitly  (not
>``unique''), a notational facility is offered: a sequence of
>consecutive explicit values, such as 3, 4, 5  or  'a',  'b',
>'c',  'd'  may be abbreviated as an interval, written in the
>form min..max, as in one of the following:
>
>   3..5
>
>   'a'..'d'

Would it not be possible to extend this notational fcility to unique
constants declared in the same declararion?  To use Bertrand's example:

>   blue, white, red: INTEGER is unique;

If this notation were taken to mean that the unique values given to
blue, white and red must be consecutive, tests could be made against the
ranges  blue..white,  white..red  and  blue..red.  In a small example
like this, it isn't too useful, but it might be useful with a larger case.

Alternatively, since all unique constants in a test must be defined in
the same class, one could define all unique constants defined in the
same class to be consecutive in the order in which they were defined.
Then the above declaration would just be an abbreviation, as before.

A further note on else clauses: If you have user-controlled exceptions,
one can use an else clause to raise an exception suitable to the
particular test statement which fails to match.  This can be more useful
than raising a system exception.

I know Bertrand has accepted Erland's arguments for an else clause;
but I think this use for them is worth mentioning anyway.

(The references I have on eiffel are strongly against exceptions,
so I presume they're out of date!  I'm currently trying to get hold of
the eiffel book; I apologise if I've made any inaccurate assumptions
in the above.)

Dave Berry,	Laboratory for Foundations of Computer Science, Edinburgh.
		db%lfcs.ed.ac.uk@nss.cs.ucl.ac.uk
		<Atlantic Ocean>!mcvax!ukc!lfcs!db

sommar@enea.se (Erland Sommarskog) (03/25/89)

 Bertrand Meyer (bertrand@eiffel.UUCP) writes:
 >I said:
 >> I object to [...] the comparison to Ada and Pascal.
 >    
 >    I checked my original message and I don't think anything it said
 >contradicted Mr. Sommarskog's discussion of the Pascal and
 >Ada mechanisms. 
 
 Yes, that's perfectly correct. I misread Bertrand Meyer's article
 to say something it didn't say. My excuses for the confusion.
-- 
Erland Sommarskog
ENEA Data, Stockholm              This signature is not to be quoted.
sommar@enea.se

bertrand@eiffel.UUCP (Bertrand Meyer) (03/26/89)

From <1638@etive.ed.ac.uk>, db@lfcs.ed.ac.uk (Dave Berry):

> Would it not be possible to extend the notational facility
> [for intervals, e.g. 3..5]
> to constants declared in the same declaration? [If]
> 
>   blue, white, red: INTEGER is unique;
> 
> were taken to mean that the unique values given to
> blue, white and red must be consecutive, tests could be made against
> the ranges  blue..white,  white..red  and  blue..red.

	The above is defined formally as a syntactical abbreviation for

   blue: INTEGER is unique;
   white: INTEGER is unique;
   red: INTEGER is unique

	In Eiffel, the order of declarations of features within a
class is immaterial. I would hesitate to make an exception to this rule.

	I have always been uneasy with the ordering that exists between
the values of an enumeration type in Pascal, and in particular with
the predefined ``ord'' function. I believe it is preferable to think
of enumerated values in Pascal, or ``unique'' values in Eiffel, as
elements of a set rather than of a sequence.

	This is largely a matter of taste. I am not claiming that there
is just one correct answer.

> The references I have on Eiffel are strongly against exceptions,
> so I presume they're out of date!

	They must be quite old indeed. I was so shocked by the Ada
exception mechanism that for a while I was dead set against exceptions.
After looking more carefully at assertions, however, I realized
that there is a need for a disciplined exception mechanism, based
on program correctness techniques and on an analysis of what
``abnormal case'' means from a theoretical viewpoint. This mechanism
was incorporated into Eiffel. In my biased view it is one of Eiffel's
main contributions to software engineering.

	The methodological rationale and the mechanism itself are described
in detail in an article entitled ``Programming as Contracting'',
which has been sitting for about two centuries in the refereeing
process of a journal. Various bits of the article found their way
into section 7.10 of ``Object-Oriented Software Construction'',
into a recent paper in Springer-Verlag's ``Structured Programming''
journal (Vol. 10, no. 1, pp. 19-39), and into the next installment of
the Eiffel column in the Journal of Object-Oriented Programming,
due to appear (I think) in May.
-- 

-- Bertrand Meyer
bertrand@eiffel.com