[comp.lang.eiffel] Reneging on promises

bertrand@eiffel.UUCP (Bertrand Meyer) (12/28/90)

In preparing the published version of ``Eiffel: The Language'',
describing version 3 of the language, I have had to renege
on two features that, in the debates of a few months ago,
I said would be included. (I am afraid both were originally suggested
by the same person, Erland Sommarskog from ENEA Data in Stockholm.)

One is the support for international character sets.
On closer consideration, I just gave up. This is too much
for one person. After all, I must leave some fun to the
language committee of the Eiffel Consortium. Besides,
not having enough background and knowledge in this field,
I was almost sure to do it wrong.

I apologize to those Eiffel users using non-English
keyboards who will continue to see an A with umlaut or
something of the sort for every backslash, square bracket etc.,
but Eiffel is really not responsible for the mess that has accumulated
in this field over the years. Someone (standardization committees,
computer vendors?) obviously did not do his job.

In our own implementation, we will try to include tools which
will alleviate the problem.
A facility which might provide a temporarily satisfactory
solution would be the ability to specify for compiling
commands (such as `es' or `ec') a parameter
describing an input filter command, to be applied automatically
to all classes, which translates from a local character set
into ``core'' ASCII, and an associated output filter command
which will translate back, and will be applied to error messages
to ensure that they refer to the original identifiers.

Again, this will likely not be the final word, but someone
else, preferably a committee, will have to take care of this.
I'll be glad to help.

The other broken promise involves the facility suggested by
Mr. Sommarskog to simplify relational expressions, by allowing
an expression of the form

[1]
    a < b < c <= d

as an abbreviation for

[2]
    (a < b) and then (b < c) and then (c <= d)


This looked attractive at first. However such a facility breaks
the fundamental simplicity of Eiffel, where, in principle, *every*
operation of interest is a feature call; infix or prefix notation is
just a convenient syntactical convention. In other words, a < b is
conceptually the same as what would be written, using the standard ``dot''
notation, under the form

    a.less_than (b)

Adding the ``and then'' equivalence for a special set of operators
(the predefined relational ones) would force a violation
of this convincing and universal principle. (I am grateful to
Kim Walden, also from ENEA Data, for bringing this point home
to me most persuasively.)

On closer look, then, the suggested facility does not seem worth its
while, and it does not appear so bad to have to write [2] (which, for
assertions, will use semicolon rather than ``and then'').  
-- 
-- Bertrand Meyer
bertrand@eiffel.com

sommar@enea.se (Erland Sommarskog) (12/30/90)

Bertrand Meyer (bertrand@eiffel.com) writes:
>I apologize to those Eiffel users using non-English
>keyboards who will continue to see an A with umlaut or
>something of the sort for every backslash, square bracket etc.,
>but Eiffel is really not responsible for the mess that has accumulated
>in this field over the years. Someone (standardization committees,
>computer vendors?) obviously did not do his job.

This is actually not correct. Eiffel is responsible. ISO 646
clearly specifies that the characters #$@[\]^`{|}~ are subject to
replacement in national character sets. Furthermore ISO 646 also
specifies what the national varieties look like. One of them says
that the string above should appear as number-dollar-at-l.bracket-
backslash-r.bracket-circumflex-grave-l.brace-bar-r.brace-tilde and
that is the American one, also known as ASCII. I don't know the
history, but I think that ASCII preceded ISO 646, i.e. 646 is a
development of ASCII. I would also assume that C and Pascal
are older than ISO 646, which gives them some right to violate
the rules. I can fully understand that Dr. Meyer just followed
their path, but ISO 646 was established when he designed Eiffel
so his excuse can only be ignorance. (Which I certainly can
understand. ISO hasn't really managed to get 646 into every
programmer's mind, and often you hear expressions such as "Swedish
ASCII" which tells you which is the de facto standard.)
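
To make the replacement concrete, here is a minimal sketch in Python (not Eiffel, and not part of the original posting) of how one and the same 7-bit text is shown on a US ASCII terminal and on a Swedish ISO 646 terminal. The table covers only the six positions that the Swedish variant assigns to letters; the names SWEDISH_646 and render_swedish, and the sample source line, are made up for the illustration.

```python
# Six code positions that the Swedish national variant of ISO 646
# assigns to letters instead of the ASCII punctuation.
# (Subset only; other national variants replace other positions.)
SWEDISH_646 = str.maketrans({
    "[": "Ä", "\\": "Ö", "]": "Å",
    "{": "ä", "|": "ö", "}": "å",
})


def render_swedish(source: str) -> str:
    """Show 7-bit text the way a Swedish ISO 646 terminal displays it."""
    return source.translate(SWEDISH_646)


# A made-up source line containing brackets and a backslash escape.
line = 'a [i] := "x\\n"'
print(line)                  # as seen on a US ASCII terminal
print(render_swedish(line))  # the same codes on a Swedish terminal
```

The stored codes never change; only the glyph shown for each code differs, which is why a bracket turns into a letter on the national terminal.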

I can only complain that Eiffel will not include improvements
in these areas. As I recall, brackets and braces would have
simple parentheses as alternatives, a solution which is both
simple and fully acceptable, and I would encourage Dr. Meyer
to once again change his opinion on this point. This is a
trivial thing and the Eiffel Consortium should have more
important matters to discuss than lexical elements. :-)

As for the backslash as an escape character in string and
character literals things are admittedly more complex. Dr.
Meyer suggested the exclamation mark and a complete abandoning
of the backslash which would require conversion of old source
code. (Which will be necessary anyway, due to the change in
the create mechanism.) "!" is much better than the backslash,
but I just think an escape character isn't necessary at all. 
Ada and Pascal don't need it. Why should Eiffel?

Another issue was the use of eight-bit and multi-byte characters
in identifiers. This one is less urgent, and I don't mind leaving
that to the Eiffel consortium. It might be wise to wait and see what other
languages will do in this area. Ada9X will include something along
these lines. (Actually, all languages that want an ISO stamp need
to adapt to accept at least Latin-1.) After all, if different languages
behave similarly in areas such as identifier characters, this is
a nice standardization for people who work in a multi-language
environment.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?

sommar@enea.se (Erland Sommarskog) (12/31/90)

In a private message Bertrand Meyer gave me a somewhat more
detailed explanation of why he decided not to introduce the
feature I proposed. Applying the view that in Eiffel an infix
operator is just a syntactic way to write a feature call,
a < b < c becomes

   a.less_than(b).less_than(c)

which normally makes no sense. My proposal would introduce
a special case so that the interpretation would be

   a.less_than(b).and(b.less_than(c))
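
The difference between the two readings can be tried out in Python, which happens to define chained comparisons in the special-case way. This is only an illustration in another language, not Eiffel, and the values of a, b and c are arbitrary.

```python
a, b, c = 5, 3, 4

# Plain left-to-right nesting, as in a.less_than(b).less_than(c):
# the boolean result of a < b is itself compared with c.
nested = (a < b) < c          # False < 4, and False counts as 0 in Python

# The special-case reading: each operand is compared with its neighbour
# and the results are joined with a short-circuit "and".
expanded = (a < b) and (b < c)

# Python's own chained form follows the second reading.
chained = a < b < c

print(nested, expanded, chained)   # prints: True False False
```

The nested form type-checks in Python only because booleans are integers there; in a language where the result of a comparison has no less_than of its own, it would simply be rejected.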

Kim Walden persuaded Bertrand Meyer that this special case
is against the "cleanness" of Eiffel and not worth the win.
I am inclined to agree. On the other hand if Algol-60(*) had
had this feature, and consequently most other programming
languages, Eiffel would almost certainly have been forced to have
a special case here. Recall that the view on infix operators as 
feature calls was introduced first with Eiffel 2.2.

My original idea was mainly a whim, based on an old idea and
inspired by a then current discussion in comp.lang.misc. So
from that point of view I'm not too depressed that Eiffel will not
introduce the little novelty. (Although it would be fun when
I get old to tell my grandchildren, look that was *my* idea.
Well, I wrote a revision request for Ada9x on the same lines,
so there is still a possibility. Fat chance.)

However, since then I have come to realise that this capability
is more important to Eiffel than to most other languages, since
Eiffel doesn't have subrange types. The major application for this
shortcut form is of course in assertions. In Ada you would write:

      SUBTYPE Index_type IS integer RANGE a..c;
      ...
      PROCEDURE Something(b : Index_type) IS

and the compiler would do the rest for you. In Eiffel you would
maybe write:

      Something (b : integer) IS
      REQUIRE a <= b; b <= c

Just look through ISE's libraries and you find plenty of assertions
of this kind. Not only is there an extra chance to mess it up,
because you mention b twice, but it also takes longer to read
the two assertions, which basically are one.

Allowing a <= b <= c would be one solution, another would be to
introduce a range operator so you could write

      REQUIRE b IN a..c

This is of course less general than my original proposal, but things
like

     a < b <= c > d = e

are less useful in practice. It may still be desirable though to
have alternatives with open ends of the intervals, that is with
< instead of <=.

One way to accomplish this would be to add one or more new features
to the class COMPARABLE, changing the assertion above to:

      REQUIRE b.in(a, c)

Of course I can write the class MY_COMPARABLE to do that, but it
would indeed be nice if it was part of the kernel library.
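
As a rough sketch of what such features could look like, here is the idea in Python rather than Eiffel. Everything in it is hypothetical: the class names, the feature names in_range and strictly_between (Python reserves the word "in"), and the bounds 1 and 10 are assumptions made for the illustration, not anything from a kernel library.

```python
class Comparable:
    """Hypothetical stand-in for a COMPARABLE-like class.

    Descendants supply __lt__ (a total order); the range tests are inherited.
    """

    def in_range(self, low, high):
        # Closed interval: low <= self <= high.
        return not (self < low) and not (high < self)

    def strictly_between(self, low, high):
        # Open interval: low < self < high.
        return low < self and self < high


class Index(Comparable):
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value < other.value

    def __eq__(self, other):
        return self.value == other.value


def something(b, a=Index(1), c=Index(10)):
    # Plays the role of: REQUIRE b.in(a, c)
    assert b.in_range(a, c), "precondition violated: b not in a..c"
    return b.value
```

The caller mentions b only once, and the open-ended variant covers the case with < instead of <=.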

(*) I owe this idea to Peter da Silva who once answered my question
why so few languages permit a < b < c with "Inheritance. Algol-60
didn't have it."
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?

bertrand@eiffel.UUCP (Bertrand Meyer) (12/31/90)

From <2352@enea.se> by sommar@enea.se (Erland Sommarskog):


> One way to accomplish this would be to add one or more new features
> to the class COMPARABLE, changing the assertion above to:
> 
>       REQUIRE b.in(a, c)


This looks like a very good idea.
-- 
-- Bertrand Meyer
bertrand@eiffel.com

bertrand@eiffel.UUCP (Bertrand Meyer) (12/31/90)

From <2347@enea.se> by sommar@enea.se (Erland Sommarskog):

> Eiffel is responsible. ISO 646
> clearly specifies that the characters #$@[\]^`{|}~ are subject to
> replacement in national character sets. Furthermore ISO 646 also
> specifies what the national varieties look like. [...]
> ISO 646 was established when [B. Meyer] designed Eiffel
> so his excuse can only be ignorance. (Which I certainly can
> understand. ISO hasn't really managed to get 646 into every
> programmer's mind, and often you hear expressions such as "Swedish
> ASCII" which tells you which is the de facto standard.)

The last point is definitely correct. The only standard that
matters is the keyboard that sits on programmers' desks.

Eiffel's lexical structure was designed under the assumption
that that keyboard supports upper- and lower-case letters,
digits, plus, minus, divide, asterisk, less than, greater than,
circumflex, period, comma, colon, semicolon, parentheses,
quote, double quote, square brackets, braces, at sign, exclamation
mark, plus two or three others.

This is a very reasonable assumption, and one that is not
very restrictive; in particular, it does not say anything
about where the characters are located on the keyboard,
and what other characters (such as accented letters) are
available beyond the basic ones. I maintain that if hardware
manufacturers and standardization bodies had done their
most elementary job this assumption would be trivially satisfied
in 1990/1991.

Unfortunately, as demonstrated by previous postings by
Mr. Sommarskog (and clear to anyone who has ever used a
non-English keyboard) this assumption is not satisfied.
I am sure ISO 646 is a respectable work but an international
standard does not mean anything if the entire state of the
industry contradicts it.

Eiffel can and will try not to make it too hard on users
having non-English keyboards. For example, it may permit
parentheses to be used instead of brackets or braces,
and avoid giving undue importance to characters such as
the backslash which have poor equivalents on certain
keyboards. But it cannot be expected to correct
single-handedly the mess accumulated over forty years
of hardware evolution.

At least this is the least bad answer that I can
offer now. Which brings up the next point:

> I can only complain that Eiffel will not include improvements
> in these areas. As I recall, brackets and braces would have
> simple parenthesis as alternatives, a solution which is both
> simple and fully acceptable, and I would encourage Dr. Meyer
> to once again change his opinion on this point. This is a
> trivial thing and the Eiffel Consortium should have more
> important matters to discuss than lexical elements. :-)

Permitting parentheses seems feasible (see above).
With respect to the last sentence, to the extent
that it is meant seriously, it appears contradictory
with the rest of Mr. Sommarskog's message and his
earlier ones:

1. Obviously there would not be such a discussion if the
matter was not ``important''. It is certainly important
enough for the language committee of NICE.

2. Since I am (justifiably) accused of ``ignorance'' of
this matter, isn't it preferable for the future of Eiffel
to make less ignorant people take care of its final
resolution?

In fact, a strong argument can be made that,
by the very nature of this issue, it may be best to entrust
its resolution to a committee including
people from diverse backgrounds and countries, rather than
to an individual, even with the help of the net.
-- 
-- Bertrand Meyer
bertrand@eiffel.com

amanda@iesd.auc.dk (Per Abrahamsen) (01/06/91)

Followups have been directed to comp.std.internat.

>>>>> On 29 Dec 90 22:05:16 GMT, sommar@enea.se (Erland Sommarskog) said:

Erland> This is actually not correct. Eiffel is responsible. ISO 646
Erland> clearly specifies that the characters #$@[\]^`{|}~ are subject to
Erland> replacement in national character sets.

It is probably wise to wait.  The ANSI C committee did not understand
the issues, and came up with a solution (trigraphs) which nobody
likes. 

Erland> ISO hasn't really managed to get 646 into every
Erland> programmer's mind, and often you hear expressions such as "Swedish
Erland> ASCII" which tells you which is the de facto standard.

This is another good reason to wait.  ISO 646 was created at a time
when ASCII was already the de-facto standard.  It was not backward
compatible with ASCII, and therefore broke almost any application
which used the ASCII character set.  It is no wonder that programmers
have been hesitant to accept ISO 646.

A new standard (ISO Latin1) has been created, which conforms to the
ASCII standard, and is quickly being accepted.  The best thing to
do might be to ignore ISO 646, and switch to ISO Latin1 as fast as
possible.

sommar@enea.se (Erland Sommarskog) (01/08/91)

Also sprach Bertrand Meyer (bertrand@eiffel.UUCP):
>The last point is definitely correct. The only standard that
>matters is the keyboard that sits on programmers' desks.

Different programmers have different keyboards. So do different
programmers have different standards?

>Eiffel's lexical structure was designed under the assumption
>that that keyboard supports upper- and lower-case letters,
>digits, plus, minus, divide, asterisk, less than, greater than,
>circumflex, period, comma, colon, semicolon, parentheses,
>quote, double quote, square brackets, braces, at sign, exclamation
>mark, plus two or three others.
>
>This is a very reasonable assumption, and one that is not
>very restrictive;

I have to beg to differ. The assumption is only reasonable if
we believe that keyboards and terminals are the same everywhere.
But we all know that other languages use characters beyond those
26 that make up the English alphabet. (Even English does!) Why
believe that keyboards in other countries would just be bigger
to include the needed national characters? For a long time seven-bit
codes were used for communication and data storage, so why believe that
all characters in the ASCII set would survive national replacements?

Suppose I were to use Bertrand Meyer's reasoning and define a language
which included ][\ as identifier characters and used }{| as their
lower-case equivalents. Many people would be confused, and probably hate
me as well. No, it is not a good idea to look at your keyboard
and think that it defines the world.

>I maintain that if hardware
>manufacturers and standardization bodies had done their
>most elementary job this assumption would be trivially satisfied
>in 1990/1991.
>...
>I am sure ISO 646 is a respectable work but an international
>standard does not mean anything if the entire state of the
>industry contradicts it.

ISO646 is a well-supported standard. For instance, I am writing
this message on a VT200-compatible. I can change the keyboard
to be any of North American, British, German, French, Danish,
Norwegian, Swedish, Swiss-German or Swiss-French in either
eight- or seven-bit mode. As I change the mode, different
characters are also displayed on my screen for those that are
subject to change according to ISO646, i.e. $#@[\]^`{|}~.
I would expect that if I get a different hardware configuration
I get another set of choices. VT200 was introduced in 1984, but
was far from first in its field. I have an old manual for HP2621A
which reveals the same capabilities, although you have to replace
the character generator to change the screen, and of course eight-bit
support is missing. I don't know how many printers I have encountered that
have been able to display various characters according to ISO646
with the help of escape sequences to change between national varieties.

I wouldn't mention ISO646 if it was a paper tiger.

>But it (Eiffel) cannot be expected to correct
>single-handedly the mess accumulated over forty years
>of hardware evolution.

At this point Bertrand Meyer is starting to sound like Richard
Stallman defending Emacs' use of CTRL/S and CTRL/Q contrary to
all sensible standards. It is not a mess. It is a standard.
It may not be the standard you want it to be. The only time
you get a mess is when you ignore the standard.

>In fact, a strong argument can be made that,
>by the very nature of this issue, it may be best to entrust
>its resolution to a committee including
>people from diverse backgrounds and countries, rather than
>to an individual, even with the help of the net.

Certainly. Many people frown upon language committees and think
that a language should be designed by an individual. I am
not so sure about this. Is it any wonder Ada does not use
any of the untouchable characters? (Well, it does use "|", but
"!" is permitted as an alternative. It also uses "#", but since
this character is never replaced by a letter it is less
sensitive. There is an alternative anyway, ":" I think.)
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?