bertrand@eiffel.UUCP (Bertrand Meyer) (12/28/90)
In preparing the published version of ``Eiffel: The Language'',
describing version 3 of the language, I have had to renege on two
features that, in the debates of a few months ago, I said would be
included. (I am afraid both were originally suggested by the same
person, Erland Sommarskog from ENEA Data in Stockholm.)

One is the support for international character sets. On closer
consideration, I just gave up. This is too much for one person.
After all, I must leave some fun to the language committee of the
Eiffel Consortium. Besides, not having enough background and
knowledge in this field, I was almost sure to do it wrong.

I apologize to those Eiffel users using non-English keyboards who
will continue to see an A with umlaut or something of the sort for
every backslash, square bracket etc., but Eiffel is really not
responsible for the mess that has accumulated in this field over the
years. Someone (standardization committees, computer vendors?)
obviously did not do his job.

In our own implementation, we will try to include tools which will
alleviate the problem. A facility which might provide a temporarily
satisfactory solution would be the ability to specify for compiling
commands (such as `es' or `ec') a parameter describing an input
filter command, to be applied automatically to all classes, which
translates from a local character set into ``core'' ASCII, and an
associated output filter command which will translate back, and will
be applied to error messages to ensure that they refer to the
original identifiers.

Again, this will likely not be the final word, but someone else,
preferably a committee, will have to take care of this. I'll be glad
to help.

The other broken promise involves the facility suggested by
Mr. Sommarskog to simplify relational expressions, by allowing an
expression of the form

    [1]    a < b < c <= d

as an abbreviation for

    [2]    (a < b) and then (b < c) and then (c <= d)

This looked attractive at first. However, such a facility breaks the
fundamental simplicity of Eiffel, where, in principle, *every*
operation of interest is a feature call; infix or prefix notation is
just a convenient syntactical convention. In other words,

    a < b

is conceptually the same as what would be written, using the
standard ``dot'' notation, under the form

    a.less_than (b)

Adding the ``and then'' equivalence for a special set of operators
(the predefined relational ones) would force a violation of this
convincing and universal principle. (I am grateful to Kim Walden,
also from ENEA Data, for bringing this point home to me most
persuasively.)

On closer look, then, the suggested facility does not seem worth its
while, and it does not appear so bad to have to write [2] (which,
for assertions, will use semicolon rather than ``and then'').
--
-- Bertrand Meyer
bertrand@eiffel.com
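
The two readings of a < b < c at stake here can be contrasted in a
short sketch. It is written in Python rather than Eiffel, purely for
illustration, and the function names `chained` and `as_feature_calls`
are hypothetical. Python's `and` stops at the first false operand,
much as ``and then'' does. Two caveats: Python happens to build the
proposed special case into the language, and it allows a boolean to
be compared with an integer, which Eiffel's type rules would
presumably reject.

```python
def chained(a, b, c, d):
    """Expression [2]: the proposed meaning of a < b < c <= d."""
    # `and` stops at the first false operand, like ``and then''.
    return (a < b) and (b < c) and (c <= d)


def as_feature_calls(a, b, c):
    """a < b < c with each `<` as an ordinary call, left to right."""
    # (a < b) is a boolean, so the second `<` compares a boolean with c.
    return (a < b) < c


if __name__ == "__main__":
    print(chained(1, 2, 3, 3))        # all three comparisons hold
    print(chained(3, 2, 1, 1))        # fails at the first comparison
    print(as_feature_calls(3, 2, 1))  # boolean compared with 1
    print(3 < 2 < 1)                  # Python's built-in chained form
```

With a = 3, b = 2, c = 1 the two readings disagree, which is the
special case the proposal would have had to introduce.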
sommar@enea.se (Erland Sommarskog) (12/30/90)
Bertrand Meyer (bertrand@eiffel.com) writes:
>I apologize to those Eiffel users using non-English
>keyboards who will continue to see an A with umlaut or
>something of the sort for every backslash, square bracket etc.,
>but Eiffel is really not responsible for the mess that has accumulated
>in this field over the years. Someone (standardization committees,
>computer vendors?) obviously did not do his job.

This is actually not correct. Eiffel is responsible. ISO 646 clearly
specifies that the characters #$@[\]^`{|}~ are subject to
replacement in national character sets. Furthermore, ISO 646 also
specifies what the national varieties look like. One of them says
that the string above should appear as

    number-dollar-at-l.bracket-backslash-r.bracket-circumflex-grave-l.brace-bar-r.brace-tilde

and that is the American one, also known as ASCII.

I don't know the history, but I think that ASCII preceded ISO 646,
i.e. 646 is a development of ASCII. I would also assume that C and
Pascal are older than ISO 646, which gives them some right to
violate the rules. I can fully understand that Dr. Meyer just
followed their path, but ISO 646 was established when he designed
Eiffel, so his excuse can only be ignorance. (Which I certainly can
understand. ISO hasn't really managed to get 646 into every
programmer's mind, and often you hear expressions such as "Swedish
ASCII", which tells you which is the de facto standard.)

I can only complain that Eiffel will not include improvements in
these areas. As I recall, brackets and braces would have simple
parentheses as alternatives, a solution which is both simple and
fully acceptable, and I would encourage Dr. Meyer to once again
change his opinion on this point. This is a trivial thing and the
Eiffel Consortium should have more important matters to discuss
than lexical elements. :-)

As for the backslash as an escape character in string and character
literals, things are admittedly more complex. Dr. Meyer suggested
the exclamation mark and a complete abandoning of the backslash,
which would require conversion of old source code. (Which will be
necessary anyway, due to the change in the create mechanism.) "!" is
much better than the backslash, but I just think an escape character
isn't necessary at all. Ada and Pascal don't need it. Why should
Eiffel?

Another issue was the use of eight-bit and multi-byte characters in
identifiers. This one is less urgent, and I don't mind leaving that
to the Eiffel Consortium. It might be wise to wait and see what
other languages will do in this area. Ada9X will include something
along these lines. (Actually, all languages that want an ISO stamp
need to adapt to accept at least Latin-1.) After all, if different
languages behave similarly in areas such as identifier characters,
this is a nice standardization for people who work in a
multi-language environment.
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?
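
To make the replacement rule concrete, the sketch below shows how one
line of 7-bit text would read on a terminal set to the Swedish
national variety. Python is used for illustration only. The table is
partial and is an assumption based on common knowledge of the Swedish
variety, not on anything stated in this thread: it covers only the
six positions that ASCII shows as [ \ ] { | }. The name
`as_seen_in_swedish_variety` is hypothetical.

```python
# Partial table (assumed): code positions that ASCII displays as
# [ \ ] { | } are displayed as letters in the Swedish variety.
SWEDISH_VARIETY = str.maketrans("[\\]{|}", "ÄÖÅäöå")


def as_seen_in_swedish_variety(text):
    """Return text as it would be displayed, byte for byte unchanged."""
    return text.translate(SWEDISH_VARIETY)


if __name__ == "__main__":
    # The stored bytes are identical; only the glyphs differ.
    print(as_seen_in_swedish_variety("a [i] := {x | y}"))
```

The point of the sketch is that nothing in the file changes: the same
codes are simply assigned to different characters.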
sommar@enea.se (Erland Sommarskog) (12/31/90)
In a private message Bertrand Meyer gave me a somewhat more detailed
explanation of why he decided not to introduce the feature I
proposed. Applying the view that in Eiffel an infix operator is just
a syntactic way to write a feature call, a < b < c becomes

    a.less_than(b).less_than(c)

which normally makes no sense. My proposal would introduce a special
case so that the interpretation would be

    a.less_than(b).and(b.less_than(c))

Kim Walden persuaded Bertrand Meyer that this special case is against
the "cleanness" of Eiffel and not worth the win. I am inclined to
agree. On the other hand, if Algol-60(*) had had this feature, and
consequently most other programming languages, Eiffel would almost
certainly have been forced to have a special case here. Recall that
the view of infix operators as feature calls was first introduced
with Eiffel 2.2.

My original idea was mainly a whim, based on an old idea and
inspired by a then current discussion in comp.lang.misc. So from
that point of view I'm not too depressed that Eiffel will not
introduce the little novelty. (Although it would be fun when I get
old to tell my grandchildren, look that was *my* idea. Well, I wrote
a revision request for Ada9x on the same lines, so there is still a
possibility. Fat chance.)

However, since then I have come to realise that this capability is
more important to Eiffel than to most other languages, since Eiffel
doesn't have subrange types. The major application for this shortcut
form is of course in assertions. In Ada you would write:

    SUBTYPE Index_type IS integer RANGE a..c;
    ...
    PROCEDURE Something(b : Index_type) IS

and the compiler would do the rest for you. In Eiffel you would
maybe write:

    Something (b : integer) IS
       REQUIRE
          a <= b;
          b <= c

Just look through ISE's libraries and you find plenty of assertions
of this kind. Not only is there an extra chance to mess it up,
because you mention b twice, but it also takes longer to read the
two assertions, which basically are one.

Allowing a <= b <= c would be one solution; another would be to
introduce a range operator so you could write

    REQUIRE
       b IN a..c

This is of course less general than my original proposal, but
things like a < b <= c > d = e are less useful in practice. It may
still be desirable though to have alternatives with open ends of the
intervals, that is with < instead of <=.

One way to accomplish this would be to add one or more new features
to the class COMPARABLE, changing the assertion above to:

    REQUIRE b.in(a, c)

Of course I can write the class MY_COMPARABLE to do that, but it
would indeed be nice if it was part of the kernel library.

(*) I owe this idea to Peter da Silva, who once answered my question
why so few languages permit a < b < c with "Inheritance. Algol-60
didn't have it."
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?
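
The `in` feature suggested above could look roughly like the
following sketch. It is written in Python rather than Eiffel, for
illustration only. The names `Comparable`, `is_in`, `open_low`,
`open_high` and `Index` are all hypothetical (Python reserves the
word `in`), and the two flags are just one possible way to obtain
the variants with < instead of <=.

```python
from functools import total_ordering


class Comparable:
    """Mixin for ordered values; stands in for a COMPARABLE-like class."""

    def is_in(self, low, high, open_low=False, open_high=False):
        """True if self lies between low and high.

        Both ends are included unless the matching open_* flag is set.
        """
        above = (low < self) if open_low else (low <= self)
        below = (self < high) if open_high else (self <= high)
        return above and below


@total_ordering  # derives <=, >, >= from __eq__ and __lt__
class Index(Comparable):
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return self.value < other.value


def something(b, a=Index(1), c=Index(10)):
    # Precondition written once, mentioning b a single time.
    assert b.is_in(a, c), "require: a <= b <= c"
    return b.value
```

A call such as `something(Index(5))` passes the check, while
`something(Index(11))` raises an AssertionError, playing the role of
a violated precondition.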
bertrand@eiffel.UUCP (Bertrand Meyer) (12/31/90)
From <2352@enea.se> by sommar@enea.se (Erland Sommarskog):

> One way to accomplish this would be to add one or more new features
> to the class COMPARABLE, changing the assertion above to:
>
>     REQUIRE b.in(a, c)

This looks like a very good idea.
--
-- Bertrand Meyer
bertrand@eiffel.com
bertrand@eiffel.UUCP (Bertrand Meyer) (12/31/90)
From <2347@enea.se> by sommar@enea.se (Erland Sommarskog):

> Eiffel is responsible. ISO 646
> clearly specifies that the characters #$@[\]^`{|}~ are subject to
> replacement in national character sets. Furthermore ISO 646 also
> specifies what the national varieties look like. [...]
> ISO 646 was established when [B. Meyer] designed Eiffel
> so his excuse can only be ignorance. (Which I certainly can
> understand. ISO hasn't really managed to get 646 into every
> programmer's mind, and often you hear expressions such as "Swedish
> ASCII" which tells you which is the de facto standard.)

The last point is definitely correct. The only standard that
matters is the keyboard that sits on programmers' desks.

Eiffel's lexical structure was designed under the assumption that
that keyboard supports upper- and lower-case letters, digits, plus,
minus, divide, asterisk, less than, greater than, circumflex,
period, comma, colon, semicolon, parentheses, quote, double quote,
square brackets, braces, at sign, exclamation mark, plus two or
three others.

This is a very reasonable assumption, and one that is not very
restrictive; in particular, it does not say anything about where the
characters are located on the keyboard, and what other characters
(such as accented letters) are available beyond the basic ones. I
maintain that if hardware manufacturers and standardization bodies
had done their most elementary job, this assumption would be
trivially satisfied in 1990/1991.

Unfortunately, as demonstrated by previous postings by
Mr. Sommarskog (and clear to anyone who has ever used a non-English
keyboard), this assumption is not satisfied.

I am sure ISO 646 is a respectable work, but an international
standard does not mean anything if the entire state of the industry
contradicts it.

Eiffel can and will try not to make it too hard on users having
non-English keyboards. For example, it may permit parentheses to be
used instead of brackets or braces, and avoid giving undue
importance to characters such as the backslash which have poor
equivalents on certain keyboards. But it cannot be expected to
correct single-handedly the mess accumulated over forty years of
hardware evolution.

At least this is the least bad answer that I can offer now. Which
brings up the next point:

> I can only complain that Eiffel will not include improvements
> in these areas. As I recall, brackets and braces would have
> simple parentheses as alternatives, a solution which is both
> simple and fully acceptable, and I would encourage Dr. Meyer
> to once again change his opinion on this point. This is a
> trivial thing and the Eiffel Consortium should have more
> important matters to discuss than lexical elements. :-)

Permitting parentheses seems feasible (see above).

With respect to the last sentence, to the extent that it is meant
seriously, it appears contradictory with the rest of
Mr. Sommarskog's message and his earlier ones:

1. Obviously there would not be such a discussion if the matter was
   not ``important''. It is certainly important enough for the
   language committee of NICE.

2. Since I am (justifiably) accused of ``ignorance'' of this matter,
   isn't it preferable for the future of Eiffel to make less
   ignorant people take care of its final resolution?

In fact, a strong argument can be made that, by the very nature of
this issue, it may be best to entrust its resolution to a committee
including people from diverse backgrounds and countries, rather than
to an individual, even with the help of the net.
--
-- Bertrand Meyer
bertrand@eiffel.com
amanda@iesd.auc.dk (Per Abrahamsen) (01/06/91)
Followups have been directed to comp.std.internat.
>>>>> On 29 Dec 90 22:05:16 GMT, sommar@enea.se (Erland Sommarskog) said:
Erland> This is actually not correct. Eiffel is responsible. ISO 646
Erland> clearly specifies that the characters #$@[\]^`{|}~ are subject to
Erland> replacement in national character sets.
It is probably wise to wait. The ANSI C committee did not understand
the issues, and came up with a solution (trigraphs) which nobody
likes.
Erland> ISO hasn't really managed to get 646 into every
Erland> programmers' mind, and often you hear expressions as "Swedish
Erland> ASCII" which tells you which is the de facto standard.
This is another good reason to wait. ISO 646 was created at a time
when ASCII already was the de-facto standard. It was not backward
compatible with ASCII, and therefore broke almost any application
which used the ASCII character set. It is no wonder that programmers
have been hesitant in accepting ISO 646.
A new standard (ISO Latin1) has been created, which conforms to the
ASCII standard, and is quickly being accepted. The best thing to
do might be to ignore ISO 646, and switch to ISO Latin1 as fast as
possible.
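
The compatibility point can be checked in a few lines. This is a
sketch in Python, used here only for illustration; it relies on
Python's standard `ascii` and `latin-1` codecs, and the variable
names are arbitrary. The six Swedish letters are chosen only as an
example of national characters.

```python
# The characters at issue, encoded as plain 7-bit ASCII.
ascii_bytes = "[\\]{|}".encode("ascii")

# Every ASCII byte decodes to the same character under Latin-1 ...
same = ascii_bytes.decode("latin-1") == "[\\]{|}"

# ... while national letters get codes above the 7-bit range, so
# they no longer have to displace brackets, braces or the backslash.
national = "ÄÖÅäöå".encode("latin-1")
all_above_7bit = all(byte > 0x7F for byte in national)

if __name__ == "__main__":
    print(same, all_above_7bit)
```

The price is the eighth bit: the scheme only helps where storage,
terminals and communication lines are eight-bit clean.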
sommar@enea.se (Erland Sommarskog) (01/08/91)
Also sprach Bertrand Meyer (bertrand@eiffel.UUCP):
>The last point is definitely correct. The only standard that
>matters is the keyboard that sits on programmers' desks.

Different programmers have different keyboards. So do different
programmers have different standards?

>Eiffel's lexical structure was designed under the assumption
>that that keyboard supports upper- and lower-case letters,
>digits, plus, minus, divide, asterisk, less than, greater than,
>circumflex, period, comma, colon, semicolon, parentheses,
>quote, double quote, square brackets, braces, at sign, exclamation
>mark, plus two or three others.
>
>This is a very reasonable assumption, and one that is not
>very restrictive;

I have to beg to differ. The assumption is only reasonable if we
believe that keyboards and terminals are the same everywhere. But we
all know that other languages use characters beyond those 26 that
make up the English alphabet. (Even English does!) Why believe that
keyboards in other countries would just be bigger to include the
needed national characters? For a long time seven-bit was used for
communication and data storage; why believe that all characters in
the ASCII set would survive national replacements?

If I were to use Bertrand Meyer's reasoning and define a language
which included ][\ as identifier characters and used }{| as
lower-case equivalents, many people would be confused, and probably
hate me as well. No, it is not a good idea to look at your keyboard
and think that it defines the world.

>I maintain that if hardware
>manufacturers and standardization bodies had done their
>most elementary job this assumption would be trivially satisfied
>in 1990/1991.
>...
>I am sure ISO 646 is a respectable work but an international
>standard does not mean anything if the entire state of the
>industry contradicts it.

ISO646 is a well-supported standard. For instance, I am writing this
message on a VT200-compatible. I can change the keyboard to be any
of North American, British, German, French, Danish, Norwegian,
Swedish, Swiss-German or Swiss-French in either eight- or seven-bit
mode. As I change the mode, different characters are also displayed
on my screen for those that are subject to change according to
ISO646, i.e. $#@[\]^`{|}~. I would expect that if I get a different
hardware configuration I get another set of choices.

VT200 was introduced in 1984, but was far from first in its field.
I have an old manual for HP2621A which reveals the same
capabilities, although you have to replace the character generator
to change the screen, and of course eight-bit support is missing.

I don't know how many printers I have encountered that have been
able to display various characters according to ISO646 with the
help of escape sequences to change between national varieties.

I wouldn't mention ISO646 if it was a paper tiger.

>But it (Eiffel) cannot be expected to correct
>single-handedly the mess accumulated over forty years
>of hardware evolution.

At this point Bertrand Meyer is starting to sound like Richard
Stallman defending Emacs' use of CTRL/S and CTRL/Q contrary to all
sensible standards. It is not a mess. It is a standard. It may not
be the standard you want it to be. The only time you get a mess is
when you ignore the standard.

>In fact, a strong argument can be made that,
>by the very nature of this issue, it may be best to entrust
>its resolution to a committee including
>people from diverse backgrounds and countries, rather than
>to an individual, even with the help of the net.

Certainly. Many people frown upon language committees and think that
a language should be designed by an individual. I am not so sure
about this. Is it any wonder Ada does not use any of the untouchable
characters? (Well, it does use "|", but "!" is permitted as an
alternative. It also uses "#", but since this character is never
replaced by a letter it is less sensitive. There is an alternative
anyway, ":" I think.)
--
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"There is only one success, namely to lead your life in your own way"
Anyone who can give a source for this?