db@its63b.ed.ac.uk (D Berry) (04/26/88)
In article <4444@ihlpf.ATT.COM> writes:
>>In article <8804140925.AA13150@klaus.olsen.uucp> Info-Modula2 Distribution List <INFO-M2%UCF1VM.bitnet@jade.berkeley.edu> writes:
>>
>>Does C++ allow infix procedures other than the standard set?
>
>No. Doing this tends to lead to unreadable code. For example: If I
>overload the word 'or' as an infix operator, this sentence no longer has
>the same meaning that I intended (this is because 'word' becomes 'w or d').

Does this mean that "newton" isn't a legal C++ identifier because it will be
parsed as "new" "ton"? I doubt it.

Most languages that allow user defined infix operators let them be any
(of a subrange of) lexically distinct token(s). Often alphanumeric and
symbolic tokens will be different sets, allowing expressions such as "w+d"
to be parsed correctly, while requiring the spaces in "w or d" to
distinguish this case from "word".

>It also leads to nightmares for the parser (is '/+' an error or an overload
>operator, etc.).

The easiest way to handle this is to take the rule for distinguishing between
alphanumeric identifiers -- read the longest -- and use it for symbolic
identifiers as well. So Nevin's example would be a single identifier "/+".

If this were done to C++ (I'm not suggesting it should be done), its parsing
of expressions would differ from C's in some cases. E.g.

   Expression      C parse                      (C++)++ parse

   a+++++b         "a" "++" "+" "++" "b"        "a" "+++++" "b"
   *++p            "*" "++" "p"                 "*++" "p"

However, C++ isn't source code compatible with C anyway, and this scheme
would make the existing treatment of "/*" and "*/" examples of a general rule
rather than a specific case. It would probably also make cases like the above
easier to read, as they would have to be broken up:

   a++ + ++b
   *(++p)

(Really basic symbols such as brackets and quotes shouldn't be allowed to
appear in symbolic identifiers or things get out of hand.)

Defining your own operators is subject to the same cautions as overloading
existing ones. It can make code easier to read, but you can also use it to
make a real dog's breakfast.

It also requires scope rules for the infix nature of the token. One person
might define "or" to be an infix operator in class A while someone else
defines "or" as a function in class B. How does a function parse "or" in a
program that uses both classes?

The rule used in Standard ML would translate to C++ as follows: an infix
operator is infix in the class in which it's declared and in all subclasses
and member functions. From outside the class, it's treated as a prefix
function of two arguments (e.g. A::or (x, y); B::** (x, n);). A keyword can
make all infix tokens of a class be parsed as infix in the current file
(e.g. acceptinfix A; x or y; acceptinfix B; x ** n;).

An alternative would be to let "A::or" be used infix all the time
(e.g. x A::or y; x B::** n;). This would probably be better for C++.

Presumably to be orthogonal this scheme would have to allow user defined
prefix operators as well. Functions and prefix operators would have to be
distinguished by the same rule as functions and infix operators. C++ can
already distinguish between prefix and infix operators.

I'm not proposing that user defined operators should be added to C++; I'm
just attempting to show that it could be done and to point out some of the
problems that would need to be resolved. Please follow up if I've missed
anything.

>
> NEVIN J. LIBER  ..!ihnp4!ihlpf!nevin1  (312) 510-6194

--
"The answer is simple, they could do it with ease;
 stop attacking the patients, and attack the disease." -- Tom Robinson.
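The longest-match rule proposed in this post can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the three character classes, the choice of which characters count as "really basic symbols", and the names classify and tokenizeLongestRun are all hypothetical and do not come from any real C++ lexer.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Character classes assumed for this sketch (hypothetical, not a real lexer).
enum class CharClass { Alnum, Symbol, Grouping, Space };

CharClass classify(char c) {
    unsigned char u = static_cast<unsigned char>(c);
    if (std::isspace(u)) return CharClass::Space;
    if (std::isalnum(u) || c == '_') return CharClass::Alnum;
    // Brackets and quotes are always single-character tokens, so they can
    // never be absorbed into a symbolic identifier.
    if (std::string("()[]{}\"'").find(c) != std::string::npos) {
        return CharClass::Grouping;
    }
    return CharClass::Symbol;
}

// Split the input by reading the longest run of characters of one class.
std::vector<std::string> tokenizeLongestRun(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        CharClass cls = classify(src[i]);
        if (cls == CharClass::Space) { ++i; continue; }
        std::size_t start = i++;
        if (cls != CharClass::Grouping) {
            while (i < src.size() && classify(src[i]) == cls) ++i;
        }
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

Under these assumptions, tokenizeLongestRun("a+++++b") yields the three tokens "a", "+++++", "b", while the broken-up form "a++ + ++b" yields "a", "++", "+", "++", "b".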
nevin1@ihlpf.ATT.COM (00704a-Liber) (04/27/88)
BTW, in my original article I was merely explaining why (I thought) C++ did
not allow user-definable operators. I hope that I didn't give the impression
that I thought that this is impossible to define within the language.

In article <1206@its63b.ed.ac.uk> db@itspna.ed.ac.uk (Dave Berry) writes:
>   Expression      C parse                      (C++)++ parse
>
>   a+++++b         "a" "++" "+" "++" "b"        "a" "+++++" "b"
>   *++p            "*" "++" "p"                 "*++" "p"
>
>However, C++ isn't source code compatible with C anyway, and this scheme
>would make the existing treatment of "/*" and "*/" examples of a general rule
>rather than a specific case.

There is very little that makes C programs choke in the C++ translator. If
you fix the header files up (put in function prototypes, which will be
required by ANSI C anyway) and don't use any of the new keywords (I know
there are a few more little things but I don't have the C++ book handy right
now), I can compile my C program with my C++ compiler. Changing the rules
from C to C++ is very undesirable.

>It would probably also make cases like the above
>easier to read, as they would have to be broken up:

I agree, but this should not be a design goal of the language (this does not
give me more expressive power, so why put it in?).

A few things. First, a syntax would have to be devised to allow the
overloading of the user-defined operators for the builtin types, such as
char, int, etc.

Secondly, what do you do about defining the order of precedence and
associativity of these user-defined operators? If you make them all the
same, then what do you really gain by having infix instead of prefix; you
still need parens all over the place. And where in the chart would you put
it?

Thirdly, there might be some problem in differentiating (at compile time)
which operators return lvalues and which do not.

Fourthly (is that a word? :-)), I would also like to be able to define my
own unary operators (both prefix and postfix). Otherwise the same question
which started this discussion comes up :-).

I am not saying that it is impossible to allow user-definable operators in
C++, only that it may not be desirable and that if it is desirable it needs
a lot more planning than we have done so far.

--
NEVIN J. LIBER  ..!ihnp4!ihlpf!nevin1  (312) 510-6194
"The secret compartment of my ring I fill
 with an Underdog super-energy pill."
These are solely MY opinions, not AT&T's, blah blah blah
daniels@teklds.TEK.COM (Scott Daniels) (04/28/88)
In article <1206@its63b.ed.ac.uk> db@itspna.ed.ac.uk (Dave Berry) writes:
>...I'm not proposing user defined operators should be added to C++; I'm just
>attempting to show that it could be done and to point out some of the problems
>that would need to be resolved. Please follow up if I've missed anything.

Most of the explained non-problems with operators have to do with lexical
analysis, solvable easily if by nothing else than whitespace rules. The other
problem with operators is determining their precedence. This is a substantial
problem to the code reader as well as the compiler. How does the following
associate?:

	a + b operator c * d;

If the answer is not obvious to the code reader as well as the translator,
you have a nightmare.

One possible solution is to take advantage of an observation about
precedence in grammars: The vast majority of expressions do not require
precedence to resolve them. Therefore, you could restrict user operators to
use in contexts that provide unambiguous groupings. This would make the above
illegal, requiring:

	a + (b operator c) * d;
  or	(a + b) operator (c * d);
  or	...

However, you might not be too happy with:

	a = b op c;	// illegal
	a = (b op c);	// legal

Since '=' is itself an operator. Perhaps all user-operators are a fixed
precedence?

All in all, I think the can of worms is larger than the benefit.

-Scott Daniels (daniels@teklds.UUCP)
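The restriction suggested in this post, that a user-defined operator may only appear where the grouping is already unambiguous, could be checked roughly as follows. This is a sketch under assumptions: the input is taken to be already tokenized, '=' is treated like any other operator (as in the post's last example), and the function name userOpsUnambiguous and its parameters are hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns true when every user-defined operator is the only operator at its
// parenthesis depth, so that no precedence rule is needed to group it.
bool userOpsUnambiguous(const std::vector<std::string>& tokens,
                        const std::set<std::string>& builtinOps,
                        const std::set<std::string>& userOps) {
    struct Level { int opCount = 0; bool sawUserOp = false; };
    std::vector<Level> stack(1);  // one entry per open parenthesis level
    for (const std::string& tok : tokens) {
        if (tok == "(") {
            stack.push_back(Level{});
        } else if (tok == ")") {
            if (stack.size() == 1) return false;  // unbalanced ")"
            Level done = stack.back();
            stack.pop_back();
            if (done.sawUserOp && done.opCount > 1) return false;
        } else if (userOps.count(tok) || builtinOps.count(tok)) {
            Level& top = stack.back();
            ++top.opCount;
            if (userOps.count(tok)) top.sawUserOp = true;
        }
    }
    if (stack.size() != 1) return false;  // unbalanced "("
    return !(stack.back().sawUserOp && stack.back().opCount > 1);
}
```

With built-in operators {+, *, =} and a user operator "op", this check rejects "a + b op c * d" and "a = b op c" but accepts "a + (b op c) * d" and "a = (b op c)", matching the legal and illegal cases listed in the post.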
djones@megatest.UUCP (Dave Jones) (04/28/88)
In article <4549@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) says:
> ...
> There is very little that makes C programs choke in the C++ translator.

In this case, "very little" is quite a bit.

You didn't mention that C++ treats the name-space differently: There is no
separate lookup table for "struct this" and "struct that". That's another
"little thing" that makes C++ not a superset of C.

And as you mentioned in a part that I edited out, ANSI C is adding new
keywords, so C and C++ will diverge even further (if you will allow that
ANSI C is C, and not a new language per se, an arguable point).

> If you fix the header files up (put in function prototypes, which will be
> required by ANSI C anyway) ...

Will they? I don't know much about ANSI C, but I certainly hope not! I hope
that all old C programs will compile just fine under ANSI C. Is that not the
intention of the committee?

Except for old files which use the new reserved words, of course. Those
programs will identify themselves (syntax error), and will be easy to fix --
just change the name. They will be quite rare. I can't remember ever having
named a variable "volatile". (This makes a good case for not reserving
keywords in a language. But there are good arguments on the other side too.)

I am under the impression that the old function declarations will work just
fine. A function prototype for a function requiring no parameters does not
look like an old style function declaration. Instead it looks like this:

	int foo(void);

The different form of function prototypes will make C++ and ANSI C even
"more incompatible" than C++ and old C, not less so.

Since you were discussing changes to C++, here's my preference: Do function
prototypes the ANSI C way, and do the name-spaces the old C way. But there
may already be too large a body of C++ code out there to do that. And then
there's the books in print. Still, for my own purposes, that's what I would
like to see. It would be great if C++ could be made a proper superset of
ANSI C.

> ...

		Dave J.
shankar@hpclscu.HP.COM (Shankar Unni) (04/28/88)
>   Expression      C parse                      (C++)++ parse
>
>   a+++++b         "a" "++" "+" "++" "b"        "a" "+++++" "b"
>

BZZZZZZZZ! There has already been a prolonged discussion about this example.

C does *not* parse it like this. It parses it as

	a "++" "++" "+" b

which is syntactically incorrect. Therefore, your example will choke any
decent C compiler. Remember, "longest sequence of chars".

--scu
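The parse given in this reply follows from applying "longest sequence of chars" against a fixed operator set. A minimal sketch, assuming only a small subset of C's operators and a hypothetical function name tokenizeFixedOps:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Tokenize with a fixed operator table, always taking the longest operator
// that matches at the current position.
std::vector<std::string> tokenizeFixedOps(const std::string& src) {
    // A small subset of C's operators, two-character ones listed first so
    // that the longest match is tried before its one-character prefix.
    static const std::vector<std::string> ops = {
        "++", "--", "+=", "-=", "+", "-", "*", "/", "="};
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalnum(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
            continue;
        }
        bool matched = false;
        for (const std::string& op : ops) {
            if (src.compare(i, op.size(), op) == 0) {
                tokens.push_back(op);
                i += op.size();
                matched = true;
                break;
            }
        }
        // Characters outside the table are passed through one at a time.
        if (!matched) tokens.push_back(std::string(1, src[i++]));
    }
    return tokens;
}
```

Here tokenizeFixedOps("a+++++b") produces "a", "++", "++", "+", "b". The lexer never backtracks to try "a" "++" "+" "++" "b", so a parser receiving these tokens sees a second "++" applied to the result of "a++" and rejects the expression.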
jima@hplsla.HP.COM ( Jim Adcock) (04/28/88)
From the little bit I've messed around with writing "C" language compilers,
I'd guess the restrictions put on overloading operators [i.e. you're not
allowed to define new operators, just to overload existing ones] were chosen
to allow C++ to remain compatible with the traditional "C" compiler approach
-- with fixed definitions of what it is we need to lex and parse [other than
the traditional "C" hack of passing typedef info back to the "lex" part of
the compiler].

As your example points out, allowing the user to define his/her own operator
symbols greatly changes how we must interpret a string of non-alphanumerics
in the input file. Which makes it potentially very difficult for the reader
of that input file to figure out what was meant. Plus you'd have to give the
user means to specify the new operator's precedence and binding
relationships......

Thus, the design of the compilers, the tools used to design the compilers,
and the way C++ users go about trying to read and interpret C++ program
sources would have to change considerably to be able to handle defining new
operators.

I believe the restrictive C++ approach to operator overloading is a good
practical choice. To give more flexibility in this area would cause C++ to
diverge too greatly from C.
jima@hplsla.HP.COM ( Jim Adcock) (04/30/88)
| Most of the explained non-problems with operators have to do with lexical
| analysis, solvable easily if by nothing else than whitespace rules.

Where "whitespace rules" means that if users were allowed to define their
own operators, then you'd be forced to start separating operators with
whitespace, the way you presently have to separate "identifiers" with
whitespace.

I can't imagine writing C[++] code where operators "always" have to be
separated using whitespace! What a pain! Just try going over your C[++]
programs, separating adjacent operators with whitespace, and you'll realize
why we don't want this "feature" in C++!
crowl@cs.rochester.edu (Lawrence Crowl) (05/10/88)
In article <6590049@hplsla.HP.COM> jima@hplsla.HP.COM (Jim Adcock) writes:
>I can't imagine writing C[++] code where operators "always" have to be
>separated using whitespace! What a pain! Just try going over your C[++]
>programs, separating adjacent operators with whitespace, and you'll realize
>why we don't want this "feature" in C++!

This is not a problem. Consider three classes of characters, those for
identifiers (also keywords and literals), those for grouping (e.g.
parentheses), and those for operators. If no token mixes classes, then you
need no spaces between adjacent tokens of different classes. This covers
most tokens in the stream. The major cases where this does not happen are
variable declarations (which currently require the space) and unary
operators adjacent to other operators. If you are not putting a space in
your code in the latter case, your code is probably confusing anyway.
Consider:

	a+=b+++*c;
versus
	a+=b++ + *c;

The latter provides for user-defined operators with minimal additional
typing burden.

--
  Lawrence Crowl                716-275-9499    University of Rochester
                      crowl@cs.rochester.edu    Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl    Rochester, New York, 14627
kurt@color.ctt.bellcore.com (Kurt Gluck(PICS)) (05/11/88)
How about going with SNOBOL's method? Predefine a small number of
additional unused operator symbols that can be used.
In SNOBOL's case the operator symbols are:
BINARY OPERATORS
Graphic  Definition                    Associativity  Precedence
=======  ============================  =============  ==========
~        UNUSED                        right          12
?        UNUSED                        left           12
$        immediate value assignment    left           11
.        conditional value assignment  left           11
!        exponentiation                right          10
**       exponentiation                right          10
%        UNUSED                        left           9
*        multiplication                left           8
/        division                      left           7
#        UNUSED                        left           6
+        addition                      left           5
-        subtraction                   left           5
@        UNUSED                        left           4
blank    concatenation                 left           3
|        alternation                   left           2
&        UNUSED                        left           1
UNARY OPERATORS
Graphic  Definition
=======  ======================
~        negation
?        interrogation
$        indirect reference
.        name
!        UNUSED
%        UNUSED
*        unevaluated expression
/        UNUSED
#        UNUSED
+        positive
-        negative
@        cursor position
|        UNUSED
&        keyword
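With precedence and associativity fixed in advance for every spare symbol, grouping an expression becomes a table lookup. The following is a rough sketch of that idea using precedence climbing over the binary table above. It assumes single-character operands and operators with no whitespace (so blank-as-concatenation and "**" are left out), and the names OpInfo, opTable, parseExpr and group are hypothetical.

```cpp
#include <map>
#include <string>

struct OpInfo { int precedence; bool rightAssoc; };

// Single-character subset of the binary-operator table quoted above.
const std::map<char, OpInfo>& opTable() {
    static const std::map<char, OpInfo> table = {
        {'~', {12, true}}, {'?', {12, false}}, {'!', {10, true}},
        {'%', {9, false}}, {'*', {8, false}},  {'/', {7, false}},
        {'#', {6, false}}, {'+', {5, false}},  {'-', {5, false}},
        {'@', {4, false}}, {'&', {1, false}},
    };
    return table;
}

// Precedence climbing. Assumes well-formed input such as "a+b#c*d".
// Returns the expression with explicit parentheses showing the grouping.
std::string parseExpr(const std::string& src, std::size_t& pos, int minPrec) {
    std::string lhs(1, src[pos++]);  // operand: a single character
    while (pos < src.size()) {
        auto it = opTable().find(src[pos]);
        if (it == opTable().end() || it->second.precedence < minPrec) break;
        char op = src[pos++];
        // A left-associative operator must not absorb another operator of
        // the same precedence on its right; a right-associative one may.
        int next = it->second.rightAssoc ? it->second.precedence
                                         : it->second.precedence + 1;
        std::string rhs = parseExpr(src, pos, next);
        lhs = "(" + lhs + op + rhs + ")";
    }
    return lhs;
}

std::string group(const std::string& src) {
    std::size_t pos = 0;
    return parseExpr(src, pos, 0);
}
```

Under this table a user who picks the spare symbol '#' gets a fixed answer to the grouping question: group("a+b#c*d") returns "(a+(b#(c*d)))", because '#' sits between '+' and '*' in precedence.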