[comp.lang.c] Paying attention to parentheses

franka@mmintl.UUCP (04/07/87)

[Not food]

OK, I'm going to try this once again.  I think the ANSI committee is doing
the right thing by keeping the current C rules for respecting parentheses,
and providing an alternative, but the syntax for the alternative they are
providing is *horrible*.

Instead of using the unary plus to force its argument to be evaluated as a
unit, I believe that a much better syntax would be to use square brackets to
enclose such subexpressions.  The result is much more readable.  Consider
the following examples from an article by Jim Valerio (532@omepd):

>Here are some examples of parenthesized expressions, pulled from the
>Elefunt tests, which want the order of evaluation guaranteed by the
>parentheses:
>		x = ((one + x * a) - one) * 16.0;
>		y = x / ((half + x * half) *((half - x) + half));
>		x = (x + eight) - eight;
>		while (((b+one)-b)-one == zero);
>		if ((a+betam1)-a != zero)

Under the proposed standard, these would be:

	x = +(+(one + x * a) - one) * 16.0;
	y = x / (+(half + x * half) * +(+(half - x) + half));
	x = +(x + eight) - eight;
	while (+(+(b+one)-b)-one == zero);
	if (+(a+betam1)-a != zero)

With my proposal, one would instead have:

	x = [[one + x * a] - one] * 16.0;
	y = x / ([half + x * half] * [[half - x] + half]);
	x = [x + eight] - eight;
	while ([[b+one]-b]-one == zero);
	if ([a+betam1]-a != zero)

(Actually, I'm not really sure that *all* the parentheses I have marked as
needing to be respected really do need to be; there is no way to tell for
sure from the original code.  For example, it might well suffice to write
"x = ([one + x * a] - one) * 16.0;" for the first statement.)

I claim that the third set of expressions is much more readable than the
second; in fact, it loses no readability from the original set.  I suspect
it would avoid reactions like the following (from the same article):

>I'm even reasonably happy with the
>parenthesis situation, since I just overspec my compiler and require it
>to preserve parentheses for floating-point operations.  (Sure, that's
>not encouraging portable code, but it's an easy dependency to document
>and consequently doesn't bother me much.)

Most likely, Jim would instead specify that brackets be used instead of
parentheses for all floating point expressions.

Several arguments were advanced against this proposal, which I will repeat
and answer here:

(1) Novice C programmers would say "gee, C supports both parentheses and
brackets for expressions; I can use whichever looks better".

So what?  They aren't hurting themselves any.  Anyone seriously interested
in writing very efficient code should have a better understanding of the
language than that; meanwhile programs will work whenever they would have
worked using only parentheses.

(2) Brackets for order of evaluation will be confused with subscripts.

I don't think this problem amounts to a hill of beans.  People very rarely
get confused between parentheses for grouping and parentheses for function
arguments; I don't think this would cause any more problems.

(3) It is hard to make compilers recognize brackets as a function, when
parentheses are used just for grouping.

I don't believe it!  I have studied some compiler theory, although I have
never had occasion to put much of it into practice.  I know of no parsing
algorithms for which implementing this would be any harder than
implementing the ANSI proposal.  I don't believe there are any where it
would be significantly harder.

(4) It is rather late in the process to make this kind of change.

That it is.  I can only say that if I had been aware of the situation sooner,
I would have said something earlier.  Maybe the committee should have more
studiously avoided adding functionality, so that things could have been
tried before being standardized on.

Still, I think the current proposal is quite ugly, and no arguments about
timeliness should let us put ugliness into the language, when a better
alternative is available.  Nor is this a really major change to the
standard; one could add it to a formal syntax in a few minutes, and adjust
the specification in relatively short order.

Comments?  If you think this is a good idea, say so.  It won't get adopted
without vocal support.

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

franka@mmintl.UUCP (04/10/87)

[This is a reposting, since I don't think the original got out.  My apologies
if this is a duplication.]

manis@ubc-cs.UUCP (04/14/87)

I think that this discussion is beginning (?!) to degenerate. For the record:

1) Compilers generally fall into three categories: student compilers, those
with optional code improvers, and those with built-in optimisation. A
student compiler will generally do no optimisation whatsoever, and therefore
(except in obvious cases such as argument lists) what you see is what you
get. An optional code improver (such as the one supplied with pcc) is not
used by default, and therefore problems based on optimisation can be easily 
dealt with by comparing the optimised program with the unoptimised one. The
third category, the really heavy-duty optimising ones, are certainly not
recommended for casual or beginning programmers; even so, they generally
have some way of disabling optimisation.

So what's the problem? Just because ANSI relaxes the semantics of the
language does not mean that a compiler has to be indeterminate. The standard
(unless it's been changed over the last year or so) does not *demand*
indeterminate order of evaluation.

As a final point, if I got code from somewhere else, I would certainly not 
optimise it (even on UNIX) until I was sure that it was clean.

2) Frank Adams dislikes the +(...) syntax, and proposes the use of square
brackets to denote a definite order of evaluation. I don't care for the
monadic +, but why add yet more clumsy syntax? If you don't like the +,
what's wrong with 
   #define respect(x) (+(x))

Do all the people who are flaming on this subject understand that there are
machines which don't do floating-point operations in the order specified,
even if you write the code in assembly language? (Larger 370's certainly
fall into this category; there is a certain type of NOP you write in order
to establish a sequence point in assembly language).

-----
Vincent Manis                {seismo,uw-beaver}!ubc-vision!ubc-cs!manis
Dept. of Computer Science    manis@cs.ubc.cdn
Univ. of British Columbia    manis%ubc.csnet@csnet-relay.arpa  
Vancouver, B.C. V6T 1W5      manis@ubc.csnet
(604) 228-6770 or 228-3061

"Long live the ideals of Marxism-Lennonism! May the thoughts of Groucho
 and John guide us in word, thought, and deed!"

tps@sdchem.UUCP (04/16/87)

In article <2094@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:

>Instead of using the unary plus to force its argument to be evaluated as a
>unit, I believe that a much better syntax would be to use square brackets to
>enclose such subexpressions...

>>Here are some examples of parenthesized expressions, pulled from the
>>Elefunt tests, which want the order of evaluation guaranteed by the
>>parentheses:
>>		x = ((one + x * a) - one) * 16.0;
>>		y = x / ((half + x * half) *((half - x) + half));...

>Under the proposed standard, these would be:
>
>	x = +(+(one + x * a) - one) * 16.0;
>	y = x / (+(half + x * half) * +(+(half - x) + half));...

>With my proposal, one would instead have:
>
>	x = [[one + x * a] - one] * 16.0;
>	y = x / ([half + x * half] * [[half - x] + half]);...

>(Actually, I'm not really sure that *all* the parentheses I have marked as
>needing to be respected really do need to be; there is no way to tell for
>sure from the original code.

This is a good argument for separating the default-precedence-override use of
parentheses from the evaluation-order use.  If you are maintaining this code,
you never know whether the order is important.  If you require a different
syntax, the meaning is very clear not only to the optimizing compiler but to
the human readers.

>I claim that the third set of expressions is much more readable than the
>second; in fact, it loses no readability from the original set.

I think it's *more* readable because the programmer's intentions are
unambiguously laid out -- "add these two together first, add these others in
the most expeditious manner".

>Several arguments were advanced against this proposal, which I will repeat
>and answer here:
>
>(1) Novice C programmers would say "gee, C supports both parentheses and
>brackets for expressions; I can use whichever looks better".

Even non-novice programmers have admitted in this group that they never
realised what was happening to their parentheses.  Confusion between
brackets and parens will be quite innocuous for novices.

>(2) Brackets for order of evaluation will be confused with subscripts.
>
>I don't think this problem amounts to a hill of beans.  People very rarely
>get confused between parentheses for grouping and parentheses for function
>arguments; I don't think this would cause any more problems.

I think arrays are a little different than functions.  You can't do
arithmetic with functions, whereas you can with arrays.  However, most
floating point expressions do only simple subscripting and avoid applying
arithmetic operators to arrays.  I think it would be wise to keep the
proposed meaning of unary plus for those who think the double meaning
of brackets would be confusing for a particular expression.

Fortran programmers shouldn't complain about this syntax because in Fortran,
parentheses are overloaded for grouping expressions and as array subscripts.

>(3) It is hard to make compilers recognize brackets as a function, when
>parentheses are used just for grouping.
>
>I don't believe it!  I have studied some compiler theory, although I have
>never had occasion to put much of it into practice.  I know of no parsing
>algorithms for which implementing this would be any harder than
>implementing the ANSI proposal.  I don't believe there are any where it
>would be significantly harder.

Has anyone added brackets like this to their local C compiler?  Have they
found any problems? Was it hard to do?  If not, would someone try this?
That way it would be harder for the Committee to reject the proposal
on the grounds of "no prior art".

>(4) It is rather late in the process to make this kind of change.

Why was the adoption of the standard delayed for months and months for
another public review period if we don't get a chance to actually change
it?  The next release from ANSI had better have some non-trivial changes
adopted to justify all the delay.

>Comments?  If you think this is a good idea, say so.  It won't get adopted
>without vocal support.
>Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
>Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

Thanks for the proposal.  In case it wasn't apparent, I'm all for this
proposal.

|| Tom Stockfisch, UCSD Chemistry	tps%chem@sdcsvax.ucsd.edu
					or  sdcsvax!sdchem!tps

mckeeman@wanginst.UUCP (04/16/87)

In article <1002@ubc-cs.UUCP>, manis@ubc-cs.UUCP (Vincent Manis) writes:
> An optional code improver (such as the one supplied with pcc) is not
> used by default, and therefore problems based on optimisation can be easily 
> dealt with by comparing the optimised program with the unoptimised one...

Let me define "plain code" for C expressions:

    Operands are evaluated left-to-right,

    Operations are associated with operands as defined by the parse tree,

    Side effects are executed with the operation causing them, and

    The representation of an intermediate result is identical to the
    representation of that same value in a variable of the same type.

Relative to that definition, it is my opinion that all compilers do some
non-intuitive optimization *all* the time.  What a programmer using -O really
is saying is "I'm willing to wait a while for even better code".  If
programmers need something between "plain code" and "highly optimized code",
compiler writers are going to have to have a better definition of it.

Anyone want to try a definition of "reasonable code" for:
    1. floating point expressions
    2. overflow-vulnerable integer expressions
    3. expressions subject to debugger and postmortem peeking.

500 words or less per entry please. :-)

/s/ Bill

p.s. The K&R grammar is deliberately ambiguous, defeating the above
definition of plain code; plain code does make sense relative to the x3j11
grammar.
-- 
W. M. McKeeman            mckeeman@WangInst
Wang Institute            decvax!wanginst!mckeeman
Tyngsboro MA 01879

gwyn@brl-smoke.ARPA (Doug Gwyn ) (04/16/87)

In article <673@sdchema.sdchem.UUCP> tps@sdchemf.UUCP (Tom Stockfisch) writes:
>Why was the adoption of the standard delayed for months and months for
>another public review period if we don't get a chance to actually change
>it?  The next release from ANSI had better have some non-trivial changes
>adopted to justify all the delay.

You did have a chance to provide input, which IS considered by X3J11,
and you will have another during the second public review period.  We
don't make a change to the spec just because someone asks for it; it
is quite possible that a change could cause just as much objection
from people who were satisfied the way it was.  Proposals for which
sufficient justification is found ARE adopted, sometimes with
improvements figured out in the course of discussions.  Proposals
that aren't found to be sufficiently justified are rejected.  Guess
which category most proposals fall into?

The only reason there has to be another public review is BECAUSE some
substantive changes were indeed made.  If as a result of the second
public review more substantive changes are found necessary, there will
be even more delay.  That is why even at this late date X3J11 is still
working on solutions to major issues such as internationalization and
multi-byte character sets.  We really hope that the second public-review
draft will be "good enough" to adopt, modulo non-substantive editorial
adjustments.  (Clearly nobody will think it is "perfect" no matter how
long we work on it.)

Reminder:  The above is my own view, not an official X3J11 policy statement.

stuart@bms-at.UUCP (Stuart D. Gathman) (04/17/87)

I like the proposal to use [] instead of +().  (For what it's worth.)

This keeps the spirit of 'C' and is much prettier.
-- 
Stuart D. Gathman	<..!seismo!dgis!bms-at!stuart>

jda@mas1.UUCP (James Allen) (04/17/87)

In <1002@ubc-cs.UUCP> manis@ubc-cs.UUCP (Vincent Manis) writes:

> Do all the people who are flaming on this subject understand that there are
> machines which don't do floating-point operations in the order specified,
> even if you write the code in assembly language? (Larger 370's certainly
                                                                 ^^^^^^^^^
> fall into this category; there is a certain type of NOP you write in order
> to establish a sequence point in assembly language).

I am not nearly so certain of this as Dr. Manis.  In fact the larger
multi-megabuck 370's were required to give the same answers as the pocket
calculator models IBM sold for a mere $150,000.

There is a special NOP but it has nothing to do with floating-point.  This
NOP is usually meaningless to the applications programmer; here's an exception:

/*
** Warning: this fragment helps explain the "special NOP", but is
** otherwise wrong.  For one thing, any system call will do the equivalent
** of the special NOP; hence "signal" eliminates the need for the NOP
** in a Unix implementation.
*/
	signal (SIGBUS, SIG_IGN);
		/* ignore memory violation in next function */
	strcpy (might_be_corrupted_pointer, "hi earth");
	the_special_NOP ();
	signal (SIGBUS, SIG_DFL);

Without NOP, the access violation in strcpy may not show up until after
the second call to signal().  During the special NOP, all memory writes
logically issued by this CPU are physically completed.  Special hardware
eliminates the obvious programming problems:
	black = white;
	if (black != white) printf ("Impossible...even though storage \
		location BLACK may contain old data, hardware comparators \
		have made this obsolete data invisible to us.");
The NOP therefore is required only for cpu-cpu, cpu-channel, and
cpu-access_violation_hardware synchronization.

The 370/165 Model I, by the way, treated the special NOP as an ordinary NOP
and could not be made deterministic in this fashion.

There's one or two other things I know about 370's, so if anyone else out there
is "certain" about their behavior, I may be willing to book a bet...

		--   Nick the Greek

am@cl.cam.ac.uk (Alan Mycroft) (04/27/87)

Consider the following examples from an article by Jim Valerio (532@omepd):
>>Here are some examples of parenthesized expressions, pulled from the
>>Elefunt tests, which want the order of evaluation guaranteed by the
>>parentheses:
>>		x = ((one + x * a) - one) * 16.0;
>>		y = x / ((half + x * half) *((half - x) + half));
>>		x = (x + eight) - eight;
>>		while (((b+one)-b)-one == zero);
>>		if ((a+betam1)-a != zero)
Frank Adams (Ashton Tate) then suggests that these need to be:
>	x = +(+(one + x * a) - one) * 16.0;
(etc - lots of unary '+'s)
Now, by my understanding of the ANSI draft, none of the above expressions
may be re-arranged, so none of the unary '+'s are required here:
The only rearrangements that are allowed are ASSOCIATIVITY AND COMMUTATIVITY.
So, by the mathematical definition, the only changes allowed are
(a+b)+c <-> a+(b+c) and a+b <-> b+a and the analogues for '*'.
None of the above expressions contain a+b+c or a*b*c and unary plus
cannot be used anyway to stop a+b being re-arranged as b+a.
Therefore using unary plus cannot affect the allowable compilations of the
above expressions.

As I have remarked before, the re-arrangements only apply to the above 4
cases.    a+b-c CANNOT be rearranged as a-c+b or even a+(b-c)
since '-' is not an associative commutative operator.
Similar reasoning forbids any other rearrangement.