[comp.lang.c] parens honored

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/07/88)

In article <12211@orchid.waterloo.edu> atbowler@orchid.waterloo.edu (Alan T. Bowler [SDG]) writes:
>If you have a situation where overflows are significant, then the
>"honour parenthesis" rule forbids the compiler from making even simple
>optimizations like constant folding.

True, but most (maybe all) implementations I have access to do not have
this problem.  I think SAS C on the IBM is the only one I know of (it
traps integer overflow), other than one's-complement architectures.
One of our system administrators once turned on floating-point overflow
trapping in our Gould PowerNode startup module, not realizing that the
hardware bundled it with integer overflow trapping and that the C code
generator was relying on benign integer overflow.  Lots of code crapped
out when it was relinked with the new startup module!

>    Note that this will apply to ALL floating point calculations,

Which is the main motivation behind the clamor for this specification.

>    With the new rule parentheses are being asked to perform 2 distinct
>actions
>    1) override the default precedence rules
>    2) specify the evaluation sequence.

Your analysis is essentially the one that the X3J11 committee used to
reject the constant demands for "honoring parentheses", until last
meeting where it was presented in a different light.

(2) was actually always implicit throughout the whole language, but
the base document (K&R Appendix A) contained a sentence that made an
explicit special dispensation for the order of mathematically
commutative and associative operators (+ and * being the main ones).
What X3J11 has done is remove the special license for optimizers to
rearrange such expressions in such a way that IT MAKES A POTENTIALLY
VISIBLE DIFFERENCE.  By the "as if" rule, an optimizer can still do
this if it can be sure that the effect is in all respects "as if" it
had strictly followed the evaluation order that the programmer
specified.

Hey, a lot of the committee didn't mind the way it was, but in the
end it was decided to help the programmer rather than the implementor.
I for one appreciate having my expressed algorithmic intentions
carried out precisely, instead of the compiler taking it upon itself
to "improve" things for me in ways that make my code unreliable.
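
A minimal sketch of the "potentially visible difference" described above,
assuming IEEE 754 double arithmetic. The function names are illustrative and
do not come from the thread or from any draft of the standard.

```c
#include <stdio.h>

/* Evaluate with the grouping the programmer wrote: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Mathematically the same sum, regrouped as a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}

/* Hypothetical helper: print both groupings for one set of inputs. */
void show_grouping(double a, double b, double c)
{
    printf("(a + b) + c = %g\n", sum_left(a, b, c));
    printf("a + (b + c) = %g\n", sum_right(a, b, c));
}
```

Calling show_grouping(1e20, -1e20, 1.0) should show the two groupings
disagreeing: in the second one the 1.0 is absorbed when it is added to -1e20,
so it never reaches the result. A compiler that regroups the first form into
the second changes the answer, which is the case the "as if" rule does not
cover.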

dhesi@bsu-cs.UUCP (Rahul Dhesi) (01/08/88)

In article <6968@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) 
writes:
[about new restriction on rearrangement of parens in ANSI C]
>Hey, a lot of the committee didn't mind the way it was, but in the
>end it was decided to help the programmer rather than the implementor.
>I for one appreciate having my expressed algorithmic intentions
>carried out precisely, instead of the compiler taking it upon itself
>to "improve" things for me in ways that make my code unreliable.

Isn't it true though that in pre-ANSI K&R C, one could force a certain
order of evaluation through the use of temporary variables, thus
getting the best of both worlds?  ANSI C takes away this ability from
the user.  The new rule gives BOTH the programmer and the implementor
fewer options.  The unary plus, ugly though it was, did not take away
anything from the programmer and almost certainly was a smaller burden
on the compiler writer for a machine doing overflow checking than was
having to figure out when rearrangement of parens could change
overflow behavior.

[wild speculation follows]
It also will be a potential headache to the hardware designer who was
about to include automatic overflow checking into his CPU, a definite
plus in many cases, who now realizes that this could mean that his
company's C compiler will now show up badly in benchmarks against his
competitors who saved money by not having their hardware check for
integer overflow.  For his C compiler can now no longer simplify many
macro expressions.  So he may decide to forget overflow checking.
Thus the parens rule acts to make expensive in software a useful
feature that was already expensive in hardware.  A double whammy
with no winners, only losers.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/08/88)

In article <1798@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>Isn't it true though that in pre-ANSI K&R C, one could force a certain
>order of evaluation through the use of temporary variables, thus
>getting the best of both worlds?  ANSI C takes away this ability from
>the user.

How so?  I don't think this has changed one whit.

>It also will be a potential headache to the hardware designer ...

This shows that integer overflow trapping and optimization are at cross
purposes.  I certainly don't want both at the same time, because it
would make my patently correct code unreliable at run time.  People who
want integer overflow trapping should specify it in their RFPs, and if
that is a sizeable portion of the market, certainly vendors will provide
it.

leichter@yale.UUCP (Jerry Leichter) (01/09/88)

With all the whining one way and the other about parens and order of
evaluation - and given all the changes to "K&R C", such as it was - why
not do things the clean way:  Round parens group but don't imply evaluation
order; "square parens" [] both group AND imply evaluation order.  This way
of doing things has a couple of advantages:

	- Complex mathematical formulas are often written with several
		different kinds of brackets; it makes them much easier
		to read.
	- When square brackets are used in formulas, they usually refer to
		larger, logically separate pieces, so it makes sense that
		they imply a stronger kind of grouping.
	- No existing code uses square brackets this way, so no existing
		code is affected at all.

Ah, you object, but square brackets are already used for arrays!  So what?
Parens are used for function calls as well as grouping; the syntaxes are
precisely analogous.

To make my proposal more complete:

	- Brackets and parens do NOT match each other - (a+b] is ill-formed.
	- Brackets can be used for grouping only in expressions; they cannot
		be used in other syntactic constructs that use parens (after
		for or if, or surrounding the typename in a cast or after
		sizeof, to give a couple of examples).  The one place this
		might be worth thinking more closely about is in grouping
		within typenames - [*int]() and such.  I doubt this is
		worth the extra ambiguity it adds to abstract declarators
		([] is analogous to ()).

If you really want, it's even possible to allow {} as another kind of
grouping operator in expressions, though (a) you'd have to make some fairly
arbitrary restrictions to eliminate ambiguities; (b) it would make error
recovery harder.  So I certainly wouldn't recommend it.

							-- Jerry

dhesi@bsu-cs.UUCP (Rahul Dhesi) (01/09/88)

In article <6986@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) 
writes:
>In article <1798@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>Isn't it true though that in pre-ANSI K&R C, one could force a certain
>>order of evaluation through the use of temporary variables, thus
>>getting the best of both worlds?  ANSI C takes away this ability from
>>the user.
>
>How so?  I don't think this has changed one whit.

Pre-ANSI K&R C allowed the user to pretty much use parentheses anywhere
without inhibiting optimization.  Yet, when necessary, the user could
force a specific evaluation order.  ANSI C preserves only the second
option--the user can no longer (on a machine that checks for overflow)
use parens with wild abandon.  It is this ability to get "the best of
both worlds" that ANSI C takes away.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/09/88)

In article <1808@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>ANSI C preserves only the second
>option--the user can no longer (on a machine that checks for overflow)
>use parens with wild abandon.  It is this ability to get "the best of
>both worlds" that ANSI C takes away.

I think you're looking at this the wrong way around.  The programmer
can still use parentheses any which way.  In fact, code that formerly
would mysteriously fail will now work the way it looks like it ought
to.  The only loss is a small amount of extra optimization under
quite limited circumstances (associating +s, or *s).  Some of this
can still be performed (folding integer constants under many common
circumstances).  Hardly an excessive price to pay for reliable code.
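
A small sketch of why constant folding can survive under "benign" overflow.
Signed overflow is undefined in C, so this example models wraparound with
unsigned arithmetic, which is defined to be modular. The function names are
made up for illustration.

```c
#include <limits.h>

/* The expression as the programmer grouped it. */
unsigned offset_as_written(unsigned x)
{
    return (x + 1000u) + 2000u;
}

/* The expression after an optimizer folds the two constants. */
unsigned offset_folded(unsigned x)
{
    return x + 3000u;
}

/* Returns 1 if both forms agree at UINT_MAX, where the first addition wraps. */
int forms_agree_at_wrap(void)
{
    return offset_as_written(UINT_MAX) == offset_folded(UINT_MAX);
}
```

Because modular addition is associative, the two forms agree for every x, so
the folded version behaves "as if" the parentheses had been honored. On a
machine that traps on overflow the same regrouping could move or remove a
trap, which is where the optimization is lost.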

ado@elsie.UUCP (Arthur David Olson) (01/10/88)

To me, the real problem with honoring parentheses is simple:  existing
implementations of C don't.  This means that when an ANSI C program that
"needs" such honoring gets ported back to an existing implementation,
things will break quietly and mysteriously.

However much folks have complained about it, X3J11's earlier technique of
using unary + to force parentheses to be honored was better with respect
to "backward portability":  code such as
	a = +(b + c) + d;
would be flagged by existing compilers as illegal, rather than being
accepted as legal but doing something different from what you intended.
(The same would be true of the
	a = [b + c] + d;
notion presented in this news group recently.)

And, of course, the founding parents got it right in the first place:
if order of evaluation is important, use a temporary variable; then there'll
be absolutely no way for a code reader to misinterpret the intent of the code.
Your modern, now, a-go-go compiler (the one that needs "noalias"
hints so it can produce spiffy code) will surely optimize the temporary
variable away.
-- 
ado@vax2.nlm.nih.gov		ADO, VAX, and NIH are Ampex and DEC trademarks
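
The temporary-variable idiom mentioned above might look like the following
sketch. The function name and the use of double are assumptions made for the
example; the post itself only gives the one-line expressions.

```c
/* Compute (b + c) + d, naming the inner sum so the order is explicit. */
double add_with_temporary(double b, double c, double d)
{
    double t = b + c;   /* inner sum is stored before it is used again */
    return t + d;       /* d is added only after t has been computed */
}
```

With b = -1e20, c = 1e20 and d = 1.0 this returns 1.0 under IEEE 754 double
arithmetic, whereas grouping c with d first would lose the 1.0. A reader sees
the intended order directly from the two statements.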

pardo@uw-june.UUCP (David Keppel) (01/11/88)

[ if you want to force evaluation order use temporaries ]

3 thoughts about this:

(1) Super-optimizing compilers may get rid of the temporaries.

(2) You can't always (easily) use temporaries when using macros.

(3) Under ANSI-C, parens can be reordered as long as there isn't potential
    for overflow.  To be able to exploit this effectively (to be able to
    freely rearrange parens almost everywhere, which is what you want),
    you need to be able to declare variables as being range-restricted:

	#define	BUFSIZE 256
	typedef /* some type goes here */ a_type;
	a_type	pragma(range 0..BUFSIZE) i, j, k, l;

    so that

	    i = j + (k - l);

    can be freely rearranged if a_type is 16 bits (or even unsigned 9),
    but not if it is 8 bits.

	;-D on  (Ada(TM) is NOT a trademark of dpANS!)  Pardo
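
The range argument in point (3) can be checked by hand. The sketch below does
that bookkeeping in ordinary C; it is not the proposed pragma, and the helper
name and its parameters are hypothetical. It asks whether every intermediate
value of j + (k - l) and of the regrouped (j + k) - l stays inside a type's
limits when j, k and l all lie in [lo, hi].

```c
/* Returns 1 if v lies in [min, max], else 0. */
static int fits(long v, long min, long max)
{
    return v >= min && v <= max;
}

/* Returns 1 if both groupings are overflow-free for operands in [lo, hi]
   stored in a type with range [type_min, type_max]. */
int regroup_is_safe(long lo, long hi, long type_min, long type_max)
{
    long diff_min = lo - hi;        /* smallest possible k - l */
    long diff_max = hi - lo;        /* largest possible k - l */
    long sum_min  = lo + lo;        /* smallest possible j + k */
    long sum_max  = hi + hi;        /* largest possible j + k */
    long res_min  = lo + diff_min;  /* smallest final result */
    long res_max  = hi + diff_max;  /* largest final result */

    return fits(diff_min, type_min, type_max)
        && fits(diff_max, type_min, type_max)
        && fits(sum_min,  type_min, type_max)
        && fits(sum_max,  type_min, type_max)
        && fits(res_min,  type_min, type_max)
        && fits(res_max,  type_min, type_max);
}
```

For operands in 0..256, regroup_is_safe(0, 256, -32768L, 32767L) reports that
a 16-bit signed type is safe, while regroup_is_safe(0, 256, -128L, 127L)
reports that an 8-bit signed type is not.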

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/11/88)

In article <7563@elsie.UUCP> ado@elsie.UUCP (Arthur David Olson) writes:
>This means that when an ANSI C program that
>"needs" such honoring gets ported back to an existing implementation,
>things will break quietly and mysteriously.

Assuming that ANSI C code had a chance of working in an "old C"
environment in the first place!

Actually, my real response to this is that such backward porting
hasn't been a concern of the committee, so far as I can tell, and
rightfully not.  "Old C" is so amorphously ill-defined that there
is no way we can guarantee anything about it.  That's what the
new standard version of the language will be for.  With the
exception of people running unsupported compilers, such as those
on 4.nBSD, most programmers will be porting in a forward direction
quite soon after the final standard is published.  Most C compiler
vendors I know of are either already tracking right behind the
ANSI C drafts, or they have made plans to do so not long after the
standard is published.  One reason for this is that it won't be
long before procurement contracts call for standard conformance.

I also think that many existing implementations pretty much
produce acceptable parentheses behavior (at least for integers),
even according to the new standard.  I refer to those with
"benign" integer overflow.  It is the floating-point world that
stands to gain the most from the new rule, and indeed some new
floating-point code may become non-robust if compiled with
compilers that rearrange evaluation order.  At least this is a
well-known problem, for which there is no acceptable alternative
in "old C".  (Explicit use of temporary variables has been
declared "not acceptable" by many numerical programmers.)

V4039%TEMPLEVM.BITNET@CUNYVM.CUNY.EDU (Stan Horwitz) (01/11/88)

  I have a suggestion which allows numerical programmers to avoid using temporary
variables in their C programs.  Restructure your formula in such a way that C
has no other option but to evaluate it correctly.  This should eliminate most,
but not all, of the need for temporary variables.

   EXAMPLE:  Given arrays X and Y, and XX, XY, and SSXY, the formula
             SSXY = SUM(XY)/N - SUM(X) * SUM(Y) / N where N is the number
             of elements in all of the arrays is part of the method of
             linear least squares regression.  In order to get C to see
             this formula correctly, it can be reorganized to look like
             SSXY = (-SUM(X) * SUM(Y) + SUM(XY)) / N.  In this form, there
             can be no ambiguity (or misinterpretation) as to how the formula
             is to be computed by any compiler.  Oops, almost forgot, the
             function SUM will take an array of size N and return the sum of
             its elements.  Don't worry about what types things are.  Just
             assume everything is integer with the exception of SSXY which
             could be of type float.

  This allows the evaluation of the formula with no need to compute temporary
variables.

  Stan Horwitz
  V4039 at TEMPLEVM.BITNET
  Philadelphia, PA
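
A sketch of the two forms in C, reading the rearranged one as
(-SUM(X) * SUM(Y) + SUM(XY)) / N, the algebraic equivalent of the first. It
uses double throughout rather than the integer types the post suggests, so
that division does not truncate; that choice, the function names, and the
array arguments are all assumptions made for the example.

```c
#include <stddef.h>

/* SUM from the post: add up the n elements of an array. */
double sum(const double *a, size_t n)
{
    double s = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* First form: SUM(XY)/N - SUM(X) * SUM(Y) / N */
double ssxy_original(const double *x, const double *y,
                     const double *xy, size_t n)
{
    return sum(xy, n) / n - sum(x, n) * sum(y, n) / n;
}

/* Rearranged form: (-SUM(X) * SUM(Y) + SUM(XY)) / N */
double ssxy_rearranged(const double *x, const double *y,
                       const double *xy, size_t n)
{
    return (-sum(x, n) * sum(y, n) + sum(xy, n)) / n;
}
```

For X = {1, 2, 3, 4}, Y = {2, 4, 6, 8} and XY = {2, 8, 18, 32} both functions
return -35. The rearranged form performs a single division at the end, so the
only grouping left is the one the parentheses spell out.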

gwyn@brl-smoke.UUCP (01/12/88)

In article <11205@brl-adm.ARPA> V4039%TEMPLEVM.BITNET@CUNYVM.CUNY.EDU (Stan Horwitz) writes:
>   EXAMPLE:  Given arrays X and Y, and XX, XY, and SSXY, the formula
>             SSXY = SUM(XY)/N - SUM(X) * SUM(Y) / N ...

I didn't understand the point of your example, but also please note
that this isn't a very robust formula.  I know better methods for
computing variance, so presumably there are also better computational
formulas for cross-correlations.

I just checked the "Numerical Recipes" book, which didn't help except
to point out that that's not the best general form for computing
linear regression coefficients.  (N should not be hardwired in.)

Perhaps you should explain what "temporary variables" the first form
involves that your rewrite eliminated.

ado@elsie.UUCP (01/12/88)

> Assuming that ANSI C code had a chance of working in an "old C"
> environment in the first place!

At our lab we're doing our best to have code that will work in both
"old C" and "ANSI C" environments, so far without any insurmountable problems.
I think the assumption's safe.

> "Old C" is so amorphously ill-defined that there is no way we can guarantee
> anything about it.

Not so--we can guarantee that no "Old C" compiler honors parentheses,
which is why it's a bad idea to introduce the guarantee into ANSI C.
(Not honoring parentheses is the "clear and unambiguous" existing practice,
and one X3J11 goal, set forth in its Rationale document, is to follow such
practices where they exist.)

> With the exception of people running unsupported compilers, such as those
> on 4.nBSD, most programmers will be porting in a forward direction
> quite soon after the final standard is published.

There are far more exceptions than "unsupported compiler" users--
folks with very legitimate economic ("can't afford it"), political
("the compiler we're using has been stable for four years"), and technical
("the compiler vendors don't have a version for our obsolete,
two-year-old machine") reasons will be sticking (or stuck) with "old C"
for quite a while.

> (Explicit use of temporary variables has been
> declared "not acceptable" by many numerical programmers.)

Here's hoping that at the very least the arguments presented by these
numerical programmers show up in the Rationale document.
-- 
ado@vax2.nlm.nih.gov		ADO, VAX, and NIH are Ampex and DEC trademarks

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/12/88)

In article <7564@elsie.UUCP> ado@elsie.UUCP (Arthur David Olson) writes:
>At our lab we're doing our best to have code that will work in both
>"old C" and "ANSI C" environments, so far without any insurmountable problems.

This is a good practice to follow, although difficult since many of the
essential things such as size_t simply don't exist in a portable way in
"old C".

>(Not honoring parentheses is the "clear and unambiguous" existing practice,
>and one X3J11 goal, set forth in its Rationale document, is to follow such
>practices where they exist.)

And another goal is to remedy clear deficiencies.  If you had had to wade
through all the demands for something called "honoring parentheses", it would
be clear to you that many people perceived the "old C" license to rearrange
a couple of special cases as a clear deficiency.

A perhaps unannounced goal is to get the ANSI and ISO standards to track.
This unfortunately gives extra weight to ISO requests/demands that does
not apply to "ordinary" comments.  I don't think anything can really be
done about this other than to continue dialogues with ISO members.

>folks with very legitimate economic ("can't afford it"), political
>("the compiler we're using has been stable for four years"), and technical
>("the compiler vendors don't have a version for our obsolete,
>two-year-old machine") reasons will be sticking (or stuck) with "old C"
>for quite a while.

But their problem will be mainly a forward-porting one.  It is extremely
unlikely that they will be able to import code written strictly for ANSI
C implementations.  For example, immediately their compilation will barf
when <stdlib.h> cannot be found, or when function prototypes are encountered.

Your situation is different -- because you plan in advance for the code to
fit both environments, you know not to count on parentheses being honored.
You'll therefore use temporary variables, etc. as required.  Such code can
be ported to other "old C" sites, with no problem related to parentheses.

nevin1@ihlpf.ATT.COM (00704A-Liber) (01/12/88)

In article <7563@elsie.UUCP> ado@elsie.UUCP writes:
>To me, the real problem with honoring parentheses is simple:  existing
>implementations of C don't.  This means that when an ANSI C program that
>"needs" such honoring gets ported back to an existing implementation,
>things will break quietly and mysteriously.

Although I am also against honoring parens I can't use your argument.  ANYTHING
using a feature which ANSI C has and existing implementations of C don't have
CANNOT be ported back to existing implementations (although I agree that it
is bad to change the language in such a way that code will break 'quietly').
-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

V4039%TEMPLEVM.BITNET@CUNYVM.CUNY.EDU (Stan Horwitz) (01/12/88)

  The temp variables I am referring to are those needed to evaluate
the different quantities involved in the formula I mentioned.  Temps
for that case could be: TEMP1 = SUM(XY) / N and TEMP2 = SUM(X) * SUM(Y)
so that they can be included in the formula as:  SSXY = TEMP1 - TEMP2 / N.
Of course I know the formula is not robust, it was not intended to be.
It was only meant as a short example to illustrate my point that in order
to avoid the use of temps, sometimes one can play a little mathemagic
(algebra) and rewrite the formula in question in an order which C will
handle properly.  I did not provide the formula with the intention of
advocating its use in numerical computations.


Stan Horwitz
V4039 at TEMPLEVM.BITNET

dhesi@bsu-cs.UUCP (Rahul Dhesi) (01/14/88)

In article <7053@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) 
writes:
>If you had had to wade
>through all the demands for something called "honoring parentheses", it would
>be clear to you that many people perceived the "old C" license to rearrange
>a couple of special cases as a clear deficiency.

I wonder if these requests came from experienced C users.  The
discussion of this topic in comp.lang.c always seemed to me to indicate
that a lot of people do want parens to be honored, but they're mostly
Fortran programmers!

Well, what can we do about this?  Write to ANSI demanding that
Fortran88 convert all characters to ints in expressions, since not
doing so is a clear deficiency?
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

gwyn@brl-smoke.ARPA (Doug Gwyn ) (01/15/88)

In article <1841@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>... a lot of people do want parens to be honored, but they're mostly
>Fortran programmers!

They're mostly NUMERICAL programmers.  Many of them have said that they
would have used C except for its perceived deficiencies for such work,
gratuitous expression reordering being chief among them.