[comp.lang.c] X3J11 meeting notes

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/14/87)

The following are unofficial, incomplete notes about some of the
developments at the December X3J11 meeting:

The good news is, it appears that the second formal public review of
the draft proposed standard for C may start as early as February.
This time it will last for two months, and we hope that nothing will
turn up that requires more than minor editorial changes for the final
official standard.

A new keyword, "noalias", was added; it's a type-qualifier like const
and volatile.  Its only function is to permit tighter optimization, so
in a sense it's like register.  A "noalias" variable is one that the
programmer guarantees will only be accessed via a single "handle", so
the compiler does not have to make worst-case assumptions about pointer
aliasing.

All library routine pointer parameters (except for memmove()) now have
"noalias" added, replacing the English injunction to that effect.

Parenthesis grouping is now honored.  The sentence that permitted
regrouping of commutative & associative operators has been removed.
Unary plus no longer has special grouping semantics.

Pointers to the same object are now guaranteed to compare equal.
All types of null pointer compare equal.  (char *) and (void *)
have the same representation.

Additional multi-byte character stuff was added, most notably L"..."
and L'...' literals.  Catenation of mixed string literal types is
undefined.  "mb_max" was changed to "MB_CUR_MAX".  mbstowcs() and
wcstombs() functions were added to convert entire multibyte strings.
A wchar_t array can be initialized with a L"...".

Additional international monetary and numeric support was added.

NULL and size_t are to be included in any header that references them
in the Standard; other symbols are defined in only one header.

va_start and va_end are both macros, and their ranges (on distinct
va_list data) can overlap.

'0'..'9' are now required to be contiguous and in ascending order, so
the c-'0' construct is guaranteed to work.

stderr always starts out unbuffered.

Signal handlers may now be entered with an implementation-defined
blocking of the signal instead of having it reset to SIG_DFL.

bsearch() passes the key as the first argument to the comparison
function (was previously not specified which argument was which).

"GMT" is changed to "UTC".

Slight tweaks to math routines to accommodate IEEE floating point.

iwb@lan000.UUCP (12/16/87)

Is the change from GMT to UTC or UCT (Universal Coordinated Time)?

		...ihnp4!infoswx!lan000!iwb

alan@mn-at1.UUCP (Alan Klietz) (12/16/87)

In article <6829@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
<The following are unofficial, incomplete notes about some of the
<developments at the December X3J11 meeting:
<
<Parenthesis grouping is now honored.  The sentence that permitted
<regrouping of commutative & associative operators has been removed.
<

This is going to make writing an optimizing compiler more difficult.   

Previously, the compiler only needed to parse parentheses for determining
operation order and grouping in the initial expression tree.  It could
later (in the expression optimization phase) rearrange the tree as necessary.

Now, conforming C compilers will have to "remember" the parentheses by
encoding them into the expression tree.  Optimizations across parenthesis
boundaries will be prohibited.   In particular, the compiler will be
disallowed from rewriting many expressions to fit an idiomatic CPU instruction.

Just ask a FORTRAN compiler writer about parenthesis restrictions in
his language.  Then watch him moan and start to mumble evil incantations at
John Backus.

This is a major change.  Why is it being put in so late in
the review cycle?

--
Alan Klietz
Minnesota Supercomputer Center (*)
1200 Washington Avenue South
Minneapolis, MN  55415    UUCP:  ..rutgers!meccts!mn-at1!alan
Ph: +1 612 626 1836              ..ihnp4!dicome!mn-at1!alan (beware ihnp4)
                          ARPA:  alan@uc.msc.umn.edu  (was umn-rei-uc.arpa)

(*) An affiliate of the University of Minnesota

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/17/87)

In article <406@mn-at1.UUCP> alan@mn-at1.UUCP (0000-Alan Klietz) writes:
>In article <6829@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
><Parenthesis grouping is now honored.  The sentence that permitted
><regrouping of commutative & associative operators has been removed.
>
>This is going to make writing an optimizing compiler more difficult.   
>
>Previously, the compiler only needed to parse parentheses for determining
>operation order and grouping in the initial expression tree.  It could
>later (in the expression optimization phase) rearrange the tree as necessary.
>
>Now, conforming C compilers will have to "remember" the parentheses by
>encoding them into the expression tree.  Optimizations across parenthesis
>boundaries will be prohibited.   In particular, the compiler will be
>disallowed from rewriting many expressions to fit an idiomatic CPU instruction.
>
>Just ask a FORTRAN compiler writer about parenthesis restrictions in
>his language.  Then watch him moan and start to mumble evil incantations at
>John Backus.
>
>This is a major change.  Why is it being put in so late in
>the review cycle?

Unofficial responses, more or less in order:

X3J11 has a lot of optimizing compiler implementors, who agreed
to this change.  They must not think it's a major problem.

The only actual operational change is the removal of the Ritchie
sentence that permitted rearrangement of adjacent mathematically
commutative and associative operations.  The parse tree you build
is no different from before, just certain optimizations may have
to be disabled.  Parentheses still affect just the parsing.

In fact, in many cases (for example, 2's complement machine with
no integer overflow trapping), those sorts of optimizations can
still be performed.  The key is that the result must be the same
"as if" the strict virtual machine operations had been performed
in the non-rearranged order.  Constant folding, for example, often
will not affect the result.  Implementations that trap integer
overflow are prohibited from causing it by rearrangement now;
that actually helps the programmer obtain reliable behavior.

We also removed formulas like "-E is equivalent to 0-E" since
the IEEE floating-point people tell us that this is not necessarily
so for their machines (I will spare you the incredible details).

There have probably been more, and louder, complaints that C
parentheses acted unlike Fortran's than any other single complaint
the committee received.  Finally, an ISO representative simply
refused to sanction the proposed standard unless this was changed.
Too many people (admittedly, many of them misunderstanding the
issue) have been clamoring that they want "Fortran-like parentheses".
After yet another round of discussion, the long-standing X3J11
committee resistance to changing this historical language feature
was weakened sufficiently that the change was adopted.

There are many major changes between the first and second
(forthcoming) public review draft proposed standards.  There
actually is no "review cycle" currently in effect, but be advised
that this change was asked for several times during the first
formal public review, so if you like you can consider it made
"by popular demand".

meissner@xyzzy.UUCP (Michael Meissner) (12/17/87)

In article <406@mn-at1.UUCP> alan@mn-at1.UUCP (0000-Alan Klietz) writes:
| <Parenthesis grouping is now honored.  The sentence that permitted
| <regrouping of commutative & associative operators has been removed.
| <
| 
| This is going to make writing an optimizing compiler more difficult.   
| 
| Previously, the compiler only needed to parse parentheses for determining
| operation order and grouping in the initial expression tree.  It could
| later (in the expression optimization phase) rearrange the tree as necessary.
| 
| Now, conforming C compilers will have to "remember" the parentheses by
| encoding them into the expression tree.  Optimizations across parenthesis
| boundaries will be prohibited.   In particular, the compiler will be
| disallowed from rewriting many expressions to fit an idiomatic CPU instruction.

Actually at the beginning of the document, there is a rule called the "AS-IF"
rule, which says that you are free to do whatever optimizations, as long as
you get the same answer as-if it were done as specified by the standard.
For "normal" 2's complement integer arithmetic with integer traps disabled,
you can still do optimizations across parenthesis boundaries.  The two members
who voted against sending the standard out for a second review wanted
just that freedom, as originally promised in K&R.

| Just ask a FORTRAN compiler writer about parenthesis restrictions in
| his language.  Then watch him moan and start to mumble evil incantations at
| John Backus.

I don't know, just ask your normal user about the compiler ignoring his/her
parentheses, and you'll hear evil incantations directed at the implementor.
 
| This is a major change.  Why is it being put in so late in
| the review cycle?

Because many users (and other standards bodies) objected to the current
practice, especially with regard to floating point:  (1.0E25 + -1.0E25)
+ 1.0 doesn't give the same answer as 1.0E25 + (-1.0E25 + 1.0).  As I
recall, it was one of the issues that lots of users wrote in about (almost
every one asking for parentheses to be honored).
-- 
Michael Meissner, Data General.		Uucp: ...!mcnc!rti!xyzzy!meissner
					Arpa/Csnet:  meissner@dg-rtp.DG.COM

barmar@think.COM (Barry Margolin) (12/18/87)

In article <6852@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>The only actual operational change is the removal of the Ritchie
>sentence that permitted rearrangement of adjacent mathematically
>commutative and associative operations.  The parse tree you build
>is no different from before, just certain optimizations may have
>to be disabled.  Parentheses still affect just the parsing.

This doesn't sound correct to me.  I think most parsers internally
convert formulae into binary trees, so they would typically parse both

	a + b + c   and   (a + b) + c

into

		+
	       / \
	      +   c
	     / \
	    a   b

and then rearrange the tree for optimization purposes.  The new
restriction requires the parenthesization to be explicitly represented
in the parse tree, perhaps as a flag in the operator node, so that the
second expression would be parsed into

		+
	       / \
	     (+)  c
	     / \
	    a   b

The code generator would know that operations below a (<op>) node
cannot be moved above it.

Of course, this type of parse tree node was already necessitated by
unary plus.  It is just a matter of changing the parsers to create it
for all parenthesized expressions.

>In fact, in many cases (for example, 2's complement machine with
>no integer overflow trapping), those sorts of optimizations can
>still be performed.  The key is that the result must be the same
>"as if" the strict virtual machine operations had been performed
>in the non-rearranged order.  Constant folding, for example, often
>will not affect the result.  Implementations that trap integer
>overflow are prohibited from causing it by rearrangement now;
>that actually helps the programmer obtain reliable behavior.

If overflow trapping exists, and equivalent behavior is required, very
little optimization can be done.  Consider

	(a + 1) - 1

When a is the highest value it can assume, this will cause overflow
under the new rules, but the old rules allow constant folding across
the parens, which translates the above to "a", which can never cause
overflow.

The biggest problem I see with this requirement is due to macros.  In
order to implement macros that expand into arithmetic operations and
guarantee that they are well behaved it is necessary to use a level of
parentheses, e.g.

#define one_more_than(x) (x + 1)

If the parentheses are omitted, as in

#define one_more_than(x) x + 1

then things like this happen:

	(2 * one_more_than(3)) != (one_more_than(3) * 2)

This standard change effectively disallows many optimizations of
expressions that make use of macros.

---
Barry Margolin
Thinking Machines Corp.

barmar@think.com
seismo!think!barmar

rudell@beeblebrox.uucp (Richard Rudell) (12/18/87)

Simple comment on re-arrangement of integer expressions ...

Doesn't ANSI leave the effect of integer overflow implementation defined?
(Someone please correct me if this has also changed since the last
review draft.)

Doesn't this mean that for integer expressions such as:

	(a - 1) + 1

the compiler is still free to re-arrange the parentheses ?  The claim
has been made "but if integer traps are enabled, the compiler can't
change this".  But an ANSI conforming compiler is free to do whatever
it wants whenever it wants when integer overflow occurs.  Thus, an ANSI
compiler can re-arrange these parentheses and still meet the "as if"
condition which controls the entire interpretation of the ANSI
C specification.  

The point is, ANSI never promised what might happen if 'a' was the most 
positive integer.  Has this interpretation also changed ?

Rick.

cmt@myrias.UUCP (Chris Thomson) (12/18/87)

Doug Gwyn writes:
> There are many major changes between the first and second
> (forthcoming) public review draft proposed standards.  There
> actually is no "review cycle" currently in effect, but be advised
> that this change was asked for several times during the first
> formal public review, so if you like you can consider it made
> "by popular demand".

Pardon my English, but there sure as hell is a review cycle going on: ISO,
and all its member nations (except apparently the USA) are reviewing the May
15 Draft C submitted to ISO by ANSI for review.  Yes, I know that the latest
draft is November 9, and that there was another one in the middle.  I am
astonished to discover that there are "many major changes" in yet another
forthcoming draft.  This is incredible.  Not only is the committee inventing
instead of standardizing existing practice, but it is subverting the review
and consensus process while it is at it.

Perhaps the X3J11 committee should reach internal consensus before jerking
around the entire world on a wild-goose-chase review of a document that
is obsolete before it is even distributed.
-- 
Chris Thomson, Myrias Research Corporation	   alberta!myrias!cmt
900 10611 98 Ave, Edmonton Alberta, Canada	   403-428-1616

levy@ttrdc.UUCP (Daniel R. Levy) (12/19/87)

In article <13899@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
> The biggest problem I see with this requirement [that C honor parenthesis
> groupings in expression evaluation] is due to macros.  In
> order to implement macros that expand into arithmetic operations and
> guarantee that they are well behaved it is necessary to use a level of
> parentheses, e.g.
> 
> #define one_more_than(x) (x + 1)
> 
> If the parentheses are omitted, as in
> 
> #define one_more_than(x) x + 1
> 
> then things like this happen:
> 
> 	(2 * one_more_than(3)) != (one_more_than(3) * 2)
> 
> This standard change effectively disallows many optimizations of
> expressions that make use of macros.

This would be a better example if instead of
	... one_more_than(3) ...
you had said
	int i;
	...
	i = <some non constant expression which evaluates to 2>;
	...
	... one_more_than(i) ...

In the first case, using the parens around (x + 1) would not harm the
compiler's ability to optimize, because the "as if" rule would allow the
constants to be folded in actual evaluation.  In the second case, this
could conceivably cripple optimization of macros.

This could be kludged around however by making the C compiler ignore one level
of parens (insofar as they require the compiler not to regroup associative
and commutative operators) which come from macros.  The macro would need
to double-paren an expression, e.g., ((x + 1)), to ensure the non-regrouping
evaluation behavior.  In a system with a separate preprocessor and compiler,
this could be done by having the preprocessor turn all single parens in
expressions into double parens (except those coming from macros and those
around a function call or declaration argument list) and having the actual
compiler recognize this special meaning of double parens versus single parens.

Can anyone come up with a reasonable example of a macro that would be badly
hurt in efficiency if the proposed rule to honor parentheses (except where
the result would be the same) were implemented?
-- 
|------------Dan Levy------------|  Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
|         an Engihacker @        |  	<most AT&T machines>}!ttrdc!ttrda!levy
| AT&T Computer Systems Division |  Disclaimer?  Huh?  What disclaimer???
|--------Skokie, Illinois--------|

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/19/87)

In article <13899@think.UUCP> barmar@sauron.think.com.UUCP (Barry Margolin) writes:
>and then rearrange the tree for optimization purposes.

That's what is NOT allowed for expressions, in the general case.
There is no need to add special flagging for parentheses.  If your
implementation takes advantage of a non-overflowing 2's complement
assumption, for example, then, yes, you may then need special
markers in the parse tree, although I'm not sure that you would if
you limit yourself to permitted optimizations.

>Of course, this type of parse tree node was already necessitated by
>unary plus.

Only if you were availing yourself of the license "the sentence"
granted you to rearrange certain expressions.  With that license
gone, in the general case there is no need for special stuff for
unary plus, either.  It is now really just a no-op (when legal).

>If overflow trapping exists, and equivalent behavior is required, very
>little optimization can be done.

Yes, that's right.  The committee came down on the side of the
programmer rather than the implementor, on this issue at least.

I think if you could rearrange so as to eliminate possible
overflow without introducing one where it couldn't have occurred
for the "virtual machine", then you're allowed to perform that
rearrangement, since you're into undefined-behavior territory,
where one of your options is to "do the right thing".

>The biggest problem I see with this requirement is due to macros.  In
>order to implement macros that expand into arithmetic operations and
>guarantee that they are well behaved it is necessary to use a level of
>parentheses, ...

Yes, that was one of the main counter-arguments the committee had
been using against dropping "the sentence".  However, on most
architectures the constant folding can still occur (and it WILL
occur in "constant expressions").  In the other cases, usually the
parentheses actually group a logical structure together, that is,
one that needs to have valid meaning in its own right, so
permitting overflow for it would not be justifiable.

Thanks for the comments, but in fact the committee was aware of
these things and chose to yield to "popular demand" to do something
about "parentheses".  Simply dropping "the sentence" was the best
solution.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/19/87)

In article <2149@ucbcad.berkeley.edu> rudell@beeblebrox.UUCP (Richard Rudell) writes:
>	(a - 1) + 1
>The point is, ANSI never promised what might happen if 'a' was the most 
>positive integer.

No, as it now stands it is not legal for an ANSI C implementation to
introduce a (non-benign) overflow by rearranging an expression.
Although what happens on an overflow is undefined, it is not correct
to cause an overflow in violation of what the programmer actually
specified.  Thus, in the above example, if at run time `a' happens to
have the most negative possible value, an overflow is permitted (but
not required); however, for no other value of `a' is an overflow
permitted.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/19/87)

In article <544@myrias.UUCP> cmt@myrias.UUCP (Chris Thomson) writes:
>Doug Gwyn writes:
>> There are many major changes between the first and second
>> (forthcoming) public review draft proposed standards.  There
>> actually is no "review cycle" currently in effect, but be advised
>> that this change was asked for several times during the first
>> formal public review, so if you like you can consider it made
>> "by popular demand".
>Pardon my English, but there sure as hell is a review cycle going on: ISO,
>and all its member nations (except apparently the USA) are reviewing the May
>15 Draft C submitted to ISO by ANSI for review.  Yes, I know that the latest
>draft is November 9, and that there was another one in the middle.  I am
>astonished to discover that there are "many major changes" in yet another
>forthcoming draft.  This is incredible.  Not only is the committee inventing
>instead of standardizing existing practice, but it is subverting the review
>and consensus process while it is at it.

If ISO really has gotten so "hot to trot" that they're trying to
proceed with international standardization of C before the work
of the American standardization effort is complete, then really
it's their own fault.  The X3J11 representative to ISO is trying
to ensure that the two efforts remain on the same track.  The
original ISO idea was that the ANS should be able to be adopted
as the ISO standard.  Indeed, many of the major changes introduced
into the proposed ANS, for example the locale, multi-byte, and
parentheses business, were in direct response to ISO requests
and/or demands.

For your information, X3J11 is OBLIGED to make "major changes" to
address deficiencies turned up by the first formal public review.
We received public comments about aliasing/optimization issues,
and at the last meeting finally got around to finding a proposed
solution for them.  X3J11 is allowed, indeed required, to invent
whenever it is necessary to remedy a clear deficiency.  We do try
to avoid invention under other circumstances.

Actually I don't think the parentheses and noalias stuff were
really "major changes", but that's beside the point.

>Perhaps the X3J11 committee should reach internal consensus before jerking
>around the entire world on a wild-goose-chase review of a document that
>is obsolete before it is even distributed.

I wish you would explain what the hell you are talking about.
If you're upset at reviewing ISO documents, then holler at them,
not at X3J11.  We're trying to do our assigned job the best we can.

There has been only one review of the X3J11 document, about a year
ago, and one more is planned to start near the beginning of
February.  Very little of the change between the only two
"published" X3J11 drafts can correctly be attributed to anything
other than addressing issues resulting from the first review.

P.S.  These are of course my own remarks; they do not necessarily
reflect the opinions of other X3J11 members, nor do they
constitute any sort of official committee position.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/20/87)

In article <16100001@lan000> iwb@lan000.UUCP writes:
>Is the change from GMT to UTC or UCT (Universal Coordinated Time)?

ISO asked for "UTC" ("Universal Time Coordinated").  To me, it sounds
like the French messing around with things.  I honestly don't know the
correct official terminology for international time standards.
Somebody else suggested "UT0".  I would appreciate an authoritative
answer to the question "What is the correct name to use in speaking
of the international Universal Time standard?"  I was going to drive
down to the NBS and find out, but I didn't have time.

lmiller@venera.isi.edu (Larry Miller) (12/22/87)

In article <6887@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <16100001@lan000> iwb@lan000.UUCP writes:
>>Is the change from GMT to UTC or UCT (Universal Coordinated Time)?
>
>ISO asked for "UTC" ("Universal Time Coordinated").  To me, it sounds
>like the French messing around with things.  I honestly don't know the
>correct official terminology for international time standards.

"Coordinated Universal Time"

You can see the problems with the abbreviation for this.

Larry Miller				lmiller@venera.isi.edu (no uucp)
USC/ISI					213-822-1511
4676 Admiralty Way
Marina del Rey, CA. 90292

billc@prism.UUCP (12/22/87)

> ISO asked for "UTC" ("Universal Time Coordinated").  To me, it sounds
> like the French messing around with things.  I honestly don't know the
> correct official terminology for international time standards.
> Somebody else suggested "UT0".  I would appreciate an authoritative
> answer to the question "What is the correct name to use in speaking
> of the international Universal Time standard?"  I was going to drive
> down to the NBS and find out, but I didn't have time.

	Found this in sci.astro:

> UTC (Coordinated Universal Time) is measured by atomic clocks.
> UT1 or UT (Universal Time) is the mean solar time at Greenwich.

meissner@xyzzy.UUCP (Michael Meissner) (01/06/88)

In article <4374@venera.isi.edu> lmiller@venera.isi.edu.UUCP (Larry Miller) writes:
| In article <6887@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
| >In article <16100001@lan000> iwb@lan000.UUCP writes:
| >>Is the change from GMT to UTC or UCT (Universal Coordinated Time)?
| >
| >ISO asked for "UTC" ("Universal Time Coordinated").  To me, it sounds
| >like the French messing around with things.  I honestly don't know the
| >correct official terminology for international time standards.
| 
| "Coordinated Universal Time"
| 
| You can see the problems with the abbreviation for this.

As I understand it, at the ISO level, there are three official languages
used: English, French, and Russian.  The committees have to go to great
pains to make sure none of the above languages is favored by
abbreviations that make sense in one language but not in another, hence
UTC for Coordinated Universal Time.  I forget offhand what ISO
stands for (but it sure isn't International Standards Organization).
Anyway, this is off the subject of C.
-- 
Michael Meissner, Data General.		Uucp: ...!mcnc!rti!xyzzy!meissner
					Arpa/Csnet:  meissner@dg-rtp.DG.COM

dmt@ptsfa.UUCP (Dave Turner) (01/07/88)

In article <531@xyzzy.UUCP> meissner@xyzzy.UUCP (Michael Meissner) writes:
>In article <4374@venera.isi.edu> lmiller@venera.isi.edu.UUCP (Larry Miller) writes:
>| In article <6887@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>| >In article <16100001@lan000> iwb@lan000.UUCP writes:
>
>UTC for Coordinated Universal Time.  I forget offhand what ISO
>stands for (but it sure isn't International Standards Organization).

ISO is the International Organization for Standardization.

Seems to be a pattern here:

	UCT is called UTC
	IOS is called ISO

Someone must be applying a postfix to infix operator to the names. :-)


-- 
Dave Turner	415/542-1299	{ihnp4,lll-crg,qantel,pyramid}!ptsfa!dmt