[comp.lang.c] Some questions about ANSI C

daw@houxs.UUCP (D.WOLVERTON) (08/06/87)

I have some questions regarding the ANSI C standardization process
that I haven't seen addressed in comp.lang.c:

1) I read recently (in the C Journal?) that the name space available
   to users might be increased to include some identifiers which begin
   with an underscore.  Has this been discussed by the Committee?

2) Is a report forthcoming regarding the results of the Paris meeting?
   [Perhaps from Doug?]

3) What is the current timetable for standardization?

4) dpANSI C, as I read it, requires that a conforming hosted implementation
   contain *no* declarations (including prototypes) or #defines for 
   user-space identifiers not defined by the Standard, in order to avoid name 
   space pollution.  This implies that declarations for library functions 
   supplied by an implementation but not defined by the Standard which 
   currently exist in one of the standard headers must be moved elsewhere.
   Where does the Committee or P1003 (POSIX) plan to put these gypsy 
   declarations?  Won't this break existing (UNIX application) code?

I would also like to take this opportunity to thank those on the net who
take the time to explain all those picky little details of C.

--------------------------
David Wolverton
..!houxs!daw

"A little song, a little dance, a little seltzer in your pants." - Monty Python

gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/07/87)

In article <497@houxs.UUCP> daw@houxs.UUCP (D.WOLVERTON) writes:
>1) I read recently (in the C Journal?) that the name space available
>   to users might be increased to include some identifiers which begin
>   with an underscore.  Has this been discussed by the Committee?

In the Boulder meeting, it was decided to reserve to implementors the
names _* for externals, _[A-Z_]* for all.  This essentially reserves
to applications the names _[a-z]* for non-externals, such as structure
member names, macros, etc.
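
As a rough illustration of the Boulder rule, here is a minimal sketch of a
classifier.  The function names are invented for this example, and the code
encodes only the rule as summarized above, not the wording of the draft.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if `name` matches _[A-Z_]*, i.e. it is
   reserved to the implementation in every context. */
int reserved_everywhere(const char *name)
{
    assert(name != NULL);
    return name[0] == '_' &&
           (name[1] == '_' || isupper((unsigned char)name[1]));
}

/* Hypothetical helper: nonzero if `name` matches _*, i.e. it is
   reserved when used as an external identifier. */
int reserved_as_external(const char *name)
{
    assert(name != NULL);
    return name[0] == '_';
}

/* Nonzero if the underscore rule leaves `name` available to an
   application for a non-external such as a member name or macro. */
int usable_for_non_external(const char *name)
{
    return !reserved_everywhere(name);
}
```

So a structure member called _count is fair game for an application, while
_Count and __count are not, and none of the three may be used as the name of
an external.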

>2) Is a report forthcoming regarding the results of the Paris meeting?
>   [Perhaps from Doug?]

I wasn't at the Paris meeting, but I recently received the minutes,
so I can (unofficially) summarize significant decisions, which I'll
do at the end of this article.  For more information, check the C
Journal or other trade publications.

>3) What is the current timetable for standardization?

It is desired to have the next draft ready for the second public review
by the end of the September meeting, but there is considerable doubt
that it can be wrapped up by then.  The intention is for the second
public review draft to be technically solid enough that no substantive
changes will be required as a consequence of the second review.  Of
course, if it happens that a serious problem is uncovered, it will be
fixed; I don't recall whether that would necessitate a third review
but we're hoping to avoid the delay that would cause.  In any case, the
second review draft should be prepared by the end of 1987, and would go
out for public comment officially a month or two after it is approved.
Therefore you should be able to get a review copy either around November
1987 or February 1988, if all goes as planned.  After responses are
prepared to the second review (which would take at least one, probably
two more quarterly meetings), assuming there were no substantive changes
the final standard would be submitted to X3 etc., which means that the
form of the final standard would be known precisely (although perhaps
available only through unofficial channels for a while) in the second
half of 1988.  At least, that's my personal estimate.

>4) dpANSI C, as I read it, requires that a conforming hosted implementation
>   contain *no* declarations (including prototypes) or #defines for 
>   user-space identifiers not defined by the Standard, in order to avoid name 
>   space pollution.  This implies that declarations for library functions 
>   supplied by an implementation but not defined by the Standard which 
>   currently exist in one of the standard headers must be moved elsewhere.
>   Where does the Committee or P1003 (POSIX) plan to put these gypsy 
>   declarations?  Won't this break existing (UNIX application) code?

The name space collision issue is very important, and is one of the
things I hope to address at the October Berkeley POSIX workshop.
I won't go into all the gory details, but the outcome is, yes, the
implementation must not interfere with the application's freedom to
use names that the standard does not dictate or reserve for the
implementation.  However, this is not quite as severe as it would at
first seem.  For example, additional <> headers can be provided, and
extra C library functions and extern data can be provided -- so long
as the implementation of the X3J11-mandated functions does not itself
use them (it could, however, follow my lead and have fopen() call
_open() rather than open(), the latter of which could be provided to
support applications that want to use it).  The point is, an
application must be perfectly free to define and use its own open(),
gethostbyname(), etc. functions without breaking the semantics of
the library routines.  Insufficient care has been paid to this in the
past, but now that we (almost) have standards, we need to fix our
implementations.

A POSIX implementation could also use an additional library, although
this is not necessary (nor, in my opinion, desirable).

-----

Paris X3J11 meeting results (unofficially extracted from minutes):

Larry Rosler turned over the job of redactor to Dave Prosser.

The requirement that file-level statics have "static" attached to
their first appearance was reaffirmed.

Library headers must be included only outside of declarations.

Freestanding environments must support all C source characters in
the target character set.

Hexadecimal escape sequences \xhhh may contain an arbitrary number
(1 or more) of hex digits.

Funny characters in #include and #line file names have undefined
effect.

Structs, unions, and enums match across different translation units
if all their members match (including names).

Setjmp must be a macro.

Storage class not first in a declaration is "obsolescent".

%0 for zero fill in *printf() is no longer obsolescent.

Macro for largest permissible filename to be added to <stdio.h>.

Representation for enums is implementation-defined.

Bill Plauger's proposal for minimal support for multi-byte characters
(via a letter_t typedef, a few new functions, and some slight wording
changes) was approved in principle, wording to be arranged.  This is
much less of a change to the existing draft than alternative proposals
such as mine, but indications are that it will be enough to satisfy
the Japanese who rejected the draft because of the lack of some such
facility.

X3J11 finally admitted that there was no guaranteed "safe" limit to
the size of the converted string produced by strxfrm(), and suitable
revisions to its specs were made.

Sense of the committee was that some way should be provided to save/
restore tailored/partial locales, but no motion was passed.

Ditto that parentheses should force grouping, motion needed.

Ditto that currency locale (not yet approved) should be empty for the
"C" locale, not US$.

Aliasing restrictions:  An lvalue must match the type of the object
it accesses, or the type with or without the unsigned attribute, or
the type of an aggregate that contains the type, or be of a char type.

It was noticed that the draft allows assigning to a struct containing
const members, but no fix has yet been decided.

There were numerous other, comparatively minor, changes and edits,
and several proposals were discussed and rejected.  The list above
is simply the things I decided were worth mentioning here.

Reminder:  I am not an official spokesthing for X3J11.

karl@haddock.ISC.COM (Karl Heuer) (08/07/87)

In article <497@houxs.UUCP> daw@houxs.UUCP (D.WOLVERTON) writes:
>4) dpANSI C, as I read it, requires that a conforming hosted implementation
>   contain *no* declarations [for non-ANSI objects, like UNIX functions].
>   Where does the Committee or P1003 (POSIX) plan to put these gypsy 
>   declarations?  Won't this break existing (UNIX application) code?

One solution is to leave them in the standard headers, but enclosed in
"#ifdef __unix_extensions".

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

kenny@uiucdcsb.cs.uiuc.edu (08/09/87)

Thanks a lot, Doug, for posting your summary of the Paris meeting; I
hadn't seen anything at all on it before this.  Can you (or someone
else ``in the know'') clarify a couple of the statements you made?
[Please don't flame -- I know that this isn't the forum for comments;
I'm just trying to get my ducks in a row for when the draft goes back
to the public for comments once again.]

/* Written  7:47 am  Aug  7, 1987 by gwyn@brl-smoke.ARPA in uiucdcsb:comp.lang.c */
>	Library headers must be included only outside of declarations.
Uh, this wasn't mandated before?  It seems kind of obvious....

>	Hexadecimal escape sequences \xhhh may contain an arbitrary number
>	(1 or more) of hex digits.
Is there some accepted way to announce that a hex escape is at an end?
For instance, I might want to describe the ASCII sequence ESC 'B' as
"\x1bB"; will this have to be ("\x1b" "B") as the spec now stands?

>	Setjmp must be a macro.
Does this really translate to ``Setjmp need not exist as a library
function, but may be implemented in macro form only?'' Requiring the
implementor to include something like
	extern int _Setjmp_ (jmp_buf);
	#define	setjmp(jb)	(_Setjmp_((jb)))
in <setjmp.h> seems just a trifle silly.

>	[agreed in principle] that parentheses should force grouping,
>	motion needed.
Doing away with unary + once again?  I can live with either solution,
(although I mildly prefer the unary +) but the waffling on this issue
is starting to get annoying.

>	Aliasing restrictions:  An lvalue must match the type of the object
>	it accesses[,] or the type with or without the unsigned attribute, or
>	the type of an aggregate that contains the type, or be of a char type.
	            ^^^^^^^^^^^^ Shouldn't this read ``a union''?

Reminder:  I am not an official spokesthing for X3J11.
/* End of text from uiucdcsb:comp.lang.c */

gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/10/87)

In article <165600007@uiucdcsb> kenny@uiucdcsb.cs.uiuc.edu writes:
>>	Library headers must be included only outside of declarations.
>Uh, this wasn't mandated before?  It seems kind of obvious....

Obviously, it must not have been mandated before.
I myself have occasionally #included headers inside function bodies,
when I knew that it would not cause a problem.
(I promise not to do that any more, though!)

>>	Hexadecimal escape sequences \xhhh may contain an arbitrary number
>>	(1 or more) of hex digits.
>Is there some accepted way to announce that a hex escape is at an end?
>For instance, I might want to describe the ASCII sequence ESC 'B' as
>"\x1bB"; will this have to be ("\x1b" "B") as the spec now stands?

Yes, or else you can write it using an octal constant.
This was a compromise to remove an unnecessary restriction with
minimum impact on existing code (which may use octal escapes, but
not hexadecimal, which is a new addition to the language).
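
A minimal sketch of the two spellings mentioned above, assuming an ASCII
execution character set so that ESC is character 27.  The array and function
names are invented for the example.

```c
#include <string.h>

/* ESC followed by 'B', written as two adjacent string literals so that
   the hex escape ends at the closing quote of the first one. */
static const char esc_b_hex[] = "\x1b" "B";

/* The same two characters with an octal escape, which takes at most
   three digits and therefore stops before the 'B'. */
static const char esc_b_octal[] = "\033B";

/* Nonzero if both spellings produce identical arrays. */
int escapes_agree(void)
{
    return sizeof esc_b_hex == sizeof esc_b_octal &&
           memcmp(esc_b_hex, esc_b_octal, sizeof esc_b_hex) == 0;
}

/* Length of the hex-escape spelling, not counting the terminator. */
size_t esc_b_length(void)
{
    return strlen(esc_b_hex);
}
```

Both arrays hold ESC, 'B' and a terminating null.  Writing "\x1bB" as a single
literal would instead ask for one character with the value 0x1bB.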

>>	Setjmp must be a macro.
>Does this really translate to ``Setjmp need not exist as a library
>function, but may be implemented in macro form only?'' Requiring the
>implementor to include something like
>	extern int _Setjmp_ (jmp_buf);
>	#define	setjmp(jb)	(_Setjmp_((jb)))
>in <setjmp.h> seems just a trifle silly.

I don't have the exact wording that's going into the draft,
but I think the problem is that the argument to setjmp() is being
passed by name, which is possible for macros but not for functions.
(This is based on the traditional implementation of a jmp_buf as an
array, which does not behave the same as other types as a typedef,
coupled with the observation that some implementations would be
better off with a non-array for their jmp_bufs.)
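
To make the array-versus-other-types difference concrete, here is a small
sketch using two invented stand-in types.  Neither is taken from a real
<setjmp.h>; they only show what happens when each kind of typedef is passed
to an ordinary function.

```c
/* Hypothetical stand-ins for two ways a jmp_buf might be defined. */
typedef int array_buf[4];
typedef struct { int regs[4]; } struct_buf;

/* An array parameter is adjusted to a pointer, so the callee writes
   into the caller's object. */
void fill_array(array_buf b) { b[0] = 42; }

/* A struct parameter is copied, so the callee writes only into its
   own private copy. */
void fill_struct(struct_buf b) { b.regs[0] = 42; }

/* Nonzero: the caller's array really was modified. */
int array_was_filled(void)
{
    array_buf a = {0};
    fill_array(a);
    return a[0] == 42;
}

/* Zero: the caller's struct is untouched after the call. */
int struct_was_filled(void)
{
    struct_buf s = {{0}};
    fill_struct(s);
    return s.regs[0] == 42;
}
```

A function declared as setjmp(jmp_buf) could therefore record the environment
in the caller's buffer only if jmp_buf is an array, whereas a macro can take
the address of its argument whatever the type is.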

>>	[agreed in principle] that parentheses should force grouping,
>>	motion needed.
>Doing away with unary + once again?  I can live with either solution,
>(although I mildly prefer the unary +) but the waffling on this issue
>is starting to get annoying.

The problem is, the public keep complaining about unary + and/or
the lack of grouping for parens.  I think X3J11 must have decided
that the "as if" rule still permits most ordinary regrouping of
parenthesized expressions even if the spec requires that they
imply execution order.  Perhaps someone who was at the meeting
can explain how the discussion went.

>>	Aliasing restrictions:  An lvalue must match the type of the object
>>	it accesses[,] or the type with or without the unsigned attribute, or
>>	the type of an aggregate that contains the type, or be of a char type.
>	            ^^^^^^^^^^^^ Shouldn't this read ``a union''?

It's possible that other constraints force it to be a union;
I really don't know.  The above wording was taken directly
from the (unofficial) minutes; perhaps it is incorrectly
recorded.  I have a dim recollection that certain allowances
were made for pointers to structs containing a type as their
first member being assignment-compatible with the type, or
something like that.  Perhaps that's what is intended here.

kenny@uiucdcsb.cs.uiuc.edu (08/13/87)

Doug Gwyn, describing Paris conference results (DG>) and me, asking
stupid questions (KK>):

DG>Setjmp must be a macro.

KK>Does this really translate to ``Setjmp need not exist as a library
KK>function, but may be implemented in macro form only?'' Requiring the
KK>implementor to include something like
KK>	extern int _Setjmp_ (jmp_buf);
KK>	#define	setjmp(jb)	(_Setjmp_((jb)))
KK>in <setjmp.h> seems just a trifle silly.

DG>I don't have the exact wording that's going into the draft,
DG>but I think the problem is that the argument to setjmp() is being
DG>passed by name, which is possible for macros but not for functions.
DG>(This is based on the traditional implementation of a jmp_buf as an
DG>array, which does not behave the same as other types as a typedef,
DG>coupled with the observation that some implementations would be
DG>better off with a non-array for their jmp_bufs.)

Gosh, how did I miss that?  I must be getting really sloppy in my old
age (after all, I remember when structs were passed by reference,
too... 8-) ).

Isn't the problem really that setjmp accepts a jmp_buf rather than a
jmp_buf *?  (Ditto longjmp, I suppose, although it's less obvious that
a value parameter couldn't be used).  It seems that perhaps the spec
should simply be altered to change the data type.  It would cause
existing code to lint incorrectly, but otherwise would, as far as I
know, not break anything; it's hard to imagine an implementation so
perverse that it doesn't use the same representation for ``foo (*)[]''
as it does for ``foo *''.

I will concede that the ice is thin here, but we otherwise condemn
future generations of C programmers to learning that all library
function parameters are passed by value, oh yes, except for setjmp,
which is REALLY a macro that coerces its parameter to being passed by
name, oh yes, which means that its parameter must be an lvalue, not an
rvalue like those of all other functions.

Any comments, from either Doug or the peanut gallery?

DG>Aliasing restrictions:  An lvalue must match the type of the object
DG>it accesses[,] or the type with or without the unsigned attribute, or
DG>the type of an aggregate that contains the type, or be of a char type.
KK>            ^^^^^^^^^^^^ Shouldn't this read ``a union''?

DG>It's possible that other constraints force it to be a union;
DG>I really don't know.  The above wording was taken directly
DG>from the (unofficial) minutes; perhaps it is incorrectly
DG>recorded.  I have a dim recollection that certain allowances
DG>were made for pointers to structs containing a type as their
DG>first member being assignment-compatible with the type, or
DG>something like that.  Perhaps that's what is intended here.

I get the idea.  Perhaps instead of ``an aggregate that contains the
type'' we should say ``a union that contains the type, or a structure
having the type as its first member,'' and then also expand the
verbiage that any of *those* may in turn have the unsigned attribute
added or removed.  (There has got to be a cleaner way to express this.
But it's 1:30 in the morning and I'm only half sentient.)

karl@haddock.ISC.COM (Karl Heuer) (08/13/87)

In article <6265@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn) writes:
>[Hex escapes, \xNNNN, may be arbitrarily long; thus a sequence like ESC B
>cannot be written "\x1bB".  Use "\x1b" "B",] or else you can write it using
>an octal constant.

I would prefer some way to forcibly terminate, perhaps "\x(1b)B".  I hope the
committee is still open to change on this issue.

>>>	Setjmp must be a macro.
>I don't have the exact wording that's going into the draft, but I think the
>problem is that the argument to setjmp() is being passed by name, which is
>possible for macros but not for functions.  (This is based on the traditional
>implementation of a jmp_buf as an array, which does not behave the same as
>other types as a typedef, coupled with the observation that some
>implementations would be better off with a non-array for their jmp_bufs.)

Can't such an implementation use a one-element array containing a struct?  Or
is there a more subtle problem here?

>>>	[agreed in principle] that parentheses should force grouping,
>>>	motion needed.
>The problem is, the public keep complaining about unary + and/or the lack of
>grouping for parens.  I think X3J11 must have decided that the "as if" rule
>still permits most ordinary regrouping ...

Arghh.  One reason I like the unary-plus idea is that it allows the programmer
to distinguish between precedence control and evaluation order.  It's a
documentation aid.  Also, I suspect that many compilers will zealously regroup
even when the as-if rule does not strictly apply; the presence of an explicit
marker -- be it unary-plus, a cast into volatile, or a re-overloading of the
keyword "break" :-) -- can be used as a warning sign.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

am@cam-cl.UUCP (08/14/87)

In article <165600007@uiucdcsb> kenny@uiucdcsb.cs.uiuc.edu (asking for
justifications and the like of Doug Gwyn's list of ANSI changes) writes:
>>	Setjmp must be a macro.
>Does this really translate to ``Setjmp need not exist as a library
>function, but may be implemented in macro form only?'' Requiring the
>implementor to include something like
>	extern int _Setjmp_ (jmp_buf);
>	#define	setjmp(jb)	(_Setjmp_((jb)))
>in <setjmp.h> seems just a trifle silly.
>
I agree Doug's original wording was odd (the wording for assert which
has similar properties seems better) but...
I would exactly like to see the above _Setjmp_ oddity present in <setjmp.h>
for the following reasons:
1. The above setjmp causes all occurrences of setjmp NOT IN FUNCTION CONTEXT
to be faulted.
   E.g.      main() { foo(setjmp); }
would produce the message 'undeclared variable setjmp'
which thereby polices the standard better than alternatives.
2. As support for why I wish to inhibit such strange calls (partly out
of pure legalistic malice, but also:)
   Consider the function:
   void oddity(int f(jmp_buf))
   {   static jmp_buf t;
       if (f(t)) g(f); else h(f);
   }
   An optimisation for such functions is to do tail recursion removal
   I.e. treat the code for oddity as
     <call f>
     <test result>
     <bnz g>
     <bz h>
  This is really quite a useful optimisation, saving code space, stack
  space and time.  Exact validity conditions need to be considered well!
  Now, the punchline: if f() in the above could be setjmp, and it
  were legal to call it via such a mechanism (e.g. as permitted by Oct 86
  draft) then I cannot do this optimisation.  
  (The stack-frame for oddity would be overwritten by the call to h() and
  the attempt to restore it in longjmp would fail catastrophically).
  The 'solution' used in our compiler is to disable tail-recursion
  optimisation if a call to setjmp or a variable function occurs, but
  therefore I'm very happy to see this ANSI change.

tom@hcrvx1.UUCP (Tom Kelly) (08/17/87)

In article <6265@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>)
writes about the latest ANSI C meeting.

I was at the meeting in Paris, so perhaps I can provide some context
to the discussion.

The setjmp() issue arose because of a public comment from Symbolics,
Inc.  On this machine, it is possible to generate the code for
setjmp() inline at the point of call, but it is not possible
for the setjmp() function to determine the necessary information
about its caller.  Thus, setjmp() cannot be implemented as a
function on this machine.

Since setjmp() was a function, it was possible to take the address
of it, and invoke it via a pointer.  This led to an impossible
situation for the commenter.  The suggestion was to make it illegal
to take the address of setjmp(), so that the compiler would always
know when it was being invoked.

The committee felt this was a reasonable request, and that it
was unlikely that very much present or future code would be
seriously harmed by the inability to manipulate setjmp() as
an ordinary function.

As far as the aliasing issue:

A question arose (from various public comments, including one from
Stallman) regarding legal aliasing.  Boiled down to the essentials,
given:

	double *dp; int *ip;

	*ip = 1;
	*dp = 3.0;

Can the compiler deduce after these two statements that *ip has
not been changed, and still contains 1?

The change to the standard is intended to make it so the answer is
yes.  It sets out the cases in which object references can refer
to the same object and the compiler must still get "the right
answer."

Because of various aspects of the C language, and common programming
practice, certain special cases had to be provided in that aliasing
in these cases is permitted.  The struct case arises as follows:

	struct tag {
		int i;
		double d;
		char c;
	} s, t;

	double *dp;

	dp = &(s.d);
	*dp = 3.0;
	s = t;
	/* what does *dp yield here? */

Although *dp and s have different types, it was felt that this kind
of aliasing was legal and common.  The language in the proposal is
meant to ensure that a compiler has to get this one right, while
not being required to worry about aliasing in the first example.
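
The two cases can be put side by side in a small sketch.  The function names
are invented here, and the comments reflect the intent described above rather
than the wording of the proposal.

```c
struct tag {
    int i;
    double d;
    char c;
};

/* Struct case: store through a double pointer into s, overwrite s by
   struct assignment, then read back through the pointer.  The compiler
   has to allow for the assignment having changed *dp. */
double read_after_struct_assign(void)
{
    struct tag s = {0, 0.0, 'a'};
    struct tag t = {7, 1.5, 'b'};
    double *dp = &s.d;

    *dp = 3.0;
    s = t;
    return *dp;     /* must be t.d, not the stale 3.0 */
}

/* First example: an int lvalue and a double lvalue are not allowed to
   name the same object, so the compiler may assume the store through
   dp leaves *ip alone. */
int read_after_double_store(int *ip, double *dp)
{
    *ip = 1;
    *dp = 3.0;
    return *ip;
}

/* Calls the function above with two genuinely distinct objects. */
int distinct_objects_demo(void)
{
    int n;
    double x;
    return read_after_double_store(&n, &x);
}
```

In the struct case the value read back is 1.5.  In the other case the result
is 1 whether or not the compiler reloads *ip after the store.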

Hope this has helped.

Tom Kelly  (416) 922-1937
HCR Corporation, 130 Bloor St. W., Toronto, Ontario, Canada
{utzoo, ihnp4, decvax}!hcr!tom

daniels@cae780.TEK.COM (Scott Daniels) (08/20/87)

In article <165600008@uiucdcsb> kenny@uiucdcsb.cs.uiuc.edu quotes:
>Doug Gwyn, describing Paris conference results (DG>):
>DG>Setjmp must be a macro.

>KK>Does this really translate to ``Setjmp need not exist as a library
>KK>function, but may be implemented in macro form only?'' 
Yes, but for code generation reasons, not just in how its args come.

>DG>I don't have the exact wording that's going into the draft,
>DG>but I think the problem is that the argument to setjmp() is being
>DG>passed by name, which is possible for macros but not for functions.

>Isn't the problem really that setjmp accepts a jmp_buf rather than a
>jmp_buf *?  (Ditto longjmp, ...  perhaps the spec should simply be altered 
>to change the data type...
> we otherwise condemn C programmers to learning all library function 
>parameters are passed by value, oh yes, except for setjmp,...
>Any comments, from either Doug or the peanut gallery?

I am not Doug, so I must be from the peanut gallery.  There is a real problem
with "setjmp()" as a function: It is the only function that can return more 
than once from an invocation.  
An optimizing compiler must treat setjmp specially in order to keep the
program from behaving in a weird way (see example below).  If the extremely
conservative assumptions you need to make about setjmp must be made for all
procedure calls, you can kiss almost any optimizations goodbye.  Instead, there
is a requirement that setjmp NOT be a normal procedure, so that people may 
design optimizing compilers that perform the necessary pessimizations around
calls to "setjmp", secure in the fact that the compiler can recognize all such
calls.
The example:

jmp_buf gBlock;

subr() {
 char *state;

 state = "Set up";
 if( setjmp( gBlock ) ) { printf("Stopped during %s\n", state ); return; }
 state = "Initialization";
 initial_processing();
 state = "First Pass";
 pass(1);
 state = "Second Pass";
 pass(2);
 state = "Final Pass";
 pass(3);
 state = "Termination";
 shut_down();
 printf("Wow! we finally succeeded!\n");
}

If the optimizer does not know about setjmp:
 state is a variable which is only read once: in the initial printf.
 The code which can "reach" that use all goes through the statement:
	state = "Set up";
 So, we can replace the state in the printf with "Set up".
 Now, state is a write-only variable, so we can eliminate it.
 The new optimized code is:
subr() {
 if( setjmp( gBlock ) ) { puts("Stopped during Set up" ); return; }
 initial_processing();
 pass(1);
 pass(2);
 pass(3);
 shut_down();
 printf("Wow! we finally succeeded!\n");
}

What the compiler must realize is that any "setjmp" can potentially be 
reached from any procedure call.  This might be done in a number of ways 
inside the compiler, but the key is the compiler must KNOW it is a setjmp.

If you wish to ban the optimizations mentioned above, just imagine the above 
example with the "setjmp(gBlock)" replaced by "ask_if_user_wants_to_proceed()".
I think most everyone would like the above-mentioned optimizations in that case.

You may also find systems that will not perform correctly if too many or few
arguments are passed to setjmp, since setjmp depends on the stack depth 
at its entrance.  (Note that you cannot write the following procedure, no 
matter how much you might like to:
	my_setjmp( blk ) jmp_buf blk; 
	{
	 static int counter = 0;
	 printf("Called setjmp(0x%x) at %d\n", &blk, ++counter );
	 return setjmp(blk);
	}
	#define setjmp my_setjmp
Once my_setjmp exits, the call environment saved in blk is invalid.)
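
One way to get the bookkeeping without the broken wrapper is to do the logging
in an ordinary function and leave the setjmp call itself in the frame that
stays alive.  This is only a sketch; note_setjmp(), fail(), guarded() and
announcements() are names made up for the example.

```c
#include <setjmp.h>

static jmp_buf blk;
static int counter;     /* how many jump points have been announced */

/* Does only the bookkeeping.  It never calls setjmp, so nothing is
   saved from a frame that is about to disappear. */
static void note_setjmp(void) { ++counter; }

/* Hypothetical stand-in for code that bails out. */
static void fail(void) { longjmp(blk, 1); }

/* setjmp is invoked directly here, so the saved environment belongs
   to guarded()'s frame, which is still live when fail() jumps. */
int guarded(void)
{
    note_setjmp();
    if (setjmp(blk))
        return -1;
    fail();
    return 0;
}

int announcements(void) { return counter; }
```

Calling guarded() returns -1 after the jump, and the counter has been bumped
exactly once.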


FROM:   Scott Daniels, Tektronix CAE
	5302 Betsy Ross Drive, Santa Clara, CA  95054
UUCP:   tektronix!teklds!cae780!daniels
	{ihnp4, decvax!decwrl}!amdcad!cae780!daniels 
        {nsc, hplabs, resonex, qubix, leadsv}!cae780!daniels 

rml@hpfcdc.HP.COM (Bob Lenk) (08/21/87)

> jmp_buf gBlock;
> 
> subr() {
>  char *state;
> 
>  state = "Set up";
>  if( setjmp( gBlock ) ) { printf("Stopped during %s\n", state ); return; }
>  state = "Initialization";
>  initial_processing();
>  state = "First Pass";
>  pass(1);
>  state = "Second Pass";
>  pass(2);
>  state = "Final Pass";
>  pass(3);
>  state = "Termination";
>  shut_down();
>  printf("Wow! we finally succeeded!\n");
> }

According to the public comment draft, optimizing away the last four
assignments to state would be perfectly legitimate.  The description
of longjmp() withholds its guarantees only for the values of
automatic variables that are not declared as volatile.  If
state is declared as volatile, the optimizer had better leave it alone
regardless of the setjmp.
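
A cut-down version of the quoted example with state declared volatile might
look like the sketch below.  The pass() here is a stand-in invented for the
illustration; it simply fails during the second pass.

```c
#include <setjmp.h>
#include <string.h>

static jmp_buf gBlock;

/* Hypothetical stand-in for pass(): jumps back to the setjmp when
   asked to run the second pass. */
static void pass(int n)
{
    if (n == 2)
        longjmp(gBlock, 1);
}

/* Returns the stage that was current when the jump happened, or
   "done" if nothing failed. */
const char *subr(void)
{
    /* The pointer itself is volatile, so after longjmp it holds the
       value most recently stored into it. */
    const char *volatile state = "Set up";

    if (setjmp(gBlock))
        return state;
    state = "First Pass";
    pass(1);
    state = "Second Pass";
    pass(2);
    state = "Final Pass";
    pass(3);
    return "done";
}
```

Here subr() reports "Second Pass".  Without the volatile qualifier the value
of state after the jump would not be something the program could rely on.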

As for the arguments about taking the address of setjmp, the draft
also said (under "environmental constraint") that the only context in
which setjmp can portably appear is comparison to an integral constant
expression.  This seems to mean that no true setjmp() function had to
exist, even before the Paris change.

The Paris change to setjmp seems unnecessary to me.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml

cmt@myrias.UUCP (Chris Thomson) (08/24/87)

> As for the arguments about taking the address of setjmp, the draft
> also said (under "environmental constraint") that the only context in
> which setjmp can portably appear is comparison to an integral constant
> expression.  This seems to mean that no true setjmp() function had to
> exist, even before the Paris change.
> 		Bob Lenk

Not true.  You could still have a call via a function pointer that points
at setjmp() in a comparison context.  This means that every indirect call
of appropriate type in a comparison context might be a call to setjmp.
And that's just for correct programs.  What happens if any old indirect
call could be a (possibly erroneous) call to setjmp()?  How much work
should be done to do something sensible in every such case?

Life is much simpler if you always know when a setjmp() is being done.
-- 
Chris Thomson, Myrias Research Corporation	   alberta!myrias!cmt
200 10328 81 Ave, Edmonton Alberta, Canada	   403-432-1616

kenny@uiucdcsb.cs.uiuc.edu (08/28/87)

/* Written  2:16 am  Aug 24, 1987 by cmt@myrias.UUCP in uiucdcsb:comp.lang.c */
/* ---------- "Re: Some questions about ANSI C" ---------- */
> As for the arguments about taking the address of setjmp, the draft
> also said (under "environmental constraint") that the only context in
> which setjmp can portably appear is comparison to an integral constant
> expression.  This seems to mean that no true setjmp() function had to
> exist, even before the Paris change.
> 		Bob Lenk

Not true.  You could still have a call via a function pointer that points
at setjmp() in a comparison context.  This means that every indirect call
of appropriate type in a comparison context might be a call to setjmp.
And that's just for correct programs.  What happens if any old indirect
call could be a (possibly erroneous) call to setjmp()?  How much work
should be done to do something sensible in every such case?

Life is much simpler if you always know when a setjmp() is being done.
-- 
Chris Thomson, Myrias Research Corporation	   alberta!myrias!cmt
200 10328 81 Ave, Edmonton Alberta, Canada	   403-432-1616
/* End of text from uiucdcsb:comp.lang.c */

Uh, no.  If you can't use setjmp() except in the context of comparing
it to an integer, then how did you get a pointer to it to begin with?

kBk

rml@hpfcdc.UUCP (08/28/87)

> > As for the arguments about taking the address of setjmp, the draft
> > also said (under "environmental constraint") that the only context in
> > which setjmp can portably appear is comparison to an integral constant
> > expression.  This seems to mean that no true setjmp() function had to
> > exist, even before the Paris change.
> > 		Bob Lenk
> 
> Not true.  You could still have a call via a function pointer that points
> at setjmp() in a comparison context.

My mistake.  All that is limited is where calls to setjmp can appear.  I
misread it as a restriction on where the name setjmp can appear.  Broadening
the restriction in that way would be equivalent to saying that it must
be a macro without requiring implementations that use a true function to
define a macro and rename the function.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml

wcs@ho95e.ATT.COM (Bill.Stewart) (08/29/87)

In article <5080003@hpfcdc.HP.COM> rml@hpfcdc.HP.COM (Bob Lenk) writes:
:As for the arguments about taking the address of setjmp, the draft
:also said (under "environmental constraint") that the only context in
:which setjmp can portably appear is comparison to an integral constant
:expression.  This seems to mean that no true setjmp() function had to
:exist, even before the Paris change.

Let's face it, setjmp() is an ugly hack, much uglier than goto.
	#define longjump(x) fprintf(stderr, "Oh, @##$%%$$#!!!!  %s\n", x)
It may be portable outside a Unix(tm) operating system environment,
but it's inherently dependent on stack-like implementations, and
is just as extra-linguistic as I/O.  It's ok to use on occasion, but
don't expect it to have particularly clean semantics.
-- 
#				Thanks;
# Bill Stewart, AT&T Bell Labs 2G218, Holmdel NJ 1-201-949-0705 ihnp4!ho95c!wcs