[comp.std.c] __STDC__ and POSIX

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/19/89)

In article <866@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
>when he meant
>>mean anything from "they define 'fileno' as a macro in <stdio.h>" to

Ok, I used fdopen() as my example of a POSIX name in <stdio.h>
instead of fileno() but the principle is the same.  These are
simply prohibited from being visible there in an ANSI C Standard
conforming implementation.  When the POSIX conflict with this
(extremely important, I think) general anti-pollution principle
was discovered, there was much frantic activity (since both
groups were meeting at the same time in different locations) to
try to work out a resolution.  These issues were not getting
resolved satisfactorily, and X3J11 sent a letter to P1003 with
X3J11's recommendation for how to best provide extensions like
this while remaining ANSI C Standard conformant.  (Basically the
idea was to require some symbol be defined before inclusion of a
standard header to enable the extensions; this symbol could have
been predefined in POSIX environments, to make it convenient.)
The actual scheme adopted in the final IEEE Std 1003.1 wasn't
quite what X3J11 had suggested; in fact it's sort of the converse.
They permit an application (NOT the implementation) to define the
macro _POSIX_SOURCE, which (a) makes available POSIX extensions
and (b) removes unconstrained extensions that Std 1003.1 would
otherwise have permitted in headers.  However, the intent appears
to be that POSIX-mandated extensions be available in the headers
even when _POSIX_SOURCE is NOT defined by the application, and
that contradicts ANSI C.

I think the logical consequence of all this is that in a
simultaneously ANSI C and IEEE 1003.1 conforming implementation,
applications would have to be sure to define _POSIX_SOURCE if
they want to get at the POSIX-specific symbols in those headers
specified by the C Standard.  This is obviously a nuisance, and
consequently it is probable that there will be few attempts to
provide simultaneously-conforming implementations and fewer
attempts on the part of applications to use them.  This is a
real lossage.
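
A minimal sketch of what that requirement looks like from the application's side, assuming a POSIX system whose <stdio.h> keys on _POSIX_SOURCE as described above. The helper name stream_descriptor is hypothetical and chosen only for illustration.

```c
/* The application, not the implementation, defines the feature test
   macro, and it does so before the first standard header is included. */
#define _POSIX_SOURCE 1

#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the file descriptor behind a stdio stream.
   fileno() is a POSIX extension to <stdio.h>; it is not part of ANSI C,
   so a strictly ANSI-conforming header must hide it unless asked. */
int stream_descriptor(FILE *stream)
{
    assert(stream != NULL);
    return fileno(stream);
}
```

On such a system, stream_descriptor(stdout) would be expected to return 1. The same source could also be built with -D_POSIX_SOURCE on the command line instead of the #define.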

My hope is that in the UNIX world the following slightly
cheating position will become widely adopted as a compromise
solution of most benefit to the programmer who is also concerned
with portability issues:

	The new default compilation environment predefines
	_POSIX_SOURCE "on behalf of the application", i.e. it is
	NOT considered as being provided by the official
	"implementation" but by the "program" being compiled.

	The shared standard headers key on _POSIX_SOURCE to enable
	the POSIX extensions beyond the ANSI C specification.

	Except for the header extensions enabled by _POSIX_SOURCE,
	the ANSI C Standard is obeyed.

	__STDC__ is defined as 1.

As an acceptable but slightly less convenient alternative that can
be coped with easily by POSIX applications adding -D_POSIX_SOURCE to
CFLAGS in Makefiles:

	The new default compilation environment does NOT predefine
	_POSIX_SOURCE.

	The shared standard headers key on _POSIX_SOURCE to enable
	the POSIX extensions beyond the ANSI C specification.

	The ANSI C Standard is obeyed.

	__STDC__ is defined as 1.

The key to this is that, because the "program" is defining the
identifier _POSIX_SOURCE, which is in the name space reserved for
implementations, technically by the ANSI C Standard the realm of
"undefined behavior" has been entered.  Basically, POSIX
implementations would define the behavior in this case to be that
outlined above.

This solution does NOT work if it is the "implementation" that is
considered to be defining _POSIX_SOURCE.

mark@jhereg.Jhereg.MN.ORG (Mark H. Colburn) (01/20/89)

In article <9436@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>The actual scheme adopted in the final IEEE Std 1003.1 wasn't
>quite what X3J11 had suggested; in fact it's sort of the converse.
>They permit an application (NOT the implementation) to define the
>macro _POSIX_SOURCE, which (a) makes available POSIX extensions
>and (b) removes unconstrained extensions that Std 1003.1 would
>otherwise have permitted in headers.  However, the intent appears
>to be that POSIX-mandated extensions be available in the headers
>even when _POSIX_SOURCE is NOT defined by the application, and
>that contradicts ANSI C.

It is possible, I think, to put together a header which conforms to
both standards.   For example:

---------------------------------------------------------------------------
#ifndef __LIMITS_H
#define __LIMITS_H

/* Defines */

#define CHAR_BIT	8		/* number of bits in a "char" */
#define CHAR_MAX	UCHAR_MAX	/* max value for "char" */
#define CHAR_MIN	0		/* min value for "char" */
#define INT_MAX		32767		/* max value for "int" */
#define INT_MIN		-32767		/* min value for "int" */
#define LONG_MAX	2147483647	/* max value for "long int" */
#define LONG_MIN	-2147483647	/* min value for "long int" */
#define MB_LEN_MAX	1		/* max bytes in a multibyte char */
#define SCHAR_MAX	127		/* max value for "signed char" */
#define SCHAR_MIN	-127		/* min value for "signed char" */
#define SHRT_MAX	32767		/* max value for "short int" */
#define SHRT_MIN	-32767		/* min value for "short int" */
#define UCHAR_MAX	255		/* max value for "unsigned char" */
#define UINT_MAX	65535		/* max value for "unsigned int" */
#define ULONG_MAX	4294967295	/* max value for "unsigned long" */
#define USHRT_MAX	65535		/* max value for "unsigned short" */

#ifdef _POSIX_SOURCE

#define MAX_INPUT	256	/* Max number of bytes in terminal input */
#define NGROUPS_MAX	1	/* Max number of supplemental group IDs */
#define PASS_MAX	8	/* Max number of bytes in a password */
#define PID_MAX		30000	/* Max value for a process ID */
#define UID_MAX		32000	/* Max value for a user or group ID */
#define ARG_MAX		4096	/* Max number of bytes passed to exec */
#define CHILD_MAX	6	/* Max number of simultaneous processes */
#define MAX_CANON	256	/* Max number of bytes in a canonical queue */
#define OPEN_MAX	16	/* Max number of open files per process */
#define NAME_MAX	14	/* Max number of bytes in a file name */
#define PATH_MAX	255	/* Max number of bytes in pathname */
#define LINK_MAX	8	/* Max value of a file's link count */
#define PIPE_BUF	512	/* Max number of bytes for pipe reads */

#endif /* _POSIX_SOURCE */
#endif /* __LIMITS_H */
---------------------------------------------------------------------------

In the example above, the POSIX related defines will only be used if
_POSIX_SOURCE is defined.  Therefore, if it is not defined, there should be
no namespace pollution.

I would hope that ANSI is not attempting to say that the only material that
is allowed in an ANSI C conforming header is that material which is in the
ANSI C standard.  That would cause problems with virtually any kind of
optional extensions.  I would (grudgingly) agree that the namespace pollution
should not be visible unless a specific action is taken, such as defining
_POSIX_SOURCE.

I have always felt that the ANSI C position on namespace pollution was a
little parochial.  Obviously, it is causing problems with other ongoing,
and useful, standards such as IEEE 1003 (POSIX).

>I think the logical consequence of all this is that in a
>simultaneously ANSI C and IEEE 1003.1 conforming implementation,
>applications would have to be sure to define _POSIX_SOURCE if
>they want to get at the POSIX-specific symbols in those headers
>specified by the C Standard.  This is obviously a nuisance, and
>consequently it is probable that there will be few attempts to
>provide simultaneously-conforming implementations and fewer
>attempts on the part of applications to use them.  This is a
>real lossage.

I must be confused here, because I do not understand how an application
can be both ANSI C conforming and P1003.1 conforming if it is attempting
to use features which only are available in 1003.1.  If the application
is using only those features available to ANSI C, then it would be able
to #undef _POSIX_SOURCE, or just ignore _POSIX_SOURCE.  I don't see how 
this would cause the problems with the development of applications which 
you mentioned.

You said earlier that the symbol _POSIX_SOURCE would be able to be
predefined (for example by the pre-processor), so why should the
application have to define it?  It should maybe test for it, if it is
going to optionally use 1003.1 facilities.

I think that an application which is being developed for strict ANSI C
environments on a P1003.1 conforming system should take additional steps, 
such as #undef _POSIX_SOURCE, simply to make sure that the available
extensions in 1003.1 are not used.

Another alternative would be for P1003.1 conforming systems to provide a
flag for the compiler, such as -strict_ansi, which does not define
_POSIX_SOURCE.

>	The new default compilation environment predefines
>	_POSIX_SOURCE "on behalf of the application", i.e. it is
>	NOT considered as being provided by the official
>	"implementation" but by the "program" being compiled.

I would rather have the implementation provide it.  That way a program can
be portable, without user intervention.  I can have a program which has 
routines test for ANSI C conformance using __STDC__ and test for POSIX 
conformance, via the _POSIX_SOURCE test macro, and use the POSIX routines 
if they are available, otherwise use some other form of routine.

I already have applications which do this.  Admittedly, I must currently
define _POSIX_SOURCE by hand, since I do not have a strictly conforming
P1003.1 system yet.
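
A rough sketch of the kind of test described here, under the assumption that _POSIX_SOURCE is already defined (by hand, on the command line, or by the compilation environment) whenever the POSIX routines are wanted. The helper name file_is_readable is hypothetical; only the _POSIX_SOURCE half of the test is shown, not the __STDC__ half.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _POSIX_SOURCE
#include <unistd.h>     /* POSIX only: access() and R_OK */
#endif

/* Hypothetical helper: returns nonzero if the named file can be read.
   Uses the POSIX routine when it has been requested, and falls back
   to plain ANSI C stdio otherwise. */
int file_is_readable(const char *path)
{
    assert(path != NULL);
#ifdef _POSIX_SOURCE
    return access(path, R_OK) == 0;
#else
    {
        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            return 0;
        fclose(fp);
        return 1;
    }
#endif
}
```

Either branch gives the same answer for an ordinary file; the point is only that the choice is made at compile time from the feature test macro.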

>	Except for the header extensions enabled by _POSIX_SOURCE,
>	the ANSI C Standard is obeyed.

It could be obeyed simply by #undef _POSIX_SOURCE, couldn't it?

-- 
Mark H. Colburn                  "They didn't understand a different kind of 
NAPS International                smack was needed, than the back of a hand, 
mark@jhereg.mn.org                something else was always needed."

donn@hpfcdc.HP.COM (Donn Terry) (01/20/89)

>...
>I think the logical consequence of all this is that in a
>simultaneously ANSI C and IEEE 1003.1 conforming implementation,
>applications would have to be sure to define _POSIX_SOURCE if
>they want to get at the POSIX-specific symbols in those headers
>specified by the C Standard.  This is obviously a nuisance, and
>consequently it is probable that there will be few attempts to
>provide simultaneously-conforming implementations and fewer
>attempts on the part of applications to use them.  This is a
>real lossage.

In fact, to my knowledge, an implementation that requires a #define
_POSIX_SOURCE is the preferred one as all the compiler organization has
to do is provide an ANSI compiler.  The headers are extended by the OS
group to include the POSIX symbols conditionally, and if the user wants
them, he asks.  The user decides what he wants in his source program.
Nice, clean, and simple.  It never breaks ANSI C.  It assures that
makefiles don't have to know the subtleties of which switches to set in
which modules.  It warns the reader of the program what symbol set he
should expect up front, rather than having to look at the makefile.
It makes sure the switch settings don't get separated from the program.

We did consider the possibility of predefining the symbol in the
compiler, but discarded it because of some of the other issues that
came up elsewhere in this discussion: if you predefine a symbol from
the outside, does that require turning off __STDC__?  (At the time it
appeared to.)  Or, how do you turn it off?  Or, as Bob Lenk observes,
how do you handle the combinatorics of N standards all of which need
switches?  (IEEE 1003.* is poised to go over into two digits;
each of those probably represents a standard or two or three.  GACK!)

>My hope is that in the UNIX world the following slightly
>cheating position will become widely adopted as a compromise
>solution of most benefit to the programmer who is also concerned
>with portability issues:

>	The new default compilation environment predefines
>	_POSIX_SOURCE "on behalf of the application", i.e. it is
>	NOT considered as being provided by the official
>	"implementation" but by the "program" being compiled.

It's intended to be provided by and in the program.  Weasel words
to do this aren't needed.

>	The shared standard headers key on _POSIX_SOURCE to enable
>	the POSIX extensions beyond the ANSI C specification.

That's easy and being done.

>	Except for the header extensions enabled by _POSIX_SOURCE,
>	the ANSI C Standard is obeyed.
>
>	__STDC__ is defined as 1.
>

No problem with that.

>As an acceptable but slightly less convenient alternative that can
>be coped with easily by POSIX applications adding -D_POSIX_SOURCE to
>CFLAGS in Makefiles:
> ...

What's wrong with putting it in the program?  That was the intent.
As mentioned elsewhere, if the implementation doesn't conform to POSIX,
you've got bigger problems than a single flag can solve, anyway.  At
that point you use compilation flags from the command line to turn on/off
the bits and pieces you need, and if _POSIX_SOURCE needs to be turned
off, you arrange not to turn it on, with something like:

	#ifndef NOT_POSIX
	#define _POSIX_SOURCE
	#endif

>The key to this is that, because the "program" is defining the
>identifier _POSIX_SOURCE, which is in the name space reserved for
>implementations, technically by the ANSI C Standard the realm of
>"undefined behavior" has been entered.  Basically, POSIX
>implementations would define the behavior in this case to be that
>outlined above.
>
>This solution does NOT work if it is the "implementation" that is
>considered to be defining _POSIX_SOURCE.

POSIX is best considered an implementation from the ANSI C point of
view.  Thus _POSIX_SOURCE is considered a symbol reserved to a
particular implementation (and thus outside ANSI C control).  How that
symbol is used by that implementation is up to the implementation.
Here we get into some subtleties of the meaning of the word "define".
In the sense of the English language, it means to associate a meaning
with it.  In the usual computer sense it means "add it to the symbol
table".  In this case, POSIX defines a meaning for that symbol (in
the English sense), and only when the user defines it (in the computer
sense) does it actually have an effect.  From the point of view of
ANSI C the symbol simply doesn't exist: either it isn't used, or 
if it is, you enter the realm of "undefined behavior" and ANSI can't
talk about it.  (But POSIX can, and the first thing it says is "it's
still just like ANSI but you got some more symbols from outside".)

Donn Terry
Chair, IEEE 1003.1

In this I represent only myself.  Neither my employer nor IEEE necessarily
endorses any position presented here.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/21/89)

In article <399@jhereg.Jhereg.MN.ORG> mark@jhereg.MN.ORG (Mark H. Colburn) writes:
>It is possible, I think, to put together a header which conforms to
>both standards.   For example:
[universal defines]
>#ifdef _POSIX_SOURCE
[POSIX-specific defines]
>#endif /* _POSIX_SOURCE */

>In the example above, the POSIX related defines will only be used if
>_POSIX_SOURCE is defined.  Therefore, if it is not defined, there should be
>no namespace pollution.

Fine, and an implementation that simultaneously conforms to both standards
has to use a technique like that.  The practical problem is that many
POSIX vendors do not want to force their customers to #define _POSIX_SOURCE
in their source code in order for the POSIX definitions to become visible.
They rather prefer that these definitions be visible without the extra work
on the part of the application.  Mostly this is because they would like to
be able to use their new POSIX environment to compile existing UNIX
applications without having to change said applications.  <limits.h> is not
the best example of the problem; <stdio.h> is a better one.

>I would hope that ANSI is not attempting to say that the only material that
>is allowed in an ANSI C conforming header is that material which is in the
>ANSI C standard.

No, but any extra stuff in the standard headers must either use only
those identifiers reserved for implementations (i.e. underscore-names),
or else the extra stuff must be protected by a conditional using a name
in the implementation's reserved name space (as in your example).  The
rationale for the latter usage is that an application, by defining a
symbol reserved for implementations, has entered the realm of "undefined
behavior", which permits an escape from the otherwise stringent ANSI C
constraints.

>I have always felt that the ANSI C position on namespace pollution was a
>little parochial.  Obviously, it is causing problems with other ongoing,
>and useful, standards such as IEEE 1003 (POSIX).

Excuse me, but as acting liaison between the two groups I must say that
I don't think 1003.1 ever quite understood this issue as well as X3J11.
Certainly the solution adopted by 1003.1 (_POSIX_SOURCE) doesn't properly
address the issue.  It is THAT, combined with the mess that constitutes
existing practice in this area, that is causing problems.  The name space
anti-pollution guarantee for portable programming is one I find quite
valuable.  In fact I don't see any viable alternative for specifying a
system-independent C standard.  I wish 1003.1 had done as well in the
specifications for their own headers.

Feel free to blame me for not doing a good enough job as liaison.

>I must be confused here, because I do not understand how an application
>can be both ANSI C conforming and P1003.1 conforming if it is attempting
>to use features which only are available in 1003.1.

Please read more carefully.  I said "implementation", not "application".

>If the application is using only those features available to ANSI C,
>then it would be able to #undef _POSIX_SOURCE, or just ignore
>_POSIX_SOURCE.

A strictly conforming ANSI C application cannot #undef _POSIX_SOURCE,
and in any event maximally-portable ANSI C programs are supposed to be
writable without reference to the POSIX standard.  The joint X3J11/P1003
goal is for such applications to work unchanged in a simultaneously
ANSI C and POSIX conforming implementation.

>... why should the application have to define it.  It should maybe
>test for it, if it is going to optionally use 1003.1 facilities.

Because IEEE Std 1003.1 SAYS the application, not the implementation,
optionally defines _POSIX_SOURCE.

>I think that an application which is being developed for strict ANSI C
>environments on a P1003.1 conforming system should take additional steps, 
>such as #undef _POSIX_SOURCE, simply to make sure that the available
>extensions in 1003.1 are not used.

Sorry, that is not proper to require for portable use of a system-
independent programming language.  How would you feel if the Pascal
standard said "Portable programs must XXX in order to be sure that
they will work on VMS systems, and YYY to be sure they work on DOS,
and ..."?

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/21/89)

In article <12040007@hpfcdc.HP.COM> donn@hpfcdc.HP.COM (Donn Terry) writes:
>>As an acceptable but slightly less convenient alternative that can
>>be coped with easily by POSIX applications adding -D_POSIX_SOURCE to
>>CFLAGS in Makefiles:
>What's wrong with putting it in the program?  That was the intent.

I personally don't mind doing that, but I've been told that it would
annoy other customers.  They may have tens of thousands of source
files that they would like to compile with a POSIX-compliant "cc",
and apparently it is not considered feasible for them to do a massive
global edit of all their source files.

guy@auspex.UUCP (Guy Harris) (01/21/89)

 >My hope is that in the UNIX world the following slightly
 >cheating position will become widely adopted as a compromise
 >solution of most benefit to the programmer who is also concerned
 >with portability issues:
 >
 >	The new default compilation environment predefines
 >	_POSIX_SOURCE "on behalf of the application", i.e. it is
 >	NOT considered as being provided by the official
 >	"implementation" but by the "program" being compiled.

I agree.  If any excessively-legalistic type objects, they should be
ignored (unless they repeatedly object, in which case they should be
shot).  The goal of the standards is to help people do their jobs; given
that deciding whether the "implementation" or "the application" is
defining _POSIX_SOURCE is a matter of no practical importance
(except, perhaps, if you have incurably stupid people who won't read the
documentation and provide the "-pedantic" or whatever flag to their
C compiler when compiling ANSI C/non-POSIX applications), I don't see
that your suggestion violates the spirit of either standard.

(The same logic applies to "-D" on the command line; the user
consciously decided to change the namespace in which they compiled their
application, presumably knowing precisely what the effect would be.  I
realize that the current state of liability law in the US seems, at
times, to tend towards the notion that no bad consequences of deciding
to do something are the fault of the person who decided to do it, but
I hope we can avoid thinking in those terms; the disclaimer you
suggested for the "-D" case sounds a bit like the disclaimer from the
apocryphal story of the stepladder planted in frozen manure that slipped
when the manure melted - the ladder's manufacturer was allegedly bullied
into applying a "don't stick the bottom of this ladder into frozen
manure, idiot" disclaimer to the product thanks to some stupid liability
suit....)

A similar approach could be taken for other extensions; perhaps there'd
be an option in S5R4, say, for the latest issue of the SVID (enabling,
for example, the definition of error numbers not found in POSIX), and
probably one for "open the floodgates wide", which includes stuff not
(or not yet) in the SVID.

All those versions would define __STDC__ as 1 (or 2 or later, in future
versions, presumably).  The "traditional" mode would probably turn
everything on; I agree that it would be preferable that it not define
__STDC__ at all (although Peter da Silva's workaround isn't too bad - it
lets you use "#if __STDC__" at the expense of adding a few extra lines
of code, which you could presumably put in your application's
"unwedge_stdc.h" or something such as that).  Given that there would be
other "feature test macros" for the other options, (__STDC__ == 0)
wouldn't be needed as a way of *enabling* namespace-polluting features
(which is, I think, one of the functions for which AT&T wanted to use
it).

I don't see that (__STDC__ == 0) is all that useful for e.g. enabling
code that uses prototypes on applications that can use prototypes when
compiled in a compiler that has them but isn't ANS-conformant:

	1) if you're rewhacking your code to use prototypes, you might
	   want to rewhack it to conform to the ANS;

	2) you can define your own feature test macros for this, and
	   have some little prologue that defines _HAS_PROTOTYPES if
	   __STDC__ is defined and (greater than or) equal to 1, and
	   forcibly pre-define them on the implementations that have
	   prototypes but aren't ANS-conformant;

	3) if all else fails, you may be able to forcibly define
	   __STDC__ as 0 on the command line.
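
Option 2 might look something like the following sketch. The macro name _HAS_PROTOTYPES is the one used above; strictly speaking it sits in the name space reserved for implementations, so a cautious application might prefer a name without the leading underscore. The function average_of is hypothetical and exists only to show the two declaration styles.

```c
#include <assert.h>

/* Prologue: derive a private feature test macro from __STDC__, unless
   it was already forced on from the command line (for a compiler that
   has prototypes but is not ANS-conformant). */
#if !defined(_HAS_PROTOTYPES) && defined(__STDC__) && __STDC__ >= 1
#define _HAS_PROTOTYPES 1
#endif

#ifdef _HAS_PROTOTYPES
int average_of(int total, int count);   /* prototype declaration */
#else
int average_of();                       /* old-style declaration */
#endif

#ifdef _HAS_PROTOTYPES
int average_of(int total, int count)
#else
int average_of(total, count)
int total, count;
#endif
{
    assert(count > 0);
    return total / count;
}
```

On a compiler with prototypes that does not define __STDC__ as 1 or more, the same source would be built with something like -D_HAS_PROTOTYPES on the command line, which is the "forcibly pre-define" case.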

guy@auspex.UUCP (Guy Harris) (01/21/89)

 >>	The new default compilation environment predefines
 >>	_POSIX_SOURCE "on behalf of the application", i.e. it is
 >>	NOT considered as being provided by the official
 >>	"implementation" but by the "program" being compiled.
 >
 >I would rather have the implementation provide it.

The implementation *is* providing it, if you're thinking about it
reasonably; the intent is to *say* that it's being done "on behalf of
the application", if you're thinking about it legalistically, instead,
so as to keep excessively legalistic types from playing "dog in the
manger."

Basically, if you say "cc -Q posix", or whatever, it'll predefine it. 
"cc" by itself may end up giving you the "complete" environment,
whatever that be (probably ANSI C + POSIX +, on a modern UNIX system,
additional error numbers in <errno.h> and the like), and "cc -Q ansi"
may end up giving you ANSI C, no more, no less, no "fdopen", no
"fileno", etc..