[mod.std.c] mod.std.c Digest V3 #2

osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (02/12/85)

mod.std.c Digest            Tue, 12 Feb 85       Volume 3 : Issue   2 

Today's Topics:
                 Administrivia - Flames on this Digest
                     implementation dependencies
           Need for first-member union initialization rule
----------------------------------------------------------------------

Date: 12 Feb 85 05:00:00 EST (Tue)
From: cbosgd!std-c
Subject: Administrivia - Flames on this Digest

Only rarely do I receive contributions containing flames.  My policy is
to ask the people who send them to post such material to the USENET
net.flame group instead, or to mail it directly to the people they are
flaming.  In mod.std.c, technical contributions have a much higher
importance than inflammatory material.  This way, we don't have to
suffer through a series of flames, as happens once in a while in other
technical groups, like net.lang.c.

Send any comments to me.

				The Moderator

---------------------------

Date: 2 Feb 85 04:02:57 CST (Sat)
From: cbosgd!ihnp4!utzoo!henry
Subject: implementation dependencies
To: ihnp4!cbosgd!std-c

> Henry Spencer writes:
> 
> >[...] The current draft standard's neat new
> >string-concatenation convention (adjacent string literals -- note this
> >is literals only -- are concatenated at compile time) eliminates the
> >need for in-string replacement as a way to build filenames out of #defined
> >pieces, which to my mind was the only real need for in-string replacement.
> 
> He has confused the in-string replacement problem with the string
> concatenation problem.   String concatenation in macros has previously
> been achieved by  foo/**/bar or
> 
> #define IDENT(x)x
> IDENT(foo)bar
> 
> and the standard's string concatenation does deal with this sufficiently.
> However, it does nothing for in-string replacement.

Not so.  Consider:

	#define	DIR	"/usr/foo/bar"
	char *file = DIR "/my.file";

This initializes "file" to point to "/usr/foo/bar/my.file".  Note that
you *cannot* do this using token concatenation -- the old /**/ trick --
because you end up with an extra pair of double quotes in the middle.
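A slightly fuller sketch of the same point follows.  The helper names
file_is_concatenated and concatenated_size are hypothetical, added only
for illustration; the sketch assumes a compiler that implements the
adjacent-literal rule described above.

```c
#include <stddef.h>
#include <string.h>

#define DIR "/usr/foo/bar"

/* Adjacent string literals are merged at compile time, so the
   initializer below is one literal: "/usr/foo/bar/my.file". */
static const char *file = DIR "/my.file";

/* Hypothetical helper: 1 if the pieces were joined with no stray
   quotes or terminators in the middle. */
int file_is_concatenated(void)
{
    return strcmp(file, "/usr/foo/bar/my.file") == 0;
}

/* Hypothetical helper: size of the merged literal.  There is one
   array of 20 characters plus a single terminating NUL. */
size_t concatenated_size(void)
{
    return sizeof(DIR "/my.file");
}
```

Because the join happens in the compiler rather than by pasting
tokens, each piece stays an ordinary quoted literal in the source.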

> Henry's point
> that the feature was not standard by K&R is well taken (however snidely put),
> although it is not reasonable to assert that K&R *outlaw* replacement within
> strings in the macro definition, since it is fairly clear that they were
> referring to strings in the running text.

"Fairly clear" to *WHO*?  Quoting K&R 12.1:

	[#define without parameters] causes the preprocessor to replace
	subsequent instances of the identifier with the given string of
	tokens...

	[after #define with parameters] Subsequent instances of the
	first identifier [with proper parameter list] are replaced by
	the token string in the definition.  Each occurrence of an
	identifier mentioned in the formal parameter list of the
	definition is replaced by the corresponding token string from
	the call...  Text inside a string or a character constant is
	not subject to replacement.

The last sentence in this quote contains no qualifications restricting
it to running text, and it is in the paragraph describing how macros
with parameters are handled.  If you go into section 12.1 without
preconceptions, and determined to find an answer on whether in-string
replacement is legitimate, the only reasonable interpretation is "no".

The key is the words "determined to find an answer".  The preprocessor
did not originally have macros with parameters at all; they were added
when the preprocessor was rewritten during the V5->V6 transition.  (I
remember having to fix some of my trickier macro definitions when the
new preprocessor arrived...)  The language in the CRM about macros with
parameters was, in other words, a late addition to an existing manual.
Read with this in mind, the "not subject to replacement" phrase
probably dates from the old days and simply wasn't updated to
clarify the situation when parameters arrived.  So the best
conclusion that can be drawn, looking at the CRM and its history, is
that it doesn't answer the question at all.

Where does this leave us?  When the documentation does not answer the
question, the answer is implementation-dependent by definition, and
the only way to write implementation-independent code is to avoid
depending on the answer at all.  My original contention therefore
remains: code that makes use of in-string substitution is relying on
an implementation-dependent feature.
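The dependency can be made visible with a small probe.  This is a
sketch only: QUOTED and the two helper functions are hypothetical
names.  A preprocessor that leaves string contents alone yields "x"; a
preprocessor that substitutes parameters inside strings would yield
"hello" instead.

```c
#include <string.h>

/* Hypothetical macro: the parameter name x also appears inside a
   string literal in the definition. */
#define QUOTED(x) "x"

/* 1 if the preprocessor left the text inside the string alone. */
int param_left_alone(void)
{
    return strcmp(QUOTED(hello), "x") == 0;
}

/* 1 if the preprocessor substituted the argument into the string. */
int param_was_replaced(void)
{
    return strcmp(QUOTED(hello), "hello") == 0;
}
```

Code whose behavior changes with the answer to this probe is exactly
the code that is not portable between the two kinds of preprocessor.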

> However, it is important to recognize the pragmatic side of Ken's
> code-breaking admonition.  In particular, it would be impossible without
> in-string replacement to implement the UNIX assert library facility, which is
> a macro defined as
> 
> #define assert(EX) if (EX) ; else _assert("EX", __FILE__, __LINE__)

Oh, really?  The assert(3x) macro in Unix V7 -- not Berklix, not USGlix --
is defined without using in-string substitution.  Works, too.  If AT&T
has changed this to rely on an implementation-dependent feature of their
C preprocessor, that is their problem.  I agree that the result is
superior, by the way, but to talk about any other version being impossible
simply isn't correct.
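For concreteness, here is a sketch of an assertion macro in that
spirit.  It is not a verbatim copy of any system's header; the names
V7_STYLE_ASSERT and checked_div are hypothetical.  It reports the file
and line but not the expression text, so it needs no replacement
inside a string.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name, chosen to avoid clashing with <assert.h>.
   Only __FILE__ and __LINE__ are used; the expression itself is
   never placed inside a string literal. */
#define V7_STYLE_ASSERT(ex) \
    do { \
        if (!(ex)) { \
            fprintf(stderr, "Assertion failed: file %s, line %d\n", \
                    __FILE__, __LINE__); \
            exit(1); \
        } \
    } while (0)

/* Hypothetical caller: refuses to divide by zero. */
int checked_div(int a, int b)
{
    V7_STYLE_ASSERT(b != 0);
    return a / b;
}
```

The cost is the one conceded above: the message is less informative
because the failed expression is not printed.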

Incidentally, the latest draft of the ANSI standard does include a way
to do in-string substitution, although it is *not* the Reiser cpp's
method and is not compatible with it.

[ Suggestion: when referring to an ANSI draft, please state document
   ID so we all know what draft we are talking about.  -- Mod. -- ]
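As a point of comparison, the C standard as eventually published
provides the # operator, which turns a macro argument into a string
literal.  Whether the draft referred to above spells the feature the
same way is an assumption here, since no document ID is given.  The
macro and function names below are hypothetical.

```c
#include <string.h>

/* # converts the argument's spelling into a string literal. */
#define STRINGIZE(x) #x

/* Combined with adjacent-literal concatenation, this builds a
   message containing the expression text. */
#define CHECK_TEXT(EX) "Assertion failed: " #EX

int stringize_works(void)
{
    return strcmp(STRINGIZE(a + b), "a + b") == 0;
}

int check_text_works(void)
{
    return strcmp(CHECK_TEXT(n > 0), "Assertion failed: n > 0") == 0;
}
```

Note that the substitution is requested explicitly with #; text that
is already inside quotes in the macro definition is still left alone.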

> As for
> 
> >As various people (including me) have pointed out, modifying (say) the
> >OS/360-aka-MVS linker is politically impossible, however desirable and
> >technically-simple it may be.
> 
> that linker does not restrict one to 6 character externals, and so I wish
> Henry would quit mentioning IBM (and DEC, which allows 32 character
> externs in VMS) in that context.  People might question this limitation
> a bit more if they realized that it was accounting for GCOS, not MVS.

As I recall it, MVS's linker allows a whole 8 chars.  I might be wrong,
though -- it's been a long time.  I seem to recall that the PC-DOS linker
has a 6-character limit.  Yes, the PC is also an IBM product to which
they have a huge commitment.  I do know that the RT-11 linker has a
6-character limit, as do the linkers of several other DEC systems.
No, it's not just GCOS, and IBM and DEC *are* real examples.

> Also, several proposals for accounting for such limited environments
> without modifying the host linker have been proposed that Henry has not
> addressed (mostly he has indulged in condescending attacks).
> The major criticism of these methods has been their inconvenience,
> especially for debugging.

I've yet to see one that had a hope in Hell of being accepted by IBM,
DEC, Microsoft, etc. etc.  Acceptance is what the whole damned 6-char
issue is about, remember, not just minimal feasibility.  It is not enough
to say "if you stand on your head and squint, you can make it work".
The problem is convincing IBM to bother.  They won't.  Which means we
would be stuck with a two-level standard in practice whether we liked
it or not.  Actually, it would be worse than that:  we'd have a multi-
level standard, with each old operating system setting its own limit,
and nobody sure what the rules were.  Better to face the problem.
(I trust everyone realizes that nobody in his right mind is suggesting
a 6-char limit in environments like Unix.)

> However, such inconvenience *on these specific
> systems* must be weighed against the costs to *all* C developers of a
> six-character limit.

As described above, the 6-char limit *will* exist no matter what the
standard says, if one is concerned about maximum portability.  If one
isn't, then one can probably assume Berklix or USGlix or a similar
civilized environment.  Where is the win in mandating long identifiers?
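To make the portability concern concrete, the sketch below checks
whether two distinct external names would look identical to a linker
that keeps only a fixed number of leading characters.  The function
name externs_collide and the sample identifiers are hypothetical, and
the sketch ignores case folding, which some linkers also apply.

```c
#include <stddef.h>
#include <string.h>

/* The limit under discussion. */
#define LINKER_SIGNIFICANT 6

/* Returns 1 if a and b are different names that agree in their
   first `significant` characters, i.e. names such a linker could
   not tell apart.  Returns 0 otherwise. */
int externs_collide(const char *a, const char *b, size_t significant)
{
    return strcmp(a, b) != 0 && strncmp(a, b, significant) == 0;
}
```

For example, initialize_table and initialize_tree collide at six
characters, while inittb and inittr do not.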

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

Date: Mon, 4 Feb 85 19:23:43 est
From: cbosgd!packard!harvard!talcott!kendall
Subject: Need for first-member union initialization rule
To: cbosgd!std-c

A lot of people have criticized the first-member union initialization
rule.  I want to argue that the rule is needed to make the language
consistent, and I also want to answer the objection that this
solution will prevent a later, more elaborate solution.

    First, as to why the rule is necessary.  When an object is
initialized by default (no explicit initializer), it is initialized to
zero.  In the standard, zero is not necessarily the zero bit pattern;
rather, it is dependent on the type of the object.  For numbers and
enums, the value 0 is used; for pointers, null; for structures and
arrays, each member is zero; but for unions, what?  Suppose we have the
following union:

	union { double d; long l; } u;

And suppose that we are on a machine where floating point zero is not
the zero bit pattern, but integer zero is.  Then to what do we
initialize this union?  Do we make `d' zero, or `l' zero?  Without a
union initialization rule, this question is left unanswered.  With the
current rule, we make `d' zero, even if `l' is left with a nonzero
value.
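A short sketch of the rule in action, assuming a compiler that
implements it.  The tag number and the helper functions are
hypothetical names added for illustration.

```c
/* The union from the text, given a tag so it can be reused. */
union number { double d; long l; };

/* No explicit initializer: under the first-member rule, d is set
   to floating-point zero, whatever bit pattern that requires. */
static union number by_default;

/* Explicit initializer: the value goes to the first member, d. */
static union number by_first = { 1.5 };

double default_d(void)
{
    return by_default.d;
}

double explicit_d(void)
{
    return by_first.d;
}
```

On a machine where floating-point zero is not all zero bits, reading
by_default.l would not be guaranteed to give 0, which is the case the
text describes.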

    I do not have a copy of the standard with me, so I cannot verify my
argument with reference to it.  In particular, there may be no basis
for making such a strong connection between implicit and explicit
initialization, but it seems to me elegant, if not actually necessary,
to use this same rule for both.

    Second, I want to answer the criticism that this rule will prevent
further extensions along these lines.  My answer is very simple: any
union initialization syntax that attempts to select which member is
initialized had better look different than ordinary initialization, and
thus be syntactically distinguishable from it; otherwise things are
going to be very confusing.  (In particular--to make my one
inflammatory statement of this note--the "cast" syntax for union
initialization is not even worth arguing about.)  In other words,
current union initialization looks like ordinary initialization; any
extended syntax must look different, so there will be no conflict, no
need for the extension to use syntax that is already usurped by the
current rule.
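As an illustration of a syntactically distinct extension, the sketch
below uses the designated-initializer form that C later adopted in
C99.  It is shown only as one example of a member-selecting syntax
that does not collide with the ordinary form; it is not part of the
draft under discussion, and the names are hypothetical.

```c
union number { double d; long l; };

/* Ordinary form: always initializes the first member, d. */
static union number first = { 2.5 };

/* Designated form (C99): the member is named explicitly, so it
   cannot be mistaken for the ordinary form. */
static union number named = { .l = 42L };

double first_d(void)
{
    return first.d;
}

long named_l(void)
{
    return named.l;
}
```

Both forms can coexist in one program because the parser can tell
them apart from the leading ".l =" alone.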

	Sam Kendall	  {allegra,ihnp4,ima,amd}!wjh12!kendall
	Delft Consulting Corp.	    decvax!genrad!wjh12!kendall

------------------------------

End of mod.std.c Digest - Tue, 12 Feb 85 05:01:54 EST
******************************

USENET -> posting only through cbosgd!std-c.
ARPA -> replies to cbosgd!std-c@BERKELEY.ARPA (NOT to INFO-C)
In all cases, you may also reply to the author(s) above.
-- 
Orlando Sotomayor-Diaz	/AT&T Bell Laboratories, Red Hill Road
			/Middletown, New Jersey, 07748 (HR 1B 316)
Tel: 201-949-9230	/UUCP: {ihnp4, houxm}!homxa!osd7