[mod.std.c] mod.std.c Digest V6#10

osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (05/19/85)

From: Orlando Sotomayor-Diaz (The Moderator) <cbosgd!std-c>


mod.std.c Digest            Sun, 19 May 85       Volume 6 : Issue  10 

Today's Topics:
                          declarator parsing
                        Pointer math and ints
----------------------------------------------------------------------

Date: 19 May 85 00:55:22 CDT (Sun)
From: ihnp4!utzoo!henry
Subject: declarator parsing
To: ihnp4!cbosgd!std-c

Has anybody else noticed what, uh, fun it is to try to parse the current
(11 Feb 1985, the latest I have anyway) draft ANSI C using a top-down
parser?  Most of it is OK, no worse than it used to be, but function
declarations (as opposed to definitions) have one substantial problem:
you don't know whether to parse the things in the parameter list using
declarators or abstract declarators, because they can have either.  (They
can even be some of one and some of the other, bletch!)  The part of the
parser that does declarators has to discover what it's dealing with as
it goes.  The syntaxes for declarators and abstract declarators are pretty
similar, but it's still a pain.  It will also adversely affect recovery
from errors, I think.  Ugh.
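
To make the complaint concrete, here is a small sketch in present-day C
(not taken from the draft; the names apply and twice are invented for
illustration).  The same function is declared three ways: with
declarators, with abstract declarators, and with a mixture.  After
reading "int (" in a parameter list, a top-down parser cannot yet tell
which form it is in.

```c
/* Three compatible declarations of one function type. */
int apply(int (*fn)(int), int arg);  /* declarators: both parameters named   */
int apply(int (*)(int), int);        /* abstract declarators: no names       */
int apply(int (*fn)(int), int);      /* mixed: one named, one unnamed        */

/* A definition must name its parameters, so only the first form fits here. */
int apply(int (*fn)(int), int arg)
{
    return fn(arg);
}

/* Hypothetical helper used only to exercise apply(). */
int twice(int x)
{
    return 2 * x;
}
```

A call such as apply(twice, 21) works identically whichever of the three
declarations the compiler saw; the difference is purely a parsing matter.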

"Parsing C declarators:  it's not just an algorithm, it's an adventure."

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

Date: 19 May 85 00:56:43 CDT (Sun)
From: ihnp4!utzoo!henry
Subject: Pointer math and ints
To: ihnp4!cbosgd!std-c

> The next question is: Why did the Committee choose to implement pointer
> subtraction in this fashion?  It is one thing to break existing code in
> a manner which is easily flagged by the compiler.  It is another matter
> entirely to break existing code in a manner which is difficult to detect.
> I suspect this was an editorial oversight by the Committee.

My understanding is that it was deliberate with malice aforethought.  The
problem is that an int *isn't* necessarily big enough to hold the result
of pointer subtraction.  This is basically a bug in K&R, which was written
back in the days when the only C implementations were on "reasonable"
machines.  Ah, for the good old days...

> A better example of this problem (pointer==long, int==short) occurs when
> one passes the (previously guaranteed to be an int) "p-s" as an actual
> parameter to a subroutine, which is expecting an int.  Under the current
> definition, the "p-s" expression pushes a long on the stack, but the
> subroutine expects a short.  Not only does this mess up the binding for
> this parameter, but it offsets any remaining parameters as well.

Surely lint should detect this, although to be really useful for
portability checking, lint needs to be taught that the result of pointer
subtraction is of type "mumble" rather than just whatever type it happens
to be on the machine lint is being run on.
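
One way to keep such a call safe is to narrow the difference explicitly
and check that it fits.  The following is only a sketch in present-day C,
where the "mumble" type is spelled ptrdiff_t (from <stddef.h>); the
function name diff_as_int is an assumption made up for this example.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Store p - s in *out if it fits in an int.
 * Returns 1 on success, 0 if the difference is out of range for int.
 * Both pointers must point into (or one past the end of) the same array.
 */
int diff_as_int(const char *s, const char *p, int *out)
{
    ptrdiff_t d = p - s;   /* the type pointer subtraction really yields */

    if (d < INT_MIN || d > INT_MAX)
        return 0;          /* would be truncated if passed as an int */
    *out = (int)d;
    return 1;
}
```

The caller can then hand *out to a routine that expects an int, and the
narrowing is visible in the source instead of happening silently on the
stack.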

> Aside from the "breaks existing code" problem, I don't see why the result
> of pointer subtraction is of implementation-defined integral size.  The
> quantity is always divided by the member size, and the memory allocators
> in common use take either an int to specify how much space to allocate, or
> the number of elements and the element size.  Assuming space is allocated
> in this fashion, it is impossible for the result of (proper) pointer
> subtraction to result in more bits than can fit in an int.

Not so; note that array subscripts do not have to fit in an int.  Also,
the parameters to the memory allocators have been unsigned, not int, ever
since V7.
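
As an illustration of the unsigned allocator interface, here is a sketch
in present-day C, where the unsigned type is spelled size_t.  The wrapper
alloc_elems is a hypothetical name, not a library routine; it only shows
that the element count is an unsigned quantity that need not fit in an
int.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Allocate room for count elements of elem_size bytes each.
 * Returns NULL if either argument is zero, if the total size would
 * overflow size_t, or if malloc() itself fails.
 */
void *alloc_elems(size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0)
        return NULL;
    if (count > (size_t)-1 / elem_size)   /* (size_t)-1 is the largest size_t */
        return NULL;
    return malloc(count * elem_size);
}
```

On a machine with 16-bit ints and an unsigned 16-bit size, a request for
40000 chars is legal here, yet the distance between the first and last
element would not fit in an int.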

> I think that architectures where the difference between two
> 'comparable' pointers can not be expressed in any int are probably
> very rare, and in any case very weird. The 8086 is the only machine
> I know of that might be in trouble.

Unfortunately, another trouble spot is 68000 implementations with 16-bit
ints.  (Please, no flames about this being silly; it may be suboptimal,
but it's not totally ridiculous, given the limited 32-bit arithmetic on
the 68000.)

Actually, the problem is *everywhere*, because a machine with n-bit
addresses cannot express every possible difference between two pointers
as a *signed* n-bit number; you need one more bit for the sign.
Unfortunately, the sign of a pointer difference sometimes matters, so you
can't just make it unsigned.  My understanding is that the committee
basically just threw up its hands about this...
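
The arithmetic can be checked with a toy model.  This sketch (present-day
C, using <stdint.h>) pretends addresses are 16 bits wide and computes
differences in a wider type; the function names are invented for the
example and the 16-bit width is an assumption chosen to keep the numbers
small.

```c
#include <stdint.h>

/* Exact difference of two 16-bit "addresses", computed in 32 bits. */
int32_t exact_diff(uint16_t a, uint16_t b)
{
    return (int32_t)a - (int32_t)b;
}

/* Nonzero if d is representable as a signed 16-bit number. */
int fits_in_signed16(int32_t d)
{
    return d >= INT16_MIN && d <= INT16_MAX;
}
```

The differences range from -65535 to +65535, which takes 17 bits as a
signed number; anything beyond +32767 or below -32768 has no signed
16-bit representation.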

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

------------------------------

End of mod.std.c Digest - Sun, 19 May 85 08:44:00 EDT
******************************
USENET -> posting only through cbosgd!std-c.
ARPA -> ... through cbosgd!std-c@BERKELEY.ARPA (NOT to INFO-C)
In all cases, you may also reply to the author(s) above.