[comp.std.c] ANSI C, hardware protection, out-of-bounds pointers

lhf@aries5 (Luiz H. deFigueiredo) (08/31/89)

There has been some discussion on the net about hardware protection and
out-of-bounds pointers.

ANSI C says that it *is* legal to mention (but not dereference) a pointer
just out-of-bounds, as in

	char a[N];
	char *last=a+N;		/* Here! */
	char *p;

	for (p=a; p<last; p++)
		do something;

Now I ask: is it possible/legal to do the analogous thing for a-1, as in

	char *head=a-1;		/* Here! */

	for (p=last-1; p>head; p--)
		do something else;

-------------------------------------------------------------------------------
Luiz Henrique de Figueiredo		internet: lhf@aries5.waterloo.edu
Computer Systems Group			bitnet:   lhf@watcsg
University of Waterloo     (possible domains are waterloo.edu and uwaterloo.ca)
-------------------------------------------------------------------------------

henry@utzoo.uucp (Henry Spencer) (09/06/89)

In article <427@maytag.waterloo.edu> lhf@aries5 (Luiz H. deFigueiredo) writes:
>ANSI C says that it *is* legal to mention (but not dereference) a pointer
>just out-of-bounds [on the high end]...
>Now I ask: is it possible/legal to do the analogous thing for a-1 ...

No.

High-end out-of-bounds pointers are very common, and all an implementation
usually has to do to make them work is to ensure that the end of the array
is at least one byte below a segment boundary (assuming that the hardware
has segments), so that a high-end pointer is still in the same linear
address space as a normal one and address arithmetic therefore works as
expected.  Low-end out-of-bounds pointers are much less common, and making
them work would typically require that the beginning of the array be at
least one array-element-size after a segment boundary... which can get
expensive if the elements are big.  The asymmetry is because a pointer to
an object generally points to its beginning.

It was felt (I am told) that the cost:benefit ratio was favorable for
high-end but not for low-end.  So it is not legal or portable to do this
for a-1.  (Whether it is possible is implementation-dependent.)
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
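
A minimal sketch of how the downward loop in the question can be written
without ever forming a-1: start from the permitted one-past-the-end pointer
and decrement before each use.  The function name copy_reversed and its out
parameter are hypothetical, chosen only to make the loop observable.

```c
#include <assert.h>
#include <stddef.h>

/* Visit a[n-1] .. a[0], storing them in out[0] .. out[n-1].
 * The pointer p only ever takes values in [a, a+n]; a-1 is never formed. */
void copy_reversed(const char *a, size_t n, char *out)
{
	const char *p;

	assert(a != NULL && out != NULL);
	for (p = a + n; p > a; ) {	/* a+n: the allowed one-past-the-end pointer */
		--p;			/* step down first, so p is in [a, a+n-1] */
		*out++ = *p;
	}
}
```

When n is 0 the loop test fails immediately and nothing is dereferenced.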

jsdy@hadron.UUCP (Joseph S. D. Yao) (09/07/89)

In article <427@maytag.waterloo.edu> lhf@aries5 (Luiz H. deFigueiredo) writes:
-ANSI C says that it *is* legal to mention (but not dereference) a pointer
-just out-of-bounds as in
-
-	char a[N];
-	char *last=a+N;		/* Here! */
-	char *p;
-	for (p=a; p<last; p++)
-Now I ask: is it possible/legal to do the analogous thing for a-1 as in
-	char *head=a-1;		/* Here! */
-	for (p=last-1; p>head; p--)

This, too, is an out-of-bounds pointer, and is covered by the same
rule.  Nothing says that an OOB ptr has to be positively offset.

joe yao

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/07/89)

In article <867@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>-	char *head=a-1;		/* Here! */
>This, too, is an out-of-bounds pointer, and is covered by the same
>rule.  Nothing says that an OOB ptr has to be positively offset.

Sorry, Joe, but you're wrong.  Only the last+1 OOB pointer is legal,
not the first-1.  I've seen this fail in practice (in AT&T's UNIX
implementation of bsearch(), as I recall) when the array element
was fairly large and first-1 happened to wrap around the address
space.
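
The wrap-around can be illustrated with plain unsigned arithmetic.  The
sketch below assumes a flat 16-bit address space and models addresses as
uint16_t values; the function names and the sample addresses are
hypothetical and are not taken from the bsearch() implementation mentioned
above.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a flat 16-bit address space (an assumption for illustration).
 * Returns the address that "first - 1" would have for elements of
 * elem_size bytes, reduced modulo 2^16 as the address arithmetic would. */
uint16_t addr_before_first(uint16_t first, uint16_t elem_size)
{
	assert(elem_size > 0);
	return (uint16_t)(first - elem_size);
}

/* Nonzero when first-1 has wrapped to an address above the start of the
 * array, which is the case where a loop test such as p > head goes wrong. */
int head_wrapped(uint16_t first, uint16_t elem_size)
{
	return addr_before_first(first, elem_size) > first;
}
```

For an array at address 0x0100 with 0x0400-byte elements,
addr_before_first(0x0100, 0x0400) yields 0xFD00.  That compares greater than
every address in an array that ends below 0xFD00, so in this model a test
like p > head is false the first time it is evaluated and the loop body
never runs.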

kdb@chinet.chi.il.us (Karl Botts) (09/08/89)

>High-end out-of-bounds pointers are very common, and all an implementation
>usually has to do to make them work is to ensure that the end of the array
>is at least one byte below a segment boundary (assuming that the hardware
>has segments), so that a high-end pointer is still in the same linear
>address space as a normal one and address arithmetic therefore works as
>expected.  Low-end out-of-bounds pointers are much less common, and making

The following is in no way guaranteed by ANSI C, but I think it is something
you can depend on pretty solidly.  Any standard implementation of malloc()
et al. puts either the size of the block or a pointer to the next block in
the machine word just before the start of the block; this will be in the
same linear address space.  Thus you can be sure that mentioning this word
(or even dereferencing it) will not cause an exception.  Of course this
only holds true for malloc()ed blocks.

jsdy@hadron.UUCP (Joseph S. D. Yao) (09/09/89)

In article <10970@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
-In article <867@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
->-	char *head=a-1;		/* Here! */
->This, too, is an out-of-bounds pointer, and is covered by the same
->rule.  Nothing says that an OOB ptr has to be positively offset.
-Sorry, Joe, but you're wrong.  Only the last+1 OOB pointer is legal,
-not the first-1.  I've seen this fail in practice (in AT&T's UNIX
-implementation of bsearch(), as I recall) when the array element
-was fairly large and first-1 happened to wrap around the address
-space.

Erk.  You're right.  On the other hand (since stacks don't HAVE to be
at the end of data space), an array could also abut the end of data
space, and thus last+1 becomes NULL.  Is there anything forbidding
that?

For that matter, is there anything IN THE STANDARD that says first-1 is
illegal?  (Besides the general fact that it's bad practice, of course.)

Joe Yao

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/09/89)

In article <9520@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>Any standard implementation of malloc() et al. puts either the size
>of the block or a pointer to the next block in the machine word just
>before the start of the block ...

Not true.  A "buddy system" allocator is MOST unlikely to do so.

>Thus you can be sure that mentioning this word (or even dereferencing it)
>will not cause an exception.

Even in such cases, it still wouldn't help with arrays of large objects,
because first-1 would point many bytes below the start of the allocated
data block.

Just don't use first-1.  It's not that hard to avoid.
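
One way to avoid first-1 in a search routine, sketched under the assumption
of a sorted array of int: keep a half-open range [lo, hi), so the only
out-of-range pointer ever formed is the permitted one past the end.  The
name find_int is hypothetical, and this is not the AT&T bsearch() code.

```c
#include <assert.h>
#include <stddef.h>

/* Binary search over a sorted int array using a half-open range [lo, hi).
 * Returns a pointer to a matching element, or NULL if key is absent. */
const int *find_int(const int *base, size_t n, int key)
{
	const int *lo, *hi;

	assert(base != NULL);
	lo = base;
	hi = base + n;			/* one past the end: allowed */
	while (lo < hi) {
		const int *mid = lo + (hi - lo) / 2;

		if (*mid < key)
			lo = mid + 1;	/* at most hi, hence at most base+n */
		else if (*mid > key)
			hi = mid;	/* never below lo, hence never below base */
		else
			return mid;
	}
	return NULL;
}
```

Because the upper bound is exclusive, narrowing to the left of mid is
written hi = mid rather than hi = mid - 1, and no pointer below base is
ever computed.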

gwyn@smoke.BRL.MIL (Doug Gwyn) (09/09/89)

In article <868@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>Erk.  You're right.  On the other hand (since stacks don't HAVE to be
>at the end of data space), an array could also abut the end of data
>space, and thus last+1 becomes NULL.  Is there anything forbidding
>that?

Yes.  A conforming implementation must ensure that that does not occur.
It's easier than for first-1, because a single extra storage word of slop
will suffice no matter how large the array member.

>For that matter, is there anything IN THE STANDARD that says first-1 is
>illegal?  (Besides the general fact that it's bad practice, of course.)

Yes, although not in those exact words.  Pointers to nonexistent objects
are not valid in strictly conforming programs, with an explicit exception
made for last+1 pointers (so long as no attempt is made to access what is
pointed to by one of them).

chris@mimsy.UUCP (Chris Torek) (09/09/89)

In article <9520@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>The following is in no way guaranteed by ANSI C, but I think it is something
>you can depend on pretty solidly.  Any standard implementation of malloc()
>et al. puts either the size of the block or a pointer to the next block in
>the machine word just before the start of the block; this will be in the
>same linear address space.  Thus you can be sure that mentioning this word
>(or even dereferencing it) will not cause an exception.  Of course this
>only holds true for malloc()ed blocks.

This assumption is inadvisable.  Future Unix versions are quite likely
to have better/faster/fancier malloc()s that hide sizes elsewhere.
Putting pointers and sizes at the front of blocks is, for instance,
bad for paging.

As a nice side effect, when the implementation puts malloc() information
outside the blocks allocated, it can (a) arrange for the system to
leave `holes' in the address space at the edges of each block (this
traps many off-the-end references, though not all), and (b) arrange
for runtime checking to make sure you have not written over the ends
of allocated storage.

A malloc that finds code like

	p = malloc(strlen(s));
	if (p == NULL) die();
	strcpy(p, s);

can be quite helpful.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
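
A minimal sketch of the kind of runtime check described in (b), assuming a
wrapper around malloc() that appends a few guard bytes to each block.  The
names guarded_alloc, guard_damaged, GUARD_LEN and GUARD_BYTE are
hypothetical.  Unlike a real allocator, this sketch keeps no bookkeeping of
its own and relies on the caller to pass the requested size back in.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8	/* hypothetical number of guard bytes */
#define GUARD_BYTE 0xA5	/* hypothetical fill pattern */

/* Allocate n usable bytes followed by GUARD_LEN guard bytes. */
char *guarded_alloc(size_t n)
{
	char *p = malloc(n + GUARD_LEN);

	if (p != NULL)
		memset(p + n, GUARD_BYTE, GUARD_LEN);
	return p;
}

/* Return 1 if any guard byte after the n usable bytes was overwritten. */
int guard_damaged(const char *p, size_t n)
{
	size_t i;

	assert(p != NULL);
	for (i = 0; i < GUARD_LEN; i++)
		if ((unsigned char)p[n + i] != GUARD_BYTE)
			return 1;
	return 0;
}
```

With a block obtained from guarded_alloc(strlen(s)), the strcpy() in the
fragment above writes its terminating '\0' onto the first guard byte, and
guard_damaged() reports it.  Asking for strlen(s) + 1 bytes leaves the
guard intact.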

kdb@chinet.chi.il.us (Karl Botts) (09/15/89)

>Even in such cases, it still wouldn't help with arrays of large objects,
>because first-1 would point many bytes below the start of the allocated
>data block.

No argument.  I think I said something about the size of the objects in my
original message.

>Just don't use first-1.  It's not that hard to avoid.

No argument again -- I would never do it myself.  It has been done,
however.  Take a look at the way yyparse.c handles the stack pointers (I
think they are called "yyvs" and "yypvs" or something like that).  I don't
have a machine where merely mentioning, as opposed to dereferencing, an OOB
pointer will cause an exception, but such machines exist and I suspect that
YACC parsers would fail under certain circumstances on such machines.  I'm
out on a bit of a limb here, but I'd be interested if anybody has had such
an experience?

jeenglis@nunki.usc.edu (Joe English) (09/16/89)

kdb@chinet.chi.il.us (Karl Botts) writes:
>>Even in such cases, it still wouldn't help with arrays of large objects,
>
>No argument again -- I would never do it myself.  It has been done,
>however.  Take a look at the way yyparse.c handles the stack pointers (I
>think they are called "yyvs" and "yypvs" or something like that.)  
> [...]
>YACC parsers would fail under certain circumstances on such machines.  I'm
>out on a bit of a limb here, but I'd be interested if anybody has had such
>an experience?

Not likely;  YACC produces some hideous-looking code, but
yyvs[-N] is guaranteed never to point below the actual
value stack because of the mathematical principles by which
the parser is constructed.

Not that this is really relevant to C, but it's worth knowing...

--Joe English

  jeenglis@nunki.usc.edu