lhf@aries5 (Luiz H. deFigueiredo) (08/31/89)
There has been some discussion on the net about hardware protection and
out-of-bounds pointers. ANSI C says that it *is* legal to mention (but not
dereference) a pointer just out-of-bounds, as in

    char a[N];
    char *last=a+N;     /* Here! */
    char *p;
    for (p=a; p<last; p++) do something;

Now I ask, is it possible/legal to do the analogous thing for a-1, as in

    char *head=a-1;     /* Here! */
    for (p=last-1; p>head; p--) do something else;

-------------------------------------------------------------------------------
Luiz Henrique de Figueiredo             internet: lhf@aries5.waterloo.edu
Computer Systems Group                  bitnet:   lhf@watcsg
University of Waterloo     (possible domains are waterloo.edu and uwaterloo.ca)
-------------------------------------------------------------------------------
henry@utzoo.uucp (Henry Spencer) (09/06/89)
In article <427@maytag.waterloo.edu> lhf@aries5 (Luiz H. deFigueiredo) writes:
>ANSI C says that it *is* legal to mention (but not dereference) a pointer
>just out-of-bounds [on the high end]...
>Now I ask, is it possible/legal to do the analogous thing for a-1 ...

No.

High-end out-of-bounds pointers are very common, and all an implementation
usually has to do to make them work is to ensure that the end of the array
is at least one byte below a segment boundary (assuming that the hardware
has segments), so that a high-end pointer is still in the same linear
address space as a normal one and address arithmetic therefore works as
expected. Low-end out-of-bounds pointers are much less common, and making
them work would typically require that the beginning of the array be at
least one array-element-size after a segment boundary... which can get
expensive if the elements are big. The asymmetry is because a pointer to
an object generally points to its beginning. It was felt (I am told) that
the cost:benefit ratio was favorable for high-end but not for low-end.

So it is not legal or portable to do this for a-1. (Whether it is possible
is implementation-dependent.)
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
jsdy@hadron.UUCP (Joseph S. D. Yao) (09/07/89)
In article <427@maytag.waterloo.edu> lhf@aries5 (Luiz H. deFigueiredo) writes:
-ANSI C says that it *is* legal to mention (but not dereference) a pointer
-just out-of-bounds as in
-
- char a[N];
- char *last=a+N; /* Here! */
- char *p;
- for (p=a; p<last; p++)
-Now I ask, is it possible/legal to do the analogous thing for a-1 as in
- char *head=a-1; /* Here! */
- for (p=last-1; p>head; p--)
This, too, is an out-of-bounds pointer, and is covered by the same
rule. Nothing says that an OOB ptr has to be positively offset.
joe yao
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/07/89)
In article <867@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>- char *head=a-1; /* Here! */
>This, too, is an out-of-bounds pointer, and is covered by the same
>rule. Nothing says that an OOB ptr has to be positively offset.

Sorry, Joe, but you're wrong. Only the last+1 OOB pointer is legal,
not the first-1. I've seen this fail in practice (in AT&T's UNIX
implementation of bsearch(), as I recall) when the array element
was fairly large and first-1 happened to wrap around the address
space.
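
A binary search does not need first-1 at all if its bounds are kept as a
half-open range [lo, hi). What follows is only a minimal sketch of that
idea, not the AT&T bsearch() code mentioned above (which is not shown in
this thread). The name find_sorted and the choice of int elements are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: look for key in the sorted array base[0..n-1].
 * Returns a pointer to a matching element, or NULL if there is none.
 * Every pointer formed stays within [base, base+n]. */
const int *find_sorted(const int *base, size_t n, int key)
{
    const int *lo;
    const int *hi;

    assert(base != NULL);
    lo = base;          /* first candidate */
    hi = base + n;      /* one past the last candidate: legal to form */

    while (lo < hi) {
        /* lo <= mid < hi, so mid always points at a real element */
        const int *mid = lo + (hi - lo) / 2;

        if (*mid < key)
            lo = mid + 1;   /* at most base+n, never dereferenced there */
        else if (*mid > key)
            hi = mid;       /* never drops below base */
        else
            return mid;
    }
    return NULL;
}
```

Because the upper bound is exclusive, an empty range is simply lo == hi,
and the search never has to step a pointer below the first element to
signal "nothing left".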
kdb@chinet.chi.il.us (Karl Botts) (09/08/89)
>High-end out-of-bounds pointers are very common, and all an implementation
>usually has to do to make them work is to ensure that the end of the array
>is at least one byte below a segment boundary (assuming that the hardware
>has segments), so that a high-end pointer is still in the same linear
>address space as a normal one and address arithmetic therefore works as
>expected. Low-end out-of-bounds pointers are much less common, and making

The following is in no way guaranteed by ANSI C, but I think it is
something you can depend on pretty solidly. Any standard implementation of
malloc() et al. puts either the size of the block or a pointer to the next
block in the machine word just before the start of the block; this will be
in the same linear address space. Thus you can be sure that mentioning this
word (or even dereferencing it) will not cause an exception. Of course this
only holds true for malloc()ed blocks.
jsdy@hadron.UUCP (Joseph S. D. Yao) (09/09/89)
In article <10970@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
-In article <867@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
->- char *head=a-1; /* Here! */
->This, too, is an out-of-bounds pointer, and is covered by the same
->rule. Nothing says that an OOB ptr has to be positively offset.
-Sorry, Joe, but you're wrong. Only the last+1 OOB pointer is legal,
-not the first-1. I've seen this fail in practice (in AT&T's UNIX
-implementation of bsearch(), as I recall) when the array element
-was fairly large and first-1 happened to wrap around the address
-space.

Erk. You're right. On the other hand (since stacks don't HAVE to be
at the end of data space), an array could also abut the end of data
space, and thus last+1 becomes NULL. Is there anything forbidding
that?

For that matter, is there anything IN THE STANDARD that says first-1 is
illegal? (Besides the general fact that it's bad practice, of course.)

Joe Yao
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/09/89)
In article <9520@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>Any standard implementation of malloc() et al. puts either the size
>of the block or a pointer to the next block in the machine word just
>before the start of the block ...

Not true. A "buddy system" allocator is MOST unlikely to do so.

>Thus you can be sure that mentioning this word (or even dereferencing it)
>will not cause an exception.

Even in such cases, it still wouldn't help with arrays of large objects,
because first-1 would point many bytes below the start of the allocated
data block.

Just don't use first-1. It's not that hard to avoid.
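
One common way to avoid first-1 in a downward loop is to start at the
legal last+1 pointer and decrement before each access, so the pointer
never leaves [a, a+n]. The following is a minimal sketch of that pattern;
the function name copy_reversed and its out parameter are assumptions
made for illustration and do not come from any post in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy a[0..n-1] into out[0..n-1] in reverse order.
 * The loop tests p against a itself instead of against a-1. */
void copy_reversed(const char *a, size_t n, char *out)
{
    const char *p;

    assert(a != NULL && out != NULL);
    p = a + n;              /* one past the end: legal to form */
    while (p != a) {
        --p;                /* step down first, so p >= a always holds */
        *out++ = *p;        /* p now points at a real element */
    }
}
```

Compared with the loop in the original question, the only change is that
the decrement moves to the top of the loop body and the termination test
compares against the first element rather than a sentinel below it.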
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/09/89)
In article <868@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>Erk. You're right. On the other hand (since stacks don't HAVE to be
>at the end of data space), an array could also abut the end of data
>space, and thus last+1 becomes NULL. Is there anything forbidding
>that?

Yes. A conforming implementation must ensure that that does not occur.
It's easier than for first-1, because a single extra storage word of
slop will suffice no matter how large the array member.

>For that matter, is there anything IN THE STANDARD that says first-1 is
>illegal? (Besides the general fact that it's bad practice, of course.)

Yes, although not in those exact words. Pointers to nonexistent objects
are not valid in strictly conforming programs, with an explicit exception
made for last+1 pointers (so long as no attempt is made to access what is
pointed to by one of them).
chris@mimsy.UUCP (Chris Torek) (09/09/89)
In article <9520@chinet.chi.il.us> kdb@chinet.chi.il.us (Karl Botts) writes:
>The following is in no way guaranteed by ANSI C, but I think it is something
>you can depend on pretty solidly. Any standard implementation of malloc()
>et al. puts either the size of the block or a pointer to the next block in
>the machine word just before the start of the block; this will be in the
>same linear address space. Thus you can be sure that mentioning this word
>(or even dereferencing it) will not cause an exception. Of course this
>only holds true for malloc()ed blocks.

This assumption is inadvisable. Future Unix versions are quite likely
to have better/faster/fancier malloc()s that hide sizes elsewhere.
Putting pointers and sizes at the front of blocks is, for instance,
bad for paging.

As a nice side effect, when the implementation puts malloc() information
outside the blocks allocated, it can (a) arrange for the system to leave
`holes' in the address space at the edges of each block (this traps many
off-the-end references, though not all), and (b) arrange for runtime
checking to make sure you have not written over the ends of allocated
storage. A malloc that finds code like

    p = malloc(strlen(s));
    if (p == NULL)
        die();
    strcpy(p, s);

can be quite helpful.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu     Path: uunet!mimsy!chris
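
The fragment in the post above overruns its block by one byte: strlen(s)
does not count the terminating '\0', but strcpy() writes it. For
comparison, here is a minimal sketch of a correct version. The name
copy_string is an assumption made for illustration, and it returns NULL
on failure instead of calling the die() routine from the fragment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc()ed copy of s, or NULL if the
 * allocation fails. The caller is responsible for free()ing the result. */
char *copy_string(const char *s)
{
    size_t len;
    char *p;

    assert(s != NULL);
    len = strlen(s);
    p = malloc(len + 1);        /* +1 leaves room for the terminating '\0' */
    if (p == NULL)
        return NULL;
    memcpy(p, s, len + 1);      /* copies the characters and the '\0' */
    return p;
}
```

With the block sized len + 1, the last byte written is the last byte
allocated, so a checking malloc of the kind described above would have
nothing to report.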
kdb@chinet.chi.il.us (Karl Botts) (09/15/89)
>Even in such cases, it still wouldn't help with arrays of large objects,
>because first-1 would point many bytes below the start of the allocated
>data block.

No argument. I think I said something about the size of the objects in my
original message.

>Just don't use first-1. It's not that hard to avoid.

No argument again -- I would never do it myself. It has been done,
however. Take a look at yyparse.c the way it handles the stack pointers (I
think they are called "yyvs" and "yypvs" or something like that.) I don't
have a machine where merely mentioning, as opposed to dereferencing, an OOB
pointer will cause an exception, but such machines exist and I suspect that
YACC parsers would fail under certain circumstances on such machines. I'm
out on a bit of a limb here, but I'd be interested if anybody has had such
an experience?
jeenglis@nunki.usc.edu (Joe English) (09/16/89)
kdb@chinet.chi.il.us (Karl Botts) writes:
>>Even in such cases, it still wouldn't help with arrays of large objects,
>
>No argument again -- I would never do it myself. It has been done,
>however. Take a look at yyparse.c the way it handles the stack pointers (I
>think they are called "yyvs" and "yypvs" or something like that.)
> [...]
>YACC parsers would fail under certain circumstances on such machines. I'm
>out on a bit of a limb here, but I'd be interested if anybody has had such
>an experience?

Not likely; YACC produces some hideous looking code, but yyvs[-N] is
guaranteed never to point below the actual value stack because of the
mathematical principles by which the parser is constructed.

Not that this is really relevant to C, but it's worth knowing...

--Joe English
  jeenglis@nunki.usc.edu