chip@tct.uucp (Chip Salzenberg) (06/26/90)
According to rlc@aix.aix.kingston.ibm.com (Roger Collins): >When a commercial computer system doesn't run a piece of software (no >matter how old or poorly written) that runs on other systems, the >computer gets the blame. "The computer" often gets the blame undeservedly. So what? We should NOT make engineering decisions based on perceived blame. Dereferencing null pointers is *illegal* and *non-portable*. >Now, why don't you suggest to your boss that you release a computer >that initializes .bss to 0xffffffff? (Leaving it uninitialized is a >security hole on multiuser systems.) It doesn't break the C language. Yes it would. Uninitialized globals *are* guaranteed to be zero (or null, for pointers). Perhaps you should read Kernighan and Ritchie's book. It is a good idea to learn a language before trying to use it as an example of anything. Otherwise you could end up looking stupid. -- Chip Salzenberg at ComDev/TCT <chip@tct.uucp>, <uunet!ateng!tct!chip>
martin@mwtech.UUCP (Martin Weitzel) (06/27/90)
In article <2392@aix.aix.kingston.ibm.com> rlc@aix.aix.kingston.ibm.com (Roger Collins) writes: [some lines deleted] >To give another example of this situation, almost all UNIX systems >initialize .bss to zero. This causes uninitialized global and static >variables to be initialized to zero by default. Many, many programs, >including System V commands, depend on this fact. They assume >uninitialized global and static variables are initialized to zero >even though the C language explicitly says the values are undetermined. K&R1 Page 37: External and static variables are initialized to zero by default ... Page 198: Static and external variables which are not initialized are guaranteed to start off as 0; ... K&R2 Page 40: External and static variables are initialized to zero by default. Page 219: A static object that is not explicitly initialized is initialized as if it (or its members) were assigned the constant 0. I don't have a copy of the ANSI-C-Standard handy, but I'm pretty sure that it gives the same guarantee. Could you please name *one* reference for the C language which supports your claim? >This is wrong, I know. Yes, of course it's wrong - but wouldn't it have been easier to leave the last two sentences out? (Of course, this would have made all the rest of the article pointless, so I guess the last sentence doesn't apply to the claim, but to the fact that many programs make the - perfectly correct - assumption that all uninitialized data is initialized with zero.) >Now, why don't you suggest to your boss that you release a computer >that initializes .bss to 0xffffffff? (Leaving it uninitialized is a >security hole on multiuser systems.) It doesn't break the C language. Really? I'm pretty sure it breaks the C Language! Again: Please name *one* reference for the C language that makes you think it wouldn't. [rest deleted] -- Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
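The guarantee quoted from K&R above can be sketched in a few lines of C. This is only an illustration; the names `counters`, `global_count`, `global_name`, `totals` and `statics_start_at_zero` are made up for the example and do not come from the thread.

```c
#include <stddef.h>

/* Hypothetical names, chosen only for this illustration. */
struct counters {
    int    hits;
    double ratio;
    char  *label;
};

int             global_count;   /* external object, no initializer      */
static char    *global_name;    /* file-scope static pointer            */
static struct counters totals;  /* every member is zero-initialized too */

/* Returns 1 if each object above holds its guaranteed initial value. */
int statics_start_at_zero(void)
{
    static long calls;          /* block-scope static: also starts at 0 */

    return global_count == 0
        && global_name == NULL
        && totals.hits == 0
        && totals.ratio == 0.0
        && totals.label == NULL
        && calls == 0;
}
```

Note that the guarantee is stated in terms of values (0, 0.0, a null pointer), not in terms of bit patterns, so a loader that fills .bss with some other pattern would have to be compensated for by the implementation.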
rsalz@bbn.com (Rich Salz) (06/27/90)
In <2392@aix.aix.kingston.ibm.com> rlc@aix.aix.kingston.ibm.com (Roger Collins) writes: |To give another example of this situation, almost all UNIX systems |initialize .bss to zero. This causes uninitialized global and static |variables to be initialized to zero by default. Many, many programs, |including System V commands, depend on this fact. They assume |uninitialized global and static variables are initialized to zero |even though the C language explicitly says the values are undetermined. This is totally and completely wrong. Static data is guaranteed to be initialized to zero. See, for example, K&R 2nd edition, page 219 and page 86. Gee, I hope you're not in the compiler or OS group. :-) Followups to comp.lang.c /r$ -- Please send comp.sources.unix-related mail to rsalz@uunet.uu.net. Use a domain-based address or give alternate paths, or you may lose out.
applix@runxtsa.runx.oz.au (Andrew Morton) (06/29/90)
> > I recently found to my delight that the Sparc architecture > core dumps if a NULL pointer is dereferenced, and I want to find > other systems that do this. [ stuff deleted ] > What other machines do this? I know that the 3B2 and 3B15 > don't do it, and neither does AT&T UNIX on the 386. The NCR Tower series (680x0 System V boxes) do this. Executables are loaded at address 0x8000 and a reference to any address lower than that will make your program drop core. I agree that this is a very useful feature. If this breaks any existing code, then the code is already broken!
jc@minya.UUCP (John Chambers) (06/29/90)
In article <13226@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes: > In article <2389@aix.aix.kingston.ibm.com> rlc@aix.aix.kingston.ibm.com (Roger Collins) writes: > >In article <31079@cup.portal.com> thad@cup.portal.com (Thad P Floryan) writes: > >> Point being: there "may" be some method (perhaps under software control) to > >> disable the MMU on YOUR system to catch NULL dereferencing... > >First, the way that the systems causes null pointer dereferencing > >*not* to fail is so simple that all systems should do it: just make > >sure address 0 (page 0) is allocated and set to 0. > > To the contrary, this merely prevents genuine bugs from being caught > as soon as they would be were dereference of a null pointer to trap. > Dereferencing a null pointer is a serious BUG in any application and > can indicate an algorithmic error that should be tracked down before > it is too late. Hey, wait just a minute here. I can't let such an erroneous error go unchallenged! Dereferencing a null pointer is quite definitely *not* an error, bug, mistake, or any other pejorative, in a great many sorts of applications. The trouble with generalizing to all C code is that C outgrew Unix about a decade ago. I've written many programs that are to be downloaded and run all by their lonesome in an "embedded" processor. Lots of people use C for such applications, and there are a *lot* of processors (in fact, the overwhelming majority of them) that run only one program (without an operating system) for their entire life. Such embedded, standalone programs not only can, but are required to access all of physical memory, including address zero. The hardware says that something particular resides at the bottom end of memory, and such a program must access that byte/word/whatever, or it can't do its job. If a C compiler won't let me access that address, then I look for another that was written by someone more competent. 
Now, if you were to suggest an *optional* feature that would allow me to turn null-pointer checking off and on (preferably at run time, but a compile-time flag would be useful), I'd grab it and use it. After all, it is pretty much true that when a process running on a Unix system dereferences a null pointer, it is generally a bug. Of course, all C compilers already have a perfectly good tool for doing a run-time check for a null pointer: if (p == 0 ) ... I use it a lot. In particular, I introduce it all over the place in code that I'm porting from VAXen to other machines. I suggest that others try it. It's a whole lot nicer than having your program bomb out on a SEGV. (Now if there were only an equivalent piece of code that would tell me whether a non-null pointer is valid.) -- Uucp: ...!{harvard.edu,ima.com,eddie.mit.edu,ora.com}!minya!jc (John Chambers) Home: 1-617-484-6393 Work: 1-508-952-3274 Cute-Saying: [I've gotta get a new one of these some day.]
jik@athena.mit.edu (Jonathan I. Kamens) (06/29/90)
In article <412@minya.UUCP>, jc@minya.UUCP (John Chambers) writes: |> Hey, wait just a minute here. I can't let such an erroneous error go |> unchallenged! Dereferencing a null pointer is quite definitely *not* |> an error, bug, mistake, or any other pejorative, in a great many sorts |> of applications. |> |> The trouble with generalizing to all C code is that C outgrew Unix |> about a decade ago. From K&R Second Edition, Page 102: "C guarantees that zero is never a valid address for data, so a return value of zero can be used to signal an abnormal event, in this case, no space." ANSI C is a lot newer than "about a decade ago." If you use a compiler that allows 0 to be a reference to valid data, then your compiler is non-standard. There may be situations (such as the one you described in your posting) in which such a non-standard configuration is desired, but just because it's desired doesn't mean it's valid standard C. It would seem to me that a simpler solution to the embedded processor problem than requiring a non-standard C compiler in order to write code for one would be to not have any physical memory at address 0, or to put program memory there (since, unless the program is self-modifying, it should never have to access its own memory, excluding perhaps function pointers). Jonathan Kamens USnail: MIT Project Athena 11 Ashford Terrace jik@Athena.MIT.EDU Allston, MA 02134 Office: 617-253-8495 Home: 617-782-0710
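The convention the K&R passage describes, returning a null pointer to signal "no space", might look roughly like the sketch below. It is not K&R's code; `POOL_SIZE`, `pool`, `next_free` and `pool_alloc` are hypothetical names, and the sketch ignores alignment.

```c
#include <stddef.h>

#define POOL_SIZE 1024           /* hypothetical pool size */

static char  pool[POOL_SIZE];    /* storage handed out by pool_alloc */
static char *next_free = pool;   /* first unused byte in the pool    */

/* Returns a pointer to n bytes, or a null pointer if the pool
   has no space left. */
char *pool_alloc(size_t n)
{
    size_t remaining = (size_t)(pool + POOL_SIZE - next_free);

    if (n > remaining)
        return NULL;             /* "no space": never a valid object */
    next_free += n;
    return next_free - n;
}
```

The caller can tell failure from success only because no object handed out by `pool_alloc` can ever compare equal to the null pointer.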
lwa@skeptic.osf.org (Larry Allen) (06/29/90)
Sorry, John, I have to disagree. On such a piece of hardware, a NULL pointer simply can't be represented as an all zero bit pattern. The representation of the NULL pointer has to guarantee that a NULL pointer will NEVER match a valid address -- if it does, then a comparison against NULL is not guaranteed to be unique. You have to get out of the mindset of expecting that NULL is equal to 0. As Doug Gwyn has so eloquently pointed out in the past, a NULL pointer is NOT a zero bit pattern, it's a distinguished language construct that's represented by the token "0" of type "pointer to ...". The language doesn't (and shouldn't) place any requirements on how NULL is represented, save that it must not be the same as any piece of valid storage that the program can access. So, you see, dereferencing a pointer whose bit representation is 0 may well be a valid thing to do, but dereferencing a NULL pointer is never valid. -Larry Allen Open Software Foundation
scott@bbxsda.UUCP (Scott Amspoker) (06/29/90)
In article <412@minya.UUCP> jc@minya.UUCP (John Chambers) writes: > Such embedded, standalone >programs not only can, but are required to access all of physical >memory, including address zero. The hardware says that something >particular resides at the bottom end of memory, and such a program >must access that byte/word/whatever, or it can't do its job. If a >C compiler won't let me access that address, then I look for another ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >that was written by someone more competent. Please re-read the thread. We are not talking about a feature of the C compiler but a feature of the host operating system. If you are running a standalone, absolute memory application then such error traps would not apply. >Of course, all C compilers already have a perfectly good tool for >doing a run-time check for a null pointer: > if (p == 0 ) ... >I use it a lot. In particular, I introduce it all over the place >in code that I'm porting from VAXen to other machines. I suggest >that others try it. It's a whole lot nicer than having your program >bomb out on a SEGV. Well, what if p==0? continue? abort? I think posters on this thread were referring to the situations where a null pointer should *not*, by definition, be present. In which case, dereferencing a NULL would indicate a bug that the programmer would wish to discover early in the testing phase. If the pointer in question could not be trusted (i.e. passed by an outside library routine) then it would be prudent to explicitly test it as you have shown. -- Scott Amspoker Basis International, Albuquerque, NM (505) 345-5232 unmvax.cs.unm.edu!bbx!bbxsda!scott
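The distinction drawn above, between pointers that must never be null and pointers that cannot be trusted, could be sketched as follows. The function names and the -1 error convention are assumptions made for the example, not anything proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Internal helper: callers inside the program must never pass a null
   pointer, so a violation is a bug to be caught early in testing. */
static size_t name_length(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}

/* Boundary function: the pointer comes from outside code and cannot be
   trusted, so a null pointer is tested for and reported (here as -1). */
long checked_name_length(const char *name)
{
    if (name == NULL)
        return -1;
    return (long)name_length(name);
}
```

With this split, the explicit test answers the "what if p==0?" question at the boundary, while the assertion (or a trapping page 0) catches the cases that should be impossible.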
stripes@eng.umd.edu (Joshua Osborne) (06/30/90)
In article <1990Jun29.132304.12550@athena.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes: > It would seem to me that a simpler solution to the embedded processor >problem than requiring a non-standard C compiler in order to write code >for one would be to not have any physical memory at address 0, or to put >program memory there (since, unless the program is self-modifying, it >should never have to access its own memory, excluding perhaps function >pointers). Most processors treat the first N bytes of memory differently. The 6510 uses the first 2 bytes as an I/O register (which implements bank switching on the C=64). The 680x0 uses the first N bytes for addresses of traps, interrupt vectors, and something else (is it the value of A7 after cold boot?). They are things that an OS would need to change, not a user. They are most likely things that standard system functions like malloc() never return pointers to. So NULL could still be returned as an error indicator from functions that return pointers. On systems with an OS only the OS should be able to write to NULL (the Atari ST only lets programs write to the first 2 or 4K when the S bit is on). If a system has no OS (non-hosted env) then a C program may well have to write to location 0, which may well be NULL, but it is still not a valid pointer to "normal" memory. Also there may be *no* invalid addresses: a CPU with a 16-bit address space and 64K of memory has no invalid addresses (unless it has alignment restrictions...). -- stripes@eng.umd.edu "Security for Unix is like Josh_Osborne@Real_World,The Mutitasking for MS-DOS" "The dyslexic porgramer" - Kevin Lockwood "Don't try to change C into some nice, safe, portable programming language with all sharp edges removed, pick another language." - John Limpert
cpcahil@virtech.uucp (Conor P. Cahill) (06/30/90)
In article <412@minya.UUCP> jc@minya.UUCP (John Chambers) writes: >Hey, wait just a minute here. I can't let such an erroneous error go >unchallenged! Dereferencing a null pointer is quite definitely *not* >an error, bug, mistake, or any other pejorative, in a great many sorts >of applications. > >The trouble with generalizing to all C code is that C outgrew Unix >about a decade ago. HOWEVER, this is not comp.lang.c. This is comp.unix.whatever, and in this context dereferencing a null pointer is, or should be, a no-no. Any Unix code that does so has at least 1 serious bug that should be fixed. I remember one time when I was porting some code to a Sun 3 and ran across the following line of code:
    if( strcmp(variable,(char*)0) == 0 )
I really wasn't sure what the original programmer was attempting to check (since the first byte at 0 was a null byte on the original machine), so I changed the code to be:
    if( (variable == (char *)0) || (*variable == '\0'))
-- Conor P. Cahill (703)430-9247 Virtual Technologies, Inc., uunet!virtech!cpcahil 46030 Manekin Plaza, Suite 160 Sterling, VA 22170
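The replacement test in the post above can be wrapped in a small helper so the intent is stated once. This is only a sketch; `is_null_or_empty` is a hypothetical name, not something from the original code.

```c
#include <stddef.h>

/* Hypothetical helper: true if s is a null pointer or points at an
   empty string. The null test comes first, so *s is never evaluated
   for a null pointer. */
int is_null_or_empty(const char *s)
{
    return s == NULL || *s == '\0';
}
```

Unlike the original `strcmp(variable,(char*)0)`, this does not depend on what, if anything, happens to be readable at address 0 on a given machine.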
aryeh@eddie.mit.edu (Aryeh M. Weiss) (06/30/90)
Under SCO Xenix V/386, 386 native 32-bit (`small' model) programs dump core on NULL deref. This is because location 0 is not allocated to the data space. Actually, the stack grows down from 0x1880000, while static and heap storage grow up from this location (although this offset can be changed by a linker option). The situation is drastically different for 286 16-bit programs under Xenix 386 or Xenix 286. Small model 16-bit programs do not core dump because location 0 IS in the memory map. On the other hand, large and compact model 16-bit programs, where POINTERS are 32 bits (or 32-bit `far' pointers in medium/small programs), will cause a core dump because the most significant 16 bits of the pointer are actually a selector for the segment table and segment 0 cannot exist.
chapman@sco.COM (Brian Chapman) (07/01/90)
aryeh@eddie.mit.edu (Aryeh M. Weiss) writes: >Under SCO Xenix V/386, 386 native 32-bit (`small' model) programs dump core >on NULL deref. This is because location 0 is not allocated to the data space. >Actually, stack grows down from 0x1880000, while static and heap storage grow >up from this location (although this offset can be changed by a linker option). Yes, but we kluged it (for the application support reasons given previously) so that if you referenced 0 it would be _added_ to your address space as a read-only address with 0 in it. So writes cause SEGV at least. Under Unix 3.2 both text and data segments are the same piece of memory, so a more normal arrangement exists: text is loaded at 0 in *the* address space, and NULL data pointers point at the read-only text address 0 (which does not contain zero). >The situation is drastically different for 286 16-bit programs under Xenix 386 >or Xenix 286. Small model 16-bit programs do not core dump because location 0 >IS in the memory map. And steps were taken in libc to make sure that 0 was loaded with 0. Again I know this is a kluge but.... (shrug). -- Brian Chapman uunet!sco!chapman Pay no attention to the man behind the curtain!
jc@minya.UUCP (John Chambers) (07/02/90)
In article <26878337.172@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes: > According to rlc@aix.aix.kingston.ibm.com (Roger Collins): > >When a commercial computer system doesn't run a piece of software (no > >matter how old or poorly written) that runs on other systems, the > >computer gets the blame. > > "The computer" often gets the blame undeservedly. > > So what? > > We should NOT make engineering decisions based on > perceived blame. Dereferencing null pointers is > *illegal* and *non-portable*. Beg to differ, but C is widely used for writing embedded code (i.e., code that runs standalone on a board with a processor, some memory, and generally some other interesting hardware). The hardware always insists that something particular be kept in low memory. The code must be able to read (and often write) location zero, or it can't possibly do its job correctly. This is as true of the typical Unix kernel as it is of any other system. If a C compiler doesn't allow dereferencing a null pointer, the applications that must do so due to the hardware's requirements are rather crippled, or must be coded partially in some other language. True, dereferencing a zero pointer is usually incorrect. But "usually" and "always" aren't even nearly synonyms. Until we can get the hardware designers to stop using address zero, we are stuck with the situation. Anyhow, there's a universally-available tool in C for testing for a null pointer: if (p == 0) ... Now if there were only a way to test a non-null pointer for validity, *that* would be really useful. (This is unix.wizards? Sheesh! ;-) -- Uucp: ...!{harvard.edu,ima.com,eddie.mit.edu,ora.com}!minya!jc (John Chambers) Home: 1-617-484-6393 Work: 1-508-952-3274 Cute-Saying: [I've gotta get a new one of these some day.]
goudreau@larrybud.rtp.dg.com (Bob Goudreau) (07/02/90)
In article <413@minya.UUCP>, jc@minya.UUCP (John Chambers) writes: > > > > We should NOT make engineering decisions based on > > perceived blame. Dereferencing null pointers is > > *illegal* and *non-portable*. > > Beg to differ, but C is widely used for writing embedded code (i.e., > code that runs standalone on a board with a processor, some memory, > and generally some other interesting hardware). The hardware always > insists that something particular be kept in low memory. The code > must be able to read (and often write) location zero, or it can't > possibly do its job correctly. This is as true of the typical Unix > kernel as it is of any other system. If a C compiler doesn't allow > dereferencing a null pointer, the applications that must do so due > to the hardware's requirements are rather crippled, or must be coded > partially in some other language. > > True, dereferencing a zero pointer is usually incorrect. But "usually" > and "always" aren't even nearly synonyms. Until we can get the hardware > designers to stop using address zero, we are stuck with the situation. You are confusing the C language's null pointer with a machine address whose bit pattern happens to be all zeros. These two concepts are *not* one and the same. For a thorough explanation of the null pointer, go read comp.lang.c's FAQ articles. ------------------------------------------------------------------------ Bob Goudreau +1 919 248 6231 Data General Corporation 62 Alexander Drive goudreau@dg-rtp.dg.com Research Triangle Park, NC 27709 ...!mcnc!rti!xyzzy!goudreau USA
chip@tct.uucp (Chip Salzenberg) (07/04/90)
According to jc@minya.UUCP (John Chambers): >In article <26878337.172@tct.uucp>, chip@tct.uucp (Chip Salzenberg) writes: >> We should NOT make engineering decisions based on >> perceived blame. Dereferencing null pointers is >> *illegal* and *non-portable*. > >Beg to differ, but C is widely used for writing embedded code [...] You are taking my statement out of context. I was talking about an *application* that failed because it was moved from an architecture that permitted user programs to dereference NULL to an architecture that did not permit it. In that context, if "the computer" or "the compiler" gets the blame for the failure, it's a bum rap, because the original programmer is at fault for writing non-portable code. In the context of device drivers, though, "non-portable" is often the order of the day, and the above comments do not apply. -- Chip Salzenberg at ComDev/TCT <chip@tct.uucp>, <uunet!ateng!tct!chip>
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/04/90)
In article <412@minya.UUCP> jc@minya.UUCP (John Chambers) writes: >In article <13226@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes: >> To the contrary, this merely prevents genuine bugs from being caught >> as soon as they would be were dereference of a null pointer to trap. >> Dereferencing a null pointer is a serious BUG in any application and >> can indicate an algorithmic error that should be tracked down before >> it is too late. >Hey, wait just a minute here. I can't let such an erroneous error go >unchallenged! Dereferencing a null pointer is quite definitely *not* >an error, bug, mistake, or any other pejorative, in a great many sorts >of applications. It is ALWAYS an error, since a null pointer by definition does not point to valid storage. >The trouble with generalizing to all C code is that C outgrew Unix >about a decade ago. What does that have to do with anything? Indeed, in many UNIX implementations you could actually get away with dereferencing a null pointer. It has taken years to stamp out most such abuses in code originally developed under such UNIX implementations. This argues in exactly the opposite direction from how you must have intended. >... Such embedded, standalone programs not only can, but are required >to access all of physical memory, including address zero. Data memory address zero has no necessary relation to a null pointer. It is tricky to code an access to such an absolute address in C, because if you write something like "(foo *)0" you have specified a null pointer of type "pointer to foo", not a pointer to a "foo" object stored at machine location zero. (Back in the old days of UNIX, there was no need to distinguish between these, but now there is a definite distinction.) 
There are correct ways to code the intended effect in C, which I will leave as an exercise for you to work on, but it should be noted that it is highly likely that a compiler for such a target environment does use the same representation for both a null pointer and a pointer to something at location zero, in which case you can continue to use a naive approach to coding such operations. However, you then cannot distinguish between a valid pointer to such an address and a null pointer, which could cause other algorithmic problems. There are a variety of ways that a C implementation can represent null pointers using other than all-0 bit patterns. I won't bore you with implementation details, but you should be aware that such methods exist and may be used by the C implementation if access to machine address zero was considered important by the implementor.
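The difference between "a null pointer" and "an all-zero bit pattern" can be probed without dereferencing anything. The sketch below is an illustration under stated assumptions: both function names are invented here, and the first function merely reports what the current implementation does.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if this implementation happens to store a null char pointer
   as all-zero bytes, 0 otherwise. Either answer is permitted by the
   language; only the object representation is inspected. */
int null_is_all_zero_bits(void)
{
    char *p = NULL;
    unsigned char zeros[sizeof p] = {0};

    return memcmp(&p, zeros, sizeof p) == 0;
}

/* A null pointer compares unequal to a pointer to any object, whatever
   bit pattern the implementation uses for it. */
int null_differs_from(const void *object_address)
{
    return object_address != NULL;
}
```

On most current machines the first function will report 1, which is the case described above where a null pointer and a pointer to location zero cannot be told apart.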
jc@minya.UUCP (John Chambers) (07/06/90)
In article <1990Jun29.132304.12550@athena.mit.edu>, jik@athena.mit.edu (Jonathan I. Kamens) writes: > > From K&R Second Edition, Page 102: "C guarantees that zero is never a > valid address for data, so a return value of zero can be used to signal > an abnormal event, in this case, no space." ANSI C is a lot newer than > "about a decade ago." And on page 192 of my C bible I find the paragraph: The compilers currently allow a pointer to be assigned to an integer, an integer to a pointer, and a pointer to a pointer of another type. The assignment is a pure copy operation, with no conversion. This usage is nonportable, and may produce pointers which cause address exceptions when used. However, it is guaranteed that assignment of the constant 0 to a pointer will produce a null pointer distinguishable from a pointer to any object. The trouble with this statement is that I've never seen a C compiler that implements it. On extant processors, it is simple to prove that it can't be implemented. If you examine any of the current commercial processors (68xxx, 8xx86, SPARC, MIPS, PDP11/VAX, etc.), you quickly learn that all of them have the property that there is no bit pattern that is guaranteed to cause a fault when used as an address. True, you can use the memory-management hardware to intercept attempts to reference ranges of addresses, but this is a different issue. The memory-referencing hardware has no bit pattern that a compiler can use as "null" value, with the guarantee that its use as an address will cause an interrupt under all circumstances. All bit patterns are legal (byte) addresses on these machines. Yes, I'm aware that processors have been built that have a null pointer, and some even have a bit pattern that is recognized as not-a-number by all the arithmetic opcodes. I've written code (even assembly code) for the Burroughs B5500 and B6700, for instance. I also think that having an explicit "illegal" value for all types is a Real Good Idea. 
But in today's real world, the programmers who write the really low-level stuff rarely have the luxury of a well-designed processor; they are stuck with 80386s and the like. On such processors, C simply can't be implemented in conformance with such standards, no matter how much we'd like it. > It would seem to me that a simpler solution to the embedded processor > problem than requiring a non-standard C compiler in order to write code > for one would be to not have any physical memory at address 0, or to put > program memory there (since, unless the program is self-modifying, it > should never have to access its own memory, excluding perhaps function > pointers). So how do you get the code there? If I were designing my own processor, I'd probably try to make address zero illegal, since that would catch so many bugs during early testing. But I'm not, and I can't. Given hardware that says that such-and-such is stored at address zero, you either write code that references address zero, or you hand the job over to someone else. You also tend to use the C compilers that are available, since you need to get a product out the door, rather than wait until an acceptable compiler comes along and you get a signature on the purchase order. BTW, perhaps this should be asked in comp.lang.c (though I recall it being discussed a few years back, with much flamage but few answers); can anyone show how one would portably code a statement that assigns the value zero (not null) to a pointer? If I am faced with hardware with a given structure in low memory, it'd help if I could declare: struct lowmem { ... } *lowmem = 0; and be guaranteed that it will point to the right place. It'd also be nice to be able to do the assignment at run time if necessary. As so many people have pointed out, the above code could legally be implemented as any illegal value; what I need is a way to guarantee that it will be zero, regardless of what null is and whether zero is a legal address at the moment. 
BTW, this issue could come up for some people working in the Unix kernel. Unix isn't immune from the clever ideas of the hardware implementors. Very often the interrupt vector table is in low memory, and you just might find yourself someday working with a kernel that allows interrupt routines to be added and deleted from a running system. (After all, VMS can do it, as can DOS.) How do you plan to plug in a new level-0 interrupt routine on this system, if you can't write to location zero? As any 80x86 hacker will tell you, ranting about the idiocies of the design won't help you get the system out the door (though it surely does feel good at times ;-). -- Typos and silly ideas Copyright (C) 1990 by: Uucp: ...!{harvard.edu,ima.com,eddie.mit.edu,ora.com}!minya!jc (John Chambers) Home: 1-617-484-6393 Work: 1-508-952-3274
volpe@underdog.crd.ge.com (Christopher R Volpe) (07/06/90)
In article <418@minya.UUCP>, jc@minya.UUCP (John Chambers) writes: > The trouble with this statement is that I've never seen a C compiler > that implements it. On extant processors, it is simple to prove that > it can't be implemented. If you examine any of the current commercial > processors (68xxx, 8xx86, SPARC, MIPS, PDP11/VAX, etc.), you quickly > learn that all of them have the property that there is no bit pattern > that is guaranteed to cause a fault when used as an address. True, > you can use the memory-management hardware to intercept attempts to > reference ranges of addresses, but this is a different issue. The > memory-referencing hardware has no bit pattern that a compiler can > use as "null" value, with the guarantee that its use as an address > will cause an interrupt under all circumstances. All bit patterns > are legal (byte) addresses on these machines. No one ever said that dereferencing a NULL pointer is guaranteed to produce any kind of fault whatsoever. The only thing that is guaranteed is that sticking a '&' in front of any object won't yield a NULL pointer. That includes storage for variables and functions reserved by the compiler as well as malloced storage. > > If I were designing my own processor, I'd probably try to make address > zero illegal, since that would catch so many bugs during early testing. > But I'm not, and I can't. Given hardware that says that such-and-such > is stored at address zero, you either write code that references address > zero, or you hand the job over to someone else. No one ever said that a NULL pointer means address zero. > BTW, perhaps this should be asked in comp.lang.c (though I recall it > being discussed a few years back, with much flamage but few answers); > can anyone show how one would portably code a statement that assigns > the value zero (not null) to a pointer? If I am faced with hardware > with a given structure in low memory, it'd help if I could declare: > struct lowmem { > ... 
> } *lowmem = 0; > and be guaranteed that it will point to the right place. It'd also > be nice to be able to do the assignment at run time if necessary. > As so many people have pointed out, the above code could legally be > implemented as any illegal value; what I need is a way to guarantee > that it will be zero, regardless of what null is and whether zero > is a legal address at the moment. Here's how to get a zero bit pattern into a pointer variable:
    type *ptr_to_type;
    int i;
    i=0; /* the integer zero, i.e. all zero bits */
    ptr_to_type = *((type **) &i); /* reinterpret the int's storage as a "type *" */
Portable? Well, only on machines where sizeof(int) == sizeof(type *). On machines with pointers bigger than ints and longs, you could use a union with one member being the pointer and the other member being a sufficiently large array of ints and then just assign zero to all the ints. Why bother trying to make it completely portable? Any application that needs to reference address zero is pretty machine dependent already anyway. Chris Volpe G.E. Corporate R&D volpecr@crd.ge.com
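One way to get an all-zero bit pattern into a pointer without depending on sizeof(int) is to clear the pointer object's own bytes. This is a sketch under assumptions, not a recommendation from the thread: the `struct lowmem` layout and both function names are hypothetical, and whether the resulting pointer is usable (or equal to a null pointer) remains entirely machine dependent.

```c
#include <stddef.h>
#include <string.h>

struct lowmem { unsigned char bytes[16]; };   /* hypothetical layout */

/* Fills the pointer object itself with zero bytes, whatever its size.
   This sets the representation; it is not the same operation as
   assigning the constant 0, which yields a null pointer. */
void set_zero_bits(struct lowmem **pp)
{
    memset(pp, 0, sizeof *pp);
}

/* Checks the representation byte by byte, without using the pointer
   value itself. Returns 1 if every byte is zero. */
int has_zero_bits(struct lowmem *const *pp)
{
    const unsigned char *b = (const unsigned char *)pp;
    size_t i;

    for (i = 0; i < sizeof *pp; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

The sketch deliberately never dereferences the pointer; doing so is only meaningful on a target where something really lives at machine address zero.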
russotto@eng.umd.edu (Matthew T. Russotto) (07/06/90)
In article <418@minya.UUCP> jc@minya.UUCP (John Chambers) writes: > >And on page 192 of my C bible I find the paragraph: > The compilers currently allow a pointer to be assigned to an integer, > an integer to a pointer, and a pointer to a pointer of another type. > The assignment is a pure copy operation, with no conversion. This > usage is nonportable, and may produce pointers which cause address > exceptions when used. However, it is guaranteed that assignment of > the constant 0 to a pointer will produce a null pointer > distinguishable from a pointer to any object. > >The trouble with this statement is that I've never seen a C compiler >that implements it. On extant processors, it is simple to prove that >it can't be implemented. If you examine any of the current commercial >processors (68xxx, 8xx86, SPARC, MIPS, PDP11/VAX, etc.), you quickly >learn that all of them have the property that there is no bit pattern >that is guaranteed to cause a fault when used as an address. Why does that statement mean that you can't dereference NULL without a fault? Distinguishable in this sense just means you can use tests like
    p=0;
    if (p == &someobject)
        code();
    else
        othercode();
which will always fail (it probably also means that malloc, etc, can never allocate an object at 0). All code which attempts to reference objects at ANY specific memory location (including kernel code which references interrupt vector tables, etc) will be nonportable -- that is implied by that statement also. -- Matthew T. Russotto russotto@eng.umd.edu russotto@wam.umd.edu ][, ][+, ///, ///+, //e, //c, IIGS, //c+ --- Any questions? Hey! Bush has NO LIPS!
bson@wheaties.ai.mit.edu (Jan Brittenson) (07/07/90)
In article <1990Jul6.152722.5320@eng.umd.edu> russotto@eng.umd.edu (Matthew T. Russotto) writes:
>In article <418@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>>And on page 192 of my C bible I find the paragraph:
>> [...] it is guaranteed that assignment of
>> the constant 0 to a pointer will produce a null pointer distinguish-
>> able from a pointer to any object.
>>
>Distinguishable in this sense just means you can use tests like
>
>p=0;
>if (p == &someobject)
> code();
>else
> othercode()
>
>which will always fail

Not if someobject resides at address 0, in which case p does point to an object. In addition, the address calculation &someobject might yield a pointer to address 0. E.g.,

    char foo[1];
    ... &foo[ -(int) &foo ] ...

Even when limiting ourselves to unix and the more common implementations, we can come up with a reason for treating (char *) 0 as a specific object: to probe the access rights of one's u area, for instance. But for compatibility's sake NULL should be treated as "a pointer not pointing at any object" - esp. when the intention is to later be able to port software from the unix environment to a non-unix environment where variables may very well be located at 0.

> (it probably also means that malloc, etc, can never
> allocate an object at 0)

If malloc() returns NULL, that should be regarded as "no object allocated." In future implementations 0 may be the first location of an upwards-growing heap for all we know... Of course the problem could be easily avoided by never using address 0, but that's - as far as I can tell - exactly what John Chambers argued against in the first place.

To summarize, I think John's argument is quite valid, although I think the problem can be easily dodged at the cost of grace.

(Disflamer: all above is IMHO.)
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/08/90)
In article <418@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>In article <1990Jun29.132304.12550@athena.mit.edu>, jik@athena.mit.edu (Jonathan I. Kamens) writes:
>> From K&R Second Edition, Page 102: "C guarantees that zero is never a
>> valid address for data, so a return value of zero can be used to signal
>> an abnormal event, in this case, no space."

Note that that's in the tutorial section of the book, where rigor is sometimes compromised in order to avoid overwhelming the reader with too much technical detail. A fully rigorous formulation of the rule would have hurt more than it would have helped.

>And on page 192 of my C bible I find the paragraph:
> The compilers currently allow a pointer to be assigned to an integer,
> an integer to a pointer, and a pointer to a pointer of another type.
> The assignment is a pure copy operation, with no conversion. This
> usage is nonportable, and may produce pointers which cause address
> exceptions when used. However, it is guaranteed that assignment of
> the constant 0 to a pointer will produce a null pointer distinguish-
> able from a pointer to any object.

Note that this was removed for the Second Edition; in the intervening years the semantics of null pointers and pointer conversion were worked out more thoroughly. The current rule is that the operation IS a conversion and in some cases an explicit cast is required; also, while a constant expression that LOOKS like the conversion of the integer value 0 to a pointer is used in source code to represent a null pointer constant, this particular usage is a special case that may have to be specially recognized by the compiler in some environments.

>The trouble with this statement is that I've never seen a C compiler
>that implements it. On extant processors, it is simple to prove that
>it can't be implemented. If you examine any of the current commercial
>processors (68xxx, 8xx86, SPARC, MIPS, PDP11/VAX, etc.), you quickly
>learn that all of them have the property that there is no bit pattern
>that is guaranteed to cause a fault when used as an address.

Nowhere is it stated or implied that a fault must occur when a pointer is improperly used! In fact, some implementations DO fault when a null pointer is dereferenced; this is allowed but not required.

It is also not required that the HARDWARE reserve a particular address bit pattern for use as a null pointer. On good old PDP-11 UNIX, the all-zero bit pattern was used for this purpose by the C implementation, which also made sure that no officially-defined C object existed at the location with address zero. That's more or less what the old citation you gave was trying to say about that use of zero. However, that is NOT a requirement on C implementations these days. In particular, one could choose to use the address of a reserved C run-time library object whose external name is, for example, __NULL as the implementation's null pointer value. (The compiler is then obliged to translate source-code constructs such as (foo*)0 to references to &__NULL.)

>On such processors, C simply can't be implemented in conformance with
>such standards, no matter how much we'd like it.

Sure it can. We made quite sure of that when drafting the standard.

>BTW, perhaps this should be asked in comp.lang.c (though I recall it
>being discussed a few years back, with much flamage but few answers);
>can anyone show how one would portably code a statement that assigns
>the value zero (not null) to a pointer?

The flamage occurs because certain people simply refuse to listen to the correct modern rules concerning null pointers in C.
Here is an example of how you can access location 0 regardless of the way that null pointers happen to be implemented:

    thing *p;

    p = (thing*)sizeof(thing);
    --p;
    *p;

This is of course nonportable, but then any assumption about actual assignment of address values is necessarily nonportable. However, it should work for the kind of flat address space environment that you've been postulating.
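The distinction drawn in this post, between the null pointer that the compiler produces from the constant 0 and a pointer whose bits merely happen to be zero, can be probed on a given implementation. The following is a minimal sketch and not anything from the thread: the function name null_is_all_zero_bits is hypothetical, and the answer is implementation-specific. Most current flat-address machines would be expected to report 1, but the language does not promise it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this implementation stores a null
 * pointer as all zero bits, 0 if it uses some other representation. */
int null_is_all_zero_bits(void)
{
    void *p = 0;                      /* null pointer constant, converted by the compiler */
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);   /* an all-zero pattern of pointer size */
    return memcmp(&p, zeros, sizeof p) == 0;
}
```

On an implementation of the &__NULL kind sketched in the post, the same function would return 0 while every ordinary use of (foo*)0 and NULL kept working.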
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/08/90)
In article <9361@rice-chex.ai.mit.edu> bson@rice-chex.ai.mit.edu (Jan Brittenson) writes:
>>Distinguishable in this sense just means you can use tests like
>>p=0;
>>if (p == &someobject)
>> code();
>>else
>> othercode()
>>which will always fail

Correct, and that is the primary reason why the C standard requires that a null pointer not compare equal to a pointer to any object.

> Not if someobject resides at address 0, in which case p does point
>to an object.

A conforming C implementation cannot allocate someobject at such an address that a pointer to it is indistinguishable from a null pointer. A (not strictly conforming) application may, through appropriate implementation-specific manipulations, produce such an object pointer, but what it points to will not qualify as an "object" in the sense in which the term is used in the C standard.
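The guarantee discussed above can be written down as a small check. This is only a sketch under stated assumptions: someobject is borrowed from the quoted test, while points_at_someobject is a hypothetical helper name introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

static int someobject;   /* an ordinary object allocated by the implementation */

/* Returns 1 if p points at someobject, 0 otherwise. */
int points_at_someobject(const int *p)
{
    return p == &someobject;
}
```

On a conforming implementation, points_at_someobject(NULL) is 0 wherever someobject happens to be placed, and points_at_someobject(&someobject) is 1. Nothing here says what address bits either pointer contains.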
amoss@huji.ac.il (Amos Shapira) (07/24/90)
applix@runxtsa.runx.oz.au (Andrew Morton) writes:
>>
>> I recently found to my delight that the Sparc architecture
>> core dumps if a NULL pointer is dereferenced, and I want to find
>> other systems that do this.

[ stuff deleted ]

>> What other machines do this? I know that the 3B2 and 3B15
>> don't do it, and neither does AT&T UNIX on the 386.

I missed most of the postings on this matter, but just this morning I found that the Personal Iris, based on the MIPS 2000 or something similar, also core dumps on a NULL dereference.

Hope this helps,

Amos Shapira
amoss@huji.ac.il.bitnet