phil@amdcad.UUCP (02/18/87)
In a Unix system I am designing, I am considering catering to bad code. That is, like the VAX I propose to make location 0 contain a readable 0. I think that code which gets ported to a Sun machine often has to have this kind of thing cleaned up.

What do people think of this? Is it kind of disgusting?

Just how much code has this problem? Did every program you ported to a Sun have to be fixed, or 10%, or something in between?

I'd like to do things right, but I'm also lazy. I think doing this will save me a lot of work. Have I overlooked anything? The processor has a relocatable vector area so I can easily map the vector table somewhere else.
--
How can I be Asian when I like milk so much?
Phil Ngai +1 408 982 7840
UUCP: {ucbvax,decwrl,hplabs,allegra}!amdcad!phil
ARPA: amdcad!phil@decwrl.dec.com
rpw3@amdcad.UUCP (02/19/87)
In article <14833@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
+---------------
| In a Unix system I am designing, I am considering catering to bad
| code. That is, like the VAX I propose to make location 0 contain a
| readable 0. I think that code which gets ported to a Sun machine often
| has to have this kind of thing cleaned up.
|
| What do people think of this? Is it kind of disgusting?
+---------------

No, no, Mr. Phil, don't do it! ;-}

Well, if you think you MUST do it (I understand limited time/money and concerns about unlimited risk), do it this way:

Make the a.out format explicitly support specifying the text starting offset. (System-V COFF format does, and it can be added to BSD a.out easily enough, if necessary, with a new magic number.) Make the default be to NOT map in the page. However, write yourself a little hack that takes an a.out, INSERTS the page of zeros, and adjusts the header to reflect the offset of zero (rather than "+1 page"). You won't need to re-link, since the absolute virtual addresses didn't move; only the existence of page #0 changed.

Benefits:

1. All new code developed on the machine will have the most common bug caught by default.

2. Old, bad, dirty, broken code can be made to work with just an execution of the "addpage0" utility on the object (AFTER it has blown up the first time and you are sure deref'ing a zero pointer is why).

3. You will be able to SLOWLY clean up the ugly code, as time/money permit.

Rob Warnock
Systems Architecture Consultant
UUCP: {amdcad,fortune,sun}!redwood!rpw3
DDD: (415)572-2607
USPS: 627 26th Ave, San Mateo, CA 94403
klein@gravity.UUCP (02/19/87)
In article <14833@amdcad.UUCP>, phil@amdcad.UUCP (Phil Ngai) writes:
> In a Unix system I am designing, I am considering catering to bad
> code. That is, like the VAX I propose to make location 0 contain a
> readable 0. I think that code which gets ported to a Sun machine often
> has to have this kind of thing cleaned up.
>
> What do people think of this? Is it kind of disgusting?

It's not just an issue of porting to a Sun or other machine, it's an issue of relying on code to do something that it does only on a subset of machines. A well-written program does not read at location 0 because some potential platforms do not allow it. A system that allows location 0 to be read (or written) only encourages bad programming.

My vote is an emphatic NO!
--
Mike Klein klein@sun.{arpa,com}
Sun Microsystems, Inc. {ucbvax,hplabs,ihnp4,seismo}!sun!klein
Mountain View, CA
guy@gorodish.UUCP (02/19/87)
>It's not just an issue of porting to a Sun or other machine, it's an issue of >relying on code to do something that it does only on a subset of machines. It's even more than that. Programs that try to use location 0 because they're trying to dereference null pointers do so because they have *bugs* in them - the core dump is just a more emphatic way of pointing this out than various flavors of maybe-correct behavior are. VAX/VMS takes location 0 out of the address space by default, probably for just that reason. I can understand the desire for expediency, but I very strongly support Rob Warnock's desire that you *not* make this the default behavior. Even though some paging versions of S5, at least, permit you to define images that will be run without location 0, it's *not* the default; as a result, I've had to fix a number of bugs that *they* should have fixed.
ron@brl-sem.UUCP (02/19/87)
In article <14833@amdcad.UUCP>, phil@amdcad.UUCP (Phil Ngai) writes:
> In a Unix system I am designing, I am considering catering to bad
> code. That is, like the VAX I propose to make location 0 contain a
> readable 0. I think that code which gets ported to a Sun machine often
> has to have this kind of thing cleaned up.
>

I like the GOULD approach. The board that traps access to location 0, among other out-of-bounds memory addresses, can be set to do one of three things: just ignore the attempt (the user appears to have accessed the location, but doesn't get anything if it is outside his memory limits; 0 is never in a user address space there), print a message in the logfile, or memory fault the process (and print a message in the log file). This allows you to turn on careful mode or revert to VAX bad-code compatibility mode.

-Ron
firth@sei.cmu.edu.UUCP (02/19/87)
In article <14833@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>In a Unix system I am designing, I am considering catering to bad
>code. That is, like the VAX I propose to make location 0 contain a
>readable 0. I think that code which gets ported to a Sun machine often
>has to have this kind of thing cleaned up.
>
>What do people think of this? Is it kind of disgusting?

On the VAX-11 under VMS, the bottom 512 bytes of the program virtual address space is mapped out, and trying to access it crashes your code.

When I ported systems code to the Vax, this was the SINGLE MOST USEFUL THING ON THE MACHINE. I simply could not believe how many old and trusted programs were in fact wrong, and finally broke. All those places where the code looked at p->next before checking for p==NULL.

Tricky code is one thing. But, in my opinion, for truly "bad" code, ie code that explicitly violates the semantics of the language standard, the best remedy is to crash it disgustingly as hard and as soon as possible. Then it gets fixed.

>Just how much code has this problem? Did every program you ported to a
>Sun have to be fixed, or 10%, or something in between?

Too much (again in my opinion), and partly because too many implementors are "kind" to bad code.

> Phil Ngai +1 408 982 7840
> UUCP: {ucbvax,decwrl,hplabs,allegra}!amdcad!phil
> ARPA: amdcad!phil@decwrl.dec.com

But please remember that this is just the eccentric and intolerant opinion of

Robert Firth
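A minimal sketch of the ordering Firth is talking about, in C. The `struct node` type and both function names are hypothetical, chosen only for illustration; the point is that the pointer itself is tested before anything reads `p->next`.

```c
#include <stddef.h>

/* Hypothetical list node; the names are illustrative only. */
struct node {
    int value;
    struct node *next;
};

/* Count the nodes in a list. The loop condition tests p itself before
 * anything reads p->next, so an empty list (head == NULL) is handled
 * without ever touching location 0. */
size_t list_length(const struct node *head)
{
    size_t n = 0;
    const struct node *p;

    for (p = head; p != NULL; p = p->next)
        n++;
    return n;
}

/* Return the last node, or NULL for an empty list. The buggy pattern
 * would look at p->next first and check p only afterwards (or never). */
const struct node *list_last(const struct node *head)
{
    const struct node *p = head;

    if (p == NULL)
        return NULL;
    while (p->next != NULL)
        p = p->next;
    return p;
}
```

On a machine with a readable zero at location 0, a version of `list_last` without the first check may appear to work for an empty list; on a machine that unmaps page 0 it faults at the first `p->next`.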
howard@cpocd2.UUCP (02/19/87)
In article <14833@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
> In a Unix system I am designing, I am considering catering to bad
> code. That is, like the VAX I propose to make location 0 contain a
> readable 0.
> What do people think of this? Is it kind of disgusting?

I agree with Rob and others who have said No (don't do it) and Yes (it's disgusting).

It's not just location 0. It's all small integers, positive and negative.

I once made a typo which left legal code that managed to pass lint but still dereferenced address 1 (or was it 3?), treating it as a (char *) and printing the string (i.e. the contents of low memory) out onto the user's terminal. The terminal was emulating a VT100 and low memory just happened to contain several copies of the "Control String Initiator" character. The result was that the terminal would hang waiting for a "Control String Terminator" that never came.

Explaining this bug to customers who had just lost minutes/hours of work was not pleasant. Explaining to management how it had occurred and gotten by me unnoticed AND GOTTEN THROUGH SIX WEEKS OF SOFTWARE QA UNNOTICED *AND* *BEEN* *SHIPPED* *TO* *EVERY* *CUSTOMER* was even less enjoyable.

On a machine without read access to page 0 this bug would have caused a simple core dump and been easy to find and fix. Approximately,

    INTENDED CODE: printf("%s is %d","name",value);
    ACTUAL CODE:   printf("%s is %d",value);

(Note: this was before "printfck", which finds this error easily.)

It is safer to make page 0 *and* page -1 both inaccessible to catch all uses of small integers as pointers, no matter which sign. And perhaps even all the pages addressable by shorts. Actually, you'd like it to be true that you couldn't accidentally use ANY integer in place of a pointer, but this is not easy on an untagged machine unless you are willing to make most or all of your address space inaccessible. ;-) Perhaps this is an argument for tagged architectures.

Use of location 0 is a bug. It is not portable to many machines and operating systems (e.g. VMS). And lots of early UNIX programs do it. :^(
--
Howard A. Landman
...!intelca!mipos3!cpocd2!howard
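As a hedged illustration of how this class of printf mistake can be caught at compile time today: GCC and Clang accept a `format(printf, ...)` function attribute that lets their `-Wformat` checking cover user-written wrappers as well as printf itself. The wrapper name `report` and the macro `PRINTF_LIKE` below are assumptions made up for this sketch; they are not from the original post.

```c
#include <stdarg.h>
#include <stdio.h>

/* PRINTF_LIKE is a hypothetical convenience macro. On GCC and Clang it
 * expands to the format attribute; elsewhere it expands to nothing. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, arg_idx) \
    __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/* Hypothetical wrapper: format into buf, returning what vsnprintf returns. */
int report(char *buf, size_t size, const char *fmt, ...) PRINTF_LIKE(3, 4);

int report(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call shaped like the ACTUAL CODE above, for example `report(buf, sizeof buf, "%s is %d", value)`, should draw a format warning when warnings are enabled, while the intended call `report(buf, sizeof buf, "%s is %d", "name", value)` compiles cleanly.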
davis@unc.UUCP (02/20/87)
In article <554@aw.sei.cmu.edu.sei.cmu.edu> firth@bd.sei.cmu.edu.UUCP (Robert Firth) writes:
>Tricky code is one thing. But, in my opinion, for truly "bad" code,
>ie code that explicitly violates the semantics of the language
>standard, the best remedy is to crash it disgustingly as hard and as
>soon as possible. Then it gets fixed.

It is obvious you are only an operating system maker and don't ever have to use code someone else wrote.

As a part time computer user (as well as part time architect), I find any tool or operating system that "crashes disgustingly" to be highly wasteful of my time. Much of the software that I use was written by somebody else at another site. Getting him to fix his questionable practice may not be as easy as you indicate. I agree that it should be done right the first time, but "two wrongs don't make a right."

The other issue here is that the differences may not show up at compile time. You have now introduced a "bug" into the newly installed software. Not only will a lot of user time be wasted, but a lot of system or applications programmer time will be spent chasing down this bug. How many guaranteed bugs are you willing to generate as a systems programmer?

-------------------------------
Mark Davis (davis@unc.cs.unc.edu)
zs01#@andrew.cmu.edu.UUCP (02/20/87)
I think compilers and operating systems should go out of their way to punish
bad code. Things like unmapping the zero page, and initializing the stack
with something other than 0s are good examples of what I have in mind. Bad
code is due to one of two things, either the programmer is being lazy, or
he/she just didn't know about it. In the latter case, I am sure that the
programmer would prefer to know he/she screwed up instead of letting somebody
else find out about it. Even more important is the fact that a lot of this
"bad code" is flaky and a genuine pain to debug, regardless of the issue of
portability.
As an example, indirecting a NULL function pointer on an IBM RT will
effectively jump to the main() function of your program. Because of this, we
had a lot of fun one day trying to figure out why a certain program was
"infinitely tail-recursive". Indirecting a NULL pointer is likely to be an
unintentional error. Giving the guy a 0 just means that it will take longer
to find the bug.
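A small sketch of the defensive alternative, assuming a hypothetical callback type `handler_fn`; none of these names come from the post. The caller checks the function pointer explicitly instead of relying on whatever the machine does when it jumps through NULL.

```c
#include <stddef.h>

/* Hypothetical callback type, for illustration only. */
typedef int (*handler_fn)(int);

/* Example handler used to exercise call_handler. */
int twice(int x)
{
    return 2 * x;
}

/* Call fn(arg) if a handler is installed; otherwise return fallback
 * rather than jumping through a null function pointer. */
int call_handler(handler_fn fn, int arg, int fallback)
{
    if (fn == NULL)
        return fallback;
    return fn(arg);
}
```

For example, `call_handler(twice, 21, -1)` calls the handler, while `call_handler(NULL, 21, -1)` returns the fallback value.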
Another bug I have seen twice in the last week is the following:
    struct foo *InitFoo()
    {
        register struct foo *tempFoo = (struct foo *) malloc(sizeof(struct foo));

        tempFoo->bar = 1;
        tempFoo->bletch = 2;
        tempFoo->baz = "loselose";
    }
Notice there is no return statement there! Surprisingly, this code will work
on a lot of machines. The compiler wisely decides that it can use the return
register as a temporary (since there is no return value), so tempFoo happens
to be sitting in that register when the caller looks there for the result.
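For comparison, a sketch of the function with the return statement restored. The post does not show the layout of `struct foo`, so the field types here are assumptions; the check on malloc's result is likewise an addition for the sketch.

```c
#include <stdlib.h>

/* Assumed layout: the original post does not show struct foo. */
struct foo {
    int bar;
    int bletch;
    const char *baz;
};

/* Allocate and initialize a struct foo. Returns NULL if malloc fails. */
struct foo *InitFoo(void)
{
    struct foo *tempFoo = malloc(sizeof *tempFoo);

    if (tempFoo == NULL)
        return NULL;
    tempFoo->bar = 1;
    tempFoo->bletch = 2;
    tempFoo->baz = "loselose";
    return tempFoo;             /* the statement the original lacked */
}
```

GCC and Clang can flag the original mistake through their `-Wreturn-type` warning, which complains when control reaches the end of a non-void function.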
Sincerely,
Zalman Stern
Information Technology Center
Carnegie Mellon
(ARPA) zs01#@andrew.cmu.edu
(UUCP) try something like
...seismo!rochester!pt.cs.cmu.edu!andrew.cmu.edu!zs01#
john@uw-nsr.UUCP (02/20/87)
In article <14833@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>In a Unix system I am designing, I am considering catering to bad
>code. That is, like the VAX I propose to make location 0 contain a
>readable 0. I think that code which gets ported to a Sun machine often
>has to have this kind of thing cleaned up.

Well, I think I can comment on this. For the last couple of years I have been (marooned) on a Data General MV/10000 system. As many of the readers of this newsgroup are probably aware this system has different hardware representations for "byte" pointers and "word" pointers. They also have "bit" pointers, but never having had to use one I really can't comment on them.

I have seen postings from Steve Wallach in this newsgroup from time to time. I hear he knows something about MV's :-)

At any rate it has been my experience that something like 70 - 80 % of the C programs I have ported have had pointer problems. There are a few exceptions to the rule, for example XLISP, rn and mkmf.

Many of the programs that I have ported have tried to dereference through NULL. On the MV series this does not work, at all. However a much larger number of C programs are sloppy and don't cast pointer types appropriately. On the MV series this causes a protection violation and you are left looking at a few lines of traceback information.

>What do people think of this? Is it kind of disgusting?

No comment. I could vote either way right about now.

>Just how much code has this problem? Did every program you ported to a
>Sun have to be fixed, or 10%, or something in between?

Too much code has this problem. However, there are people working on correcting the bad programs. I guess they will eventually die out, although not soon enough for me. However, only a couple of weeks ago my boss asked me how much work it would be to get "refer" running on our system. Aack!

>I'd like to do things right, but I'm also lazy. I think doing this
>will save me a lot of work. Have I overlooked anything? The processor
>has a relocatable vector area so I can easily map the vector table
>somewhere else.

It would have saved me a lot of work.

To be fair to Data General, they have developed (probably in self-defense) a top-notch C compiler that is very good at finding problems with pointer type mismatches and subscript range errors. To be fair to me they still don't have a dbx that can be used for serious debugging.
--
John Sambrook                   Work: (206) 545-7433
University of Washington WD-12  Home: (206) 487-0180
Seattle, Washington 98195       UUCP: uw-beaver!uw-nsr!john
news@cit-vax.UUCP (02/20/87)
From: jon@oddhack.Caltech.Edu (Jon Leech)
Organization: California Institute of Technology
Path: oddhack!jon

In article <954@uw-nsr.UUCP> john@uw-nsr.UUCP (John Sambrook 5-7433) writes:
>Many of the programs that I have ported have tried to dereference
>through NULL. On the MV series this does not work, at all. However
>a much larger number of C programs are sloppy and don't cast pointer
>types appropriately. On the MV series this causes a protection violation
>and you are left looking at a few lines of traceback information.

I'm glad they fixed this. When I was using an early version of DG/UX a few years ago, an incorrectly cast pointer in one program resulted in a system crash. This made the debugging turnaround cycle rather long until the cast was corrected.

On the bright side, making the program in question (a screen editor) run on the MV architecture nailed almost every remaining portability bug. Putting the same program on an 80286 Xenix got the rest. I recommend this technique for developing truly portable C code.

-- Jon Leech (jon@csvax.caltech.edu || ...seismo!cit-vax!jon)
Caltech Computer Science Graphics Group
__@/
jans@stalker.UUCP (02/20/87)
In article <436@cpocd2.UUCP> howard@cpocd2.UUCP (Howard A. Landman) writes:
>It's not just location 0. It's all small integers, positive and
>negative... On a machine without read access to page 0 this bug
>would have caused a simple core dump... It is safer to make page 0
>*and* page -1 both inaccessible... perhaps even all the pages
>addressable by shorts... Use of location 0 is a bug.

Hold on one second, there. Let's not make those with a legitimate use of low memory pointers pay for such things!

Such idiocy would mean that certain machines, such as the NS32000, would lose the speed advantage of "base page addressing", or whatever you want to call it. Good National assembly coders typically keep their SB register at 0x40 for best code density and speed in memory indirection. Such code often requires C interface code to talk through such pointers.

Another NS32000 application I did was a Z80 simulator. The Z80 code was loaded directly in the first 64k so that Z80 pointers could be used directly. While the opcode simulator was assembly, BIOS routines were all written in C, and (as any CP/M programmer knows) location zero must be accessed.

This is kind of far from the original subject, and I agree that the bad code should be cleaned up. C is a wonderful vehicle for accessing machine resources -- don't cripple it just because some people are incapable of using those resources correctly! Find the jokers and make them use Ada instead!

:::::: Artificial Intelligence Machines --- Smalltalk Project ::::::
:::::: Jan Steinman Box 1000, MS 60-405 (w)503/685-2956 ::::::
:::::: tektronix!tekecs!jans Wilsonville, OR 97070 (h)503/657-7703 ::::::
klein@gravity.UUCP (02/20/87)
In article <636@brl-sem.ARPA>, ron@brl-sem.ARPA (Ron Natalie <ron>) writes:
> I like the GOULD approach. The board that traps access to location 0
> among other out of bound memory addresses can be set to just ignore
> the attempt (the user appears to have accessed the location, but doesn't
> get anything if it is outside his memory limits, 0 is never in a user
> address space there), print a message in the logfile, or memory fault
> the process (and print a message in the log file). This allows you
> to turn on careful mode or revert to VAX bad-code compatibility mode.

Only the last option is acceptable, because:

1. Ignoring the attempt to access unmapped address: hides a bug that could be caught at the disallowed access and may manifest itself in an unbelievably obscure manner later. Or, since the action of the program with this option might be the intended one, the bug is not detected, not fixed, and pops up sometime later in the development cycle (during porting) when it is much more expensive to fix.

2. Printing a message in a log file: too easy to miss; how do you correlate the message with the access that caused it? A core dump tells you exactly where the bug is and what the environment was at the time.

3. Memory fault: that's what the access is, a memory fault, and the system should stop right then and there because further execution will only mask the bug more.

A point was brought up by a software user where he mentioned that time is wasted when a program core dumps. This is true, but the alternative can be much more costly. If the program accesses an unmapped address, it has a bug in it, and is therefore not correct. Its results are suspect, and it is better that it inform the user in no uncertain manner that it is doing something wrong rather than go on and produce wrong answers that might look OK.
--
Mike Klein klein@sun.{arpa,com}
Sun Microsystems, Inc. {ucbvax,hplabs,ihnp4,seismo}!sun!klein
Mountain View, CA
phil@sequent.UUCP (02/20/87)
In my experience, having a zero of at least 4 bytes at zero is extremely useful. I have talked to customers who were very concerned about this. Apparently, they have fought battles trying to port to machines that did not have this feature.

For our DYNIX software, we settled on:

1. an entire readonly page of zeros at zero by default
2. a loader flag that makes page zero invalid

Using number 2 allows you to create more portable code or fix bugs in existing code before having to port to a machine without a zero at zero.

NOTE: Having 1 be the default does save lots of headaches.
--
Phil Hochstetler
Sequent Computer Systems
Beaverton, Oregon
guy@gorodish.UUCP (02/21/87)
>As a part time computer user (as well as part time architect), I find >any tool or operating system that "crashes disgustingly" to be highly >wasteful of my time. So do I. That's why I agree 100% with Firth that programs that dereference null pointers should "crash disgustingly" on all implementations - so that they crash in the developer's lap, not mine. >Much of the software that I use was written by somebody else at another site. >Getting him to fix his questionable practice may not be as easy as you >indicate. That's entirely beside the point. There is no *guarantee* that the code would work correctly even if you *did* permit dereferencing of null pointers! If you don't permit dereferencing null pointers, a program that tries to dereference a null pointer is guaranteed to fail. If you do, the program might work, but then again it might fail in some mysterious way, and the consequences of the failure discovered much later (when it might be *too* late). This is *not* just ivory-tower speculation. See Howard Landman's article <436@cpocd2.UUCP>. >The other issue here is that the differences may not show up at compile >time. You have now introduced a "bug" into the newly installed >software. This resembles an annoying practice called "blaming the victim". The compiler/OS authors didn't introduce *any* bug whatsoever! The bug was in the original code; the author of that code was just lucky in that their code ran with no obvious visible symptoms. >Not only will a lot of user time be wasted, but a lot of system or >applications programmer time will be spent chasing down this >bug. How many guaranteed bugs are you willing to generate as a systems >programmer? What is this "guaranteed bugs" stuff? If it's guaranteed that some person will put out code that will fail if you can't dereference null pointers, then *that person* generated that bug, not the people who created the implementation that forbids dereferencing null pointers! 
The person in question should be re-educated, using a salary continuation program as incentive, if necessary. Go yell at him or her for the time you spent chasing down his or her bug, not at the compiler/OS implementor!
guy@gorodish.UUCP (02/22/87)
>In my experience, having a zero of at least 4 bytes at zero is extremely
>useful.

But is not necessarily possible, convenient, or desirable on all machines.

>I have talked to customers who were very concerned about this.
>Apparently, they have fought battles trying to port to machines that
>did not have this feature.

Who's to say they won't fight battles trying to port to machines that don't have the same byte order, or the same floating-point format, or the same address space layout, or the same... as the VAX?

>Using number 2 allows you to create more portable code or fix bugs
>in existing code

Bugs is bugs. If the code has a bug, the fact that it *may* happen to work on a machine with 0 at location 0 does not make it correct, and even that is *not* guaranteed; consider, e.g., that somebody might be treating a null pointer or a *non*-zero value pointed to by that pointer as a termination condition.

>NOTE: Having 1 be the default does save lots of headaches.

And creates lots more. Mapping in page zero is the default on the 3B2's linker; this means *we* get to fix a bunch of bugs that are *not* our fault, but the fault of programmers at AT&T-IS.
mash@mips.UUCP (02/22/87)
In article <14833@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>In a Unix system I am designing, I am considering catering to bad
>code. That is, like the VAX I propose to make location 0 contain a
>readable 0. I think that code which gets ported to a Sun machine often
>has to have this kind of thing cleaned up.
....
[followed by many articles, mostly against allowing zero in there, with a few allowing the option.]

1) At MIPS we left 0 out of the process image, for all the reasons that others have cited, although the next version has a ld option to move text to 0, just in case. However, on at least some machines you might be able to write a SIGSEGV catcher that stepped past the offending instruction [which after all, can't be right], or maybe even stuck a zero in the right register. Doing this implies the need to semi-interpret the offending instruction and bump the PC past it. This is anything from trivial to difficult, depending on the machine, but is at least a possibility for someone who doesn't control the OS on their machine, or has one whose architecture doesn't have a zero location in user space at all.

2) The discussion reminds me of a similar effect that happened somewhere around 1970. At Penn State, we were switching from a 360/67 to a 370/165. 360's required data items to be placed on natural boundaries [words on words, halfwords on halfwords, etc]. If you disobeyed you got a specification exception. 370's don't make that restriction. People at our computer center, as I recall, actually requested a price from IBM to get our 370 modified to restore the boundary requirement, because they'd found that in the (large number of) bugs they'd reported to IBM, SOMETHING LIKE 30-40% HAD FIRST SHOWN UP AS SPECIFICATION ERRORS! (We didn't get this feature, but people tried.)
--
-john mashey
DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
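A rough sketch of a user-level SIGSEGV catcher, under loud assumptions. Decoding the faulting instruction and bumping the PC, as described in point 1, is machine-specific, so this sketch swaps in a different technique: it uses POSIX `sigsetjmp`/`siglongjmp` to abandon a faulting read and report failure. The function name `probe_read` is hypothetical, the code assumes a POSIX system on which a bad read raises SIGSEGV or SIGBUS, and it is not safe to use from multiple threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Jump target used by the handler; one global, so not thread-safe. */
static sigjmp_buf probe_env;

/* Handler: abandon the faulting access and resume in probe_read. */
static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Hypothetical helper: try to read one byte at p.
 * Returns 1 and stores the byte in *out if the read succeeded,
 * 0 if the read faulted. Previous signal dispositions are restored. */
int probe_read(const volatile unsigned char *p, unsigned char *out)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    /* Saving the signal mask (second argument 1) lets siglongjmp
     * unblock the signal that was blocked while the handler ran. */
    if (sigsetjmp(probe_env, 1) == 0) {
        *out = *p;      /* may fault; the handler then jumps back here */
        ok = 1;
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return ok;
}
```

This only shows that a process can survive the fault without kernel help. Substituting a zero for the failed load, as the post suggests, would additionally require editing the saved register context, which is not portable.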
wombat@ccvaxa.UUCP (02/23/87)
/* Written 11:33 pm Feb 19, 1987 by john@uw-nsr.UUCP in ccvaxa:comp.arch */
At any rate it has been my experience that something like 70 - 80 % of the C programs I have ported have had pointer problems.
/* End of text from ccvaxa:comp.arch */

Grepping through the Gould UTX 2.0 sources, I found 19 standard 4.3BSD programs (out of about 420) had been altered so as to survive in the crash-'em-on-null-pointer-dereferences environment. Or rather, that's how many had the change commented, and any change to user-level code is supposed to be commented.

Around here we run some machines with protect-bit on (crash offending programs) and some with it off, but anything to be released must be tested on a machine with protect-bit on. The ability to turn it off is useful if you have binary-only 3rd-party software that misbehaves.

"My words say what you hear them say, but the movements of my mouth indicate that I am telling a series of humorous stories in Yiddish."
        R.A. Lafferty, *The Devil Is Dead*

Wombat
ihnp4!uiucdcs!ccvaxa!wombat
aglew@ccvaxa.UUCP (02/23/87)
...> Null pointer dereferencing - catering to VAXisms in new machines Gould's "prot" approach, optionally ignoring, logging, or coredumping protection violations (low memory) has its advantages for running binary only 3rd party code. But, if you use it, make it optional per process, not global per system. Andy "Krazy" Glew. Gould CSD-Urbana. USEnet: ihnp4!uiucdcs!ccvaxa!aglew 1101 E. University, Urbana, IL 61801 ARPAnet: aglew@gswd-vms.arpa
henry@utzoo.UUCP (Henry Spencer) (02/24/87)
Ultimately, sometimes you just gotta be realistic about these things. Yes, it is a Good Thing to make location 0 nonexistent so that code which touches it will drop dead. This will point out a great many bugs, which should be fixed. However, that is little comfort if you are on a tight schedule and the buggy code did somehow deliver the right answer.

The "*NULL == '\0'" assumption is rather widespread, and it is often harmless, in the sense that it doesn't usually result in wrong answers. Making *NULL dump core is the right thing to do in the long term, but it does have a serious short-term impact on how soon you can ship the bloody system. Expediency may dictate providing some short-term workaround, like an optional ld flag that makes *NULL work but logs the problem somewhere for later attention. Note that you will be living with the problem *forever* unless you make people go out of their way when they want *NULL, so don't make it the default.
--
Legalize freedom!
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry
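Code that leans on the "*NULL == '\0'" assumption can usually be repaired by saying what it means. A minimal sketch, with hypothetical helper names that are not from the post: the null case is mapped to the empty string explicitly, so the result no longer depends on what happens to live at location 0.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a null pointer to the empty string. */
static const char *str_or_empty(const char *s)
{
    return s != NULL ? s : "";
}

/* True if s is NULL or the empty string, without ever reading *NULL. */
int is_blank(const char *s)
{
    return *str_or_empty(s) == '\0';
}

/* strlen that treats NULL as a string of length zero. */
size_t safe_strlen(const char *s)
{
    return strlen(str_or_empty(s));
}
```

Whether NULL really ought to mean "empty" is a design decision for each call site; the sketch only makes that decision visible in the source instead of leaving it to the memory map.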
lamaster@pioneer.UUCP (02/25/87)
In article <28200009@ccvaxa> wombat@ccvaxa.UUCP writes:
>
>Grepping through the Gould UTX 2.0 sources, I found 19 standard 4.3BSD
>programs (out of about 420) had been altered so as to survive in the
>crash-'em-on-null-pointer-dereferences environment. Or rather, that's
>how many had the change commented, and any change to user-level code is
>supposed to be commented.
>
>R.A. Lafferty, *The Devil Is Dead* Wombat
> ihnp4!uiucdcs!ccvaxa!wombat

Which ones are they?

Hugh LaMaster, m/s 233-9, UUCP {seismo,topaz,lll-crg}!ames!pioneer!lamaster
NASA Ames Research Center ARPA lamaster@ames-pioneer.arpa
Moffett Field, CA 94035 ARPA lamaster@pioneer.arc.nasa.gov
Phone: (415)694-6117 ARPA lamaster@ames.arc.nasa.gov

"In order to promise genuine progress, the acronym RISC should stand for REGULAR (not reduced) instruction set computer." - Wirth

("Any opinions expressed herein are solely the responsibility of the author and do not represent the opinions of NASA or the U.S. Government")
wunder@hpcea.UUCP (02/27/87)
The HP-UX C compilers have flags that allow you to force traps on null pointer derefs, or to force accepting a null pointer deref (you get a 0 at that location). The default behavior depends upon the actual implementation. Providing both choices to the user is obviously the best idea.

Walter Underwood
wunder@hplabs

-----------------------------------------
-z   Do not bind anything to address zero. This option will allow runtime detection of null pointers. See the note on pointers below.

-Z   Allow dereferencing of null pointers. See the note on pointers below.

Pointers

Accessing the object of a NULL (zero) pointer is technically illegal (see Kernighan and Ritchie), but many systems have permitted it in the past. The following is provided to maximize importability of code. If the hardware is able to return zero for reads of location zero (when accessing at least 8 and 16 bit quantities), it must do so unless the -z flag is present. The -z flag requests that SIGSEGV be generated if an access to location zero is attempted. Writes of location zero may be detected as errors even if reads are not. If the hardware cannot assure that location zero acts as if it was initialized to zero or is locked at zero, the hardware should act as if the -z flag is always set.
-----------------------------------------

The above man page excerpt is almost certainly copyrighted by HP.
jpa@celerity.UUCP (02/27/87)
In article <436@cpocd2.UUCP> howard@cpocd2.UUCP (Howard A. Landman) writes:
>In article <14833@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>> In a Unix system I am designing, I am considering catering to bad
>> code. That is, like the VAX I propose to make location 0 contain a
>> readable 0.
>> What do people think of this? Is it kind of disgusting?
>
>I agree with Rob and others who have said No (don't do it) and Yes
>(it's disgusting).
>
>It's not just location 0. It's all small integers, positive and
>negative.
> ...
>Use of location 0 is a bug. It is not portable to many machines and
>operating systems (e.g. VMS). And lots of early UNIX programs do it. :^(
>--

When we were designing Celerity's first product, I pushed for trapping on page 0 references. I lost out to arguments much like those here in the news. We also took no special precaution to see that address 0 has a 0 in it. In retrospect, I think we would have been better off making page 0 unreferencable.

Almost all Celerity a.out's begin with the string 0x10, 0x80, 0x51, 0x0. Using 0 as a pointer is now affectionately referred to as 'The Q bug', referring to its visible attributes when printed as a string. The net effect was that 'Q bugs' were common, most of which were relatively harmless, but enough of which were quite nasty. Because we don't trap, it is hard to say how many of these bugs exist today.

Due to the way Celerity manages shared text using a context identifier in the hardware, the kernel can distinguish a fetch-for-execute from a fetch-for-data. It can take special action (e.g., signal the process) on page 0 fetch-for-data references. To fix a number of hard-to-find bugs I have had to generate kernels to do exactly that for in-house developer use. Someone was in my office yesterday asking if I still had that kernel around. (We never productized this behavior, for the same reasons that we didn't make page 0 unreferencable. A site-configurable implementation is possible but has never been proposed.)

I still can't accept the argument that says that a machine should be designed to tolerate a certain class of bad code. The problem exists due to the fact that machines tolerated it in the first place. Would you argue that a machine should return 0 if a program accesses undefined virtual memory? (or -1, maybe? ;-)).
john@uw-nsr.UUCP (02/27/87)
In article <28200009@ccvaxa> wombat@ccvaxa.UUCP writes:
>
>/* Written 11:33 pm Feb 19, 1987 by john@uw-nsr.UUCP in ccvaxa:comp.arch */
>At any rate it has been my experience that something like 70 - 80 %
>of the C programs I have ported have had pointer problems.
>/* End of text from ccvaxa:comp.arch */
>
>Grepping through the Gould UTX 2.0 sources, I found 19 standard 4.3BSD
>programs (out of about 420) had been altered so as to survive in the
>crash-'em-on-null-pointer-dereferences environment. Or rather, that's
>how many had the change commented, and any change to user-level code is
>supposed to be commented.

There are a couple of factors here that I feel need to be taken into account. Sorry that I did not make them clear in my first posting. I would also like to thank "wombat" for pointing out where I was not clear.

First, the code that I was talking about when I said 70 - 80 % had pointer problems was not the code from 4.3BSD. Rather, it was the standard assortment of C code posted to mod.sources and net.sources over the last year or so. Owing to its relative newness this code is naturally less well tested than the code in 4.3BSD.

Second, when I said "pointer problems" I was not restricting my figures to programs that dereference through null, though certainly a large number of programs do this. An equally serious problem from my perspective is that a number of programs do something like this:

	typedef struct _mumble {
		int foo;
		char bar;
	} mumble;

	f()
	{
		mumble m;
		int fd;

		if (read(fd, &m, sizeof(m)) == -1)
			. . .
	}

Everyone sees the (hopefully only one) bug in this code, right?

I suppose that this is getting away from the charter for this newsgroup. I have directed follow-ups to comp.lang.c, where I suspect the bulk of the readership is thoroughly bored with this topic. I know I am :-)

--
John Sambrook                           Work: (206) 545-7433
University of Washington WD-12          Home: (206) 487-0180
Seattle, Washington 98195               UUCP: uw-beaver!uw-nsr!john
news@cit-vax.UUCP (02/27/87)
Organization: California Institute of Technology
From: jon@oddhack.Caltech.Edu (Jon Leech)
Path: oddhack!jon

In article <6620001@hpcea.HP.COM> wunder@hpcea.HP.COM (Walter Underwood) writes:
>The HP-UX C compilers have flags that allow you to force traps on null
>pointer derefs, or to force accepting a null pointer deref (you get a 0
>at that location). The default behavior depends upon the actual
>implementation.
>
>Providing both choices to the user is obviously the best idea.

True enough. A pity HP doesn't do it. The HP 9000/300 C compiler has these flags; unfortunately, they are no-ops. NULL is always mapped whether you want it or not. I think they only work on series 500 machines.

-- Jon Leech (jon@csvax.caltech.edu || ...seismo!cit-vax!jon)
Caltech Computer Science Graphics Group __@/
wombat@ccvaxa.UUCP (02/28/87)
/* Written 2:23 pm Feb 25, 1987 by lamaster@pioneer.arpa in ccvaxa:comp.arch */
Which ones are they?
/* End of text from ccvaxa:comp.arch */

awk, csh, make, su, dump, ifconfig, timed, battlestar, games/hunt, biff, lastcomm, systat, telnet, w, at, atq, lookbib, refer, uucp, and yacc. The worst offender is awk, where 8 code changes were made. Most of the others were changed in only one place.

"My words say what you hear them say, but the movements of my mouth indicate that I am telling a series of humorous stories in Yiddish."
R.A. Lafferty, *The Devil Is Dead*

Wombat
ihnp4!uiucdcs!ccvaxa!wombat
phil@amdcad.UUCP (03/01/87)
I'm glad to see some people are mildly interested in this subject. :-)

I didn't make very clear the purpose of the Unix system I'm designing. It is intended to be a throw-away. We only want to bring it up to show that Unix can be brought up, and to see how fast it runs. We don't intend to sell this software as a product. As such, our only interest is in doing it fast and being able to run as much existing software as possible.

I think the best thing to do is make accessing null pointers a segmentation violation by default and be able to allow it as needed. Perhaps we could discuss the best method of doing this, so as to try to provide a common mechanism. HP's method, for example, sounds interesting.

Should the choice be made at link time, at load time, at system configuration time, or something else? Should there be special magic numbers to ask the kernel to load with null pointers returning 0 on a data read? Etc. As an unlikely example, suppose you bought a binary from a vendor who used a permissive kernel and yours was strict.

Someone mentioned that just returning 0 for the first 2 or 4 bytes was not enough, as there is code that accesses structure members with null structure pointers. (How perverse can you get?) Is this really a problem? Have many people seen this?

--
I'd rather be compatible than right.

Phil Ngai +1 408 982 7840
UUCP: {ucbvax,decwrl,hplabs,allegra}!amdcad!phil
ARPA: amdcad!phil@decwrl.dec.com
bct@its63b.UUCP (03/04/87)
In article <15014@amdcad.UUCP> phil@amdcad.UUCP (Phil Ngai) writes:
>I'm glad to see some people are mildly interested in this subject. :-)
>I didn't make very clear the purpose of the Unix system I'm designing.
>It is intended to be a throw-away. We only want to bring it up to show
>that Unix can be brought up, and to see how fast it runs. We don't
>intend to sell this software as a product. As such, our only interest
>is in doing it fast and to be able to run as much existing software as
>possible.
>
>--
> I'd rather be compatible than right.
>

Did the nice marketing man with a suit and tie tell you the bit about "..intended to be a throw-away. We only want to bring it up to show that .."? They do that to all the systems people all the time. Hardware people get it too. "..just put this MC68020 on a board to show that we can do it..", and before you know it the company has a new product.

Where I worked (a large corporation in New England) they always sang me the "just a throw-away" line. There's no such thing. When it's done someone will like it and buy it. Then you'll have to maintain it. If not you, then someone else will get the job. You'll get bug reports complaining about bugs you knew were there all the time. You'll get bug reports about features. You'll write accurate documentation explaining how it's all a kludge, and the editors will expunge those bits. The millstone will be around your neck for years and years.

Do it properly the first time. Do it properly every time. Do a job to be proud of. Make checking segment violations the default behaviour, and make it hard to turn them off. [But put in a way of turning them off, just to save your ass]. I recommend that any turning off of segmentation violations, memory violations or whatever be done in software, with the hardware always flagging them.

Don't you ever wonder why so much of the stuff out there is pure junk? It's the marketing people telling you ".. it's just this little demo... after the show we can put it right..". It never gets put right; they put you on a new project before your feet hit the floor.

Brian
--
> Brian Tompsett. Department of Computer Science, University of Edinburgh <
> JMCB, The King's Buildings, Mayfield Road, Edinburgh,
> Scotland, EH9 3JZ.
>E-Mail: JANET: bct@uk.ac.ed.ecsvax
> USENET: bct@ecsvax.ed.ac.uk
> UUCP: seismo!mcvax!ukc!ecsvax.ed.ac.uk!bct
> ARPA: bct%ecsvax.ed.ac.uk@ucl-cs
> BITNET: psuvax1.bitnet!ecsvax.ed.ac.uk!bct
> or bct%uk.ac.ed.ecsvax@earn.rl.ac.uk
>Phone: +44 31 667 1081 x 3332
aglew@ccvaxa.UUCP (03/13/87)
...> Phil Ngai wondering if he should make low memory dereferencing
...> fault, or tolerate BSD VAX style code.

Well, I wrote a first response to this last week, and then promptly got bitten by it.

Gould traps read protection violations, but a kernel flag can be set to ignore, log (to user and console), or send SIGSEGV or SIGSTACK to the faulting process. The administrative program to do this is called "prot".

Anyway, like any development shop, we always run with "prot abort". Except that last week somebody had to run a cross-assembler that had been written on VAX BSD. It died. But they *really* *had* to run it, so they turned "prot ignore" on.

Now, "prot ignore" applies system wide. I was coding that night, and made some silly errors, and was surprised to come back the next morning, with "prot abort" turned back on, and find out that 'formerly working' code from the night before had mysteriously broken.

Come the weekend and some free time, half an hour of code, an hour of compiling, two hours of documentation and three hours of testing... "prot" can now be set per-process. When or if this will make it into a product I don't know, but it made me feel better.

Tidbits that may be useful: put the per-process flag into the proc structure, not the u area. The syscall that sets or reads it should take a pid as an argument, mainly so that a utility program can perform prot_syscall( getppid(), PROT_IGNORE ) applied to your shell session, if you are about to use a lot of VAXish code, and don't want to bother prot'ing each program. Make it inherited.

Something that I didn't do, but which might be useful: it might be nice to attach this flag to executable files, so that buggy VAX code can be executed without clogging your system logs. Also, it would be nice to be able to distinguish system programs, like csh and as, for which it would be nice to keep a log of protection violations, from user programs whose errors have no business getting into a system log. Distinguishing on the basis of uid is easy, but insufficient.

Andy "Krazy" Glew. Gould CSD-Urbana. USEnet: ihnp4!uiucdcs!ccvaxa!aglew
1101 E. University, Urbana, IL 61801 ARPAnet: aglew@gswd-vms.arpa