hardy@sdccsu3.UUCP (Jeff Hardy) (05/12/84)
The Bizarre Bourne Shell

The Bourne Shell relies on some very obscure quirks of the VAX/PDP-11 and Unix.

As many of us know, the Bourne Shell does its own memory allocation. This scheme does not necessarily do an sbrk when memory is allocated, but merely marks the memory as allocated. When an illegal reference is made to this memory, the memory fault signal is caught, the sbrk is done, and the program continues on its merry way.

This works like a charm on the VAX and PDP-11. It cannot work at all on the MC68000, nor very easily on the MC68010. The reasons it cannot work on the MC68000 are well known: basically, the 68000 cannot recover from an illegal memory reference. It cannot work easily on the MC68010 because the 68010 can only CONTINUE a faulted instruction, not RESTART it. Why this causes the Bourne Shell difficulty can be explained as follows. A memory fault occurs, we enter the USER signal processing, which performs an sbrk, and then we want to continue the instruction that generated the memory fault. Unfortunately, we are now in USER state, and we can only continue the instruction by doing an RTE, which must be done from SYSTEM state!

There are, of course, solutions. Software continuation of the instruction or a special hook to allow a USER-generated RTE might be possibilities. But I doubt you could convince me that they are either easy or not kludges.

What most people who have ported the Bourne Shell have done is to catch all the places where the "non-allocated" memory might be referenced and check that the sbrk has been done. This is not terribly pleasant either, since these references are sprinkled throughout the Bourne Shell. Fortunately, most are done through macros. But this only mildly soothes my angst and ire.

Unix Guru Question: what is the value of *p after the second sbrk?

	p = sbrk(0);
	*p = 1;
	sbrk(2);

The Unix Programmer's Manual says that all newly allocated memory is initialized to zero. However, on Unix Version 6, 7, System III and System V, *p is one! Indeed, Unix only zeros memory when a new MMU segment of memory is allocated. Also, this code will generate a memory fault if p happens to point to a new MMU segment.

I contend that this really is a Unix bug, not a "feature". Either sbrk should not initialize any memory, or it should always initialize memory. The half-assed attempt it makes now can only lead to bizarre usages of this "feature".

You guessed it, the Bizarre Bourne Shell depends upon this. The Bourne Shell places a pointer into its available space pool at the location returned from an sbrk(0). If sbrk were to act as the documentation indicates, the next time you did an sbrk, it would corrupt your free list. Also note that on the VAX and PDP-11 this trick is guaranteed to work 100% of the time, since if p points to an invalid address, the Bourne Shell will field the memory fault and do the sbrk, which initializes the new segment to zero, before the assignment to the location. Only careful coding will prevent this on the 68000/68010.

I really question whether a program as critical as the Bourne Shell should depend upon not merely an undocumented "feature", but one seemingly CONTRADICTED by the documentation.

These are just a couple of the many interesting quirks in the Bourne Shell. I will not even comment on those brain-damaged individuals who insist upon "improving" the C language with macros like IF, THEN, ELSE, or who fail to use standard include files like <signal.h>.

Michael Christensen
Unix System V Project Manager
Alcyon Corporation
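The porting workaround described in the post above (check that the sbrk has been done before touching "non-allocated" memory) might look roughly like the following. This is only a sketch under stated assumptions: it assumes a Unix-like system that still provides sbrk() through <unistd.h>, and the names ensure_below_break and store_then_grow are hypothetical, not taken from the Bourne Shell source. The second function is the guru question with the explicit check added, so the store can no longer fault.

```c
#define _DEFAULT_SOURCE  /* assumption: glibc-style feature macro so sbrk() is declared */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: make sure the n bytes starting at addr lie below
 * the program break, growing the break with sbrk() if they do not.
 * Returns 0 on success, -1 if sbrk() fails. */
static int ensure_below_break(char *addr, size_t n)
{
    char *brk_now = sbrk(0);            /* sbrk(0) only reports the break */

    assert(n > 0);
    if (brk_now == (char *)-1)
        return -1;
    if (addr + n <= brk_now)
        return 0;                       /* already inside the data segment */
    if (sbrk((intptr_t)((addr + n) - brk_now)) == (void *)-1)
        return -1;
    return 0;
}

/* The question from the post, with the explicit check in front of the
 * store. Returns the value of *p after the second sbrk, or -1 on error. */
static int store_then_grow(void)
{
    char *p = sbrk(0);

    if (p == (char *)-1 || ensure_below_break(p, 1) != 0)
        return -1;
    *p = 1;                             /* safe now: p is below the break */
    if (sbrk(2) == (void *)-1)
        return -1;
    return *p;                          /* sbrk is not expected to re-clear this byte */
}
```

On the systems the thread discusses, and on current Unix-like systems as far as I know, store_then_grow() yields 1, because growing the break does not re-clear bytes that were already part of the data segment.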
ka@hou3c.UUCP (Kenneth Almquist) (05/14/84)
I contend that given the code

	p = sbrk(0);
	*p = 1;
	sbrk(2);

a memory fault should be generated when the second statement is executed, because it references memory above the break point. If the program catches memory faults and increases the break value, then after the second sbrk the value of *p should be one. In practice most hardware will not allow you to set the break value to an arbitrary location, so this code may not generate a memory fault, but the value of *p should nevertheless be one after the second sbrk.

None of this is intended to excuse the other faults of the Bourne shell.

Kenneth Almquist
kds@intelca.UUCP (05/16/84)
>The Unix Programmer's Manual says that all newly allocated memory is
>initialized to zero......
>.........Indeed, Unix only zeros memory when a new MMU
>segment of memory is allocated. Also, this code will generate a memory
>fault if p happens to point to a new MMU segment. I contend that this

And Eunice does this also. It also seems that even calls to get memory using malloc, etc. do not always create zeroed-out memory areas (as someone suggested). In addition, unless you set some special loader flags you may not get contiguous memory allocated, which creates mucho problems with, for example, nroff! (and, apparently, sh)

But it's better than RAW VMS...verbum sat sapentia

--
Ken Shoemaker, Intel, Santa Clara, Ca.
{pur-ee,hplabs,ucbvax!amd70,ogcvax!omsvax}!intelca!kds
dan@idis.UUCP (Dan Strick) (05/17/84)
I looked up the V6, V7, and 4.0bsd (~32V) manual pages for the break system call and found that all mention the fact that the break address is rounded up and none mention the fact that new memory is cleared. I assume the unix support group changed the manual page to eliminate a machine dependency (the rounding) and to make the documentation more complete (clearing new memory). These kinds of changes to documentation are always good. Right?

Wrong! New memory is cleared not so much to save the programmer the trouble of clearing it explicitly as to avoid apparently randomly malfunctioning programs, and possibly as a slightly paranoid security measure. It would have made as much sense to set new virtual memory locations to -1. Some features are best left undocumented or hidden in a footnote (especially obviously implementation-dependent features).

The old manual pages give the rules for rounding up break addresses on each of the various machines that unix ran on at the time those old versions of unix were released. This had to go, but it would have made sense to mention that the system did not always take requested break addresses literally. Some features are best left documented (especially when omission is misleading).

My opinions: the unix implementation of the brk() system call is correct. Recent documentation (i.e. system 5) may be misleading (so much for professionally designed user friendly documentation). The Bourne shell implementation (which motivated the tirade about the unix brk() implementation) is sick. It breaks all the rules. Worse: it uses longjumps.

Dan Strick
[decvax|mcnc]!idis!dan
steiny@scc.UUCP (Don Steiny) (05/18/84)
When I was first trying to figure out the Bourne shell, I found it useful to do:

	cc -E module.c | cb > more_readable

The file "more_readable" has had the preprocessor substitute in the C for the Algol 68.

Don Steiny
Personetics
425-0382
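For readers who have not seen the style being complained about, here is a small sketch of Algol-flavoured control macros. The macro bodies are illustrative definitions written for this example, not copied from the shell's headers, and sign_of and sign_of_selfcheck are hypothetical names. Running the pipeline from the post above on a file like this should show ordinary if/else C in place of IF/THEN/ELSE/FI.

```c
#include <assert.h>

/* Illustrative Algol-style macros (assumed definitions, for this sketch only). */
#define IF    if (
#define THEN  ) {
#define ELSE  } else {
#define FI    }

/* Written in the macro dialect; the preprocessor turns it into plain C. */
static int sign_of(int x)
{
    int s;

    IF x < 0 THEN
        s = -1;
    ELSE
        IF x > 0 THEN
            s = 1;
        ELSE
            s = 0;
        FI
    FI
    return s;
}

/* Small self-check showing that the macro form behaves like ordinary C. */
static void sign_of_selfcheck(void)
{
    assert(sign_of(-5) == -1);
    assert(sign_of(0) == 0);
    assert(sign_of(7) == 1);
}
```

After preprocessing, the body of sign_of is just nested if/else blocks, which is why the output of cc -E piped through cb is easier to read than the source.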
chris@umcp-cs.UUCP (05/20/84)
Now hold on a minute here ...

	From: hardy@sdccsu3.UUCP

	Unix Guru Question: what is the value of *p after the second sbrk?

		p = sbrk(0);
		*p = 1;
		sbrk(2);

Undefined. sbrk(0) simply returns the address of the current break. If there is room beyond the current break, then *p will be one. If not, then you get that memory fault you're griping about, and I don't really want to know the details....

	The Unix Programmer's Manual says that all newly allocated memory is initialized to zero. However, on Unix Version 6, 7, System III and System V, *p is one! Indeed, Unix only zeros memory when a new MMU segment of memory is allocated.

Where does it say that? Not in ``man 2 brk'' (where one finds the sbrk manual). If you have a partial page, obviously sbrk is going to be lazy. Use calloc() if you want zeroed memory. (I know, your complaint is that the Bourne shell doesn't, so that makes the Bourne shell guilty of making hardware assumptions. But don't blame the manuals.)

	Also, this code will generate a memory fault if p happens to point to a new MMU segment. I contend that this really is a Unix bug, not a "feature". Either sbrk should not initialize any memory, or it should always initialize memory. The half-assed attempt it does now can only lead to bizarre usages of this "feature".

It should be obvious that memory has to be initialized to *something*, or you'll have a huge security hole. But why initialize it more than once? That's what calloc() is for.

	I really question whether a program as critical as the Bourne Shell should depend upon not merely an undocumented "feature", but one seemingly CONTRADICTED by the documentation.

The Bourne shell should *not* depend on it. (And I can't stand the fake ALGOL either.)

(Anybody know if ksh has this kind of code in it? :-) )

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet: chris@umcp-cs
ARPA: chris@maryland
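The "use calloc() if you want zeroed memory" advice above can be illustrated with a short sketch. It deliberately dirties a block and frees it first, so that the allocator has a chance to hand the same bytes out again, and then checks that calloc() still returns zeros. The function names all_zero and recycled_calloc_is_zeroed are hypothetical, and whether the block really is recycled depends on the allocator; the zero guarantee of calloc() does not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if every one of the n bytes at buf is zero, else 0. */
static int all_zero(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Dirty a block, free it, then calloc() the same size.
 * Returns 1 if the calloc()ed block is all zero, 0 if not, -1 on
 * allocation failure. */
static int recycled_calloc_is_zeroed(size_t n)
{
    unsigned char *dirty, *clean;
    int ok;

    assert(n > 0);
    dirty = malloc(n);
    if (dirty == NULL)
        return -1;
    memset(dirty, 0xFF, n);     /* leave non-zero garbage behind */
    free(dirty);                /* the allocator may reuse this block */

    clean = calloc(n, 1);       /* calloc must return zeroed storage */
    if (clean == NULL)
        return -1;
    ok = all_zero(clean, n);
    free(clean);
    return ok;
}
```

A plain malloc() in place of the calloc() call would make no such promise, which is the same trap the shell's use of sbrk falls into.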
avr@CS-Arthur.UUCP (05/29/84)
----------------------
From: hardy@sdccsu3.UUCP

I will not even comment on those brain-damaged individuals who insist upon "improving" the C language with macros like IF, THEN, ELSE . . . .
----------------------

But I will. It's done in 'adb', too, which I'm doing something with, and it's a royal pain - looks ugly as Baba Yaga, too. UGH!

Andrew Royappa
{ucbvax,decvax,pur-ee,ihnp4}!purdue!avr