kwh@bentley.UUCP (KW Heuer) (04/27/86)
In article <4495@cbrma.UUCP> cbrma!trl (Tom Lanning) writes: > Are there any "malloc" routines that check for modification of >malloc's link pointers by the "user's" code? Close to 70% of by bugs >are stack/memory problems were I use more space than I allocated. As mentioned in the man page, the malloc() code has a compile-time option for strict checks on the arena. (This is not too useful if you have no source code, of course.) In C++, you can define a datatype that works like a pointer but does run-time bounds checking; this requires changing your declarations from "char *" to "vector" (or whatever). Now, if only somebody would invent an architecture where all objects, including dynamicly allocated objects, are isolated in memory, then any subscript error would cause an immediate memory fault. You'd still be vulnerable to completely wild pointers (but less likely in a huge address space), and overflow of an array inside a structure might be untrappable, but otherwise it sounds like a great machine to do your debugging on. Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint
dan@prairie.UUCP (04/28/86)
------------- >Now, if only somebody would invent an architecture where all objects, >including dynamicly allocated objects, are isolated in memory, then any >subscript error would cause an immediate memory fault. If I'm not mistaken, this was done on the iAPX432, using a capability- based addressing scheme. Dimmed the lights. You could probably construct such an environment on the 80286, but no one does, probably for efficiency reasons. You're probably better off with a language that compiles checks into the code, and an option to turn off those checks once you're confident (?!) of the program. With a capability-based architecture, you pay the price all the time, whether you want to or not. -- Dan Frank ... uwvax!geowhiz!netzer!prairie!dan -or- dan@caseus.wisc.edu
robison@uiucdcsb.CS.UIUC.EDU (04/29/86)
> Now, if only somebody would invent an architecture where all objects, > including dynamicly allocated objects, are isolated in memory, then any > subscript error would cause an immediate memory fault .... > ... otherwise it sounds like a great machine to do your debugging on. > Its been around at least two decades. The Burroughs stack machines (6700 ... A15) have this feature. The hardware always checks subscripts before indexing. In fact indexing is not an addressing mode on these machines, but rather an instruction itself. - Arch
jans@tekecs.UUCP (Jan Steinman) (04/29/86)
In article <763@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes: >Now, if only somebody would invent an architecture where all objects, >including dynamicly allocated objects, are isolated in memory, then any >subscript error would cause an immediate memory fault. You'd still be >vulnerable to completely wild pointers (but less likely in a huge address >space), and overflow of an array inside a structure might be untrappable, >but otherwise it sounds like a great machine to do your debugging on. > Sounds suspiciously like the Smalltalk virtual machine to me! :::::: Artificial Intelligence Machines --- Smalltalk Project :::::: :::::: Jan Steinman Box 1000, MS 60-405 (w)503/685-2956 :::::: :::::: tektronix!tekecs!jans Wilsonville, OR 97070 (h)503/657-7703 :::::: -- :::::: Artificial Intelligence Machines --- Smalltalk Project :::::: :::::: Jan Steinman Box 1000, MS 60-405 (w)503/685-2956 :::::: :::::: tektronix!tekecs!jans Wilsonville, OR 97070 (h)503/657-7703 ::::::
g-rh@cca.UUCP (Richard Harter) (04/30/86)
In article <> faustus@cad.UUCP (Wayne A. Christopher) writes: >In article <4495@cbrma.UUCP>, trl@cbrma.UUCP (Tom Lanning) writes: > >> Are there any "malloc" routines that check for modification of >> malloc's link pointers by the "user's" code? Close to 70% of by bugs >> are stack/memory problems were I use more space than I allocated. > >You could compile the 4.3 malloc() with the -DRCHECK flag, which checks >that you haven't modified the areas beyond your segment when you free it. >Also, if you're REALLY paranoid, write your own malloc that puts lots of >padding around the allocated areas and checks that none of the padding >areas has been changed every time you call malloc() or free(). > We wrote our own, partly out of paranoia an partly out of a probably misguided belief that we could write a more efficient allocator. The main thing that we did was to put all pointers and links in an entirely separate area from the space being allocated. This wins in that pointers never get overwritten -- it loses in that the program does not crash immediately on range errors. We added a one word pad on each end for overwrite testing (can be turned off) and legitimacy tests on all returns of space. We also put in a option to store where requests were coming from. (Hasn't been used in years.) The upshot is that space request/free problems are rare and easily found. However this doesn't avoid the problem of incorrectly dimensioned arrays which are handled by the system and can lead to very peculiar bugs. Richard Harter, SMDS Inc.
kwh@bentley.UUCP (KW Heuer) (04/30/86)
In article <117@prairie.UUCP> prairie!dan (Dan Frank) writes: [comments on overflow-checking architecture] > You're probably better off with a language that compiles checks into >the code, and an option to turn [them] off... As I mentioned, you can do it this way in C++, but when you want to use pointers you have to copy three words instead of one. (Or you can use a language like pascal, which "solves" the problem by disallowing pointer arithmetic.) What I was thinking of, though, was a computer with strict architecture that could be used for development and testing; when the program is shipped to the Real World it would presumably run on "normal" architecture. Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint
rbutterworth@watmath.UUCP (Ray Butterworth) (05/01/86)
> You're probably better off with a language that compiles checks into > the code, and an option to turn off those checks once you're confident > (?!) of the program. With a capability-based architecture, you pay the > price all the time, whether you want to or not. Many years ago I worked with a language in which all arrays had to have dimensions that were a power of two (like 4.2 malloc). The code which indexed into the array simply anded the index with the appropriate bit mask. This was very fast, yet it guaranteed that any bad indexes wouldn't corrupt anything except the array being addressed. As a side-effect, one could use this feature to cycle continuously through an array or could even use negative indexes without any extra overhead.
jer@peora.UUCP (J. Eric Roskos) (05/02/86)
> >Now, if only somebody would invent an architecture where all objects, > >including dynamicly allocated objects, are isolated in memory, then any > >subscript error would cause an immediate memory fault. > > If I'm not mistaken, this was done on the iAPX432, using a capability- > based addressing scheme. Dimmed the lights. You could probably construct > such an environment on the 80286, but no one does, probably for efficiency > reasons. One problem with the 432's approach was that it was very extreme; I don't think it's good to say "the 432 tried these approaches and it was too slow, therefore the checking can't be efficiently implemented." I posted some comments in here (net.arch) about a week ago on apparently the same subject, but nobody replied in net.arch to it (although I got a couple of replies by mail). Of the people who replied by mail, one (whose technical knowledge I have a high opinion of) pointed out that C compilers exist where subscript/pointer checking is done in software, and that thus it would seem likely that similar checking could be done in hardware. The way you could do it (which was a point the 2 people replying seemed to agree upon) was that, associated with all pointers, you should have a "minimum address" and "maximum address" for the object being pointed to. Bear in mind that in C array names are just constant pointers, so constructs like a[i] can use this method as well as plain pointer references such as *p. If p is a pointer of type t, then to use p you will have to first assign it a value by referencing an existing object, or by creating a new one: typedef <whatever> t; t a[100]; t *p; p = a; (1) p = &a[40]; (2) p = (t *)malloc(300); (3) In case 1 and 2, you can easily set p.base to &a[0], and p.bound to &a[99], and set p.pointer to &a for (1) and to &a[40]. So p then carries around with it the range of valid addresses it can point to. (Note that nothing says anything about what a pointer in C has to look like, so p can easily be a 3-word struct-like object, and if you were building a new machine to support such things, you could make the machine have 3-word address registers). In case 3, you could have malloc set the base and bound -- though if malloc is written in C then you'd have to provide some way to reference the base and bound fields from within the language -- so things like malloc would also work. I had originally thought that some counterexamples existed, but one of the respondants (John Gilmore) pointed out that really the counterexamples involved essentially semantically inconsistent uses of the pointers (e.g., having 2 pointers around and changing the bounds on 1 of them). In any case, if you change p, e.g. p++, then you'd change what I called p.pointer above, and leave p.base and p.bound alone. If you generated an effective address which was outside [p.base .. p.bound], then you'd generate an addressing fault. I don't think this checking would be that slow, although on a machine with a narrow bus (especially those like the 8088 where you are already fetching pointers through multiple bus cycles) fetching the range attributes of the pointer would increase the bus time by a factor of 3. It would also reduce the number of register variables you could have, if you kept the bounds in registers also -- I think it would work best if you had a machine that had registers set aside specifically for pointer checking. On a machine such as the 3280*, which does quadword reads from memory because the data bus is very wide, the bus overhead would be much less. So the checking by this method would probably not be that bad (certainly not as bad as the 432, which I believe had to sometimes fetch several descriptor objects in order to validate references) at least on larger machines (and after all, microprocessors are getting larger all the time in terms of width of the bus, etc.). ----- *I cite this machine because I'm more familiar with it; I suspect probably other machines like Vaxes have similarly wide buses. -- E. Roskos
rose@think.ARPA (John Rose) (05/05/86)
In article <763@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes: >In article <4495@cbrma.UUCP> cbrma!trl (Tom Lanning) writes: >> Are there any "malloc" routines that check for modification of >>malloc's link pointers by the "user's" code? Close to 70% of by bugs >>are stack/memory problems were I use more space than I allocated. > >As mentioned in the man page, the malloc() code has a compile-time option >for strict checks on the arena. (This is not too useful if you have no >source code, of course.) If you do have source code, here's another suggestion which has worked very well for me. Define an circular buffer which stores a record of the last few hundred records of malloc/free/morecore history. Make sure your symbolic debugger can dump it for you. This trick alone has saved me hours of debugging time on quite a few occasions. In my applications, someone would either (1) try to use freed storage, or (2) go off the end of allocated storage, and usually these errors occurred within a dozen history events after the call to (1) free or (2) malloc. >Now, if only somebody would invent an architecture where all objects, >including dynamicly allocated objects, are isolated in memory, then any >subscript error would cause an immediate memory fault. You'd still be >vulnerable to completely wild pointers (but less likely in a huge address >space), and overflow of an array inside a structure might be untrappable, >but otherwise it sounds like a great machine to do your debugging on. > >Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint It's called the "Lisp Machine". All storage is allocated and manipulated according to very strict typing rules, enforced in microcode. If you walk off the end of an array, you get an immediate break into the debugger, with options to (a) abort, (b) re-invoke any stack frame, (c) return from any stack frame, (d) extend the array in place, (e) anything else you can specify in Lisp (the debugger, command loop, and application all run in the same address space! This is safe because of the enforcement mentioned above). The C compiler I'm using (Zeta-C from Zeta-Soft, Ltd.) implements pointers (approximately) as ordered pairs of arrays and indexes. The bounds-checking can be done in macrocode on a conventional machine. The BCC compiler (from Delft Consulting?) does this by source-transforming a C program so that pointers turn into small records carrying bounds information. Both systems (as far as I know) model the C runtime memory as a collection of "floating" byte arrays. Part of the justification for this is found in K&R RefMan 7.6 "Pointer comparison is portable only when the pointers point to objects in the same array." [Disclaimer: I have no professional connection with the Zeta-C or BCC companies, although I do know the implementors of both products personally.] -- ---------------------------------------------------------- John R. Rose Thinking Machines Corporation 245 First St., Cambridge, MA 02142 (617) 876-1111 X270 rose@think.arpa ihnp4!think!rose
toby@felix.UUCP (Toby Gottfried) (05/05/86)
In article <763@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes: >Now, if only somebody would invent an architecture where all objects, >including dynamicly allocated objects, are isolated in memory, then any >subscript error would cause an immediate memory fault. Burroughs did exactly this in their Large Systems over 20 years ago. >You'd still be vulnerable to completely wild pointers (but less likely >in a huge address space), Not a problem - the address space isn't flat. >and overflow of an array inside a structure might be untrappable, >but otherwise it sounds like a great machine to do your debugging on. It is. -- Toby Gottfried FileNet Corp {ucbvax,ihnp4,decvax}!trwrb!felix!toby Costa Mesa, CA
kwh@bentley.UUCP (KW Heuer) (05/09/86)
In article <5097@think.ARPA> rose@think.ARPA (John Rose) writes: >If you do have source code, here's another suggestion which has worked >very well for me. Define an circular buffer which stores a record of >the last few hundred records of malloc/free/morecore history. Make >sure your symbolic debugger can dump it for you. This trick alone >has saved me hours of debugging time on quite a few occasions. I've found the following front-end* to be very useful: char *Dalloc(n) unsigned n; { register char *p = malloc(n); fprintf(stderr, "+%8x\n", p); return (p); } void Dfree(p) char *p; { fprintf(stderr, "-%8x\n", p); free(p); } char *Drealloc(p, n) char *p; unsigned n; { fprintf(stderr, "-%8x\n", p); p = realloc(p, n); fprintf(stderr, "+%8x\n", p); return (p); } When the program terminates (normal exit or core dump), I run a consistency check on the log file to cancel out "+xxx" and "-xxx". Any unbalanced "-" is a bug. I consider an unbalanced "+" to be a bug, too, unless there are a bounded number of them and I can account for them all. Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint *It can also be built into malloc.c if you have source; this allows it to log the calls within stdio. For some applications, a special log file should be used instead of stderr.