[net.lang.c] oops, corrupted memory again!

kwh@bentley.UUCP (KW Heuer) (04/27/86)

In article <4495@cbrma.UUCP> cbrma!trl (Tom Lanning) writes:
>	Are there any "malloc" routines that check for modification of
>malloc's link pointers by the "user's" code?   Close to 70% of my bugs
>are stack/memory problems where I use more space than I allocated.

As mentioned in the man page, the malloc() code has a compile-time option
for strict checks on the arena.  (This is not too useful if you have no
source code, of course.)

In C++, you can define a datatype that works like a pointer but does
run-time bounds checking; this requires changing your declarations from
"char *" to "vector" (or whatever).

Now, if only somebody would invent an architecture where all objects,
including dynamically allocated objects, are isolated in memory, then any
subscript error would cause an immediate memory fault.  You'd still be
vulnerable to completely wild pointers (but less likely in a huge address
space), and overflow of an array inside a structure might be untrappable,
but otherwise it sounds like a great machine to do your debugging on.

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

dan@prairie.UUCP (04/28/86)

-------------

>Now, if only somebody would invent an architecture where all objects,
>including dynamically allocated objects, are isolated in memory, then any
>subscript error would cause an immediate memory fault.

   If I'm not mistaken, this was done on the iAPX432, using a capability-
based addressing scheme.  Dimmed the lights.  You could probably construct
such an environment on the 80286, but no one does, probably for efficiency
reasons.

   You're probably better off with a language that compiles checks into
the code, and an option to turn off those checks once you're confident
(?!) of the program.  With a capability-based architecture, you pay the
price all the time, whether you want to or not.


-- 
	Dan Frank
	    ... uwvax!geowhiz!netzer!prairie!dan
	    -or- dan@caseus.wisc.edu

sam@delftcc.UUCP (Sam Kendall) (04/29/86)

In article <4495@cbrma.UUCP>, trl@cbrma.UUCP writes:
> 	Are there any "malloc" routines that check for modification of
> malloc's link pointers by the "user's" code?   Close to 70% of my bugs
> are stack/memory problems where I use more space than I allocated.  Tracking
> this kind of problem is horrible.
> ....
> 	This problem is not unique to my code, I know others that are
> plagued by these bugs also.  A programmer only needs one of these bugs
> in a large piece of software to waste a week of effort.
> ....
> 	I am interested in any tools, recommendations, or routines that may
> help.  Thanks.

The Bcc Compiler, a tool made by Delft Consulting Corp., catches bugs of
this sort (as one case of "pointer out of bounds") and tells you exactly
where in your source the transgression occurred.  An item about Bcc was
recently posted to mod.newprod.  Please contact me for more info.

Also, if you have source, you can recompile malloc to check its internal
pointers stringently.  At least you could on V7, and I doubt anyone has
removed this capability.  This won't locate your bug, but it might help.

----
Sam Kendall			{ ihnp4 | seismo!cmcl2 }!delftcc!sam
Delft Consulting Corp.		ARPA:  delftcc!sam@NYU.ARPA
432 Park Avenue South		Phone: +1 212 243-8700
New York, NY  10016

gwyn@BRL.ARPA (VLD/VMB) (04/29/86)

If you check the malloc source code, you'll see that it can be
compiled with debugging checks turned on; the version in the C
library has them disabled.  I suggest keeping the debugging
malloc.o around somewhere like /usr/lib so everyone can get it
when needed.

jans@tekecs.UUCP (Jan Steinman) (04/29/86)

In article <763@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes:
>Now, if only somebody would invent an architecture where all objects,
>including dynamically allocated objects, are isolated in memory, then any
>subscript error would cause an immediate memory fault.  You'd still be
>vulnerable to completely wild pointers (but less likely in a huge address
>space), and overflow of an array inside a structure might be untrappable,
>but otherwise it sounds like a great machine to do your debugging on.
>
Sounds suspiciously like the Smalltalk virtual machine to me!

:::::: Artificial   Intelligence   Machines   ---   Smalltalk   Project ::::::
:::::: Jan Steinman		Box 1000, MS 60-405	(w)503/685-2956 ::::::
:::::: tektronix!tekecs!jans	Wilsonville, OR 97070	(h)503/657-7703 ::::::

herndon@umn-cs.UUCP (04/30/86)

  You didn't post your address!  A (partial) solution to the
problem of not freeing/allocated memory was published in
SIGPLAN Notices a few years back.  Look in the May 1982 issue
for "A Technique for Finding Storage Allocation Errors in
C-language Programs", by David R. Barach, David H. Taenzer,
and Robert E. Wells.
  The technique used in the article is fairly simple, and is
more oriented towards finding allocation errors.  Depending
on the severity of your errors, their technique might be
applicable.

				Robert Herndon
				...!ihnp4!umn-cs!herndon
				herndon@umn-cs
				herndon.umn-cs@csnet-relay.ARPA

g-rh@cca.UUCP (Richard Harter) (04/30/86)

In article <> faustus@cad.UUCP (Wayne A. Christopher) writes:
>In article <4495@cbrma.UUCP>, trl@cbrma.UUCP (Tom Lanning) writes:
>
>> 	Are there any "malloc" routines that check for modification of
>> malloc's link pointers by the "user's" code?   Close to 70% of my bugs
>> are stack/memory problems where I use more space than I allocated.
>
>You could compile the 4.3 malloc() with the -DRCHECK flag, which checks
>that you haven't modified the areas beyond your segment when you free it.
>Also, if you're REALLY paranoid, write your own malloc that puts lots of
>padding around the allocated areas and checks that none of the padding
>areas has been changed every time you call malloc() or free(). 
>
	We wrote our own, partly out of paranoia and partly out of a
probably misguided belief that we could write a more efficient allocator.
The main thing that we did was to put all pointers and links in an entirely
separate area from the space being allocated.  This wins in that pointers
never get overwritten -- it loses in that the program does not crash 
immediately on range errors.  We added a one word pad on each end for
overwrite testing (can be turned off) and legitimacy tests on all returns
of space.   We also put in an option to store where requests were coming
from.  (Hasn't been used in years.)  The upshot is that space request/free
problems are rare and easily found.  However this doesn't avoid the problem
of incorrectly dimensioned arrays which are handled by the system and can
lead to very peculiar bugs.
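
The padding idea described above might be sketched roughly as follows.  This
is not the allocator from the post: it sits on top of the system malloc(),
keeps the block size next to the block rather than in a separate area, and
uses several pad bytes instead of one word.  The names guard_alloc, guard_ok
and guard_free, and the constants HDR, PAD and PAD_BYTE, are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Layout of one block (all sizes are illustrative assumptions):
 *   [size_t n][front pad ... up to HDR bytes][n user bytes][PAD rear bytes]
 * HDR is a multiple of the usual malloc alignment, so the user area
 * stays as well aligned as malloc's own result. */
enum { HDR = 32, PAD = 8, PAD_BYTE = 0xA5 };

/* Allocate n bytes with padding on both sides.
 * Overflow of HDR + n + PAD is not checked in this sketch. */
void *guard_alloc(size_t n)
{
    unsigned char *raw = malloc(HDR + n + PAD);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &n, sizeof n);                          /* remember the size */
    memset(raw + sizeof n, PAD_BYTE, HDR - sizeof n);   /* front pad */
    memset(raw + HDR + n, PAD_BYTE, PAD);               /* rear pad */
    return raw + HDR;
}

/* Return 1 if every byte in p[0..len-1] still holds the pad pattern. */
static int all_pad(const unsigned char *p, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        if (p[i] != PAD_BYTE)
            return 0;
    return 1;
}

/* Return 1 if both pads around a guard_alloc'ed block are intact. */
int guard_ok(const void *p)
{
    const unsigned char *raw;
    size_t n;
    assert(p != NULL);
    raw = (const unsigned char *)p - HDR;
    memcpy(&n, raw, sizeof n);
    return all_pad(raw + sizeof n, HDR - sizeof n)
        && all_pad(raw + HDR + n, PAD);
}

/* Free a block and report whether it was intact when released. */
int guard_free(void *p)
{
    int ok;
    if (p == NULL)
        return 1;
    ok = guard_ok(p);
    free((unsigned char *)p - HDR);
    return ok;
}
```

A caller would check the return value of guard_free(), or call guard_ok() at
suspect points, and stop in the debugger as soon as either reports 0.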

		Richard Harter, SMDS Inc.

kwh@bentley.UUCP (KW Heuer) (04/30/86)

In article <117@prairie.UUCP> prairie!dan (Dan Frank) writes:
[comments on overflow-checking architecture]
>   You're probably better off with a language that compiles checks into
>the code, and an option to turn [them] off...

As I mentioned, you can do it this way in C++, but when you want to use
pointers you have to copy three words instead of one.  (Or you can use
a language like Pascal, which "solves" the problem by disallowing pointer
arithmetic.)  What I was thinking of, though, was a computer with strict
architecture that could be used for development and testing; when the
program is shipped to the Real World it would presumably run on "normal"
architecture.

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

rbutterworth@watmath.UUCP (Ray Butterworth) (05/01/86)

>    You're probably better off with a language that compiles checks into
> the code, and an option to turn off those checks once you're confident
> (?!) of the program.  With a capability-based architecture, you pay the
> price all the time, whether you want to or not.

Many years ago I worked with a language in which all arrays had to
have dimensions that were a power of two (like 4.2 malloc).  The
code which indexed into the array simply anded the index with the
appropriate bit mask.  This was very fast, yet it guaranteed that
any bad indexes wouldn't corrupt anything except the array being
addressed.  As a side-effect, one could use this feature to cycle
continuously through an array or could even use negative indexes
without any extra overhead.
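
As a rough C illustration of the masking trick (the language in question is
not named in the post, and ring, RING_SIZE, ring_wrap and ring_at are
hypothetical names), every index is ANDed with size-1 before use, so a bad
index can only land inside the same array:

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE 8u                  /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

typedef struct {
    int slot[RING_SIZE];
} ring;

/* Map any index, including a negative one, onto 0 .. RING_SIZE-1.
 * Converting a negative int to unsigned is well defined in C (it wraps
 * modulo UINT_MAX+1), so -1 becomes RING_SIZE-1 after masking. */
static unsigned ring_wrap(int i)
{
    return (unsigned)i & RING_MASK;
}

/* Address of element i; never points outside r->slot. */
int *ring_at(ring *r, int i)
{
    assert(r != NULL);
    return &r->slot[ring_wrap(i)];
}
```

Note that this contains the damage rather than reporting it: an out-of-range
index silently aliases another element of the same array.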

jer@peora.UUCP (J. Eric Roskos) (05/02/86)

> >Now, if only somebody would invent an architecture where all objects,
> >including dynamically allocated objects, are isolated in memory, then any
> >subscript error would cause an immediate memory fault.
>
>    If I'm not mistaken, this was done on the iAPX432, using a capability-
> based addressing scheme.  Dimmed the lights.  You could probably construct
> such an environment on the 80286, but no one does, probably for efficiency
> reasons.

One problem with the 432's approach was that it was very extreme; I don't
think it's good to say "the 432 tried these approaches and it was too slow,
therefore the checking can't be efficiently implemented."

I posted some comments in here (net.arch) about a week ago on apparently
the same subject, but nobody replied in net.arch to it (although I got a
couple of replies by mail).  Of the people who replied by mail, one (whose
technical knowledge I have a high opinion of) pointed out that C compilers
exist where subscript/pointer checking is done in software, and that thus
it would seem likely that similar checking could be done in hardware.

The way you could do it (which was a point the 2 people replying seemed to
agree upon) was that, associated with all pointers, you should have a
"minimum address" and "maximum address" for the object being pointed to.
Bear in mind that in C array names are just constant pointers, so
constructs like a[i] can use this method as well as plain pointer
references such as *p.  If p is a pointer of type t, then to use p you
will have to first assign it a value by referencing an existing object, or
by creating a new one:

	typedef <whatever> t;
	t  a[100];
	t  *p;

	p = a;                  (1)
	p = &a[40];             (2)
	p = (t *)malloc(300);   (3)

In case 1 and 2, you can easily set p.base to &a[0], and p.bound to
&a[99], and set p.pointer to &a[0] for (1) and to &a[40] for (2).  So p then carries
around with it the range of valid addresses it can point to.  (Note that
nothing says anything about what a pointer in C has to look like, so
p can easily be a 3-word struct-like object, and if you were building a
new machine to support such things, you could make the machine have
3-word address registers).

In case 3, you could have malloc set the base and bound -- though if
malloc is written in C then you'd have to provide some way to reference
the base and bound fields from within the language -- so things like
malloc would also work.  I had originally thought that some
counterexamples existed, but one of the respondents (John Gilmore) pointed
out that really the counterexamples involved essentially semantically
inconsistent uses of the pointers (e.g., having 2 pointers around and
changing the bounds on 1 of them).

In any case, if you change p, e.g. p++, then you'd change what I called
p.pointer above, and leave p.base and p.bound alone.  If you generated an
effective address which was outside [p.base .. p.bound], then you'd
generate an addressing fault.
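
In software, the three-word pointer described above might be approximated
like this.  It is only a sketch under stated assumptions: the element type t
is taken to be int, the names checked_ptr, cp_make, cp_add and cp_load are
invented here, and a failed check returns 0 where the proposed hardware
would raise an addressing fault.

```c
#include <assert.h>
#include <stddef.h>

typedef int t;                /* stands in for "typedef <whatever> t" */

typedef struct {
    t *pointer;               /* where p currently points; p++ changes only this */
    t *base;                  /* lowest valid address */
    t *bound;                 /* highest valid address, inclusive */
} checked_ptr;

/* Build a checked pointer to element `index` of an object that has
 * `count` elements starting at `base`. */
checked_ptr cp_make(t *base, size_t count, size_t index)
{
    checked_ptr p;
    assert(count > 0 && index < count);
    p.base = base;
    p.bound = base + (count - 1);
    p.pointer = base + index;
    return p;
}

/* Pointer arithmetic: base and bound are left alone.  Portable C may only
 * form addresses up to one past the end of the object, so callers of this
 * sketch must not step further; hardware would compare raw addresses. */
checked_ptr cp_add(checked_ptr p, ptrdiff_t k)
{
    p.pointer += k;
    return p;
}

/* Checked load: stores the value in *out and returns 1, or returns 0
 * when the effective address lies outside [base .. bound]. */
int cp_load(checked_ptr p, t *out)
{
    if (p.pointer < p.base || p.pointer > p.bound)
        return 0;
    *out = *p.pointer;
    return 1;
}
```

With t a[100], cp_make(a, 100, 0) corresponds to case (1) and
cp_make(a, 100, 40) to case (2); a malloc wrapper would fill in base and
bound for case (3).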

I don't think this checking would be that slow, although on a machine with
a narrow bus (especially those like the 8088 where you are already
fetching pointers through multiple bus cycles) fetching the range
attributes of the pointer would increase the bus time by a factor of 3.
It would also reduce the number of register variables you could have, if
you kept the bounds in registers also -- I think it would work best if you
had a machine that had registers set aside specifically for pointer
checking.  On a machine such as the 3280*, which does quadword reads from
memory because the data bus is very wide, the bus overhead would be much
less.  So the checking by this method would probably not be that bad
(certainly not as bad as the 432, which I believe had to sometimes fetch
several descriptor objects in order to validate references) at least on
larger machines (and after all, microprocessors are getting larger all the
time in terms of width of the bus, etc.).

-----
*I cite this machine because I'm more familiar with it; I suspect probably
 other machines like Vaxes have similarly wide buses.
-- 
E. Roskos

rose@think.ARPA (John Rose) (05/05/86)

In article <763@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes:
>In article <4495@cbrma.UUCP> cbrma!trl (Tom Lanning) writes:
>>	Are there any "malloc" routines that check for modification of
>>malloc's link pointers by the "user's" code?   Close to 70% of my bugs
>>are stack/memory problems where I use more space than I allocated.
>
>As mentioned in the man page, the malloc() code has a compile-time option
>for strict checks on the arena.  (This is not too useful if you have no
>source code, of course.)

If you do have source code, here's another suggestion which has worked
very well for me.  Define a circular buffer which stores a record of
the last few hundred records of malloc/free/morecore history.  Make
sure your symbolic debugger can dump it for you.  This trick alone
has saved me hours of debugging time on quite a few occasions.
In my applications, someone would either (1) try to use freed storage,
or (2) go off the end of allocated storage, and usually these errors
occurred within a dozen history events after the call to (1) free or
(2) malloc.
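
A history buffer of this kind might look roughly like the sketch below.  It
wraps malloc() and free() instead of living inside the allocator, so it does
not see morecore events, and every name in it (hist_ring, hist_count,
hist_malloc, hist_free, hist_recent, HIST_LEN) is an assumption made for
illustration.  The ring and its counter are globals so that a symbolic
debugger can dump them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HIST_LEN 256          /* "the last few hundred" events */

typedef enum { EV_MALLOC, EV_FREE } hist_kind;

typedef struct {
    hist_kind kind;
    void *addr;               /* block returned by malloc or passed to free */
    size_t size;              /* request size; 0 for free */
} hist_event;

hist_event hist_ring[HIST_LEN];   /* oldest entries are overwritten */
unsigned long hist_count;         /* total events recorded so far */

static void hist_record(hist_kind kind, void *addr, size_t size)
{
    hist_event *e;
    /* A power-of-two length keeps the slot sequence consistent even if
     * the counter ever wraps around. */
    assert((HIST_LEN & (HIST_LEN - 1)) == 0);
    e = &hist_ring[hist_count % HIST_LEN];
    e->kind = kind;
    e->addr = addr;
    e->size = size;
    hist_count++;
}

void *hist_malloc(size_t n)
{
    void *p = malloc(n);
    hist_record(EV_MALLOC, p, n);
    return p;
}

void hist_free(void *p)
{
    hist_record(EV_FREE, p, 0);   /* log first, while p is still valid */
    free(p);
}

/* Event recorded `back` steps ago (0 = most recent), or NULL if that
 * event was never recorded or has already been overwritten. */
const hist_event *hist_recent(unsigned long back)
{
    if (back >= hist_count || back >= HIST_LEN)
        return NULL;
    return &hist_ring[(hist_count - 1 - back) % HIST_LEN];
}
```

After a crash, walking hist_recent(0), hist_recent(1), ... from the debugger
shows the last allocations and frees leading up to the fault.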

>Now, if only somebody would invent an architecture where all objects,
>including dynamically allocated objects, are isolated in memory, then any
>subscript error would cause an immediate memory fault.  You'd still be
>vulnerable to completely wild pointers (but less likely in a huge address
>space), and overflow of an array inside a structure might be untrappable,
>but otherwise it sounds like a great machine to do your debugging on.
>
>Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

It's called the "Lisp Machine".  All storage is allocated and manipulated
according to very strict typing rules, enforced in microcode.  If you walk
off the end of an array, you get an immediate break into the debugger,
with options to (a) abort, (b) re-invoke any stack frame, (c) return from
any stack frame, (d) extend the array in place, (e) anything else you
can specify in Lisp (the debugger, command loop, and application all
run in the same address space!  This is safe because of the enforcement
mentioned above).  The C compiler I'm using (Zeta-C from Zeta-Soft, Ltd.)
implements pointers (approximately) as ordered pairs of arrays and indexes.

The bounds-checking can be done in macrocode on a conventional machine.
The BCC compiler (from Delft Consulting?) does this by source-transforming
a C program so that pointers turn into small records carrying bounds
information.

Both systems (as far as I know) model the C runtime memory as a collection
of "floating" byte arrays.  Part of the justification for this is found in K&R
RefMan 7.6 "Pointer comparison is portable only when the pointers point to
objects in the same array."

[Disclaimer:  I have no professional connection with the Zeta-C or BCC
companies, although I do know the implementors of both products personally.]
-- 
----------------------------------------------------------
John R. Rose		     Thinking Machines Corporation
245 First St., Cambridge, MA  02142    (617) 876-1111 X270
rose@think.arpa				  ihnp4!think!rose

toby@felix.UUCP (Toby Gottfried) (05/05/86)

In article <763@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes:
>Now, if only somebody would invent an architecture where all objects,
>including dynamically allocated objects, are isolated in memory, then any
>subscript error would cause an immediate memory fault.  

	Burroughs did exactly this in their Large Systems
	over 20 years ago.

>You'd still be vulnerable to completely wild pointers (but less likely 
>in a huge address space), 

	Not a problem - the address space isn't flat.

>and overflow of an array inside a structure might be untrappable,
>but otherwise it sounds like a great machine to do your debugging on.

	It is.

-- 
Toby Gottfried
 FileNet Corp	 {ucbvax,ihnp4,decvax}!trwrb!felix!toby
Costa Mesa, CA

kwh@bentley.UUCP (KW Heuer) (05/09/86)

In article <5097@think.ARPA> rose@think.ARPA (John Rose) writes:
>If you do have source code, here's another suggestion which has worked
>very well for me.  Define a circular buffer which stores a record of
>the last few hundred records of malloc/free/morecore history.  Make
>sure your symbolic debugger can dump it for you.  This trick alone
>has saved me hours of debugging time on quite a few occasions.

I've found the following front-end* to be very useful:
	#include <stdio.h>
	#include <stdlib.h>

	/* Log each allocation as "+addr" and each release as "-addr". */
	void *Dalloc(size_t n) {
	    void *p = malloc(n);
	    fprintf(stderr, "+%p\n", p);
	    return p;
	}
	void Dfree(void *p) {
	    fprintf(stderr, "-%p\n", p);
	    free(p);
	}
	void *Drealloc(void *p, size_t n) {
	    fprintf(stderr, "-%p\n", p);	/* the old block is released... */
	    p = realloc(p, n);
	    fprintf(stderr, "+%p\n", p);	/* ...and the new one is allocated */
	    return p;
	}

When the program terminates (normal exit or core dump), I run a consistency
check on the log file to cancel out "+xxx" and "-xxx".  Any unbalanced "-"
is a bug.  I consider an unbalanced "+" to be a bug, too, unless there are
a bounded number of them and I can account for them all.
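
The consistency check itself is not shown in the post.  One way it might be
written, assuming a log of "+addr" and "-addr" lines with hexadecimal
addresses, is sketched below.  The names check_log, log_result and MAX_LIVE
are invented here, and treating address 0 (a failed malloc or a free of a
null pointer) as uninteresting is also an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 4096         /* arbitrary cap on simultaneously live blocks */

typedef struct {
    unsigned long unbalanced_plus;    /* allocated but never freed */
    unsigned long unbalanced_minus;   /* freed but not currently allocated */
} log_result;

/* Read "+addr" / "-addr" lines in order, cancelling each "-" against an
 * earlier "+" for the same address.  Processing in order matters, since
 * an allocator may hand out the same address again after a free. */
log_result check_log(FILE *in)
{
    static unsigned long long live[MAX_LIVE];
    size_t nlive = 0, i;
    char line[64];
    log_result r = {0, 0};

    while (fgets(line, sizeof line, in) != NULL) {
        char *end;
        unsigned long long addr;

        if (line[0] != '+' && line[0] != '-')
            continue;                         /* not a log line */
        /* Base 16 accepts leading blanks and an optional 0x prefix. */
        addr = strtoull(line + 1, &end, 16);
        if (end == line + 1 || addr == 0)
            continue;                         /* unparsable or null */

        if (line[0] == '+') {
            assert(nlive < MAX_LIVE);
            live[nlive++] = addr;
        } else {
            for (i = 0; i < nlive && live[i] != addr; i++)
                ;
            if (i < nlive)
                live[i] = live[--nlive];      /* cancel the matching "+" */
            else
                r.unbalanced_minus++;         /* free of an unknown block */
        }
    }
    r.unbalanced_plus = (unsigned long)nlive;
    return r;
}
```

Run over the saved log, a nonzero unbalanced_minus points at a bad free,
while unbalanced_plus counts the blocks still outstanding at exit.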

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint
*It can also be built into malloc.c if you have source; this allows it to
log the calls within stdio.  For some applications, a special log file
should be used instead of stderr.

rbj@icst-cmr (Root Boy Jim) (05/14/86)

> ...I consider an unbalanced "+" to be a bug, too, unless there are
> a bounded number of them and I can account for them all.

There always is :-)

	(Root Boy) Jim Cottrell		<rbj@cmr>
	"One man gathers what another man spills"