jkh@pcsbst.UUCP (jkh) (06/07/89)
I was thinking about writing this the other day, but figured I'd ask around first before expending the effort. Has anyone written a version of malloc/free that keeps extra information around about each chunk allocated and free'd, so that garbage things free'd (or legitimate data free'd multiple times) can be detected and reported?

An additional plus would be for malloc() to somehow detect when you'd gone off the end (either end) of a malloc'd block, but I can see how this would be a lot harder to do, if not impossible in some cases. Nonetheless, I'm interested in anything that anyone's done in this dept.

Jordan Hubbard
PCS Computer Systeme, GmbH.
EUROPE: jkh@meepmeep.pcs.com
USA: jkh@violet.berkeley.edu
--
--------
Jordan Hubbard  PCS Computer Systeme GmbH  West Germany
UUCP: {uunet,decwrl}!pyramid!pcsbst!jkh
ARPA: jkh@violet.berkeley.edu
Hey! Leave that alone!
steve@nuchat.UUCP (Steve Nuchia) (06/07/89)
In a past life I wrote such a "checkout" version of a memory allocator. It was a bit more complex than malloc/free (it implemented a code-driven virtual memory facility), but the error checking techniques are portable.

Start with a working malloc implementation -- one based on a huge static array would be fine. Make the header include the size of the block, keep an active block chain, and ensure that you can also step through the array based on the sizes. This allows you to cross-check the chain integrity against the constraint that all memory (in the array) belong to some block.

When allocating a block, add a buffer zone on both sides of the data you give to the process. Fill this with a pattern; I used the address of each word XORed with a convenient constant. When freeing (at a minimum), check that the pattern is undisturbed.

Another useful thing to keep in the header is a tag indicating which malloc call created it. If you can make malloc a macro and have it pass in __FILE__ and __LINE__, you can get from a suspect block to the line that called malloc. If you have a service wrapped around malloc you may need to extend this up a level.

Periodically call a function that passes over the malloc data structure and validates it against whatever constraints you can think of. At a minimum, check chain and contiguity constraints and all buffer zones. You may also want to ensure that the counts of certain types haven't exceeded some maximum -- whatever needs to be diagnosed. Sprinkle calls to this function wherever you think it might be needed, then move them around to isolate your problem.

Sorry I don't have any code, but this didn't take more than an hour to put together, and I found a lot of bugs with it.
--
Steve Nuchia          South Coast Computing Services
uunet!nuchat!steve    POB 890952  Houston, Texas  77289
(713) 964 2462        Consultation & Systems, Support for PD Software.
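The techniques described in the post above (guard zones filled with an address-derived pattern, a call-site tag in the header, and a periodic validation pass) can be sketched in C roughly as follows. This is not Steve's code, which was never posted; every name here (ck_malloc, ck_free, ck_check_all, struct ck_hdr, GUARD_WORDS, GUARD_KEY) is made up for illustration, and the sketch sits on top of the system malloc rather than a static array, so it checks only the chain and the guard zones, not contiguity.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORDS 4                           /* guard words per side (assumed) */
#define GUARD_KEY   ((uintptr_t)0x5A5A5A5Au)    /* "convenient constant" (assumed) */
#define GUARD_BYTES (GUARD_WORDS * sizeof(uintptr_t))

/* Hypothetical header placed in front of every block. */
struct ck_hdr {
    struct ck_hdr *next;    /* chain of active blocks */
    size_t size;            /* bytes the caller asked for */
    const char *file;       /* call site, from __FILE__ */
    int line;               /* call site, from __LINE__ */
};

static struct ck_hdr *ck_active = NULL;

/* Each guard word holds its own address XORed with GUARD_KEY.  memcpy is
 * used because the trailing guard may not be word-aligned. */
static void guard_fill(unsigned char *g)
{
    for (int i = 0; i < GUARD_WORDS; i++) {
        unsigned char *w = g + i * sizeof(uintptr_t);
        uintptr_t v = (uintptr_t)w ^ GUARD_KEY;
        memcpy(w, &v, sizeof v);
    }
}

static int guard_ok(const unsigned char *g)
{
    for (int i = 0; i < GUARD_WORDS; i++) {
        const unsigned char *w = g + i * sizeof(uintptr_t);
        uintptr_t want = (uintptr_t)w ^ GUARD_KEY, got;
        memcpy(&got, w, sizeof got);
        if (got != want)
            return 0;
    }
    return 1;
}

/* Layout: header | front guard | user data | back guard */
static unsigned char *ck_user(struct ck_hdr *h)
{
    return (unsigned char *)(h + 1) + GUARD_BYTES;
}

static int block_ok(struct ck_hdr *h)
{
    return guard_ok((unsigned char *)(h + 1)) && guard_ok(ck_user(h) + h->size);
}

void *ck_malloc_at(size_t size, const char *file, int line)
{
    struct ck_hdr *h = malloc(sizeof *h + 2 * GUARD_BYTES + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->file = file;
    h->line = line;
    h->next = ck_active;
    ck_active = h;
    guard_fill((unsigned char *)(h + 1));
    guard_fill(ck_user(h) + size);
    return ck_user(h);
}

/* The macro records where each allocation came from. */
#define ck_malloc(n) ck_malloc_at((n), __FILE__, __LINE__)

/* Validation pass: returns the number of active blocks with damaged guards. */
int ck_check_all(void)
{
    int bad = 0;
    for (struct ck_hdr *h = ck_active; h != NULL; h = h->next) {
        if (!block_ok(h)) {
            fprintf(stderr, "guard damaged: %zu-byte block from %s:%d\n",
                    h->size, h->file, h->line);
            bad++;
        }
    }
    return bad;
}

/* Returns 0 on a clean free, -1 for an unknown pointer or damaged guards. */
int ck_free(void *p)
{
    for (struct ck_hdr **pp = &ck_active; *pp != NULL; pp = &(*pp)->next) {
        struct ck_hdr *h = *pp;
        if (ck_user(h) != p)
            continue;
        int ok = block_ok(h);
        if (!ok)
            fprintf(stderr, "ck_free: guard damaged, block from %s:%d\n",
                    h->file, h->line);
        *pp = h->next;      /* unlink, then release */
        free(h);
        return ok ? 0 : -1;
    }
    fprintf(stderr, "ck_free: %p is not an active block\n", p);
    return -1;
}
```

A program under test would call ck_malloc() and ck_free() in place of malloc() and free(), and sprinkle calls to ck_check_all() around the suspect code. Because ck_free() searches the active chain, a pointer that was never allocated (or was already freed) is reported instead of being handed to free(). The linear search is slow, which seems acceptable for a debugging build.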
roy@phri.UUCP (Roy Smith) (06/08/89)
jkh@meepmeep.pcs.com (Jordan K. Hubbard) writes:
> Has anyone written a version of malloc/free that keeps extra information
> around about each chunk allocated and free'd so that garbage things free'd
> (or legitimate data free'd multiple times) can be detected and reported?

I have two such packages, both of which were posted to the net some time ago. Michael Schwartz's maltrace package (slightly reformatted README file included below) is for tracing memory leaks; not exactly what Jordan asked for, but useful in its own right. I don't have much experience with it personally. Another package, from Brandon Allbery (see excerpt below), is useful for the kind of stuff Jordan wants to do: checking for corrupted malloc()'d memory, bad or redundant free()'s, etc. I've used it a lot and think it's wonderful. If you have BSD source, you should be able to recompile your standard malloc package to do similar checking, but I've never done so; I just use Brandon's.

See the headers below to find out where and when the articles were posted originally. If you can't find them in some standard archive site (they are probably both on uunet, I would guess) let me know and I could mail them to you.

> An additional plus would be for malloc() to somehow detect when you'd gone
> off the end (either end) of a malloc'd block, but I can see how this would
> be a lot harder to do, if not impossible in some cases.

I've never tried this, but it seems possible to automatically generate a call to a checking function after each statement, using something like the mechanism dbx's "next" command uses to regain control while single-stepping through a program. Each time you malloc a block, you could generate a few extra words before and after the block and fill them with magic numbers. The checking function could make sure all the magic numbers are as they should be. Of course, you still want to have free() be extra careful about checking to make sure it's freeing something it's supposed to.
You might also have free zero out every byte that it frees and check to make sure that they stay zero (or some magic number, preferably an odd number which is not likely to be a valid pointer) each time the check function is called. This will catch programs which still use free()'d memory.

This would all be slow as shit, but at least it would catch most malloc errors. Perhaps you could have switches all over the place to trade off speed for exhaustiveness -- maybe only check on every Nth statement execution, maybe not do the free'd zero checking, etc.

Another thing which might be nice to have is a flag to malloc to automatically make each allocated block N bytes longer than is asked for. If your program has mysterious crashes which go away when you allocate an extra 4 bytes on each block, you might want to start looking for off-by-one errors at the ends of malloc'ed blocks.

----------------------
Newsgroups: comp.sources.misc
Subject: maltrace -- trace un-free()'d space with dbx (maybe others?)
Message-ID: <2835@ncoast.UUCP>
Date: 10 Jul 87 02:00:34 GMT
X-Archive: comp.sources.misc/8707/39

Malloc Leak Trace Package
by Michael Schwartz
University of Washington Computer Science Department
Seattle, Washington, USA
schwartz@cs.washington.edu (ARPANET)
schwartz%cs.washington.edu@relay.cs.net (CSNET)
...{allegra,caip,ihnp4,nike}!uw-beaver!schwartz (UUCP)
April 1987

1. Description

This package is designed to help trace dynamic memory allocation leaks. It works by providing the standard malloc/free/realloc interface, but at the same time keeping track of all malloc'd buffers, including their size, order of call, and address. That way, at any point during the execution of your program, you can see what malloc'd buffers haven't yet been freed. It is particularly useful when your program performs many allocations before reaching some steady state, and hence you want to ignore the initial allocations and concentrate on where steady-state leaks occur.
The idea is that you have some code (usually a server) that looks as follows:

    initialization code;
    do {
        ...
    } while (1); /* main loop */

There might be some dynamic allocation during the initialization, but this isn't where the memory leak is, since it's a one-shot allocation (i.e., at worst the initialization wastes some memory, but doesn't continually leak it). There might also be some dynamic allocation in the first few iterations of the main loop, until a "steady state" is reached (e.g., until some cache gets filled). In both cases (initialization and pre-steady state iterations), there may be many allocation calls, but you don't really want to look at them; rather, you want to look at what memory isn't being free'd once steady state has been reached. This package helps you to see the state of memory allocation after steady state has been reached.

Bug reports and suggestions for improvements are welcome.

2. Instructions

To use this package, take your favorite malloc/free/realloc code, and change the routine names as follows:

    malloc  -> mmalloc
    free    -> ffree
    realloc -> rrealloc

You'll probably also need to add the following line to the beginning of your malloc.c:

    char *malloc();

(because realloc still calls malloc, but malloc is no longer defined in that file). Then link the program to be leak-traced with maltrace.o, btree.o, and (your modified) malloc.o. I would have included my malloc.c, but it's the copyrighted BSD 4.3 code, and besides, there are plenty of public domain malloc's available (e.g., in volume 6 of mod.sources).

To trace a leak, take the example program skeleton, and augment it as follows:

    extern int MalTraceEnbld;
    extern int MalBrkNum;

    initialization code;
    do {
        ...
        if ( steady state reached)
            MalTraceEnbld = 1;
        ...
        at end of first steady-state cycle:
            PMal(-10); /* print last 10 (say) malloc's that haven't yet been free'd */
    } while (1); /* main loop */

Then compile the program with -g, and run it.
At the end of the first cycle, PMal will print a list of the last 10 malloc's that haven't yet been freed. (PMal(n) will print the first n entries if n > 0, the last -n entries if n < 0, and all entries if n == 0.) Note the sequence number of one of these mallocs, and then go into dbx on the program, put a breakpoint somewhere in the initialization code, and run the program. When you hit the breakpoint, use dbx to set MalBrkNum to the number of the malloc you just noticed, and set a break in MalBreak. Then, continue the program. When the malloc call in question is reached, MalBreak will get called, breaking, and giving you a chance to examine the state of the program at the time of this (potentially leaking) malloc call. In case this call is still within the steady-state setup (it is sometimes difficult to find where the setup ends), you can use dbx to call NextMal, to set MalBrkNum to be the next traced malloc call.

2.1 Usage Details

This technique is not applicable to situations where the steady state allocation behavior (i.e., order and size of malloc requests) exhibits variation, e.g., due to pseudo-randomization or interaction with other processes via non-deterministically ordered message exchanges. In such situations you can sometimes inhibit the variation during debugging (e.g., by forcing interactions to occur in the same order each time). Alternatively, you can use dbx to set MalBreakSize to be the size of the malloc request at which to have MalBreak called, to reach a breakpoint (similar to the MalBrkNum scheme described above). This can be useful when the order of mallocs isn't fixed, but a particular size keeps showing up in the list of malloc's that haven't yet been free'd.

There is also a routine called UntracedFree that gets called when a free call is made on an address that was not malloc'd with tracing enabled (again, this routine is present to allow one to set dbx breakpoints for this event).
This could either indicate a free call on an address that isn't malloc'd (a bug) or a free call on an address that was malloc'd with tracing disabled. You can determine if it was of the former nature by going through the standard malloc code. For example, in the BSD code, you can set the switches -DDEBUG and -DRCHECK to check for this and other types of bugs. Alternatively, you can enable tracing from the very beginning of your program, and then any time UntracedFree gets called, it indicates a free call on an address that isn't malloc'd.

3. Interactive Demo

You can try out this package interactively by making the program 'test'. Note that if you tell it to free some memory that was not malloc'd (with MalTraceEnbld = 1), it will give you a warning and then try to free the address anyway (for the reasons explained earlier). This may or may not cause malloc/free to get into a bad state; in BSD malloc this can cause a core dump, for instance.

4. Acknowledgements, History

Thanks to Richard Hellier from the University of Kent at Canterbury (rlh@ukc.UUCP) for the btree package (which I modified slightly for the current package).

I probably could have implemented my trace package more efficiently than it works currently (e.g., by incorporating the linked-list and btree nodes into the malloc header nodes), but I was more into hacking something together quickly that would solve my problems than making efficient code. Besides, this code doesn't need to be efficient, since it's only plugged in during debugging.

----------------
From: allbery@ncoast.UUCP (Brandon S. Allbery)
Newsgroups: comp.sources.misc
Subject: malloc package with debugging
Message-ID: <3268@ncoast.UUCP>
Date: 21 Jul 87 01:32:07 GMT
X-Archive: comp.sources.misc/8707/59

This is my malloc() debugging package. Response to my small note was rather startling, so I'm posting it.

The basic idea is that malloc'ed and free'd blocks are kept in doubly-linked lists.
Every time an allocation function (malloc, free, calloc, realloc, or _mallchk) is called, the lists are checked to make sure the pointers have not been overwritten and the sizes are valid. They catch the majority of malloc'ed array overflows, and print dumps on segmentation and bus errors in order to determine if a memory overwrite was involved. They aren't perfect (an interpreter or other form of full runtime checking of every assignment would be needed for that), but they're pretty good.

One warning: you can't depend on a free()'d block still being available; it will sbrk() backwards if possible. It also doesn't coalesce adjacent free blocks or do other kinds of "optimum" memory management; I consider this unimportant, since this is a debugging package rather than a full replacement for malloc. It's also slower than the "real" malloc.

The code is included below; it's probably heavily 68000 dependent, but I've done my best to reduce such dependencies to a minimum.
--
Roy Smith, Public Health Research Institute
455 First Avenue, New York, NY 10016
{allegra,philabs,cmcl2,rutgers,hombre}!phri!roy -or- roy@alanine.phri.nyu.edu
"The connector is the network"
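Two of the suggestions made earlier in this post (filling free()'d memory with a magic value and checking that it stays put, and a switch that pads every block by N bytes) are easy to sketch in C. The code below is only an illustration under stated assumptions, not code from either posted package: the names fz_malloc, fz_free, fz_check_freed, fz_pad and FREED_FILL are invented, and freed blocks are parked on a list and never returned to the system, so that later checks read memory the debugging layer still owns.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FREED_FILL 0xDB     /* odd fill byte; the exact value is arbitrary */

/* Hypothetical header; the union pads it so user data stays aligned. */
union fz_hdr {
    struct {
        union fz_hdr *next; /* links freed blocks together */
        size_t size;        /* bytes the caller asked for */
    } h;
    max_align_t align;
};

static union fz_hdr *fz_quarantine = NULL;  /* freed, still-owned blocks */

/* Extra bytes added to every request.  Setting this to, say, 4 and seeing
 * crashes vanish hints at an off-by-one at the end of some block. */
size_t fz_pad = 0;

void *fz_malloc(size_t size)
{
    union fz_hdr *b = malloc(sizeof *b + size + fz_pad);
    if (b == NULL)
        return NULL;
    b->h.size = size;
    b->h.next = NULL;
    return b + 1;
}

/* Fill the block with FREED_FILL and park it instead of releasing it. */
void fz_free(void *p)
{
    if (p == NULL)
        return;
    union fz_hdr *b = (union fz_hdr *)p - 1;
    memset(p, FREED_FILL, b->h.size);
    b->h.next = fz_quarantine;
    fz_quarantine = b;
}

/* Returns how many freed blocks have been written to since fz_free(). */
int fz_check_freed(void)
{
    int bad = 0;
    for (union fz_hdr *b = fz_quarantine; b != NULL; b = b->h.next) {
        const unsigned char *d = (const unsigned char *)(b + 1);
        for (size_t i = 0; i < b->h.size; i++) {
            if (d[i] != FREED_FILL) {
                bad++;
                break;
            }
        }
    }
    return bad;
}
```

This only detects writes through stale pointers; a stale read simply sees FREED_FILL bytes, which tends to produce an obviously bogus value rather than a silent success. Since nothing is ever released, memory use grows without bound, so a real version would presumably cap the quarantine and release the oldest blocks after checking them.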
bet@bent (Bennett Todd) (06/11/89)
My error-checking wrappers library (libbent) does this in its wrappers around malloc/free/realloc (emalloc, efree, and erealloc, respectively). I sometimes regret having done this, since as a result of this decision, the pointers handled by emalloc/efree/erealloc aren't interchangeable with malloc/free/realloc, whereas the other cookies that are passed about (file descriptors, FILE pointers, and so forth) are interchangeable between my e* wrappers and the lower level functions. On the other hand, I haven't gotten into trouble as a result of mixing yet, and the additional checking has helped me catch some evil bugs quickly.

My code has emalloc over-allocate by twice the length of a header structure, which includes a magic number and the allocation length. One copy is prepended, and one appended, and the pointer to the middle space is returned. On efree and erealloc the headers and trailers are checked; on efree they are mangled in a predictable fashion, so that double freeing can be detected. This catches double freeing, freeing bogus pointers, and running off either end of the array.

I use these in all my production code. Although they don't catch problems as quickly as some other more ambitious packages out there, they also don't inflict enough overhead to make me want to leave them out of production code.

-Bennett
bet@orion.mc.duke.edu
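The header-plus-trailer scheme described above can be sketched in C as follows. This is not libbent's source, which does not appear in this thread; the names g_malloc, g_free, g_release, struct tag and the magic values are all hypothetical, and the status codes are just one way to report what was found.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define LIVE_MAGIC  0xA110CA7EuL    /* arbitrary value chosen for this sketch */
#define FREED_MAGIC (~LIVE_MAGIC)   /* "mangled in a predictable fashion" */

/* One copy sits before the user data and one after it.  Assumes
 * sizeof(struct tag) keeps the user data suitably aligned. */
struct tag {
    unsigned long magic;
    size_t len;
};

enum g_status {
    G_OK,
    G_DOUBLE_FREE,  /* header carries the freed magic */
    G_BAD_HEADER,   /* bogus pointer, or a write before the block */
    G_BAD_TRAILER   /* a write past the end of the block */
};

/* Release hook.  Normally free(); a test can set it to NULL so that freed
 * blocks stay readable and the double-free check can be exercised safely. */
void (*g_release)(void *) = free;

void *g_malloc(size_t len)
{
    struct tag t = { LIVE_MAGIC, len };
    unsigned char *raw = malloc(2 * sizeof t + len);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &t, sizeof t);                      /* header */
    memcpy(raw + sizeof t + len, &t, sizeof t);     /* trailer, maybe unaligned */
    return raw + sizeof t;
}

enum g_status g_free(void *p)
{
    unsigned char *user = p;
    struct tag head, tail;

    memcpy(&head, user - sizeof head, sizeof head);
    if (head.magic == FREED_MAGIC)
        return G_DOUBLE_FREE;
    if (head.magic != LIVE_MAGIC)
        return G_BAD_HEADER;

    memcpy(&tail, user + head.len, sizeof tail);
    if (tail.magic != LIVE_MAGIC || tail.len != head.len)
        return G_BAD_TRAILER;

    /* Mangle both copies so a second free of this pointer is recognizable. */
    head.magic = FREED_MAGIC;
    memcpy(user - sizeof head, &head, sizeof head);
    memcpy(user + head.len, &head, sizeof head);

    if (g_release != NULL)
        g_release(user - sizeof head);
    return G_OK;
}
```

With the default hook, the double-free check depends on the underlying allocator leaving the mangled header intact after free(), which is not guaranteed; it is a cheap heuristic rather than a proof. That trade-off matches the low overhead described above: each free costs a couple of small copies and compares, with no list walk.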