rob@meaddata.com (Robert E. Lancia) (10/24/90)
As the subject says, I'm curious as to why free() does not return a value. (At least on our SPARCstations, SUN 3's and 4's, RT's and Sequent, free() is a void function.) It seems to me that it could pass back some useful information, especially if there was a problem. Was the pointer NULL, or an invalid address?? Was the memory not allocated by malloc() or its siblings??

From K&R 2, page 252:

    "void free(void *p)
     ... p MUST be a pointer to space previously allocated by calloc,
     malloc, or realloc." (Emphasis on MUST by me.)

What happens if p WASN'T allocated as it should've been?? How do we know there was a problem?? Is there really a problem??

Any ideas?
--
|Robert Lancia                  | Mead Data Central
|(513) 297-2560                 | Data Services Division
|rob@pmserv.meaddata.com        | P.O. Box 308
|...!uunet!meaddata!pmserv!rob  | Dayton, Ohio 45401
dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) (10/25/90)
In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes:
>
>As the subject says, I'm curious as to why free() does not return a value.
>
> "void free(void *p)
> ... p MUST be a pointer to space previously allocated by calloc,
> malloc, or realloc." (Emphasis on MUST by me.)
>
>What happens if p WASN'T allocated as it should've been?? How do we know
>there was a problem?? Is there really a problem??
>

There most certainly is a problem: free can do all kinds of damage if it is passed a pointer to (for instance) static memory. Most frees don't check whether the address they are passed makes sense for a free call; most frees have no way of doing this even if they wanted to.

So it is up to the programmer to be sure that the memory to be freed was obtained from malloc or its cousins.
--
Dave Eisen                           Home: (415) 323-9757
dkeisen@Gang-of-Four.Stanford.EDU    Office: (415) 967-5644
1447 N. Shoreline Blvd.
Mountain View, CA 94043
rh@smds.UUCP (Richard Harter) (10/25/90)
In article <1749@meaddata.meaddata.com>, rob@meaddata.com (Robert E. Lancia) writes:
To summarize: Why is free a void function? Why doesn't it return
something useful if the argument is invalid?
The answer is very simple -- the specs for malloc/free were broken in
the very beginning (IMHO, of course) and most implementations are broken
today. It is easier (and faster) to put the allocation control information
in memory adjacent to the allocated block in allocators which use free
lists (bit mapped buddy-system allocators are a different matter.)
IMNSHO this is not nice. It is a source of mystery bugs. An array
overwrite in allocated memory can wipe out control information; the
consequence doesn't hit you until much later. A stale or incorrect
pointer passed to free is allowed to do its damage without warning.
And so on.
The specs more or less explicitly provide that the users of malloc/free
have all of the responsibility for using them without error. The
system is not obliged to do error checking. It is at liberty to do
something indeterminately awful if the programmer makes an error.
All of this in one of the most primitive and least structured aspects
of C. Grumble, grumble, grumble.
--
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb. This sentence short. This signature done.
henry@zoo.toronto.edu (Henry Spencer) (10/25/90)
In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes:
> "void free(void *p)
> ... p MUST be a pointer to space previously allocated by calloc,
> malloc, or realloc." (Emphasis on MUST by me.)
>
>What happens if p WASN'T allocated as it should've been?? How do we know
>there was a problem?? ...

"Bus error - core dumped." Handing free() an improper pointer is a major error, and one that free() is *not* required to catch. It can screw up in any manner it pleases if you do this. There is little point in allowing for a return code just to complain about something you are *never* *ever* supposed to do in the first place.
--
The type syntax for C is essentially   | Henry Spencer at U of Toronto Zoology
unparsable. --Rob Pike                 | henry@zoo.toronto.edu   utzoo!henry
dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (10/26/90)
>>...I'm curious as to why free() does not return a value.
Yes, free() could legitimately return an error code if it detected
something wrong. A debugging version of the memory allocation library
could always have free() return a useful value, and even production
versions could sometimes return an error indication.
Also, a signal handler (*handler)() installed by signal() could
legitimately return a status code for use by the kernel.
Both free() and (*handler)() used to return int at one time. The
availability of the `void' data type, and a certain religious desire to
be "pure", seem to have made both functions now return nothing. It's a
loss.
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP: oliveb!cirrusl!dhesi
martin@mwtech.UUCP (Martin Weitzel) (10/26/90)
In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes:
> [Q: why free() doesn't return something "useful"?]
>
>From K&R 2, page 252:
>
> "void free(void *p)
> ... p MUST be a pointer to space previously allocated by calloc,
> malloc, or realloc." (Emphasis on MUST by me.)
>
>What happens if p WASN'T allocated as it should've been?? How do we know
>there was a problem?? Is there really a problem??

As the ANSI Standard tells us in 4.10.3.2, with a NULL pointer you have a no-op, otherwise UNDEFINED BEHAVIOR. As some others (esp. Chris Torek) never seem to tire of inventing scenarios to illustrate undefined behavior, I'll also try to add one:

Undefined behavior *can* mean that your machine becomes inoperable until you have typed the following sentence 100 times at the console terminal:

    "The argument handed to free must be NULL or a pointer to space
     previously allocated by calloc, malloc, or realloc"

But let's get more serious. K&R 2 contains in section 8.7 a sample implementation of a storage allocator; the code for the free() function is on page 188. As this allocator IMHO served for many years as a model for real implementations, looking at the effects of free()ing some arbitrary pointer with this allocator will serve as a good, realistic example of undefined behavior. (And it will give you enough reason to avoid such behavior, even if someone assures you that "undefined behavior" will neither make your equipment explode nor result in rude mail being sent to your boss ...)

The basic idea of K&R's allocator (the example is essentially the same in K&R I and II) is to keep the free()ed parts of dynamically allocated memory in a linked list. For that purpose each allocation requests some additional bytes, just enough for storing a number and a pointer(%1).
So, after you have allocated, say, 20 bytes, the memory *may* look as follows (lower addresses to the left):

    Bytes:   4  +  4                    20                      4
          +-----+------+-------------------------------------------+
          | XXX |   4  |         20 bytes allocated           |    |
          +-----+------+-------------------------------------------+
           HEADER      ^
                       |
                       pointer returned by malloc, calloc or realloc

(Note that the byte counts above are only given to make the example a bit more realistic - they don't mean that pointers or ints always occupy 4 bytes or are equal in size on all systems!)

In the above example the allocator had "really" reserved 32 bytes. But this "waste" shouldn't be of much interest to the programmer: he or she can count on at least 20 bytes being available above the pointer that was returned.

Immediately below the returned pointer some very important information is stored: the size of the "really" allocated area, measured in units of the size of the header (8 in the example). These four bytes are the *sole* information from which a later free() knows the size of the allocated chunk of memory.

So we should have a look at free() now. Its basic work is to walk through a linked list of previously freed chunks, which are chained by the first field in the header (XXX above). The list is kept in order of increasing addresses (for reasons we'll see in a second). After free() has determined where the new part fits in, it either links it into the list or combines it with the immediately adjacent chunks. (Clearly, this strategy should keep fragmentation low.)

Now let's come to the question of what may happen if you try the following:

    char *greet = "hello, world";
    ....
    free (greet);

(Again, please note that in the following scenario I make some assumptions, e.g. that static data is located together with string constants in the lower parts of memory, below the heap; these assumptions are quite realistic, but there are certainly machines and implementations for which they don't hold.)
If free() follows the K&R implementation, it will first notice that the given address is below the lowest address of the previously free()ed parts of dynamic memory(%2). Now some bytes immediately *below* the address where the character 'h' of the "hello, world" string is stored will be taken as an int which gives the size of the chunk free() is about to link into the list. NOTE THAT THIS IS TOTALLY UNRELATED TO THE LENGTH OF THE "hello, world" STRING! Now free() decides that the bytes which are just being returned to the allocator cannot be combined with something else, so it links them as an element into the list, WHICH CHANGES SOME BYTES AT SOME DISTANCE BELOW THE ADDRESS OF THE CHARACTER 'h'. Besides that, not much damage has happened so far, and there is a fair chance that the changed bytes belong to some static variable (or other string) which is no longer needed before the program ends.

Let's consider the next request to the allocator now (malloc, calloc or realloc). It steps through the list, treating the bytes above the link pointer as size information to decide whether a given chunk may be used to fulfill the request. If we are lucky, these bytes (again: the contents of some totally unrelated static variable, or some other string) say that the chunk is too small, and nothing will happen. With less luck, the space of the "hello, world" string will now be reused - something the programmer who originally free()ed it might have had in mind :-) - but not only that: remember that the "how much space is available" information is IN NO WAY RELATED to the length of the "hello, world" string. If less than the "available" space was requested, some space at the "end" of the chunk under consideration is returned (and of course the size information is properly adjusted).
I think a graphic will help to show what this really means:

    Bytes:   4     4    ---------16----------    4     4       8
          +------------------------------------------------------------+
          | XXX   nnn   hello, world\0         | YYY | mmm | ZZZ       |
          +------------------------------------------------------------+
                        ^                                    ^
                        | pointer handed to free             | malloc(6)

- XXX are the bytes overwritten by the erroneous free();
- nnn are some bytes that may accidentally form the value 5 (if treated as an int) when the request to malloc() starts being processed; nnn will be changed to 3 before malloc() returns;
- mmm are some bytes that will be treated as an int and set to the value 2 before malloc() returns;
- ZZZ are the bytes that will be overwritten if the "space" to which malloc() returns a pointer gets used;
- YYY are some bytes that may be used as a link pointer if the space just allocated is ever returned (It is left as an exercise to the reader (:-)) to realize that this will not happen if the free follows immediately and that also one more malloc() with a specific size could result in changing YYY :-)

I hope everyone who has followed so far will be convinced now that things become more and more unpredictable and that undefined behavior is the last thing one would wish for in a program. At least *I* definitely wouldn't want to debug such a program, as it may be very, very hard to trace back from the point where the error shows up to the point where things started to go wrong(%3).

----------------------------------------------------------------------
%1: Note that a more "space efficient" implementation would need only the space for the size, but not for the pointer, as the latter is only required as long as the chunks are "free". The K&R implementation was chosen to implement the obscure AND UNPORTABLE feature that you can use free()ed space until the next malloc() -- a thread about this topic appeared a short time ago in this group.
%2: Interestingly enough, though K&R II specifies in the appendix about the library that free(NULL) does nothing, they fail to implement this behavior in their "own" function. What can we learn from this? (No, I have no answer; take it more as a fact I observed.)

%3: For debugging techniques which might help you out of such situations, you may want to read Robert Ward's book "Debugging C" or the article "Controlling the Malloc Heap" by Eric White in the CUJ, Volume 7, Number 2 (Feb 1989).
--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
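
The header arithmetic in the walkthrough above can be made concrete with a small sketch. It follows the general shape of the allocator in K&R section 8.7, but the names `units_for` and `block_units` are assumptions made for illustration and are not taken from the book; on most current machines the header is also larger than the 8 bytes used in the example.

```c
#include <assert.h>
#include <stddef.h>

/* Block header in the style of K&R section 8.7: a link to the next free
   block plus the block size, counted in header-sized units. */
typedef union header {
    struct {
        union header *next;   /* next block on the free list */
        size_t        units;  /* size of this block, header included */
    } s;
    long align;               /* assumption: long is strict enough for alignment */
} Header;

/* Units a request for nbytes really consumes: the payload rounded up to
   whole units, plus one unit for the header itself. */
size_t units_for(size_t nbytes)
{
    return (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
}

/* First step of a K&R-style free(): trust whatever sits one header below
   the pointer. Nothing here can tell a real header from arbitrary bytes. */
size_t block_units(const void *p)
{
    assert(p != NULL);   /* hypothetical guard; not part of the book's code */
    const Header *h = (const Header *)p - 1;
    return h->s.units;
}
```

With the 8-byte header of the example, `units_for(20)` works out to 4, the value shown in the size field of the first diagram. If the bytes below the pointer were never written by the allocator, `block_units` simply reports whatever happens to be stored there, which is the role "nnn" plays in the second diagram.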
njk@diku.dk (Niels J|rgen Kruse) (10/26/90)
rh@smds.UUCP (Richard Harter) writes:
> It is easier (and faster) to put the allocation control information
>in memory adjacent to the allocated block in allocators which use free
>lists (bit mapped buddy-system allocators are a different matter.)

>IMNSHO this is not nice. It is a source of mystery bugs. An array
>overwrite in allocated memory can wipe out control information; the
>consequence doesn't hit you until much later.

If there is no adjacent control information, the array overwrite will just scribble over some of the data you have in the next block. This is not a source of mystery bugs? Your program may not dump core, but the output will be slightly corrupted. Ugh.
--
Niels J|rgen Kruse DIKU Graduate njk@diku.dk
rh@smds.UUCP (Richard Harter) (10/27/90)
I wrote:

  It is easier (and faster) to put the allocation control information in memory adjacent to the allocated block in allocators which use free lists (bit mapped buddy-system allocators are a different matter.) IMNSHO this is not nice. It is a source of mystery bugs. An array overwrite in allocated memory can wipe out control information; the consequence doesn't hit you until much later.

Niels J|rgen Kruse comments:

  If there is no adjacent control information, the array overwrite will just scribble over some of the data you have in the next block. This is not a source of mystery bugs? Your program may not dump core, but the output will be slightly corrupted. Ugh.

My reply:

Quite true -- array overwrites are bugs. There are two philosophies about how utilities should respond to incorrect input, i.e. bugs. One approach says that the utility is not obliged to do anything predictable in response to errors because you should never write code with errors in it.

Much as I admire people who can actually write code that is error free, I must admit that I am not one of them -- I make mistakes from time to time. As a result I prefer a more robust approach which says that utilities should help you as much as is feasible when you make mistakes.

The allocator/deallocator which I use goes a fair way towards doing this. Features include:

1) Control information is separated from allocated storage.
2) Each allocation records the origin of the allocation request.
3) All return addresses are checked to make sure that they are valid.
4) Each allocated block is padded with four check bytes. The deallocator tests whether the check bytes have been touched.
5) The control information block stores the time of origin (or equivalent) for each allocation (highly useful for finding storage leaks.)
6) The allocator package prints a map of allocated memory when an error is detected. The map includes address, size, origin, date, and control block ID for each block, and a description of the trapped error. I find this information somewhat more useful than a core dump.

There is a price to be paid for this -- it requires more storage (but not time) than a bare bones allocator. Some, perhaps most, will view this as an unacceptable price. For my own part, I prefer to pay the price because memory allocation bugs can be, in my experience, among the more difficult (expensive in time and effort) bugs to track down.

Incidentally, I don't doubt that there are better allocation packages than the one I use. However, I have been using it for years and I find that it eliminates an entire class of problems.
--
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb. This sentence short. This signature done.
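
Three of the features listed in the post above (control information kept apart from the block, a recorded origin, and check bytes tested on deallocation) can be sketched in a few lines on top of malloc(). This is not the package described in the post, whose source is not shown; `dbg_malloc`, `dbg_free`, the fixed-size table, the guard pattern and the return codes are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BLOCKS 64   /* hypothetical limit on live blocks */
#define GUARD_LEN  4    /* four check bytes after each block */

static const unsigned char GUARD[GUARD_LEN] = {0xDE, 0xAD, 0xBE, 0xEF};

/* Control information lives in this table, not next to the user's data. */
struct record { void *p; size_t size; const char *file; int line; };
static struct record table[MAX_BLOCKS];

void *dbg_malloc(size_t size, const char *file, int line)
{
    assert(file != NULL);
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (table[i].p == NULL) {
            unsigned char *p = malloc(size + GUARD_LEN);
            if (p == NULL)
                return NULL;
            memcpy(p + size, GUARD, GUARD_LEN);   /* plant the check bytes */
            table[i] = (struct record){p, size, file, line};
            return p;
        }
    }
    return NULL;   /* table full */
}

/* Returns 0 if all is well, -1 if p was never handed out (or was already
   freed), -2 if the check bytes behind the block were overwritten. */
int dbg_free(void *p)
{
    if (p == NULL)
        return 0;
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (table[i].p == p) {
            int touched = memcmp((unsigned char *)p + table[i].size,
                                 GUARD, GUARD_LEN) != 0;
            table[i].p = NULL;
            free(p);
            return touched ? -2 : 0;
        }
    }
    return -1;
}
```

A call such as `dbg_malloc(n, __FILE__, __LINE__)` records where the request came from. A fuller version would print the stored file and line when `dbg_free` reports a problem, along the lines of the memory map the post describes.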
rtm@christmas.UUCP (Richard Minner) (10/29/90)
In article <1990Oct25.152057.23024@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes:
>>How do we know there was a problem?? ...
>
>"Bus error - core dumped." Handing free() an improper pointer is a major
>error, and one that free() is *not* required to catch. It can screw up in
>any manner it pleases if you do this. There is little point in allowing
>for a return code just to complain about something you are *never* *ever*
>supposed to do in the first place.

Yes yes yes. It seems to be a common misconception that the standard library ought to do everything *you* want it to do, whatever that happens to be. The standard library is there to provide a foundation upon which you can build whatever it is you need. As such, it should do as little as possible, yet still provide a means of accomplishing a great variety of tasks. That's what makes C so wonderfully small and general purpose (even ANSI C :-). It is extremely difficult to design a language to do more for you without adding more restrictions (or at least more complications). (Of course, C++ is perfect in every way...-)

void free() is a good example. It is not a function in isolation, it is part of a very simple and basic memory allocation package. There are countless ways to manage memory, and it would be silly to try to provide a comprehensive set with the language. It is very smart to provide a simple scheme. That way, if your task is small or simple, the standard library may be all you need. If your task is large or complex, you can add your own layer, with whatever error detection, etc. you want.

I think most large projects would be well advised to design a layer on top of the standard library. Unfortunately, this is often an upfront investment whose return is hard to `prove'.
--
Richard Minner || {uunet,sun,well}!island!rtm (916) 736-1323 ||
|| Island Graphics Corporation Sacramento, CA ||
s64421@zeus.usq.edu.au (house ron) (10/29/90)
rh@smds.UUCP (Richard Harter) writes:
>Niels J|rgen Kruse comments:
>> If there is no adjacent control information, the array overwrite
>> will just scribble over some of the data you have in the next
>> block. This is not a source of mystery bugs? Your program may
>> not dump core, but the output will be slightly corrupted. Ugh.

>Quite true -- array overwrites are bugs. There are two philosophies
>about how utilities should respond to incorrect input, i.e. bugs.
>One approach says that the utility is not obliged to do anything
>predictable in response to errors because you should never write
>code with errors in it.

>Much as I admire people who can actually write code that is error
>free I must admit that I am not one of them -- I make mistakes from
>time to time. As a result I prefer a more robust approach which says
>that utilities should help you as much as is feasible when you make
>mistakes.

Actually, the way a utility can help _me_ when I overwrite memory is by causing the most drastic crash possible a.s.a.p.! Trying to minimise the effects of an error is useful if the error gets past the checking stage, but it makes it _much_ more likely that errors _do_ escape notice. A good ol' segmentation fault tends to alert you to problems!
--
Regards,
Ron House. (s64421@zeus.usq.edu.au)
(By post: Info Tech, U.C.S.Q. Toowoomba. Australia. 4350)
rh@smds.UUCP (Richard Harter) (10/30/90)
In article <1990Oct29.121356.20818@zeus.usq.edu.au>, s64421@zeus.usq.edu.au (house ron) writes:

re my comments on memory allocators

> Actually, the way a utility can help _me_ when I overwrite memory is
> by causing the most drastic crash possible a.s.a.p.! Trying to
> minimise the effects of an error is useful if the error gets past
> the checking stage, but it makes it _much_ more likely that errors
> _do_ escape notice. A good ol' segmentation fault tends to alert you
> to problems!

Well, I agree, sort of. The point is that off-the-shelf malloc/free is not guaranteed to do anything of the sort if you do a memory overwrite. The behaviour is undefined; you may crash now or you may crash later. What is worse is that two calls to free for the same block (in a linked list implementation) are very likely to postpone the problem to a later time. The purpose of checking is to make sure that a.s.a.p. is in fact a.s.a.p. instead of some indeterminate random time in the future.
--
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb. This sentence short. This signature done.
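
One common defensive habit against the double free mentioned in the post above is to clear the pointer as it is freed, so that a second free through the same variable becomes the defined no-op free(NULL). A minimal sketch follows; the macro name `FREE_AND_NULL` and the demo function are assumptions for illustration. It protects only the one pointer variable named, not other copies of the same address, so it narrows the problem without removing it.

```c
#include <assert.h>
#include <stdlib.h>

/* Free the block and clear the pointer variable in one step.
   FREE_AND_NULL is a hypothetical name. The argument is evaluated twice,
   so it must be a plain lvalue without side effects. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Usage sketch: the second call is harmless because buf is already NULL. */
void free_twice_demo(void)
{
    char *buf = malloc(16);
    FREE_AND_NULL(buf);
    assert(buf == NULL);
    FREE_AND_NULL(buf);   /* free(NULL): a no-op */
}
```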
barmar@think.com (Barry Margolin) (10/31/90)
In article <14@christmas.UUCP> rtm@island.uu.net (Richard Minner) writes:
>void free() is a good example. ...
>I think most large projects would be well advised to
>design a layer on top of the standard library.

But if you're going to use the standard library as a foundation for another layer, it has to provide decent services to the upper layer. Void free() doesn't provide such a service.

Many respondents have explained that free() isn't required to detect errors. That's acceptable. The problem with the ANSI specification is that free is *prohibited* from returning an error indication that a portable program can use. It would have been better if it had been defined to return an int, with the specification that the implementation is not required to validate the argument, but if it does so then it may return -1 and set errno if it fails the validation. This would make systems that are able to check during free() (such as the debugging malloc libraries that many systems provide) more useful, as the application can try to recover.
--
Barry Margolin, Thinking Machines Corp.
barmar@think.com
{uunet,harvard}!think!barmar
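
The interface proposed in the post above (return an int, and on failed validation return -1 and set errno) can be sketched as a wrapper around the real free(). Everything named here is hypothetical: `try_free`, the `ptr_validator` callback, and the toy `is_registered` check are illustrations, and `EINVAL` is a POSIX errno value rather than one guaranteed by the C standard.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical validator: returns nonzero if p looks like a live block. */
typedef int (*ptr_validator)(const void *p);

/* Toy validator for illustration: accepts only the one registered block. */
static const void *registered;
static int is_registered(const void *p) { return p == registered; }

/* free() with a status: 0 on success, -1 with errno set to EINVAL if the
   validator rejects the pointer. A rejected pointer is left untouched. */
int try_free(void *p, ptr_validator is_valid)
{
    assert(is_valid != NULL);
    if (p == NULL)
        return 0;          /* same as free(NULL): nothing to do */
    if (!is_valid(p)) {
        errno = EINVAL;
        return -1;
    }
    free(p);
    return 0;
}
```

How good the answer is depends entirely on the validator. A library that cannot check simply accepts everything, which is the "not required to validate" half of the proposal.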
bright@nazgul.UUCP (Walter Bright) (10/31/90)
In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes:
<As the subject says, I'm curious as to why free() does not return a value.
<It seems to me that it could pass back some useful information, especially
<if there was a problem. Was the pointer NULL, or an invalid address??
<Was the memory not allocated by malloc() or it's siblings??
I used to feel that free() should return a value, basically indicating if
malloc's data structures were corrupted or not. After using this for years,
it became clear that the only reasonable thing to do is to abort the
program:
    if (free(p))
        abortprogram("heap is corrupted");
If the data structures are corrupted, it is best to terminate the program
as soon as possible, to avoid doing terrible things like corrupting DOS
or the FAT. Thus, it became apparent that the abort should be moved inside
of free(). That's what the Zortech free function does now: if it detects
a corrupted heap it prints the message:
"Heap is corrupted"
and immediately terminates the program. free() now returns a void.
Occasionally someone wants to regain control after this, so they can
'fail gracefully'. My argument is that if the heap is corrupted, the
program has already failed gracelessly, and it's best to terminate
it before more goes wrong. (The library source is available for
the diehards anyway!)
karl@ima.isc.com (Karl Heuer) (11/05/90)
In article <130@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
>Thus, it became apparent that the abort should be moved inside of free().
>That's what the Zortech free function does now, if it detects a corrupted
>heap it prints the message:
> "Heap is corrupted"
>and immediately terminates the program. free() now returns a void.

Good for you. I hope you die with abort() or equivalent rather than exit() (even if the distinction is minor under DOS); then you can answer the diehards by telling them to trap SIGABRT.

Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
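
Trapping SIGABRT, as suggested in the post above, can be sketched with the standard signal() interface. The names `on_abort`, `install_abort_trap` and `abort_trap_demo` are assumptions for illustration. Note that when the signal comes from abort(), the program still terminates once the handler returns, so the handler is a last chance to report rather than a way to carry on; the demo uses raise() so that control does come back.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t abort_seen = 0;

/* Only async-signal-safe work belongs in a handler; setting a flag is. */
static void on_abort(int sig)
{
    (void)sig;
    abort_seen = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int install_abort_trap(void)
{
    return signal(SIGABRT, on_abort) == SIG_ERR ? -1 : 0;
}

/* Exercise the handler without dying: raise() returns after the handler
   has run, whereas abort() would not return to its caller. */
void abort_trap_demo(void)
{
    int installed = install_abort_trap();
    assert(installed == 0);
    raise(SIGABRT);
    assert(abort_seen == 1);
}
```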
gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)
In article <1749@meaddata.meaddata.com> rob@pmserv.meaddata.com writes: >As the subject says, I'm curious as to why free() does not return a value. Because there's nothing reliable it can tell you. If you misinvoke it, it can't necessarily detect the fact.
gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)
In article <212@smds.UUCP> rh@smds.UUCP (Richard Harter) writes: >The specs more or less explicitly provide that the users of malloc/free >have all of the responsibility for using them without error. The >system is not obliged to do error checking. It is at liberty to do >something indeterminately awful if the programmer makes an error. >All of this in one of the most primitive and least structured aspects >of C. Grumble, grumble, grumble. If you really want thorough arena checking, you have to take a significant performance hit. Since it can be implemented outside the standard library, that's a logical place for it. People whose code doesn't have bugs would be annoyed if all that unnecessary arena checking were forced upon them. You can get a very nice (if I do say so myself) memory-allocation "wrapper" that we call the "Mm package" free from the BRL MUVES project. While Mm is part of a larger system and may be tricky to configure outside the intended compilation environment, it was designed to be useful in other projects and has been used in them. The additional overhead can be turned on to any extent that you feel necessary. It can even generate movie scripts for the "anim" system.
gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)
In article <2604@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes: >Both free() and (*handler)() used to return int at one time. They certainly did not. Before the "void" type was added to the C language, the only way to declare or define a function that didn't return a value was to just omit its return type. Unfortunately, as a programmer convenience the common type "int" was also permitted to be elided in many declarations, including as the return type for a function declaration/definition. Thus, an actually-void function had type indistinguishable from function-returning-int. After "void" was added, we cleaned up this overloading. Old-style actually-void function code for the most part will continue to work, signal handlers being the notable exception. (And even they generally continue to work in nearly all implementations.)
gwyn@smoke.brl.mil (Doug Gwyn) (11/07/90)
In article <1990Oct30.172429.10055@Think.COM> barmar@think.com (Barry Margolin) writes:
>The problem with the ANSI specification is that free is *prohibited* from
>returning an error indication that a portable program can use. It would
>have been better if it had been defined to return an int, with the
>specification that the implementation is not required to validate the
>argument, but if it does so then it may return -1 and set errno if it fails
>the validation.

(a) The C standard is based on existing practice, and existing free() does not return a validation value.
(b) It would be impossible to completely verify that the pointer passed to free() is appropriate.
(c) It would be expensive to partially verify that the pointer passed to free() is appropriate.
(d) It would be silly to write code that relied on this run-time validation if it's not guaranteed to be done anyway.
bright@nazgul.UUCP (Walter Bright) (11/09/90)
In article <5241@ima.ima.isc.com> karl@ima.isc.com (Karl Heuer) writes:
/In article <130@nazgul.UUCP> bright@nazgul.UUCP (Walter Bright) writes:
/>Thus, it became apparent that the abort should be moved inside of free().
/>That's what the Zortech free function does now, if it detects a corrupted
/>heap it prints the message:
/> "Heap is corrupted"
/>and immediately terminates the program. free() now returns a void.
/Good for you. I hope you die with abort() or equivalent rather than exit()
/(even if the distinction is minor under DOS); then you can answer the diehards
/by telling them to trap SIGABRT.

No, the function exits immediately to DOS; it does not pass Go and does not collect $200. My reasoning is that the program *has already crashed* and must be stopped *as soon as possible* to avoid corrupting the FAT or some other critical area.

The library source comes with the Developer's Edition, and if someone wishes to change this behavior, it would not be very difficult. But someone doing this, I think, would need a very good reason for it.