[comp.lang.c] Malloc problems

daveh@marob.MASA.COM (Dave Hammond) (05/06/88)

Help!

I am currently working on an application which requires repeated
allocation of small chunks of memory to dynamically build various structure
lists. It is quite conceivable that the program will attempt to allocate
hundreds (even thousands) of small blocks within a tight loop. The
application the goes thru one pass successfully.

Problem: on subsequent passes (free old, read data, allocate new space)
malloc() core dumps. Traces show that the structure addresses have not
moved since the first pass (i.e. no pointer address screwups) and the
problem does not appear in smaller versions of the application (its currently
running on an Intel 286 machine under SCO Xenix, as a Large model
program - about 100K binary file size).

As a remedy I've written a set of routines which allocate large chunks
of memory on startup and then parcel out portions as necessary. This
seems to work, but should I have to do this?

Is the problem with a)Xenix, b)the malloc() package, c) ??

Thanks for any Help! in advance.

Dave Hammond
DSI Communications
333 W. Merrick Rd.
Valley Stream, NY 11580
E-Mail: ...!marob!daveh

jk@hpfelg.HP.COM (John Kessenich) (05/14/88)

A wild guess:

    Malloc() has two behavoirs that combined might be causing your
    problem.

        1.  When malloc() runs out of memory, it returns NULL.

        2.  Free does not necessarily return memory for malloc's
		immediate re-use.

    If you repeatedly malloc and free, you may actually be using 
    up memory.  This leads to malloc eventually returning NULL,
    which, if you dereference, can cause a core dump.

    Free-ing memory in reverse order it was malloc'ed in may help.


John Kessenich

gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/16/88)

In article <690008@hpfelg.HP.COM> jk@hpfelg.HP.COM (John Kessenich) writes:
>        2.  Free does not necessarily return memory for malloc's
>		immediate re-use.

In any reasonable implementation, it does.

The only wrinkle is that some implementations promise that an
immediately-following realloc() on the freed block will work, but
the freed block IS available immediately for re-use; this is just
a special weird case of re-use.  A malloc() after a free() is
expected to be able to return a pointer to a block containing
some or all of the freed block.

spencer@kazoo.cis.ohio-state.edu (Stephen Spencer) (05/16/88)

In article <690008@hpfelg.HP.COM> jk@hpfelg.HP.COM (John Kessenich) writes:
>
>A wild guess:
>    Malloc() has two behavoirs that combined might be causing your
>    problem.
>        1.  When malloc() runs out of memory, it returns NULL.
>        2.  Free does not necessarily return memory for malloc's
>		immediate re-use.
>    If you repeatedly malloc and free, you may actually be using 
>    up memory.  This leads to malloc eventually returning NULL,
>    which, if you dereference, can cause a core dump.
>    Free-ing memory in reverse order it was malloc'ed in may help.
>John Kessenich

It is true that when malloc() runs out of memory, the call to malloc()
will return NULL.  I would sincerely hope that one's code would be
robust enough to look SOMETHING like this:

     short *pi;

     pi = (short *)malloc(sizeof(short));
     if (pi == NULL) 
     {
	fprintf(stderr,"HELP!!!! Cannot allocate memory for pi.\n");
     }

Good (in my estimation) code would test any and all dynamically allocated
memory for successful completion of the malloc() call.  

I suppose that 'garbage collection' would be too machine-dependent.
It'd sure be a nice feature.



-=-
Stephen Spencer, Graduate Student       |
The Computer Graphics Research Group    | spencer@tut.cis.ohio-state.edu
The Ohio State University (614)292-3416 | 
1224 Kinnear Road, Columbus OH 43212    | ICBM: 39'58" N, 83'03" W

terry@wsccs.UUCP (Every system needs one) (05/17/88)

In article <272@marob.MASA.COM>, daveh@marob.MASA.COM (Dave Hammond) writes:
> Problem: on subsequent passes (free old, read data, allocate new space)
> malloc() core dumps. Traces show that the structure addresses have not
> moved since the first pass (i.e. no pointer address screwups) and the
> problem does not appear in smaller versions of the application (its currently
> running on an Intel 286 machine under SCO Xenix, as a Large model
> program - about 100K binary file size).
> 
> As a remedy I've written a set of routines which allocate large chunks
> of memory on startup and then parcel out portions as necessary. This
> seems to work, but should I have to do this?
> 
> Is the problem with a)Xenix, b)the malloc() package, c) ??

The Library routines malloc() and free() are broken under SCO Xenix 2.2.0
through 2.2.2 (controlled release).  This has been reported, but I have
no idea if/when it will be fixed, as the compiler and libraries belong to
MicroSoft, and there is a rather long backward chain to get problems gone.

My guess is that MicroSoft has already fixed it in the next release --
which hasn't been released yet.  I have no idea of the propagation time
between a MicroSoft fix and an SCO release, either.  I'll ask around.

Any fix, of course, will probably leave Tandy Xenix, which is an OEM'ed
version of MicroSoft, out in the cold, as they [Tandy] are always several
revisions out of date.

-=< THE FIX >=- (trumpet music or strumpet music, whichever)

Your fix is one of the two best available.  The other is to use an Altos
developement system and libraries from a 686, 886, or 1086.  This has the
added advantage that your programs will run on all 80286 based Altos boxes
(with the exception of the 2086, which is DIFFERENT), all of the SCO Xenix
systems (Although serial I/O is somewhat screwed on modem devices under 2.1.3)
and even L/F technologies systems (Although a MicroPort UNIX version runs
better there, L/F boxes WILL load Xenix binaries).  As a bonus, optimization
is somewhat better (a 486/586/986 8086 Altos Xenix compiler compiles our
product small enough that it will run on the XT version of SCO Xenix; the
SCO compiler produces the message "program too large") and can save you if
you're just on the boarder.

Warning:  SCO 2.2.2 (Xenix 386) has the same malloc problem with Altos
generated code, so don't necessarily run out and buy an Altos developement
system... your fix is best, in this case.  For more implementations, see the
C programming Bible (K&R), wherin malloc() and free() are implemented for
you.

| Terry Lambert           UUCP: ...{ decvax, ihnp4 } ...utah-cs!century!terry |
| @ Century Software        OR: ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my companies, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
| 'Admit it!  You're just harrasing me because of the quote in my signature!' |

henry@utzoo.uucp (Henry Spencer) (05/18/88)

> Good (in my estimation) code would test any and all dynamically allocated
> memory for successful completion of the malloc() call.  

Definitely.  A good way to do this for garden-variety programs is to have
a library function emalloc() which calls malloc, checks the result, and
prints a message and exits if NULL.  This is *not* appropriate for use from
library functions or in programs that want to do non-trivial error recovery,
but it is extremely useful in situations where it is known that nothing
useful can be done on running out of memory.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

brian@bucc2.UUCP (05/19/88)

> /* Written  7:32 am  May 16, 1988 by kazoo.cis.ohio-state.edu!spencer */
> It is true that when malloc() runs out of memory, the call to malloc()
> will return NULL. 

	[ example of if (!(p = malloc(n))) die(); deleted ]

> Good (in my estimation) code would test any and all dynamically allocated
> memory for successful completion of the malloc() call.  
> 
> I suppose that 'garbage collection' would be too machine-dependent.
> It'd sure be a nice feature.

  See _A Memory Allocator with Garbage Collection for C_, Michael
Caplinger (Bell Communications Research.) in the Winter 1988 USENIX
conference proceedings...

  This paper gave me the inspiration for an extension to malloc() we
are considering implementing. A short description follows:

  For the reasons outlined in Caplinger's paper, it is often difficult
to free() everything that has gotten *alloc()'d in a large project.
We do not want to implement garbage collection, however, due to
the code overhead and lack of sources. The environment we are running
is Microsoft C 5.0 on IBM AT's and PS/2's under MS-DOSN'T.

  We are considering a solution which is easier to implement, but
is not as clean.

  An additional field will be added to the header at the beginning of
every block on the heap. After a call to malloc is made, a mark() function
can be called to set a code in this field. Later, a function called
free_group() frees all the blocks with a specific code. The functions
would look like this:

/* mark a pointer to a block allocated from the heap. returns ptr on	*/
/* success, or NULL if ptr didn't point into the heap.			*/
char *mark(ptr, code)
  char *ptr;			/* block to mark			*/
  long code;			/* code number to mark with		*/

/* free all blocks on the heap with a specified mark code. Returns a	*/
/* count of the number of blocks free'd.				*/
int free_group(code)
  long code;			/* code number of blocks to free	*/


  An example is a function to read ten lines from a file, allocating
a new buffer for each line. If less than ten lines were found, we wish
to deallocate any lines that were read.

#define MY_MEM_GROUP	17L

char *lines[10];
char buf[512];

int readtenlines(fp)
  FILE *fp;
  {
  register int i;
  char *p;

  for (i = 0; i < 10; i++)
    {
    if (!fgets(buf, 512, fp))
      {
      free_group(MY_MEM_GROUP);
      return(0);
      }

    if (!(p = malloc(strlen(buf)+1)))
      {
      fatal_error("read_ten_lines", "malloc()", "ran out of memory");
      /*NOTREACHED*/
      }

    mark(p, MY_MEM_GROUP);
    strcpy(p, buf);
    lines[i] = p;
    }
  }

  MY_MEM_GROUP is just the arbitrary code number assigned for this
function. After the calling function is done with lines[], it can
call free_group(MY_MEM_GROUP) to get rid of the buffers.

  Another function, for debugging, will report how many objects with
a certain code are on the heap. This can be called at the end of a
module to find out if anything has been left "lying around" on the
heap.


  The code numbers will be 31 bits. The "missing" bit in that number is
used to indicate whether the block is allocated or free. Since we have
many programmers working on each project, the code will probably be
broken down as follows:

struct myheapinfo
  {
  size_t	 size;

  unsigned      inuse :  1;
  unsigned programmer :  6;
  unsigned    project :  8;
  unsigned     gensym :  1;
  unsigned       code : 16;
  }
 
   Each programmer will have a unique code similar to a UID, which will be
stored in the programmer field. This will help in finding a scapegoat when
problems arise... errrr rather in locating the code that isn't freeing
its allocated memory. Since we only have about 15 programmers doing this
type of work, 64 programmer codes per project should be sufficient.

  The project field will be a code indicating which project or library
allocated the block.

  Sixteen bits are reserved for the code itself, allowing for 2^16 
"memory" groups per programmer per project. Additionaly, to help with
coding reentrant functions, a gensym() function will return a unique
code number. To make gensym() codes seperate from codes defined by
the programmer, the gensym flag is provided in the structure.

  Blocks that haven't been mark()'d will have zeroes in all these
fields. Code zero will be reserved for this reason. Thus, it will
support 255 projects/libraries, with 63 programmers per project.
Each programmer will have 65535 codes he can define per project,
and can call gensym() for an additional 65535 codes.


  Any thoughts, comments, suggestions, flames, etc, would be
appreciated.

...............................................................................

  When the going gets weird, the weird turn pro.

  Brian Michael Wendt       UUCP: {cepu,ihnp4,uiucdcs,noao}!bradley!brian
  Bradley University        ARPA: cepu!bradley!brian@seas.ucla.edu
  (309) 677-2230            ICBM: 40 40' N  89 34' W

badri@valhalla.ee.rochester.edu (Badri Lokanathan) (05/20/88)

In article <1988May17.184525.14461@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
> 
> Definitely.  A good way to do this for garden-variety programs is to have
> a library function emalloc() which calls malloc, checks the result, and
> prints a message and exits if NULL.  This is *not* appropriate for use from

I have a set of utility definitions that are macros for doing exactly
these and more (such as swap, min/max) in a way that I think is quite
elegant. I also use macros for basic linked list handling (add_node,
delete_node etc.) by macros to avoid writing the same code again and again.

Say, for instance,
---------------------------- in file globals.c -------------------------
/* Contains globally visible variables. */
char memory_error[] = "Malloc ran out of memory.";
---------------------------- in file utildefs.h -------------------------
extern char *malloc();
extern char memory_error[];
#define MALLOC_N(A,B,N) {						\
	if ((A=(B *) malloc((unsigned) (N)*sizeof(B))) == NULL) {	\
		(void) fputs(memory_error,stderr); exit(-666);		\
	}								\
}
---------------------------- in file your_own.c -------------------------
/* say I have to allocate an array of pointers to structures to bar. */
#include "utildefs.h"
struct foo {
	:
} **bar;
	:
	:
	MALLOC_N(bar,struct foo *,128);
	:
	:

-- 
"It's better to burn out               {) badri@valhalla.ee.rochester.edu
 Than it is to rust-                  //\\ {ames,cmcl2,columbia,cornell,
 But I'll corrode                    ///\\\ garp,harvard,ll-xn,rutgers}!
 Till I turn to dust."                _||_   rochester!ur-valhalla!badri

raeburn@athena.mit.edu (Ken Raeburn) (05/23/88)

In article <13100010@bucc2> brian@bucc2.UUCP writes:
>  For the reasons outlined in Caplinger's paper, it is often difficult
>to free() everything that has gotten *alloc()'d in a large project.
>We do not want to implement garbage collection, however, due to
>the code overhead and lack of sources. The environment we are running
>is Microsoft C 5.0 on IBM AT's and PS/2's under MS-DOSN'T.
>
>  An additional field will be added to the header at the beginning of
>every block on the heap. After a call to malloc is made, a mark() function
>can be called to set a code in this field. Later, a function called
>free_group() frees all the blocks with a specific code. The functions
>would look like this:
>
>/* mark a pointer to a block allocated from the heap. returns ptr on	*/
>/* success, or NULL if ptr didn't point into the heap.			*/
>char *mark(ptr, code)
>  char *ptr;			/* block to mark			*/
>  long code;			/* code number to mark with		*/
>
>/* free all blocks on the heap with a specified mark code. Returns a	*/
>/* count of the number of blocks free'd.				*/
>int free_group(code)
>  long code;			/* code number of blocks to free	*/

How about combining these steps?  Or letting the allocator dynamically
assign the code number?  I think PL/I does something like this.  (Or
was it just Multics PL/I?)  The allocator is part of the language, so
you could have something like:

	declare bar area;
	allocate foo in bar;

Later, everything in bar could be freed.  The area could also be
declared as extensible, so that if you ran out of the space already
allocated, the allocator would go find more, and use it also under the
name of bar.

(I don't remember the details, and I didn't use it much, so I may have
some of this wrong.  There were some sort of inconsistencies between
the pure PL/I versions and some system subroutines available.)

A C/UNIX variant could probably be implemented with an internal design
similar to what you're already doing.

	typedef struct _area {
		struct _area *next;
		/* data relating to allocations */
	} *area;

	area create_area (void);
	void * allocate (size_t size, area a);
	void release (void * ptr);
	void free_area (area a);

>   Each programmer will have a unique code similar to a UID, which will be
>stored in the programmer field. This will help in finding a scapegoat when
>problems arise... errrr rather in locating the code that isn't freeing
>its allocated memory. Since we only have about 15 programmers doing this
>type of work, 64 programmer codes per project should be sufficient.
>
>  The project field will be a code indicating which project or library
>allocated the block.

Umm... I wouldn't be interested in knowing which programmer, except to
go beat him over the head later.  The library, and perhaps the source
file, would be useful information, but how can this be managed on a
large scale?  If I log into your system where some of the installed
libraries use this scheme, or if you're interested in exporting it to
other sites, how can conflicts be avoided?  I'd prefer some way of
avoiding conflicts, or at least detecting them sometime (compile- or
link-time preferably, run-time is permissible).

>  Sixteen bits are reserved for the code itself, allowing for 2^16 
>"memory" groups per programmer per project. Additionaly, to help with
>coding reentrant functions, a gensym() function will return a unique
>code number. To make gensym() codes seperate from codes defined by
>the programmer, the gensym flag is provided in the structure.

Now that's sounding better.  Can the gensym function be used
throughout?  (I hope you're not really calling it "gensym" -- it would
seem to be too generic a name.  I've used it in several different
programs for various things -- not libraries, but programs where name
conflicts wouldn't result.)

henry@utzoo.uucp (Henry Spencer) (05/26/88)

There is no need to mess with the insides of malloc for this.  It is
really pretty easy to implement a scheme in which you get an "area" by
calling newarea() and then get data "within" the area by calling, say,
areaalloc() with the area and the size you want.  The area is in fact
a data structure which keeps pointers to everything you allocate through
it.  areaalloc() just invokes malloc() and adds a pointer to the result
to the area.  You can then deallocate everything in the area by calling
a function that just goes through the area calling free() on everything.
Note that the "area" doesn't actually correspond to a contiguous block
of storage, despite its name.

You can add arbitrary bookkeeping to the area functions easily.  The
major advantage is this stuff can be fully portable, which the insides
of malloc (and hence, changes to the insides of malloc) are *NOT*.

bill@proxftl.UUCP (T. William Wells) (05/29/88)

In article <1313@valhalla.ee.rochester.edu>, badri@valhalla.ee.rochester.edu (Badri Lokanathan) writes:
> ---------------------------- in file utildefs.h -------------------------
> extern char *malloc();
> extern char memory_error[];
> #define MALLOC_N(A,B,N) {                                             \
>       if ((A=(B *) malloc((unsigned) (N)*sizeof(B))) == NULL) {       \
>               (void) fputs(memory_error,stderr); exit(-666);          \
>       }                                                               \
> }

Try, instead, this:

#include <stdlib.h>
extern void *Malloc_tmp;        /* Define this with malloc_fail. */
extern void *malloc_fail(size_t N);
#define MALLOC(B,N) ((Malloc_tmp = malloc((size_t)(N) * sizeof(B)))      \
				? (B *)Malloc_tmp                        \
				: (B *)malloc_fail((size_t)(N) * sizeof(B)))


If you do not have an ANSI-ish C compiler:

#define size_t unsigned
extern void *malloc();
extern void *Malloc_tmp;        /* Define this with malloc_fail. */
extern void *malloc_fail(size_t N);
#define MALLOC(B,N) ((Malloc_tmp = malloc((size_t)((N) * sizeof(B))))    \
				? (B *)Malloc_tmp                        \
				: (B *)malloc_fail((size_t)((N) * sizeof(B))))

If your compiler does not have void *, use char *.

This has several advantages.

  o  The generated code is smaller.
  o  All details of the error handling are hidden.
  o  It can be used in an expression.
  o  Malloc_fail could do other things, like purge memory and retry the
     allocation.

peter@ficc.UUCP (Peter da Silva) (06/01/88)

This "memory group" thing sounds like "AllocRemember", a routine that
comes as part of the Amiga Intuition library:

struct RememberKey *Memory;

	Memory = 0;

	...

	space = AllocRemember(&Memory, size, flags);

	...

	morespace = AllocRemember(&Memory, size, flags);


	...

	FreeRemember(&Memory, TRUE);

This has the result of allocating the memory on a linked list. You get
the effect of the memory spaces for free. When you call FreeRemember
it trashes all the memory linked on that particular RememberKey. 

Implementation? Easy:

struct memkey {
	struct memkey *nextkey;
	char data[0];
};

mallocremember(key, size)
struct memkey **key;
int size;
{
	struct memkey *ptr;

	ptr = malloc(sizeof(struct memkey *)+size);
	ptr->nextkey = *key;
	*key = ptr;
	return ptr->data;
}

freeremember(key)
struct memkey **key;
{
	while(*key) {
		ptr = (*key)->nextkey;
		free(*key);
		*key = ptr;
	}
}
-- 
-- Peter da Silva, Ferranti International Controls Corporation.
-- Phone: 713-274-5180. Remote UUCP: uunet!nuchat!sugar!peter.