XBR2D96D@DDATHD21.BITNET (Knobi der Rechnerschrat) (10/21/89)
Hello,

I posted this a month ago, but I don't think I've seen an answer to
the following problem:

To get the best performance in memory allocation I want to use the
malloc(3X) routines by using '-lmalloc' at link time. Now it seems that
there is something wrong with it, because 'free' doesn't seem to work
when using '-lmalloc'. The code fragment attached to this mail allocates
and deallocates some amount of memory in an endless loop. Using the
libraries

  -lgl_s -lbsd -lfastm -lm -lc_s            #(make mem)

everything is fine. Using instead the libraries

  -lgl_s -lmalloc -lbsd -lfastm -lm -lc_s   #(make meml)

the process's memory size keeps increasing, and on an 8MB GT the system
is saturated with really heavy paging after about 50 steps.

What I really would like to know is:

a) Is it my fault (using wrong library order?)?
b) Is it a bug?
   - known?
   - fixed when?
c) Is there really a performance gain when using -lmalloc (supposing it
   works properly)?

Any comments are welcome and appreciated.

Regards
Martin Knoblauch

TH-Darmstadt
Physical Chemistry 1
Petersenstrasse 20
D-6100 Darmstadt, FRG

BITNET: <XBR2D96D@DDATHD21>

-------------------------makefile-----------------------------------------
#
# make - directives
#
CFLAGS  = -g -I/usr/include/bsd
#
# Library Selection
#
LIBRL   = -lgl_s -lmalloc -lbsd -lfastm -lm -lc_s
LIBR    = -lgl_s -lbsd -lfastm -lm -lc_s
#
#
#
mem:	mem.c
	cc mem.c $(CFLAGS) -o mem $(LIBR)
#
meml:	mem.c
	cc mem.c $(CFLAGS) -o meml $(LIBRL)
#
-------------------------mem.c--------------------------------------------
/*
**      MOLCAD Version 4.1
**
**      COPYRIGHT AND ALL OTHER RIGHTS RESERVED
**
**      Contact:
**
**      Prof. Dr. J. Brickmann
**      c/o TH - Darmstadt
**      Dept. for Physical Chemistry
**      Petersenstr. 20
**      D-6100 Darmstadt, FRG
**
**      BITNET  : <XBR2D96D@DDATHD21.BITNET>
**
**
**      file    : mem.c
**      author  : Martin Knoblauch + Michael Teschner
**      date    :
**      purpose : memory allocation test
**      comment :
**
**
**
*/
#include <stdio.h>
#include <malloc.h>

int acount,dcount;

struct Dot {
  struct Dot *next;
  float arr[4];
};

extern struct Dot *mk_Newdot();

main()
{
  struct Dot *first,*dot;
  int i,j,count;

  first = NULL;

  for(j=0;j<1000;j++){
    mk_Deldots(first);
    dot = first = NULL;
    count = 0;
    for(i=0;i<5000;i++){
      dot = mk_Newdot(dot);
      count++;
      if( first == NULL ) first = dot;
    }
    printf(" loop %d count %d \n",j,count);
  }
} /* end main */

struct Dot *mk_Newdot(prev)
struct Dot *prev;
{
  struct Dot *help;

  if ((help = (struct Dot *)malloc(sizeof(struct Dot))) == NULL)
    return(NULL);
  if (prev != NULL)
    prev->next = help;
  help->next = NULL;
  return(help);
}

mk_Deldots(start)
struct Dot *start;
{
  struct Dot *help;

  while (start != NULL) {
    help = start->next;
    free(start);
    start = help;
  }
}
--------------------------------------------------------------------------
moraes@CS.TORONTO.EDU (Mark Moraes) (10/22/89)
In comp.sys.sgi you write:

>a) Is it my fault (using wrong library order?)?

Nope. Even if you remove all libraries except for -lmalloc, it still
grows. You can see the problem over only a couple of iterations by
printing the value of sbrk(0) after every loop. The break will grow
steadily when using -lmalloc or amalloc/afree from -lmpc. With libc
malloc, the BSD4.3 malloc or any other working malloc, the value stays
constant after the first iteration.

>b) Is it a bug?
>   - known?
>   - fixed when?

Looks like a bug. Not fixed in Irix 3.2, it seems.

>c) Is there really a performance gain when using -lmalloc (supposed it
>   works properly)?

Not likely if it doesn't free...

The standard libc malloc is about the speed of the "fast" BSD4.3
(Caltech) malloc for your example code (which is straight allocation
followed by free -- not very demanding on most mallocs). But the
Caltech malloc typically wastes twice as much memory, which can cause
more paging if you use a lot of memory. (If free() doesn't work in
-lmalloc, it isn't very useful, no matter how fast it is -- on our
Power Iris, it takes about twice as long as the libc malloc...)

The libmpc amalloc and afree show the same behaviour as -lmalloc if
you modify your code to acreate an arena first, and add a grow
function.

Stay with the libc malloc unless profiling your application indicates
that malloc is a bottleneck. At that point, consider custom allocation
strategies for the most frequent uses of malloc (like preallocating
and managing memory pools of frequently used objects, using pages of
memory where only the page is freed, using stack allocators with
mark/release, etc.).
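The sbrk(0) check described above can be sketched roughly as follows. This is only an illustration, assuming a Unix-like system where sbrk() is available and where malloc obtains small blocks by moving the program break; the helper names build_and_free and break_growth are made up for this sketch and are not part of the original mem.c.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare sbrk() in <unistd.h> */
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

struct Dot {
    struct Dot *next;
    float arr[4];
};

/* Build a singly linked list of up to n nodes, then free every node.
   Returns the number of nodes that were actually allocated. */
static int build_and_free(int n)
{
    struct Dot *first = NULL, *last = NULL, *node;
    int made = 0;

    while (made < n && (node = malloc(sizeof *node)) != NULL) {
        node->next = NULL;
        if (last != NULL)
            last->next = node;
        else
            first = node;
        last = node;
        made++;
    }
    while (first != NULL) {
        node = first->next;
        free(first);
        first = node;
    }
    return made;
}

/* Hypothetical helper: run `loops` allocate/free cycles and return how
   many bytes the program break moved between the end of the first
   cycle and the end of the last one. */
static long break_growth(int loops, int n)
{
    char *after_first = NULL;
    int j;

    assert(loops >= 1);
    for (j = 0; j < loops; j++) {
        build_and_free(n);
        if (j == 0)
            after_first = sbrk(0);   /* break once the heap has warmed up */
    }
    return (long)((char *)sbrk(0) - after_first);
}
```

Calling break_growth(10, 5000) and printing the result gives one number to compare between the two link lines. A malloc that reuses freed blocks should report a value far below loops * n * sizeof(struct Dot), while the behaviour reported in this thread would show up as growth on that order.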
madd@world.std.com (jim frost) (10/24/89)
About SGIs and memory leakage:

Something to remember is that the SGI graphical object library has
memory leaks. This is a random fact that I ran into which I thought
some people might be interested in.

In article <89Oct21.211825edt.3287@neat.cs.toronto.edu> moraes@CS.TORONTO.EDU (Mark Moraes) writes:
|Stay with the libc malloc unless profiling your application indicates
|that malloc is a bottleneck. At that point, consider custom allocation
|strategies for the most frequent uses of malloc. (like preallocating
|and managing memory pools of frequently used objects [...] )

The libc malloc slows considerably when dealing with many small object
allocations and deallocations (typically a few hundred thousand if I
remember right) where the BSD malloc degrades "reasonably"; pooled
allocations will improve performance dramatically if you are doing
this type of allocation on the SGI.

The libmalloc malloc, even if broken, is good to run some tests with
because it smashes the malloc'ed area; we found many bugs because of
this behavior (and even more when running on a machine which
disallowed null pointer dereferencing :-).

jim frost
software tool & die
madd@std.com
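One way to picture the pooled allocation mentioned above is a fixed-size pool with a free list. The sketch below assumes a single slab of Dot-sized slots that never grows; the struct DotPool type and the pool_* functions are hypothetical names chosen for illustration, not anything from SGI's libraries or from the posters' code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct Dot {
    struct Dot *next;
    float arr[4];
};

/* Hypothetical pool: one malloc'ed slab carved into Dot-sized slots. */
struct DotPool {
    struct Dot *slab;       /* the single block obtained from malloc */
    struct Dot *free_list;  /* unused slots, chained through ->next */
};

/* Allocate the slab and thread every slot onto the free list.
   Returns 0 on success, -1 if malloc fails. */
static int pool_init(struct DotPool *p, size_t nslots)
{
    size_t i;

    assert(nslots > 0);
    p->slab = malloc(nslots * sizeof *p->slab);
    if (p->slab == NULL)
        return -1;
    for (i = 0; i + 1 < nslots; i++)
        p->slab[i].next = &p->slab[i + 1];
    p->slab[nslots - 1].next = NULL;
    p->free_list = p->slab;
    return 0;
}

/* Take one slot from the pool, or NULL when the pool is exhausted. */
static struct Dot *pool_get(struct DotPool *p)
{
    struct Dot *d = p->free_list;

    if (d != NULL) {
        p->free_list = d->next;
        d->next = NULL;
    }
    return d;
}

/* Return a slot to the pool; no call to free() is made here. */
static void pool_put(struct DotPool *p, struct Dot *d)
{
    d->next = p->free_list;
    p->free_list = d;
}

/* Release the whole slab with a single free(). */
static void pool_destroy(struct DotPool *p)
{
    free(p->slab);
    p->slab = NULL;
    p->free_list = NULL;
}
```

In a loop like the one in mem.c, pool_get and pool_put would stand in for malloc and free, so the system allocator is called once per pool rather than once per node. A real pool would probably need to grow by adding slabs, which this sketch leaves out.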
moraes@CSRI.TORONTO.EDU (Mark Moraes) (10/24/89)
| The libc malloc slows considerably when dealing with many small object
| allocations and deallocations (typically a few hundred thousand if I
| remember right) where the BSD malloc degrades "reasonably"; pooled
| allocations will improve performance dramatically if you are doing
| this type of allocation on the SGI.

True. I should also amend my earlier comment: For large numbers of
small allocations, -lmalloc does indeed perform much faster than even
the BSD4.3 malloc (and appears to free stuff correctly). For the
specific case posted, it does not free correctly, and runs much
slower.

There are similar cases where the BSD malloc, while not losing
performance in terms of CPU, will gobble up memory and cause paging
activity. (e.g. allocate 1000 elements of 50 bytes each, free them
all, then allocate a single element of 2000 bytes and watch it sbrk
again, unnecessarily)

| The libmalloc malloc, even if broken, is good to run some tests with
| because it smashes the malloc'ed area; we found many bugs because of
| this behavior (and even more when running on a machine which
| disallowed null pointer dereferencing :-).

Huh? In Irix3.2, it doesn't necessarily smash the contents of freed
blocks (I assume you mean smash the malloc'ed area on free -- I'd be
rather displeased with a malloc that trashed the contents of the
malloc'ed blocks:-) The following program prints hello world twice
even when compiled with -lmalloc. Change that to "hello
xxxxxxxxxxxxxxxxxxxxxxxxxxxx world" and it will then smash the freed
block.

Smashing the contents of a freed block (among other things) is
desirable behaviour in a debugging malloc -- it degrades performance
enough that you probably don't want it in your final code.

-lmalloc won't work with the old kludge where you were allowed to rely
on a freed block being undamaged till the next malloc. -lmalloc also
returns NULL on malloc(0), following the SVID. Both are good for
people who care about portability.
#include <stdio.h>

#define HELLO "hello world\n"

main()
{
	extern char *malloc();
	char *cp = malloc(sizeof(HELLO));

	strcpy(cp, HELLO);
	fputs(cp, stdout);
	free(cp);
	fputs(cp, stdout);
	exit(0);
}
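The idea of a debugging malloc that smashes a block when it is freed can be sketched as a thin wrapper around the system allocator. Everything here is an assumption made for illustration: the dbg_malloc, dbg_poison and dbg_free names, the size header in front of each block, and the 0xA5 fill pattern. It is not how -lmalloc is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POISON_BYTE 0xA5   /* arbitrary fill pattern (assumption) */

/* Header stored in front of each block so the wrapper knows how many
   bytes to overwrite. The extra union members only pad the header so
   the user area behind it stays suitably aligned. */
union dbg_header {
    size_t size;
    long double pad_ld;
    long long pad_ll;
    void *pad_p;
};

/* Allocate `size` usable bytes plus the hidden header. */
static void *dbg_malloc(size_t size)
{
    union dbg_header *h;

    if (size > (size_t)-1 - sizeof *h)
        return NULL;                 /* header + size would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                    /* user area starts after the header */
}

/* Overwrite the user area of a live block and return its size. The
   volatile pointer keeps an optimizing compiler from discarding the
   stores as dead writes when a free() follows. */
static size_t dbg_poison(void *ptr)
{
    union dbg_header *h = (union dbg_header *)ptr - 1;
    volatile unsigned char *bytes = ptr;
    size_t i;

    assert(ptr != NULL);
    for (i = 0; i < h->size; i++)
        bytes[i] = POISON_BYTE;
    return h->size;
}

/* Smash the block's contents, then hand it back to the allocator. */
static void dbg_free(void *ptr)
{
    if (ptr == NULL)
        return;
    dbg_poison(ptr);
    free((union dbg_header *)ptr - 1);
}
```

With wrappers like these, code that keeps using a block after dbg_free tends to see the fill pattern rather than its old data, at least until the allocator reuses the block. That is the kind of bug the test program above is probing for. The extra header and the fill loop cost memory and time, which matches the remark that this belongs in debugging builds.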
pj@fjord.sgi.com (10/24/89)
In article <8910210901.aa24434@SMOKE.BRL.MIL> XBR2D96D@DDATHD21.BITNET (Knobi der Rechnerschrat) writes:
> I want to use the malloc(3X)
> Now it seems that there is something
> wrong with it, because 'free' doesn't seem to work when using '-lmalloc'.

Yes, the 3.1 releases of IRIX did have a memory leak with a lot of
small (under 28 bytes ??) allocs/frees. This was fixed in release 3.2.

We believe that libmalloc will provide substantially better CPU
performance and perhaps less memory fragmentation. Considerable work
was done on libmalloc for 3.2, and we recommend its use for examples
like yours.

I assume that you have profiled your application, so that you already
know that optimizing malloc is important to your performance. And I
assume that you have noticed the behavioural differences between libc
malloc and libmalloc, primarily that you should not dereference a
pointer after freeing it when using libmalloc.

Thanks, take care ...

Paul Jackson (pj@sgi.com), x1373
zombie@voodoo.UUCP (Mike York) (10/26/89)
In article <89Oct21.211825edt.3287@neat.cs.toronto.edu> moraes@CS.TORONTO.EDU (Mark Moraes) writes:
>Nope. Even if you remove all libraries except for -lmalloc, it still
>grows. You can see the problem over only a couple of iterations by
>printing the value of sbrk(0) after every loop. The break will grow
>steadily when using -lmalloc or amalloc/afree from -lmpc. With libc
>malloc, the BSD4.3 malloc or any other working malloc, the value stays
>constant after the first iteration.
>
>>b) Is it a bug?
>>   - known?
>>   - fixed when?
>
>Looks like a bug. Not fixed in Irix 3.2, it seems.

Actually, it seems that it IS fixed in 3.2: I'm running it right now
on a 4D/70GT with 8MB and Irix 3.2 -- no problems. After the first
iteration, the value of sbrk(0) remained constant.

On a 4D/70G with 8MB running Irix 3.1, the value of sbrk does indeed
grow, and after 60 iterations, it REALLY slows down. However, by
inserting mallopt(M_MXFAST, 0) in mem.c before the main loop, the
program works as desired under 3.1.

--
Mike York
Boeing Computer Services, Renton, Washington
(206) 234-7724
uw-beaver!ssc-vax!voodoo!zombie
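The mallopt workaround can be sketched as below, assuming a system whose <malloc.h> declares mallopt() and M_MXFAST (SVID-style malloc libraries and glibc both do). The helper names are invented for this sketch. The return value of mallopt() is deliberately not interpreted, because its meaning is not the same on every implementation.

```c
#include <assert.h>
#include <malloc.h>   /* mallopt(), M_MXFAST */
#include <stdlib.h>

struct Dot {
    struct Dot *next;
    float arr[4];
};

/* Hypothetical helper: ask the allocator to stop using its special
   fast path for small blocks. On SVID-style mallocs this has to be
   called before the first small block is allocated. */
static void disable_small_block_fast_path(void)
{
    (void)mallopt(M_MXFAST, 0);
}

/* One allocate/free cycle in the style of mem.c: chain up to n nodes
   together, free them all, and return how many were allocated. */
static int alloc_free_cycle(int n)
{
    struct Dot *head = NULL, *node;
    int made = 0;

    assert(n >= 0);
    while (made < n && (node = malloc(sizeof *node)) != NULL) {
        node->next = head;    /* push onto the front of the list */
        head = node;
        made++;
    }
    while (head != NULL) {
        node = head->next;
        free(head);
        head = node;
    }
    return made;
}
```

In mem.c this would amount to calling disable_small_block_fast_path() as the first statement of main, before the outer loop. Whether it changes anything on a given system depends on that system's malloc; the effect reported in this thread is specific to -lmalloc under Irix 3.1.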