john@stat.tamu.edu (John S. Price) (11/17/90)
Archive-name: smalloc/30-Oct-90 Original-posting-by: john@stat.tamu.edu (John S. Price) Original-subject: Yet Another Lpmud Malloc Archive-site: worf.tamu.edu [128.194.28.20] Archive-directory: /pub Reposted-by: emv@ox.com (Edward Vielmetti) At Darker Realms (worf.tamu.edu 2000, 128.194.28.20 2000) [plug plug] we have for some time been successfully and happily running our own customized version of malloc, and it dawns on us that there may be other people out their plugging away with Lars' malloc, system malloc, or, heaven forbid, gnu's malloc. You might consider giving ours a try. We find it to be faster and not particularly less memory-efficient than Lars'. If you anonymous ftp to worf.tamu.edu, the file is smalloc.c in pub directory. If you can't anonymous ftp, get on the mud and send me, Satoria, mail. Download the README file from the same directory as above for installation instructions. At the end of this posting is the page returned from dump_malloc_data at the time I wrote this text (i.e., what you see when you type "malloc".) Here's the gory details: MEMORY USAGE: All blocks of memory, in use or free, use exactly one word of overhead (4 bytes). There are a few static lists, but no other dynamic allocations. However, see memory wastage below. TIME USAGE: We have not run detailed timing studies (gprof on lpmud? blech!), however on test data (who knows how representative it was) on a Sun 3/60, I found my malloc to be about 3.5 times faster than system, and 5 times faster than Lars, *however*, the Lars figure is in error because no garbage collection was ever performed, as I had no clue how often it should occur. The order of magnitude measures are: mallocing a block <= 32 bytes, O(1) [and fast], mallocing a block >32 bytes: O(n) n=number of free blocks, freeing any block O(1), no garbage collection. MEMORY WASTAGE: Any block <=32 bytes in size when freed is not coalesced with adjacent blocks, but remains wasted until another request of the same size occurs. This is acceptable behavior for lpmud. (This optimization is because of lpmud's heavy use of malloc to create local temporaries, and small blocks account for I believe 60-80% of malloc requests although only about 30% of active storage.) Typical memory wastage for about 6-8M total space malloced by lpmud is 100K-150K in large block free lists and 50-80K in unused small blocks. IMPLEMENTATION: All blocks returned are multiples of 4 bytes and are word aligned. There is a separate free list for each size of block <=32 bytes. Large blocks are handled with a tag-boundary scheme, allowing coalescing of adjacent free blocks to be performed by free in short constant time; no garbage collection is required. The tags are compressed into one word, which contains the size of the current block, a bit marking whether the block is a segment (in use) or a hole (free), and a bit indicating the same for the previous block. Extra information required for holes is stored in the unused data space in the hole. SAMPLE OUTPUT OF STATISTICS: (note some wasted overhead in chunks is not accounted for, so some stats don't quite add up.) --------------------------------------------------------------------- > malloc sbrk requests: 260 7586560 (a) large blocks: 25923 7475664 (b) large free blocks: 406 110892 (c) small chunks: 191 3130108 (d) small blocks: 170002 3048668 (e) small free blocks: 5396 74028 (f) unused from current chunk 5140 (g) Small blocks are stored in small chunks, which are allocated as large blocks. Therefore, the total large blocks allocated (b) plus the large free blocks (c) should equal total storage from sbrk (a). Similarly, (e) + (f) + (g) equals (d). The total amount of storage wasted is (c) + (f) + (g); the amount allocated is (b) - (f) - (g). > -------------------------------------------------------------------------- John Price | It infuriates me to be wrong john@stat.tamu.edu | when I know I'm right.... --------------------------------------------------------------------------