djones@megatest.UUCP (Dave Jones) (11/03/88)
From article <1709@garth.UUCP>, by smryan@garth.UUCP (Steven Ryan): ...[I missed the article to which Mr. Ryan is responding.] >> Well, gee, how do you allocate an indefinite sized array? > > Glad you asked. I stuff it down a stack, each element individually > handcrafted with malloc, until I've reached the end and know exactly > how big it is. Then, IF I need random access, I copy it to an array. > Linear time. A method I have used quite successfully for some time now is to use realloc to increase the size of a buffer by some *percentage* of its current size. How big a percentage depends on how much memory you can afford to be "wasted". Usually doubling the size of the array works just fine. The array will occasionally be twice as big as it need be. But the overhead is not as much as it might seem at first. Notice that in the stacking scheme described above, there is an overhead for links and heap-structure headers. On the machine I mostly use (Sun-3), that would increase memory usage for character arrays by a factor of ten! And with that particular libc.a, malloc overhead can be devastating in the speed column. (I would not use malloc "raw" for linked lists, but that's another issue.) The algorithm is linear in time and space. Here's a sample of an example. (I'm a poet, and didn't know it.) From my personal library... /* Vbuf.h */ #ifndef _VBUF_H_ typedef struct Vbuf { /* Public */ unsigned char* buf; /* current buffer. Read-only. */ int dot; /* byte-displacement of next byte ** to be put or bss'd. Read-write. */ /* private */ int bufsize; /* size, in bytes, of current buffer */ } Vbuf; Vbuf* Vbuf_init(/* Vbuf* */); void Vbuf_clean(/* Vbuf* */); /* These return the old dot. */ int Vbuf_put_uchar(/* Vbuf*, unsigned */); int Vbuf_put_ushort(/* Vbuf*, unsigned */); int Vbuf_put_ulong(/* Vbuf*, unsigned */); int Vbuf_put_block( /* Vbuf*, unsigned char*, int */) int Vbuf_bss_uchar(/* Vbuf* */); int Vbuf_bss_ushort(/* Vbuf* */); int Vbuf_bss_ulong(/* Vbuf* */); int Vbuf_bss(/* Vbuf*, int */); /* These return the new dot */ int Vbuf_align_short(/* Vbuf*/); int Vbuf_align_long(/* Vbuf */); /* Returns value from system-call write() */ int Vbuf_write(/* Vbuf*, int */); /* int is file-descriptor */ #define _VBUF_H_ #endif _VBUF_H_ You can probably guess what the code has to look like. Here's a fragment... #include "Vbuf.h" #define INITIAL_SIZE 16 static void expand(); Vbuf_put_uchar(obj, val) register Vbuf* obj; unsigned val; { while(obj->dot >= obj->bufsize) expand(obj); *(obj->buf + (obj->dot)++) = (unsigned char) val; return obj->dot - 1; } static void expand(obj) register Vbuf* obj; { if(obj->bufsize == 0 ) { /* first allocation */ obj->buf = (unsigned char*)smalloc(INITIAL_SIZE); obj->bufsize = INITIAL_SIZE; } else { obj->bufsize *= 2; obj->buf = (unsigned char*)realloc(obj->buf, obj->bufsize); } }
smryan@garth.UUCP (Steven Ryan) (11/04/88)
The problem is that allocating by some multiple badly approximates the required size after a number of iterations. Getting back to the original point: Make sure the basic algorithm is good. (1) Can you never have the case where the next realloc will require more memory than available even though the actual size needed will fit? (2) Can you avoid storage fragmentation so that only the current allocation exists and not all previous incarnations? -- -- s m ryan +--------------------------+-----------------------------------------------+ |Canada -- where American | The world will have peace. The question is | |subsidies are unfair. | whether people will be there to enjoy it. | +--------------------------+-----------------------------------------------+
daveb@geaclib.UUCP (David Collier-Brown) (11/06/88)
From article <1709@garth.UUCP>, by smryan@garth.UUCP (Steven Ryan): >> Well, gee, how do you allocate an indefinite sized array? > > Glad you asked. I stuff it down a stack, each element individually > handcrafted with malloc, until I've reached the end and know exactly > how big it is. Then, IF I need random access, I copy it to an array. > Linear time. A variant on that is to write a recursive function which declares a fixed-size buffer, fills that and then recurses. On the way "up" it mallocs a single correct-size buffer and copues each chunk into it. This mitigates some of the problems mentioned by djones @ megatest.UUCP (Dave Jones). I posted a getline function some months ago that did just that. Needless to say, this works best when 1) the variation of the maximum size is only an order of magnitude or so, and 2) you start with a largish size for the declared buffer. --dave c-b -- David Collier-Brown. | yunexus!lethe!dave Interleaf Canada Inc. | 1550 Enterprise Rd. | HE's so smart he's dumb. Mississauga, Ontario | --Joyce C-B