[comp.lang.c] Efficient dynamic storage question

ajayshah@alhena.usc.edu (Ajay Shah) (04/21/91)

I have a program in which a matrix is allocated dynamically.
Every time I see an observation (which is a row in the matrix), I
realloc the matrix (which is a float **) to contain one more row
(one more float *).  Then I malloc the row and fill in its
elements.

These realloc calls are not interspersed with any *other* dynamic
memory activity.  I start building this matrix and
single-mindedly read observations till it takes its final
shape.  So the program profile is 
realloc-malloc-realloc-malloc-realloc-malloc until all
observations are read in.

Question: is it much more efficient to realloc in groups of (say)
100 observations at a time instead of calling realloc once for each
observation?  Realloc might well have to copy a lot of elements in the
process of doing this, right?  This could be expensive.
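
In outline, the loop looks roughly like this (a sketch only; the
names matrix, nrows and add_observation are illustrative, not the
actual ones in the program):

    #include <stdlib.h>

    static float **matrix = NULL;   /* one float * per observation row */
    static size_t nrows = 0;

    /* called once per observation of ncols values */
    int add_observation(const float *obs, size_t ncols)
    {
        float **tmp;
        size_t j;

        /* grow the row-pointer array by exactly one slot */
        tmp = realloc(matrix, (nrows + 1) * sizeof *matrix);
        if (tmp == NULL)
            return -1;              /* old block is still valid on failure */
        matrix = tmp;

        /* allocate and fill the new row */
        matrix[nrows] = malloc(ncols * sizeof **matrix);
        if (matrix[nrows] == NULL)
            return -1;
        for (j = 0; j < ncols; j++)
            matrix[nrows][j] = obs[j];
        nrows++;
        return 0;
    }

(realloc(NULL, n) behaves like malloc(n) in ANSI C, so the very first
call works too.)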

-- 
_______________________________________________________________________________
Ajay Shah, (213)734-3930, ajayshah@usc.edu
                             The more things change, the more they stay insane.
_______________________________________________________________________________

barmar@think.com (Barry Margolin) (04/21/91)

In article <32121@usc> ajayshah@alhena.usc.edu (Ajay Shah) writes:
>  So the program profile is 
>realloc-malloc-realloc-malloc-realloc-malloc until all
>observations are read in.
>
>Question: is it much more efficient to realloc in groups of (say)
>100 observations at a time instead of calling realloc once for each
>observation?  Realloc might well have to copy a lot of elements in the
>process of doing this, right?  This could be expensive.

Yes, it's generally better to minimize the reallocs.  Consider an
implementation of malloc that allocates a new object right where the
previous object ended (perhaps with some header information interspersed).
The reallocs would have to copy because the previous malloc would prevent
the realloc from simply extending the array.

In some implementations, malloc rounds up the amount of memory you
requested (perhaps to a power of 2), in which case many of the reallocs
will be very cheap.  Also, if the size of the array gets larger than the
size of an observation record, the malloc may be able to use the space
vacated during the previous realloc, so every other realloc would be cheap.

Thus, at worst you would probably get about the same performance whether or
not you grow by larger chunks, but at best you may speed things up quite a
bit.  Furthermore, if these arrays are large, lots of reallocs can cause
quite a bit of paging, and paging is generally something you want to
minimize (a page fault is several orders of magnitude slower than any
individual instruction).

By the way, it's popular to grow things multiplicatively, rather than
linearly.  For instance, rather than adding 100 to the size when
reallocing, you might want to make the new size 1.5 or 2 times the old size.
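
A rough sketch of that in C (the capacity-bookkeeping names here are
illustrative, not from the program being discussed):

    #include <stdlib.h>

    static float **rows = NULL;
    static size_t nused = 0;        /* row slots filled so far */
    static size_t ncap = 0;         /* row slots allocated so far */

    /* append a row pointer, growing the array ~1.5x when it is full */
    int add_row(float *row)
    {
        if (nused == ncap) {
            size_t newcap = ncap ? ncap + ncap / 2 : 16;
            float **tmp = realloc(rows, newcap * sizeof *rows);

            if (tmp == NULL)
                return -1;
            rows = tmp;
            ncap = newcap;
        }
        rows[nused++] = row;
        return 0;
    }

This amortizes the copying: each pointer gets copied only a constant
number of times on average, instead of once per realloc.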

--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

bhoughto@hopi.intel.com (Blair P. Houghton) (04/21/91)

In article <1991Apr20.230021.25248@Think.COM> barmar@think.com (Barry Margolin) writes:
>In article <32121@usc> ajayshah@alhena.usc.edu (Ajay Shah) writes:
>>Question: is it much more efficient to realloc in groups of (say)
>>100 observations at a time instead of calling realloc once for each

Tons.  If you need to use an array (though not a declared
one) instead of, e.g., a linked list, use one that acts
like an array.  If you're worried about empty slots past
the last filled slot, you can always use realloc(3) to
shrink the array to fit your data (this is probably what
it was invented for...)
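
That final shrink-to-fit step is only a couple of lines; using the
same rows/nused/ncap names as the sketch in the previous article:

    /* after the last observation, trim the unused slots off the end */
    if (nused > 0) {
        float **tmp = realloc(rows, nused * sizeof *rows);

        if (tmp != NULL)    /* on failure the larger block still works */
            rows = tmp;
        ncap = nused;
    }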

>Yes, it's generally better to minimize the reallocs.  Consider an
>implementation of malloc that allocates a new object right where the
>previous object ended (perhaps with some header information interspersed).
>The reallocs would have to copy because the previous malloc would prevent
>the realloc from simply extended the array.

I.e., realloc can't extend an array into space that is already in use.

>Thus, at worst you would probably get about the same performance whether or
>not you grow by larger chunks, but at best you may speed things up quite a

Not always larger; sometimes _smaller_ chunks are better.

If you know what kind of data you're reading, you should be
able to get a distribution of the sizes of arrays you can
expect to use for that data.  If the data often reach
10.00Mb, but rarely 10.02Mb, it's very wasteful to allocate
another 10Mb of heap just because you've reached 10Mb.

I was playing around with this sort of thing a couple of
months ago, but productivity got in the way and I never
implemented it on a well-characterized database.

The basic idea was to keep an array of the likely array
sizes, and reallocate to the next size when one was reached.

The first step in the development (can you say "bottom-up
design"?  I knew that you could (-: ) was generating some
distributions.  So what I have is a filter (called
'loghist') that takes a stream of numbers (actually, lines
that start with numbers, for you du(1)-output fans) and
generates a cumulative histogram[*] with a logarithmic
ordinate axis.

It's not nearly clean enough for comp.sources.*, but I'd be
glad to mail it to anyone who wants it.

The next step is to translate the c.h. (or chuck it out
entirely, if necessary; I don't know, because this is about
where the deadline on another project kicked in...) into
the set of numbers which divide the data into intervals
that have some rational probabilities (i.e., to reshape the
histogram by choosing uneven increments in the ordinate).
Whether to use a probability sequence of {.5, .75, .875, ...}
or {.9, .99, .999, ...} I haven't determined.  That
may best be determined by the kind of data involved and
the economics of memory usage vs. reallocation-execution
time on the system.

Thus it should be possible to give the optimal size of an
incremental reallocation.  Running loghist on my own files
(the output of 'du -a ~') to see what sort of space
"general user-file bytes" data would need, I found that
my files are distributed roughly normally in
the 0-5k range (there are a lot of saved news articles in there,
so it peaks around 2k; it's also more accurately a Student's-t
distribution, for you statistics fans), then fairly evenly
(and low) up to about 20k, then sparse until a few clusters
(of similar files, usually multiple, small-difference versions
of specialized data) show up in the 100k-10Mb range.

Creating an array to read this, I'd start with 5k (about 90%
of the files), then if a realloc was needed, I'd add enough
space to reach 20k, and then enough to reach each of the
clusters, as necessary.
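
In outline, the table lookup might look like this (a sketch only; the
thresholds are the rough figures above, and the real ones would come
straight out of the loghist output):

    #include <stddef.h>

    /* candidate sizes, in bytes, taken from the observed distribution */
    static const size_t size_class[] = {
        5*1024, 20*1024, 100*1024, 1024*1024, 10*1024*1024
    };
    #define NCLASSES (sizeof size_class / sizeof size_class[0])

    /* smallest tabulated size that holds `needed' bytes */
    size_t next_size(size_t needed)
    {
        size_t i;

        for (i = 0; i < NCLASSES; i++)
            if (size_class[i] >= needed)
                return size_class[i];
        return needed;      /* past the table: allocate exactly what was asked */
    }

Each time the buffer fills, you realloc it to next_size(bytes_needed)
rather than by a fixed or multiplicative increment.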

Your mileage may vary.

				--Blair
				  "It took longer to explain
				   it than it did to do it..."

[*] A cumulative histogram is to a histogram what the
cumulative distribution function is to the probability
distribution function:  the probability that an event will
have its result in a certain interval is the difference
between the values of the c.d.f. at the endpoints of that
interval.  A cumulative histogram is, therefore, the
statistical analogue of the cumulative distribution
function (which is deduced, rather than statistical, and
is normalized so that F(oo) = 1).
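
(In symbols, for a c.d.f. F: P(a < X <= b) = F(b) - F(a), with F
normalized so that F(oo) = 1.)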

kpv@ulysses.att.com (Phong Vo[drew]) (04/21/91)

In article <32121@usc>, ajayshah@alhena.usc.edu (Ajay Shah) writes:
: 
: ...  So the program profile is 
: realloc-malloc-realloc-malloc-realloc-malloc until all
: observations are read in.
: 
: Question: is it much more efficient to realloc in groups of (say)
: 100 observations at a time instead of calling realloc once for each
: observation?  Realloc might well have to copy a lot of elements in the
: process of doing this, right?  This could be expensive.
: 
: -- 
Assuming that this is the only thing that your program does, it is almost
certain that a lot of memory copying is going to be done.  The worst part is
that with some allocators, such as the circular first-fit allocator with a
roving pointer (the first popular malloc for UNIX systems), the above pattern
of allocation can cause severe fragmentation if the size of a row is much
smaller than the number of rows.  See the paper 'In Search of a Better Malloc'
that Dave Korn and I presented at the '85 Summer Usenix conference for more
details.