fangchin@portia.Stanford.EDU (Chin Fang) (01/07/91)
Hi everyone,
Not long ago I had to write a code to handle certain iterative floating
point intensive computations. However, I discovered that the
algorithm allowed me to skip many repeated time-consuming fp
operations if I could somehow cache computed data in memory. I was
in a situation that I had enough memory at my disposal, so reducing
disk access would be my priority to speed up the process.
After some deliberation, I decided to use balanced tree as the way
to implement the data cache. The algorithm for the balanced tree is
Drs. Guibas and Sedgewick's RB tree [1].
The data that I had to cache were small float matrices (3x3 typical),
and I used Numerical Recipes [2] utility fuctions for creating and
distorying these small matrices. The small matrices required some
quite heavy transendental function computations to generate and I
deemed it unacceptable to have bad worst case performance. So I
decided to implemented an in-memory cache using RB tree, this strategy
should result somewhat cheaper operation (I hope).
The problem I had (and still having) was how to keep the program's
memory consumption almost constant. From what I understand, if my
malloc(3C) and free3(C) work correctly, as long as I make sure that
a sequence of malloced memory blocks are freed in EXACTLY the reverse
order, my program's memory consumption would stay constant no matter
how many iterations it goes thru. This has been my guideline in my
coding. By the way, I always do C programming on UNIX machines.
To achieve this for simple linear opertions is trival. But for trees,
as far as I could (and can) see, I had to maintain a list for keeping
track of the order of creation of each node in the tree in order to
call free() in the right order. To do so necessitated a mess in my code
that I was really unhappy.
Summerize, below is an outline of what I was doing
enter program
while (some conditions are true){
get raw data from somewhere;
form certain matrices and put them into nodes of a RB tree;
perform some FP ops, go to the tree first to see if required matrices
are available. (They should, my algorithm guarantee this);
distory the tree and release all memory (in the right order);
}
exit program
As obvious from above pseudo code, I was trying to trade expansive
memory operations for disgusting and far more expansive disk
accesses. Note, each while loop may require a different tree, so it
is impossible to move the expansive memory allocation/deallocation to
outside the while (I wish it could be done).
Given this situation, could someone give me some suggestions for
a better way of handling tree structure memory management? I don't
think the order tracking list that I am using is a good enough one.
and I believe there is still some memory leak somewhere as I can see
from the output of ps(1) (SZ term was growing).
Thanks in advance to anyone who responds. Emails are most welcome,
if this question is of general interest, I will summerize.
Regards
Chin Fang
Mechanical Engineering Department
Stanford University
fangchin@portia.stanford.edu
[1] L. J. Guibas and R. Sedgewick. "A dichromatic framework for
balanced trees." In Proc. 19th Ann. Symp. on Theory of Computing
(1978): 8-21
[2] W. H. Press, B.P.Flannery, S.A. Teukolsky, W.T.Vetterling.
Numerical Recipes in C, the art of scientific computing.
Cambridge University Press. 1988, Appendix D.
salomon@ccu.umanitoba.ca (Dan Salomon) (01/08/91)
In article <1991Jan7.034619.4449@portia.Stanford.EDU> fangchin@portia.Stanford.EDU (Chin Fang) writes: > > The problem I had (and still having) was how to keep the program's > memory consumption almost constant. From what I understand, if my > malloc(3C) and free3(C) work correctly, as long as I make sure that > a sequence of malloced memory blocks are freed in EXACTLY the reverse > order, my program's memory consumption would stay constant no matter > how many iterations it goes thru. This has been my guideline in my > coding. By the way, I always do C programming on UNIX machines. What you can do is add an extra pointer to the binary tree nodes to form a linked list that records the order of allocation. Ptr-to-Head-of-Allocation-List | | V ------------------------------------------------------- | Data | left-subtree | right-subtree | new-pointer | ------------------------------------------------------- | | V To node allocated just before this one. Every time you allocate a tree node, also add it to the FRONT of the linked list recording allocations. Thus the linked list will be in order of most recent allocation first, oldest allocation last. As a result it is easy to free the nodes from front to back. Note that each node is in two data structures simultaneously, the tree data structure, and the allocation data structure. Since the data structures are accessed through different pointers, they do not interfere with each other. -- Dan Salomon -- salomon@ccu.UManitoba.CA Dept. of Computer Science / University of Manitoba Winnipeg, Manitoba, Canada R3T 2N2 / (204) 275-6682
mostek@motcid.UUCP (Frank B. Mostek) (01/09/91)
fangchin@portia.Stanford.EDU (Chin Fang) writes: >Hi everyone, > >... >while (some conditions are true){ > get raw data from somewhere; > form certain matrices and put them into nodes of a RB tree; > perform some FP ops, go to the tree first to see if required matrices > are available. (They should, my algorithm guarantee this); > distory the tree and release all memory (in the right order); >} >exit program >Given this situation, could someone give me some suggestions for >a better way of handling tree structure memory management? I don't >think the order tracking list that I am using is a good enough one. >and I believe there is still some memory leak somewhere as I can see >from the output of ps(1) (SZ term was growing). In one of my compilers I cleaned up allocated memory that was transparent to the data structures. (Since the compiler was using several different ones.) I wrote a general memory allocation routine, and inside this routine I built a linked list of allocated memory. When I wanted to clean up, I simply checked for the type field, and freed up all the memory in a tree, stack, etc... In your case, you don't even have to keep track of the data structure. I don't have the code handy, but its something like this: struct alloced_mem { char* p; struct alloced_mem *next; } mem_list; void *allocate_memory(size) unsigned size; { char *p; p = malloc(size); if (p == null) err(); /* Add node to alloced_mem list, Add to beginning of list */ add_node_to_mem_list(p); /* Don't call allocate_memory! */ return (void *)p; } Note: Cast the call to allocate_memory i.e: tree = (struct tree*)allocate_memory(sizeof(struct tree)); Then to free it up: void free_tree() { while (mem_list != NULL) { free(mem_list); free(mem_list->p); mem_list = mem_list->next; } } This is kind of rough. If you think this may help, I can dig up the actual code and send it to you. Frank Mostek - uunet!motcid!amethyst!mostek Don't need no steenkin .signature file.
mcdaniel@adi.com (Tim McDaniel) (01/09/91)
fangchin@portia.Stanford.EDU (Chin Fang) writes:
From what I understand, if my malloc(3C) and free3(C) work
correctly, as long as I make sure that a sequence of malloced
memory blocks are freed in EXACTLY the reverse order, my program's
memory consumption would stay constant no matter how many
iterations it goes thru.
There are no firm guarantees. A free() that did nothing would satisfy
ANSI C, and there are certainly no guarantees in non-ANSI C.
In most implementations, if the requests are all the same size, it
doesn't matter in what order the blocks are freed. Choose any
convenient order.
If the size of your program is growing, it may not be your fault.
Many standard libraries do their own dynamic storage allocation.
There are libraries that check for memory allocation "leaks" (cases
where you malloced a block and neglected to free it.)
--
Tim McDaniel Applied Dynamics Int'l.; Ann Arbor, Michigan, USA
Work phone: +1 313 973 1300 Home phone: +1 313 677 4386
Internet: mcdaniel@adi.com UUCP: {uunet,sharkey}!amara!mcdaniel
gwyn@smoke.brl.mil (Doug Gwyn) (01/11/91)
In article <MCDANIEL.91Jan9102148@dolphin.adi.com> mcdaniel@adi.com (Tim McDaniel) writes: >A free() that did nothing would satisfy ANSI C, ... Not strictly true, as the standard specifies that the storage is made available for subsequent allocation. However, there is no strictly conforming way to test for this behavior.
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/11/91)
In article <14805@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes: > In article <MCDANIEL.91Jan9102148@dolphin.adi.com> mcdaniel@adi.com (Tim McDaniel) writes: > >A free() that did nothing would satisfy ANSI C, ... > Not strictly true, as the standard specifies that the storage is made > available for subsequent allocation. However, there is no strictly > conforming way to test for this behavior. Doesn't as-if kick in here? A conforming program cannot figure out that a no-op is different from the definition of free(); therefore the no-op works as if it were a free() satisfying the literal definition; therefore the no-op is a valid implementation of free(). Where's the mistake? ---Dan
gwyn@smoke.brl.mil (Doug Gwyn) (01/11/91)
In article <20872:Jan1017:54:3891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >In article <14805@smoke.brl.mil> gwyn@smoke.brl.mil (Doug Gwyn) writes: >> In article <MCDANIEL.91Jan9102148@dolphin.adi.com> mcdaniel@adi.com (Tim McDaniel) writes: >> >A free() that did nothing would satisfy ANSI C, ... >> Not strictly true, as the standard specifies that the storage is made >> available for subsequent allocation. However, there is no strictly >> conforming way to test for this behavior. >Doesn't as-if kick in here? A conforming program cannot figure out that >a no-op is different from the definition of free(); therefore the no-op >works as if it were a free() satisfying the literal definition; >therefore the no-op is a valid implementation of free(). Where's the >mistake? The mistake is that failure of a strictly-conforming program to detect an error in the implementation does not magically let the implementation off the hook. It is fairly simple to write a strictly-conforming test program that would EVENTUALLY discover this implementation flaw, but there is no guaranteed upper bound on the amount of time that it would take to discover the problem. Indeed, this strikes me as a useful test to add to C compiler validation suites: while not guaranteeing that any such deficient implementation would be detected, it could at least loop the test enough times to be fairly sure of detecting such a problem in most common environments. For example: #include <stdio.h> /* for fprintf, stderr */ #include <stdlib.h> /* for exit, EXIT_*, free, malloc, size_t */ #define ALLOCS 100000L #define SIZE ((size_t)10000) int main(void) { register void *p; register long n; for (n = 0; n < ALLOCS; ++n) { if ((p = malloc(SIZE)) == NULL) { fprintf(stderr, "FAILED on iteration %ld\n", n); exit(EXIT_FAILURE); } free(p); } return EXIT_SUCCESS; }
gwyn@smoke.brl.mil (Doug Gwyn) (01/12/91)
In article <21@christmas.UUCP> rtm@island.uu.net (Richard Minner) writes: >First an open question: > How common are allocators that don't coalesce properly? Only one implemented by a nincompoop would have trouble with this, which was an obvious requirement as far back as the first edition of Knuth Vol. 1. There is no way to guarantee that arena fragmentation will not cause some allocation request to be impossible to satisfy even though the aggregate amount of available free store exceeds the amount requested. If memory would always accessed via an extra level of indirection (what Apple calls "handles"), then garbage collection could compact the arena and solve this problem. However, that is not the way that the malloc()/free() interface was designed.