paul@athertn.Atherton.COM (Paul Sander) (07/25/90)
There has been some discussion in comp.software-eng lately regarding reusable software components. This discussion focused mainly on why they are not being used more; the more prominent opinions on this seemed be: - There aren't many - It's easier or more enjoyable or more rewarding to write from scratch than to reuse someone else's - When they exist, they're hard to find Some of the fundamental problems which exhibit themselves as these three points are these: - There aren't many reusable software components because what's out there that claims to be reusable isn't - There is no good way to catalog the components - There is no good way to index the catalog Though all of these topics are interesting and worth discussion, I would like to spur some discussion on just what a reusable software component is. Following is (at the risk of doing some undergraduate's first-quarter Data Structures homework) an implementation of an in-memory B-tree. What I'd like to discuss are: - Is this a reusable software component? - What are its characteristics that make it one? Or don't make it one? - Is it a good one? If so, why? If not, why? - Would anyone use it? Why or why not? - How can it be made better? - Anything else relevant to producing a good reusable component that is reused Other interesting topics that are probably best deferred to email are: - Choice of language - Programming style - Optimizations - Which newsgroups distribute source code I've attempted to direct followups to comp.sw.components. Please double- check your followup headers. Here is the code. I have not enveloped it in any form of archive so that those not on Unix systems (and do not have the benefit of "shar") can read it. By the way, every test I could think of has not revealed any bugs in this code: Moderate-sized trees are really B-trees, there are no coredumps, there are no memory leaks or corruptions. Paul Sander paul@Atherton.COM (408) 734-9822 ********************************************************************* **************************** btree.doc ****************************** ********************************************************************* BTREE() MISC. REFERENCE MANUAL PAGES BTREE() NAME bt_dump, bt_search, bt_new, bt_destroy, bt_insert, bt_traverse, bt_delete - In-memory B-tree manipulation SYNOPSIS #include <btree.h> BTREE bt_new(order,comp) int order; int (*comp)(); void bt_destroy(tree,free_key) BTREE tree; void (*free_key)(); int bt_insert(tree,key) BTREE tree; char *key; char *bt_delete(tree,key) BTREE tree; char *key; char *bt_search(tree,target) BTREE tree; char *target; void bt_traverse(tree,fn,parms) BTREE tree; void (*fn)(); char *parms; void bt_dump(tree,key_dump) BTREE tree; void (*key_dump)(); DESCRIPTION These functions implement an in-memory B-tree data struc- ture. The tree itself stores only pointers to the client's data, and relies on client- provided functions for any com- parisons and storage deallocation. bt_new creates a new B-tree. The client specifies the desired order (the maximum number of children allowed each node) of the tree (for performance trade-offs based on the efficiency of comparisons of client-provided data), and a pointer to a function that compares two client-provided data structures. The comparison function, comp, has an interface identical to strcmp(3), except that its arguments point to client-provided data structures. Client-provided data structures that compare greater by this function will appear later in the lexical order of the data stored in the tree. Sun Release 4.0 Last change: 1 BTREE() MISC. REFERENCE MANUAL PAGES BTREE() bt_new returns a pointer to a handle for the B-tree, or NULL if an error occurs. bt_destroy deallocates all storage allocated to a B-tree. The client provides a pointer to a visitation function that is called once for each item stored in the tree. The items are visited in arbitrary order. If NULL is passed instead of a pointer to a function, no action is taken for the client-provided data, but the tree structure itself is freed. bt_insert inserts a new item into the tree. 1 is returned if the insertion was successful, -1 is returned if the new item matches another item that has already been inserted into the tree, and 0 is returned in the event of an error. bt_delete deletes an item from a tree. The value returned is the same as that passed to bt_insert when the item was inserted, or NULL if the item is not found. bt_search searches for an item in a tree. The value returned is the same as that passed to bt_insert when the item was inserted, or NULL if the item is not found. bt_traverse traverses the tree, calling a client-provided visitation function fn once for each item stored there. fn has the following interface: void fn(item,parms) char *item; char *parms; item is the pointer stored when the item was inserted into the tree. parms is an arbitrary pointer that the client wishes to pass to the visitation function, but is otherwise unused by the B-tree implementation. Items are visited in their lexical order. bt_dump displays the contents of the tree to stdout, along with some diagnostic and statistical information. The key_dump function is called once for each item in the tree, in arbitrary order. It may be NULL if no action is desired. Its interface follows: void key_dump(item) char *item; BUGS This implementation is written in K&R C, and tested only on a Sun 3 running SunOS 4.0.3. This implementation assumes "char*" is the generic pointer. This goes against practice used by most "modern" compilers, but aids portability to older compilers. Note also that this will probably generate lots of complaints from Lint. Sun Release 4.0 Last change: 2 BTREE() MISC. REFERENCE MANUAL PAGES BTREE() bt_dump assumes that pointers are the same size as integers, and that they can be displayed in total in eight hexidecimal digits. Sun Release 4.0 Last change: 3 **************************************************************** ******************** btree.h *********************************** **************************************************************** /* Header file for in-memory B-tree implementation */ typedef char *BTREE; extern void bt_dump(); extern char *bt_search(); extern BTREE bt_new(); extern void bt_destroy(); extern int bt_insert(); extern void bt_traverse(); extern char *bt_delete(); **************************************************************** ******************** btree.h *********************************** **************************************************************** /* #define DEBUG */ /********************************************************************** * * btree.c -- This file implements an in-memory B-tree abstract data * type. When the tree is created, the caller specifies * the number of children each node may have (i.e., the * tree's "order") and a pointer to a function that compares * two keys. The function has the same calling protocol as * strcmp(). * **********************************************************************/ #include <stdio.h> /* The following structure describes a B-tree node. */ struct btnode { int nkeys; /* Number of keys stored here */ char **keys; /* array[order-1] of keys */ struct btnode **children; /* array[order] of children */ }; /* The following structure describes a B-tree. */ struct btree { int order; /* Max children permitted */ int (*comp)(); /* Comparison function for keys */ struct btnode *root; /* Root of tree */ int capacity; /* Max keys that will fit */ int contains; /* Key currently stored */ }; typedef struct btree *BTREE; typedef struct btnode BTNODE; /* The following structure is used for debugging heap corruptions */ #ifdef DEBUG struct heapblk { char *blk; int freeNext; }; struct heap_struct { int allocSeq; int freeLast; struct heapblk blks[10000]; }; struct heap_struct btheap = {0,-1}; #endif DEBUG /* * The following code is used for allocating space on the heap. When not * debugging heap corruptions, a macro aliases bt_malloc with malloc. */ #ifdef DEBUG char *bt_malloc(size) unsigned size; { char *temp; temp = (char*) malloc(size); btheap.blks[btheap.allocSeq].blk = temp; btheap.blks[btheap.allocSeq].freeNext = -2; btheap.allocSeq++; return temp; } #else #define bt_malloc(size) malloc(size) #endif /* * The following code is used for freeing memory on the heap. If not * debugging heap corruptions, bt_free is aliased with free. */ #ifdef DEBUG bt_free(ptr) char *ptr; { int i; printf("Freeing heap storage at %x\n",ptr); for (i = 0; i < btheap.allocSeq; i++) { if (ptr == btheap.blks[i].blk) { if (btheap.blks[i].freeNext == -2) { btheap.blks[i].freeNext = btheap.freeLast; btheap.freeLast = i; free(ptr); return; } else { printf( "ERROR: freeing %x which was already freed\n", ptr); printf("Block\n"); i = btheap.freeLast; while (i > 0) { printf("%x\n",btheap.blks[i].blk); i = btheap.blks[i].freeNext; } fflush(stdout); kill(getpid(),5); } } } printf("ERROR: Freeing %x which has not been allocated\n",ptr); fflush(stdout); kill(getpid(),5); } #else #define bt_free(ptr) free(ptr) #endif /* The following verifies the integrity of the heap. */ int bt_malloc_verify() { int retval; int once; int i; retval = 1; once = 0; #ifdef DEBUG for (i = 0; i < btheap.allocSeq; i++) { if (btheap.blks[i].freeNext == -2) { if (!once) { once = 1; printf("The following blocks are allocated:\n"); } printf("%x\n",btheap.blks[i].blk); } } fflush(stdout); #endif #ifdef sun retval = malloc_verify(); #endif return retval; } /********************************************************************* * * dump_node -- This function displays the contents of a node and * then recursively displays its children. Note * that this function assumes that a pointer to a node * is the same size as an integer. * *********************************************************************/ void dump_node(node,key_dump) BTNODE *node; void (*key_dump)(); { int i; if (node == NULL) { printf("ERROR: null node pointer\n"); return; } printf("%08x: %d keys(%08x, %08x)\n",(int) node,node->nkeys, node->keys,node->children); printf("--------\n"); for (i = 0; i < node->nkeys; i++) { if (node->children != NULL) { printf(" %08x\n",node->children[i]); } else { printf(" %08x\n",0); } if (key_dump != NULL) { printf(" "); key_dump(node->keys[i]); } } if (node->children != NULL) { printf(" %08x\n",node->children[i]); for (i = 0; i <= node->nkeys; i++) { dump_node(node->children[i],key_dump); } } else { printf(" %08x\n",0); } fflush(stdout); return; } /********************************************************************* * * bt_dump -- This function displays the contents of a B-tree. The * caller passes in a function that displays the contents * of one of the keys stored in the tree. The calling * protocol for this function is: * * key_dump(key) * char *key; * * Also, if key_dump is NULL, no action is taken. This is * useful if the structure of the tree is desired without * including its contents. * *********************************************************************/ void bt_dump(ptree,key_dump) char *ptree; void (*key_dump)(); { BTREE tree; tree = (BTREE) ptree; printf("B-tree dump:\n\n"); printf("order = %d\n",tree->order); printf("capacity = %d\n",tree->capacity); printf("contains %d keys\n",tree->contains); printf("efficiency is %d per cent\n", (tree->contains*100)/tree->capacity); printf("handle at %08x\n",tree); printf("root = %08x\n\n",(int) tree->root); dump_node(tree->root,key_dump); return; } /******************************************************************** * * lin_search -- This function searches a node for a key. If the key * is found, its index in the node's key array is * returned, along with a flag indicating that the key * was found. If the key is not found, the index of the * next higher key in the key array is returned, along * with a "not found" indication. * ********************************************************************/ static int lin_search(node,target,comp,eq) BTNODE *node; /* Node to be searched */ char *target; /* Key to search for */ int (*comp)(); /* strcmp()-like comparison function */ int *eq; /* 0 if not found, non-zero otherwise */ { int i; int res; int hi; int lo; res = -1; /* * Perform linear search of node contains 7 keys or less, * otherwise perform binary search */ if (node->nkeys <= 7) { for (i = 0; i < node->nkeys; i++) { res = (*comp)(node->keys[i],target); if (res >= 0) break; } } else { lo = 0; hi = node->nkeys - 1; while ((lo <= hi) && (res != 0)) { i = (lo + hi) / 2; res = (*comp)(node->keys[i],target); if (res < 0) lo = i + 1; else if (res > 0) hi = i - 1; } if (res < 0) i++; } *eq = (res == 0); /* Indicate successful search */ return i; } /*********************************************************************** * * node_search -- This function performs a recursive-descent search of * a sub-tree for a key. The key is returned if found, * or NULL otherwise. * ***********************************************************************/ static char *node_search(node,target,comp) BTNODE *node; char *target; int (*comp)(); { int res; int i; int eq; if (node == NULL) return NULL; i = lin_search(node,target,comp,&eq); if (eq) return node->keys[i]; else if (node->children == NULL) return NULL; else return node_search(node->children[i],target,comp); } /*********************************************************************** * * bt_search -- This function searches a tree for a specified key. The * key is returned if found, or NULL if not. * ***********************************************************************/ char *bt_search(tree,target) BTREE tree; char *target; { if (tree == NULL) return NULL; return node_search(tree->root,target,tree->comp); } /*********************************************************************** * * bt_new -- Given the number of children each node is allowed to have, * and a pointer to a strcmp-like function that compares two * keys, this function creates an empty b-tree structure. * ***********************************************************************/ char *bt_new(order,comp) int order; int (*comp)(); { BTREE tree; int i; /* Validate parameters */ if ((comp == NULL) || (order < 3)) return (char *) NULL; /* Allocate the handle */ tree = (BTREE) bt_malloc(sizeof(struct btree)); if (tree != NULL) { /* Allocate the root */ tree->root = (BTNODE*) bt_malloc(sizeof(struct btnode)); if (tree->root != NULL) { /* Initialize root */ tree->order = order; tree->comp = comp; tree->root->nkeys = 0; tree->root->children = NULL; /* Allocate key array */ tree->root->keys = (char**)bt_malloc((order - 1) * sizeof(char*)); if (tree->root->keys != NULL) { /* Initialize keys */ for (i = 0; i < order - 1; i++) { tree->root->keys[i] = NULL; } tree->root->nkeys = 0; tree->contains = 0; tree->capacity = order - 1; return (char *) tree; } /* Failed to allocate keys, free tree */ bt_free(tree->root); } /* Failed to allocate root, free handle */ bt_free(tree); } /* Failed to allocate handle, return null (error) */ return (char *) NULL; } /******************************************************************** * * free_node -- This function frees a node in a b-tree. It is * provided with a function that frees each key in the * node also. The free_key function has the following * protocol: * * free_key(key) * char *key; * * If NULL is passed as the free_key function, the node * is freed, but no action is taken for the keys. * ********************************************************************/ static void free_node(node,free_key) BTNODE *node; void (*free_key)(); { int i; if (node->children != NULL) { for (i = 0; i <= node->nkeys; i++) { free_node(node->children[i],free_key); } bt_free(node->children); } if (free_key != NULL) { for (i = 0; i < node->nkeys; i++) { (*free_key)(node->keys[i]); } } bt_free(node->keys); bt_free(node); return; } /******************************************************************** * * bt_destroy -- This function frees a b-tree structure. It is also * passed a function that frees the keys contained by the * tree. The free_key function has the same calling * protocol as the free_key function passed to free_node * above. * ********************************************************************/ void bt_destroy(ptree,free_key) char *ptree; void (*free_key)(); { BTREE tree; tree = (BTREE) ptree; free_node(tree->root,free_key); bt_free(tree); return; } /****************************************************************** * * rot_right -- Given a node and an index into its key array, this * function rotates the node right at the specified * key. This is used during insertion and deletion to * keep the tree balanced. The return value is 0 if * the function fails, i.e. the node on the left * cannot lose keys, the node on the right cannot * gain keys, or the specified node is a leaf; 1 is * returned if successful. * ******************************************************************/ static int rot_right(node,n,order) BTNODE *node; int n; int order; { int i; BTNODE *right; BTNODE *left; /* Test parameters */ if ((n >= node->nkeys) || (n < 0)) { return 0; } /* Point to children */ left = node->children[n]; right = node->children[n+1]; /* Return FALSE if rotation is not possible */ if ((left == NULL) || (left->nkeys <= order/2) || (right->nkeys >= order - 1)) { return 0; } /* Shift the right child's keys */ for (i = right->nkeys; i > 0; i--) { right->keys[i] = right->keys[i - 1]; } /* Rotate keys */ right->keys[0] = node->keys[n]; node->keys[n] = left->keys[left->nkeys - 1]; left->keys[left->nkeys - 1] = NULL; if (right->children != NULL) { /* Shift the right child's children */ for (i = right->nkeys; i >= 0; i--) { right->children[i + 1] = right->children[i]; } /* Rotate children */ right->children[0] = left->children[left->nkeys]; left->children[left->nkeys] = NULL; } /* Update key count and return TRUE */ right->nkeys++; left->nkeys--; return 1; } /****************************************************************** * * rot_left -- Given a node and an index into its key array, this * function rotates the node left at the specified * key. This is used during insertion and deletion to * keep the tree balanced. The return value is 0 if * the function fails, i.e. the node on the right * cannot lose keys, the node on the left cannot * gain keys, or the specified node is a leaf; 1 is * returned when successful. * ******************************************************************/ static int rot_left(node,n,order) BTNODE *node; int n; int order; { int i; BTNODE *right; BTNODE *left; /* Test parameters */ if ((n < 0) || (n >= node->nkeys)) { return 0; } /* Point to children */ right = node->children[n+1]; left = node->children[n]; /* Return FALSE if rotation is not possible */ if ((left == NULL) || (right->nkeys <= order/2) || (left->nkeys >= order - 1)) { return 0; } /* Rotate keys */ left->keys[left->nkeys] = node->keys[n]; node->keys[n] = right->keys[0]; /* Update key counts */ left->nkeys ++; right->nkeys --; /* Shift the right child's keys */ for (i = 0; i < right->nkeys; i++) { right->keys[i] = right->keys[i+1]; } right->keys[i] = NULL; if (left->children != NULL) { /* Rotate children */ left->children[left->nkeys] = right->children[0]; /* Shift the right child's children */ for (i = 0; i <= right->nkeys; i++) { right->children[i] = right->children[i+1]; } right->children[i+1] = NULL; } /* Return TRUE */ return 1; } /********************************************************************* * * ins_key -- This function inserts a key into a specified node at * a specified location in its key array. 0 is returned * if the node is already full, 1 otherwise. * *********************************************************************/ static int ins_key(node,key,n,order) BTNODE *node; char *key; int n; int order; { int i; /* Return FALSE if insertion is not possible */ if (node->nkeys >= order - 1) { return 0; } /* Shift keys to make room, then insert new key */ for (i = node->nkeys - 1; i >= n; i--) { node->keys[i+1] = node->keys[i]; } node->keys[n] = key; node->nkeys++; return 1; } /********************************************************************* * * burst -- Given a node and an index into its key and child arrays, * this function splits the specified child node in half, * elevating the key at the child's split point into this * specified node. 0 is returned if the child array index * is out of range, or the node is full, or malloc fails; * 1 is returned otherwise. * *********************************************************************/ static int burst(tree,node,n) BTREE tree; BTNODE *node; int n; { BTNODE *new; int i; int m; BTNODE *left; int lkeys; /* Test parameters */ if (((node->nkeys > 0) && (node->nkeys >= tree->order - 1)) || (n < 0)) { return 0; } /* Create a new node */ new = (BTNODE*) bt_malloc(sizeof(BTNODE)); if (new != NULL) { /* Allocate new key array */ new->keys = (char**) bt_malloc(tree->order*sizeof(char*)); if (new->keys != NULL) { /* Calculate needed partial results */ m = tree->order / 2; left = node->children[n]; lkeys = left->nkeys; /* Allocate new child array */ if (left->children != NULL) { new->children = (BTNODE**) bt_malloc((tree->order + 1) * sizeof(BTNODE*)); if (new->children != NULL) { for (i = 0; i <= lkeys - m; i++) { new->children[i] = left->children[m + i]; } } else { bt_free(new->keys); bt_free(new); return 0; } } else { new->children = NULL; } /* Split keys */ for (i = 0; i < lkeys - m; i++) { new->keys[i] = left->keys[m + i]; } new->nkeys = lkeys - m; left->nkeys = m - 1; /* Elevate the key where the split was made */ for (i = node->nkeys; i > n; i--) { node->keys[i] = node->keys[i-1]; node->children[i+1] = node->children[i]; } node->children[n+1] = new; node->keys[n] = left->keys[m - 1]; node->nkeys++; tree->capacity += tree->order - 1; return 1; } /* Failed to allocate new child array, free node */ bt_free(new); } /* Failed to allocate new node, return FALSE */ return 0; } /********************************************************************* * * rebalance -- This function tries to rebalance the two adjacent * nodes. This is needed when an insertion is attempted * into a node that is already full but its neighbors are * not, or when a deletion is made and the node is less * than half-full. * *********************************************************************/ static rebalance(node,n,order) BTNODE *node; int n; int order; { int res; if (node->children[n]->nkeys < (order-1)/2) { res = rot_left(node,n,order); } else if ((n < node->nkeys) && (node->children[n+1]->nkeys < (order-1)/2)) { res = rot_right(node,n,order); } return; } /********************************************************************* * * ins_node -- This function performs a recursive-descent of a b-tree, * inserting a new key in a leaf node. If the leaf is * full, the tree is reorganized to make room. -1 is * returned if the insert fails due to a duplicate key. * 0 is returned if the insert fails for some other * reason. 1 is returned if successful. * *********************************************************************/ static int ins_node(tree,node,key) BTREE tree; BTNODE *node; char *key; { int i; int res; /* Locate proper place for new key in node */ i = lin_search(node,key,tree->comp,&res); if (res) return -1; /* If no children, insert key in this node */ if (node->children == NULL) { return ins_key(node,key,i,tree->order); } /* Try to insert the new key in a child node */ res = ins_node(tree,node->children[i],key); /* Child is full, reorganize */ if (res == 0) { /* Try rotating right */ if ((res == 0) && (i < node->nkeys) && (node->children[i+1]->nkeys < tree->order - 2)) { res = rot_right(node,i,tree->order); } /* Try rotating left */ if ((res == 0) && (i > 0) && (node->children[i-1]->nkeys < tree->order - 2)) { res = rot_left(node,i - 1,tree->order); } /* Can't rotate, try bursting */ if (res == 0) res = burst(tree,node,i); if (res > 0) { res = ins_node(tree,node,key); if (res > 0) { rebalance(node,i,tree->order); } } } return res; } /********************************************************************* * * bt_insert -- This function inserts a new key into a b-tree. -1 is * returned if the insert fails due to a duplicate key. * 0 is returned if the insert fails for some other * reason. 1 is returned if successful. * *********************************************************************/ int bt_insert(ptree,key) char *ptree; char *key; { int res; BTNODE *new; BTNODE *node; BTREE tree; tree = (BTREE) ptree; /* Begin at root */ node = tree->root; /* If root is empty, insert first key */ if (node->nkeys == 0) { node->keys[0] = key; node->nkeys = 1; tree->contains = 1; return 1; } /* Try inserting new key */ res = ins_node(tree,node,key); /* If 0 return, burst the root, rebalance, and try again */ if (res == 0) { new = (BTNODE*) bt_malloc(sizeof(BTNODE)); if (new != NULL) { new->children = (BTNODE**) bt_malloc((tree->order + 1) * sizeof(BTNODE*)); if (new->children != NULL) { new->keys = (char**) bt_malloc((tree->order) * sizeof(char*)); if (new->keys != NULL) { new->nkeys = 0; new->children[0] = node; tree->root = new; tree->capacity += tree->order - 1; res = burst(tree,new,0); if (res > 0) { res = ins_node(tree,new,key); } if (res > 0) { tree->contains++; rebalance(new,0,tree->order); } return res; } bt_free(new->children); } bt_free(new); } } if (res > 0) tree->contains++; return res; } /**************************************************************** * * visit_node -- This function is used while traversing a b-tree. * It is called once for each node. It calls some * specified function once for each key in its * key array, and calls itself once for each child * in its child array. * ****************************************************************/ static void visit_node(node,fn,parms) BTNODE *node; void (*fn)(); char *parms; { int i; if (node == NULL) return; for (i = 0; i < node->nkeys; i++) { if (node->children != NULL) { visit_node(node->children[i],fn,parms); } (*fn)(node->keys[i],parms); } if (node->children != NULL) { visit_node(node->children[i],fn,parms); } return; } /**************************************************************** * * bt_traverse -- This function traverses a b-tree, calling some * specified function once for each key stored in * it. The specified function has the following * protocol: * * fn(key,parms) * char *key; * char *parms; * * where key is a pointer to a key stored in the * btree, and parms is the parameter structure * passed by the caller. * * The nodes are visited in descending order, if * the comp function passed to bt_new is * strcmp-like. * ****************************************************************/ void bt_traverse(ptree,fn,parms) char *ptree; void (*fn)(); char *parms; { BTREE tree; tree = (BTREE) ptree; visit_node(tree->root,fn,parms); return; } /******************************************************************** * * del_key -- This function deletes the specified key from the * specified node. * ********************************************************************/ static del_key(node,n) BTNODE *node; int n; { int i; for (i = n; i < node->nkeys - 1; i++) { node->keys[i] = node->keys[i+1]; } if (node->children != NULL) { for (i = n; i < node->nkeys; i++) { node->children[i] = node->children[i+1]; } } node->nkeys--; return; } /******************************************************************** * * weld -- This function is the opposite of burst, above. It joins * two neighboring children of the given node, and lowers a * key into the result. * ********************************************************************/ static weld(tree,node,n) BTREE tree; BTNODE *node; int n; { int i; int j; BTNODE *left; BTNODE *right; if (node->children == NULL) return; #ifdef DEBUG printf("Welding node at %x, key %d\n",(int)node,n); #endif left = node->children[n]; right = node->children[n+1]; i = left->nkeys; left->keys[i] = node->keys[n]; for (i++, j = 0; j < right->nkeys; i++, j++) { left->keys[i] = right->keys[j]; } for (i = n + 1; i <= node->nkeys; i++) { node->keys[i-1] = node->keys[i]; node->children[i] = node->children[i+1]; } node->nkeys--; if (left->children != NULL) { for (i = left->nkeys + 1, j = 0; j <= right->nkeys; i++, j++) { left->children[i] = right->children[j]; } bt_free(right->children); } left->nkeys += right->nkeys + 1; bt_free(right->keys); bt_free(right); tree->capacity -= tree->order - 1; return; } /******************************************************************** * * del_pred -- This function searches a sub-tree, and deletes the * highest key it finds, and returns it to its caller. * If any node along the way loses its b-tree'ness, the * tree is reorganized after the deletion. * ********************************************************************/ static char *del_pred(tree,node) BTREE tree; BTNODE *node; { BTNODE *pnode; int res; char *retval; /* * If at a leaf node, delete the highest key, otherwise * call self recursively */ if (node->children == NULL) { retval = node->keys[node->nkeys - 1]; node->nkeys--; } else { pnode = node->children[node->nkeys]; retval = del_pred(tree,pnode); /* Reorganize tree if node gets too empty */ if (pnode->nkeys < (tree->order - 1) / 2) { res = rot_right(node,node->nkeys - 1,tree->order); if (res == 0) weld(tree,node,node->nkeys - 1); } } return retval; } /******************************************************************** * * del_node -- This function searches a sub-tree for a key, and * deletes the key. If the key is found in a leaf node, * it is simple removed and returned to the caller; * otherwise, it is swapped with its predecessor, and * the returned to the caller. If any node loses its * b-tree'ness, the tree is reorganized. If the key is * not found, NULL is returned. * ********************************************************************/ static char *del_node(tree,node,key) BTREE tree; BTNODE *node; char *key; { int i; int res; char *retval; /* Key not found */ if (node == NULL) return NULL; /* Search for key, descend tree until found or find leaf */ i = lin_search(node,key,tree->comp,&res); if (res == 0) { if (node->children == NULL) return NULL; retval = del_node(tree,node->children[i],key); } else { /* Return the deleted node to caller */ retval = node->keys[i]; if (node->children == NULL) { /* Delete from leaf */ del_key(node,i); } else { /* Swap with predecessor */ node->keys[i] = del_pred(tree,node->children[i]); } } /* If a child node was reorganized, rebalancing may be necessary */ if ((node->children != NULL) && (node->children[i]->nkeys < (tree->order - 1) / 2)) { res = rot_right(node,i - 1,tree->order); if (res == 0) res = rot_left(node,i,tree->order); if (res == 0) { if (i >= node->nkeys) i--; weld(tree,node,i); } } return retval; } /******************************************************************** * * bt_delete -- This function deletes the specified key from a * b-tree and reorganizes it as necessary. The deleted * key is returned to the caller if it is found, NULL * is returned otherwise. * ********************************************************************/ char *bt_delete(ptree,key) char *ptree; char *key; { BTNODE *root; char *retval; BTREE tree; tree = (BTREE) ptree; /* Do no deletion if tree is empty */ root = tree->root; if (root->nkeys == 0) return NULL; /* Delete the key */ retval = del_node(tree,root,key); /* * Delete the root if it is empty, yet its children are not. * This happens after a weld on the last key in the root. */ if ((root->nkeys == 0) && (root->children != NULL)) { tree->root = root->children[0]; bt_free(root->keys); bt_free(root->children); bt_free(root); tree->capacity -= tree->order - 1; } if (retval != NULL) tree->contains--; return retval; } ***************************************************************** ****************** test.c *************************************** ***************************************************************** #include <stdio.h> #include "btree.h" /* #define DEBUGGING */ #define LOOPS 500 #define ORDER 12 typedef struct key { int key; } KEY; KEY savedKeys[LOOPS]; dump(key) KEY *key; { printf("%x: %d\n",key,key->key); } visit(key,parms) KEY *key; char *parms; { dump(key); } comp(k1,k2) KEY *k1,*k2; { return (k1->key - k2->key); } main() { BTREE tree; KEY *key; int i; int res; #ifdef DEBUGGING #ifdef sun malloc_debug(2); #endif #endif tree = (BTREE)bt_new(ORDER,comp); if (tree == NULL) { printf("Failed to create tree\n"); exit(1); } (void) bt_malloc_verify(); for (i = 0; i < LOOPS; i++) { printf("Iteration %d\n",i+1); key = &savedKeys[i]; key->key = rand(); printf("Inserting %d\n",key->key); fflush(stdout); res = bt_insert(tree,key); if (res > 0) { printf("Inserted %d\n",key->key); } else if (res < 0) { printf("Duplicate key %d\n",key->key); } else { printf("Error inserting key %d\n"); exit(3); } printf("Trying duplicate\n"); res = bt_insert(tree,key); if (res < 0) { printf("Failed to insert duplicate\n"); } else if (res == 0) { printf("Error inserting duplicate\n"); exit(4); } else { printf("Succeeded in inserting duplicate\n"); exit(5); } #if 0 #ifdef DEBUGGING printf("Contents of tree:\n"); bt_dump(tree,dump); #endif #endif 0 } printf("Finished creating the tree.\n"); printf("Contents of tree follow:\n"); bt_traverse(tree,visit,NULL); for (i = 0; i < LOOPS; i++) { printf("Iteration %d\n",i); printf("Seeking %d at %x\n",savedKeys[i].key,&savedKeys[i]); key = (KEY*)bt_search(tree,&savedKeys[i]); if (key != &savedKeys[i]) { printf("Can't find %d, returned %x\n",savedKeys[i].key, key); } printf("Freeing %d at %x\n",savedKeys[i].key,&savedKeys[i]); fflush(stdout); key = (KEY*)bt_delete(tree,&savedKeys[i]); if (key != &savedKeys[i]) { printf("Problem freeing %d at %x, returned %x\n", savedKeys[i].key,&savedKeys[i].key,key); } else { printf("Freed %d, returned %x\n",savedKeys[i].key,key); } #ifdef DEBUGGING printf("Contents of tree:\n"); bt_dump(tree,dump); fflush(stdout); #endif key = (KEY*)bt_search(tree,&savedKeys[i]); if (key == &savedKeys[i]) { printf("Found %d after deletion\n",savedKeys[i].key); } key = (KEY*)bt_delete(tree,&savedKeys[i]); if (key != NULL) { printf("%d remained in tree after deletion at %x\n", savedKeys[i].key,key); } #if 0 #ifdef DEBUGGING printf("Contents of tree:\n"); bt_dump(tree,dump); #endif #endif 0 } (void) bt_malloc_verify(); printf("Freeing tree\n"); bt_destroy(tree,NULL); printf("Finished freeing tree\n"); printf("Malloc verify: "); if (bt_malloc_verify()) { printf("success\n"); fflush(stdout); } else { printf("failure\n"); fflush(stdout); exit(6); } printf("Creating second tree\n"); tree = (BTREE)bt_new(ORDER,comp); if (tree == NULL) { printf("Failed to create tree\n"); exit(7); } (void) bt_malloc_verify(); printf("Contents of tree:\n"); bt_dump(tree,dump); for (i = 0; i < LOOPS; i++) { printf("Iteration %d\n",i+1); key = &savedKeys[i]; key->key = rand(); printf("Inserting %d\n",key->key); fflush(stdout); res = bt_insert(tree,key); if (res > 0) { printf("Inserted %d\n",key->key); } else if (res < 0) { printf("Duplicate key %d\n",key->key); } else { printf("Error inserting key %d\n"); exit(8); } printf("Trying duplicate\n"); res = bt_insert(tree,key); if (res < 0) { printf("Failed to insert duplicate\n"); } else if (res == 0) { printf("Error inserting duplicate\n"); exit(9); } else { printf("Succeeded in inserting duplicate\n"); exit(10); } #if 0 #ifdef DEBUGGING printf("Contents of tree:\n"); bt_dump(tree,dump); #endif #endif 0 } printf("Finished creating the tree.\n"); printf("Contents of tree follow:\n"); bt_traverse(tree,visit,NULL); printf("Freeing tree\n"); bt_destroy(tree,NULL); printf("Finished freeing tree\n"); printf("Malloc verify: "); if (bt_malloc_verify()) printf("success\n"); else printf("failure\n"); fflush(stdout); exit(0); } -- Paul Sander (408) 734-9822 | "Passwords are like underwear," she said, paul@Atherton.COM | "Both should be changed often." {decwrl,pyramid,sun}!athertn!paul | -- Bennett Falk in "Mom Meets Unix"