davidsen@crdos1.crd.ge.com (Bill Davidsen) (03/16/91)
Submitted-by: Bill Davidsen <davidsen@crdos1.crd.ge.com> Posting-number: Volume 17, Issue 40 Archive-name: bplus/part02 #!/bin/sh # this is part 2 of a multipart archive # do not concatenate these parts, unpack them in order with /bin/sh # file bplus.doc continued # CurArch=2 if test ! -r s2_seq_.tmp then echo "Please unpack part 1 first!" exit 1; fi ( read Scheck if test "$Scheck" != $CurArch then echo "Please unpack part $Scheck next!" exit 1; else exit 0; fi ) < s2_seq_.tmp || exit 1 echo "x - Continuing file bplus.doc" sed 's/^X//' << 'SHAR_EOF' >> bplus.doc X X X X int cdecl find_exact(pe, pix); X X ENTRY *pe; Pointer to variable of type ENTRY X IX_DESC *pix; Pointer to index file variable X X Description: The find_exact function searches the index X file for the key value contained in pe.key and the data X record position stored in pe.recptr. If an exact match X is found for both of these values, the value IX_OK (1) X is returned, and the internal index file pointer is set X to that position in the index file. If an exact match X is not found, the value IX_FAIL (0) is returned. X X X X TAILORING OR CHANGING B-PLUS X X Most applications can use the B-PLUS program as it is X distributed by Hunter and Associates without any changes. It X allows multiple, large data files to be indexed in a fast, X efficient manner. However, a description of the values that X can be changed to tailor B-PLUS are given in this section. X X An index tree becomes full when no more entries can be X added to the tree. This is the case when adding another X entry to the tree would cause the tree to grow larger than X its maximum allowed height. This height depends on the size X of the index blocks and the average size of the keys used by X the data files. The minimum capacity of a b-tree index is X given by the following formula: X X C = (B / E + 1) * (B / (2 * E) + 1) ** (H - 1) X X where C is the number of entries in the index file, B is the X X Page 8 X X X HUNTER AND ASSOCIATES B-PLUS FILE INDEXING PROGRAM X ------------------------------------------------------------- X X X block size in bytes, E is the average size of an ENTRY in X bytes, and H is the maximum tree height. X X The maximum key size is defined by MAXKEY and is set at X 100. The block size is 1024 bytes as defined by IXB_SIZE. X Ten bytes are used by pointers so 1014 bytes are used by X entries. The size of an ENTRY is 9 + the key length. X X Thus, if the average key length is 11, an average ENTRY X is 20 bytes long and the minimum number of entries that can X be contained in a tree of height 4 is: X X C = (1014 / 20 + 1) * (1014 / 40 + 1) ** 3 X = 945,072 X X If the average ENTRY is 40 bytes long, the minimum number of X entries that can be contained in a tree of height 4 is X 67,384. The corresponding values for a tree of height 3 are X 35,896 and 4927, respectively. X X The maximum tree height is determined by MAX_LEVELS and X is set to eight. Very little memory space is used by X allowing the maximum tree height to be this large. B-PLUS X increases the height of the tree dynamically as required by X number of records in the data file. X X If your application does not use long keys and your X files are not huge, IXB_SIZE can be changed to 512 bytes with X only a slight degradation in performance. X X The root of an open index file is always memory resident X as defined by the variable of type IX_DESC. A dynamic pool X of cache buffers is used for other index blocks of the tree. X The number of blocks in the pool is defined by NUM_BUFS with X a default value of 8. Each memory block is sizeof(IXB_SIZE) X + 8 bytes in length so approximately 8K of memory is used for X cache storage of index blocks. If the number of index files X that are open simultaneously is larger than 4, you may want X to increase NUM_BUFS. If the usage of memory is critical, X the number of buffers can be decreased. However, NUM_BUFS X must be at least 2. In general, the speed of file access can X be expected to improve if the number of buffers is increased X since more of the index file is memory resident. X X Because some indices are always memory resident, and X because the DOS Operating System requires it, it is very X important that all open index files be closed before an X application program terminates. X X As stated earlier, the B-PLUS program has been tested X X Page 9 X X X HUNTER AND ASSOCIATES B-PLUS FILE INDEXING PROGRAM X ------------------------------------------------------------- X X X using Microsoft's Optimizing C Compilers, Versions 4, 4.5 and X 5.0, and Borland's Turbo C, Version 1.0. However, standard K X & R programming guidelines are followed so B-PLUS should be X able to be used with other C Compilers with little X modification. Since B-PLUS is non-recursive, the usage of X stack space does not change dynamically. It is recommend X that the B-PLUS program be complied without stack checking. X For Microsoft C, the /Ox option can be used to maximize speed X and minimize code size. For Turbo C, B-PLUS can be complied X with maximum optimization to minimize the object size and X improve performance. X X X ACKNOWLEDGMENTS AND REFERENCES X X The following reference materials were used and helpful X in writing the B-PLUS program: X X Wirth, Niklaus: X Algorithms + Data Structures = Programs X Prentice Hall (1976) X X Hunt, William James: X The C Toolbox X (Serious C Programming for the IBM PC) X Addison-Wesley (1985) X X X Wirth's book is a standard reference source for binary X trees and data structures. Hunt's C Toolbox contains useful X C programming concepts with carefully constructed programming X examples. X X X USING THE BPLUS ROUTINES X X The BPLUS.C routines must be compiled and the object X file (BPLUS.OBJ) loaded with programs that use the B_PLUS X toolkit. Several sample C programs have been included which X illustrate how to use the BPLUS Toolkit. These programs X should be compiled and run to make sure your copy of the X program is correct. X X X A SPECIAL NOTE REGARDING MICROSOFT C COMPILERS X X The Microsoft C library routines are different for X Version 4.0 and for Version 5.0 and Quick C. In particular, X the memcpy routine in Version 5.0 does not check that X overlapping bytes are copied correctly. Consequently, the X X Page 10 X X X HUNTER AND ASSOCIATES B-PLUS FILE INDEXING PROGRAM X ------------------------------------------------------------- X X X memmove routine must be used for Version 5.0 and for Quick C. X A macro is included in BLUS.C which makes this substitution. X This macro MUST be used (remove the comment delimiters) for X these versions of the Microsoft compilers. X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X Page 11 X\032 SHAR_EOF echo "File bplus.doc is complete" chmod 0644 bplus.doc || echo "restore of bplus.doc fails" echo "x - extracting bplus.h (Text)" sed 's/^X//' << 'SHAR_EOF' > bplus.h && X X/* bplus.h - data structures and constants */ X X X/* the next two lines delete the 'pascal' and 'cdecl' keywords X to make the source compile on an ANSI compiler. X*/ X#ifndef MSC /* not Microsoft C */ X#define cdecl X#define pascal X#endif /* MSC */ X X/* the following checks are to define things frequently not in X UNIX compilers, since they are recent ANSI additions. X*/ X#ifndef SEEK_SET X#define SEEK_SET 0 X#endif X#ifndef O_BINARY X#define O_BINARY 0 X#endif X X#if defined(ANSI_C) | defined(MSC) | defined(M_XENIX) X#define Param(x) x X#else X#define Param(x) () X#endif /* ANSI or PCC style decls */ X X#define IX_OK 1 X#define IX_FAIL 0 X#define EOIX (-2) X#define MAXKEY 100 X#define NUM_BUFS 8 X#define MAX_LEVELS 8 X#define IXB_SIZE 1024 X#define IXB_SPACE (IXB_SIZE - sizeof(short) - sizeof(long) * 2) X Xtypedef long RECPOS; X Xtypedef struct /* entry structure in index */ X { RECPOS idxptr; /* points to lower index level */ X RECPOS recptr; /* points to data record */ X char key[MAXKEY]; /* start of record key */ X } ENTRY; X Xtypedef struct /* index record format */ X { RECPOS brec; /* position in index file */ X /* or location of next free block */ X short bend; /* first unused block location */ X RECPOS p0; /* points to next level */ X char entries[IXB_SPACE]; /* here are the key entries */ X } BLOCK; X Xtypedef struct /* disk file info */ X { RECPOS ff; /* location of first free block */ X short nl; /* number of index levels */ X } IX_DISK; X Xtypedef struct /* memory buffer pool of indx blks */ X { short dirty; /* true if changed */ X short handle; /* index file handle */ X short count; /* number of times referenced */ X BLOCK mb; X } MEMBLOCK; X Xtypedef struct X { MEMBLOCK cache [ NUM_BUFS ]; X } IX_BUFFER; X Xtypedef struct /* in-memory index descriptor */ X { short ixfile; X short level; /* level in btree */ X short duplicate; /* no duplicate keys if 0 */ X struct X { RECPOS cblock; /* position in index file */ X short coffset; /* current offset within block */ X } pos [ MAX_LEVELS ]; X BLOCK root; /* root index record */ X IX_DISK dx; X } IX_DESC; X X/* a few system procedure types here */ Xextern long filelength(), lseek(), tell(); Xextern char *mktemp(); Xextern int read(), write(), open(), close(); Xextern void exit(); X X/* ================ external interface ================ */ Xint cdecl open_index Param((char *,IX_DESC *, int)); Xint cdecl close_index Param((IX_DESC *)); Xint cdecl make_index Param((char *,IX_DESC *, int)); Xint cdecl first_key Param((IX_DESC *)); Xint cdecl last_key Param((IX_DESC *)); Xint cdecl next_key Param((ENTRY *, IX_DESC *)); Xint cdecl prev_key Param((ENTRY *, IX_DESC *)); Xint cdecl find_key Param((ENTRY *, IX_DESC *)); Xint cdecl add_key Param((ENTRY *, IX_DESC *)); Xint cdecl locate_key Param((ENTRY *, IX_DESC *)); Xint cdecl delete_key Param((ENTRY *, IX_DESC *)); Xint cdecl find_exact Param((ENTRY *, IX_DESC *)); Xint cdecl find_1stkey Param((ENTRY *, IX_DESC *)); Xvoid clearkey Param((ENTRY *)); SHAR_EOF chmod 0644 bplus.h || echo "restore of bplus.h fails" echo "x - extracting makefile (Text)" sed 's/^X//' << 'SHAR_EOF' > makefile && X# makefile for UNIX bplus test X X# Bplus routines, utility routines, and the whole library XPLUSLIB = bplus.o find1stkey.o XUTILLIB = filelen.o memmove.o XLIB = $(PLUSLIB) $(UTILLIB) XLIBSRC = bplus.c find1stkey.c filelen.c memmove.c XDOCFILES = read.me bplus.doc X XTESTS = utest3 utest4 zoodb XTESTOBJ = utest3.o utest4.o zoodb.o XTESTSRC = utest3.c utest4.c s.zoodb.c X XDEMOSRC = ndb.c lookup.c names.c X X# C flags, load flags, misc (shared) flags XCFLAGS = -O XLFLAGS = XMFLAGS = XCC = cc $(CFLAGS) ${MFLAGS} X X# archive for distribution XBPLUSARC = bplus11a.arc X X# info to install the libraries XLIBX286 = /lib XLIBX386 = /lib/386 XLIBDOS = /usr/lib/dos X Xall: $(TESTS) X X$(TESTS): bplus.a $$@.o X $(CC) $(LFLAGS) -o $@ $@.o bplus.a X X$(PLUSLIB): bplus.h X$(TESTOBJ): bplus.h X Xbplus.a: $(LIB) X ar rv bplus.a $? X ranlib bplus.a X X# special section to make libraries under Xenix386, for Xenix286 and MS-DOS Xextlib: $(LIBSRC) bplus.h X @echo "Making x286 small" X cc -c -O -M2 $(LIBSRC) X ar rv Sbplus286.a $(LIB) X ranlib Sbplus286.a X @echo "Making x286 large" X cc -c -O -M2l $(LIBSRC) X ar rv Lbplus286.a $(LIB) X ranlib Lbplus286.a X @echo "Making DOS small" X cc -c -O -M2 -dos $(LIBSRC) X ar rv SbplusDOS.a $(LIB) X ranlib SbplusDOS.a X @echo "Making DOS large" X cc -c -O -M2l -dos $(LIBSRC) X ar rv LbplusDOS.a $(LIB) X ranlib LbplusDOS.a X @echo "Making 386 version" X cc -c -O $(LIBSRC) X ar rv bplus.a $(LIB) X touch extlib X X# install the special libraries - run su LIBOWNER Xlibinstall: extlib X cp Slib286.a $(LIBX286)/Slibbplus.a X cp Llib286.a $(LIBX286)/Llibbplus.a X cp bplus.a $(LIBX386)/Slibbplus.a X cp SlibDOS.a $(LIBDOS)/Slibbplus.a X cp LlibDOS.a $(LIBDOS)/Llibbplus.a X X.SUFFIXES: .exe X X# special inference rules, DOS and SCCS X.o.exe: X $(CC) -o $@ -dos $< bplus.a X.c~.o: X get -s $? X $(CC) -c $*.c && rm -f $*.c X.c.o: X $(CC) -c $*.c X XMISCSAVE = $(DOCFILES) bplus.h makefile XSAVELIST = $(LIBSRC) $(MISCSAVE) $(DEMOSRC) Xbackup: bplus.zoo Xbplus.zoo: $(SAVELIST) s.bplus.c X zoo aunPP bplus $? X Xbplusarc: $(BPLUSARC) X X$(BPLUSARC): $(SAVELIST) X rm -f $(BPLUSARC) X arc a bplus $? X Xshar: bplus.shar Xbplus.shar: $(SAVELIST) X $${SHAR:-shar} $(SAVELIST) > bplus.shar X X# cleanup after backup Xclean: backup X rm $(SAVELIST) X X# phone number database software X XPDBX = ndb lookup Xpdbx: $(PDBX) lookup.exe X X$(PDBX): bplus.a $$@.o X cc $(MFLAGS) $(LFLAGS) -o $@ $@.o bplus.a X Xlookup.exe: lookup.c X $(CC) $(LFLAGS) -dos -o lookup.exe lookup.c bplus.a X Xndb.o lookup.o: bplus.h SHAR_EOF chmod 0644 makefile || echo "restore of makefile fails" echo "x - extracting ndb.c (Text)" sed 's/^X//' << 'SHAR_EOF' > ndb.c && X/***************************************************************** X | ndb - build Name DataBase X |---------------------------------------------------------------- X | takes data from a raw listing of names, phone numbers, room X | numbers, etc and builds an index by name. Name is in upper X | case, in the first 26 columns. The index points into the X | original text file. X ****************************************************************/ X X#include <stdio.h> X#include "bplus.h" X#define EOS ((unsigned char) 0) X XIX_DESC namesfile; XENTRY index; XFILE *namedata, *fopen(); X Xmain() { X int status, n, nnames; X int names_count = 0, keys_count = 0; X long ftell(); X char namebuf[132], names[6][30]; X X status = make_index("phdata.idx", &namesfile, 1); X if (status != IX_OK) { X printf("Can't create 'phdata.idx' file\n"); X exit(1); X } X namedata = fopen("phdata", "r"); X if (namedata == NULL) { X printf("Can't find 'phdata' file\n"); X exit(1); X } X X /* read loop */ X while (index.recptr = ftell(namedata), X fgets(namebuf, 132, namedata) != NULL) X { X namebuf[26] = EOS; X ++names_count; X nnames = sscanf(namebuf, "%s%s%s%s%s%s", X names[0], names[1], names[2], X names[3], names[4], names[5]); X for (n=0; n<nnames; ++n) { X if (strlen(names[n]) < 2) continue; X ++keys_count; X clearkey(&index); X strcpy(index.key, names[n]); X status = add_key(&index, &namesfile); X if (status != IX_OK) { X printf("Bad status on add of key '%s'\n", names[n]); X close_index(&namesfile); X exit(1); X } X } X } X X /* wrapup */ X printf("Build complete, %d names, %d keys\n", X names_count, keys_count); X close_index(&namesfile); X fclose(namedata); X} SHAR_EOF chmod 0644 ndb.c || echo "restore of ndb.c fails" echo "x - extracting lookup.c (Text)" sed 's/^X//' << 'SHAR_EOF' > lookup.c && X/***************************************************************** X | lookup - lookup a name in the names data base X |---------------------------------------------------------------- X | finds all occurrences of the name(s) on the command line and X | displays the matching records from the original text file. X | Locates records with name fields starting with the string X | argument. "lookup j" finds all records with a first or last X | name starting with j. X ****************************************************************/ X X#include <stdio.h> X#include <ctype.h> X#include "bplus.h" X XIX_DESC namesfile; XENTRY index; XFILE *namedata, *fopen(); X Xmain(argc, argv) Xint argc; Xchar *argv[]; X{ X int n, status, keylen; X char uckey[MAXKEY]; X int ch, chindex; X X status = open_index("phdata.idx", &namesfile, 1); X namedata = fopen("phdata", "r"); X X for (n = 1; n < argc; ++n) { X for (chindex = 0; ch = argv[n][chindex]; ++chindex) { X if (islower(ch)) ch = toupper(ch); X uckey[chindex] = ch; X } X X clearkey(&index); X strcpy(index.key, uckey); X status = locate_key(&index, &namesfile); X if (status == EOIX) continue; X keylen = strlen(uckey); X X dumprec(index.recptr); X while (next_key(&index, &namesfile) == IX_OK && X (strncmp(index.key, uckey, keylen) == 0)) X { X dumprec(index.recptr); X } X } X X close_index(&namesfile); X} X Xdumprec(seekloc) Xlong seekloc; X{ X char line[132]; X X fseek(namedata, seekloc, 0); X fgets(line, 132, namedata); X fputs(line, stdout); X} SHAR_EOF chmod 0644 lookup.c || echo "restore of lookup.c fails" echo "x - extracting names.c (Text)" sed 's/^X//' << 'SHAR_EOF' > names.c && X/*******************************************************************/ X/* NAMES.C */ X/* */ X/* This example shows how easy it is to write a program for an */ X/* online address book using the B-PLUS file indexing toolkit. */ X/* The program creates a file of names, addresses, and telephone */ X/* numbers. A record is displayed on the screen by entering part */ X/* or all of the name. Although the program is usefully as written */ X/* it has purposely been kept simple. You may want to add new */ X/* features to the progran. */ X/* */ X/********************************************************************/ X X X#include <stdio.h> X#include <string.h> X#include "bplus.h" X X Xtypedef struct /* Here is the address record definition */ X { X char lastname[16]; /* last name */ X char firstname[16]; /* first name */ X char address1[31]; /* first address line */ X char address2[31]; /* second address line */ X char city[21]; /* the city */ X char state[3]; /* the state */ X char zipcode[6]; /* postal zip code */ X char telephone[14]; /* telephone number */ X } ADDRESS; X X XIX_DESC nameindex; /* index file variable */ XFILE *namefile; /* data file pointer */ XADDRESS person; /* data record variable */ X X Xvoid openfiles(void); Xvoid closefiles(void); Xint addrecord(void); Xvoid getstring(char*, int); Xvoid newaddress(void); Xvoid printname(ENTRY*); Xvoid getname(void); Xvoid nextname(void); Xvoid listnames(void); X X Xvoid openfiles() X /* If the file NAMES.DAT already exist, open the index and data */ X /* file. Otherwise, these files are created. */ X { X if ((namefile = fopen("names.dat","r+")) != NULL) X open_index("names.idx", &nameindex, 1); /* open index file */ X else X { X namefile = fopen("names.dat","w+"); /* create data file */ X if (namefile == NULL) X { X printf("Unable to open namefile\n"); X exit(1); X } X make_index("names.idx", &nameindex, 1); /* creat index file */ X } /* allow duplicate keys */ X } /* openfiles */ X X Xvoid closefiles() X /* close all files and exit */ X { X fclose(namefile); X close_index(&nameindex); X exit(0); X } /* closefiles */ X X Xint addrecord() X /* add a new address to the data file - add index to index file */ X { X ENTRY ee; X char name[32]; X int ret; X ret = fseek(namefile, 0L, SEEK_END); /* seek to end of datafile */ X if (ret == 0) X { X strcpy(ee.key, person.lastname); /* key is last name followed */ X strcat(ee.key, person.firstname); /* first name. Capitalize */ X strupr(ee.key); /* and copy to ee.key. */ X ee.recptr = ftell(namefile); /* get position in datafile */ X if (ee.recptr != -1L) X { X if (add_key(&ee, &nameindex) == IX_OK) /* add key to index */ X { X fwrite(&person,sizeof(person),1,namefile); /* add address */ X return (IX_OK); X } X } X } X else printf("Seek error - data file"); X return (IX_FAIL); X } /* addrecord */ X X Xvoid getstring(mes, length) X char *mes; X int length; X /* input a string and check that it is not too long */ X { X char message[80]; X gets(message); X if (strlen(message) > length) message[length] = '\0'; X strcpy(mes,message); X } /* getstring */ X X Xvoid newaddress() X /* add new address records */ X { X while (1) X { X printf("\n\nLast Name : "); X getstring(person.lastname,15); X if ( strlen(person.lastname) > 0) /* quit if no last name */ X { X printf("First Name : "); X getstring(person.firstname,15); X printf("Address Line 1 : "); X getstring(person.address1,30); X printf("Address Line 2 : "); X getstring(person.address2,30); X printf("City : "); X getstring(person.city,20); X printf("State : "); X getstring(person.state,2); X printf("Zip Code : "); X getstring(person.zipcode,5); X printf("Telephone : "); X getstring(person.telephone,13); X addrecord(); /* update data and index files */ X printf("\n"); X } X else return ; X } X } /* newaddress */ X X Xvoid printname(e) X ENTRY *e; X /* retrieve a data record and print it on the screen */ X { X int ret; X X /* seek to the record address stored in ENTRY e->recptr */ X ret = fseek(namefile, e->recptr, SEEK_SET); X X if (ret == 0) /* if OK read the record and display */ X { X fread(&person,sizeof(person),1,namefile); X printf("\n\n %s %s" , person.firstname,person.lastname); X printf("\n %s", person.address1); X if (strlen(person.address2) > 0) X printf("\n %s", person.address2); X printf("\n %s, %s %s", person.city,person.state,person.zipcode); X printf("\n %s\n", person.telephone); X } X else printf("Seek error - data file"); X } /* printname */ X X Xvoid getname() X /* Get an address record by entering part or all of name */ X /* Enter last name first then first name with no spaces */ X { X ENTRY ee; X printf("\n\nEnter name: "); X gets(ee.key); X X /* make all upper case letters and copy to ee.key */ X strupr(ee.key); X X /* use locate_key instead of find_key so an exact match not required */ X if (locate_key(&ee, &nameindex) != EOIX) printname(&ee); X else printf("No key this large in index file\n"); X } /* getname */ X X Xvoid nextname() X /* display the next address in the address file */ X { X ENTRY ee; X if (next_key(&ee, &nameindex) == IX_OK) printname(&ee); X else printf("\nEnd of index file\n"); X } /* nextname */ X X Xvoid listnames() X /* list all the names in the address file */ X { X ENTRY ee; X first_key(&nameindex); X while (next_key(&ee, &nameindex) == IX_OK) printname(&ee); X } /* listnames */ X Xmain() X /* Here is the main program loop */ X { X char cmd; X int done; X done = 0; X openfiles(); X do X { X printf("\nCommand: A (Add Name), F (Find), N (Next), L (List), Q (Quit): "); X cmd = toupper(getche()); X switch (cmd) X { X case 'A': newaddress(); break; /* add a name to address file */ X case 'F': getname(); break; /* find an address */ X case 'N': nextname(); break; /* display next address in file */ X case 'L': listnames(); break; /* display all addresses */ X case 'Q': closefiles(); /* quit and close files */ X } X } X while (!done); X } /* main */ X X\032 SHAR_EOF chmod 0644 names.c || echo "restore of names.c fails" rm -f s2_seq_.tmp echo "You have unpacked the last part" exit 0 exit 0 # Just in case... -- Kent Landfield INTERNET: kent@sparky.IMD.Sterling.COM Sterling Software, IMD UUCP: uunet!sparky!kent Phone: (402) 291-8300 FAX: (402) 291-4362 Please send comp.sources.misc-related mail to kent@uunet.uu.net.