[comp.unix.programmer] ndbm won't cut it, what now?

lindner@cs.umn.edu (Paul Lindner) (11/11/90)

Does anyone know of any public domain code for doing relational database
stuff?  Basically my requirements are variable-length records,
searching on different keys in the database, and multiple concurrent
accesses to the same database.  My prototype just used a directory
structure and ASCII files, but I now need better performance and the
ability to search on multiple fields.  I've looked at the dbm and ndbm
libraries and found them to be lacking (i.e. poor multi-user access,
only one key per database, and the limit on record size).  I imagine I
could do what NIS (aka YP) does and keep multiple database files, but I
would really rather not.  I also don't want to have to use Ingres,
Sybase, or any other large database.  I want my code to be standalone
and free of any legal entanglements.

So basically if you've got some database code lying around that you'd
like to share, by all means let me know!  Otherwise I'd appreciate any
references to a practical text on databases.

Thanks!
 
-- 
Paul Lindner, Univ. of MN   \ Microcomputer /  Pauls Law: You can't
IT Sun dude, & UofM ACM pres \ Workstation / fall off the floor.
lindner@boombox.micro.umn.edu \ Networks  / {...!rutgers!umn-cs!lindner}
     |   |  |  |  | | | | |||||\ Center  /||||| | | | |  |  |  |   |

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (11/15/90)

  The secret to having multiple keys in a dbm database is to use the
first byte as the key type. It's that easy, and works very nicely
thanks.
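
For example, a sketch of the idea using the ndbm calls (the prefix
letters and the helper routine here are made up for illustration, not
taken from any particular package):

	#include <ndbm.h>
	#include <string.h>

	/*
	 * Store a record under a typed key: the first byte of the
	 * stored key says which key space the rest of the bytes
	 * belong to ('N' for name, 'S' for SSN, or whatever).
	 */
	int
	store_typed(db, type, keystr, val)
		DBM *db;
		int type;
		char *keystr, *val;
	{
		char buf[256];
		datum key, content;

		buf[0] = type;
		strcpy(buf + 1, keystr);
		key.dptr = buf;
		key.dsize = 1 + strlen(keystr);
		content.dptr = val;
		content.dsize = strlen(val);
		return (dbm_store(db, key, content, DBM_REPLACE));
	}

A lookup just builds the same one-byte-prefixed key and calls
dbm_fetch().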

  I did some limited testing of dbm and gdbm, and found that while gdbm
kept the file size way down, the build time was a lot longer.
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

rang@cs.wisc.edu (Anton Rang) (11/16/90)

In article <2300@sixhub.UUCP> davidsen@sixhub.UUCP (Wm E. Davidsen Jr) writes:
>  The secret to having multiple keys in a dbm database is to use the
>first byte as the key type. It's that easy, and works very nicely
>thanks.

  Actually, I thought the original poster meant 'multiple keys' in the
sense of one record having several access paths to it.  (For instance,
being able to retrieve employees by either last name or SSN.)  I don't
know of any *dbm package which does this; it's not designed for it.

  Dbm/ndbm/gdbm works fine if you only have one key per record and
don't need the records to be ordered; for slightly more complicated
things, there's probably public-domain code floating around.  For a
real relational DB, unless one can use University INGRES, I suspect
one would need to go the commercial route.

	Anton
   
+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+

johnl@iecc.cambridge.ma.us (John R. Levine) (11/16/90)

In article <RANG.90Nov15214357@nexus.cs.wisc.edu> rang@cs.wisc.edu (Anton Rang) writes:
>In article <2300@sixhub.UUCP> davidsen@sixhub.UUCP (Wm E. Davidsen Jr) writes:
>>  The secret to having multiple keys in a dbm database is to use the
>>first byte as the key type. It's that easy, and works very nicely thanks.
>
>  Actually, I thought the original poster meant 'multiple keys' in the
>sense of one record having several access paths to it.

You can fake it by making one key the primary key and storing the actual data
in its record.  The secondary keys point to the primary key.  To keep sizes
down you might want to make the primary key something small and boring like a
three or four byte serial number. 

Logically each key type is in a separate file, but you can combine them into
one big dbm file using the prefix byte trick.  It's gross, but no grosser
than what real data bases do every day.
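
A sketch combining both ideas with ndbm (the '#'/'N'/'S' prefixes, the
field choices, and the helper itself are inventions of this example):

	#include <ndbm.h>
	#include <string.h>

	/*
	 * Store an employee record under a serial-number primary key,
	 * plus name and SSN secondary keys that map back to that
	 * serial.  Prefix bytes keep the key spaces apart.
	 */
	int
	add_employee(db, serial, name, ssn, record)
		DBM *db;
		long serial;
		char *name, *ssn, *record;
	{
		char buf[256];
		datum key, content;

		/* primary record: "#<serial>" -> the data itself */
		buf[0] = '#';
		bcopy((char *)&serial, buf + 1, sizeof(serial));
		key.dptr = buf;
		key.dsize = 1 + sizeof(serial);
		content.dptr = record;
		content.dsize = strlen(record);
		if (dbm_store(db, key, content, DBM_INSERT) != 0)
			return (-1);

		/* secondary keys: "N<name>" and "S<ssn>" -> the serial */
		content.dptr = (char *)&serial;
		content.dsize = sizeof(serial);
		buf[0] = 'N';
		strcpy(buf + 1, name);
		key.dsize = 1 + strlen(name);
		if (dbm_store(db, key, content, DBM_INSERT) != 0)
			return (-1);
		buf[0] = 'S';
		strcpy(buf + 1, ssn);
		key.dsize = 1 + strlen(ssn);
		return (dbm_store(db, key, content, DBM_INSERT));
	}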

You may need to keep copies of the secondary keys in the primary record if
you want to be able to delete the record or change the primary key.  A
perfectly workable alternative is to delete the primary key record when you
delete the record, be prepared for secondary keys to be left dangling, and
delete them in a batch run when you scan the file and pick out the dead wood.

If you do anything like this, the first thing you should write is a
consistency checker that reports secondary keys with no primary keys
and (if it's an error) vice versa.
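
A sketch of such a checker, assuming the conventions of the example
above (every non-`#' key's datum is a sizeof(long)-byte serial number
naming its primary record; those conventions are this example's
assumptions, not anything dbm enforces):

	#include <stdio.h>
	#include <ndbm.h>

	void
	check(db)
		DBM *db;
	{
		datum key, serial, pkey, pdat;
		char buf[1 + sizeof(long)];

		for (key = dbm_firstkey(db); key.dptr != NULL;
		     key = dbm_nextkey(db)) {
			if (key.dptr[0] == '#')
				continue;		/* a primary record */
			serial = dbm_fetch(db, key);
			buf[0] = '#';
			bcopy(serial.dptr, buf + 1, sizeof(long));
			pkey.dptr = buf;
			pkey.dsize = 1 + sizeof(long);
			pdat = dbm_fetch(db, pkey);
			if (pdat.dptr == NULL)
				printf("dangling secondary key %.*s\n",
				    key.dsize, key.dptr);
		}
	}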

-- 
John R. Levine, IECC, POB 349, Cambridge MA 02238, +1 617 864 9650
johnl@iecc.cambridge.ma.us, {ima|spdcc|world}!iecc!johnl
"Typically supercomputers use a single microprocessor." -Boston Globe

chris@mimsy.umd.edu (Chris Torek) (11/19/90)

>In article <RANG.90Nov15214357@nexus.cs.wisc.edu> rang@cs.wisc.edu
>(Anton Rang) writes:
>>... I thought the original poster meant 'multiple keys' in the
>>sense of one record having several access paths to it.

(This was the impression I got as well.)

In article <1990Nov16.145542.22326@iecc.cambridge.ma.us>
johnl@iecc.cambridge.ma.us (John R. Levine) writes:
>You can fake it by making one key the primary key and storing the actual data
>in its record.  The secondary keys point to the primary key.  To keep sizes
>down you might want to make the primary key something small and boring like a
>three or four byte serial number. 
>
>Logically each key type is in a separate file, but you can combine them into
>one big dbm file using the prefix byte trick.  It's gross, but no grosser
>than what real data bases do every day.

You also need to store the `next serial number', so you have to reserve a
key for this as well.  This is indeed rather gross.
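
For instance (a sketch; the reserved key byte and the counter format
are arbitrary choices of this example):

	/* Sketch: keep the serial counter under a reserved 1-byte key. */
	long
	next_serial(db)
		DBM *db;
	{
		static char ctrkey = '\001';	/* reserved key byte */
		datum key, val;
		long n = 0;

		key.dptr = &ctrkey;
		key.dsize = 1;
		val = dbm_fetch(db, key);
		if (val.dptr != NULL)
			bcopy(val.dptr, (char *)&n, sizeof(n));
		n++;
		val.dptr = (char *)&n;
		val.dsize = sizeof(n);
		(void) dbm_store(db, key, val, DBM_REPLACE);
		return (n);
	}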

A few years ago I thought about this problem for a while, and came up with
a different idea based on the same hashing technique.  To explain it I need
to describe the idea behind dbm.

The way dbm works is to take each `key' string and turn it into a
32-bit number, something like a CRC-32 or a checksum.  This number is
not necessarily unique (two different strings will produce the same
32-bit value) but collisions should be rare, if the string-to-number
algorithm is chosen carefully.  In any case, the same string always
produces the same number.
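
For concreteness, a function of roughly that flavor might look like the
following.  (This is only an illustration; it is *not* the hash dbm
actually uses.)

	long
	calchash(key)
		datum key;
	{
		register long h = 0;
		register int i;

		/* deterministic: the same bytes always give the same number */
		for (i = 0; i < key.dsize; i++)
			h = h * 31 + (key.dptr[i] & 0xff);
		return (h);
	}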

Now to store a `datum', we turn its `key' into our 32-bit number.  This
number is taken as the block number in a file of blocks (typically 4096
bytes each).  To avoid starting out with 2^32 blocks, we ignore `some'
of the bits in this number.  Initially we ignore *all* the bits,
storing all key/datum pairs in block 0; as the database grows, we
ignore fewer bits.  We can leave a binary decision tree behind as a
record of when we stopped ignoring each bit.  This `tree' is simply a
string of bits, 0 where we are still ignoring, and 1 where we are
not.  Given a key hash `h', we start by saying `ignore all the bits',
then we ignore fewer bits as long as bit (h&mask)+mask is set:

	/*
	 * NB: the following is `off the top of my head'.
	 * It may contain errors.  `map' is the string of decision
	 * bits described above; bit(map, n) tests bit n of it.
	 */
	hash = calchash(key);
	for (hmask = 0; bit(map, (hash & hmask) + hmask) != 0;
	     hmask = (hmask << 1) | 1)
		/* void */;

Now if the key is anywhere in the database, it is in the block whose
number is (hash & hmask).  The datum for that key is stored immediately
after the key, so all we have to do is read that block and march
through every even-numbered entry, comparing against the key.  If there
is a match, return the corresponding odd-numbered entry.
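
In code, that scan might look like this (the in-core layout here is
invented for the example; the real block keeps sizes and offsets packed
into the block itself):

	struct ent {			/* one key or datum in a block */
		char	*ptr;
		int	len;
	};

	datum
	blocksearch(ent, nent, key)
		struct ent *ent;
		int nent;
		datum key;
	{
		datum d;
		int i;

		for (i = 0; i < nent; i += 2)	/* even entries are keys */
			if (ent[i].len == key.dsize &&
			    bcmp(ent[i].ptr, key.dptr, key.dsize) == 0) {
				d.dptr = ent[i + 1].ptr;	/* its datum */
				d.dsize = ent[i + 1].len;
				return (d);
			}
		d.dptr = NULL;			/* no such key here */
		d.dsize = 0;
		return (d);
	}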

Under this implementation, storing key=value pairs such as

	mimsy=128.8.128.8 mimsy.umd.edu
	mimsy.umd.edu=128.8.128.8 mimsy.umd.edu
	mimsy.cs.umd.edu=128.8.128.8 mimsy.umd.edu

(a name to number-and-official-name mapping as might be found in a host
number database) requires storing `128.8.128.8 mimsy.umd.edu' three
times---51 bytes, at 17 bytes each, counting the host number as 4 bytes.
Using John Levine's technique we might instead say

	Nmimsy=17
	Nmimsy.umd.edu=17
	Nmimsy.cs.umd.edu=17
	#17=128.8.128.8 mimsy.umd.edu

so that we can store `128.8.128.8 mimsy.umd.edu' only once, saving 34
bytes (and costing 1+4 per key plus 5 for the intermediate key; we wind
up with a net savings of 14 bytes---minus 2 more hidden inside the dbm
implementation, used when storing that intermediate key).  But this
means that to look up a name we have to say:

	*buf = 'N';			/* build the name-to-serial key */
	strcpy(buf + 1, name);
	key.dptr = buf;
	key.dsize = strlen(buf);
	datum = fetch(key);
	if (datum.dptr != NULL) {
		/* got the serial; now fetch the `#serial' record */
		*buf = '#';
		bcopy(datum.dptr, buf + 1, sizeof(int));
		key.dsize = 1 + sizeof(int);
		datum = fetch(key);
		if (datum.dptr == NULL)
			panic("lost target");
		printf("officially, %s = %.*s\n",
		    name, datum.dsize, datum.dptr);
	} else
		printf("%s does not exist\n", name);

There *is* another way.

Instead of storing keys and data in pairs, we can store each
individually.  In other words, we can dissociate them---put each in its
own block.  All we have to do is somehow tie one to the other
(otherwise the database is no good!).  The 32-bit hash value is almost
good enough, but not quite: two different strings can produce the same
hash.  But what if we augment it?

Instead of viewing the database as a series of `key/datum' pairs,
suppose we view it as a collection of `items'.  Each item has three
attributes:

 A. A number that is unique to this item within its block.  (Blocks
    are typically 4096 bytes or so, so there cannot possibly be more
    than 4096 or so items; a 16-bit number suffices.)  I called this
    an `index' for want of a better name; this particular index is
    the item's `in index'.  (`Key' might be a better name in some
    other context :-) .)  This exists so that the item can be a datum.

 B. A `link' record consisting of a 32-bit value and another index.
    This is only used if the item is considered a `key'.  The index
    here is an `out index'; it is zero if the item is not acting as
    a key.  The 32-bit value is an `out hash'.

 C. A `link count' telling how many times this item is a key or datum
    (it can only be a key once).
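
Put together, an item header along these lines might be declared as
follows (my own sketch of a possible layout, not an exact dump of any
real implementation):

	struct item {
		u_short	in_index;	/* A: unique within this block */
		u_short	out_index;	/* B: 0 unless the item is a key */
		u_long	out_hash;	/* B: hash of this key's datum */
		u_short	links;		/* C: times used as key or datum */
		u_short	size;		/* size/offset of the bytes below */
		/* the item's own bytes follow the header */
	};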

Now to store an item, whether as key or datum, we compute its hash and
find its block as above.  Then, if the item is a key (has a nonzero
`out index') and we are replacing its current content (link target),
we `unlink' its current target, store the new datum, and `link' this
key to the item produced here.  If the item is not a key and is being
made one, it simply acquires an `out index'.  If the item is a datum,
we just increment its link count.
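
In outline (a sketch only; store_item() and unlink_item() are stand-ins
for the block and item bookkeeping just described, and store_item() is
assumed to zero out_index when it creates a fresh item):

	void
	dstore(key, dat)
		datum key, dat;
	{
		struct item *k, *d;

		d = store_item(dat);	/* find or create the datum item;
					   this bumps its link count */
		k = store_item(key);	/* likewise for the key's bytes */
		if (k->out_index != 0)	/* replacing: drop the old target */
			unlink_item(k->out_hash, k->out_index);
		k->out_hash = calchash(dat);	/* link key -> datum */
		k->out_index = d->in_index;
	}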

To find a key's datum, we `follow the link': the out hash is the hash
value for the key's datum.  Just as with a key, this (plus the decision
bit tree) tells us exactly which block will contain the key's datum
(which in this case is certain to exist).  Since the item index values
are unique to each block, once we have the correct block all we need to
do is find the item with the same `in index' number that the key has in
its `out index'.
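
A lookup, then, is roughly this (again a sketch; find_block() applies
the mask-and-decision-bit trick, and the find_by_* routines search a
single block; all three are assumed helpers, not real routines):

	struct item *
	dfetch(key)
		datum key;
	{
		struct block *b;
		struct item *k;

		b = find_block(calchash(key));	/* block that would hold the key */
		k = find_by_string(b, key);	/* scan it for the key's bytes */
		if (k == NULL || k->out_index == 0)
			return (NULL);		/* absent, or present only as a datum */
		b = find_block(k->out_hash);	/* block holding the datum */
		return (find_by_index(b, k->out_index));
	}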

Thus, instead of

	Nmimsy=17
	Nmimsy.umd.edu=17
	Nmimsy.cs.umd.edu=17
	#17=128.8.128.8 mimsy.umd.edu

we might have a database that looks like this:

	<block 1>:
	   <in=1, out=2, outhash=3ab197c4, links=1, "mimsy">
	<block 3>:
	   <in=3, out=2, outhash=3ab197c4, links=1, "mimsy.umd.edu">
	   <in=4, out=2, outhash=3ab197c4, links=1, "mimsy.cs.umd.edu">
	<block 4>:
	   <in=2, out=0, outhash=0, links=3, "128.8.128.8 mimsy.umd.edu">

More interestingly, if we create a new database and stick

	hello=hello

into it, we get:

	<block 0>:
	   <in=1, out=1, outhash=195e094c, links=2, "hello">

This key points to itself!

There is no problem with circular references, because a key does not
point to another `key', but rather to an `item'.  This item has two
references (itself as a key, and itself as a datum), not infinitely
many.  Storing hello=world causes the count to go to 1 (when the
as-datum link vanishes); removing `hello' as a key then makes the count
go to 0.

The main problem with this scheme is that, as with John Levine's, it
takes two database accesses to locate an item (though this time the
work is hidden from the programmer).  There is also a problem with the
index values.  As each block fills and splits, a new block is created
with some index values `used up', and it can become difficult to decide
on the next `available in index'.  I got around this by recording
minimum and maximum `in use' index numbers in each block header, so
that typically simply taking the next number suffices.  When the min
and max values reach 1 and 65535 (actually USHRT_MAX), however, it is
necessary to scan the block for a free number.  Also, this technique
imposes a fair amount of overhead (12 bytes per item for a size offset,
link count, two index values, and the out hash, versus 2 bytes per item
in dbm/ndbm for the size offset).
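
A sketch of that allocation (assuming a block header carrying
`minindex' and `maxindex' fields and an index_in_use() test, all of
which are inventions of this example):

	#include <limits.h>

	u_short
	alloc_index(blk)
		struct block *blk;
	{
		u_short i;

		if (blk->maxindex < USHRT_MAX)
			return (++blk->maxindex);	/* the usual case */
		if (blk->minindex > 1)
			return (--blk->minindex);
		for (i = 1; i < USHRT_MAX; i++)		/* rare: scan for a hole */
			if (!index_in_use(blk, i))
				return (i);
		return (0);				/* block is full */
	}
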
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris