mjr@hussar.dco.dec.com (Marcus J. Ranum) (05/24/91)
Archive-name: games/mud/untermud/1991-05-23 Archive: decuac.dec.com:/pub/umud.tar.Z [192.5.214.1] Original-posting-by: mjr@hussar.dco.dec.com (Marcus J. Ranum) Original-subject: Re: mud data storage and retrieval methods Reposted-by: emv@msen.com (Edward Vielmetti, MSEN) quasar@bellcore.com writes: >What database and db techniques are used by popular mud's? At the moment >I have this toy system which works on ndbm (well, it's free, and I don't >have to worry about file systems, b-trees, and unpleasant stuff like that). Ignoring the question of "popular", I'll describe how UnterMUD does caching and database access. :) There are a set of "generic" database access functions defined, namely (in terms of function): get object, put object, check-if-object-exists, delete object prepare to traverse entire DB, get next object, end DB traverse, sync everything. The server code is written to use generic cache_get() cache_put() and so forth, using exactly the same interface, so in theory you could link the MUD directly against the db-layer and run without a cache. Caching is done with a hashed cache in two tiers - the active and the inactive tiers. It's implemented as a hashed array of linked lists, pretty much as you'd expect. The two tiers are used to provide not only a reduction of search time but to provide locking. In other words, if the cache runs out of room for objects, it WILL NOT drop something off the active tier, no matter what, since something may be holding a pointer to that object. Everything in UnterMUD is done with pointer-passing. If the object's not in the active tier, a search is made through the larger inactive tier. If it's still not found, the last object (oldest) in the inactive tier is dropped, and the new object is loaded from the database, and put at the head of the active tier. If the inactive tier is utterly drained, the caching layer is pretty well swamped, so it just allocates more cache holders. The net effect of this is that performance won't degrade much even if rooms get large - the server will reach a steady state size. When Russ had TinyHELL I under UnterMUD, the server took ~512K of core, but most of that was probably due to the fact that people walked around in the very crowded limbo, which stretched the cache. The cache doesn't shrink back down again, though I suppose it could. Whenever a call through the I/O loop (select loop) is finished, the whole cache is reset, which is done simply by moving the active tier to the head of the inactive tier, which effects an LRU. This is MUCH more efficient than a locking scheme like TeenyMUD uses, since it doesn't have to search the whole cache for locks to clear, nor does the programmer have to mess with locks. There's a provision for syncing stuff at intervals. Underneath the cache is the actual permanent DB layer, which, as I mentioned earlier, uses exactly the same call interface as the cache layer. There are currently 3 different DB layers that can be plugged in, depending on what you want, each of which has different advantages or disadvantages. They are as follows: 1) hashed directories - one object stored per UNIX filename 2) hashed chunks - objects stored in a large file of logical blocks, indexed with a dbm/ndbm database 3) hashed gdbm file - objects stored in a gdbm file, which also handles all the blocking internally someday) RMS-based storage for VMS machines The actual storage routines are pretty simple, since every object is indexed based on its object-ID (a string of some length less than 128char) In the case of hashed directories, the object-ID is used to generate the pathname, in the case of the dbm version, it is used as the hash key for a small struct consisting of a size and offset tuple. In the case of the gdbm version, it's used as the key, and gdbm takes care of the rest. The original code for doing this stuff evolved from UberMUD, which used my b+tree code, because I needed to be able to do linear traversals across attributes (but that's another story). This has proven (IMHO) to be a very clean and effective architecture, since it's easy to port, and easy to add efficiency hacks where needed. (IE: if you have Sybase on your machine, write a db layer that calls Sybase) - because things are brought in only as needed, and can be flushed out when they're no longer needed the server tends to reach a steady state as far as resources consumed. UnterMUD source is available for anonymous FTP from decuac.dec.com, in ~ftp/pub - the current version is not "released" because it is still under change. Official release should be soon, there are a few little code changes still to make first. mjr. -- comp.archives file verification decuac.dec.com -rw-r--r-- 1 388 system 154363 May 4 21:34 /pub/umud.tar.Z found untermud ok decuac.dec.com:/pub/umud.tar.Z