[net.news.b] corrupted dbm files?

mp@allegra.UUCP (Mark Plotnick) (06/03/85)

Once we get a few weeks' worth of news records in the history* files,
we begin to get corrupted dbm files.  Most frequently this results in
duplicate articles because the record of a specific article magically
disappears.  Sometimes it results in lost articles because the pagbuf
block gets so corrupted (typically one of the set of offsets at the
beginning of the pagbuf block is 0) that the fetch() that rnews does
causes the dbm package to call abort()!

Adding tracing information to the dbm routines shows that dbm splits
blocks quite frequently; I think the problem must be in concurrent
updates to the dbm files.  We have 3 major feeds, and quite frequently
have several rnews's running at once, sometimes processing the same
Message-ID.  I shudder to think what damage expire does during its
weekly 12 hours of pillaging.  I've now modified the dbm package not to do
caching on the pag and dir files, and changed inews to lock the dbm
file when doing fetches and stores (it also supplies its own abort()
routine, which just calls log()).

It's too early to tell if the changes make a difference, but they can't
hurt.  Has anyone else done work on these problems?
	Mark Plotnick
	allegra!mp