cc_is@ux63.bath.ac.uk (Sparry) (03/22/88)
There is a major problem with locking in expire. The code is roughly as
follows, with bits merged from expire.c and funcs.c. The problem is that a
general log error and exit routine is being called from the code which sets
up the lock. This general routine removes the lock. I have shown the code
for System 5.

I have a cron job that applies harder and harder expires when the disk is
getting full. If inews is unbatching at the time that the expire is called,
then the expire fails, but removes the lock. No extra disk space is
available, so a harder expire is started. It finds no lock, so the entire
history database is trashed, often along with 'active'. I then need
20 minutes of CPU time on a badly overloaded machine to rebuild it.

The fix is to create a static that is set when dolock() sets the lock, and
make rmlock() only delete the lock if it is set.

The rest of this article is just extracts of code.

#ifdef SCCSID
static char	*SccsId = "@(#)expire.c	2.57	11/30/87";
#endif /* SCCSID */

main()
{
	.
	.
	dolock();
	.
	.
}

dolock()
{
	int i = 0;
	sprintf(afline,"%s.lock", ACTIVE);
	while (LINK(ACTIVE, afline) < 0 && errno == EEXIST) {
		if (i++ > 5) {
			xerror("Can't get lock for expire");
		}
		sleep(i*2);
	}
}

/* VARARGS1 */
xerror(message, arg1, arg2, arg3)
char *message;
long arg1, arg2, arg3;
{
	char buffer[LBUFLEN];

	fflush(stdout);
	sprintf(buffer, message, arg1, arg2, arg3);
	logerr(buffer);
	xxit(1);
}

xxit(i)
{
	if (i)
		UNLINK(ARTFILE);
	rmlock();
	exit(i);
}

rmlock()
{
	sprintf(bfr, "%s.lock", ACTIVE);
	UNLINK(bfr);
}

-- 
Mr I. W. J. Sparry          Phone:  +44 225 826983
University of Bath          JANET:  cc_is@UK.AC.BATH.UX63
Bath BA2 7AY                UUCP:   uunet!mcvax!ukc!bath63!cc_is (bath63.UUCP)
England                     ARPA:   cc_is%ux63.bath.ac.uk@ucl-cs.arpa
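
The fix described above (a static that dolock() sets and rmlock() checks) could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch to expire.c: the names lock_held, active_path, lock_path and max_retries are hypothetical, plain link()/unlink() stand in for the LINK/UNLINK macros, and dolock() here returns a status instead of calling xerror() itself. Unlike the extract, this version also gives up on any link() error other than EEXIST.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-ins for ACTIVE and the buffer holding "<ACTIVE>.lock". */
static const char *active_path = "active";
static char lock_path[1024];

/* Hypothetical retry limit; the extract hard-codes its own limit. */
static int max_retries = 5;

/* The static from the proposed fix: set only once we created the lock. */
static int lock_held = 0;

/* Try to take the lock. Returns 0 on success, -1 if it could not be taken. */
int dolock(void)
{
    int i = 0;

    snprintf(lock_path, sizeof lock_path, "%s.lock", active_path);
    while (link(active_path, lock_path) < 0) {
        /* Give up on unexpected errors or once the retries run out. */
        if (errno != EEXIST || i++ >= max_retries)
            return -1;
        sleep(i * 2);
    }
    lock_held = 1;      /* from here on the lock is ours to remove */
    return 0;
}

/* Remove the lock, but only if this process created it. */
void rmlock(void)
{
    if (!lock_held)
        return;         /* someone else's lock: leave it alone */
    unlink(lock_path);
    lock_held = 0;
}
```

With a guard like this, an error path such as xxit() can still call rmlock() unconditionally: if the lock was never obtained, the call does nothing and the lock held by the unbatching inews survives.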