rees@apollo.uucp (Jim Rees) (09/05/86)
I've had lots of requests for fixes to make the news software run better on Apollos. The fact is that it runs pretty well as is, but there are things you can do to make it more robust in an environment where lots of people are reading news out of the same spool and lib directories, across a network that may not be completely reliable. Our network currently has about 1600 nodes on it, spread across two states, and those T-1 lines aren't always reliable. Some of these fixes may be applicable to other kinds of unreliable networks. The fix to getnextart() is particularly important.

The standard news software pretty much works. The main problem is contention for the 'active' file, but with a small network even that may not be a problem. You have to run with SORTACTIVE on, and it helps to call lock() from sortactive() to lock the 'active' file while it makes the copy.

You can eliminate contention for the 'sequence' file by generating message-ids from uids. Use this (in ifuncs.c):

    getident(hp)
    struct hbuf *hp;
    {
        long uid[2];
        std_$call uid_$gen();

        uid_$gen(uid);
        sprintf(hp->ident, "<%lx.%lx@%s%s>", uid[0], uid[1] & 0xfffff,
            FULLSYSNAME, MYDOMAIN);
    }

There is also a problem with contention for the 'history' file, but this is a problem on any Unix system so we don't worry about it.

It helps to make getnextart() more robust in the face of network trouble. This piece of code helps (in readr.c):

        /* Decide if we want to show this article. */
        if (bit <= 0 || (fp = fopen(filename, "r")) == NULL) {
    #ifdef apollo
            /* Make sure we can still get at the spool directory */
            struct stat stbuf;

            if (stat(SPOOL, &stbuf) < 0) {
                if (pflag || lflag || eflag)
                    /* Not interactive; error exit */
                    xerror("Can't get at spool directory");
                fprintf(stderr, "Net failure has made news temporarily unavailable.\n");
                fprintf(stderr, "Do you want to quit (q) or try again (<RET>)? ");
                gets(bfr);
                if (*bfr == 'q') {
                    /* Save the .newsrc before quitting */
                    updaterc();
                    writeoutrc();
                    xxit(0);
                } else if (*bfr == 'x') {
                    /* Quit without touching the .newsrc */
                    xxit(0);
                } else
                    goto nextart2;
            }
    #endif

The stock lock() and unlock() in ifuncs.c won't work very well. This method works better. I put it in a separate source file and comment out the ones in ifuncs.c.

    #include "iparams.h"
    #include "/sys/ins/base.ins.c"
    #include "/sys/ins/ios.ins.c"
    #include "/sys/ins/error.ins.c"

    #define INTERVAL 4    /* No. of seconds between lock retries */

    int lockcount;        /* Shared with ifuncs.c */

    extern std_$call error_$get_string();

    static ios_$id_t lockfd = -1;

    lock()
    {
        int i;
        status_$t st;
        char pn[100], errbuf[100], buf[120];
        short pnlen, errlen;

        /* Nested call: we already hold the lock */
        if (lockcount++ != 0)
            return;

        for (i = 0; i < DEADTIME; i += INTERVAL) {
            ios_$create(*LOCKFILE, (short) strlen(LOCKFILE), uid_$nil,
                ios_$truncate_mode, ios_$write_opt, lockfd, st);
            if (st.all == status_$ok)
                return;
            if (st.all == ios_$concurrency_violation)
                /* Somebody else has it; wait and retry */
                sleep(INTERVAL);
            else {
                if (i == 0)
                    /* First failure: remove the lock file and retry */
                    unlink(LOCKFILE);
                else {
                    error_$get_string(st, errbuf, (short) sizeof errbuf, errlen);
                    sprintf(buf, "%s: %.*s", LOCKFILE, errlen, errbuf);
                    xerror(buf);
                }
            }
        }
        xerror("News system locked up");
    }

    unlock()
    {
        status_$t st;

        /* Still inside a nested lock() */
        if (--lockcount > 0)
            return;

        if (lockfd != -1) {
            ios_$delete(lockfd, st);
            lockfd = -1;
        }
        lockcount = 0;
    }

    /*
     * Given our current batching scheme, article locking seems
     * unnecessary and just slows us down.
     */
    idlock(str)
    char *str;
    {
    }

    idunlock()
    {
    }
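
For anyone who wants to try the same retry-and-nest idea on a system without the Apollo ios_$ calls, here is a rough sketch using plain POSIX open() with O_CREAT | O_EXCL. This is not the code above and not part of the news distribution; the names sketch_lock, sketch_unlock and RETRY_INTERVAL are made up for illustration. Note one difference: the Apollo version appears to rely on the system's concurrency control while the lock file is held open, whereas this sketch relies only on the file existing, so a process that dies while holding the lock leaves a stale file behind. O_EXCL has also historically been unreliable over some network file systems, so treat this as a starting point only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define RETRY_INTERVAL 1        /* hypothetical: seconds between attempts */

static int lock_depth = 0;      /* nesting count, same role as lockcount */
static const char *lock_path;   /* path we hold while lock_depth > 0 */

/*
 * Try to take the lock, waiting at most max_wait seconds.
 * Returns 0 on success, -1 on timeout or on an unexpected error.
 * The caller must keep the path string alive until the final unlock.
 */
int sketch_lock(const char *path, int max_wait)
{
    int waited, fd;

    if (lock_depth > 0) {       /* already held by us: just nest */
        lock_depth++;
        return 0;
    }
    for (waited = 0; ; waited += RETRY_INTERVAL) {
        /* O_EXCL makes open() fail with EEXIST if the file is there */
        fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd >= 0) {
            close(fd);
            lock_path = path;
            lock_depth = 1;
            return 0;
        }
        if (errno != EEXIST)
            return -1;          /* some other failure: do not retry */
        if (waited >= max_wait)
            return -1;          /* held by someone else for too long */
        sleep(RETRY_INTERVAL);
    }
}

/* Release one level of nesting; remove the file at the outermost level. */
void sketch_unlock(void)
{
    assert(lock_depth > 0);
    if (--lock_depth > 0)
        return;
    unlink(lock_path);
    lock_path = NULL;
}
```

A caller would wrap each update of a shared file in sketch_lock(path, n) and sketch_unlock(), checking the return value of sketch_lock() instead of relying on xerror() to exit.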