warren@pluto.UUCP (Warren Burstein) (11/18/85)
I saw this happening, just wondered if this is a thing to avoid. Won't both programs be updating the active file at the same time?
adams@calma.UUCP (Robert Adams) (11/18/85)
> I saw this happening, just wondered if this is a thing to avoid.  Won't
> both programs be updating the active file at the same time?

Yes, there are problems if expire and the unbatcher run at the
same time.  Here I run a program called /usr/lib/new/newexpire
which looks like:
---------------
#! /bin/csh -f
# expire that waits for the mail sender to be finished
while ( -e /usr/spool/uucp/LCK..sun )
	sleep 60
end
sleep 10
# now wait for the unbatcher to finish
while ( -e /usr/spool/uucp/LCK..ACTIVE )
	sleep 60
end
echo "$$" > /usr/spool/uucp/LCK..ACTIVE
/usr/lib/news/expire -e 14 -a -A /user/adams/news/archive -n mod.ai net.ai
/usr/lib/news/expire -e 14 -a -A /nfs/news_archive -n net.sources mod.sources
/usr/lib/news/expire -e 14
rm -f /usr/spool/uucp/LCK..ACTIVE
-------------
and then the program that is run by the news feeder ('sun' in this case)
was replaced by the script:
-------------
#! /bin/csh -f
while ( -e /usr/spool/uucp/LCK..ACTIVE )
	sleep 60
end
echo "$$" > /usr/spool/uucp/LCK..ACTIVE
echo "$$" > /usr/spool/uucp/LCK..BATCHER
/usr/lib/news/unbatchnews $*
rm -f /usr/spool/uucp/LCK..BATCHER
rm -f /usr/spool/uucp/LCK..ACTIVE
-------------
There are other things in the system that look for LCK..BATCHER
(we feed other sites).  Yes, this has critical region problems
but, compared to what happens to /usr/lib/news/active when both
expire and the unbatcher run at the same time, it is a small price
to pay.

As an aside, the advantage of naming the lock files "LCK..*" is that
they are already cleaned up by /etc/rc when the system boots.

adams@calma.UUCP -- Robert Adams
...!ucbvax!calma!adams
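
The critical region problem in the scripts above comes from testing for the lock file and creating it as two separate steps: two scripts can both pass the test before either one writes the file. Here is a minimal sketch of one way to narrow that window, assuming a POSIX sh. It uses mkdir, which fails if the directory already exists, so the test and the creation happen as a single operation. The LOCKDIR path, the function names, and the LOCK_POLL variable are made up for illustration and are not part of the scripts above.

```sh
#!/bin/sh
# Hypothetical lock location; the scripts above use the plain file
# /usr/spool/uucp/LCK..ACTIVE instead.
LOCKDIR=${LOCKDIR:-/tmp/LCK..ACTIVE.d}

# acquire_lock: loop until mkdir succeeds.  mkdir either creates the
# directory or fails because it already exists, so only one caller wins.
acquire_lock() {
    until mkdir "$LOCKDIR" 2>/dev/null
    do
        sleep "${LOCK_POLL:-60}"
    done
    echo "$$" > "$LOCKDIR/pid"    # record the holder, as the originals do
}

# release_lock: remove the lock directory and the pid file inside it.
release_lock() {
    rm -rf "$LOCKDIR"
}
```

A script would call acquire_lock before running expire or unbatchnews and release_lock afterward. One trade-off: a boot-time cleanup that only does "rm -f" on LCK..* files would probably not remove a directory, so the cleanup would need adjusting.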
wls@astrovax.UUCP (William L. Sebok) (11/19/85)
In article <186@pluto.UUCP> warren@pluto.UUCP (Warren Burstein) writes:
>I saw this happening, just wondered if this is a thing to avoid.  Won't
>both programs be updating the active file at the same time?

I believe that there may very well be problems.  I am also very worried
about the history file.  I didn't see what I believe to be adequate locking
of it in the source for news 2.10.1 and 2.10.2.

At one time when we were running news 2.10.1 the history dbm files were
getting corrupted (and when they get corrupted the dbm subroutines abort()).
The problem in the end turned out to be hardware, but I suspected locking
problems.  That was when I inspected the locking code.  I fixed it to my
satisfaction by installing the 4.2 BSD flock() call around the history file
accesses.  Installation of the news could be held up by a long expire, but
to me that was tolerable.

I haven't gotten around to doing anything like this to news 2.10.2 (and with
news 2.10.3 around the corner I am not likely to get around to it).  Because
of my concerns I run expire here at 8-9 am, after news for the night is shut
off (sites princeton and astrovax both run one of Honeyman's recent versions
of Honey Danber that allow different time-of-day restrictions on different
grades, thus allowing news to be confined to night without so restricting
mail).
--
Bill Sebok			Princeton University, Astrophysics
{allegra,akgua,cbosgd,decvax,ihnp4,noao,philabs,princeton,vax135}!astrovax!wls
bytebug@felix.UUCP (Roger L. Long) (11/21/85)
In article <65@calma.UUCP> adams@calma.UUCP (Robert Adams) writes:
>> I saw this happening, just wondered if this is a thing to avoid.  Won't
>> both programs be updating the active file at the same time?
>
>Yes, there are problems if expire and the unbatcher run at the
>same time.
>There are other things in the system that look for LCK..BATCHER
>(we feed other sites).  Yes, this has critical region problems
>but, compared to what happens to /usr/lib/news/active when both
>expire and the unbatcher run at the same time, it is a little price.

I've not found that anything nasty happens when running expire at the
same time we're unbatching news.  I should preface this with the fact
that we are running the 2.10.2 news software.

What happens is that expire doesn't update the active file.  It builds
a new file named "nactive", and then renames nactive to active when it
is finished.  If during the time that expire is running new news comes
in, the current article numbers that get written to nactive by expire
get outdated.  However when news tries to use that article number to
post a new article to, it sees that something is already there and puts
an error message into the error log.  It then increments the article
number and tries again.

Expire does the same sort of thing when dealing with the history file:
it builds "nhistory" and then renames it to "history" when it is
finished.
--
	Roger L. Long
	FileNet Corp
	trwrb!felix!bytebug
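
The build-then-rename approach described above can be sketched in shell. This only illustrates the idea and is not expire's actual code (expire is written in C); the function name and the filter command passed to it are hypothetical.

```sh
#!/bin/sh
# rewrite_file FILE CMD...: run CMD with FILE on stdin, write the result
# to a sibling file with an "n" prefix, then rename it over the original.
# Readers see either the old file or the complete new one, never a
# half-written one, because a rename within one filesystem is atomic.
rewrite_file() {
    file=$1; shift
    dir=`dirname "$file"`
    new="$dir/n`basename "$file"`"      # e.g. active -> nactive
    "$@" < "$file" > "$new" && mv "$new" "$file"
}
```

For example, `rewrite_file /usr/lib/news/active grep -v '^mod\.ai '` would drop one group's line. Anything written to the old file while the new one is being built is lost at the rename, which is the outdated article number situation the post describes.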
stephen@dcl-cs.UUCP (Stephen J. Muir) (11/23/85)
In article <186@pluto.UUCP> warren@pluto.UUCP (Warren Burstein) writes:
>I saw this happening, just wondered if this is a thing to avoid.  Won't
>both programs be updating the active file at the same time?

There are 3 cases to consider here:

1) Highest article number too low:
   When new news is later received for this newsgroup, "rnews" will try to
   create the file with the same name as an existing one, but it will notice
   that it already exists and try the next one instead.  It will keep doing
   this until it manages to get a nonexistent file name.  After that, the
   problem will have fixed itself.

2) New newsgroup creation:
   There may indeed be a problem here, but the time window in which this can
   happen is very small.  This is the time "expire" takes to rename the file
   after detecting end-of-file.

3) Newsgroup deletion:
   On our system, this is done manually and I make sure neither "expire" nor
   "rnews" is running.

There is a more serious bug (which I have fixed on my system).  Once "expire"
has finished with the history file, it doesn't flush its buffers before
starting its work on the active file.  This gives quite a large time slot in
which "rnews" can corrupt the history file.
--
UUCP:	...!seismo!mcvax!ukc!dcl-cs!stephen
DARPA:	stephen%comp.lancs.ac.uk@ucl-cs	| Post: University of Lancaster,
JANET:	stephen@uk.ac.lancs.comp	|	Department of Computing,
Phone:	+44 524 65201 Ext. 4599		|	Bailrigg, Lancaster, UK.
Project:Alvey ECLIPSE Distribution	|	LA1 4YR
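
Case 1 can be illustrated with a short loop. This is a sketch of the behavior described, not the actual rnews code; the function name and the layout it assumes (one numbered file per article in the newsgroup's directory) are assumptions.

```sh
#!/bin/sh
# next_free_article DIR START: print the first number >= START for which
# no file of that name exists in DIR.
next_free_article() {
    dir=$1
    n=$2
    while [ -e "$dir/$n" ]
    do
        n=`expr "$n" + 1`
    done
    echo "$n"
}
```

With articles 10 through 12 already on disk and a stale high-water mark of 10, `next_free_article /usr/spool/news/net/ai 10` would skip forward to the first unused number. Testing and then creating is itself a small race, so this shows only the retry logic, not a safe way to create the file.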
mp@allegra.UUCP (Mark Plotnick) (11/25/85)
The way we avoid the problem here is:

- don't run expire (it would take 11 hours of real time!), and use
  find instead.  Once a week, manually turn off uuxqts, trim down
  the history file so it only holds 3 weeks' worth of info, and run
  "rebuilddbm" (an extract of the rebuilddbm() routine in expire) to
  recreate the .dir and .pag files.

- add locking code around dbm accesses.  This is mainly to prevent
  problems with concurrent rnews's (either because we have multiple
  uuxqt's going at once or have benevolent gremlins who dig into the
  uucp spool directory and run unbatch manually).  I just mimicked the
  article locking code, and provided an abort() routine for dbm to call
  that logs the data from the offending page (so far, it's never been
  called).  I also modified libdbm not to cache pages (this change may
  also help sendmail out when it's repeatedly hunting for the '@' in an
  incomplete alias file).

Modified news code is available upon request, but it's for 2.10.1.
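
Removing old articles with find might look something like the following. The exact command used at allegra is not given in the post, so the function name, the assumption that article files have names beginning with a digit, and the 21-day cutoff in the example (picked to match the three weeks of history mentioned) are all guesses.

```sh
#!/bin/sh
# expire_by_find SPOOL DAYS: delete files under SPOOL whose names begin
# with a digit (assumed here to be article files) and that were last
# modified more than DAYS days ago.
expire_by_find() {
    find "$1" -type f -name '[0-9]*' -mtime +"$2" -exec rm -f {} \;
}
```

A weekly job could then run `expire_by_find /usr/spool/news 21`. find never touches the history file, which is why the post trims history and rebuilds the dbm files as a separate step.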
spaf@gatech.CSNET (Gene Spafford) (11/26/85)
Rick Adams (rick@seismo.css.gov) posted a very nice fix to this
problem some time back.  Since it is small, I will post it again
(here).

First, create a file named "rnews.x" in your news library directory
containing:

exec /bin/cat $* > /usr/spool/news/rnews.$$

Next, you make your nightly news script do the following (I believe
that the "install" command is a BSD-specific command; a combination of
"cp", "chown" and "chmod" will replace it for other sites):

#! /bin/sh
umask 002
# Prevent additions to history file while expire is running
/usr/bin/install -c -m 4755 -o news /usr/lib/news/rnews.x /usr/bin/rnews
# actually expire the articles
# if this was invoked manually, pass along the flags too
cd /usr/lib/news
/usr/lib/news/expire -v2 $*
# get a fresh logfile
/bin/mv log olog
/bin/cp /dev/null log
/bin/chmod 666 log
/bin/cat olog >>log.mtd
/bin/rm -f ohistory.pag ohistory.dir ohistory olog
# turn rnews loose
/bin/rm -f /usr/bin/rnews
/bin/ln /usr/lib/news/inews /usr/bin/rnews
cd /usr/spool/news
for i in rnews.*
do
	/usr/bin/rnews <$i
	/bin/rm -f $i
done
--
Gene "wedding done, thesis to go" Spafford
The Clouds Project, School of ICS, Georgia Tech, Atlanta GA 30332
CSNet:	Spaf @ GATech		ARPA:	Spaf%GATech.CSNet @ Relay.CS.NET
uucp:	...!{akgua,decvax,hplabs,ihnp4,linus,seismo,ulysses}!gatech!spaf
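
For sites without "install", the cp, chown and chmod combination mentioned above could be wrapped up as below. This is only a sketch: the function name is made up, and changing the owner to "news" normally requires running as root.

```sh
#!/bin/sh
# install_as SRC DEST OWNER MODE: rough stand-in for
#	install -c -m MODE -o OWNER SRC DEST
# chown runs before chmod because on many systems changing the owner
# clears the set-user-ID bit.
install_as() {
    cp "$1" "$2" &&
    chown "$3" "$2" &&
    chmod "$4" "$2"
}
```

The install line in the nightly script would then become `install_as /usr/lib/news/rnews.x /usr/bin/rnews news 4755`.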
reid@glacier.ARPA (Brian Reid) (11/27/85)
In article <2054@gatech.CSNET> spaf@gatech.UUCP (Gene Spafford) writes:
>Rick Adams (rick@seismo.css.gov) posted a very nice fix to this
>problem some time back.  Since it is small, I will post it again
>(here).

Rick's fix is not good enough, for 2 reasons:

(1) It doesn't prevent local postings and "recnews" postings during an
    expire.  Now that the new mod.stupidname groups are the norm, we aren't
    doing so much recnews as we used to, but back in the good old "fa.*"
    days this was a big problem.

(2) It causes dreadful problems in an ethernet environment when the "rdist"
    program is used.  This might not bother many of you, but it sure
    bothered us.  If somebody turns off news on the master machine, and
    then rdist runs, then the turned-off version of news gets distributed
    to all of the client machines.

I don't yet have a fix for inews/recnews, but here is my modified rnews that
fixes the rdist problem.  You'll probably want to run with the
article-eater-log code turned off, as the article-eater bug is officially
fixed now.

#! /bin/sh
: This is a shar archive.  Extract with sh, not csh.
echo x - newson
cat > newson << '15935!Funky!Stuff!'
#! /bin/sh
#
# This shell script un-does the effect of "newsoff". It takes any stored
# news that accumulated while news was turned off, and runs it through.
# See also $NEWSLIB/newsoff and /usr/bin/rnews
#
# Brian Reid, October 1985
PATH=.:/usr/stanford/bin:/usr/ucb:/usr/bin:/bin:
NEWSLIB=/usr/lib/news
NEWSSPOOL=/usr/spool/news
rm -f $NEWSLIB/rnews.lock
cd $NEWSSPOOL
ls -tr1 |\
grep \^rnews. |\
awk "{print \"$NEWSLIB/rnews < \",\$1,\"; rm -f \",\$1,\"\"}" |\
/bin/sh
15935!Funky!Stuff!
echo x - newsoff
cat > newsoff << '15935!Funky!Stuff!'
#! /bin/sh
#
# This shell script turns off incoming news so that system maintenance runs
# can continue uninterrupted. See also $NEWSLIB/newson and /usr/bin/rnews.
#
# Brian Reid, October 1985
NEWSLIB=/usr/lib/news
if [ -f $NEWSLIB/rnews.lock ]; then
	echo "News is already off."
else
	echo "News disabled by" $USER at "`date`" > $NEWSLIB/rnews.lock
	cat $NEWSLIB/rnews.lock
fi
15935!Funky!Stuff!
echo x - /usr/bin/rnews
cat > rnews << '15935!Funky!Stuff!'
#! /bin/sh
#
# This shell script replaces /usr/bin/rnews. It tests for the presence of a
# lock file. If the lock file is there, then the news is hidden away in a
# spool directory instead of being processed. The lock file is set by the
# "newsoff" command and cleared by the "newson" command. It expects the real
# rnews to be a hard link to /usr/lib/news/inews.
#
# Brian Reid, October 1985
NEWSLIB=/usr/lib/news
NEWSSPOOL=/usr/spool/news
if [ -f $NEWSLIB/rnews.lock ]
then
	exec /bin/cat > $NEWSSPOOL/rnews.$$
else
	exec $NEWSLIB/rnews 2>>$NEWSLIB/article-eater-log
fi
15935!Funky!Stuff!
--
	Brian Reid	decwrl!glacier!reid
	Stanford	reid@SU-Glacier.ARPA
hansen@pegasus.UUCP (Tony L. Hansen) (12/20/85)
Marsh Gosnell and I have fixed the problems with expire and rnews running at
the same time.  The code modifications to expire.c and inews.c are being
passed back to Rick.  Essentially we:

1) provide a lock while expire is redoing its history/active files
2) if rnews sees the lock, it shunts the files into SPOOL/save.news
3) when expire is done with the history/active files, it invokes
   rnews on each file within SPOOL/save.news

					Tony Hansen
					ihnp4!pegasus!hansen
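
The three steps can be sketched in shell as follows. The real changes were made in expire.c and inews.c and are not shown in the post, so everything here is a hypothetical illustration: the lock file name, the function names, and the PROCESS variable standing in for normal article processing are all invented. Only the save.news directory name comes from the description above.

```sh
#!/bin/sh
# All names here are illustrative except save.news.
SPOOL=${SPOOL:-/usr/spool/news}
PROCESS=${PROCESS:-/usr/lib/news/inews}  # stand-in for normal processing

# Step 2: what rnews does with an incoming batch on stdin.
receive() {
    if [ -f "$SPOOL/expire.lock" ]
    then
        # expire holds the lock: shunt the batch aside under an unused name
        mkdir -p "$SPOOL/save.news"
        n=1
        while [ -e "$SPOOL/save.news/$$.$n" ]
        do
            n=`expr "$n" + 1`
        done
        cat > "$SPOOL/save.news/$$.$n"
    else
        "$PROCESS"
    fi
}

# Steps 1 and 3: what expire does around its history/active rewrite.
# The arguments are the command that does the rewrite.
expire_locked() {
    echo "$$" > "$SPOOL/expire.lock"    # step 1: take the lock
    "$@"                                # redo history/active
    rm -f "$SPOOL/expire.lock"
    for f in "$SPOOL"/save.news/*       # step 3: replay what was shunted
    do
        [ -f "$f" ] || continue         # the glob matched nothing
        "$PROCESS" < "$f" && rm -f "$f"
    done
}
```

Like the lock-file scripts earlier in the thread, this sketch still has a small window between rnews testing for the lock and expire creating it; doing the work inside expire and inews, as described, presumably lets them use the news software's own locking instead.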