[net.news.sa] Are there problems if expire runs while news is being received?

warren@pluto.UUCP (Warren Burstein) (11/18/85)

I saw this happening, just wondered if this is a thing to avoid.  Won't
both programs be updating the active file at the same time?

adams@calma.UUCP (Robert Adams) (11/18/85)

> I saw this happening, just wondered if this is a thing to avoid.  Won't
> both programs be updating the active file at the same time?

Yes, there are problems if expire and the unbatcher run at the
same time.  Here I run a program called /usr/lib/new/newexpire
which looks like:
---------------
#! /bin/csh -f
# expire that waits for the mail sender to be finished
while ( -e /usr/spool/uucp/LCK..sun )
	sleep 60
end
sleep 10
# now wait for the unbatcher to finish
while ( -e /usr/spool/uucp/LCK..ACTIVE )
	sleep 60
end
echo "$$" > /usr/spool/uucp/LCK..ACTIVE
/usr/lib/news/expire -e 14 -a -A /user/adams/news/archive -n mod.ai net.ai
/usr/lib/news/expire -e 14 -a -A /nfs/news_archive -n net.sources mod.sources
/usr/lib/news/expire -e 14
rm -f /usr/spool/uucp/LCK..ACTIVE
-------------
and then the program that is run by the news feeder ('sun' in
this case) was replaced by the script:
-------------
#! /bin/csh -f
while ( -e /usr/spool/uucp/LCK..ACTIVE )
	sleep 60
end
echo "$$" > /usr/spool/uucp/LCK..ACTIVE
echo "$$" > /usr/spool/uucp/LCK..BATCHER
/usr/lib/news/unbatchnews $*
rm -f /usr/spool/uucp/LCK..BATCHER
rm -f /usr/spool/uucp/LCK..ACTIVE
-------------
There are other things in the system that look for LCK..BATCHER
(we feed other sites).  Yes, this has critical region problems
but, compared to what happens to /usr/lib/news/active when both
expire and the unbatcher run at the same time, it is a little price.
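
The critical region mentioned above is the gap between testing for the
lock file and creating it.  A minimal sketch of one way to close that gap,
assuming a lock directory is acceptable in place of a lock file; the names
DEMO, LOCKDIR, acquire_lock and release_lock are made up for illustration
and are not part of the scripts above:

```sh
#! /bin/sh
# Sketch only: mkdir either creates the lock or fails, in one atomic step.
# DEMO and LOCKDIR are hypothetical scratch paths; a real setup would use
# one fixed path that every cooperating script agrees on.
DEMO=${TMPDIR:-/tmp}/lockdemo.$$
mkdir -p "$DEMO" || exit 1
LOCKDIR=$DEMO/LCK..ACTIVE

acquire_lock() {
    # $1 = number of attempts before giving up (default 60)
    tries=0
    until mkdir "$LOCKDIR" 2>/dev/null
    do
        tries=$((tries + 1))
        if [ "$tries" -ge "${1:-60}" ]; then return 1; fi
        sleep 1
    done
    echo "$$" > "$LOCKDIR/pid"   # record the owner for debugging
}

release_lock() {
    rm -rf "$LOCKDIR"
}

if acquire_lock 5
then
    echo "lock held by $$"
    # ... expire or the unbatcher would run here ...
    release_lock
else
    echo "could not get lock" >&2
fi
```

Because the test and the creation are the same operation, two scripts
cannot both conclude that the lock was free.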

As an aside, the advantage of naming the lock files "LCK..*" is that
they are already cleaned up by /etc/rc when the system boots.

	adams@calma.UUCP		-- Robert Adams
	...!ucbvax!calma!adams

wls@astrovax.UUCP (William L. Sebok) (11/19/85)

In article <186@pluto.UUCP> warren@pluto.UUCP (Warren Burstein) writes:

>I saw this happening, just wondered if this is a thing to avoid.  Won't
>both programs be updating the active file at the same time?

I believe that there may very well be problems. I am also very worried about the
history file. I didn't see what I believe to be adequate locking of it in the
source for news 2.10.1 and 2.10.2.

At one time when we were running news 2.10.1 the history dbm files were getting
corrupted (and when they get corrupted the dbm subroutines abort()).  The
problem in the end turned out to be hardware but I suspected locking problems.
That was when I inspected the locking code.  I fixed it to my satisfaction
by installing the 4.2 BSD flock() call around the history file accesses.
Installation of the news could be held up by a long expire but to me that
was tolerable.

I haven't gotten around to doing anything like this to news 2.10.2 (and with
news 2.10.3 around the corner I am not likely to get around to it).  Because
of my concerns I run expire here at 8-9 am, after news for the night is shut
off (site princeton and astrovax both run one of Honeyman's recent versions
of Honey Danber that allow different time-of-day restrictions on different
grades, thus allowing news to be confined to night without so restricting
mail).
-- 
Bill Sebok			Princeton University, Astrophysics
{allegra,akgua,cbosgd,decvax,ihnp4,noao,philabs,princeton,vax135}!astrovax!wls

bytebug@felix.UUCP (Roger L. Long) (11/21/85)

In article <65@calma.UUCP> adams@calma.UUCP (Robert Adams) writes:
>> I saw this happening, just wondered if this is a thing to avoid.  Won't
>> both programs be updating the active file at the same time?
>
>Yes, there are problems if expire and the unbatcher run at the
>same time.

>There are other things in the system that look for LCK..BATCHER
>(we feed other sites).  Yes, this has critical region problems
>but, compared to what happens to /usr/lib/news/active when both
>expire and the unbatcher run at the same time, it is a little price.

I've not found that anything nasty happens when running expire at the 
same time we're unbatching news.  I should preface this with the fact
that we are running the 2.10.2 news software.

What happens is that expire doesn't update the active file in place.  It builds
a new file named "nactive", and then renames nactive to active when it
is finished.  If during the time that expire is running new news comes
in, the current article numbers that get written to nactive by expire
get outdated.  However when news tries to use that article number to
post a new article to, it sees that something is already there and puts
an error message into the error log.  It then increments the article
number and tries again.  Expire does the same sort of thing when
dealing with the history file:  it builds "nhistory" and then renames
it to "history" when it is finished.
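
The build-then-rename pattern described above can be sketched as follows.
This is an illustration under stated assumptions, not the expire source:
the scratch directory WORK, the toy contents of "active", and the awk
edit are all invented for the example.

```sh
#! /bin/sh
# Sketch: rebuild a file under a temporary name, then rename it into place.
# WORK is a hypothetical scratch directory; "active" here is a toy file.
WORK=${TMPDIR:-/tmp}/renamedemo.$$
mkdir -p "$WORK" && cd "$WORK" || exit 1

# Toy active file: group name, highest article number, lowest article number.
printf 'net.ai 00120 00050\nnet.sources 00300 00210\n' > active

# Write the updated copy under a different name.  Anything reading "active"
# still sees the old, complete file while this is in progress.
awk '{ $3 = sprintf("%05d", $3 + 10); print }' active > nactive

# Within one filesystem mv is a rename, so "active" changes from the old
# contents to the new contents in a single step.
mv nactive active
cat active
```

The rename protects readers from a half-written file, but any change made
to the old "active" after nactive was started is lost, which is why the
article numbers can end up out of date.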
-- 
	Roger L. Long
	FileNet Corp
	trwrb!felix!bytebug

stephen@dcl-cs.UUCP (Stephen J. Muir) (11/23/85)

In article <186@pluto.UUCP> warren@pluto.UUCP (Warren Burstein) writes:
>I saw this happening, just wondered if this is a thing to avoid.  Won't
>both programs be updating the active file at the same time?

There are 3 cases to consider here:

1) Highest article number too low:
   When new news is later received for this newsgroup, "rnews" will try to
   create the file with the same name as an existing one, but it will notice
   that it already exists and try the next one instead.  It will keep doing
   this until it manages to get a nonexistent file name.  After that, the
   problem will have fixed itself.

2) New newsgroup creation:
   There may indeed be a problem here, but the time window in which this can
   happen is very small.  This is the time "expire" takes to rename the file
   after detecting end-of-file.

3) Newsgroup deletion:
   On our system, this is done manually and I make sure neither "expire" nor
   "rnews" is running.

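Case 1 can be sketched roughly as below.  This only illustrates the retry
idea; GROUPDIR, next_free and the article numbers are hypothetical, and
the real "rnews" is not a shell script.

```sh
#! /bin/sh
# Sketch of case 1: skip article numbers whose files already exist.
# GROUPDIR and next_free are hypothetical names for illustration.
GROUPDIR=${TMPDIR:-/tmp}/artdemo.$$/net/news/sa
mkdir -p "$GROUPDIR" || exit 1

# Pretend the active file says the highest article is 41, although
# articles 42 and 43 arrived while expire was writing its stale copy.
touch "$GROUPDIR/42" "$GROUPDIR/43"
highest=41

next_free() {
    # $1 = highest article number according to the active file
    n=$(($1 + 1))
    while [ -e "$GROUPDIR/$n" ]
    do
        echo "article $n already exists, trying next" >&2
        n=$((n + 1))
    done
    echo "$n"
}

new=$(next_free "$highest")
echo "new article stored as $new"
```

The test-then-use step here is itself open to a race between two
concurrent receivers, so it should be read as a picture of the
self-correction and not as a safe way to allocate numbers.
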
There is a more serious bug (which I have fixed on my system).  Once "expire"
has finished with the history file, it doesn't flush its buffers before
starting its work on the active file.  This gives quite a large time slot in
which "rnews" can corrupt the history file.
-- 
UUCP:	...!seismo!mcvax!ukc!dcl-cs!stephen
DARPA:	stephen%comp.lancs.ac.uk@ucl-cs	| Post: University of Lancaster,
JANET:	stephen@uk.ac.lancs.comp	|	Department of Computing,
Phone:	+44 524 65201 Ext. 4599		|	Bailrigg, Lancaster, UK.
Project:Alvey ECLIPSE Distribution	|	LA1 4YR

mp@allegra.UUCP (Mark Plotnick) (11/25/85)

The way we avoid the problem here is:
- don't run expire (it would take 11 hours of real time!), and
use find instead.  Once a week, manually turn off uuxqts, trim
down the history file so it only holds 3 weeks' worth of info, and run
"rebuilddbm" (an extract of the rebuilddbm() routine in expire) to
recreate the .dir and .pag files.
- add locking code around dbm accesses.  This is mainly to prevent
problems with concurrent rnews's (either because we have multiple
uuxqt's going at once or have benevolent gremlins who dig into the uucp
spool directory and run unbatch manually).  I just mimicked the article
locking code, and provided an abort() routine for dbm to call that logs
the data from the offending page (so far, it's never been called).  I
also modified libdbm not to cache pages (this change may also help
sendmail out when it's repeatedly hunting for the '@' in an incomplete
alias file).  Modified news code is available upon request, but it's for 2.10.1.

spaf@gatech.CSNET (Gene Spafford) (11/26/85)

Rick Adams (rick@seismo.css.gov) posted a very nice fix to this
problem some time back.  Since it is small, I will post it again
(here).

First, create a file named "rnews.x" in your news library directory
containing:

exec /bin/cat $* > /usr/spool/news/rnews.$$

Next, you make your nightly news script do the following
(I believe that the "install" command is a BSD-specific command;
a combination of "cp", "chown" and "chmod" will replace it for
other sites):

#! /bin/sh

umask 002

# Prevent additions to history file while expire is running
/usr/bin/install -c -m 4755 -o news  /usr/lib/news/rnews.x /usr/bin/rnews

# actually expire the articles
#  if this was invoked manually, pass along the flags too 
cd /usr/lib/news
/usr/lib/news/expire -v2 $*

# get a fresh logfile
/bin/mv log olog
/bin/cp /dev/null log
/bin/chmod 666 log
/bin/cat olog >>log.mtd
/bin/rm -f ohistory.pag ohistory.dir ohistory olog

# turn rnews loose
/bin/rm -f /usr/bin/rnews
/bin/ln /usr/lib/news/inews /usr/bin/rnews
cd /usr/spool/news
for i in rnews.*
do
	# if nothing was spooled the pattern stays unexpanded; skip it
	test -f "$i" || continue
	/usr/bin/rnews <"$i"
	/bin/rm -f "$i"
done

-- 
Gene "wedding done, thesis to go" Spafford
The Clouds Project, School of ICS, Georgia Tech, Atlanta GA 30332
CSNet:	Spaf @ GATech		ARPA:	Spaf%GATech.CSNet @ Relay.CS.NET
uucp:	...!{akgua,decvax,hplabs,ihnp4,linus,seismo,ulysses}!gatech!spaf

reid@glacier.ARPA (Brian Reid) (11/27/85)

In article <2054@gatech.CSNET> spaf@gatech.UUCP (Gene Spafford) writes:
>Rick Adams (rick@seismo.css.gov) posted a very nice fix to this
>problem some time back.  Since it is small, I will post it again
>(here).

Rick's fix is not good enough, for 2 reasons:
  (1) It doesn't prevent local postings and "recnews" postings during an
      expire. Now that the new mod.stupidname groups are the norm, we
      aren't doing so much recnews as we used to, but back in the good
      old "fa.*" days this was a big problem.
  (2) It causes dreadful problems in an ethernet environment when the
      "rdist" program is used. This might not bother many of you, but
      it sure bothered us. If somebody turns off news on the master
      machine, and then rdist runs, then the turned-off version of news
      gets distributed to all of the client machines.

I don't yet have a fix for inews/recnews, but here is my modified rnews that
fixes the rdist problem. You'll probably want to run with the 
article-eater-log code turned off, as the article-eater bug is officially
fixed now.

#! /bin/sh
: This is a shar archive.  Extract with sh, not csh.
echo x - newson
cat > newson << '15935!Funky!Stuff!'
#! /bin/sh
#
# This shell script un-does the effect of "newsoff". It takes any stored
# news that accumulated while news was turned off, and runs it through.
# See also $NEWSLIB/newsoff and /usr/bin/rnews
#
#	Brian Reid, October 1985

PATH=.:/usr/stanford/bin:/usr/ucb:/usr/bin:/bin:
NEWSLIB=/usr/lib/news
NEWSSPOOL=/usr/spool/news

rm -f $NEWSLIB/rnews.lock
cd $NEWSSPOOL
ls -tr1 |\
 grep \^rnews. |\
 awk "{print \"$NEWSLIB/rnews < \",\$1,\"; rm -f \",\$1,\"\"}" |\
 /bin/sh
15935!Funky!Stuff!
echo x - newsoff
cat > newsoff << '15935!Funky!Stuff!'
#! /bin/sh
#
# This shell script turns off incoming news so that system maintenance runs
# can continue uninterrupted. See also $NEWSLIB/newson and /usr/bin/rnews.
#
#	Brian Reid, October 1985

NEWSLIB=/usr/lib/news
if [ -f $NEWSLIB/rnews.lock ]; 
then
    echo "News is already off."
else
    echo "News disabled by" $USER at "`date`" > $NEWSLIB/rnews.lock
    cat $NEWSLIB/rnews.lock
fi
15935!Funky!Stuff!
echo x - /usr/bin/rnews
cat > rnews << '15935!Funky!Stuff!'
#! /bin/sh
#
# This shell script replaces /usr/bin/rnews. It tests for the presence of a
# lock file. If the lock file is there, then the news is hidden away in a
# spool directory instead of being processed. The lock file is set by the
# "newsoff" command and cleared by the "newson" command. It expects the real
# rnews to be a hard link to /usr/lib/news/inews.
#
#	Brian Reid, October 1985

NEWSLIB=/usr/lib/news
NEWSSPOOL=/usr/spool/news
if [ -f $NEWSLIB/rnews.lock ]
then
    exec /bin/cat > $NEWSSPOOL/rnews.$$
else
    exec $NEWSLIB/rnews 2>>$NEWSLIB/article-eater-log
fi
15935!Funky!Stuff!
-- 
	Brian Reid	decwrl!glacier!reid
	Stanford	reid@SU-Glacier.ARPA

hansen@pegasus.UUCP (Tony L. Hansen) (12/20/85)

Marsh Gosnell and I have fixed the problems with expire and rnews running at
the same time. The code modifications to expire.c and inews.c are being
passed back to Rick. Essentially we:

    1)	provide a lock while expire is redoing its history/active files
    2)	if rnews sees the lock, it shunts the files into SPOOL/save.news
    3)	when expire is done with the history/active files, it invokes
	rnews on each file within SPOOL/save.news
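
The three steps above can be pictured with a small shell simulation.  This
is only a sketch: the actual changes are in expire.c and inews.c, which are
not shown here, and every name below (SPOOL as a scratch directory, LOCK,
SAVE, real_rnews, rnews_wrapper, deliver, run_expire) is hypothetical.

```sh
#! /bin/sh
# Sketch of lock / shunt / replay.  All paths and names are hypothetical.
SPOOL=${TMPDIR:-/tmp}/spooldemo.$$
LOCK=$SPOOL/expire.lock
SAVE=$SPOOL/save.news
mkdir -p "$SAVE" || exit 1
count=0

real_rnews() {
    # stand-in for the real article installer: just record the input
    cat >> "$SPOOL/installed"
}

rnews_wrapper() {
    if [ -f "$LOCK" ]
    then
        # step 2: expire is busy, so park the batch for later
        count=$((count + 1))
        cat > "$SAVE/batch.$$.$count"
    else
        real_rnews
    fi
}

deliver() {
    # feed one batch ($1) to the wrapper on standard input
    printf '%s\n' "$1" > "$SPOOL/incoming"
    rnews_wrapper < "$SPOOL/incoming"
}

run_expire() {
    : > "$LOCK"              # step 1: take the lock
    deliver "batch A"        # news arriving during expire is saved
    rm -f "$LOCK"            # history/active are rebuilt; release the lock
    for f in "$SAVE"/*       # step 3: replay whatever was saved
    do
        [ -f "$f" ] || continue
        real_rnews < "$f" && rm -f "$f"
    done
}

run_expire
deliver "batch B"            # no lock now, so this is installed directly
cat "$SPOOL/installed"
```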

					Tony Hansen
					ihnp4!pegasus!hansen