wcf@psuhcx (Bill Fenner) (11/22/88)
Does anyone have a good way to expire news automatically when the news partition gets full? We only have a 25 meg partition for news, and it often manages to fill up on weekends, and it's a big pain to come in to find a console log 5 inches thick with logs of /usr/spool/news: write failed, filesystem full.

We're using nntp to receive articles from psuvax1 (main feed), and uucp to send back to psuvax1 as well as to send to and from hogbbs, a FidoNet BBS, which doesn't create many incoming messages...

Bill
wcs@skep2.ATT.COM (Bill.Stewart.[ho95c]) (11/30/88)
In article <1066@psuhcx.psu.edu> wcf@psuhcx (Bill Fenner) writes:
> Does anyone have a good way to expire news automatically when the news
> partition gets full? We only have a 25 meg partition for news, and it

Here's my "trashnews" script. I run it hourly from cron, which seems to be often enough. It uses "df" to find how many blocks are free, and if there aren't enough, it grinds through the history file looking for articles to trash (starts at the top, works down - it doesn't care when the article *should* have expired.)

Caveats:
- You need to be running a version of news with one history file.
- If your "df" output format is different than System V you will have to modify the sed / awk script to pick out the right field.
- it doesn't clean up the history file - expire will have to do this for you. I run expire -r weekly.

======================== cut here ============================================
####### Zap netnews until disk space is adequate.
TARGET="/usr/spool"
remove="rm -f"		## remove="echo" for debugging
debug=":"		## debug="echo"
SPOOLDIR=/usr/spool/news	## Where the articles live
LIBDIR=/usr/lib/news		## Where the data files live
trashgroups="comp/mail/maps comp/binaries/atari talk/politics/misc comp/sys/atari" ## attack these brutally
cd $SPOOLDIR		## Where the articles live
echo "===================== `date`"
limit=5000
export limit LIBDIR remove debug
if [ "$1" = "-x" ] ; then set -x ; remove="echo remove"; debug=echo ; shift ; fi
case "$1" in
[0-9]*)	limit=$1 ; shift ;;
esac

######## Make sure $TARGET has inodes (evil System V bug!)
if df $TARGET | sed 's/(/ (/' | awk ' {
	{ print "df inodes ", $0 ;
	  if ( $5 < 1000 ) exit 0 ; else exit 1 ;
	} } '	#>/dev/null
then	echo trashing inodes ;
	find comp/mail/maps -type f -print | xargs $remove ;
	find control talk rec/humor -type f -mtime +3 -print | xargs $remove
else	echo inodes ok
fi

######## Make sure $LIBDIR has space
if df /usr | sed 's/(/ (/' | awk ' {
	{ print "df,limit ", $3, '$limit' ;
	  if ( $3 < '$limit' ) exit 0 ; else exit 1 ;
	} } '	#>/dev/null
then	echo remove /usr/lib/news/ohis* /usr/lib/news/olog* ;
	$remove /usr/lib/news/ohis* /usr/lib/news/olog*
else	echo /usr ok
fi

###############
(	## generate list of files to trash
	#echo $LIBDIR/olog* $LIBDIR/ohistory
	find $trashgroups -type f -mtime +2 -print 2>/dev/null
	sed -e 's/.* //' -e '/^$/d' -e '/cancel/d' -e 's/\./\//g' $LIBDIR/history
) | while read victim victims ; do
	if [ -f "$victim" ] ; then
		if df $TARGET | sed 's/(/ (/' | awk ' {
			{ print "df,limit ", $3, '$limit' ;
			  if ( $3 < '$limit' ) exit 0 ; else exit 1 ;
			} } '	#>/dev/null
		then	echo remove $victim $victims ; $remove $victim $victims
		else	echo enough ; break
		fi
	else	$debug $victim already gone
	fi
done
#################################### cut here ################
exit 0
-- 
# Thanks;
# Bill Stewart, AT&T Bell Labs 2G218 Holmdel NJ 201-949-0705 ho95c.att.com!wcs
#
# One Bell System - it works!
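The caveat about "df" output formats can be sidestepped on systems whose df accepts the POSIX -P option, which fixes the column layout (Filesystem, blocks, Used, Available, Capacity, Mounted on). That option postdates this thread and is an assumption here, as are the function names avail_blocks and space_low. This is only a minimal sketch of the free-space test, not a replacement for the script above, and it assumes the filesystem name contains no spaces.

```sh
#!/bin/sh
# Sketch only: not part of the posted trashnews script.

# avail_blocks: read "df -P"-style output on stdin and print the
# "Available" column (field 4) of the first data line.
avail_blocks() {
    awk 'NR == 2 { print $4 }'
}

# space_low LIMIT: succeed (exit 0) when fewer than LIMIT blocks are free.
# Reads df -P output on stdin, so it can be tried on canned text.
space_low() {
    avail=`avail_blocks`
    [ -n "$avail" ] && [ "$avail" -lt "$1" ]
}

# Typical use (path and limit are placeholders):
#   if df -P /usr/spool | space_low 5000; then echo "low on space"; fi
```

Reading the df output from stdin rather than running df inside the function keeps the parsing separate from the system call, so the field choice can be checked against a pasted sample before the script is trusted with "rm".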
dtynan@sultra.UUCP (Der Tynan) (12/01/88)
From article <1066@psuhcx.psu.edu>, by wcf@psuhcx (Bill Fenner):
>
> Does anyone have a good way to expire news automatically when the news
> partition gets full? We only have a 25 meg partition for news, and it
> often manages to fill up on weekends, and it's a big pain to come in
> to find a console log 5 inches thick with logs of /usr/spool/news: write
> failed, filesystem full.

I have a similar problem. Having given it some thought, I have come up with a clean solution that (someday) I will implement in 2.11 (or whatever). On the other hand, if any of the *new-and-improved* news software people are reading this, perhaps they'd care to comment?

Anyway, the idea is this. In the NEWS/active file, a new field is introduced in the tradition of the 'm' field for 'moderated'. It is a boolean ('y'/'n'?), which indicates that the given newsgroup is not read at this site. In this way, a nightly (or weekly) cron program would zip through all the .newsrc files, to see what groups aren't subscribed to, and update the 'active' file. On the other hand, if someone subscribes to a currently unavailable group, the daemon would reactivate it. And vnews/readnews/whatever would inform the reader that the group isn't currently carried, but will appear in a few days. Of course, certain groups (such as comp.mail.maps) would have a special mark saying that they must ALWAYS be subscribed to ('a' perhaps?). Then rnews, as part of its processing, would look at this flag and, if necessary, dump the article.

Currently, there are two ways of doing this. One is to remove the group from the active file, in which case the 'junk' group fills up like nobody's business. The other is to have the sysadmin at the remote feed modify the 'sys' file, so that certain groups aren't sent. This is awkward, because changes may occur very frequently. Both schemes also mean that the 'checkgroups' messages will bomb fairly severely.

In this age of Trailblazers, I don't think anyone is worried about line bandwidth, just disk space (20Mb/week), so this scheme would allow sites to carry only those groups that people actually read. Comments?

- Der
-- 
dtynan@zorba.Tynan.COM (Dermot Tynan @ Tynan Computers)
{apple,mips,pyramid,uunet}!Tynan.COM!dtynan

--- If the Law is for the People, then why do we need Lawyers? ---
nagel@beaver.ics.uci.edu (Mark Nagel) (12/01/88)
In article <2694@sultra.UUCP>, dtynan@sultra (Der Tynan) writes:
|
|Anyway, the idea is this. In the NEWS/active file, a new field is
|introduced in the tradition of the 'm' field for 'moderated'. It is
|a boolean ('y'/'n'?), which indicates that the given newsgroup is not
|read at this site. In this way, a nightly (or weekly) cron program
|would zip through all the .newsrc files, to see what groups aren't
|subscribed to, and update the 'active' file. On the other hand, if
|someone subscribes to a currently unavailable group, the daemon would
|reactivate it. And vnews/readnews/whatever would inform the reader
|that the group isn't currently carried, but will appear in a few
|days. Of course, certain groups (such as comp.mail.maps) would have
|a special mark saying that they must ALWAYS be subscribed to ('a'
|perhaps?). Then, rnews as part of its processing, would look at this
|flag, and if necessary, dump the article.

I'd prefer that if we are going to be adding a field to the active file, then the field should be the last access date of each newsgroup. This is for these reasons:

1. not everybody uses a news reader that uses .newsrc (e.g. vn, Gnews)
2. many people use a distributed news system with a central server that makes it tough to tell who reads what.

If the field was interpreted as "avg number of requests per day", then you get even more information so you can regulate expiration even better. Of course, all of this would seem to require that the active file become a non-readable entity, with some function of inews or whatever needed to get information from the active file so that it can be updated correctly. Or great cooperation from all the news readers out there. From this viewpoint, the cron method is superior.

I'm definitely in favor of anything that automatically trims groups that are unpopular. Any such scheme must be used carefully by non-leaf nodes so that news can pass through to downstream feeds (I use "must" loosely here, given the nature of the net).

Mark Nagel @ UC Irvine, Dept of Info and Comp Sci
ARPA: nagel@ics.uci.edu              | radiation: n. ... 2. smog with an
UUCP: {sdcsvax,ucbvax}!ucivax!nagel  | attitude.
amb@dasys1.UUCP (A. M. Boardman) (12/02/88)
In article <2694@sultra.UUCP> dtynan@sultra.UUCP (Der Tynan) writes:
>On the other hand, if someone subscribes to a currently unavailable group,
>the daemon would reactivate it. And vnews/readnews/whatever would inform
>the reader that the group isn't currently carried, but will appear in a few
>days. Of course, certain groups (such as comp.mail.maps) would have a special

Under this system, how do you get articles to propagate to the rest of the net? Sample situation:

	backbone!feedsite!site-with-only-one-or-two-feeds

If nobody on the feedsite(s) happened to subscribe to a particular newsgroup, all the stuff posted downstream from it in that group never gets out to the rest of the net. Think about it.

---
"On a scale from one to ten, that's pretty bad."
A.M.Boardman Big Electric Cat PA Unix ![hoptoad|phri|(uunet)]!dasys1!amb
stu@jpusa1.UUCP (Stu Heiss) (12/03/88)
In article <2694@sultra.UUCP> dtynan@sultra.UUCP (Der Tynan) writes:
-From article <1066@psuhcx.psu.edu>, by wcf@psuhcx (Bill Fenner):
->
-> Does anyone have a good way to expire news automatically when the news
-> partition gets full?
-I have a similar problem. Having given it some thought, I have come up with
-a clean solution that (someday) I will implement in 2.11 (or whatever).
-Anyway, the idea is this. In the NEWS/active file, a new field is introduced
-in the tradition of the 'm' field for 'moderated'. It is a boolean ('y'/'n'?),
-which indicates that the given newsgroup is not read at this site. In this
-way, a nightly (or weekly) cron program would zip through all the .newsrc
-files, to see what groups aren't subscribed to, and update the 'active' file.
-On the other hand, if someone subscribes to a currently unavailable group,
-the daemon would reactivate it. And vnews/readnews/whatever would inform
-the reader that the group isn't currently carried, but will appear in a few
-days. Of course, certain groups (such as comp.mail.maps) would have a special
-mark saying that they must ALWAYS be subscribed to ('a' perhaps?).

We do something similar with a couple of shell scripts and no mods to the news software - works quite nicely. I use the previously posted script (inactng.sh) to get a list of inactive (nobody reads them) newsgroups and rm the articles in the associated directories. In addition, we always junk 'junk' and never junk 'comp.mail.maps' and 'news.announce.important'. Run 'junkarts.sh' as often as necessary. In my hourly news startup and nitely expire scripts I have:

/usr/lib/news/expire.sh:find $LIB/active -newer $LIB/lastjunk -exec $LIB/junkarts.sh ';' -exec touch $LIB/lastjunk ';'
/usr/lib/news/rnews.sh:find $LIB/active -newer $LIB/lastjunk -exec $LIB/junkarts.sh ';' -exec touch $LIB/lastjunk ';'

This way if the active file gets modified junkarts is run. Following is junkarts.sh and the utility inactng.sh.

#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create:
#	/usr/lib/news/junkarts.sh
#	/usr/lib/news/inactng.sh
# This archive created: Fri Dec 2 10:58:35 1988
# By: stu (JPUSA - Chicago, IL)
export PATH; PATH=/bin:/usr/bin:$PATH
echo shar: "x - '/usr/lib/news/junkarts.sh'"
if test -f '/usr/lib/news/junkarts.sh'
then
	echo shar: "will not over-write existing file '/usr/lib/news/junkarts.sh'"
else
cat << \SHAR_EOF/usr/lib/news/junkarts.sh > '/usr/lib/news/junkarts.sh'
:
PATH=/bin:/usr/bin
export PATH
lib=/usr/lib/news
spool=/usr/spool/news
tmpa=/tmp/.a$$
tmpb=/tmp/.b$$
trap 'rm -f $tmpa $tmpb;exit' 0 1 2 3 15
junkalways=false
cd $spool
$lib/inactng.sh|tr . /|grep -v comp/mail/maps|grep -v news/announce/important > $tmpa
cat -s $tmpa|while read d
do
	test -d $d && ls $d|sed "s;^;$d/;"
done|while read f
do
	test -f $f && echo $f
done > $tmpb
$junkalways && find junk -type f -print >> $tmpb
xargs < $tmpb rm -f $f
SHAR_EOF/usr/lib/news/junkarts.sh
if test 452 -ne "`wc -c < '/usr/lib/news/junkarts.sh'`"
then
	echo shar: "error transmitting '/usr/lib/news/junkarts.sh'" '(should have been 452 characters)'
fi
chmod +x '/usr/lib/news/junkarts.sh'
fi
echo shar: "x - '/usr/lib/news/inactng.sh'"
if test -f '/usr/lib/news/inactng.sh'
then
	echo shar: "will not over-write existing file '/usr/lib/news/inactng.sh'"
else
cat << \SHAR_EOF/usr/lib/news/inactng.sh > '/usr/lib/news/inactng.sh'
:
ng1=/tmp/ng1$$
ng2=/tmp/ng2$$
trap 'rm -f $ng1 $ng2;exit' 0 1 2 3 15
ACTIVE=/usr/lib/news/active
cat `sed 's;[^:]*:[^:]*:[^:]*:[^:]*:[^:]*:\([^:]*\):.*;\1/.newsrc;' /etc/passwd` 2> /dev/null \
| sed -n 's/\([^:]*\):.*$/\1/p' |sort |uniq > $ng1
sed 's/ .*//' $ACTIVE |sort > $ng2
diff $ng1 $ng2 | sed -n 's/^> //p'
SHAR_EOF/usr/lib/news/inactng.sh
if test 476 -ne "`wc -c < '/usr/lib/news/inactng.sh'`"
then
	echo shar: "error transmitting '/usr/lib/news/inactng.sh'" '(should have been 476 characters)'
fi
chmod +x '/usr/lib/news/inactng.sh'
fi
exit 0
# End of shell archive
-- 
Stu Heiss {spl1,uchicago.edu!gargoyle,ddsw1}!jpusa1!stu
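The set difference that inactng.sh computes with diff (groups listed in the active file but subscribed to in no .newsrc) can also be sketched with comm, which is built for comparing sorted lists. The function name inactive_groups and its argument order are illustrative assumptions, not part of the posted archive; the sketch takes the .newsrc paths as arguments instead of deriving them from /etc/passwd, and it treats only "group: ranges" lines as subscriptions.

```sh
#!/bin/sh
# Sketch only: an alternative to the diff step in inactng.sh.

# inactive_groups ACTIVE NEWSRC...
# Print every group named in ACTIVE (first field) that appears as a
# subscribed line ("group: ranges") in none of the NEWSRC files.
inactive_groups() {
    active=$1; shift
    tmp_sub=${TMPDIR:-/tmp}/sub$$
    tmp_all=${TMPDIR:-/tmp}/all$$
    # /dev/null keeps cat from reading the terminal when no NEWSRC is given.
    cat /dev/null "$@" 2>/dev/null |
        sed -n 's/^\([^:!]*\):.*/\1/p' | sort -u > "$tmp_sub"
    awk '{ print $1 }' "$active" | sort -u > "$tmp_all"
    # comm -13 prints the lines found only in the second file.
    comm -13 "$tmp_sub" "$tmp_all"
    rm -f "$tmp_sub" "$tmp_all"
}

# Typical use (paths are placeholders):
#   inactive_groups /usr/lib/news/active /usr/*/.newsrc
```

Lines of the form "group! ranges" mark unsubscribed groups in a .newsrc, so the sed pattern skips them and those groups still count as inactive.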
henry@utzoo.uucp (Henry Spencer) (12/03/88)
In article <2694@sultra.UUCP> dtynan@sultra.UUCP (Der Tynan) writes:
>Anyway, the idea is this. In the NEWS/active file, a new field is introduced
>in the tradition of the 'm' field for 'moderated'. It is a boolean ('y'/'n'?),
>which indicates that the given newsgroup is not read at this site...

If all you want to do is to expire such newsgroups quickly, it's trivial to build a control file for C News expire instead. This won't stop the stuff from being filed, but it will get it off the disk quickly. It has the advantage of no incompatible changes to file formats, and use of existing tools rather than inventing yet another wheel.

>... a nightly (or weekly) cron program would zip through all the .newsrc
>files, to see what groups aren't subscribed to, and update the 'active' file.

Note that any scheme which examines .newsrc needs to have some sort of rule about how recent a .newsrc has to be before it is considered. Otherwise a single user, who theoretically reads a lot of news but hasn't been on for six months, fouls the whole thing up.
-- 
SunOSish, adj: requiring | Henry Spencer at U of Toronto Zoology
32-bit bug numbers.      | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
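One possible form of the recency rule is to consider only .newsrc files modified within some number of days, which find's -mtime test can express directly. The function name recent_newsrcs and the 30-day figure in the usage line are assumptions made for illustration; nothing in the thread fixes a particular cutoff.

```sh
#!/bin/sh
# Sketch only: one way to apply a "how recent" rule to .newsrc files.

# recent_newsrcs DAYS HOME...
# Print the .newsrc path under each HOME directory whose file was
# modified less than DAYS days ago; older or missing files are skipped.
recent_newsrcs() {
    days=$1; shift
    for home in "$@"; do
        [ -f "$home/.newsrc" ] || continue
        find "$home/.newsrc" -mtime "-$days" -print
    done
}

# Typical use (the home directories are placeholders):
#   recent_newsrcs 30 /usr/alice /usr/bob
```

The resulting list could be fed to whatever script collects subscribed groups, so that a reader who has been away for six months no longer keeps groups alive.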
dtynan@sultra.UUCP (Der Tynan) (12/06/88)
In article <8048@dasys1.UUCP>, amb@dasys1.UUCP (A. M. Boardman) writes:
> In article <2694@sultra.UUCP> dtynan@sultra.UUCP (Der Tynan) writes:
> >On the other hand, if someone subscribes to a currently unavailable group,
> >the daemon would reactivate it. And vnews/readnews/whatever would inform
> >the reader that the group isn't currently carried, but will appear in a few
> >days.
> Under this system, how do you get articles to propagate to the rest of the net?
>
> A.M.Boardman Big Electric Cat PA Unix ![hoptoad|phri|(uunet)]!dasys1!amb

I thought of that already :-) In *no* way should the downstream sites be 'censored' by this system. In the case of a batched newssite, inews would continue to generate news batches for neighbours, but wouldn't save the articles in the local news spool directory. In the case of a non-batched link, something similar could be done.

Since my original posting, I have gotten some interesting feedback. Amos Shapir suggested using a date field in the 'active' file, to see when the group was last read. This is because some people subscribe to certain groups, but haven't actually read them in months. Another possibility would be a 'count' field for each newsgroup, which reflected the number of people who had read the group. Each week, then, the cron utility would zero the counts. Comments?

- Der
-- 
dtynan@zorba.Tynan.COM (Dermot Tynan @ Tynan Computers)
{apple,mips,pyramid,uunet}!Tynan.COM!dtynan

--- If the Law is for the People, then why do we need Lawyers? ---
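The 'count' idea can be tried without touching the active file format by keeping the counts in a separate file. The sketch below assumes a hypothetical side file with one "group count" pair per line; that file, and the function names bump_count and zero_counts, are inventions for illustration and are not maintained by any news software discussed here.

```sh
#!/bin/sh
# Sketch only. The "group count" side file is hypothetical.

# bump_count GROUP: copy stdin to stdout, adding 1 to GROUP's count,
# or appending "GROUP 1" if the group is not listed yet.
bump_count() {
    awk -v g="$1" '
        $1 == g { $2 = $2 + 1; seen = 1 }
        { print }
        END { if (!seen) print g, 1 }
    '
}

# zero_counts: copy stdin to stdout with every count reset to 0,
# as the weekly cron job would.
zero_counts() {
    awk '{ print $1, 0 }'
}

# Typical use (the file name is a placeholder):
#   zero_counts < counts > counts.new && mv counts.new counts
```

Both functions are filters, so a real installation would still need some form of locking around the rewrite if several readers can update the file at once.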
nyssa@terminus.UUCP (The Prime Minister) (12/06/88)
In article <1988Dec2.184323.7511@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Note that any scheme which examines .newsrc needs to have some sort of
>rule about how recent a .newsrc has to be before it is considered.
>Otherwise a single user, who theoretically reads a lot of news but hasn't
>been on for six months, fouls the whole thing up.

The selection criteria for pexpire include:

	.newsrc touched within a compile time specified time period
	OR within the history expiration time

	The newsgroup being subscribed to

	The last article read from that newsgroup being in the range
	of articles on the system.

Therefore, a user who hasn't read news for 6 months won't be looked at, unless you save the news for that long.
-- 
James C. Armstrong, Jr nyssa@terminus.UUCP
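The second and third criteria can be illustrated with a short awk filter. This is not pexpire's code, which is not shown in the thread; it is a sketch that assumes the usual active file layout of "group high low flag" and .newsrc lines of the form "group: ranges", and the function name reads_current is made up for the example.

```sh
#!/bin/sh
# Sketch only: not taken from pexpire.

# reads_current ACTIVE < newsrc
# For each subscribed line "group: ranges" on stdin, print the group when
# the highest article number in ranges lies between the low and high
# article numbers recorded for that group in ACTIVE.
reads_current() {
    awk '
        # First file (ACTIVE): remember each group'"'"'s high and low numbers.
        NR == FNR { hi[$1] = $2 + 0; lo[$1] = $3 + 0; next }
        # Second input (stdin): subscribed .newsrc lines only.
        /^[^:!]*:/ {
            group = $0; sub(/:.*/, "", group)
            ranges = $0; sub(/^[^:]*:[ \t]*/, "", ranges)
            n = split(ranges, parts, /[-,]/)
            last = 0
            for (i = 1; i <= n; i++)
                if (parts[i] + 0 > last) last = parts[i] + 0
            if (group in hi && last >= lo[group] && last <= hi[group])
                print group
        }
    ' "$1" -
}

# Typical use (paths are placeholders):
#   reads_current /usr/lib/news/active < $HOME/.newsrc
```

A reader whose last-read article has already expired falls below the group's low number and is ignored, which is how a six-month absence drops out unless news is kept that long.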