geb@igor.Rational.COM (Gary E. Barnes) (11/11/88)
We've got a bit of a problem with news. Last night our disk ran out
of space but apparently not out of inodes. As a result we have "a
large number" of articles that "exist" (to the extent that rn thinks
that they are there and tries to display them) but which do not
really exist (in that they are completely empty).

Can anyone out there tell me what magic I might do to a) get rid of
all of these "nonexistent" articles and get the real articles, and b)
is there any easy way to prevent this kind of thing in the future
(perhaps we've missed a switch somewhere; perhaps some part of the
news software needs additional safeguards added)?

Gary E. Barnes geb@rational.com
bill@twwells.uucp (T. William Wells) (11/12/88)
In article <341@igor.Rational.COM> geb@igor.Rational.COM (Gary E. Barnes) writes:
: Can anyone out there tell me what magic I might do to a) get rid of
: all of these "nonexistent" articles and get the real articles
Getting rid of the bogus articles is easy, if you believe the
expire(8) man page. Use these lines:
find /usr/spool/news -size 0 -print | xargs rm -f
expire -r
The first goes through your news directories and trashes zero length
files. The second rebuilds your history files.
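Before deleting anything it can help to look at what the find would
match. The following is only a sketch: the function names are made up
for illustration, and the "-type f" test is an added assumption so that
only regular files are considered.

```sh
#!/bin/sh
# Hypothetical helpers, not part of the news software.

# list_empty_articles DIR: print every zero-length regular file under DIR
# so the list can be reviewed before anything is removed.
list_empty_articles() {
    find "$1" -type f -size 0 -print
}

# remove_empty_articles DIR: delete the files listed above.
# Afterwards the history still has to be rebuilt with "expire -r".
remove_empty_articles() {
    list_empty_articles "$1" | xargs rm -f
}
```

For example, "list_empty_articles /usr/spool/news | wc -l" gives a
count of the damaged articles without touching them.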
As for getting the files back, you'll have to work with your feed for
that, I think.
And as for preventing it from happening again? Good luck! I don't
believe there is any way to do that, short of making sure there is
always enough disk space available.
---
Bill
{uunet|novavax}!proxftl!twwells!bill
geoff@desint.UUCP (Geoff Kuenning) (11/14/88)
In article <341@igor.Rational.COM> geb@igor.Rational.COM (Gary E. Barnes) writes:
> We've got a bit of a problem with news. Last night our disk ran out
> of space but apparently not out of inodes.
> Can anyone out there tell me what magic I might do to a) get rid of
> all of these "nonexistent" articles and get the real articles, and b)
> is there any easy way to prevent this kind of thing in the future

Here are a couple of shell scripts I find useful.

The first, "askfor", asks your neighbor for a list of articles using
SENDME. Start by saving your history file; then run the "find"
command and "expire -r" suggested by someone else to get the empty
articles out of your history. Now use a few handy unix utilities
(such as sort, comm, and cut) to pick out the article ID's of the
articles you lost, and feed them into the "askfor" script.

The second script, "ckuucp", is handy for watching your uucp and news
spool directories to be sure they don't fill up. I run it out of
crontab quite often. If either /usr/spool/uucp or /usr/spool/news
gets too low on space or i-nodes, it shuts off uucico. Primitive, but
effective. You will have to edit it to change "FILESYS" and "NEWSSYS"
at a minimum.

WARNING: if you use "ckuucp", you must check on a daily basis to be
sure your news hasn't been shut off. Otherwise, your feed's spool
space will back up and you are risking offending your feed and getting
cut off.

Geoff Kuenning geoff@ITcorp.com uunet!desint!geoff
-------------------------------cut here-------------------------------
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#    askfor
#    ckuucp
# This archive created: Mon Nov 14 00:37:20 1988
export PATH; PATH=/bin:$PATH
if test -f 'askfor'
then
    echo shar: will not over-write existing file "'askfor'"
else
sed 's/^X//' << \SHAR_EOF > 'askfor'
X: Use /bin/sh
X#
X# $Header$
X#
X# $Log$
X#
X#    Ask a news neighbor for specific articles.
X#
X# Usage:
X#
X#    askfor dest-site < list-of-article-ids
XOURSITE=`uname -n`
XINEWS=/usr1/usenet/lib/inews
X
Xcase $# in
X    1) ;;
X    *)
X        echo 'Usage: askfor dest-site < list-of-article-ids' 1>&2
X        exit 1
X        ;;
Xesac
X
XDEST=$1
XTMP=/tmp/askfor$$
Xtrap "/bin/rm ${TMP}?; exit 1" 1 2 15
Xecho 'Thank you' > ${TMP}a
Xxargs -s60 echo > ${TMP}b
Xwhile read arts
Xdo
X    $INEWS -n to.$DEST -t "cmsg sendme $arts $OURSITE" < ${TMP}a
Xdone < ${TMP}b
X/bin/rm ${TMP}?
SHAR_EOF
chmod +x 'askfor'
fi # end of overwriting check
if test -f 'ckuucp'
then
    echo shar: will not over-write existing file "'ckuucp'"
else
sed 's/^X//' << \SHAR_EOF > 'ckuucp'
X
X#!/bin/sh
X#
X# @(#)ckuucp.sh 1.4 9/2/88 01:07:16
X#
X# Periodically check the amount of disk space left on /usr
X# If it falls below $1 blocks (500 default), kill any running uucico
X# as soon as its current temp file disappears. If it falls below
X# $2 blocks (100 default), kill all running uucico's regardless of
X# whether the current temp file is complete.
X#
X# The size of the news spool directory is also watched. If the sum of
X# the sizes of the D.* and TM.* files in /usr/spool/uucp, subtracted
X# from the free space on NEWSSYS, is less than $3 (default is the
X# value of $1), a soft limit on uucico's is invoked. Similarly, if
X# $4 (default $2) is exceeded, uucico's will be killed immediately.
X#
X# $5 is a multiplicative factor that will be applied to the number of
X# blocks in D.* files. This is intended to allow space for fragmentation
X# and archiving. The factor must be expressed as a rational number. It
X# must be quoted, and if it contains shell metacharacters they must be
X# escaped *inside* the quotes. For example,
X#
X#    "5 \* 3"
X#
X# $6 and $7 are soft and hard limits for the number of free inodes in
X# the news spool filesystem. Defaults are 2000 (soft) and 1000 (hard)
X#
X
XPATH=/bin:/usr/bin
Xexport PATH
X
XSOFT=${1-500}
XHARD=${2-100}
XNSOFT=${3-$SOFT}
XNHARD=${4-$HARD}
XFACTOR=${5-"125 / 100"}
XNISOFT=${6-2000}
XNIHARD=${7-1000}
XLIB=/usr/lib/uucp
XSPOOL=/usr/spool/uucp
XFILESYS=/usr/spool
XNEWSSYS=/usr1
X
X# Before anything else, see if there's a uucico running. This reduces
X# the load on a no-uucp system. Note that the loop is always one-trip;
X# it's just a way to protect against multiple lock files.
X
Xfor i in $SPOOL/LCK..*
Xdo
X    if [ ! -f $i ]
X    then
X        exit 0
X    fi
X    break
Xdone
X
X# This script only works if there are incoming TM.* files, so we might
X# as well check for those too:
X
Xfor i in $SPOOL/TM.*
Xdo
X    if [ ! -f $i ]
X    then
X        exit 0
X    fi
X    break
Xdone
X
Xcd $LIB
Xtrap "rm -f $LIB/cklock*; exit 0" 1 2 3 9 15
X
X# set up lock files to prevent simultaneous checking
X
Xcp /dev/null cklock
Xchmod 400 cklock
Xln cklock cklock1 || exit 1
X
Xtrap "rm -f $LIB/cklock*; exit 0" 0 1 2 3 9 15
X
X# If there are less than $SOFT free blocks left on the $FILESYS
X# file system, we must kill uucp. Restart is somebody else's business.
X
X
Xblocks=`df $FILESYS | sed "s/.*: *\([0-9][0-9]*\) blocks.*/\1/"`
Xnblocks=`df $NEWSSYS | sed "s/.*: *\([0-9][0-9]*\) blocks.*/\1/"`
Xninodes=`df $NEWSSYS | sed "s/.*blocks *\([0-9][0-9]*\) i-nodes.*/\1/"`
Xtotblocks=`ls -s /usr/spool/uucp/D.* /usr/spool/uucp/TM.* \
X    | awk 'BEGIN{tot=0}{tot += $1} END {print tot}'`
Xnblocks=`eval expr $nblocks - $totblocks "'*'" $FACTOR`
X
Xtemplist=`echo $SPOOL/TM.*`
Xif [ "$templist" = "$SPOOL/TM.*" ]
Xthen
X    templist=
Xfi
Xwhile [ "X$templist" != X -a \
X    \( "$blocks" -le $SOFT -o 0"$nblocks" -le "$NSOFT" \
X    -o "0$ninodes" -le "$NISOFT" \) ]
Xdo
X    if [ "$blocks" -le $HARD -o 0"$nblocks" -le "$NHARD" \
X        -o 0"$ninodes" -le "$NIHARD" ]
X    then
X        plist=`ps -e|grep uucico|cut -c1-6`
X        case "X$plist" in
X            X)
X                ;;
X            *)
X                kill $plist
X                echo Subject: uucico killing'
X
X'Uucico"'"s $plist have been killed due to no disk - \
X                    "$blocks" "$nblocks" "$ninodes" | mail root
X                # Get rid of STST.* that have CONVERSATION in them so
X                # uurecov doesn't restart, sigh.
X                sleep 15
X                /bin/rm -f `egrep -l CONVERSATION /usr/spool/uucp/STST.*`
X                ;;
X        esac
X        exit 0
X    fi
X    sleep 120
X    nlist=
X    for i in $templist
X    do
X        if [ -f "$i" ]
X        then
X            nlist="$nlist $i"
X        elif [ "$i" != "$SPOOL/TM.*" ]
X        then
X#
X# Here we have found a disappearing temp file. We will kill
X# the uucp that owned it before it gets too much farther.
X#
X            sleep 30    # Give it time to die on its own
X            owner=`expr $i : $SPOOL'/TM\.0*\([1-9][0-9]*\)\....'`
X            kill $owner # Tough luck, it waited too long
X            echo Subject: uucico killing'
X
X'uucico $owner killed due to low disk - "$blocks" "$nblocks" "$ninodes" \
X                | mail root
X            # Get rid of STST.* that have CONVERSATION in them so
X            # uurecov doesn't restart, sigh.
X            sleep 15
X            /bin/rm -f `egrep -l CONVERSATION /usr/spool/uucp/STST.*`
X        fi
X    done
X    templist="$nlist"
X    blocks=`df $FILESYS | sed "s/.*: *\([0-9][0-9]*\) blocks.*/\1/"`
X    nblocks=`df $NEWSSYS | sed "s/.*: *\([0-9][0-9]*\) blocks.*/\1/"`
X    ninodes=`df $NEWSSYS | sed "s/.*blocks *\([0-9][0-9]*\) i-nodes.*/\1/"`
X    totblocks=`ls -s /usr/spool/uucp/D.* \
X        | awk 'BEGIN {tot = 0} {tot += $1} END {print tot}'`
X    nblocks=`eval expr $nblocks - $totblocks "'*'" $FACTOR`
Xdone
Xexit 0
SHAR_EOF
chmod +x 'ckuucp'
fi # end of overwriting check
# End of shell archive
exit 0
--
Geoff Kuenning geoff@ITcorp.com uunet!desint!geoff
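The sort/comm/cut step described in the post above might look roughly
like the sketch below. It assumes that each history line starts with
the article ID followed by a tab, which should be checked against the
local history file first; the function name and the temporary file
names are invented for illustration.

```sh
#!/bin/sh
# lost_article_ids SAVED REBUILT
#   SAVED:   copy of the history file made before the cleanup
#   REBUILT: history file after "expire -r"
# Prints the article IDs present in SAVED but missing from REBUILT.
# Assumption: the article ID is the first tab-separated field.
lost_article_ids() {
    old=${TMPDIR-/tmp}/lost_old.$$
    new=${TMPDIR-/tmp}/lost_new.$$
    cut -f1 "$1" | sort -u > "$old"
    cut -f1 "$2" | sort -u > "$new"
    # comm -23 keeps lines that appear only in the first (sorted) file
    comm -23 "$old" "$new"
    rm -f "$old" "$new"
}
```

The output is one article ID per line, which is the form "askfor"
reads on its standard input.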
mcdaniel@uicsrd.csrd.uiuc.edu (11/17/88)
Talk about self-referential articles! This is what we saw at our site.
(The lines of the form "/* ... */" are the headers from notes):

/* Written 12:08 pm Nov 10, 1988 by geb@igor.Rational.COM in uicsrd.csrd.uiuc.edu:news.admin */
/* ---------- "Empty News Articles" ---------- */
/* End of text from uicsrd.csrd.uiuc.edu:news.admin */

I know that "write" often returns immediately (using write-behind), so
there's no guarantee of an error code being returned. Isn't there
some other way for news/notes to check for a full filesystem?

--
Tim, the Bizarre and Oddly-Dressed Enchanter
Center for Supercomputing Research and Development
Internet, BITNET: mcdaniel@uicsrd.csrd.uiuc.edu
UUCP: {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel
ARPANET: mcdaniel%uicsrd@uxc.cso.uiuc.edu
CSNET: mcdaniel%uicsrd@uiuc.csnet
DECnet: GARCON::"mcdaniel@uicsrd.csrd.uiuc.edu"
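One crude way for a script to check for a nearly full file system is
to ask df before accepting anything. The sketch below assumes a df
that supports the POSIX "-P" and "-k" options, which the systems
discussed in this thread may well lack, and the function names are
hypothetical. The disk can still fill between the check and the
write, so this does not replace checking for write errors.

```sh
#!/bin/sh
# free_kb DIR: print the free space, in 1024-byte blocks, on the file
# system holding DIR (fourth column of the POSIX "df -P" output).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# have_space DIR MIN_KB: succeed only if at least MIN_KB are free.
have_space() {
    free=`free_kb "$1"`
    [ -n "$free" ] && [ "$free" -ge "$2" ]
}
```

A wrapper could then run something like
"have_space /usr/spool/news 500 || exit 1" before unbatching.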
dhesi@bsu-cs.UUCP (Rahul Dhesi) (11/23/88)
In article <53200002@uicsrd.csrd.uiuc.edu> mcdaniel@uicsrd.csrd.uiuc.edu writes:
>I know that "write" often returns immediately (using write-behind), so
>there's no guarantee of an error code being returned. Isn't there
>some other way for news/notes to check for a full filesystem?

A common bug in C programs is to fail to detect a full device. The
write system call is guaranteed to return an error code if the device
is full or will become full when the kernel flushes its internal
buffer(s) for that open file. But programmers using stdio are not
always careful to check return values from ALL functions/macros that
could possibly cause an error. These include: printf, putc/putchar,
fflush, and fclose.
--
Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
guy@auspex.UUCP (Guy Harris) (11/24/88)
>I know that "write" often returns immediately (using write-behind), so
>there's no guarantee of an error code being returned. Isn't there
>some other way for news/notes to check for a full filesystem?

"write" to a UNIX file system should allocate space immediately, even
if it defers the actual "write". It does so on both V7/S5 and 4.2BSD
file systems under all the UNIX systems I know of.

"write" to a file system mounted over NFS will do the allocation on
the server, so allocation will be deferred, and you may not get
notification of a full file system. In this case, if your system
supports "fsync", you should do an "fsync" before you close the file
descriptor (if you're using standard I/O, do an "fflush" to flush
standard I/O's buffers, and then do an "fsync(fileno(xxx))", before
you do the "fclose") and check the error return from the "fsync". The
SunOS NFS implementation (from which many, perhaps most, UNIX ones are
derived) will "remember" errors such as "file system full", and will
report them to you if you do an "fsync".
goutier@ouareau.iro.umontreal.ca (Claude Goutier) (12/01/88)
In respect to the loss of articles due to lack of disk space (a problem
which frustrated me of some good news also), it should not depend on
whether write allocates space on the disk immediately or not.

The right thing to do is to not consider an article received unless the
text has been correctly written to disk and the descriptors properly set.
This way, if anything goes wrong, the article is still "not received"
and will be picked up (hopefully) on the next delivery. It is also nice,
if any abnormal condition occurs, to undo whatever has been started but
not completed.

Am I missing something in the way News Articles are copied from one
machine to another, or are there any complications in implementing
the scheme sketched in the above paragraph?
--
Claude Goutier
Centre de calcul, Universite de Montreal
C.P. 6128, Succ "A", Montreal (Quebec)
goutier@iro.umontreal.ca
Canada H3C 3J7
(514) 343-7234
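The "not received until safely written" idea can be illustrated with a
small shell sketch. This is not how the news software stores articles;
the function name and temporary file naming are invented. It relies on
cat exiting non-zero when a write fails, and as noted earlier in the
thread, deferred errors on an NFS mount could still slip past it.

```sh
#!/bin/sh
# store_article DEST: copy an article from standard input to DEST.
# The text first goes to a temporary file in the same directory; it is
# renamed into place only if the copy succeeded and is not empty.
# On any failure the partial file is removed and 1 is returned, so the
# caller can treat the article as "not received".
store_article() {
    dest=$1
    tmp=$dest.tmp.$$
    if cat > "$tmp" && [ -s "$tmp" ]
    then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

A caller would only record the article in the history after
store_article returns success.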
waynec@tektronix.TEK.COM (Wayne Clark) (12/02/88)
The way I fixed the "empty news articles" problem was to add a call to
fflush() before each call to ferror() in inews.c. It then returns
error status to nntpd, so the article doesn't get lost.
--
"Out of the frying pan (hardware), and into the fire (software)!"
Wayne Clark                      uucp: {decvax,uunet}!tektronix!waynec
Engineering Network Support      csnet: waynec@tektronix.TEK.COM
Tektronix, Inc.                  phone: (503) 627-5364
john@frog.UUCP (John Woods) (12/06/88)
In article <779@mannix.iros1.UUCP>, goutier@ouareau.iro.umontreal.ca (Claude Goutier) writes:
H> In respect to the loss of articles due to lack of disk space (a problem
o> which frustrated me of some good news also), it should not depend on the
w> fact that write allocate immediately to the disk or not.
 >
t> The right think to do is to not consider an article received unless the
o> text has been correctly written to disk and the descriptors properly set.
 > This way, if anything goes wrong, the articles is still "not received"
d> and will be pick up (hopefully) on the next delivery. It is also nice,
o> if any abnormal condition occurs, to undo what so far has been done and
 > has not been completed.
it.>

Generally the sending system has no idea what happens on the receiving
system. It is thus up to the receiving system to beg for a repeat.
One way to accomplish this would be to have rnews cons up a "sendme"
control article for lost messages. A complication of this is that you
would need to have space somewhere for the control article. Also,
earlier versions of news would record articles as received even if
they were garbled in writing, but 2.11.13 doesn't (I think); you might
need to run expire -r (rebuild) to have those entries removed from the
history file. The "sendme" article can be crafted by hand by looking
through the /usr/lib/news/log file and plucking out the failure
messages.

Possibly an easier solution would be to have the sending system send
an "ihave" control message containing all of the articles that it has.
I would personally like to see the netnews protocol extended to have a
"whatchagot" control message to elicit one of these automatically; in
the absence of that, a friendly feed might set up a shell script that
could be uux-ed (or otherwise remotely executed) that would create
one; a friendlier but security-conscious feed might be willing to do
that by hand on rare occasions...
-- John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101 ...!decvax!frog!john, john@frog.UUCP, ...!mit-eddie!jfw, jfw@eddie.mit.edu Go be a `traves wasswort. - Doug Gwyn