alexis@panix.uucp (Alexis Rosen) (10/01/90)
(Like the saying goes, it sure was a mother...)

I and several others have in recent weeks complained about uuxqt's habit
of quitting in the middle of a large job. This was annoying, but I thought
I'd solved it by running it from cron every half hour. This is *DEFINITELY*
not the answer, though; it will still toast your files fairly frequently. Of
course, I didn't find this out until Cnews had the grace to tell me there
was a problem.

Since Apple obviously can't write a working uucp (excepting Ron, who's not
distributing it), I decided to hack up a shell script that would deal with
both of these problems. It does not need to be set{g/u}id to anything, and
as far as I can tell it works beautifully. However, I make *no guarantees*.
A month ago I'd never written a real shell script... I guess I have at
least one thing to thank A/UX for - aside from those sleepless nights, I
mean :-/

The strategy is simple. uuxqt never dies quickly; it must go through (it
seems, from watching many runs) at least 15 or 18 X files before it has a
chance to die. So I never give it a chance to die. I simply hide the X
files and show them 10 at a time. I'm not sure, but I think this may
actually speed up uuxqt if there are a lot of files in the spool (like,
more than 300). I also do some locking that is, as far as I can tell,
100% safe.

If anyone discovers any bugs, please mail me as well as posting. After
all, if it is broken (unlikely though I think that to be), my news won't
be all that reliable...

Enjoy.

---
Alexis Rosen                          very tired SYSOP/Owner
PANIX Public Access Unix Systems of NY
cmcl2!panix!alexis  or  alexis@panix.uucp

---------------------------->% cut here %<--------------------------------
#!/bin/sh
# uuxqt.wrap - V1.0 written by Alexis Rosen 9-30-90
#
# This Bourne shell script is a wrapper for uuxqt which will prevent it from
# crapping out in the middle of a long run, almost certainly losing a file
# in the process. Rename the original uuxqt to uuxqt.real and change this
# file's name to uuxqt. This should be owned/grouped by uucp, mode 770.
#
# It's too bad the guy who "fixed" uuxqt can't program to save his soul. I
# am absolutely disgusted with this. I just hope they get it right in 2.0.1!

cd /usr/spool/uucp
if [ ! -f X.* ] ; then exit 0 ; fi	# nothing to do

HIDEDIR=/usr/spool/uucp/hidden-x-files	# stick excess X files here
if [ ! -d $HIDEDIR ] ; then mkdir $HIDEDIR ; chmod 770 $HIDEDIR ; fi

# Check for a LCK.WXQT file. If it exists, see if it's stale or not.
# There is a very small window of time in which this locking system could
# fail. So wait ten seconds (probably way conservative) and inspect the lock.
if [ -f LCK.WXQT ] ; then
	kill -0 `cat LCK.WXQT` 2>/dev/null
	if [ $? != 0 ] ; then		# stale lock
		rm -f LCK.WXQT
	else
		exit 0
	fi
fi
trap 'rm -f LCK.WXQT /tmp/xw$$' 0 1 2 15
echo "$$" >LCK.WXQT			# make the lock

# Check the lock to make sure we kept it. If not, let the other guy do the work.
sleep 10
if [ $$ != `cat LCK.WXQT` ] ; then trap 0 1 2 15 ; exit 0 ; fi

# Now move all the X. files into the hidden directory and then move 10 back
# out. When there aren't many X. files this won't matter, but when there are
# hundreds, it's much more efficient than moving all but 10. Put the mv inside
# the loop to pick up any new X. files that might have just arrived.
while : ; do
	mv -f X.* $HIDEDIR 2>/dev/null	# all X. files into the hole
	ls $HIDEDIR >/tmp/xw$$		# make a list
	for i in `head /tmp/xw$$` ; do	# pull out first ten
		mv $HIDEDIR/$i $i
	done
	if [ ! -f X.* ] ; then exit 0 ; fi	# normal exit here
	/usr/lib/uucp/uuxqt.real $*	# fire up the real uuxqt
	XEXIT=$?
	if [ $XEXIT != 0 ] ; then exit $XEXIT ; fi
done
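[Editor's note: the wrapper's locking dance - probe the recorded pid with
kill -0, write our own pid, wait out the race window, then re-read and
compare - can be exercised on its own. The sketch below reworks that
protocol as a standalone function; the lock path /tmp/LCK.TEST.$$ and the
1-second window are hypothetical stand-ins for the real script's LCK.WXQT
and 10-second wait.]

```shell
#!/bin/sh
# Sketch of the wrapper's locking protocol (assumed names: LCK.TEST
# stands in for LCK.WXQT, and the wait is shortened to 1 second).
LOCK=${LOCK:-/tmp/LCK.TEST.$$}

take_lock() {
	if [ -f "$LOCK" ] ; then
		pid=`cat "$LOCK"`
		if kill -0 "$pid" 2>/dev/null ; then
			return 1	# live lock: another run is active
		fi
		rm -f "$LOCK"		# stale lock: holder is dead
	fi
	echo "$$" > "$LOCK"
	sleep 1				# let any racing writer finish
	# If someone else overwrote the file in the window, concede
	# and let them do the work.
	now=`cat "$LOCK"`
	[ "$$" = "$now" ]
}

if take_lock ; then
	echo "lock held by $$"
	rm -f "$LOCK"
else
	echo "lock busy"
fi
```

The re-read after the sleep is what closes (most of) the window: two
processes can both write the lock file, but only the one whose pid survives
the overwrite proceeds.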
rmtodd@uokmax.ecn.uoknor.edu (Richard Michael Todd) (10/01/90)
(Note: The following article contains a description of the uuxqt bug that
I took out of a news article I posted a couple of days ago. The reason I'm
reposting it is that none of you got to see it the first time: the article
sat for *3 DAYS* in uokmax's inbound spool directory, waiting for uuxqt to
do something with it. Looks like Apple's not the only one with uuxqt
problems. You'll probably see the guts of this message again if and when
uuxqt on uokmax gets a clue and actually starts executing requests again.
Tough. I don't suppose Apple has any plans to port A/UX to the Multimax;
it'd be a distinct improvement on what we've already got...)

Anyway, alexis@panix.uucp (Alexis Rosen) writes:
>I and several others have in recent weeks complained about uuxqt's habit
>of quitting in the middle of a large job. This was annoying, but I thought
>I'd solved it by running it from cron every half hour. This is *DEFINITELY*
>not the answer though- it will still toast your files fairly frequently. Of
>course, I didn't find this out until Cnews had the grace to tell me there
>was a problem.

Actually, the problem is that rnews will toast the batch it's receiving
when it's executed with only one free file descriptor. See below.

>Since Apple obviously can't write a working uucp (excepting Ron, who's not

Actually, I seriously doubt Apple had anything to do with this one. I
gather other versions of uuxqt have had this problem; the bug is probably
a leftover from AT&T.

Anyway, here's the explanation of the uuxqt bug, along with another
workaround:

The problem is that in uuxqt's main loop, where it processes each file, it
forgets to close one of the files. Eventually it runs out of file handles,
can't open the work files, and aborts.
This in itself isn't too bad (it just aborts and leaves the work files
around for the next batch). The trouble comes when it's just one or two
XQTs away from disaster, i.e. it's only got 1 or 2 free file handles: when
rnews is forked off and runs, it inherits only 1 or 2 free file handles,
and when it tries to do all those `spacefor`'s and stuff (which require
opening pipes, etc.), rnews fails in all sorts of entertaining ways
(usually coming to the erroneous conclusion that the disk is out of
space). I gather this bug is in a few other UUCP implementations besides
Apple's; at least, the last time somebody asked on news.software.b why all
his news batches were being eaten on A/UX, Henry Spencer basically said,
"Oh, your uuxqt has the bug where ..."

Dominic Dunlop asks:
>Anyway, how do you fix this sucker? The bug, which was present in 1.1,
>seems to be bigger and better in 2.0. (I even looked on Apple's Update
>and Information server for a fix. No soap...)

First, you rob a bank. Then, you use the $75,000 you got from robbing the
bank to buy an AT&T source licence. :-)

Seriously, *you* probably can't fix the bug. You can, however, come up
with a workaround of sorts. The problem is that rnews freaks out when it
can't open files and pipes because of too few file descriptors. A
workaround is to move /bin/rnews to /bin/rnews.real, then compile the
following and install it as /bin/rnews.

-------------------------Cut here----------------------------
/*
** Program to get around uuxqt's problem in not closing file descriptors--
** this is a front end to rnews that ensures that all fds from 3 up are
** closed before the real rnews is executed.
*/
main() {
	int n = getdtablesize();
	int i;
	for (i = 3; i < n; ++i)
		close(i);
	execl("/bin/sh", "sh", "/bin/rnews.real", (char *)0);
}
-------------------------Cut here------------------------------

Of course, when uuxqt itself runs out of file descriptors, it will fail,
but it seems to fail "cleanly", i.e. just leaving the remaining work files
around for the next uuxqt run.

>Altogether now: I want my HDB!

Well, I'd settle for just a fixed version of uuxqt; having grown accustomed
to old-UUCP, I really don't want to start learning about HDB's bugs....

Anyway, back to Alexis' article:
>The strategy is simple. uuxqt never dies quickly- it must go through (it
>seems, from watching many runs) at least 15 or 18 X files before it has a
>chance to die. So I never give it a chance to die. I simply hide the X
>files, and show them 10 at a time. I'm not sure but I think this may actually
>speed up uuxqt if there's a lot of files in the spool (like, more than 300).
>I also do some locking that is, as far as I can tell, 100% safe.

The number of X. files uuxqt gets a chance to run through depends on how
many free file descriptors it had when it started, which depends somewhat
on how it was executed.

>If anyone discovers any bugs, please mail to me as well as posting. After
>all, if it is broken (unlikely though I think that to be), my news won't
>be all that reliable...

Since mail and news are both handled by uuxqt, the two should be about the
same in terms of reliability. (Granted, rnews fails if it's just very low
on file descriptors, but I suspect any halfway complicated MTA like smail
or sendmail will also fail under such conditions.)

Given the description of the problem, it shouldn't be too difficult for
someone at Apple to fix this bug. Any takers? Ron? You listening?
--
Richard Todd   rmtodd@chinet.chi.il.us or rmtodd@uokmax.ecn.uoknor.edu
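[Editor's note: the close-the-descriptors front end above can be
approximated in plain Bourne shell, with the caveat that sh can only name
single-digit descriptors (3..9), so it can't sweep the whole table the way
getdtablesize() allows. A hypothetical sketch, with a small demo in place
of the exec of rnews.real:]

```shell
#!/bin/sh
# Close inherited descriptors 3..9 -- all that Bourne sh can name --
# before handing off to the real program. A real front end would end
# with:  exec /bin/rnews.real ${1+"$@"}
close_extra_fds() {
	for fd in 3 4 5 6 7 8 9 ; do
		eval "exec $fd>&-" 2>/dev/null	# close; harmless if not open
	done
	:				# always report success
}

# Demo: leak fd 3 the way a buggy parent would, then clean up and show
# that the descriptor really is gone.
exec 3</dev/null
close_extra_fds
if ( : <&3 ) 2>/dev/null ; then
	echo "fd 3 still open"
else
	echo "fd 3 closed"
fi
```

The C version is still preferable where available, since it closes the
entire descriptor table rather than just the fds the shell can address.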
alexis@panix.uucp (Alexis Rosen) (10/02/90)
In article <1990Sep30.233245.22073@uokmax.ecn.uoknor.edu>
rmtodd@uokmax.ecn.uoknor.edu (Richard Michael Todd) writes:
>alexis@panix.uucp (Alexis Rosen) writes:
>>I and several others have in recent weeks complained about uuxqt's habit
>>of quitting in the middle of a large job. This was annoying, but I thought
>>I'd solved it by running it from cron every half hour. This is *DEFINITELY*
>>not the answer though- it will still toast your files fairly frequently. Of
>>course, I didn't find this out until Cnews had the grace to tell me there
>>was a problem.
>
>Actually, the problem is that rnews will toast the batch it's receiving
>when it's executed with only one free file descriptor. See below.

I know. The problem is, as you mention later, that this applies to almost
anything you might care to name which could be uuxqt's target: Bnews,
Cnews, mail with sendmail, or whatever.

> Actually, I seriously doubt Apple had anything to do with this one. I
>gather other versions of uuxqt have had this problem; the bug is probably
>a leftover from AT&T.

So? Part of their job was to clean out the old trash before bringing in
new trash... Seriously, they did the job 90% by getting rid of the real
disasters in the other uu* programs (although that TZ crap is still
there). Especially given the simplicity of this bug (both in the ability
to reproduce and diagnose it, not to mention its 'known' status), this
should have been dealt with. Maybe not the programmer, but certainly the
QA person(s) screwed up.

> Anyway, here's the explanation of the uuxqt bug, along with another
>workaround:
> The problem is that in uuxqt's main loop, where it processes each file, it
>forgets to close one of the files. Eventually it runs out of file handles,
>can't open the work files, and aborts. This in itself isn't too bad (it
>just aborts and leaves the work files around for the next batch), but when
>it's just one or two XQTs away from disaster, hence it's only got 1 or 2
>free file handles; when rnews is forked off and runs, it only has 1 or 2
>file handles, and when it tries to do all those `spacefor`'s and stuff
>(which require opening pipes, etc.) rnews fails in all sorts of entertaining
>ways (usually coming to the erroneous conclusion that the disk is out of
>space.) I gather this bug is in a few other UUCP implementations besides
>Apple's; at least, last time somebody asked on news.software.b why all
>his news batches were being eaten on A/UX, Henry Spencer basically said, "Oh,
>your uuxqt has the bug where ..."

Interesting. I guess I'll post the uuxqt wrapper there too, then. But as I
said, the fact that this was a known bug just makes it all that much more
annoying.

>[...] A workaround is to move /bin/rnews to /bin/rnews.real and
>compile the following and put in as /bin/rnews.
>-------------------------Cut here----------------------------
>/*
>** Program to get around uuxqt's problem in not closing file descriptors--
>** this is a front end to rnews that ensures that all fds from 3 up are
>** closed before the real rnews is executed.
>*/
>main() {
>	int n=getdtablesize();
>	int i;
>	for (i = 3 ; i < n ; ++i) close(i);
>	execl("/bin/sh","sh","/bin/rnews.real",(char *)0);
>}
>-------------------------Cut here------------------------------
>Of course, when uuxqt itself runs out of file descriptors, it will fail,
>but it seems to fail "cleanly", i.e. just leaving the remaining work files
>around for the next uuxqt run.

Unless I'm missing something (like a bug in my wrapper), it seems to me
that this strategy is distinctly inferior to the wrapper. (On the other
hand, it's a dramatic improvement over losing 20-30% of your inbound news
:-)...) The problem is that uuxqt will still fail, and then the rest of
your stuff will just sit in the spool until uuxqt fires up again -
typically, for some more incoming batches. In fact, on a busy system like
ours, you'd never catch up, and you would eventually run out of disk
space. You could of course tell cron to fire up a new uuxqt every half
hour or so, but that's so inelegant... Of course, that's exactly what I
did before I realized exactly what was going on (i.e., that my news was
getting munched).

[somebody else:]
>>Altogether now: I want my HDB!
>
>Well, I'd settle for just a fixed version of uuxqt; having grown accustomed
>to old-UUCP, I really don't want to start learning about HDB's bugs....

Really? I'd much rather have a subdirectory-based uucp of any sort - HDB,
Duke, or whatever. It's really a pain to mess around in a directory with
1500+ files in it. Not to mention the speed problem... In this case, I'd
be willing to pay the price of progress. Anything to get all those
non-critical programs like uulog (?!) working. :-(

>Anyway, back to Alexis' article:
>>The strategy is simple. uuxqt never dies quickly- it must go through (it
>>seems, from watching many runs) at least 15 or 18 X files before it has a
>>chance to die. So I never give it a chance to die. I simply hide the X
>>files, and show them 10 at a time. I'm not sure but I think this may actually
>>speed up uuxqt if there's a lot of files in the spool (like, more than 300).
>>I also do some locking that is, as far as I can tell, 100% safe.
>
>The number of X. files uuxqt gets a chance to run through depends on how
>many free file descriptors it had when it started, which depends somewhat
>on how it was executed.

Yes. I picked a number which I felt was sufficiently conservative, before
I knew exactly what was causing the problem. I don't feel any particular
need to go back and change it now... This is especially true since the
critical number varies with the uuxqt target command.

>>If anyone discovers any bugs, please mail to me as well as posting. After
>>all, if it is broken (unlikely though I think that to be), my news won't
>>be all that reliable...
>
> Since mail and news are both handled by uuxqt, the two should be about
>the same in terms of reliability. (Granted, rnews fails if it's just very
>low on file descriptors, but I suspect any halfway complicated MTA like
>smail or sendmail will also fail under such conditions.)

Nope. News uses more descriptors than mail. Fortunately, my script seems
to be holding up really well: three days under massive newsfeeds, 0
problems.

(In case y'all can't tell, I'm pretty pleased with myself. Not because it
was a particularly difficult piece of hackery, but because having fixed
this problem, I feel like I've had my sight or hearing restored. It's just
terrible, missing every fifth or sixth article... Not to mention the fact
that it wreaks havoc with any attempts to put together multipart
binaries...)

> Given the description of the problem, it shouldn't be too difficult for
>someone at Apple to fix this bug. Any takers? Ron? You listening?

I think it's safe to say that they're not unaware of this problem... I'd
be really surprised if it persisted in 2.0.1.

---
Alexis Rosen
{cmcl2,apple}!panix!alexis  or  alexis@panix.uucp
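[Editor's note: the "news uses more descriptors than mail" claim is
measurable rather than a matter of opinion, though not on A/UX itself.
The sketch below assumes a system with a Linux-style /proc filesystem,
which A/UX does not have; it is offered only as one way to check the
claim elsewhere.]

```shell
#!/bin/sh
# Count the open file descriptors of a process by pid. Assumes a
# /proc filesystem (Linux-style; A/UX lacks one), so this is a sketch
# of the idea, not something the original posters could have run.
fdcount() {
	ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# e.g. sample the current shell; rnews or sendmail could be sampled
# the same way while uuxqt is driving them.
fdcount $$
```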
rmtodd@uokmax.ecn.uoknor.edu (Richard Michael Todd) (10/03/90)
alexis@panix.uucp (Alexis Rosen) writes:
>Unless I'm missing something (like a bug in my wrapper), it seems to me that
>this strategy is distinctly inferior to the wrapper. (On the other hand, it's
>a dramatic improvement over 20-30% of your inbound news :-)...) The problem
>is that uuxqt will still fail, and then the rest of your stuff will just sit
>in the spool until uuxqt fires up again- typically, for some more incoming
>batches. In fact, on a busy system like ours, you'd never catch up, and you
>would eventually run out of disk space. You could of course tell cron to fire
>up a new uuxqt every half hour or so, but that's so inelegant... Of course,
>that's exactly what I did before I realized exactly what was going on (i.e.,
>that my news was getting munched).

Well, on my system, an outbound uucp poll (uucico -r1 -suokmax) is being
run once every half hour. If a previous uucp to uokmax is still running,
the new poll dies immediately, but it does go ahead and execute uuxqt. So,
in effect, uuxqt is being run every half hour. (It also helps that I'm
only getting news at 2400 bps, so it's rather difficult to get 15+ batches
inside of a half-hour period. I gather you're getting news at Trailblazer
speeds, so this is more of a problem.

Hmm..have you considered having your feed site send you larger batches?
Not only does this cause a lower load of batches for uuxqt, but it makes
transfer more efficient - at Trailblazer speeds, the time spent waiting
for the other side to start sending another file is a significant fraction
of the time it takes to send a ~50K file. I gather a good many Telebit
sites are running 500K news batches.)

>[somebody else:]
>>Altogether now: I want my HDB!
>>Well, I'd settle for just a fixed version of uuxqt; having grown accustomed
>>to old-UUCP, I really don't want to start learning about HDB's bugs....
>Really? I'd much rather a subdirectory-based uucp of any sort, HDB, Duke, or
>whatever. It's really a pain to mess around in a directory with 1500+ files
>in it. Not to mention the speed problem... In this case, I'd be willing to
>pay the price of progress. Anything to get all those non-critical programs
>like uulog (?!) working. :-(

I'd love to see BSD-style subdirectory-based UUCP, too. But given the
massive mania for SysV compatibility, we'll probably get either HDB or
old-UUCP, and given the choice, I'll settle for old-style UUCP. BTW,
what's wrong with uulog?

>Nope. News uses more descriptors than mail.

I suspect that this varies a good bit with the MTA. I wouldn't be
surprised if smail/deliver together eat up as many file descriptors as
rnews does.

>> Given the description of the problem, it shouldn't be too difficult for
>>someone at Apple to fix this bug. Any takers? Ron? You listening?
>I think it's safe to say that they're not unaware of this problem... I'd be
>really surprised if it persisted in 2.0.1.

Assuming there is a 2.0.1. Personally, I'm hoping for a fixed binary on
aux.support.apple.com.
--
Richard Todd   rmtodd@chinet.chi.il.us or rmtodd@uokmax.ecn.uoknor.edu
"MSDOS is a Neanderthal operating system" - Henry Spencer
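[Editor's note: the batch-size argument above can be put in rough numbers.
The figures below - ~1000 effective chars/sec for a Telebit link and ~5
seconds of per-file turnaround - are assumptions for illustration, not
measurements; the sketch only shows how the turnaround fraction shrinks as
batches grow.]

```shell
#!/bin/sh
# Back-of-the-envelope model (assumed numbers, not measurements):
# effective modem throughput of ~1000 chars/sec and ~5 seconds of
# per-file turnaround between transfers.
CPS=1000
OVERHEAD=5
for size in 50 500 ; do				# batch size in KB
	xfer=`expr $size \* 1024 / $CPS`	# seconds spent sending
	total=`expr $xfer + $OVERHEAD`
	pct=`expr $OVERHEAD \* 100 / $total`	# turnaround as % of total
	echo "${size}K batch: ${xfer}s transfer, turnaround ~${pct}%"
done
```

Under these (assumed) numbers, a 50K batch spends roughly 8% of its wall
time on turnaround, while a 500K batch spends under 1%, which is the
efficiency argument for large batches at high modem speeds.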