[comp.unix.aux] uuxqt. The Mother of Invention. Fixed, at last.

alexis@panix.uucp (Alexis Rosen) (10/01/90)

(Like the saying goes, it sure was a mother...)

I and several others have in recent weeks complained about uuxqt's habit
of quitting in the middle of a large job. This was annoying, but I thought
I'd solved it by running it from cron every half hour. This is *DEFINITELY*
not the answer though- it will still toast your files fairly frequently. Of
course, I didn't find this out until Cnews had the grace to tell me there
was a problem.

Since Apple obviously can't write a working uucp (excepting Ron, who's not
distributing it), I decided to hack up a shell script that would deal with
both these problems. It does not need to be set{g/u}id to anything, and as
far as I can tell it works beautifully. However, I make *no guarantees*. A
month ago I'd never written a real shell script... I guess I have at least
one thing to thank A/UX for - aside from those sleepless nights, I mean :-/

The strategy is simple. uuxqt never dies quickly- it must go through (it
seems, from watching many runs) at least 15 or 18 X files before it has a
chance to die. So I never give it a chance to die. I simply hide the X
files, and show them 10 at a time. I'm not sure but I think this may actually
speed up uuxqt if there's a lot of files in the spool (like, more than 300).
I also do some locking that is, as far as I can tell, 100% safe.

If anyone discovers any bugs, please mail to me as well as posting. After
all, if it is broken (unlikely though I think that to be), my news won't
be all that reliable...

Enjoy.

---
Alexis Rosen
very tired SYSOP/Owner
PANIX Public Access Unix Systems of NY
cmcl2!panix!alexis  or  alexis@panix.uucp

---------------------------->% cut here %<--------------------------------
#!/bin/sh
# uuxqt.wrap - V1.0 written by Alexis Rosen 9-30-90
# This bourne shell script is a wrapper for uuxqt which will prevent it from
# crapping out in the middle of a long run, almost certainly losing a file
# in the process. Rename the original uuxqt to uuxqt.real and change this
# file's name to uuxqt. This should be owned/grouped by uucp, mode 770.
# It's too bad the guy who "fixed" uuxqt can't program to save his soul. I
# am absolutely disgusted with this. I just hope they get it right in 2.0.1!

cd /usr/spool/uucp
# `echo X.*` prints the pattern itself when nothing matches; "test -f X.*"
# would choke if the glob expanded to several files, so compare instead.
if [ "`echo X.*`" = "X.*" ] ; then exit 0 ; fi	# nothing to do

HIDEDIR=/usr/spool/uucp/hidden-x-files		# stick excess X files here

if [ ! -d $HIDEDIR ] ; then mkdir $HIDEDIR ; chmod 770 $HIDEDIR ; fi

# check for a LCK.WXQT file. If it exists, see if it's stale or not.
# There is a very small window of time in which this locking system could
# fail. So wait ten seconds (probably way conservative) and inspect the lock.
if [ -f LCK.WXQT ] ; then
	kill -0 `cat LCK.WXQT` 2>/dev/null
	if [ $? != 0 ] ; then		# stale lock
		rm -f LCK.WXQT
	else
		exit 0
	fi
fi
trap 'rm -f LCK.WXQT /tmp/xw$$' 0 1 2 15
echo "$$" >LCK.WXQT			# make the lock

# Check the lock to make sure we kept it. If not let the other guy do the work.
sleep 10
if [ $$ != `cat LCK.WXQT` ] ; then trap 0 1 2 15 ; exit 0 ; fi

# Now move all the X. files into the hidden directory and then move 10 back
# out. When there aren't many X. files this won't matter but when there are
# hundreds, it's much more efficient than moving all but 10. Put the mv inside
# the loop to pick up any new X. files that might have just arrived.
while : ; do
	mv -f X.* $HIDEDIR 2>/dev/null	# all X. files into the hole
	ls $HIDEDIR >/tmp/xw$$		# make a list
	for i in `head -10 /tmp/xw$$` ; do	# pull out the first ten
		mv $HIDEDIR/$i $i
	done
	if [ "`echo X.*`" = "X.*" ] ; then exit 0 ; fi	# normal exit here

	/usr/lib/uucp/uuxqt.real $*	# fire up the real uuxqt
	XEXIT=$?
	if [ $XEXIT != 0 ] ; then exit $XEXIT ; fi
done

rmtodd@uokmax.ecn.uoknor.edu (Richard Michael Todd) (10/01/90)

(Note: The following article contains a description of the uuxqt bug that I
took out of a news article I posted a couple of days ago.  The reason I'm
reposting this is because none of you got to see this the first time,
because the article has been sitting for *3 DAYS* in uokmax's inbound spool
directory waiting for uuxqt to do something with it.  Looks like Apple's
not the only one with uuxqt problems.  You'll probably see the guts of this
message again if and when uuxqt on uokmax gets a clue and actually starts
executing requests again.  Tough.  I don't suppose Apple has any plans to
port A/UX to the Multimax; it'd be a distinct improvement on what we've
already got...)

Anyway,

alexis@panix.uucp (Alexis Rosen) writes:
>I and several others have in recent weeks complained about uuxqt's habit
>of quitting in the middle of a large job. This was annoying, but I thought
>I'd solved it by running it from cron every half hour. This is *DEFINITELY*
>not the answer though- it will still toast your files fairly frequently. Of
>course, I didn't find this out until Cnews had the grace to tell me there
>was a problem.

Actually, the problem is that rnews will toast the batch it's receiving 
when it's executed with only one free file descriptor.  See below.

>Since Apple obviously can't write a working uucp (excepting Ron, who's not
 Actually, I seriously doubt Apple had anything to do with this one.  I 
gather other versions of uuxqt have had this problem; the bug is probably
a leftover from AT&T.  

 Anyway, here's the explanation of the uuxqt bug, along with another
workaround: 

 The problem is that in uuxqt's main loop, where it processes each file, it
forgets to close one of the files.  Eventually it runs out of file handles,
can't open the work files, and aborts.  This in itself isn't too bad (it
just aborts and leaves the work files around for the next run), but consider
what happens when it's just one or two XQTs away from disaster, i.e. it has
only 1 or 2 free file handles left: rnews is then forked off with those same
1 or 2 free handles, and when it tries to do all those `spacefor`'s and such
(which require opening pipes, etc.), rnews fails in all sorts of entertaining
ways (usually coming to the erroneous conclusion that the disk is out of
space).  I gather this bug is in a few other UUCP implementations besides
Apple's; at least, last time somebody asked on news.software.b why all
his news batches were being eaten on A/UX, Henry Spencer basically said, "Oh,
your uuxqt has the bug where ..."

Dominic Dunlop asks: 
>Anyway, how do you fix this sucker?  The bug, which was present in 1.1,
>seems to be bigger and better in 2.0.  (I even looked on Apple's Update
>and Information server for a fix.  No soap...)

First, you rob a bank.  Then, you use the $75,000 you got from robbing the 
bank to buy an AT&T source licence. :-).  

Seriously, *you* probably can't fix the bug.  You can, however, come
up with a workaround of sorts.  The problem is that rnews freaks out
when it can't open files and pipes because of too few file
descriptors.  A workaround is to move /bin/rnews to /bin/rnews.real and
compile the following and put in as /bin/rnews.
-------------------------Cut here----------------------------
/*
** Program to get around uuxqt's problem in not closing file descriptors--
** this is a front end to rnews that ensures that all fds from 3 up are 
** closed before the real rnews is executed.
*/
main() {
	int n = getdtablesize();	/* size of the descriptor table */
	int i;
	for (i = 3 ; i < n ; ++i) close(i);
	execl("/bin/sh","sh","/bin/rnews.real",(char *)0);
	exit(1);			/* only reached if the exec itself fails */
}
-------------------------Cut here------------------------------

Of course, when uuxqt itself runs out of file descriptors, it will fail,
but it seems to fail "cleanly", i.e. just leaving the remaining work files
around for the next uuxqt run.  

>Altogether now: I want my HDB!

Well, I'd settle for just a fixed version of uuxqt; having grown accustomed 
to old-UUCP, I really don't want to start learning about HDB's bugs....

Anyway, back to Alexis' article: 
>The strategy is simple. uuxqt never dies quickly- it must go through (it
>seems, from watching many runs) at least 15 or 18 X files before it has a
>chance to die. So I never give it a chance to die. I simply hide the X
>files, and show them 10 at a time. I'm not sure but I think this may actually
>speed up uuxqt if there's a lot of files in the spool (like, more than 300).
>I also do some locking that is, as far as I can tell, 100% safe.

The number of X. files uuxqt gets a chance to run through depends on how
many free file descriptors it had when it started, which depends somewhat
on how it was executed.  

>If anyone discovers any bugs, please mail to me as well as posting. After
>all, if it is broken (unlikely though I think that to be), my news won't
>be all that reliable...

  Since mail and news are both handled by uuxqt, the two should be about
the same in terms of reliability.  (Granted, rnews fails if it's just very
low on file descriptors, but I suspect any halfway complicated MTA like
smail or sendmail will also fail under such conditions.)

  Given the description of the problem, it shouldn't be too difficult for
someone at Apple to fix this bug.  Any takers?  Ron?  You listening?  
-- 
Richard Todd   rmtodd@chinet.chi.il.us  or  rmtodd@uokmax.ecn.uoknor.edu  

alexis@panix.uucp (Alexis Rosen) (10/02/90)

In article <1990Sep30.233245.22073@uokmax.ecn.uoknor.edu> rmtodd@uokmax.ecn.uoknor.edu (Richard Michael Todd) writes:
>alexis@panix.uucp (Alexis Rosen) writes:
>>I and several others have in recent weeks complained about uuxqt's habit
>>of quitting in the middle of a large job. This was annoying, but I thought
>>I'd solved it by running it from cron every half hour. This is *DEFINITELY*
>>not the answer though- it will still toast your files fairly frequently. Of
>>course, I didn't find this out until Cnews had the grace to tell me there
>>was a problem.
>
>Actually, the problem is that rnews will toast the batch it's receiving 
>when it's executed with only one free file descriptor.  See below.

I know. Problem is, as you mention later, this applies to almost anything
you might care to name which could be uuxqt's target. Bnews, Cnews, mail
with sendmail or whatever.

> Actually, I seriously doubt Apple had anything to do with this one.  I 
>gather other versions of uuxqt have had this problem; the bug is probably
>a leftover from AT&T.  

So? Part of their job was to clean out the old trash before bringing in new
trash... Seriously, they did the job 90% by getting rid of the real disasters
in the other uu* programs (although that TZ crap is still there). Especially
given the simplicity of this bug (both in the ability to reproduce and
diagnose it, not to mention its 'known' status), this should have been dealt
with. Maybe not the programmer, but certainly the QA person(s) screwed up.

> Anyway, here's the explanation of the uuxqt bug, along with another
>workaround: 
> The problem is that in uuxqt's main loop, where it processes each file, it
>forgets to close one of the files.  Eventually it runs out of file handles,
>can't open the work files, and aborts.  This in itself isn't too bad (it
>just aborts and leaves the work files around for the next run), but consider
>what happens when it's just one or two XQTs away from disaster, i.e. it has
>only 1 or 2 free file handles left: rnews is then forked off with those same
>1 or 2 free handles, and when it tries to do all those `spacefor`'s and such
>(which require opening pipes, etc.), rnews fails in all sorts of entertaining
>ways (usually coming to the erroneous conclusion that the disk is out of
>space).  I gather this bug is in a few other UUCP implementations besides
>Apple's; at least, last time somebody asked on news.software.b why all
>his news batches were being eaten on A/UX, Henry Spencer basically said, "Oh,
>your uuxqt has the bug where ..."

Interesting. I guess I'll post the uuxqt wrapper there too then. But as I said,
the fact that this was a known bug is just all that much more annoying.

>[...] A workaround is to move /bin/rnews to /bin/rnews.real and
>compile the following and put in as /bin/rnews.
>-------------------------Cut here----------------------------
>/*
>** Program to get around uuxqt's problem in not closing file descriptors--
>** this is a front end to rnews that ensures that all fds from 3 up are 
>** closed before the real rnews is executed.
>*/
>main() {
>	int n = getdtablesize();	/* size of the descriptor table */
>	int i;
>	for (i = 3 ; i < n ; ++i) close(i);
>	execl("/bin/sh","sh","/bin/rnews.real",(char *)0);
>	exit(1);			/* only reached if the exec itself fails */
>}
>-------------------------Cut here------------------------------
>Of course, when uuxqt itself runs out of file descriptors, it will fail,
>but it seems to fail "cleanly", i.e. just leaving the remaining work files
>around for the next uuxqt run.  

Unless I'm missing something (like a bug in my wrapper), it seems to me that
this strategy is distinctly inferior to the wrapper. (On the other hand, it's
a dramatic improvement over losing 20-30% of your inbound news :-)...)  The problem
is that uuxqt will still fail, and then the rest of your stuff will just sit
in the spool until uuxqt fires up again- typically, for some more incoming
batches. In fact, on a busy system like ours, you'd never catch up, and you
would eventually run out of disk space. You could of course tell cron to fire
up a new uuxqt every half hour or so, but that's so inelegant... Of course,
that's exactly what I did before I realized exactly what was going on (i.e.,
that my news was getting munched).

[somebody else:] >>Altogether now: I want my HDB!
>
>Well, I'd settle for just a fixed version of uuxqt; having grown accustomed 
>to old-UUCP, I really don't want to start learning about HDB's bugs....

Really? I'd much rather a subdirectory-based uucp of any sort, HDB, Duke, or
whatever. It's really a pain to mess around in a directory with 1500+ files
in it. Not to mention the speed problem... In this case, I'd be willing to
pay the price of progress. Anything to get all those non-critical programs
like uulog (?!) working. :-(

>Anyway, back to Alexis' article: 
>>The strategy is simple. uuxqt never dies quickly- it must go through (it
>>seems, from watching many runs) at least 15 or 18 X files before it has a
>>chance to die. So I never give it a chance to die. I simply hide the X
>>files, and show them 10 at a time. I'm not sure but I think this may actually
>>speed up uuxqt if there's a lot of files in the spool (like, more than 300).
>>I also do some locking that is, as far as I can tell, 100% safe.
>
>The number of X. files uuxqt gets a chance to run through depends on how
>many free file descriptors it had when it started, which depends somewhat
>on how it was executed.  

Yes. I picked a number which I felt was sufficiently conservative, before I
knew exactly what was causing the problem. I don't feel any particular need
to go back and change it now...

This is especially true since the critical number varies with the uuxqt
target command.

>>If anyone discovers any bugs, please mail to me as well as posting. After
>>all, if it is broken (unlikely though I think that to be), my news won't
>>be all that reliable...
>  Since mail and news are both handled by uuxqt, the two should be about
>the same in terms of reliability.  (Granted, rnews fails if it's just very
>low on file descriptors, but I suspect any halfway complicated MTA like
>smail or sendmail will also fail under such conditions.)

Nope. News uses more descriptors than mail.

Fortunately, my script seems to be holding up really well. Three days under
massive newsfeeds, 0 problems. (In case y'all can't notice, I'm pretty pleased
with myself. Not because it was a particularly difficult piece of hackery.
But because having fixed this problem, I feel like I've had my sight or
hearing restored. It's just terrible, missing every fifth or sixth article...
Not to mention the fact that it wreaks havoc with any attempts to put together
multipart binaries...)

>  Given the description of the problem, it shouldn't be too difficult for
>someone at Apple to fix this bug.  Any takers?  Ron?  You listening?  

I think it's safe to say that they're not unaware of this problem... I'd be
really surprised if it persisted in 2.0.1.

>-- 
>Richard Todd   rmtodd@chinet.chi.il.us  or  rmtodd@uokmax.ecn.uoknor.edu  

---
Alexis Rosen
{cmcl2,apple}!panix!alexis
alexis@panix.uucp

rmtodd@uokmax.ecn.uoknor.edu (Richard Michael Todd) (10/03/90)

alexis@panix.uucp (Alexis Rosen) writes:

>Unless I'm missing something (like a bug in my wrapper), it seems to me that
>this strategy is distinctly inferior to the wrapper. (On the other hand, it's
>a dramatic improvement over losing 20-30% of your inbound news :-)...)  The problem
>is that uuxqt will still fail, and then the rest of your stuff will just sit
>in the spool until uuxqt fires up again- typically, for some more incoming
>batches. In fact, on a busy system like ours, you'd never catch up, and you
>would eventually run out of disk space. You could of course tell cron to fire
>up a new uuxqt every half hour or so, but that's so inelegant... Of course,
>that's exactly what I did before I realized exactly what was going on (i.e.,
>that my news was getting munched).

Well, on my system, an outbound uucp poll (uucico -r1 -suokmax) is being run
once every half hour. If a previous uucp to uokmax is still running, the new
poll dies immediately, but it does go ahead and execute uuxqt.  So, in effect,
uuxqt is being run every half hour.  (It also helps that I'm only getting
news at 2400 bps, so it's rather difficult to get 15+ batches inside of 
a half-hour period.  I gather you're getting news at Trailblazer speeds, 
so this is more of a problem.  Hmm..have you considered having your feed 
site send you larger batches?  Not only does this mean fewer batches for
uuxqt to chew through, but it makes the transfer more efficient -- at Trailblazer
speeds, the time spent waiting for the other side to start sending another
file is a significant fraction of the time it takes to send a ~50K file.  I
gather a good many Telebit sites are running 500K news batches.)

>[somebody else:] >>Altogether now: I want my HDB!
>>Well, I'd settle for just a fixed version of uuxqt; having grown accustomed 
>>to old-UUCP, I really don't want to start learning about HDB's bugs....

>Really? I'd much rather a subdirectory-based uucp of any sort, HDB, Duke, or
>whatever. It's really a pain to mess around in a directory with 1500+ files
>in it. Not to mention the speed problem... In this case, I'd be willing to
>pay the price of progress. Anything to get all those non-critical programs
>like uulog (?!) working. :-(

I'd love to see BSD-style subdirectory-based UUCP, too.  But given the massive
mania for SysV-compatibility, we'll probably get either HDB or old-UUCP, and
given the choice, I'll settle for old-style UUCP.
  BTW, what's wrong with uulog? 

>Nope. News uses more descriptors than mail.
 I suspect that this varies a good bit with the MTA.  I wouldn't be surprised
if smail/deliver together eat up as many file descriptors as rnews does.  

>>  Given the description of the problem, it shouldn't be too difficult for
>>someone at Apple to fix this bug.  Any takers?  Ron?  You listening?  

>I think it's safe to say that they're not unaware of this problem... I'd be
>really surprised if it persisted in 2.0.1.

Assuming there is a 2.0.1.  Personally, I'm hoping for a fixed binary on
aux.support.apple.com.  
-- 
Richard Todd   rmtodd@chinet.chi.il.us  or  rmtodd@uokmax.ecn.uoknor.edu  
"MSDOS is a Neanderthal operating system" - Henry Spencer