[net.bugs.uucp] Bug in 4.2 UUXQT

ra@rlgvax.UUCP (Rick Adams) (01/16/84)

I found that bug at seismo. What happens is that when an article is
propagated to a site and the "U" flag is on in the sys file, there
is no "real" D. file created, since uucp is asked not to make a copy
of the file. There is a real X. file however. Then, a "Cancel" control
message comes in before uucp has actually transferred the article to
the next site. The news article is removed. Then when uucp finally
tries to transfer the article, it fails when it tries
to copy the article into a real D. file on the remote system.
However, the X. file gets through ok.

The "solution" we implemented was to keep an eye on the X.
directory (not very hard with uusnap) and not let the number of files
get too big.

I've discussed this problem with Tom Truscott and the following is his
suggested fix. No one has implemented it (yet).

	Subject: uuxqt of LLEN unXable files

	Here is a possibility:
	If, in uuxqt.c/gtxfile, gtwrk() returns > LLEN/2 files to be executed,
	but none of them have 'gotfiles()',
	then {
		do an iswrk() to reset things.
		loop doing gtwrk(), and if not 'gotfiles' and
			the work file is > 1 day old,
			(try to) delete the work file
		exit from gtxfile() with failure (to avoid an infinite loop)
	}

	The LLEN/2 is a kludge to avoid overreaction to the problem,
	as is the check that the work file is > 1 day old.
	(I know little about advanced uses of uux,
	but suppose it is possible for the work file
	to arrive ahead of the files needed by 'gotfiles'.)
	The exit with failure is to avoid the possible infinite
	loop that might ensue if one tries to restart gtxfile()
	but is not careful (e.g. suppose the LLEN unXables
	are also undeletable).
		Tom Truscott
	I am not planning to implement this anytime soon,
	but you are welcome to!
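
A rough model of the proposal above, written as modern C over an in-memory
list rather than against the real spool directory. The struct, the
function name purge_stale(), and DAY_SECS are made up for illustration;
only LLEN (20) and the LLEN/2 and one-day thresholds come from the thread.
The runnable flag stands in for what gotfiles() would report.

```c
#include <assert.h>
#include <stddef.h>

#define LLEN     20              /* work-list size, as in anlwrk.c */
#define DAY_SECS (24L * 3600L)   /* hypothetical name for "1 day" */

/* Hypothetical stand-in for one work file. */
struct work {
    int  runnable;   /* 1 if all needed files are present */
    long age_secs;   /* age of the work file in seconds */
    int  deleted;    /* set by purge_stale() */
};

/*
 * Delete stale entries only when the list is more than half full
 * and not a single entry is runnable. Returns the number deleted.
 */
int purge_stale(struct work *w, size_t n)
{
    size_t i;
    int removed = 0;

    assert(w != NULL || n == 0);
    if (n <= LLEN / 2)
        return 0;                /* too few files: do not overreact */
    for (i = 0; i < n; i++)
        if (w[i].runnable)
            return 0;            /* something can still run */
    for (i = 0; i < n; i++)
        if (w[i].age_secs > DAY_SECS) {
            w[i].deleted = 1;    /* the real code would unlink here */
            removed++;
        }
    return removed;
}
```

Young work files are left alone, which covers the case where a work file
arrives ahead of the files it needs.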


Rick Adams
{allegra|ihnp4|seismo}!rlgvax!ra

chris@umcp-cs.UUCP (01/16/84)

The solution should really be applied from the other side: after all,
each C. control file delineates one "set" of work to be done.  If all
files required by the "C." file are not present, uucico should send
none of them.  The code would have to go into cntrl.c and simulate
the code in uuxqt that makes sure that all required files are present.
If not, the C. file cannot yet be completed and should be left alone.
Unfortunately merely changing cntrl.c to do this is not sufficient,
because then uucico could loop forever noticing control files which
do not have all required data files, skipping them, and then rereading
the work directory.
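
One way to picture this, as a sketch only and not code from cntrl.c: check
that every file a job names exists, and mark incomplete jobs so a later
pass skips them instead of rediscovering them. The struct job, its skip
flag, and both function names are assumptions made for the example;
access() with F_OK is the ordinary existence test.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical model of one C. file: the data files it names. */
struct job {
    const char *const *files;   /* paths the job needs */
    size_t nfiles;
    int skip;   /* set when found incomplete, or by the caller once sent */
};

/* Return 1 when every path in files[0..n-1] exists, else 0. */
int all_present(const char *const *files, size_t n)
{
    size_t i;

    assert(files != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (access(files[i], F_OK) != 0)
            return 0;
    return 1;
}

/*
 * Return the index of the first complete job that is not marked skip.
 * Incomplete jobs get skip set, so repeated calls cannot cycle over
 * them forever. Returns -1 when nothing is sendable.
 */
int next_sendable(struct job *jobs, size_t n)
{
    size_t i;

    assert(jobs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (jobs[i].skip)
            continue;
        if (all_present(jobs[i].files, jobs[i].nfiles))
            return (int)i;
        jobs[i].skip = 1;
    }
    return -1;
}
```

The caller would set skip on a job after sending it and stop when
next_sendable() returns -1, rather than rereading the work directory.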

Another solution would be to immediately delete the control file if
not all its required files are present; I think this would not harm
anything as it stands now, but would prevent an enhancement by which
one site could relay work requests to another.  As this poses interesting
security problems, that might not be a bad idea anyway!
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris.umcp-cs@CSNet-Relay

damonp@tektronix.UUCP (Damon Permezel) (01/20/84)

For some reason, at Tek, I am getting all these X.files that
refer to D.files that don't exist. If 20 or more of these accumulate,
UUXQT will stop processing. The reason for this is that UUXQT calls
gtxfile() repeatedly to return the next X.file that contains an
executable request (all necessary files present).

gtxfile() calls gtwrkf(), which essentially puts the first LLEN (20)
files into an array. gotfiles() is called for each of these files,
and since they don't got_all_their_files(), we end up failing the
call to gtwrkf(), falling thru, and tripping over the fact that
rechecked == 1.

No other files in the directory are examined, and UUXQT returns,
thoroughly satisfied with a job well done.
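
A toy model of the stall, in modern C and not taken from uuxqt.c: if only
the first LLEN entries are ever looked at and none of them is runnable, a
runnable entry further down is never reached. The struct xfile and the
function name are invented for the illustration; LLEN is 20 as above.

```c
#include <assert.h>
#include <stddef.h>

#define LLEN 20   /* window size, as in anlwrk.c */

/* Hypothetical stand-in for an X. file. */
struct xfile {
    int runnable;   /* 1 when all of its D. files exist */
};

/*
 * Look for a runnable entry, but only within the first LLEN entries.
 * Returns the index found, or -1 if the window holds none.
 */
int first_runnable_windowed(const struct xfile *q, size_t n)
{
    size_t limit = n < LLEN ? n : LLEN;
    size_t i;

    assert(q != NULL || n == 0);
    for (i = 0; i < limit; i++)
        if (q[i].runnable)
            return (int)i;
    return -1;
}
```

With 25 entries of which only number 22 is runnable, the function reports
no work at all, which is the behaviour described above.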

A possible solution (barring determining why I have all these missing
D. files, and in the interest of robustness) is to remove the offending
X.file, if gotfiles() == 0, and the X.file is older than a few hours.
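
The age test could look something like the sketch below. It is modern C,
not a patch; STALE_SECS and both function names are hypothetical, and six
hours is just one reading of "a few hours". stat() and st_mtime are the
standard calls for a file's modification time.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

#define STALE_SECS (6L * 3600L)   /* hypothetical: "a few hours" */

/* Return 1 when mtime lies at least max_age seconds before now. */
int is_stale(time_t mtime, time_t now, long max_age)
{
    assert(max_age >= 0);
    return mtime <= now - (time_t)max_age;
}

/* Return 1 if path is stale, 0 if it is not, -1 if it cannot be stat'ed. */
int file_is_stale(const char *path, long max_age)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return is_stale(st.st_mtime, time(NULL), max_age);
}
```

The X.file would be removed only when gotfiles() fails and
file_is_stale() returns 1; a return of -1 leaves the file alone.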

joe@fluke.UUCP (Joe Kelsey) (02/22/84)

Well, instead of having to manually check and clean out the X. files
all of the time, I decided to bite the bullet and modify uuxqt.c to do
what Tom Truscott suggested.  Here are the modifications to uucp to fix
the problem of dangling X. files.  The only change not noted in the
suggestion by Tom is that you have to modify anlwrk.c to make Nfiles
and LLEN available to the outside world.  I moved LLEN and MAXRQST into
uucp.h and removed the static from the definition of Nfiles.  Here are
the diff -c listings:

*** /tmp/,RCSt1026807	Tue Feb 21 13:41:28 1984
--- uuxqt.c	Tue Feb 21 13:35:27 1984
***************
*** 25,30
  #define	NCMDS	50
  char *Cmds[NCMDS];
  
  int notiok = 1;
  int nonzero = 0;
  

--- 25,33 -----
  #define	NCMDS	50
  char *Cmds[NCMDS];
  
+ /* Nfiles is set in anlwrk.c. fluke!joe */
+ extern int Nfiles;
+ 
  int notiok = 1;
  int nonzero = 0;
  
***************
*** 304,309
   * Mod to recheck for X-able files. Sept 1982, rti!trt.
   * Suggested by utzoo.2458 (utzoo!henry)
   * Uses iswrk/gtwrkf to keep files in sequence, May 1983.
   */
  
  gtxfile(file)

--- 307,313 -----
   * Mod to recheck for X-able files. Sept 1982, rti!trt.
   * Suggested by utzoo.2458 (utzoo!henry)
   * Uses iswrk/gtwrkf to keep files in sequence, May 1983.
+  * Mod to check for old X. files, Feb. 1984, fluke!joe.
   */
  
  gtxfile(file)
***************
*** 311,316
  {
  	char pre[3];
  	register int rechecked;
  
  	pre[0] = XQTPRE;
  	pre[1] = '.';

--- 315,323 -----
  {
  	char pre[3];
  	register int rechecked;
+ 	time_t ystrdy;		/* yesterday */
+ 	extern time_t time();
+ 	struct stat stbuf;	/* for X file age */
  
  	pre[0] = XQTPRE;
  	pre[1] = '.';
***************
*** 333,338
  #endif
  	if (gotfiles(file))
  		return(1);
  	goto retry;
  }
  

--- 340,361 -----
  #endif
  	if (gotfiles(file))
  		return(1);
+ 	/* check for old X. file with no work files and remove them. */
+ 	/* suggested by Tom Truscott. fluke!joe */
+ 	if (Nfiles > LLEN/2) {
+ 	    time(&ystrdy);
+ 	    ystrdy -= (24 * 3600);		/* yesterday */
+ 	    DEBUG(4, "gtxfile: Nfiles > LLEN/2\n", "");
+ 	    (void) iswrk(file, "get", Spool, pre);
+ 	    while (gtwrkf(Spool, file) && !gotfiles(file)) {
+ 		if (stat(subfile(file), &stbuf) == 0)
+ 		    if (stbuf.st_mtime <= ystrdy) {
+ 			DEBUG(4, "gtxfile: unlink %s \n", file);
+ 			unlink(subfile(file));
+ 		    }
+ 	    }
+ 	    return 0;
+ 	}
  	goto retry;
  }
  

Then for anlwrk.c:

52,54c52,54
< 
< #define LLEN 20
< #define MAXRQST 250
---
> /*
>  * fluke!joe moved LLEN and MAXRQST to uucp.h.
>  */
60,61c62,63
< static	int Nfiles = 0;
< static	char Filent[LLEN][NAMESIZE];
---
> int Nfiles = 0;
> char Filent[LLEN][NAMESIZE];

I tested these changes out and it seemed to work quite well.  My X.
directory was cleaned out within an hour and I received all sorts of
backlogged mail!  I guess the only real cleanup would be to change the
(24*3600) to a defined constant, but I didn't feel like it at the
time...

/Joe Kelsey	John Fluke Mfg. Co., Inc.
{microsoft, uw-beaver, allegra}!fluke

honey@down.UUCP (code 101) (02/24/84)

while i agree with your goals, the removal of dead C., D., and X. files
is not a job for the core programs (uucp, uux, uuxqt, and uucico).
over a period of years, redman collected hundreds of dead work files on
harpo and wrote a shell script, which he called SuperShell, that
analyzes and dispenses with them (e.g., giving rnews or rmail their
orphaned input, and returning ancient mail to its sender).  nowitz
converted this script into a c program (called uucleanup), and arranged
for the "standard" daily daemon in honey danber to invoke it.
	peter honeyman

joe@fluke.UUCP (Joe Kelsey) (03/01/84)

It seems that I was a little too trusting of Tom Truscott's original mock C
code to deal with dead work files stopping uuxqt.  In the fix I distributed,
uuxqt could still hang for upwards of several days if the first work file it
finds doesn't "gotfiles", since the modification won't remove it for a day.
I find that by moving the second call to "iswrk" to after the while loop
which removes dead work files, and putting a conditional return there, uuxqt
seems to work much better.  Here is the complete diff for uuxqt from the
distributed version.  One thing to think about is what Peter Honeyman
mentioned, whether the removing of dead files really belongs in uuxqt.  I
suppose if you really don't want to remove the dead file, you could just
comment out the unlink in the while loop, but you probably want to log a
message to make sure someone eventually notices this problem.  I think if
you don't unlink the files, uuxqt could still stall whenever you have a lot
of dead work files.  Anyway, here is the new diff:

*** /tmp/,RCSt1006994	Wed Feb 29 14:08:59 1984
--- uuxqt.c	Wed Feb 29 14:08:12 1984
***************
*** 25,30
  #define	NCMDS	50
  char *Cmds[NCMDS];
  
  int notiok = 1;
  int nonzero = 0;
  

--- 25,33 -----
  #define	NCMDS	50
  char *Cmds[NCMDS];
  
+ /* Nfiles is set in anlwrk.c. fluke!joe */
+ extern int Nfiles;
+ 
  int notiok = 1;
  int nonzero = 0;
  
***************
*** 304,309
   * Mod to recheck for X-able files. Sept 1982, rti!trt.
   * Suggested by utzoo.2458 (utzoo!henry)
   * Uses iswrk/gtwrkf to keep files in sequence, May 1983.
   */
  
  gtxfile(file)

--- 307,313 -----
   * Mod to recheck for X-able files. Sept 1982, rti!trt.
   * Suggested by utzoo.2458 (utzoo!henry)
   * Uses iswrk/gtwrkf to keep files in sequence, May 1983.
+  * Mod to check for old X. files, Feb. 1984, fluke!joe.
   */
  
  gtxfile(file)
***************
*** 311,316
  {
  	char pre[3];
  	register int rechecked;
  
  	pre[0] = XQTPRE;
  	pre[1] = '.';

--- 315,323 -----
  {
  	char pre[3];
  	register int rechecked;
+ 	time_t ystrdy;		/* yesterday */
+ 	extern time_t time();
+ 	struct stat stbuf;	/* for X file age */
  
  	pre[0] = XQTPRE;
  	pre[1] = '.';
***************
*** 333,338
  #endif
  	if (gotfiles(file))
  		return(1);
  	goto retry;
  }
  

--- 340,362 -----
  #endif
  	if (gotfiles(file))
  		return(1);
+ 	/* check for old X. file with no work files and remove them. */
+ 	/* suggested by Tom Truscott. fluke!joe */
+ 	if (Nfiles > LLEN/2) {
+ 	    time(&ystrdy);
+ 	    ystrdy -= (24 * 3600);		/* yesterday */
+ 	    DEBUG(4, "gtxfile: Nfiles > LLEN/2\n", "");
+ 	    while (gtwrkf(Spool, file) && !gotfiles(file)) {
+ 		if (stat(subfile(file), &stbuf) == 0)
+ 		    if (stbuf.st_mtime <= ystrdy) {
+ 			DEBUG(4, "gtxfile: unlink %s \n", file);
+ 			unlink(subfile(file));
+ 		    }
+ 	    }
+ 	    DEBUG(4, "iswrk\n", "");
+ 	    if (!iswrk(file, "get", Spool, pre))
+ 		return 0;
+ 	}
  	goto retry;
  }
  
Has anyone else installed this?  Or am I the only one brave enough?

/Joe