[comp.unix.ultrix] Ultrix UUCP bug/patch repost

grr@cbmvax.UUCP (George Robbins) (01/28/88)

Now that we have this nice Ultrix group, I figure it's time to repost this
little Ultrix uucp discussion from last January.  The problem described was
present in Ultrix 1.1 and 1.2.  Someone from DEC (Marc T?) called me and from
his comments it seemed that it was too late to get the fix into 2.0, and I
don't really know that I even convinced him that there was a problem.

I'm still running 1.2, so I can't be any more authoritative about later
releases or verify that the patch at the tail end of this is applicable
to other releases.  I'm hoping the rumors of 2.2 by Valentine's Day are true,
since I'm about ready to upgrade, although I'd have installed 4.3 BSD long
ago, if only it came with DECnet and LAT support 8-)...

I've been running the patched version on cbmvax (a 750, later 785) for
a year with no particular problems, and it definitely changed things from
absolutely wretched to tolerable.

digested antiquities:

  From: grr@cbmvax.cbm.UUCP (George Robbins)
  Subject: Re: UUCP [ultrix] guru needed!
  Date: 25 Dec 86 01:47:51 GMT

  In article <1269@ncc.UUCP> lyndon@ncc.UUCP (Lyndon Nerenberg) writes:
  >>
  >This problem seems to be generic to Ultrix UUCP. I have a h*** of a time
  >passing traffic to systems running it. We run CTIX (Sys_V), and I've also
  >tried it from V7 without much luck. It does seem to talk to 4.2 quite
  >nicely.
  >
  >Again, in our case the big problem seems to be TIMEOUTs. We can usually
  >push between 20 and 40 packets across, then the sender (us) starts timing
  >out and resending. After a while it just gives up.
  >
  >One day (in frustration), I compiled the code on a non-VAX machine and
  >tried it. Same result, so it doesn't look like it's a hardware related
  >problem.
  >
  >Like the man said, HELP!
  >--
  >Lyndon Nerenberg (VE6BBM)   Systems Group - A Div. of Nexus Computing Corp.

  I'm glad other people are wondering if ultrix has uucp problems.  It's easy
  to convince oneself that there is some kind of problem in ultrix uucp; it's
  a hell of a lot harder to prove it.

  It smells a lot like a problem in error recovery somewhere.  We see the same
  timeout symptom.  Of course this sort of thing only manifests itself when
  your phone lines get marginal to start with.  I've also seen it on a direct
  connection between our ultrix system and a box running SVr1 uucp.

  I've tried the Ultrix 1.1 and 1.2 uucp's, no particular difference.  I've
  patched uucico to bump the retry count from 10 to 100+, not much help.  I've
  checked the object code against some of the reported 4.3 uucp bugs without
  finding anything obvious.

  From: grr@cbmvax.cbm.UUCP (George Robbins)
  Subject: Re: UUCP [ultrix] guru needed!
  Date: 27 Dec 86 07:31:47 GMT

  Nothing like spending XMAS over a hot protocol analyzer...

  Anyway, it looks like Ultrix does have a problem in the protocol that makes
  certain kinds of errors result in a no-recovery situation.
  =========
  Here's what happens:

  Ultrix sends packet N.
  Other sends RR packet N to acknowledge.
  Ultrix sends a packet N+1.
  Packet N+1 falls in bit-bucket. (sync char gets trashed)
  Other times out.
  Other sends RR packet N to acknowledge last packet seen.
  Ultrix sees packet N already acknowledged - sends nothing!!!
  Other times out.
  Other sends RR packet N to acknowledge last packet seen.
  Ultrix sees packet N already acknowledged - sends nothing!!!
  .
  .
  Other decides retry count exceeded - gives up.
  Ultrix times out - fatal.
  ==========
  The other system can't do anything about packet N+1 because it's
  never seen it.  If it had and there was an error in it, it could
  have rejected it, but *only* if it recognizes the bad packet.

  Ultrix is happy because the other keeps sending it nice acknowledgment
  packets, even if they don't say anything new.  It doesn't time out or
  get errors because they are such nice packets.
  ==========

  Now the kicker - as far as I can tell this is the way both Berkeley and
  AT&T unix play the game!  It looks pretty easy to fix, but am I missing
  something?  Just add a test in pksack that says if you get an ack for
  a packet already acknowledged, and you've already sent another packet,
  then set the retransmit flag.
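
The stall and the proposed test can be sketched as a toy simulation. This is
only an illustration under stated assumptions, not the uucico source: the
function simulate and the names reack_threshold and max_retries are
hypothetical, and the resent packet is assumed to arrive intact. A threshold
of 1 corresponds to the test proposed above (retransmit on the first repeated
ack); a larger value waits for that many repeated acks first.

```python
def simulate(reack_threshold=None, max_retries=10):
    """Replay the lost-packet scenario; return (outcome, timeouts_used, log).

    reack_threshold=None models the unpatched sender, which never
    retransmits in response to a repeated RR. All names are hypothetical.
    """
    log = []
    last_acked = 0   # sender has already processed the RR for packet N (N = 0)
    in_flight = 1    # sender has sent packet N+1, which was lost on the line
    duplicates = 0   # repeated RRs seen while a packet is outstanding

    for attempt in range(1, max_retries + 1):
        rr = 0  # the other side never saw N+1, so it re-acks N after each timeout
        log.append(f"other: timeout {attempt}, sends RR {rr}")

        if rr == last_acked and in_flight > last_acked:
            duplicates += 1
            if reack_threshold is not None and duplicates >= reack_threshold:
                log.append(f"sender: repeated RR {rr}, resends packet {in_flight}")
                return "recovered", attempt, log

        log.append(f"sender: RR {rr} already acknowledged, sends nothing")

    log.append("other: retry count exceeded, gives up")
    return "gave up", max_retries, log


if __name__ == "__main__":
    for threshold in (None, 1):
        outcome, used, _ = simulate(threshold)
        print(threshold, outcome, used)
```

With no threshold the loop runs until the retry count is exhausted, which is
the behaviour seen on the analyzer; with a threshold of 1 the sender resends
after the first repeated ack.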

  From: rick@seismo.CSS.GOV (Rick Adams)
  Subject: Re: UUCP [ultrix] guru needed!
  Date: 29 Dec 86 21:18:03 GMT

  Yes, It's a bug. It was fixed in 4.3BSD. Here is a rough idea
  of how it was fixed (Jim Bloom found and fixed this one).

  I'm not sure that the test for Reacks needs to wait for 4. I think 2
  would probably be adequate. However, Jim may know of a case I don't.

  As a general rule, the 4.3bsd 'g' protocol driver is in better shape
  than ANY uucp available (including Honey DanBer, [gasp]). At a minimum, it's
  at least readable (cryptic, but readable).

  <<<<< source fix omitted >>>>>

  From: grr@cbmvax.cbm.UUCP (George Robbins)
  Subject: Re: UUCP [ultrix] guru needed! [patch included]
  Date: 7 Jan 87 08:44:53 GMT

  Part of the problem is that some person at DEC changed the timeouts in
  pkcget from 10 and 20 seconds to 25 and 30 seconds, in hopes of making
  things more robust.  This actually increased the probability of the remote
  system timing out before the local system and encountering the underlying
  protocol problem.

  The following is a simple patch, applicable to ultrix 1.1 and 1.2, to reset
  the timeouts to the correct values:

  1) you must be root (or the setuid bits will go away!)
  2) uucico must not be running
  3) you should know what you are doing...
  4) copy the old uucico to some safe place
  5) make sure the numbers match

  Script started on Wed Jan  7 03:21:27 198
  # cd /usr/lib/uucp
  # cp uucico uucico.nopatch
  # adb -w uucico
  pkcget+57?x
  _pkcget+57:	19d0
  ?w 0ad0
  _pkcget+57:	19d0	=	ad0
  pkcget+5c?x
  _pkcget+5c:	1ed0
  ?w 14d0
  _pkcget+5c:	1ed0	=	14d0
  #
  ^d
  script done on Wed Jan  7 03:24:39 198
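
For step 5, the words shown by adb can be checked against the timeout values
by hand. The sketch below does that arithmetic. It rests on an assumption the
post does not state: that each 16-bit word, stored low byte first on the VAX,
is a d0 opcode byte (MOVL) followed by a short-literal operand holding the
timeout in seconds. The helper name split_adb_word is hypothetical.

```python
def split_adb_word(word_hex):
    """Split a 16-bit word as printed by adb into (low byte, high byte).

    Assumption: the low byte is the instruction opcode and the high byte
    is a short-literal operand holding the timeout in seconds.
    """
    word = int(word_hex, 16)
    return word & 0xFF, word >> 8


if __name__ == "__main__":
    for old, new in (("19d0", "0ad0"), ("1ed0", "14d0")):
        old_op, old_secs = split_adb_word(old)
        new_op, new_secs = split_adb_word(new)
        # The opcode byte must be unchanged; only the literal differs.
        assert old_op == new_op == 0xD0
        print(f"{old} -> {new}: {old_secs}s -> {new_secs}s")
```

Read this way, 19d0 -> 0ad0 changes 25 seconds to 10, and 1ed0 -> 14d0 changes
30 seconds to 20, matching the values described above. If the words adb shows
at those offsets differ from 19d0 and 1ed0, the binary is not the one this
patch was written for.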

  This seems to have solved most of my problems, but I would be interested in
  any reports or comments.