dave@lsuc.uucp (David Sherman) (01/27/88)
Our uucp seems to be losing files fairly often. It's a v7 uucp, to which we have source and into which I've hacked in most of the bug fixes posted to the net over the years.

Scenario: mail is sent to remote!whatever!whoever. uucp constructs

	C.remoteA1234 - with the instructions
	D.remoteB1233 - with the mail
	D.lsucX4059   - with the "rmail whatever!whoever" details

Now, the mail is a largish file. (Doesn't have to be huge for this to happen.) During the transfer, uucico craps out with

	BAD READ (expected 'S' got FAIL)

where the S is sometimes a C.

OK, so maybe it's a phone line problem. It's not reproducible, though it happens several times a day with different sites (we shuffle several Mb a day around). So, it should try again next time, right?

But on the next call, I get mail from uucp telling me "file D.remoteB1233 can't access" on lsuc. Yes indeedy, uucico is REMOVING the file even though the transfer didn't succeed.

Anyone know why? I delved into cntrl.c, where there are numerous calls to unlinkdf(), but as far as I can make out the code under "case SNDFILE:" is correct.

Incidentally, this problem is affecting news as well; I suspect some news batches may not be making it downstream from us for the same reason. It happens with numerous different sites, so it's not the fault of any one site we talk to.

Any help would be greatly appreciated.

David Sherman
The Law Society of Upper Canada
(416) 947-3466
--
{ uunet!mnetor pyramid!utai decvax!utcsri ihnp4!utzoo } !lsuc!dave
dave@lsuc.uucp (David Sherman) (03/07/88)
Back on January 26 I posted a plea for help. Our uucp was regularly failing on connections to many sites; following the BAD READ failure, I'd get "file xxx can't access" mail on the next connection.

The problem applies to v7-vintage uucico's, when talking to 4.3BSD uucp sites, as it turns out. After working through all the replies and suggestions, and getting particularly useful help from kwlalonde@watmath and rick@uunet, I have the answer.

It's a two-part answer. If you are running a v7-vintage uucp, you want fix (1), and your neighbours running 4.3 want fix (2). If you are running 4.3BSD uucp, you want fix (2).

(1) in v7 uucp sources, cntrl.c,

	if (msg[1] == Y) {
		...
		unlink(W_DFILE);
		RMESG(RQSTCMPT, msg);
		goto process;
	}

Reverse the unlink and RMESG lines. Reason: If the RMESG fails, the remote end may not have received W_DFILE, but it's too late. Change it to call RMESG first, then do the unlink.

(2) From rick:

Yes, there is a bug in the virgin 4.3bsd uucp in the 'g' protocol driver. It will produce the symptoms you describe.

*** pk1.c	Fri Nov 7 17:51:10 1986
--- ../nuucp/pk1.c	Sun Nov 2 21:12:49 1986
***************
*** 196,202 ****
  			return;
  		}
  		if (k && pksizes[k] == pk->p_rsize) {
! 			pk->p_rpr = (h->cntl >> 3) & MOD8;
  			pksack(pk);
  			bp = pk->p_ipool;
  			if (bp == NULL) {
--- 196,203 ----
  			return;
  		}
  		if (k && pksizes[k] == pk->p_rsize) {
! 			pk->p_rpr = h->cntl & MOD8;
! 			DEBUG(7, "end pksack 0%o\n", pk->p_rpr);
  			pksack(pk);
  			bp = pk->p_ipool;
  			if (bp == NULL) {

When I posted the query, I thought the problem "had" to be us because it appeared when talking to all kinds of other sites. I've now prevailed upon all of those sites (maccs, mnetor, utflis, sickkids and some others) to patch their uucico's, and the problem has entirely disappeared.

If you're running 4.3BSD, I strongly recommend you implement Rick's fix. (Just feed this article through

	patch -d /usr/src/cmd/uucp

and recompile.)
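
For anyone who wants to see the two fixes in isolation, here is a small standalone sketch in C. It is not taken from the uucp sources: finish_send(), struct spool, enum order, field_high(), field_low() and FIELD_MASK are all made-up names, and FIELD_MASK is only assumed to have the same value as MOD8. The first part models why fix (1) matters (remove the data file only after the completion message has come back); the second part just shows that the two expressions in Rick's patch read different three-bit fields of the same control byte.

```c
/* Hypothetical stand-ins; none of these names come from the uucp sources. */
enum order { UNLINK_FIRST, CONFIRM_FIRST };

struct spool {
    int file_present;   /* 1 while the data file is still in the spool */
};

/*
 * Model the end of one send. confirm_ok says whether the remote's
 * completion message arrived (the RMESG step in cntrl.c).
 * Returns 0 on success, -1 if the confirmation was lost.
 */
int finish_send(struct spool *s, enum order how, int confirm_ok)
{
    if (how == UNLINK_FIRST) {
        s->file_present = 0;        /* file is gone before the outcome is known */
        return confirm_ok ? 0 : -1;
    }
    if (!confirm_ok)
        return -1;                  /* keep the file so the next call can retry */
    s->file_present = 0;            /* confirmed, so removal is safe */
    return 0;
}

/* Assumed to match MOD8 (a three-bit mask); not copied from pk.h. */
#define FIELD_MASK 07

/* The expression on the "(h->cntl >> 3) & MOD8" side of the patch: bits 5-3. */
unsigned field_high(unsigned cntl)
{
    return (cntl >> 3) & FIELD_MASK;
}

/* The expression on the "h->cntl & MOD8" side of the patch: bits 2-0. */
unsigned field_low(unsigned cntl)
{
    return cntl & FIELD_MASK;
}
```

With a lost confirmation, the UNLINK_FIRST order leaves file_present at 0 (the "can't access" symptom on the next call), while CONFIRM_FIRST leaves it at 1 so the transfer can be retried. For a control byte of 0225 (octal), field_high() gives 2 and field_low() gives 5, so reading the wrong field hands pksack a different value.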

David Sherman
The Law Society of Upper Canada
Toronto
--
{ uunet!mnetor pyramid!utai decvax!utcsri ihnp4!utzoo } !lsuc!dave