[net.bugs.uucp] Micom 224+ problems

sid@linus.UUCP (Sid Stuart) (04/09/85)

I am trying to write the uucico driver code for a Micom 224+ 2400/1200/300
BPS modem. The code mostly works: uucico can call out to another system,
transfer data and complete the conversation, but when it tries to end the
session, it fails on the last OO synchronization. When it fails, I see
a lot of 'w's coming in. Here is the output from the end of a failed session;
note the 'w's:

*** TOP ***  -  role=1, wmesg 'H'
send 37777777610
rmesg - 'H' rec h->cntl 41
state - 10
rec h->cntl 37777777611
send 41
got HY
 PROCESS: msg - HY
HUP:
wmesg 'H'Y
send 37777777621
send 10
send 10
cntrl - 0
uucp xxx (4/8-14:42-20546) OK (conversation complete)
send OO 0,imsg >\020<
\011\010*"\011\020\002!l\022]HY\000imsg >\000\000\000\000\000\000\000\000\000\
000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\00
0\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\
000wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwv)f\000Swwww
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww_B=DwgYj7wwwwwwwwwwwwwwwwwwwwwwwww
wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww\015\01
2OFFLINE\007\015\012IDLE\015\012\012uucp xxx (4/8-14:43-20546) TIMEOUT (xxx)
send I
exit code 0


Just for reference, here is the end of a properly terminated session,
from a Concord Data Modem at 2400 baud:

*** TOP ***  -  role=1, wmesg 'H'
send 37777777610
rmesg - 'H' rec h->cntl 41
state - 10
rec h->cntl 37777777611
send 41
got HY
 PROCESS: msg - HY
HUP:
wmesg 'H'Y
send 37777777621
send 10
send 10
cntrl - 0
uucp xxx (4/8-14:45-20579) OK (conversation complete)
send OO 0,imsg >\020<
\011\010*"\011\020\002!l\022]HY\000imsg >\000\000\000\000\000\000\000\000\000\
000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\00
0\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\
000\000\000\000\000\000\000\000\000\000\000\000\000\020<
\011"*\010\011\020\011"*\010\011\020OOOOOO\000imsg >\020<
OOOOOO\000uucp xxx (4/8-14:45-20579) CDS (End of call)
exit code 0

The failure occurs at 1200 baud and 2400 baud, talking to a Racal triple
and a Concord Data System modem respectively. It does not fail 100% of
the time, though; sometimes the code goes through to completion. One time
at 1200 baud, I got 'U's instead of 'w's, but the result was the same. The
'w's don't start at any particular place in the transmission either.
The code can be used, since it does finish the conversation before
the failure occurs, but I would prefer to have it not fail at all.
Does anyone have any hints?


				{decvax, allegra, philabs,}!linus!sid

clewis@mnetor.UUCP (Chris Lewis) (04/10/85)

We are encountering a very similar problem when our Pyramid (BSD4.2
uucp) tries to converse with System V machines (VME10 and EXORmacs using
dialup or direct lines).  They *always* TIMEOUT.  We don't have this
problem when conversing with other BSD4.2s.  I don't think that the
"w"'s that you mention are directly related to the problem.  I think
that this is an inconsistency between System V and BSD hangup
sequences and the "w"'s are just an idiosyncrasy of your lines
(e.g. they could be wrong baud rates from the other end).  Can
anybody shed any light on this?  We would really appreciate it.

Our problem with the EXORmacs is considerably worse.  If our system
ends up in master mode at HUP time the dial-in line of the EXORmacs 
will hang with a uucico on it.  Killing the uucico brings up a "getty" 
which won't respond to keystrokes or "kill -9"'s.  I'm hoping that
fixing the TIMEOUT problem will fix this too.
-- 
Chris Lewis, Computer X (CANADA) Ltd.
UUCP: {allegra, linus, ihnp4}!utzoo!mnetor!clewis
BELL: (416)-475-1300 ext. 321

dave@lsuc.UUCP (David Sherman) (04/10/85)

Could the wwwwwwwwwwwwwwwwwwwwwwww be the 2400-baud
equivalent of the 1200-baud UUUUUUUUUUUUUUUU pattern which
modems produce when they go into test mode?

Dave Sherman
-- 
{utzoo pesnta nrcaero utcs hcr}!lsuc!dave
{allegra decvax ihnp4 linus}!utcsri!lsuc!dave

clewis@mnetor.UUCP (Chris Lewis) (04/11/85)

In <403@mnetor.UUCP> I, myself, write:
> We are encountering a very similar problem when our Pyramid (BSD4.2)
> talks to System V uucps.
> ...
> dialup or direct lines).  They *always* TIMEOUT.  We don't have this
> ...
> "w"'s that you mention are directly related to the problem.  I think
> that this is an inconsistency between System V and BSD hangup
> sequences and the "w"'s are just an idiosyncrasy of your lines

Munging around in both BSD and SysV "cico.c" code, I found a slight 
inconsistency:

		alarm(MAXMSGTIME);
		omsg('O', "OOOOO", Ofn);
		DEBUG(4, "send OO %d,", ret);
		if (!setjmp(Sjbuf)) {
			for (;;) {
				/* the following line is NOT in System V */
				omsg('O', "OOOOO", Ofn);
				ret = imsg(msg, Ifn);
				if (ret != 0)
					break;
				if (msg[0] == 'O')
					break;
			}
		}

From the -x9 logs it looked very much like the BSD uucico was waiting
for another 'O' message (a second one).  I put it in the System V 
version, downloaded it to one of our System V remotes, and it worked 
just fine.  By the way, regarding the suggested "strip parity" fix
mentioned earlier, I have found that our BSD cico strips parity in imsg 
but the System V doesn't.  I will have to investigate that too.

Any comments from uucp experts?  Which is right - strip parity on
both ends, updating System V omsg('O'...) to match BSD, or both?
-- 
Chris Lewis, Computer X (CANADA) Ltd.
UUCP: {allegra, linus, ihnp4}!utzoo!mnetor!clewis
BELL: (416)-475-1300 ext. 321

stv@qantel.UUCP (Steve Vance@ex2499) (04/20/85)

In article <403@mnetor.UUCP> clewis@mnetor.UUCP (Chris Lewis) writes:
>We are encountering a very similar problem when our Pyramid (BSD4.2
>uucp) tries to converse with System V's (VME10 and EXORmacs using
>dialup or direct lines).  They *always* TIMEOUT.  We don't have this 
>problem when conversing with other BSD4.2s...  

I believe that this is a different problem.  When System V came out,
early Unisoft ports of every flavor had a bug in uucp, whereby they
could only talk to other Unisoft-System-V machines.  You have described
the symptoms precisely--connection is made, then the System V side does
not respond further, then timeout occurs.  I have heard that a patch
has been available to fix this for over a year.  You'll have to talk to
your system supplier--Unisoft won't talk to you directly.  In our
situation, we are out of luck, since our system supplier seems to have
gone belly-up.  Good thing we had the wherewithal to just junk the
machine.
-- 

Steve Vance
{dual,hplabs,intelca,nsc,proper}!qantel!stv
dual!qantel!stv@berkeley
Qantel Corporation, Hayward, CA

mash@mips.UUCP (John Mashey) (04/24/85)

> In article <403@mnetor.UUCP> clewis@mnetor.UUCP (Chris Lewis) writes:
> >We are encountering a very similar problem when our Pyramid (BSD4.2
> >uucp) tries to converse with System V's (VME10 and EXORmacs using
> >dialup or direct lines).  They *always* TIMEOUT.  We don't have this 
> >problem when conversing with other BSD4.2s...  
> 
> I believe that this is a different problem.  When System V came out,
> early Unisoft ports of every flavor had a bug in uucp, whereby they
> could only talk to other Unisoft-System-V machines.  You have described

I also believe this is the (same) different problem, but I believe it
is the "notorious" bug in uucp's chksum routine (in pk0.c).  Almost all
68K ports have run into this, at least if they started with the MIT C compiler
or other compilers that work correctly. Here is the relevant fragment of code:

	register short sum;
	register unsigned short t;
	register short x;
	if ((unsigned)sum <= t) {
		sum ^= x;
	}

Of course, (unsigned)sum should have been written (unsigned short)sum.
Correct C requires that sum be converted short -> int -> unsigned
(sign-extending a negative sum first), NOT short -> unsigned short.
Here is the corresponding piece of output from the 4.2BSD C compiler:
L15:
	cmpw	-2(fp),-4(fp)
	jgtru	L16
	xorw2	-6(fp),-2(fp)
L16:
	ret
This compiler short-circuited the conversion, yielding the intended, but
not correct, comparison.  The problem does not exist on 16-bit machines,
and the compiler bug has masked it on VAXen.  Almost everybody doing 68K
ports has run into it: their machines would talk among themselves, but
wouldn't talk to most others.  The fix is of course just the trivial
change above.
I'm sure numerous hours have been dedicated to this bug; when I was at CT,
it took us 2 solid days to find this.
An excellent example where being "right" is worse than being wrong,
when everybody else is wrong!
-- 
-john mashey
UUCP: 	{decvax,ucbvax,ihnp4}!decwrl!glacier!mips!mash
ARPA:	mips!mash@SU-Glacier.ARPA
DDD:  	415-960-1200
USPS: 	MIPS, 1330 Charleston Rd, Mtn View, CA 94043