[comp.sys.next] uucico lock-ups under 2.0

paul@cgh.uucp (Paul Homchick) (02/23/91)

I am having an intermittent problem with uucico lock-ups.  Three times
since installing version 2.0 of Nextstep, I have had uucico lock up
the serial port.  Uucico appears in the process table, but it isn't
using any cpu resources and it isn't connected to a remote site.
Since uucico has the port, getty isn't running and no one can call nor
can any process call out.  Killing uucico with a 'kill -9' doesn't
free things up, either, as tip and uucico report that they can't open
/dev/cua.  Nor does cycling the modem power help.  The only way I've
found to get the port back is to reboot the machine.

The modem is a Trailblazer Plus hooked up with a cable wired according to
the '030 wiring recommendations.  Nothing has changed in getty or ttys or
L.sys or L-Devices, and this setup worked fine for a year under 1.0.

Does anyone know for certain if there are documented problems with uucico
under 2.0??  Since I travel, this problem has resulted in having the
machine locked up for four days last week, and five days this week,
and my uucp neighbors aren't real happy campers.

New subject.  Warning, what follows has nothing to do with the problem I
really could use some help with, but some may find it interesting.

I called NeXT and because I am not a registered developer, they didn't want
to let me talk to technical support.  The young lady on the phone said she
gets 'yelled at' when she does that.  She gave me the name of the campus
support person at the school where I purchased the machine.  I did point
out that people do move, and that I hadn't been associated with that school
for nine months, and in fact, lived 1500 miles away from there, but
procedures are procedures and all I could get was this phone number.

So, armed with this number, I called this fellow up.  His first
comment was "who gave you my number?"  Not a good sign.  He was very
polite, but said he couldn't help and suggested I find a Businessland
or a NeXT office in my area.  His second comment was "Why don't you
try a posting on internet?"

There used to be a Businessland near me, but they went bankrupt.
There is another Businessland in downtown Philadelphia, so I called
them and got an answering machine.  I left a message but no one called
back.  Two days later I called again and got a receptionist.  When I
explained I needed help with a NeXT, she said no one at the office
knew anything about NeXTs, but that the manager in the Baltimore
office might be able to help.  I said, "Thanks, but please don't
bother."

During all of this, it has occurred to me that if I had purchased a Ford
Taurus from a dealer in San Diego and then moved to Dallas, I wouldn't
have to take it back to San Diego to get it fixed, and certainly
they wouldn't suggest that I CALL the San Diego dealer to discuss any
problems.  However, with a NeXT...

---
Paul Homchick     :UUCP     {rutgers | uunet} !cbmvax!cgh!paul
                  :Internet                   cgh!paul@dsi.com
                  :MCI Mail                   PHOMCHICK
                  :GEnie                              HOMCHICK

jiro@shaman.com (Jiro Nakamura) (02/24/91)

In article <1481@cgh.UUCP> paul@cgh.uucp (Paul Homchick) writes:
>I am having an intermittent problem with uucico lock-ups.  Three times
>since installing version 2.0 of Nextstep, I have had uucico lock up
>the serial port.  Uucico appears in the process table, but it isn't
>using any cpu resources and it isn't connected to a remote site.
>Since uucico has the port, getty isn't running and no one can call nor
>can any process call out.  Killing uucico with a 'kill -9' doesn't
>free things up, either, as tip and uucico report that they can't open
>/dev/cua.  Nor does cycling the modem power help.  The only way I've
>found to get the port back is to reboot the machine.

  I had the problem too with 030/2.0. It doesn't seem to happen anymore with
the 040 board and flow control. Methinks that it has something to do with
a lost or spurious XON character confusing either the Trailblazer or
the NeXT. I.e., one of the two is waiting for an XON character that will
never come...
   I'd say to upgrade to an 040 and use hardware flow control. Using XON/XOFF
with UUCP 'g' protocol isn't very good.... 
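
One way to see why XON/XOFF and the 'g' protocol clash: 'g' packets carry
raw 8-bit data, so the bytes 0x11 (XON) and 0x13 (XOFF) can turn up in the
payload, where a port doing software flow control will treat them as
flow-control characters rather than data.  A minimal sketch, assuming a
POSIX shell with od, tr and grep; count_flow_bytes is a made-up helper
name for illustration, not a uucp tool:

```shell
#!/bin/sh
# count_flow_bytes FILE: print how many bytes in FILE are XON (0x11) or
# XOFF (0x13).  Hypothetical helper, for illustration only.
count_flow_bytes() {
    # od prints every byte as two hex digits; tr puts one byte per line;
    # grep -c counts the lines that are exactly 11 or 13.
    od -A n -v -t x1 "$1" | tr -s ' ' '\n' | grep -c -E '^(11|13)$'
}

# Usage: count_flow_bytes /path/to/a/file/queued/for/uucp
```

Any non-zero count means the file contains bytes that a link using
XON/XOFF could swallow or act on instead of passing through.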
  

>L.sys or L-Devices, and this setup worked fine for a year under 1.0.
 
   I think NeXT may have changed things in 2.0; I know that the revision
number has changed:

	shaman# strings /usr/lib/uucp/uucico 
	@(#)PROGRAM:uucico  PROJECT:cmds-42  DEVELOPER:root  BUILT:Sun Nov 11 17:21:04 PST 1990

>Does anyone know for certain if there are documented problems with uucico
>under 2.0??  Since I travel, this problem has resulted in having the
>machine locked up for four days last week, and five days this week,
>and my uucp neighbors aren't real happy campers.
   
     Get an 040. Software flow control seems to have broken. But I haven't
heard of any "documented" problems; this is only from my own experience,
short though it is.


> [Some griping about NeXT Tech Support and NeXT Campus Support not
>  being too helpful.]

  I agree: unless you are a developer, the tech support isn't the best. But
whose is? My experience as a developer is that developer tech support is one
of the best in the industry. But even now, turnaround time in Ask_NeXT is
about a week; if they opened it up to non-developers, I expect it would get
much worse.
  What is the answer? Who to pamper and at whose cost? NeXT is currently
pampering developers a lot (and I'm the last person to complain) but it is
at the end-user's expense a bit... sigh.

> [Some griping about Businessland being their usual incompetent selves.]

  Yes, sadly NeXT has not trained Businessland enough. Businessland, I think,
was a failure because no one there knows how to use these really wonderful
machines. Businessland *should* be providing end-user support; that makes
bunches of sense. But they are either too lazy or dim-sighted to do so.

  Paul, I sympathize greatly with you. My suggestion would be to try to
register as a developer. The process is necessarily a hassle and not many
make it through, and you'd have to go to the course in SFO if you want
tech support, but maybe it is worth it.... The NeXT has a great future
ahead of it; I just wish there were more ways to ease people's frustrations
with the bottom level support.

  - Jiro Nakamura
    jiro@shaman.com
-- 
Jiro Nakamura				jiro@shaman.com
Shaman Consulting			(607) 253-0687 VOICE
"Bring your dead, dying shamans here!"	(607) 253-7809 FAX/Modem

glenn@heaven.woodside.ca.us (Glenn Reid) (02/25/91)

In article <1481@cgh.UUCP> paul@cgh.uucp (Paul Homchick) writes:
>I am having an intermittent problem with uucico lock-ups.  Three times
>since installing version 2.0 of Nextstep, I have had uucico lock up
>the serial port.

>     ...Nor does cycling the modem power help.  The only way I've
>found to get the port back is to reboot the machine.

There is a serious bug in the serial driver in 2.0 related to
incoming calls (and uucico).  You basically can't receive calls
without wedging the serial port.  If you restrict your UUCP traffic
to call-only until this bug is fixed, you shouldn't have any problems
with it locking up.

It's bug #13061.

You should tell your UUCP neighbors not to call you ("Never" in the
L.sys file) and you should set up your /etc/crontab file to poll
them every so often.  Send me Email if you don't know how to do this.
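
A minimal sketch of the polling half of this workaround, assuming the
4.3BSD-style uucico in /usr/lib/uucp (the path shown earlier in this
thread) with its usual -r1 (master role) and -s (system) options.  The
neighbor names, the script name poll-neighbors, and the crontab line in
the comment (including its user field) are illustrative assumptions, not
NeXT's documented setup:

```shell
#!/bin/sh
# poll-neighbors (hypothetical name): call each listed UUCP neighbor once.
# UUCICO can be overridden from the environment; the default path is the
# one quoted in this thread.
UUCICO=${UUCICO:-/usr/lib/uucp/uucico}

poll_neighbors() {
    for site in "$@"; do
        # -r1 = start in master (calling) role, -s = system to call
        "$UUCICO" -r1 -s"$site" || echo "poll of $site failed" >&2
    done
}

# When run as a script, the sites to poll come from the command line.
if [ $# -gt 0 ]; then
    poll_neighbors "$@"
fi

# Assumed crontab line (hourly, run as user uucp, sites are examples):
#   0 * * * * uucp /usr/lib/uucp/poll-neighbors cbmvax uunet
# On the neighbor's side, the time field of their L.sys entry for your
# machine would read Never, so that they queue mail but never dial you.
```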

Glenn

-- 
 Glenn Reid				RightBrain Software
 glenn@heaven.woodside.ca.us		NeXT/PostScript developers
 ..{adobe,next}!heaven!glenn		415-851-1785 (fax 851-1470)

lloyd@Axecore.COM (Lloyd Buchanan) (02/25/91)

In article <434@heaven.woodside.ca.us> glenn@heaven.woodside.ca.us (Glenn Reid) writes:
>In article <1481@cgh.UUCP> paul@cgh.uucp (Paul Homchick) writes:
>>I am having an intermittent problem with uucico lock-ups.  Three times
>>since installing version 2.0 of Nextstep, I have had uucico lock up
>>the serial port.
>
>There is a serious bug in the serial driver in 2.0 related to
>incoming calls (and uucico).  You basically can't receive calls
>without wedging the serial port.  If you restrict your UUCP traffic
>to call-only until this bug is fixed, you shouldn't have any problems
>with it locking up.
>
>It's bug #13061.
>
>You should tell your UUCP neighbors not to call you ("Never" in the
>L.sys file) and you should set up your /etc/crontab file to poll
>them every so often.  Send me Email if you don't know how to do this.
>
>Glenn

I have this same extremely annoying problem.  This worked for me:

	Put a "cp /dev/null /dev/cuf[ab]" in your crontab for 
	execution every 15 min or so.  It seems to clear the hung
	port and doesn't interfere with normal operations.
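
A minimal sketch of that crontab workaround, assuming a 4.3BSD-style
/etc/crontab with a user field and no */15 step syntax (both are
assumptions about the local cron).  Because cp takes a single
destination file, the [ab] above is read here as "whichever of cufa or
cufb your modem is on", and the loop handles each port separately;
clear_ports is a hypothetical helper name:

```shell
#!/bin/sh
# clear_ports DEVICE...: open each device for writing and close it again
# by copying /dev/null onto it, one device at a time.
clear_ports() {
    for port in "$@"; do
        cp /dev/null "$port" || echo "could not clear $port" >&2
    done
}

# Assumed crontab line, every 15 minutes, for a modem on /dev/cufa:
#   0,15,30,45 * * * * root cp /dev/null /dev/cufa
```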

Lloyd
-- 
Lloyd Buchanan                          lloyd@Axecore.COM
Axe Core Investors                      uupsi!axecore!lloyd
Axe Castle 	                        (914) 333-5226 (phone)
Tarrytown,  NY  10591                   (914) 333-5208 (FAX)

das15@cunixa.cc.columbia.edu (Douglas A Scott) (02/26/91)

In article <1991Feb25.133608.7343@Axecore.COM> lloyd@Axecore.COM (Lloyd Buchanan) writes:
>In article <434@heaven.woodside.ca.us> glenn@heaven.woodside.ca.us (Glenn Reid) writes:
>>In article <1481@cgh.UUCP> paul@cgh.uucp (Paul Homchick) writes:
>>>I am having an intermittent problem with uucico lock-ups.  Three times
>>>since installing version 2.0 of Nextstep, I have had uucico lock up
>>>the serial port.
>>
>>There is a serious bug in the serial driver in 2.0 related to
>>incoming calls (and uucico).  You basically can't receive calls
>>without wedging the serial port.  If you restrict your UUCP traffic
>>to call-only until this bug is fixed, you shouldn't have any problems
>>with it locking up.
>>
>>It's bug #13061.
>>
>>You should tell your UUCP neighbors not to call you ("Never" in the
>>L.sys file) and you should set up your /etc/crontab file to poll
>>them every so often.  Send me Email if you don't know how to do this.
>>
>>Glenn
>
>I have this same extremely annoying problem.  This worked for me:
>
>	Put a "cp /dev/null /dev/cuf[ab]" in your crontab for 
>	execution every 15 min or so.  It seems to clear the hung
>	port and doesn't interfere with normal operations.
>

	Is this a problem that shows up on a '030 machine when the OS is 
	updated to 2.0, or only when the hardware is updated??  Does /dev/cuf
	appear on the system when the software update (only) is run?  I am
	about to get my 2.0 upgrade, but I get ALL my mail by being called
	up by a UUCP host, and cannot afford to be shut down until bug 13061
	is repaired!  Can I restore the 1.0a uucp code and run it under 2.0?
	Or is this some other section of code that is bad?  2.0:  arrgh.


___________________________________________________________________________
Douglas Scott
zardoz!doug%woof.columbia.edu

nerd@percy.rain.com (Michael Galassi) (02/27/91)

In article <434@heaven.woodside.ca.us> glenn@heaven.woodside.ca.us (Glenn Reid) writes:
>In article <1481@cgh.UUCP> paul@cgh.uucp (Paul Homchick) writes:
>>I am having an intermittent problem with uucico lock-ups.  Three times
>>since installing version 2.0 of Nextstep, I have had uucico lock up
>>the serial port.
>
>>     ...Nor does cycling the modem power help.  The only way I've
>>found to get the port back is to reboot the machine.
>
>There is a serious bug in the serial driver in 2.0 related to
>incoming calls (and uucico).  You basically can't receive calls
>without wedging the serial port.  If you restrict your UUCP traffic
>to call-only until this bug is fixed, you shouldn't have any problems
>with it locking up.

Here is a bit more detail; I was preparing a message for bug_next, but
since they already know of it...   The problem only shows up if you
actually send something to the NeXT in the uucico session; if you poll
with nothing to xmit, all is OK.   Also, in most cases (but not all)
you can turn the port off (in /etc/ttys), kill -1 1, turn the port
back on, and kill -1 1 again to have things work again.  Not pretty, but
it sure beats power cycling the machine.
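
The off/HUP/on/HUP sequence could be scripted roughly as follows.  This
is a sketch under assumptions: it supposes a 4.3BSD-style /etc/ttys in
which each line starts with the tty name and carries a separate "on" or
"off" word, the tty name ttya in the usage line is only an example, the
5-second pause is arbitrary, and both function names are hypothetical.
It must run as root, and it rewrites the matching line with single
spaces between fields:

```shell
#!/bin/sh
# set_tty_state FILE TTY on|off: rewrite the on/off word on TTY's line.
# Other lines are copied through unchanged.
set_tty_state() {
    sts_file=$1; sts_tty=$2; sts_state=$3
    sts_tmp="${TMPDIR:-/tmp}/ttys.$$"
    awk -v tty="$sts_tty" -v state="$sts_state" '
        $1 == tty {
            # Replace only stand-alone on/off words on the matching line.
            for (i = 2; i <= NF; i++)
                if ($i == "on" || $i == "off") $i = state
        }
        { print }
    ' "$sts_file" > "$sts_tmp" &&
    cat "$sts_tmp" > "$sts_file" &&
    rm -f "$sts_tmp"
}

# reset_port TTY [FILE]: turn the port off, HUP init, turn it back on,
# HUP init again.  SIGHUP (kill -1) to process 1 makes init reread ttys.
reset_port() {
    rp_tty=$1; rp_file=${2:-/etc/ttys}
    set_tty_state "$rp_file" "$rp_tty" off && kill -1 1 || return 1
    sleep 5
    set_tty_state "$rp_file" "$rp_tty" on && kill -1 1
}

# Usage (as root): reset_port ttya
```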

-michael
-- 
Michael Galassi				| nerd@percy.rain.com
MS-DOS:  The ultimate PC virus.		| ...!tektronix!percy!nerd

jkight@is-next.umd.edu (Jeff Kight) (02/28/91)

In article <1991Feb26.160009.3836@percy.rain.com> nerd@percy.rain.com (Michael Galassi) writes:
>Here is a bit more detail; I was preparing a message for bug_next, but
>since they already know of it...   The problem only shows up if you
>actually send something to the NeXT in the uucico session; if you poll
>with nothing to xmit, all is OK.   Also, in most cases (but not all)
>you can turn the port off (in /etc/ttys), kill -1 1, turn the port
>back on, and kill -1 1 again to have things work again.  Not pretty, but
>it sure beats power cycling the machine.
>
>-michael

Interesting problem: I've set up 3 NeXTs for uucp. The mail hub is a
NeXTstation, and the 2 remote machines are a NeXTstation and a 2.0 030 cube.
We have the same problems with lockups on remote->hub transfers...however
we are only using the serial ports on the remote machines, i.e., they poll
the hub machine by dialing into an annex and telnetting to the machine.

What seems to be happening (via debugging at level 99) is that the uucp
handshaking is ok, but when the execution of the actual transfer commences I
get reack errors which result in attempts to resend, but the errors build up.

Now, we are using 9600 baud MNP5 modems all around. Different brands have
been tried to no avail. Flow control is off (UUCP should handle everything
without any problems) and so is speed conversion (9600 baud connection).

NeXT has suggested that it could be related to the serial port bug.  Does
anybody have any ideas or suggestions? btw...I have absolutely no problems
with hub->remote transfers...

------------------------------------------------------------------------------
Jeff Kight                                                 jkight@umd5.umd.edu
Information Services                                        jkight@umd2.bitnet
Computer Science Center                              uunet!umd5.umd.edu!jkight
University of Maryland                                          (301) 405-3014

jiro@shaman.com (Jiro Nakamura) (03/01/91)

In article <8123@umd5.umd.edu> jkight@is-next.umd.edu (Jeff Kight) writes:
>Interesting problem: I've set up 3 NeXTs for uucp. The mail hub is a
>NeXTstation, and the 2 remote machines are a NeXTstation and a 2.0 030 cube.
>We have the same problems with lockups on remote->hub transfers...however
>we are only using the serial ports on the remote machines, i.e., they poll
>the hub machine by dialing into an annex and telnetting to the machine.
>
>What seems to be happening (via debugging at level 99) is that the uucp
>handshaking is ok, but when the execution of the actual transfer commences I
>get reack errors which result in attempts to resend, but the errors build up.


Jeff -
	I've been getting the exact same problems dialing into the local
mainframe for UUCP, so I switched to UUNET. I don't know what the exact
cause of the problem is, but the source is most probably the annex
modem bank. Why? Because I tried experimenting to see if I could rlogin
to UUNET via our annex port and save myself $$ in long distance charges.
	Exact same problem as dialing into the annex for the mainframe
UUCP. BZZZZ. Timeouts on ack'ing sent packets, right? Tried dialing
long distance to UUNET and things went smoothly again (this is on a
T2500).
 	I'd really like to know what the annexes are doing to mess things
up, but I do think they are the cause of our problems. If I could only get
the mainframe talking.... then I could save so much on long distance.
	NeXT helped me out *extensively*, but all we could come up with
was: What the *#@$*&@#$? It doesn't make sense.

	- Jiro Nakamura
	jiro@shaman.com
-- 
Jiro Nakamura				jiro@shaman.com
Shaman Consulting			(607) 253-0687 VOICE
"Bring your dead, dying shamans here!"	(607) 253-7809 FAX/Modem

jiro@shaman.com (Jiro Nakamura) (03/02/91)

Folks -
	Thanks to Steve Boker, I've found out that at least one of the
annexes that links our mainframes to the modems is using XON/XOFF flow
control (and thus munging up UUCICO). Damn.... :-(
        But a separate, direct line doesn't work either. Has anyone experienced
success connecting to a Sun 4/280 (I think it is using BNU UUCP) with
a NeXT? I seem to be experiencing the same lack-of-ack problem as with the
annexes, and this line doesn't use annexes....

 	- jiro
-- 
Jiro Nakamura				jiro@shaman.com
Shaman Consulting			(607) 253-0687 VOICE
"Bring your dead, dying shamans here!"	(607) 253-7809 FAX/Modem