[comp.sys.sun] Subject: Re: Inexplicable loss of DTR on async port

tbr@tfic.bc.ca (Tom Rushworth) (02/14/91)

wsrcc!wolfgang@uunet.uu.net (Wolfgang S. Rupprecht) writes:
>jstewart@ccs.carleton.ca (John Stewart) writes:
>
>>On several occasions in the last two weeks, the port has got into a state
>>in which DTR is not present.  When this happens, the modem of course
>>refuses to answer incoming calls.  uucp is still able to initiate outgoing
>>calls.
>
>I see the same behavior on an SLC running sunos 4.1 and a tb+.  It appears
>to only happen on line power going away for the modem *and* cpu.
>Rebooting the SPARC with the modem still on fixes the problem.

Have you tried killing the offending getty?  (Much simpler if it works...)

>Perhaps the "DTR hang" is a related problem induced by all of the modem
>control lines being false while the modem is doing its self test.

We have similar symptoms, but ours have nothing to do with the line power;
see below...

We've been having the same sort of trouble with losing DTR on our
dial-in/dial-out serial port.  I haven't been able to pin down exactly
what's causing it, but I'd like to provide a crude possible solution.  It
works for us; your mileage may vary.  Of course, I'm hoping someone has a
real solution...:-).

Our configuration:
   SPARCstation 1, SunOS 4.1, Telebit T2500 (Version GF7.00), using
   hardware (RTS/CTS) flow control at 19200.

Our problem:
   When we upgraded to 4.1, we had to add a parity diddle (P_ZERO) to our
   uucp dialing script in order to continue dialing into one of our neighbors.
   This seems to result in an occasional message of the form:
      Feb  5 10:03:40 tacitus vmunix: zs0: parity error ignored
   either when the outgoing call is initiated or terminated (not always one
   or the other).  No problem so far, just a nuisance.  Unfortunately, about
   one in thirty times what we get instead is a group of three messages:
      Feb  5 09:53:37 tacitus getty: ioctl(TCGETS): Bad file number
      Feb  5 09:53:37 tacitus getty: ioctl(TCGETS): Operation not supported on socket
      Feb  5 09:53:37 tacitus getty: ioctl(TCGETS): Operation not supported on socket
   At this point, while the getty is still running, it has dropped DTR, so no
   incoming calls get answered.  The getty stays in this state until killed,
   when all returns to normal.  I was actually watching the modem and
   console when this happened, so I'm fairly sure it was the outgoing call
   that triggered it.
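
The failure signature above can be picked out of the log mechanically, which
is what getty_check0.awk in the archive below does.  Here is a rough sketch of
the same last-match-wins logic run against sample lines instead of
/usr/adm/messages.  The function name getty_state, the use of plain "awk"
rather than /bin/nawk, and the timestamp on the "cleared" line are assumptions
made for illustration only.

```sh
#!/bin/sh
# Sketch: same last-match-wins logic as getty_check0.awk, reading stdin.
# Assumption: plain "awk" stands in for /bin/nawk on other systems.
getty_state() {
    awk '
        BEGIN { st = "foo" }    # default when no relevant line is seen
        /getty: ioctl\(TCGETS\): Bad file number/ { st = "HUNG" }
        /getty_check: hung getty cleared/         { st = "OK" }
        END { print st }
    '
}

# First line is copied from the log excerpt; the second line is hypothetical.
sample_hung='Feb  5 09:53:37 tacitus getty: ioctl(TCGETS): Bad file number'
sample_cleared="$sample_hung
Feb  5 10:00:00 tacitus getty_check: hung getty cleared"

printf '%s\n' "$sample_hung"    | getty_state
printf '%s\n' "$sample_cleared" | getty_state
```

Because the later pattern overwrites the earlier one, a failure that has
already been logged as cleared does not cause a second kill on the next run.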

Is this a bug in getty, or in the serial driver (e.g. returning to the
"open" in getty when the error happens even though the port is not really
open)?  Anyway, I no gotta da source, so I no can fix :-(.  In order to
keep mail and news moving, I came up with the following krude hack, to be
run as root by cron as often as you need it.  It only works for one line,
but since that's all we have.... Any improvements (or a real solution!)
would be most welcome.
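
For anyone who wants to see what the hack would do before letting cron run it
as root, here is a minimal dry-run sketch of the selection logic in
getty_check1.awk.  It only prints the command instead of passing it to eval.
The function name pick_kill, the use of plain "awk", and the sample ps lines
(PIDs, terminals, command text) are all made up for illustration; check the
real "ps -ax" output on your own system.

```sh
#!/bin/sh
# Sketch: the selection logic of getty_check1.awk, reading ps-style lines
# on stdin and printing (not running) the resulting command.
pick_kill() {
    awk '
        BEGIN { nuke = "kill -HUP 1" }          # fallback if no getty line matches
        /std.19200/ { nuke = "kill -9 " $1 }    # first field of a ps line is the PID
        END { print nuke }
    '
}

# Hypothetical "ps -ax" output; every value here is invented.
printf '%s\n' \
    '  PID TT STAT  TIME COMMAND' \
    '  123 a  I     0:00 - std.19200 ttya (getty)' \
    '  456 co S     0:01 -csh (csh)' | pick_kill
```

Note that the pattern keys on the gettytab entry name (std.19200), so it would
need changing for a line running at a different speed, and with more than one
matching getty only the last one listed is picked.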

-------------------------- cut here -------------- cut here ----------------
#!/bin/sh
# This is a shell archive (shar 3.47)
# made 02/05/1991 18:42 UTC by root@tacitus
# Source directory /etc/uucp
#
# existing files will NOT be overwritten unless -c is specified
#
# This shar contains:
# length  mode       name
# ------ ---------- ------------------------------------------
#    385 -rwxr-xr-x getty_check
#    136 -rw-r--r-- getty_check0.awk
#     82 -rw-r--r-- getty_check1.awk
#
# ============= getty_check ==============
if test -f 'getty_check' -a X"$1" != X"-c"; then
	echo 'x - skipping getty_check (File already exists)'
else
echo 'x - extracting getty_check (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'getty_check' &&
#!/bin/sh
#
# check for error messages from  getty in message log and nuke failed
# getty if found
#
if [ `/bin/nawk -f /etc/uucp/getty_check0.awk /usr/adm/messages` = "HUNG" ]; then
X   /usr/ucb/logger -t getty_check -p auth.err "hung getty cleared"
X   eval `/bin/ps -ax | /bin/nawk -f /etc/uucp/getty_check1.awk`
else
X   /usr/ucb/logger -t getty_check -p auth.err "getty seems OK"
fi
SHAR_EOF
chmod 0755 getty_check ||
echo 'restore of getty_check failed'
Wc_c="`wc -c < 'getty_check'`"
test 385 -eq "$Wc_c" ||
	echo 'getty_check: original size 385, current size' "$Wc_c"
fi
# ============= getty_check0.awk ==============
if test -f 'getty_check0.awk' -a X"$1" != X"-c"; then
	echo 'x - skipping getty_check0.awk (File already exists)'
else
echo 'x - extracting getty_check0.awk (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'getty_check0.awk' &&
BEGIN {st = "foo"}
/getty: ioctl\(TCGETS\): Bad file number/ {st = "HUNG"}
/getty_check: hung getty cleared/ {st = "OK"}
END {print st}
SHAR_EOF
chmod 0644 getty_check0.awk ||
echo 'restore of getty_check0.awk failed'
Wc_c="`wc -c < 'getty_check0.awk'`"
test 136 -eq "$Wc_c" ||
	echo 'getty_check0.awk: original size 136, current size' "$Wc_c"
fi
# ============= getty_check1.awk ==============
if test -f 'getty_check1.awk' -a X"$1" != X"-c"; then
	echo 'x - skipping getty_check1.awk (File already exists)'
else
echo 'x - extracting getty_check1.awk (Text)'
sed 's/^X//' << 'SHAR_EOF' > 'getty_check1.awk' &&
BEGIN {nuke = "kill -HUP 1" }
/std.19200/ {nuke = "kill -9 " $1}
END {print nuke}
SHAR_EOF
chmod 0644 getty_check1.awk ||
echo 'restore of getty_check1.awk failed'
Wc_c="`wc -c < 'getty_check1.awk'`"
test 82 -eq "$Wc_c" ||
	echo 'getty_check1.awk: original size 82, current size' "$Wc_c"
fi
exit 0
----
Tom Rushworth (604) 733-0731 [FAX: 733-0634] | uunet!ubc-cs!van-bc!tacitus!tbr
   Timberline Forest Inventory Consultants   | or: tbr@tfic.bc.ca