[comp.sys.ncr] 32-600 HPSIO HELP?!?!?!: what is "threshold on overrun tally error"

wdh@holos0.uucp (Weaver Hickerson) (10/11/90)

A plea for help!  Man fights Tower, Tower wins.  Round X.

A customer is having problems with communications input on a 6-2-1 board.
We've implemented software on his Tower that sends/receives files to/from a
Burroughs mainframe, in 1210-byte blocks.  Sending works just fine.
Receiving, we get errors occasionally.  My software on the Tower displays
an error and hex-dumps the questionable block to a printable file.  (The
f**king protocol we're involved with has no way to request a block-level
retransmission, so once a bad block is received, even if it's block number
1,956, the only way to go is aborting and starting over.  That's another
story.)

Anyway, the block in question is always missing one or more bytes when I
compare the datascope trace against the block my software received.  At
9600 baud, the problem is more prevalent, and generates errors in the
system error log.

                     Here is a rough transcript:
+-------------------------------------------------------------------------+
HPSIO terminal incident
Subsystem/Module: tt03
Controller Address: 00ff1400
Hardware Status: threshold on overrun tally error  
# of errors: 100
Mode of operation at error time: 9600,even parity,1 stop bit,7 data bits
Link signals at error time:  DSR  DCD  CTS  DTR  RTS

Simultaneous Driver activity: 5, 3, 2, 0
+-------------------------------------------------------------------------+
This is driving us crazy.  We dropped down to 4800 baud, and the same
problem persists, but with no system error messages generated.

Does anybody know what "threshold on overrun tally error" means?  Could we
have a defective HPSIO board here, bad port or something?  Is a block
length of 1210 too large for these handy-dandy HPSIO's?

I'm begging...

Weaver
-- 
-Weaver Hickerson   Voice (404) 496-1358   :  ..!edu!gatech!holos0!wdh

wescott@Columbia.NCR.COM (Mike Wescott) (10/13/90)

In article <1990Oct11.165115.4811@holos0.uucp> wdh@holos0.uucp (Weaver Hickerson) writes:
> Does anybody know what "threshold on overrun tally error" means?

It means that some number (the threshold) of overrun events (hpsio
not getting characters out of uart before they're clobbered by more
characters arriving) have occurred.  It will happen more frequently
at higher baud rates.  It can be avoided if you can get H/W flow
control working.  If IRTS (see termio(7) or stty(1)) is set then
the uart will be programmed to automagically drop RTS when an overflow
condition is imminent.
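
For a sense of scale, here is the arithmetic behind "more frequently at
higher baud rates", as a minimal sketch.  It assumes the mode from the
error log (7 data bits, even parity, 1 stop bit), so each character is 10
bits on the wire once the start bit is counted.  The function names are
illustrative only, and the HPSIO's actual receive buffer depth is not
stated in this thread, so "one character time" is only a rough upper bound
on how long an unbuffered receiver could wait before losing data.

```python
def char_time_ms(baud, data_bits=7, parity_bits=1, stop_bits=1):
    """Time one asynchronous character occupies the line, in milliseconds."""
    bits_per_char = 1 + data_bits + parity_bits + stop_bits  # 1 = start bit
    return 1000.0 * bits_per_char / baud


def block_time_s(baud, block_bytes=1210):
    """Time to receive one block back to back, in seconds."""
    return block_bytes * char_time_ms(baud) / 1000.0


if __name__ == "__main__":
    for baud in (4800, 9600):
        print(baud, "baud:",
              round(char_time_ms(baud), 3), "ms per character,",
              round(block_time_s(baud), 2), "s per 1210-byte block")
```

At 9600 baud a character arrives roughly every millisecond; at 4800 baud
the interval doubles, so the same service latency is less likely to lose
a character.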

> Could we have a defective HPSIO board here, bad port or something?

Possible, but this isn't evidence of that kind of problem.

> Is a block length of 1210 too large for these handy-dandy HPSIO's?

Nope.
--
	-Mike Wescott
	 mike.wescott@ncrcae.Columbia.NCR.COM

root@texbell.sbc.com (Greg Hackney) (10/14/90)

In article <1990Oct12.184451.2985@nncrcae.Columbia.NCR.COM> wescott@Columbia.NCR.COM (Mike Wescott) writes:

>It means that some number (the threshold) of overrun events (hpsio
>not getting characters out of uart before they're clobbered by more
>characters arriving) have occurred.

>It can be avoided if you can get H/W flow
>control working.  If IRTS (see termio(7) or stty(1)) is set then
>the uart will be programmed to automagically drop RTS when an overflow
>condition is imminent.

Mike, is this affected by the number of NCLISTs in the kernel configuration?
--
Greg

wescott@Columbia.NCR.COM (Mike Wescott) (10/17/90)

In article <496@texbell.sbc.com> root@texbell.sbc.com (Greg Hackney) writes:
> In article <1990Oct12.184451.2985@nncrcae.Columbia.NCR.COM> wescott@Columbia.NCR.COM (Mike Wescott) writes:
>>It means that some number (the threshold) of overrun events (hpsio
>>not getting characters out of uart before they're clobbered by more
>>characters arriving) have occurred.

> Mike, is this affected by the number of NCLISTs in the kernel configuration?

Nope.  The hpsio will also invoke flow control when the low water mark is
reached, but the "overrun" error is an indication that interrupts weren't
serviced quickly enough.
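
The idea of an overrun tally reaching a threshold can be modelled with a
toy simulation.  This is a sketch under loud assumptions, not a
description of the HPSIO firmware: it assumes a one-character receive
register, treats the "# of errors: 100" line in the transcript as the
threshold, and assumes the tally resets after an incident is logged.  All
names and the sample delays are hypothetical.

```python
def overrun_incidents(service_delays_ms, char_time_ms, threshold=100):
    """Return (incidents_logged, leftover_tally) for a run of characters.

    service_delays_ms[i] is how long character i waits before being read.
    In this toy model a character is lost whenever that wait exceeds one
    character time, because the next character has already arrived.
    """
    tally = 0
    incidents = 0
    for delay in service_delays_ms:
        if delay > char_time_ms:      # next character clobbered this one
            tally += 1
            if tally == threshold:    # assumed: log once, then start over
                incidents += 1
                tally = 0
    return incidents, tally


if __name__ == "__main__":
    # Hypothetical latencies: mostly prompt, some slow, a few very slow.
    delays = [0.5] * 890 + [1.5] * 100 + [2.5] * 10
    for baud in (9600, 4800):
        char_ms = 1000.0 * 10 / baud  # 10 bits per character in 7E1 mode
        print(baud, overrun_incidents(delays, char_ms))
```

With these made-up delays the 9600 baud run crosses the threshold once,
while the 4800 baud run still loses a few characters without ever logging
an incident.  That would be consistent with the symptoms reported at 4800
baud, though the thread does not confirm this is how the driver counts.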
--
	-Mike Wescott
	 mike.wescott@ncrcae.Columbia.NCR.COM

wdh@holos0.uucp (Weaver Hickerson) (10/19/90)

Thanks, Mike and all those who sent help my way.  It turned out that the
protocol converter I was hooking to was monitoring a different pin than the
"standard" pin for DTE throttling.  Mike shined a lamp in that direction
with his first posting.

I engineered a new cable and FedEx'd that to the site.  With the new cable
in place, they are happily receiving their 9-10 megabyte files with no
problem.

Our Usenet feed just paid for itself for the year, and Mike Wescott made it
onto my Christmas card list.  Thanks again, folks.


Weaver
-- 
-Weaver Hickerson   Voice (404) 496-1358   :  ..!edu!gatech!holos0!wdh