[comp.sys.ibm.pc.rt] 4-port card problem with 12/88 AOS

edler@cmcl2.NYU.EDU (Jan Edler) (05/24/89)

Someone posted something about this a few weeks ago, but I don't
remember who, and never saw a followup.  We just tried to bring up the
December 1988 AOS kernel on an RT with a buffered 4-port asynchronous
card, and discovered that, indeed, it doesn't work.  We can keep
running the September 1988 kernel on this system for a while, but does
anyone have a fix?  I've looked at the driver, and I think the problem
is fairly obvious, but I'd rather not have to fix it myself.

Jan Edler
edler@nyu.edu

dyer@spdcc.COM (Steve Dyer) (05/24/89)

In article <39727@cmcl2.NYU.EDU> edler@cmcl2.UUCP (Jan Edler) writes:
>Someone posted something about this a few weeks ago, but I don't
>remember who, and never saw a followup.  We just tried to bring up the
>December 1988 AOS kernel on an RT with a buffered 4-port asynchronous
>card, and discovered that, indeed, it doesn't work.  We can keep
>running the September 1988 kernel on this system for a while, but does
>anyone have a fix?  I've looked at the driver, and I think the problem
>is fairly obvious, but I'd rather not have to fix it myself.

Yes, I was the one posting about this.  I haven't heard squat, and
I'm still running the Sep. kernel.

I tried removing the software interrupt processing from the driver,
the feature which seems to be responsible for diffs in the TTY subsystem's
code (TTY layer, ldisc and drivers) between Sep 88 and Dec 88, but what
I ended up with was an almost continual screenful of "asyN: overrun"-type
errors given even the slightest input.  I may have blown it, but I don't
think so.

-- 
Steve Dyer
dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
dyer@arktouros.mit.edu

seeger@poe.ufnet.ufl.edu (F. L. Charles Seeger III) (05/24/89)

In article <39727@cmcl2.NYU.EDU> edler@cmcl2.UUCP (Jan Edler) writes:
|Someone posted something about this a few weeks ago, but I don't
|remember who, and never saw a followup.  We just tried to bring up the
|December 1988 AOS kernel on an RT with a buffered 4-port asynchronous
|card, and discovered that, indeed, it doesn't work.  We can keep
|running the September 1988 kernel on this system for a while, but does
|anyone have a fix?  I've looked at the driver, and I think the problem
|is fairly obvious, but I'd rather not have to fix it myself.

I, too, have a problem with the 4-port buffered async board.  I set the
interrupt to 10, as 9 and 11 were in use (yes, I know that they are
supposed to be shareable).  The kernel then complained that there
couldn't be devices using interrupts 10 and 0 (yes, zero).  So, I've
yanked the board back out and won't do anything with it until I have
nothing better to do.  I can't find any reference to interrupt 0, but
then IBM chose not to ship me full documentation, either.

My system is a 6150-135, 16 MB, 3 310 EESDI drives, APA16 monitor, 2
Ethernet interfaces, and it runs the December '88 AOS release.

Previously, I ran into another problem configuring the kernel.  Apparently,
the SGP option must be included if you have 16 MB, but isn't necessary for
an 8 MB machine.  Their kernel configuration sources are pretty ugly, BTW.
This problem looks like a hardware conflict with other RT models that
weren't ifdef'ed properly.  The SGP option just happened to prevent the
nastiness.  Supposedly, the SGP option is necessary only if you have older
RT models on the network (I don't).

Nonetheless, the IP routing works, and the latest Berkeley networking and
mail code ported quite easily.  It still needs a lot of the BSD fixes from 
the past year and a half, though.

Always grateful to have someone else fix my problems.  8-)

Regards,
Chuck
--
  Charles Seeger            216 Larsen Hall             +1 904 392 8935
  Electrical Engineering    University of Florida       Just say NO to
  seeger@iec.ufl.edu        Gainesville, FL 32611       EtherTalk

edler@cmcl2.NYU.EDU (Jan Edler) (05/25/89)

Apparently this problem is known to IBM, and considered important,
but there is no fix yet.  The problem also probably applies to psp
ports (the ones on the motherboard of the 6150).  My guess, from looking
at the code, is that you won't see the bug very often if you only use a
single port of each type (asy or psp).

If I can't resist, I might take a crack at it myself; if I do I'll
let you all know.  In the meantime, keep running those September
kernels on your RTs with serial port usage!

Jan Edler
edler@nyu.edu

wlm@archet.UUCP (William L. Moran Jr.) (05/26/89)

In article <39755@cmcl2.NYU.EDU> edler@cmcl2.UUCP (Jan Edler) writes:
>Apparently this problem is known to IBM, and considered important,
>but there is no fix yet.  The problem also probably applies to psp
>ports (the ones on the motherboard of the 6150).  My guess, from looking
>at the code, is that you won't see the bug very often if you only use a
>single port of each type (asy or psp).

Yes, this problem is known to IBM (and has been for at least 18
months). A friend and I have done a great deal of experimentation to
determine what causes (or exacerbates) the problem and have found
the following:

- Using even one serial port is sufficient to cause problems
- Fast disk controllers may make the problem worse
- The megapel makes the problem worse, the APA16 is a little better,
  and the mono is a little better than that. Don't ever try to use the
  serial port with a megapel in glass tty mode - it's instant death.
- There is some threshold at which the problem really kicks in (i.e.,
  at 2400 baud it is usually possible to ~t 50k in tip, but it is
  almost never possible to take 200k).
- A trigger level in the middle is better than the highest (I forget
  whether it's 14 or 15, and am too lazy to look in /usr/sys :)
- The problem is worse when using a modem than when using just a
  direct line (several unconvincing explanations for this have been
  offered).  
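
One way to picture why a mid-range trigger level might help is to
estimate how long the driver can delay servicing the receive interrupt
before the FIFO overflows. The sketch below is only illustrative. It
assumes a 16-byte receive FIFO and 10 bits per character (start, 8
data, stop); neither figure is stated in this thread, and the function
names are hypothetical, not taken from the AOS driver.

```python
# Rough estimate of interrupt-latency headroom for a FIFO-buffered UART.
# ASSUMPTIONS (not from the thread): 16-byte receive FIFO, 10 bits per
# character on the wire. All names here are hypothetical.

FIFO_DEPTH = 16      # assumed receive FIFO size, in characters
BITS_PER_CHAR = 10   # assumed framing: 1 start + 8 data + 1 stop


def char_time_ms(baud):
    """Time to receive one character at the given line speed, in ms."""
    return 1000.0 * BITS_PER_CHAR / baud


def latency_budget_ms(baud, trigger_level, fifo_depth=FIFO_DEPTH):
    """Approximate time from the trigger interrupt until the FIFO is full.

    The interrupt fires once `trigger_level` characters are queued, so
    only `fifo_depth - trigger_level` slots remain. This ignores the
    receiver shift register, which gives a little extra slack.
    """
    if not 1 <= trigger_level <= fifo_depth:
        raise ValueError("trigger_level must be between 1 and fifo_depth")
    headroom = fifo_depth - trigger_level
    return headroom * char_time_ms(baud)


if __name__ == "__main__":
    for baud in (2400, 19200):
        for trigger in (8, 14):
            budget = latency_budget_ms(baud, trigger)
            print("baud=%5d trigger=%2d budget=%.2f ms" % (baud, trigger, budget))
```

Under these assumptions, a trigger level of 14 at 19200 baud leaves
only about a millisecond before characters are lost, while a trigger
level of 8 leaves roughly four times that. Anything that holds off
interrupts for longer would then show up as overruns.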

				Bill Moran

This is offered just as information. Nothing I say should be taken as
representing the views of IBM Research (I'm just a grad. student :)





-- 
arpa: moran-william@cs.yale.edu or wlm@ibm.com
uucp: uunet!bywater!acheron!archet!wlm or decvax!yale!moran-william
-------------------------------------------------------------------------------
To keep on running, try with all our might,
But in the midst of effort faint and fail;

dyer@spdcc.COM (Steve Dyer) (05/27/89)

In article <66@archet.UUCP> wlm@archet.UUCP (William L. Moran Jr.) writes:
>Yes, this problem is known to IBM (and has been for at least 18
>months). A friend and I have done a great deal of experimentation to
>determine what causes (or exacerbates) the problem and have found
>the following:
>
>[lots of hardware discussions]

I hope that the remaining people at IBM who are responsible for AOS 4.3 support
aren't unduly misled by this list, because although I'm sure what you
describe may be a serious problem itself, it is NOT the problem that Jan
and I are describing because different software with the same hardware
behaves differently.  There is a clear software problem.  Something
broke between Sep 88 and Dec 88.

I have a system with one EESDI controller, and one ST506 controller; 3
drives in all (1 ESDI, 2 ST506).  One 4 port buffered card with one port
used constantly as a 19.2kb SLIP line and the remaining 3 ports in use
with Telebits ranging from 1200 to 19.2kb.  An APA16.

Simply put, the Dec 88 kernel doesn't work on the 4 port serial lines AT ALL.
I am running with a Dec 88 software base but with a Sep 88 kernel.  This
set-up works fine.  I can't imagine why it should take IBM so long to fix
this.  After all, something changed within a 3 month release cycle, and
both sources are presumably available to them for inspection.

-- 
Steve Dyer
dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
dyer@arktouros.mit.edu

wlm@archet.UUCP (William L. Moran Jr.) (05/28/89)

In article <3383@ursa-major.SPDCC.COM> dyer@ursa-major.spdcc.COM (Steve Dyer) writes:
>I hope that the remaining people at IBM who are responsible for AOS 4.3 support
>aren't unduly misled by this list, because although I'm sure what you
>describe may be a serious problem itself, it is NOT the problem that Jan
>and I are describing because different software with the same hardware
>behaves differently.  There is a clear software problem.  Something
>broke between Sep 88 and Dec 88.
>
... A description of a working system deleted
>
>Simply put, the Dec 88 kernel doesn't work on the 4 port serial lines AT ALL.
>I am running with a Dec 88 software base but with a Sep 88 kernel.  This
>set-up works fine.  I can't imagine why it should take IBM so long to fix
>this.  After all, something changed within a 3 month release cycle, and
>both sources are presumably available to them for inspection.

I agree that there is a software problem, but the software problem is
exacerbated by certain hardware (I described some of this). It is not
correct to say that the Dec. kernel doesn't work on a 4 port. I'm
posting this from an RT running the Dec. release; the serial ports are
on a 4 port. Of course, I should point out that this is a buffered 4
port; it is possible that it doesn't work at all on the unbuffered. I
would be the last one to claim that it works well - I crash about once
a day getting news, but it does work.

Anyway, it's good to see people complaining about the 4 port in a
public place since this may force someone to fix the problems the RT
kernel has with serial support.

				Bill Moran
 


-- 
arpa: moran-william@cs.yale.edu or wlm@ibm.com
uucp: uunet!bywater!acheron!archet!wlm or decvax!yale!moran-william
-------------------------------------------------------------------------------
To keep on running, try with all our might,
But in the midst of effort faint and fail;