[comp.unix.sysv386] Son of FAS?

richard@pegasus.com (Richard Foulk) (04/25/91)

I recall hearing that Equinox (or someone like that) equips their
smart serial card with a driver that doesn't use interrupts.  They
do some kind of polling from within one of the kernel's inner loops.

Doing away with the interrupt overhead supposedly results in a marked
performance gain.  (Seems reasonable.)

My question is: couldn't this same technique be used to good advantage
with the fifo-ized dumb serial cards?

Since most smart cards gain mostly from the reduced interrupt load they
place on the system wouldn't this blur the difference a bit more?


-- 
Richard Foulk		richard@pegasus.com

gemini@geminix.in-berlin.de (Uwe Doering) (04/25/91)

richard@pegasus.com (Richard Foulk) writes:

>I recall hearing that Equinox (or someone like that) equips their
>smart serial card with a driver that doesn't use interrupts.  They
>do some kind of polling from within one of the kernel's inner loops.
>
>Doing away with the interrupt overhead supposedly results in a marked
>performance gain.  (Seems reasonable.)
>
>My question is: couldn't this same technique be used to good advantage
>with the fifo-ized dumb serial cards?
>
>Since most smart cards gain mostly from the reduced interrupt load they
>place on the system wouldn't this blur the difference a bit more?

In theory, this would work. But you would need a FIFO that is longer
than the 16 bytes in the NS16550A. The kernel tick in almost all
UNIX SysVr[34] releases is 100 Hz. That's where they would have to
hook in their driver. So polling would occur every 10 milliseconds.
At 38400 bps (10 bits per character, counting the start and stop bits)
this would be 38.4 characters per poll. Obviously, this
is too much to fit in a 16 byte FIFO. If there were UARTs available
that are upwards compatible with the NS16550A, and have, for instance,
64 byte FIFOs, I would use polling in FAS, as it would indeed save a
lot of CPU time.
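
The arithmetic above can be checked with a few lines of Python. This is
only a sketch: it assumes 8N1 framing (10 bits on the wire per character),
and the function names are made up for illustration.

```python
import math

BITS_PER_CHAR = 10  # assumption: 8N1 framing (1 start + 8 data + 1 stop bit)


def chars_per_poll(bps, poll_hz, bits_per_char=BITS_PER_CHAR):
    """Characters that can arrive between two consecutive polls."""
    return bps / bits_per_char / poll_hz


def min_fifo_depth(bps, poll_hz, bits_per_char=BITS_PER_CHAR):
    """Smallest FIFO (in characters) that cannot overflow between polls."""
    return math.ceil(chars_per_poll(bps, poll_hz, bits_per_char))


def max_polled_bps(fifo_depth, poll_hz, bits_per_char=BITS_PER_CHAR):
    """Highest line rate a FIFO of this depth survives at this poll rate."""
    return fifo_depth * poll_hz * bits_per_char


if __name__ == "__main__":
    print(chars_per_poll(38400, 100))   # characters per 10 ms tick
    print(min_fifo_depth(38400, 100))   # FIFO depth needed at 38400 bps
    print(max_polled_bps(16, 100))      # ceiling for a 16-byte FIFO
```

Under these assumptions a 16-byte FIFO polled at 100 Hz tops out at
16000 bps, so 9600 bps would fit and 19200 bps would not, even before
allowing any margin for a late tick.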

      Uwe
-- 
Uwe Doering  |  INET : gemini@geminix.in-berlin.de
Berlin       |----------------------------------------------------------------
Germany      |  UUCP : ...!unido!fub!geminix.in-berlin.de!gemini

cpcahil@virtech.uucp (Conor P. Cahill) (04/25/91)

richard@pegasus.com (Richard Foulk) writes:

>My question is: couldn't this same technique be used to good advantage
>with the fifo-ized dumb serial cards?

There are several problems with polling.

The first problem is that the FIFOs probably aren't big enough to handle
polling.  Polling is limited to 1 query every kernel clock tick, and the
tick rate (HZ) is normally 100 in the sysv386 world.  If you are receiving
data at 38.4k (or approx 4,000 bytes per second) you would need at least a
FIFO of 40 bytes.  Of course to handle other timing considerations (like
scheduling) the buffer would have to be even larger.

Another problem is that unless the port could be configured with
some form of intelligence, the response to various events (drop of 
cts, XOFF, etc) would be delayed.

>Since most smart cards gain mostly from the reduced interrupt load they
>place on the system wouldn't this blur the difference a bit more?

They can only do this cleanly when they have both fifo space and intelligence
on the cards.

-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170

rcd@ico.isc.com (Dick Dunn) (04/26/91)

richard@pegasus.com (Richard Foulk) writes:
> I recall hearing that Equinox (or someone like that) equips their
> smart serial card with a driver that doesn't use interrupts.  They
> do some kind of polling from within one of the kernel's inner loops.

Hmmm...the most often you're guaranteed to get into the kernel is once per
clock tick, or 10 ms, which isn't often enough to poll a serial device.
Once you get into the device code on an interrupt, you can hang around and
pick up any additional characters that arrive while you're processing the
first one...but I think most serial drivers nowadays do that anyway.

> Doing away with the interrupt overhead supposedly results in a marked
> performance gain.  (Seems reasonable.)

Quite likely--the interrupt overhead really is most of the time you spend
processing an incoming character on a serial port.

The only catch is that you've got to be careful not to hang around polling
the device for very long, or you'll miss other interrupts.  (This is a
DOSism that just won't work in UNIXland.)
-- 
Dick Dunn     rcd@ico.isc.com -or- ico!rcd       Boulder, CO   (303)449-2870
   ...While you were reading this, Motif grew by another kilobyte.

bill@ssbn.WLK.COM (Bill Kennedy) (04/26/91)

richard@pegasus.com (Richard Foulk) writes:
>I recall hearing that Equinox (or someone like that) equips their
>smart serial card with a driver that doesn't use interrupts.  They
>do some kind of polling from within one of the kernel's inner loops.

Polling is quite common on interfaces with a heavy load or a lot of spigots.
I think (don't know because I don't have one) that Altos polls their multi
user I/O cards, i.e. 64 user systems.  It imposes considerable processor
burden but it scoops up the stuff that's just roaring in and out.

>Doing away with the interrupt overhead supposedly results in a marked
>performance gain.  (Seems reasonable.)

It does seem reasonable if you're dealing with a gawd awful interrupt
architecture like Intel and living with abysmal interrupt controllers
like the 8259.  On a system with decent context switch time and some
adequate hardware (meaning don't make the processor work extra just to
use the part) it makes sense to be thrifty with interrupts.  The 80286
is probably one of the worst on earth.  On a 6MHz AT&T PC6300 PLUS you
can't handle a steady stream of async characters >4800bps.  If you plan
to do much of anything else, 2400bps is max.  They document 1200bps as
max.  With something like the NS16550A where you can meddle with the FIFO
threshold you can make a 9600bps stream interrupt at the same frequency
as a 2400bps connection without FIFOs.  The overhead is stopping what
you're doing, saving the machine state, deciding where to go and getting
there.  The memory and I/O events while you're there are "free".
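
The FIFO-threshold claim above works out as follows. A sketch only: it
assumes 10 bits per character and a steady stream, and it ignores the
16550A's receive timeout interrupt. The NS16550A's selectable receive
trigger levels are 1, 4, 8 and 14 bytes; the function name is hypothetical.

```python
BITS_PER_CHAR = 10  # assumption: 8N1 framing


def rx_interrupts_per_sec(bps, trigger_level=1, bits_per_char=BITS_PER_CHAR):
    """Receive interrupts per second for a steady input stream.

    trigger_level=1 models a UART without a FIFO (one interrupt per
    character); larger values model the FIFO trigger threshold.
    """
    return bps / bits_per_char / trigger_level


if __name__ == "__main__":
    for level in (1, 4, 8, 14):
        print(level, rx_interrupts_per_sec(9600, level))
    print("2400 bps, no FIFO:", rx_interrupts_per_sec(2400))
```

With the trigger level at 4, a 9600 bps stream interrupts at the same
rate as a 2400 bps stream on a part with no FIFO.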

>My question is: couldn't this same technique be used to good advantage
>with the fifo-ized dumb serial cards?
>
>Since most smart cards gain mostly from the reduced interrupt load they
>place on the system wouldn't this blur the difference a bit more?

I heard that DigiBoard's latest stuff doesn't interrupt at all.  Sure, you
could poll dual ported memory and any number of things, but the easiest
thing to do is manage your FIFO threshold such that you take as few interrupts
as possible while servicing the maximum number of I/O events.  The 550A is
still going to interrupt after a period of "quiet" with a character in the
buffer.  What makes a whole lot of sense is to service every part you can
get to during the interrupt service for the one who interrupted.  By that I
mean if there's any output to do and the port is ready, send it the next
byte and collect all input and queue it for the top level routine as long
as any port has input available.  You spend a few cycles but not as many as
you would if each event required a separate context switch.
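
The "service every part you can get to" idea can be sketched with
simulated ports. Everything here is hypothetical (the Port class, the
field names, the max_passes bound); a real driver would read UART
registers instead of Python deques.

```python
from collections import deque


class Port:
    """Simulated serial port; stands in for one UART on a dumb card."""

    def __init__(self):
        self.rx_fifo = deque()   # bytes the UART has received
        self.tx_queue = deque()  # bytes the driver still has to send
        self.tx_ready = True     # transmitter can take one more byte
        self.rx_queue = []       # input queued for the top-level routine
        self.sent = []           # bytes handed to the transmitter


def service_all(ports, max_passes=4):
    """One interrupt entry: sweep every port until no port has work left.

    max_passes bounds the time spent in the handler so other interrupts
    are not starved. Returns the number of sweeps made.
    """
    passes = 0
    work = True
    while work and passes < max_passes:
        work = False
        for p in ports:
            while p.rx_fifo:                 # collect all pending input
                p.rx_queue.append(p.rx_fifo.popleft())
                work = True
            if p.tx_ready and p.tx_queue:    # send the next byte if possible
                p.sent.append(p.tx_queue.popleft())
                p.tx_ready = False           # busy until the byte shifts out
                work = True
        passes += 1
    return passes


if __name__ == "__main__":
    a, b = Port(), Port()
    a.rx_fifo.extend([1, 2])
    a.tx_queue.extend([9, 8])
    b.rx_fifo.append(3)
    print(service_all([a, b]), a.rx_queue, a.sent, b.rx_queue)
```

Whichever port raised the interrupt, one entry into the handler drains
input from all of them and feeds one byte to every idle transmitter.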

The old Z80-SIO was a pretty good example of a part designed with interrupts
in mind.  If you were using it in async mode it used the two bytes that it
kept for SDLC CRC as a FIFO.  That meant that as long as you grabbed the
first character before four characters had arrived, you were safe.  They also
have another thing called "auto-enables".  Like any hardware feature there
are two sharp edges on the blade.  When you set auto-enables you would not
get a receiver interrupt unless DCD was true and you wouldn't get a
transmitter interrupt unless CTS was true.  That sounds like an ideal thing
to do but it had as many drawbacks as blessings.  I've run some pretty high
speed stuff with hardware handshaking through an SIO and it was a pleasure.
At the same time, "pleasure" has twice as many syllables as most of the words
I had for auto-enables when I really didn't need them but used them anyway.

Polling makes sense when you know that you always have a lot to do.  If you
don't or aren't sure, then interrupts help you decide.  I don't, for example,
enable FIFOs in a '550A unless the baud rate is > 2400bps; below that you
don't need them.
If you become heavily loaded it might make some sense to stop interrupting
and start polling until things calm down.  Anybody want to write an async
driver _that_ smart? :-)
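
One way such a mode switch might be decided is with two thresholds, so
the driver does not flap between modes. This is purely illustrative:
the thresholds and the function name are invented, and no existing
driver is being described.

```python
def next_mode(mode, chars_last_tick, high=32, low=4):
    """Pick 'irq' or 'poll' for the next clock tick, with hysteresis.

    Switch to polling when the load reaches `high` characters per tick;
    return to interrupts only after it falls to `low` or below.
    """
    if mode == "irq" and chars_last_tick >= high:
        return "poll"
    if mode == "poll" and chars_last_tick <= low:
        return "irq"
    return mode


if __name__ == "__main__":
    mode = "irq"
    for load in (1, 40, 20, 10, 3, 1):
        mode = next_mode(mode, load)
        print(load, mode)
```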
-- 
Bill Kennedy  internet  bill@ssbn.WLK.COM or ssbn!bill@attmail.COM
              uucp      {att,cs.utexas.edu,pyramid!daver}!ssbn.wlk.com!bill

gandrews@netcom.COM (Greg Andrews) (04/26/91)

In article <1991Apr25.010758.1522@pegasus.com> richard@pegasus.com (Richard Foulk) writes:
>I recall hearing that Equinox (or someone like that) equips their
>smart serial card with a driver that doesn't use interrupts.  They
>do some kind of polling from within one of the kernel's inner loops.
>
>Doing away with the interrupt overhead supposedly results in a marked
>performance gain.  (Seems reasonable.)
>

Actually, they don't poll with the system processor.  The ports themselves
are probably handled by a dedicated processor on the board itself.  Since that
processor has (almost) nothing else to do, it can efficiently poll the
port circuit.  Since the system processor doesn't have to deal with the
low level byte-in/byte-out of the serial ports, it can efficiently do
everything else.

>
>My question is: couldn't this same technique be used to good advantage
>with the fifo-ized dumb serial cards?
>
>Since most smart cards gain mostly from the reduced interrupt load they
>place on the system wouldn't this blur the difference a bit more?
>

No that wouldn't work out very well.  Here's the best example I can think
of:

Polling works like the mailbox at your home.  Once a day, you go out and
check it to see if you've received mail.  Since you only have to poll it
once a day, it doesn't burn up too much of your time.  The mail comes at
regular intervals, so you know when to expect it and you don't have to
continually check the mailbox all day long.

Interrupts are like your telephone.  You have no idea when someone will
call, so you need a signal to get your attention.  When the bell rings,
you put down what you are doing and grab the phone.  If the phone had no
bell, you would have to constantly drop what you are doing and go check
if someone were calling.  Otherwise you might miss a call.  If you had
nothing else to do all day, this wouldn't be a problem.  However, if you
need to get other work done around the house, polling the phone would be
too wasteful of your time.

Asking the system processor to poll the serial ports would be a big waste
of its time.  It would have to check the port so often that everything
else would slow down to a crawl.  The processor is too busy checking if
another byte was received to get anything else done.  Even when nothing
is coming in.

Polling the hardware can be very efficient, since you don't have to waste
time putting the current task aside just to grab a byte out of the port.
But it's best performed by a processor dedicated to just that task.
If the processor must do other things, then asking it to poll the serial
ports will slow it down drastically.  Better to use interrupts so it won't
be bothered until there's real work to be done with the port.

Hope this helps...

-- 
.------------------------------------------------------------------------.
|  Greg Andrews   |       UUCP: {apple,amdahl,claris}!netcom!gandrews    |
|                 |   Internet: gandrews@netcom.COM                      |
`------------------------------------------------------------------------'

kdenning@pcserver2.naitc.com (Karl Denninger) (04/28/91)

In article <1991Apr26.013550.20175@netcom.COM> gandrews@netcom.COM (Greg Andrews) writes:
>In article <1991Apr25.010758.1522@pegasus.com> richard@pegasus.com (Richard Foulk) writes:
>>I recall hearing that Equinox (or someone like that) equips their
>>smart serial card with a driver that doesn't use interrupts.  They
>>do some kind of polling from within one of the kernel's inner loops.
>>
>>Doing away with the interrupt overhead supposedly results in a marked
>>performance gain.  (Seems reasonable.)
>
>Actually, they don't poll with the system processor.  The ports themselves
>are probably handled by a dedicated processor on the board itself.  Since that
>processor has (almost) nothing else to do, it can efficiently poll the
>port circuit.  Since the system processor doesn't have to deal with the
>low level byte-in/byte-out of the serial ports, it can efficiently do
>everything else.

Correct.

As far as you go, that is.

The Equinox board, in particular, uses one dedicated ASIC which is
specialized to handle serial I/O.  It uses no USARTs or other "standard"
serial interface chips.  All the logic is in ONE chip.

It is gawd-awfully good at what it does as well.  As it should be -- it was
developed for this and only this application.

>Polling works like the mailbox at your home.  Once a day, you go out and
>check it to see if you've received mail.  Since you only have to poll it
>once a day, it doesn't burn up too much of your time.  The mail comes at
>regular intervals, so you know when to expect it and you don't have to
>continually check the mailbox all day long.

The Equinox drivers and newer Digiboard drivers on the host side use a
derivative of polling.  They use ADAPTIVE polling.  This takes advantage of
the fact that you really can't tell the difference between a 50ms delay and
no delay in reading characters from a serial port at low baud rates..... and
the driver can "learn" the data rate and adjust its polling rate.  Also,
polling to check one bit is VERY fast (ie: is there anything in the buffer
that I have to deal with right now).  You can do this from the clock
interrupt with nearly no overhead.

Since the board has buffer memory on it, this works very well.  On each
"tick" the board can be checked for pending input and output buffer
availability.  If there is input pending or output buffer space (and you
have output for the card) you can then set a flag for later -- and when you
get around to it do the actual input and output in a batch.
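
The tick-driven scheme described above might look roughly like this.
It is a sketch under assumptions, not the Equinox or Digiboard
algorithm: the class, the doubling back-off, and the 5-tick ceiling
(50 ms at 100 Hz) are all invented for illustration.

```python
class AdaptivePoller:
    """Polls a board from the clock tick, backing off while it is idle."""

    def __init__(self, min_ticks=1, max_ticks=5):
        self.min_ticks = min_ticks
        self.max_ticks = max_ticks    # 5 ticks at 100 Hz = 50 ms
        self.interval = min_ticks     # current gap between polls, in ticks
        self.countdown = min_ticks
        self.pending = False          # flag set for the deferred batch work

    def on_tick(self, read_status):
        """Run from the clock interrupt. Returns True if the board was polled.

        read_status() models the cheap one-bit check: does the board have
        input waiting or output buffer space we can use?
        """
        self.countdown -= 1
        if self.countdown > 0:
            return False
        if read_status():
            self.pending = True
            self.interval = self.min_ticks          # busy: poll every tick
        else:
            self.interval = min(self.interval * 2, self.max_ticks)
        self.countdown = self.interval
        return True

    def run_deferred(self, transfer):
        """Later, outside the tick: move the data in one batch."""
        if self.pending:
            self.pending = False
            transfer()


if __name__ == "__main__":
    p = AdaptivePoller()
    print([p.on_tick(lambda: False) for _ in range(7)], p.interval)
```

The tick handler only tests a flag and sets another; the actual data
movement happens later in one batch, which is where the savings under
heavy load come from.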

This turns out to be much, much faster under heavy loads, and is
indistinguishable from the interrupt-at-a-time character mode during light
loads.

This kind of design is directly responsible for being able to run 24 ports
at 38,400 full duplex with NO flow control and no lost characters -- all
with single-digit load impact on the system processor.

I used to do this kind of thing in programmable controllers (where we didn't
HAVE the extra cycles - the "cpu" was a Z-80!).  If it's done right it's the 
best solution available.

--
Karl Denninger - AC Nielsen, Bannockburn IL (708) 317-3285
kdenning@nis.naitc.com

"The most dangerous command on any computer is the carriage return."
Disclaimer:  The opinions here are solely mine and may or may not reflect
  	     those of the company.

larry@nstar.rn.com (Larry Snyder) (04/28/91)

kdenning@pcserver2.naitc.com (Karl Denninger) writes:

>The Equinox drivers and newer Digiboard drivers on the host side use a

which version of the digiboard drivers (the 4.6?)

-- 
      Larry Snyder, NSTAR Public Access Unix 219-289-0287/317-251-7391
                         HST/PEP/V.32/v.32bis/v.42bis 
                        regional UUCP mapping coordinator 
               {larry@nstar.rn.com, ..!uunet!nstar.rn.com!larry}

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (05/01/91)

In article <1991Apr25.122805.1708@virtech.uucp> cpcahil@virtech.uucp (Conor P. Cahill) writes:

| There are several problems with polling.
| 
| The first problem is that the FIFOs probably aren't big enough to handle
| polling.  Polling is limited to 1 query every kernel clock tick, and the
| tick rate (HZ) is normally 100 in the sysv386 world.


| Another problem is that unless the port could be configured with
| some form of intelligence, the response to various events (drop of 
| cts, XOFF, etc) would be delayed.

  Your first point is fine, but who cares if it takes 10ms to recycle
after a call terminates. The cts probably dropped a second after the
carrier, so you will never know.

  Now, on the XOFF, any system with a FIFO may well keep right on
sending after the XOFF, and I think there's a standard which calls for
XOFF 1 sec before buffer full, I just can't remember where I saw it. We
hit this in 1978 or so when running a mainframe into an old S100 system
at 38.4.

| >Since most smart cards gain mostly from the reduced interrupt load they
| >place on the system wouldn't this blur the difference a bit more?
| 
| They can only do this cleanly when they have both fifo space and intelligence
| on the cards.

  The one real advantage to a really smart card is that you can have an
interrupt on every character, and allow XOFF to stop output in one
character time, etc. Note that even an 8250 has a one byte FIFO, so you
have a character in the shift register going out, and another in the
latch, committed to going out. So you can't easily stop in less than two
characters no matter what card you have (it can be done after one with
hardware).

-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (05/01/91)

In article <1991Apr27.231847.8873@pcserver2.naitc.com> kdenning@pcserver2.naitc.com (Karl Denninger) writes:

| The Equinox drivers and newer Digiboard drivers on the host side use a
| derivative of polling.  They use ADAPTIVE polling.  This takes advantage of
| the fact that you really can't tell the difference between a 50ms delay and
| no delay in reading characters from a serial port at low baud rates..... and
| the driver can "learn" the data rate and adjust its polling rate.  Also,
| polling to check one bit is VERY fast (ie: is there anything in the buffer
| that I have to deal with right now).  You can do this from the clock
| interrupt with nearly no overhead.
| 
| Since the board has buffer memory on it, this works very well.  On each
| "tick" the board can be checked for pending input and output buffer
| availability.  If there is input pending or output buffer space (and you
| have output for the card) you can then set a flag for later -- and when you
| get around to it do the actual input and output in a batch.

  What kind of latency do they get on XOFF? What's the worst case number
of characters or ms between the XOFF coming in and the output stopping?
The obvious answer is 10ms, or about 40 characters, but I hope that's
wrong, because it's too many.
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

kdenning@genesis.Naitc.Com (Karl Denninger) (05/02/91)

In article <3823@sixhub.UUCP> davidsen@sixhub.UUCP (bill davidsen) writes:
>In article <1991Apr27.231847.8873@pcserver2.naitc.com> kdenning@pcserver2.naitc.com (Karl Denninger) writes:
>
>| The Equinox drivers and newer Digiboard drivers on the host side use a
>| derivative of polling.  They use ADAPTIVE polling.  This takes advantage of
>| the fact that you really can't tell the difference between a 50ms delay and
>| no delay in reading characters from a serial port at low baud rates..... and
>| the driver can "learn" the data rate and adjust its polling rate.  Also,
>| polling to check one bit is VERY fast (ie: is there anything in the buffer
>| that I have to deal with right now).  You can do this from the clock
>| interrupt with nearly no overhead.
>| 
>| Since the board has buffer memory on it, this works very well.  On each
>| "tick" the board can be checked for pending input and output buffer
>| availability.  If there is input pending or output buffer space (and you
>| have output for the card) you can then set a flag for later -- and when you
>| get around to it do the actual input and output in a batch.
>
>  What kind of latency do they get on XOFF? What's the worst case number
>of characters or ms between the XOFF coming in and the output stopping?
>The obvious answer is 10ms, or about 40 characters, but I hope that's
>wrong, because it's too many.

Equinox guarantees that they can stop the output stream on an Xoff within 10
bit times (ONE character).  All the way to 38,400 baud.  Most of the time
it's at the end of the current character being output when the XOFF is fully
received.
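
To put the two latency figures side by side (a host that only notices
XOFF on the next 10 ms tick versus a board that reacts within one
character time), here is a small calculation. It assumes 10 bits per
character; the function names are made up.

```python
BITS_PER_CHAR = 10  # assumption: 8N1 framing


def chars_sent_during(bps, latency_ms, bits_per_char=BITS_PER_CHAR):
    """Characters still transmitted while an XOFF waits to be noticed."""
    return bps * latency_ms / 1000 / bits_per_char


def char_time_us(bps, bits_per_char=BITS_PER_CHAR):
    """Duration of one character on the wire, in microseconds."""
    return bits_per_char * 1_000_000 / bps


if __name__ == "__main__":
    print(chars_sent_during(38400, 10))  # overrun with a 10 ms reaction time
    print(char_time_us(38400))           # one character time at 38400 bps
```

At 38400 bps a 10 ms reaction time lets roughly 38 more characters out,
while stopping within 10 bit times means reacting in about 260
microseconds.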

They do this by implementing some of the line discipline on the board, which
allows them to do it >NOW<.

And yes, it really is that fast.  I've never seen a buffer overrun on any
hardware tied to these beasties when the Xon/Xoff flow control is turned on.

Their hardware flow control, until recently, wasn't as good.  There were
several characters of slop in there.  Fortunately I have Telebits which need
this feature, and they have more than enough buffer to handle it.  I hear 
they have new drivers which fix that.

If you can't tell I love the Equinox boards... :-)

--
Karl Denninger - AC Nielsen, Bannockburn IL (708) 317-3285
kdenning@nis.naitc.com

"The most dangerous command on any computer is the carriage return."
Disclaimer:  The opinions here are solely mine and may or may not reflect
  	     those of the company.