[comp.protocols.tcp-ip] interrupt-driven vs. polled I/O performance

amanda@lts.UUCP (04/14/89)

Dave Crocker argues with Henry Spencer about interrupts:
>[...]
>Interrupts kill.
>
>The raw machine processing time -- i.e., hardware instruction cost -- for
>interrupts is quite significant.  As with most features in this world,
>interrupts are excellent for their intended purpose and questionable for
>other situations.
>
>An interrupt is very useful when you don't expect much activity and don't
>want the overhead of polling.  On the other hand, if you DO expect heavy
>activity, polling is quite nice.

Well, I wouldn't say it's quite so clear cut as all of that.  Granted,
the absolute best raw data rate through a given interface (of whatever
sort) can be achieved by polling, principally because there is then no
need to save and restore any state--the processor is only doing I/O.
This is one of the reasons that separate I/O processors are such a big
performance win: I/O state gets encapsulated in hardware rather than
software.

What I think Henry was getting at, though, is that given a processor
which is doing both I/O and other tasks (or, for that matter, which is
being subjected to an unpredictable I/O load), careful attention to
how interrupts are handled can make an incredible difference in
real-time response and overall performance, for some of the same
reasons that polling wins.

The general idea is that you do as little processing as possible one
character at a time.  For example, a serial input interrupt service
routine should stuff the incoming character in a buffer *and that's it*.
This only takes a few microseconds on the kinds of processors being used
in most workstations and desktop computers.  In other words, interrupts
are used to approximate the behavior of a hardware FIFO.  No context
switches or wakeups, just "push a register or two, grab the input,
stuff it into a buffer, update the buffer head, and return."
This buffer is then polled by the upper-level code, which (especially
under high load) then has a chance of grabbing a whole block of data at
once, which it can then rip through all at once without having to switch
state.  Sound familiar :-)?  When I first saw the code to the UNIX tty
handler, my first question was "why are they doing all of this at
interrupt time?"  So far, I've never seen an answer to that besides
"historical reasons."

Now, if you do have the luxury of a dedicated I/O processor, polling and
hardware FIFOs are the way to go.  For an Ethernet terminal server, for
example, a LANCE (or other chip that buffers bunches of back-to-back
packets all by its little lonesome self) plus FIFOs between the serial
chips and the processor will give you some impressive throughput even
with a relatively wimpy 6MHz 80186...

>Many, occasional sources of activity warrant interrupts.
>A few, active sources warrant polling.
>
>Dave

How 'bout we add "A non-I/O-dedicated processor warrants interrupts?"

I agree with you in general, I just don't see how it's too relevant to
the discussion, which (to me anyway) seemed to be "how can we keep
workstations from croaking under heavy input traffic?"

Of course, this approach is useful in upper-level processing as well.
For example, it doesn't matter how good your network I/O code is if your
RPC input queue is only 8 items long, and 50 NFS clients try to mount one
of your filesystems...

Amanda Walker
InterCon Systems Corporation
amanda@lts.UUCP

Makey@LOGICON.ARPA (Jeff Makey) (04/15/89)

In article <14-Apr-89.114221@192.41.214.2> amanda@lts.UUCP writes:
>When I first saw the code to the UNIX tty
>handler, my first question was "why are they doing all of this at
>interrupt time?"  So far, I've never seen an answer to that besides
>"historical reasons."

The reason "all of this" gets done at interrupt time is performance,
although not the kind that has been discussed so far.  In this case,
UNIX software is responsible for echoing characters, and to keep echo
delay to a minimum the character(s) actually echoed are put on the
output queue at interrupt time.  Anyone who has used TELNET (hah!
there's the relevance to TCP/IP) over a slow connection knows how
annoying delayed echo is.
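
In sketch form (the names are invented; this is not the actual UNIX tty
code), the difference is simply that the receive interrupt handler itself
appends the echo to the output queue and kicks the transmitter, so the
echo is on its way before the handler even returns:

    /*
     * Sketch only, not real driver source.  inq_put(), outq_put(), and
     * start_transmitter() are hypothetical helpers.
     */
    extern unsigned char read_rx_data_reg(void);
    extern void inq_put(unsigned char c);       /* queue for the reading process */
    extern void outq_put(unsigned char c);      /* queue for the transmitter     */
    extern void start_transmitter(void);

    void tty_rx_intr(void)
    {
        unsigned char c = read_rx_data_reg();

        inq_put(c);              /* hand it to the input side                 */
        outq_put(c);             /* echo goes on the output queue immediately */
        start_transmitter();     /* so it shows up on the terminal right away */
    }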

                           :: Jeff Makey

Department of Tautological Pleonasms and Superfluous Redundancies Department
    Disclaimer: Logicon doesn't even know we're running news.
    Internet: Makey@LOGICON.ARPA    UUCP: {nosc,ucsd}!logicon.arpa!Makey

rpw3@amdcad.AMD.COM (Rob Warnock) (04/15/89)

In article <416@logicon.arpa> Makey@LOGICON.ARPA (Jeff Makey) writes:
+---------------
| In article <14-Apr-89.114221@192.41.214.2> amanda@lts.UUCP writes:
| >When I first saw the code to the UNIX tty handler, my first question was
| >"why are they doing all of this at interrupt time?"  So far, I've never
| >seen an answer to that besides "historical reasons."
| The reason "all of this" gets done at interrupt time is performance,
| although not the kind that has been discussed so far.  In this case,
| UNIX software is responsible for echoing characters, and to keep echo
| delay to a minimum the character(s) actually echoed are put on the
| output queue at interrupt time.  Anyone who has used TELNET (hah!
| there's the relevance to TCP/IP) over a slow connection knows how
| annoying delayed echo is.  :: Jeff Makey
+---------------

But the echo delay doesn't need to be kept to a *minimum*, but only
well below the objectionable level to the human user. And the difference
between those two scenarios can be a *large* fraction of your machine,
especially when you have a TTY driver that is used by both humans
and (say) UUCP at 19200 baud.

A *very small* delay in input processing -- the amount you get when
you use the two-level interrupt scheme Amanda referred to -- can save
enormously in overhead on your high-speed input lines, without ever
being noticed by your users. Likewise, a *small* delay in beginning
output (to give the upper levels time to copy more data into the output
queues) can result in large improvements in efficiency, again without
being noticed by users.

What numbers are we talking about here? Well, at Fortune Systems we
delayed input by 1/60 second (the queue built by the 1st-level handler
wasn't passed to "tty.c" until the next clock tick), and TTY input
capacity went up by a factor of 12.  In a certain release of TOPS-10
[6.04? I forget], DEC delayed beginning output until the next tick
[4 ticks?], and the output rate went up by a factor of 4 to 6.
Yet both of these systems had "crisp" echo.
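
Schematically, that kind of handoff is nothing more than a few lines in
the clock interrupt.  The following is only a sketch with invented names
(it is not the Fortune or TOPS-10 source); serial_rx_drain() stands for
the sort of ring-buffer-draining routine Amanda described:

    /*
     * Two-level scheme, sketched: the 1st-level ISR only fills a ring
     * buffer; once per clock tick everything that has accumulated is
     * handed to the tty layer as a block.  All names are invented.
     */
    #define CHUNK 256

    extern int  serial_rx_drain(unsigned char *buf, int max);  /* drain the ISR's ring  */
    extern void tty_input(unsigned char *buf, int n);          /* line discipline, echo */

    void clock_tick(void)                   /* e.g. 60 times a second */
    {
        unsigned char chunk[CHUNK];
        int n;

        /*
         * Everything that arrived during the last tick goes up in one
         * or two calls, so the expensive per-character work (and any
         * echo) is batched instead of done one interrupt at a time.
         */
        while ((n = serial_rx_drain(chunk, CHUNK)) > 0)
            tty_input(chunk, n);

        /* ... the rest of the usual clock-tick work ... */
    }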

I claim that 100ms for echo is not noticeable to a user of a video
display (although it is mildly annoying if you are on an ASR-33 (!)
or any other terminal which causes an audible signal on output),
yet allowing the operating system that 100ms or so can result
in such performance gains that not to do so is gross negligence!

Interrupts are ugly, brutish things, to be sure. ("Nyekulturny!" as
the Russians might say.) But they have their uses, especially when
you'd like certain things (like echo) to happen reasonably on time.
But the key is "reasonably"...


Rob Warnock
Systems Architecture Consultant

UUCP:	  {amdcad,fortune,sun}!redwood!rpw3
DDD:	  (415)572-2607
USPS:	  627 26th Ave, San Mateo, CA  94403

Makey@LOGICON.ARPA (Jeff Makey) (04/25/89)

In article <25231@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>But the echo delay doesn't need to be kept to a *minimum*, but only
>well below the objectionable level to the human user.
 [...]
>I claim that 100ms for echo is not noticeable to a user of a video
>display

Your point about only needing to keep echo delay below the level of
perception is well taken.  However, 100ms is 1/10 second, which I am
sure I would find quite noticeable (my terminal runs at 19,200 baud).
Taking a cue from motion pictures, I would assume that 1/30 second is
the order of magnitude of the level of perception, but I would want to
perform some experiments with different delays and lots of people to
make sure.

                           :: Jeff Makey

Department of Tautological Pleonasms and Superfluous Redundancies Department
    Disclaimer: Logicon doesn't even know we're running news.
    Internet: Makey@LOGICON.ARPA    UUCP: {nosc,ucsd}!logicon.arpa!Makey

rpw3@amdcad.AMD.COM (Rob Warnock) (04/25/89)

In article <421@logicon.arpa> Makey@LOGICON.ARPA (Jeff Makey) writes:
+---------------
| In article <25231@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
| >But the echo delay doesn't need to be kept to a *minimum*, but only
| >well below the objectionable level to the human user.
| >I claim that 100ms for echo is not noticeable to a user of a video display
| Your point about only needing to keep echo delay below the level of
| perception is well taken.  However, 100ms is 1/10 second, which I am
| sure I would find quite noticeable (my terminal runs at 19,200 baud).
+---------------

But you're only typing about 5 chars/sec. What I'm saying is that the
hand-eye echo delay can be a large fraction (even >1) of the inter-keystroke
time before it becomes noticeable. The fact that you are using 9600, 19200,
or 56k baud is irrelevant.  [O.k. you're a touch-typist, and are doing
10 c/s... but touch-typists depend even *less* on visual feedback.]

+---------------
| Taking a cue from motion pictures, I would assume that 1/30 second is
| the order of magnitude of the level of perception, but I would want to
| perform some experiments with different delays and lots of people to
| make sure.  :: Jeff Makey
+---------------

Well, the analogy doesn't really apply.

1. We're not talking about merging a succession of frames into a seamless
   whole here ["eye-eye coordination", if you will], but [in the worst case]
   about maintaining hand-eye coordination between cursor keys and cursor
   location.

2. There is actually very little feedback from visual input back to the
   keyboarding process; the major feedbacks are (a) auditory -- you *hear*
   the keys being struck, and (b) tactile -- you *feel* the keys hitting
   bottom (or clicking or whatever). Compared to these two, the visual
   feedback (for most people) is quite minor, and is almost ignored.

3. Most error detection while typing is unrelated to visual perception;
   it has instead to do with typing the wrong thing, and then *realizing*
   it without having to see it. [You may not believe this at first, but
   watch someone else type for a while...]

Of course, it was quite true that when the echo from the computer *did*
have a large auditory component, as with the Teletype [*klunk* *klunk*],
then the 1/10 second delay [110 baud] between keyboard and typehead was
quite noticeable [*ker*-*chunk* *ker*-*chunk*]. And when you used a Teletype in
full-duplex with host echo, that gave you 2/10 sec delay [*ker*-gap-*chunk*
*ker*-gap-*chunk*] that was initially annoying [the "rubber band" touch],
but users got used to it. [They had to.]

With a video display, the irritation begins when the delay is such that
there is a noticeable mismatch between action and result. In my experience,
this starts about 0.2-0.3 sec, but gets *really* bad above about 1 second,
when it becomes difficult to match cursor key strikes to cursor location.
[Even worse is the sort of variation you get with program-"echo" on a loaded
system or with Telnet on a slow connection, where you think you're typing
ahead the right number, only to stop and see the cursor "coast" far beyond
where you wanted it.]

But you can try whatever experiments you want fairly easily by modifying
a copy of Telnet [doing it in the client is easiest] to delay sending
keystrokes to the host by some amount of time.
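
If you don't feel like wading through the Telnet sources, a toy relay
gets you most of the way for a first experiment: hold each keystroke for
a fixed time on its way to the host, pass the host's output straight
through, and the perceived echo delay grows by exactly that amount.  The
following is just such a quick hack, not Telnet; for a real test you'd
also want to put your terminal into character-at-a-time mode (stty
cbreak, or the termios equivalent), which is left out here:

    /*
     * delayrelay host port delay_ms -- toy keystroke-delaying relay.
     * Not the Telnet client; just enough to fake the experiment.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    int main(int argc, char **argv)
    {
        if (argc != 4) {
            fprintf(stderr, "usage: %s host port delay_ms\n", argv[0]);
            exit(1);
        }
        long delay_us = atol(argv[3]) * 1000L;

        struct hostent *hp = gethostbyname(argv[1]);
        if (hp == NULL) {
            fprintf(stderr, "%s: unknown host\n", argv[1]);
            exit(1);
        }

        struct sockaddr_in sin;
        memset(&sin, 0, sizeof sin);
        sin.sin_family = AF_INET;
        sin.sin_port = htons((unsigned short)atoi(argv[2]));
        memcpy(&sin.sin_addr, hp->h_addr_list[0], hp->h_length);

        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0 || connect(s, (struct sockaddr *)&sin, sizeof sin) < 0) {
            perror("connect");
            exit(1);
        }

        if (fork() == 0) {              /* child: host -> screen, undelayed */
            char buf[512];
            int n;
            while ((n = read(s, buf, sizeof buf)) > 0)
                write(1, buf, n);
            _exit(0);
        }

        /* parent: keyboard -> host, each character held for delay_us first */
        char c;
        while (read(0, &c, 1) == 1) {
            usleep(delay_us);           /* the artificial echo-path padding */
            write(s, &c, 1);
        }
        close(s);
        return 0;
    }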

I have no really hard data to hand, but based on comments from users when I
was building async mux networks back at Dig. Comm. Assoc., I'd still say that
a hand-eye echo delay [with *no* auditory component of echo from the "display"!]
of 100ms is not noticeable, and up to 200-300ms is quite acceptable.

[But note that that has to encompass *all* components of the echo path,
including possibly several layers of terminal servers and intermediate
Telnet hosts, if that's what a "typical" user does. Design goals of 20-50ms
each for terminal input and output dally/startup timers are probably o.k.]

This should probably go in comp.ergonomics, except that it *is* germane
to designing networking systems that balance efficiency and latency
considerations...


Rob Warnock
Systems Architecture Consultant

UUCP:	  {amdcad,fortune,sun}!redwood!rpw3
DDD:	  (415)572-2607
USPS:	  627 26th Ave, San Mateo, CA  94403

huitema@mirsa.inria.fr (Christian Huitema) (04/25/89)

From article <421@logicon.arpa>, by Makey@LOGICON.ARPA (Jeff Makey):
> In article <25231@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>  [...]
>>I claim that 100ms for echo is not noticeable to a user of a video
>>display
> 
> Your point about only needing to keep echo delay below the level of
> perception is well taken.  However, 100ms is 1/10 second, which I am
> sure I would find quite noticeable (my terminal runs at 19,200 baud).

Just analyse a few very popular pieces of software:

	ksh,
	emacs,
	X-Windows,
	...

An interesting point is that all of them do their echoing from the
user-level program! Moreover, X-Windows even sends the data over a secondary
TCP connection and back before processing it. And then look at all the
``virtual terminals'' based on ``pseudo-ttys'', which add an extra hop
through user level before the data even reaches the network... Thus, I have
some doubts that one should aim at interrupt-time echoing just for the sake
of optimizing ``ed'' and ``/bin/sh'' if user-level echoing is felt to be
acceptable otherwise...

Moreover, by combining minimal real-time interrupt handlers with properly
scheduled software interrupts, one maximises system throughput and thus
reduces response times and echo delays for all users, ``ed'' as well as
``emacs''.

Christian Huitema

gws@Xylogics.COM (Geoff Steckel) (04/26/89)

There is some knowledge about human sensitivity to delays in
tactile/aural/visual feedback.  Some of the old experiments did things
like have people try to write while watching their hand in a delayed
video monitor.  Some of this was published in general science magazines
like Scientific American as much as 30 years ago.
There are three regimes of interest:

0 to about 50 milliseconds:
   Subject doesn't notice much.  At the upper end, performance slows slightly,
but error rate doesn't increase much.

50 to about 500 milliseconds:
   The **REALLY** annoying range.  Performance slows by x10 or more, even
to a complete stop.  Error rate climbs to virtually 100% at the high end.
Subjects become fatigued and irritable.

500 milliseconds and up:
   If people are forced to use feedback with this much delay, oscillations,
gross errors (like hitting yourself with a pen, etc.), and great annoyance
are the result.  Performing with any accuracy is very slow and very fatiguing.
Most subjects ignore everything except tactile and internal muscle/joint
feedback and `fly blind' -- and feel much better.  They then stop occasionally
and wait for the returned information to catch up with reality.

I worked on a cross-country proprietary network about 15 years ago.
We worked VERY hard to get user feedback time < 100 ms; we wanted 50 or less.
Users started complaining at about 80 ms or so.

Moral: ergonomics are important even (or especially) when designing comm
systems, if humans are in the loop.

	geoff steckel (steckel@alliant.COM, steckel@xenna.COM)

rpw3@amdcad.AMD.COM (Rob Warnock) (04/26/89)

In article <1410@xenna.Xylogics.COM> gws@Xylogics.COM (Geoff Steckel) writes:
+---------------
| There is some knowledge about human sensitivity to delays in
| tactile/aural/visual feedback...
| There are three regimes of interest:
| 0 to about 50 milliseconds: Subject doesn't notice much...
| 50 to about 500 milliseconds: The **REALLY** annoying range...
| 500 milliseconds and up:
+---------------

I completely agree with your numbers for tactile/aural [one of the
"torture tests" we applied before we'd let new announcers work at a
certain campus radio station (c.1965) was to delay the sound in their
monitor earphone from their voice by about 1/2 second -- *killer!*],
but I just can't agree in the case of visual echo to keyboarded data.
It's my experience that the second regime ("really annoying") begins
well above the 200ms range for the keyboard/display situation.

For example, I am typing this message into "vi" via a Telebit modem.
The Telebit, in "micropacket" mode, has a round-trip time of about
240ms for single-character echo (including response to cursor-motion
commands). [I just measured it.] Yet I experience no discomfort or
irritation when doing low-level editing over this modem. [Reading
news, yes, the ~1sec pause while it shifts into "long packet" mode
when displaying a new page is quite annoying. I often wish the Telebit
had something intermediate between its two modes!]

I just don't believe the thresholds are the same for all pairs of
action/perception modes, especially keyboard/display.


Rob Warnock
Systems Architecture Consultant

UUCP:	  {amdcad,fortune,sun}!redwood!rpw3
DDD:	  (415)572-2607
USPS:	  627 26th Ave, San Mateo, CA  94403

peter@ficc.uu.net (Peter da Silva) (04/26/89)

In article <164@mirsa.inria.fr>, huitema@mirsa.inria.fr (Christian Huitema) writes:
[ vi, emacs, etc... do echoing at the user level ]

> Thus, I have some doubts that
> one should aim at interrupt time echoing just for the sake of optimizing
> ``ed'' and ``bin/sh'' if user level echoing is felt acceptable otherwise...

You have obviously been blessed with an abundance of CPU time. There have been
times I've dropped out of 'vi' to 'ex' because the echo delay for 'vi' was so
long I couldn't stand it.

Yes, a real-time kernel would be an improvement... but failing that, interrupt-
time echoing has its place.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.

Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.

rhorn@infinet.UUCP (Rob Horn) (05/03/89)

In article <25231@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>But the echo delay doesn't need to be kept to a *minimum*, but only
>well below the objectionable level to the human user.
 [...]
>I claim that 100ms for echo is not noticeable to a user of a video
>display

I went through this extensively for some CAD/CAM and Command & Control
applications.  The dominant factor was the refresh rate of the CRT and
phosphor decay rate (or their equivalents for non-CRT technologies).
Flicker perception, rate timing, and absolute timing perceptions are
each different and impose different update rate issues.  There is a further
problem with eye-hand coordination if the update lags by more than a few
hundred milliseconds.

Most CRTs refresh about every 15 or 30 msec.  Most phosphors are picked
to decay in one to two refresh cycles.  Any delay less than the
refresh rate is impossible to perceive because the refresh rate delay
dominates.  Most people cannot perceive delays of up to 50 msec.
(Fast refresh rates *do* affect the perceived image quality, so the
difference between 15 and 30 msec is still significant.)

Flicker perception is triggered by phosphors that decay too quickly
when compared to refresh rates.  Here, some people will perceive
flicker down even to 15 msec, although after-image effects on the
retina must be considered in the experimental design for measuring
these effects.  

For those who do not know what I mean by rate timing, this is the
timing used in race events where human eye plus stopwatch match the
electronic race timers to within 10-20 msec reliably.  A part of this
is matching the motion of the thing being timed with the finger
pushing the stopwatch.  Unpredictable event timing is much less sensitive.
Rate timing effects are significant when performing full screen
updates.  People will perceive pauses that would go unnoticed during
single character updates.
-- 
				Rob  Horn
	UUCP:	...harvard!adelie!infinet!rhorn
		...ulowell!infinet!rhorn, ..decvax!infinet!rhorn
	Snail:	Infinet,  40 High St., North Andover, MA