[comp.protocols.tcp-ip.ibmpc] DMA usage with network I/O

jbvb@vax.ftp.COM (James Van Bokkelen) (05/15/88)

The scenario works like this: the network interface interrupts, the interrupt
service routine (ISR) gets invoked, figures out where to put the data, and starts
the DMA.
At this point, the ISR could return, and let packet processing continue when
the DMA completion interrupt happens (the packet needs to be queued for some
sort of demultiplexor to ponder it).  However, the packet is probably only
60 or 70 bytes long (normal Ethernet traffic), and the DMA will be done
almost before the ISR can return.  So, in the standard PC/IP device driver
architecture, the ISR spins, waiting for DMA complete.  Then it continues
processing the packet.
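
To make that structure concrete, here is a rough sketch in C.  All of the
port addresses, constants and helper routines below are invented for
illustration; they aren't taken from any particular PC/IP driver.

    #define NIC_RX_LENGTH  0x300            /* imaginary "received length" port */
    #define NIC_DATA_PORT  0x302            /* imaginary data-window port       */
    #define NIC_INT_ACK    0x304            /* imaginary interrupt-acknowledge  */
    #define DMA_CHANNEL    3
    #define DMA_STATUS     0x08             /* 8237 status port on the PC       */
    #define DMA_DONE       (1 << DMA_CHANNEL)  /* terminal-count bit, channel 3 */

    extern unsigned inb(unsigned port);             /* assumed port-I/O          */
    extern unsigned inw(unsigned port);             /* primitives, whatever your */
    extern void outb(unsigned port, unsigned val);  /* compiler calls them       */
    extern char *alloc_packet_buffer(unsigned len);
    extern void start_dma(int channel, char *buf, unsigned len);
    extern void enqueue_for_demux(char *buf, unsigned len);

    void net_rx_isr(void)
    {
        unsigned len = inw(NIC_RX_LENGTH);      /* how long is this packet?   */
        char *buf = alloc_packet_buffer(len);   /* decide where it goes       */

        start_dma(DMA_CHANNEL, buf, len);       /* program the DMA controller */

        /* The packet is probably only 60-70 bytes, so rather than return
         * and field a second (DMA-complete) interrupt, just spin here.
         */
        while (!(inb(DMA_STATUS) & DMA_DONE))
            ;

        enqueue_for_demux(buf, len);            /* queue it for the demultiplexor */
        outb(NIC_INT_ACK, 1);                   /* re-arm the interface       */
    }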

Given this architecture, DMA only brings a benefit when it is faster than
string I/O (which is true in 8088-based machines).  In the classic situation
of a minicomputer disk interface, DMA got you a lot, because once a seek
completed, you could just tell the drive to get sector X to address Y, and
interrupt when done.  The O/S had plenty of time to do something else while
rotational latency took its toll.  With incoming packets, you don't know
how long they are, etc., and the PC's lusing DMA architecture doesn't allow
you to do set-up prior to actual need.  On transmitted packets, DMA would
be useful if you could tell the interface to transmit on DMA complete, but
you can't with any I've seen so far...
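
For comparison, the string (programmed) I/O path mentioned above is just a
copy loop, roughly what "rep insw" buys you on a 286 or 386.  Same caveat:
the names are the invented ones from the sketch above, not a real interface.

    void net_rx_pio(void)
    {
        unsigned len = inw(NIC_RX_LENGTH);
        unsigned short *buf = (unsigned short *) alloc_packet_buffer(len);
        unsigned i;

        for (i = 0; i < (len + 1) / 2; i++)     /* copy whole 16-bit words */
            buf[i] = (unsigned short) inw(NIC_DATA_PORT);

        enqueue_for_demux((char *) buf, len);
        outb(NIC_INT_ACK, 1);
    }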

James VanBokkelen
FTP Software Inc.

RAF@NIHCU.BITNET (Roger Fajman) (05/16/88)

> Given this architecture, DMA only brings a benefit when it is faster than
> string I/O (which is true in 8088-based machines).

In a machine like a Compaq 386/20, the CPU runs at 20 MHz while the
bus runs at 8 MHz, as in an AT.  Thus it seems to me that the CPU
might be able to find something else to do while the DMA transfer is
going on.  Can anyone tell me if that is true or not?

jbvb@vax.ftp.COM (James Van Bokkelen) (05/16/88)

Depends on how long the transfer takes.  Regardless of CPU speed, the
DMA controller has to access the same memory, and it can only do it
as fast as the bus allows, implying that the 80386 executes 3 or more
wait states for each DMA cycle.  As I said earlier, most packets
are only 60 or 70 bytes long (although when doing bulk data transfer,
you will see two peaks, and there ought to be more 1078-byte or larger
data packets than acks, at least for portions of the transfer).

Given that it might take 100 or more instructions to save and queue the state,
get out of this ISR, get into the DMA ISR and pick up the state again, it
is pretty clear that there is no point in the extra complexity of doing two
context switches while the DMA controller moves 60 or 70 bytes.  If the driver
must support both DMA and programmed I/O, only a programmer who is neither
hurried nor worried about code size will bother to implement the context
switch for larger packets.
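
To put rough numbers on that (every constant here is an assumption picked
for illustration, not a measurement):

    #include <stdio.h>

    int main(void)
    {
        double bus_mhz        = 8.0;    /* AT-style I/O bus                 */
        double cpu_mhz        = 20.0;   /* e.g. a Compaq 386/20             */
        double clocks_per_dma = 5.0;    /* guess: bus clocks per DMA byte   */
        double packet_bytes   = 64.0;   /* typical small packet             */
        double extra_instr    = 100.0;  /* save state, exit, re-enter ISR   */
        double clocks_per_ins = 4.0;    /* guess: average 386 instruction   */

        double dma_us = packet_bytes * clocks_per_dma / bus_mhz;  /* ~40 us */
        double isr_us = extra_instr * clocks_per_ins / cpu_mhz;   /* ~20 us */

        printf("DMA transfer of the packet: ~%.0f us\n", dma_us);
        printf("extra ISR entry/exit cost:  ~%.0f us\n", isr_us);
        return 0;
    }

Under those guesses the interrupted context gets back maybe 20 microseconds
of useful time per small packet, which is the sort of marginal win the extra
complexity has to pay for.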

On fast processors, the speed increase from using programmed I/O will probably
dominate.  Even when the fast processor has something useful to do in the
interrupted context, the increase in complexity (a second ISR; keeping the DMA
controller modes straight with the BIOS, if it is in use; the fact that the
2nd interrupt entry actually slows down packet processing a little) makes it
an unlikely implementation decision.

In summary, yes, there might be a small benefit.  But the PC was designed to
be cheap, not fast, and the combination of its architecture and the nature
of network I/O makes it an unrewarding tradeoff.

jbvb