jbvb@vax.ftp.COM (James Van Bokkelen) (05/15/88)
The scenario works like this: the network interface interrupts, the interrupt service routine gets invoked, figures out where to put the data, and starts DMA.  At this point, the ISR could return and let packet processing continue when the DMA-completion interrupt happens (the packet needs to be queued for some sort of demultiplexor to ponder it).  However, the packet is probably only 60 or 70 bytes long (normal Ethernet traffic), and the DMA will be done almost before the ISR can return.  So, in the standard PC/IP device driver architecture, the ISR spins, waiting for DMA complete, and then continues processing the packet.

Given this architecture, DMA only brings a benefit when it is faster than string I/O (which is true in 8088-based machines).  In the classic situation of a minicomputer disk interface, DMA got you a lot: once a seek completed, you could just tell the drive to get sector X to address Y and interrupt when done, and the O/S had plenty of time to do something else while rotational latency took its toll.  With incoming packets, you don't know how long they are, etc., and the PC's lusing DMA architecture doesn't allow you to do the set-up prior to actual need.

On transmitted packets, DMA would be useful if you could tell the interface to transmit on DMA complete, but you can't with any I've seen so far...

James VanBokkelen
FTP Software Inc.
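[Editor's note: a minimal C sketch of the spin-wait receive path described above may make the structure clearer.  The port addresses, status bit, and helper routines (inb, start_dma, alloc_packet_buffer, enqueue_for_demux) are invented for illustration; this is not code from an actual PC/IP driver.]

    /* Hypothetical "spin on DMA complete" receive ISR, per the
     * description above.  All names and addresses are assumptions.
     */
    #define CARD_STATUS_PORT   0x300   /* assumed card status register   */
    #define CARD_LENGTH_PORT   0x302   /* assumed received-length register */
    #define DMA_DONE_BIT       0x01    /* assumed "transfer done" bit    */

    extern unsigned char inb(unsigned short port);   /* port input helper */
    extern void  start_dma(void *dst, unsigned short len);
    extern void *alloc_packet_buffer(unsigned short len);
    extern void  enqueue_for_demux(void *pkt, unsigned short len);

    void rx_isr(void)
    {
        /* Read the packet length from the card (two byte-wide reads). */
        unsigned short len = inb(CARD_LENGTH_PORT) |
                             ((unsigned short)inb(CARD_LENGTH_PORT + 1) << 8);
        void *buf = alloc_packet_buffer(len);

        start_dma(buf, len);              /* kick off the copy into memory */

        /* Most packets are only 60-70 bytes, so the transfer finishes
         * almost before the ISR could return: just spin on the done bit.
         */
        while (!(inb(CARD_STATUS_PORT) & DMA_DONE_BIT))
            ;                             /* busy-wait for DMA complete   */

        enqueue_for_demux(buf, len);      /* hand off to the demultiplexor */
    }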
RAF@NIHCU.BITNET (Roger Fajman) (05/16/88)
> Given this architecture, DMA only brings a benefit when it is faster
> than string I/O (which is true in 8088-based machines).

In a machine like a Compaq 386/20 the CPU runs at 20 MHz while the bus runs at 8 MHz, as in an AT.  Thus it seems to me that the CPU might be able to find something else to do while the DMA transfer is going on.  Can anyone tell me if that is true or not?
jbvb@vax.ftp.COM (James Van Bokkelen) (05/16/88)
Depends on how long the transfer takes.  Regardless of CPU speed, the DMA controller has to access the same memory, and it can only do it as fast as the bus allows, implying that the 80386 executes 3 or more wait states for each DMA cycle.  As I said earlier, most packets are only 60 or 70 bytes long (although when doing bulk data transfer, you will see two peaks, and there ought to be more 1078-byte or larger data packets than acks, at least for portions of the transfer).

Given that it might take 100 or more instructions to save and queue the state, get out of this ISR, get into the DMA ISR, and pick up the state again, it is pretty clear that there is no point in the extra complexity of doing two context switches while the DMA controller moves 60 or 70 bytes.  If the driver must support both DMA and programmed I/O, only a programmer who is neither hurried nor worried about code size will bother to implement the context switch for larger packets.

On fast processors, the speed increase from using programmed I/O will probably dominate.  Even when the fast processor has something useful to do in the interrupted context, the increase in complexity (a second ISR; keeping the DMA controller modes straight with the BIOS, if it is in use; the fact that the second interrupt entry actually slows down packet processing a little) makes it an unlikely implementation decision.

In summary, yes, there might be a small benefit.  But the PC was designed cheap, not fast, and the combination of its architecture and the nature of network I/O makes it an unrewarding tradeoff.

jbvb
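[Editor's note: for contrast, here is a hedged sketch of the deferred-completion design discussed in this exchange: one ISR starts the DMA and returns, and a second ISR runs when the completion interrupt arrives.  Every name and helper is invented for illustration, and the sketch assumes the adapter actually raises a completion interrupt.]

    /* Hypothetical two-interrupt variant of the same receive path.
     * The shared-state layout and all helpers are assumptions, shown
     * only to make the extra bookkeeping visible.
     */
    extern unsigned short read_packet_length(void);   /* assumed card helper */
    extern void  start_dma(void *dst, unsigned short len);
    extern void *alloc_packet_buffer(unsigned short len);
    extern void  enqueue_for_demux(void *pkt, unsigned short len);

    /* State handed from the first ISR to the second. */
    static struct {
        void          *buf;
        unsigned short len;
        int            busy;
    } pending;

    void rx_isr(void)                  /* card interrupt: start the transfer */
    {
        pending.len  = read_packet_length();
        pending.buf  = alloc_packet_buffer(pending.len);
        pending.busy = 1;

        start_dma(pending.buf, pending.len);
        /* Return and let the interrupted code run while the DMA
         * proceeds (stealing bus cycles as it goes).
         */
    }

    void dma_complete_isr(void)        /* completion interrupt: finish up */
    {
        if (!pending.busy)
            return;                    /* stale or spurious completion */

        pending.busy = 0;
        enqueue_for_demux(pending.buf, pending.len);

        /* Not shown: acknowledging the interrupt controller, and keeping
         * the DMA controller's modes straight with the BIOS if it also
         * uses the controller.
         */
    }

[The point of the comparison is where the 100-odd extra instructions go: saving and queueing the state, leaving one ISR, entering another, and picking the state back up, all to cover the short time the controller needs to move a 60-70 byte packet.]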