romkey@kaos.UUCP (John Romkey) (11/21/87)
In article <155@tic.UUCP> ruiu@tic.UUCP (Dragos Ruiu) writes:
> However, I had great difficulty believing their hardware section, which
>repeatedly says that DMA interface boards are slower than others. Unless there
>is something I don't know, DMA is much faster than interrupt driven hardware.
>I have not had the time or the resources to research their claims, but
>common sense would indicate that this claim of theirs is way off base.

First off, interrupts and DMA are orthogonal. It's really a choice of DMA
versus programmed I/O. On the PC/AT, the DMA controller runs about 10%
slower than the standard PC DMA controller. On top of this, the 80286 has
string IN and OUT instructions (INS/OUTS) in addition to the 8086 string
move instruction (MOVS). Both the string IN/OUT and the string move
instructions push data around a lot faster than the DMA controller on that
system can. You basically end up with the processor sitting in a tight loop
shoveling bytes across the bus.

Under MS-DOS this is fine, since MS-DOS is single tasking. You'll never have
another process waiting for CPU cycles that are being used up by having the
processor move data that other hardware could move (more slowly). Under a
multitasking system you'd want to examine the tradeoff; in some situations
you'd get more aggregate work out of the system if the I/O was performed by
the (slow) DMA controller rather than the processor, because other processes
could run while the system waited for the DMA to complete.
-- 
- john romkey
...mit-eddie!blblbl!kaos!romkey
romkey@xx.lcs.mit.edu
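As a rough illustration of the tradeoff described above, here is a toy Python model. Everything in it is an assumption made for the sketch: the function name `transfer_costs`, the transfer rates, and the simplification that a DMA transfer costs the processor nothing (it ignores DMA setup time and stolen bus cycles). It is not a measurement of any PC/AT hardware.

```python
def transfer_costs(nbytes, pio_rate, dma_rate):
    """Compare programmed I/O and DMA for one transfer.

    Rates are in bytes per second. "elapsed" is wall-clock time for the
    transfer; "cpu_busy" is processor time that no other process can use.
    Assumption: DMA setup time and bus contention are ignored.
    """
    pio_elapsed = nbytes / pio_rate
    dma_elapsed = nbytes / dma_rate
    return {
        # With PIO the processor itself moves every byte, so it is busy
        # for the whole transfer.
        "pio": {"elapsed": pio_elapsed, "cpu_busy": pio_elapsed},
        # With DMA the controller moves the data; in this model the
        # processor is free to run something else meanwhile.
        "dma": {"elapsed": dma_elapsed, "cpu_busy": 0.0},
    }


# Hypothetical rates, chosen only so that PIO is the faster mover.
costs = transfer_costs(nbytes=1500, pio_rate=1_000_000, dma_rate=500_000)
print(costs)
```

On a single-tasking system only "elapsed" matters, so the faster PIO loop wins. On a multitasking system "cpu_busy" is time taken away from other processes, which is where the slower DMA controller can still come out ahead.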
farren@gethen.UUCP (Michael J. Farren) (11/22/87)
In article <261@kaos.UUCP> romkey@kaos.UUCP (John Romkey) writes:
>
>First off, interrupts and DMA are orthogonal. It's really a choice of DMA
>versus programmed I/O. On the PC/AT, the DMA controller runs about 10%
>slower than the standard PC DMA controller. On top of this, the 80286 has
>string IN and OUT instructions in addition to the 8086 string MOV instruction.
>Both string IN/OUT and string MOV instructions push data around a lot faster
>than the DMA controller on that system can. You basically end up with the
>processor sitting in a tight loop shoveling bytes across the bus.

DMA and interrupts are NOT orthogonal. In any interrupt-driven scheme,
there will be system overhead required for each and every byte of data
transferred. In most systems I am aware of, this per-byte interrupt overhead
overwhelms both the one-time overhead of setting up DMA and any processor
slowdown associated with DMA, for any transfer over a very few bytes, so
choosing interrupts instead of DMA ensures great inefficiency. (The slowdown,
if any, is often negligible: many systems are designed such that DMA does not
affect the processor in any meaningful way. Most 68000 systems, for example,
take full advantage of the fact that the processor only requires the bus
every other cycle, more or less.)

On the choice between DMA and programmed I/O, much depends on the system
design. The IBM architecture may not allow a distinct advantage for DMA
vs. programmed I/O, but many other systems do. To make a general statement
that DMA is not as efficient as programmed I/O is wrong.
-- 
----------------
Michael J. Farren      "... if the church put in half the time on covetousness
unisoft!gethen!farren  that it does on lust, this would be a better world ..."
gethen!farren@lll-winken.arpa   Garrison Keillor, "Lake Wobegon Days"
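The argument above, a fixed DMA setup cost against a per-byte interrupt cost, amounts to a break-even calculation. The Python sketch below works it through. The function names and all cost figures are hypothetical, picked only to show the shape of the comparison. Real numbers depend entirely on the system.

```python
import math


def interrupt_cost(nbytes, irq_per_byte_us):
    """Total overhead when every byte transferred costs one interrupt."""
    return nbytes * irq_per_byte_us


def dma_cost(nbytes, setup_us, dma_per_byte_us=0):
    """One-time DMA setup plus any per-byte processor slowdown."""
    return setup_us + nbytes * dma_per_byte_us


def break_even_bytes(setup_us, irq_per_byte_us, dma_per_byte_us=0):
    """Smallest transfer size at which DMA is strictly cheaper.

    Returns None when the per-byte interrupt cost does not exceed the
    per-byte DMA cost, since DMA then never catches up.
    """
    saving_per_byte = irq_per_byte_us - dma_per_byte_us
    if saving_per_byte <= 0:
        return None
    return math.floor(setup_us / saving_per_byte) + 1


# Hypothetical costs in microseconds: 100 to set up DMA, 50 per interrupt,
# 1 per byte of processor slowdown while DMA runs.
n = break_even_bytes(setup_us=100, irq_per_byte_us=50, dma_per_byte_us=1)
print(n)
```

With these assumed figures the break-even point is only a few bytes, which matches the "over a very few bytes" claim. A cheaper interrupt path or a costlier DMA setup would move it.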
truett@cup.portal.com (11/24/87)
On the IBM bus, the DMA loses yet again! Remember that on that bus, the
CPU unconditionally grants the bus to any DMA request (it almost has to,
that's how the PC and XT do their dynamic memory refresh). Thus, several
DMA devices can capture the bus and lock the CPU out. If PIO is used,
though, the CPU has a choice. On a bus that allows a DMA request to be
blocked by a higher priority compute task this problem probably does not
occur.

Also, note that the assumption that a CPU only needs the bus every n-th
cycle is very dependent on the design of the particular system being
considered. There are, I believe, processors out there that can do a fetch
and an operation on every cycle. I know some DSPs do and some RISCs
probably do, not to mention highly pipelined microprocessors of a more
traditional type.

Truett Smith, Sunnyvale, CA
UUCP: truett@cup.portal.com
romkey@kaos.UUCP (John Romkey) (11/24/87)
In article <372@gethen.UUCP> farren@gethen.UUCP (Michael J. Farren) writes:
>In article <261@kaos.UUCP> romkey@kaos.UUCP (John Romkey) writes:
>>
>>First off, interrupts and DMA are orthogonal. It's really a choice of DMA
>>versus programmed I/O.
>
>DMA and interrupts are NOT orthogonal. In any interrupt-driven scheme,
>there will be system overhead required for each and every byte of data
>transferred.

It was my impression that the original article meant to say "programmed
I/O" instead of "interrupts", which is why I launched off into my
discussion of programmed I/O.

It isn't rational to discuss DMA vs. interrupts for any PC or PC/AT bus
network interface that I've encountered, and I've written or seen drivers
for most of them (the list would double the length of this message). None
of them give you an option of taking an interrupt per byte while
transferring data. You either do or do not take one interrupt on receive or
transmit completion (or DMA completion if you use DMA), and you either use
DMA to transfer data or you use programmed I/O. The two are independent and
therefore *orthogonal*.

If you want any responsiveness out of your network code (at least in a
TCP/IP implementation), you'll want to use interrupts regardless of whether
or not you use DMA. You'll decide whether to use DMA based on the network
interface's architecture (many memory-mapped interfaces don't support it)
and your bus.
-- 
- john romkey
...mit-eddie!blblbl!kaos!romkey
romkey@xx.lcs.mit.edu
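The independence claimed above can be pictured as two separate driver settings whose combinations are all valid. The Python sketch below enumerates them. The names `Completion`, `Mover` and `interrupts_per_packet` are invented for illustration and do not describe any particular driver.

```python
from enum import Enum
from itertools import product


class Completion(Enum):
    """How the driver learns that a receive or transmit has finished."""
    INTERRUPT = "interrupt on completion"
    POLLED = "poll for completion"


class Mover(Enum):
    """What actually copies the packet data across the bus."""
    DMA = "DMA controller"
    PIO = "programmed I/O"


def interrupts_per_packet(completion, mover):
    """One interrupt per packet at most, and none per byte.

    The mover argument is deliberately unused: in this model the choice
    of DMA or programmed I/O has no effect on the interrupt count.
    """
    return 1 if completion is Completion.INTERRUPT else 0


# All four combinations of the two independent choices.
configs = list(product(Completion, Mover))
for completion, mover in configs:
    print(completion.value, "+", mover.value, "->",
          interrupts_per_packet(completion, mover), "interrupt(s) per packet")
```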
ruiu@tic.UUCP (11/25/87)
[The discussion is about why DMA would be slower on an AT than polled I/O]

In light of the facts pointed out by everyone, a poor implementation of DMA
seems to be available on the AT.

A certain traditionalist streak in me refuses to accept a tight loop as the
highest-performance way to transfer data. So if DMA is no good, then what is
the high-performance approach needed to 'squeeze' every ounce of performance
out of an AT?

An acquaintance who is designing a major PC based hardware project has chosen
to use double-ported memory. Truett Smith has already suggested this as the
solution.

So, in light of the dropping cost of such devices, they are the preferred way
to go. Right?

Does anyone care to comment? Does anyone know of any products that use this
approach to data transfers?

(What did I start with that innocuous first posting ??!!? :-)
-- 
Dragos Ruiu   Disclaimer: My opinions are my employer's, I'm unemployed!
UUCP:{ubc-vision,mnetor,vax135,ihnp4}!alberta!edson!tic!dragos!work
(403) 432-0090  #1705, 8515 112th Street, Edmonton, Alta. Canada T6G 1K7
Never play leapfrog with Unicorns...
phil@amdcad.UUCP (11/26/87)
In article <162@tic.UUCP> ruiu@tic.UUCP (Dragos Ruiu) writes:
>An aquaintance who is designing a major PC based hardware project has chosen
>to use double-ported memory. Truett Smith has already suggested this as the
>solution.
>
>So, in light of the dropping cost of such devices, they are the preferred way
>to go. Right ?

Many dual ported systems are not made of dual ported memory devices.
Certainly none of mine are. It's hard to get 2 megabytes of dual ported
memory using dinky 2 kilobyte devices.
-- 
I speak for myself, not the company.
Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or amdcad!phil@decwrl.dec.com
romkey@kaos.UUCP (John Romkey) (11/27/87)
In article <162@tic.UUCP> ruiu@tic.UUCP (Dragos Ruiu) writes:
>An aquaintance who is designing a major PC based hardware project has chosen
>to use double-ported memory. Truett Smith has already suggested this as the
>solution.
>
>So, in light of the dropping cost of such devices, they are the preferred way
>to go. Right ?

Right. Many of the recent network interfaces for the PC and AT in fact use
dual-ported memory with the LAN controller hardware on one side and the PC
or AT bus on the other. In fact, the best network interfaces on the market
right now all take this approach.

But there's still a catch. Most of these network interfaces only provide 8K
or 16K bytes of RAM. To get really good performance out of them, you want
their memory available to receive data from the net as soon as possible. So
you end up copying the data into the PC's main memory. You can actually
program the DMA controller to do that, but who'd want to? Using an 8086 MOVS
instruction is so much faster...it should be faster even on the PC, but I
don't have the books here to check it out and make sure. So you still end
up copying, rather than using the data in place.

You could put lots of memory on the network interface, like 256K bytes, but
then you'd have two problems. The hardware would have a hard time mapping
all that memory into the PC address space, so it would probably have to be
bank-switched. The software would have problems managing it and figuring
out who had buffers where and then trying to reclaim them later on.

The boards which are memory mapped include the Micom-Interlan NI5210, the
Western Digital WD8003, the Univation NIC and the Excelan EXOS205, all of
which are Ethernet interfaces. Proteon also sells a memory-mapped IEEE 802.5
token ring card, which is either the P1340 or the P1344. I'm sure I've left
out a couple, but I just woke up...
-- 
- john romkey
...mit-eddie!blblbl!kaos!romkey
romkey@xx.lcs.mit.edu
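The "copy it out so the card's memory is free again" pattern described above can be sketched in Python. This is a simulation under stated assumptions, not a model of any of the boards named: the class `SharedWindow`, its fixed-size slots, and the sizes used are all hypothetical, and a `bytearray` stands in for the dual-ported RAM.

```python
from collections import deque


class SharedWindow:
    """Toy stand-in for a small dual-ported buffer on a network card."""

    def __init__(self, size=8192, slot=2048):
        self.mem = bytearray(size)            # the card's on-board RAM
        self.slot = slot                      # one frame per fixed-size slot
        self.free = deque(range(size // slot))
        self.pending = deque()                # (slot index, frame length)

    def receive(self, frame):
        """Card side: store an incoming frame, or drop it if RAM is full."""
        if len(frame) > self.slot:
            raise ValueError("frame larger than a slot")
        if not self.free:
            return False                      # no room: the frame is lost
        idx = self.free.popleft()
        start = idx * self.slot
        self.mem[start:start + len(frame)] = frame
        self.pending.append((idx, len(frame)))
        return True

    def drain(self):
        """Host side: copy every pending frame into main memory.

        Each slot is released as soon as its frame has been copied, so
        the card can reuse it for the next arrival.
        """
        frames = []
        while self.pending:
            idx, length = self.pending.popleft()
            start = idx * self.slot
            frames.append(bytes(self.mem[start:start + length]))
            self.free.append(idx)
        return frames


window = SharedWindow(size=4096, slot=2048)   # room for only two frames
accepted = [window.receive(b"x" * 100) for _ in range(3)]
copied = window.drain()
print(accepted, len(copied))
```

With so little on-board memory the third frame is dropped until the host drains the window, which is why the copy into main memory happens promptly instead of the data being used in place.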
truett@cup.portal.com (11/28/87)
ruiu@tic.UUCP (Dragos Ruiu) asks if there are any examples of commercial
hardware products that use the dual-ported memory approach to bulk data
movement into and out of an IBM PC/AT. There are many, but let me
illustrate the preponderance of this method by looking at one situation
where bulk data must be moved quite quickly -- laser printer interfaces.

I know of at least four such interfaces that use some form of memory
dual-porting to achieve high transfer rates: 1) the Tall Tree JLaser,
2) the Advanced Vision Research MegaBuffer, 3) the Cordata LBP interface,
and 4) the AST Turbolaser interface. I believe the Laser Master also does
this. Unfortunately, I am not very familiar with the Hewlett-Packard
LaserJet interfaces. This problem of moving data to a fast printing engine
quickly becomes most acute when bit-mapped graphics are involved.

Why all of the laser printer manufacturers haven't come out with a SCSI
interface is beyond me! It has the throughput and would allow the printer
interface to share a slot with other peripherals. Similar considerations
would apply to input from image scanners. This surfeit of proprietary
interfaces cannot be any good, in the long run, for the industry.

I would note that most SCSI interfaces for the PC standard bus give the
programmer the choice of DMA or PIO control of the transfer.

Truett Smith, Sunnyvale, CA
UUCP: truett@cup.portal.com