menges@unc.UUCP (John Menges) (05/06/86)
Here at the University of North Carolina's Department of Computer Science, we have approximately 40 SUN-2 workstations. We are in the process of installing a number of SUN-3 workstations as well. Some of our SUN-2s have 3COM ethernet controllers. SUN has informed us that there is a communications problem interfacing SUN-2s with 3COM boards and SUN-3s. The problem has something to do with the 3COM boards not having enough buffering to handle full-speed transmissions from the SUN-3.

The information we get from our local SUN sales representative, however, is confusing and incomplete. I am hoping that someone on the net will be able to clarify this issue.

According to our sales representative, file systems on a SUN-3 cannot be remote (NFS) mounted on a SUN-2 with a 3COM board, and vice-versa, without "slowing down the SUN-3 ethernet controller". Rlogin, rcp, rsh, etc., however, are supposed to work without slowing down the controller. According to some SUN documentation we have, SUN-2s can also be clients of a SUN-3 file server (using ND), if the SUN-3 is told (in /etc/nd.local) to limit the number of packets sent to the client (on a per-client basis) to two before requiring an acknowledgement.

Now for the questions that haven't been answered:

1. Why do rlogin, etc. work but not NFS? Does the NFS protocol not use any form of flow control or packet re-transmit? If that is the case, what happens when you run NFS between a VAX or another faster machine and a SUN-2 with a 3COM board?

2. What does it mean to "slow down the ethernet board"? Is it slowed down regardless of who it's talking to (e.g., is SUN-3 to SUN-3 communication slowed down), or on a per-host basis?

I'd appreciate any light that anyone can shed on this subject. Thanks in advance!

John Menges
menges@unc (csnet)
decvax!mcnc!unc!menges (uucp)
steve@umcp-cs.UUCP (Steve D. Miller) (05/07/86)
In article <1428@unc.unc.UUCP> menges@unc.UUCP (John Menges) writes:
>Here at the University of North Carolina's Department of Computer Science,
>we have approximately 40 SUN-2 workstations. We are in the process of
>installing a number of SUN-3 workstations as well. Some of our SUN-2s have
>3COM ethernet controllers. SUN has informed us that there is a
>communications problem interfacing SUN-2s with 3COM boards and SUN-3s.
>The problem has something to do with the 3COM boards not having enough
>buffering to handle full-speed transmissions from the SUN-3.

I can readily believe that; people around here are of the opinion that the 3COM boards are inherently slow, while (a) the other Sun Ethernet board (ie) is apparently fast to begin with and (b) if code complexity is any indication (the driver is > 2000 lines, and pulls all sorts of strange memory tricks), has a driver that fully supports its speed. The ec (3COM) driver is trivial in comparison.

>According to our sales representative, file systems on a SUN-3 cannot
>be remote (NFS) mounted on a SUN-2 with a 3COM board, and vice-versa,
>without "slowing down the SUN-3 ethernet controller". Rlogin, rcp, rsh,
>etc., however, are supposed to work without slowing down the controller.
>According to some SUN documentation we have, SUN-2s can also be clients
>of a SUN-3 file server (using ND), if the SUN-3 is told (in /etc/nd.local)
>to limit the number of packets sent to the client (on a per-client basis)
>to two before requiring an acknowledgement.
>
>Now for the questions that haven't been answered:
>
> 1. Why do rlogin, etc. work but not NFS? Does the NFS protocol not
> use any form of flow control or packet re-transmit? If that is the
> case, what happens when you run NFS between a VAX or another faster
> machine and a SUN-2 with a 3COM board?

I'm not sure that there's any reason why everything (rlogin, rsh, ... , NFS, ND) shouldn't work. It will probably not work too well, though, as the 3COM board will end up dropping lots of packets, so the ie boards will have to do a lot of retransmits...maybe enough to time out an occasional connection, though I doubt it.

NFS and ND both work over datagram protocols; ND is an "unofficial" protocol on top of IP, while NFS is built on a UDP-based RPC "connection". The NFS call routines all do retransmits via an exponential backoff scheme, but hard-mounted NFS filesystems will continue to retry the transmission indefinitely. It should be noted that the backoff happens on a per-packet basis only, so the next packet to go out will be sent with the minimum timeout. Of course, all those retransmits will be more work for the server...

One potential problem that I just thought of is that reads (and, perhaps, writes; I haven't looked at that part of the code) occur in 4K chunks, fragmented and reassembled as appropriate by IP; this means that the 3COM board is (based on an MTU of ~1500 bytes) going to see three to four big packets come in in *rapid* succession. If ND can only handle two without acks of some sort, then I'd be willing to guess that part of almost every NFS read will get dropped on the floor.

I've only been looking at the NFS code for a relatively brief time; does someone out there from Sun (or otherwise more in the know than I am) care to comment? I certainly wouldn't want to buy a lot of machines without more information from an "official" source.

> 2. What does it mean to "slow down the ethernet board"? Is it slowed
> down regardless of who it's talking to (e.g., is SUN-3 to SUN-3
> communication slowed down), or on a per-host basis?

Cthulhu knows what they mean by this, unless they're talking about being slowed down by excessive retransmits.

-Steve
--
Spoken: Steve Miller    ARPA: steve@mimsy.umd.edu    Phone: +1-301-454-4251
CSNet: steve@umcp-cs    UUCP: {seismo,allegra}!umcp-cs!steve
USPS: Computer Science Dept., University of Maryland, College Park, MD 20742
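
The "three to four big packets" estimate can be checked with a little fragmentation arithmetic. What follows is only a minimal sketch in C, assuming a 20-byte IP header with no options, an 8-byte UDP header, and fragment payloads rounded down to a multiple of 8 bytes. The name ip_fragments is made up for illustration, and any RPC/NFS header overhead is left for the caller to add to the payload size.

```c
#include <assert.h>

/* Assumed header sizes: IP without options, plus UDP. */
enum { IP_HEADER_BYTES = 20, UDP_HEADER_BYTES = 8 };

/*
 * Hypothetical helper: number of IP fragments needed to carry one
 * UDP datagram of udp_payload bytes over a link with the given MTU.
 */
unsigned ip_fragments(unsigned udp_payload, unsigned mtu)
{
    unsigned per_fragment, total;

    /* The MTU must leave room for at least one 8-byte unit of data. */
    assert(mtu >= IP_HEADER_BYTES + 8);

    /* Fragment offsets are counted in 8-byte units, so round down. */
    per_fragment = ((mtu - IP_HEADER_BYTES) / 8) * 8;

    /* The UDP header travels inside the IP payload. */
    total = udp_payload + UDP_HEADER_BYTES;

    /* Ceiling division. */
    return (total + per_fragment - 1) / per_fragment;
}
```

With a 1500-byte MTU, ip_fragments(4096, 1500) comes to 3 and ip_fragments(8192, 1500) comes to 6, while a 2K transfer needs only 2. Adding a hundred or so bytes of RPC header to the payload does not change any of those counts.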
chris@umcp-cs.UUCP (Chris Torek) (05/07/86)
In article <1359@umcp-cs.UUCP> steve@maryland.UUCP (Steve D. Miller) writes:
>In article <1428@unc.unc.UUCP> menges@unc.UUCP (John Menges) writes:
>> 2. What does it mean to "slow down the ethernet board"? ...
>
> Cthulhu knows what they mean by this ....

Actually, I suspect they mean something along these lines:

	/*
	 * Start transmission on an ie.
	 */
	ieoutput(sc)
		struct ie_softc *sc;
	{
		...
	#ifdef UGLY_KLUDGE
		if (sc->sc_flags & SF_NEEDDELAY) {
			sc->sc_flags &= ~SF_NEEDDELAY;
			timeout(ieoutput, (caddr_t) sc, 1);
			return;
		}
	#endif
		...
		ie->ie_command_register = IE_DO_A_SEND;
	#ifdef UGLY_KLUDGE
		sc->sc_flags |= SF_NEEDDELAY;
	#endif
	}

This would introduce a two tick delay per packet, which gives a maximum transmission rate of 25 packets per second (ugh). It might work to do timeout(..., 0), giving 50 packets/sec; but that is still awful. Another alternative, if you do not mind wasting CPU, is

		ie->ie_command_register = IE_DO_A_SEND;
	#ifdef OTHER_UGLY_KLUDGE
		DELAY(1000);	/* ~1 ms, hope that is long enough */
	#endif

I used something like the latter to get around a microcode bug in UDA50s (though I no longer need to get around it: I now simply avoid the situation in which the bug shows up).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs    ARPA: chris@mimsy.umd.edu
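
The 25 and 50 packets/sec figures are simple clock arithmetic. Here is a rough sketch, assuming a 50 Hz kernel clock (hz = 50), which is what those two figures imply but which the post does not state outright. The function names are invented for illustration and are not taken from any driver.

```c
#include <assert.h>

/*
 * Hypothetical helper: upper bound on packets sent per second when
 * every send is followed by a wait of ticks_per_packet clock ticks.
 */
unsigned max_packets_per_second(unsigned hz, unsigned ticks_per_packet)
{
    assert(ticks_per_packet > 0);
    return hz / ticks_per_packet;
}

/* Same bound expressed in bytes per second for a given packet size. */
unsigned long max_bytes_per_second(unsigned hz, unsigned ticks_per_packet,
                                   unsigned packet_bytes)
{
    return (unsigned long)max_packets_per_second(hz, ticks_per_packet)
        * packet_bytes;
}
```

Under that assumption, a two-tick gap gives max_packets_per_second(50, 2) = 25 and a one-tick gap gives 50. At 25 full-size 1500-byte packets per second the ceiling is 37500 bytes/sec, which shows how far below wire speed the timeout() kludge would sit.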
hedrick@topaz.RUTGERS.EDU (Charles Hedrick) (05/08/86)
NFS sends blocks which are either 4Kbytes or 8Kbytes (depending upon the function). At a lower level, these are turned into packets (1.5Kbytes if you are using normal Ethernet parameters). All of the packets generated from a given block are queued up to the output at the same time. The result is a burst of between 3 and 6 packets with almost no time between them. This code bypasses much of the normal TCP/IP code, for efficiency.

The 3Com boards have only two buffers, and they are on the board. In order to deal with large bursts, Unix must copy one buffer into mbufs while the other one is being filled from the network, and it must finish this process before the next packet shows up. A standalone 68010 with nothing to do but empty 3Com buffers, having zero interrupt latency, might just be able to do this. But a 68010 running Unix certainly cannot. The result is that at least one of the packets in the burst is dropped.

Because of the design of NFS, acknowledgements and retransmissions occur on the basis of the 4K or 8K blocks, not the individual packets. So if any one packet is dropped, the whole burst is lost and must be retransmitted. Thus you must receive every packet in a burst correctly.

The solution is not exactly to slow down the Ethernet controller. Rather, under version 3.0 there is a parameter you can specify in the mount that gives a maximum block size. You simply limit NFS to 2K blocks. Then its bursts are never longer than 2 packets. This increases CPU overhead slightly, because certain processing must be done once per block, and you are now sending twice as many blocks. It also decreases throughput slightly. It's not clear that this is really a big deal. This could be considered "slowing down the controller", but it is probably better described as "detuning NFS". Note that this is done for each mount. So only mounts between 3Com Sun 2's and Sun 3's need to have this parameter. Everything else on both machines will run as usual.

The problem does not afflict normal TCP use because the TCP code in the kernel isn't nearly as fast. It generates packets one at a time, rather than in bursts.
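
The two-buffer argument can be made concrete with a toy model. This is only a sketch under stated assumptions, not a description of the real ec driver: packets arrive back to back, each taking wire_us microseconds on the wire; a packet needs a free board buffer at the moment it starts arriving; the host copies one buffer at a time, taking copy_us per packet, and a buffer is free again only when its copy is done. The function name and all timing values are hypothetical.

```c
#include <assert.h>

enum { MAX_BURST = 64 };    /* arbitrary cap for this sketch */

/*
 * Hypothetical model: count packets dropped from a back-to-back burst
 * by a board with a fixed number of receive buffers.
 */
unsigned dropped_in_burst(unsigned burst, unsigned buffers,
                          unsigned long wire_us, unsigned long copy_us)
{
    unsigned long release[MAX_BURST];   /* when each accepted packet's buffer frees */
    unsigned long host_free = 0;        /* when the host finishes its current copy */
    unsigned accepted = 0, dropped = 0;
    unsigned i, j;

    assert(burst <= MAX_BURST);

    for (i = 0; i < burst; i++) {
        unsigned long start = i * wire_us;      /* first bit arrives */
        unsigned long end = start + wire_us;    /* last bit arrives */
        unsigned long copy_start;
        unsigned busy = 0;

        /* Count buffers still held by earlier packets. */
        for (j = 0; j < accepted; j++)
            if (release[j] > start)
                busy++;

        if (busy >= buffers) {      /* nowhere to put it */
            dropped++;
            continue;
        }

        /* Copy begins once the packet is in and the host is idle. */
        copy_start = end > host_free ? end : host_free;
        host_free = copy_start + copy_us;
        release[accepted++] = host_free;
    }
    return dropped;
}
```

A 1500-byte packet occupies a 10 Mbit/s Ethernet for about 1200 microseconds. With two buffers and an assumed copy time of 1500 microseconds, the model drops one packet from a 3-packet burst and none from a 2-packet burst, which is the behaviour that limiting NFS to 2K blocks relies on. If the copy time is shorter than the wire time (say 1000 microseconds), even a 6-packet burst gets through.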
guy@sun.UUCP (05/09/86)
> This code bypasses much of the normal TCP/IP code, for efficiency.
...
> The problem does not afflict normal TCP use because the TCP code in
> the kernel isn't nearly as fast. It generates packets one at a time,
> rather than in bursts.

A clarification: it bypasses all of the TCP code, because NFS uses Sun RPC with UDP, not TCP, as its transport mechanism. (It does bypass much of the UDP and IP code, as well.)

> Rather, under version 3.0 there is a parameter you can specify in the
> mount that gives a maximum block size. You simply limit NFS to 2K
> blocks.

Note that this is also useful if you are mounting a file system from a machine which is many gateways away from you. The IP datagram containing the UDP datagram containing NFS replies gets fragmented, as was pointed out in the previous message; this causes problems sending the NFS reply along a path involving several gateways.
--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.arpa
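
One way to see why fragmented replies fare badly over a long path is to combine this with the earlier point that losing any one fragment loses the whole block. The sketch below assumes, purely for illustration, that each fragment independently survives the path with the same probability; that independence model and the function name are assumptions, not something stated in the thread.

```c
#include <assert.h>

/*
 * Hypothetical model: chance that an NFS reply arrives intact when it
 * is split into the given number of fragments and every fragment must
 * survive, each with the same independent probability.
 */
double block_delivery_probability(double fragment_survival, unsigned fragments)
{
    double p = 1.0;
    unsigned i;

    assert(fragment_survival >= 0.0 && fragment_survival <= 1.0);

    /* All fragments must arrive, so the probabilities multiply. */
    for (i = 0; i < fragments; i++)
        p *= fragment_survival;
    return p;
}
```

Under this model, if 99% of fragments make it across the gateways, a 6-fragment reply arrives intact about 94% of the time, against about 98% for a 2-fragment reply. Smaller blocks mean fewer fragments per reply and so fewer whole-block retransmissions.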
jel@portal.UUcp (John Little) (05/09/86)
In article <1359@umcp-cs.UUCP>, steve@umcp-cs.UUCP (Steve D. Miller) writes:
>
> I can readily believe that; people around here are of the opinion that
> the 3COM boards are inherently slow, while (a) the other Sun Ethernet
> board (ie) is apparently fast to begin with and (b) if code complexity is
> any indication (the driver is > 2000 lines, and pulls all sorts of strange
> memory tricks), has a driver that fully supports its speed. The ec (3COM)
> driver is trivial in comparison.

The Intel 586 chip that most SUNs use to do Ethernet is not the world's most reliable, bug free, well documented or well supported chip. The last time I saw the bug list for the 586 it was five pages long. I suspect that much of the complexity is due to SUN's working around various bogosities in the chip.

There are lots of good reasons why SUN's most recent machine (the 3/50) uses the AMD 7990 instead of the Intel equivalent.

Note: I have no official connection with SUN, Intel or AMD.

John Little
{atari,sun,hoptaod}!portal!jel