KASTEN@MITVMA.MIT.EDU (Frank Kastenholz) (12/21/87)
A few months ago I made a general request to the list about running NFS across internetwork gateways. The specific configuration that I have to deal with is two Ethernets separated by some physical distance (possibly intercontinental). There would be some kind of gateway/bridge/router/thing at each Ethernet, and the things are connected by a medium to high speed serial link (anything from about 38K to 1.544M bits/sec).

The responses that I received (minus the "gee, what a great idea" or "Well, it had to come someday" responses) follow. I have left the originating party's name on each one; the rest of the SMTP/RFC822 junk has been removed.

======================================================================
From: "James B. VanBokkelen" <JBVB@AI.AI.MIT.EDU>

By default, none of Sun's implementations of NFS use UDP checksums. If you enable them, the last release I heard anything about still had the 4.2 UDP checksum mis-calculation. They assume they're running one hop, on a CRC-protected medium like Ethernet. Accordingly, you're not too likely to catch any situation where the NFS packet is corrupted on the way through a gateway, or over an error-prone link. Instant filesystem corruption. It can certainly be fixed if you have source, but I don't own any Suns (2nd hand info, this), so I can't say exactly what must be done.

=======================================================================
From: jas@MONK.PROTEON.COM (John A. Shriver)

The nature of the Sun NFS fragmented UDP-grams causes many routers and bridges to have fits. You get 6 back-to-back IP fragments. If ANY of those fragments is lost, the entire UDP-gram must be retransmitted. You can, however, reduce the size of the UDP-gram. In /etc/fstab, you need to add the undocumented rsize & wsize switches. For example:

    across_gw_machine:/usr /usr nfs rw,noquota,soft,rsize=2048,wsize=2048

This will reduce the size of the UDP-gram to 2048 bytes of data, as opposed to the default 8192.
This will cause only two fragments instead of six. (Do keep this parameter a multiple of 1024, as all the network code likes page-aligned buffers.) For reference, see bug 135 in Sun Software Technical Bulletin, February 1987, part number 812-8701-01. (The page numbering is botched in this one.)

====================================================================
From: slevy@umn-rei-uc.arpa (Stuart Levy)

One problem we've had in NFSing between disparate machines is with naming them. The mount request passes the originating machine -name- rather than having the server use gethostbyaddr(). It's important to check that "hostname" on the client yields a name known to the server and vice versa. That's probably not the whole problem but can cause things to break. A guy from Proteon, Mick Scully (mcs@proteon.com), recently visited here and mentioned that he had mounted NFS filesystems at Berkeley across ARPAnet.

======================================================================
From: hedrick@ATHOS.RUTGERS.EDU (Charles Hedrick)

We run NFS over cisco routers, either directly connecting two Ethernets or connecting Ethernets via a T1 line. The only problem is that the Ethernet cards used by cisco (and others) can't handle large numbers of consecutive packets. So you need to specify rsize=nnn,wsize=nnn in the mount. Typically we use 2048, though I think something a bit closer to 3000 might give better performance. I haven't tried it over anything slower, though we understand that somebody at Univ. of Maryland mounted one of our disks over NSFnet.

=====================================================================
From: John Romkey <ROMKEY@XX.LCS.MIT.EDU>

One problem you'll run into is that NFS does not checksum its packets. NFS packets are UDP-based instead of TCP-based, and the UDP checksum is optional.
On a single ethernet, the ethernet's CRC is possibly reliable enough to detect bad packets, but through an IP router there is too high a probability of losing (.1% would mean one out of 1000 packets was damaged; you really desperately want 0% errors). The reason is that there are chances for corruption of data in the ethernet interface of the IP router, in the IP router's memory, and in the other interface it routes the NFS packet to. The corruption can be due to hardware errors, electrical noise, memory errors and software problems. In fact, you've got the same problem with just an NFS server and client on the same LAN, but since fewer components are involved, the chances of error are much smaller. I've spoken with people who've used NFS over a router, and they've actually seen files corrupted due to the lack of checksums. I'd recommend against it. BTW, the reason they turn off checksums is to up performance.

- john romkey

=================================================================
=================================================================

Any further discussion should go to the list, to the original author, or directly to me. (Unfortunately I have recently moved from MIT-Multics to MITVMA, but the list has yet to catch up to me; whoever is running the distribution list must be on a verryyyyy loooonnnnggg vacation :-)

Seasons greetings to all

Frank Kastenholz
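The rsize/wsize advice in the digest above comes down to fragment arithmetic. What follows is only a rough sketch in Python, assuming a 1500-byte Ethernet MTU, a 20-byte IP header with no options, and a guessed 128 bytes of RPC/NFS header overhead; the constant names, the function names and the overhead figure are illustrative assumptions, not values taken from the thread. It also estimates how often a whole UDP-gram has to be resent when any single fragment can be lost.

```python
import math

ETHERNET_MTU = 1500   # bytes of IP packet carried per Ethernet frame
IP_HEADER = 20        # assumes no IP options
UDP_HEADER = 8
RPC_OVERHEAD = 128    # assumed allowance for RPC/NFS headers (a guess)


def fragments_for(nfs_data_bytes, mtu=ETHERNET_MTU):
    """Number of IP fragments needed for one NFS transfer of this size."""
    udp_datagram = UDP_HEADER + RPC_OVERHEAD + nfs_data_bytes
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(udp_datagram / per_fragment)


def datagram_loss_probability(fragments, fragment_loss_rate):
    """Chance that at least one fragment is lost, forcing a full resend."""
    return 1.0 - (1.0 - fragment_loss_rate) ** fragments


for size in (8192, 2048):
    n = fragments_for(size)
    p = datagram_loss_probability(n, 0.01)  # hypothetical 1% fragment loss
    print(f"rsize/wsize={size}: {n} fragments, datagram loss {p:.2%}")
```

Under these assumptions the default 8192 gives 6 fragments and 2048 gives 2, which matches the counts quoted by John Shriver. The 1% loss rate is purely hypothetical; the point is that the chance of a full retransmission grows with the number of fragments.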
karn@faline.bellcore.com (Phil R. Karn) (12/21/87)
By the way, one advantage of bridges over routers for NFS traffic (at least for the Vitalink bridges we use) is that they maintain the original Ethernet CRC; they encapsulate the entire source packet (CRC included) over the HDLC link. This means that a broken Ethernet controller in the bridge won't corrupt your checksumless NFS/UDP packets like a broken Ethernet controller in an IP router would. Phil
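The difference between carrying the original CRC and regenerating it can be illustrated with a small simulation. This is a simplified sketch, not a model of any real product: zlib.crc32 stands in for the Ethernet frame check sequence (real hardware uses the same polynomial but has its own bit ordering), and the bridge_forward and router_forward functions are hypothetical names invented for the illustration.

```python
import zlib


def frame_with_crc(payload: bytes) -> bytes:
    """Append a CRC-32 to the payload, as a stand-in for the Ethernet FCS."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Check the trailing CRC-32 against the payload."""
    payload, crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == crc


def flip_bit(data: bytes, index: int, bit: int = 0) -> bytes:
    """Return a copy of data with one bit inverted."""
    corrupted = bytearray(data)
    corrupted[index] ^= 1 << bit
    return bytes(corrupted)


def bridge_forward(frame: bytes, corrupt_at=None) -> bytes:
    """Encapsulating bridge: passes the frame on with its original CRC."""
    if corrupt_at is not None:
        frame = flip_bit(frame, corrupt_at)  # damage inside the bridge
    return frame


def router_forward(frame: bytes, corrupt_at=None) -> bytes:
    """IP router: strips the link CRC and computes a new one on output."""
    payload = frame[:-4]
    if corrupt_at is not None:
        payload = flip_bit(payload, corrupt_at)  # damage inside the router
    return frame_with_crc(payload)


original = frame_with_crc(b"NFS write: block 17 of /usr/data")
via_bridge = bridge_forward(original, corrupt_at=5)
via_router = router_forward(original, corrupt_at=5)

print(crc_ok(via_bridge))  # False: the receiver sees the damage
print(crc_ok(via_router))  # True: the new CRC covers the damaged data
```

In the router case the data arrives with a valid CRC even though it was damaged in transit, which is the situation where only an end-to-end checksum helps.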
eshop@saturn.ucsc.edu (Jim Warner) (12/22/87)
In article <1649@faline.bellcore.com> karn@faline.bellcore.com (Phil R. Karn) writes:
>By the way, one advantage of bridges over routers for NFS traffic (at
>least for the Vitalink bridges we use) is that they maintain the
>original Ethernet CRC; they encapsulate the entire source packet (CRC
>included) over the HDLC link....

This is *NOT* a general characteristic of all bridges. It is true for DEC and Vitalink. If this characteristic is important, you should be sure to ask your vendor how they handle it.

jim
melohn@SUN.COM (Bill Melohn) (12/22/87)
You can corrupt an NFS file system invisibly by sending NFS/UDP packets over an unreliable datalink (like the current version of SLIP) without first turning on UDP checksums, which has been possible since SunOS 3.2. With our point to point IP router, we do a CRC at the serial chip level, making it act like an ethernet. Other people have done NFS over SLIP using error-detecting modems like the Telebit Trailblazer. In any case, the trend is towards error-free or at least error correcting hardware/networks, so the NFS/UDP default seems even more reasonable in a high-fibre future.
karn@faline.bellcore.com (Phil R. Karn) (12/23/87)
> This [maintaining original Ethernet CRCs] is *NOT* a general
> characteristic of all bridges.

Don't I know it. Before the DEC Lanbridge came out, I built my own out of PDP-11/73s and DEQNAs. Big mistake! The DEQNA has *major* problems running in promiscuous mode. One common manifestation was undetected packet corruption. Lots of funny entries showed up in our routing and ruptime tables because UDP checksums were disabled on the Sun routers.

This experience made me a firm believer in end-to-end checksums for *all* packets. The performance impact of UDP checksums in NFS is minimal, but even if it weren't they would still be worth it. Even ARP suffers from the lack of an internal checksum.

Phil
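For readers who want to see what an end-to-end checksum buys, here is a minimal sketch of the 16-bit one's-complement Internet checksum in the style of RFC 1071. It is an illustration only: a real UDP checksum also covers a pseudo-header built from the IP addresses, protocol and length, which is omitted here, and the payload and the stuck_bit helper are made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum of the kind used by IP, UDP and TCP."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def stuck_bit(data: bytes, index: int, mask: int = 0x01) -> bytes:
    """Simulate a memory bit stuck at 1 in the byte at the given index."""
    damaged = bytearray(data)
    damaged[index] |= mask
    return bytes(damaged)


sent = b"ruptime entry: faline up 12 days"  # hypothetical payload
sent_sum = internet_checksum(sent)          # computed by the sender

received = stuck_bit(sent, 0)               # damaged somewhere along the path
if internet_checksum(received) != sent_sum:
    print("checksum mismatch: packet discarded")
```

The check runs on the two end hosts, so it catches damage introduced anywhere in between, including inside a router or bridge whose link-level CRCs all look fine.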
SRA@XX.LCS.MIT.EDU (Rob Austein) (12/23/87)
    Date: Tuesday, 22 December 1987 02:01-EST
    From: melohn@Sun.COM (Bill Melohn)

    ... In any case, the trend is towards error-free or at least error correcting hardware/networks, so the NFS/UDP default seems even more reasonable in a high-fibre future.

Sorry, but this is a bad idea. You really do need end-to-end software checksumming. MIT discovered this the hard way years ago when a Chaosnet "bridge" (a level 3 router in spite of the name) developed a stuck bit. Chaosnet hardware does hardware checksumming, like Ethernet (in fact, these days, most of it IS Ethernet; even at MIT there are only two subnets left still using Chaosnet hardware). The Chaosnet hardware faithfully transported all the bits entrusted to it, but the packets were corrupted nonetheless.

Things only get worse when you start talking about long haul nets.

--Rob
hedrick@athos.rutgers.edu (Charles Hedrick) (12/25/87)
I guess I'm about to jump on the bandwagon for turning on NFS checksumming. We just had Sun field service replace an Ethernet board because we started noticing corrupted files transported via NFS. No gateways or bridges involved. It was apparently a failure in the Ethernet interface board. After the vacation I'm going to look into turning on checksumming everywhere. This was not our first problem. The other one was due to a design bug in the ACC 1822 Multibus card. When put into a gateway with more than one Ethernet card, the load got too heavy for the chips they used to drive it. The bus arbitration didn't work. It stomped on the bus cycles of other devices. Result was random garbaging of data. TCP worked fine, but NFS files were garbaged. The board has just recently been fixed. Of course with these low-rate failures, if checksumming were turned on, we would probably never even know we had a problem. On the other hand, it seems a bit drastic to use garbage in user files as a diagnostic.
LYNCH@A.ISI.EDU (Dan Lynch) (12/26/87)
Gee, when I used to work in a computer center we had this marvelous procedure called "running diagnostics". We did it to make sure all the equipment was in proper working order. Now that we have networking, have we forgotten our past??? What I see missing is a definite package of diagnostic procedures to check out each "piece of the system". If the "network IS the computer" it needs to be treated like one.

Dan
-------
JBVB@AI.AI.MIT.EDU ("James B. VanBokkelen") (12/28/87)
In the Chaosnet example mentioned, the router was running just fine, and the memory problem was corrupting one in N packets forwarded. Yeah, a diagnostic would have found it, but networks are big and fuzzy, and the failure was intermittent, and I think the people who first realized there was a problem spent some time just locating it, and some more time thinking it was a software bug. It would be nice if everything ran memory diagnostics as the idle task, and it would be nice if there weren't interfaces which corrupt packets silently under some conditions. Maybe someday. For the moment, I think end-to-end error detection is a good thing. jbvb
ron@TOPAZ.RUTGERS.EDU (Ron Natalie) (01/03/88)
Sure, and I remember running DECX for days and having things turn up 100% OK, but then having the machine blow up with hardware errors five minutes after the normal OS was booted. There's no diagnostic like actually trying to use the system.

-Ron