david@ms.uky.edu (David Herron -- One of the vertebrae) (09/04/88)
Ok, maybe this one should have been obvious. But if it hadn't been for
Doug Kingston giving me a short list of things to check over I would've
been a *lot* longer in finding the problem. (*Thanks*!)

The short description is that a number of applications (ftp, smtp, etc)
stopped working between some of our machines after switching our vaxen
over to Ultrix v2.2 (from MtXinu 4.3bsd + NFS -- a step backward if you
ask me, but it's a looong story). A question to the mmdf mailing list
(at the time I knew only that smtp didn't work) elicited a reply from
Doug that he thought it was more likely TCP/IP differences and to check
things like trailers and MSS ... A check found that sure 'nuff we had
some machines with trailers and some without. Switching off trailers on
the interface made the applications work again.

I've got a couple of questions for the assembled experts:

1. Why did things continue to sort-of work between the conflicting
   machines? I haven't looked at the code yet, but my understanding
   of the rfc is that ALL packets will be trailer-ified when going
   out a trailer link (or on 4.3bsd, out a trailer link AND when the
   host in question negotiated trailer use). If ALL the packets were
   trailer-ified then the hosts would be seeing data where they were
   expecting header data and get all confused.

2. Why does Sun not recommend trailers? Do they use a different page
   size than vaxen? Or is it -- in general it's not good to use
   trailers on machines other than vaxen or it's not good to use
   trailers in a mixed environment?

3. Is there any financial aid and/or cheaper rates for a student who
   wants to attend Interop '88?

The following is the long version. It's the report which I wrote up
for all the networking people on campus.

- Date: Thu, 1 Sep 88 17:11:33 EDT
- From: David Herron E-Mail Hack <david@ms.uky.edu>
- To: uk-net-people@ms.uky.edu
- Subject: trailers
- Message-ID: <8809011711.aa06222@g.e.ms.uky.edu>
-
- Oooo boy, the tiny things that'll cause problems ...
-
- We've had a confusing problem over here since converting to Ultrix,
- that some of the programs would work to/from Ultrix machines and other
- times they wouldn't. Like, an outgoing smtp connection would work fine
- until it sent out that trailing '.' whereupon it would hang.
-
- Some asking around led to a suggestion to check trailers, MSS (Max
- Segment Size) and a few other options. Some checking around in the
- code of the affected programs revealed no non-portable code which
- Ultrix broke. Ultrix was, fortunately, still enough like BSD that
- things worked as they did under BSD. Albeit with an older technology
- of TCP/IP. Eventually I ended up at the trailers suggestion.
-
- What's a trailer? Well, all it says in the manual page is some
- mumbling about changing the layout of IP packets to reduce the amount
- of copying that's involved. They are documented in RFC893, and related
- rfc's are 984 & 894 which cover the details of doing IP across ethernet
- like mediums.
-
- The trailer idea is to fix the size of the data portion of the IP
- packet at some multiple of the page size of your machine. Since the
- idea was originally developed at UCB for 4.2, the size is 512 bytes or
- some multiple (The page size on a Vax). The information which would
- normally be at the head of the packet (IP header information like
- to/from addresses, packet size & etc) is moved to the end and is now
- called a 'trailer'. There are also two other things added to the
- trailer; a protocol type field and a trailer length field.
-
- Unfortunately they didn't do anything intelligent originally like
- negotiate use of trailers on a per host basis. Instead trailers
- are either on or off on a per interface basis, and the choice is
- made at boot time when ifconfig is run. UCB's next version did do
- negotiation as part of ARP but in the meantime the 4.2 version
- of TCP/IP became part of many systems, many of which we have
- here on our ethernet.
-
- Looking at the various manual pages I have access to:
-
-       4.3bsd          negotiable per host (default=trailers)
-       WIN/TCP         non-negotiable (default=trailers)
-       sun v3.4        non-negotiable. also 'not recommended'
-                       because it's host dependent. (default=trailers)
-       ultrix 2.2      non-negotiable (default=trailers)
-
- Some of our machines had trailers turned off and some had them turned
- on. Brian had thought it wasn't important because it was negotiated
- and turned them on ... oh well.
-
- One thing I'm not sure about is why things sort-of worked ... between
- two non-negotiating hosts which disagreed over the trailer issue there
- shouldn't have been *any* communication, because they disagree over
- where the 'header' information is to be kept. Probably there is something
- else going on as well, but I'm not sure what.
-
- For now we've turned off trailers on all of our machines. Would the rest
- of you look into your configurations and tell me which ones can do trailers
- to begin with, and which ones can negotiate it? (The negotiation is part
- of the ARP protocol). This is another of those TCP/IP options which needs
- to be agreed upon across our whole ethernet. er.. Well ... if someone were
- to have an IP gateway between their net and the campus net, they would be
- able to do what they want on their net.
-
- Maybe we want to run with trailers on everywhere. But we need to make
- sure that it makes sense for all the machines...
- --
- <---- David Herron -- The E-Mail guy <david@ms.uky.edu>
- <---- ska: David le casse\*' {rutgers,uunet}!ukma!david, david@UKMA.BITNET
- <---- Problem: how to get people to call ...; Solution: Completely reconfigure
- <---- your mail system then leave for a weeks vacation when 90% done.
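[Editorial sketch, not part of the original posts.] The layout the
report describes (data first, then a type field, a length field, and
the headers that would normally lead the packet) can be illustrated
roughly as below. This is a minimal sketch under stated assumptions,
not the BSD or Ultrix code: the function name build_trailer_payload
and its parameters are hypothetical, the base type 0x1000 is the value
4.3BSD calls ETHERTYPE_TRAIL, and the length field is assumed to count
the two 16-bit fields plus the moved headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHERTYPE_IP    0x0800  /* type of the packet being wrapped */
#define ETHERTYPE_TRAIL 0x1000  /* base Ethernet type for trailer frames */
#define TRAILER_BLOCK   512     /* data must be a multiple of this size */

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}

/*
 * Hypothetical helper: lay out a trailer payload as
 *     data | type (2) | length (2) | headers
 * Returns the payload length, or 0 if the data is not a positive
 * multiple of 512 bytes or the output buffer is too small.
 * *ethertype receives 0x1000 + the number of 512-byte data blocks.
 */
size_t build_trailer_payload(const uint8_t *hdr, size_t hdr_len,
                             const uint8_t *data, size_t data_len,
                             uint8_t *out, size_t out_cap,
                             uint16_t *ethertype)
{
    size_t total = data_len + 4 + hdr_len;

    if (data_len == 0 || data_len % TRAILER_BLOCK != 0 || total > out_cap)
        return 0;

    memcpy(out, data, data_len);                  /* data goes first */
    put_be16(out + data_len, ETHERTYPE_IP);       /* original type */
    /* Assumption: length covers the two fields plus the headers. */
    put_be16(out + data_len + 2, (uint16_t)(hdr_len + 4));
    memcpy(out + data_len + 4, hdr, hdr_len);     /* headers go last */

    *ethertype = (uint16_t)(ETHERTYPE_TRAIL + data_len / TRAILER_BLOCK);
    return total;
}
```

With 40 bytes of IP+TCP headers and 512 bytes of data this yields a
556-byte payload sent with Ethernet type 0x1001. A receiver that does
not implement trailers would not see that frame as IP at all.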
rpw3@amdcad.AMD.COM (Rob Warnock) (09/10/88)
In article <10208@s.ms.uky.edu> david@ms.uky.edu (David Herron) writes:
+---------------
| ... A check found that sure 'nuff we had some machines with trailers
| and some without. Switching off trailers on the interface made the
| applications work again.
| I've got a couple of questions...
| 1. Why did things continue to sort-of work between the conflicting
|    machines? I haven't looked at the code yet, but my understanding
|    of the rfc is that ALL packets will be trailer-ified when going
|    out a trailer link...
+---------------

Close. The trailer protocol is only used when the data portion of the
packet is an exact multiple of 512 bytes. The trailer protocol actually
uses a separate Ethernet type field value for each such multiple of 512.
From 4.3's "vaxif/if_il.c" (the comment marked with [!] has a bug, it
says "first packet" when it should say "first mbuf"):

        /*
         * Ethernet output routine.
         * Encapsulate a packet of type family for the local net.
         * Use trailer local net encapsulation if enough data in first
[!]==>   * packet leaves a multiple of 512 bytes of data in remainder.
         */
        iloutput(ifp, m0, dst)
        {
            ...
            off = ntohs((u_short)mtod(m, struct ip *)->ip_len) - m->m_len;
            if (usetrailers && off > 0 && (off & 0x1ff) == 0 &&
                m->m_off >= MMINOFF + 2 * sizeof (u_short)) {
                    type = ETHERTYPE_TRAIL + (off>>9);
                    ...
            }
            ...

This counts the "data" part of the packet only because the headers fit
entirely within the first mbuf, which happens (!) to be the case for all
protocols supported by the standard code ({UDP,TCP}/IP & XNS).

So... if you are doing something with short or odd-sized packets, like
a line-by-line Telnet or *very* small mail, you can still communicate
between a trailer and non-trailer implementation. Plus, you can always
send data from the non-trailer hosts *to* the trailer host, since <ACK>s
are small and thus never get trailerized.

In fact, you should have been able to watch your SMTP mail on a packet
monitor and seen the entire "HELO", etc., dialog go along just fine up
to the point that the trailer-using host blasted its first full-sized
packet at the non-trailer host... whereupon the trailer'd packet would
be periodically retransmitted until the connection timed out.

Rob Warnock
Systems Architecture Consultant

UUCP:     {amdcad,fortune,sun}!redwood!rpw3
ATTmail:  !rpw3
DDD:      (415)572-2607
USPS:     627 26th Ave, San Mateo, CA 94403
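[Editorial sketch, not part of the original posts.] The test in the
quoted iloutput() excerpt can be pulled out into a small standalone
function to show which packets get trailer encapsulation. This is a
simplified sketch, not the driver code: pick_ethertype and its
parameters are hypothetical names, hdr_len stands in for the length of
the first mbuf (assumed to hold exactly the protocol headers), the
mbuf-space check on m_off is left out, and the constants are the
values 4.3BSD uses for ETHERTYPE_IP and ETHERTYPE_TRAIL.

```c
#include <stdbool.h>
#include <stdint.h>

#define ETHERTYPE_IP    0x0800  /* ordinary IP encapsulation */
#define ETHERTYPE_TRAIL 0x1000  /* base type for trailer encapsulation */

/*
 * Hypothetical helper mirroring the quoted test:
 *   ip_len  - total length of the IP datagram in bytes
 *   hdr_len - bytes of protocol headers (the "first mbuf" in the driver)
 * Trailers are used only when the data after the headers is a positive
 * exact multiple of 512 bytes; the block count is added to the type.
 */
uint16_t pick_ethertype(bool usetrailers, unsigned ip_len, unsigned hdr_len)
{
    unsigned off;

    if (ip_len <= hdr_len)
        return ETHERTYPE_IP;            /* no data at all, e.g. a bare ACK */

    off = ip_len - hdr_len;             /* size of the data portion */
    if (usetrailers && (off & 0x1ff) == 0)
        return (uint16_t)(ETHERTYPE_TRAIL + (off >> 9));

    return ETHERTYPE_IP;                /* short or odd-sized: plain IP */
}
```

Assuming 40 bytes of headers, a 100-byte SMTP command line goes out as
ordinary IP (0x0800), while a segment carrying exactly 512 or 1024
bytes of data goes out as 0x1001 or 0x1002. That matches the behaviour
described above: small exchanges and ACKs get through, and only
512-multiple data segments are lost on the non-trailer host.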
david@ms.uky.edu (David Herron -- One of the vertebrae) (09/13/88)
First, thanks to everyone who responded to my posting. The consensus
was that trailers while on the surface seeming like a good thing are,
in practice, somewhat 'bad' and it's not even clear if they actually
help even in their native environment. Plus there are cases (Sun's
especially) where they hurt performance because the idea is too Vax
specific and especially too specific to the Vax memory management.

In article <22891@amdcad.AMD.COM> rpw3@amdcad.UUCP (Rob Warnock) writes:
>In fact, you should have been able to watch your SMTP mail on a packet
>monitor and seen the entire "HELO", etc., dialog go along just fine up
>to the point that the trailer-using host blasted its first full-sized
>packet at the non-trailer host... whereupon the trailer'd packet would
>be periodically retransmitted until the connection timed out.

Well, being able to watch my SMTP mail on a packet monitor assumes the
presence of a packet monitor in the first place. The closest I have is
tcpdump which, that I know of, does not display the contents of the
packet. (The joys of living in a poor state at a University which isn't
yet fully up to speed on networking technology & hardware ....)

Anyway. What I was seeing from the user level was the SMTP conversation
succeeding up to the point where the program had finished sending all
of the DATA section. Then it went to send the '.' and hung either in
sending the '.' or waiting for the response (depending on the phase of
the moon, I think).

Now possibly the DATA section was being buffered as much as possible,
I don't remember the code that well. Certainly it looked to me (at the
time) as if the code were hanging because of a short packet rather than
a long one...
--
<---- David Herron -- One of the MMDF guys <david@ms.uky.edu>
<---- ska: David le casse\*' {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<---- What does the phrase "Don't work too hard"
<---- have to do with the decline of the american 'work ethic'?