phil@BRL.ARPA (Phil Dykstra) (03/06/88)
I too experienced the loss-of-one-byte problem on about one out of five file transfers from expo.lcs.mit.edu. The very first byte of the last packet sent was somehow being dropped. It happened with two different TCPs on this end, so I suspect that the problem is with expo's TCP.
- Phil
bob@allosaur.cis.ohio-state.edu (Bob Sutterfield) (03/06/88)
Here's another vote. Same symptom (511,999-byte files arrive) with both SunOS 3.4 and Pyramid OSx 4.0, though I'm not sure which byte of the 512,000 was being dropped.
-=-
Bob Sutterfield, Department of Computer and Information Science
The Ohio State University; 2036 Neil Ave. Columbus OH USA 43210-1277
bob@cis.ohio-state.edu or ...!cbosgd!osu-cis!bob
earle@mahendo.JPL.NASA.GOV (Greg Earle) (03/09/88)
expo is probably running vanilla SunOS 3.4. There are various TCP bugs in
vanilla 3.4; one is that sometimes an initiator will for some bizarre reason
present a 64K TCP window to the other end, and slowly but surely the window
will get eaten away until it is zero, and then your TCP/IP service (often
an FTP) will hang. Another is that when negotiating the Maximum Segment Size,
the packet size always gets set to 512 bytes. I have a feeling that
one of these bugs is causing the famed 511999 bug (I was bitten as well).
There is another variant to it; even more insidious is getting all 512000
bytes and discovering only after unsplitting that a file is corrupted. In both
the 511999 case and this one, the very first byte of the last 512-byte block of
the file gets dropped. In the 511999 case, that is all there is to it. In
the even-more-insidious case, the first byte is dropped and the last byte is
*replicated* (i.e., instead of bytes [0-511] we get [1-511] followed by 511
again). This happened to me with core.src.tar.Z.split.ac.
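To make the two failure modes concrete, here is a minimal Python sketch that
reproduces them on an in-memory buffer. It only illustrates the byte patterns
described above; it is not a model of the SunOS TCP code. The function names
and the test pattern are made up for illustration, and the sketch assumes the
file length is a multiple of 512.

```python
BLOCK = 512  # block size named in the post


def drop_first_byte_of_last_block(data: bytes) -> bytes:
    """The 511999 case: the first byte of the final 512-byte block vanishes."""
    start = len(data) - BLOCK  # assumes len(data) is a multiple of BLOCK
    return data[:start] + data[start + 1:]


def drop_and_replicate(data: bytes) -> bytes:
    """The same-length case: first byte dropped, last byte sent twice."""
    short = drop_first_byte_of_last_block(data)
    return short + short[-1:]


def classify(original: bytes, received: bytes) -> str:
    """Compare a received copy against a known-good original."""
    if received == original:
        return "intact"
    if received == drop_first_byte_of_last_block(original):
        return "short by one byte"
    if received == drop_and_replicate(original):
        return "same length, last block shifted"
    return "other damage"


# Hypothetical 512000-byte split piece with a non-repeating byte pattern.
original = bytes(i % 251 for i in range(512000))
short_copy = drop_first_byte_of_last_block(original)
shifted_copy = drop_and_replicate(original)

print(len(short_copy), classify(original, short_copy))
print(len(shifted_copy), classify(original, shifted_copy))
```

The second case is the nasty one: a size check passes, so only a
byte-for-byte comparison or a checksum against a good copy will catch it.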
In short: expo should be running either (at least) SunOS 3.4.2, or SunOS 3.5.
If you have a Sun at the receiving end, make sure it is at 3.4.2 or 3.5, and
both you and expo should install the tcp_mss patch. This patches the
locations tcp_mss+0xac & tcp_mss+0xbc in /vmunix (and /dev/kmem) to be 1024,
not 512 (this is a workaround - `Fixed in 4.0' ... ):
yourmachine:1 # adb -w -k /vmunix /dev/mem <==
sbr f06aa18 slr 649
physmem 1fe
tcp_mss+0xac?d <==
_tcp_mss+0xac: 512
tcp_mss+0xac?w 0x400 <==
_tcp_mss+0xac: 0x200 = 0x400
tcp_mss+0xac/d <==
_tcp_mss+0xac: 512
tcp_mss+0xac/w 0x400 <==
_tcp_mss+0xac: 0x200 = 0x400
tcp_mss+0xbc?d <==
_tcp_mss+0xbc: 512
tcp_mss+0xbc?w 0x400 <==
_tcp_mss+0xbc: 0x200 = 0x400
tcp_mss+0xbc/d <==
_tcp_mss+0xbc: 512
tcp_mss+0xbc/w 0x400 <==
_tcp_mss+0xbc: 0x200 = 0x400
tcp_mss+0xbc?d <==
_tcp_mss+0xbc: 1024
^D <==
yourmachine:2 #
[ Sorry to turn xpert into a Sun kernel & TCP discussion, but the keepers of
`expo' might want to consider this, due to the FTP problems ... - Greg ]
--
Greg Earle earle@mahendo.JPL.NASA.GOV
Indep. Sun consultant earle%mahendo@jpl-elroy.ARPA [aka:]
(Gainfully Unemployed) earle%mahendo@elroy.JPL.NASA.GOV
Lake View Terrace, CA ...!{cit-vax,ames}!elroy!jplgodo!mahendo!earle