phil@BRL.ARPA (Phil Dykstra) (03/06/88)
I too experienced the loss-of-one-byte problem on about one out of five file transfers from Expo.lcs.mit.edu. The very first byte of the last packet sent was being dropped somehow. It happened with two different TCPs on this end, so I suspect that the problem is with expo's TCP.
- Phil
bob@allosaur.cis.ohio-state.edu (Bob Sutterfield) (03/06/88)
Here's another vote. Same symptom (511,999 byte files arrive) with both SunOS 3.4 and Pyramid OSx 4.0, though I'm not sure which byte of the 512,000 was being dropped.
-=-
Bob Sutterfield, Department of Computer and Information Science
The Ohio State University; 2036 Neil Ave. Columbus OH USA 43210-1277
bob@cis.ohio-state.edu or ...!cbosgd!osu-cis!bob
earle@mahendo.JPL.NASA.GOV (Greg Earle) (03/09/88)
expo is probably running vanilla SunOS 3.4. There are various TCP bugs in vanilla 3.4. One is that an initiator will sometimes, for some bizarre reason, present a 64K TCP window to the other end; slowly but surely the window gets eaten away until it is zero, and then your TCP/IP service (ofttimes an FTP) hangs. Another is that when negotiating the Maximum Segment Size, the packet size always gets set to 512 bytes. I have a feeling that one of these bugs is causing the famed 511999 bug (I was bitten as well).

There is another variant of it; even more insidious is getting all 512000 bytes and discovering only after unsplitting that a file is corrupted. In both the 511999 case and this one, the very first byte of the last 512-byte block of the file gets dropped. In the 511999 case, that is all there is to it. In the even-more-insidious case, the first byte is dropped and the last byte is *replicated* (i.e., instead of [0-511] we get [1-511]511). This happened to me with core.src.tar.Z.split.ac.

In short: expo should be running (at least) SunOS 3.4.2 or SunOS 3.5. If you have a Sun at the receiving end, make sure it is at 3.4.2 or 3.5, and both you and expo should install the tcp_mss patch. This patches the locations tcp_mss+0xac & tcp_mss+0xbc in /vmunix (and /dev/kmem) to be 1024, not 512 (this is a workaround - `Fixed in 4.0' ...
):

    yourmachine:1 # adb -w -k /vmunix /dev/mem     <==
    sbr f06aa18 slr 649
    physmem 1fe
    tcp_mss+0xac?d                                 <==
    _tcp_mss+0xac:  512
    tcp_mss+0xac?w 0x400                           <==
    _tcp_mss+0xac:  0x200 = 0x400
    tcp_mss+0xac/d                                 <==
    _tcp_mss+0xac:  512
    tcp_mss+0xac/w 0x400                           <==
    _tcp_mss+0xac:  0x200 = 0x400
    tcp_mss+0xbc?d                                 <==
    _tcp_mss+0xbc:  512
    tcp_mss+0xbc?w 0x400                           <==
    _tcp_mss+0xbc:  0x200 = 0x400
    tcp_mss+0xbc/d                                 <==
    _tcp_mss+0xbc:  512
    tcp_mss+0xbc/w 0x400                           <==
    _tcp_mss+0xbc:  0x200 = 0x400
    tcp_mss+0xbc?d                                 <==
    _tcp_mss+0xbc:  1024
    ^D                                             <==
    yourmachine:2 #

[ Sorry to turn xpert into a Sun kernel & TCP discussion, but the keepers of `expo' might want to consider this, due to the FTP problems ... - Greg ]

--
Greg Earle              earle@mahendo.JPL.NASA.GOV
Indep. Sun consultant   earle%mahendo@jpl-elroy.ARPA    [aka:]
(Gainfully Unemployed)  earle%mahendo@elroy.JPL.NASA.GOV
Lake View Terrace, CA   ...!{cit-vax,ames}!elroy!jplgodo!mahendo!earle
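
The two failure modes described in the post above (one byte short, and same length but shifted) can be modelled in a few lines. This is only a sketch of the symptom as reported, not of the underlying TCP bug: the function names, the 512-byte `BLOCK` constant, and the synthetic test data are all illustrative assumptions, and `classify` presumes you have a known-good copy of the file to compare against.

```python
BLOCK = 512  # block size quoted in the post; assumed here, not measured


def drop_first_byte_of_last_block(data: bytes) -> bytes:
    """Model the '511999' case: the first byte of the last block is lost."""
    if not data:
        return data
    start = ((len(data) - 1) // BLOCK) * BLOCK  # offset of the last block
    return data[:start] + data[start + 1:]


def drop_and_replicate(data: bytes) -> bytes:
    """Model the 'insidious' case: first byte lost, last byte repeated."""
    short = drop_first_byte_of_last_block(data)
    return short + short[-1:]  # length is unchanged, contents are shifted


def classify(original: bytes, received: bytes) -> str:
    """Compare a received copy against a known-good original."""
    if received == original:
        return "intact"
    if received == drop_first_byte_of_last_block(original):
        return "short by one byte"
    if received == drop_and_replicate(original):
        return "same length, last block shifted"
    return "other corruption"


# Synthetic 512000-byte file (hypothetical data, only for the demonstration).
original = bytes(i % 251 for i in range(512000))

print(len(drop_first_byte_of_last_block(original)))  # 511999
print(len(drop_and_replicate(original)))             # 512000
print(classify(original, drop_and_replicate(original)))
```

The second case is why a length check alone is not enough: the corrupted file has the right size, so only a byte-for-byte comparison (or a checksum from the sender) reveals the damage.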