[comp.protocols.tcp-ip] Interlan drops a byte?

fletcher@cs.utexas.edu (Fletcher Mattox) (08/21/88)

Has anybody else seen a 4.3BSD VAX with an Interlan Ethernet interface
drop a byte of data?  Well, that's what we're seeing.  

For example, if you

	% rsh remotehost cat 183_byte_file

and the remotehost is a 4.3/Interlan host, the rsh will fail.

If you look at a packet on the wire, say, with etherfind or tcpdump,
the IP header claims there are 223 bytes (183+40 TCP/IP headers), which
is correct.  Yet there are really only 222 bytes of data in the packet.  
Hmm.

Furthermore, other file sizes fail.  It appears that if (n%256 == 183)
where n is the number of bytes in the above rsh, then TCP/IP fails
because a byte is dropped from the data.
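
The size pattern above can be written out as a small check.  This is only
a sketch: it assumes a 20-byte IP header plus a 20-byte TCP header with no
options (the 40 bytes quoted above), and the function names are made up
for illustration.

```python
# Assumption: 20-byte IP header + 20-byte TCP header, no options.
IP_TCP_HEADER_BYTES = 40


def expected_ip_length(payload_bytes):
    """Total length the IP header should claim for one TCP segment."""
    return payload_bytes + IP_TCP_HEADER_BYTES


def hits_failure_pattern(n):
    """True when n matches the reported pattern n % 256 == 183."""
    return n % 256 == 183


# File sizes up to 1024 bytes that would be worth testing.
suspect_sizes = [n for n in range(1, 1025) if hits_failure_pattern(n)]

print(expected_ip_length(183))  # 223
print(suspect_sizes)            # [183, 439, 695, 951]
```

Files of the listed sizes are candidates for reproducing the failure with
the rsh test; sizes outside the pattern would be expected to transfer
intact.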

If we replace the Interlan with a DEUNA, all is as it should be.

Any ideas?

--
Fletcher Mattox 	fletcher@cs.utexas.edu	 cs.utexas.edu!fletcher

dennis@rlgvax.UUCP (Dennis.Bednar) (08/23/88)

In article <3210@cs.utexas.edu>, fletcher@cs.utexas.edu (Fletcher Mattox) writes:
> 
> If you look at a packet on the wire, say, with etherfind or tcpdump,
> ...

Say, these "ethernet sniffer" tools sound like very useful tools.
Do these tools run on the UNIX machine, or on a PC?  Is source
available? Tell me more about them.  Thanks.

-- 
FullName:	Dennis Bednar
UUCP:		{uunet|sundc}!rlgvax!dennis
USMail:		CCI; 11490 Commerce Park Dr.; Reston VA 22091
Telephone:	+1 703 648 3300

pierre@imag.imag.fr (Pierre LAFORGUE) (08/23/88)

In article <3210@cs.utexas.edu> fletcher@cs.utexas.edu (Fletcher Mattox) writes:
>Has anybody else seen a 4.3BSD VAX with an Interlan Ethernet interface
>drop a byte of data?  Well, that's what we're seeing.  
>For example, if you
>	% rsh remotehost cat 183_byte_file
>and the remotehost is a 4.3/Interlan host, the rsh will fail.

Your report is not very accurate; I think it depends on the Interlan
interface.
Here, we are using both types: NI1010A and NP100. Both work well (I tried
your test, of course).
Maybe your problem comes from your software driver. For instance, the original
BSD4.3 NP100 driver was buggy.
You may ask the University of California at Berkeley for the updated driver.
-- 
Pierre LAFORGUE
pierre@imag.imag.fr      pierre@imag.UUCP      uunet.uu.net!imag!pierre

paul@UXC.CSO.UIUC.EDU (Paul Pomes) (08/24/88)

I've also run into the problem using an Interlan 1010A card in a 4.3 BSD
VAX 11/780 with Van's TCP fixes.  Transfers kept coming up one byte short
in transferring the X.V11R2 split files from expo.lcs.mit.edu to
uxc.cso.uiuc.edu.  It was not a constant problem.  Retrying the transfer
would usually get the complete file.  One file did require six attempts
before a 512,000-byte copy was obtained rather than 511,999.  A cmp of the
two files showed the difference to be at the very end of the file.

Paul Pomes
Univ of Ill, CSO

brunner@SPAM.ISTC.SRI.COM (Thomas Eric Brunner) (08/24/88)

Dennis Bednar wrote asking for "ethernet sniffer" info, whether they
(etherfind, tcpdump) ran on Unix boxes, PCs, was source available, etc.

I've just gone through using two reasonable tools, the Network General
"Sniffer", and the Exelan "LANalyser", both PC based for ethernet
analysis, as well as using Van's Tcpdump (daily) and SMI's Etherfind
(only where tcpdump is not available) for the purpose of determining
why a diskless Sun fails under some conditions to boot when running
the 4.0 release of the Sun OS. So I'll share what little I know.

The "Sniffer" is a menu-driven package that runs on a PC, either lap-top
or the usual notion of a PC. It only uses 2 bytes of a packet for filtering,
so one's ability to form a reasonably complex "from/to/size/type" boolean
is limited, though I personally didn't find this limitation a problem at
all - having lived with "noisier" tools. Price, $19,000, $15,000 for the
single-board PC.

The "LANalyser" is less menu-oriented, also runs on a PC, either flavor,
and uses more bytes for filtering. Price, $15,000.

The LANalyser was the choice of SRI's link-level folks whose interest is
mostly in connectivity diagnosis.

The Sniffer was the choice of my own staff, whose interest is both connectivity
and network-level and above diagnosis. With it we found that the diskless
boot problem resulted from interactions between VMS hosts running TWG/SRI
ip software and the Sun's broadcast boot request. Details available upon
request.

Tcpdump is not available in source form; one can either ftp it from any of
the usual places or, if one is not able to ftp from, say, lbl or ucbarpa,
get it on tape from someone who is. It comes as an executable binary (Sun 3.x),
with awk scripts for digestion of its output, in tar format. On a lan with
a lot of hosts, getting a Sun just to run tcpdump seems very reasonable to
me -- I've seen a quote for a 3/50 (used) for $850... I use it to catch
the initiation of "hostile" and "questionable" events here - we've several
scripts to wrap it which may be in our ftp area (spam) if either I or Tim
remember to keep those versions current, and to diagnose some really bad
vendor cruft (line-at-a-time smtp, with a separate cr/lf packet!), etc.

Etherfind is available from SMI as part of their operating system release.
It supports fewer of the boolean constructions than tcpdump, and I really
avoid using it since Van's first release of tcpdump.

Another possibility is SMI's traffic routine, which is visual, a nice intro
to the activities on the wire, but little beyond that. Almost no filtering
is possible. Each of these uses the NIT interface, particular to SMI, which
has bugs. For those with source, putting metering into the device driver(s) is
an equivalent approach.

Others on this list can easily improve on this little note, in particular
someone from Network General and Exelan on pricing, "protocol suite" software
extra pricing, and capabilities.

Happy lan-ing Dennis!

Eric

steve@umiacs.UMD.EDU (Steven D. Miller) (08/24/88)

   The problem here is not in your Interlan card; I've had the same problem
with my Sun-3/60.  I did a packet spy once, and it appears that under certain
circumstances expo sends the FIN with a sequence number that is one too
low.

   The spy I made of the problem is enclosed below, so that others more
knowledgeable in the ways of TCP may check my reasoning.  I think expo is
running vanilla SunOS 3.4 TCP, but I'm by no means certain of that...

	-Steve

Spoken: Steve Miller    Domain: steve@mimsy.umd.edu    UUCP: uunet!mimsy!steve
Phone: +1-301-454-1808  USPS: UMIACS, Univ. of Maryland, College Park, MD 20742

pktnum 2483, timestamp 577459160 sec 410000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 228, id 95c5, off 0,
    ttl 18, protocol 6, sum e, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d52972,108d4e01
    data off 5, flags=10<ACK> window 1000,    sum 45a2, urgent 0
    TCP data length 512 (0x200) bytes

pktnum 2484, timestamp 577459160 sec 410000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b6f9, off 0,
    ttl 1e, protocol 6, sum dad9, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e01,1d52f72
    data off 5, flags=10<ACK> window 2000,    sum f076, urgent 0
    TCP data length 0 (0x0) bytes

pktnum 2485, timestamp 577459160 sec 530000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 228, id 95c6, off 0,
    ttl 18, protocol 6, sum d, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d52b72,108d4e01
    data off 5, flags=10<ACK> window 1000,    sum 81b0, urgent 0
    TCP data length 512 (0x200) bytes

pktnum 2486, timestamp 577459160 sec 530000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b6fa, off 0,
    ttl 1e, protocol 6, sum dad8, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e01,1d52f72
    data off 5, flags=10<ACK> window 2000,    sum f076, urgent 0
    TCP data length 0 (0x0) bytes

pktnum 2487, timestamp 577459160 sec 810000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 228, id 95c7, off 0,
    ttl 18, protocol 6, sum c, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d52d72,108d4e01
    data off 5, flags=10<ACK> window 1000,    sum 96d3, urgent 0
    TCP data length 512 (0x200) bytes

[expo sent 512 bytes after seq 1d52d72 ]

pktnum 2488, timestamp 577459160 sec 810000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b6fb, off 0,
    ttl 1e, protocol 6, sum dad7, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e01,1d52f72
    data off 5, flags=10<ACK> window 2000,    sum f076, urgent 0
    TCP data length 0 (0x0) bytes

[fnord acks that]

pktnum 2489, timestamp 577459160 sec 930000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 1b8, id 95c8, off 0,
    ttl 18, protocol 6, sum 7b, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d52f71,108d4e01
    data off 5, flags=19<FIN,PUSH,ACK> window 1000,    sum 2a8, urgent 0
    TCP data length 400 (0x190) bytes

[expo, whose sequence number was 1d52f72, now sends a FIN with the sequence
number one too low.]

pktnum 2490, timestamp 577459160 sec 930000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b6fc, off 0,
    ttl 1e, protocol 6, sum dad6, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e01,1d53102
    data off 5, flags=10<ACK> window 1e71,    sum f075, urgent 0
    TCP data length 0 (0x0) bytes

[fnord acks that]

pktnum 2491, timestamp 577459161 sec 70000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b6fe, off 0,
    ttl 1e, protocol 6, sum dad4, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e01,1d53102
    data off 5, flags=11<FIN,ACK> window 2000,    sum eee5, urgent 0
    TCP data length 0 (0x0) bytes

[fnord sends its fin, with seq # 108d4e02]

pktnum 2492, timestamp 577459161 sec 230000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 28, id 95cb, off 0,
    ttl 18, protocol 6, sum 208, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d53102,108d4e01
    data off 5, flags=11<FIN,ACK> window 1000,    sum fee5, urgent 0
    TCP data length 0 (0x0) bytes

pktnum 2493, timestamp 577459161 sec 250000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b6ff, off 0,
    ttl 1e, protocol 6, sum dad3, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e01,1d53103
    data off 5, flags=11<FIN,ACK> window 2000,    sum eee4, urgent 0
    TCP data length 0 (0x0) bytes

pktnum 2494, timestamp 577459161 sec 270000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 28, id 95cc, off 0,
    ttl 18, protocol 6, sum 207, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d53102,108d4e02
    data off 5, flags=11<FIN,ACK> window 1000,    sum fee4, urgent 0
    TCP data length 0 (0x0) bytes

pktnum 2495, timestamp 577459161 sec 410000 usec, len 54
Ethernet level: dst host 08:00:20:00:6f:74,
    src host 02:07:01:00:8a:17, type 800
IP header: version 4, header len 5,
    service 0, len 28, id 95cd, off 0,
    ttl 18, protocol 6, sum 206, src 121e00d4,
    dst 80087803
TCP header: source port 14,    dst port 477, <seq,ack> 1d53103,108d4e02
    data off 5, flags=10<ACK> window 1000,    sum fee4, urgent 0
    TCP data length 0 (0x0) bytes

pktnum 2496, timestamp 577459161 sec 410000 usec, len 54
Ethernet level: dst host 02:07:01:00:8a:17,
    src host 08:00:20:00:6f:74, type 800
IP header: version 4, header len 5,
    service 0, len 28, id b700, off 0,
    ttl 1e, protocol 6, sum dad2, src 80087803,
    dst 121e00d4
TCP header: source port 477,    dst port 14, <seq,ack> 108d4e02,0
    data off 5, flags=4<RST> window 0,    sum 41c9, urgent 0
    TCP data length 0 (0x0) bytes
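
The length fields in the trace above can be cross-checked against the
reported TCP data lengths.  A minimal sketch, reading the IP "len" value
as hexadecimal (which is consistent with the data lengths the tool prints)
and taking "header len 5" and "data off 5" as counts of 32-bit words; the
function name is hypothetical.

```python
def tcp_payload_bytes(ip_len_hex, ip_header_words=5, tcp_data_offset_words=5):
    """TCP payload size implied by the IP total length printed in the trace.

    Both header sizes are given in 32-bit words, so each word is 4 bytes.
    """
    total_length = int(ip_len_hex, 16)
    ip_header_bytes = 4 * ip_header_words
    tcp_header_bytes = 4 * tcp_data_offset_words
    return total_length - ip_header_bytes - tcp_header_bytes


print(tcp_payload_bytes("228"))  # 512, the full data segments from expo
print(tcp_payload_bytes("1b8"))  # 400, the final segment carrying the FIN
print(tcp_payload_bytes("28"))   # 0, the bare ACK/FIN/RST packets
```

In this trace the IP lengths and the TCP data lengths agree with each
other, so the oddity here is in the sequence numbers rather than in the
packet sizes.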

CLYNN@G.BBN.COM (08/25/88)

Steve,
	It looks like another instance of the "have to retransmit a
FIN, so better subtract one from the sequence #" bug. Note that the
response (maybe to pktnum 2483, but possibly to an earlier packet
since the timestamps on 2483 and 2484 are identical) in pktnum 2484
was to ack 1d52f72. Thus the receiver has already received the data
being retransmitted in pktnums 2485, 2487 and 2489.
    [Clearly useless retransmissions -- maybe (the timestamps are all
    very close) an instance of the "retransmit all unacked data on a
    retransmission timeout" algorithm, or (the timestamps are not
    exactly equal) a "retransmit on ack before updating/processing
    send-left & processing the retransmit queue" design deficiency].
It would have been nice if the trace began earlier, say on the first
transmission of the packet containing sequence number 1d53100; was
a FIN sent at the "same" time?
    [Note [fnord sends its fin, with seq  # 108d4e02] you mean 108d4e01.
    Also, notice that in pktnum  2492, a FIN is being  (re)transmitted at
    a different sequence number, 1d53102, than it was in pktnum 2489,
    1d53101 = 1d52f71+190.]

Charlie
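
The sequence-number arithmetic discussed above can be checked mechanically.
A minimal sketch, using the hex values from the trace and the standard TCP
rule that a FIN occupies one sequence number after any data in the same
segment; the helper name is hypothetical.

```python
SEQ_MODULUS = 1 << 32  # TCP sequence numbers wrap at 2**32


def next_seq(seq, data_len, fin=False):
    """Sequence number that follows a segment with data_len bytes (+1 for FIN)."""
    return (seq + data_len + (1 if fin else 0)) % SEQ_MODULUS


# pktnum 2487: 512 bytes starting at 1d52d72; fnord acks 1d52f72.
after_2487 = next_seq(0x1D52D72, 0x200)

# pktnum 2489: 400 bytes plus FIN, but starting at 1d52f71 (one too low).
fin_seq_2489 = next_seq(0x1D52F71, 0x190)            # where that FIN sits
ack_for_2489 = next_seq(0x1D52F71, 0x190, fin=True)  # what fnord acks in 2490

# The same segment had it started at the expected 1d52f72.
ack_if_correct = next_seq(0x1D52F72, 0x190, fin=True)

print(hex(after_2487))      # 0x1d52f72
print(hex(fin_seq_2489))    # 0x1d53101
print(hex(ack_for_2489))    # 0x1d53102
print(hex(ack_if_correct))  # 0x1d53103
```

The FIN in pktnum 2489 lands on 1d53101, while the FIN in pktnum 2492 is
sent at 1d53102, which is the one-off difference noted above.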

fletcher@cs.utexas.edu (Fletcher Mattox) (08/28/88)

In article <3334@imag.imag.fr> pierre@imag.UUCP (Pierre LAFORGUE) writes:
>In article <3210@cs.utexas.edu> fletcher@cs.utexas.edu (Fletcher Mattox) writes:
>>Has anybody else seen a 4.3BSD VAX with an Interlan Ethernet interface
>>drop a byte of data?  Well, that's what we're seeing.  
>>For example, if you
>>	% rsh remotehost cat 183_byte_file
>>and the remotehost is a 4.3/Interlan host, the rsh will fail.
>

>Your report is not very accurate;

Well, no.  The report is quite accurate.  Maybe it's not as complete
as it could have been, though.  :-)

	card:		Interlan BD-N11010, rev C, assy rev A, S.N A-103.
	transceiver:	Interlan NA1010
	driver:		@(#)if_de.c     7.1 (Berkeley) 6/5/86

It does appear that nobody else has seen this, so I'm still a little
puzzled.  Someone did mention that Interlan shipped some bad cards
about five years ago which had a similar problem.  Our card is
at least that old.  Anyway, I've quite worrying about it and just
replaced it with a DEUNA, since we have plenty of those.

Thanks to all who responded.

Fletcher

casey@admin.cognet.ucla.edu (Casey Leedom) (08/28/88)

In article <3237@cs.utexas.edu> fletcher@cs.utexas.edu (Fletcher Mattox)
 writes:
>	card:		Interlan BD-N11010, rev C, assy rev A, S.N A-103.
>	transceiver:	Interlan NA1010
>	driver:		@(#)if_de.c     7.1 (Berkeley) 6/5/86

  Well, I think you're going to have an enormous amount of difficulty
trying to use if_de.c with your Interlan.  Why don't you try if_il.c?
And as someone mentioned earlier, you should grab the latest copy of that
driver since there were some significant bugs in the copy distributed
with 4.3BSD.

Casey

fletcher@cs.utexas.edu (Fletcher Mattox) (08/28/88)

>>	driver:		@(#)if_de.c     7.1 (Berkeley) 6/5/86

Um, make that:

	driver:		@(#)if_il.c     7.1 (Berkeley) 6/5/86

casey@admin.cognet.ucla.edu (Casey Leedom) (08/29/88)

In article <15566@shemp.CS.UCLA.EDU> (casey@cs.ucla.edu) I write:
>   Well, I think you're going to have an enormous amount of difficulty
> trying to use if_de.c with your Interlan.  Why don't you try if_il.c?
> And as someone mentioned earlier, you should grab the latest copy of that
> driver since there were some significant bugs in the copy distributed
> with 4.3BSD.

> From: Chris Torek <chris@gyre.umd.edu>
> Apparently Interlan makes a DEUNA-style board.  Also, the major bugs
> were in if_np.c, not if_il.c ...

  Oops!  I should know better too.  Thanks for the correction.

Casey

scott@h-three.UUCP (scott) (08/31/88)

In article <3210@cs.utexas.edu> fletcher@cs.utexas.edu (Fletcher Mattox) writes:
>Has anybody else seen a 4.3BSD VAX with an Interlan Ethernet interface
>drop a byte of data?  Well, that's what we're seeing.  
 
I've seen Micom/Interlan Ethernet controllers drop bytes. I've seen
them lose a bit, left shifting subsequent data by a bit. Somehow, these
errors were not caught by any of their protocol error checking.
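
To picture what a lost bit does to the data that follows it, here is a
small simulation.  It is only an illustration, not a model of the Interlan
hardware: the function name is made up, and padding the tail with zero
bits is an assumption (a real controller would presumably shift in
whatever came next on the wire).

```python
def drop_bit(data, bit_index):
    """Remove one bit from a byte string; later bits shift left by one.

    The result is padded with zero bits to a whole number of bytes
    (an assumption made purely for illustration).
    """
    bits = "".join(f"{byte:08b}" for byte in data)
    shifted = bits[:bit_index] + bits[bit_index + 1:]
    shifted += "0" * (-len(shifted) % 8)
    return bytes(int(shifted[i:i + 8], 2) for i in range(0, len(shifted), 8))


original = b"ABCD"                # hex 41 42 43 44
damaged = drop_bit(original, 8)   # lose the first bit of the second byte

print(original.hex())  # 41424344
print(damaged.hex())   # 41848688
```

Every byte after the damage point changes (0x42 becomes 0x84, and so on),
which is why a single lost bit corrupts the rest of the packet rather than
one byte.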

Micom/Interlan claims that fixes are/were in the works. BTW, this was
observed on a Multibus I board.

-- 
Scott H. Crenshaw			scott%h-three@uunet.uu.net
h-three Systems Corporation             uunet!h-three!scott
POB 12557				100 Park Drive Suite 204
Research Triangle Park, NC 27607	(919) 549-8334