dc@lupine.NCD.COM (Dave Cornelius) (08/21/90)
A sniffer trace of the MIT X windows demo 'maze', run on a lightly-loaded
host, connected via TCP to an X server on a heavily loaded (or slow) host,
reveals a bug in the TCP ACK generation in 4.2/4.3-tahoe derived TCP
implementations.

The result of the bug is that a TCP which receives many small packets can
appear to send an ACK for each incoming packet. These ACKs have the
property that the <ack> field _advances_ and the <window> field _declines_
by the same amount: the length of the last incoming segment.

The problem is that the available window in the receiver can be less than
the window which was advertised to the sender. The code in tcp_output.c
computes the difference between these two quantities and leaves the result
in a C int. This value can be negative in the case cited above. The code
then uses this negative int in a comparison with an unsigned short
(t_maxseg), and also in an expression involving division by an unsigned
long (sb_hiwat). The latter usage coerces the negative int to an unsigned
value, which steers the comparison against 35% of the max window the wrong
way, resulting in an outgoing ACK for each segment received (at least until
the receiving socket buffer is drained).

Granted, the 'maze' demo makes poor use of the TCP connection by causing
the host to emit enough 20-byte TCP packets to fill the X server's TCP
window. The BSD TCP code causes a few more packets than necessary to be
generated in reply to the client's bombardment :-)

Repeat by:
(1) On host A: run the MIT X server, and 5-10 processes each running
    'main(){while (1) ;}'
(2) On host B: run 'maze -display hosta:0'
(3) With an ethernet analyzer, watch the traffic. Watch the ACKs returning
    from host A after the bombardment of 20-byte X packets from host B.

This problem has been observed on:
    Sun 3/50 (SunOS 3.5) and SPARCstation 1 (SunOS 4.1)

Old code from tcp_output.c (near line 158 in 4.3.tahoe)

<     win = sbspace(&so->so_rcv);
<
<[ .... several state checks omitted... ]
<
<     /*
<      * Compare available window to amount of window
<      * known to peer (as advertised window less
<      * next expected input).  If the difference is at least two
<      * max size segments or at least 35% of the maximum possible
<      * window, then want to send a window update to peer.
<      */
<     if (win > 0) {
<         int adv = win - (tp->rcv_adv - tp->rcv_nxt);
<
<         if (so->so_rcv.sb_cc == 0 && adv >= 2 * tp->t_maxseg)
<             goto send;
<         if (100 * adv / so->so_rcv.sb_hiwat >= 35)
<             goto send;
<     }

Modified code:

>     win = sbspace(&so->so_rcv);
>
>[ .... several state checks omitted... ]
>
>     /*
>      * Compare available window to amount of window
>      * known to peer (as advertised window less
>      * next expected input).  If the peer could make some use
>      * of the window update, and the difference is at least two
>      * max size segments or at least 35% of the maximum possible
>      * window, then want to send a window update to peer.
>      */
>     if (win > 0) {
>         int adv = win - (tp->rcv_adv - tp->rcv_nxt);
>
>-->      if (adv >= 0) {
>             if (so->so_rcv.sb_cc == 0 && adv >= 2 * tp->t_maxseg)
>                 goto send;
>             if (100 * adv / so->so_rcv.sb_hiwat >= 35)
>                 goto send;
>-->      }
>     }

-----------
Dave Cornelius                   Network Computing Devices
                                 350 North Bernardo Ave
dc@ncd.com -or-                  Mountain View, CA, 94043
{uunet,ardent,mips}!lupine!dc    415-694-0675
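The coercion described in the post above can be reproduced outside the kernel. What follows is only a minimal sketch, not code from tcp_output.c: the function names update_wanted_old and update_wanted_new are hypothetical, hiwat stands in for so->so_rcv.sb_hiwat (taken to be an unsigned long, as the post states), and adv stands in for the signed difference the post computes. Only the 35% test is modelled.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for illustration; these names do not appear
 * in tcp_output.c.  adv models the signed difference from the post,
 * hiwat models so->so_rcv.sb_hiwat (assumed unsigned long).
 */

/* Old 35% test: the int product is converted to unsigned long before
 * the division, so a negative adv becomes a very large value. */
static int update_wanted_old(int adv, unsigned long hiwat)
{
    assert(hiwat > 0);              /* avoid dividing by zero */
    return 100 * adv / hiwat >= 35;
}

/* Patched 35% test: a negative adv is rejected before the division,
 * mirroring the "if (adv >= 0)" guard in the modified code. */
static int update_wanted_new(int adv, unsigned long hiwat)
{
    assert(hiwat > 0);              /* avoid dividing by zero */
    if (adv < 0)
        return 0;
    return 100 * adv / hiwat >= 35;
}
```

With adv = -20 and hiwat = 4096, update_wanted_old() returns 1 (a window update is wanted) while update_wanted_new() returns 0. For non-negative adv the two functions agree.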
jeffw@pyrnova (Jeff Wallace - S.E.) (08/22/90)
In article <1269@lupine.NCD.COM> dc@lupine.NCD.COM (Dave Cornelius) writes:
>A sniffer trace of the MIT X windows demo 'maze', run on a lightly-loaded
>host, connected via TCP to an X server on a heavily loaded (or slow)

(deleted)

Much ado about software.....

Am I missing something here, or is "comp.sys.tahoe" really not just about
CCI (ICL[Fuji?]), Harris, Unisys hardware boxes? I would assume from the
traffic that most people use this news group as a default for the
4.3BSD-Tahoe software release, and not for the "tahoe" hardware platform.

Since the CCI 6/32, Unisys 7000/xx, Harris, etc. is a dead (or should I say
out of date) product, maybe this news group should be moved out of comp.sys
and placed into comp.os.tahoe ?

-jeffw

BTW - tahoe (the hardware) originated @ CCI (named boxes after lakes).
guy@auspex.auspex.com (Guy Harris) (08/23/90)
>Am I missing something here, or is "comp.sys.tahoe" really not just about
>CCI (ICL[Fuji?]), Harris, Unisys hardware boxes? I would assume from the
>traffic that most people use this news group as a default for the 4.3BSD-Tahoe
>software release, and not for the "tahoe" hardware platform.

It was *intended* to be about those boxes, but some people are, I guess,
confused by the name, and think it's for 4.3-tahoe.

4.3-tahoe bugs should be posted, like any other 4BSD bugs, to
"comp.bugs.4bsd", so that even people who know that "comp.sys.tahoe" is for
Tahoe machines, and don't care about them and therefore don't read that
group, can find out about them. (If you don't get netnews, I don't know if
there's a mailing list behind "comp.bugs.4bsd" or not.)

>Since the CCI 6/32, Unisys 7000/xx, Harris, etc. is a dead (or should I say
>out of date) product, maybe this news group should be moved out of comp.sys
>and placed into comp.os.tahoe ?

No, there's nothing about 4.3-tahoe that makes it deserve a "comp.os"
newsgroup of its own. It's a UNIX version, so general discussion belongs in
"comp.unix.*"; it's a 4BSD version, so bug reports belong in
"comp.bugs.4bsd".

The fact that the hardware is out-of-date doesn't necessarily mean the
group should be moved, either; after all, "comp.sys.ridge" is still
around....

>BTW - tahoe (the hardware) originated @ CCI (named boxes after lakes).

Except for the 5/32, which was named after a pond. :-)
jeffw@pyrnova (Jeff Wallace - S.E.) (08/23/90)
>The fact that the hardware is out-of-date doesn't necessarily mean the
>group should be moved, either; after all, "comp.sys.ridge" is still
>around....
>

It's me who is out of date (losing hair fast)...

>>BTW - tahoe (the hardware) originated @ CCI (named boxes after lakes).
>
>Except for the 5/32, which was named after a pond. :-)

Nice to hear someone else has heard of an "Oneida" 5/32 before.

-jeffw
les@cci632.UUCP (Lance Shepard) (08/24/90)
The 5/32 is the ``Walden.'' The ``Oneida'' was the 5/20. The fault-tolerant
(multiple processors connected over a LAN, with shared disk, or diskless)
version of the 5/20 was called the ``Erie.'' I can't remember if the 5/30
(an ``improved'' 5/20) had a lake designation or not.

Lance Shepard