[comp.protocols.tcp-ip] What seems to be a glitch in the TCP spec.

jbvb@VAX.FTP.COM (James Van Bokkelen) (02/24/89)

Consider the following situation:  A TCP connection is carrying
intermittent, bidirectional traffic.  Host A has just sent a burst
which filled host B's window.  B has enough data for a small segment,
piggybacking the Ack and indicating a window of 0.  A replies with an
Ack, just as B's application consumes the data and B sends a segment
to re-open the window.  B's implementor has chosen to retransmit the
first segment of un-acked data (if any) with window updates and Acks
(B's cost/packet is much larger than the cost/byte).  A's implementor
has exactly followed pg. 69 of RFC 793, which says to discard segments
with obsolete data before checking the Ack or Window values.

     Host A						    Host B
1.	    <-	Seq 100, Ack 200, Data 20, Window 0

2.	     	Seq 200, Ack 120, Data 0, Window 4076	-> (crosses w/3)

3.	    <-	Seq 100, Ack 200, Data 20, Window 4096	(Dropped per pg. 69)

4.		Seq 200, Ack 120, Data 0, Window 4076   -> (ignored, pg. 72)

Segments 2 and 3 cross in transit.  A drops 3 before checking either the
Ack or the window and sends 4, which B drops as a duplicate Ack.  At
this point, A has more data to send, but B doesn't.  Everyone has followed
the spec, but A didn't see B's window re-open, and the connection will
sit idle until A sends a window-probe segment.  The user is unhappy,
because a TCP-based window system glitches.
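
The exchange above can be modelled in a few lines.  What follows is only
a sketch, not anyone's actual TCP: the names Segment, Tcb, acceptable and
receive_strict are hypothetical, sequence numbers are plain integers with
no wraparound, and the Ack step is reduced to the window update (using
the corrected SEG.ACK >= SND.UNA test discussed under solution 4 below).
Host A's state is taken just after it has received segment 1.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int      # SEG.SEQ
    ack: int      # SEG.ACK
    length: int   # SEG.LEN (data bytes)
    window: int   # SEG.WND


@dataclass
class Tcb:
    snd_una: int  # SND.UNA
    snd_nxt: int  # SND.NXT
    snd_wnd: int  # SND.WND, the peer's window as last seen
    rcv_nxt: int  # RCV.NXT
    rcv_wnd: int  # RCV.WND


def acceptable(tcb, seg):
    """Sequence-number acceptability test (no wraparound handling)."""
    lo, hi = tcb.rcv_nxt, tcb.rcv_nxt + tcb.rcv_wnd
    if seg.length == 0:
        return seg.seq == lo if tcb.rcv_wnd == 0 else lo <= seg.seq < hi
    if tcb.rcv_wnd == 0:
        return False
    last = seg.seq + seg.length - 1
    return lo <= seg.seq < hi or lo <= last < hi


def receive_strict(tcb, seg):
    """Check the sequence number first; Ack and Window only afterwards."""
    if not acceptable(tcb, seg):
        return "dropped"  # an Ack goes out; SEG.ACK and SEG.WND are never read
    if tcb.snd_una <= seg.ack <= tcb.snd_nxt:
        tcb.snd_una = seg.ack
        tcb.snd_wnd = seg.window
    return "processed"


# Host A after segment 1: everything it sent is acked, B's window is 0.
host_a = Tcb(snd_una=200, snd_nxt=200, snd_wnd=0, rcv_nxt=120, rcv_wnd=4076)
segment_3 = Segment(seq=100, ack=200, length=20, window=4096)

result = receive_strict(host_a, segment_3)
print(result, host_a.snd_wnd)  # prints "dropped 0"
```

Segment 3 ends at sequence 119, one byte short of RCV.NXT, so it fails the
test and A never sees the 4096-byte window it carries.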

I see four possible solutions:

1. Tell the user that he has found a glitch in the spec, and he should
be glad that heterogeneous networking works at all.  See Figure 1.

2. Amend the spec to warn that "window updates and acks may be ignored
if they accompany retransmitted data; therefore this should be avoided".

3. Amend the spec to require that the first window-probe segment be sent
when the window stays zero for two round-trip times.  If the probe is
rejected, try again in a couple of minutes.

4. Amend the spec to require that, when a segment is about to be dropped
per pg. 69, if the segment is immediately to the left of the window (i.e.
(SEG.SEQ + SEG.LEN) == RCV.NXT), SEG.ACK should be checked, and if it
is valid (SEG.ACK >= SND.UNA), both the Ack and the Window should be
processed before the segment is discarded and the obligatory Ack sent.
Note that pg. 72 of the RFC has an error.  As corrected in the upcoming
"Requirements for Internet Hosts" RFC, the send window should be updated
even when SEG.ACK == SND.UNA (otherwise TCP could not continue without
a window probe once the send window went to 0).
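
Solution 4 might be sketched as follows.  This is an illustration under
stated assumptions, not the code from any product: Segment, Tcb,
acceptable and receive_relaxed are hypothetical names, sequence numbers
are plain integers with no wraparound, and the upper bound
SEG.ACK <= SND.NXT is an addition of this sketch so that an ack of unsent
data is never applied.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seq: int      # SEG.SEQ
    ack: int      # SEG.ACK
    length: int   # SEG.LEN (data bytes)
    window: int   # SEG.WND


@dataclass
class Tcb:
    snd_una: int  # SND.UNA
    snd_nxt: int  # SND.NXT
    snd_wnd: int  # SND.WND, the peer's window as last seen
    rcv_nxt: int  # RCV.NXT
    rcv_wnd: int  # RCV.WND


def acceptable(tcb, seg):
    """Sequence-number acceptability test (no wraparound handling)."""
    lo, hi = tcb.rcv_nxt, tcb.rcv_nxt + tcb.rcv_wnd
    if seg.length == 0:
        return seg.seq == lo if tcb.rcv_wnd == 0 else lo <= seg.seq < hi
    if tcb.rcv_wnd == 0:
        return False
    last = seg.seq + seg.length - 1
    return lo <= seg.seq < hi or lo <= last < hi


def receive_relaxed(tcb, seg):
    """Before dropping, honour Ack and Window of a segment that ends
    exactly at RCV.NXT (immediately to the left of the window)."""
    if not acceptable(tcb, seg):
        just_left = seg.seq + seg.length == tcb.rcv_nxt
        if just_left and tcb.snd_una <= seg.ack <= tcb.snd_nxt:
            tcb.snd_una = seg.ack
            tcb.snd_wnd = seg.window  # updated even when SEG.ACK == SND.UNA
            return "dropped, ack and window used"
        return "dropped"
    if tcb.snd_una <= seg.ack <= tcb.snd_nxt:
        tcb.snd_una = seg.ack
        tcb.snd_wnd = seg.window
    return "processed"


host_a = Tcb(snd_una=200, snd_nxt=200, snd_wnd=0, rcv_nxt=120, rcv_wnd=4076)
segment_3 = Segment(seq=100, ack=200, length=20, window=4096)

result = receive_relaxed(host_a, segment_3)
print(result, host_a.snd_wnd)  # prints "dropped, ack and window used 4096"
```

The data in segment 3 is still discarded and the obligatory Ack is still
sent; only the Ack and Window fields are salvaged.  A duplicate that ends
further to the left than RCV.NXT is dropped untouched, as before.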

I don't like solution 1 (after all, I want the customer's money).  I have
discussed this matter with some other TCP implementors, who don't like
solution 2, particularly when applied to Telnet connections.  Solution
3 is partially implemented in some TCPs (the first probe occurs after
about 4 sec.), but there is still a glitch.  I've fully implemented
solution 4, and have two years of experience with it (in a commercial
product).  It seems to work, and the only objection I can think of is
a minuscule increase in the probability of the connection being reset
due to an old duplicate segment (the mechanism is "ack of unsent data",
not accepting bad data, and the odds of this are still way below those
of damaged data escaping detection by the TCP checksum).

Comments?  In particular, have I missed something relevant in RFC 793?
Has someone else already addressed this issue in some publication other
than an RFC?

One TCP implementor I discussed this with asked:  "Is there any particular
reason why we couldn't amend the spec to say 'check all acks, regardless
of the sequence field?'?"  Any comments on this?

James B. VanBokkelen		26 Princess St., Wakefield, MA  01880
FTP Software Inc.		voice: (617) 246-0900  fax: (617) 246-0901

CERF@A.ISI.EDU (02/26/89)

James,

thanks for taking time to describe the "glitch" scenario carefully.
The reasoning behind the advice to ignore the "old" packet is
that it isn't clear how old the ACK and WINDOW information is.
We were concerned about disorderly arrivals in which window information
is delivered in reverse of the order sent (e.g. opening the window
and then closing it again). If we follow your rule 4, is there a
scenario in which disorderly arrival leaves A believing the window
is closed when it isn't? Of course, in that case, the probe mechanism
should provide more up to date information. 

In general, the probe is needed - I don't think there is any debate
on that point. The question is whether your slightly relaxed rules
for processing ACKs and WINDOWs are a net gain or open the door to
new glitches. Intuition suggests that your "hack" [I don't mean
this in any pejorative sense] probably does more good than harm,
though I confess it feels slightly more pragmatic than my
puritanical attitudes allow me to enjoy.

Let's see what the reactions are from other practitioners.

Vint Cerf

narten@PURDUE.EDU (Thomas Narten) (02/26/89)

>Consider the following situation:  A TCP connection is carrying
>intermittent, bidirectional traffic.  Host A has just sent a burst
>which filled host B's window.  B has enough data for a small segment,
>piggybacking the Ack and indicating a window of 0.  A replies with an
>Ack, just as B's application consumes the data and B sends a segment
>to re-open the window.  B's implementor has chosen to retransmit the
>first segment of un-acked data (if any) with window updates and Acks
>(B's cost/packet is much larger than the cost/byte).  A's implementor
>has exactly followed pg. 69 of RFC 793, which says to discard segments
>with obsolete data before checking the Ack or Window values.

James:

Did you stumble across this example in an actual case?  It strikes me
that it should not happen under "normal" conditions.  The keyword is
"retransmit".  The above scenario has B piggybacking the window update
on a retransmission.  But this implies 1)  B's retransmit timer has
fired before the ACK for the first transmission has returned, or 2) B,
aware that the cost per packet is higher than the cost per character,
thinks it can get ahead by including some of the already transmitted data.

Case 1 implies the retransmit timer is broken (in which case said
scenario is the least of your worries) or that the transmission path
is lossy (again, probably more serious than the described problem).

Rather than a glitch in the spec, wouldn't this qualify as a feature of
case 2?  Which RFCs suggest retransmitting the first segment of
un-acked data (if present) when sending window updates and ACKs?

Thomas

jbvb@VAX.FTP.COM (James Van Bokkelen) (02/27/89)

   Solution (4), honoring the ACK and window of a pkt rcvd just to the
   left of the receive window, seems to be the de facto standard.  As I'm
   sure you know, BSD TCP does this.  So do the TCPs developed locally.

   Philip Koch
   (Philip.Koch@Dartmouth.EDU)

If this is the case, I wish that the Sun and HP TCP mungers hadn't chosen to
change their BSD-derived code to exact conformance with pg. 69 of RFC 793.

   From: Thomas Narten <narten@purdue.edu>

   Did you stumble across this example in an actual case?  It strikes me
   that it should not happen under "normal" conditions.  The keyword is
   "retransmit".  The above scenario has B piggybacking the window update
   on a retransmission.  But this implies 1)  B's retransmit timer has
   fired before the ACK for the first transmission has returned, or 2) B,
   aware that the cost per packet is higher than the cost per character,
   thinks it can get ahead by including some of the already transmitted data.

My problem is indeed a real-world one, observable with an X-windows server
running on the PC and communicating with HP-UX 9000/350 and Sun 3/50
X clients.  The Sun is running 3.5; I don't know the HP release, but I can
obtain it if anyone (are support people listening?) is interested.

   Rather than a glitch in the spec, wouldn't this qualify as a feature of
   case 2?  Which RFCs suggest retransmitting the first segment of
   un-acked data (if present) when sending window updates and ACKs?

Yes, it is a feature of case 2 (I am retransmitting the data before the
timer fires).  There are complicated reasons (most historical, relating
to the PCIP ancestry of the TCP implementation) why it is much easier
to do this, so we always have.  I freely admit to doing something that
wasn't proven (right or wrong) by anyone else, but 793 doesn't tell me
not to, either.

I am willing to accept "don't include window updates or acks with retransmits",
(my solution (2)) if this is the consensus, but I have seen some significant
support for checking the window and ack if the segment is only slightly
obsolete, or even regardless of SEG.SEQ.  Many implementers have faced this
issue, but the answers they chose haven't been published.  I want to air the
issue because we have unhappy users, and the fact that 793 allows the glitch
means that they won't be the last people bitten by it.  We may well have to
implement separate window update segments anyway, just because of relative
company sizes...

James B. VanBokkelen		26 Princess St., Wakefield, MA  01880
FTP Software Inc.		voice: (617) 246-0900  fax: (617) 246-0901

jbvb@VAX.FTP.COM (James Van Bokkelen) (02/27/89)

   From: CERF@A.ISI.EDU

   ....
   We were concerned about disorderly arrivals in which window information
   is delivered in reverse of the order sent (e.g. opening the window
   and then closing it again). If we follow your rule 4, is there a
   scenario in which disorderly arrival leaves A believing the window
   is closed when it isn't? Of course, in that case, the probe mechanism
   should provide more up to date information. 

As I see it, there have always been cases where disorderly arrival of window
updates can leave one host believing the window is closed when it shouldn't.
Consider the unidirectional case:

1.		Seq 100, Ack 200, Data 50, Window 4096 ->

2.	    <-	Seq 200, Ack 150, Data 0, Window 0

3.	    <-	Seq 200, Ack 150, Data 0, Window 50

If 3 arrives before 2, and the sender doesn't have more data ready to go
before 2 shows up, the window looks closed.  Of course, it shrank to get
there, but there are enough TCPs that shrink the window that the sender
will probably manage to deal with it.  Probes will restart the data flow
just fine, but with a glitch.
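
The reordering case can be checked against RFC 793's window-update test
(update when SND.WL1 < SEG.SEQ, or when SND.WL1 = SEG.SEQ and
SND.WL2 =< SEG.ACK).  The sketch below is illustrative only: Sender,
update_window and final_window are hypothetical names, the sender's
starting values (window 50, WL1 200, WL2 100) are assumptions, and
wraparound and the Ack-acceptability check are left out.

```python
from dataclasses import dataclass


@dataclass
class Sender:
    snd_wnd: int  # SND.WND
    snd_wl1: int  # SND.WL1: SEG.SEQ of the last window update
    snd_wl2: int  # SND.WL2: SEG.ACK of the last window update


def update_window(s, seq, ack, window):
    """RFC 793 window-update test (no wraparound handling)."""
    if s.snd_wl1 < seq or (s.snd_wl1 == seq and s.snd_wl2 <= ack):
        s.snd_wnd, s.snd_wl1, s.snd_wl2 = window, seq, ack


def final_window(arrivals):
    """Feed (seq, ack, window) tuples to a fresh sender, in arrival order."""
    s = Sender(snd_wnd=50, snd_wl1=200, snd_wl2=100)  # assumed start state
    for seq, ack, window in arrivals:
        update_window(s, seq, ack, window)
    return s.snd_wnd


seg_2 = (200, 150, 0)   # Seq 200, Ack 150, Window 0
seg_3 = (200, 150, 50)  # Seq 200, Ack 150, Window 50

print(final_window([seg_2, seg_3]), final_window([seg_3, seg_2]))  # prints "50 0"
```

Because segments 2 and 3 carry the same Seq and Ack, the test cannot tell
which is newer, so whichever arrives last wins.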

Part of the problem is that X may be the first widely-used application to
provide a fast, intermittent bi-directional load to TCP.  The remainder is
because PC interfaces have to offer 1-segment windows to survive when talking
to faster hosts.  When the window is 2*MSS or more, this is much less likely
to happen.

James B. VanBokkelen		26 Princess St., Wakefield, MA  01880
FTP Software Inc.		voice: (617) 246-0900  fax: (617) 246-0901