[mod.protocols.tcp-ip] Problem with TCP state transitions in CLOSE

KVC@ENGVAX.UUCP.UUCP (02/17/87)

I'm having a problem with an implementation detail of a TCP/IP and I hope
I can get some insights from those who've gone before.
 
The situation is as follows:
 
According to RFC 793, a CLOSE operation means "I have no more data to send."
"The user who CLOSEs may continue to RECEIVE until he is told that the
other side has CLOSED also."
 
Now, my client application does a CLOSE, the TCP sends a FIN and goes from
ESTABLISHED to FIN-WAIT-1.  When the FIN is ACKed, it goes from FIN-WAIT-1
to FIN-WAIT-2.  While in FIN-WAIT-1 or FIN-WAIT-2, data continues to be
received from the network and the application may continue to issue RECEIVEs
to get the data.  Finally, the other end of the connection does a CLOSE
and the TCP receives a FIN.  Receiving a FIN in FIN-WAIT-2 means transition
to TIME-WAIT.
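
For readers following along, the sequence just described can be written
down as a small transition table.  This is only a sketch in Python: the
event strings and the names TRANSITIONS, step, and run are made up for
illustration, and only the three transitions mentioned above are modeled
(RFC 793 defines others, such as FIN-WAIT-1 to CLOSING, that are left out).

```python
# Illustrative only: event names are invented, and just the transitions
# described in the post are listed.
TRANSITIONS = {
    ("ESTABLISHED", "user CLOSE"): "FIN-WAIT-1",      # TCP sends a FIN
    ("FIN-WAIT-1", "rcv ACK of FIN"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "rcv FIN"): "TIME-WAIT",           # TCP ACKs the FIN
}


def step(state, event):
    """Return the next state; events with no listed transition leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="ESTABLISHED"):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Note that "rcv data" is not in the table at all: receiving data in
FIN-WAIT-1 or FIN-WAIT-2 changes no state, which is exactly why the data
can still be sitting undelivered when the FIN shows up.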
 
The TCP receives a data segment while in FIN-WAIT-2, just before receiving
the FIN.  Now, it's entirely likely that the application won't get around to
doing its next RECEIVE (for what will be the last buffer of data sent over
from its buddy on the other end) until the TCP has gotten the FIN.
(Not only is this likely, it happens most of the time, depending on
system response time, to an application here.)
 
The question is, what is a TCP to do if it is in FIN-WAIT-2, it has some
user data that the user hasn't asked for yet, and a FIN comes in?  If it
goes immediately to TIME-WAIT (as this TCP does) the user will receive an error
because RFC 793 says that a RECEIVE call while in TIME-WAIT returns "error:
connection closing."  Seems to me that what we would like to be in here
is a state similar to CLOSE-WAIT, in that RECEIVEs are allowed, but "must
be satisfied by text already on hand, but not yet delivered to the user."
 
There are two possible work-arounds that I came up with, though I am not
particularly happy with them and will wait for some cogent feedback before
implementing.
 
1) Go to TIME-WAIT as the spec says, but handle a user RECEIVE call as described
   in CLOSE-WAIT.
2) If there is user data outstanding when you get a FIN in FIN-WAIT-2, delay
   the transition to TIME-WAIT until the user data has been delivered to
   the user (i.e. he came and asked for it all).
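
To make workaround 1 concrete, here is a rough sketch of a RECEIVE call
that goes to TIME-WAIT on the FIN but still hands over text already on
hand, the way RFC 793 describes for CLOSE-WAIT.  The class and method
names (Conn, on_data, on_fin, receive) are hypothetical and nothing here
is taken from the CMU/Tektronix code; returning None stands in for "the
call would block."

```python
from collections import deque


class Conn:
    """Toy connection that starts in FIN-WAIT-2 (our FIN has been ACKed)."""

    def __init__(self):
        self.state = "FIN-WAIT-2"
        self.rcv_queue = deque()   # in-order data not yet delivered to the user

    def on_data(self, data):
        self.rcv_queue.append(data)

    def on_fin(self):
        # Follow the spec: a FIN in FIN-WAIT-2 moves us to TIME-WAIT at once.
        if self.state == "FIN-WAIT-2":
            self.state = "TIME-WAIT"

    def receive(self):
        # Workaround 1: satisfy RECEIVE from text already on hand first,
        # regardless of state, as RFC 793 specifies for CLOSE-WAIT.
        if self.rcv_queue:
            return self.rcv_queue.popleft()
        if self.state in ("CLOSE-WAIT", "TIME-WAIT"):
            raise ConnectionError("error: connection closing")
        return None   # no data yet; a real RECEIVE would block here
```

The open question from the post remains: this sketch says nothing about
what happens to rcv_queue when the 2 MSL timer expires.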
 
My problems with 1) are that you aren't really following what the spec says
and that I'm not sure what to do if that user data is still there after
the 2 MSL timeout has passed.  I don't think it's right for the data to
go away after 2 MSL, since then the success of the application is dependent
on how fast it can RECEIVE, and I don't think the TCP is supposed to introduce
restrictions like that.  (Please tell me if I'm off base here!)
 
Maybe 2) is better?  Seems to me, though, that the state transition isn't
really handled as the spec says. Also, are you getting out of synch by
staying in FIN-WAIT-2 when everyone expects you to have moved on to TIME-WAIT?
If so, does anyone care?
 
Why did RFC 793 specifically describe what to do with user data in CLOSE-WAIT
but not in this case?
 
How have others handled this in their implementations?  I tried to look
through the Berkeley 4.3 code, but found it a little murky.  I decided they
didn't do (2) but am not sure exactly what voodoo they do do.
 
Thanks for any info,
 
        /Kevin Carosso              kvc%engvax.uucp@usc-oberon.usc.edu
         Hughes Aircraft Co.
 
ps.  If anyone cares, it's the CMU rewrite of the Tektronix IP/TCP code
     for VAX/VMS.

geof@decwrl.DEC.COM@apolling.UUCP (02/17/87)

 > The TCP receives a data segment while in FIN-WAIT-2, just before receiving
 > the FIN.  Now, it's entirely likely that the application won't get around to
 > doing its next RECEIVE (for what will be the last buffer of data sent over
 > from its buddy on the other end) until the TCP has gotten the FIN.

The SYN & FIN bits are bytes of control that are IN the data stream.
They consume sequence numbers, and only have meaning when taken in
sequence.  The presence of a FIN at the TCP level may cause a state
to be changed or a flag to be set, but the connection can not actually
close until the application has tried to RECEIVE the FIN.

Thus, there are two aspects of TCP state, the state at the TCP level,
and the state as transmitted by TCP to its client level.  The TCP
connection transitions to TIME-WAIT when it receives a FIN that is
ready to be delivered to the client (note that last caveat: if the
last data packet was lost, the state transition must wait until all
the data preceding the FIN has arrived).  But the TCP connection
does not tell its client that the connection is closed until the
client tries to RECEIVE the FIN.  In practice, this means that you
must maintain the connection state even after it has closed, to allow
the client to receive every last byte of data.
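
The caveat about a lost data packet can be sketched as follows.  This is
a simplified Python illustration, not any real implementation: Receiver
and its fields are invented names, segments are assumed not to overlap,
and retransmitted duplicates are simply dropped.  The point it shows is
that the FIN occupies a sequence number of its own and is acted on only
once everything before it has arrived.

```python
class Receiver:
    """Toy receive side of a connection sitting in FIN-WAIT-2."""

    def __init__(self, rcv_nxt):
        self.rcv_nxt = rcv_nxt          # next sequence number expected
        self.pending = {}               # seq -> bytes for out-of-order segments
        self.fin_seq = None             # sequence number occupied by the FIN
        self.deliverable = bytearray()  # in-order data awaiting a RECEIVE
        self.state = "FIN-WAIT-2"

    def segment(self, seq, data=b"", fin=False):
        if data and seq >= self.rcv_nxt:
            self.pending[seq] = data
        if fin:
            # The FIN sits just after the last data byte of its segment.
            self.fin_seq = seq + len(data)
        # Move any data that is now contiguous into the deliverable buffer.
        while self.rcv_nxt in self.pending:
            chunk = self.pending.pop(self.rcv_nxt)
            self.deliverable += chunk
            self.rcv_nxt += len(chunk)
        # Process the FIN only when all data preceding it is present.
        if self.fin_seq is not None and self.rcv_nxt == self.fin_seq:
            self.rcv_nxt += 1           # the FIN consumes one sequence number
            self.state = "TIME-WAIT"
```

For example, if the FIN arrives at sequence 105 while bytes 100 through
104 are still missing, the state stays FIN-WAIT-2 until the retransmitted
data fills the hole.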

===aside========

While I have your ear, let me sound off about a pet peeve that you
can preemptively fix in your implementation.  The TCP layer should be
willing to remain in zero window state arbitrarily long, as long as
control packets are still crossing the connection.  Don't time out a
connection just because the window is zero for a while.
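
One way to picture the rule, as a sketch only: base the timeout on how
long the peer has been completely silent, and leave the advertised window
out of the decision.  The names Sender, on_ack, tick, and idle_limit are
invented for this illustration, and the idle-limit policy itself is an
assumption, not something taken from any of the systems named here.

```python
class Sender:
    """Toy sender that aborts only when the peer stops responding."""

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit   # seconds of total silence tolerated (assumed policy)
        self.window = 0
        self.last_heard = 0.0
        self.aborted = False

    def on_ack(self, now, window):
        # Any ACK, even one still advertising a zero window, shows the
        # peer is alive, so it resets the silence clock.
        self.window = window
        self.last_heard = now

    def tick(self, now):
        # The window size is deliberately not consulted here.
        if now - self.last_heard > self.idle_limit:
            self.aborted = True
        return self.aborted
```

With this policy a peer that answers every probe with a zero window keeps
the connection open indefinitely, while a peer that goes silent is still
timed out.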

Last I checked, Apollo (SR 9.2.3), Tops-20, and Unix 4.2 all exhibit this bug.

===edisa========

- Geof Cooper
  IMAGEN

karels%okeeffe@UCBVAX.BERKELEY.EDU.UUCP (02/18/87)

As there was some question as to how 4.2/4.3 handle the receipt of a FIN
with unreceived data, I'll answer that question.  The buffering of received
data and synchronization with the user programs is handled in the (generic)
socket layer in 4BSD.  As data is received and ordered by the TCP,
it is handed to the socket layer to await a receive call.  When a FIN
is received (and all data preceding it), TCP processes the FIN and enters
TIME_WAIT state.  The user process may consume the data at its leisure,
and is notified of the end of data (FIN) after the preceding data is consumed.
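
The user-visible effect can be tried out with ordinary stream sockets.
The following Python sketch runs both ends over loopback; it exercises
whatever TCP the local system provides, not 4.2/4.3BSD itself, and the
function names peer and demo, the 0.2 second delay, and the 5 second
timeout are arbitrary choices for the illustration.  The client closes
its sending side first, the peer then sends one last buffer and closes,
and the client deliberately reads late.

```python
import socket
import threading
import time


def peer(listener):
    conn, _ = listener.accept()
    with conn:
        # Read until the client's FIN arrives (recv returns b"").
        while conn.recv(1024):
            pass
        conn.sendall(b"last buffer")   # data sent after the client's FIN
    # Leaving the with-block closes the socket, which sends the peer's FIN.


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))    # port 0: let the system pick a free port
    listener.listen(1)
    t = threading.Thread(target=peer, args=(listener,))
    t.start()

    client = socket.create_connection(listener.getsockname(), timeout=5)
    client.shutdown(socket.SHUT_WR)    # "I have no more data to send"
    t.join()                           # the peer has sent its data and closed
    time.sleep(0.2)                    # give the peer's FIN time to arrive first

    chunks = []
    while True:
        chunk = client.recv(1024)
        if not chunk:                  # end of data, reported only after the data
            break
        chunks.append(chunk)
    client.close()
    listener.close()
    return b"".join(chunks)
```

On a stack that behaves as described above, demo() should return the
peer's final buffer intact even though the FIN was received before the
application asked for the data.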

		Mike

aside to Geof Cooper re: zero windows:
===aside========

The bug to which you alluded, failure to tolerate persistence of a zero
window, does not occur in 4.2 as you describe it.  4.2 becomes impatient
and closes only if the zero window results from shrinking a window
into which it has already sent data.  It will also reset a connection
when new data is received but cannot be accepted because the user
process has closed and exited without waiting for the connection to close.

===edisa========