[comp.protocols.tcp-ip] closing half-open connections

sfp@APLPY.ARPA (Steven Parr) (09/29/87)

Hi,

We are having problems with TCP connections getting hung in the LAST_ACK state.
I believe the fault lies with the software at the other end of the connection,
so we are in the process of getting an update of that software.  However,
purchasing takes time, and in the meantime we keep retransmitting a FIN
every second or two until the next reboot.

So my question is this:

Does anyone know of a way to force closed a half-open connection such as this?

Seems to me that you should be able to change the state to FIN_WAIT_2 or
TIME_WAIT and the connection should go ahead and close itself on the next
expiration of the timer.  Has anyone tried anything like this?  Any suggestions
on how to go about it?  (Looks like adb may be useful, but I know almost nothing
about it.)

If it matters, we have a Pyramid running release 3.1 (without source).

Thanks in advance,
-Steve Parr

sfp@aplpy.arpa

mar@ATHENA.MIT.EDU (09/30/87)

I assume that Pyramid release 3.1 has the Berkeley 4.2 network layer.
In this case you can force the connections closed.  I wrote a program
to do this a while back, but it's full of unix source I shouldn't
redistribute to someone without a source license...

To do it by hand, run
	netstat -A 
and find the address of the PCB for the connection (this is the hex
number in the first column).  Then start adb with
	adb -w -k /vmunix /dev/kmem

and zero out the short word 8 bytes past the address of the PCB.  (The
offset of 8 is for the VAX; it may differ on the Pyramid.  You can check
it by looking at struct tcpcb in <netinet/tcp_var.h> and finding the
offset of t_state.)
	address+8/w 0
This forces the state of this connection to CLOSED.  The next time a
timer fires for that connection, it will notice that it is in the
closed state and deallocate it.  You can exit adb with
	$q
I suspect that this would also work on 4.3-based network layers,
although the bug that makes it necessary shouldn't exist there.
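
As a dry run of the by-hand steps above, here is a minimal sketch that
only prints the adb input instead of touching the kernel.  The function
name adb_close_cmds and the PCB address 80a3f40c are made up for
illustration, and the offset of 8 is the VAX value, which is an
assumption on any other machine.

```shell
#!/bin/sh
# Offset of t_state within struct tcpcb.  8 is the VAX value quoted
# above; verify it against <netinet/tcp_var.h> before using it elsewhere.
STATE_OFFSET=8

# adb_close_cmds PCB_ADDRESS
# Print (do not run) the adb commands that would zero t_state.
adb_close_cmds() {
	# adb reads numbers as hex by default, so the address shown by
	# netstat -A can be passed through unchanged.
	printf '%s+%s/w 0\n' "$1" "$STATE_OFFSET"
	printf '$q\n'
}

# Hypothetical PCB address, for illustration only.
adb_close_cmds 80a3f40c
```

Once the printed commands look right, they could be typed into (or
piped to) the adb session started as described above.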
					-Mark

andrew@mitisft.UUCP (10/01/87)

	While I don't have any suggestions for closing existing half-open
connections (although I think someone posted something a while back), I
do have a scenario which I have seen cause this, which can be traced to
an ambiguity in the RFC...

	Scenario:

1) Server sends FIN, gets ACK, enters FIN_WAIT_2.

2) Client sends a bunch of data.

3) Server's window size goes to zero due to normal flow control.

4) Client closes connection.
	At this point, client has data buffered, and needs a window update.
	FIN hasn't been sent since data is pending.

5) Client is now in LAST_ACK.  However, he ignores window updates, looking
	only for ACK of FIN he hasn't sent! The connection is effectively
	idle.

	Now, the RFC says all data should be sent after a close (pgs 49 & 61),
and that when a segment arrives in LAST_ACK state only the ACK of FIN should
be checked for (pg 73).

	4.3 seems to have "fixed" this problem by both flushing data on a close
and putting a timer on FIN_WAIT_2, along with having just about everybody
use "linger mode" where the close delays till the data drains (not the default).
I fixed it by looking at window updates during LAST_ACK; not exactly spec,
but harmless (apparently) in the normal cases....
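
To make step 5 concrete, here is a toy model (not kernel code) of what
the client does when a window update arrives in LAST_ACK, with and
without the fix.  The function name last_ack_step and its arguments are
invented for this sketch, and it assumes the FIN goes out as soon as the
send queue is empty.

```shell
#!/bin/sh
# last_ack_step HONOR_WINDOW_UPDATES(yes|no) QUEUED_BYTES NEW_WINDOW
# Models one window update arriving while data is still queued and the
# FIN has not yet been sent.
last_ack_step() {
	honor=$1; queued=$2; window=$3
	sent=0
	if [ "$honor" = yes ] && [ "$window" -gt 0 ]; then
		# Send as much queued data as the reopened window allows.
		if [ "$queued" -lt "$window" ]; then
			sent=$queued
		else
			sent=$window
		fi
	fi
	queued=$((queued - sent))
	# The FIN can only follow the last byte of queued data.
	if [ "$queued" -eq 0 ]; then fin=sent; else fin=pending; fi
	echo "queued=$queued fin=$fin"
}

last_ack_step no 512 1024	# update ignored: queued=512 fin=pending
last_ack_step yes 512 1024	# update honored: queued=0 fin=sent
```

With updates ignored, the queue never drains and the FIN never leaves,
which is the idle connection described in the scenario.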

Andrew

mkhaw@teknowledge-vaxc.UUCP (10/01/87)

Here's a /bin/sh driven adb script posted to the net a while back that
forces a socket to close:

<--- cut here --->
#! /bin/sh

# original from cdjohns@NSWC-G.ARPA
#
# TIMETODEATH expressed in decimal instead of hex
#	-- mkhaw@teknowledge-vaxc.arpa

# Use this script to force sockets in FIN_WAIT_2 state to close.
# It works by setting the 2MSL timer in the TCP Protocol Control Block (PCB)
# to a non-zero value.  The kernel then begins to decrement this value until
# it reaches zero, at which point the kernel forces a close on the socket and
# deletes the TCP PCB.  If both sides of the connection are hung, clearing one
# side will possibly clear the other.

# MSLOFFSET is the offset in the tcpcb record for the 2MSL timer.
# <netinet/tcp_var.h> describes the tcpcb record.
# This value is the number of bytes offset, expressed in hexadecimal.

MSLOFFSET=10

# TIMETODEATH is the number of half seconds until the connection is 
# closed.  This value is expressed in decimal and must be greater
# than zero.

TIMETODEATH=06

# Display netstat to get PCB addresses (first column).
echo 'Active connections
PCB      Proto Recv-Q Send-Q  Local Address      Foreign Address    (state)'
netstat -A | fgrep FIN_WAIT_2

echo
echo -n 'PCB address to terminate? '
read addr
echo

# Use adb on kernel to display the PCB of the specified address
adb -k /vmunix /dev/mem << SHAR_EOF
$addr\$<tcpcb
\$q
SHAR_EOF

# Check to see if this was the correct address and PCB. state should be
# 8 for LAST_ACK, 9 for FIN_WAIT_2
echo
echo 'state = 9 = FIN_WAIT_2'
echo -n 'Is this the correct PCB (y/n)? '
read ans
echo
case $ans in
  [Yy]*)
	;;
  *)
	echo 'No Changes.'
	exit
	;;
esac

# Use adb on kernel to set the 2MSL timer for the PCB
adb -k -w /vmunix /dev/mem << SHAR_EOF
$addr+$MSLOFFSET/w 0t$TIMETODEATH
\$q
SHAR_EOF

# Use these lines in place of the above for testing the script.
#adb -k  /vmunix /dev/mem << SHAR_EOF
#$addr+$MSLOFFSET/x 
#\$q
#SHAR_EOF

echo
echo "Connection will be terminated in `expr $TIMETODEATH / 2` seconds."
echo 
<--- cut here --->
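
The two constants in the script can be sanity-checked with a little
arithmetic.  The sketch below assumes a VAX-style tcpcb layout: two
4-byte queue pointers, then the short t_state, then an array of short
timers with the 2MSL timer at index 3.  That layout is an assumption
here; check <netinet/tcp_var.h> on your own system.  The function names
are made up for the example.

```shell
#!/bin/sh
# Assumed sizes and layout (VAX); not read from any real header.
PTR=4		# bytes per pointer
SHORT=2		# bytes per short
MSL_INDEX=3	# assumed index of the 2MSL timer in the timer array

# Byte offset of t_state: just past the two queue pointers.
state_offset() {
	echo $((2 * PTR))
}

# Byte offset of the 2MSL timer: past t_state, then MSL_INDEX shorts in.
msl_offset() {
	s=$(state_offset)
	echo $((s + SHORT + MSL_INDEX * SHORT))
}

printf 't_state offset: %d decimal, %x hex\n' "$(state_offset)" "$(state_offset)"
printf '2MSL offset: %d decimal, %x hex\n' "$(msl_offset)" "$(msl_offset)"

# TIMETODEATH counts half-second ticks, so 6 ticks is 3 seconds.
echo "TIMETODEATH=6 is $((6 / 2)) seconds"
```

Under those assumptions the results agree with the offset of 8 used for
t_state earlier in the thread and with MSLOFFSET=10 (hex) in the script.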

Mike Khaw
-- 
internet:  mkhaw@teknowledge-vaxc.arpa
usenet:	   {uunet|sun|ucbvax|decwrl|uw-beaver}!mkhaw%teknowledge-vaxc.arpa
USnail:	   Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303

thomson@uthub.UUCP (10/05/87)

In article <247@mitisft.Convergent.COM> andrew@mitisft.Convergent.COM (Andrew Knutsen) writes:

 >	While I don't have any suggestions for closing existing half-open
 >connections (although I think someone posted something a while back), I
 >do have a scenario which I have seen cause this, which can be traced to
 >an ambiguity in the RFC...

...
 >4) Client closes connection.
 >	At this point, client has data buffered, and needs a window update.
 >	FIN hasn't been sent since data is pending.
 >
 >5) Client is now in LAST_ACK.  However, he ignores window updates, looking
 >	only for ACK of FIN he hasn't sent! The connection is effectively
 >	idle.
 >
 >	Now, the RFC says all data should be sent after a close (pgs 49 & 61),
 >and that when a segment arrives in LAST_ACK state only the ACK of FIN should
 >be checked for (pg 73).

The problem is really with the implementation, not the RFC.
A TCP is not supposed to enter LAST_ACK until it has sent the FIN.
From pg. 61, it should remain in CLOSE_WAIT state "... until all preceding SENDs
have been segmentized; then send a FIN segment, enter [ LAST_ACK ] state".
The actual document said "enter CLOSING state", obviously a typo.

Having said all that, it may well be that the easiest way to handle this
is to accept window updates while in LAST_ACK.
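
The ordering the RFC asks for can be sketched as a tiny transition
function.  This is only an illustration of the pg. 61 rule with LAST_ACK
substituted for the typo; the name on_close and its arguments are
invented here, and states other than CLOSE_WAIT are deliberately not
modeled.

```shell
#!/bin/sh
# on_close STATE QUEUED_BYTES
# Print the action taken on a CLOSE call and the resulting state.
on_close() {
	case "$1:$2" in
	CLOSE_WAIT:0)
		# Nothing left to segmentize: the FIN may go out now.
		echo "send FIN -> LAST_ACK" ;;
	CLOSE_WAIT:*)
		# Data still queued: remember the close, stay put.
		echo "queue close -> CLOSE_WAIT" ;;
	*)
		echo "not modeled -> $1" ;;
	esac
}

on_close CLOSE_WAIT 512	# queue close -> CLOSE_WAIT
on_close CLOSE_WAIT 0	# send FIN -> LAST_ACK
```

A TCP that behaves this way keeps processing window updates while data
is pending, because it is still in CLOSE_WAIT at that point.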
-- 
		    Brian Thomson,	    CSRI Univ. of Toronto
		    utcsri!uthub!thomson, thomson@hub.toronto.edu