[net.unix-wizards] FIN_WAIT_2 Warning: JSQ "fix" bad!

mike@brl-tgr (Mike Muuss) (04/24/85)

The recently re-posted John Quarterman "fix" for hanging FIN_WAIT_2
connections is *bad*, in a subtle and dangerous way.  We had it installed
for several weeks before we noticed why it was bad.  I have commented
on the list before about this, but it is worth repeating.

A connection may operate correctly in FIN_WAIT_2 state forever.
A good example of this is the following command line:

	rsh tgr batchnews < /dev/null | unbatchnews

The RSH will sense EOF on stdin immediately, and do an advisory
close on the TCP connection *TO* machine "tgr", yet data will continue
to flow *FROM* machine "tgr" for hours.  This connection operates
in FIN_WAIT_2 state the whole time -- THIS IS CORRECT FUNCTION.
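
To make the half-close concrete, here is a minimal sketch in Python over the
loopback interface. It is not the rsh implementation; the function names
serve_after_eof and half_close_demo are hypothetical, made up for this
illustration. The client shuts down only its sending side (the "advisory
close" described above), and the server, having seen EOF, still delivers
all of its data over the same connection.

```python
import socket
import threading


def serve_after_eof(listener, chunks):
    """Wait for the peer's FIN (EOF), then keep sending on the same connection."""
    conn, _ = listener.accept()
    with conn:
        # recv() returning b"" means the client has half-closed its side.
        while conn.recv(4096):
            pass
        for chunk in chunks:
            conn.sendall(chunk)
    # Leaving the with-block closes the socket, which sends the server's own FIN.


def half_close_demo(chunks):
    """Half-close the sending side at once, then read everything the peer sends."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    worker = threading.Thread(target=serve_after_eof, args=(listener, chunks))
    worker.start()

    client = socket.create_connection(listener.getsockname())
    # Like rsh seeing EOF on stdin: send FIN but keep the receive side open.
    # Once the peer acknowledges the FIN, this end sits in FIN_WAIT_2.
    client.shutdown(socket.SHUT_WR)

    received = []
    while True:
        data = client.recv(4096)
        if not data:  # the server has finished and closed its side
            break
        received.append(data)

    client.close()
    worker.join()
    listener.close()
    return b"".join(received)
```

Calling half_close_demo([b"article 1\n", b"article 2\n"]) should return both
chunks joined together, even though the client closed its sending side before
the server wrote anything. A timer that tears down FIN_WAIT_2 connections
would cut off exactly this kind of transfer if it ran longer than the timeout.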

If you choose to install JSQ's fix anyway, just remember that
any RSH that you run from a shell file had better run quickly
(i.e., within the FIN_WAIT_2 timeout you pick), or the connection
will be broken for you.

Best,
 -Mike Muuss

jsq@ut-sally.UUCP (John Quarterman) (04/24/85)

Ok, folks, I have tried the scenario Muuss and Hemminger say breaks my
kludge, and they are right:  it does.  We do that kind of connection
so rarely that I never noticed.  We have connections with TOPS-20 hosts
so frequently that I *had* to have something, and not with a six hour
timeout, either.

I am currently trying Hemminger's fix in our systems.  I don't expect
to have any problems with it, as he has evidently run it for some time.
On the off chance that I do find some problem with it, I will report it.
Otherwise, I recommend using Hemminger's fix, not my kludge.

Now, I notice that I neglected to include my usual disclaimer in my
recent posting: that I used my kludge and gave it out to others only
because apparently nobody had found anything better.  I will assume
that this omission accounts for the flamage level of the followups.
However, note that I never used the word "fix", I did use the word
"kludge", and I clearly stated that the kludge violated the TCP spec.
A simple note from one of you who tried the kludge to me reporting
the problem with it at the time it was found would have sufficed, eh?
-- 

John Quarterman, jsq@ut-sally.ARPA, {ihnp4,seismo,ctvax}!ut-sally!jsq