[comp.protocols.tcp-ip] Why not use SO_KEEPALIVE?

PETEHIC@ACADVM1.UOTTAWA.CA (Pete Hickey) (11/17/90)

Why not use keep-alives?

I'm no expert by any means, but in my opinion it should be up to the
specific server or application to decide how long it wants to wait
before timing out.  Some applications may decide that if nothing
has come in from the socket for x seconds, they kill/close it.  The
application can choose that time if and when it wants.  It should
not be the job of the lower levels of the protocol.
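
An application-level idle timeout of the kind described above might look
roughly like this. It is only a sketch: the function name read_until_idle
and the 0.2-second limit are made up for illustration, and a local
socketpair stands in for a real TCP connection.

```python
import socket

IDLE_LIMIT = 0.2  # seconds of silence tolerated; hypothetical, chosen by the application


def read_until_idle(conn, idle_limit):
    """Read from conn until the peer closes or stays silent for idle_limit seconds."""
    conn.settimeout(idle_limit)
    chunks = []
    while True:
        try:
            data = conn.recv(4096)
        except socket.timeout:
            # Nothing arrived within the limit: the application, not TCP, gives up.
            return b"".join(chunks), "idle"
        if not data:
            # An empty read means the peer closed the connection in an orderly way.
            return b"".join(chunks), "closed"
        chunks.append(data)


# Demonstration on a connected local socket pair.
a, b = socket.socketpair()
b.sendall(b"hello")
received, reason = read_until_idle(a, IDLE_LIMIT)
a.close()
b.close()
```

The point of the design is that the limit is an ordinary application
parameter, so each server can pick its own value or have none at all.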

-Pete

karn@envy.bellcore.com (Phil Karn) (11/17/90)

In article <1990Nov16.164448.9918@bwdls61.bnr.ca>, pww@bwdls55.bnr.ca
(Peter Whittaker) writes:
|> So, are there in fact substantive reasons not to use SO_KEEPALIVE?

Yes.

|> the server application gets killed
|> (KILL -9) but the TCP socket stays up, connected to the remote client,
|> with the server side sending ACK packets every now and again, just enough
|> of them to be annoying.

This doesn't make sense. Under BSD UNIX, at least, when you kill a
process you automatically close its file descriptors and sockets. So
if an application has a TCP connection open, it will be closed.

The idea behind SO_KEEPALIVE is to "send ACK packets every now and
then".  (Actually, it sends one-byte data packets that are just before
the window, eliciting ACKs to indicate that the receiver is still
there.) The thinking is that you want to detect when the REMOTE end
has silently died.
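
For reference, turning the option on is a single socket-option call. A
minimal sketch in Python, whose socket module wraps the BSD setsockopt()
interface; the probe timing is left at whatever the host system defaults
to, and the variable names are illustrative only.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Ask TCP to probe the peer when the connection is idle.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Read the option back; a nonzero value means keepalives are enabled.
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0

sock.close()
```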

But there are some big problems with TCP keepalives:

1. They can generate a lot of unnecessary network traffic. It is
perfectly legit for an application to hold open a TCP connection for
weeks or months without sending any traffic, since TCP connections
normally occupy no resources other than 100 bytes or so of RAM in each
end host.  But if you have them send pings to each other every 30
seconds, then your idle connections can cost you a bundle - especially
if your path includes an X.25 network that charges you by the packet.

2. They reduce robustness by closing connections unnecessarily when
there is a temporary network outage, since network outages are usually
indistinguishable from remote host failures. And if the TCP connection
is idle (which is the only time you'd be sending keepalives anyway) why
should you care if the network goes down momentarily during that time?
All that matters is that it be there when your application has some
real data to send, but by gratuitously closing the connection for him
you've made life for the application designer that much more
difficult. If the *application* wants to give up after some interval,
then *it* should make that decision, not TCP.

3. How do you set the keepalive interval? Remember that your TCP and
application will have to work over arbitrary Internet paths. Who's to
say that 30 seconds (or 1 minute or 1 hour) is a reasonable interval
between keepalives? What's reasonable today might be very unreasonable
tomorrow.
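
Points 1 and 3 can be made concrete with a little arithmetic. The sketch
below counts the packets one idle connection generates in a week for the
three intervals mentioned above. It assumes only one end sends probes and
that every probe elicits exactly one ACK; the function name is made up for
illustration.

```python
def keepalive_packets(idle_seconds, interval_seconds):
    """Packets on the wire for one idle connection: each probe plus the ACK it elicits."""
    probes = idle_seconds // interval_seconds
    return 2 * probes


WEEK = 7 * 24 * 60 * 60  # seconds in one week

for label, interval in [("30 seconds", 30), ("1 minute", 60), ("1 hour", 3600)]:
    print(label, keepalive_packets(WEEK, interval))
# 30 seconds 40320
# 1 minute 20160
# 1 hour 336
```

On a network that bills by the packet, the difference between the first
and last line is the cost of picking the interval badly, and the count
doubles if both ends probe.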

Phil

pww@bwdls55.bnr.ca (Peter Whittaker) (11/17/90)

Hello, I've been hearing bad things about SO_KEEPALIVE recently, but 
these "bad things" have amounted to nothing more than opinion - no one
has yet been able to give a substantive reason not to use it (with the 
possible exception of "no one uses it, so it has never been debugged, so it's
probably flakey, so don't use it" - a la Catch-22).

So, are there in fact substantive reasons not to use SO_KEEPALIVE?

(This matter arose from an application that uses SO_KEEPALIVE, naturally
enough, with apparent negative impact:  the server application gets killed
(KILL -9) but the TCP socket stays up, connected to the remote client,
with the server side sending ACK packets every now and again, just enough
of them to be annoying.  When we eventually want to restart the server, we
have to reboot or search and destroy all clients, because the address is in
use.  As a fix, we're going to use SO_REUSEADDR as well as SO_KEEPALIVE.
If there are substantive reasons not to use SO_KEEPALIVE, we'll recommend
that it be removed from the software).
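
The planned fix might be sketched as follows. This is not the actual
server code: make_listener, the loopback address, and port 0 (let the
system pick a free port) are placeholders. The one detail that matters is
that SO_REUSEADDR has to be set before bind() for it to have any effect
on the "address in use" error.

```python
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket that may rebind a recently used local address."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); lets a restarted server reuse its local address.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Kept here only because the application described above already uses it.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    srv.bind((host, port))
    srv.listen(5)
    return srv


listener = make_listener()
reuse_on = listener.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0
bound_port = listener.getsockname()[1]
listener.close()
```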

Thanks,


--
Peter Whittaker      [~~~~~~~~~~~~~~~~~~~~~~~~~~]   Open Systems Integration
pww@bnr.ca           [                          ]   Bell Northern Research 
Ph: +1 613 765 2064  [                          ]   P.O. Box 3511, Station C
FAX:+1 613 763 3283  [__________________________]   Ottawa, Ontario, K1Y 4H7