[comp.protocols.tcp-ip] Keep-Alive within TCP

cmorris@ingres.com (Colin Morris) (09/07/90)

Having served my four-year sentence in the ISO TP4 world, I've suddenly been
unleashed on TCP. To my great surprise, I understand not all TCP
implementations support "keep-alive". For example, SUN's PC-NFS version
of TCP appears not to. How widespread a "problem" is this? What well-known
implementations don't support this?

--
Colin Morris,                        cmorris@ws2s.ingres.com
Ingres Corporation,
Alameda, California, U.S.A.          Official Hooligan, Italia '90

craig@bbn.com (Craig Partridge) (09/07/90)

In article <1990Sep7.002637.6209@ingres.Ingres.COM> cmorris@ws2s.Ingres.COM (Colin Morris) writes:
>Having served my four-year sentence in the ISO TP4 world, I've suddenly been
>unleashed on TCP. To my great surprise, I understand not all TCP
>implementations support "keep-alive". For example, SUN's PC-NFS version
>of TCP appears not to. How widespread a "problem" is this? What well-known
>implementations don't support this?

    Why is lack of keep-alives a problem? In principle, unless an application
tries to send data, why ping the network to see if the route is up (if the
route is down, it may come up again before you send more data)?

Craig Partridge
Vice-Chairman, I Hate Keep-Alives Association

[For those not in the know, Phil Karn is chair of the IHKAA, and Mike
Padlipsky is founder, previous-chair, and honorary fellow.  We oughta
make T-shirts some time... :-)]

702WFG@SCRVMSYS.BITNET (bill gunshannon) (09/10/90)

>>of TCP appears not to. How widespread a "problem" is this? What well-known
>
>    Why is lack of keep-alives a problem? In principle, unless an application
>tries to send data, why ping the network to see if the route is up (if the
>route is down, it may come up again before you send more data)?
>
>Craig Partridge
>Vice-Chairman, I Hate Keep-Alives Association
>
>[For those not in the know, Phil Karn is chair of the IHKAA, and Mike
>Padlipsky is founder, previous-chair, and honorary fellow.  We oughta
>make T-shirts some time... :-)]


Actually, I'm surprised Phil didn't jump in here and answer this one first.
Must be on vacation or something...


bill gunshannon
(closet member IHKAA)

                                          bill gunshannon
                                       702WFG@SCRVMSYS.BITNET

gwilliam@SH.CS.NET (George Williams) (09/10/90)

Please note the following in regard to your query:

 () "keep-alive" is used rather loosely in the TCP/UNIX environment.
    (Correct and enlighten me as appropriate.)

    1. I have seen it used to describe an option under UNIX that determines
       whether, and for how long, to "re-try" a client-initiated
       connection request to a server process. This is usually done under
       the covers, i.e. transparently to the requesting process.

    2. Some vendors implement it, again as an option, to periodically
       poll the other end of an established connection in order to
       solicit a response indicating "I am still here and alive" from a peer
       TCP transport. This feature has proven useful to applications
       designed for parallelism or some degree of concurrency relative to
       higher-level processing requirements.
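
    Sense (2) is what BSD-derived socket interfaces expose as the
    SO_KEEPALIVE option. A minimal sketch, assuming a Python socket module
    on a BSD-style stack; the helper name enable_keepalive and its
    parameters are illustrative, and the TCP_KEEP* tuning options are
    platform-specific, so the code only uses them where they exist:

```python
import socket

def enable_keepalive(sock, idle=None, interval=None, count=None):
    """Ask the local TCP to probe the peer when the connection is idle.

    enable_keepalive is a hypothetical helper name, not a standard API.
    """
    # SO_KEEPALIVE is the portable on/off switch for keep-alive probes.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-connection tuning is not available everywhere; apply each
    # option only if this platform defines it and a value was given.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None and value is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

    # Report whether the stack accepted the request.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

    Without the tuning values the system-wide defaults apply, and on many
    stacks the default idle period is on the order of hours.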


    Given the above, it would be enlightening to one such as myself if you
    would say in which, if either, of the above senses you are using the
    term "keep-alive", i.e. if you could define the context of your query
    for the specific implementation.

    In the case of (1) above, having this feature has proven beneficial and
    has helped to simplify interface-level programming logic, based on prior
    (subjective) development work in this area.


    As far as (2) goes, again, it proved to be a useful feature, in lieu of a
    timeout parameter on network read and write calls in most UNIX
    implementations, for avoiding "hung" processes and for process management.
    In distributed computing environments, determining 'when' a
    process/service/application is 'really' alive can be a problem depending
    on what OS/CPU combination a process is executing under, just to mention
    one consideration. I won't mention any vendor by name (some
    of the observed problems in this area may have been corrected, even as we
    speak), but some OS's have been noted to max out at or approach 100%
    CPU utilization as a result of process- or application-level loops that
    retry after network open/read/write failures. Granted that this
    indicates poor design at the application level, network
    software developers have to take the broad(er) perspective that programmers
    don't want to know the details of underlying protocols, much the same
    as we don't necessarily want to get into application-specific
    implementation details.
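
    The CPU-bound retry loops described above usually come from retrying
    immediately and without limit. A minimal sketch of the alternative,
    a bounded retry with a growing delay between attempts; the function
    name retry_with_backoff and its default values are assumptions for
    illustration, not taken from any particular vendor's code:

```python
import time

def retry_with_backoff(operation, attempts=5, first_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out.

    Network failures surface as OSError (timeouts included). Between
    failures the caller sleeps instead of spinning, and the delay
    doubles each time up to max_delay.
    """
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            if attempt == attempts:
                raise  # give up and let the caller see the last error
            sleep(delay)  # yield the CPU rather than retrying at once
            delay = min(delay * 2, max_delay)
```

    A caller might wrap a connection attempt, for example
    retry_with_backoff(lambda: socket.create_connection((host, port),
    timeout=10)), where host and port are placeholders; the timeout
    argument supplies the per-call time limit whose absence is noted above.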

 () OSI TP0 and TP4 are both classes of the connection-oriented transport
    protocol by architectural definition (TP4 being the class designed to
    run over a connectionless network service). It comes as no surprise that
    the connection oriented service would be rigid in the service and protocol
    specification for the known state of an end-to-end or peer connection.
    The rationale, one would assume, includes some if not all of the
    aforementioned. Not to mention the tradeoff between an end-to-end service
    association and a simple, and I assume infrequent, protocol exchange of
    "keep-alive" information. (I stand to be corrected here, but it sounds
    logical to me, so I'm probably way off base. SMILE.) You can probably
    make a better statement in this area than I am prepared to.

    TCP, ostensibly, takes the least-common-denominator path to a solution,
    from all I have been able to determine, based on implementation experience.
    In other words, by design (there is no application or session layer),
    the path that offers the least protocol overhead appears to have been
    chosen for support of end-to-end transport-level connectivity. It looks
    like (and I make the statement with a limited frame of historical
    reference in this area) "the right decision" based on the engineering
    requirement(s) at that time.

 () I have seen vendors that don't implement, or hadn't prior to this year
    implemented, this feature as described for both cases (1) and (2) above.
    So I have always considered it an option that was product (vendor) driven.
    Work may be ongoing by some Working Group to make it a protocol extension.


  
  George Williams
  BBN,STC
  

neerma@cod.NOSC.MIL (Merle A. Neer) (09/19/90)

On a local net the lack of keepalives is a problem in the
sense that if a problem exists in the connection, 99.9999999% of
the time it's because the other side crashed...i.e. has lost
all knowledge of the connection.....the remedy will be that
tcp will reform the connection, but in the meantime valuable
resources (particularly on PC-type hosts) are not available
to other potential connections (PCs are strapped for buffer
space and file handles)...In the larger internet, if a connection
has problems the problem might be in a packet switch or phone
line somewhere and yes, here, tcp's philosophy of keeping
the connection alive has validity...and the servers/clients
might benefit from keeping knowledge of the connection around.
In lieu of working keepalives we have found it necessary
to program a keepalive capability above tcp in order to
keep PC servers operable (due to the above-mentioned phenomenon).

Merle
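
A keepalive capability above tcp, as described here, can be as small as
a table of when each peer was last heard from. A minimal sketch of that
bookkeeping, not the code actually used on those PC servers; the class
name PeerTable, the 60-second limit, and the injectable clock are all
assumptions for illustration:

```python
import time

class PeerTable:
    """Track when each peer last sent anything, and expire silent ones."""

    def __init__(self, idle_limit=60.0, clock=time.monotonic):
        self.idle_limit = idle_limit  # seconds of silence tolerated
        self.clock = clock            # injectable so tests can fake time
        self.last_heard = {}          # peer id -> time of last message

    def heard_from(self, peer):
        # Call on every message received, including application "pings".
        self.last_heard[peer] = self.clock()

    def reap(self):
        # Return peers silent for longer than idle_limit and forget them,
        # so the server can close their sockets and reuse the resources.
        now = self.clock()
        dead = [peer for peer, seen in self.last_heard.items()
                if now - seen > self.idle_limit]
        for peer in dead:
            del self.last_heard[peer]
        return dead
```

The server would call heard_from() for every message, have idle clients
send a short periodic message of their own, and run reap() on a timer to
release buffers and file handles held for crashed peers.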