cmorris@ingres.com (Colin Morris) (09/07/90)
Having served my four-year sentence in the ISO TP4 world, I've suddenly been unleashed on TCP. To my great surprise, I understand not all TCP implementations support "keep-alive". For example, SUN's PC-NFS version of TCP appears not to. How widespread a "problem" is this? What well-known implementations don't support this?

--
Colin Morris, cmorris@ws2s.ingres.com
Ingres Corporation, Alameda, California, U.S.A.
Official Hooligan, Italia '90
craig@bbn.com (Craig Partridge) (09/07/90)
In article <1990Sep7.002637.6209@ingres.Ingres.COM> cmorris@ws2s.Ingres.COM (Colin Morris) writes:
>Having served my four-year sentence in the ISO TP4 world, I've suddenly been
>unleashed on TCP. To my great surprise, I understand not all TCP
>implementations support "keep-alive". For example, SUN's PC-NFS version
>of TCP appears not to. How widespread a "problem" is this? What well-known
>implementations don't support this?

Why is lack of keep-alives a problem? In principle, unless an application tries to send data, why ping the network to see if the route is up (if the route is down, it may come up again before you send more data)?

Craig Partridge
Vice-Chairman, I Hate Keep-Alives Association

[For those not in the know, Phil Karn is chair of the IHKAA, and Mike Padlipsky is founder, previous-chair, and honorary fellow. We oughta make T-shirts some time... :-)]
702WFG@SCRVMSYS.BITNET (bill gunshannon) (09/10/90)
>>of TCP appears not to. How widespread a "problem" is this? What well-known
>
> Why is lack of keep-alives a problem? In principle, unless an application
>tries to send data, why ping the network to see if the route is up (if the
>route is down, it may come up again before you send more data)?
>
>Craig Partridge
>Vice-Chairman, I Hate Keep-Alives Association
>
>[For those not in the know, Phil Karn is chair of the IHKAA, and Mike
>Padlipsky is founder, previous-chair, and honorary fellow. We oughta
>make T-shirts some time... :-)]

Actually, I'm surprised Phil didn't jump in here and answer this one first. Must be on vacation or something...

bill gunshannon (closet member IHKAA)

bill gunshannon
702WFG@SCRVMSYS.BITNET
gwilliam@SH.CS.NET (George Williams) (09/10/90)
Please note the following in regard to your query:
() "keep-alive" is used rather loosely in the TCP/UNIX environment.
( Correct and enlighten me as appropriate.)
1. I have seen it used to describe an option under UNIX to determine
whether and for how long to "re-try" a client initiated
connection request to a server process. This is usually done under
the covers, i.e. transparent, to the requesting process.
2. Some vendors implement it, again option defined, to periodically
poll the other end of an established connection in order to
solicit a response indicating "am still here alive" from a peer
TCP transport. This feature has proven useful to applications
designed for parallelism or some degree of concurrency relative to
higher level processing requirements.
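To make sense (2) concrete: on systems with a Berkeley-style socket interface, this kind of keep-alive is typically switched on per connection with the SO_KEEPALIVE socket option. The sketch below is only an illustration, not any particular vendor's mechanism. It assumes a platform whose sockets layer exposes SO_KEEPALIVE, and the helper name enable_keepalive is made up for the example. How often probes are sent is left to the system.

```python
import socket


def enable_keepalive(sock):
    """Ask the local TCP to probe an idle peer; return True if the option is set."""
    # SO_KEEPALIVE is a socket-level option, so it is set at SOL_SOCKET.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the option back rather than assuming the stack honoured it.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print("keep-alive enabled:", enable_keepalive(s))
    finally:
        s.close()
```

The option only requests probing; an implementation that lacks keep-alive support, as described in the original query, would have nothing to switch on.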
Given the above it would be enlightening to one such as myself if you
would elucidate in which, if any, of the above contexts you are
using the term "keepalive", i.e. help define the implementation
specific context of your query.
In the case of (1) above it has proven beneficial and helped to simplify
interface level programming logic requirements to have this feature,
based on prior development ( subjective ) work in this area.
As far as (2) , again, it proved to be a useful feature; in lieu of a
timeout parameter on most UNIX implementations for network read and write
calls, in avoiding "hung" processes and in the area of process management.
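For comparison, where the socket interface does offer a per-call timeout, a blocked read can be bounded directly. A minimal sketch, assuming Python's standard socket module; recv_or_none is a hypothetical helper name used only for this illustration.

```python
import socket


def recv_or_none(sock, nbytes, timeout_seconds):
    """Read up to nbytes, or return None if nothing arrives in time."""
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # Nothing arrived within the limit; the caller decides what that means.
        return None
```

A timeout only says that nothing arrived. It cannot tell an idle peer from a dead one, which is the gap that keep-alive probes in sense (2) are meant to fill.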
In distributed compute environments determination of 'when' a
process/service/application is 'really' alive can be a problem depending
on what OS/CPU combination a process is executing under, just to mention
one consideration. I won't mention any vendor by name ( some
of the observed problems in this area may have been corrected, even as we
speak ), but some OS's have been noted to max out or approach 100%
CPU utilization as a result of process or application level loops for
retries on network open/read/write failure attempts. Granted that this
is the indication of a poor design on the application level, network
software developers have to take the broad(er) perspective that programmers
don't want to know the detail of underlying protocols much the same
as we don't necessarily want to get into application specific
implementation details.
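The retry loops mentioned above pin the CPU when they re-issue a failed call immediately. One common way to avoid that, sketched here as an illustration rather than as anything a vendor in this discussion shipped, is to sleep between attempts and lengthen the sleep after each failure. The function name retry_with_backoff and its parameters are assumptions made for the example.

```python
import time


def retry_with_backoff(operation, attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds, sleeping between failures.

    The delay doubles after each failure (capped at max_delay), so a dead
    peer costs a few wake-ups instead of a spinning process.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The sleep parameter is there so the waiting can be replaced in tests; in normal use the default time.sleep applies.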
() OSI TP0 and TP4 both specify connection oriented transport service by
way of architectural definition ( TP4 being the class designed to run
over a connectionless network service ). It comes as no surprise that
the connection oriented service would be rigid in the service and protocol
specification for the known state of an end to end or peer connection.
The rationale, one would assume, includes some if not all of the aforementioned.
Not to mention the tradeoff between end to end service association versus
simple protocol, and I assume infrequent, exchange of "keep-alive"
information. ( I stand to be corrected here, but it sounds logical to me
so I'm probably way off base, SMILE. ) You can probably make a better
statement in this area than I am prepared to.
TCP, ostensibly, takes the least-common-denominator path to solution
from all I have been able to determine, based on implementation experience.
In other words by design ( there is no application or session layer )
the path that offers the least protocol overhead appears to have been
chosen for support of end to end transport level connectivity. It looks
like ( and I make the statement with a limited frame of historical
reference in this area ) "the right decision" based on the engineering
requirement(s) at that time.
() I have seen vendors that don't implement, or hadn't prior to this year
implemented, this feature as described for both case (1) and (2) above.
So I have always considered it an option that was product (vendor) driven.
Work may be ongoing by some Working Group to make it a protocol extension.
George Williams
BBN,STC
neerma@cod.NOSC.MIL (Merle A. Neer) (09/19/90)
On a local net the lack of keepalives is a problem in the sense that if a problem exists in the connection, 99.9999999% of the time it's because the other side crashed, i.e. has lost all knowledge of the connection. The remedy will be that TCP will re-form the connection, but in the meantime valuable resources (particularly on PC-type hosts) are not available to other potential connections (PCs are strapped for buffer space and file handles).

In the larger internet, if a connection has problems the problem might be in a packet switch or phone line somewhere and yes, here, TCP's philosophy of keeping the connection alive has validity, and the servers/clients might benefit from keeping knowledge of the connection around.

In lieu of working keepalives we have found it necessary to program a keepalive capability above TCP in order to keep PC servers operable (due to the above-mentioned phenomena).

Merle
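A keepalive capability above TCP of the kind described here usually comes down to bookkeeping: note when each peer was last heard from, and free the resources of any peer that stays silent too long. The sketch below is a guess at the general shape, not the code actually used on those PC servers; the class name LivenessTable, its methods, and the timeout value are all assumptions made for illustration.

```python
import time


class LivenessTable:
    """Track when each peer was last heard from and flag the silent ones."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.last_heard = {}

    def heard_from(self, peer):
        """Record any traffic from peer, whether heartbeat or real data."""
        self.last_heard[peer] = self.clock()

    def reap(self):
        """Forget and return the peers silent for longer than the timeout."""
        now = self.clock()
        dead = [p for p, t in self.last_heard.items() if now - t > self.timeout]
        for p in dead:
            del self.last_heard[p]
        return dead
```

A server would call heard_from() on every message and reap() on a timer, closing the connections it returns so that their buffers and file handles become available again.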