ian@hpopd.HP.COM (Ian Watson) (07/17/90)
I sometimes get a hang for a long while (say an hour) when I have a client close a sockets connection. The server (SCO Unix 3.2) shows the TCP port state according to 'netstat -a' as being FIN_WAIT_1 before the connection eventually dies completely. The server program has gone away by the time the connection close was requested by the client, so I guess it's just at the TCP level that the connection is almost closed but not quite. Anyone any ideas ?

Ian Watson
ian@hpopd.HP.COM
hplabs!hpopd!ian
gwilliam@SH.CS.NET (George Williams) (07/20/90)
Hello, The close sequence in TCP is a symetrical exchange. That is the side issuing close generates the following protocol exchange: close initiator TCP close receipient ----------------- ------------------ CLOSE REQ =======> ---------------------> fin(1) <------------------- fin(1) ack must insure now data in pipe. usually notification event <======= CLOSE REQ <-------------------- fin(2) --------------------> event notification fin(2) ack The above is a full duplex close operation on a socket/port pair. There are wait states associated with the above that I ommitted. But I think it should be noted: () THe close sequence ideally operates on TCP socket pairs. It was designed to allow for the draining of data in transit or in the pipeline. The sender of close can not send data but is resposible for all data the other side has in transient. This minimizes loss of data during close between two non-cooperative or non-peer applications. It allows even for simultaneous close operations since TCP queues the fin/acks until deliverable. That is the wait you probally are seeing. () You can be left with half open connections if this is not implemented a spec'd. The RFC goes into details in this area. And there are implemetation tips. () Depending on the OS interface to TCP one might be able to get away with exiting without fully closing a connection, i.e. waiting for notification that the other side has closed. I would do this unless I wrote both side or am talking to peer applications that have a common higher level protocol. Some workarounds for violations of the above I have seen have included the Os generating an abort/close when a process exits, and timers on TCP wait states to mention a few. It used to be one of the most bug infested areas of TCP we ever ran across depending on the vendor and the application/os interface. If you a programming for multiple connections or concurrency this really is a problem depending on whose TCP you use. 
Additionally, that last 'fin(2) ack' itself has no ack, so it is a 'hole' in the protocol. If it gets dropped (I have seen this happen) by routers or bad bridge connections, there is no recovery. So most vendors put in a timer (2 min, 5 min, etc.) as a backup.