toppin@melpar.UUCP (Doug Toppin) (08/23/89)
We are using IBM Xenix 2 on the PC/AT with Network Research Corp
implementation of TCP/IP (called Fusion) and have hit a snag.
We often get the error EWOULDBLOCK on a heavily used socket (many
opens and closes). In the Sept. MIPS magazine David Betz writes:
"after a close, the resources used by that connection are not freed
immediately... Since this time interval is in the tens of seconds,
eventually all the resources are tied up in this TIME_WAIT."
My questions are:
* is the time value a constant?
* is it tunable?
* is the value part of the TCP/IP specification?
* is it possible to detect that you are using the last available resource?
* is it possible to allocate more of these resources?
thanks
Doug Toppin
uunet!melpar!toppin
hwajin@wrs.wrs.com (Hwajin Bae) (08/24/89)
In article <227@melpar.UUCP> toppin@melpar.UUCP (Doug Toppin) writes:
>We are using IBM Xenix 2 on the PC/AT with Network Research Corp
>implementation of TCP/IP (called Fusion) and have hit a snag.
>We often get the error EWOULDBLOCK on a heavily used socket (many
>opens and closes). In the Sept. MIPS magazine David Betz writes:
>"after a close, the resources used by that connection are not freed
>immediately... Since this time interval is in the tens of seconds,
>eventually all the resources are tied up in this TIME_WAIT."
>My questions are:
>* is the time value a constant?

The TCP protocol specification (RFC 793) defines 2*MSL (Maximum Segment
Lifetime) to be the time-out value for the TIME_WAIT state to reach the
CLOSED state. The TCB is not deleted until this time-out expires.
BSD 4.3-tahoe TCP/IP uses an MSL of 30 seconds.

>* is it tunable?

If you have the source, yes. If you have an operating system that lets
you change operating system parameters on the fly, yes.

>* is the value part of the TCP/IP specification?

In RFC 793, MSL is defined to be 2 minutes -- the value seems to be
arbitrary.

>* is it possible to detect that you are using the last available resource?

If you have the netstat program, you can do "netstat -a" to see a list
of PCB's. Count the TCP PCB's and subtract that number from the maximum
number of sockets configured into your kernel.

>* is it possible to allocate more of these resources?

You should be able to tune the kernel by using the kernel configuration
package. Each Unix system has its own kernel tuning mechanism; you will
need to read the documentation on your system to figure out how to
re-build your kernel. Most System V derived Unix OS's use "conf.c" and
"config.h" files that are created by the "config" program from the
database files "master" and "dfile".
One thing that you can try to get around this TIME_WAIT state in TCP
implementations is to set the SO_LINGER option on your SOCK_STREAM
socket with a linger time-out value of 0. When you close a socket that
has the SO_LINGER option on and the linger time is zero, 4.3 BSD based
TCP/IP will close the connection immediately. In 4.2 BSD based TCP
implementations the SO_DONTLINGER option can be used.

#include <sys/types.h>
#include <sys/socket.h>
.....
#ifdef BSD_43
	struct linger linger;

	linger.l_onoff = 1;
	linger.l_linger = 0;
	setsockopt (sock, SOL_SOCKET, SO_LINGER, &linger, sizeof (linger));
#else
	int on = 1;

	setsockopt (sock, SOL_SOCKET, SO_DONTLINGER, &on, sizeof (on));
#endif /* BSD_43 */
--
Hwajin Bae (hwajin@wrs.com)
Wind River Systems
1351 Ocean Ave. Emeryville, CA 94608 USA
Tel: 415/428-2623  Fax: 415/428-0540
almquist@JESSICA.STANFORD.EDU (Philip Almquist) (08/29/89)
Hwajin,
	Your message about how to shorten the amount of time a TCP
connection spends in TIME_WAIT state omitted what I think is a rather
crucial point: why shortening that time might be a very bad idea.
	For most users of the TCP protocol, one of its more important
properties is that it delivers data reliably. Vint Cerf will correct me
if I'm wrong, but my impression is that TIME_WAIT state was not a
devious plot on the part of the protocol designers to tie up system
resources; rather, it was included in the protocol because it is
necessary to ensure that delivery is indeed reliable.

						Philip
smb@ulysses.homer.nj.att.com (Steven M. Bellovin) (08/29/89)
In article <8908290927.AA26586@ucbvax.Berkeley.EDU>, almquist@JESSICA.STANFORD.EDU (Philip Almquist) writes:
> but my impression is that TIME_WAIT state was not a
> devious plot on the part of the protocol designers to tie up system
> resources; rather, it was included in the protocol because it is
> necessary to ensure that delivery is indeed reliable.

Quite true. The problem is that a host that has already closed its
direction of transmission must retain knowledge of the connection until
the other side has closed successfully. I don't have a copy of the RFC
handy, so I won't mention the state names, but the general scenario
goes like this....

Recall, first, that each direction of transmission may be closed
independently. Furthermore, everything must be ACKed, including
specifically close requests (known in the spec as FIN bits). Suppose
that host A sends a FIN to host B, thereby ending its transmission. B
replies with an ACK, but continues sending data for an arbitrarily long
period. Eventually, it too sends a FIN to host A; host A replies with
an ACK.

Let us consider now who knows what. A has long since finished
transmitting; it cannot send any more data. Furthermore, it knows that
B is done, too; it has even acknowledged that. Can A discard all
knowledge of the connection? No! What if the ACK going to B gets lost?
B's transmitter will time out and resend the FIN; after all, B doesn't
know which packet was dropped. If A has discarded knowledge of the
connection, it would have to send a reset (RST) in response to the
repeated FIN. This is inappropriate. Accordingly, A goes to TIMEWAIT
state instead, thus retaining the appropriate state; if it sees a
repeated FIN, it can simply repeat the ACK. TIMEWAIT persists for twice
the maximum segment lifetime, i.e., long enough to be sure that B has
either seen the ACK or has concluded that the connection is hopelessly
broken. Interestingly enough, B does not go into TIMEWAIT; the
reasoning is left as an exercise for the reader.
(N.B. If both sides send simultaneous FINs, both sides will end up in TIMEWAIT.)
sandy@TOVE.UMD.EDU (Sandy Murphy) (08/29/89)
Please (please, please) correct me if I am wrong in this.

If Host A is in the TIME_WAIT state, then A has: sent a FIN, received a
FIN, received an ACK (of its FIN), and sent an ACK (of Host B's FIN).
Therefore, Host B must have: sent a FIN, received A's FIN, and sent an
ACK (of A's FIN). Host B may not yet have received the ACK of its FIN,
so Host B is in state CLOSING or LAST_ACK (if it hasn't received the
ACK). Host B has already passed all of A's data up to the user, but B
doesn't know if A has passed all of B's data up to its user. The
TIME_WAIT state is to ensure that B can get an ACK so it knows all the
data was delivered (see page 22 of the specs).

Actually, the TIME_WAIT state is not long enough to ENSURE this. If A's
ACK of B's FIN gets lost, B will retransmit its FIN (at least). If this
gets lost as well, it is possible for the 2MSL timer to expire before B
retransmits again and the FIN arrives at A. Chances are B will receive
a RST in that case, because A is CLOSED when the FIN arrives.

Note also that both sides of the connection do not necessarily go
through the TIME_WAIT state. If Host A closes after it has received a
FIN, then it can send a FIN,ACK. An ACK of this FIN means that Host B
received the ACK it needs, and A can go directly to CLOSED. In this
sequence (ESTAB -> CLOSE_WAIT -> LAST_ACK -> CLOSED) the specs don't
say to send an ACK; I'm just supposing that the FIN segment it says to
send should include an ACK. Right?

--Sandy Murphy
University of Maryland
karn@jupiter (Phil R. Karn) (08/31/89)
>[Sandy Murphy's discussion of TCP connection closing]
> Actually, the TIME_WAIT state is not long enough to ENSURE this. If
>A's ACK of B's FIN gets lost, B will retransmit its FIN (at least). If
>this gets lost as well, it is possible for the 2MSL timer to expire
>before B retransmits again and the FIN arrives at A. Chances are B
>will receive a RST in that case, because A is CLOSED when the FIN
>arrives.

You're quite right. In fact, whenever your network has a nonzero packet
loss probability, you can NEVER be absolutely sure that the connection
will close gracefully on both ends. This is not only true of TCP; it's
the case for ANY protocol. Check out the "two-army problem" on page 397
of Tanenbaum's "Computer Networks" text (second edition) for an
explanation.

Phil
CERF@A.ISI.EDU (09/01/89)
Phil and Sandy have done a good job of explaining the mechanism and
rationale for TIME_WAIT.

During the design phase of TCP, tremendous effort was put into
exploring, through case analysis, various sequences of events in
various orders. The "two armies" problem is related to the Byzantine
Generals problem in the literature on synchronization; there are some
wonderfully complicated variations in which there are more than two
generals and some of them lie.

In any case, no amount of waiting and acking will guarantee anything -
but the states are there to reduce the possibility that a successful
session is considered a failure for lack of an ACK. Generally, though,
such designs fail safe in that a truly unsuccessful session is not
accidentally considered successful, though a successful one may be
mislabelled a failure.

Vint