[comp.unix.programmer] Problem with binding of socket addresses

epeterso@houligan.encore.com (Eric Peterson) (12/04/90)

I've encountered some odd behavior in trying to set up a simple
client-server system via sockets.  It appears that Internet
bindings are not readily reusable.  Here's a quick description of
what's going on ...

I'll start up the server process which creates a socket, binds itself
to an Internet address, then listens for incoming connections.  I'll
then start up the client which connects to the server.  Once the
connection is established, the client closes its socket and dies.  The
server sees the disconnection and closes its acceptor socket.  I can
then restart the server to wait for new connections.

However, the server occasionally hits a bug and core dumps or dies in
some other way, closing its end of the connection before the client
closes the other end.  When this occurs and I attempt to restart the
server, the bind() call fails with the error "Address already in use".

Now, neither the client nor the server is running at the time I try to
restart the server, and there isn't a problem with address collisions
with another process.  As far as I can tell, nothing else is using
this address.  So why does bind() fail?

I've seen this occur on several different flavors of Unix, from SunOS
to Ultrix to System V variants with BSD extensions.  Anyone know why
this happens?  And is there any way to prevent this behavior?

Thanks in advance.

Eric
--
       Eric Peterson <> epeterson@encore.com <> uunet!encore!epeterson
   Encore Computer Corp. * Ft. Lauderdale, Florida * (305) 587-2900 x 5208
Why did Constantinople get the works? Gung'f abobql'f ohfvarff ohg gur Ghexf.

devil@techunix.BITNET (Gil Tene) (12/07/90)

In article <epeterso.660257641@houligan> epeterson@encore.com (Eric Peterson) writes:
> .
> .
>Now, neither the client nor the server is running at the time I try to
>restart the server, and there isn't a problem with address collisions
>with another process.  As far as I can tell, nothing else is using
>this address.  So why does bind() fail?

I have also seen this happen on several systems, on anything with an
implementation of sockets.  I have no real solution, but it seems
that "waiting a little bit", about 30 seconds to a minute, "fixes"
this on most systems and frees the bound address.

-- Gil.
--
--------------------------------------------------------------------
| Gil Tene                      "Some days it just doesn't pay     |
| devil@techunix.technion.ac.il   to go to sleep in the morning."  |
--------------------------------------------------------------------

sean@ms.uky.edu (Sean Casey) (12/08/90)

Set the "reuse address" socket option, between the socket() and the
bind() calls. Then your program can always immediately restart.

Sean

-- 
***  Sean Casey <sean@s.ms.uky.edu>
***  "Live the journey, for each destination is but a doorway to the next..."
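
For readers who want to see the ordering Sean describes, here is a minimal
sketch in C.  It is not code from the thread: the helper name make_listener,
the backlog of 5, and the choice of INADDR_ANY are assumptions made purely
for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): returns a listening TCP
 * socket on `port` (host byte order), or -1 on error. */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* The option is set after socket() and before bind(). */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(fd);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {            /* backlog of 5 is arbitrary */
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would call make_listener() once at startup and then accept() on the
returned descriptor.  Setting the option after bind() has no effect on that
bind, which is why the ordering matters.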

jik@athena.mit.edu (Jonathan I. Kamens) (12/11/90)

In article <epeterso.660257641@houligan>, epeterso@houligan.encore.com (Eric Peterson) writes:
|> However, the server occasionally hits a bug and core dumps or dies off
|> in some other way.  But it dies off and closes its end of the
|> connection before the client closes the other end.  When this occurs
|> and I attempt to restart the server, the bind() call fails with the
|> error "Address already in use".
|> 
|> Now, neither the client nor the server is running at the time I try to
|> restart the server, and there isn't a problem with address collisions
|> with another process.  As far as I can tell, nothing else is using
|> this address.  So why does bind() fail?

In article <sean.660642165@s.ms.uky.edu>, sean@ms.uky.edu (Sean Casey) writes:
|> Set the "reuse address" socket option, between the socket() and the
|> bind() calls. Then your program can always immediately restart.

  Sean's suggestion will solve the problem, but he does not explain why the
problem occurs, so I guess I'll do that :-).

  The TCP protocol states that after a TCP stream connection has been closed
abnormally, the same local/foreign port combination cannot be used again for
(2 * MSL).  MSL stands for the Maximum Segment Lifetime, which is usually set
to a minute, which means that it probably takes about two minutes before the
address is useable again.

  The reason for this is to make sure that all packets which were supposed to
get to the old process connected to the socket don't accidentally get
delivered to the new process instead -- the delay is long enough so that
all the waiting packets should get thrown away.

  Using the reuse address socket option will make it possible for you to
rebind to the socket.  It's also a violation of the TCP protocol.  But what
the hell, sometimes pragmatism has to win out over theory.  This is definitely
one of those times :-).

-- 
Jonathan Kamens			              USnail:
MIT Project Athena				11 Ashford Terrace
jik@Athena.MIT.EDU				Allston, MA  02134
Office: 617-253-8085			      Home: 617-782-0710
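
The situation Eric describes can be reproduced on a single machine.  The
sketch below is an illustration under stated assumptions, not code from the
thread: the function names are hypothetical, everything runs over the
loopback address, and the "server" and "client" are just two sockets in one
process.  The server end closes first, which is the end that TCP leaves
lingering in its TIME_WAIT state, and the code then tries to bind the same
port again.  Whether that second bind() fails without the option varies
between systems, so the function only reports what happened.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: TCP socket bound to 127.0.0.1:port, with
 * SO_REUSEADDR if `reuse` is nonzero.  Returns the fd, or -1 with errno
 * left as set by the failing call. */
static int bound_socket(unsigned short port, int reuse)
{
    struct sockaddr_in a;
    int saved;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&a, 0, sizeof a);
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = htons(port);
    if ((reuse && setsockopt(fd, SOL_SOCKET, SO_REUSEADDR,
                             &reuse, sizeof reuse) < 0) ||
        bind(fd, (struct sockaddr *)&a, sizeof a) < 0) {
        saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Returns 0 if the rebind succeeded, the errno from the failed rebind
 * otherwise, or -1 if the setup itself went wrong. */
int rebind_after_server_close(int reuse)
{
    struct sockaddr_in a;
    socklen_t alen = sizeof a;
    char c;
    int result = -1;
    int lfd, cfd = -1, sfd = -1, nfd;

    lfd = bound_socket(0, reuse);       /* port 0: kernel picks a port */
    if (lfd < 0)
        return -1;
    if (listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&a, &alen) < 0)
        goto out;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&a, sizeof a) < 0)
        goto out;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        goto out;

    /* The server side goes away first, as when the server crashes. */
    close(sfd); sfd = -1;
    close(lfd); lfd = -1;

    /* The client reads until end-of-file, then closes its end. */
    while (read(cfd, &c, 1) > 0)
        ;
    close(cfd); cfd = -1;

    /* "Restart the server": try to bind the same port again. */
    nfd = bound_socket(ntohs(a.sin_port), reuse);
    if (nfd >= 0) {
        result = 0;
        close(nfd);
    } else {
        result = errno;
    }
out:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

Calling rebind_after_server_close(0) and then rebind_after_server_close(1)
and comparing the two return values shows whether a given system exhibits
the behavior; a result of EADDRINUSE from the first call corresponds to the
"Address already in use" message in the original question.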

sean@ms.uky.edu (Sean Casey) (12/11/90)

jik@athena.mit.edu (Jonathan I. Kamens) writes:

|  The TCP protocol states that after a TCP stream connection has been closed
|abnormally, the same local/foreign port combination cannot be used again for
|(2 * MSL).  MSL stands for the Maximum Segment Lifetime, which is usually set
|to a minute, which means that it probably takes about two minutes before the
|address is useable again.

Not only that, but if there is pending unwritten data, the address
sometimes cannot be reused *ever* unless SO_REUSEADDR was specified.
I've had stream ports hang for a week because there was pending
outbound data when the server closed the socket.

Sean
-- 
***  Sean Casey <sean@s.ms.uky.edu>
***  "Live the journey, for each destination is but a doorway to the next..."

leh@atlantis.cis.ufl.edu (Les Hill) (12/13/90)

In article <1990Dec10.194130.20414@athena.mit.edu>, jik@athena.mit.edu (Jonathan I. Kamens) writes:

|>   Using the reuse address socket option will make it possible for you to
|> rebind to the socket.  It's also a violation of the TCP protocol.  But what
|> the hell, sometimes pragmatism has to win out over theory.  This is definitely
|> one of those times :-).

Ummm... I think you have it backwards.  SO_REUSEADDR lets you override a
restriction imposed by sockets, namely that each bound INET socket must be
bound to a unique local port/local address combination.  The TCP protocol
itself allows connections to be specified by local port/local address +
foreign port/foreign address pairs (at least it did when I read RFC 793 :).
That probably caused consternation among the socket mechanism developers,
who had to deal with passive TCP sockets waiting on *any* foreign address.

I would like to hear an authoritative answer to this question:
	if you set SO_REUSEADDR on a socket, connect to a foreign address, and drop
	the connection, while someone else (also using SO_REUSEADDR :) issues a bind
	for the same local port, will the bind call allow you to rebind to the local
	port/local address within 2*MSL of your drop?  If so, what if you proceed
	with a connect?  This would explain why some code checks for EADDRINUSE
	even after doing a SO_REUSEADDR (then again those folks and I could just be
	taking "useless" precautions :->

Les
-- 
Extraordinary crimes against the people and the state have to be avenged by
agents extraordinary.  Two such people are John Steed -- top professional, and
his partner, Emma Peel -- talented amateur; otherwise known as "The Avengers."
UUCP: ...!gatech!uflorida!leh    BITNET: vishnu@UFPINE    INTERNET: leh@ufl.edu
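
The precaution Les mentions, checking for EADDRINUSE even when SO_REUSEADDR
has been set, can be combined with the "wait a little bit" approach from
earlier in the thread.  The following is only a sketch of that idea: the
name bind_retry and its `tries` and `delay` parameters are hypothetical, and
it does not settle the question of when a system will actually return the
error.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: like bind(), but when bind() fails with
 * EADDRINUSE it sleeps `delay` seconds and tries again, making up to
 * `tries` attempts in total.  Returns 0 on success, or -1 with errno
 * set by the last failing bind(). */
int bind_retry(int fd, const struct sockaddr *addr, socklen_t len,
               int tries, unsigned delay)
{
    while (tries-- > 0) {
        if (bind(fd, addr, len) == 0)
            return 0;
        /* Give up on any other error, or when attempts run out. */
        if (errno != EADDRINUSE || tries == 0)
            return -1;
        sleep(delay);
    }
    errno = EINVAL;                     /* tries was not positive */
    return -1;
}
```

With the two-minute figure quoted above, something like 13 tries at a delay
of 10 seconds would be enough to outlast the wait; those numbers are an
example, not a recommendation from the thread.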