[comp.unix.programmer] More questions on sockets

jwb@cepmax.ncsu.edu (John W. Baugh Jr.) (06/14/91)

I'm trying to write "send/receive"-like functions (i.e., send(msg,to),
receive(msg,from)) to hide some of the details of Internet stream
sockets for interprocessor communication (oh yeah, and I really don't
know what I'm doing).  Anyway, a couple of questions:

  - when trying to bind a stream socket I sometimes get an error
    "Address already in use", even though I've closed the socket
    (for example, when I run the program in succession a couple of
    times).  Is there something else I have to do?

  - assuming I'm on the right track (big assumption), is it possible
    to raise the level of abstraction of my send_msg/recv_msg
    functions.  For example, ideally one would like to do the
    following:
       send_msg(char *msg, int size, int process);
       recv_msg(char *msg, int size, int process);
     where "process" may be a process on any machine.  Okay, so that's
     probably asking too much.  What I currently have is:
       send_msg(char *msg, int size, char *hostname, int port);
       recv_msg(char *msg, int size, int port);
     Can one do better (w/o an inordinate effort)?

Any comments/suggestions/literature-ptrs would be welcomed.  Code
follows.

John Baugh
jwb@cepmax.ncsu.edu

-----------------------------------------------------------------------------
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>    /* exit */
#include <strings.h>   /* bcopy, bzero */
#include <unistd.h>    /* close, read, write */

#define TRIES 10000

/* Connect to hostname:port (port in host byte order), send one
   message, and close the connection. */
int send_msg(char *msg, int size, char *hostname, int port)
{
  int sock = -1;
  struct sockaddr_in server;
  struct hostent *hp;
  int i, connected = 0;

  hp = gethostbyname(hostname);
  if (hp == 0) {
    fprintf(stderr, "%s: unknown host\n", hostname);
    exit(2);
  }
  bzero((char *)&server, sizeof(server));
  bcopy(hp->h_addr, &server.sin_addr, hp->h_length);
  server.sin_family = AF_INET;
  server.sin_port = htons(port);   /* sin_port is in network byte order */

  /* Retry until the receiver is listening. */
  for (i = 0; i < TRIES && !connected; i++) {

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) {
      perror("opening stream socket");
      exit(1);
    }

    if (connect(sock, (struct sockaddr *)&server, sizeof(server)) < 0)
      close(sock);
    else
      connected = 1;
  }

  if (!connected) {
    /* sock was already closed inside the loop */
    perror("connecting stream socket");
    exit(1);
  }

  if (write(sock, msg, size) < 0)
    perror("writing on stream socket");
  close(sock);
  return 0;
}


/* Listen on port (host byte order), accept one connection, and read
   one message of at most size bytes into msg. */
int recv_msg(char *msg, int size, int port)
{
  int sock;
  socklen_t length;
  struct sockaddr_in server;
  int msgsock, rval;

  sock = socket(AF_INET, SOCK_STREAM, 0);
  if (sock < 0) {
    perror("opening stream socket");
    exit(1);
  }

  bzero((char *)&server, sizeof(server));
  server.sin_family = AF_INET;
  server.sin_addr.s_addr = htonl(INADDR_ANY);
  server.sin_port = htons(port);   /* sin_port is in network byte order */
  if (bind(sock, (struct sockaddr *)&server, sizeof(server))) {
    perror("binding stream socket");
    exit(1);
  }
  length = sizeof(server);
  if (getsockname(sock, (struct sockaddr *)&server, &length)) {
    perror("getting socket name");
    exit(1);
  }
  printf("Socket has port #%d\n", ntohs(server.sin_port));

  listen(sock, 5);
  msgsock = accept(sock, 0, 0);
  if (msgsock == -1)
    perror("accept");
  else {
    bzero(msg, size);
    if ((rval = read(msgsock, msg, size)) < 0)
      perror("reading stream message");
    close(msgsock);
  }
  close(sock);

  return 0;
}

leh@crane.cis.ufl.edu (Les Hill) (06/15/91)

In article <1991Jun14.162215.14657@ncsu.edu>, jwb@cepmax.ncsu.edu (John W. Baugh Jr.) writes:
|>   - when trying to bind a stream socket I sometimes get an error
|>     "Address already in use", even though I've closed the socket
|>     (for example, when I run the program in succession a couple of
|>     times).  Is there something else I have to do?

I believe this is a problem due to TCP's "handshaking" on a close.  One workaround
I use is:

...
  int slin = 0; /* unset SO_LINGER */
...
  /* unset SO_LINGER */
  if (setsockopt(socket, SOL_SOCKET, SO_LINGER, (char *)&slin, sizeof(int)) < 0) {
#ifdef DEBUG
    perror("setsockopts:SO_LINGER");
#endif
    return -1;
  }
...

when I am setting up the socket for use.

|>   - assuming I'm on the right track (big assumption), is it possible
|>     to raise the level of abstraction of my send_msg/recv_msg
|>     functions.  For example, ideally one would like to do the
|>     following:
|>        send_msg(char *msg, int size, int process);
|>        recv_msg(char *msg, int size, int process);
|>      where "process" may be a process on any machine.  Okay, so that's
|>      probably asking too much.  What I currently have is:
|>        send_msg(char *msg, int size, char *hostname, int port);
|>        recv_msg(char *msg, int size, int port);
|>      Can one do better (w/o an inordinate effort)?

Probably not.  In order to achieve the functionality you want in your ideal
situation, you (IMHO) will need to implement your own "protocol" on top of TCP.
The "protocol" could get arbitrarily complex -- as an example, I wrote a "sockets
library" that allowed dynamic link management, while maintaining a fairly high
level interface (e.g.

extern int WriteConnection(); /* (char *data, int size, int user) */
extern void ReadConnection(); /* (char *data, int (*func)()) */

) -- the "protocol" I implemented was about 3K lines worth of nicely formatted
(and probably bloated :) code.
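
To give a feel for the smallest possible "protocol" on top of TCP, here
is a minimal sketch of length-prefixed messages.  It is not the library
described above.  The names write_all, read_all, send_framed and
recv_framed are made up for illustration, the 4-byte network-order
length header is an assumption, and EINTR handling is left out.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdint.h>
#include <unistd.h>

/* Write exactly n bytes, retrying on short writes.  Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t n)
{
  while (n > 0) {
    ssize_t w = write(fd, buf, n);
    if (w <= 0)
      return -1;
    buf += w;
    n -= (size_t)w;
  }
  return 0;
}

/* Read exactly n bytes, retrying on short reads.  Returns 0, or -1 on
   error or if the peer closes before n bytes arrive. */
static int read_all(int fd, char *buf, size_t n)
{
  while (n > 0) {
    ssize_t r = read(fd, buf, n);
    if (r <= 0)
      return -1;
    buf += r;
    n -= (size_t)r;
  }
  return 0;
}

/* Send a 4-byte length header (network byte order), then the message. */
int send_framed(int fd, const char *msg, uint32_t size)
{
  uint32_t hdr = htonl(size);

  if (write_all(fd, (const char *)&hdr, sizeof(hdr)) < 0)
    return -1;
  return write_all(fd, msg, size);
}

/* Receive one framed message into msg (capacity max bytes).  Returns
   the message length, or -1 on error or if the message is too big. */
long recv_framed(int fd, char *msg, uint32_t max)
{
  uint32_t hdr, size;

  if (read_all(fd, (char *)&hdr, sizeof(hdr)) < 0)
    return -1;
  size = ntohl(hdr);
  if (size > max)
    return -1;
  if (read_all(fd, msg, size) < 0)
    return -1;
  return (long)size;
}
```

The point of the header is that a single read() on a stream socket may
return only part of what the other side wrote, so the receiver needs
some way to know where a message ends.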

Les
-- 
Extraordinary crimes against the people and the state have to be avenged by
agents extraordinary.  Two such people are John Steed -- top professional, and
his partner, Emma Peel -- talented amateur; otherwise known as "The Avengers."
INTERNET: leh@ufl.edu  UUCP: ...!gatech!uflorida!leh  BITNET: vishnu@UFPINE

mouse@thunder.mcrcim.mcgill.edu (der Mouse) (06/18/91)

In article <1991Jun14.162215.14657@ncsu.edu>, jwb@cepmax.ncsu.edu (John W. Baugh Jr.) writes:

>   - when trying to bind a stream socket I sometimes get an error
>     "Address already in use", even though I've closed the socket (for
>     example, when I run the program in succession a couple of times).
>     Is there something else I have to do?

Wait.  As far as I can tell that's the only thing to be done.  (Unless
you want to set SO_REUSEADDR with setsockopt, in which case there is no
protection against one copy of the daemon stealing the port from a
previously-running copy.)

The problem is that you can't bind a socket to port N when there exists
any other socket on that machine with a matching address and port, even
if that other socket is part of an established connection, is a
leftover from a previous connection, or is otherwise not the source of
a potential conflict.  (I consider this a bug, but have never been
sufficiently annoyed to fix it.)

If you run netstat, you can see the stray socket lying around making
life hard for you....

>   - assuming I'm on the right track (big assumption), is it possible
>     to raise the level of abstraction of my send_msg/recv_msg
>     functions.  For example, ideally one would like to do the
>     following:
>        send_msg(char *msg, int size, int process);
>        recv_msg(char *msg, int size, int process);
>      where "process" may be a process on any machine.

This is workable *if* you're willing to create some sort of server to
deal with mapping between the "process" values and <machine,port>
pairs.  Given that, there's no problem.  (Such a program is probably
not difficult, unless you want some sort of pre-definition of the
"process" values....)
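
As a rough sketch of the table such a server would keep (only the
mapping itself, with no networking around it), something like the
following would do.  All of the names here (proc_register, proc_lookup,
proc_unregister, MAX_PROCS, HOST_LEN) are invented for illustration, and
the fixed-size table is an assumption made to keep it short.

```c
#include <string.h>

#define MAX_PROCS 64   /* assumed table size */
#define HOST_LEN  64   /* assumed maximum hostname length, incl. NUL */

struct proc_entry {
  int in_use;
  int process;           /* the abstract "process" value */
  char host[HOST_LEN];   /* machine the process runs on */
  int port;              /* port it listens on, host byte order */
};

static struct proc_entry proc_table[MAX_PROCS];

/* Add or update the mapping for process.  Returns 0, or -1 if the
   table is full or the hostname is too long. */
int proc_register(int process, const char *host, int port)
{
  int i, slot = -1;

  if (strlen(host) >= HOST_LEN)
    return -1;
  for (i = 0; i < MAX_PROCS; i++) {
    if (proc_table[i].in_use && proc_table[i].process == process) {
      slot = i;          /* existing entry: overwrite it */
      break;
    }
    if (!proc_table[i].in_use && slot < 0)
      slot = i;          /* remember the first free slot */
  }
  if (slot < 0)
    return -1;
  proc_table[slot].in_use = 1;
  proc_table[slot].process = process;
  strcpy(proc_table[slot].host, host);
  proc_table[slot].port = port;
  return 0;
}

/* Look up process.  On success stores the host and port and returns 0;
   returns -1 if the process is unknown. */
int proc_lookup(int process, const char **host, int *port)
{
  int i;

  for (i = 0; i < MAX_PROCS; i++) {
    if (proc_table[i].in_use && proc_table[i].process == process) {
      *host = proc_table[i].host;
      *port = proc_table[i].port;
      return 0;
    }
  }
  return -1;
}

/* Remove the mapping for process.  Returns 0, or -1 if not found. */
int proc_unregister(int process)
{
  int i;

  for (i = 0; i < MAX_PROCS; i++) {
    if (proc_table[i].in_use && proc_table[i].process == process) {
      proc_table[i].in_use = 0;
      return 0;
    }
  }
  return -1;
}
```

send_msg(msg, size, process) would then ask the server for the
<machine,port> pair and connect to it as before.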

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

chris@asylum.gsfc.nasa.gov (Chris Shenton) (06/18/91)

   In article <1991Jun14.162215.14657@ncsu.edu>, jwb@cepmax.ncsu.edu (John W. Baugh Jr.) writes:

   >   - when trying to bind a stream socket I sometimes get an error
   >     "Address already in use", even though I've closed the socket (for
   >     example, when I run the program in succession a couple of times).
   >     Is there something else I have to do?

In article <1991Jun18.050654.17373@thunder.mcrcim.mcgill.edu> mouse@thunder.mcrcim.mcgill.edu (der Mouse) writes:

   Wait.  As far as I can tell that's the only thing to be done.  

I've encountered the same thing, and came up with the same non-fix. Any
idea how long you have to wait (don't say ``until the port's free'' :-).
Why?
--
One should drink little ... but often.		   -- Henri de Toulouse-Lautrec

sean@ms.uky.edu (Sean Casey) (06/19/91)

In article <1991Jun14.162215.14657@ncsu.edu>, jwb@cepmax.ncsu.edu (John W. Baugh Jr.) writes:
>   - when trying to bind a stream socket I sometimes get an error
>     "Address already in use", even though I've closed the socket (for
>     example, when I run the program in succession a couple of times).
>     Is there something else I have to do?

This should be in a frequently asked questions list :).

Check out the SO_REUSEADDR option for setsockopt(). Hint: use it after
socket() and before bind().
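
In case the ordering isn't obvious, a minimal sketch follows.  The
helper name bind_reusable is made up for illustration, and it assumes
the port is passed in host byte order.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to port (host byte
   order) on all interfaces, with SO_REUSEADDR set before bind().
   Returns the socket descriptor, or -1 on error. */
int bind_reusable(int port)
{
  struct sockaddr_in addr;
  int on = 1;
  int sock = socket(AF_INET, SOCK_STREAM, 0);

  if (sock < 0)
    return -1;

  /* This has to come after socket() and before bind(). */
  if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR,
                 (char *)&on, sizeof(on)) < 0) {
    close(sock);
    return -1;
  }

  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  addr.sin_port = htons(port);
  if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
    close(sock);
    return -1;
  }
  return sock;
}
```

The caller still has to listen() and accept() on the returned socket.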

Sean
-- 
** Sean Casey  <sean@s.ms.uky.edu>

torek@elf.ee.lbl.gov (Chris Torek) (06/19/91)

>In article <1991Jun18.050654.17373@thunder.mcrcim.mcgill.edu>
>mouse@thunder.mcrcim.mcgill.edu (der Mouse) writes:
>[Wait for closing sockets to close.]  As far as I can tell that's the
>only thing to be done.  

In article <CHRIS.91Jun18100547@asylum.gsfc.nasa.gov>
chris@asylum.gsfc.nasa.gov (Chris Shenton) writes:
>I've encountered the same thing, and came up with the same non-fix. Any
>idea how long you have to wait (don't say ``until the port's free'' :-).
>Why?

Well, actually, it really *is* `until the port is free'.

There are two things going on:

 a) A TCP connection can be `half open': one side may be shut down while
    the other side still has the ability to send.  A TCP connection in this
    state could conceivably stay that way forever.  These are the FIN_WAIT
    states shown by netstat.

 b) Even when a TCP connection is `fully closed', the port should not be
    reused until any outstanding packets have timed out.  Otherwise one
    of these packets, lost somewhere in the network, might show up after
    a new connection is open and `sneak in'.  This timer is called TCP_2MSL
    internally; it stands for `2 times the Maximum Segment Lifetime'.
    It typically amounts to 30 seconds.

You can use SO_LINGER (which, N.B., changed between 4.2BSD and 4.3BSD)
to eliminate the second, but not the first.
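
For reference, a sketch of the 4.3BSD form, where the option value is a
struct linger rather than an int.  The helper name set_abortive_close is
made up, and the sketch assumes the usual behavior that linger enabled
with a zero timeout makes close() abort the connection, discarding any
unsent data, instead of going through the normal shutdown.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: request an abortive close on sock.
   Returns 0 on success, -1 on error (as setsockopt does). */
int set_abortive_close(int sock)
{
  struct linger lg;

  lg.l_onoff = 1;    /* linger enabled ... */
  lg.l_linger = 0;   /* ... with a zero timeout */
  return setsockopt(sock, SOL_SOCKET, SO_LINGER,
                    (char *)&lg, sizeof(lg));
}
```

Since an abortive close throws away queued data, this is only safe when
the application already knows the peer has received everything.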

One simple solution is to use a different port each time, along the
lines of the Sun RPC `port mapper'.  A single meta-server keeps track
of all active services.  To establish a service, you contact the port
mapper and say `I want to offer a service'.  It gives you a port
number.  When you are done you tell it `I no longer offer the service',
and it deletes that from its tables.  To obtain some service, you ask
the port mapper who to contact.  There are some races, but they tend
not to be too serious.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

jc@minya.UUCP (John Chambers) (06/19/91)

In article <1991Jun14.162215.14657@ncsu.edu>, jwb@cepmax.ncsu.edu (John W. Baugh Jr.) writes:
> I'm trying to write "send/receive"-like functions (i.e., send(msg,to),
> receive(msg,from)) to hide some of the details of Internet stream
> sockets for interprocessor communication (oh yeah, and I really don't
> know what I'm doing).  Anyway, a couple of questions:
> 
>   - when trying to bind a stream socket I sometimes get an error
>     "Address already in use", even though I've closed the socket
>     (for example, when I run the program in succession a couple of
>     times).  Is there something else I have to do?

It often takes about 2 minutes for the kernel to finish closing down a
TCP (stream) connection.  It has to wait for the stream to drain and/or
time out before it can allow reuse of the port.  It's annoying, but
there's not much to be done about it.

>   - assuming I'm on the right track (big assumption), is it possible
>     to raise the level of abstraction of my send_msg/recv_msg
>     functions.  For example, ideally one would like to do the
>     following:
>        send_msg(char *msg, int size, int process);
>        recv_msg(char *msg, int size, int process);
>      where "process" may be a process on any machine.  Okay, so that's
>      probably asking too much.  

Hmm... I've worked with some libraries that did it this way, and in my
experience, this is a bad idea.  It sounds good at first, and seems to
work in small-scale tests.  Where it gets you into trouble is when you
have two packages X and Y linked into your program, and they both try
to talk to remote servers.  Messages go out, and then a message arrives
for your process.  Now let's see, should it be given to the input
routine in package X or the one in package Y?  In general, the routine
that gets the message has no way of knowing what other routines in
which other libraries may be looking for it; solving this problem
becomes a major artificial-intelligence project.

The file abstraction is a much better solution, because it allows each
package to set up and completely control its own communications, with
no interaction between separate packages that happen to be linked into
the same process.

-- 
All opinions Copyright (c) 1991 by John Chambers.  Inquire for licensing at:
Home: 1-617-484-6393 ...!{bu.edu,harvard.edu,ima.com,eddie.mit.edu,ora.com}!minya!jc 
Work: 1-508-486-5475 {sppip7.lkg.dec.com!jc,ub40::jc}

longshot@en.ecn.purdue.edu (Without Reason) (06/19/91)

>In article <CHRIS.91Jun18100547@asylum.gsfc.nasa.gov>
>chris@asylum.gsfc.nasa.gov (Chris Shenton) writes:
>>I've encountered the same thing, and came up with the same non-fix. Any
>>idea how long you have to wait (don't say ``until the port's free'' :-).
>>Why?
>
>Well, actually, it really *is* `until the port is free'.
>
>There are two things going on:
    [deleted stuff about a & b]

	I have also run into this problem and wondered if shutdown(2)
  would do the trick.  I plan to try this with a server I have, but wondered
  if anyone else has tried it?

--longshot
-- 
longshot@ecn.purdue.edu	   (Rich Long)

  To be "remembered with an affection and veneration that shall surge high
     above the waters of oblivion and glisten through the rust of time."

mouse@thunder.mcrcim.mcgill.edu (der Mouse) (06/22/91)

In article <14446@dog.ee.lbl.gov>, torek@elf.ee.lbl.gov (Chris Torek) writes:

> One simple solution is to use a different port each time, along the
> lines of the Sun RPC `port mapper'.  A single meta-server keeps track
> of all active services.  To establish a service, you contact the port
> mapper and say `I want to offer a service'.  It gives you a port
> number.

Uh, Chris, the portmapper I know has the client obtain the port number
(typically by binding to port 0 and then doing getsockname() to find
out what port was actually bound to, but there are exceptions, eg NFS)
and then tell it to the portmapper.  Not the portmapper choosing a port
and telling the client to use it.
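
The "bind to port 0, then getsockname()" step looks roughly like this.
It is a minimal sketch only: the helper name bind_any_port is made up,
and telling the portmapper about the result is left out.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: bind a TCP socket to a port chosen by the
   kernel.  Stores the chosen port (host byte order) in *port_out and
   returns the socket descriptor, or -1 on error. */
int bind_any_port(int *port_out)
{
  struct sockaddr_in addr;
  socklen_t len = sizeof(addr);
  int sock = socket(AF_INET, SOCK_STREAM, 0);

  if (sock < 0)
    return -1;

  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  addr.sin_port = htons(0);   /* 0 means "pick any free port" */

  if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
      getsockname(sock, (struct sockaddr *)&addr, &len) < 0) {
    close(sock);
    return -1;
  }

  *port_out = ntohs(addr.sin_port);   /* the port actually bound */
  return sock;
}
```

The port number that comes back is what the service would then register.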

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu