[comp.unix.programmer] Server death detection?

mcdaniel@adi.com (Tim McDaniel) (10/18/90)

My coworker has a problem.  (Well, actually, he has lots of problems,
but only one concerns us now. 8-)

He's got a server and a client talking via sockets over a network,
running under ULTRIX 4.0 and/or SunOS 4.1.  At some point, the server
is killed.  He would like the client to receive notification ASAP that
this has occurred.

SIGPIPE is no help: it is raised only when the client tries to write
to the server.

select is no help at all: the death of the other end of the connection
is not perceived as an exception.

There does not seem to be a setsockopt option for this.

His current workaround is to have an alarm go off every N seconds.  In
the signal handler, he attempts a non-blocking receive in peek mode
(MSG_PEEK).  If the server has died, it returns EOF (a return value of
0).  If not, it returns -1 with errno set to EWOULDBLOCK.
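
As a minimal sketch of the peek probe described above (not the
coworker's actual code): the helper name peer_closed is hypothetical,
a stream socket is assumed, and the alarm setup is left out.  The
probe saves and restores errno because it is meant to be callable from
a signal handler.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: probe a connected stream socket without
 * consuming any data.  Returns 1 if the peer has closed the
 * connection, 0 if it still looks alive, -1 on an unexpected error.
 */
int peer_closed(int sock)
{
    int saved_errno = errno;    /* we may be running in a signal handler */
    int flags, result;
    ssize_t n;
    char c;

    /* Switch to non-blocking mode for the duration of the probe. */
    flags = fcntl(sock, F_GETFL, 0);
    if (flags < 0 || fcntl(sock, F_SETFL, flags | O_NONBLOCK) < 0) {
        errno = saved_errno;
        return -1;
    }

    n = recv(sock, &c, 1, MSG_PEEK);    /* MSG_PEEK leaves data queued */
    if (n == 0)
        result = 1;             /* EOF: the peer closed its end */
    else if (n > 0)
        result = 0;             /* data is pending, so the peer was alive */
    else if (errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR)
        result = 0;             /* nothing to read yet */
    else
        result = -1;            /* some other error, e.g. ECONNRESET */

    fcntl(sock, F_SETFL, flags);        /* restore the original mode */
    errno = saved_errno;
    return result;
}
```

Note that unread data queued ahead of the EOF hides it: the probe
reports the close only once that data has been consumed.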

There are two minor concerns with this approach.  One is the
overhead if N is small.  The other is that the communication code is
in a library, and he has little control over the caller.  If the
caller is blocked in pause(), the alarm will interrupt it; every
program using this library must therefore check, after each pause(),
that the condition it was waiting for has actually occurred.  This is
good UNIX programming practice, but he has no way to enforce it.

In general, the workaround is a bit unclean.  He'd like the mechanism
to be immediate and silent to callers.  Is there a better way?
--
Tim McDaniel                 Applied Dynamics Int'l.; Ann Arbor, Michigan, USA
Work phone: +1 313 973 1300                        Home phone: +1 313 677 4386
Internet: mcdaniel@adi.com                UUCP: {uunet,sharkey}!amara!mcdaniel

pawel@cs.UAlberta.CA (Pawel Gburzynski) (10/19/90)

From article <MCDANIEL.90Oct18111427@dolphin.adi.com>, by mcdaniel@adi.com (Tim McDaniel):
> 
> select is no help at all: the death of the other end of the connection
> is not perceived as an exception.
> 
Actually, select does help.  The close is not reported as an exception,
but when a socket is closed, it appears as "ready to read" on the other
side, and the read then returns EOF.
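
A minimal sketch of that check, assuming a connected stream socket.
The helper name wait_for_eof and its timeout parameter are
hypothetical.  It puts the socket in the read set (not the exception
set) and then peeks to tell EOF apart from ordinary data.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_sec seconds for sock to
 * become readable.  Returns 1 if the peer has closed the connection,
 * 0 on timeout or if ordinary data arrived, -1 on error.
 */
int wait_for_eof(int sock, int timeout_sec)
{
    fd_set rfds;
    struct timeval tv;
    ssize_t n;
    int ready;
    char c;

    FD_ZERO(&rfds);
    FD_SET(sock, &rfds);        /* watch for readability, not exceptions */
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    ready = select(sock + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;               /* timed out: nothing to read */

    /* Readable means data or EOF; peek so that real data stays queued. */
    n = recv(sock, &c, 1, MSG_PEEK);
    if (n < 0)
        return -1;
    return n == 0;              /* a zero-length read is EOF */
}
```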

                                   Pawel Gburzynski.

mcdaniel@adi.com (Tim McDaniel) (10/19/90)

I wrote:
> [My coworker, Dan Bergin]'s got a server and a client talking via
> sockets over a network, running under ULTRIX 4.0 and/or SUN OS 4.1.
> At some point, the server is killed.  He would like the client to
> receive notification ASAP that this has occurred.

Well, my coworker has solved the problem.  He had to dig through the
bowels (and "bowels" is an appropriate word) of the SUN manuals.  Not
bad, considering that Dan is a VMS hacker! 8-)

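      /*
       * gethostbyname ...
       * getservbyname ...
       * socket ...
       * connect ...
       */
      /* Install the SIGIO handler before enabling notification. */
      signal (SIGIO, routine);
      /* Have SIGIO for this socket delivered to this process. */
      if (fcntl (sock, F_SETOWN, getpid()) < 0) {
         SOCKET_PERROR ("F_SETOWN");
         exit (5);
      }
      /* Turn on asynchronous notification (this replaces any other
       * status flags already set on the socket). */
      if (fcntl (sock, F_SETFL, FASYNC) < 0) {
         SOCKET_PERROR ("F_SETFL");
         exit (5);
      }
      /* Sleep; each signal wakes pause() after its handler has run. */
      while (1) {pause ();}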

"routine" is the handler for SIGIO signals.  The first "fcntl" makes
the current process (the client) the owner of the socket, so that it
is the one to receive the signal.  The second turns on asynchronous
notification.  When the server dies, the client's end of the socket
sees an EOF.  Since that makes the socket readable (a read won't
block), a SIGIO is delivered, and the client can look around to see
the problem.  (SOCKET_PERROR is his own macro.)
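
For illustration, a self-contained sketch of the same idea.  It is not
Dan's code: the names on_sigio, arm_sigio and wait_sigio_and_check are
hypothetical, and it assumes a system that spells the flag O_ASYNC and
delivers SIGIO for the socket type in use.  The handler only sets a
flag, and the wait uses sigsuspend() so that a signal arriving just
before the wait is not lost.

```c
#define _DEFAULT_SOURCE         /* for O_ASYNC/F_SETOWN on glibc in strict modes */
#include <fcntl.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigio = 0;

/* Hypothetical handler: record the signal and do the work elsewhere. */
static void on_sigio(int signo)
{
    (void)signo;
    got_sigio = 1;
}

/* Hypothetical helper: request SIGIO for sock.  Returns 0 or -1. */
int arm_sigio(int sock)
{
    struct sigaction sa;
    int flags;

    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;
    if (fcntl(sock, F_SETOWN, getpid()) < 0)    /* signal goes to us */
        return -1;
    flags = fcntl(sock, F_GETFL, 0);
    if (flags < 0)
        return -1;
    /* OR the flag in so that existing status flags are preserved. */
    return fcntl(sock, F_SETFL, flags | O_ASYNC) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: sleep until SIGIO has been seen, then peek at
 * the socket.  Returns 1 on EOF, 0 if real data arrived, -1 on error.
 */
int wait_sigio_and_check(int sock)
{
    sigset_t block, old;
    ssize_t n;
    char c;

    sigemptyset(&block);
    sigaddset(&block, SIGIO);
    sigprocmask(SIG_BLOCK, &block, &old);
    while (!got_sigio)
        sigsuspend(&old);       /* atomically unblock SIGIO and wait */
    got_sigio = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);

    n = recv(sock, &c, 1, MSG_PEEK);    /* readable: data or EOF */
    if (n < 0)
        return -1;
    return n == 0;
}
```

SIGIO is also raised when ordinary data arrives, which is why the
sketch peeks at the socket instead of treating every SIGIO as a dead
server.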

This works under SunOS 4.1 and ULTRIX.