[comp.protocols.tcp-ip] Problems with TCP/IP sockets under 4.2.

robert@SPAM.ISTC.SRI.COM.UUCP (06/11/87)

    I have encountered a problem with TCP sockets under Sun 3.2 OS, and I was
    wondering if anyone on the list has seen the same thing or knows what
    could be causing it.

    I have an application which listens on TCP sockets, in the AF_INET family,
    for connections from other processes.  The listening socket is created
    with socket(AF_INET, SOCK_STREAM, 0) and then set to non-blocking with
    ioctl() (FIONBIO).  After the ioctl, setsockopt() (SOL_SOCKET, SO_REUSEADDR)
    is called, followed by bind(), followed by listen() with a backlog
    parameter of 5.
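
    A minimal sketch of the setup sequence described above, for reference.
    The function name make_listener() and its port argument are hypothetical
    and are not taken from the application itself; binding to INADDR_ANY is
    also an assumption.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns a non-blocking listening fd, or -1 on error. */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Non-blocking first, then SO_REUSEADDR, in the order described. */
    if (ioctl(fd, FIONBIO, &on) < 0 ||
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, (char *)&on, sizeof on) < 0)
        goto fail;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* assumption: any interface */
    addr.sin_port = htons(port);

    /* bind(), then listen() with a backlog of 5. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0)
        goto fail;

    return fd;

fail:
    close(fd);
    return -1;
}
```

    Passing a port of 0 lets the kernel pick a free port, which is handy
    when trying the sketch out.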

    When a connection is 'heard', accept() is called, and the file descriptor
    returned from accept() is used to establish a 'full' connection to the
    client process.  This file descriptor is also set to non-blocking with
    ioctl() (FIONBIO), after which it is read from.

    The problem is this: if the client process goes away abruptly, the server
    socket does not know about it.  Further, if select() is called on the
    (now supposedly 'dead') file descriptor for reading, it indicates
    that there is data there.  If the fd is then read, recv() returns 0 bytes
    and indicates no error condition.  I have seen up to 600 repeated
    select() and recv() calls without getting a -1 from recv().
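
    A small sketch that tries to reproduce the symptom described above on a
    loopback connection.  Everything here is an assumption made for
    illustration: the function name count_zero_reads(), the use of
    127.0.0.1 with a kernel-chosen port, and a plain close() on the client
    side standing in for the client 'going away'.  It counts how many
    select()/recv() rounds report the fd as readable and then return 0.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns the number of rounds in which select()
 * reported the fd readable and recv() returned 0, or -1 on setup error. */
int count_zero_reads(int rounds)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    char buf[64];
    int on = 1, zeros = 0, i;
    int lfd, cfd, sfd;

    /* Listener on the loopback address; port 0 lets the kernel choose. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 5) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }

    /* Client connects to whatever port the listener was given. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        if (cfd >= 0)
            close(cfd);
        close(lfd);
        return -1;
    }

    /* The listener is left blocking here so accept() simply waits. */
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) {
        close(cfd);
        close(lfd);
        return -1;
    }
    ioctl(sfd, FIONBIO, &on);   /* accepted fd is made non-blocking */

    close(cfd);                 /* the client 'goes away' */

    for (i = 0; i < rounds; i++) {
        fd_set rfds;
        struct timeval tv;

        FD_ZERO(&rfds);
        FD_SET(sfd, &rfds);
        tv.tv_sec = 1;          /* do not hang if the fd never turns readable */
        tv.tv_usec = 0;

        if (select(sfd + 1, &rfds, NULL, NULL, &tv) == 1 &&
            recv(sfd, buf, sizeof buf, 0) == 0)
            zeros++;
    }

    close(sfd);
    close(lfd);
    return zeros;
}
```

    Calling count_zero_reads(600) mirrors the 600 repeated rounds mentioned
    above; each round that shows the behaviour adds one to the result.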

    My questions: 1. Is this a bug in the 4.2BSD networking code (it has been a
    'feature' of the Sun kernel since OS 2.2 or so), or 2. incorrect coding
    on my part in establishing the socket, or 3. a result of the fact that
    exceptfd is not implemented for select on 4.2BSD, or 4. my use of the
    non-blocking socket?

    Currently I close down the socket after 100 zero-byte recv() calls, but I'd
    like to have a better idea of why this is happening, and how it can be fixed.
    Any clues will be gratefully accepted.

    Robert Allen,
    robert@spam.istc.sri.com
    415-859-2143