robert@SPAM.ISTC.SRI.COM.UUCP (06/11/87)
I have encountered a problem with TCP sockets under Sun 3.2 OS, and I was
wondering if anyone on the list has encountered the same problem, or if
anyone knows what could be causing the problem.
I have an application which listens on TCP sockets, in the AF_INET family,
for connections from other processes. The listening socket is set to non-
blocking with ioctl() (FIONBIO) after the call to socket() (AF_INET,
SOCK_STREAM,0). After the ioctl, setsockopt() (SOL_SOCKET,SO_REUSEADDR)
is called, followed by bind(), followed by listen() with a backlog parameter
of 5.
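
For reference, the setup sequence just described might look roughly like
the sketch below. The function name make_listener, the port argument, and
binding to INADDR_ANY are assumptions made for illustration; they are not
details taken from the actual application.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: create a non-blocking TCP listening socket.
 * Returns the descriptor, or -1 on any failure.
 */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;

    /* Same order as in the description: FIONBIO first, then SO_REUSEADDR. */
    if (ioctl(s, FIONBIO, &on) < 0 ||
        setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *)&on, sizeof(on)) < 0) {
        close(s);
        return -1;
    }

    /* Assumption: listen on all local addresses. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Backlog of 5, as described above. */
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(s, 5) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

Because the listening socket is non-blocking, an accept() on it with no
connection pending returns -1 immediately rather than waiting.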
When a connection is 'heard', accept() is called, and the file descriptor
returned from accept() is used to establish a 'full' connection to the
client process. This file descriptor is also set non-blocking with
ioctl() (FIONBIO), after which it is read from.
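
The accept step could be sketched as follows. The name accept_client is
hypothetical, and socklen_t is the length type used by current POSIX
headers (an assumption about the build environment; older BSD headers use
a plain int here).

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: accept one pending connection on 'listener' and
 * put the new descriptor into non-blocking mode.
 * Returns the new descriptor, or -1 if accept() or ioctl() fails
 * (which includes "nothing pending" on a non-blocking listener).
 */
int accept_client(int listener)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    int on = 1;
    int fd = accept(listener, (struct sockaddr *)&peer, &len);

    if (fd < 0)
        return -1;

    /* Same ioctl as on the listening socket. */
    if (ioctl(fd, FIONBIO, &on) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```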
The problem is this: if the client process goes away abruptly, the server
socket does not know about it. Further, if select() is called on the
(now supposedly 'dead') file descriptor for reading, it indicates
that there is data there. If the fd is read, 0 bytes are received;
no error condition is indicated by the recv(). I have seen up to 600 repeated
select() and recv() calls without getting a -1 from recv().
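
The select()/recv() cycle described above can be sketched as below. One
point worth keeping in mind: on a stream socket, a return value of 0 from
recv() is ordinarily the end-of-file indication, meaning the peer has
closed its end, and select() keeps reporting the descriptor as readable
for as long as that end-of-file is pending. Whether that accounts for the
behaviour seen here is an assumption, not something established above. The
function name check_client and the CLIENT_* codes are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Hypothetical result codes for one poll of a client descriptor. */
enum {
    CLIENT_DATA   = 1,   /* *nread bytes were placed in buf */
    CLIENT_IDLE   = 0,   /* nothing to read right now */
    CLIENT_CLOSED = -1,  /* recv() returned 0: end-of-file */
    CLIENT_ERROR  = -2   /* select() or recv() failed */
};

int check_client(int fd, char *buf, int buflen, int *nread)
{
    fd_set readfds;
    struct timeval tv;
    int n;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = 0;          /* zero timeout: poll without blocking */
    tv.tv_usec = 0;

    n = select(fd + 1, &readfds, (fd_set *)0, (fd_set *)0, &tv);
    if (n < 0)
        return CLIENT_ERROR;
    if (n == 0 || !FD_ISSET(fd, &readfds))
        return CLIENT_IDLE;

    n = recv(fd, buf, buflen, 0);
    if (n > 0) {
        *nread = n;
        return CLIENT_DATA;
    }
    if (n == 0)
        return CLIENT_CLOSED;   /* no -1 will follow; 0 is the signal */
    if (errno == EWOULDBLOCK || errno == EAGAIN)
        return CLIENT_IDLE;     /* non-blocking socket, no data yet */
    return CLIENT_ERROR;
}
```

Under this reading a caller would close the descriptor on the first
CLIENT_CLOSED result instead of counting repeated 0-byte reads.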
My questions: is this 1. a bug in the 4.2BSD networking code (it has been a
'feature' of the Sun kernel since OS 2.2 or so), 2. incorrect coding
on my part in establishing the socket, 3. a result of the fact that
exceptfd is not implemented for select() on 4.2BSD, or 4. a consequence
of my use of the non-blocking socket?
Currently I close down the socket after 100 0-byte recv()'s, but I'd like
to have a better idea of why this is happening, and how it can be fixed.
Any clues will be gratefully accepted.
Robert Allen,
robert@spam.istc.sri.com
415-859-2143