[comp.protocols.tcp-ip] select

sean@fiamass.ie (Sean Mc grath) (01/17/91)

We are having a problem with the select() system call on a Sun Sparc 4/65
running SunOS 4.1.  We have a non-blocking socket which we use to issue
a connect().  This call returns EINPROGRESS as per the documentation.  The
documentation states that under these circumstances a select() call for
write on that socket can be done to determine when the socket is fully
connected.  Now here is our problem: we make a select() call which
always seems to return telling me that the socket is now connected.  So on
I go and issue a write(), which causes the program to bomb out!  It is as
if the write call does not return.  Here is the code.  Has anyone any ideas?

#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <sys/time.h>
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

main()
{
int	sockid;			/* the socket descriptor */
int	x;			/* FIONBIO expects a pointer to int, not char */
struct sockaddr_in server;	/* address of the server */
fd_set	writefds;		/* descriptors to test for writability */
struct timeval timeout;		/* how long select() may wait */
struct hostent *hp;		/* netdb.h already declares gethostbyname() */

/* Get a socket */
	sockid = socket(AF_INET,SOCK_STREAM,0);

/* make it NONBLOCKING */
	x=1;
	if(ioctl(sockid,FIONBIO,&x)==-1)
		perror("Cannot make socket non-blocking");
	server.sin_family = AF_INET;

/* fill in server details */	
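	hp = gethostbyname ("server");
	if(hp==0){
		/* gethostbyname() does not set errno, so perror() would mislead */
		fprintf(stderr,"gethostbyname: unknown host\n");
		return 1;	/* hp is dereferenced below, so stop here */
	}

	(void)bcopy((char *)hp->h_addr,(char *)&server.sin_addr,hp->h_length);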
/* set up the servers port number */
	server.sin_port = (unsigned short)htons(1395);

/* connect up the socket */
/* this call returns with EINPROGRESS */	
	if(connect(sockid,(struct sockaddr *)&server,sizeof(server))==-1)
		perror("Connect returns ");

/* set up the select call */
	timeout.tv_sec=0;
	timeout.tv_usec=0;
	FD_ZERO(&writefds);
	FD_SET(sockid,&writefds);
	select(FD_SETSIZE,(fd_set *)NULL,&writefds,(fd_set *)NULL,&timeout);

	/* now see if we can legally write in this socket */
	if(FD_ISSET(sockid,&writefds)){
		printf("Select says OK to write\n");
		/* we get as far as here and then we are bombed back to SUNOS*/
		if(write(sockid,"abcdefghij",10) == -1){
			perror("Write failed !!");
			printf("Explain that one !!\n");
      }
    }
printf ("This line never gets printed because the write() call never returns");
}

george@na.excelan.com (George Powers) (01/22/91)

In article <9101171257.AA13517@fiamass.ie> fiamass@fiamass.ie (fiamass) writes:
>
>We are having a problem with the select() system call on a Sun Sparc 4/65
>running SunOS 4.1.  We have a non blocking socket which we use to issue 
>a connect().  This call returns EINPROGRESS as per the documentation.  The 
>documentation states that under these circumstances a select() call for 
>write on that socket can be done to determine when the socket is fully
>connected. Now here is our problem, we make a select() call which 
>always seems to return telling me 
>that the socket is now connected. So on I go and issue a write() which
>causes the program to bomb out!  It is as if the write call does not return.
>Here is the code. Has anyone any ideas?

This question deals with an aspect of socket programming that seems
to be frequently misunderstood, so I am posting my response:

I am not sure exactly why your program fails, but your code does
not conform to standard practice in using select.

One problem is that you assume that select completion means that the
socket is in the next expected state.  Owing to the nature of select's
implementation, you must treat it as meaning that the socket might be
in the expected state, but then again it might not be.  It may have
suffered an error, or it may just be a false alarm.  You should retry
the connect operation after each select returns until connect returns
errno==EISCONN.  Then try the write operation.

Also, your select specifies a zero wait time, which means that in this
example the connection is probably not established when you try the
write.  You should probably wait indefinitely.

These remarks pertain to operations besides connect.  When you select
for writing, you should be prepared for write to return EWOULDBLOCK,
or to write only part of the amount requested.  You may not observe
these things in practice, but they can happen on some systems in
certain circumstances, without violating the operations as documented.

--
UUCP: {ames,sun,apple,mtxinu,cae780,sco}!novell!george  George Powers
Internet: george@novell.com 
--

towfiq@FTP.COM (Mark Towfiq) (01/24/91)

In article <9101171257.AA13517@fiamass.ie> fiamass@fiamass.ie (fiamass) writes:

   We are having a problem with the select() system call on a Sun Sparc
   4/65 running SunOS 4.1.  We have a non blocking socket which we use to
   issue a connect().  This call returns EINPROGRESS as per the
   documentation.  The documentation states that under these
   circumstances a select() call for write on that socket can be done
   to determine when the socket is fully connected. Now here is our
   problem, we make a select() call which always seems to return telling
   me that the socket is now connected. So on I go and issue a write()
   which causes the program to bomb out!  It is as if the write call does
   not return.  Has anyone any ideas?

First, the question here is what is the purpose of the select() call.
In my experience, select() is used in two situations: 1) you have more
than one socket/file descriptor which you want to perform operations
on, so you use select to find out which one is ready, perform the
operations, and come back to the select; 2) you have an operation
which you want to complete in a certain amount of time (for example,
this is used in the name resolver code to send out a few UDP packets
to one nameserver, then another, and so on).  From your example it does
not seem to me that you need select for either of these reasons,
although perhaps that is just a test program.

I don't know for sure why your code does not work on a Sun, but I
would suggest some modifications to your code: 1) check the return
value of the select call, and use the information provided by the
manual to decipher what the value might be.  If this doesn't work, try
2) using an actual timeout value, as another poster suggested.  If
this still doesn't work, then try 3) doing your connect before making
the socket non-blocking.  If none of this works, someone is broken
somewhere.

Hope this helps,
Mark
--
Mark Towfiq, FTP Software, Inc.                                  towfiq@FTP.COM
Work No.: +1 617 246 0900			      Home No.: +1 617 488 2818

  "The Earth is but One Country, and Mankind its Citizens" -- Baha'u'llah