fouts@orville (Martin Fouts) (12/02/84)
The attached two programs are the shortest example I can construct of a bug
that causes 4.2bsd to go into an infinite loop.

The first program, talk.c, opens a socket, connects to the server listening on
that socket, sends it a message, and then exits. When used with another
program it works fine. When used with the second program, it causes the
system to hang.

The second program, willcrash.c, opens a socket, binds that socket to a stream
address, listens on that socket for connections, and then does an unrelated
select. The select should not return until there is input on channel 0, which
is standard input.

To cause the bug to happen:

1) Compile the two programs:

	cc -g -o willcrash willcrash.c
	cc -g -o talk talk.c

2) Run willcrash:

	willcrash

3) Run talk:

	talk test hi 1

4) Attempt to terminate willcrash, either by typing a CTRL-C, or by doing
   a kill.

At this point, the system will go into an infinite loop.

Any help in solving this one would be greatly appreciated.

Thanks,
Marty
fouts@ames-nas

---------------------- talk.c ------------------------------------------------
/* talk.c -- unix domain experiment */

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

extern int errno;

main(argc,argv)
char *argv[];
{
	int sock;			/* unix socket file descriptor */
	char ofname[20];		/* unix socket name */
	char *request;
	char *crayp;
	int junk;
	int loop;
	struct sockaddr socketname;

	/*
	 * Crack the parameters.  argv[3] is read below, so all three
	 * arguments (socket, message, count) are required.
	 */
	if (argc < 4)
		usagerr();
	crayp = argv[1];
	printf(" Attempting to use socket %s.\n", crayp);

	sock = socket(AF_UNIX,SOCK_STREAM,0);
	if (sock < 0) {
		perror("Talk can't open socket");
		exit(1);
	}

	socketname.sa_family = AF_UNIX;
	strcpy(socketname.sa_data,crayp);
	if (connect(sock,&socketname,sizeof(struct sockaddr)) < 0) {
		close(sock);
		perror("talk: Connect failed");
		exit(1);
	}

	request = argv[2];
	loop = atoi(argv[3]);
	for (junk=0; junk < loop; junk++) {
		if (write(sock, request, strlen(request)) < 0) {
			perror("Talk can't send message");
			exit(1);
		}
	}
	close(sock);
}

/*
 * Indicate a usage error and exit.
 */
usagerr()
{
	fprintf(stderr, "usage: talk socket message count\n");
	exit(1);
}
-------------------- willcrash.c ----------------------------------------------
/*
 * This version will crash 4.2
 */
#include <sys/types.h>
#include <sys/socket.h>

main()
{
	int fd;
	struct sockaddr s1;
	int ready = 1;		/* bit mask selecting descriptor 0 */

	fd = socket (AF_UNIX, SOCK_STREAM, 0);
	s1.sa_family = AF_UNIX;
	strcpy (s1.sa_data, "test");
	bind (fd, &s1, sizeof (struct sockaddr));
	listen (fd, 5);
	select (20, &ready, 0, 0, 0);
}
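
[Editorial sketch, not part of the original post.] willcrash.c hands select an
int used as a bit mask, with bit 0 standing for standard input. On current
systems the same "wait until this descriptor is readable" step would normally
be written with fd_set. The helper below is a minimal sketch of that idea; the
name wait_readable is hypothetical, and it assumes a POSIX system and a
descriptor smaller than FD_SETSIZE.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Wait until fd has input available.
 * Returns 1 if fd is readable, 0 if the timeout expired, -1 on error.
 * A NULL timeout blocks indefinitely, as the select in willcrash.c does.
 * (wait_readable is a hypothetical helper name; fd must be < FD_SETSIZE.)
 */
static int wait_readable(int fd, struct timeval *timeout)
{
	fd_set readfds;
	int n;

	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);

	/* First argument is the highest descriptor in any set, plus one. */
	n = select(fd + 1, &readfds, NULL, NULL, timeout);
	if (n < 0)
		return -1;
	return FD_ISSET(fd, &readfds) ? 1 : 0;
}
```

Calling wait_readable(0, NULL) would correspond to the select in willcrash.c:
block until standard input has something to read.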
jim@haring.UUCP (12/12/84)
There was indeed a bug in early versions of 4.2 which caused this to happen.
The problem was trying to connect to a socket where the server process exited
before accepting the connection: various parts of the uipc code assumed that
another part would tidy up partially completed connects, and looped waiting
for it to happen.

Unfortunately our system has changed so much that I cannot easily make a diff
for this bug. Perhaps someone else out there has it handy (it has been
discussed in unix-wizards before, about a year ago)?

Now, I know your examples are just 'shorts' designed to show the bug, but
perhaps they can be used to show a couple of things that are not clear in the
'IPC primer' or anywhere else for that matter:

1) for the UN*X domain you should include <sys/un.h> and use 'sockaddr_un'
   instead of 'sockaddr';

2) the third argument to the 'connect' and 'bind' calls for the UN*X domain
   is the size of the string which is the name of the socket plus the size
   of the 'sun_family' element of the 'sockaddr_un' structure, e.g.

	strlen(socketname.sun_path) + sizeof(socketname.sun_family)

   where 'sun_path' is the element of the 'sockaddr_un' structure which
   contains the name (and is 108 characters in maximum size);

3) the server process needs to do an 'accept' call for the connection to
   complete. This is, in fact, why the program exhibits the panic: no
   accept is done to complete the connection. This is how I found the bug
   a long time ago.

Hope that helps, and also that someone can dig up the bug fix. Good luck.

Jim McKie
Centrum voor Wiskunde en Informatica, Amsterdam
mcvax!jim
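
[Editorial sketch, not part of the original post.] The three points above can
be put together roughly as follows. This is a sketch for a current POSIX
system, not the 4.2BSD code under discussion. The helper names (unix_addr,
serve_unix, connect_unix, accept_and_read) are hypothetical. One assumption
differs from the formula in point 2: the length is computed with
offsetof(struct sockaddr_un, sun_path) rather than sizeof(sun_family), because
some systems place an extra length byte before the family field.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Point 1 and 2: fill a sockaddr_un and return the length to pass to
 * bind/connect (header bytes before sun_path, plus the name length).
 * Returns 0 if the name does not fit in sun_path.
 */
static socklen_t unix_addr(struct sockaddr_un *addr, const char *path)
{
	if (strlen(path) >= sizeof(addr->sun_path))
		return 0;
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	strcpy(addr->sun_path, path);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + strlen(path));
}

/* Create a stream socket bound to path and listening; returns fd or -1. */
static int serve_unix(const char *path)
{
	struct sockaddr_un addr;
	socklen_t len = unix_addr(&addr, path);
	int fd;

	if (len == 0 || (fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
		return -1;
	unlink(path);	/* a stale socket file would make bind fail */
	if (bind(fd, (struct sockaddr *)&addr, len) < 0 || listen(fd, 5) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Connect a stream socket to the server at path; returns fd or -1. */
static int connect_unix(const char *path)
{
	struct sockaddr_un addr;
	socklen_t len = unix_addr(&addr, path);
	int fd;

	if (len == 0 || (fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&addr, len) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Point 3: the server accepts the pending connection before reading.
 * Returns the number of bytes read into buf, or -1 on error.
 */
static ssize_t accept_and_read(int listen_fd, char *buf, size_t size)
{
	int conn = accept(listen_fd, NULL, NULL);
	ssize_t n;

	if (conn < 0)
		return -1;
	n = read(conn, buf, size);
	close(conn);
	return n;
}
```

A server along these lines would call serve_unix("test") once and then
accept_and_read in a loop, while the client side of talk.c would use
connect_unix("test") followed by its write loop.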