toddb@tekcrl.UUCP (Todd Brunhoff) (08/08/86)
This problem has eluded me for 6 months and seems to happen only on our
workstations.  We have spell set up so that the dictionaries live on
the main vax (across RFS), and in some circumstances spell simply
fails on the fopen of the hash file.  In even fewer circumstances,
the system crashes (in our case, because the mbuf containing the socket
information has been trashed, and the socket routine called in
remoteio() is NULL!).  It appears to be a race condition in opening
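the socket for the remote connection.  The fix below creates the socket
in the local variable so rather than directly in rp->r_sock.  It
seems to work and is semantically more correct, but I really can't
explain why the original code fails.  This makes me think that there
is still another bug in there somewhere.  The diff also turns the
"rmt_closehost: so == 0" message from a debug12() call into a printf().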
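
For readers who want the idea without the kernel context, here is a
minimal user-space sketch of the pattern the fix follows: build the
object through a local pointer and store it in the shared structure
only after creation has succeeded.  Everything in it (fake_socket,
fake_host, fake_create, get_connection, half_built) is a hypothetical
stand-in, not code from rmt_io.c, and the way fake_create writes to its
output pointer before failing is a deliberately pessimistic assumption,
not a claim about what socreate() really does.  Where the real code
stores so into rp->r_sock is not visible in the diff; the sketch does it
right after creation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's socket and remote-host records. */
struct fake_socket { int ready; };
struct fake_host { struct fake_socket *r_sock; };

/* Placeholder for a partially constructed object. */
struct fake_socket half_built;

/*
 * Pessimistic creator: on failure it has already written a pointer to
 * a half-built object through 'out'.  (Assumption for illustration.)
 */
int fake_create(struct fake_socket **out, int fail)
{
	if (fail) {
		*out = &half_built;
		return 1;
	}
	*out = malloc(sizeof **out);
	if (*out == NULL)
		return 1;
	(*out)->ready = 1;
	return 0;
}

/*
 * Create through a local pointer, then publish.  If creation fails,
 * rp->r_sock is never touched, so nothing else that looks at rp can
 * see a half-built socket.
 */
int get_connection(struct fake_host *rp, int fail)
{
	struct fake_socket *so = NULL;
	int err;

	err = fake_create(&so, fail);
	if (err != 0)
		return err;
	rp->r_sock = so;	/* publish only after success */
	return 0;
}
```

Passing &rp->r_sock to fake_create instead of &so would leave
rp->r_sock pointing at half_built after a failed call, which is the
kind of exposure the local variable avoids.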
RCS file: RCS/rmt_io.c,v
retrieving revision 2.4
diff -c -r2.4 rmt_io.c
*** /tmp/,RCSt1026736	Thu Aug  7 16:25:25 1986
--- rmt_io.c	Thu Aug  7 16:24:44 1986
***************
*** 11,17
   * may be copied, modified or used in any way, without fee, provided this
   * notice remains an unaltered part of the software.
   *
!  * $Header: rmt_io.c,v 2.4 86/02/17 17:43:21 toddb Exp $
   *
   * $Log:	rmt_io.c,v $
   * Revision 2.4  86/02/17  17:43:21  toddb
--- 11,17 -----
   * may be copied, modified or used in any way, without fee, provided this
   * notice remains an unaltered part of the software.
   *
!  * $Header: rmt_io.c,v 2.5 86/08/07 16:23:02 toddb Exp $
   *
   * $Log:	rmt_io.c,v $
   * Revision 2.5  86/08/07  16:23:02  toddb
***************
*** 14,19
   * $Header: rmt_io.c,v 2.4 86/02/17 17:43:21 toddb Exp $
   *
   * $Log:	rmt_io.c,v $
   * Revision 2.4  86/02/17  17:43:21  toddb
   * If the host is closing, remote_getconnection() goes right ahead and
   * closes and then tries to reopen.  This provides a pretty good recover
--- 14,23 -----
   * $Header: rmt_io.c,v 2.5 86/08/07 16:23:02 toddb Exp $
   *
   * $Log:	rmt_io.c,v $
+  * Revision 2.5  86/08/07  16:23:02  toddb
+  * Semantic fix to remote_getconnection() so that the socket in rp->r_sock
+  * is not assigned before the socket is really created.
+  * 
   * Revision 2.4  86/02/17  17:43:21  toddb
   * If the host is closing, remote_getconnection() goes right ahead and
   * closes and then tries to reopen.  This provides a pretty good recover
***************
*** 137,143
  		 * first, make a socket for the connection; then connect.  (the
  		 * connection code is basically connect(2)).
  		 */
! 		if (err = socreate(AF_INET, &rp->r_sock, SOCK_STREAM, 0))
  			break;
  
  		so = rp->r_sock;
--- 141,147 -----
  		 * first, make a socket for the connection; then connect.  (the
  		 * connection code is basically connect(2)).
  		 */
! 		if (err = socreate(AF_INET, &so, SOCK_STREAM, 0))
  			break;
  
  		debug9("connect...");
***************
*** 140,146
  		if (err = socreate(AF_INET, &rp->r_sock, SOCK_STREAM, 0))
  			break;
  
- 		so = rp->r_sock;
  		debug9("connect...");
  		err = soconnect(so, rp->r_name);
  		u.u_uid = uid;
--- 144,149 -----
  		if (err = socreate(AF_INET, &so, SOCK_STREAM, 0))
  			break;
  
  		debug9("connect...");
  		err = soconnect(so, rp->r_name);
  		u.u_uid = uid;
***************
*** 600,606
  	if (so)
  		soclose(so);
  	else
! 		debug12("rmt_closehost: so == 0, rp=%x\n", rp);
  }
  
  sendrsig(system)
--- 603,609 -----
  	if (so)
  		soclose(so);
  	else
! 		printf("rmt_closehost: so==0,rp=%x\n", rp);
  }
  
  sendrsig(system)