dpk@BRL.ARPA (Doug Kingston) (12/17/86)
Index:	sys/rpc/clnt_kudp.c FIX (Gould version and others?)

Description:
	The SUN kernel-mode RPC can hang while doing a remote RPC that
	should time out. An example is NFS when you remote mount a
	filesystem "soft": some RPCs to this filesystem will hang, and
	mount is one such RPC. This problem was found in our Gould
	kernels, which contain code almost identical to the SUN code.
	I know some other vendors using this code also have the problem.

	Specifically, the problem is that the function clntkudp_callit
	does a sleep on &so->so_rcv.sb_cc. The timeout routine,
	ckuwakeup(), had an incorrect wakeup value, which is corrected
	below.

Repeat-By:
	Edit /etc/fstab to remote mount a filesystem.
	Shut down the remote system.
	Reboot your system.
	Watch your system hang when it attempts to mount the NFS
	filesystems.

Fix:
	Apply the following diff:

*** /tmp/,RCSt1016522	Tue Dec 16 20:37:15 1986
--- clnt_kudp.c	Mon Dec  8 22:34:48 1986
***************
*** 498,504 ****
  	rpc_debug(4, "cku_timeout\n");
  #endif
  	p->cku_flags |= CKU_TIMEDOUT;
! 	sbwakeup(&p->cku_sock->so_rcv);
  }
  
  /*
--- 498,504 ----
  	rpc_debug(4, "cku_timeout\n");
  #endif
  	p->cku_flags |= CKU_TIMEDOUT;
! 	sbwakeup(&p->cku_sock->so_rcv.sb_cc);
  }
  
  /*
chris@mimsy.UUCP (02/18/87)
In article <1601@brl-adm.ARPA> dpk@BRL.ARPA (Doug Kingston) suggests
changing an
	sbwakeup(&p->cku_sock->so_rcv);
to
	sbwakeup(&p->cku_sock->so_rcv.sb_cc);
Yet sbwakeup() takes a `struct sockbuf *', not an `int *', and does
a wakeup on &sb->sb_cc. How can the above be right?
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP: seismo!mimsy!chris ARPA/CSNet: chris@mimsy.umd.edu
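
The objection above can be checked with a small user-space model. This is only a sketch under stated assumptions: toy_sleep() and toy_wakeup() are hypothetical stand-ins that merely record the channel address, the two-member struct sockbuf is a cut-down illustration rather than the kernel's definition, and sbwakeup() is written only from the description in this thread (it takes a `struct sockbuf *' and wakes &sb->sb_cc).

```c
#include <stddef.h>

/* Cut-down stand-in for the kernel structure (assumption: the real
 * struct sockbuf has more members; only sb_cc matters here). */
struct sockbuf {
    unsigned short sb_cc;     /* byte count; its address is the sleep channel */
    unsigned short sb_hiwat;  /* present only so the struct has a second member */
};

struct socket {
    struct sockbuf so_rcv;
};

/* Hypothetical model of sleep()/wakeup(): record the channel addresses. */
static const void *slept_on;
static const void *woken_on;

static void toy_sleep(const void *chan)  { slept_on = chan; }
static void toy_wakeup(const void *chan) { woken_on = chan; }

/* As described in the thread: wake whoever sleeps on &sb->sb_cc. */
static void sbwakeup(struct sockbuf *sb) { toy_wakeup(&sb->sb_cc); }

/* Returns 1 if the pre-patch timeout call wakes the channel that
 * clntkudp_callit is said to sleep on, 0 otherwise. */
int original_call_matches(struct socket *so)
{
    toy_sleep(&so->so_rcv.sb_cc);  /* the sleep channel named in the report */
    sbwakeup(&so->so_rcv);         /* the call as it stood before the patch */
    return slept_on == woken_on;
}
```

In this model original_call_matches() returns 1 for any socket, so the unpatched call already reaches the sleeper. The patched form, sbwakeup(&so->so_rcv.sb_cc), passes a pointer of the wrong type; a compiler that checks prototypes would complain about it, whereas pre-ANSI C without prototypes would accept it silently.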
guy@gorodish.UUCP (02/26/87)
>Yet sbwakeup() takes a `struct sockbuf *', not an `int *', and does
>a wakeup on &sb->sb_cc. How can the above be right?

It can't. I believe the original bug report spoke of problems with soft
mounts hanging forever; there is a bug here, but it's unrelated to the
RPC code.

When you do a mount, you make some NFS calls to do things such as
getting the attributes of the directory you're mounting. However,
"nfsrootvp" doesn't set (or, more precisely, clear) the "mi_hard" entry
for the file system based on the mount options until *after* all those
NFS calls have been made. Thus, if the mount server for a host is
responding but the NFS server isn't, the "mount" will hang.
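
The ordering problem described above can be illustrated with a toy model. This is a sketch, not the kernel code: only the name mi_hard comes from the post, while struct mntinfo's layout, toy_nfs_call(), the two mount_* functions, and the result codes are hypothetical. It also assumes mi_hard starts out set, which is inferred from the "(or, more precisely, clear)" wording. An endless retry is modelled as the return value CALL_WOULD_HANG so that the example terminates.

```c
/* Hypothetical result codes for the model. */
enum { CALL_OK = 0, CALL_TIMEDOUT = 1, CALL_WOULD_HANG = 2 };

/* Cut-down stand-in for the per-mount information. */
struct mntinfo {
    int mi_hard;  /* nonzero: retry forever; zero: give up after a timeout */
};

/* Model one NFS call; server_up is nonzero if the NFS server answers. */
int toy_nfs_call(const struct mntinfo *mi, int server_up)
{
    if (server_up)
        return CALL_OK;
    return mi->mi_hard ? CALL_WOULD_HANG : CALL_TIMEDOUT;
}

/* Ordering described in the post: NFS calls first, mount option after. */
int mount_flag_after(int soft, int server_up)
{
    struct mntinfo mi = { 1 };              /* assumed default: hard */
    int rc = toy_nfs_call(&mi, server_up);  /* e.g. fetching attributes */
    if (soft)
        mi.mi_hard = 0;                     /* too late to affect the call */
    return rc;
}

/* One plausible reordering: apply the mount option before any NFS call. */
int mount_flag_before(int soft, int server_up)
{
    struct mntinfo mi = { 1 };              /* assumed default: hard */
    if (soft)
        mi.mi_hard = 0;
    return toy_nfs_call(&mi, server_up);
}
```

With a soft mount and an unresponsive NFS server, mount_flag_after(1, 0) yields CALL_WOULD_HANG while mount_flag_before(1, 0) yields CALL_TIMEDOUT, which matches the symptom of a "soft" mount hanging at boot.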