rbd@lamont.ldgo.columbia.edu (roger davis) (07/11/90)
A few weeks ago I posted a note to this newsgroup reporting a problem with diskless clients hanging at login time. Since that time, I've ruled out some of my earlier suspicions and come up with one very strong lead, so I'm reposting this in the hope that someone out there can shed some light. Here's the gist of my original posting:

> Our site has been experiencing a lot of problems with hanging
> logins on Sun/4s running 4.0.3c. Symptoms are as follows:
>
> 1) User enters login name.
> 2) System responds with 'Password:' prompt.
> 3) User enters password, and the system hangs indefinitely.
>
> There are no 'yp server *** not responding ...' console messages,
> no nothing. This happens with root logins as well as non-root logins,
> and happens with rlogins from remote machines as well as local
> console logins. There are no messages of any kind in /var/adm/messages.
> If the system is rebooted, the next login attempt will usually succeed,
> but on several occasions I've had to reboot machines twice before
> anyone could get in.
>
> I am certain that this has nothing to do with a user's home directory
> (or any other $PATH directory) being temporarily unavailable due to
> a down server, because users can usually go to an identically configured
> diskless machine sitting right beside the hung machine and log in
> with no problem at all.

(Many of the original respondents to the posting seemed to have missed the last paragraph above, suggesting that home directories or some other filesystem resource were unavailable due to some NFS server being down at the time. Once again for added emphasis, all file servers on the network are **UP** and functioning normally at the time this happens.)

I've examined network traffic with etherfind at the time of the problem, and there appear to be **no** packets either coming from or going to the hung client.

At this point I'm personally convinced that the automounter is causing the problem.
I disabled the automounter on one client, putting all of its NFS mounts back into /etc/fstab. In about two and a half weeks, the system (which was formerly hanging almost every night) did not hang *once*. Yesterday I reenabled the automounter, and this morning the machine hung again when I attempted to log in at 9 AM.

tran@ics.uci.edu reported problems which appear to be identical to ours, but seem to be related to YP. Since our automounter uses a YP auto.master map, I'm speculating that a YP/automount interaction may be the source of the trouble. (In fact, certain of our machines use their own local map instead of the YP map (they're started with 'automount -m -f /etc/auto.master') and these machines *never* experience the problem.)

Here's our automount setup:

1) automount is started from the client with the following rc.local command:

	if [ -f /etc/auto.direct ]; then
		automount; (echo -n ' automount') >/dev/console
	fi

2) This causes automount to ask YP for the auto.master map, which is:

	client_host% ypcat -k auto.master
	/- /etc/auto.direct -rw,bg,hard,intr # master automount config file

3) automount then uses the local /etc/auto.direct file, whose contents are as follows (partial listing):

	# automount direct map
	/usr/share		-ro,soft	miles:/export/share
	/var/spool/batch			lamont:/usr/spool/batch
	/clipper/aegean				clipper:&
	/clipper/geobase			clipper:&
	/duke/data				duke:&
	/duke/duke				duke:&
	/duke/ldgosrc				duke:&
	/dyna/data				dyna:&
	/enso/data				enso:&
	/enso/scratch				enso:&
	/miles/src				miles:&
	/miles/dev				miles:&
	/miles/image				miles:&
	/miles/scratch				miles:&
	/trane/data				trane:&
	/trane/src				trane:&
	/trane/sys				trane:&

Does anybody see anything wrong with this scheme? Nowhere does Sun give any explicit instruction on creating a YP auto.master map -- I just hacked one up by copying an existing /var/yp/Makefile entry and modifying it in the obvious ways. Everything does in fact seem to work, except of course for these bizarre login hangs.
Has anyone else ever used a YP auto.master map successfully? Since it's not well-documented and Sun supplies no YP Makefile entry for such a thing, I suspect that there may not be many people out there doing this.
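
For readers not used to the '&' shorthand in the direct map above: as far as I understand the automounter's map syntax, '&' in the location field stands for the entry's key, which in a direct map is the full mount point path. The following is only a rough sketch for eyeballing what each entry should resolve to; expand_direct_map is a made-up helper name, not part of automount, and the sketch assumes one entry per line with an optional options field beginning with '-'.

```shell
#!/bin/sh
# Sketch: expand the '&' shorthand in an automount direct map read from stdin.
# Prints one line per entry: mount point, options ("-" if none), location.
# expand_direct_map is a hypothetical helper name, not part of automount.
expand_direct_map() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }        # skip comments and blank lines
        {
            key = $1
            if (NF >= 3 && $2 ~ /^-/) {       # optional mount options field
                opts = $2; loc = $3
            } else {
                opts = "-"; loc = $2
            }
            # Replace each "&" in the location with the key. split() is used
            # instead of gsub() because "&" is special in gsub replacements.
            n = split(loc, part, "&")
            out = part[1]
            for (i = 2; i <= n; i++) out = out key part[i]
            print key, opts, out
        }
    '
}

# Example using three entries from the map above.
expand_direct_map <<'EOF'
# automount direct map
/usr/share        -ro,soft  miles:/export/share
/var/spool/batch            lamont:/usr/spool/batch
/clipper/aegean             clipper:&
EOF
```

With those three entries, the last one should come out as clipper:/clipper/aegean, while the first two already name their server paths explicitly and pass through unchanged. Running the whole of /etc/auto.direct through the function gives a list that can be compared by hand against the exports on each server.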