barbour@boulder.Colorado.EDU (Jim Barbour) (06/30/89)
We have an instructional lab which has 2 clusters. Each file server is a
9000/350 and the cnodes are 320s. There are 9 cnodes per cluster.

Currently, we have users' files cross-mounted between clusters, i.e. cluster A
user files are remotely mounted on cluster B and cluster B user files are
remotely mounted on cluster A. Thus, if a user logs onto a machine, his/her
home directory could very well be NFS-mounted. So far, this has worked fine.

However, we are planning to upgrade from HP-UX 6.2 to 6.5 very soon, and we
discovered a very serious problem in the release notes. Apparently, if you
are logged in on a cnode and your cwd is an NFS directory, you cannot start
up csh. This would seem to indicate that we would no longer be able to
cross-mount these user files. Because of the configuration of the clusters --
software availability and so on -- this is highly undesirable.

I realize that I could give each person a local home directory on each
machine. However, can anyone suggest another workaround for this problem?

Jim Barbour (barbour@alumni.Colorado.EDU)
C.U. Boulder -- HP operations
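For readers unfamiliar with the setup: the cross-mounting described above
might be expressed in /etc/checklist entries roughly like these (host names
and paths are hypothetical; check the exact fields against checklist(4)):

    # on cluster A's server: mount cluster B's home directories
    clusterB:/users/B /users/B nfs rw 0 0  # cluster B homes

    # on cluster B's server: mount cluster A's home directories
    clusterA:/users/A /users/A nfs rw 0 0  # cluster A homes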
diamant@hpfclp.SDE.HP.COM (John Diamant) (07/02/89)
> However, we are planning to upgrade from HP-UX 6.2 to 6.5 very soon. We
> discovered in the release notes a very serious problem. Apparently, if
> you are logged in on a cnode and your cwd is an NFS directory, you cannot
> start up csh.

The problem referred to in the release notes is not with NFS, but with RFA
(netunam). I am running 6.5 on my cluster and just confirmed that csh
starts up just fine when my current directory is under an NFS mount point.
I believe the passage in the release notes you're referring to is the
following one:

    It is no longer possible to start "csh(1)" when the current directory
    is on an RFA-connected file system. Both "sh(1)" and "ksh(1)" will
    start properly in a remote directory.

RFA is the name of the HP-proprietary transparent remote file access system
(accessed via the netunam shell builtins and the netunam system call). It
does not refer to NFS.

> This would seem to indicate that we would no longer be able to cross-mount
> these user files. Because of the configuration of the clusters -- software
> availability and so on -- this is highly undesirable.

Since the problem is with RFA and not NFS, you should not have any problem
with your configuration. By the way, the bug is supposed to be fixed in a
forthcoming release.

John Diamant
Software Engineering Systems Division    Internet: diamant@hpfclp.sde.hp.com
Hewlett-Packard Co.                      UUCP: {hplabs,hpfcla}!hpfclp!diamant
Fort Collins, CO
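To make the distinction concrete, here is a rough sketch of the two kinds of
remote access (the host and directory names are hypothetical, the /net/hostb
network special file is assumed to already exist, and the netunam invocation
is from memory -- see netunam(1)):

    # RFA: attach a remote host's file system via netunam, then cd into it.
    # This is where 6.5's csh fails to start, per the release notes.
    netunam /net/hostb
    cd /net/hostb/users/jim
    csh                      # fails under 6.5

    # NFS: an ordinary mount point.  csh starts normally here.
    cd /users/b/jim          # /users/b is NFS-mounted
    csh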
raveling@venera.isi.edu (Paul Raveling) (07/05/89)
In article <7540029@hpfclp.SDE.HP.COM> diamant@hpfclp.SDE.HP.COM (John Diamant) writes:
>> However, we are planning to upgrade from HP-UX 6.2 to 6.5 very soon. We
>> discovered in the release notes a very serious problem. Apparently, if
>> you are logged in on a cnode and your cwd is an NFS directory, you cannot
>> start up csh.
>
>The problem referred to in the release notes is not with NFS, but with
>RFA (netunam). I am running 6.5 on my cluster and just confirmed that
>csh starts up just fine when my current directory is under an NFS mount point.

We have no problem starting csh when all hosts are alive that we have NFS
mounts on, but have experienced an inability to start csh when any of those
hosts has gone down. (Further qualifications follow.) This happens if ANY
of those hosts goes down -- it's not necessary to have a file open on the
defunct host. The situation cures itself when the dead host revives.

What usually happens is that assorted operations within X windows start
looking catatonic. I usually try to su in an existing window or open a new
xterm window to check it out; in both cases starting up the new csh hangs.

However, in these circumstances csh comes up successfully as a login shell.
I've been able to log out & log in again as root, using csh, to unmount the
offending file system. I've also logged in as myself with csh as the login
shell to bring up X without unmounting anything, and X comes up, but X
clients that depend on csh hang.

It appears that our Sun users have the same problem, suggesting it's purely
an NFS behavior. That's 2nd-hand info -- I haven't personally looked at it
on Suns.

To confuse matters more, it appeared that for a brief time the problem went
away, then came back. Perhaps it's sensitive to some piece of setup in
.cshrc files, but we haven't spotted it yet.

BTW, we're not running totally diskless. Each workstation has local swap
space and enough file system to have a local kernel, /bin, and /etc
directories. Practically everything else, including /usr and all users'
home directories, is mounted via NFS.

----------------
Paul Raveling
Raveling@isi.edu
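To make the root-login workaround above concrete, a recovery sequence might
look like the following sketch (the server and mount-point names are
hypothetical):

    # from a fresh login as root, where csh still comes up as a login shell:
    /etc/mount                 # list mounted file systems; spot the dead server
    ping deadhost              # confirm it's unreachable; interrupt with ^C
    /etc/umount /nfs/deadhost  # unmount the offending file system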
burzio@mmlai.UUCP (Tony Burzio) (07/07/89)
In article <8840@venera.isi.edu>, raveling@venera.isi.edu (Paul Raveling) writes:
> We have no problem starting csh when all hosts are alive that we have NFS
> mounts on, but have experienced an inability to start csh when any of those
> hosts has gone down. (Further qualifications follow.) This happens if ANY
> of those hosts goes down -- it's not necessary to have a file open on the
> defunct host. The situation cures itself when the dead host revives.

Are your NFS file systems mounted "soft"? If they are not, NFS will sit
around forever waiting for the desired system to answer the file request.
An example checklist entry (for a paranoid admin with a hanging ethernet)
would be:

    mmlai:/users /mmlai/users nfs soft,timeo=100 0 0  # NFS mount to MMLAI

*********************************************************************
Tony Burzio                  * All right, so where's rfbackup anyway???
Martin Marietta Labs         *
mmlai!burzio@uunet.uu.net    *
*********************************************************************
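The same options should also work on a manual mount from the command line
(a sketch only; verify the exact syntax against mount(1M) on your release):

    /etc/mount -o soft,timeo=100 mmlai:/users /mmlai/users

Note that timeo is expressed in tenths of a second, so timeo=100 is a
10-second timeout per attempt; with a soft mount, NFS returns an error once
its retries are exhausted instead of retrying forever.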
jack@hpindda.HP.COM (Jack Repenning) (07/11/89)
> We have no problem starting csh when all hosts are alive that we have NFS
> mounts on, but have experienced an inability to start csh when any of those
> hosts has gone down. (Further qualifications follow.) This happens if ANY
> of those hosts goes down -- it's not necessary to have a file open on the
> defunct host. The situation cures itself when the dead host revives.
>
> ..., but X clients that depend on csh hang.

Does starting up X include adding some directory to your PATH which is
actually an NFS mount from the system that's down? As I understand it, csh
attempts to hash all directories in $PATH whenever it starts: if it can't
get to one, then it hangs (it's not necessary to actually *run* anything
from the inaccessible directory, merely to have it in your PATH).

As someone else pointed out, this (and possibly other interactions) gets
better when you do a soft mount. Basically, "hard mount" means "keep trying
forever," while "soft mount" means "give up after a reasonable time." And
"hard" is the default.

Jack Repenning
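If that's what's happening, one stopgap is to strip the unreachable directory
out of PATH before invoking csh, so the hash pass never touches the dead
mount. A minimal sketch from a Bourne shell (the directory name is
hypothetical):

    # remove the dead host's directory from PATH, then start csh
    PATH=`echo $PATH | sed -e 's,:/nfs/deadhost/bin,,g'`
    export PATH
    csh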