samperi@marob.masa.com (Dominick Samperi) (07/29/89)
There seems to be no obvious way to deal with the problem of hung processes due to a dead NFS server. The recently posted 'cknfs' program might help somewhat, but it does not deal with the situation where a server dies after a user logs in. Furthermore, it appears that a process may still hang even if no reference is made to dead NFS paths. I don't know why...perhaps the crashed machine is flooding the network with garbage packets????

Perhaps some experienced NFS users could comment on various tricks that they have used to deal with the "NFS hang problem"?

Thanks!
--
Dominick Samperi -- ESCC
samperi@marob.masa.com
uunet!hombre!samperi
lamy@ai.utoronto.ca (Jean-Francois Lamy) (07/30/89)
Under SunOS 3.x, religiously enforce the following policy: never mount two partitions from two different machines in the same directory. That way the mount point for the partition you are in will always be found before getwd has to look at the directory entry for a dead machine. You can most simply achieve this by mounting all remote partitions on /nfs/machine/partition and making symlinks to preserve the illusion.

Under SunOS 4.x, getwd caches results. Poking around with trace reveals that if all NFS mounts refer to symlinks pointing to the actual mount point you won't see the NFS hang problem.

Jean-Francois Lamy               lamy@ai.utoronto.ca, uunet!ai.utoronto.ca!lamy
AI Group, Department of Computer Science, University of Toronto, Canada M5S 1A4
jik@athena.mit.edu (Jonathan I. Kamens) (07/31/89)
Query: any reason why this wasn't asked in comp.protocols.nfs?

In article <24D1DF49.7A5@marob.masa.com> samperi@marob.masa.com (Dominick Samperi) writes:
>There seems to be no obvious way to deal with the problem of hung processes
>due to a dead NFS server. The recently posted 'cknfs' program might help
>somewhat, but it does not deal with the situation where a server dies after
>a user logs in. Furthermore, it appears that a process may still hang
>even if no reference is made to dead NFS paths. I don't know why...perhaps
>the crashed machine is flooding the network with garbage packets????
>
>Perhaps some experienced NFS users could comment on various tricks that they
>have used to deal with the "NFS hang problem"?

Project Athena has well over 1,000 workstations, with over 10,000 user accounts, and every user gets his home directory over NFS. There are also a lot of third-party lockers that people use often that are exported via NFS. We therefore encounter this problem much more often than we'd like.

The most common way of referencing a dead NFS path even if you don't realize you're doing it is if you have said path in your search path and try to execute a program and/or start a new shell. Both will cause the search path to be scanned, and they could encounter the dead path and hang on it.

One solution, which is what we use, is not to hard mount anything but the most important NFS filesystems. We mount all user filesystems soft with a five minute error timeout by default, so if a user's fileserver goes down, processes will only try to access it for five minutes. Once the user gets his prompt back, he can carefully save whatever work he is doing to a local hard disk or mail it to himself to prevent it from being lost.

The only filesystems we hard mount by default are the system software packs, since if they go down there isn't much you can do with the workstation anyway.

Jonathan Kamens                  USnail:
MIT Project Athena               432 S. Rose Blvd.
jik@Athena.MIT.EDU               Akron, OH 44320
Office: 617-253-4261             Home: 216-869-6432
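To make the search path point concrete, here is a minimal sketch of the kind of lookup a shell performs when it resolves a command name. The function name `find_in_path` is hypothetical and this is not any particular shell's code; a real shell also checks that the match is a regular file. The point is only that each directory in the list costs at least one file system call, and that call is where a process can block if the directory sits on a hard-mounted server that is down.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Look for an executable named cmd in the colon-separated list path,
 * roughly the way a shell does. On success the full name is left in
 * out and 0 is returned; -1 means nothing suitable was found.
 *
 * Each access() call is a file system lookup. If the directory lives
 * on a hard-mounted NFS server that is down, the call blocks right
 * there and later entries in the list are never reached.
 */
int find_in_path(const char *path, const char *cmd, char *out, size_t outlen)
{
    const char *p = path;

    assert(path != NULL && cmd != NULL && out != NULL);

    while (*p != '\0') {
        size_t dirlen = strcspn(p, ":");   /* length of this entry */
        int n;

        if (dirlen == 0)                   /* empty entry means "." */
            n = snprintf(out, outlen, "./%s", cmd);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, p, cmd);

        /* Skip names that did not fit; otherwise probe the file system. */
        if (n > 0 && (size_t)n < outlen && access(out, X_OK) == 0)
            return 0;

        p += dirlen;
        if (*p == ':')
            p++;
    }
    return -1;
}
```

Putting NFS directories last in the search path does not remove the hang, but it does mean that commands found in earlier, local directories never touch the dead server.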
brent%terra@Sun.COM (Brent Callaghan) (08/01/89)
In article <13134@bloom-beacon.MIT.EDU>, jik@athena.mit.edu (Jonathan I. Kamens) writes:
>
> One solution, which is what we use, is not to hard mount anything
> but the most important NFS filesystems. We mount all user filesystems
> soft with a five minute error timeout by default, so if a user's
> fileserver goes down, processes will only try to access it for five
> minutes. Once the user gets his prompt back, he can carefully save
> whatever work he is doing to a local hard disk or mail it to himself
> to prevent it from being lost.

A problem with "soft" mounting is that a timed-out I/O will return an error result to the user program. Unix programs are notorious for not checking for error returns on read(), write(), etc., and can fail in mysterious ways.

This can be particularly bad in the case of an executable that is running from a dead server. A pagein that gets an error from a soft mount will crash the process and leave a core dump.

I prefer to mount /usr and local executables ("/usr/local" around here) with "hard" and set the "intr" option so that I can at least kill a hung process with a SIGTERM if I get fed up waiting. The "intr" should work OK, although it can take a while since it has to wait for the hung NFS operation to time out (can take a minute or so).

Made in New Zealand -->  Brent Callaghan  @ Sun Microsystems
                         uucp: sun!bcallaghan
                         phone: (415) 336 1051
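As an illustration of the error checking that soft mounts depend on, here is a minimal sketch of a write loop that refuses to ignore a failed write(). The name `write_all` is hypothetical. The exact errno delivered by a timed-out soft mount varies by system (EIO and ETIMEDOUT are both seen), so the sketch simply passes whatever it gets back to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Write all len bytes of buf to fd, retrying short writes and
 * interrupted calls. Returns 0 on success, or -1 with errno set by
 * the failing write(), for example after a soft mount has given up.
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved */
            return -1;          /* real failure: report it to the caller */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller that sees -1 can then tell the user that the file was not saved, instead of exiting as if it had been. This does nothing for the pagein case, where the failing read happens inside the kernel on the program's behalf.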
pat@orac.pgh.pa.us (Pat Barron) (08/01/89)
In article <13134@bloom-beacon.MIT.EDU> jik@athena.mit.edu (Jonathan I. Kamens) writes:
>[...]
>Samperi) writes:
>>[...] Furthermore, it appears that a process may still hang
>>even if no reference is made to dead NFS paths. I don't know why...perhaps
>>the crashed machine is flooding the network with garbage packets????
>[...]
> The most common way of referencing a dead NFS path even if you don't
>realize you're doing it is if you have said path in your search path
>and try to execute a program and/or start a new shell. Both will
>cause the search path to be scanned, and they could encounter the dead
>path and hang on it.

Another possible problem, which can be particularly vexing until you really figure out what's going on, is that getwd() can hang if there is a dead NFS fileserver mounted within some directory above your current directory.

That is, if your current directory is, for instance, "/usr/users/foobar", and "/usr/local" is mounted from an NFS server and that server is dead, then doing a getwd() [i.e., "pwd", etc...] from /usr/users/foobar *may* hang, depending on what order the directory entries for "users" and "local" appear in /usr. If "users" is first then everything is fine. If "local" is first, then getwd() will try to do a stat() on /usr/local, and you lose.

The solution to this is to make sure NFS mount points are leaf nodes in the directory tree (i.e., make sure that no files or directories in the local filesystem appear in the same directory as an NFS mount point).

--Pat.
--
Pat Barron
Internet:  pat@orac.pgh.pa.us  - or -  orac!pat@gateway.sei.cmu.edu
UUCP:  ...!uunet!apexepa!sei!orac!pat  - or -  ...!pitt!darth!orac!pat
richard@aiai.ed.ac.uk (Richard Tobin) (08/02/89)
In article <13134@bloom-beacon.MIT.EDU> jik@athena.mit.edu (Jonathan I. Kamens) writes:
> The most common way of referencing a dead NFS path even if you don't
>realize you're doing it is if you have said path in your search path
>and try to execute a program and/or start a new shell. Both will
>cause the search path to be scanned, and they could encounter the dead
>path and hang on it.
>
> One solution, which is what we use, is not to hard mount anything
>but the most important NFS filesystems.

Another solution is to mount the filesystems in, say, /nfs, and have symbolic links to them from the places people actually refer to. Then you can remove the symbolic links if the server is down.

Even better, you can have a program do it. Here's one I wrote recently. We've only just started using it, so it may not be bug-free.

-- Richard

/*
 * nfslink [-i interval] [-t timeout] host name mountpt [name mountpt ...]
 *
 * maintain links to mounted file systems, removing them if the
 * remote machine isn't responding.
 *
 * Copyright Richard Tobin / AIAI 1989
 *
 * May be freely redistributed if this whole notice remains intact.
 */

#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <sys/time.h>
#include <rpc/rpc.h>
#include <rpc/clnt.h>
#include <nfs/nfs.h>
#include <setjmp.h>
#include <sys/stat.h>

main(argc, argv)
int argc;
char **argv;
{
    int c, interval = 20, timeout = 5, firsttime = 1;
    extern char *optarg;
    extern int optind, opterr;

    while((c = getopt(argc, argv, "i:t:")) != EOF)
        switch(c)
        {
          case 'i':
            interval = atoi(optarg);
            break;

          case 't':
            timeout = atoi(optarg);
            break;

          case '?':
            usage();
            break;
        }

    if((argc - optind) < 3 || ((argc - optind) & 1) == 0)
        usage();

    while(1)
    {
        if(nfscheck(argv[optind], timeout) == 0)
            makelinks(&argv[optind+1], firsttime);
        else
            removelinks(&argv[optind+1], firsttime);
        firsttime = 0;
        sleep(interval);
    }
}

makelinks(links, verbose)
char **links;
int verbose;
{
    struct stat namestat;

    while(*links)
    {
        char *name = *links++;
        char *mountpt = *links++;

        if(lstat(name, &namestat) == -1)
        {
            if(errno == ENOENT)
            {
                if(symlink(mountpt, name) == -1)
                {
                    perror("nfslink: symlink");
                    fatal("can't link %s to %s\n", name, mountpt);
                }
                printf("nfslink: linked %s to %s\n", name, mountpt);
                fflush(stdout);
                continue;
            }
            else
            {
                perror("nfslink: lstat");
                fatal("can't lstat %s\n", name, 0);
            }
        }

        if((namestat.st_mode & S_IFMT) == S_IFLNK)
        {
            if(pointsto(name, mountpt))
            {
                if(verbose)
                {
                    printf("nfslink: %s is already linked to %s\n",
                           name, mountpt);
                    fflush(stdout);
                }
            }
            else
            {
                fatal("%s is a link, but not to %s\n", name, mountpt);
            }
        }
        else
        {
            fatal("%s exists, but is not a symbolic link\n", name, 0);
        }
    }
}

removelinks(links, verbose)
char **links;
int verbose;
{
    struct stat namestat;

    while(*links)
    {
        char *name = *links++;
        char *mountpt = *links++;

        if(lstat(name, &namestat) == -1)
        {
            if(errno == ENOENT)
            {
                if(verbose)
                {
                    printf("nfslink: link from %s to %s is already removed\n",
                           name, mountpt);
                    fflush(stdout);
                }
                continue;
            }
            else
            {
                perror("nfslink: lstat");
                fatal("can't lstat %s\n", name, 0);
            }
        }

        if((namestat.st_mode & S_IFMT) == S_IFLNK)
        {
            if(pointsto(name, mountpt))
            {
                if(unlink(name) == -1)
                {
                    perror("nfslink: unlink");
                    fatal("can't remove link from %s to %s\n", name, mountpt);
                }
                printf("nfslink: removed link from %s to %s\n", name, mountpt);
                fflush(stdout);
            }
            else
            {
                fatal("%s is a link, but not to %s\n", name, mountpt);
            }
        }
        else
        {
            fatal("%s exists, but is not a symbolic link\n", name, 0);
        }
    }
}

int pointsto(name, target)
char *name, *target;
{
    /* We don't use stat lest it hang, so it's not quite right */

    char buf[200];
    int len;

    len = readlink(name, buf, sizeof(buf)-1);
    if(len == -1)
    {
        perror("nfslink: readlink");
        fatal("can't read link %s\n", name, 0);
    }

    buf[len] = '\0';

    return strcmp(buf, target) == 0;
}

fatal(fmt, arg1, arg2)
char *fmt, *arg1, *arg2;
{
    fprintf(stderr, "nfslink: fatal error: ");
    fprintf(stderr, fmt, arg1, arg2);
    exit(1);
}

usage()
{
    fprintf(stderr, "usage: nfslink [-i interval] [-t timeout] host name mountpt [name mountpt ...]\n");
    exit(2);
}

static jmp_buf env;
void timedout();

int nfscheck(host, timeout)
char *host;
int timeout;
{
    int stat;

    signal(SIGALRM, timedout);

    if(setjmp(env) == 0)
    {
        alarm(timeout);
        stat = callrpc(host, NFS_PROGRAM, NFS_VERSION, RFS_NULL,
                       xdr_void, 0, xdr_void, 0);
        alarm(0);
        if(stat == 0)
            return 0;
    }

    return -1;
}

void timedout()
{
    longjmp(env, 1);
}

--
Richard Tobin,                    JANET: R.Tobin@uk.ac.ed
AI Applications Institute,        ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.             UUCP:  ...!ukc!ed.ac.uk!R.Tobin
cole@dip.cs.wisc.edu (Bruce Cole) (08/03/89)
In article <658@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes:
>Another solution is to mount the filesystems in, say, /nfs, and have symbolic
>links to them from the places people actually refer to. Then you can
>remove the symbolic links if the server is down.
>
>Even better, you can have a program do it. Here's one I wrote recently.
>We've only just started using it, so it may not be bug-free.

We considered implementing something like this. Unfortunately, for our usage of NFS, simply switching symbolic links is not enough. Most of our workstations NFS their /usr partition. All of the processes that are referencing the original server remain hung even after the necessary symbolic links are changed.

I solved this problem by implementing NFS kernel changes on the client workstations. When an NFS request times out, the client consults a user specified list of equivalent servers to find an alternate server. The NFS request is automatically converted by the client workstation to be sent to the alternate server. Running programs do not even realize that a switch in servers has been made. The code works great for read only NFS partitions (such as /usr). It is useless for handling remotely mounted home directories.

I also made changes to the way NFS handles interruptible file systems so that a user can interrupt an NFS request in a realistic amount of time. (Current NFS implementations can take several minutes to act upon an interrupt request.)

--
Bruce Cole
Computer Sciences Dept. U. of Wisconsin - Madison
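The server switching described above lives in the kernel and its source is not shown here, so the following is only a user-space sketch of the general idea: walk a list of equivalent servers until one answers, and remember which one did. Every name in it (`request_with_failover`, `request_fn`, `fails_for_named`) is hypothetical, and the callback stands in for whatever actually sends the request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns 0 if the request to server succeeded. */
typedef int (*request_fn)(const char *server, void *ctx);

/*
 * Try the request against each server in turn, starting with the one
 * at index *current. On success *current is updated so that later
 * requests go straight to the server that answered. Returns 0 on
 * success, or -1 if every server in the list failed.
 */
int request_with_failover(const char *const servers[], size_t nservers,
                          size_t *current, request_fn try_request, void *ctx)
{
    size_t i;

    assert(servers != NULL && current != NULL && try_request != NULL);
    assert(nservers > 0 && *current < nservers);

    for (i = 0; i < nservers; i++) {
        size_t idx = (*current + i) % nservers;

        if (try_request(servers[idx], ctx) == 0) {
            *current = idx;
            return 0;
        }
    }
    return -1;
}

/* Demo stand-in for a real request: fails for the server named in ctx. */
int fails_for_named(const char *server, void *ctx)
{
    return strcmp(server, (const char *)ctx) == 0 ? -1 : 0;
}
```

A scheme like this assumes every server in the list holds the same data, which fits a read-only /usr but not a home directory that exists on one server only.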