johnk@gordian.com (John Kalucki) (03/23/91)
I'm having trouble arranging my NFS mounts to avoid halting all of my machines when one machine crashes. I have 6 Mips workstations which are all intermounted; each host has the data disk of every other host mounted. All of the filesystems are mounted -soft, and all of the mountpoints are in /n/{machine}, where they should be out of harm's way.

When one machine goes down, all the others have difficulties, ranging from simply freezing up to having pwd(1) time out. This happens even though no data is being accessed on the machine that is down (or so I'd like to think).

I'd like to be able to umount the machine that is down, perhaps in the same way as when Unix is being shut down: a forcible unmount occurs. I don't really care if any processes are confused by the forcible unmount; I'd just like to avoid the "NFS: timed out" messages (and the occasional long timeout period) to get the client machine functional again.

Another thing I've thought of is to put each server in /{machine}/{machine} to avoid having pwd stat the directory with the mountpoint unless needed, but this seems quite ugly.

Also, when a server is up but I can't unmount it because the filesystem is busy, is there an easy way to find which processes are the offenders? I've looked at the fuser command that comes on the Mips boxes, but it doesn't work with NFS.

Any hints?

	-John Kalucki
	johnk@gordian.com
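For concreteness, the cross-mount layout described above can be sketched as a dry run. The hostnames (mips1..mips6) and the /data export path are invented for illustration, and the exact soft-mount options vary between NFS implementations:

```shell
# Dry-run sketch of the cross-mount layout described above.  Hostnames
# and the export path are invented; on a real client you would run the
# printed mount commands (or put them in fstab).
cross_mounts() {
    self=$1
    for host in mips1 mips2 mips3 mips4 mips5 mips6; do
        [ "$host" = "$self" ] && continue    # a host never mounts itself
        # -o soft: a down server eventually returns EIO to the caller
        # instead of hanging the client process forever on a hard mount
        echo "mount -o soft,retrans=3 $host:/data /n/$host"
    done
}

cross_mounts mips1
```

With six hosts this generates five mounts per client, i.e. thirty NFS mounts in total, which is why one dead server is so likely to be in somebody's way.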
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (03/24/91)
In article <118@gordius.gordian.com>, johnk@gordian.com (John Kalucki) writes:
> I'm having trouble arranging my nfs links to avoid halting all of
> my machines when one machine crashes....
> When one machine goes down, all others have difficulties, ranging
> from simply freezing up to having pwd(1) timeout....

Symbolic links from /{mach} to /mount/point could make the /{mach}/{mach} mounting scheme palatable, since there are only 6 machines and 30-36 links involved. Things change when you go from dozens to grosses....pun possibly intended.

The automounter sounds like a good bet for the situation described. It's reduced many of the dead host problems around here, where everyone has from two to several dozen mounts, with many of the most used servers several buildings, routers, and networks distant. The automounter helps, because you're more likely to get messed up only by the (network) crash of a machine you currently care about.

Unfortunately the single thread nature of the automounter makes things more painful when a server you care about does crash or become unreachable. The Sun automounter is not too smart about doing other mounts while it is waiting to give up on a dead server.

Vernon Schryver, vjs@sgi.com
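The symlink scheme might be set up along these lines. This is a sketch only: it builds the links under a scratch prefix so it can run anywhere, the host names are invented, and on a real system the links would be made at /:

```shell
# Sketch of the /{mach} -> /mount/point symlink scheme.  Uses a scratch
# prefix so it can run without touching /; drop $prefix on a real
# system.  Host names are invented.
prefix=$(mktemp -d)
for mach in mips1 mips2 mips3 mips4 mips5 mips6; do
    mkdir -p "$prefix/n/$mach"          # the actual NFS mountpoint
    ln -s "/n/$mach" "$prefix/$mach"    # short, universal name
done
# Every machine carries the same set of links, so a path like
# /mips3/src/foo.c resolves identically on all six hosts, including
# on mips3 itself.
ls -l "$prefix/mips3"
```

The point of making the same links everywhere is that symbolic links embedded in the source trees then work on clients and servers alike.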
brent@terra.Eng.Sun.COM (Brent Callaghan) (03/26/91)
In article <93325@sgi.sgi.com>, vjs@rhyolite.wpd.sgi.com (Vernon Schryver) writes:
> In article <118@gordius.gordian.com>, johnk@gordian.com (John Kalucki) writes:
> >
> > I'm having trouble arranging my nfs links to avoid halting all of
> > my machines when one machine crashes....
> > When one machine goes down, all others have difficulties, ranging
> > from simply freezing up to having pwd(1) timeout....
>
> Symbolic links from /{mach} to /mount/point could make the /{mach}/{mach}
> mounting scheme palatable, since there are only 6 machines and 30-36
> links involved. Things change when you go from dozens to grosses....pun
> possibly intended
>
> The automounter sounds like a good bet for the situation described. It's
> reduced many of the dead host problems around here, where everyone has from
> two to several dozen mounts, with many of the most used servers several
> buildings, routers, and networks distant. The automounter helps, because
> you're more likely to get messed up only by the (network) crash of a
> machine you currently care about.
>
> Unfortunately the single thread nature of the automounter makes things
> more painful when a server you care about does crash or become unreachable.
> The Sun automounter is not too smart about doing other mounts while it is
> waiting to give up on a dead server.

The automounter can help reduce the incidence of NFS hangs through a reduction in the number of static NFS mounts that getwd might bump into, though I'd not recommend it as a solution to this problem.

Hangs due to dead servers are no surprise if the filesystem that you're dealing with *directly* becomes temporarily unavailable due to a server crash. What annoys me (and other NFS users) is when your application hangs due to a dead NFS filesystem that you have no interest in. There are two well-known mechanisms for this kind of hang:

You type a command and the $PATH search hangs on a dead directory in the PATH. A workaround here is to keep the PATH as short as possible and substitute shell aliases where possible.

Another mechanism is getwd(): this function does a file tree walk in order to construct a path from the current directory to the root. If the walk (climb?) up the tree trips over a dead NFS mountpoint, then the process that invoked getwd() is hung. This problem can be avoided by implementations that walk up only as far as the first mountpoint and prepend the remainder of the path from the /etc/mtab entry. If the /etc/mtab entries have the device id stashed in the mount options (as a "dev" or "fsid" string), then getwd shouldn't hang except when invoked within a dead filesystem.

BTW: it's not easy to hang the automounter with a dead NFS mount. After the automounter has done a mount, it doesn't give a brass razzoo about whether the server is up or not. It just points its symbolic link at the mountpoint in /tmp_mnt, and the application that asked for it follows it. If the mountpoint is dead, then the application hangs, but the automounter carries on.

In SunOS 4.0.3 the automounter could hang in unmount(2). An unmount of a dead NFS mountpoint will succeed - the kernel does not contact the server when it does an unmount() - but if the path to the mountpoint contains a dead server, then it will hang in the lookup. The 4.1 automounter carefully pings servers along the path before attempting the unmount system call.

Of course, a lot of this jiggery-pokery wouldn't be necessary if the automounter didn't have to be so paranoid about its single thread blocking. A multithreaded automounter will definitely be an improvement.

-- 
Made in New Zealand -->  Brent Callaghan @ Sun Microsystems
Email: brent@Eng.Sun.COM  phone: (415) 336 1051
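The mountpoint detection that such a getwd() implementation relies on - noticing where the device id changes on the climb up the tree - can be sketched in shell. GNU stat's `-c %d` format is an assumption here; a real getwd() would use stat(2)'s st_dev field in C:

```shell
# Walk up from a directory until the device number changes: the
# directory where it changes is the root of the filesystem containing
# the start point.  A smarter getwd() can stop climbing there and
# splice in the path recorded in /etc/mtab, never touching sibling
# (possibly dead) NFS mounts.  Assumes GNU stat's -c %d.
first_mountpoint() {
    dir=$1
    dev=$(stat -c %d "$dir")
    while [ "$dir" != / ]; do
        parent=$(dirname "$dir")
        # parent on a different device => $dir is a mount root
        [ "$(stat -c %d "$parent")" != "$dev" ] && break
        dir=$parent
    done
    echo "$dir"
}

first_mountpoint /usr/bin    # the mount root containing /usr/bin
```

The key property is that the climb only ever stats ancestors of the current directory within one filesystem, so a dead mount elsewhere in the parent directory is never consulted.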
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (03/27/91)
In article <10472@exodus.Eng.Sun.COM>, brent@terra.Eng.Sun.COM (Brent Callaghan) writes:
(in response to an offhand complaint of mine about automounter hangs)
> ...
> BTW: it's not easy to hang the automounter with a dead NFS mount.

Some agreement, but see below.

> A multithreaded automounter will
> definitely be an improvement.

Much agreement, provided the locking hassles are safely resolved. One would not want new problems in the style of the lock demons.

Real-life example from the campus of a workstation maker on the non-solar side of the Mtn. View dump: Consider a bunch of source machines, all mounted indirectly in /net via some handy map. Imagine `ln -s /net/{foo,bar,new,dead,so_on} /` so that symbolic links within the source trees work universally. (Some people mount on /n, some in /hosts, and the symbolic links must work on the servers as well.) Now imagine what happens if:

(1) you have foo, bar, and new all properly mounted, 'dead' not mounted, and the server for dead dies or becomes unreachable;

(2) vi /dead/src/whatever.c, which is symlink-resolved into /net/dead/src/whatever.c;

(3) long pause, grumble, realize the trouble, do something about the network or the server for dead;

(4) try to do something useful while the 10GB of filesystems for 'dead' are fsck'ing, so `vi /foo/src/stuff.c`.

Now you've hung another window, as namei() tries to resolve the /foo->/net/foo link by talking to the file system mounted on /net. Until the automounter stops patiently waiting for the server for dead to answer the mount protocol, or for a timeout, you're stuck. There is no equivalent hiccup in the obvious manual NFS mounting of {foo,bar,...} on /net.

An obvious solution is to have the automounter not do the unmount when it times out. Instead, let the automounter locally unmount the filesystem, but cache the file handle etc. and not bother to tell the server. Later, when the file system is referenced, the automounter should first consult its cache of file handles etc., and do only the local part of the (re-)mount. Yes, it would need a variant mount system call.

There are few good reasons to tell the server when the automounter hits the default 5-minute timeout. It just wastes server cycles and network bandwidth to update /etc/rmtab. For many years B. Lyon has been saying /etc/rmtab is a botch that should be deleted.

Vernon Schryver, vjs@sgi.com
jay@silence.princeton.nj.us (Jay Plett) (03/28/91)
In article <93840@sgi.sgi.com>, vjs@rhyolite.wpd.sgi.com (Vernon Schryver) writes:
< In article <10472@exodus.Eng.Sun.COM>, brent@terra.Eng.Sun.COM (Brent Callaghan) writes:
< (in response to an off hand complaint of mine about automounter hangs)
< > ...
< > BTW: it's not easy to hang the automounter with a dead NFS mount.
<
< some agreement, but see below.
<
< > A multithreaded automounter will
< > definitely be an improvement.
<
< Much agreement, providing the locking hassles are safely resolved.

Just get amd. It forks a child for potentially-blocking operations. I have seen it wedge in some very bizarre situations, but it is much more robust than Sun's automounter. And a lot more flexible/useful/powerful. It's available by anonymous ftp on usc.edu.

	...jay