[comp.unix.questions] What is "Stale NFS handle"?

jwp@larry.UUCP (Jeffrey W Percival) (01/19/89)

I have a VAXstation 2000 running Ultrix 2.2, using NFS
to use the /usr/local of a MicroVAX II, also running Ultrix.
Yesterday, I could no longer access some files, and got the
message "stale NFS handle".  What circumstances cause this?
-- 
Jeff Percival (jwp@larry.sal.wisc.edu)

deke@valhalla.ee.rochester.edu (01/20/89)

In article <495@larry.UUCP> jwp@larry.uucp asks about "Stale NFS handle"

I tried to reply by mail, but couldn't reach you.  Here is my response.
I've cross posted to a more appropriate newsgroup (I think) in hopes
that the folks there can do a better job than I.  I've also directed
followups there...

  In my experience, "stale NFS handle" means that the directory which
  is your current working directory is an NFS mount that has 'gone away'
  for some reason.  In other words, the 'handle' itself is a pointer to
  something no longer there....
  
  Off-the-top-of-my-head possibilities:
  
  1) The server with that partition went down (and came back up?)
  2) The server with that partition unmounted that partition
  3) The machine you are on somehow had that partition unmounted
  
  Good luck.  Hope other responses are more complete than mine!
  
      ^Deke Kassabian,   deke@ee.rochester.edu   or   ur-valhalla!deke
   Univ of Rochester, Dept of EE, Rochester, NY 14627     (+1 716-275-3106)
  

guy@auspex.UUCP (Guy Harris) (01/20/89)

>I have a VAXstation 2000 running Ultrix 2.2, using NFS
>to use the /usr/local of a MicroVAX II, also running Ultrix.
>Yesterday, I could no longer access some files, and got the
>message "stale NFS handle".  What circumstances cause this?

A "file handle" (the message under SunOS, at least, is "stale NFS file
handle", not just "stale NFS handle") is the cookie used in NFS requests
to refer to a file.  For UNIX file servers, it's generally built out of:

	1) the major and minor device of the disk on which the file
	   system containing the file resides;

	2) the file's inumber;

	3) a "generation count" kept in the inode, and bumped every time
	   a new file gets that inode, so that a file handle referring
	   to a file that occupied that inode before the current file is
	   recognized as invalid - or "stale";

	4) some other stuff, perhaps.

A "stale" file handle is one that no longer validly refers to a file.
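
As a rough illustration of the scheme above, here is a toy model in
Python.  It is only a sketch, not how any real NFS server is written; the
names FileHandle, ToyServer and StaleFileHandle are made up for this
example, and the "some other stuff" in a real handle is left out.  The
server rejects a handle whose device, inumber, or generation count no
longer matches:

```python
from dataclasses import dataclass


class StaleFileHandle(Exception):
    """Raised when a handle no longer refers to a live file (hypothetical name)."""


@dataclass(frozen=True)
class FileHandle:
    dev: tuple       # (major, minor) of the device holding the file system
    inumber: int     # inode number of the file
    generation: int  # generation count of the inode when the handle was issued


class ToyServer:
    def __init__(self, dev):
        self.dev = dev
        self.files = {}       # inumber -> file name, for inodes currently in use
        self.generation = {}  # inumber -> current generation count

    def create(self, inumber, name):
        # Bump the generation count every time a new file gets this inode.
        self.generation[inumber] = self.generation.get(inumber, 0) + 1
        self.files[inumber] = name
        return FileHandle(self.dev, inumber, self.generation[inumber])

    def remove(self, inumber):
        # The inode becomes an empty slot; its generation count is kept.
        del self.files[inumber]

    def lookup(self, fh):
        if fh.dev != self.dev:
            raise StaleFileHandle("wrong device")
        if fh.inumber not in self.files:
            raise StaleFileHandle("inode is free")
        if fh.generation != self.generation[fh.inumber]:
            raise StaleFileHandle("generation count mismatch")
        return self.files[fh.inumber]
```

A handle returned by create() stays usable until the file is removed.
After that, lookup() raises StaleFileHandle, and it keeps doing so even
when a new file is later created in the same inode.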

This can happen because:

	1) the file was removed out from under your application (the
	   inumber then refers to an empty slot in the ilist);

	2) the file was removed out from under your application, and
	   some newly-created file got assigned that inode (the
	   generation count then doesn't match);

	3) somebody decided to invalidate all existing file handles for
	   the file system on which the file resides (e.g., they didn't
	   want to make it available any more), which can be done with
	   "fsirand", a command that (at least on SunOS) stuffs random
	   numbers into the generation counts;

	4) the file system got trashed, so they had to restore it (which
	   changes the generation counts);

	5) somebody moved the disk partitions around on your server;

	6) the server stopped exporting the file system, or exported it
	   to a smaller set of clients not including yours;

	7) your server upgraded to a new version of the OS that changed
	   the way file handles are interpreted (I know the SunOS 3.x to
	   SunOS 4.0 upgrade did this);

	8) there's a bug in the server such that, for example, rebooting
	   the server invalidates file handles (there was such a bug in
	   an alpha version of SunOS 4.0, which was quite annoying; it
	   was, as far as I know, fixed later - as I remember it, the
	   problem was that instead of using the major/minor of the file
	   system's device in the file handle, it used the index of the
	   file system in the system mount table, which can change when
	   you reboot the machine);

	9) I'm sure there are some that I've forgotten.

If you can't get at *any* files on that file system, try unmounting it
and remounting it (if you get "stale NFS file handle" when unmounting,
you may have to reboot).
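
On the client, a program sees this condition as the errno value ESTALE.
As a minimal sketch in Python, assuming a Unix-like client: the helper
name is_stale and its optional stat parameter are inventions for this
example, not part of any NFS tool.  It reports whether touching a path
fails with a stale handle:

```python
import errno
import os


def is_stale(path, stat=os.stat):
    """Return True if stat'ing `path` fails with ESTALE.

    `stat` defaults to os.stat; it is a parameter only so that the
    function can be exercised without a real stale mount.
    """
    try:
        stat(path)
    except OSError as exc:
        # Any other error (ENOENT, EACCES, ...) is not a stale handle.
        return exc.errno == errno.ESTALE
    return False
```

Running is_stale() on a few paths under the mount point gives a quick
hint about whether only some files are affected or the whole file
system is.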

chris@spock (Chris Ott) (01/20/89)

>I have a VAXstation 2000 running Ultrix 2.2, using NFS
>to use the /usr/local of a MicroVAX II, also running Ultrix.
>Yesterday, I could no longer access some files, and got the
>message "stale NFS handle".  What circumstances cause this?

     I've seen this message once myself. Unfortunately, my experience was
on a Sun. I'm not sure it will apply to your Vax.

     It was the following situation:

     On a server, there was a directory mounted from the root filesystem,
say "dir1". Then, somewhere inside "dir1" was mounted another filesystem,
say "dir2". So, if you wanted to get to "dir2" on the server, you would
use the path "/dir1/dir2".

     For clients, my first thought was that, if you can get to "dir2"
with "/dir1/dir2" on the server, you should be able to do the same on the
clients after mounting only "dir1". I was wrong. Any attempt to get to
"dir2", such as "cd /dir1/dir2", would give the message "stale NFS handle".
Both "dir1" and "dir2" have to be mounted on the clients as well as the
server.
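
     One way to spot this layout ahead of time is to look, on the server,
for the points along a path where the device number changes, since each
of those is a separate filesystem that needs its own mount on the clients.
What follows is only a sketch in Python, assuming a Unix-like server; the
function name filesystem_boundaries and its stat parameter are made up
for this example, and a changing st_dev is a heuristic rather than a
guarantee:

```python
import os


def filesystem_boundaries(path, stat=os.stat):
    """Return the prefixes of `path` at which the device number changes."""
    path = os.path.abspath(path)
    parts = [p for p in path.split(os.sep) if p]
    current = os.sep
    prev_dev = stat(current).st_dev
    boundaries = []
    for part in parts:
        current = os.path.join(current, part)
        dev = stat(current).st_dev
        if dev != prev_dev:
            # A new device number means a different filesystem starts here.
            boundaries.append(current)
        prev_dev = dev
    return boundaries
```

     In the situation described above, calling it on "/dir1/dir2" on the
server would be expected to list "/dir1/dir2" as a boundary.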

     Hope that helps.

Chris

-------------------------------------------------------------------------------
 Chris Ott
 Computational Fluid Mechanics Lab        Infatuation is blind, not love. A
 University of Arizona                      person in love can see the other's
                                            faults, but loves them anyway.
 Internet: chris@spock.ame.arizona.edu
 UUCP: {allegra,cmcl2,hao!noao}!arizona!amethyst!spock!chris
-------------------------------------------------------------------------------

dd@beta.lanl.gov (Dan Davison) (01/21/89)

In article <873@auspex.UUCP>, guy@auspex.UUCP (Guy Harris) writes:
> >Yesterday, I could no longer access some files, and got the
> >message "stale NFS handle".  What circumstances cause this?
> 
> A "file handle" (the message under SunOS, at least, is "stale NFS file
> handle", not just "stale NFS handle") is the cookie used in NFS requests
> to refer to a file. [...]
> A "stale" file handle is one that no longer validly refers to a file.
> This can happen because: [...]
> 
> 	8) there's a bug in the server such that, for example, rebooting
> 	   the server invalidates file handles (there was such a bug in
> 	   an alpha version of SunOS 4.0, which was quite annoying; it
> 	   was, as far as I know, fixed later - as I remember it, the
> 	   problem was that instead of using the major/minor of the file
> 	   system's device in the file handle, it used the index of the
> 	   file system in the system mount table, which can change when
> 	   you reboot the machine);


Well, I hate to contradict GH, but at least here (SunOS 4.0.1) it's
still not fixed.  One of our three servers hangs about once a day
(a YP binding problem :-< ) and one of the 18 or so clients which
mount that server's partitions will start complaining about a "stale
NFS file handle".  I wish there was a way to fix this; sometimes
umount/mount will do it and sometimes only a reboot will do (the umount
hangs).  Curiously, it's almost always the same file system on
that server, perhaps because it is one of the most heavily used.

dan davison/theoretical biology/t-10 ms k710/los alamos national laboratory
los alamos, nm 875545/dd@lanl.gov(arpa)/dd@lanl.uucp(new)/..cmcl2!lanl!dd
"Freedom is a heavy load, a great and strange burden for the spirit to
undertake.  It is not easy.  It is not a gift given, but a choice made,
and the choice may be a hard one." ...Le Guin, _The Farthest Shore_