[comp.protocols.nfs] What is "Stale NFS handle"?

deke@valhalla.ee.rochester.edu (01/20/89)

In article <495@larry.UUCP> jwp@larry.uucp asks about "Stale NFS handle"

I tried to reply by mail, but couldn't reach you.  Here is my response.
I've cross-posted to a more appropriate newsgroup (I think) in hopes
that the folks there can do a better job than I.  I've also directed
followups there...

  In my experience, "stale NFS handle" means that the directory which
  is your current working directory is an NFS mount that has 'gone away'
  for some reason.  In other words, the 'handle' itself is a pointer to
  something no longer there....
  
  Off-the-top-of-my-head possibilities:
  
  1) The server with that partition went down (and came back up?)
  2) The server with that partition unmounted that partition
  3) The machine you are on somehow had that partition unmounted
  
  Good luck.  Hope other responses are more complete than mine!
  
      ^Deke Kassabian,   deke@ee.rochester.edu   or   ur-valhalla!deke
   Univ of Rochester, Dept of EE, Rochester, NY 14627     (+1 716-275-3106)

sas@pyrps5 (Scott Schoenthal) (01/20/89)

From the NFS (version 2) Protocol Definition:

	NFSERR_STALE
		The "fhandle" given in the arguments was invalid.  That is,
		the file referred to by the file handle no longer exists,
		or access to it has been revoked.

A file handle is initially exchanged when a "lookup" of a file is made.

An example of a condition causing a stale file handle is the following:

	A process on System A opens file "foo" located, via NFS,
	on System B.  As a result of the open, System A caches a
	file handle (a unique string of octets that identifies the
	file to System B).  A process on System B removes "foo".
	The next time A attempts to use the file handle (e.g., to
	read a block of the file), System B will report that the
	file handle is stale:  it does not identify a file on the system.

The above description can be generalized for directories, etc.
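
To make the scenario concrete, here is a minimal Python sketch.  It is a
toy, not Sun's implementation: the ToyServer class, its method names, and
the choice to pack only an inode number into the handle are all assumptions
made for illustration.

```python
import struct

NFS_OK = 0
NFSERR_STALE = 70  # numeric value of NFSERR_STALE in the version 2 protocol


class ToyServer:
    """Hypothetical System B: files are found purely from the handle."""

    def __init__(self):
        self.names = {}    # file name -> inode number
        self.inodes = {}   # inode number -> file contents
        self.next_ino = 2

    def create(self, name, data):
        ino = self.next_ino
        self.next_ino += 1
        self.names[name] = ino
        self.inodes[ino] = data

    def lookup(self, name):
        # The handle is opaque octets to the client; here it is just the
        # packed inode number (an assumption made for this sketch).
        return struct.pack("!I", self.names[name])

    def remove(self, name):
        # The server keeps no list of clients holding handles to notify.
        del self.inodes[self.names.pop(name)]

    def read(self, fhandle, offset, count):
        (ino,) = struct.unpack("!I", fhandle)
        if ino not in self.inodes:
            return NFSERR_STALE, b""
        return NFS_OK, self.inodes[ino][offset:offset + count]


server = ToyServer()
server.create("foo", b"hello from System B")
fh = server.lookup("foo")        # System A caches this handle
first = server.read(fh, 0, 5)    # served normally
server.remove("foo")             # a process on System B removes "foo"
second = server.read(fh, 0, 5)   # the cached handle no longer resolves
```

Here `first` is (NFS_OK, b"hello") and `second` is (NFSERR_STALE, b"").
System A learns that "foo" is gone only when it next presents the handle.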

NB:  The NFS server is stateless and does not keep track of how many
client references are active against files managed by the server.

In the Sun UNIX port of NFS, removal of a file increments a generation
count on the inode.  The generation field is encoded into the file handle
that the NFS server passes to the client in the lookup.
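
A rough sketch of how a generation count lets the server tell a reused
inode from the file the client originally looked up.  The FileHandle layout
(fsid, ino, gen) and the ToyFilesystem class are hypothetical; real handles
are opaque to the client and their layout is up to the server.  For brevity
the sketch returns the handle from create() rather than from a separate
lookup.

```python
from collections import namedtuple

# Hypothetical handle layout, for illustration only.
FileHandle = namedtuple("FileHandle", "fsid ino gen")


class Inode:
    def __init__(self):
        self.gen = 0        # generation count, survives reuse of the slot
        self.data = None    # None means the inode is free


class ToyFilesystem:
    def __init__(self, fsid, n_inodes=4):
        self.fsid = fsid
        self.inodes = [Inode() for _ in range(n_inodes)]

    def create(self, data):
        # Take the lowest free inode, as a simple allocator might.
        for ino, inode in enumerate(self.inodes):
            if inode.data is None:
                inode.data = data
                return FileHandle(self.fsid, ino, inode.gen)
        raise OSError("no free inodes")

    def remove(self, fh):
        inode = self.inodes[fh.ino]
        inode.data = None
        inode.gen += 1      # removal increments the generation count

    def is_stale(self, fh):
        if fh.fsid != self.fsid:
            return True
        inode = self.inodes[fh.ino]
        return inode.data is None or inode.gen != fh.gen


fs = ToyFilesystem(fsid=1)
old = fs.create(b"foo contents")
fs.remove(old)
new = fs.create(b"unrelated file")   # lands on the same inode number
```

Both handles name the same inode number, but only `new` carries the current
generation, so `fs.is_stale(old)` is true and `fs.is_stale(new)` is false.
Without the generation field, the old handle would silently refer to the
unrelated file.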

sas
----
Scott Schoenthal   			sas@pyrps5.pyramid.com
Pyramid Technology Corp.		{sun,hplabs,decwrl,uunet}!pyramid!sas

hedrick@geneva.rutgers.edu (Charles Hedrick) (01/20/89)

Another cause of "stale NFS handle" is a disk crash on the server that
forces its administrators to rebuild the file system from tape.  If they
do things as documented, they run a program that randomizes some magic
numbers in the file system.  The result is that the file handles
associated with it are now different than they were before.  So when the
server comes up after the rebuild, everybody who has that file system
mounted will have stale file handles.  Installing a new release of the OS
can sometimes cause this as well, depending upon how drastic an
installation was done.
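
A small sketch of why the rebuild invalidates every handle at once.  It
assumes the "magic numbers" are per-inode generation numbers that are part
of each handle, and that inode numbers survive the restore; the function
names are made up for the example.

```python
import random


def rebuild_generations(generations, rng):
    """Give every inode a fresh random 32-bit generation number.

    Hypothetical stand-in for the documented rebuild step.
    """
    rebuilt = {}
    for ino, old_gen in generations.items():
        gen = rng.getrandbits(32)
        while gen == old_gen:        # redraw so the demo never collides
            gen = rng.getrandbits(32)
        rebuilt[ino] = gen
    return rebuilt


def is_stale(handle, generations):
    ino, gen = handle
    return generations.get(ino) != gen


# inode number -> generation number, as the server had it before the crash
before = {2: 1, 3: 1, 4: 2}
cached = [(ino, gen) for ino, gen in before.items()]  # handles held by clients

after = rebuild_generations(before, random.Random())
stale = [h for h in cached if is_stale(h, after)]
```

Every handle in `cached` was valid against `before` and ends up in `stale`
against `after`, even though no file was removed.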

sadler@heurikon.UUCP (Jon Sadler) (01/21/89)

In article <495@larry.UUCP> jwp@larry.uucp asks about "Stale NFS handle"

To answer this requires some mucking around in how NFS really works.

     For example, I create a file called /export/foo and give full world
permissions to it.  Whenever I open this file on an NFS client machine, a
"file-handle" (a local structure containing info on the file ranging from the
current position in the file, to the machine it lives on, etc.) is created
in-core (in the kernel).  All accesses to the file reference this structure,
and are then sent to the remote machine via Sun's RPC.

Now, imagine the following scenario:

     Say I run /bin/more on the file /export/foo.  Since NFS is stateless,
the server does not know that the file is being accessed, only that there are
requests coming in for data inside the file.  Now say I pause for a half-minute,
while more is prompting me, and someone on the server deletes /export/foo.
When /bin/more goes to ask for more data (because I hit a key, and it needs more
data), the file will not exist anymore.  Further, the head-inode for
/export/foo may have been reused for another file.  At this point, /bin/more
will return an error, and the client will say "Stale-File Handle".  Why?
Because the file-handle created for referencing /export/foo is no longer
"up-to-date".  (In other words, it is stale.)

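Seen from the program's side, the scenario above might look like the
following Python sketch.  The pager loop and the fake reader are
hypothetical stand-ins; on a Unix client the error would normally reach a
program as an OSError whose errno is ESTALE.

```python
import errno


def page_through(read_chunk, chunk_size=16):
    """Collect chunks until EOF; stop early if the handle goes stale.

    read_chunk(offset, count) is a hypothetical stand-in for the client's
    read path over an NFS mount.
    """
    pages, offset = [], 0
    while True:
        try:
            chunk = read_chunk(offset, chunk_size)
        except OSError as e:
            if e.errno == errno.ESTALE:
                return pages, "stale file handle"
            raise
        if not chunk:
            return pages, "eof"
        pages.append(chunk)
        offset += len(chunk)


def make_fake_reader(data, delete_after):
    """Simulate a server-side delete after `delete_after` successful reads."""
    calls = {"n": 0}

    def read_chunk(offset, count):
        if calls["n"] >= delete_after:
            raise OSError(errno.ESTALE, "Stale NFS file handle")
        calls["n"] += 1
        return data[offset:offset + count]

    return read_chunk


# The file is deleted on the server after the first screenful was read.
pages, status = page_through(make_fake_reader(b"x" * 40, delete_after=1))
```

The first 16 bytes arrive normally; the second request fails and `status`
is "stale file handle".  The client had no warning in between, because the
server never knew the file was in use.
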
I hope this answers your questions.

Jonathan Sadler
Heurikon Corp.
-- 
BANG PATH:      ...rutgers!uwvax!heurikon!sadler   SNAIL: Jonathan Sadler
                ...rutgers!nucsrl!laidbak!sadler          Heurikon Corp.
UUCP DOMAIN:    sadler@heurikon.UUCP                      3201 Latham Drive
                sadler@laidbak.UUCP                       Madison, WI 53713
ARPA:           sadler@csd4.milw.wisc.edu          PHONE: (608) 271-8700

sas@pyrps5 (Scott Schoenthal) (01/21/89)

In article <292@heurikon.UUCP> sadler@heurikon.UUCP (Jon Sadler) writes:
>
>     For example, I create a file called /export/foo, give full world permi-
>sions to it.  Whenever I open this file on a NFS client machine, a "file-
>handle" (a local structure containing info on the file ranging from the current
>position in the file, to the machine it lives on, etc.) is created in-core (in

The current position in the file is *not* encoded into the file handle in
the Sun NFS version 2 implementation.  The file handle is only used to
uniquely identify a file to the server -- not the parameters to be used
in its access.
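
A short sketch of that division of labor, with made-up class names: the
handle only says which file, while the offset and count travel with each
request and the current position lives on the client.

```python
class StatelessServer:
    """Hypothetical server: every read names the handle, offset and count."""

    def __init__(self, files):
        self.files = files            # handle -> file contents

    def read(self, fhandle, offset, count):
        return self.files[fhandle][offset:offset + count]


class ClientFile:
    """Client-side open file: the position is kept here, not in the handle."""

    def __init__(self, server, fhandle):
        self.server = server
        self.fhandle = fhandle        # never changes as we read
        self.pos = 0

    def read(self, count):
        data = self.server.read(self.fhandle, self.pos, count)
        self.pos += len(data)
        return data


handle = b"\x00\x2a"                  # arbitrary octets for the example
server = StatelessServer({handle: b"abcdefghij"})
reader_one = ClientFile(server, handle)
reader_two = ClientFile(server, handle)   # same handle, separate position

one_first = reader_one.read(4)
one_second = reader_one.read(4)
two_first = reader_two.read(4)
```

Two readers holding the identical handle get b"abcd" on their first read,
and the first reader's second read returns b"efgh"; the handle itself is
unchanged throughout.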

sas
----
Scott Schoenthal   			sas@pyrps5.pyramid.com
Pyramid Technology Corp.		{sun,hplabs,decwrl,uunet}!pyramid!sas

mike@ists.ists.ca (Mike Clarkson) (01/22/89)

In article <55699@pyramid.pyramid.com>, sas@pyrps5 (Scott Schoenthal) writes:
> NB:  The NFS server is stateless and does not keep track of how many
> client references are active against files managed by the server.
> 
> In the Sun UNIX port of NFS, removal of a file increments a generation
> count on the inode.  The generation field is encoded into the file handle
> that the NFS server passes to the client in the lookup.

On a related topic:

If a client mounts an NFS partition read-only, then there seems to be
even more caching of information on the client.  If even one byte of a
file on the read-only partition is changed on the server (by someone on
the server), then NFS may return an error when accessing the file, and
possibly other files on that partition.

My question is: is there any way to refresh the cache information short
of unmounting and remounting the partition?  A short little program maybe?

Mike.

-- 
Mike Clarkson					mike@ists.UUCP
Institute for Space and Terrestrial Science	mike@ists.ists.ca
York University, North York, Ontario,		uunet!mnetor!yunexus!ists!mike
CANADA M3J 1P3					+1 (416) 736-5611