[comp.unix.shell] NFS File identity resolution?

prakash@aiag.enet.dec.com (Mayank Prakash) (03/14/91)

--

Suppose there is a set of NFS servers and clients on the same network, and
two processes A and B, possibly on different nodes, that can communicate
using sockets. How can the two processes determine whether the file referred
to as "/a/b/c/file" by process A is really the file referred to as
"/A/B/file" by process B? In other words, on the node of process A, "/a/b/c"
could be mounted from X:/server/dir, and on the node of process B, "/A/B"
could be mounted from X:/server/dir. It is therefore not enough to compare
the directory parts alone. One must be able to determine which server
the file is coming from. [Essentially what the define command does, but
I want to do it from C].

Thanks.

  -mayank.

+--------------------------------------------------------------------------+
| InterNet: Prakash@AIAG.ENET.DEC.COM                                      |
| UUCP:     ...!decwrl!aiag.enet.dec.com!Prakash                           |
| VoiceNet: (508)490.8139                                                  |
| BitNet:   prakash%aiag.enet at decwrl.dec.com                            |
| SnailNet: DEC, 290 Donald Lynch Blvd. DLB5-2/B4, Marlboro, MA 01752-0749 |
+--------------------------------------------------------------------------+

Disclaimer: The above is probably only line noise, and does not reflect the 
            opinions of anybody, including mine, far less my employer's.

rickert@mp.cs.niu.edu (Neil Rickert) (03/14/91)

In article <21078@shlump.nac.dec.com> prakash@aiag.enet.dec.com writes:
>--
>
>Given a set of NFS servers and clients on the same network, and two
>processes A and B on possibly different nodes, which can communicate
>using sockets. How can the two processes determine if a file referred to
>as "/a/b/c/file" by process A is really the file referred to as "/A/B/file"
>by the process B? In other words, on the node of the process A, "/a/c/b"

 I don't know if there is an efficient foolproof test.  Perhaps some NFS
gurus will respond.  But as a practical method, if both files have the same
inode number, date, length, and if the parent directories of both have the
same inode number, the statistical probability that they are different
files is very small.
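
A minimal sketch of this heuristic in C. Each process would build a
fingerprint for its own pathname and send it to the other over the socket;
the struct and function names here are hypothetical, and the result is only
a statistical hint, not proof of identity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* What one node can cheaply learn about a file (hypothetical layout). */
struct file_fingerprint {
    unsigned long long ino;        /* st_ino of the file */
    unsigned long long parent_ino; /* st_ino of the containing directory */
    long long size;                /* st_size */
    long long mtime;               /* st_mtime, in seconds */
};

/* Fill *fp for path. Returns 0 on success, -1 on any error. */
int fingerprint_file(const char *path, struct file_fingerprint *fp)
{
    struct stat st, pst;
    char *copy;
    int rc;

    assert(path != NULL && fp != NULL);
    if (stat(path, &st) != 0)
        return -1;

    copy = strdup(path);           /* dirname() may modify its argument */
    if (copy == NULL)
        return -1;
    rc = stat(dirname(copy), &pst);
    free(copy);
    if (rc != 0)
        return -1;

    fp->ino = (unsigned long long)st.st_ino;
    fp->parent_ino = (unsigned long long)pst.st_ino;
    fp->size = (long long)st.st_size;
    fp->mtime = (long long)st.st_mtime;
    return 0;
}

/* Returns 1 if every field agrees, 0 otherwise. */
int fingerprints_match(const struct file_fingerprint *a,
                       const struct file_fingerprint *b)
{
    assert(a != NULL && b != NULL);
    return a->ino == b->ino && a->parent_ino == b->parent_ino &&
           a->size == b->size && a->mtime == b->mtime;
}
```

Note that st_dev is deliberately left out: it is assigned by each client and
need not agree between machines.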

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

subbarao@phoenix.Princeton.EDU (Kartik Subbarao) (03/15/91)

In article <21078@shlump.nac.dec.com> prakash@aiag.enet.dec.com writes:
>--
>
>Given a set of NFS servers and clients on the same network, and two
>processes A and B on possibly different nodes, which can communicate
>using sockets. How can the two processes determine if a file referred to
>as "/a/b/c/file" by process A is really the file referred to as "/A/B/file"
>by the process B? In other words, on the node of the process A, "/a/c/b"
>could be mounted on X:/server/dir, and on the node of process B, "/A/B"
>could be mounted on X:/server/dir. It is therefore not enough to compare
>the directory parts alone. One must be able to determine which server
>the file is coming from. [Essentially what the define command does, but
>I want to do it from C].

You can stat() both files, and compare the inodes and see if they're the
same. This would only be a problem, I would guess, if the two files had the
exact same inode number but in fact came from two different disks. Of
course, you could do a simple popen() to df, chop out the right field, and 
make sure that they're the same. This would confirm that they were mounted
on the same place.
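
A rough sketch of the popen() approach, assuming a df that accepts the POSIX
-P option (one line per filesystem, with the source in the first column).
The function name df_source is made up for this example, and the sketch
refuses paths containing a single quote because the path goes through a
shell.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy the first column of `df -P path` (the mounted source, e.g.
 * "X:/server/dir") into out. Returns 0 on success, -1 on failure. */
int df_source(const char *path, char *out, size_t outlen)
{
    char cmd[1024], line[1024];
    FILE *p;
    int n, rc = -1;

    assert(path != NULL && out != NULL && outlen > 0);
    if (strchr(path, '\'') != NULL)
        return -1;                 /* would break the shell quoting */
    n = snprintf(cmd, sizeof cmd, "df -P '%s' 2>/dev/null", path);
    if (n < 0 || (size_t)n >= sizeof cmd)
        return -1;

    p = popen(cmd, "r");
    if (p == NULL)
        return -1;
    /* Line 1 is the header; line 2 begins with the filesystem name. */
    if (fgets(line, sizeof line, p) != NULL &&
        fgets(line, sizeof line, p) != NULL) {
        size_t len = strcspn(line, " \t\n");
        if (len > 0 && len < outlen) {
            memcpy(out, line, len);
            out[len] = '\0';
            rc = 0;
        }
    }
    pclose(p);
    return rc;
}
```

Each process would call this on its own pathname and exchange the resulting
strings along with the inode numbers.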

			-Kartik
		

--
internet# find . -name core -exec cat {} \; |& tee /dev/tty*
subbarao@phoenix.Princeton.EDU -| Internet
kartik@silvertone.Princeton.EDU (NeXT mail)  
SUBBARAO@PUCC.BITNET			          - Bitnet

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/15/91)

In article <21078@shlump.nac.dec.com> prakash@aiag.enet.dec.com writes:
> Given a set of NFS servers and clients on the same network, and two
> processes A and B on possibly different nodes, which can communicate
> using sockets. How can the two processes determine if a file referred to
> as "/a/b/c/file" by process A is really the file referred to as "/A/B/file"
> by the process B?

You cannot, at least not reliably.

On a single machine, two files are the same if and only if they have the
same device and inode (st_dev and st_ino after stat()). But there are
simply no guarantees of this across multiple machines.

Even if you could find out that the two files were the same, what would
you want to do with the information? NFS doesn't even guarantee that
locks on the file will work right.

Even if you could find out that two pathnames referred to the same file,
and even if you could do something sensible with the file, you wouldn't
want to use the names. What if someone moved the file or unmounted and
remounted the disks in the meantime?

---Dan

tchrist@convex.COM (Tom Christiansen) (03/15/91)

From the keyboard of brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
:In article <21078@shlump.nac.dec.com> prakash@aiag.enet.dec.com writes:
:> Given a set of NFS servers and clients on the same network, and two
:> processes A and B on possibly different nodes, which can communicate
:> using sockets. How can the two processes determine if a file referred to
:> as "/a/b/c/file" by process A is really the file referred to as "/A/B/file"
:> by the process B?
:
:You cannot, at least not reliably.

Well, the lockdaemon manages somehow.   I guess it uses 
something like (host,dev,ino,generation) tuples.

--tom

thurlow@convex.com (Robert Thurlow) (03/15/91)

In <20103:Mar1423:16:4291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>In article <21078@shlump.nac.dec.com> prakash@aiag.enet.dec.com writes:
>> Given a set of NFS servers and clients on the same network, and two
>> processes A and B on possibly different nodes, which can communicate
>> using sockets. How can the two processes determine if a file referred to
>> as "/a/b/c/file" by process A is really the file referred to as "/A/B/file"
>> by the process B?

>You cannot, at least not reliably.

>On a single machine, two files are the same if and only if they have the
>same device and inode (st_dev and st_ino after stat()). But there are
>simply no guarantees of this across multiple machines.

You can, and reliably, but it's a bitch.  You have the inode number
on the remote machine, and if you tapdance and walk through /etc/fstab
and make sure you resolve all of the symlinks out of all your paths,
you can get an identifier of the remote machine in question.  Is it
really worth the pain?
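
The tapdance might be sketched as follows, assuming a system that provides
the <mntent.h> interface (setmntent/getmntent) and a realpath() that
allocates its result when given NULL. The function name mount_source and the
choice of passing the table's pathname as a parameter are assumptions of
this example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Resolve symlinks in path, then find the mount-table entry whose mount
 * point is the longest whole-component prefix of the result, and copy its
 * source (e.g. "X:/server/dir") into out. Returns 0 or -1. */
int mount_source(const char *table, const char *path,
                 char *out, size_t outlen)
{
    struct mntent *m;
    size_t best = 0;
    int rc = -1;
    char *real;
    FILE *mt;

    assert(table != NULL && path != NULL && out != NULL && outlen > 0);
    real = realpath(path, NULL);   /* symlink-free absolute path */
    if (real == NULL)
        return -1;
    mt = setmntent(table, "r");
    if (mt == NULL) {
        free(real);
        return -1;
    }
    while ((m = getmntent(mt)) != NULL) {
        size_t n = strlen(m->mnt_dir);
        int covers = n > 0 && strncmp(real, m->mnt_dir, n) == 0 &&
                     (n == 1 || real[n] == '/' || real[n] == '\0');
        /* >= so that a later mount on the same directory wins */
        if (covers && n >= best && strlen(m->mnt_fsname) < outlen) {
            strcpy(out, m->mnt_fsname);
            best = n;
            rc = 0;
        }
    }
    endmntent(mt);
    free(real);
    return rc;
}
```

The returned source string, together with the path below the mount point and
the inode number, is what the two processes would compare.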

>Even if you could find out that the two files were the same, what would
>you want to do with the information? NFS doesn't even guarantee that
>locks on the file will work right.

NFS doesn't do locks, Dan; haven't you been paying attention?  The
lock protocol does work for remote files; what is suspect is your
lock manager implementation.

>Even if you could find out that two pathnames referred to the same file,
>and even if you could do something sensible with the file, you wouldn't
>want to use the names. What if someone moved the file or unmounted and
>remounted the disks in the meantime?

Well, boundary conditions like this exist on strictly local systems,
as well, Dan; it doesn't seem to stop people from using them.  I agree
that this is not a thing to do capriciously for the reasons you give.

Rob T
--
Rob Thurlow, thurlow@convex.com
An employee and not a spokesman for Convex Computer Corp., Dallas, TX

prakash@fyrpwr.enet.dec.com (Mayank Prakash) (03/16/91)

In article <7169@idunno.Princeton.EDU>, subbarao@phoenix.Princeton.EDU (Kartik Subbarao) writes:
|-> 
|-> You can stat() both files, and compare the inodes and see if they're the
|-> same. This would only be a problem, I would guess, if the two files had the
|-> exact same inode number but in fact came from two different disks. Of

That is the problem I am trying to solve.

|-> course, you could do a simple popen() to df, chop out the right field, and 
|-> make sure that they're the same. This would confirm that they were mounted

This would almost work, except that df truncates the name of the source
file, and it needs an extra process spawn. Perhaps I should rather ask
how does df work, instead?

 -mayank.

boyd@necisa.ho.necisa.oz.au (Boyd Roberts) (03/18/91)

In article <21078@shlump.nac.dec.com> prakash@aiag.enet.dec.com writes:
>--
>
>Given a set of NFS servers and clients on the same network, and two
>processes A and B on possibly different nodes, which can communicate
>using sockets. How can the two processes determine if a file referred to
>as "/a/b/c/file" by process A is really the file referred to as "/A/B/file"
>by the process B? In other words, on the node of the process A, "/a/c/b"
>

Comparing the file-handles will do the job.  But you have to be sure
that the two files are on the same host.  A byte by byte comparison
will do the trick.  There's an NFS call to take a file-descriptor and
return a file-handle (nfs_getfh(2)).

All those other replies about inode numbers are just bogus.  You need
the inode number, dev and generation number.  All that stuff is bundled
into the file handle.  You don't have to worry about the contents of the
file-handle (the object is an opaque cookie anyway), just that the bits
are the same.
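
A sketch of the comparison step only, in C. How each process obtains its
handle is system-specific and not shown; the struct below is a hypothetical
container, sized for the 32 opaque bytes of an NFS version 2 handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FH_BYTES 32   /* NFS version 2 file handles are 32 opaque bytes */

/* Hypothetical record each process would send over the socket. */
struct remote_fh {
    char host[64];                 /* server the handle came from */
    unsigned char data[FH_BYTES];  /* opaque handle, never interpreted */
};

/* Returns 1 if both handles come from the same host and are
 * byte-for-byte identical, 0 otherwise. */
int same_remote_file(const struct remote_fh *a, const struct remote_fh *b)
{
    assert(a != NULL && b != NULL);
    if (strcmp(a->host, b->host) != 0)
        return 0;
    return memcmp(a->data, b->data, FH_BYTES) == 0;
}
```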

And what the hell is this discussion doing in comp.unix.shell?
comp.unix.misc would be a better choice.  Followups to there.

Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/18/91)

In article <thurlow.669009583@convex.convex.com> thurlow@convex.com (Robert Thurlow) writes:
> In <20103:Mar1423:16:4291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >Even if you could find out that the two files were the same, what would
> >you want to do with the information? NFS doesn't even guarantee that
> >locks on the file will work right.
> NFS doesn't do locks, Dan; haven't you been paying attention?

I was referring to NFS, the (exceedingly buggy and unreliable) network
filesystem, all recent implementations of which include a lock manager
over and above the basic NFS client and server.

> >Even if you could find out that two pathnames referred to the same file,
> >and even if you could do something sensible with the file, you wouldn't
> >want to use the names. What if someone moved the file or unmounted and
> >remounted the disks in the meantime?
> Well, boundary conditions like this exist on strictly local systems,
> as well, Dan; it doesn't seem to stop people from using them.  I agree
> that this is not a thing to do capriciously for the reasons you give.

Those reasons imply that the operation in question is not reliable.
There is no way to reliably access two regular file names at once.
Period.

---Dan

root@lingua.cltr.uq.OZ.AU (Hulk Hogan) (03/21/91)

prakash@fyrpwr.enet.dec.com (Mayank Prakash) writes:
>In article <7169@idunno.Princeton.EDU>, subbarao@phoenix.Princeton.EDU (Kartik Subbarao) writes:
>|-> You can stat() both files, and compare the inodes and see if they're the
>|-> same. This would only be a problem, I would guess, if the two files had the
>|-> exact same inode number but in fact came from two different disks. Of
>That is the problem I am trying to solve.
>|-> course, you could do a simple popen() to df, chop out the right field, and 
>|-> make sure that they're the same. This would confirm that they were mounted

>This would almost work, except that df truncates the name of the source
>file, and it needs an extra process spawn. Perhaps I should rather ask
>how does df work, instead?

How about using the statfs(2) call?  An extract from the manual follows.

|#include <sys/vfs.h>
|int statfs(path, buf)
|char *path;
|struct statfs *buf;
|
|int fstatfs(fd, buf)
|int fd;
|struct statfs *buf;
|
|statfs() returns information about a  mounted  file  system.
|path  is  the  path  name  of  any  file  within the mounted
|filesystem.  buf  is  a  pointer  to  a  statfs()  structure
|defined as follows:
|
|     typedef struct {
|            long    val[2];
|     } fsid_t;
|     struct statfs {
|            long    f_type;     /* type of info, zero for now */
|            long    f_bsize;    /* fundamental file system block size */
|            long    f_blocks;   /* total blocks in file system */
|            long    f_bfree;    /* free blocks */
|            long    f_bavail;   /* free blocks available to non-super-user */
|            long    f_files;    /* total file nodes in file system */
|            long    f_ffree;    /* free file nodes in fs */
|            fsid_t  f_fsid;     /* file system id */
|            long    f_spare[7]; /* spare for later */
|     };

Disclaimer: I haven't done this. However, I guess that each file system
has a unique id, and that this is returned in the f_fsid field.  If so, then
this could be used to tell if the two files are on the same filesystem
without the need to parse /etc/fstab. Then it's just a matter of comparing
the inode numbers from stat().
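
An untested-in-anger sketch of that idea in C, assuming <sys/vfs.h> provides
statfs() as in the manual extract. f_fsid is compared as raw bytes because
its layout varies between systems. Whether two different client machines
report the same f_fsid for the same NFS filesystem is an open assumption, so
this version compares two pathnames seen from one machine.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/vfs.h>

/* Returns 1 if both pathnames have the same filesystem id and inode
 * number, 0 if either differs, -1 if statfs() or stat() fails. */
int same_fsid_and_inode(const char *p1, const char *p2)
{
    struct statfs fa, fb;
    struct stat sa, sb;

    assert(p1 != NULL && p2 != NULL);
    if (statfs(p1, &fa) != 0 || statfs(p2, &fb) != 0)
        return -1;
    if (stat(p1, &sa) != 0 || stat(p2, &sb) != 0)
        return -1;
    /* Treat the filesystem id as an opaque blob of bytes. */
    if (memcmp(&fa.f_fsid, &fb.f_fsid, sizeof fa.f_fsid) != 0)
        return 0;
    return sa.st_ino == sb.st_ino;
}
```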

/\ndy
-- 
Andrew M. Jones,  Systems Programmer, 	Internet: andy@lingua.cltr.uq.oz.au
Centre for Lang. Teaching & Research, 	UUCP: uunet!lingua.cltr.uq.oz.au!andy
University of Queensland,  St. Lucia, 	Phone: +61  7 365 6915 (Use 07 in Oz)
Brisbane,  Qld. AUSTRALIA  4072    	Fax: +61 7 365 7077    IRC: HulkHogan

"No matter what hits the fan, it's never distributed evenly....."