[comp.unix.i386] statbug.c: Spot a BUG with NFS stat

guy@auspex.auspex.com (Guy Harris) (03/11/90)

>	On my system the st_dev number returned is negative; clearly a load of
>	garbage.

There is no reason why a negative "st_dev" is necessarily "a load of
garbage".  For an extreme example, consider a system with the
(currently-standard) 8 bits of major and 8 bits of minor device number,
and with 129 different block device drivers....

Many UNIX NFS implementations choose negative "st_dev" values for
remotely-mounted file systems (NFS or RFS), simply to avoid colliding
with values for local file systems.

The problem with "ed" is probably that "ustat()", while working
*perfectly correctly* with those allegedly-"garbage" "st_dev" values,
may not be returning proper data for an NFS-mounted file system. 
Another problem may be that it can't cope with "st_dev" values for file
systems other than local ones, or for NFS file systems in particular.

It certainly *is* possible to have "st_dev" return negative values for
NFS-mounted file systems *and* to have "ustat()" work just fine with
NFS-mounted file systems - SunOS has done so since SunOS 3.2.  The
following test program can be used to see how well "ustat()" works.  It
takes one argument, which should be the path name of a file (any file or
directory) on the file system on which you want to test this, and prints
out the "ustat" information for that file system.

To check its answers, do a "df" on the file system in question and
compare the results.  NOTE: beware of the units that "ustat" may be
using.  UNIX software tends to assume that the "f_tfree" figure is in
units of 512-byte blocks.  As such, if the program reports a figure that
does *not* appear to be in those units, the UNIX implementation on which
it's running arguably has a bug or misfeature.  The SVID - even the S5R4
Third Edition - says only that it's the number of "total free blocks",
and doesn't indicate whether this means:

	sectors;
	512-byte "blocks", even if your system has 1024-byte sectors
	    (yes, there really *are* systems that have 1K sectors rather
	    than 512-byte sectors);
	the native block size of the file system - or, in file systems
	    with two "block" sizes, which one this is; for instance, is
	    it the "block" size or the "fragment" size in a BSD file
	    system?

and it is quite possible that ISC, or somebody, chose some meaning other
than the one that will, as indicated, make "ed" happy, namely "512-byte
'blocks'".  (The S5R3.1 "ed" - and, I think, most "ed"s prior to that,
and probably the S5R3.2 one as well - shift the file size right by 9
bits before comparing it with the "f_tfree", so that "f_tfree" must be
in units of 512-byte chunks for this to work.)

It is also possible that they misimplemented the way "ustat" on an NFS
file system deals with the results of the STATFS call required for
"ustat".  What it should do is multiply "bavail" (since that's the
number of blocks that non-privileged users can use - "bfree" is the
number of blocks "free", but in e.g. the BSD file system, only the
super-user can use all of them; other users can use only "bavail" of
them) by "bsize" and divide the result by 512 (or, alternatively,
multiply it by "bsize/512").

(Yes, I know, NFSSRC4.0 doesn't quite do that.  *Mea culpa.* 512 should
be used instead of DEV_BSIZE; the comment in the code notes that AT&T
was, shall we say, less-than-careful in specifying what the units
actually were.  Doing the multiplication before the division may be
risky, too, if it can overflow; if you can have a file system with more
than 2 gigabytes on it, this is a potential problem.)

The test program (which works just fine in SunOS 4.0.3) follows.  It may
have to be compiled in the "System V environment" on systems that have
multiple environments (e.g., compile it with "/usr/5bin/cc" on SunOS). 
I've only compiled it under SunOS, so I don't know whether it'll work on
a "normal" S5 system.

NOTE: the current NFS protocol has no way of telling a client how many
"file slots" - e.g., inodes - are free on a file system.  As such, the
"f_tinode" number will be bogus on an NFS file system; "ed" doesn't use
that, though.

----------------------------------Cut Here-------------------------------------
#include <stdio.h>
#include <sys/types.h>
#include <sys/sysmacros.h>
#include <sys/stat.h>
#include <ustat.h>
#include <errno.h>

/*
 * If you have the ANSI C "strerror" routine, toss this declaration and
 * the later definition out.
 */
static char *strerror(/*int errnum*/);

int
main(argc, argv)
	int argc;
	char **argv;
{
	struct stat statb;
	struct ustat ustatb;

	if (argc != 2) {
		(void) fprintf(stderr, "Usage: ustattst <file>\n");
		return 1;
	}

	if (stat(argv[1], &statb) < 0) {
		(void) fprintf(stderr,
		    "ustattst: Can't stat %s: %s\n", argv[1], strerror(errno));
		return 2;
	}

	if (ustat(statb.st_dev, &ustatb) < 0) {
		(void) fprintf(stderr,
		    "ustattst: Can't do ustat on (%d, %d): %s\n",
		    major(statb.st_dev), minor(statb.st_dev), strerror(errno));
		return 2;
	}

	/* Cast to long: f_tfree and f_tinode need not be "long" everywhere. */
	(void) printf("tfree %ld tinode %ld fname %.6s fpack %.6s\n",
	    (long) ustatb.f_tfree, (long) ustatb.f_tinode, ustatb.f_fname,
	    ustatb.f_fpack);

	return 0;
}

extern int sys_nerr;
extern char *sys_errlist[];

static char *
strerror(errnum)
	int errnum;
{
	static char errbuf[6+11+1];	/* "Error " + signed %d + '\0' */
	
	if (errnum > 0 && errnum < sys_nerr)
		return sys_errlist[errnum];
	else {
		(void) sprintf(errbuf, "Error %d", errnum);
		return errbuf;
	}
}

liam@cs.qmw.ac.uk (William Roberts) (03/12/90)

Your "bug" is complete nonsense. If you can give chapter and
verse for your strange assertion that negative dev_t values are
"garbage", then and only then do you have a bug in your system
rather than a bug in your understanding.

On my system, <sys/types.h> has dev_t as unsigned short
anyway, so no negative numbers there. Since an NFS filestore is
not represented by a local physical device, the dev_t number
isn't required to have a major device number that could be used
as an index into the bdevsw or cdevsw tables.

In fact, most NFS implementations choose to fake a major device
number that is way in excess of the likely valid range to avoid
clashes. It's also a function of the client, because "device
numbers" aren't part of the NFS abstraction.

-- 

William Roberts                 ARPA: liam@cs.qmw.ac.uk
Queen Mary & Westfield College  UUCP: liam@qmw-cs.UUCP
Mile End Road                   AppleLink: UK0087
LONDON, E1 4NS, UK              Tel:  01-975 5250 (Fax: 01-980 6533)

daveb@i88.isc.com (David G. Burton) (03/15/90)

In article <638@hades.OZ> greyham@hades.OZ (Greyham Stoney) writes:
| SYSTEM: ISC 386/ix 2.0.2 NFS
| PROBLEM: stat() library function returns a garbage st_dev field for NFS mounted
| 	files.
| ... 
| TEST:	The program below (statbug.c), when run with a single filename argument
| 	will see if the st_dev number returned by stat() is positive.
|	... 
| 	On my system the st_dev number returned is negative; clearly a load of
| 	garbage. ...
| 
| Please tell me if your system fails this test.
| [ program to check if stat().st_dev < 0 deleted ]

NFS implementations derived from Lachman NFS *should* fail this "test".
The test is inappropriate.  As explained in an earlier post, SVR3.x
determines the remote-ness of a file by a negative st_dev. The -local
option of find(1) reports true if ~st_dev is less than zero (i.e. if
st_dev is positive, it's local). The NFS dependent stat code gets st_dev,
then complements the MSB so that find(1) works properly. Note that the
stat(2) man page is not completely accurate when it states for st_dev,
"No other meaning is associated with this value." Also heed the caveat
on the ustat(2) man page - use statfs(2) instead.

| 	... It may also be positive and a load of garbage however. A very
| 	big positive number probably looks sus.

On an i386 NFS client, the only way for st_dev to come out positive would
be for bmajor(nfsd_major_number) > 127. This is not possible.

-- Dave Burton
--
Dave Burton
uunet!ism780c!laidbak!daveb