[comp.sys.sgi] /usr/etc/nfsstat interpretation

swanson@UXE.CSO.UIUC.EDU (Amy Swanson) (12/06/89)

	I have a 4D/240 server that is being used as an NFS and Yellow Pages
server for a lab of 20 Personal Irises.  Lately, we've been experiencing a
lot of problems with NFS "getattr" failures.  I looked at the output of
/usr/etc/nfsstat, but I am not sure what to look for or how to tell whether
the server or a client is having the problem.  If it is a client, how do I
find out which of the 20 PIs it is?  Could someone help me interpret the
following nfsstat output?  Thanks!


Amy



rels_55% /usr/etc/nfsstat -cs

Server rpc:
calls      badcalls   nullrecv   badlen     xdrcall
5292246    0          0          0          0          

Server nfs:
calls      badcalls
5292246    0          

null       getattr    setattr    root       lookup     readlink   read       
0  0%      1674964 31% 49493  0%  0  0%      1009263 19% 913  0%    1210477 22% 

wrcache    write      create     remove     rename     link       symlink    
0  0%      1054781 19% 16559  0%  14286  0%  3994  0%   1405  0%   2  0%      

mkdir      rmdir      readdir    fsstat     
307  0%    92  0%     253719  4% 1991  0%   



Client rpc:
calls      badcalls   retrans    badxid     timeout    wait       newcred
316821     322        105        9          266        0          0          

Client nfs:
calls      badcalls   nclget     nclsleep
316821     322        316821     0          

null       getattr    setattr    root       lookup     readlink   read       
0  0%      30735  9%  1  0%      0  0%      32406 10%  12432  3%  180105 56% 

wrcache    write      create     remove     rename     link       symlink    
0  0%      45778 14%  129  0%    129  0%    0  0%      0  0%      0  0%      

mkdir      rmdir      readdir    fsstat     
0  0%      0  0%      14949  4%  157  0%    


===============

Amy Swanson
SGI/Alliant Systems Administrator
NCSA - National Center for Supercomputing Applications
University of Illinois @ Urbana-Champaign

amys@ncsa.uiuc.edu

brendan@illyria.wpd.sgi.com (Brendan Eich) (12/07/89)

> 	I have a 4D/240 server that is being used as an NFS and Yellow Pages
> server for a lab of 20 Personal Irises.  Lately, we've been experiencing a
> lot of problems with NFS "getattr" failures.  I looked at the output of
> /usr/etc/nfsstat, but I am not sure what to look for or how to tell whether
> the server or a client is having the problem.  If it is a client, how do I
> find out which of the 20 PIs it is?  Could someone help me interpret the
> following nfsstat output?  Thanks!

When you write "getattr" failures, do you mean this message came out on
the client's console?

	NFS getattr failed for server XXX: Connection timed out

If so, the client is soft-mounting the server's filesystem(s).  A hard mount
retries indefinitely and never returns a timeout error to the caller.  The
nfsstat numbers confirm this:

> Client rpc:
> calls      badcalls   retrans    badxid     timeout    wait       newcred
> 316821     322        105        9          266        0          0          
>
> Client nfs:
> calls      badcalls   nclget     nclsleep
> 316821     322        316821     0          

The 9 badxids suggest that the server got behind the client: old replies
came back after the calling client process had timed out, and new calls
waiting for their replies noted the old replies' transaction identifiers
(xids) as "bad".
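
To make the badxid mechanism concrete, here is a toy model in Python.  It is
only a sketch of the bookkeeping described above, not real NFS or RPC code:
the class ToyRpcClient and its methods are made up for illustration, and it
tracks a single outstanding call, which a real client does not.

```python
# Toy model of a soft-mounted client's RPC bookkeeping (hypothetical names,
# one outstanding call at a time; real clients are more involved).
class ToyRpcClient:
    def __init__(self):
        self.next_xid = 1
        self.pending = None   # xid of the call currently awaited, if any
        self.badxid = 0       # replies that matched no awaited call
        self.timeout = 0      # calls abandoned after timing out

    def call(self):
        """Send a new call and remember its transaction id."""
        xid = self.next_xid
        self.next_xid += 1
        self.pending = xid
        return xid

    def give_up(self):
        """Soft mount: stop waiting for the pending call and count a timeout."""
        self.timeout += 1
        self.pending = None

    def receive(self, reply_xid):
        """Accept a reply only if it carries the awaited xid."""
        if reply_xid == self.pending:
            self.pending = None
            return True
        self.badxid += 1      # stale reply to an abandoned call
        return False


client = ToyRpcClient()
old = client.call()      # a getattr goes out
client.give_up()         # server too slow: the caller sees "timed out"
new = client.call()      # the next call goes out
client.receive(old)      # the late reply to the abandoned call shows up
client.receive(new)      # then the reply that was actually awaited
print("timeout =", client.timeout, "badxid =", client.badxid)
```

In this model one slow reply produces one timeout and one badxid, which is
the pattern described above: the reply did arrive, just too late.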

The 266 timeouts indicate an overloaded server, a sick Ethernet, or some
similar problem.  Try mounting with a larger initial timeout (the timeo
mount(1M)/fstab(4) option) and/or a higher retransmission attempt limit (the
retrans option).  Use netstat(1), for example netstat -i, to check for
input/output errors detected by the network interface, which might indicate
a sick Ethernet.
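
For a rough sense of scale, the Python sketch below does two things.  First
it turns the "Client rpc" counters quoted above into rates.  Second it
estimates how long a soft mount waits before giving up for a given timeo and
retrans.  The second part rests on assumptions that are not from this thread:
that timeo is in tenths of a second, that retrans counts retransmissions
after the first send, and that the wait doubles after each retransmission
with no cap.  Check mount(1M)/fstab(4) on your system before relying on it;
the function name soft_mount_budget and the sample values are hypothetical.

```python
# Counters copied from the "Client rpc" line quoted above.
calls, badcalls, retrans, badxid, timeout = 316821, 322, 105, 9, 266


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


print("timeouts per call:  %.3f%%" % pct(timeout, calls))
print("badcalls per call:  %.3f%%" % pct(badcalls, calls))
print("badxid per timeout: %.1f%%" % pct(badxid, timeout))


def soft_mount_budget(timeo_tenths, retrans_limit):
    """Seconds a soft mount waits in total before failing a call.

    ASSUMPTION: the first wait is timeo (tenths of a second) and each
    retransmission doubles the wait, with no upper cap.  Real clients
    may cap or adapt the timeout, so treat this as an upper-level sketch.
    """
    waits = [timeo_tenths * 2 ** i / 10.0 for i in range(retrans_limit + 1)]
    return sum(waits)


# Hypothetical option values, chosen only to show the effect of raising them.
for timeo_tenths, retrans_limit in [(10, 3), (20, 5)]:
    print("timeo=%d,retrans=%d -> about %.1f s before giving up"
          % (timeo_tenths, retrans_limit,
             soft_mount_budget(timeo_tenths, retrans_limit)))
```

Under these assumptions, raising either option lengthens the window in which
a slow server can still answer before the client reports a failure.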

Brendan Eich
Silicon Graphics, Inc.
brendan@sgi.com