swanson@UXE.CSO.UIUC.EDU (Amy Swanson) (12/06/89)
I have a 4D/240 server that is being used as a NFS and Yellow Pages
server for a lab of 20 Personal Irises. Lately, we've been experiencing a
lot of problems with NFS "getattr" failures. I looked at the output of the
/usr/etc/nfsstat but am not sure what I am looking for or how to determine if
it is the server or a client having a problem, and if it is a client, how do
I find out which of the 20 PIs is having the problem. Could someone help me
interpret the following nfsstat output? Thanks!

Amy

rels_55% /usr/etc/nfsstat -cs

Server rpc:
calls       badcalls    nullrecv    badlen      xdrcall
5292246     0           0           0           0

Server nfs:
calls       badcalls
5292246     0
null        getattr     setattr     root        lookup      readlink    read
0 0%        1674964 31% 49493 0%    0 0%        1009263 19% 913 0%      1210477 22%
wrcache     write       create      remove      rename      link        symlink
0 0%        1054781 19% 16559 0%    14286 0%    3994 0%     1405 0%     2 0%
mkdir       rmdir       readdir     fsstat
307 0%      92 0%       253719 4%   1991 0%

Client rpc:
calls       badcalls    retrans     badxid      timeout     wait        newcred
316821      322         105         9           266         0           0

Client nfs:
calls       badcalls    nclget      nclsleep
316821      322         316821      0
null        getattr     setattr     root        lookup      readlink    read
0 0%        30735 9%    1 0%        0 0%        32406 10%   12432 3%    180105 56%
wrcache     write       create      remove      rename      link        symlink
0 0%        45778 14%   129 0%      129 0%      0 0%        0 0%        0 0%
mkdir       rmdir       readdir     fsstat
0 0%        0 0%        14949 4%    157 0%

===============
Amy Swanson
SGI/Alliant Systems Administrator
NCSA - National Center for Supercomputing Applications
University of Illinois @ Urbana-Champaign
amys@ncsa.uiuc.edu
brendan@illyria.wpd.sgi.com (Brendan Eich) (12/07/89)
> I have a 4D/240 server that is being used as a NFS and Yellow Pages
> server for a lab of 20 Personal Irises. Lately, we've been experiencing a
> lot of problems with NFS "getattr" failures. I looked at the output of the
> /usr/etc/nfsstat but am not sure what I am looking for or how to determine if
> it is the server or a client having a problem, and if it is a client, how do
> I find out which of the 20 PIs is having the problem. Could someone help me
> interpret the following nfsstat output? Thanks!

When you write "getattr" failures, do you mean this message came out on the
client's console?

    NFS getattr failed for server XXX: Connection timed out

If so, the client is soft-mounting the server's filesystem(s). A hard mount
never times out. The nfsstat numbers confirm this:

> Client rpc:
> calls       badcalls    retrans     badxid      timeout     wait        newcred
> 316821      322         105         9           266         0           0
>
> Client nfs:
> calls       badcalls    nclget      nclsleep
> 316821      322         316821      0

The 9 badxids suggest that the server got behind the client: old replies came
back after the calling client process had timed out, and new calls waiting for
their replies noted the old replies' transaction identifiers (xids) as "bad".
The 266 timeouts indicate an overloaded server, sick Ethernet, or some similar
problem.

Try mounting with a larger initial timeout (the timeo mount(1M)/fstab(4)
option), and/or retransmission attempt limit (retrans). Use netstat(1) to
check for input/output errors detected by the network interface, which might
indicate a sick Ethernet.

Brendan Eich
Silicon Graphics, Inc.
brendan@sgi.com
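To put the client rpc counters discussed in the reply above in proportion, here is a minimal Python sketch that pairs each column name with its value and prints each error counter as a percentage of total calls. The helper names (parse_counters, rate) are made up for illustration and are not part of nfsstat or any library; the two input rows are copied from the posted output, and the sketch assumes the header and value rows are whitespace-separated, one integer per column.

```python
def parse_counters(header: str, values: str) -> dict[str, int]:
    """Pair each column name with the integer printed beneath it."""
    names = header.split()
    numbers = [int(v) for v in values.split()]
    if len(names) != len(numbers):
        raise ValueError("header and value rows have different lengths")
    return dict(zip(names, numbers))


def rate(count: int, calls: int) -> float:
    """Return count as a percentage of calls (0.0 if there were no calls)."""
    return 100.0 * count / calls if calls else 0.0


# "Client rpc" rows copied from the nfsstat output in the original post.
client_rpc = parse_counters(
    "calls badcalls retrans badxid timeout wait newcred",
    "316821 322 105 9 266 0 0",
)

# Print each error-related counter alongside its share of all client calls.
for name in ("badcalls", "retrans", "badxid", "timeout"):
    pct = rate(client_rpc[name], client_rpc["calls"])
    print(f"{name:8s} {client_rpc[name]:6d}  {pct:.3f}% of calls")
```

Running the same calculation on each of the 20 clients in turn (nfsstat -c on each Personal Iris) would be one way to see whether the timeouts are concentrated on a single machine or spread across the lab.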