[comp.protocols.nfs] Response from rpc.pcnfsd - tuning timeouts

GEustace@massey.ac.nz (Glen Eustace) (03/05/90)

We are experiencing a problem with our PCs and their communication with
the rpc.pcnfsd process on our server.

We have turned on debugging to try to get some idea of what is
happening.  It would appear that we get a kind of meltdown situation.

PC A is trying to do its NET NAME command.
- It gets no reply to the RPC, probably because of a timeout.
- It tries again, up to about 4 times at the moment, and finally gives up.

PC B is trying to do a NET USE LPT1: etc.
- Same story, only PRI_INIT rather than AUTH_PROC is repeated.

PC C is trying to do a print.
- Same story, only with multiple PRI_STARTs.

It would appear that with the current load on our server and network, the
rpc packets are being executed but the successful reply is being lost by
the PCs so they try again.  As more and more PCs hit this situation
things just get progressively worse.
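
To put rough numbers on that feedback loop, here is a minimal sketch in Python. It is a toy model, not a description of how PC-NFS actually behaves: it assumes a fixed retransmission interval (no backoff), that the server executes every duplicate it receives, and the limit of 4 tries seen in the debug output. The function names and the sample figures are hypothetical.

```python
import math

def transmissions(reply_latency, timeout, max_tries=4):
    """Copies of one request a client sends before a reply arrives or it gives up.

    Assumes (hypothetically) a fixed retransmission interval of `timeout`
    seconds, with no backoff between tries.
    """
    sent = math.floor(reply_latency / timeout) + 1
    return min(sent, max_tries)

def server_work(n_clients, reply_latency, timeout, max_tries=4):
    """Requests the server executes if every duplicate is run again."""
    return n_clients * transmissions(reply_latency, timeout, max_tries)

if __name__ == "__main__":
    # Hypothetical figures: 20 PCs, 1 second timeout, rising reply latency.
    for latency in (0.5, 1.5, 2.5, 10.0):
        print(latency, server_work(20, latency, 1.0))
```

In this model, once the reply latency exceeds the timeout, each PC multiplies the work it asks of the server, which pushes the latency up further for everyone else.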

Is there some way of increasing the rpc timeout used by the NET command?
I have tried the /t switch on PCNFS.SYS but it does not seem to have any
effect.  We are currently trying to improve the performance of rpc.pcnfsd
by removing the getpwnam( nobody ) and 'nice'ing the process.  But we are
not having a lot of success.

Any help would be appreciated.

-- 
-----------------------------------------------------------------------
  Glen Eustace, Software Manager, Computer Centre, Massey University,
   Palmerston North, New Zealand. Phone: +64 63 69099 x7440 GMT+12
             E-Mail via Internet: G.Eustace@massey.ac.nz
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

geoff@hinode.East.Sun.COM (Geoff Arnold @ Sun BOS - R.H. coast near the top) (03/06/90)

Quoth GEustace@massey.ac.nz (Glen Eustace) (in <599@massey.ac.nz>):
#We are experiencing a problem with our PCs and their communication with
#the rpc.pcnfsd process on our server.
#[and then provides details]
#It would appear that with the current load on our server and network, the
#rpc packets are being executed but the successful reply is being lost by
#the PCs so they try again.  As more and more PCs hit this situation
#things just get progressively worse.

I'd be curious to see what kind of collision rate you're experiencing
on the network, and also what kind of boards you have in the PCs. As an
experiment, I reniced rpc.pcnfsd to 20 on my workstation, then fired up
everything I could think of - repaginating a 50 page Frame document,
confirming changes to my 500KB mail folder :-), etc. Even so, my PC
was able to get a "net name" executed in around 3 seconds.

Try "netstat -i" on both the server and the PC. 
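
If it helps, the collision rate can be worked out from that output as collisions divided by output packets. The Python sketch below assumes the usual column headings (Name, Opkts, Collis) and that every data row has the same number of fields as the header; the sample text and its numbers are made up for illustration, not taken from any real system.

```python
def collision_rates(netstat_output):
    """Return {interface: collisions as a percentage of output packets}.

    Assumes a header row containing 'Name', 'Opkts' and 'Collis', and
    data rows with the same number of whitespace-separated fields.
    """
    lines = [ln for ln in netstat_output.splitlines() if ln.strip()]
    header = lines[0].split()
    name_col = header.index("Name")
    opkts_col = header.index("Opkts")
    collis_col = header.index("Collis")
    rates = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip rows that do not line up with the header
        opkts = int(fields[opkts_col])
        collis = int(fields[collis_col])
        rates[fields[name_col]] = 100.0 * collis / opkts if opkts else 0.0
    return rates

# Hypothetical sample output, for illustration only.
SAMPLE = """\
Name  Mtu   Net/Dest  Address    Ipkts    Ierrs  Opkts    Oerrs  Collis  Queue
le0   1500  campus    server     1200000  15     1000000  4      50000   0
lo0   1536  loopback  localhost  5000     0      5000     0      0       0
"""

if __name__ == "__main__":
    for name, pct in collision_rates(SAMPLE).items():
        print(name, "%.1f%%" % pct)
```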

PS I can't believe that slowing the RPC retransmission backoff would
help.

PPS You might also try comparing the performance against the value you
get after running "NET PCNFSD some-pc"; i.e. talking to a system not running a
portmapper or a PCNFSD daemon. After all, if you actually get through to
the rpc.pcnfsd it means you've successfully communicated with the
portmapper. If it were a network problem you'd expect to see the portmapper
requests failing too.
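
One way to time the portmapper step on its own is to send it a GETPORT call by hand. The Python sketch below builds that call as described in the ONC RPC and portmapper specifications (portmapper program 100000 version 2, procedure 3, UDP port 111) and asks for the PCNFSD program, 150001. The PCNFSD version number (1), the timeout, and the function names are assumptions made for illustration; this is not how the NET command itself is implemented.

```python
import socket
import struct
import time

PMAP_PROG = 100000       # portmapper program number
PMAP_VERS = 2
PMAPPROC_GETPORT = 3
PMAP_PORT = 111
PCNFSD_PROG = 150001     # rpc.pcnfsd program number
IPPROTO_UDP = 17

def build_getport_call(xid, prog, vers, proto=IPPROTO_UDP):
    """Build an RPC CALL message asking the portmapper for prog/vers."""
    # xid, msg_type=CALL(0), rpcvers=2, program, version, procedure
    header = struct.pack(">6I", xid, 0, 2, PMAP_PROG, PMAP_VERS, PMAPPROC_GETPORT)
    auth = struct.pack(">4I", 0, 0, 0, 0)   # AUTH_NONE credential and verifier
    args = struct.pack(">4I", prog, vers, proto, 0)
    return header + auth + args

def parse_getport_reply(data, xid):
    """Return the port from a successful GETPORT reply (0 = not registered)."""
    if len(data) < 28:
        raise ValueError("reply too short")
    rxid, mtype, rstat, _vflavor, vlen = struct.unpack(">5I", data[:20])
    if rxid != xid or mtype != 1 or rstat != 0:
        raise ValueError("not an accepted reply to this call")
    off = 20 + ((vlen + 3) & ~3)             # skip the padded verifier body
    astat, port = struct.unpack(">2I", data[off:off + 8])
    if astat != 0:
        raise ValueError("call was not executed successfully")
    return port

def time_getport(host, prog=PCNFSD_PROG, vers=1, timeout=5.0):
    """Return (port, seconds) for one GETPORT round trip; vers=1 is assumed."""
    xid = int(time.time()) & 0xFFFFFFFF
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        start = time.time()
        sock.sendto(build_getport_call(xid, prog, vers), (host, PMAP_PORT))
        data, _ = sock.recvfrom(512)
        return parse_getport_reply(data, xid), time.time() - start
    finally:
        sock.close()
```

Calling time_getport("your-server") a few times under load would show whether the portmapper replies promptly while rpc.pcnfsd does not; a socket timeout here would point back at the network rather than at the daemon.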

Puzzled,

Geoff

Geoff Arnold, PC-NFS architect, Sun Microsystems. (geoff@East.Sun.COM)
-------------------
"The Bible is not my Book and Christianity is not my religion. I could never give
assent to the long complicated statements of Christian dogma." (Abraham Lincoln)