ericw@janis.UUCP (Eric Wedaa) (05/24/91)
Does anyone have any ideas on how to tune the nfs daemon "nfsd"? I'm calling
it with the recommended call (nfsd 4) but my server still gets a bit slow
when folks start doing makes across the net. Any recommendations on how
to come up with some standard performance figures so that I have some
way to reliably measure my changes? Or am I just being overly sensitive to
the users and their comments on performance?
The following snapshot was taken this morning when the local system was
relatively calm, and a series (6) of 'find / -name core -print' was being
run on one of the clients.
#uptime
10:29am up 3 days, 57 mins, 4 users, load average: 3.35, 2.76, 2.27
#ps uax
(Trailing trash removed)
USER PID %CPU %MEM SZ RSS TT STAT TIME COMMAND
ericw 14765 57.4 3.6 333 225 16 R 0:03 ps uax
root 86 12.2 0.6 100 31 ? S 35:18 /etc/nfsd 4
root 87 11.6 0.6 100 31 ? S 35:32 /etc/nfsd 4
root 89 11.6 0.6 100 31 ? S 35:20 /etc/nfsd 4
root 88 11.2 0.6 100 31 ? S 35:44 /etc/nfsd 4
bob 14700 2.6 3.3 272 205 17 S 2:16 vnews
ericw 14760 0.8 2.2 291 134 16 T 0:04 vi + /tmp/posta14759
ericw 12707 0.8 2.1 211 130 16 S 0:44 -csh (csh)
root 155 0.6 0.2 5 3 ? S 13:28 /etc/update
root 168 0.3 2.3 207 140 ? S 9:10 /etc/rwhod
root 158 0.0 0.3 32 14 ? I 1:41 /etc/cron
root 95 0.0 0.2 11 5 ? I 0:00 /etc/biod 4
#ruptime
(Edited down to clients only, local machine is Janis)
drteeth up 3+01:14, 13 users, load 1.12, 1.13, 1.05
janis up 3+01:04, 4 users, load 3.05, 3.04, 2.47
kermit up 14+18:07, 1 user, load 0.00, 0.00, 0.06
robin up 19+23:07, 0 users, load 0.03, 0.06, 0.07
--
=====================================================================
| These views are mine, and not those of my company or co-workers. |
=====================================================================
grr@cbmvax.commodore.com (George Robbins) (05/24/91)
In article <119@janis.UUCP> ericw@janis.UUCP (Eric Wedaa) writes:
>
> Does anyone have any ideas on how to tune the nfs daemon "nfsd"? I'm calling
> it with the recommended call (nfsd 4) but my server still gets a bit slow
> when folks start doing makes across the net. Any recommendations on how
> to come up with some standard performance figures so that I have some
> way to reliably measure my changes? Or am I just being overly sensitive to
> the users and their comments on performance?

There are two ways of "tuning" this situation. One is to reduce the number
of NFS daemons; the other is to run the daemons at low priority, either by
starting them with "nice" or by "renice"-ing them. Either one presumably
might have a negative impact on NFS performance, especially if the server
is CPU bound, but this doesn't seem to be a serious problem.

> The following was made this morning when the local system was relatively
> calm, and a series (6) of 'find / -name core -print' was being run on one of
> the clients.

We have had this happen here several times. In our environment, it's not
too difficult to find the culprit and do a little "education". In a student
or other environment, this might not be an effective solution.

I don't know if other NFS implementations show such a severe impact when
they are abused by their clients.

--
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)
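[Ed.: the nice/renice approach described above might look like the
following. These are illustrative command lines only; the PIDs are the
nfsd processes from the ps listing earlier in the thread, and exact
option syntax varies between BSD-era and modern systems.]

```shell
# Lower the scheduling priority of already-running nfsd processes so
# interactive users win the CPU during heavy NFS traffic.
# (PIDs 86-89 are taken from the ps listing above; substitute your own.)
renice 5 -p 86 87 88 89

# Or start the daemons niced in the first place, e.g. in rc.local
# (older systems write this as "nice -5"):
nice -n 5 /etc/nfsd 4
```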
rusty@groan.Berkeley.EDU (Rusty Wright) (05/25/91)
The metric I heard at a presentation about system tuning by a guy from
Sun at the Sun User Group meeting is that you want 2 nfsd's per
exported filesystem (or was it per disk drive). My guess is that's 1
to handle the reads and 1 to handle the writes, so that if you export
filesystems read-only you could drop the number of nfsd's. Likewise if
any of the filesystems are accessed infrequently.
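[Ed.: the rule of thumb above can be turned into a quick sizing helper.
A minimal sketch, assuming an exports-style file with one filesystem per
non-blank, non-comment line; the function name and the modern grep
options are the editor's, not from the original post.]

```shell
# suggest_nfsds FILE - apply the "2 nfsd's per exported filesystem"
# heuristic: count non-blank, non-comment lines and double the result.
suggest_nfsds() {
    fs_count=$(grep -c -v -e '^#' -e '^[[:space:]]*$' "$1")
    echo $((fs_count * 2))
}

# Example: an exports file listing /usr/src and /home would yield 4.
```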
kenc@suntan.viewlogic.com (Kenstir) (05/25/91)
In article <119@janis.UUCP>, ericw@janis.UUCP (Eric Wedaa) writes:
>
> Does anyone have any ideas on how to tune the nfs daemon "nfsd"? I'm calling
> it with the recommended call (nfsd 4) but my server still gets a bit slow
> when folks start doing makes across the net. Any recommendations on how
> to come up with some standard performance figures so that I have some
> way to reliably measure my changes? Or am I just being overly sensitive to
> the users and their comments on performance?

Upon the recommendation of a DEC customer service guy, I lowered the
number of nfsd's to the number of actual disks I have, 2. I found that
the user on the console becomes significantly happier during heated NFS
traffic, as there are only 2 processes against you in the battle for
the timeslice. DECstation 3100, Ultrix 4.1.

--
Kenneth H. Cox
Viewlogic Systems, Inc.
kenc@viewlogic.com
..!{harvard,husc6}!viewlogic.com!kenc
grr@cbmvax.commodore.com (George Robbins) (05/27/91)
In article <RUSTY.91May24113908@groan.Berkeley.EDU> rusty@groan.Berkeley.EDU (Rusty Wright) writes:
> The metric I heard at a presentation about system tuning by a guy from
> Sun at the Sun User Group meeting is that you want 2 nfsd's per
> exported filesystem (or was it per disk drive). My guess is that's 1
> to handle the reads and 1 to handle the writes so that if you export
> filesystems readonly you could drop the number of nfsd's. Likewise if
> any of the filesystems are accessed infrequently.

Frankly, this sounds like suicide the way the Ultrix NFS daemons like to
misbehave. Also, it might be better advice for a dedicated server than
for a timesharing system that happens to export many of its filesystems.

I'm really curious whether the Ultrix behavior is a result of bugs or
simply the way that all NFS servers act. The worst case seems to be "find",
which reads "directories" rather than "files", which I believe are different
classes of operation under NFS. It may be that the "stateless" behavior that
NFS implements turns sequentially "reading" a directory into some highly
cpu intensive search and search again algorithm.

[ for c.p.nfs types: a client doing a "find" against an Ultrix NFS exported
filesystem brings the server to its knees, with the NFS daemons sharing
~100% of the CPU time amongst themselves... Ouch. This happens often
enough to be a recognizable syndrome and prompts a witch hunt to find
which client is up to mischief ]

--
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)
mark@loki.une.oz.au (Mark Garrett) (05/27/91)
grr@cbmvax.commodore.com (George Robbins):
> In article <RUSTY.91May24113908@groan.Berkeley.EDU> rusty@groan.Berkeley.EDU (Rusty Wright) writes:
> [...]
> filesystem brings the server to its knees, with the NFS daemons sharing
> ~100% of the CPU time amongst themselves... Ouch. This happens often
> enough to be a recognizable syndrome and prompts a witch hunt to find
> which client is up to mischief ]

I've only got Ultrix to play with, and it was only last weekend that I
gave NFS its first run, exporting src from a VAX 3500 to a DECsystem
5400, only to find that the 3500 took a real kick in response time with
very little NFS effort. I was not impressed. Believing this to be normal
behavior for NFS, I'm glad to hear it's Ultrix (sort of).

Has Ultrix 4.2 fixed this? Would BSD source for NFS have the same
problems?

--
Mark Garrett                    Internet: mark@loki.une.edu.au
Phone: +61 (066) 20 3859
University of New England, Northern Rivers, Lismore NSW Australia.
jim@cs.strath.ac.uk (Jim Reid) (05/28/91)
In article <21936@cbmvax.commodore.com> grr@cbmvax.commodore.com (George Robbins) writes:
> I'm really curious whether the Ultrix behavior is a result of bugs or
> simply the way that all NFS servers act. The worst case seems to be "find"
> which reads "directories" rather than "files", which I believe are different
> classes of operation under NFS. It may be that "stateless" behavior that
> NFS implements turns sequentially "reading" a directory into some highly
> cpu intensive search and search again algorithm.
>
> [ for c.p.nfs types: a client doing a "find" against an Ultrix NFS exported
> filesystem brings the server to its knees, with the NFS daemons sharing
> ~100% of the CPU time amongst themselves... Ouch. This happens often
> enough to be a recognizable syndrome and prompts a witch hunt to find
> which client is up to mischief ]
Any recursive directory traverse via NFS can be painful (du is just as
bad as find). This is because the client makes LOTS of NFS requests -
several read-directory-entries calls to get the file names and the file
handles, followed by a get-file-attributes request for each file. If the
client is faster at sending these out than the server is at replying,
this is bad news. The server will be bombarded with NFS requests which
it can't service quickly enough. The requests time out, so the client
sends them all over again, saturating the server once more and closing
the loop. Another nasty is that the client and server file attribute
caches will get flushed and filled with entries from the traverse.
This can mean that heavily used cache entries have been removed to
make way for those at the tail of the directory traverse.
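[Ed.: the request volume Jim describes can be estimated with a
back-of-the-envelope model. This is not NFS code; the operation mix is
the one described above (READDIR batches per directory, plus a LOOKUP
and GETATTR per entry), and the batch and directory sizes are assumed
illustrative defaults.]

```python
import math

def traverse_requests(n_dirs, n_files, avg_entries_per_dir=20,
                      entries_per_readdir=32):
    """Rough count of NFS RPCs issued by a recursive traverse."""
    # At least one READDIR per directory, more if it holds many entries.
    readdirs = n_dirs * max(1, math.ceil(avg_entries_per_dir /
                                         entries_per_readdir))
    # One LOOKUP plus one GETATTR for every file and subdirectory.
    lookups_getattrs = 2 * (n_dirs + n_files)
    return readdirs + lookups_getattrs

# A modest tree of 500 directories and 10,000 files already needs
# on the order of 20,000 RPCs; retransmissions multiply this further.
print(traverse_requests(500, 10_000))
```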
Increasing the number of nfsds on the server may help in this
situation, but I doubt it. [It's already working the disk as hard as
it can, so another nfsd process to enqueue requests to the server's
disk driver isn't going to help much.] A better solution would be to
experiment with increased values for the timeout and retransmission
NFS mount parameters ON THE CLIENTS. This will make them behave less
aggressively when the server is having a hard time.
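[Ed.: on most NFS clients the parameters Jim means are the timeo and
retrans mount options. A sketch with illustrative values, not tuned
recommendations; timeo is in tenths of a second on typical
implementations, and option syntax varies by system.]

```shell
# Mount with a longer timeout between retries (timeo=30, i.e. 3
# seconds instead of the usual 0.7s default) so the client waits
# longer before retransmitting to a busy server.
mount -o timeo=30,retrans=5 server:/export/src /usr/src

# Equivalent /etc/fstab entry:
# server:/export/src  /usr/src  nfs  rw,timeo=30,retrans=5  0  0
```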
Jim
brent@terra.Eng.Sun.COM (Brent Callaghan) (05/29/91)
In article <JIM.91May28110939@baird.cs.strath.ac.uk>, jim@cs.strath.ac.uk (Jim Reid) writes:
> Any recursive directory traverse via NFS can be painful (du is just as
> bad as find). This is because the client makes LOTS of NFS requests -
> several read directory entries to get the file names and the file
> handles followed by a get file attributes request for each file. If the
> client is faster at sending these out than the server is at replying,
> this is bad news. The server will be bombarded with NFS requests which
> it can't service quickly enough. The requests time out, so the client
> sends them all over again, saturating the server once more and closing
> the loop.

No, this isn't right. The biod's on an NFS client can generate multiple
concurrent requests - but only for I/O operations: read and write. A
process doing a find generates lots of READDIR's and LOOKUP's. These are
executed in the context of the process generating them. For each request,
the process blocks until it gets a reply from the server. If we're
assuming that there's just one "find" process running on the client, then
there's no way that the client can "bombard" the server with requests -
it's all synchronous.

Whether you do it over NFS or locally, find is a very disk-intensive
activity.

--
Made in New Zealand -->  Brent Callaghan  @ Sun Microsystems
Email: brent@Eng.Sun.COM  phone: (415) 336 1051