speck@vlsi.caltech.edu (02/10/86)
<flame>

About a month ago I discovered that 80% of all disk I/O done on our Suns was the single, simple line (in rwhod.c):

    whod = creat(path, 0666);

where path = "/usr/spool/rwho/rwhod.%s" (%s = hostname).

How could this innocent-looking line be such a hog?

1) Each machine executed it 18 times per minute (we have 18 rwhod's running on one net)
2) All those directories had to be looked up each time
3) On Suns, /usr/spool is a symlink to /private/usr/spool, adding another 3 directories to be looked up
4) On Suns, /usr and /usr/spool sit on a Network FileSystem. Sun's NFS has no caching in the clients; each lookup requires a server transaction over the network
5) 14 Suns used the /usr network filesystem

As you can see, namei() was getting a LOT of overuse - on filenames that were very repetitious. I can't help wondering if rwhod wasn't the major contributor to the statistics that led Berkeley to implement namei() caching for 4.3bsd. Did they check the contents of the cache for suspicious correlations?

The fix for this (chdir to the rwho directory) is a lot simpler and more efficient than namei caching - all of the directory traversals, symlink lookups, and NFS activity simply GO AWAY.

But there's more:

6) Once it's got the inode, creat() takes 30ms, mostly I/O time, just to truncate the file - taking four times as long as an open() - and most of the time the file is going to be written to the same size as it was before.

Why is creat(), probably one of the top 10 system calls, so slow on 4.2bsd systems? Why is ftruncate just as slow - and why does it still take 30ms even if the file is already the correct size? Apparently these system calls do *synchronous* I/O, ignoring the buffer cache (even on plain VAX 4.2bsd, without any NFS clouding the issue).

Has Berkeley accomplished nothing in their meddling with the filesystem?

Don Speck
seismo!cit-vax!speck or speck@vlsi.caltech.edu
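The chdir fix mentioned in the post above might look roughly like the following. This is only a sketch, not the actual rwhod.c change: the helper names enter_spool_dir() and open_whod_file() are made up for illustration, and the 256-byte path buffer is an assumption. The idea is to pay for the full path walk (including the symlink and NFS lookups) once at startup, after which each creat() resolves a single name in the current directory.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: change into the spool directory once at startup,
 * e.g. enter_spool_dir("/usr/spool/rwho").
 * Returns 0 on success, -1 on failure (errno set by chdir). */
static int enter_spool_dir(const char *dir)
{
    return chdir(dir);
}

/* Hypothetical helper: create (or truncate) the status file for one host
 * using a path relative to the current directory, so only one directory
 * entry has to be looked up per call.
 * Returns a file descriptor, or -1 on error. */
static int open_whod_file(const char *hostname)
{
    char path[256];   /* assumed large enough; overlong names are rejected */
    int n = snprintf(path, sizeof path, "rwhod.%s", hostname);

    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return creat(path, 0666);
}
```

Note that this only removes the repeated path lookups (points 2 through 5); the cost of the truncation inside creat() itself (point 6) is unchanged.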
wesommer@mit-eddie.UUCP (Bill Sommerfeld) (02/14/86)
The rwho daemon is well known for creating n^2 scaling problems on large nets. It uses up a lot of CPU keeping effectively identical copies of the files in a directory on all systems on the local net.

In article <789@brl-smoke.ARPA> speck@vlsi.caltech.edu (Don Speck) writes:
><flame>
> About a month ago I discovered that 80% of all disk I/O
>done on our Suns was the single, simple line (in rwhod.c):
>
> whod = creat(path, 0666);
>
>where path = "/usr/spool/rwho/rwhod.%s" (%s = hostname).
>
> How could this innocent-looking line be such a hog?
>
>1) Each machine executed it 18 times per minute (we have
> 18 rwhod's running on one net)

There is a simple solution to this, which requires a small fix to rwhod. Modify it so that it accepts a -n option (for "no write to disk"). You then modify the loop such that it doesn't do that creat() and write() when the -n option is set.

>2) All those directories had to be looked up each time
>3) On Suns, /usr/spool is a symlink to /private/usr/spool,
> adding another 3 directories to be looked up
>4) On Suns, /usr and /usr/spool sit on a Network FileSystem.
> Sun's NFS has no caching in the clients; each lookup
> requires a server transaction over the network
>5) 14 Suns used the /usr network filesystem
>

You can then set things up so that /usr/spool/rwho on all machines points to the /usr/spool/rwho of the server, and modify all but the server to run rwhod -n in /etc/rc. The only time that remote I/O is needed is when someone does an rwho or ruptime to find out what's going on.

>Why is creat(), probably one of the top 10 system calls, so
>slow on 4.2bsd systems? Why is ftruncate just as slow - and
>still takes 30ms even if the file is already the correct size?
>Apparently these system calls do *synchronous* I/O, ignoring
>the buffer cache (even on plain VAX 4.2bsd, without any NFS
>clouding the issue).
>

They do synchronous I/O so that the filesystem is not corrupted in uncontrolled ways when a system crashes. This simplifies fsck's job.

Bill Sommerfeld
MIT Project Athena
wesommer@athena.mit.edu
mit-eddie!mit-athena!wesommer
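The proposed -n option could be sketched along these lines. This is an illustration only and does not reflect the real rwhod.c source: no_write, parse_args() and store_packet() are hypothetical names, and the real daemon's receive loop is reduced here to a single "store one packet" step.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: nonzero when the daemon was started as "rwhod -n". */
static int no_write = 0;

/* Scan the command line for -n; other arguments are ignored in this sketch. */
static void parse_args(int argc, char **argv)
{
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], "-n") == 0)
            no_write = 1;
}

/* Hypothetical stand-in for the step in the main loop that saves a received
 * status packet. With -n set, the creat() and write() are skipped entirely.
 * Returns 1 if the packet was written, 0 if skipped, -1 on error. */
static int store_packet(const char *path, const void *buf, size_t len)
{
    if (no_write)
        return 0;               /* -n: leave the disk alone */

    int fd = creat(path, 0666);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    close(fd);
    return n == (ssize_t)len ? 1 : -1;
}
```

Under the arrangement described above, only the server would run without -n, so it is the only machine whose store_packet() ever touches /usr/spool/rwho.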