[net.unix-wizards] rwhod, creat slowness

speck@vlsi.caltech.edu (02/10/86)

<flame>
    About a month ago I discovered that 80% of all disk I/O
done on our Suns was the single, simple line (in rwhod.c):

	whod = creat(path, 0666);

where path = "/usr/spool/rwho/rwhod.%s" (%s = hostname).

    How could this innocent-looking line be such a hog?

1)  Each machine executed it 18 times per minute (we have
    18 rwhod's running on one net)
2)  All those directories had to be looked up each time
3)  On Suns, /usr/spool is a symlink to /private/usr/spool,
    adding another 3 directories to be looked up
4)  On Suns, /usr and /usr/spool sit on a Network FileSystem.
    Sun's NFS has no caching in the clients; each lookup
    requires a server transaction over the network
5)  14 Suns used the /usr network filesystem

As you can see, namei() was getting a LOT of overuse - on
filenames that were very repetitious.  I can't help wondering
if rwhod wasn't the major contributor to the statistics that
led Berkeley to implement namei() caching for 4.3bsd.  Did they
check the contents of the cache for suspicious correlations?
The fix for this (chdir to the rwho directory) is a lot simpler
and more efficient than namei caching - all of the directory
traversals, symlink lookups, and NFS activity simply GO AWAY.
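
The whole fix is on the order of this (just a sketch, not the actual
rwhod diff; the hostname and error handling are stand-ins):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>

    int
    main()
    {
        char path[64];
        int whod;

        /* once, at startup: park the process in the spool directory */
        if (chdir("/usr/spool/rwho") < 0) {
            perror("/usr/spool/rwho");
            exit(1);
        }

        /*
         * per broadcast received: namei() now sees a single component
         * instead of the whole /usr/spool/rwho chain (and, on the
         * Suns, the /private symlink) on every creat
         */
        /* "cit-vax" stands in for the broadcasting host's name */
        (void) sprintf(path, "rwhod.%s", "cit-vax");
        if ((whod = creat(path, 0666)) < 0)
            perror(path);
        exit(0);
    }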

But there's more:
6)  Once it's got the inode, creat() takes 30ms, mostly I/O time,
    just to truncate the file - taking four times as long as an
    open() - and most of the time the file is going to be written
    to the same size as it was before.
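
Since the file usually comes out the same size anyway, the obvious
dodge (an untested sketch, not anything rwhod does today) is to skip
the truncation: open without O_TRUNC, write over the old contents, and
only truncate when the new data really did come up shorter:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Rewrite "path" with "len" bytes from "buf", paying for a
     * truncate only when the old file was longer than the new data.
     */
    int
    rewrite(char *path, char *buf, int len)
    {
        struct stat st;
        int fd;

        /* note: no O_TRUNC */
        if ((fd = open(path, O_WRONLY|O_CREAT, 0666)) < 0)
            return (-1);
        if (write(fd, buf, len) != len || fstat(fd, &st) < 0) {
            (void) close(fd);
            return (-1);
        }
        if (st.st_size > (off_t)len && ftruncate(fd, (off_t)len) < 0) {
            (void) close(fd);
            return (-1);
        }
        return (close(fd));
    }

    int
    main()
    {
        char msg[] = "same size as last time\n";

        return (rewrite("junkfile", msg, (int)sizeof(msg) - 1) < 0);
    }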

Why is creat(), probably one of the top 10 system calls, so
slow on 4.2bsd systems?  Why is ftruncate just as slow - still
taking 30ms even if the file is already the correct size?
Apparently these system calls do *synchronous* I/O, ignoring
the buffer cache (even on plain VAX 4.2bsd, without any NFS
clouding the issue).
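
A quick test along these lines (an untested sketch; the file name and
trial count are arbitrary) ought to show the creat-vs-open gap on any
4.2bsd box:

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    #define NTRIES 100

    int
    main()
    {
        struct timeval t0, t1;
        long us;
        int i, fd;

        (void) close(creat("junkfile", 0666));  /* make sure it exists */

        /* time NTRIES creat()s of an existing (empty) file */
        gettimeofday(&t0, (struct timezone *)0);
        for (i = 0; i < NTRIES; i++)
            (void) close(creat("junkfile", 0666));
        gettimeofday(&t1, (struct timezone *)0);
        us = (t1.tv_sec - t0.tv_sec) * 1000000L
            + (t1.tv_usec - t0.tv_usec);
        printf("creat: %ld us each\n", us / NTRIES);

        /* time NTRIES plain open()s of the same file */
        gettimeofday(&t0, (struct timezone *)0);
        for (i = 0; i < NTRIES; i++)
            if ((fd = open("junkfile", O_WRONLY)) >= 0)
                (void) close(fd);
        gettimeofday(&t1, (struct timezone *)0);
        us = (t1.tv_sec - t0.tv_sec) * 1000000L
            + (t1.tv_usec - t0.tv_usec);
        printf("open:  %ld us each\n", us / NTRIES);
        exit(0);
    }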

Has Berkeley accomplished nothing in their meddling with the
filesystem?

Don Speck	seismo!cit-vax!speck  or  speck@vlsi.caltech.edu

wesommer@mit-eddie.UUCP (Bill Sommerfeld) (02/14/86)

The rwho daemon is well known for creating n^2 scaling problems on large
nets: each of the n hosts writes a spool file for every broadcast it
hears from the others.  It uses up a lot of CPU and disk I/O keeping
effectively identical copies of the rwhod.* files on every system on the
local net.

In article <789@brl-smoke.ARPA> speck@vlsi.caltech.edu (Don Speck) writes:
><flame>
>    About a month ago I discovered that 80% of all disk I/O
>done on our Suns was the single, simple line (in rwhod.c):
>
>	whod = creat(path, 0666);
>
>where path = "/usr/spool/rwho/rwhod.%s" (%s = hostname).
>
>    How could this innocent-looking line be such a hog?
>
>1)  Each machine executed it 18 times per minute (we have
>    18 rwhod's running on one net)

There is a simple solution to this, which requires only a small fix to
rwhod: teach it a -n option (for "no write to disk"), and skip the
creat() and write() in its receive loop whenever -n is set.
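
Boiled down to a toy (invented names, not a diff against the real
rwhod.c), the change looks something like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Toy version of the -n idea: parse -n, and skip the
     * creat()/write() when it is set.
     */
    int
    main(int argc, char **argv)
    {
        int nowrite = 0;        /* -n: don't write spool files */
        char buf[] = "fake whod packet";
        char path[64];
        int whod;

        if (argc > 1 && strcmp(argv[1], "-n") == 0)
            nowrite = 1;

        /* stand-in for the body of rwhod's receive loop */
        if (!nowrite) {
            (void) sprintf(path, "rwhod.%s", "cit-vax");
            if ((whod = creat(path, 0666)) >= 0) {
                (void) write(whod, buf, sizeof(buf) - 1);
                (void) close(whod);
            }
        }
        exit(0);
    }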

>2)  All those directories had to be looked up each time
>3)  On Suns, /usr/spool is a symlink to /private/usr/spool,
>    adding another 3 directories to be looked up
>4)  On Suns, /usr and /usr/spool sit on a Network FileSystem.
>    Sun's NFS has no caching in the clients; each lookup
>    requires a server transaction over the network
>5)  14 Suns used the /usr network filesystem
>

You can then set things up so that /usr/spool/rwho on all machines
points to the /usr/spool/rwho of the server, and modify all but the
server to run rwhod -n in /etc/rc.  The only time that remote I/O is
needed is when someone does an rwho or ruptime to find out what's going
on.  

>Why is creat(), probably one of the top 10 system calls, so
>slow on 4.2bsd systems?  Why is ftruncate just as slow - still
>taking 30ms even if the file is already the correct size?
>Apparently these system calls do *synchronous* I/O, ignoring
>the buffer cache (even on plain VAX 4.2bsd, without any NFS
>clouding the issue).
>
They do synchronous I/O so that metadata updates reach the disk in a
known order; that way a crash can't corrupt the filesystem in
uncontrolled ways, and fsck's repair job stays simple.


				Bill Sommerfeld
				MIT Project Athena

				wesommer@athena.mit.edu
				mit-eddie!mit-athena!wesommer