[comp.arch] disk queues of length zero..... or scaling up hurts

mash@mips.com (John Mashey) (03/06/91)

In article <1991Mar6.003008.9131@bellcore.bellcore.com> mo@bellcore.com (Michael O'Dell) writes:
>I know *I* have seen servers with much longer disk queues.

>For example -

>	Assume you memory map and create large file on a machine with lots
>	of free memory.  Say you write 50 megabytes.  You now close
>	the file and hence ask for it to really go to disk.
>	WHAM! 50 megabytes goes on the disk queue.  Yes this does happen,
>	and boy, is the poor dweeb at some other terminal who just
>	typed "ls" on the same filesystem really screwed.

>There are many more anomalies out there when the machine and the memory
>get sufficiently fast and large....

Actually, mo is understating the case ... it can get even worse...
Suppose you permit the disk cache to occupy up to, or close to, 100%
of memory outside the kernel.
Then, all of a sudden, not only is there a giant disk queue for
a specific disk [which makes ls not only on the filesystem, but on the
disk, not so good], BUT you have a giant bunch of dirty pages in memory,
and if you're not careful, you may have thrown away clean pages of
read-only code to get there.

Now, I'm ANYBODY else on the system, and I type: glurp,
and discover the kernel has to get enough pages written out,
to get enough pages to page glurp in, and if glurp is big,
it executes a little while, then page faults, waits a long time,
then page faults, because every page fault needs to grab a
dirty page, and to get a dirty page, you need to get it written out, etc.
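
To put rough numbers on that, here is a toy model of the wait, not a
measurement of any real system.  The function name and every figure in it
(fault count, 20 ms per disk transfer, 1000 requests already queued) are
assumptions picked only to show the shape of the problem.

```python
# Toy model: cost of paging in a program when every free page frame must
# first be reclaimed by writing a dirty page to disk.
# All numbers below are illustrative assumptions, not measurements.

def page_in_time(faults, read_ms, write_ms, queue_depth=0):
    """Total wait in milliseconds for `faults` page faults.

    Each fault pays for one dirty-page write (to free a frame) plus one
    read (to bring in the wanted page).  If `queue_depth` writes are
    already queued on the disk, the first fault waits behind them too.
    """
    backlog = queue_depth * write_ms
    return backlog + faults * (write_ms + read_ms)

# Clean pages are available: each fault costs only the read.
clean = page_in_time(faults=100, read_ms=20, write_ms=0)

# Only dirty pages are available: each fault costs a write and a read.
dirty = page_in_time(faults=100, read_ms=20, write_ms=20)

# Same, but the disk already has 1000 writes queued ahead of us.
behind_queue = page_in_time(faults=100, read_ms=20, write_ms=20,
                            queue_depth=1000)

print(clean, dirty, behind_queue)  # prints: 2000 4000 24000
```

Under these assumed figures the dirty-page case doubles the paging time,
and the queue already sitting on the disk dominates everything else.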

All of this is just exacerbated by fast CPUs with big memories,
especially since the typical time-sharing quantum has remained
around 1/60th to 1/100th of a second, the same as when we had 1-mips
machines; now we can dirty 50X more pages per quantum....
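
The 50X figure is just the ratio of CPU speeds at a fixed quantum, which
a short calculation makes explicit.  This is a sketch: the helper name
and the cost of 4096 instructions to dirty one page are assumptions, and
the ratio comes out the same whatever per-page cost you pick.

```python
from fractions import Fraction

def pages_dirtied_per_quantum(mips, quanta_per_second, instr_per_page):
    """Upper bound on pages one process can dirty in a single quantum.

    `instr_per_page` is an assumed cost, in instructions, of touching
    every word of one page; it is not a figure from any real machine.
    """
    instructions = Fraction(mips * 1_000_000, quanta_per_second)
    return instructions / instr_per_page

INSTR_PER_PAGE = 4096  # assumption: about one instruction per byte of a 4 KB page

old = pages_dirtied_per_quantum(mips=1, quanta_per_second=60,
                                instr_per_page=INSTR_PER_PAGE)
new = pages_dirtied_per_quantum(mips=50, quanta_per_second=60,
                                instr_per_page=INSTR_PER_PAGE)

print(float(old), float(new))
print(new / old)  # prints: 50
```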

Our folks had to do a bunch of work in RISC/os to stop this kind of
thing from killing multi-user response time, such as letting the
% of memory allocated as disk cache go up and down, but only up to
a parameter normally set less than 100%.  Of course, the disk cache
doesn't need to have been dirtied by just one program; a bunch of them
not going so fast can do it also.
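
The idea of a cache that floats up and down under a ceiling can be
sketched as follows.  This is NOT the RISC/os code; the class, its
method names, and the simplification that the cache holds only dirty
pages are all assumptions made for illustration.

```python
class CappedDiskCache:
    """Toy disk cache whose size may grow, but never past a fixed cap.

    Hypothetical model: it tracks only dirty pages and ignores clean
    cached pages, disk timing, and which process owns what.
    """

    def __init__(self, total_pages, max_cache_fraction):
        self.total_pages = total_pages
        # The tunable ceiling, normally set below 100% of memory.
        self.cap = int(total_pages * max_cache_fraction)
        self.dirty = 0
        self.forced_writes = 0

    def write_page(self):
        """A process dirties one more cache page."""
        if self.dirty >= self.cap:
            # At the cap: the writer must push one page out to disk
            # before it is allowed to dirty another.
            self.dirty -= 1
            self.forced_writes += 1
        self.dirty += 1

    def pages_left_for_others(self):
        """Memory that dirty cache pages have not swallowed."""
        return self.total_pages - self.dirty


uncapped = CappedDiskCache(total_pages=1000, max_cache_fraction=1.0)
capped = CappedDiskCache(total_pages=1000, max_cache_fraction=0.5)
for _ in range(800):
    uncapped.write_page()
    capped.write_page()

print(uncapped.pages_left_for_others(), uncapped.forced_writes)  # prints: 200 0
print(capped.pages_left_for_others(), capped.forced_writes)      # prints: 500 300
```

With the cap, the heavy writer pays for its own writes as it goes, and
half of memory stays available to everyone else no matter how much it
writes.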

As one more example, our folks found some bug in the BSD file system
that's been there forever, and was never noticed until we got 50-mips
machines.  We always find that the fastest machine finds some new race
condition in otherwise solid code that's been running a long time.  sigh.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086