[comp.unix.questions] the 10% factor

MATHRICH@umcvmb.missouri.edu (Rich Winkel UMC Math Department) (11/14/89)

I understand that allowing bsd filesystems to exceed 90% of their
capacity results in a significant reduction in performance.  This makes
a certain amount of sense to me, at least when files are being created, deleted
or appended to.  However, I've been told this is also true of static
filesystems.  For instance, if I have a 20MB partition set aside for
an nfs client's swap activity, I can't utilize the entire 20MB for the
swap file; I'm supposed to leave 10% unused to avoid a performance
impact, even though the swap file's size is unchanging.  Is this true?
Could someone explain this to me?

Thanks,
Rich

chris@mimsy.umd.edu (Chris Torek) (11/14/89)

In article <21436@adm.BRL.MIL> MATHRICH@umcvmb.missouri.edu (Rich Winkel
UMC Math Department) writes:
>I understand that allowing bsd filesystems to exceed 90% of their
>capacity results in a significant reduction in performance.

The reserved space has two functions:

	a) it speeds up block allocation: cylinder groups with no
	   free blocks in desired rotational positions are rare.

	b) it speeds up access: since (a) is true, the chance that
	   the blocks of any single file are scattered randomly
	   about the disk is low.

>... I've been told this is also true of static filesystems.

Here only part (b) matters.  If files on the file system are only
allocated once, ever, and the file system itself remains static
thereafter, there are no allocations in (a) to worry about.  The
question is then whether (b) is worth concern.  The short answer is
`probably not.'

>... if I have a 20MB partition set aside for an nfs client's swap activity,

(I can only guess that you mean `swap files for SunOS ``swap on a file''
style paging/swapping' here.  Swap partitions, in the original Unix
sense, are not file systems.)

If all the files in that partition are allocated statically---never
grow---then there is no reason to keep a 10% reserve.
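
(If you want to reclaim the reserved space on such a file system,
tunefs(8) can turn the reserve down after the fact.  The device and
mount point names below are only examples, and the change affects only
future allocations:

	umount /export/swap
	tunefs -m 0 /dev/rxy0g		# raw device of the file system
	mount /dev/xy0g /export/swap

Files that already exist keep whatever layout they have; lowering
minfree just lets ordinary users consume the last 10%.)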
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

hitz@auspex.auspex.com (Dave Hitz) (11/18/89)

In article <21436@adm.BRL.MIL> MATHRICH@umcvmb.missouri.edu (Rich Winkel UMC Math Department) writes:
> I understand that allowing bsd filesystems to exceed 90% of their
> capacity results in a significant reduction in performance.  This makes
> a certain amount of sense to me, at least when files are being created,
> deleted or appended to.  However, I've been told this is also true of
> static filesystems.  For instance, if I have a 20MB partition set aside
> for an nfs client's swap activity, I can't utilize the entire 20MB for
> the swap file; I'm supposed to leave 10% unused to avoid a performance
> impact, even though the swap file's size is unchanging.  Is this true?
> Could someone explain this to me?

Here's a shot.

The problem is that as a file grows on a mostly full partition, the
block immediately following the last one in the file is generally not
available.  Even on an almost empty filesystem this is always a possibility,
but statistically it turns out that only as the filesystem passes 90%
full does this problem become nasty.

Once blocks for a file are allocated in a bad order, it doesn't matter
if the filesystem is static or not.  If you read that file
sequentially, the disk will have to seek all over the place to get the
blocks.  So even for static filesystems, some limit probably still
makes sense.  (You might try creating the files you know will rarely be
used last, and cranking utilization up to 100%.)
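
Something along these lines, say (the file names here are made up, and
this is just a sketch of the idea):

	# Fill the static file system in order of expected read traffic.
	# Frequently-read files go in first, while contiguous free
	# blocks are still plentiful ...
	cp busy.db /staticfs
	# ... and rarely-read files go in last, so they are the ones
	# stuck with the badly-placed blocks.
	cp archive1.tar archive2.tar /staticfs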

Now the question about swap is interesting.  When you create a swap
file using mkfile(8), it allocates disk blocks when the file is
created.  If you create a 20 MB swap file in an otherwise empty 20 MB
partition, the block allocation would probably be quite good because
there would be no conflict with smaller chunks of space being
allocated in other files.  If you want more than one swap file per partition,
that would probably work also, but make sure *not* to run the mkfile(8)s in
parallel.  Do them one after the other.  (And don't use the -n option, which
would defeat the whole point of getting the blocks nicely allocated at
create time.)
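
For example (sizes and paths are just illustrative):

	# Build the swap files one at a time so each gets a clean,
	# mostly contiguous run of blocks.
	mkfile 20m /export/swap/client1
	# Start the second only after the first has finished.
	mkfile 20m /export/swap/client2
	# (mkfile -n would set the size without allocating the blocks
	# now, which is exactly what we don't want here.)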

-- 
Dave Hitz					home: 408-739-7116
UUCP: {uunet,mips,sun,bridge2}!auspex!hitz 	work: 408-492-0900

emv@math.lsa.umich.edu (Edward Vielmetti) (11/18/89)

Dave Hitz's article refers to creating swap files with mkfile
(this is on a Sun, at least) and recommends not using the -n
option, since that will lead to fragmented swap files.

That's fine, but -- space is a bit tight here, & various people
use their machines differently.  I'd really like to give most
people on diskless 4MB 3/50s a 16MB swap file that has, say,
8MB pre-allocated & filled w/whatever it takes to swap nicely,
and leave the other 8MB to grow dynamically just in case
someone manages to need it.

I don't see a way to do it with the tools provided -- does
this approach make sense, & what would it take to do it?

thanks

--Ed

BACON@MTUS5.BITNET (Jeffery Bacon) (11/19/89)

In article <2643@auspex.auspex.com>, hitz@auspex.auspex.com (Dave Hitz) says:
>
>The problem is that as a file grows on a mostly full partition, the
>block immediately following the last one in the file is generally not
>available.  Even on an almost empty filesystem this is always a possibility,
>but statistically it turns out that only as the filesystem passes 90%
>full does this problem become nasty.
>

    How does one learn this stuff? I RTF(Sun)Ms, but it didn't help that much.
Or was I looking in the wrong place?

>
>Now the question about swap is interesting.  When you create a swap
>file using mkfile(8), it allocates disk blocks when the file is
>created.  If you create a 20 MB swap file in an otherwise empty 20 MB
>partition, the block allocation would probably be quite good because
>there would be no conflict with smaller chunks of space being
>allocated in other files.  If you want more than one swap file per partition,
>that would probably work also, but make sure *not* to run the mkfile(8)s in
>parallel.  Do them one after the other.  (And don't use the -n option, which
>would defeat the whole point of getting the blocks nicely allocated at
>create time.)
>

     Well, I hate to sound too stupid, but I did just this on a couple of
swap partitions I made a couple of months ago. (That is, I ran the
mkfile(8)s in parallel. This is on a Sun 3/260, SunOS 4.0.3, two 280MB
Fujitsus, three ~36MB swap files on two 117MB partitions. The partitions
are packed full, with about 500K free each.) Should I remake the swap files?
     At the same time, it occurred to me to cut the number of inodes and
the minfree value. After all, there are only 3 files...(plus the . and ..
directory entries)...I don't see anything wrong with having done that, am I
correct?

-------
Jeffery Bacon
Computing Technology Svcs., Michigan Technological University
bitnet: bacon@mtus5  uucp (alternate): <world>!itivax!anet!bacos

madd@world.std.com (jim frost) (11/21/89)

BACON@MTUS5.BITNET (Jeffery Bacon) writes:
>    How does one learn this stuff? I RTF(Sun)Ms, but it didn't help that much.
>Or was I looking in the wrong place?

Get a copy of "The Design and Implementation of the 4.3BSD UNIX
Operating System", available at any good bookstore near you.  It
explains how the filesystem was implemented and why the 10% reserve
is recommended.  There's also an article in the USENIX Computing
Systems Journal (Volume 2, Number 3, Summer 1989), "Heuristics for
Disk Drive Positioning in 4.3BSD", which describes how to tune the
BSD hp disk driver (and some implementation issues) and which you
might find interesting.

The BSD book and its System V counterpart by Bach are very
enlightening if you really want to know the how and why.  Both of them
require a bit of operating-systems background before you delve into
them, and both are a lot better if you have source to poke at.

Several of the major operating systems textbooks devote substantial
space to UNIX, both System V and BSD, which might be a better place to
start since they're more general.  I've lent almost all of mine out, so
I can't make particular recommendations right now.

jim frost
software tool & die
madd@std.com

hitz@auspex.auspex.com (Dave Hitz) (11/30/89)

In article <89323.004057BACON@MTUS5.BITNET> BACON@MTUS5.BITNET (Jeffery Bacon) writes:
>    How does one learn this stuff?

RTFC :-)

>      At the same time, it occurred to me to cut the number of inodes and
> the minfree value. After all, there are only 3 files...(plus the
> . and .. directory entries)...I don't see anything wrong with having done
> that, am I correct?
...
>     Well, I hate to sound too stupid, but I did just this on a couple of
> swap partitions I made a couple of months ago. (That is, I ran the
> mkfile(8)s in parallel. This is on a Sun 3/260, SunOS 4.0.3, two 280MB
> Fujitsus, three ~36MB swap files on two 117MB partitions. The partitions
> are packed full, with about 500K free each.) Should I remake the swap files?

Easy one first:  There is *definitely* nothing wrong with pulling down
the number of inodes.

Hard one: Setting minfree down to 0 doesn't break anything, but *may*
cause performance problems.  To determine whether it's *actually*
causing performance problems would require an experiment or a
simulation of your exact setup.

You can make sure that minfree=0 does not hurt performance much by
starting with a completely empty partition and building swap files one
at a time using mkfile(8) *without* the -n option.
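
In other words, roughly (device names, sizes, and the bytes-per-inode
figure are all made up for the example):

	# rebuild the file system with fewer inodes and no reserve
	newfs -i 8192 -m 0 /dev/rxy0g
	mount /dev/xy0g /swapfs
	# then lay the swap files down one at a time, without -n
	mkfile 36m /swapfs/swap1
	mkfile 36m /swapfs/swap2
	mkfile 36m /swapfs/swap3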

Is it worth doing this?  Figuring that out is probably harder than just
doing it.  (Why don't you run a swap-death benchmark before and after
and tell us what happens?)

My *guess* would be that unless you swap heavily, you won't notice
any difference.  If you do swap heavily, random factors will dominate.
If you swap to a part of the file that was created early, before the
partition started getting full, there is probably no problem.  If you
swap to the end of the file, you may see a slowdown.

I'd like to emphasize that these guesses about how disk
fragmentation is likely to affect performance are in fact guesses.  I
know how to make the problem (if there is one) go away, but I don't
know whether there is one.

-- 
Dave Hitz					home: 408-739-7116
UUCP: {uunet,mips,sun,bridge2}!auspex!hitz 	work: 408-492-0900