[comp.unix.admin] tunefs

emv@math.lsa.umich.edu (Edward Vielmetti) (09/01/90)

	   (single user mode)
	   # mount /usr
	   # cp /usr/etc/tunefs /tunefs
	   # umount -a
	   # /tunefs -m 2 /dev/rz1g
	   # /tunefs -m 2 /dev/rz1h
		   ...
	   # mount -a -t ufs

   This changes the minimum amount of free space from the default (10
   percent) to a more adequate [in a disk-space-starved environment, like
   ours :-(] 2% (that's the number 2 in the /tunefs commands).

The man page for tunefs(8) on SunOS 4.0.3 says

        -m minfree
          This value specifies the percentage of space held  back
          from  normal  users;  the minimum free space threshold.
          The default value used is 10%.  This value can  be  set
          to  zero, however up to a factor of three in throughput
          will be lost over the performance  obtained  at  a  10%
          threshold.

What does the space-performance tradeoff curve look like?  If I take a
300M file system and reduce the free space allocation to 15M (5%),
will that have the same impact as trimming a 100M file system's 10M
free space to 5M ?  For super performance can I go to 15% free in some
places?
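
(For concreteness, the space side of that tradeoff is just arithmetic; here
is a quick throwaway sketch, using the sizes above, of how much each minfree
setting holds back.  It says nothing about the performance side, of course.)

/*
 * Illustrative arithmetic only: how many megabytes each minfree setting
 * reserves on the file system sizes mentioned above.
 */
#include <stdio.h>

int main(void)
{
	long fs_mb[] = { 100, 300 };		/* file system sizes, in MB */
	int minfree[] = { 10, 5, 2 };		/* candidate tunefs -m values */
	int i, j;

	for (i = 0; i < 2; i++)
		for (j = 0; j < 3; j++)
			printf("%ldMB fs, minfree %2d%%: %5.1fMB held back (%4.1fMB reclaimed vs. the 10%% default)\n",
			    fs_mb[i], minfree[j],
			    fs_mb[i] * minfree[j] / 100.0,
			    fs_mb[i] * (10 - minfree[j]) / 100.0);
	return 0;
}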

Ideally I want to get enough disk space back so that people will notice
that there's more, without having them notice that it's any slower.

followups to comp.unix.admin,

--Ed

Edward Vielmetti, U of Michigan math dept <emv@math.lsa.umich.edu>

grunwald@foobar.colorado.edu (Dirk Grunwald) (09/01/90)

I've wondered the same thing in the case of /usr; if I'm not going to be adding
or deleting files often, does the threshold really matter?

lm@snafu.Sun.COM (Larry McVoy) (09/01/90)

In article <EMV.90Aug31155848@stag.math.lsa.umich.edu> emv@math.lsa.umich.edu (Edward Vielmetti) writes:
>   This changes the minimum amount of free space from the default (10
>   percent) to a more adequate [in a disk-space-starved environment, like
>   ours :-(] 2% (that's the number 2 in the /tunefs commands).

The fast file system (you have it if you have tunefs) has a multi-algorithm
allocator.  The file system tries to allocate things nicely.  For
example, if you are running with a rotational delay of 4ms (most people are)
then the allocator will try to allocate the next block on the same track,
4ms down the line.  Failing that, it will try anywhere on the same
track, then anywhere in the same cylinder group, and finally some
sort of brute force search (I may have this a bit wrong; go read the
paper to see what really happens).

Anyway, the scoop is that as the file system fills up it gets less and
less likely that the allocator can put things where you want.  That's
where the 10% free comes in.  That 10% is evenly distributed across
the disk (in theory) which makes it more likely that the allocator
can put things in the right place.
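
Roughly, the cascade looks something like the toy below (not real kernel
code -- the genuine hashalloc() gets quoted later in this thread; the
geometry constants and the function name are made up for illustration).
The point is that the early, cheap preferences only succeed if a free
block happens to be sitting in the right spot, which is what the reserve
buys you:

/*
 * Toy model of the block-placement preferences described above: try the
 * rotationally ideal block, then the same track, then anywhere in the
 * cylinder group.  Purely illustrative; the geometry is invented.
 */
#include <stdio.h>

#define NBLK		64	/* blocks in our pretend cylinder group */
#define BLKS_PER_TRACK	16	/* made-up geometry */

static long
toy_alloc(char *freemap, long pref)
{
	long trackbase = pref - pref % BLKS_PER_TRACK;
	long b;

	if (freemap[pref])			/* 1: rotationally ideal block */
		return (pref);
	for (b = trackbase; b < trackbase + BLKS_PER_TRACK; b++)
		if (freemap[b])			/* 2: anywhere on the same track */
			return (b);
	for (b = 0; b < NBLK; b++)
		if (freemap[b])			/* 3: anywhere in the group */
			return (b);
	return (-1);				/* caller would rehash to another
						   cylinder group (see hashalloc) */
}

int main(void)
{
	char freemap[NBLK] = { 0 };
	long b;

	/* leave roughly 10% of the group free, scattered about */
	freemap[7] = freemap[20] = freemap[21] = freemap[33] =
	    freemap[50] = freemap[63] = 1;

	b = toy_alloc(freemap, 22);	/* preferred block is busy; fall back */
	printf("wanted block 22, got block %ld\n", b);
	return 0;
}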

So, should you do it?  If you have an absolutely static disk, mounted
read only all the time,  then crank that %free down to 0, fill up
the disk and sleep easy.  If you have an active disk (/tmp or user
directories) then you should leave it at 10% if you want reasonable 
performance.

Oh - another consideration: if all of your files are very small,
less than the file system block size (usually 8K), then most of
this doesn't matter and you can crank it down to 2% w/o too much
trouble.
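
(The fragment arithmetic for a small file, assuming the usual 8K block /
1K fragment sizes, is just this; a toy illustration, not real fs code:)

/*
 * Rough sketch: the tail of a small file is stored in fragments rather
 * than a whole block, so it never needs a rotationally-placed full block.
 */
#include <stdio.h>

#define BSIZE	8192
#define FSIZE	1024

int main(void)
{
	long filesize = 3000;	/* a 3000-byte file, for example */
	long fullblocks = filesize / BSIZE;
	long tail = filesize % BSIZE;
	long frags = (tail + FSIZE - 1) / FSIZE;

	printf("%ld-byte file: %ld full blocks + %ld fragments (%ld bytes on disk)\n",
	    filesize, fullblocks, frags, fullblocks * BSIZE + frags * FSIZE);
	return 0;
}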
---
Larry McVoy, Sun Microsystems     (415) 336-7627       ...!sun!lm or lm@sun.com

chris@mimsy.umd.edu (Chris Torek) (09/02/90)

In article <25513@boulder.Colorado.EDU> grunwald@foobar.colorado.edu
(Dirk Grunwald) writes:
>if I'm not going to be adding or deleting files often, does the [tunefs
>-m `minimum free space'] threshold really matter?

Free blocks (or lack thereof) matter only when a file needs a new block
or fragment.  UCB CSRG sometimes keeps archives of particular system
releases on line; these go in file systems with a 0% reserve, since they
are never modified after creation.

The code in question is in /sys/ufs/ufs_alloc.c (in recent BSDs).  Ideally
it never even uses the second approach; but in order to allocate the very
last block it will usually need the third.

/*
 * Implement the cylinder overflow algorithm.
 *
 * The policy implemented by this algorithm is:
 *   1) allocate the block in its requested cylinder group.
 *   2) quadratically rehash on the cylinder group number.
 *   3) brute force search for a free block.
 */
u_long
hashalloc(ip, cg, pref, size, allocator)
	struct inode *ip;
	int cg;
	long pref;
	int size;	/* size for data blocks, mode for inodes */
	u_long (*allocator)();
{
	register struct fs *fs;
	long result;
	int i, icg = cg;

	fs = ip->i_fs;
	/*
	 * 1: preferred cylinder group
	 */
	result = (*allocator)(ip, cg, pref, size);
	if (result)
		return (result);
	/*
	 * 2: quadratic rehash
	 */
	for (i = 1; i < fs->fs_ncg; i *= 2) {
		cg += i;
		if (cg >= fs->fs_ncg)
			cg -= fs->fs_ncg;
		result = (*allocator)(ip, cg, 0, size);
		if (result)
			return (result);
	}
	/*
	 * 3: brute force search
	 * Note that we start at i == 2, since 0 was checked initially,
	 * and 1 is always checked in the quadratic rehash.
	 */
	cg = (icg + 2) % fs->fs_ncg;
	for (i = 2; i < fs->fs_ncg; i++) {
		result = (*allocator)(ip, cg, 0, size);
		if (result)
			return (result);
		cg++;
		if (cg == fs->fs_ncg)
			cg = 0;
	}
	return (0);
}
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

gamin@ireq-robot.hydro.qc.ca (Martin Boyer) (09/05/90)

In article <141722@sun.Eng.Sun.COM> lm@sun.UUCP (Larry McVoy) writes:
>[...]
>If you have an absolutely static disk, mounted
>read only all the time,  then crank that %free down to 0, fill up
>the disk and sleep easy.  If you have an active disk (/tmp or user
>directories) then you should leave it at 10% if you want reasonable 
>performance.

And how about the /export/swap partition (or whatever is used for client
swap space in SunOS 4.x)?  Once the files are created (using mkfile for
each client), no new blocks are ever allocated for them.  It seems to
make sense to set the %free to zero for this partition.

I have done that and everything seems fine.  Am I wrong?

Martin
--
Martin Boyer                            mboyer@ireq-robot.hydro.qc.ca
Institut de recherche d'Hydro-Quebec    mboyer@ireq-robot.uucp
Varennes, QC, Canada   J3X 1S1
+1 514 652-8136

jgp@moscom.UUCP (Jim Prescott) (09/07/90)

In <EMV.90Aug31155848@stag.math.lsa.umich.edu> emv@math.lsa.umich.edu (Edward Vielmetti) writes:
>In some other article someone said:
>>This changes the minimum amount of free space from the default (10
>>percent) to a more adequate [in a disk-space-starved environment, like
>>ours :-(] 2% (that's the number 2 in the /tunefs commands).
>
>What does the space-performance tradeoff curve look like?
>Ideally I want to get enough disk space back so that people will notice
>that there's more, without having them notice that it's any slower.

Who doesn't ?-)

One thing to be aware of if you lower minfree is that you increase the
chances of running out of disk space while you still have space left
(i.e. getting a "No space left on device" error while df still shows
available disk space).

This happens because the FFS keeps track of two types of free space, full
blocks and fragments (often 8K blocks with 1K fragments).  When looking for
space the system will split a free block into fragments, but it won't
coalesce fragments back into a full block (naturally it will coalesce them
when freeing stuff).  If you run out of full blocks then the disk is full
for most purposes, even if df shows 20M free.
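
Concretely, with made-up numbers (the nbfree/nffree counts mentioned below
are what you'd look at on a real system):

/*
 * Invented numbers illustrating the situation described above: the free
 * space df reports includes both full blocks and fragments, but a write
 * that needs a whole 8K block can only use the full blocks.
 */
#include <stdio.h>

#define BSIZE	8192
#define FSIZE	1024

int main(void)
{
	long nbfree = 0;	/* free full blocks */
	long nffree = 20000;	/* free 1K fragments, scattered around */

	long df_free_kb = (nbfree * BSIZE + nffree * FSIZE) / 1024;

	printf("df shows roughly %ldK free\n", df_free_kb);
	printf("write needing a new full block: %s\n",
	    nbfree > 0 ? "fine" : "fails, disk is effectively full");
	return 0;
}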

This can occur even if you don't change minfree, but it would be extremely
unlikely.  The problem occurs when more than minfree% of your disk is
free fragments, which would require almost pathological fs activity at
10% but gets much more likely at 1 or 2%.

Changing the optimization to space will also make running out of full blocks
less likely.  SunOS 4 does this automatically once free space actually drops
below minfree, but by then it is usually too late (the fragmentation builds
up over time).

Note that I think it was under SunOS 3.5 that we actually observed this,
but I doubt anyone with a reasonably stock FFS does frag->block coalescing
on demand.  It would be really hard and slow.

On Suns you can use dumpfs to get info on how your free space is
distributed (nbfree and nffree).  (Use a pager with dumpfs, as it prints
out tons of stuff, the most useful of which is on the first page.)
-- 
Jim Prescott	jgp@moscom.com	{rutgers,ames}!rochester!ur-valhalla!moscom!jgp