MATHRICH@umcvmb.missouri.edu (Rich Winkel UMC Math Department) (11/14/89)
I understand that allowing bsd filesystems to exceed 90% of their capacity results in a significant reduction in performance. This makes a certain amount of sense to me, at least when files are being created, deleted or appended to. However, I've been told this is also true of static filesystems. For instance, if I have a 20MB partition set aside for an nfs client's swap activity, I can't utilize the entire 20MB for the swap file, I'm supposed to leave 10% unused to avoid a performance impact, even though the swap file's size is unchanging. Is this true? Could someone explain this to me?

Thanks,
Rich
chris@mimsy.umd.edu (Chris Torek) (11/14/89)
In article <21436@adm.BRL.MIL> MATHRICH@umcvmb.missouri.edu (Rich Winkel UMC Math Department) writes:
>I understand that allowing bsd filesystems to exceed 90% of their
>capacity results in a significant reduction in performance.

The reserved space has two functions:

a) it speeds up block allocation: cylinder groups with no free blocks in desired rotational positions are rare.

b) it speeds up access: since (a) is true, the chance that the blocks of any single file are scattered randomly about the disk is low.

>... I've been told this is also true of static filesystems.

Here only part (b) matters. If files on the file system are only allocated once, ever, and the file system itself remains static thereafter, there are no allocations in (a) to worry about. The question is then whether (b) is worth concern. The short answer is `probably not.'

>... if I have a 20MB partition set aside for an nfs client's swap activity,

(I can only guess that you mean `swap files for SunOS ``swap on a file'' style paging/swapping' here. Swap partitions, in the original Unix sense, are not file systems.)

If all the files in that partition are allocated statically---never grow---then there is no reason to keep a 10% reserve.

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
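Point (a) can be illustrated with a toy model. The sketch below is not the BSD allocator: it ignores cylinder groups and rotational positions entirely and assumes used blocks are scattered uniformly at random over a flat array of blocks. The function name `next_block_free_rate` and all of the sizes are made up for illustration. It only shows how quickly the chance of finding the "obvious" next block free falls off as the disk fills.

```python
import random


def next_block_free_rate(total_blocks, utilization, trials=10000, seed=0):
    """Estimate how often the block right after a used block is free.

    Assumption: used blocks are placed uniformly at random, which is a
    deliberate simplification and not how a real filesystem lays out data.
    """
    rng = random.Random(seed)
    used_count = int(total_blocks * utilization)
    used = set(rng.sample(range(total_blocks), used_count))
    used_list = sorted(used)
    hits = 0
    for _ in range(trials):
        block = rng.choice(used_list)
        following = (block + 1) % total_blocks  # wrap at the end of the disk
        if following not in used:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for utilization in (0.50, 0.75, 0.90, 0.99):
        rate = next_block_free_rate(20000, utilization)
        print(f"{utilization:.0%} full: next block free in about {rate:.0%} of tries")
```

In this model the rate tracks the fraction of free space, so the allocator's first choice is rarely available once only a few percent of blocks remain.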
hitz@auspex.auspex.com (Dave Hitz) (11/18/89)
In article <21436@adm.BRL.MIL> MATHRICH@umcvmb.missouri.edu (Rich Winkel UMC Math Department) writes:
> I understand that allowing bsd filesystems to exceed 90% of their
> capacity results in a significant reduction in performance. This makes
> a certain amount of sense to me, at least when files are being created,
> deleted or appended to. However, I've been told this is also true of
> static filesystems. For instance, if I have a 20MB partition set aside
> for an nfs client's swap activity, I can't utilize the entire 20MB for
> the swap file, I'm supposed to leave 10% unused to avoid a performance
> impact, even though the swap file's size is unchanging. Is this true?
> Could someone explain this to me?

Here's a shot.

The problem is that as a file grows on a mostly full partition, the block immediately following the last one in the file is generally not available. Even on almost empty filesystems this is always a possibility, but statistically it turns out that only as the filesystem passes 90% full does this problem become nasty.

Once blocks for a file are allocated in a bad order, it doesn't matter if the filesystem is static or not. If you read that file sequentially, the disk will have to seek all over the place to get the blocks. So even for static filesystems, some limit probably still makes sense. (You might try creating known rarely used files last, and crank up utilization to 100%.)

Now the question about swap is interesting. When you create a swap file using mkfile(8), it allocates disk blocks when the file is created. If you create a 20 MB swap file in an otherwise empty 20 MB partition, the block allocation would probably be quite good because there would be no conflict with smaller chunks of space being allocated in other files. If you want more than one swap file per partition, that would probably work also, but make sure *not* to run the mkfile's in parallel. Do them one after the other. (And don't use the -n option which would defeat the whole point of getting the blocks nicely allocated at create time.)

--
Dave Hitz                                    home: 408-739-7116
UUCP: {uunet,mips,sun,bridge2}!auspex!hitz   work: 408-492-0900
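The advice to build swap files one after the other can be pictured with a deliberately crude model. The sketch below assumes an allocator that always hands out the lowest-numbered free block, which is far simpler than what the real filesystem does (it spreads files over cylinder groups and considers rotational layout), so treat it as a worst case rather than a prediction. `create_files` and `count_extents` are hypothetical helper names, and the file names and sizes are invented.

```python
def create_files(file_sizes, parallel):
    """Lay out files with a toy 'lowest free block first' allocator.

    parallel=False: each file is written completely before the next starts.
    parallel=True:  the writers take turns, one block at a time.
    Returns a dict mapping file name to its list of block numbers.
    """
    next_free = 0
    layout = {name: [] for name in file_sizes}
    if parallel:
        remaining = dict(file_sizes)
        while any(size > 0 for size in remaining.values()):
            for name in file_sizes:
                if remaining[name] > 0:
                    layout[name].append(next_free)
                    next_free += 1
                    remaining[name] -= 1
    else:
        for name, size in file_sizes.items():
            layout[name] = list(range(next_free, next_free + size))
            next_free += size
    return layout


def count_extents(blocks):
    """Count the runs of consecutive block numbers in a file's block list."""
    if not blocks:
        return 0
    breaks = sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)
    return 1 + breaks


if __name__ == "__main__":
    sizes = {"swap1": 100, "swap2": 100, "swap3": 100}  # sizes in blocks
    for parallel in (False, True):
        layout = create_files(sizes, parallel)
        mode = "parallel" if parallel else "one at a time"
        extents = {name: count_extents(blocks) for name, blocks in layout.items()}
        print(mode, extents)
```

In this model, files created one at a time each occupy a single contiguous run, while files created in parallel are interleaved block by block. How close a real system comes to either extreme is exactly the kind of thing that would need to be measured.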
emv@math.lsa.umich.edu (Edward Vielmetti) (11/18/89)
Dave Hitz's article refers to creating swap files with mkfile (this is on a Sun at least) and recommends not using the -n option, since that will lead to fragmented swap files.

That's fine, but space is a bit tight here, and various people use their machines differently. I'd really like to give most people on diskless 4M 3/50s a 16M swap file that has, say, 8M pre-allocated and filled with whatever it takes to swap nicely, and leave the other 8M to grow dynamically just in case someone manages to need it.

I don't see a way to do it with the tools provided. Does this approach make sense, and what would it take to do it?

thanks

--Ed
BACON@MTUS5.BITNET (Jeffery Bacon) (11/19/89)
In article <2643@auspex.auspex.com>, hitz@auspex.auspex.com (Dave Hitz) says:
>
>The problem is that as a file grows on a mostly full partition, the
>block immediately following the last one in the file is generally not
>available. Even on almost empty filesystems this is always a possibility,
>but statistically it turns out that only as the filesystem passes 90%
>full does this problem become nasty.
>

How does one learn this stuff? I RTF(Sun)Ms, but it didn't help that much. Or was I looking in the wrong place?

>
>Now the question about swap is interesting. When you create a swap
>file using mkfile(8), it allocates disk blocks when the file is
>created. If you create a 20 MB swap file in an otherwise empty 20 MB
>partition, the block allocation would probably be quite good because
>there would be no conflict with smaller chunks of space being
>allocated in other files. If you want more than one swap file per partition,
>that would probably work also, but make sure *not* to run the mkfile's in
>parallel. Do them one after the other. (And don't use the -n option which
>would defeat the whole point of getting the blocks nicely allocated at
>create time.)
>

Well, I hate to sound too stupid, but I did just this on a couple of swap partitions I made a couple of months ago. (Run the mkfile(8)s in parallel. This is on a Sun 3/260, SunOS 4.0.3, two 280MB Fujitsus, 3 ~36MB swaps on two 117MB partitions. The partitions are packed full, with about 500K free each.) Should I remake the swap files?

At the same time, it occurred to me to cut the number of inodes and the size of the minfree value. After all, there are only 3 files...(plus the . and .. directory)...I don't see anything wrong with having done that, am I correct?

>--
>Dave Hitz                                    home: 408-739-7116
>UUCP: {uunet,mips,sun,bridge2}!auspex!hitz   work: 408-492-0900

-------
Jeffery Bacon
Computing Technology Svcs., Michigan Technological University
bitnet: bacon@mtus5
uucp (alternate): <world>!itivax!anet!bacos
madd@world.std.com (jim frost) (11/21/89)
BACON@MTUS5.BITNET (Jeffery Bacon) writes:
> How does one learn this stuff? I RTF(Sun)Ms, but it didn't help that much.
>Or was I looking in the wrong place?

Get a copy of "The Design and Implementation of 4.3 BSD UNIX", available at any good bookstore near you. It explains how the filesystem was implemented and why they recommend the 10% buffer. There's also an article in the USENIX Computing Systems Journal (Volume 2, Number 3, Summer 1989), "Heuristics for Disk Drive Positioning in 4.3BSD", which describes how to tune the BSD hp disk driver (and some implementation issues) and which you might find interesting.

The BSD book and its System V counterpart by Bach are very enlightening if you really want to know the how and why. Both of them will require a bit of operating systems understanding before you delve into them. Both of them are a lot better if you have source to poke at, too.

Several of the major operating systems textbooks devote substantial space to UNIX, both System V and BSD, which might be a better place to start since they're more general. I've lent almost all of mine out so I can't make particular recommendations right now.

jim frost
software tool & die
madd@std.com
hitz@auspex.auspex.com (Dave Hitz) (11/30/89)
In article <89323.004057BACON@MTUS5.BITNET> BACON@MTUS5.BITNET (Jeffery Bacon) writes:
> How does one learn this stuff?

RTFC :-)

> At the same time, it occurred to me to cut the number of inodes and the
> size of the minfree value. After all, there are only 3 files...(plus the
> . and .. directory)...I don't see anything wrong with having done that, am I
> correct?
...
> Well, I hate to sound too stupid, but I did just this on a couple of
> swap partitions I made a couple of months ago. (Run the mkfile(8)s in
> parallel. This is on a Sun 3/260, SunOS4.0.3, 2 280MB Fujitsu's, 3 ~36Mb
> swaps on two 117MB partitions. The partitions are packed full, with about
> 500K free each.) Should I remake the swap files?

Easy one first: There is *definitely* nothing wrong with pulling down the number of inodes.

Hard one: Setting minfree down to 0 doesn't break anything, but *may* cause performance problems. To determine whether it's *actually* causing performance problems would require an experiment or a simulation of your exact setup.

You can make sure that minfree=0 does not hurt performance much by starting with a completely empty partition and building swap files one at a time using mkfile(8) *without* the -n option.

Is it worth doing this? Figuring that out is probably harder than just doing it. (Why don't you run a swap-death benchmark before and after and tell us what happens.)

My *guess* would be that unless you swap heavily, you won't notice any difference. If you do swap heavily, random factors will dominate. If you swap to a part of the file that was created early, before the partition was getting full, probably no problem. If you swap at the end of the file, you may get a slowdown.

I'd like to emphasize that these guesses about how disk fragmentation is likely to affect performance are in fact guesses. I know how to make the problem (if there is one) go away, but I don't know if there is one.

--
Dave Hitz                                    home: 408-739-7116
UUCP: {uunet,mips,sun,bridge2}!auspex!hitz   work: 408-492-0900
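For the "before and after" experiment, one rough stand-in is to time a sequential read of each swap file before and after rebuilding it. The sketch below is only that: it is not a swap-death benchmark, since real paging is not a sequential read, and cached data (on the server or an NFS client) can hide the disk entirely unless the file is much larger than memory or the caches are cold. The function name `time_sequential_read` and the 64 KB chunk size are arbitrary choices for illustration.

```python
import time


def time_sequential_read(path, chunk_size=64 * 1024):
    """Read a file from start to end and return (bytes_read, seconds).

    Assumption: sequential read time is used here as a rough proxy for how
    scattered the file's blocks are. Caching can make the result meaningless
    for small files.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        size, seconds = time_sequential_read(path)
        print(f"{path}: {size} bytes in {seconds:.3f} s")
```

Run it against the same files several times under similar load; a single number from one run says very little.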