[comp.sys.mac.hardware] Disk Optimization

macq@miguel.llnl.gov (Don MacQueen) (09/21/90)

Ads and documentation for disk optimization software (defragmenting,
etc) speak as if the best thing to do is first defragment and then
'optimize' which means put all the files together and all the free space
together each in one big block.  But if this is done then the next time
a file is made bigger it is necessarily fragmented.  Wouldn't it be
better to take the free space and divide it up proportionally to the
file sizes and stick a bit of free space after each file?  Then files
would have a better chance of not becoming fragmented.
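
To make the idea concrete, here is a rough sketch in C of the proportional
split I have in mind.  Nothing below comes from any real optimizer; the file
sizes and the rounding rule are made up purely for illustration.

#include <stdio.h>

#define NFILES 4

int main(void)
{
    long size[NFILES] = { 800, 200, 50, 950 };  /* file sizes in blocks */
    long free_blocks  = 500;                    /* total free blocks    */
    long total = 0, pad, i;

    for (i = 0; i < NFILES; i++)
        total += size[i];

    /* pad_i = free * size_i / total, rounded down; the few blocks lost
       to rounding would just stay at the end of the volume */
    for (i = 0; i < NFILES; i++) {
        pad = free_blocks * size[i] / total;
        printf("file %ld: %ld blocks, then %ld blocks of slack\n",
               i, size[i], pad);
    }
    return 0;
}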

Any comments?

Don MacQueen
macq@miguel.llnl.gov

minich@d.cs.okstate.edu (Robert Minich) (09/21/90)

by macq@miguel.llnl.gov (Don MacQueen):
| Ads and documentation for disk optimization software (defragmenting,
| etc) speak as if the best thing to do is first defragment and then
| 'optimize' which means put all the files together and all the free space
| together each in one big block.  But if this is done then the next time
| a file is made bigger it is necessarily fragmented.  Wouldn't it be
| better to take the free space and divide it up proportionally to the
| file sizes and stick a bit of free space after each file?  Then files
| would have a better chance of not becoming fragmented.
| 
| Any comments?

  Sure, why not... First, I don't claim to have tested any of these ideas
out, but I do claim to have a certain curiosity about file systems. (I
understand/know a bit about MFS, HFS, SysV, and OS/2 HPFS, although I
don't know too much about HPFS.) It seems most opsys's don't care too much
about fragmentation. I assume it encourages third party developers. :-)
Anyway, back to the subject.
  The "optimal" setup for a file system is heavily dependent on how the
files change over time. (Grow, shrink, wiggle, delete and create.) On the
Mac, you tend to have a few files that are basically static. These are your
apps and, to a large extent, the System file. (The System file may go through
convulsions internally but still take up the same space on disk.) These
files you probably want somewhere apart from data files, and with
minimal padding for the files to grow. Your data files that are always
there -- the accounting, phonebook, and many spreadsheets -- will usually
grow but not shrink. Other files come and go, like that letter to Mom and
the daily memos.
  I guess an "optimal" disk would put all the repeatedly accessed files
(the sys and apps) in the middle of a disk where the heads are in a good
position to either be on top of the next-requested data or only half a disk
away. Other things that are accessed like mad are the catalog files and
these should ideally be around the disk blocks they represent. (The only
file sys I know that does this is HPFS, which places catalog info on 32mb
boundaries.)
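
(To put a number on the "only half a disk away" point, here is a quick
back-of-the-envelope calculation, entirely mine, in C: the average arm travel
to a uniformly random track is about half as large when the head's "home"
position is the middle track rather than the edge.)

#include <stdio.h>
#include <stdlib.h>

/* average number of tracks the arm travels from a fixed "home" track
   to a uniformly random target track */
static double avg_travel(int tracks, int home)
{
    long sum = 0;
    int t;

    for (t = 0; t < tracks; t++)
        sum += labs((long)(t - home));
    return (double)sum / tracks;
}

int main(void)
{
    int tracks = 1000;   /* made-up track count */

    printf("home at edge:   %.1f tracks on average\n",
           avg_travel(tracks, 0));
    printf("home at middle: %.1f tracks on average\n",
           avg_travel(tracks, tracks / 2));
    return 0;
}
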
  Back to reality... the most a Mac defragger can probably do for you is
put the static files in a large block (preferably near the middle), put the
always-around files together with plenty of padding, and put the volatile
files wherever is left. As far as putting all the allocated blocks in one
place, I think that would be dumb since a file that would grow, even a
little, would all of a sudden be scattered from somewhere in the middle of
a hunk of allocated space to somewhere distant where there's some free
blocks.
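
Here's a toy sketch, again my own and purely hypothetical, of the layout I'm
describing: growing files get some slack after them, static files are packed
solid, and volatile files take whatever is left.  A real defragger would also
want to aim the static group at the physical middle of the disk, which this
sketch glosses over.

#include <stdio.h>

enum kind { STATIC_FILE, GROWING_FILE, VOLATILE_FILE };

struct file { const char *name; long blocks; enum kind kind; };

int main(void)
{
    /* invented files and sizes, just to show the grouping */
    struct file f[] = {
        { "System",        400, STATIC_FILE   },
        { "WordProcessor", 150, STATIC_FILE   },
        { "Accounts",      120, GROWING_FILE  },
        { "Phonebook",      30, GROWING_FILE  },
        { "Letter to Mom",   5, VOLATILE_FILE },
    };
    int n = sizeof f / sizeof f[0];
    enum kind order[] = { GROWING_FILE, STATIC_FILE, VOLATILE_FILE };
    long next = 0;
    long slack = 50;    /* fixed slack per growing file, for simplicity */
    int pass, i;

    /* lay the groups out in three passes: growing files (with slack)
       first, static files packed solid, volatile files last */
    for (pass = 0; pass < 3; pass++)
        for (i = 0; i < n; i++)
            if (f[i].kind == order[pass]) {
                printf("%-14s at block %5ld (%ld blocks)\n",
                       f[i].name, next, f[i].blocks);
                next += f[i].blocks;
                if (f[i].kind == GROWING_FILE)
                    next += slack;
            }
    return 0;
}
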
  As a last note, the Mac HFS does a bit to help with fragmentation all on its
own. First of all, it uses "allocation blocks" that are multiple disk
blocks. Second, it tries to allocate contiguous blocks. (Someone from Apple
please clarify this: how does the MacOS decide what blocks to add to an
allocation? Does it look for a suitable run in the volume bitmap or what? I
assume allocContig() does look for big runs.) A nice thing for an
application to do when it completely rewrites a file (I assume most word
processors act this way; a dBase app probably wouldn't) is to first shrink
the file to nothing and then allocate all the necessary space in one
swoop, giving the file system a chance to defragment at least that file.
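
For what it's worth, here's that rewrite-in-one-swoop pattern sketched in
portable C rather than in Toolbox calls (on the Mac the real work would be
done with SetEOF and allocContig-style calls; I'm not quoting their exact
prototypes here, so take the analogy loosely):

#include <stdio.h>

/* Rewrite a whole document in one pass.  Opening with "wb" truncates
   the file to zero length (the shrink-to-nothing step), and the single
   large fwrite() hands the file system the full size at once, giving
   it a chance to lay the data down in one contiguous run. */
int rewrite_whole_file(const char *path, const char *data, size_t len)
{
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return -1;
    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    return fclose(fp);
}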

That's enough from me BUT... if you know of any documentation on file
systems (of any flavor) that I can get my hands on, ESPECIALLY in
machine-readable form, please let me know. I'd like to explore this area a
bunch more... heck, maybe I can get some credits for it. :-)

-- 
|_    /| | Robert Minich            |
|\'o.O'  | Oklahoma State University| A fanatic is one who sticks to 
|=(___)= | minich@a.cs.okstate.edu  | his guns -- whether they are 
|   U    | - Ackphtth               | loaded or not.

a544@mindlink.UUCP (Rick McCormack) (09/21/90)

From article by Robert Minich
<1990Sep21.021427.26233@d.cs.okstate.edu> referring to article by
macq@miguel.llnl.gov (Don MacQueen) re defragmenting files:
" I guess an "optimal" disk would put all the repeatedly accessed files
(the sys and apps) in the middle of a disk where the heads are in a good
position to either be on top of the next-requested data or only half a disk
away. Other things that are accessed like mad are the catalog files and
these should ideally be around the disk blocks they represent. (The only
file sys I know that does this is HPFS, which places catalog info on 32mb
boundaries.)

(end of quote)

I seem to remember that the Apple II file system put its table of contents and
file info in the middle of the disk, and worked outward and inward from there
with data and application files.

How many people would buy an optimizer that gave them the ability to specify
lots of detail about where files went on a disk?  I'd bet not too many.

wwtaroli@rodan.acs.syr.edu (Bill Taroli) (09/23/90)

In article <1990Sep20.163033@miguel.llnl.gov> macq@miguel.llnl.gov (Don MacQueen) writes:
>together each in one big block.  But if this is done then the next time
>a file is made bigger it is necessarily fragmented.  Wouldn't it be
>better to take the free space and divide it up proportionally to the
>file sizes and stick a bit of free space after each file?  Then files
>would have a better chance of not becoming fragmented.

What you say is true. However, at least one disk optimizer, DiskExpress II,
attempts to account for this. What it does is move all the files, in a
predetermined manner, to the _end_ of the disk. It then leaves all files
used recently ("in the last couple of days") at the head of the disk. It is
designed to run nightly. Thus, if you were working on an old file whose size
grew, this file would be moved to the head of the disk. Presumably, if you edit
it, then you will have some interest in the file for a while... and will likely
edit it again. In this way, the situation you describe is avoided (or at least
you don't have to live with it for more than 24 hours).
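
If it helps to picture the policy, here's a toy model in C (mine, not
anything from DiskExpress II itself) of the recency split: anything touched
within the last couple of days sorts to the head of the disk, and everything
else gets pushed toward the end.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define RECENT (2L * 24 * 60 * 60)   /* "the last couple of days" */

struct file { const char *name; time_t last_used; };

static time_t now;

/* recently used files sort first (toward the head of the disk);
   within each group, newer files come before older ones */
static int by_policy(const void *a, const void *b)
{
    const struct file *fa = a, *fb = b;
    int recent_a = (now - fa->last_used) < RECENT;
    int recent_b = (now - fb->last_used) < RECENT;

    if (recent_a != recent_b)
        return recent_b - recent_a;
    return (fb->last_used > fa->last_used) - (fb->last_used < fa->last_used);
}

int main(void)
{
    struct file f[3] = {
        { "report (an hour ago)", 0 },
        { "old letter (a month ago)", 0 },
        { "budget (yesterday)", 0 },
    };
    int i;

    now = time(NULL);
    f[0].last_used = now - 3600;
    f[1].last_used = now - 30L * 24 * 60 * 60;
    f[2].last_used = now - 24L * 60 * 60;

    qsort(f, 3, sizeof f[0], by_policy);
    for (i = 0; i < 3; i++)
        printf("%d: %s\n", i, f[i].name);
    return 0;
}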

However, some have criticized DiskExpress II because it can run during your
work. In older versions, this would actually occur during whatever you were
doing. As of 2.04, this problem has been solved somewhat by DiskExpress II
bowing out more easily if you are typing, mousing, etc. I haven't formed a firm
opinion on this, but have toggled the "Optimize Automatically" option many 
times.

It should be noted here that DiskExpress II will also allow you to have it
track your usage of files without performing automatic optimizations. This affords
you the option to have DiskExpress II perform the optimizations when _you_ want
in addition to putting the most recently used files at the beginning of the 
drive.

This is the only package I know that performs optimizations in this manner
(pushing older files to the end of the drive), but there may be others. This
just happens to be the one I use.

Regards,

--
*******************************************************************************
* Bill Taroli (WWTAROLI@RODAN.acs.syr.edu)    | "You can and must understand  *
* Syracuse University, Syracuse NY            | computers NOW!" -- Ted Nelson *
*******************************************************************************

Sonny.Shrivastava@f555.n161.z1.FIDONET.ORG (Sonny Shrivastava) (09/28/90)

I think the performance gained by placing system files in the middle of the 
disk would be negligible.  In fact, I rarely notice a performance difference on
my disk between when it's fragmented and when it isn't.  I use SUM-II to defragment
my drive.  Although it doesn't place files optimally as you suggest, it does 
pack everything so all the data on the disk is contiguous.  I think it does a 
good job.

--  
Sonny Shrivastava - via FidoNet node 1:125/777
    UUCP: ...!uunet!hoptoad!fidogate!161!555!Sonny.Shrivastava
INTERNET: Sonny.Shrivastava@f555.n161.z1.FIDONET.ORG