[comp.sys.amiga] Disk fragmentation

ugmiker@sunybcs.UUCP (10/30/87)

Hi all,
     Well, thanks a lot for all the replies about disk fragmentation;
they have really helped....  I can now boot up my machine in about 15
seconds.  My old boot sequence used to take about a minute and six
seconds.  I think the most speed was picked up by reordering my
programs on the disk.  I also picked up some speed by first loading
"run" into "ram:" and then "ram:run"ing everything from then on,
thus not needing to reload run every time I ran a program in the
background....
     With my old boot-up, I would have to wait until all the programs
were loaded before I could get to work; now, as soon as run starts
them all running, I can get to work.

                  Thanks a lot...to everyone....

                                        mike



==========================================================================
Apathy ??? ahhhh who cares about that ....(besides of course me :-) 
==========================================================================
Mike Reilly  
University of Buffalo Computer Science       
csnet:	ugmiker@buffalo.CSNET 
uucp:	..!{nike|watmath,alegra,decvax}!sunybcs!ugmiker
BITNET:	ugmiker@sunybcs.BITNET   <-OR->   ACSCMPR@ubvmsc.BITNET

jdc@rama.UUCP (James D. Cronin) (03/29/89)

I have had a hard drive on my Amiga for about 1 year.  During that
time countless files have been created/modified/deleted.  Should I
start worrying about disk fragmentation?  Is there any way to measure
and/or correct it?  Are there performance penalties?  It seems as
though things will slow down when files are split up into small chunks
scattered around the disk (as opposed to one contiguous chunk).

Any suggestions or pointers are most welcome.  My concern is based on
VAX/VMS experience, but I really don't know much about AmigaDOS
internals.

Thanks in advance,
Jim Cronin

-- 

James D. Cronin           UUCP:  {...}!rochester!tropix!rama!jdc
Scientific Calculations/Harris

ugkamins@sunybcs.uucp (John Kaminski) (04/09/89)

In article <10533@rama.UUCP> jdc@rama.UUCP (James D. Cronin) writes:
>I have had a hard drive on my Amiga for about 1 year.  During that
>time countless files have been created/modified/deleted.  Should I
>start worrying about disk fragmentation?
...
>Any suggestions or pointers are most welcome.  My concern is based on
>VAX/VMS experience, but I really don't know too much about AmigaDos
>internals.
>
>Thanks in advance,
>James D. Cronin           UUCP:  {...}!rochester!tropix!rama!jdc
>Scientific Calculations/Harris

The filing system on the Amiga uses hashing with chaining.  In case you don't
know what this is, it is generating an otherwise "useless" number by combining
the numbers (the character code values) in the file name in a certain
consistent way (such as adding them up) to condense it down into a single
number to be used as an index into the disk.  For example, suppose the
result of adding up all the character codes in the desired pathname is
50.  At 11 sectors per track and 2 tracks per cylinder, sector 50 would
be on cylinder 2 (starting from 0, cyl 0 has sectors 0-21, cyl 1 has 22-43,
and cyl 2 has 44-65, etc.), track/side 0 (cyl 2 track 0 has sectors 44-54),
sector 6.  Right there should be a file header block which has the filename
controlled by that exact hash.

The possibility of duplicate hashes (almost obviously) exists, so that is
where the "chaining" part comes in.  One field in the file header block
describes where the next entry with that same hash is found.  For example, in
our example above, if the filename sought is not the one in the file header
block, the next-hash field might contain the number 188, in which case one
gets block 188 from the disk and examines the filename in THAT file header
block.  This continues until either the filename matches, or the next-hash
field contains a 0 ("points" to nowhere), in which case the filing system
reports back that the file was not found.

If the file is being created, the filing system goes to the root block to
retrieve the bit table that keeps track of which sectors are in use, searches
for an open "slot," fills in the newly found block with a file header block,
sets the next-hash field THERE to 0, sets the next-hash field in the file
header block ON THE PREVIOUS END OF THE HASH CHAIN to the number of the newly
allocated sector, and updates the "in use" bit table.  I'm not sure which
order is used, but that's what needs to be done.  When the file is written,
another free block of storage is allocated from that bit table, and the file
header block for the new file is rewritten with the location of its first
data block recorded in the appropriate field.

Deletion is only a matter of readjusting the next-hash field of the file
header block that is the previous "link" in the chain, or copying the file
header block of the next "link" in the chain to the initial hash position if
one is deleting the first link in the chain.  While that is done, one updates
the free-sector bit map to reflect that the deleted storage is once again
available.  Note that the data itself is not deleted, just the references to
it.  That is why DISKDOCTOR will sometimes bring back deleted files and
directories.  Sooooooo......
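The lookup described above can be sketched in C.  This is a toy model only:
the disk blocks are faked as an in-memory array, and the simple
add-up-the-characters hash is for illustration (the real function is
different, as noted in the P.S.), but the walk down the next-hash chain is
the same idea.

```c
#include <string.h>

#define NBLOCKS 256

/* Hypothetical in-memory stand-in for disk blocks: each "file header
   block" holds a filename and the block number of the next header with
   the same hash (0 = end of chain, as on the real disk). */
struct header_block {
    char name[32];
    int  next_hash;   /* next block with the same hash, or 0 */
};

static struct header_block disk[NBLOCKS];

/* Toy hash: just add up the character codes, as in the text above. */
static int toy_hash(const char *name)
{
    int h = 0;
    while (*name)
        h += (unsigned char)*name++;
    return h % NBLOCKS;
}

/* Start at the hashed block and follow next_hash links until the name
   matches or a 0 link ends the chain.  Returns block number or -1. */
static int lookup(const char *name)
{
    int blk = toy_hash(name);
    while (blk != 0) {
        if (strcmp(disk[blk].name, name) == 0)
            return blk;
        blk = disk[blk].next_hash;
    }
    return -1;  /* "file not found" */
}
```

For instance, "ab" and "ba" collide under this toy hash, so the second one
created would hang off the first one's next-hash field.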

Fragmentation can occur but is generally kept to a minimum.  As usual, the
easiest way I know to consolidate the files is to copy them with

copy from <disk> all to <disk> clone

where the second <disk> is freshly formatted.  If there is a utility for
copying the contents of a hard disk to several floppies, I am unaware of it.
The discussion above also points out that you should try to copy the files
you want accessed faster first, so that they will be at the head of the hash
chain, or at least as close as possible to the head.

As for a few pointers, how about 0xf20c2, 0xE0204, and 0x600c2 ?

P.S. -- the actual hashing function used is more complex than just adding
up the character code values.  I don't know it, but someone once told it
to me over the phone.  If you have DISKED, it can tell you the hash  for a
particular file name.
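For the curious, the function that has since been documented for AmigaDOS
looks roughly like this (I am reconstructing it from memory of the published
description, so treat the details as approximate): start from the name
length, fold in each character upper-cased, keep 11 bits, and reduce modulo
the hash-table size.

```c
#include <ctype.h>
#include <string.h>

/* Hash-table slots in a 512-byte root or directory block. */
#define HASHTABLE_SIZE 72

/* The AmigaDOS directory hash, as documented: seed with the name
   length, then for each character multiply by 13, add the upper-cased
   character code, and mask to 11 bits; finally take the result modulo
   the table size.  Case-folding is why AmigaDOS names are
   case-insensitive for lookup. */
static int amigados_hash(const char *name)
{
    unsigned long h = strlen(name);
    for (; *name; name++)
        h = (h * 13 + toupper((unsigned char)*name)) & 0x7FF;
    return (int)(h % HASHTABLE_SIZE);
}
```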

lphillips@lpami.wimsey.bc.ca (Larry Phillips) (04/10/89)

In <5117@cs.Buffalo.EDU>, ugkamins@sunybcs.uucp (John Kaminski) writes:
>The filing system on the Amiga uses hashing with chaining.  In case you don't
>know what this is, it is generating an otherwise "useless" number by combining
>the numbers (the character code values) in the file name in a certain
>consistent way (such as adding them up) to condense it down into a single
>number to be used as an index into the disk.  For example, suppose the
>result of adding up all the character codes in the desired pathname is
>50.  At 11 sectors per track and 2 tracks per cylinder, sector 50 would
>be on cylinder 2 (starting from 0, cyl 0 has sectors 0-21, cyl 1 has 22-43,
>and cyl 2 has 44-65, etc.), track/side 0 (cyl 2 track 0 has sectors 44-54),
>sector 6.  Right there should be a file header block which has the filename
>controlled by that exact hash.

 Well, that's close. The hash value does not point at a sector on disk, but at
a hash table offset within either the root block or a directory block.
Contained in that hash table entry is the pointer to either a directory block
or a file header block.
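 To sketch the corrected picture: the hash selects a slot in the directory's
own table, and that slot points at the first file header block of the chain.
A hypothetical in-memory model (block counts and the hash are placeholders,
not the real on-disk layout):

```c
#include <string.h>

#define HT_SIZE 72       /* hash-table slots in a 512-byte root/dir block */
#define NBLOCKS 1760     /* blocks on a DD floppy, for illustration */

/* The root (or a directory) block holds a table of block numbers
   indexed by hash; each file header block carries the "next header
   with the same hash" link. */
struct dir_block {
    int hash_table[HT_SIZE];      /* 0 = empty slot */
};

struct file_header {
    char name[32];
    int  hash_chain;              /* next header with same hash, or 0 */
};

static struct file_header blocks[NBLOCKS];

/* Index the directory's hash table with the hashed name, then walk the
   chain of file header blocks.  Returns a block number, or -1 if the
   chain ends without a match. */
static int dir_lookup(const struct dir_block *dir,
                      int (*hash)(const char *),
                      const char *name)
{
    int blk = dir->hash_table[hash(name)];
    while (blk != 0) {
        if (strcmp(blocks[blk].name, name) == 0)
            return blk;
        blk = blocks[blk].hash_chain;
    }
    return -1;
}
```

The difference from the earlier sketch is that the hash never names a disk
sector directly; it only picks a slot inside a block the filesystem already
has in hand.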

>Fragmentation can occur but is generally kept to a minimum.  As usual, the
>easiest way I know to consolidate the files is to copy them with

 Fragmentation is entirely dependent upon the number and sizes of files deleted
vs. the number and sizes of files written, and on the order in which these
operations are performed. It is possible to badly fragment a disk in a matter
of minutes, though in normal operation, it takes considerably longer. There is
no way to accurately determine how long it will take to fragment a disk without
knowing a lot more about what will be deleted and written to the partition.

>  ...  If there is a utility for copying
>the contents of a hard disk to several floppies, I am unaware of it.  The
>discussion above also points out that you should try to copy the files
>you want accessed faster first, so that they will be at the head of the hash
>chain, or at least as close as possible to the head.

ExpressCopy, from Expressway Software, is a HD backup program that will do
exactly this. It will copy any arbitrary partition, directory, or subdirectory
onto multiple floppies, creating standard AmigaDOS floppies that can be
accessed just like normally generated floppies, at a rate of about one every
1.5 minutes, to multiple drives.

-larry

--
Frisbeetarianism: The belief that when you die, your soul goes up on
                  the roof and gets stuck.
+----------------------------------------------------------------------+ 
|   //   Larry Phillips                                                |
| \X/    lphillips@lpami.wimsey.bc.ca or uunet!van-bc!lpami!lphillips  |
|        COMPUSERVE: 76703,4322                                        |
+----------------------------------------------------------------------+

ugkamins@sunybcs.uucp (John Kaminski) (04/11/89)

In article <2357@van-bc.UUCP> lphillips@lpami.wimsey.bc.ca (Larry Phillips) writes:
>In <5117@cs.Buffalo.EDU>, ugkamins@sunybcs.uucp (John Kaminski) writes:
>>The filing system on the Amiga uses hashing with chaining.  In case you don't
>
> Well, that's close. Tha hash value does not point at a sector on disk, but to a
>hash table offset within either the root block or a directory block. Contained
>in that hash table entry is the pointer to either a directory block or a file
>header block.
   OOOPS!  Yep.  Sorry 'bout that.  Live and RElearn.  Manuals are wonderful
if you just read them before shooting off your fingers -- er... mouth.
>
>ExpressCopy, from Expressway software is a HD backup program that will do
>exactly this. It will copy any arbitrary partition, directory, or subdirectory
>onto multiple floppies, creating standard Amigados floppies that can be
>accessed as normally generated floppies can, and at the rate of about 1 every
>1.5 minutes, to multiple drives.
>
Thanks for the info.

or the .info?

   //      "Do we get what we deserve
  //         or deserve what we get?" -- me
\X/  The Amiga 1000 vows to never die

P.S. -- another instance of a news poster rejection for including more old text
than original/new text.  Time to reiterate a quote that should now probably be
(somewhat painfully) applied to myself:
"Don't listen to me -- I never do"  -- The Doctor
    (coincidentally, I like Tom Baker best)