[psu.micro] Cluster Size

KDA101@psuvm.psu.edu (KeithPetto Alexander) (04/14/90)

        OK.  Here's the NEW deal.  The size of a cluster is determined
when you format a disk.  Floppy disks appear to have standard cluster
sizes.  Regardless of DOS version, low density disks (actually these are
double sided double density) are formatted with a cluster size of 1024
bytes (1K).  This means a 1 byte file takes 1024 bytes and a 1025 byte
file takes 2048 bytes (on a low density disk).  High density floppy disks
have a cluster size of 512 bytes.  Although there are more reasons than
one for this, I believe the main reason is that with the large amount of
space on the disk, it was decided there was enough room for a larger FAT
table.  (A larger FAT table lets you waste less disk space.)
        Now, hard disks.  The cluster size of a hard disk is dependent on
how it was formatted.  If you format a hard disk with DOS 2.?, the
smallest cluster size available is 8K (what a waste for all those 100 byte
batch files!).  With DOS 3.? and 4.? the smallest cluster size is 2K, much
better but still a little wasteful.  Now don't get me wrong, with large
hard disks, DOS 3.? or 4.? can't always give you a 2K cluster.  When your
hard disk is larger than about 40Mb you end up with a 4K cluster.  And
larger hard disks need even larger clusters.  This is all due to FAT table
size.  Your FAT table must be small enough so that searches through it
don't take longer than the actual file load.
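
        To make the rounding concrete, here is a minimal Python sketch (an
illustration added to this summary, not something from the responses).  The
name allocated_bytes is made up, and the sketch assumes a zero-length file
owns no clusters at all.

```python
def allocated_bytes(file_size, cluster_size):
    """Disk space a file occupies when space is handed out in whole clusters."""
    if file_size == 0:
        return 0  # assumption: an empty file has no clusters allocated
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


# Same files, three cluster sizes: HD floppy, DD floppy, DOS 2.? hard disk.
for size in (1, 100, 1024, 1025):
    print(size,
          allocated_bytes(size, 512),
          allocated_bytes(size, 1024),
          allocated_bytes(size, 8192))
```

For a 100 byte batch file this gives 8192 bytes on an 8K-cluster disk and
2048 bytes on a 2K-cluster disk.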

        Cluster size basic rules of thumb:

                Low Density 3.5" (720Kb) or 5.25" (360Kb):
                        1024 byte (1K) clusters.
                High Density 3.5" (1.44Mb) or 5.25" (1.2Mb):
                        512 byte (.5K) clusters.
                Hard disks: Don't count on it.
                        Small businesses and people (this week) have
                        hard disks < 40Mb so maybe everybody using DOS
                        3.? and 4.? has a cluster size of 2K but that
                        is probably assuming too much.

        Below are the responses that this info was collected from (along
with my own searching).  Thanks to all who responded.

                                 -- Petto
---------------------------------------------------------------------------
From: raymond@math.berkeley.edu (Raymond Chen)
because the cluster size determines many things:

[1] How much space is wasted on the disk (larger cluster size = more wastage)
[2] How big the FAT is (smaller cluster size = larger FAT), and a larger
    FAT means that it takes longer to search the FAT for empty clusters,
    as well as taking up more memory in RAM.
[3] How fast you can load information from the disk (larger cluster
    size = faster)

So you have to strike a balance.
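
Point [2] can be put into rough numbers.  The Python sketch below is only an
illustration: fat_bytes is a made-up name, and the model assumes one FAT
entry per cluster while ignoring the reserved entries, the second FAT copy,
the boot sector and the root directory.

```python
def fat_bytes(volume_bytes, cluster_size, fat_bits=16):
    """Approximate size of one FAT copy: one entry per cluster."""
    clusters = volume_bytes // cluster_size
    return (clusters * fat_bits + 7) // 8  # round up to whole bytes


VOLUME = 32 * 1024 * 1024  # a 32 MB volume, chosen only as an example

# Halving the cluster size doubles the FAT.
for cluster_size in (512, 2048, 8192):
    print(cluster_size, fat_bytes(VOLUME, cluster_size))
```

Under these assumptions, 2K clusters give a FAT of 32768 bytes, while 8K
clusters give 8192 bytes.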

To find out the cluster size, use int 21h subfunction 36h.


---------------------------------------------------------------------------
From: dje@ai.mit.edu (David J. Elliot)
My 660MB ESDI drive has a 16K cluster size (MS DOS 4).


---------------------------------------------------------------------------
From: tarvaine@jyu.fi (Tapani Tarvainen)

I've seen cluster sizes from 128 bytes (some oddball ramdisk) to 16K
(big hard disk with old DOS).  At the moment I have 2K clusters in
my machine, but the formatting program I used would have allowed
anything from 512 bytes to 8K (or perhaps even more, I'm not sure).

However, for your purpose a poll isn't really a good approach,
the cluster size can be determined from the boot sector of the
disk (whether or not the disk is bootable): bytes 11-12 give
bytes/sector and byte 13 sectors/cluster, just multiply them
(where numbering starts with 0).
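
A minimal Python sketch of that lookup follows.  The function name is made
up, and the 512-byte buffer is a hand-built stand-in rather than a real boot
sector; the two-byte field is read as little-endian.

```python
import struct


def cluster_size_from_boot_sector(sector):
    """Multiply bytes/sector (offsets 11-12) by sectors/cluster (offset 13)."""
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 11)
    return bytes_per_sector * sectors_per_cluster


# Stand-in boot sector: 512 bytes/sector, 2 sectors/cluster.
fake_sector = bytearray(512)
struct.pack_into("<HB", fake_sector, 11, 512, 2)
print(cluster_size_from_boot_sector(fake_sector))  # 1024
```

To use it on a real disk you would pass in the first sector of the volume;
how you read that sector depends on your system and is not shown here.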


---------------------------------------------------------------------------
From: ts@uwasa.fi (Timo Salmi LASK)
Here is information about one of the programs that already does the
whole thing: dirf.exe in (/pc/ts/)tsutlc16.arc.  Available by
anonymous ftp from chyde.uwasa.fi, Vaasa, Finland.  Furthermore,
sysinfo.exe in (/pc/ts/)tsutil29.arc (has been submitted to the
binaries) gives you a round tour to the cluster sizes of all your
devices, ramdisks included.


---------------------------------------------------------------------------
From: GILLA@QUCDN.QueensU.CA (Arnold G. Gill)
    The cluster size depends explicitly on the software used to format them.
For example, with MS-DOS 2.1, the cluster size on the hard disk was 8k.  At
MS-DOS 3, the cluster size dropped to the current 2k.  I don't know if it is
any different for the very large (40+ Mb) hard disks, but it might be.

     I found it interesting to find out that the high-density disks have 1/2k
cluster sizes.  Is this due to the fact that the extra address space needed is
available on 286 machines, but never on 8088/8086 ones?


---------------------------------------------------------------------------
From: cpqhou!scotts@uunet.uu.net
There is a simple way to get the sectors/cluster for ANY device (hard/floppy).
Simply issue DOS function 1Bh.  If you don't know how, it works like this:

Issue an interrupt 21H call as follows (if you know assembly):
                mov ah,1bh      ; function 1Bh: allocation info, default drive
                int 21h         ; (function 1Ch does the same for the drive in DL)
There, now you have all the information you need returned in:
        AL=Sectors per cluster
        CX=Sector size
        DX=Total number of clusters on device

I hope this helps.  If you want to do this from C, then you can do it
via intdos() in MSC and a similar command in Turbo (check the manual).
You're going to have to use some REGS construct...


---------------------------------------------------------------------------
From: fogel@herald.usask.ca
According to the book "PC Power Tools" by PC Magazine,
the cluster size of a hard disk depends both upon its size,
and the DOS version that was used to format it (with FDISK
and FORMAT).

For a 10 MB hard disk, the cluster size is 4,096 bytes.
Under DOS 2.x, larger hard disks had proportionally larger
cluster sizes (a 20 MB disk had 8,192 byte clusters).
DOS 3.0 fixed this (by increasing the maximum number of
clusters per disk to 65,518); it makes 2,048 byte clusters on 20 and 30 MB
disks.  For really large disks (over 120 Mb), you might expect
the cluster size to start going up again.
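
The arithmetic behind those figures can be sketched in Python.  This is a
simplified model, not how FORMAT actually decides: min_cluster_size is a
made-up name, the 65,518 limit is the one quoted above, the 4,084 limit for
a 12-bit FAT is an assumption (a commonly quoted figure, not from the book),
the 2K floor is the one Petto reports for DOS 3.? and 4.?, and space taken
by the FAT and root directory is ignored.

```python
MB = 1024 * 1024


def min_cluster_size(disk_bytes, max_clusters, floor=512):
    """Smallest power-of-two cluster size keeping the count within max_clusters."""
    size = floor
    while -(-disk_bytes // size) > max_clusters:  # ceiling division
        size *= 2
    return size


FAT12_MAX = 4084    # assumption, see the note above
FAT16_MAX = 65518   # figure quoted in the post

for mb in (10, 20):
    print(mb, "MB, 12-bit FAT:", min_cluster_size(mb * MB, FAT12_MAX))
for mb in (20, 30, 660):
    print(mb, "MB, 16-bit FAT:", min_cluster_size(mb * MB, FAT16_MAX, floor=2048))
```

With these assumptions the model gives 4096 and 8192 bytes for 10 and 20 MB
disks with a 12-bit FAT, 2048 bytes for 20 and 30 MB disks with a 16-bit
FAT, and 16384 bytes for a 660 MB disk.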


---------------------------------------------------------------------------
From: ted@helios.ucsc.edu (Ted Cantrall)
I had my ST-225 divided into 4 partitions when I first got it. (Just exploring
the options) When I realized that I was getting 4K clusters, I repartitioned
to one and my cluster size went to 2K. (dos 3.21)       -ted-


---------------------------------------------------------------------------
From: Peter B{ckgren <eskimo@clinet.FI>
If you have a copy of DISKMANAGER check its README file for a
table listing all cluster sizes the DOS CHKDSK utility supports for
different hard disk sizes.

As for what I use :
Disks 512-1024
HD    512-4096  (512 is only used for a ramdisk)
Very old DOS-versions used 8192 at times but I've never
seen any greater values than that.
Some program I wrote a few years ago supported 512-4096 (sufficient) but
it did have a parameter option for bigger cluster sizes.


---------------------------------------------------------------------------
From: pnl@hpfinote.HP.COM (Peter Lim)
Okay, let's sum everything up. The cluster size has to do with the size
of the disk and the number of bits used in the FAT (file allocation table).

For DOS2.1 they used strictly 12-bit FAT and when your hard disk size
is > 16 MB, the cluster size becomes 8K (very wasteful). Then came DOS3.x;
and somehow, M*cr*S*ft decided that for disk size less than 16 MB, use
a 12 bit FAT which resulted in 4K cluster size. And for disk size >= 16 MB,
use 16 bit FAT which gives 2K cluster size. Neat, isn't it? The floppy cluster
size is pretty standard (1K for low density & .5K for hi density); but
this has something to do with selected FAT size and disk capacity and NOT
the addressing capacity of AT vs XT. This information is tucked away
somewhere in DOS and can be obtained via a DOS interrupt service. I've
done this before but can't quite remember how to.

In fact, I've written a C program (MSC) which works like the UNIX "du"
command, but can tell the different cluster sizes. It will also deduce
the amount of space needed for the directory entry (in a kludgy way :-))
and allow you to force size evaluation to a fixed cluster size (this allows
you to do "du" on a hard disk with a cluster size of floppy disk; then you
can be sure if the files will fit on the floppy). I wrote this way back in
1987 using the "DIR" package posted by someone else (sorry can't remember
his name). Works fairly fast as far as I'm concerned. Maybe I'll post it
to the net sometime soon ..... (if anyone e-mails a request to me, it might
be faster  :-)  .. as getting from MS-DOS to UNIX is not exactly trivial
at this site).
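
The forced-cluster-size idea is easy to sketch in Python.  This is not the
MSC program described above, only an illustration of the same arithmetic:
du_sizes and du are made-up names, and unlike that program the sketch makes
no attempt to count the space used by directory entries.

```python
import os


def du_sizes(sizes, cluster_size):
    """Total space for files of the given sizes, rounded up per cluster."""
    return sum(-(-size // cluster_size) * cluster_size for size in sizes)


def du(root, cluster_size):
    """Walk a directory tree and total file sizes at a forced cluster size."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                sizes.append(os.path.getsize(os.path.join(dirpath, name)))
            except OSError:
                pass  # skip files that cannot be read
    return du_sizes(sizes, cluster_size)


# Three files of 1, 1025 and 0 bytes, as they would sit on a 1K-cluster floppy.
print(du_sizes([1, 1025, 0], 1024))  # 3072
```

Calling du on a hard disk directory with a cluster size of 1024 estimates
how much room the files themselves would take on a low density floppy.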