[comp.sys.ibm.pc.programmer] Complete guide to MS-DOS file system

jhallen@wpi.wpi.edu (Joseph H Allen) (02/23/90)

Since some have asked over email, here's the Complete MS-DOS File System
Description(tm):

Definitions
-----------

	A sector on a disk is addressed by three quantities: the head (or side)
number, the sector number, and the cylinder number.  All of the sectors on one
side of one cylinder constitute a track.

The head number runs from 0 to Number-of-heads - 1
The sector number runs from 1 to Number-of-sectors-per-track
The cylinder number runs from 0 to Number-of-cylinders - 1

Sectors normally (always?) contain 512 bytes.
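
The three quantities can be flattened into a single linear sector number by
counting sectors first, then heads, then cylinders.  Here is a minimal Python
sketch of that arithmetic.  The ordering is the usual convention, and the
function name and the example geometry (2 heads, 9 sectors per track) are only
illustrative; none of this is a DOS or BIOS interface.

```python
def chs_to_linear(cylinder, head, sector, heads, sectors_per_track):
    """Return the 0-based linear sector number of a cylinder/head/sector address.

    Heads and cylinders count from 0, sectors count from 1, as described above.
    """
    if cylinder < 0:
        raise ValueError("cylinder numbers start at 0")
    if not 0 <= head < heads:
        raise ValueError("head must be in 0 .. heads-1")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be in 1 .. sectors_per_track")
    # One track holds sectors_per_track sectors; one cylinder holds `heads` tracks.
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


if __name__ == "__main__":
    # Hypothetical geometry: 2 heads, 9 sectors per track.
    for chs in [(0, 0, 1), (0, 1, 1), (1, 0, 1)]:
        print(chs, "->", chs_to_linear(*chs, heads=2, sectors_per_track=9))
```

The first sector of the disk (cylinder 0, head 0, sector 1) comes out as linear
sector 0, which is the same counting the partition table uses for its "number
of sectors before partition" field.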

Bios disk interface
-------------------

	The bios calls [int 13H, AH=0], [int 13H, AH=2], and [int 13H, AH=3]
are the primary disk interface calls.  [int 13H, AH=0] resets the disk
controller.  This call is supposed to be made after the read or write calls
return with an error.  [int 13H, AH=2] is the read sectors call and [int 13H,
AH=3] is the write sectors call.  These two calls take the following
parameters: 

AL= Number of sectors to read or write
CH= cylinder number
CL= sector number
DH= head number
DL= drive number
ES:BX= buffer address

Drive numbers 0 to 127 indicate floppy drives (i.e., drive 0 is A, etc.)

Drive numbers 128 to 255 indicate hard drives (i.e., drive 128 is C).  If you
use these calls with hard drives, the parameters are slightly different.

CL= Lower 6 bits are sector number, upper 2 bits are 2 most significant bits
of cylinder number.  I.E., 10 bits are used for the cylinder number, so the
drive can have at most 1024 cylinders.
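
The bit packing described above can be sketched in a few lines of Python.  The
helper names pack_cx and unpack_cx are made up for this example; only the bit
layout comes from the description.

```python
def pack_cx(cylinder, sector):
    """Return (CH, CL) for a 10-bit cylinder number and a 6-bit sector number."""
    if not 0 <= cylinder <= 1023:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 63:
        raise ValueError("sector must be in 1 .. 63")
    ch = cylinder & 0xFF                    # low 8 bits of the cylinder
    cl = ((cylinder >> 8) << 6) | sector    # top 2 cylinder bits, then the sector
    return ch, cl


def unpack_cx(ch, cl):
    """Inverse of pack_cx: return (cylinder, sector)."""
    sector = cl & 0x3F
    cylinder = ((cl & 0xC0) << 2) | ch
    return cylinder, sector


if __name__ == "__main__":
    ch, cl = pack_cx(0x2A5, 17)
    print(hex(ch), hex(cl), unpack_cx(ch, cl))
```

The same packing shows up again in bytes 2-3 and 6-7 of each partition table
entry.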

These calls return with:

CF =1 for error, =0 for no error
AL number of sectors read
AH status code (for when there is an error)

Limitations:  These calls use the DMA controller to transfer the data. 
Therefore, they cannot transfer across 64K boundaries.  I.e., a buffer that
lies within physical addresses 0000 to FFFF is ok, but one that spans FFFF and
10000 will return an error.
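
A buffer can be tested for this before the call is made.  The sketch below
assumes the usual real-mode rule that the physical address is segment * 16 +
offset; the function names are invented for the example.

```python
def physical_address(segment, offset):
    """Real-mode physical address of segment:offset (segment * 16 + offset)."""
    return (segment << 4) + offset


def crosses_64k(segment, offset, nbytes):
    """True if an nbytes-long buffer at segment:offset spans a 64K boundary."""
    if nbytes < 1:
        raise ValueError("buffer must contain at least one byte")
    first = physical_address(segment, offset)
    last = first + nbytes - 1
    # The first and last byte must fall in the same 64K page.
    return (first >> 16) != (last >> 16)


if __name__ == "__main__":
    print(crosses_64k(0x0000, 0xFE00, 512))   # ends exactly at FFFF
    print(crosses_64k(0x0000, 0xFF00, 512))   # runs past FFFF into 10000
```

If the check fails, one simple way out is to allocate a buffer twice as large
as needed and use whichever half does not cross a boundary.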

BIOS boot sequence
------------------

BIOS will try to boot from drive 0 and then from drive 128.  Only these two
drives will be checked.  Sector 1 of head 0, cylinder 0 is loaded at address
7C00.  A jump is then made to 7C00.  At this time, CS=0 and IP=7c00.  All
other registers are indeterminate (or at least, the boot programs treat them
as such).  The drive which was booted from is indicated by a byte in the boot
sector.  This is why only drive 0 and 128 can be booted- there's no way for
the boot program to know whether it was booted from drive A or B (or C or D).

Memory Map
----------

0 to 3ffh	contains the interrupt vectors
400h to 4ffh	contains bios data

All other ram is free for use by the boot program (and the rest of the
operating system for that matter).

Partitioning
------------

If you boot off of a hard disk, first a partition boot program is loaded and
then the actual MS-DOS boot program is loaded.  This boot program can handle 4
partitions.  Each partition has a 16 byte entry.  The four entries are at
offset 1be of the first sector of the hard disk.  Each entry contains:

Offset	Contents
------	--------
0	Byte: 80h for active partition, 0 for inactive

1	Byte: beginning partition side

2	Byte: low order 6 bits are first sector of partition, high order 2 bits
	are most significant bits of first cylinder of partition

3	Byte: low order 8 bits of first cylinder of partition

4	Byte: System type byte.  4=MS-DOS 16 bit FAT, 1=MS-DOS 12-bit FAT,
	5=MS-DOS extended partition

5	Byte: Last side of partition

6	Byte: low order 6 bits are last sector of partition, high order 2 bits
	are most significant bits of last cylinder of partition

7	Byte: low order 8 bits of last cylinder of partition

8	Long: Number of sectors before partition

12	Long: Number of sectors in partition
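
The 16-byte entry maps directly onto a struct unpack.  This is only a sketch:
the field names are made up, and the two Long fields are read as little-endian
32-bit values, as is normal on the PC.

```python
import struct
from collections import namedtuple

# Field names are illustrative, not official ones.
PartitionEntry = namedtuple(
    "PartitionEntry",
    "active start_head start_sector start_cylinder system_type "
    "end_head end_sector end_cylinder sectors_before sector_count",
)


def parse_entry(raw):
    """Decode one 16-byte partition table entry."""
    if len(raw) != 16:
        raise ValueError("a partition entry is exactly 16 bytes")
    flag, h0, s0, c0, systype, h1, s1, c1, before, count = struct.unpack("<8B2L", raw)
    return PartitionEntry(
        active=(flag == 0x80),
        start_head=h0,
        start_sector=s0 & 0x3F,                  # low 6 bits
        start_cylinder=((s0 & 0xC0) << 2) | c0,  # 2 high bits + 8 low bits
        system_type=systype,
        end_head=h1,
        end_sector=s1 & 0x3F,
        end_cylinder=((s1 & 0xC0) << 2) | c1,
        sectors_before=before,
        sector_count=count,
    )


if __name__ == "__main__":
    # Made-up entry: active, 16-bit FAT, starting at head 1, sector 1, cylinder 0.
    raw = bytes([0x80, 1, 1, 0, 4, 3, 0x51, 0x33]) + struct.pack("<LL", 17, 20791)
    print(parse_entry(raw))
```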

Only one partition may be active or the boot program says "Bad Partition
Table".  Plus, only 0 and 128 are valid numbers in the active/inactive byte
(more about this in a second).

Also the two bytes beginning at offset 1fe of the partition boot program must
have 55h AAh in them or the program says "Bad Partition Table".
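
The checks just described can be sketched as follows.  This is not a
disassembly of the real partition boot program, only the same rules written in
Python; the function name is made up, and returning None when no entry is
active is my own choice for the example.

```python
TABLE_OFFSET = 0x1BE      # first of the four 16-byte entries
SIGNATURE_OFFSET = 0x1FE  # 55h AAh lives here


def find_active_partition(sector):
    """Return the index (0-3) of the active entry, or None if there is none.

    Raises ValueError for the cases that are reported as a bad partition table.
    """
    if len(sector) != 512:
        raise ValueError("expected one 512-byte sector")
    if bytes(sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2]) != b"\x55\xAA":
        raise ValueError("missing 55 AA signature")
    active = []
    for index in range(4):
        flag = sector[TABLE_OFFSET + 16 * index]
        if flag == 0x80:
            active.append(index)
        elif flag != 0x00:
            raise ValueError("active/inactive byte must be 80h or 0")
    if len(active) > 1:
        raise ValueError("more than one active partition")
    return active[0] if active else None


if __name__ == "__main__":
    sector = bytearray(512)
    sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] = b"\x55\xAA"
    sector[TABLE_OFFSET + 16] = 0x80     # mark the second entry active
    print(find_active_partition(sector))
```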

When the partition boot program is booted, it copies itself to 600h and tries
to load the first sector of the active partition into 7c00.  You may notice
that the first 4 bytes of the partition entry are laid out perfectly for
loading into DX and CX for the bios read call.  Guess why the active partition
is indicated with an 80h?  Because it will conveniently load into DL to
indicate the drive number.  (this also means that you can't have partitions on
floppies without changing the partition boot program.  You might think that if
bios checked other hard disks this feature would let you boot off of them- but
alas, the partition boot program insists on the byte being either 80h or 0).

The sector which is loaded from the active partition must also contain a 55AA
sequence at offset 1FE or the partition boot program will say:  "Invalid
operating system" 

How does the boot program in the booted partition know which partition it
belongs to? The partition boot program leaves SI (and BP but I think SI is the
standard one) pointing to the partition entry that it booted when it jumps to
7C00 (you'll remember that the partition boot program copied itself down to
600H).  

Extended partitions
-------------------

Now how do you get more than 4 partitions?  You'll remember that a 5 in the
partition type entry means 'extended partition'.  The way this works is the
extended partition looks like a complete disk with not only an operating
system partition but also a partition table at the first sector of the
partition.  This partition table is exactly like the partition boot program
(has 55AA as last two bytes and partition entries beginning at 1be) but it
contains no code (which is stupid because if it did you could make one of the
extended partitions active).  

Now suppose the extended partition has 8 sub-partitions (logical partitions)
in it.  Then the partition table of the entire extended partition will contain
one normal partition entry for the partition and one extended partition entry
for the rest of the extended partition after that normal partition.  The
sub-extended-partition will look the same as the outermost partition- a partition
table with one normal partition and one extended partition for the rest of the
extended partition.

Clarifying picture:

A disk with a primary dos partition and an extended partition with 5 logical
partitions in it.

             A          B          C           D          E        F
	+-+-------+-+--------+-+---------+-+--------+-+-------+-+------+
	| |Primary| |        | |         | |        | |       | |      |
	+-+-------+-+--------+-+---------+-+--------+-+-------+-+------+
         |    ^    |     ^ 
         +----+    +-----+                                       
         |         |         |---------- sub-extended-partition -------|                     
         |         |                             | 
         |         +-----------------------------+
         |          
         |         |------------- extended partition ------------------|
         |                            | 
         +----------------------------+ 

One interesting problem that this might have is that if somehow one of the
partition tables points to itself as an extended partition, MS-DOS might go
into an infinite loop when loading all the partitions.  Anyone want to bet
that ms-dos doesn't check for this?  (actually this is a pretty bad problem...
you wouldn't even be able to boot off of a floppy).
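
For the curious, here is a sketch of walking the chain of partition tables
with a guard against that loop.  Two things are assumptions on my part and are
not stated above: that a normal entry's "sectors before" value is counted from
the table that contains it, and that an extended entry's value is counted from
the start of the outermost extended partition.  The read_sector callback and
the make_table helper are made up so the example can run without a real disk.

```python
import struct

EXTENDED = 5        # system type byte for an extended partition
TABLE_OFFSET = 0x1BE


def make_table(*rows):
    """Build a 512-byte table sector from (type, sectors_before, count) rows."""
    sector = bytearray(512)
    for i, (systype, before, count) in enumerate(rows):
        struct.pack_into("<B", sector, TABLE_OFFSET + 16 * i + 4, systype)
        struct.pack_into("<LL", sector, TABLE_OFFSET + 16 * i + 8, before, count)
    sector[0x1FE:0x200] = b"\x55\xAA"
    return bytes(sector)


def table_entries(sector):
    """Yield (type, sectors_before, count) for each non-empty entry."""
    for i in range(4):
        base = TABLE_OFFSET + 16 * i
        systype = sector[base + 4]
        before, count = struct.unpack_from("<LL", sector, base + 8)
        if systype != 0:
            yield systype, before, count


def logical_partitions(read_sector, extended_start):
    """Return [(first_sector, sector_count), ...] for every logical partition."""
    found, seen = [], set()
    table = extended_start
    while table is not None:
        if table in seen:                       # the loop described above
            raise ValueError("partition table chain points back at itself")
        seen.add(table)
        next_table = None
        for systype, before, count in table_entries(read_sector(table)):
            if systype == EXTENDED:
                next_table = extended_start + before   # assumption, see lead-in
            else:
                found.append((table + before, count))  # assumption, see lead-in
        table = next_table
    return found


if __name__ == "__main__":
    disk = {1000: make_table((4, 17, 500), (5, 600, 600)),
            1600: make_table((1, 17, 300))}
    print(logical_partitions(disk.__getitem__, 1000))
```

Keeping a set of tables already visited costs almost nothing and turns the
infinite loop into an ordinary error.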

One final thing.  Although the partition entries seem to support having the
partitions begin and end anywhere, they always begin on a new track.  This
means that each partition table wastes an entire track.

MS-DOS Boot
-----------

When the MS-DOS boot sector finally gets loaded, it loads the first root
directory sector into address 500H (this will use 500-6FF- leaving the upper
half of the partition boot program with partition entries intact at 700-7FF).
Then it loads ibmbio.com (or io.sys), which must be a contiguous block of
sectors, and runs that.  Why must it be a contiguous block of sectors?  I have
no idea- there is more than enough room in the boot sector to have it look
through the FAT (my MS-DOS clone is going to do this).  IO.SYS (or IBMBIO.COM)
then loads MSDOS.SYS (IBMDOS.COM) and the operating system is finally running.

MS-DOS File System
------------------

I'm sure you've heard it all before.  Each disk (partition) has a boot sector,
1 or more FATs (usually 2), a root directory (1 or more sectors) and a data
area.  The sectors in the data area are grouped together into clusters.  The
number of sectors in each cluster is indicated by a byte in the boot sector. 
The first cluster in the data area is cluster number 2 (and the first 2 FAT
entries don't contain valid cluster pointers).  You can have either a 12 bit
fat or a 16 bit fat.  You have a 16 bit fat if there are more than 4078
clusters.  The maximum partition size is limited to 32MB for all DOSes up to
3.3 because the absolute disk read and absolute disk write calls, and the
word in the boot sector which indicates the number of sectors on the disk, are
only 16 bits.  That allows at most 64K sectors of 512 bytes each, so when you
have 64K clusters, each cluster can contain only 1 sector.  In DOS 4.0, this
field in the boot sector is 4 bytes and the absolute disk read and absolute
disk write calls are different.  You can still only have 64K clusters, but you
can have more sectors and hence a larger cluster size.
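
The arithmetic behind these limits is easy to check.  The sketch below is a
rough model only: it ignores the space taken by the boot sector, FATs and root
directory, it assumes cluster sizes are powers of two, and all the function
names are invented for the example.  The 4078 threshold is the one quoted
above.

```python
SECTOR_SIZE = 512
MAX_CLUSTERS = 65536      # "64K clusters"


def fat_bits(cluster_count):
    """12 or 16, using the 4078-cluster threshold quoted in the text."""
    return 16 if cluster_count > 4078 else 12


def cluster_to_sector(cluster, first_data_sector, sectors_per_cluster):
    """First sector of a cluster; the data area starts at cluster number 2."""
    if cluster < 2:
        raise ValueError("cluster numbers below 2 are not data clusters")
    return first_data_sector + (cluster - 2) * sectors_per_cluster


def size_ceiling(sector_count_bits):
    """Upper bound in bytes on a partition whose sector count has this many bits."""
    return (1 << sector_count_bits) * SECTOR_SIZE


def min_sectors_per_cluster(total_sectors):
    """Smallest power-of-two cluster size that needs at most 64K clusters."""
    spc = 1
    while total_sectors > spc * MAX_CLUSTERS:
        spc *= 2
    return spc


if __name__ == "__main__":
    print(size_ceiling(16) // (1024 * 1024), "MB with a 16-bit sector count")
    print(min_sectors_per_cluster(262144), "sectors per cluster for 128MB")
```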

If you have a 12 bit fat, a fat entry can cross between sectors (yuck). 
Remember that the original ms-dos had disks with 1 sector fats in them.
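
To see why, here is a sketch of reading a 12 bit entry.  It assumes the usual
packing of two entries into three bytes, with each entry read out of a
little-endian 16-bit word; the function names are made up.

```python
def fat12_entry(fat, n):
    """Return entry n of a 12-bit FAT held in a bytes-like object."""
    offset = n + n // 2                      # 1.5 bytes per entry, rounded down
    word = fat[offset] | (fat[offset + 1] << 8)
    # Odd entries use the high 12 bits of the word, even entries the low 12.
    return word >> 4 if n & 1 else word & 0x0FFF


def straddles_sector(n, sector_size=512):
    """True if the two bytes holding entry n lie in different sectors."""
    offset = n + n // 2
    return offset // sector_size != (offset + 1) // sector_size


if __name__ == "__main__":
    fat = bytes([0xF8, 0xFF, 0xFF, 0x03, 0x40, 0x00])
    print([hex(fat12_entry(fat, n)) for n in range(4)])
    print([n for n in range(1024) if straddles_sector(n)])
```

With 512-byte sectors the first entry to straddle is number 341, whose two
bytes sit at offsets 511 and 512 of the FAT.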

If someone wants I can post what's contained in the boot sector and directory
entries- but this information can be found in any of the popular dos technical
guides.

I have one question about the boot sector though.  One of the entries is
called "hidden sectors".  Exactly where are these hidden sectors and what are
they used for?  All hard disks seem to have exactly 1 track of hidden sectors
in each partition.
-- 
            "Come on Duke, lets do those crimes" - Debbie
"Yeah... Yeah, lets go get sushi... and not pay" - Duke

harlow@plains.UUCP (Jay B. Harlow) (02/23/90)

In article <8915@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>I have one question about the boot sector though.  One of the entries is
>called "hidden sectors".  Exactly where are these hidden sectors and what are
>they used for?  All hard disks seem to have exactly 1 track of hidden sectors
>in each partition.
>-- 

Hello (again)....

  If you read my earlier posting before i CANNED it, sorry i was wrong
(confused).   the Hidden sectors are the number of sectors on the hard disk
before this BOOTABLE partition. (master boot partition & other partitions..),
the entry for reserved sectors is used for a possible longer boot record,
which i hope to use in a small Protected mode OS i am playing with...
  so there is normally one track of Hidden sectors before your DOS partition,
the normal first partition on the Drive... only this doesn't hold for 
extended partitions.. :-(,  the boot code itself primarily uses the hidden
sectors to find where on the disk it is....
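
As a rough illustration of that last point (a sketch only, not taken from any
real boot sector), adding the hidden sector count to a sector number counted
from the start of the partition gives a sector number counted from the start
of the disk, which can then be split into cylinder, head and sector for the
bios.  The function name and the 4-head, 17-sector geometry are made up.

```python
def partition_sector_to_chs(relative_sector, hidden_sectors, heads, sectors_per_track):
    """Map a 0-based sector within the partition to (cylinder, head, sector)."""
    absolute = hidden_sectors + relative_sector       # counted from start of disk
    cylinder, rest = divmod(absolute, heads * sectors_per_track)
    head, sector0 = divmod(rest, sectors_per_track)
    return cylinder, head, sector0 + 1                # sector numbers start at 1


if __name__ == "__main__":
    # Hypothetical disk: 4 heads, 17 sectors per track, one hidden track.
    print(partition_sector_to_chs(0, 17, 4, 17))
```

With one track of hidden sectors, sector 0 of the partition lands on head 1,
sector 1 of cylinder 0, i.e. the start of the second track.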

			Jay

-- 
		Jay B. Harlow	<harlow@plains.nodak.edu>
	uunet!plains!harlow (UUCP)	harlow@plains (Bitnet)

Of course the above is personal opinion, And has no bearing on reality...

jhallen@wpi.wpi.edu (Joseph H Allen) (02/23/90)

In article <3538@plains.UUCP> harlow@plains.UUCP (Jay B. Harlow) writes:
>In article <8915@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:

>Hello (again)....

>  If you read my earlier posting before i CANNED it, sorry i was wrong
>(confused).   the Hidden sectors are the number of sectors on the hard disk
>before this BOOTABLE partition. (master boot partition & other partitions..),
>the entry for reserved sectors is used for a possible longer boot record,
>which i hope to use in a small Protected mode OS i am playing with...

Ok, that's good.  Nothing needs to get passed between the partition boot
program and the MS-DOS boot program then (So why is information passed between
these programs? Who knows...  Although, I guess having the boot table already
loaded into memory is good since then you don't have to reread it).  I guess I
would have noticed this if the active DOS partition on my hard disk wasn't
first.  

On DOS 4.0, 4 bytes are reserved for this variable so it extends properly.