[comp.sys.ibm.pc] Links under MS-DOS

jr@oglvee.UUCP (Jim Rosenberg) (08/25/88)

From article <11920@steinmetz.ge.com>, by davidsen@steinmetz.ge.com (William E. Davidsen Jr):
>   I thought I knew MS-DOS pretty well, but how do you create links in
> DOS? I could save a lot of space doing that. I asked a few people around
> here and they don't know about it either.

Links under UNIX work because of two factors.  (1) Directory information is
"detached" from file layout information.  (2) File layout information includes
a specific link count.  DOS is not so bad on count 1, but fails completely on
count 2.  Under UNIX a directory entry contains nothing but the file name and a
number, called an i-number.  This is an index into a linear list of structures
called inodes.  In UNIX it's the inode that really "is" the file, and the
inode has all the information pertaining to ownership, permissions, size,
time and date stamps, and the "root" information for where the file is laid
out on disk.  Under DOS, a directory entry does contain time and date stamps
and attribute bits, so directory information and file layout information are
really only "partially detached".  The directory entry also contains the
number of the first cluster in the file.  Although this is not really like the
UNIX inode concept, it does detach the actual mapping of disk blocks to files
from the directory entry.  This mapping is what you get from the FAT.  The FAT
is a linear list with one entry for each cluster.  Each FAT entry contains
either the number of the next cluster in the file, or a marker if it's the
last cluster.  (This is an oversimplification, but you get the idea.)
*NOWHERE* does DOS maintain a link count.
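
To make the FAT description concrete, here is a minimal sketch in Python of a
toy FAT and a chain walk.  Everything in it is an assumption for illustration:
the dictionary layout of the directory entry, the helper name cluster_chain,
the use of 0 for a free cluster and 0xFFFF for end of chain, and the tiny
10-entry table.  It is not DOS's actual on-disk format or code.

```python
# Toy model: the FAT is a list indexed by cluster number.
# Assumed markers (illustrative only): 0 = free cluster, 0xFFFF = last cluster.
FREE = 0x0000
EOF = 0xFFFF

def cluster_chain(fat, first_cluster):
    """Follow FAT entries from the first cluster and return the whole chain."""
    chain = []
    cluster = first_cluster
    while cluster != EOF:
        # A loop or a pointer into free space means the chain is corrupt.
        if cluster in chain or fat[cluster] == FREE:
            raise ValueError("corrupt chain at cluster %d" % cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

# One file occupying clusters 2 -> 5 -> 3.
fat = [FREE] * 10
fat[2] = 5
fat[5] = 3
fat[3] = EOF

# A directory entry holds the name, attributes and first cluster.
# Note what is missing: there is no link count anywhere.
entry = {"name": "REPORT.TXT", "attr": 0x20, "first_cluster": 2}

print(cluster_chain(fat, entry["first_cluster"]))  # [2, 5, 3]
```

The directory entry only knows where the chain starts; everything after the
first cluster comes from the FAT.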

Now in theory one could have multiple directory entries with the same first
cluster.  One minor problem with this is that they would not necessarily
agree concerning time and date stamp and attribute bits.  But the lack of a
link count is fatal.  Proper behavior by the operating system when a file is
"deleted" is to decrement the link count, and release all its blocks *IF* the
link count goes to zero.  Since DOS assumes no links, it will *always* release
the blocks.  Those will get reallocated the next time any DOS function needs
to allocate a new disk block.  If you had a second directory entry somewhere
with the same starting cluster number, disaster would result:  this cluster
now belongs to a completely different file (not necessarily at the beginning,
either!) and has been overwritten.  In short you have an inconsistent file
system and a completely garbaged file.
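
The failure is easy to reproduce in a toy model.  The sketch below is a
simulation under assumed names (dos_delete, allocate, unix_unlink are all
made up for illustration), with a first-fit allocator standing in for
whatever DOS really does.  It contrasts a delete that always frees the chain
with a UNIX-style unlink that consults a link count.

```python
FREE, EOF = 0x0000, 0xFFFF   # assumed markers for this toy model

def dos_delete(fat, directory, name):
    """Delete with no link count: drop the entry and free every cluster."""
    cluster = directory.pop(name)
    while cluster != EOF:
        nxt = fat[cluster]
        fat[cluster] = FREE
        cluster = nxt

def allocate(fat, directory, name, n_clusters):
    """Create a file from the first n free clusters (first-fit, an assumption)."""
    free = [c for c in range(2, len(fat)) if fat[c] == FREE][:n_clusters]
    for here, nxt in zip(free, free[1:] + [EOF]):
        fat[here] = nxt
    directory[name] = free[0]

def unix_unlink(inodes, directory, name):
    """UNIX-style delete: decrement the count, free blocks only at zero."""
    ino = directory.pop(name)
    inodes[ino]["nlink"] -= 1
    if inodes[ino]["nlink"] == 0:
        inodes[ino]["blocks"] = []

# DOS side: two directory entries sharing first cluster 2, a fake "link".
fat = [FREE] * 8
fat[2], fat[3] = 3, EOF
directory = {"A.TXT": 2, "B.TXT": 2}

dos_delete(fat, directory, "A.TXT")        # frees clusters 2 and 3
allocate(fat, directory, "NEW.DAT", 1)     # grabs cluster 2 again

# B.TXT still claims cluster 2, which now belongs to NEW.DAT.
print(directory["B.TXT"] == directory["NEW.DAT"])   # True

# UNIX side: two names for inode 7, which records nlink = 2.
inodes = {7: {"nlink": 2, "blocks": [40, 41]}}
udir = {"a": 7, "b": 7}
unix_unlink(inodes, udir, "a")
print(inodes[7]["blocks"])                 # [40, 41]
```

In the DOS half, B.TXT ends up pointing into another file's data; in the
UNIX half, the surviving name keeps its blocks because the count only
dropped to 1.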

How might links be added to DOS?  It sure doesn't seem practical to put a link
count in the directory entry.  Aside from the issue that working code might break
if you farkle with the format of a directory entry, you would have to find
all the directory entries linked together to change a link count in all of
them, and this would be a hideous performance hit.  The only reasonable place
to put a link count is in the FAT.  This is not such a crazy idea if the world
could be content with losing one bit in the maximum cluster number.  For, let's
say, a 16-bit FAT, if you treat FAT entries as signed 16-bit ints, the EOF
marker is I believe 0xFFFF, or -1 (strictly, FAT16 treats anything from 0xFFF8
up as end of chain).  If we could live with 15-bit cluster numbers,
that means on the EOF cluster the link count could simply be coded as
-(link count).  Not having seen a line of Microsoft's code (thank goodness!!)
I bet this wouldn't even be that hard.  Not being a DOS wizard, for all I
know that high-order bit may be verboten already anyway as part of a legal
cluster number.
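
Here is a rough sketch of that encoding, again in Python and again purely
hypothetical: the helper names (make_eof, link_count, delete_link) are
invented, and the toy ignores the reserved and bad-cluster values that real
FAT16 keeps at the top of the range (a bad-cluster mark of 0xFFF7 would read
as a link count of 9 here).  Any negative 16-bit entry is taken to mean
"last cluster", and its magnitude is the link count.

```python
import struct

FREE = 0x0000   # assumed free-cluster marker

def as_signed16(entry):
    """Reinterpret an unsigned 16-bit FAT entry as a signed 16-bit int."""
    return struct.unpack("<h", struct.pack("<H", entry))[0]

def is_end_of_chain(entry):
    # With 15-bit cluster numbers, any entry with the high bit set is an EOF.
    return as_signed16(entry) < 0

def link_count(entry):
    return -as_signed16(entry)

def make_eof(links):
    """Encode -(link count) as a 16-bit entry; 1 link gives the usual 0xFFFF."""
    if not 1 <= links <= 0x8000:
        raise ValueError("link count out of range")
    return (-links) & 0xFFFF

def last_cluster(fat, first):
    cluster = first
    while not is_end_of_chain(fat[cluster]):
        cluster = fat[cluster]
    return cluster

def delete_link(fat, first):
    """Drop one link. Free the chain only if the count reaches zero."""
    tail = last_cluster(fat, first)
    links = link_count(fat[tail]) - 1
    if links > 0:
        fat[tail] = make_eof(links)
        return False                 # other directory entries still use it
    cluster = first
    while True:
        nxt = fat[cluster]
        fat[cluster] = FREE
        if is_end_of_chain(nxt):
            return True              # whole chain released
        cluster = nxt

fat = [FREE] * 8
fat[2] = 3
fat[3] = make_eof(2)                 # 0xFFFE: end of chain, two links

print(hex(fat[3]))                   # 0xfffe
print(delete_link(fat, 2))           # False, count drops to 1
print(hex(fat[3]))                   # 0xffff
print(delete_link(fat, 2))           # True, clusters 2 and 3 freed
```

One cost shows up right away: every delete or new link has to walk the
whole chain to reach the last entry, since that is where the count lives.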

It's real simple now, just pick up the telephone, tell Bill Gates you want
this.  I'm sure he'll just whip it right up!  :-)
-- 
Jim Rosenberg                        pitt
Oglevee Computer Systems                 >--!amanue!oglvee!jr
151 Oglevee Lane                      cgh
Connellsville, PA 15425                                #include <disclaimer.h>