[comp.sys.mac.programmer] Missing megabytes

jrk@sys.uea.ac.uk (Richard Kennaway) (03/07/89)

I am writing a program which, among other things, makes a directory
listing of a volume.  There is a strange discrepancy in the
information it collects.

Using PBHGetVInfo, it determines the number of files and folders on the
volume, the number of blocks, and the number of free blocks.  The
difference between the last two figures should be the number of used
blocks.

When it scans through all the files and folders with PBHGetCatInfo
in the usual way, it always finds exactly the right number of files and
folders, but the sum of their sizes is sometimes less than the
calculated number of used blocks.  Some typical figures (numbers of blocks):

                                disk1   disk2   disk3

    ioVNmAlBlks - ioVFrBlk      38600   41600   43758
    total size of all files     36908   41600   43459
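
As a worked check of the figures in the table, the shortfall per disk is just the difference between the two rows. A minimal C sketch; the function name missing_blocks is made up for illustration and is not part of any Toolbox interface:

```c
/* Hypothetical helper: allocation blocks the volume reports as used
 * (ioVNmAlBlks - ioVFrBlk) that the catalog scan did not account for. */
long missing_blocks(long used_per_volume_info, long used_per_catalog_scan)
{
    return used_per_volume_info - used_per_catalog_scan;
}
```

With the numbers above this gives 1692 blocks for disk1, 0 for disk2, and 299 for disk3.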

It calculates the size of a file by rounding both the ioFlPyLen and
ioFlRPyLen components of the PBHGetCatInfo parameter block up to a
multiple of the block size (ioVAlBlkSiz in the block returned by
PBHGetVInfo), dividing by the block size, and adding the two figures.
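
The per-file calculation described above might be sketched in C as follows. This is only an illustration of the arithmetic, not the poster's actual code: the function names are invented, and the three arguments stand in for ioFlPyLen, ioFlRPyLen and ioVAlBlkSiz as plain integers.

```c
#include <assert.h>

/* Round one fork's physical length (in bytes) up to whole allocation blocks.
 * Written as divide-plus-remainder so the addition cannot overflow. */
unsigned long fork_blocks(unsigned long phys_len, unsigned long alloc_block_size)
{
    assert(alloc_block_size > 0);
    return phys_len / alloc_block_size + (phys_len % alloc_block_size != 0);
}

/* Each fork is rounded up on its own, then the two block counts are added. */
unsigned long file_blocks(unsigned long data_phys_len,
                          unsigned long rsrc_phys_len,
                          unsigned long alloc_block_size)
{
    return fork_blocks(data_phys_len, alloc_block_size)
         + fork_blocks(rsrc_phys_len, alloc_block_size);
}
```

Rounding the forks separately matters: a file with one byte in each fork occupies two blocks, not one.  If the physical lengths are already whole multiples of the block size, the rounding changes nothing.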

Am I missing something?  Disk1 and disk2 above are of identical
manufacture (40Mb Qisk).  If there is internal housekeeping
information that occupies allocation blocks but doesn't belong to any
file, why does it occupy 1.7 megs on disk1 and nothing on disk2?
Disk3 is the internal disc of a Mac II.

Inspection of the directory listings showed nothing strange.  In all
cases, the listing was made to a file on a volume other than the one
being scanned.
-- 
Richard Kennaway                SYS, University of East Anglia, Norwich, U.K.
uucp:	...mcvax!ukc!uea-sys!jrk	Janet:	kennaway@uk.ac.uea.sys

kaufman@polya.Stanford.EDU (Marc T. Kaufman) (03/10/89)

In article <457@sys.uea.ac.uk> jrk@s1.UUCP (Richard Kennaway) writes:
>I am writing a program which, among other things, makes a directory
>listing of a volume.  There is a strange discrepancy in the
>information it collects.

>When it scans through all the files and folders with PBHGetCatInfo
>in the usual way, it always finds exactly the right number of files and
>folders, but the sum of their sizes is sometimes less than the
>calculated number of used blocks.

You should be aware that disk space is allocated in "Clumps" of size ioVClpSiz,
where the clump size is chosen so that [disk size / clump size] < 65535,
which is the maximum number of allocation blocks that the file system can
handle.  In addition, there is overhead for the allocation block bit map and
the allocation and directory B-trees, which will not be reflected in the
simple summation of file sizes.

Marc Kaufman (kaufman@polya.stanford.edu)

keith@Apple.COM (Keith Rollin) (03/10/89)

In article <7550@polya.Stanford.EDU> kaufman@polya.Stanford.EDU (Marc T. Kaufman) writes:
>In article <457@sys.uea.ac.uk> jrk@s1.UUCP (Richard Kennaway) writes:
>>I am writing a program which, among other things, makes a directory
>>listing of a volume.  There is a strange discrepancy in the
>>information it collects.
>
>>When it scans through all the files and folders with PBHGetCatInfo
>>in the usual way, it always finds exactly the right number of files and
>>folders, but the sum of their sizes is sometimes less than the
>>calculated number of used blocks.
>
>You should be aware that disk space is allocated in "Clumps" of size ioVClpSiz,
>where the clump size is chosen so that [disk size / clump size] < 65535,
>which is the maximum number of allocation blocks that the file system can
>handle.  In addition, there is overhead for the allocation block bit map and
>the allocation and directory B-trees, which will not be reflected in the
>simple summation of file sizes.

You are confusing clump size (drClpSiz) with the allocation block size 
(drAlBlkSiz). The clump size is used to determine how many contiguous 
allocation blocks to assign to a file as it is being written. This is done to 
reduce fragmentation.  Richard is correctly taking into account the allocation 
block size.

I don't think that the clump size or volume bitmap come into play here. After 
a file is closed, the extent is "cut back" to just the right size needed. Also, 
the volume bitmap is not included in the allocation block section of the hard 
disk; the allocation blocks start just after the bitmap. Compare the values of
drVBMSt and drAlBlSt to see this.
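
The comparison suggested above can be sketched in C. This assumes the volume bitmap holds one bit per allocation block and is stored in 512-byte logical blocks; the function names and the plain-integer parameters (standing in for the two master directory block fields and the allocation block count) are hypothetical.

```c
/* One 512-byte logical block holds 4096 bitmap bits (assumed layout). */
#define BITS_PER_SECTOR (512UL * 8UL)

/* Logical blocks needed for a bitmap with one bit per allocation block. */
unsigned long bitmap_sectors(unsigned long n_alloc_blocks)
{
    return n_alloc_blocks / BITS_PER_SECTOR
         + (n_alloc_blocks % BITS_PER_SECTOR != 0);
}

/* Nonzero if the allocation block area starts at or after the end of the
 * bitmap, i.e. the bitmap does not consume any allocation blocks. */
int bitmap_outside_alloc_area(unsigned long vbm_start,
                              unsigned long n_alloc_blocks,
                              unsigned long al_bl_start)
{
    return al_bl_start >= vbm_start + bitmap_sectors(n_alloc_blocks);
}
```

Under these assumptions a volume with 41600 allocation blocks needs 11 logical blocks of bitmap, so the bitmap is far too small to explain a shortfall of hundreds of allocation blocks in any case.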

I am actually surprised that Disk 2 matched at all. I would have thought that
the two figures would have differed by the sizes of the catalog and extents
B*-trees in all cases. Right now, I am considering 3 scenarios:

1) The 2 sizes should never match: this is because the method used to add up
the files' sizes will never take into account the catalog and extents B*-trees.

2) The 2 sizes should always match: this assumes that the size of the 2
B*-trees is subtracted from the total number of allocation blocks. In this
case, the discrepancies encountered on Disks 1 & 3 could be explained by a
blown allocation bitmap. Try running Disk First Aid.

3) The 2 sizes should sometimes match: this assumes that there are 
characteristics of a fragmented disk that would account for this. For 
instance, a fragmented disk will have an extents B*-tree. For some bizarre 
reason, this could account for the difference.

Off the top of my head, I don't know which one of these is more likely (if any
of them are right at all).


------------------------------------------------------------------------------
Keith Rollin  ---  Apple Computer, Inc.  ---  Developer Technical Support
INTERNET: keith@apple.com
    UUCP: {decwrl, hoptoad, nsc, sun, amdahl}!apple!keith
"Argue for your Apple, and sure enough, it's yours" - Keith Rollin, Contusions

holland@m2.csc.ti.com (Fred Hollander) (03/11/89)

In article <27058@apple.Apple.COM> keith@Apple.COM (Keith Rollin) writes:
>In article <7550@polya.Stanford.EDU> kaufman@polya.Stanford.EDU (Marc T. Kaufman) writes:
>>In article <457@sys.uea.ac.uk> jrk@s1.UUCP (Richard Kennaway) writes:
>>>I am writing a program which, among other things, makes a directory
>>>listing of a volume.  There is a strange discrepancy in the
>>>information it collects.
>>
>>>When it scans through all the files and folders with PBHGetCatInfo
>>>in the usual way, it always finds exactly the right number of files and
>>>folders, but the sum of their sizes is sometimes less than the
>>>calculated number of used blocks.

This is just a guess.  I'm certainly no expert in this area, but is it
possible that the resource fork needs to start at a new block?  If so,
wouldn't you need to round up 2 blocks for files with resource forks,
and 1 block for data only?

Fred Hollander
Computer Science Center
Texas Instruments, Inc.
hollander@ti.com

The above statements are my own and not representative of Texas Instruments.

alexis@ccnysci.UUCP (Alexis Rosen) (03/13/89)

I'm going way out on a limb here, since I'm no expert on the file system
(and Marc is, I think), but...

I believe Marc is confusing the clump size and the allocation block
size.  There are a maximum of 64K allocation blocks on a disk. This has
nothing to do with the clump size. The clump size is simply the minimum
number of alloc. blocks that get allotted to a file when it wants to
grow. (I'm not sure, but I think that this then gets trimmed back when
the file is closed.)
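
To illustrate how the 64K limit ties the allocation block size to the volume size (and not to the clump size), here is a rough C sketch. It assumes a limit of 65535 allocation blocks and block sizes that are multiples of 512 bytes; the function name is invented, and a real formatter may choose a larger block size or account for the volume's own overhead.

```c
#define MAX_ALLOC_BLOCKS 65535UL  /* assumed 16-bit allocation block count */
#define SECTOR_SIZE      512UL    /* assumed unit of allocation block size */

/* Smallest allocation block size (bytes) that lets a volume of the given
 * size be covered by at most MAX_ALLOC_BLOCKS allocation blocks. */
unsigned long min_alloc_block_size(unsigned long volume_bytes)
{
    unsigned long sectors = volume_bytes / SECTOR_SIZE
                          + (volume_bytes % SECTOR_SIZE != 0);
    unsigned long sectors_per_block = sectors / MAX_ALLOC_BLOCKS
                                    + (sectors % MAX_ALLOC_BLOCKS != 0);
    if (sectors_per_block == 0)
        sectors_per_block = 1;    /* never smaller than one sector */
    return sectors_per_block * SECTOR_SIZE;
}
```

On these assumptions a 20 MB volume can use 512-byte allocation blocks, while a 40 MB volume needs at least 1024-byte blocks.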

I wonder if Richard has files on his disk with a nonzero version number.
I know that Inside Mac says that HFS doesn't support this at all, but I'm
not sure that's true. Another possibility is that you have an extents file
on disks 1 and 3 but not on disk 2 (If disk 2 is a forty meg disk, you've
filled it past 99%, so I'm guessing that you defragmented it...). I'm not
too clear on that, though, because what about the catalog file? That must
exist on all three, so you must have counted it for disk 2, so...

The most likely possibility is that there's this little green guy living
in your mac, and... :-)

Alexis Rosen
alexis@ccnysci.uucp

jrk@sys.uea.ac.uk (Richard Kennaway CMP RA) (03/14/89)

Thanks to those who responded to my query about the missing megabytes.

[Recap for those who didn't see the message but might encounter the same
problem:  I had a program that finds the number of blocks in use on a
disc from information returned by PBHGetVInfo.  It also calculates what
should be the same information by using indexed calls of PBHGetCatInfo to
discover the size of every file on the volume and adding up their sizes,
remembering to first separately round up the lengths of the data and resource
forks of each file to a multiple of block size.  But for two hard discs I
tried this on, the latter figure was smaller than the former, on one by ~300
1k blocks and on the other by ~1700.  The strange aspect was that for a
third hard disc, the two figures always came out identical, ruling out
explanations based on the presence of housekeeping info which might be
included in the former figure and excluded from the latter.]

The solution came from keith@Apple.COM (Keith Rollin), who in article
<27058@apple.Apple.COM> writes:
>I am actually surprised that Disk 2 matched at all. I would have thought that
>the two figures would have differed by the sizes of the catalog and extents
>B*-trees in all cases. Right now, I am considering 3 scenarios:
...
>2) The 2 sizes should always match: this assumes that the size of the 2
>B*-trees is subtracted from the total number of allocation blocks. In this
>case, the discrepancies encountered on Disks 1 & 3 could be explained by a
>blown allocation bitmap. Try running Disk First Aid.

That must be it.  I ran Disk First Aid, and it recovered the missing blocks.
I then tried out my program on some other hard discs around the department.
In every case I found at least a few missing blocks, recoverable with Disk
First Aid.  Looks like it's a good idea to run DFA occasionally.  Perhaps
the allocation map is getting damaged by software under development crashing
frequently in horrible ways.  I have done a lot more programming on the disc
that had the 1700 missing blocks than on the one that matched exactly.

In fact, the disc that had the 1700 block deficit still has 6 missing blocks,
but this probably isn't serious.  Disk First Aid reports no problems.  I
would guess that these are permanent defects on the disc that aren't
counted either as free blocks or as belonging to any file.
-- 
Richard Kennaway                SYS, University of East Anglia, Norwich, U.K.
uucp:	...mcvax!ukc!uea-sys!jrk	Janet:	kennaway@uk.ac.uea.sys