jrk@sys.uea.ac.uk (Richard Kennaway) (03/07/89)
I am writing a program which, among other things, makes a directory listing of a volume. There is a strange discrepancy in the information it collects.

Using PBHGetVInfo, it determines the number of files and folders on the volume, the number of blocks, and the number of free blocks. The difference between the last two figures should be the number of used blocks. When it scans through all the files and folders with PBHGetCatInfo in the usual way, it always finds exactly the right number of files and folders, but the sum of their sizes is sometimes less than the calculated number of used blocks.

Some typical figures (numbers of blocks):

                            disk1   disk2   disk3
ioVNmAlBlks - ioVFrBlk      38600   41600   43758
total size of all files     36908   41600   43459

It calculates the size of a file by rounding both the ioFlPyLen and ioFlRPyLen components of the PBHGetCatInfo parameter block up to a multiple of the block size (ioVAlBlkSiz in the block returned by PBHGetVInfo), dividing by the block size, and adding the two figures.

Am I missing something? Disk1 and disk2 above are of identical manufacture (40Mb Qisk). If there is internal housekeeping information that occupies allocation blocks but doesn't belong to any file, why does it occupy 1.7 megs on disk1 and nothing on disk2? Disk3 is the internal disc of a MacII. Inspection of the directory listings showed nothing strange. In all cases, the listing was made to a file on a volume other than the one being scanned.

--
Richard Kennaway
SYS, University of East Anglia, Norwich, U.K.
uucp: ...mcvax!ukc!uea-sys!jrk
Janet: kennaway@uk.ac.uea.sys
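The size calculation described in this post can be sketched as follows. This is a minimal illustration in Python, not the Toolbox calls themselves: the function names are hypothetical, and the arguments merely stand in for the ioFlPyLen, ioFlRPyLen, ioVAlBlkSiz, ioVNmAlBlks and ioVFrBlk fields named in the post.

```python
def fork_blocks(fork_len, block_size):
    """Allocation blocks for one fork: its length rounded up to a whole block."""
    return (fork_len + block_size - 1) // block_size


def file_blocks(data_len, rsrc_len, block_size):
    """Blocks for a file: each fork is rounded up separately, then the two are added."""
    return fork_blocks(data_len, block_size) + fork_blocks(rsrc_len, block_size)


def used_blocks(total_blocks, free_blocks):
    """Blocks in use according to the volume info (ioVNmAlBlks - ioVFrBlk)."""
    return total_blocks - free_blocks


def shortfall(volume_used, file_lengths, block_size):
    """Blocks the volume counts as used that no scanned file accounts for.

    file_lengths is a list of (data fork length, resource fork length) pairs.
    """
    scanned = sum(file_blocks(d, r, block_size) for d, r in file_lengths)
    return volume_used - scanned
```

With the figures quoted in the post, the shortfall is 38600 - 36908 = 1692 blocks on disk1, 0 on disk2, and 43758 - 43459 = 299 on disk3.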
kaufman@polya.Stanford.EDU (Marc T. Kaufman) (03/10/89)
In article <457@sys.uea.ac.uk> jrk@s1.UUCP (Richard Kennaway) writes:
>I am writing a program which, among other things, makes a directory
>listing of a volume. There is a strange discrepancy in the
>information it collects.

>When it scans through all the files and folders with PBHGetCatInfo
>in the usual way, it always finds exactly the right number of files and
>folders, but the sum of their sizes is sometimes less than the
>calculated number of used blocks.

You should be aware that disk space is allocated in "Clumps" of size ioVClpSiz, where the clump size is chosen so that [disk size / clump size] < 65535, which is the maximum number of allocation blocks that the file system can handle.

In addition, there is overhead for the allocation block bit map and the allocation and directory B-trees, which will not be reflected in the simple summation of file sizes.

Marc Kaufman (kaufman@polya.stanford.edu)
keith@Apple.COM (Keith Rollin) (03/10/89)
In article <7550@polya.Stanford.EDU> kaufman@polya.Stanford.EDU (Marc T. Kaufman) writes:
>In article <457@sys.uea.ac.uk> jrk@s1.UUCP (Richard Kennaway) writes:
>>I am writing a program which, among other things, makes a directory
>>listing of a volume. There is a strange discrepancy in the
>>information it collects.
>
>>When it scans through all the files and folders with PBHGetCatInfo
>>in the usual way, it always finds exactly the right number of files and
>>folders, but the sum of their sizes is sometimes less than the
>>calculated number of used blocks.
>
>You should be aware that disk space is allocated in "Clumps" of size ioVClpSiz,
>where the clump size is chosen so that [disk size / clump size] < 65535,
>which is the maximum number of allocation blocks that the file system can
>handle. In addition, there is overhead for the allocation block bit map and
>the allocation and directory B-trees, which will not be reflected in the
>simple summation of file sizes.

You are confusing clump size (drClpSiz) with the allocation block size (drAlBlkSiz). The clump size is used to determine how many contiguous allocation blocks to assign to a file as it is being written. This is done to reduce fragmentation. Richard is correctly taking into account the allocation block size.

I don't think that the clump size or volume bitmap come into play here. After a file is closed, the extent is "cut back" to just the size needed. Also, the volume bitmap is not included in the allocation block section of the hard disk; the allocation blocks start just after the bitmap. Compare the values of drVBMStart and drAlBlSt to see this.

I am actually surprised that Disk 2 matched at all. I would have thought that the two figures would have differed by the sizes of the catalog and extents B*-trees in all cases.

Right now, I am considering 3 scenarios:

1) The 2 sizes should never match: this is because the method used to add up the files' sizes will never take into account the catalog and extents B*-trees.

2) The 2 sizes should always match: this assumes that the size of the 2 B*-trees is subtracted from the total number of allocation blocks. In this case, the discrepancies encountered on Disks 1 & 3 could be explained by a blown allocation bitmap. Try running Disk First Aid.

3) The 2 sizes should sometimes match: this assumes that there are characteristics of a fragmented disk that would account for this. For instance, a fragmented disk will have an extents B*-tree. For some bizarre reason, this could account for the difference.

Off the top of my head, I don't know which one of these is most likely (if any of them are right at all).

------------------------------------------------------------------------------
Keith Rollin --- Apple Computer, Inc. --- Developer Technical Support
INTERNET: keith@apple.com
UUCP: {decwrl, hoptoad, nsc, sun, amdahl}!apple!keith
"Argue for your Apple, and sure enough, it's yours" - Keith Rollin, Contusions
holland@m2.csc.ti.com (Fred Hollander) (03/11/89)
In article <27058@apple.Apple.COM> keith@Apple.COM (Keith Rollin) writes:
>In article <7550@polya.Stanford.EDU> kaufman@polya.Stanford.EDU (Marc T. Kaufman) writes:
>>In article <457@sys.uea.ac.uk> jrk@s1.UUCP (Richard Kennaway) writes:
>>>I am writing a program which, among other things, makes a directory
>>>listing of a volume. There is a strange discrepancy in the
>>>information it collects.
>>
>>>When it scans through all the files and folders with PBHGetCatInfo
>>>in the usual way, it always finds exactly the right number of files and
>>>folders, but the sum of their sizes is sometimes less than the
>>>calculated number of used blocks.

This is just a guess. I'm certainly no expert in this area, but is it possible that the resource fork needs to start at a new block? If so, wouldn't you need to round up 2 blocks for files with resource forks, and 1 block for data only?

Fred Hollander
Computer Science Center
Texas Instruments, Inc.
hollander@ti.com

The above statements are my own and not representative of Texas Instruments.
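Fred's question comes down to whether the two forks are rounded up together or separately. A small sketch of the difference, with hypothetical function names and a made-up file, assuming each fork occupies whole allocation blocks of its own:

```python
def blocks(length, block_size):
    """Ceiling division: whole blocks needed to hold `length` bytes."""
    return -(-length // block_size)


def rounded_together(data_len, rsrc_len, block_size):
    """Treats the two forks as one run of bytes (undercounts if forks don't share blocks)."""
    return blocks(data_len + rsrc_len, block_size)


def rounded_separately(data_len, rsrc_len, block_size):
    """Rounds each fork up on its own, then adds the results."""
    return blocks(data_len, block_size) + blocks(rsrc_len, block_size)
```

For a file with 300 bytes in each fork and 1K blocks, rounding together gives 1 block and rounding separately gives 2. Richard's description of his method already rounds the forks separately, so this alone would not account for the shortfall.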
alexis@ccnysci.UUCP (Alexis Rosen) (03/13/89)
I'm going way out on a limb here, since I'm no expert on the file system (and Marc is, I think), but...

I believe Marc is confusing the clump size and the allocation block size. There are a maximum of 64K allocation blocks on a disk. This has nothing to do with the clump size. The clump size is simply the minimum number of alloc. blocks that get allotted to a file when it wants to grow. (I'm not sure, but I think that this then gets trimmed back when the file is closed.)

I wonder if Richard has files on his disk with a version number not zero. I know that Inside Mac says that HFS doesn't support this at all, but I'm not sure that's true.

Another possibility is that you have an extents file on disks 1 and 3 but not on disk 2 (if disk 2 is a forty meg disk, you've filled it past 99%, so I'm guessing that you defragmented it...). I'm not too clear on that, though, because what about the catalog file? That must exist on all three, so you must have counted it for disk 2, so...

The most likely possibility is that there's this little green guy living in your mac, and... :-)

Alexis Rosen
alexis@ccnysci.uucp
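The limit on the number of allocation blocks ties the allocation block size to the volume size. A rough sketch of that arithmetic, assuming the block count must not exceed 65535 (the figure given earlier in the thread) and that the block size is a multiple of 512 bytes. The function name is hypothetical, and this is not necessarily the formula the system uses when it initializes a disk.

```python
SECTOR = 512               # assumed granularity of the allocation block size
MAX_ALLOC_BLOCKS = 65535   # assumed ceiling on the number of allocation blocks


def min_alloc_block_size(volume_bytes):
    """Smallest multiple of 512 that keeps the block count within the limit."""
    size = SECTOR
    # Ceiling division gives the number of blocks needed at this block size.
    while -(-volume_bytes // size) > MAX_ALLOC_BLOCKS:
        size += SECTOR
    return size
```

Under these assumptions a 40 MB volume needs blocks of at least 1024 bytes, while a 20 MB volume can use 512-byte blocks. The clump size plays no part in this calculation.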
jrk@sys.uea.ac.uk (Richard Kennaway CMP RA) (03/14/89)
Thanks to those who responded to my query about the missing megabytes.

[Recap for those who didn't see the message but might encounter the same problem: I had a program that finds the number of blocks in use on a disc from information returned by PBHGetVInfo. It also calculates what should be the same information by using indexed calls of PBHGetCatInfo to discover the size of every file on the volume and adding up their sizes, remembering to first separately round up the lengths of the data and resource forks of each file to a multiple of block size. But for two hard discs I tried this on, the latter figure was smaller than the former, on one by ~300 1k blocks and on the other by ~1700. The strange aspect was that for a third hard disc, the two figures always came out identical, ruling out explanations based on the presence of housekeeping info which might be included in the former figure and excluded from the latter.]

The solution came from keith@Apple.COM (Keith Rollin), who in article <27058@apple.Apple.COM> writes:
>I am actually surprised that Disk 2 matched at all. I would have thought that
>the two figures would have differed by the sizes of the catalog and extents
>B*-trees in all cases. Right now, I am considering 3 scenarios:
...
>2) The 2 sizes should always match: this assumes that the size of the 2
>B*-trees is subtracted from the total number of allocation blocks. In this
>case, the discrepancies encountered on Disks 1 & 3 could be explained by a
>blown allocation bitmap. Try running Disk First Aid.

That must be it. I ran Disk First Aid, and it recovered the missing blocks.

I then tried out my program on some other hard discs around the department. In every case I found at least a few missing blocks, recoverable with Disk First Aid. Looks like it's a good idea to run DFA occasionally. Perhaps the allocation map is getting damaged by software under development crashing frequently in horrible ways. I have done a lot more programming on the disc that had the 1700 missing blocks than on the one that matched exactly.

In fact, the disc that had the 1700 block deficit still has 6 missing blocks, but this probably isn't serious. Disk First Aid reports no problems. I would guess that these are permanent defects on the disc that aren't counted either as free blocks or as belonging to any file.

--
Richard Kennaway
SYS, University of East Anglia, Norwich, U.K.
uucp: ...mcvax!ukc!uea-sys!jrk
Janet: kennaway@uk.ac.uea.sys
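The "missing" blocks in this thread are ones the allocation bitmap marks as used but that no file owns. A sketch of how such blocks could be listed, under stated assumptions: the bitmap is a byte string with one bit per allocation block, most significant bit first, and each file's space is given as (start, count) extents. The names and data layout are hypothetical, and this is not a description of how Disk First Aid works.

```python
def bitmap_used(bitmap, n_blocks):
    """Block numbers whose bit is set (assumes the MSB of byte 0 is block 0)."""
    return {
        b for b in range(n_blocks)
        if bitmap[b // 8] & (0x80 >> (b % 8))
    }


def owned_blocks(extents):
    """Block numbers covered by a list of (start, count) extents."""
    owned = set()
    for start, count in extents:
        owned.update(range(start, start + count))
    return owned


def orphaned_blocks(bitmap, n_blocks, extents):
    """Blocks marked as used in the bitmap that no extent accounts for."""
    return sorted(bitmap_used(bitmap, n_blocks) - owned_blocks(extents))
```

The length of the list returned by `orphaned_blocks` corresponds to the gap between the used-block count from the volume information and the sum of the file sizes.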