captkidd@athena.mit.edu (Ivan Cavero Belaunde) (02/14/91)
I'm trying to write a routine that takes a file identifier (filename and WDRefNum or filename, dirID and vRefNum) and returns a pointer to a block containing a list of the block numbers on that volume occupied by a file. Although I understand the basic procedure pretty clearly, there are a couple of stumbling blocks that are driving me crazy. Basically a file consists of one or more extents (chunks of contiguous blocks in the volume). Information about the extents is stored in a B*-tree, on which each node has the following structure: struct { NodePtr ndFLink; // Ptr to next leaf node NodePtr ndBLink; // Ptr to prev leaf node char ndType; // Node type char ndLevel; // Node level short ndNRecs; // # of records in node < record 0 > < record 1 > .... < record n > < free space > < offset to free space > < offset to record n > .... < offset to record 1 > < offset to record 0 > } NodeRec, *NodeRec; This structure doesn't seem to be very well defined, however. IM says that all nodes (no matter what kind) are a fixed size, and thus one can access the fields at the end which point to the records (which can be of varying size). I haven't been able to find this fixed size documented anywhere, however (anyone know?). (It's not absolutely necessary, since I know the size records I'm dealing with, but it'd be better). The structure of the records is not clearly documented either, but it would seem to be: struct { unsigned char xkrKeyLen; // Key length char xkrFkType; // Fork type long xkrFNum; // File number short xkrFABN; // Allocation block # within file union { ExtentDesc extents[3]; // Extent descriptors (3 of them) // (if leaf node) long dnNode; // Ptr to node in lowever level // (if index node) } } ExtentRecord; struct { short xtFirstBlk; // # of extent's 1st allocation block short xtNumBlks; // Number of blocks in extent } ExtentDesc; The extents B*-tree is not in RAM, however. It's in a file (second FCB in the FCB buffer). So my question is: what's the format of this file (so I can actually navigate the B*-tree)? Is there a header before the actual data comes? And do the node pointers effectively translate to marks within the extent file? Now it would seem that what I need to do is: 1) Open the file and get a refNum for it. 2) Get the info from the FCB (file number) 3) Get info from the corresponding VCB (vcbAlBlSt for calculating actual block #s, vcbXTRef [the extent file might not always be the second FCB in the FCB buffer]) 4) Navigate the extent tree, searching for the first extent record, then using the data to calculate the search key for the next extent record, and so on, building the list of block #s as we go. Now, has anyone out there done this before? Are there any hairy issues as far as moving the mark for the extent file FCB behind the file manager's back? What about the extent cache, and is there a way to flush it (does it need to be)? And finally, are my data structure definitions correct? I can't seem to find formal definitions *anywhere*, and I just deduced those from the descriptions in IM4. Or maybe there's a really obscure call somewhere that will get me the info I need :-) Doubt it, tho'. I'd really appreciate any insights anyone has on this topic. Thanx y'all. -Ivan Cavero Belaunde Internet: captkidd@ATHENA.MIT.EDU "I suppose you think that was funny." "I don't know. I'll have to consult our humor officer. Mr. Spock, was that funny?" "I shall have to analyze it, sir. It may take time." -From Peter David's "Star Trek" comic
andyp@treehouse.UUCP (Andy Peterman) (02/15/91)
In article <1991Feb13.210702.9407@athena.mit.edu> captkidd@athena.mit.edu (Ivan Cavero Belaunde) writes: >I'm trying to write a routine that takes a file identifier (filename and >WDRefNum or filename, dirID and vRefNum) and returns a pointer to a block >containing a list of the block numbers on that volume occupied by a file. >Although I understand the basic procedure pretty clearly, there are a couple >of stumbling blocks that are driving me crazy. See if you can find a manual for the old disk editor program Fedit Plus. It has a B-Tree description that goes beyond that of IM volume IV. I'll try to answer some of your questions in case you can't find it: >Basically a file consists of one or more extents (chunks of contiguous >blocks in the volume). Information about the extents is stored in a B*-tree, >on which each node has the following structure: > > ... [ stuff deleted] > >This structure doesn't seem to be very well defined, however. IM >says that all nodes (no matter what kind) are a fixed size, and thus >one can access the fields at the end which point to the records (which >can be of varying size). I haven't been able to find this fixed size >documented anywhere, however (anyone know?). (It's not absolutely necessary, >since I know the size records I'm dealing with, but it'd be better). The first node of a B-Tree is a header node (type 1). It will have three records. The first one, starting at an offet of $000E into the node, contains the following: $000E Logical sector number of the last leaf node $0012 The size of each B-Tree node (always 512 bytes or one sector) $0014 Maximum size of a key in the B-Tree. $0016 The total number of nodes in the B-Tree. $001A The number of free nodes in the B-Tree. You can see from the above that the size of a B-Tree node is specified at an offset of $0012 into the B-Tree file (first node, first record). The second record of the B-Tree header node (at $0078) is unused. The third record (at $00F8) contains a bit map of which sectors (nodes) are in use for the B-Tree file. >The extents B*-tree is not in RAM, however. It's in a file (second FCB in the >FCB buffer). So my question is: what's the format of this file (so I >can actually navigate the B*-tree)? Is there a header before the actual >data comes? And do the node pointers effectively translate to marks >within the extent file? To use either the extent or catalog b-tree file, get the Volume Control Block for the volume and get the refNums from vcbXTRef or vcbCtlRef (see IM-IV page 176-177). These files will alread be open and all you have to do is do a standard PBRead using these refnums. (I might as well mention it now - DO NOT WRITE TO EITHER OF THESE FILES!!!!). The files just contain the b-tree nodes - if you read the first 512 bytes of the file, you will get the header node as described above. The next 512 bytes will be node 2, etc. >Now it would seem that what I need to do is: > > 1) Open the file and get a refNum for it. > 2) Get the info from the FCB (file number) > 3) Get info from the corresponding VCB > (vcbAlBlSt for calculating actual block #s, > vcbXTRef [the extent file might not always be the second > FCB in the FCB buffer]) No. See above for an easier solution. > 4) Navigate the extent tree, searching for the first extent record, > then using the data to calculate the search key for the > next extent record, and so on, building the list of block #s > as we go. Well, sort of. If you're looking for the extent of a file, you need to start with the extent entry in its FCB (or catalog B-Tree entry). The first three extents of a file are stored in the fcbExtRec field of the FCB (see IM-IV page 180). Most files, unitl they get fragmented, do not contain entries in the extent b-tree - they just have one to three entries here. >Now, has anyone out there done this before? Are there any hairy issues as >far as moving the mark for the extent file FCB behind the file manager's >back? What about the extent cache, and is there a way to flush it (does it >need to be)? And finally, are my data structure definitions correct? I >can't seem to find formal definitions *anywhere*, and I just deduced those I haven't used the extents file, although I have read from the catalog b-tree file successfully. I don't remember if there is a cache problem, although I think I do a FlushVol before I try reading the file. Good luck!!! -- Andy Peterman | Opinions expressed treehouse!andyp@gvgpsa.gvg.tek.com | are definitely those of (916) 273-4569 | my employer!