rab@murdu.oz (Richard Alan Brown) (09/21/89)
After trying to understand the AOS/VS filesystem, I have become more confused than ever. What I have tried to do is to show how much _real_ disk space is used by a file or group of files. I have written a CLI command to list 'f/len/index/elem/typ/l=filename', and then I have used 'awk' to process this and calculate the numbers. However, the results are clearly incorrect, and reflect my poor understanding of AOS/VS.

Here's what I think I know:

Each file has a length given in bytes. Given the element size for that file (default 4 on our system), the real file size is just the length in bytes taken up to the next multiple of the element size. (E.g. an element size of 4 means 4*512 = 2k bytes, so such files are allocated in 2k chunks.)

But what about index blocks? OK, so I also count the number of index levels in a file, and allow a block for each level. (Is this correct? Are index blocks true blocks, or blocks within a 2k chunk? In other words, does the system lose 4 blocks on the first index 'block' and use this for the next three?)

BUT! I have noticed DG's sneaky compression of files (executables) with large blocks of nulls in them (am I right?), so that a file can seem large (in bytes), while actually taking up much less disk space. (Is this only for PRV files? Why doesn't the file system tell users the 'correct' size?)

Now for the *really* tricky part. Create an empty CPD. Put a file in it (length 0). Start adding data. Who knows how much space the file takes up!? Does the SPACE command include the space taken up by directory entries? How does one calculate that? (Note that when one deletes the file, the CPD is not 'empty'. This presumably is the directory entry...?)

So if there are any Data General employees or hackers out there, maybe you could enlighten me?

Richard Brown (rab@murdu.mu.OZ.AU) or (rab@murdu.ucs.unimelb.OZ.AU)
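The rounding rule described in this post can be written out as a small sketch. It covers only the data elements, not index blocks or the effect of unallocated (all-zero) elements discussed later in the thread. The function name is hypothetical, and treating a zero-length file as occupying no elements is an assumption, not something the thread confirms.

```python
BLOCK_BYTES = 512  # disk block size quoted in the post


def rounded_data_bytes(length_bytes: int, element_blocks: int = 4) -> int:
    """Round a file length up to a whole number of elements.

    element_blocks is the element size in 512-byte blocks (4 is the
    default on the poster's system). Assumption: a zero-length file
    is counted as using no elements.
    """
    if length_bytes <= 0:
        return 0
    element_bytes = element_blocks * BLOCK_BYTES
    elements = -(-length_bytes // element_bytes)  # ceiling division
    return elements * element_bytes
```

With the default element size, a 1-byte file and a 2048-byte file both round to 2048 bytes, while a 2049-byte file rounds to 4096.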
jba@harald.ruc.dk (Jan B. Andersen) (09/21/89)
I think it's impossible, but you can get pretty close by DUMP'ing the files to disk and then checking the size of the dumpfile. The overhead is only for the filename, the ACL, time and date etc.

Postmaster@RUC.dk, DG-passer@RUC.dk, jba@meza.RUC.dk, rucjb@os1100.uni-c.dk
SIMULA does it with CLASS
dik@cwi.nl (Dik T. Winter) (09/22/89)
About getting the disk space used by a specific file.

In article <116@harald.UUCP> jba@harald.ruc.dk (Jan B. Andersen) writes:
> I think it's impossible, but you can get pretty close by DUMP'ing the
> files to disk and then check the size of the dumpfile. The overhead is
> only for the filename, the ACL, time and date etc.

I would think that a dumped file might use more space than a file that is not dumped! The previous poster alluded to compression techniques where large blocks of 0's were not written to disk. They might very well expand when dumping in AOS (I do not know).

Anyhow, there are many OSes where it is impossible to obtain the number of disk blocks used by a specific file. E.g. under Unix (is this heresy?) it is possible to create a file where ls (list files) gives a size of several hundred megabytes and du (disk usage) on the directory reveals that only 2 blocks are used. (And of course under Unix it is possible that the file that uses most space on the system is invisible because, although it is still open, it is already unlinked!)

--
dik t. winter, cwi, amsterdam, nederland
INTERNET : dik@cwi.nl
BITNET/EARN: dik@mcvax
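The Unix case mentioned in this post can be reproduced with a short sketch. It assumes a POSIX system where `os.stat` reports `st_blocks` in 512-byte units and a filesystem that supports holes; on a filesystem without that support the two numbers will simply come out about equal. The function names are hypothetical.

```python
import os
import tempfile


def make_hollow_file(path: str, size: int) -> None:
    """Create a file of `size` bytes by seeking past the end and writing one byte."""
    with open(path, "wb") as f:
        f.seek(size - 1)
        f.write(b"\0")


def apparent_and_allocated(path: str) -> tuple:
    """Return (apparent size in bytes, allocated bytes) for a file.

    The apparent size is what ls shows; the allocated size is what du counts.
    st_blocks is in 512-byte units on POSIX and is not available on Windows.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

Calling `make_hollow_file(path, 100 * 1024 * 1024)` and then `apparent_and_allocated(path)` should report an apparent size of 104857600 bytes; how few bytes are actually allocated depends on the filesystem.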
mjn@sbcs.sunysb.edu (The Sixth Replicant) (09/22/89)
In article <8416@boring.cwi.nl> dik@cwi.nl (Dik T. Winter) writes:
>About getting the disk space used by a specific file.
>
>I would think that a dumped file might use more space than a file that is
>not dumped! The previous poster alluded to compression techniques where
>large blocks of 0's were not written to disk. They might very well expand
>when dumping in AOS (I do not know).

In point of fact, DUMP, DUMP_II and, I believe, MOVE do zero elimination. Whenever there are blocks of zeros, these are not written to the output.

There is a system call which has an option to give the next _allocated_ block. I believe it's ?BLKIO, but it's been several years. This is used by DUMP_II to avoid spending large quantities of CPU time while scanning zeros from large, unallocated files.

-------------------------------------------------------------------------
Marc Neuberger mjn@sbcs.sunysb.edu
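As a rough illustration of the idea of zero elimination (not of the actual DUMP or DUMP_II file format, which this thread does not describe), the sketch below keeps only the 512-byte blocks that contain non-zero data and rebuilds the original from them. The block-aligned approach and both function names are assumptions made for the example.

```python
BLOCK_BYTES = 512


def nonzero_blocks(data: bytes, block_bytes: int = BLOCK_BYTES):
    """Yield (block_number, block) for every block that is not all zeros."""
    for offset in range(0, len(data), block_bytes):
        block = data[offset:offset + block_bytes]
        if block != bytes(len(block)):  # bytes(n) is n zero bytes
            yield offset // block_bytes, block


def rebuild(blocks, total_len: int, block_bytes: int = BLOCK_BYTES) -> bytes:
    """Reassemble the original bytes; blocks that were skipped read back as zeros."""
    out = bytearray(total_len)
    for number, block in blocks:
        start = number * block_bytes
        out[start:start + len(block)] = block
    return bytes(out)
```

Only the kept blocks and the total length need to be stored, which is why a mostly empty file can shrink so much when written this way.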
rgs@igw.megatek.uucp (Rusty Sanders) (09/23/89)
> After trying to understand the AOS/VS filesystem, I have become
> more confused than ever. What I have tried to do, is to show how much _real_
> disk space is used by a file or group of files.
>
> [...]
>
> So if there are any Data General employees or hackers out there, maybe you
> could enlighten me?

Well, I'm not currently a DG hacker, but in a previous life I used to be (I do unix/Sun stuff now, much nicer as far as I'm concerned).

Anyway, you're right. What you are trying to do is VERY difficult with the AOS/VS I remember (it's been a few years, so things MIGHT have changed, but I doubt it).

Anyway, quite a long time ago I wrote a little program to do just what you're asking about. You ran it on a filesystem and it generated a nice listing of all files (in size order), and their actual sizes. It is possible to do, but it's not at all easy.

What do you have to do? Well, first thing is to dismount the filesystem. This means you can't do it on the root, but that's just tough (actually, I re-wrote it to run under MP-AOS, and booted it from a floppy when I wanted to size the root, but I won't ever admit it).

Anyway, the trick is to read the raw disk device, traversing the directory structures, and reading all the raw index blocks for all the files. Sounds like a pain? You bet it was.

Fortunately, DG provides (or at least used to provide) an internals document that described the file structure. If you're still interested, contact your local friendly neighborhood DG sales engineer (does DG call them applications engineers?) and see what you can shake loose. Either that, or beat them up to provide a decent O/S interface that allows you to find things like this out.

And don't ask if I still have that program handy. It long ago was lost in the abyss of corporate hijinks and layoffs. Of course, if you wanted to hire me as a consultant.....

----
Rusty Sanders, Megatek Corp. --> rgs@megatek or...
...ucsd! ...hplabs!hp-sdd! ...ames!scubed! ...uunet!
gary@dgcad.SV.DG.COM (Gary Bridgewater) (09/24/89)
In article <1702@murdu.oz> rab@murdu.oz (Richard Alan Brown) writes:
>I know:
>
>Each file has a length given in bytes. Given the element size for that file
>(default 4 on our system), the real file size is just the length in bytes
>taken up to the next multiple of the element size. (e.g. An element size of 4
>means 4*512 = 2k bytes, so such files are allocated in 2k chunks).

Yes.

>But what about index blocks? OK, so I also count the number of index levels
>in a file, and allow a block for each level (Is this correct? Are index blocks
>true blocks, or blocks within a 2k chunk? In other words, does the system lose
>4 blocks on the first index 'block' and use this for the next three?)

A 0 level file has no index blocks. It is a direct file and its size is one element.
A 1 level file has one disk block (512 bytes) which contains 128 four byte logical disk addresses pointing to data elements.
A 2 level file has one disk block pointing to 128 'level 1' index blocks.
A 3 level file has one disk block pointing to 128 'level 2' index blocks.

>BUT! I have noticed DG's sneaky compression of files (executables) with large
>blocks of nulls in them (am I right?), so that a file can seem large (in bytes),
>while actually taking up much less disk space. (Is this only for PRV files?

We prefer "clever". In the above index block scheme you can have element pointers that are zero. AOS/VS takes this to mean that the entire element is empty and it provides the 0 bytes if you try to read these blocks. This is a great disk space savings for executables and data bases. It will work on ANY kind of file but only if you A) write at least a whole index block's worth of 0s at once or B) use some form of file positioning command to skip data.

Note that it is important when moving such files over the net to use the MOVE/FTA/COMPRESS command rather than just MOVE (RMA form) or MOVE/FTA (no compression). Without both FTA and COMPRESS the transfer takes place a byte at a time and the system won't notice the null blocks. A way to fix files which have been incorrectly grown this way is to DUMP and LOAD them, since DUMP will squeeze out the 0s and LOAD will do positional block writes.

This 'compression' makes exact space computation tricky. It also makes reading such files interesting - study the BLKIO system call, for instance. Its main feature is the ability to skip these empty spaces - that is why DUMP_II can dump such files MUCH faster than DUMP, which reads a byte at a time. It is also why the system can seem to "go into its navel" when READing such a file - no disk activity and the expansion is done at system priority.

>Why doesn't the file system tell users the 'correct' size?)

What is the real size? If you read it a byte at a time you will get an EOF after the Nth byte, so the file is N bytes long. If you ?BLKIO the file an element at a time you can discover how many blocks it is taking up, and from that you can infer the element structure if you map the empty elements. But do you want the CLI to do that every time you say F/LEN? You could submit an STR to have another switch added to the VSII CLI to perform this activity. You could also submit an STR asking that the system maintain a count of the number of elements allocated to a file, which would make the system bigger and slower to provide a rarely needed piece of information.

If you want to know how much space on the disk the file takes, then create a CPD, move the file there, do a SPACE, delete the file, do another SPACE to get the size of the CPD itself, and subtract that from the first size. Crude but exact.

>Now for the *really* tricky part. Create an empty CPD. put a file in it
>(length 0). Start adding data. Who knows how much space the file takes up!?
>Does the SPACE command include the space taken up by directory entries? How
>does one calculate that (Note that when one deletes the file, the CPD is not
>'empty'. This presumably is the directory entry...?).

The size of the directory is the second number above. Directory space is a function of the number of files, any UDAs and the length of the filenames. Directories are also files, so they have index blocks too! Yes, the space a directory takes is included in the size of the directory.

>So if there are any Data General employees or hackers out there, maybe you
>could enlighten me?

You could also order a Filesystem Internals manual, which goes into all the gory details of this.

Another poster mentions using DUMP to discover a file's true size. Won't work - DUMP compresses nulls wherever it finds them, irrespective of block boundaries.

And some Unix versions also do this sort of 'hollow' file optimization. We may very well have inherited it from MULTICS, which is the inspiration for both Unix and AOS(/VS).

The above is my interpretation of How It All Works and should not be interpreted as an Official Version. See the manual and the Release notices. Buy the sources and KNOW enlightenment. Your mileage may vary.

--
Gary Bridgewater, Data General Corp., Sunnyvale Ca.
gary@sv4.ceo.sv.dg.com or {amdahl,aeras,amdcad,mas1,matra3}!dgcad.SV.DG.COM!gary
No good deed goes unpunished.
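The index scheme described in this post (one 512-byte index block holding 128 four-byte addresses, with each extra level multiplying the reach by 128) can be turned into a rough estimate. This is only a sketch under stated assumptions: every element is allocated (no zero pointers), each index block occupies a single 512-byte disk block, and the file uses the minimum number of levels for its size. The function names are hypothetical.

```python
POINTERS_PER_INDEX_BLOCK = 128  # 512-byte block / 4-byte disk addresses


def index_levels(data_elements: int) -> int:
    """Minimum number of index levels needed to reach `data_elements` elements."""
    levels = 0
    capacity = 1  # a 0 level (direct) file is a single element
    while capacity < data_elements:
        capacity *= POINTERS_PER_INDEX_BLOCK
        levels += 1
    return levels


def index_blocks(data_elements: int) -> int:
    """Index blocks needed if every element is allocated (assumption: no holes)."""
    total = 0
    count = data_elements
    while count > 1:
        # Each level needs one index block per 128 entries of the level below.
        count = -(-count // POINTERS_PER_INDEX_BLOCK)  # ceiling division
        total += count
    return total
```

Under these assumptions a file of 128 elements needs one level and one index block, while a file of 129 elements needs two levels and three index blocks (two 'level 1' blocks plus the top block). A file with empty elements would need fewer, which is one reason a computed figure can disagree with what SPACE reports.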
guestx@wave4.webo.dg.com (Guest login for misc) (10/18/89)
In article <741@megatek.UUCP> rgs@igw.megatek.uucp (Rusty Sanders) writes:
> After trying to understand the AOS/VS filesystem, I have become
> more confused than ever. What I have tried to do, is to show how much _real_
> disk space is used by a file or group of files.
>
> [...]
>
> So if there are any Data General employees or hackers out there, maybe you
> could enlighten me?

A little-known feature of AOS/VS II is that this information is available through the ?FSTAT system call. If you have an AOS/VS II system, issue a FILESTATUS/PACKET CLI command and check out locations 25 and 26; these give the actual number of blocks taken up by the file (in octal). This used to work only with CPDs and LDUs, but we made it work with all files.

Standard Disclaimers Apply

Don Lehman
AOS/VS II Development
Internet: don@tzone.ceo.dg.com