kohli@gemed (Jim Kohli, but my friends call me) (09/23/89)
> Richard Alan Brown @ Comp Sci, Melbourne Uni, Australia notes:
> But I have noticed DG's sneaky compression of files
>(executables) with large blocks of nulls in them (am I right?),
>so that a file can seem large (in bytes), while actually taking
>up much less disk space. (Is this only for PRV files? Why
>doesn't the file system tell users the 'correct' size?)
>
AOS/VS "compresses" *ANY* kind of file which has a complete
"element" (i.e., contiguous disk allocation) of zeroes. This
happens to save disk space -- and it does have some advantages
which I think outweigh the disadvantages. This is also one
reason DG discourages making .PR files contiguous -- you may
get faster page faulting, but you will take up more disk space
(which may still be worth the tradeoff).

Why doesn't the file system tell you the 'correct' size? It
doesn't know! What is the 'correct' size, after all? The size
which AOS/VS gives you is the amount of space that the data *IN*
the file would occupy if it had no compression applied to it.
This number should be treated as the size of the file in all
cases where disk space allocation is not critical. Obviously,
if a file is contiguous, and has any non-zero data in it, the
only space it will occupy is directory overhead space.

If you really need to find out exactly how much space is occupied
by a single file, your best recourse is to create a CPD, make a
junk file in it (to init the directory), delete the junk file,
*NOW* note the space in the directory, move your file into it and
note the difference in space. This method will not work for a
series of files unless you recreate the CPD for each file (because
the directory overhead space is not reclaimed).

>Now for the *really* tricky part. Create an empty CPD. put a
>file in it (length 0). Start adding data. Who knows how much
>space the file takes up!?
>
You are pretty close -- see the above...

> Does the SPACE command include the
>space taken up by directory entries?
>
You betcha

> How does one calculate
>that (Note that when one deletes the file, the CPD is not
>'empty'. This presumably is the directory entry...?).
>
Correct, again! How badly do you need to know? Dumping stuff
won't really do much because the same "rule of compression"
applies to dumpfiles. DG gets a lot of flak from people who
really need to know how much space their file is really taking,
and I believe their method is "make CPD, make junk file, delete
junk file, write down space, move file into CPD, subtract old
space from new space".

Jim

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
decrepitate: 1. to roast or calcine (salt, minerals, etc.) so as
                to cause crackling or until crackling ceases...
In a context: "Oh, my brain is decrepitating! Aaaaaaarrrgggghh!!!!"
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Jim Kohli               | "Oh Grammar! Water bag icer gut!
GE Medical Systems      |  A nervous sausage bag ice!"
PO Box 414              |
Milwaukee, WI 53201-414 |
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
gemed!hal!kohli@crd.ge.com        sun!sunbird!gemed!hal!kohli
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
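The CPD measurement recipe in the post above can be sketched as a toy model. This is not AOS/VS code: the class name ToyCPD, the function measure, and the 4-block directory overhead are all made up for illustration. The only behavior taken from the post is that the first entry initializes the directory and that this overhead is not given back when the entry is deleted.

```python
# Toy model of measuring one file's true disk usage with a fresh CPD.
# All names and numbers are hypothetical; this is not an AOS/VS interface.

DIRECTORY_OVERHEAD_BLOCKS = 4  # assumed cost of initializing the directory


class ToyCPD:
    """A control point directory whose overhead is never reclaimed."""

    def __init__(self):
        self.files = {}           # name -> blocks actually allocated
        self.overhead_blocks = 0  # stays at zero until the first entry

    def add(self, name, allocated_blocks):
        if self.overhead_blocks == 0:
            # The first entry initializes the directory.
            self.overhead_blocks = DIRECTORY_OVERHEAD_BLOCKS
        self.files[name] = allocated_blocks

    def delete(self, name):
        # The file's blocks are freed; the directory overhead stays.
        del self.files[name]

    def space_used(self):
        return self.overhead_blocks + sum(self.files.values())


def measure(allocated_blocks, init_with_junk=True):
    """Return the space difference seen when a file is moved into a new CPD."""
    cpd = ToyCPD()
    if init_with_junk:
        cpd.add("JUNK", 1)   # make junk file (inits the directory)
        cpd.delete("JUNK")   # delete junk file
    before = cpd.space_used()            # write down space
    cpd.add("MYFILE", allocated_blocks)  # move file into CPD
    return cpd.space_used() - before     # new space minus old space


if __name__ == "__main__":
    print(measure(10))                        # junk step done first
    print(measure(10, init_with_junk=False))  # junk step skipped
```

In this model, skipping the junk-file step makes the measured difference include the directory overhead as well as the file itself, which is why the recipe initializes the directory before taking the first reading.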
mjn@sbcs.sunysb.edu (The Sixth Replicant) (09/23/89)
In article <1055@mrsvr.UUCP> kohli@gemed.med.ge.com (Jim Kohli, but my friends call me) writes:
>AOS/VS "compresses" *ANY* kind of file which has a complete
>"element" (i.e., contiguous disk allocation) of zeroes. This
>...

I just want to clarify one point here. AOS/VS doesn't do any compression.
The file system allows users to have unallocated blocks in files. When
read, these blocks will be returned as zeros. Certain commands (MOVE,
DUMP, DUMP_II) exploit this property to save space. If, however, I write
an element of zeros to disk, I'll get an element of zeros on disk; the
file system doesn't scan for zeros to see if it can optimize the write.
This would take far too much CPU.

I don't honestly know what UNIX does when I write to a high block number
in an empty file, but I'd be somewhat surprised if it doesn't allow for
unallocated blocks. The description of WRITE in "The Design of the UNIX
Operating System" by Bach (p 101) suggests to me that the intervening
space is left unallocated.

I think the difference between AOS/VS and UNIX may be that AOS/VS
provides the CPD, in which you are told _exact_ space usage, whereas in
UNIX you only have du, which I tend to suspect doesn't give the precise
number.

-----------------------------------------------------------------------------
Marc Neuberger                                  mjn@sbcs.sunysb.edu
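The UNIX question raised above (what happens when you write far past the end of an empty file) can be checked with a short experiment. This is a minimal sketch, assuming a POSIX-like system whose filesystem supports unallocated blocks; the function name make_sparse_file and the file name are made up here. It seeks past the end of a new file, writes one byte, and compares the logical size with the space the system reports as allocated. Whether the skipped range stays unallocated depends on the filesystem, so no particular allocated figure is promised.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes, payload=b"x"):
    """Seek past the end of an empty file, then write a small payload."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # nothing is written for the skipped range
        f.write(payload)
    st = os.stat(path)
    # st_blocks counts 512-byte blocks; it is not available on every platform.
    blocks = getattr(st, "st_blocks", None)
    allocated = None if blocks is None else blocks * 512
    return st.st_size, allocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse.dat")
        size, allocated = make_sparse_file(path, 10 * 1024 * 1024)
        print("logical size in bytes:", size)
        print("allocated bytes (None if unknown):", allocated)
```

Where holes are supported, the allocated figure should come out far smaller than the logical size, which is the same gap between reported size and real disk usage discussed for AOS/VS.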
kohli@gemed (Jim Kohli, but my friends call me) (09/24/89)
Path: mrsvr.UUCP!csd4.csd.uwm.edu!uwm.edu!mailrus!ncar!boulder!sunybcs!sbcs!mjn
From: mjn@sbcs.sunysb.edu (The Sixth Replicant)
Newsgroups: comp.os.aos
Subject: Re: The *real* file size (was: How to find *real* file sizes in AOS/VS...?)
Keywords: AOS/VS filesize fubar
Message-ID: <3540@sbcs.sunysb.edu>
Date: 23 Sep 89 00:29:22 GMT
References: <1055@mrsvr.UUCP>
Sender: news@sbcs.sunysb.edu
Reply-To: mjn@sbstaff2.UUCP (The Sixth Replicant)
Organization: Tyrell Corp.
Lines: 23

>In article <1055@mrsvr.UUCP> I (Jim Kohli) wrote
>>AOS/VS "compresses" *ANY* kind of file which has a complete
>>"element" (i.e., contiguous disk allocation) of zeroes. This
>>...
>I just want to clarify one point here. AOS/VS doesn't do any compression.
>The file system allows users to have unallocated blocks in files. When
>read, these blocks will be returned as zeros. Certain commands (MOVE,
>DUMP, DUMP_II) exploit this property to save space. If, however, I write
>an element of zeros to disk, I'll get an element of zeros on disk; the
>file system doesn't scan for zeros to see if it can optimize the write.
>This would take far too much CPU.
>
You are right! I had originally believed this fairy tale because
I was involved in doing some I/O benchmarking which involved a
lot of ?RDB's/?WRB's, and DG had one of their high power support
dudes (Dave Barrows) criticize our results as follows: "well, if
you're only reading and writing zeroes, they aren't actually
written to the disk..." (this is a recollection, but it isn't
vague on this point). I guess it was easy to rationalize at the
time because it seemed possible that the PTE's might maintain a
"zero page" bit (no such bit has been documented to my knowledge,
but this was in 1982 when the soul of the new machine was more
like an unfriendly spirit).

Sorry about that bbbbbboard readers!

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
A sun 3/50 with 8 MB isn't *JUST* a conspicuous consumption
of silicon!
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Jim Kohli               | "Oh Grammar! Water bag icer gut!
GE Medical Systems      |  A nervous sausage bag ice!"
PO Box 414              |  (Oar aesthete groin-murder???)
Milwaukee, WI 53201-414 |
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
gemed!hal!kohli@crd.ge.com        sun!sunbird!gemed!hal!kohli
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
meissner@dg-rtp.dg.com (Michael Meissner) (10/14/89)
> From: kohli@gemed (Jim Kohli, but my friends call me)
> Newsgroups: comp.os.aos
> Keywords: AOS/VS filesize fubar
> Date: 22 Sep 89 20:08:47 GMT
> Reply-To: kohli@gemed.med.ge.com (Jim Kohli, but my friends call me)
> Organization: GE Medical (Applied Science Lab)
>
> > Richard Alan Brown @ Comp Sci, Melbourne Uni, Australia notes:
> > But I have noticed DG's sneaky compression of files
> >(executables) with large blocks of nulls in them (am I right?),
> >so that a file can seem large (in bytes), while actually taking
> >up much less disk space. (Is this only for PRV files? Why
> >doesn't the file system tell users the 'correct' size?)
> >
> AOS/VS "compresses" *ANY* kind of file which has a complete
> "element" (i.e., contiguous disk allocation) of zeroes. This
> happens to save disk space -- and it does have some advantages
> which I think outweigh the disadvantages.

Wrong. If you actually write an element's worth of zeros, you will
get a block allocated on the disk. If you call ?ALLOCATE it will
allocate blocks on the disk. The only ways you get holes are:

   1)	You do a ?SPOS to a location that is at least an element-
	size beyond the current end of the file; or

   2)	You run a 32-bit program that ?SPAGES more than a couple
	of pages (I forget what the threshold is between where the
	system allocates the pages for you, and where the pages
	are not created until you touch them). I'm not entirely
	positive about this last case.

The linker will use ?SPOS when linking common sections together that
don't have any initializations. All of the DUMP/LOAD commands will
check for zero blocks and not write the blocks on the tape, and use
?SPOS to create the hole. I'm not sure about COPY.

The newer DUMP commands will use the ?BLKIO system call to bypass any
holes in a file, while the older DUMP and early versions of DUMP_II
would become CPU intensive (the kernel would realize that the element
was not on disk, and zero fill the page -- meanwhile DUMP would then
rapidly search the page to see if it contained all zeros, and if it
did, chuck it out the window). On my development system, we once had
a 40+ Meg file that had been created by a bad ?SPOS (the file was only
about 1 Meg), and it took quite some time before the system load went
back to normal.

Normal disclaimer -- I only speak for myself, and not for Data General
--
Michael Meissner, Data General.			If compiles were much
Uucp:		...!mcnc!rti!xyzzy!meissner	faster, when would we
Internet:	meissner@dg-rtp.DG.COM		have time for netnews?
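The distinction drawn in the post above (written zeros are allocated, positioning past the end leaves a hole, and DUMP/LOAD turns zero blocks into holes) can be sketched with a toy model. Nothing here is the AOS/VS interface: ToyFile, set_position, dump_and_load and the 2048-byte ELEMENT_SIZE are assumptions chosen for illustration, with set_position standing in loosely for a ?SPOS-style call.

```python
# Toy model of a file built from fixed-size elements, some of them holes.
# ELEMENT_SIZE and every name here are hypothetical, not AOS/VS calls.

ELEMENT_SIZE = 2048  # assumed element size in bytes


class ToyFile:
    def __init__(self):
        self.elements = {}  # element index -> bytearray (allocated only)
        self.length = 0     # logical length in bytes
        self.pos = 0

    def set_position(self, pos):
        """Move the file position; this allocates nothing by itself."""
        self.pos = pos

    def write(self, data):
        """Allocate every element the write touches, even for zeros."""
        for i, byte in enumerate(data):
            index, offset = divmod(self.pos + i, ELEMENT_SIZE)
            element = self.elements.setdefault(index, bytearray(ELEMENT_SIZE))
            element[offset] = byte
        self.pos += len(data)
        self.length = max(self.length, self.pos)

    def read_element(self, index):
        """Unallocated elements read back as zeros."""
        return bytes(self.elements.get(index, bytes(ELEMENT_SIZE)))

    def allocated_bytes(self):
        return len(self.elements) * ELEMENT_SIZE


def dump_and_load(src):
    """Copy a file, skipping all-zero elements so they become holes."""
    dst = ToyFile()
    count = -(-src.length // ELEMENT_SIZE)  # ceiling division
    for index in range(count):
        data = src.read_element(index)
        if any(data):  # only elements with some non-zero byte are written
            dst.set_position(index * ELEMENT_SIZE)
            dst.write(data)
    dst.length = src.length
    return dst


zeros = ToyFile()
zeros.write(bytes(ELEMENT_SIZE))         # explicit zeros: one element allocated

holey = ToyFile()
holey.set_position(10 * ELEMENT_SIZE)    # position well past the end
holey.write(b"x")                        # only the last element is allocated

restored = dump_and_load(zeros)          # the zero element comes back as a hole
```

In this model, zeros occupies a full element although it holds nothing but zeros, holey has eleven elements of logical length with one allocated, and restored keeps the logical length of zeros with nothing allocated.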