deng@shire (Mingqi Deng) (02/15/89)
While using NC and Norton utility programs, I found that disk space was used very inefficiently by ASCII files (nearly 40% wasted).

Both NC and Norton showed that some of my text files occupied up to 78% more space than their 'real size' (as shown by the DIR command). (What was actually shown is the space used by all the files in a directory. Norton's FS gives an explicit report of the percentage of slack; I worked it out myself for NC.)

I am confused. I know each sector is 512 bytes (or 256? The exact figure is not very important here) in DOS 3.3, and any file whose size is not a multiple of 512 bytes will leave the last sector allocated to it partially empty. But this only increases the space used by about 500 bytes, and my files are not just 200 bytes in size. That is why I am puzzled.

One reasonable explanation I can think of is the following. The text files could have been created over many editing sessions, and DOS simply allocates additional sectors wherever new text has been inserted, rather than rewriting the whole file to optimize space. This is a trade-off between speed and space, but for large files it is certainly a big waste.

Does anybody know how to 'compact' the disk space used by text files?

Thanks.

Mingqi
deng@shire.cs.psu.edu
deng@psuvaxs.bitnet
deng@psuvaxs.UUCP
silver@eniac.seas.upenn.edu (Andy Silverman) (02/16/89)
In article <4295@psuvax1.cs.psu.edu> deng@shire (Mingqi Deng) writes:
>While using NC and Norton utility programs, I found that disk spaces
>were used very inefficiently with ASCII files (nearly 40% wasted).
>
>Both NC and Norton showed that the space used by some of my text files
>used space up to 78% more than their 'real size' (shown by DIR
>command). (What was actually shown is the space used by all the files
>in a directory. FS of Norton gives an explicit report of percentage
>slackness. I figured out for NC.)
>
>I am confused. I know each sector is 512 (256? The real figure is not
>very important here.) bytes in DOS 3.3 and any file whose size is not a
>multiple of 512 bytes will leave last sector allocated for it to be
>partially empty. But this only makes the space used increase by about
>500 bytes. My files are not just 200 bytes in size. That is why I am
>puzzled.
>

Well, the fallacy here is that while a sector in DOS does indeed contain 512 bytes, all files are allocated in minimum units of "clusters." A cluster on a floppy is usually 2 sectors (1K of data), and hard drives have cluster sizes ranging from 4 to 16 sectors, depending on the size of the drive. So while you may have a text file that's only 15 bytes, it will take up a minimum of 1K on a floppy, or 2K or even 8K on a hard disk.

Norton's program reports on "slack," which is the difference between a file's true size and the amount of space it takes up on the drive (determined by cluster size).

One way to make text files take up less drive space is to use a program like ARC or PKPAK to combine several text files into one large library file, so that the slack is paid once for the library rather than once per file. The library's contents are also compressed, which reduces disk usage even more.

Andy Silverman
Internet: silver@eniac.seas.upenn.edu
CompuServe: 72261,531
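The cluster arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `space_on_disk` and `slack` are made up for this example, the cluster sizes are the ones mentioned in the post, and a zero-length file is assumed to occupy no clusters at all.

```python
def space_on_disk(size_bytes, cluster_bytes):
    """Space a file occupies when allocation happens in whole clusters.

    Assumption for this sketch: an empty file occupies no clusters.
    """
    if size_bytes == 0:
        return 0
    clusters = -(-size_bytes // cluster_bytes)  # ceiling division
    return clusters * cluster_bytes


def slack(size_bytes, cluster_bytes):
    """Bytes allocated to the file but not used by its contents."""
    return space_on_disk(size_bytes, cluster_bytes) - size_bytes


if __name__ == "__main__":
    # The 15-byte text file from the post, on the cluster sizes it mentions.
    for label, cluster in [("floppy, 1K clusters", 1024),
                           ("hard disk, 2K clusters", 2048),
                           ("hard disk, 8K clusters", 8192)]:
        print(label, space_on_disk(15, cluster), slack(15, cluster))
```

Because the slack is charged per file, a directory full of small files wastes far more than the "about 500 bytes" that a sector-based estimate would suggest.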
hollen@spot.megatek.uucp (Dion Hollenbeck) (02/17/89)
From article <4295@psuvax1.cs.psu.edu>, by deng@shire (Mingqi Deng):
> While using NC and Norton utility programs, I found that disk spaces
> were used very inefficiently with ASCII files (nearly 40% wasted).
>
> [...stuff deleted...]
>
> I am confused. I know each sector is 512 (256? The real figure is not
> very important here.) bytes in DOS 3.3 and any file whose size is not a
> multiple of 512 bytes will leave last sector allocated for it to be
> partially empty. But this only makes the space used increase by about
> 500 bytes. My files are not just 200 bytes in size. That is why I am
> puzzled.
>

One factor you have not taken into account is allocation unit size. DOS does not allocate one sector at a time, but one allocation unit at a time. The size of the allocation unit depends on the size of the disk (the bigger the disk, the bigger the allocation unit); otherwise, the standard File Allocation Table would not be big enough to map large disks. Due to this scheme, a file containing only 1 byte could take up as much as 4096 bytes on a very large disk.

Try increasing your file sizes by multiples of 512 and see when the next jump in actual size allocated happens. You should then know what the allocation unit size is on your disk.

Dion Hollenbeck (619) 455-5590 x2814
Megatek Corporation, 9645 Scranton Road, San Diego, CA 92121
seismo!s3sun!megatek!hollen ames!scubed/
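The experiment suggested above (grow a file in 512-byte steps and watch for the jump in allocated size) can be simulated as a rough Python sketch. Everything named here is hypothetical: `simulated_allocation` stands in for whatever tool reports the space actually allocated on a real disk, and the sketch assumes the allocation unit is a whole multiple of the 512-byte step.

```python
def simulated_allocation(size_bytes, unit_bytes):
    """Stand-in for a real measurement: space allocated in whole units."""
    if size_bytes == 0:
        return 0
    return -(-size_bytes // unit_bytes) * unit_bytes  # round up to whole units


def probe_allocation_unit(allocated, step=512, limit=1 << 20):
    """Grow a file from 1 byte in `step`-byte increments.

    `allocated` is any function mapping a file size to the space it occupies.
    Returns the size of the first jump in allocated space, which equals the
    allocation unit when the unit is a multiple of `step`. Returns None if
    no jump is seen before the file exceeds `limit` bytes.
    """
    size = 1
    base = allocated(size)
    while size <= limit:
        size += step
        now = allocated(size)
        if now != base:
            return now - base
    return None


if __name__ == "__main__":
    # Pretend the disk uses 2K allocation units and recover that by probing.
    unit = probe_allocation_unit(lambda n: simulated_allocation(n, 2048))
    print("allocation unit appears to be", unit, "bytes")
```

On a real disk the `allocated` function would have to come from a utility that reports occupied space per file; the simulation only shows why the first jump reveals the unit size.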
simon@ms.uky.edu (Simon Gales) (02/22/89)
In article <495@megatek.UUCP> hollen@spot.megatek.uucp (Dion Hollenbeck) writes:
> ...
>Try increasing
>your file sizes by multiples of 512 and see when the next jump in actual
>size allocated happens and you should then know what the allocation
>unit size is on your disk.
>

Just run chkdsk; it will tell you your cluster (allocation unit) size. I have a 40meg hd running under DOS 4, and the cluster size is 2K.

--
Simon Gales@University of Ky
{rutgers, uunet}!ukma!simon - simon@ms.uky.edu - simon@UKMA.BITNET