[comp.unix.xenix] My COMPRESS won't UNCOMPRESS!

perry@bluemtn.uucp (Perry Minyard (3MTA3)) (08/31/90)

Help me!  I compressed a 20 megabyte file (my VPIX Drive C:) and the
result was a 13 meg file, but when I uncompress it, it's now only 1.3 meg!

I'm using SCO XENIX 2.3.3.
Has anyone heard of bugs with the compress/uncompress utilities?  Is there
a patch to fix it?

This really sux!

-Perry Minyard
-- 
"A mind is a terrible thing to taste."   | Perry Minyard
- Ministry                               | ..!emory!bluemtn!perry
-------------------------------------------------------------------------

Greg Wettstein <NU013809@NDSUVM1.BITNET> (08/31/90)

I believe you have encountered what is referred to in the trade as a
'sparse' file.  I would predict that if you kick up VPIX and let it look
at your Drive C you would find that everything is normal.

When a UNIX(c) based operating system writes to a file, it does not
necessarily allocate all the disk space that a file of the reported size
would require; regions that were never written need not occupy any blocks.
Since I just read the previous sentence and it is somewhat confusing, I
will explain with an example.

Take as an example a file which is 512 bytes long.  Assume that this file
was created by writing 512 ASCII space characters (0x20) to the file.  This
would consume 512 bytes, or one block, of disk space.

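Now assume that you are writing in C and run the following fragment:

    /* Needs <unistd.h> for lseek/write and <string.h> for memset. */
    char buffer[512];
    int handle;

    /* Code for opening file deleted. */

    /* Seek 1,000,000 bytes past the start; the offset is a long. */
    lseek(handle, 1000000L, SEEK_SET);
    memset(buffer, ' ', sizeof buffer);
    write(handle, buffer, sizeof buffer);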

In theory this code fragment would seek 1,000,000 bytes into the file and
then write 512 ASCII space characters at that offset.  The operating
system will report the file to be 1,000,512 bytes in length using ls -l.

The catch is that although the operating system thinks the file is slightly
larger than one megabyte in size, the actual amount of disk space allocated
to the file is probably around 1024 bytes, or whatever the minimum
allocation is for your particular operating system.  A file created in this
manner is known as a 'sparse' file.
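
To make the gap between reported size and allocated space concrete, here is
a minimal sketch of the experiment above.  It assumes a modern POSIX system
whose struct stat has an st_blocks field counted in 512-byte units, which
may not hold on older systems such as XENIX.  The names make_sparse and
sparse_report are hypothetical helpers invented for this illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` with a hole of `hole` bytes followed by 512 spaces.
 * Returns 0 on success, -1 on failure. */
int make_sparse(const char *path, off_t hole)
{
    char buffer[512];
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    memset(buffer, ' ', sizeof buffer);
    /* Seeking past end-of-file and then writing leaves a hole behind. */
    if (lseek(fd, hole, SEEK_SET) == (off_t)-1 ||
        write(fd, buffer, sizeof buffer) != (ssize_t)sizeof buffer) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Report the size ls -l would show and the bytes actually allocated.
 * Assumption: st_blocks is in 512-byte units, as on most systems. */
int sparse_report(const char *path, off_t *logical, off_t *allocated)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    *logical = st.st_size;
    *allocated = (off_t)st.st_blocks * 512;
    return 0;
}
```

Calling make_sparse("demo", 1000000L) and then sparse_report on the same
path should give a logical size of 1,000,512 bytes.  On a file system that
supports holes the allocated figure will typically be far smaller, though
the exact number depends on the file system's block size.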

I have found 'sparse' files to be somewhat unpredictable in terms of how
they will be handled by various file utilities.  It has been my experience
that if a sparse file is copied using cp, the utility will expand the file,
and the copy produced will actually consume the full size that the
operating system reports for the original.  I found this out when I tried
to copy a particular database file created by one of our past software
vendors, which the operating system reported to be 65 megabytes in size and
which contained a grand total of about 1.2 megabytes of data.  When I
unknowingly copied the file, cp obligingly filled the 80 megabyte hard disk
the system was running on and the whole operation gently came down around
my ears.
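
For comparison, a copy routine can avoid filling in the holes by seeking
over blocks that read back as all zeros instead of writing them.  The
following is only a sketch of that idea, not how any particular cp is
implemented; sparse_copy, all_zero and the 512-byte BLK size are
assumptions made for illustration, and short writes are simply treated as
errors.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK 512   /* assumed block size for this sketch */

/* Return 1 if the first n bytes of buf are all zero, else 0. */
static int all_zero(const char *buf, ssize_t n)
{
    ssize_t i;

    for (i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Copy src to dst, seeking over zero-filled blocks rather than writing
 * them.  Returns 0 on success, -1 on error. */
int sparse_copy(const char *src, const char *dst)
{
    char buf[BLK];
    ssize_t n;
    off_t total = 0;
    int in, out, rc = -1;

    in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (all_zero(buf, n)) {
            /* Leave a hole in the destination instead of writing zeros. */
            if (lseek(out, n, SEEK_CUR) == (off_t)-1)
                goto done;
        } else if (write(out, buf, (size_t)n) != n) {
            goto done;
        }
        total += n;
    }
    /* A trailing hole is not recorded by lseek alone, so set the final
     * length explicitly. */
    if (n == 0 && ftruncate(out, total) == 0)
        rc = 0;
done:
    close(in);
    if (close(out) < 0)
        rc = -1;
    return rc;
}
```

The copy has the same reported length and contents as the original.
Whether it actually occupies less disk space depends on the destination
file system supporting holes.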

I 'packed' my Drive C VPIX file just recently and shrank its reported size
significantly by tarring its contents (under DOS using GNU TAR) to a
file on the XENIX file system.  I deleted the Drive C file, copied the
default Drive C file from the VPIX distribution, and loaded the tar file
back into the new Drive C file.  Everything worked normally afterwards and
df reported a substantial savings in disk space.

I hope this information was helpful.  As I mentioned, I am not real fond
of sparse files, both from past experience and because fsck keeps
reporting a nagging POSSIBLE FILE SIZE ERROR for them when it runs.

                             As always,
                             Dr. G.W. Wettstein
                             Fargo Clinic / MeritCare
                             Oncology Research Division Computing Facility

                             UUCP: uunet!plains!wind!greg
                             INTERNET: greg%wind.uucp@plains.nodak.edu
                             Phone: 701-234-2833

`The truest mark of a man's wisdom is his ability to listen to other
 men expound their wisdom.'