[net.unix] Tar Padding Nulls Advice

tjk@rlgvax.UUCP (Tom Kredo) (07/09/85)

We have a data base program that allocates nulls in the middle of
the file to use for data expansion.  The file may be created with a
size of 1MB but hold only 1KB of data, so UNIX really allocates only
1 block for it.  If we tar, cpio, or cp this file, the copy becomes
fully allocated because the nulls are written out as real blocks.
This is a problem since a backup may no longer fit on the same file
system.  Any suggestions as to ways around this other than hacking
tar, cpio, and cp?
Thanks!
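
For readers who want to reproduce the situation, here is a minimal
sketch of how such a file comes about: write some data, seek past a
gap, and write again.  The names make_sparse_file and demo, the 1MB
size, and the 512-byte payload are assumptions made up for this
illustration, not details of the data base program in question.
Whether the gap is really left unallocated depends on the file system,
and the 512-byte unit used for st_blocks is the common convention
rather than a guarantee.

```python
import os
import tempfile


def make_sparse_file(path, total_size=1 << 20, payload=b"x" * 512):
    """Write payload at both ends of the file and leave the middle unwritten."""
    with open(path, "wb") as f:
        f.write(payload)                   # real data at the start
        f.seek(total_size - len(payload))  # jump over the gap without writing
        f.write(payload)                   # real data at the end
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as on most UNIX systems.
    return st.st_size, st.st_blocks * 512


def demo():
    """Create the file in a scratch directory and report both sizes."""
    with tempfile.TemporaryDirectory() as d:
        return make_sparse_file(os.path.join(d, "db.dat"))


if __name__ == "__main__":
    apparent, allocated = demo()
    print("apparent size:", apparent, "bytes; allocated:", allocated, "bytes")
```

On a file system that supports holes, the allocated figure should come
out far smaller than the apparent size.  Reading the gap back returns
nulls, which is why a plain copy writes them out as real blocks.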

gnu@sun.uucp (John Gilmore) (07/16/85)

> We have a data base program that allocates nulls in the middle of
> the file to use for data expansion.  The file may be created with a
> size of 1MB but hold only 1KB of data, so UNIX really allocates only
> 1 block for it.  If we tar, cpio, or cp this file, the copy becomes
> fully allocated because the nulls are written out as real blocks.
> This is a problem since a backup may no longer fit on the same file
> system.  Any suggestions as to ways around this other than hacking
> tar, cpio, and cp?

I think the preferred way of handling this is for the kernel file
system to check when you write, to see if you wrote all zeros, and
if so to leave the block unallocated.
The circumstances of the test can be suitably restricted to avoid
overhead, e.g. only do the check if the block being written to is
currently a hole and the first 4 bytes of the write are zero (compare
the first long to 0).  This fixes the tar/dump/cp/cpio problem.
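
The test can be sketched in user space to show its shape.  This is
only an illustration, not kernel code: is_zero_block, sparse_copy, and
the 4096-byte BLOCK_SIZE are hypothetical names and values chosen for
the example.  The destination here is a fresh file, so every block
starts out as a hole and that half of the condition is always true.
Doing the check in a copy loop amounts to the change to cp that the
original question hoped to avoid; the point of putting it in the
kernel is that every program gets it for free.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size; real file systems vary


def is_zero_block(block):
    """Return True if every byte of block is zero."""
    # Cheap pre-check: most real data fails on the first 4 bytes,
    # so the full scan below is rarely reached.
    if any(block[:4]):
        return False
    # Full scan, only for blocks that passed the pre-check.
    return not any(block)


def sparse_copy(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if is_zero_block(block):
                dst.seek(len(block), os.SEEK_CUR)  # leave a hole
            else:
                dst.write(block)
        # A trailing hole has no write after it, so set the length explicitly.
        dst.truncate(dst.tell())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src.dat")
        dst = os.path.join(d, "dst.dat")
        with open(src, "wb") as f:
            f.write(b"data" + b"\0" * (8 * BLOCK_SIZE))
        sparse_copy(src, dst)
        print("copied", os.stat(dst).st_size, "bytes")
```

The copy has the same length and contents as the source, but on a file
system that supports holes the skipped blocks should take no space.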

It would be slightly more intuitive if the file system never stored a
block of all zeros on disk (and this could be depended on by users)
but the compromise above fixes most of the things that bite people today.