[comp.unix.questions] backup with compressed cpio files ?

bothe@netmuc.UUCP (02/08/89)

hello,
has anyone tried backing up with compressed cpio files?

what i suggest is to save my files with:
find . -print | compress | cpio -oacB >/dev/rmt0

and to restore them with
</dev/rmt0 uncompress | cpio -icB ...

is (un)compress safe enough to do this?
can it handle all chars (incl. nuls and 8-bit)?

can it handle compressed files (we had trouble with a double
compressed file) ???

Answer please to:
bothe.muc@nixpbe.UUCP

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (02/18/89)

In article <9100001@netmuc> bothe@netmuc.UUCP writes:

| 
| what i suggest is to save my files with:
| find . -print | compress | cpio -oacB >/dev/rmt0

  Won't work. You want to compress the output of cpio, not the list of
files into it. An interesting concept, but no cigar.

  This should work:
	find . -print | cpio -oca | compress >/dev/rmt0

Two notes:
 1) you may need "dd bs=5k" after compress
 2) if you have -depth as an option, use it to reset directory times

  This works pretty well if you're willing to sacrifice performance for
media size. Double compressed files may get larger. Compression ratio and
performance are MUCH better if you can do all of one type of file (C
source, text, nroff, COFF) at a time, since the compress algorithm is
adaptive and performs better the more you give it of the same type.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

tar@ksuvax1.cis.ksu.edu (Tim Ramsey) (02/18/89)

In article <9100001@netmuc> bothe@netmuc.UUCP writes:

>hello,
>has anyone tested to backup compressed cpiofiles ?

>what i suggest is to save my files with:
>find . -print | compress | cpio -oacB >/dev/rmt0

>and to restore them with
></dev/rmt0 uncompress | cpio -icB ...

I see two problems:

1) You can't compress the output from find and pipe that into cpio.  Try:
      find . -print | compress | cpio -oac > /dev/rmt0

   Note that since cpio isn't writing directly to the tape device, the
   "-B" option doesn't do anything.

2) If you get a bad spot on the tape later you won't (never say never :-)
   be able to recover anything beyond the bad spot.  This is a tradeoff
   when you use compress.

Tim
--
Timothy Ramsey
BITNET: tar@KSUVAX1
Internet: tar@ksuvax1.cis.ksu.edu
UUCP: ...!rutgers!ksuvax1!tar

sl@van-bc.UUCP (pri=-10 Stuart Lynne) (02/19/89)

In article <13176@steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
>In article <9100001@netmuc> bothe@netmuc.UUCP writes:
>| find . -print | compress | cpio -oacB >/dev/rmt0

>	find . -print | cpio -oca | compress >/dev/rmt0

>  This works pretty well if you're willing to sacrifice performance for
>media size. Double compressed files may get larger. Compression size and

If you use this beware that a bad block in your backup media will render the 
rest of the backup useless for all intents.

	compressdirv
	find . -print | cpio -oca > /dev/rmt0

is almost as good, and won't try to recompress compressed files. Also, if you
lose a block on your tape you will have better luck trying to get at the
rest of the tape.

If you were running SCO Xenix 2.3 with their support for error correction
on tape the other way might be suitable, but I wouldn't recommend it
otherwise.

-- 
Stuart.Lynne@wimsey.bc.ca {ubc-cs,uunet}!van-bc!sl     Vancouver,BC,604-937-7532

gwyn@smoke.BRL.MIL (Doug Gwyn ) (02/19/89)

In article <872@deimos.cis.ksu.edu> tar@ksuvax1.cis.ksu.edu (Tim Ramsey) writes:
>In article <9100001@netmuc> bothe@netmuc.UUCP writes:
>>what i suggest is to save my files with:
>1) You can't compress the output from find and pipe that into cpio.  Try:
>      find . -print | compress | cpio -oac > /dev/rmt0
>   Note that since cpio isn't writing directly to the tape device, the
>   "-B" option doesn't do anything.

What the hell are you guys talking about??
The output from "find" is a list of pathnames,
which must remain readable as "cpio" receives them.
If you direct "cpio -o"'s output at a magtape,
it certainly WILL write directly to it.

	find . -depth -print | cpio -oc | compress | dd bs=5k > /dev/rmt0

bjorn@sysadm.UUCP (Bjorn Satdeva) (02/19/89)

In article <9100001@netmuc> bothe@netmuc.UUCP writes:
 
>what i suggest is to save my files with:
>find . -print | compress | cpio -oacB >/dev/rmt0
 
Don't you mean :

find . -print | cpio -oacB | compress >/dev/rmt0

Bjorn Satdeva
uunet!sysadm!bjorn	/sys/admin, inc  The Unix System Administration Experts

tar@ksuvax1.cis.ksu.edu (Tim Ramsey) (02/19/89)

In article <9667@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <872@deimos.cis.ksu.edu> I wrote:

>>      find . -print | compress | cpio -oac > /dev/rmt0
>>   Note that since cpio isn't writing directly to the tape device, the
>>   "-B" option doesn't do anything.

>What the hell are you guys talking about??
>The output from "find" is a list of pathnames,
>which must remain readable as "cpio" receives them
>If you direct "cpio -o"'s output at a magtape,
>it certainly WILL write directly to it.

>	find . -depth -print | cpio -oc | compress | dd bs=5k > /dev/rmt0

This is what I meant to say.  This is not what came out.  Doug, you
are absolutely right -- here's what I *meant* to say:
        find . -print | cpio -oac | compress > /dev/rmt0

Sorry if I confused anybody besides myself.  :-(

Tim
--
Timothy Ramsey
BITNET: tar@KSUVAX1
Internet: tar@ksuvax1.cis.ksu.edu
UUCP: ...!rutgers!ksuvax1!tar

fnf@estinc.UUCP (Fred Fish) (02/19/89)

In article <9667@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>	find . -depth -print | cpio -oc | compress | dd bs=5k > /dev/rmt0

(I'm sure Doug knows this, but for the sake of completeness...)

In order to ensure that dd writes 5K records I believe you have to
specify "ibs" and "obs" to be different sizes, to force an internal
buffer copy.  Substitute "ibs=1k obs=5k" for "bs=5K".  Also note that
the last buffer may not be a full 5k, which may confuse a cpio that
reads the tape directly.
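
A minimal sketch of the reblocking described above, assuming a dd that
accepts "ibs=" and "obs=" with a "k" suffix. A regular file (tape.img, a
made-up name) stands in for the tape, and "head -c" stands in for the
output of compress; neither comes from the thread.

```sh
#!/bin/sh
# Feed 12345 bytes through a pipe and let dd regroup them into 5k output
# records. tape.img is a stand-in for /dev/rmt0 (an assumption).
work=$(mktemp -d)
head -c 12345 /dev/zero |
    LC_ALL=C dd ibs=1k obs=5k of="$work/tape.img" 2>"$work/dd.log"

# dd reports its record counts on stderr as "full+partial".
records_out=$(grep 'records out' "$work/dd.log")
echo "$records_out"
```

Since 12345 = 2 * 5120 + 2105, dd writes two full 5k records followed by
one short record, which is the partial last buffer mentioned above.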

With compress, one read error and you're SOL as far as recovering the rest
of the data.  Of course, vanilla cpio is so braindead that you're SOL
anyway, so you might as well go ahead and compress it...  :-)
-- 
# Fred Fish, 1835 E. Belmont Drive, Tempe, AZ 85284,  USA
# 1-602-491-0048           asuvax!{nud,mcdphx}!estinc!fnf

ggs@ulysses.homer.nj.att.com (Griff Smith) (02/20/89)

In article <64@estinc.UUCP>, fnf@estinc.UUCP (Fred Fish) writes:
> In article <9667@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
> >	find . -depth -print | cpio -oc | compress | dd bs=5k > /dev/rmt0
> In order to ensure that dd writes 5K records I believe you have to
> specify "ibs" and "obs" to be different sizes, to force an internal
> buffer copy.

Only true if you are using the dd in V9.  It is only necessary that
you use ibs and obs rather than bs; the sizes can be identical.

> Substitute "ibs=1k obs=5k" for "bs=5K".

Why?  Unless you are working on a brain-damaged tape drive that can't
write blocks larger than 5k, why not use large blocks to get more on
the tape?  20K, maybe.  You're going to read the tape with dd anyway,
and cpio won't have trouble reading the pipeline from compress.
Try "ibs=8k obs=20k".

> Also note that the last buffer may not be a full 5k, which may confuse
> a cpio that reads the tape directly.

Cpio can't read the tape directly, it's compressed.

> With compress, one read error and you're SOL as far as recovering the rest
> of the data.

Which is a good reason to forget the whole suggestion anyway.
-- 
Griff Smith	AT&T (Bell Laboratories), Murray Hill
Phone:		1-201-582-7736
UUCP:		{most AT&T sites}!ulysses!ggs
Internet:	ggs@ulysses.att.com

daveb@gonzo.UUCP (Dave Brower) (02/20/89)

In <872@deimos.cis.ksu.edu> tar@ksuvax1.cis.ksu.edu (Tim Ramsey) writes:
>In <9100001@netmuc> bothe@netmuc.UUCP writes:
>
>>has anyone tested to backup compressed cpiofiles ?
>>find . -print |  cpio -oac | compress >/dev/rmt0
>
>2) If you get a bad spot on the tape later you won't (never say never :-)
>   be able to recover anything beyond the bad spot.  This is a tradeoff
>   when you use compress.

Actually it's not hopeless.  I suffered a HD crash on my 3b1, and when
restoring my 60-floppy compressed backup found that disk #13 or so was
bad, keeping me from getting at anything beyond, at least so I thought
at first.

I contacted a few of the people whose names are in the compress source,
and they scratched their heads a bit and said essentially there are
three things you can do:

	1.  Hope the failure is deep enough that the tables are full
	    and stable and unchanging, so that you can just slice out
	    the bad section and have it uncompress OK.

	2.  Look past the bad section for some magic data that indicates
	    compress has decided the current table is trash, and is so
	    going to dump it and restart.  You could theoretically pick
	    it back up from that point.

	3.  Give up.

I was prepared to do #2, except I found after writing a program to 
reject damaged cpio archives (fixcpio) that I had only suffered #1.
So, it worked OK for me.

Now, given that many people will give up when their *uncompressed* cpio
archive has a bad spot, saying "out of phase -- get help", I don't 
think most people would lose anything by compressing.

-dB
-- 
"I came here for an argument." "Oh.  This is getting hit on the head"
{sun,mtxinu,amdahl,hoptoad}!rtech!gonzo!daveb	daveb@gonzo.uucp

erickson@carroll1.UUCP (Dave Erickson) (02/21/89)

In article <64@estinc.UUCP> fnf@estinc.UUCP (Fred Fish) writes:
>
>With compress, one read error and you're SOL as far as recovering the rest
>of the data.  Of course, vanilla cpio is so braindead that you're SOL
>anyway, so you might as well go ahead and compress it...  :-)
>-- 

	I agree.  One medium error and you're kicked out of the program.  This
happens rather mercilessly whether you are on the first file of the first tape
or on the third tape - 3 hours through the cpio program!!  Then of course you
have to restart.  We back up our 3B2 every week with the command:

#cd /
#find * -print|cpio -ocvO /dev/rSA/qtape1

This takes several hours and 4 tapes minimum...assuming there are no medium errors, or
pretty much any other kind. 

We are working on more efficient algorithms such as breaking the cpio at every
tape change and starting a new one with whatever is left to save, but nothing
is really perfected yet.
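
One way the per-tape idea might be sketched: split the pathname list
into per-volume lists before cpio ever runs, so a medium error costs
only the volume it hits. The helper name split_list, the 512-byte
per-entry allowance, and the size limit are all assumptions made for
illustration, not anything from our actual scripts.

```sh
#!/bin/sh
# Hypothetical helper: read pathnames on stdin and append them to
# PREFIX.1, PREFIX.2, ... so that each list stays under LIMIT bytes.
# Prints the number of volumes used.
split_list() {
    limit=$1 prefix=$2 vol=1 used=0
    while IFS= read -r path; do
        size=512                          # rough per-entry header allowance (a guess)
        [ -f "$path" ] && size=$((size + $(wc -c < "$path")))
        if [ "$used" -gt 0 ] && [ $((used + size)) -gt "$limit" ]; then
            vol=$((vol + 1)) used=0       # start the next volume
        fi
        printf '%s\n' "$path" >> "$prefix.$vol"
        used=$((used + size))
    done
    echo "$vol"
}

# Demonstration on three 1000-byte files with a 3100-byte "tape".
work=$(mktemp -d)
mkdir "$work/src"
for name in a b c; do head -c 1000 /dev/zero > "$work/src/$name"; done
volumes=$(cd "$work/src" && find . -type f -print | sort |
    split_list 3100 "$work/list")
echo "volumes: $volumes"
# Each list would then feed its own archive, run from the same directory:
#   cpio -oc < "$work/list.1" > /dev/rmt0
```

A file larger than the limit still gets a volume to itself here, so this
does not replace cpio's own end-of-medium handling for oversized files.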

If any of you 3B2ers out there have any suggestions (besides 'sell'), I'd
love to hear from you.

Please e-mail your backup suggestions to :
erickson@carroll1.UUCP

I'll post anything useful

thanx.

Dave

cdold@starfish.Convergent.COM (Clarence Dold) (02/22/89)

From article <229@carroll1.UUCP>, by erickson@carroll1.UUCP (Dave Erickson):
> In article <64@estinc.UUCP> fnf@estinc.UUCP (Fred Fish) writes:

>>With compress, one read error and you're SOL as far as recovering the rest
>>of the data.  Of course, vanilla cpio is so braindead that you're SOL
>>anyway, so you might as well go ahead and compress it...  :-)

> 
> 	I agree.  One medium error and you're kicked out of the program.  This

What about the new 'zoo'?
It allows compression to-from stdio, and has a skipping feature, to 
bypass bad blocks in an archive.
I haven't tried this yet, but it does interest me, since I have to
back up a 240MB Database to 150MB QIC.
-- 
Clarence A Dold - cdold@starfish.Convergent.COM         (408) 434-2083
                ...pyramid!ctnews!professo!dold         MailStop 18-011
                P.O.Box 6685, San Jose, CA 95150-6685

breck@aimt.UU.NET (Robert Breckinridge Beatie) (02/22/89)

In article <872@deimos.cis.ksu.edu>, tar@ksuvax1.cis.ksu.edu (Tim Ramsey) writes:
> In article <9100001@netmuc> bothe@netmuc.UUCP writes:
> 
> >what i suggest is to save my files with:
> >find . -print | compress | cpio -oacB >/dev/rmt0
> 
> 1) You can't compress the output from find and pipe that into cpio.  Try:
>       find . -print | compress | cpio -oac > /dev/rmt0

How is this different?  You're still passing compressed find output to
cpio.  Don't you (and bothe@netmuc) mean:
		find . -print | cpio -oac | compress > /dev/rmt0
?
> 
> 2) If you get a bad spot on the tape later you won't (never say never :-)
>    be able to recover anything beyond the bad spot.  This is a tradeoff
>    when you use compress.

Isn't this just as much a problem with cpio?  It doesn't seem to be too
robust, so even if the output weren't compressed, cpio would just complain:
	"Out of sync: better luck next time"
or some similarly irritating message.  It seems you'd have to unpack the
file by hand.

And as long as we're unpacking the file by hand, would it be possible to
look for the "reset" marks that compress leaves in its output periodically
and then pick up from that point?  I thought that compress, when its code
tables fill up and it figures that its compression efficiency is
decreasing, writes a "reset" symbol, clears its code tables,
and effectively starts over from scratch.

So, would it be possible to take the data on the tape after the reset
symbol, prepend a compress header to that data, and hand that stream
to uncompress and get back at least part of what you had on the tape
after the bad spot?

Of course, I'm not that familiar with the workings of compress so I
could be way off base here.
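
For reference, a sketch of reading the three-byte header that such a
salvage attempt would have to prepend. It assumes the usual compress
(.Z) layout: magic bytes 1f 9d, then a flag byte whose low five bits
give the maximum code width and whose top bit marks block mode. The
function name z_header_info is hypothetical. It does not try to locate
the reset code itself, which is bit-packed at a variable code width and
so cannot be found with a simple byte search.

```sh
#!/bin/sh
# Print the fields of the 3-byte header at the start of a compress (.Z)
# stream. z_header_info is an illustrative name, not an existing tool.
z_header_info() {
    hdr=$(od -An -tx1 -N3 "$1" | tr -d '[:space:]')
    case $hdr in
        1f9d??) ;;
        *) echo "not a compress stream" >&2; return 1 ;;
    esac
    flags=$((0x${hdr#1f9d}))              # third byte as a number
    echo "maxbits=$((flags & 0x1f)) block_mode=$(( (flags & 0x80) != 0 ))"
}

# Demonstration on a hand-built header: 1f 9d 90 (16-bit codes, block mode).
sample=$(mktemp)
printf '\037\235\220' > "$sample"
info=$(z_header_info "$sample")
echo "$info"
```

The same three bytes, copied from the start of the damaged archive, are
what one would put in front of the salvaged data before handing it to
uncompress.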

-- 
Breck Beatie	    				(408)748-8649
{uunet,ames!coherent}!aimt!breck  OR  breck@aimt.uu.net
"Sloppy as hell Little Father.  You've embarassed me no end."

ric@Apple.COM (Ric Urrutia) (03/11/89)

>In <9100001@netmuc> bothe@netmuc.UUCP writes:
>
>>has anyone tested to backup compressed cpiofiles ?
>>find . -print |  cpio -oac | compress >/dev/rmt0


I suggest using the following command to write the file
	find . -depth -print | cpio -o | compress | dd bs=8k of=/dev/rmt0
	(I use 8k on my cartridge but use whatever blocking factor is
		appropriate for your system.)

To uncompress the tape file, use the following command:
	dd if=/dev/rmt0 bs=8k | uncompress | cpio -ivdum (or whatever cpio 
		options you need).

Good luck!
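
A minimal round-trip sketch of the two commands above, run against a
regular file instead of a tape so it can be tried safely. The paths and
file names are made up for the demonstration. As a labeled assumption,
it falls back to gzip where compress is not installed, and it skips the
run entirely where cpio is missing. The sample data includes a NUL and
an 8-bit byte, since the original question asked about those.

```sh
#!/bin/sh
# Round trip of the find | cpio | compress | dd pipeline against a regular
# file. All paths below are invented for the demonstration.
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
printf 'hello\n' > "$work/src/a.txt"
printf 'a\0b\377\n' > "$work/src/bin.dat"     # NUL and 8-bit bytes
tape=$work/tape.img                           # stand-in for /dev/rmt0

# Assumption: use gzip as a stand-in where compress is not installed.
if command -v compress >/dev/null 2>&1 &&
   command -v uncompress >/dev/null 2>&1; then
    pack='compress -c' unpack='uncompress -c'
else
    pack='gzip -c' unpack='gzip -dc'
fi

if command -v cpio >/dev/null 2>&1; then
    # Write: archive, compress, then reblock to 8k records.
    (cd "$work/src" && find . -depth -print | cpio -o 2>/dev/null |
        $pack | dd obs=8k of="$tape" 2>/dev/null)
    # Read: the same stages in reverse order.
    (cd "$work/dst" && dd if="$tape" ibs=8k 2>/dev/null |
        $unpack | cpio -idm 2>/dev/null)
    if cmp -s "$work/src/bin.dat" "$work/dst/bin.dat" &&
       cmp -s "$work/src/a.txt" "$work/dst/a.txt"; then
        result=ok
    else
        result=mismatch
    fi
else
    result=skipped                            # no cpio on this machine
fi
echo "round trip: $result"
```

Both the archive and the compressed stream are binary-safe, so the
restored bin.dat should compare equal byte for byte.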