[comp.unix.questions] saving directories when they don't fit on 1 tape

wsmith@mdbs.UUCP (Bill Smith) (09/29/89)

Is there an automatic, reasonably portable way to save a directory and all 
files beneath it on tape when it does not fit on a single piece of media 
and can not easily be split at the next level down in the directory tree?

I would like something for a Sun environment like the multi-volume
versions of tar that I have seen (on Ultrix, for example).  dump is the
closest thing I've found, but as I understand the man page, it only works
on complete filesystems, which is 2 or 3 times more data than I want to save.

Bill Smith
uunet!pur-ee!mdbs!wsmith

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (09/29/89)

In article <1454@mdbs.UUCP>, wsmith@mdbs.UUCP (Bill Smith) writes:
|  
|  Is there an automatic, reasonably portable way to save a directory and all 
|  files beneath it on tape when it does not fit on a single piece of media 
|  and can not easily be split at the next level down in the directory tree?

  There is a program called bundle in the archives which breaks up a
data stream and performs buffering. It will prompt you to mount another
tape after N bytes have been sent to the device.

Example:
  tar cf - mydir | bundle /dev/rst8 55000k 60k
                            ^        ^     ^
                            |        |     |__ buffer size
                            |        |________ device size
                            |_________________ device name
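
Before starting, it can help to know roughly how many tapes to have on
hand.  A minimal sketch, assuming the size of the stream is close to what
du reports for the directory (tar adds some header overhead, so treat the
answer as a lower bound).  The function name tapes_needed is made up for
this illustration:

```shell
# tapes_needed TOTAL_KB TAPE_KB
# Print how many volumes a stream of TOTAL_KB kilobytes occupies when
# each volume holds TAPE_KB kilobytes (integer ceiling division).
tapes_needed() {
    total_kb=$1
    tape_kb=$2
    echo $(( (total_kb + tape_kb - 1) / tape_kb ))
}

# Hypothetical usage with the 55000k device size from the example:
#   tapes_needed "$(du -sk mydir | awk '{print $1}')" 55000
```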
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

rice@dg-rtp.dg.com (Brian Rice) (09/30/89)

In article <1454@mdbs.UUCP> wsmith@mdbs.UUCP (Bill Smith) writes:

>Is there an automatic, reasonably portable way to save a directory and all 
>files beneath it on tape when it does not fit on a single piece of media 
>and can not easily be split at the next level down in the directory tree?

Use cpio.  First, there are some cpio's which support spreading
tape files across several tapes.  Ours does, I'm confident; check 
your cpio's man page to make sure.  But I don't think vanilla 
System V's do (or are required to by AT&T), and SunOS 4.0's cpio 
man page says it doesn't.

So suppose your cpio doesn't do the trick by itself.  We can use
cpio's flexibility to work around dern near anything.  Here's a 
simple, *approximate* method, which will work most of the time and
is hassle-free:  Suppose you have 400 files which take up 190 
megabytes, which you want to put on two 150-megabyte tapes.  Just 
put in a tape, give this command (as root):

find directory_to_write -print | head -200 | cpio -vBo >tape_device

Then put in a new tape, and give this command:

find directory_to_write -print | tail -200 | cpio -vBo >tape_device

When you reload these using cpio -i, be sure to include the -d option,
which creates directories as needed.  Depending on your medium, you
might need to use some other option than B (see the man page; there's
scads of them).
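
The 200 in the commands above comes from halving the 400 files by hand.
A minimal sketch that works the halves out from a saved file list instead;
first_half, second_half, /tmp/list and tape_device are placeholder names
for this illustration, and it assumes a head and tail that accept -n:

```shell
# first_half LIST / second_half LIST
# Split the file list LIST (one path per line) into two halves by line
# count.  With an odd count the first half gets the extra line.
first_half() {
    n=$(wc -l < "$1")
    head -n $(( (n + 1) / 2 )) "$1"
}

second_half() {
    n=$(wc -l < "$1")
    tail -n +$(( (n + 1) / 2 + 1 )) "$1"
}

# Hypothetical usage:
#   find directory_to_write -print > /tmp/list
#   first_half  /tmp/list | cpio -vBo > tape_device    # first tape
#   second_half /tmp/list | cpio -vBo > tape_device    # second tape
#   cpio -ivdB < tape_device                           # restore, once per tape
```

Like the original, this balances the number of files, not their sizes.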

Obviously, this method will fail if one file occupies, say, 160 MB
and the remaining 399 together only take 30 MB. So, if you want a 
method which will work all the time, you're going to have to dress 
this up with a shell script.  The idea is to partition all the files 
into groups, each of which has total size less than one of your 
physical media, then cpio each group out, prompting the operator 
between each group to insert a new tape.  If I were going to write this,
here's the strategy I'd use:

1.  Use find to create a list of all files which you wish to 
    write to tape.  Call it /tmp/out1.
2.  Pipe this file to a program which, for each line of input,
    runs /bin/ls -sd on it, and appends the result to a file.
    Call this file /tmp/out2.
3.  awk '{print $1}' /tmp/out2 > /tmp/out3.
4.  Run a program which proceeds through /tmp/out1 and /tmp/out3, 
    adding the value of each line of /tmp/out3 to a cumulative total 
    while appending each corresponding line of /tmp/out1 to /tmp/out4.
    When we find that adding in the value of a new line of /tmp/out3 to
    our total would cause the total to exceed tape capacity (less a 
    fudge factor), or when we reach the end of file on /tmp/out1 and
    /tmp/out3, we do this command:

    cat /tmp/out4 | cpio -ovB > tape_device

    If there are no more lines in /tmp/out1 and /tmp/out3, we're
    done.  Otherwise:

    echo Please insert a new tape and press a key...
    # Wait for the user to press a key, using your favorite method.

    Then we empty out /tmp/out4 and start reading from /tmp/out1 and
    /tmp/out3 again.
5.  rm /tmp/out[1234]

Pretty ugly, ain't it.  That's why we should hide it in a shell script.
But I think it'll work.  /tmp/out[123] should all wind up with the same 
length, unless something is dreadfully wrong (that -d option on ls
is very important!).
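
The strategy above can be sketched roughly as follows.  This is one
possible reading of it, not a tested backup tool: it folds steps 1-3 into
a single pass, keeps one list file per tape instead of reusing /tmp/out4,
and assumes an ls that accepts -k so sizes come out in kilobytes.  The
names list_sizes, split_by_capacity and /tmp/vol are made up for this
illustration.

```shell
# list_sizes: read path names on stdin and print "KB PATH" for each one.
# -d keeps ls from listing the contents of directories.
list_sizes() {
    while IFS= read -r path; do
        ls -skd -- "$path"
    done
}

# split_by_capacity CAPACITY_KB PREFIX
# Read "KB PATH" lines on stdin and write each PATH to PREFIX.1,
# PREFIX.2, ... starting a new list whenever adding the next file
# would push the running total past CAPACITY_KB.
split_by_capacity() {
    awk -v cap="$1" -v prefix="$2" '
        BEGIN { vol = 1; total = 0 }
        {
            size = $1
            path = $0
            # Strip the size column but keep spaces inside the name.
            sub(/^[ \t]*[0-9]+[ \t]+/, "", path)
            if (total > 0 && total + size > cap) {
                close(prefix "." vol)
                vol++
                total = 0
            }
            print path > (prefix "." vol)
            total += size
        }'
}

# Hypothetical driver (tape_device is a placeholder; 140000 KB leaves a
# fudge factor on a 150 MB tape):
#   rm -f /tmp/vol.*
#   find directory_to_write -print | list_sizes | split_by_capacity 140000 /tmp/vol
#   for list in /tmp/vol.*; do
#       echo "Please insert a new tape and press return..."
#       read answer
#       cpio -ovB < "$list" > tape_device
#   done
```

A single file larger than the capacity still ends up alone in its own
list and will not fit; this method cannot split one file across tapes.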

Good luck.
Brian Rice   rice@dg-rtp.dg.com   (919) 248-6328
DG/UX Product Assurance Engineering
Data General Corp., Research Triangle Park, N.C.
"My other car is an AViiON."

root@medsys.uucp (Superuser) (09/30/89)

rice@dg-rtp.dg.com (Brian Rice) writes:

In article <1454@mdbs.UUCP> wsmith@mdbs.UUCP (Bill Smith) writes:

>Is there an automatic, reasonably portable way to save a directory and all 
>files beneath it on tape when it does not fit on a single piece of media 
>and can not easily be split at the next level down in the directory tree?

In Xenix one need only edit the /etc/default/tar file to include the size
of the tape in kilobytes (61440 KB = 60 MB):

archive8=/dev/rct0		20	61440	y
                                        -----
Otherwise, the complete tar argument should work:

tar -cfbkv /dev/rct0 20 61440 [directory-name]
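
For other tape sizes the third field is just megabytes times 1024.  A
small sketch that prints an entry in the layout quoted above; the helper
name tar_default_entry is made up, and the meaning of the other fields
(blocking factor 20, trailing y) is taken from the example rather than
checked against the Xenix manual:

```shell
# tar_default_entry KEY DEVICE BLOCKING SIZE_MB
# Print an /etc/default/tar style line, converting the tape size from
# megabytes to kilobytes (1 MB = 1024 KB).
tar_default_entry() {
    printf '%s=%s\t%s\t%s\ty\n' "$1" "$2" "$3" $(( $4 * 1024 ))
}

# tar_default_entry archive8 /dev/rct0 20 60
# prints: archive8=/dev/rct0  20  61440  y   (fields separated by tabs)
```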

--
             __   __|  __        __        LaVerne Olney -- Med-Systems
    |/^\/^\ /__) /  | (__  |  | (__     Medical Office Management Software
    |  |  | \__  \__| ___) \__| ___)           1932 Brookside Road
                              |            Kingsport, TN  37660  U.S.A. 
  Unix BBS: 615-288-3957   (__)             UUCP: uunet!medsys!laverne

guy@auspex.auspex.com (Guy Harris) (10/01/89)

>Use cpio.  First, there are some cpio's which support spreading
>tape files across several tapes.  Ours does, I'm confident; check 
>your cpio's man page to make sure.  But I don't think vanilla 
>System V's do (or are required to by AT&T), and SunOS 4.0's cpio 
>man page says it doesn't.

Note that the S5 "cpio" support for multi-volume archives depends on
the tape driver handling the end-of-medium condition by returning a
short count or one of a couple of errors, *and* doing so on both reads
and writes.  I don't know that the Sun SCSI tape driver does either; it
runs the tape in buffered mode, presumably in the hope of getting
streaming tapes to stream, and unfortunately that makes it somewhat hard
to handle the end-of-medium condition - there may be several blocks in
the tape drive's buffer that haven't been written to tape, and they'd
have to be retrieved using the proper SCSI command and written to the
next tape.

Unfortunately, the "obvious" trick of doing this in the driver,
invisibly, could cause problems if you e.g.  decide to punt and try a
bigger or higher-density tape instead of continuing to the next volume;
there'd need to be some way for the driver to know whether to write the
buffered data when the next tape is mounted or to discard it, and to in
fact know when the next tape is ready to be written on.  "cpio" and
"tar" don't currently do anything to give explicit indications, so you'd
have to rely on opens and closes and the like, and hope that the "mount
next volume and keep going" case really *is* distinguishable from the
"punt and get a bigger tape" case by the driver, or that you can live
with the results if it isn't.

Also, I think the Sun driver may not report end-of-medium conditions on
"read".  UNIX drivers tend not to do so....

The strategy you suggest is a bit safer here, since you manually break
up the operation, rather than letting the tape drive do so with
end-of-medium indications.  The V7/S3 and BSD "dump" programs use a
similar strategy - you tell them how much tape there is, and some other
information, and they figure out how many blocks will fit on the volume.
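
To give a feel for that kind of arithmetic, here is a rough model of how
many blocks fit on a reel given its length and density.  It is only a
sketch of the idea, not the calculation "dump" actually performs, and the
inter-record gap passed in the example (0.6 inch) is an assumed figure:

```shell
# tape_capacity_kb FEET BPI BLOCK_BYTES GAP_MILS
# Estimate how many kilobytes fit on a tape: each block occupies
# BLOCK_BYTES/BPI inches plus one inter-record gap of GAP_MILS
# thousandths of an inch.  All arithmetic is in integer mils.
tape_capacity_kb() {
    feet=$1
    bpi=$2
    block=$3
    gap=$4
    tape_mils=$(( feet * 12 * 1000 ))
    block_mils=$(( block * 1000 / bpi + gap ))
    blocks=$(( tape_mils / block_mils ))
    echo $(( blocks * block / 1024 ))
}

# 2400 ft reel, 1600 bpi, 10240-byte blocks, assumed 0.6 inch gap:
#   tape_capacity_kb 2400 1600 10240 600    # prints 41140
```

Bad spots and rewrites use up extra tape, so a real program would leave
some margin below a figure like this.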

gentry@kcdev.UUCP (Art Gentry) (10/02/89)

In article <1454@mdbs.UUCP>, wsmith@mdbs.UUCP (Bill Smith) writes:
> Is there an automatic, reasonably portable way to save a directory and all 
> files beneath it on tape when it does not fit on a single piece of media 
> and can not easily be split at the next level down in the directory tree?

At least on my Hewlett Packard and AT&T systems, both tar and cpio will do
multi tape saves.  They will write until eot is detected and then prompt
for another device to continue writing.  If you just hit [return], they
will continue writing to the same device (after you have changed the tape,
of course!! :-})

-- 
| R. Arthur Gentry     AT&T Communications     Kansas City, MO     64106 |
| Email: attctc!kcdev!gentry        ATTMail: attmail!kc4rtm!gentry       |
| The UNIX BBS: 816-221-0475        The Bedroom BBS: 816-637-4183        |
| $include {std_disclaimer.h}       "I will make a quess" - Spock - STIV |

mje@olsa99.UUCP (Mark J Elkins) (10/02/89)

From article <1466@xyzzy.UUCP>, by rice@dg-rtp.dg.com (Brian Rice):
> In article <1454@mdbs.UUCP> wsmith@mdbs.UUCP (Bill Smith) writes:
> 
>>Is there an automatic, reasonably portable way to save a directory and all 
>>files beneath it on tape when it does not fit on a single piece of media 
>>and can not easily be split at the next level down in the directory tree?
> 
> Use cpio.  First, there are some cpio's which support spreading
> tape files across several tapes.
[Deleted example using tail and head]

> Obviously, this method will fail if one file occupies, say, 160 MB
> and the remaining 399 together only take 30 MB.

What about using 'afio'?  It has an 's' option - to specify the length
of your tapes.  When it hits this value, it prompts for the next tape.
It's also 'cpio' compatible!

 e.g. 'find whatever -print | afio -ovs 150m /dev/streamer_tape'

There is one bug however - the internal buffer size for cpio is 5120
bytes.  If you back up to a tape whose size is not a multiple of that
block size, the backup works fine (it writes a partial block at the end
of the tape) but when restoring - it will only read full blocks of data -
so will not read any partial blocks.

This is a pain, because the amount you can actually write to a 60 Meg
streamer is not always 60 Meg (I can sometimes get 62 Meg).  To be safe,
I said "Only write 58 Meg".  It writes fine - it just never quite restores!
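
If the partial final block really is the culprit, one possible workaround
is to pick a volume size that is a whole number of 5120-byte (5 KB)
blocks.  A minimal sketch of the rounding; round_down_kb is a made-up
helper, and whether the result cures the restore problem on a given afio
is untested:

```shell
# round_down_kb SIZE_KB BLOCK_KB
# Print the largest multiple of BLOCK_KB that does not exceed SIZE_KB.
round_down_kb() {
    echo $(( $1 / $2 * $2 ))
}

# 58 Meg is 59392 KB; with 5 KB blocks:
#   round_down_kb 59392 5    # prints 59390
```

The rounded figure would then be given to the 's' option in whatever
units the local afio accepts.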

What guarantee do we have that all 600 ft tapes really have at least
600 ft of tape on them?

Has this problem been addressed ?

The archive it writes is 'ascii cpio' format. However - when trying to
restore 'ascii cpio' with 'cpio' - it's possible to get problems with
certain files with numbers in their names (terminfo).  Does 'afio'
address this problem?
-- 
/"""\  Mark J Elkins, Olivetti Africa, Unix Software Support
|o.o|  UUCP: {ddsw1 | olgb1 | olnl1} !olsa99!mje
\_=_/  mje@olsa99.UUCP  (mje@olsa99.uunet)
#define DISCLAMER

cpcahil@virtech.UUCP (Conor P. Cahill) (10/03/89)

In article <895@kcdev.UUCP>, gentry@kcdev.UUCP (Art Gentry) writes:
> In article <1454@mdbs.UUCP>, wsmith@mdbs.UUCP (Bill Smith) writes:
> > Is there an automatic, reasonably portable way to save a directory and all 
> > files beneath it on tape when it does not fit on a single piece of media 
> > and can not easily be split at the next level down in the directory tree?
> 
> At least on my Hewlett Packard and AT&T systems, both tar and cpio will do
> multi tape saves.  They will write untill eot is detected and then prompt
> for another device to continue writting.  If you just hit [return], they
> will continue writing to the same device (after you have changed the tape,
> of course!! :-})

Cpio does not allow this.  You must enter the name of the output device for
each follow-on tape. This is the most aggravating feature of any unix utility
that I use.   Invariably I will absent-mindedly hit just a return on the 12th
diskette and have to remake the entire set.

I would prefer the mechanism that you specify or at least a mechanism that
requires a confirmation that you wish to abort the archiving.
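
For a home-grown wrapper script (not cpio itself), the preferred
behaviour is easy to sketch: a bare return keeps the current device.
The function name next_device and the prompt wording are made up for
this illustration:

```shell
# next_device DEFAULT
# Prompt on stderr, read one line from stdin, and print either that
# line or DEFAULT when the line is empty (a bare return).
next_device() {
    printf 'Change tape, then enter device name [%s]: ' "$1" >&2
    IFS= read -r answer || :
    if [ -z "$answer" ]; then
        echo "$1"
    else
        echo "$answer"
    fi
}

# Hypothetical usage inside a multi-volume loop:
#   dev=$(next_device "$dev")
```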

-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

buck@siswat.UUCP (A. Lester Buck) (10/04/89)

In article <81@olsa99.UUCP>, mje@olsa99.UUCP (Mark J Elkins) writes:
> What guarantee do we have that all 600 ft tapes really have at least
> 600 ft of tape on them?

Even with a 600+ ft tape, given enough write errors and extended
inter-record gaps, one can still run out of tape.  So we have no guarantee
that a fixed amount of data will fit on any given tape.

Somehow we muddle through...


-- 
A. Lester Buck		...!texbell!moray!siswat!buck

jgd@rsiatl.UUCP (John G. De Armond) (10/04/89)

In article <1220@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
>
>Cpio does not allow this.  You must enter the name of the output device for
>each follow on tape. This is the most agravating feature of any unix utility
>that I use.   Invariably I will absent mindedly hit just a return on the 12th
>diskette and have to remake the entire set.
>
>I would prefer the mechanism that you specify or at least a mechanism that
>requires a confirmation that you wish to abort the archiving.
>

Wrong.  If you will read the fine print while you are RTFM, you will note
that this mode of operation only happens when you redirect the output
of cpio to a tape device.  If  you instead use the -O option to tell
cpio what device to use, it will happily continue using the same device.
A command line like:

cpio -oO /dev/tape ... 

will do exactly what you want.  

Disclaimer:  I know this works on Microport, Interactive, Convergent
and AT&T 3b2xxx systems.  Your results may vary.


John
-- 
John De Armond, WD4OQC                     | Manual? ... What manual ?!? 
Radiation Systems, Inc.     Atlanta, GA    | This is Unix, My son, You 
gatech!stiatl!rsiatl!jgd  **I am the NRA** | just GOTTA Know!!! 

ch@maths.tcd.ie (Charles Bryant) (10/05/89)

In article <1454@mdbs.UUCP> wsmith@mdbs.UUCP (Bill Smith) writes:

>Is there an automatic, reasonably portable way to save a directory and all 
>files beneath it on tape when it does not fit on a single piece of media 
>and can not easily be split at the next level down in the directory tree?

I have found the best way to back up a 70 MB file system onto a 60 MB cartridge
is with "cpio -o | compress". You can rely on about 2:1 compression unless a
lot of your data is already compressed.
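
Spelled out, the whole pipeline and its inverse might look like the
sketch below.  The function names are made up, OUT would be the tape
device in practice, and it assumes cpio, compress and uncompress are
installed and that uncompress reads standard input when given no file:

```shell
# backup_compressed DIR OUT
# Archive everything under DIR with cpio, compress the stream, and
# write it to OUT (a file or a tape device).
backup_compressed() {
    ( cd "$1" && find . -print | cpio -o | compress ) > "$2"
}

# restore_compressed IN DIR
# Uncompress the stream in IN and unpack it under DIR; -d makes cpio
# create directories as needed.
restore_compressed() {
    mkdir -p "$2" && uncompress < "$1" | ( cd "$2" && cpio -id )
}

# Hypothetical usage (/dev/tape is a placeholder):
#   backup_compressed /usr/src /dev/tape
#   restore_compressed /dev/tape /tmp/restored
```

The 2:1 figure is an estimate, so it is worth checking the compressed
size against the cartridge capacity before relying on a single tape.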
-- 

		Charles Bryant. (ch@dce.ie)
Working at Datacode Electronics Ltd. (Modem manufacturers)

cpcahil@virtech.UUCP (Conor P. Cahill) (10/05/89)

In article <228@rsiatl.UUCP>, jgd@rsiatl.UUCP (John G. De Armond) writes:
> In article <1220@virtech.UUCP> cpcahil@virtech.UUCP (Conor P. Cahill) writes:
> >Cpio does not allow this.  You must enter the name of the output device for
> >each follow on tape. This is the most agravating feature of any unix utility
> >that I use.   Invariably I will absent mindedly hit just a return on the 12th
> >diskette and have to remake the entire set.
> >
> >I would prefer the mechanism that you specify or at least a mechanism that
> >requires a confirmation that you wish to abort the archiving.
> >
> 
> Wrong.  If you will read the fine print while you are RTFM, you will note
> that this mode of operation only happens when you redirect the output
> of cpio to a tape device.  If  you instead use the -O option to tell
> cpio what device to use, it will happily continue using the same device.

The -O option only came into being in release 3.1 (or 3.2, not sure which).
It was not present in release 3.0 or earlier releases.  Not knowing which
release the other user has, it is best to stay with the lowest common
denominator (and the BSDers will quickly point out that BSD doesn't have
cpio, so how can it be an LCD?).

-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	!
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+