[comp.unix.questions] tar frustration

lew@gsg.UUCP (Paul Lew) (08/11/88)

From article <2858@ttrdc.UUCP>, by levy@ttrdc.UUCP (Daniel R. Levy):
> 
> All this points up a "feature" of tar which I find frustrating:  if I want
> tar to tape-archive a large number of files randomly scattered all over the
> file system (such as for an incremental backup) I'm SOL because tar wants
> to be told either a directory to completely search or file names to archive,
> via the argument list.  "cpio" circumvents this problem, since I can feed it
> a list of files, but what if I don't WANT to use cpio?

Check the public domain tar posted to comp.sources.unix, volume 12.  There is
a 'T' flag which takes filenames from a file.  I used it to save source
files like this:

	find $src -print | sed	\
		-e '/\.o$/d'	\
		-e '/\.a$/d'	\
		-e '/~$/d'	\
		-e '/\/core$/d'	\
		-e '/\/a\.out$/d' | tar -c -T -

It works out great.  If $src is an absolute pathname, this tar will remove
the leading /s when writing to tape.
-- 
Paul Lew			{oliveb,harvard,decvax}!gsg!lew	(UUCP)
General Systems Group, 5 Manor Parkway, Salem, NH 03079	(603) 893-1000

levy@ttrdc.UUCP (Daniel R. Levy) (08/12/88)

> >tar cf /dev/whatever *

> I would suggest using "." rather than "*" to avoid the expansion of the command
> line to ridiculously long lengths.  While the meaning is certainly different, I
> have yet to think of any problems with this method when the intent is to tar up
> the contents of my current directory.

All this points up a "feature" of tar which I find frustrating:  if I want
tar to tape-archive a large number of files randomly scattered all over the
file system (such as for an incremental backup) I'm SOL because tar wants
to be told either a directory to completely search or file names to archive,
via the argument list.  "cpio" circumvents this problem, since I can feed it
a list of files, but what if I don't WANT to use cpio?  (Say, in a situation
which would trigger a known cpio bug, like inode numbers greater than 65535 or
uid's less than 0 [a la SUN] when doing cpio -c.)  Using the "r" option of tar
with repeated invocations of tar would work all right, but would be blastedly
slow because it would rewind the tape over and over and over.  If I used the
no-rewind tape device, I'd get a whole bunch of little tar archives, one for
each invocation.
-- 
|------------Dan Levy------------|  THE OPINIONS EXPRESSED HEREIN ARE MINE ONLY
| Bell Labs Area 61 (R.I.P., TTY)|  AND ARE NOT TO BE IMPUTED TO AT&T.
|        Skokie, Illinois        | 
|-----Path:  att!ttbcad!levy-----|

ron@topaz.rutgers.edu (Ron Natalie) (08/13/88)

If anybody is interested, I've got kicking around a public domain version of
TAR that takes the CPIO user interface, that is, the list of file names is
provided on the standard input.  While I was writing this, I noticed that
while every version of the TAR manual page that I've come across says
the format calls for zero-filling the fields, every implementation I've
seen actually space-fills them (as if they had done it with printf("%12d")).

Fortunately, every one is liberal in reading the archives (scanf doesn't care).
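
The difference is easy to see with the shell's printf, which accepts the same
conversions.  A small sketch, using the %12d mentioned above and an arbitrary
value (the variable names are made up for illustration):

```shell
# Space-filled versus zero-filled 12-character numeric fields.
size=1234
space_filled=$(printf '%12d' "$size")    # what the implementations reportedly write
zero_filled=$(printf '%012d' "$size")    # what the manual pages describe
printf '[%s]\n[%s]\n' "$space_filled" "$zero_filled"
```

Both fields are 12 characters wide; only the padding character differs, which
is why a lenient reader accepts either.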

-Ron

pope@vatican (John Pope) (08/13/88)

In article <2858@ttrdc.UUCP>, levy@ttrdc (Daniel R. Levy) writes:
>
>All this points up a "feature" of tar which I find frustrating:  if I want
>tar to tape-archive a large number of files randomly scattered all over the
>file system (such as for an incremental backup) I'm SOL because tar wants
>to be told either a directory to completely search or file names to archive,
>via the argument list.  "cpio" circumvents this problem, since I can feed it
>a list of files, but what if I don't WANT to use cpio?

To feed tar a list of files, I just keep the directories I want in a file called
"save_list" and do:

	tar cf /dev/rst8 `cat save_list` 

As a side note, SunOS has a handy "X" option to tar, which specifies a filename
containing files to exclude from the backup:

	tar cfX /dev/rst8 exclude_list `cat save_list` 

This lets me back up everything in /usr/foo, but exclude the subdirectory
/usr/foo/bar, for example.
-- 
John Pope
	Sun Microsystems, Inc. 
		pope@sun.COM

james@bigtex.uucp (James Van Artsdalen) (08/13/88)

In article <2858@ttrdc.UUCP>, levy@ttrdc.UUCP (Daniel R. Levy) wrote:

> All this points up a "feature" of tar which I find frustrating:  if I want
> tar to tape-archive a large number of files randomly scattered all over the
> file system (such as for an incremental backup) I'm SOL because tar wants
> to be told either a directory to completely search or file names to archive,
> via the argument list.

Am I the only one to use John Gilmore's tar that was posted a while
back?  It solves all of the problems I've had so far (including this one -
his can take files from stdin).  It does need some work to get it
working under SysV, but definitely worthwhile.
-- 
James R. Van Artsdalen    ...!uunet!utastro!bigtex!james     "Live Free or Die"
Home: 512-346-2444 Work: 328-0282; 110 Wild Basin Rd. Ste #230, Austin TX 78746

root@conexch.UUCP (Larry Dighera) (08/13/88)

In article <2858@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
<< <tar cf /dev/whatever *
<
<< I would suggest using "." rather than "*" to avoid the expansion of the command
<< line to ridiculously long lengths.  While the meaning is certainly different, I

<All this points up a "feature" of tar which I find frustrating:  if I want
<tar to tape-archive a large number of files randomly scattered all over the
<file system (such as for an incremental backup) I'm SOL because tar wants
<to be told either a directory to completely search or file names to archive,
<via the argument list.  "cpio" circumvents this problem, since I can feed it
<a list of files, but what if I don't WANT to use cpio?  (Say, in a situation
<which would trigger a known cpio bug, like inode numbers greater than 65535 or
<uid's less than 0 [a la SUN] when doing cpio -c.)  Using the "r" option of tar
<with repeated invocations of tar would work all right, but would be blastedly
<slow because it would rewind the tape over and over and over.  If I used the
<no-rewind tape device, I'd get a whole bunch of little tar archives, one for
<each invocation.

This is so simple that it makes me feel like I don't understand the problem.
If you want tar to take the names of the files it is to put into the archive
from a file which contains the names of the files, just do this:

	tar cvf /dev/whatever `cat file_of_names`

You can generate file_of_names with find just like is normally done with
cpio.  Ain't UNIX grand?

Larry Dighera



-- 
USPS: The Consultants' Exchange, PO Box 12100, Santa Ana, CA  92712
TELE: (714) 842-6348: BBS (N81); (714) 842-5851: Xenix guest account (E71)
UUCP: conexch Any ACU 2400 17148425851 "" "" ogin:-""-ogin:-""-ogin: nuucp
UUCP: ...!uunet!turnkey!conexch!root || ...!trwrb!ucla-an!conexch!root

woods@gpu.utcs.toronto.edu (Greg Woods) (08/14/88)

In article <64026@sun.uucp> pope@vatican (John Pope) writes:
>In article <2858@ttrdc.UUCP>, levy@ttrdc (Daniel R. Levy) writes:
>>
>>All this points up a "feature" of tar which I find frustrating:  if I want
>>tar to tape-archive a large number of files randomly scattered all over the
>>file system (such as for an incremental backup) I'm SOL because tar wants
>>to be told either a directory to completely search or file names to archive,
>>via the argument list.  "cpio" circumvents this problem, since I can feed it
>>a list of files, but what if I don't WANT to use cpio?

SCO's version of tar has an "F" option, which allows specification of a
filename containing a list of files (ala cpio).  Especially useful if
`cat filename` gives too long an argument list.

>As a side note, SunOS has a handy "X" option to tar, which specifies a filename
>containing files to exclude from the backup:
>
>	tar cfX /dev/rst8 exclude_list `cat save_list` 

This sounds handy too.

Anyone done any work on the PD-tar lately?  It needs multi-volume
handling, should have the ability to format floppies, and change devices
inter-volume (ala cpio), before I'll bother using it.
-- 
						Greg Woods.

UUCP: utgpu!woods, utgpu!{ontmoh, ontmoh!ixpierre}!woods
VOICE: (416) 242-7572 [h]		LOCATION: Toronto, Ontario, Canada

gwyn@smoke.ARPA (Doug Gwyn ) (08/14/88)

In article <7056@conexch.UUCP> root@conexch.UUCP (Larry Dighera) writes:
>This is so simple that it makes me feel like I don't understand the problem.

That's right...

>	tar cvf /dev/whatever `cat file_of_names`

There is a fairly small limit to the total number of bytes available
for the arguments to a command; something like 4096 to 10240 bytes.
It would be very easy to exceed this when archiving a large tree.
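
The limit can be checked before trusting the backquote approach.  A minimal
sketch, assuming a system that provides getconf; the name list is a made-up
stand-in for a real file_of_names, and 10240 is simply the upper figure quoted
above:

```shell
# Compare the size of a name list with the system's limit on exec arguments.
limit=$(getconf ARG_MAX)

# Made-up stand-in for file_of_names: 200 path names of 100 characters each.
list=$(mktemp)
i=1
while [ "$i" -le 200 ]; do
    printf 'some/deeply/nested/project/tree/%066d.c\n' "$i"
    i=$((i + 1))
done > "$list"

# Each name costs its own length plus one separator byte.
need=$(( $(wc -c < "$list") ))
echo "name list: $need bytes, ARG_MAX here: $limit bytes"

if [ "$need" -gt 10240 ]; then
    echo "this list would not fit under a 10240-byte limit"
fi
rm -f "$list"
```

The environment counts against the same limit, so the space actually
available for arguments is somewhat smaller than the reported figure.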

fnf@fishpond.UUCP (Fred Fish) (08/15/88)

In article <1988Aug13.190030.1495@gpu.utcs.toronto.edu> woods@gpu.utcs.Toronto.EDU (Greg Woods) writes:
>Anyone done any work on the PD-tar lately?  It needs multi-volume
>handling, should have the ability to format floppies, and change devices
>inter-volume (ala cpio), before I'll bother using it.

Just a suggestion for anyone making such changes; you might want to consider
also adding device cycling, a feature I just recently put in BRU (Backup and
Restore Utility).

That is, if you give it more than a single "-f <device>" option, it remembers
them and cycles through them in the specified order each time it finds the
end of one volume and needs another.  When it gets to the end of the list
it issues an appropriate prompt and waits for a response.  Thus if you have
four tape drives for example, you can do something like:

	bru -f /dev/rmt0 -f /dev/rmt1 -f /dev/rmt2 -f /dev/rmt3 ...

and go away until your four tapes are done.

-Fred
-- 
# Fred Fish, 1346 West 10th Place, Tempe, AZ 85281,  USA
# noao!nud!fishpond!fnf                   (602) 921-1113

ron@topaz.rutgers.edu (Ron Natalie) (08/15/88)

The reason you can't just do "tar cv `cat filenames`" is that tar will always
search down through directories.  Someone pointed out that Gilmore's PD tar
has a -T option that will allow you to avoid this while reading a file.  If
this is so, I suggest people get a copy.  My cpio-user-interfaced tar is not
really distribution quality but it does work.

-Ron

ok@quintus.uucp (Richard A. O'Keefe) (08/15/88)

In article <7056@conexch.UUCP> root@conexch.UUCP (Larry Dighera) writes:
>This is so simple that it makes me feel like I don't understand the problem.
>If you want tar to take the names of the files it is to put into the archive
>from a file which contains the names of the files, just do this:
>
>	tar cvf /dev/whatever `cat file_of_names`
>
If I have understood correctly, the original problem is a very simple
one: THERE IS A LIMIT ON THE SIZE OF THE COMMAND-LINE ARGUMENTS.
A common figure for this limit is about ten thousand characters
(look for NCARGS in <sys/param.h>).
Now, suppose I want to put 200 files on a tape, each of which has
a (relative) path name amounting to some 100 characters.  OH DEAR.

The problem never was TYPING the file names in the command, the problem
was that if you have a lot of files to archive, the command line just
gets too big to be accepted as a command.  (I have run into this several
times with `echo */*` and the like.)

For many UNIX utilities, this is not a problem, because having many
file names in one command is only a convenience anyway (e.g. *grep,
awk, sed, sometimes wc, ...) and you can use xargs(1) to get the
desired effect -- though that has some weird limits of its own -- but
tar is different.  With some drives you *can't* add to the end of a tape.
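
A small sketch of why xargs does not help with "tar c".  Here -n 2 stands in
for the real size limit so that the split shows up with only five names, and
echo prints the commands instead of running them; the archive name is made up:

```shell
# xargs spreads a long list over several invocations of the same command.
cmds=$(printf '%s\n' a b c d e | xargs -n 2 echo tar cf archive.tar)
echo "$cmds"
```

This prints three separate tar commands.  Run for real, each "tar c" would
start the archive over, so only the last batch of names would survive.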

As someone else pointed out, the answer is to use John Gilmore's PDtar,
which amongst many other neat things has a '-T' option for reading names
from a file.
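
A hedged sketch of that approach, assuming a tar that accepts "-T -" to read
names from standard input (GNU tar does; check the manual page of any other
tar).  The directory and file names are invented for the demonstration:

```shell
# Archive a filtered file list without putting any names on the command line.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub"
echo 'int main(void) { return 0; }' > "$demo/src/main.c"
echo 'object code'                  > "$demo/src/main.o"
echo 'notes'                        > "$demo/src/sub/README"

# -type f keeps directories out of the list, so tar has nothing to recurse into.
(cd "$demo" && find src -type f -print | sed -e '/\.o$/d' | tar -cf out.tar -T -)

listing=$(tar -tf "$demo/out.tar" | sort)
echo "$listing"
rm -rf "$demo"
```

Because the names travel through a pipe rather than the argument list, the
size of the list is not bounded by the exec limit.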

aad@stpstn.UUCP (Anthony A. Datri) (08/16/88)

In article <64026@sun.uucp> pope@vatican (John Pope) writes:

>	tar cf /dev/rst8 `cat save_list` 

>As a side note, SunOS has a handy "X" option to tar, which specifies a
>filename containing files to exclude from the backup:

>	tar cfX /dev/rst8 exclude_list `cat save_list` 

All well and good, but the shell continues to have a command line
length limit.  What I'd like is a *real* backup utility.  Twenex DUMPER
or even VMS BACKUP will let you do the right thing -- "backup all
files in these directories that have changed since this time, and use
more than one tape if you have to."  Dump will use more than one tape,
but acts on filesystems, not directories.  We've got a sun 3/180
here acting as a server to a bunch of machines, and several 3/50's
with local disks.  There's a directory under /usr where user files
are stored on the local disks, and we have to back it up.  rdump
will work for incrementals, but it still doesn't have the ability
to take more than one filesystem per command, so if you want to
put more than one filesystem on a tape, you have to hope that you
have enough tape to fit it.  Doing a weekly full dump, the whole
/usr partition goes out, wasting lots of tape, and requiring someone
to babysit the drive to keep feeding it tapes.  With tar, I can
say "just backup everything in /foo/u and /bar/u" where
/foo and /bar are NFS mounted /usr partitions on the 3/50's with
disks.  But that won't do multiple tapes.

Even HP's tcio will let you tar onto multiple tapes.  It's ugly,
but it works.  What I'd really like to see is for Sun to come out with
a decent backup system, something that Unix lacks, especially in
a Sun environment, where you've got files all over the place with
NFS.  I don't mind getting to the remote things through NFS mounts
on the central machine, but I *would* like to be able to back up
just what I want with one or two commands.  Perhaps a DUMPER
port is in order...:-)


-- 
@disclaimer(Any concepts or opinions above are entirely mine, not those of my
	    employer, my GIGI, or my 11/34)
beak is								  beak is not
Anthony A. Datri,SysAdmin,StepstoneCorporation,stpstn!aad

allbery@ncoast.UUCP (Brandon S. Allbery) (08/16/88)

As quoted from <7056@conexch.UUCP> by root@conexch.UUCP (Larry Dighera):
+---------------
| This is so simple that it makes me feel like I don't understand the problem.
| If you want tar to take the names of the files it is to put into the archive
| from a file which contains the names of the files, just do this:
| 
| 	tar cvf /dev/whatever `cat file_of_names`
| 
| You can generate file_of_names with find just like is normally done with
| cpio.  Ain't UNIX grand?
+---------------

But what happens when your file_of_names is > 5120 characters?  The exec
syscall enforces this limit, you can't code around it.

I suggest PD tar or afio (PD cpio); the latter could be modified
(incompatibly, but it would work) to support larger inode numbers and such.
In fact, I would modify afio for larger dev_t's and ino_t's, then force AT&T
to modify cpio compatibly:  they will have to face both issues sooner or
later, and most likely sooner with STREAMS networking (need *lots* of minor
devices and remote or local filesystems could be big).

++Brandon
-- 
Brandon S. Allbery, uunet!marque!ncoast!allbery			DELPHI: ALLBERY
	    For comp.sources.misc send mail to ncoast!sources-misc

levy@ttrdc.UUCP (Daniel R. Levy) (08/17/88)

In article <1988Aug13.190030.1495@gpu.utcs.toronto.edu>, woods@gpu.utcs.toronto.edu (Greg Woods) writes:
# Anyone done any work on the PD-tar lately?  It needs multi-volume
# handling, should have the ability to format floppies, and change devices
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# inter-volume (ala cpio), before I'll bother using it.

Are you kidding?  Is this not inherently nonportable?
-- 
|------------Dan Levy------------|  THE OPINIONS EXPRESSED HEREIN ARE MINE ONLY
| Bell Labs Area 61 (R.I.P., TTY)|  AND ARE NOT TO BE IMPUTED TO AT&T.
|        Skokie, Illinois        | 
|-----Path:  att!ttbcad!levy-----|

jfh@rpp386.UUCP (The Beach Bum) (08/17/88)

In article <2862@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>In article <1988Aug13.190030.1495@gpu.utcs.toronto.edu>, woods@gpu.utcs.toronto.edu (Greg Woods) writes:
># Anyone done any work on the PD-tar lately?  It needs multi-volume
># handling, should have the ability to format floppies, and change devices
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
># inter-volume (ala cpio), before I'll bother using it.
>
>Are you kidding?  Is this not inherently nonportable?

no, this can be handled (portably, no less) by having an option for a
command to execute prior to starting each volume.  afio has this ability
and i have used it to unload tapes without having to go over to the
tape drive.  it would be trivial to add code to the pd tar to check
the first character of the line typed in response to the 'next volume'
prompt.  if the first character is '!', then execute the rest of the
line as a command.  format comes to mind as a useful one ...
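
a rough sketch of that prompt logic as a shell function.  the name
next_volume and the prompt text are hypothetical; this is not code from afio
or pd tar, only an illustration of the '!' escape:

```shell
# Hypothetical "next volume" prompt with a '!' shell escape.
next_volume() {
    while :; do
        printf 'Insert next volume and press RETURN (or !command): ' >&2
        IFS= read -r reply || return 1     # end of input: give up
        case $reply in
            '!'*)
                cmd=${reply#?}             # drop the leading '!'
                sh -c "$cmd"               # run the rest of the line, then ask again
                ;;
            *)
                return 0                   # anything else: carry on with the new volume
                ;;
        esac
    done
}

# Example: one escaped command, then a plain RETURN.
out=$(printf '!echo formatting floppy\n\n' | next_volume 2>/dev/null)
echo "$out"
```

the prompt goes to stderr so that the output of the escaped command can be
captured or redirected on its own.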
-- 
John F. Haugh II                 +--------- Cute Chocolate Quote ---------
HASA, "S" Division               | "USENET should not be confused with
UUCP:   killer!rpp386!jfh        |  something that matters, like CHOCOLATE"
DOMAIN: jfh@rpp386.uucp          |         -- apologizes to Dennis O'Connor

mouse@mcgill-vision.UUCP (der Mouse) (08/20/88)

In article <2858@ttrdc.UUCP>, levy@ttrdc.UUCP (Daniel R. Levy) writes:
> All this points up a "feature" of tar which I find frustrating:
> [...can't take random filenames except in argument list...].  "cpio"
> circumvents this problem, since I can feed it a list of files, but
> what if I don't WANT to use cpio?

You use a tar that can take filenames from stdin.  Mine can.  I think
Gilmore's can.  (Mine also treats absolute pathnames on the tape
specially, and has other frills.)

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu

mark@iccdev.UUCP (Mark Wutka) (08/27/88)

In article <1249@mcgill-vision.UUCP>, mouse@mcgill-vision.UUCP (der Mouse) writes:
> In article <2858@ttrdc.UUCP>, levy@ttrdc.UUCP (Daniel R. Levy) writes:
> > All this points up a "feature" of tar which I find frustrating:
> > [...can't take random filenames except in argument list...].  "cpio"
> > circumvents this problem, since I can feed it a list of files, but
> > what if I don't WANT to use cpio?
> 
> You use a tar that can take filenames from stdin.  Mine can.  I think
> Gilmore's can.  (Mine also treats absolute pathnames on the tape
> specially, and has other frills.)

If you don't have a tar that will take files from stdin, you can try what
I have used here:

	tar <whatever> `cat listoffiles`

Hopefully this will work right on your shell. You can list the files
one per line if you like. I'm not sure if all the shells do this, but
the one I use - ksh - will concatenate the files into one long line
separated by spaces. Be careful, though, it may bomb if you give it
some enormous number of files. I haven't had a blow-up with several
hundred files, though.
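
A quick way to see what the shell does with the list, sketched with a made-up
list file.  With the default IFS the output of cat is split on any
whitespace, so one name per line works, but a name containing a space becomes
two arguments:

```shell
# Count the arguments produced by `cat list` under default word splitting.
list=$(mktemp)
printf '%s\n' a.c b.c 'my notes.txt' > "$list"

set -- `cat "$list"`
count=$#
echo "$count arguments from 3 lines"
rm -f "$list"
```

Here three lines yield four arguments, because "my notes.txt" is split at the
space.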

If you want to make tar take from stdin, just try:

	tar <whatever> `cat`

This works with other commands, of course. The only one that comes
to my mind right now is "ar". If you want to create a library and can't
really use wildcards you can do it this way.


-- 
...!gatech!ncrats!iccdev!mark

This is what happens when I roll my head on the keyboard:
kijmuhnyjuygikmluhygbtnjkm,l.jhnubgyvfnjmuki,lmnjhbgv

MorsinAc@econ.vu.nl (Triple A) (09/27/88)

My summary says it all...

This is my roll:
sdewxz 0-omk-56l;b vnjbgmh