[comp.unix.questions] How to archive several files with tar?

CSC9017043T%LUST1@LURE.LATROBE.EDU.AU ("B.J SNOOPY, LATROBE UNIVERSITY, AUSTRALIA.") (06/09/91)

Hi,
    I am having a great deal of difficulty trying to tar a group of text
files to a single file.  I have read the manual entries on this, but can't
make any sense of them.  I would greatly appreciate it if someone would
help me out here.

                                           Thanks.

===============================================================================
Put your nose to the Grindstone.     |    B.J. Snoopy, 
    - amalgamated Plastic surgeons   |    Latrobe University, Australia.
      and Toolmakers LTD.            |    CSC9017043T%lust1@lure.latrobe.edu.au
===============================================================================

jc@raven.bu.edu (James Cameron) (06/09/91)

>>>>> On 8 Jun 91 17:21:47 GMT, CSC9017043T%LUST1@LURE.LATROBE.EDU.AU ("B.J SNOOPY, LATROBE UNIVERSITY, AUSTRALIA.") said:

|> Hi,
|>     I am having a great deal of difficulty trying to tar a group of text
|> files to a single file.  I have read the manual entries on this, but can't
|> make any sense of them.  I would greatly appreciate it if someone would
|> help me out here.

|>                                            Thanks.

	Take a look at this:

-jc-  [raven: ~/temp] % touch test{1,2,3,4}
-jc-  [raven: ~/temp] % ls
test1   test2   test3   test4
-jc-  [raven: ~/temp] % tar cvf test.tar .
a ./test1 0 blocks
a ./test2 0 blocks
a ./test3 0 blocks
a ./test4 0 blocks
a ./test.tar 0 blocks
-jc-  [raven: ~/temp] % ls
test.tar        test1           test2           test3           test4
-jc-  [raven: ~/temp] % rm test?
-jc-  [raven: ~/temp] % ls
test.tar
-jc-  [raven: ~/temp] % tar xvf test.tar
x ./test1, 0 bytes, 0 tape blocks
x ./test2, 0 bytes, 0 tape blocks
x ./test3, 0 bytes, 0 tape blocks
x ./test4, 0 bytes, 0 tape blocks
x ./test.tar, 0 bytes, 0 tape blocks
-jc-  [raven: ~/temp] % ls
test.tar        test1           test2           test3           test4
-jc-  [raven: ~/temp] %


	So, what you are looking for is basically:

% tar cvf file.tar .

        to create tape archive of current directory.  
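
For a named group of text files, which is closer to the original
question, a minimal sketch might look like the following.  The names
notes1.txt, notes2.txt and texts.tar are made up for illustration, and
a tar that accepts the traditional c, t and f keys is assumed.

```shell
#!/bin/sh
# Sketch: archive a named group of text files into a single tar file.
# notes1.txt, notes2.txt and texts.tar are hypothetical names.
set -eu

work=$(mktemp -d)                 # scratch directory for the demo
trap 'rm -rf "$work"' EXIT        # remove it again when the script ends
cd "$work"

printf 'first\n'  > notes1.txt
printf 'second\n' > notes2.txt

# c = create, f = write the archive to the named file.
# Only the files listed after the archive name are stored.
tar cf texts.tar notes1.txt notes2.txt

# t = list the table of contents of an existing archive.
listing=$(tar tf texts.tar)
printf '%s\n' "$listing"
```

Because only the listed files are archived, texts.tar itself is never
read as input here.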

Hope that helps!!!

jc

--
					-- James Cameron  (jc@raven.bu.edu)

Signal Processing and Interpretation Lab.  Boston, Mass  (617) 353-2879
------------------------------------------------------------------------------
"But to risk we must, for the greatest hazard in life is to risk nothing.  For
the man or woman who risks nothing, has nothing, does nothing, is nothing."
	(Quote from the eulogy for the late Christa McAuliffe.)

adrianho@barkley.berkeley.edu (Adrian J Ho) (06/10/91)

In article <JC.91Jun8194845@raven.bu.edu> jc@raven.bu.edu (James Cameron) writes:
[sample session deleted]
>	   So, what you are looking for is basically:
>% tar cvf file.tar .
>	   to create tape archive of current directory.  

Make that:

	tar cvf ../file.tar .

The idea is _not_ to put the tar file in the directory you're tar'ing,
because your tar file will otherwise end up being included in itself.
In the worst case, it could be the _last_ file included, which means
that your tar file could end up being _twice_ as large as it's
supposed to be.
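
A small sketch of that advice, assuming a scratch directory made with
mktemp and the made-up names texts, test1 and test2.  The archive is
written to the parent directory, so it cannot appear among its own
input files:

```shell
#!/bin/sh
# Sketch: keep the archive outside the directory being archived.
# The directory and file names are hypothetical.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/texts"
touch "$work/texts/test1" "$work/texts/test2"

cd "$work/texts"
tar cf ../file.tar .              # archive lives one level up

# List what was stored; file.tar should not be among the entries.
listing=$(tar tf ../file.tar)
printf '%s\n' "$listing"
```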

jik@cats.ucsc.edu (Jonathan I. Kamens) (06/10/91)

In article <ADRIANHO.91Jun9105223@barkley.berkeley.edu>, adrianho@barkley.berkeley.edu (Adrian J Ho) writes:
|> The idea is _not_ to put the tar file in the directory you're tar'ing,
|> because your tar file will otherwise end up being included in itself.
|> In the worst case, it could be the _last_ file included, which means
|> that your tar file could end up being _twice_ as large as it's
|> supposed to be.

  Actually, the worst case is that the tar file will continue to grow as tar
writes into it, so that when tar opens that file to archive it, the file will
grow as tar is archiving it, which means that there will be more to archive,
etc., etc. until the disk fills up or the user exceeds his quota or the
maximum file size or whatever.

-- 
Jonathan Kamens					jik@CATS.UCSC.EDU

gwc@root.co.uk (Geoff Clare) (06/12/91)

In <16873@darkstar.ucsc.edu> jik@cats.ucsc.edu (Jonathan I. Kamens) writes:

>In article <ADRIANHO.91Jun9105223@barkley.berkeley.edu>, adrianho@barkley.berkeley.edu (Adrian J Ho) writes:
>|> The idea is _not_ to put the tar file in the directory you're tar'ing,
>|> because your tar file will otherwise end up being included in itself.
>|> In the worst case, it could be the _last_ file included, which means
>|> that your tar file could end up being _twice_ as large as it's
>|> supposed to be.

>  Actually, the worst case is that the tar file will continue to grow as tar
>writes into it, so that when tar opens that file to archive it, the file will
>grow as tar is archiving it, which means that there will be more to archive,
>etc., etc. until the disk fills up or the user exceeds his quota or the
>maximum file size or whatever.

Actually, Jonathan is wrong and Adrian was right.

Although many utilities will fill the disk if made to read their output
file, this is not true of tar (unless you have a broken tar).  In the tar
output format the size of each file is contained in a header which
precedes the file's contents.  Hence the amount of data which tar will
read from its own output file is limited to its size when tar starts
reading it.  If the output file is the last input file, it will
approximately double in size when tar reads it.

Any decent version of tar will warn you if an input file changes in size.
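
To make the header point concrete, here is a hedged sketch that reads
the size field back out of an archive.  It assumes the usual tar header
layout, where the size is stored as octal digits starting at byte
offset 124 of the 512-byte header, and it assumes a tar whose first
archive entry is the named file itself.  The names data.txt and one.tar
are made up.

```shell
#!/bin/sh
# Sketch: show that a file's size is recorded in the header that
# precedes its contents.  data.txt and one.tar are hypothetical names.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

printf 'hello' > data.txt         # exactly 5 bytes, no newline
tar cf one.tar data.txt

# Assumed layout: the size field starts at offset 124 of the header
# and holds the size as octal digits.  Read the first 11 bytes of it.
size_octal=$(dd if=one.tar bs=1 skip=124 count=11 2>/dev/null)

# A leading 0 makes the shell treat the number as octal.
size_bytes=$((0$size_octal))
echo "header says $size_bytes bytes"
```

Since that number is fixed before any contents are written, a tar that
honours it can add at most the archive's size at that moment, plus one
header block and padding, when it reads its own output.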
-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, London, England.   Tel: +44 71 729 3773   Fax: +44 71 729 3273

jik@cats.ucsc.edu (Jonathan I. Kamens) (06/13/91)

In article <2740@root44.co.uk>, gwc@root.co.uk (Geoff Clare) writes:
|> In <16873@darkstar.ucsc.edu> jik@cats.ucsc.edu (Jonathan I. Kamens) writes:
|> >In article <ADRIANHO.91Jun9105223@barkley.berkeley.edu>, adrianho@barkley.berkeley.edu (Adrian J Ho) writes:
|> >|> In the worst case, it could be the _last_ file included, which means
|> >|> that your tar file could end up being _twice_ as large as it's
|> >|> supposed to be.
|> 
|> >  Actually, the worst case is that the tar file will continue to grow as tar
|> >writes into it, so that when tar opens that file to archive it, the file will
|> >grow as tar is archiving it, which means that there will be more to archive,
|> >etc., etc. until the disk fills up or the user exceeds his quota or the
|> >maximum file size or whatever.
|> 
|> Actually, Jonathan is wrong and Adrian was right.

Um, no.

|> Although many utilities will fill the disk if made to read their output
|> file, this is not true of tar (unless you have a broken tar).  In the tar
|> output format the size of each file is contained in a header which
|> precedes the file's contents.  Hence the amount of data which tar will
|> read from its own output file is limited to its size when tar starts
|> reading it.  If the output file is the last input file, it will
|> approximately double in size when tar reads it.
|> 
|> Any decent version of tar will warn you if an input file changes in size.

I have *used* a version of tar which does not do proper version checking, and
which therefore creates a tar archive that fills the disk.  I would not have
mentioned it if I hadn't used it.

Now, perhaps my recollection is wrong, and I didn't actually ever use such a
tar.  But unless you're willing to make the categorical claim that such a tar
does not exist (which you appear to be unwilling to do, since you said "unless
you have a broken tar" and later "Any decent version of tar"), saying that I
am "wrong" is unwarranted.

I was talking about the "worst case," i.e. a poor version of tar.

-- 
Jonathan Kamens					jik@CATS.UCSC.EDU

t891368@otto.bf.rmit.oz.au (Mark) (06/14/91)

jik@cats.ucsc.edu (Jonathan I. Kamens) writes:

>  Actually, the worst case is that the tar file will continue to grow as tar
>writes into it, so that when tar opens that file to archive it, the file will
>grow as tar is archiving it, which means that there will be more to archive,
>etc., etc. until the disk fills up or the user exceeds his quota or the
>maximum file size or whatever.

*grin*

I did that in my early days. Something like this:

-rwx------ mark users  blah   blah   blah    753273651 zzfiles.tar

Fortunately I got bored and killed the job as I thought it had hung.
We have quotas on that machine now :)

gwc@root.co.uk (Geoff Clare) (06/14/91)

In article <2740@root44.co.uk>, I wrote:
> Although many utilities will fill the disk if made to read their output
> file, this is not true of tar (unless you have a broken tar).  In the tar
> output format the size of each file is contained in a header which
> precedes the file's contents.  Hence the amount of data which tar will
> read from its own output file is limited to its size when tar starts
> reading it.  If the output file is the last input file, it will
> approximately double in size when tar reads it.
> 
> Any decent version of tar will warn you if an input file changes in size.

In <16978@darkstar.ucsc.edu> jik@cats.ucsc.edu (Jonathan I. Kamens) writes:

>I have *used* a version of tar which does not do proper version checking, and
>which therefore creates a tar archive that fills the disk.  I would not have
>mentioned it if I hadn't used it.

Jonathan has completely missed my main point and picked up on an aside.
It is not "version checking" that stops tar from filling the disk, it is
the fundamental fact that the tar output format contains each archived
file's size before its contents.  Once tar has written out the size of
the file in a header block, it must write exactly that many bytes of data
as the file's contents, even if the file changes size.  Otherwise the
output tar archive will be corrupt and unusable.

My comment that "any decent version of tar will warn you if an input
file changes in size" was an afterthought, intended to reassure people
who deduced (correctly) that, since tar is forced to write the number of
bytes it said it was going to write rather than the number of bytes it
is able to read from the file, if a file changes in size the archived
data may not correspond to either the old or the new version of the
file, but something in between.

>Now, perhaps my recollection is wrong, and I didn't actually ever use such a
>tar.  But unless you're willing to make the categorical claim that such a tar
>does not exist (which you appear to be unwilling to do, since you said "unless
>you have a broken tar" and later "Any decent version of tar"), saying that I
>am "wrong" is unwarranted.

>I was talking about the "worst case," i.e. a poor version of tar.

I claim categorically that such a tar is broken.

-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, London, England.   Tel: +44 71 729 3773   Fax: +44 71 729 3273

jik@cats.ucsc.edu (Jonathan I. Kamens) (06/18/91)

In article <2742@root44.co.uk>, gwc@root.co.uk (Geoff Clare) writes:
|> In article <2740@root44.co.uk>, I wrote:
|> > Any decent version of tar will warn you if an input file changes in size.
|> 
|> In <16978@darkstar.ucsc.edu> jik@cats.ucsc.edu (Jonathan I. Kamens) writes:
|> 
|> >I have *used* a version of tar which does not do proper version checking, and
|> >which therefore creates a tar archive that fills the disk.  I would not have
|> >mentioned it if I hadn't used it.
|> 
|> Jonathan has completely missed my main point and picked up on an aside.

No, I got your main point.  But you missed mine, which I guess makes us even.

|> It is not "version checking" that stops tar from filling the disk, it is
|> the fundamental fact that the tar output format contains each archived
|> file's size before its contents.  Once tar has written out the size of
|> the file in a header block, it must write exactly that many bytes of data
|> as the file's contents, even if the file changes size.  Otherwise the
|> output tar archive will be corrupt and unusable.

A GOOD VERSION OF TAR will check the number of bytes it has written, and stop
when it has reached the number of bytes recorded in the header preceding the
file.  My point in my last two postings in this thread is that there are BAD
VERSIONS OF TAR that don't do that.  They stat the file, record its size in
the tar file, and then open the file and read it until EOF and write its
contents into the tar file.
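
The difference between the two behaviours can be sketched with ordinary
shell tools.  This is only a simulation under stated assumptions, not
code from any real tar: the file is grown once by hand after its size
has been recorded, and input.dat, bounded.out and to_eof.out are
made-up names.

```shell
#!/bin/sh
# Sketch: copy bounded by a recorded size versus copy until EOF.
# This imitates the two strategies; it is not taken from any tar source.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

printf 'abcdef' > input.dat
# Size as it would be recorded in the header (stat step).
recorded=$(( $(wc -c < input.dat) ))

# The file grows after its size was recorded.
printf 'GROWTH' >> input.dat

# Careful copy: stop after the recorded number of bytes.
dd if=input.dat of=bounded.out bs=1 count="$recorded" 2>/dev/null

# Careless copy: read until end of file, ignoring the recorded size.
cat input.dat > to_eof.out

bounded_len=$(( $(wc -c < bounded.out) ))
eof_len=$(( $(wc -c < to_eof.out) ))
echo "recorded=$recorded bounded=$bounded_len to_eof=$eof_len"
```

In the self-inclusion case the growth is caused by the copy itself, so
the read-until-EOF strategy never reaches the end, while the bounded
copy stops at the recorded size.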

|> >I was talking about the "worst case," i.e. a poor version of tar.
|> 
|> I claim categorically that such a tar is broken.

Yes, of course it's broken.  I never claimed otherwise.  My point in my
previous two postings, and in this one, is that SUCH BROKEN VERSIONS EXIST,
and that, therefore, the worst thing that can happen to you if you try to add
a tar archive to itself is that the file will grow until it can't grow any
further.

Got it?

-- 
Jonathan Kamens					jik@CATS.UCSC.EDU

mouse@thunder.mcrcim.mcgill.edu (der Mouse) (06/20/91)

In article <2742@root44.co.uk>, gwc@root.co.uk (Geoff Clare) writes:
> Once tar has written out the size of the file in a header block, it
> must write exactly that many bytes of data as the file's contents,
> even if the file changes size.  Otherwise the output tar archive will
> be corrupt and unusable.

Not quite.  Note that the tar canister will still be usable provided
only that the expected number of data blocks are present.  It would not
generate broken canisters - though I must agree it would be bad
practice - for a tar to silently ignore file size changes which don't
change the number of data blocks occupied by the file.

("data blocks" here refers to tar blocks (of 512 bytes), not anything
to do with filesystem blocks.)
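
As a worked example of the block arithmetic, assuming only the 512-byte
tar block size mentioned above (the helper name tar_blocks is made up):

```shell
#!/bin/sh
# tar_blocks SIZE: number of 512-byte tar data blocks needed to hold
# SIZE bytes of file contents (hypothetical helper, rounds up).
tar_blocks() {
    echo $(( ($1 + 511) / 512 ))
}

for size in 0 1 512 513 1024; do
    printf '%s bytes -> %s block(s)\n' "$size" "$(tar_blocks "$size")"
done
```

A file that grows from 1 byte to 512 bytes still occupies one data
block, which is the kind of size change that leaves the block count,
and so the canister's structure, untouched.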

>> I was talking about the "worst case," i.e. a poor version of tar.
> I claim categorically that such a tar is broken.

I agree.  But *I* have certainly come up against plenty of broken
software; haven't *you*?

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu