[comp.unix.wizards] GNU-tar vs dump

nvt9001@BELVOIR-EMH3.ARMY.MIL (DoN Nichols) (01/03/89)

	Can anyone out there in wizard-land tell me any reason why I
should continue to use dump(1) for system backups given the capabilities
in the latest release of tar(1) from GNU?
	The GNU tar allows multi-volume archives, incremental backups,
restores to full permissions (ignoring umask), and limiting an archive
to a single filesystem; it can also be told to ignore a set of named
files (with wildcards), allowing 'core' files and the backups left by
editors (such as '#*' and '*~') to be dropped.
	It also can use the maximum blocking factor that my drive will
handle (b40 - 20KB) and my dump(1) is fixed at b10 (5KB).  This is a
wear consideration on my drive (a Cipher F880), which, as a streaming
drive, has to do a lot of extra motor work to emulate a start-stop
drive.  The tar even seems fast enough so that the drive seldom has to
do the start-stop emulation, while dump(1) does a lot of it.
	Another consideration is that I cannot persuade dump(1) to
restore the root file system when I boot from a floppy. (It complains of
'out of swap space').  The tar is perfectly happy to do a restore onto
the root file system when booted from a floppy.
	Don't tell me to go to the vendor and complain.  This is a
personal system obtained from a hamfest + a lot of detective work to
restore it to complete status.  The manufacturer seems to be
non-existent (CMS Industries - the computer is a CMS-16/UNX).  The
people who did the port of UNIX to the system (a Version 7 with some
Berkeley enhancements), when contacted, said "We don't even have any
backup tapes that old!"  (The port was done in 1982.)
	Replies to nvt9001@belvoir-emh3.army.mil (or
nvt9001@belvoir-mail1.arpa) and I will summarize to the net if others
are interested.
					Thanks in Advance
					Don Nichols
					nvt9001@belvoir-emh3.army.mil

disclaimer: This posting does not reflect the policy of the U.S.
	Government, or any part thereof.  The system in question is not
	the one from which this posting is being made.  (Mine is so old
	that it has the driver for the ACU compiled into the uucp, and I
	have great difficulty persuading it to initiate a call with a normal
	modem (Hayes) or an AT&T Penril ACU, so I am not connected
	directly to the net, even if I could find a feed.  Pointers to a
	public-domain implementation of a complete uucp would also be helpful.)

bzs@Encore.COM (Barry Shein) (01/03/89)

From: nvt9001@BELVOIR-EMH3.ARMY.MIL (DoN Nichols)
>	Can anyone out there in wizard-land tell me any reason why I
>should continue to use dump(1) for system backups given the capabilities
>in the latest release of tar(1) from GNU?

I haven't really looked at gnutar (it sounds very good from your
description so I'll go look at it, thanks :-) but one thing I would
look out for is whether or not DELETIONS are saved on backups,
particularly incrementals.

What this means in practice is that if a backup procedure doesn't
back up deletions, then files deleted between a full and an incremental,
or between two incrementals, will reappear when the file system is
restored.  Another way of stating this is to ask whether the state of
the directory's contents is restored or just added to.

The worst effect, for most folks, is when this pushes the partition
over 100% and you can't complete the restore; it just bombs out with a
full disk, very frustrating, no easy fix (you hunt around, make space,
then try to reapply the incremental which failed).

Another effect would be the reappearance of files you deleted for
security reasons (e.g. sensitive data) after a restore; obviously this
is only worrisome to some people.

If it's a time-sharing system, of course, your worst problem will be
when your phone rings off the hook with users wondering why deleted
files reappeared this morning.  Depending on how much you like to hear
from them from time to time, this may or may not be a problem.

There are ways to avoid this using tar; nothing magic is needed, just
a shell script and some imagination (i.e. put a list of directory
contents onto the tape and start by executing the deletions).
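
For concreteness, here is a minimal C sketch of that restore-side
deletion pass; the manifest format (one name per line, saved at dump
time) and all names in it are illustrative assumptions, not code from
any real backup suite:

	/*
	 * replay-deletions: remove anything in a directory that was
	 * not present when the manifest was written.  Plain files
	 * only; a real tool would recurse and handle directories.
	 */
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <dirent.h>
	#include <unistd.h>

	#define MAXFILES 4096

	static char *saved[MAXFILES];
	static int nsaved;

	static int in_manifest(const char *name)
	{
		int i;

		for (i = 0; i < nsaved; i++)
			if (strcmp(saved[i], name) == 0)
				return 1;
		return 0;
	}

	int main(int argc, char **argv)
	{
		FILE *fp;
		DIR *dp;
		struct dirent *de;
		char line[1024];

		if (argc != 3) {
			fprintf(stderr, "usage: %s manifest dir\n", argv[0]);
			return 1;
		}
		if ((fp = fopen(argv[1], "r")) == NULL)
			return 1;
		while (nsaved < MAXFILES && fgets(line, sizeof line, fp) != NULL) {
			line[strcspn(line, "\n")] = '\0';	/* strip newline */
			saved[nsaved++] = strdup(line);
		}
		fclose(fp);

		/* anything here now that wasn't listed then gets removed */
		if (chdir(argv[2]) < 0 || (dp = opendir(".")) == NULL)
			return 1;
		while ((de = readdir(dp)) != NULL) {
			if (strcmp(de->d_name, ".") == 0 ||
			    strcmp(de->d_name, "..") == 0)
				continue;
			if (!in_manifest(de->d_name))
				unlink(de->d_name);
		}
		closedir(dp);
		return 0;
	}

Run against each restored directory before applying the incremental's
contents, this keeps deleted files deleted.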

	-Barry Shein, ||Encore||

dg@lakart.UUCP (David Goodenough) (01/04/89)

From article <17999@adm.BRL.MIL>, by nvt9001@BELVOIR-EMH3.ARMY.MIL (DoN Nichols):
> 
> 	Can anyone out there in wizard-land tell me any reason why I
> should continue to use dump(1) for system backups given the capabilities
> in the latest release of tar(1) from GNU?

Does it do interactive restores? (I don't know - I've not seen the docs
on GNU tar, so I'm asking out of genuine curiosity, not sarcasm)
-- 
	dg@lakart.UUCP - David Goodenough		+---+
							| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@xait.xerox.com		  	  +---+

Rudy.Nedved@rudy.fac.cs.cmu.edu (01/04/89)

The goal of dump is to restore the system to the state it was in at the
time of the dump.  The goal of tar is to archive information for later
retrieval.  These goals overlap, but they are not identical.

Problem 1: tar does not deal with empty data blocks in a file. It asks
the system for the block and the system gives it a block of zeroes. When
you restore a disk that was very full from backup...you will end up
using more disk space than was actually there since the null blocks will
be written out. This can be partially solved by having tar read the
blocked or raw disk device file but that means it must be system dependent.

Problem 2: I don't know about gnu-tar, but standard tar has limits on
filename length.  If you have long file names on your system, you may
lose those files on restore, or at least have the names truncated.

Problem 3: In general, tar is system independent for good reasons, so
it is possible you may lose information critical to the complete
restore of your disks.  Maybe gnu tar doesn't have this problem.

I am curious.

-Rudy

egisin@mks.UUCP (Eric Gisin) (01/04/89)

One of the potential problems with using tar or cpio for backups
is that a sparse file (one with unallocated blocks)
that uses little disk space will use more space in the backup.
A sparse file can be created by a buggy program or by a user:
	$ adb -w big
	0t100000000?w0
	^D
	$ ls -ls big
	  24 -rw-r--r--  1 egisin   100000004 Jan  3 17:59 big
This file uses 24K on the BSD filesystem, and about 100M in a tar backup.
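
The same experiment can be done in C; the sketch below assumes a
4BSD-style stat(2) with an st_blocks field (V7 lacks one).  The
size/blocks comparison it prints is also a cheap way for a backup
program to guess that a file is sparse without reading the raw
filesystem:

	/*
	 * make-big: create a ~100MB sparse file holding 4 real bytes.
	 * Seeking past EOF and writing allocates only the last block,
	 * so st_size and st_blocks disagree wildly.
	 */
	#include <stdio.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/stat.h>

	int main(void)
	{
		int fd = open("big", O_WRONLY | O_CREAT | O_TRUNC, 0644);
		struct stat st;

		lseek(fd, 100000000L, SEEK_SET);	/* leave a ~100MB hole */
		write(fd, "\0\0\0\0", 4);		/* the only real data */
		fstat(fd, &st);
		printf("st_size = %ld, st_blocks*512 = %ld\n",
		    (long)st.st_size, (long)st.st_blocks * 512);
		close(fd);
		return 0;
	}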

jbs@fenchurch.mit.edu (Jeff Siegal) (01/04/89)

In article <18008@adm.BRL.MIL> Rudy.Nedved@rudy.fac.cs.cmu.edu writes:
>Problem 1: tar does not deal with empty data blocks in a file. 
>[...]This can be partially solved by having tar read the
>blocked or raw disk device file but that means it must be system dependent.

You can just have tar -x look at the file data from the archive, and
if it contains a string of NULs, have it avoid writing the nulls to
the extracted file (using fseek to skip the appropriate number of
bytes).

This *might* even make it execute faster (I would guess not, though).
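
A sketch of that extraction loop (illustrative code, not from any real
tar; it assumes ftruncate(2), which is 4.2BSD):

	/*
	 * Copy 'size' bytes of file data from the archive stream to
	 * fd, seeking instead of writing whenever a whole block is
	 * NULs, so the filesystem leaves a hole.
	 */
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	#define BLKSZ 512

	void extract_sparse(FILE *archive, int fd, long size)
	{
		static const char zeros[BLKSZ];	/* all NULs */
		char buf[BLKSZ];
		long done = 0;

		while (done < size) {
			int want = (size - done > BLKSZ) ? BLKSZ : (int)(size - done);
			int n = fread(buf, 1, want, archive);

			if (n <= 0)
				break;
			if (n == BLKSZ && memcmp(buf, zeros, BLKSZ) == 0)
				lseek(fd, (long)n, SEEK_CUR);	/* leave a hole */
			else
				write(fd, buf, n);
			done += n;
		}
		ftruncate(fd, size);	/* set length if the file ends in a hole */
	}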

Jeff Siegal

andrew@alice.UUCP (Andrew Hume) (01/06/89)

this happens in real life with dbm databases (and probably all
hashed databases) which are huge and sparse.

bzs@Encore.COM (Barry Shein) (01/07/89)

Another limitation of using tar (which, again, I don't know if gnutar
attacked) is restoring device entries. This isn't always a problem
since you usually got a working /dev/ from somewhere to start the
restore but if there are other device entries which are normally
dumped/restored this could be a consideration.

It can be handled with a simple shell script to dump and restore these
(you could create it and put it to tape automatically): basically a big
find, and then an awk postprocessor to turn the listing into a shell
script that does the necessary mknod's if needed later.  Save that to
tape; 99% of it is:

	find /filesys \( -type c -o -type b \) -exec ls -l '{}' ';' | \
		awk -f todev > outfile ; chmod u+x outfile

where todev is:

BEGIN {
	print "#!/bin/sh"
}
{
	# ls -l output here has no group field: $1 = mode string,
	# $4 = major number with a trailing comma, $5 = minor, $9 = name.
	# mknod wants the name first: mknod name [c|b] major minor
	print "mknod " $9 " " substr($1, 1, 1) " " \
		substr($4, 1, length($4) - 1) " " $5
}

(ok, that was gratuitous fun...but it was clean fun.)

	-Barry Shein, ||Encore||

fnf@estinc.UUCP (Fred Fish) (01/08/89)

In article <629@mks.UUCP> egisin@mks.UUCP (Eric Gisin) writes:
>One of the potential problems with using tar or cpio for backups
>is that a sparse file (one with unallocated blocks)
>that uses little disk space will use more space in the backup.
> (example deleted)
>	  24 -rw-r--r--  1 egisin   100000004 Jan  3 17:59 big
>This file uses 24K on the BSD filesystem, and about 100M in a tar backup.

The problem with archive space consumed can be eliminated by compressing
(LZW) sparse files during the archive process.  This can be done totally
transparently to the user. The example 100Mb file compresses to about 200Kb.

During extraction, each block can be tested for the case of a block of
null bytes (after decompression), and seeks used to recreate the hole.
This test/seek is actually faster in practice than writing blocks of null
bytes.  I believe this is also independently confirmed by someone who
posted their results of modifying "cp" to create sparse files.

BRU (Backup and Restore Utility) uses both of these techniques, so this is
not just speculation.  A complete filesystem save and restore often
results in additional free space recovered from files that could be
sparse, but weren't.

-Fred
-- 
# Fred Fish, 1346 West 10th Place, Tempe, AZ 85281,  USA
# asuvax!nud!fishpond!estinc!fnf          (602) 921-1113

glenn@eecg.toronto.edu (Glenn Mackintosh) (01/08/89)

In article <4601@xenna.Encore.COM> bzs@Encore.COM (Barry Shein) writes:
>
>Another limitation of using tar (which, again, I don't know if gnutar
>attacked) is restoring device entries. This isn't always a problem
>since you usually got a working /dev/ from somewhere to start the
>restore but if there are other device entries which are normally
>dumped/restored this could be a consideration.

Well, looking at the doc on gnu tar I note that it stores the major and
minor device numbers as well as the file type. I would infer from this that
it will regenerate the devices properly but I haven't tried it.

While I am at it I might as well answer some other questions that people
have asked. 

It has special options for doing incremental backups. On creation it will
put an entry for each directory that it works on, with a list of all files
that were in the directory and a flag indicating whether or not the file
was put into the tarfile. On extraction it can be made to remove files
which it does not find in this directory list (the assumption being that
the file was deleted). These can be useful when rebuilding a corrupted
filesystem from incremental backups.

The version I have does not do anything special about large blocks of
zeros in a file (as would result from unallocated blocks in the file),
but a modification was posted to one of the gnu newsgroups which causes
it not to allocate blocks for them when they are extracted. This means
that a restored database file would not take any more space than the
original did. However, the tarfile itself will still contain these
potentially large blocks of useless information.  Since you can get it
to make multi-volume archives, you could actually try to put all this
on tape, even though it might cross one or even several tape boundaries
(and could take up quite a bit of time and tape in the process).

I don't see what they could do to get around this with the current
tarfile format, since the file record is just one large character block.
They could potentially add some magic field to indicate that the file
has holes in it, and put before the contents of the file another record
which is a list of offsets and sizes indicating how to rebuild the file.
This would be a fairly major incompatibility with older versions,
though. Also, it would mean that the file's inode and block list would
have to be scanned beforehand to find out if the file contained
unallocated blocks, in order to decide whether it needed this new format
(or it could just do it for every file, but this seems unnecessary).
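
Purely as a hypothetical illustration, such a record might look like
this (it is not any existing tar header format):

	/*
	 * A map of the real data regions, written ahead of the file
	 * contents; everything between the listed regions is a hole,
	 * recreated with a seek on extraction.
	 */
	struct sparse_region {
		long offset;		/* where a run of real data starts */
		long numbytes;		/* how much real data follows */
	};

	struct sparse_map {
		long nregions;			/* entries in region[] */
		struct sparse_region region[1];	/* really nregions entries */
	};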

   Glenn Mackintosh
   University of Toronto

-------------------------------------------------------------------------------
Include standard disclaimers here.

CSNET:  glenn@eecg.toronto.edu
ARPA:   glenn%eecg.toronto.edu@relay.cs.net
UUCP:   UUNET!ai.toronto.edu!eecg.toronto.edu!glenn
CDNNET: glenn@eecg.toronto.cdn
BITNET: glenn@eecg.utoronto.bitnet (may not work from all sites)

les@chinet.chi.il.us (Leslie Mikesell) (01/08/89)

In article <4601@xenna.Encore.COM> bzs@Encore.COM (Barry Shein) writes:
>
>Another limitation of using tar (which, again, I don't know if gnutar
>attacked) is restoring device entries. This isn't always a problem
>since you usually got a working /dev/ from somewhere to start the
>restore but if there are other device entries which are normally
>dumped/restored this could be a consideration.

GNU tar knows how to mknod() device and FIFO entries (haven't tried it
but the code is there).  It can also save a special entry containing
all of the directory entries at the time of an incremental and remove
anything that isn't supposed to be there when it is restored. Also, it
can be told to stay within a filesystem. It looks like the only thing missing
is the ability to reset the ctime of restored files.  
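
Restoring a device entry from archive header fields is essentially a
one-liner around mknod(2); the sketch below is illustrative, not GNU
tar's actual code, and has to run as root:

	/*
	 * Recreate a device node from the type, permission bits, and
	 * major/minor numbers saved in an archive header.  makedev()
	 * is a macro in <sys/types.h> on systems of this vintage.
	 */
	#include <sys/types.h>
	#include <sys/stat.h>

	int restore_device(char *path, int is_block, int perms, int maj, int min)
	{
		int mode = (is_block ? S_IFBLK : S_IFCHR) | (perms & 07777);

		return mknod(path, mode, makedev(maj, min));
	}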

Les Mikesell

jfh@rpp386.Dallas.TX.US (John F. Haugh II) (01/09/89)

In article <11@estinc.UUCP> fnf@estinc.UUCP (Fred Fish) writes:
>The problem with archive space consumed can be eliminated by compressing
>(LZW) sparse files during the archive process.  This can be done totally
>transparently to the user. The example 100Mb file compresses to about 200Kb.

This problem and others can be solved by telling GNU-tar about the file
system.  There is no reason a system utility shouldn't be aware of the
system layout.

How many CPU years are going to be wasted LZW'ing all those sparse blocks
when a little file system knowledge would have saved us all that grief?
Write the code once and be done with it.
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"Now with 12 percent less fat than
InterNet: jfh@rpp386.Dallas.TX.US       | last years model ..."
UucpNet : <backbone>!killer!rpp386!jfh  +--------------------------------------

scs@adam.pika.mit.edu (Steve Summit) (01/09/89)

In article <10797@rpp386.Dallas.TX.US> jfh@rpp386.Dallas.TX.US (John F. Haugh II) writes:
(with respect to compressing "empty" blocks of zeroes)
>This problem and others can be solved by telling GNU-tar about the file
>system.  There is no reason a system utility shouldn't be aware of the
>system layout.
>How many CPU years are going to be wasted LZW'ing all those sparse blocks
>when a little file system knowledge would have saved us all that grief?

How many person years have been and will be wasted attempting to
port programs which ought to be portable but which contain
gratuitous system dependencies?  Tar can be written portably;
every attempt should be made to do so.  It has already been
asserted (and I'm inclined to believe it) that the time spent
looking for zeroes to compress is inconsequential, particularly
in an I/O intensive program such as tar.

A good example of the same problem can be found in diff: a nice,
simple text file utility which ought to be maximally portable,
and is an especially attractive porting target because nothing
like it exists on lesser systems such as VMS and MS-DOS.  Yet
part of its algorithm for distinguishing between text and binary
files involves reading a struct exec from the beginning of the
file and checking for magic numbers, which requires #including
the (very Unix-specific) <a.out.h>.  Doing so is in fact
pointless because the algorithm then goes on (in the absence of a
valid magic number) to look through the beginning of the file for
nonprinting characters, which a.out files are virtually certain
to contain.

Machine- or system-dependent code should be written only as a
last resort, when the need is clear and dire, when no portable
way of writing it can be found, and then only in utilities which
"have a right" to contain such dependencies (adb, fsck, etc.).
Tar is a file interchange program; you'll likely want to get it
working on another system some day so you can transfer things.

(Of course, non-essential system-dependent code, such as a Unix
filesystem empty block check, or diff's magic number detection,
could be surrounded with appropriate #ifdefs.  Unfortunately, it
rarely is, which leaves the eventual porter, if he isn't
experienced and isn't the author, quite uncertain as to how to
proceed, and liable to drop the project.  In the case of the
proposed filesystem knowledge for tar, an #ifdef unix wouldn't
even help, because Unix filesystem formats have been known to
change, and they can't even be assumed to be consistent on one
system any more, given the existence of file system switches and
remote file systems.  Why commit tar to all of these problems?)
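
A minimal illustration of that kind of guard (all names here are made
up for the example):

	/*
	 * Guess whether a file is sparse where the system lets us; the
	 * system-dependent check is isolated and the portable fallback
	 * is explicit, so a porter can see at a glance what to do.
	 */
	#include <sys/types.h>
	#include <sys/stat.h>

	int probably_sparse(struct stat *st)
	{
	#ifdef HAVE_ST_BLOCKS	/* e.g. 4BSD; V7 has no st_blocks */
		return st->st_size > (long)st->st_blocks * 512;
	#else
		return 0;	/* no portable way to tell; assume dense */
	#endif
	}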

>Write the code once and be done with it.

Indeed.

                                            Steve Summit
                                            scs@adam.pika.mit.edu

m5@lynx.uucp (Mike McNally) (01/10/89)

In article <4601@xenna.Encore.COM> bzs@Encore.COM (Barry Shein) writes:
>
>Another limitation of using tar (which, again, I don't know if gnutar
>attacked) is restoring device entries. 

POSIX tar format has provisions for this.  It also has provisions for longer
file names (not MAXPATHLEN, but longer than 100 bytes).

Does GNU tar have a mechanism for creating an artificial hierarchy on the
tape?  In our tar, we added a switch like -C that specifies a prefix for
each following file name.  This is very convenient (not for backup; for
file transfer).

-- 
Mike McNally                                    Lynx Real-Time Systems
uucp: {voder,athsys}!lynx!m5                    phone: 408 370 2233

            Where equal mind and contest equal, go.

guy@auspex.UUCP (Guy Harris) (01/10/89)

 >It looks like the only thing missing is the ability to reset the ctime
 >of restored files.  

And how is one to do that, pray tell?  Hint: "utime" is not the answer. 
Neither is "utimes", for those of you who have it....

fnf@estinc.UUCP (Fred Fish) (01/10/89)

In article <10797@rpp386.Dallas.TX.US> jfh@rpp386.Dallas.TX.US (John F. Haugh II) writes:
>This problem and others can be solved by telling GNU-tar about the file
>system.  There is no reason a system utility shouldn't be aware of the
>system layout.

I believe there is: portability.

>How many CPU years are going to be wasted LZW'ing all those sparse blocks
>when a little file system knowledge would have saved us all that grief?
>Write the code once and be done with it.

And then rewrite it every time the filesystem changes.  I would rather
take the CPU hit than support 10 or 15 different filesystem layouts, some
of them completely unrelated to UNIX.

-Fred
-- 
# Fred Fish, 1835 E. Belmont Drive, Tempe, AZ 85284,  USA
# asuvax!nud!estinc!fnf

les@chinet.chi.il.us (Leslie Mikesell) (01/11/89)

In article <5176@lynx.UUCP> m5@lynx.UUCP (Mike McNally) writes:

>Does GNU tar have a mechanism for creating an artificial hierarchy on the
>tape?  In our tar, we added a switch like -C that specifies a prefix for
>each following file name.  This is very convenient (not for backup; for
>file transfer).
>
It has the reverse: there is a command to change directories that you
can intersperse with the file names.  This makes it possible to make it
look like files from distant directories have the same parent.
To create an artificial hierarchy on a tape, the "prefix" directories
would have to exist, though not necessarily in the locations apparent
on the tape.  Leading '/'s are stripped (unconditionally) on both
saves and restores.  I'd prefer an option, perhaps with the ability
to specify how many leading directory names to strip.

Les Mikesell

felix@netmbx.UUCP (Felix Gaehtgens) (01/12/89)

everybody's talking about GNU-tar and i'm convinced! it's (from what i've
read) much better than tar, dump, restore, cpio, etc.

but how about posting it in comp.sources.unix, comp.sources.misc or so? then
we all can have a copy to evaluate.

so long,
			felix
-- 
BANG: ..pyramid!tub!netmbx!xaos!felix	SMART: felix@xaos.sub
>In as much as a number of excellent hallucinogens already exist, I would not
>reccomend going to the trouble of extracting the adrenal gland from living
>humans.  I personally also have ethical problems with this procedure ....

ajudge@maths.tcd.ie (Alan Judge) (01/14/89)

In article <1966@netmbx.UUCP> felix@netmbx.UUCP (Felix Gaehtgens) writes:
>everybody's talking about GNU-tar and i'm convinced! it's (from what i've
>read) much better than tar, dump, restore, cpio, etc.
>
>but how about posting it in comp.sources.unix, comp.sources.misc or so? then
>we all can have a copy to evaluate.
I agree, perhaps comp.sources.misc would be better given the current lead time
on comp.sources.unix.
--
Alan Judge, Dept. of Maths., Trinity College, Dublin, Ireland.
ajudge@maths.tcd.ie		Stupid Mailers: ...!uunet!maths.tcd.ie!ajudge

weening@Gang-of-Four.Stanford.EDU (Joe Weening) (01/14/89)

One thing that worries me about using tar for dumps is that it updates
the access time of all the files that it reads.  Is there any way to
avoid this?
-- 

Joe Weening                                Computer Science Dept.
weening@Gang-of-Four.Stanford.EDU          Stanford University

guy@auspex.UUCP (Guy Harris) (01/14/89)

>One thing that worries me about using tar for dumps is that it updates
>the access time of all the files that it reads.  Is there any way to
>avoid this?

Not without changing "tar" to do an "fstat" before it starts reading
to get the current accessed and modified times, and a "utime(s)" call
to set the accessed time back to its old value (and the modified time
to its current value - you have to set both) after it finishes reading.
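
In code, the change would look roughly like this sketch (assuming a
System V-style utime(2) taking a struct utimbuf; V7's utime took a
time_t array instead; error checks omitted):

	/*
	 * Archive a file, then put its access time back.  Note that
	 * putting the times back is itself an inode change, so ctime
	 * still advances -- which is exactly the trap discussed later
	 * in this thread.
	 */
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <utime.h>

	void archive_preserving_atime(char *path)
	{
		struct stat st;
		struct utimbuf ut;

		stat(path, &st);		/* grab the times before reading */
		/* ... open the file, read it, write it to the archive ... */
		ut.actime = st.st_atime;	/* old access time back */
		ut.modtime = st.st_mtime;	/* mtime must be supplied too */
		utime(path, &ut);
	}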

fnf@estinc.UUCP (Fred Fish) (01/15/89)

In article <6099@polya.Stanford.EDU> weening@Gang-of-Four.Stanford.EDU (Joe Weening) writes:
>One thing that worries me about using tar for dumps is that it updates
>the access time of all the files that it reads.  Is there any way to
>avoid this?

Yes, it is possible, but it has other side effects that are undesirable.
One of the unavoidable warts of UNIX backup utilities that operate through
the normal user interface to the filesystem (as opposed to filesystem
specific utilities like "dump" that go directly to the raw filesystem)
is that it is impossible to independently set (backdate) all of the
times associated with a particular file.  Below is one of the "design
notes" files I include in the bru source distribution:

#include notes/ctime...

For several years, bru only looked at the mtime field in the stat(2) structure
to decide if a file had been modified.  This worked fine for files which had
their contents modified, but did nothing for files that had only their
attributes changed, such as with chmod(1), chown(1), etc.  Since the ctime
field was ignored, we could use the utime(2) system call to reset the
atime field whenever we merely read a file to back it up.  This allowed
us to preserve ALL the time fields except for ctime.

However, it was finally decided that ignoring ctime was the wrong thing to
do, and that bru should look at the more recent of mtime or ctime to
determine the "modification date".  Thus if either the file or its
attributes changed, the file would be backed up.  In the case of only the
attributes changing, backing up the entire file is overkill, but not
harmful.  Although some files without contents are stored in bru archives
(such as directories and FIFOs), it is not possible to store just the
attributes of a normal file using the current format.  This was a design
oversight.

No longer ignoring ctime has one mandatory side effect on the way bru
operates.  We can no longer use utime(2) to reset the atime and mtime
fields after merely reading a file.  In effect, the "-a" command line
option is now permanently set, as the calls to the routine that reset
these times have been removed from the bru source (although the routine
itself has been kept around temporarily, just ifdef'd out).

The original idea behind preserving atime was to allow migration of files
to offline storage if they had not been accessed in any "meaningful" way
recently.  Simply backing up the system was not considered to be a
meaningful access, so the atime field was preserved.  In actuality,
no program that the author currently knows about performs migration or
deletion based on the atime field.  In fact, the filesystem management
system currently under development by the author of bru will ignore
the atime field also.  It will base its file migration or deletion
actions on other criteria.

#end notes/ctime

-Fred
-- 
# Fred Fish, 1835 E. Belmont Drive, Tempe, AZ 85284,  USA
# asuvax!nud!estinc!fnf

jfh@rpp386.Dallas.TX.US (John F. Haugh II) (01/15/89)

In article <6099@polya.Stanford.EDU> weening@Gang-of-Four.Stanford.EDU (Joe Weening) writes:
>One thing that worries me about using tar for dumps is that it updates
>the access time of all the files that it reads.  Is there any way to
>avoid this?

Yup, restore the old access and modified times after scribbling the
file to tape.  That's why you have the source code ...
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"UNIX doesn't have bugs,
InterNet: jfh@rpp386.Dallas.TX.US       |         UNIX is a bug."
UucpNet : <backbone>!killer!rpp386!jfh  +--------------------------------------

kre@cs.mu.oz.au (Robert Elz) (01/15/89)

> >One thing that worries me about using tar for dumps is that it updates
> >the access time of all the files that it reads.  Is there any way to
> >avoid this?
> 
> Not without changing "tar" to do an "fstat" before it starts reading ...
> and a "utime(s)" call to set the accessed time ...

Of course, doing this modifies the ctime field, which means that this
file (every file) is going to be dumped on your next incremental (unless
you're basing that on mtime, in which case your incrementals aren't
saving changed modes, owners, etc).

One of the very worst USG (sys III, PWB, whetever) "features" has been the
attempt to convince the world that a tape interchange format (cpio, tar,
whatever) is useable for dumps.  Its simply not true.  Backups are an
entirely different problem, and need an entirely different solution.

kre

debra@alice.UUCP (Paul De Bra) (01/16/89)

In article <15@estinc.UUCP> fnf@estinc.UUCP (Fred Fish) writes:
>In article <6099@polya.Stanford.EDU> weening@Gang-of-Four.Stanford.EDU (Joe Weening) writes:
>>One thing that worries me about using tar for dumps is that it updates
>>the access time of all the files that it reads.  Is there any way to
>>avoid this?
>
>Yes, it is possible, but it has other side effects that are undesirable.
> [long explanation deleted]

Several backup programs have started using "ctime" instead of "mtime",
assuming that this guarantees that whatever a user does to his files they
will be backed up.

Unfortunately this does not work: if one renames a directory none of the
attributes of the files in this directory change. So the files are not
backed up and unless one knows the previous name of the directory one
cannot find the files in the backup again. Try "rm *" accidentally and
then recover the files from backup... Unless you still know the previous
name of the directory and/or the names of the files you want to recover,
you won't find them...

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

henry@utzoo.uucp (Henry Spencer) (01/17/89)

In article <8768@alice.UUCP> debra@alice.UUCP () writes:
>... if one renames a directory none of the
>attributes of the files in this directory change. So the files are not
>backed up and unless one knows the previous name of the directory one
>cannot find the files in the backup again...

This is why sensible backup programs know you must back up the contents
of directories, not just files.  This isn't a miracle cure, but it helps.
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

les@chinet.chi.il.us (Leslie Mikesell) (01/18/89)

In article <848@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
>Not without changing "tar" to do an "fstat" before it starts reading to
>get the current accessed time, and an "fstat" to get the current
>modified time and a "utime(s)" call to set the accessed time to its old
>value (and set the modified time to its current value - you have to
>change both) after it finishes reading.

Cpio has the -a flag to do this, and it is (unfortunately) turned on in
the sysadmin scripts on the AT&T 3B2's.  The problem is that resetting
the atime sets the ctime to the current time, so you can no longer do a
sensible backup based on ctime.  This has been a real problem once or
twice, because the network DOS server uses the client PC's time to set
the mtime on files.  So, with the PC clock set wrong and a sysadmin
backup made, there is no way to find the real date of a file.

Les Mikesell

alo@kampi.hut.fi (Antti Louko) (01/18/89)

In article <1989Jan17.044721.5636@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>In article <8768@alice.UUCP> debra@alice.UUCP () writes:
>>... if one renames a directory none of the
>>attributes of the files in this directory change. So the files are not
>>backed up and unless one knows the previous name of the directory one
>>cannot find the files in the backup again...
>
>This is why sensible backup programs know you must back up the contents
>of directories, not just files.  This isn't a miracle cure, but it helps.

And in my opinion, it doesn't help enough. Consider the following:

mkdir /tmp/d; cd /tmp/d
mkdir d1 d2
echo This f1 is under d1 > d1/f1
echo This f1 is under d2 > d2/f1

Do level 0 gnu-tar dump of /tmp/d

mv d1 new-d2
mv d2 d1
mv new-d2 d2

Do level 1 gnu-tar dump of /tmp/d

Because the level 1 dump doesn't contain any info that d1 and d2 were
swapped, and none of the times on the f1's were changed, restoring level
0 and then level 1 gives you the situation as of the level 0 dump.

A gnu-tar dump should contain the same inode-number info that is in
conventional dump(8) archives.
---
alo@santra.UUCP (mcvax!santra!alo)       Antti Louko
alo@santra.hut.fi                        Helsinki University of Technology
alo@fingate.bitnet                       Computing Centre