[net.unix-wizards] 4.2BSD restore

woods@hao.UUCP (Greg Woods) (04/21/86)

  Sorry if this has already been asked here, but I rarely have time to 
keep up with all the articles in this group. This relates to restoring
file systems from dump(8) tapes after the data on the disk was wiped
out for some reason. I have two questions: first, it usually happens
that we don't have a level 0 dump right before the disk dies, so therefore
one or more incremental restores are required. It doesn't seem to let you do
a "restore -r" on a tape that is not level 0, so I have to resort to the
kludge of doing "restore -i" followed by "add ." and "extract". This works
fine, except that zillions of warning messages about directories that already
exist that are going to be written into come out on the console (often for
5-6 pages), and once it starts, several more pages of complaints about hard
links that cannot be created ("File exists") come out (as far as I can tell,
the link structure seems to be preserved when all the incremental restores
are finished). So, is there a way to shut off these obnoxious messages? Better 
still, is there a better way to get the disk back to the state it was in when 
the last dump was done?
  The second question is: after the restore and incremental restores are 
complete, the next time we do a dump at ANY level, it seems to want to dump
the entire disk, despite the fact that the original modification times of
the restored files are preserved. Why is this, and is there any way to make
it behave "sensibly" (i.e. just dump the stuff it would have dumped had the
restores never been done)? Thanks for any help. Please MAIL responses to me
and I will post a summary.

--Greg
--
{ucbvax!hplabs | decvax!noao | mcvax!seismo | ihnp4!seismo}
       		        !hao!woods

CSNET: woods@ncar.csnet  ARPA: woods%ncar@CSNET-RELAY.ARPA

"The darkness never goes, from some men's eyes"

jerry@oliveb.UUCP (Jerry Aguirre) (04/22/86)

I have also noticed that after a restore all the files are marked as
changed and will go on the next dump.  The 4.2BSD restore works on the
mounted file system.  This simplifies the dump code and partial
restores but makes it impossible to correctly update the time stamps on
the file.  Previous versions of restor used the raw file system and
could write any information it wanted.

Remember that dump will use the ctime, not the mtime, when deciding
what files to dump.  There is no system call to backdate the ctime on a
mounted file system.
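
A minimal sketch of that limitation, assuming a POSIX-style system with
utime(2) and stat(2).  The helper name backdate_mtime is made up for
illustration and is not part of dump or restore.  Setting mtime back to
an old value succeeds, but the kernel stamps ctime with the current time
as a side effect, so a program working through the mounted file system
has no way to put the old ctime back:

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

/*
 * Hypothetical helper: set the access and modification times of `path`
 * to `when`, then report the resulting inode times through `st`.
 * Returns 0 on success, -1 if utime() or stat() fails.
 */
int
backdate_mtime(const char *path, time_t when, struct stat *st)
{
	struct utimbuf times;

	times.actime = when;
	times.modtime = when;
	if (utime(path, &times) < 0)
		return (-1);
	/* The utime() call is itself an inode change, so ctime is reset. */
	return (stat(path, st));
}
```

After a successful call, st->st_mtime equals `when` while st->st_ctime
holds the time of the call, which is why every restored file looks
changed to the next dump.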

Procedurally this is a crock as after spending X hours of down time
restoring the file system, you have to turn around and spend Y
additional hours doing a new level 0 dump.  Besides additional down
time the new level 0 also messes up the dump schedule.

					Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|glacier|olhqma}!oliveb!jerry

richl@tektools.UUCP (04/23/86)

In article <801@oliveb.UUCP> jerry@oliveb.UUCP (Jerry Aguirre) writes:
> I have also noticed that after a restore all the files marked as
> changed and will go on the next dump.  [ ... ]
> 
> Procedurally this is a crock as after spending X hours of down time
> restoring the file system, you have to turn around and spend Y
> additional hours doing a new level 0 dump.

If you really did just spend hours restoring the file system, then you
have restored all or virtually all of the file system. If you don't
want to do a level 0 dump, simply edit /etc/dumpdates to reflect
whatever time you want dumps taken from. If you just restored the entire
file system you have the entire system on tape anyway; so you won't
miss anything by this.

Rick Lindsley
richl%tektools@tektronix.csnet
...!{ihnp4,decvax,cbosgd}!tektronix!tektools!richl

ed@mtxinu.UUCP (04/24/86)

In article <801@oliveb.UUCP> jerry@oliveb.UUCP (Jerry Aguirre) writes:
>I have also noticed that after a restore all the files marked as
>changed and will go on the next dump.  The 4.2BSD restore works on the
>mounted file system.  This simplifies the dump code and partial
>restores but makes it impossible to correctly update the time stamps on
>the file.  Previous versions of restor used the raw file system and
>could write any information it wanted.

4.2 dump still dumps from the raw disk.  It's restore that is simplified
by working on the mounted filesystem.  The reason is that the code to
decide where to place a block on the disk is *very* complex in 4.2
because, generally, of the optimizations used to make the filesystem
fast to read.  It was decided (and I agree) that having two copies of
that code (one in the kernel and one in restore) was too dangerous.

>Remember that dump will use the ctime, not the mtime, when deciding
>what files to dump.  There is no system call to backdate the ctime on a
>mounted file system.
>
>Procedurally this is a crock as after spending X hours of down time
>restoring the file system, you have to turn around and spend Y
>additional hours doing a new level 0 dump.  Besides additional down
>time the new level 0 also messes up the dump schedule.

There are other reasons, as well, for redoing the level 0 dump after
doing a full restore.  Directories relate to files via their inode
number.  In order that the new directory contents be recorded
correctly, a new level 0 is required.  If not, any incremental
dump done - even if the times were correct - would not correctly
relate to the old full(er) dump.  This *is* documented in the
"BUGS" section of the restore(8) writeup.

I agree that it is a big nuisance to redo the level 0 dumps after
a full restore.  Hopefully, however, restores happen infrequently
enough to justify the extra work to get the increased filesystem
performance.

-- 
Ed Gould                    mt Xinu, 2910 Seventh St., Berkeley, CA  94710  USA
{ucbvax,decvax}!mtxinu!ed   +1 415 644 0146

"A man of quality is not threatened by a woman of equality."

berliner@convex.UUCP (04/29/86)

/* Written by ed@mtxinu.UUCP in convex:net.unix-wizards */
> THIS IS BOTH INCORRECT AND DANGEROUS!  It was unnecessary with
> pre-4.2 versions of dump/restor.  With 4.2 and later versions
> of dump/restore, it will cause incrementals made after the edit
> to be *nearly useless* if a full restore is required at a later time.
> 
> Read the man pages for dump and restore for more details (or see the
> response I posted when the original request came through).
> 
> DON'T just change /etc/dumpdates!

If this is indeed true, you can easily fake it as follows (for dumping the
/xxx filesystem after completely restoring it from dump tapes):

	/etc/dump 0usf 30000 /dev/null /xxx >& /dev/null &

This will do an effective level 0 dump and update the /etc/dumpdates file
accordingly.  You also won't be prompted to mount tape 2 ...

Brian Berliner
Convex Computer Corporation
{ihnp4, sun, uiucdcs, rice, allegra}!convex!berliner

richl@tektools.UUCP (Rick Lindsley) (04/30/86)

In article <974@tektools.UUCP> richl@tektools.UUCP (Rick Lindsley) writes:
>
>If you really did just spend hours restoring the file system, then you
>have restored all or virtually all of the file system. If you don't
>want to do a level 0 dump, simply edit /etc/dumpdates to reflect
>whatever time you want dumps taken from. If you just restored the entire
>file system you have the entire system on tape anyway; so you won't
>miss anything by this.
> 

In article <8@mtxinu.UUCP> ed@mtxinu.UUCP (Ed Gould) writes:
> THIS IS BOTH INCORRECT AND DANGEROUS!

Alas, Ed is correct. It's been too long since I've been intimate with
dump and restore. Rewind and erase my previous message.

Rick

nz@wucs.UUCP (05/01/86)

In article <974@tektools.UUCP> richl@tektools.UUCP (Rick Lindsley) writes:
 > In article <801@oliveb.UUCP> jerry@oliveb.UUCP (Jerry Aguirre) writes:
 > > I have also noticed that after a restore all the files marked as
 > > changed and will go on the next dump.  [ ... ]
 > 
 > If you really did just spend hours restoring the file system, then you
 > have restored all or virtually all of the file system. If you don't
 > want to do a level 0 dump, simply edit /etc/dumpdates to reflect
 >  ...

In fact, this problem is due to a quirk in dump/restore: the
restored files have the correct modification time, but their creation
times are the time of the restore!

Repeat-By:
	Put a file system on a dummy partition, and do a level 0
	dump(8) of it.  Wait a few minutes.  Then, trash the partition
	and restore it fully from your dump.  Change one file.
	Now, do a level 1 backup with dump(8).  The backup should be
	extremely small (one file) but it isn't!  It is the whole
	filesystem!

To Fix:
	Change dump so that it looks only at the modification time, not
	the creation and modification times.

	Here is a diff for the file dump/dumptraverse.c.  My change is
	bracketed by #ifndef/#else/#endif directives; you can take those
	out if you like.

*** /tmp/,RCSt1018628	Thu May  1 14:40:45 1986
--- dumptraverse.c	Fri Apr  4 11:34:08 1986
***************
*** 38,44
  	BIS(ino, clrmap);
  	if(f == IFDIR)
  		BIS(ino, dirmap);
  	if ((ip->di_mtime >= spcl.c_ddate || ip->di_ctime >= spcl.c_ddate) &&
  	    !BIT(ino, nodmap)) {
  		BIS(ino, nodmap);
  		if (f != IFREG && f != IFDIR && f != IFLNK) {

--- 38,48 -----
  	BIS(ino, clrmap);
  	if(f == IFDIR)
  		BIS(ino, dirmap);
+ #ifndef MTIME_ONLY
  	if ((ip->di_mtime >= spcl.c_ddate || ip->di_ctime >= spcl.c_ddate) &&
+ #else
+ 	if ((ip->di_mtime >= spcl.c_ddate ) &&
+ #endif
  	    !BIT(ino, nodmap)) {
  		BIS(ino, nodmap);
  		if (f != IFREG && f != IFDIR && f != IFLNK) {

	We use this modified version some of the time.  I 
	noticed the problem, and made the changes, after a series of
	disk crashes and fixes that required full restores.  Boy, 
	the next weekend's backups were big!  Hmmm...

	Anyway, I hope this is helpful to some folks out there.
-- 
...nz (Neal Ziring at WU ECL  -  we're here to provide superior computing.)

	{seismo,ihnp4,cbosgd}!wucs!nz   OR   nz@wucs.UUCP

    "You could get an infinite number of wires into this !*$$#!?! junction 
                         box, but we usually don't go that far in practice"
				--   Employee of London Electricity Board, 1959

dave@onfcanim.UUCP (Dave Martindale) (05/02/86)

In article <27300009@convex> berliner@convex.UUCP writes:
>
>/* Written by ed@mtxinu.UUCP in convex:net.unix-wizards */
>> THIS IS BOTH INCORRECT AND DANGEROUS!  It was unnecessary with
>> pre-4.2 versions of dump/restor.  With 4.2 and later versions
>> of dump/restore, it will cause incrementals made after the edit
>> to be *nearly useless* if a full restore is required at a later time.
>> 
>> Read the man pages for dump and restore for more details (or see the
>> response I posted when the original request came through).
>> 
>> DON'T just change /etc/dumpdates!
>
>If this is indeed true, you can easily fake it as follows (for dumping the
>/xxx filesystem after completely restoring it from dump tapes):
>
>	/etc/dump 0usf 30000 /dev/null /xxx >& /dev/null &
>
>This will do an effective level 0 dump and update the /etc/dumpdates file
>accordingly.  You also won't be prompted to mount tape 2 ...

No, dammit, you can't do this either.  The whole point is this: some of
the information on a dump tape is relative to inode number.  When you
restore an incremental dump over a lower-level dump, it has to delete
files that have been removed and relink things that have changed names,
not just extract new files. 

When you did a restore under non-4.[23] versions of UNIX, you got back
all files at exactly the same inode numbers they had before - the I-list
was the same, so you didn't need to do a new dump.  Handling deleted files
or link changes required almost zero work - restore just read a bit map
of inodes to clear, and rewrote all changed inodes and directories, and
presto! everything was consistent. 

With 4.2, you get an *entirely new* I-list, since all of the files'
inodes are allocated from scratch, rather than put back where they were.
And any incremental dumps you do from the new filesystem reflect this
new I-list. 

With the 4.2 restore, keeping things consistent requires a lot of work
figuring out the names of things that need to be deleted or moved or
whatever.  Restore keeps track of what is where by keeping around a
symbol table, indexed by inode number, between passes of restore -
that's what "restoresymtable" is, in case you hadn't guessed. 

If you restore from a fullsave tape, and then try loading an incremental
dump on top of it *that wasn't made relative to the fullsave you used*,
chaos will result, because the I-lists do not correspond.
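
A toy model of why the I-lists have to correspond, offered as a sketch
and not as restore's actual code.  It assumes only what is said above:
that restore keeps a table indexed by tape inode number, and that an
incremental identifies freed files by inode number.  The struct, the
function name, and the table size are all invented for illustration.

```c
#include <stddef.h>

#define NINO 8		/* arbitrary size for the toy table */

/* One symbol table: the pathname recorded for each tape inode number. */
struct symtab {
	const char *name[NINO];		/* NULL means "inode not in use" */
};

/*
 * Hypothetical model of acting on an incremental's "this inode is now
 * free" information: look the tape inode number up in the symbol table
 * and return the name that would be removed, or NULL if there is none.
 */
const char *
name_to_remove(const struct symtab *tab, int tape_ino)
{
	if (tape_ino < 0 || tape_ino >= NINO)
		return (NULL);
	return (tab->name[tape_ino]);
}
```

Suppose the old level 0 recorded "notes" as inode 5 and "paper" as
inode 6, and the rebuilt filesystem happened to allocate them the other
way around.  An incremental taken from the rebuilt filesystem after
"notes" is deleted reports inode 6 as free.  Looked up in a table built
from a new level 0, inode 6 is "notes"; looked up in the table from the
old level 0, it is "paper", and the wrong file goes away.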

The fudge suggested above will do nothing more than update /etc/dumpdates,
and that isn't good enough.  When you restore a 4.2BSD filesystem,
you *must* do a new level 0 dump onto tape, in order to have something
for your later incremental dumps to be relative to.  If you fudge the
dump date, you will get incremental dumps that contain changed files
and can be used for "restore x", but they cannot be used for "restore r".

So, if you want to be able to use "restore r" to fully restore a filesystem
after a hardware failure, you *absolutely must* write that new level 0
tape, no matter how long it takes.  If you're satisfied with only being
able to use "restore x", why not just use tar instead of dump in the
first place?

	Dave Martindale

smb@ulysses.UUCP (Steven Bellovin) (05/03/86)

(discussions about dump/restore).

> To Fix:
> 	Change dump so that it looks only at the modification time, not
> 	the creation and modification times.

No!  Dump is behaving exactly properly; if you install this fix you'll
lose the results of chmod calls, access time, etc.

dave@onfcanim.UUCP (Dave Martindale) (05/03/86)

In article <1631@wucs.UUCP> nz@wucs.UUCP (Neal Ziring) writes:
> > > I have also noticed that after a restore all the files marked as
> > > changed and will go on the next dump.  [ ... ]
> > 
> > If you really did just spend hours restoring the file system, then you
> > have restored all or virtually all of the file system. If you don't
> > want to do a level 0 dump, simply edit /etc/dumpdates to reflect
> >  ...
>
>In fact, this problem is due to a quirk in a dump/restore, the
>restored files have the correct modification time, but their creation
>times are the time of the restore!
>
>Repeat-By:
>	Put a file system on a dummy partition, and do a level 0
>	dump(8) of it.  Wait a few minutes.  Then, trash the partition
>	and restore it fully from your dump.  Change one file.
>	Now, do a level 1 backup with dump(8).  The backup should be
>	extremely small (one file) but it isn't!  It is the whole
>	filesystem!
>
>To Fix:
>	Change dump so that it looks only at the modification time, not
>	the creation and modification times.
>

If you think of a dump as "all the files that have changed *contents* since
the most recent higher-level dump", then you do indeed want to dump
according to modification times.  But this is **not** what dump was
written to do.  It is intended to write out *all* the information needed
to restore *all* changes made to the filesystem, including changes
in permissions or link count of a file.  These latter things change
the inode's "di_ctime", which is really "change" not "create" time.
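
The distinction can be checked directly.  This is a minimal sketch
assuming a POSIX-style stat(2) and chmod(2); the helper name
chmod_and_compare is hypothetical.  It applies a permission change and
records the inode times on either side of it, so that the caller can see
that mtime is untouched while ctime moves forward:

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: apply an inode-only change (a chmod) to `path`
 * and record the inode times before and after it.
 * Returns 0 on success, -1 on any failure.
 */
int
chmod_and_compare(const char *path, mode_t mode,
    struct stat *before, struct stat *after)
{
	if (stat(path, before) < 0)
		return (-1);
	sleep(2);	/* longer than the 1-second resolution of st_ctime */
	if (chmod(path, mode) < 0)
		return (-1);
	return (stat(path, after));
}
```

On return, after->st_mtime should equal before->st_mtime, and
after->st_ctime should be later than before->st_ctime.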

So, if you want to use dump and restore as a method of completely restoring
the state of a filesystem, not just getting back all the contents of the
files, don't change the algorithm.

And if you run 4.[23] BSD, you will still have to do a level 0 dump after
doing a full restore.  No way of avoiding it.

	Dave Martindale

richl@tektools.UUCP (Rick Lindsley) (05/05/86)

In article <1631@wucs.UUCP> nz@wucs.UUCP (Neal Ziring) writes:
> 
> In fact, this problem is due to a quirk in a dump/restore, the
> restored files have the correct modification time, but their creation
> times are the time of the restore!
> 

As previously discussed, 4.2 & 4.3 restore do not use the raw disk, so
they cannot maintain the ctime on the files. But changing dump to not
look at the ctime could bite you in a big way -- ctime is changed if
the owner, group, or mode was changed. Looking at the mtime will
guarantee that you have a correct *copy* of the binary, but it may not
be the correct owner, group, or mode! Also, if you link two files together,
the new link won't appear on your dump because changing the link count
changes the ctime, not the mtime.
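
The two selection rules can be put side by side.  This is a sketch only:
plain time_t arguments stand in for ip->di_mtime, ip->di_ctime and
spcl.c_ddate from the posted diff, the function name needs_dump is
invented, and the !BIT(ino, nodmap) half of the real test is left out.

```c
#include <time.h>

/*
 * Sketch of the time comparison from the posted diff.
 * `mtime_only` nonzero models a dump built with MTIME_ONLY defined;
 * zero models the stock test that also looks at the change time.
 */
int
needs_dump(time_t mtime, time_t chgtime, time_t ddate, int mtime_only)
{
	if (mtime_only)
		return (mtime >= ddate);
	return (mtime >= ddate || chgtime >= ddate);
}
```

For a file last written at time 100, with a chmod at time 300 and the
previous dump at time 200, needs_dump(100, 300, 200, 0) is true and
needs_dump(100, 300, 200, 1) is false: the mtime-only dump never records
the new mode.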

And of course, the reasons for doing a level 0 dump after a restore have
also been expounded upon in this space.

Rick

mac@tflop.UUCP (Mike Mc Namara) (05/06/86)

	Consider: 

	My disk head has crashed.  I call DEC in. They fix the disk. I get my
	dump tapes.  The most recent full backup tape can not be read (tape errors).
	I get the next previous tape. I do a restore -r.  I then get the various
	incrementals up to that point where the full faulty dump was made.  I 
	restore them.  I have now built the filesystem up to just before that
	dump which cannot be read.  I try to read the first incremental after
	that bad tape, but it can not be read (restore complains about dates...)

	Am I lost? Is that info gone? This actually happened, and since the faulty
	full backup only had seen one week of storage time, the users were simply
	told that that week was lost, and no big problems.

	How could I have gotten restore to give me that post-unreadable-full-backup 
	info back on the system?  (short of cracking the tape itself.)

-- 
---------------------------------+--------------------------------------------
| Michael Mc Namara              | Let the words be yours, I'm done with mine.
| UUCP: dual!vecpyr!tflop!mac    | May your life proceed by its own design.
| ARPA: tflop!mac@ames.arpa      |
---------------------------------+--------------------------------------------

berliner@convex.UUCP (05/08/86)

/* Written 11:35 pm  May  1, 1986 by dave@onfcanim.UUCP in net.unix-wizards */
> ...  If you fudge the
> dump date, you will get incremental dumps that contain changed files
> and can be used for "restore x", but they cannot be used for "restore r".
>
> So, if you want to be able to use "restore r" to fully restore a filesystem
> after a hardware failure, you *absolutely must* write that new level 0
> tape, no matter how long it takes.  If you're satisfied with only being
> able to use "restore x", why not just use tar instead of dump in the
> first place?

Ah, now I see.  So all this trauma over the need to make level 0 dumps after
a full filesystem restore only applies to the "restore r" operation and not
the "restore x" (or "restore i" for that matter) operation!  With
4.2BSD, there is (in my opinion) virtually no need to do "restore r" to
restore a file system -- I just run newfs and do "restore x".  Yes, I get a
new I-list, but who cares.  The end result is the same, is it not?

I *am* satisfied using "restore x" and "restore i".  I have never found it
beneficial (on 4.[23] systems) to do a "restore r".  Why don't I use tar?

	o  dump will let me write to multiple tapes
	o  dump runs *much* faster than tar -- especially if you use a
	   hacked-up dumptape.c module posted to net.sources a long time ago.
	o  dump keeps a "symbol table" at the front of the first tape; this
	   makes the "restore i" extremely nice.
	o  "restore i" lets me move around the directory hierarchy and
	   selectively extract files of my choosing.  It also tells me the
	   inode number, so that I can find (immediately) which tape to load
	   for a multi-tape dump.

I do use tar, but not to backup filesystems daily.  Thanks for filling us
all in on the "restore r" operation and the need to do level 0 dumps to keep
everything in-sync -- I will remember it in the future...

Brian Berliner
Convex Computer Corp.
{ihnp4, uiucdcs, sun, rice, allegra}!convex!berliner

dave@onfcanim.UUCP (Dave Martindale) (05/10/86)

In article <27300011@convex> berliner@convex.UUCP writes:
>
>Ah, now I see.  So all this trauma over the need to make level 0 dumps after
>a full filesystem restore only applies to the "restore r" operation and not
>the "restore x" (or "restore i" for that matter) operation!  With
>4.2BSD, there is (in my opinion) virtually no need to do "restore r" to
>restore a file system -- I just run newfs and do "restore x".  Yes, I get a
>new I-list, but who cares.  The end result is the same, is it not?

No.  Suppose I have just restored a filesystem from tape (because I had
a head crash) and this filesystem contains files A, B, and C. It also
contains D and E, which are two links to the same file.  Instead of
doing a level 0 dump, I just edit /etc/dumpdates. 

Then I edit file A, delete file B, and mv file C to F then edit it.  I
also delete E, leaving D as the sole link to the file, then create a new
file named E. Now, I have files called A, D, E, and F. I do an
incremental dump, and a good thing too, because I have another head
crash that night. 

When I restore from the old level 0 dump, I get back A, B, C and D/E
again.  Then doing a restore x from the incremental extracts A, D, E,
and F. Unfortunately, this isn't the state of the filesystem as of the
crash - the file B is still there, and I thought I had deleted it.  And
so is C, which looks like an old copy of F.

Worst of all, D and E are still links to the same file, and now,
depending on the order of extraction from the tape, it contains what
either D or E should contain.  So, to me, it appears that either D
contains what should be (and is) in E, or E contains what should be in D.
The contents of one of these files is gone!  (It's still on the tape,
so if I happen to notice the problem and figure out what happened,
I can delete the incorrect file and re-extract it from the tape.
But I don't read every one of my files to check them after a restore,
so I don't notice the problem until 6 months later and by then the
tape has been reused.)
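
The scenario can be replayed on a toy in-memory file system.  This is a
sketch under a stated assumption, taken from the description above and
not from restore's source: extracting a name that already exists
overwrites the file in place, so any other link to it sees the new
contents.  Every name in the code (toyfs, toy_link, toy_extract,
toy_read) is invented for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAXFILES 16
#define NAMELEN  8
#define DATALEN  32

/* Toy file system: names point at inodes; contents live in the inode. */
struct toyfs {
	char name[MAXFILES][NAMELEN];
	int  ino[MAXFILES];
	char data[MAXFILES][DATALEN];
	int  nnames;
	int  ninodes;
};

/* Index of the directory entry called `name`, or -1 if there is none. */
static int
lookup(const struct toyfs *fs, const char *name)
{
	int i;

	for (i = 0; i < fs->nnames; i++)
		if (strcmp(fs->name[i], name) == 0)
			return (i);
	return (-1);
}

/* Add a directory entry `name` for inode `ino`, if there is room. */
static void
enter(struct toyfs *fs, const char *name, int ino)
{
	if (fs->nnames >= MAXFILES)
		return;
	snprintf(fs->name[fs->nnames], NAMELEN, "%s", name);
	fs->ino[fs->nnames] = ino;
	fs->nnames++;
}

/* Make `newname` a second link to the file already called `oldname`. */
void
toy_link(struct toyfs *fs, const char *oldname, const char *newname)
{
	int i = lookup(fs, oldname);

	if (i >= 0)
		enter(fs, newname, fs->ino[i]);
}

/*
 * Extract one file by name.  ASSUMPTION: an existing name is overwritten
 * in place, so every other link sees the new contents; a new name gets
 * a fresh inode.  Nothing is ever deleted.
 */
void
toy_extract(struct toyfs *fs, const char *name, const char *data)
{
	int i = lookup(fs, name);
	int ino;

	if (i >= 0) {
		ino = fs->ino[i];
	} else {
		if (fs->ninodes >= MAXFILES)
			return;
		ino = fs->ninodes++;
		enter(fs, name, ino);
	}
	snprintf(fs->data[ino], DATALEN, "%s", data);
}

/* Contents seen through `name`, or NULL if no such name exists. */
const char *
toy_read(const struct toyfs *fs, const char *name)
{
	int i = lookup(fs, name);

	return (i >= 0 ? fs->data[fs->ino[i]] : NULL);
}
```

Starting from a zeroed struct toyfs, extract A, B, C and D, link E to D,
and then extract the incremental's A, D, E and F in that order.  B and C
are still present afterwards, and toy_read() returns the new E's
contents for both D and E, because the extraction of E wrote through
the old link.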

This sort of problem has occurred everywhere in that filesystem, not
just my own directory.  My users are either seen wandering the halls
muttering to themselves about their faulty memories, or waiting in line
to yell at me. 

In future, I take the time to do a new level 0 dump, which allows me to
do a "restore r" on the incremental, and everything *is* restored
correctly after our next crash. 


For me, the few extra hours required to do the level 0 dump after a
full restore (after all, how often do disks crash or their contents
get destroyed?) is *well* worth it to avoid the trauma of all users
trying to sort out their files after a restore.

Now, a question for system V administrators out there:  Doesn't cpio
suffer from exactly the same problems described above if it is used
for incremental saves?

	Dave Martindale

roy@phri.UUCP (Roy Smith) (05/11/86)

	I don't see what all the fuss is about.  In the little over 2 years
that we've been running 4.2, I've had to do exactly one level 0 restore
(when DEC swapped the HDA on our disk).  At least on our system (4 Meg 750,
TU-80, RA-81) the restore took about 2-3 times as long as the required
post-restore dump.  In the long run, the extra couple of hours just didn't
make any difference.

	My pet peeve about restore is that when you do "restore -i" to get
into interactive mode, the default is to put "." on the extraction list.
When I want to extract a single file from a backup (the usual case), the
first thing I have to do is "delete ." which takes forever on large file
systems.
-- 
Roy Smith, {allegra,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016