[comp.unix.internals] Why is restore so slow?

jerry@olivey.olivetti.com (Jerry Aguirre) (01/26/91)

Those familiar with using dump and restore will have noticed the
difference in speed between them.  The dump procedure, especially with
the current multi-buffered version, usually sings along at close to full
tape speed.  Restore, on the other hand, is a real dog, taking up to 10
times as long for the same amount of data.

Has anyone done any evaluations of why there is such an extreme
difference in speed?  Granted that creating files involves more overhead
than dumping them, restore still seems very slow.  Since restore operates
on the mounted file system, it has the advantage of accessing a buffered
file system with write-behind.

My particular theory is that the disk buffering algorithms are precisely
wrong for restore.  By this I mean they keep in the buffers the data
that will never be needed again and flush the data that will.  I plan to
do some experimentation and would appreciate hearing any ideas you might
offer.
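
To make the theory concrete, here is a toy LRU cache simulation --
purely my own illustration, not kernel code, and all the sizes and
block numbers are invented.  A restore-like workload writes a long
stream of file data blocks that will never be referenced again, while
a small set of metadata blocks (inodes, indirect blocks, directories)
keeps being re-referenced.  Once the metadata reuse distance exceeds
the cache size, LRU caches exactly the wrong thing:

/*
 * Toy LRU buffer cache fed a restore-like reference string.  All
 * sizes and block numbers here are invented for illustration.
 */
#include <stdio.h>

#define CACHE   64          /* buffer cache slots */
#define META    8           /* distinct metadata blocks, reused forever */
#define STRIDE  100         /* data blocks written between metadata refs */
#define NDATA   100000L     /* data blocks, each written exactly once */

static long cache[CACHE];   /* cached block numbers, slot 0 = most recent */

static int touch(long blk)  /* reference a block; returns 1 on a hit */
{
    int i, j;

    for (i = 0; i < CACHE; i++)
        if (cache[i] == blk) {          /* hit: move to front */
            for (j = i; j > 0; j--)
                cache[j] = cache[j - 1];
            cache[0] = blk;
            return 1;
        }
    for (j = CACHE - 1; j > 0; j--)     /* miss: evict least-recent slot */
        cache[j] = cache[j - 1];
    cache[0] = blk;
    return 0;
}

int main(void)
{
    long d, mrefs = 0, mhits = 0;
    int i;

    for (i = 0; i < CACHE; i++)
        cache[i] = -1;
    for (d = 0; d < NDATA; d++) {
        touch(1000000L + d);        /* fresh file data, never needed again */
        if (d % STRIDE == 0) {      /* recurring inode/indirect/dir block */
            mrefs++;
            mhits += touch(d / STRIDE % META);
        }
    }
    printf("metadata references %ld, cache hits %ld\n", mrefs, mhits);
    return 0;
}

With these numbers it reports 1000 metadata references and 0 hits:
the cache stays permanently full of freshly written data that will
never be touched again.  Drop STRIDE to a handful and the metadata
hit rate jumps to nearly 100%.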

				Jerry Aguirre

das@eplunix.UUCP (David Steffens) (01/30/91)

All right, my $0.02 on this issue.

Who cares how slow restore is?  How often do you have to do a
full restore of a filesystem or a whole disk?  Once or twice a year?
If it's more often than that, then you have a REAL problem
and maybe you ought to spend your time and energy fixing THAT!

And none of your fancy programming tricks for me, thank you.
I'd much rather have a SLOW restore that was guaranteed to WORK
than one that was FAST and had unknown bugs because of some
magic algorithm that wasn't tested under all possible conditions.
My users and I would rather wait longer for a reliable restoration
of our files than have incomplete or inaccurate results in a hurry.

Reminds me of the 4.2bsd restore which claimed to have a checkpoint
option that supposedly allowed the restore to be stopped and restarted.
Never could get it to work correctly for us.  Wasted an awful lot
of time on that one.  I also remember the Ultrix 1.1 dump which DEC
tried to "improve".  Unfortunately, one of their "optimizations"
had a small, undiscovered side effect -- the highest-numbered inode
on the filesystem was never written to the dump tape.  Produced no end
of fun during the restore if said inode happened to be a directory.

I don't wish to repeat these experiences!  Repeat after me, the three
most important performance characteristics of dump and restore are:
	RELIABILITY, RELIABILITY and RELIABILITY.
-- 
David Allan Steffens       | I believe in learning from past mistakes...
Eaton-Peabody Laboratory   | ...but does a good education require so many?
Mass. Eye & Ear Infirmary, 243 Charles Street, Boston, MA 02114
{harvard,mit-eddie,think}!eplunix!das      (617) 573-3748 (1400-1900h EST)

jfh@rpp386.cactus.org (John F Haugh II) (01/30/91)

In article <1013@eplunix.UUCP> das@eplunix.UUCP (David Steffens) writes:
>Who cares how slow restore is?  How often do you have to do a
>full restore of a filesystem or a whole disk?  Once or twice a year?
>If it's more often than that, then you have a REAL problem
>and maybe you ought to spend your time and energy fixing THAT!

There are quite a few reasons in an EDP environment for restoring
files.  For example, before a large irreversible process, it is
common to dump the entire database partition so it can be restored
if the process is found to have completed incorrectly.  This is
very common for such operations as payroll, monthly account closing,
quarterly stuff, etc.

The answer to questions like "How often do you do X" often comes
down to "Often enough that we can't stand it any longer."
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"13 of 17 valedictorians in Boston High Schools last spring were immigrants
 or children of immigrants"   -- US News and World Report, May 15, 1990

jc@skyking.UUCP (J.C. Webber III) (01/30/91)

In <2880@redstar.cs.qmw.ac.uk> liam@cs.qmw.ac.uk (William Roberts;) writes:

>Restore suffers from the fact that files are stored in inode-number order: 
>this is not the ideal order for creating files as it thrashes the namei cache 
>because the files are recreated randomly all over the place.  We reorganised 
>our machine once and used dump/restore to move our /usr/spool and /usr/mail 
>partitions around: /usr/spool contains lots of tiny files called things like 
>/usr/spool/news/comp/unix/internals/5342 and this took an incredibly long time 
>to restore.  /usr/mail contains several hundred files but no subdirectories and 
>restored in about the same sort of time as it took to dump. 

I have the infamous lost inode problem on my system (after installing 
Bnews), so periodically I need to recover my /usr/spool partition.
What I have been doing is using "cd /usr/spool; find . -print |
cpio -pudmv /new.slice" to move the files to a different partition
while I clean up the /usr/spool slice.  I do an rm -r * on /usr/spool,
umount it, fsck it, remount it, and then cpio all the files back from
the backup partition.  My purpose in doing all this rather than just a
simple fsck is an attempt to recover *all* stray inodes sprinkled
throughout the /usr/spool slice.  My assumption is that cpio'ing these
files back will cause them to have their inodes reassigned and laid out
sequentially on the disk.  Is this a correct assumption?  It seems
to last longer (between "out of inode" messages) than simply
fsck'ing the partition.  BTW, does anyone know of a reasonable fix
for this problem?  I don't have kernel sources, but I can do some
kernel parameter tweaking and rebuilding using the sysconf utilities
on my system.  I have a CounterPoint System19k 68020 SystemV.3.0.
I've read some postings about how this bug manifests itself, but
haven't seen anything about how to fix it except to contact the
vendor.  Well, that won't work; CounterPoint is out of business.

thx...jc

--
 /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
 \  J.C. Webber III		jc@skyking.UUCP	                              /
 /  R&D Lab Manager		(...uunet!mips!skyking!jc)                    \
 \  Mips Computer Systems	{ames,decwrl,pyramid,prls)!mips!skyking!jc    /
 /  (408)524-8260                                                             \
 \/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

das@eplunix.UUCP (David Steffens) (01/31/91)

In article <19012@rpp386.cactus.org>,
jfh@rpp386.cactus.org (John F Haugh II) says:
> In article <1013@eplunix.UUCP> das@eplunix.UUCP (David Steffens) writes:
>> Who cares how slow restore is?  How often do you have to do a
>> full restore of a filesystem or a whole disk?  Once or twice a year?
> There are quite a few reasons in an EDP environment for restoring
> files.  For example, before a large irreversible process, it is
> common to dump the entire database partition so it can be restored
> if the process is found to have completed incorrectly.

OK, so it would seem that the remainder of my posting (which you
didn't quote) is even more relevant in your case.  Wouldn't you:
| ...much rather have a SLOW restore that was guaranteed to WORK
| than one that was FAST and had unknown bugs because of some
| magic algorithm that wasn't tested under all possible conditions?

And therefore, don't you agree that:
| the three most important performance characteristics
| of dump and restore are: RELIABILITY, RELIABILITY and RELIABILITY?
-- 
David Allan Steffens       | I believe in learning from past mistakes...
Eaton-Peabody Laboratory   | ...but does a good education require so many?
Mass. Eye & Ear Infirmary, 243 Charles Street, Boston, MA 02114
{harvard,mit-eddie,think}!eplunix!das      (617) 573-3748 (1400-1900h EST)

martin@mwtech.UUCP (Martin Weitzel) (01/31/91)

In article <19012@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
:In article <1013@eplunix.UUCP> das@eplunix.UUCP (David Steffens) writes:
:>Who cares how slow restore is?  How often do you have to do a
:>full restore of a filesystem or a whole disk?  Once or twice a year?
:>If it's more often than that, then you have a REAL problem
:>and maybe you ought to spend your time and energy fixing THAT!
:
:There are quite a few reasons in an EDP environment for restoring
:files.  For example, before a large irreversible process, it is
:common to dump the entire database partition so it can be restored
                                                    ^^^
:if the process is found to have completed incorrectly.  This is
:very common for such operations as payroll, monthly account closing,
:quarterly stuff, etc.

But you are talking about a dump here that can be restored.  CAN be.
Not MUST be.  Restoring the whole thing only becomes necessary IF
the process has completed incorrectly.  If that becomes the normal case
rather than the exception, you would be right, but then I dare say the
software is seriously flawed if some operation frequently completes
incorrectly.

One case where it is normal for some operation to fail frequently
is during program development and testing.  But then there should be
enough space to have several sets of test data on disk.  Otherwise the
development system is badly chosen and you would be better off buying
a larger disk.  Programmers are expensive.  Their costs accumulate over
time.  Buying some additional hardware costs only once!  (Reminds me
of the time when I had to insert a special floppy if I wanted to copy
a file, since on that machine, an IBM 5110 single-user BASIC "PC", you
couldn't copy files with the built-in software.  It was rather counter-
productive, since the copy program used the same memory as the BASIC
source, and if you forgot to save the source, the work of the last few
hours could be gone :-().

:The answer to questions like "How often do you do X" often come
:down to "Often enough that we can't stand it any longer."

Sure, but in any case the question should be allowed: "Why do you have
to do it so often?"  What would you say if someone complained that
formatting and verifying a 380 MB disk takes him half a day and
that's simply too much time every day?  Wouldn't you ask him WHY
he is formatting the disk every day?
-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83

das@eplunix.UUCP (David Steffens) (02/03/91)

In article <19019@rpp386.cactus.org>,
jfh@rpp386.cactus.org (John F Haugh II) says:
> ... The most likely real-world ordering is probably reliability,
> usability and performance... but you can't ignore all else -
> performance and usability really do count for quite a bit,
> because you are dealing with perceptions
> which affect how the user views the software...

If we were talking about any other unix utility (users, perhaps? :-),
I would probably agree with you.  But we're not.  We're talking about
restore, and to a lesser extent dump.  A similar, albeit weaker, case
can be made for tar and cpio, since these utilities are also frequently
used in situations that demand extremely high reliability, e.g. archiving.

Unlike most unix utilities, there's only ONE reason to use restore...
to recover lost data.  If restore fails to perform this function,
then NO other performance characteristics matter.  Partial or inaccurate
recovery of data in a hurry is simply not very useful, IMHO.

> ... No, the psychology of computer users is such that any process
> which is SLOW will be avoided...

Users have nothing to do with it -- dump/restore is a sysadmin task.
And as an experienced sysadmin, I can tell you from hard-won personal
experience that a procedure that is slow but reliable is almost
always to be preferred to one that is faster but screws up occasionally.
The reason should be obvious -- in the long run, the time required
to detect and correct failures will probably far exceed the time saved.

> ... Although RELIABILITY is paramount, any process
> which operators are inclined to skip is of no value - they
> will pick the first less reliable process which is markedly faster...

The I Ching says: Inferior people should not be employed. :-)
-- 
David Allan Steffens       | I believe in learning from past mistakes...
Eaton-Peabody Laboratory   | ...but does a good education require so many?
Mass. Eye & Ear Infirmary, 243 Charles Street, Boston, MA 02114
{harvard,mit-eddie,think}!eplunix!das      (617) 573-3748 (1400-1900h EST)

das@eplunix.UUCP (David Steffens) (02/05/91)

In article <19023@rpp386.cactus.org>,
jfh@rpp386.cactus.org (John F Haugh II) says:
> Furthermore, you will find little consolation from your boss who doesn't
> understand why 300 EDP employees are sitting on their hands while you
> restore that =one= file they use so frequently.

Your boss will be even _more_ annoyed with you when he discovers
that you _cannot_ restore that file because you (or your OS vendor)
mucked around with dump/restore in order to "improve performance",
successfully trading reliability for some piddling increase in speed!
-- 
David Allan Steffens       | I believe in learning from past mistakes...
Eaton-Peabody Laboratory   | ...but does a good education require so many?
Mass. Eye & Ear Infirmary, 243 Charles Street, Boston, MA 02114
{harvard,mit-eddie,think}!eplunix!das      (617) 573-3748 (1400-1900h EST)

jfh@rpp386.cactus.org (John F Haugh II) (02/05/91)

In article <1022@eplunix.UUCP> das@eplunix.UUCP (David Steffens) writes:
>Your boss will be even _more_ annoyed with you when he discovers
>that you _cannot_ restore that file because you (or your OS vendor)
>mucked around with dump/restore in order to "improve performance",
>successfully trading reliability for some piddling increase in speed!

In my original response I noted that reliability is the number one
concern.  This means that performance, which is a significant concern
because of human factors, should be improved wherever that can be done
without impacting reliability.  There are quite a few programming
techniques which could be heaved at dump and restore which would greatly
increase performance or usability without impacting reliability at all.
The increases in performance which I've seen made to dump/restore with
zero decrease in reliability range from 2x to 10x.

As I stated earlier as well, the best-written dump/restore type of
utility I've used was free software from Archive Corp that was included
with a tape drive I had purchased for an MS-DOS PC.  It included double
buffering to drive the tape and disk at their limits (sketched below)
and a point-and-shoot interface for navigating the tape.  In terms of
reliability, usability and performance, this was a 4-star product.  By
comparison, dump/restore is 3 stars for reliability, and 1 star each for
performance and usability, IMNSHO.  As one would predict, the users
of that particular PC were very willing to back up and restore their
own files, given the ease and speed with which the task could be
accomplished.
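
For concreteness, here is a bare-bones sketch of the double-buffering
idea -- my own illustration, not Archive's code, with an invented
block size and minimal error handling.  A reader and a writer process
are joined by a pipe, so the disk read of record N+1 overlaps the
tape write of record N:

/*
 * Double buffering via two processes and a pipe: neither device
 * sits idle waiting for the other.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

#define BLK (32 * 1024)     /* one tape record */

int main(int argc, char **argv)
{
    char buf[BLK];
    int pfd[2], fd, n;

    if (argc != 3) {
        fprintf(stderr, "usage: %s rawdisk tape\n", argv[0]);
        exit(1);
    }
    if (pipe(pfd) < 0)
        exit(1);

    if (fork() == 0) {              /* child: raw disk -> pipe */
        close(pfd[0]);
        if ((fd = open(argv[1], O_RDONLY)) < 0)
            exit(1);
        while ((n = read(fd, buf, BLK)) > 0)
            write(pfd[1], buf, n);
        exit(0);
    }

    close(pfd[1]);                  /* parent: pipe -> tape */
    if ((fd = open(argv[2], O_WRONLY)) < 0)
        exit(1);
    /* a real version would reblock the short reads a pipe can
       return, so the tape always sees full records */
    while ((n = read(pfd[0], buf, BLK)) > 0)
        write(fd, buf, n);
    return 0;
}

The kernel's pipe buffer acts as the second buffer; chain more
processes together and you get the multi-buffering mentioned at the
start of this thread.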
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"I've never written a device driver, but I have written a device driver manual"
                -- Robert Hartman, IDE Corp.

xtdn@levels.sait.edu.au (02/07/91)

OK, so I've seen lots of postings suggesting that the reason restore is so
slow (compared to dump) is because reliability is more important than
efficiency.  But that argument is just a nonsense.  And it completely fails
to explain why dump (which, after all, needs to be just as reliable) is so
much faster than restore.

Now I don't know why there's such a difference in performance but I do
suspect that perhaps it's deliberate.  I think it's a reasonable assumption
that (sensible) people do backups much more often than they do restores.
Given that, I also think it makes good sense to optimise dump, even to the
point that restore suffers in performance.

One such optimisation could be to write the raw disk to tape (actually you'd
only dump those blocks that contain data that you want backed up, but the
point is that you'd be reading from the raw disk).  This would be quite fast
because you wouldn't be opening each file (which takes time), or reading the
file sequentially - see how much disk head movement you avoid?  Now such a
tape would consist of a number of chunks, each chunk detailing the file, the
file offset, and the data to write at that offset.  The restore process then
becomes a matter of reading the next chunk, opening and seeking the file, and
then writing the data.  All that head movement, opening files, seeking to the
right spot, and later, closing files, would certainly slow down the process.
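
Something like the following captures the scheme -- my sketch of the
guess above, not the real dump tape format; the struct layout and
names are invented.  Writing such a tape is one sequential pass over
the raw disk, while reading it back pays full filesystem overhead per
chunk:

/*
 * A self-describing chunk tape and the restore loop that replays it.
 */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

struct chunk {              /* one unit on the tape */
    char path[256];         /* which file the data belongs to */
    long offset;            /* where in that file it goes */
    long len;               /* how many data bytes follow */
};

int main(int argc, char **argv)
{
    struct chunk ch;
    char *buf;
    int tape, fd;

    if (argc != 2 || (tape = open(argv[1], O_RDONLY)) < 0) {
        fprintf(stderr, "usage: %s tape\n", argv[0]);
        exit(1);
    }
    while (read(tape, &ch, sizeof ch) == sizeof ch) {
        if ((buf = malloc(ch.len)) == NULL)
            exit(1);
        if (read(tape, buf, ch.len) != ch.len)
            exit(1);
        /* this loop body is where restore's time would go: a path
           lookup, a possible file creation, and a seek per chunk */
        if ((fd = open(ch.path, O_WRONLY | O_CREAT, 0666)) < 0)
            exit(1);
        lseek(fd, ch.offset, SEEK_SET);
        write(fd, buf, ch.len);
        close(fd);
        free(buf);
    }
    return 0;
}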

I already said that I don't know how dump/restore works, but I would almost
be willing to bet that it's something like the scheme I just outlined.  Maybe
someone who does know could tell us what really happens?


David Newall, who no longer works       Phone:  +61 8 344 2008
for SA Institute of Technology          E-mail: xtdn@lux.sait.edu.au
                "Life is uncertain:  Eat dessert first"

greywolf@unisoft.UUCP (The Grey Wolf) (02/08/91)

In article <15866.27b02da2@levels.sait.edu.au> xtdn@levels.sait.edu.au writes:
>One such optimisation could be to write the raw disk to tape (actually you'd
>only dump those blocks that contain data that you want backed up, but the
>point is that you'd be reading from the raw disk).  This would be quite fast
>because you wouldn't be opening each file (which takes time), or reading the
>file sequentially - see how much disk head movement you avoid?  Now such a
>tape would consist of a number of chunks, each chunk detailing the file, the
>file offset, and the data to write at that offset.  The restore process then
>becomes a matter of reading the next chunk, opening and seeking the file, and
>then writing the data.  All that head movement, opening files, seeking to the
>right spot, and later, closing files, would certainly slow down the process.
>
>I already said that I don't know how dump/restore works, but I would almost
>be willing to bet that it's something like the scheme I just outlined.  Maybe
>someone who does know could tell us what really happens?

You're not terribly far off, with the exception that UNIX doesn't keep
a timestamp for individual blocks -- only inodes hold the timestamp, and
there's no way to tell whether a particular block in the file has been
updated (this would be terribly inefficient anyway -- chances are that if
you've blown away a file, only having the changed blocks would be useless).

Dump works by reading the disk partition directly -- it performs all the
directory/file mapping on its own by reading the on-disk inode list for that
partition.  It looks in /etc/dumpdates to find the date of the previous
lower-level dump and, by examining the inodes, builds an internal map of
those inodes which have changed since that date (with a "level 0" dump,
everything since the beginning of time -- 4:00 pm, New Year's Eve, 1969
on the American West Coast ... (-:).  It then maps the directories in,
dumps the directory information out, and finally dumps the contents of
the files.  Wandering through the filesystem by oneself and performing
only the necessary operations is going to be much faster than going
through the kernel's filesystem overhead for every file.
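
In code, that pass structure looks roughly like this -- a sketch
reconstructed from the description above, not the BSD dump source;
the structures, sizes, and inode numbers are stand-ins:

/*
 * Everything below works on data read off the raw device; the
 * kernel is never asked to interpret the filesystem.
 */
#include <stdio.h>
#include <time.h>

#define NINODE 1000

struct dinode {                     /* stand-in for the on-disk inode */
    time_t di_mtime;
    int    di_isdir;
};

static struct dinode ilist[NINODE]; /* inode list read off the raw disk */
static char dumpmap[NINODE];        /* which inodes go to tape */

/* pass 1: mark every inode changed since the cutoff date taken
   from /etc/dumpdates (zero for a level-0 dump: everything) */
static void mark(time_t since)
{
    int i;
    for (i = 0; i < NINODE; i++)
        if (ilist[i].di_mtime > since)
            dumpmap[i] = 1;
}

/* pass 2: dump the marked directories first, so restore can rebuild
   the name space before any file contents arrive */
static void dumpdirs(void)
{
    int i;
    for (i = 0; i < NINODE; i++)
        if (dumpmap[i] && ilist[i].di_isdir)
            printf("directory inode %d -> tape\n", i);
}

/* pass 3: then dump the marked files, in inode-number order */
static void dumpfiles(void)
{
    int i;
    for (i = 0; i < NINODE; i++)
        if (dumpmap[i] && !ilist[i].di_isdir)
            printf("file inode %d -> tape\n", i);
}

int main(void)
{
    ilist[2].di_mtime = 1;          /* fake root directory */
    ilist[2].di_isdir = 1;
    ilist[3].di_mtime = 1;          /* fake file */
    mark((time_t)0);                /* level 0: since the epoch */
    dumpdirs();
    dumpfiles();
    return 0;
}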

[ Side note:  I *hate* operators who cannot think to keep track of the
  inode number of the file that is being dumped when they do multiple
  tape dumps!  Makes restores a *pain*. ]

Restore, on the other hand, is a dog.  Why?  It *has* to be.  When files are
being restored, one cannot simply rewrite the raw disk; the filesystem
overhead cannot be avoided on anything less than a full restore.  Even then,
one reason for avoiding a raw data dump (via dd(1) (yes, I know
that's not what dd stands for)) is that full backup/restores serve to reduce
disk fragmentation by putting everything back more or less contiguously.

(We used to have to do this periodically back at the lab because class
users had a tendency to produce lots and lots of little files.  The /users
file system would fragment ridiculously quickly over the semester.  I think
fragmentation reached about 5% (which is very high).)
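
To put that filesystem-overhead point in code (my sketch of the shape
of the work, not the restore source -- the function and file names
are hypothetical): every file on the tape comes back through the
normal system-call path, so each one pays for a path lookup, an inode
allocation, and a synchronous directory update before a single data
byte lands.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

/* put one file's worth of tape data back through the filesystem */
static int extract(const char *path, const char *data, long len)
{
    int fd;

    /* creat() is where the overhead lives: the kernel walks the
       path name, allocates a fresh inode, and writes the directory
       entry synchronously before returning */
    if ((fd = creat(path, 0666)) < 0)
        return -1;
    write(fd, data, len);
    close(fd);
    return 0;
}

int main(void)
{
    /* hypothetical file name, standing in for an entry off the tape */
    return extract("restored.demo", "hello, world\n", 13);
}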

It's also kind of convenient that if a normal user wishes to effect a
partial restore, he/she generally can, without having to be placed into a
special group or be given super-user privileges.



-- 
thought:  I ain't so damb dumn!	| Your brand new kernel just dump core on you
war: Invalid argument		| And fsck can't find root inode 2
				| Don't worry -- be happy...
...!{ucbvax,acad,uunet,amdahl,pyramid}!unisoft!greywolf