baier@unipas.fmi.uni-passau.de (Joern Baier) (05/20/91)
Two weeks ago I posted an article about the problems that can arise when doing a dump in multi-user mode. Nearly everyone who answered said that at his site the backups are run in multi-user mode, but nobody had yet observed a serious error as a result of this policy. Problems may occur when a file or a directory has been deleted between two passes of the dump. Peter Renzland <peter@ontmoh.uucp> explained this in great detail, so I will quote him here in full:

>...
>One thing that can happen is that a file is deleted, and the space is
>re-allocated to a new file between the time the Inode for the old file
>is written to tape and the data for what used to be the old file is written.
>This data now belongs to another file. This is no problem until you try
>to restore. Now the old inode points to data blocks which not only contain
>data that could be none of the business of the owner of the old file, but
>there is now a file-system inconsistency, because two files (inodes) now
>point to the same data, and the old one could have had indirect blocks,
>which may result in filesystem corruption that extends beyond just the old
>file, and the (few) new file(s) that were created during the backup window
>of vulnerability.
>
>All this is possible because dump bypasses the filesystem, for added
>speed (and diminished integrity).
>...

According to Alain Brossard (brossard@sasun1.epfl.ch) it is also possible that the entire dump will become unreadable if the freed inode had been a directory and is now a file (or vice versa?), but this seems to be very unlikely.

Thanks to all who answered.

Joern.
--
Joern Baier (baier@unipas.fmi.uni-passau.de)
Jesuitengasse 9
D-W8390 Passau
Tel.: +49/851/35239
zwicky@erg.sri.com (Elizabeth Zwicky) (05/21/91)
In article <1991May20.123129.14433@forwiss.uni-passau.de> baier@unipas.fmi.uni-passau.de (Joern Baier) writes:
>Nearly everyone who answered pointed out that at his site the backups are
>run while in multi-user mode but nobody has already observed a serious error
>as a result of this policy.

Sometimes I feel that I'm doomed to spend my life repeating this, but here goes: Yes, it does cause problems; I have seen it do so with my very own eyes more than once. Depending on the version of dump you are running, these problems range from files that are not on the tape to unusable tapes. Most people do very few restores; many of the people who have never had a problem have never done a full restore, either. I have seen files come up missing; I have also seen someone do a full restore on a filesystem which fsck then deleted.

If you cannot risk having a backup be bad, don't do it in multi-user. You can probably risk having your daily backups be bad. You probably can't risk all your backups. This is why many of us do some backups in multi-user and some in single-user.

	Elizabeth Zwicky
	zwicky@erg.sri.com
jay@silence.princeton.nj.us (Jay Plett) (05/21/91)
In article <1991May20.204327.17694@erg.sri.com>, zwicky@erg.sri.com (Elizabeth Zwicky) writes:
> In article <1991May20.123129.14433@forwiss.uni-passau.de> baier@unipas.fmi.uni-passau.de (Joern Baier) writes:
> >Nearly everyone who answered pointed out that at his site the backups are
> >run while in multi-user mode but nobody has already observed a serious error
> >as a result of this policy.
> ...
> Yes, it does cause problems, I have seen it do so ...
> ... Most people do very few restores; many of the people who have
> never had a problem have never done a full restore, either.

I have done full restores. Not a lot of them, relative to the number of dumps. I have never had a problem restoring a dump made on a live filesystem. This does not imply that I never will.

> ... If you cannot risk
> having a backup be bad, don't do it in multi-user.

Good advice.

> You can probably
> risk having your daily backups be bad.

Ah, there's the point. If you can _risk_ losing one or two days' work, then do daily level 0s on live filesystems. This is the beauty of Exabytes--it is feasible to do so. If a tape is bad at restore time, toss it and go back a day. If that one is bad, go back another day. The risk diminishes greatly with each day you go back.

Look at the odds. The probability of a disk crash on any particular day is really very small. The probability of a bad level 0 done on a live filesystem might be larger, but it's still small. The probability of two successive bad tapes is smaller. Apply your favorite function to calculate the probability of a bad tape coinciding with both of the two days before a crash, and decide if that risk is acceptable to your users. Balance the risk against the cost to your users of routinely shutting down for backups. Don't forget to evaluate the possibility that a dump of an idle system might also be unrestorable.

I believe that--for many sites--the advantages of dumping live filesystems outweigh the disadvantages.

	...jay
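[Jay's back-of-envelope calculation can be sketched concretely. The probabilities below are invented purely for illustration, and the independence assumption is exactly what later posters in this thread dispute.]

```shell
# Illustrative only: the numbers are invented, and multiplying them
# assumes bad dumps are independent events -- which later posts argue
# is false when the same process corrupts every nightly dump.
#   p_crash: chance of a disk crash on a given day
#   p_bad:   chance that a live level 0 is unrestorable
awk 'BEGIN {
    p_crash = 0.001; p_bad = 0.05
    # chance that a crash coincides with bad dumps on both prior days
    printf "p = %g\n", p_crash * p_bad * p_bad
}'
```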
tjc@ecs.soton.ac.uk (Tim Chown) (05/21/91)
In <1991May20.204327.17694@erg.sri.com> zwicky@erg.sri.com (Elizabeth Zwicky) writes:
>I have seen files come up missing; I have also seen someone do a full
>restore on a filesystem which fsck then deleted. If you cannot risk
>having a backup be bad, don't do it in multi-user. You can probably
>risk having your daily backups be bad. You probably can't risk all
>your backups. This is why many of us do some backups in multi-user and
>some in single-user.

Another solution, if you are backing up user files and only have a few hundred Mb to do overnight, is simply to back up with 'tar' rather than 'dump'. We use that method to do 300Mb of user files from a Masscomp onto an Exabyte on a Sun and have never had a problem - the tar takes nearly four hours (it would take less on a local device).

	Tim
--
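[A tar-based user-file backup of the kind Tim describes might look like the sketch below. The hostname and device name are illustrative, not from his post, and the archive goes to a plain file here so the commands can actually be run.]

```shell
# Sketch of a tar-based user-file backup. On the real setup the archive
# would go to a remote Exabyte, something like:
#   tar cf - home | rsh sunhost dd of=/dev/nrst0 obs=20b
# (hostname and device illustrative). A plain file stands in for the
# tape here so the sketch is self-contained.
mkdir -p /tmp/tardemo/home/alice
echo "user data" > /tmp/tardemo/home/alice/notes
cd /tmp/tardemo
tar cf /tmp/tardemo.tar home
# Check that the file made it into the archive.
tar tf /tmp/tardemo.tar | grep alice/notes
```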
mwm@pa.dec.com (Mike (My Watch Has Windows) Meyer) (05/21/91)
In article <690@silence.princeton.nj.us> jay@silence.princeton.nj.us (Jay Plett) writes:
> Look at the odds. The probability of a disk crash on any particular
> day is really very small. The probability of a bad level 0 done on a
> live filesystem might be larger, but it's still small. The probability
> of two successive bad tapes is smaller.
This last statement is only true if you assume that bad dumps are
unrelated. This is a false assumption. Given that someone was doing
something that caused a dump to be bad one day, I'd say the
probability of them having done that the previous day is larger than
the probability that a dump would be bad.
And the longer they've been doing it, the more likely it is that the
dump is bad. For instance, if you find that every dump made on a
weekday night for the last month is bad, which way would you bet on
the dump for tonight, if it were a weekday?
You're right - a stable file system doesn't guarantee that you can do
a restore. It eliminates one source of problems, and one that can be
set up to occur on a regular basis, at that.
<mike
--
But I'll survive, no you won't catch me, Mike Meyer
I'll resist the urge that is tempting me, mwm@pa.dec.com
I'll avert my eyes, keep you off my knee, decwrl!mwm
But it feels so good when you talk to me.
zwicky@erg.sri.com (Elizabeth Zwicky) (05/22/91)
The "we'll just do live backups over and over again" theory suffers from a common problem with security through redundancy: common-mode failures. All the backups may fail the same way if the same file is always active when they're running. The easiest way to do this is to accidentally get the backups synchronized with a cron job or a very predictable human, but you can get the same effect with a very long-running program. Thus, you can back up something a few hundred times and have all the backups missing the same file. This is Not Fun.

Using tar instead of dump buys you extremely little. tar will skip active files, which means they won't corrupt your backup. This is its sole advantage, and it's only an advantage over some versions of dump. It will *also* skip files with names that are too long; depending on the version of tar you are running, it may also exhibit various other nasty problems dump doesn't have. On the whole, dump is safer.

	Elizabeth Zwicky
	zwicky@erg.sri.com
russell@ccu1.aukuni.ac.nz (Russell J Fulton;ccc032u) (05/22/91)
zwicky@erg.sri.com (Elizabeth Zwicky) writes:
>Using tar instead of dump buys you extremely little. tar will skip
>active files, which means they won't corrupt your backup. This is its
>sole advantage, and its only an advantage over some versions of dump.
>It will *also* skip files with names that are too long; depending on
>the version of tar you are running, it may also exhibit various nasty
>other problems dump doesn't have. On the whole, dump is safer.

Would some knowledgeable person care to comment on bru in light of Elizabeth's comments above? We use bru to back up our SGI 4D/240S with 6GB of disk (five 1.2GB drives). We back up one drive a night and do an incremental on the rest. The system is usually fairly quiet when the backup is done (in the small hours), with only a small number of batch jobs active.

We have had no trouble yet, and have had to restore a disk on two occasions in the last year.

Cheers, Russell.
--
Russell Fulton, Computer Center, University of Auckland, New Zealand.
<rj_fulton@aukuni.ac.nz>
verber@pacific.mps.ohio-state.edu (Mark Verber) (05/22/91)
In article <690@silence.princeton.nj.us> jay@silence.princeton.nj.us (Jay Plett) writes:
> Ah, there's the point. If you can _risk_ losing one or two days work,
> then do daily level 0s on live filesystems. This is the beauty of
> Exabytes--it is feasible to do so. If a tape is bad at restore time,
> toss it and go back a day. If that one is bad, go back another day.
> The risk dimishes greatly with each day you go back.
Don't bet on it. Let's say that you are running your backups from cron
-- most of us with Exabytes do. Suppose you have something else
running in cron, or, like my site, a user process that runs for days at
a time doing all sorts of I/O. Let's say the I/O going on when
dump runs happens to be just the wrong kind -- i.e., your dump is
corrupted. Every dump you take could be screwed -- redundancy didn't
win you much, did it?

I understand the desire to do dumps on active file systems for daily
incrementals... but what do people have against doing the level 0
dumps in single-user? Can't you afford a few hours downtime in the
middle of the night once a month to ensure a clean dump? You don't
even have to be around while the dumps are running with Exabytes, since
you don't have to change tapes; if your full saves won't fit on the
drive(s) you have, get a stacker.
sigh,
mark
jeffl@NCoast.ORG (Jeff Leyser) (05/22/91)
In post <1991May21.172208.281@erg.sri.com>, zwicky@erg.sri.com (Elizabeth Zwicky) says:
!! [Redundant dumps don't buy you much]
!!Using tar instead of dump buys you extremely little. tar will skip
!!active files, which means they won't corrupt your backup. This is its
!!sole advantage, and its only an advantage over some versions of dump.
!!It will *also* skip files with names that are too long; depending on
!!the version of tar you are running, it may also exhibit various nasty
!!other problems dump doesn't have. On the whole, dump is safer.

What about find | cpio? We do that here to back up 1.5GB to a single Exabyte during a "quiet" period on Sunday. Multi-user, but quiet.
--
Jeff Leyser                                      jeffl@ncoast.org
Opinions?  I thought this was typing practice!   leyser@tsa.attmail.com
rsk@gynko.circ.upenn.edu (Rich Kulawiec) (05/22/91)
In article <VERBER.91May21181220@avalon.mps.ohio-state.edu> verber@pacific.mps.ohio-state.edu (Mark Verber) writes:
>I understand the desire to do dumps on active file systems for daily
>incrementals... but what do people have against doing the level 0
>dumps in single user. Can't you afford a few hours downtime in the
>middle of the night once a month to insure a clean dump?

The answer to that last question, for some folks in certain environments, is "no, we can't". But if you are running 4.3BSD or a derivative thereof, you are probably running a version of dump(8) that has modifications made at BRL (by Doug Gwyn, I think), Purdue EE (George Goble), Purdue CS (Dan Trinkle) and Purdue CC (me). Some of those modifications are intended to prevent dump from being confused by an active filesystem.

Those mods aren't bulletproof -- but in about five years of using this version of dump on many machines (VAX, Sun-3, Sun-4, MIPS, Pmax, etc.) I've never encountered a dump that I couldn't restore, i.e. that wasn't self-consistent. That doesn't mean that all such dumps were "complete", especially since the meaning of "complete" gets fuzzy when we attempt to apply that term to an active filesystem; but it does mean that I had what I needed to recover from crashes.
--
---Rsk
rsk@gynko.circ.upenn.edu
verber@pacific.mps.ohio-state.edu (Mark Verber) (05/22/91)
In article <43617@netnews.upenn.edu> rsk@gynko.circ.upenn.edu (Rich
Kulawiec) writes in response to my question about doing level 0 dumps
in single-user mode, regarding BSD 4.3 dump plus the Purdue and BRL
hacks, which are less likely to have problems with an active file system:
> Those mods aren't bulletproof -- but in about five years of using this
> version of dump on many machines (VAX, Sun-3, Sun-4, MIPS, Pmax, etc.)
> I've never encountered a dump that I couldn't restore, i.e. that wasn't
> self-consistent. That doesn't mean that all such dumps were
> "complete", especially since the meaning of "complete" gets fuzzy when
> we attempt to apply that term to an active filesystem; but it does
> mean that I had what I needed to recover from crashes.
I am glad for you. On the other hand, I have seen restores fail
utterly when the dump was taken on an active file system. The dump we
used had all the above patches installed! We all know that even with all
those mods there are failure conditions: Chris Torek and others
have posted them from time to time, so I am not going to repeat them.
Murphy's Law indicates that when you are in critical need of a clean
dump... that is when you happen to get an inconsistent one. I
haven't found a corrupted dump tape often, but it has happened ...
which is enough to keep me doing level 0s in single-user. I value my
users' data. [This may come from an incredible three-day period of
time when the staff retyped an entire thesis for a graduating PhD
candidate who lost everything after a series of failures.]
Another thing to note is that most of us are seeing more and more disk
hanging off our machines. There are more files and more dumps being
done. A few years ago most of us had 1-2GB of disk. These days sites
with 5-10x that are common. That increases the total number of
possible failures, because a lot more dumps are being run. The
sites I know of that have seen inconsistent dumps are also sites that
already had 10-30 servers and 15-30GB of disk a few years ago.
Mark Verber
Ohio State Physics Dept / Computing Services
jb3o+@andrew.cmu.edu (Jon Allen Boone) (05/23/91)
Ok, so I'll take the advice that you ought to do level 0 dumps in single-user mode. Question: Will the following scenario work?

2:00am (Dump Time!) Cron on machine A (MA) says to do dumps. The Exabyte is on machine B (MB), so rsh the job to MB. MB determines which filesystems need to be dumped at what levels. Let's say that there are 10 different file systems, and that 1 needs a level 0 dump. Machine C (MC) needs the level 0 dump. Cron on MC has a job scheduled which determines that it needs a level 0 dump - so it shuts down. Then, it dumps the level 0 filesystem, rsh'ing the output to a dd command on MB. Once it's done, it reboots the machine to multi-user mode.

Well? (Note: Comments about the insecurity of rsh, etc. are welcome - but probably already known.) If you can't do that, then WHAT can you do? If you have many different file systems - too many to actually go around and hang a unique Exabyte off of each one - and you can't realistically change the location of the Exabyte each night, what do you do other than multi-user mode backups (some of which are level 0)?

Also, I just today did a multi-user backup/restore from one machine to another - a level 0 dump of both / and /usr - restoring each to another machine (with myself and another person logged in on the dump machine) - and it seems to have worked just fine.

	-=> iain <=-
----------------------------------|++++++++++++++++++++++++++++++++++++++++
| "He divines remedies against injuries;  | "Words are drugs."           |
| he knows how to turn serious accidents  |     - Antero Alli            |
| to his own advantage; whatever does not |                              |
| kill him makes him stronger."           | "Culture is for bacteria."   |
|   - Friedrich Nietzsche                 |     - Christopher Hyatt      |
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
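[The core of the MC-to-MB step is a dump-over-rsh pipe. Device names, flags, and hostnames below are illustrative, not from the post; since a real dump needs a raw disk and a tape drive, a stand-in stream and a local file are used so the shape of the pipeline can be exercised.]

```shell
# The remote-drive pipe from the scenario. On MC, after dropping to
# single-user, the real invocation would be roughly:
#   /etc/dump 0uf - /dev/rsd0g | rsh MB dd of=/dev/nrst0 obs=126b
# (device names and flags illustrative). Below, an echo and a local
# file stand in for dump and the tape.
echo "pretend dump stream" | dd of=/tmp/dumpdemo.out obs=32k 2>/dev/null
# Confirm the stream arrived intact on the "tape".
grep "pretend dump stream" /tmp/dumpdemo.out
```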
olson@anchor.esd.sgi.com (Dave Olson) (05/23/91)
In <1991May21.213844.12302@ccu1.aukuni.ac.nz> russell@ccu1.aukuni.ac.nz (Russell J Fulton;ccc032u) writes:
| zwicky@erg.sri.com (Elizabeth Zwicky) writes:
|
| >Using tar instead of dump buys you extremely little. tar will skip
| >active files, which means they won't corrupt your backup. This is its
| >sole advantage, and its only an advantage over some versions of dump.
| >It will *also* skip files with names that are too long; depending on
| >the version of tar you are running, it may also exhibit various nasty
| >other problems dump doesn't have. On the whole, dump is safer.
|
| Would some knowledgable person care to comment on bru in light of Elizabeth's
| comments above. We use bru to back up our SGI 4D/240S with 6GB disk (five
| 1.2 GB drives) We back up one drive a night and do an incremental on the
| rest. The system is usually fairly quiet when the backup is done (in the
| small hours) with only a small number of batch jobs active.
|
| We have had no trouble, yet, and have had to restore a disk on two occasions
| in the last year.

This should be no problem, as long as you don't run into filename length limitations. bru has a limit of 127 chars for the pathname length. bru behaves similarly to tar when files change size while they are being backed up. bru also tends to be somewhat :) wasteful of tape, using about 25% more tape for typical files than tar, due to the checksums, etc. that it does.

Most versions of tar limit you to 100 chars of pathname (which might be a relative or full path); the POSIX version, which should be showing up on various systems (such as IRIX 4.0), has a 255 char limit.

I have NEVER seen a version of tar that skips 'active files'. Files that grow between the time tar stats them and finishes writing them will only have the original length; those that get shorter via truncations and rewrites will be padded with nulls to the original size.
One of the main limitations of bru and tar relative to dump (for some people) is simply that they typically take longer, since they go through the filesystem. This is much more evident when many small files are backed up, as on most unix systems the open time becomes dominant. tar also suffers from the limitation (in many people's minds) that it is difficult to do incremental backups. bru has the ability to back up files based on mtime, but this misses ctime changes, such as owner or permission changes.
--
Dave Olson
Life would be so much easier if we could just look at the source code.
FFAAC09@cc1.kuleuven.ac.be (Nicole Delbecque & Paul Bijnens) (05/23/91)
In article <1991May21.172208.281@erg.sri.com>, zwicky@erg.sri.com (Elizabeth Zwicky) says:
>
>Using tar instead of dump buys you extremely little. tar will skip
>active files, which means they won't corrupt your backup.

What is meant by "active files" in the context of tar? How does tar know when a file is opened by another program (it cannot read /dev/kmem, can it)? Do you mean it skips the files, not putting them on the tape?

I thought the fundamental difference between dump and tar/cpio was that tar/cpio just read the files through the block device while dump reads the raw device. In the light of "internal" consistency of the backup (i.e. the tape can always be restored) using tar, can somebody explain what happens in these cases:

1. While tar reads through the file, the file grows at the end; e.g. log files frequently do this. Where does tar stop? At the old end of the file (so that the inode information on the tape is consistent with the length of the data following it) or when it encounters EOF of the disk file (but now the tape is inconsistent)?

2. Someone truncates the file (to 0, or with the syscall truncate()). Tar just reads the block device. Will tar add zero-filled blocks to match the recorded file length?

3. Let's suppose a dbm file. Tar just reads sequentially. Some other program updates the dbm file. I think the tape will contain some file (it will restore without problems), but you cannot do anything useful with the restored file.

Our version of cpio maintains its consistency: it can always restore the files. But, as in case 3, the restored file could be useless. How does tar behave in this respect?

p.s. Another "advantage" of tar/cpio for doing the backups is that you can restore the files on a different system (e.g. from a BSD to a SysV machine) without much hassle. This becomes more important for the archived backups. You never know what machine you will have in 5 years from now.
--
Polleke (Paul Bijnens)
Linguistics dept., K. University Leuven, Belgium
FFAAC09@cc1.kuleuven.ac.be
peter@ficc.ferranti.com (Peter da Silva) (05/24/91)
What I don't understand is why people are still using "dump" to do backups. A pretty minimal script using "find -newer level-file" and "cpio" works just fine on active file systems.
--
Peter da Silva; Ferranti International Controls Corporation; +1 713 274 5180;
Sugar Land, TX 77487-5012; `-_-' "Have you hugged your wolf, today?"
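[The level-file idiom Peter describes can be sketched as below. Paths are illustrative, a plain file stands in for the tape, and the cpio stage is shown only in a comment so the sketch is self-contained.]

```shell
# Incremental backup sketch in the style Peter describes: a "level
# file" records when the last backup ran, and find -newer picks up
# everything modified since then. Paths are illustrative.
mkdir -p /tmp/incdemo/data
touch /tmp/incdemo/level-file
sleep 1                               # let the clock tick past the level file
echo "changed" > /tmp/incdemo/data/report
find /tmp/incdemo/data -newer /tmp/incdemo/level-file -type f -print \
    > /tmp/incdemo/filelist
# (a real script would pipe this list into cpio -oc and onto the tape)
grep report /tmp/incdemo/filelist
touch /tmp/incdemo/level-file         # stamp this run for the next pass
```

Note Ray Loyzaga's caveat downthread: -newer compares modification times, so files "touched backwards" are missed unless the script checks the inode change time instead.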
yar@cs.su.oz (Ray Loyzaga) (05/24/91)
In article <KJIBZ8B@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
> What I don't understand is why people are still using "dump" to do backups?
> A pretty minimal script using "find -newer level-file" and "cpio" works just
> fine on active file systems.

Restore -i is pretty cute, particularly as our users rarely know the complete path name of a file accurately, and dump is faster ...

What does cpio do if it receives a name from find that has just been removed? How about directories? Do you have to read the entire cpio file to know if a file is on it (assuming no TOC held on disk)? Does -newer just check the modification time? If so, you might miss some files that have been touched backwards; it should use the inode change time.
rmtodd@servalan.uucp (Richard Todd) (05/24/91)
peter@ficc.ferranti.com (Peter da Silva) writes:
>What I don't understand is why people are still using "dump" to do backups?
>A pretty minimal script using "find -newer level-file" and "cpio" works just
>fine on active file systems.

1. "dump" preserves the access times on files, and "restore" restores the files with the access times set correctly. "cpio" neither records the access times in its archive nor leaves the access times of the files on disk unaffected. Thus, "cpio" screws up any schemes one may have for locating user files that haven't been accessed in, say, 6 months and automatically moving them off to tape and deleting them.

2. "dump" handles files with holes in them correctly (the holes don't take up space on the backup, and "restore" restores the files with holes correctly). "cpio" doesn't. Having all your dbm files suddenly explode in disk usage after having been brought back off of tape is considered bad form in some circles...

3. Just how were you planning to do restores of those incremental backups? Seems to me that the naive approach (extracting the incremental cpio just like the full cpio backups) won't work correctly on directories which have had files deleted between the making of the full and the incremental backup. Say you've got a directory "foo", which had files "a", "b", and "c" in it at the time of the full backup, but between that backup and the incremental someone deleted files "a" and "b". In restoring the filesystem after the crash, you read in the level-0 cpio backup, which puts "foo/a", "foo/b", and "foo/c" on the disk. Now you read in the incremental cpio backup, which (because "foo" had files deleted, shows up with a newer mod time than the level-0, and thus gets backed up) has the "foo" directory and "foo/c" on it, and thus foo/c gets written to your disk--but "foo/a" and "foo/b" are not deleted. You have not restored the filesystem to its state as of the time of the incremental backup. This means that you need to do some extra work to make sure that all the stuff you got rid of once gets gotten rid of again. (Note that if you're really unlucky, and had a lot of old stuff deleted and new stuff added between the full and incremental backups, restoring the incremental cpio file will fill your disk.) This just adds to the hassle a sysadmin has to deal with when restoring a filesystem, when usually the sysadmin has entirely too much to deal with anyway...

Basically, there's a subtle difference between the goals of "dump" and "cpio". "Dump" is a *backup* program; its function is to save the state of a filesystem in such a way that it can be restored exactly later. "Cpio" is an archiving program; like "tar" or "zoo", its function is to package up a bunch of files in a halfway portable fashion so that they can be transported about easily from one place to another, from one system to another. You can try to press "cpio" or "tar" into service as a backup program, but it's not really the same thing...
--
Richard Todd   rmtodd@uokmax.ecn.uoknor.edu   rmtodd@chinet.chi.il.us
               rmtodd@servalan.uucp
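[Richard's foo/a, foo/b scenario is easy to replay. tar stands in for cpio below (the effect is the same for any extract-in-place restore), and all paths are scratch paths for the demonstration.]

```shell
# Replay of the deleted-files problem: the incremental brings foo/c
# back but never deletes foo/a and foo/b, so the restored tree does
# not match the filesystem at incremental time.
cd /tmp && rm -rf restoredemo && mkdir restoredemo && cd restoredemo
mkdir foo && touch foo/a foo/b foo/c
tar cf full.tar foo                    # "level 0" with a, b, c
rm foo/a foo/b                         # user deletes a and b
tar cf incr.tar foo                    # "incremental": foo now holds only c
rm -rf foo                             # the "disk crash"
tar xf full.tar                        # restore the full dump...
tar xf incr.tar                        # ...then the incremental
ls foo                                 # a and b are back from the dead
```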
benseb@grumpy.sdsc.edu (Booker Bense) (05/24/91)
In article <1991May24.013214.2526@servalan.uucp> rmtodd@servalan.uucp (Richard Todd) writes:
>peter@ficc.ferranti.com (Peter da Silva) writes:
>
>>What I don't understand is why people are still using "dump" to do backups?
>>A pretty minimal script using "find -newer level-file" and "cpio" works just
>>fine on active file systems.
>
> [stuff about dump being better]

Well, I've been wrestling with this problem for some time now. I sort of run things on a network that consists of 2 Ultrix Decstations, 3 VMS/vaxstations and some xterms. The VMS disks are visible from the decstations using UCX. We have one 1.2 gig DAT hanging off a Decstation and I am attempting to implement a reasonable backup strategy.

First you have to define why you are backing up. I have two goals in mind:

1. Disk crashes - need to recreate enough of the environment to be useful.
2. Pilot error - backups for accidental deletion by users.

These two objectives have totally different goals and I have come to the conclusion that TWO different backup strategies are needed.

To implement the first I do ``dump''s of the major filesystems once a month. I come in on a Saturday and do this with no one on the machine. After this discussion, I'll do it in single-user mode.

For the second I have set NFS up so root on the machine with the DAT can read any file on the network (either VMS or Ultrix). With various combinations of find and egrep -v I create a list of files from the ``user filesystems'' and use GNU tar to dump this list onto the end of a tar archive. This job is run by cron every night. GNU tar has enough flexibility that I can get only the ``latest version'' of a file off the archive when necessary. I also have utilities that take care of converting VMS variable record length to Stream-LF format. This has proven far more useful than the dump tapes and is relatively automatic (I only have to change tapes about once a month).

The hard part has been convincing the kernel that the tape drive really was capable of 1.2 gigs. Many thanks to Don Rice in comp.unix.ultrix for the helpful advice.

- Booker C. Bense
preferred: benseb@grumpy.sdsc.edu     "I think it's GOOD that everyone
NeXT Mail: benseb@next.sdsc.edu        becomes food" - Hobbes
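[The append-and-take-the-latest scheme Booker describes can be sketched with tar's -r (append) option; the file list, paths, and names below are illustrative, not from his setup. On a full extraction, later copies of a name overwrite earlier ones, which is one way the "latest version" falls out.]

```shell
# Sketch of a nightly append-to-archive scheme: tar's -r appends new
# entries to an existing archive, and on full extraction the last
# (newest) copy of each name overwrites the earlier ones. Paths are
# scratch paths for the demonstration.
cd /tmp && rm -rf appdemo && mkdir appdemo && cd appdemo
mkdir users
echo "monday version"  > users/thesis
tar cf nightly.tar users               # first night's run
echo "tuesday version" > users/thesis
tar rf nightly.tar users               # second night: append, don't rewrite
rm -rf users
tar xf nightly.tar                     # both copies extract; newest wins
cat users/thesis                       # prints "tuesday version"
```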
peter@ficc.ferranti.com (Peter da Silva) (05/25/91)
In article <2458@cluster.cs.su.oz.au> yar@cluster.cs.su.oz (Ray Loyzaga) writes:
> Restore -i is pretty cute, particularly as our users rarely know the
> complete path name of a file accurately,

We can just grep the backup.log file. We have to do that to find the volume number anyway.

> and dump is faster ...

That's probably true.

> What does cpio do if it receives a name from find that has just been removed?

Continues.

> How about directories?

Ditto.

> Do you have to read the entire cpio file to
> know if a file is on it (assuming no TOC held on a disk)?

We keep a list of files on disk.

> Does -newer just check the modification time, if so you might miss some files
> that have been touched backwards, it should use the inode change time.

Newer uses the modification time. I believe we use the inode change time (we don't actually use find, but rather a faster special-purpose program).
--
Peter da Silva; Ferranti International Controls Corporation; +1 713 274 5180;
Sugar Land, TX 77487-5012; `-_-' "Have you hugged your wolf, today?"
peter@ficc.ferranti.com (Peter da Silva) (05/25/91)
In article <1991May24.013214.2526@servalan.uucp> rmtodd@servalan.uucp (Richard Todd) writes:
> 1. "dump" preserves the access times on files, and "restore" restores the
> files with the access times set correctly. "cpio" neither records the access
> times in its archive nor leaves the access times of the files on disk
> unaffected. Thus, "cpio" screws up any schemes one may have for locating
> user files that haven't been accessed in, say, 6 months and automatically
> moving them off to tape and deleting them.

Your cpio might have all those flaws. Ours doesn't. Ever hear of a program by the name of "pax"?

> 2. "dump" handles files with holes in them correctly (the holes don't take
> up space on the backup, and "restore" restores the files with holes correctly).
> "cpio" doesn't. Having all your dbm files suddenly explode in disk usage
> after having been brought back off of tape is considered bad form in some
> circles...

Again, a solved problem. (We don't use dbm, but our databases do have similar behaviour.)

> 3. Just how were you planning to do restores of those incremental backups?

We don't worry about deleted files reappearing, and it has not been a problem in general. We do not restore en masse from major disasters anyway... it's always a good chance to tidy up old software, bring a system to the latest rev level of everything, and so on.

> This means that you need to do
> some extra work to make sure that all the stuff you got rid of once gets
> gotten rid of again.

This is a minor problem compared to the complexity of shutting down all the systems for the daily backups. I don't think we could work that way in any case, as we usually back up a lot of systems remotely over the network.

> You can try to press "cpio" or "tar" into service as a backup program, but
> it's not really the same thing...

Until UNIX ships with a version of dump we can use, we don't really have an alternative. I'm really surprised that anyone with any significant number of machines is still using it.
--
Peter da Silva; Ferranti International Controls Corporation; +1 713 274 5180;
Sugar Land, TX 77487-5012; `-_-' "Have you hugged your wolf, today?"
rca@ingres.com (Bob Arnold) (05/25/91)
In article <KJIBZ8B@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>What I don't understand is why people are still using "dump" to do backups?
>A pretty minimal script using "find -newer level-file" and "cpio" works just
>fine on active file systems.

Is this a serious question? If it is related to the discussion about potential problems with dump, find | cpio is vulnerable to sync problems too, because the find is running ahead of (and much faster than) the cpio.

A more generic answer:

1) dump will not traverse filesystem/NFS boundaries. So just how am I supposed to back up the root filesystem with cpio? Like this?!?!:

	cd / ; find . -print | cpio ....

Try that on a big NFS server/client sometime. To make life even more miserable, many systems without BSD dump also lack BSD find's "-xdev" or "-fstype nfs -prune" options.

2) cpio's user interface is far inferior to dump/restore for both backup and (especially) file retrieval. dump/restore has all this, cpio doesn't:
   a) messages telling the user how much work has been done so far (yeah, some people might not call this a feature, and dump does it in a funky way, but it's better than nothing if you like it)
   b) good media load error handling (write-protected media, etc.)
   c) interactive selection of files for retrieval
   d) decent end-of-tape handling (do you want to restart the volume?)

3) dump provides services that cpio doesn't:
   a) tracking of multi-level backups
   b) lists of files that are supposed to be on the tape
   c) access to devices on remote hosts

4) many versions of cpio can't handle multi-volume backups.

5) various vendor weirdnesses with cpio (these are more braindamaged than most cpio's):
   a) at least one new version of cpio doesn't handle device nodes
   b) at least one version of cpio can't write portable headers (the "c" option)

I say all this as someone who has written a backup script to put multiple filesystems on a single tape. It works on perhaps 30 UNIX variants. The part of the script that actually puts backups on tape is 18 lines when it uses dump and 282 lines when we are forced to use cpio (both counts include comments). The extra code is basically there to provide as much as I can of dump's functionality (sort of as a wrapper for cpio). Overall, I could probably eliminate two thirds to three quarters of my code if I didn't have to deal with cpio.

Now, if I only had the choice :-)

Bob
--
Bob Arnold    ASK / Ingres Product Division
1080 Marina Village Parkway, Alameda, CA, 94501
rca@ingres.com    415/748-2819
bernie@metapro.DIALix.oz.au (Bernd Felsche) (05/25/91)
In <1991May24.013214.2526@servalan.uucp> rmtodd@servalan.uucp (Richard Todd) writes:
>peter@ficc.ferranti.com (Peter da Silva) writes:
>>What I don't understand is why people are still using "dump" to do backups?
>>A pretty minimal script using "find -newer level-file" and "cpio" works just
>>fine on active file systems.

>1. "dump" preserves the access times on files, and "restore" restores the
>files with the access times set correctly. "cpio" neither records the access
>times in its archive nor leaves the access times of the files on disk
>unaffected. Thus, "cpio" screws up any schemes one may have for locating
>user files that haven't been accessed in, say, 6 months and automatically
>moving them off to tape and deleting them.

Depends on the cpio options... doesn't your cpio have -a? Do you know how to
use it?

>2. "dump" handles files with holes in them correctly (the holes don't take
>up space on the backup, and "restore" restores the files with holes correctly).
>"cpio" doesn't. Having all your dbm files suddenly explode in disk usage
>after having been brought back off of tape is considered bad form in some
>circles...

This is a problem with any backup utility I know of which accesses files
through the filesystem. The "holes" are not physical; you have to look at
the blocks allocated to check for holes. Mind you, there's nothing that
stops a utility from doing it through the filesystem, is there? After all,
any run of NULs in a file can be represented as a relative seek. Heck, you
could *shrink* files for backup.

>Basically, there's a subtle difference between the goal of "dump" and
>"cpio". "Dump" is a *backup* program; its function is to save the state of
>a filesystem in such a way that it can be restored exactly later. "Cpio" is
>an archiving program; like "tar" or "zoo", its function is to package up a
>bunch of files in a halfway portable fashion so that they can be transported
>about easily from one place to another, from one system to another. You can
>try to press "cpio" or "tar" into service as a backup program, but it's
>not really the same thing...

More than a subtle difference. We do an image copy of our root filesystem
every night onto a spare partition on another drive. Even with the system
idle, an fsck on the copy still reports problems... nothing serious...
mostly FIFO file size errors. The copy takes 3 minutes. This is archived
on tape once a week.
--
Bernd Felsche,            _--_|\   #include <std/disclaimer.h>
Metapro Systems,         / sale \  Fax:   +61 9 472 3337
328 Albany Highway,      \_.--._/  Phone: +61 9 362 9355
Victoria Park,               v     Email: bernie@metapro.DIALix.oz.au
Western Australia
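Bernd's point that the holes are visible only in the block counts, not through read(2), can be made concrete. Here is a small sketch of my own (in Python for brevity; the era's tools would do the same stat(2) check in C), assuming a filesystem that supports sparse files: seeking past EOF before writing leaves a hole, so the logical size is far larger than the space actually allocated, which is exactly what a hole-aware program like dump notices and a through-the-filesystem reader like cpio does not.

```python
import os
import tempfile

# Create a sparse file: seek far past EOF, then write one byte.
# The skipped range is a "hole" -- no disk blocks are allocated for it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(10 * 1024 * 1024)   # leave a 10 MB hole
    f.write(b"x")
    path = f.name

st = os.stat(path)
logical = st.st_size             # what read() through the filesystem sees
physical = st.st_blocks * 512    # what is actually allocated on disk

print(logical)                   # 10485761
print(physical < logical)        # expected True where holes are supported

os.unlink(path)
```

A backup program that only read()s the file gets 10 MB of NULs and, on restore, writes 10 MB of real blocks, which is the "dbm files suddenly explode" effect Richard describes.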
peter@ficc.ferranti.com (Peter da Silva) (05/25/91)
In article <1991May25.003146.13982@ingres.Ingres.COM> rca@Ingres.COM (Bob Arnold) writes:
> In article <KJIBZ8B@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
> >What I don't understand is why people are still using "dump" to do backups?
> >A pretty minimal script using "find -newer level-file" and "cpio" works just
> >fine on active file systems.

> Is this a serious question?

Yes.

> If it is related to the discussion about potential problems with dump,
> find | cpio is vulnerable to sync problems too, because the find is
> running ahead of (and much faster than) the cpio.

Yes, but cpio doesn't produce a bad archive when it gets out of sync.

> 1) dump will not traverse filesystem/NFS boundaries. So just how am I
>    supposed to back up the root filesystem with cpio?

We don't back up the root file system. We back up /sys, /etc and /net, so we
retain config files, but otherwise if the root file system gets blown away we
use it as an opportunity to copy in a clean one. We work hard on keeping all
our systems looking the same, and as a result backing up root is a waste of
time.

> 2) cpio's user interface is far inferior to dump/restore for both backup
>    and (especially) file retrieval.

That's why we use pax.

> 3) dump provides services that cpio doesn't:
>    a) tracking of multi-level backups

Trivial.

>    b) lists of files that are supposed to be on the tape

Redirect the output of cpio to a file.

>    c) access to devices on remote hosts

We're on a network that provides transparent UNIX file system semantics on
remote hosts, and another thing that puzzles us terribly is why people put up
with junk like NFS and RFS.

> 5) various vendor weirdnesses with cpio (these are more braindamaged
>    than most cpio's):

That's why we use pax.

And another advantage to cpio is that it sticks to the traditional UNIX tools
approach. Why use one big integrated program when you get so much more
flexibility from a script built out of tools?
--
Peter da Silva; Ferranti International Controls Corporation; +1 713 274 5180;
Sugar Land, TX 77487-5012; `-_-' "Have you hugged your wolf, today?"
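For what it's worth, the "level file" scheme behind Peter's "Trivial" is roughly this. A sketch of my own (not Peter's actual script, and in Python where his would be find(1) piped to cpio): a zero-length timestamp file records when the last backup at a given level ran, the next incremental selects everything modified since, and the level file is touched afterwards.

```python
import os
import tempfile
import time

def files_newer_than(root, level_file):
    """Select regular files modified since the level file -- the same
    test that 'find root -newer level_file' applies."""
    cutoff = os.path.getmtime(level_file)
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > cutoff:
                selected.append(path)
    return selected

# --- demo ---
top = tempfile.mkdtemp()
level = os.path.join(top, "level.0")
open(level, "w").close()          # records the time of the last backup
time.sleep(1.1)                   # make the next mtime distinguishable

root = os.path.join(top, "data")
os.mkdir(root)
with open(os.path.join(root, "new.txt"), "w") as f:
    f.write("changed since the last backup\n")

result = files_newer_than(root, level)
print(result)                     # only .../data/new.txt is selected
open(level, "w").close()          # "touch" the level file afterwards
```

Note that this keys on mtime, which is exactly the weakness Les Mikesell raises elsewhere in the thread: a chmod or chown changes only the ctime, so an mtime-based incremental silently misses it.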
les@chinet.chi.il.us (Leslie Mikesell) (05/26/91)
In article <1991May24.013214.2526@servalan.uucp> rmtodd@servalan.uucp (Richard Todd) writes:
>1. "dump" preserves the access times on files, and "restore" restores the
>files with the access times set correctly. "cpio" neither records the access
>times in its archive nor leaves the access times of the files on disk
>unaffected. Thus, "cpio" screws up any schemes one may have for locating
>user files that haven't been accessed in, say, 6 months and automatically
>moving them off to tape and deleting them.

Most cpio's have the -a option to "fix" the disk access time to be the same
as it was before cpio read the file. The problem is that if you use it, it
changes the ctime in the process of fixing the atime. So if you use ctime to
test for the files that need to be put on an incremental backup (and you
should...), they all appear to be new. The only solution I've found is to
back up across a read-only mount, which isn't too difficult if you are going
over a network anyway.

>2. "dump" handles files with holes in them correctly (the holes don't take
>up space on the backup, and "restore" restores the files with holes correctly).
>"cpio" doesn't. Having all your dbm files suddenly explode in disk usage
>after having been brought back off of tape is considered bad form in some
>circles...

Afio (a cpio work-alike) will do this by seeking over blocks of nulls during
the restore.

>3. Just how were you planning to do restores of those incremental backups?
>Seems to me that the naive approach (extracting the incremental cpio just
>like the full cpio backups) won't work correctly on directories which have
>had files deleted between the making of the full and the incremental
>backup.

GNU tar is the only thing I've seen that gets this right (running AT&T SysV,
I haven't used dump/restore). It (optionally) makes an entry containing all
the current contents of the directories as it goes by and can (optionally)
delete files that aren't supposed to be there as you restore.

Les Mikesell
  les@chinet.chi.il.us
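Les's atime/ctime trap is easy to demonstrate. A small sketch of my own, in Python rather than C, but resting on the same utime(2)/stat(2) calls that cpio -a uses: putting the access time back moves the inode-change time forward, so a ctime-based incremental would sweep the file up again as "new".

```python
import os
import tempfile
import time

fd, path = tempfile.mkstemp()
os.write(fd, b"payload")
os.close(fd)

before = os.stat(path)
time.sleep(1.1)                  # let the clock move on

with open(path, "rb") as f:      # a backup program reading the file
    f.read()

# "Fix" the atime the way cpio -a does, via utime(2)/utimensat(2):
os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))

after = os.stat(path)
print(after.st_atime_ns == before.st_atime_ns)  # True: atime is restored...
print(after.st_ctime > before.st_ctime)         # True: ...but ctime was bumped

os.unlink(path)
```

There is no call that restores the atime without touching the ctime, which is why Les falls back to a read-only mount and why dump, which reads the raw device, sidesteps the whole problem.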
les@chinet.chi.il.us (Leslie Mikesell) (05/26/91)
In article <W3KBZID@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>Yes, but cpio doesn't produce a bad archive when it gets out of sync.

It doesn't die if files have been deleted, but some very nasty things happen
if a file is truncated between the time cpio generates its header and when it
actually reads the file. I think most versions even get confused if the file
grows in this interval (i.e. they put the data to EOF in the archive even if
it is not consistent with the header length field).

>We're on a network that provides transparent UNIX file system semantics
>on remote hosts, and another thing that puzzles us terribly is why people
>put up with junk like NFS and RFS.

What's wrong with RFS, other than killing all the processes that were using
it when a link goes down?

Les Mikesell
  les@chinet.chi.il.us
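The truncation race Les describes can be simulated without cpio at all. A sketch of my own: an archiver stats the file, commits the size to its header, and only then reads the data; if another process truncates the file in between, the member body is shorter than the header claims, and everything after it in the archive is misaligned.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 8192)
os.close(fd)

# Step 1: the archiver writes its header from stat(2)...
header_size = os.stat(path).st_size

# Step 2: ...meanwhile another process truncates the file...
os.truncate(path, 0)

# Step 3: ...and only now does the archiver read the data body.
with open(path, "rb") as f:
    data = f.read()

print(header_size)   # 8192 -- the length the header promised
print(len(data))     # 0 -- the body that actually follows it
os.unlink(path)
```

A growing file is the mirror image: the body runs past the promised length, and a reader that trusts the header loses sync with the member boundaries.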
terry@jgaltstl.UUCP (terry linhardt) (05/27/91)
In article <1991May24.013214.2526@servalan.uucp>, rmtodd@servalan.uucp (Richard Todd) writes:
> 1. "dump" preserves the access times on files, and "restore" restores the
> files with the access times set correctly. "cpio" neither records the access
> times in its archive nor leaves the access times of the files on disk
> unaffected. Thus, "cpio" screws up any schemes one may have for locating
> user files that haven't been accessed in, say, 6 months and automatically
> moving them off to tape and deleting them.

This statement is not necessarily correct. If you use cpio with the -a
option, access times are *not* reset.
--
|---------------------------------------------------------------------|
| Terry Linhardt        The Lafayette Group       uunet!jgaltstl!terry |
|---------------------------------------------------------------------|
acsiv@menudo.uh.edu (Duck @ U of Houston) (05/27/91)
In article <W3KBZID@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>
>That's why we use pax.
>
>--
>Peter da Silva; Ferranti International Controls Corporation; +1 713 274 5180;
>Sugar Land, TX 77487-5012; `-_-' "Have you hugged your wolf, today?"

I'll bite... what's pax?

Don
--
Donald M. Harper (713) 749-6283              University of Houston
Academic User Services, User Services Specialist II
Cannata Research Computing Center, Operator
Internet : DHarper@uh.edu | Bitnet : DHarper@UHOU | THEnet : UHOU::DHARPER
dan@gacvx2.gac.edu (05/27/91)
In article <1991May26.180717.22463@menudo.uh.edu>, acsiv@menudo.uh.edu (Duck @ U of Houston) writes:
> In article <W3KBZID@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>>
>>That's why we use pax.
>
> I'll bite...what's pax?

archie> whatis pax
pax     Reads and writes tar(1) and "cpio" formats, both traditional
        and IEEE 1003.1 (POSIX) extended.  Handles multi-volume
        archives and automatically determines format while reading.
        Has tar(1), "cpio", and "pax" interfaces.  "pax" interface
        is based on IEEE 1003.2 Draft 7
--
Dan Boehlke                    Internet: dan@gac.edu
Campus Network Manager         BITNET:   dan@gacvax1.bitnet
Gustavus Adolphus College
St. Peter, MN 56082 USA        Phone:    (507)933-7596
adeboer@gjetor.geac.COM (Anthony DeBoer) (05/27/91)
In article <1991May24.013214.2526@servalan.uucp> rmtodd@servalan.uucp (Richard Todd) writes:
>1. "dump" preserves the access times on files, and "restore" restores the
>files with the access times set correctly. "cpio" neither records the access
>times in its archive nor leaves the access times of the files on disk
>unaffected. Thus, "cpio" screws up any schemes one may have for locating
>user files that haven't been accessed in, say, 6 months and automatically
>moving them off to tape and deleting them.

If your cpio implements the "a" parameter, you can do a backup without
affecting the access times of the files on the disk:

	# find . -print | cpio -ovBca > /dev/rmt0

Of course, if you ever have to restore these files, the access time would get
munged at that point.
--
Anthony DeBoer NAUI#Z8800 | adeboer@gjetor.geac.com   | Programmer (n): One who
Geac Canada Ltd.          | uunet!geac!gjetor!adeboer | makes the lies the
Toronto, Ontario          | #include <disclaimer.h>   | salesman told come true.
torek@elf.ee.lbl.gov (Chris Torek) (05/28/91)
[NB: I am assuming you do incremental backups, not just full-system backups.
If you have a few dozen gigabytes of disk, you almost certainly rely on
incremental backups.]

Cpio, like any utility that works through the file system, is not well suited
as an `exact backup' program for most if not all Unix systems. (There is a
`trick' to get around this, but it typically does not work well anyway.) Here
is why:

In article <1991May27.132333.26592@gjetor.geac.COM> adeboer@gjetor.geac.COM (Anthony DeBoer) writes:
>If your cpio implements the "a" parameter, you can do a backup without
>affecting the access times of the files on the disk:
>
># find . -print | cpio -ovBca > /dev/rmt0

Reading a file through the file system updates the file's access time (and
reading a directory updates the directory's access time). The only way to
change the time back, through the file system, is with the utime or utimes
call (which call to use depends on which Unix system; some support both---the
only real difference is that utimes uses more precise timestamps). Using
utime/utimes on a file will update the file's `ctime' (inode-change time),
since it changes the inode information.

Any exact-backup program must write out every file whose ctime is greater
than that of the last backup. Otherwise an important change, such as `file
1234 went from rw-r--r-- to rw-------', will not appear on the tape. It
cannot quit after writing just the inode information, since the utime(s)
system call(s) can be used to make a new file look older; when this is done,
only the ctime tells the truth: that the file needs to be backed up.

Thus, with a through-the-file-system backup program, you have your choice of
either clobbering access times (reading everything that is being backed up)
or making incrementals impossible.

Here is the trick: when operating on a read-only file system, read calls do
not update the access time. Thus, if you can unmount the file system and
remount it read-only, you can use cpio or other `ordinary' utilities to make
a backup without affecting the inode timestamps. It does not work well
because you typically find that you cannot unmount the file system. If you
could, you could unmount it and run `dump' anyway. (You can remount
read-only; dump does not care if file systems are mounted, only whether they
change `non-atomically' during the dump.)

If inode access times are unimportant to you, this argument collapses. We
happen to like them, and the fact that dump can write a gigabyte to an
Exabyte in just over 66 minutes (or 33 with the 500 KB/s model) does not
hurt either. Dump is currently limited by tape drive data rates; this is all
too often untrue for through-the-file-system operations.
--
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA    Domain: torek@ee.lbl.gov
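Chris's observation that utime(2) can make a new file look older than the last backup, while only the ctime tells the truth, is easy to check. A sketch of my own in Python (the same stat fields he is describing): backdate the mtime to 1970-ish and the ctime still reads "just now", which is why an exact-backup program must key on ctime and why it then cannot also quietly restore atimes.

```python
import os
import tempfile
import time

fd, path = tempfile.mkstemp()
os.write(fd, b"brand new file")
os.close(fd)

created = time.time()
ancient = 1000.0                    # ~17 minutes into 1970

# Make the brand-new file *look* old, as utime(2) permits:
os.utime(path, (ancient, ancient))

st = os.stat(path)
print(st.st_mtime)                  # 1000.0 -- the mtime happily lies
print(st.st_ctime >= created - 1)   # True -- the ctime still says "now"

os.unlink(path)
```

An mtime-based incremental (find -newer) would skip this file entirely; a ctime-based one catches it, but, per Les's earlier post, is then poisoned by cpio -a's own utime calls.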
khushro@caen.engin.umich.edu (Khushro Shahookar) (05/29/91)
We have had so many responses to "Re: SUMMARY: Backup while in multi-user
mode". Could a kind soul please post a summary of the summary, listing all
possible holes in dump, tar, cpio, whatever...

-KHUSHRO SHAHOOKAR
khushro@eecs.umich.edu
So much news, so little time ...
bill@unixland.uucp (Bill Heiser) (05/30/91)
In article <EDJB+TC@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>
>Your CPIO might have all those flaws. Ours doesn't. Ever hear of a program
>by the name of "pax"?

Does pax have the problem that cpio has, where if it encounters a file on an
NFS-mounted partition that it doesn't have read permission on, it causes the
cpio to fail (rather than just skipping the file, with maybe a warning)?

I'm leery of something non-standard like pax. What happens 5 years from now
when we need to restore something from a tape created on a Sun/386i (for
example) to a Sparc-L? :-)
--
bill@unixland.natick.ma.us      The Think_Tank BBS & Public Access Unix
...!uunet!think!unixland!bill   bill@unixland
..!{uunet,bloom-beacon,esegue}!world!unixland!bill
508-655-3848 (2400)  508-651-8723 (9600-HST)  508-651-8733 (9600-PEP-V32)
karish@mindcraft.com (Chuck Karish) (05/30/91)
In article <1991May30.002422.14775@unixland.uucp> bill@unixland.uucp (Bill Heiser) writes:
>In article <EDJB+TC@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>>
>>Your CPIO might have all those flaws. Ours doesn't. Ever hear of a program
>>by the name of "pax"?
>
>I'm leery of something non-standard like PAX. What happens 5 years from
>now when we need to restore something from a tape created on a Sun/386i
>(for example) to a Sparc-L? :-)

Non-standard? pax was written specifically to support the tar and cpio
formats immortalized by the POSIX.1 standard. Its behavior is specified by
the draft POSIX.2 standard. It knows how to read traditional tar and cpio
formats.

Five years from now you'll be able to read pax archives by using pax.
--
Chuck Karish     karish@mindcraft.com
Mindcraft, Inc.  (415) 323-9000
mills@ccu.umanitoba.ca (Gary Mills) (05/30/91)
In <1991May30.002422.14775@unixland.uucp> bill@unixland.uucp (Bill Heiser) writes:
>In article <EDJB+TC@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>>
>>Your CPIO might have all those flaws. Ours doesn't. Ever hear of a program
>>by the name of "pax"?

>Does pax have the problem that CPIO has, where if it encounters a file

If you have SunOS 4.1.1, check out /usr/5bin/paxcpio.
--
-Gary Mills-    -Networking Group-    -U of M Computer Services-
bill@unixland.uucp (Bill Heiser) (05/31/91)
In article <675574289.1712@mindcraft.com> karish@mindcraft.com (Chuck Karish) writes:
>
>Non-standard? pax was written specifically to support the tar and
>cpio formats immortalized by the POSIX.1 standard. Its behavior is
>specified by the draft POSIX.2 standard. It knows how to read
>traditional tar and cpio formats.
>
>Five years from now you'll be able to read pax archives by using
>pax.

OK -- my knowledge of pax is obviously very limited. Basically, I know it
exists ... My point is that I may want to read tapes later on a machine that
doesn't happen to have pax available. Will ordinary tars and cpios read tapes
written by pax?
--
bill@unixland.natick.ma.us      The Think_Tank BBS & Public Access Unix
...!uunet!think!unixland!bill   bill@unixland
..!{uunet,bloom-beacon,esegue}!world!unixland!bill
508-655-3848 (2400)  508-651-8723 (9600-HST)  508-651-8733 (9600-PEP-V32)
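The format relationship Bill is asking about can be checked mechanically. A sketch of my own using Python's tarfile module (obviously not a tool from 1991; it is here purely to illustrate the on-tape layout): a POSIX pax-format archive is still an ordinary tar stream, with the "ustar" magic in every header block, because pax extended headers ride inside regular tar members. A plain tar reader can therefore pull the files out, though it may surface the extended-header members as odd extra files.

```python
import io
import tarfile

# Write a one-file archive in POSIX pax format.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tf:
    data = b"hello from a pax archive\n"
    info = tarfile.TarInfo("hello.txt")
    info.size = len(data)
    tf.addfile(info, io.BytesIO(data))

raw = buf.getvalue()
# Each 512-byte header block carries the ustar magic at offset 257,
# which is what an old tar reader keys on.
print(raw[257:262])        # b'ustar'

# Read it back; the archive member list is the expected one.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tf:
    names = tf.getnames()
print(names)               # ['hello.txt']
```

cpio is a different story: its format shares nothing with tar, so a pax archive written in cpio format is readable only by a cpio-speaking reader, and vice versa.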