FFAAC09@cc1.kuleuven.ac.be (Nicole Delbecque & Paul Bijnens) (03/12/91)
You wrote:
> I am planning to use the command
>
>find /users | cpio ....

I use a variation of "find ... | cpio ..." every day here to make our backup. We make the backup in multi-user mode.

You should make sure your cpio can handle two things correctly: a file that grows and a file that shrinks while it is being read. When cpio reads the next file to back up from its standard input, it does an open() and then an fstat(). The size information from that fstat() is written to the tape together with the other i-node info.

While cpio reads a file that another process has open, three things can happen:

1. Most of the time, the file grows. A good implementation of cpio should not read until EOF, but should write only as many data bytes as indicated by the header that was already written to the tape.

2. The file shrinks. This can be the result of a chsize() system call, if it is implemented, or of a truncation followed by some write()s. To be consistent in this case, cpio should write as many zero bytes to the tape as are needed to reach the file size in the header.

3. Some parts in the middle of the file are changed (opened for update, e.g. "r+", with read, seek and write in the program). Cpio then just writes part of the old data, followed by updated data.

In any of the above cases, cpio retains internal consistency. I mean: the data on the tape can be read again with cpio without "getting out of sync" or other trouble.

The restored data can still be damaged, though. For the same three cases:

1. You just backed up the old data. The new data is not in this backup. However, the next day it will be backed up...

2. Restoring the file may result in a huge file with the last part all nulls. A waste of disk space, but easy to fix with a 10-line C program (or one line of perl).

3. If the data in the file has to be in an internally consistent state (e.g. a database file), then you have trouble with this file. Restoring it can give you a useless piece of junk.

On our system we do the backup in the evening or morning, when there aren't many people logged on. We have never had any problems like the above.
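To make cases 1 and 2 concrete, here is a minimal sketch in Python of the behaviour described above. It is not taken from any real cpio source; the names copy_exact and strip_trailing_nulls are invented for illustration. The first function writes exactly as many bytes as the header promised, and the second is the kind of small clean-up tool mentioned for a restored file that ends in padding.

```python
import io


def copy_exact(src, dst, header_size, chunk=8192):
    """Copy exactly header_size bytes from src to dst (hypothetical helper).

    If src has grown, the extra bytes are ignored.
    If src has shrunk, the shortfall is padded with zero bytes.
    """
    remaining = header_size
    while remaining > 0:
        data = src.read(min(chunk, remaining))
        if not data:
            # Source hit EOF early: the file shrank after its size was recorded.
            break
        dst.write(data)
        remaining -= len(data)
    # Pad so the archive member is as long as its header says.
    while remaining > 0:
        n = min(chunk, remaining)
        dst.write(b"\0" * n)
        remaining -= n


def strip_trailing_nulls(data):
    """Remove the zero padding from the end of restored file contents."""
    return data.rstrip(b"\0")
```

Note that strip_trailing_nulls cannot tell padding from zero bytes that really belonged at the end of the file, so it should only be used on files known not to end in nulls (text files, for instance).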
However, I did manage to set up an experimental situation to fool cpio, so I could verify that our cpio maintains its internal consistency (we have no source to look at).

I decided to do our backup in multi-user mode because:

a. Our cpio maintains its internal consistency.

b. None of our files are updated constantly. (Our database program has its own backup facility.)

c. Having some "damaged" file on tape one day is not a problem, because the next day it will be backed up again with the next incremental backup. (Make sure to set the time-stamp of your backup to the BEGINNING of the operation: that way, files modified during the backup are newer than the time-stamp and get picked up by the next run.)

bye
--
Polleke (Paul Bijnens)
Linguistics dept., K. University Leuven, Belgium
FFAAC09@cc1.kuleuven.ac.be
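The time-stamp advice in point c of the post above can be sketched as follows. This is only an illustration in Python, under the assumption that the backup script keeps the time-stamp of the previous run somewhere; the function name select_changed is hypothetical. The important detail is that the new time-stamp is taken before the scan starts, not after the backup finishes.

```python
import os
import time


def select_changed(root, last_stamp):
    """Return (new_stamp, paths) for an incremental backup (hypothetical helper).

    new_stamp is taken BEFORE the scan, so a file modified while the
    backup runs is newer than new_stamp and is selected again next time.
    """
    new_stamp = time.time()
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file disappeared between listing and lstat
            if mtime >= last_stamp:
                changed.append(path)
    return new_stamp, changed
```

The list of paths would then be fed to the archiver, and new_stamp stored for the next run. A file that changes during the backup may be saved twice, which is harmless; taking the stamp at the end instead could cause such a file to be skipped.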
FLYNN%EVALUN11.BITNET@cunyvm.cuny.edu (Mark F. Flynn) (03/13/91)
I'm setting up to do backups of the /users directory on a regular basis. The problem is the following.

The system is an HP Apollo 400t running HP-UX (probably not important), used for developing and running biggish numerical applications. During the day, I expect to have lots of access to the source code (which is the really important bit), and during the night, lots of CPU- (and possibly I/O-) hungry background processes. Telling people not to run jobs on a certain day or whatever is not acceptable.

What I want to know is whether there will be any problems with backing up files which may be open for either input or output. I am planning to use the command

find /users | cpio ....

Any ideas?

Mark Flynn
FLYNN@EVALUN11.(EARN BITNET)
Departamto de Fisica Atomica y Nuclear
Universidad de Valencia
Spain