pearmana@prlhp1.prl.philips.co.uk (Andy Pearman) (09/11/90)
I've been thinking about how our dumping works on our various Unix-based
machines, and I'm now a bit worried about our HP systems.

On our Suns we use dump/restore, which works on inodes. This means that
a file with several hard links will only be dumped once, so there is no
problem when restoring.

On HP-UX (6.21) we use their backup script, which effectively does:

    cd /
    find . -hidden -print | cpio -ocxa | tcio ......

Am I right in thinking that find works by looking in directories
and -print's everything it finds, which of course means that
several directory entries hard-linked to the same file will be
picked up individually and passed to cpio for dumping?

When performing a restore using cpio I assume that each file read
is allocated a fresh inode, and therefore what may have been
one file hard-linked several times will be restored as several
individual files (taking up more disk space than originally used).

I would be grateful if someone could clear this up for me.

Andy
--
Andy Pearman, Computer Dept, Philips Research Labs, Redhill, Surrey, England.
pearmana@prl.philips.co.uk
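The first half of the question can be checked directly. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the file names are hypothetical and it uses plain find rather than the HP-UX -hidden option. It creates one file with three names and compares how many names find prints with how many distinct inodes lie behind them.

```shell
# Scratch directory; all file names below are hypothetical.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

echo "some data" > original
ln original alias1          # second name for the same inode
ln original alias2          # third name for the same inode

# find walks directory entries, so every name is printed.
name_count=$(find . -type f -print | wc -l | tr -d ' ')

# ls -i shows the inode number in front of each name.
inode_count=$(ls -i original alias1 alias2 | awk '{print $1}' | sort -u | wc -l | tr -d ' ')

echo "names printed by find: $name_count, distinct inodes: $inode_count"
```

Whatever reads that name list (cpio in the backup script) therefore sees one pathname per link, not one per inode.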
stroyan@hpfcso.HP.COM (Mike Stroyan) (09/13/90)
> When performing a restore using cpio I assume that each file read
> is allocated a fresh inode and therefore what may have been
> one file hard-linked several times will be restored as several
> individual files (taking up more disk-space than originally used).
>
> I would be grateful if someone could clear this up for me.

Actually, cpio makes a list of inodes and links those files with
duplicated inodes when performing a restore.

Mike Stroyan, mike_stroyan@fc.hp.com
fritz@hpfcbig.SDE.HP.COM (Gary Fritz) (09/13/90)
> Am I right in thinking that find works by looking in directories
> and -print's everything it finds, which of course means that
> several directory entries hard-linked to the same file will be
> picked up individually and passed to cpio for dumping ?

Yes, and this means linked files will be written to the tape multiple
times. (Try creating a test directory containing two files linked
together, then do a "find . -print | cpio -ocxa > /tmp/test" and examine
/tmp/test. You'll see the file is written out twice.)

However, when restoring the cpio archive, cpio will look for linked
files and create links appropriately. (Try "cpio -icv < /tmp/test".)
It does this by maintaining a list of linked files, though, and can
therefore run out of space. The cpio(1) man page sez:

    ... If there are too many unique linked files, the program runs
    out of memory to keep track of them, and thereafter, linking
    information is lost.

If you prefer to use dump/restore, you can. At least, it's present on
my 7.03 HP-UX system. I'm not certain it was on 6.21.

Gary

Not an official statement of HP, etc. etc. Just trying to be helpful.
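Since the man page warning is about the number of unique linked files, it can be useful to count them before relying on a cpio restore. This is a rough sketch, assuming a find that supports the POSIX -links primary; the scratch files are hypothetical, and the sketch says nothing about where any particular cpio's limit actually lies.

```shell
# Scratch directory with two multiply-linked files and one ordinary file.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

echo one   > f1; ln f1 f1.link
echo two   > f2; ln f2 f2.link; ln f2 f2.link2
echo three > single

# Names whose link count is greater than one.
linked_names=$(find . -type f -links +1 -print | wc -l | tr -d ' ')

# Distinct inodes behind those names: the "unique linked files"
# a restore would have to remember.
linked_inodes=$(find . -type f -links +1 -exec ls -i {} \; |
    awk '{print $1}' | sort -u | wc -l | tr -d ' ')

echo "linked names: $linked_names, unique linked files: $linked_inodes"
```

On a real system the same two find commands would be run from the top of the file system being backed up instead of a scratch directory.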
rer@hpfcdc.HP.COM (Rob Robason) (09/13/90)
cpio does get the names of the various paths to the links as you
suggest, but is smart enough to detect the linkage and record it on the
archive. When the archive is read, the links are reestablished.

I think the file does actually get archived for each link, even though
the link is recognized. This is so individual files can be retrieved.

Rob
djw@hpldsla.sid.hp.com (David Williams) (09/14/90)
I hope this makes sense, I've re-written it a couple of times, but
there are just too many "words"....

> On HP-UX (6.21) we use their backup script which effectively does:
>
> cd /
> find . -hidden -print | cpio -ocxa | tcio ......
>
> Am I right in thinking that find works by looking in directories
> and -print's everything it finds, which of course means that
> several directory entries hard-linked to the same file will be
> picked up individually and passed to cpio for dumping ?

That's pretty much the way that it works. Note though, that cpio saves
the file's inode number (and the device number of the file's file
system) and the number of links in the archive. This leads to...

> When performing a restore using cpio I assume that each file read
> is allocated a fresh inode and therefore what may have been
> one file hard-linked several times will be restored as several
> individual files (taking up more disk-space than originally used).

Not quite. When doing the restore, cpio(1) tracks files in the archive
that have a link count greater than one. Cpio -i saves the pathname of
the first file loaded, along with the inode/device number recorded for
it in the archive. If another file in the archive has the same
inode/device number, it is linked to this first file. Simple, eh?

So only the first 'file' in an archive allocates a new inode on the
target file system; all the others are linked to it, as desired.

Note, back on the dump (cpio -o) side of things, each of the N links to
a file is archived as a complete file. I guess this means tapes get
filled up more than needed, but it means on restore you don't have to
start with the first tape to get the "real" file. This is a different
strategy from that used by some other tools, which just use pointers
for all links after the first, sometimes resulting in a "go find tape
number N if you want file 'blah'" type of message.

Ftio(1) uses a similar strategy to cpio for storage of the links.
Tar(1) (and I think fbackup(1)) use the pointer strategy for saving
every link after the first.

Hope that helps,

David Williams
Hewlett-Packard Scientific Instruments Division (SID)
1601 California Ave, Palo Alto, CA, USA.
phone: 415 857 6100. FAX: 415 852 8011
HP-UX Mail: djw@hpldsla.hp.com
HPdesk: (djw)hpldsla/HP1900/00
<usual disclaimer>
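The restore-side bookkeeping described in this post can be imitated with ordinary shell tools. What follows is only a sketch of the idea, not cpio's actual code: the manifest file stands in for the archive headers, all names are hypothetical, and it keys on the inode number alone because everything lives on one file system (the post says cpio also records the device number).

```shell
# Scratch tree; every name here is hypothetical.
tmp=$(mktemp -d)
mkdir "$tmp/src" "$tmp/dst"
echo "linked data" > "$tmp/src/a"
ln "$tmp/src/a" "$tmp/src/b"        # b is a second name for a's inode
echo "other data" > "$tmp/src/c"

# Stand-in for the archive headers: "inode name", one line per entry.
(cd "$tmp/src" && ls -i a b c | awk '{print $1, $2}') > "$tmp/manifest"

# "Restore": copy the first name seen for an inode, link any later ones.
seen=""
while read -r ino name; do
    # Look up the first restored name recorded for this inode, if any.
    first=$(printf '%s\n' "$seen" | sed -n "s/^$ino //p")
    if [ -n "$first" ]; then
        ln "$tmp/dst/$first" "$tmp/dst/$name"
    else
        cp "$tmp/src/$name" "$tmp/dst/$name"
        seen="$seen
$ino $name"
    fi
done < "$tmp/manifest"

ls -li "$tmp/dst"
```

The seen list here grows by one line per distinct inode, which is the same kind of table that the cpio(1) man page warns can exhaust memory when there are too many unique linked files.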
seligman@CS.Stanford.EDU (Scott Seligman) (09/15/90)
In article <1150@prlhp1.prl.philips.co.uk> pearmana@prlhp1.prl.philips.co.uk (Andy Pearman) writes:
>
> On HP-UX (6.21) we use their backup script which effectively does:
>
> cd /
> find . -hidden -print | cpio -ocxa | tcio ......

Could someone please explain why HP chooses to use this incantation?
It appears to make a list of all file names, and then individually
seek each and every one (!) before writing them to the backup medium.
Why not use the more direct dump(1M) and restore(1M)?

Scott Seligman
Internet: seligman@cs.stanford.edu
UUCP: ...{apple,decwrl,ucbvax}!cs.stanford.edu!seligman
jad@hpcndnm.hp-sdd (John A Dilley) (09/18/90)
In article <1990Sep15.061919.3988@Neon.Stanford.EDU> seligman@CS.Stanford.EDU (Scott Seligman) writes:
> Could someone please explain why HP chooses to use this incantation?
> It appears to make a list of all file names, and then individually
> seek each and every one (!) before writing them to the backup medium.
> Why not use the more direct dump(1M) and restore(1M)?

In HP-UX 6.0/1.1 we support dump/restore(1M). In 7.0 we also
support rdump/rrestore(1M), so you should be able to choose your
favorite dump method and go for it. I believe the 7.0 rdump/rrestore
can dump from an HP system to a BSD-based DEC VAX or Sun system running
(or able to run) /etc/rmt, and vice versa.

-- jad --
John DILLEY
Hewlett-Packard
Colorado Networks Division
UX-mail: jad@cnd.hp.com
Phone: (303) 229-2787
--
This is not an official statement of Hewlett-Packard Corp., and does not
necessarily reflect the views of HP. The information above is provided
completely without warranty of any kind.
-- jad --