rcodi@yabbie.rmit.oz (Ian Donaldson) (08/02/88)
Description:

cpio sometimes silently refuses to link destination files back together. This happens with the -p and -i options, and the -o option can generate an incorrect archive.

At the end of this report is a list of other problems with cpio, some major, some minor.

Versions:

The cpio supplied with SunOS 3.5 exhibits the fault. The cpio supplied with the SVR2 Vax distribution also exhibits the fault.

Repeat by:

On BSD fast-filesystems, this is easy; go to a filesystem that has more than 32768 inodes (practically any large one), and do this:

    rm -fr a b c
    cat </dev/null > a
    ln a b
    mkdir c
    ls a b | cpio -pdmv c
    ls -lg c

This -may- fail if "a" has an inode number > 32767. What will happen is that in the directory "c" you end up with two different files "a" and "b" that are not linked together, but have the same contents. cpio also doesn't indicate that it linked them (because it didn't).

On a System-V filesystem, you need to find a filesystem that has more than 32767 inodes currently allocated (not as common), and do the same thing.

This fails more often on BSD because of the way the inodes are allocated; higher i-numbers are commonplace. On the System-V filesystem, inodes are allocated at the lower end of the scale.

Fix:

Not trivial. Best fix is to "rm /usr/bin/cpio" :-)

A partial fix that will work with high probability is to change the type of "m_ino" from "short" to "ino_t" in struct "ml", routine postml(). This allows inodes up to number 65535 to be handled 100%. Above that you will get unpredictable failures: two multiply-linked inodes whose numbers have the same lower 16 bits end up being linked when they shouldn't be. (I strongly suspect that SVR3 cpio has the above partial fix)

A proper fix is more difficult because the cpio(5) format only has a 16-bit field for the inode-number, whereas BSD systems (at least) have 32-bit inode numbers.
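The failure mode and the limit of the partial fix can be sketched in a few lines of C. This is not cpio source: the function names are made up for illustration, and the sketch assumes that the header's inode field is an unsigned 16-bit quantity, that "int" is wider than 16 bits, and that the compiler wraps on the narrowing cast (as common two's-complement compilers do).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the link-table lookup with the inode remembered
 * in a signed 16-bit field ("short m_ino"). hdr_ino stands for the
 * unsigned 16-bit inode field of the header. The narrowing cast is
 * implementation-defined in C; on common compilers it wraps negative. */
int match_short(uint16_t hdr_ino)
{
    int16_t m_ino = (int16_t)hdr_ino;
    /* Both operands promote to int: for 40000 this is -25536 vs 40000. */
    return m_ino == hdr_ino;
}

/* The same lookup with an unsigned field of matching width. */
int match_wide(uint16_t hdr_ino)
{
    uint16_t m_ino = hdr_ino;
    return m_ino == hdr_ino;
}

/* Limit of the partial fix: two distinct 32-bit inode numbers look
 * identical once only their lower 16 bits are kept. */
int false_link(uint32_t a, uint32_t b)
{
    return a != b && (uint16_t)a == (uint16_t)b;
}

/* Small usage example exercising the three cases. */
void self_check(void)
{
    assert(match_short(100));        /* small inode: link is found      */
    assert(!match_short(40000));     /* inode > 32767: link is missed   */
    assert(match_wide(40000));       /* wider/unsigned field: found     */
    assert(false_link(70000, 4464)); /* 70000 - 65536 == 4464: collide  */
    assert(!false_link(70000, 4465));
}
```

The second case is the silent "not linked" behaviour described above; the fourth is the spurious link that remains possible after the partial fix.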
My workaround is to use the cpio(5) inode-field only when a file has multiple links (i.e. st_nlink > 1), and to generate a unique number bearing no relationship to the original inode number. If the file has a single link or is a directory, the cpio(5) inode number field would be zero. Since cpio only uses the inode-number field when nlinks > 1, this should not pose a problem, and gives a maximum of 65534 multiply-linked files per archive, a limit that probably won't be hit quickly. (if there is a cpio that uses the inode-number field all the time, then this would be broken, but I cannot see why it should do so) (I strongly suspect that SVR3 cpio doesn't have this fix, since the inode numbers in the archive seem to match the originals on singly-linked files)

The SVR2 cpio source is a complete disaster, and should be totally rewritten from scratch (if it hasn't already been done in SVR3). There are more bugs in it than you can poke a stick at!

Other-bugs:

Among other things wrong with -this- cpio are: (I'm not necessarily speaking of the SunOS cpio here, but I know some of these bugs do apply to it)

- disk errors could easily result in an archive being corrupted, because the file-size is written in the header based on what stat(2) says, but the copy routine gives up copying the file when it comes across a bad read from disk, and doesn't pad the file out on the archive to its stated size. This means that the header for the file is incorrect, and upon reading it back cpio will get very confused.

- somebody truncating a file that is being dumped (eg: via creat() or truncate()) could result in a corrupt archive, making cpio absolutely useless for dumping live filesystems (I suspect this might have been fixed in SVR3)

- cpio will restore the modification time of a destination file during '-p' or '-i' commands, even if the file was not successfully restored.
  This makes it damn difficult to redo the restore/copy and find out which files weren't restored (eg: filesystem full, you end up with lotsa inodes of zero length with the original mod-dates!).

- having errno in an error message is absurd (this seems to have been eradicated in SVR3, and in SunOS 3.5)

- many arrays can be indexed out of bounds because of lack of bounds checking (eg: you supply more than 100 patterns on the command line, or a file name that exceeds 256 characters). Both of these situations could cause a coredump.

- a limitation on the number of reels of tape is imposed due to a bug: /dev/tty is reopened but not closed each time input is requested when changing reels. On many systems, this limit is around 16, if NOFILE=20.

- there are calls to utime(2) using the stat structure directly, which is incorrect on many (eg: BSD) systems, because the stat structure is laid out differently there. The result is the inode mtime being reset to the epoch if the -a option is used. This bug isn't present in SunOS 3.5 cpio.

- attempting to dump files with inode numbers > 32767 or NFS inodes will result in a corrupt archive if using -c, due to sign extension in bintochar() causing the octal numbers to be written wider than their fields. (NFS inodes have dev being negative; i-numbers > 32767 are negative). This bug isn't present in SunOS cpio.

- writing an archive with files owned by "nobody" (uid = -2) will create a corrupt archive when using -c. This bug IS present in SunOS 3.5 cpio.

- opening of /dev/tty isn't even checked for success, resulting in a probable coredump if you run it from cron and multi-reels are required.
  (SVR3 cpio has options to specify a substitute for /dev/tty anyway, which would allow it to run from cron ok, and I strongly suspect that this has been fixed in SunOS 3.5, by the looks of the error messages in the binary)

- restoring files that sit inside read-only directories in the archive is impossible unless you don't restore the directories. This is because the mode of the directory is set before the files are extracted. This bug IS present in SunOS 3.5 cpio. (could be tricky to fix, unless you keep a list of directories and set all the modes for them at the end of the run; or chmod the directories if you need to write into them. Tar typically has this problem too, when you use its 'p' option)

All in all, cpio is a total mess and should be totally rewritten or replaced. Beware if you use this as your only backup mechanism.

I would be interested if anybody with the "latest" cpio (SVR3?) could peek at the source and comment on which of the above bugs remain. (I don't have access to SVR3 sources)

Ian D
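
The renumbering workaround described under "Fix" (inode field zero for singly-linked files and directories, an unrelated unique number for each multiply-linked file) can be sketched as follows. This is only an illustration under stated assumptions, not the patch from the report: struct link_map, archive_ino() and MAX_LINKED are hypothetical names, the cap of 65534 simply mirrors the limit quoted in the report, and the linear search is chosen for brevity rather than speed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_LINKED 65534u   /* limit quoted in the report (assumed cap) */

/* One remembered multiply-linked file and the id written to the archive. */
struct link_entry { uint32_t dev; uint32_t ino; uint16_t id; };

/* Growable table; start from an all-zero struct. */
struct link_map { struct link_entry *v; size_t n, cap; };

/* Stores the value for the 16-bit archive inode field in *out.
 * Returns 0 on success, -1 if the table is full or memory runs out. */
int archive_ino(struct link_map *m, uint32_t dev, uint32_t ino,
                unsigned nlink, int is_dir, uint16_t *out)
{
    size_t i;

    /* Single links and directories never need link matching. */
    if (is_dir || nlink <= 1) {
        *out = 0;
        return 0;
    }

    /* Already seen: hand back the same id so the links pair up. */
    for (i = 0; i < m->n; i++) {
        if (m->v[i].dev == dev && m->v[i].ino == ino) {
            *out = m->v[i].id;
            return 0;
        }
    }

    if (m->n >= MAX_LINKED)
        return -1;

    if (m->n == m->cap) {
        size_t cap = m->cap ? m->cap * 2 : 64;
        struct link_entry *v = realloc(m->v, cap * sizeof *v);
        if (v == NULL)
            return -1;
        m->v = v;
        m->cap = cap;
    }

    /* New multiply-linked file: ids run 1, 2, 3, ... in order seen. */
    m->v[m->n].dev = dev;
    m->v[m->n].ino = ino;
    m->v[m->n].id = (uint16_t)(m->n + 1);
    *out = m->v[m->n].id;
    m->n++;
    return 0;
}
```

With this scheme, two inodes such as 70000 and 4464 on the same device, which share their lower 16 bits, get distinct ids, while a second name for inode 70000 gets the id assigned to the first.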
les@chinet.chi.il.us (Leslie Mikesell) (08/05/88)
In article <824@yabbie.rmit.oz> rcodi@yabbie.rmit.oz (Ian Donaldson) writes:
> At the end of this report is a list of other problems with cpio,
> some major, some minor.

Thanks for the warning. Have you looked at the "afio" program that was posted to the net a while back? Perhaps it would be more reliable, or at least easier to fix for those of us without source to cpio.

Les Mikesell