dce@mips.UUCP (David Elliott) (01/01/70)
>In article <596@quacky.UUCP> dce@quacky.UUCP (David Elliott) writes:
>-Now, anyone want to hear how you can change cpio to handle long
>-device numbers and/or long inode numbers without losing data or
>-even changing the magic number?

OK. Here goes.

I'll discuss the general case, in which both inode and device numbers are handled, instead of just one or the other (UTek cpio just handles long device numbers).

This idea plays on two properties of cpio:

  1. The code has to go ahead and read the data associated with all headers, including special files. The System V and BRL versions of cpio both do this, and it is probably wrong to ignore this field in any case.

  2. Device numbers are used for linking and creating special devices (character, block, and FIFO) only, so the device number pairs for two files in the same directory need not be the same.

What happens is that each device/inode pair is mapped into a unique "long" integer, which is stored as 2 shorts in the cpio device and inode number fields. Later, I'll give a neat method of mapping this stuff.

Additionally, special devices are given a special "magic cookie" hash value (I used 32767 in UTek, though any value will do), a data section big enough to contain the real device and inode numbers (I'd use a fixed field of 12 digits each), and the h_filesize value adjusted appropriately. It's easiest to always make the data section ASCII instead of worrying about matching it with the header type.

Extraction of special devices with the known "magic cookie" is done by reading the data section to obtain the "real" device and inode numbers. In the UTek version, I went ahead and made the header-reading routine read the data and adjust the h_filesize field back to 0, but any method for reading ahead will do.

When I did this for UTek, I used a naive system that kept a table of all of the device number pairs, and this took a lot of space.
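
To make the storage side of this concrete, here is a minimal C sketch of splitting a mapped value across two shorts and of the fixed-width ASCII data section for special devices. It is not the UTek code. The function names, the choice of which half goes in which header field, and the use of zero-padded decimal are all assumptions made for illustration; the post only specifies two 12-digit ASCII fields and the cookie value 32767.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SPECIAL_COOKIE   32767  /* value used in UTek; the post does not say which field holds it */
#define FIELD_DIGITS     12     /* fixed width suggested in the post */
#define SPECIAL_DATA_LEN (2 * FIELD_DIGITS)

/* Split a mapped id (must fit in 32 bits) across the two 16-bit header
   fields. High half in the device field is an assumption of this sketch. */
static void split_id(unsigned long id, unsigned short *h_dev, unsigned short *h_ino)
{
    assert(id <= 0xFFFFFFFFUL);
    *h_dev = (unsigned short)((id >> 16) & 0xFFFFu);
    *h_ino = (unsigned short)(id & 0xFFFFu);
}

/* Inverse of split_id: rebuild the mapped id from the two header fields. */
static unsigned long join_id(unsigned short h_dev, unsigned short h_ino)
{
    return ((unsigned long)h_dev << 16) | (unsigned long)h_ino;
}

/* Write the real device and inode numbers as two zero-padded 12-digit
   decimal fields. buf must hold SPECIAL_DATA_LEN + 1 bytes. Returns the
   data length (the value h_filesize would be set to), or -1 if a number
   does not fit in 12 digits. */
static int encode_special(char *buf, unsigned long dev, unsigned long ino)
{
    int n = snprintf(buf, SPECIAL_DATA_LEN + 1, "%012lu%012lu", dev, ino);
    return n == SPECIAL_DATA_LEN ? n : -1;
}

/* Read the two fields back from a NUL-terminated data section.
   Returns 0 on success, -1 if the section is short or malformed. */
static int decode_special(const char *buf, unsigned long *dev, unsigned long *ino)
{
    if (strlen(buf) < SPECIAL_DATA_LEN)
        return -1;
    return sscanf(buf, "%12lu%12lu", dev, ino) == 2 ? 0 : -1;
}
```

On extraction, a reader that sees the cookie would call something like decode_special() on the data section and then treat the file size as 0, as described above.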
After thinking about it again, I realized that there is a special property of the mapping: if the file has a link count of 1, it will never be seen again. Thus, the algorithm is:

    if (link count is 1) {
        increment "unique" counter
        return "unique" counter
    }
    if (device/inode is in table) {
        return the value for that entry
    } else {
        increment "unique" counter
        store device/inode in the table
        return "unique" counter
    }

In practice, it looks like you can start the unique counter at 1 and change the first statement to

    if (link count is 1) {
        return 0
    }

This works because cpio doesn't worry about making links if the file doesn't have more than 1 link. Thus, it will never know that a whole bunch of files "look" like they are linked. On the other hand, if someone were to change cpio to handle things a little differently, this could easily be broken.

What do people think about this?

--
David Elliott		{decvax,ucbvax,ihnp4}!decwrl!mips!dce
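
For anyone who wants to try the mapping, the first form of the algorithm in the post can be sketched in C roughly as follows. The struct and function names are hypothetical, and the linear search over a growable array is only a placeholder for whatever table a real cpio would use. Only files with more than one link ever occupy a table slot.

```c
#include <stdlib.h>

struct link_entry {
    unsigned long dev, ino;     /* real device and inode numbers */
    unsigned long id;           /* mapped "unique" value */
};

struct link_map {
    struct link_entry *entries; /* only multiply-linked files are stored */
    size_t count, capacity;
    unsigned long next_id;      /* the "unique" counter; last value handed out */
};

/* Map a device/inode pair to a unique id. Returns 0 on success and
   stores the id in *id_out, or -1 if the table cannot grow. */
static int map_id(struct link_map *m, unsigned long dev, unsigned long ino,
                  unsigned long nlink, unsigned long *id_out)
{
    size_t i;

    if (nlink == 1) {
        /* Never seen again, so no table entry is needed. The shortcut
           variant in the post would simply return 0 here. */
        *id_out = ++m->next_id;
        return 0;
    }

    for (i = 0; i < m->count; i++) {
        if (m->entries[i].dev == dev && m->entries[i].ino == ino) {
            *id_out = m->entries[i].id;     /* another link to a known file */
            return 0;
        }
    }

    if (m->count == m->capacity) {          /* grow the table when full */
        size_t ncap = m->capacity ? m->capacity * 2 : 16;
        struct link_entry *p = realloc(m->entries, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        m->entries = p;
        m->capacity = ncap;
    }

    m->entries[m->count].dev = dev;
    m->entries[m->count].ino = ino;
    m->entries[m->count].id = ++m->next_id;
    *id_out = m->entries[m->count].id;
    m->count++;
    return 0;
}
```

A zero-initialized struct link_map is a valid empty map; the caller frees entries when the archive is done.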