jerry@olivey.olivetti.com (Jerry Aguirre) (08/04/88)
Prior to the "rename" system call the mv command would change the ctime of a file even though the resulting data AND inode were identical. This was an unavoidable consequence of the link and unlink process used to implement renaming.

Well, now that we have rename, it still does! It doesn't seem right that a system call that doesn't change the inode or the data in the file should result in the file being dumped in the next backup.

Is this a bug or is there some justification for rename updating the ctime?
chris@mimsy.UUCP (Chris Torek) (08/04/88)
In article <26657@oliveb.olivetti.com> jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>Prior to the "rename" system call the mv command would change the ctime
>of a file even though the resulting data AND inode were identical. This
>was an unavoidable consequence of the link and unlink process used to
>implement renaming.

>Well, now that we have rename, it still does! ...

>Is this a bug or is there some justification for rename updating the
>ctime?

There is certainly *some* justification: in fact, rename works internally by adjusting the link count temporarily, so that if the machine crashes, at least one of the two names will exist (but both might exist). If it were to do otherwise, fsck would not be able to fix things.

One could argue that it should then go back and reset the ctime after the rename completes, although there is no kernel mechanism to do this (iupdat() insists on using the current time: ip->i_ctime = time.tv_sec).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
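The effect being discussed can be observed from user space. What follows is a minimal sketch, not taken from any kernel or utility source; the names observe_rename and struct rename_report are made up for illustration. It creates a file, renames it, and compares the stat() results before and after. Whether ctime_changed comes out nonzero depends on the system and filesystem, so treat that field as an observation, not a guarantee.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* What changed across a rename(2), as seen through stat(2). */
struct rename_report {
    int same_inode;     /* 1 if st_dev and st_ino are unchanged */
    int same_mtime;     /* 1 if st_mtime is unchanged */
    int ctime_changed;  /* 1 if st_ctime differs; system dependent */
};

/*
 * Create old_path, rename it to new_path, and compare stat() results.
 * Returns 0 on success, -1 if any system call fails.
 */
int observe_rename(const char *old_path, const char *new_path,
                   struct rename_report *out)
{
    struct stat before, after;
    int fd;

    assert(out != NULL);

    fd = open(old_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "data\n", 5) != 5) {
        close(fd);
        return -1;
    }
    close(fd);

    if (stat(old_path, &before) != 0)
        return -1;
    sleep(1);   /* let a one-second-resolution clock tick over */
    if (rename(old_path, new_path) != 0)
        return -1;
    if (stat(new_path, &after) != 0)
        return -1;

    out->same_inode = before.st_dev == after.st_dev &&
                      before.st_ino == after.st_ino;
    out->same_mtime = before.st_mtime == after.st_mtime;
    out->ctime_changed = before.st_ctime != after.st_ctime;
    return 0;
}
```

Calling observe_rename("foo", "bar", &r) in a scratch directory and printing the three fields shows the situation Jerry describes: the inode and the data timestamp are unchanged, and the only thing that may differ is the ctime.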
ggs@ulysses.homer.nj.att.com (Griff Smith) (08/04/88)
In article <26657@oliveb.olivetti.com>, jerry@olivey.UUCP writes:
> Prior to the "rename" system call the mv command would change the ctime
> of a file even though the resulting data AND inode were identical. ...
> ... now that we have rename, it still does! ...
> Is this a bug or is there some justification for rename updating the
> ctime?

I think this is necessary to make the "r" option of the "restore" command work. When doing a full "r" restore with incrementals, there must be records of all changes to the name(s) of an inode. The records are kept by having "dump" write images of all directories that point to changed files. If the inode isn't marked as changed by "rename", "dump" has no way of knowing that the directory must be dumped.
-- 
Griff Smith AT&T (Bell Laboratories), Murray Hill
Phone: 1-201-582-7736
UUCP: {allegra|ihnp4}!ulysses!ggs
Internet: ggs@ulysses.att.com
rta@pixar.UUCP (Rick Ace) (08/05/88)
In article <26657@oliveb.olivetti.com>, jerry@olivey.olivetti.com (Jerry Aguirre) writes:
> Prior to the "rename" system call the mv command would change the ctime
> of a file even though the resulting data AND inode were identical. This
> was an unavoidable consequence of the link and unlink process used to
> implement renaming.
>
> Well, now that we have rename, it still does! It doesn't seem right
> that a system call that doesn't change the inode or the data in the file
> should result in the file being dumped in the next backup.
>
> Is this a bug or is there some justification for rename updating the
> ctime?

The principal purpose of the ctime field in the I-node is to alert /etc/dump that the file has changed in *some way* and must be backed up. In the case of "rename", the file's name has changed; thus the file has changed in some way, and that fact must be recorded by /etc/dump.

When performing a non-level-0 dump, /etc/dump uses the ctime field in its decision to determine whether or not to select a file for backup. If anything about a file changes (mode, nlinks, data contents, name, etc.), then that must be noted on a dump tape. Unfortunately, /etc/dump operates on an "all or nothing" basis, so in this case you pay the price of backing up both the I-node's contents and the file's data just to record the fact that the name alone changed. (Indeed, there is not enough information in the I-node for /etc/dump to distinguish between cases where the file's data has changed and when only an I-node field has.)

Rick Ace
Pixar
3240 Kerner Blvd
San Rafael CA 94901
...!{sun,ucbvax}!pixar!rta
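The "all or nothing" selection described above can be pictured as a single test against the time of the previous dump. This is only a sketch of that kind of rule, with a hypothetical function name (changed_since); it is not claimed to be what /etc/dump actually contains.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical selection rule for an incremental backup: take the file
 * if its data (mtime) or its inode (ctime) changed at or after the time
 * of the last dump.  The rule cannot tell the two cases apart, so a
 * ctime-only change still selects the whole file.
 */
int changed_since(const struct stat *st, time_t last_dump)
{
    assert(st != NULL);
    return st->st_mtime >= last_dump || st->st_ctime >= last_dump;
}
```

Under such a rule, a file whose data was last written long before the previous dump, but whose ctime was bumped by a rename afterwards, is selected exactly as if its contents had been rewritten.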
gallen@apollo.COM (Gary Allen) (08/06/88)
In article <26657@oliveb.olivetti.com> jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>Prior to the "rename" system call the mv command would change the ctime
>of a file even though the resulting data AND inode were identical. This
>was an unavoidable consequence of the link and unlink process used to
>implement renaming.
>
>Well, now that we have rename, it still does! It doesn't seem right
>that a system call that doesn't change the inode or the data in the file
>should result in the file being dumped in the next backup.
>
>Is this a bug or is there some justification for rename updating the
>ctime?

Consider the command 'mv foo bar'. Before issuing this command, there is a file called 'foo' and (usually) no file called 'bar'. After issuing this command, the exact opposite is true. Since there was no such file as 'bar' before, it is certainly logically altered by creation even though technically neither the inode nor the disk blocks of the file have been.

Logically at least, 'mv foo bar' is the same as 'cp foo bar; rm foo'. The directory itself is certainly altered so that must be backed up (which effectively deletes 'foo'), and 'bar' is a file that never existed before so it must obviously be backed up.

Caveat: this is merely my assumption of how it should work. I didn't look at how dump actually works. It's certainly possible to be more clever just to handle the case of renamed files but that seems a fair amount of trouble for not a lot.

Gary Allen
Apollo Computer
Chelmsford, MA
{decvax,yale,umix}!apollo!gallen
chris@mimsy.UUCP (Chris Torek) (08/06/88)
In article <2167@pixar.UUCP> rta@pixar.UUCP (Rick Ace) writes:
>In the case of "rename", the file's name has changed; thus the file has
>changed in some way, and that fact must be recorded by /etc/dump.

This is true. Both you and Griff Smith (and undoubtedly many others) have reached the wrong conclusion from this fact, however, at least if the backups are to be done by `dump'. (In another program you would be right.)

Since the parent directory(ies) of the file that was renamed were altered, dump will save them regardless of any changes to that file. This is both necessary and sufficient for restore to figure out that the file was renamed. The file's inode number has not changed, nor has any of the information in the inode, nor its data; it need not be dumped at all. Only the change of name must be recorded, and it is implicit in the fact that some directory no longer has that <name,ino> pair, and another directory (possibly the same) now has a new <name,ino> pair.

Whether restore would in fact correctly restore such a change is another matter: since dump *does* write the file, restore can get away with assuming it.

Other backup programs (tar certainly, cpio perhaps) do not write directory data blocks, depending instead on later path name information to imply the directory contents. Here the file must indeed be dumped.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
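The point that a rename is implicit in the <name,ino> pairs can be sketched by comparing two snapshots of a directory. This is an illustration only: struct entry, name_of, and count_renames are hypothetical names, and nothing here is claimed to match how restore really processes directories.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One directory entry as recorded in a snapshot: a <name,ino> pair. */
struct entry {
    const char *name;
    unsigned long ino;
};

/*
 * Look up the name an inode number has in a snapshot.
 * Returns NULL if the inode has no entry there.
 */
const char *name_of(const struct entry *dir, size_t n, unsigned long ino)
{
    assert(dir != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (dir[i].ino == ino)
            return dir[i].name;
    return NULL;
}

/*
 * Count inodes that appear in both snapshots under different names.
 * Assumes at most one link per inode per snapshot, to keep this short.
 */
size_t count_renames(const struct entry *old, size_t n_old,
                     const struct entry *cur, size_t n_cur)
{
    size_t renamed = 0;

    for (size_t i = 0; i < n_old; i++) {
        const char *now = name_of(cur, n_cur, old[i].ino);
        if (now != NULL && strcmp(now, old[i].name) != 0)
            renamed++;
    }
    return renamed;
}
```

Given an old snapshot {foo,12},{baz,13} and a new one {bar,12},{baz,13}, the comparison finds one rename without looking at inode 12's contents or timestamps at all.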
allbery@ncoast.UUCP (Brandon S. Allbery) (08/08/88)
As quoted from <2167@pixar.UUCP> by rta@pixar.UUCP (Rick Ace):
+---------------
| The principal purpose of the ctime field in the I-node is to alert
| /etc/dump that the file has changed in *some way* and must be backed up.
| In the case of "rename", the file's name has changed; thus the file has
| changed in some way, and that fact must be recorded by /etc/dump.
+---------------

We have a misunderstanding here. Inodes do not have names, they have i-numbers!!! A "filename" is a key used by an index (the directory mechanism and namei()) to determine the i-number of the file. As a result, the file itself has not changed but the directories (index pages) searched to find the i-number have changed; so the directories should be backed up, but the file shouldn't.

If this is not clear, here's a simplified example of the situation:

    struct inode {
        int i_nlink;
        int i_ctime;
        int i_data;                     /* could be anything */
    } *x, *y;

    /* This function is invoked by the kernel whenever anything in the */
    /* inode pointed to by "ip" changes. */
    iupdat(ip)
    struct inode *ip;
    {
        extern int now;

        ip->i_ctime = now;
    }

    rename(xp, yp)
    struct inode **xp, **yp;
    {
        /* unlink old "yp" */
        if (--(*yp)->i_nlink == 0)      /* dex causes an iupdat() */
            free(*yp);
        *yp = (struct inode *) NULL;
        /* link "xp" to "yp" */
        (*yp = *xp)->i_nlink++;         /* inx causes an iupdat() */
        /* Things are consistent if the system crashes here! */
        /* unlink old "xp" */
        --(*xp)->i_nlink;               /* dex causes an iupdat() */
        /* should be like yp above, but nlink == 0 "can't happen" */
        *xp = (struct inode *) NULL;    /* unlink old "x" */
    }

    main()
    {
        rename(&x, &y);
    }

The problem is that the object originally pointed to by "xp" has a value incremented and immediately decremented; this causes the kernel to change the ctime (the iupdat() comment). It's actually unnecessary overhead because it *is* immediately undone; however, note the comment above about a crash "here". fsck would not appreciate the incomplete rename() in the slightest if the nlink value weren't changed for the duration of the link. So the price of making the filesystem a bit more resistant to crashes is having unnecessary data in the next incremental dump.

A fix would be to have rename() save and restore the ctime of the inode pointed to by "xp" above....

++Brandon

[DISCLAIMER: The above code is not taken from AT&T or Berkeley source code, it is merely a C version of a simplification of what I understand to be going on, with commonly-known function and structure element names used to make things clearer. I sincerely doubt that the code actually looks anything like that....]
-- 
Brandon S. Allbery, uunet!marque!ncoast!allbery DELPHI: ALLBERY
For comp.sources.misc send mail to ncoast!sources-misc
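The suggested fix, saving and restoring the ctime around the temporary link-count change, can be modelled in a few lines of user-space C. As with the post above, this is a toy and not kernel code: toy_inode, toy_clock, toy_iupdat, and toy_rename are all invented names, and the keep_ctime switch exists only so the two behaviors can be compared side by side.

```c
#include <assert.h>
#include <stddef.h>

/* Toy in-memory inode; not the real kernel structure. */
struct toy_inode {
    int  i_nlink;
    long i_ctime;
};

long toy_clock = 1000;      /* stand-in for the kernel's current time */

/* Models iupdat(): any inode change stamps the current time. */
void toy_iupdat(struct toy_inode *ip)
{
    ip->i_ctime = toy_clock;
}

/*
 * Move the file in slot *from to slot *to.  The link count is raised
 * while both names exist, so an interruption in the middle leaves a
 * count that covers every name.  If keep_ctime is nonzero, the ctime
 * the file had on entry is put back once the rename is complete.
 */
void toy_rename(struct toy_inode **from, struct toy_inode **to, int keep_ctime)
{
    struct toy_inode *ip = *from;
    long saved;

    assert(ip != NULL);
    saved = ip->i_ctime;

    if (*to != NULL) {          /* drop whatever the target name held */
        (*to)->i_nlink--;
        toy_iupdat(*to);
    }

    *to = ip;                   /* new name exists: two links for now */
    ip->i_nlink++;
    toy_iupdat(ip);

    *from = NULL;               /* old name gone: back to the old count */
    ip->i_nlink--;
    toy_iupdat(ip);

    if (keep_ctime)
        ip->i_ctime = saved;    /* the proposed fix */
}
```

With keep_ctime set to 0 the renamed inode ends up stamped with toy_clock even though its link count is back where it started; with keep_ctime set to 1 the link count still passes through the crash-safe intermediate value, but the ctime an incremental backup would see is unchanged.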
mangler@cit-vax.Caltech.Edu (Don Speck) (08/22/88)
In article <12865@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> Since the parent directory(ies) of the file that was renamed were
> altered, dump will save them regardless of any changes to that file.
> This is both necessary and sufficient for restore to figure out that
> the file was renamed.

This should be true regardless of whether the rename was accomplished by the rename() call, or by link()/unlink(). So why have changes in the link count affected the ctime in *any* version of Unix?

Tar and cpio backups, which don't account for renaming, would be better off NOT backing up renamed files, so that when you restore a series of such backups you get only one copy of the file, rather than one in the old location and one in the new, overflowing the filesystem. (Tar and cpio have serious problems as backup tools in any case).

Rogue shouldn't care about renames when determining whether a saved game is the original copy.

I claim that things would work better all around if changes in the link count did NOT update the ctime. The link count is not an attribute of the object we call a file, it is a convenience to simplify garbage collection. On a write-once filesystem you wouldn't even want link counts.

Don Speck speck@vlsi.caltech.edu {amdahl,ames!elroy}!cit-vax!speck
karish@denali.stanford.edu (Chuck Karish) (08/22/88)
In article <7660@cit-vax.Caltech.Edu> mangler@cit-vax.Caltech.Edu (Don Speck) writes:
>In article <12865@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
>So why have changes in
>the link count affected the ctime in *any* version of Unix?

ctime records the last modification time of the inode. This information is useful for some programs; others may ignore it.

>Tar and cpio backups, which don't account for renaming, would be better
>off NOT backing up renamed files, so that when you restore a series of
>such backups you get only one copy of the file, rather than one in the
>old location and one in the new, overflowing the filesystem. (Tar and
>cpio have serious problems as backup tools in any case).

I disagree. If tar or cpio were smart enough to do this, it'd be impossible to predict what files would be backed up. What if I rename a file, run tar or cpio, then remove the first link to the file? The file isn't present on disk, and it's not present on the backup. Anyway, how do you propose that tar or cpio discover that a file has been renamed?

>Rogue shouldn't care about renames when determining whether a saved game
>is the original copy.

I guess this is right, as long as the link count is 1 and it's not looking through a symbolic link. Checking ctime is still the easiest way to check that the file hasn't been fooled with.

>I claim that things would work better all around if changes in the link
>count did NOT update the ctime. The link count is not an attribute of the
>object we call a file, it is a convenience to simplify garbage collection.
>On a write-once filesystem you wouldn't even want link counts.

The link count is an attribute of the object we call an inode. It's useful for more than garbage collection. For example, tar uses it to solve the problem you pointed out before: resolving multiple file names that refer to the same data (when both links are being saved). This feature would be useful on a write-once filesystem.

Chuck Karish
ARPA: karish@denali.stanford.edu
BITNET: karish%denali@forsythe.stanford.edu
UUCP: {decvax,hplabs!hpda}!mindcrf!karish
USPS: 1825 California St. #5 Mountain View, CA 94041
mangler@cit-vax.Caltech.Edu (Don Speck) (09/11/88)
In article <7660@cit-vax.Caltech.Edu> I wrote:
> Tar and cpio backups, which don't account for renaming, would be better
> off NOT backing up renamed files, so that when you restore a series of
> such backups you get only one copy of the file,

In article <23338@labrea.Stanford.EDU>, karish@denali.stanford.edu (Chuck Karish) writes:
> What if I rename
> a file, run tar or cpio, then remove the first link to the file?
> The file isn't present on disk, and it's not present on the backup.

The reason it's not on that backup is that its ctime is old. Thus, it must have existed at the time of the *previous* backup. Restore the previous backup first, rename the file, and restore the current incremental backup. You already have to do manual rename/rm/rmdir when restoring tar or cpio incremental backups, since tar and cpio don't carry enough information to do this for you.

If you renamed the directory containing the file, or a symbolic link pointing to the file, instead of the file itself, the ctime of the file would not be updated on any Unix, and so the file, though it effectively has been renamed, wouldn't appear on your incremental backup. The ctime is not a reliable indicator that the file's name(s) have changed.

> tar uses it [the link count] to
> solve the problem you pointed out before: resolving multiple file names
> that refer to the same data (when both links are being saved).

The inode number is used to determine which files are linked. To save memory, tar doesn't remember the inode number if the link count is 1; but without link counts it all still works. A link count greater than 1 doesn't even mean that tar will necessarily encounter any other links.

I reiterate that the link count is a convenience for garbage collection, it's not essential for anything else (most filesystems, such as VMS's, don't even have them). You could set all the link counts to MAXINT and everything should continue to work until you run out of disk space.

Changes in the link count ought not change ctime, just as renaming the containing directory does not, even though both actions change one or more of the file's names. The name(s) (the links) are not part of the inode; the link count should be classed with the links. Would this be more clear if the link count had been stored elsewhere, analogous to the inode allocation bitmap of some filesystems?

Don Speck speck@vlsi.caltech.edu {amdahl,ames!elroy}!cit-vax!speck
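The claim that hard links are found by inode number, with the link count serving only as a memory-saving hint, can be sketched as a small lookup table. This is a guess at the general shape of such a scheme and is not drawn from tar's source; struct link_table, note_file, and the fixed MAX_SEEN limit are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_SEEN 64     /* arbitrary table size for this sketch */

/* One remembered file: its identity and the first name it was seen under. */
struct seen {
    dev_t dev;
    ino_t ino;
    const char *first_name;
};

struct link_table {
    struct seen slot[MAX_SEEN];
    size_t used;
};

/*
 * Returns the first name under which this inode was already archived,
 * or NULL if this is the first sighting (the file is then remembered).
 * If use_nlink_hint is nonzero, files with st_nlink == 1 are never
 * remembered; that changes how much is stored, not which links are found.
 */
const char *note_file(struct link_table *t, const struct stat *st,
                      const char *name, int use_nlink_hint)
{
    assert(t != NULL && st != NULL);

    /* Optional shortcut: a single-link file should not be met twice. */
    if (use_nlink_hint && st->st_nlink == 1)
        return NULL;

    for (size_t i = 0; i < t->used; i++)
        if (t->slot[i].dev == st->st_dev && t->slot[i].ino == st->st_ino)
            return t->slot[i].first_name;

    if (t->used < MAX_SEEN) {
        t->slot[t->used].dev = st->st_dev;
        t->slot[t->used].ino = st->st_ino;
        t->slot[t->used].first_name = name;
        t->used++;
    }
    return NULL;
}
```

Running a directory walk through note_file with the hint turned off gives the same answers for every file that really is met twice; the table simply fills up faster, which is the trade-off described above.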