qacr@oce-rd1.UUCP (Alistair Crooks) (08/13/85)
On our Suns, running Sun Unix(TM) 4.2 BSD, Releases 1.2, 1.3, 1.4, a find(1) will fail when the pathname-list is a symbolic link.

The problem can be repeated by linking symbolically to the /usr/bin directory from my home directory, for example, calling it pathname-list, and executing

	find pathname-list -name find -print

from my home directory.

find does not seem to expand the link, or use readlink(), or anything else. Is it meant to, or should a find just give no output, as though there weren't any files?

Current thinking seems to be that if a bug is documented, it is a feature. I have looked at the manual entry, but can see neither i) any reference to links (symbolic or otherwise) being handled differently to other directory entries nor ii) any disclaimer in the BUGS section of the manual.

Stop Press: Sun Release 2 also shows this.

Any comments...

-- 
Alistair G. Crooks
BSO Eindhoven/Oce Nederland b.v.
{seismo,philabs,decvax,ucbvax}!mcvax!oce-rd1!qacr
{seismo,philabs,decvax,ucbvax}!mcvax!bsovax!ocealis
chris@umcp-cs.UUCP (Chris Torek) (08/14/85)
Find is designed not to traverse symbolic links, as they often cause pathname loops. It is arguably a mistake to skip those that are given in the pathname-list....

-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs
ARPA: chris@maryland
guy@sun.uucp (Guy Harris) (08/16/85)
> On our Suns, running Sun Unix(TM) 4.2 BSD, Releases 1.2, 1.3, 1.4,
> a find(1) will fail when the pathname-list is a symbolic link.
>
> find does not seem to expand the link, or use readlink(), or anything
> else. Is it meant to, or should a find just give no output,
> as though there weren't any files?

The 4.2 manual doesn't come out and say it explicitly, but the 4.2 "find" treats symbolic links as symbolic links, rather than looking at what they point to. The manual does say:

	-type C   True if the type of the file is C, where C is "b", "c",
	          "d", "f" or "l" for block special file, character special
	          file, directory, plain file, or symbolic link.

which does imply that it does an "lstat" rather than a "stat", and looks at symbolic links rather than at what they point to. (This is in standard 4.2BSD.)

	Guy Harris
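To make the lstat/stat distinction described above concrete, here is a minimal sketch in C. It is not taken from the find source: the function names (entry_type, target_type, demo_link_vs_target) and the /tmp scratch path are invented for illustration, and it assumes a POSIX system that provides lstat(2), symlink(2) and mkdtemp(3).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file mode to the letter used by find's -type primary. */
static char mode_letter(mode_t m)
{
    if (S_ISLNK(m)) return 'l';
    if (S_ISDIR(m)) return 'd';
    if (S_ISREG(m)) return 'f';
    if (S_ISBLK(m)) return 'b';
    if (S_ISCHR(m)) return 'c';
    return '?';
}

/* Type of the directory entry itself: lstat(2) does not follow a link. */
char entry_type(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    return lstat(path, &sb) == 0 ? mode_letter(sb.st_mode) : '?';
}

/* Type of whatever the path resolves to: stat(2) follows links. */
char target_type(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    return stat(path, &sb) == 0 ? mode_letter(sb.st_mode) : '?';
}

/* Hypothetical demo: create a scratch directory holding a symbolic link
 * that points back at that directory, then compare the two views.
 * Returns 0 if lstat reports 'l' and stat reports 'd', nonzero otherwise. */
int demo_link_vs_target(void)
{
    char dir[] = "/tmp/findlnkXXXXXX";
    char lnk[64];
    int ok;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(lnk, sizeof lnk, "%s/ln", dir);
    if (symlink(dir, lnk) != 0) {
        rmdir(dir);
        return -1;
    }
    ok = entry_type(lnk) == 'l' && target_type(lnk) == 'd';
    unlink(lnk);
    rmdir(dir);
    return ok ? 0 : 1;
}
```

A program that classifies entries with entry_type sees the link as a link, which matches the behaviour reported for a symbolic link given in the pathname-list; one that used target_type would see the directory behind it.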
lwa@apollo.uucp (Larry Allen) (08/26/85)
In all of the 4.2bsd implementations I know about (VAX, Sun, Apollo), find(1) is specifically arranged to not follow symbolic links. There are a couple of reasons for this:

1) If you follow a symbolic link, it's hard to get back. Going back to .. doesn't work; instead, find would have to explicitly keep a stack of the directories it had visited. While this would work, it would require find to do a getwd(2) at every directory level, and getwd is pretty slow.

2) There are problems with loops in the directory structure. A symbolic link can point to an ancestor of the current directory, potentially resulting in infinite loops. Again this can be solved by keeping track of all directories visited so far, but it would be slow, especially on big searches.

As an aside, note that this issue of following symbolic links is a "gotcha" in systems which provide both System 5 and 4.2Bsd compatibility, like Apollo's. We have added an lstat(2) call to the System 5 library, and modified programs like find and du which search the directory tree to use lstat and hence to avoid following symbolic links.

	-Larry Allen
	 Apollo Computer
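The "keep track of all directories visited so far" idea in point 2 can be sketched as below. This is only an illustration under stated assumptions, not code from any find implementation: the names seen_set, seen_add and MAX_SEEN are hypothetical, and directories are identified by the (st_dev, st_ino) pair that stat(2) returns. The linear search is deliberate; it shows why the bookkeeping gets slow on big searches.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_SEEN 1024   /* arbitrary table size chosen for this sketch */

/* Set of directories already visited, keyed by device and inode number. */
struct seen_set {
    size_t n;
    dev_t dev[MAX_SEEN];
    ino_t ino[MAX_SEEN];
};

/* Record a directory.
 * Returns 1 if it is new, 0 if it was already visited (a loop),
 * and -1 if the table is full. */
int seen_add(struct seen_set *s, dev_t dev, ino_t ino)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < s->n; i++)          /* cost grows with every directory */
        if (s->dev[i] == dev && s->ino[i] == ino)
            return 0;
    if (s->n == MAX_SEEN)
        return -1;
    s->dev[s->n] = dev;
    s->ino[s->n] = ino;
    s->n++;
    return 1;
}
```

A tree walker that followed links would call seen_add before descending into each directory and skip the descent when it returns 0.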
peter@graffiti.UUCP (Peter da Silva) (08/31/85)
> In all of the 4.2bsd implementations I know about (VAX, Sun, Apollo), find(1) is specifically arranged
> to not follow symbolic links. There are a couple of reasons for this:
> 1) If you follow a symbolic link, it's hard to get back. Going back to .. doesn't work; instead,
> find would have to explicitly keep a stack of the directories it had visited. While this would
> work, it would require find to do a getwd(2) at every directory level, and getwd is pretty slow.

Why would it require find to do a getwd? The file spelling checker in Kernighan and Pike doesn't. All you have to do is build the directory stack on the fly. I always assumed find did this rather than depending on ..

hyd-ptd!/usr/src/news and hyd-ptd!/usr/spool/uucp/news.src were the same file for a while when I was trying to get uucp up on datafact. Find never got lost here, that I recall.
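One way to read "build the directory stack on the fly" is a walker that never changes directory at all and carries the full pathname down through recursion. The sketch below is an assumption about what that could look like, not the Kernighan and Pike program or the real find: walk and demo_walk are hypothetical names, the /tmp scratch tree exists only for the demo, and links are examined with lstat(2) and not followed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Walk the tree rooted at path without calling chdir() or getwd().
 * Each recursion level owns its own pathname buffer, so the C call stack
 * is the directory stack and returning is all it takes to "get back".
 * Returns the number of entries seen, counting path itself, or -1. */
long walk(const char *path)
{
    struct stat sb;
    struct dirent *de;
    DIR *dp;
    long count = 1, sub;
    char child[PATH_MAX];
    int n;

    assert(path != NULL);
    if (lstat(path, &sb) != 0)
        return -1;
    if (!S_ISDIR(sb.st_mode))
        return 1;                       /* plain file, link, device, ... */
    if ((dp = opendir(path)) == NULL)
        return -1;
    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        n = snprintf(child, sizeof child, "%s/%s", path, de->d_name);
        if (n < 0 || n >= (int)sizeof child)
            continue;                   /* pathname too long: skip it */
        sub = walk(child);
        if (sub > 0)
            count += sub;
    }
    closedir(dp);
    return count;
}

/* Hypothetical demo tree:  top/  top/sub/  top/sub/up -> top
 * Returns what walk() reports for top, or -1 if setup failed. */
long demo_walk(void)
{
    char top[] = "/tmp/findwalkXXXXXX";
    char sub[64], up[64];
    long n;

    if (mkdtemp(top) == NULL)
        return -1;
    snprintf(sub, sizeof sub, "%s/sub", top);
    snprintf(up, sizeof up, "%s/sub/up", top);
    if (mkdir(sub, 0700) != 0 || symlink(top, up) != 0)
        n = -1;
    else
        n = walk(top);
    unlink(up);
    rmdir(sub);
    rmdir(top);
    return n;
}
```

Because the link up is counted but not entered, the walk over the demo tree terminates even though the link points at an ancestor.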
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (09/02/85)
> > In all of the 4.2bsd implementations I know about (VAX, Sun, Apollo), find(1) is specifically arranged
> > to not follow symbolic links. There are a couple of reasons for this:
> > 1) If you follow a symbolic link, it's hard to get back. Going back to .. doesn't work; instead,
> > find would have to explicitly keep a stack of the directories it had visited. While this would
> > work, it would require find to do a getwd(2) at every directory level, and getwd is pretty slow.
>
> Why would it require find to do a getwd? The file spelling checker in Kernighan
> and Pike doesn't. All you have to do is build the directory stack on the fly.
> I always assumed find did this rather than depending on ..

Peter is right. Indeed, most if not all of the utilities in the BRL UNIX System V emulation for 4.2BSD now avoid doing chdir( ".." ) in order to avoid problems with symbolic links.

Our SVR2 Bourne shell has been modified so that "cd .." does what one might expect (trim off rightmost piece of pathname from current working directory) rather than wander off in a different direction than was used to enter the directory. It is initialized by "cd $HOME" in /usr/5lib/profile (our equivalent of /etc/profile), to make sure that it thinks you are in the directory as specified in /etc/passwd and not wherever /bin/pwd would say you are.
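The "trim off rightmost piece of pathname" behaviour can be sketched in a few lines of C. This is an illustration only and makes no claim about how the BRL shell implements it; logical_parent is a hypothetical name, and the sketch assumes the shell keeps its idea of the working directory as an absolute pathname string.

```c
#include <assert.h>
#include <string.h>

/* Trim the rightmost component from an absolute pathname, in place.
 * "/usr/spool/news" becomes "/usr/spool", "/usr" becomes "/",
 * and "/" stays "/". No filesystem call is made, so a symbolic link
 * used on the way in cannot send the result somewhere unexpected. */
void logical_parent(char *path)
{
    char *slash;
    size_t len;

    assert(path != NULL && path[0] == '/');

    /* Drop trailing slashes, but never the leading one. */
    len = strlen(path);
    while (len > 1 && path[len - 1] == '/')
        path[--len] = '\0';

    slash = strrchr(path, '/');     /* cannot be NULL: path starts with '/' */
    if (slash == path)
        path[1] = '\0';             /* parent of "/x" (and of "/") is "/" */
    else
        *slash = '\0';
}
```

A shell using this would chdir() to the trimmed string, so "cd .." returns along the route that was used to enter the directory.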
jpl@allegra.UUCP (John P. Linderman) (10/21/85)
Index:	usr.bin/find.c 4.2BSD

Description:
	Only one -newer option will be correctly processed on a find.

Repeat-By:
	# The following script demonstrates the problem (which also
	# exists on System V and Version 8) and the effect of the fix.
	# The fix also adds the ability to compare on access and
	# inode modification times as well as file modification time,
	# as is also demonstrated in the script.
	$ touch 1
	$ touch 2
	$ touch 3
	$ find . \( -newer 2 -o -newer 3 \) -print
	.
	$ /usr/5bin/find . \( -newer 2 -o -newer 3 \) -print
	.
	$ find . \( -newer 3 -o -newer 2 \) -print
	.
	./3
	$ ./find . \( -newer 2 -o -newer 3 \) -print
	.
	./3
	$ mv 1 4
	$ find . -newer 2 -print
	.
	./3
	$ ./find . -newer 2 -print
	.
	./3
	$ ./find . -newerc 2 -print
	.
	./3
	./4
	$

Fix:
	The following diffs to the BSD 4.2 source correct the problem, and add a dozen options. (Only a few options are genuinely useful, but it was cleaner to add them all than to prune out the useless ones.)

	-newer can be followed by one or two occurrences of the letters [acm] to specify which time from the stat structure (st_atime, st_ctime or st_mtime -- see stat(2)) will be used in the comparison. The first letter, if any, determines the time used for the files the find command is searching. The second, if any, determines the time from the file that follows the -newer option. Both default to m, so -newer foo, -newerm foo, and -newermm foo are identical.

	Note that -newerc causes the INODE modification time of the found files to be compared to the FILE (not inode) modification time of the specified target. This was done deliberately, because it works correctly with the following incremental backup scheme

		touch startstamp
		find ... -newerc laststamp ...
		mv startstamp laststamp

	If the dump dies midstream, laststamp is not changed, so the next dump will get all the files this dump would have. If the dump does run to completion, the mv changes the inode modification time of startstamp but not the file modification time, so the next incremental dump will pick up all the files changed after OR DURING this dump, including those whose modes or owners were changed or those renamed.

	I don't know if the System V and Version 8 sources are identical, but (except for the MAXPATHLEN change), the changes appear to be analogous. The new features are particularly useful in conjunction with the System V touch command, which allows one to set the modification dates of a file to an arbitrary time. These give greater precision and cleaner semantics than the -mtime and -atime options (one day since when??).

John P. Linderman  Department of find bug finders  allegra!jpl

11c11
< char Pathname[200];
---
> char Pathname[MAXPATHLEN + 1];
30c30,32
< long Newer;
---
> #define NNEW 50
> int Nnewer;
> time_t Newer[NNEW];
230c232,234
< 	else if(EQ(a, "-newer")) {
---
> 	else if(strncmp(a, "-newer", 6) == 0) {
> 		char *p = a + 6;
> 		time_t *t1p, *t2p;
235,236c239,278
< 		Newer = Statb.st_mtime;
< 		return mk(newer, (struct anode *)0, (struct anode *)0);
---
> 		if(Nnewer >= NNEW) {
> 			fprintf(stderr, "find: too many -newer constructs\n");
> 			exit(1);
> 		}
> 		t1p = t2p = &(Statb.st_mtime);
> 		switch (*p) {
> 		case 'm':
> 			p++;
> 			break;
> 		case '\0':
> 			break;
> 		case 'a':
> 			t1p = &(Statb.st_atime);
> 			p++;
> 			break;
> 		case 'c':
> 			t1p = &(Statb.st_ctime);
> 			p++;
> 			break;
> 		}
> 		switch (*p) {
> 		case 'm':
> 			p++;
> 			break;
> 		case '\0':
> 			break;
> 		case 'a':
> 			t2p = &(Statb.st_atime);
> 			p++;
> 			break;
> 		case 'c':
> 			t2p = &(Statb.st_ctime);
> 			p++;
> 			break;
> 		}
> 		if (*p == '\0') {
> 			Newer[Nnewer] = *t2p;
> 			return mk(newer, (struct anode *)t1p,
> 				(struct anode *)(&Newer[Nnewer++]));
> 		}
428c470,471
< newer()
---
> newer(p)
> register struct { int f; time_t *t1, *t2; } *p;
430c473
< 	return Statb.st_mtime > Newer;
---
> 	return *(p->t1) > *(p->t2);
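For readers who want the -newer[acm][acm] rule in isolation, here is a standalone sketch in C of the suffix decoding and the comparison. It is a restatement for illustration, not the patch itself: parse_newer_suffix, pick_time and newer_than are hypothetical names, and unlike the patch (which simply falls through to the remaining option checks) this version reports a malformed suffix with -1.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Return the timestamp named by which: 'a' access time, 'c' inode
 * change time, anything else the file modification time. */
static time_t pick_time(const struct stat *sb, char which)
{
    switch (which) {
    case 'a': return sb->st_atime;
    case 'c': return sb->st_ctime;
    default:  return sb->st_mtime;
    }
}

static int is_acm(char c)
{
    return c == 'a' || c == 'c' || c == 'm';
}

/* Decode the text that follows "-newer": zero, one or two letters from
 * [acm]. The first letter selects the time of the file being examined,
 * the second the time of the reference file; both default to 'm'.
 * Returns 0 on success, -1 if anything else follows. */
int parse_newer_suffix(const char *suffix, char *found, char *ref)
{
    assert(suffix != NULL && found != NULL && ref != NULL);
    *found = *ref = 'm';
    if (is_acm(*suffix))
        *found = *suffix++;
    if (is_acm(*suffix))
        *ref = *suffix++;
    return *suffix == '\0' ? 0 : -1;
}

/* True if the selected time of the examined file is later than the
 * selected time of the reference file. */
int newer_than(const struct stat *found_sb, char found,
               const struct stat *ref_sb, char ref)
{
    assert(found_sb != NULL && ref_sb != NULL);
    return pick_time(found_sb, found) > pick_time(ref_sb, ref);
}
```

With a file whose contents predate the stamp but whose inode changed afterwards (the renamed file 4 in the script), the default comparison is false while the "c" comparison is true, which is the case the incremental backup scheme relies on.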
mp@allegra.UUCP (Mark Plotnick) (10/21/85)
> From jpl@allegra.UUCP (John P. Linderman)
> The following diffs to the BSD 4.2 source correct the problem,
> and add a dozen options. (Only a few options are genuinely
> useful, but it was cleaner to add them all than to prune out
> the useless ones.)

Look out, world! I was just showing Linderman a bug in "ls" this morning...
jpl@allegra.UUCP (John P. Linderman) (10/21/85)
> From: mp@allegra.UUCP (Mark Plotnick)
>> From jpl@allegra.UUCP (John P. Linderman)
>> The following diffs to the BSD 4.2 source correct the problem,
>> and add a dozen options. (Only a few options are genuinely
>> useful, but it was cleaner to add them all than to prune out
>> the useless ones.)
> Look out, world! I was just showing Linderman a bug in "ls" this morning...

Hoist with my own petard, eh? The changes only add two new ``concepts'', the use of access times or inode change times instead of file modification times. Then the two files, the backwards-compatible defaults, and simple combinatorics generate a dozen possibilities. Those ending with one or more m's are ``useless'' in the sense that the same effect can be obtained by leaving the m's off, but they are ``useful'' because they provide a consistent mapping to the underlying concepts.

By this reckoning, adding the n+1'st flag to ls adds 2**n new options, so a dozen is pretty modest. But now that you mention it, it would be nice to have a flag to ls that caused all unprintable characters in file names to ...

John P. Linderman  Finder of lost options  allegra!jpl