franco@MIKEY.BBN.COM (09/24/87)
I understand that the 'stat' system call on a NULL string for a file name will default to '.' (DOT), the current directory. You don't have to explain to me why this is. I already know why. I only just recently discovered this fact. But what I would like to know is, why does the test -d of a NULL string in the following sample script return true?

    Foo=""
    if test -d "$Foo"
    then
        do something constructive
    fi
    don't do anything

Let me rephrase the question. I understand that /bin/test just does a 'stat' call on the NULL filename string which defaults to DOT which does exist and is a directory and that is why test returned true and the 'do something constructive' part would get executed. But why should I have to know in advance that NULL will default to DOT in the case when I'm testing for the existence of a directory? Shouldn't /bin/test handle a NULL string in a special way, i.e., return false? I don't see why I have to know in advance that NULL defaults to DOT and I shouldn't have to do two tests to find out if the file is a directory. Also, the documentation doesn't say anything about the case of a NULL filename when using 'test -d'. To me, this really sounds like a bug or a mis-implementation and documentation of /bin/test.

Please forward these questions to people who you think should read it just in case unix-wizards doesn't cover the appropriate group, i.e., other mailing lists. And please CC me on all responses to this message, for sometimes I don't read all of the unix-wizards digests.

thank you,
-franco
CSNET: franco%bbn.com@relay.cs.net
UUCP: ..!harvard!franco@bbn.com
guy%gorodish@Sun.COM (Guy Harris) (09/25/87)
> I understand that the 'stat' system call on a NULL string for a file name
> will default to '.' (DOT), the current directory.

Umm, it doesn't do this on every system. DO NOT RELY ON THIS BEHAVIOR.

> But what I would like to know is, why does the test -d of a
> NULL string in the following sample script return true?

Don't rely on *this* behavior either - check that a string isn't null before handing it to "/bin/test"!

Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com
ron@topaz.rutgers.edu (Ron Natalie) (09/26/87)
STAT should never be called with NULL, but it should always work when called with a zero length string (e.g. ""). -Ron
gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/27/87)
In article <15069@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes:
>STAT should never be called with NULL, but it should always work
>when called with a zero length string (e.g. "").

On many implementations, "" is not a valid filename (when it is, it is synonymous with ".").
franco@MIKEY.BBN.COM (09/28/87)
Amos:

If "$foo" is a NULL string /bin/test doesn't replace foo with a '.'.

    if [ -d "$foo" ]; then
        do_something $foo/xx
    fi

With the above script if foo is null then the do something will act on /xx instead of ./xx, clearly two different directories and something that a programmer might not want to happen.

You see, my point to the original questions was, Why do I have to know in advance that test -d "" is going to say that "" is a directory when it is clearly not a directory. And, why should I have to do two tests to find out if "$foo" is a directory, /bin/test should do it. Relying on the default sounds to me to be bad programming practice, but I've looked into source code such as ls.c and it does the same thing, rely on the default, namely if you do 'ls ""' it does an 'ls .'. Other programs do the same, chmod chown, cd and the like. All of which I consider poorly written if they rely on the default like that.

Point 2 is, what if the do_something was

    cd $foo; rm -rf $foo

and the script ran as root? OH MY GOD!!!!!!!!!

-franco
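[Editorial sketch, not part of the original post.] The hazard described above, an empty variable turning `$foo/xx` into `/xx`, can be guarded against in the shell itself. The following is a minimal sketch, assuming a POSIX shell; `remove_tree` is a hypothetical helper name, and the `${1:?}` expansion is standard shell syntax rather than anything proposed in the thread.

```shell
#!/bin/sh
# remove_tree is a hypothetical helper name, not something from the thread.
# The body runs in a subshell (note the parentheses instead of braces), so a
# failed ${1:?} check ends only the subshell, not the calling script.
remove_tree() (
    # ${1:?message} aborts with an error if $1 is unset or empty.
    dir=${1:?refusing to act on an empty directory name}
    # Do nothing unless the non-empty name really is a directory.
    [ -d "$dir" ] || exit 1
    rm -rf -- "$dir"
)
```

A caller would write something like `remove_tree "$foo" || echo "skipped" >&2`; with an empty `$foo` the function fails instead of touching anything.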
kimcm@ambush.UUCP (09/30/87)
In article <29056@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>Don't rely on *this* behavior either - check that a string isn't null before
>handing it to "/bin/test"!

Just how ? Write a specialized c-program "isnull"....er! Otherwise you'll have to use test to test whether it should be passed to test.... (-;

Well it's two different uses of test...So I'm just picky I think...

Kim Chr. Madsen.
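[Editorial sketch, not part of the original post.] One way to check for an empty string without invoking `test` at all is the shell's own `case` construct, which needs no fork. This is a minimal sketch, assuming a POSIX shell; `is_null` is a hypothetical function name chosen to echo the "isnull" program joked about above.

```shell
#!/bin/sh
# is_null is a hypothetical name. It succeeds (status 0) only when its
# first argument is the empty string, using pattern matching in the shell.
is_null() {
    case $1 in
        "") return 0 ;;   # zero-length string
        *)  return 1 ;;   # anything else, including "-d" or " "
    esac
}
```

Because `case` does not parse its word as an operator, values such as `-d` or strings containing whitespace need no special handling.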
allbery@ncoast.UUCP (Brandon Allbery) (10/01/87)
As quoted from <9479@brl-adm.ARPA> by franco@MIKEY.BBN.COM (Frank A. Lonigro):
+---------------
| I understand that the 'stat' system call on a NULL string for a file name
| will default to '.' (DOT), the current directory. You don't have to explain
| to me why this is. I already know why. I only just recently discovered
| this fact. But what I would like to know is, why does the test -d of a
| NULL string in the following sample script return true?
|
| Foo=""
| if test -d "$Foo"
| then
| do something constructive
| fi
+---------------

Both "stat" and this sample demonstrate what is basically a bug in the kernel routine namei(), which maps a filename to a <device, inode> pair. The pseudocode for namei() is as follows:

    pathp = pathname;
    if (*pathp == '/') {
        pathp++;            /* skip the leading `/' */
        inode = u.u_rdir;   /* calling proc's root dir */
    } else
        inode = u.u_cdir;   /* calling proc's current dir */
    while (scan_to_a_slash_or_end_of_the_path(&pathp, name))
        inode = get_named_inode_in_dir_inode(name);
    return inode;

Notice that there is no check for a null pathname: this means that, since the first character of the path is not `/' the inode is initially set to the current directory; and since there is no file name following the start of the path the while loop always fails. So namei() happily returns the value of u.u_cdir -- the current directory.

[DISCLAIMER: The pseudocode above comes from my understanding of the problem; it is NOT based on any AT&T or UCB CSRG code. Besides, the only namei()-type code I've ever seen is in Minix, and it's done differently there.]

Some, but most definitely NOT all, versions of Unix explicitly check for a null path in namei() and return an error (ENOENT, or perhaps EINVAL). All others have the bug; since it's not exactly crippling, it's not considered a high priority to fix it, so it stays.

-- 
Brandon S. Allbery, moderator of comp.sources.misc
{{harvard,mit-eddie}!necntc,well!hoptoad,sun!mandrill!hal}!ncoast!allbery
ARPA: necntc!ncoast!allbery@harvard.harvard.edu Fido: 157/502 MCI: BALLBERY
<<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>
"`You left off the thunderclap and the lightning flash.', I told him.
`Should I try again?' `Never mind.'" --Steven Brust, JHEREG
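[Editorial sketch, not part of the original post.] The walk described in the pseudocode can be imitated in user space to see why an empty path "goes nowhere". This is only a toy model, assuming a POSIX shell: `namei_trace` is a hypothetical name, it never touches the filesystem, and it says nothing about what any real kernel does. It reports the starting point (root or current directory) and how many non-empty components would be looked up.

```shell
#!/bin/sh
# namei_trace is a hypothetical, user-space toy model of the pseudocode.
# The body is a subshell, so the IFS and "set -f" changes do not leak out.
namei_trace() (
    path=$1
    case $path in
        /*) start=root ;;   # leading slash: start at the root directory
        *)  start=cwd ;;    # otherwise: start at the current directory
    esac
    count=0
    IFS=/      # split the path on slashes
    set -f     # do not glob-expand components such as "*"
    for comp in $path; do
        # Empty components (from "//" or a leading "/") are skipped.
        [ -n "$comp" ] && count=$((count + 1))
    done
    printf '%s %s\n' "$start" "$count"
)

namei_trace ""             # prints "cwd 0"
namei_trace "/"            # prints "root 0"
namei_trace "/etc/passwd"  # prints "root 2"
```

In this model `""` and `"/"` are the two cases with zero lookups: each names its starting directory and nothing more.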
dce@mips.UUCP (David Elliott) (10/03/87)
In article <467@ambush.UUCP> kimcm@ambush.UUCP (Kim Chr. Madsen) writes:
>In article <29056@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>>Don't rely on *this* behavior either - check that a string isn't null before
>>handing it to "/bin/test"!
>
>Just how ? Write a specialized c-program "isnull"....er!
>Otherwise you'll have to use test to test whether it should be passed to
>test.... (-;
>
>Well it's two different uses of test...So I'm just picky I think...

You can use the -a (boolean "and" operator) or -o ("or"). That is

    test " $name" != " " -a -d "$name"

will tell you if "$name" is a directory, whether it is empty or not. Also,

    test " $name" = " " -o ! -d "$name"

will tell you if "$name" is not a directory.

In both cases, you use a single invocation of the "test" command, so even old shells without builtin test can work (and if you're worried about the extra fork/exec's, there's even a way to use "case" and shell globbing to find out if a file or directory exists with a single fork).

(Note: I use the construct

    " $name" = " "

because I'm very picky. This is similar to the more common

    $name = ""
    x$name = x

used by many programmers. The former breaks if "$name" is "-d" or something, and the latter breaks if "$name" contains whitespace.)

-- 
David Elliott
{decvax,ucbvax,ihnp4}!decwrl!mips!dce
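[Editorial sketch, not part of the original post.] The same two-part check can be written with two `[` commands joined by the shell's `&&`, which avoids the `-a` operator; as far as I know, later POSIX editions mark `-a` and `-o` as obsolescent. This is a minimal sketch, assuming a shell where `[` is built in; `is_dir` is a hypothetical name.

```shell
#!/bin/sh
# is_dir is a hypothetical name. It succeeds only for a non-empty name that
# refers to a directory, whatever the local kernel does with "".
is_dir() {
    # The -n check runs first, so -d never sees an empty string.
    [ -n "$1" ] && [ -d "$1" ]
}
```

Typical use would be `if is_dir "$name"; then ...; fi`; the function returns false for an empty `$name` even on systems where `test -d ""` is true.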
chris@mimsy.UUCP (10/03/87)
In article <4779@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
>... Some, but most definitely NOT all, versions of Unix explicitly
>check for a null path in namei() and return an error (ENOENT, or
>perhaps EINVAL). All others have the bug; since it's not exactly
>crippling, it's not considered a high priority to fix it, so it stays.

It is also not considered a bug (at least by some), just as the null set is considered a set (at least by some), and zero sized arrays are considered not unreasonable (at least by some [An error? Probably. But what if . . . ?]).

-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
gwyn@brl-smoke.UUCP (10/04/87)
In article <4779@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
>Both "stat" and this sample demonstrate what is basically a bug in the
>kernel routine namei(), which maps a filename to a <device, inode> pair.

"" was not considered a "bug" originally, but a nice feature -- it was specifically mentioned in one of the original UNIX papers. It does have the drawback that it doesn't behave right under string concatenation etc., and it was eventually outlawed (probably in response to naive-user complaints) in the official AT&T releases of UNIX.
boyd@basser.oz (Boyd Roberts) (10/06/87)
In article <15069@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes:
>STAT should never be called with NULL, but it should always work
>when called with a zero length string (e.g. "").
>
>-Ron

Well, I agree with not passing stat() NULL, that's EFAULT city!

As for "", I couldn't disagree more. The zero length string only works due to a quirk in namei(). Do you ever see the ``zero length string'' named file in a directory? No, you don't. But you do see ``.'' & ``..'' etc.

Say what you mean. If you want ``.'' then say it. I see ``.'' in every directory. But, I've never seen the ``zero length string''.

Boyd Roberts
boyd@basser.oz
``When the going gets weird, the weird turn pro...''
latham@ablnc.ATT.COM (Ken Latham) (10/07/87)
> > >Don't rely on *this* behavior either, check that a string isn't null before
> > >handing it to "/bin/test"!
> >
> >Just how ? Write a specialized c-program "isnull"....er!
>
> You can use the -a (boolean "and" operator) or -o ("or"). That is
>
> test " $name" != " " -a -d "$name"
>

The most direct ( and simplest complete ) method I know of is :

    test -n "${name}" -a -d "${name}"

    -n --- test if a string is NOT zero length.
    -z --- test if a string IS zero length.

Don't let the {}'s get to you, they just delineate the variable name.

Ken Latham
AT&T Support
..!ihnp4!ablnc!latham
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/08/87)
In article <1036@basser.oz> boyd@basser.oz (Boyd Roberts) writes:
>I see ``.'' in every directory.

You haven't looked at enough directories, then, particularly using distributed network file systems. On such a system, an "ls" of a directory may not show a "." entry, but you can nonetheless open the current working directory via open(".",0). One could argue for accepting open("",0) the same way, and in fact older UNIXes did. However, somewhere along the way somebody decided that such usage would more probably be a programmer error than intentional, and removed support for it (at least in recent UNIX System V implementations).
franco@MIKEY.BBN.COM (Frank A. Lonigro) (10/09/87)
In article <15069@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes:
>STAT should never be called with NULL, but it should always work
>when called with a zero length string (e.g. "").
>
>-Ron

Boyd Roberts responds:
>Well, I agree with not passing stat() NULL, that's EFAULT city!
>
>As for "", I couldn't disagree more. The zero length string only
>works due to a quirk in namei(). Do you ever see the ``zero length string''
>named file in a directory? No, you don't.
>
>-Boyd Roberts

I agree with Boyd. I was the one who initially brought this subject to unix-wizards and I appreciate all the responses, but I'm afraid no-one has answered my original questions. Everyone has commented on the fact that stat("", &stbuf) (stat of a zero length string) defaults to stat of (DOT) the current directory. This fact was already established prior to my original questions. What I don't believe and understand is that:

1) This is not documented anywhere, BSD4.*, SUN OS, ULTRIX, etc... The fact that stat of zero length string defaults to DOT, that is. I guess maybe I'm supposed to know about the namei() quirk. I'm being sarcastic here! {8-)>

2) Programs like 'ls', 'cd', 'chmod', 'test', 'rmdir', etc....
                                       ^^^^
   have all been coded to rely on this default (this can be verified by doing something like 'ls ""' or 'chmod ""', etc). In my opinion, this is very bad programming practice. And to think that such programs have been coded this way since the beginning of UNIX is hideous. (I realize the previous sentence may be an over-assumption, but I think the quirk in the namei() function has been there ever since the first semi-bug free and run-able UNIX came hot off the presses and since then, systems programmers writing application software for the UNIX OS such as 'ls' and the like relied on this quirk simply because they knew about it in advance and no-one since has bothered to re-code them properly, i.e., checked beforehand if the string they are about to pass to stat is a zero length string.)

So, let me re-phrase my original questions in hopes of getting them answered properly. Why does Joe/Jane Programmer have to know, in advance, that stat("", &stbuf) will default to stat(".", &stbuf)? (A better re-phrasing of the previous question is, Why does ' test -d "" ' return TRUE when the zero length string is clearly not a directory?) And, equally important, Why do I have to do two tests to find out that a shell variable is a directory (and not a zero length string), whether I use a one line invocation of /bin/test and a boolean or not?

I really think that /bin/test should not rely on the default. It should return FALSE in the case of ' test -d "" '. For that matter, all the others should not rely on that default either. Better yet, maybe stat(2) can be recoded to return ERROR if the "path" it was passed is a zero length string. But if for some historical reasons, such programs must remain as relying on the default which implies that stat(2) remains as returning a default, then can the man pages for each state some sort of warning about this fact so future Joe/Jane users will be better informed.

-franco
guy%gorodish@Sun.COM (Guy Harris) (10/10/87)
> 2) Programs like 'ls', 'cd', 'chmod', 'test', 'rmdir', etc....
>                                        ^^^^
> have all been coded to rely on this default (this can be verified
> by doing something like 'ls ""' or 'chmod ""', etc). In my opinion, this is
> very bad programming practice.

In my opinion, this is utter nonsense. There is no reason for all those programs to be coded to explicitly *check* for a null string as an argument; the mere fact that 'ls ""' does not cause an error hardly means that the program has been explicitly coded to rely on this.

Does the fact that the "strcmp" routine does not check whether its arguments are NULL mean that "strcmp" was, say, explicitly coded to rely on the fact that in some C implementations a NULL string, when improperly treated as a string pointer, yields a null string?

Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/10/87)
In article <9721@brl-adm.ARPA> franco@MIKEY.BBN.COM (Frank A. Lonigro) writes:
>I agree with Boyd. I was the one who initially brought this subject to
>unix-wizards and I appreciate all the responses, but I'm afraid no-one
>has answered my original questions.

I don't believe you've understood the responses!

There is NOT usually special code in utilities such as "test" to convert an empty-string filename into ".". Rather, the commands just pass WHATEVER you supply to them, empty string or non-empty string, to the kernel. The KERNEL accepts an empty string as a filename and interprets it as meaning the current working directory. This IS "documented" in some old UNIX papers, but there is NO POINT to trying to document it for specific commands, because it is a generic property of filenames, not of any particular command.

Note that only SOME implementations of the UNIX kernel interpret "" like this, so you should not rely on it.
kre@munnari.oz (Robert Elz) (10/10/87)
In article <9721@brl-adm.ARPA>, franco@MIKEY.BBN.COM (Frank A. Lonigro) writes:
> Everyone has commented on the fact that stat("", &stbuf) (stat of a zero
> length string) defaults to stat of (DOT) the current directory.

Then "everyone" is wrong (not that I think that is really what "everyone" said in the first place).

"" is NOT ".", they are two totally different names (which happen to usually be links to the same file).

"" is the ONLY totally reliable way for a process to access its current directory, "." searches the current directory file for the name "." and normally finds an inode which is usually the current directory. But this is not guaranteed.

In some kernels, the vendor has decided to hard wire in the "." -> "" mapping, so that it can't be changed. That's fine, changing the meaning of "." is not something that intelligent people do (which is probably why I do it from time to time). These systems are typically not really unix.

Some kernels have made the name "" illegal. This was a silly decision, as now there's no one portable way to access the current directory, however it is "ok" if these kernels are also kernels where "." is wired in, as "." then provides a guaranteed way to access the current directory.

Unfortunately, there are kernels which have prohibited "", and yet which haven't wired in ".". In these kernels (including one quite famous and widely distributed version) there is no guaranteed way for a process to access the current directory.

> 1) This is not documented anywhere, BSD4.*, SUN OS, ULTRIX , etc...

Not true. It's quite explicitly stated in the Ritchie & Thompson CACM paper, "The Unix Time-Sharing System", CACM, July 1974, pp 365-375. This has been updated, and reprinted, with the documentation of every variety of the unix system that I've ever seen (there are probably a few that don't include it (the vendor should be bankrupted); BSD and Sun OS aren't in that category, I very much doubt that Ultrix is either).

For anyone who hasn't bothered to read this document, I advise you to suspend all activities related to unix, and go and read it now. It should be the first document all new unix programmers (including shell programmers) read. All users should be encouraged to read it.

I will quote one sentence ...

    As another limiting case, the null file name refers
    to the current directory.

(The other limiting case is the name "/").

Logically, to prohibit "" as a name, one would also have to prohibit "/" as a name, since they are logically equivalent. "/." would be required to reference the root (which of course, isn't really necessarily the root in any case).

> I guess maybe I'm supposed to know about the namei() quirk.

It is not a "quirk" regardless of what some people have claimed, it's a deliberate, and necessary, attribute of the unix file naming scheme. However yes, you *are* supposed to know about the name "" (if you didn't know it already, it should be obvious from a few minutes thought).

> 2) Programs like 'ls', 'cd', 'chmod', 'test', 'rmdir', etc....
> have all been coded to rely on this default

Nonsense. Your examples merely show that none of those commands has been deliberately coded to look for "" as a special case, and do something different (and wrong) with it.

'test -d ""' (the original question) *should* return true, as "" *is* a directory.

kre
jgp@moscom.UUCP (Jim Prescott) (10/11/87)
In article <9723@brl-adm.ARPA> franco@MIKEY.BBN.COM writes:
>Everyone has commented on the fact that stat("", &stbuf) defaults to stat
>of (DOT) the current directory. This fact was already established prior
>to my original questions.

Actually under System V passing a null string to a system call returns ENOENT. I tried playing with a few commands on our NCR Tower (SysV.2)

    /bin/ls ''          replaces null string with .
    /usr/ucb/ls ''      prints "cannot stat "
    test -d ''          false
    csh, -d ''          false

On our V7 system '' is treated as . by the kernel (namei). Things like ls don't treat '' special. The System V ls actually checks for a null string and changes it into a dot. You can tell the difference with ls '' '', SysV prints the dir name as ".:" and V7 as ":".

The ucb ls demonstrates the need for a standard subroutine to print the text message associated with the value of errno, we could call it perror() or something :-)

> 2) Programs like 'ls', 'cd', 'chmod', 'test', 'rmdir', etc....
> have all been coded to rely on this default

Not to rely on it, they just don't care about it. If the kernel says that the file "" is the same as the file "." it really isn't appropriate for chmod to disagree. Having every program trying to error check arguments to system calls would be silly, the kernel is going to check anyway and it will at least be consistent.

> I think the quirk in the namei() function has been there ever
> since the first semi-bug free and run-able UNIX came hot off the

I think it has been in since the beginning. I believe it was considered a feature but wouldn't be surprised if part of the reason was that it was faster, namei was (and to a lesser extent still is) a place where the kernel spends lots of its time.

>So, let me re-phrase my original questions in hopes of getting them
>answered properly. Why does Joe/Jane Programmer have to know, in advance,
>that stat("", &stbuf) will default to stat(".", &stbuf)?

If JJ Programmer only wants to code on SysVid Compliant machines then they can assume stat("",&b) and test -d "" will return ENOENT and false respectively. If they want to port their code to the larger (for now) non-SysV world they need to know that the kernel treats "" like "." simply because it does. Even if JJ considers it to be a bug, the fact that it is present on most Unix machines means that it must be catered to.

>I really think that /bin/test should not rely on the default. It should
>return FALSE in the case of ' test -d "" '.

The fewer assumptions user code makes about filenames the better. If it needs to be fixed it needs to be fixed in the kernel. To be portable you need to make both checks yourself since "" is valid on some systems and illegal on others.

>returning a default, then can the man pages for each state some sort
>of warning about this fact so future Joe/Jane users will be better informed.

Systems should document how their namei behaves and how it differs from other systems (the latter being unrealistic except for a comparison against a standard, I'm not sure if the SVID addresses this or not).

As another example, "normal" Unix systems will accept multiple '/'s wherever one is valid. While this could be the same "bug" in namei, I think that "////etc/////foo" involves two directory scans, not nine. This too is non-portable since some systems use "//vaxa/etc/foo" to specify a super-root directory in a network. Since it isn't documented it is probably illegal on some systems.

Dealing with the filesystem is a rather common function. Since namei defines most of the user interface with the file system it deserves to be documented in full. Tell what's legal, illegal and both (illegal but happens to work).

-- 
Jim Prescott
moscom!jgp@cs.rochester.edu
{rutgers,ames,cmcl2}!rochester!moscom!jgp
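[Editorial sketch, not part of the original post.] The kind of per-system experiment described above is easy to repeat. The following is a minimal sketch, assuming a POSIX shell with `test` and `ls` available; `probe_empty_path` is a hypothetical name, and the output differs from system to system, which is the whole point, so no expected result is given.

```shell
#!/bin/sh
# probe_empty_path is a hypothetical name. It only reports what the local
# system does with an empty path; it makes no claim about other systems.
probe_empty_path() {
    if test -d ''; then
        echo "test -d '': true"
    else
        echo "test -d '': false"
    fi
    # ls may reject '' itself, or pass it to the kernel, or rewrite it.
    if ls '' >/dev/null 2>&1; then
        echo "ls '': accepted"
    else
        echo "ls '': rejected"
    fi
}

probe_empty_path
```

A script that must run on both kinds of system cannot rely on either answer, which is why the explicit non-empty check is still needed.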
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/11/87)
In article <1849@munnari.oz> kre@munnari.oz (Robert Elz) writes:
>"" is NOT ".", they are two totally different names (which happen to usually
>be links to the same file).

Elz is usually right, but this time he isn't. I've never seen a UNIX implementation where the empty string was an actual link. A link IS an entry in a directory file.

Many (all older) implementations have explicit links for "." and ".." in every correctly-formed directory. It is possible, if a directory is somehow corrupted, for either of these structural links to be lost or to acquire the wrong i-number. Back when "mkdir" was done in user mode by the sequence mknod(), link(,"."), link(,"..") (which required superuser privilege in order to plant links to directories), if "mkdir" was interrupted in the middle of the sequence one could get a corrupted directory structure. It's harder for that to happen on newer systems where all the "mkdir" actions are done ("atomically", cough, cough) inside the kernel. By the way, "fsck" can usually fix corrupted "." and ".." links.

Some newer implementations, such as perhaps NFS involving an MS/DOS remote system, do not have "." and ".." entries in directories. I understand that at least one H-P UNIX implementation is like this, too. However, such systems can upon request open(".",0), due to special kernel code (in what is usually referred to as the namei() function: name-to-inumber mapper) that "knows" what is meant by ".". (Similarly for "..", but that's not involved in the current discussion.)

>"" is the ONLY totally reliable way for a process to access its current
>directory, "." searches the current directory file for the name "." and
>normally finds an inode which is usually the current directory. But
>this is not guaranteed.

"." IS guaranteed to refer to the current directory, unless your filesystem is corrupted, in which case you should fix it (fsck etc.).

>In some kernels, the vendor has decided to hard wire in the "." -> ""
>mapping, so that it can't be changed. That's fine, changing the meaning
>of "." is not something that intelligent people do (which is probably why
>I do it from time to time). These systems are typically not really unix.

Well, you're running afoul of POSIX, SVID, and probably every other formal description of the UNIX hierarchical filesystem structure when you deliberately mangle your "." entries. The reason that linking to directories was deliberately restricted to privileged processes was to prevent exactly this sort of structural corruption.

>Unfortunately, there are kernels which have prohibited "", and yet which
>haven't wired in ".". In these kernels (including one quite famous and
>widely distributed version) there is no guaranteed way for a process to
>access the current directory.

"." is the only advertised way to access the current working directory without keeping track of the path used to get there. Under normal circumstances, there is no way for a normal user process to subvert the advertised meaning of ".". (A privileged (UID==0) process can of course do anything at all, including completely mangling the filesystem or replacing the kernel with an alternative that looks nothing like UNIX.)

> As another limiting case, the null file name refers
> to the current directory.

Yes, this is the explicit mention of "" that I was referring to when I said that it wasn't just an oversight. However, this property is no longer true for many existing implementations.

>Logically, to prohibit "" as a name, one would also have to prohibit "/"
>as a name, since they are logically equivalent. "/." would be required
>to reference the root ...

Sorry, I don't see any "logical equivalence". In fact, as you must know, many people are successfully using systems where "" doesn't work but "/" does access the root of the filesystem hierarchy.

>It is not a "quirk" regardless of what some people have claimed, its
>a deliberate, and necessary, attribute of the unix file naming scheme.

It does fit the scheme, and makes sense, but it's by no means "necessary". I remember discussing this business with Dennis Ritchie once, and as I recall, his response was that at first he was quite annoyed at "" being outlawed in System V, but after hearing some of the arguments, he reluctantly granted that that position had some validity. The main argument was that the empty string is invisible when used as a component of a path; although "dir/" (with the null string after /) works fine, "/subdir" (with the null string before the slash) doesn't work right, whereas "./subdir" works fine. This shows that "." is SAFER to use as a directory name than "", since lots of applications will construct path names by taking a directory name and appending / and a subdirectory name. The only time "." would be less safe is if somebody has been deliberately subverting the safeguards intended to guarantee the ".", ".." structural integrity.
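[Editorial sketch, not part of the original post.] The concatenation argument above is easy to see in a script. This is a minimal sketch, assuming a POSIX shell; `join_naive` and `join_dot` are hypothetical names, and the `${1:-.}` default expansion is ordinary shell syntax used here to substitute "." for an empty directory name.

```shell
#!/bin/sh
# join_naive and join_dot are hypothetical names for illustration only.

# Naive join: an empty directory name silently becomes the root.
join_naive() {
    printf '%s\n' "$1/$2"
}

# Same join, but an empty (or unset) directory name is replaced by ".".
join_dot() {
    printf '%s\n' "${1:-.}/$2"
}

join_naive "" subdir   # prints "/subdir"
join_dot   "" subdir   # prints "./subdir"
join_dot   dir subdir  # prints "dir/subdir"
```

The first result is the "invisible component" problem: the empty name vanishes and the path becomes absolute.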
guy%gorodish@Sun.COM (Guy Harris) (10/12/87)
> I think it has been in since the beginning. I believe it was considered
> a feature but wouldn't be surprised if part of the reason was that it was
> faster,

Probably not; the time spent testing one byte of the string is minimal compared to the time spent doing the rest of "namei".

> If JJ Programmer only wants to code on SysVid Compliant machines then
> they can assume stat("",&b) and test -d "" will return ENOENT and false
> respectively.

Guess again; the SVID says nothing more than "a null string is undefined and *may* be considered an error" (emphasis mine). There will certainly be SVID-compliant systems that treat a null string as a synonym for ".".

Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com
ka@uw-june.UUCP (Kenneth Almquist) (10/14/87)
> 1) The fact that stat of zero length string defaults to DOT...is
> not documented anywhere, BSD4.*, SUN OS, ULTRIX , etc...

In the 4.3 BSD manual, under INTRO(2), in the section labelled "Definitions", is a definition of "Path Name" which states in part: "A null path name refers to the current directory."

> Why does Joe/Jane Programmer have to know, in advance,
> that stat("", &stbuf) will default to stat(".", &stbuf)? (A better
> re-phrasing of the previous question is, Why does ' test -d "" ' return TRUE
> when the zero length string is clearly not a directory?)

Another rephrasing of the question is "Why does Joe/Jane Programmer have to know, in advance, about path names to test whether a path name refers to a directory?" The zero length string clearly (:-)) refers to a directory.

> ...maybe stat(2) can be recoded to return ERROR if the "path" it was
> passed is a zero length string.

This is a reasonable suggestion. As I understand it, originally the null string was used to refer to the current directory, *except* that a null string at the beginning of a file name containing a slash referred to the root directory. (Thus "a//b" referred to "a/b", but "/b" referred to "{root directory}/b".) Exceptions like this are painful; and eventually "." was added as an alternative way to refer to the current directory. The advantage of "." over "" is that when you append the string "/b" to the string ".", you get what you expect: a path name referring to a file in the directory ".". But the old way of referring to the current directory was retained; presumably for backward compatibility.

The result is that people like Frank are told that "." refers to the current directory, but are not told about the alternative "" since "" is presumably obsolete. It's easy to say RTFM, but there really should not be two ways to refer to the current directory. One of the advantages of UN*X is that early in its life it had a small installation base which made it easier to remove features. Too bad "" wasn't removed.

Kenneth Almquist
ka@uw-june.UUCP (Kenneth Almquist) (10/14/87)
>> "" is the ONLY totally reliable way for a process to access its current
>> directory, "." searches the current directory file for the name "." and
>> normally finds an inode which is usually the current directory. But
>> this is not guaranteed.
>
> "." IS guaranteed to refer to the current directory, unless your
> filesystem is corrupted, in which case you should fix it (fsck etc.).

Under either System V or 4.3 BSD, do:

    mkdir junk
    cd junk
    rmdir ../junk
    ls .

This will tell you that "." does not exist, because under these versions of UN*X the entry for "." is removed from a directory when rmdir runs, rather than when all the processes using the directory finally finish. Does this mean that System V and 4.3 BSD don't conform to POSIX?

Kenneth Almquist
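[Editorial sketch, not part of the original post.] The four-command experiment above can be wrapped so that it cleans up after itself and can be rerun on any system. This is a minimal sketch, assuming a shell with `mktemp -d` available (widely supported, but not something the thread mentions); `removed_cwd_demo` is a hypothetical name. The result depends on the kernel and filesystem, so the function only reports the exit statuses it saw.

```shell
#!/bin/sh
# removed_cwd_demo is a hypothetical name. The body is a subshell, so the
# "cd" calls do not change the caller's working directory.
removed_cwd_demo() (
    tmp=$(mktemp -d) || exit 1       # scratch area; mktemp -d is assumed
    mkdir "$tmp/junk" || exit 1
    cd "$tmp/junk" || exit 1
    rmdir ../junk 2>/dev/null        # some systems refuse to remove the cwd
    rmdir_status=$?
    ls . >/dev/null 2>&1             # does "." still resolve?
    ls_status=$?
    cd / && rm -rf "$tmp"            # leave the scratch area and clean up
    printf 'rmdir=%s ls=%s\n' "$rmdir_status" "$ls_status"
)

removed_cwd_demo
```

A non-zero `ls` status after a successful `rmdir` corresponds to the behavior described in the post; other combinations simply mean the local system handles a removed working directory differently.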
kre@munnari.oz (Robert Elz) (10/14/87)
In article <6555@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn) writes:
> In article <1849@munnari.oz> kre@munnari.oz (Robert Elz) writes:
> >"" is NOT ".", they are two totally different names (which happen to usually
> >be links to the same file).
>
> Elz is usually right, but this time he isn't. I've never seen a UNIX
> implementation where the empty string was an actual link.

Yes, sorry, I meant, and should have said, that "." is usually a link to
the current directory, which is represented by "". "" is not a link itself,
of course; it's a method to access the kernel's saved internal
representation of the current directory, whereas "." searches that
directory for the name "." and uses whatever that happens to be, which is
usually the same thing.

> Under normal
> circumstances, there is no way for a normal user process to subvert the
> advertised meaning of ".". (A privileged (UID==0) process can of course
> do anything at all, including completely mangling the filesystem or
> replacing the kernel with an alternative that looks nothing like UNIX.)

Yes, without doubt, using "." is correct in 99% of cases (100% if you know
that you're using a kernel that guarantees it, as it's reported that some
HP kernels do). However, uid==0 does exist, and there are occasions where
processes need to be able to work in spite of what the super-user might
have done.

> >Logically, to prohibit "" as a name, one would also have to prohibit "/"
> >as a name, since they are logically equivalent. "/." would be required
> >to reference the root ...
>
> Sorry, I don't see any "logical equivalence". In fact, as you must
> know, many people are successfully using systems where "" doesn't work
> but "/" does access the root of the filesystem hierarchy.

I meant "logically equivalent", not "couldn't be otherwise". To interpret
"/" correctly you must use the same reasoning as is used to interpret "".
That is, "start at the root directory (the kernel's internal idea of what
that happens to be) and then go nowhere". Just as "" is "start at the
process's current working directory, and then go nowhere". Obviously,
kernels can implement whatever they like; their existence doesn't imply
logical consistency.

> The main
> argument was that the empty string is invisible when used as a component
> of a path; although "dir/" (with the null string after /) works fine,
> "/subdir" (with the null string before the slash) doesn't work right,
> whereas "./subdir" works fine.

Yes, this is a potential programming problem; programs do need to check
for "" and handle it differently. Unfortunately, that's true of many
things, and outlawing them all isn't the way to fix this; education is.

I believe that "/" suffers the same problem in some implementations,
doesn't it? Adding "/subdir" after "/" doesn't always work. I.e., what is
produced is "//subdir", which is fine in most Unixes but outlawed in some
others (or has a different meaning). Again, using "/.", the alternative
(usually correct) name for the root, will fix this, producing "/./subdir",
which is (should be) always legal.

Programs just have to deal with all these cases. A "pathcat" C library
function that would do the right thing in all cases would be a useful
addition.

I also thought that some kernels had outlawed "dir/", with the null string
there being treated much like that in "". Silly, but if "" is going to be
illegal, then it is consistent to outlaw the other case as well.

kre
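
What such a "pathcat" routine might look like can be sketched roughly as follows. Only the name comes from the post above; the signature, the caller-supplied buffer, and the decision to treat an empty directory name as the current directory are assumptions made for illustration, not an existing C library interface.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical pathcat(): join a directory name and a component into buf.
 * Assumptions (not taken from any real library):
 *   - an empty dir means the current directory, so the component is
 *     copied as-is instead of gaining a leading '/';
 *   - a dir that already ends in '/' (including "/") gets no extra slash.
 * Returns 0 on success, -1 if an argument is NULL or buf is too small.
 */
int pathcat(char *buf, size_t size, const char *dir, const char *name)
{
    size_t dlen;
    int n;

    if (buf == NULL || dir == NULL || name == NULL || size == 0)
        return -1;

    dlen = strlen(dir);
    if (dlen == 0)
        n = snprintf(buf, size, "%s", name);         /* "" + "subdir" */
    else if (dir[dlen - 1] == '/')
        n = snprintf(buf, size, "%s%s", dir, name);  /* "/" + "subdir" */
    else
        n = snprintf(buf, size, "%s/%s", dir, name); /* "dir" + "subdir" */

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With these choices the empty directory name and "/" both avoid the "/subdir" and "//subdir" results discussed above; a system that rejects "" could instead have the first branch produce "./subdir".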
boyd@basser.oz (Boyd Roberts) (10/14/87)
In article <6532@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >In article <1036@basser.oz> boyd@basser.oz (Boyd Roberts) writes: >>I see ``.'' in every directory. > >You haven't looked at enough directories, then, particularly using >distributed network file systems. Since when has your ``network file system'' supported UNIX semantics? We are, after all, talking about UNIX file-system semantics. UNIX directory semantics state that there's a ``.'' and a ``..''. Both versions of ``mkdir'' (the program & the system call) ensure that these two entries are created WITH the directory. So, what does your ``network file system'' do? Is it a UNIX file-system? I mean, where IS the beef? As for the ``test ""'' nonsense, RTFM. Ever used ${dir-.}? Boyd Roberts boyd@basser.oz ``When the going gets wierd, the wierd turn pro...''
jc@minya.UUCP (John Chambers) (10/14/87)
In article <9723@brl-adm.ARPA>, franco@MIKEY.BBN.COM (Frank A. Lonigro) writes: > In article <15069@topaz.rutgers.edu> ron@topaz.rutgers.edu (Ron Natalie) writes: > >STAT should never be called with NULL, but it should always work > >when called with a zero length string (e.g. ""). > > Boyd Roberts responds: > >As for "", I couldn't disagree more. The zero length string only > >works due to a quirk in namei(). Do you ever see the ``zero length string'' > >named file in a directory? No, you don't. One of my favorite (only partly-)joking ways of explaining to non-computer people why it is that computers are so screwey is to explain that, while the Arabs taught Europeans that 0 was a useful number well over 1000 years ago, the fact is still not accepted by most of the computing field! The funniest examples appear when you look at the way strings of zero length are handled by most software. These days, you'd think it was funny if an accountant thought that 0 was an invalid balance on an account. But most programmers think that 0 is an invalid length for a string of characters, and advocate software that rejects such nonsense (while writing software that blows up on null strings). This discussion shows strong signs of being full of yet more good examples of such pre-0 thinking. In fact, null strings are (and should be) perfectly valid strings, and usually have a clear meaning. For instance, if string s1 is a name of some directory, and s2 is a name of something in that directory, then you should be able to get the name for that something by catenating s1, "/", and s2. Consider the case where s1="foo/bar", and s2="". The result of this operation is "foo/bar/". This should be a valid name of something that is (or at least could be) in "foo/bar". What is that thing? Well, "foo/bar/" could be a valid name of a file within foo/bar, i.e., an entry with a nonzero inode and all the name chars filled with nulls. 
If this is for some reason considered unacceptable, there is only one other reasonable interpretation, namely that a final '/' on a file name is redundant (just like the second '/' in "foo//bar" is redundant), and the name "foo/bar/" is a synonym for "foo/bar". But if this is true, then it is by transitivity also a synonym for "foo/bar/.", "foo/bar/./." and so on. From this we conclude that "" is synonymous with ".", "./.", and so on. Similarly, we could observe that by catenating "ls " and s1, we expect the resulting command "ls foo/bar" to give a listing of foo/bar. If we apply the same reasoning, we expect that cat("ls " s2) = "ls " will give us a listing of the directory that s2 names. But "ls " lists the current directory. So "" again appears to be a name for the current directory. In general, I think that the inventors of Unix are to be commended for not giving in to the usual mathematical illiteracy of the rest of the computer field, and giving us a system that usually treats zero in a sensible manner. I've done time working on IBM mainframes, where for instance you can't have an empty record in a file or a file with no records. This means special code in your programs to check for null output records and pad them with a space. It means special code to make sure that you write at least one record (with one space) to an output file, which implies keeping track of how much you've written. It means that, on input, a program has to be prepared for records that have been kludged up with extra white space. It means a whole lot of hassle in general that could have been avoided if they had just realized that zero is a valid length of a string, and written their routines to handle the zero case correctly. For that matter, try writing an empty output file on VMS sometime. > have all been coded to rely on this default(this can be verified > by do something like 'ls ""' or 'chmod ""', etc). In my opinion, is > very bad programming practice. 
And to think that such programs > have been coded this way since the beginning of UNIX is hideous. No, it's just a sign of mathematical sophistication. (:-) -- John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/14/87)
In article <3283@uw-june.UUCP> ka@uw-june.UUCP (Kenneth Almquist) writes:
> mkdir junk
> cd junk
> rmdir ../junk
> ls .
>This will tell you that "." does not exist ...

How is this situation supposed to differ from

    mkdir junk
    rmdir junk
    ls junk

That will tell you that "junk" does not exist. And the system is right
both times. When I said that "." is guaranteed to always exist, I of
course did not mean that it exists as an entry in a directory that itself
does not exist -- how could a non-existent directory be considered to have
ANY entries?

This doesn't seem to have any bearing on the "" vs. "." issue. If the
directory supposedly so denoted doesn't exist, the form of the denotation
doesn't change that fact.
franco@MIKEY.BBN.COM (Frank A. Lonigro) (10/14/87)
Thanks very much to those of you who answered my questions with intelligence. I have a couple more. Just thought I'd let you know that I found the manual page on the description of a path name but found it very contradictory. Here is an excerpt from the BSD manual page: File Name Names consisting of up to {FILENAME_MAX} characters may be used to name an ordinary file, special file, or directory. These characters may be selected from the set of all ASCII characters excluding 0 (null) and the ASCII code for / (slash). (The parity bit, bit 8, must be 0.) Note that it is generally unwise to use *, ?, or [ ] as part of file names because of the special meaning attached to these characters by the shell. Path Name A path name is a null-terminated character string starting with an optional slash (/), followed by zero or more directory names separated by slashes, option- ally followed by a file name. The total length of a path name must be less than {PATHNAME_MAX} characters. If a path name begins with a slash, the path search begins at the root directory. Otherwise, the search begins from the current working directory. A slash by itself names the root directory. A null pathname refers to the current directory. According to the naming convention of file names, all ASCII chars can be used except a NULL or the slash. This convention includes the naming of directories. So, how can a null string be considered a file name, let alone a directory, and why should there be two ways to refer to the same thing (the current directory)? It really doesn't make sense! Sorry for more questions, and thanks for pointing me to the documentation. -franco
mike@turing.unm.edu.unm.edu (Michael I. Bushnell) (10/15/87)
In article <9767@brl-adm.ARPA> franco@MIKEY.BBN.COM (Frank A. Lonigro) writes:
~
~ Thanks very much to those of you who answered my questions with
~intelligence. I have a couple more.
~
~ Just thought I'd let you know that I found the manual page on
~the description of a path name but found it very contradictory. Here
~is an excerpt from the BSD manual page:
~
~ File Name
~ Names consisting of up to {FILENAME_MAX} characters may
~ be used to name an ordinary file, special file, or
~ directory.
~
~ These characters may be selected from the set of all
~ ASCII characters excluding 0 (null) and the ASCII code
~ for / (slash). (The parity bit, bit 8, must be 0.)
~
~ Note that it is generally unwise to use *, ?, or [ ] as
~ part of file names because of the special meaning
~ attached to these characters by the shell.
~
~ Path Name
~ A path name is a null-terminated character string
~ starting with an optional slash (/), followed by zero
~ or more directory names separated by slashes, option-
~ ally followed by a file name. The total length of a
~ path name must be less than {PATHNAME_MAX} characters.
~
~ If a path name begins with a slash, the path search
~ begins at the root directory. Otherwise, the search
~ begins from the current working directory. A slash by
~ itself names the root directory. A null pathname
~ refers to the current directory.
~
~According to the naming convention of file names, all ASCII chars can be
~used except a NULL or the slash. This convention includes the naming of
~directories. So, how can a null string be considered a file name, let
~alone a directory, and why should there be two ways to refer to the same
~thing (the current directory)? It really doesn't make sense!
~
Look back at the good posting about 0 length strings. A pathname with
length zero is NOT the same as a pathname with length one, and the
first char being NULL.
The reason for not allowing NULL in pathnames is, I hope, obvious. It
is also the terminator for strings. A null string is a string of
length zero. According to the rules in the second section above, it has
[no leading slash][no directory names (sep. by '/')][empty final
comp.]
The rules clearly allow "" as a pathname, and specify that it must
mean cwd.
But "" is NOT "\0".
Michael I. Bushnell
a/k/a Bach II
mike@turing.unm.edu
{ucbvax,gatech}!unmvax!turing!mike
---
WHO sees a BEACH BUNNY sobbing on a SHAG RUG?!
-- Zippy the Pinhead
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/15/87)
In article <673@unmvax.unm.edu> mike@turing.unm.edu.UUCP (Michael I. Bushnell) writes: >But "" is NOT "\0". What the hell are you talking about? If you write "" at the C source code level, you have specified an array of one char containing just a 0 byte; "\0" generates a 2-long char array containing two 0 bytes. These will both be treated the same by any function that handles null-terminated strings; in both cases the strings are considered to contain 0 significant characters (not including the terminator). At the shell level, a "" argument to a command turns into a similar null-terminated string containing 0 characters before the terminator, and that is the form seen in the argv[] array by the command's main() function. A "\0" argument normally is the same as "0" and therefore contains one character (host code for a printable 0 digit, e.g. 48 in ASCII) followed by a null terminator. This doesn't seem to be what you were referring to. In summary, an "empty string" contains 0 significant characters. If it is encoded as a null-terminated string, then it has nothing stored before the terminator. This is much like the mathematical notion of an empty set. You cannot determine whether an empty string denotes a valid filename or not from syntactic considerations; it does if the kernel interprets it that way and it doesn't if the kernel doesn't. Both types of UNIX kernel behavior are widespread. As to whether an empty string "should" have meaning as a pathname, and what meaning it should have, that is a matter of opinion and judgement. The original intent was for "" by itself to refer to the current working directory; an argument could be made that this is just an instance of a general algorithm for interpreting pathnames. But an argument can also be made that its meaning is not unambiguously determined by the general rules, so that a special rule is necessary to uniquely nail down the meaning of "". Different systems nailed it down differently.
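
The distinction drawn above between storage size and significant characters can be checked with a few lines of C. This is only an illustrative sketch; the names empty_literal_bytes, nul_literal_bytes and significant_chars are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Bytes occupied by each string literal, terminator included. */
static const size_t empty_literal_bytes = sizeof("");    /* one 0 byte  */
static const size_t nul_literal_bytes   = sizeof("\0");  /* two 0 bytes */

/*
 * Number of significant characters, as any function that handles
 * null-terminated strings sees it: everything before the first 0 byte.
 */
size_t significant_chars(const char *s)
{
    return strlen(s);
}
```

The two literals differ in size (1 byte versus 2), yet significant_chars() returns 0 for both, which is why a kernel handed either one sees the same empty pathname.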
chris@mimsy.UUCP (Chris Torek) (10/16/87)
>In article <673@unmvax.unm.edu> mike@turing.unm.edu.UUCP (Michael
>I. Bushnell) writes:
>>But "" is NOT "\0".

In article <6569@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>What the hell are you talking about?

Colloquial usage confounds and confuses. To communicate in C:

    ""              A `zero length string', actually an array
                    containing one byte, the ASCII NUL (code 0/0
                    in ASCII tables) (this assumes your machine
                    uses ASCII). This is also called `the null
                    string'. The code NUL is also called the
                    `null character'.

    (char *)0       A null-valued string, or a nil string, or
    (char *)NULL    the nil pointer to char.

Note that both of these use the word `null'. We have three things
called *null*, namely NULL, NUL, and "". Only the last is a *string*.
When properly cast, the first `looks like a string' from the outside.
NUL is a character and never looks like a string.

Now then, one `null' that is an invalid pathname is (char *)NULL.
This is a value that is guaranteed not to point to any valid data
object. On a 4BSD Vax, it happens to point to an (invalid of course)
data object that is nonetheless accessible *and* starts with a zero
byte, so that it looks for all the world just like the former null
string. *This is mere happenstance*.

The other `null' that is (sometimes) an invalid pathname is the
null string, "". This is a perfectly valid pointer, pointing to a
zero byte---the character NUL.

On some machines, therefore, open("") and open((char *)NULL) do
the same thing. This is an accident of the implementation. They
are not required to do the same thing. On a 4BSD machine, the
former is required to work, and accesses the current directory. On
some 4BSD machines, the latter happens to do the same; on others,
it returns an EFAULT error.

Summary: the *null string* `""' is a pathname. The *nil pointer
to char* `(char *)NULL' is not a pathname. The *NUL character* in
position 0/0 on ASCII charts is not even a string.
-- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
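
A program that wants to keep these cases apart before calling open() or stat() could do something like the following. This is a minimal sketch; the enum and the function name classify_path_arg are hypothetical, chosen only to mirror the terms used in the summary above.

```c
#include <stddef.h>

enum path_kind {
    PATH_NIL_POINTER,   /* (char *)NULL: not a string at all        */
    PATH_NULL_STRING,   /* "": a valid pointer to a single NUL byte */
    PATH_NONEMPTY       /* at least one character before the NUL    */
};

/* Tell the nil pointer, the null string and ordinary strings apart. */
enum path_kind classify_path_arg(const char *path)
{
    if (path == NULL)
        return PATH_NIL_POINTER;  /* checked first: never dereferenced */
    if (path[0] == '\0')
        return PATH_NULL_STRING;
    return PATH_NONEMPTY;
}
```

Checking the pointer before looking at the first byte is what keeps the program from relying on the 4BSD Vax accident described above.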
trt@rti.UUCP (Thomas Truscott) (10/16/87)
In article <6565@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes: > In article <3283@uw-june.UUCP> ka@uw-june.UUCP (Kenneth Almquist) writes: > > mkdir junk; cd junk; rmdir ../junk > > ls . > >This will tell you that "." does not exist ... >... >This doesn't seem to have any bearing on the "" vs. "." issue. Of course it has bearing! The (side) issue is: is "" or "." more properly the name of the process' working directory? The answer is "" (on systems that do not reject it) as there are cases where the directory entry "." is inaccessible. Here are some other arguments we have heard: "Do you ever see the ``zero length strength'' named file in a directory? No, you don't." (Roberts) Do you ever see the ``/'' named file in a directory? "... why should there be two ways to refer to the same thing (the current directory)? It really doesn't make sense!" (Lonigro) It is true that "/" is "/." is "/./." is "/tmp/.." but it would be painful to disallow "." and ".." in pathnames. "It does have the drawback that it doesn't behave right under string concatenation etc., and it was eventually outlawed (probably in response to naive-user complaints) in the official AT&T releases of UNIX." (Gwyn) I agree. The "" pathname was outlawed not (1) to avoid the oft-cited anomaly in concat(dir, "/", component) but rather (2) to help programmers who use uninitialized or unchecked values. Note that concat(uninitialized_value, "/", component) is a program bug, not The Anomaly, and disallowing "" as a pathname may not fix the bug. Face it, outlawing "" is like outlawing filenames with whitespace in them, both are justifiable but both are blemishes in the Great Plan. DOT AND DOT-DOT IN THE UTILITARIAN UNIX SOCIETY We have learned that "" has no role in UNIX of the future, but what of "." and ".."? In days of yore they were simple directory entries and the kernel paid them no special attention. 
As Ken Thompson put it, the notions of working and parent directories "just fell out" of the mknod system call. But mknod, though elegant, lacked utility (perhaps because it, like creat, lacked a final "e"). And so the kernel came to watch over the dot and the dot-dot. The foremost concern was for the keepers of the mount points, but anon came mkdir and rmdir and then the systems opened. At its highest manifestation (NFS) the incantations of ".." and '.' verily reverberate through the great pathways of the kernel. And now we see that the kernel pays so much attention to the semantics of the dot and the dot-dot that the actual directory entries are once again paid no special attention. So I suggest that we banish "." and ".." in the filesystem of the future and make them as the "/", invisible spirits to be called from the vasty deep. Tom Truscott
neilb@elecvax.eecs.unsw.oz (Neil F. Brown) (10/16/87)
Some thoughts about file names - "" in particular.

On a level 7 system (which we still use)

    chmod -x .    # silly, but possible and I have seen it happen
                  # as in chmod -x * .*
    ls -l
    <some sort of error about not being able to access >
    # thinks "damn"
    chmod +x .
    <error, can't access . - after all, that would require searching
     current directory, which doesn't have x perm>
    # damn, can't remember where I was
    pwd
    <sorry, can't access ..>
    chmod +x ""
    # this works as "" IS the current directory, no directory search needed

This is what first convinced me that "", not ".", was really the current
directory.

On BSD4.2 it goes much the same way until you get to

    chmod +x ""

This don't work on BSD!! I was shocked. Is NOTHING sacred? Now I HAVE to
remember where I am. Such is "progress".

Also, in V7, EVERY null terminated string was a potentially valid path
name. 4.?BSD disallowed setting the 8th bit (though this may change when
the realities of international character sets sink home). SVr2(?)
disallows "" - bad to worse. Though I'm not sure, did they? Can I say

    ln file ""

After all, an error of ENOENT sort of means I could create an entry with
that name. Unfortunately, I don't have SV here. So now, not every string
is potentially valid, so the universe is less general. Such is progress.

And what about the file name

    "foo/"

One of the original documents ("The Unix File System"?) states that
trailing slashes are stripped, so this is equivalent to "foo", or was. On
a BSD system, try

    echo */

It probably only lists the directories. (It depends on the shell.) On
4.2BSD at least, "foo/" is only accessible if foo is a directory.
Personally, I prefer these semantics. But there is still a funny. Try

    rmdir foo/

You get an error message like

    rmdir: foo/: Is a directory.

Well, I know it's a directory, that's why I used rmdir!!

    rmdir foo

of course works.

After considering all of this, I came to the conclusion that the best
semi-formal semantics for Unix file names was

    SLASH = '/'+     # a non empty string of slashes
    NAME = [^/]+     # a non empty string of non-slash (non \0) chars

    A "file" is essentially a byte-stream (+ seek+ioctl+fcntl+...)
    A "directory" is a mapping from NAMEs to "file"s,
      i.e. a "directory" is a function from the space of NAMEs
      to the space of "file"s

    path = filename          # in which case path refers to a file/device/
                             # socket/stream/etc. a "file"
         | dirname           # path refers to a "directory"
         .
    dirname = SLASH          # path refers to the root directory
                             # (for the process)
            | <empty>        # path refers to current (working) directory
            | filename SLASH # The file named is to be interpreted in some
                             # system dependent fashion as defining a
                             # "directory" function. The path refers to
                             # that "directory".
            .
    filename = dirname NAME  # the function dirname is applied to the NAME
                             # to produce a "file"
             .

In this system, "foo" is technically different from "foo/". You could
reasonably make a read(2) on "foo" return the system dependent
representation of the directory, while a read on "foo/" returns the NAMEs
(null terminated) which the directory will successfully map (i.e. put
readdir into the kernel). Of course, this last thought would break much,
so it probably ain't worth the effort.

Now I'm not saying this is the way it IS, anywhere. I just think that it's
a particularly clean way to define the semantics of path names. If anyone
has a different, complete, semi-formal definition, I would love to see it.

Happy arguing.

NeilBrown (Organisation, address, etc in the header where they belong)
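
One consequence of the grammar above is that every null-terminated string falls into exactly one production, which can be decided from the string alone. The following is a rough sketch of such a classifier; the enum and the function name classify_path are invented for the example, and it performs no file system access at all.

```c
#include <string.h>

enum path_class {
    PC_ROOT_DIR,     /* dirname = SLASH           (nothing but slashes)  */
    PC_CURRENT_DIR,  /* dirname = <empty>                                */
    PC_DIRNAME,      /* dirname = filename SLASH  (trailing slash)       */
    PC_FILENAME      /* filename = dirname NAME   (ends in a NAME)       */
};

/* Decide which production of the semi-formal grammar a path matches. */
enum path_class classify_path(const char *path)
{
    size_t len = strlen(path);

    if (len == 0)
        return PC_CURRENT_DIR;
    if (strspn(path, "/") == len)
        return PC_ROOT_DIR;
    if (path[len - 1] == '/')
        return PC_DIRNAME;
    return PC_FILENAME;
}
```

Under this reading "foo" and "foo/" land in different classes, as the post intends, while "/" and "//" both name the root.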
ekrell@hector.UUCP (Eduardo Krell) (10/16/87)
In article <1796@rti.UUCP> trt@rti.UUCP writes: >It is true that "/" is "/." is "/./." is "/tmp/.." Only if /tmp is not a symbolic link, or, if it is a symbolic link, if you use my ".." interpretation. Sorry, guys, I couldn't resist ... Eduardo Krell AT&T Bell Laboratories, Murray Hill {ihnp4,seismo,ucbvax}!ulysses!ekrell
john@frog.UUCP (John Woods, Software) (10/16/87)
In article <9721@brl-adm.ARPA>,franco@MIKEY.BBN.COM (Frank A. Lonigro) writes:
>>In article<15069@topaz.rutgers.edu>ron@topaz.rutgers.edu(Ron Natalie)writes:
>>>STAT should never be called with NULL, but it should always work
>>>when called with a zero length string (e.g. "").
>>>-Ron
>>As for "", I couldn't disagree more. The zero length string only
>>works due to a quirk in namei(). Do you ever see the ``zero length string''
>>named file in a directory? No, you don't.
>>-Boyd Roberts
> What I don't believe and understand is that...:
> 2) Programs like 'ls', 'cd', 'chmod', 'test', 'rmdir', etc....
> have all been coded to rely on this default(this can be verified
> by do something like 'ls ""' or 'chmod ""', etc).

Wondrously false. This is NOT verified by 'ls ""', because ls is just
handing the kernel that string ""; ls has no hand in it whatsoever.

> So, let me re-phrase my original questions in hopes of getting them
> answered properly. Why does Joe/Jane Programmer have to know, in advance,
> that stat("", &stbuf) will default to stat(".", &stbuf)?

Personally, I think that "" being equivalent to "." is a handy feature, if
you don't trust the convention that "." will be present in every directory
(it isn't at all hard to curdle things so this isn't true).

A story: UNOS (CRDS's SVID-compliant operating system) used to allow ""
as a synonym for ".", because it made the code easier. One of our
customers complained that this was a bug, because their application had a
code section roughly (VERY roughly, I never saw their code, just the
complaint):

    while (fred == NULL) {
        printf("What file should I use? ");
        gets(buffer);
        if ((fred = fopen(buffer, "r")) == NULL)
            printf("no such file.\n");
    }

I argued strenuously that their application should check for the user
just typing newline, and probably ought to do some sanity checking of the
file that gets opened (what if they type "."???). Unfortunately, others
here felt that calling "" an illegal file name was a good idea, and would
placate this idiot at the same time.

So where does this leave J. Programmer? If a shell script is to work on
System V, then "" is the name of no file, and can't be depended on. If a
shell script is to work on 4.nBSD or Version 6 or Version 7, then "" *IS*
the name of a file, and can't be depended on NOT TO EXIST. If you are on a
system where "" names a file, and you don't want your software to let ""
name a file, then you have to special case your code to not allow "".

--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw@eddie.mit.edu

"Cutting the space budget really restores my faith in humanity. It
eliminates dreams, goals, and ideals and lets us get straight to the
business of hate, debauchery, and self-annihilation."
        -- Johnny Hart
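
The special casing described in the last paragraph might look roughly like this in C. It is a sketch only; the function name is_directory is made up here, and the policy of answering "no" for an empty name is one possible choice, not something any of the systems discussed mandates.

```c
#include <stddef.h>
#include <sys/stat.h>

/*
 * Returns 1 if path names an existing directory, 0 otherwise.
 * A nil pointer or an empty string is rejected before stat() is called,
 * so the answer does not depend on whether the kernel treats "" as the
 * current directory or as an error.
 */
int is_directory(const char *path)
{
    struct stat sb;

    if (path == NULL || path[0] == '\0')
        return 0;
    if (stat(path, &sb) != 0)
        return 0;
    return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

A program written this way gives the same answer for "" on System V, 4.nBSD and Version 7 alike, at the cost of one extra comparison per call.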
aglew%xenurus@gswd-vms.Gould.COM (Andy Glew) (10/18/87)
..> "" and "." as path elements.

I rather like giving equal rights to the empty string (I worship Dijkstra
and always write while(1) /*skip*/; instead of while(1);), but the big
lossage for the empty string is that (concat "" "/" "subdir") doesn't
work. By any rights pathnames beginning with "/" should really be relative
pathnames...

Obviously the thing to do is to redefine all pathname handling functions
to accept vectors of strings, like argv; eg.

    char *pathv[] = {
        "",
        "subdir",
        (char *)0,
    };
    fd = openb(pathv,O_RDONLY,mode);

This has the pleasant effect of permitting "/" as a character in
pathnames. Equal rights to slashes, and other strings with criminal
tendencies!

In fact, we should go even further, and permit nul characters in pathnames
by providing a length:

    #define LS(str) {(sizeof(str)-1),str}
    struct pathel {
        int length;
        char *str;
    };
    struct pathel pathv[] = {
        LS(""),
        LS("subdir with a null and / in it"),
        {0, (char *)0},
    };

But then we're discriminating against null strings, ie. string pointers of
value 0. Obviously, need a length on the pathv...

And you know, I might seriously be in favour of this if I worked every day
in a decent language like LISP, instead of in Stone Age C.

Smile, dammit!

Andy "Krazy" Glew. Gould CSD-Urbana. USEnet: ihnp4!uiucdcs!ccvaxa!aglew
1101 E. University, Urbana, IL 61801 ARPAnet: aglew@gswd-vms.arpa

I always felt that disclaimers were silly and affected, but there are
people who let themselves be affected by silly things, so: my opinions are
my own, and not the opinions of my employer, or any other organisation
with which I am affiliated. I indicate my employer only so that other
people may account for any possible bias I may have towards my employer's
products or systems.
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/19/87)
In article <1076@basser.oz> boyd@basser.oz (Boyd Roberts) writes: >So, what does your ``network file system'' do? Is it a UNIX file-system? That depends on what you mean by "UNIX", doesn't it? It is proper according to the POSIX specification.
domo@riddle.UUCP (Dominic Dunlop) (10/20/87)
In article <30667@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes:
>> If JJ Programer only wants to code on SysVid Compliant machines then
>> they can assume stat("",&b) and test -d "" will return ENOENT and false
>> resepctively.
>
>Guess again; the SVID says nothing more than "a null string is undefined and
>*may* be considered an error" (emphasis mine). There will certainly be
>SVID-compliant systems that treat a null string as a synonym for ".".

Yes. This is a problem, isn't it? In fact draft 11 of IEEE 1003.1 (POSIX),
like the SVID, lets conforming implementations go either way. For good (?)
measure, the X/OPEN Portability Guide, edition 2, volume 2, is also
definitively vague: ``The null string is undefined, and may be considered
an error.'' At least we have a consensus that the way to handle this issue
is to punt.

Or do we? I just received mail from the IEEE 1003.1 committee saying that
this unsatisfactory state of affairs is to be discussed at the POSIX
Implementor's Workshop, to be held today (20th October) and tomorrow at
the Sheraton Potomac in Gaithersburg, MD. Apparently POSIX won't cut it as
a Federal Information Processing Standard unless this and a number of
other vaguenesses are cleared up. If I interpret the mail correctly,
1003.1 may end up saying one of three things:

 1. Null pathname invalid (specific value returned in errno -- ENOENT??)
 2. Null pathname means current directory
 3. _POSIX_PATHNAME_NULL defined somewhere (<posix.h>??) by implementor
    iff null pathname means current directory, otherwise null pathname
    invalid.

As I haven't been paying as much attention to developments as perhaps I
should, it could be that my interpretation is off by miles. Could somebody
who attended the meeting please post a summary based on something other
than guesswork? Thanks.

Dominic Dunlop
domo@sphinx.co.uk domo@riddle.uucp
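
Until a standard settles on one of these outcomes, a program that cares can ask the running system directly. The following is a minimal sketch of such a probe; the function name empty_path_accepted is hypothetical, and it does not rely on _POSIX_PATHNAME_NULL, which is only a name floated in the post above.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Returns 1 if stat("") succeeds on this system (the empty pathname is
 * accepted, as on kernels that take it to mean the current directory),
 * 0 if it fails. On failure *err receives errno when err is not NULL.
 */
int empty_path_accepted(int *err)
{
    struct stat sb;

    errno = 0;
    if (stat("", &sb) == 0)
        return 1;
    if (err != NULL)
        *err = errno;
    return 0;
}
```

The result is deliberately not predicted here, since it is exactly the behaviour that differs between implementations; portable code is still better off never passing "" at all.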
olapw@olgb1.oliv.co.uk (Tony Walton) (10/20/87)
In article <9767@brl-adm.ARPA>, franco@MIKEY.BBN.COM (Frank A. Lonigro) writes:
.......
> alone a directory, and why should there be two ways to refer to the same
> thing (the current directory)? It really doesn't make sense!
>
I don't pretend to know the original reason (if any) for having two ways
to refer to the current directory, but it certainly makes life easier for
the users. Consider the "ls" command - it's easier to type "ls" than "ls ."
every time. On the other hand, "." is necessary as a "place holder" in some
commands - like find, mv, etc.
dave@murphy.UUCP (Dave Cornutt) (10/21/87)
In article <8898@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes: > In article <4779@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes: > >... Some, but most definitely NOT all, versions of Unix explicitly > >check for a null path in namei() and return an error (ENOENT, or > >perhaps EINVAL). All others have the bug; since it's not exactly > >crippling, it's not considered a high priority to fix it, so it stays. > > It is also not considered a bug (at least by some), just as the > null set is considered a set (at least by some), and zero sized > arrays are considered not unreasonable (at least by some [An error? > Probably. But what if . . . ?]). I rarely find myself in the position of disagreeing with Chris, but I remain unconvinced on this one. Others, such as John Chambers, have also expressed support for the principle that the null string should be a valid filename. Indeed, it does state in the 4.3BSD manuals that "" refers to the current directory. Trouble is, it also refers to the root directory, depending on the context you use it in. Suppose you write something that will take a directory name and open a file "foo" in it. I'd probably do this by taking the directory name and concatenating "/foo" to it, so if the name was "xyz", I'd get "xyz/foo", and if the directory was ".", I'd get "./foo", both of which would access the desired file. Trouble is, if the directory was "", it wouldn't work; although "" might be a perfectly valid directory, appending "/foo" to it would produce a reference to the root directory, which is certainly not what was intended. If I thought that someone might actually want to do this, I'd have to stick in a special check for "" and substitute something else. So, where do you want the special check to go: in every program that might be given "" as a directory name, or in the kernel where it can be eliminated as a legal filename and save everyone a lot of headaches?
As for "." not existing on some systems: the use of this to mean the current directory is so common in Unix systems nowadays (far more than "", which I'll bet many people that read this group weren't aware of before this discussion came up) that if I ran into a system where it wasn't implemented somehow, either as an actual file name or as a magic name in the kernel, I would complain to the vendor and tell them that their implementation is incomplete. --- "I dare you to play this record" -- Ebn-Ozn Dave Cornutt, Gould Computer Systems, Ft. Lauderdale, FL [Ignore header, mail to these addresses] UUCP: ...!{sun,pur-ee,brl-bmd,seismo,bcopen,rb-dc1}!gould!dcornutt or ...!{ucf-cs,allegra,codas,hcx1}!novavax!gould!dcornutt ARPA: dcornutt@gswd-vms.arpa "The opinions expressed herein are not necessarily those of my employer, not necessarily mine, and probably not necessary."
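The concatenation hazard Dave Cornutt describes can be sketched in a few lines of shell. The helper name `join_path` is hypothetical, invented here for illustration, and substituting "." for an empty directory name is just one possible policy, not what any particular kernel or tool does.

```shell
# join_path DIR FILE: hypothetical helper, not part of any standard tool.
# Plain concatenation turns an empty DIR into "/FILE" (a root reference),
# so an empty DIR is mapped to "." before joining.
join_path() {
    dir=$1
    file=$2
    if [ -z "$dir" ]; then
        dir=.
    fi
    printf '%s/%s\n' "$dir" "$file"
}

naive=""/foo                 # what plain concatenation yields for DIR=""
safe=$(join_path "" foo)     # the guarded version
```

This is the "special check in every program" that the post argues against needing; the sketch only shows what that check looks like when it has to live in user code.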
ron@topaz.rutgers.edu (Ron Natalie) (10/21/87)
There is no context in which "" is the ROOT unless your current directory is the root. If the name starts with slash, use top of tree; otherwise use the current directory. I have a hard time figuring out why "/" means root, ".bin/" means the .bin directory in the current directory, and "/tmp/" means the /tmp directory, but I've explicitly got to say "." rather than "". If it were the case that null names are undefined, then "/" ought to be illegal; it would need to be "/." to actually refer to the directory. On NON-UNIX machines, both "." and "" and directory delims "/" in general ought to work funny, if at all. Just look at the headstanding that has to be done to support UNIX-like directory hierarchy under VMS. -Ron
richard@aiva.ed.ac.uk (Richard Tobin) (10/22/87)
In article <231@olgb1.oliv.co.uk> olapw@olgb1.oliv.co.uk (Tony Walton) writes: >In article<9767@brl-adm.ARPA>, franco@MIKEY.BBN.COM (Frank A. Lonigro) writes: >> alone a directory, and why should there be two ways to refer to the same >> thing (the current directory)? It really doesn't make sense! >I don't pretend to know the original reason (if any) for having two ways >to refer to the current directory, but it certainly makes life easier for >the users. Consider the "ls" command - it's easier to type "ls" than "ls ." >every time. On the other hand, "." is necessary as a "place holder" in some >commands - like find, mv, etc. The facts that ls defaults to the current directory and that a null ("") path is equivalent to "." are quite unrelated. The former is because ls looks to see if it has a filename argument, and if it doesn't, it uses ".". The latter is because the kernel routine namei() allows "" as a degenerate pathname. Thus ls and ls "" both list the current directory, but for quite different reasons. I never use "" for the current directory deliberately; when I encounter it, it's usually because of an error - maybe an over-ambitious C-shell backquoted expression that didn't produce any output. -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@nss.cs.ucl.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin
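The distinction between "no argument" and "an empty argument" is visible from the shell itself. A minimal sketch, assuming a hypothetical helper named `count_args` that simply reports how many arguments it was handed:

```shell
# count_args: hypothetical helper that prints the number of arguments
# it received, regardless of what they contain.
count_args() {
    echo "$#"
}

none=$(count_args)        # like plain `ls`: no arguments at all
empty=$(count_args "")    # like `ls ""`: one argument, the empty string
```

A command such as ls sees the first case and picks its own default; in the second case it passes the empty string on, and what happens next depends on how the system resolves a null pathname.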
kre@monet.Berkeley.EDU (Robert Elz) (10/24/87)
In article <6569@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >The original intent was for "" by itself to refer to the current working >directory; an argument could be made that this is just an instance of >a general algorithm for interpreting pathnames. But an argument can >also be made that its meaning is not unambiguously determined by the >general rules, so that a special rule is necessary to uniquely nail >down the meaning of "". Different systems nailed it down differently. I don't quite see that there is, or ever was, the second argument. The general rule is quite clear about this; it's also documented, and always has been. The systems which have made "" illegal I think did so because most uses of "" as a pathname are undeniably programming errors (as was the one which started this whole discussion), and there was some desire to help catch the errors. A worthy aim, unfortunately not a good idea at all. kre
mwp@munnari.oz (Michael W. Paddon) (10/24/87)
in article <231@olgb1.oliv.co.uk>, olapw@olgb1.oliv.co.uk (Tony Walton) says: > > In article <9767@brl-adm.ARPA>, franco@MIKEY.BBN.COM (Frank A. Lonigro) writes: > ....... >> alone a directory, and why should there be two ways to refer to the same >> thing (the current directory)? It really doesn't make sense! >> > > I don't pretend to know the original reason (if any) for having two ways > to refer to the current directory, but it certainly makes life easier for > the users. Consider the "ls" command - it's easier to type "ls" than "ls ." > every time. On the other hand, "." is necessary as a "place holder" in some > commands - like find, mv, etc. When you type "ls", the current directory is listed by default because no arguments were given (i.e. argc == 1), not because the null string was given as an argument. If you wanted to do that, you would have to type "ls ''". There is only *one* way (under non-braindamaged unix) to access the current working directory of a process. That is by the null pathname. The kernel interprets this as a reference to the inode number kept in the per-process u-area. Referencing the pathname "." tells the kernel to find an inode with the name "." in the current working directory of the process (which is found as stated above). The file name "." is CONVENTIONALLY a hard link to the directory it resides in, just as ".." is conventionally linked to the parent of that directory. There is no guarantee or requirement that this be the case from the kernel's point of view (indeed the rule for ".." is already broken at mount points). The mknod(2) call, incidentally, allows the creation of a directory without these links. The reason you need "." is so you can explicitly reference files relative to the current directory, i.e. "./program", when you want to make *sure* you are running that instance of program. Obviously the null string cannot be used for this. But the need for the null string (and indeed its very nature) is more fundamental than "." as a way to reference the current directory. Michael Paddon ============== =========================== UUCP: {seismo,mcvax,ukc,ubc-vision}!munnari!mwp ARPA: mwp%munnari.oz@seismo.css.gov CSNET: mwp%munnari.oz@australia
guy%gorodish@Sun.COM (Guy Harris) (10/25/87)
> I don't pretend to know the original reason (if any) for having two ways > to refer to the current directory, but it certainly makes life easier for > the users. Consider the "ls" command - it's easier to type "ls" than "ls ." > every time. None of this has anything whatsoever to do with "" being a name for the current directory; UNIX versions of "ls" (or, at least, the 4.3BSD and S5R3 versions of "ls") treat "ls" by itself as equivalent to 'ls .', not 'ls ""'. Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
guy%gorodish@Sun.COM (Guy Harris) (10/25/87)
> Apparently POSIX won't cut it as a Federal Information Processing Standard > unless this and a number of other vaguenesses are cleared up. The consensus of the meeting in question was that, in a large number of cases, the behavior of POSIX systems *should* be left unspecified; there was no benefit to nailing the behavior down (there are at most three programs in all of SunOS, for instance, that use the "kill" system call with a process ID argument of -1; applications other than system shutdown commands - which are supplied by the vendor, not written by users - should not *care* whether this causes the signal to be sent to the process doing the "kill"). In the particular case of a null pathname, the consensus of the meeting was that a null pathname should be treated as an error, with error code ENOENT. This may or may not end up being what the FIPS says. POSIX will probably continue to allow either behavior, which is not a problem; there is no reason for an application to explicitly use a null string as a pathname, and those applications that might do so implicitly can just let the chips fall where they may (if they don't treat a null string specially already, e.g. a command that prompts you for a pathname and, if you just type RETURN, uses a default pathname, or uses the last one you typed at it, or complains, or exits, or...). Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
randy@umn-cs.UUCP (Randy Orrison) (10/25/87)
In article <231@olgb1.oliv.co.uk> olapw@olgb1.oliv.co.uk (Tony Walton) writes: > Consider the "ls" command - it's easier to type "ls" than "ls ." >every time. On the other hand, "." is necessary as a "place holder" in some >commands - like find, mv, etc. "ls" is not the same as "ls ''" - one has no arguments, and so defaults to the current directory, and the other has one argument, a zero length string which may or may not refer to the current directory (Opinion: ENOENT). I like ls defaulting to the current directory, and there's no reason to ever change this (what else would it default to???). For find, mv, cp, et al. the "." is necessary as a place holder: what would "mv x/y x/z" mean if both x/y and x/z existed and you didn't have to specify the destination? mv both to the current directory, or replace x/z with x/y? It's nice in MSDOS to be able to say "copy a:junk.dat" and have it end up in the current directory, but then it doesn't allow "copy a:junk.dat a:new.dat ." (But then root directories don't have ".", much less "..") My vote: "." is the current directory and "" isn't. -randy Disclaimer: What did i just say? -- Randy Orrison, University of Minnesota School of Mathematics UUCP: {ihnp4, seismo!rutgers!umnd-cs, sun}!umn-cs!randy ARPA: randy@ux.acss.umn.edu (Yes, these are three BITNET: randy@umnacca different machines)
lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (10/25/87)
In article <231@olgb1.oliv.co.uk>, olapw@olgb1.oliv.co.uk (Tony Walton) writes: > I don't pretend to know the original reason (if any) for having two ways > to refer to the current directory, but it certainly makes life easier for > the users. Consider the "ls" command - it's easier to type "ls" than "ls ." > every time. On the other hand, "." is necssary as a "place holder" in some > commands - like find, mv, etc. You also couldn't say $ ./a.out to force the shell to search the current directory (without a full path name). Users that want to trash . don't seem to understand the full extent of this change. You would also have to type in full path names or even use $ `pwd`/a.out or $ $PWD/a.out Yuck! Now if by trashing . "they" mean to keep the semantics but remove it from the file system I'm not so adamant. It's just too late to change now. Larry Cipriani tut!lvc
mwlcs423@nmtsun.nmt.edu (M. Warner Losh) (10/26/87)
Anyone who has had a problem with test as a program in their account should like this. Place something like: fprintf(stderr, "Program header\n"); as the first line of main(). If the program needs to act as a filter, then this won't mess up the pipes and stuff, and you know the program is running...... Also, this can be taken out once you know the program works.
franco@MIKEY.BBN.COM (Frank A. Lonigro) (10/27/87)
> Richard Tobin writes: > I never use "" for the current directory deliberately; when I encounter it > it's usually because of an error - maybe an over-ambitious C-shell > backquoted expression that didn't produce any output. This is exactly my point!!! After reading the many responses to my original invocation of this subject (many of which got off onto the wrong track), I still think that: /bin/test -d "$FOO" should return FALSE if FOO="" . And that I shouldn't have to do two tests to find out that a shell variable is a directory and is not NULL. Imagine if you will this scenario. The average programmer(a non-unix.wizard or guru) writes a shell script that will ultimately need to run as "root"(UID 0) to be used as a HOME directory remover. The script uses "awk" to pull the USER's(the one to be removed) HOME directory out of the "/etc/passwd" file and stick it in a shell variable called Home. The script will then check to see if $Home is a directory and if it is, it will remove it and all its contents. A sample shell script follows:

    Home=`awk { stuff to get home } /etc/passwd`
    if [ -d "$Home" ]; then
        cd $Home        : 'cd to HOME'
        Cwd=`/bin/pwd`  : 'To fool symlinks(where are we really)'
        cd /tmp         : 'cd to some place known'
        rm -rf $Cwd     : 'remove the directory'
        rm -rf $Home    : 'remove the link, if there was a link'
    fi

Now, can you imagine what this script will do if the USER you want to delete didn't match any entries in the "/etc/passwd" file? In other words Home="" ? My guess is: YOU WOULDN'T HAVE A SYSTEM TO LOGIN TO WHEN YOU CAME TO WORK IN THE MORNING!!! Just another entry into the "Don't do this on UNIX" list. (unless ""!="." || !isadirectory("")) -franco%bbn.com@relay.cs.net ...!harvard!bbn!franco P.S. PLEASE, I don't want to hear about better ways to write such a script. Removing directories through some sort of batch job no matter how much checking and testing is done, has been deemed: "DANGER! DANGER! ALIEN LIFE FORM OF UNKNOWN ORIGIN APPROACHING FROM BEHIND THAT ROCK, WILL ROBINSON!" -Robot- "Shut up! You bubble headed boobie!" -Dr. Smith- "When someone can't move any more. When someone can't talk any more. Penny! Please, don't die!" -Mr. Nobody-
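For reference, the "two tests" under discussion can be folded into one small function, in line with Guy Harris's earlier advice to check that a string isn't null before handing it to test. This is only a sketch; the name `is_real_dir` is hypothetical and not something any poster proposed.

```shell
# is_real_dir PATH: hypothetical helper. Succeeds only if PATH is
# non-empty AND names a directory. The -n check runs first, so the
# answer does not depend on how the system's stat() treats "".
is_real_dir() {
    [ -n "$1" ] && [ -d "$1" ]
}
```

A script would then write `if is_real_dir "$Home"; then ...` and get the same answer on systems where a null pathname resolves to the current directory and on systems where it is an error.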
franco@MIKEY.BBN.COM (Frank A. Lonigro) (10/27/87)
G B Reilly <reilly@facman.wharton.upenn.edu> responded to my most recent posting in which I described a script that would destroy a system. > > C'mon - average user creating a root script! > > Next you'll believe DEC and IBM when they tell us that systems > programmers aren't needed to run machines. You know! It's responses like this that make me think twice about whether or not the subscribers to this list are really UNIX WIZARDS (whatever that is!) or just HACKERS who think they know what they're talking about. I didn't mean average programmer in the sense of a college student (which I believe reilly is), I really meant average systems programmer (someone who graduated college and has a real job) who didn't know that /bin/test -d "" would return TRUE. Out in the real working world, just about anything can happen, especially if you have over 100 computers (not including PC's or workstations) to manage like I do. And on all these computers, you have programmers (some have root privileges) developing software. It may be true that "" is a directory, but I still haven't heard a good excuse for needing it. Frank A. Lonigro Systems Software Engineer (UNIX) Bolt Beranek and Newman Inc. (BBN) Cambridge, MA BBN - A leader in Networking and Communications.
rwhite@nusdhub.UUCP (Robert C. White Jr.) (10/27/87)
In article <9974@brl-adm.ARPA>, franco@MIKEY.BBN.COM (Frank A. Lonigro) writes: > After reading the many responses to my original invocation of this subject > (many of which got off onto the wrong track), I still think that: > /bin/test -d "$FOO" > should return FALSE if FOO="" The issue above is easily settled by a little care... > /bin/test -d "$FOO" should become one of the following:

    /bin/test -d "${FOO:? <error message>}"   # to terminate shell
    /bin/test -d "${FOO:=/etc/system}"        # to guarantee false
    /bin/test -d "${FOO:-/etc/system}"        # same without affecting the value of FOO

If you take care of your variable substitutions, they will take care of you. Incidentally, on our system 'test -d ""' returns false anyway... Cest le Gare Rob.
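The three expansions above behave differently when the variable is empty, which a short script can make visible. This is a sketch under assumptions: `/nonexistent` is a placeholder path chosen here for illustration (the post uses /etc/system), and the `:?` case is run in a subshell so the demonstration keeps going when run as a script.

```shell
FOO=""

# ${FOO:-word}: substitutes word, leaves FOO itself unchanged.
dflt=${FOO:-/nonexistent}
after_dash=$FOO

# ${FOO:=word}: substitutes word and also assigns it to FOO.
: "${FOO:=/nonexistent}"
after_assign=$FOO

# ${FOO:?message}: reports message and makes a non-interactive shell exit.
# Confined to a subshell here so only the subshell stops.
FOO=""
if ( : "${FOO:?FOO is empty}" ) 2>/dev/null; then
    guarded=passed
else
    guarded=stopped
fi
```

Which form fits depends on whether the script should stop outright on an empty value or carry on with a stand-in that test -d will reject.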
bzs@bu-cs.bu.EDU (Barry Shein) (10/27/87)
The issue is whether a string containing nothing indicates your program has nothing to worry about. This might seem like much ado about nothing but some argue that nothing gained might be nothing lost, particularly when deleting directories with shell scripts which have variables containing nothing. Put another way (for the backward-compatibility fanatics), is nothing sacred? Perhaps this explains nothing. It certainly is neither the first nor the last time this list has seen arguments about nothing. -Barry Shein, Boston University (adapted loosely from an entry in some Encyclopedia of Philosophy on "nothing".)
gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/30/87)
In article <9977@brl-adm.ARPA> franco@MIKEY.BBN.COM (Frank A. Lonigro) writes: >Imagine if you will this scenario. The average programmer(a non-unix.wizard >or guru) writes a shell script that will ultimately need to run as "root"(UID 0) >to be used as a HOME directory remover. Frankly, under this scenario the semantics of test -d "" is the least of your problems.