[comp.unix] Some thoughts on filenames - "" in particular

neilb@elecvax.eecs.unsw.oz (Neil F. Brown) (11/14/87)

I once noticed the following:
	On a level 7 system (which we still use)

	chmod -x . # silly, but possible and I have seen it happen
		   # as in chmod -x * .*
	ls -l	 
	<some sort of error about not being able to access  >
	# thinks: "damn"
	chmod +x .
	<error, can't access . - after all, that would require
		searching current directory, which doesn't have x perm>
	# damn, can't remember where I was
	pwd
	<sorry, can't access ..>
	chmod +x ""
	# this works as "" IS the current directory, no directory search needed

    This is what first convinced me that "", not "." was really the current
    directory.

    On BSD4.2 it goes much the same way until you get to
	chmod +x ""
    This don't work on BSD!! I was shocked. Is NOTHING sacred?
    Now I HAVE to remember where I am.  Such is "progress".

    Also, in V7, EVERY null terminated string was a potentially valid path
    name.
    4.?BSD disallowed setting the 8th bit (though this may change when the
	realities of international character sets sink home).
    SVr2(?) disallows "" - bad to worse.
	Though I'm not sure, did they? The error is ENOENT. Does this mean 
	I can
		ln file ""
	But no, as this would mean thing//thong is different to thing/thong.
    So now, not every string is potentially valid, so the universe is less
    general.
    Such is progress.
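
A quick way to see which camp a given system falls into is to hand the
empty string to stat(2) and look at the errno that comes back. This is
only a sketch: the function name probe_empty_path is made up for
illustration, and the expectation that POSIX-style systems answer ENOENT
is an assumption about present-day behaviour, not something the V7 or
4.2BSD kernels discussed above are known to share.

```python
import errno
import os


def probe_empty_path():
    """Return "ok" if "" resolves as a path, else the symbolic errno name."""
    try:
        os.stat("")  # ask the kernel to resolve the empty path name
    except OSError as exc:
        # Translate the numeric errno (e.g. 2) into its name (e.g. "ENOENT").
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    print('stat("") ->', probe_empty_path())
```

A result of "ok" would correspond to the V7 behaviour described above;
an errno name corresponds to the systems that reject "".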

    And what about the file name
	"foo/"
    One of the original documents ("The Unix File System"?) states that
    trailing slashes are stripped, so this is equivalent to "foo", or was.

    On a BSD system, try
	echo */
    It probably only lists the directories. (It depends on the shell).
    On 4.2BSD at least, "foo/" is only accessible if foo is a directory.
    Personally, I prefer these semantics.
    But there is still a funny. Try
	rmdir foo/
    You get an error message like
	rmdir: foo/: Is a directory.
    Well, I know it's a directory, that's why I used rmdir!!
	rmdir foo
    of course works.
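
The "foo/ is only accessible if foo is a directory" rule can be checked
with a small experiment. The sketch below is an illustration only: the
names probe_trailing_slash, plainfile and subdir are invented, and it
assumes a POSIX-style system where a trailing slash on a non-directory
is rejected.

```python
import errno
import os
import tempfile


def probe_trailing_slash():
    """stat() a plain file and a directory, each with a trailing slash."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        plain = os.path.join(tmp, "plainfile")
        subdir = os.path.join(tmp, "subdir")
        open(plain, "w").close()  # create an empty regular file
        os.mkdir(subdir)
        for label, path in (("file", plain), ("dir", subdir)):
            try:
                os.stat(path + "/")  # note the trailing slash
                results[label] = "ok"
            except OSError as exc:
                results[label] = errno.errorcode.get(exc.errno, str(exc.errno))
    return results


if __name__ == "__main__":
    print(probe_trailing_slash())
```

On a system with the semantics preferred above, the "dir" entry succeeds
and the "file" entry fails with an errno name.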

    After considering all of this, I came to the conclusion that the best
    semi-formal semantics for Unix file names was

    SLASH =	'/'+	# a non empty string of slashes
    NAME =	[^/]+	# a non empty string of non-slash (non \0) chars

    A "file" is essentially a byte-stream (+ seek+ioctl+fcntl+...)
    A "directory" is a mapping from NAMEs to "file"s
		i.e. a "directory" is a function from the space of NAMEs to
			the space of "file"s

    path =  filename	# in which case path refers to a file/device/
				socket/stream/etc. a "file"
	 | dirname	# path refers to a "directory"
	 .
    dirname =  SLASH	# path refers to the root directory (for the process)
	 |   <empty>	# path refers to current (working) directory
	 | filename SLASH # The file named is to be interpreted in some
			# system dependent fashion as defining a "directory"
			# function. The path refers to that "directory".
	 .
    filename = dirname NAME
			# the function dirname is applied to the NAME
			# to produce a "file"
	 .
    
    Note that the empty string is never considered to be a directory
    entry; all directory entries are non-empty strings.
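
The grammar above can be turned into a small classifier. This is a
sketch rather than any kernel's namei: parse_path and its return shape
are invented for illustration. It tokenises a path into SLASH and NAME
runs, then reports whether the whole path is a dirname or a filename,
where the walk starts (root or current directory), and which NAMEs are
applied in order.

```python
import re


def parse_path(path):
    """Classify a path according to the SLASH/NAME grammar.

    Returns (kind, start, names):
      kind  is "dirname" or "filename"
      start is "root" or "cwd"
      names is the list of NAMEs applied left to right
    """
    if "\0" in path:
        raise ValueError("a path name cannot contain a NUL byte")
    # SLASH = '/'+ and NAME = [^/]+ ; the two alternatives cover every char.
    tokens = re.findall(r"/+|[^/]+", path)
    # A leading SLASH means the walk starts at the root, else at <empty>.
    start = "root" if tokens and tokens[0].startswith("/") else "cwd"
    names = [t for t in tokens if not t.startswith("/")]
    # <empty>, SLASH and "filename SLASH" are dirnames; the rest end in a NAME.
    ends_in_name = bool(tokens) and not tokens[-1].startswith("/")
    kind = "filename" if ends_in_name else "dirname"
    return kind, start, names


if __name__ == "__main__":
    for p in ("", "/", "foo", "foo/", "/xyz", "thing//thong"):
        print(repr(p), parse_path(p))
```

Because SLASH is a run of slashes, "thing//thong" and "thing/thong"
parse identically, and "foo" and "foo/" differ only in kind.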
    [ At this point we could discuss what equivalence relation
      we will impose on name components - only 14 chars are
      significant, case is not significant...
      But that's off the track.
    ]
    In this system, "foo" is technically different from "foo/".
    You could reasonably make a read(2) on "foo" return the system
    dependent representation of the directory, while a read on
    "foo/" returns the NAMEs (null terminated) which the directory
    will successfully map (i.e. put readdir into the kernel).
    Of course, this last thought would break much, so it probably
    ain't worth the effort.
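
What a read on "foo/" might hand back under that idea can be mocked up
in user space. The sketch below is not a kernel change, only an
illustration: read_directory_names is an invented name, and sorting the
entries is a choice made here so the output is predictable.

```python
import os
import tempfile


def read_directory_names(path):
    """Return the directory's NAMEs as one byte string, each NUL terminated."""
    # os.listdir() omits "." and "..", so only real entries are returned.
    names = sorted(os.listdir(path))
    return b"".join(os.fsencode(name) + b"\0" for name in names)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("alpha", "beta"):
            open(os.path.join(tmp, name), "w").close()
        print(read_directory_names(tmp + "/"))
```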

    Now I'm not saying this is the way it IS, anywhere.
    I just think that it's a particularly clean way to define the semantics
    of path names. If anyone has a different, complete, semi-formal
    definition, I would love to see it.

    Happy arguing.

    NeilBrown
    (Organisation, address, etc in the header where they belong)

mike@turing.unm.edu (Michael I. Bushnell) (11/16/87)

In article <2423@mcdchg.UUCP>, neilb@elecvax (Neil F. Brown) writes:
~	chmod +x ""
~	# this works as "" IS the current directory, no directory search needed
~
~    This is what first convinced me that "", not "." was really the current
~    directory.
~
~    On BSD4.2 it goes much the same way until you get to
~	chmod +x ""
~    This don't work on BSD!! I was shocked. Is NOTHING sacred?
~    Now I HAVE to remember where I am.  Such is "progress".


We just got 4.3+NFS from Mt. Xinu and it doesn't have this problem.
As they got the directory scanning stuff from Sun, I bet it works
there, too.
--
				Michael I. Bushnell
				a/k/a Bach II
				mike@turing.unm.edu
				{ucbvax,gatech}!unmvax!turing!mike
---
HOORAY, Ronald!!  Now YOU can marry LINDA RONSTADT too!!
				-- Zippy the Pinhead
[The moderator presumes that he wasn't the referent above. Ronald Heiby. -mod]

dave@murphy.UUCP (Dave Cornutt) (11/24/87)

I know this is beating a dead horse, and probably everyone is sick of this
subject by now, and I promise to make this my last posting on this topic.

The real problem with "" as a file name, as several folks have already
pointed out, is that depending on what context you use it in, it can
refer either to the current directory or to the root directory; if you
use it as a filename by itself, it's the current directory, but if
you construct a filename by appending "/foo" to it, you're referring
to a file in the root.  The cure is to give root a name, just like
every other directory.  For example, let "$" be the name of the root.
This is a "magic" name; when namei sees "$" by itself in any component
of a pathname, it refers to the root, so file "xyz" on the root directory
is not "/xyz" but "$/xyz".  Then, you can let any null filename mean the
current directory; since root has a non-null name, you can append paths
to it and have it work.  So, "/xyz" now refers to "<null>/xyz", and
since the null path is the current directory, "/xyz" refers to xyz
in the current directory.  Now, you can append paths to the null
path and have it refer to files in the current directory, just like
every other directory name works.
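
To make the proposal concrete, here is a sketch that translates names
written in the "$" scheme into conventional Unix path names. It is an
illustration only: the function name dollar_to_unix is invented, and
treating a "$" in a later component as a restart from the root is one
reading of "in any component of a pathname".

```python
def dollar_to_unix(path):
    """Translate a path in the proposed "$"-is-root scheme to a Unix path."""
    parts = path.split("/")
    if "$" in parts:
        # Everything before the last "$" component is discarded: "$" is root.
        last = len(parts) - 1 - parts[::-1].index("$")
        rest = [p for p in parts[last + 1:] if p]
        return "/" + "/".join(rest)
    # No "$": the path starts from the null name, i.e. the current directory.
    rest = [p for p in parts if p]
    return "./" + "/".join(rest) if rest else "."


if __name__ == "__main__":
    for p in ("$/xyz", "/xyz", "", "$", "foo/bar"):
        print(repr(p), "->", dollar_to_unix(p))
```

Under this reading "$/xyz" becomes "/xyz", while "/xyz" becomes "./xyz",
which is exactly the change existing utilities would trip over.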

Does this make any sense, or is it too much like VMS?
---
"You must be joking, take a running jump" - *Harold the Barrel*

Dave Cornutt, Gould Computer Systems, Ft. Lauderdale, FL
[Ignore header, mail to these addresses]
UUCP:  ...!{sun,pur-ee,brl-bmd,uunet,bcopen,rb-dc1}!gould!dcornutt
 or ...!{ucf-cs,allegra,codas,hcx1}!novavax!gould!dcornutt
ARPA: dcornutt@gswd-vms.arpa

"The opinions expressed herein are not necessarily those of my employer,
not necessarily mine, and probably not necessary."

usenet@mcdchg.UUCP (12/03/87)

In article <2597@mcdchg.UUCP>, dave@murphy.UUCP (Dave Cornutt) writes:
> I know this is beating a dead horse, and probably everyone is sick of this
> subject by now, and I promise to make this my last posting on this topic.

Over in comp.sys.amiga we've been having a very similar discussion. You see,
on the Amiga NULL is the *only* valid name for the current directory. Isn't
that just peachy? Think about all the times you've wanted to refer to "./".
Think about what you would do if you couldn't DO that?

The Amiga has a special name for the root. Colon.

:xyz means /xyz.

:/xyz is illegal.

/xyz means ../xyz.


So, you can't just add "/name" and get stuff to work. Peachy.
What am I saying? I'm saying the UNIX file naming convention is just fine. 
There's no reason to go hacking it up. It's useful to allow NULL to work as a
synonym for ., because it makes for nice default behaviour. Don't wreck the
file system making it "perfect".
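
For comparison, the three Amiga cases listed above (plus the null name)
can be written as a mapping onto Unix names. This sketch covers only
what is stated in this post; amiga_to_unix is an invented name, and
anything beyond those cases (volume names, repeated slashes) is not
modelled.

```python
def amiga_to_unix(path):
    """Map the Amiga-style names described above onto Unix path names."""
    if path == "":
        return "."  # the null name is the current directory
    if path.startswith(":"):
        rest = path[1:]
        if rest.startswith("/"):
            raise ValueError('":/..." is illegal')
        return "/" + rest  # ":xyz" means "/xyz"
    if path.startswith("/"):
        return "../" + path[1:]  # "/xyz" means "../xyz"
    return path  # plain relative name


if __name__ == "__main__":
    for p in ("", ":xyz", "/xyz", "xyz"):
        print(repr(p), "->", amiga_to_unix(p))
```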

> Does this make any sense, or is it too much like VMS?

This doesn't make any sense at all. It's too much like all sorts of brain
damaged things.
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

usenet@mcdchg.UUCP (12/03/87)

In article <2597@mcdchg.UUCP>, dave@murphy (Dave Cornutt) writes:
~The cure is to give root a name, just like
~every other directory.  For example, let "$" be the name of the root.
~This is a "magic" name; when namei sees "$" by itself in any component
~of a pathname, it refers to the root, so file "xyz" on the root directory
~is not "/xyz" but "$/xyz".

~Does this make any sense, or is it too much like VMS?

I) Yes, it is too much like VMS.  We must strive to maintain the purity
   of UNIX.  Viva la UNIX!  Down with VMS, Ultrix, OS/VS,...

II) It does make sense; but it would require rewriting virtually every
    utility and rewriting my and most other people's brains.
--
				Michael I. Bushnell
				a/k/a Bach II
				mike@turing.unm.edu
				{ucbvax,gatech}!unmvax!turing!mike
---
Is it clean in other dimensions?
				-- Zippy the Pinhead