[comp.unix.wizards] Symbolic Links

tytso@thor.mit.edu (Theodore Y. Ts'o) (01/01/70)

In article <2854@ulysses.homer.nj.att.com> ekrell@hector (Eduardo Krell) writes:
>This argument, that ".." is a physical link (rather than a logical one)
>falls apart at mount points, where the parent directory and where ".." points
>are different. It also happens at remote file system mount points, for
>the same reason. So it's not so "simple and consistent". It now requires
>some education about file systems, mount points, and distributed file
>systems. The list seems to keep growing.
>
Yes, Unix requires some education.  I like to think that is a consequence
of its power.  If you want a system that is easy to learn, use a system
with all the flexibility (and "user friendliness") of a Macintosh.
>
>An implementation is not only possible, it already exists. I wouldn't be
>defending this if I didn't have the opportunity to use it and test it and
>get the feeling as to whether it's the right thing to do or not.
>    
Part of the Unix way is to be as flexible as possible.  I HOPE this is 
optional (turned on with a system call, or some such), or are you,
as a religious fanatic, going to force your way on everyone?  Make
it optional, and those who like it off can turn it off, preferably
without needing the source license.  (Or the other solution: using BSD :-)
The above paragraph assumes that someone at ATT is pushing this
interpretation into SYS V.  (Why is it that mostly ATT posters think
this is a good idea?)  If you're describing a purely local hack, then
I hope it stays that way.

I really think adding state into the .. is really a *bad* idea.  (or could
you guess :-)  It confuses the issue, and it isn't as simple as you
seem to make it out to be.  Example: /mnt/paris is a link to /.
When you boot up in single user mode, before any history is established,
WHERE DOES .. POINT TO?  Since you don't think .. is a physical
pointer, the answer '/' is going to require a lot of explaining.  And
if the kernel flips a coin, I'll let you explain to the user why he
typed cd .. from / and ended up in /mnt.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Theodore Ts'o				mit-eddie!mit-athena!tytso
3 Ames St., Cambridge, MA 02139		tytso@athena.mit.edu
			If it's for real, it isn't!

root@hobbes.UUCP (John Plocher) (01/01/70)

+---- Eduardo Krell writes in <2854@ulysses.homer.nj.att.com> ----
| This is really getting annoying ...

Yup

| This is easy.

Nope

|   The intention is to make ".." behave as a logical operator. This means that
| 
| 1: if I do "cd /usr/include/sys" and then "cd ..", I end up in /usr/include
|    no matter what.

So write (or build in) a new cd command which does this.  I'm not stopping
*you* from doing it.  Just don't force *me* to use it, thank you.

| 
| 2: if I find a ".." in a pathname, it refers to the logical parent,
|    not the physical one. Thus, while in /usr/include/sys, "ls .." will produce
|    a directory listing of /usr/include.

What about a C program which does a #include <sys/foobar.h>?  What if the
file <sys/foobar.h> itself does a #include "../foobarboop.h" ?  All this 
works if and only if the C compiler references things as "/usr/include" +
"filename in <>'s".

  Now I am recompiling the kernel, and its source is in the directory
/sys/src.  There is a subdirectory here called sys which is a link to the
directory known above as /usr/include/sys.  The $68K question is "What file
does "../foobarboop" reference now?"  If it isn't the SAME ONE as referenced
in the first case, something is broken.  period.

  This is the difference between a deterministic method (what we have now) and
a nondeterministic one (what is proposed).  I'll take the former any day!

My solution:
  When in /usr/include/sys, and you want a listing of /usr/include, do a
ls .., and when you want a listing of /sys/src, do a ls /sys/src.  If you
get confused easily, the first case can be expanded to ls /usr/include.
If this still confuses you, well, MS-DOS doesn't have links and so doesn't
have this problem.  Use that :-)))  The point is, one, links AS THEY ARE NOW are
very useful and, two, links to directories destroy the "tree" structure
of the filesystem.  Links turn it back into a directed graph which is
not easy to understand if you try to use the "tree" mentality on it.

There is method behind the madness of the way links work; if you don't
understand this, study it some more.  Use pushd and popd.  Make some
aliases (like cd) which do what you want.  Spend an afternoon customizing
your environment like this and it will be an afternoon well spent.  Just 
don't "break" the deterministic behavior of links for *me*.

| >However, this all started with a proposal to move these semantics into the
| >kernel
| You can't achieve 2: above unless it's done in the kernel.

I don't want 2: and 1: can be done with aliases (or whatever) in *your* shell.

-- 
John Plocher uwvax!geowhiz!uwspan!plocher  plocher%uwspan.UUCP@uwvax.CS.WISC.EDU

ed@mtxinu.UUCP (Ed Gould) (01/01/70)

>This argument, that ".." is a physical link (rather than a logical one)
>falls apart at mount points, where the parent directory and where ".." points
>are different. It also happens at remote file system mount points, for
>the same reason. So it's not so "simple and consistent". It now requires
>some education about file systems, mount points, and distributed file
>systems. The list seems to keep growing.

It doesn't fall apart anywhere on my system (4.3BSD + NFS) except that
/ is a special case: /.. == /
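
The root special case is easy to check from user space. A minimal sketch, assuming a POSIX stat(); the helper name same_file is made up for illustration and is not part of any system discussed in this thread:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Return 1 if the two paths resolve to the same file (same device and
 * inode), 0 if they differ, -1 if stat() fails on either one.
 */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Calling same_file("/", "/..") should report a match on a system that treats the root as its own parent.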

>This is easy. The intention is to make ".." behave as a logical operator.
>This means that
>
>1: if I do "cd /usr/include/sys" and then "cd ..", I end up in /usr/include
>   no matter what.
>
>2: if I find a ".." in a pathname, it refers to the logical parent,
>   not the physical one. Thus, while in /usr/include/sys, "ls .." will produce
>   a directory listing of /usr/include.
>
>The semantics are very clean and simple.

Simple, perhaps, but no more so than the current BSD symlinks.  What's
cleaner about it?

>You can't achieve 2: above unless it's done in the kernel.

No argument.

>An implementation is not only possible, it already exists. I wouldn't be
>defending this if I didn't have the opportunity to use it and test it and
>get the feeling as to whether it's the right thing to do or not.

Unless the kernel implements carrying an arbitrarily-long string with
each process, then I claim that the implementation is broken.  Consider
the following program, with DEPTH suitably large:

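	#include <sys/types.h>
	#include <sys/stat.h>
	#include <unistd.h>

	#define DEPTH 1000	/* "suitably large"; the exact value is arbitrary */

	int main() {
		int i;

		for(i = 0; i < DEPTH; i++) {
			mkdir("subdir", 0777);
			symlink("subdir", "symlink");
			chdir("symlink");
		}
		chdir("../../..");
		/* 
		 * Where am I now and how did I get there???
		 */
		return 0;
	}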

(This example could be improved, but I believe that it illustrates
my point.)
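
For comparison, here is what an unmodified kernel does with a single symlink hop. This is only a sketch under assumptions: it needs POSIX mkdtemp(), symlink() and getcwd(), it leaves a scratch directory behind under /tmp, it changes the process's working directory, and the function name dotdot_is_physical is invented for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * In a scratch directory, create other/subdir and a symlink "symlink"
 * that points at it, chdir through the symlink, then chdir("..").
 * Returns 1 if we land in the physical parent ("other"), 0 if we land
 * back in the scratch directory (the logical parent), -1 on error.
 */
int dotdot_is_physical(void)
{
    char tmpl[] = "/tmp/dotdotXXXXXX";
    char scratch[PATH_MAX], now[PATH_MAX], other[PATH_MAX + 8];

    if (mkdtemp(tmpl) == NULL || chdir(tmpl) != 0)
        return -1;
    if (getcwd(scratch, sizeof scratch) == NULL)
        return -1;
    if (mkdir("other", 0777) != 0 || mkdir("other/subdir", 0777) != 0)
        return -1;
    if (symlink("other/subdir", "symlink") != 0)
        return -1;
    if (chdir("symlink") != 0 || chdir("..") != 0)
        return -1;
    if (getcwd(now, sizeof now) == NULL)
        return -1;
    snprintf(other, sizeof other, "%s/other", scratch);
    if (strcmp(now, other) == 0)
        return 1;
    return strcmp(now, scratch) == 0 ? 0 : -1;
}
```

With physical ".." the answer does not depend on how the process arrived, so no per-process history is needed to compute it.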

-- 
Ed Gould                    mt Xinu, 2560 Ninth St., Berkeley, CA  94710  USA
{ucbvax,decvax}!mtxinu!ed   +1 415 644 0146

"A man of quality is not threatened by a woman of equal

johng@ecrcvax.UUCP (John Gregor) (01/01/70)

Another problem I foresee in changing the semantics of .. is that symbolic links
can be cyclic.  So, a few hundred trips around the cycle and the kernel would
have to keep that much state information around.  The question is, how much?
A fixed table probably won't make it (unless obscenely large), and from what
I know, use of dynamic structures for kernel data structures is frowned upon.
Keep 'em the same, and put the smarts in your shell (that's where stacks should
be).  What's keeping you from changing the code in your favourite shell to remap
'..' along with the code that handles '?', '*', and all the rest?  (Binary
sites please don't flame me)
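
A rough idea of what "remapping '..' in the shell" could look like: treat the working directory as a string and edit it textually, never asking the kernel. This is a sketch, not any shell's actual code; logical_cd and its signature are assumptions made for the example, and it expects cwd to already hold a clean absolute path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Apply `arg` to the absolute path held in `cwd` purely as text:
 * ".." drops the last component, "." is ignored, anything else is
 * appended. No system calls are made, so symlinks are never consulted.
 * Returns 0 on success, -1 if the result would not fit in `size` bytes.
 */
int logical_cd(char *cwd, size_t size, const char *arg)
{
    size_t len;

    assert(cwd != NULL && arg != NULL && size >= 2);
    if (arg[0] == '/') {                /* absolute argument: start over */
        cwd[0] = '/';
        cwd[1] = '\0';
    }
    assert(cwd[0] == '/');
    len = strlen(cwd);
    while (*arg != '\0') {
        size_t n = strcspn(arg, "/");   /* length of the next component */

        if (n == 2 && arg[0] == '.' && arg[1] == '.') {
            while (len > 1 && cwd[len - 1] != '/')
                len--;                  /* drop the last component */
            if (len > 1)
                len--;                  /* and the slash before it */
        } else if (n > 0 && !(n == 1 && arg[0] == '.')) {
            size_t sep = (len > 1) ? 1 : 0;

            if (len + sep + n + 1 > size)
                return -1;
            if (sep)
                cwd[len++] = '/';
            memcpy(cwd + len, arg, n);
            len += n;
        }
        cwd[len] = '\0';
        arg += n;
        while (*arg == '/')
            arg++;
    }
    return 0;
}
```

The state kept is one string per shell, which grows only with the length of the path and not with the number of trips around a symlink cycle.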

				Later
				  johng

John Gregor			        	johng%ecrcvax.UUCP@germany.CSNET
ECRC
Arabellastr. 17
D-8000 Munich 81
West Germany

gwyn@brl-smoke.UUCP (01/01/70)

In article <1254@mhres.mh.nl> jv@mhres.mh.nl (Johan Vromans) writes:
>No dot, no dot-dot, and directories have only one link ... Of course, you
>can access "." and ".." from system calls - it will do what you expect.

The interesting question is, if your system has or eventually will
have symbolic links to directories, then what WOULD you expect ".."
to mean?  Seems like the only thing it COULD mean is what Korn,
Krell, et al have proposed.

snoopy@doghouse.gwd.tek.com (Snoopy) (01/01/70)

In article <2886@ulysses.homer.nj.att.com> ekrell@hector (Eduardo Krell) writes:
>In article <8195@mimsy.UUCP> chris@mimsy.UUCP writes:

>>	#ifdef KERNEL
>>	#include "../machine/pte.h"
>>	#else
>>	#include <machine/pte.h>
>>	#endif

>The right way of doing this is by issuing the right -I options to cpp.

Why is -I the "right" way and #define the "wrong" way?

What does the relevant section of your Makefile look like? (ho boy,
now I'm asking for it!  Makefile wars!)  People seem to have a hard
enough time writing a reasonable Makefile the way it is.  (Hint:
if the Makefile is larger than the program is, something's wrong!)

Snoopy (wearing my Nomex for this one)
tektronix!doghouse.gwd!snoopy
snoopy@doghouse.gwd.tek.com

"68020s?  I use them as multiply redundant pushpins!"

dhesi@bsu-cs.UUCP (Rahul Dhesi) (01/01/70)

Some suggestions.

I.
".." should always mean the same thing, whether from the shell
prompt or in a system call.

II.
Symbolic links are attached to a place in the directory hierarchy
and (in 4.xBSD) translated by the kernel.  Environment variables 
are local to a process and interpreted by applications software.

What we need is a third entity that will (a) not be attached to a 
specific place in the directory hierarchy but (b) be known to
the kernel.

Then, instead of saying

>	#ifdef KERNEL
>	#include <../h/thing.h>
>	#else
>	#include <sys/thing.h>
>	#endif

we will simply be able to say:

     #include <$SYS/thing.h>

and have $SYS be interpreted by the filesystem.  Similarly, if I
want to move to the directory that contains thing.h, I will type

     $ cd $SYS
     $ pwd
     /usr/include/sys
     $ 

And "cd .." will always move me up to the parent directory.

There is no reason why the value of SYS could not be relative.

     $ setlink ABC /usr/include/sys
     $ setlink EFG ../jkl
     $ setlink XYZ alpha
     $ cd $ABC/$EFG/$XYZ
     $ pwd
     /usr/include/jkl/alpha
     $

ABC, EFG, and XYZ look like environment variables, but they are known
to the kernel and accessed via a hash table, not a sequential search.
The hash table itself is accessible via a special entry in the normal
environment so that, for example, a library function can look for the
environment variable LINKTABLE and get some value that will let it
access the hash table directly, so the kernel need not be involved in
all accesses, only those that are needed in system calls.

Actually, we might not even need to involve the kernel at all.  Just
change ALL interface routines that could possibly get a filename and
let them perform the translation before the parameters are sent on to
the kernel.  In effect all system calls that handle filenames are
buffered via library functions that do link name translation.
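
The library-side translation could be sketched roughly as follows. Everything here is hypothetical: there is no setlink command or LINKTABLE mechanism in any real system, the fixed array stands in for the proposed hash table, and translate_links is a name invented for the example. The table entries mirror the ABC, EFG and XYZ values used above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical link table; the proposal would fill a hash table via "setlink". */
struct link_entry { const char *name; const char *value; };

static const struct link_entry link_table[] = {
    { "ABC", "/usr/include/sys" },
    { "EFG", "../jkl" },
    { "XYZ", "alpha" },
};

/* Look up a name of length n; returns its value or NULL if unknown. */
static const char *lookup_link(const char *name, size_t n)
{
    size_t i;

    for (i = 0; i < sizeof link_table / sizeof link_table[0]; i++)
        if (strlen(link_table[i].name) == n &&
            strncmp(link_table[i].name, name, n) == 0)
            return link_table[i].value;
    return NULL;
}

/*
 * Copy `path` into `out`, replacing every component of the form $NAME
 * with its value from link_table. Returns 0 on success, -1 on an
 * unknown name or if `out` is too small.
 */
int translate_links(const char *path, char *out, size_t size)
{
    size_t used = 0;

    assert(path != NULL && out != NULL && size > 0);
    while (*path != '\0') {
        size_t n = strcspn(path, "/");          /* this component */
        size_t slash = (path[n] == '/') ? 1 : 0;
        const char *piece = path;
        size_t piece_len = n;

        if (n > 1 && path[0] == '$') {
            piece = lookup_link(path + 1, n - 1);
            if (piece == NULL)
                return -1;
            piece_len = strlen(piece);
        }
        if (used + piece_len + slash + 1 > size)
            return -1;
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        if (slash)
            out[used++] = '/';
        path += n + slash;
    }
    out[used] = '\0';
    return 0;
}
```

A wrapper around open() or chdir() would call translate_links() first and pass the expanded string to the real system call, which then resolves any ".." in the usual way.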

What if I want to be able to see the following?

     $ pwd
     /usr/abc
     $ cd $ABC
     $ pwd
     /usr/include/jkl
     $ back
     $ pwd
     /usr/abc

This is best done by the shell, since the command "back" will be given
to the shell but never to a system call.  The shell need only keep
track of the path traced in reaching the current directory, and go back
a step in it when it sees the "back" command.
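
The bookkeeping a shell would need for "back" is small. A sketch, assuming fixed limits and names (dir_history, hist_visit, hist_back) chosen only for illustration; a real shell would also call chdir() at each step:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HIST_MAX 32     /* arbitrary limits for the sketch */
#define PATH_LEN 256

struct dir_history {
    char path[HIST_MAX][PATH_LEN];
    int depth;          /* path[depth - 1] is the current directory */
};

/* Start the trail at `start`. */
void hist_init(struct dir_history *h, const char *start)
{
    assert(h != NULL && start != NULL && strlen(start) < PATH_LEN);
    strcpy(h->path[0], start);
    h->depth = 1;
}

/* Record a move to `dir`. Returns 0, or -1 if the trail is full or `dir` is too long. */
int hist_visit(struct dir_history *h, const char *dir)
{
    assert(h != NULL && dir != NULL);
    if (h->depth == HIST_MAX || strlen(dir) >= PATH_LEN)
        return -1;
    strcpy(h->path[h->depth++], dir);
    return 0;
}

/* Step back once. Returns the directory to go to, or NULL at the start of the trail. */
const char *hist_back(struct dir_history *h)
{
    assert(h != NULL);
    if (h->depth <= 1)
        return NULL;
    h->depth--;
    return h->path[h->depth - 1];
}
```

Because the trail lives in the shell, the kernel's view of ".." is left untouched.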

The above are just some ideas.  Now for reality.

Pick one:

     [ ] symbolic links available around 1983, that largely work
	 and are useful, though they lead to some confusing 
	 situations because the kernel and the user may interpret
	 ".." differently; may be up to 255 characters long

     [ ] symbolic links designed carefully to lead to no confusion;
	 approved after much consideration by four layers of
	 bureaucracy; not currently available, and probably won't be 
	 until 1989; will probably be limited to 14 characters

Pick just one.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

boyd@basser.oz (Boyd Roberts) (01/01/70)

Yes, the semantics of symbolic links were always the tricky bit.  The
implementation was always pretty straight-forward, save for this ridiculous
``cd ..'' scenario.  That's the price you paid for using them.  I can live
with that cost.

What I can't live with is the cost of implementing the ``cd ..'' bad craziness
in the kernel.  If you really want this functionality, use the shell.  Some
clever function/alias with (maybe) some help from a program would just do
the trick.

There's enough crap in the kernel without ridiculous & gross ``back-pointers''
& snarfed away pathnames & weird process state bits needed to implement this
totally dubious hack.  But, if you're hell bent on hacking the kernel
why not save yourself some agony?  Just get chdir() to return ENOTDIR when
it detects a ``cd ..'' from inside a symbolically linked directory.  After all,
if you're going to break the kernel there's no point busting a gut to do it.



Boyd Roberts			boyd@basser.oz

``When the going gets weird, the weird turn pro...''

snoopy@doghouse.gwd.tek.com (Snoopy) (01/01/70)

In article <3728@elecvax.eecs.unsw.oz> neilb@elecvax.eecs.unsw.oz (Neil F. Brown) writes:

>The thrust of my argument is - do you ever really want to have a directory
>in two different places? i.e. with two different absolute path names that
>don't include the well-understood (I thought) `.' and `..'.

Seems this would be pretty common nowadays with the various multi-machine
filesystems out there (DFS/NFS/RFS(PD)/RFS(AT&T)/etc)

	ln -s //fileserver/project/src/foo/RCS RCS
	make foo

There are now two pathnames to RCS.  And the symbolic link should stay
*symbolic* so that the link will still work after the disk partitions
on fileserver get rearranged.  Just like writing shellscripts using
$HOME or ~/ instead of hard-coding your home directory.

>If you don't, then generalised mounting will solve your problems.

Is this generalised mount something a normal user can do, or do
you have to be root?  Do you have to remount everything everytime
a machine goes down?

Snoopy
tektronix!doghouse.gwd!snoopy
snoopy@doghouse.gwd.tek.com

jhc@mtune.ATT.COM (Jonathan Clark) (06/24/87)

>> if I say 'cd a/b/c/d/e;cd ..' then I am now in a/b/c/d, regardless of how
>> many symlinks or networked file systems I had to go through to get there.

>Which can, in most cases, be done by the shell interpreting the "cd"
>command; the question is whether one should add extra stuff to the
>kernel to make this work invisibly?  I tend to agree with what was
>given as Dennis Ritchie's position, which was "no".

I'd disagree. Yes, I realize that the kernel would then have to keep
track of where you are in the file system, that's unfortunate.
However, adhering to this does maintain the tree-structure of the
current file system. I would also argue that when 'cd'ing through
hard-linked directories then the kernel should, following the same
model, keep track of how you got there. There has to be a Buckaroo
Banzai quote in there somewhere.

symlinks are then available to be used as methods of mapping /usr/src into
/big-file-server/usr/src, thus hiding icky details of which file systems
are where on an extended system or set of systems from the user. Which is
not to say that they can't be used for other things.

The model given above does modify the concept of 'absolute position in
the file system', but I was unable in 30 seconds or so to work out
what if anything this might break. Loop-detecting 'find's, possibly,
but they should be easily modified (also they are fairly new). In
fact, if anything they should be easier to write, since to detect a
loop one only has to keep the duple <file-system, inode> for each
ancestor directory, and compare the current two values. A match equals
a loop. How would a loop-detecting 'find' work over a symlink if `pwd`
returned the position relative to the mount point of the file system?
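
The ancestor comparison described above might look like this in outline. It is a sketch, not the code of any real find: the names walk, count_loops and demo_loop_count are invented, the walk follows symlinks by using stat() rather than lstat(), and the demo helper leaves a scratch tree under /tmp.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_DEPTH 64    /* arbitrary cap on nesting for the sketch */

struct ancestor { dev_t dev; ino_t ino; };

/*
 * Depth-first walk that follows symlinks. stack[0..depth-1] holds the
 * <device, inode> pair of every ancestor directory. Returns the number
 * of entries that resolve to one of their own ancestors, or -1 on error.
 */
static int walk(const char *path, struct ancestor *stack, int depth)
{
    struct stat st;
    struct dirent *de;
    DIR *dir;
    int i, n, loops = 0;

    if (stat(path, &st) != 0)
        return depth == 0 ? -1 : 0;     /* skip unresolvable entries */
    if (!S_ISDIR(st.st_mode))
        return 0;
    for (i = 0; i < depth; i++)
        if (stack[i].dev == st.st_dev && stack[i].ino == st.st_ino)
            return 1;                   /* a match equals a loop */
    if (depth == MAX_DEPTH || (dir = opendir(path)) == NULL)
        return -1;
    stack[depth].dev = st.st_dev;
    stack[depth].ino = st.st_ino;
    while ((de = readdir(dir)) != NULL) {
        char child[PATH_MAX];

        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        if (snprintf(child, sizeof child, "%s/%s", path, de->d_name)
            >= (int)sizeof child)
            continue;                   /* name too long for the sketch */
        if ((n = walk(child, stack, depth + 1)) < 0) {
            closedir(dir);
            return -1;
        }
        loops += n;
    }
    closedir(dir);
    return loops;
}

int count_loops(const char *root)
{
    struct ancestor stack[MAX_DEPTH];

    assert(root != NULL);
    return walk(root, stack, 0);
}

/* Build scratch/a, optionally with scratch/a/up -> scratch, and count loops. */
int demo_loop_count(int with_loop)
{
    char root[] = "/tmp/loopdemoXXXXXX";
    char path[PATH_MAX];

    if (mkdtemp(root) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/a", root);
    if (mkdir(path, 0777) != 0)
        return -1;
    snprintf(path, sizeof path, "%s/a/up", root);
    if (with_loop && symlink(root, path) != 0)
        return -1;
    return count_loops(root);
}
```

The comparison uses only what stat() reports, so it does not depend on what any pwd-like call would print for the directory.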

This is not thoroughly thought out, I admit, so go ahead and shoot me down.
-- 
Jonathan Clark
[NAC,attmail]!mtune!jhc

An Englishman never enjoys himself except for some noble purpose.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/25/87)

In article <1101@mtune.ATT.COM> jhc@mtune.UUCP (Jonathan Clark) writes:
>... I realize that the kernel would then have to keep
>track of where you are in the file system, that's unfortunate.

The kernel has to do this anyway, in order to resolve relative
pathnames during an open().

hwe@beta.UUCP (Skip Egdorf) (06/26/87)

When I moved a group of users from a Multics to a brand new 11/70
Version 6 system, there was little trauma. The editors were similar,
the languages were different (PL/I vs C) but that was expected and
managed by all concerned.

The main confusion was with the different semantics of links.
You see, Multics used what are now called soft links, and everyone
complained how the natural, normal way links should work was broken in
UNIX. I had to explain several times why, when a common link was made
to a program by several users, and then the owner re-built the
program, the other users still got the old version. How un-natural
could you get????

(I think that I was the only one who did more than just compile
user code on the Multics. I missed, and still miss SO MANY of the
features of that environment for development... However, that is another
article, and none of the loss was felt by the other users except for
link semantics)

I finally just told the users that Multics links were Multics links
and Unix links were Unix links and that they were just DIFFERENT!
Either were a tool for getting a job done, and there was little use
in worrying about why the Multics orange didn't make good pies and cider,
and why the Unix apple produced very poor marmalade.

Now we have a system with both sorts of different and useful tools,
and the argument goes on in re-invented form.
My two cents worth is: Don't try to put hard-link semantics onto
symbolic links, and don't try to put symbolic-link semantics onto
hard links. They are both needed concepts. The cost of the additional
power is the increased semantics of the file system.

				Skip Egdorf
				hwe@lanl.gov

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/29/87)

In article <6837@beta.UUCP> hwe@beta.UUCP (Skip Egdorf) writes:
>My two cents worth is: Don't try to put hard-link semantics onto
>symbolic links, and don't try to put symbolic-link semantics onto
>hard links. They are both needed concepts.

Another evaluation might be: "Whenever two similar but different
ways of doing a task are implemented on the same system, semantic
problems occur where their domains overlap."

It seems that symlinks were an attempt to overcome the restriction
against linking across different mounted file systems.  Surely some
other approach fully compatible with hard links could have been
found.  If it had, then Korn's ".." interpretation would still be
operative for whatever approach might have been adopted, just as it
already is for the root of a mounted filesystem.

m5@bobkat.UUCP (Mike McNally ) (07/01/87)

In article <6026@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <1101@mtune.ATT.COM> jhc@mtune.UUCP (Jonathan Clark) writes:
>>... I realize that the kernel would then have to keep
>>track of where you are in the file system, that's unfortunate.
>
>The kernel has to do this anyway, in order to resolve relative
>pathnames during an open().

All the kernel needs to keep track of is the inode number of the current
working directory.  That's why the code for getwd() is more than just
"ask the kernel for the current wd path".  

Perhaps that's what Doug meant.
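
For readers who have not seen it, the traditional user-level approach can be sketched as below: climb through "..", and at each level scan the parent for the entry whose device and inode match the directory just left. This is an illustration written for this thread, not the 4.3BSD getwd() source; walk_up_getwd and agrees_with_getcwd are made-up names, and the sketch assumes every ancestor directory is readable.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Rebuild the absolute path of the working directory without asking the
 * kernel for a name. Returns 0 on success, -1 on failure.
 */
int walk_up_getwd(char *out, size_t size)
{
    char up[PATH_MAX] = ".";        /* ".", "./..", "./../..", ... */
    char result[PATH_MAX] = "";     /* path so far, built right to left */
    struct stat here, parent, st;

    assert(out != NULL && size > 0);
    if (stat(up, &here) != 0)
        return -1;
    for (;;) {
        char probe[PATH_MAX], tmp[PATH_MAX];
        struct dirent *de;
        DIR *dir;
        int found = 0;

        if (strlen(up) + 4 > sizeof up)
            return -1;
        strcat(up, "/..");
        if (stat(up, &parent) != 0)
            return -1;
        if (parent.st_dev == here.st_dev && parent.st_ino == here.st_ino)
            break;                  /* ".." is ourselves: this is the root */
        if ((dir = opendir(up)) == NULL)
            return -1;
        while (!found && (de = readdir(dir)) != NULL) {
            if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
                continue;
            if (snprintf(probe, sizeof probe, "%s/%s", up, de->d_name)
                >= (int)sizeof probe)
                continue;
            if (lstat(probe, &st) != 0 ||
                st.st_dev != here.st_dev || st.st_ino != here.st_ino)
                continue;
            if (snprintf(tmp, sizeof tmp, "/%s%s", de->d_name, result)
                >= (int)sizeof tmp) {
                closedir(dir);
                return -1;
            }
            strcpy(result, tmp);    /* prepend the name we just found */
            found = 1;
        }
        closedir(dir);
        if (!found)
            return -1;
        here = parent;
    }
    if (result[0] == '\0')
        strcpy(result, "/");
    if (strlen(result) + 1 > size)
        return -1;
    strcpy(out, result);
    return 0;
}

/* Cross-check against the C library: 1 if both agree, 0 if not, -1 on error. */
int agrees_with_getcwd(void)
{
    char mine[PATH_MAX], libc[PATH_MAX];

    if (walk_up_getwd(mine, sizeof mine) != 0 ||
        getcwd(libc, sizeof libc) == NULL)
        return -1;
    return strcmp(mine, libc) == 0;
}
```

The name it produces is always the physical one, because the only information used is inode numbers and directory entries.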




-- 
Mike McNally, mercifully employed at Digital Lynx ---
    Where Plano Road the Mighty Flood of Forest Lane doth meet,
    And Garland fair, whose perfumed air flows soft about my feet...
uucp: {texsun,killer,infotel}!pollux!bobkat!m5 (214) 238-7474

thorinn@diku.UUCP (Lars Henrik Mathiesen) (07/15/87)

In article <6035@brl-smoke.ARPA> gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>Surely some other approach fully compatible with hard links could have been
>found.

Excuse me, but how are symbolic links *to files* different from hard links?
For that matter, it seems to me that you'd get exactly the same problems if
you used hard links to directories. The only way to distinguish between
hard and symbolic links is lstat(2) (and an occasional ELOOP), isn't it?
(From this point of view it is a *feature* of symbolic links that their modes
aren't ever used.)
  The problem is that the use of symbolic links for directories is encouraged
(or at least not discouraged), which shows up the semantic problems much more.
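
The lstat(2) point can be illustrated with a small sketch. It assumes POSIX lstat(), link() and symlink(); the names link_kind and demo_links are invented for the example, and the demo leaves a scratch directory under /tmp and changes the working directory.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* 'l' for a symlink, 'd' for a directory, 'f' for anything else, '?' on error. */
char link_kind(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (lstat(path, &st) != 0)
        return '?';
    if (S_ISLNK(st.st_mode))
        return 'l';
    return S_ISDIR(st.st_mode) ? 'd' : 'f';
}

/*
 * Create a file, a hard link to it and a symbolic link to it in a fresh
 * scratch directory. Fills kinds[] with link_kind() of each name and
 * returns 1 if stat() reports the same inode for all three, 0 if not,
 * -1 on error.
 */
int demo_links(char kinds[4])
{
    char dir[] = "/tmp/lstatdemoXXXXXX";
    struct stat a, b, c;
    int fd;

    assert(kinds != NULL);
    if (mkdtemp(dir) == NULL || chdir(dir) != 0)
        return -1;
    if ((fd = open("file", O_CREAT | O_WRONLY, 0666)) < 0)
        return -1;
    close(fd);
    if (link("file", "hard") != 0 || symlink("file", "soft") != 0)
        return -1;
    kinds[0] = link_kind("file");
    kinds[1] = link_kind("hard");
    kinds[2] = link_kind("soft");
    kinds[3] = '\0';
    if (stat("file", &a) != 0 || stat("hard", &b) != 0 || stat("soft", &c) != 0)
        return -1;
    return a.st_ino == b.st_ino && b.st_ino == c.st_ino;
}
```

Through stat() the three names are indistinguishable; only lstat() shows that one of them is a symbolic link.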

But to solve the very real problem that csh(1) does not take symbolic links
into consideration when simplifying directory pathnames, I've written the
enclosed program for 4.3BSD. It is intended to be used with aliases as follows:
	alias cd    'chdir `cdfix    $cwd \!*`'
	alias pushd 'pushd `pushdfix $cwd \!*`'
and includes special case code to supply the default arguments if none are
given (which is why it must have two links). It runs faster than my old
solution (which was to do a 'cd `pwd`' after each chdir or pushd).
--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark		..mcvax!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.

-------- cut here (cdfix.c AND pushdfix.c) ----------
/*
 * This is the source for both cdfix AND pushdfix, which should be hard links
 * to the same file. This goes for the source too.
 */

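#include <stdio.h>
#include <stdlib.h>	/* exit, getenv */
#include <string.h>	/* strlen, strcpy */
#include <strings.h>	/* index */
#include <unistd.h>	/* readlink */
#ifdef DOTILDE
#include <pwd.h>
#endif
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>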

#include <errno.h>
char *errpath, *progname;
char usage[] = "Usage: %s old-dir-path change-dir-path\n";

char *getenv();

panic(file, syscall, error)
    char *file, *syscall;
{
    char buf[BUFSIZ];

    sprintf(buf, "%s: %s: %s", progname, syscall, file);
    if (error) {
	/* perror() takes one argument and reports the current errno */
	errno = error;
	perror(buf);
    } else
	fprintf(stderr, "%s\n", buf);
    printf("%s\n", errpath);
    exit(0);
}

#define LIM(x) (&(x)[MAXPATHLEN])

main(argc, argv)
    char **argv;
{
    static char head[MAXPATHLEN + 1], tail[MAXPATHLEN + 1];
    register char *headend, *tailbeg, *cp;
    struct stat stbuf;
    int loop = 0;
#ifdef DOTILDE
    char *name;
    struct passwd *pw;
#endif

    progname = argv[0];
    errpath = argv[1];
    if (argc != 3) {
	if (argc != 2)
	    fprintf(stderr, usage, progname);
	/*
	 * If this program is used to construct the argument
	 * to chdir or pushd in the csh, an empty argument is
	 * not equivalent to no argument; so we supply one that
	 * is equivalent. Assumes that two links are used, so
	 * that program name starts with 'p' for use with pushd.
	 */
	printf(*progname == 'p' ? "+1\n" : "~\n");
	exit(0);
    }

    /*
     * Check for the easy case.
     * This also lets +n arguments alone for pushd.
     */
    if (argv[2][0] != '.' && !index(argv[2], '/')) {
	/* Never use the argument as a format string; end the line like the other outputs */
	printf("%s\n", argv[2]);
	exit(0);
    }

    /*
     * Set up head and tail, assuming that the old path is OK.
     * Head contains a "canonical" path (no ., .. or superfluous /).
     * Head is normally NOT null-terminated!
     * Tail contains an non-canonical, absolute or relative path
     * to append on to head while keeping head canonical.
     */
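    /* Both buffers hold MAXPATHLEN characters plus the terminating '\0' */
    if (strlen(argv[1]) > MAXPATHLEN || strlen(argv[2]) > MAXPATHLEN)
	panic(argv[2], "path length exceeded", 0);
    headend = head + strlen(argv[1]);
    strcpy(head, argv[1]);
    tailbeg = LIM(tail) - strlen(argv[2]);
    strcpy(tailbeg, argv[2]);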

#ifdef DOTILDE
    /*
     * Attempt to get a home directory if necessary.
     * Normally this will be done by csh itself
     */
    if (tailbeg[0] == '~') {
	name = ++tailbeg;
	while (*tailbeg && *tailbeg != '/')
	    tailbeg++;
	while (*tailbeg == '/')
	    *tailbeg++ = '\0';
	if (*name == '\0')
	    cp = getenv("HOME");
	else
	    cp = (pw = getpwnam(name)) ? pw->pw_dir : NULL;
	if (cp) {
	    strcpy(head, cp);
	    headend = head + strlen(head);
	} else
	    panic(name - 1, "cannot substitute", 0);
    }
#endif

    while (tailbeg < LIM(tail)) {

	/* Consistency check */
	if (tailbeg[0] == '\0')
	    panic(head, "botch", 0);
	if (tailbeg[0] == '/') {
	    /* Absolute pathname */
	    head[0] = '/';
	    headend = head + 1;
	    tailbeg++;
	} else if (tailbeg[0] == '.' &&
		   (tailbeg[1] == '\0' || tailbeg[1] == '/'))
	    /* .  - just skip it */
	    tailbeg++;
	else if (tailbeg[0] == '.' && tailbeg[1] == '.' &&
		 (tailbeg[2] == '\0' || tailbeg[2] == '/')) {
	    /* .. */
	    if (headend == head + 1)
		/* This was /.. - skip it */
		tailbeg += 2;
	    else {
		/* Make head into string, then back up over last element */
		*headend = '\0';
		while (headend > &head[1] && *--headend != '/')
		    /* void */ ;
		/* See if head was a symbolic link */
		if (lstat(head, &stbuf) != 0)
		    panic(head, "lstat", errno);
		if ((stbuf.st_mode & S_IFMT) == S_IFLNK) {
		    /* Prepend the symbolic link to tail */
		    *--tailbeg = '/';
		    if ((tailbeg -= stbuf.st_size) < tail)
			panic(head, "tail length exceeded", 0);
		    if (++loop > 20)
			panic(head, "loop count exceeded", 0);
		    if (readlink(head, tailbeg, stbuf.st_size) != stbuf.st_size)
			panic(head, "readlink", errno);
		    /* Go back and check for absolute vs. relative */
		    continue;
		} else
		    /* Skip .. */
		    tailbeg += 2;
	    }
	} else {
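	    /* copy element to head, checking the bound before each store */
	    if (headend > &head[1]) {
		if (headend >= LIM(head))
		    panic(argv[2], "path length exceeded", 0);
		*headend++ = '/';
	    }
	    while (*tailbeg && *tailbeg != '/') {
		if (headend >= LIM(head))
		    panic(argv[2], "path length exceeded", 0);
		*headend++ = *tailbeg++;
	    }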
	}
	/* Remove redundant slashes */
	while(*tailbeg == '/')
	    tailbeg++;
    }
    *headend = '\0';
    printf("%s\n", head);
    exit(0);
}

rbj@icst-cmr.arpa (Root Boy Jim) (08/10/87)

   From: Doug Gwyn  <gwyn@brl-smoke.arpa>
   In article <7956@brl-adm.ARPA> rbj@icst-cmr.arpa (Root Boy Jim) writes:
   >Dave Korn writes:
   >   About 50% of the respondents agreed with me completely.  Another 30%
   >The best way to do something wrong is to take a poll (especially if
   >it includes naive users) and implement the result.

   The point of the poll was to garner sufficient "public" support to get
   the internal AT&T decision makers to seriously consider Korn's proposal.
   I'm sure Korn was not relying on the poll to help him determine the
   TECHNICAL viability of his proposal, but rather the POLITICAL viability.

You didn't address the issue. I know what his point was. BTW, do you
really think TPC will listen to the net's desires? It'd be a first.

   >   2.  The directory of .. must be independent of the way you got there.
   >	   Note that this has already been broken by
   >	   a.	Remote mounts in RFS
   >Please explain. I hope that you are not using RFS's (possible) botches
   >to justify botching something else. BTW, NFS seems to work correctly.

   Cottrell, if you don't know what you're talking about then shut up.
   RFS is doing exactly the right thing in this case.

My lack of specific knowledge of an obscure remote file system does not
invalidate my general argument. I was asking him to justify his statement.

   BTW, NFS is hardly a model for "correct" semantics for UNIX file systems!

Great, you just insulted Sun.

   >Good point. However, before you guys tackle something complex like
   >symbolic links, perhaps you had better finish the job on regular file
   >systems and implement `mkdir', `rmdir', and `rename' as system calls

   Mkdir & rmdir ARE system calls in the current version of UNIX.

Well congratulations! It only took them how long? We only have SVr2v2
here, and only get it as an excuse to run Berkeley. Still, I do look
at the docs from time to time, and load a few useful programs from it.

   It is
   not especially advantageous to do this for ordinary file systems, but
   it certainly is for a distributed file system.  Rename is more difficult
   and less urgent, but it too will probably move into the kernel soon.

They are useful because they make directories more like files. No bogus
mvdir or mvtree (or whatever) commands. And no races on rename.

   Your patronizing tone is totally inappropriate.

Well, excuse me. I didn't get any response with a polite, detailed reply.

BTW, aren't we the pot calling the kettle black? I interrupt this
stream of consciousness to quote another article of yours:

   Consider:  During the early stages of porting my software to a new
   hostile (i.e. 4BSD) system, I use a shell script named "cc" to get

Hostile? `cc' is probably one of the more standardized commands.

   things compiled right.  No way am I going to edit a zillion Makefiles
   to redefine "cc" to "cc.sh", then later change them back.

I agree.

   (Admittedly
   augmented "make" provides a way to accomplish this through use of an
   environment variable,

Admittedly, so does vanilla make: `make CC=cc.sh target'

   but the point is that the name of a command
   should encode only the command's function, not information about its
   type, creator, size, or other irrelevancies.)

Such as the method of execution. You see? I can be agreeable too.

Okay, now back to the previous article:

   I know several AT&T
   software developers (including Korn) who can run rings around you.

Good taste is just as important as sheer prowess. Perhaps more.

   Their biggest problem lies in the bureaucracy that lies between their
   prototypes and the official released product.  To some degree this does
   make for higher quality (virtually every recent new feature in UNIX
   System V has had an improved design over the original prototype), but
   it does delay or even discourage the appearance of new features.

No argument there.

I will restate my position for anyone who missed it.

1) The idea of following `cd /a/b' with `cd ..' and ending up in `/a'
   is an attractive one, no doubt. There are many ways of doing this.
   One is to do it in the shell, possibly with a switch enabling it.
   In fact, csh's inability to handle its `cwd' variable across
   symbolic links can be used to advantage: alias cdup 'cd $cwd:h'.
2) Alternatively, one could argue to use the right tool. Pushd and
   popd were invented for retracing one's steps. If all you have is
   a hammer, everything looks like a nail.
3) However, there are real problems with attempting to reverse a
   symlink *in the kernel*. Foremost is that different include files
   would be referenced *depending on what path someone took to get
   to the source directory*! This is unacceptable in my mind. It is
   worse to reference the wrong file than to reference no file at all.
4) Robert Elz has stated that `attempting to reverse a symlink is
   clearly absurd'. I am in good company.
5) Symlinks are not only a means of linking across file systems, but
   have other uses as well, namely an easy way of making links to
   a directory and links to files which are renamed when edited.

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688

P.S. As to why I keep calling AT&T TPC, see Barry Shein's articles for
more elegant prose than I have the patience to muster.

However, I will offer an example of corporate braindamage: Touch Tone.
This is clearly technologically superior, easier and cheaper to process
than the old pulse tone dialing. Yet instead of standardizing on this
feature and improving the quality of life for everyone, certain powers
saw this as a way to extract a few extra dollars from the public.
And now they want us to trust in ISDN? Sheesh!

The preceding opinions are mine and not those of my employer,
who is prevented by law from having opinions.

ekrell@hector..UUCP (Eduardo Krell) (08/12/87)

I thought we went thru this a while ago ...

In article <8731@brl-adm.ARPA> rbj@icst-cmr.arpa writes:

(about RFS)

>My lack of specific knowledge of an obscure remote file system does not
>invalidate my general argument. I was asking him to justify his statement.

If you lack specific knowledge, then what authority do you have to call RFS
"an obscure remote file system"?. Or is this just System V or AT&T bashing?

>   BTW, NFS is hardly a model for "correct" semantics for UNIX file systems!
>
>Great, you just insulted Sun.

What's your point?. If you are going to have a distributed file system,
shouldn't you expect the same semantics independently of where the files
are? (aka "Location Transparency").

>1) The idea of following `cd /a/b' with `cd ..' and ending up in `/a'
>   is an attractive one, no doubt. There are many ways of doing this.
>   One is to do it in the shell, possibly with a switch enabling it.
>   In fact, csh's inability to handle its `cwd' variable across
>   symbolic links can be used to advantage: alias cdup 'cd $cwd:h'.

Doing it in the shell fixes the "cd /a/b; cd .." problem. But shouldn't
"cat ../foo" be the same as "cd ..; cat foo" ?. Now, how are you going
to do THAT in the shell?.
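
A minimal sketch of the case in question, assuming a POSIX system with
BSD-style symlinks and Python's standard library (neither the language nor
the helper name read_via_dotdot comes from this thread). It builds the
a/b -> c/d layout used in the discussion under a scratch directory and
checks which foo the kernel hands back for "../foo":

```python
import os
import tempfile


def read_via_dotdot(root):
    """Build a/b -> c/d under root, chdir into a/b, then read "../foo"."""
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "c", "d"))
    os.symlink(os.path.join(root, "c", "d"), os.path.join(root, "a", "b"))
    # Put a distinguishable foo in both candidate parents.
    for parent in ("a", "c"):
        with open(os.path.join(root, parent, "foo"), "w") as f:
            f.write("foo in " + parent)
    saved = os.getcwd()
    try:
        os.chdir(os.path.join(root, "a", "b"))       # like `cd /a/b`
        with open(os.path.join("..", "foo")) as f:   # like `cat ../foo`
            return f.read()
    finally:
        os.chdir(saved)


with tempfile.TemporaryDirectory() as tmp:
    result = read_via_dotdot(os.path.realpath(tmp))
print(result)
```

Where the kernel follows the on-disk ".." entry, result holds the contents
of c/foo rather than a/foo. A shell that rewrites only its own "cd .."
never sees this open() call, which is the inconsistency being pointed at.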

>3) However, there are real problems with attempting to reverse a
>   symlink *in the kernel*. Foremost is that different include files
>   would be referenced *depending on what path someone took to get
>   to the source directory*! This is unacceptable in my mind. It is
>   worse to reference the wrong file than to reference no file at all.

This has already been discussed. The solution is to make all the
subdirectories symbolic links as well.

>5) Symlinks are not only a means of linking across file systems, but
>   have other uses as well, namely an easy way of making links to
>   a directory and links to files which are renamed when edited.

But how does this relate to the original problem?
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

kre@munnari.UUCP (08/14/87)

I would have thought that this topic would have died a natural death by now,
but...

In article <2789@ulysses.homer.nj.att.com>,
ekrell@hector..UUCP (Eduardo Krell) writes:
> Doing it in the shell fixes the "cd /a/b; cd .." problem. But shouldn't
> "cat ../foo" be the same as "cd ..; cat foo" ?. Now, how are you going
> to do THAT in the shell?.

Perfectly true, that should work the same both ways.  There are just two
options I can see for making this (and other sensible expected semantics)
work in a consistent, repeatable, fashion.

Either adopt Doug Gwyn's suggestion, and replace the concept of a current
directory with the vms style path prefix - but as he said, that wouldn't
be unix any more.

Or leave the symlink semantics as they are in BSD (as Dennis designed
them) and stop pretending that they are links that happen to be able
to point across filesystems or to directories, and regard them as
fully fledged entities.  ("Leave them" applies to the path lookup
semantics wrt ".." etc.  As I have said before, I would modify some
of the other semantics a little, but all that is just gloss).

There's no intermediate representation that will work correctly (for
any reasonable definition of correctly) all the time.  And it doesn't
matter whether the faked semantics are implemented in the shell, or
in the kernel.  It doesn't work *consistently all the time*.

I'm fully aware of existing shell implementations (which are the easier
of the two to get close to right) and in normal everyday use the defects
quite probably are never going to appear (apart from the type mentioned
above).  The problem that breaks these schemes can be demonstrated
without stepping outside the environment where the kludge is done,
ie: a shell implementation can be broken by a shell script that uses
nothing that isn't built into the shell, plus overhead glue .. which
might be built into some shells (test, echo, etc).

It shouldn't be surprising to people that these things don't work,
kludges rarely do.  I'm going to leave it to people's imagination
to work out what the problem is (all you need to do is consider the
necessary features of any implementation, then use a brute force
method to bust that .. it's not subtle).

kre

ekrell@hector..UUCP (Eduardo Krell) (08/16/87)

In article <1781@munnari.oz> kre@munnari.UUCP writes:

I wrote:

>> Doing it in the shell fixes the "cd /a/b; cd .." problem. But shouldn't
>> "cat ../foo" be the same as "cd ..; cat foo" ?. Now, how are you going
>> to do THAT in the shell?.
>
>Perfectly true, that should work the same both ways.  There are just two
>options I can see for making this (and other sensible expected semantics)
>work in a consistent, repeatable, fashion.
>
...
>
>Or leave the symlink semantics as they are in BSD (as Dennis designed
>them) and stop pretending that they are links that happen to be able
>to point across filesystems or to directories, and regard them as
>fully fledged entities.  ("Leave them" applies to the path lookup
>semantics wrt ".." etc.  As I have said before, I would modify some
>of the other semantics a little, but all that is just gloss).

But this is not a solution to the problem above. You can't have it both
ways. If you fix the "cd .." problem in the shell (like ksh did) and
if you expect "cat ../foo" to be the same as "cd ..; cat foo", then
you HAVE to do it in the kernel. There's no other way.

>There's no intermediate representation that will work correctly (for
>any reasonable definition of correctly) all the time.  And it doesn't
>matter whether the faked semantics are implemented in the shell, or
>in the kernel.  It doesn't work *consistently all the time*.

It depends on your definition of consistency. The way BSD did symbolic
links, you can't make them transparent. That is, "cd .." will move you
to the physical parent instead of the logical one. You can fix that in
the shell, but then the "cat ../foo" and "cd ..; cat foo" problem arises.
A fix in the kernel can solve these 2 problems, but other problems appear.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

allbery@ncoast.UUCP (08/16/87)

As quoted from <2789@ulysses.homer.nj.att.com> by ekrell@hector..UUCP (Eduardo Krell):
+---------------
| I thought we went thru this a while ago ...
| 
| In article <8731@brl-adm.ARPA> rbj@icst-cmr.arpa writes:
| 
| (about RFS)
| 
| >My lack of specific knowledge of an obscure remote file system does not
| >invalidate my general argument. I was asking him to justify his statement.
| 
| If you lack specific knowledge, then what authority do you have to call RFS
| "an obscure remote file system"?. Or is this just System V or AT&T bashing?
+---------------

Umm, there are two "RFS"'es.  One is a remote file system for BSD that was
posted to the then mod.sources back when NFS wasn't standard yet.  The other
is the SVR3 remote file system.  I wonder if the poster knows which was being
referenced?
-- 
 Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc
  {{harvard,mit-eddie}!necntc,well!hoptoad,sun!mandrill!hal}!ncoast!allbery
ARPA: necntc!ncoast!allbery@harvard.harvard.edu  Fido: 157/502  MCI: BALLBERY
   <<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>
** Site "cwruecmp" is changing its name to "mandrill".  Please re-address **
*** all mail to ncoast to pass through "mandrill" instead of "cwruecmp". ***

ekrell@hector..UUCP (Eduardo Krell) (08/18/87)

In article <4185@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:

|Umm, there are two "RFS"'es.  One is a remote file system for BSD that was
|posted to the then mod.sources back when NFS wasn't standard yet.  The other
|is the SVR3 remote file system.  I wonder if the poster knows which was being
|referenced?

If you track this discussion back to the original article by Dave Korn,
you'll see that the reference to RFS was to the one in SVR3.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

kre@munnari.oz (Robert Elz) (08/20/87)

In article <2809@ulysses.homer.nj.att.com>,
ekrell@hector..UUCP (Eduardo Krell) writes [& quotes me, the middle line]:
> 
> >> Doing it in the shell fixes the "cd /a/b; cd .." problem. But shouldn't
> >> "cat ../foo" be the same as "cd ..; cat foo" ?. Now, how are you going
> >> to do THAT in the shell?.
> >
> >Or leave the symlink semantics as they are in BSD ...
> 
> But this is not a solution to the problem above.

It depends what you define the "problem" to be.  The problem you
posed initially I see as the inconsistency in the handling of ".."
in the shell as a special case (since ".." would mean different
things in "cd" commands and in others).

Leaving the semantics as they are in BSD does indeed fix this
problem, ".." is entirely consistent (unless you've added a shell
that breaks things).

I don't see that

	$ cd /a/b
	$ cd ..
	$ pwd
	/c

is a problem at all, and it doesn't need to be fixed, especially
it isn't worth breaking anything, or leaving inconsistencies to fix it.

> It depends on your definition of consistency. The way BSD did symbolic
> links, you can't make them transparent.

Fine.  That's not a problem.  As I tried to say, the solution is to
simply stop pretending that a symlink to a directory is somehow
equivalent to the directory itself, and see it as being an object
with its own existence, semantics, and usefulness.

kre

ekrell@hector..UUCP (Eduardo Krell) (08/21/87)

In article <1788@munnari.oz> kre@munnari.UUCP writes:

>Leaving the semantics as they are in BSD does indeed fix this
>problem, ".." is entirely consistent (unless you've added a shell
>that breaks things).

What you're saying is that there is no problem and thus we disagree.

>I don't see that
>
>	$ cd /a/b
>	$ cd ..
>	$ pwd
>	/c
>
>is a problem at all, and it doesn't need to be fixed, especially
>it isn't worth breaking anything, or leaving inconsistencies to fix it.

I think it is a problem. How many system administrators out there have had
to explain this to new Unix users?. Even not-so-naive users are bothered
by this. I used to hit this problem every time I did "cd /usr/include/sys"
and then "cd ..". This is an operation that I do at least daily. Back when
I used the C-shell, I would end up in /sys or someplace like that.

Now that I use ksh, I end up exactly where I want : /usr/include.

If you take a poll asking people where they would expect to be after
doing "cd /usr/include/sys; cd ..", I would bet a large majority would
answer "/usr/include". It's intuitive and it makes sense. If the system
behaves in a way that's different from what most people expect, then it's
broken as far as I'm concerned.

>Fine.  That's not a problem.  As I tried to say, the solution is to
>simply stop pretending that a symlink to a directory is somehow
>equivalent to the directory itself, and see it as being an object
>with its own existence, semantics, and usefulness.

When I do a "cd /a/b" and end up in a different directory, say, /c/d,
then the symbolic link IS equivalent to the directory it points to.
Once I "cd" to that directory, I get to the same i-node as /c/d.
That qualifies as being equivalent, I would say.
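
The two readings of "equivalent" can be seen side by side in a short
sketch, assuming a POSIX system and Python's standard library (the helper
name compare_link_and_target is made up for illustration). stat() follows
the link and reports the target's i-node; lstat() reports the link as an
object of its own:

```python
import os
import stat
import tempfile


def compare_link_and_target(root):
    """Create a/b -> c/d under root and compare what stat and lstat report."""
    target = os.path.join(root, "c", "d")
    link = os.path.join(root, "a", "b")
    os.makedirs(target)
    os.makedirs(os.path.join(root, "a"))
    os.symlink(target, link)

    followed = os.stat(link)      # follows the symlink to the directory
    unfollowed = os.lstat(link)   # describes the symlink itself
    actual = os.stat(target)

    same_inode = (followed.st_dev, followed.st_ino) == (actual.st_dev, actual.st_ino)
    link_is_own_object = stat.S_ISLNK(unfollowed.st_mode)
    return same_inode, link_is_own_object


with tempfile.TemporaryDirectory() as tmp:
    same_inode, link_is_own_object = compare_link_and_target(tmp)
print(same_inode, link_is_own_object)
```

Both values come out true: once followed, the link reaches the same i-node
as c/d, yet the link itself remains a distinct file-system object.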
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

kre@munnari.oz (Robert Elz) (08/23/87)

I suspect that 99% of the population have gotten bored with this by now,
so this is going to be the last I say on the subject, but, one last time...

In article <2838@ulysses.homer.nj.att.com>,
ekrell@hector..UUCP (Eduardo Krell) writes:
> What you're saying is that there is no problem and thus we disagree.

No, not quite, not "no problem", just a different problem.  This
is the problem...

> How many system administrators out there have had
> to explain this to new Unix users?

The problem isn't that it needs to be explained, the way the system
is set up at the minute, with everything attempting to give the
impression that symlinks are invisible objects, explanations are
going to be required.

The problem is that people conclude from this that the right thing
to do is change the semantics into something just about impossible
to explain so that it's necessary to explain less often.

> If you take a poll asking people where they would expect to be after
> doing "cd /usr/include/sys; cd ..", I would bet a large majority would
> answer "/usr/include".

Without more knowledge of what's going on, and with a background of
unix experience, you're probably right.  If I polled a bunch of people
I pulled off the street, I doubt that many would answer at all.  If
those people were educated about unix with knowledge of symlinks from
the start, and with the knowledge that they aren't invisible, then the
answer to a later poll might be different than you expect.

If you took a poll of people asking whether file "x" should change
when you alter file "y", how many do you think would say yes?  Of
course we all know about links already, but they are confusing, and
most new users usually need to have them explained.  Let's get rid
of links...

> If the system
> behaves in a way that's different from what most people expect, then it's
> broken as far as I'm concerned.

All kinds of real things don't behave the way most people expect.
They're not all "broken".

> When I do a "cd /a/b" and end up in a different directory, say, /c/d,
> then the symbolic link IS equivalent to the directory it points to.
> Once I "cd" to that directory, I get to the same i-node as /c/d.
> That qualifies as being equivalent, I would say.

If they're equivalent, then "cd .." after either should have the
same effect, otherwise they can't possibly be equivalent, surely?

If we're there in /c/d (a path without symlinks, as obtained from /bin/pwd)
and you do a "cd ..", where should we go?  I say wherever ".." points,
usually "/c", in all cases.  Simple, and consistent.  It does require
some education about the properties of symlinks.

You say "/a" if the path used initially was "/a/b".  Seems simple
too?  But is it really?  Do you want to be in "/a" if "b" was ".." ?
Probably not, so now the rule is "/a" unless "b" is "..", and in
that case something else, depending on what "a" was, and potentially
on previous history.  How about where "b" is "."?  There look
to be a bunch of special cases here, and it is starting to
get a bit messy, isn't it?

And none of this overcomes the problems with actually implementing it
and making it work properly, assuming that you can actually produce
a nice clean definition of what you're trying to implement.

Let's just leave symlinks basically alone.  I don't mind if people
want to use shells that have magic cd commands that attempt to guess
what you intended to do .. if it gets it right most of the time for you,
then great, being able to use any shell that suits you is one of the
advantages of unix.  However, this all started with a proposal to move
these semantics into the kernel, forcing them on everyone, assuming
that an implementation is possible.  That would be a disaster.

kre

ekrell@hector..UUCP (Eduardo Krell) (08/25/87)

This is really getting annoying ...

In article <1793@munnari.oz> kre@munnari.UUCP writes:

>> When I do a "cd /a/b" and end up in a different directory, say, /c/d,
>> then the symbolic link IS equivalent to the directory it points to.
>> Once I "cd" to that directory, I get to the same i-node as /c/d.
>> That qualifies as being equivalent, I would say.
>
>If they're equivalent, then "cd .." after either should have the
>same effect, otherwise they can't possibly be equivalent, surely?

Only if you see ".." as a physical, hard-coded pointer.

>If we're there in /c/d (a path without symlinks, as obtained from /bin/pwd)
>and you do a "cd ..", where should we go?  I say wherever ".." points,
>usually "/c", in all cases.  Simple, and consistent.  It does require
>some education about the properties of symlinks.

This argument, that ".." is a physical link (rather than a logical one)
falls apart at mount points, where the parent directory and where ".." points
are different. It also happens at remote file system mount points, for
the same reason. So it's not so "simple and consistent". It now requires
some education about file systems, mount points, and distributed file
systems. The list seems to keep growing.

>And none of this overcomes the problems with actually implementing it
>and making it work properly, assuming that you can actually produce
>a nice clean definition of what you're trying to implement.

This is easy. The intention is to make ".." behave as a logical operator.
This means that

1: if I do "cd /usr/include/sys" and then "cd ..", I end up in /usr/include
   no matter what.

2: if I find a ".." in a pathname, it refers to the logical parent,
   not the physical one. Thus, while in /usr/include/sys, "ls .." will produce
   a directory listing of /usr/include.

The semantics are very clean and simple.
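
As a rough illustration of rules 1: and 2:, here is a minimal sketch of
".." treated as a purely textual operator, assuming Python's posixpath
module. The class LogicalCwd and its methods are hypothetical names, not
taken from ksh or from any kernel:

```python
import posixpath


class LogicalCwd:
    """Tracks a working directory purely as a string."""

    def __init__(self, start="/"):
        self.path = posixpath.normpath(start)

    def resolve(self, arg):
        """Rewrite a path argument such as "../foo" against the logical cwd."""
        # join() discards self.path when arg is absolute; normpath() then
        # cancels each ".." against the component before it, textually.
        return posixpath.normpath(posixpath.join(self.path, arg))

    def cd(self, arg):
        """Rule 1: move according to the path as typed, not the disk layout."""
        self.path = self.resolve(arg)
        return self.path


cwd = LogicalCwd()
cwd.cd("/usr/include/sys")
listing_target = cwd.resolve("..")    # rule 2: what `ls ..` would refer to
after_cd = cwd.cd("..")               # rule 1: where `cd ..` ends up
at_root = LogicalCwd("/").cd("..")    # ".." at the root stays at the root
print(listing_target, after_cd, at_root)
```

A shell can apply cd() to its own builtin, but it has no way to apply
resolve() to every pathname that other programs hand to the kernel.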

>However, this all started with a proposal to move
>these semantics into the kernel, forcing them on everyone, assuming
>that an implementation is possible.  That would be a disaster.

You can't achieve 2: above unless it's done in the kernel.

An implementation is not only possible, it already exists. I wouldn't be
defending this if I didn't have the opportunity to use it and test it and
get the feeling as to whether it's the right thing to do or not.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

chris@mimsy.UUCP (Chris Torek) (08/25/87)

>In article <1793@munnari.oz> kre@munnari.UUCP writes:
>>If we're there in /c/d (a path without symlinks, as obtained from /bin/pwd)
>>and you do a "cd ..", where should we go?  I say wherever ".." points,
>>usually "/c", in all cases.  Simple, and consistent.  It does require
>>some education about the properties of symlinks.

In article <2854@ulysses.homer.nj.att.com> ekrell@hector.UUCP
(Eduardo Krell) writes:
>This argument, that ".." is a physical link (rather than a logical one)
>falls apart at mount points, where the parent directory and where ".."
>points are different.  It also happens at remote file system mount
>points, for the same reason.

Mount points are (1) required to be on a leaf and (2) mount the root
of a tree.  (That (1) is enforced by hiding anything that is under
the mount point is irrelevant.)  This means that mount points leave
the file system tree-structured.

The *raison d'etre* for symbolic links is that sometimes a tree
structure is insufficient.  It should follow that they do not behave
like trees.

If you wish to treat all path names as strings before attempting to
apply them to the file system itself, and resolve `..' as `up one
level', we can discard the entire directory structure of Unix itself.
There is no need for `.' and `..' directory entries.  These become
magic strings.  There is no need for the file system to be implemented
as a directed graph (although it may still be convenient).

It may be convenient, but it does not feel like Unix.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

ekrell@hector..UUCP (Eduardo Krell) (08/26/87)

In article <1362@bloom-beacon.MIT.EDU> tytso@thor.UUCP (Theodore Y. Ts'o) writes:

>Part of the Unix way is to be as flexible as possible.  I HOPE this is 
>optional (turned on with a system call, or some such), or are you,
>as a religious fanatic, going to force your way on everyone?

It could be made optional. I have no problems with that. It could be set up
on a per-user basis.

>The above paragraph assumes that someone at ATT is pushing this
>interpretation into SYS V.  (Why is it that mostly ATT posters think
>this is a good idea?)  If you're describing a purely local hack, then
>I hope it stays that way.

I think symbolic links are very useful, but I believe the way they were
implemented in BSD is broken. I believe that if I do "cd /usr/include/sys"
and then "cd ..", I should end up in /usr/include (without the help of
a smart shell).
This is not a purely local hack. The behavior described above already exists
in ksh, which is quite popular. If symbolic links will be added to the
official System V, then I should think they should follow our proposal.

>Example: /mnt/paris is a link to /.
>When you boot up in single user mode, before any history is established,
>WHERE DOES .. POINT TO?  Since you don't think .. is a physical
>pointer, the answer '/' is going to require a lot of explaining.

Not so. When you "start up", you're in "/". When in "/", "cd .." will
keep you in "/". Now, if you do "cd /mnt/paris" and then "cd ..",
you'll end up in /mnt.
You can think of it as if there's an implicit "cd /" when the system boots.

> And
>if the kernel flips a coin, I'll let you explain to the user why he
>typed cd .. from / and ended up in /mnt.

there's no coin flipping. What you get when you type "cd .." is backing
up to the logical parent directory, the one you used to get to where you
are. In your example :

cd /          ; cd ..      => /
cd /mnt/paris ; cd ..      => /mnt
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

ekrell@hector..UUCP (Eduardo Krell) (08/26/87)

In article <8137@mimsy.UUCP> chris@mimsy.UUCP writes:

>Mount points are (1) required to be on a leaf and (2) mount the root
>of a tree.  (That (1) is enforced by hiding anything that is under
>the mount point is irrelevant.)  This means that mount points leave
>the file system tree-structured.

The point I was making is that .. is already being treated in a special
way in the kernel. You don't always get the i-node pointed by the .. entry
in the directory.

>If you wish to treat all path names as strings before attempting to
>apply them to the file system itself, and resolve `..' as `up one
>level'

But isn't this EXACTLY what's done when the ".." is at a mount point?.
The .. entry at the root of the mounted file system points to the root
directory of the file system (i-node 2), yet when you "cd ..", you get
to a different place.
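
The kernel's special handling of ".." at a mount point is visible from user
space. The sketch below assumes a POSIX system and Python's standard
library; looks_like_mount_point is a hypothetical helper that applies the
comparison Python documents for os.path.ismount: a directory and its ".."
sit on different devices, or are the very same i-node (the root).

```python
import os
import tempfile


def looks_like_mount_point(directory):
    """Compare a directory with what the kernel returns for its ".."."""
    here = os.stat(directory)
    parent = os.stat(os.path.join(directory, ".."))
    if here.st_dev != parent.st_dev:
        # ".." crossed onto another file system, so the kernel did not
        # simply follow the ".." entry stored in this directory.
        return True
    # Same device and same i-node: ".." led back to the directory itself,
    # which is what happens at "/".
    return here.st_ino == parent.st_ino


root_is_mount = looks_like_mount_point("/")
with tempfile.TemporaryDirectory() as tmp:
    sub = os.path.join(tmp, "sub")
    os.mkdir(sub)
    plain_dir_is_mount = looks_like_mount_point(sub)
print(root_is_mount, plain_dir_is_mount)
```

This is only a heuristic: it reports what the kernel answers for "..", not
what is stored in the on-disk directory entry.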

>There is no need for `.' and `..' directory entries.  These become
>magic strings.  There is no need for the file system to be implemented
>as a directed graph (although it may still be convenient).

I believe "." is already treated specially by the kernel in namei().
That is, no search is made in the directory for ".". It just returns
u.u_cdir. It already is a magic string.

I have seen many programs that use canonical pathnames. They get rid of
both "." and ".."s before resolving the pathname. They currently break
because of symbolic links.
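
A minimal sketch of how textual canonicalization and the file system can
disagree, assuming a POSIX system and Python's standard library. The
directory names follow the a/b -> c/d example used earlier in this thread,
and canonical_forms is a made-up helper:

```python
import os
import tempfile


def canonical_forms(root):
    """Canonicalize root/a/b/../foo textually and physically."""
    os.makedirs(os.path.join(root, "a"))
    os.makedirs(os.path.join(root, "c", "d"))
    os.symlink(os.path.join(root, "c", "d"), os.path.join(root, "a", "b"))

    name = os.path.join(root, "a", "b", "..", "foo")
    textual = os.path.normpath(name)    # string surgery only, no disk access
    physical = os.path.realpath(name)   # follows the symlink, then ".."
    return os.path.relpath(textual, root), os.path.relpath(physical, root)


with tempfile.TemporaryDirectory() as tmp:
    textual, physical = canonical_forms(os.path.realpath(tmp))
print(textual, physical)
```

The textual form names a/foo while the physical form names c/foo, so a
program that strips ".." before calling the kernel may open a different
file than the one the kernel would have reached.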

>It may be convenient, but it does not feel like Unix.

Somehow I feel you'll have a hard time convincing the thousands of ksh
users that they're not running Unix ...
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/26/87)

In article <8137@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>The *raison d'etre* for symbolic links is that sometimes a tree
>structure is insufficient.  It should follow that they do not behave
>like trees.

Yes, and the problem with the Berkeley symlink behavior is that
the file system hierarchy does not even LOCALLY behave like a
tree.  When a user requests what is naturally thought of as a
"local" operation like .., having the global topology brutally
manifest itself is quite a shock.  I won't go into the reasons
why it is important from a human factors viewpoint for .. to be
a "local" operation, but I will note that even in a system with
symlinks, it is one, most of the time.  What Korn, Krell, and
others want to do is change that "most of the time" to "always".

>If you wish to treat all path names as strings before attempting to
>apply them to the file system itself, and resolve `..' as `up one
>level', we can discard the entire directory structure of Unix itself.

That's a bit overstated.  There's at least one such implementation
that I know of, and there have been some partial approaches to this
in areas not involving symlinks, yet those systems are recognizably
based on the normal UNIX hierarchical file system + inode model.

>There is no need for `.' and `..' directory entries.  These become
>magic strings.

This is already true of many network file system implementations,
and .. has had special meaning on all UNIX-based systems under
certain circumstances for many years.

>  There is no need for the file system to be implemented
>as a directed graph (although it may still be convenient).

With symlinks, you simply don't have a DAG, globally.  However,
as a general issue, hierarchical naming is EXTREMELY important.
Symlinks provide a way to "cheat" on this; the question is
whether the cheat is to be blessed as a fundamental feature or
treated as an incidental perturbation on the hierarchical scheme.

>It may be convenient, but it does not feel like Unix.

As with text editors, it all depends on what you've gotten
used to.  Personally I can't stand symlinks making .. act
like a global jump to a distant place in the hierarchy.

ekrell@hector..UUCP (Eduardo Krell) (08/27/87)

In article <197@hobbes.UUCP> root@hobbes.UUCP (John Plocher) writes:

>| 1: if I do "cd /usr/include/sys" and then "cd ..", I end up in /usr/include
>|    no matter what.
>
>So write (or build in) a new cd command which does this.  I'm not stopping
>*you* from doing it.  Just don't force *me* to use it, thank you.

I don't need to write a cd command. ksh already does it for me, thanks Dave!

>What about a C program which does a #include <sys/foobar.h>?  What if the
>file <sys/foobar.h> itself does a #include "../foobarboop.h" ?  All this 
>works if and only if the C compiler references things as "/usr/include" +
>filename in <>'s"

First of all, I just went thru all the files in /usr/include/sys under System
V Release 3 and there is no #include "../<anything>".
There are some in BSD 4.3 in the form #include "../machine/foo.h" which
should really be #include <machine/foo.h>.

>  Now I am recompiling the kernel, and its source is in the directory
>/sys/src.  There is a subdirectory here called sys which is a link to the
>directory known above as /usr/include/sys.  The $68K question is "What file
>does "../foobarboop" reference now?"  If it isn't the SAME ONE as referenced
>in the first case, something is broken.  period.

In our system, /usr/include/sys is a symbolic link to /sys/h. If I cd to
/usr/include/sys, I can do "ls" and see all the header files. If I do "ls ..",
I don't get /usr/include.
If I do "cd /usr/include/sys" and then "ls ..", if I don't get the contents
of /usr/include, something is broken, period.

>  This is the difference between a deterministic method (what we have now) and
>a nondeterministic one (what is proposed).  I'll take the former any day!

Nondeterministic is a bad name. It's 100% deterministic, you tell me how
you got there and I'll tell you what ".." means. It's not a physical link
anymore but that has nothing to do with nondeterminism.

>My solution:
>  When in /usr/include/sys, and you want a listing of /usr/include, do a
>ls .., and when you want a listing of /sys/src, do a ls /sys/src.  If you
>get confused easily, the first case can be expanded to ls /usr/include.

The problem is /usr/include/sys is a symbolic link to /sys/h. If I do ls ..,
I don't get /usr/include !!! That's the problem.

>The point is, links AS THEY ARE NOW are very useful

and they are much more useful (at least to me) if .. is treated as a logical
operator.

> and, two, links to directories destroy the "tree" structure
>of the filesystem.  Links turn it back into a directed graph which is
>not easy to understand if you try to use the "tree" mentality on it.

I know links to directories destroy the tree structure. Now, the question
is: provided we don't have tree structure anymore, what's the next best thing?

>There is method behind the madness of the way links work; if you don't
>understand this, study it some more.

I don't need to study anything. I understand how symbolic links work.
In fact, I've added symbolic links to our experimental version of System V,
so don't think I'm a naive user. The problem is I find it very hard to
justify that if I do a cd /usr/include/sys, then .. can be anywhere in
any file system. I want it to mean /usr/include.

>Just don't "break" the deterministic behavior of links for *me*.

I am not introducing nondeterminism. If you "cd /foo/bar" and then "cd .."
you'll ALWAYS end up in /foo. ALWAYS, no exceptions. In your system, you might get
to /foo or you might get someplace else, depending on whether /foo/bar was
a symbolic link or not. I'll call that nondeterminism to some degree.

>I don't want 2: and 1: can be done with aliases (or whatever) in *your* shell.

1: is already done in ksh. 2: is needed to keep "ls .." and "cd ..; ls"
consistent given we already have 1:.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

ekrell@hector..UUCP (Eduardo Krell) (08/27/87)

In article <480@mtxinu.UUCP> ed@mtxinu.UUCP (Ed Gould) writes:

>It doesn't fall apart anywhere on my system (4.3BSD + NFS) except that
>/ is a special case: /.. == /

Imagine the following: you replace "cd" by a function that reads the directory
entries (including . and ..) directly and does its own version of chdir().
Call that program "ncd". Now, go to a mount point, say /foo/bar, and do "ncd ..".
The million dollar question: where are you now?

>Simple, perhaps, but no more so than the current BSD symlinks.  What's
>cleaner about it?

it's cleaner because I am guaranteed that if I "cd /usr/include/sys", then
any references to ".." like in "cd .." or "ls .." will refer to /usr/include
no matter what.

>Unless the kernel implements carrying an arbitrarily-long string with
>each process, then I claim that the implementation is broken.  Consider
>the following program, with DEPTH suitably large:

You forget that there's a hard-coded maximum pathname length in the kernel,
so I won't be breaking anything that's not broken already.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

ken@cs.rochester.edu (Ken Yap) (08/27/87)

Say, guys, why don't you have it out by mail and post a summary when
the wars are over? Half :-).

	Ken

chris@mimsy.UUCP (08/27/87)

In article <2877@ulysses.homer.nj.att.com> ekrell@hector.UUCP
(Eduardo Krell) writes:
>... I just went thru all the files in /usr/include/sys under System
>V Release 3 and there is no #include "../<anything>".
>There are some in BSD 4.3 in the form #include "../machine/foo.h" which
>should really be #include <machine/foo.h>.

No, they should not be that way; and: they are already that way.

(What?)

Look more closely:  These files read

	#ifdef KERNEL
	#include "../machine/pte.h"
	#else
	#include <machine/pte.h>
	#endif

which is as it should be---one should be able to build experimental
kernels without reference to /usr/include.

>In our system, /usr/include/sys is a symbolic link to /sys/h. If I cd to
>/usr/include/sys, I can do "ls" and see all the header files. If I do "ls ..",
>I don't get /usr/include.
>If I do "cd /usr/include/sys" and then "ls ..", if I don't get the contents
>of /usr/include, something is broken, period.

Say not `period', but rather, according to the way you and a large
group of others feel things should be done.

>... It's 100% deterministic, you tell me how you got there and I'll
>tell you what ".." means.

It is deterministic; but it is context sensitive.  This is what I
do not like, and I think this is what Robert Elz does not like.
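
The distinction can be made concrete with a small sketch, assuming a POSIX
system and Python's standard library. The layout mirrors the
/usr/include/sys -> /sys/h example under a scratch directory, and both
helper names are hypothetical:

```python
import os
import tempfile


def logical_parent(path_used):
    """Parent according to the string used to get there."""
    return os.path.normpath(os.path.join(path_used, ".."))


def physical_parent(path_used):
    """Parent according to the file system itself."""
    return os.path.realpath(os.path.join(path_used, ".."))


with tempfile.TemporaryDirectory() as tmp:
    root = os.path.realpath(tmp)
    direct = os.path.join(root, "sys", "h")
    via_link = os.path.join(root, "usr", "include", "sys")
    os.makedirs(direct)
    os.makedirs(os.path.join(root, "usr", "include"))
    os.symlink(direct, via_link)

    same_directory = os.path.samefile(via_link, direct)
    logical = (logical_parent(via_link), logical_parent(direct))
    physical = (physical_parent(via_link), physical_parent(direct))
print(same_directory, logical, physical)
```

Both names reach the same directory, and the physical parent is identical
for both, but the logical parent depends on which name was used. Each
answer is fully determined; only the logical one needs the history.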

>I know links to directories destroy the tree structure. Now, the question
>is: provided we don't have tree structure anymore, what's the next best thing?

And this is the center of the argument.  The two sides seem to be

	the next best thing is ..-as-a-logical-operator (`up'):
	make it *look* like a tree

and

	the next best thing is directed acyclic graph behaviour

and I think everyone knows on which side we stand. . . .

(Incidentally, I have a C shell alias `up': `cd /$cwd:h'; when I
want to go up, I type `up'.)
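
A rough sketch of what that alias does, in Python rather than csh. It works
only on the shell's remembered path and never reads a directory's ".." entry.
The function name `logical_up` is made up for illustration, and csh's `:h`
modifier is only approximated by `posixpath.dirname`.

```python
import posixpath

def logical_up(cwd):
    """Drop the last component of the remembered path, roughly `cd /$cwd:h`."""
    # Purely textual: no directory is opened and no ".." entry is consulted.
    return posixpath.dirname(cwd.rstrip("/")) or "/"

print(logical_up("/usr/include/sys"))  # /usr/include
print(logical_up("/usr"))              # /
```

Because the answer depends only on the string, it is "up" in the sense of the
path you typed, whatever symlinks that path went through.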
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	seismo!mimsy!chris

kyle@xanth.UUCP (08/27/87)

In article <6342@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> In article <8137@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
> >It may be convenient, but it does not feel like Unix.
> 
> As with text editors, it all depends on what you've gotten
> used to.  Personally I can't stand symlinks making .. act
> like a global jump to a distant place in the hierarchy.

Indeed it does depend on your point of view.  I see symlinks as the
global jump to the distant place in the hierarchy, not "..".   This
would be readily apparent if the C shell and other shells that keep
a 'current working directory' variable weren't ignorant of symlinks.

kyle jones   <kyle@odu.edu>   old dominion university, norfolk, va

daveb@geac.UUCP (Brown) (08/27/87)

In article <197@hobbes.UUCP> root@hobbes.UUCP (John Plocher) writes:
>  This is the difference between a deterministic method (what we have now) and
>a nondeterministic one (what is proposed).  I'll take the former any day!
  Please record one vote for deterministic...  --dave

  However, I don't really want to discuss the semantics of symlinks
today, but instead discuss a technique:

	Make symbolic links to directories *reflexive*.

  Ie, the (shell) command "lnb some/path [some/other/path]" should
create a symlink to the fully-expanded value of some/path in the
directory referred to as some/other/path, or the current directory
if none is specified. (With suitable behavior on error).
  Similarly, an option to "lc" should list links to directories
in some conveniently distinguished way.

  This allows one to think of the file system as having two
"aspects", each with their own "attributes", and still keep
them mentally distinct (with apologies to Zelazny).
  1) Trees, implemented with inode/directory hierarchies 
  2)  Near-arbitrary graphs, implemented with pairs of names.  

  I would comment more on the psychology, but I suspect this is
sufficient for wizards...
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

ekrell@hector..UUCP (Eduardo Krell) (08/27/87)

In article <8195@mimsy.UUCP> chris@mimsy.UUCP writes:

>Look more closely:  These files read
>
>	#ifdef KERNEL
>	#include "../machine/pte.h"
>	#else
>	#include <machine/pte.h>
>	#endif
>
>which is as it should be---one should be able to build experimental
>kernels without reference to /usr/include.

The right way of doing this is by issuing the right -I options to cpp.
That's how we do it all the time. In this case, you have one directory
called, say, "myinclude", with all the subdirectories and header files
you need.

Then you just need to say cc -Imyinclude and those files will be searched
for before the ones in /usr/include.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

wesommer@ICARUS.mit.edu (William E. Sommerfeld) (08/27/87)

In article <2268@xanth.UUCP> kyle@xanth.UUCP (Kyle Jones) writes:
.. with respect to symlinks warping you through the filesystem space..
>Indeed it does depend on your point of view.  I see symlinks as the
>global jump to the distant place in the hierarchy, not "..".   This
>would be readily apparent if the C shell and other shells that keep
>a 'current working directory' variable weren't ignorant of symlinks.
>
>kyle jones   <kyle@odu.edu>   old dominion university, norfolk, va

If you want to mean '..' to be 'back' rather than 'up', use pushd
rather than cd.

I find the following (csh) aliases useful: it puts the true absolute
pathname of the working directory in your prompt.  Unfortunately, it's
a bit slower, but that's because pwd is an order N**2 crock.

	alias	np	'chdir `pwd`; set prompt="{${cwd}}\\
% "'
	alias 	cd	'chdir \!*; np'
	alias 	pushd	'pushd \!*; np'
	alias   popd	'popd \!*; np'

				Bill Sommerfeld
				ARPA: wesommer@athena.mit.edu
				UUCP: ...!mit-eddie!wesommer

gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/27/87)

In article <8195@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
- 	#ifdef KERNEL
- 	#include "../machine/pte.h"
- 	#else
- 	#include <machine/pte.h>
- 	#endif
- which is as it should be---one should be able to build experimental
- kernels without reference to /usr/include.

Another solution would have been to add -I.. to CFLAGS in the Makefile
and just use <wherever/whatever.h> ("wherever" really should be "sys")
in the source.  <> names are not necessarily sought just in /usr/include.
We use this feature quite a bit.
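
The lookup order being argued over can be sketched as a toy model. This is an
illustration, not cpp itself: `find_include`, the file `/sys/h/vm.h` and the
`-I/sys` setting are all hypothetical, and the sketch resolves ".." textually
with `normpath`, which sidesteps the very question this thread is about.

```python
import posixpath

def find_include(name, quoted, including_file, i_dirs, existing,
                 system_dir="/usr/include"):
    """Return the first candidate path present in `existing`, else None."""
    candidates = []
    if quoted:
        # "file.h": start in the directory of the file doing the #include.
        candidates.append(
            posixpath.join(posixpath.dirname(including_file), name))
    # Both forms then try the -I directories in order, then the system one.
    candidates += [posixpath.join(d, name) for d in i_dirs]
    candidates.append(posixpath.join(system_dir, name))
    for path in candidates:
        path = posixpath.normpath(path)   # assumption: ".." handled as text
        if path in existing:
            return path
    return None

files = {"/sys/machine/pte.h", "/usr/include/machine/pte.h"}

# Kernel build: the quoted relative form stays inside /sys.
print(find_include("../machine/pte.h", True, "/sys/h/vm.h", [], files))
# User program: angle form with no -I falls through to /usr/include.
print(find_include("machine/pte.h", False, "/tmp/prog.c", [], files))
# Angle form plus an -I pointing at the kernel tree reaches /sys instead.
print(find_include("machine/pte.h", False, "/sys/h/vm.h", ["/sys"], files))
```

Either approach keeps a test kernel away from /usr/include; the difference is
whether the choice lives in the source file or in the Makefile.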

ed@mtxinu.UUCP (Ed Gould) (08/28/87)

Eduardo Krell:
>You forget that there's a hard-coded maximum pathname length in the kernel,
>so I won't be breaking anything that's not broken already.

You didn't read my program very carefully.  It didn't use any pathnames
longer than about 7 characters, but changed into a directory whose full
path name could have been many thousands of characters long.  The current
restrictions that I know about are for pathnames handed to a single
system call.

-- 
Ed Gould                    mt Xinu, 2560 Ninth St., Berkeley, CA  94710  USA
{ucbvax,decvax}!mtxinu!ed   +1 415 644 0146

"A man of quality is not threatened by a woman of equality."

kerry@uqcspe.OZ (Kerry Raymond) (08/28/87)

Time to compromise on this issue ?

The problem seems to boil down to this. Symbolic links do not preserve
the traditional tree structure of the Unix file system. So should ".."
continue to mean the `hard-link' parent (with a bit of fiddling to handle
mounting file systems) as it did before symbolic links ? Or should
".." mean the `soft-link' parent (go back the way you came) ?

Each method is more convenient than the other in certain situations,
and expert opinion (not mine !) suggests that both can be implemented.

The problem really seems to be the use of ".." for both meanings. 
Why not leave ".." as the directory entry pointing to the hard-link
parent (with fiddling at mount points) as it is, and call the soft-link
parent something else like ",," or "..." or something else that's quick
and easy to type and isn't a special symbol in the commonly used shells,
and isn't the sort of a name that a naive user will give to a file.

That way existing systems are compatible, and those people who wish
to alter their shells and kernels etc for soft-link-parent semantics
can do so. The only loss seems to be the special case of ",," being
introduced, which could slow down namei(). It shouldn't take very long
to re-train people to type ",," instead of ".."
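
A minimal sketch of the two spellings side by side, using a toy in-memory tree
rather than a real file system. Everything here (`Dir`, `walk`, `full_name`,
the choice that ",," simply undoes the last step taken) is an assumption made
for illustration; the post does not specify these details.

```python
class Dir:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent or self   # the root is its own parent
        self.entries = {}              # name -> Dir, or str (symlink target)

    def mkdir(self, name):
        child = Dir(name, self)
        self.entries[name] = child
        return child

def walk(root, path):
    """Follow an absolute path; '..' is the hard-link parent, ',,' steps back."""
    trail = [root]                     # directories in the order we entered them
    for part in [p for p in path.split("/") if p]:
        here = trail[-1]
        if part == "..":
            trail.append(here.parent)  # physical: the directory's own parent
        elif part == ",,":
            if len(trail) > 1:
                trail.pop()            # logical: go back the way we came
        else:
            target = here.entries[part]
            if isinstance(target, str):        # symlink with absolute target
                target = walk(root, target)
            trail.append(target)
    return trail[-1]

def full_name(d):
    parts = []
    while d.parent is not d:
        parts.append(d.name)
        d = d.parent
    return "/" + "/".join(reversed(parts))

root = Dir("")
include = root.mkdir("usr").mkdir("include")
root.mkdir("sys").mkdir("h")
include.entries["sys"] = "/sys/h"      # /usr/include/sys -> /sys/h

print(full_name(walk(root, "/usr/include/sys/..")))  # /sys
print(full_name(walk(root, "/usr/include/sys/,,")))  # /usr/include
```

Note that the ",," branch needs the trail, which is exactly the per-lookup (or
per-process) state that the other side of the argument objects to.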

jim@strath-cs.UUCP (08/28/87)

In article <8195@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>........................ --one should be able to build experimental
>kernels without reference to /usr/include.

Indeed. This is not always the case however. The last time our 4.2
kernel was substantially hacked, I was bitten by an obscure bug caused
by a mismatch of the include files. The changed include files were under
/sys, but the *copies* in /usr/include were left untouched -- my plan
was to get the new kernel stable, update the include files and then
recompile any user programs.

What happened was that a C program (genassym.c) was used to generate a
file of symbolic constants used by the assembler in assembling locore.s.
This pulled in values from /usr/include and not /sys! The result was
the generated assembler file had the wrong values for some fundamental
constants like the size of the user area. Needless to say, this meant
locore.o was screwed up. The kernel would compile OK, but it would fall
over when it 'forked' init...... Even compiling a new kernel from scratch
did not cure this.

[This is a bit of a digression from discussing symbolic links, which has
become a little tiresome.]

		Jim

dave@murphy.UUCP (Dave Cornutt) (08/28/87)

Well, it appears that there is a large number of people in both camps.  I
happen to be one of the members of the "leave it alone" camp, but I've
already posted my feelings on this subject, and I don't feel like fanning
these flames any further.  So, instead, I'll offer a suggestion for how
such a thing could be implemented.

Do it on a per-process basis, perhaps using a bit in the process flags field.
When the flag is on, ".." behaves as the go-up operator; when it is off,
".." behaves as it does now.  When a process forks, have the child inherit
the setting of the bit from its parent.  The shells should have a variable
or something that, when set, turns the flag on, so that the shell (and
all of its children) behave in the new manner.  (Unsetting the variable
restores the old behavior for the shell and any subsequently created
children.)  Init should have the flag off, for backwards compatibility.
Processes which have the flag off should not have to carry the overhead of
the directory context structure.

How to change the state of the bit?  Well, if you don't feel like inventing
a new system call, you could use ulimit (SYSV) or setrlimit (BSD).  Actually,
the really spiffy way would be to implement /proc and have it be an ioctl,
but I think that's still a ways off from appearing in commercial products.
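
The inheritance rule can be sketched in a few lines. This is a toy model
only: `Proc` and `logical_dotdot` are invented names, and no real kernel
structure or system call is being described.

```python
from dataclasses import dataclass

@dataclass
class Proc:
    pid: int
    logical_dotdot: bool = False   # hypothetical flag bit; off = today's ".."

    def fork(self, child_pid):
        # The child starts with a copy of the parent's current setting.
        return Proc(child_pid, self.logical_dotdot)

init = Proc(1)                     # flag off, for backwards compatibility
shell = init.fork(100)
shell.logical_dotdot = True        # user sets the shell variable
editor = shell.fork(101)           # inherits the new behavior
shell.logical_dotdot = False       # user unsets it again
later = shell.fork(102)            # created afterwards: old behavior

print(init.logical_dotdot, editor.logical_dotdot, later.logical_dotdot)
# False True False
```

The sketch says nothing about what should happen across exec, which is where
per-process behavior switches tend to cause trouble.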
---
"I dare you to play this record" -- Ebn-Ozn

Dave Cornutt, Gould Computer Systems, Ft. Lauderdale, FL
[Ignore header, mail to these addresses]
UUCP:  ...!{sun,pur-ee,brl-bmd,seismo,bcopen,rb-dc1}!gould!dcornutt
 or ...!{ucf-cs,allegra,codas,hcx1}!novavax!gould!dcornutt
ARPA: dcornutt@gswd-vms.arpa

"The opinions expressed herein are not necessarily those of my employer,
not necessarily mine, and probably not necessary."

ekrell@hector..UUCP (Eduardo Krell) (08/28/87)

In article <428@ecrcvax.UUCP> johng@ecrcvax.UUCP (John Gregor) writes:
>Another problem I foresee in changing the semantics of .. is that symbolic links
>can be cyclic.  So, a few hundred trips around the cycle and the kernel would
>have to keep that much state information around.

There's already a hard-coded constant in the kernel that allows a maximum of
8 symbolic links during a given pathname expansion (that's how BSD deals with
loops caused by symbolic links).
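
A small sketch of that kind of counter, assuming a toy table of absolute
symlinks instead of a real file system. The limit of 8 is the figure quoted
above; `expand` and the table layout are made up for illustration.

```python
import errno

MAXSYMLINKS = 8   # the figure quoted in the post

def expand(path, symlinks, limit=MAXSYMLINKS):
    """Expand symlinks component by component, refusing after `limit` of them."""
    followed = 0
    todo = [p for p in path.split("/") if p]
    done = []
    while todo:
        part = todo.pop(0)
        here = "/" + "/".join(done + [part])
        if here in symlinks:
            followed += 1
            if followed > limit:
                raise OSError(errno.ELOOP, "too many levels of symbolic links")
            # Restart from the link's absolute target, keeping the unread tail.
            todo = [p for p in symlinks[here].split("/") if p] + todo
            done = []
        else:
            done.append(part)
    return "/" + "/".join(done)

links = {"/usr/include/sys": "/sys/h", "/a": "/b", "/b": "/a"}
print(expand("/usr/include/sys/param.h", links))  # /sys/h/param.h
try:
    expand("/a", links)
except OSError as e:
    print(e.errno == errno.ELOOP)                 # True
```

The counter bounds the work done in one lookup; it does not by itself bound
how much history a process would accumulate over many separate chdir calls.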
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

jc@minya.UUCP (John Chambers) (08/29/87)

In article <1685@sol.ARPA>, ken@cs.rochester.edu (Ken Yap) writes:
> Say, guys, why don't you have it out by mail and post a summary when
> the wars are over? Half :-).
> 

No, don't stop!  I'm bringing this flaming session (oops, I mean serious
technical discussion :-) to the attention of the folks over in sci.lang,
and they might find it very interesting.

Hey, sci.lang folks, we have here a real, live anecdotal illustration of
the Sapir-Whorf Hypothesis.  There's a very confused discussion going on
in comp.unix.wizards that is based on a peculiarity of the English language,
and since the participants are clearly all native speakers of English, their
concepts and attitudes are strongly constrained, making a solution impossible.

The discussion is on the subject of symbolic links; in particular, the
question is:  After doing a "cd foo" where foo is a symbolic link to some
other directory, what is the meaning of ".."?  Is it the "true" parent
of foo?  Is it the directory you cd'd from?  This is complete with quite
forceful assertions that one or the other is The Right Interpretation.

What has this to do with the SWH?  Well, the crux of the discussion is
the phrase 'the meaning of ".."'.  The language-based problem is, of
course, the English word "the", which is used to imply that there is
a unique object that satisfies the criterion.  For the original Unix
file system, a directory had a unique parent, and ".." had a well-defined
interpretation as "the parent directory".  Once symbolic links were foisted
upon an unsuspecting Unix world, this broke down.  It is now possible for
a directory to have multiple parents, and the phrase "the parent of" no
longer has a (unique) referent.  The participants in the discussion seem
to have no idea that this is a problem; they continue to argue about "the"
meaning of "..".  

I contend that if this discussion were taking place in, say, Latin or Russian
(which lack definite articles), the discussion would have been short-lived.
A phrase such as "parent of" would be obviously ambiguous, and discussion
of the correct meaning would easily be seen to be as silly as discussing which
of your biologic parents ("the mother" or "the father") is the correct
interpretation of "the parent".  

This discussion is especially interesting from a linguistic viewpoint,
because the native language of these people distinguishes "mother" from
"father", and "the parent" is inherently ambiguous.  However, the parties
to the current discussion are in the sublanguage of Unix, so they speak
a language in which "the parent" is (or at least used to be) unambiguous.
Their prior competence in everyday English seems to have no effect on
their analysis of the Unix problem; the language's use of the definite
article strongly constrains their approaches to the "problem" and prevents
a solution.

I recommend this discussion to readers of sci.lang, and I request that
the participants in comp.unix.wizards keep it up for a while longer.

-- 
	John Chambers <{adelie,ima,maynard}!minya!{jc,root}> (617/484-6393)

gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/29/87)

In article <126@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>The participants in the discussion seem to have no idea that this
>is a problem; they continue to argue about "the" meaning of "..".  

I find your condescending sneering at people trying to work out
a technical issue extremely inappropriate.

The participants in the discussion (before you entered) certainly
WERE aware of the only technical point you had to offer, and they
were not arguing about ""the" meaning of ".."".  The essential
problem is that the interpretation of ".." by the kernel namei()
code is necessarily deterministic -- that is what changes
multiple theoretically possible "meanings" into a particular
actual interpretation.  The issue is, and has been, what would be
the best choice among the (quite well understood) available
alternatives for the kernel behavior.  The dispute has been
essentially over the value metric for "best".

Perhaps it is YOU that have problems with the English language?

gwyn@brl-smoke.ARPA (Doug Gwyn ) (08/30/87)

In article <586@murphy.UUCP> dave@murphy.UUCP (Dave Cornutt) writes:
>Do it on a per-process basis...

As one of the principal developers of dual-universe UNIXes,
perhaps it is appropriate for me to comment on this.  Dual
interpretations are a pain in the donkey!  Select a general
enough interpretation and stick with it; don't make anything
that doesn't have to be, dependent on user whims.  For an
example of the dangers, consider the attempt Berkeley made
to cater to non-restart of slow I/O system calls after return
from a signal handler in 4.3BSD as originally distributed.
The bit SV_INTERRUPT in the sv_flags of the struct sigvec
was supposed to control this behavior.  Unfortunately, it
was not reset upon an exec(), so if ANY process enabled
non-restart, all descendants acted that way too.  Of course,
the descendant processes were written assuming the default
4.3BSD restart of system calls, so things like our EMACS
editor broke horribly when run from a System V shell.  The
moral is, too many options on how the OS behaves is a hazard.

POSIX (1003.1) has shown an annoying tendency recently to not
specify behavior that differs between 4BSD and SysV.  I've
shown that practically ANY behavior can be emulated, given
general enough kernel support.  It is a disservice to
application programmers to waffle on the interface specs so
that the programmer cannot count on reasonable default behavior.
Providing facilities using #ifdef or run-time interrogation
to determine which of several possible behaviors actually
will be obtained is HORRIBLE -- that leaves us no better off
than we were without POSIX!  The point of a STANDARD is to
SPECIFY these things, not to force the programmer to probe
in order to find them out.  I think what's happened is that
the vendors of existing systems have decided they want
"POSIX" to be just a label they can stamp on their current
product for marketing reasons, without having to do any real
work.  If the Final Use POSIX 1003.1 standard is not better
in this regard, I'm going to be agitating AGAINST its
adoption for Federal procurement specification, because as
a developer of portable applications, I need a stronger
specification than this waffling gives me.  Since the only
practical alternative is the SVID, it behooves BSD-based
vendors to cooperate in providing common, definite behavior
even though it means having to provide a compatibility
library for existing BSD-specific applications.  (I don't
see any alternative to that anyway, if they're going to also
conform to X3.159-198x (ANSI C).)

bzs@bu-cs.bu.edu (08/31/87)

Although putting the smarts in the shell can be handy for some (I
personally -like- the way symlinks work now, I use them all the time
as short-cuts to working areas, 'cd /usr/local/emacs' is a guaranteed
jump to the GNU emacs source area, de gustibus non disputandum) one
anecdotal problem with shell magic is exemplified by the following
"bug" I get from a user every few weeks:

	fp = fopen("~user/somefile","r");

how come it don't work...oh...that's stupid...

I can assume that putting symlink magic into the shell only will
win me another line of confused students (how come chdir()...?)
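
The same trap, sketched in Python: tilde expansion is something the shell does
to its arguments, so a program has to ask for it explicitly. The example uses
"~/somefile" rather than "~user/somefile" so that it does not depend on a
particular account existing; `expand_for_open` is a made-up helper name.

```python
import os

def expand_for_open(path):
    """Do by hand the tilde expansion that a shell does before a program runs."""
    return os.path.expanduser(path)

raw = "~/somefile"
# The kernel gets the string as is: a relative name whose first
# component is a directory literally called "~".
print(os.path.isabs(raw))                   # False
print(expand_for_open(raw))                 # depends on $HOME
```

Any ".." handling that lives only in the shell would have the same property:
chdir() called from a program would not see it.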

Swell...but I think it's arrogant to declare one thing more or less
confusing than the other to "them" (perhaps it's all confusing.)

	-Barry Shein, Boston University

neilb@elecvax.eecs.unsw.oz (Neil F. Brown) (08/31/87)

It occurs to me that there is a very simple way to avoid the
problems with symbolic links. It is to not have symbolic links
to directories.
I realise this has already been thought of, but I feel it was
discarded too quickly.

Firstly I would like to point out that it is entirely possible to
disallow such links. We simply have namei() fail if the target of
a symbolic link is a directory (ENAUGHTYSYMLINK) - possibly using
the sticky bit to turn off this restriction.
However, this restriction is not really needed, just have ln complain
if the target is a directory (or if the target doesn't exist??),
tell users how silly symlinks to directories are, and don't put
them in public places. i.e. give people enough rope but tell them
not to use it.

"But", you say, "We LIKE sym links to directories, we need them,
life was so bland and colourless before we found them."

I think not. Consider the apparent `uses' for symlinks to directories.
1/ The most commonly mentioned directory symlink seems to be
	/usr/include/sys -> /usr/sys/h
   This, to me, is a bad thing. The kernel include files should be
   in ONE place and one place only.  As many non-kernel programs
   use them, they should be in a standard place. This place seems
   to be X/sys where X is known to /lib/cpp and is probably
   /usr/include.  (Further, X should be the home directory of
   user `i' so users only need use ~i.)

   If you want to test compile a new kernel, use the -Iplace option.

   "But", you argue, "we want the kernel includes to be on the same file
   system as the kernel sources, and if this is not /usr, the above
   won't work!"

   I fully agree the includes should be physically near the sources
   but I wouldn't use symlinks to put them in their logical home,
   I would, instead, fix the mount command. i.e. cure the problem,
   not the symptom. See below for modified mount.
   First, other apparent uses for symlinked directories.

2/ Subtree Y belongs logically in place /Path/ but there is insufficient
   room on that file system. So, move it to /BIGFILESYSTEM/Y and symlink
	/Path/Y -> /BIGFILESYSTEM/Y
   Again, a problem with mount, NOT another job for SymLink. See below.

3/ I remember long ago (1 month?) someone suggested symlinks were terribly
   useful for moving around a large tree. Link interesting directories
   to X and then put X in your cdpath so interesting places can be
   found quickly from anywhere.
   I have a better solution, set up some shell variables
   S=/somewhere/sources B=/overthere/binaries X=/my/favourite/place
   then just use cd $X/whatever. (Notice that with upper case
   variable names, the whole interesting-place name is typed with
   the shift key down, easier on the fingers).

   So that's not a good excuse for symlinks to directories.

4/ Pseudo directory unions - linking some directories and some files from
   another project so you can make small changes without vast copying.
   This is a case when you almost do want a directory to be in two places at
   once.
   But then, this would be an environment set up by people who know what
   they are doing, and who should be able to cope with any problems.
   Anyway, directories that are linked in should only be there for programs
   to notice. If the user wants to go into the directory, its contents should
   be linked across instead. (Maybe a cd which won't pass through a symlink??)

   Of course, what is really required is true directory unions.
   [ How about symlink ... is a directory to scan if name is not found
     in current directory, with mode bits determining what sort of 
     opens won't try ... (exec,read,write,creat,trunc... completely
     new meaning on mode bits, but it might work).
     This is just a quick thought - don't hold me to it.
   ]

5/ Vast numbers of other apparent uses that are simply "wrong" or could
   better be fixed by judicious use of the right PATH or -I type
   variable, or should be done properly with a different feature
   such as generalised mounting.

The generalised mount.

   The mount function links inode I of block device B to inode 1 (the root)
   of block device b and says any access to (B,I) is really an access
   to (b,1) and an attempt to access .. from (b,1) is really an access of
   .. from (B,I).

   A generalised mount would link (B,I) to (b,i) and say that access
   to (B,I) yields (b,i) and an access of (b,i) yields (B,I), with
   appropriate handling of . and .. (yes, .. IS something special).
   This way we can effectively swap any two subtrees of any two file
   systems thus making it possible to put large subtrees on whichever
   file system has the required space, and then put the subtree
   where it belongs in the overall file tree without any messy symlinks.

   This overcomes the /usr/include/sys problem, cleans up the mount
   command, and allows the system administrator much more flexibility
   in distributing available space about the directory tree.

   This is not, of course, the whole story on a generalised mount but
   any system programmer worth their salt should be able to fill in the
   details.
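
One way to picture the pairing described above is a table that maps each
(device, inode) pair to its partner in both directions. This is only a sketch
of the lookup step: `MountTable` and the device and inode numbers are
invented, and the special handling of "." and ".." is left out entirely.

```python
class MountTable:
    """Pairs of (device, inode) that stand in for each other."""
    def __init__(self):
        self._swap = {}

    def mount(self, a, b):
        # Record the pairing in both directions, as the post describes.
        self._swap[a] = b
        self._swap[b] = a

    def cross(self, node):
        """Return the node actually reached when `node` is looked up."""
        return self._swap.get(node, node)

table = MountTable()
usr_include_sys = ("usr_dev", 40)   # made-up (device, inode) numbers
kernel_headers = ("src_dev", 7)
table.mount(usr_include_sys, kernel_headers)

print(table.cross(usr_include_sys))   # ('src_dev', 7)
print(table.cross(kernel_headers))    # ('usr_dev', 40)
print(table.cross(("usr_dev", 2)))    # ('usr_dev', 2), not a mount point
```

An ordinary mount is the one-directional special case where the second member
of the pair is always the root inode of its device.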


NeilBrown - Uni New South Wales - Australia
"Vote for the tree - not directed graphs"

kre@munnari.UUCP (08/31/87)

One thing that's being ignored here is that the interpretation of
".." (if you want it to be a variable, or context sensitive thing)
that should apply is that which applied when the ".." was created,
not the one that happens to be current when the ".." is interpreted.

You can argue that

	cd /usr/include/sys; cd ..

should put you in /usr/include, and there's a certain superficial
attractiveness to this.

However, if I do

	cd /usr/include/sys; ln -s ../../bin bin 

I can then (I presume) do

	cd bin

and I end up in /usr/bin if all is going well, and the context
sensitive interpretation applies.

Now, sometime later, I tell you that you can get to /usr/bin
from /usr/include/sys by just using "cd bin" so you decide to
try it ...

	cd /sys/h (you're not silly, you know where /usr/include/sys
		   really lives, and it's much less typing)
	cd bin

and you find yourself in /bin .. not exactly what was promised at all.
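
The arithmetic of this example can be checked with a purely textual model of
the context-sensitive reading, in which ".." in the link target cancels
components of the path you typed. `logical_target` is a made-up name and the
sketch touches no real file system.

```python
import posixpath

def logical_target(arrived_by, link_target):
    """Context-sensitive reading: '..' in the link undoes the path you typed."""
    return posixpath.normpath(posixpath.join(arrived_by, link_target))

# The link "bin -> ../../bin" lives in one directory with two names.
print(logical_target("/usr/include/sys", "../../bin"))  # /usr/bin
print(logical_target("/sys/h", "../../bin"))            # /bin
```

Under the existing BSD reading the link is resolved from the directory that
physically contains it (/sys/h), so both routes would give /bin.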

The way to fix this is to make the interpretation of ".." depend on
the current context of the user who planted the reference to "..",
this will also solve all the problems of uses of ".." implanted in
C source, in shell scripts, and just about anywhere else that you
might want to look, which will all break if you change the semantics
of ".." in any way other than this.

And yes, I admit, ".." at a mount point is a special case, and changing
the mount point will break any references to ".." that happen to pass
back through it.  However, this is a system administrator activity,
something that is rarely done, and is only done by people expected to
be responsible for their actions, any random user can't make this change
just because he feels like it.

This is, incidentally, the same problem as the use of SHELL in the
Sys V "make", where the shell that should be used is the one that
the Makefile writer planned, whereas the one used is that of the
user running make.  Of course the Makefile writer could have planned
for this, and explicitly set SHELL, but if that's required, what's
the point of importing SHELL at all, ever?

The same is true of the system(3) routine in libc (on all versions
of unix I think).  Except here which shell to use depends on where
the arg to system came from ... system("foo bar ...") should use the
programmer's shell (or the one he plans on when he wrote the program)
whereas system(gets(buf)) should use the user's shell.

Why not just accept that these questions are really hard to get right,
and the fewer of them we need to get right, the better.

As someone else said, the more "user options" there are the harder
it is to get anything done right at all.

In the case of ".." the solution is quite simple, leave the path
interpretation semantics the bsd way, no options, no per user
interpretation, and simply accept that

	cd /usr/include/sys; cd ..

puts you in /sys (wherever that happens to be) rather than /usr/include.

If System V can't have symbolic links without changing the semantics,
then let's just keep System V without symbolic links at all.

kre

hansen@pegasus.UUCP (08/31/87)

Eduardo Krell: <>, Ed Gould: <
<>You forget that there's a hard-coded maximum pathname length in the kernel,
<>so I won't be breaking anything that's not broken already.
< 
< You didn't read my program very carefully.  It didn't use any pathnames
< longer than about 7 characters, but changed into a directory whose full
< path name could have been many thousands of characters long.  The current
< restrictions that I know about are for pathnames handed to a single system
< call.

I've read through David Korn's design memo on his proposed implementation of
symbolic links. Within it, he considers several possibilities:

    o	keeping the canonical path name for the current directory in the
	u-block, limiting the length of the current directory path to 1024
	bytes

    o	keep device/inode info for the current directory in the u-block,
	also limiting the length of the path to 1024 (or whatever) bytes

    o	storing only the current device/inode info for the current level
	of the current directory and using this to point into a shared
	system-wide table

The latter implementation, which he actually uses, does not limit the length
of the path of any directory to 1024 bytes.
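
The memo itself is not quoted here, so the following is only a guess at the
general shape of the third option: each process keeps a single index into a
shared table, and each table row records one step (parent row, device, inode).
`DirTable`, its methods and all the numbers are hypothetical.

```python
class DirTable:
    """System-wide table: one row per (parent row, device, inode) step taken."""
    def __init__(self):
        self.rows = [(None, "rootdev", 2)]   # row 0 stands for the root
        self.index = {self.rows[0]: 0}

    def enter(self, parent_row, dev, ino):
        key = (parent_row, dev, ino)
        if key not in self.index:            # rows are shared between processes
            self.index[key] = len(self.rows)
            self.rows.append(key)
        return self.index[key]

    def up(self, row):
        parent = self.rows[row][0]
        return row if parent is None else parent

    def depth(self, row):
        n = 0
        while self.rows[row][0] is not None:
            row = self.rows[row][0]
            n += 1
        return n

table = DirTable()
cwd = 0                                      # all a process has to remember
for i in range(5000):                        # many short chdir calls
    cwd = table.enter(cwd, "dev0", 100 + i)

print(table.depth(cwd))                      # 5000
print(table.depth(table.up(cwd)))            # 4999
```

No path string is stored anywhere, so no 1024-byte limit arises; the cost
moves to the size and lifetime of the shared table.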

					Tony Hansen
				ihnp4!pegasus!hansen, attmail!tony

ekrell@ulysses.homer.nj.att.com (Eduardo Krell[arm]) (08/31/87)

In article <484@mtxinu.UUCP>, ed@mtxinu.UUCP writes:

> You didn't read my program very carefully.  It didn't use any pathnames
> longer than about 7 characters, but changed into a directory whose full
> path name could have been many thousands of characters long.  The current
> restrictions that I know about are for pathnames handed to a single
> system call.

The point is that the file you generate would be unusable anyway, in the sense
that any attempts to reference that file with an absolute pathname will
break because of the 8 symbolic links limit.
-- 
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

mvs@alice.UUCP (09/01/87)

Back when I used the C-shell, I would comment more on the hierarchical
scheme.  If you want a listing of /usr/include, something is broken.
(In our system, /usr/include/sys is a hazard.)  In this case, you have
one directory called, say, "myinclude", with all the flexibility (and
"user friendliness") of a Macintosh.

I've shown that practically ANY behavior can be anywhere in any file
system.  If I do "cd /mnt/paris" and then "cd ..", you'll end up
exactly where I want it to mean /usr/include.  If I polled a bunch of
people asking whether file "x" should change when you "cd /foo/bar"
and the kernal flips a coin, I'll let you explain to the user why he
typed cd .. > / ; cd /mnt/paris ; cd .. from / and ended up in /foo.

There is no problem and thus we disagree only if you don't feel like
Unix.  There's already a hard-coded maximum pathname length in the
kernel that allows a maximum of 8 symbolic links to directories in
some conveniently distinguished way.  Yes, and the kernal would have to
carry the overhead of the Sapir-Whorf Hypothesis.  But is it that
mostly ATT posters think this is sufficient for wizards...

Since the only practical alternative is the problem...  But isn't this
EXACTLY what's done when the wars are over?  The point of a STANDARD is
to make ".." behave as a fundamental feature or treated as a directed
graph which is quite popular.  Time to compromise on this issue ?

_-_-_-_-Mark

jv@mhres.UUCP (09/01/87)

Don't flame me if I missed the point, but ... as a Unix user ...

At any moment, I can issue the command "pwd" which gives me the
current directory. If I am at "/usr/include/sys" then I can deduce what
my parent is by removing the last element of the name of the current
directory (in the above example: "/usr/include"). Sounds consistent
and easy to me.

Imagine, my system (HP9000/530 with HP-UX, a very good System V.2 with
Berkeley enhancements) does not have "." and "..":

%ls -al
total 158
-rw-------   1 jv       bsp          422 Apr  1 19:41 .login
-rw-rw-rw-   1 jv       bsp          263 Sep  1 13:32 .newsrc
-rw-rw-rw-   1 jv       bsp            0 Sep  1 13:36 .pnewsexpert
-rw-rw-rw-   1 jv       bsp           26 Sep  1 13:31 .rnlast
-rw-rw-rw-   1 jv       bsp           58 Aug 27 09:49 .rnsoft
-rw-r--r--   1 jv       bsp           14 May 23 23:04 .signature
drwxrwxrwx   1 jv       bsp            0 Aug 18 09:39 News
drwxr-xr-x   1 jv       bsp          504 Jun 13 01:05 icon
drwxr-xr-x   1 jv       bsp          912 Sep  1 13:31 maildir
%

No dot, no dot-dot, and directories have only one link ... Of course, you
can access "." and ".." from system calls - it will do what you expect.

-- 
Johan Vromans                           | jv@mh.nl via European backbone
Multihouse N.V., Gouda, the Netherlands | uucp: ..{seismo!}mcvax!mh.nl!jv
"It is better to light a candle than to curse the darkness"

mkhaw@teknowledge-vaxc.ARPA (Mike Khaw) (09/02/87)

in article <1254@mhres.mh.nl>, jv@mhres.mh.nl (Johan Vromans) says:
...
-> Imagine, my system (HP9000/530 with HP-UX, a very good System V.2 with
-> Berkeley enhancements) does not have "." and "..":
-> 
-> %ls -al
-> total 158
-> -rw-------   1 jv       bsp          422 Apr  1 19:41 .login
-> -rw-rw-rw-   1 jv       bsp          263 Sep  1 13:32 .newsrc
-> -rw-rw-rw-   1 jv       bsp            0 Sep  1 13:36 .pnewsexpert
-> -rw-rw-rw-   1 jv       bsp           26 Sep  1 13:31 .rnlast
-> -rw-rw-rw-   1 jv       bsp           58 Aug 27 09:49 .rnsoft
-> -rw-r--r--   1 jv       bsp           14 May 23 23:04 .signature
-> drwxrwxrwx   1 jv       bsp            0 Aug 18 09:39 News
-> drwxr-xr-x   1 jv       bsp          504 Jun 13 01:05 icon
-> drwxr-xr-x   1 jv       bsp          912 Sep  1 13:31 maildir
-> %

That's funny, we have an HP9000/320 and two HP9000/350s running HP-UX
and all 3 have . and ..

Mike Khaw
-- 
internet:  mkhaw@teknowledge-vaxc.arpa
usenet:	   {uunet|sun|ucbvax|decwrl|uw-beaver}!mkhaw%teknowledge-vaxc.arpa
USnail:	   Teknowledge Inc, 1850 Embarcadero Rd, POB 10119, Palo Alto, CA 94303

ekrell@hector..UUCP (Eduardo Krell) (09/02/87)

In article <1804@munnari.oz> kre@munnari.oz (Robert Elz) writes:

>Now, sometime later, I tell you that you can get to /usr/bin
>from /usr/include/sys by just using "cd bin" so you decide to
>try it ...
>
>	cd /sys/h (you're not silly, you know where /usr/include/sys
>		   really lives, and it's much less typing)
>	cd bin
>
>and you find yourself in /bin .. not exactly what was promised at all.

2 answers to that:

1. Then it's my fault because I didn't follow your instructions. You told
   me I could get to /usr/bin from /usr/include/sys, not /sys/h

2. When you created the symbolic link, you should have used "/usr/bin"
  instead of "../../bin". Then it would work.

>In the case of ".." the solution is quite simple, leave the path
>interpretation semantics the bsd way, no options, no per user
>interpretation, and simply accept that
>
>	cd /usr/include/sys; cd ..
>
>puts you in /sys (wherever that happens to be) rather than /usr/include.

That's acceptable to you. I think this is broken. At least ksh does
it right.

>If System V can't have symbolic links without changing the semantics,
>then let's just keep System V without symbolic links at all.

I think symbolic links are useful and desirable. The point is we have
an opportunity to do it right this time. We shouldn't miss that
opportunity.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

aj@zyx.UUCP (Arndt Jonasson) (09/02/87)

In article <2268@xanth.UUCP> kyle@xanth.UUCP (Kyle Jones) writes:
>In article <6342@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>> In article <8137@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>> >It may be convenient, but it does not feel like Unix.
>> 
>> As with text editors, it all depends on what you've gotten
>> used to.  Personally I can't stand symlinks making .. act
>> like a global jump to a distant place in the hierarchy.
>
>Indeed it does depend on your point of view.  I see symlinks as the
>global jump to the distant place in the hierarchy, not "..".   This
>would be readily apparent if the C shell and other shells that keep
>a 'current working directory' variable weren't ignorant of symlinks.

I think that most people's problems with the way symbolic links are now
(i.e. in BSD) arise in interactive use, i.e. when using the 'cd' command
in the shell.

The only times when I have been irritated by symbolic links are when I
have done 'cd ..' from a directory which I reached by doing 'cd' on a
link, with the known result. Mostly, I detect this after the very next
command. No big deal (this is only my personal opinion, of course).

Rather than 'fix' the semantics of '..', the appropriate mnemonic command
should be used; either 'up' or 'back', depending on what you want ('bk'
may be more appropriate, this being Unix). Additionally, the 'cd' command
can be made more talkative when it encounters a symbolic link. Below is a
program that some may find useful.


/*
  Public domain  -  linkp.c  -  for BSD systems
  Author: Arndt Jonasson  -  870902

  This program accepts zero or one arguments. In the case of one argument,
  the argument is printed out again on standard output.

  As a side-effect, any symbolic links encountered while traversing the
  argument are printed on standard error.

  The intended use is in a csh alias, such as:

  % alias cd 'cd `linkp \!*`'

  Then

  % cd ~me/sys

  may print out

   [/usr/users/me -> /dsk1/users/me]
   [/usr/users/me/sys -> /usr/include/sys]

  This should reduce the risk of one's being unpleasantly surprised when doing
  'cd ..' to a minimum.
*/

#include <stdio.h>
#include <strings.h>

#define TRUE 1
#define FALSE 0

#define MAXLEN	1024

main (argc, argv)
int argc;
char *argv[];
{
   int len;
   char path[MAXLEN];
   char *cp;
   char c;

   if (argc < 2)			 /* No argument, nothing to do */
      exit (0);

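   len = strlen (argv[1]);
   if (len > MAXLEN - 2)		 /* Too long for path[]: echo the */
   {					 /*   argument and skip the checks */
      puts (argv[1]);
      exit (0);
   }

   sprintf (path,			 /* If there isn't a '/' at the end */
	    "%s%s",			 /*   of the pathname, add one */
	    argv[1],
	    (len > 0 && argv[1][len-1] != '/') ? "/" : "");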

   cp = path;

   while (c = *cp++)			 /* Check each character of the */
					 /*   pathname */
   {
      if (c == '/')			 /* A slash: check whether this */
					 /*   partial pathname is a link */
      {
	 int n;				 /* n is the length of the link text */
	 char buf1[MAXLEN];		 /* buf1 holds the potential link */
	 char buf2[MAXLEN];		 /* buf2 will hold its link text */

	 cp[-1] = '\0';			 /* Zap away the '/' temporarily */
	 strcpy (buf1, path);		 /* Copy the pathname up to the */
					 /*   current '/' into the working */
					 /*   buffer */
	 cp[-1] = '/';			 /* Put back the '/' */

	 n = readlink (buf1, buf2, MAXLEN - 1); /* Leave room for the '\0' */
	 if (n > 0)			 /* Is this pathname a link? */
	 {
	    int hops = 0;		 /* Guards against link loops */

	    fprintf (stderr, "[%s", buf1);

	    do				 /* Yes, follow the link */
	    {
	       fprintf (stderr, " -> %.*s", n, buf2);
	       buf2[n] = '\0';		 /* End the link text properly */
	       strcpy (buf1, buf2);	 /*   and copy it into buf1 */
	       n = readlink (buf1, buf2, MAXLEN - 1); /* Is it a link itself? */
	    }
	    while (n > 0 && ++hops < 8); /* Give up after 8 hops */

	    fprintf (stderr, "]\n");
	 }
      }
   }

   puts (argv[1]);			 /* Send the original argument to */
					 /*   stdout */
   exit (0);
}
-- 
Arndt Jonasson, ZYX Sweden AB, Styrmansgatan 6, 114 54 Stockholm, Sweden
Mail address:	 ...!seismo!mcvax!zyx!aj	=	aj@zyx.SE

guy%gorodish@Sun.COM (Guy Harris) (09/03/87)

> -> Imagine, my system (HP9000/530 with HP-UX, a very good System V.2 with
> -> Berkeley enhancements) does not have "." and "..":
>
> That's funny, we have an HP9000/320 and two HP9000/350s running HP-UX
> and all 3 have . and ..

Nothing particularly funny about that at all.  Some HP9000 models - the 530 may
be one of them - run a UNIX that is built on top of a special kernel.  Those
models are the ones built out of what I have heard referred to as the "Focus"
chipset, which is a proprietary chipset implementing what is, I believe, a
stack machine.  The underlying file system of those machines, as implemented
atop that special kernel (which I have heard referred to as the "Sun kernel" -
no relation, as far as I know) is not any well-known UNIX file system; the UNIX
file system operations are implemented atop that file system.  I think they
also implemented a specialized BASIC system atop that kernel as well; I don't
know whether it was originally intended to have UNIX put atop it, or whether
that was done later.

I seem to remember somebody from HP indicating that this kernel's file system
doesn't implement "." or ".." directly, and that they don't fully emulate them;
this may explain why they don't show up when you read the directory.
Presumably, they are handled correctly when they appear in a pathname.

Other HP9000 models use 68020s; there may be other 68K models, and I think
there are also Spectrum (Precision Architecture) models.  I believe all of
those run UNIX directly.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

mwp@munnari.oz (Michael W. Paddon) (09/03/87)

in article <1254@mhres.mh.nl>, jv@mhres.mh.nl (Johan Vromans) says:
> 
> At any moment, I can issue the command "pwd" which gives me the
> current directory. If I am at "/usr/include/sys" then I can deduce what
> my parent is by removing the last element of the name of the current
> directory (in the above example: "/usr/include"). Sounds consistent
> and easy to me.
> 
> Imagine, my system (HP9000/530 with HP-UX, a very good System V.2 with
> Berkeley enhancements) does not have "." and "..":
> 
> %ls -al
> total 158
> -rw-------   1 jv       bsp          422 Apr  1 19:41 .login
> -rw-rw-rw-   1 jv       bsp          263 Sep  1 13:32 .newsrc
> -rw-rw-rw-   1 jv       bsp            0 Sep  1 13:36 .pnewsexpert
> -rw-rw-rw-   1 jv       bsp           26 Sep  1 13:31 .rnlast
> -rw-rw-rw-   1 jv       bsp           58 Aug 27 09:49 .rnsoft
> -rw-r--r--   1 jv       bsp           14 May 23 23:04 .signature
> drwxrwxrwx   1 jv       bsp            0 Aug 18 09:39 News
> drwxr-xr-x   1 jv       bsp          504 Jun 13 01:05 icon
> drwxr-xr-x   1 jv       bsp          912 Sep  1 13:31 maildir
> %
> 
> No dot, no dot-dot, and directories have only one link ... Of course, you
> can access "." and ".." from system calls - it will do what you expect.
> 
> -- 
> Johan Vromans                           | jv@mh.nl via European backbone
> Multihouse N.V., Gouda, the Netherlands | uucp: ..{seismo!}mcvax!mh.nl!jv

I hope you are not suggesting that we don't need '.' and '..' for the
above reasons. The routine getwd(3) [and thereby pwd] use the '..' link
to trace their way back to the root of the tree to yield the true name
of the current directory. The argument for using pwd to obviate the need
for '..' is therefore rather circular.

Note that there is no easy way to find the *real* path of a file or
directory if some of the proposed changes to symbolic links are
implemented in the kernel.
The kernel only knows, in this case, one possible path -- the
one you used to get there. Given this problem, how would you know if
you were removing the real directory or just a symbolic link? As has
been stated before, the solution is to regard symlinks as separate
objects with their own semantics.

The '.' link is just as useful when a user wants to run an executable
in the current directory instead of the first instance of that name in
his PATH envariable. The only other solution -- putting '.' first in the
path -- is an obvious security risk.


						mwp
						===
===========================
UUCP:	{seismo,mcvax,ukc,ubc-vision}!munnari!mwp
ARPA:	mwp%munnari.oz@seismo.css.gov
CSNET:	mwp%munnari.oz@australia

kre@munnari.oz (Robert Elz) (09/03/87)

It's definitely time to stop this discussion; it's getting nowhere, and
when alice!mvs starts posting its gibberish it's best to shut up before
some clown decides to comment on that...

In article <2912@ulysses.homer.nj.att.com>, ekrell@hector..UUCP (Eduardo Krell)
writes:
> 1. Then it's my fault because I didn't follow your instructions. You told
>    me I could get to /usr/bin from /usr/include/sys, not /sys/h

But they're the same place.  Isn't that the whole point of a "link"?
Whichever name you refer to you get the same, identical, object, with
the same properties as all other names that refer to the same object.

If you're trying to make symbolic links hide themselves in the filesystem
and almost appear not to be there at all, I would have thought that this
would have been one of the fundamental properties you would have been
determined to preserve.

Certainly it is one that I demand.  Even though I am quite willing,
even eager, to change "symbolic link" to "pointer file" or some such
name where it doesn't evoke the same emotional response to how it
should behave, I still want object X to be object X, regardless
of its past history.  Can you really not imagine the confusion
such a change would cause?

> 2. When you created the symbolic link, you should have used "/usr/bin"
>   instead of "../../bin". Then it would work.

No, never.  Building in full path names to *anything* is a very
poor idea.  Of course, in this particular example it would probably
be safe, as /usr/bin rarely moves ... but in general a relative
pathname should be used in anything that's to be saved permanently,
(as opposed to things like command args, etc) to allow for the whole
object (tree) to be moved someplace else with impunity.

> I think symbolic links are useful and desirable.

Finally, we agree on something...

> The point is we have an opportunity to do it right this time.
> We shouldn't miss that opportunity.

If there's anything that's clear from this discussion, it's
that there isn't a consensus of opinion on what is "right".

Given that, do you really want to be immortalized as the AT&T
person who forced your semantics of symlinks into a public
release, only to have it changed in the next release because
of the outcry?

Would you like to be whoever it was that suggested that Sys V.0's
compiler (or linker, or whatever) should require "extern" on all but
one instance of an extern variable, even though just about everyone
would agree that that change was "right"?

Maybe I should be more explicit on why I don't think that you can
possibly have implemented what you claim to have implemented, and
implemented it correctly.  Ed Gould provided a counter example
that you seemed to just shrug off without really understanding it.

Let me expand on that, and do it in a way where the usefulness of the
technique is a little more apparent.

First, let us agree that one of the properties of unix systems that
we do want to preserve is that there isn't any system imposed limit
on how long a process can execute normally.  If it can execute its
major code loop for a few thousand iterations, it should be able to
just keep on doing that forever.

There are all kinds of other limitations on processes (number of open
files, amount of memory, ...) but none that affect continuous execution.

Now, let's assume that I have an application with a HUGE database to
support.  Let's make it so huge that it isn't likely to fit in any
part of the file system tree that my customers are likely to have
available.  I could require them to rearrange their mounted filesystems
to provide lots of file space under one directory, but that's not
a very intelligent business decision for me to make, especially not
after Sys V gets symlinks, and I know I can rely on those.

It happens that my application's database can be nicely divided into
a number (let's say just two for this example, but that doesn't matter)
of separate filesystem trees.

Of course, I'm just delivering a binary to my customer, so he can't
recompile it to build in the directory names, and he wouldn't want to
anyway.  I could require the relevant directory names be passed as
args to the various commands, but that's tedious, even assuming shell
scripts to do it all.  (Tedious to set up initially, and tedious to
maintain as things move around later).  Using ENV vars for this is
simply wrong.

So, what I decide to do is have each of the major file trees contain symlink
pointers to the other file trees; for this example let's call the symlinks
"a", "b", etc (just two will do).  In the "b" tree I have a symlink "a"
that points at the "a" data, and in that tree a "b" symlink that points
at the "b" data.

All very simple.

Now the application looks something like this

	for (;;) {

		chdir("a");
		process_a_data();
		chdir("b");
		process_b_data();

	}

with no other chdir sys calls anywhere.

When my customer first installs my database, he puts the "a" data on
/disc1/a and the "b" data on /disc2/files/b; then he (or my installation script)
does something like

	chdir /disc1/a
	ln -s /disc2/files/b b
	chdir /disc2/files/b
	ln -s /disc1/a a

The application starts, and runs for years without stopping, no
problems at all.

But after those years, my customer decides that he can afford a big new
disc, and on this he's going to be able to put all the files into
one tree.  Let's call the new place /disc3.

First he moves all the data files, then

	chdir /disc3
	ln -s . a
	ln -s . b

and he starts the application running.  How long does your implementation
give it before things start mysteriously failing?

Nb: there are no paths here with lengths longer than 1024.  Nor are
there any paths that breach the "8 symlink" rule.  [Aside: if you can
find a clean way to remove that one, apart from just increasing the
number, I won't object at all .. it's not easy though].

Please actually try this code on your implementation (make the "process"
functions be empty macros to avoid wasting time).  If it fails, then
I contend that your implementation is not "right".  Every couple of 
thousand times round the loop you could have the program fork and exec "pwd"
to provide some idea where in the tree it is, and where chdir("..")
will go.

If your implementation uses the "store the path" technique, then all
this will work, but you will have changed lots of other unix semantics,
other things just won't behave as they used to, and you've given no
indication that's how your implementation works.

Finally, others suggested generalized mount as a solution to this
problem.  I have no objection to that concept at all, however it
doesn't really solve the problem.

First, symlinks are user definable things, mount is generally
an administrator's tool.  As a user I want to be able to make
pointers to directories, and I don't want to lose that ability.

Second, unless the semantics of mount are changed more than I think
was intended when this is done, when you "mount" /sys/h on /usr/include/sys
you are effectively removing /sys/h from its old position, and putting
it under /usr/include.  Whether that means that references to /sys/h
now fail, I don't know.  If they do, then that is not the object at all.
If they don't, then "/sys/h/.." would be "/usr/include" which doesn't
seem to be the object either.

So, generalized mount is a good idea, and can certainly be useful,
but it in no way helps solve the symlink problems.

kre

vic@phoenix.PRINCETON.EDU (V Duchouni) (09/04/87)

In article <3711@elecvax.eecs.unsw.oz> neilb@elecvax.eecs.unsw.oz (Neil F. Brown) writes:
>It occurs to me that there is a very simple way to avoid the
>problems with symbolic links. It is to not have symbolic links
>to directories.
>I realise this has already been thought of, but I feel it was
>discarded too quickly.
>
   NO!!! Not too quickly at all.
Symbolic links to directories are essential. Consider two or more
hosts sharing the /usr directory. How would they maintain separate
spool directories, mail alias databases, etc. without symbolic links?
	Symbolic links do magic: each host interprets the pathname
of the link relative to its own fs tree, e.g. /usr/spool->/private/usr/spool
will be a different place on each machine if /private is unshared.
	This mechanism has saved heaps of duplication of static system
files on the local Suns; no version of mount which refers to a physical
rather than a symbolic address could even come close.

	V.Duchovni <vic@fine.Princeton.Edu>

jv@mhres.mh.nl (Johan Vromans) (09/04/87)

In article <16294@teknowledge-vaxc.ARPA> mkhaw@teknowledge-vaxc.ARPA (Mike Khaw) writes:
>in article <1254@mhres.mh.nl>, jv@mhres.mh.nl (Johan Vromans) says:
>...
>-> Imagine, my system (HP9000/530 with HP-UX, a very good System V.2 with
>-> Berkeley enhancements) does not have "." and "..":
>
>That's funny, we have an HP9000/320 and two HP9000/350s running HP-UX
>and all 3 have . and ..
>

That's right. I suppose the 800 series has them too. Only the 500 series
has a different (and more robust) file structure than the others.

It took me some time to lose the habit of changing file characteristics with
"chmod go-w * .*" as super-user...

-- 
Johan Vromans                              | jv@mh.nl via European backbone
Multihouse N.V., Gouda, the Netherlands    | uucp: ..{?????!}mcvax!mh.nl!jv
"It is better to light a candle than to curse the darkness"

dave@murphy.UUCP (Dave Cornutt) (09/04/87)

In article <6366@brl-smoke.ARPA>, gwyn@brl-smoke.UUCP writes:
> In article <586@murphy.UUCP> dave@murphy.UUCP (Dave Cornutt) writes:
> >Do it on a per-process basis...
> 
> As one of the principal developers of dual-universe UNIXes,
> perhaps it is appropriate for me to comment on this.  Dual
> interpretations are a pain in the donkey!  Select a general
> enough interpretation and stick with it; don't make anything
> that doesn't have to be, dependent on user whims.

I'm trying to suggest how David Korn's scheme could be implemented
in SYSV so that it would be compatible with existing implementations
of symbolic links.  (I don't see what dual-universe setups
have to do with it.)

> example of the dangers, consider the attempt Berkeley made
> to cater to non-restart of slow I/O system calls after return
> from a signal handler in 4.3BSD as originally distributed.
> The bit SV_INTERRUPT in the sv_flags of the struct sigvec
> was supposed to control this behavior.  Unfortunately, it
> was not reset upon an exec(), so if ANY process enabled
> non-restart, all descendants acted that way too.  Of course,
> the descendant processes were written assuming the default
> 4.3BSD restart of system calls, so things like our EMACS
> editor broke horribly when run from a System V shell.  The
> moral is, too many options on how the OS behaves is a hazard.

Then I guess we should get rid of all those pesky tty mode bits and go back
to just CBREAK and RAW.  If we get rid of all options that can modify the
behavior of the OS, then we have a pretty inflexible system, and that just
doesn't seem to be in the spirit of UNIX.  Of course, haphazard implementation
of options can lead to its own set of problems, especially when the default
is not defined or not compatible with previous behavior.  In the case above,
not resetting the SV_INTERRUPT flag on exec could be regarded as a design
botch since the default state after an exec is undefined, and it directly
affects the way certain things in the program may behave.  Actually, the
thing that should have been done originally, way back in the beginnings
of BSD, was to make the V7 behavior the default and provide some way of
setting a flag on a per-process basis to get the auto-restart
characteristic.  But hindsight is easy.

The ".." thing is more of a per-session characteristic.  A user either wants
all processes that he/she runs to interpret ".." as "up", or none of them. 
Unfortunately, the UNIX kernel does not maintain any per-user or
per-session information (except for disk quotas and a few things in the tty
driver), so you have to simulate it by carrying things over from parent to
child processes.  There is precedent for this; consider the resource limits
in BSD.  That's why I suggested inheriting the setting of the flag on exec. 
Since it doesn't directly affect the call and return conventions of any 
system calls, I think it would be reasonably safe.

I'll have to admit that I'm not all that comfortable with something that
can be flipped on and off that can change the interpretation of file names.  
I'm just trying to come up with an idea so that we can have our cake and
eat it too: the Korn idea, but with compatibility (by default) with existing
behavior.  Since the only existing behavior is the BSD one, that has to be
the default.  (I'll admit that there are some political problems with asking
SYSV to adopt a BSD idea.)  This certainly isn't the only way to do it;
someone else (sorry, but I can't find the article) suggested making ",,"
mean "up", which would be fine with most people.  On the other hand, the
idea of just imposing Korn's scheme without giving any alternative would
not fly with our customers at all.  Since both behaviors appear to have
a substantial number of advocates, I'm trying to suggest a way that it
could be implemented while allowing SYSV and BSD to move toward a common
ground, instead of creating another breach.

Maybe that's too much to ask for.
---
"I dare you to play this record" -- Ebn-Ozn

Dave Cornutt, Gould Computer Systems, Ft. Lauderdale, FL
[Ignore header, mail to these addresses]
UUCP:  ...!{sun,pur-ee,brl-bmd,seismo,bcopen,rb-dc1}!gould!dcornutt
 or ...!{ucf-cs,allegra,codas,hcx1}!novavax!gould!dcornutt
ARPA: dcornutt@gswd-vms.arpa

"The opinions expressed herein are not necessarily those of my employer,
not necessarily mine, and probably not necessary."

ekrell@hector..UUCP (Eduardo Krell) (09/05/87)

In article <1809@munnari.oz> mwp@munnari.UUCP writes:

>The kernel only knows, in this case, one possible path -- the
>one you used to get there. Given this problem, how would you know if
>you were removing the real directory or just a symbolic link?

How about lstat() ?
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

ekrell@hector..UUCP (Eduardo Krell) (09/05/87)

In article <1811@munnari.oz> kre@munnari.UUCP writes:

>It's definitely time to stop this discussion, it's getting nowhere

We finally agree on something...

>But they're the same place.  Isn't that the whole point of a "link"?
>Whichever name you refer to you get the same, identical, object, with
>the same properties as all other names that refer to the same object.

But the point is that it should be transparent to some degree to the user
(one of the arguments is just how much transparency they ought to have).
The user shouldn't know /usr/include/sys is a symbolic link and shouldn't
care.

>If you're trying to make symbolic links hide themselves in the filesystem
>and almost appear not to be there at all, I would have thought that this
>would have been one of the fundamental properties you would have been
>determined to preserve.

I do want to hide them, much the same way one hides the implementation
details from an abstract data type (ADT). The user doesn't know how an
ADT is implemented and if he does know, he can't rely on that.

>I still want object X to be object X, regardless
>of its past history.  Can you really not imagine the confusion
>such a change would cause?

I imagine it would never be as bad as the confusion caused by the current
implementation of symbolic links. In my implementation, "cd /usr/include/sys;
cd .." always puts you in /usr/include.
An implementation that doesn't do that is confusing.

>Given that, do you really want to be immortalized as the AT&T
>person who forced your semantics of symlinks into a public
>release, only to have it changed in the next release because
>of the outcry?

No, I want to be immortalized as a person who helped to restore
the tree structure to the Unix File System and who implemented
symbolic links in a better, consistent way.

>Would you like to be whoever it was that suggested that Sys V.0's
>compiler (or linker, or whatever) should require "extern" on all but
>one instance of an extern variable, even though just about everyone
>would agree that that change was "right"?

I don't know who "just about everyone" is in this case. It certainly
wasn't my idea or Dave Korn's idea or anyone else's in Bell Labs' research
area.

> Ed Gould provided a counter example
>that you seemed to just shrug off without really understanding it.

I fully understood it. What you don't seem to understand is that his
program generated filenames that were unusable since they couldn't be
referenced from outside because of the limit of 8 symbolic links in
a pathname.

I hoped my argument was clear enough, but I will be happy to provide
a more detailed explanation if anyone still doesn't understand it.

>First, let us agree that one of the properties of unix systems that
>we do want to preserve, is that there isn't any system imposed limit
>on how long a process can execute normally, if it can execute its
>major code loop for a few thousand iterations, it should be able to
>just keep on doing that forever.
>
>There are all kinds of other limitations on processes (number of open
>files, amount of memory, ...) but none that affect continuous execution.

Well, a program that tries to open a zillion files (and that checks the
returned values from open()) will "fail" when it reaches that limit.
A program that calls malloc() forever will "fail" sooner or later.
I don't quite understand your definition of "continuous execution".

>Now the application looks something like this
>
>	for (;;) {
>
>		chdir("a");
>		process_a_data();
>		chdir("b");
>		process_b_data();
>
>	}
>
>with no other chdir sys calls anywhere.

There should be a chdir("..") after calling process_a_data() to put you
back in the original directory, right?  (Of course, if "a" is a symbolic
link and you're still using the BSD semantics, then you can't get back
to the original directory with chdir("..") ... your example illustrates
the broken semantics of symbolic links.)

>But after those years, my customer decides that he can afford a big new
>disc, and on this he's going to be able to put all the files into
>one tree.  Let's call the new place /disc3.
>
>First he moves all the data files, then
>
>	chdir /disc3
>	ln -s . a
>	ln -s . b
>
>and he starts the application running.  How long does your implementation
>give it before things start mysteriously failing?

I can't really see why my implementation would break. Can you please explain
to me how you THINK my implementation works and why you think it will break?

>Please actually try this code on your implementation

Before I do that I need a good main(). Do I or don't I need the chdir("..")?

>If your implementation uses the "store the path" technique

Tell me what is the "store the path" technique and I'll tell you whether
we use it or not.
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

dave@sdeggo.UUCP (David L. Smith) (09/05/87)

Here's an idea:  Why not just define a new file "..." to _always_ mean go back 
to the parent directory you had in the path you used.  Then we'd have ., the 
current directory, .. the parent directory and ... the directory which was the 
parent in the path we specified during the chdir().  At login, .. and ... would 
be equal.  Or, this could be swapped for those who have been screaming about how
.. should always be the parent directory specified in the path.  This info
could just be stored on a per-process basis by the kernel, no problem.  
Nothing breaks and we don't need any funky modes to confuse everyone.

Any problems with this idea?


-- 
David L. Smith
{sdcsvax!sdamos,ihnp4!jack!man, hp-sdd!crash}!sdeggo!dave
sdeggo!dave@sdamos.ucsd.edu 
Oxymoron: Unix Documentation


-- 
David L. Smith
{sdcsvax!sdamos,ihnp4!jack!man, hp-sdd!crash}!sdeggo!dave
sdeggo!dave@sdamos.ucsd.edu 
"How can you tell when our network president is lying?  His lips move."

frank@zen.UUCP (09/06/87)

In article <16294@teknowledge-vaxc.ARPA> mkhaw@teknowledge-vaxc.ARPA (Mike Khaw) writes:
>in article <1254@mhres.mh.nl>, jv@mhres.mh.nl (Johan Vromans) says:
>-> Imagine, my system (HP9000/530 with HP-UX, a very good System V.2 with
>-> Berkeley enhancements) does not have "." and "..":
>
>That's funny, we have an HP9000/320 and two HP9000/350s running HP-UX
>and all 3 have . and ..
>Mike Khaw

HP-UX has a different file system implementation on the series 500 from
those on the 300 and 800 series.  At the time the 500 series first came out
(1983), HP (I am informed) didn't rate either the Bell or BSD file systems
for reliability, so they implemented their SDF (Structured Directory Format)
as the 500 series file system.  Later improvements have meant that HP turned
to a version of the McKusick/BSD High Performance file system for first the
300 series, and now the 800 series.  The point being that the former does
not use "." and ".." files, while the latter two do.

At the moment, no implementation of HP-UX supports symbolic links.

Frank Wales,                        [frank@zen.uucp<->mcvax!zen.co.uk!frank]
Zengrange Ltd, Greenfield Rd, Leeds, ENGLAND, LS9 8DB.      (+44) 532 489048

ekrell@hector..UUCP (Eduardo Krell) (09/06/87)

In article <9091@tekecs.TEK.COM> snoopy@doghouse.gwd.tek.com (Snoopy) writes:

>Why is -I the "right" way and #define the "wrong" way?

Because it doesn't require the "#ifdef KERNEL" kludge and I believe
header files shouldn't have any #include " ... " in them. That is,
you should only use brackets in #include in header files.

Why, you ask? Because if you have Reisser's cpp (as most do), then the
rules for searching "" files are different from the ones in K&R. Reisser's cpp
looks for "" files in the directory where the file with the #include was
(which is different from the directory where the original .c file was).

What this means is that if a header file in a standard header directory
uses

#include "y.h"

then the y.h in that directory will be used ALWAYS, and there's no way
I can override that and make it include the y.h I have in my own directory.

>What does the relevant section of your Makefile look like?

for make users:

CFLAGS = -I<whatever>

for nmake users:

.SOURCE.h : <whatever>
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

neilb@elecvax.eecs.unsw.oz (Neil F. Brown) (09/07/87)

>From: kre@munnari.oz (Robert Elz)
>Date: 3 Sep 87 16:26:44 GMT

>Finally, others suggested generalized mount as a solution to this
>problem.  I have no objection to that concept at all, however it
>doesn't really solve the problem.
>
>First, symlinks are user definable things, mount is generally
>an administrators tool.  As a user I want to be able to make
>pointers to directories, and I don't want to lose that ability.
>
>Second, unless the semantics of mount are changed more than I think
>was intended when this is done, when you "mount" /sys/h on /usr/include/sys
>you are effectively removing /sys/h from its old position, and putting
>it under /usr/include.  Whether that means that references to /sys/h
>now fail, I don't know.  If they do, then that is not the object at all.
>If they don't, then "/sys/h/.." would be "/usr/include" which doesn't
>seem to be the object either.

Indeed, /sys/h would become the stub directory that /usr/include/sys was
before the mount. But is that really so bad?
I recall when I first saw a
	#ifdef KERNEL
	#include <../h/thing.h>
	#else
	#include <sys/thing.h>
	#endif
I thought this was somewhat repulsive. Why can't the includes always be
in the same place?  If we want different include files we convince our
makefiles to give a -I flag to cpp - probably via the environment.

The thrust of my argument is - do you ever really want to have a directory
in two different places? i.e. with two different absolute path names that
don't include the well-understood (I thought) `.' and `..'.
If you don't, then generalised mounting will solve your problems.

There are two sub-cases for this question:
1/ Public directories,
2/ Private directories.

For public directories I say NO, No a thousand times NO.  In a well ordered
system there should be a place for everything, and everything in its place.
Only the insecure need two homes.

For private directories the situation could be different. I have occasionally
played with symlinks, but not found a lot of good uses - but maybe I'm
just unimaginative.

So, I ask - an open question. What uses do you have for symbolic links to
directories?
 - If you are just moving things from one file system to another, it is a
   generalised mount you want. (I don't think many ordinary users will have
   this problem - will they??)
 - Is it just an alias, a keystroke saver. If so, it is my feeling that the
   shell is the place for aliases. If they are just aliases, would you really
   want to put them in a program. If not, the shell is the `right' place.
 - Is it a convenience that could be better (or at least satisfactorily)
   handled by one of the many Paths (PATH, CDPATH, -I for cpp includes,
   -L (I think) of ld libraries, MANPATH ...).
 - Or do you REALLY want the file system to think that the directory is in
   two places at once. If so I would be interested to know of the use.
   ( I may be able to use it to my advantage some day...)

In answer to kre's use for symlinks in WonderDatabaseApplication
(a use which clearly shows holes in Krell's solution) I would suggest
the time honoured configuration solution of a
.WonderDatabaseApplicationRC file containing the directory names.
It works for other programs, why not this one?

A final note, I'm not really suggesting that symlinks to directories should
be banned in private (I abhor them in public). They can be very useful for
the occasional quick-and-dirty.
But do you really want them on such a long term basis that programs may
use them and want to be able to `..' out of them?

All answers welcome, by news or mail as you see appropriate.

NeilBrown

allbery@ncoast.UUCP (09/07/87)

Perhaps the time has come to consider alternatives to symbolic links.  The
alternative should coexist with both standard filesystem semantics and
symbolic links.

Some possibilities are:

(1) Generic links.  These would be special files containing, not pathnames,
    but (device, inode) pairs.  These could also be generalized into
    directory unions by storing multiple (device, inode) pairs, possibly
    along with an attribute word specifying whether to allow file creation,
    etc. in that member of the union.  While this looks like a nice thing
    to have in general, it doesn't solve the problem.

(2) Multiple backpointers.  This isn't much different from storing the
    entire path (or device/inode numbers for same), unfortunately; no
    solution.

(3) Accept the fact that you can't do it rationally; the whole reason
    multiple links to directories are discouraged in the first place.
    This implies that symbolic links to directories are persona non grata.
    Somehow, I doubt that this is an acceptable solution, even though it
    solves a number of potential problems.  (I'm minded of a fool on an
    Altos which had an important database directory symlinked over the
    Worknet; this jerk saw that the link count was 1 for large files on
    both machines, concluded that there were duplicate files, and deleted
    them on "one" machine; this, of course, blew them away for good.  This
    same fool hadn't been doing backups, of course... and did even more
    stupid things when I tried to recover them.  Another story for later.)

(4) Force a user to be explicit in forward/backward pointer following.
    I'll use the ">" and "<" from Multics, in an unorthodox way, to show
    the forward and backward pointers; for UNIX, we'd have to invent
    a backpointer separator, since using ">" and "<" not only overloads
    them from the shell but also forces people to learn to use ">" in
    place of "/".  Alas, we can't steal "\" for this unless we change
    the escape character, which is even hairier....
    
    $ mkdir a b a>c
    $ ln a>c b>c
    $ cd a>c
    $ pwd
    /u>allbery>a>c
    $ cd <b>c
    $ pwd
    /u>allbery>b>c
    $ cd <
    $ pwd
    /u>allbery>b
    $ cd c
    $ cd <a
    $ pwd
    /u>allbery>a
    $ _

    Not only is it ugly, but it places a burden on the user which we're
    trying to avoid anyway; we may as well throw out symlinks entirely,
    it's just as much of a burden (in fact, less) to use full pathnames
    than to do this.  Moreover, the meaning of unspecified "<" is still
    not well defined unless you save the path you used to get there in
    the first place.

Maybe someone out there can come up with a better way to do it, but I doubt
it; the whole purpose of the tree structure is to make it unnecessary to
carry the full pathname around in order to get from point A to point B
when point A is a "parent" of point B.  I, myself, vote for disallowing
symlinks to directories; it's the long way around for /usr/include/sys ->
/sys/h, but it's also not ambiguous.  Either that or carry the path around,
either as a string or as (device, inode) pairs; and this places a limit
on the length of a path, which is an unreasonable restriction.
-- 
	    Brandon S. Allbery, moderator of comp.sources.misc
  {{harvard,mit-eddie}!necntc,well!hoptoad,sun!mandrill!hal}!ncoast!allbery
ARPA: necntc!ncoast!allbery@harvard.harvard.edu  Fido: 157/502  MCI: BALLBERY
   <<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>
All opinions in this message are random characters produced when my cat jumped
(-:		      up onto the keyboard of my PC.			   :-)

mwp@munnari.oz (Michael W. Paddon) (09/07/87)

in article <2924@ulysses.homer.nj.att.com>, ekrell@hector..UUCP (Eduardo Krell) says:
> 
> In article <1809@munnari.oz> mwp@munnari.UUCP writes:
> 
>>The kernel only knows, in this case, one possible path -- the
>>one you used to get there. Given this problem, how would you know if
>>you were removing the real directory or just a symbolic link?
> 
> How about lstat() ?
>     

This hardly makes things simpler for the novice user as you claim.
In fact explaining why
		cd somepath/dir		[assume an empty directory]
		cd ..
		rmdir dir
may work sometimes but not others may be much harder
than explaining to people the idea of a "jump" into another place in
the tree. An operation like this will take more sophistication under
your proposed scheme for little real benefit.

It may be more worthwhile to look at a file system structure that is
more like a general graph than hacking up a tree to have some graph
attributes. But, as has been said before, if you do that it probably
isn't UNIX anymore.

						mwp
						===
===========================
UUCP:	{seismo,mcvax,ukc,ubc-vision}!munnari!mwp
ARPA:	mwp%munnari.oz@seismo.css.gov
CSNET:	mwp%munnari.oz@australia

ekrell@hector..UUCP (Eduardo Krell) (09/08/87)

In article <1813@munnari.oz> mwp@munnari.oz (Michael W. Paddon) writes:

|>>The kernel only knows, in this case, one possible path -- the
|>>one you used to get there. Given this problem, how would you know if
|>>you were removing the real directory or just a symbolic link?
|> 
|> How about lstat() ?
|>     
|
|This hardly makes things simpler for the novice user as you claim.
|In fact explaining why
|		cd somepath/dir		[assume an empty directory]
|		cd ..
|		rmdir dir

The original question was about removing a symbolic link, and how
could you tell the difference between removing the symbolic link and
the directory it points to.

Your example above is different (I'm assuming "somepath" is the symlink).
I can also come up with an example as simple as yours that's hard to explain
to a naive user:

/foo has 2 subdirectories: /foo/d1 and /foo/d2. /foo/d1 is empty.
/foo/d2 is really a symbolic link.

	cd /foo/d2
	rmdir ../d1

This never works under BSD semantics.

If, on the other hand, "dir" is the symbolic link in your example,
then "rmdir dir" won't work since dir is not a directory. "rm dir"
would work under my implementation, it won't work under BSD semantics
unless the directory it points to is also called "dir".
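
The behaviour being argued over can be observed directly. What follows is a minimal sketch, assuming a POSIX system with a writable /tmp; the function name `dotdot_after_symlink` and the directory names `real`, `sub`, `foo` and `d2` are made up for illustration. It builds a small tree in which `foo/d2` is a symbolic link to `real/sub`, changes into the link, and reports which directory ".." then names.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* True when two stat results name the same object. */
static int same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Returns 1 if ".." is the parent of the link's target (physical),
 * 0 if it is the directory holding the symlink (logical), -1 on error. */
int dotdot_after_symlink(void)
{
    char base[] = "/tmp/dotdotXXXXXX";
    char real[64], sub[64], foo[64], d2[64];
    struct stat up, st_real, st_foo;
    int result = -1;
    int saved = open(".", O_RDONLY);   /* so we can get back afterwards */

    if (saved == -1)
        return -1;
    if (mkdtemp(base) == NULL) {
        close(saved);
        return -1;
    }
    snprintf(real, sizeof real, "%s/real", base);
    snprintf(sub, sizeof sub, "%s/real/sub", base);
    snprintf(foo, sizeof foo, "%s/foo", base);
    snprintf(d2, sizeof d2, "%s/foo/d2", base);

    /* base/foo/d2 is a symbolic link to base/real/sub */
    if (mkdir(real, 0700) == 0 && mkdir(sub, 0700) == 0 &&
        mkdir(foo, 0700) == 0 && symlink(sub, d2) == 0 &&
        chdir(d2) == 0 &&
        stat("..", &up) == 0 &&
        stat(real, &st_real) == 0 && stat(foo, &st_foo) == 0) {
        if (same_file(&up, &st_real))
            result = 1;
        else if (same_file(&up, &st_foo))
            result = 0;
    }

    if (fchdir(saved) == -1)   /* leave the temporary tree before cleanup */
        result = -1;
    close(saved);
    unlink(d2);
    rmdir(sub);
    rmdir(real);
    rmdir(foo);
    rmdir(base);
    return result;
}
```

On a kernel with BSD-style symbolic links the function returns 1: after the chdir(), ".." is `real`, not `foo`, which is why a sibling of the link cannot be reached as `../d1`.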
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

schwartz@gondor.psu.edu (Scott E. Schwartz) (09/08/87)

In article <1119@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>What we need is a third entity that will (a) not be attached to a 
>specific place in the directory hierarchy but (b) be known to
>the kernel.

>we will simply be able to say:
>     #include <$SYS/thing.h>
>and have $SYS be interpreted by the filesystem. 

This sounds like a good idea to me.  Doesn't Apollo provide something
like this in Aegis?  I've seen Apollo systems (the engineering department
at Swarthmore, if you must know) that had a link from /usr/bin to 
something like  /usr/$(SYS)/bin  so users can select /usr/bsd/bin or 
/usr/sys5/bin.  


-- Scott Schwartz            schwartz@gondor.psu.edu

mwp@munnari.oz (Michael W. Paddon) (09/09/87)

in article <2931@ulysses.homer.nj.att.com>, ekrell@hector..UUCP (Eduardo Krell) says:

> The original question was about removing a symbolic link, and how
> could you tell the difference between removing the symbolic link and
> the directory it points to.

The point is that your implementation requires *more* sophistication
from the average user than the BSD one. How many casual users know
what lstat(2) does off the top of their heads?

> Your example above is different (I'm assuming "somepath" is the symlink).
> I can also come up with an example as simple as yours that's hard to explain
> to a naive user:

Explaining the BSD semantics requires only the model of a global jump
to another part of the tree. Explaining situations which your semantics
lead to requires detailed knowledge of symlinks and the way context can
affect path names.

> /foo has 2 subdirectories: /foo/d1 and /foo/d2. /foo/d1 is empty.
> /foo/d2 is really a symbolic link.
> 
> 	cd /foo/d2
> 	rmdir ../d1
> 
> This never works under BSD semantics.

But with the idea of global jumps, this example is meaningless.

It seems to me that you are suggesting a change to the kernel because
the semantics of "cd .." don't please you under the BSD system. Your
proposal certainly fixes this behaviour. There seem to be other ramifications
of the scheme that far outweigh the proposed advantages, as has
been pointed out by numerous examples.

I am not wholly against your ideas -- "cd .." really irks me on occasion.
However, jumping in and changing the kernel may not be the best solution.

					mwp
					===
===========================
UUCP:	{seismo,mcvax,ukc,ubc-vision}!munnari!mwp
ARPA:	mwp%munnari.oz@seismo.css.gov
CSNET:	mwp%munnari.oz@australia

gwyn@brl-smoke.ARPA (Doug Gwyn ) (09/09/87)

In article <599@murphy.UUCP> dave@murphy.UUCP (Dave Cornutt) writes:
>Then I guess we should get rid of all those pesky tty mode bits ...

I realize you were being sarcastic, but actually the fact that the
setting of the terminal handler modes by one process can affect
another process's operation IS a problem, as everyone who has found
their terminal left in a funny state by a sick or buggy process can
attest.  UNIX might have been better off with the state of the
controlling terminal kept on a per-process basis (and the "stty"
command built into the shell).  There would be some problems
implementing that, and it's too late now anyway, but it is a good
example of the problem with the OS maintaining "modes".

The reason for mentioning dual universes w.r.t. ".." interpretation
is that the interpretation mode would have to be inherited, so that
the shell-level "$ command ../file" would be handled per user wishes
by "command".  But note that the programmer of "command" may have
decided to use ".." with the other interpretation in places not
connecting with handling the argument.  I don't see how you could
reasonably have it both ways at the same time.  Trying to provide
a run-time choice of interpretation for ".." seems unworkable.

ekrell@hector..UUCP (Eduardo Krell) (09/10/87)

In article <1817@munnari.oz> mwp@munnari.UUCP writes:
   (me)
>> The original question was about removing a symbolic link, and how
>> could you tell the difference between removing the symbolic link and
>> the directory it points to.
>
>The point is that your implementation requires *more* sophistication
>from the average user than the BSD one. How many casual users know
>what lstat(2) does off the top of their heads?

What are you talking about? How can you tell if a file is a symbolic link
or not under BSD? That's right, lstat(). There's no other way!
The amount of code it would take (or the user-level command) to tell
the difference between a plain file/directory and a symbolic link is
under our implementation EXACTLY THE SAME AS IN BSD. Period.
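
For reference, the check in question is only a few lines. This is a minimal sketch, assuming a POSIX system; `classify_path` is a hypothetical helper name, not something from either implementation discussed here.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>

/* Classify a path without following a final symbolic link.
 * Returns 'l' for a symlink, 'd' for a directory, 'f' for anything
 * else that exists, and '?' if lstat() fails. */
char classify_path(const char *path)
{
    struct stat sb;

    if (lstat(path, &sb) == -1)
        return '?';
    if (S_ISLNK(sb.st_mode))
        return 'l';
    if (S_ISDIR(sb.st_mode))
        return 'd';
    return 'f';
}
```

Using stat() in place of lstat() would follow the link and report on whatever it points to, so a symlink to a directory would come back as 'd'.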

>Explaining the BSD semantics requires only the model of a global jump
>to another part of the tree. Explaining situations which your semantics
>lead to requires detailed knowledge of symlinks and the way context can
>affect path names.

It only requires knowing that /foo/bar/.. is the same as /foo (which is
the way I thought Unix worked).

>But with the idea of global jumps, this example is meaningless.

And I guess I can now dismiss your examples as meaningless with the idea
that /foo/bar/.. == /foo ?

>It seems to me that you are suggesting a change to the kernel because
>the semantics of "cd .." don't please you under the BSD system.

I made this clear a while ago: I use ksh on BSD, System V and all other
Unix systems we run. ksh treats "cd .." the way we propose to. I don't
need to "fix" the "cd .." problem, it's already taken care of by ksh.
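
A shell can get this behaviour by keeping track of the path as text and never asking the file system what ".." is. A minimal sketch of that idea follows; it is not ksh's actual code, `logical_path` is a name made up here, and it handles absolute paths only.

```c
#include <stddef.h>
#include <string.h>

/* Textually normalize an absolute path: drop "." and empty components,
 * and let ".." remove the previous component. There is no file system
 * access, so symbolic links are never consulted. Returns 0 on success,
 * -1 if the path is not absolute or the result does not fit in out. */
int logical_path(const char *path, char *out, size_t outsz)
{
    size_t len = 0;               /* length of the result so far */

    if (path == NULL || path[0] != '/' || outsz < 2)
        return -1;

    while (*path != '\0') {
        const char *end;
        size_t n;

        while (*path == '/')      /* skip separators */
            path++;
        end = path;
        while (*end != '\0' && *end != '/')
            end++;
        n = (size_t)(end - path);

        if (n == 0 || (n == 1 && path[0] == '.')) {
            /* nothing to add */
        } else if (n == 2 && path[0] == '.' && path[1] == '.') {
            while (len > 0 && out[--len] != '/')
                ;                 /* drop the last component */
        } else {
            if (len + 1 + n + 1 > outsz)
                return -1;
            out[len++] = '/';
            memcpy(out + len, path, n);
            len += n;
        }
        path = end;
    }
    if (len == 0)
        out[len++] = '/';         /* everything cancelled out: root */
    out[len] = '\0';
    return 0;
}
```

Given "/foo/bar/.." this yields "/foo" whether or not bar is a symbolic link, which is the rule /foo/bar/.. == /foo stated above. It only helps for paths the shell itself gets to rewrite.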

>Your
>proposal certainly fixes this behaviour. There seem to be other ramifications
>of the scheme that far outweigh the proposed advantages, as has
>been pointed out by numerous examples.

I think the advantages clearly far outweigh these ramifications.
For every example you come up with, I can show you a counter-example which
is as simple as yours, in which BSD semantics breaks.

>I am not wholly against your ideas -- "cd .." really irks me on occasion.
>However, jumping in and changing the kernel may not be the best solution.

"cd .." can be "fixed" in the shell (see above). "ls .." can't, and I
believe "ls .." should be equivalent to "cd ..; ls"
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

allbery@ncoast.UUCP (Brandon Allbery) (09/10/87)

As quoted from <1119@bsu-cs.UUCP> by dhesi@bsu-cs.UUCP (Rahul Dhesi):
+---------------
| There is no reason why the value of SYS could not be relative.
| 
|      $ setlink ABC /usr/include/sys
|      $ setlink EFG ../jkl
|      $ setlink XYZ alpha
|      $ cd $ABC/$EFG/$XYZ
|      $ pwd
|      /usr/include/jkl/alpha
|      $
| 
| ABC, EFG, and XYZ look like environment variables, but they are known
| to the kernel and accessed via a hash table, not a sequential search.
| The hash table itself is accessible via a special entry in the normal
| environment so that, for example, a library function can look for the
| environment variable LINKTABLE and get some value that will let it
| access the hash table directly, so the kernel need not be involved in
| all accesses, only those that are needed in system calls.
+---------------

Anyone for TOPS-10/TOPS-20/VMS "DEFINE"?  Logical names have been around for
a long time; that they aren't in UNIX may indicate that they aren't the
panacea you think.  (Anyone know for sure?  DMR?)

+---------------
| Pick one:
| 
|      [ ] symbolic links available around 1983, that largely work
| 	 and are useful, though they lead to some confusing 
| 	 situations because the kernel and the user may interpret
| 	 ".." differently; may be up to 255 characters long
| 
|      [ ] symbolic links designed carefully to lead to no confusion;
| 	 approved after much consideration by four layers of
| 	 bureaucracy; not currently available, and probably won't be 
| 	 until 1989; will probably be limited to 14 characters
+---------------

Having had to (attempt to) repair disasters caused by the first, I'll opt for
the second.  Nobody said they had to _stay_ limited to 14 characters.
-- 
	    Brandon S. Allbery, moderator of comp.sources.misc
  {{harvard,mit-eddie}!necntc,well!hoptoad,sun!mandrill!hal}!ncoast!allbery
ARPA: necntc!ncoast!allbery@harvard.harvard.edu  Fido: 157/502  MCI: BALLBERY
   <<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>
All opinions in this message are random characters produced when my cat jumped
(-:		      up onto the keyboard of my PC.			   :-)

dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/10/87)

I proposed a symbolic link that could be used like this:

          $ cd $ABC/$EFG/$XYZ

and explained:
     ...ABC, EFG, and XYZ look like environment variables, but they are
     known to the kernel and accessed via a hash table, not a sequential
     search.

In article <4494@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
>
>Anyone for TOPS-10/TOPS-20/VMS "DEFINE"?  Logical names have been around for
>a long time; that they aren't in UNIX may indicate that they aren't the
>panacea you think.  (Anyone know for sure?  DMR?)

There are serious problems with logical names as implemented in these
systems.  For example, if ABC:, EFG:, and XYZ: are logical names, VMS
will not accept a string like "ABC:/EFG:/XYZ" or "ABC:EFG:XYZ:".
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

mwp@munnari.oz (Michael W. Paddon) (09/11/87)

in article <2934@ulysses.homer.nj.att.com>, ekrell@hector..UUCP (Eduardo Krell) says:
 
> What are you talking about? How can you tell if a file is a symbolic link
> or not under BSD? That's right, lstat(). There's no other way!
> The amount of code it would take (or the user-level command) to tell
> the difference between a plain file/directory and a symbolic link is
> under our implementation EXACTLY THE SAME AS IN BSD. Period.

The point of my example was a simple everyday situation. You cd to a
directory and find it empty. You then cd .. and rmdir the said directory.
This will always work under BSD -- your system will sometimes have the user
trying to rmdir a symlink. The user doesn't need to use lstat under BSD.

> And I guess I can now dismiss your examples as meaningless with the idea
> that /foo/bar/.. == /foo ?

This seems to be the essence of the problem. People want the best of both
schemes. I don't particularly like the solution of a per-process flag.

> 
>>It seems to me that you are suggesting a change to the kernel because
>>the semantics of "cd .." don't please you under the BSD system.
> 
> I made this clear a while ago: I use ksh on BSD, System V and all other
> Unix systems we run. ksh treats "cd .." the way we propose to. I don't
> need to "fix" the "cd .." problem, it's already taken care of by ksh.

OK then, the semantics of "ls .." are what's really annoying you!
It doesn't change my observations.

> I think the advantages clearly far outweigh these ramifications.
> For every example you come up with, I can show you a counter-example which
> is as simple as yours, in which BSD semantics breaks.
 
This appears to make each scheme equally flawed! As I stated previously,
now might be the time to be radical and attempt to redesign the UNIX file
system from scratch.

> "cd .." can be "fixed" in the shell (see above). "ls .." can't, and I
> believe "ls .." should be equivalent to "cd ..; ls"

Any occurrence of ".." in a shell command may be fixed at the shell level to
the way you want things to work. Obviously.

The question is should the kernel be changed? Most of your arguments for the
changes have hinged on user level examples. Is it desirable to force these
changes on the kernel as well? Until you can *prove* that necessity it is
probably an unwise move given the support for the existing scheme.
Remember that if you hack your shell, you can't possibly break existing code.


							mwp
							===

PS. Everyone else on the network is probably getting *very* tired of this
subject by now. I am, however, genuinely interested in continuing this discussion
by mail.

===========================
UUCP:	{seismo,mcvax,ukc,ubc-vision}!munnari!mwp
ARPA:	mwp%munnari.oz@seismo.css.gov
CSNET:	mwp%munnari.oz@australia

brett@wjvax.UUCP (Brett Galloway) (09/11/87)

I have listened to (read) this discussion for quite a while, and one thing
that has bothered me with the anti-bsd-symlinks position is one of
consistency.  Making `..' return along the path that was originally
traversed is fine, as long as that can be unambiguously determined.  In
the context of the cd command (chdir(2)), it can be.  However, there are many
other cases where the cd command is not used, but similar behaviour is
expected.  For example, the path

	/u/foo/bar/../file

could reasonably be expected to reference /u/foo/file.  However, if the
path state information is bound only to the cd command, this won't work.
In order to be consistent, *two* sets of state information must be retained.
One for the cd command (to apply across *different* system calls), and one
for path resolution, to be used *within* a system call.

Are, in fact, both of these proposed?
-- 
-------------
Brett Galloway
{pesnta,twg,ios,qubix,turtlevax,tymix,vecpyr,certes,isi}!wjvax!brett

allbery@ncoast.UUCP (Brandon Allbery) (09/14/87)

As quoted from <1128@bsu-cs.UUCP> by dhesi@bsu-cs.UUCP (Rahul Dhesi):
+---------------
| In article <4494@ncoast.UUCP> allbery@ncoast.UUCP (Brandon Allbery) writes:
| >Anyone for TOPS-10/TOPS-20/VMS "DEFINE"?  Logical names have been around for
| >a long time; that they aren't in UNIX may indicate that they aren't the
| >panacea you think.  (Anyone know for sure?  DMR?)
| 
| There are serious problems with logical names as implemented in these
| systems.  For example, if ABC:, EFG:, and XYZ: are logical names, VMS
| will not accept a string like "ABC:/EFG:/XYZ" or "ABC:EFG:XYZ:".
+---------------

I never said they had to have DEC's limitations; after all, this is UNIX.
-- 
	    Brandon S. Allbery, moderator of comp.sources.misc
  {{harvard,mit-eddie}!necntc,well!hoptoad,sun!mandrill!hal}!ncoast!allbery
ARPA: necntc!ncoast!allbery@harvard.harvard.edu  Fido: 157/502  MCI: BALLBERY
   <<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>
All opinions in this message are random characters produced when my cat jumped
(-:		      up onto the keyboard of my PC.			   :-)

snoopy@doghouse.gwd.tek.com (Snoopy) (09/15/87)

In article <2927@ulysses.homer.nj.att.com> ekrell@hector (Eduardo Krell) writes:
>In article <9091@tekecs.TEK.COM> snoopy@doghouse.gwd.tek.com (Snoopy) writes:

>>Why is -I the "right" way and #define [s.b. #ifdef] the "wrong" way?

>Because it doesn't require the "#ifdef KERNEL" kludge 

It may be a kludge, it may also be necessary at times.  What if you
really want/need to have a #include depend on a define?  In the KERNEL
case, -I is easy, since the non-kernel case is typically a standalone
utility which has a separate makefile (or at *least* a separate target).
Doing if/then/else in a makefile isn't as easy (assuming it's even
possible) as doing it in C.  Having a separate target is a worse kludge
than the #ifdef you were trying to avoid in the first place.

>Why, you ask? Because if you have Reisser's cpp (as most do), then the
>rules for searching "" files are different from the ones in K&R.

Sounds like it's Reisser's cpp that is broken.  (Seems like I've heard
that before...)

We're getting a bit off the track gang, time for:

ln -s discussion link/symbolic/bsd.vs.korn

Snoopy
tektronix!doghouse.gwd!snoopy
snoopy@doghouse.gwd.tek.com

ekrell@hector.UUCP (09/18/87)

In article <1014@wjvax.wjvax.UUCP> brett@wjvax.UUCP (Brett Galloway) writes:

>I have listened to (read) this discussion for quite a while, and one thing
>that has bothered me with the anti-bsd-symlinks position is one of
>consistency.  Making `..' return along the path that was originally
>traversed is fine, as long as that can be unambiguously determined.  In
>the context of the cd command (chdir(2)), it can be.  However, there are many
>other cases where the cd command is not used, but similar behaviour is
>expected.  For example, the path
>
>	/u/foo/bar/../file
>
>could reasonably be expected to reference /u/foo/file.  However, if the
>path state information is bound only to the cd command, this won't work.
>In order to be consistent, *two* sets of state information must be retained.
>One for the cd command (to apply across *different* system calls), and one
>for path resolution, to be used *within* a system call.
>
>Are, in fact, both of these proposed?

Yes, but they're really the same thing.

The path state information has nothing to do with the "cd" command. There's
a notion of a "current working directory" in the kernel which is kept on a per-process
level in the u block. Each time a chdir() is done, this current working directory
is changed. This is what "cd" uses. The kernel also needs to know this current
working directory to resolve relative pathnames like "foo/bar/../file".
There's also a notion of a user's "root directory" which is needed because of
chroot(2) (the meaning of "/" is context sensitive).

The proposed change is handled within namei(), the kernel function which
resolves a file name like "/u/foo/bar/../file" into the actual place in the
disk where the file is (device, inode pair). namei() is used for both "cd"
(which uses chdir() which calls namei()), and opening or stating a file,
(which call namei() internally).
    
    Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

    {ihnp4,seismo,ucbvax}!ulysses!ekrell

mangler@cit-vax.Caltech.Edu (System Mangler) (09/19/87)

In article <4477@ncoast.UUCP>, allbery@ncoast.UUCP (Brandon Allbery) writes:
> (1) Generic links.  These would be special files containing, not pathnames,
>     but (device, inode) pairs.

Putting device numbers into a zillion special files makes it exceedingly
painful to move a filesystem to a different device, and isn't likely to
work at all for a network filesystem.  Ditto for inode numbers.
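
For readers unfamiliar with the pair being discussed: it is what stat() reports in st_dev and st_ino, and it identifies an object only on the running system as currently configured. A minimal sketch, assuming a POSIX system; `same_object` is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>

/* Returns 1 if both paths resolve to the same (device, inode) pair,
 * 0 if they resolve to different objects, -1 if either stat() fails. */
int same_object(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_object("/", "/..") is 1, since the root directory is its own parent. The st_dev half of the pair is tied to the device the file system currently lives on, which is the fragility described above: a pair written into a file goes stale once the file system is moved.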

Don Speck   speck@vlsi.caltech.edu  {amdahl,rutgers}!cit-vax!speck

mouse@mcgill-vision.UUCP (09/20/87)

In article <3711@elecvax.eecs.unsw.oz>, neilb@elecvax.eecs.unsw.oz (Neil F. Brown) writes:
> It occurs to me that there is a very simple way to avoid the problems
> with symbolic links.  It is to not have symbolic links to
> directories.

Back when what we now call hard links were the only kind of link, they
were allowed to refer to directories.  Now, they technically still are,
but hard links to directories may be created only by root, and are
strongly deprecated in any case.  These wonderful new things, symbolic
links, were brought around to be used when you wanted a link to a
directory.

> [this can be done.  just have namei() fail.]
> However, this restriction is not really needed, just have ln complain
> if the target is a directory (or if the target doesn't exist??),

Not sufficient.

% mv /foo/bar /foo/bar-
% touch /foo/bar
% ln -s /foo/bar /baz
% rm /foo/bar
% mv /foo/bar- /foo/bar
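
The sequence works because a symbolic link stores only a name, and nothing re-examines that name after the link is made. The same point can be made from C. This is a minimal sketch, assuming a POSIX system with a writable /tmp; `retarget_demo` and the names `bar` and `baz` inside its temporary directory are chosen for illustration.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if a link made to a nonexistent name later resolves to a
 * directory created under that name, 0 if not, -1 on setup failure. */
int retarget_demo(void)
{
    char base[] = "/tmp/symdemoXXXXXX";
    char target[64], link_path[64];
    struct stat sb;
    int ok = 0;

    if (mkdtemp(base) == NULL)
        return -1;
    snprintf(target, sizeof target, "%s/bar", base);
    snprintf(link_path, sizeof link_path, "%s/baz", base);

    if (/* the link is made while "bar" does not exist at all */
        symlink(target, link_path) == 0 &&
        lstat(link_path, &sb) == 0 && S_ISLNK(sb.st_mode) &&
        stat(link_path, &sb) == -1 && errno == ENOENT &&
        /* later a directory appears under the same name */
        mkdir(target, 0700) == 0 &&
        stat(link_path, &sb) == 0 && S_ISDIR(sb.st_mode))
        ok = 1;

    rmdir(target);
    unlink(link_path);
    rmdir(base);
    return ok;
}
```

A check made at link-creation time, whether in ln or in the system call, sees only what the name refers to at that moment.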

> "But", you say, "We LIKE sym links to directories, we need them, life
> was so bland and colourless before we found them."

> I think not. Consider the apparent `uses' for symlinks to
> directories.

> 1/ The most commonly mentioned directory symlink seems to be
> 	/usr/include/sys -> /usr/sys/h
>    This, to me, is a bad thing. The kernel include files should be
>    in ONE place and one place only.

This is why the symlink: rather than having two copies, one on /sys/h
and the other on /usr/include/sys, we have just one, with the other
pointing to it.  The files need to be present in /usr/include/sys so
they can be found with <sys/foo.h> and they need to appear under /sys
so that all kernel source is logically together.  (You seem to believe
they should be in /sys/h so that all of /sys is on the same filesystem.
Irrelevant, say I.  Who cares what disk /sys/h is on in relation to
/sys/sys or /sys/whatever?  What matters is that it all appears under
/sys, where all the rest of the kernel source is.)

>    (Further, [/usr/include] should be the home directory of user `i'
>    so users only need use ~i.)

Where in the world did you come up with that?  Really now, how often do
you want to access /usr/include from the cshell?

> 3/ I remember long ago (1 month?) someone suggested symlinks were
>    terribly useful for moving around a large tree.  Link interesting
>    directories to X and then put X in your cdpath so interesting
>    places can be found quickly from anywhere.
>    I have a better solution, set up some shell variables
>    S=/somewhere/sources B=/overthere/binaries X=/my/favourite/place
>    then just use cd $X/whatever.

Go fly a kite.  The only sense in which this is "better" is that it
allows you to get rid of symlinks to directories.  From my point of
view, it is actually worse because it requires more typing.

> 4/ [...]
> 5/ Vast numbers of other apparent uses that are simply "wrong" or
>    could better be fixed by [other means]

Example: I have two source directories for some program, one for our
VAXen and one for our Suns.  This is because (a) this way there is no
confusion of .o files and (b) the sources are slightly different.  One
of these has most of the source files symlinked to the other, but
that's irrelevant - those are just links to files.  However, I find
that when working on the VAX version I often want to look at some file
in the Sun directory and vice versa.  So in the VAX source directory I
create a symlink "sun" pointing to the Sun source directory and in the
Sun directory I create a symlink "vax" pointing to the VAX source
directory.  This way I can just visit sun/sun-foo.c when working on the
VAX side, and similarly when working on the Sun version.

What would you say is the "right" way to do this?  It won't work to
create environment variables "sun" and "vax" and visit $sun/sun-foo.c,
even assuming my editor groks environment variables, because that means
I have to change the environment every time I start working on another
program - the symlinks for programA don't bother me when I'm messing
with programB.  We can't use your generalized mount because that would
make each machine lose its own directory tree.  In fact, we could never
complete both mounts at once!
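
A minimal sketch of the cross-linked layout from the VAX/Sun example, in
Python.  The directory names (prog-vax, prog-sun) and the scratch location
are hypothetical; only the link names "sun" and "vax" and the file name
sun-foo.c come from the text above.

```python
import os
import tempfile

def make_cross_links(base):
    """Create two source directories that each hold a link to the other."""
    vax = os.path.join(base, "prog-vax")   # hypothetical directory names
    sun = os.path.join(base, "prog-sun")
    os.mkdir(vax)
    os.mkdir(sun)
    # Relative targets keep both links valid if `base` itself is moved.
    os.symlink(os.path.join("..", "prog-sun"), os.path.join(vax, "sun"))
    os.symlink(os.path.join("..", "prog-vax"), os.path.join(sun, "vax"))
    return vax, sun

base = tempfile.mkdtemp()
vax_dir, sun_dir = make_cross_links(base)
with open(os.path.join(sun_dir, "sun-foo.c"), "w") as f:
    f.write("/* Sun-specific source */\n")

# From the VAX side, the Sun file is reachable under a short name.
with open(os.path.join(vax_dir, "sun", "sun-foo.c")) as f:
    via_link = f.read()
```

The links live inside one program's directories, so nothing has to be set
or reset when switching to another program.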

Example: I am working on a program which is kept in its own directory
somewhere.  This program meshes with some kernel support, which I am
also working on.  So in the program source directory I make a symlink
"sys" pointing to the directory with the kernel portions, so I can
access the kernel pieces with short, easy-to-type names.  Same problems
as above.

Example: /usr/doc/ps1/13.rcs/man is a symlink pointing to
/usr/src/new/rcs/man.  Clearly the contents of this directory (yes, it
is a directory) should appear both places - you can't weasel out the
way you tried to with /usr/include/sys.  What do we do?  This is also
true of several other documentation directories.

Example: /sys/machine is a symlink to "vax" or "sun3" or whatever, as
appropriate for the machine in question.  What should we replace this
with?

Example: We use NFS to export filesystems around.  We mount machine foo
on /@foo.  For uniformity (pathnames can use "/@machine/..."
regardless), machine foo itself has a symlink "/@foo" pointing to ".".
What do we replace this with?
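
The "/@foo" trick can be tried out without NFS.  In this sketch a scratch
directory stands in for the root of machine foo, and the file etc/motd is
a made-up example; the point is only that a link to "." makes the uniform
name and the local name reach the same file.

```python
import os
import tempfile

root = tempfile.mkdtemp()            # stands in for "/" on machine foo
os.mkdir(os.path.join(root, "etc"))
with open(os.path.join(root, "etc", "motd"), "w") as f:
    f.write("hello from foo\n")

# On foo itself, "@foo" is just a link back to the directory it lives in.
os.symlink(".", os.path.join(root, "@foo"))

local_name = os.path.join(root, "etc", "motd")
uniform_name = os.path.join(root, "@foo", "etc", "motd")
same = os.path.samefile(local_name, uniform_name)
```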

Enough examples.  Onward....

> The generalised mount.

>    The mount function links inode I of block device B to inode 1 (the

Inode 2.  From <sys/fs.h>, or <ufs/fs.h> if you have Sun's NFS in your
kernel:

 * The root inode is the root of the file system.
 * Inode 0 can't be used for normal purposes and
 * historically bad blocks were linked to inode 1,
 * thus the root inode is 2.

>    root) of block device b and says any access to (B,I) is really an
>    access to (b,1) and an attempt to access .. from (b,1) is really an
>    access of .. from (B,I).

>    A generalised mount would link (B,I) to (b,i) and say that access
>    to (B,I) yields (b,i) and an access of (b,i) yields (B,I), with
>    appropriate handling of . and .. (yes, .. IS something special).
>    This way we can effectively swap any two subtrees of any two file
>    systems [which is a Good Thing].

>    This overcomes the /usr/include/sys problem,

No, because then the include files disappear from /sys/h!

>    cleans up the mount command,

Symmetric mounts make networked filesystems nearly useless.  Imagine:
machine foo has something, say /usr/man, that machine bar wants to
access.  So machine bar mounts:

% mount foo:/usr/man /usr/man

Now all of a sudden foo has lost its own /usr/man because it got
replaced with bar's /usr/man, which is empty - after all, that's why
bar wants to import foo:/usr/man.

					der Mouse

				(mouse@mcgill-vision.uucp)

rbj@icst-cmr.arpa (Root Boy Jim) (09/22/87)

   From: Eduardo Krell <ekrell@hector..uucp>
   > is Chris Torek

   >If you wish to treat all path names as strings before attempting to
   >apply them to the file system itself, and resolve `..' as `up one
   >level'

   But isn't this EXACTLY what's done when the ".." is at a mount point?
   The .. entry at the root of the mounted file system points to the root
   directory of the file system (i-node 2), yet when you "cd ..", you get
   to a different place.

I'm reluctant to prolong this discussion, but this is one point I haven't
seen addressed so far. I suppose it all depends on your viewpoint.

When you are in the root directory of a mounted file system, you are
really in *two* directories: the leaf on the mounted-on file system and
the root of the mounted file system. The kernel interprets all path names
but `..' relative to the mounted file system, and interprets `..' relative
to the leaf of the mounted-on file system.

Those of you old enough to remember Version 6 will remember that
`cd /usr; cd ..' left you in /usr. Do you really want this? I think not.

       Eduardo Krell                   AT&T Bell Laboratories, Murray Hill

       {ihnp4,seismo,ucbvax}!ulysses!ekrell

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688

dave@murphy.UUCP (Dave Cornutt) (09/25/87)

In article <3728@elecvax.eecs.unsw.oz>, neilb@elecvax.eecs.unsw.oz (Neil F. Brown) writes:
> The thrust of my argument is - do you ever really want to have a directory
> in two different places? i.e. with two different absolute path names that
> don't include the well-understood (I thought) `.' and `..'.
> If you don't, then generalised mounting will solve your problems.

I have at least one case right here.  I have a program which uses include
files and object modules from another program which is rather large.  I
don't want to make copies of the files I need because (1) I don't want to
have to keep up with updates to these files, which are not under my
control, and (2) they are too large to have multiple copies lying around;
contrary to the claims of some small-system advocates, just because you
have an 800M drive to play with doesn't mean you waste space keeping
30 copies of everything around when a simple alternative -- symlinks --
is available.  So, I have one link to the include directory of this
program, and another to the object directory.  Sometimes I need to switch
to a different version of the program, to keep up with updates or just to
try an experiment -- no problem; I just move the links.  This way, I
can get access to the things without having to make copies, and without
having to impose my idea of directory structure on the person who
maintains this program.
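
"Moving the links" to another version can be sketched as follows.  The
version directories v1 and v2, the header defs.h, and the helper repoint()
are all hypothetical; the helper builds the new link under a scratch name
and renames it over the old one, which is one reasonable way to do the
switch, not necessarily how it was done here.

```python
import os
import tempfile

def repoint(link, new_target):
    """Make `link` point at `new_target`, replacing any existing link."""
    tmp = link + ".tmp"              # hypothetical scratch name
    os.symlink(new_target, tmp)
    os.replace(tmp, link)            # rename: swaps the link in one step

base = tempfile.mkdtemp()
for version in ("v1", "v2"):
    inc = os.path.join(base, version, "include")
    os.makedirs(inc)
    with open(os.path.join(inc, "defs.h"), "w") as f:
        f.write('#define VERSION "%s"\n' % version)

link = os.path.join(base, "include")

repoint(link, os.path.join("v1", "include"))
with open(os.path.join(link, "defs.h")) as f:
    before = f.read()

repoint(link, os.path.join("v2", "include"))
with open(os.path.join(link, "defs.h")) as f:
    after = f.read()
```

Nothing under v1 or v2 is copied or rearranged; only the one link changes.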

Occasionally I cd through one of these links to have a look around.  Funny,
but I never really thought about "cd .." taking me back to my directory,
and I'm not sure that I'd even want it to do that.  Reason is, these
directories that my links point to are in the middle of a hierarchy, and
sometimes I want to go look at other places in that structure, so I expect
"cd .." to take me up a level in that hierarchy, not back to my directory.
So, how *do* I get back to my directory?  Well, pushd/popd solves the
problem for me -- a shell mechanism, with no kernel trying to do me
favors that, in this case, I don't want.  It reminds me of the story
about the Boy Scout who was determined to help a little old lady across
the street, even if he had to drag her, kicking and screaming.
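
The two readings of ".." being argued over can be compared directly.  In
this sketch the tree (project/include, mine/inc) is hypothetical.
os.path.realpath follows the link and then goes up, as the kernel does;
os.path.normpath treats the name as a string and cancels "inc/..", which
is the "up one level" reading.

```python
import os
import tempfile

base = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(base, "project", "include"))  # someone else's tree
os.mkdir(os.path.join(base, "mine"))                   # my own directory
link = os.path.join(base, "mine", "inc")
os.symlink(os.path.join(base, "project", "include"), link)

path = os.path.join(link, "..")
physical = os.path.realpath(path)   # follow the link, then take its parent
logical = os.path.normpath(path)    # edit the string: drop "inc/.."
```

Here physical ends in "project" and logical ends in "mine"; opening the
path through the file system gives the physical answer.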

> There are two sub-cases for this question:
> 1/ Public directories,
> 2/ Private directories.

I think the above was a useful example of (2).  I can think of other
examples that might fit either category: (a) a third-party program that you
don't have the source to that has a pathname hard-coded into it (we have
one here), (b) moving critical software packages to another file system
temporarily if you lose a drive or to make space for something else, and
(c) relocating parts of the system that everyone "knows" are supposed to
be in a certain place.  The best example of this is the /usr-/pub division
on diskless-machine servers, which may be serving several types of machines.
We have another example on our system: vi and the vi recovery utility
expect the path to the preserve directory to be /usr/spool/preserve.
We don't have enough space on /usr for it (some of our users edit files
of >2M), and we don't have a disk partition available to mount it on.
No problem; we just move it to the /usr/spool filesystem and leave a
symlink in /usr, and vi is none the wiser.
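
The relocate-and-leave-a-link move can be sketched like this.  The paths
are hypothetical stand-ins under a scratch directory rather than the real
/usr tree, and saved.txt is a made-up file.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
old = os.path.join(base, "usr", "preserve")     # path the program insists on
new = os.path.join(base, "spool", "preserve")   # where the space really is
os.makedirs(old)
os.makedirs(os.path.dirname(new))
with open(os.path.join(old, "saved.txt"), "w") as f:
    f.write("recovered text\n")

shutil.move(old, new)     # relocate the directory and its contents
os.symlink(new, old)      # leave a forwarding link under the old name

# A program with the old path hard-coded still finds its files.
with open(os.path.join(old, "saved.txt")) as f:
    contents = f.read()
```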

To sum up, symlinks are a fairly simple way of doing things that you can't
easily do otherwise, and they cause few security complications, so I can't
understand why some people are so anxious to throw them away.  I'd much
rather have a system where I can move things around without upsetting
programs (and people) than a system that conforms to some ideological
filesystem purity where I can't move things around.  I've found that
most users can readily grasp the concept of symlinks if someone who has
both knowledge and patience just sits down with them for a few minutes
and explains it to them.  I remember working on an old Data General RDOS
system which had symlinks and being impressed with what a
simple-but-powerful concept it was, and how frustrated I got back in my V7 days
because it didn't have symlinks.  Yes, you can create spaghetti file
systems with them, but you can also write spaghetti code in C, and I
don't see anyone screaming for a major overhaul of C just because a
few poor programmers can't deal with it.  After all, flexibility and
customization are the name of the game in UNIX.

I think the generalized-mount idea is worth pursuing (didn't V6 have
something like this?), but not as a replacement for symlinks.  I also
like David Korn's idea of the retained cd path for handling symlinks
to directories, but I want to be able to turn it on and off; I don't
want to be browbeaten into using it.  I'm pretty sure that our customers
would not accept changes to or elimination of symlinks being shoved down
their throats.
---
"I dare you to play this record" -- Ebn-Ozn

Dave Cornutt, Gould Computer Systems, Ft. Lauderdale, FL
[Ignore header, mail to these addresses]
UUCP:  ...!{sun,pur-ee,brl-bmd,seismo,bcopen,rb-dc1}!gould!dcornutt
 or ...!{ucf-cs,allegra,codas,hcx1}!novavax!gould!dcornutt
ARPA: dcornutt@gswd-vms.arpa

"The opinions expressed herein are not necessarily those of my employer,
not necessarily mine, and probably not necessary."