[comp.std.unix] Names vs. fds -- it's a floor wax *and* a dessert topping

chip@tct.uucp (Chip Salzenberg) (10/16/90)

Submitted-by: chip@tct.uucp (Chip Salzenberg)


According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
>You are making an unwarranted assumption here: that the shell *has* to
>handle all types of fd creation. It's convenient, of course, but by no
>means necessary.
>My TCP connectors, for example, are implemented outside the shell.

Yes, the shell can be let off the hook.  The point I was making,
however, is still valid.  Without a unified namespace, new types of
objects require *some* program to be written and/or modified.  And
such programming isn't always available where it would be needed.

>> In Dan's hypothetical fd-centric UNIX, we would have to either
>> (1) pass such filenames to the shell for interpretation, thus incurring
>> a possibly substantial performance hit; or (2) modify each program to
>> understand all the names the shell would understand.
>
>On the contrary. syslog is a counterexample.

If syslog is your best example of a zero-programming fd-centric
service, then your position is mighty weak.  A program that uses a
syslog-style service must make a call to one or more service-specific
subroutines.  Thus, if a new server is brought on line, program
modification will be required before the new server can be used.

And, of course, there is the vast number of programs that already
exist and use open() exclusively.  Perhaps academics can blow off an
installed base, but we commercial money-grubbers can't afford the
luxury of modifying everything we've ever written -- even just once.

>... [syslog] shows that an fd-centric model works ...

Actually, it shows that fd's *with* a unified namespace are useful.
How, pray tell, do you think openlog() gets its fd?  Via the *name*
"/dev/log".  Syslog depends on the unified namespace (such as it is).

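To make the dependence on the name concrete, here is a minimal sketch
of how a library routine might turn a pathname such as "/dev/log" into
a descriptor.  It assumes the name refers to a UNIX-domain datagram
socket, as on 4.3BSD-derived systems; other systems may use a different
mechanism.  The helpers name_to_addr() and open_by_name() are
hypothetical names invented for this illustration, not part of any
syslog library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill in a UNIX-domain socket address from a pathname.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int name_to_addr(const char *path, struct sockaddr_un *addr)
{
    assert(path != NULL && addr != NULL);
    if (strlen(path) >= sizeof addr->sun_path)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Hypothetical helper: turn a socket pathname into a connected fd.
 * Assumption: the server listens on a datagram socket at that name.
 * Returns the descriptor, or -1 on any failure. */
int open_by_name(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (name_to_addr(path, &addr) < 0)
        return -1;
    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would write fd = open_by_name("/dev/log") and then write() to
the result.  The only thing tying the client to the server is the
pathname.
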
>(1) you do not need to invoke the shell or any other process, and you do
>not need to incur a performance hit;

Granted.

>(2) you do not need to modify each program to understand everything
>that the syslogd program can.  Syslog has proven quite viable.

True ... once the program uses syslog() or an equivalent service.
But the binaries out there in the world don't.

>Provided that there is a message-passing facility available, and
>provided that it has sufficient power to pass file descriptors (which is
>true both under BSD's UNIX-domain sockets and under System V's streams),
>the syslog model will generalize to any I/O mechanism without loss of
>efficiency.
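
For readers unfamiliar with descriptor passing, the following is a
rough sketch of the mechanism over a UNIX-domain socket.  It uses the
SCM_RIGHTS ancillary-data interface found on current POSIX systems,
which is an assumption on my part; the BSD systems under discussion
exposed the same facility through a different msghdr layout.  The
names send_fd() and recv_fd() are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd_to_send across the UNIX-domain socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd_to_send)
{
    char byte = 'F';            /* one byte of ordinary data rides along */
    struct iovec iov;
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cm;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cm = CMSG_FIRSTHDR(&msg);
    assert(cm != NULL);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov;
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

A server that opens objects on behalf of clients would call send_fd()
with the freshly opened descriptor, and the client would pick it up
with recv_fd().  Note that the client still has to reach the server
somehow before any of this can happen.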

Aha!  So the New, Improved and Expanded syslog becomes the system-wide
name-to-fd translator.  Furthermore, since new servers would require
changes to all clients, the system-wide name-to-fd translator knows
about all available object types.  I think I see the light.

But the server needs a name, if for no other reason than to provide a
library binding.  My suggestion is -- can you guess? -- "open()".

It's deja vu all over again.

>This is just as easy to do outside the kernel as inside the kernel;
>therefore it should be outside.

The server's location is irrelevant.  Its existence is not.

>A unified namespace has several great disadvantages: 1. It provides a
>competing abstraction with file descriptors ...

As Peter da Silva said: Think synergy.  Names are descriptions of
passive objects; fds are descriptions of active (open) objects.
There's no competition involved.

My idea of harmful competition is multiple abstractions for passive
objects -- pathnames, struct sockaddrs and SysV IPC keys -- and for
active objects -- fds, SysV IPC ids -- each of which has its own set
of open/read/write/close analogues.  I therefore consider both SysV
IPC and BSD sockets to be botches due to their competition with the
Unix name/fd abstraction.  (I'd have a better opinion of sockets if
the socket() call didn't exist and if connect() were named open().)

>2. It is not clear that all sensible I/O objects will fit into one
>namespace.

It's clear to me.

>3. A unified namespace has not been tested on a large scale in the
>real world, and hence is an inappropriate object of standardization
>at this time.

"Advancement by invented standards" is an oxymoron, true.  Given that
POSIX seems to be intent on inventing *something*, though, I push for
a unified namespace.  Several people have described work with Unix (or
Unix-like) systems that keep everything in one namespace.  And surely
Plan 9 counts as "prior art."
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
    "I've been cranky ever since my comp.unix.wizards was removed
         by that evil Chip Salzenberg."   -- John F. Haugh II


Volume-Number: Volume 21, Number 203

barmar@think.uucp (Barry Margolin) (10/17/90)

Submitted-by: barmar@think.uucp (Barry Margolin)

In article <13642@cs.utexas.edu> chip@tct.uucp (Chip Salzenberg) writes:
>According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
>>3. A unified namespace has not been tested on a large scale in the
>>real world, and hence is an inappropriate object of standardization
>>at this time.
>"Advancement by invented standards" is an oxymoron, true.  Given that
>POSIX seems to be intent on inventing *something*, though, I push for
>a unified namespace.  Several people have described work with Unix (or
>Unix-like) systems that keep everything in one namespace.  And surely
>Plan 9 counts as "prior art."

Multics also has a unified namespace, although I suspect its interface
would not be considered acceptable by most Unix folks (when users interface
to it, the result looks a bit JCLish).

Rather than forcing everything into the file system hierarchy, Multics's
generalized I/O system is based on dynamically linking to the procedure
that knows how to open a particular type of I/O device.  The Multics
equivalent to open() takes a character string, called an "attach
description," that looks like a command line.  The first token in the string
is transformed into the name of a subroutine, the dynamic linker is invoked
to load the subroutine, and the remaining arguments are collected into an
array of character strings that are passed as the argument list.  For
instance, to open a file the call would look something like:

	fd = attach ("file /foo/bar/baz");

I've translated from Multics and PL/I style to Unix and C.  The Multics
version also has an additional argument, to specify a label for the file
descriptor, as there are commands to list and manipulate the attachments of
the current process (note that on Multics the entire login session is a
single process).
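
The dispatch step can be sketched in C roughly as follows.  This is
only an illustration under stated assumptions: a static table stands in
for the Multics dynamic linker, the "file" module simply calls open()
read-only, the "tcp" module is a stub, and every name here
(split_description, attach_file, attach_tcp, modules, MAX_TOKENS) is
hypothetical rather than taken from Multics.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS 16

/* Split desc in place on spaces and tabs; returns the token count. */
int split_description(char *desc, char **tokens, int max_tokens)
{
    int n = 0;
    char *tok;

    assert(desc != NULL && tokens != NULL);
    for (tok = strtok(desc, " \t"); tok != NULL && n < max_tokens;
         tok = strtok(NULL, " \t"))
        tokens[n++] = tok;
    return n;
}

typedef int (*attach_fn)(int argc, char **argv);

/* "file" module: expects one argument, the pathname. */
static int attach_file(int argc, char **argv)
{
    return argc == 1 ? open(argv[0], O_RDONLY) : -1;
}

/* "tcp" module: a stub; a real one would parse the host and options. */
static int attach_tcp(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return -1;
}

/* Stand-in for the dynamic linker: module name -> subroutine. */
static const struct {
    const char *name;
    attach_fn fn;
} modules[] = {
    { "file", attach_file },
    { "tcp",  attach_tcp  },
};

/* Look up the first token as a module name and pass it the rest.
 * Returns a descriptor, or -1 on failure. */
int attach(const char *description)
{
    char copy[256];
    char *tokens[MAX_TOKENS];
    int ntok;
    size_t i;

    if (strlen(description) >= sizeof copy)
        return -1;
    strcpy(copy, description);
    ntok = split_description(copy, tokens, MAX_TOKENS);
    if (ntok == 0)
        return -1;
    for (i = 0; i < sizeof modules / sizeof modules[0]; i++)
        if (strcmp(tokens[0], modules[i].name) == 0)
            return modules[i].fn(ntok - 1, tokens + 1);
    return -1;
}
```

With dynamic linking in place of the table, adding a new device type
means installing a new subroutine; attach() itself never changes.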

As another example, to attach to a TCP stream, the rsh command might do:

	fd = attach ("tcp think.com -port shell -privileged_local_port");

On Multics, there's a separate open() that is used after attach() (and,
correspondingly, separate close() and detach()).  Attach() says what device
(physical, as in a tape drive, or virtual, as in a file) you're attaching
to, while open specifies how you're using it at any particular time (input
vs output, file name on a labeled tape, etc.).  In the case of devices such
as tape drives, this distinction allows keeping a tape drive reserved to
the process while opening and closing individual files.

The user interface to this is the "io" command, with syntax like:

	io attach stdin file /foo/bar/baz
	io open stdin -input

To fit this into Unix, you'd have to make a variant of > and < that is
followed by an attach description rather than a filename.  Actually, you
could just use pipes, e.g.

	io attach stdout tape -drive 0 | dd ... | io attach stdin tcp ...
--
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar


Volume-Number: Volume 21, Number 206