jsh@usenix.org (Jeffrey S. Haemer) (10/21/89)
From: Jeffrey S. Haemer <jsh@usenix.org>

An Update on UNIX* and C Standards Activities
September 1989

USENIX Standards Watchdog Committee
Jeffrey S. Haemer, Report Editor

IEEE 1003.4: Real-time Extensions Update

John Gertwagen <jag@laidbak> reports on the July 10-14, 1989 meeting in San Jose, California:

The P1003.4 meeting in San Jose was very busy.  The meeting focused on resolving mock-ballot objections and comments.  Despite limited resources for documenting changes, a lot of work got done.  Here's what stood out.

Shared memory

The preferred interface falls somewhere between a shared-memory-only interface and a mapped-files interface, such as AIX's mmap(), which allows files to be treated like in-core arrays.  The group's direction was to reduce the functionality to support only shared memory, so long as the resulting interfaces could be implemented as a library over mmap().

Process memory locking

The various region locks were clarified and thus simplified; the old definitions were fuzzy and non-portable.  For those who haven't seen it, there is actually a memory-residency interface (i.e., fetch and store operations that meet some metric) rather than a locking interface.  Most vendors will probably implement it as a lock, but some may want it to impose the highest memory priority in the paging system.

Inter-process communication

Members questioned whether the interface definitions could really support a broader range of requirements; they're like no others in the world today.  Having been designed to meet the real-time group's wish list, they have lots of bells and whistles -- far more than System V IPC -- but it's not clear, for example, that they are network extensible.  Discussions in these areas continue.

__________
* UNIX is a registered trademark of AT&T in the U.S. and other countries.

Events and semaphores

Members were concerned about possible overlap with other mechanisms, especially those being considered for threads.  The question is basically, "Should there be separate functions for different flavors, or a single function with multiple options?"  General sentiment (including our snitch's) seems to be for multiple functions; however, an implementation might choose to make them library interfaces to a common, more general system call.  There is, however, a significant minority opinion the other way.

Scheduling

Many balloters found process lists and related semantics confusing.  An attempt was made to re-cast the wording more strictly in terms of process behavior.

Timers

Inheritance was brought in line with existing (BSD) practice.

Outside of the mock ballot, there were two other major news items.

First, there is a movement afoot to make the .4 interfaces part of 1003.1.  They would become additional chapters and might be voted separately or in logical groups.  This would bring P1003 in line with the ISO model of a base standard plus application profiles; POSIX.4 would become the real-time profile group.  This is a non-trivial change.  Up to now, the criterion for .4 has been "the minimum necessary for real-time", coincidentally extended to support other requirements "where convenient".  That is not a good starting point for a base interface.  For example, mmap(), or something very much like it, is probably the right base for "shared storage objects", but real-time users want an interface for shared memory, not for mapped files.
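[Editor: For readers who haven't seen mmap(), the "shared memory as a library over mmap()" idea looks roughly like the sketch below.  It is illustrative only: it is written with the shm_open()/mmap() calls as POSIX eventually standardized them, which did not exist when this report was filed, and the region name is made up.

    /* Minimal sketch: a named region of shared memory, treated like an
     * in-core array, built on mmap().  Error handling is abbreviated. */
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd;
        char *p;

        fd = shm_open("/rt_region", O_CREAT | O_RDWR, 0600);
        if (fd < 0)
            return 1;
        if (ftruncate(fd, 4096) < 0)            /* size the region */
            return 1;
        p = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            return 1;
        p[0] = 1;       /* immediately visible to every process mapping
                         * the same name */
        munmap(p, 4096);
        close(fd);
        return 0;
    }

The point of the group's constraint is that an interface like this can be supplied as a library over mmap(), without requiring a separate kernel mechanism.]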
Our snitch worries that things might look a bit different had the group been working on a base standard from the beginning.

Second, the committee officially began work on a threads interface, forming a threads small group and creating a stub chapter in the .4 draft.  A working proposal for the interface, representing the consensus direction of the working group, will be an appendix to the next draft.

A lot of work remains to be done before .4 can go to ballot and the current January '90 target may not be realistic.  If the proposed re-organization occurs, a ballot before the summer of 1990 seems unlikely.

Volume-Number: Volume 17, Number 40
jsh@usenix.org (Jeffrey S. Haemer) (12/07/89)
From: Jeffrey S. Haemer <jsh@usenix.org>

An Update on UNIX* and C Standards Activities
December 1989

USENIX Standards Watchdog Committee
Jeffrey S. Haemer, Report Editor

IEEE 1003.4: Real-time Extensions Update

John Gertwagen <jag@laidbak> reports on the November 13-17, 1989 meeting in Milpitas, CA:

Background

The P1003.4 Real-time Working Group began as the /usr/group technical committee on real-time extensions.  About two years ago, it was chartered by the IEEE to develop minimum extensions to POSIX to support real-time applications.  Over time its scope has expanded, and P1003.4 is now more a set of general interfaces that extend P1003.1 than a specifically real-time standard.  Its current work is intended to support not only real-time, but also database, transaction processing, Ada runtime, and networking environments.  The group is trying to stay consistent with both the rest of POSIX and other common practice outside the real-time domain.

The work is moving quickly.  Though we have only been working for two years, we are now on Draft 9 of the proposed standard, and expect to go out for ballot before the end of the year.  To help keep up this aggressive schedule, P1003.4 made only a token appearance at the official P1003 meeting in Brussels.  The goal of the Milpitas meeting was to get the draft standard ready for balloting.

Meeting Summary

Most of the interface proposals are now relatively mature, so there was a lot of i-dotting and t-crossing, and (fortunately) little substantive change.  The "performance metrics" sections of several interface chapters still need attention, but there has been little initiative in the group to address them, so it looks like the issues will get resolved during balloting.

__________
* UNIX is a registered trademark of AT&T in the U.S. and other countries.

The biggest piece of substantive work was a failed attempt to make the recently introduced threads proposal clean enough to get into the ballot.  The stumbling block is a controversy over how to deal with signals.  There are really two related problems: how to send traditional UNIX/POSIX signals to a multi-threaded process, and how to asynchronously interrupt a thread.

Four options have been suggested: delivering signals to a (somehow) privileged thread, per Draft 8; delivering a signal to whichever thread is unlucky enough to run next, a la Mach; delivering the signal to each thread that declares an interest; and ducking the issue by leaving signal semantics undefined.  We haven't been able to decide among the options yet; the working group is now split evenly.  About half think signal semantics should follow the principle of least surprise, and be fully extended to threads.  The other half think each signal should be delivered to a single thread, and that there should be a separate, explicit mechanism to let threads communicate with one another.

(Personally, I think the full semantics of process signals is extra baggage in the "lightweight" context of threads.  I favor delivering signals to a privileged thread -- either the "first" thread or a designated "agent" -- and providing a separate, lightweight interface for asynchronously interrupting threads.  On the other hand, I would be happy to see threads signal one another in a way that looks, syntactically and semantically, like inter-process signals.  In fact, I think the early, simpler versions of signal() look a lot like what's needed (around V6 or so).
I don't care whether the implementation of "process" and "thread" signals is the same underneath or not.  That decision should be left to the vendor.)

Directions

Draft 9 of P1003.4 is being readied for ballot as this is being written and should be distributed by mid-December.  With a little luck, balloting will be over by the Summer of '90.

Threads is the biggest area of interest in continued work.  The threads chapter will be an appendix to Draft 9, and the balloting group will be asked to comment on the threads proposal as if it were being balloted.  Unless there is a significant write-in effort, the threads interface will probably be treated as a new work item for P1003.4.  Then, if the outstanding issues can be resolved expeditiously, threads could go to ballot as early as April '90.

With the real-time interfaces defined, it looks like the next task of this group will be to create one or more Real-time Application Portability Profiles (RAPPS?) that define how to use the interfaces in important real-time application models.  Agreeing on the application models may be harder than agreeing on the interfaces, but we'll see.

Volume-Number: Volume 17, Number 92
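[Editor: Gertwagen's preferred model -- a designated "agent" thread that fields the process's signals -- is easy to picture.  The sketch below is illustrative only; it is written with pthreads calls that were standardized years after this report, and none of them existed in 1989.

    /* Minimal sketch of the "agent thread" model: every thread blocks
     * the signals of interest, and one dedicated thread waits for them
     * synchronously with sigwait(). */
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>

    static void *signal_agent(void *arg)
    {
        sigset_t *set = arg;
        int sig;

        for (;;) {
            if (sigwait(set, &sig) == 0)    /* only this thread sees them */
                printf("agent handling signal %d\n", sig);
        }
        return 0;
    }

    int main(void)
    {
        sigset_t set;
        pthread_t agent;

        sigemptyset(&set);
        sigaddset(&set, SIGINT);
        sigaddset(&set, SIGTERM);

        /* Block the signals here; threads created afterwards inherit
         * the mask, so none of them is interrupted asynchronously. */
        pthread_sigmask(SIG_BLOCK, &set, 0);
        pthread_create(&agent, 0, signal_agent, &set);

        /* ... create worker threads and do the real work ... */

        pthread_join(agent, 0);
        return 0;
    }

A separate, lighter-weight call, whatever it ends up being named, would then be used for one thread to interrupt another.]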
henry@utzoo.uucp (12/09/89)
From: henry@utzoo.uucp

>From: Jeffrey S. Haemer <jsh@usenix.org>
>[threads vs signals] In fact, I think the early, simpler versions of signal() look a lot like what's needed (around V6 or so)...

Actually, it can be simpler yet, as Waterloo's Thoth system showed.  Subject to some sort of suitable protections (perhaps including a way to ignore signals), when a thread receives a signal, it drops dead.  No signal handlers or blocking.  If you want some sort of recovery action, have another thread waiting for the first one to die: it has access to all the first thread's data, so it can do whatever recovery is appropriate.

Henry Spencer at U of Toronto Zoology
uunet!attcan!utzoo!henry   henry@zoo.toronto.edu

Volume-Number: Volume 17, Number 96
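[Editor: The Thoth model Henry describes maps naturally onto a "watcher" thread that waits for a worker to die and then cleans up.  The sketch below is illustrative only; it uses the pthreads interfaces that were standardized much later.

    /* Minimal sketch: a worker that simply exits on a fatal condition,
     * and a watcher that joins with it and performs recovery. */
    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        /* ... real work; on a fatal condition the thread just dies ... */
        pthread_exit((void *)1);
    }

    static void *watcher(void *arg)
    {
        pthread_t victim = *(pthread_t *)arg;
        void *status;

        pthread_join(victim, &status);          /* wait for it to die */
        if (status != 0)
            fprintf(stderr, "worker died; running recovery\n");
        /* The watcher shares the worker's address space, so it can
         * inspect the worker's data structures and clean up. */
        return 0;
    }

    int main(void)
    {
        pthread_t w, mon;

        pthread_create(&w, 0, worker, 0);
        pthread_create(&mon, 0, watcher, &w);
        pthread_join(mon, 0);
        return 0;
    }

There are no signal handlers and no blocking in the worker at all; the recovery policy lives entirely in the watcher.]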
jsh@usenix.org (Jeffrey S. Haemer) (08/22/90)
From: Jeffrey S. Haemer <jsh@usenix.org>

An Update on UNIX*-Related Standards Activities
August, 1990

USENIX Standards Watchdog Committee
Jeffrey S. Haemer <jsh@usenix.org>, Report Editor

IEEE 1003.4: Real-time Extensions

Rick Greer <rick@ism.isc.com> reports on the July 16-20 meeting in Danvers, Massachusetts:

Most of the action in the July dot four meeting centered around -- you guessed it -- threads.  The current threads draft (1003.4a) came very close to going to ballot.  An overwhelming majority of those present voted to send the draft to ballot, but there were enough complaints from the dot fourteen people (that's multiprocessing -- MP) about the scheduling chapter to hold it back for another three months.  Volunteers from dot fourteen have agreed to work on the scheduling sections so that the draft can go to ballot after the next meeting, in October.

Actually, dot four voted to send the draft to ballot after that meeting in any case, and the resolution was worded in such a way that a consensus would be required to prevent the draft from going to ballot in October.  Thus, the MP folks have this one final chance to clean up the stuff that's bothering them -- if it isn't fixed by October, it will have to be fixed in balloting.

Some of us in dot fourteen felt the best way to fix the thread scheduling stuff was via ballot objection anyway.  Unfortunately, the threads balloting group is now officially closed, and a number of MP people saw this as their last chance to make a contribution to the threads effort.  The members of dot fourteen weren't the only ones taken by surprise by the closure of the threads balloting group.  It seems that many felt it would be a cold day in hell before threads made it to ballot, and weren't following the progress of dot four all that closely.  [Editor: I've bet John Gertwagen a beer that threads will finish balloting before the rest of dot four.  The bet's a long way from being decided, but I still think I'll get my beer.]

Meanwhile, the ballot resolution process continues for the rest of dot four, albeit rather slowly.  There are a number of problems here, the biggest being lack of resources.  In general, people would much rather argue about threads than deal with the day-to-day grunt work associated with the IEEE standards process.  [Editor: The meeting will be in Seattle, Washington.  Go.  Be a resource.]  Many of the technical reviewers have yet to get started on their sections.

__________
* UNIX is a Registered Trademark of UNIX System Laboratories in the United States and other countries.

Nevertheless, proposed resolutions to a number of objections were presented and accepted at the Danvers meeting.  [Editor: Rick is temporarily unavailable, but Simon Patience of the OSF has kindly supplied these examples:

The resolved objections were taken from the CRB: replacing the event mechanism by ``fixed'' signals, replacing the shared memory mechanism by mmap(), and removing semaphore handles from the file system name space.

Replacing events by signals was accepted; no problem here.

Adopting mmap() got a mixed reception, partly because some people thought you had to take all of mmap(), where a subset might suffice.  The final vote on this was not to ask the reviewer to adopt mmap(), which may not satisfy the objectors.  I'd guess the balloting group will eventually hold sway here!
Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.  The reviewer was sent off to rethink the implications of this suggestion.]

We should be seeing a lot more of this in Seattle.

Similar comments apply to the real-time profile (AEP) work.  The AEP group made more progress over the last three months than the technical reviewers did, although even that (the AEP progress) was less than I'd hoped.  We're expecting our first official AEP draft in October.

Volume-Number: Volume 21, Number 50
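[Editor: The compromise being discussed -- semaphore names that look like pathnames, but handles that are not file descriptors -- is close to what POSIX.4 eventually adopted.  The sketch below is illustrative only; the calls shown were not yet standardized when this report was filed, and the semaphore name is made up.

    /* Minimal sketch: a named semaphore whose name is pathname-like,
     * but whose handle is a sem_t *, not a file descriptor. */
    #include <semaphore.h>
    #include <fcntl.h>

    int main(void)
    {
        sem_t *sem;

        /* The name need not correspond to anything in the file system. */
        sem = sem_open("/printer_lock", O_CREAT, 0600, 1);
        if (sem == SEM_FAILED)
            return 1;

        sem_wait(sem);          /* P */
        /* ... critical section ... */
        sem_post(sem);          /* V */

        sem_close(sem);
        sem_unlink("/printer_lock");
        return 0;
    }

Whether such names should be required to live in the file system name space is exactly the argument that follows.]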
peter@ficc.ferranti.com (peter da silva) (08/24/90)
From: peter@ficc.ferranti.com (peter da silva) My personal opinion is that *anything* that can go into the file system name space *should*. That's what makes UNIX UNIX... that it's all visible from the shell... --- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com Volume-Number: Volume 21, Number 57
chip@tct.uucp (Chip Salzenberg) (08/28/90)
From: chip@tct.uucp (Chip Salzenberg)

> Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.

Aargh!  Almost everyone realizes that System V IPC is a botch, largely because it doesn't live in the filesystem.  So what does IEEE do?  They take IPC out of the filesystem!

What sane reason could there be to introduce Yet Another Namespace?
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 65
sp@mysteron.osf.org (Simon Patience) (08/28/90)
From: sp@mysteron.osf.org (Simon Patience)

In article <467@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes:
>From: chip@tct.uucp (Chip Salzenberg)
>
>> Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.
>
>Aargh!  Almost everyone realizes that System V IPC is a botch, largely because it doesn't live in the filesystem.  So what does IEEE do?  They take IPC out of the filesystem!
>
>What sane reason could there be to introduce Yet Another Namespace?

The reason for semaphores not being in the file system is twofold.  First, some realtime embedded systems do not have a file system but do want semaphores.  This allows them to have semaphores without having to bring in the baggage a file system would entail.  Secondly, as far as threads, which are supposed to be lightweight, are concerned, it allows semaphores to be implemented in user space rather than forcing them into the kernel for the sake of the file system.

A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.  If you have IPC handles in the file system, then two machines whose applications are trying to communicate would also have to share at least part of their file system name space.  This is non-trivial to arrange for two machines, so you can imagine the problem of doing it for 100 (or 1000?) machines.

I am just the messenger for these views and do not necessarily hold them myself.  They were the reasons that came up during the discussion.

Simon.

Simon Patience                          Phone: (617) 621-8736
Open Software Foundation                FAX:   (617) 225-2782
11 Cambridge Center                     Email: sp@osf.org
Cambridge MA 02142                             uunet!osf.org!sp

Volume-Number: Volume 21, Number 68
Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) (08/29/90)
From: Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips)

>>>>> On 28 Aug 90 11:58:40 GMT, sp@mysteron.osf.org (Simon Patience) said:

>> Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.
>>
>Aargh!  Almost everyone realizes that System V IPC is a botch, largely because it doesn't live in the filesystem.  So what does IEEE do?  They take IPC out of the filesystem!
>
>What sane reason could there be to introduce Yet Another Namespace?

Simon> The reason for semaphores not being in the file system is twofold.  Some realtime embedded systems do not have a file system but do want semaphores...

Simon> A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.

How about adding non-file-system-based "handles" to an mmap-like interface?  (e.g. shmmap(host,porttype,portnum,addr,len,prot,flags)?)  This could allow the same interface to be used for network and non-network IPC, without the overhead of a trap for every non-network IPC transaction.

`Scuse me while I don my flame retardant suit...  :-)

#include <std/disclaimer.h>
--
Chuck Phillips  MS440
NCR Microelectronics                    Chuck.Phillips%FtCollins.NCR.com
2001 Danfield Ct.
Ft. Collins, CO. 80525                  uunet!ncrlnk!ncr-mpd!bach!chuckp

Volume-Number: Volume 21, Number 72
chip@tct.uucp (Chip Salzenberg) (08/30/90)
From: chip@tct.uucp (Chip Salzenberg)

According to sp@mysteron.osf.org (Simon Patience):
>Some realtime embedded systems do not have a file system but do want semaphores.  So this allows them to have them without having to bring in the baggage a file system would entail.

I was under the impression that POSIX was designing a portable Unix interface.  Without a filesystem, you don't have Unix, do you?  Besides, a given embedded system's library could easily emulate a baby-simple filesystem.

>Secondly, as far as threads, which are supposed to be light weight, are concerned it allows semaphores to be implemented in user space rather than forcing them into the kernel for the file system.

The desire for user-space support indicates to me that there should be some provision for non-filesystem (anonymous) IPCs that can be created and used without kernel intervention.  This need does not reduce the desirability of putting global IPCs in the filesystem.

>A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.

Filesystem entities can be used to trigger network activity by the kernel (or its stand-in), even if they do not reside on shared filesystems.
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 74
preece@urbana.mcd.mot.com (Scott E. Preece) (08/30/90)
From: preece@urbana.mcd.mot.com (Scott E. Preece)

| From: sp@mysteron.osf.org (Simon Patience)
| The reason for semaphores not being in the file system is twofold.  Some realtime embedded systems do not have a file system but do want semaphores.  So this allows them to have them without having to bring in the baggage a file system would entail.
---
I don't care whether they have something that looks like UNIX filesystem code or not, or whether they have disk drives or not, but I don't think it's unreasonable to require them to handle semaphore names as though they were in a filesystem namespace.  Otherwise you're going to end up with a collection of independent features, each minimally specified so that it can work without assuming anything else, and anyone with any sense is going to say "Yuck" and use a real operating system that provides reasonable integration and a uniform notion of, among other things, naming.
---
| ... Secondly, as far as threads, which are supposed to be light weight, are concerned it allows semaphores to be implemented in user space rather than forcing them into the kernel for the file system.
---
Eh?  I don't know what the group has proposed since the ballot, but it would seem that using a filesystem name only makes a difference when you first specify you're going to be looking at a particular semaphore, which shouldn't be a critical-path event.  After that you use a file descriptor, which I think could be handled in user space about as well as anything else.  In either case you're going to have to go to the kernel when scheduling is required (when you block or when you release the semaphore).
---
| A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.  If you have IPC handles in the file system then two machines whose applications are trying to communicate would also have to share at least part of their file system name space.  This is non-trivial to arrange for two machines, so you can imagine the problem of doing it for 100 (or 1000?) machines.
---
You're going to have to synchronize *some* namespace anyway; why shouldn't it be a piece of the filesystem namespace?  A consistent approach to naming and name resolution for ALL global objects should be one of the basic requirements for any new POSIX (or UNIX!) functionality.  We should have *one* namespace, so that we can write general tools that only need to know about one namespace.
--
scott preece
motorola/mcd urbana design center       1101 e. university, urbana, il 61801
uucp: uunet!uiucuxc!udc!preece          arpa: preece@urbana.mcd.mot.com

Volume-Number: Volume 21, Number 75
kingdon@ai.mit.edu (Jim Kingdon) (08/31/90)
From: kingdon@ai.mit.edu (Jim Kingdon) One obvious (if a little wishy-washy) solution is to not specify whether the namespaces are the same. That is, applications are required to use a valid path, and have to be prepared for things like unwritable directories, but implementations are not required to check for those things. This makes sense in light of the fact that there seems to be a general lack of consensus about which is best. Even though there is existing practice for both ways of doing things, it may be premature to standardize either behavior now. Volume-Number: Volume 21, Number 76
edj@trazadone.westford.ccur.com (Doug Jensen) (08/31/90)
From: Doug Jensen <edj@trazadone.westford.ccur.com>

1003.13 is working on real-time AEPs, including one, for small embedded real-time systems, that does not include a file system.  So the POSIX answer is yes: without the filesystem you can still have a POSIX-compliant interface.

Doug Jensen
Concurrent Computer Corp.
edj@westford.ccur.com

Volume-Number: Volume 21, Number 78
fouts@bozeman.bozeman.ingr (Martin Fouts) (09/05/90)
From: fouts@bozeman.bozeman.ingr (Martin Fouts)
>>>>> On 24 Aug 90 03:28:06 GMT, peter@ficc.ferranti.com (peter da silva) said:
peter> My personal opinion is that *anything* that can go into the file system name
peter> space *should*. That's what makes UNIX UNIX... that it's all visible from the
peter> shell...
I'm not sure which Unix you've been running for the past five or more
years, but a lot of stuff doesn't live in the file system name space
under various BSD derived systems, nor do the networking types believe
it belongs there. IMHO neither does a process handle, nor a
semaphore, and don't even talk to me about "named pipes" as an IPC
mechanism.
(Gee, I guess reasonable men might differ on what belongs in the name
space ;-)
Marty
--
Martin Fouts
UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303
Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa
Volume-Number: Volume 21, Number 83
gwyn@smoke.brl.mil (Doug Gwyn) (09/07/90)
From: Doug Gwyn <gwyn@smoke.brl.mil>

In article <488@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
>I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space under various BSD derived systems, nor do the networking types believe it belongs there.

Excuse me, but the "networking types" I talk to believe that sockets were a botch and that network connections definitely DO belong within a uniform UNIX "file" name space.  Peter was quite right to note that this is an essential feature of UNIX's design.  In fact there are UNIX implementations that do this right; 4BSD is simply not among them yet.

Volume-Number: Volume 21, Number 85
peter@ficc.ferranti.com (peter da silva) (09/07/90)
From: peter da silva <peter@ficc.ferranti.com>

In article <488@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> > My personal opinion is that *anything* that can go into the file system name space *should*.  That's what makes UNIX UNIX... that it's all visible from the shell...
> I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space under various BSD derived systems,

Yes, and there's even more stuff in System V that doesn't live in that name space.  In both cases it's *wrong*.

> nor do the networking types believe it belongs there.

Some more details on this subject would be advisable.  I'm aware that not everything *can* go in the file system name space, by the way...

> IMHO neither does a process handle, nor a semaphore, and don't even talk to me about "named pipes" as an IPC mechanism.

An active semaphore can be implemented any way you want, but it should be represented by an entry in the name space.  The same goes for process handles and so on.

Named pipes are an inadequate mechanism for much IPC, but they work quite well for many simple cases.  If you're looking at them as some sort of paragon representing the whole concept, you're sadly mistaken.

Anyway... what is it that makes "dev/win" more worthy of having an entry in "/dev" than "dev/socket"?
--
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 87
chip@tct.uucp (Chip Salzenberg) (09/07/90)
From: chip@tct.uucp (Chip Salzenberg)

According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space ...

The absence of sockets (except UNIX domain), System V IPC, etc. from the file system is, in the opinion of many, a bug.  It is a result of Unix being extended by people who do not understand Unix.

Research Unix, which is the result of continued development by the creators of Unix, did not take things out of the filesystem.  To the contrary, it put *more* things there, including processes (via the /proc pseudo-directory).

It is true that other operating systems get along without devices, IPC, etc. in their filesystems.  That's fine for them; but it's not relevant to Unix.  Unix programming has a history of relying on the filesystem to take care of things that other systems handle as special cases -- devices, for example.  The idea that devices can be files but TCP/IP sockets cannot runs counter to all Unix experience.

The reason why I continue this discussion here, in comp.std.unix, is that many Unix programmers hope that the people in the standardization committees have learned from the out-of-filesystem mistake, and will rectify it.
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 89
swart@src.dec.com (Garret Swart) (09/08/90)
From: swart@src.dec.com (Garret Swart)

I believe in putting lots of interesting stuff in the file system name space, but I don't believe that semaphores belong there.  The reason I don't want to put semaphores in the name space is the same reason I don't want to put my program variables in the name space: I want to have lots of them, I want to create and destroy them very quickly, and I want to operate on them even more quickly.  In other words, the granularity is wrong.

The purpose of a semaphore is to synchronize actions on an object.  What kinds of objects might one want to synchronize?  Generally the objects are either OS supplied, like devices or files, or user defined data structures.  The typical way of synchronizing files and devices is to use advisory locks or the "exclusive use" mode on the device.  The more difficult case, and the one for which semaphores were invented and later added to Unix, is that of synchronizing user data structures.

In Unix, user data structures may live either in a process's private memory or in a shared memory segment.  In both cases there are probably many different data structures in that memory, and many of these data structures may need to be synchronized.  For maximum concurrency the programmer may wish to synchronize each data structure with its own semaphore.  In many applications these data structures may come and go very quickly, and the expense of creating a semaphore to synchronize the data can be an important factor in the performance of the application.

It thus seems more natural to allow semaphores to be efficiently allocated along with the data that they are designed to synchronize.  That is, allow them to be allocated in a process's private address space or in a mapped shared memory segment.  A shared memory segment is a much larger grain object: creating, destroying and mapping one can be much more expensive than creating, destroying or using a semaphore, and these segments are generally important enough to the application to have sensible names.  Thus putting a shared memory segment in the name space seems reasonable.

For example, a data base library may use a shared memory segment named /usr/local/lib/dbm/personnel/bufpool to hold the buffer pool for the personnel department's data base.  The data base library would map the buffer pool into each client's address space, allowing many data base client programs to efficiently access the data base.  Each page in the buffer pool and each transaction would have its own set of semaphores used to synchronize access to the page in the pool or the state of a transaction.  Giving the buffer pool a name is no problem, but giving each semaphore a name is much more of a hassle.

[Aside: Another way of structuring such a data base library is as an RPC-style multi-threaded server.  This allows access to the data base from remote machines and allows easier solutions to the security and failure problems inherent in the shared memory approach.  However, the shared memory approach has a major performance advantage for systems that do not support ultra-fast RPCs.  Another approach is to run the library in an inner mode.  (Unix has one inner mode called the kernel, VMS has 3, Multics had many.)  This solves the security and failure problems of the shared segments, but it is generally difficult for mere mortals to write their own inner mode libraries.]
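[Editor: Swart's "allocate the semaphore with the data it protects" approach is easy to picture with the unnamed-semaphore and mmap() interfaces that were standardized later.  The sketch below is illustrative only, not his code; the segment name is made up.

    /* Minimal sketch: an unnamed semaphore embedded in the data
     * structure it protects, inside a mapped shared segment. */
    #include <semaphore.h>
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>

    struct buf {
        sem_t lock;             /* lives right next to the data */
        char  page[8192];
    };

    int main(void)
    {
        int fd;
        struct buf *b;

        fd = shm_open("/dbm_bufpool", O_CREAT | O_RDWR, 0600);
        if (fd < 0)
            return 1;
        ftruncate(fd, sizeof(struct buf));

        b = mmap(0, sizeof(struct buf), PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (b == MAP_FAILED)
            return 1;

        /* Creating the semaphore is just initializing a word in the
         * segment; only the process that creates the segment does this. */
        sem_init(&b->lock, 1, 1);       /* 1 => shared between processes */

        sem_wait(&b->lock);
        /* ... touch b->page ... */
        sem_post(&b->lock);
        return 0;
    }

The named segment carries the weight of living in a name space; the individual semaphores do not have to.]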
One other issue that may cause one to want to unify all objects in the file system -- at least at the level of using file descriptors to refer to all objects, if not going so far as to put all objects in the name space -- is the fact that single-threaded programming is much nicer if there is a single primitive that will wait for ANY event that the process may be interested in (e.g. the 4.2BSD select call).  Such a call lets one write a single-threaded program that doesn't busy-wait when it has nothing to do, but also won't block when an event of interest has occurred.  With the advent of multi-threaded programming the single multi-way wait primitive is no longer needed; instead one can create a separate thread for each event of interest, each blocking on its event and processing it.  Multi-way waiting remains an issue, though, if single-threaded programs are to get maximum use out of the facility.

I've spoken to a number of people in 1003.4 about these ideas.  I am not sure whether they played any part in the group's decision.

Just to prove that I am a pro-name-space kind of guy: I am currently working on and using an experimental file system called Echo that integrates the Internet Domain name service for access to global names; our internal higher-performance name service for highly available naming of arbitrary objects; our experimental fault-tolerant, log-based, distributed file service with read/write consistency and universal write-back for file storage; and auto-mounting NFS for accessing other systems.  Objects that are named in our name space currently include hosts, users, groups, network servers, network services (a fault-tolerant network service is generally provided by several servers), and every version of any source or object file known by our source code control system.  Some of these objects are represented in the name space as a directory with auxiliary information, mount points or files stored underneath.  This subsumes much of the use of special files like /etc/passwd, /etc/services and the like in traditional Unix.  Processes are not currently in the name space, but they will/should be.  (Just a "simple matter of programming.")

For example, /-/com/dec/src/user/swart/home/.draft/6.draft is the name of the file I am currently typing, /-/com/dec/src/user/swart/shell is a symbolic link to my shell, and /-/com/dec/prl/perle/nfs/bin/ls is the name of the "ls" program on a vanilla Ultrix machine at DEC's Paris Research Lab.  [Yes, I know we are using "/-/" as the name of the super root and not either "/../" or "//" as POSIX mandates, but those other strings are so uhhgly, and /../ is especially misleading in a system with multiple levels of super root; e.g. on my machine "cd /; pwd" types /-/com/dec/src.]

Things that we don't put in the name space are objects that are passed within or between processes by "handle" rather than by name.  For example, pipes created with the pipe(2) call need not be in the name space.  [At a further extreme, pipes for intra-process communication don't even involve calling the kernel.]  I personally don't believe in overloading file system operations on objects for which the meaning is tenuous (e.g. "unlink" => "kill -TERM" on objects of type process); we tend to define new operations for manipulating objects of a new type.  But that is even more of a digression than I wanted to get into!

Sorry for the length of this message; I seem to have gotten carried away.
Happy trails,

Garret Swart
DEC Systems Research Center
130 Lytton Avenue
Palo Alto, CA 94301
(415) 853-2220
decwrl!swart.UUCP or swart@src.dec.com

Volume-Number: Volume 21, Number 91
gumby@Cygnus.COM (David Vinayak Wallace) (09/08/90)
From: gumby@Cygnus.COM (David Vinayak Wallace)

   Date: 7 Sep 90 15:23:19 GMT
   From: chip@tct.uucp (Chip Salzenberg)

   [Most of quoted message deleted. -mod]

   It is true that other operating systems get along without devices, IPC, etc. in their filesystems.  That's fine for them; but it's not relevant to Unix.  Unix programming has a history of relying on the filesystem to take care of things that other systems handle as special cases -- devices, for example....

What defines `true Unix'?  Don't forget that Multics had all this and more in the filesystem; this stuff was REMOVED when Unix was written.  Is this `continued development by the creators of Unix' just going back to what Unix rejected 20 years ago?

Or, for a pun for Multics fans: what goes around comes around...

Volume-Number: Volume 21, Number 92
peter@ficc.ferranti.com (Peter da Silva) (09/08/90)
From: peter@ficc.ferranti.com (Peter da Silva)

Other operating systems have learned from UNIX in this respect, in fact!  AmigaOS puts all manner of interesting things in the file name space, including pipes (PIPE:name), windows (CON:Left/Top/Width/Height/Title/Flags), and the environment (ENV:varname).  Other things have been left out but are being filled in by users (it's relatively easy to write device handlers on AmigaOS).

There are some really odd things like PATH:.  This can be opened as a file, in which case it looks like a list of directory names, or used as a directory, in which case it looks like the concatenation of all the named directories.
--
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 93
jfh@rpp386.cactus.org (John F. Haugh II) (09/11/90)
From: jfh@rpp386.cactus.org (John F. Haugh II)

In article <497@usenix.ORG> swart@src.dec.com (Garret Swart) writes:
>I believe in putting lots of interesting stuff in the file system name space but I don't believe that semaphores belong there.  The reason I don't want to put semaphores in the name space is the same reason I don't want to put my program variables in the name space: I want to have lots of them, I want to create and destroy them very quickly and I want to operate on them even more quickly.  In other words, the granularity is wrong.

There is no requirement that you bind every semaphore handle to a file system name -- only that it be possible to take a semaphore handle and create a file system name for it, or to take a file system name and retrieve a semaphore handle.  This would permit you to rapidly create and destroy semaphores for private use, as well as provide an external interface for public use.

There is no restriction in either case on the speed at which you can perform operations on the handle -- file descriptors are associated with file system name entries in many cases, and I've not seen anyone complain that file descriptors slow the system down.
--
John F. Haugh II                        UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                 Domain: jfh@rpp386.cactus.org
"SCCS, the source motel!  Programs check in and never check out!"
                -- Ken Thompson

Volume-Number: Volume 21, Number 96
ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe) (09/11/90)
From: ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe)

In article <497@usenix.ORG>, swart@src.dec.com (Garret Swart) writes:
> I believe in putting lots of interesting stuff in the file system name space but I don't believe that semaphores belong there.  The reason I don't want to put semaphores in the name space is the same reason I don't want to put my program variables in the name space: I want to have lots of them, I want to create and destroy them very quickly and I want to operate on them even more quickly.  In other words, the granularity is wrong.

So why not choose a different granularity?  Have the thing that goes in the file system name space be an (extensible) *array* of semaphores.  To specify a semaphore, one would use a (descriptor, index) pair.  To create a semaphore in a semaphore group, just use it.  If you want to have a semaphore associated with a data structure in mapped memory, just use a lock on the appropriate byte range of the mapped file.

(Am I hopelessly confused, or aren't advisory record locks *already* equivalent to binary semaphores?  Trying to lock a range of bytes in a file is just a multi-wait, no?  Why do we need two interfaces?  I can see that two or more _implementations_ behind the interface might be a good idea, but that's another question.)
--
Heuer's Law:  Any feature is a bug unless it can be turned off.

Volume-Number: Volume 21, Number 97
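[Editor: For concreteness, an advisory record lock can indeed be used as a blocking binary semaphore, and a single file gives an extensible array of them, one per byte.  The sketch below is illustrative only; the lock-file name is made up, and record locks have quirks of their own (they are per-process and evaporate when the process exits or closes the file).

    /* Minimal sketch: byte n of a lock file used as binary semaphore n. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    static void sem_op(int fd, off_t n, short type)
    {
        struct flock fl;

        memset(&fl, 0, sizeof fl);
        fl.l_type   = type;         /* F_WRLCK to acquire, F_UNLCK to release */
        fl.l_whence = SEEK_SET;
        fl.l_start  = n;            /* byte n names semaphore n */
        fl.l_len    = 1;
        fcntl(fd, F_SETLKW, &fl);   /* F_SETLKW blocks: a waiting P() */
    }

    int main(void)
    {
        int fd = open("/tmp/semgroup", O_RDWR | O_CREAT, 0600);

        sem_op(fd, 42, F_WRLCK);    /* P on semaphore 42 of this group */
        /* ... critical section ... */
        sem_op(fd, 42, F_UNLCK);    /* V */
        close(fd);
        return 0;
    }
]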
chip@tct.uucp (Chip Salzenberg) (09/12/90)
From: chip@tct.uucp (Chip Salzenberg) According to gumby@Cygnus.COM (David Vinayak Wallace): >Is this `continued development by the creators of Unix' just going >back to what Unix rejected 20 years ago? They threw away what wouldn't fit. Then they added features, but piece by piece, and only as they observed a need. This cycle has started again with Plan 9, which borrows heavily from Unix -- almost everything lives in the filesystem -- but which is in fact a brand new start. Unix owes much to Multics, and we can learn from it, but we needn't be driven by it. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 102
fouts@bozeman.bozeman.ingr (Martin Fouts) (09/18/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)

>>>>> On 7 Sep 90 15:23:19 GMT, chip@tct.uucp (Chip Salzenberg) said:

Chip> According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space ...

Chip> The absence of sockets (except UNIX domain), System V IPC, etc. from the file system is, in the opinion of many, a bug.  It is a result of Unix being extended by people who do not understand Unix.

My aren't we superior.  (;-)  At one time, I believed that sockets belonged in the filesystem name space.  I spent a long time arguing this point with members of the networking community before they convinced me that certain transient objects do not belong in that name space.  (See below)

Chip> Research Unix, which is the result of continued development by the creators of Unix, did not take things out of the filesystem.  To the contrary, it put *more* things there, including processes (via the /proc pseudo-directory).

The value of proc in the file system is debatable.  Certain debugging tools are easier to hang on an fcntl; certain others are not.  However, the presence of the proc file system is not a strong argument for the inclusion of other features in the file system.

Chip> It is true that other operating systems get along without devices, IPC, etc. in their filesystems.  That's fine for them; but it's not relevant to Unix.  Unix programming has a history of relying on the filesystem to take care of things that other systems handle as special cases -- devices, for example.  The idea that devices can be files but TCP/IP sockets cannot runs counter to all Unix experience.

Unix programming has a history of using the filesystem for some things and not using it for others.  For example, I can demonstrate a semantic under which it is possible to put the time of day clock into the file system and reference it by opening a file, say /dev/timeofday.  Each time I read from that file, I would get the current time.  Via fcntls, I could extend this to handle timer functions.  It wasn't done in Unix.  (I've done similar things in other OSs I've designed, though.)  The whole point of the response which you partially quoted was to remind the poster I was responding to that not all functions which might have been placed in the filesystem automatically have been.

Chip> The reason why I continue this discussion here, in comp.std.unix, is that many Unix programmers hope that the people in the standardization committees have learned from the out-of-filesystem mistake, and will rectify it.

The reason I respond is that it is not automatically safe to assume that something belongs in the file system because something else is already there.

There is also an explicit problem not mentioned in this discussion, which is the distinction between filesystem name space and filesystem semantics.  Sometimes there are objects which would be reasonable to treat with filesystem semantics for which there is no reasonable mechanism for introducing them into the filesystem name space.  Because of the way network connections are made, I have been convinced by networking experts (who are familiar with the "Unix style") that the filesystem namespace does not have a good semantic match for the network name space.
Marty
--
Martin Fouts
UUCP:  ...!pyramid!garth!fouts  (or)  uunet!ingr!apd!fouts
ARPA:  apd!fouts@ingr.com
PHONE: (415) 852-2310            FAX: (415) 856-9224
MAIL:  2400 Geng Road, Palo Alto, CA, 94303

Moving to Montana;  Goin' to be a Dental Floss Tycoon.
  -  Frank Zappa

Volume-Number: Volume 21, Number 114
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/19/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <523@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> At one time, I believed that sockets belonged in the filesystem name space.  I spent a long time arguing this point with members of the networking community before they convinced me that certain transient objects do not belong in that name space.

In contrast, I've found it quite easy to get people to agree that practically every object should be usable as an open *file*.  The beauty and power of UNIX is the abstraction of files---not filesystems.  I'd say that the concept of an open file descriptor is one of the most important reasons that UNIX-style operating systems are taking over the world.

chip@tct.uucp (Chip Salzenberg) writes:
> The reason why I continue this discussion here, in comp.std.unix, is that many Unix programmers hope that the people in the standardization committees have learned from the out-of-filesystem mistake, and will rectify it.

I am a UNIX programmer who strongly hopes that standards committees will never make the mistake of putting network objects into the filesystem.  Although the semantics of read() and write() fit network connections perfectly, the semantics of open() most certainly do not.  I will readily support passing network connections as file descriptors.  I will fight tooth and nail to make sure that they need not be passed as filenames.

---Dan

Volume-Number: Volume 21, Number 115
chip@tct.uucp (Chip Salzenberg) (09/20/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>According to chip@tct.uucp (Chip Salzenberg):
>> Research Unix [...] put *more* things [in the filesystem], including processes (via the /proc pseudo-directory).
>
>The value of proc in the file system is debatable.  Certain debugging tools are easier to hang on an fcntl; certain others are not.

With /proc, some things are much easier.  (Getting a list of all active pids, for example.)  Nothing, however, is harder.  A big win.

>However, the presence of the proc file system is not a strong argument for the inclusion of other features in the file system.

I disagree.  I consider it an excellent example of how the designers of Unix realize that all named objects potentially visible to more than one process belong in the filesystem namespace.

>Unix programming has a history of using the filesystem for some things and not using it for others.  For example, I can demonstrate a semantic under which it is possible to put the time of day clock into the file system ...

Of course.  But in the absence of remotely mounted filesystems -- which V7 Unix was not designed to support -- there is only one time of day, so it needs no name.  (I wouldn't be surprised if Plan 9 has a /dev/timeofday, however.)

>... not all functions which might have been placed in the filesystem automatically have.

This observation is correct.  But it is clear that the designers of Research Unix have used the filesystem for everything that needs a name, and they continue to do so.  Their work asks, "Why have multiple namespaces?"  Plan 9 asks the question again, and with a megaphone.

>Because of the way network connections are made, I have been convinced by networking experts (who are familiar with the "Unix style") that the filesystem namespace does not have a good semantic match for the network name space.

Carried to its logical conclusion, this argument would invalidate special files and named pipes, since they also lack a "good semantic match" with flat files.  In fact, the only entities with a "good semantic match" for flat files are -- you guessed it -- flat files.

So, how do we program in such a system?  We use its elegant interface -- or should I say "interfaces"?  Plain files, devices, IPCs, and network connections each have a semantically accurate interface, which unfortunately makes it different from all others.  This is progress?  "Forward into the past!"
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 119
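[Editor: The "list of all active pids" point is concrete: on a system with a /proc pseudo-directory it is an ordinary directory scan.  The sketch below is illustrative only; /proc layouts differ from system to system.

    /* Minimal sketch: list active process ids by reading /proc. */
    #include <dirent.h>
    #include <ctype.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *d = opendir("/proc");
        struct dirent *e;

        if (d == 0)
            return 1;
        while ((e = readdir(d)) != 0)
            if (isdigit((unsigned char)e->d_name[0]))   /* entries named by pid */
                printf("%s\n", e->d_name);
        closedir(d);
        return 0;
    }

Without /proc, the same job traditionally means grovelling through kernel memory with ps-style code that is tied to a particular release.]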
chip@tct.uucp (Chip Salzenberg) (09/20/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): >The beauty and power of UNIX is the abstraction of files--- >not filesystems. The filesystem means that anything worth reading or writing can be accessed by a name in one large hierarchy. It means a consistent naming scheme. It means that any entity can be opened, listed, renamed or removed. Both the filesystem and the file descriptor are powerful abstractions. Do not make the mistake of minimizing either one's contribution to the power and beauty of UNIX. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 118
peter@ficc.ferranti.com (Peter da Silva) (09/23/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <523@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> My aren't we superior.  (;-)  At one time, I believed that sockets belonged in the filesystem name space.  I spent a long time arguing this point with members of the networking community before they convinced me that certain transient objects do not belong in that name space.  (See below)

You mean things that don't operate like a single bidirectional stream, like pipes?  It's funny that the sockets that *do* behave that way are not in the file system, while UNIX-domain sockets (which have two ends on the local box) are.

> Unix programming has a history of using the filesystem for some things and not using it for others.

UNIX programming has a history of using whatever ad-hoc hacks were needed to get things working.  It's full of evolutionary dead-ends... some of which have been discarded (multiplexed files) and some of which have been patched up and overloaded (file protection bits).  But where things have moved closer to the underlying principles (everything is a file, for example) it's become the better for it.

> Sometimes there are objects which would be reasonable to treat with filesystem semantics for which there is no reasonable mechanism for introducing them into the filesystem name space.

This seems reasonable, but the rest is a pure argument from authority.  Could you repeat these arguments for the benefit of those of us who don't have the good fortune to know these networking experts you speak of?

[ Everyone involved in this discussion, please try to keep it in a technical, not a personal, vein. -mod ]
--
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 127
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/25/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <528@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes:
> According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> >The beauty and power of UNIX is the abstraction of files---not filesystems.
> Both the filesystem and the file descriptor are powerful abstractions.

On the contrary: Given file descriptors, the filesystem is an almost useless abstraction.

Programs fall into two main classes.  Some (such as diff) take a small, fixed number of filename arguments and treat each one specially.  They become both simpler and more flexible if they instead use file descriptors.  I'll propose multitee as an example of this.  Others (such as sed or compress) take many filenames and perform some action on each file in turn.  They also become both simpler and more flexible if they instead take input and output from a couple of file descriptors, perhaps with a simple protocol for indicating file boundaries.  I'll propose the new version of filterfile as a demonstration of how this can simplify application development.

In both cases, the application need know absolutely nothing about the filesystem.  A few utilities deal with filenames---shell redirection and cat.  A few utilities do the same for network connections---authtcp and attachport.  A few utilities do the same for pipes---the shell's piping.  But beyond these two or three programs per I/O object, the filesystem contributes *nothing* to the vast majority of applications.

There is one notable exception.  Some programs depend on reliable, static, local or virtually local storage, usually for what amounts to interprocess communication.  (login needs /etc/passwd.  cron reads crontab.  And so on.)  This is exactly what filesystems were designed for, and a program that wants reliable, static, local storage is perfectly within its rights to demand the sensible abstraction we call a filesystem.

Most applications that use input and output, though, don't care that it's reliable or static or local.  For them, the filesystem is pointless.  Many of us are convinced that open() and rename() and unlink() and so on are an extremely poor match for unreliable or dynamic or remote I/O.  We also see the sheer uselessness of forcing all I/O into the filesystem.

You must convince us that open() makes sense for everything that might be a file descriptor, and that it provides a real benefit for future applications, before you destroy what we see as the beauty and power of UNIX.

---Dan

Volume-Number: Volume 21, Number 128
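[Editor: The descriptor-only style of filter being argued for is, in the degenerate case, the sketch below -- the program never sees a filename; the shell or some other tool sets up descriptors 0 and 1 before it runs.  Illustrative only; this is not Bernstein's multitee or filterfile.

    /* Minimal sketch: a filter that knows nothing about the filesystem. */
    #include <unistd.h>

    int main(void)
    {
        char buf[8192];
        ssize_t n;

        while ((n = read(0, buf, sizeof buf)) > 0)
            if (write(1, buf, n) != n)
                return 1;
        return n < 0;
    }

The same binary then works unchanged on plain files, pipes, or network connections, which is the point being made.]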
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/25/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <539@usenix.ORG> peter@ficc.ferranti.com (Peter da Silva) writes:
> But where things have moved closer to the underlying principles (everything is a file, for example) it's become the better for it.

The underlying principle is that everything is a file *descriptor*.

> > Sometimes there are objects which would be reasonable to treat with filesystem semantics for which there is no reasonable mechanism for introducing them into the filesystem name space.
> This seems reasonable, but the rest is a pure argument from authority.  Could you repeat these arguments for the benefit of those of us who don't have the good fortune to know these networking experts you speak of?

The filesystem fails to deal with many (most?) types of I/O that aren't reliable, static, and local.  Here's an example: In reality, you initiate a network stream connection in two stages.  First you send off a request, which wends its way through the network.  *Some time later*, the response arrives.  Even if you aren't doing a three-way handshake, you must wait a long time (in practice, up to several seconds on the Internet) before you know whether the open succeeds.

In the filesystem abstraction, you open a filename in one stage.  You can't do anything between initiating the open and finding out whether or not it succeeds.  This just doesn't match reality, and it places a huge restriction on programs that want to do something else while they communicate.

You can easily construct other examples, but one should be enough to convince you that open() just isn't sufficiently general for everything that you might read() or write().

---Dan

Volume-Number: Volume 21, Number 129
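[Editor: The two-stage open Bernstein describes is what a non-blocking connect() looks like with BSD sockets.  The sketch below is illustrative only; the address is made up and error handling is abbreviated.

    /* Minimal sketch: initiate a connection, do other work, then wait
     * for the result. */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/select.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in sa;
        fd_set wfds;
        int s = socket(AF_INET, SOCK_STREAM, 0);

        memset(&sa, 0, sizeof sa);
        sa.sin_family = AF_INET;
        sa.sin_port = htons(25);
        sa.sin_addr.s_addr = inet_addr("10.0.0.1");

        fcntl(s, F_SETFL, fcntl(s, F_GETFL, 0) | O_NONBLOCK);
        if (connect(s, (struct sockaddr *)&sa, sizeof sa) < 0
            && errno != EINPROGRESS)
            return 1;                   /* immediate, local failure */

        /* ... the program is free to do other work here ... */

        FD_ZERO(&wfds);
        FD_SET(s, &wfds);
        select(s + 1, 0, &wfds, 0, 0);  /* writable => the connect finished */
        /* getsockopt(s, SOL_SOCKET, SO_ERROR, ...) tells whether it worked */
        close(s);
        return 0;
    }

Note that no filename appears anywhere in this exchange; whether one *should* is the question under debate.]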
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/25/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <529@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes:
> According to fouts@bozeman.bozeman.ingr (Martin Fouts):
> >However, the presence of the proc file system is not a strong argument for the inclusion of other features in the file system.
> I disagree.  I consider it an excellent example of how the designers of Unix realize that all named objects potentially visible to more than one process belong in the filesystem namespace.

I disagree.  I consider it an excellent example of how the designers of UNIX realize that all *reliable*, *static*, *local* (or virtually local) I/O objects potentially visible to more than one process belong in the filesystem namespace.

/dev/proc, for example, is reliable---there's no chance of arbitrary failure.  It's static---processes have inertia, and stick around until they take the positive action of exit()ing.  And it's local---you don't have an arbitrary delay before seeing the information.  So it's a perfectly fine thing to include in the filesystem without hesitation.

Objects that aren't reliable, or aren't static, or aren't local, also aren't necessarily sensible targets of an open().  Some of them might fit well, but each has to be considered on its own merits.

> So, how do we program in such a system?  We use its elegant interface -- or should I say "interfaces"?  Plain files, devices, IPCs, and network connections each have a semantically accurate interface, which unfortunately makes it different from all others.

The single UNIX interface is the file descriptor.  You can read() or write() reasonable I/O objects through file descriptors.  Very few programs---the shell is a counterexample---need to worry about what it takes to set up those file descriptors.  Very few programs---stty is a counterexample---need to know the ioctl()s or other functions that control the I/O more precisely.  What is your complaint?

---Dan

Volume-Number: Volume 21, Number 136
henry@zoo.toronto.edu (Henry Spencer) (09/25/90)
Submitted-by: henry@zoo.toronto.edu (Henry Spencer) In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >In the filesystem abstraction, you open a filename in one stage. You >can't do anything between initiating the open and finding out whether or >not it succeeds. This just doesn't match reality, and it places a huge >restriction on programs that want to do something else while they >communicate. Programs that want to do two things at once should use explicit parallelism, e.g. some sort of threads facility. In every case I've seen, this yielded vastly superior code, with clearer structure and better error handling. -- TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry Volume-Number: Volume 21, Number 131
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/26/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <543@usenix.ORG> henry@zoo.toronto.edu (Henry Spencer) writes: > In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > >In the filesystem abstraction, you open a filename in one stage. You > >can't do anything between initiating the open and finding out whether or > >not it succeeds. This just doesn't match reality, and it places a huge > >restriction on programs that want to do something else while they > >communicate. > Programs that want to do two things at once should use explicit parallelism, > e.g. some sort of threads facility. In every case I've seen, this yielded > vastly superior code, with clearer structure and better error handling. I agree that programs that want to do two things at once should use threads. However, a program that sends out several connection requests is *not*, in fact, doing several things at once. open() forces it into an unrealistic local model; surely you agree that this is not a good semantic match for what actually goes on. That example shows what goes wrong when locality disappears. As another example, NFS (as it is currently implemented) shows what goes wrong when reliability disappears. Have you ever run ``df'' on a Sun, only to have it hang and lock up your terminal? Your process is stuck in kernel mode, waiting for an NFS server that may be flooded with requests or may have crashed. Programs that use the filesystem for IPC assume that their files won't just disappear; this isn't true under NFS. I am not saying that networked filesystems are automatically a bad thing. Quite the contrary: a distributed filesystem with caching and other forms of replication can easily be local and reliable, and I'll gladly see standard UNIX make provisions for it. But something that's not local, or not reliable, or not static, is also not necessarily appropriate for the filesystem. ---Dan Volume-Number: Volume 21, Number 132
ske@pkmab.se (Kristoffer Eriksson) (09/26/90)
Submitted-by: ske@pkmab.se (Kristoffer Eriksson)

In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In the filesystem abstraction, you open a filename in one stage. [...]
>
>You can easily construct other examples, but one should be enough to
>convince you that open() just isn't sufficiently general for everything
>that you might read() or write().

What prevents us from inventing a few additional filesystem operations that ARE general enough?

I think the important thing about the filesystem abstraction that is being debated here is the idea of a common name space, and that idea does not require open() to be an indivisible operation, and it does not require that open() must be the only way to associate a file descriptor with a named object, as long as there is only one name space.

--
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60 ! e-mail: ske@pkmab.se
Fax: +46 19-11 51 03 ! or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske

Volume-Number: Volume 21, Number 133
henry@zoo.toronto.edu (Henry Spencer) (09/27/90)
Submitted-by: henry@zoo.toronto.edu (Henry Spencer) In article <544@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >> Programs that want to do two things at once should use explicit parallelism, >> e.g. some sort of threads facility. In every case I've seen, this yielded >> vastly superior code, with clearer structure and better error handling. > >I agree that programs that want to do two things at once should use >threads. However, a program that sends out several connection requests >is *not*, in fact, doing several things at once... I'm afraid I don't understand: a program that is trying, simultaneously, to open several different connections is somehow not doing several things at once? I think this is a confusion of implementation with specification. The program *is* doing several things at once, to wit opening several connections at once. If "open" is split into several steps, you can implement this in a single-threaded program, crudely, by interleaving the steps of the different opens. My point is that the code is cleaner, and often details like good error handling are easier, if you admit that there is parallel activity here and use explicitly parallel constructs. Then an "open" that is ready for step 2 does not need to wait for all the others to finish step 1 first. And if you do this, there is no need to decompose "open" at all, because each thread just does all the steps of one open in sequence. Furthermore, it can then proceed to do other useful setup chores, e.g. initial dialog on its connection, without waiting for the others. This is a far more natural model of what's going on than forcing everything into one sequential process, and a much better match for the semantics of the problem. -- TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry Volume-Number: Volume 21, Number 134
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/27/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <546@usenix.ORG> henry@zoo.toronto.edu (Henry Spencer) writes: > I'm afraid I don't understand: a program that is trying, simultaneously, > to open several different connections is somehow not doing several things > at once? Correct. Between sending an open request out upon the network and receiving an acknowledgment, the program is not doing anything at all related to that connection. Let me be more specific. Host X, on the Internet, wants to know the time. It decides to ask ten hosts around the network for the time. In reality, here's what happens in X's interaction with Y: X sends to Y a request for a connection on port 37. Pause. Y acknowledges. Y sends a few bytes back and closes the connection. During the pause, X is doing nothing. But there are several Y's. So X sends out ten requests in sequence. It waits. Each Y responds at some point; X collects the responses in whatever order they come. Where is it doing any two things at once, let alone several? > The program *is* doing several things at once, to wit opening several > connections at once. ``Opening a connection'' is really an abuse of the language, because a network open consists of at least two steps that may come arbitrarily far apart. Let me replace it by phrases that honestly describe what the computer is doing: ``sending out a connection request, and later accepting an acknowledgment.'' Now, out of the requests and acknowledgments going on, what two are happening at once? None of them. You're being misled by the terminology. ``Opening a connection'' is such a common phrase that we automatically accept it as a description of reality, and consequently believe that it is well described by open(); but it isn't. The time between request and acknowledgment is filled with nothing but a void. [ combining threads with a one-step open() ] > This is a far more natural model of what's > going on than forcing everything into one sequential process, and a > much better match for the semantics of the problem. No. It is not an accurate description of what is going on, since an open() is implicitly local while a network open is not. Abstract imagery aside, though, ``naturalness'' is really defined by how a concept helps a programmer. BSD's non-blocking connect() and select() for connection acceptance, while perhaps not the best-named system calls, are extremely easy to work with. They adapt perfectly to network programming problems because they accurately reflect what the system is doing. In contrast, forking off threads and kludging around a local open() is unnecessarily complex and would make network programming unnecessarily difficult. For me that condemns it as an unnatural, inaccurate reflection of reality. ---Dan Volume-Number: Volume 21, Number 135
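[ Editor's sketch: Dan's ten-host time-of-day example, written with the calls he names -- non-blocking connect() followed by select(). start_connect() is the helper sketched earlier in this thread; the host list is hypothetical, and a real program would add a timeout. Not from the original post.

    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/select.h>
    #include <unistd.h>
    #include <stdio.h>

    extern int start_connect(const char *dotted, unsigned short port);

    static const char *hosts[] =        /* hypothetical addresses */
        { "128.122.142.2", "10.1.1.1", "10.2.2.2" };
    #define NHOSTS (sizeof hosts / sizeof hosts[0])

    int main(void)
    {
        int fd[NHOSTS], i, nleft = 0;

        /* Fire off every request; no connection waits for another. */
        for (i = 0; i < NHOSTS; i++)
            if ((fd[i] = start_connect(hosts[i], 37)) >= 0)  /* port 37: time */
                nleft++;

        /* Collect the answers in whatever order they arrive. */
        while (nleft > 0) {
            fd_set r;
            int maxfd = -1;

            FD_ZERO(&r);
            for (i = 0; i < NHOSTS; i++)
                if (fd[i] >= 0) {
                    FD_SET(fd[i], &r);
                    if (fd[i] > maxfd)
                        maxfd = fd[i];
                }
            if (select(maxfd + 1, &r, NULL, NULL, NULL) < 0)
                break;
            for (i = 0; i < NHOSTS; i++)
                if (fd[i] >= 0 && FD_ISSET(fd[i], &r)) {
                    char buf[16];
                    ssize_t n = read(fd[i], buf, sizeof buf);

                    if (n > 0)
                        printf("%s answered (%ld bytes)\n", hosts[i], (long)n);
                    close(fd[i]);
                    fd[i] = -1;
                    nleft--;
                }
        }
        return 0;
    }

--Ed. ]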
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/27/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <545@usenix.ORG> ske@pkmab.se (Kristoffer Eriksson) writes:
> In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
  [ file descriptors are general; the filesystem is not ]
> What prevents us from inventing a few additional filesystem operations
> that ARE general enough?

That's a good question. I am willing to believe that a somewhat different kind of filesystem could sensibly handle I/O objects that are neither reliable nor local. I find it somewhat harder to believe that the concept of a filesystem can reasonably reflect dynamic I/O: information placed into a filesystem should stick around until another explicit action. In any case, you'll have to invent those operations first.

> I think the important thing about the filesystem abstraction that is being
> debated here, is the idea of a common name space,

Here's what I thought upon reading this. First: ``A common name space is irrelevant to the most important properties of a filesystem.'' Second: ``A common name space is impossible.'' And finally: ``We already have a common name space.'' Let me explain.

My first thought was that the basic purpose of a filesystem---to provide reliable, static, local I/O---didn't require a common name space. As long as there's *some* way to achieve that goal, you have a filesystem. UNIX has not only some way, but a uniform, consistent, powerful way: file descriptors. But that's dodging your question. Just because a common name space is irrelevant to I/O doesn't mean that it may not be helpful for some other reason.

My second thought was that the kind of name space you want is impossible. You want to include network objects, but no system can possibly keep track of the tens of thousands of ports under dozens of protocols on hundreds of thousands of computers. It's just too big. But that's not what you're looking for. Although the name space is huge, any one computer only looks at a tiny corner of that space. You only need to see ``current'' names.

My third thought: We already have that common name space! (file,/bin/sh) is in that space. (host,128.122.142.2) is in that space. (proc,1) is in that space. No system call uses this common name space, but it's there. Use it at will.

---Dan

Volume-Number: Volume 21, Number 137
rja7m@plaid.cs.Virginia.EDU (Ran Atkinson) (09/27/90)
Submitted-by: rja7m@plaid.cs.Virginia.EDU (Ran Atkinson) In article <545@usenix.ORG> ske@pkmab.se (Kristoffer Eriksson) writes: >What prevents us from inventing a few additional filesystem operations >that ARE general enough? PLEASE. Let's don't go off inventing new things as part of a standards effort. The proper way to approach standardisation is to standardise the existing practice and avoid all new inventions that haven't been fully implemented and tested widely. Many of the problems with UNIX-derived OSs have come from folks who didn't do this and ended up with stuff that wasn't really compatible with the rest of the OS in function or approach. A lot of the problems I see coming out of the working groups in P1003 come from folks failing to standardise existing practice and instead going off and inventing a new idea in the committee that hasn't been implemented and lacks adequate actual experience with whether the idea really works and is a general solution to a real problem. Randall Atkinson randall@Virginia.EDU Volume-Number: Volume 21, Number 140
chip@tct.uucp (Chip Salzenberg) (09/28/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
>The underlying principle is that everything is a file *descriptor*.

No one disputes the significance of file descriptors. Nevertheless, it is important not to underestimate the simplification gained by using one namespace for all objects -- files, devices, processes, hosts, IPC entities, etc. A filesystem is good for files, but a namespace is good for everything. And if an object has a name, and you want a file descriptor referring to that object, why invent a new system call? I'd rather continue using open().

>In reality, you initiate a network stream connection in two stages.
>First you send off a request, which wends its way through the network.
>*Some time later*, the response arrives.

This situation is easily modeled with open() and O_NDELAY. Compare the way Unix opens a modem control tty. Normally, the open() call will block until the carrier detect line is asserted. However, the O_NDELAY parameter to open() avoids the blockage.

Likewise, an open() on a TCP connection would block until the connection succeeds or fails. However, the O_NDELAY parameter would allow the program to continue immediately, with provisional status of "success". The program could come back and check on the open() status later, perhaps with an fcntl() call.

Devices are well-entrenched residents of the filesystem namespace. So far, all proposed reasons for keeping network connections out of the filesystem would apply equally to devices. Do we really want to leave the filesystem free of everything except files? That way lay CP/M.

--
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 138
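[ Editor's sketch: the existing practice Chip is pointing to -- opening a modem-control tty without waiting for carrier -- looks like this. The device name is hypothetical and error handling is trimmed; modern systems spell O_NDELAY as O_NONBLOCK.

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        /* Without O_NONBLOCK this open() would block until the
           carrier-detect line on the modem is asserted. */
        int fd = open("/dev/cua0", O_RDWR | O_NONBLOCK);

        if (fd < 0) {
            perror("/dev/cua0");
            return 1;
        }

        /* ... do other work here ... then, before really using the
           line, restore ordinary blocking behaviour: */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) & ~O_NONBLOCK);

        close(fd);
        return 0;
    }

Chip's proposal is the same pattern applied to a connection: the open() returns at once with provisional success, and the program checks the real status later. --Ed. ]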
chip@tct.uucp (Chip Salzenberg) (09/28/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): >NFS (as it is currently implemented) shows what goes wrong when >reliability disappears. In a discussion of filesystem semantics, NFS is a straw man. Everyone knows it's a botch. If AFS and RFS don't convince one that a networked filesystem namespace can work well, then nothing will. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 139
chip@tct.uucp (Chip Salzenberg) (09/28/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): >On the contrary: Given file descriptors, the filesystem is an almost >useless abstraction. Characterizing the Unix filesystem as "almost useless" is, frankly, hogwash. A hierarchical filesystem with mount points is a simple, yet powerful, organizational tool. To get back to the original point of this thread, one of my primary complaints about the System V IPC facilities is that they all live in a flat namespace. There is no way for me to create a subdirectory for my application, with naturally named IPCs within that directory. Such hierarchical division is "almost useless?" Hardly. >Many of us are convinced that open() and rename() and unlink() and so on >are an extremely poor match for unreliable or dynamic or remote I/O. Given Unix, where devices -- even those with removable media -- are accessed through the filesystem, I can see no reason whatsoever to treat network connections and other IPC facilities differently. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 141
gwyn@smoke.brl.mil (Doug Gwyn) (09/28/90)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn) In article <540@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >You must convince us that open() makes sense for everything that might >be a file descriptor, ... open() provides a mechanism for obtaining the object's handle ("file descriptor") in the first place. The argument is really about whether there ought to be more than one way to originate such a handle. (dup(), fork(), etc. merely propagate a handle obtained by other means.) It is possible, as I described over a year ago in the now-defunct comp.unix.wizards newsgroup, to design a UNIX-like operating system where "it takes a handle to get a handle". However, UNIX is definitely not like that. From a software engineering viewpoint, if a single mechanism for originating handles will suffice, then that is the best approach. The hierarchical filesystem serves a useful function that you neglected to mention: It provides "nodes" at which objects have an opportunity to contribute to decisions during interpretation of pathnames. For example, a directory node plays a very important organizational role, a device driver node acts like a "portal", nodes act as mount points, and so on. Without an identifiable node structure the system would be severely emaciated. Indeed, Plan 9 exploits this even more heavily than does UNIX. Volume-Number: Volume 21, Number 145
gwyn@smoke.brl.mil (Doug Gwyn) (09/28/90)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)

In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In the filesystem abstraction, you open a filename in one stage. You
>can't do anything between initiating the open and finding out whether or
>not it succeeds. This just doesn't match reality, and it places a huge
>restriction on programs that want to do something else while they
>communicate.

UNIX was designed explicitly on the model of communicating sequential processes. Each process acts as though it executes in a single thread, blocking when it accesses a resource that is not immediately ready. While it would be easy to argue that there is a need for improved IPC, I haven't heard any convincing arguments for making asynchrony explicitly visible to a process. In fact, it was considered quite a step forward in computing back in the old days ("THE" operating system, for example) when viable means of hiding asynchrony were developed.

Volume-Number: Volume 21, Number 144
peter@ficc.ferranti.com (Peter da Silva) (09/30/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva) In article <548@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > I disagree. I consider it an excellent example of how the designers of > UNIX realize that all *reliable*, *static*, *local* (or virtually local) > I/O objects potentially visible to more than one process belong in the > filesystem namespace. Like "/dev/tty"? I think you've got some semantic gap here between what's appropriate for a file versus what's appropriate for a file descriptor. An arbitrary failure on an open file descriptor causes problems... but that doesn't keep socket() from returning an fd. An arbitrary failure or an arbitrary delay on an open call is perfectly reasonable: programs expect open to fail. They depend on write() working. And serial lines are subject to all the "hazardous" behaviour of network connections. An open can be indefinitely deferred. The connection, especially over a modem, can vanish at any time. Why not take *them* out of the namespace as well? > You can read() or > write() reasonable I/O objects through file descriptors. Very few > programs---the shell is a counterexample---need to worry about what it > takes to set up those file descriptors. And that's the problem, because the shell is the program that is used to create more file descriptors than just about anything else. If the shell had a syntax for creating sockets and network connections we wouldn't be having this discussion... but then if it did then you might as well make it be via filenames... And look where this discussion started... over shared memory and messages and semaphores being in a separate namespace. But shared memory and message ports are all: reliable, static, and local... at least as much as processes. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com Volume-Number: Volume 21, Number 150
peter@ficc.ferranti.com (Peter da Silva) (10/01/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva) In article <547@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > ``Opening a connection'' is such a common phrase that we automatically > accept it as a description of reality, and consequently believe that it > is well described by open(); but it isn't. The time between request and > acknowledgment is filled with nothing but a void. There are a *number* of cases in UNIX where an open() does not return in a determinable time. The correct solution to this is not to pull stuff out of the file system, but to provide an asynchronous open() call (that can well be hidden by a threads library, but the mechanism should be there). This is related to the issue of whether network end-points belong in the file system, but it is not the same issue because there's much more than networks involved... including objects (serial ports with modem control, in particular) that are already in the filesystem. Oddly enough, the latest draft of P1003.4 that I have available does NOT include an asynchronous OPEN request. This is a serious omission. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com Volume-Number: Volume 21, Number 158
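[ Editor's sketch: in the absence of an asynchronous open() in the draft, the usual workaround is to perform the blocking open() somewhere else and report the result through a descriptor the main loop can already select() on. The sketch below uses POSIX threads, which postdate this discussion; the helper names are invented, and error checks are trimmed.

    #include <pthread.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>

    struct async_open { char *path; int flags; int notify_fd; };

    static void *opener(void *arg)
    {
        struct async_open *a = arg;
        int fd = open(a->path, a->flags);     /* may block indefinitely */

        write(a->notify_fd, &fd, sizeof fd);  /* deliver the result (or -1) */
        close(a->notify_fd);
        free(a->path);
        free(a);
        return NULL;
    }

    /* Returns a pipe descriptor the caller can select() on; reading it
       yields the int result of the open(). */
    int open_async(const char *path, int flags)
    {
        int p[2];
        pthread_t t;
        struct async_open *a;

        if (pipe(p) < 0)
            return -1;
        a = malloc(sizeof *a);
        a->path = strdup(path);
        a->flags = flags;
        a->notify_fd = p[1];
        pthread_create(&t, NULL, opener, a);
        pthread_detach(t);
        return p[0];
    }

--Ed. ]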
donn@hpfcrn.fc.hp.com (Donn Terry) (10/01/90)
Submitted-by: donn@hpfcrn.fc.hp.com (Donn Terry)

I've been following this discussion on the issues of filesystem namespace. I'd like to step back from the details and look at it a little more philosophically. I think that that may lead to a resolution of the issues (or at least some progress) (or a decrease in the shrillness) (or something).

UNIX was designed to simplify the programmer's life. In particular, anything that could be reasonably generalized, was. This generalization is not an easy task, and not easy to explain. The genius of Ritchie and Thompson was that they both achieved the generalization and got others to believe in it.

The generalization is more difficult to deal with when you are "used to" some other model. (I see folks using various proprietary systems griping about UNIX because it doesn't do everything just the way they are used to.) As Dijkstra once observed about BASIC (I paraphrase, not having the quote): "The teaching of BASIC should be forbidden because it forever ruins students from being able to use better languages." I think (although he exaggerates) that Dijkstra's comment also applies in this case. We all are contaminated to some degree or other by the preconceptions we bring with us from other training (be it experience with other OSs or something else).

I have some personal concerns about some of the functionality in 1003.4 because it appears to be based upon models from other, successful, implementations, but ones that may not have been through the process of generalization. It was R&T's thought that having lots of processes would solve such problems, and for the day, it did. Now it doesn't, because of tightly coupled activities (tasks?) needing "fast" switch time. To me, threads is the generalization that follows the original philosophy, not bringing OS-like functions similar to select() up to the user. (I didn't like threads at first, like many don't; I may still not like the details, but they do seem to provide the generalization needed for that class of task, without the application writer having to write a mini-dispatcher of his/her own.)

The broad context of namespace is similar, to me. What's the generalization? I don't really know. My (UNIX flavored) biases say that it's the filesystem. However, a generalization, not a statement that "my problem is different so must be treated differently", is the right answer.

Let me try something for the readers of this group to think about. The "UNIX Filesystem" really consists of two parts: a hierarchical namespace mechanism that currently names objects which are at least files, devices, file stores (mounted volumes), and data stream IPC mechanisms (OK, FIFOs!). Some systems add other IPC mechanisms (Streams, Sockets), and the process space (/proc). I could go on.

One of the classes of objects named in the namespace is ordinary files. The set of ordinary files is a collection of flat namespaces, where the names are (binary) numbers. (Each mounted volume is an element of the collection, and each i-number is a filename. The "real names" of files are the volume and i-number pair; that's how you tell if two files are identical, not by their names in the namespace, of which they may have zero or more.) (The fact that the other object types also usually have i-numbers is an accident of implementation.)

Open() is a means to translate from the namespace to a handle on an object. It may be that the handle is for an ordinary file, or for some other object (as I listed above).
Historically, files were the most common concept, and the namespace became the "filesystem". (The volume/inode namespace isn't, and shouldn't be, accessible, because the gateway functions that Doug Gwyn mentions are necessary and valuable.)

Given the above three paragraphs, one could consciously separate the namespace from the file system further, and then the argument that "a connection is not a file" seems weaker. A "connection" is an object in the namespace, and open() gives you a handle on it. Given that you know what the object is, you may have to perform additional operations on it, or avoid them. (E.g., many programs operate differently based on the nature of the object they open; if it's a tty the program does ioctl() calls on it, if not, it doesn't.)

I'm not yet sure that the "filesystem" namespace is (or is not) the right generalization, but a generalization is useful so that we don't end up where we were when R&T started out, with a bunch of unrelated namespaces where, by relating them, common functions could be combined, and common operations could be performed commonly. For example, it would be a shame if we find that some network objects that were not put in the generic namespace could reasonably have the open()/read()/write()/close() model applied to them, and because they were in a different namespace, this could not be done (easily). Many existing proprietary systems (and even more historical ones) left you in the state that a program that sequentially read an ordinary file couldn't simply do the same thing to a device (without extra programming, anyway). Not looking for the generalization could lead us to the same state again for the "newer" technologies.

Donn Terry
Speaking only for myself.

Volume-Number: Volume 21, Number 161
henry@zoo.toronto.edu (Henry Spencer) (10/02/90)
Submitted-by: henry@zoo.toronto.edu (Henry Spencer) In article <547@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >> The program *is* doing several things at once, to wit opening several >> connections at once. > >``Opening a connection'' is really an abuse of the language, because a >network open consists of at least two steps that may come arbitrarily >far apart... This is the nub of the issue, and it's a difference in semantic models. Dan insists on seeing open as a sequence of operations visible to the user, in which case his viewpoint is reasonable. I prefer the Unix approach -- the details of an open are none of the user's business, only whether it succeeds or fails -- in which case "opening a connection" is entirely reasonable terminology, and opening several at once (i.e. sending out multiple requests before receiving acknowledgements) is indeed doing several things at once, best handled with explicit parallelism. Both models are defensible, but I would sort of hope that in a Unix standard, the Unix model would be employed. It is easy to construct examples where explicit parallelism buys you things that the multi-step model can't easily achieve, such as writing data from one connection to disk while another one is still exchanging startup dialog. One *can* always do this in the multi-step model, but it amounts to simulating parallel threads. The main structure of the program turns into: for (;;) { wait for something to happen on some connection deal with it, in such a way that you never block } which does work, but greatly obscures the structure of what's going on, and tends to require all sorts of strange convolutions in "deal with it" because of the requirement that it not block. (If it blocks, activity on *all* connections blocks with it.) BSDish server code tends to be very hard to understand because of exactly this structure. With multiple threads, each one can block whenever convenient, and the others still make progress. Best of all, the individual threads' code looks like a standard Unix program: open connection do reads and writes on it and other things as necessary close it exit instead of being interwoven into a single master loop with all the rest. Almost any program employing select() would be better off using real parallelism instead, assuming that costs are similar. (It is easy to make costs so high that parallelism isn't practical.) -- Imagine life with OS/360 the standard | Henry Spencer at U of Toronto Zoology operating system. Now think about X. | henry@zoo.toronto.edu utzoo!henry Volume-Number: Volume 21, Number 163
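[ Editor's sketch: the per-connection structure Henry describes, using POSIX threads (which the committees were only beginning to define at the time). connect_to() and handle() are hypothetical stand-ins for "open connection" and "do reads and writes on it"; the point is only the shape of the code -- each thread blocks wherever convenient without stalling the others.

    #include <pthread.h>
    #include <stdlib.h>
    #include <unistd.h>

    extern int connect_to(char *host);   /* blocking "open connection" */
    extern void handle(int fd);          /* reads, writes, startup dialog */

    static void *one_connection(void *arg)
    {
        int fd = connect_to(arg);        /* blocks; only this thread waits */

        if (fd >= 0) {
            handle(fd);                  /* plain sequential reads/writes */
            close(fd);
        }
        return NULL;
    }

    void serve_all(char **hosts, int n)
    {
        pthread_t *t = calloc(n, sizeof *t);
        int i;

        for (i = 0; i < n; i++)
            pthread_create(&t[i], NULL, one_connection, hosts[i]);
        for (i = 0; i < n; i++)
            pthread_join(t[i], NULL);
        free(t);
    }

--Ed. ]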
donn@hpfcrn.fc.hp.com (Donn Terry) (10/02/90)
Submitted-by: donn@hpfcrn.fc.hp.com (Donn Terry)

I was thinking about this a bit more, and want to propose some food for thought on the issue.

Classically, open() is a function that "opens a file descriptor", which is where the name comes from. However, if you think of open() instead as "translate this string from the (filesystem) namespace, and give me a handle on the object", it actually makes more sense. The operations that can be performed on a file are the classical operators applicable to such a handle. However, some are forbidden or meaningless on some object types (lseek on FIFOs, ioctl on ordinary files, some fcntls on devices), and some object types have operations only applicable to them (ioctl on devices) and no other type. I can easily imagine an object that had none of the classical file operations applied to it.

Now, there is also nothing that requires that open() be the only function that returns such a generic object handle. Imagine (simple example) a hierarchical namespace that contains all possible character bitcodes in the namespace. Open() would not work very well because of the null termination and slash rules. However, I can imagine another function that takes a char** as an argument, where each element is the name at the next level of the hierarchy. (With length in the first byte.) It would still return a classical file descriptor. Similarly, maybe the punctuation is different, or the notion of "root" is different; generalizing open() to "give me a handle in a namespace" may be most useful.

I intend this not as any sort of proposal of something that should or should not be done, but as an "icebreaker" in terms of thinking about the problem. What are the further generalizations we need, how do they make sense and fit together, and (the real test of success) what are some of the unexpected benefits of the generalization? (Granting that the "biggest" unexpected benefit will show up "later".)

Donn Terry
Speaking only for myself.

Volume-Number: Volume 21, Number 167
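[ Editor's sketch: the second entry point into the namespace that Donn imagines might be declared as below. Nothing of the sort exists in any standard or implementation; the declaration is purely an illustration of a lookup function that still returns a classical file descriptor.

    /* Each element of `name' is one level of the hierarchy, carrying its
       length in the first byte so that '/' and '\0' may appear freely in
       component names; the array itself ends with a NULL pointer.  The
       return value is an ordinary file descriptor, exactly as from open(). */
    int nopen(unsigned char **name, int flags);

    /* Example: the two-level name { "\003dev", "\003tty", NULL } would
       denote what the string "/dev/tty" denotes today. */

--Ed. ]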
fouts@bozeman.bozeman.ingr (Martin Fouts) (10/03/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)
>>>>> On 27 Sep 90 20:03:39 GMT, chip@tct.uucp (Chip Salzenberg) said:
Chip> Given Unix, where devices -- even those with removable media -- are
Chip> accessed through the filesystem, I can see no reason whatsoever to
Chip> treat network connections and other IPC facilities differently.
Chip> --
One reason to not treat every IPC facility as part of the file system:
Shared memory IPC mechanisms which don't need to be visible to
processes not participating in the IPC.
Marty
--
Martin Fouts
UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303
Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa
Volume-Number: Volume 21, Number 169
domo@tsa.co.uk (Dominic Dunlop) (10/03/90)
Submitted-by: domo@tsa.co.uk (Dominic Dunlop)

In article <107020@uunet.UU.NET> donn@hpfcrn.fc.hp.com (Donn Terry) writes cogently about file system and other name spaces. I'm not going to add significantly to what he said, merely embroider a little:

> One of the classes of objects named in the namespace is ordinary files.
> The set of ordinary files is a collection of flat namespaces, where
> the names are (binary) numbers. (Each mounted volume is an element
> of the collection, and each i-number is a filename. The "real names"
> of files are the volume and i-number pair; that's how you tell if two
> files are identical, not by their names in the namespace, of which
> they may have zero or more.) (The fact that the other object types
> also usually have i-numbers is an accident of implementation.)

I'd just like to add that the existing POSIX.1 standard does incorporate the concept of ``a per-file system unique identifier for a file'', although its ethnic origins have been disguised by calling it a ``file serial number'' rather than an i-number. The corresponding field in the stat structure is, by no coincidence at all, st_ino.

Donn's point about the need to be able to determine whether two ``handles'' (whatever they may be) refer to the same object is a good one. It follows that, if new types of object are made accessible through filename space, the information returned by stat() (or fstat()) should be sufficient uniquely to identify each distinct object.

Of course, where the object is not a conventional file, life becomes more complex than simply saying that each unique serial number/device id combination refers to a unique object. Although POSIX.1 is reticent on the topic because it is studiously avoiding the UNIX-ism of major and minor device numbers, we all know that, faced with a device file on a UN*X system, we should ignore the serial number, and use only the device id in determining uniqueness.

I dare say that, as more types of object appear in filename space (and I, for one, should like to see them do so), the question of determining uniqueness will become knottier. Suppose, for example, that one used filenames as handles for virtual circuits across a wide-area network. Conceivably, the number of such circuits could be sufficiently large that it will become difficult to shoe-horn a unique identifier into the existing stat structure fields. A problem for the future?
--
Dominic Dunlop

Volume-Number: Volume 21, Number 172
jason@cnd.hp.com (Jason Zions) (10/03/90)
Submitted-by: jason@cnd.hp.com (Jason Zions)

Dominic Dunlop says:
> I dare say that, as more types of object appear in filename space (and
> I, for one, should like to see them do so), the question of determining
> uniqueness will become knottier. Suppose, for example, that one used
> filenames as handles for virtual circuits across a wide-area network.
> Conceivably, the number of such circuits could be sufficiently large
> that it will become difficult to shoe-horn a unique identifier into the
> existing stat structure fields. A problem for the future?

Actually, a problem for today. P1003.8 has to cope with the fact that a local file for major 0, minor 0x010100, inode 1234 is *different* from a file on some remote machine with the same (major,minor,inode) triplet.

But adding a new field or fields to the stat structure isn't gonna work; expanding that structure will cause many implementations to shatter (i.e. break spectacularly). Just cobbling up a major number for some random remotely-mounted filesystem is unsatisfactory, unless the cobble is persistent over umount/mount operations. (An application starts to run; opens file1 on remsys, gets (maj,min,ino). Network goes down, comes up; system remounts remsys. App opens file2 on remsys. That major number had better be the same for remsys!)

What's needed is a simple routine which can be called to determine if two handles point to the same object. It would be nice if there was a routine which took as arguments a file handle and a path name and returned true iff the path referred to the same file. This routine would be guaranteed by the implementor to work for any file-system resident object provided for; e.g. an SVR4 implementation would have to be able to tell if a file opened via RFS referred to the same underlying file as one opened under NFS.

I don't know if that's sufficient, though; application programmers may be using the stat info for other purposes, and a remote_addr field might be a good idea, once P1003.12 decides on a representation for an arbitrary network address, which might be considerably larger than an IP address.

Jason Zions

Volume-Number: Volume 21, Number 174
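[ Editor's sketch: the routine Jason asks for, written the only way today's interfaces allow -- by comparing (st_dev, st_ino). This is exactly the test that stops being trustworthy once two different remote filesystems can present the same pair, which is his point; the sketch is illustrative, not a proposal.

    #include <sys/types.h>
    #include <sys/stat.h>

    /* Return nonzero iff `path' currently refers to the same underlying
       file as the open descriptor `fd'. */
    int same_file(int fd, const char *path)
    {
        struct stat a, b;

        if (fstat(fd, &a) < 0 || stat(path, &b) < 0)
            return 0;                   /* can't tell, so say no */
        return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
    }

--Ed. ]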
peter@ficc.ferranti.com (Peter da Silva) (10/04/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <13132@cs.utexas.edu> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> One reason to not treat every IPC facility as part of the file system:
> Shared memory IPC mechanisms which don't need to be visible to
> processes not participating in the IPC.

Provide an example, considering the advantages that shell-level visibility of objects has for (a) debugging, (b) system administration, (c) integration, (d)...

It's nice to be able to fake a program out with a shell script.
--
Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com

Volume-Number: Volume 21, Number 176
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/04/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <551@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes: > According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > >NFS (as it is currently implemented) shows what goes wrong when > >reliability disappears. > In a discussion of filesystem semantics, NFS is a straw man. Everyone > knows it's a botch. > If AFS and RFS don't convince one that a networked filesystem > namespace can work well, then nothing will. Exactly! This example proves my point. What's so bad about NFS---why it doesn't fit well into the filesystem---is that it doesn't make the remote filesystem reliable and local. If you show me Joe Shmoe's RFS with reliable, local, static I/O objects, I'll gladly include it in the filesystem. ---Dan Volume-Number: Volume 21, Number 185
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/04/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <106697@uunet.UU.NET> peter@ficc.ferranti.com (Peter da Silva) writes: [ Programs depend on write() working. ] On the contrary. When the descriptor is unreliable, you get an I/O error or the data is simply corrupted; this is exactly what happens with disk I/O. Programs that handle errors on read() and write() are more robust than programs that don't. More commonly, when the descriptor is dynamic and the other side drops, you get a broken pipe. This is certainly not a rare failure mode. In context, I said that open() is only appropriate for reliable, static, local I/O objects. You seem to be arguing that read() and write(), and file descriptors in general, also require reliable, static, local I/O objects, and so my distinction is silly. But UDP sockets, pipes, and TCP sockets are unreliable, dynamic, and remote file descriptors respectively, and read()/write() work with them perfectly. > > You can read() or > > write() reasonable I/O objects through file descriptors. Very few > > programs---the shell is a counterexample---need to worry about what it > > takes to set up those file descriptors. > And that's the problem, because the shell is the program that is used to > create more file descriptors than just about anything else. If the shell > had a syntax for creating sockets and network connections we wouldn't be > having this discussion... Oh? Really? I have a syntax for creating sockets and network connections from my shell. For example, I just checked an address by typing $ ctcp uunet.uu.net smtp sh -c 'echo expn rsalz>&7;echo quit>&7;cat<&6' So we shouldn't be having this discussion, right? > but then if it did then you might as well make > it be via filenames... Why? I don't see a natural filename syntax for TCP connections, so why should I try to figure one out? What purpose would it serve? Only two programs---a generic client and a generic server---have to understand the filenames. If those two programs work, what's the problem? [ shm and sem are reliable, static, local ] As a BSD addict I don't have much experience with those features, but I believe you're right. So feel free to put shared memory objects into the filesystem; I won't argue. Semaphores, I'm not sure about, because it's unclear what a file descriptor pointing to a semaphore should mean. Are semaphores I/O objects in the first place? ---Dan Volume-Number: Volume 21, Number 182
aglew@crhc.uiuc.edu (Andy Glew) (10/04/90)
Submitted-by: aglew@crhc.uiuc.edu (Andy Glew) >In the filesystem abstraction, you open a filename in one stage. You >can't do anything between initiating the open and finding out whether or >not it succeeds. This just doesn't match reality, and it places a huge >restriction on programs that want to do something else while they >communicate. Sounds like you want an asynchronous open facility, much like the asynchronous read and write that others already have on their wish list for file I/O (and other I/O) (not everyone believes that multiple threads are the way to do asynch I/O). -- Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi] Volume-Number: Volume 21, Number 181
chip@tct.uucp (Chip Salzenberg) (10/05/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to fouts@bozeman.bozeman.ingr (Martin Fouts): >One reason to not treat every IPC facility as part of the file system: >Shared memory IPC mechanisms which don't need to be visible to processes >not participating in the IPC. Yes, it is obviously desirable to have IPC entities without names. This feature is a simple extension of the present ability to keep a plain file open after its link count falls to zero. Of course, the committee could botch the job by making it an error to completely unlink a live IPC. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 186
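[ Editor's sketch: the existing practice Chip cites -- a plain file that stays alive, open but nameless, after its last link is removed. The pathname is hypothetical and error handling is trimmed.

    #include <fcntl.h>
    #include <unistd.h>

    int anonymous_scratch_file(void)
    {
        int fd = open("/tmp/scratch.tmp", O_RDWR | O_CREAT | O_EXCL, 0600);

        if (fd < 0)
            return -1;
        unlink("/tmp/scratch.tmp");   /* link count drops to zero ...    */
        return fd;                    /* ... but the descriptor keeps the
                                         storage alive until close()     */
    }

The analogous rule for a named IPC object would let a participant unlink the name while the object itself lives on for those already holding it. --Ed. ]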
nick@bis.com (Nick Bender) (10/06/90)
Submitted-by: nick@bischeops.uucp (Nick Bender) In article <13218@cs.utexas.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: = Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) = = In article <551@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes: = > According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): = > >NFS (as it is currently implemented) shows what goes wrong when = > >reliability disappears. = > In a discussion of filesystem semantics, NFS is a straw man. Everyone = > knows it's a botch. = > If AFS and RFS don't convince one that a networked filesystem = > namespace can work well, then nothing will. = = Exactly! This example proves my point. What's so bad about NFS---why it = doesn't fit well into the filesystem---is that it doesn't make the = remote filesystem reliable and local. If you show me Joe Shmoe's RFS = with reliable, local, static I/O objects, I'll gladly include it in the = filesystem. = = ---Dan Any program which assumes that write(2) always works is broken. Period. That's why you sometimes get long streams of "filesystem full" on your console when some brain-damaged utility doesn't check a return value. In my view this is not a reason to call NFS a botch. nick@bis.com Volume-Number: Volume 21, Number 188
chip@tct.uucp (Chip Salzenberg) (10/09/90)
[I would like to avoid an NFS flame fest if possible. If you respond, please keep it in the context of a UNIX standards discussion, as Chip has mostly done here. Thanks. --Fletcher ]

Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to nick@bis.com (Nick Bender):
>Any program which assumes that write(2) always works is broken. Period.

True.

>In my view this is not a reason to call NFS a botch.

Also true ... but the possible failure of write() wasn't my reason.

NFS is an interesting and occasionally useful service. However, it does not provide UNIX filesystem semantics. In particular, given appropriate permissions, link() and mkdir() on a UNIX filesystem are guaranteed to succeed exactly once. On an NFS mount, however, they may report failure even after having succeeded. Also, the vaunted "advantage" of NFS, its statelessness, goes out the window as soon as you want to lock a file. Finally, NFS does not permit access to remote special files such as devices and named pipes.

Yes, Virginia, NFS is a botch.

So what is the relevance of NFS's dain bramage to this newsgroup? Simply that NFS is not POSIX compliant. Therefore, using NFS as an example of how the namespace is supposedly almost useless is nothing more than a straw man. If a person wants to knock remote UNIX filesystems, let him try to knock reasonable ones like RFS and AFS.

No, Dan, this article does not imply that network connections don't belong in the filesystem. It means that *if* link() and mkdir() are defined on a UNIX filesystem, they must succeed exactly once. Compare a UNIX system that has mounted a CP/M disk. The CP/M disk format precludes the use of link() and mkdir(), yet the UNIX namespace is quite useful for accessing the files on the disk.
--
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 191
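[ Editor's sketch: the defensive idiom that NFS's retransmitted requests force on applications. On a local UNIX filesystem, where mkdir() with appropriate permissions succeeds exactly once, the extra check is never needed. Illustrative only.

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <errno.h>

    int make_dir(const char *path, mode_t mode)
    {
        struct stat st;

        if (mkdir(path, mode) == 0)
            return 0;                     /* the ordinary UNIX case */
        if (errno == EEXIST && stat(path, &st) == 0 && S_ISDIR(st.st_mode))
            return 0;     /* quite possibly our own mkdir, whose first
                             reply was lost and whose retry said EEXIST */
        return -1;
    }

--Ed. ]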
fouts@bozeman.bozeman.ingr (Martin Fouts) (10/11/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)

>>>>> On 3 Oct 90 17:19:04 GMT, peter@ficc.ferranti.com (Peter da Silva) said:

Peter> In article <13132@cs.utexas.edu> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> One reason to not treat every IPC facility as part of the file system:
> Shared memory IPC mechanisms which don't need to be visible to
> processes not participating in the IPC.

Peter> Provide an example, considering the advantages that shell-level
Peter> visibility of objects has for (a) debugging, (b) system administration,
Peter> (c) integration, (d)...

Short-persistence IPC mechanisms found in multithreaded shared memory implementations consist of a small region of memory and a lock guarding that region. Producer/consumer parallelism using this mechanism does not need to be visible. Effectively, this is the shared memory equivalent of an unnamed pipe.

a) Debugging is handled by the process debugger, not by the shell, and has the same visibility as any other memory resident data.

b) There is no system administration, since the objects have exactly process duration with the same termination semantics as a pipe, in that termination of any of the processes is usually catastrophic.

c) I'm not sure what integration support would benefit from making a short duration object visible.

d) ....
--
Martin Fouts

UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303

Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa

Volume-Number: Volume 21, Number 196
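[ Editor's sketch: the kind of object Martin means -- an anonymous shared region plus a lock, inherited across fork() and visible to nothing else on the system. The sketch uses mmap(MAP_ANONYMOUS) and a process-shared POSIX.4-style semaphore, facilities that postdate or were still being defined during this discussion; treat it purely as an illustration of the "unnamed pipe of shared memory" idea.

    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <semaphore.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdio.h>

    struct mailbox {
        sem_t full;             /* posted by the producer when data is ready */
        char  data[64];
    };

    int main(void)
    {
        struct mailbox *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        if (m == MAP_FAILED)
            return 1;
        sem_init(&m->full, 1, 0);          /* 1 => shared between processes */

        if (fork() == 0) {                 /* child: producer */
            strcpy(m->data, "hello through the unnamed region");
            sem_post(&m->full);
            _exit(0);
        }

        sem_wait(&m->full);                /* parent: consumer */
        printf("%s\n", m->data);
        wait(NULL);
        return 0;
    }

--Ed. ]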
peter@ficc.ferranti.com (Peter da Silva) (10/12/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <13442@cs.utexas.edu> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> Short-persistence IPC mechanisms found in multithreaded shared memory
> implementations consist of a small region of memory and a lock guarding
> that region. Producer/consumer parallelism using this mechanism does
> not need to be visible. Effectively, this is the shared memory
> equivalent of an unnamed pipe.

Effectively, this *is* shared memory. And shared memory has proven itself to be a viable candidate for insertion into the name space.

I didn't say that every application of an IPC mechanism should have its own entry in the name space. Creating a file for each element in a shared memory region makes about as much sense as creating a file for each message in a pipe. But the region itself should be visible from the outside.
--
Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com

Volume-Number: Volume 21, Number 201
fouts@bozeman.bozeman.ingr (Martin Fouts) (10/13/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)

>>>>> On 4 Oct 90 20:39:37 GMT, chip@tct.uucp (Chip Salzenberg) said:

Chip> According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>One reason to not treat every IPC facility as part of the file system:
>Shared memory IPC mechanisms which don't need to be visible to processes
>not participating in the IPC.

Chip> Yes, it is obviously desirable to have IPC entities without names.
Chip> This feature is a simple extension of the present ability to keep a
Chip> plain file open after its link count falls to zero. Of course, the
Chip> committee could botch the job by making it an error to completely
Chip> unlink a live IPC.
Chip> --

Of course, if I have to acquire a file handle for my IPC, I can't implement it as efficiently as if I just do it locally in shared memory and don't bother the system about its existence.

Marty
--
Martin Fouts

UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303

Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa

Volume-Number: Volume 21, Number 205
chip@tct.uucp (Chip Salzenberg) (10/19/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

[ Is it my imagination, or is this thread getting stale? Oh well. I think John will be back soon. He can decide. --Fletcher ]

According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>Of course, if I have to acquire a file handle for my IPC, I can't
>implement it as efficiently as if I just do it locally in shared memory
>and don't bother the system about its existence.

Well, if the system doesn't know about it, then it's not a system IPC facility. If, however, the system does know about it, then it has to have a handle, which might as well be a small integer -- i.e. a file descriptor.
--
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>
"I've been cranky ever since my comp.unix.wizards was removed
by that evil Chip Salzenberg." -- John F. Haugh II

Volume-Number: Volume 21, Number 208