jsh@usenix.org (Jeffrey S. Haemer) (10/21/89)
From: Jeffrey S. Haemer <jsh@usenix.org>

An Update on UNIX* and C Standards Activities
September 1989

USENIX Standards Watchdog Committee
Jeffrey S. Haemer, Report Editor

IEEE 1003.4: Real-time Extensions Update

John Gertwagen <jag@laidbak> reports on the July 10-14, 1989 meeting in San Jose, California:

The P1003.4 meeting in San Jose was very busy.  The meeting focused on resolving mock-ballot objections and comments.  Despite limited resources for documenting changes, a lot of work got done.  Here's what stood out.

Shared memory

The preferred interface falls somewhere between a shared-memory-only interface and a mapped-files interface, such as AIX's mmap(), which allows files to be treated like in-core arrays.  The group's direction was to reduce the functionality to support only shared memory, so long as the resulting interfaces could be implemented as a library over mmap().

Process memory locking

The various region locks were clarified and thus simplified; the old definitions were fuzzy and non-portable.  For those who haven't seen it, there is actually a memory-residency interface (i.e., fetch and store operations that meet some metric) rather than a locking interface.  Most vendors will probably implement it as a lock, but some may want it to impose the highest memory priority in the paging system.

Inter-process communication

Members questioned whether the interface definitions could really support a broader range of requirements; they're like no others in the world today.  Having been designed to meet the real-time group's wish list, they have lots of bells and whistles -- far more than System V IPC -- but it's not clear, for example, that they are network extensible.  Discussions in these areas continue.

__________
* UNIX is a registered trademark of AT&T in the U.S. and other countries.

Events and semaphores

Members were concerned about possible overlap with other mechanisms, especially those being considered for threads.  The question is basically, "Should there be separate functions for different flavors, or a single function with multiple options?"  General sentiment (including our snitch's) seems to be for multiple functions; however, an implementation might choose to make them library interfaces to a common, more general system call.  There is, however, a significant minority opinion the other way.

Scheduling

Many balloters found process lists and related semantics confusing.  An attempt was made to re-cast the wording more strictly in terms of process behavior.

Timers

Inheritance was brought in line with existing (BSD) practice.

Outside of the mock ballot, there were two other major news items.

First, there is a movement afoot to make the .4 interfaces part of 1003.1.  They would become additional chapters and might be voted separately or in logical groups.  This would bring P1003 in line with the ISO model of a base standard plus application profiles; POSIX.4 would become the real-time profile group.  This is a non-trivial change.  Up to now, the criterion for .4 has been "the minimum necessary for real-time", coincidentally extended to support other requirements "where convenient".  That is not a good starting point for a base interface.  For example, mmap(), or something very much like it, is probably the right base for "shared storage objects", but real-time users want an interface for shared memory, not for mapped files.
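[Editor: For readers who haven't seen mmap(), the "shared memory as a library over mmap()" idea looks roughly like the sketch below.  It is illustrative only: it is written with the shm_open()/mmap() calls as POSIX eventually standardized them, which did not exist when this report was filed, and the region name is made up.

    /* Minimal sketch: a named region of shared memory, treated like an
     * in-core array, built on mmap().  Error handling is abbreviated. */
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        int fd;
        char *p;

        fd = shm_open("/rt_region", O_CREAT | O_RDWR, 0600);
        if (fd < 0)
            return 1;
        if (ftruncate(fd, 4096) < 0)            /* size the region */
            return 1;
        p = mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            return 1;
        p[0] = 1;       /* immediately visible to every process mapping
                         * the same name */
        munmap(p, 4096);
        close(fd);
        return 0;
    }

The point of the group's constraint is that an interface like this can be supplied as a library over mmap(), without requiring a separate kernel mechanism.]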
Our snitch worries that things might look a bit different had the group been working on a base standard from the beginning.

Second, the committee officially began work on a threads interface, forming a threads small group and creating a stub chapter in the .4 draft.  A working proposal for the interface, representing the consensus direction of the working group, will be an appendix to the next draft.

A lot of work remains to be done before .4 can go to ballot and the current January '90 target may not be realistic.  If the proposed re-organization occurs, a ballot before the summer of 1990 seems unlikely.

Volume-Number: Volume 17, Number 40
jsh@usenix.org (Jeffrey S. Haemer) (12/07/89)
From: Jeffrey S. Haemer <jsh@usenix.org>

An Update on UNIX* and C Standards Activities
December 1989

USENIX Standards Watchdog Committee
Jeffrey S. Haemer, Report Editor

IEEE 1003.4: Real-time Extensions Update

John Gertwagen <jag@laidbak> reports on the November 13-17, 1989 meeting in Milpitas, CA:

Background

The P1003.4 Real-time Working Group began as the /usr/group technical committee on real-time extensions.  About two years ago, it was chartered by the IEEE to develop minimum extensions to POSIX to support real-time applications.  Over time its scope has expanded, and P1003.4 is now more a set of general interfaces that extend P1003.1 than a specifically real-time standard.  Its current work is intended to support not only real-time, but also database, transaction processing, Ada runtime, and networking environments.  The group is trying to stay consistent with both the rest of POSIX and other common practice outside the real-time domain.

The work is moving quickly.  Though we have only been working for two years, we are now on Draft 9 of the proposed standard, and expect to go out for ballot before the end of the year.  To help keep up this aggressive schedule, P1003.4 made only a token appearance at the official P1003 meeting in Brussels.  The goal of the Milpitas meeting was to get the draft standard ready for balloting.

Meeting Summary

Most of the interface proposals are now relatively mature, so there was a lot of i-dotting and t-crossing, and (fortunately) little substantive change.  The "performance metrics" sections of several interface chapters still need attention, but there has been little initiative in the group to address them, so it looks like the issues will get resolved during balloting.

__________
* UNIX is a registered trademark of AT&T in the U.S. and other countries.

The biggest piece of substantive work was a failed attempt to make the recently introduced threads proposal clean enough to get into the ballot.  The stumbling block is a controversy over how to deal with signals.  There are really two related problems: how to send traditional UNIX/POSIX signals to a multi-threaded process, and how to asynchronously interrupt a thread.

Four options have been suggested: delivering signals to a (somehow) privileged thread, per Draft 8; delivering a signal to whichever thread is unlucky enough to run next, a la Mach; delivering the signal to each thread that declares an interest; and ducking the issue by leaving signal semantics undefined.  We haven't been able to decide among the options yet; the working group is now split evenly.  About half think signal semantics should follow the principle of least surprise, and be fully extended to threads.  The other half think each signal should be delivered to a single thread, and that there should be a separate, explicit mechanism to let threads communicate with one another.

(Personally, I think the full semantics of process signals is extra baggage in the "lightweight" context of threads.  I favor delivering signals to a privileged thread -- either the "first" thread or a designated "agent" -- and providing a separate, lightweight interface for asynchronously interrupting threads.  On the other hand, I would be happy to see threads signal one another in a way that looks, syntactically and semantically, like inter-process signals.  In fact, I think the early, simpler versions of signal() look a lot like what's needed (around V6 or so).
I don't care whether the implementation of "process" and "thread" signals is the same underneath or not.  That decision should be left to the vendor.)

Directions

Draft 9 of P1003.4 is being readied for ballot as this is being written and should be distributed by mid-December.  With a little luck, balloting will be over by the Summer of '90.

Threads is the biggest area of interest in continued work.  The threads chapter will be an appendix to Draft 9, and the balloting group will be asked to comment on the threads proposal as if it were being balloted.  Unless there is a significant write-in effort, the threads interface will probably be treated as a new work item for P1003.4.  Then, if the outstanding issues can be resolved expeditiously, threads could go to ballot as early as April '90.

With the real-time interfaces defined, it looks like the next task of this group will be to create one or more Real-time Application Portability Profiles (RAPPS?) that define how to use the interfaces in important real-time application models.  Agreeing on the application models may be harder than agreeing on the interfaces, but we'll see.

Volume-Number: Volume 17, Number 92
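[Editor: Gertwagen's preferred model -- a designated "agent" thread that fields the process's signals -- is easy to picture.  The sketch below is illustrative only; it is written with pthreads calls that were standardized years after this report, and none of them existed in 1989.

    /* Minimal sketch of the "agent thread" model: every thread blocks
     * the signals of interest, and one dedicated thread waits for them
     * synchronously with sigwait(). */
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>

    static void *signal_agent(void *arg)
    {
        sigset_t *set = arg;
        int sig;

        for (;;) {
            if (sigwait(set, &sig) == 0)    /* only this thread sees them */
                printf("agent handling signal %d\n", sig);
        }
        return 0;
    }

    int main(void)
    {
        sigset_t set;
        pthread_t agent;

        sigemptyset(&set);
        sigaddset(&set, SIGINT);
        sigaddset(&set, SIGTERM);

        /* Block the signals here; threads created afterwards inherit
         * the mask, so none of them is interrupted asynchronously. */
        pthread_sigmask(SIG_BLOCK, &set, 0);
        pthread_create(&agent, 0, signal_agent, &set);

        /* ... create worker threads and do the real work ... */

        pthread_join(agent, 0);
        return 0;
    }

A separate, lighter-weight call, whatever it ends up being named, would then be used for one thread to interrupt another.]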
henry@utzoo.uucp (12/09/89)
From: henry@utzoo.uucp

>From: Jeffrey S. Haemer <jsh@usenix.org>
>[threads vs signals] In fact, I think the early, simpler versions of signal() look a lot like what's needed (around V6 or so)...

Actually, it can be simpler yet, as Waterloo's Thoth system showed.  Subject to some sort of suitable protections (perhaps including a way to ignore signals), when a thread receives a signal, it drops dead.  No signal handlers or blocking.  If you want some sort of recovery action, have another thread waiting for the first one to die: it has access to all the first thread's data, so it can do whatever recovery is appropriate.

Henry Spencer at U of Toronto Zoology
uunet!attcan!utzoo!henry   henry@zoo.toronto.edu

Volume-Number: Volume 17, Number 96
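[Editor: The Thoth model Henry describes maps naturally onto a "watcher" thread that waits for a worker to die and then cleans up.  The sketch below is illustrative only; it uses the pthreads interfaces that were standardized much later.

    /* Minimal sketch: a worker that simply exits on a fatal condition,
     * and a watcher that joins with it and performs recovery. */
    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        /* ... real work; on a fatal condition the thread just dies ... */
        pthread_exit((void *)1);
    }

    static void *watcher(void *arg)
    {
        pthread_t victim = *(pthread_t *)arg;
        void *status;

        pthread_join(victim, &status);          /* wait for it to die */
        if (status != 0)
            fprintf(stderr, "worker died; running recovery\n");
        /* The watcher shares the worker's address space, so it can
         * inspect the worker's data structures and clean up. */
        return 0;
    }

    int main(void)
    {
        pthread_t w, mon;

        pthread_create(&w, 0, worker, 0);
        pthread_create(&mon, 0, watcher, &w);
        pthread_join(mon, 0);
        return 0;
    }

There are no signal handlers and no blocking in the worker at all; the recovery policy lives entirely in the watcher.]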
jsh@usenix.org (Jeffrey S. Haemer) (08/22/90)
From: Jeffrey S. Haemer <jsh@usenix.org>

An Update on UNIX*-Related Standards Activities
August, 1990

USENIX Standards Watchdog Committee
Jeffrey S. Haemer <jsh@usenix.org>, Report Editor

IEEE 1003.4: Real-time Extensions

Rick Greer <rick@ism.isc.com> reports on the July 16-20 meeting in Danvers, Massachusetts:

Most of the action in the July dot four meeting centered around -- you guessed it -- threads.  The current threads draft (1003.4a) came very close to going to ballot.  An overwhelming majority of those present voted to send the draft to ballot, but there were enough complaints from the dot fourteen people (that's multiprocessing -- MP) about the scheduling chapter to hold it back for another three months.  Volunteers from dot fourteen have agreed to work on the scheduling sections so that the draft can go to ballot after the next meeting, in October.

Actually, dot four voted to send the draft to ballot after that meeting in any case, and the resolution was worded in such a way that a consensus would be required to prevent the draft from going to ballot in October.  Thus, the MP folks have this one final chance to clean up the stuff that's bothering them -- if it isn't fixed by October, it will have to be fixed in balloting.

Some of us in dot fourteen felt the best way to fix the thread scheduling stuff was via ballot objection anyway.  Unfortunately, the threads balloting group is now officially closed, and a number of MP people saw this as their last chance to make a contribution to the threads effort.  The members of dot fourteen weren't the only ones taken by surprise by the closure of the threads balloting group.  It seems that many felt it would be a cold day in hell before threads made it to ballot, and weren't following the progress of dot four all that closely.  [Editor: I've bet John Gertwagen a beer that threads will finish balloting before the rest of dot four.  The bet's a long way from being decided, but I still think I'll get my beer.]

Meanwhile, the ballot resolution process continues for the rest of dot four, albeit rather slowly.  There are a number of problems here, the biggest being lack of resources.  In general, people would much rather argue about threads than deal with the day-to-day grunt work associated with the IEEE standards process.  [Editor: The meeting will be in Seattle, Washington.  Go.  Be a resource.]  Many of the technical reviewers have yet to get started on their sections.

__________
* UNIX is a Registered Trademark of UNIX System Laboratories in the United States and other countries.

Nevertheless, proposed resolutions to a number of objections were presented and accepted at the Danvers meeting.  [Editor: Rick is temporarily unavailable, but Simon Patience of the OSF has kindly supplied these examples:

The resolved objections were taken from the CRB: replacing the event mechanism by ``fixed'' signals, replacing the shared memory mechanism by mmap(), and removing semaphore handles from the file system name space.

Replacing events by signals was accepted; no problem here.

Adopting mmap() got a mixed reception, partly because some people thought you had to take all of mmap(), where a subset might suffice.  The final vote on this was not to ask the reviewer to adopt mmap(), which may not satisfy the objectors.  I'd guess the balloting group will eventually hold sway here!
Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.  The reviewer was sent off to rethink the implications of this suggestion.]

We should be seeing a lot more of this in Seattle.

Similar comments apply to the real-time profile (AEP) work.  The AEP group made more progress over the last three months than the technical reviewers did, although even that (the AEP progress) was less than I'd hoped.  We're expecting our first official AEP draft in October.

Volume-Number: Volume 21, Number 50
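[Editor: The compromise being discussed -- semaphore names that look like pathnames, but handles that are not file descriptors -- is close to what POSIX.4 eventually adopted.  The sketch below is illustrative only; the calls shown were not yet standardized when this report was filed, and the semaphore name is made up.

    /* Minimal sketch: a named semaphore whose name is pathname-like,
     * but whose handle is a sem_t *, not a file descriptor. */
    #include <semaphore.h>
    #include <fcntl.h>

    int main(void)
    {
        sem_t *sem;

        /* The name need not correspond to anything in the file system. */
        sem = sem_open("/printer_lock", O_CREAT, 0600, 1);
        if (sem == SEM_FAILED)
            return 1;

        sem_wait(sem);          /* P */
        /* ... critical section ... */
        sem_post(sem);          /* V */

        sem_close(sem);
        sem_unlink("/printer_lock");
        return 0;
    }

Whether such names should be required to live in the file system name space is exactly the argument that follows.]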
peter@ficc.ferranti.com (peter da silva) (08/24/90)
From: peter@ficc.ferranti.com (peter da silva) My personal opinion is that *anything* that can go into the file system name space *should*. That's what makes UNIX UNIX... that it's all visible from the shell... --- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com Volume-Number: Volume 21, Number 57
chip@tct.uucp (Chip Salzenberg) (08/28/90)
From: chip@tct.uucp (Chip Salzenberg)

> Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.

Aargh!  Almost everyone realizes that System V IPC is a botch, largely because it doesn't live in the filesystem.  So what does IEEE do?  They take IPC out of the filesystem!

What sane reason could there be to introduce Yet Another Namespace?
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 65
sp@mysteron.osf.org (Simon Patience) (08/28/90)
From: sp@mysteron.osf.org (Simon Patience)

In article <467@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes:
>From: chip@tct.uucp (Chip Salzenberg)
>
>> Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.
>
>Aargh!  Almost everyone realizes that System V IPC is a botch, largely because it doesn't live in the filesystem.  So what does IEEE do?  They take IPC out of the filesystem!
>
>What sane reason could there be to introduce Yet Another Namespace?

The reason for semaphores not being in the file system is twofold.  First, some realtime embedded systems do not have a file system but do want semaphores.  This allows them to have semaphores without having to bring in the baggage a file system would entail.  Secondly, as far as threads, which are supposed to be lightweight, are concerned, it allows semaphores to be implemented in user space rather than forcing them into the kernel for the sake of the file system.

A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.  If you have IPC handles in the file system, then two machines whose applications are trying to communicate would also have to share at least part of their file system name space.  This is non-trivial to arrange for two machines, so you can imagine the problem of doing it for 100 (or 1000?) machines.

I am just the messenger for these views and do not necessarily hold them myself.  They were the reasons that came up during the discussion.

Simon.

Simon Patience                          Phone: (617) 621-8736
Open Software Foundation                FAX:   (617) 225-2782
11 Cambridge Center                     Email: sp@osf.org
Cambridge MA 02142                             uunet!osf.org!sp

Volume-Number: Volume 21, Number 68
Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) (08/29/90)
From: Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips)

>>>>> On 28 Aug 90 11:58:40 GMT, sp@mysteron.osf.org (Simon Patience) said:

>> Finally, the group accepted abandoning the use of file descriptors for semaphore handles, but some participants wanted to keep semaphore names pathnames.
>>
>Aargh!  Almost everyone realizes that System V IPC is a botch, largely because it doesn't live in the filesystem.  So what does IEEE do?  They take IPC out of the filesystem!
>
>What sane reason could there be to introduce Yet Another Namespace?

Simon> The reason for semaphores not being in the file system is twofold.  Some realtime embedded systems do not have a file system but do want semaphores...

Simon> A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.

How about adding non-file-system-based "handles" to an mmap-like interface?  (e.g. shmmap(host,porttype,portnum,addr,len,prot,flags)?)  This could allow the same interface to be used for network and non-network IPC, without the overhead of a trap for every non-network IPC transaction.

`Scuse me while I don my flame retardant suit...  :-)

#include <std/disclaimer.h>
--
Chuck Phillips  MS440
NCR Microelectronics                    Chuck.Phillips%FtCollins.NCR.com
2001 Danfield Ct.
Ft. Collins, CO. 80525                  uunet!ncrlnk!ncr-mpd!bach!chuckp

Volume-Number: Volume 21, Number 72
chip@tct.uucp (Chip Salzenberg) (08/30/90)
From: chip@tct.uucp (Chip Salzenberg)

According to sp@mysteron.osf.org (Simon Patience):
>Some realtime embedded systems do not have a file system but do want semaphores.  So this allows them to have them without having to bring in the baggage a file system would entail.

I was under the impression that POSIX was designing a portable Unix interface.  Without a filesystem, you don't have Unix, do you?  Besides, a given embedded system's library could easily emulate a baby-simple filesystem.

>Secondly, as far as threads, which are supposed to be light weight, are concerned it allows semaphores to be implemented in user space rather than forcing them into the kernel for the file system.

The desire for user-space support indicates to me that there should be some provision for non-filesystem (anonymous) IPCs that can be created and used without kernel intervention.  This need does not reduce the desirability of putting global IPCs in the filesystem.

>A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.

Filesystem entities can be used to trigger network activity by the kernel (or its stand-in), even if they do not reside on shared filesystems.
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 74
preece@urbana.mcd.mot.com (Scott E. Preece) (08/30/90)
From: preece@urbana.mcd.mot.com (Scott E. Preece)

| From: sp@mysteron.osf.org (Simon Patience)
| The reason for semaphores not being in the file system is twofold.  Some realtime embedded systems do not have a file system but do want semaphores.  So this allows them to have them without having to bring in the baggage a file system would entail.
---
I don't care whether they have something that looks like UNIX filesystem code or not, or whether they have disk drives or not, but I don't think it's unreasonable to require them to handle semaphore names as though they were in a filesystem namespace.  Otherwise you're going to end up with a collection of independent features, each minimally specified so that it can work without assuming anything else, and anyone with any sense is going to say "Yuck" and use a real operating system that provides reasonable integration and a uniform notion of, among other things, naming.
---
| ... Secondly, as far as threads, which are supposed to be light weight, are concerned it allows semaphores to be implemented in user space rather than forcing them into the kernel for the file system.
---
Eh?  I don't know what the group has proposed since the ballot, but it would seem that using a filesystem name only makes a difference when you first specify you're going to be looking at a particular semaphore, which shouldn't be a critical-path event.  After that you use a file descriptor, which I think could be handled in user space about as well as anything else.  In either case you're going to have to go to the kernel when scheduling is required (when you block or when you release the semaphore).
---
| A good reason for *not* having IPC handles in the file system is to allow network IPC to use the same interfaces.  If you have IPC handles in the file system then two machines whose applications are trying to communicate would also have to share at least part of their file system name space.  This is non-trivial to arrange for two machines, so you can imagine the problem of doing it for 100 (or 1000?) machines.
---
You're going to have to synchronize *some* namespace anyway; why shouldn't it be a piece of the filesystem namespace?  A consistent approach to naming and name resolution for ALL global objects should be one of the basic requirements for any new POSIX (or UNIX!) functionality.  We should have *one* namespace, so that we can write general tools that only need to know about one namespace.
--
scott preece
motorola/mcd urbana design center       1101 e. university, urbana, il 61801
uucp: uunet!uiucuxc!udc!preece          arpa: preece@urbana.mcd.mot.com

Volume-Number: Volume 21, Number 75
kingdon@ai.mit.edu (Jim Kingdon) (08/31/90)
From: kingdon@ai.mit.edu (Jim Kingdon) One obvious (if a little wishy-washy) solution is to not specify whether the namespaces are the same. That is, applications are required to use a valid path, and have to be prepared for things like unwritable directories, but implementations are not required to check for those things. This makes sense in light of the fact that there seems to be a general lack of consensus about which is best. Even though there is existing practice for both ways of doing things, it may be premature to standardize either behavior now. Volume-Number: Volume 21, Number 76
edj@trazadone.westford.ccur.com (Doug Jensen) (08/31/90)
From: Doug Jensen <edj@trazadone.westford.ccur.com>

1003.13 is working on real-time AEPs, including one, for small embedded real-time systems, that does not include a file system.  So the POSIX answer is yes: without the filesystem you can still have a POSIX-compliant interface.

Doug Jensen
Concurrent Computer Corp.
edj@westford.ccur.com

Volume-Number: Volume 21, Number 78
fouts@bozeman.bozeman.ingr (Martin Fouts) (09/05/90)
From: fouts@bozeman.bozeman.ingr (Martin Fouts)
>>>>> On 24 Aug 90 03:28:06 GMT, peter@ficc.ferranti.com (peter da silva) said:
peter> My personal opinion is that *anything* that can go into the file system name
peter> space *should*. That's what makes UNIX UNIX... that it's all visible from the
peter> shell...
I'm not sure which Unix you've been running for the past five or more
years, but a lot of stuff doesn't live in the file system name space
under various BSD derived systems, nor do the networking types believe
it belongs there. IMHO neither does a process handle, nor a
semaphore, and don't even talk to me about "named pipes" as an IPC
mechanism.
(Gee, I guess reasonable men might differ on what belongs in the name
space ;-)
Marty
--
Martin Fouts
UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303
Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa
Volume-Number: Volume 21, Number 83
gwyn@smoke.brl.mil (Doug Gwyn) (09/07/90)
From: Doug Gwyn <gwyn@smoke.brl.mil>

In article <488@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
>I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space under various BSD derived systems, nor do the networking types believe it belongs there.

Excuse me, but the "networking types" I talk to believe that sockets were a botch and that network connections definitely DO belong within a uniform UNIX "file" name space.  Peter was quite right to note that this is an essential feature of UNIX's design.  In fact there are UNIX implementations that do this right; 4BSD is simply not among them yet.

Volume-Number: Volume 21, Number 85
peter@ficc.ferranti.com (peter da silva) (09/07/90)
From: peter da silva <peter@ficc.ferranti.com>

In article <488@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> > My personal opinion is that *anything* that can go into the file system name space *should*.  That's what makes UNIX UNIX... that it's all visible from the shell...
> I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space under various BSD derived systems,

Yes, and there's even more stuff in System V that doesn't live in that name space.  In both cases it's *wrong*.

> nor do the networking types believe it belongs there.

Some more details on this subject would be advisable.  I'm aware that not everything *can* go in the file system name space, by the way...

> IMHO neither does a process handle, nor a semaphore, and don't even talk to me about "named pipes" as an IPC mechanism.

An active semaphore can be implemented any way you want, but it should be represented by an entry in the name space.  The same goes for process handles and so on.

Named pipes are an inadequate mechanism for much IPC, but they work quite well for many simple cases.  If you're looking at them as some sort of paragon representing the whole concept, you're sadly mistaken.

Anyway... what is it that makes "dev/win" more worthy of having an entry in "/dev" than "dev/socket"?
--
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 87
chip@tct.uucp (Chip Salzenberg) (09/07/90)
From: chip@tct.uucp (Chip Salzenberg)

According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space ...

The absence of sockets (except UNIX domain), System V IPC, etc. from the file system is, in the opinion of many, a bug.  It is a result of Unix being extended by people who do not understand Unix.

Research Unix, which is the result of continued development by the creators of Unix, did not take things out of the filesystem.  To the contrary, it put *more* things there, including processes (via the /proc pseudo-directory).

It is true that other operating systems get along without devices, IPC, etc. in their filesystems.  That's fine for them; but it's not relevant to Unix.  Unix programming has a history of relying on the filesystem to take care of things that other systems handle as special cases -- devices, for example.  The idea that devices can be files but TCP/IP sockets cannot runs counter to all Unix experience.

The reason why I continue this discussion here, in comp.std.unix, is that many Unix programmers hope that the people in the standardization committees have learned from the out-of-filesystem mistake, and will rectify it.
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 89
swart@src.dec.com (Garret Swart) (09/08/90)
From: swart@src.dec.com (Garret Swart)

I believe in putting lots of interesting stuff in the file system name space, but I don't believe that semaphores belong there.  The reason I don't want to put semaphores in the name space is the same reason I don't want to put my program variables in the name space: I want to have lots of them, I want to create and destroy them very quickly, and I want to operate on them even more quickly.  In other words, the granularity is wrong.

The purpose of a semaphore is to synchronize actions on an object.  What kinds of objects might one want to synchronize?  Generally the objects are either OS supplied, like devices or files, or user defined data structures.  The typical way of synchronizing files and devices is to use advisory locks or the "exclusive use" mode on the device.  The more difficult case, and the one for which semaphores were invented and later added to Unix, is that of synchronizing user data structures.

In Unix, user data structures may live either in a process's private memory or in a shared memory segment.  In both cases there are probably many different data structures in that memory, and many of these data structures may need to be synchronized.  For maximum concurrency the programmer may wish to synchronize each data structure with its own semaphore.  In many applications these data structures may come and go very quickly, and the expense of creating a semaphore to synchronize the data can be an important factor in the performance of the application.

It thus seems more natural to allow semaphores to be efficiently allocated along with the data that they are designed to synchronize.  That is, allow them to be allocated in a process's private address space or in a mapped shared memory segment.  A shared memory segment is a much larger grain object: creating, destroying and mapping one can be much more expensive than creating, destroying or using a semaphore, and these segments are generally important enough to the application to have sensible names.  Thus putting a shared memory segment in the name space seems reasonable.

For example, a data base library may use a shared memory segment named /usr/local/lib/dbm/personnel/bufpool to hold the buffer pool for the personnel department's data base.  The data base library would map the buffer pool into each client's address space, allowing many data base client programs to efficiently access the data base.  Each page in the buffer pool and each transaction would have its own set of semaphores used to synchronize access to the page in the pool or the state of a transaction.  Giving the buffer pool a name is no problem, but giving each semaphore a name is much more of a hassle.

[Aside: Another way of structuring such a data base library is as an RPC-style multi-threaded server.  This allows access to the data base from remote machines and allows easier solutions to the security and failure problems inherent in the shared memory approach.  However, the shared memory approach has a major performance advantage for systems that do not support ultra-fast RPCs.  Another approach is to run the library in an inner mode.  (Unix has one inner mode called the kernel, VMS has 3, Multics had many.)  This solves the security and failure problems of the shared segments, but it is generally difficult for mere mortals to write their own inner mode libraries.]
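[Editor: Swart's "allocate the semaphore with the data it protects" approach is easy to picture with the unnamed-semaphore and mmap() interfaces that were standardized later.  The sketch below is illustrative only, not his code; the segment name is made up.

    /* Minimal sketch: an unnamed semaphore embedded in the data
     * structure it protects, inside a mapped shared segment. */
    #include <semaphore.h>
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>

    struct buf {
        sem_t lock;             /* lives right next to the data */
        char  page[8192];
    };

    int main(void)
    {
        int fd;
        struct buf *b;

        fd = shm_open("/dbm_bufpool", O_CREAT | O_RDWR, 0600);
        if (fd < 0)
            return 1;
        ftruncate(fd, sizeof(struct buf));

        b = mmap(0, sizeof(struct buf), PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (b == MAP_FAILED)
            return 1;

        /* Creating the semaphore is just initializing a word in the
         * segment; only the process that creates the segment does this. */
        sem_init(&b->lock, 1, 1);       /* 1 => shared between processes */

        sem_wait(&b->lock);
        /* ... touch b->page ... */
        sem_post(&b->lock);
        return 0;
    }

The named segment carries the weight of living in a name space; the individual semaphores do not have to.]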
One other issue that may cause one to want to unify all objects in the file system -- at least at the level of using file descriptors to refer to all objects, if not going so far as to put all objects in the name space -- is the fact that single-threaded programming is much nicer if there is a single primitive that will wait for ANY event that the process may be interested in (e.g. the 4.2BSD select call).  Such a call lets one write a single-threaded program that doesn't busy-wait when it has nothing to do, but also won't block when an event of interest has occurred.  With the advent of multi-threaded programming the single multi-way wait primitive is no longer needed; instead one can create a separate thread for each event of interest, each blocking on its event and processing it.  Multi-way waiting remains an issue, though, if single-threaded programs are to get maximum use out of the facility.

I've spoken to a number of people in 1003.4 about these ideas.  I am not sure whether they played any part in the group's decision.

Just to prove that I am a pro-name-space kind of guy: I am currently working on and using an experimental file system called Echo that integrates the Internet Domain name service for access to global names; our internal higher-performance name service for highly available naming of arbitrary objects; our experimental fault-tolerant, log-based, distributed file service with read/write consistency and universal write-back for file storage; and auto-mounting NFS for accessing other systems.  Objects that are named in our name space currently include hosts, users, groups, network servers, network services (a fault-tolerant network service is generally provided by several servers), and every version of any source or object file known by our source code control system.  Some of these objects are represented in the name space as a directory with auxiliary information, mount points or files stored underneath.  This subsumes much of the use of special files like /etc/passwd, /etc/services and the like in traditional Unix.  Processes are not currently in the name space, but they will/should be.  (Just a "simple matter of programming.")

For example, /-/com/dec/src/user/swart/home/.draft/6.draft is the name of the file I am currently typing, /-/com/dec/src/user/swart/shell is a symbolic link to my shell, and /-/com/dec/prl/perle/nfs/bin/ls is the name of the "ls" program on a vanilla Ultrix machine at DEC's Paris Research Lab.  [Yes, I know we are using "/-/" as the name of the super root and not either "/../" or "//" as POSIX mandates, but those other strings are so uhhgly, and /../ is especially misleading in a system with multiple levels of super root; e.g. on my machine "cd /; pwd" types /-/com/dec/src.]

Things that we don't put in the name space are objects that are passed within or between processes by "handle" rather than by name.  For example, pipes created with the pipe(2) call need not be in the name space.  [At a further extreme, pipes for intra-process communication don't even involve calling the kernel.]  I personally don't believe in overloading file system operations on objects for which the meaning is tenuous (e.g. "unlink" => "kill -TERM" on objects of type process); we tend to define new operations for manipulating objects of a new type.  But that is even more of a digression than I wanted to get into!

Sorry for the length of this message; I seem to have gotten carried away.
Happy trails,

Garret Swart
DEC Systems Research Center
130 Lytton Avenue
Palo Alto, CA 94301
(415) 853-2220
decwrl!swart.UUCP or swart@src.dec.com

Volume-Number: Volume 21, Number 91
gumby@Cygnus.COM (David Vinayak Wallace) (09/08/90)
From: gumby@Cygnus.COM (David Vinayak Wallace)

   Date: 7 Sep 90 15:23:19 GMT
   From: chip@tct.uucp (Chip Salzenberg)

   [Most of quoted message deleted. -mod]

   It is true that other operating systems get along without devices, IPC, etc. in their filesystems.  That's fine for them; but it's not relevant to Unix.  Unix programming has a history of relying on the filesystem to take care of things that other systems handle as special cases -- devices, for example....

What defines `true Unix'?  Don't forget that Multics had all this and more in the filesystem; this stuff was REMOVED when Unix was written.  Is this `continued development by the creators of Unix' just going back to what Unix rejected 20 years ago?

Or, for a pun for Multics fans: what goes around comes around...

Volume-Number: Volume 21, Number 92
peter@ficc.ferranti.com (Peter da Silva) (09/08/90)
From: peter@ficc.ferranti.com (Peter da Silva)

Other operating systems have learned from UNIX in this respect, in fact!  AmigaOS puts all manner of interesting things in the file name space, including pipes (PIPE:name), windows (CON:Left/Top/Width/Height/Title/Flags), and the environment (ENV:varname).  Other things have been left out but are being filled in by users (it's relatively easy to write device handlers on AmigaOS).

There are some really odd things like PATH:.  This can be opened as a file, in which case it looks like a list of directory names, or used as a directory, in which case it looks like the concatenation of all the named directories.
--
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 93
jfh@rpp386.cactus.org (John F. Haugh II) (09/11/90)
From: jfh@rpp386.cactus.org (John F. Haugh II)

In article <497@usenix.ORG> swart@src.dec.com (Garret Swart) writes:
>I believe in putting lots of interesting stuff in the file system name space but I don't believe that semaphores belong there.  The reason I don't want to put semaphores in the name space is the same reason I don't want to put my program variables in the name space: I want to have lots of them, I want to create and destroy them very quickly and I want to operate on them even more quickly.  In other words, the granularity is wrong.

There is no requirement that you bind every semaphore handle to a file system name -- only that it be possible to take a semaphore handle and create a file system name for it, or to take a file system name and retrieve a semaphore handle.  This would permit you to rapidly create and destroy semaphores for private use, as well as provide an external interface for public use.

There is no restriction in either case on the speed at which you can perform operations on the handle -- file descriptors are associated with file system name entries in many cases, and I've not seen anyone complain that file descriptors slow the system down.
--
John F. Haugh II                        UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                 Domain: jfh@rpp386.cactus.org
"SCCS, the source motel!  Programs check in and never check out!"
                -- Ken Thompson

Volume-Number: Volume 21, Number 96
ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe) (09/11/90)
From: ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe)

In article <497@usenix.ORG>, swart@src.dec.com (Garret Swart) writes:
> I believe in putting lots of interesting stuff in the file system name space but I don't believe that semaphores belong there.  The reason I don't want to put semaphores in the name space is the same reason I don't want to put my program variables in the name space: I want to have lots of them, I want to create and destroy them very quickly and I want to operate on them even more quickly.  In other words, the granularity is wrong.

So why not choose a different granularity?  Have the thing that goes in the file system name space be an (extensible) *array* of semaphores.  To specify a semaphore, one would use a (descriptor, index) pair.  To create a semaphore in a semaphore group, just use it.  If you want to have a semaphore associated with a data structure in mapped memory, just use a lock on the appropriate byte range of the mapped file.

(Am I hopelessly confused, or aren't advisory record locks *already* equivalent to binary semaphores?  Trying to lock a range of bytes in a file is just a multi-wait, no?  Why do we need two interfaces?  I can see that two or more _implementations_ behind the interface might be a good idea, but that's another question.)
--
Heuer's Law:  Any feature is a bug unless it can be turned off.

Volume-Number: Volume 21, Number 97
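[Editor: For concreteness, an advisory record lock can indeed be used as a blocking binary semaphore, and a single file gives an extensible array of them, one per byte.  The sketch below is illustrative only; the lock-file name is made up, and record locks have quirks of their own (they are per-process and evaporate when the process exits or closes the file).

    /* Minimal sketch: byte n of a lock file used as binary semaphore n. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    static void sem_op(int fd, off_t n, short type)
    {
        struct flock fl;

        memset(&fl, 0, sizeof fl);
        fl.l_type   = type;         /* F_WRLCK to acquire, F_UNLCK to release */
        fl.l_whence = SEEK_SET;
        fl.l_start  = n;            /* byte n names semaphore n */
        fl.l_len    = 1;
        fcntl(fd, F_SETLKW, &fl);   /* F_SETLKW blocks: a waiting P() */
    }

    int main(void)
    {
        int fd = open("/tmp/semgroup", O_RDWR | O_CREAT, 0600);

        sem_op(fd, 42, F_WRLCK);    /* P on semaphore 42 of this group */
        /* ... critical section ... */
        sem_op(fd, 42, F_UNLCK);    /* V */
        close(fd);
        return 0;
    }
]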
chip@tct.uucp (Chip Salzenberg) (09/12/90)
From: chip@tct.uucp (Chip Salzenberg) According to gumby@Cygnus.COM (David Vinayak Wallace): >Is this `continued development by the creators of Unix' just going >back to what Unix rejected 20 years ago? They threw away what wouldn't fit. Then they added features, but piece by piece, and only as they observed a need. This cycle has started again with Plan 9, which borrows heavily from Unix -- almost everything lives in the filesystem -- but which is in fact a brand new start. Unix owes much to Multics, and we can learn from it, but we needn't be driven by it. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 102
fouts@bozeman.bozeman.ingr (Martin Fouts) (09/18/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)

>>>>> On 7 Sep 90 15:23:19 GMT, chip@tct.uucp (Chip Salzenberg) said:

Chip> According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>I'm not sure which Unix you've been running for the past five or more years, but a lot of stuff doesn't live in the file system name space ...

Chip> The absence of sockets (except UNIX domain), System V IPC, etc. from the file system is, in the opinion of many, a bug.  It is a result of Unix being extended by people who do not understand Unix.

My aren't we superior.  (;-)  At one time, I believed that sockets belonged in the filesystem name space.  I spent a long time arguing this point with members of the networking community before they convinced me that certain transient objects do not belong in that name space.  (See below)

Chip> Research Unix, which is the result of continued development by the creators of Unix, did not take things out of the filesystem.  To the contrary, it put *more* things there, including processes (via the /proc pseudo-directory).

The value of proc in the file system is debatable.  Certain debugging tools are easier to hang on an fcntl; certain others are not.  However, the presence of the proc file system is not a strong argument for the inclusion of other features in the file system.

Chip> It is true that other operating systems get along without devices, IPC, etc. in their filesystems.  That's fine for them; but it's not relevant to Unix.  Unix programming has a history of relying on the filesystem to take care of things that other systems handle as special cases -- devices, for example.  The idea that devices can be files but TCP/IP sockets cannot runs counter to all Unix experience.

Unix programming has a history of using the filesystem for some things and not using it for others.  For example, I can demonstrate a semantic under which it is possible to put the time of day clock into the file system and reference it by opening a file, say /dev/timeofday.  Each time I read from that file, I would get the current time.  Via fcntls, I could extend this to handle timer functions.  It wasn't done in Unix.  (I've done similar things in other OSs I've designed, though.)  The whole point of the response which you partially quoted was to remind the poster I was responding to that not all functions which might have been placed in the filesystem automatically have been.

Chip> The reason why I continue this discussion here, in comp.std.unix, is that many Unix programmers hope that the people in the standardization committees have learned from the out-of-filesystem mistake, and will rectify it.

The reason I respond is that it is not automatically safe to assume that something belongs in the file system because something else is already there.

There is also an explicit problem not mentioned in this discussion, which is the distinction between filesystem name space and filesystem semantics.  Sometimes there are objects which would be reasonable to treat with filesystem semantics for which there is no reasonable mechanism for introducing them into the filesystem name space.  Because of the way network connections are made, I have been convinced by networking experts (who are familiar with the "Unix style") that the filesystem namespace does not have a good semantic match for the network name space.
Marty
--
Martin Fouts
UUCP:  ...!pyramid!garth!fouts  (or)  uunet!ingr!apd!fouts
ARPA:  apd!fouts@ingr.com
PHONE: (415) 852-2310            FAX: (415) 856-9224
MAIL:  2400 Geng Road, Palo Alto, CA, 94303

Moving to Montana;  Goin' to be a Dental Floss Tycoon.
  -  Frank Zappa

Volume-Number: Volume 21, Number 114
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/19/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <523@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> At one time, I believed that sockets belonged in the filesystem name space.  I spent a long time arguing this point with members of the networking community before they convinced me that certain transient objects do not belong in that name space.

In contrast, I've found it quite easy to get people to agree that practically every object should be usable as an open *file*.  The beauty and power of UNIX is the abstraction of files---not filesystems.  I'd say that the concept of an open file descriptor is one of the most important reasons that UNIX-style operating systems are taking over the world.

chip@tct.uucp (Chip Salzenberg) writes:
> The reason why I continue this discussion here, in comp.std.unix, is that many Unix programmers hope that the people in the standardization committees have learned from the out-of-filesystem mistake, and will rectify it.

I am a UNIX programmer who strongly hopes that standards committees will never make the mistake of putting network objects into the filesystem.  Although the semantics of read() and write() fit network connections perfectly, the semantics of open() most certainly do not.  I will readily support passing network connections as file descriptors.  I will fight tooth and nail to make sure that they need not be passed as filenames.

---Dan

Volume-Number: Volume 21, Number 115
chip@tct.uucp (Chip Salzenberg) (09/20/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>According to chip@tct.uucp (Chip Salzenberg):
>> Research Unix [...] put *more* things [in the filesystem], including processes (via the /proc pseudo-directory).
>
>The value of proc in the file system is debatable.  Certain debugging tools are easier to hang on an fcntl; certain others are not.

With /proc, some things are much easier.  (Getting a list of all active pids, for example.)  Nothing, however, is harder.  A big win.

>However, the presence of the proc file system is not a strong argument for the inclusion of other features in the file system.

I disagree.  I consider it an excellent example of how the designers of Unix realize that all named objects potentially visible to more than one process belong in the filesystem namespace.

>Unix programming has a history of using the filesystem for some things and not using it for others.  For example, I can demonstrate a semantic under which it is possible to put the time of day clock into the file system ...

Of course.  But in the absence of remotely mounted filesystems -- which V7 Unix was not designed to support -- there is only one time of day, so it needs no name.  (I wouldn't be surprised if Plan 9 has a /dev/timeofday, however.)

>... not all functions which might have been placed in the filesystem automatically have.

This observation is correct.  But it is clear that the designers of Research Unix have used the filesystem for everything that needs a name, and they continue to do so.  Their work asks, "Why have multiple namespaces?"  Plan 9 asks the question again, and with a megaphone.

>Because of the way network connections are made, I have been convinced by networking experts (who are familiar with the "Unix style") that the filesystem namespace does not have a good semantic match for the network name space.

Carried to its logical conclusion, this argument would invalidate special files and named pipes, since they also lack a "good semantic match" with flat files.  In fact, the only entities with a "good semantic match" for flat files are -- you guessed it -- flat files.

So, how do we program in such a system?  We use its elegant interface -- or should I say "interfaces"?  Plain files, devices, IPCs, and network connections each have a semantically accurate interface, which unfortunately makes it different from all others.  This is progress?  "Forward into the past!"
--
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 119
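[Editor: The "list of all active pids" point is concrete: on a system with a /proc pseudo-directory it is an ordinary directory scan.  The sketch below is illustrative only; /proc layouts differ from system to system.

    /* Minimal sketch: list active process ids by reading /proc. */
    #include <dirent.h>
    #include <ctype.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *d = opendir("/proc");
        struct dirent *e;

        if (d == 0)
            return 1;
        while ((e = readdir(d)) != 0)
            if (isdigit((unsigned char)e->d_name[0]))   /* entries named by pid */
                printf("%s\n", e->d_name);
        closedir(d);
        return 0;
    }

Without /proc, the same job traditionally means grovelling through kernel memory with ps-style code that is tied to a particular release.]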
chip@tct.uucp (Chip Salzenberg) (09/20/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): >The beauty and power of UNIX is the abstraction of files--- >not filesystems. The filesystem means that anything worth reading or writing can be accessed by a name in one large hierarchy. It means a consistent naming scheme. It means that any entity can be opened, listed, renamed or removed. Both the filesystem and the file descriptor are powerful abstractions. Do not make the mistake of minimizing either one's contribution to the power and beauty of UNIX. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 118
peter@ficc.ferranti.com (Peter da Silva) (09/23/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <523@usenix.ORG> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> My aren't we superior.  (;-)  At one time, I believed that sockets belonged in the filesystem name space.  I spent a long time arguing this point with members of the networking community before they convinced me that certain transient objects do not belong in that name space.  (See below)

You mean things that don't operate like a single bidirectional stream, like pipes?  It's funny that the sockets that *do* behave that way are not in the file system, while UNIX-domain sockets (which have two ends on the local box) are.

> Unix programming has a history of using the filesystem for some things and not using it for others.

UNIX programming has a history of using whatever ad-hoc hacks were needed to get things working.  It's full of evolutionary dead-ends... some of which have been discarded (multiplexed files) and some of which have been patched up and overloaded (file protection bits).  But where things have moved closer to the underlying principles (everything is a file, for example) it's become the better for it.

> Sometimes there are objects which would be reasonable to treat with filesystem semantics for which there is no reasonable mechanism for introducing them into the filesystem name space.

This seems reasonable, but the rest is a pure argument from authority.  Could you repeat these arguments for the benefit of those of us who don't have the good fortune to know these networking experts you speak of?

[ Everyone involved in this discussion, please try to keep it in a technical, not a personal, vein. -mod ]
--
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 127
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/25/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <528@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes:
> According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
> >The beauty and power of UNIX is the abstraction of files---not filesystems.
> Both the filesystem and the file descriptor are powerful abstractions.

On the contrary: Given file descriptors, the filesystem is an almost useless abstraction.

Programs fall into two main classes.  Some (such as diff) take a small, fixed number of filename arguments and treat each one specially.  They become both simpler and more flexible if they instead use file descriptors.  I'll propose multitee as an example of this.  Others (such as sed or compress) take many filenames and perform some action on each file in turn.  They also become both simpler and more flexible if they instead take input and output from a couple of file descriptors, perhaps with a simple protocol for indicating file boundaries.  I'll propose the new version of filterfile as a demonstration of how this can simplify application development.

In both cases, the application need know absolutely nothing about the filesystem.  A few utilities deal with filenames---shell redirection and cat.  A few utilities do the same for network connections---authtcp and attachport.  A few utilities do the same for pipes---the shell's piping.  But beyond these two or three programs per I/O object, the filesystem contributes *nothing* to the vast majority of applications.

There is one notable exception.  Some programs depend on reliable, static, local or virtually local storage, usually for what amounts to interprocess communication.  (login needs /etc/passwd.  cron reads crontab.  And so on.)  This is exactly what filesystems were designed for, and a program that wants reliable, static, local storage is perfectly within its rights to demand the sensible abstraction we call a filesystem.

Most applications that use input and output, though, don't care that it's reliable or static or local.  For them, the filesystem is pointless.  Many of us are convinced that open() and rename() and unlink() and so on are an extremely poor match for unreliable or dynamic or remote I/O.  We also see the sheer uselessness of forcing all I/O into the filesystem.

You must convince us that open() makes sense for everything that might be a file descriptor, and that it provides a real benefit for future applications, before you destroy what we see as the beauty and power of UNIX.

---Dan

Volume-Number: Volume 21, Number 128
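[Editor: The descriptor-only style of filter being argued for is, in the degenerate case, the sketch below -- the program never sees a filename; the shell or some other tool sets up descriptors 0 and 1 before it runs.  Illustrative only; this is not Bernstein's multitee or filterfile.

    /* Minimal sketch: a filter that knows nothing about the filesystem. */
    #include <unistd.h>

    int main(void)
    {
        char buf[8192];
        ssize_t n;

        while ((n = read(0, buf, sizeof buf)) > 0)
            if (write(1, buf, n) != n)
                return 1;
        return n < 0;
    }

The same binary then works unchanged on plain files, pipes, or network connections, which is the point being made.]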
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/25/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <539@usenix.ORG> peter@ficc.ferranti.com (Peter da Silva) writes:
> But where things have moved closer to the underlying principles (everything is a file, for example) it's become the better for it.

The underlying principle is that everything is a file *descriptor*.

> > Sometimes there are objects which would be reasonable to treat with filesystem semantics for which there is no reasonable mechanism for introducing them into the filesystem name space.
> This seems reasonable, but the rest is a pure argument from authority.  Could you repeat these arguments for the benefit of those of us who don't have the good fortune to know these networking experts you speak of?

The filesystem fails to deal with many (most?) types of I/O that aren't reliable, static, and local.  Here's an example: In reality, you initiate a network stream connection in two stages.  First you send off a request, which wends its way through the network.  *Some time later*, the response arrives.  Even if you aren't doing a three-way handshake, you must wait a long time (in practice, up to several seconds on the Internet) before you know whether the open succeeds.

In the filesystem abstraction, you open a filename in one stage.  You can't do anything between initiating the open and finding out whether or not it succeeds.  This just doesn't match reality, and it places a huge restriction on programs that want to do something else while they communicate.

You can easily construct other examples, but one should be enough to convince you that open() just isn't sufficiently general for everything that you might read() or write().

---Dan

Volume-Number: Volume 21, Number 129
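[Editor: The two-stage open Bernstein describes is what a non-blocking connect() looks like with BSD sockets.  The sketch below is illustrative only; the address is made up and error handling is abbreviated.

    /* Minimal sketch: initiate a connection, do other work, then wait
     * for the result. */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/select.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in sa;
        fd_set wfds;
        int s = socket(AF_INET, SOCK_STREAM, 0);

        memset(&sa, 0, sizeof sa);
        sa.sin_family = AF_INET;
        sa.sin_port = htons(25);
        sa.sin_addr.s_addr = inet_addr("10.0.0.1");

        fcntl(s, F_SETFL, fcntl(s, F_GETFL, 0) | O_NONBLOCK);
        if (connect(s, (struct sockaddr *)&sa, sizeof sa) < 0
            && errno != EINPROGRESS)
            return 1;                   /* immediate, local failure */

        /* ... the program is free to do other work here ... */

        FD_ZERO(&wfds);
        FD_SET(s, &wfds);
        select(s + 1, 0, &wfds, 0, 0);  /* writable => the connect finished */
        /* getsockopt(s, SOL_SOCKET, SO_ERROR, ...) tells whether it worked */
        close(s);
        return 0;
    }

Note that no filename appears anywhere in this exchange; whether one *should* is the question under debate.]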
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/25/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <529@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes:
> According to fouts@bozeman.bozeman.ingr (Martin Fouts):
> >However, the presence of the proc file system is not a strong argument for the inclusion of other features in the file system.
> I disagree.  I consider it an excellent example of how the designers of Unix realize that all named objects potentially visible to more than one process belong in the filesystem namespace.

I disagree.  I consider it an excellent example of how the designers of UNIX realize that all *reliable*, *static*, *local* (or virtually local) I/O objects potentially visible to more than one process belong in the filesystem namespace.

/dev/proc, for example, is reliable---there's no chance of arbitrary failure.  It's static---processes have inertia, and stick around until they take the positive action of exit()ing.  And it's local---you don't have an arbitrary delay before seeing the information.  So it's a perfectly fine thing to include in the filesystem without hesitation.

Objects that aren't reliable, or aren't static, or aren't local, also aren't necessarily sensible targets of an open().  Some of them might fit well, but each has to be considered on its own merits.

> So, how do we program in such a system?  We use its elegant interface -- or should I say "interfaces"?  Plain files, devices, IPCs, and network connections each have a semantically accurate interface, which unfortunately makes it different from all others.

The single UNIX interface is the file descriptor.  You can read() or write() reasonable I/O objects through file descriptors.  Very few programs---the shell is a counterexample---need to worry about what it takes to set up those file descriptors.  Very few programs---stty is a counterexample---need to know the ioctl()s or other functions that control the I/O more precisely.  What is your complaint?

---Dan

Volume-Number: Volume 21, Number 136
henry@zoo.toronto.edu (Henry Spencer) (09/25/90)
Submitted-by: henry@zoo.toronto.edu (Henry Spencer) In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >In the filesystem abstraction, you open a filename in one stage. You >can't do anything between initiating the open and finding out whether or >not it succeeds. This just doesn't match reality, and it places a huge >restriction on programs that want to do something else while they >communicate. Programs that want to do two things at once should use explicit parallelism, e.g. some sort of threads facility. In every case I've seen, this yielded vastly superior code, with clearer structure and better error handling. -- TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry Volume-Number: Volume 21, Number 131
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/26/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <543@usenix.ORG> henry@zoo.toronto.edu (Henry Spencer) writes: > In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > >In the filesystem abstraction, you open a filename in one stage. You > >can't do anything between initiating the open and finding out whether or > >not it succeeds. This just doesn't match reality, and it places a huge > >restriction on programs that want to do something else while they > >communicate. > Programs that want to do two things at once should use explicit parallelism, > e.g. some sort of threads facility. In every case I've seen, this yielded > vastly superior code, with clearer structure and better error handling. I agree that programs that want to do two things at once should use threads. However, a program that sends out several connection requests is *not*, in fact, doing several things at once. open() forces it into an unrealistic local model; surely you agree that this is not a good semantic match for what actually goes on. That example shows what goes wrong when locality disappears. As another example, NFS (as it is currently implemented) shows what goes wrong when reliability disappears. Have you ever run ``df'' on a Sun, only to have it hang and lock up your terminal? Your process is stuck in kernel mode, waiting for an NFS server that may be flooded with requests or may have crashed. Programs that use the filesystem for IPC assume that their files won't just disappear; this isn't true under NFS. I am not saying that networked filesystems are automatically a bad thing. Quite the contrary: a distributed filesystem with caching and other forms of replication can easily be local and reliable, and I'll gladly see standard UNIX make provisions for it. But something that's not local, or not reliable, or not static, is also not necessarily appropriate for the filesystem. ---Dan Volume-Number: Volume 21, Number 132
ske@pkmab.se (Kristoffer Eriksson) (09/26/90)
Submitted-by: ske@pkmab.se (Kristoffer Eriksson)

In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In the filesystem abstraction, you open a filename in one stage. [...]
>
>You can easily construct other examples, but one should be enough to
>convince you that open() just isn't sufficiently general for everything
>that you might read() or write().

What prevents us from inventing a few additional filesystem operations that ARE general enough?

I think the important thing about the filesystem abstraction that is being debated here is the idea of a common name space, and that idea does not require open() to be an indivisible operation, and it does not require that open() must be the only way to associate a file descriptor with a named object, as long as there is only one name space.

--
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60 ! e-mail: ske@pkmab.se
Fax: +46 19-11 51 03 ! or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske

Volume-Number: Volume 21, Number 133
henry@zoo.toronto.edu (Henry Spencer) (09/27/90)
Submitted-by: henry@zoo.toronto.edu (Henry Spencer) In article <544@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >> Programs that want to do two things at once should use explicit parallelism, >> e.g. some sort of threads facility. In every case I've seen, this yielded >> vastly superior code, with clearer structure and better error handling. > >I agree that programs that want to do two things at once should use >threads. However, a program that sends out several connection requests >is *not*, in fact, doing several things at once... I'm afraid I don't understand: a program that is trying, simultaneously, to open several different connections is somehow not doing several things at once? I think this is a confusion of implementation with specification. The program *is* doing several things at once, to wit opening several connections at once. If "open" is split into several steps, you can implement this in a single-threaded program, crudely, by interleaving the steps of the different opens. My point is that the code is cleaner, and often details like good error handling are easier, if you admit that there is parallel activity here and use explicitly parallel constructs. Then an "open" that is ready for step 2 does not need to wait for all the others to finish step 1 first. And if you do this, there is no need to decompose "open" at all, because each thread just does all the steps of one open in sequence. Furthermore, it can then proceed to do other useful setup chores, e.g. initial dialog on its connection, without waiting for the others. This is a far more natural model of what's going on than forcing everything into one sequential process, and a much better match for the semantics of the problem. -- TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry Volume-Number: Volume 21, Number 134
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/27/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <546@usenix.ORG> henry@zoo.toronto.edu (Henry Spencer) writes: > I'm afraid I don't understand: a program that is trying, simultaneously, > to open several different connections is somehow not doing several things > at once? Correct. Between sending an open request out upon the network and receiving an acknowledgment, the program is not doing anything at all related to that connection. Let me be more specific. Host X, on the Internet, wants to know the time. It decides to ask ten hosts around the network for the time. In reality, here's what happens in X's interaction with Y: X sends to Y a request for a connection on port 37. Pause. Y acknowledges. Y sends a few bytes back and closes the connection. During the pause, X is doing nothing. But there are several Y's. So X sends out ten requests in sequence. It waits. Each Y responds at some point; X collects the responses in whatever order they come. Where is it doing any two things at once, let alone several? > The program *is* doing several things at once, to wit opening several > connections at once. ``Opening a connection'' is really an abuse of the language, because a network open consists of at least two steps that may come arbitrarily far apart. Let me replace it by phrases that honestly describe what the computer is doing: ``sending out a connection request, and later accepting an acknowledgment.'' Now, out of the requests and acknowledgments going on, what two are happening at once? None of them. You're being misled by the terminology. ``Opening a connection'' is such a common phrase that we automatically accept it as a description of reality, and consequently believe that it is well described by open(); but it isn't. The time between request and acknowledgment is filled with nothing but a void. [ combining threads with a one-step open() ] > This is a far more natural model of what's > going on than forcing everything into one sequential process, and a > much better match for the semantics of the problem. No. It is not an accurate description of what is going on, since an open() is implicitly local while a network open is not. Abstract imagery aside, though, ``naturalness'' is really defined by how a concept helps a programmer. BSD's non-blocking connect() and select() for connection acceptance, while perhaps not the best-named system calls, are extremely easy to work with. They adapt perfectly to network programming problems because they accurately reflect what the system is doing. In contrast, forking off threads and kludging around a local open() is unnecessarily complex and would make network programming unnecessarily difficult. For me that condemns it as an unnatural, inaccurate reflection of reality. ---Dan Volume-Number: Volume 21, Number 135
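[ Editor's sketch: Dan's ten-host time-of-day example, written with the calls he names -- non-blocking connect() followed by select(). start_connect() is the helper sketched earlier in this thread; the host list is hypothetical, and a real program would add a timeout. Not from the original post.

    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/select.h>
    #include <unistd.h>
    #include <stdio.h>

    extern int start_connect(const char *dotted, unsigned short port);

    static const char *hosts[] =        /* hypothetical addresses */
        { "128.122.142.2", "10.1.1.1", "10.2.2.2" };
    #define NHOSTS (sizeof hosts / sizeof hosts[0])

    int main(void)
    {
        int fd[NHOSTS], i, nleft = 0;

        /* Fire off every request; no connection waits for another. */
        for (i = 0; i < NHOSTS; i++)
            if ((fd[i] = start_connect(hosts[i], 37)) >= 0)  /* port 37: time */
                nleft++;

        /* Collect the answers in whatever order they arrive. */
        while (nleft > 0) {
            fd_set r;
            int maxfd = -1;

            FD_ZERO(&r);
            for (i = 0; i < NHOSTS; i++)
                if (fd[i] >= 0) {
                    FD_SET(fd[i], &r);
                    if (fd[i] > maxfd)
                        maxfd = fd[i];
                }
            if (select(maxfd + 1, &r, NULL, NULL, NULL) < 0)
                break;
            for (i = 0; i < NHOSTS; i++)
                if (fd[i] >= 0 && FD_ISSET(fd[i], &r)) {
                    char buf[16];
                    ssize_t n = read(fd[i], buf, sizeof buf);

                    if (n > 0)
                        printf("%s answered (%ld bytes)\n", hosts[i], (long)n);
                    close(fd[i]);
                    fd[i] = -1;
                    nleft--;
                }
        }
        return 0;
    }

--Ed. ]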
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (09/27/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

In article <545@usenix.ORG> ske@pkmab.se (Kristoffer Eriksson) writes:
> In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
  [ file descriptors are general; the filesystem is not ]
> What prevents us from inventing a few additional filesystem operations
> that ARE general enough?

That's a good question. I am willing to believe that a somewhat different kind of filesystem could sensibly handle I/O objects that are neither reliable nor local. I find it somewhat harder to believe that the concept of a filesystem can reasonably reflect dynamic I/O: information placed into a filesystem should stick around until another explicit action. In any case, you'll have to invent those operations first.

> I think the important thing about the filesystem abstraction that is being
> debated here, is the idea of a common name space,

Here's what I thought upon reading this. First: ``A common name space is irrelevant to the most important properties of a filesystem.'' Second: ``A common name space is impossible.'' And finally: ``We already have a common name space.'' Let me explain.

My first thought was that the basic purpose of a filesystem---to provide reliable, static, local I/O---didn't require a common name space. As long as there's *some* way to achieve that goal, you have a filesystem. UNIX has not only some way, but a uniform, consistent, powerful way: file descriptors. But that's dodging your question. Just because a common name space is irrelevant to I/O doesn't mean that it may not be helpful for some other reason.

My second thought was that the kind of name space you want is impossible. You want to include network objects, but no system can possibly keep track of the tens of thousands of ports under dozens of protocols on hundreds of thousands of computers. It's just too big. But that's not what you're looking for. Although the name space is huge, any one computer only looks at a tiny corner of that space. You only need to see ``current'' names.

My third thought: We already have that common name space! (file,/bin/sh) is in that space. (host,128.122.142.2) is in that space. (proc,1) is in that space. No system call uses this common name space, but it's there. Use it at will.

---Dan

Volume-Number: Volume 21, Number 137
rja7m@plaid.cs.Virginia.EDU (Ran Atkinson) (09/27/90)
Submitted-by: rja7m@plaid.cs.Virginia.EDU (Ran Atkinson) In article <545@usenix.ORG> ske@pkmab.se (Kristoffer Eriksson) writes: >What prevents us from inventing a few additional filesystem operations >that ARE general enough? PLEASE. Let's don't go off inventing new things as part of a standards effort. The proper way to approach standardisation is to standardise the existing practice and avoid all new inventions that haven't been fully implemented and tested widely. Many of the problems with UNIX-derived OSs have come from folks who didn't do this and ended up with stuff that wasn't really compatible with the rest of the OS in function or approach. A lot of the problems I see coming out of the working groups in P1003 come from folks failing to standardise existing practice and instead going off and inventing a new idea in the committee that hasn't been implemented and lacks adequate actual experience with whether the idea really works and is a general solution to a real problem. Randall Atkinson randall@Virginia.EDU Volume-Number: Volume 21, Number 140
chip@tct.uucp (Chip Salzenberg) (09/28/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein):
>The underlying principle is that everything is a file *descriptor*.

No one disputes the significance of file descriptors. Nevertheless, it is important not to underestimate the simplification gained by using one namespace for all objects -- files, devices, processes, hosts, IPC entities, etc. A filesystem is good for files, but a namespace is good for everything. And if an object has a name, and you want a file descriptor referring to that object, why invent a new system call? I'd rather continue using open().

>In reality, you initiate a network stream connection in two stages.
>First you send off a request, which wends its way through the network.
>*Some time later*, the response arrives.

This situation is easily modeled with open() and O_NDELAY. Compare the way Unix opens a modem control tty. Normally, the open() call will block until the carrier detect line is asserted. However, the O_NDELAY parameter to open() avoids the blockage.

Likewise, an open() on a TCP connection would block until the connection succeeds or fails. However, the O_NDELAY parameter would allow the program to continue immediately, with provisional status of "success". The program could come back and check on the open() status later, perhaps with an fcntl() call.

Devices are well-entrenched residents of the filesystem namespace. So far, all proposed reasons for keeping network connections out of the filesystem would apply equally to devices. Do we really want to leave the filesystem free of everything except files? That way lay CP/M.

--
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 138
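[ Editor's sketch: the existing practice Chip is pointing to -- opening a modem-control tty without waiting for carrier -- looks like this. The device name is hypothetical and error handling is trimmed; modern systems spell O_NDELAY as O_NONBLOCK.

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main(void)
    {
        /* Without O_NONBLOCK this open() would block until the
           carrier-detect line on the modem is asserted. */
        int fd = open("/dev/cua0", O_RDWR | O_NONBLOCK);

        if (fd < 0) {
            perror("/dev/cua0");
            return 1;
        }

        /* ... do other work here ... then, before really using the
           line, restore ordinary blocking behaviour: */
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) & ~O_NONBLOCK);

        close(fd);
        return 0;
    }

Chip's proposal is the same pattern applied to a connection: the open() returns at once with provisional success, and the program checks the real status later. --Ed. ]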
chip@tct.uucp (Chip Salzenberg) (09/28/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): >NFS (as it is currently implemented) shows what goes wrong when >reliability disappears. In a discussion of filesystem semantics, NFS is a straw man. Everyone knows it's a botch. If AFS and RFS don't convince one that a networked filesystem namespace can work well, then nothing will. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 139
chip@tct.uucp (Chip Salzenberg) (09/28/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): >On the contrary: Given file descriptors, the filesystem is an almost >useless abstraction. Characterizing the Unix filesystem as "almost useless" is, frankly, hogwash. A hierarchical filesystem with mount points is a simple, yet powerful, organizational tool. To get back to the original point of this thread, one of my primary complaints about the System V IPC facilities is that they all live in a flat namespace. There is no way for me to create a subdirectory for my application, with naturally named IPCs within that directory. Such hierarchical division is "almost useless?" Hardly. >Many of us are convinced that open() and rename() and unlink() and so on >are an extremely poor match for unreliable or dynamic or remote I/O. Given Unix, where devices -- even those with removable media -- are accessed through the filesystem, I can see no reason whatsoever to treat network connections and other IPC facilities differently. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 141
gwyn@smoke.brl.mil (Doug Gwyn) (09/28/90)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn) In article <540@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >You must convince us that open() makes sense for everything that might >be a file descriptor, ... open() provides a mechanism for obtaining the object's handle ("file descriptor") in the first place. The argument is really about whether there ought to be more than one way to originate such a handle. (dup(), fork(), etc. merely propagate a handle obtained by other means.) It is possible, as I described over a year ago in the now-defunct comp.unix.wizards newsgroup, to design a UNIX-like operating system where "it takes a handle to get a handle". However, UNIX is definitely not like that. From a software engineering viewpoint, if a single mechanism for originating handles will suffice, then that is the best approach. The hierarchical filesystem serves a useful function that you neglected to mention: It provides "nodes" at which objects have an opportunity to contribute to decisions during interpretation of pathnames. For example, a directory node plays a very important organizational role, a device driver node acts like a "portal", nodes act as mount points, and so on. Without an identifiable node structure the system would be severely emaciated. Indeed, Plan 9 exploits this even more heavily than does UNIX. Volume-Number: Volume 21, Number 145
gwyn@smoke.brl.mil (Doug Gwyn) (09/28/90)
Submitted-by: gwyn@smoke.brl.mil (Doug Gwyn)

In article <541@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In the filesystem abstraction, you open a filename in one stage. You
>can't do anything between initiating the open and finding out whether or
>not it succeeds. This just doesn't match reality, and it places a huge
>restriction on programs that want to do something else while they
>communicate.

UNIX was designed explicitly on the model of communicating sequential processes. Each process acts as though it executes in a single thread, blocking when it accesses a resource that is not immediately ready. While it would be easy to argue that there is a need for improved IPC, I haven't heard any convincing arguments for making asynchrony explicitly visible to a process. In fact, it was considered quite a step forward in computing back in the old days ("THE" operating system, for example) when viable means of hiding asynchrony were developed.

Volume-Number: Volume 21, Number 144
peter@ficc.ferranti.com (Peter da Silva) (09/30/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva) In article <548@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > I disagree. I consider it an excellent example of how the designers of > UNIX realize that all *reliable*, *static*, *local* (or virtually local) > I/O objects potentially visible to more than one process belong in the > filesystem namespace. Like "/dev/tty"? I think you've got some semantic gap here between what's appropriate for a file versus what's appropriate for a file descriptor. An arbitrary failure on an open file descriptor causes problems... but that doesn't keep socket() from returning an fd. An arbitrary failure or an arbitrary delay on an open call is perfectly reasonable: programs expect open to fail. They depend on write() working. And serial lines are subject to all the "hazardous" behaviour of network connections. An open can be indefinitely deferred. The connection, especially over a modem, can vanish at any time. Why not take *them* out of the namespace as well? > You can read() or > write() reasonable I/O objects through file descriptors. Very few > programs---the shell is a counterexample---need to worry about what it > takes to set up those file descriptors. And that's the problem, because the shell is the program that is used to create more file descriptors than just about anything else. If the shell had a syntax for creating sockets and network connections we wouldn't be having this discussion... but then if it did then you might as well make it be via filenames... And look where this discussion started... over shared memory and messages and semaphores being in a separate namespace. But shared memory and message ports are all: reliable, static, and local... at least as much as processes. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com Volume-Number: Volume 21, Number 150
peter@ficc.ferranti.com (Peter da Silva) (10/01/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva) In article <547@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: > ``Opening a connection'' is such a common phrase that we automatically > accept it as a description of reality, and consequently believe that it > is well described by open(); but it isn't. The time between request and > acknowledgment is filled with nothing but a void. There are a *number* of cases in UNIX where an open() does not return in a determinable time. The correct solution to this is not to pull stuff out of the file system, but to provide an asynchronous open() call (that can well be hidden by a threads library, but the mechanism should be there). This is related to the issue of whether network end-points belong in the file system, but it is not the same issue because there's much more than networks involved... including objects (serial ports with modem control, in particular) that are already in the filesystem. Oddly enough, the latest draft of P1003.4 that I have available does NOT include an asynchronous OPEN request. This is a serious omission. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com Volume-Number: Volume 21, Number 158
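[ Editor's sketch: in the absence of an asynchronous open() in the draft, the usual workaround is to perform the blocking open() somewhere else and report the result through a descriptor the main loop can already select() on. The sketch below uses POSIX threads, which postdate this discussion; the helper names are invented, and error checks are trimmed.

    #include <pthread.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>

    struct async_open { char *path; int flags; int notify_fd; };

    static void *opener(void *arg)
    {
        struct async_open *a = arg;
        int fd = open(a->path, a->flags);     /* may block indefinitely */

        write(a->notify_fd, &fd, sizeof fd);  /* deliver the result (or -1) */
        close(a->notify_fd);
        free(a->path);
        free(a);
        return NULL;
    }

    /* Returns a pipe descriptor the caller can select() on; reading it
       yields the int result of the open(). */
    int open_async(const char *path, int flags)
    {
        int p[2];
        pthread_t t;
        struct async_open *a;

        if (pipe(p) < 0)
            return -1;
        a = malloc(sizeof *a);
        a->path = strdup(path);
        a->flags = flags;
        a->notify_fd = p[1];
        pthread_create(&t, NULL, opener, a);
        pthread_detach(t);
        return p[0];
    }

--Ed. ]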
donn@hpfcrn.fc.hp.com (Donn Terry) (10/01/90)
Submitted-by: donn@hpfcrn.fc.hp.com (Donn Terry)

I've been following this discussion on the issues of filesystem namespace. I'd like to step back from the details and look at it a little more philosophically. I think that that may lead to a resolution of the issues (or at least some progress) (or a decrease in the shrillness) (or something).

UNIX was designed to simplify the programmer's life. In particular, anything that could be reasonably generalized, was. This generalization is not an easy task, and not easy to explain. The genius of Ritchie and Thompson was that they both achieved the generalization and got others to believe in it.

The generalization is more difficult to deal with when you are "used to" some other model. (I see folks using various proprietary systems griping about UNIX because it doesn't do everything just the way they are used to.) As Dijkstra once observed about BASIC (I paraphrase, not having the quote): "The teaching of BASIC should be forbidden because it forever ruins students from being able to use better languages." I think (although he exaggerates) that Dijkstra's comment also applies in this case. We all are contaminated to some degree or other by the preconceptions we bring with us from other training (be it experience with other OSs or something else).

I have some personal concerns about some of the functionality in 1003.4 because it appears to be based upon models from other, successful, implementations, but ones that may not have been through the process of generalization. It was R&T's thought that having lots of processes would solve such problems, and for the day, it did. Now it doesn't, because of tightly coupled activities (tasks?) needing "fast" switch time. To me, threads is the generalization that follows the original philosophy, not bringing OS-like functions similar to select() up to the user. (I didn't like threads at first, like many don't; I may still not like the details, but they do seem to provide the generalization needed for that class of task, without the application writer having to write a mini-dispatcher of his/her own.)

The broad context of namespace is similar, to me. What's the generalization? I don't really know. My (UNIX flavored) biases say that it's the filesystem. However, a generalization, not a statement that "my problem is different so must be treated differently", is the right answer.

Let me try something for the readers of this group to think about. The "UNIX Filesystem" really consists of two parts: a hierarchical namespace mechanism that currently names objects which are at least files, devices, file stores (mounted volumes), and data stream IPC mechanisms (OK, FIFOs!). Some systems add other IPC mechanisms (Streams, Sockets), and the process space (/proc). I could go on.

One of the classes of objects named in the namespace is ordinary files. The set of ordinary files is a collection of flat namespaces, where the names are (binary) numbers. (Each mounted volume is an element of the collection, and each i-number is a filename. The "real names" of files are the volume and i-number pair; that's how you tell if two files are identical, not by their names in the namespace, of which they may have zero or more.) (The fact that the other object types also usually have i-numbers is an accident of implementation.)

Open() is a means to translate from the namespace to a handle on an object. It may be that the handle is for an ordinary file, or for some other object (as I listed above).
Historically, files were the most common concept, and the namespace became the "filesystem". (The volume/inode namespace isn't, and shouldn't be, accessible, because the gateway functions that Doug Gwyn mentions are necessary and valuable.)

Given the above three paragraphs, one could consciously separate the namespace from the file system further, and then the argument that "a connection is not a file" seems weaker. A "connection" is an object in the namespace, and open() gives you a handle on it. Given that you know what the object is, you may have to perform additional operations on it, or avoid them. (E.g., many programs operate differently based on the nature of the object they open; if it's a tty the program does ioctl() calls on it, if not, it doesn't.)

I'm not yet sure that the "filesystem" namespace is (or is not) the right generalization, but a generalization is useful so that we don't end up where we were when R&T started out, with a bunch of unrelated namespaces where, by relating them, common functions could be combined, and common operations could be performed commonly. For example, it would be a shame if we find that some network objects that were not put in the generic namespace could reasonably have the open()/read()/write()/close() model applied to them, and because they were in a different namespace, this could not be done (easily). Many existing proprietary systems (and even more historical ones) left you in the state that a program that sequentially read an ordinary file couldn't simply do the same thing to a device (without extra programming, anyway). Not looking for the generalization could lead us to the same state again for the "newer" technologies.

Donn Terry
Speaking only for myself.

Volume-Number: Volume 21, Number 161
henry@zoo.toronto.edu (Henry Spencer) (10/02/90)
Submitted-by: henry@zoo.toronto.edu (Henry Spencer) In article <547@usenix.ORG> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >> The program *is* doing several things at once, to wit opening several >> connections at once. > >``Opening a connection'' is really an abuse of the language, because a >network open consists of at least two steps that may come arbitrarily >far apart... This is the nub of the issue, and it's a difference in semantic models. Dan insists on seeing open as a sequence of operations visible to the user, in which case his viewpoint is reasonable. I prefer the Unix approach -- the details of an open are none of the user's business, only whether it succeeds or fails -- in which case "opening a connection" is entirely reasonable terminology, and opening several at once (i.e. sending out multiple requests before receiving acknowledgements) is indeed doing several things at once, best handled with explicit parallelism. Both models are defensible, but I would sort of hope that in a Unix standard, the Unix model would be employed. It is easy to construct examples where explicit parallelism buys you things that the multi-step model can't easily achieve, such as writing data from one connection to disk while another one is still exchanging startup dialog. One *can* always do this in the multi-step model, but it amounts to simulating parallel threads. The main structure of the program turns into: for (;;) { wait for something to happen on some connection deal with it, in such a way that you never block } which does work, but greatly obscures the structure of what's going on, and tends to require all sorts of strange convolutions in "deal with it" because of the requirement that it not block. (If it blocks, activity on *all* connections blocks with it.) BSDish server code tends to be very hard to understand because of exactly this structure. With multiple threads, each one can block whenever convenient, and the others still make progress. Best of all, the individual threads' code looks like a standard Unix program: open connection do reads and writes on it and other things as necessary close it exit instead of being interwoven into a single master loop with all the rest. Almost any program employing select() would be better off using real parallelism instead, assuming that costs are similar. (It is easy to make costs so high that parallelism isn't practical.) -- Imagine life with OS/360 the standard | Henry Spencer at U of Toronto Zoology operating system. Now think about X. | henry@zoo.toronto.edu utzoo!henry Volume-Number: Volume 21, Number 163
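[ Editor's sketch: the per-connection structure Henry describes, using POSIX threads (which the committees were only beginning to define at the time). connect_to() and handle() are hypothetical stand-ins for "open connection" and "do reads and writes on it"; the point is only the shape of the code -- each thread blocks wherever convenient without stalling the others.

    #include <pthread.h>
    #include <stdlib.h>
    #include <unistd.h>

    extern int connect_to(char *host);   /* blocking "open connection" */
    extern void handle(int fd);          /* reads, writes, startup dialog */

    static void *one_connection(void *arg)
    {
        int fd = connect_to(arg);        /* blocks; only this thread waits */

        if (fd >= 0) {
            handle(fd);                  /* plain sequential reads/writes */
            close(fd);
        }
        return NULL;
    }

    void serve_all(char **hosts, int n)
    {
        pthread_t *t = calloc(n, sizeof *t);
        int i;

        for (i = 0; i < n; i++)
            pthread_create(&t[i], NULL, one_connection, hosts[i]);
        for (i = 0; i < n; i++)
            pthread_join(t[i], NULL);
        free(t);
    }

--Ed. ]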
donn@hpfcrn.fc.hp.com (Donn Terry) (10/02/90)
Submitted-by: donn@hpfcrn.fc.hp.com (Donn Terry)

I was thinking about this a bit more, and want to propose some food for thought on the issue.

Classically, open() is a function that "opens a file descriptor", which is where the name comes from. However, if you think of open() instead as "translate this string from the (filesystem) namespace, and give me a handle on the object", it actually makes more sense. The operations that can be performed on a file are the classical operators applicable to such a handle. However, some are forbidden or meaningless on some object types (lseek on FIFOs, ioctl on ordinary files, some fcntls on devices), and some object types have operations only applicable to them (ioctl on devices) and no other type. I can easily imagine an object that had none of the classical file operations applied to it.

Now, there is also nothing that requires that open() be the only function that returns such a generic object handle. Imagine (simple example) a hierarchical namespace that contains all possible character bitcodes in the namespace. Open() would not work very well because of the null termination and slash rules. However, I can imagine another function that takes a char** as an argument, where each element is the name at the next level of the hierarchy. (With length in the first byte.) It would still return a classical file descriptor. Similarly, maybe the punctuation is different, or the notion of "root" is different; generalizing open() to "give me a handle in a namespace" may be most useful.

I intend this not as any sort of proposal of something that should or should not be done, but as an "icebreaker" in terms of thinking about the problem. What are the further generalizations we need, how do they make sense and fit together, and (the real test of success) what are some of the unexpected benefits of the generalization? (Granting that the "biggest" unexpected benefit will show up "later".)

Donn Terry
Speaking only for myself.

Volume-Number: Volume 21, Number 167
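[ Editor's sketch: the second entry point into the namespace that Donn imagines might be declared as below. Nothing of the sort exists in any standard or implementation; the declaration is purely an illustration of a lookup function that still returns a classical file descriptor.

    /* Each element of `name' is one level of the hierarchy, carrying its
       length in the first byte so that '/' and '\0' may appear freely in
       component names; the array itself ends with a NULL pointer.  The
       return value is an ordinary file descriptor, exactly as from open(). */
    int nopen(unsigned char **name, int flags);

    /* Example: the two-level name { "\003dev", "\003tty", NULL } would
       denote what the string "/dev/tty" denotes today. */

--Ed. ]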
fouts@bozeman.bozeman.ingr (Martin Fouts) (10/03/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)
>>>>> On 27 Sep 90 20:03:39 GMT, chip@tct.uucp (Chip Salzenberg) said:
Chip> Given Unix, where devices -- even those with removable media -- are
Chip> accessed through the filesystem, I can see no reason whatsoever to
Chip> treat network connections and other IPC facilities differently.
Chip> --
One reason to not treat every IPC facility as part of the file system:
Shared memory IPC mechanisms which don't need to be visible to
processes not participating in the IPC.
Marty
--
Martin Fouts
UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303
Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa
Volume-Number: Volume 21, Number 169
domo@tsa.co.uk (Dominic Dunlop) (10/03/90)
Submitted-by: domo@tsa.co.uk (Dominic Dunlop)

In article <107020@uunet.UU.NET> donn@hpfcrn.fc.hp.com (Donn Terry) writes cogently about file system and other name spaces. I'm not going to add significantly to what he said, merely embroider a little:

> One of the classes of objects named in the namespace is ordinary files.
> The set of ordinary files is a collection of flat namespaces, where
> the names are (binary) numbers. (Each mounted volume is an element
> of the collection, and each i-number is a filename. The "real names"
> of files are the volume and i-number pair; that's how you tell if two
> files are identical, not by their names in the namespace, of which
> they may have zero or more.) (The fact that the other object types
> also usually have i-numbers is an accident of implementation.)

I'd just like to add that the existing POSIX.1 standard does incorporate the concept of ``a per-file system unique identifier for a file'', although its ethnic origins have been disguised by calling it a ``file serial number'' rather than an i-number. The corresponding field in the stat structure is, by no coincidence at all, st_ino.

Donn's point about the need to be able to determine whether two ``handles'' (whatever they may be) refer to the same object is a good one. It follows that, if new types of object are made accessible through filename space, the information returned by stat() (or fstat()) should be sufficient uniquely to identify each distinct object.

Of course, where the object is not a conventional file, life becomes more complex than simply saying that each unique serial number/device id combination refers to a unique object. Although POSIX.1 is reticent on the topic because it is studiously avoiding the UNIX-ism of major and minor device numbers, we all know that, faced with a device file on a UN*X system, we should ignore the serial number, and use only the device id in determining uniqueness.

I dare say that, as more types of object appear in filename space (and I, for one, should like to see them do so), the question of determining uniqueness will become knottier. Suppose, for example, that one used filenames as handles for virtual circuits across a wide-area network. Conceivably, the number of such circuits could be sufficiently large that it will become difficult to shoe-horn a unique identifier into the existing stat structure fields. A problem for the future?
--
Dominic Dunlop

Volume-Number: Volume 21, Number 172
jason@cnd.hp.com (Jason Zions) (10/03/90)
Submitted-by: jason@cnd.hp.com (Jason Zions)

Dominic Dunlop says:
> I dare say that, as more types of object appear in filename space (and
> I, for one, should like to see them do so), the question of determining
> uniqueness will become knottier. Suppose, for example, that one used
> filenames as handles for virtual circuits across a wide-area network.
> Conceivably, the number of such circuits could be sufficiently large
> that it will become difficult to shoe-horn a unique identifier into the
> existing stat structure fields. A problem for the future?

Actually, a problem for today. P1003.8 has to cope with the fact that a local file for major 0, minor 0x010100, inode 1234 is *different* from a file on some remote machine with the same (major,minor,inode) triplet.

But adding a new field or fields to the stat structure isn't gonna work; expanding that structure will cause many implementations to shatter (i.e. break spectacularly). Just cobbling up a major number for some random remotely-mounted filesystem is unsatisfactory, unless the cobble is persistent over umount/mount operations. (An application starts to run; opens file1 on remsys, gets (maj,min,ino). Network goes down, comes up; system remounts remsys. App opens file2 on remsys. That major number had better be the same for remsys!)

What's needed is a simple routine which can be called to determine if two handles point to the same object. It would be nice if there was a routine which took as arguments a file handle and a path name and returned true iff the path referred to the same file. This routine would be guaranteed by the implementor to work for any file-system resident object provided for; e.g. an SVR4 implementation would have to be able to tell if a file opened via RFS referred to the same underlying file as one opened under NFS.

I don't know if that's sufficient, though; application programmers may be using the stat info for other purposes, and a remote_addr field might be a good idea, once P1003.12 decides on a representation for an arbitrary network address, which might be considerably larger than an IP address.

Jason Zions

Volume-Number: Volume 21, Number 174
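[ Editor's sketch: the routine Jason asks for, written the only way today's interfaces allow -- by comparing (st_dev, st_ino). This is exactly the test that stops being trustworthy once two different remote filesystems can present the same pair, which is his point; the sketch is illustrative, not a proposal.

    #include <sys/types.h>
    #include <sys/stat.h>

    /* Return nonzero iff `path' currently refers to the same underlying
       file as the open descriptor `fd'. */
    int same_file(int fd, const char *path)
    {
        struct stat a, b;

        if (fstat(fd, &a) < 0 || stat(path, &b) < 0)
            return 0;                   /* can't tell, so say no */
        return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
    }

--Ed. ]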
peter@ficc.ferranti.com (Peter da Silva) (10/04/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <13132@cs.utexas.edu> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> One reason to not treat every IPC facility as part of the file system:
> Shared memory IPC mechanisms which don't need to be visible to
> processes not participating in the IPC.

Provide an example, considering the advantages that shell-level visibility of objects has for (a) debugging, (b) system administration, (c) integration, (d)...

It's nice to be able to fake a program out with a shell script.
--
Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com

Volume-Number: Volume 21, Number 176
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/04/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <551@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes: > According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): > >NFS (as it is currently implemented) shows what goes wrong when > >reliability disappears. > In a discussion of filesystem semantics, NFS is a straw man. Everyone > knows it's a botch. > If AFS and RFS don't convince one that a networked filesystem > namespace can work well, then nothing will. Exactly! This example proves my point. What's so bad about NFS---why it doesn't fit well into the filesystem---is that it doesn't make the remote filesystem reliable and local. If you show me Joe Shmoe's RFS with reliable, local, static I/O objects, I'll gladly include it in the filesystem. ---Dan Volume-Number: Volume 21, Number 185
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/04/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) In article <106697@uunet.UU.NET> peter@ficc.ferranti.com (Peter da Silva) writes: [ Programs depend on write() working. ] On the contrary. When the descriptor is unreliable, you get an I/O error or the data is simply corrupted; this is exactly what happens with disk I/O. Programs that handle errors on read() and write() are more robust than programs that don't. More commonly, when the descriptor is dynamic and the other side drops, you get a broken pipe. This is certainly not a rare failure mode. In context, I said that open() is only appropriate for reliable, static, local I/O objects. You seem to be arguing that read() and write(), and file descriptors in general, also require reliable, static, local I/O objects, and so my distinction is silly. But UDP sockets, pipes, and TCP sockets are unreliable, dynamic, and remote file descriptors respectively, and read()/write() work with them perfectly. > > You can read() or > > write() reasonable I/O objects through file descriptors. Very few > > programs---the shell is a counterexample---need to worry about what it > > takes to set up those file descriptors. > And that's the problem, because the shell is the program that is used to > create more file descriptors than just about anything else. If the shell > had a syntax for creating sockets and network connections we wouldn't be > having this discussion... Oh? Really? I have a syntax for creating sockets and network connections from my shell. For example, I just checked an address by typing $ ctcp uunet.uu.net smtp sh -c 'echo expn rsalz>&7;echo quit>&7;cat<&6' So we shouldn't be having this discussion, right? > but then if it did then you might as well make > it be via filenames... Why? I don't see a natural filename syntax for TCP connections, so why should I try to figure one out? What purpose would it serve? Only two programs---a generic client and a generic server---have to understand the filenames. If those two programs work, what's the problem? [ shm and sem are reliable, static, local ] As a BSD addict I don't have much experience with those features, but I believe you're right. So feel free to put shared memory objects into the filesystem; I won't argue. Semaphores, I'm not sure about, because it's unclear what a file descriptor pointing to a semaphore should mean. Are semaphores I/O objects in the first place? ---Dan Volume-Number: Volume 21, Number 182
aglew@crhc.uiuc.edu (Andy Glew) (10/04/90)
Submitted-by: aglew@crhc.uiuc.edu (Andy Glew) >In the filesystem abstraction, you open a filename in one stage. You >can't do anything between initiating the open and finding out whether or >not it succeeds. This just doesn't match reality, and it places a huge >restriction on programs that want to do something else while they >communicate. Sounds like you want an asynchronous open facility, much like the asynchronous read and write that others already have on their wish list for file I/O (and other I/O) (not everyone believes that multiple threads are the way to do asynch I/O). -- Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi] Volume-Number: Volume 21, Number 181
chip@tct.uucp (Chip Salzenberg) (10/05/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg) According to fouts@bozeman.bozeman.ingr (Martin Fouts): >One reason to not treat every IPC facility as part of the file system: >Shared memory IPC mechanisms which don't need to be visible to processes >not participating in the IPC. Yes, it is obviously desirable to have IPC entities without names. This feature is a simple extension of the present ability to keep a plain file open after its link count falls to zero. Of course, the committee could botch the job by making it an error to completely unlink a live IPC. -- Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip> Volume-Number: Volume 21, Number 186
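[ Editor's sketch: the existing practice Chip cites -- a plain file that stays alive, open but nameless, after its last link is removed. The pathname is hypothetical and error handling is trimmed.

    #include <fcntl.h>
    #include <unistd.h>

    int anonymous_scratch_file(void)
    {
        int fd = open("/tmp/scratch.tmp", O_RDWR | O_CREAT | O_EXCL, 0600);

        if (fd < 0)
            return -1;
        unlink("/tmp/scratch.tmp");   /* link count drops to zero ...    */
        return fd;                    /* ... but the descriptor keeps the
                                         storage alive until close()     */
    }

The analogous rule for a named IPC object would let a participant unlink the name while the object itself lives on for those already holding it. --Ed. ]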
nick@bis.com (Nick Bender) (10/06/90)
Submitted-by: nick@bischeops.uucp (Nick Bender) In article <13218@cs.utexas.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: = Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein) = = In article <551@usenix.ORG> chip@tct.uucp (Chip Salzenberg) writes: = > According to brnstnd@kramden.acf.nyu.edu (Dan Bernstein): = > >NFS (as it is currently implemented) shows what goes wrong when = > >reliability disappears. = > In a discussion of filesystem semantics, NFS is a straw man. Everyone = > knows it's a botch. = > If AFS and RFS don't convince one that a networked filesystem = > namespace can work well, then nothing will. = = Exactly! This example proves my point. What's so bad about NFS---why it = doesn't fit well into the filesystem---is that it doesn't make the = remote filesystem reliable and local. If you show me Joe Shmoe's RFS = with reliable, local, static I/O objects, I'll gladly include it in the = filesystem. = = ---Dan Any program which assumes that write(2) always works is broken. Period. That's why you sometimes get long streams of "filesystem full" on your console when some brain-damaged utility doesn't check a return value. In my view this is not a reason to call NFS a botch. nick@bis.com Volume-Number: Volume 21, Number 188
chip@tct.uucp (Chip Salzenberg) (10/09/90)
[I would like to avoid an NFS flame fest if possible. If you respond, please keep it in the context of a UNIX standards discussion, as Chip has mostly done here. Thanks. --Fletcher ]

Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to nick@bis.com (Nick Bender):
>Any program which assumes that write(2) always works is broken. Period.

True.

>In my view this is not a reason to call NFS a botch.

Also true ... but the possible failure of write() wasn't my reason.

NFS is an interesting and occasionally useful service. However, it does not provide UNIX filesystem semantics. In particular, given appropriate permissions, link() and mkdir() on a UNIX filesystem are guaranteed to succeed exactly once. On an NFS mount, however, they may report failure even after having succeeded. Also, the vaunted "advantage" of NFS, its statelessness, goes out the window as soon as you want to lock a file. Finally, NFS does not permit access to remote special files such as devices and named pipes.

Yes, Virginia, NFS is a botch.

So what is the relevance of NFS's dain bramage to this newsgroup? Simply that NFS is not POSIX compliant. Therefore, using NFS as an example of how the namespace is supposedly almost useless is nothing more than a straw man. If a person wants to knock remote UNIX filesystems, let him try to knock reasonable ones like RFS and AFS.

No, Dan, this article does not imply that network connections don't belong in the filesystem. It means that *if* link() and mkdir() are defined on a UNIX filesystem, they must succeed exactly once. Compare a UNIX system that has mounted a CP/M disk. The CP/M disk format precludes the use of link() and mkdir(), yet the UNIX namespace is quite useful for accessing the files on the disk.
--
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>

Volume-Number: Volume 21, Number 191
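[ Editor's sketch: the defensive idiom that NFS's retransmitted requests force on applications. On a local UNIX filesystem, where mkdir() with appropriate permissions succeeds exactly once, the extra check is never needed. Illustrative only.

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <errno.h>

    int make_dir(const char *path, mode_t mode)
    {
        struct stat st;

        if (mkdir(path, mode) == 0)
            return 0;                     /* the ordinary UNIX case */
        if (errno == EEXIST && stat(path, &st) == 0 && S_ISDIR(st.st_mode))
            return 0;     /* quite possibly our own mkdir, whose first
                             reply was lost and whose retry said EEXIST */
        return -1;
    }

--Ed. ]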
fouts@bozeman.bozeman.ingr (Martin Fouts) (10/11/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)

>>>>> On 3 Oct 90 17:19:04 GMT, peter@ficc.ferranti.com (Peter da Silva) said:

Peter> In article <13132@cs.utexas.edu> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> One reason to not treat every IPC facility as part of the file system:
> Shared memory IPC mechanisms which don't need to be visible to
> processes not participating in the IPC.

Peter> Provide an example, considering the advantages that shell-level
Peter> visibility of objects has for (a) debugging, (b) system administration,
Peter> (c) integration, (d)...

Short-persistence IPC mechanisms found in multithreaded shared memory implementations consist of a small region of memory and a lock guarding that region. Producer/consumer parallelism using this mechanism does not need to be visible. Effectively, this is the shared memory equivalent of an unnamed pipe.

a) Debugging is handled by the process debugger, not by the shell, and has the same visibility as any other memory resident data.

b) There is no system administration, since the objects have exactly process duration with the same termination semantics as a pipe, in that termination of any of the processes is usually catastrophic.

c) I'm not sure what integration support would benefit from making a short duration object visible.

d) ....
--
Martin Fouts

UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303

Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa

Volume-Number: Volume 21, Number 196
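[ Editor's sketch: the kind of object Martin means -- an anonymous shared region plus a lock, inherited across fork() and visible to nothing else on the system. The sketch uses mmap(MAP_ANONYMOUS) and a process-shared POSIX.4-style semaphore, facilities that postdate or were still being defined during this discussion; treat it purely as an illustration of the "unnamed pipe of shared memory" idea.

    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <semaphore.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdio.h>

    struct mailbox {
        sem_t full;             /* posted by the producer when data is ready */
        char  data[64];
    };

    int main(void)
    {
        struct mailbox *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                                 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        if (m == MAP_FAILED)
            return 1;
        sem_init(&m->full, 1, 0);          /* 1 => shared between processes */

        if (fork() == 0) {                 /* child: producer */
            strcpy(m->data, "hello through the unnamed region");
            sem_post(&m->full);
            _exit(0);
        }

        sem_wait(&m->full);                /* parent: consumer */
        printf("%s\n", m->data);
        wait(NULL);
        return 0;
    }

--Ed. ]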
peter@ficc.ferranti.com (Peter da Silva) (10/12/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <13442@cs.utexas.edu> fouts@bozeman.bozeman.ingr (Martin Fouts) writes:
> Short-persistence IPC mechanisms found in multithreaded shared memory
> implementations consist of a small region of memory and a lock guarding
> that region. Producer/consumer parallelism using this mechanism does
> not need to be visible. Effectively, this is the shared memory
> equivalent of an unnamed pipe.

Effectively, this *is* shared memory. And shared memory has proven itself to be a viable candidate for insertion into the name space.

I didn't say that every application of an IPC mechanism should have its own entry in the name space. Creating a file for each element in a shared memory region makes about as much sense as creating a file for each message in a pipe. But the region itself should be visible from the outside.
--
Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com

Volume-Number: Volume 21, Number 201
fouts@bozeman.bozeman.ingr (Martin Fouts) (10/13/90)
Submitted-by: fouts@bozeman.bozeman.ingr (Martin Fouts)

>>>>> On 4 Oct 90 20:39:37 GMT, chip@tct.uucp (Chip Salzenberg) said:

Chip> According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>One reason to not treat every IPC facility as part of the file system:
>Shared memory IPC mechanisms which don't need to be visible to processes
>not participating in the IPC.

Chip> Yes, it is obviously desirable to have IPC entities without names.
Chip> This feature is a simple extension of the present ability to keep a
Chip> plain file open after its link count falls to zero. Of course, the
Chip> committee could botch the job by making it an error to completely
Chip> unlink a live IPC.
Chip> --

Of course, if I have to acquire a file handle for my IPC, I can't implement it as efficiently as if I just do it locally in shared memory and don't bother the system about its existence.

Marty
--
Martin Fouts

UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fouts@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303

Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa

Volume-Number: Volume 21, Number 205
chip@tct.uucp (Chip Salzenberg) (10/19/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

[ Is it my imagination, or is this thread getting stale? Oh well. I think John will be back soon. He can decide. --Fletcher ]

According to fouts@bozeman.bozeman.ingr (Martin Fouts):
>Of course, if I have to acquire a file handle for my IPC, I can't
>implement it as efficiently as if I just do it locally in shared memory
>and don't bother the system about its existence.

Well, if the system doesn't know about it, then it's not a system IPC facility. If, however, the system does know about it, then it has to have a handle, which might as well be a small integer -- i.e. a file descriptor.
--
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>
"I've been cranky ever since my comp.unix.wizards was removed
by that evil Chip Salzenberg." -- John F. Haugh II

Volume-Number: Volume 21, Number 208