brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/05/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

We all know that the best standards codify existing practice, while the worst standards attempt to introduce new features without knowing what they'll do. For example, POSIX 1003.1 has slaughtered some of my best code and thrown huge roadblocks into my porting attempts, simply by adding an unnecessary feature (sessions) that hadn't been proven to work in the real world. It's a nice standard---except where it enters totally uncharted territory.

Now we're looking at another possible addition to UNIX that hasn't been widely tested: a unified namespace for opening all I/O objects. But we already have a unified file descriptor abstraction for reading, writing, and manipulating those objects, as well as passing them between separate processes. Why do we need more?

I propose that we stop discussing this issue in comp.std.unix and start implementing real-world solutions. My approach is to separate opening and connecting into special programs, and stick to file descriptors for almost all applications. If you have a different solution, such as overloading open(), why don't you start playing with your library and seeing what works?

When we have a lot more real-world experience with various solutions, we can come back here and consider standardization. Until then, ciao.

---Dan

Volume-Number: Volume 21, Number 187
ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe) (10/08/90)
Submitted-by: ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe)

In article <13220@cs.utexas.edu>, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> Now we're looking at another possible addition to UNIX that hasn't been
> widely tested: a unified namespace for opening all I/O objects. But we
> already have a unified file descriptor abstraction for reading, writing,
> and manipulating those objects, as well as passing them between separate
> processes. Why do we need more?

If you have to use different functions for creating file descriptors in the first place, then you haven't got a unified file descriptor abstraction.

Suppose I want to write a "filter" program that will merge two streams. It would be nice if I could pass descriptors to a program, but that's not how most UNIX shells work; I have to pass strings. Now, my filter knows what it *needs* (sequential reading with nothing missing or out of order, but if the connection is lost somehow it's happy to drop dead), so it could easily do

	fd = posix_open(argv[n], "read;sequential;reliable;soft");

and then it can use any file, device, or other abstraction which will provide this interface.

My program *can't* know what's available. If someone comes along with a special "open hyperspace shunt" function, my program can't benefit from it. If hyperspace shunts are in the global name space and posix_open() understands their name syntax, my program will work just fine. Surely this is the point? We want our programs to remain useful when new things are added that our programs could meaningfully work with.

I can see the point in saying "shared memory segments aren't much like transput; let's keep them out of the global name space", but sockets and NFS files and such *are* "transput-like". Anything which will support at least sequential I/O should look like a file.
If that means that some things in the global name space are "real UNIX files" with full 1003.1 semantics but some things aren't, that's OK as long as my programs can find out whether they've got what they need.

One point to bear in mind is that application programs written in C, Fortran, Ada, &c are likely to map file name strings in those languages fairly directly to strings in the POSIX name space; to keep something that _could_ have supported C, Fortran, or Ada transput requests out of the file name space is to make such things unavailable to portable programs. If some network connections can behave like sequential files (even if they don't support full 1003.1 semantics), then why keep them out of reach of portable programs? (I have used a system where a global name space was faked by the RTL. Trouble is, different languages did it differently, if at all...) Even shared memory segments *could* support read, write, lseek...

-- 
Fear most of all to be in error.  -- Kierkegaard, quoting Socrates.

Volume-Number: Volume 21, Number 190
ske@pkmab.se (Kristoffer Eriksson) (10/09/90)
Submitted-by: ske@pkmab.se (Kristoffer Eriksson)

In article <13220@cs.utexas.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Now we're looking at another possible addition to UNIX that hasn't been
>widely tested: a unified namespace for opening all I/O objects.
>
>I propose that we stop discussing this issue in comp.std.unix and start
>implementing real-world solutions.

I am already running a system where a file name can lead to any kind of I/O object. It works fine, as far as I can judge. What more should I do?

(Not everything that could be is implemented via file names in this system, but there are networks and databases that are interfaced via this mechanism, and I like it a lot. Server programs attach themselves to directory or file names, and will take care of all file operations attempted by clients on those files or directories.)

>My approach is to separate opening and connecting into special programs,
>and stick to file descriptors for almost all applications.

Doesn't your objection about the semantics of open() on network connections fall down in that case? Do your special programs for obtaining the file descriptors make the real semantics of network connections available to the application any more than open() does?

I think file names are more useful. How do you, for instance, stick a file descriptor that you obtained from one of your special programs into the configuration file for some program? File names are readily suitable for that. If you just stick the network address into the file, the application will be restricted to network connections (maybe only one type of network, at that), and the application will have to know how to access that kind of connection.

>If you have a different solution, such as
>overloading open(), why don't you start playing with your library and
>seeing what works?

Too static. You will in practice be conserving the top level of the name space inside your library routines. With non-shared libraries this would mean you'd have to recompile all your programs whenever you need to change what kind of objects you can access or how they are accessed. With shared libraries it only requires recompiling the libraries, but that still isn't something you'd like to do every day.

With the entire name space available through the filesystem, you can change the entire hierarchy dynamically, and starting the server for some kind of object may, as part of that same operation, establish the access path (just a file name) through which it is accessed.

-- 
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60  !  e-mail: ske@pkmab.se
Fax:   +46 19-11 51 03  !  or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske

Volume-Number: Volume 21, Number 193
chip@tct.uucp (Chip Salzenberg) (10/09/90)
Submitted-by: chip@tct.uucp (Chip Salzenberg)

According to ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe):
>My program *can't* know what's available. If someone comes along
>with a special "open hyperspace shunt" function, my program can't
>benefit from it. If hyperspace shunts are in the global name space
>and posix_open() understands their name syntax, my program will work
>just fine.

Thank you, Richard, for stating well what I have intuitively felt. (Dan, you wanted a reasoned rebuttal. Very well: here it is.)

It is true that interactive use of UNIX, especially by programmers, puts a lot of emphasis on the shell interface. If such an environment were all there were to Unix, then Dan's fd-centric view of the world could possibly be useful. To use Richard's example: when a hyperspace shunt became available, its use would require only a change to the shell source code and a recompilation.

However, the reality of modern Unix use is something else entirely: pre-packaged utilities, usually available only as binaries, that for practical purposes *cannot* be changed or replaced. In this environment, kernel features that require program customization are unwieldy at best, useless at worst. As long as shells fall into this category -- "programs usually distributed as binaries" -- fd-centric UNIX will never be practical.

One could argue that binary-only distribution is evil and should be stopped. I can agree that binaries are less useful than source code; in fact, my personal motto is, "Unless you have source code, it isn't software." Nevertheless, copyright and trade secret law being what they are, we will continue to see binary-only distributions for the indefinite future. Even if source code for all UNIX programs were freely available, I doubt that anyone would seriously propose modifying *all* of them each time a new kind of fd-accessible object were added to the kernel.

Finally, filenames often are stored in places where no shell will ever see them, such as program-specific configuration files. So in Dan's hypothetical fd-centric UNIX, we would have to either (1) pass such filenames to the shell for interpretation, thus incurring a possibly substantial performance hit; or (2) modify each program to understand all the names the shell would understand. In my opinion, neither of these alternatives is viable.

To summarize: a unified namespace has one great advantage: new types of objects are immediately available to all programs -- even the programs you do not have the means or the desire to modify and recompile.

-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "I've been cranky ever since my comp.unix.wizards was removed
  by that evil Chip Salzenberg."   -- John F. Haugh II

Volume-Number: Volume 21, Number 194
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (10/11/90)
Submitted-by: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)

I was not planning to post further on this topic, but Chip has provided some good arguments that deserve a proper rebuttal.

In article <13392@cs.utexas.edu> chip@tct.uucp (Chip Salzenberg) writes:
> It is true that interactive use of UNIX, especially by programmers,
> puts a lot of emphasis on the shell interface. If such an environment
> were all there were to Unix, then Dan's fd-centric view of the world
> could possibly be useful.

The success of UNIX has proven how useful this ``fd-centric'' view is.

> To use Richard's example: when a hyperspace
> shunt became available, its use would require only a change to the
> shell source code and a recompilation.

You are making an unwarranted assumption here: that the shell *has* to handle all types of fd creation. It's convenient, of course, but by no means necessary. My TCP connectors, for example, are implemented outside the shell.

> However, the reality of modern Unix use is something else entirely:
> pre-packaged utilities, usually available only as binaries, that for
> practical purposes *cannot* be changed or replaced. In this
> environment, kernel features that require program customization are
> unwieldy at best, useless at worst. As long as shells fall into this
> category -- "programs usually distributed as binaries" -- fd-centric
> UNIX will never be practical.

This is also unfounded. My TCP connectors provide a counterexample to your hypothesis (that the shell must handle everything and hence be recompiled) and your conclusion (that fd-centric UNIX doesn't work). Any programming problem can be solved by adding a level of indirection.

> One could argue that binary-only distribution is evil and should be
> stopped.

I do, in fact, think exactly that. But I will not use it as a basis for my arguments.

> Finally, filenames often are stored in places where no shell will ever
> see them, such as program-specific configuration files. So in Dan's
> hypothetical fd-centric UNIX, we would have to either (1) pass such
> filenames to the shell for interpretation, thus incurring a possibly
> substantial performance hit; or (2) modify each program to understand
> all the names the shell would understand. In my opinion, neither of
> these alternatives is viable.

On the contrary. syslog is a counterexample. While it is hardly as modular as I would like, it shows that (0) an fd-centric model works; (1) you do not need to invoke the shell or any other process, and you do not need to incur a performance hit; (2) you do not need to modify each program to understand everything that the syslogd program can. syslog has proven quite viable.

Provided that there is a message-passing facility available, and provided that it has sufficient power to pass file descriptors (which is true both under BSD's UNIX-domain sockets and under System V's streams), the syslog model will generalize to any I/O mechanism without loss of efficiency. open() can always be replaced by a write() to the facility followed by a file descriptor transfer. This is just as easy to do outside the kernel as inside the kernel; therefore it should be outside.

> To summarize:
> A unified namespace has one great advantage: new types of objects are
> immediately available to all programs -- even the programs for which
> you do not have the means or the desire to modify and recompile.

To summarize: I believe I've provided counterexamples to each of your arguments and conclusions, and so I continue to maintain that a unified namespace is pointless. There is no need to recompile any programs just to provide a new I/O mechanism.

A unified namespace has several great disadvantages:

1. It provides a competing abstraction with file descriptors, hence adding complexity to the kernel, and giving vendors two different outlets for extensions. This will result in a confused system, where some features are available only under one abstraction or the other.

2. It is not clear that all sensible I/O objects will fit into one namespace. If the precedent of a unified namespace is established now, I/O objects that don't fit will be much harder to add later.

3. A unified namespace has not been tested on a large scale in the real world, and hence is an inappropriate object of standardization at this time.

---Dan

Volume-Number: Volume 21, Number 195
peter@ficc.ferranti.com (Peter da Silva) (10/12/90)
Submitted-by: peter@ficc.ferranti.com (Peter da Silva)

In article <13441@cs.utexas.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> In article <13392@cs.utexas.edu> chip@tct.uucp (Chip Salzenberg) writes:
> > It is true that interactive use of UNIX, especially by programmers,
> > puts a lot of emphasis on the shell interface. If such an environment
> > were all there were to Unix, then Dan's fd-centric view of the world
> > could possibly be useful.
> The success of UNIX has proven how useful this ``fd-centric'' view is.

Not at all. You can equally argue that it proves how useful the "unified name space" view is, because *that* is another of the features that marks UNIX as something new. Or that it proves the "filter" concept, or any of the other things that *as a whole* go to making UNIX what it is. UNIX is synergy.

> This is also unfounded. My TCP connectors provide a counterexample to
> your hypothesis (that the shell must handle everything and hence be
> recompiled) and your conclusion (that fd-centric UNIX doesn't work).
> Any programming problem can be solved by adding a level of indirection.

OK, how do you put your TCP connectors into /etc/inittab as terminal types? Or into /usr/brnstnd/.mailrc as mailbox names? Or into any other program that expects filenames in configuration scripts (remember, not all scripts are shell scripts)?

> A unified namespace has several great disadvantages: 1. It provides a
> competing abstraction with file descriptors,

No, it adds a complementary abstraction to file descriptors. In fact, a unified name space and file descriptors together form an abstraction that is at the heart of UNIX: everything is a file. A file has two states: passive, as a file name; and active, as a file descriptor.

> This will result in a confused system, where some features are available
> only under one abstraction or the other.

Which is what you seem to be advocating.

> A unified namespace has not been tested on
> a large scale in the real world, and hence is an inappropriate object of
> standardization at this time.

I would like to suggest that UNIX itself proves the success of a unified namespace.

-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
peter@ferranti.com

Volume-Number: Volume 21, Number 200
flee@guardian.cs.psu.edu (Felix Lee) (10/16/90)
Submitted-by: flee@guardian.cs.psu.edu (Felix Lee)

>On the contrary. syslog is a counterexample. While it is hardly as
>modular as I would like, it shows that (0) an fd-centric model works;

syslog shows the limitations of an fd-centric model. B News, for example, writes log entries in the files "log" and "errlog". You cannot redirect this into syslog without modifying code. If syslog existed in the filesystem namespace, you might

	ln -s /syslog/news.info log
	ln -s /syslog/news.err errlog

or maybe even

	ln -s ~/mylog/news.err errlog

and everything would work. Why should I have to teach all my programs about syslog when I can just write to a filesystem object instead?

-- 
Felix Lee	flee@cs.psu.edu

Volume-Number: Volume 21, Number 204