jcampbell@mrfort.DEC (Jon Campbell) (08/01/85)
Well, my mailbox runneth over with mail telling me how I've struck at the heart of UNIX by suggesting file attributes. I think perhaps I have presented the problem (and its possible solution) in the wrong light.

What many users have suggested is that I put a "file header" at the beginning of each file. This seems like a reasonable approach, except that existing FORTRANs do not put such cruft at the beginning of files now. So we have a skew problem. What I was suggesting, though it might not have been clear, is an "invisible" file header, one which you look at in a slightly different way than the real data (the bytes in the file). Perhaps this could be done by using a negative byte address in the file, perhaps some other way. I'm not particularly interested in the way it might be done, except that it cannot be part of the actual data and it cannot be a separate file.

There are many such operating systems (which have file information in invisible or hidden headers) around, such as the ATEX text-processing system used in many newspapers. Ordinary programs and utilities need not ever look at the invisible header if they are interested in the data only.

I suggested that it be part of the "file information block" (i.e., the filename, creation date, and size) because that is a convenient way to have it copied transparently when you make a copy of the file or rename the file.

I am not suggesting changing the way that the vast majority of UNIX utilities and user programs currently look at files, nor suggesting any changes to them. I am suggesting that we give a data-handle, if you will, for those programs and utilities which care to use the "attributes". There is no loss of performance, no restrictions placed on file usage, and very small extra disk space used.
I think that you folks who are having a look at creating UNIX utilities which can do serious data manipulation, read magtapes from "foreign" operating systems and munge it (without having to read the ANSI magtape header files by hand), or write utilities which can look at different files without knowing a priori the file format, will recognize the problem that I am trying to address. I am not trying to "strike at the heart" of UNIX; I am letting you know that there is a problem to be solved that cannot be solved easily.

Thanks for all of your feedback. I am looking forward to more.

Thanks,
Jon Campbell

--------------------
Please note that this mail message is likely to be incomplete. The sender aborted the transmission.
rhea::MAILER-DAEMON
--------------------
guy@sun.uucp (Guy Harris) (08/02/85)
> There are many such operating systems (which have file information in
> invisible or hidden headers) around, such as the ATEX text-processing
> system used in many newspapers. Ordinary programs and utilities need
> not ever look at the invisible header if they are interested in the
> data only.
>
> I suggested that it be part of the "file information block" (i.e.,
> the filename, creation date, and size) because that is a convenient
> way to have it copied transparently when you make a copy of the file
> or rename the file.

1) There is no such "file information block", strictly speaking, on UNIX. The file name is not stored anywhere with the file (the only place the name resides is in a directory, unlike in Files-11, where one copy resides in the "file header" and other copies in the directories that reference the file) and the creation date/time isn't stored anywhere.

2) How does putting it in the "file information block" (whatever that may be in UNIX - the inode, I presume) make it "copied transparently when you make a copy of the file"? The only way I can interpret "transparently" is that any software that now copies files will automagically copy the header information without any change to that software. This is not the case. If the header isn't in the data of the file, it can't be set by doing a "write" to the file; all the UNIX copy command ("cp") does is:

    open the "from" file for reading
    open the "to" file for writing, truncating it if it exists and
        creating it if it doesn't
    while (data remains to be read from the "from" file) {
        read data from the "from" file
        write it to the "to" file
    }

Nowhere in here is there anything which could set the file type, record length, etc., etc., etc. You'd have to hang it off the "open for writing" operation (that's where the file permission modes are set now). That would require "cp" to change, and would require lots of other programs to change as well. Hardly transparent.
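
[Editorial sketch] The copy loop described above can be written out in a few lines of Python over the raw open/read/write calls. This is a minimal sketch, not the source of "cp": the name copy_bytes, the buffer size and the default mode are assumptions made for illustration. It shows that the only per-file property passed along is the permission mode given when the "to" file is opened.

```python
import os

BLOCK_SIZE = 8192  # arbitrary buffer size chosen for this sketch


def copy_bytes(src_path, dst_path, mode=0o666):
    """Copy src_path to dst_path as a plain stream of bytes (hypothetical helper)."""
    src = os.open(src_path, os.O_RDONLY)
    try:
        # The permission mode is the only property handed over at open time;
        # the kernel applies the process umask to it.
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
        try:
            while True:
                chunk = os.read(src, BLOCK_SIZE)
                if not chunk:              # an empty read means end of file
                    break
                view = memoryview(chunk)
                while view:                # os.write may write fewer bytes than asked
                    written = os.write(dst, view)
                    view = view[written:]
        finally:
            os.close(dst)
    finally:
        os.close(src)
```

Nothing in the loop sees anything but data bytes, so any attribute kept outside the data would need an extra step (or a changed open call) to travel with the copy.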
> I am not suggesting changing the way that the vast majority of UNIX
> utilities and user programs currently look at files, nor suggesting
> any changes to them. I am suggesting that we give a data-handle, if
> you will, for those programs and utilities which care to use the
> "attributes". There is no loss of performance, no restrictions placed
> on file usage, and very small extra disk space used.

Wrong. If, for example, you write files with "FORTRAN carriage control" differently from UNIX text files (with embedded ASCII control characters for carriage control), current UNIX utilities will not be able to read those files, *unless you change them* - which you say you are not suggesting.

> I think that you folks who are having a look at creating UNIX utilities
> which can do serious data manipulation,

Plenty of UNIX utilities can do that already.

> read magtapes from "foreign" operating systems and munge it (without
> having to read the ANSI magtape header files by hand),

Such programs exist for UNIX - yes, they have to read the magtape header by hand, but so what? Unless you modified "grep", you couldn't do

    grep mumble /dev/mt0/frobozz.c

(or however you'd have "grep" read file "frobozz.c" on a magtape) without changing "grep" to understand the ANSI record format. Even if you did modify "grep" (and the operating system, so that you could treat a magtape as a file-structured device using ANSI labels), you probably wouldn't want to. You'd probably want to extract the file first - using a program to extract files from a tape; that program would be the only program on the whole system which had to know anything about ANSI labels, etc.

> or write utilities which can look at different files without knowing a
> priori the file format,

Why would you want a utility that could work on text files and FORTRAN binary files? What operations on such files (other than copy, move, etc.) would be common to both kinds of files?
You hardly want to print a FORTRAN binary file the same way you print a text file, or scan through a FORTRAN binary file with "grep", or... Nor would you want to be able to feed a text file to a FORTRAN program that expects binary files (I doubt you can do that with VMS or any other operating system, either).

Most of the UNIX programs that use "simple" access methods (i.e., reading byte streams) have no interest in reading anything but text files. The other programs read structured files through a user-mode I/O package; that package would have no problem reading a file header placed at the beginning of the file. "cp", since it copies bytes, not records, would copy those structured files or any other collection of bytes you want to put into a file; the same holds true for "tar", "cpio", etc. No program which expects to read text files would be likely to want to read a structured file like that.

As for FORTRAN vs. ASCII carriage control, seems to me I remember a DEC operating system called RT-11 which used ASCII carriage control for all its text files, and it seemed to support FORTRAN...

In short, lots of us who *are* familiar with FORTRAN files and ANSI tapes do *not* recognize UNIX as having any of the problems you're talking about - but all this has been said before; you've provided no new arguments in favor of adding attributes like that to UNIX files.

    Guy Harris
jss@sjuvax.UUCP (J. Shapiro) (08/05/85)
Mr. Campbell has one point, at least, which should not be ignored. UNIX is badly in need of some sort of semaphore structure for use between processes which do not know about each other. Without this facility, it would be very difficult to write a library which could provide reliable record locking, which is one of the facilities he needed which is sorely lacking in current UNIX. It seems to me, after admittedly very little thought, that one of two things is needed:

1) some semaphore facility which would have a namespace which would allow owner/group/world read/write privileges. This would actually be generally useful, and current file primitives do not provide this facility in a resource-efficient or reliable manner.

2) a block level lock on a file, either physical block or logical block, preferably physical.

The first facility, I believe, is to be greatly preferred. Please, arguments about using pseudo devices or files in the file system or pipes/sockets/wombats-carrying-postcards don't wash. These are neither resource-efficient nor portable. The semaphore facilities necessary are not hard to implement (I have done them myself on other systems), and would help a great deal in solving many problems of record access, which contrary to popular opinion in UNIX land constitutes a great deal of what is done out in the real business world.

To my knowledge, all of the database systems providing for reliable record access do this by circumnavigating UNIX, which seems to me to be a bit of a waste.

I enjoy using UNIX as a development environment, and I believe that 99% of its ideas are in theory right, but it has a few shortcomings. Others have noticed the process synchronization shortcomings. Has anyone done anything about them?

Jon Shapiro
Haverford College
mjs@eagle.UUCP (M.J.Shannon) (08/05/85)
> Mr. Campbell has one point, at least, which should not be ignored. UNIX is
> badly in need of some sort of semaphore structure for use between
> processes which do not know about each other. Without this facility,
> it would be very difficult to write a library which could provide
> reliable record locking, which is one of the facilities he needed
> which is sorely lacking in current UNIX. It seems to me, after admittedly
> very little thought, that one of two things is needed:
>
> 1) some semaphore facility which would have a namespace which
> would allow owner/group/world read/write privileges.
> This would actually be generally useful, and current file
> primitives do not provide this facility in a resource
> efficient or reliable manner.

System V has just such a semaphore facility. It also has shared memory and messages to allow processes to bind themselves to each other and cooperate even more closely.

> 2) a block level lock on a file, either physical block or logical
> block, preferably physical.

System Vr2 (I'm almost certain) has advisory file locking. I don't have the documentation handy, but it may allow the user to specify file addresses to be locked. While this is not mandatory locking (i.e., no processes will block on reads or writes due to a lock), cooperating processes can prevent themselves from stepping on each other's data with these locks.

> The first facility, I believe, is to be greatly preferred. Please,
> arguments about using pseudo devices or files in the file system or
> pipes/sockets/wombats-carrying-postcards don't wash. These are
> neither resource efficient nor portable. The semaphore facilities
> necessary are not hard to implement (I have done them myself on other
> systems), and would help a great deal in solving many problems of
> record access, which contrary to popular opinion in UNIX land
> constitutes a great deal of what is done out in the real business world.
>
> To my knowledge, all of the database systems providing for reliable
> record access do this by circumnavigating UNIX, which seems to me to
> be a bit of a waste.
>
> I enjoy using UNIX as a development environment, and I believe that
> 99% of its ideas are in theory right, but it has a few shortcomings.
> Others have noticed the process synchronization shortcomings. Has
> anyone done anything about them?
>
> Jon Shapiro
> Haverford College

Flame on (medium-well):

What? That famous university-developed system doesn't support any IPC? No locks? No semaphores? No shared memory? No messages? Gee.... No!

AT&T: The Right Choice; System V: The Right UNIX* System.

* - UNIX is a trademark of AT&T. It is *not* a trademark of the Regents of California.

Flame reduced to pilot light.
--
Marty Shannon
UUCP: ihnp4!eagle!mjs
Phone: +1 201 522 6063
Warped people are throwbacks from the days of the United Federation of Planets.
wcs@ho95e.UUCP (x0705) (08/05/85)
Jon Shapiro (> >) asked about semaphores and record locking on UNIX. Marty Shannon (>) pointed out that System V has semaphores and shared memory, and that:

> System Vr2 (I'm almost certain) has advisory file locking. I don't have the
> documentation handy, but it may allow the user to specify file addresses to be
> locked. While this is not mandatory locking (i.e., no processes will block on
> reads or writes due to a lock), cooperating processes can prevent themselves
> from stepping on each other's data with these locks.

Actually, it's the paging release (System V Rel 2, Vax Ver 2, 3B20 Ver 4, 3B2 Ver ???). I think mandatory locking is supposed to be in Sys V Rel 3.

>
> > The first facility [semaphores], I believe, is to be greatly preferred. Please,
> > arguments about using pseudo devices or files in the file system or
> > pipes/sockets/wombats-carrying-postcards don't wash. These are
> > neither resource efficient nor portable. The semaphore facilities
> > necessary are not hard to implement (I have done them myself on other
> > systems), and would help a great deal in solving many problems of
> > record access, which contrary to popular opinion in UNIX land
> > constitutes a great deal of what is done out in the real business world.

The SysVR2v* functions were developed to phase in support for the /usr/group standard, which was put together by UNIX users out in the "real world".

(Actually, wombats carrying postcards can be quite efficient in a distributed environment. They're somewhat more portable than TCP/IP, and continue working if the power goes down.)

(Trademarks and owners include: { UNIX, Vax, DEC, 3B**, AT&T-**, Wombat Inc.} )
--
## Bill Stewart, AT&T Bell Labs, Holmdel NJ 1-201-949-0705 ihnp4!ho95c!wcs
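
[Editorial sketch] Advisory locking of a range of file addresses, as described in the quoted paragraph, can be illustrated with Python's fcntl.lockf, which wraps the fcntl()-style record locks of later System V and POSIX systems. This is a sketch under assumptions: RECORD_SIZE and the helper names are hypothetical, and it is not the System V release 2 interface itself. The locks are advisory, so a process that never asks for the lock can still read and write the bytes.

```python
import fcntl
import os

RECORD_SIZE = 81  # hypothetical fixed record length, for illustration only


def try_lock_record(fd, index):
    """Try to lock one record exclusively without blocking; True on success."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB,
                    RECORD_SIZE, index * RECORD_SIZE, os.SEEK_SET)
        return True
    except OSError:
        return False


def unlock_record(fd, index):
    """Release the advisory lock on one record."""
    fcntl.lockf(fd, fcntl.LOCK_UN, RECORD_SIZE, index * RECORD_SIZE, os.SEEK_SET)


def other_process_can_lock(path, index):
    """Fork a child that tries the same lock; report whether it got it."""
    pid = os.fork()
    if pid == 0:
        status = 2                         # 2 means the child failed unexpectedly
        try:
            fd = os.open(path, os.O_RDWR)
            status = 0 if try_lock_record(fd, index) else 1
        finally:
            os._exit(status)               # never return into the parent's code
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0
```

These locks belong to a process, not to a file descriptor, which is why the check above has to be made from a second process: one process can always re-lock its own records.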
preece@ccvaxa.UUCP (08/06/85)
> 1) There is no such "file information block", strictly speaking, on
> UNIX.
----------
There's a lot of stuff in the inode that looks an awful lot like a file information block. [It would be cute if there were more room in the directory entry -- then we could have separate attribute lists for each link to a file...]
----------
> 2) How does putting it in the "file information block" (whatever that
> may be in UNIX - the inode, I presume) make it "copied transparently
> when you make a copy of the file"?
----------
Obviously, it doesn't. On the other hand, except for dump/restore, it would be sufficient to have open(2) create an empty one. Tools to manipulate it could be handled separately (after you cp the file you cp_file_properties to get the attribute list), though it wouldn't be a big deal to make cp and mv handle them, too, and that is what a vendor who was going to do this would do.
----------
> If, for example, you write files with "FORTRAN carriage control"
> differently from UNIX text files (with embedded ASCII control characters
> for carriage control), current UNIX utilities will not be able to read
> those files, *unless you change them* - which you say you are not
> suggesting.
----------
Why not? The utilities may not deal with them intelligently or in the way intended, but the files themselves would still be just streams of data bytes, which they would NOT be if you put the header in the file. Grep, for instance, would find nothing confusing in a file with a hidden property list, but would find a confusing header if the header were embedded in the file (confusing in the sense of "containing stuff other than data" -- it would still be able to process either kind of file).
----------
> > or write utilities which can look at different files without knowing a
> > priori the file format,
> Why would you want a utility that could work on text files and FORTRAN
> binary files?
----------
You're mis-interpreting "format."
A utility might need to deal with 132 byte records and 80 byte records or with files having carriage control and files not having carriage control. There are also, of course, a lot of useful things you could put in a property list for use by maintenance programs (such as, perhaps, a more neatly integrated version control system).
----------
> No program which expects to read text files would be likely to want to
> read a structured file like that.
----------
There are structured files which still have perfectly normal Unix text file characteristics (a file, for instance, of 81-byte records containing a newline in byte 80 of each record). The embedded header approach makes a number of things more difficult, including random access (offsets have to account for the header) and use with normal Unix utilities (a filter would be needed before using, for instance, grep; multi-file commands (such as merge) would need to have temporaries prepared, since you couldn't provide filtering on more than one file through a pipe).
----------
> but all this has been said before; you've provided no new arguments in
> favor of adding attributes like that to UNIX files.
----------
I'm not holding my breath. I think they would be useful and would help sell into a few new markets, but I don't think we can't live without them.
--
scott preece
gould/csd - urbana
ihnp4!uiucdcs!ccvaxa!preece
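
[Editorial sketch] The random-access point above can be made concrete with a little arithmetic. A minimal sketch, assuming the 81-byte records from the example; the header length and both function names are hypothetical. With a hidden property list, header_len is 0 and record N starts at N * 81; with an embedded header, every reader has to know the header length and add it.

```python
RECORD_LEN = 81  # 80 data bytes plus a newline, as in the example above


def record_offset(index, header_len=0):
    """Byte offset of record `index`; header_len is 0 when nothing is embedded."""
    return header_len + index * RECORD_LEN


def read_record(path, index, header_len=0):
    """Read one fixed-length record by seeking straight to it."""
    with open(path, "rb") as f:
        f.seek(record_offset(index, header_len))
        return f.read(RECORD_LEN)
```

A program that does not know about an embedded header gets records that straddle the real record boundaries, which is the breakage described above.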
sean@ukma.UUCP (Sean Casey) (08/06/85)
In article <1238@sjuvax.UUCP> jss@sjuvax.UUCP (J. Shapiro) writes:
>Mr. Campbell has one point, at least, which should not be ignored. UNIX is
>badly in need of some sort of semaphore structure for use between
>processes which do not know about each other. Without this facility,
>etc...

>To my knowledge, all of the database systems providing for reliable
>record access do this by circumnavigating UNIX, which seems to me to
>be a bit of a waste.

For real. The ingres we run here has a daemon running all the time just so it can be guaranteed atomic locks! Kludge city.
--
- Sean Casey                      UUCP: sean@ukma.UUCP or
- Department of Mathematics             {cbosgd,anlams,hasmed}!ukma!sean
- University of Kentucky          ARPA: ukma!sean@ANL-MCS.ARPA
sean@ukma.UUCP (Sean Casey) (08/06/85)
In article <1311@eagle.UUCP> mjs@eagle.UUCP (M.J.Shannon) writes:
>What? That famous university-developed system doesn't support any IPC? No
>locks? No semaphores? No shared memory? No messages? Gee.... No!

What? That big corporation-developed system doesn't have TCP/IP? No sockets? No symbolic links? No cp -r? No C-shell? Geee.... No!
--
- Sean Casey                      UUCP: sean@ukma.UUCP or
- Department of Mathematics             {cbosgd,anlams,hasmed}!ukma!sean
- University of Kentucky          ARPA: ukma!sean@ANL-MCS.ARPA
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (08/07/85)
> I enjoy using UNIX as a development environment, and I believe that
> 99% of its ideas are in theory right, but it has a few shortcomings.
> Others have noticed the process synchronization shortcomings. Has
> anyone done anything about them?

Yes, AT&T has semaphores, message queues, and record locking. Please don't implement a similar yet different facility; we have too many of those already.
guy@sun.uucp (Guy Harris) (08/08/85)
> What? That famous university-developed system doesn't support any IPC? No
> locks? No semaphores? No shared memory? No messages? Gee.... No!

No shared memory, no semaphores - (currently) true.

No IPC, no messages - go forth and read:

    SOCKET(2)
    BIND(2)
    LISTEN(2)
    ACCEPT(2)
    CONNECT(2)
    SEND(2)
    RECV(2)

in your 4.2BSD manuals, then try again. (Unlike those in a certain famous USDL-developed system, they work over a network. Wait a minnit, somebody's saying they don't work over a network because nothing other than file transfer and remote command execution works over a network in that system. Gee.... No!)

No locks - no record locks, but for file locks, go forth and read:

    FLOCK(2)

and try again.

Moral: flaming about the general unworthiness of some UNIX system other than your favorite generally causes singed eyebrows and little illumination. (This goes for you Berkeleyphiles - and even Researchphiles - out there, too.)

    Guy Harris
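
[Editorial sketch] The calls named in the manual pages above can be walked through once using Python's socket and fcntl modules, which wrap the same system calls. This is a minimal sketch: the function names are made up for illustration, and the caller supplies the address family and address (a UNIX-domain path keeps everything on one machine, an Internet address goes over a network). The second helper shows the whole-file advisory lock of flock.

```python
import fcntl
import socket


def round_trip(family, address, payload):
    """socket, bind, listen, connect, accept, send, recv: one pass, one process.

    Keep payload small: a single process plays both ends, so the bytes must
    fit in the kernel's socket buffer.
    """
    server = socket.socket(family, socket.SOCK_STREAM)
    try:
        server.bind(address)
        server.listen(1)
        client = socket.socket(family, socket.SOCK_STREAM)
        try:
            client.connect(server.getsockname())  # queued until accept() below
            conn, _peer = server.accept()
            try:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)   # tell the receiver we are done
                received = b""
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:                 # empty read: sender shut down
                        break
                    received += chunk
                return received
            finally:
                conn.close()
        finally:
            client.close()
    finally:
        server.close()


def try_flock(fileobj):
    """Try a whole-file exclusive advisory lock without blocking; True on success."""
    try:
        fcntl.flock(fileobj, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
```

Called as round_trip(socket.AF_INET, ("127.0.0.1", 0), b"hi") the same code runs over TCP, with port 0 leaving the choice of port to the kernel.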
chris@gargoyle.UUCP (Chris Johnston) (08/09/85)
Someone write a utility to convert to and from the column one format control and end this discussion.
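
[Editorial sketch] One direction of such a utility fits in a few lines. This is a minimal sketch, assuming the usual column-one conventions (blank: single space, "0": double space, "1": new page, "+": overprint) and treating any other character as a blank; the function name is made up, and the reverse conversion is not attempted here.

```python
def fortran_to_ascii(lines):
    """Turn column-one carriage control into embedded ASCII control characters.

    `lines` are records without trailing newlines; the first character of
    each is the control character and is removed from the output.
    """
    out = []
    for i, line in enumerate(lines):
        control, text = (line[:1] or " "), line[1:]
        if control == "1":
            out.append("\f")                      # advance to a new page
        elif control == "0":
            out.append("\n\n" if i else "\n")     # leave one blank line first
        elif control == "+":
            out.append("\r")                      # overprint the previous line
        else:
            out.append("\n" if i else "")         # ordinary single spacing
        out.append(text)
    out.append("\n")
    return "".join(out)
```

The result is an ordinary byte stream, so the usual text tools and printer filters can take it from there.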
lasse@daab.UUCP (Lars Hammarstrand) (08/14/85)
In article <2030@ukma.UUCP> sean@ukma.UUCP (Sean Casey) writes:
>In article <1311@eagle.UUCP> mjs@eagle.UUCP (M.J.Shannon) writes:
>>What? That famous university-developed system doesn't support any IPC? No
>>locks? No semaphores? No shared memory? No messages? Gee.... No!
>
>What? That big corporation-developed system doesn't have TCP/IP? No
>sockets? No symbolic links? No cp -r? No C-shell? Geee.... No!
>
>
>--
>
>- Sean Casey UUCP: sean@ukma.UUCP or
>- Department of Mathematics {cbosgd,anlams,hasmed}!ukma!sean
>- University of Kentucky ARPA: ukma!sean@ANL-MCS.ARPA

Why don't you look at the UniPlus+ port of SysV? There you have everything you need, and then you won't want to run anything else on your machine!

Lars Hammarstrand.
Datorisering AB, Stockholm, SWEDEN.
UUCP: {seismo,decvax,philabs}!{mcvax,ukc,unido}!enea!daab!lasse
ARPA: decvax!mcvax!enea!daab!lasse@berkley.ARPA
      decvax!mcvax!enea!daab!lasse@seismo.ARPA
alexis@reed.UUCP (Alexis Dimitriadis) (08/14/85)
> There's a lot of stuff in the inode that looks an awful lot like a file
> information block. [It would be cute if there were more room in the
> directory entry -- then we could have separate attribute lists for
> each link to a file...]

As someone pointed out, a small amount of file "attributes" is currently being kept on the directory entry itself -- as a suffix to the filename. Other programs use specially named files, or system calls for lock management, etc. Biff uses the owner-execute-permission bit of the terminal name as a flag. (ugh).

I am a big fan of UNIX, but I have often felt that _some_ place to keep user-defined, out-of-band information about a file should be provided, say, a field in the inode explicitly set aside for user-defined information. Something could be worked out about who should be able to modify it, and programs that care to could set their output to have the "attributes" of their input (fstat could be made to return the relevant info on pipes).

The "attributes" could or could not be copied when copying a file. cp has to explicitly set the permission mode of the new file, after all. More serious would be namespace problems, when more than one family of applications tries to use the field.

Please do not flame, I have been carefully following the discussion on the "file information block", and I have never supported it anyway. :-) If I am missing something too, I would like to know.
--
_______________________________________________
As soon as I get a full time job, the opinions expressed above will attach themselves to my employer, who will never be rid of them again.

alexis @ reed
        ...teneron! \
...seismo!ihnp4! - tektronix! - reed.UUCP
        ...decvax! /
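
[Editorial sketch] The trick of borrowing a permission bit as an out-of-band flag, mentioned above for Biff, looks roughly like this in Python. This is a sketch of the idea only, not Biff's actual code, and the function names are made up. It also shows the namespace problem in miniature: there is one such bit per file, so two applications cannot both use it.

```python
import os
import stat


def set_flag(path, on):
    """Turn the owner-execute bit on or off, leaving the other mode bits alone."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    mode = (mode | stat.S_IXUSR) if on else (mode & ~stat.S_IXUSR)
    os.chmod(path, mode)


def flag_is_set(path):
    """Report whether the owner-execute bit is currently set."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```

Since the flag lives in the inode rather than in the data, a byte-for-byte copy of the file does not carry it along unless the copying program sets the mode explicitly.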
john@genrad.UUCP (John P. Nelson) (08/15/85)
>>>What? That famous university-developed system doesn't support any IPC? No
>>>locks? No semaphores? No shared memory? No messages? Gee.... No!
>>
>>What? That big corporation-developed system doesn't have TCP/IP? No
>>sockets? No symbolic links? No cp -r? No C-shell? Geee.... No!
>>
>
>Why don't you look at the UniPlus+ port of SysV? There you have everything you
>need, and then you won't want to run anything else on your machine!

Not really. It doesn't support symbolic links. Or select() using any file descriptors other than a network socket. Unix domain sockets are not supported. cp -r isn't there either.

Actually, the TCP/IP is an expensive option - as normally distributed, all the functions like socket(), connect(), etc. return an error (Unimplemented system call or something like that). Oh, and some of the TCP/IP socket function interfaces are slightly different than 4.2 (I can't recall the specifics right now).

Oh, it DOES have C-shell. However, the kernel does NOT recognize "#!", which limits the usefulness of csh scripts.

I would rather have Berkeley 4.2, but we really need shared memory.

John Nelson (a UniPlus+ System V user)
peter@baylor.UUCP (Peter da Silva) (08/17/85)
> >>>What? That famous university-developed system doesn't support any IPC? No
> >>>locks? No semaphores? No shared memory? No messages? Gee.... No!
> >>
> >>What? That big corporation-developed system doesn't have TCP/IP? No
> >>sockets? No symbolic links? No cp -r? No C-shell? Geee.... No!
> >>
> >
> >Why don't you look at the UniPlus+ port of SysV? There you have everything you
> >need, and then you won't want to run anything else on your machine!
>
> Not really. It doesn't support symbolic links. Or select() using any

Good to see I'm not the only person willing to flame on this. At least I got the message that SV-ists are just as refractory as creationists. I still want job control and symbolic links, you all hear?
--
Peter da Silva (the mad Australian werewolf)
UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
MCI: PDASILVA; CIS: 70216,1076
lasse@daab.UUCP (Lars Hammarstrand) (08/19/85)
> . .
> . .
> . .
>option - as normally distributed, all the functions like socket(), connect(),
>etc. return an error (Unimplemented system call or something like that).
>Oh, and some of the TCP/IP socket function interfaces are slightly different
>than 4.2 (I can't recall the specifics right now)

Eh? What machine are you running on? I'm just going to Germany to look at 2 machines running B-net programs on a Cromemco: rcp, rstat .. you know!

>Oh, it DOES have C-shell. However, the kernel does NOT recognize "#!",
>which limits the usefulness of csh scripts.

True, but not the whole truth: it starts up 'sh' as soon as it finds a "#" at the beginning of a [*]shell script. And it has a Tcsh too (if you have a source licence for csh).

BTW: is it really the kernel that recognizes the "#!" sequence? (just wondering!)

>
>I would rather have berkeley 4.2, but we really need shared memory.
>
>John Nelson (a UniPlus+ System V user)

Ok, I believe you, and I don't want to start a war between different UNIX systems, because at bottom they are all *UNIX* systems. I just wanted to say that System V is not as bad as many people told me 2 years ago.

Lars Hammarstrand.
Datorisering AB, Stockholm, SWEDEN.
UUCP: {seismo,decvax,philabs}!{mcvax,ukc,unido}!enea!daab!lasse
ARPA: decvax!mcvax!enea!daab!lasse@berkley.ARPA
      decvax!mcvax!enea!daab!lasse@seismo.ARPA

Ps: I'm only in it for the *UNIX* ! Ds