samperi@marob.MASA.COM (Dominick Samperi) (09/13/88)
Can people who have had experience working with both VMS files (at the FDL level) and UNIX files (at the inode level, say) comment on the advantages and disadvantages of the file systems used by these operating systems? My experience is mostly with the UNIX file system, so I was a little surprised when I discovered recently that VMS text files, object code files, and executable files all have different record structures. What does the added complexity of having to deal with RMS, FDL, CONVERT, etc., buy? -- Dominick Samperi, NYC samperi@acf8.NYU.EDU samperi@marob.MASA.COM cmcl2!phri!marob uunet!hombre!samperi (^ ell)
dave@arnold.UUCP (Dave Arnold) (09/13/88)
samperi@marob.MASA.COM (Dominick Samperi) writes:
> [...] stuff deleted
> ...comment on the
> advantages and disadvantages of the file systems used by these operating
> systems? My experience is mostly with the UNIX file system, so I was a
> little surprised when I discovered recently that VMS text files, object
> code files, and executable files all have different record structures.
> What does the added complexity of having to deal with RMS, FDL, CONVERT,
> etc., buy?

The VMS file system doesn't buy you anything, unless your application requires ISAM---However, how often do you need ISAM?

I think the VMS filesystem is overly complicated, and one of the major downfalls of VMS (but can be tolerated). If the original DEC designers had it to do over again, I suspect they would have stuck with a Stream-only based filesystem (Like UNIX), and provided ISAM libraries. The FORTRAN record format, FIXED SIZE RECORDS, VARIABLE LENGTH, CARRIAGE RETURN CARRIAGE CONTROL... Oh, don't forget the VFC record format... These are all completely archaic, and date the VMS operating system. I feel very strongly about this. Anyone disagree?

VMS's strengths? AST's, Timer queues, condition handling, exit handling, message facility. In regard to the above, VMS was way ahead of its time circa 1978, and life would be difficult without them.

Other VMS pitfalls? The resource quota system!!!!!!! How often have you written a program, and got the famous:

%SYSTEM-F-EXCEEDED QUOTA

message? Isn't it fun trying to figure out which bloody quota was exceeded?! Stupid!
--
Dave Arnold dave@arnold.UUCP {cci632|uunet}!ccicpg!arnold!dave
bzs@encore.UUCP (Barry Shein) (09/13/88)
>What does the added complexity of having to deal with RMS, FDL, CONVERT,
>etc., buy?
>--
>Dominick Samperi, NYC

There are plusses and minuses in both approaches.

The intention of formalizing a bunch of file access methods is to put the code wherever the vendor (designer) believes it will do the most good. For example, by knowing you have promised to access some file only sequentially it can be stored in a manner optimal for that usage. Similarly, an indexed file can have its read methods set up, perhaps maintaining two separate caches (indices, data), for optimal access.

It also means that you go through some standard set of routines with a standard set of assumptions (eg. I can open an ISAM file, knowing a few things about it, w/o asking for details about how it's stored. If one builds their own ISAM file into a bag-of-bytes file it may not be at all obvious how to read it without access to the original program which wrote it.)

The downside is that these access methods tend to get used. What I mean is, used unnecessarily where bag-of-bytes files would do just fine and cause much less confusion.

For example, on an earlier release (probably 1.6) of VMS I wanted to edit a file produced by RUNOFF (to do a few global changes so underlining or some such would print properly on my printer.) Not as easy as it sounded: EDT refused to load this print file for editing, complaining about an illegal file type.

One could point the finger at EDT and say it was deficient in not handling enough file formats, but I tend to think that barring super-human effort it was inherent in the design environment: it would be hard to properly edit every file type that was allowed (last I checked CONVERT still couldn't perform some reasonable-looking conversions.) I believe TECO did the job fine, but I was pretty shocked at not being able to edit this fairly plain looking text file.
It wasn't the *data* which was preventing loading this into EDT (as with, say, trying to load an a.out into VI which wouldn't work too well either, but for a different reason), it was merely a bit somewhere identifying this as a print file or some such nonsense, so EDT kicked it out without trying. Such problems were ubiquitous (at least it always seemed like someone was coming to me trying to work around a similar problem, utilities wouldn't cooperate.)

Under IBM systems with a similar record oriented philosophy I remember real panic if we couldn't find the original parameters under which a file was created. It basically couldn't be opened anymore unless you could produce the right magic numbers it was created with (blocking factors etc.) I'm sure some wizardly types could have solved that directly but it sure wasn't obvious to us, other than guessing numbers and paying real money to watch perhaps dozens of tries go down the drain and feeling kind of foolish and seriously out of control.

The problem with the Unix "unstructured" approach is that either you use some of the (very few) library routines (dbm is a major one, so are the object deck readers in SYSV) or you roll your own, so each application will have its own way of storing data (compare termcap with passwd with inittab with crontab with ...), often not terribly well documented or efficient (agreed, often efficiency is a poor excuse for obscurity.)

It's all a balancing act. In my ideal world there would be a variety of standardized access methods and you would avoid using them like the plague, especially in general system utilities. Simple byte-stream files should account for most input and output (a la Unix), but for those occasional, carefully justified problems, access methods could be resorted to. Also, the operating system would know as little about them as possible (eg. opening any file as a byte-stream would do something reasonable, *never* return an error.)

-Barry Shein, ||Encore||
dave@arnold.UUCP (Dave Arnold) (09/14/88)
In article <3597@encore.UUCP>, bzs@encore.UUCP (Barry Shein) writes:
>
> What I mean is, used unnecessarily where bag-of-bytes files would do
> just fine and cause much less confusion.

Exactly.

> For example, on an earlier release (probably 1.6) of VMS I wanted to
> edit a file produced by RUNOFF (to do a few global changes so
> underlining or some such would print properly on my printer.) Not as
> easy as it sounded, EDT refused to load this print file for editing,

EDT still gives a warning about files created with VAXC. Dumb!

> The problem with the Unix "unstructured" approach is that either you
> use some of the (very few) library routines (dbm is a major one, so
> are the object deck readers in SYSV) or you roll your own, each
> application will have its own way of storing data (compare termcap
> with passwd with inittab with crontab with ...) often not terribly
> well documented or efficient (agreed, often efficiency is a poor
> excuse for obscurity.)

This is not a problem. It's not often that your application requires you to "Roll your own". And you get a very simple filesystem. When you try to design a filesystem that will attempt to please everyone under all circumstances, you overbuild---A real mess. Anyone try tuning an RMS ISAM file? Some pretty spiffy analysis tools :-,

> It's all a balancing act.

Tightrope.

> -Barry Shein, ||Encore||

I appreciate your points, Barry, but don't agree.
--
Dave Arnold dave@arnold.UUCP {cci632|uunet}!ccicpg!arnold!dave
jbw@bucsb.UUCP (Joe Wells) (09/15/88)
In article <178@arnold.UUCP> dave@arnold.UUCP (Dave Arnold) writes:
>VMS's strengths?
>AST's, Timer queues, condition handling, exit handling, message
 ^^^^^                ^^^^^^^^^^^^^^^^^^
>facility.

ASTs are for me VMS's greatest advantage. The ability to have multiple system calls outstanding at one time is a godsend for realtime control systems. Running a separate process for each blocking task and using IPC just doesn't cut it.

Stack unwinding is really nice too. The ability in LISP to abort instruction sequences with "throw" and have everything clean itself up as the stack is unwound is *very* powerful. In addition, you can post your own unwinding cleanup instructions. Under VMS, you can do this in *any* language. The operating system and the VMS procedure calling standard provide for generic stack unwinding.

Directory links under VMS are not necessary for a file to exist. Under UNIX, when all the links to a file disappear, and all processes close the file, the file is deleted. In VMS, a file can exist without a name. It can be accessed by its unique file identifier.

In addition, the problem of dangling directory links to deleted files does not exist. When the VMS equivalent of the UNIX inode is reused, a counter in the index table slot is incremented. Thus any dangling pointers to the previous file that used the same slot won't have any effect.

I would be much happier with the UNIX environment if it supported these features, but then if money grew on trees, I probably couldn't climb them. I'm also not trying to imply that VMS doesn't have more than its own share of ridiculously stupid features.

>Other VMS pitfalls?
>The resource quota system!!!!!!!
>How often have you written a program, and got the famous:
>%SYSTEM-F-EXCEEDED QUOTA
>message? Isn't it fun trying to figure out which bloody quota
>was exceeded?! Stupid!

Good lord! Don't remind me of this! What a royal pain in the *ss!

>--
>Dave Arnold
>dave@arnold.UUCP {cci632|uunet}!ccicpg!arnold!dave

Joe Wells
UUCP: ...!harvard!bu-cs!bucsf!jbw
INTERNET: jbw@bucsf.bu.edu
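The slot-reuse counter Joe describes can be sketched in a few lines of C. This is only an illustration of the idea, not the actual VMS on-disk layout: the table size `NSLOTS`, the struct names, and the functions `file_create`, `file_delete`, and `file_id_valid` are all made up for the example. A file identifier carries both a slot number and the sequence number the slot had when the file was created, so a stale identifier stops matching once the slot is reused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSLOTS 8            /* hypothetical, tiny index table */

/* An identifier names a slot AND the sequence number seen at creation. */
struct file_id {
    uint16_t slot;
    uint16_t seq;
};

struct index_slot {
    uint16_t seq;           /* incremented every time the slot is (re)used */
    int      in_use;
};

static struct index_slot index_table[NSLOTS];

/* Claim the first free slot. Returns 0 on success, -1 if the table is full. */
int file_create(struct file_id *out)
{
    assert(out != NULL);
    for (uint16_t i = 0; i < NSLOTS; i++) {
        if (!index_table[i].in_use) {
            index_table[i].in_use = 1;
            index_table[i].seq++;           /* invalidates older identifiers */
            out->slot = i;
            out->seq = index_table[i].seq;
            return 0;
        }
    }
    return -1;
}

/* An identifier is valid only if the slot is live and the counters agree. */
int file_id_valid(struct file_id id)
{
    return id.slot < NSLOTS
        && index_table[id.slot].in_use
        && index_table[id.slot].seq == id.seq;
}

/* Free the slot; a stale or already-deleted identifier is ignored. */
void file_delete(struct file_id id)
{
    if (file_id_valid(id))
        index_table[id.slot].in_use = 0;
}
```

A directory entry that still holds the old identifier after the slot has been reused simply fails the `file_id_valid` check instead of reaching the new file. The sketch ignores counter wrap-around.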
bzs@encore.UUCP (Barry Shein) (09/16/88)
Last things first...

>> It's all a balancing act.
>
>Tightrope.
>
>> -Barry Shein, ||Encore||
>
>I appreciate your points, Barry, but don't agree.
>--
>Dave Arnold

Not sure what you don't agree with, I assume it's the following:

>> The problem with the Unix "unstructured" approach is that either you
>> use some of the (very few) library routines (dbm is a major one, so
>> are the object deck readers in SYSV) or you roll your own, each
>> application will have its own way of storing data (compare termcap
>> with passwd with inittab with crontab with ...) often not terribly
>> well documented or efficient (agreed, often efficiency is a poor
>> excuse for obscurity.)
>
>This is not a problem. It's not often that your application requires
>you to "Roll your own". And you get a very simple filesystem.
>When you try to design a filesystem that will attempt to please
>everyone under all circumstances, you over build---A real mess.

It's a problem if you have the problem. "It's not often" might be true in your world, I doubt you could convince the people I know trying to store their library catalogues (eg) that efficient keyed storage and lookup is an uncommon problem. Or business types trying to keep payroll or customer lists etc.

I agree it's hard to design a general filing system which pleases everyone. I'm not sure it's a law of nature that one cannot. In fact, Unix might be quite close, just missing some application level standards in regards to file storage libraries (from which, perhaps, interested people could investigate tuning the system a little, the buffer cache probably does most of what they want anyhow.)

-Barry Shein, ||Encore||
dave@arnold.UUCP (Dave Arnold) (09/17/88)
In article <3613@encore.UUCP>, bzs@encore.UUCP (Barry Shein) writes:
>
> Last things first...
>
> >> It's all a balancing act.
> >
> >Tightrope.

I should have added a bunch of :-) to my original followup. It seems there is a bitter feeling towards my posting. I didn't intend to cause such feelings. I will be more careful in the future.

My apologies.

Signed,

Dave (egg on my face) Arnold
--
Dave Arnold dave@arnold.UUCP {cci632|uunet}!ccicpg!arnold!dave
jeh@crash.cts.com (Jamie Hanrahan) (09/18/88)
In article <3597@encore.UUCP> bzs@encore.UUCP (Barry Shein) writes:
[much good stuff...]
>
>It's all a balancing act. In my ideal world there would be a variety
>of standardized access methods and you would avoid using them like the
>plague, especially in general system utilities, simple byte-stream
>files should account for most input and output (a la Unix), ...

I disagree. I much prefer VMS's variable-length-record text file format to Unix's byte-stream. Why? Because the Unix byte stream uses perfectly legitimate data as a record separator. To make matters worse, the standard C method for dealing with strings uses a *different* character as a string terminator! Unix has a lot of GREAT ideas in it, but this isn't one of them.

Barry goes on to say that you should be able to open any file as a byte stream and not get an error. Well, you can do the equivalent under VMS-- you can open any file, sequential, relative, or indexed, for sequential access, and RMS will happily hand you the records in order (in order by primary key if it's an indexed file). And if you prefer a byte-stream rather than a record-oriented interface (and, yes, the byte-stream i/f has GREAT advantages from a program style standpoint; non-believers, particularly those who have never looked inside Unix utilities, should take a look at Kernighan and Plauger's _Software Tools_ or _Software Tools in Pascal_ to see what I mean), you or the system can provide a set of byte-stream interface routines to do that with a record-oriented file system. (That DEC's VAX C RTL does this, shall we say, imperfectly, is a problem in the implementation, not the concept.)

(Incidentally, Barry's problem with EDT stems from Runoff's former use of print-format files, wherein carriage control information for each record is stored in a fixed-length field preceding the text information. A program that expects to read an ordinary text file can read such a file, but it won't see the fixed-length field, so if it's an editor it can't reconstruct the field on output. The print-format file is one use of "Variable with Fixed Control" record format, and I'm very happy to report that very few VMS programs generate such files these days; it's one record format that VMS could have done without.)

To give you an idea of the generality of VAX RMS, the system runs happily using just a few of the available file formats. Text is stored in variable-length-record, sequential files. So is object code (possible even though you can have null bytes, line feeds, etc., etc., in object records... because RMS doesn't use in-band data for record terminators!). Images and library files go in fixed-length-record files, essentially with their own internal format implemented by the programs that deal with them. There are a few indexed files like the user authorization file. And that's about it.

For me, the bottom line is that it works, that RMS with all its fabled "inefficiencies" runs rings around most folks who try to bypass it (whoa, now! I said "most". This because most people don't do the good job with read-ahead and write-behind buffering that RMS does. Sure, if you do that, AND implement the record handling yourself, you can beat RMS, barely. My point is that you don't have to bother to get good performance), and that I've dealt with VMS's file system for years without feeling I was doing battle with it.

No doubt if I was moved to a Unix environment I would gripe a lot for a few weeks about "those stupid byte-stream files", but I'd like to think that I'd adapt and figure out how to do things the Unix way and work with the system instead of fighting it. I'd like to think that most Unix folks who come to VMS would do the converse. I'm probably wrong on both counts... :-)
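The in-band versus out-of-band distinction Jamie draws can be shown with a small C sketch. The layout here (a 2-byte little-endian byte count in front of each record) is an assumption chosen for illustration and should not be read as the exact RMS on-disk format; the function names `rec_append` and `rec_next` are hypothetical. Because the length lives outside the record data, a record may contain newlines or null bytes without confusing the reader.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append one record to buf: 2-byte little-endian length, then the data.
 * Returns the new number of bytes used, or 0 if the record does not fit. */
size_t rec_append(unsigned char *buf, size_t used, size_t cap,
                  const void *data, size_t len)
{
    assert(buf != NULL && data != NULL);
    if (len > 0xFFFF || used > cap || cap - used < len + 2)
        return 0;
    buf[used]     = (unsigned char)(len & 0xFF);
    buf[used + 1] = (unsigned char)(len >> 8);
    memcpy(buf + used + 2, data, len);
    return used + 2 + len;
}

/* Fetch the record starting at *pos and advance *pos past it.
 * Returns 1 if a record was returned, 0 at end of data or if truncated. */
int rec_next(const unsigned char *buf, size_t used, size_t *pos,
             const unsigned char **data, size_t *len)
{
    size_t n;

    assert(buf != NULL && pos != NULL && data != NULL && len != NULL);
    if (*pos > used || used - *pos < 2)
        return 0;
    n = (size_t)buf[*pos] | ((size_t)buf[*pos + 1] << 8);
    if (used - *pos - 2 < n)
        return 0;
    *data = buf + *pos + 2;
    *len = n;
    *pos += 2 + n;
    return 1;
}
```

A byte-stream interface layered on top of this would hand out the bytes of each record in turn and then synthesize a terminator, which is where the ambiguity returns if the record itself contains that terminator.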
jeh@crash.cts.com (Jamie Hanrahan) (09/18/88)
In article <178@arnold.UUCP> dave@arnold.UUCP (Dave Arnold) writes: >How often have you written a program, and got the famous: > >%SYSTEM-F-EXCEEDED QUOTA > >message? Isn't it fun trying to figure out which bloody quota >was exceeded?! Stupid! Apologies for the cross-followup to the unix group; I don't know if Dave reads the VMS group. The VMS quota system has two good reasons for being. First, it prevents a runaway program from using up all of something that might be in short supply, like nonpaged pool or process slots. Without this you could sit in a loop doing $QIO with no wait to an offline device, and you'd bother everybody on the system. With quotas in effect you only bother yourself. You can always enable process resource wait mode, which will cause your process to go into MWAIT state (usually seen, for this purpose, as RWAST) until the needed quota is returned, presumably by the completion of a previously-requested operation. (Process resource wait mode is enabled by default.) You can also get EXQUOTA if you try to do a buffered I/O operation that's larger than the size permitted by the SYSGEN parameter MAXBUF. This is a common pitfall. MAXBUF is only middling-sized by default (somewhere near 1K if I recall correctly). Many sites routinely set this up to 8K or so, especially those that have megabytes of pool available. The other purpose of the quota system is to make sure that everything you've started is finished before your image is allowed to run down. Say you start a direct I/O operation to a flaky device driver; the system charges your DIOLM by one. You wait and decide to ^Y out, but the driver's cancel I/O code doesn't work right so the I/O doesn't get aborted. The system notes that the original DIOLM is different from the current value and won't permit your image to run down until they're the same. 
This is one of the great banes of both system managers and driver writers, but it's necessary much of the time; if that I/O op is a read, and it decides to complete AFTER your image has run down and the physical pages you used to own (and to which the DMA will be performed) get assigned to somebody else, watch out! It's impossible for the system to distinguish where this is necessary and where it isn't, so it's done all of the time.
jeh@crash.cts.com (Jamie Hanrahan) (09/18/88)
In article <178@arnold.UUCP> dave@arnold.UUCP (Dave Arnold) writes: >The VMS file system doesn't buy you anything, unless your application >requires ISAM---However, how often do you need ISAM? > >I think the VMS filesystem is overly complicated, and one of the major >downfalls of VMS (but can be tolerated). If the original DEC designers >had it to do over again, I suspect they would have stuck with a >Stream-only based filesystem (Like UNIX), and provided ISAM libraries. >The FORTRAN record format, FIXED SIZE RECORDS, VARIABLE LENGTH, >CARRAIGE RETURN CARRIAGE CONTROL... Oh, don't forget the VFC record >format... These are all completely archaic, and date the VMS >operating system. I strongly disagree. I answered this in another note, but there are a few other points here... How often do you need ISAM? Well, if you have to implement it yourself, probably you'll do without. But if it's there it gets used, for good and sufficient reasons. There are MANY great applications for indexed files... Netnews, for instance. Some folks at BYU did a netnews workalike for VMS, relying heavily upon indexed files to keep track of the newsgroup contents, but storing the articles in individual files just as Unix netnews does. It's a VERY clean design, and they can process a batch of received news MUCH faster than Unix can running on the same hardware. (To be as exact as dim memory allows, I think they said ten times faster or so, and that the Unix folks at the site were both amazed and jealous.) Someone will likely complain that "all that RMS code" costs a lot in terms of efficiency. I offer this challenge: Take a simple Unix filter like DETAB running on some Unix system on a VAX (Ultrix, BSD, AT&T, whatever). Rewrite it to use record-oriented I/O under VMS. Boot VMS on the same hardware (or the equivalent). We've done this and the VMS/RMS versions run *at least* twice as fast, sometimes five or six times. 
(The much greater improvement in BYU News comes from a redesign to take advantage of indexed files, not just conversion from stream- to record-oriented I/O.) I know, I know -- for many applications stream I/O makes for much cleaner program design. But for others, it doesn't, at least not when you have good alternatives available. I don't think that fixed vs.variable length records, implied carriage control, etc., are archaic at all. Variable with fixed control, on the other hand, is right down there with punched cards and paper tape!
dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/18/88)
In article <3438@crash.cts.com> jeh@crash.CTS.COM (Jamie Hanrahan) writes:
>I much prefer VMS's variable-length-record text file format
>to Unix's byte-stream. Why? Because the Unix byte stream uses perfectly
>legitimate data as a record separator.

UNIX files have no records, so there is no record separator. But if you consider lines of text to be records and the newline character to be a record separator (the concept is in your mind, not in the filesystem), then VMS has a similar problem: The low-level I/O routines use perfectly legitimate data for administrative information!

Only at the RMS level is the overhead data made out-of-band. And even under UNIX, it is perfectly possible for an ISAM library to maintain out-of-band administrative data.
--
Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
chris@mimsy.UUCP (Chris Torek) (09/18/88)
In article <3438@crash.cts.com> jeh@crash.cts.com (Jamie Hanrahan) writes:
>... the Unix byte stream uses perfectly legitimate data as a record
>separator.

Do you know what a `byte stream' is? Byte streams do not have records; they can hardly have record separators. If you want records in a Unix file system file, you must define them yourself. This is what Barry Shein was talking about.

>Barry goes on to say that you should be able to open any file as a byte
>stream and not get an error. Well, you can do the equivalent under VMS--
>you can open any file, sequential, relative, or indexed, for sequential
>access, and RMS will happily hand you the records in order (in order by
>primary key if it's an indexed file). And if you prefer a byte-stream
>... you or the system can provide a set of byte-stream interface routines
>to do that with a record-oriented file system.

Simulating a byte stream on top of records is considerably more difficult than simulating records on top of a byte stream. I have been led to believe that, under VMS, each different kind of record-oriented file must be read with a different primitive. (You must also provide a buffer that is as large as the largest record.) Hence to simulate a byte stream, you must know about every possible record format.

On the other hand, to simulate a record format, you must know about every possible byte stream. Fortunately, there is only one possible byte stream, by the definition of `byte stream'. . . .
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
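Chris's claim that records are easy to layer on a byte stream can be illustrated with a minimal C sketch of fixed-length records over an ordinary stdio stream. Nothing here comes from the thread: the record size `RECLEN` and the functions `put_record` and `get_record` are hypothetical, and space padding is just one possible convention. The only "file system support" needed is the ability to seek to a byte offset.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RECLEN 16           /* hypothetical fixed record size, in bytes */

/* Write record number n, space-padded (or truncated) to RECLEN bytes.
 * Returns 0 on success, -1 on I/O error. */
int put_record(FILE *fp, long n, const char *text)
{
    char rec[RECLEN];
    size_t len;

    assert(fp != NULL && text != NULL && n >= 0);
    len = strlen(text);
    if (len > RECLEN)
        len = RECLEN;
    memset(rec, ' ', sizeof rec);
    memcpy(rec, text, len);
    if (fseek(fp, n * RECLEN, SEEK_SET) != 0)
        return -1;
    return fwrite(rec, 1, RECLEN, fp) == (size_t)RECLEN ? 0 : -1;
}

/* Read record number n into out and NUL-terminate it.
 * Returns 0 on success, -1 if the record does not exist or on I/O error. */
int get_record(FILE *fp, long n, char out[RECLEN + 1])
{
    assert(fp != NULL && out != NULL && n >= 0);
    if (fseek(fp, n * RECLEN, SEEK_SET) != 0)
        return -1;
    if (fread(out, 1, RECLEN, fp) != (size_t)RECLEN)
        return -1;
    out[RECLEN] = '\0';
    return 0;
}
```

Record n lives at byte offset n * RECLEN, so random access is one multiplication and one seek. Variable-length or keyed records need more bookkeeping, but it is still the library's bookkeeping rather than the file system's.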
chris@mimsy.UUCP (Chris Torek) (09/18/88)
In article <3442@crash.cts.com> jeh@crash.cts.com (Jamie Hanrahan) writes:
>How often do you need ISAM? Well, if you have to implement it yourself,
>probably you'll do without. But if it's there it gets used, for good and
>sufficient reasons.

This was Barry Shein's point. But he might, and I will, go a bit further: sometimes it also gets used for bad, insufficient reasons. (That does not mean it should not be there; but maybe it should not be *too* easy to use.)

>... Some folks at BYU did a netnews workalike for VMS [using indexed
>files] .... It's a VERY clean design, and they can process a batch of
>received news MUCH faster than Unix can running on the same hardware.
>(To be as exact as dim memory allows, I think they said ten times
>faster or so, and that the Unix folks at the site were both amazed
>and jealous.)

You are comparing incomparable things here. The reason their news unbatcher is that much faster than the one in B news is almost certainly because `it's a very clean design' and not because it uses any particular storage format. The B news unbatcher is a model of inefficiency, clumsy patches, and re-re-re-re-worked code. For instance, an uncompressed batch file is read by forking a separate process for each article in the file. B news's only saving grace is that it works, and it works on everything from PDP-11s to Convexes. (Henry Spencer and Geoff Collyer rewrote the B news software and got a similar order of magnitude performance increase, without changing the file formats at all.)

>... I offer this challenge:

Oh dear.

>Take a simple Unix filter like DETAB running on some Unix system on a
>VAX (Ultrix, BSD, AT&T, whatever). Rewrite it to use record-oriented
>I/O under VMS. Boot VMS on the same hardware (or the equivalent).
>We've done this and the VMS/RMS versions run *at least* twice as fast,
>sometimes five or six times.

If you pick your benchmarks carefully, you can prove anything. Many real programs spend a fair bit of time doing I/O, and VMS RMS I/O is indeed quite efficient when properly used. But so is Unix I/O. VMS currently has an implementation edge if the application reads large blocks, since it does this by playing games with the MMU. On the other hand, Mach can do the same trick.

>I don't think that fixed vs.variable length records, implied
>carriage control, etc., are archaic at all.

I like the way Ken Thompson put it: These concepts fill a much-needed gap in other operating systems.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
bzs@encore.UUCP (Barry Shein) (09/18/88)
>I should have added a bunch of :-) to my original followup. It seems
>there is a bitter feeling towards my posting. I didn't intend to cause
>such feelings. I will be more careful in the future.
>
>My appologies.
>
>Signed,
>
>Dave (egg on my face) Arnold

No bitter feeling or any such thing, I was just trying to draw out exactly what in my note you were objecting to. I might have been wrong, as inconceivable as that may be.

Such disagreements are oftentimes explained by nothing more (or less) than differing perceptions of priorities, in this case the importance/frequency of a need for efficient keyed (&c) storage access methods.

But do wipe the egg off your face, it's just left over from breakfast and is irrelevant to the conversation, it's making us sick Dave :-)

-Barry Shein, ||Encore||
bzs@encore.UUCP (Barry Shein) (09/19/88)
In the first place, it's not obviously an either/or situation. I suspect that VMS's RMS could be implemented on top of Unix with little or no change to the O/S (although performance tuning would have to trade off asynch read-ahead/write-behind and Unix's buffer cache which accomplishes much the same basic thing [ie. the block you want next is highly likely to be off the disk and in memory by the time you need it], albeit in a different manner with different considerations.) I wouldn't be at all shocked to see DEC announce (essentially) RMS under Ultrix (and I'll bet a dollar someone is working on this.) Fine idea, as long as it's not in the OS. One problem with structured files that's easy to see is whether information stored in the file to represent the structure is part of the file or not. For example, if in a variable length, blocked format you store the length of each record as a preceding field of 16-bits, is the size of the file the size of all its data + NRECORDS*2 (2 bytes)? Or just the size of the file (that is, what does a file status query return?) That doesn't seem terribly important at first (who cares, choose a solution and stick to it) until one wants to access the thing as a raw file (something always trivial to do in Unix's scheme.) Now, is the 16-bit field counted in a file position seek? Can I safely take two positions, POS1 and POS2 (byte offsets into the file, a la ftell or lseek) and subtract them, perhaps then allocating and copying the data? Or might the result be larger (OS adds in the 16-bit fields) or even incorrect (POS2 should have been incremented by NRECORDS*2, but I can't really calculate that number NRECORDS very easily, in advance.) I'm not sure I'm claiming that Unix solves any of this other than laying things out so very barebones and w/o OS interpretation that it's totally up to the user, no hand-to-hand combat with a record management system required. 
Anyhow, I may not be expressing myself very well, but I have used VMS and IBM record access methods enough over the years to know that sometimes they can drive you to tears (usually because the OS feels it has a better idea of what you are doing than the programmer does, and modifies or otherwise "corrects" your requests.) What's far more important, in my experience, is to have an orderly set of access methods and to use them only where they are truly justified (ie. simply because it's faster is not a good enough excuse if 99% of the actual applications will perform faster than human response time with either method, naive or sophisticated.) I remember, for example, when the VMS HELP files went from a very simple, textual format to their current library format and it made working with them in new and creative ways nearly impossible (I had written a full-screen access to the VMS help files in TECO, no kidding, which was nearly impossible to salvage, I never bothered.) I'm not sure the changeover was really much of an improvement, sped up something which was fast enough already and added a lot of complexity where it was unappreciated, adding a new help topic became more complicated etc. Not a flame, just trying to emphasize my point about it's good to have access methods, but it tends to lead people astray into using them just to avoid scanning a file when the latter would perform fine and would greatly simplify later maintenance (typically, the file can be manipulated with a text editor) etc. -Barry Shein, ||Encore||
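Barry's question about whether the 16-bit length fields count toward file size and seek positions can be made concrete with a short C sketch. The layout assumed here (2-byte little-endian length before each record) follows his hypothetical, not any documented format, and the names `rec_totals` and `data_to_raw` are invented for the example. With N records the two candidate "sizes" differ by exactly 2*N, and converting a data-only offset to a raw offset requires walking the records.

```c
#include <assert.h>
#include <stddef.h>

/* Walk the length-prefixed records in raw[0..raw_len).
 * On success returns 0 and reports the payload byte count and the number
 * of records; returns -1 if the buffer ends in the middle of a record. */
int rec_totals(const unsigned char *raw, size_t raw_len,
               size_t *data_len, size_t *nrecords)
{
    size_t pos = 0, data = 0, n = 0;

    assert(raw != NULL && data_len != NULL && nrecords != NULL);
    while (pos < raw_len) {
        size_t len;

        if (raw_len - pos < 2)
            return -1;
        len = (size_t)raw[pos] | ((size_t)raw[pos + 1] << 8);
        if (raw_len - pos - 2 < len)
            return -1;
        data += len;
        n++;
        pos += 2 + len;
    }
    *data_len = data;
    *nrecords = n;
    return 0;
}

/* Translate an offset counted in payload bytes only into an offset in the
 * raw file. Returns (size_t)-1 if data_off is past the end of the data. */
size_t data_to_raw(const unsigned char *raw, size_t raw_len, size_t data_off)
{
    size_t pos = 0, seen = 0;

    assert(raw != NULL);
    while (pos < raw_len && raw_len - pos >= 2) {
        size_t len = (size_t)raw[pos] | ((size_t)raw[pos + 1] << 8);

        if (raw_len - pos - 2 < len)
            break;
        if (data_off < seen + len)
            return pos + 2 + (data_off - seen);
        seen += len;
        pos += 2 + len;
    }
    return (size_t)-1;
}
```

The raw size always equals the payload size plus two bytes per record, so subtracting two raw positions overstates the data between them by twice the number of record boundaries crossed, which is the trap described in the post.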
schwartz@shire (Scott Schwartz) (09/19/88)
In article <3442@crash.cts.com> jeh@crash.CTS.COM (Jamie Hanrahan) writes:
|I offer this challenge: Take a simple Unix filter like
|DETAB running on some Unix system on a VAX (Ultrix, BSD, AT&T, whatever).
|Rewrite it to use record-oriented I/O under VMS. ...
|We've done this and the VMS/RMS versions run *at least* twice as fast,
|sometimes five or six times.

I've seen unix programs (things like a grep replacement) that got similar speedups by replacing stdio calls with read/write and a large buffer. I wonder how much of that 2-6x is from overhead in stdio, rather than in the filesystem.

Here is some sample data:

/* f1.c: copy stdin to stdout one character at a time through stdio */
#include <stdio.h>

int main(void)
{
    int c;

    while ((c = getchar()) != EOF)
        putchar(c);
    return 0;
}

/* f3.c: copy stdin to stdout with read/write and a large buffer */
#include <stdio.h>      /* BUFSIZ */
#include <unistd.h>     /* read, write */

int main(void)
{
    ssize_t len;
    char buffer[BUFSIZ*10];

    /* stop at end of file (0) or on a read error (-1) */
    while ((len = read(0, buffer, sizeof(buffer))) > 0)
        write(1, buffer, len);
    return 0;
}

/* test file */
shire% wc test
   26388  100728 1292508 test

/* results */
shire% time f1 <test >foo
5.8u 0.7s 0:06 100% 0+224k 2+163io 0pf+0w
shire% time f3 <test >foo
0.0u 1.0s 0:01 63% 0+248k 0+160io 0pf+0w

Shire is a Sun 4 running SunOS 4.0. I got similar results on a Vax 780 running 4.3 BSD (except that it took 10 times longer to run.)
--
Scott Schwartz schwartz@gondor.cs.psu.edu
Your array may be without head or tail, yet it will be proof against defeat.
Sun Tzu, "The Art of War"
james@bigtex.uucp (James Van Artsdalen) (09/19/88)
In article <3438@crash.cts.com>, jeh@crash.CTS.COM (Jamie Hanrahan) wrote:
> I disagree. I much prefer VMS's variable-length-record text file format
> to Unix's byte-stream. Why? Because the Unix byte stream uses perfectly
> legitimate data as a record separator.

In reading the write(2) man page, I somehow completely missed the discussion of file record separators in unix.
--
James R. Van Artsdalen ...!uunet!utastro!bigtex!james "Live Free or Die"
Home: 512-346-2444 Work: 328-0282; 110 Wild Basin Rd. Ste #230, Austin TX 78746
gwyn@smoke.ARPA (Doug Gwyn ) (09/19/88)
In article <3951@psuvax1.cs.psu.edu> schwartz@shire.cs.psu.edu (Scott Schwartz) writes:
>In article <3442@crash.cts.com> jeh@crash.CTS.COM (Jamie Hanrahan) writes:
>|I offer this challenge: Take a simple Unix filter like
>|DETAB running on some Unix system on a VAX (Ultrix, BSD, AT&T, whatever).
>|Rewrite it to use record-oriented I/O under VMS. ...
>|We've done this and the VMS/RMS versions run *at least* twice as fast,
>|sometimes five or six times.
>I've seen unix programs (things like a grep replacement) that got
>similar speedups by replacing stdio calls with read/write and a large
>buffer. I wonder how much of that 2-6x is from overhead in stdio,
>rather than in the filesystem.
>Here is some sample data:

The point is valid, although your two examples were not functionally
identical, since in one case you were inspecting EVERY character in a
file and in the other you never inspected ANY character.

User-mode overhead from stdio tends to be comparable to system overhead
for typical applications, assuming a fairly good implementation of
stdio. Certainly it is a mistake to use stdio to implement "cat", for
example (for several reasons), but for most applications the additional
services provided by stdio (buffering, etc.) are useful, as is the fact
that the stdio functions are available on all systems whereas
open()/read()/etc. may not be (and when they are, their semantics are
not as well defined).

The analogous UNIX "challenge" would be: Take a simple UNIX filter (I
have no idea where he gets "detab", which is not standard on UNIX) and
rewrite it to use direct system calls on UNIX... Personally I think I
have better things to do than crank out system-specific code.
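A like-for-like comparison with the getchar() loop would need the read()-based version to look at every byte as well. What follows is a minimal sketch of that, not code from the thread: the function name count_newlines and the 64 KB buffer size are assumptions chosen for illustration, and counting newlines simply stands in for "inspecting every character".

```c
#include <unistd.h>

/*
 * Hypothetical helper: read fd in large chunks with read(2) and
 * inspect every byte, counting '\n'. Returns the count, or -1 if
 * read() reported an error.
 */
long count_newlines(int fd)
{
    char buf[65536];            /* buffer size is an arbitrary assumption */
    long count = 0;
    ssize_t len;

    while ((len = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < len; i++)
            if (buf[i] == '\n')
                count++;
    }
    return len < 0 ? -1 : count;
}
```

Timing a filter built around a loop like this against the stdio version would separate the per-character cost of stdio from the cost of touching the data at all.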
guy@gorodish.Sun.COM (Guy Harris) (09/19/88)
> I wouldn't be at all shocked to see DEC announce (essentially) RMS > under Ultrix (and I'll bet a dollar someone is working on this.) Fine > idea, as long as it's not in the OS. Or, more precisely, not in a more-privileged mode than user mode; I consider the OS to be more than just the kernel - for instance, I consider UNIX standard I/O to be part of the OS. Under RSX-11, if I remember correctly, RMS is just a library that runs in user mode; VMS decided to fill another much-needed gap by running it in executive mode. Neither of them stuffed it into the kernel, at least....
guy@gorodish.Sun.COM (Guy Harris) (09/19/88)
> I disagree. I much prefer VMS's variable-length-record text file format > to Unix's byte-stream. Why? Because the Unix byte stream uses perfectly > legitimate data as a record separator. To make matters worse, the standard > C method for dealing with strings uses a *different* character as a string > terminator! Unix has a lot of GREAT ideas in it, but this isn't one of them. Umm, as others have already pointed out, UNIX doesn't use '\n' as a record separator; it uses it as a *line* separator. UNIX - like VMS - ultimately (at the kernel level) implements files as a sequence of bytes (RMS sits on top of QIOs that read virtual blocks of the file, *n'est ce pas?*). One file format UNIX happens to implement atop this abstraction is the "text file"; "text files" consist of "lines", which are sequences of bytes (not containing '\0' - some applications can't handle them, since it's the C string terminator) ending with '\n'. Other file formats exist, such as executable images and archives, which are, respectively, the UNIX equivalents of images (and object files - object files and images use the same format) and library files. However, UNIX doesn't come standard with any libraries that implement "record" files. Such libraries are available from third-party vendors (e.g., C-ISAM), and I very much doubt that they use '\n' or any other particular byte value as a record separator. Some of the real differences between UNIX and VMS here are that: 1) As already stated, VMS comes with libraries that implement "record" files, while UNIX doesn't; 2) Many UNIX utilities (e.g., "cp") deal with files at the byte-stream level, so they don't care *what* format the file is in; 3) Many more UNIX facilities use text files, rather than record files, as their underlying file format; while one reason for this may be the absence of a "record file" library, another reason is that you can use the standard UNIX text file tools to manipulate those files.
mike@turing.unm.edu (Michael I. Bushnell) (09/19/88)
In article <68850@sun.uucp>, guy@gorodish (Guy Harris) writes:
>> I wouldn't be at all shocked to see DEC announce (essentially) RMS
>> under Ultrix (and I'll bet a dollar someone is working on this.) Fine
>> idea, as long as it's not in the OS.
>
>Or, more precisely, not in a more-privileged mode than user mode; I consider
>the OS to be more than just the kernel - for instance, I consider UNIX standard
>I/O to be part of the OS.

But standard I/O runs in user mode, not in a more-privileged mode. What
you consider the OS to be is not what it in fact is. A good working
description of the OS is that part of the system which the arbitrary
user cannot rewrite and use in lieu of the distributed code. You can
rewrite stdio, and then not use the distributed one. This definition is
*very* closely linked to what privilege mode the code runs in...if it
runs in user mode, the user could replace it.

>Under RSX-11, if I remember correctly, RMS is just a library that runs in user
>mode; VMS decided to fill another much-needed gap by running it in executive
>mode. Neither of them stuffed it into the kernel, at least....

But...the user can't necessarily replace RMS without getting to write
his own CHME dispatch table, something the kernel is not likely to let
him do.
--
-- N u m q u a m G l o r i a D e o
\ Michael I. Bushnell
\ HASA - "A" division
/\ mike@turing.unm.edu
/ \ {ucbvax,gatech}!unmvax!turing.unm.edu!mike
mike@turing.unm.edu (Michael I. Bushnell) (09/19/88)
In article <68855@sun.uucp>, guy@gorodish (Guy Harris) writes:
>> I disagree. I much prefer VMS's variable-length-record text file format
>> to Unix's byte-stream. Why? Because the Unix byte stream uses perfectly
>> legitimate data as a record separator. To make matters worse, the standard
>> C method for dealing with strings uses a *different* character as a string
>> terminator! Unix has a lot of GREAT ideas in it, but this isn't one of them.
>
>Umm, as others have already pointed out, UNIX doesn't use '\n' as a record
>separator; it uses it as a *line* separator. UNIX - like VMS - ultimately (at
>the kernel level) implements files as a sequence of bytes (RMS sits on top of
>QIOs that read virtual blocks of the file, *n'est ce pas?*).
>
>One file format UNIX happens to implement atop this abstraction is the "text
>file"; "text files" consist of "lines", which are sequences of bytes (not
>containing '\0' - some applications can't handle them, since it's the C string
>terminator) ending with '\n'.
>
>Other file formats exist, such as executable images and archives, which are,
>respectively, the UNIX equivalents of images (and object files - object files
>and images use the same format) and library files.

But a very important thing to remember is this: The designers of UNIX
didn't expect to see people edit binaries, but they stuck with the
byte-stream abstraction. Programs that are willing to stick to it (like
GNU emacs, and unlike ed, ex, and vi) can benefit tremendously. I can
and do edit binaries using emacs. It didn't take *any* modification of
the operating system to do this, and emacs didn't require *any* special
modifications to do so...all it needed was to learn how *not* to use
separators.

The point is that while you might not see the value in it now, you
might later, when it is too late. Try using your favorite VMS editor to
edit a binary and change a string constant! Not too likely, I'm afraid.
--
-- N u m q u a m G l o r i a D e o
\ Michael I. Bushnell
\ HASA - "A" division
/\ mike@turing.unm.edu
/ \ {ucbvax,gatech}!unmvax!turing.unm.edu!mike
sommar@enea.se (Erland Sommarskog) (09/20/88)
Jamie Hanrahan (jeh@crash.CTS.COM) writes:
>I know, I know -- for many applications stream I/O makes for much cleaner
>program design. But for others, it doesn't, at least not when you have
>good alternatives available.

I don't think one should over-emphasize the importance of what I/O
concept the OS uses. If I program in a high-level language, it is rather
the I/O concept of that language which is of interest, at least if I/O
is well-defined. In many modern languages, I/O is not part of the
language, but rather a library which could be more or less standardized.

What is left is of course the question of efficiency. So if a language
like C has only stream I/O (I assume it is so, I don't speak C, so I
could be wrong) then we don't benefit from a complex file system when
all we want is simple streams. Ada, on the other hand, has text files,
and record files both for sequential and direct access. For the
compiler-writer it may be of interest if the file system supports the
appropriate formats; for me as a programmer it does not. Whether it's in
the file system or the RTL doesn't matter.

Jamie Hanrahan complained that stream I/O meant that in-band data were
used as a terminator. In practice this means writing an LF in the middle
of a text line is impossible in Unix, while it is quite OK in VMS.
(Which on the other hand imposes a maximum length on the line.) So what
about Ada? If I write an LF character the result will be different on
VMS and Unix? Non-portable? Yes, but the manual also clearly says that
I/O of non-printable characters is not defined by the language.
--
Erland Sommarskog     ! "Hon ligger med min bäste vän,
ENEA Data, Stockholm  !  jag vågar inte sova längre", Orup
sommar@enea.UUCP      ! ("She's making love with best friend,
                      !   I dare not to sleep anymore")
jeh@crash.cts.com (Jamie Hanrahan) (09/20/88)
No, this isn't a followup rebuttal, even though I've been beat up pretty badly re. my statement about "record separators" (okay, okay, "line separators") in Unix files. I said my piece already, right? But I was annoyed to see someone say "Please, don't start another Unix vs. VMS war". I don't think this is a "war" at all. I think I've learned a bit about the right way to think about Unix files, knowledge which will no doubt come in handy some day, probably sooner than I think it will (if past experience is any guide). Maybe some other folks have learned something about VMS files too. Isn't this what the net is about? (But if someone says "Please don't let this get out of hand", I'll second.)
jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/20/88)
In article <68855@sun.uucp> guy@gorodish.Sun.COM (Guy Harris) writes: >Umm, as others have already pointed out, UNIX doesn't use '\n' as a record >separator; it uses it as a *line* separator. UNIX - like VMS - ultimately (at >the kernel level) implements files as a sequence of bytes (RMS sits on top of >QIOs that read virtual blocks of the file, *n'est ce pas?*). vms has file attributes directly associated with the file. qio does read virtual blocks - but you can't easily convince rms to read a file in some mode other than the mode the file was created with. if you have an isam file you want to read as a 80 character fixed length record file, it's qio or nothing [ but grief ] -- John F. Haugh II (jfh@rpp386.Dallas.TX.US) HASA, "S" Division "If the code and the comments disagree, then both are probably wrong." -- Norm Schryer
eric@snark.UUCP (Eric S. Raymond) (09/20/88)
In article <13608@mimsy.uucp>, chris@mimsy.UUCP (Chris Torek) writes: > (Henry Spencer and Geoff Collyer rewrote the B news software and got > a similar order of magnitude performance increase, without changing > the file formats at all.) And I did likewise, with similar results, for B3.0. Chris is, as usual, quite correct; the fault lies not in our file formats, but in our code. The major win was just eliminating the fork-per-article overhead in the unbatcher. The principle exemplified here bears repeating yet again: A CLEAN DESIGN IS THE ROYAL ROAD TO SPEEDY CODE and fiddling with flat-vs-ISAM files, clever code hacks or other 'micro-level' optimizations is usually a recipe for lots of pain with very little gain. -- Eric S. Raymond (the mad mastermind of TMN-Netnews) UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP Post: 22 S. Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
guy@gorodish.Sun.COM (Guy Harris) (09/21/88)
> vms has file attributes directly associated with the file. qio does > read virtual blocks - but you can't easily convince rms to read a file > in some mode other than the mode the file was created with. As I remember, the VMS file attributes are maintained, but not really used, by the code that I would refer to as the VMS file system (the ACPs or "extended QIO processors" or whatever they call the new stuff they added in recent versions). I think there are QIOs (perhaps undocumented) that RMS uses to fetch and store those attributes.
aperez@cvbnet2.UUCP (Arturo Perez Ext.) (09/21/88)
From article <68855@sun.uucp>, by guy@gorodish.Sun.COM (Guy Harris):
>> I disagree. I much prefer VMS's variable-length-record text file format
>> to Unix's byte-stream. Why? Because the Unix byte stream uses perfectly
>> legitimate data as a record separator. To make matters worse, the standard
>> C method for dealing with strings uses a *different* character as a string
>> terminator! Unix has a lot of GREAT ideas in it, but this isn't one of them.
>
> One file format UNIX happens to implement atop this abstraction is the "text
> file"; "text files" consist of "lines", which are sequences of bytes (not
> containing '\0' - some applications can't handle them, since it's the C string
> terminator) ending with '\n'.
>
> Other file formats exist, such as executable images and archives, which are,
> respectively, the UNIX equivalents of images (and object files - object files
> and images use the same format) and library files.
>
> However, UNIX doesn't come standard with any libraries that implement "record"
> files. Such libraries are available from third-party vendors (e.g., C-ISAM),
> and I very much doubt that they use '\n' or any other particular byte value as
> a record separator.
>

I'm curious. I understand VMS's supposed need for the various file
formats. And although I disagree, that's DEC's decision; let them live
with it. They just want application designers to use the tools that DEC
designed. Maybe because it makes their software easier to support. I
don't really know. And I don't really work with VMS often enough to
really care. But I do know from experience that the Unix file system is
so straightforward that ANYBODY can use it without having to worry about
the millions of descriptors that are needed to set up an I/O request on
RMS.

What I'm curious about is the fact that I've never heard of any record
access libraries for Unix. I know that I've written simpleminded record
access applications. I'm sure other people have as well. Is there
anyone actually selling record access libraries for the Unix community?
If not, why isn't anyone doing it?

Arturo Perez
ComputerVision, a division of Prime
primerd!cvbnet!aperez
The difference between genius and idiocy is that genius has its limits.
gwyn@smoke.ARPA (Doug Gwyn ) (09/21/88)
In article <3954@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>In practice this mean writing an LF in the middle of a text line is
>impossible in Unix, while is quite OK in VMS.

On the other hand, what is a "text line" that occupies portions of
multiple lines on a display device? Change "text line" to "text record"
and the concept makes more sense, but then why is text necessarily
organized into records, and why do these records look like they do
instead of something like

x T aps
x res 723 1 1
x init
x font 1 R
x font 2 I
x font 3 B
x font 4 H
x font 5 CW
x font 6 S
x font 7 S1
x font 8 GR
V0 p1 s10 f1 H696 V480 h2075c- 35 33152 33-n120 0
H696 V960 cT 67h54i28sw71i28sw71a50nw86e45x50a50m82p52l28e45.n120 0
x trailer
V7953
x stop
jeremy@chook.ua.oz (Jeremy Webber) (09/21/88)
In all this discussion I have not seen mention of the fact that you can
open a VMS file for block i/o and then treat it as a stream of blocks.
This can be useful for just moving data around. It can also be
dangerous, but no more so than treating a file as a stream of bytes.

One thing that I think DEC stuffed up badly though is that they did not
define a standard for text files. Instead, you have
variable-length-carriage-control, Fortran carriage control, List
carriage control, stream-LF, stream-CR and probably half a dozen others
that I have not thought about. This makes writing text file
manipulation programs, such as text editors, a real pain. It also makes
manipulation of text by programs written in different languages
hazardous. I believe that DEC should modify the run time libraries of
all languages to convert internal text to and from a standard text form
when reading and writing files.

I can see the performance advantages of letting the file system "know"
about RMS, particularly with regard to record locking and other
commercial uses.

In short, there are advantages and disadvantages in the VMS as against
the UNIX method of treating files, and you'll probably choose the one
best for your application.

-Jeremy Webber (jeremy@chook.ua.oz.au)
Computer Science, Adelaide University, Australia
"One of these days I'll get around to writing a .signature file"
meo@stiatl.UUCP (Miles O'Neal) (09/21/88)
In article <68855@sun.uucp>, guy@gorodish.Sun.COM (Guy Harris) writes:
> 3) Many more UNIX facilities use text files, rather than record files,
> as their underlying file format; while one reason for this may be
> the absence of a "record file" library, another reason is that you
> can use the standard UNIX text file tools to manipulate those files.

If you even have your data files as text files, debugging becomes much
easier. For instance, would you rather debug

98764389437034gh307ytfhr398f39

or

12/22/88 01:30 10790 100 100 382 -1 ?

These are not real data, but examples of what data files I've dealt
with looked like. The processing to do all this is cheap nowadays, so
why not use text files if there is no OVERWHELMING reason not to?

Another thing this buys you is that, in my experience, it's easier to
change file formats if you use text files. It requires a little
planning, but in general is a lot less work than doing the same thing
with any other type of data.

Strangely enough, you can do similar things with VMS, OS/32, or even
CP/M...
dave@arnold.UUCP (Dave Arnold) (09/22/88)
eric@snark.UUCP (Eric S. Raymond) writes:
> In article <13608@mimsy.uucp>, chris@mimsy.UUCP (Chris Torek) writes:
> > (Henry Spencer and Geoff Collyer rewrote the B news software and got
> > a similar order of magnitude performance increase, without changing
> > the file formats at all.)
>
> [...]
>
> The principle exemplified here bears repeating yet again:
>
> A CLEAN DESIGN IS THE ROYAL ROAD TO SPEEDY CODE
>

I couldn't agree more. People I work with seem to get bogged down in
the "How big of a QIO can I do" syndrome during early program design
and development. I really protest this (especially when they encourage
me to do the same).

One of the reasons why I am a *GREAT* :-) programmer...is...because...:
I much prefer to view things in the most simple way. I actually go to
great effort rewriting things (with my boss's glare $$$) just to achieve
a simpler program design. Sometimes the rewrite achieves better
performance (not intentionally). And if not, it facilitates easier
performance enhancements---But I save those for last.

This is the thing that I love about UNIX so much that I wish VMS
shared: SIMPLICITY. Everything is so damn simple, it goes right over
some people's heads. Now if UNIX only had AST's, timer queues, exception
handling, and a better "SHELL"---I would be in heaven.

Remember the days when we would bring monolithic straight-line code to
bed with us, and make marks on the listing? I even remember back in the
late 1970's my boss teaching me the cons of structured programming by
explaining to me that a function call just turns into a JMP instruction
:-)

This is the 80's!!! Soon to be 90's!! Let's not get stuck in the dark
ages!
--
Dave Arnold dave@arnold.UUCP {cci632|uunet}!ccicpg!arnold!dave
eric@snark.UUCP (Eric S. Raymond) (09/22/88)
In article <3453@crash.cts.com>, jeh@crash.CTS.COM (Jamie Hanrahan) writes: > I don't think this is a "war" at all. I think I've > learned a bit about the right way to think about Unix files, knowledge > which will no doubt come in handy some day, probably sooner than I think it > will (if past experience is any guide). Maybe some other folks have learned > something about VMS files too. Isn't this what the net is about? Yup. Me, I learned a lot about VMS from your postings. Not that I'd ever use it without you put a gun to my head, but I learned a lot. Thank you for your lucid descriptions of how RMS works. BTW, cultural differences are funny; I kept wanting to parse that acronym RMS as "Richard M. Stallman", an entity even more complex and obscure (but much less brain-damaged :-)) than VMS file I/O. -- Eric S. Raymond (the mad mastermind of TMN-Netnews) UUCP: ...!{uunet,att,rutgers}!snark!eric = eric@snark.UUCP Post: 22 S. Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
allbery@ncoast.UUCP (Brandon S. Allbery) (09/23/88)
As quoted from <179@arnold.UUCP> by dave@arnold.UUCP (Dave Arnold):
+---------------
| In article <3597@encore.UUCP>, bzs@encore.UUCP (Barry Shein) writes:
| > The problem with the Unix "unstructured" approach is that either you
| > use some of the (very few) library routines (dbm is a major one, so
| > are the object deck readers in SYSV) or you roll your own, each
| > application will have its own way of storing data (compare termcap
| > with passwd with inittab with crontab with ...) often not terribly
| > well documented or efficient (agreed, often efficiency is a poor
| > excuse for obscurity.)
|
| This is not a problem. It's not often that your application requires
| you to "Roll your own". And you get a very simple filesystem.
+---------------

This all ties together with the terminfo-vs.-termcap discussion.
Actually, I have written an interpreted terminfo (as part of the
"tgraph" compatibility package for SVR2 curses); it is slow, but that's
mainly because of laziness. It should be quite possible to write it to
work quickly, with the same longer name usage *but* *extensible* unlike
terminfo.

Just as byte-stream file systems are more general and more useful than
typed file systems, simple, general, FAST "access method" routines on
top of the stream file systems are better than either typed file
systems or roll-your-own access methods. (Example: COFF, or the new
format perhaps, could easily be generalized to make a "resource library
file" similar to Macintosh resource forks. Which would make "ld" a
general utility rather than just an object relocation editor.)

Termcap's obscurity and outright bugs (skip a backslash or expand a tab
to spaces and the whole file goes to pot) make it a rather bad access
method; while fixed versions (such as the Gnu version) handle the bugs,
it's still harder to understand those two-character capnames than
terminfo capnames. The interpretive terminfo-style reader is a step in
the right direction. I also have a terminfo-like routine (currently
implemented via yacc, so it's REALLY slow) which supports typed arrays.

On the other hand, termcap/info doesn't solve all problems; it's
senseless to complain about termcap and passwd not having the same
format, they're keyed and used differently. Passwd uses yet another
SIMPLE, GENERAL format, which is easily manipulated even at the shell
level. Crontab is actually a simple variant of that format, and perhaps
should be merged, but the existing tools can very easily deal with
both. (After all, there's really a difference only in that a colon is
used as passwd's field separator, while crontab uses a tab.
Interpretation of fields varies, but that's going to happen anyway in a
real-world database situation.)

++Brandon
--
Brandon S. Allbery, uunet!marque!ncoast!allbery DELPHI: ALLBERY
For comp.sources.misc send mail to ncoast!sources-misc
"Don't discount flying pigs before you have good air defense." -- jvh@clinet.FI
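The observation that passwd and crontab differ mainly in their field separator can be sketched with a single routine that takes the separator as a parameter. This is only an illustration under stated assumptions: split_fields is a hypothetical name, not a library call, and the sample lines in the usage note are invented. It splits by hand rather than with strtok() because strtok() would skip the empty fields that passwd-style lines can contain.

```c
#include <string.h>

/*
 * Hypothetical helper: split `line` in place on the single byte `sep`.
 * Pointers to at most `max` fields are stored in `fields`; any text
 * past the last stored field is dropped. Empty fields are preserved.
 * Returns the number of fields stored.
 */
int split_fields(char *line, char sep, char **fields, int max)
{
    int n = 0;
    char *p = line;

    while (n < max) {
        fields[n++] = p;
        p = strchr(p, sep);
        if (p == NULL)
            break;              /* last field reached */
        *p++ = '\0';            /* terminate this field, step past sep */
    }
    return n;
}
```

Calling it with ':' on a line such as "root:x:0:0::/root:/bin/sh" and with '\t' on a tab-separated schedule line uses the same code path; only the interpretation of the resulting fields differs.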
bzs@xenna (Barry Shein) (09/25/88)
If I can be permitted to summarize this discussion: VMS's RMS can be useful in many situations and amounts to an added application library bundled in with VMS which Unix folks would have to go out and purchase separately (I've seen similar libraries for Unix advertised in trade mags, they do exist.) Presumably one can add a similarly useful access methods library to Unix, the biggest question being the desirability of true asynchronous I/O (it's possible that, from a pure performance standpoint, Unix wouldn't benefit that much from this due to its buffer cache although some would still like it.) VMS's biggest drawback, in regards RMS, is that there wasn't much more discipline on the part of the applications designers to use (preferably) one access method for most applications so utilities could work together more smoothly. Having one utility produce a text file which cannot be read in and manipulated by another seems to violate "the law of least astonishment" in a major way. Simply handling all the permutations is not as reliable as agreeing on one format except where carefully justified. This is particularly true when changing between programming languages (at least one reader claims this.) I think it's safe to say this was a constructive discussion. -Barry Shein, ||Encore||
mazumdar@fredonia.UUCP (Jin Mazumdar) (09/29/88)
I have just been browsing through this discussion and have not
read all follow ups. Although UNIX does not have fixed length
records can one not convert any file in UNIX to fixed length records
using the dd utility? On the other hand on fixed format systems the
best you could do is fake variable format with an end of record marker
and possibly wasting the rest of the record.
Jin Mazumdar (uucp:) ...decvax!sunybcs!fredonia!mazumdar
>>> The following are for historical interest only <<<
Dept. Of Math and C. S.
State University of New York College at Fredonia
Fredonia, N.Y. 14063 (716) 673 3459
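The conversion asked about above (dd does this kind of thing with its cbs= and conv=block operands) amounts to padding each text line out to a fixed width. A minimal sketch of that one step follows; line_to_record is a hypothetical name, and space padding plus truncation of over-long lines are assumptions of this sketch rather than anything stated in the thread.

```c
#include <string.h>

/*
 * Hypothetical helper: copy one text line (up to its '\n' or NUL)
 * into a record of exactly `reclen` bytes, padding with spaces.
 * The record is not NUL-terminated. Returns 1 if the line was too
 * long and had to be truncated, 0 otherwise.
 */
int line_to_record(const char *line, char *rec, size_t reclen)
{
    size_t len = strcspn(line, "\n");   /* length without the newline */
    int truncated = len > reclen;

    if (truncated)
        len = reclen;
    memcpy(rec, line, len);
    memset(rec + len, ' ', reclen - len);
    return truncated;
}
```

Writing each record with fwrite() or write() then yields a file whose "records" exist only by convention between the programs that read it.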
dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/29/88)
In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes: >Although UNIX does not have fixed length >records... It certainly does. Look at the structure of /etc/utmp and /usr/adm/wtmp or equivalent files on your system. -- Rahul Dhesi UUCP: <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi
jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/30/88)
In article <4136@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes: >In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes: >>Although UNIX does not have fixed length >>records... > >It certainly does. Look at the structure of /etc/utmp and /usr/adm/wtmp >or equivalent files on your system. not in the typical sense. there is no file-system level support for fixed length records. unix files are byte streams, meaning [ with the exception of certain device files ] you can read 1 byte or, hardware permitting, 1MB. with other operating systems the size of the record is fixed at file creation time and may not be changed without copying the contents of the file using a file conversion utility of some type. /etc/utmp may be read one byte at a time, except that the "records" would not have any meaning. -- John F. Haugh II (jfh@rpp386.Dallas.TX.US) HASA, "S" Division "Why waste negative entropy on comments, when you could use the same entropy to create bugs instead?" -- Steve Elias
allbery@ncoast.UUCP (Brandon S. Allbery) (10/07/88)
As quoted from <4136@bsu-cs.UUCP> by dhesi@bsu-cs.UUCP (Rahul Dhesi):
+---------------
| In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes:
| >Although UNIX does not have fixed length
| >records...
|
| It certainly does. Look at the structure of /etc/utmp and /usr/adm/wtmp
| or equivalent files on your system.
+---------------

The programs that use those files use fixed-length "records"; the file
system itself does not enforce them, however. The difference is that
you don't have to tell your favorite binary editor that it must open
/etc/utmp with a record size of (sizeof (struct utmp)) bytes.

++Brandon
--
Brandon S. Allbery, uunet!marque!ncoast!allbery DELPHI: ALLBERY
For comp.sources.misc send mail to <backbone>!sources-misc
comp.sources.misc is moving off ncoast -- please do NOT send submissions direct
"So many articles, so little time...." -- The Line-Eater
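How a program layers fixed-length "records" on a plain byte stream can be sketched in a few lines: record n simply lives at byte offset n times the record size. This is an illustration, not code from the thread; read_record is a hypothetical helper, and a real utmp reader would pass sizeof (struct utmp) as the record length.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read record number `n` (0-based) of `reclen`
 * bytes from `fp` into `rec`. The record size is known only to the
 * caller; the file itself is just bytes. Returns 0 on success, -1 if
 * the seek fails or a full record is not available.
 */
int read_record(FILE *fp, long n, void *rec, size_t reclen)
{
    if (fseek(fp, n * (long)reclen, SEEK_SET) != 0)
        return -1;
    return fread(rec, reclen, 1, fp) == 1 ? 0 : -1;
}
```

Nothing stops another program from opening the same file and reading it one byte at a time; the record boundaries are an agreement between programs, not a property the file system checks.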