chris@mimsy.UUCP (Chris Torek) (10/01/88)
>>In article <1127@fredonia.UUCP> mazumdar@fredonia.UUCP (Jin Mazumdar) writes:
>>>Although UNIX does not have fixed length records...

>In article <4136@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) answers:
>>It certainly does. Look at the structure of /etc/utmp and /usr/adm/wtmp
>>or equivalent files on your system.

In article <7296@rpp386.Dallas.TX.US> jfh@rpp386.Dallas.TX.US
(The Beach Bum) replies:
>not in the typical sense. there is no file-system level support for
>fixed length records.

As others have pointed out, when you get right down to the `file
system', the same is true of VMS. The odious (oops :-) ), er, ODS II
file system is actually a series of (512 byte?) blocks. In a sense,
the records are `all in your head'.

The true underlying difference in the *file* *system* is this: ODS II
files are a file header plus a lump of blocks, while Unix files are a
file header (`inode') plus a lump of bytes.

This distinction is surprisingly important: By making the interesting
part of a file `a lump of bytes', Unix can represent as files odd
monsters like terminals, pipes, and inter-machine IPC streams. ODS II
cannot, because terminals, pipes, and IPC streams are not lumps of
blocks. In particular, input from, or output to, these monsters comes
in varying and often unpredictable sizes.

Of course, these funny file monsters are not *exactly* like files, and
even on Unix programs can tell them apart. (In particular, the inode
has a type field, and there are special-purpose operations, such as
setting terminal characteristics, that have no function that can be
applied to files.) The key is that applications are not *required* to
tell them apart, and in general, well-written applications make no
attempt to do so, and thus work just as well with these odd `files' as
with real files.

Moving up a level (to RMS on VMS and to stdio and other libraries on
Unix), the systems diverge further.
RMS manages record information in the `lump of blocks' files; if you
use RMS routines, they will deal with the conversion from one kind of
record to another as appropriate (apparently, usually by outlawing it
entirely, an approach I find rather dubious). The records are `in
RMS's head', so to speak. Unix maintains much of its distance: stdio
provides no automatic conversions, makes nothing illegal, and allows
programs to have records in *their* heads (via, e.g., fread and fwrite,
or newline as a delimiter), but also provides what might be called
`variable length newline delimited records': lines of text, collected
by gets and fgets and printed by puts and fputs, passed around using
C's `string' style (which further disallows ASCII NUL).

Some application programs believe in the `lines of text' model, and
misbehave when fed arbitrary binary files. Other applications have no
record models and work equally well on any kind of file.

Unix libraries other than stdio are far less universal, so there is not
much to be said about them other than that they exist, and they may or
may not provide record models, keyed access, and so forth.

Now, records are not inherently `wrong' or `evil', and there are
applications in which specific formats make sense. The Unix file
system does not prevent writing such applications, but neither does it
provide much assistance. The essentials are there: random access via
lseek, and reading and writing via read and write. It is up to the
application (or a vendor's library) to use these in a clean and
efficient manner. Stdio's fread and fwrite allow fixed-length records
in a relatively clean and efficient manner, but do not provide counted
records or delimited records.

VMS's RMS *does* provide the assistance, although your model must fit
one of its models. If your model differs sufficiently (if your
`records in your head' are not like any of RMS's `records in its
head'), you have to `tinker with RMS's head', or else bypass RMS
entirely.
Of course, RMS provides all the conventional formats; it only gets in
the way if you intend to be nonconventional. Unix never gets in the
way, but rarely makes it easy.

There is a `flip side' to this, though. Since Unix provides few record
models (the only one commonly available is the newline-delimited text
line), programmers are not tempted to invent new formats that other
applications cannot deal with. The classic example is RMS `print file
format'. It sounds reasonable enough: A print file is intended to be
printed, and the program to print files can make sure that the file is
a print file. Alas, when it comes time to make a quick tweak, one
discovers that the editor (EDT) cannot edit print files. The report
must be out in ten minutes, but you will have to go back and change the
original and re-run it, and of course that takes 30 minutes....

(Certainly the above happens rarely; if the programmers using the
system maintain some self-discipline, they will not go off and invent
new file formats when an existing one does the job. Someone should
have told that to the author of RUNOFF: Its output should be a *text*
file.)

In summary: the major difference is that Unix `records' are in the eye
of the beholder, and not (as in VMS) supplied as any part of the
system. They are there when you truly need them; they are not there
when you do not want them. In VMS you must bypass or fool RMS if you
do not want them. (Apparently fooling RMS is easy, and might better be
called `asking RMS nicely'.)
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
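
The post above notes that fread and fwrite allow fixed-length records
that exist only in the program's head. A minimal sketch of that idea
follows, assuming a made-up record layout; `struct rec`, `put_rec` and
`get_rec` are hypothetical names for illustration and do not come from
the thread. The file itself stays a plain lump of bytes; the record
boundaries are just multiples of `sizeof(struct rec)`.

```c
#include <stdio.h>

/* Hypothetical fixed-length record; the layout is invented here. */
struct rec {
    char name[16];
    long value;
};

/* Store *r as record number n (0-based) in a stream opened for binary
 * update.  Returns 0 on success, -1 on failure. */
int put_rec(FILE *fp, long n, const struct rec *r)
{
    /* Seeking to n * record size is the only "record support" needed. */
    if (fseek(fp, n * (long)sizeof *r, SEEK_SET) != 0)
        return -1;
    return fwrite(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
}

/* Fetch record number n (0-based) into *r.
 * Returns 0 on success, -1 if the record is missing or on error. */
int get_rec(FILE *fp, long n, struct rec *r)
{
    if (fseek(fp, n * (long)sizeof *r, SEEK_SET) != 0)
        return -1;
    return fread(r, sizeof *r, 1, fp) == 1 ? 0 : -1;
}
```

Because the struct is written in its in-memory form, a file produced
this way is only meaningful to programs built with the same layout,
which is the `records in the eye of the beholder' trade-off the post
describes.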
sommar@enea.se (Erland Sommarskog) (10/02/88)
Chris Torek (chris@mimsy.UUCP) writes:
>The classic example is RMS `print file
>format'. It sounds reasonable enough: A print file is intended to be
>printed, and the program to print files can make sure that the file is
>a print file. Alas, when it comes time to make a quick tweak, one
>discovers that the editor (EDT) cannot edit print files.
>...
>Someone should
>have told that to the author of RUNOFF: Its output should be a *text*
>file.)

I just tried RUNOFF. If we disregard the CR-LF at the end of each
line, it gave a perfectly normal text file. I guess they have modified
RUNOFF since Chris played with VMS.

But there are other facilities that use weird formats. The report
generator in VAX-Cobol produces VFC files (I think it is). The editor
(TPU these days, *not* EDT) doesn't mind it, but I haven't tried to
write the file back to disk, which I suspect would result in a "file is
converted to a supported format" message.
--
Erland Sommarskog
ENEA Data, Stockholm
sommar@enea.UUCP
gwyn@smoke.ARPA (Doug Gwyn) (10/03/88)
In article <13800@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In summary: the major difference is that Unix `records' are in the eye
>of the beholder, and not (as in VMS) supplied as any part of the
>system.

Chris's summary was pretty good, but in case anyone wasn't aware, he
was describing *disk files*. Some of the more general notions of
"file" in UNIX really do have records, magtape and terminal input being
the obvious ones. Of course the record structure is forced by the
nature of these devices, so it isn't a design botch, but it is
something one should be aware of. I recall not long ago finding out
much to my annoyance that at least one version of "cat" was losing
record size information due to having been converted to stdio instead
of read/write. And yes, it DID matter.
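
To illustrate the read/write point above, here is a rough sketch of a
copy loop that issues exactly one write for each read, so whatever
chunk size the input device hands back per call is passed along
unchanged. The name `copy_by_reads` and the 8192-byte buffer are
assumptions made for this example; this is not the code of any actual
"cat". A stdio-based loop would instead gather the data into its own
buffer and write it out in buffer-sized pieces, which is how the
per-read sizes get lost.

```c
#include <unistd.h>

/* Copy infd to outfd with one write() per read(), so the size of each
 * chunk returned by the input is preserved on output.
 * Returns the total number of bytes copied, or -1 on error. */
long copy_by_reads(int infd, int outfd)
{
    char buf[8192];   /* assumed large enough for any single record */
    ssize_t n;
    long total = 0;

    while ((n = read(infd, buf, sizeof buf)) > 0) {
        /* A short write would split the chunk; treat it as an error. */
        if (write(outfd, buf, (size_t)n) != n)
            return -1;
        total += n;
    }
    return n < 0 ? -1 : total;
}
```

On an ordinary disk file or a pipe the chunk sizes carry no meaning, so
the difference is invisible there; it only matters on devices such as
those named in the post.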