levy@ttrdc.UUCP (Daniel R. Levy) (02/20/88)
In article <1988Feb17.171813.15472@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
> > Compiler apparently does not like eof markers that are at the end of
> > the last line in a file.  Error c1004 results.  Putting a cr-lf at the
> > end of the file solved the problem.
>
> Compiler is doing it right, although the complaint is perhaps undesirably
> cryptic.  X3J11:  "A source file that is not empty shall end in a new-line
> character."  (Page 6, line 38, 11 Jan 1988 draft)

Ahem, what does X3J11 have to say about source files on systems (like VMess)
that support "record" text files?  There need be no new line character in them,
but each record ("line") in the file is defined by some other mechanism,
e.g., a byte count prepended to each record.  (Of course the answer should be
that such source files are treated for the purpose of the standard as if they
ended with a newline character, but is this actually an explicit part of the
standard?  And if not, shouldn't it be?  And for that matter, what if, in
the middle of a record in such a file, an actual newline character is found?
How is the compiler supposed to treat that??)
--
|------------Dan Levy------------|  Path: ..!{akgua,homxb,ihnp4,ltuxa,mvuxa,
|         an Engihacker @        |        <most AT&T machines>}!ttrdc!ttrda!levy
| AT&T Computer Systems Division |  Disclaimer?  Huh?  What disclaimer???
|--------Skokie, Illinois--------|
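
The rule quoted from the X3J11 draft can be sketched as a check over an in-memory copy of a source file. This is only an illustration of the rule as quoted, not how any particular compiler implements it; the function name `source_ends_properly` is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffer satisfies the quoted rule
   (it is empty, or its final character is a new-line), 0 otherwise. */
int source_ends_properly(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len == 0)
        return 1;                  /* the rule only applies to non-empty files */
    return buf[len - 1] == '\n';   /* last character must be a new-line */
}
```

Under this check, a file whose last line is followed directly by an end-of-file marker rather than a new-line would be rejected, which matches the behaviour reported in the original complaint.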
gwyn@brl-smoke.ARPA (Doug Gwyn) (02/21/88)
In article <2183@ttrdc.UUCP> levy@ttrdc.UUCP (Daniel R. Levy) writes:
>Ahem, what does X3J11 have to say about source files on systems (like VMess)
>that support "record" text files?

Why should X3J11 have to say anything about this?  It is the job of the
implementor to meet the specs.  In fact this particular problem has been
solved many times already by C compiler vendors.

Note that the implementor only has to provide one form of text stream and
one form of binary stream; other file formats could be handled as
extensions.  Most vendors are likely to do whatever they can to support as
many file types as possible, because it will make their customers happier.
On VMS, for example, RMS can be used to help map strange file types into a
smaller number of regular models.
scjones@sdrc.UUCP (Larry Jones) (02/21/88)
In article <2183@ttrdc.UUCP>, levy@ttrdc.UUCP (Daniel R. Levy) writes:
> In article <1988Feb17.171813.15472@utzoo.uucp>, henry@utzoo.uucp (Henry Spencer) writes:
[ every line in a file must end with '\n', including the last line]
> Ahem, what does X3J11 have to say about source files on systems (like VMess)
> that support "record" text files?  There need be no new line character in them,
> but each record ("line") in the file is defined by some other mechanism,
> e.g., a byte count prepended to each record.  (Of course the answer should be
> that such source files are treated for the purpose of the standard as if they
> ended with a newline character, but is this actually an explicit part of the
> standard?  And if not, shouldn't it be?  And for that matter, what if, in
> the middle of a record in such a file, an actual newline character is found?
> How is the compiler supposed to treat that??)

X3J11 says nothing about how to >implement< the standard; that's up to each
individual implementor.  If your system happens to match the model used by
the standard (like Unix and MS-DOS), then the implementation is easy.  If
not (like VMS and MVS), then you have to do more work.  Certainly the most
common solution is to translate record endings into newlines but this is
not always the right solution (e.g. files with fortran carriage control).
Some OSs provide enough information to do this right (VMS), others don't.
Some implementations allow the user to specify the right interpretation,
others don't.  Embedded newlines are a headache.

In any case, the model specified by the standard was carefully chosen so as
to be compatible with existing practice (the Unix model) while avoiding
things which are very hard to get right on some systems.  How do you
represent a line that doesn't end with a newline on a system with record
files where record end is taken as a newline?  How do you represent a zero
length line in a record file when zero length records are invalid?  How do
you represent an empty file when empty files are deleted when closed to
avoid "wasting" disk and directory space?  How do you represent variable
length lines in a fixed length record file?  These are the reasons for the
various restrictions in the standard's file model.

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                MAIL: 2000 Eastman Dr., Milford, OH 45150
                                    AT&T: (513) 576-2070
"When all else fails, read the directions."
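
The "translate record endings into newlines" approach mentioned in this post might look roughly like the sketch below. It assumes a purely hypothetical record layout (one length byte followed by that many data bytes); this is not the actual on-disk format used by VMS or MVS, and `records_to_stream` is an invented name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout (an assumption for illustration only):
   each record is one length byte followed by that many data bytes.
   Copies every record to `out` and appends '\n' after each one.
   Returns the number of bytes written, or (size_t)-1 if the input is
   truncated or `out` is too small. */
size_t records_to_stream(const unsigned char *in, size_t in_len,
                         char *out, size_t out_cap)
{
    size_t i = 0;   /* read position in the record data */
    size_t o = 0;   /* write position in the output stream */

    assert(in != NULL || in_len == 0);
    assert(out != NULL || out_cap == 0);

    while (i < in_len) {
        size_t n = in[i++];              /* length of this record */
        if (n > in_len - i)
            return (size_t)-1;           /* record runs past end of input */
        if (n + 1 > out_cap - o)
            return (size_t)-1;           /* no room for data plus '\n' */
        memcpy(out + o, in + i, n);      /* record contents, unchanged */
        i += n;
        o += n;
        out[o++] = '\n';                 /* record end becomes a newline */
    }
    return o;
}
```

With this layout, the records `{2,'h','i'}`, `{0}` and `{3,'a','b','c'}` come out as the 8-byte stream `"hi\n\nabc\n"`: a zero-length record becomes an empty line, and every line, including the last, ends with a newline. A newline byte embedded inside a record is copied through unchanged, so it reads as an extra line break downstream; whether that is the right interpretation is exactly the kind of question the post calls a headache.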
dhesi@bsu-cs.UUCP (Rahul Dhesi) (02/22/88)
[ every line in a file must end with '\n', including the last line]

For files read and written from a Pascal program, the
sequence-of-characters-terminated-by-a-newline model has existed for some
time.  No doubt OS implementors have already solved the problem, if they
were going to solve it at all, to conform to ISO Pascal.

The VAX/VMS C manual describes the painful translation that goes on so that
the VMS C runtime system can convert between the newline model and the
internal record structure.  Since the compiler is just a program, the same
translation can be done when it reads a source file.

Newlines embedded in records are easy to deal with:  They simply terminate
a line, so the record will appear as two lines.  Any other behavior
contradicts the term "newline".

Under VMS, if you use the EDT editor, an invisible ^M character appears to
be at the end of each line.  You can force EDT to insert an embedded
carriage return by asking it to insert an arbitrary character by ASCII
code, and then typing 13.  But when you manipulate text (e.g. move text
from one place to another in the file), EDT translates the embedded
carriage return into a record terminator and the result is that you get two
lines and the embedded carriage return is gone.  (Disclaimer:  this was so
when I last checked.  VMS evolves rapidly so your version may behave
differently.)

OS implementors, use record files if you must, but please also allow the
user to create unstructured, newline-terminated files.  Else you and your
users are going to suffer as more and more programming languages, to be
portable, specify this type of file behavior.  You can fight it just to be
different, or you can gracefully give in and support a universal text
format.

Did you all know that DEC *is* giving in?  Its stream-LF format files not
only behave like UNIX-style newline-terminated text files, but are also
freely executable without any protest from the operating system.  DEC's C
compiler is even more forgiving:  it will happily compile C source files
that are in stream-LF format *and* have a spurious carriage return
character at the end of each line of text.

IBM, as always, will take a little longer.
--
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
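
One way a reader of source text could tolerate a spurious carriage return at the end of each line is sketched below. This is only an assumption about how such tolerance might be written, not a description of what DEC's compiler actually does; `strip_cr_before_lf` is an invented name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: removes, in place, every '\r' that immediately
   precedes a '\n'.  A '\r' anywhere else is left alone.
   Returns the new length of the buffer contents. */
size_t strip_cr_before_lf(char *buf, size_t len)
{
    size_t r;       /* read index */
    size_t w = 0;   /* write index, never ahead of r */

    assert(buf != NULL || len == 0);

    for (r = 0; r < len; r++) {
        if (buf[r] == '\r' && r + 1 < len && buf[r + 1] == '\n')
            continue;            /* drop the CR of a CR-LF pair */
        buf[w++] = buf[r];
    }
    return w;
}
```

Only a carriage return directly in front of a newline is dropped, so a lone embedded carriage return in the middle of a line survives and is left for later stages to interpret.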