gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/22/85)
> >Many UNIX text-file utilities will discard a (necessarily final)
> >text line that does not end in a newline. Quite simply, such a
> >file is not a proper UNIX text file.
>
> Who says? Where's the definition of a 'proper' UNIX text file?

The problem is, there are several interpretations of such a file,
depending on the utility involved. Perhaps there should be a
well-defined standard interpretation, but there isn't currently.

"A file of text consists simply of a string of characters, with
lines demarcated by the newline character."
	-- from "The UNIX Time-Sharing System" by Ritchie & Thompson

"text file, ASCII file -- a file, the bytes of which are understood
to be in ASCII code"
	-- from "Glossary" in "UNIX Time-Sharing System Programmer's
	   Manual", 8th Ed.

"A text stream is an ordered sequence of bytes composed into lines,
each line consisting of zero or more characters plus a terminating
new-line character. ... The sequentially last character read in
from a text stream will, however, always be sequentially the last
character that was earlier written out to the text stream, if that
character was a new-line."
	-- from ANSI X3J11/85-045

My personal choice would be similar to Ritchie & Thompson, where
newlines delimit (NOT "terminate") text lines, so that the last
character in a text file would not need to be a newline. However,
this raises the question of what utilities should do with the null
line at the end of every text file that DOES end with a newline;
this will still be utility-dependent (and should be documented
whenever it is handled differently from other text lines in the
file).

X3J11/85-045 botched it anyhow, since they intended that ALL UNIX
files qualify as "text streams" under stdio (vs. "binary streams",
which have to be handled differently on some non-UNIX OSes).

So, how do we establish a standard interpretation for
non-newline-terminated UNIX text files? (Discussion should move to
net.unix.)
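
To make the "terminate" vs. "delimit" distinction concrete, here is a
minimal sketch in Python. The function names are hypothetical and chosen
only for illustration; they do not reproduce the behavior of any
particular UNIX utility. The first treats newline as a line terminator
and drops an unterminated final fragment, as some utilities reportedly
do; the second treats newline as a delimiter, which keeps the final
fragment but also yields the null line after a trailing newline.

```python
from typing import List


def lines_terminated(data: str) -> List[str]:
    """Newline TERMINATES a line (hypothetical helper).

    Anything after the last newline is not a complete line and is
    discarded, mimicking a utility that drops an unterminated final line.
    """
    pieces = data.split("\n")
    # The last piece is whatever follows the final newline (possibly
    # empty); under this interpretation it is not a line.
    return pieces[:-1]


def lines_delimited(data: str) -> List[str]:
    """Newline DELIMITS lines (hypothetical helper).

    Every piece between newlines is a line, so a file that ends with a
    newline has an empty (null) line at the end.
    """
    return data.split("\n")


if __name__ == "__main__":
    for sample in ("one\ntwo\n", "one\ntwo"):
        print(repr(sample))
        print("  terminated:", lines_terminated(sample))
        print("  delimited: ", lines_delimited(sample))
```

Under the delimiter reading, a utility still has to decide what to do
with the trailing null line of a newline-terminated file, which is the
utility-dependent choice mentioned above.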