[net.bugs] "proper UNIX text file" ???

rcj@burl.UUCP (Curtis Jackson) (10/21/85)

In article <2235@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
>Many UNIX text-file utilities will discard a (necessarily final)
>text line that does not end in a newline.  Quite simply, such a
>file is not a proper UNIX text file.

I think that the \User Guide to the UNIX System/ rebuts this as well as
I could:

"In the UNIX system, files have no internal structure; they are simply
a finite sequence of arbitrary characters."

A Unix file is a series of bytes; nothing more is needed to make it a
'proper' UNIX text file.  sed and a few other utilities discard the last
few bytes after the last newline because they work on 'lines' of input --
and the definition of a line used by all (most?) of them is zero or more
non-newline characters followed by a newline.
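
To make that definition of a line concrete, here is a minimal sketch in C.
The name count_lines and its interface are made up for illustration; this
is not how sed or any particular utility is actually written.  It counts
only newline-terminated lines and reports how many bytes are left over
after the last newline.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Counts complete lines in buf[0..len), where a line is zero or more
 * non-newline characters followed by a newline.  *leftover receives
 * the number of bytes that follow the last newline. */
static size_t count_lines(const char *buf, size_t len, size_t *leftover)
{
    size_t lines = 0;     /* newline-terminated lines seen so far */
    size_t last_end = 0;  /* index just past the most recent newline */
    size_t i;

    for (i = 0; i < len; i++) {
        if (buf[i] == '\n') {
            lines++;
            last_end = i + 1;
        }
    }
    *leftover = len - last_end;
    return lines;
}
```

Given the 13 bytes "one\ntwo\nthree", this reports 2 lines and 5
leftover bytes.  A tool that processes only what it counts as lines
would never see "three".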
-- 

The MAD Programmer -- 919-228-3313 (Cornet 291)
alias: Curtis Jackson	...![ ihnp4 ulysses cbosgd mgnetp ]!burl!rcj
			...![ ihnp4 cbosgd akgua masscomp ]!clyde!rcj

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/23/85)

> In article <2235@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
> >Many UNIX text-file utilities will discard a (necessarily final)
> >text line that does not end in a newline.  Quite simply, such a
> >file is not a proper UNIX text file.
> 
> I think that the \User Guide to the UNIX System/ rebuts this as well as
> I could:
> 
> "In the UNIX system, files have no internal structure; they are simply
> a finite sequence of arbitrary characters."
> 
> A Unix file is a series of bytes; nothing more is needed to make it a
> 'proper' UNIX text file.

Tell that to "ed".  This is the second response I have gotten
this morning that shows lack of care in reading.  The magic
word "text" in front of "file" is what is called an adjective.
It qualifies (i.e., modifies) the general meaning of the noun
to produce a more specific meaning.  There are UNIX files and
there is a subset of these called UNIX text files.  Many UNIX
utilities are designed to work only with text files, which DO
have internal structure (line-oriented).  These and only these
are the files under discussion.
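
The line-oriented structure described here can be tested mechanically.
The following is a rough sketch in C, not taken from any utility; the
name is_line_structured is hypothetical, and treating an empty stream as
a valid text file (zero lines) is an assumption.

```c
#include <stdio.h>

/* Hypothetical check, for illustration only.
 * Returns 1 if the stream is empty or its last byte is a newline,
 * so that every line in it is newline-terminated; returns 0 if the
 * final line lacks its newline. */
static int is_line_structured(FILE *fp)
{
    int c;
    int last = '\n';  /* an empty stream counts as well-formed */

    while ((c = getc(fp)) != EOF)
        last = c;
    return last == '\n';
}
```

This checks only the final newline.  It says nothing about whether the
bytes in between are printable.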

P.S.  I didn't realize Thomas & Yates was considered authoritative.

mikel@codas.UUCP (Mikel Manitius) (10/25/85)

> > In article <2235@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) writes:
> > >Many UNIX text-file utilities will discard a (necessarily final)
> > >text line that does not end in a newline.  Quite simply, such a
> > >file is not a proper UNIX text file.
> > 
> > "In the UNIX system, files have no internal structure; they are simply
> > a finite sequence of arbitrary characters."
> > 
> > A Unix file is a series of bytes; nothing more is needed to make it a
> > 'proper' UNIX text file.
> 
> Tell that to "ed".  This is the second response I have gotten
> this morning that shows lack of care in reading.  The magic
> word "text" in front of "file" is what is called an adjective.
> It qualifies (i.e., modifies) the general meaning of the noun
> to produce a more specific meaning.  There are UNIX files and
> there is a subset of these called UNIX text files.  Many UNIX
> utilities are designed to work only with text files, which DO
> have internal structure (line-oriented).  These and only these
> are the files under discussion.

That still isn't entirely the point.  Unix (the operating system)
doesn't know the difference between a "text" file and anything
else; the only things it knows about are files and directories,
and it does not concern itself with what is within a file.

It is the application program, such as "ed", that decides that a
file that does not contain printable characters is not a "text"
file; this distinction is purely abstract.

And just because the last character in the file is not a newline
does not mean that all of the characters preceding it, all the way
back to the previous newline, are not text.  Routines such as "fgets"
treat a line as data delimited by a newline; if fgets gets an EOF
before a newline, it returns the data it has read so far.  Note that
fgets(3) is a library function and not a system call, so
it is a utility and not part of the operating system.
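
The behaviour of fgets(3) at end of file can be seen with a small sketch
in C.  The helper name last_chunk is made up for illustration; the only
library calls used are fgets() and strlen().  It assumes the caller's
buffer holds at least two bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Reads the whole stream with fgets() and leaves the last chunk read
 * in buf.  Returns 1 if that chunk ended in '\n', 0 if it did not
 * (EOF came first, or the chunk filled the buffer), and -1 if nothing
 * could be read at all. */
static int last_chunk(FILE *fp, char *buf, int size)
{
    int got = 0;
    size_t n;

    /* When fgets() hits EOF with nothing read, it returns NULL and
     * leaves buf unchanged, so buf keeps the previous chunk. */
    while (fgets(buf, size, fp) != NULL)
        got = 1;
    if (!got)
        return -1;
    n = strlen(buf);
    return n > 0 && buf[n - 1] == '\n';
}
```

For a stream containing "abc\ndef", buf ends up holding "def" and the
function returns 0.  The unterminated data reaches the caller; whether
to use it or drop it is the caller's decision.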
-- 
			Mikel Manitius @ AT&T-IS Altamonte Springs, FL
			...{ihnp4|akguc|attmail|indra!koura}!codas!mikel