chris@mimsy.UUCP (Chris Torek) (09/07/88)
In article <8422@smoke.ARPA> gwyn@smoke.ARPA (Doug Gwyn ) writes:
>... In fact [stdio] EOF should not be "sticky"; if more data becomes
>available, as on a terminal, it should be available for subsequent
>reading. The 4.2BSD implementation broke this but it might be okay
>on 4.3BSD.

I thought this behaviour was added to 4.2BSD to conform to some
existing standard. What does the dpANS say? POSIX?
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
gwyn@smoke.ARPA (Doug Gwyn ) (09/08/88)
In article <13427@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <8422@smoke.ARPA> gwyn@smoke.ARPA (Doug Gwyn ) writes:
>>... In fact [stdio] EOF should not be "sticky"; if more data becomes
>>available, as on a terminal, it should be available for subsequent
>>reading. The 4.2BSD implementation broke this but it might be okay
>>on 4.3BSD.
>I thought this behaviour was added to 4.2BSD to conform to some
>existing standard.

No; it was added because Bill Shannon thought it was a good idea.
I noticed it because it broke several interesting applications.

>What does the dpANS say? POSIX?

Remember that the C dpANS does not address multitasking issues
(where a file can grow due to other concurrent processes), nor does
it specify much about "terminal" device behavior. I recall the
4.2BSD sticky-EOF behavior coming up in discussion and not finding
any demurrers when it was labeled "bogus", but I also doubt that it
is explicitly ruled "nonconforming".

I don't remember seeing this specific issue addressed by 1003.1.
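For readers who want to see the growing-file case concretely, here is a minimal sketch of a program that reads data arriving after an end-of-file indication. It calls clearerr() before retrying, which should work whether or not the library's EOF is sticky. The function name reread_after_eof and the /tmp path template are made up for illustration, and the sketch assumes a POSIX system (mkstemp, fdopen, unlink); none of this comes from the posts above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: reads from an empty file, appends one byte
 * through a second stream, clears the reader's EOF indicator, and
 * reads again. Stores the two getc() results. Returns 0 on success. */
int reread_after_eof(int *at_end, int *after_append)
{
    char path[] = "/tmp/stickyeofXXXXXX";   /* assumed writable directory */
    int fd = mkstemp(path);
    FILE *out, *in;

    if (fd < 0)
        return -1;
    out = fdopen(fd, "w");
    in = fopen(path, "r");
    unlink(path);                   /* file lives on until both streams close */
    if (out == NULL || in == NULL) {
        if (out != NULL)
            fclose(out);
        else
            close(fd);
        if (in != NULL)
            fclose(in);
        return -1;
    }

    *at_end = getc(in);             /* empty file: EOF, indicator now set */

    fputc('x', out);                /* the file grows behind the reader */
    fflush(out);

    clearerr(in);                   /* drop the EOF indicator before retrying */
    *after_append = getc(in);       /* the new byte is now readable */

    fclose(in);
    fclose(out);
    return 0;
}
```

Without the clearerr() call, the second getc() would return EOF on a library with sticky EOF and 'x' on one without; with it, the result is the same on both.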
ka@june.cs.washington.edu (Kenneth Almquist) (09/11/88)
In article <13427@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <8422@smoke.ARPA> gwyn@smoke.ARPA (Doug Gwyn ) writes:
>> ... In fact [stdio] EOF should not be "sticky"; if more data becomes
>> available, as on a terminal, it should be available for subsequent
>> reading. The 4.2BSD implementation broke this but it might be okay
>> on 4.3BSD.
>
> I thought this behaviour was added to 4.2BSD to conform to some
> existing standard.

Berkeley conform to an existing standard? You must be kidding.

The story I read on the net a few years ago is that Berkeley made
this change to fix a problem with fread. The problem is that the
fread documentation contradicts itself, stating both that, "fread
returns the number of items actually read," and "fread returns 0 on
end of file or error." What should fread do when its caller requests
three items, but fread encounters an end of file after reading only
two? The first sentence claims it should return two (the number of
items read), while the second claims it should return zero (because
end of file was encountered).

Berkeley interpreted the documentation as indicating that fread
should return two, but should then return zero on the next call. The
obvious way to implement this would be to have fread do an ungetc on
the EOF so that the next time it was called it would immediately read
an EOF and return zero. However, ungetc does not allow an EOF to be
pushed back onto the input. This deficiency of ungetc is (in my view)
the biggest flaw in the design of the stdio library, and it makes it
impossible to implement scanf correctly, so Berkeley would have done
the world a favor by extending the stdio library to allow EOF to be
pushed back. Instead, they chose a simpler approach: make getc always
return EOF when the eof or error flags are set. This approach allowed
them to fix the fread problem by writing only a couple of lines of
code, but it also broke getc.
In 4.2 BSD the behavior of getc is a bug since it disagrees with the
documentation. In 4.3 BSD, Berkeley modified the documentation to
agree with the code. ("It's not a bug, it's a feature!")

By the way, AT&T also noticed the contradiction in the fread
documentation. They fixed the documentation so that it clearly
reflected the behavior of the code. This seems like a better approach
since modifying the code to agree with the documentation doesn't make
much sense when the meaning of the documentation is so unclear. In
any case, AT&T's approach, unlike Berkeley's, didn't break working
code.

> What does the dpANS say? POSIX?

I don't know, and how they resolve this issue is less important than
that the issue is resolved. The standard I/O library is supposed to
be *standard*; that's the whole point of it. There are, however,
several reasons why they should prefer Dennis Ritchie's original
definition of getc over Berkeley's:

1. Ritchie's definition has seniority. Berkeley's gratuitous change
   to getc was not made until 4.2 BSD and was not documented until
   4.3 BSD. All other versions of UN*X use Ritchie's definition.

2. Aesthetics. Ritchie's definition can be stated in seven words:
   Return EOF when at end of file.

3. Authority. If anyone's opinion should be respected when setting
   UN*X standards, Ritchie's should be.

Kenneth Almquist
--
And there shall come among you false prophets, who will corrupt my
teachings and teach that EOF should be sticky....
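The fread case described in this post (three items requested, only two available) can be tried with a short sketch. The helper name short_fread_demo is invented for illustration, and the demonstration uses a regular temporary file, where a sticky and a non-sticky library should give the same answer; the two behaviours only diverge when more data can arrive later, as on a terminal.

```c
#include <stdio.h>

/* Hypothetical helper: writes `have` ints to a temporary file, rewinds,
 * then asks fread() for `want` ints twice in a row. Stores the two
 * return values. Returns 0 on success, -1 on setup failure. */
int short_fread_demo(size_t have, size_t want, size_t *first, size_t *second)
{
    int buf[16];
    size_t i;
    FILE *fp;

    if (have > 16 || want > 16)
        return -1;
    fp = tmpfile();
    if (fp == NULL)
        return -1;

    for (i = 0; i < have; i++)
        buf[i] = (int)i;
    if (fwrite(buf, sizeof buf[0], have, fp) != have) {
        fclose(fp);
        return -1;
    }
    rewind(fp);

    /* First call: a short count, the number of items actually read. */
    *first = fread(buf, sizeof buf[0], want, fp);
    /* Second call: nothing left in the file, so the count is zero. */
    *second = fread(buf, sizeof buf[0], want, fp);

    fclose(fp);
    return 0;
}
```

Calling short_fread_demo(2, 3, &first, &second) leaves first equal to 2 and second equal to 0, which matches the "return two, then zero on the next call" reading.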
shannon%datsun@Sun.COM (Bill Shannon) (09/14/88)
In article <8459@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
> No; it was added because Bill Shannon thought it was a good idea.
> I noticed it because it broke several interesting applications.

What you didn't notice is that it fixed many programs that were
broken. We weren't able to come up with a fix which maintained
compatibility with all programs which *happened* to work and with the
documentation, and which also fixed the programs that were broken.
If you have such a fix and haven't already told us about it, please
do so.
gwyn@smoke.ARPA (Doug Gwyn ) (09/15/88)
In article <68214@sun.uucp> shannon%datsun@Sun.COM (Bill Shannon) writes:
[re. sticky EOF]
>What you didn't notice is that it fixed many programs that were broken.

It's certainly true that there were several programs found on UNIX
systems that "read the EOF" more than once, and thus would fail
miserably on files other than static fixed-length files, most notably
when reading from ttys.

Because "EOF" is not an official UNIX notion (really it is "0 bytes
read") and because this is often a transient condition, I much prefer
to fix the applications that made the bogus assumption and leave the
library alone.
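The point that end of file is really "0 bytes read", and can be transient, shows up directly at the system-call level. The following is a rough sketch assuming a POSIX system; the function name transient_eof_demo and the /tmp path template are illustrative only and do not appear in the thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read() from an empty file, append three bytes
 * through a second descriptor, then read() again. Stores both return
 * values. Returns 0 on success, -1 on setup failure. */
int transient_eof_demo(ssize_t *at_end, ssize_t *after_append)
{
    char path[] = "/tmp/zeroreadXXXXXX";    /* assumed writable directory */
    char buf[8];
    int wfd, rfd;

    wfd = mkstemp(path);
    if (wfd < 0)
        return -1;
    rfd = open(path, O_RDONLY);
    unlink(path);                   /* file lives on until both fds close */
    if (rfd < 0) {
        close(wfd);
        return -1;
    }

    /* Nothing to read yet: the kernel reports 0 bytes, not a flag. */
    *at_end = read(rfd, buf, sizeof buf);

    if (write(wfd, "abc", 3) != 3) {
        close(rfd);
        close(wfd);
        return -1;
    }

    /* The same descriptor now yields the appended bytes. */
    *after_append = read(rfd, buf, sizeof buf);

    close(rfd);
    close(wfd);
    return 0;
}
```

Here the first read() returns 0 and the second returns 3; nothing has to be cleared in between, because the descriptor carries no end-of-file state of its own.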
shannon%datsun@Sun.COM (Bill Shannon) (09/15/88)
In article <8495@smoke.ARPA>, gwyn@smoke.ARPA (Doug Gwyn ) writes:
> I much prefer to fix the
> applications that made the bogus assumption and leave the library alone.

based on the documentation available at the time, it wasn't clear
*which* applications were making bogus assumptions. all we knew is
that we had a set of applications that were making incompatible
assumptions. based on our reading of the documentation that was
available, we made the change that you're complaining about.
needless to say, we did not make this change in isolation, we
consulted with several other experienced UNIX developers and they
agreed with us. I'm sorry we didn't consult you.

I'd be happy to let POSIX or ANSI C or whatever tell us what the
right answer is, but at this point your answer is only different.