greg@utcsri.UUCP (Gregory Smith) (04/10/86)
In article <2476@brl-smoke.ARPA> rbj@icst-cmr (Root Boy Jim) writes:
>Which brings me to another point. Fgets is worthless on binary
>data. It returns its first argument, which I already know.
>If a null is part of the data, how do you know where it stopped
>reading. Well if you're lucky, there will be a newline in there
>and that's the end of it. But if you're reading blocks of nulls,
>you're SOL. I would like fgets to return the number of chars read.
>
That's exactly what 'read' is there for, no?

Still - I agree. Even if there is a single null in a line, you will
effectively lose everything between that null and the next '\n' if you
read it with fgets.

I too have a question regarding fgets. fgets, as has been said, normally
stops reading at the end of a line (after a '\n'). I had the following
problem with EOF detection: Suppose that the last line in the file is
"wombat soup", and this is followed by '\n' and EOF, as is the normal case
for text files. So my second-to-last call to fgets reads "wombat soup\n"
and does not set feof(infile). My last call to fgets, however, just sets
feof(infile) and returns! It didn't write anything into the buffer. So the
program saw "wombat soup\n" twice.

If the last line is *not* '\n'-terminated, the last call to fgets puts a
null-terminated "wombat soup" into the string and sets feof(infile), which
is reasonable and what I expected. So why doesn't fgets stick a '\0' in
the buffer when it sees EOF immediately? Isn't this a bug? What I did to
fix it was to set line_buffer[0]=NUL *before* calling fgets, which is
simple enough to do. Still.... grumble, grumble...

We have 4.2 BSD on vax11/780.
--
"If you aren't making any mistakes, you aren't doing anything".
----------------------------------------------------------------------
Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg
henry@utzoo.UUCP (Henry Spencer) (04/11/86)
> ...So why doesn't fgets stick
> a '\0' in the buffer when it sees EOF immediately? Isn't this a bug?
> What I did to fix it was to set line_buffer[0]=NUL *before* calling
> fgets, which is simple enough to do. Still.... grumble, grumble...

Probably the right answer to this is that you should be checking the
return value from fgets, rather than consulting feof separately. The
semantics of the return value aren't well explained in the manual, but
the code is doing the right thing: you get NULL back only if there was
*nothing* available on the input. If there's a partial line at the end
of the file, you get that partial line and a non-NULL return, and then
*next* time you get a NULL.

It is widely agreed that the details of the semantics of fgets could have
been done better.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry
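A minimal sketch of the return-value semantics described above, not taken from any post in the thread. The helper name `count_lines` and the 256-byte buffer are assumptions made for illustration, and the count is only a line count if every line fits in the buffer.

```c
#include <stdio.h>

/* Hypothetical helper: counts how many times fgets() succeeds on fp.
 * The loop is driven by the return value alone; feof() is never consulted.
 * Assumes every line is shorter than the buffer. */
int count_lines(FILE *fp)
{
    char buf[256];
    int n = 0;

    /* NULL comes back only when nothing at all could be read. */
    while (fgets(buf, sizeof buf, fp) != NULL)
        n++;
    return n;
}
```

Under these assumptions a file holding "wombat soup\n" and a file holding an unterminated "wombat soup" should both be counted as one line, because the partial last line produces one non-NULL return before the NULL.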
jsdy@hadron.UUCP (Joseph S. D. Yao) (04/16/86)
In article <2524@utcsri.UUCP> greg@utcsri.UUCP (Gregory Smith) writes:
>I had the following problem with EOF detection: Suppose that the
>last line in the file is "wombat soup", and this is followed by '\n' and
>EOF, as is the normal case for text files. So my second-to-last call to
>fgets reads "wombat soup\n" and does not set feof(infile). My last call
>to fgets, however, just sets feof(infile) and returns! It didn't write
>anything into the buffer. So the program saw "wombat soup\n" twice.

Good heavens, man. Do you mean to say they don't teach you to check
your return values? That's what they're for, after all. The correct
paradigm is:

	char buf[...];	/* If arg or char*, can't use sizeof */
	extern char *fgets();

	while (fgets(buf, sizeof(buf), file) != (char *) NULL) {
		...
	}

This is why fgets() returns a value: the fact that a non-NULL return
is the value of buf, the usefulness of which was questioned by an
earlier writer, is just to make it something non-NULL. (If buf is
NULL-valued, you have other problems.)

Once again, apropos another comment in the above note, fgets() is
intended for reading "standard text files," which are strings of
ASCII characters (assumed non-NUL), each "line" of which is terminated
by a newline (NL) character. For anything else, one should check
whether fread() might not be a better routine to use.
--
Joe Yao hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
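For the binary-data case, a hedged sketch of the fread() alternative: because fread() returns the number of items it stored, embedded NUL bytes are counted like any other byte. The helper name `count_bytes` and the 512-byte block size are assumptions for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: total number of bytes readable from fp,
 * NUL bytes included. fread() reports how many items it stored,
 * so nothing depends on '\0' or '\n' appearing in the data. */
size_t count_bytes(FILE *fp)
{
    unsigned char block[512];
    size_t total = 0;
    size_t got;

    while ((got = fread(block, 1, sizeof block, fp)) > 0)
        total += got;
    return total;
}
```

A short count from fread() does not say whether the stream hit end-of-file or an error; a caller that cares would check ferror(fp) after the loop.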
BJORNDAS%CLARGRAD.BITNET@WISCVM.WISC.EDU (08/20/86)
O great C gurus, help a relative greenhorn! Why is it that fgets() returns NULL when it reaches end of file, whereas all the other standard i/o functions seem to return EOF at that point? This confuses me, especially since one would suppose fputs() to be the partner function of fgets() and therefore to work in the same way. Sterling Bjorndahl BJORNDAS at CLARGRAD on BITNET
bzs@bu-cs.BU.EDU (Barry Shein) (08/22/86)
fgets() returns a char * to the string read. Traditionally, any function
that cannot return a promised pointer (such as when an EOF occurs) returns
NULL (there exist a few syscalls which return ((char *) -1) or equivalent,
c'est la vie, this has been hashed out, I guess the rule that syscalls
return -1 won [eg. sbrk].)

puts() and fputs() always return the result of the last putc() done (at
least they do in 4.2/4.3bsd.) No mention of this is made in the manual
pages I have. This will be the last character of the string unless an
error occurred, in which case it will be EOF. Notice that for puts() this
will always be '\n' (or EOF on error.)

Your intuitions seem right, they should probably both return a similar
thing (ie. fputs() should probably return a pointer to the string printed,
or NULL on error, or maybe fgets() should return the value of the last
'getc()' done, I like the former better.) Oh well, such is history.

Of course, in the case of getc() et al, EOF makes sense as an out-of-band
character (ie. a value that cannot be a legal char, hence distinguishable,
but no relation to a pointer.)

Thus, fgets() is consistent with the idea of returning a NULL as a failed
pointer while fputs() is consistent with the documentation in that the doc
seems to promise nothing...

	-Barry Shein, Boston University
guy@sun.uucp (Guy Harris) (08/22/86)
> Why is it that fgets() returns NULL when it reaches end of file,
> whereas all the other standard i/o functions seem to return EOF
> at that point?

Because "fgets" returns a value of type "char *" while most of the other
functions and macros return a value of type "int". EOF is not a valid value
of type "char *", so "fgets" can't return EOF. NULL is a valid value of
type "char *", and doesn't refer to any object, so it's the proper choice
for an "out-of-band" value for "fgets" to return on error.

Now, you can ask "why does 'fgets' return a value of type 'char *'?" at
this point. It returns a pointer to the buffer that it just filled in;
obviously *somebody* found this useful, although I don't find it so.

If "fgets" didn't return that pointer, it could have been defined as
returning a value of type "int" instead, and that value would have been 0 on
success and EOF on failure. It's too late to change it, though.

BTW: "fgets()" returns NULL on end-of-file OR error; don't write code that
assumes that "fgets()" returning NULL means that the end of the file was
found, use "ferror" or "feof" to disambiguate these cases.
--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)
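The last point can be sketched as follows. This is an illustration under assumptions, not code from the thread: the helper name `drain_lines`, its 0 / -1 return convention, and the buffer size are all made up here.

```c
#include <stdio.h>

/* Hypothetical helper: reads fp to exhaustion with fgets().
 * Returns 0 if the loop stopped at end-of-file, -1 if it stopped
 * because of a read error. */
int drain_lines(FILE *fp)
{
    char buf[256];

    while (fgets(buf, sizeof buf, fp) != NULL)
        ;   /* a real program would process buf here */

    /* NULL alone is ambiguous; the stream indicators say which case it was. */
    if (ferror(fp))
        return -1;
    return 0;
}
```

Checking ferror() first is a design choice in this sketch: it makes sure an error is never silently reported as a normal end of input.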
cramer@kontron.UUCP (Clayton Cramer) (08/29/86)
> O great C gurus, help a relative greenhorn! Why is it that fgets()
> returns NULL when it reaches end of file, whereas all the other
> standard i/o functions seem to return EOF at that point? This
> confuses me, especially since one would suppose fputs() to be the
> partner function of fgets() and therefore to work in the same way.
>
> Sterling Bjorndahl
> BJORNDAS at CLARGRAD on BITNET

Ah, heck. Next you'll complain because fgets() and gets() do different
things to the \n at the end of a string.

There are days I wish every line of C currently existing would evaporate,
so that the I/O functions could be...rationalized.

Clayton E. Cramer
karl@haddock (09/03/86)
sun!guy (Guy Harris) writes:
>Now, you can ask "why does 'fgets' return a value of type 'char *'?" at
>this point. It returns a pointer to the buffer that it just filled in;
>obviously *somebody* found this useful, although I don't find it so.

Me neither. It sounds like a "why not?" situation.

>If "fgets" didn't return that pointer, it could have been defined as
>returning a value of type "int" instead, and that value would have been 0 on
>success and EOF on failure.

Better yet, return the number of characters read (so 0 on failure).

>It's too late to change it, though.

Not without changing the name again. (That's how fgets() evolved from
gets(), though the latter still exists.)

Karl W. Z. Heuer (ima!haddock!karl; karl@haddock.isc.com), The Walking Lint
karl@haddock (09/03/86)
bzs@bu-cs.BU.EDU (Barry Shein) writes:
>Traditionally, any function that cannot return a promised pointer ...
>returns NULL (there exists a few syscalls which return ((char *) -1) or
>equivalent, c'est la vie, this has been hashed out, I guess the rule that
>syscalls return -1 won [eg. sbrk].)

Last time I used a pdp11, there was a bug in lseek(). The error return,
which should have been (long)-1 (i.e. 0xffffffff) was 0xffff0000. I never
determined whether this was an AT&T standard bug or a local glitch. The
surprising thing is that lseek *was* making the check, but it explicitly
set r1 (the lower half) to 0 instead of -1!

On systems where char* is wider than int, sbrk() could have a similar
problem.

Karl W. Z. Heuer (ima!haddock!karl; karl@haddock.isc.com), The Walking Lint
guy@sun.uucp (Guy Harris) (09/04/86)
> Better yet, return the number of characters read (so 0 on failure).
Better still, return the number of characters read, but return EOF on
failure; 1) this is what "fputs" does and 2) this encourages you to
distinguish between EOF and error, something that several UNIX utilities, to
their everlasting shame, do not do.
--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.com (or guy@sun.arpa)
Bader@b.psy.cmu.edu (Miles Bader) (09/05/86)
Neither of your suggestions (# of characters, EOF or 0) makes any more sense to me than the current behavior, which seems perfectly reasonable. If nothing better is being offered, why waste time arguing about it?
karl@haddock (09/08/86)
sun!guy (Guy Harris) writes:
>[haddock!karl writes:]
>>Better yet, return the number of characters read (so 0 on failure).
>Better still, return the number of characters read, or EOF;

One could argue that zero is the number of chars read, and that there is no
failure return as such (cf. fread()). But see below.

>1) this is what "fputs" does

The successful result of fputs/puts is not mentioned in my manual; all one
can conclude is that it differs from the failure result (EOF). X3J11 says
that the successful result is zero.

>and 2) this encourages you to distinguish between EOF and error.

Are you suggesting that the result should be EOF if end-of-file was reached,
but 0 if a read error occurred? This is workable for gets()/puts() (except
for fputs() of an empty string), but not (e.g.) scanf(), and is probably a
bad idea in general.

My revised opinion: functions other than system calls should return NULL for
an out-of-band pointer, EOF for an OOB character, or ERROR (which should be
defined someplace) for an OOB int. ERROR and EOF are logically different,
even if they have the same value. Physical end-of-file, read-error, and a
legitimate result of -1 can (and should) be distinguished with feof, ferror,
and/or errno.

The ngets() function (if it gets written) should return ERROR on failure.
It can return zero only if passed a zero-length buffer. Similarly, nputs()
should return the number of characters written, or ERROR. (This one could
be called "puts", except that on an implementation that uses zero for puts()
success it may break programs that depend on this.)

>distinguish between EOF and error, something that several UNIX utilities,
>to their everlasting shame, do not do.

Programs should always check for failure returns, and distinguish the types
of failure when it matters (as it usually does when end-of-file is one
type). But that's a more general problem, and although there are some
fairly nice ways to solve it, they would break a lot of existing code.

Karl W. Z. Heuer (ima!haddock!karl; karl@haddock.isc.com), The Walking Lint
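One possible reading of the ngets() proposal, sketched on top of getc(). No such function exists in the standard library; the names `ngets_sketch` and `NGETS_ERROR`, the value -1, and the treatment of a buffer with room only for the terminator are all assumptions made for illustration.

```c
#include <stdio.h>

#define NGETS_ERROR (-1)   /* hypothetical out-of-band int, not a standard name */

/* Sketch of an fgets()-like reader. Returns the number of characters
 * stored in buf (not counting the terminating '\0'), or NGETS_ERROR if
 * end-of-file arrived before any character, or if a read error occurred.
 * Returns 0 only when buf has no room for any character. */
int ngets_sketch(char *buf, int size, FILE *fp)
{
    int n = 0;
    int c = 0;

    if (size <= 1) {
        if (size == 1)
            buf[0] = '\0';
        return 0;
    }
    while (n < size - 1 && (c = getc(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;          /* keep the newline, as fgets() does */
    }
    if (c == EOF && (n == 0 || ferror(fp)))
        return NGETS_ERROR; /* caller uses feof()/ferror() to tell which */
    buf[n] = '\0';
    return n;
}
```

Because the count is returned, a caller can see how many bytes were stored even when the line contains NUL characters, which is the complaint that opened the thread.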
karl@haddock (09/09/86)
Bader@b.psy.cmu.edu writes:
>[concerning the return value of gets()/fgets()]
>Neither of your suggestions (# of characters, EOF or 0) makes any more sense
>to me than the current behavior, which seems perfectly reasonable. If
>nothing better is being offered, why waste time arguing about it?

In the current behavior, the value on successful return is the buffer arg --
a useless value already available to the user. The character count is not
available except by calling strlen(), which is somewhat redundant since the
library function already has the value (or enough information to construct
it in constant time).

There is a similar duplication of effort in strcpy() followed by strcat(),
for which reason I think strcpy() should've returned the END of the string
(pointer to the '\0') instead of the beginning. I need that value more
often than I need a copy of the first argument.

Karl W. Z. Heuer (ima!haddock!karl; karl@haddock.isc.com), The Walking Lint
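The strcpy() remark can be illustrated with a small sketch. The names `copy_end` and `join2` are hypothetical; later POSIX systems provide stpcpy() with comparable behavior, but nothing below depends on it.

```c
#include <stddef.h>

/* Hypothetical strcpy() variant: copies src, including its '\0', into dst
 * and returns a pointer to the terminating '\0' in dst. */
char *copy_end(char *dst, const char *src)
{
    while ((*dst = *src) != '\0') {
        dst++;
        src++;
    }
    return dst;
}

/* Concatenates a and b into out without rescanning what was already
 * copied. Returns the length of the result; out must be large enough. */
size_t join2(char *out, const char *a, const char *b)
{
    char *end = copy_end(copy_end(out, a), b);

    return (size_t)(end - out);
}
```

With strcpy() followed by strcat(), the second call has to walk the first string again to find its end; here the end pointer is handed straight to the next copy, and the final length falls out of a pointer subtraction instead of a strlen() call.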