cdl@mplvax.UUCP (Carl Lowenstein) (05/06/85)
The same ugly undocumented behavior has shown up with the stdio library using 3 C compilers on 3 operating systems (DECUS C on RT-11, cc on SysV (3b2), and cc on 4.2BSD (vax)).

First to quote the documentation:

"These functions return . . . a short count for . . . illegal data items" (SysV).

". . . if conversion was intended, it was frustrated by an inappropriate character in the input." (4.2BSD).

Ok, but the character pointer is never advanced past that inappropriate character, so the poor user's program is either stuck in an infinite loop or else it has to advance the pointer to get going again.

Surely others have noticed this in the past. Below is a little test program which shows a workaround. Without the getchar(), it will loop until you get tired of watching it. Try it with bad octal digits like 8,9,a,b . . .

/*-------------------------------------------------------------------------*/
/*	scanft.c	*/
/*
 * look at bug in scanf
 */
#include <stdio.h>

main()
{
	int i, k;

	for (;;) {
		printf("\n number: ");
		k = scanf("%o", &i);
		printf("scanf returns %d\n",k);
		if (k == EOF)
			break;
		if (k == 0){
			i = getchar();		/* flush a character */
			printf(" choked on '%c'\n",i);
			continue;		/* go back and ask again */
		}
		printf("value = %o\n", i);
	}
	exit(0);
}
/*-------------------------------------------------------------------------*/

-- 
carl lowenstein		marine physical lab	u.c. san diego
{ihnp4|decvax|akgua|dcdwest|ucbvax}	!sdcsvax!mplvax!cdl
gwyn@Brl.ARPA (VLD/VMB) (05/08/85)
That's not a bug, it's a feature. How else would you be able to determine what comes next when a scanf stops prematurely? If it ate the "failing" character, you could never see what it was. I think the routine was designed on the assumption that the programmer would not be so stupid as to keep trying to scan a chunk of input over & over with the same failing format.
cdl@mplvax.UUCP (Carl Lowenstein) (05/09/85)
In article <10496@brl-tgr.ARPA> gwyn@Brl.ARPA (VLD/VMB) writes:
>That's not a bug, it's a feature.  How else would you be able
>to determine what comes next when a scanf stops prematurely?
>If it ate the "failing" character, you could never see what it
>was.  I think the routine was designed on the assumption that
>the programmer would not be so stupid as to keep trying to
>scan a chunk of input over & over with the same failing format.

*mild flame*

This programmer is so stupid as to expect to find the behavior
of scanf documented in the manual.

*unflame*

-- 
carl lowenstein		marine physical lab	u.c. san diego
{ihnp4|decvax|akgua|dcdwest|ucbvax}	!sdcsvax!mplvax!cdl
zben@umd5.UUCP (05/12/85)
In article <190@mplvax.UUCP> cdl@mplvax.UUCP (Carl Lowenstein) writes:
>This programmer is so stupid as to expect to find the behavior
>of scanf documented in the manual.

Ye Gods! Expect the behavior of system primitives to be DOCUMENTED in the MANUAL?? Why, why, that's as bad as expecting meaningful diagnostics from the system language compiler! "Error in conditional" indeed...

Clearly this poor person is from a 'dinosaur' environment, probably an IBM 370 or Univac 1100 system, where people actually take more than 10 seconds to document what they have done, and where you have a ghost of a chance of finding out **ANYTHING** from the manuals, as opposed to having to prostrate yourself before a Unix Guru (read "high priest") to get the real scoop...

Clearly I'm more than a little burned by being called a 'high priest' for merely spending 15 years reading Univac manuals and system code, to get to the point where I can *answer* questions from users too *lazy* to *read* the manuals... Still, this sort of silliness is exactly why I have a hard time believing that Unix and C are "for real". At this point I find Unix and C to be at the halfway point in the reality spectrum between my real Univac 1100 work and trying to do systems programs in Applesoft Basic...

-- 
Ben Cranston  ...{seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben  zben@umd2.ARPA
matt@oddjob.UUCP (Matt Crawford) (05/12/85)
In article <190@mplvax.UUCP> cdl@mplvax.UUCP (Carl Lowenstein) writes:
>In article <10496@brl-tgr.ARPA> gwyn@Brl.ARPA (VLD/VMB) writes:
>>If it ate the "failing" character, you could never see what it
>>was.  I think the routine was designed on the assumption that
>>the programmer would not be so stupid as to keep trying to
>>scan a chunk of input over & over with the same failing format.
>
>*mild flame*
>
>This programmer is so stupid as to expect to find the behavior
>of scanf documented in the manual.
>
>*unflame*
>	carl lowenstein		marine physical lab	u.c. san diego

THIS programmer is not too arrogant to open the manual before telling
someone what's not in it:

SCANF(3S)           UNIX Programmer's Manual            SCANF(3S)

     For example, .....

          int i; float x; char name[50];
          scanf("%2d%f%*d%[1234567890]", &i, &x, name);

     with input

          56789 0123 56a72

     will assign 56 to i, 789.0 to x, skip `0123', and place the
     string `56\0' in name.  The next call to getchar will return
                             ------------------------------------
     `a'.
     ----

If you make a mistake you can (a) admit it, (b) shut up, or (c) prolong the argument and provide more entertainment. I will choose course (a) and admit that I am making a mistake by posting anything at all on this subject.
_____________________________________________________
Matt		University	crawford@anl-mcs.arpa
Crawford	of Chicago	ihnp4!oddjob!matt
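
For anyone who wants to try the manual's example without typing at a terminal, here is a minimal sketch that runs it through sscanf on a string. Two things are assumptions added here and are not in the manual page as quoted: a blank is placed before %[ (in standard C, %[ does not skip leading white space, so the blank is what lets it get past the space in front of "56"), and a width of 49 keeps the conversion inside name[50]. The function name scan_manual_example and the trailing %n are illustrative only.

```c
#include <stdio.h>

/*
 * Run the manual-page example against an in-memory string.
 * Returns sscanf's count of assigned items (%*d and %n are not
 * counted).  *consumed is updated by %n only if every directive
 * before it matched; otherwise it stays 0.
 */
int scan_manual_example(const char *input, int *i, float *x,
                        char name[50], int *consumed)
{
    *consumed = 0;
    /* The blank before %[ skips white space; %49[ bounds the write. */
    return sscanf(input, "%2d%f%*d %49[0123456789]%n",
                  i, x, name, consumed);
}
```

Called with the input "56789 0123 56a72", the character at input[consumed] is the `a' that the manual says the next getchar would return, which is the whole point of the thread: the conflicting character is still there to be looked at.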
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (05/12/85)
> Clearly I'm more than a little burned by being called a 'high priest' for
> merely spending 15 years reading Univac manuals and system code, to get to
> the point where I can *answer* questions from users too *lazy* to *read*
> the manuals...  Still, this sort of sillyness is exactly why I have a
> hard time believing that Unix and C are "for real".

UNIX was developed by and for intelligent programmers.
geoff@burl.UUCP (geoff) (05/13/85)
> >the programmer would not be so stupid as to keep trying to
> >scan a chunk of input over & over with the same failing format.
>
> *mild flame*
>
> This programmer is so stupid as to expect to find the behavior
> of scanf documented in the manual.
>
> *unflame*
>
> --
> carl lowenstein		marine physical lab	u.c. san diego
> {ihnp4|decvax|akgua|dcdwest|ucbvax}	!sdcsvax!mplvax!cdl

how about the bottom of page 2 of scanf documentation (V5.2)--

"Scanf conversion terminates at EOF, at the end of the control string, or
when an input character conflicts with the control string.  In the latter
case, the offending character is left unread in the input stream."

I can only surmise that you have a different version of the manual --
it does seem quite clear.

	geoff sherwood
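
Given that the offending character is left unread, one way to get going again is to throw away the rest of the line it sits on rather than a single character. The following is only a sketch under that assumption; next_octal is a made-up name, and discarding the whole line is a design choice, not something the manual prescribes.

```c
#include <stdio.h>

/*
 * Read the next octal number from fp, discarding any line that
 * starts a conversion with a character %o refuses.
 * Returns 1 with *out set, or 0 at end of input.
 */
int next_octal(FILE *fp, unsigned int *out)
{
    int k, c;

    while ((k = fscanf(fp, "%o", out)) != 1) {
        if (k == EOF)
            return 0;
        /* k == 0: the offending character is still unread, so
         * consume it and the rest of its line before retrying. */
        while ((c = getc(fp)) != EOF && c != '\n')
            ;
    }
    return 1;
}
```

Note that input such as 19 still yields the value 1 first, because %o stops happily in front of the 9; only the following call sees the conflict. Checking the whole line before accepting it needs a different approach.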
jack@boring.UUCP (05/13/85)
Ahh, the joys of scanf.....

Something I've tried about every year in the last decade but haven't got to work on any machine is the following :

main() {
    char buf[64];

    printf("Gimme string -");
    scanf("%s\n", buf);
    ...

I tried to leave the \n out, putting a space in its place, putting a space before the %s, everything. Never, though, have I succeeded in reading the *first* string from stdin with scanf (the rest is no problem). So, every time I need to do this, I fiddle with scanf for an hour or so, and then replace the scanf by a fgets() or gets().

Question: Am I asking impossible things from scanf, or am I just soooooo very stupid that I haven't found out how to do this in many many years???? (I would prefer answers in the form 'it is impossible', but I'll settle for 'you are stupid', if accompanied by an explanation *why* I am stupid).

Note that this is about reading the *first* string from stdin. After that, things are fine, as long as you're careful where to scan the \n's, etc.

-- 
	Jack Jansen, jack@mcvax.UUCP
	The shell is my oyster.
gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (05/14/85)
> Something I've tried about every year in the last decade but
> haven't got to work on any machine is the following :
>
> main() {
>     char buf[64];
>
>     printf("Gimme string -");
>     scanf("%s\n", buf);
>     ...

Try:

#include <stdio.h>

/*ARGSUSED*/
main(argc, argv)
	char *argv[];
{
	char buf[64];

	(void)printf("Gimme string -");
	(void)scanf("%[^\n]", buf);
	(void)getchar();	/* eat NL */
	...
}

It is important not to begin or end the format string with whitespace, since that causes ALL whitespace in the input stream at that point to be skipped. In particular, trying to consume the newline with the format statement will cause you to have to type extra stuff before the first line scan is considered complete, and leading whitespace on the second line will be eaten.

By the way, what happens if someone types a very long line in response to your prompt? (This sort of thing caused some really bad security loopholes in older UNIX systems.) The safe way to input a line is with fgets() (NOT gets()).
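
The fgets() advice might be sketched as follows. This is one possible shape, not the only one: read_line is a hypothetical helper, and its policy of silently dropping the excess of an over-long line is an assumption made for the illustration (a real program might prefer to report it).

```c
#include <stdio.h>
#include <string.h>

/*
 * Read one line from fp into buf (at most size-1 characters),
 * strip the trailing newline, and discard any excess so that
 * the next call starts on a fresh line.
 * Returns 1 on success, 0 at end of input or on error.
 */
int read_line(FILE *fp, char *buf, size_t size)
{
    size_t len;
    int c;

    if (size == 0 || fgets(buf, (int)size, fp) == NULL)
        return 0;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* the whole line fit */
    } else {
        /* Line too long, or input ended without a newline. */
        while ((c = getc(fp)) != EOF && c != '\n')
            ;
    }
    return 1;
}
```

With char buf[64], a call such as read_line(stdin, buf, sizeof buf) can never write past the end of buf however long the typed line is, which is exactly what neither gets() nor an unbounded %s or %[^\n] can promise.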
geoff@utcs.UUCP (Geoff Collyer) (05/14/85)
In article <10600@brl-tgr.ARPA> gwyn@brl-tgr.ARPA (Doug Gwyn) writes:
>> ... I can *answer* questions from users too *lazy* to *read*
>> the manuals...
>
>UNIX was developed by and for intelligent programmers.

printf(3S) and scanf(3S) are incomplete and slippery specifications. If you doubt this, try writing the code from the manual pages. The current manual pages (and source code?) seem to be descended from the v6 Portable C library (-lp) and have never been substantially modified.

The v7 printf(3S) implies that one can supply a format specifier of %lu to print an unsigned long int. The v7 C compiler doesn't support unsigned longs, yet %lu will print a long int as if it were unsigned. It is possible to express this in C (assuming twos-complement representation) by heroic measures.

What should printf do when given the format specifier %017s and a string shorter than 17 characters? I read printf(3S) as saying that printf will pad with zeroes, though the v7 printf (at least) pads with blanks and Dennis Ritchie has argued that this is desirable behaviour.

Various versions of System III or V don't support zero padding when the field width begins with a zero. AT&T has converted this incompatible behaviour from a bug into a feature by documenting it (at least in System V). To date, ANSI has wisely sided against AT&T in this case.

scanf(3S) implies that inappropriate characters in the input will be left unread, but this is not possible, given stdio's (zero or) one character of pushback, for pathological input such as 3.4e-z under %f; the best one can do is to push back the z, though all of e-z should be pushed-back.

At a quick glance, the draft ANSI C library write-ups for printf and scanf seem better than the UNIX manual pages, though still not as explicit as I would like.

-- 
"All I'm after is just a *mediocre* brain, something like the president of the AT&T Company."
		- Alan Turing
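
The pushback limit is a property of reading from a stream. When the text is already in memory the difficulty does not arise, as this small sketch suggests. It assumes a C library that provides the standard strtod(); leading_double is a name invented for the illustration.

```c
#include <stdlib.h>

/*
 * Convert the longest valid floating-point prefix of s.
 * *rest is set to the first character that was not used,
 * however far back that is from where the scan gave up.
 */
double leading_double(const char *s, const char **rest)
{
    char *end;
    double v = strtod(s, &end);

    *rest = end;
    return v;
}
```

Given "3.4e-z", the value returned is 3.4 and rest points at the e, so all of e-z is still available to the caller. That is the result the scanf manual page implies but a stream with one character of pushback cannot deliver.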
zben@umd5.UUCP (05/15/85)
In article <714@oddjob.UUCP> matt@oddjob.UUCP (Matt Crawford) writes:
>In article <190@mplvax.UUCP> cdl@mplvax.UUCP (Carl Lowenstein) writes:
>>In article <10496@brl-tgr.ARPA> gwyn@Brl.ARPA (VLD/VMB) writes:
>>>
>>>If it ate the "failing" character, you could never see what it
>>>was.  I think the routine was designed on the assumption that
>>>the programmer would not be so stupid as to keep trying to
>>>scan a chunk of input over & over with the same failing format.
>>
>>This programmer is so stupid as to expect to find the behavior
>>of scanf documented in the manual.
>>
>THIS programmer is not too arrogant to open the manual before telling
>someone what's not in it:
>
>SCANF(3S)           UNIX Programmer's Manual            SCANF(3S)
>
>     For example, .....
>
>          int i; float x; char name[50];
>          scanf("%2d%f%*d%[1234567890]", &i, &x, name);
>
>     with input
>
>          56789 0123 56a72
>
>     will assign 56 to i, 789.0 to x, skip `0123', and place the
>     string `56\0' in name.  The next call to getchar will return
>                             ------------------------------------
>     `a'.
>     ----
>

The same documentation appears on our 2.9BSD system - I guess it is the same on 4.xBSD - and yes, a reasonable person should be able, after scratching his head for awhile, to figure out what is happening. How much time do you waste scratching your head? The following mail arrived and I think it germane:

-------------------------------------------------------------
I get tired of people saying that UNIX & C are not documented. There are a few undocumented features of programs, but they are that way because they might go away, and shouldn't be used (yet). E.g., the VPATH variable in make. But all the system functions are documented *quite well*. Take the scanf manual page:

----
Scanf conversion terminates at EOF, at the end of the control string, or when an input character conflicts with the control string. In the latter case, the offending character is left unread in the input stream.

Scanf returns the number of successfully matched and assigned input items; this number can be zero in the event of an early conflict between an input character and the control string. If the input ends before the first conflict or conversion, EOF is returned.
----

If that isn't *painfully obvious*, I don't know what is. Maybe you're using 4.2BSD; if you do, I apologize. That system is a total hack munged by grad students and the documentation is even worse. This excerpt comes from SVR2. Since it is a production system, it has to be well-documented, and it is.

	Michael Baldwin
	AT&T Bell Labs
-------------------------------------------------------------

Now *this* is adequate documentation...

Re: "high priest" thing. It's very easy to tell the most vicious form of Polack joke, until you really become friends with a Pole. It must similarly be very easy to eliminate upon "high priests", until you are confronted with one. Maturity consists in large measure of doing what is right in preference to doing what is easy. Nuf said?

-- 
Ben Cranston  ...{seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben  zben@umd2.ARPA
cdl@mplvax.UUCP (Carl Lowenstein) (05/15/85)
In article <687@burl.UUCP> geoff@burl.UUCP (geoff) writes:
>
>how about the bottom of page 2 of scanf documentation (V5.2)--
>
>"Scanf conversion terminates at EOF, at the end of the control string, or
>when an input character conflicts with the control string.  In the latter
>case, the offending character is left unread in the input stream."
>
>I can only surmise that you have a different version of the manual --
>it does seem quite clear.
>	geoff sherwood

You're right. It is quite clear. Unfortunately, it isn't in the 4.2BSD manual, the v7 manual, the Decus manual. Since I have all these and SVR2 too in different places, it's easy to get confused.

I wish I could find the original stdio document from v6 to see whether that sentence got dropped along the way, or was recently added to prevent people like me from provoking discussions unnecessarily.

-- 
carl lowenstein		marine physical lab	u.c. san diego
{ihnp4|decvax|akgua|dcdwest|ucbvax}	!sdcsvax!mplvax!cdl
guy@sun.uucp (Guy Harris) (05/15/85)
Now you know why, the few times I've ever used "scanf", I read the string into a buffer and used "sscanf"... Guy Harris
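
The buffer-then-sscanf approach, applied to the octal input that started this thread, could look something like the sketch below. It assumes the line has already been read into a buffer (with fgets(), say); parse_octal_line is a hypothetical name, and rejecting anything but a single number per line is a policy chosen for the example.

```c
#include <stdio.h>

/*
 * Parse a line that should hold exactly one octal number,
 * optionally surrounded by white space.
 * Returns 1 and stores the value in *out, or returns 0 if the
 * line is empty, has a bad digit, or has trailing junk.
 */
int parse_octal_line(const char *line, unsigned int *out)
{
    unsigned int v;
    int end = 0;

    /* The blank skips trailing white space; %n records how far we got. */
    if (sscanf(line, "%o %n", &v, &end) != 1)
        return 0;
    if (line[end] != '\0')
        return 0;               /* e.g. the '9' in "19" */
    *out = v;
    return 1;
}
```

Because each call starts from a fresh line, a bad digit can never wedge the program in a loop: the caller prints a complaint, reads the next line, and tries again. The failing text is also still sitting in the buffer for the error message.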