avery@netcom.UUCP (Avery Colter) (11/11/90)
In the self-teaching course I have here, scanf is the most often used input function. I don't see gets used much at all. And indeed, gets only seems of much advantage when you want to take in a whole line into one string. Otherwise, scanf can take individual numbers and put them directly into numerical variables. With gets, you'd have to first manually parse the line, and then use strtol to translate them into numbers. I didn't see puts used for printing strings to screen much either. printf was the function of choice. -- Avery Ray Colter {apple|claris}!netcom!avery {decwrl|mips|sgi}!btr!elfcat (415) 839-4567 "Fat and steel: two mortal enemies locked in deadly combat." - "The Bending of the Bars", A. R. Colter
gordon@osiris.cso.uiuc.edu (John Gordon) (11/11/90)
Scanf() is bad because if you use it to directly get user input, and the user types in something different than scanf() is expecting, it screws up. A better scheme is to store user input in an intermediate buffer and sscanf() the buffer.
roy%cybrspc@cs.umn.edu (Roy M. Silvernail) (11/11/90)
avery@netcom.UUCP (Avery Colter) writes: > In the self-teaching course I have here, scanf is the most often used > input function. I don't see gets used much at all. > > And indeed, gets only seems of much advantage when you want to take in > a whole line into one string. The problem with scanf() is that it can behave unpredictably when you give it badly formatted input. It's better, IMHO, to gets() a whole line, check its validity and _then_ sscanf() it into the target variables. (no need for strtol() or similar, since sscanf() looks at the validated string just as scanf() would have looked at the original input) It just makes things more bullet-resistant. -- Roy M. Silvernail |+| roy%cybrspc@cs.umn.edu |+| #define opinions ALL_MINE; main(){float x=1;x=x/50;printf("It's only $%.2f, but it's my $%.2f!\n",x,x);} "This is cyberspace." -- Peter da Silva :--: "...and I like it here!" -- me
jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) (11/12/90)
In article <16582@netcom.UUCP> avery@netcom.UUCP (Avery Colter) writes:
>In the self-teaching course I have here, scanf is the most often used
>input function. I don't see gets used much at all.
>
>And indeed, gets only seems of much advantage when you want to take in
>a whole line into one string.

IMHO gets() or getchar() is better for input because the programmer has greater control over what is being input. Specifically, if the programmer wants a float value and a character is input, gets() won't error on it; scanf() will. My argument goes mainly to bullet-proofing programs.

>Otherwise, scanf can take individual numbers and put them directly into
>numerical variables. With gets, you'd have to first manually parse the
>line, and then use strtol to translate them into numbers.

Generally, I'll use scanf() when reading from a file that a program has created. Then scanf() is superior.

>I didn't see puts used for printing strings to screen much either.
>printf was the function of choice.

puts()/fputs() is generally faster than printf() and should be used when possible. However, I will confess that printf() is much more commonly used. Perhaps it's because printf() will handle all cases and puts() will handle only the plain-string case.

--
-------------------------------------------------------------
Jay @ SAC-UNIX, Sacramento, Ca. UUCP=...pacbell!sactoh0!jak
If something is worth doing, it's worth doing correctly.
rjc@uk.ac.ed.cstr (Richard Caley) (11/12/90)
In article <VXogs2w163w@cybrspc> roy%cybrspc@cs.umn.edu (Roy M. Silvernail) writes:

    The problem with scanf() is that it can behave unpredictably when you
    give it badly formatted input. It's better, IMHO, to gets() a whole
    line, check its validity and _then_ sscanf() it into the target
    variables.

Maybe it was just a typo, but repeat after me

    `GETS is EVIL'

This has been an unpaid announcement by Paranoids Anonymous.

--
rjc@uk.ac.ed.cstr _O_ |<
zvs@bby.oz.au (Zev Sero) (11/12/90)
Roy = roy%cybrspc@cs.umn.edu (Roy M. Silvernail)
Avery = avery@netcom.UUCP (Avery Colter)

Avery> In the self-teaching course I have here, scanf is the most often used
Avery> input function. I don't see gets used much at all.

Roy> The problem with scanf() is that it can behave unpredictably when you
Roy> give it badly formatted input. It's better, IMHO, to gets() a whole
Roy> line, check its validity and _then_ sscanf() it into the target

When you are not absolutely, 100% sure that the input from stdin will be what the program expects (e.g. when stdin is coming from a user, or even from a file if you didn't generate it yourself), scanf() is a bad idea, for the reason Roy mentioned. When stdin is a terminal (as it is in almost all cases), you must expect the user to type absolutely anything that pops into its putative brain, or simply to lean on the keyboard and give you a nice random string!

But for exactly the same reason, you should never, never, never use gets(). The gets() function does not check how many characters it reads. It just keeps going until it sees a newline. If the array you're storing the thing in overflows, tough bikkies. H&S warn against the use of gets() for this reason, but I was flabbergasted to see a textbook published by Microsoft which consistently used gets() for input.

The safe way to read input from a user is to use fgets() and sscanf().

    char buf[1000];
    int i;

    if (!fgets (buf, sizeof buf, stdin)) {
        [complain]
    }
    i = sscanf (buf, [whatever]);
    if (i != [the right number]) {
        [complain]
    }

Only use scanf() and/or gets() when you are sure that the program is only ever called as a pipe from another program which you know will not produce any surprises, or with stdin coming from a file generated by such a program.

---
Zev Sero - zvs@bby.oz.au
If a compiler emits correct code purely by divine guidance and has no memory at all, it can still be a C compiler. - Chris Torek
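Sketch (not from the thread): a compilable version of the fgets()/sscanf() pattern described in the post above, under the assumption that each input line should hold exactly two integers. The helper name `read_two_ints` and its return codes are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: read one line from fp and parse two integers.
 * Returns 1 on success, 0 if the line is malformed,
 * and -1 on end of file or read error. */
int read_two_ints(FILE *fp, int *a, int *b)
{
    char buf[1000];

    if (fgets(buf, sizeof buf, fp) == NULL)
        return -1;                      /* nothing could be read */
    if (sscanf(buf, "%d %d", a, b) != 2)
        return 0;                       /* fewer than two conversions */
    return 1;
}
```

Because the bad line has already been consumed by fgets(), a caller that gets 0 back can print a complaint and simply ask again; nothing is left behind in the stream.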
imp@marvin.Solbourne.COM (Warner Losh) (11/12/90)
In article <VXogs2w163w@cybrspc> roy%cybrspc@cs.umn.edu (Roy M. Silvernail) writes: >It's better, IMHO, to gets() a whole line, check its validity and _then_ sscanf() True. However, I'd use fgets(). See below. >It just makes things more bullet-resistant. gets() is a bad function to use when you don't have total control over the input (like a user typing at a program). Since it can't check to see if the input line is too large for the buffer, "bad things" can happen as a result. One vector of the Internet Worm/Virus/Whatever used the fact that the finger daemon used gets and was running as root to cause some trouble.... Warner -- Warner Losh imp@Solbourne.COM How does someone declare moral bankruptcy?
chris@mimsy.umd.edu (Chris Torek) (11/12/90)
Whenever you deal with (significant pause, change of voice tone) users (shudder) ( :-) ) you must think of all the things that could possibly go wrong. It is impossible to make any system completely foolproof---fools are too ingenious---but it is usually not too hard to make a system fool-resistant.

Consider the following three ways to read and print a series of integers:

    way_the_first() {
        int i;

        while (scanf("%d", &i) != EOF)
            printf("I got %d.\n", i);
    }

    way_the_second() {
        int i;
        char inbuf[10];

        while (gets(inbuf))
            printf("I got %d.\n", atoi(inbuf));
    }

    way_the_third() {
        int i;
        char inbuf[10];

        while (fgets(inbuf, sizeof inbuf, stdin))
            printf("I got %d.\n", atoi(inbuf));
    }

The first is susceptible to a number of problems. Foremost, however, is the user typing the wrong thing. If the user enters the word `one' instead of the digit `1', the loop runs forever, because the letter `o' is not a digit and the scanf LEAVES IT BEHIND in the input stream. Each call to scanf() finds another `o' and then puts it back.

The second is susceptible to a different kind of problem. If a user types in `supercalifragilisticexpialidocious', your program does something completely unpredictable, because you have asked the computer to shove 35 characters (34 plus '\0') into a 10 character buffer. It Just Ain't Gonna Work.

The third, while imperfect, is the best of the three. There is nothing the user can type in (other than special system functions that, say, trap into a debugger) that will cause the program to run wild.

This is why we (`we' == comp.lang.c posters who have seen it before) recommend using fgets to read input. Scanf and gets are both rather fragile; fgets is not. (Once you have some input, you can pick it apart however you like, including via sscanf.)

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
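Sketch (not from the thread): one possible way to tighten the third approach, assuming the imperfection that matters is that fgets() silently splits an over-long line across several calls. The helper `read_line` and its return codes are hypothetical; it reports a line that did not fit and discards the excess so the next call starts on a fresh line.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf (capacity n).
 * Returns 1 if the whole line fit (the newline is stripped),
 * 0 if the line was too long (buf holds the first n-1 characters and
 * the rest of the line is discarded), -1 on end of file or error. */
int read_line(FILE *fp, char *buf, size_t n)
{
    size_t len;
    int c;

    if (fgets(buf, (int)n, fp) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* complete line: drop the newline */
        return 1;
    }

    /* No newline stored: the text may still have fit exactly,
     * or the input may have ended without a final newline. */
    c = getc(fp);
    if (c == EOF || c == '\n')
        return 1;

    /* Otherwise the line was truncated: skip the remainder. */
    while (c != '\n' && c != EOF)
        c = getc(fp);
    return 0;
}
```

With a 10-character buffer, the 34-letter word from the post above yields a 0 return and a clean stream, instead of three or four partial "lines".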
kimcm@diku.dk (Kim Christian Madsen) (11/12/90)
rjc@uk.ac.ed.cstr (Richard Caley) writes:
>In article <VXogs2w163w@cybrspc> roy%cybrspc@cs.umn.edu (Roy M. Silvernail) writes:
> The problem with scanf() is that it can behave unpredictably when you
> give it badly formatted input. It's better, IMHO, to gets() a whole
> line, check its validity and _then_ sscanf() it into the target
> variables.
>Maybe it was just a typo, but repeat after me
> `GETS is EVIL'
>This has been an unpaid announcement by Paranoids Anonymous.

gets can get you into a lot of trouble if used in an uncontrolled manner, e.g. for user input. Then you are better off using fgets or reading char-by-char with getchar() or family. But using scanf() for user input is asking for trouble!

Kim Chr. Madsen
gwyn@smoke.brl.mil (Doug Gwyn) (11/12/90)
In article <4300@sactoh0.SAC.CA.US> jak@sactoh0.SAC.CA.US (Jay A. Konigsberg) writes: >IMHO gets() or getchar() is better for input because the programmer >has greater control over what is being input. Specifically, if the >programmer wants a float value and a character is input gets() won't >error on it, scanf() will. My argument goes mainly to bullet-proofing >programs. If you really are concerned about bulletproofing, don't use gets() unless you have control over the length of the lines being scanned. Use fgets() instead, if the input comes from some uncontrolled source (like a human).
sarima@tdatirv.UUCP (Stanley Friesen) (11/13/90)
In article <16582@netcom.UUCP> avery@netcom.UUCP (Avery Colter) writes:
>In the self-teaching course I have here, scanf is the most often used
>input function. I don't see gets used much at all.

Sounds like a poorly designed course. Using scanf for user input is very dangerous. Why? Because scanf keeps reading until its entire input list is fulfilled or EOF is reached. It treats NL as *white* *space*. Thus given the invocation:

    scanf("%d %d %d", &i, &j, &k);

a user can become very frustrated if he only types in two integers, followed by a NL (or RETURN). The computer will just *sit* there and do nothing. No error message about incomplete input, no prompt, no output, no nothing. And unless the poor user can intuit that the computer wants another number, he is stuck. Bleah.

>And indeed, gets only seems of much advantage when you want to take in
>a whole line into one string.

Or if you want to make sure that the computer can respond to the user after every input line.

>Otherwise, scanf can take individual numbers and put them directly into
>numerical variables. With gets, you'd have to first manually parse the
>line, and then use strtol to translate them into numbers.

Hardly, you just use sscanf on the string read by gets. Since sscanf treats end-of-string as EOF, this will not get stuck like scanf. Now, if sscanf returns an input count lower than expected you can print a "Usage:" message to the user explaining clearly that you need that third number. Voila, much less frustration, and a more friendly, conversational program.

>I didn't see puts used for printing strings to screen much either.
>printf was the function of choice.

Agreed here. There is little reason to use puts unless the output string is already formatted.

--
---------------
uunet!tdatirv!sarima (Stanley Friesen)
rob@b15.INGR.COM (Rob Lemley) (11/13/90)
In <1990Nov12.050450.7194@Solbourne.COM> imp@marvin.Solbourne.COM (Warner Losh) writes: >True. However, I'd use fgets(). See below. . . . >gets() is a bad function to use when you don't have total control over >the input (like a user typing at a program). Since it can't check to >see if the input line is too large for the buffer, "bad things" can >happen as a result. Another bad: Both gets() and fgets() (SysV R3) will blindly read in NULL chars (ascii zero's). Since gets() and fgets() return no info about the number of chars read (unless you use ftell() maybe?), you might throw away a whole or partial line of input (and never know about it!). Rob -- Rob Lemley System Consultant, Scanning Software, Intergraph, Huntsville, AL rcl@b15.ingr.com OR ...!uunet!ingr!b15!rob 205-730-1546
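Sketch (not from the thread): if embedded NUL characters matter, one option is a getc() loop that returns the number of bytes stored, which fgets() does not report. The helper `read_counted` is hypothetical and only illustrates the idea.

```c
#include <stdio.h>

/* Hypothetical helper: read at most n-1 bytes of one line from fp into buf
 * and terminate it with '\0'. Returns the number of bytes stored, counting
 * any embedded NUL bytes and the newline, or -1 on end of file before any
 * byte was read. */
long read_counted(FILE *fp, char *buf, size_t n)
{
    size_t len = 0;
    int c = 0;                  /* int so that EOF is distinguishable */

    if (n == 0)
        return -1;
    while (len + 1 < n && (c = getc(fp)) != EOF) {
        buf[len++] = (char)c;
        if (c == '\n')
            break;              /* end of line reached */
    }
    if (len == 0 && c == EOF)
        return -1;
    buf[len] = '\0';
    return (long)len;
}
```

Comparing the returned count with strlen(buf) shows whether the line contained a NUL: the two differ exactly when part of the line would otherwise be hidden.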
roy%cybrspc@cs.umn.edu (Roy M. Silvernail) (11/13/90)
imp@marvin.Solbourne.COM (Warner Losh) writes: > gets() is a bad function to use when you don't have total control over > the input (like a user typing at a program). Since it can't check to > see if the input line is too large for the buffer, "bad things" can > happen as a result. Thank you! I hadn't thought of this possibility. Anything I can do to make my stuff more fool-resistant... (in anticipation of the new-model-year improved fools ;-) -- Roy M. Silvernail |+| roy%cybrspc@cs.umn.edu |+| #define opinions ALL_MINE; main(){float x=1;x=x/50;printf("It's only $%.2f, but it's my $%.2f!\n",x,x);} "This is cyberspace." -- Peter da Silva :--: "...and I like it here!" -- me
hilfingr@rama.cs.cornell.edu (Paul N. Hilfinger) (11/13/90)
I have been following this discussion with some interest, but I am still a little puzzled about a few things.

1. Chris Torek displayed the following code to illustrate why scanf is "fragile"

> way_the_first() {
>     int i;
>     while (scanf("%d", &i) != EOF)
>         printf("I got %d.\n", i);
> }

and said that the foremost problem is that "if the user enters the word `one' instead of the digit `1', the loop runs forever, because the letter `o' is not a digit and the scanf LEAVES IT BEHIND in the input stream."

Can't argue with that, but are we criticizing the best example of the use of scanf? What are everyone's comments on the following?

    way_the_first_and_a_half() {
        for (;;) { /* Please no flaming about how to do infinite loops */
            int i;
            int r = scanf("%d", &i);

            if (r == EOF)
                break;
            if (r == 1)
                printf("I got %d.\n", i);
            else
                (void) getchar();
        }
    }

I know of one obvious problem. For the illegal input `--1', scanf reads the first `-', finds an error, and then getchar reads and throws away the second `-'. This could be corrected by using something more elaborate than getchar() for error correction. On the other hand, let's say that my goal is just to produce code that detects errors and recovers from them adequately (in particular, without blowing up), even if its choice of recovery is not always perfect.

2. Several contributors have suggested the use of sscanf after using fgets. This has problems, since sscanf won't tell you where in its input string it stopped reading. Fortunately, there are strtod, strtol, etc., but they still leave the problem that the newline character is not just whitespace when using fgets. One must make annoying provisions for ends of lines that are not necessary when input is treated as a continuous stream of characters.

Do any of you have nice ways of dealing with these problems? Thanks for your help.

Paul Hilfinger
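Sketch (not from the thread): one way to handle the newline question in point 2, assuming a line should contain a single integer and nothing else. strtol() reports where it stopped through its end pointer, and any trailing white space, including the '\n' left by fgets(), can then be skipped by hand. The helper `parse_int` is hypothetical.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert the whole of s to an int.
 * Leading white space is skipped by strtol(); trailing white space
 * (such as the newline kept by fgets) is skipped here.
 * Returns 1 on success, 0 on any error. */
int parse_int(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)
        return 0;                       /* no digits at all, e.g. "--1" */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* value does not fit in an int */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0;                       /* trailing junk, e.g. "12abc" */
    *out = (int)v;
    return 1;
}
```

For several numbers on one line, the same end pointer can be passed back to strtol() as the next starting position.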
rjc@uk.ac.ed.cstr (Richard Caley) (11/13/90)
In article <1990Nov12.112032.22979@diku.dk> kimcm@diku.dk (Kim Christian Madsen) writes:

    rjc@uk.ac.ed.cstr (Richard Caley) writes:
    >In article <VXogs2w163w@cybrspc> roy%cybrspc@cs.umn.edu (Roy M. Silvernail) writes:
    > The problem with scanf() is that it can behave unpredictably when you
    > give it badly formatted input. It's better, IMHO, to gets() a whole
    > line, check its validity and _then_ sscanf() it into the target
    > variables.
    >Maybe it was just a typo, but repeat after me
    > `GETS is EVIL'
    >This has been an unpaid announcement by Paranoids Anonymous.

    gets can get you into a lot of trouble if used in an uncontrolled
    manner, e.g. for user input. Then you are better off using fgets or
    reading char-by-char with getchar() or family. But using scanf() for
    user input is asking for trouble!

Sorry for being confusing, I wasn't defending scanf, I was just pointing out the fact that gets is never useful.

--
rjc@uk.ac.ed.cstr _O_ |<
henry@zoo.toronto.edu (Henry Spencer) (11/14/90)
In article <16582@netcom.UUCP> avery@netcom.UUCP (Avery Colter) writes: >In the self-teaching course I have here, scanf is the most often used >input function. I don't see gets used much at all. It is not surprising that an introductory course will focus on doing things the easy way rather than the better but more complex way, for the sake of not confusing beginners. As others have discussed at length, the problem with scanf is a poor and inflexible design that gives you little control over the situation when unexpected input is encountered. Pulling in a line with fgets (not gets!) and then picking it apart with sscanf makes clean error recovery much easier. >I didn't see puts used for printing strings to screen much either. >printf was the function of choice. People frequently draw this analogy, but it is false and misleading. Printf works very well for output because its inputs are C data, very tightly constrained by the language and the machine, and the free-form version is what it is *generating*. The situation is not symmetrical; scanf is faced with a very different and much harder problem. -- "I don't *want* to be normal!" | Henry Spencer at U of Toronto Zoology "Not to worry." | henry@zoo.toronto.edu utzoo!henry
karl@ima.isc.com (Karl Heuer) (11/14/90)
In article <48257@cornell.UUCP> hilfingr@cs.cornell.edu (Paul N. Hilfinger) writes: >2. Several contributors have suggested the use of sscanf after using >fgets. This has problems, since sscanf won't tell you where in its >input string it stopped reading. Fixed in ANSI C, via the `%n' specifier. Karl W. Z. Heuer (karl@ima.isc.com or uunet!ima!karl), The Walking Lint
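Sketch (not from the thread): a small illustration of the `%n` directive mentioned above, which stores the number of characters consumed so far and does not count toward sscanf()'s return value. The helper `scan_int` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse one int from the start of s.
 * On success returns 1 and stores in *used how many characters
 * sscanf consumed (leading white space included); otherwise returns 0. */
int scan_int(const char *s, int *value, int *used)
{
    int n = 0;

    /* %n is not counted in the return value, so success is still 1. */
    if (sscanf(s, "%d%n", value, &n) != 1)
        return 0;
    *used = n;
    return 1;
}
```

Calling it again on `s + used` continues from where the previous conversion stopped, which is the information the quoted post says plain sscanf() does not give.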
aduncan@rhea.trl.oz (Allan Duncan) (11/15/90)
From article <VXogs2w163w@cybrspc>, by roy%cybrspc@cs.umn.edu (Roy M. Silvernail):
> The problem with scanf() is that it can behave unpredictably when you
> give it badly formatted input. It's better, IMHO, to gets() a whole
> line, check its validity and _then_ sscanf() it into the target
> variables. (no need for strtol() or similar, since sscanf() looks at the
> validated string just as scanf() would have looked at the original
> input) It just makes things more bullet-resistant.
                                   ^^^^^^^^^^^^^^^^

I hope you are really using fgets(..., stdin) rather than gets(...) - there are a lot of _system_ things out there that can be broken by just keeping on typing till the buffer is overflowed!

Allan Duncan
ACSnet a.duncan@trl.oz (03) 541 6708
ARPA a.duncan%trl.oz.au@uunet.uu.net
UUCP {uunet,hplabs,ukc}!munnari!trl.oz!a.duncan
Telecom Research Labs, PO Box 249, Clayton, Victoria, 3168, Australia.
jon@jonlab.UUCP (Jon H. LaBadie) (11/16/90)
In article <1990Nov12.014850.14475@melba.bby.oz.au>, zvs@bby.oz.au (Zev Sero) writes: > > But for exactly the same reason, you should never, never, never use > gets(). The gets() function does not check how many characters it > reads. It just keeps going until it sees a newline. If the array > you're storing the thing in overflows, tough bikkies. This question is asked regarding input from terminals only. I've a vague recollection that declaring input arrays to be BUFSIZ in length provides some protection to overflow by gets(3C). Is this just "conventional wisdom", or does something in the choice of BUFSIZ for a particular system ensure any overflow protection? Jon -- Jon LaBadie {att, princeton, bcr, attmail!auxnj}!jonlab!jon
gwyn@smoke.brl.mil (Doug Gwyn) (11/16/90)
In article <879@jonlab.UUCP> jon@jonlab.UUCP (Jon H. LaBadie) writes: >Is this just "conventional wisdom", or does something in the choice >of BUFSIZ for a particular system ensure any overflow protection? gets() will input arbitrarily long lines. The only thing really special about BUFSIZ in this regard is that many UNIX text editors do not support lines longer than that, so text files containing longer lines are rarely encountered (but not impossible).
henry@zoo.toronto.edu (Henry Spencer) (11/17/90)
In article <879@jonlab.UUCP> jon@jonlab.UUCP (Jon H. LaBadie) writes: >I've a vague recollection that declaring input arrays to be BUFSIZ >in length provides some protection to overflow by gets(3C). Nope. Except insofar as making the arrays longer reduces the probability of somebody overflowing them. There is no magic associated with BUFSIZ. -- "I don't *want* to be normal!" | Henry Spencer at U of Toronto Zoology "Not to worry." | henry@zoo.toronto.edu utzoo!henry
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/20/90)
In article <879@jonlab.UUCP> jon@jonlab.UUCP (Jon H. LaBadie) wrote
> I've a vague recollection that declaring input arrays to be BUFSIZ
> in length provides some protection to overflow by gets(3C).

In article <1990Nov16.165203.18786@zoo.toronto.edu>, henry@zoo.toronto.edu (Henry Spencer) replied
: Nope. Except insofar as making the arrays longer reduces the probability
: of somebody overflowing them. There is no magic associated with BUFSIZ.

The original question asked specifically about input from terminals. Some operating systems (UNIX, VMS, OS/2, others) place a limit on the number of characters in a line entered at a keyboard. In OS/2 it's 255. The POSIX standard defines a parameter, I think it's MAXCANON or something like that. The limit has typically been 255, but there's no reason it couldn't be more. Since each read() from the keyboard is going to be stored in a stdio buffer, BUFSIZ had better be at least as large as this limit, so declaring your arrays that big should be enough to handle terminal input.

Except... gets() will keep on reading from stdin until it hits a \n or an EOF. Lines entered from a keyboard _normally_ end with a \n, but they don't have to. Let <EOF> represent your end-of-file character on a UNIX system and let <junk70> represent 70 printing characters. Then

    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><EOF>
    <junk70><RET>

will result in gets() seeing a line with 701 characters, refilling the stdio buffer several times. (I've tried this. It works.) Since VMS returns a record to the caller when you hit <RET> _or_ a function key, I imagine that it might be possible to play a similar trick in VMS.

So the answer is, for your own private use, yes you can get away with using BUFSIZ as a limit for keyboard input, but don't you dare do that in a program you sell to customers.

--
I am not now and never have been a member of Mensa. -- Ariadne.
epames@eos.ericsson.se (Michael Salmon) (11/20/90)
In article <4319@goanna.cs.rmit.oz.au> Richard A. O'Keefe writes: >gets() will keep on reading from stdin until it hits a \n or an EOF. >Lines entered from a keyboard _normally_ end with a \n, but they don't >have to. Let <EOF> represent your end-of-file character on a UNIX system This is getting a long way from the original point but I thought I should respond to this statement. In C EOF is not a character, its value is such that it can *NEVER* be present in a file. In all the C implementations that I have ever seen EOF has been represented by the value -1 but it can be any integer and one of the traps that I think everyone falls for at some time is to presume that getc() returns a character rather than an integer. EOF is never 255 or ^D, it is in fact a read() that returns 0 as the byte count. N.B. the least significant char of the usual representation of -1 is 255. Michael Salmon L.M.Ericsson Stockholm
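Sketch (not from the thread): the trap described above, shown as a byte counter that keeps the result of getc() in an int. The helper `count_bytes` is hypothetical. With a `char` variable, a data byte of value 255 could compare equal to EOF on implementations where char is signed, and the loop would stop early.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining in fp. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c;      /* int, not char: getc() returns every byte value plus EOF */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```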
richard@aiai.ed.ac.uk (Richard Tobin) (11/21/90)
In article <1990Nov20.123036.11103@ericsson.se> epames@eos.ericsson.se writes: >should respond to this statement. In C EOF is not a character, its You seem to have misunderstood Richard O'Keefe's point. By EOF, Richard meant the C #defined constant, normally -1. By <EOF>, he meant the key you press to send end-of-file (perhaps ^D), which is why he said: >>Let <EOF> represent your end-of-file character on a UNIX system When you type <EOF> after a <linefeed> (or another <EOF>) it results in read() returning zero, and getc() returning EOF. This is the "normal" use of the <EOF> key. When you type <EOF> at other times (eg after typing some letters), it causes the line to be made available for read()ing, just as <linefeed> does. However, there is no \n character (or ^D or whatever) appended to the data. Typing abc<EOF> results in read() returning 3 and getting the characters 'a', 'b', and 'c'. Thus you can type in something that's a line from the point of view of the tty driver, but doesn't end with '\n' and isn't a line from the point of view of fgets(), which does something like while(--count > 0 && (c = getc(file)) != EOF && c != '\n') completely ignoring the boundaries of data returned by read(). -- Richard -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/22/90)
In article <1990Nov20.123036.11103@ericsson.se>, epames@eos.ericsson.se (Michael Salmon) writes: : In article <4319@goanna.cs.rmit.oz.au> Richard A. O'Keefe writes: : >gets() will keep on reading from stdin until it hits a \n or an EOF. : >Lines entered from a keyboard _normally_ end with a \n, but they don't : >have to. Let <EOF> represent your end-of-file character on a UNIX system : This is getting a long way from the original point but I thought I : should respond to this statement. It would have been better to try understanding it first. I wrote Let <EOF> represent your end-of-file character on a UNIX system. People who have a UNIX system and don't know what I'm talking about should read the "stty" manual page. If they still don't understand, they should find someone who does understand and get an explanation. That or refrain from posting. The point is, of course, that the way you signal end-of-file from a keyboard in UNIX is by typing a particular character. Q: _Which_ character? A: _You_ get to pick. Some people use End-of-Transmission (^D). Some people like ^Z. I've seen ^Y used. : In C EOF is not a character, its : value is such that it can *NEVER* be present in a file. So flipping what? I said nothing whatsoever about C's EOF macro. I was talking about the key you type at the keyboard. This is totally independent of C's EOF macro and the two values have nothing in common. What I was doing was exhibiting a method of tricking gets() into reading an arbitrary number of characters from the keyboard as one "line"; that method relies on the malicious typist typing his <EOF> character from time to time, whichever character he has selected for that purpose. -- I am not now and never have been a member of Mensa. -- Ariadne.
epames@eos.ericsson.se (Michael Salmon) (11/22/90)
In article <3797@skye.ed.ac.uk> richard@aiai.UUCP (Richard Tobin) writes: >You seem to have misunderstood Richard O'Keefe's point. By EOF, >Richard meant the C #defined constant, normally -1. By <EOF>, he >meant the key you press to send end-of-file (perhaps ^D), which is why >he said: > >>>Let <EOF> represent your end-of-file character on a UNIX system ^D is *NOT* an eof character, it is a command to the tty driver to send the contents of the input buffer, the same as your ERASE etc. characters are special commands to the tty driver. By typing ^D when there are no characters in the input buffer you are sending 0 characters which is the end of file condition. Michael Salmon L.M.Ericsson Stockholm
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/23/90)
I wrote Let <EOF> represent your end-of-file character on a UNIX system In article <1990Nov22.071319.3222@ericsson.se>, epames@eos.ericsson.se (Michael Salmon) writes: > ^D is *NOT* an eof character, it is a command to the tty driver ... With the utmost possible respect, may I suggest that since the context was "UNIX system"s, we take the UNIX manuals as authoritative? From "man 7 termio": Normally, terminal input is processed in units of lines. A line is delimited by a new-line (ASCII LF) character, an end-of-file (ASCII EOT) character, or an end-of-line character. The SVID release 2 has the same text, and speaks of The ERASE, KILL, and EOF characters ... So when I wrote of an "end-of-file character" I was using *precisely* the terminology blessed by the SVID, which nowhere calls it a "command". -- I am not now and never have been a member of Mensa. -- Ariadne.
epames@eos.ericsson.se (Michael Salmon) (11/23/90)
In article <4354@goanna.cs.rmit.oz.au> Richard A. O'Keefe writes: >I wrote > Let <EOF> represent your end-of-file character on a UNIX system > .... > >The SVID release 2 has the same text, and speaks of > The ERASE, KILL, and EOF characters ... > >So when I wrote of an "end-of-file character" I was using *precisely* >the terminology blessed by the SVID, which nowhere calls it a "command". I agree that that is what the manual says and I think it is unfortunate as it doesn't mean end of file as defined by gets() etc. I quote below from SunOS man page for termio. EOF (CTRL-D or ASCII EOT) may be used to generate an end-of-file from a terminal. When received, all the characters waiting to be read are immediately passed to the program, without waiting for a NEW- LINE, and the EOF is discarded. Thus, if there are no characters waiting, which is to say the EOF occurred at the beginning of a line, zero charac- ters will be passed back, which is the standard end-of-file indication. Strictly my own opinions. Michael Salmon L.M.Ericsson Stockholm
gwyn@smoke.brl.mil (Doug Gwyn) (11/24/90)
In article <1990Nov22.071319.3222@ericsson.se> epames@eos.ericsson.se writes:
>^D is *NOT* an eof character, it is a command to the tty driver to send
>the contents of the input buffer, ...

The character that in "cooked" mode is interpreted by the terminal driver to act as an invisible line delimiter is normally called the EOF character in UNIX user documentation. While most people map ^D to this control function, the choice is programmable.

In any event, the (cooked mode) input sequence

    A B <EOF> C D <NL> E F <EOF> <EOF> G H <NL> <EOF> I J

(where <EOF> is often ^D and <NL> is often ^M) results in four "packets" being inserted into the terminal input queue:

    A B
    C D \n
    E F
    <empty>
    G H \n
    <empty>

with the two characters I and J still buffered for canonical (erase/kill) processing.

The first subsequent read() on the terminal (assuming that several characters are requested for the read count) will return the two characters:

    A B

The second such read() will return the three characters:

    C D \n

The third such read() will return:

    E F

The fourth read() will return:

    <empty>

The fifth read() will return:

    G H \n

The sixth read() will return:

    <empty>

(That is the only use for the EOF character that most UNIX users are aware of.)

The seventh read() will block until a line delimiter is input.

The standard I/O functions need to be prepared to deal with this behavior, generally by having input operations loop until enough data is obtained to satisfy the implementation request (i.e. up to a \n for gets() or until the requested count is satisfied for fread()). While doing this, a read() that returns 0 characters is conventionally interpreted as an "end of file" indication. While most applications will not read past an EOF indication, on stream-like input channels such as terminals this might be a reasonable thing to do under some circumstances.

Anyway, this is the intended UNIX behavior.
There are undoubtedly variations even among UNIX implementations, and other operating systems may have significantly different terminal input support.
richard@aiai.ed.ac.uk (Richard Tobin) (11/27/90)
In article <1990Nov22.071319.3222@ericsson.se> epames@eos.ericsson.se writes: >>>>Let <EOF> represent your end-of-file character on a UNIX system >^D is *NOT* an eof character, it is a command to the tty driver to send >the contents of the input buffer, I thought I had made it quite clear what happens when you type ^D. Are you trying to make a substantial point here, or are you just quibbling about the term "end-of-file character"? When Richard O'Keefe says "end-of-file character" he means "the character you type when you want to cause the program to see an end-of-file condition". Just like "erase character" (a term you used yourself) means "the character you press when you want to erase a character", and "suspend character" means "the character you press when you want to suspend your program". If your point is that the behaviours after newline and after other characters are really the same - ie send the waiting characters, of which there may be a zero or non-zero number - then yes, that's true, but it's normally more useful to distinguish these cases. -- Richard -- Richard Tobin, JANET: R.Tobin@uk.ac.ed AI Applications Institute, ARPA: R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk Edinburgh University. UUCP: ...!ukc!ed.ac.uk!R.Tobin
epames@eos.ericsson.se (Michael Salmon) (11/27/90)
In article <3819@skye.ed.ac.uk> Richard Tobin writes: >In article <1990Nov22.071319.3222@ericsson.se> epames@eos.ericsson.se writes: > >>>>>Let <EOF> represent your end-of-file character on a UNIX system > >>^D is *NOT* an eof character, it is a command to the tty driver to send >>the contents of the input buffer, > >I thought I had made it quite clear what happens when you type ^D. > >Are you trying to make a substantial point here, or are you just >quibbling about the term "end-of-file character"? I think that the substantial point is that there is no "end-of-file character". End of file is a read() of zero characters, when reading from a terminal this can be achieved by typing ^D (usually) with a blank line. The end of file indication requires both conditions. Getting back to gets(), it behaved exactly as I expected it would and as the manuals say it should. Solely the opinion of Michael Salmon L.M.Ericsson Stockholm
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (11/28/90)
>Let <EOF> represent your end-of-file character on a UNIX system In article <1990Nov27.110005.7203@ericsson.se>, epames@eos.ericsson.se (Michael Salmon) writes: > I think that the substantial point is that there is no "end-of-file > character". According to the UNIX manuals, there *IS*. The end of file character is a character you type on the keyboard. Nobody has ever claimed that read() or gets() or getchar() *return* this character to the caller, or that they themselves ever see it. Never mind whether the name is confusing, that *IS* the name used in the UNIX manuals. To those who understood the point the first time, sorry to have troubled you. To anyone who still thinks 'that there is no "end-of-file character"' on a UNIX system, do us all a favour, *read* *the* *fine* *manuals*. -- I am not now and never have been a member of Mensa. -- Ariadne.