[comp.lang.c] Possible scanf bug?

cs163wcr@sdcc10.ucsd.edu (C Code. C Code run.) (01/30/91)

Is this a bug, or am I just using scanf wrong?  I'm trying to
make scanf read a line that's ended with newline.

char buffer[100];	/* Line won't be over 80, but so what */
scanf ("%[^\n]%*c",buffer);

If it reads an empty line, buffer isn't changed at all!  It should
make buffer the null string!  Right?

Steve Boswell
whatis@ucsd.edu

scjones@thor.UUCP (Larry Jones) (01/31/91)

In article <16134@sdcc6.ucsd.edu>, cs163wcr@sdcc10.ucsd.edu (C Code. C Code run.) writes:
> char buffer[100];	/* Line won't be over 80, but so what */
> scanf ("%[^\n]%*c",buffer);
> 
> If it reads an empty line, buffer isn't changed at all!  It should
> make buffer the null string!  Right?

Wrong.  The %[ format specifier requires at least one character to
match in order to be considered successful.  When you try to read
an empty line with it, it's just like trying to read "a" with a %d.
If you had checked the return value from scanf, it should have returned
0 (indicating that there were NO items read) as opposed to the usual
(for your format) 2.  You should probably be using gets (which suffers
from the same problem as your scanf -- there's no check to make sure
you don't overflow the buffer) or fgets instead.
----
Larry Jones, SDRC, 2000 Eastman Dr., Milford, OH  45150-2789  513-576-2070
Domain: scjones@thor.UUCP  Path: uunet!sdrc!thor!scjones
My brain is trying to kill me. -- Calvin

rjohnson@shell.com (Roy Johnson) (02/01/91)

In article <146@thor.UUCP> scjones@thor.UUCP (Larry Jones) writes:
>In article <16134@sdcc6.ucsd.edu>, cs163wcr@sdcc10.ucsd.edu (C Code. C Code run.) writes:
>> char buffer[100];	/* Line won't be over 80, but so what */
>> scanf ("%[^\n]%*c",buffer);
>> 
>> If it reads an empty line, buffer isn't changed at all!  It should
>> make buffer the null string!  Right?

>Wrong.  The %[ format specifier requires at least one character to
>match in order to be considered successful.  When you try to read
>an empty line with it, it's just like trying to read "a" with a %d.
>If you had checked the return value from scanf, it should have returned
>0 (indicating that there were NO items read) as opposed to the usual
>(for your format) 2.  You should probably be using gets (which suffers
------------------^^^
>from the same problem as your scanf -- there's no check to make sure
>you don't overflow the buffer) or fgets instead.

Except that the usual return value for that format would be 1, not 2: scanf
returns the number of variables assigned, not the number of format directives
processed, and %*c assigns nothing.

--
======= !{sun,psuvax1,bcm,rice,decwrl,cs.utexas.edu}!shell!rjohnson =======
"If he exploded, all of Manhattan would be talking in high, squeaky voices
for months!"  "Cool." -- When I Was Short
Roy Johnson, Shell Development Company

greywolf@unisoft.UUCP (The Grey Wolf) (02/06/91)

In article <RJOHNSON.91Jan31110927@olorin.shell.com> rjohnson@shell.com (Roy Johnson) writes:
>In article <146@thor.UUCP> scjones@thor.UUCP (Larry Jones) writes:
	[ scanf not reading what the user wants, said to return zero ]
>Except that it would return 1, for number of variables assigned, not
>number of formats read.

From BSD 4.3 UNIX, scanf(3S) manual page:

    "The _s_c_a_n_f functions return the number of successfully
     matched and assigned input items[*].  This can be used to decide how many
     ^^^^^^^^^^^^^^^^^^^^ [implies both]
     input items were found.  The constant EOF is returned upon end of input;
     note that this is different from 0, which means that no conversion was
     done; if conversion was intended, it was frustrated by an inappropriate
     character in the input."

The Pyramid OSx (BSD) man page reads likewise.

[*] The SunOS 3.5 and 4.0.3 manuals and the Pyramid OSx (AT&T) man page read
similarly, but with the stipulation inserted here:
    "...this number can be zero in the event of an early conflict between
     an input character and the control string."

Of course, how it actually works in *practice* may vary from system to
system (depending upon how braindead your implementation of the stdio
functions is).
-- 
thought:  I ain't so damb dumn!	| Your brand new kernel just dump core on you
war: Invalid argument		| And fsck can't find root inode 2
				| Don't worry -- be happy...
...!{ucbvax,acad,uunet,amdahl,pyramid}!unisoft!greywolf

kpv@ulysses.att.com (Phong Vo[drew]) (02/08/91)

In article <3343@unisoft.UUCP>, greywolf@unisoft.UUCP (The Grey Wolf) writes:
: In article <RJOHNSON.91Jan31110927@olorin.shell.com> rjohnson@shell.com (Roy Johnson) writes:
: >In article <146@thor.UUCP> scjones@thor.UUCP (Larry Jones) writes:
: 	[ scanf not reading what the user wants, said to return zero ]
: 
: From BSD 4.3 UNIX, scanf(3S) manual page:
: 
:     "The _s_c_a_n_f functions return the number of successfully
:      matched and assigned input items[*].  This can be used to decide how many
:      ^^^^^^^^^^^^^^^^^^^^ [implies both]
What is unclear is the meaning of "matched". Since the empty string "" is a
legitimate value, it is arguable that string-matching conversions should always
succeed except at EOF. It can therefore be argued that
sscanf("1,,3","%[^,],%[^,],%[^,]",s1,s2,s3) should return three matches, with
s2 containing the null string. The situation is different for numeric
conversions, where there are no "legitimate" undefined numbers.

kpv@ulysses.att.com (Phong Vo[drew]) (02/11/91)

In article <154@thor.UUCP>, scjones@thor.UUCP (Larry Jones) writes:
: In article <14285@ulysses.att.com>, kpv@ulysses.att.com (Phong Vo[drew]) writes:
: > What is unclear is the meaning of "matched". Since the "" string is legitimate,
: > it is arguable that string matching patterns should always satisfy except for eof.
: > Therefore, it can be argued that sscanf("1,,3","%[^,],%[^,],%[^,]",s1,s2,s3)
: > should return three matches where s2 will contain the null string.
: 
: The ANSI standard makes it clear that %[ matches one or more characters,
: not zero or more.  The descriptions of the printf and scanf functions
: have always been terse and not very precise or informative.  Some of the
: X3J11 members went to great pains to ensure that the ANSI descriptions
: were complete and accurate.
: ----
If this was the choice made by ANSI, then it is a bad choice.
It makes it impossible to write a simple scanf expression
to parse a record delimited by some character, for example
a line from the (UNIXy) /etc/passwd file, where some of the
fields may be empty. With the alleged ANSI requirement,
you must now do it in a more complex loop. Aesthetically,
the requirement is also unattractive because it breaks the
invertibility of pairs of functions such as:
	printf("%s:%s:%s\n",string1,string2,string3);
	scanf("%[^:]:%[^:]:%[^:]\n",string1,string2,string3);
where string1, string2, and string3 range over the domain of strings without
embedded :'s.