[comp.std.c] Explain this sscanf behavior.

mason@tc.fluke.COM (Nick Mason) (07/07/90)

What should sscanf do with the following?  Can anyone who
has the ANSI standard shed some light on it?

I would like "hard" replies, not "I think it should ....".

Thanks in advance.

Given:

	char *buf="123";
	char *str="123x";

	int  a, b, x;

	b = -99;

	x = sscanf(str, "%d%n", &a, &b);

	printf("x=%d, a=%d, b=%d\n",x,a,b);

	x = sscanf(buf, "%d%n", &a, &b);

	printf("x=%d, a=%d, b=%d\n",x,a,b);


What is the CORRECT output according to the standard???

I tried this with 3 different compilers and got the following:

compiler A:

	x=1   a=123  b=3
	x=1   a=123  b=3

compiler B:

	x=1   a=123  b=3
	x=1   a=123  b=4  <-- yes 4.

compiler C:

	x=1  a=123  b=3
	x=1  a=123  b= -99



I'm confused????!!!!! Compiler C is "100% ANSI compatible".????


Nick.

daniels@tle.enet.dec.com (Bradford R. Daniels) (07/07/90)

> compiler A:

> 	x=1   a=123  b=3
> 	x=1   a=123  b=3

This is the correct result.

> compiler B:
> 	x=1   a=123  b=3
> 	x=1   a=123  b=4  <-- yes 4.

This is clearly wrong, since there aren't even 4 characters in the string...

> compiler C:
> 	x=1  a=123  b=3
> 	x=1  a=123  b= -99

This is also wrong, since %n does not consume any input, and so the
lack of input should not cause it to fail.

> I'm confused????!!!!! Compiler C is "100% ANSI compatible".????

ANSI compatible does not mean bug free.  This behavior is a bug, and
I think I see how it's happening...  In fact, I think I'll go make sure
my RTL doesn't have the same problem...

- Brad

-----------------------------------------------------------------
Brad Daniels			|  Digital Equipment Corp. almost
DEC Software Devo		|  definitely wouldn't approve of
"VAX C RTL Whipping Boy"	|  anything I say here...

kdq@demott.COM (Kevin D. Quitt) (07/08/90)

In article <1990Jul6.181830.2549@tc.fluke.COM> mason@tc.fluke.COM (Nick Mason) writes:
>
>I tried this with 3 different compilers and got the following:
>
>compiler A:
>
>	x=1   a=123  b=3
>	x=1   a=123  b=3
>
>compiler B:
>
>	x=1   a=123  b=3
>	x=1   a=123  b=4  <-- yes 4.
>
>compiler C:
>
>	x=1  a=123  b=3
>	x=1  a=123  b= -99
>

    I'm confused by example C: where did it get the old value of b? BTW,
Microsoft C and cc on Motorola Delta 3x00 agree with A, and gnu's gcc
coredumps. 

-- 
 _
Kevin D. Quitt         demott!kdq   kdq@demott.com
DeMott Electronics Co. 14707 Keswick St.   Van Nuys, CA 91405-1266
VOICE (818) 988-4975   FAX (818) 997-1190  MODEM (818) 997-4496 PEP last

                96.37% of all statistics are made up.

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/08/90)

In article <1990Jul6.181830.2549@tc.fluke.COM> mason@tc.fluke.COM (Nick Mason) writes:
>	int  a, b = -99, x;
>	x = sscanf("123x", "%d%n", &a, &b);
>	printf("x=%d, a=%d, b=%d\n",x,a,b);
>	x = sscanf("123", "%d%n", &a, &b);
>	printf("x=%d, a=%d, b=%d\n",x,a,b);
>What is the CORRECT output according to the standard???

You've uncovered an interesting feature:  Although the %n specifier
does not consume input, it can still have an "input failure" when
EOF was encountered during preceding conversions that matched non-
empty sequences.

>compiler C:
>	x=1  a=123  b=3
>	x=1  a=123  b= -99
>I'm confused????!!!!! Compiler C is "100% ANSI compatible".????

It might be; it did return the right answer in this particular case.

P.S.  This is not an official interpretation; if it bothers you,
please send an official request for interpretation to CBEMA X3.

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/08/90)

In article <376@demott.COM> kdq@demott.COM (Kevin D. Quitt) writes:
>    I'm confused by example C: where did it get the old value of b?

I think we have to assume that
	b = -99;
was also supposed to be performed before the second sscanf().
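
A minimal sketch of the test with that reset made explicit, assuming a
hosted C implementation.  The helper name scan_int and the SENTINEL macro
are illustrative and not from the original post; the value -99 is the one
Nick used.

```c
#include <stdio.h>

/* Sentinel taken from the original post; any value that %n could never
   store would do just as well. */
#define SENTINEL (-99)

/* Hypothetical helper: scans a decimal int and reports the count that
   %n stored.  *consumed is reset before every call, so a %n that
   stored nothing cannot be confused with a stale value left over from
   an earlier call. */
int scan_int(const char *s, int *value, int *consumed)
{
    *consumed = SENTINEL;
    return sscanf(s, "%d%n", value, consumed);
}
```

For "123x", compilers A, B and C in the thread all report a return of 1,
a=123 and b=3.  For "123", the stored count is either 3 or still the
sentinel, depending on which reading of the standard the library follows;
that is the point in dispute.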

arnej@solan11.solan.unit.no (Arne Henrik Juul) (07/09/90)

In article <376@demott.COM>, kdq@demott.COM (Kevin D. Quitt) writes:
|>..., and gnu's gcc
|>coredumps. 

That is, you get a coredump on sscanf(str,"%d%n",&a,&b).
This is indeed correct, since your sscanf() (like mine) probably writes
to the format string or the argument str, and it isn't allowed to do that.

Try compiling with -fwritable-strings.
GCC puts string constants in the non-writable text segment by default.

Of course sscanf is broken, but that's our tough luck...
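
One way to take the read-only literals out of the picture without a
compiler flag, sketched under the assumption that the crash really does
come from sscanf writing to its string arguments as described above.
The name scan_writable is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: the same scan as in the original post, but the
   input and the format live in writable arrays instead of string
   literals, so a library sscanf that scribbles on either one does not
   touch read-only memory. */
int scan_writable(int *value, int *consumed)
{
    char str[] = "123x";   /* array initialised from the literal: writable */
    char fmt[] = "%d%n";   /* the format is copied for the same reason */

    *consumed = -1;        /* sentinel: shows whether %n stored anything */
    return sscanf(str, fmt, value, consumed);
}
```

The arrays are initialised from the literals at run time, so each call
works on its own private copy.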

-- arnej@solan.unit.no -- juul@norunit.bitnet -- arnej@olga1.olga.unit.no --  
--                This disclaimer intentionally left blank                --

mason@tc.fluke.COM (Nick Mason) (07/09/90)

In article <376@demott.COM> kdq@demott.COM (Kevin D. Quitt) writes:
>In article <1990Jul6.181830.2549@tc.fluke.COM> mason@tc.fluke.COM (Nick Mason) writes:
>>
>>I tried this with 3 different compilers and got the following:
>>
>>compiler A:
>>
>>	x=1   a=123  b=3
>>	x=1   a=123  b=3
>>
>>compiler B:
>>
>>	x=1   a=123  b=3
>>	x=1   a=123  b=4  <-- yes 4.
>>
>>compiler C:
>>
>>	x=1  a=123  b=3
>>	x=1  a=123  b= -99
>>
>
>    I'm confused by example C: where did it get the old value of b? BTW,
>Microsoft C and cc on Motorola Delta 3x00 agree with A, and gnu's gcc
>coredumps. 
>
What version of Microsoft C did you use and what memory model??

Compiler A is SUN 4.0 cc,
Compiler B is MSC 5.1 large model, 
Compiler C is MSC 6.0 large model.

And I agree, gnu's gcc coredumps.

Nick.

inst182@tuvie (Inst.f.Techn.Informatik) (07/10/90)

To add a new variant, on a DECstation 3100 (Ultrix 2.1) both cc (1.31) and
gcc (1.37) give the following results:

x=2, a=123, b=3
x=1, a=123, b=3

(no coredumps)

karl@haddock.ima.isc.com (Karl Heuer) (07/12/90)

In article <1666@tuvie> inst182@tuvie.UUCP (Inst.f.Techn.Informatik) writes:
>To add a new variant, on a DECstation 3100 (Ultrix 2.1) both cc (1.31) and
>gcc (1.37) give the following results:
>	x=2, a=123, b=3
>	x=1, a=123, b=3

That first one has to be wrong; you can't get a return value of 2 when you
only have one input format specifier (%n doesn't count).
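
A small sketch of that counting rule.  The helper scan_pair and the input
"12 34" are made up for illustration and do not appear in the thread.

```c
#include <stdio.h>

/* Hypothetical helper: scans two ints with a %n between them and
   returns whatever sscanf returns.  Only the two %d conversions can
   add to that return value; the %n stores a count but is not counted
   itself. */
int scan_pair(const char *s, int *first, int *offset, int *second)
{
    *offset = -1;          /* sentinel: shows whether %n stored anything */
    return sscanf(s, "%d%n%d", first, offset, second);
}
```

On a conforming library, scan_pair("12 34", &a, &n, &b) returns 2 with
n set to 2, even though three pointers were passed.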

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint

kim@wacsvax.cs.uwa.OZ.AU (Kim Shearer) (07/13/90)

In <13168@shlump.nac.dec.com> daniels@tle.enet.dec.com (Bradford R. Daniels) writes:

>> compiler A:

>> 	x=1   a=123  b=3
>> 	x=1   a=123  b=3

>This is the correct result.

>> compiler B:
>> 	x=1   a=123  b=3
>> 	x=1   a=123  b=4  <-- yes 4.

>This is clearly wrong, since there aren't even 4 characters in the string...

 Why?  The value of b is clearly invalid, as only one
 argument was consumed.  Nothing was read into b, so
 how can you say it was incorrectly assigned?

+--------------------------------+--------------------------------------------+
Kim Shearer                      |     ARPA: kim%wacsvax.uwa.oz@uunet.uu.net
Dept. of Computer Science        |     UUCP: ..!uunet!munnari!wacsvax!kim
University of Western Australia  |     ACSnet: kim@wacsvax.uwa.oz       
CRAWLEY, Australia 6009          |     PHONE:  +61 9 380 3452 
+--------------------------------+--------------------------------------------+

daniels@tle.enet.dec.com (Bradford R. Daniels) (07/13/90)

> >> compiler B:

> >> 	x=1   a=123  b=3
> >> 	x=1   a=123  b=4  <-- yes 4.
> 
> >This is clearly wrong, since there aren't even 4 characters in the string...
> 
>  Why?  The value of b is clearly invalid, as only one
>  argument was consumed.  Nothing was read into b, so
>  how can you say it was incorrectly assigned?

Pardon?  I have to disagree with you here.  Clearly, an assignment was made,
otherwise the value of b would be -99.  The return value of 1 tells us nothing,
since the standard explicitly states that the count of converted arguments
does not get incremented by %n.

- Brad


chris@mimsy.umd.edu (Chris Torek) (07/20/90)

(code in question: `sscanf("123", "%d%n", &a, &b)')

In article <13313@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>You've uncovered an interesting feature:  Although the %n specifier
>does not consume input, it can still have an "input failure" when
>EOF was encountered during preceding conversions that matched non-
>empty sequences.

Although the wording in the standard can be read to mean `%n fails with
an input failure if the stream is at EOF' (which would cause b to be
unmodified), I cannot believe that this was the intended behaviour.  It
seems to me that only conversions that require input should be able to
cause input failures, and I think the standard could be interpreted
this way too.
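
Code that only needs the number of characters consumed can sidestep the
question by not using %n at all.  A rough sketch using strtol, which is
swapped in here for illustration and is not proposed anywhere in the
thread; parse_int is a hypothetical name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal int from the start of s.
   Returns 1 on success and 0 if no digits were found or the value does
   not fit in an int.  On success *consumed holds the number of
   characters used, taken from strtol's end pointer, so it does not
   depend on how %n behaves at the end of the input. */
int parse_int(const char *s, int *value, int *consumed)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)
        return 0;                       /* no conversion performed */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* out of range for int */

    *value = (int)v;
    *consumed = (int)(end - s);
    return 1;
}
```

With this helper both "123" and "123x" give a value of 123 and a count
of 3, whatever the library's sscanf does with %n at end of input.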

>P.S.  This is not an official interpretation; if it bothers you,
>please send an official request for interpretation to CBEMA X3.

Someone please do so.

(who me? :-) )
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris