[comp.lang.c] fscanf bug in TC

jt2@prism.gatech.EDU (TROSTEL,JOHN M) (04/17/91)

In article <1991Apr16.141117.5065@odin.diku.dk> juul@diku.dk (Anders Juul Munch) writes:
>cn@allgfx.agi.oz (Con Neri) writes:
>
>>Hi netters,
>>	I have been working with a friend developing some code using
>>Turbo C++ V1.5 but only writing in standard C. We have been getting an error
>>with a particular piece of code, namely
>
>>	fscanf(fp,"%f", &f);
>
>>	The runtime error is
>
>>	scanf: floating point formats not linked.
>>	Abnormal Program termination.
>
>>	Can some one shed some light on what this means? 
I have found the same problem.  The way I worked around it was to declare
a new float variable, say new_var, and use it to read in my data. See old
and new code below:

OLD CODE:

...
float *f_ptr;
...
f_ptr = (float *)calloc(...);
...
fscanf(file,"%f",f_ptr);
...
      ^----- gives the run time error

NEW CODE:
...
float *f_ptr, new_var;
...
f_ptr=(float *)calloc(...);
...
fscanf(file,"%f",&new_var);
f_ptr[i] = new_var;    /* i'm inside a loop here */
....
       ^------ this code works!!??

Well, I can't figure it out! Nothing else was changed in the program to
make it work.  That is, it DIDN'T like the address sent to it when I passed
just 'f_ptr', but DID like the address it got with '&new_var'.
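
A minimal, self-contained sketch of the NEW CODE workaround above, filled
out so it compiles on its own. The helper name read_floats and its
signature are hypothetical, not from the original program; the elided
calloc(...) arguments are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: reads up to n floats from fp into dst[].
   Each value is scanned into a local float first and then copied,
   mirroring the NEW CODE above.  Returns the number of values read. */
size_t read_floats(FILE *fp, float *dst, size_t n)
{
    size_t i;
    float new_var;              /* local float, as in the workaround */

    assert(fp != NULL && dst != NULL);
    for (i = 0; i < n; i++) {
        if (fscanf(fp, "%f", &new_var) != 1)
            break;              /* stop at EOF or malformed input */
        dst[i] = new_var;
    }
    return i;
}
```

Checking the return value of fscanf also catches short or malformed
input, which the fragments above do not show.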

Anyone else figure this out more?  Anyone from Borland about to tell
us how to fix this?

-- 
John M. Trostel   ( aka Kayak-Man )                  
Georgia Institute of Technology, Atlanta Georgia, 30332
uucp: ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!jt2
Internet: jt2@prism.gatech.edu

krey@i30fs1.NoSubdomain.NoDomain (Andreas Krey) (04/17/91)

In article <26502@hydra.gatech.EDU>, jt2@prism.gatech.EDU (TROSTEL,JOHN M) writes:
|> In article <1991Apr16.141117.5065@odin.diku.dk> juul@diku.dk (Anders Juul Munch) writes:
|> >cn@allgfx.agi.oz (Con Neri) writes:
|> >
|> >>Hi netters,
|> >>	I have been working with a friend developing some code using
|> >>Turbo C++ V1.5 but only writing in standard C. We have been getting an error
|> >>with a particular piece of code, namely
|> >
|> >>	fscanf(fp,"%f", &f);
|> >
|> >>	The runtime error is
|> >
|> >>	scanf: floating point formats not linked
|> >>	Abnormal Program termination.
|> >
|> >>	Can some one shed some light on what this means? 
|> I have found the same problem.  The way I worked around it was to declare
|> a new float variable, say new_var, and use it to read in my data. See old
|> and new code below:
|> 

Sorry to say so, but that is probably something unrelated. The problem
with 'scanf: floating point formats not linked' is with the libraries.
Most users of printf/scanf don't do floating point, so the default library
code of printf/scanf cannot convert floating-point values. You have to set
a compiler flag/option to include the variant capable of float conversion
when linking.
(Cannot name the option, I only know this feature from a little C compiler.)

|> OLD CODE:
|> 
|> ...
|> float *f_ptr;
|> ...
|> f_ptr = (float *)calloc(...);
|> ...
|> fscanf(file,"%f",f_ptr);
|> ...
|>       ^----- gives the run time error
|> 
|> NEW CODE:
|> ...
|> float *f_ptr, new_var;
|> ...
|> f_ptr=(float *)calloc(...);
|> ...
|> fscanf(file,"%f",&new_var);
|> f_ptr[i] = new_var;    /* i'm inside a loop here */
|> ....
|>        ^------ this code works!!??
|> 
|> Well, I can't figure it out! Nothing else was changed in the program to
|> make it work.  That is, it DIDN'T like the address sent to it when I passed
|> just 'f_ptr', but DID like the address it got with '&new_var'.
|> 
|> Anyone else figure this out more?  Anyone from Borland about to tell
|> us how to fix this?
|> 

This is either a compiler bug (improbable), or something with
the memory models. new_var is on the stack, f_ptr points to the heap;
looks like far/near pointer trouble. That is, passing the wrong pointer
type.

|> -- 
|> John M. Trostel   ( aka Kayak-Man )                  
|> Georgia Institute of Technology, Atlanta Georgia, 30332
|> uucp: ...!{decvax,hplabs,ncar,purdue,rutgers}!gatech!prism!jt2
|> Internet: jt2@prism.gatech.edu

-- 
Andy

-------------------------------------------
Zeit ist Geld. Aber Geld ist keine Zeit.
[Intl: Time is money. But money isn't time.]

To fight xrn stupidity: Andreas Krey, krey@ira.uka.de

ts@cup.portal.com (Tim W Smith) (04/18/91)

< that the algorithm is imperfect: if the program isn't using
< floating point, %e, %f, and %g can't be needed, but they might
< not be needed if the program is using floating point, either.
< However, "program uses floating point" is in principle computable
< at compile time, while "%e, %f, or %g might get passed to printf"
< isn't.)

Suppose you have a program that is receiving data a byte at a time
over a serial port.  Each item consists of a type followed by the
data.  This program simply wants to print the data.

Suppose one of the types is single precision floating point.
The programmer "knows" that a float is the same size as a long,
and that the data was sent by taking the address of a float on
the other side and just sending out the bytes.

We might see code like this:

	long data;

	data = nextbyte() << 24;
	data |= nextbyte() << 16;
	data |= nextbyte() << 8;
	data |= nextbyte();

	printf( "%f", data );

Oops!  No floating point at compile time but needed at runtime.

						Tim Smith

ps: of course, I would never do this!

scs@adam.mit.edu (Steve Summit) (04/18/91)

In article <26502@hydra.gatech.EDU> jt2@prism.gatech.EDU (TROSTEL,JOHN M) writes:
>>cn@allgfx.agi.oz (Con Neri) writes:
>>>We have been getting an error
>>>with a particular piece of code, namely
>>>	scanf: floating point formats not linked.
>>>	Abnormal Program termination.
>>>	Can some one shed some light on what this means? 
>I have found the same problem.  The way I worked around it was to declare
>a new float variable, say new_var, and use it to read in my data.
>Anyone else figure this out more?  Anyone from Borland about to tell
>us how to fix this?

This issue faintly amazes me.  I can't believe that:

1. there are still people who have not heard about this problem,
2. Borland apparently still hasn't fixed it,
3. the problem exists in the first place, and
4. so many programs manage to elicit it.

(Neither 1 nor 4 are flames; I'm just, as I say, faintly amazed.)

There's no need to speculate on this problem.  It is explained in
both the comp.lang.c and comp.sys.ibm.pc.misc FAQ lists.  Here's
the whole story (at least, as much of it as I know), for those
who care:

printf is actually a miniature interpreter, and only discovers at
run time which format specifiers appear in its format string.
Since the floating-point code for dealing with %e, %f, and %g is
substantial, and since many programs do not use floating point,
it's tempting to leave the floating-point code out for programs
which don't need it, especially on machines with limited address
spaces.  In fact, Ritchie's original PDP-11 C compiler did so,
albeit with considerably more success than does Turbo C.

The basic idea is that there are two copies of the conceptual
equivalent of printf.obj (printf.o for us Unix fans) lying
around: one which handles %e, %f, and %g, and one which doesn't.
The compiler communicates with the linker somehow, informing it
whether the program is using floating point or not, and whether
the full-blown or truncated printf code should be linked.  (Note
that the algorithm is imperfect: if the program isn't using
floating point, %e, %f, and %g can't be needed, but they might
not be needed if the program is using floating point, either.
However, "program uses floating point" is in principle computable
at compile time, while "%e, %f, or %g might get passed to printf"
isn't.)
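
A rough sketch of the "miniature interpreter" idea, assuming a much
simplified formatter. It is not how any real C library is built: the
names mini_format, float_conv_fn and full_float_conv are hypothetical,
and a function pointer that may be NULL stands in for the float code
that the linker either did or did not pull in.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hook: the "full" library supplies a converter,
   the "truncated" one leaves it NULL. */
typedef int (*float_conv_fn)(char *out, size_t cap, double value);

int full_float_conv(char *out, size_t cap, double value)
{
    return snprintf(out, cap, "%f", value);
}

/* Understands only %d and %f.  Returns the length written, or -1 if
   the format asks for something that is not available.  Note that
   the need for %f is discovered only while walking the string. */
int mini_format(char *out, size_t cap, float_conv_fn fconv,
                const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;
    int ok = 1;

    assert(out != NULL && fmt != NULL && cap > 0);
    va_start(ap, fmt);
    while (ok && *fmt != '\0') {
        int n = 0;

        if (*fmt != '%') {                  /* literal character */
            if (used + 1 < cap)
                out[used++] = *fmt;
            fmt++;
            continue;
        }
        fmt++;                              /* skip the '%' */
        if (*fmt == 'd') {
            n = snprintf(out + used, cap - used, "%d", va_arg(ap, int));
        } else if (*fmt == 'f' && fconv != NULL) {
            n = fconv(out + used, cap - used, va_arg(ap, double));
        } else {
            ok = 0;     /* %f with no converter, or unsupported spec */
            break;
        }
        if (n < 0) {
            ok = 0;
            break;
        }
        /* Advance, clamping if the conversion was truncated. */
        used += ((size_t)n < cap - used) ? (size_t)n : cap - used - 1;
        fmt++;
    }
    va_end(ap);
    out[used] = '\0';
    return ok ? (int)used : -1;
}
```

Calling mini_format with fconv set to NULL and a "%f" in the format
fails only at run time, which is the analogue of the "floating point
formats not linked" abort.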

How does the compiler determine that a "program uses floating
point?"  Ritchie's compiler asserts that a program uses floating
point if a variable is declared as float or double (or, as I
recall, a pointer to same), and if that variable is then used.
(Even this heuristic isn't perfect; Doug Gwyn claims to have
augmented it to handle a few more, really obscure cases, but I
don't know the details -- perhaps they involve casts.)  Turbo C
apparently asserts that a program uses floating point only if the
program actually calls for floating point arithmetic.

I don't know why Turbo C uses such an obviously inadequate
heuristic, particularly when a simple, correct one exists (and
was used by a highly visible, 15 year old compiler).  It may be
that the printf float/nofloat decision is driven by the (equally
broken) PC floating-point-via emulator/coprocessor/both/neither
distinction, rather than by a "magic" extra undefined external
(which is how Ritchie's compiler did it, with the symbol __fltused).
I'm sure that the folks on comp.os.msdos.programmer (where this
discussion really belongs, and to where followups have been
redirected) could provide more information.

(I have heard that recent releases of Turbo C finally manage to
correct this problem.  I would appreciate any confirmation of
this rumor.)

The remaining puzzle is why so many programs are bitten by this
bug.  How many real programs read floating point values in and
printf them back out (or printf compile-time floating point
constants) without doing any arithmetic on them?  (This is not to
blame the victim, or to excuse Borland for having the bug.  The
test programs which are used to demonstrate the bugs are always
stripped down, as bug-demonstrating test programs should be.  I
wonder why the longer programs from which the test programs were
stripped down managed not to do enough floating-point arithmetic
to trigger proper linking?  Perhaps the problem only comes up
when people write little test programs to play with printf
floating-point specifiers, and there are enough of those little
test programs to account for the frequency of the question.)

                                            Steve Summit
                                            scs@adam.mit.edu

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (04/19/91)

In article <41392@cup.portal.com>, ts@cup.portal.com (Tim W Smith) writes:
> Suppose one of the types is single precision floating point.
> 	long data;
> 
> 	data = nextbyte() << 24;
> 	data |= nextbyte() << 16;
> 	data |= nextbyte() << 8;
> 	data |= nextbyte();

> 	printf( "%f", data );

> Oops!  No floating point at compile time but needed at runtime.
> ps: of course, I would never do this!

I'm so glad to hear that, as it doesn't do what you intended.
The %f format expects to find a DOUBLE value on the stack, which in TC
is 64 bits, but 'long' is only 32 bits.  [See the .signature.]  To make
this example work, it would need to be
	{ union {long L; float F;} pun;		/* we can't use a cast */
	  pun.L = data;				/* because we DON'T want */
	  printf("%f", pun.F);			/* conversion */
	}
This, of course, "declares a floating-point variable", so the floating-
point routines should be linked in.
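
A self-contained sketch of the same union trick, assuming a 32-bit IEEE
single-precision float and bytes arriving most significant first. The
function name float_from_be_bytes is hypothetical, and uint32_t replaces
'long' so the sketch does not depend on the width of long.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reassembles four big-endian bytes into a
   float by reinterpreting the bit pattern (no value conversion). */
float float_from_be_bytes(const unsigned char bytes[4])
{
    union { uint32_t bits; float value; } pun;  /* a cast would convert */

    assert(sizeof pun.value == sizeof pun.bits);
    /* Widen each byte before shifting; a 16-bit int (as in Turbo C)
       cannot hold a value shifted left by 16 or 24. */
    pun.bits = ((uint32_t)bytes[0] << 24)
             | ((uint32_t)bytes[1] << 16)
             | ((uint32_t)bytes[2] << 8)
             |  (uint32_t)bytes[3];
    return pun.value;
}
```

Passing the result to printf("%f", ...) is then fine, because a float
argument to a variadic function is promoted to double.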
-- 
Bad things happen periodically, and they're going to happen to somebody.
Why not you?					-- John Allen Paulos.

ts@cup.portal.com (Tim W Smith) (04/19/91)

Oops!  Change my example to read 8 bytes into two longs and then
have it do this:

	printf( "%f", longA, longB );

						Tim Smith

ps: of course, I would not even think of doing this.  In fact, I
didn't write this...my evil twin did.

mjs@hubcap.clemson.edu (M. J. Saltzman) (04/21/91)

Well, I just found this bug, too, in Turbo C++ v2.0.  Since it's a
current topic of conversation, I thought I'd toss in my experience
with it.

In article <26502@hydra.gatech.EDU>, jt2@prism.gatech.EDU (TROSTEL,JOHN M) writes:
|> In article <1991Apr16.141117.5065@odin.diku.dk> juul@diku.dk (Anders Juul Munch) writes:
|> >cn@allgfx.agi.oz (Con Neri) writes:
|> >
|> >>Hi netters,
|> >>	I have been working with a friend developing some code using
|> >>Turbo C++ V1.5 but only writing in standard C. We have been getting an error
|> >>with a particular piece of code, namely
|> >
|> >>	fscanf(fp,"%f", &f);
|> >
|> >>	The runtime error is
|> >
|> >>	scanf: floating point formats not linked
|> >>	Abnormal Program termination.
|> >
|> >>	Can some one shed some light on what this means? 
|> I have found the same problem.  The way I worked around it was to declare
|> a new float variable, say new_var, and use it to read in my data. See old
|> and new code below:
|> 

In <1991Apr17.143139.20903@ira.uka.de> krey@i30fs1 (Andreas Krey) responds:
AK>Sorry to say so, but that is probably something unrelated. The problem
AK>with 'scanf: floating point formats not linked' is with the libraries.
AK>Most users of printf/scanf don't do floating point, so the default library
AK>code of printf/scanf cannot convert floating-point values. You have to set
AK>a compiler flag/option to include the variant capable of float conversion
AK>when linking.
AK>(Cannot name the option, I only know this feature from a little C compiler.)

Actually, this seems to be the crux of the matter.  As to the compiler
option, Turbo C++ includes an option to *exclude* the FP libraries,
even if the compiler thinks they're necessary, but if the compiler
determines that no FP library is needed, it will not include it, no
matter what command line options you use. 

The particular circumstances under which the error occurs seem to
be when you pass the value of a (float *) variable to scanf (as opposed
to the address of a float).  It doesn't seem to be enough to require
floating point *operations* in the code to override the problem; you
need to include a call to a FP *function* from the math.h header.
You can also read directly into a float.  If you do this anywhere in
the code, the problem disappears.

|> OLD CODE:
|> 
|> ...
|> float *f_ptr;
|> ...
|> f_ptr = (float *)calloc(...);
|> ...
|> fscanf(file,"%f",f_ptr);
|> ...
|>       ^----- gives the run time error
|> 
|> NEW CODE:
|> ...
|> float *f_ptr, new_var;
|> ...
|> f_ptr=(float *)calloc(...);
|> ...
|> fscanf(file,"%f",&new_var);
|> f_ptr[i] = new_var;    /* i'm inside a loop here */
|> ....
|>        ^------ this code works!!??
|> 
|> Well, I can't figure it out! Nothing else was changed in the program to
|> make it work.  That is, it DIDN'T like the address sent to it when I passed
|> just 'f_ptr', but DID like the address it got with '&new_var'.
|> 
|> Anyone else figure this out more?  Anyone from Borland about to tell
|> us how to fix this?
|> 

AK>This is either a compiler bug (improbable), or something with
AK>the memory models. new_var is on the stack, f_ptr points to the heap;
AK>looks like far/near pointer trouble. That is, passing the wrong pointer
AK>type.

Nope.  This time it seems to be a compiler bug.  The above is one way
to work around it.  The other is to call a math.h function.

Then, in <1991Apr18.021515.1481@athena.mit.edu>, scs@adam.mit.edu (Steve Summit) writes:
SS>This issue faintly amazes me.  I can't believe that:

SS>1. there are still people who have not heard about this problem,
SS>2. Borland apparently still hasn't fixed it,
SS>3. the problem exists in the first place, and
SS>4. so many programs manage to elicit it.

SS>(Neither 1 nor 4 are flames; I'm just, as I say, faintly amazed.)

Well, even so, I don't write code every day, so I don't keep up with
this group that much.  Enough programs probably don't elicit it
that many people never encounter it.

I called Borland, and they suggested calling a dummy math function as
a workaround.  I don't know why they haven't fixed it either.
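
A sketch of what that workaround might look like. The thread only says
"a math.h function", so the choice of fabs here is an assumption, as are
the names force_fp_link and read_floats_direct. On compilers without the
bug the dummy is simply harmless.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical dummy: it never needs to be called.  Its only job is
   to reference a math.h function somewhere in the program, which is
   the workaround reported for the affected Turbo C releases. */
void force_fp_link(void)
{
    volatile double d = -1.0;

    d = fabs(d);
}

/* Reads straight through the float pointer, as in the OLD CODE. */
size_t read_floats_direct(FILE *fp, float *f_ptr, size_t n)
{
    size_t i;

    assert(fp != NULL && f_ptr != NULL);
    for (i = 0; i < n; i++) {
        if (fscanf(fp, "%f", f_ptr + i) != 1)
            break;              /* stop at EOF or malformed input */
    }
    return i;
}
```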

SS>[most of explanation of bug mechanism deleted]

SS>How does the compiler determine that a "program uses floating
SS>point?"  Ritchie's compiler asserts that a program uses floating
SS>point if a variable is declared as float or double (or, as I
SS>recall, a pointer to same), and if that variable is then used.
SS>(Even this heuristic isn't perfect; Doug Gwyn claims to have
SS>augmented it to handle a few more, really obscure cases, but I
SS>don't know the details -- perhaps they involve casts.)  Turbo C
SS>apparently asserts that a program uses floating point only if the
SS>program actually calls for floating point arithmetic.

This can't be it.  I think it must check to see if the address of
a float is passed to scanf, but miss the case where a pointer to 
a float is passed.  Actually doing floating point has no effect on
the bug, although calling a floating point library function does.

SS>I don't know why Turbo C uses such an obviously inadequate
SS>heuristic, particularly when a simple, correct one exists (and
SS>was used by a highly visible, 15 year old compiler).  It may be
SS>that the printf float/nofloat decision is driven by the (equally
SS>broken) PC floating-point-via emulator/coprocessor/both/neither
SS>distinction, rather than by a "magic" extra undefined external
SS>(which is how Ritchie's compiler did it, with the symbol __fltused).
SS>I'm sure that the folks on comp.os.msdos.programmer (where this
SS>discussion really belongs, and to where followups have been
SS>redirected) could provide more information.

I suspect that Borland started with something like Ritchie's
heuristic, then outsmarted themselves trying to recognize more cases
where they could save.  I don't know for sure.  There hasn't been any
discussion on comp.os.msdos.programmer yet.  Maybe this will start
some.

SS>(I have heard that recent releases of Turbo C finally manage to
SS>correct this problem.  I would appreciate any confirmation of
SS>this rumor.)

Not as of Turbo C++ version 2.0.  When I called, they did not offer
an upgraded interim release as a fix, even though one apparently
exists.

SS>The remaining puzzle is why so many programs are bitten by this
SS>bug.  How many real programs read floating point values in and
SS>printf them back out (or printf compile-time floating point
SS>constants) without doing any arithmetic on them?  (This is not to
SS>blame the victim, or to excuse Borland for having the bug.  The
SS>test programs which are used to demonstrate the bugs are always
SS>stripped down, as bug-demonstrating test programs should be.  I
SS>wonder why the longer programs from which the test programs were
SS>stripped down managed not to do enough floating-point arithmetic
SS>to trigger proper linking?  Perhaps the problem only comes up
SS>when people write little test programs to play with printf
SS>floating-point specifiers, and there are enough of those little
SS>test programs to account for the frequency of the question.)

Well, my application was a problem generator for an optimization 
package.  It allocates some arrays, reads values into them, then
does simple arithmetic to produce an output file with many more
numbers in it.  I actually encountered the problem for the first
time when I was trying to write the input portion, and was just 
echoing the input, but the final version of the program also
fails, since the arithmetic is too simple (no library calls), 
and no values are read into floats on the stack.

Finally, in <1991Apr16.091320.29937@monu6.cc.monash.edu.au>, ron@monu6.cc.monash.edu.au (Ron Van Schyndel) writes:

RVS>Congratulations!  You have found the famous MATH bug, present in most C 
RVS>compilers (at least, *I* think it's a bug).

No argument here!  (Maybe most *DOS* C compilers...)

RVS>When TC compiles your program, it keeps track of whether floating point
RVS>instructions were actually GENERATED.  In your code above, a MEMORY ADDRESS
RVS>is passed to some unknown (TC doesn't know what FSCANF is - it's linked in
RVS>later) function.  That doesn't cause floating point code to be generated.
RVS>Thus, the compiler does not cause the floating point library to be linked in,
RVS>and it is only at runtime that this is detected.

Well, that's apparently not exactly the mechanism, at least in Turbo C++
(see above).  Don't know about other compilers.

RVS>Be happy for the error message;  MS C version 4 and earlier would simply hang
RVS>in this situation.

Ick.

RVS>I think this is a bug, since even if you include the -f or -f87 option, telling
RVS>the compiler EXPLICITLY that you want floating point included, it will still
RVS>NOT include it in the above situation.

It's fine for the compiler to try and second-guess you (if it does a
good job in most circumstances), but no heuristic will cover every
case.  It's beyond me why Borland provides the ability to explicitly
*exclude* the FP libraries, in case the compiler wrongly includes
them, but they don't provide an option to *include* them when the
compiler erroneously leaves them out.  It seems reasonable to make
this a separate decision from that of *which* library to use, if one
is needed at all.

RVS>The fix?   Include the following before the FSCANF.
RVS>
RVS>        f = 3.0 * i;          /* where i is ANY variable whose value cannot */
RVS>	fscanf(fp,"%f", &f);  /* be anticipated by the compiler */
RVS>
RVS>The f will get immediately overwritten by the FSCANF, but the compiler will 
RVS>now be forced to include the floating point library code.

You probably got the value for i by reading it with scanf().  This fix
doesn't work if you don't (see above again).

RVS>Hope this helps, RON

I hope I've added something to the discussion.  (OK, *now* let's
followup to comp.os.msdos.programmer).

		Matthew Saltzman
		mjs@clemson.edu

donc@microsoft.UUCP (Don CORBITT) (04/30/91)

In article <41392@cup.portal.com> ts@cup.portal.com (Tim W Smith) writes:
>< that the algorithm is imperfect: if the program isn't using
>< floating point, %e, %f, and %g can't be needed, but they might
>< not be needed if the program is using floating point, either.
>< However, "program uses floating point" is in principle computable
>< at compile time, while "%e, %f, or %g might get passed to printf"
>< isn't.)

[deleted text with justification for following code]

>We might see code like this:
>
>	long data;

	[...]

>	printf( "%f", data );
>Oops!  No floating point at compile time but needed at runtime.

This is incorrect.  Since printf() is varargs, a float value passed would
be expanded to double, so %f expects a double, not a 32-bit long.  I would
use a union here, if I wanted to do such a thing.  Of course, someone will
come up with such a program.  The fix is simple: do some floating point
math somewhere else in the program, even in a func that's never called.
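
A small sketch of the promotion rule mentioned above, using a
hypothetical variadic helper named sum_as_double. Every float passed
through "..." arrives as a double, which is why the callee must fetch it
with va_arg(ap, double), just as printf does for %f.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments, fetching
   each one as a double.  Callers may pass floats; the default
   argument promotions widen them before the call. */
double sum_as_double(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}
```

Passing a long where the callee fetches a double, as in the quoted
printf("%f", data), breaks this contract and is undefined behavior.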

--
Don Corbitt
Microsoft Windows Development