[comp.lang.c] Implicit decimal points in floating-point reads

brotzman@nssdca.gsfc.nasa.gov (Lee E. Brotzman) (05/21/91)

Howdy,

   I am writing some code that reads a file format that uses text headers to
describe fields in ASCII tables (the format is called FITS for Flexible Image
Transport System -- tables are included as an extension of the image format).
The fields are described with standard FORTRAN 77 input formats.
   The problem I am encountering is this:  FORTRAN allows input strings
representing floating point values to have implicit decimal places, i.e.
the string "26208" read with a format of F5.3 results in a value of 26.208.
As far as I can tell there is no equivalent functionality in C.  I have tried
using sscanf with an input format of "%5.3f", but the result is garbage (see
below).
   Kernighan and Ritchie (1988) seems vague on the subject and doesn't really
say whether implicit decimal points are allowed on input.  The Turbo C
Reference Guide discussion of the format string for scanf does not mention the
precision part of the string (".3" in my example above).  
   I have tried the snippet of code below on both my PC under Turbo C and
on a VAX/VMS system under VAX C.  Both return a 0 for the number of items read
and 0.0 for the floating point value.
   My question:  is it possible to read data that are formatted with implicit
decimal points in ANSI standard C without writing my own routine?  I do not need
code for a work-around; I can write my own.  I am just wondering whether I have
to.  Before anyone mentions it -- no, I cannot change the input data to
include the decimal point.
   I'd be surprised that such an obviously useful bit of functionality
that has existed for decades in FORTRAN isn't available in C, especially
considering all of the other features packed into the scanf routine. 
Please tell me that I'm being a bonehead and missing something obvious.  :-)

-- Lee E. Brotzman                    Internet:  brotzman@nssdca.gsfc.nasa.gov
-- ST Systems Corp.                   SPAN:      NSSDCA::BROTZMAN
-- National Space Science Data Center BITNET:    ZMLEB@SCFVM

---------------------------< TEST.C >---------------------------------
#include <stdio.h>

int main(void)
{
   char a1[10] = "26208";
   float x = 0.0f;
   int ifld = 0;

   /* attempt a Fortran-style width.precision on input */
   ifld = sscanf(a1, "%5.3f", &x);
   printf("%s %f\n%s %i\n", "input string converted to: ", x,
            "number of parameters converted: ", ifld);
   return 0;
}
----------------------------------------------------------------------
-------------------------< TEST.OUT >---------------------------------
input string converted to:  0.000000
number of parameters converted:  0
----------------------------------------------------------------------

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/21/91)

In article <5366@dftsrv.gsfc.nasa.gov>, brotzman@nssdca.gsfc.nasa.gov (Lee E. Brotzman) writes:
>    The problem I am encountering is this:  FORTRAN allows input strings
> representing floating point values to have implicit decimal places, i.e.
> the string "26208" read with a format of F5.3 results in a value of 26.208.
> As far as I can tell there is no equivalent functionality in C

Scanf breaks out fields, then converts those fields to binary
exactly the way the strtod() function would.  There is no way at all
for you to say where the implicit decimal point would be.  "%5.3f" is
_not_ a valid scanf() format in C.

If you want to read a field of width W and then convert that to
floating point, use "%Wc" (where W is a literal integer, e.g. "%5c")
to read the characters, and then write your own function to parse them.
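
A minimal sketch of the "%Wc"-then-parse approach, for the F5.3 case from
the original post.  The names parse_fwd and read_f5_3 are hypothetical and
not taken from any library.  It assumes plain numeric fields and does not
try to reproduce Fortran's handling of embedded blanks:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a field of `width` characters using the
 * Fortran Fw.d rule.  An explicit '.' in the field wins; otherwise the
 * value is scaled by 10^-d.  Returns 1 on success, 0 on failure. */
int parse_fwd(const char *field, int width, int d, double *out)
{
    char buf[64];
    char *end;
    double v, scale = 1.0;
    int i;

    assert(width > 0 && width < (int)sizeof buf);
    memcpy(buf, field, (size_t)width);
    buf[width] = '\0';              /* strtod needs a terminated string */

    v = strtod(buf, &end);
    if (end == buf)
        return 0;                   /* no number in the field */

    if (strchr(buf, '.') == NULL) {
        for (i = 0; i < d; i++)
            scale *= 10.0;
        v /= scale;                 /* implied point: one division */
    }
    *out = v;
    return 1;
}

/* Read one 5-column field with "%5c", then convert it as F5.3. */
int read_f5_3(const char *record, double *out)
{
    char field[5];                  /* "%5c" stores no terminating NUL */

    memset(field, ' ', sizeof field);   /* blank-pad in case of short input */
    if (sscanf(record, "%5c", field) != 1)
        return 0;
    return parse_fwd(field, 5, 3, out);
}
```

With this, read_f5_3("26208", &x) should leave x at about 26.208, while
read_f5_3("2.625", &x) keeps the explicit decimal point and yields 2.625.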

I have my own scanf() replacement kit which lets me read most Fortran
formats quite easily.

Something you might consider doing is picking up "f2c" from
research.att.com.  That contains an implementation of Fortran input
(and of course output) written in C.  The simplest thing for you to
do might be to use that library.

>    I'd be surprised that such an obviously useful bit of functionality
> that has existed for decades in FORTRAN isn't available in C, especially
> considering all of the other features packed into the scanf routine. 
> Please tell me that I'm being a bonehead and missing something obvious.  :-)

It may be obviously useful to you, but it's amazing how many C programmers
never missed it.  When I used Fortran, I always thought that implied
decimal points were a "feature" to let you squeeze one more column out of
a punched card, and designed my input formats so that they weren't needed.
It made checking the data so much easier.
-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.

fenn@wpi.WPI.EDU (Brian Fennell) (05/22/91)

In article <5366@dftsrv.gsfc.nasa.gov> brotzman@nssdca.gsfc.nasa.gov writes:
>   The problem I am encountering is this:  FORTRAN allows input strings
>representing floating point values to have implicit decimal places, i.e.
>the string "26208" read with a format of F5.3 results in a value of 26.208.
>As far as I can tell there is no equivalent functionality in C.  I have tried
>using sscanf with an input format of "%5.3f", but the result is garbage (see
>below).

I think the answer is "you can't get there from here."

>Please tell me that I'm being a bonehead and missing something obvious.  :-)

no bonehead about it, but since this is something you need a fix for,
not a how-do-I-write-like-a-real-C-Guru-would, try this...

>---------------------------< TEST.C >---------------------------------
>#include <stdio.h>
>
>main()
>{  char a1[10] = "26208";
>   float x = 0.0;
>   int ifld = 0;
>   ifld = sscanf(a1, "%5f", &x); 

    x /= 1000.;	/* simple, to the point, no fancy fix routines needed */

>   printf("%s %f\n%s %i", "input string converted to: ", x,
>            "number of parameters converted: ", ifld);
>}

Brian Fennell == fenn@wpi.wpi.edu

gwyn@smoke.brl.mil (Doug Gwyn) (05/22/91)

In article <5366@dftsrv.gsfc.nasa.gov> brotzman@nssdca.gsfc.nasa.gov writes:
>   My question:  is it possible to read data that are formatted with implicit
>decimal points in ANSI standard C without writing my own routine?

No, there is no Fortranish support for that in *scanf() formats.

>   I'd be surprised that such an obviously useful bit of functionality
>that has existed for decades in FORTRAN isn't available in C, ...

What is obvious to ME is that such a format is an accident waiting to strike!

It is also TRIVIAL to multiply a scanned integer by a power of 10 to scale
it thusly.
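
A minimal sketch of that scaling, assuming a 5-column field that holds only
an optionally signed integer.  The name scan_scaled and the lookup table are
hypothetical.  A field with an explicit decimal point (say "26.20") would be
misread by this version, since "%5ld" stops at the '.':

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: scan a 5-column integer field and scale it by
 * 10^-d.  Returns 1 on success, 0 if no integer could be read. */
int scan_scaled(const char *field, int d, double *out)
{
    /* exact powers of ten, so the scaling is a single division */
    static const double scale_tab[] = {
        1.0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9
    };
    long n;

    assert(d >= 0 && d < (int)(sizeof scale_tab / sizeof scale_tab[0]));
    if (sscanf(field, "%5ld", &n) != 1)
        return 0;
    *out = (double)n / scale_tab[d];
    return 1;
}
```

For the F5.3 example, scan_scaled("26208", 3, &x) should give x close to
26.208.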

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/22/91)

In article <1991May21.200003.13471@wpi.WPI.EDU>, fenn@wpi.WPI.EDU (Brian Fennell) writes:
> >main()
> >{  char a1[10] = "26208";
> >   float x = 0.0;
> >   int ifld = 0;
> >   ifld = sscanf(a1, "%5f", &x); 
> 
>     x /= 1000.;	/* simple, to the point, no fancy fix routines needed */

Simple, to the point, and WRONG.  The thing about the Fortran format which
the original poster wanted to emulate is that the decimal point is placed
implicitly where the format specification says *UNLESS* there is an
explicit decimal point in the input field.

	if (!strchr(a1, '.')) x /= 1.0e3;

comes closer.  Of course, I don't need to explain to a sophisticated
audience why this loses accuracy compared with bending the string, do I?
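
One possible reading of "bending the string", sketched under assumptions:
put the decimal point into a copy of the text and convert once, so the value
is rounded a single time by strtod rather than once on conversion and again
on division.  For short fields the two approaches may well agree.  The name
bend_field is hypothetical; the sketch assumes leading blanks, an optional
sign, and plain digits (no exponent, no embedded blanks):

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: apply the Fw.d rule by editing the text, not the
 * value.  Returns 1 on success, 0 if the field holds no number. */
int bend_field(const char *field, int width, int d, double *out)
{
    char buf[80];
    char *end;
    int i = 0, k = 0, start, ndigits, whole;

    assert(width > 0 && d >= 0 && width + d + 4 < (int)sizeof buf);

    if (memchr(field, '.', (size_t)width) != NULL) {
        memcpy(buf, field, (size_t)width);  /* explicit point wins */
        k = width;
    } else {
        while (i < width && field[i] == ' ')
            i++;                            /* skip leading blanks */
        if (i < width && (field[i] == '+' || field[i] == '-'))
            buf[k++] = field[i++];          /* keep the sign */
        start = i;
        while (i < width && isdigit((unsigned char)field[i]))
            i++;
        ndigits = i - start;
        if (ndigits == 0)
            return 0;
        whole = ndigits > d ? ndigits - d : 0;
        memcpy(buf + k, field + start, (size_t)whole);
        k += whole;
        buf[k++] = '.';                     /* the implied point */
        for (i = ndigits; i < d; i++)
            buf[k++] = '0';                 /* "12" with d=3 -> ".012" */
        memcpy(buf + k, field + start + whole, (size_t)(ndigits - whole));
        k += ndigits - whole;
    }
    buf[k] = '\0';
    *out = strtod(buf, &end);
    return end != buf;
}
```

Here "26208" with width 5 and d = 3 is rewritten to "26.208" before
conversion, and a short field such as "   12" becomes ".012".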
-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.

DOCTORJ@SLACVM.SLAC.STANFORD.EDU (Jon J Thaler) (05/22/91)

In article <16222@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) says:
>
> In article <5366@dftsrv.gsfc.nasa.gov> brotzman@nssdca.gsfc.nasa.gov writes:
> > My question:  is it possible to read data that are formatted with implicit
> > decimal points in ANSI standard C without writing my own routine?
>
> No, there is no Fortranish support for that in *scanf() formats.
>
> > I'd be surprised that such an obviously useful bit of functionality
> > that has existed for decades in FORTRAN isn't available in C, ...
>
> What is obvious to ME is that such a format is an accident waiting to strike!

I agree.  I programmed in FORTRAN for 20 years (before seeing the light...)
and never saw the need for laziness at this level.

brotzman@nssdcb.gsfc.nasa.gov (Lee E. Brotzman) (05/24/91)

In article <16222@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes...
>In article <5366@dftsrv.gsfc.nasa.gov> brotzman@nssdca.gsfc.nasa.gov writes:
>>   My question:  is it possible to read data that are formatted with implicit
>>decimal points in ANSI standard C without writing my own routine?
> 
>No, there is no Fortranish support for that in *scanf() formats.
> 
>>   I'd be surprised that such an obviously useful bit of functionality
>>that has existed for decades in FORTRAN isn't available in C, ...
> 
>What is obvious to ME is that such a format is an accident waiting to strike!
> 
>It is also TRIVIAL to multiply a scanned integer by a power of 10 to scale
>it thusly.

   Doug's response has been typical of the half-dozen or so I have 
received (thanks everybody), except that it's a bit more snippish than 
most.

   This "accident waiting to happen" stuff is bunk however.  We've 
been reading data formatted with implicit decimal points quite nicely 
for a few decades now, thank you very much.  Haven't noticed any 
problem with it until I tried using C, which simply wasn't designed 
with analysis of large volumes of data in mind like Fortran was.  No flame 
there, just statement of opinion.

  In my application, I cannot know beforehand whether a specific field
has implicit decimal points.  The Fortran formats don't say; they just have
the pleasant side effect of reading them correctly either way.
Inspecting the fields, converting the values and dividing by some 
power of ten (which must also be selected by a string inspection) is 
trivial, but perhaps not all that efficient. I'm going to be using 
this code to filter records from files with 250,000 records 200 bytes 
long on CD-ROM.  If the CD-ROM drives can't stream the data out at 
full speed, they lose their place and have to reacquire, which can 
increase search times by factors of 3 or 4, so fast conversion of 
field values is critical.

   Sometimes -- no, make that most times -- dealing with data is not nearly 
as easy as dealing with code.  Programmers tend to forget this.  Data 
processing professionals can't afford to.

   Thanks to everyone who responded publicly and privately.  No 
further comments are required, as I don't ordinarily read this 
newsgroup.

   So long,

-- Lee E. Brotzman                    Internet:  brotzman@nssdca.gsfc.nasa.gov
-- ST Systems Corp.                   SPAN:      NSSDCA::BROTZMAN
-- Astrophysics Data System           BITNET:    ZMLEB@SCFVM
-- National Space Science Data Center "Prayer: the last refuge of a scoundrel"
-- "My thoughts are my own"                                 Lisa Simpson, 1990

gwyn@smoke.brl.mil (Doug Gwyn) (05/25/91)

In article <5409@dftsrv.gsfc.nasa.gov> brotzman@nssdcb.gsfc.nasa.gov writes:
>>It is also TRIVIAL to multiply a scanned integer by a power of 10 to scale
>>it thusly.
>Inspecting the fields, converting the values and dividing by some 
>power of ten (which must also be selected by a string inspection) is 
>trivial, but perhaps not all that efficient.

Sigh -- this is all wrong, and if he'd bothered to understand what was
said rather than complain about how it doesn't fit his preconceptions,
he'd have a good working solution.

gwyn@smoke.brl.mil (Doug Gwyn) (05/25/91)

In article <5889@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>>     x /= 1000.;	/* simple, to the point, no fancy fix routines needed */
>Simple, to the point, and WRONG.  The thing about the Fortran format which
>the original poster wanted to emulate is that the decimal point is placed
>implicitly where the format specification says *UNLESS* there is an
>explicit decimal point in the input field.

Note that in any given instance either (a) the whole file does have
decimal points in the field, which can be determined from inspecting
just ONE field ONE time, (b) the whole file does not have decimal
points in the field, ditto, or (c) the file has a mixture of records
with some having the decimal point and some not having it.  Case (c)
is the "accident waiting to happen" situation, and would almost
certainly correspond to manual creation of the data.  Cases (a) and
(b) are almost certainly what one would see if the data were the
output of another program or instrumentation system.
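
A minimal sketch of handling cases (a) and (b), assuming fixed-width
records that are at least offset + width characters long.  The struct and
function names are hypothetical.  The first record is inspected once per
column, and every later record takes the path chosen then; a case (c) file
would be silently misread by this version:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-column description, filled in once per file. */
struct column {
    int offset;          /* start of the field within a record */
    int width;           /* field width in characters */
    int explicit_point;  /* nonzero if the first record had a '.' */
    double scale;        /* 10^d, used only for implied points */
};

/* Inspect ONE field ONE time to decide how the column is converted. */
void column_init(struct column *c, const char *first_record,
                 int offset, int width, int d)
{
    int i;

    assert(width > 0 && width < 32 && d >= 0 && d <= 15);
    c->offset = offset;
    c->width = width;
    c->scale = 1.0;
    for (i = 0; i < d; i++)
        c->scale *= 10.0;
    c->explicit_point =
        memchr(first_record + offset, '.', (size_t)width) != NULL;
}

/* Convert the column's field in one record; no per-record inspection. */
double column_value(const struct column *c, const char *record)
{
    char buf[32];

    memcpy(buf, record + c->offset, (size_t)c->width);
    buf[c->width] = '\0';
    if (c->explicit_point)
        return strtod(buf, NULL);
    return (double)strtol(buf, NULL, 10) / c->scale;
}
```

After column_init(&c, first, 3, 5, 3) on a record such as "AB 26208 XYZ",
each call to column_value(&c, record) costs one copy and one conversion.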