[comp.sys.amiga] C scanf question

new@udel.EDU (Darren New) (07/18/89)

I'm having a problem using fscanf under Lattice 5.02.
I have a file where each line was written with

char flag1='0', flag2='+', line[80];
fprintf(file, "%c%c%s\n", flag1, flag2, line[80]);

I'm trying to read it back into the same variables one line at a time.
I've tried
fscanf(file, "%c%c%s\n", &flag1, &flag2, line);
fscanf(file, "%c%c%s ", ...);
fscanf(file, " %c%c%s", ...);
fscanf(file, "\n%c%c%s", ...);

All of these work on the first line but, on the second and subsequent lines,
give me a '\n' in flag1, a '0' in flag2, and a '+' in line[0]. The Lattice
library manual states that the space character will skip newlines, tabs, and
spaces, and therefore the second or third one should be correct. However, it
APPEARS as though Lattice is seeing the first %c and saying "I'm not going to
skip any whitespace."
I'm well versed in C but usually have avoided the STDIO routines for
various reasons in the past. Can anyone tell me what I'm doing wrong?
				-- Darren

morris-ng@cup.portal.com (Yuklung Morris Ng) (07/19/89)

RE: scanf() question:
 
It seems to me that the fprintf() is the problem.  When you try to print a
string to the file and you write "fprintf(file,"%s",line[80]);" what you are
writing is garbage. The program will try to write the string at address
&line + 80 and end with a \0 char. So the correct format should be
"fprintf(file,"%s",&line);"...
 
+---------+----------------------------+----------------------------------+
|      ///| Morris Y. L. Ng            | Usenet: morris-ng@cup.portal.com |
|     /// | Computer Science & Finance | Portal: Yuklung Morris Ng        |
|    ///  | San Jose State University  | Home  : (###)###-#### (Guess?!)  |
|\\\///   +----------------------------+----------------------------------+
| \XX/    |          "Be my Amiga!  And I will be your Amigo!"            |
+---------+---------------------------------------------------------------+

charles@hpcvca.CV.HP.COM (Charles Brown) (07/20/89)

> I'm trying to read it back into the same variables one line at a time.

> I've tried
> fscanf(file, "%c%c%s\n", &flag1, &flag2, line);
> fscanf(file, "%c%c%s ", ...);
> fscanf(file, " %c%c%s", ...);
> fscanf(file, "\n%c%c%s", ...);

> I'm well versed in C but usually have avoided the STDIO routines for
> various reasons in the past. Can anyone tell me what I'm doing wrong?
> 				-- Darren

I can't say what is specifically wrong here, but I can tell you a
better way to do this:  Use fgets() to read the line and then use
sscanf() to parse the line read by fgets().  The scanf() family is
ill-suited to parsing lines.
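
A minimal sketch of that fgets()/sscanf() approach, assuming every line in
the file is shorter than 128 characters. The names read_record() and
parse_record() are hypothetical, not from the thread, and the %79s width is
an added safeguard for an 80-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: parse one "<flag1><flag2><word>" record out of a
 * line that has already been read by fgets().  Returns 1 on success. */
int parse_record(const char *buf, char *flag1, char *flag2, char line[80])
{
    /* %79s stores at most 79 characters plus the terminating '\0'. */
    return sscanf(buf, "%c%c%79s", flag1, flag2, line) == 3;
}

/* Read the next record from fp.  Returns 1 on success, 0 at end of file
 * or when the line does not match the expected layout. */
int read_record(FILE *fp, char *flag1, char *flag2, char line[80])
{
    char buf[128];   /* assumption: longer than any line in the file */

    if (fgets(buf, sizeof buf, fp) == NULL)
        return 0;
    return parse_record(buf, flag1, flag2, line);
}
```

Because fgets() consumes the newline itself, nothing is left over to
confuse the %c at the start of the next record.
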
--
	Charles Brown	charles@cv.hp.com or charles%hpcvca@hplabs.hp.com
			or hplabs!hpcvca!charles or "Hey you!"
	Not representing my employer.
	"The guy sure looks like plant food to me." Little Shop of Horrors

mplevine@sbc2.sgi.com (Marshall Levine) (07/20/89)

In article <20578@cup.portal.com>, morris-ng@cup.portal.com (Yuklung Morris Ng) writes:
> string to the file and you write "fprintf(file,"%s",line[80]);" what you are
> ... should be "fprintf(file,"%s",&line);"...

Actually, I believe it should be:

fprintf(file,"%s",line);     or, if you want to be verbose:
fprintf(file,"%s",&line[0]);

You are going to print characters until you hit a '\0', but you are starting
at the memory location of the pointer to the string, rather than the string
itself.  No big deal, just one more core dump (or system crash)...
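
A minimal sketch of the corrected write, assuming line holds a
NUL-terminated string. The name write_record() is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write one record as the original post intended.
 * An array name such as `line` decays to a pointer to its first character,
 * which is what %s expects.  By contrast, line[80] is a single char, and
 * one past the end of an 80-element array at that. */
int write_record(FILE *fp, char flag1, char flag2, const char *line)
{
    return fprintf(fp, "%c%c%s\n", flag1, flag2, line);
}
```

A call such as write_record(file, flag1, flag2, line) passes the array
name itself, with no subscript.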


-- Marshall Levine

mplevine@sgi.com
mplevine@phoenix.princeton.edu

Advanced Systems Design, Silicon Graphics Inc.
Department of Computer Science, Princeton University


rap@peck.ardent.com (Rob Peck) (07/20/89)

In article <20578@cup.portal.com> morris-ng@cup.portal.com (Yuklung Morris Ng) writes:
>RE: scanf() question:
> 
>It seems to me that the fprintf() is the problem.  When you try to print a
>string to the file and you write "fprintf(file,"%s",line[80]);" what you are
>writing is garbage. The program will try to write the string at address
>&line + 80 and end with a \0 char. So the correct format should be
>"fprintf(file,"%s",&line);"...
                    ^^^^^


Looks like an "oops" in the correction.  This should be "&line[0]" or
just "line".  The way the correction is written, the user would be
writing X characters beginning at the memory location at which the
pointer named 'line' was stored.


Just couldn't let it get by.

Rob Peck

Ata@multics.radc.af.mil (John G. Ata) (07/22/89)

    Delivery-Date:  20 July 1989 09:17 edt
    Delivery-By:  Network_Server.Daemon (amiga-relay-request@louie.udel.e)
    Sender:  amiga-relay-request at UDEL.EDU
    Date:  Wednesday, 19 July 1989 17:45 edt
    From:  Charles Brown <charles at HPCVCA.CV.HP.COM>
    Subject:  Re: C scanf question
    To:  amiga-relay at UDEL.EDU
    Newsgroups:  comp.sys.amiga
    Organization:  Hewlett-Packard Co., Corvallis, Oregon
    
    > I'm trying to read it back into the same variables one line at a time.
    
    > I've tried
    > fscanf(file, "%c%c%s\n", &flag1, &flag2, line);
    > fscanf(file, "%c%c%s ", ...);
    > fscanf(file, " %c%c%s", ...);
    > fscanf(file, "\n%c%c%s", ...);
    
    > I'm well versed in C but usually have avoided the STDIO routines for
    > various reasons in the past. Can anyone tell me what I'm doing wrong?
    >                                       -- Darren
    

Sounds like you're forgetting the <NL> at the end of an input line.
Thus, the second time through the loop, the <NL> will be the first
character of input.  To get around this problem, do an extra getc after
you do the fscanf to eliminate the <NL>.  Then it should work properly.
For the fscanf, "%c%c%s" should do as format effectors.
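
A minimal sketch of the extra-getc idea, assuming each record is followed
by exactly one newline. The name read_record_getc() is hypothetical, and
the %79s width is an added safeguard for an 80-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: read one record with fscanf, then use an extra
 * getc to discard the character that stopped %s (normally the newline). */
int read_record_getc(FILE *fp, char *flag1, char *flag2, char line[80])
{
    if (fscanf(fp, "%c%c%79s", flag1, flag2, line) != 3)
        return 0;
    (void)getc(fp);   /* eat the '\n' so the next %c sees the next line */
    return 1;
}
```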

                              John G. Ata

MROBINSON@wash-vax.bbn.com (07/22/89)

[Charles Brown made a comment that the scanf family was ill-suited to
 parsing lines]

First, I can't see anything wrong with the original code.  Then again, I
don't know much about Lattice C.  I use scanf a bit, though, and wanted to
pass on some knowledge.  My reference book on C is Harbison and Steele's
"C:  A Reference Manual", mainly because it is the most useful book on C that
I've ever seen.  Many of the people on my hall at work stop by to borrow my
book on a daily basis; it's very irritating.  Please go get your own copy.

Anyway.  The versions of C that I use support formats of the following type:
fscanf(f,"%c%c%s%*[^\n]\n",&c1,&c2,str);
I'll explain the %*[^\n] format in steps.  The * says "whatever is read in for
this format, throw it away".  That means we're skipping something in the input.
The [] mean "match any string made of characters found within these brackets".
The ^ means "oh, I mean all the characters EXCEPT the ones I list".  Thus, this
format means "find all the characters before the end of the line, and throw
them away".  I use this format when I want to allow comments at the end of the
line.
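
A minimal sketch of the skip-to-end-of-line idea, assuming ANSI scanf
semantics. The name read_record_skip_rest() is hypothetical. One caveat
under those semantics: a %[ directive has to match at least one character,
so when nothing follows the string a combined format stops before its
trailing \n is processed. The sketch therefore issues the skip as a
separate call and removes the newline with getc.

```c
#include <stdio.h>

/* Hypothetical helper: read "<c1><c2><word>" and throw away whatever else
 * is on the line (for example a trailing comment), newline included. */
int read_record_skip_rest(FILE *fp, char *c1, char *c2, char str[80])
{
    if (fscanf(fp, "%c%c%79s", c1, c2, str) != 3)
        return 0;
    /* Discard everything up to, but not including, the newline.  A result
     * other than EOF means there is still input, so eat one character,
     * which is the newline itself. */
    if (fscanf(fp, "%*[^\n]") != EOF)
        (void)getc(fp);
    return 1;
}
```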

Now, about parsing lines.  If your line format is pretty simple, the scanf
family can handle it no problem.  If it's as complicated as the general case
of regular expressions, forget scanf, and make a real token-oriented parser.
If it's an LL or LR grammar, use yacc or Snobol or Prolog's definite clause
grammars.  If it's context sensitive to any real degree, use Prolog DCGs, and
take some smart pills.  Your example looked like it fell into the pretty
simple category, so take some time to learn scanf.

If any of you out there are seriously into parsing, I strongly encourage you
to look into Quintus or BIM Prolog and Warren and Pereira's Definite Clause
Grammars.  DCGs are way cool.  I heard a guy from Berkeley give a talk once
about using DCGs to translate among four hardware description languages.  I've
also used them myself for some medium-sized projects, with success and
pleasure.

--Max Robinson, mrobinson@wash-vax.bbn.com

jms@tardis.Tymnet.COM (Joe Smith) (07/22/89)

In article <19915@louie.udel.EDU> new@udel.EDU (Darren New) writes:
>I have a file where each line was written with
>  char flag1='0', flag2='+', line[80];
>  fprintf(file, "%c%c%s\n", flag1, flag2, line[80]);

You need to use fscanf(file, "%c%c%s%*c", &flag1, &flag2, line);
The asterisk is mentioned on page F-100 of the Lattice 4.0 manual, under
the fscanf/scanf description.  It says that the conversion is to be performed
(reading the next character) but the result not stored.  It snarfs up the
whitespace (blank, tab, or newline) that terminated the string read in by
the previous "%s".
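
A minimal sketch of the %*c form, assuming ANSI scanf semantics, in which
a suppressed conversion is not counted in the return value. The name
read_record_star_c() is hypothetical, and the %79s width is an added
safeguard for an 80-byte buffer.

```c
#include <stdio.h>

/* Hypothetical helper: read one record in a single fscanf call.  The
 * trailing %*c reads one character (the whitespace that stopped %s) and
 * discards it, so only three conversions are assigned and counted. */
int read_record_star_c(FILE *fp, char *flag1, char *flag2, char line[80])
{
    return fscanf(fp, "%c%c%79s%*c", flag1, flag2, line) == 3;
}
```
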
-- 
Joe Smith (408)922-6220 | SMTP: JMS@F74.TYMNET.COM or jms@tymix.tymnet.com
McDonnell Douglas FSCO  | UUCP: ...!{ames,pyramid}!oliveb!tymix!tardis!jms
PO Box 49019, MS-D21    | PDP-10 support: My car's license plate is "POPJ P,"
San Jose, CA 95161-9019 | narrator.device: "I didn't say that, my Amiga did!"

jms@tardis.Tymnet.COM (Joe Smith) (07/24/89)

In article <20201@louie.udel.EDU> MROBINSON@wash-vax.bbn.com writes:
>[Charles Brown made a comment that the scanf family was ill-suited to
> parsing lines]
>First, I can't see anything wrong with the original code.  Then again, I
>don't know much about Lattice C.  I use scanf a bit, though, and wanted to
>pass on some knowledge. 

You forgot to pass on the most important reason for NOT using scanf.
If you have "char line[80]" and use fscanf(f,"%s",line), what happens if
the input has more than 80 consecutive characters without a tab, blank
or newline?  The answer is that some poor variable in your program may
be changed.  And if the input is several K of nonblanks, fscanf will
cheerfully overwrite your stack and maybe your entire program as well.

The only safe way to use the scanf family on untrustworthy data is to
use sscanf on a string read by fgets or equivalent.

-- 
Joe Smith (408)922-6220 | SMTP: JMS@F74.TYMNET.COM or jms@tymix.tymnet.com
McDonnell Douglas FSCO  | UUCP: ...!{ames,pyramid}!oliveb!tymix!tardis!jms
PO Box 49019, MS-D21    | PDP-10 support: My car's license plate is "POPJ P,"
San Jose, CA 95161-9019 | narrator.device: "I didn't say that, my Amiga did!"

new@udel.EDU (Darren New) (07/25/89)

In article <455@tardis.Tymnet.COM> jms@tardis.Tymnet.COM (Joe Smith) writes:
>You forgot to pass on the most important reason for NOT using scanf.
>If you have "char line[80]" and use fscanf(f,"%s",line), what happens if
>the input has more than 80 consecutive characters without a tab, blank
>or newline?  The answer is that some poor variable in your program may
>be changed.  And if the input is several K of nonblanks, fscanf will
>cheerfully overwrite your stack and maybe your entire program as well.
>
>The only safe way to use the scanf family on untrustworthy data is to
>use sscanf on a string read by fgets or equivalent.

Two points:
1)  fscanf(f, "%79s", line) would do what you want there; the width of 79
    leaves room for the terminating '\0' in line[80].
2) in this case, since I'm generating an intermediate file and then
   reading it back in again, the data is indeed "trustworthy" at least
   as far as it needs to be.
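
Point 1 can be sketched as follows, assuming ANSI field-width semantics,
where the width counts stored characters and excludes the terminating
'\0', so an 80-byte buffer takes a width of 79. The name read_word() is
hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read one whitespace-delimited word into an 80-byte
 * buffer.  At most 79 characters are stored, followed by the '\0'; any
 * remaining nonblank input is left for the next call. */
int read_word(FILE *fp, char line[80])
{
    return fscanf(fp, "%79s", line) == 1;
}
```

A long run of nonblanks is then delivered in pieces of at most 79
characters instead of overrunning the buffer.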

I think the basic problem with my code is this:
The Lattice manual says on page L97 that whitespace in the format string
will cause input up to the next non-whitespace to be skipped.
However, K&R-II (revenge of the ANSI) says that the ANSI scanf ignores
whitespace in the format string, which is what appears to be happening
to me.  I suspect that the library routine was brought into compliance
and the manual was not properly updated. In any case, the point is
now somewhat moot in my case, as "line" was to hold a file name and
I realised that file names may have whitespace anyway; hence, I parse
the beginning with scanf and then use fgets to get the end of the line.
Thanks to all for the responses.   
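
A minimal sketch of that final approach: scanf for the two flags, then
fgets for the rest of the line, so the file name may contain blanks. It
assumes ANSI semantics, under which a blank in the format skips any amount
of white space in the input. The name read_entry() is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read two flag characters, then take the remainder
 * of the line verbatim as the file name. */
int read_entry(FILE *fp, char *flag1, char *flag2, char name[80])
{
    /* The leading blank skips white space before the flags, such as a
     * blank line between records. */
    if (fscanf(fp, " %c%c", flag1, flag2) != 2)
        return 0;
    if (fgets(name, 80, fp) == NULL)
        return 0;
    name[strcspn(name, "\n")] = '\0';   /* drop the trailing newline */
    return 1;
}
```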

	    More stupid questions coming soon to a VDT near you.
	    -- Darren

asaph@TAURUS.BITNET (07/26/89)

In article <455@tardis.Tymnet.COM> jms@tardis.Tymnet.COM (Joe Smith) writes:
>If you have "char line[80]" and use fscanf(f,"%s",line), what happens if
>the input has more than 80 consecutive characters without a tab, blank
>or newline?
>The only safe way to use the scanf family on untrustworthy data is to
>use sscanf on a string read by fgets or equivalent.
>
>Joe Smith (408)922-6220 | SMTP: JMS@F74.TYMNET.COM or jms@tymix.tymnet.com

   According to the Manx 3.2 book, you can limit the number of chars read from
a file via scanf by using a line like this:

char buffer[50];

:
:
fscanf (file,"%49s",buffer);
This should read no more than 50 chars including the null terminator.
For a more general read you could do

char buffer[BUFFSIZE];
fscanf (file,"%*s",BUFFSIZE,buffer); <-- I'm not sure about this one though!
If it doesn't work you might have to create the string "%" + num + "s", using
something like sprintf, for instance. (Talk about overkill!)
BTW, the %+num+char form is good for all types: %2d will read a decimal number
of no more than 2 chars.
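
A minimal sketch of the sprintf fallback, assuming ANSI scanf semantics.
There a '*' in a scanf format suppresses assignment rather than taking a
width argument as it does in printf, so the width has to be written into
the format string itself. The name read_bounded() and the value of
BUFFSIZE are hypothetical.

```c
#include <stdio.h>

#define BUFFSIZE 50   /* assumed buffer size for this sketch */

/* Hypothetical helper: read one word into buffer[BUFFSIZE], building the
 * field width into the format at run time. */
int read_bounded(FILE *fp, char buffer[BUFFSIZE])
{
    char fmt[16];

    /* "%%" emits a literal '%', so with BUFFSIZE 50 fmt becomes "%49s",
     * leaving one byte for the terminating '\0'. */
    sprintf(fmt, "%%%ds", BUFFSIZE - 1);
    return fscanf(fp, fmt, buffer) == 1;
}
```
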
  As for the parsing usefulness of scanf, I have this to contribute:
We were just recently given a project to do here at TAU, a sort of assembler
program. Needless to say, some line parsing is necessary there, and I know that
some of my friends used sscanf extensively for this. After all, the ability
to say %[ ~abcd] is pretty nice. I actually wrote a small pattern matcher,
but that's another story altogether.
                                        asaph
asaph@taurus.bitnet     - or -          asaph@math.tau.ac.il