[comp.unix.aux] Problem in sscanf with gcc-1.37 ?

lamarche@ireq.hydro.qc.ca (07/27/90)

The following problem arose when I tried to compile Dclock
(X11R4 client, comp.sources.x). Here is a short program that
demonstrates it.

#include <stdio.h>
main  ()
{
   char works[] = "Works 1 2";
   char* fails = "Fails 1 2";
   int d1,d2;

   printf ( "String: %s ", works );
   fflush(stdout);
   sscanf ( works, "Works %1d %1d",  &d1, &d2 );
   printf ( "--> Read: %d %d\n", d1, d2 );

   printf ( "String: %s ", fails );
   fflush(stdout);
   sscanf ( fails, "Fails %1d %1d",  &d1, &d2 );
   printf ( "--> Read: %d %d\n", d1, d2 );
}

That gives the following output:
String: Works 1 2 --> Read: 1 2
String: Fails 1 2 Memory fault - core dumped

Conclusion:
We can printf <fails> but we can't sscanf it.

Since this program works with cc, is this behavior normal with gcc ? 
  
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| Louis Lamarche, IREQ |       lamarche@ireq.hydro.qc.ca
| CP 1000, Varennes    |                 or
| QC, Canada, J3X 1S1  |  514-652-8077 (office)  514-324-2919 (home)

rmtodd@servalan.uucp (Richard Todd) (07/27/90)

lamarche@ireq.hydro.qc.ca appears to have run afoul of the classic 
scanf/writable strings problem, wherein:

>#include <stdio.h>
>main  ()
>{
>   char works[] = "Works 1 2";
>   char* fails = "Fails 1 2";
>   int d1,d2;

>   sscanf ( works, "Works %1d %1d",  &d1, &d2 );
works ok, but
>   sscanf ( fails, "Fails %1d %1d",  &d1, &d2 );
doesn't.  The problem is the interaction of a new feature in GCC with a long-
standing bug in stdio.  Here goes:

  Sscanf, like all the scanf functions, reads its input (either from a string
in memory or from a file) and parses it.  Internally, the scanf functions do
their reading by calling fgetc() and the like; reads from the string argument
passed to sscanf are done by making a "fake" FILE structure that fgetc() and
the like know represents a string in memory and not an actual file.  Now, when
parsing, the scanf routines sometimes need to backtrack; the obvious way to
back up one character is to call ungetc().  Now ungetc(), when working on a
fake FILE pointing to a string, writes back into the string the character that
was just read from it.  Ordinarily, this is just a pointless waste of a few
microseconds, but it causes problems in a program compiled with gcc.
  Why?  Well, gcc implements a new feature allowed by the ANSI standard,
where string constants are actually constants--you can't write to them,
since they are placed in text space instead of data space.  Attempting to
write into a string constant will cause a memory fault.  Hence, when you
pass the address of a string constant to sscanf, it will die.
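
  The mechanism described above can be sketched with a toy string-backed
stream.  This is only an illustration under stated assumptions: struct
strstream, str_getc(), str_ungetc() and scan_uint() are made-up names, not
the real stdio internals, and the real scanning code is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the "fake" FILE that sscanf builds around
   its string argument; this is not the real stdio structure. */
struct strstream {
    char *base;  /* start of the string being scanned */
    char *ptr;   /* next character to read */
};

/* Read one character, or return -1 at the terminating NUL. */
static int str_getc(struct strstream *s)
{
    if (*s->ptr == '\0')
        return -1;
    return (unsigned char)*s->ptr++;
}

/* Push a character back the way the old ungetc() is described as doing:
   step the pointer back AND store the character.  The store is the
   write that faults when the buffer is a read-only string constant. */
static int str_ungetc(int c, struct strstream *s)
{
    if (s->ptr == s->base)
        return -1;
    *--s->ptr = (char)c;  /* redundant write into the caller's string */
    return c;
}

/* Parse one unsigned decimal number, backing up over the first
   non-digit as a scanf-style parser would.  Returns the number of
   digits consumed; *out is set only if at least one digit was read. */
static int scan_uint(struct strstream *s, int *out)
{
    int c, n = 0, value = 0;

    assert(s != NULL && out != NULL);
    while ((c = str_getc(s)) >= '0' && c <= '9') {
        value = value * 10 + (c - '0');
        n++;
    }
    if (c != -1)
        str_ungetc(c, s);  /* backtrack one character */
    if (n > 0)
        *out = value;
    return n;
}
```

  Called on a writable array such as char buf[] = "12 34", scan_uint()
stores the space back over itself and nothing visible changes.  If the
stream pointed at a string constant placed in read-only text space, that
same store in str_ungetc() is where the fault would occur.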
  Now, let's look at those sample code lines more closely:

>   char* fails = "Fails 1 2";
    This declares a pointer to char, and initializes it to point to the 
string constant "Fails 1 2".  Hence, when you do sscanf(fails,...), you get 
a memory fault.  

>   char works[] = "Works 1 2";
  This, though it looks similar to the above, is very different.  This 
allocates an array in automatic (stack-based) storage, and initializes it 
with the sequence of characters 'W','o',...'2',0.  Since works[] is
allocated in the stack region, which is writable, sscanf() has no problem.
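
  The difference between the two declarations can be checked directly.
The sketch below is an illustration only; show_storage() is a
hypothetical name, and the pointer is declared const here (unlike in the
original program) so that the compiler itself would reject a write
through it.

```c
#include <stddef.h>

/* Report how much storage each declaration occupies in this function. */
static void show_storage(size_t *array_size, size_t *pointer_size)
{
    char works[] = "Works 1 2";       /* 10-byte array, copied per call   */
    const char *fails = "Fails 1 2";  /* pointer to the constant itself   */

    works[0] = 'w';  /* legal: the array is a private, writable copy */
    /* Writing through fails is rejected at compile time because of the
       const; without const it compiles and may fault at run time. */

    *array_size = sizeof works;    /* 9 characters plus the NUL */
    *pointer_size = sizeof fails;  /* size of a pointer, not of the text */
}
```

  On any platform the first size is 10; the second is whatever
sizeof(char *) happens to be there.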

  So, what do you do to try to get these X programs going that call
sscanf() with string constants?  Well, for starters, change the makefiles
to add "-fwritable-strings" to the options for gcc.  This causes gcc to put
string constants in data space, just like cc does, so that writes to string
constants succeed.  This will get those programs compiled and working.  
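
  If editing the source is an option, a possible alternative to the
compiler flag is to hand sscanf() a writable copy of the text.  This is
a minimal sketch, not something taken from Dclock: scan_two_digits() and
the 64-byte buffer size are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into a local writable buffer, then run
   sscanf() on the copy.  Returns the number of fields converted, or -1
   if src does not fit in the buffer. */
static int scan_two_digits(const char *src, int *d1, int *d2)
{
    char buf[64];               /* assumed large enough for the example */
    size_t len = strlen(src);

    if (len >= sizeof buf)
        return -1;
    memcpy(buf, src, len + 1);  /* copy the text and its NUL terminator */
    return sscanf(buf, "Fails %1d %1d", d1, d2);
}
```

  Because buf lives on the stack, any write-back done inside sscanf()
lands in writable memory whether or not the caller passed a string
constant.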
--
Richard Todd	rmtodd@uokmax.ecn.uoknor.edu  rmtodd@chinet.chi.il.us
	rmtodd@servalan.uucp

dwb@archer.apple.com (David W. Berry) (07/27/90)

In article <2204@s3.ireq.hydro.qc.ca> lamarche@ireq.hydro.qc.ca () writes:
	What you're seeing is a side effect of the fact that gcc
puts strings into text space.  To defeat this you can specify the
-fwritable-strings flag to gcc.

>main  ()
>{
>   char works[] = "Works 1 2";
	This allocates 10 bytes on the stack and then copies the string
from text space into that stack space.  Under most pre-ANSI C compilers,
initializing an automatic array this way isn't even allowed...

>   char* fails = "Fails 1 2";
	This allocates 4 bytes on the stack and puts a pointer to the
string in text space in the stack space.

Since the data that fails points to is not writable, when _doscan tries
to do an ungetc, it fails with a bus error.

	David
	David W. Berry			A/UX Toolbox Engineer
	dwb@apple.com