casey@lll-crg.llnl.gov (Casey Leedom) (02/13/89)
Description:
Ungetc will put characters back into an _IOSTRG input stream. This
will cause a segmentation violation if the string is in read only
memory (shared text segment). For example:
if (!some_option) {
opt = getenv(...);
some_option = opt ? opt : "some_string";
}
...
sscanf(some_option, "some_format", ...);
In some cases, some_option will refer to the constant string
"some_string". Since sscanf uses ungetc to back up over input
it decides it hadn't wanted to look at yet, we write into the
constant string "some_string".
Repeat-By:
Use gcc to compile the following program without the
-fwritable-strings option:
main()
{
int i, j;
sscanf("4x4", "%dx%d", &i, &j);
printf("%d %d\n", i, j);
}
Fix:
Debatable. I fixed ungetc, but who knows? Maybe we do want to
be able to do ungetc's on a string argument. Currently _IOSTRG
is only used internally for sscanf and sprintf, so changing ungetc
doesn't break anything now.
Another option would be to replace all uses of ungetc in doscan
with fseek(iop, -1L, 1) but I don't think fseek works on _IOSTRG
arguments and I think it's a lot more expensive to call.
Another option is to separate out the IO stream backup function
from putting a [possibly different] character back on the stream.
But this involves either defining a new interface, or extending
the definition of ungetc's interface to accept something like
ungetc(__IOS_BACKUP);
And Finally, we could have read only streams and simply have
sscanf set that flag in addition to _IOSTRG. But this would add
extra processing for every putc, etc.
*** lib/libc/stdio/ungetc.c-dist Wed Mar 26 18:08:43 1986
--- lib/libc/stdio/ungetc.c Sun Feb 12 13:45:30 1989
***************
*** 18,24 ****
return (EOF);
iop->_cnt++;
! *--iop->_ptr = c;
return (c);
}
--- 18,27 ----
return (EOF);
iop->_cnt++;
! if (iop->_flag & _IOSTRG)
! --iop->_ptr;
! else
! *--iop->_ptr = c;
return (c);
}kre@cs.mu.oz.au (Robert Elz) (02/14/89)
In article <20282@lll-winken.LLNL.GOV>, casey@lll-crg.llnl.gov (Casey Leedom) writes: > Description: > Ungetc will put characters back into an _IOSTRG input stream. > Fix: > Debatable. Maybe better to have ungetc() look at the character that's in the buffer - if its the same as the one being pushed back (which it always will be in sscanf()) then don't do the write. ungetc() is rare enough that the extra mem read & test aren't going to matter. That way you don't get a segmentaion violation with write only strings if you're just doing sscanf(), but you will if you try random other ungetc's on a string (somehow). You may also win doing this with normal writeable buffers - if the buffer is currently mapped shared read, copy on write, you may be able to avoid the copy (yes, I know, that will happen once a century...) kre
aida@porthos.csl.sri.com (Hitoshi Aida) (02/14/89)
In article <20282@lll-winken.LLNL.GOV> casey@lll-crg.llnl.gov (Casey Leedom) writes: >Fix: > Debatable. I fixed ungetc, but who knows? Maybe we do want to > be able to do ungetc's on a string argument. Currently _IOSTRG > is only used internally for sscanf and sprintf, so changing ungetc > doesn't break anything now. > > Another option would be to replace all uses of ungetc in doscan > with fseek(iop, -1L, 1) but I don't think fseek works on _IOSTRG > arguments and I think it's a lot more expensive to call. > > Another option is to separate out the IO stream backup function > from putting a [possibly different] character back on the stream. > But this involves either defining a new interface, or extending > the definition of ungetc's interface to accept something like > ungetc(__IOS_BACKUP); I think this is the correct option. I prefer a new interface. Anyway, for the moment, following is more consistent. *** lib/libc/stdio/ungetc.c-dist Wed Mar 26 18:08:43 1986 --- lib/libc/stdio/ungetc.c Sun Feb 12 13:45:30 1989 *************** *** 18,24 **** return (EOF); iop->_cnt++; ! *--iop->_ptr = c; return (c); } --- 18,31 ---- return (EOF); iop->_cnt++; ! if (iop->_flag & _IOSTRG) { ! if ((*--iop->_ptr & 0377) != c) { ! iop->_cnt--; ! iop->_ptr++; ! return (EOF); ! } ! } else ! *--iop->_ptr = c; return (c); } Hitoshi AIDA (aida@csl.sri.com) Computer Science Laboratory, SRI International
jack@cwi.nl (Jack Jansen) (02/14/89)
I ran into this problem as well, some time ago. My fix was to have ungetc() only push the character back if it isn't there already, i.e. if( *--fp->_ptr != ch ) *fp->_ptr = ch; Makes sscanf work with read-only strings, and still keeps the semantics the same. -- -- Fight war, not wars | Jack Jansen, jack@cwi.nl Destroy power, not people! -- Crass | (or mcvax!jack)
ditto@cbmvax.UUCP (Michael "Ford" Ditto) (02/14/89)
In article <20282@lll-winken.LLNL.GOV> casey@lll-crg.llnl.gov (Casey Leedom) writes: > ! if (iop->_flag & _IOSTRG) > ! --iop->_ptr; > ! else > ! *--iop->_ptr = c; As an alternative to the above four lines, why not: if (*--iop->_ptr != (char)c) *iop->ptr = c; This will give anyone ungetc'ing something weird into a readonly string the segmentation fault they deserve, and still allow ungetc'ing into a writable string (even though nobody should be doing that). -- -=] Ford [=- "The number of Unix installations (In Real Life: Mike Ditto) has grown to 10, with more expected." ford@kenobi.cts.com - The Unix Programmer's Manual, ...!sdcsvax!crash!elgar!ford 2nd Edition, June, 1972. ditto@cbmvax.commodore.com
karl@haddock.ima.isc.com (Karl Heuer) (02/22/89)
The solutions I've seen all involve changing ungetc(). It would also be possible to change the scanf() internals to just decrement _ptr instead of calling ungetc(); I believe that, in the usual stdio implementation, this would do the right thing for both string files and real files. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint