[comp.lang.c] sscanf absurdity

mnc@m10ux.UUCP (Michael Condict) (07/13/88)

I'm not sure whether this belongs here or in comp.unix.wizards, but it is
probably a problem with most implementations of the C stdio library, so here
it is:

Many of you are probably aware of sscanf's bad reputation for execution time,
which is surprising for a routine that does no I/O, right?  Wrong!  The
AT&T Sys V Rel 2 implementation of sscanf (and presumably earlier versions)
DOES do I/O, or at least it tries to.  Look at sscanf in scanf.c and at
_filbuf in filbuf.c.  Note that sscanf fakes up a FILE structure for the
purpose of allowing getc to be called on the string.  It sets the _IOREAD
flag in the FILE structure to indicate that the string is read-only and
it sets the fd number to _NFILE, to indicate an illegal fd, i.e., that no
I/O should be done.  But if getc runs off the end of the buffer while trying
to satisfy a scanf format item, as happens with:

	sscanf("1234", "%d", &i);

then _filbuf will be called to refill the buffer, because %d has to look one
character past the last digit to know the number has ended.  _filbuf does not
notice that the _file field of the FILE struct is set to _NFILE and actually
calls read on the illegal fd, which returns an error after hundreds or
thousands of wasted instructions.  This can easily add 20% to your process's
CPU time if you are calling sscanf repeatedly.
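
For anyone who wants to see the wasted call in isolation, here is a minimal
sketch, assuming a POSIX system.  The function name is made up for
illustration, and sysconf(_SC_OPEN_MAX) merely stands in for _NFILE; none of
this is taken from the Sys V source.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Returns the errno left behind by read() on a descriptor that should not
 * be open.  Assumption: sysconf(_SC_OPEN_MAX) is normally one past the
 * largest usable descriptor, the role _NFILE plays in the post.  If the
 * limit is indeterminate, sysconf returns -1, which is never a valid
 * descriptor either.
 */
int errno_from_read_on_bad_fd(void)
{
    char buf[64];
    int bad_fd = (int)sysconf(_SC_OPEN_MAX);

    errno = 0;
    if (read(bad_fd, buf, sizeof buf) != -1)
        return 0;       /* unexpected: the read did not fail */
    return errno;       /* the kernel was entered just to learn this */
}
```

On the systems I would expect this to run on, the result is EBADF: a full
trip into the kernel to find out something the library already knew.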

The fix is simple -- insert the following before the test of the _IOREAD
flag in _filbuf:

	if (iop->_file >= _NFILE) return(EOF);
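
To show where the check sits relative to the read, here is a rough model of
the refill path.  It is only a sketch: struct str_stream, model_filbuf,
model_getc, parse_digits and the nfile field are hypothetical stand-ins for
the fake FILE, _filbuf, getc and _NFILE, and the real filbuf.c will differ
in detail.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the fake FILE that sscanf builds over a string. */
struct str_stream {
    const char *ptr;    /* next unread character */
    int cnt;            /* characters left in the string */
    int file;           /* descriptor; out of range for a string */
    int nfile;          /* stand-in for _NFILE */
};

static int read_calls;  /* how many times the model reached read() */

/* Model of the refill routine; `guarded` turns the proposed check on. */
static int model_filbuf(struct str_stream *iop, int guarded)
{
    char buf[64];

    if (guarded && iop->file >= iop->nfile)
        return EOF;     /* string-backed stream: nothing to refill */

    read_calls++;
    if (read(iop->file, buf, sizeof buf) <= 0)
        return EOF;     /* same answer, reached via a failed system call */
    return (unsigned char)buf[0];
}

/* Model of getc: take from the string, refill only when it is exhausted. */
static int model_getc(struct str_stream *iop, int guarded)
{
    if (iop->cnt > 0) {
        iop->cnt--;
        return (unsigned char)*iop->ptr++;
    }
    return model_filbuf(iop, guarded);
}

/*
 * Parse leading decimal digits the way "%d" must: keep reading until a
 * non-digit or EOF.  Stores the value in *out and returns the number of
 * read() attempts the parse triggered.
 */
int parse_digits(const char *s, int guarded, int *out)
{
    struct str_stream st;
    int c, value = 0;

    st.ptr = s;
    st.cnt = (int)strlen(s);
    st.nfile = (int)sysconf(_SC_OPEN_MAX);  /* assumption: plays _NFILE */
    st.file = st.nfile;                     /* deliberately out of range */

    read_calls = 0;
    while ((c = model_getc(&st, guarded)) != EOF && c >= '0' && c <= '9')
        value = value * 10 + (c - '0');
    *out = value;
    return read_calls;
}
```

Calling parse_digits("1234", 0, &v) and then parse_digits("1234", 1, &v)
should give the same v both times; only the number of read attempts
differs, which is the whole point of the one-line fix.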

I've just checked the BSD implementation and it doesn't have this problem,
so BSD Vaxen and Suns are probably okay.  Amdahl UTS (System V Rel 1)
definitely does have the problem.

-- 
Michael Condict		{ihnp4|vax135|cuae2}!m10ux!mnc
AT&T Bell Labs		(201)582-5911    MH 3B-416
Murray Hill, NJ