[comp.bugs.4bsd] doscan.o bug in Ultrix 1.2 & 2.0

tom@nsta.UUCP (07/07/87)

Bug in doscan.o in /lib/libc.a on ULTRIX 1.2 and ULTRIX 2.0.
Try the following line in your program.c:

val = fscanf(file,"%*[ \t\n]%74[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]", namebuf);

If your input does not begin with a space, tab, or newline, fscanf will
return 0, which is not right, since "*" should match "0 or more".

The Pascal compiler uses this type of scanf to read enumerated type data.
For example, try the following program
(you can use any enumerated type instead of boolean):

program test(input,output);
var 

    a :boolean;

begin

    readln(a);
    writeln(a);
end.

Try the word "false" as input, with and without a leading blank.
Without the leading blank, you will get the following error:
Trace/BPT trap
Unknown name "" found on enumerated type read

I have informed DEC; meanwhile, I have installed the 4.3BSD doscan.o into
the Ultrix libc.a.

guy@gorodish.UUCP (07/07/87)

> val = fscanf(file,"%*[ \t\n]%74[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]", namebuf);
> 
> If your input does not begin with a space, tab, or newline, fscanf will
> return 0, which is not right, since "*" should match "0 or more".

I presume the "0 or more" comes from the ULTRIX manual; I see nothing like
it in the 4.3BSD manual.  The 4.3BSD manual does not indicate one way
or the other whether a zero-length match for a "%[]" item is
considered a successful or failed match.  If it is a successful
match, then a zero-length match of the "%74[abc...]" item is also a
successful match.

It sounds like they may be using the S5 version of "doscan", either
for S5 compatibility or because it fixes a number of bugs in the 4.3
version.  The SVID says for "%[]" conversions:

	...At least one character must match for this conversion to
	be considered successful.

and the October 1, 1986 ANSI C draft says:

	...If the length of the input item is zero, the execution of
	the directive fails: this condition is a matching failure,
	unless an error prevented input from the stream, in which
	case it is an input failure.

In other words, this behavior is most likely here to stay.  The bug
is in the Pascal runtime, not in "scanf".
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com
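
The matching-failure rule quoted above from the SVID and the ANSI C draft
can be checked with a short C sketch.  It is only an illustration: the
helper names scan_with_scanset and scan_with_space and the NAME_SET macro
are made up for this example, sscanf stands in for fscanf, and the
string-literal concatenation assumes an ANSI C compiler.  The second helper
replaces "%*[ \t\n]" with a white-space directive, which matches zero or
more white-space characters.

```c
#include <stdio.h>

/* Scanset from the original post, spelled out in full (assumed macro name). */
#define NAME_SET \
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

/* Format from the original post: "%*[ \t\n]" needs at least one character
   to match under the SVID / ANSI draft rule.  namebuf must hold 75 bytes. */
int scan_with_scanset(const char *input, char *namebuf)
{
    return sscanf(input, "%*[ \t\n]%74[" NAME_SET "]", namebuf);
}

/* Alternative: a leading space in the format skips zero or more
   white-space characters, so missing white space is not a failure. */
int scan_with_space(const char *input, char *namebuf)
{
    return sscanf(input, " %74[" NAME_SET "]", namebuf);
}
```

On a library that follows the quoted rule, scan_with_scanset("false", buf)
returns 0, while scan_with_scanset(" false", buf) and
scan_with_space("false", buf) both return 1 and store "false" in buf.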

allbery@ncoast.UUCP (Brandon Allbery) (07/11/87)

As quoted from <309@nsta.UUCP> by tom@nsta.UUCP (Tom Gorodecki):
+---------------
| Bug in doscan.o in /lib/libc.a on ULTRIX 1.2 and ULTRIX 2.0.
| Try the following line in your program.c:
| 
| val = fscanf(file,"%*[ \t\n]%74[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]", namebuf);
| 
| If your input does not begin with a space, tab, or newline, fscanf will
| return 0, which is not right, since "*" should match "0 or more".
+---------------

* means match but don't assign.  As to whether %[ succeeds when it matches
zero characters, this appears to be a matter of interpretation; there isn't
a single answer.  (I discovered this when UC 0.4.3 used %[ in userfile
parsing; I recoded the getuser() routine to avoid it for 0.4.4.)
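
To illustrate the first point: a "*" after the "%" suppresses assignment, so
the item is still matched and consumed, but nothing is stored and it does
not count toward the return value.  A minimal sketch, where second_int is a
hypothetical helper name:

```c
#include <stdio.h>

/* Skip the first integer in `input` (matched but not assigned) and
   store the second one in *out.  Returns the sscanf() result. */
int second_int(const char *input, int *out)
{
    return sscanf(input, "%*d %d", out);
}
```

second_int("12 34", &n) returns 1 and sets n to 34; only the assigned item
is counted.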

In general, it's best not to use fscanf() unless you are absolutely certain
of the file's format.  (News B2.11 is quite fragile in this regard.)  I
would solve it (for anything but Pascal I/O )-: by calling fgets(), then
skipping over leading whitespace and calling sscanf() on the string.  For
Pascal, it gets harder; the "best" way is to skip whitespace, then read
characters until you match the longest value-string defined for the scalar
type being input.  (If the list is organized in alphabetical order, or an
array of pointers to the values is so organized, this can be done optimally.)
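
A rough sketch of the fgets()/sscanf() approach described above, followed by
a lookup in an alphabetically sorted name table.  The names read_name,
lookup_name and NAME_SET, the 256-byte line buffer, and the use of bsearch()
are assumptions made for this example; the binary search on a complete token
is a simplification of the character-by-character longest match suggested
for Pascal.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Scanset from the original post (assumed macro name). */
#define NAME_SET \
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

/* Read one line from fp, skip leading white space by hand, then let
   sscanf() pick out the name.  namebuf must hold at least 75 bytes.
   Returns 1 on success, 0 if the line holds no name, EOF at end of file. */
int read_name(FILE *fp, char *namebuf)
{
    char line[256];
    const char *p;

    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;
    for (p = line; isspace((unsigned char)*p); p++)
        ;
    return sscanf(p, "%74[" NAME_SET "]", namebuf) == 1 ? 1 : 0;
}

/* bsearch() comparator: key is the name, elem points at a table entry. */
static int cmp_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Return the index of name in a table sorted in strcmp() order, or -1. */
int lookup_name(const char *name, const char *const *sorted, size_t n)
{
    const char *const *hit =
        (const char *const *)bsearch(name, sorted, n, sizeof *sorted, cmp_name);

    return hit ? (int)(hit - sorted) : -1;
}
```

With a table such as { "false", "true" }, a line read by read_name() can be
passed straight to lookup_name(); a result of -1 corresponds to the
"Unknown name" case.
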
-- 
[Copyright 1987 Brandon S. Allbery, all rights reserved] \ ncoast 216 781 6201
[Redistributable only if redistribution is subsequently permitted.] \ 2400 bd.
Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc
{{ames,harvard,mit-eddie}!necntc,{well,ihnp4}!hoptoad,cbosgd}!ncoast!allbery
<<The opinions herein are those of my cat, therefore they must be correct!>>