chris@mimsy.umd.edu (Chris Torek) (01/04/90)
The draft says that scanf's %e, %f, and %g formats match `an optionally signed floating-point number, whose format is the same as expected for the subject string of the |strtod| function.' This in turn is defined as an optional sign, followed by a non-optional digit sequence (optionally containing a decimal point), followed by an optional exponent. The exponent, if present, has the form: `e' or `E', followed by an optional sign, followed by a non-optional digit sequence. Thus, the question that applies to strtol and strtoul (as to whether a sign followed by no digits is acceptable) does not apply.

A different question then rears its ugly head: If the number `looks right' up to a point, but then fails to match the constraints imposed on it, what is to happen? We have the following possible sequences we can feed scanf() when it is matching %e, %f, or %g:

    .e10        [missing mantissa digits]
    +1.2345e    [missing exponent digits]
    -e          [missing both digits]

This much is clear: These can only be considered a matching failure. The draft goes on to say, however, that `If conversion terminates on a conflicting input character, the offending input character is left unread in the input stream.' This can only be meant to imply `conflicting with a literal character from the format string', not `conflicting with the format required by a conversion such as %f'.

Alas, the draft does *not* say what input character(s) are left unread in the case of a matching failure. This question arises only for numeric formats (and perhaps only for floating-point, depending upon whether `%d' should accept bare `-' and `+').

Note that the most useful answer---that the entire malformed floating point number remains unconsumed---requires mandating an arbitrary amount of pushback (or, equivalently, lookahead): the `floating point number'

    1.111111111111111111111111111111111111111111111111111111111e-

looks just fine until the lack of a digit following the `-' shows up.
The question, then, can be stated as follows: What is the condition of the input stream when a matching failure occurs `deep inside' a conversion? (We intend to allow an arbitrary amount of pushback, so whatever the answer to this question, it is easy for me to handle; but I want to know what the standard intends.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu    Path: uunet!mimsy!chris
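
For reference, the three problem inputs can be tried against strtod() itself, since the draft defines the scanf formats in terms of strtod's subject string. The helper name valid_prefix_len is hypothetical and is not part of any library. This is only a sketch of where the well-formed prefix ends; it says nothing about what scanf() leaves unread in a stream, which is the open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: length of the longest initial part of s that
 * strtod() accepts as a subject string; 0 means no conversion. */
static size_t valid_prefix_len(const char *s)
{
    char *end;

    assert(s != NULL);
    (void)strtod(s, &end);      /* end is set to s when nothing converts */
    return (size_t)(end - s);
}
```

With a conforming strtod(), ".e10" and "-e" should convert nothing, while "+1.2345e" should convert the seven characters "+1.2345" and stop at the `e'.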
gwyn@smoke.BRL.MIL (Doug Gwyn) (01/05/90)
In article <21625@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>    +1.2345e    [missing exponent digits]

The characters through '5' are consumed and 'e' remains "unread in the input stream", which in practice means pushed-back or its moral equivalent. (If available, peek-ahead could be employed to avoid having to push anything back. It amounts to the same thing EXCEPT for possible interaction with ungetc(), which is an ugly can of worms. The Standard deliberately does not consider these "unread" characters as pushed-back in the sense of ungetc().)

>The draft goes on to say, however, that `If conversion terminates on
>a conflicting input character, the offending input character is left
>unread in the input stream.' This can only be meant to imply `conflicting
>with a literal character from the format string', not `conflicting with
>the format required by a conversion such as %f'.

No; see also line 35 on page 136 of the December 1988 draft. It really does mean that the peeked-ahead characters that failed to match remain "unread".

>Note that the most useful answer---that the entire malformed floating
>point number remains unconsumed---requires mandating an arbitrary amount
>of pushback (or, equivalently, lookahead): the `floating point number'
>    1.111111111111111111111111111111111111111111111111111111111e-
>looks just fine until the lack of a digit following the `-' shows up.

No, the string up to the 'e' is of the expected form and must be properly converted. Only three characters of peek-ahead suffice to detect that the apparent exponent part really isn't an exponent part.
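
The bounded peek-ahead described above can be sketched as a small scanner. This is an illustration only, under stated assumptions: it works on a string rather than a FILE stream, the name scan_float_len is hypothetical, and it follows the reading given in this reply (mantissa consumed, a failed exponent left unread), not any particular library's scanf().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch: number of characters of s that would be consumed
 * as a floating-point number; 0 means a matching failure. */
static size_t scan_float_len(const char *s)
{
    size_t i = 0, digits = 0;

    assert(s != NULL);
    if (s[i] == '+' || s[i] == '-')
        i++;
    while (isdigit((unsigned char)s[i])) { i++; digits++; }
    if (s[i] == '.') {
        i++;
        while (isdigit((unsigned char)s[i])) { i++; digits++; }
    }
    if (digits == 0)
        return 0;               /* no mantissa digits: matching failure */

    /* Peek at most three characters: 'e', optional sign, first digit. */
    if (s[i] == 'e' || s[i] == 'E') {
        size_t j = i + 1;

        if (s[j] == '+' || s[j] == '-')
            j++;
        if (isdigit((unsigned char)s[j])) {
            while (isdigit((unsigned char)s[j]))
                j++;
            i = j;              /* the exponent is real: consume it */
        }
        /* otherwise i stays at the 'e', which remains unread */
    }
    return i;
}
```

Under these assumptions, "+1.2345e" yields 7 (everything through the '5'), and a long mantissa followed by "e-" yields the length of the mantissa alone, since the decision never needs to look past the character after the sign.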