[comp.lang.c] How to validate input?

browns@iccgcc.decnet.ab.com (Stan Brown) (11/30/90)

Updated version of this post; the previous one was cancelled.

Here's a puzzler.  We want to have the user type an octal integer at the
keyboard, and find out whether the user actually entered one, and
whether there were any invalid characters on the line.  We use fgets( )
to read a line from the keyboard, then decode it via

    unsigned long octa;    /* %lo stores through an unsigned long * */
    char extra;
    int num_fields;

    num_fields = sscanf(kbd_string, "%lo%c", &octa, &extra);
    if (num_fields == 1) {
	/* do more stuff */
    }
    
The problem is, when the user types just "8" on the line, num_fields is
coming back as 1!

The standard says sscanf stops converting when it finds an invalid
character, so at first blush we expected num_fields to be 0 because the
first character in the string isn't an octal digit.  But working through
the definition of %o, which is in terms of strtoul( ), and in working
through the definition of the latter, we find that a "base part" of no
characters is considered valid.

So we have two questions:
    1. To be standard-conforming, should the sscanf call above, with the
       input string containing "8\0", return a value of 0, 1, or 2?
    2. What is the best way to accomplish what we're trying to
       accomplish, i.e. to check quickly that the user typed a valid
       octal number and nothing else?

Email suggestions were to use %[01234567].
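
A minimal sketch of how the %[01234567] idea might be used, assuming the
trailing newline from fgets( ) has already been removed.  The name
is_octal_scanset and the 31-digit buffer are illustrative assumptions,
not part of the emailed suggestions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value if s consists of
 * octal digits and nothing else, 0 otherwise.  Overflow of unsigned
 * long is not checked in this sketch. */
int is_octal_scanset(const char *s, unsigned long *out)
{
    char digits[32];    /* room for 31 digits plus the terminator */
    char extra;

    /* %31[01234567] needs at least one octal digit to match.  %c then
     * succeeds only if something is left over, so exactly one converted
     * field means "digits and nothing else". */
    if (sscanf(s, "%31[01234567]%c", digits, &extra) != 1)
        return 0;
    *out = strtoul(digits, NULL, 8);
    return 1;
}
```

Unlike %lo, the scanset does not skip leading white space or accept a
sign, so " 17" and "-17" are both rejected here.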

BTW, code must run on VAX and Microsoft C.  This rules out using %n
to find out that no valid characters were received.
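
For reference, the %n check being ruled out might look like the sketch
below on a library that does support %n.  The helper name is_octal_pct_n
is hypothetical, and the string is again assumed to have no trailing
newline.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the value if %lo consumed
 * the whole string, 0 otherwise. */
int is_octal_pct_n(const char *s, unsigned long *out)
{
    unsigned long value;
    int used = 0;    /* stays 0 if sscanf never reaches %n */

    /* %n records how many characters have been consumed so far and is
     * not counted in the return value. */
    if (sscanf(s, "%lo%n", &value, &used) != 1)
        return 0;
    if (used == 0 || s[used] != '\0')
        return 0;    /* nothing consumed, or trailing junk */
    *out = value;
    return 1;
}
```

Because %lo skips leading white space and accepts a sign, this version
lets " 17" and "+17" through; an extra check on s[0] would be needed to
forbid those.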


Please do not attribute these remarks to any other person or company.
                                   email: browns@iccgcc.decnet.ab.com
Stan Brown, Oak Road Systems, Cleveland, Ohio, USA    +1 216 371 0043

chris@mimsy.umd.edu (Chris Torek) (11/30/90)

In article <2195.2754fcc2@iccgcc.decnet.ab.com>
browns@iccgcc.decnet.ab.com (Stan Brown) writes:
[regarding *scanf("%o%c"...) with input beginning with `8']
>The standard says sscanf stops converting when it finds an invalid
>character, so at first blush we expected num_fields to be 0 because the
>first character in the string isn't an octal digit.  But working through
>the definition of %o, which is in terms of strtoul( ), and in working
>through the definition of the latter, we find that a "base part" of no
>characters is considered valid.

I got a different answer when I wrote my `scanf' innards: only a few
formats are allowed to be `empty', and none of the numeric conversion
formats are included in those few.  So while "" is a `proper' octal
number to strtoul(), it is not a `proper' octal number to %o and %o
must stop with a matching failure, causing the sscanf call to return 0.
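
A small probe along these lines can show what a given library actually
does with the call in question.  The function name count_fields is
hypothetical; it only wraps the sscanf call from the parent article.

```c
#include <stdio.h>

/* Hypothetical probe: returns whatever the local sscanf reports for
 * the "%lo%c" format applied to s. */
int count_fields(const char *s)
{
    unsigned long octa = 0;
    char extra = '\0';

    return sscanf(s, "%lo%c", &octa, &extra);
}
```

count_fields("17") should be 1 and count_fields("17x") should be 2 on
any implementation.  count_fields("8") is the disputed case: a library
that treats the empty match as a matching failure returns 0.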

>    2. What is the best way to accomplish what we're trying to
>       accomplish, i.e. to check quickly that the user typed a valid
>       octal number and nothing else?

I prefer

	char *cp; unsigned long value;

	value = strtoul(buf, &cp, 8);
	if (cp == buf || *cp != '\0')
		... input value was invalid ...;

but I believe the sscanf described in the parent article should also work.
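
Fleshed out into a complete function, the strtoul test above might look
like the following sketch.  The name parse_octal, the leading-character
check, and the ERANGE check are additions for illustration; strtoul by
itself accepts leading white space and a sign.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns 1 and stores the value if buf holds an
 * octal number and nothing else, 0 otherwise. */
int parse_octal(const char *buf, unsigned long *out)
{
    char *cp;
    unsigned long value;

    /* Added strictness: refuse leading white space or a sign, which
     * strtoul would otherwise skip or accept. */
    if (!isdigit((unsigned char)buf[0]))
        return 0;

    errno = 0;
    value = strtoul(buf, &cp, 8);
    if (cp == buf || *cp != '\0')
        return 0;    /* no digits converted, or trailing junk */
    if (errno == ERANGE)
        return 0;    /* value does not fit in unsigned long */
    *out = value;
    return 1;
}
```

With this, "17" yields 15, while "8", "17x", " 17" and the empty string
are all rejected.
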
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

browns@iccgcc.decnet.ab.com (Stan Brown) (12/05/90)

> In article <2195.2754fcc2@iccgcc.decnet.ab.com>
> browns@iccgcc.decnet.ab.com (Stan Brown) writes:
> [regarding *scanf("%o%c"...) with input beginning with `8']
>>The standard says sscanf stops converting when it finds an invalid
>>character, so at first blush we expected num_fields to be 0 because the
>>first character in the string isn't an octal digit.  But working through
>>the definition of %o, which is in terms of strtoul( ), and in working
>>through the definition of the latter, we find that a "base part" of no
>>characters is considered valid.

In article <28132@mimsy.umd.edu>, chris@mimsy.umd.edu (Chris Torek) writes:
> I got a different answer when I wrote my `scanf' innards: only a few
> formats are allowed to be `empty', and none of the numeric conversion
> formats are included in those few.  So while "" is a `proper' octal
> number to strtoul(), it is not a `proper' octal number to %o and %o
> must stop with a matching failure, causing the sscanf call to return 0.

As I read the draft standard (Jan 11 '88, which I assume did not change
in the final), this is not conforming.  

sec 4.9.6.6, the sscanf function:  "... The sscanf function is
equivalent to fscanf, except that the argument s specifies a string from
which the input is to be obtained, rather than from a stream.  Reaching
the end of the string is equivalent to encountering end-of-file for the
fscanf function. ..."

sec 4.9.6.2, The fscanf function: "... _o_ Matches an optionally signed
octal integer, whose format is the same as expected for the subject
sequence of the strtoul function with the value 8 for the base argument.
..."

sec 4.10.1.6, The strtoul function: "... the expected form of the
subject sequence is a sequence of letters and digits representing an
integer with the radix specified by base, optionally preceded by a plus
or minus sign, but not including an integer suffix. ... The subject
sequence is defined as the longest subsequence of the input string,
starting with the first non-white-space character, that is an initial
subsequence of a sequence of the expected form. The subject sequence
contains no characters if the input string is empty or consists entirely
of white space, or if the first non-white-space character is other than
a permissible letter or digit. ... If the subject sequence is empty or
does not have the expected form, no conversion is performed ... If no
conversion could be performed, zero is returned." 

All of which says to me that sscanf("8", "%o", &x) should set x to 0.
Other opinions...?
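
The strtoul half of this can be checked directly.  A minimal sketch,
with the hypothetical helper octal_prefix_len reporting how much of the
string strtoul consumed in base 8:

```c
#include <stdlib.h>

/* Hypothetical helper: returns the number of characters strtoul
 * consumed from s in base 8 and stores the converted value. */
size_t octal_prefix_len(const char *s, unsigned long *value)
{
    char *end;

    *value = strtoul(s, &end, 8);
    return (size_t)(end - s);
}
```

For "8" this reports zero characters consumed and a value of 0, which is
the "no conversion" case quoted from 4.10.1.6.  Whether %o should turn
that empty subject sequence into a matching failure is the separate
question being argued here.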

Stan again:
>>    2. What is the best way to accomplish what we're trying to
>>       accomplish, i.e. to check quickly that the user typed a valid
>>       octal number and nothing else?

And Chris:
> I prefer
> 	char *cp; unsigned long value;
> 
> 	value = strtoul(buf, &cp, 8);
> 	if (cp == buf || *cp != '\0')
> 		... input value was invalid ...;

This I like a lot.  Thanks to all the other folks who posted or emailed.
