msb@sq.sq.com (Mark Brader) (02/12/90)
> > isascii(*s) && isdigit(*s)
>
> According to _Standard C_ ... there is no "isascii".  And "isdigit" etc.
> take an int in the set (EOF, 0..UCHAR_MAX) ...
> So, to write ANSI conformant C you must always say something like
>         isdigit((unsigned char) *s)

If the code has to run on ANSI and non-ANSI C's, I'd prefer:

        #include <stdio.h>
        #include <limits.h>
        #include <ctype.h>

        #ifndef isascii         /* oh, must be ANSI C */
        #define isascii(x) (((x) >= 0 && (x) <= UCHAR_MAX) || (x) == EOF)
        #endif

and then
        isascii(*s) && isdigit(*s)

The X3J11 people did not put isascii() in ANSI C because of the "ascii"
part of the name.  Correctly, they did not want to make any part of the
C Standard ASCII-dependent.  I suggested that isascii() be guaranteed
merely to have semantics similar to the above #define and the name kept
as a historical artifact, but they didn't buy it.

To keep this article short, I won't discuss making it work when the
argument has side-effects (as in isascii (*p++)).

--
Mark Brader                 At any rate, C++ != C.  Actually, the value of the
SoftQuad Inc., Toronto      expression "C++ != C" is implementation-defined.
utzoo!sq!msb, msb@sq.com                                    -- Peter da Silva

This article is in the public domain.
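The cast recommended in the quoted text can be sketched as follows.  This is
only an illustration under stated assumptions: count_digits() and
count_digits_demo() are hypothetical names that do not come from any of the
articles, and the "C" locale is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the decimal digits in a string. */
static size_t count_digits(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        /* Convert to unsigned char first, so that a negative plain
           char never reaches isdigit(). */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}

/* Usage sketch: the 0xFF byte is handed to isdigit() as 255, not as
   a negative value, and is simply not counted. */
static void count_digits_demo(void)
{
    assert(count_digits("a1b22\xff") == 3);
    assert(count_digits("") == 0);
}
```

Because the cast is applied to the value of *s after it has been read, the
expression stays safe even where plain char is signed.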
gwc@root.co.uk (Geoff Clare) (02/14/90)
In article <1990Feb12.043324.5259@sq.sq.com> msb@sq.com (Mark Brader) writes:
>If the code has to run on ANSI and non-ANSI C's, I'd prefer:
>
>        #include <stdio.h>
>        #include <limits.h>
>        #include <ctype.h>
>
>        #ifndef isascii         /* oh, must be ANSI C */
>        #define isascii(x) (((x) >= 0 && (x) <= UCHAR_MAX) || (x) == EOF)
>        #endif
>
>and then
>        isascii(*s) && isdigit(*s)

Sorry, that won't work on the many systems which have an ANSI-type
isdigit() AND a normal isascii().  This includes all X/Open Portability
Guide 3 compliant systems.

Also the assumption that isascii being undefined implies ANSI C is bogus.
A definition of isascii() could have been enabled by a feature test macro.

I believe the only way to cope with all variants of ctype.h macros is to
have a user-supplied configuration parameter.  E.g.

        #ifdef OLD_STYLE_CTYPE
        #define ISDIGIT(x) (isascii(x) && isdigit(x))
        #else
        #define ISDIGIT(x) isdigit(x)
        #endif

--
Geoff Clare, UniSoft Limited, Saunderson House, Hayne Street, London EC1A 9HH
gwc@root.co.uk  (Dumb mailers: ...!uunet!root.co.uk!gwc)  Tel: +44-1-315-6600
                                         (from 6th May 1990): +44-71-315-6600
evans@ditsyda.oz (Bruce Evans) (02/15/90)
In article <1990Feb12.043324.5259@sq.sq.com> msb@sq.com (Mark Brader) writes:
*If the code has to run on ANSI and non-ANSI C's, I'd prefer:
*
* #include <stdio.h>
* #include <limits.h>
* #include <ctype.h>
*
* #ifndef isascii /* oh, must be ANSI C */
* #define isascii(x) (((x) >= 0 && (x) <= UCHAR_MAX) || (x) == EOF)
* #endif
*
*and then
* isascii(*s) && isdigit(*s)
Why doesn't ANSI C guarantee isdigit() (etc.) on *all* characters? The usual
implementation would be to move the base of the ctype array from -1
(EOF) back to -128 (SCHAR_MIN).
Then you can define isascii(x) to be 1 in the above, and not have to worry
about side effects. You still have to watch out for isdigit() on non-chars.
--
Bruce Evans evans@ditsyda.oz.au
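The table-offset idea can be sketched roughly as below.  This is not taken
from any real <ctype.h>: my_ctype_tab, MY_DIGIT, MY_ISDIGIT, my_ctype_init
and leading_digits are hypothetical names, and the table is assumed to cover
SCHAR_MIN..UCHAR_MAX.  The macro uses its argument exactly once, as a table
index, so an argument such as *p++ behaves as expected.

```c
#include <assert.h>
#include <limits.h>

#define MY_DIGIT 0x01   /* hypothetical class bit */

/* One slot for every value from SCHAR_MIN through UCHAR_MAX.
   If EOF is -1, it shares a slot with the char value -1. */
static unsigned char my_ctype_tab[UCHAR_MAX - SCHAR_MIN + 1];

/* The argument appears once: it is only used as an index. */
#define MY_ISDIGIT(c) (my_ctype_tab[(c) - SCHAR_MIN] & MY_DIGIT)

/* Mark '0'..'9', which C guarantees to be contiguous. */
static void my_ctype_init(void)
{
    int c;

    for (c = '0'; c <= '9'; c++)
        my_ctype_tab[c - SCHAR_MIN] |= MY_DIGIT;
}

/* Usage sketch: count leading digits with a side effect in the
   macro argument. */
static int leading_digits(const char *p)
{
    int n = 0;

    while (MY_ISDIGIT(*p++))
        n++;
    return n;
}

static void my_ctype_demo(void)
{
    my_ctype_init();
    assert(leading_digits("42\xe9") == 2);  /* stops at the 0xE9 byte */
    assert(leading_digits("\xe9") == 0);    /* negative char is a valid index */
}
```

The lookup works whether plain char is signed or unsigned, since both ranges
fall inside the table.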
karl@haddock.ima.isc.com (Karl Heuer) (02/16/90)
In article <2448@ditsyda.oz> evans@ditsyda.oz (Bruce Evans) writes:
>Why doesn't ANSI C guarantee isdigit() (etc.) on *all* characters?

(The above seems to mean, "both signed and unsigned characters".)

Assume a character set in which (char)-1 is printable, e.g. ISO Latin 1.
Your proposal would require that isprint((int)(signed char)(-1)) test as
true.  But isprint(EOF) is required to test false.  Thus, this would
require that EOF be defined as a value other than -1.  This is permitted
by the Standard (I'm not entirely sure why), but it would be A Bad Thing
to create conditions that *require* it.

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint
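The collision described above can be checked with a few lines of C.  This is
a small sketch only; the function names are hypothetical and chosen for the
illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Nonzero if a signed char holding -1 widens to the same int as EOF. */
static int signed_char_minus_one_is_eof(void)
{
    signed char c = -1;

    return (int)c == EOF;
}

/* Nonzero if a byte passed the ANSI way, as an unsigned char value,
   compares equal to EOF.  EOF is negative, so this cannot happen as
   long as UCHAR_MAX fits in an int. */
static int byte_is_eof(unsigned char byte)
{
    return (int)byte == EOF;
}

static void eof_collision_demo(void)
{
    /* The signed value collides exactly when EOF is defined as -1. */
    assert(signed_char_minus_one_is_eof() == (EOF == -1));
    /* The same bit pattern, taken as unsigned, never collides. */
    assert(!byte_is_eof(0xFF));
}
```

Where EOF is -1, a function that receives only the int value has no way to
tell the character from the end-of-file marker.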
rbutterworth@watmath.waterloo.edu (Ray Butterworth) (02/23/90)
In article <2448@ditsyda.oz> evans@ditsyda.oz (Bruce Evans) writes:
>Why doesn't ANSI C guarantee isdigit() (etc.) on *all* characters?

Then the missing isascii() wouldn't even be needed.
This was proposed to the Committee and rejected on the (incorrect)
grounds that it couldn't be implemented without the macro's having
to evaluate its arguments more than once.
jeffa@hpmwtd.HP.COM (Jeff Aguilera) (02/24/90)
> Then the missing isascii() wouldn't even be needed.
> This was proposed to the Committee and rejected on the (incorrect)
> grounds that it couldn't be implemented without the macro's having
> to evaluate its arguments more than once.

Introduce the inline keyword, and then inline all the ctype functions.
Simple solution, superior in all respects.

(Gawd, ANSI C is brain dead.)

-----
jeffa
martin@mwtech.UUCP (Martin Weitzel) (02/24/90)
In article <34540@watmath.waterloo.edu> rbutterworth@watmath.waterloo.edu (Ray Butterworth) writes:
>In article <2448@ditsyda.oz> evans@ditsyda.oz (Bruce Evans) writes:
>>Why doesn't ANSI C guarantee isdigit() (etc.) on *all* characters?
>
>Then the missing isascii() wouldn't even be needed.
>This was proposed to the Committee and rejected on the (incorrect)
>grounds that it couldn't be implemented without the macro's having
>to evaluate its arguments more than once.

As much as *I* understand ANSI-C, all characters of the "machine
character set" must have positive values. So IMHO problems with
isdigit() and the other <ctype.h>-stuff can only occur,

- if the compiler claims (only) to support ASCII,
- but you in fact use it for non-ASCII (eg ISO 8859).

This is not the compiler writers' problem, because you clearly could
never assume that isdigit() operates on EBCDIC *and* ASCII at the
same time.

The pity is that the lower half of ISO 8859 (or of IBM extended ASCII,
as found on the 'typical' PC) maps 1:1 onto the international ASCII
variant, so it is not obvious when an implementation written for ASCII
is being abused with 8-bit chars.

On the other hand, if a compiler claims to *support* ISO 8859, it has no
choice but to implement all plain chars as unsigned char, so the
problem should go away.

Problems with the is...() functions seem to stem from abusing an
implementation on character sets it was not designed for!

(Nevertheless I *know* that it is sometimes necessary to "abuse" an
implementation in this way, at least in Europe with our umlauts.
If I abuse something, I should not complain that it's broken. But the
warning in the original posting was valid, of course.)

--
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83
karl@haddock.ima.isc.com (Karl Heuer) (02/26/90)
In article <680020@hpmwjaa.HP.COM> jeffa@hpmwtd.HP.COM (Jeff Aguilera) writes:
>[Ray Butterworth wrote:]
>> [If isdigit() (etc.) were guaranteed for *all* characters,]
>> Then the missing isascii() wouldn't even be needed.
>> This was proposed to the Committee and rejected on the (incorrect)
>> grounds that it [would make the macro unsafe].
>
>Introduce the inline keyword, and then inline all the ctype functions.
>Simple solution, superior in all respects.

Unnecessary, since (as Ray noted) the unsafe-macro argument was
*incorrect*: the macro version would still evaluate the argument exactly
once.

Insufficient, since (as I noted earlier) the real problem is with the
collision between signed chars and EOF.  This is a problem with the
specification itself, regardless of whether or not it's implemented as a
macro.

>(Gawd, ANSI C is brain dead.)

I suppose it is, but not because of anything you've said here.  I blame
it on heredity; pre-ANSI C was worse.

Karl W. Z. Heuer (karl@ima.ima.isc.com or harvard!ima!karl), The Walking Lint
henry@utzoo.uucp (Henry Spencer) (03/01/90)
In article <668@mwtech.UUCP> martin@mwtech.UUCP (Martin Weitzel) writes:
>As much as *I* understand ANSI-C, all characters of the "machine
>character set" must have positive values. So IMHO problems with
>isdigit() and the other <ctype.h>-stuff can only occur,
>
>- if the compiler claims (only) to support ASCII,
>- but you in fact use it for non-ASCII (eg ISO 8859).

Sorry, not so.  The ANSI C restriction is weaker than that found in
earlier C documentation: it says that the characters in the "source
character set" -- roughly speaking, the characters used to write C --
must be positive.  There is no promise made about other characters in
your machine's character set.  Various things in the standard encourage
implementors to make char unsigned, but it is not actually compulsory
even if you have an 8-bit character set.
--
"The N in NFS stands for Not,   |     Henry Spencer at U of Toronto Zoology
or Need, or perhaps Nightmare"  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
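The distinction drawn above can be probed on a given implementation with a
short check.  This is a sketch only: the function names are hypothetical,
and the string covers just a sample of the characters used to write C, not
the full required set.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero if plain char is a signed type on this implementation. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Nonzero if every sampled source character is positive when stored
   in a plain char. */
static int sampled_source_chars_are_positive(void)
{
    static const char sample[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789!#%&*+-/<=>_";
    const char *p;

    for (p = sample; *p != '\0'; p++) {
        if (*p <= 0)
            return 0;
    }
    return 1;
}

static void char_sign_demo(void)
{
    /* Holds whether or not plain char is signed. */
    assert(sampled_source_chars_are_positive());
    /* Either answer is conforming; only the limits.h value decides. */
    assert(plain_char_is_signed() == (CHAR_MIN < 0));
}
```

Nothing comparable is promised for other 8-bit characters, which may be
negative where plain_char_is_signed() returns nonzero.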