ycy@walt.cc.utexas.edu (Joseph Yip) (07/19/90)
Hi,

The char and unsigned char problem has been with me for a long time. I know char represents 7 bits ASCII and unsigned char works with 8-bit. Most of the UNIX string functions (strcpy, strcmp, ...) take char *, and malloc() also returns char *, not unsigned char *. Some systems have char defaulted to 8-bit. Others require you to declare it explicitly as unsigned char.

If I pass a unsigned char pointer to a function that expects a char pointer, e.g.

    int foo(char *p);
    ...
    unsigned char *buf;
    a = foo(buf);

will there be a problem? Will the foo() mask off my 7th-bit?

You know I hate writing the same system library functions where the only difference is the 7th-bit. If I am using ANSI C, the compiler will give me warnings or errors because of the type mismatch!

Thank you

- Joseph Yip
Email: joseph@zeus.ee.utexas.edu
karl@haddock.ima.isc.com (Karl Heuer) (07/19/90)
In article <34292@ut-emx.UUCP> ycy@walt.cc.utexas.edu (Joseph Yip) writes:
>I know char represents 7 bits ASCII and unsigned char works with 8-bit.

Not quite. `char' is an arithmetic type which is at least eight bits wide, but it's implementation-defined whether it's signed or unsigned. For normal use in text processing, you shouldn't need to know the integer value of a character, so `char' is sufficient.

The unfortunate exceptions are that the return value of `getc()' and the argument to a <ctype.h> function are a bastard type: instead of the logically correct `char', they use the union of `unsigned char' and { EOF }. Now, since all normal% characters are contained within the intersection of `char' and `unsigned char', you can safely ignore this botch if you *know* you're dealing with the most restrictive kind of text.

>If I pass a unsigned char pointer to a function that expects a char
>pointer ... will there be a problem? Will [it] mask off my 7th-bit?

No. At worst you'll need to use an explicit cast, but I believe the Standard contains a clause to guarantee that the behavior is as you expect.

My recommendation is to always use `char *' for text, and do conversions to `unsigned char' only in the context of <ctype.h> functions.

>You know I hate writing the same system library functions where the
>only difference is the 7th-bit.

I don't see any need. Save your energy for a *real* problem, like wchar_t.

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint
________
% Besides being true of all ASCII characters, this guarantee is also extended to the entire C source character set in non-ASCII alphabets. Basically this forbids an EBCDIC implementation from making `char' a signed 8-bit type.
evil@arcturus.uucp (Wade Guthrie) (07/24/90)
> My recommendation is to always use `char *' for text, and do
> conversions to `unsigned char' only in the context of <ctype.h>
> functions.

Unless, of course, you need to do byte-style things rather than character-style things. In that case, unsigned char can be real neat!

--
Wade Guthrie (evil@arcturus.UUCP)   | "He gasped in terror at what sounded
Rockwell International; Anaheim, CA | like a man trying to gargle while
My opinions, not my employer's.     | fighting off a pack of wolves"
                                    | Hitchhiker's Guide