brnstnd@stealth.acf.nyu.edu (01/07/90)
From: brnstnd@stealth.acf.nyu.edu

The question is self-explanatory. This is a practical question as well
as a theoretical one: I'd like a solution that is both conformant and
portable in the real world.

Does (int) (unsigned int) ch do the trick? What about
(int) (unsigned char)?

---Dan

Volume-Number: Volume 18, Number 8
brnstnd@longway.tic.com (Dan Bernstein) (01/10/90)
Apparently (int) (unsigned char) ch is guaranteed to be nonnegative and
at most UCHAR_MAX, as long as sizeof(int) > sizeof(char). (If
sizeof(int) == sizeof(char), there's obviously no way to fit all the
characters into just the positive integers.) This answer is reasonably
portable, though pre-ANSI machines must define UCHAR_MAX explicitly.

I really did mean this as a UNIX C question, not just a C question; but
POSIX doesn't seem to say anything about casts.

Some other answers:

(int) (unsigned int) ch will convert negative characters into negative
integers, so it's wrong if char runs from, say, -128 to 127.

((int) ch) & 0xff works and answers my original question, but it won't
handle machines with more than 256 characters. There's no compile-time
way to find the right bit pattern---UCHAR_MAX + 1 may not be a power of
two.

(((int) ch) + UCHAR_MAX + 1) % (UCHAR_MAX + 1) produces the same result
as (int) (unsigned char) ch (that is, if sizeof(int) > sizeof(char);
otherwise it fails miserably) but is slow on most machines. This is a
wonderful opportunity for me to make fun of the mathematical naivete
that allows a negative value for x % y in some cases.

*(unsigned char *)&ch seems an awfully complicated way to cast ch to an
unsigned char. (Technically, it doesn't answer my question, because the
result isn't an int.) Yes, Bill, I do have ch in a register, so forget
it.

---Dan

Volume-Number: Volume 18, Number 16
karl@haddock.ima.isc.com (Karl Heuer) (01/12/90)
In article <7544@cs.utexas.edu> Dan Bernstein
<stealth.acf.nyu.edu!brnstnd@longway.tic.com> writes:
>((int) ch) & 0xff works and answers my original question, but it won't
>handle machines with more than 256 characters. There's no compile-time
>way to find the right bit pattern---UCHAR_MAX + 1 may not be a power of
>two.

I believe it is required that all of the U{type}_MAX constants are one
less than a power of two. I used to think that
U{type}_MAX+1 == 1 << (sizeof(type)*CHAR_BIT) was required, but it
seems that Cray has good reasons for violating that identity. I suppose
this is another item to send to X3J11 for interpretation.

Karl W. Z. Heuer (karl@haddock.isc.com or ima!haddock!karl), The Walking Lint