[comp.std.c] How to convert a char into an int from 0 through 255?

brnstnd@stealth.acf.nyu.edu (01/07/90)

From: brnstnd@stealth.acf.nyu.edu

The question is self-explanatory. This is a practical question as well
as a theoretical one: I'd like a solution that is both conformant and
portable in the real world. Does (int) (unsigned int) ch do the trick?
What about (int) (unsigned char) ch?

---Dan

Volume-Number: Volume 18, Number 8

brnstnd@longway.tic.com (Dan Bernstein) (01/10/90)

Apparently (int) (unsigned char) ch is guaranteed to be nonnegative
and at most UCHAR_MAX, as long as sizeof(int) > sizeof(char). (If
sizeof(int) == sizeof(char), there's obviously no way to fit all the
characters into just the nonnegative integers.) This answer is reasonably
portable, though pre-ANSI compilers without <limits.h> need UCHAR_MAX
defined explicitly.
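
A minimal sketch of the cast as a function, assuming a compiler that
provides <limits.h>. The name char_to_index is hypothetical and not from
the thread; the #if guard simply refuses to compile on a machine where
int cannot hold every unsigned char value.

```c
#include <assert.h>
#include <limits.h>

#if UCHAR_MAX > INT_MAX
#error "int cannot hold every unsigned char value on this platform"
#endif

/* char_to_index is a hypothetical name, not something from the thread. */
int char_to_index(char ch)
{
    /* Converting to unsigned char reduces the value modulo UCHAR_MAX + 1,
       whether plain char is signed or unsigned. */
    int r = (int) (unsigned char) ch;

    /* With the guard above, the result always fits in 0..UCHAR_MAX. */
    assert(r >= 0 && r <= UCHAR_MAX);
    return r;
}
```

A typical use would be indexing a lookup table of UCHAR_MAX + 1 entries,
for example table[char_to_index(ch)].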

I really did mean this as a UNIX C question, not just a C question; but
POSIX doesn't seem to say anything about casts.

Some other answers:

(int) (unsigned int) ch will, on typical machines, convert negative
characters back into negative integers, so it's wrong if char runs from,
say, -128 to 127.
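
To see why the two casts differ, here is a small side-by-side sketch.
Both function names are hypothetical. The comment about the result of
the (unsigned int) route describes common two's complement compilers;
the language itself leaves that conversion implementation-defined.

```c
/* Hypothetical name: the cast that does NOT do the job. */
int via_unsigned_int(char ch)
{
    /* A negative ch becomes ch + UINT_MAX + 1 as an unsigned int.
       Converting that back to int is implementation-defined and on
       common compilers yields the original negative value. */
    return (int) (unsigned int) ch;
}

/* Hypothetical name: the cast that does. */
int via_unsigned_char(char ch)
{
    /* A negative ch becomes ch + UCHAR_MAX + 1, which fits in an int
       whenever int is wider than char. */
    return (int) (unsigned char) ch;
}
```

For nonnegative characters the two functions agree; they part ways only
when plain char is signed and ch is negative.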

((int) ch) & 0xff works and answers my original question, but it won't
handle machines with more than 256 characters. There's no compile-time
way to find the right bit pattern---UCHAR_MAX + 1 may not be a power of
two.

(((int) ch) + UCHAR_MAX + 1) % (UCHAR_MAX + 1) produces the same result
as (int) (unsigned char) ch (that is, if sizeof(int) > sizeof(char);
otherwise it fails miserably) but is slow on most machines. This is a
wonderful opportunity for me to make fun of the mathematical naivete
that allows a negative value for x % y in some cases.
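
The modulo form can be sketched as below, assuming <limits.h> is
available and int is wider than char. The name mod_char is hypothetical.
The point of adding UCHAR_MAX + 1 before reducing is to keep the left
operand of % nonnegative, so the sign rules for % never come into play.

```c
#include <limits.h>

/* Hypothetical name. Assumes UCHAR_MAX + 1 does not overflow int,
   i.e. int is wider than char. */
int mod_char(char ch)
{
    /* ch is at least CHAR_MIN, which is no smaller than -(UCHAR_MAX + 1)
       on any machine where this compiles sensibly, so the sum is >= 0.
       A bare ch % (UCHAR_MAX + 1) would not do: C89 leaves the sign of
       the result implementation-defined for a negative operand, and
       C99 and later make -1 % 256 equal to -1. */
    return (((int) ch) + UCHAR_MAX + 1) % (UCHAR_MAX + 1);
}
```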

*(unsigned char *)&ch seems an awfully complicated way to cast ch to an
unsigned char. (Technically, it doesn't answer my question, because the
result isn't an int.) Yes, Bill, I do have ch in a register, so forget it.
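
For comparison, the pointer version might look like this. The name
alias_char is hypothetical. It reads the stored byte rather than
converting the value, so it matches the cast only on two's complement
machines, and it needs ch to have an address, which a register
variable does not.

```c
/* Hypothetical name. Reads the object representation of ch. */
int alias_char(char ch)
{
    /* Access through unsigned char * is always permitted. The
       expression has type unsigned char and is promoted to int on
       return. This would not compile if ch were declared `register`,
       since & cannot be applied to a register object. */
    return *(unsigned char *) &ch;
}
```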

---Dan


Volume-Number: Volume 18, Number 16

karl@haddock.ima.isc.com (Karl Heuer) (01/12/90)

In article <7544@cs.utexas.edu> Dan Bernstein <stealth.acf.nyu.edu!brnstnd@longway.tic.com> writes:
>((int) ch) & 0xff works and answers my original question, but it won't handle
>machines with more than 256 characters. There's no compile-time way to find
>the right bit pattern---UCHAR_MAX + 1 may not be a power of two.

I believe it is required that all of the U{type}_MAX constants are one less
than a power of two.
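
If that requirement holds, UCHAR_MAX itself can serve as the mask in
place of 0xff. A minimal sketch, with the hypothetical name mask_char;
it assumes two's complement, and the #if turns the power-of-two
assumption into a compile-time check instead of taking it on faith.

```c
#include <limits.h>

/* Refuse to compile if UCHAR_MAX is not one less than a power of two. */
#if (UCHAR_MAX & (UCHAR_MAX + 1)) != 0
#error "UCHAR_MAX + 1 is not a power of two"
#endif

/* Hypothetical name. Assumes two's complement; on other
   representations, masking a negative int gives a different value
   than converting to unsigned char. */
int mask_char(char ch)
{
    return ((int) ch) & UCHAR_MAX;
}
```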

I used to think that   U{type}_MAX+1 == 1 << (sizeof(type)*CHAR_BIT)   was
required, but it seems that Cray has good reasons for violating that identity.
I suppose this is another item to send to X3J11 for interpretation.
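
One way to test the identity on a given machine is to count the value
bits implied by a U{type}_MAX constant and compare with
sizeof(type)*CHAR_BIT. This is only a sketch; value_bits and
uint_uses_all_bits are hypothetical names, and nothing here says what
any particular Cray compiler reports.

```c
#include <limits.h>

/* Hypothetical helper: number of value bits n such that max == 2^n - 1,
   or -1 if max is not one less than a power of two. */
int value_bits(unsigned long max)
{
    int n = 0;

    /* max is 2^n - 1 exactly when max and max + 1 share no bits
       (max + 1 wraps to 0 for ULONG_MAX, which is still fine). */
    if ((max & (max + 1UL)) != 0UL)
        return -1;

    while (max != 0UL) {
        n++;
        max >>= 1;
    }
    return n;
}

/* Hypothetical helper: nonzero if unsigned int uses every bit of its
   storage as a value bit, i.e. the identity holds for unsigned int. */
int uint_uses_all_bits(void)
{
    return value_bits(UINT_MAX) == (int) (sizeof(unsigned int) * CHAR_BIT);
}
```

A type whose U{type}_MAX is still one less than a power of two but for
which the comparison fails has unused bits in its storage.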

Karl W. Z. Heuer (karl@haddock.isc.com or ima!haddock!karl), The Walking Lint