[comp.lang.c] How to convert a char into an int from 0 through 255?

bill@twwells.com (T. William Wells) (01/07/90)

From: bill@twwells.com (T. William Wells)

In article <498@longway.TIC.COM> uunet!stealth.acf.nyu.edu!brnstnd (Dan Bernstein) writes:
: From: brnstnd@stealth.acf.nyu.edu
:
: The question is self-explanatory. This is a practical question as well
: as a theoretical one: I'd like a solution that is both conformant and
: portable in the real world. Does (int) (unsigned int) ch do the trick?
: What about (int) (unsigned char)?

Excuse me, but this is purely a C question, so I've directed
followups to comp.lang.c.

[ I'm not sure I agree, but I don't see comp.std.unix people clamoring
to answer this question, so let's send it to comp.lang.c to see if there
is more interest there.  -mod ]

Anyway, I don't think that either is guaranteed. One form that is
guaranteed, assuming that the character is not in a register, is:
*(unsigned char *)&ch.

(NB: a char might be converted to a number larger than 255 if
characters are larger than eight bits.)
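
A minimal sketch of that expression wrapped in a function. The name
char_via_pointer is hypothetical, not from the post. Because ch is an
ordinary parameter its address can be taken; a variable declared
register could not be used this way. On two's-complement machines the
byte read through the pointer has the same value as a plain cast to
unsigned char.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: read the storage of ch as an unsigned char. */
int char_via_pointer(char ch)
{
    /* Taking &ch is legal here because ch is not a register variable. */
    int n = *(unsigned char *) &ch;

    /* Assumes int can represent every unsigned char value. */
    assert(n >= 0 && n <= UCHAR_MAX);
    return n;
}
```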

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

Volume-Number: Volume 18, Number 15

brnstnd@longway.tic.com (Dan Bernstein) (01/10/90)

Apparently (int) (unsigned char) ch is guaranteed to be nonnegative
and at most UCHAR_MAX, as long as sizeof(int) > sizeof(char). (If
sizeof(int) == sizeof(char), there's obviously no way to fit all the
characters into just the nonnegative integers.) This answer is reasonably
portable, though on pre-ANSI compilers, which have no <limits.h>,
UCHAR_MAX must be defined by hand.
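
A minimal sketch of this cast, assuming an ANSI <limits.h> and an int
wide enough to hold every unsigned char value. The function name
char_to_index is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: map any char to the range 0..UCHAR_MAX. */
int char_to_index(char ch)
{
    /* Conversion to unsigned char reduces the value modulo
       UCHAR_MAX + 1, which is well defined even if char is signed. */
    int n = (int) (unsigned char) ch;

    /* Holds whenever int can represent every unsigned char value. */
    assert(n >= 0 && n <= UCHAR_MAX);
    return n;
}
```

For example, char_to_index((char) -1) yields UCHAR_MAX whether plain
char is signed or unsigned.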

I really did mean this as a UNIX C question, not just a C question; but
POSIX doesn't seem to say anything about casts.

Some other answers:

(int) (unsigned int) ch will convert negative characters into negative
integers, so it's wrong if char runs from, say, -128 to 127.

((int) ch) & 0xff works and answers my original question, but it won't
handle machines with more than 256 characters. There's no compile-time
way to find the right bit pattern---UCHAR_MAX + 1 may not be a power of
two.
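
A minimal sketch of the masking form, under two assumptions that the
thread does not guarantee: characters are exactly eight bits wide
(UCHAR_MAX is 255) and int uses two's complement. The function name
char_masked is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: keep only the low eight bits of ch. */
int char_masked(char ch)
{
    /* The literal 0xff is only the right mask for 8-bit characters. */
    assert(UCHAR_MAX == 0xff);

    /* A negative ch is sign-extended to int first; the mask then
       discards the extension bits (two's complement assumed). */
    return ((int) ch) & 0xff;
}
```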

(((int) ch) + UCHAR_MAX + 1) % (UCHAR_MAX + 1) produces the same result
as (int) (unsigned char) ch (that is, if sizeof(int) > sizeof(char);
otherwise it fails miserably) but is slow on most machines. This is a
wonderful opportunity for me to make fun of the mathematical naivete
that allows a negative value for x % y in some cases.
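
A minimal sketch of the modulus form, assuming sizeof(int) >
sizeof(char) so that UCHAR_MAX + 1 cannot overflow. The function name
char_mod is hypothetical. The bias matters because C89 leaves the sign
of x % y implementation-defined when an operand is negative, and C99
and later define -1 % 256 as -1.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: reduce ch into 0..UCHAR_MAX with arithmetic. */
int char_mod(char ch)
{
    /* Adding UCHAR_MAX + 1 first keeps the left operand of %
       nonnegative, so the sign rule for % never comes into play. */
    int n = (((int) ch) + UCHAR_MAX + 1) % (UCHAR_MAX + 1);

    /* Cross-check against the cast form discussed in the thread. */
    assert(n == (int) (unsigned char) ch);
    return n;
}
```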

*(unsigned char *)&ch seems an awfully complicated way to cast ch to an
unsigned char. (Technically, it doesn't answer my question, because the
result isn't an int.) Yes, Bill, I do have ch in a register, so forget it.

---Dan


Volume-Number: Volume 18, Number 16

martin@mwtech.uucp (Martin Weitzel) (01/17/90)

From: martin@mwtech.uucp (Martin Weitzel)

[ Please send all further articles on this subject to comp.lang.c.  -mod ]

In article <7544@cs.utexas.edu> Dan Bernstein <stealth.acf.nyu.edu!brnstnd@longway.tic.com> writes:
[some lines deleted]
>handle machines with more than 256 characters. There's no compile-time
>way to find the right bit pattern---UCHAR_MAX + 1 may not be a power of
>two.

Look at my recent posting about a portable 'offsetof()' macro.
The general principle outlined there is almost always usable
in any situation where you have a way to solve a problem at
run time but need the answer at compile time.

-- 
Martin Weitzel, email: martin@mwtech.UUCP, voice: 49-(0)6151-6 56 83


Volume-Number: Volume 18, Number 21