daveh@marob.uucp (Dave Hammond) (04/08/91)
In the following code examples, the '\nnn' representation causes the
compiler to sign-extend 8-bit values, causing comparisons to fail (255
!= -1). This occurred on every machine I tested (Generic 386, Altos
386, DEC uVax). Is this because the '\nnn' token is being seen as a char
value and chars are signed on the machines I tested on? Or must I
expect this result from all C compilers?

    /* c1.c */
    main()
    {
        int a = '\xFF';
        printf("a=%o\n",a);
    }

    /* c2.c */
    main()
    {
        int a = 0xFF;
        printf("a=%o\n",a);
    }

output from diff c1.s c2.s:

    4c4
    <       TITLE   $c1
    ---
    >       TITLE   $c2
    33c33
    <       mov     DWORD PTR [ebp-4], -1
    ---
    >       mov     DWORD PTR [ebp-4], 255

--
Dave Hammond
daveh@marob.uucp
uunet!rutgers!phri!marob!daveh
grogers@convex.com (Geoffrey Rogers) (04/10/91)
In article <28007837.35A9@marob.uucp> daveh@marob.uucp (Dave Hammond) writes:
>In the following code examples, the '\nnn' representation causes the
>compiler to sign-extend 8 bit values, causing comparisons to fail (255
>!= -1). This occurred on every machine I tested (Generic 386, Altos
>386, DEC uVax). Is this because the '\nnn' token being seen as a char
>value and chars are signed on the machines I tested on?

Yes, this is correct. This is because chars can be either signed or
unsigned. On the machines you mention, char is signed.

>Or must I expect this result from all C compilers?

No.

+------------------------------------+---------------------------------+
| Geoffrey C. Rogers                 | "Whose brain did you get?"      |
| grogers@convex.com                 | "Abbie Normal!"                 |
| {sun,uunet,uiucdcs}!convex!grogers |                                 |
+------------------------------------+---------------------------------+
gwyn@smoke.brl.mil (Doug Gwyn) (04/10/91)
In article <28007837.35A9@marob.uucp> daveh@marob.uucp (Dave Hammond) writes:
>Is this because the '\nnn' token being seen as a char value and chars are
>signed on the machines I tested on?

Essentially, yes. The value 255 gets stuffed into a char then converted
to type int, which propagates the sign bit when char acts as a signed type.
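
The conversion described above can be observed directly. This is only a minimal sketch, assuming 8-bit chars; the function names are made up for illustration and are not from any post in this thread.

```c
#include <limits.h>

/* Value of the character constant '\xFF' once it has become an int.
   With 8-bit chars this is typically -1 where plain char is signed
   and 255 where plain char is unsigned. */
int escape_value(void)
{
    return '\xFF';
}

/* Nonzero if plain char acts as a signed type on this implementation.
   CHAR_MIN comes from <limits.h>: it is 0 for unsigned plain char
   and negative for signed plain char. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Printing escape_value() next to plain_char_is_signed() on a given compiler shows which of the two behaviours it has, without having to read the generated assembly.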
steve@taumet.com (Stephen Clamage) (04/10/91)
grogers@convex.com (Geoffrey Rogers) writes:
|In article <28007837.35A9@marob.uucp> daveh@marob.uucp (Dave Hammond) writes:
|>.... Is this because the '\nnn' token being seen as a char
|>value and chars are signed on the machines I tested on?
|Yes, this is correct. This is because char's can be either signed or
|unsigned. On the machines you mention, char is signed.
|>Or must I expect this result from all C compilers?
|No.

Let me also point out that you must be careful of code which depends on
whether the default char type is signed or unsigned; sooner or later it
will be ported to an environment where it will no longer work. When
possible, use explicit declarations of 'signed char' or 'unsigned char'.
(Unfortunately, not all compilers accept the 'signed' keyword.)

In the posted example, you could do something like:

    int i = (unsigned char) '\xFF';

although it is unclear to me why this is better than using a non-char value:

    int i = 0xFF;

--
Steve Clamage, TauMetric Corp, steve@taumet.com
wolfram@cip-s08.informatik.rwth-aachen.de (Wolfram Roesler) (04/12/91)
It's best to compare in the following way:

    char x = -1;
    char y = 0xff;
    if ((unsigned char)x == (unsigned char)y)
        ...

... so before comparing, cast both to unsigned char. This is because you
do not know (because it's undefined by the language definition) if char
is unsigned or not.
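
The cast-before-compare idea can be wrapped in small helpers. This is a sketch only: same_byte and byte_equals are hypothetical names, and the comments assume 8-bit chars.

```c
/* Compare two plain chars by their unsigned byte value, so the result
   does not depend on whether plain char is signed. */
int same_byte(char x, char y)
{
    return (unsigned char)x == (unsigned char)y;
}

/* Compare a plain char against an int in the range 0..UCHAR_MAX,
   for example byte_equals(c, 0xFF). A direct c == 0xFF would be
   false for a signed char holding -1. */
int byte_equals(char c, int value)
{
    return (unsigned char)c == value;
}
```

With 8-bit chars, same_byte((char)-1, (char)0xFF) is true whether plain char is signed or unsigned. On a machine with wider chars the two values would differ, so the 0xFF literal is itself an assumption about the char width.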
Sepp@ppcger.ppc.sub.org (Josef Wolf) (04/15/91)
wolfram@cip-s08.informatik.rwth-aachen.de (Wolfram Roesler) writes:
] It's best to compare in the following way:
]     char x = -1;
]     char y = 0xff;
]     if ((unsigned char)x == (unsigned char)y)
]         ...
] ... so before comparing, cast both to unsigned char. This because you
] do not know (because it's undefined by the language definition) if char
] is unsigned or not.

You don't even know the number of bits in a char (or is it defined?)
So you can get into trouble with this.

Greetings
Sepp

| Josef Wolf, Germersheim, Germany | +49 7274 8047 -24 Hours- (call me :-) |
| sepp@ppcger.ppc.sub.org          | +49 7274 8048 -24 Hours-              |
| ...!ira.uka.de!smurf!ppcger!sepp | +49 7274 8967 18:00-8:00, Sa + Su 24h |
| ----=> kommt Zeit => kommt Frau  | all lines 300/1200/2400 bps 8n1       |
| kommt Frau => geht Zeit, geht Zeit => geht Frau, geht Frau => kommt Zeit |
wolfram@cip-s02.informatik.rwth-aachen.de (Wolfram Roesler) (04/18/91)
Sepp@ppcger.ppc.sub.org (Josef Wolf) writes:
>wolfram@cip-s08.informatik.rwth-aachen.de (Wolfram Roesler) writes:
>] It's best to compare in the following way:
>]     char x = -1;
>]     char y = 0xff;
>]     if ((unsigned char)x == (unsigned char)y)
>]         ...
>] ... so before comparing, cast both to unsigned char. This because you
>] do not know (because it's undefined by the language definition) if char
>] is unsigned or not.
>You even don't know about the number of bits of a char (or is it defined?)
>So you can get into trouble with this.

A char is defined to contain a single character, so this is usually 8 bits.
However, I don't think my program will cause trouble, since char x = -1
includes an implicit conversion. Assume a char is 8 bits and an int is 16:
this line will not simply copy the low 8 bits of -1 into x, but will do so
preserving the sign (if chars are signed, which we assume here).
Your compiler might give a warning about signed assignment to an unsigned
variable, but that's all.
bhoughto@bishop.intel.com (Blair P. Houghton) (04/19/91)
In article <wolfram.671964442@cip-s02> wolfram@cip-s02.informatik.rwth-aachen.de (Wolfram Roesler) writes:
>Sepp@ppcger.ppc.sub.org (Josef Wolf) writes:
>>You even don't know about the number of bits of a char (or is it defined?)
>>So you can get into trouble with this.
>
>A char is defined to contain a single character, so this is usually 8 bits.

Look in <limits.h> (or grep through your compiler's include-files)
for `CHAR_BIT'. This is the definition of the number of bits in a `char'.

--Blair
"Give me egrep and a place to stand
and I will move the earth."
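
Rather than grepping the headers, a program can use the macro itself. A small sketch, assuming an ANSI <limits.h> is available; the function names are hypothetical.

```c
#include <limits.h>

/* Number of bits in a char on this implementation. */
int bits_per_char(void)
{
    return CHAR_BIT;
}

/* The all-bits-set unsigned char value. Converting -1 to unsigned char
   always yields UCHAR_MAX, whatever the char width, so this avoids
   hard-coding 0xFF. */
unsigned int all_ones_char(void)
{
    return (unsigned char)-1;
}
```

On the common 8-bit case these return 8 and 255; code that compares against all_ones_char() instead of a literal 0xFF does not silently assume that width.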
wolfram@cip-s01.informatik.rwth-aachen.de (Wolfram Roesler) (05/08/91)
BTW, isn't a char defined to contain at least 8 bits in ANSI C? I read something about that in a (however rather imprecise) book.
worley@compass.com (Dale Worley) (05/08/91)
In article <wolfram.673701563@cip-s01> wolfram@cip-s01.informatik.rwth-aachen.de (Wolfram Roesler) writes:
Subject: Re: 0xFF != '\xFF' ?
BTW, isn't a char defined to contain at least 8 bits in ANSI C?
I read something about that in a (however rather imprecise) book.
Yes, chars are required to have at least 8 bits. See section 2.2.4.2
of the standard.
However beware that "character constants" in C really have type int.
The standard states: "If [a] character constant contains a single
character or escape sequence, its value is the one that results when
an object with type char whose value is that of the single character
or escape sequence is converted to type int." (3.1.3.4, page 30, line
33, Dec 88 draft) This means that if your chars are *signed*,
character constants with the high bit on will be *sign-extended* into
ints. If you want to avoid this, you will probably have to say
(unsigned char)'\xFF'
Then the conversion to int is done by zero-extending rather than
sign-extending.
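
One place this matters is comparing a value obtained from getchar() or getc(), which return the character as an unsigned char converted to int (or EOF). A minimal sketch, assuming 8-bit chars; is_ff_byte is a made-up name.

```c
/* Nonzero if c, a value such as getchar() returns, is the byte 0xFF.
   The cast makes the constant 255 even where plain char is signed;
   an uncast '\xFF' would be -1 there and would never match 255. */
int is_ff_byte(int c)
{
    return c == (unsigned char)'\xFF';
}
```

With the cast, a 0xFF byte in the input is recognized and a negative EOF value is not mistaken for it.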
In general, signed chars are losing, especially if you have
characters with values >127.
Dale
Dale Worley Compass, Inc. worley@compass.com
--
You have joined the Legion to die. We will send you where you can die.
-- Plaque reputedly posted in mess halls of the French Foreign Legion.