[comp.std.c] Initialisation of unsigned strings

eru@tnvsu1.tele.nokia.fi (07/28/89)

I often use the "unsigned char" type to represent characters to avoid the
nonsense with negative letters (characters with the MSB set are commonly used
in ordinary text here). Thus arrays of "unsigned char" are sometimes
initialised with string constants. Then one day I compiled some such code with
GNU C (gcc) using the switches -ansi -pedantic, and it said that my
initialisations are non-ANSI.

First I thought that gcc was being too pedantic, but unfortunately the May
1988 draft seems to agree: 3.1.2.5 classifies "unsigned char" as an unsigned
integer type, and 3.5.7 says that an array of character type may be initialised
by a character string literal. Why could it not say "array of plain, signed or
unsigned char"?

Thus it appears that those who want to program in pure ANSI-C without the
fuzziness introduced by the implementation-dependent behaviour of plain char
lose the convenience of string literals. This ought to be changed in the
next version of the standard (if there ever is one). Prior art exists:
all the C-compilers with unsigned char that I have used compile this usage
of string literals.


Erkki Ruohtula    ! Nokia Telecommunications
eru@tele.nokia.fi ! P.O. Box 33 SF-02601 Espoo, Finland

dfp@cbnewsl.ATT.COM (david.f.prosser) (07/28/89)

In article <438@mjolner.tele.nokia.fi> eru@tnvsu1.tele.nokia.fi () writes:
>Thus it appears that those who want to program in pure ANSI-C without the
>fuzziness introduced by the implementation-dependent behaviour of plain char
>lose the convenience of string literals. This ought to be changed in the
>next version of the standard (if there ever is one). Prior art exists:
>all the C-compilers with unsigned char that I have used compile this usage
>of string literals.

The pANS requires that arrays of any of the three char types can be
initialized with a string literal.  The definition of *character type*
can be found in section 3.1.2.5 (like most of the other mumble types):

	The three types char, signed char, and unsigned char are
	collectively called the character types.

However, one cannot initialize a pointer to a signed or unsigned char
with a string literal (without a cast).

	char ca[] = "abc";				/* valid */
	signed char sca[] = "abc";			/* valid */
	unsigned char uca[] = "abc";			/* valid */

	char *cp = "abc";				/* valid */
	signed char *scp = "abc";			/* invalid */
	unsigned char *ucp = "abc";			/* invalid */

	signed char *sp = (signed char *)"abc";		/* valid */
	unsigned char *up = (unsigned char *)"abc";	/* valid */

Dave Prosser	...not an official X3J11 answer...

walter@hpclwjm.HP.COM (Walter Murray) (07/28/89)

>First I thought that gcc was being too pedantic, but unfortunately the May
>1988 draft seems to agree: 3.1.2.5 classifies "unsigned char" as an unsigned
>integer type, and 3.5.7 says that an array of character type may be initialised
>by a character string literal. Why could it not say "array of plain, signed or
>unsigned char"?

Note that "unsigned char" IS a character type, as well as an unsigned
integer type.  In the May, 1988, draft, see page 23, line 9 (3.1.2.5).
So it is legal to initialize an array of unsigned char with a character
string literal.

Walter Murray
Not speaking for X3J11
----------