[comp.lang.c] Character and string literals

diamond@diamond.csl.sony.junet (Norman Diamond) (07/11/89)

comp.lang.c has been added to the distribution of this followup.

Mr. Sommarskog tested the Eiffel compiler to see which characters are
accepted in character and/or string literals.  The Eiffel compiler
generates portable assembly code (C, of course) as intermediate code.

In article <102@enea.se> sommar@enea.se (Erland Sommarskog) writes:

>  Now, what about the error the C compiler detected? The cause is
>the very last string, which contains character 255. (Which corre-
>sponds to lowercase dotted "y" in 8859/1.) Apparently the C compiler
>takes this as end of file. (My knowledge of C and Unix is little, but
>isn't -1 often a code for end of file? And -1 and 255 are the same
>thing for a byte.)

Indeed yes.  There are periodic flamefests in comp.lang.c, reminding
C programmers that they should getchar() into a short or int, instead
of into a char, so that they can test the int value correctly against
the constant EOF, which is -1.  Looks like some programmer wrote a
C compiler without knowing how to use C.  (This happens a lot.)
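
To make the point concrete, here is a minimal sketch of the two reading
loops. It assumes a typical two's-complement machine where EOF is -1 and
converting 255 to signed char yields -1 (both are common, neither is
guaranteed). The struct source and next_byte() names are hypothetical; they
stand in for getchar() on an in-memory buffer so the example does not depend
on real input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical in-memory stand-in for getchar(): returns the next byte as
   an unsigned char converted to int (0..255), or EOF when input runs out. */
struct source {
    const unsigned char *data;
    size_t len;
    size_t pos;
};

static int next_byte(struct source *s)
{
    assert(s != NULL);
    return s->pos < s->len ? (int)s->data[s->pos++] : EOF;
}

/* Buggy: the result is narrowed to signed char before the EOF test, so
   byte 255 typically becomes -1 and is mistaken for end of file. */
static size_t count_bytes_buggy(struct source *s)
{
    size_t n = 0;
    signed char c;
    while ((c = (signed char)next_byte(s)) != EOF)
        n++;
    return n;
}

/* Correct: keep the full int, so 255 and EOF remain distinct values. */
static size_t count_bytes_correct(struct source *s)
{
    size_t n = 0;
    int c;
    while ((c = next_byte(s)) != EOF)
        n++;
    return n;
}
```

Fed the five bytes 'a', 'b', 255, 'c', 'd', the correct loop counts all
five, while on the assumed platform the buggy loop stops at the 255, which
is the same symptom Mr. Sommarskog saw from the C compiler.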

--
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
 The above opinions are claimed by your machine's init process (pid 1), after
 being disowned and orphaned.  However, if you see this at Waterloo, Stanford,
 or Anterior, then their administrators must have approved of these opinions.

sommar@enea.se (Erland Sommarskog) (07/30/89)

(This comes from comp.lang.eiffel originally. I cross-posted to
comp.lang.c and .misc and directed followup to the latter group,
since I see this as a general language issue. And, besides, I don't
read comp.lang.c.)

John Cowan (cowan@marob.masa.com)  = ">"
Norman Diamond (diamond@csl.sony.junet) = ">>"
Me = ">>>"

I was testing the Eiffel compiler to see which non-ASCII characters
it accepted and which it rejected. The compiler generates C as portable
assembler, and one of the characters made the C compiler choke:

>>>  Now, what about the error the C compiler detected? The cause is
>>>the very last string, which contains character 255. (Which corre-
>>>sponds to lowercase dotted "y" in 8859/1.) Apparently the C compiler
>>>takes this as end of file. (My knowledge of C and Unix is little, but
>>>isn't -1 often a code for end of file? And -1 and 255 are the same
>>>thing for a byte.)

>>Indeed yes.  There are periodic flamefests in comp.lang.c, reminding
>>C programmers that they should getchar() into a short or int, instead
>>of into a char, so that they can test the int value correctly against
>>the constant EOF, which is -1.  Looks like some programmer wrote a
>>C compiler without knowing how to use C.  (This happens a lot.)

>On the other hand, it would be better for the Eiffel compiler to emit
>the sequence "\377" in this case, rather than the character itself.
>No C program should contain characters from outside the C character set.
>It's not illegal, merely a poor idea.
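
A rough sketch of what emitting "\377" could look like in a code generator
follows. This is not how the Eiffel compiler works; escape_for_c() is a
hypothetical helper, and the sketch assumes an ASCII-based execution
character set. It writes printable ASCII as-is and turns every other byte,
plus the double quote and the backslash, into a three-digit octal escape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy len bytes of src into dst as the body of a C
   string literal. Bytes outside printable ASCII, and '"' and '\\', become
   three-digit octal escapes such as \377. Returns the number of characters
   written (not counting the terminating NUL), or -1 if dst is too small. */
static int escape_for_c(const unsigned char *src, size_t len,
                        char *dst, size_t cap)
{
    size_t out = 0;
    size_t i;

    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return -1;

    for (i = 0; i < len; i++) {
        unsigned char b = src[i];
        int plain = b >= 0x20 && b < 0x7F && b != '"' && b != '\\';
        size_t need = plain ? 1 : 4;       /* 4 = backslash + 3 digits */

        if (out + need + 1 > cap)          /* +1 leaves room for the NUL */
            return -1;
        if (plain)
            dst[out++] = (char)b;
        else
            out += (size_t)sprintf(dst + out, "\\%03o", (unsigned)b);
    }
    dst[out] = '\0';
    return (int)out;
}
```

Always writing three octal digits matters: an octal escape consumes at most
three digits, so a digit that follows in the original text cannot be
absorbed into the escape.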

In that case C had better extend its character set pretty quickly. And
all other languages too. Try to convince a user of 8859/1 that he
has just made a poor choice of character. The lowercase dotted "y"
looks as legal to him as any other printable character.

John Cowan then says, with one character per line:
>inews is a fascist

Or simply replace ">" with some other string, " >" for example.

I usually don't comment on signatures in public, but:
>		Charles li reis, nostre emperesdre magnes
>		Set anz toz pleins at estet in Espagne.
What on Earth is this language? Galician? Provencal?
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"Hey poor, you don't have to be Jesus!" - Front 242