[net.lang.c] Multiple character "character constants"

stevesu@bronze.UUCP (Steve Summit) (10/06/83)

I'm sure I won't be the only one to post a followup on this, but
I haven't seen any yet, so here goes.

Multiple-character character constants are a horrible idea. 
Whenever I see one, I have no idea whether or not it's a typo. 
If it's a typo, I don't know if the extra characters are
extraneous or if a string was intended.  If it's not a typo, I
don't know if the programmer knew what he was doing, or if the
program has a chance of working on my machine.

Character constants are explicitly defined to be a single
character.  (Kernighan and Ritchie, page 35, and again on page
180.)  On the other hand, Appendix A, under "portability
considerations" notes (page 212) that "Since character constants
are really objects of type int, multi-character character
constants are permitted.  The specific implementation is very
machine dependent, however, because the order in which characters
are assigned to a word varies from one machine to another."

I didn't really understand why multi-character constants would be
useful in the example given in the original article.  (I admit
that I didn't try very hard.)  I would never use a
machine-dependent "feature" like that.  Portability is a very
important issue.  I jump back and forth between vaxes and 11's,
and I like to take my programs with me.  I only write
machine-dependent code when I have to (when I'm using assembly
language to do something I can't do in C, or writing code that deals
with object file formats).  If you really want to do a switch on
multiple-character sequences, use a bunch of strcmp's or a hash
function to convert a genuine string to a genuine integer.
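
As a rough illustration of the strcmp approach, here is a minimal
sketch in present-day C.  The enum, the table, and the name
lookup_kind are made up for this example, and the four names are
simply the ones from the original article's switch.

```c
#include <string.h>

/* Hypothetical integer codes, one per header name (illustration only). */
enum kind { K_UNKNOWN, K_EQN, K_NRFF, K_FORT, K_PASC };

static const struct {
    const char *name;
    enum kind kind;
} kinds[] = {
    { "eqn ", K_EQN  },
    { "nrff", K_NRFF },
    { "fort", K_FORT },
    { "pasc", K_PASC },
};

/* Map a NUL-terminated name to an integer code with plain strcmp calls. */
enum kind lookup_kind(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof kinds / sizeof kinds[0]; i++)
        if (strcmp(name, kinds[i].name) == 0)
            return kinds[i].kind;
    return K_UNKNOWN;
}
```

The result is an ordinary enum value, so switch (lookup_kind(name))
behaves the same on every machine.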

                                         Steve Summit
                                         tektronix!tekmdp!stevesu

chris@umcp-cs.UUCP (10/08/83)

Troff uses two-character character constants for the same reason it
uses two characters for everything:  they happen to fit nicely into
an int on a PDP-11.  This makes troff fit on a PDP-11.  (It also speeds
things up considerably over using two "char"s.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris.umcp-cs@UDel-Relay

ken@turtleva.UUCP (Ken Turkowski) (10/10/83)

I realize that I may be opening up a can of worms with this question,
but I am trying to put together a package that will verify headers for
files in a machine-independent way.

Everyone knows that different machines have their own way of packing
bytes into words and longwords.
On the big-endian side are IBM and the 68000 with their (excuse me, K&R)
	struct long { char byte0, byte1, byte2, byte3; };
On the little-endian side are the VAX and the 16032 with
	struct long { char byte3, byte2, byte1, byte0; };
and the mongrel PDP-11 with
	struct long { char byte1, byte0, byte3, byte2; };
so that only the ordering of bytes in character arrays remains constant
among different machines, not the ordering of bytes in shorts or longs.
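
The layouts above can be observed by copying the bytes of a known
32-bit value into a char array.  This is only a sketch in present-day
C, assuming a compiler that provides <stdint.h>; the name byte_layout
is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy the in-memory bytes of the 32-bit value 0x00010203 into out[0..3].
 * Each byte of the value holds its own index, counted from the most
 * significant end, so out[] lists byte0..byte3 in storage order:
 *   0,1,2,3 for the big-endian layout,
 *   3,2,1,0 for the little-endian layout,
 *   1,0,3,2 for the PDP-11 layout described above.
 */
void byte_layout(unsigned char out[4])
{
    uint32_t v = 0x00010203UL;

    memcpy(out, &v, sizeof v);
}
```

Whatever comes out, it is a property of the machine and not of the
program, which is why the array of bytes is the only safe common
ground.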

Machine independence therefore requires that programs treat the byte
as the holiest quantum of information,
and that all other data structures be derived from the ordering of bytes.
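
One way to follow that rule is to build every larger quantity from
individual bytes with shifts, so that the order is fixed by the file
format rather than by the machine.  A minimal sketch, assuming the
format stores the most significant byte first (that choice, and the
names get32 and put32, are assumptions for illustration):

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes, most significant byte first. */
uint32_t get32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a 32-bit value as four bytes in the same order. */
void put32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFF);
    p[1] = (unsigned char)((v >> 16) & 0xFF);
    p[2] = (unsigned char)((v >> 8) & 0xFF);
    p[3] = (unsigned char)(v & 0xFF);
}
```

Because only shifts and masks on values are involved, and no pointer
to an int or long is ever aimed at the byte array, the same four bytes
yield the same number on any of the machines listed above.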

Now, suppose that file headers are four bytes, and that it is desired
to do a switch on this quantity.  C does allow the use of multiple
character "character constants", supposedly accommodating as many
characters as can fit into an int.  On a 32-bit machine,
this means that 4 byte character constants can be constructed.
For example,
	switch (headercode) {
	    case 'eqn ':	/* each case would handle one file type */
	    case 'nrff':
	    case 'fort':
	    case 'pasc':
		break;
	}
Now, the question is this:  how are the characters packed into the
int?  In the same way that strings would be packed into a 4-character
array starting at the same address?  This does seem most reasonable,
although I can't seem to find anything definitive on the subject.
Also, do all C compilers support the multiple character "character
constants"?
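
Since K&R (quoted elsewhere in this thread) leaves the packing of a
multi-character constant to the implementation, one way to keep the
switch without depending on it is to spell out the packing.  The
sketch below is an assumption-laden illustration, not anything a
particular compiler provides: the TAG macro and the describe function
are hypothetical, and putting the first character in the most
significant byte is an arbitrary choice.

```c
#include <stdint.h>

/* Hypothetical macro: pack four characters into one 32-bit constant,
   first character in the most significant byte. */
#define TAG(a, b, c, d) \
    (((uint32_t)(unsigned char)(a) << 24) | \
     ((uint32_t)(unsigned char)(b) << 16) | \
     ((uint32_t)(unsigned char)(c) << 8)  | \
      (uint32_t)(unsigned char)(d))

/* Classify a four-byte header; the same macro packs both the header
   bytes and the case labels, so the two always agree. */
const char *describe(const unsigned char hdr[4])
{
    uint32_t headercode = TAG(hdr[0], hdr[1], hdr[2], hdr[3]);

    switch (headercode) {
    case TAG('e', 'q', 'n', ' '): return "eqn ";
    case TAG('n', 'r', 'f', 'f'): return "nrff";
    case TAG('f', 'o', 'r', 't'): return "fort";
    case TAG('p', 'a', 's', 'c'): return "pasc";
    default:                      return "unknown";
    }
}
```

The case labels are ordinary integer constant expressions, so this
does not need multi-character constants at all.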

			Ken Turkowski
		    CADLINC, Palo Alto
		{decwrl,amd70}!turtlevax!ken