[mod.std.c] mod.std.c Digest V6#8

osd7@homxa.UUCP (Orlando Sotomayor-Diaz) (05/13/85)
From: Orlando Sotomayor-Diaz (The Moderator) <cbosgd!std-c>


mod.std.c Digest            Sun, 12 May 85       Volume 6 : Issue   8 

Today's Topics:
                       Alternate character sets
                         How does ## work...
----------------------------------------------------------------------

Date: Thu, 9 May 85 21:48:10 edt
From: Kevin Martin <ihnp4!watmath!kpmartin>
Subject: Alternate character sets
To: std-c@cbosgd

Has there been any thought of supporting alternate character sets (i.e. other
than the character set used for 'c's and "string"s)? At least one C compiler,
the Bell Labs GCOS compiler, has them (BCD `string`s), and many relatives
of this compiler "know" about grave accents and complain if you use them.
This would allow simpler use of Honeywell's BCD, CDC's funny 64-character set,
and also the dreaded rad50 character set used on many 16 and 32 bit machines.

It's *not* something new, and it's *not* 'syntactic sugar'.

                        Kevin Martin, UofW Software Development Group.
(They'll only re-write their linkers if the new ones can read old object
files, and the old files say neat things like "$      object" in BCD  :-))

[ The source character set must contain 52 letters (English alphabet,
upper and lower case), the ten decimal digits, and the following
29 graphic characters:

! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~

Notice that the grave accent (`) is not required, though a particular
implementation may add it.  Some of the 29 graphic characters above
can be mapped from trigraphs, as discussed here some time ago.

			-- Mod. -- ]

------------------------------

Date: Thu, 9 May 85 21:36:25 edt
From: Kevin Martin <ihnp4!watmath!kpmartin>
Subject: How does ## work...
To: std-c@cbosgd

I have only heard vague descriptions of what ## does: It concatenates tokens.
However, it appears to do so after formal parameter replacement in a macro.
But the body of a macro is already tokenized (see the definition for #define).
So ## must un-tokenize (back into a character stream) and re-tokenize. The
question is: How many tokens after the ## does it un-tokenize? Consider:

	#define foo(n) n ## 32Ugly

The macro prototype consists of the tokens:
	(formal parameter 1) '##' (unsigned constant '32U') (identifier 'gly')

Now we call the macro:
	foo(22.)

The new token sequence becomes:
	(floating constant '22.') '##' (unsigned constant '32U') (id 'gly')

Now, what is the resulting token sequence?
	(floating constant '22.32') (identifier 'Ugly')      ?

A clean way of avoiding these problems is to give a stricter definition of
'##': It joins *exactly* two tokens into *exactly* one. No leftovers. This
would make my example erroneous, since the 'U' would be left over after
re-scanning '22.32'.

Or does the draft standard already say this?
               Kevin Martin, UofW Software Development Group

[ "Macro names found in a macro argument are replaced appropriately.
A comma in the replacement token sequence does not change the actual
number of arguments to the macro. After all replacements have taken
place, each instance in the definition of a ## token is deleted, and
the tokens preceding and following it are concatenated to a single
token."

Section C.8.2, p. 49, Draft 85-008.

I'm not sure the paragraph above is sufficient to answer the question.
Any comments?
		-- Mod -- ]

------------------------------

End of mod.std.c Digest - Sun, 12 May 85 20:01:49 EDT
******************************
USENET -> posting only through cbosgd!std-c.
ARPA -> ... through cbosgd!std-c@BERKELEY.ARPA (NOT to INFO-C)
In all cases, you may also reply to the author(s) above.