[comp.lang.c] Trigraphs and Pragmas

beckenba@cit-vax.Caltech.Edu (Joe Beckenbach) (05/31/88)

[another novice's ventures into the waters...]

	I've seen a fair amount of unhappiness on the net about trigraphs
from those who don't want to have to worry about them. I've even seen a
suggestion for
	#trigraph		[no disrespect, but UGH!]
and for
	#define ??<	\173	[strike 1: ??< not always substitutable,
				 strike 2: \ itself needs trigraph ]

Why not something like a `standard' pragma:
	#pragma trigraph warn
which would warn about possible conversions when the programmer intended
that no trigraphs be written? Note that this pragma can be ignored on C
compilers which NEED trigraphs to express the language [ignored but warned
about, naturally].
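
[A minimal sketch, in C, of the kind of scan a `#pragma trigraph warn' might
trigger. The name warn_trigraphs and the message wording are invented here for
illustration; no real compiler is being described. The string literals spell
'?' as \? where needed so that the sketch itself contains no trigraphs.]

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Third characters of the nine ANSI trigraphs (each follows two '?'). */
static const char TRIGRAPH_TAILS[] = "=/'()!<>-";

/* Return nonzero if the three characters starting at p form a trigraph. */
static int is_trigraph_at(const char *p)
{
    return p[0] == '?' && p[1] == '?' && p[2] != '\0'
        && strchr(TRIGRAPH_TAILS, p[2]) != NULL;
}

/*
 * Hypothetical helper: print one warning per trigraph found in src and
 * return how many were found. Line numbers are 1-based.
 */
size_t warn_trigraphs(const char *src, FILE *err)
{
    size_t count = 0;
    unsigned long line = 1;
    const char *p;

    for (p = src; *p != '\0'; p++) {
        if (*p == '\n') {
            line++;
        } else if (is_trigraph_at(p)) {
            fprintf(err, "line %lu: trigraph '%.3s' would be converted\n",
                    line, p);
            count++;
            p += 2;            /* skip the rest of this trigraph */
        }
    }
    return count;
}
```

Calling warn_trigraphs(text, stderr) on a whole source file would report each
candidate with its line number and leave the decision to the programmer.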

To add to trigraph sequences, perhaps something like:
	#pragma trigraph LBRACE &<
to allow &< to map to {		[notice &< is never legal C!]
	#pragma trigraph o240 o140
to mess up some character mappings by replacing all occurrences of the
source file character \140 [ASCII backquote, national whatever] with the
8-bit character \240 [high ASCII space, national whatever #2].

	With appropriate work, the preprocessor could have equivalent
semantics for several different sequences mapping to characters. I
intentionally am limiting myself to mappings from  (n chars -> 1 char)
to avoid co-opting #define; if the extension (n chars -> m chars) is allowed,
the #define is a special case.
	For example, I might want to use the sequence @@@ to delimit
certain subsections of my files, e.g. where I need to do more work. Then a
simple remapping within the preprocessor handles this:
#pragma trigraph SPACE		@
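
[The (n chars -> 1 char) remapping described above can be sketched as a small
table-driven pass. This is only an illustration under stated assumptions: the
struct mapping type and the remap function are hypothetical names, the table
stands in for whatever the pragmas would have registered, and earlier entries
are assumed to win when two sequences match at the same place.]

```c
#include <stddef.h>
#include <string.h>

/* One user-defined mapping: a source sequence and the character it denotes. */
struct mapping {
    const char *seq;   /* n source characters, n >= 1 */
    char        ch;    /* the single replacement character */
};

/*
 * Hypothetical remapping pass: copy src to dst, replacing every occurrence
 * of a listed sequence with its character. dst must have room for
 * strlen(src) + 1 bytes; the output can only shrink because each mapping
 * is n -> 1.
 */
void remap(const char *src, char *dst, const struct mapping *tab, size_t ntab)
{
    while (*src != '\0') {
        size_t i;

        for (i = 0; i < ntab; i++) {
            size_t n = strlen(tab[i].seq);
            if (n > 0 && strncmp(src, tab[i].seq, n) == 0) {
                *dst++ = tab[i].ch;
                src += n;
                break;
            }
        }
        if (i == ntab)          /* no mapping matched: copy one character */
            *dst++ = *src++;
    }
    *dst = '\0';
}
```

With a table holding { "&<", '{' } and { "@", ' ' }, the input
"main() &< x@@@y; }" comes out as "main() { x   y; }".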

	Another example, the ANSI trigraphs under this syntax would become:
#pragma trigraph HASH		??=
#pragma trigraph BSLASH		??/
#pragma trigraph CARET		??'
#pragma trigraph LBRACKET	??(
#pragma trigraph RBRACKET	??)
#pragma trigraph VBAR		??!
#pragma trigraph LBRACE		??<
#pragma trigraph RBRACE		??>
#pragma trigraph TILDE		??-
	or
#pragma trigraph HASH		o077o077o075
	if ASCII and the relevant ISO character set have the same octal codings.

If ASCII and the relevant ISO character standard differ in the 7-bit range, then
oxxx will stand for the ASCII 7-bit coding and something like ixxx will stand
for the ISO character equivalent.
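
[A rough sketch of how the oxxx spelling might be decoded, assuming each unit
is the letter o followed by exactly three octal digits. The function name
decode_octal_spec is hypothetical, and the ixxx variant is not handled since
its coding is left open above.]

```c
#include <stddef.h>

/*
 * Decode a spelling such as "o077o077o075" into the characters it names.
 * Returns the number of characters written (not counting the terminating
 * NUL), or -1 if the spelling is malformed, names NUL or a value above
 * 0377, or does not fit in out.
 */
int decode_octal_spec(const char *spec, char *out, size_t outsz)
{
    size_t n = 0;

    if (outsz == 0)
        return -1;
    while (*spec != '\0') {
        unsigned value = 0;
        int i;

        if (*spec++ != 'o')
            return -1;
        for (i = 0; i < 3; i++) {
            if (*spec < '0' || *spec > '7')
                return -1;
            value = value * 8 + (unsigned)(*spec++ - '0');
        }
        if (value == 0 || value > 255 || n + 1 >= outsz)
            return -1;
        out[n++] = (char)value;
    }
    out[n] = '\0';
    return (int)n;
}
```

On an ASCII machine, decoding "o077o077o075" yields the three characters
question mark, question mark, equals sign, matching the HASH line above.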


	On related notes:
1- Where can I order the ANSI C standard?
2- Where can I find the registry of international (esp. character) standards,
	and also where can I get ordering information for such a body's work?
3- Who are the best people to circulate this sort of "standard #pragma" ideas,
	and will there be a standardized #pragma library available?
	[Or is that completely counter to the intention of the standards
	committee? Any committeemen (or Ritchie, or whoever) willing to comment?]
4- And how long until the first completely ANSI, non-pre-ANSI C will be out?
	[When this happens, then we all will know the transition is at hand.
	Of course, the wise compiler supporter will offer automatic filters
	to speed conversion between old and new.]

-- 
Joe Beckenbach	beckenba@csvax.caltech.edu	Caltech 1-58, Pasadena CA 91125
	BS E&AS (CS) 1988

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/01/88)

In article <6757@cit-vax.Caltech.Edu> beckenba@cit-vax.UUCP (Joe Beckenbach) writes:
>1- Where can I order the ANSI C standard?

There is no standard yet.  Public review draft copies can be obtained
from Global Engineering Documents.  The third (and hopefully last) formal
public review draft should be at the printers by now; contact the X3
Secretariat of CBEMA for official information about this (and please
post what you find out, if it hasn't already been posted).
	X3 (CBEMA): (202)737-8888

>3- Who are the best people to circulate this sort of "standard #pragma" ideas,
>	and will there be a standardized #pragma library available?

The X3J11 committee does not wish to serve as a #pragma clearing house.
I think some of the committee members are interested in setting one up,
though.

>4- And how long until the first completely ANSI, non-pre-ANSI C will be out?

Obviously this depends on the availability of the wording for the
final, official standard.  That cannot happen before the end of
August, 1988.  It is likely to be substantively the same as the
third formal public review draft, but nobody is guaranteeing that.

karl@haddock.ISC.COM (Karl Heuer) (06/01/88)

In article <6757@cit-vax.Caltech.Edu> beckenba@cit-vax.UUCP (Joe Beckenbach) writes:
>To add to trigraph sequences, perhaps something like:
>	#pragma trigraph LBRACE &<
>to allow &< to map to {		[notice &< is never legal C!]

Allowing the character set to change mid-stream would create a whole new mess
of problems.  (Even the simple "#pragma trigraphs off" is troublesome, since
trigraphing precedes comment removal which precedes preprocessor action.)
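
[To make the ordering concrete, here is a minimal sketch of trigraph
replacement as a separate first pass over the text. The function name
replace_trigraphs is invented for illustration, and a real implementation may
interleave the phases rather than run them one after another. The point is
that this pass has no notion of comments or directives: by the time anything
could recognize a #pragma line, its text (possibly including the # itself) has
already been through replacement.]

```c
#include <string.h>

/*
 * Sketch of the first translation phase: copy src to dst with the nine
 * ANSI trigraphs replaced by the characters they stand for. dst needs room
 * for strlen(src) + 1 bytes.
 */
void replace_trigraphs(const char *src, char *dst)
{
    static const char tails[] = "=/'()!<>-";   /* third character of each */
    static const char repl[]  = "#\\^[]|{}~";  /* what it stands for      */

    while (*src != '\0') {
        const char *hit = NULL;

        if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr(tails, src[2]);
        if (hit != NULL) {
            *dst++ = repl[hit - tails];   /* same index in both tables */
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Note that a backslash trigraph at the end of a comment line becomes a real
backslash-newline here, so the later line-splicing and comment-removal steps
see something different from what the programmer typed.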

Given the constraint that trigraphs are staying in the language, I think that
the best solution is: (users:) make sure your code doesn't contain "accidental
trigraphs" (sed -e "s;??\\([-=(/)'<!>]\\);?\\\\?\\1;g"   will fix them), and
(implementors:) issue a warning if a file contains both trigraphs and
trigraphable characters.
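
[One possible reading of the implementor-side warning, sketched in C. It
assumes "trigraphable characters" means the nine characters that trigraphs
stand for, and the function name mixes_trigraphs_and_targets is made up for
this example. A file written for a restricted character set should have no
reason to contain those characters directly, so seeing both suggests the
trigraphs are accidental.]

```c
#include <string.h>

/*
 * Hypothetical check: return 1 if src contains at least one ANSI trigraph
 * and also at least one of the nine characters that trigraphs exist to
 * replace, else 0.
 */
int mixes_trigraphs_and_targets(const char *src)
{
    static const char tails[]   = "=/'()!<>-";   /* third trigraph chars */
    static const char targets[] = "#\\^[]|{}~";  /* replaced characters  */
    int saw_trigraph = 0;
    int saw_target = 0;
    const char *p;

    for (p = src; *p != '\0'; p++) {
        if (p[0] == '?' && p[1] == '?' && p[2] != '\0'
                && strchr(tails, p[2]) != NULL) {
            saw_trigraph = 1;
            p += 2;             /* skip the rest of this trigraph */
        } else if (strchr(targets, *p) != NULL) {
            saw_target = 1;
        }
    }
    return saw_trigraph && saw_target;
}
```

A compiler using something like this would stay quiet on files that use
trigraphs consistently and on files that never use them at all.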

Ignoring that constraint, I think that the best solution is to remove them.
I'll discuss this in a separate posting.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

henry@utzoo.uucp (Henry Spencer) (06/01/88)

> To add to trigraph sequences, perhaps something like:
> 	#pragma trigraph LBRACE &<
> to allow &< to map to {		[notice &< is never legal C!]

This does somewhat depend on the outcome of the "can #pragma change the
semantics of the language?" debate (although in practice an implementor
who felt the need would undoubtedly do it anyway).

>	#pragma trigraph o240 o140

This is unfortunately character-set specific; it would be nice to avoid
that.

> 1- Where can I order the ANSI C standard?

You can't, because it doesn't exist yet.  The second-public-comment draft,
now obsolete, can probably still be had from Global Engineering Documents,
(714)261-1455, and the third-public-comment draft presumably will be available
from them when it comes out (which should be any time now).  The final
standard will be available from ANSI, of course.

>2- Where can I find the registry of international (esp. character) standards,
>	and also where can I get ordering information for such a body's work?

Probably the best place to ask both of these questions is ANSI; unfortunately
I don't have their address handy.

>	and will there be a standardized #pragma library available?

X3J11 decided it did not want to get involved in standardizing #pragma,
which is implementation-specific (i.e., nonstandard) by definition.  There
is some interest in some sort of registry for #pragmas, but X3J11/ANSI
almost certainly isn't going to be involved in that, and I have no more
definite information at this time.

> 4- And how long until the first completely ANSI, non-pre-ANSI C will be out?

Ask your compiler suppliers...  Actually any near-future ANSI compiler is
going to support pre-ANSI forms as well, as a practical consideration.
-- 
"For perfect safety... sit on a fence|  Henry Spencer @ U of Toronto Zoology
and watch the birds." --Wilbur Wright| {ihnp4,decvax,uunet!mnetor}!utzoo!henry