[net.lang.c] Fuel for your flames: Things I would like in CPP

kpmartin@watmath.UUCP (Kevin Martin) (09/30/84)

There are several things which I would like in CPP, but before I describe
them, I would like to stifle the obvious flame:
*USING M4 IS NOT A SUITABLE SOLUTION*
1) Not every system has m4 (e.g. non-unix systems)
2) Having to manually use a pre-processor which is not well-integrated
   into the language is a pain; ask anyone who uses Troff+Eqn+Tbl how
   much fun it is to find the original input line which caused Troff
   to issue an error message.
3) m4 can't use CPP macros, nor vice versa. For instance, m4 can't use
   names such as NOFILE (in <sys/param.h>), since the include file
   is brought in by CPP's #include directive (which m4 doesn't understand),
   and even if m4 understood #include, it wouldn't understand #define.


Now, on to things I would like to see:
1) The ability to define a token to be the result of an expression:
   e.g.
		#eval XXX 1+2
   would define 'XXX' to have the value 3, just as if I had typed
		#define XXX 3
   The expression would be macro-expanded just as it is for #if.

   This is useful for two purposes. First, any constant used in the
   expression can later be redefined without changing the new constant, e.g.
		/* Define offsets */
		#define X 1
		#define Y 2
		#define Z 3

		/* Define first group */
		#define BASE 10
		#eval XX BASE+X
		#eval YY BASE+Y
		#eval ZZ BASE+Z

		/* Define second group */
		#define BASE 20
		 ... /* the value of XX, when expanded, remains unchanged */

   The other use is to generate new identifiers. This requires some form
   of token concatenation to work:
		#eval NUM NUM+1
		#define NAME tempvar/**/NUM
		/* (or however token concatenation works) */


2) The ability to put newlines in a macro definition, in order that
   the macro, when expanded, can perform other #directives.
   This should probably use a directive other than #define, since
   escaping the newline for #define already has meaning. Perhaps:
		#macro name(formals)
			/* put macro body here */
		#endm name
   (The 'name' on the #endm would be optional, and would default to the
   macro currently being defined; it is there so that the invocation
   of a macro can define other macros.)


3) The ability to temporarily send output to another file, to be later
   included by a form of the #include directive. This is similar to the
   diversion facility in m4.
   This facility, combined with macros (as in (2) above), allows the
   generation of parallel tables, since the macro can put the entry
   into the first table, switch to another diversion, and put the entry
   for the second table, then switch back.


4) Given proper macros, it is now useful to have the ability to
   issue error messages, perhaps including a severity (warning --
   the compile can continue, error -- CPP can continue, but the compiler
   should not be called up, fatal -- force CPP to give up immediately).
		#message 0 "Warning: NUM < 0"
   Maybe the quotes aren't needed...

The combination of the first three features allows generation of ragged
initialized arrays: Each row is given a name using #eval and token
concatenation, the row is given storage class 'static', and, by switching
to another diversion, the row pointer in the edge vector can be
initialized to point to the row.

Note that none of these changes will harm existing C programs.
And with a well-written CPP, none of these changes are terribly difficult
to implement.
                 Kevin Martin, UofW Software Development Group

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/02/84)

Those look like useful extensions to CPP.  Perhaps they should be put
in the "officially sanctioned extensions" section of the standards
document.  However, it might be counterproductive to insist on all C
compilers having them right away, since it forces a major rewrite
of most existing CPPs.

kendall@wjh12.UUCP (Sam Kendall) (10/02/84)

Kevin Martin proposes extensions to the CPP, making it a fundamentally
more powerful macro processor in several ways.  I will not examine
his proposals in detail, but any consideration of them should include
a lot of thought about the really dangerous and impossible-to-read
things that could be done with them.  Martin conveniently provides an
example:
> The combination of the first three features allows generation of ragged
> initialized arrays: Each row is given a name using #eval and token
> concatenation, the row is given storage class 'static', and, by switching
> to another diversion, the row pointer in the edge vector can be
> initialized to point to the row.
Heaven help us!

One should think very carefully before extending the CPP, because macro
semantics are extremely dangerous.  And they are dangerous in a
different way than pointers.  Complex usage of a powerful macro
processor such as MACRO-11 or m4 is not just hard to make readable; it
is impossible to make readable.  I consider the CPP's lack of power a
feature.

Ill-considered wish-lists can be fun; still, I would like to see
proposed features considered guilty until proven innocent, meaning that
people who propose them should make some attempt to justify them beyond
saying "Wow, look what we could do with this!"

	Sam Kendall	  {allegra,ihnp4,ima,amd}!wjh12!kendall
	Delft Consulting Corp.	    decvax!genrad!wjh12!kendall

henry@utzoo.UUCP (Henry Spencer) (10/02/84)

These suggestions are interesting, but I suspect that the response to
some of them should be "if the ANSI C committee tries to solve all the
world's problems, the ETA of the standard is roughly 2357 AD".  The
realities of building standards dictate that one sometimes just has to
say "there is no standard-conforming way to do X", because the list of
possible Xs is nearly infinite.  Deciding to build something with
approximately the same power as the current C language may be a cowardly
decision, but bravery is not necessarily a virtue in a standards effort.

It is also worth noting that (please correct me if I'm wrong on this,
Kevin) there is no operational experience with any of these things.
Standards committees have to decide, quite early on, whether they are
going to try to invent new solutions, or try to stick very closely
to things that are well-understood and proven in action.  The latter
approach is generally safer, and seems to be the prevailing mood of
the ANSI C folks.

> And with a well-written CPP, none of these changes are terribly difficult
> to implement.

You're sure?  Including the CPP's that are integrated into the scanners
of compilers?  I'm not saying you're wrong, just saying that this strikes
me as a very bold statement indeed.  There is more than one (good) way to
implement a CPP, and some of them may not lend themselves to such changes.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

aug@cybvax0.UUCP (Amit Green) (10/07/84)

>Kevin Martin writes <9225@watmath.UUCP>:
>	4) Given proper macros, it is now useful to have the ability to
>	   issue error messages, perhaps including a severity (warning --
>	   the compile can continue, error -- CPP can continue, but the compiler
>	   should not be called up, fatal -- force CPP to give up immediately).
>			#message 0 "Warning: NUM < 0"
>	   Maybe the quotes aren't needed...

I think this would be useful.  Currently I have to use something like this:
	#ifndef BUFSIZ
	# include "? Whoops - BUFSIZ not defined"
	#endif

>joemu@tekecs writes <4092@tekecs.UUCP>
>	Should benign (identical) redefinition of a macro be allowed?

I think this would be useful, and harmless.  I often redefine routines
(in different include files of course):
	extern char *retstring () ;
	extern char *retstring () ;

If I later changed retstring to a macro, I would want to be able to have
the following in both my include files also:
	#define retstring(s)	"s"