[comp.lang.c] Escaped Parenthesis and Portable Sign-extension

greg@noel.CTS.COM (J. Gregory Noel) (02/22/90)

Mark Shepard (shepard@upas.CS.ORST.EDU) writes:
>	Say I have a variable X of some integer (char, short, int, long) type
>	Xtype_t, and that I want to print out this variable with printf()
>	in many places in my program.  Now, I want this program to be as
>	easy to maintain as possible, so I don't want to spread information
>	about the actual type of Xtype_t all throughout the program via my
>	printf() format statements....

I've encountered the same thing.  Try this:
	typedef long Xtype_t;
	#define Xtype_fmt "%ld"
	...
	Xtype_t X;
	...
	printf("The value of X is " Xtype_fmt "\n", X);

Because of the string catenation in ANSI C, this is very flexible and portable.
It can be easily extended for hex or octal formats.  (I don't really see the
need to print a signed number as unsigned (or vice versa), but this scheme
can be extended for that as well.)  I don't know if this is the Zen solution,
but it seems the obvious way to do it, short of C++.

In fact, when the committee was running around creating all the *_t typedefs,
I wonder why they didn't mandate *_fmt #defines to permit them to be printed.
(I know, I know, "lack of current implementation experience."  But that didn't
stop them from inventing a bunch of *_t typedefs....  Was it ever discussed,
Doug?  Or would this be "quality of implementation?")
-- 
-- J. Gregory Noel, UNIX Guru       greg@noel.cts.com  or  greg@noel.uucp

shepard@upas.CS.ORST.EDU (Mark Shepard) (02/22/90)

I have two questions about portability (forgive me if these are too simplistic
but I'd like a second opinion).  

First, parentheses:  In a program I'm currently hacking on, the chars '(' and
')', when they appear as char constants, are escaped: '\(' and '\)'.
As far as I know ( and ) have no special meaning, and in K&R C the escape
is just ignored (right?).  But "gcc" complained, and I was going to change
all the \( and \) to ( and ) to keep it happy, but...wait a minute...SOMEBODY
must have put those there for a reason?!?  Is there a reason these are escaped?
If I change 'em back to satisfy gcc, will this program break on some other 
system?  What does ANSI say about escaping characters which don't need it?
(undefined?)

Okay, now the second thing:

( You're gonna think this is REALLY WEIRD, so please explain to me the
  "enlightened" solution. )

Problem:
	Say I have a variable X of some integer (char, short, int, long) type
	Xtype_t, and that I want to print out this variable with printf()
	in many places in my program.  Now, I want this program to be as
	easy to maintain as possible, so I don't want to spread information
	about the actual type of Xtype_t all throughout the program via my
	printf() format statements, which have to specify the width of the
	integer-type which I'm passing.  e.g.:
	
		typedef long Xtype_t;
		...
		Xtype_t X;
		...
		printf("%ld",X);

	If I later change Xtype_t to int or short and if the size of 
	ints, shorts, and/or longs aren't the same, I'll have to change all
	the printf's.  Yuck.

Solution 1:  Put a suitable printf-format string-constant or something in
	a .h file--change this when you change Xtype_t.

	#define print_Xtype(x)	printf("%ld",x);
	typedef long Xtype_t;
	...
	print_Xtype(X);

	This and similar schemes work, but I reject them because they're so
	cumbersome...after all, that's what's so nice about printf!

Solution 2:  Cast everything to a long before printing.
	Since the size is unknown, we just force it to be a long and then print
	it as a long--so we don't have to change formats when we change the
	type:

	printf("%ld",(long)X);

	Okay, this is pretty good, but (now I'm going to change the problem :-)
	let's say I want to print X as both signed and unsigned (or in hex, 
	which is always unsigned).  If I do:

	printf("dec=%ld hex=%lx",(long)X,(long)X);

	if X is in fact smaller than a long and is a signed type, it will be
	sign-extended, which is fine if I print it as signed, but
	if I print it as unsigned, I'll get something like this:

	/* ints are 16 bits, longs are 32 bits */
	X=-2;
	...
	printf("dec=%ld hex=%lx",(long)X,(long)X);

	output:
	dec=-2 hex=fffffffe

	instead of the correct output:
	dec=-2 hex=fffe

	( Yes, I could probably screw around with field-sizes, etc., in printf,
	but that sort of defeats the purpose. )

My Solution:
	I wrote two macro/functions, UL() and SL(), which take any integer
	type and convert that type to a long with (SL) sign-extension or 
	without (UL) sign-extension.  Thus, I can write:

	printf("dec=%ld hex=%lx",SL(X),UL(X));

	and get the correct result, regardless of how I change the actual
	type of X in some header file.

	Here are UL and SL:

/**********************************************************************/	
/* ulsl.h */
/* UL(x), SL(x) -- convert x to long-int, with or w/o sign-extension */
#include <stddef.h>	/* for size_t */

#define UL(x)	_UL((void *)&(x),sizeof(x))
#define SL(x)	_SL((void *)&(x),sizeof(x))

extern	unsigned long	_UL	( void * , size_t );
extern	long		_SL	( void * , size_t );

/* ulsl.c */
/* _UL -- convert any integer type to unsigned-long W/O sign-extension */
unsigned long
_UL( p, s )
	void * p;
	size_t s;
{
	if (s==sizeof(char))	return (unsigned char)*(char *)p;
	if (s==sizeof(short))	return (unsigned short)*(short *)p;
	if (s==sizeof(int))	return (unsigned int)*(int *)p;
	if (s==sizeof(long))	return (unsigned long)*(long *)p;
	return 0;
	}

/* _SL -- convert any integer type to long WITH sign-extension */
long
_SL( p, s )
	void * p;
	size_t s;
{
	if (s==sizeof(char))	return (char)*(char *)p;
	if (s==sizeof(short))	return (short)*(short *)p;
	if (s==sizeof(int))	return (int)*(int *)p;
	if (s==sizeof(long))	return (long)*(long *)p;
	return 0;
	}
/**********************************************************************/	

	So, am I crazy and is this really dumb, or what? 

	I don't know all the tricks to writing "portable"
	or "maintainable" code, so if someone could tell me how this
	is *supposed* to be done, I'd appreciate it.

	BTW, is what I've done here legal in ANSI?

	Mark Shepard
	shepard@cs.orst.edu || ...!hplabs!hp-pcd!orstcs!shepard

henry@utzoo.uucp (Henry Spencer) (02/24/90)

In article <16119@orstcs.CS.ORST.EDU> shepard@upas.CS.ORST.EDU (Mark Shepard) writes:
>... What does ANSI say about escaping characters which don't need it?
>(undefined?)

Right.  The effect of any escape sequence other than those defined in the
standard is undefined.
-- 
"The N in NFS stands for Not, |     Henry Spencer at U of Toronto Zoology
or Need, or perhaps Nightmare"| uunet!attcan!utzoo!henry henry@zoo.toronto.edu