[comp.lang.c] typeof isn't enough to define swap correctly

andy@rocky.STANFORD.EDU (Andy Freeman) (09/04/87)

In article <1354@dataio.Data-IO.COM>, bright@dataio.Data-IO.COM (Walter Bright) writes:
> ... One thing I want, which the committee always seems to poo-poo, is a
> typeof. Having a typeof would make creation of 'generic' routines possible:

> 	#define swap(a,b)	{ typeof(a) tmp; tmp = a; a = b; b = tmp; }

Even given typeof, this definition of swap is buggy.  (It has at least
two bugs.)  I've never figured out how to define a C swap macro that
works - I'm willing to believe it is impossible.

Anyone have a use for typeof that isn't doesn't have the major bug that
the swap macro defined above has?

-andy
-- 
Andy Freeman
UUCP:  {arpa gateways, decwrl, sun, hplabs, rutgers}!sushi.stanford.edu!andy
ARPA:  andy@sushi.stanford.edu
(415) 329-1718/723-3088 home/cubicle

crowl@cs.rochester.edu (Lawrence Crowl) (09/05/87)

In article <557@rocky.STANFORD.EDU> andy@rocky.UUCP (Andy Freeman) writes:
]In article <1354@dataio.Data-IO.COM>, bright@dataio.Data-IO.COM (Walter Bright) writes:
]> ... One thing I want, which the committee always seems to poo-poo, is a
]> typeof. Having a typeof would make creation of 'generic' routines possible:
]
]> 	#define swap(a,b)	{ typeof(a) tmp; tmp = a; a = b; b = tmp; }
]
]Even given typeof, this definition of swap is buggy.  (It has at least
]two bugs.)  I've never figured out how to define a C swap macro that
]works - I'm willing to believe it is impossible.
]
]Anyone have a use for typeof that isn't doesn't have the major bug that
]the swap macro defined above has?

The following swap macro is correct.  It relies on the pointer features of C
to evaluate its arguments exactly once.

#define swap(a,b) { \
    typeof(a) t,*p1,*p2; /* temporary and pointers to sources */ \
    p1 = &(a); p2 = &(b); /* evaluate arguments exactly once */ \
    t = *p1; *p1 = *p2; *p2 = t; /* do actual swap */ \
}

The following example works.  (Well, I substituted typeof(a) with int.)

int a[] = { 0, 1, 2, 3 } ; int b[] = { 0, 1, 2, 3 } ;
int i = 0 ; int j = 1 ;

swap(a[i++],b[j++]);

a == { 1, 1, 2, 3 }; b == { 0, 0, 2, 3 };
i == 1; j == 2;

One problem still remains.  What if typeof(a) != typeof(b)?
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		     crowl@cs.rochester.arpa	Computer Science Department
 ...!{allegra,decvax,seismo}!rochester!crowl	Rochester, New York,  14627

dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/05/87)

In article <557@rocky.STANFORD.EDU> andy@rocky.UUCP (Andy Freeman) quotes
Walter Bright's swap macro:
>> 	#define swap(a,b)	{ typeof(a) tmp; tmp = a; a = b; b = tmp; }

and says,

     Even given typeof, this definition of swap is buggy...I've never
     figured out how to define a C swap macro that works - I'm willing
     to believe it is impossible.

Way back, when Algol's call-by-name parameter passing method was
invented, they never realized what a can of worms it would be.  One of
these worms is the proven impossibility of writing a general swap
function in Algol.  (I think I'm thinking of Algol68, but it might be
true of earlier versions too.)

As far as I can tell, the swap macro is the equivalent of an Algol
subprogram that uses call-by-name, so it's not surprising that a
generized swap macro seems to be impossible.

The trouble occurs in things like

     swap(i, a[i]);

because the macro changes i before it changes a[i].  An unexpected effect
is:

     a[a[i]] = something;

(P.S.  Niklaus Wirth has a joke about how his name is pronounced:  you
can call him by name, pronounced Veert, or call him by value,
pronounced Worth.)
-- 
Rahul Dhesi         UUCP:  {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi

chris@mimsy.UUCP (Chris Torek) (09/05/87)

This should work even without typeof().  I do not recommend it.
Note that you will get warnings from PCC if SWAP is applied to
arrays.  You can remove the sizeof/panic code if you dare.

	#define SWAP(a, b) { \
		struct __t { char bytes[sizeof(a)]; }; \
		struct __t *__a = (struct __t *)&(a); \
		struct __t *__b = (struct __t *)&(b); \
		struct __t __t; \
		if (sizeof(struct __t) != sizeof(b)) \
			panic("bad swap() args: %s, line %d", \
				__FILE__, __LINE__); \
		__t = *__a; \
		*__a = *__b; \
		*__b = __t; \
	}
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

steve@umigw.MIAMI.EDU (steve emmerson) (09/05/87)

In article <1880@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>The following swap macro is correct.  It relies on the pointer features of C
>to evaluate its arguments exactly once.
>
>#define swap(a,b) { \
>    typeof(a) t,*p1,*p2; /* temporary and pointers to sources */ \
>    p1 = &(a); p2 = &(b); /* evaluate arguments exactly once */ \
>    t = *p1; *p1 = *p2; *p2 = t; /* do actual swap */ \
>}
>One problem still remains.  What if typeof(a) != typeof(b)?

It also won't work with pure array objects (i.e. swapping one array with
another).  A generic routine should be able to work with *anything*.

Close but no cigar.
-------------------------------------------------------------------------------
Steve Emmerson                      USAN: steve@umigw
RSMAS Remote Sensing Laboratory     AT&T: (305) 361-4065
University of Miami                 SPAN: miami::emmerson (host 3.2)
                                    UUCP: ...!hao!umigw!miami!emmerson   
USPS: RSMAS/MPO                     DDN:  emmerson@miami.miami.edu
      4600 Rickenbacker Cswy.             emmerson%miami.span@star.stanford.edu
      Miami, FL 33149-1098                emmerson%miami.span@vlsi.jpl.nasa.gov

dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/05/87)

In article <109@umigw.MIAMI.EDU> steve@umigw.UUCP (steve emmerson) writes:
[about an allegedly universal swap() macro]
>It also won't work with pure array objects (i.e. swapping one array with
>another).  A generic routine should be able to work with *anything*.

Including register variables.
-- 
Rahul Dhesi         UUCP:  {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi

chris@mimsy.UUCP (Chris Torek) (09/05/87)

In article <8415@mimsy.UUCP> I included the structure assignments
>		__t = *__a; \
>		*__a = *__b; \
>		*__b = __t; \

These may fail due to alignment constraints; if so, changing them
to calls to memcpy or bcopy would work.  But then, that would eliminate
much of the purpose of the macro in the first place.

Well, I did say I did not recommend it. . . .
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

dg@wrs.UUCP (David Goodenough) (09/09/87)

In article <557@rocky.STANFORD.EDU> andy@rocky.UUCP (Andy Freeman) writes:
>In article <1354@dataio.Data-IO.COM>, bright@dataio.Data-IO.COM (Walter Bright) writes:
>> ... One thing I want, which the committee always seems to poo-poo, is a
>> typeof. Having a typeof would make creation of 'generic' routines possible:
>
>> 	#define swap(a,b)	{ typeof(a) tmp; tmp = a; a = b; b = tmp; }
>
>Even given typeof, this definition of swap is buggy.  (It has at least
>two bugs.)  I've never figured out how to define a C swap macro that
>works - I'm willing to believe it is impossible.

Actually its almost trivial:

create swap.h:

#define		swap(a, b)	_swap_(&a, &b, sizeof(a))

static _swap_(p1, p2, count)
char *p1, *p2;
 {
    char x;

    while (count--)
     {
	x = *p1;
	*p1++ = *p2;
	*p2++ = x;
     }
 }

And you're open for business.

Of course you may need to throw some other stuff in there to get lint to
shut up, but the above does work. (If you want to get industrious (as I did)
put _swap_() in a library - saves having a copy for each file that includes
swap.h). And before anyone starts bitching about the &a and what do you
do with a swap(a, 0) or swap(a, b + 5) both args should be lvalues and
hence &-able
--
		dg@wrs.UUCP - David Goodenough

					+---+
					| +-+-+
					+-+-+ |
					  +---+

karl@haddock.ISC.COM (Karl Heuer) (09/10/87)

In article <343@wrs.UUCP> dg@wrs.UUCP (David Goodenough) writes:
>both args should be lvalues and hence &-able

Registers and bitfields are modifiable lvalues, but not &-able.
Arrays and consts (in ANSI C) are &-able lvalues, but not modifiable.

The swap macro you gave requires &-able, modifiable lvalues as args.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

mpl@sfsup.UUCP (M.P.Lindner) (09/10/87)

In article <109@umigw.MIAMI.EDU>, steve@umigw.MIAMI.EDU (steve emmerson) writes:
> In article <1880@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
 ...
> >#define swap(a,b) { \
 ...
> >}
> >One problem still remains.  What if typeof(a) != typeof(b)?
> 
> It also won't work with pure array objects (i.e. swapping one array with
> another).  A generic routine should be able to work with *anything*.

Here's one that works WITHOUT TYPEOF AND WORKS FOR ARRAYS!

#define swap(a, b) { \
	register int	i; \
	for (i = 0; i < sizeof(a); i++) { \
		((char *) &a)[i] ^= ((char *) &b)[i]; \
		((char *) &b)[i] ^= ((char *) &a)[i]; \
		((char *) &a)[i] ^= ((char *) &b)[i]; \
	} \
}

OK, so it's not as pretty or efficient as everyone would like, but it's
completely general purpose and doesn't need any new features.

Mike Lindner

mpl@sfsup.UUCP (M.P.Lindner) (09/10/87)

OK, as I read further I discovered people were complaining about not handling
name clashes, arrays, and register variables.  My method won't handle register
variables (can't take the address of a register), but it WILL handle arrays,
and as for name clashes, might I suggest the following kludge based on a trick
used by C++ to avoid name clashes (no, I wouldn't really do this, any more than
I would really use that swap routine):

#define swap(a,b) { \
	int _unlikely_string_/**/a/**/b; \
	etc.

blech, I have a bad taste in my mouth.  Note that this scheme (and much of
C++) fails if the arguments to the macro have space around them (ie
swap(x, y) instead of swap(x,y) ).

Mike (feeling sick now) Lindner

crowl@cs.rochester.edu (Lawrence Crowl) (09/11/87)

In article <2019@sfsup.UUCP> mpl@sfsup.UUCP (M.P.Lindner) writes:
>[Re: swap macros] Here's one that works WITHOUT TYPEOF AND WORKS FOR ARRAYS!
>
>#define swap(a, b) { \
>	register int	i; \
>	for (i = 0; i < sizeof(a); i++) { \
>		((char *) &a)[i] ^= ((char *) &b)[i]; \
>		((char *) &b)[i] ^= ((char *) &a)[i]; \
>		((char *) &a)[i] ^= ((char *) &b)[i]; \
>	} \
>}
>
>OK, so it's not as pretty or efficient as everyone would like, but it's
>completely general purpose and doesn't need any new features.

Well, it is not general purpose and does need new features.  Here are the
problems I see:

1.  What if sizeof(a) != sizeof(b)?  Correct calls will work, but should we
    accept incorrect calls?  A simple check will fix this.
2.  What if typeof(a) != typeof(b)?  A simple check fixes this too if the 
    typeof feature is added.
3.  The above macro cannot be used in an expression.  This seems irreparable
    without either calling a true function or extending C with a mechanism to
    declare variables within an expression (ala BCPL).
4.  Most compilers complain about taking the address of an array.
5.  Because you are taking the address, register variables cannot be arguments.
6.  The call swap(i,j) will not work, because of the declaration of i.  I
    assume you played xor games to avoid name clashes with the temporary.
    Unfortunately, you introduced one with the index variable.  The only way
    I know to get around this is to play games with the preprocessor to
    construct identifiers.  While this cannot be 100% effective, it is probably
    a practical solution.
7.  Because the macro parameters are passed by name, if the arguments for a or
    b have side effects, the macro will not work.  The only way to solve this
    problem is to evaluate each argument exactly once, which means you will
    have to take the arguments' addresses.  (This brings back points 4 and 5.)
8.  The loop is probably not the most efficient way to do this.

Let's try again.  I will ignore points 1 and 2 because they are easy.  I will
ignore points 3, 4 and 5 because they essentially cannot be fixed.  I will try
to address points  6, 7, and 8.

/* the mechanism to construct new identifiers */
#define concat(n,m) n/**/m

#define swap(a,b) { \
/* construct new type with new name */
typedef struct { char c[ sizeof(a) ] ; } concat(swap_T,__LINE__) ; \
/* pointers to sources allow evaluating arguments only once */ \
concat(swap_T,__LINE__) *concat(swap_P,__LINE__) = &(a) ; \
concat(swap_T,__LINE__) *concat(swap_Q,__LINE__) = &(b) ; \
/* do actual swap */
concat(swap_T,__LINE__) concat(swap_R,__LINE__) = *concat(swap_P,__LINE__) ; \
*concat(swap_P,__LINE__) = *concat(swap_Q,__LINE__) ; \
*concat(swap_Q,__LINE__) = concat(swap_R,__LINE__) ; \
}

Have I forgotten anything?

-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		     crowl@cs.rochester.arpa	Computer Science Department
 ...!{allegra,decvax,seismo}!rochester!crowl	Rochester, New York,  14627

andy@rocky.STANFORD.EDU (Andy Freeman) (09/11/87)

In article <2105@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
>Let's try again.  I will ignore points 1 and 2 because they are easy.  I
>will ignore points 3, 4 and 5 because they essentially cannot be fixed.
>I will try to address points  6, 7, and 8.

Like I said, it is probably impossible to write a correct swap macro.
I'm glad you knew yours was wrong before you posted it.  BTW - your
concat macro doesn't work in all legal implementations of cpp.  Changing
it to the ANSI c feature designed for that problem doesn't help - your
definition doesn't handle point 6 correctly.

At least it handles swap(a[i],b[j]), a version someone posted with the
concat bug didn't....

-andy
-- 
Andy Freeman
UUCP:  {arpa gateways, decwrl, sun, hplabs, rutgers}!sushi.stanford.edu!andy
ARPA:  andy@sushi.stanford.edu
(415) 329-1718/723-3088 home/cubicle

bright@dataio.Data-IO.COM (Walter Bright) (09/11/87)

In article <2105@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes:
#/* the mechanism to construct new identifiers */
##define concat(n,m) n/**/m
#
##define swap(a,b) { \
#/* construct new type with new name */
#typedef struct { char c[ sizeof(a) ] ; } concat(swap_T,__LINE__) ; \
#/* pointers to sources allow evaluating arguments only once */ \
#concat(swap_T,__LINE__) *concat(swap_P,__LINE__) = &(a) ; \
#concat(swap_T,__LINE__) *concat(swap_Q,__LINE__) = &(b) ; \
#/* do actual swap */
#concat(swap_T,__LINE__) concat(swap_R,__LINE__) = *concat(swap_P,__LINE__) ; \
#*concat(swap_P,__LINE__) = *concat(swap_Q,__LINE__) ; \
#*concat(swap_Q,__LINE__) = concat(swap_R,__LINE__) ; \
#}
#Have I forgotten anything?

I know that
	#define swap(a,b) { typeof(a) _tmp; _tmp = a; a = b; b = _tmp; }
is not completely generic. But it is at least readable, simple, compiles
efficiently, and works for variables, register variables, structs and unions.

As far as swap(a++,b++) goes, who cares? Good coding style should
preclude that occurrence.

If you have a name collision with _tmp, your program is probably a
candidate for the Obfuscated C Contest anyway.

Arrays are a special beast anyway. Very little else works with arrays.
I'd rather have swap work with register variables than with arrays.

To summarize, I would prefer a simple macro with obvious limitations to
a complex macro with very obscure limitations. A completely foolproof
swap() would have to be implemented within the compiler anyway.

throopw@xyzzy.UUCP (09/12/87)

> mpl@sfsup.UUCP (M.P.Lindner)
> #define swap(a, b) { \
> 	register int	i; \
> 	for (i = 0; i < sizeof(a); i++) { \
> 		((char *) &a)[i] ^= ((char *) &b)[i]; \
> 		((char *) &b)[i] ^= ((char *) &a)[i]; \
> 		((char *) &a)[i] ^= ((char *) &b)[i]; \
> 	} \
> }
> OK, so it's not as pretty or efficient as everyone would like, but it's
> completely general purpose and doesn't need any new features.

No, I'm not about to pick the nit that everybody else does at this
point, which is that it doesn't work for register variables.  No, I'm
going to pick the nit that the typebreaking done when copying things of
arbitrary type as characters is first of all not guaranteed to work
correctly by draft X3J11 (nor by K&R), and second of all I'm familiar
with a machine on which it would have failed.

In particular, imagine a type with a data format that depends upon where
in memory it is located.  A common example of this is executable code.
To copy executable code from place to place, it is often not sufficent
to treat it as an array of bytes and copy those bytes.

Before you dismiss this as frivolous, there is at least one architecture
where *pointers* have this property... that is, the format of the
pointer depends upon where in memory it is stored.  Thus, versions of
"swap" that do type breaking of this kind are vulnerable to failure when
fed structures containing pointers to swap, for example.

Just thought y'all might be interested in this little-known bit of trivia.

--
To understand a program you must become both the machine and the program.
                                        --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

franka@mmintl.UUCP (Frank Adams) (09/15/87)

In article <2020@sfsup.UUCP> mpl@sfsup.UUCP (M.P.Lindner) writes:
>and as for name clashes, might I suggest the following kludge based on a trick
>used by C++ to avoid name clashes:
>
>#define swap(a,b) { \
>	int _unlikely_string_/**/a/**/b; \
>	etc.
>
>blech, I have a bad taste in my mouth.  Note that this scheme (and much of
>C++) fails if the arguments to the macro have space around them (ie
>swap(x, y) instead of swap(x,y) ).

This also doesn't work if x and y anything other than simple variables.  I
*don't* expect swap(a[i++],b[j++]) to work, but swap(a[i],b[j]) *should*.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108