[comp.lang.c] Useful macro...or dangerous??

kjartan@raunvis.UUCP (Kjartan Pier Emilsson Jardedlisfraedi) (04/26/88)

Hello world!
  
  This sentence is meant as an introducing one.  

  In my work I have often felt the need to test the equality of
two structures, for example test the equality of a Color structure
and so on. As I quickly became tired of gigantic IF tests, I decided
to build some sort of equality macro, and came down on the following
solution which seems to work.  (Of course if some members of a
structure are pointers, then the equal() function returns 0 if the 
pointers do not point to the same address).  Now I would very much like
to know whether this is a foolproof way to test equality of structures,
or is there a hidden nasty little fellow who eludes me.

#define EQ(A,B) equal(A,B,sizeof(*(A)))

typedef struct {
	...
	...
	}ANYSTRUCTURE;

main(){
ANYSTRUCTURE some,thing,or,the,*other;
...
...
if( EQ(&some,other) )
	printf("yupyup\n");
else if( EQ(&thing,&or) )
	printf("spulft.\n");
...
...
etc.
}

equal(a,b,size)
char *a,*b;
long int size;
{
while(*(a+si-1)== *(b+si-1) && si>0)
	si--;
if(si==0)
	return(1);
else
	return(0);
}

I herewith submit my possible blunders to the Blowtorches of the Net.

		Kjartan Pierre Emilsson, Reykjavik, ICELAND

djones@megatest.UUCP (Dave Jones) (04/28/88)

in article <221@raunvis.UUCP>, kjartan@raunvis.UUCP (Kjartan Pier Emilsson Jardedlisfraedi) says:
> 
...

> As I quickly became tired of gigantic IF tests, I decided
> to build some sort of equality macro, and came down on the following
> solution which seems to work.
...
> 
> #define EQ(A,B) equal(A,B,sizeof(*(A)))
> 
...

> I herewith submit my possible blunders to the Blowtorches of the Net.
> 
> 		Kjartan Pierre Emilsson, Reykjavik, ICELAND

I admire your courage.  No torch here, so relax.

As Dustin Hoffman said in Marathon Man, "It's not safe."

Due to packing alignment within the struct, there
may be some garbage bits "in the cracks".  You could get a "not equal"
result when in fact, the structures should compare as equal.

If you use this technique, be sure that you bzero() structures when you 
allocate them.  Don't be too surprised if you, or somebody else, forgets to
do so, and the world, as we know it, comes to an end.

Also, use the C-library function bcmp() rather than your equal().
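
A minimal sketch of that advice, assuming a hypothetical struct rec and
helper names (rec_init, rec_bytes_equal) that are not from the original
posting.  memset() and memcmp() are the ANSI counterparts of bzero() and
bcmp().  Note that storing into a member is not guaranteed to leave the
padding bytes alone, so this is a convention that tends to work on common
compilers rather than something the language promises.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical structure with a likely hole between ch and i. */
struct rec {
    char ch;
    int  i;
};

/* Clear every byte, padding included, before assigning the members. */
void rec_init(struct rec *p, char ch, int i)
{
    assert(p != 0);
    memset(p, 0, sizeof *p);
    p->ch = ch;
    p->i = i;
}

/* Bytewise comparison; only meaningful if both objects
   were set up through rec_init(). */
int rec_bytes_equal(const struct rec *a, const struct rec *b)
{
    assert(a != 0 && b != 0);
    return memcmp(a, b, sizeof *a) == 0;
}
```

Every structure has to pass through rec_init() for the comparison to mean
anything; one forgotten call and the result depends on stack garbage.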

Good luck,


		Dave J.

chris@mimsy.UUCP (Chris Torek) (04/28/88)

[Dangerous]
In article <221@raunvis.UUCP> kjartan@raunvis.UUCP (Kjartan Pier
Emilsson Jardedlisfraedi) writes:
>... test the equality of two structures [of the same type].
>... I decided to build some sort of equality macro, and came down
>on the following solution which seems to work.

[Aside: the usual idiom is `came up with', giving perhaps the
rather bizarre image of someone diving into a lake and emerging
with a trout between his teeth :-) ]

>(Of course if some members of a structure are pointers, then the
>equal() function returns 0 if the pointers do not point to the same
>address).  Now I would very much like to know whether this is a
>foolproof way to test equality of structures, or is there a hidden
>nasty little fellow who eludes me.

There is:

>#define EQ(A,B) equal(A,B,sizeof(*(A)))

>[e.g., struct thing x, y; ... EQ(&x, &y)]

Then `equal' is defined more or less to do what `memcmp' does.

Aside from coding errors in equal (the one provided would not compile),
there is a more insidious problem.  Structures may contain `holes'; the
holes are not necessarily set to any particular value, so two otherwise
equal structures may compare differently.

For instance, if I write

	struct holey {
		char	ch;
		int	i;
	} smokes, batman;

there is often a `gap' between `ch' and `i'.  The 4BSD Vax compiler,
for instance, puts `i' at offset 4, leaving three bytes of hole before
it.  If these are automatic variables, setting smokes.ch and smokes.i
leaves its hole untouched and full of random stuff left over from
whatever was on the stack before.

You *can* get around this, at some expense, by calling bzero() or
memset() to set all structures, including the holes, to all-zero-bytes,
but there is no guarantee that the compiler will never write onto the
hole area.  One could imagine an architecture in which byte stores are
impossible, where writing to a single byte requires a `load, mask, or,
store' sequence; in this case the compiler might elect to put a hole
before or after every character in a structure, and just store a word
each time, overwriting the previous hole contents.

In short, it is safest not to do this.
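
The dependable alternative is to compare member by member.  A minimal
sketch for the struct holey above; the function name holey_equal is made
up for illustration.

```c
#include <assert.h>

struct holey {
    char ch;
    int  i;
};

/* Compare only the declared members; the hole is never examined. */
int holey_equal(const struct holey *a, const struct holey *b)
{
    assert(a != 0 && b != 0);
    return a->ch == b->ch && a->i == b->i;
}
```

It has to be written once per structure type and kept in step with the
declaration, which is the price of never looking at the holes.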
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

ekb@ho7cad.ATT.COM (Eric K. Bustad) (04/29/88)

In article <221@raunvis.UUCP>, kjartan@raunvis.UUCP (Kjartan Pier Emilsson Jardedlisfraedi) asks:
> [ If the following is a foolproof way to test for structure equality. ]
> 
> #define EQ(A,B) equal(A,B,sizeof(*(A)))
> 
> equal(a,b,size)
> char *a,*b;
> long int size;
> {
> while(*(a+si-1)== *(b+si-1) && si>0)
> 	si--;
> if(si==0)
> 	return(1);
> else
> 	return(0);
> }

The main problem with this is that there are often "holes" in the
structure that should be ignored when checking for equality.  For
example, on the machine I'm on now, the structure

	struct gub {
		int a;
		char b;
	};

takes *eight* bytes of memory, four for the int, one for the char
and three more to pad it out to a multiple of four bytes.  The
structure

	struct hvc {
		char b;
		int a;
	};

also takes eight bytes, with the extra three being inserted before
the int to place it at the proper alignment.

These holes could contain almost any kind of random garbage, so two
structures that compare equal field-wise may not compare equal when
each byte is compared in your equal() function.
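
How much padding a given compiler adds can be measured rather than
guessed.  A small sketch using offsetof() from <stddef.h>; the two
function names are invented here, and the results are whatever the local
compiler chooses.

```c
#include <assert.h>
#include <stddef.h>

struct gub { int a; char b; };
struct hvc { char b; int a; };

/* Bytes after the last member of struct gub (trailing padding). */
size_t gub_trailing_pad(void)
{
    return sizeof(struct gub) - (offsetof(struct gub, b) + sizeof(char));
}

/* Bytes between b and a in struct hvc (interior padding). */
size_t hvc_interior_pad(void)
{
    assert(offsetof(struct hvc, b) == 0);  /* first member is at offset 0 */
    return offsetof(struct hvc, a) - sizeof(char);
}
```

Where int is four bytes and must be four-byte aligned, both functions
should return 3, which matches the eight-byte sizes described above.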

= Eric

ark@alice.UUCP (04/29/88)

In article <221@raunvis.UUCP>, kjartan@raunvis.UUCP writes:
 
> #define EQ(A,B) equal(A,B,sizeof(*(A)))

The function called by this macro does a byte comparison.
Here's the trouble.  The assertions in comments are not true
on all machines, but are on some and that's what matters:

	struct bar {
		short a;
		/* two bytes of invisible padding here */
		long b;
	};

	struct foo {
		char c[sizeof(struct bar)];
	};

	/* now here's some code */

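		struct foo *fp1, *fp2;
		struct bar *bp1, *bp2;
		int i;

		/* fill a block with '?', free it, and hope to get it back */
		fp1 = (struct foo *) malloc (sizeof (struct foo));
		for (i = 0; i < (int) sizeof (struct foo); i++)
			fp1->c[i] = '?';
		free ((char *) fp1);
		bp1 = (struct bar *) malloc (sizeof (struct bar));

		/* same again with '!' for the second object */
		fp2 = (struct foo *) malloc (sizeof (struct foo));
		for (i = 0; i < (int) sizeof (struct foo); i++)
			fp2->c[i] = '!';
		free ((char *) fp2);
		bp2 = (struct bar *) malloc (sizeof (struct bar));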

This code is all clean and portable.  Here's what's happening.
If you free memory and then allocate the same amount again
immediately, malloc will probably be nice enough to give you
back the memory you just freed.  Thus the assignments to
c[i] initialize memory to a particular value, give it back
to the system, and maybe get it back again.  In any event,
the memory addressed by bp1 and bp2 has probably been set
to different values.  Now some assignments:

		bp1->a = 3;
		bp2->a = 3;
		bp1->b = 7;
		bp2->b = 7;

By any sensible definition of equality, bp1 and bp2 point
to equal objects.  Yet, because the invisible padding in
these objects has been initialized to different values,
the EQ macro above will show them as different.
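
The same effect can be reproduced without depending on what malloc hands
back.  A sketch, with helper names invented for illustration: it builds
two byte images of a struct bar in plain unsigned char buffers, so any
padding bytes keep the '?' or '!' fill while the members are identical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct bar {
    short a;
    long  b;
};

/* Fill buf with `fill`, then copy the member values into their slots.
   Bytes that belong to no member keep the fill value. */
void bar_image(unsigned char *buf, int fill, short a, long b)
{
    assert(buf != 0);
    memset(buf, fill, sizeof(struct bar));
    memcpy(buf + offsetof(struct bar, a), &a, sizeof a);
    memcpy(buf + offsetof(struct bar, b), &b, sizeof b);
}

/* Returns 1 if two images with equal members but different
   fill bytes compare unequal byte for byte. */
int padding_breaks_bytewise_compare(void)
{
    unsigned char x[sizeof(struct bar)], y[sizeof(struct bar)];

    bar_image(x, '?', 3, 7L);
    bar_image(y, '!', 3, 7L);
    return memcmp(x, y, sizeof x) != 0;
}
```

The function returns 1 exactly when struct bar is larger than its two
members put together, that is, when the compiler has inserted padding.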

pablo@polygen.uucp (Pablo Halpern) (04/29/88)

From article <221@raunvis.UUCP>, by kjartan@raunvis.UUCP (Kjartan Pier Emilsson Jardedlisfraedi):
[ Introduces attempt at macro for testing structure equality ] 
> #define EQ(A,B) equal(A,B,sizeof(*(A)))
> 
[ Gives example of use ]
> 
> equal(a,b,size)
> char *a,*b;
> long int size;
> {
> while(*(a+si-1)== *(b+si-1) && si>0)
> 	si--;
> if(si==0)
> 	return(1);
> else
> 	return(0);
> }
> 
> I herewith submit my possible blunders to the Blowtorches of the Net.
> 
> 		Kjartan Pierre Emilsson, Reykjavik, ICELAND

With one correction, your method MIGHT be useful, but it is certainly not
foolproof.  First the correction: according to K&R, the sizeof operator
returns an int.  According to dpANS, the sizeof operator returns
size_t.  No document that I know of defines sizeof as returning long
int, so equal() should be redefined as follows (I also optimized the code):

int equal(a, b, size)
char	*a, *b;
size_t	size;
{
	char	*end = a + size;	/* one past the last byte of *a */

	/* byte for byte compare of structures *a and *b */
	while (a < end)
		if (*a++ != *b++)
			return (0);
	return (1);
}

This would work ONLY IF THERE ARE NO HOLES IN THE STRUCTURES BEING COMPARED.
For example, the following would not work on many architectures.

	struct {
		char	c;
		int	i;
	} x, y;
	...
	if (EQ(&x, &y))
		...;

On many machines, x.i and y.i will be aligned on word boundaries (i.e.,
they must have addresses that are multiples of the word size, typically
2, 4, or 8).  That means that there could be a gap or "hole" of several
bytes between the end of x.c and the beginning of x.i.  There is no
rule about what values fill this hole; they could be different for x
and y.  Thus, your byte for byte comparison could fail on these holes
even if the structures matched field by field.  Your method would
only work for some structures on some architectures and would be very
non-portable.  Sorry to burst your bubble.
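
One cheap guard, sketched here under the assumption that every member
size is listed by hand: a structure without bit-fields has holes exactly
when its sizeof exceeds the sum of its members' sizes.  The names
has_holes and pair_has_holes, and the tag pair, are invented for
illustration.

```c
#include <assert.h>
#include <stddef.h>

struct pair {
    char c;
    int  i;
};

/* total: sizeof the structure; member_sum: sum of sizeof each member. */
int has_holes(size_t total, size_t member_sum)
{
    assert(total >= member_sum);
    return total > member_sum;
}

/* Nonzero if a bytewise EQ() on struct pair would also compare padding. */
int pair_has_holes(void)
{
    return has_holes(sizeof(struct pair), sizeof(char) + sizeof(int));
}
```

Members that are themselves structures need the same check applied to
them, since their own holes are hidden inside their sizeof.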

Pablo Halpern		|	mit-eddie \
Polygen Corp.		|	princeton  \ !polygen!pablo  (UUCP)
200 Fifth Ave.		|	bu-cs      /
Waltham, MA 02254	|	stellar   /

mitt@hpclisp.HP.COM (Roy Mittendorff) (04/30/88)

/ pablo@polygen.uucp (Pablo Halpern) /  6:04 pm  Apr 28, 1988 /

> This would work ONLY IF THERE ARE NO HOLES IN THE STRUCTURES BEING COMPARED.
> For example, the following would not work on many architectures.
>
> 	struct {
> 		char	c;
> 		int	i;
> 	} x, y;
> 	...
> 	if (EQ(&x, &y))
> 		...;
> 
> On many machines, x.i and y.i will be aligned on word boundaries (i.e.,
> they must have addresses that are multiples of the word size, typically
> 2, 4, or 8).  That means that there could be a gap or "hole" of several
> bytes between the end of x.c and the beginning of x.i. ...

   Is there any guarantee that members of structs be stored in any
   particular order?  For example, could c and i be reversed in
   some implementations?
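
   For what it is worth, both K&R and the draft ANSI standard require
   members to sit at increasing addresses in declaration order, so c and
   i cannot be swapped; only the amount of padding is up to the
   implementation.  A quick check, assuming a hypothetical tag xy for
   the struct quoted above:

```c
#include <assert.h>
#include <stddef.h>

struct xy {
    char c;
    int  i;
};

/* Returns 1 when c is at offset 0 and precedes i in storage. */
int members_in_declared_order(void)
{
    /* a structure is never smaller than its members put together */
    assert(sizeof(struct xy) >= sizeof(char) + sizeof(int));
    return offsetof(struct xy, c) == 0 &&
           offsetof(struct xy, c) < offsetof(struct xy, i);
}
```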

> Pablo Halpern

   Roy Mittendorff.

gordan@maccs.UUCP (gordan) (04/30/88)

In article <221@raunvis.UUCP> kjartan@raunvis.UUCP (Kjartan Pier Emilsson Jardedlisfraedi) writes:
-
- [Is it a good idea to test for equality of structures by doing a
- byte-by-byte comparison for sizeof(struct xxx) bytes?]

"Because certain data objects may be constrained by the target computer
to lie on certain addressing boundaries, a structure object may contain
"holes," storage units that do not belong to any component of the
structure.  The holes would make equality tests implemented as a
wholesale bit-by-bit comparison unreliable..."

-- Harbison & Steele, 2nd. ed., p. 103

-- 
                 Gordan Palameta
            uunet!mnetor!maccs!gordan

blm@cxsea.UUCP (Brian Matthews) (05/02/88)

Chris Torek (chris@mimsy.UUCP) writes:
|[Dangerous]
|In article <221@raunvis.UUCP> kjartan@raunvis.UUCP (Kjartan Pier
|Emilsson Jardedlisfraedi) writes:
|For instance, if I write
|
|	struct holey {
|		char	ch;
|		int	i;
|	} smokes, batman;
|
|there is often a `gap' between `ch' and `i'.

Another problem that may occur is if you have something like:

	struct maybe_holey {
		char	arr[16];
	} string, array;

then do something like:

	strcpy (string.arr, "kiwi fruit");
	strcpy (string.arr, "bananas");
	...
	strcpy (array.arr, "mangos");
	strcpy (string.arr, "mangos");

equal will now report that array and string aren't equal, even though they
should be considered equal.  The reason is of course that the characters
beyond the end of the string aren't significant in this case, but equal
doesn't know this.  This is one of the reasons that structure equality
can't be done reasonably in the compiler.  In general, the compiler will
know about the gaps that Chris mentions, but it won't know whether
string.arr is being used as an array, where all characters are
significant, or a string, where only the characters up to and including
the null are significant.  And of course equal can't know either of
these.
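
The two readings can be written out side by side.  A sketch, with
function names invented for illustration; only the programmer knows
which of the two is the right test for a given member.

```c
#include <assert.h>
#include <string.h>

struct maybe_holey {
    char arr[16];
};

/* Treat arr as a C string: bytes after the terminating null are ignored. */
int equal_as_strings(const struct maybe_holey *a, const struct maybe_holey *b)
{
    /* strcmp needs a terminator somewhere inside each array */
    assert(memchr(a->arr, '\0', sizeof a->arr) != 0);
    assert(memchr(b->arr, '\0', sizeof b->arr) != 0);
    return strcmp(a->arr, b->arr) == 0;
}

/* Treat arr as a plain array: all 16 bytes are significant. */
int equal_as_arrays(const struct maybe_holey *a, const struct maybe_holey *b)
{
    return memcmp(a->arr, b->arr, sizeof a->arr) == 0;
}
```

If both structures start out as all zero bytes, then after the sequence
of strcpy calls above string.arr still holds the "it" left over from
"kiwi fruit" beyond its terminator, so equal_as_strings() reports a
match while equal_as_arrays() does not.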

|In short, it is safest not to do this.

Agreed.

-- 
Brian L. Matthews                               "The first time I died,
...{mnetor,uw-beaver!ssc-vax}!cxsea!blm          was in the arms of good
+1 206 251 6811                                  friends of mine."
Computer X Inc. - a division of Motorola New Enterprises