[comp.unix.wizards] diff

m5@lynx.uucp (Mike McNally) (01/07/89)

While hacking on GNU diff to add the -D option, I noticed a seemingly
unavoidable problem with the whole concept.  The -D option generates
from two input files a single file containing all common lines from the
files, along with differences cleverly surrounded by cpp-style #ifdef's
(actually #ifndef's).  The idea is to produce a single source file from
two C modules (I guess it could be anything that normally gets cranked
through cpp on its way to fulfillment) and then get either version by
appropriate compilation switches.
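
As a rough illustration of the idea, here is a hand-written sketch (not
actual diff output) of what a merged function might look like.  The
function name selected_version is made up for the example; BALLOON is
the symbol from the discussion that follows.

```c
/* Hand-written sketch of a merged source file.  Compiling without
   BALLOON selects the first file's line; compiling with -DBALLOON
   selects the second file's line. */
int selected_version(void)
{
#ifndef BALLOON
    return 1;   /* line that came from version1.c */
#else /* BALLOON */
    return 2;   /* line that came from version2.c */
#endif /* BALLOON */
}
```

Built with plain "cc" this gives the version1 behavior; built with
"cc -DBALLOON" it gives the version2 behavior.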

The problem does not really lie with diff.  Rather, it appears because
cpp eats comments, and thus confuses itself.  Consider this possible
output from diff -DBALLOON version1.c version2.c:

/* This is the beginning of a multi-line comment.  It describes a certain
#ifndef BALLOON
   C function, which might be used to open text files.  The comment ends
#else BALLOON
   C function, which might be used to close text files.  The comment ends
#endif BALLOON
   with a single uninteresting line. */

The preprocessor can't deal with this setup; the interaction of #things and
comments causes lots of problems.

Is my cpp (Green Hills) screwed up?  Should I worry about this?  Are there
ways known to smart people to avoid these problems?  Is the -D option
ever used anyway?  Did I waste my time kludging it into GNU diff?

-- 
Mike McNally                                    Lynx Real-Time Systems
uucp: {voder,athsys}!lynx!m5                    phone: 408 370 2233

            Where equal mind and contest equal, go.

ark@alice.UUCP (Andrew Koenig) (01/08/89)

In article <5174@lynx.UUCP>, m5@lynx.uucp (Mike McNally) writes:
> While hacking on GNU diff to add the -D option, I noticed a seemingly
> unavoidable problem with the whole concept.  The -D option generates

There's another problem:

	File 1:

	stuff 1
	#ifdef X
	stuff 2
	#endif
	stuff 3


	File 2:

	stuff 1
	#ifdef Y
	stuff 2
	#endif
	stuff 3

The obvious way to merge this is:

	stuff 1
	#ifdef FOO
	#ifdef X
	#else
	#ifdef Y
	#endif
	stuff 2
	#endif
	stuff 3

but this is obviously wrong.

The correct solution is left as an exercise for the reader.
-- 
				--Andrew Koenig
				  ark@europa.att.com
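
One possible answer to the exercise, offered as a sketch rather than as
what any diff actually emits: the "obvious" merge fails because #else
pairs with the nearest open conditional (#ifdef X), so #ifdef FOO is
never closed.  Treating each whole conditional group as a unit, and
emitting both copies intact, keeps the nesting balanced at the cost of
duplicating "stuff 2".  The function below stands in for the merged
file; the strings "1", "2" and "3" stand in for the three "stuff" lines,
and FOO selects File 1 as in the example above.

```c
#include <string.h>

/* Builds a string naming which "stuff" lines survive preprocessing.
   Hypothetical stand-in for the merged file; define FOO, X and Y on
   the compiler command line to try the different combinations. */
const char *merged_stuff(void)
{
    static char out[8];

    out[0] = '\0';
    strcat(out, "1");           /* stuff 1, common to both files */
#ifdef FOO
#ifdef X                        /* File 1's conditional, kept whole */
    strcat(out, "2");
#endif
#else
#ifdef Y                        /* File 2's conditional, kept whole */
    strcat(out, "2");
#endif
#endif
    strcat(out, "3");           /* stuff 3, common to both files */
    return out;
}
```

With none of FOO, X or Y defined, only the common lines remain.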

scs@adam.pika.mit.edu (Steve Summit) (01/08/89)

In article <5174@lynx.UUCP> m5@lynx.UUCP (Mike McNally) writes:
>While hacking on GNU diff to add the -D option, I noticed a...problem.
>The -D option generates...a single file containing all common lines...
>along with...[clever]...cpp-style #ifdef's.  The problem...appears because
>cpp eats comments, and thus confuses itself.
>
>/* This is the beginning of a multi-line comment.  It describes a certain
>#ifndef BALLOON
>   C function, which might be used to open text files.  The comment ends
>#else BALLOON
>   C function, which might be used to close text files.  The comment ends
>#endif BALLOON
>   with a single uninteresting line. */
>
>The preprocessor can't deal with this setup; the interaction of #things and
>comments causes lots of problems.

I wouldn't think this particular example would cause a problem,
since all three preprocessor directives are commented out, and
should be ignored.  (Maybe there is a problem with your
preprocessor.)  Truly problematic cases can certainly be
imagined, involving unmatched comments within the #ifdef.

In general, it's a good idea to double-check the output of diff -D
to see if it behaved reasonably.  (In the example given, you'd
probably just scratch the #ifdefs within a comment anyway.)
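
Part of that double-checking can be mechanized.  The following is a
minimal sketch, assuming the merged file has been read into a single
string: it counts lines that look like preprocessor directives but
begin while a block comment is still open, which are the ones worth
inspecting by hand.  The function name is made up for this example, and
the scanner deliberately ignores string literals, character constants
and line comments.

```c
#include <stddef.h>

/* Count lines whose first non-blank character is '#' and which start
   while a block comment is still open.  Hypothetical helper, not part
   of diff. */
int directives_inside_comments(const char *text)
{
    int in_comment = 0;     /* nonzero while a block comment is open */
    int at_line_start = 1;  /* nonzero until a non-blank char is seen */
    int count = 0;
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        char c = text[i];

        if (c == '\n') {
            at_line_start = 1;
            continue;
        }
        if (at_line_start && (c == ' ' || c == '\t'))
            continue;
        if (at_line_start && c == '#' && in_comment)
            count++;
        at_line_start = 0;

        if (!in_comment && c == '/' && text[i + 1] == '*') {
            in_comment = 1;
            i++;            /* step over the star of the opener */
        } else if (in_comment && c == '*' && text[i + 1] == '/') {
            in_comment = 0;
            i++;            /* step over the slash of the closer */
        }
    }
    return count;
}
```

Run over the BALLOON example quoted above, it reports all three
directives as sitting inside the comment.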

>Is the -D option
>ever used anyway?  Did I waste my time kludging it into GNU diff?

diff -D is very handy; I use it all the time.  (It's often
superior to rcsmerge, since it only requires two versions, not
three -- rcsmerge needs a common ancestor.)  However, if you're
working with diff -D, fix its other problem: the

	#else BALLOON
and
	#endif BALLOON

forms have never been portable, and are (I believe) explicitly
disallowed by ANSI C.  If you like the comments on #else and
#endif lines (I can take them or leave them), change them (in
this case, change the printf statements within diff which
generate them) to bona-fide comments:

	#else /* BALLOON */
	#endif /* BALLOON */
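
A minimal sketch of what such a change might look like follows.  The
function names are invented for illustration and are not taken from
the GNU diff source; snprintf is assumed to be available.

```c
#include <stdio.h>

/* Format "#else" and "#endif" lines with the symbol wrapped in a real
   comment.  Hypothetical helpers; diff's own code is organized
   differently.  Each returns what snprintf returns. */
int format_else_line(char *buf, size_t size, const char *symbol)
{
    return snprintf(buf, size, "#else /* %s */\n", symbol);
}

int format_endif_line(char *buf, size_t size, const char *symbol)
{
    return snprintf(buf, size, "#endif /* %s */\n", symbol);
}
```

The comment delimiters inside the format strings are ordinary
characters to the compiler, since comments are not recognized inside
string literals.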

                                            Steve Summit
                                            scs@adam.pika.mit.edu

dave@onfcanim.UUCP (Dave Martindale) (01/12/89)

I use diff -D all the time.  When I use it to produce an output file
that will actually be run through cpp to generate one version or the
other, I always have to hand-edit it to get around the sort of problem
that you note.

However, I seldom use it for that.  My main use is producing a new
source file that contains the merged changes from two different
versions of basically the same code.  Wherever the two pieces of code
differ, the diff output allows me to see the differences in context,
side-by-side, and I selectively discard the lines of code that I don't
want, leaving the version that I do.