hugh@dgp.toronto.edu ("D. Hugh Redelmeier") (01/19/89)
Various versions of a public domain diff have been broadcast across the net over the years (including comp.sources.misc 2.1, 2.8, and 2.59). The original version was on the DECUS C tape, author unknown (Conroy?). This program is based on the same algorithm as UNIX diff.

Anyway, all the versions I have checked seem to produce suboptimal output when diffing the following two files. UNIX diff does not have this problem. Am I right, is this a bug? Does anyone know a fix? The code is currently beyond my comprehension.

file a:
    9
    1
    2
    3
    4
    5
    6
    9

file b:
    8
    1
    2
    x
    x
    4
    3
    4
    5
    6
    8

pd-diff says:
    1c1
    < 9
    ---
    > 8
    4c4,5
    < 3
    ---
    > x
    > x
    5a7,8
    > 3
    > 4
    8c11
    < 9
    ---
    > 8

SunOS3.5 diff says:
    1c1
    < 9
    ---
    > 8
    3a4,6
    > x
    > x
    > 4
    8c11
    < 9
    ---
    > 8

Notice that pd-diff uselessly deletes and re-inserts 3. This is not wrong, just suboptimal. Perhaps there is a simple off-by-one error in the code.

Hugh Redelmeier
{utcsri, yunexus, uunet!attcan!utzoo, hcr}!redvax!hugh
When all else fails: hugh@csri.toronto.edu
+1 416 482-8253
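One way to check the "suboptimal" claim is to compute the length of the longest common subsequence (LCS) of the two files. The sketch below is a textbook dynamic-programming LCS in Python, not the code of pd-diff or of UNIX diff; the names `lcs_length`, `file_a`, and `file_b` are made up for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    prev = [0] * (len(b) + 1)          # LCS lengths for the previous row of a
    for x in a:
        cur = [0]                      # column 0: empty prefix of b
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


file_a = "9 1 2 3 4 5 6 9".split()
file_b = "8 1 2 x x 4 3 4 5 6 8".split()

kept = lcs_length(file_a, file_b)
# Every line outside the common subsequence is either deleted or inserted.
changed = len(file_a) + len(file_b) - 2 * kept
print(kept, changed)  # prints: 6 7
```

The SunOS output above has 7 lines marked `<` or `>`, which matches the minimum. The pd-diff output has 9, because it keeps only five common lines (1, 2, 4, 5, 6) instead of six.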
alanf%smile@Sun.COM (Alan Fargusson @ peace with the world) (01/21/89)
In article <8901191545.AA18697@explorer.dgp.toronto.edu>, hugh@dgp.toronto.edu ("D. Hugh Redelmeier") writes:
>
> Anyway, all the versions I have checked seem to produce suboptimal
> output when diffing the following two files. UNIX diff does not
> have this problem. Am I right, is this a bug? Does anyone know a
> fix? The code is currently beyond my comprehension.
>
> Notice that pd-diff uselessly deletes and re-inserts 3. This is not
> wrong, just suboptimal. Perhaps there is a simple off-by-one error
> in the code.

It looks like the code that tries to find the longest match is doing something wrong. This should be nearly the last thing done by diff. I don't have source for this.

GNU diff gets this right, so you may want to get that. I have a version of diff that I wrote that also gets it right. I may try and post it after all. I had decided not to since there are so many versions floating around these days.

- - - - - - - - - - - - - - - - - - - - -
Alan Fargusson Sun Microsystems
alanf@sun.com ..!sun!alanf
jdc@naucse.UUCP (John Campbell) (01/23/89)
From article <86247@sun.uucp>, by alanf%smile@Sun.COM (Alan Fargusson @ peace with the world):
>
> GNU diff gets this right, so you may want to get that. I have a version of
> diff that I wrote that also gets it right. I may try and post it after all.
> I had decided not to since there are so many versions floating around these
> days.
> - - - - - - - - - - - - - - - - - - - - -
> Alan Fargusson Sun Microsystems
> alanf@sun.com ..!sun!alanf

(I also wrote a "diff" type program...)

What we need to know is how "compatible" the various public domain diffs are. Can GNU diff be used with RCS, for instance? I find the public domain diff works well enough (diff -cb) for Larry Wall's patch program, but I've heard complaints that there are command options used by RCS that don't exist in the public domain version.

Can anyone summarize the compatibility questions or point to a diff that RCS can use (public of course).
--
John Campbell ...!arizona!naucse!jdc
CAMPBELL@NAUVAX.bitnet
unix? Sure send me a dozen, all different colors.
karl@mstar.UUCP (Karl Fox) (01/24/89)
Well, I *too* have also written a pd diff that *also* gives the same output as the 'real' diff on these two files. It uses a different (and much faster) algorithm, one described by Webb Miller and Gene Myers in "A File Comparison Program", _Software - Practice and Experience_, November 1985 (with permission of the author, even). This is the same basic algorithm used by GNU diff, except that mine doesn't have the fancy all-different-line-removal stuff that GNU diff does (which speeds it up a lot on very different files), nor does it have the stuff that massages the results to "look better" but also makes them different from regular diff.

Mine supports the -e, -f, -b, -h, -s, -r, -l, -S, -D and -c options, which I think is the same as BSD diff. I wrote it because the old diff was eating up our machine when we used SCCS on very large files. It used to be, "type delta and go to lunch"; now it never takes more than several seconds.

Maybe we should assign a central Diff Naming Authority to hand out names like ndiff, gdiff, diff1, diff2, or maybe we should all post at once. Let the customer choose, just like "Sun Memory at Bargain Prices".
--
Karl Fox, Morning Star Technologies
UUCP: osu-cis!mstar!karl -or- pyramid!mstar!karl -or- sequent!mstar!karl
Internet: osu-cis!mstar!karl@tut.cis.ohio-state.edu
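For readers who have not seen this family of algorithms, here is a minimal Python sketch of the greedy "furthest reaching point per diagonal" search in the form usually attributed to Myers. It is an illustration only: whether it matches the 1985 paper line for line, or Karl's or GNU's code, is an assumption, and the function name `shortest_edit_distance` is invented here. It returns only the size of the smallest edit script, not the script itself.

```python
def shortest_edit_distance(a, b):
    """Minimum number of line insertions plus deletions turning a into b."""
    n, m = len(a), len(b)
    # v[k] holds the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):             # d = number of edits tried so far
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]               # move down: take a line from b
            else:
                x = v[k - 1] + 1           # move right: drop a line from a
            y = x - k
            # Follow the "snake": matching lines cost nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


file_a = "9 1 2 3 4 5 6 9".split()
file_b = "8 1 2 x x 4 3 4 5 6 8".split()
print(shortest_edit_distance(file_a, file_b))  # prints: 7
```

The outer loop stops as soon as the number of edits reaches the real difference, so the work grows with how different the files are rather than with the product of their lengths. That is consistent with the speedup described above for large, mostly similar files.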
alanf%smile@Sun.COM (Alan Fargusson @ peace with the world) (01/26/89)
In article <1133@naucse.UUCP>, jdc@naucse.UUCP (John Campbell) writes:
>
> What we need to know is how "compatible" the various public domain diffs
> are. Can GNU diff be used with RCS, for instance? I find the public
> domain diff works well enough (diff -cb) for Larry Wall's patch program,
> but I've heard complaints that there are command options used by RCS that
> don't exist in the public domain version.
>
> Can anyone summarize the compatibility questions or point to a diff that
> RCS can use (public of course).
> --
> John Campbell ...!arizona!naucse!jdc
> CAMPBELL@NAUVAX.bitnet
> unix? Sure send me a dozen, all different colors.

GNU diff looks to be exactly compatible with UNIX diff. My diff program is not, and I intended it that way. I do intend to add an output format that does look like UNIX for Larry Wall's patch program.

The biggest advantage of my diff is that it is easy to add output options. Currently it outputs a human readable format, an edlin compatible format (for MS-DOS), and a format for the CP/V editor (you don't want to know). I guess it is an advantage that mine works on UNIX, MS-DOS (MSC 4.0), and CP/V with no special care (not even one #ifdef).

- - - - - - - - - - - - - - - - - - - - -
Alan Fargusson Sun Microsystems
alanf@sun.com ..!sun!alanf
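One plausible way to make output options easy to add is to keep the comparison separate from the printing, so each format is a small function over the same list of edits. The Python sketch below is hypothetical and is not Alan's program: the edit tuples `(tag, i1, i2, j1, j2)`, the names `format_normal`, `format_summary`, and `FORMATTERS`, and the "summary" format are all invented for illustration. The edit list is written out by hand to match the SunOS output earlier in the thread.

```python
def _span(start, stop):
    """1-based inclusive range as normal diff prints it: '4' or '4,5'."""
    return str(start + 1) if stop - start == 1 else f"{start + 1},{stop}"


def format_normal(a, b, ops):
    """Render edits in the UNIX 'normal' diff style (1c1, 3a4,6, ...)."""
    out = []
    for tag, i1, i2, j1, j2 in ops:
        if tag == "insert":
            out.append(f"{i1}a{_span(j1, j2)}")
        elif tag == "delete":
            out.append(f"{_span(i1, i2)}d{j1}")
        else:  # "replace"
            out.append(f"{_span(i1, i2)}c{_span(j1, j2)}")
        out += [f"< {line}" for line in a[i1:i2]]
        if tag == "replace":
            out.append("---")
        out += [f"> {line}" for line in b[j1:j2]]
    return out


def format_summary(a, b, ops):
    """A made-up human-readable format: one line per edit."""
    return [f"{tag}: {i2 - i1} old line(s), {j2 - j1} new line(s)"
            for tag, i1, i2, j1, j2 in ops]


# Adding an output option means adding one entry here.
FORMATTERS = {"normal": format_normal, "summary": format_summary}

file_a = "9 1 2 3 4 5 6 9".split()
file_b = "8 1 2 x x 4 3 4 5 6 8".split()
# Hand-written edits (0-based, half-open ranges) for the optimal script.
ops = [("replace", 0, 1, 0, 1), ("insert", 3, 3, 3, 6), ("replace", 7, 8, 10, 11)]

for name, fmt in FORMATTERS.items():
    print(name)
    print("\n".join(fmt(file_a, file_b, ops)))
```

With this split, an edlin or CP/V style format would be one more function in the table, and nothing in the comparison code would need to change.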