[comp.os.minix] cmp.c again: this time it works...

wayne@csri.toronto.edu (Wayne Hayes) (10/06/90)
A while ago I posted a diatribe about piping in Minix, concluding that there
was a bug in cmp.c when it read input from a pipe, and I included a fix.
Well, the fix of course introduced a _new_ bug.  (Hey, I *did* say it was a
kludgy fix, OK?)

The fix was to change

    n1 = read(fd1, buf1, BLOCK_SIZE);
    n2 = read(fd2, buf2, BLOCK_SIZE);

to

    n1 = ...same...
    n2 = read(fd2, buf2, n1);

Now, this messes up if fd1 happens to have a size exactly divisible
by BLOCK_SIZE.  Then n1 = 0 at the last read, so the second read asks for
0 bytes and n2 will equal 0 as well, whether or not fd2 has any data left,
and cmp.c will conclude the files are identical even if they aren't.
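
To see why the second read comes back 0, here is a minimal sketch (not part
of cmp.c; the function name zero_count_read_demo is made up for
illustration).  It assumes a POSIX-like read(), where asking for 0 bytes
returns 0 even though the pipe still holds unread data:

```c
#include <unistd.h>

/* Hypothetical demo: put 3 bytes in a pipe, then ask read() for 0 bytes.
 * Returns whatever read() reports, or -1 if the setup itself fails. */
int zero_count_read_demo(void)
{
    int fds[2];
    char buf[8];
    int n;

    if (pipe(fds) < 0) return -1;
    if (write(fds[1], "abc", 3) != 3) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Same shape as read(fd2, buf2, n1) when n1 == 0: nothing requested,
     * so nothing is consumed and the data in the pipe goes unnoticed. */
    n = (int) read(fds[0], buf, 0);

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling zero_count_read_demo() should give 0 on such a system, which is
exactly the value cmp.c takes to mean "EOF on fd2 too".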

So first, if you used my fix last time, then go and change it back.  (Also,
in the patch I last posted I only changed the cmp() routine, and not the
fastcmp() routine.)

Then, apply the following cdif to the original 1.5.10 cmp.c.  This one
has been much more thoroughly tested, and I'm pretty sure it works.  It
also adds a feature to tell you *where* an EOF was reached if the -l flag
is included.  (I hope this isn't anti-POSIX.)
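
For comparison, here is a rough sketch of a different way to get at the same
problem.  It is NOT what the patch below does: instead of limiting the second
read to n1 bytes, it keeps calling read() until a whole block has arrived or
EOF is hit, so short reads from a pipe on either descriptor don't matter.
The names read_full and compare_fds and the BLOCK_SIZE value are assumptions
for illustration only; the real cmp.c has its own definitions.

```c
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 1024   /* assumed value, not taken from cmp.c */

/* Hypothetical helper: read until `want` bytes have arrived, EOF is hit,
 * or an error occurs.  Returns the number of bytes read, or -1 on error. */
int read_full(int fd, char *buf, int want)
{
    int got = 0;

    while (got < want) {
        int n = (int) read(fd, buf + got, (size_t) (want - got));
        if (n < 0) return -1;   /* read error */
        if (n == 0) break;      /* EOF */
        got += n;
    }
    return got;
}

/* Hypothetical comparator in the spirit of fastcmp():
 * returns 0 if the streams are identical, 1 if they differ,
 * 2 on a read error. */
int compare_fds(int fd1, int fd2)
{
    static char buf1[BLOCK_SIZE], buf2[BLOCK_SIZE];

    for (;;) {
        int n1 = read_full(fd1, buf1, BLOCK_SIZE);
        int n2 = read_full(fd2, buf2, BLOCK_SIZE);

        if (n1 < 0 || n2 < 0) return 2;
        if (n1 != n2) return 1;     /* one stream ended early */
        if (n1 == 0) return 0;      /* both at EOF together */
        if (memcmp(buf1, buf2, (size_t) n1) != 0) return 1;
    }
}
```

With both buffers always filled to a block boundary (except at EOF), a short
count can only mean end of input, so n1 != n2 really does mean the lengths
differ.  The cost is an extra loop around every read.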

------------- cut here -----------------
*** cmp.c.1.5	Sun Sep 23 11:01:39 1990
--- cmp.c	Sat Oct  6 11:44:10 1990
***************
*** 5,12 ****
   *	Reduced buffer size to accommodate 7K pipes (Minix restriction).
   *	Better trapping of file reading errors.
   *	Considerable speedup when using -s flag.
!  *	Buffering strategy remains seriously error prone; should be fixed.
!  */
  
  #include <sys/types.h>
  #include <fcntl.h>
--- 5,20 ----
   *	Reduced buffer size to accommodate 7K pipes (Minix restriction).
   *	Better trapping of file reading errors.
   *	Considerable speedup when using -s flag.
!  *	Buffering strategy remains seriously error prone; should be fixed.
!  */
! 
! /* 1990 Oct 10, Wayne Hayes
!  * 	Fixed a bug in buffering that incorrectly handled pipes,
!  *	  by changing the second read to only read as many bytes as the 1st
!  *	  read did.  That also means we have to check specially for EOFs on
!  *	  BLOCK_SIZE boundaries.
!  *	Added output to tell what byte EOF was reached at if -l flag is set.
! */
  
  #include <sys/types.h>
  #include <fcntl.h>
***************
*** 74,80 ****
    exit_status = 0;
    do {
  	n1 = read(fd1, buf1, BLOCK_SIZE);
! 	n2 = read(fd2, buf2, BLOCK_SIZE);
  	n = (n1 < n2) ? n1 : n2;
  	if (n < 0) {
  		printf("cmp: Error on %s\n", (n1 < 0) ? file_1 : file_2);
--- 82,90 ----
    exit_status = 0;
    do {
  	n1 = read(fd1, buf1, BLOCK_SIZE);
! 
! 	n2 = read(fd2, buf2, n1 ? n1 : 1);  /* to recognize an EOF on a block
! 		boundary in fd1, we have to read an extra char */
  	n = (n1 < n2) ? n1 : n2;
  	if (n < 0) {
  		printf("cmp: Error on %s\n", (n1 < 0) ? file_1 : file_2);
***************
*** 94,101 ****
  		if (buf1[i] == '\n') line_cnt++;
  		char_cnt++;
  	}
! 	if (n1 != n2) {		/* EOF on one of the input files. */
! 		printf("cmp: EOF on %s\n", (n1 < n2) ? file_1 : file_2);
  		return(1);
  	}
    } while (n > 0);		/* While not EOF on any file */
--- 104,115 ----
  		if (buf1[i] == '\n') line_cnt++;
  		char_cnt++;
  	}
! 	if (n1 != n2) {	/* EOF on one of the input files. */
! 		if(lflag)
! 			printf("cmp: EOF on %s at char %ld, line %ld\n",
! 		    	    (n1 <= n2) ? file_1 : file_2, char_cnt, line_cnt);
! 		else
! 			printf("cmp: EOF on %s\n", (n1 <= n2) ? file_1 : file_2);
  		return(1);
  	}
    } while (n > 0);		/* While not EOF on any file */
***************
*** 109,119 ****
  
    while (1) {
  	n1 = read(fd1, buf1, BLOCK_SIZE);
! 	n2 = read(fd2, buf2, BLOCK_SIZE);
! 	if (n1 != n2) return(1);	/* Bug! - depends on buffering */
! 	if (n1 == 0) return(0);
! 	if (n1 < 0) return(1);
! 	if (memcmp((void *) buf1, (void *) buf2, (size_t) n1) != 0)
  		return(1);
    }
  }
--- 123,133 ----
  
    while (1) {
  	n1 = read(fd1, buf1, BLOCK_SIZE);
! 	n2 = read(fd2, buf2, n1 ? n1 : 1);    /* to handle empty file in fd1 */
! 	if (n1 != n2) return(1);
! 	if (n1 == 0) return(0);
! 	if (n1 < 0) return(1);
! 	if (memcmp((void *) buf1, (void *) buf2, (size_t) n1))
  		return(1);
    }
  }
-- 
"The number of programs that can be done with the Hubble Space Telescope has
always greatly exceeded the time available for their execution, and this
remains true even with the telescope in its current state." -- HST Science
Working Group and User's Committee Report, 1990 June 29.
Wayne Hayes	INTERNET: wayne@csri.utoronto.ca	CompuServe: 72401,3525