joec@u1100a.UUCP (Joe Carfagno) (08/07/85)
{this line not shown by sort -u ... only kidding}

>Running 4.2 on a 750, I have encountered a bug in sort, such that the output
>of "sort -u", when run against a particular file, is missing one line.
>
>Unfortunately, the file that produces this error is 136550 bytes (45516
>lines), and I have been unable to extract a reasonably small subset of
>the file which will demonstrate the bug. Nevertheless, I consider this
>extremely serious. I'll never know what erroneous reports my application
>has turned out; I use sort a lot, and it was only by accident that I
>noticed the problem with a file this large. I would be glad to send any
>interested party a copy of the file, but I think that it is too big to
>send over the net.
>
>Everything seems to go ok until the final call to the "merge" routine,
>and although I have come up with a patch that fixes this instance of
>the problem, the fact that I can't replicate the bug with a smaller file
>verifies that my understanding of sort is inadequate.
>
>Does any of this sound familiar?
>Any help or suggestions would be greatly appreciated.
>
>	Carl Shapiro
>	...sdcrdcf!otto!carl

Here's the problem (and the fix) with sort -u missing one line:

$ diff /usr/src/cmd/sort.c /usr/src/cmd/OLDsort.c
399c399
<       if(!cflg && (uflg == 0 || muflg || i <= 1 ||
---
>       if(!cflg && (uflg == 0 || muflg ||

It seems that i can be set to 1, and then the lines which follow line 399

        if(!cflg && (uflg == 0 || muflg ||
                (*compare)(ibuf[i-1]->l,ibuf[i-2]->l)))
                do
                        putc(*cp, os);
                while(*cp++ != '\n');

specifically the ibuf[i-2]->l, don't make much sense: for i == 1 that
expression is ibuf[-1]->l. So, add the extra "|| i <= 1" condition.

The problem not occurring with small files is explainable; hopefully I can
explain it. I don't remember all the details, but small files can be
merged in core (see what the MEM tag gets used for; that rings a bell).
I think that when you merge, one of the files can end up with only one
line in it (thus i == 1) and it screws up.
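To make the guard concrete, here is a minimal sketch in C of the decision the patched test makes. It is not the code from sort.c: the struct layout, the name should_emit, and the use of strcmp in place of (*compare) are all assumptions made for illustration, and lines are NUL-terminated here rather than newline-terminated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for one merge input; only the current line matters here. */
struct merg {
    const char *l;   /* current line, NUL-terminated for this sketch */
};

/*
 * Decide whether the line at ibuf[i-1] should be written under -u.
 * i is the number of entries in ibuf that are still in use (at least 1).
 * strcmp stands in for sort's (*compare) routine.
 */
static int should_emit(struct merg **ibuf, int i)
{
    assert(i >= 1);
    if (i <= 1)
        return 1;   /* no neighbour: ibuf[i-2] would be ibuf[-1], so skip the compare */
    /* otherwise emit only when the line differs from the neighbouring entry's line */
    return strcmp(ibuf[i-1]->l, ibuf[i-2]->l) != 0;
}
```

With a single entry, should_emit(ibuf, 1) returns 1 without ever reading ibuf[-1], which is the effect of the added "i <= 1" test.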
I found the problem on our version of the UNIX system which runs on the
Sperry 1100 processor line. The (*compare) function was called with a
strange second argument which worked on one type of processor but not on
another. It turns out that the eol() function was scanning the register
set (which starts at address 0 on an 1100) and finding a new-line
character in one processor type's register set, but not in the other.
This was an interesting one to find, and I think it'll solve your problem.

	Joe Carfagno
	{...!u1100a!joec}