tytso@athena.mit.edu (Theodore Y. Ts'o) (08/04/88)
When expire attempts to rebuild your history file for you, it assumes
that there aren't any duplicate articles. But one of the main reasons
you might want to rebuild your history file is that you lost your old
one, and in the meantime you might have received several copies of the
same article.

Before I try to fix this: is this not a bug? Or are we running an old
version of expire? Or has anyone fixed this already?

The reason this came up is that I'm trying to write a program to
exterminate duplicate news articles, since we're extremely tight on
space in /usr/spool/news. I was going to look for duplicate (or
n-plicate) entries in the history file, but I have to deal with entries
like these:

    <809@uhnix1.uh.edu> 07/21/88 10:46 comp.sources.d/2706 comp.sources.bugs/1210 comp.sources.bugs/1190
    <809@uhnix1.uh.edu> 07/21/88 10:46 comp.unix.questions/9251 comp.unix.questions/9156 comp.sources.d/2685

I'm thinking about modifying expire so that either 1) it can hunt down
and kill duplicates itself, or 2) it guarantees that each line has
unique newsgroups, so that a secondary program could easily nuke
duplicates.

What do people think?

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Theodore Ts'o                           bloom-beacon!mit-athena!tytso
3 Ames St., Cambridge, MA 02139         tytso@athena.mit.edu
                If it's for real, it isn't!
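
For illustration only, here is a rough sketch of the "secondary program" idea from the post: scan the history file, collect every stored copy of each message-ID, and report which copies look redundant. It is written in Python rather than as a patch to expire, and it rests on assumptions that are not in the post: that each history line is whitespace-separated as message-ID, date, time, then newsgroup/number entries; that anything not of the form newsgroup/number can be ignored; and that keeping the lowest-numbered copy in each newsgroup is an acceptable policy. The names parse_history_line and find_duplicates are hypothetical.

```python
from collections import defaultdict


def parse_history_line(line):
    """Split one history line into (message_id, [(newsgroup, number), ...]).

    Assumed layout: <message-id> date time group/number group/number ...
    """
    fields = line.split()
    message_id = fields[0]
    parsed = []
    for location in fields[3:]:  # fields[1:3] are the date and time
        group, _, number = location.rpartition("/")
        if group and number.isdigit():  # skip anything that is not group/number
            parsed.append((group, int(number)))
    return message_id, parsed


def find_duplicates(lines):
    """Map each duplicated message-ID to the copies to keep and to remove.

    Assumed policy: keep the lowest article number in each newsgroup.
    """
    # copies[message_id][newsgroup] -> set of article numbers seen
    copies = defaultdict(lambda: defaultdict(set))
    for line in lines:
        if not line.strip():
            continue
        message_id, parsed = parse_history_line(line)
        for group, number in parsed:
            copies[message_id][group].add(number)

    report = {}
    for message_id, groups in copies.items():
        keep, remove = [], []
        for group, numbers in sorted(groups.items()):
            ordered = sorted(numbers)
            keep.append(f"{group}/{ordered[0]}")
            remove.extend(f"{group}/{n}" for n in ordered[1:])
        if remove:  # only report IDs that really have extra copies
            report[message_id] = {"keep": keep, "remove": remove}
    return report
```

Because copies are merged per message-ID across all lines, it does not matter whether the duplicates appear on one history line or are spread over several, which is the case the two sample entries show. The sketch only reports candidates; actually unlinking files under /usr/spool/news and rewriting the history file would be a separate step.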