[news.sysadmin] bug

tytso@athena.mit.edu (Theodore Y. Ts'o) (08/04/88)

When expire rebuilds your history file for you, it assumes that there
aren't any duplicate articles.  But one of the main reasons you might
want to rebuild your history file is that you lost your old one, and
in the meantime you may have received several copies of the same
article.

Before I try to fix this: is this not a bug, are we running an old
version of expire, or has anyone fixed this already?

The reason this came up is that I'm trying to write a program to
exterminate duplicate news articles, since we're extremely tight on
space in /usr/spool/news.  I was going to look for duplicate (or
n-cate) entries in the history file, but I have to deal with entries
like these, where the same Message-ID appears on more than one line
and the same newsgroup shows up more than once:

<809@uhnix1.uh.edu>     07/21/88 10:46  comp.sources.d/2706 
	comp.sources.bugs/1210 comp.sources.bugs/1190
<809@uhnix1.uh.edu>     07/21/88 10:46  comp.unix.questions/9251 
	comp.unix.questions/9156 comp.sources.d/2685

I'm thinking about modifying expire so that it either 1) hunts down
and kills duplicates itself, or 2) guarantees that each history line
lists each newsgroup only once, so that a secondary program could
easily nuke duplicates.
What do people think?
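
To make the duplicate hunt concrete, here is a rough sketch of what
such a secondary program might do.  It is in Python rather than C, and
it is not how expire works.  The names parse_history and
find_duplicates are hypothetical.  It assumes each history entry is
whitespace-separated (Message-ID, date, time, then group/number
paths), that a line starting with whitespace continues the previous
entry, and that the lowest-numbered copy in each newsgroup is the one
to keep.

```python
from collections import defaultdict


def parse_history(lines):
    """Collect every group/number path under its Message-ID.

    Assumed layout (not checked against any expire source):
    Message-ID, date, time, then one or more group/number paths.
    A line that starts with whitespace continues the previous entry.
    """
    entries = defaultdict(list)
    current = None
    for line in lines:
        if not line.strip():
            continue
        fields = line.split()
        if line[0].isspace():
            if current is None:
                continue  # continuation with nothing to continue
            paths = fields
        else:
            current = fields[0]
            paths = fields[3:]  # skip Message-ID, date, time
        entries[current].extend(paths)
    return entries


def find_duplicates(entries):
    """Return (keep, remove).

    keep maps each Message-ID to one path per newsgroup; remove lists
    the extra copies.  Keeping the lowest article number is an
    assumption of this sketch.
    """
    keep = {}
    remove = []
    for msgid, paths in entries.items():
        by_group = {}
        for path in paths:
            group, _, number = path.rpartition("/")
            if not group or not number.isdigit():
                continue  # not a group/number path; ignore it
            num = int(number)
            if group not in by_group:
                by_group[group] = num
            elif num != by_group[group]:
                old = by_group[group]
                by_group[group] = min(old, num)
                remove.append(f"{group}/{max(old, num)}")
        keep[msgid] = [f"{g}/{n}" for g, n in by_group.items()]
    return keep, remove
```

Fed the two <809@uhnix1.uh.edu> entries above, this keeps one article
each in comp.sources.d, comp.sources.bugs and comp.unix.questions, and
lists the other three as candidates for removal.  Nothing is deleted
here; unlinking the files under /usr/spool/news is left to the caller.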
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Theodore Ts'o				bloom-beacon!mit-athena!tytso
3 Ames St., Cambridge, MA 02139		tytso@athena.mit.edu
			If it's for real, it isn't!