davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (03/17/88)
At one time we got a batch of duplicate articles in some groups. I wrote
this little script to locate the articles, and optionally to prepare a
file of rm commands which could be fed to the shell. There was a reason
not to remove them on the fly, but I don't remember it.

I hope no one else has this problem and this posting is totally useless
(but I doubt that).

:
#
#  finddup - find duplicate entries in news
#
#  enter the group name as a series of arguments, a list of dups
#  will be output. Optionally a list of rm commands may be written
#  to a file for execution.
#
#  Example of find only:
#    finddup.sh comp arch
#
#  Example of find and delete:
#    finddup.sh @r delfile comp arch
#    sh delfile
#
# @(#)finddup.sh v1.3, by bill davidsen, modified 1/19/88

# this code tests if the first argument is "@r". If so the next
# argument is taken as the name of an output file for the remove
# commands.
if [ "$1" = "@r" ]
then # convert to absolute pathname
  case "$2" in
  /*) # absolute pathname
      rfile="$2";;
  *)  # relative pathname
      rfile=`pwd`
      rfile=$rfile/$2;;
  esac
  shift; shift
else
  rfile=""
fi

# build the directory name
# ($NEWS is taken from the environment; it names the news spool directory)
dir=$NEWS
i=1
while [ $i -le $# ]
do
  eval dir=$dir/\$$i
  if [ $i -eq 1 ]
  then
    ngname=$1
  else
    eval ngname=$ngname.\$$i
  fi
  i=`expr $i + 1`
done

# change to the directory
if [ -d $dir ]
then
  cd $dir
  echo "Scanning newsgroup $ngname"
else
  echo "$ngname - no such group"
  exit 1;
fi

# are we building a remove list?
if [ -n "$rfile" ]
then
  echo "Building a remove list in $rfile"
fi

# build the topic list
for n in [1-9]*
do
  # see if any files found
  if [ ! -f $n ]
  then
    # report on stderr so the message stays out of the pipe to sort
    echo "No files in $ngname" >&2
    exit 0;
  fi

  # scan for message id
  sed -n "
  /^Message-ID:/{
    s//$n:/
    p
    q
  }
  " $n
done | sort -t: -k 2 | awk '
  BEGIN { indup = 0; oldmid = ""; FS = ":"; }
  {
    if ($2 == oldmid) {
      printf("Msg %d duplicates %d\n", $1, oldmnum);
      if (rfile != "") {
        printf("rm %s/%d\n", dir, $1) > rfile
      }
    }
    else {
      oldmid = $2; oldmnum = $1+0;
    }
  }' rfile=$rfile dir=$dir -

-- 
bill davidsen (wedu@ge-crd.arpa)
{uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
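
For anyone who only wants the detection step, here is a minimal sketch of the same idea as a shell function. It is not part of finddup.sh: the name find_dups and its single directory argument are made up for illustration, and it assumes a sort that accepts -k and that every article is a numbered file with a "Message-ID:" header line. It tags each Message-ID with its article number, sorts on the id (with the number as a tie-breaker), and reports any line whose id matches the previous one.

```shell
#!/bin/sh
# find_dups DIR  (hypothetical helper, for illustration only)
# Prints "Msg N duplicates M" for every article N in DIR whose
# Message-ID matches that of a lower-numbered article M.
find_dups() {
    (
        cd "$1" || exit 1
        for n in [1-9]*
        do
            # an empty directory leaves the pattern unexpanded; skip it
            [ -f "$n" ] || continue
            # print "number:message-id" for the first Message-ID line only
            sed -n "/^Message-ID:/{
                s//$n:/
                p
                q
            }" "$n"
        done
    ) | sort -t: -k2 -k1,1n | awk '
    {
        # split at the first colon only, in case the id itself contains one
        cut = index($0, ":")
        num = substr($0, 1, cut - 1)
        mid = substr($0, cut + 1)
        if (NR > 1 && mid == oldmid)
            printf("Msg %d duplicates %d\n", num, oldmnum)
        else {
            oldmid = mid
            oldmnum = num + 0
        }
    }'
}
```

Called as find_dups "$NEWS/comp/arch", it only lists the duplicates; building the rm file is left out on purpose, so nothing is ever deleted by this sketch.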