[comp.lang.perl] Sample UseNet programs?

pashdown@javelin.es.com (Pete Ashdown) (03/26/91)

I'm looking for some sample programs that will grab the active news file and
then traverse the newsgroup trees, examining the content of each group's
messages.  I know some perl programs have been written to do various operations
similar to this, so I thought I would ask for samples.
-- 
		  "Why can't I be you?" - Robert Smith
		  "Why can't he be you?" - Patsy Cline
		  "Why can't you be you?" - `Seven Faces of Eve'
Pete Ashdown  pashdown@javelin.sim.es.com ...uunet!javelin.sim.es.com!pashdown

merlyn@iwarp.intel.com (Randal L. Schwartz) (03/26/91)

In article <1991Mar25.212456.9031@javelin.es.com>, pashdown@javelin (Pete Ashdown) writes:
| 
| I'm looking for some sample programs that will grab the active news file and
| then traverse the newsgroup trees, examining the content of each group's
| messages.  I know some perl programs have been written to do various operations
| similar to this, so I thought I would ask for samples.

Here's "newslat" which I wrote a while ago to find out how delayed my
average news article is, by comparing the modtime of the file (thus,
my arrival time) with the date within the file.  No points for picking
out bad code... this was one of my *much* earlier programs. :-)
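
For readers who only want the core idea, here is a minimal sketch of the
latency calculation (file modtime minus the time in the Date: header).  It
is not the code from newslat: it swaps the hand-built month table for
timegm() from the standard Time::Local module, and the names date_to_epoch
and latency are made up for illustration.  It assumes a two-digit year
means 19xx and that the header is in GMT, as the script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviation -> zero-based month number, as timegm() expects.
my %month;
@month{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Convert a GMT "Date:" header to epoch seconds.
# Returns undef if the header does not match the expected format.
sub date_to_epoch {
    my ($header) = @_;
    return unless $header =~
        /\s(\d\d?)\s+(\w\w\w)\w*\s+(\d\d)\s+(\d\d?):(\d\d):(\d\d)\s+GMT/;
    my ($day, $mon, $yy, $h, $m, $s) = ($1, $2, $3, $4, $5, $6);
    return unless exists $month{$mon};
    # Assumption: two-digit years are 19xx.  timegm() croaks on an
    # impossible date, so trap that and return undef instead.
    return eval { timegm($s, $m, $h, $day, $month{$mon}, 1900 + $yy) };
}

# Latency in seconds: arrival time (file modtime) minus posting time.
sub latency {
    my ($header, $mtime) = @_;
    my $sent = date_to_epoch($header);
    return defined $sent ? $mtime - $sent : undef;
}

# A sample header in the format the pattern expects (hypothetical data).
my $header = "Date: 25 Mar 91 21:24:56 GMT";
my $sent   = date_to_epoch($header);

# Pretend the article file was written two hours after it was posted.
print "latency: ", latency($header, $sent + 7200), " seconds\n";
```

On a real spool you would pass (stat $file)[9] as the second argument to
latency() instead of the made-up arrival time.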

A couple of local notes: I'm running Cnews (a year old... ugh), and my
spool disk used to also be accessible as /r1/usr.spool.news on other
machines (/r1 is a shared disk), hence the funny test at the
beginning.

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  newslat2
# Wrapped by merlyn@iwarpse on Mon Mar 25 15:06:13 1991
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'newslat2' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'newslat2'\"
else
echo shar: Extracting \"'newslat2'\" \(2488 characters\)
sed "s/^X//" >'newslat2' <<'END_OF_FILE'
X#!/local/usr/bin/perl
X
X(($spool) = grep(-d, "/usr/spool/news", "/r1/usr.spool.news")) ||
X	die "Cannot find spool directory";
X(($lib) = grep(-d, "/usr/lib/news", "/r1/usr.lib.news")) ||
X	die "Cannot find lib directory";
X
X%offset = (
X	'Jan89', 0, 'Feb89', 31, 'Mar89', 59,
X	'Apr89', 90, 'May89', 120, 'Jun89', 151,
X	'Jul89', 181, 'Aug89', 212, 'Sep89', 243,
X	'Oct89', 273, 'Nov89', 304, 'Dec89', 334,
X	'Jan90', 365, 'Feb90', 396, 'Mar90', 424,
X	'Apr90', 455, 'May90', 485, 'Jun90', 516,
X	'Jul90', 546, 'Aug90', 577, 'Sep90', 608,
X	'Oct90', 638, 'Nov90', 669, 'Dec90', 699,
X	'Jan91', 730, 'Feb91', 761, 'Mar91', 789,
X); # that'll do for a while
X
X$| = 1;
Xchdir $spool or die "Cannot chdir $spool ($!)";
Xopen(HIST, "$lib/history") ||
X	die "Cannot open $lib/history ($!)";
XARTICLE: while (<HIST>) {
X	chop;
X	@fields = split(/\t/);
X	next ARTICLE if @fields < 3;
X	($article) = split(/ /, $fields[2]);
X	$article =~ s#\.#/#g;
X	unless (open(ARTICLE, $article)) {
X		warn "Cannot open $article ($!)";
X		next ARTICLE;
X	}
X	while (<ARTICLE>) {
X		if (/^$/) {
X			warn "$article: What? No date?";
X			next ARTICLE;
X		}
X		last if /^Date:/;
X	}
X	unless (/\s(\d\d?)\s+(\w\w\w)\w*\s+(\d\d)\s+(\d\d?):(\d\d):(\d\d)\s+GMT/) {
X		/(.*)/;
X		warn "$article: unknown date format: $1";
X		next ARTICLE;
X	}
X	($day,$monthname,$year,$hour,$minute,$second) = ($1,$2,$3,$4,$5,$6);
X	unless (defined($offset{$monthname . $year})) {
X		warn "$article: unknown month/year: $monthname/$year";
X		next ARTICLE;
X	}
X	$when = ($offset{$monthname . $year}+$day-1)*86400 + $hour*3600 +
X		$minute * 60 + $second + 599616000;
X	@x = stat(ARTICLE);
X	$latency = $x[9]-$when;
X	### print "$article: $latency\n"; ### DEBUG
X	$daysold = int($latency/86400+2)-2;
X	$daysold = 29 if $daysold > 28;
X	$daysold = -1 if $daysold < 0; # time warp
X	$daysold{$daysold}++;
X	$daysoldn++;
X	next ARTICLE unless $daysold < 1;
X	$hoursold = int($latency/3600+2)-2;
X	$hoursold = 25 if $hoursold > 24;
X	$hoursold = -1 if $hoursold < 0; # time warp
X	$hoursold{$hoursold}++;
X	$hoursoldn++;
X}
X
Xexit if $daysoldn < 1;
X
Xprint "$daysoldn articles total\n";
Xprint "Days Count %---10---20---30---40---50---60---70---80---90--100\n";
Xfor ((-1..29)) {
X	printf "%4d %5d %s\n",
X		$_, $daysold{$_}, '*' x (50*$daysold{$_}/$daysoldn);
X}
X
Xexit if $hoursoldn < 1;
X
Xprint "\n$hoursoldn articles in first day\n";
Xprint "Hour Count %---10---20---30---40---50---60---70---80---90--100\n";
Xfor ((-1..25)) {
X	printf "%4d %5d %s\n",
X		$_, $hoursold{$_}, '*' x (50*$hoursold{$_}/$hoursoldn);
X}
END_OF_FILE
if test 2488 -ne `wc -c <'newslat2'`; then
    echo shar: \"'newslat2'\" unpacked with wrong size!
fi
chmod +x 'newslat2'
# end of 'newslat2'
fi
echo shar: End of shell archive.
exit 0
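
The script finds articles by walking the history file rather than the
active file: it takes the third tab-separated field of each line, keeps the
first group/number entry, and turns the dots into slashes to get a path
relative to the spool directory.  A minimal standalone sketch of just that
step follows.  The sub name history_to_path and the sample history line are
made up for illustration, and the line layout is only what the script
itself assumes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the spool-relative path of the first article named on a history
# line, or undef if the line has no third field.
sub history_to_path {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line;
    return if @fields < 3;                   # nothing to look up
    my ($article) = split / /, $fields[2];   # first entry of a crosspost
    return unless defined $article;
    $article =~ s#\.#/#g;                    # comp.lang.perl/1 -> comp/lang/perl/1
    return $article;
}

# Hypothetical history line: message-id, dates field, group/number list.
my $line = join("\t",
    '<1991Mar25.212456.9031@javelin.es.com>',
    '669936296~-',
    'comp.lang.perl/4711 news.misc/12') . "\n";

print history_to_path($line), "\n";
```

Only the first entry is used, so a crossposted article is counted once.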

grep((print "Just another $_ hacker,\n"),Perl,news)
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Intel: putting the 'backward' in 'backward compatible'..."====/