[comp.mail.misc] Two perhaps-useful awk scripts for news and mail

rsk@j.cc.purdue.edu.UUCP (12/11/86)

These two awk scripts process the information contained in the logs kept
by news (2.10.3 at least; I haven't brought up 2.11 yet) and mail (4.2 or
4.3 BSD; I don't know about System V and friends) into a form that some people
find a little more useful; they are emphatically hacks, and are therefore
not elegant, efficient, or anything like that--however, they do the job.

The usage for each is given below.  Note that parts of the script for mail
are commented out; if you want to keep track of ftp or finger connections,
remove the comments.  Similarly, the news script counts articles but
does not print out the count.

The basic idea behind these is that both news and mail make a multi-line
entry for each transaction that they handle.   The scripts produce one line
per transaction, where a transaction is either a letter or an article,
respectively.  Typical output for mail.awkfile looks like this:

Dec 11 12:25:04-- size=5686, from=xs0, to=joe@h.cc.purdue.edu, message-id=<8612111724.AA08095@j.cc.purdue.edu>
Dec 11 12:14:10-- size=2510, from=boncolo, to=snarf@tb.cc.cmu.edu, message-id=<8612111711.AA07717@j.cc.purdue.edu>
Dec 11 12:13:49-- size=8219, from=nobish, to=samurai@ee.ecn.purdue.edu, message-id=<8612111712.AA07735@j.cc.purdue.edu>

Typical output for news.awkfile looks like this:

acs	amdahl.UUCP	comp.sys.amiga	'VT100 Beeps!'        
penick	hplabsc.UUCP	misc.consumers	'Re: MORE card'       
peterson	milano.UUCP	comp.graphics	'Geographic data bases'       
prindle	nadc.arpa	comp.sys.misc	're: CP/M Wordstar files'      

Note that these scripts just pick out the information I deemed useful to me;
in the case of mail, that was sender/recipient(s)/date/time/size/message-id;
in the case of news, that was sender/site/newsgroup(s)/subject.
Of course, it's pretty easy to post-process these to remove extraneous junk,
columnize them, and so on, but I figure everyone will want something different.
If you have comments, suggestions, or improvements, please send them to
me via MAIL; I will be happy to summarize, digest, and so on, and follow
up this article at a later date.
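
As a rough illustration of the kind of post-processing mentioned above, here is a minimal sketch that strips the "size=", "from=" and "to=" labels and trailing commas from the mail output and prints tab-separated columns. It assumes an awk that provides sub() (any POSIX awk should); the variable names sample and columns are made up for this example, and the two input lines are copied from the sample output shown earlier.

```shell
# Two lines copied from the sample mail.awkfile output.
sample='Dec 11 12:25:04-- size=5686, from=xs0, to=joe@h.cc.purdue.edu, message-id=<8612111724.AA08095@j.cc.purdue.edu>
Dec 11 12:14:10-- size=2510, from=boncolo, to=snarf@tb.cc.cmu.edu, message-id=<8612111711.AA07717@j.cc.purdue.edu>'

columns=$(printf '%s\n' "$sample" | awk '{
    for (i = 4; i <= NF; i++) {
        sub(/^[a-z-]+=/, "", $i)    # drop the leading "size=", "from=", ... label
        sub(/,$/, "", $i)           # drop the trailing comma
    }
    sub(/--$/, "", $3)              # drop the "--" after the time
    # date and time, then size, sender, recipient
    printf "%s %s %s\t%s\t%s\t%s\n", $1, $2, $3, $4, $5, $6
}')

printf '%s\n' "$columns"
```

The message-id is left out of the printf here only to keep the lines short; adding $7 to the format would bring it back.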

-- 
Rich Kulawiec, rsk@j.cc.purdue.edu, j.cc.purdue.edu!rsk

==========
mail.awkfile; usage is "awk -f mail.awkfile < /usr/spool/mqueue/syslog"
==========
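# $6 is the key that ties together the log lines belonging to one message.
BEGIN {fromcount = 0; sizecount = 0; tocount = 0; midcount = 0}
# A "from" line supplies both the sender ($7) and the size ($8).
/ sendmail.*: from/		{ from[$6] = $7; fromcount++; size[$6] = $8; sizecount++ }
# A later "to" line with the same key overwrites an earlier one.
/ sendmail.*: to/		{ to[$6] = $7; tocount++ }
# The date and time are taken from the "message-id" line.
/ sendmail.*: message-id/	{ mid[$6] = $7; midcount++; month[$6] = $3;  day[$6] = $4;  time[$6] = $5}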
# /.* fingd.*/			{ fingfrom[$2] = $6; fingto[$2] = $9}
# /.* ftpd.* connection from/	{ ftpfrom[$2] = $8}

END { for ( i in from ) {
		printf("%s %s %s %s %s %s %s\n",month[i],day[i],time[i],size[i],from[i],to[i],mid[i])
      }
#     for ( i in ftpfrom ) {
# 	printf("ftp from %s\n",ftpfrom[i])
#      }
#     for ( i in fingfrom ) {
#	printf("%s finged %s\n",fingfrom[i],fingto[i])
#      }
}

==========
news.awkfile; usage is "awk -f news.awkfile < /usr/local/lib/news/log"
==========
BEGIN {artcount = 0}
/.*	received */		{ ng[$6] = $8; subja[$6] = $10; subjb[$6] = $11; subjc[$6] = $12; subjd[$6] = $13; subje[$6] = $14; subjf[$6] = $15; subjg[$6] = $16; subjh[$6] = $17; subji[$6] = $18; subjj[$6] = $19; artcount++ ; artid = $6 }
/.*	from */			{ from[artid] = $6 }

END { for ( i in ng ) {
		printf("%s\t%s\t%s %s %s %s %s %s %s %s %s %s\n",from[i],ng[i],subja[i],subjb[i],subjc[i],subjd[i],subje[i],subjf[i],subjg[i],subjh[i],subji[i],subjj[i])
      }
}

dennis@rlgvax.UUCP (Dennis Bednar) (01/07/87)

In article <2731@j.cc.purdue.edu>, rsk@j.cc.purdue.edu (Same as it ever Wombat) writes:

> ( two awk scripts, one for mail, and one for news )

I tried running the news awkfile and it dropped a core.
Below is what happened (output from script(1)).
I am not an awk expert.  Does anybody have any ideas, or
can anybody else reproduce the errors on a different
machine?  We are running SVr2/4.2 hybrid, and I
am guessing that it's the SVr2 awk we are running.


Script started on Wed Jan  7 15:01:36 1987
$ cat /tmp/news.awkfile
BEGIN {artcount = 0}
/.*	received */		{ ng[] = ; subja[] = 0; subjb[] = 1; subjc[] = 2; subjd[] = 3; subje[] = 4; subjf[] = 5; subjg[] = 6; subjh[] = 7; subji[] = 8; subjj[] = 9; artcount++ ; artid =  }
/.*	from */			{ from[artid] =  }

END { for ( i in ng ) {
		printf("%s\t%s\t%s %s %s %s %s %s %s %s %s %s\n",from[i],ng[i],subja[i],subjb[i],subjc[i],subjd[i],subje[i],subjf[i],subjg[i],subjh[i],subji[i],subjj[i])
      }
}
$ awk -f /tmp/news.awkfile < /usr/lib/news/log
awk: syntax error near line 2
awk: illegal statement near line 2
awk: syntax error near line 2
awk: illegal statement near line 2
Illegal instruction - core dumped
$ adb /bin/awk core
No symbol table	in file
$c
?(5b,5b) at 5128
?() at 697
?(1,bfffdbf0,bfffdbf8) at 238a
$q
$ 
script done on Wed Jan  7 15:03:03 1987
-- 
-Dennis Bednar
{decvax,ihnp4,harpo,allegra}!seismo!rlgvax!dennis	UUCP

guy%gorodish@Sun.COM (Guy Harris) (01/10/87)

> I tried running the news awkfile and it dropped a core.
> Below is what happened (output from script(1)).
> I am not an awk expert.  Does anybody have any ideas, or
> can anybody else reproduce the errors on a different
> machine?  We are running SVr2/4.2 hybrid, and I
> am guessing that it's the SVr2 awk we are running.

Yes, and yes.  It's a bug in "awk"; I reproduced it on several different
machines.  It's in every version of "awk" I could find; the fix is in
comp.bugs.4bsd (for the 4.[123]BSD "awk", or the one distributed with the
update tape that came out for V7) or comp.bugs.sys5 (a fix for S5R2 and
later, and one for the original V7 one and the S3 and S5R1 one).

The problem is that the "awk" program in question has a number of nasty syntax
errors.  The bug is triggered by syntax errors in certain circumstances.

From the syntax errors in the program, it looks like the "awk" program was
passed as an argument to "awk" inside double quotes.  Unfortunately, the
"awk" program probably contains references to fields, which are of the form
"$1", "$2", and so on.  The shell expands those constructs inside arguments
enclosed in double quotes, and since those shell variables didn't have
values, the references to fields were replaced with null strings.  If the
program is enclosed in single quotes instead, no expansion is done, so single
quotes should be used for this.
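
The quoting problem described above can be reproduced with a short sketch. It assumes a POSIX shell with no positional parameters set (hence the "set --"); the variable names are made up for illustration. It also suggests why the mangled script reads "subja[] = 0": the shell treats "$10" as "$1" followed by a literal "0".

```shell
set --   # clear the positional parameters so that $1, $2, $6, $8 are all empty

# Double quotes: the shell substitutes the (empty) parameters before the
# text goes anywhere.  $10 is read as ${1} followed by a literal 0.
mangled=$(printf '%s\n' "{ ng[$6] = $8; subja[$6] = $10 }")

# Single quotes: the text is passed through untouched.
intact=$(printf '%s\n' '{ ng[$6] = $8; subja[$6] = $10 }')

printf 'double quotes: %s\n' "$mangled"
printf 'single quotes: %s\n' "$intact"

# The same effect when a program is handed straight to awk:
line='alpha beta gamma'
with_double=$(printf '%s\n' "$line" | awk "{ print $2 }")   # awk sees { print  }
with_single=$(printf '%s\n' "$line" | awk '{ print $2 }')   # awk sees { print $2 }

printf 'awk, double quotes: %s\n' "$with_double"
printf 'awk, single quotes: %s\n' "$with_single"
```

In the awk case the double-quoted program still happens to be valid, so awk quietly prints the whole line instead of the second field; in the news script the expansion left assignments with nothing on the right-hand side, which is a syntax error.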