[news.admin] News Traffic Generator reports

jak9213@helios.TAMU.EDU (John Kane) (12/13/89)

I would like to get a handle on the traffic that my news system is
handling. I am running BNews 2.11 and NNTP 1.5.6. What utility do I need
to get periodic traffic reports?

Thanks.

 John Arthur Kane, Systems Analyst, Microcomputer Support and Training
 Texas A&M University, College Station, TX 77843  (409) 845-9999

 jak9213@helios.tamu.edu     profs: x043jk@tamvm1.tamu.edu

fletcher@cs.utexas.edu (Fletcher Mattox) (12/13/89)

In article <3989@helios.TAMU.EDU> jak9213@helios.TAMU.EDU (John Kane) writes:
>I would like to get a handle on the traffic that my news system is
>handling.

Here's how I do it.  This method includes headers and counts cross-posted
articles only once.  A sys entry records the path of every article we get:

	log:world,all:F:/usr/spool/batch/record/log

Then I run something like this nightly from cron to add up all the bytes:

	(
	# how many bytes of news arrived today?
	cd /usr/spool/batch/record
	mv log.5 log.6
	mv log.4 log.5
	mv log.3 log.4
	mv log.2 log.3
	mv log.1 log.2
	mv log   log.1
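	echo -n "`date` "
	# $4 is the size column of BSD-style "ls -l" output (no group column);
	# use $5 if your ls also prints the group.
	xargs ls -l <log.1 | awk '{s+=$4}END{printf("%4s\t%8s\n", NR, s)}'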
	) >> meter.log
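
For sites where the column layout of `ls -l` differs, the same nightly tally
can be done without parsing `ls` output at all.  What follows is only a
minimal sketch in Python, assuming the log holds one article path per line;
the name `tally_log` is made up for illustration and is not part of B News.

```python
import os


def tally_log(log_path):
    """Return (article_count, total_bytes) for the paths listed in log_path."""
    count = 0
    total = 0
    with open(log_path) as log:
        for line in log:
            path = line.strip()
            if not path:
                continue
            try:
                total += os.stat(path).st_size
            except OSError:
                # The article has already expired or been cancelled; skip it.
                continue
            count += 1
    return count, total
```

As with the `ls -l` pipeline, articles that have vanished by the time the
script runs are not counted.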

tale@cs.rpi.edu (David C Lawrence) (12/13/89)

I made a few changes to Erik Fair's NNTP syslog summarizer; it's
available to anyone that wants it.  I run it as part of my localdaily
script invoked by cron every morning at 7am.

I intend to re-invent an awk processor for C News's log, but haven't
gotten around to it yet.  Does someone else already have something
that gives a nice report after running over $NEWSCTL/log?

This morning's output from a pass of the awk script over a day of
syslogging, trimmed down some to save on bandwidth:

Dec 11 09:53:54 rpi nntplink[18450]: image.soe.clarkson.edu xfer /usenet/spool/misc/emerg-services/442: Interrupted system call
Dec 11 19:30:30 rpi nntpd[10386]: brutus.cs.uiuc.edu spawn: EOF before period on line by itself
Dec 11 22:17:43 rpi nntpxmit[11136]: nisc.nyser.net signal 15
Dec 12 06:44:41 rpi nntpxmit[14485]: nisc.nyser.net xfer: Bad file number
[...]

News Transmission Daemon Activity:
newsxd reinitialized once.

Article Reception        Offered      Took         Toss         Fail
Contacting Host           To Us    Total  Pct   Total  Pct   Total  Pct
leah.albany.edu            3513       10   0%    3503 100%       0   0%
[...]
------------------------- -----    ----------   ----------   ----------
TOTALS                    23829     5754  24%   18072  76%       3   0%

Article Transmission    Offered       Took         Toss         Fail
Host Contacted          To Them    Total  Pct   Total  Pct   Total  Pct
leah.albany.edu            4115     4101 100%      14   0%       0   0%
[...]
------------------------- -----    ----------   ----------   ----------
TOTALS                    27955    20918  75%    7025  25%      71   0%

Outgoing Transmission Connexions         ------errors-------
System                     Conn    OK    NS   Net   Rmt  Pct
image.soe.clarkson.edu      156    96     0    58     2  38%
nisc.nyser.net              263   113     0   149     1  57%     [1]
crdgw1.ge.com               128    90     0    38     0  30%
[...]
------------------------- ----- ----- ----- ----- ----- ----
TOTALS                     1277  1024     1   249     3  20%

NNTP readership statistics
System                     Conn Articles Groups Post
*.its.rpi.edu                 3      510     52    2
*.ecs.rpi.edu                 9     1017    639    0
*.ecse.rpi.edu                1        0      0    1
*.cie.rpi.edu                43      963     86    1
*.rdrc.rpi.edu               22     1254     61    1
*.pawl.rpi.edu               29     2792    185   11
*.cs.rpi.edu                160     5181    601    2
*.ipl.rpi.edu                 1      141     13    2
------------------------- ----- -------- ------ ----
TOTALS                      268    11858   1637   20

nntpd timeouts
uwm.edu                       1
pawl14.pawl.rpi.edu           1
turing.cs.rpi.edu             2

[1] I find it very odd that nearly all of the errors I get are from the
regional network.  Blargh.

Dave
-- 
   (setq mail '("tale@cs.rpi.edu" "tale@ai.mit.edu" "tale@rpitsmts.bitnet"))

warren@samsung.COM (Warren Lavallee) (12/14/89)

jak9213@helios.TAMU.EDU (John Kane) writes:
>I would like to get a handle on the traffic that my news system is
>handling. I am running BNews 2.11 and NNTP 1.5.6. What utility do I need
>to get periodic traffic reports?

	I wrote a C program that parses my news log file and NNTP log
file.  The output looks like this:

News summary bounds: Dec 12 05:24:52-Dec 13 05:15:19.
                                 Outgoing   %      Incoming   totals%
                  Host name:    offer accept acc    rec acc%  tot offr acc/min
           xanth.cs.odu.edu:        0      0 n/a   2441  69%  30%   0%  unkwn
              cs.utexas.edu:     5196    632 12%   1331  29%  16%  76%  0.42
  zaphod.mps.ohio-state.edu:     3445   1067 31%   1104  41%  14%  49%  0.90
               uunet.uu.net:     4444   1081 24%    714  21%   9%  60%  0.78
                    usc.edu:     2329    740 32%    640  22%   8%  31%  0.66
         brutus.cs.uiuc.edu:     4290   1321 31%    587  36%   7%  57%  0.99
             news.think.com:     4309   1720 40%    341  49%   4%  55%  1.15
    uakari.primate.wisc.edu:     2437   1088 45%    184  55%   2%  31%  0.91
         aplcen.apl.jhu.edu:     4550   1655 36%     92  92%   1%  56%  1.15
      shadooby.cc.umich.edu:     6662   3718 56%     97   6%   1%  83%  3.08
      caesar.cs.montana.edu:     6705   2365 35%     60   5%   1%  83%  1.67
              munnari.oz.au:     1970    940 48%     21  84%   0%  24%  0.65
         psuvax1.cs.psu.edu:        0      0 n/a     26  10%   0%   0%  unkwn
          rex.cs.tulane.edu:     5646   1861 33%     23   3%   0%  69%  2.08
     emory.mathcs.emory.edu:     5530   2720 49%     32   1%   0%  68%  1.98
       sol.ctr.columbia.edu:     4776   2598 54%     20   1%   0%  59%  2.25
           swan.ulowell.edu:      211     97 46%      0  n/a   0%   3%  0.14
      interlan.interlan.com*       17     11 65%      0  n/a   0%   0%  unkwn

                     Totals:    62517  23614      8147

	Send me mail if you want a copy.
-- 
Samsung Software America.       			      Warren J. Lavallee
UUCP:  ...!uunet!samsung!warren            NEARnet/Internet:  warren@samsung.com
"Punishment becomes ineffective after a certain point.  Men become insensitive."
				  -- Eneg, "Patterns of Force," stardate 2534.7.

tale@cs.rpi.edu (David C Lawrence) (12/14/89)

In article <9A~{L|@rpi.edu> tale@cs.rpi.edu (David C Lawrence) writes:
> I made a few changes to Erik Fair's NNTP syslog summarizer; it's
> available to anyone that wants it.  I run it as part of my localdaily
> script invoked by cron every morning at 7am.

Well, I've received eleven requests for this today, so I'll post.  The
only site configuration that should need to be changed is the ``local''
array defined in the BEGIN block.

People who like time/cpu information are advised to use the original
version of this script, with perhaps the addition of domain summary
for readers.  "polled" still uses the time/cpu format, so you can
convert this back from there if you don't have access to the original.

# an awk script 
# an NNTP log summary report generator
#
# NOTE: for systems that are not as yet using the new 4.3 BSD syslog
# (and therefore have nntp messages lumped with everything else), it
# would be best to invoke this script thusly:
#
#	egrep nntp syslog.old | awk -f nntp_awk > report_of_the_week
#
# because this script will include in the report all messages in the log
# that it does not recognize (on the assumption that they are errors to
# be dealt with by a human).
#
# Erik E. Fair <fair@ucbarpa.berkeley.edu>
# May 17, 1986 - Norwegian Independence Day
#
# Recognize some new things - February 22, 1987
# Erik E. Fair <fair@ucbarpa.berkeley.edu>
#
# fix "xmt is not an array" bug - March 11, 1987
# Change Elapsed/CPU fields to break out time values, HH:MM:SS
# Erik E. Fair <fair@ucbarpa.berkeley.edu>
#
# Add reporting for newnews commands - August 27, 1987
# Erik E. Fair <fair@ucbarpa.berkeley.edu>
#
# Add nntpxmit connection attempt counting/reporting - December 7, 1987
# Erik E. Fair <fair@ucbarpa.berkeley.edu>
#
# Some hacking on 11 Nov 89, tale.  Deal with newsxd and change output
# format a little.  Left the output for pollers alone.
#
# More whacking early December, to stop listing readers on individual machines
# but instead summarize the domain.

BEGIN {
  # set up an array to use for summarizing domains
  local["its.rpi.edu"] = 0;
  local["pawl.rpi.edu"] = 0;
  local["cs.rpi.edu"] = 0;
  local["ecs.rpi.edu"] = 0;
  local["ecse.rpi.edu"] = 0;
  local["cie.rpi.edu"] = 0;
  local["ipl.rpi.edu"] = 0;
  local["rdrc.rpi.edu"] = 0;
  local["ral.rpi.edu"] = 0;
}
### Skip stderr reports from rnews
{
  n = split($6, path, "/");
  if (path[n] == "rnews:") next;
  n = split($7, path, "/");
  if (path[n] == "rnews") next;
  host = $6;
}
$5 ~ /^newsxd\[[0-9]+\]:$/ {
  newsxds = 1;
  if ($6 == "shut" && $7 == "down")
    newsxd[$10]++;
  else if ($6 == "starting")
    newsxd["start"]++;
  else if ($6 == "reinitializing")
    newsxd["reinit"]++;
  else print;
  next;
}
  
$7 == "group" {
  readers = 1;
  ng[$8]++;
  next;
}
$7 == "ihave" {
  receive = 1;
  rec[host]++;
  if ($9 == "accepted") {
    rec_accept[host]++;
    if ($10 == "failed") rec_failed[host]++;
  } else if ($9 == "rejected") rec_refuse[host]++;
  next;
}
# this is from version 1.4 of nntpd
$7 == "ihave_stats" {
  receive = 1;
  rec[host] += $9 + $11 + $13;
  rec_accept[host] += $9;
  rec_refuse[host] += $11;
  rec_failed[host] += $13;
  next;
}
$7 == "connect" {
  systems[host]++;
  next;
}
# nntpxmit connection errors
# Ooooh! I *wish* awk had N dimensional arrays,
# so I wouldn't have to throw away the error message here!
$7 == "hello:" {
  conn[host]++;
  if ($8 == "Connection" && $9 == "refused")
    rmt_fail[host]++;
  else
    open_fail[host]++;
  next;
}
# we'll get stats from this, don't count conn[]
$7 == "xfer:" {
  open_fail[host]++;
# since these are expected to be few in number, we still print
# the exact error (no "next;" statement here).
}
$7 == "greeted" {
  conn[host]++;
  rmt_fail[host]++;
  next;
}
$7 == "host" && $8 == "unknown" {
  conn[host]++;
  ns_fail[host]++;
  next;
}
# nntpd connection abort - all "broken pipe" right now
$7 == "disconnect:" { next }
# syslogd shit
$7 == "repeated" { next }
# inews shit
$11 == "spooled" { next }
$7 == "exit" {
  if ($8 > 0) readers = 1;
  articles[host] += $8;
  groups[host] += $10;
  next;
}
$7 == "xmit" {
  xmt_cpu[host] += $9 + $11;
  xmt_ela[host] += $13;
  next;
}
$7 == "times" {
  cpu[host] += $9 + $11;
  ela[host] += $13;
  next;
}
$7 == "stats" {
  transmit = 1;
  conn[host]++;
  xmt[host] += $8;
  xmt_accept[host] += $10;
  xmt_refuse[host] += $12;
  xmt_failed[host] += $14;
  next;
}
#
#  For the Nth time, I wish awk had two dimensional associative
#  arrays. I assume that the last request is the same as all the
#  others in this section of logfile.
#
$7 == "newnews" {
  polled = 1;
  poll[host] ++;
  poll_asked[host] = $8;
  next;
}
$7 == "newnews_stats" {
  poll_offered[host] += $9;
  poll_took[host] += $11;
  next;
}
$7 == "post" {
  readers = 1;
  post[host]++;
  next;
}
$7 == "timeout" {
  timeout[host]++;
  timeouts = 1;
  next;
}
$7 == "unrecognized" {
  unknown[host]++;
#  curious = 1;  # originally by Erik.  I'll see it at the top of
                 # report anyway without it being an Unknown Explorer
  print $1, $2, $3, $4, $5, $6, $7, $8 # just print the first word,
  next;                                # which is really the unrecognised part.
}
$7 == "refused" {
  splut=1;
  refused[host]++;
  next;
}
### Print anything that we don't recognize in the report
{
  print;
}
END {
  printf("\n");

  if (newsxds) {
    printf("News Transmission Daemon Activity:\n");
    for (s in newsxd) {
      if (s == "start") printf("newsxd starts: %d\n",newsxd["start"]);
      else if (s== "reinit")
        printf("newsxd reinitialisations: %d\n",newsxd["reinit"]);
      else printf("newsxd shut downs by signal %d: %d\n",s,newsxd[s]);
    }
  }

  printf("\n");

### Article Exchange With Peers (other servers) Statistics
  if (polled) for(s in poll) servers[s]++;
  if (receive) for(s in rec) servers[s]++;
  if (transmit) for(s in xmt) servers[s]++;

  if (receive) {
    printf("Article Reception        Offered      Took         Toss         Fail\n");
    printf("Contacting Host           To Us    Total  Pct   Total  Pct   Total  Pct\n");
    for(s in rec) {
      nrec += rec[s];
      nrec_accept += rec_accept[s];
      nrec_refuse += rec_refuse[s];
      nrec_failed += rec_failed[s];

      they_offered = rec[s];
      if (they_offered == 0) they_offered = 1;
      we_toss = (rec_refuse[s] / they_offered) * 100 + 0.5;
      we_took = (rec_accept[s] / they_offered) * 100 + 0.5;
      we_fail = (rec_failed[s] / they_offered) * 100 + 0.5;

      printf("%-25s %5d    %5d %3d%%   %5d %3d%%   %5d %3d%%\n", s, rec[s], rec_accept[s], we_took, rec_refuse[s], we_toss, rec_failed[s], we_fail);
    }

    they_offered = nrec;
    if (they_offered == 0) they_offered = 1;
    we_toss = (nrec_refuse / they_offered) * 100 + 0.5;
    we_took = (nrec_accept / they_offered) * 100 + 0.5;
    we_fail = (nrec_failed / they_offered) * 100 + 0.5;
    printf("------------------------- -----    ----------   ----------   ----------\n");
    printf("%-25s %5d    %5d %3d%%   %5d %3d%%   %5d %3d%%\n\n", "TOTALS", nrec, nrec_accept, we_took, nrec_refuse, we_toss, nrec_failed, we_fail);
  }

###############################################################################
  if (polled) {
    printf("Article Transmission (they poll us)\n");
    printf("System                     Conn Offrd  Took   Elapsed       CPU  Pct  Groups\n");
    npoll = 0;
    npoll_offered = 0;
    npoll_took = 0;
    npoll_cpu = 0;
    npoll_ela = 0;

    for(s in poll) {
      npoll += poll[s];
      npoll_offered += poll_offered[s];
      npoll_took += poll_took[s];

      if (rec[s]) {
        printf("%-25s %5d %5d %5d  (see Article Reception)  %s\n", s, poll[s], poll_offered[s], poll_took[s], poll_asked[s]);
      } else {
        npoll_ela += ela[s];
        npoll_cpu += cpu[s];

        e_hours = ela[s] / 3600;
        e_sec   = ela[s] % 3600;
        e_min   = e_sec / 60;
        e_sec   %= 60;

        c_hours = cpu[s] / 3600;
        c_sec   = cpu[s] % 3600;
        c_min   = c_sec / 60;
        c_sec   %= 60;

        tmp = ela[s];
        if (tmp == 0) tmp = 1;
        pct = ((cpu[s] / tmp) * 100.0 + 0.5);

        printf("%-25s %5d %5d %5d %3d:%02d:%02d %3d:%02d:%02d %3d%%  %s\n", s, poll[s], poll_offered[s], poll_took[s], e_hours, e_min, e_sec, c_hours, c_min, c_sec, pct, poll_asked[s]);
      }
    }
    printf("\n%-25s %5d %5d %5d", "TOTALS", npoll, npoll_offered, npoll_took);
    if (npoll_ela > 0 && npoll_cpu > 0) {

      e_hours = npoll_ela / 3600;
      e_sec   = npoll_ela % 3600;
      e_min   = e_sec / 60;
      e_sec   %= 60;

      c_hours = npoll_cpu / 3600;
      c_sec   = npoll_cpu % 3600;
      c_min   = c_sec / 60;
      c_sec   %= 60;

      tmp = npoll_ela;
      if (tmp == 0) tmp = 1;
      pct = ((npoll_cpu / tmp) * 100.0 + 0.5);

      printf(" %3d:%02d:%02d %3d:%02d:%02d %3d%%\n\n", e_hours, e_min, e_sec, c_hours, c_min, c_sec, pct);
    } else
      printf("\n\n");
  }

###############################################################################
  if (transmit) {
    printf("Article Transmission    Offered       Took         Toss         Fail\n");
    printf("Host Contacted          To Them    Total  Pct   Total  Pct   Total  Pct\n");
    for(s in xmt) {
      we_offered = xmt[s];
      if (we_offered == 0) we_offered = 1;
      they_toss = (xmt_refuse[s] / we_offered) * 100 + 0.5;
      they_took = (xmt_accept[s] / we_offered) * 100 + 0.5;
      they_fail = (xmt_failed[s] / we_offered) * 100 + 0.5;

      printf("%-25s %5d    %5d %3d%%   %5d %3d%%   %5d %3d%%\n", s, xmt[s], xmt_accept[s], they_took, xmt_refuse[s], they_toss, xmt_failed[s], they_fail);

      nxmt        += xmt[s];
      nxmt_accept += xmt_accept[s];
      nxmt_refuse += xmt_refuse[s];
      nxmt_failed += xmt_failed[s];
    }

    we_offered = nxmt;
    if (we_offered == 0) we_offered = 1;
    they_toss = (nxmt_refuse / we_offered) * 100 + 0.5;
    they_took = (nxmt_accept / we_offered) * 100 + 0.5;
    they_fail = (nxmt_failed / we_offered) * 100 + 0.5;
    printf("------------------------- -----    ----------   ----------   ----------\n");
    printf("%-25s %5d    %5d %3d%%   %5d %3d%%   %5d %3d%%\n\n", "TOTALS", nxmt, nxmt_accept, they_took, nxmt_refuse, they_toss, nxmt_failed, they_fail);

    printf("Outgoing Transmission Connexions         ------errors-------\n");
    printf("System                     Conn    OK    NS   Net   Rmt  Pct\n");
    for(s in xmt) {
      tot = conn[s];
      if (tot == 0) tot = 1;
      errs = rmt_fail[s] + ns_fail[s] + open_fail[s];
      ok = (conn[s] - errs);
      printf("%-25s %5d %5d %5d %5d %5d %3d%%\n", s, conn[s], ok, ns_fail[s], open_fail[s], rmt_fail[s], (100.0 * errs / tot + 0.5));
      ct_tot += conn[s];
      ct_ok  += ok;
      ct_ns  += ns_fail[s];
      ct_net += open_fail[s];
      ct_rmt += rmt_fail[s];
    }
    tot = ct_tot;
    if (tot == 0) tot = 1;
    errs = ct_ns + ct_net + ct_rmt;
    printf("------------------------- ----- ----- ----- ----- ----- ----\n");
    printf("%-25s %5d %5d %5d %5d %5d %3d%%\n\n", "TOTALS", ct_tot, ct_ok, ct_ns, ct_net, ct_rmt, (100.0 * errs / tot + 0.5));
  }

### Article Readership Statistics

  if (readers) {
    printf("NNTP readership statistics\n");
    printf("System                     Conn Articles Groups Post\n");
    for(s in systems) {

### servers are different animals; they don't belong in this part of the report

      if (servers[s] > 0 && groups[s] == 0 && articles[s] == 0)
        continue;

### report the curious server pokers elsewhere

      if (groups[s] == 0 && articles[s] == 0 && post[s] == 0 && refused[s] != systems[s]) {
        unknown[s] += systems[s];
        curious = 1;
        continue;
      }

      nconn += systems[s];
      nart += articles[s];
      ngrp += groups[s];
      npost += post[s];

      # V7 awk is so damn annoying.  Can't match against variable patterns.
      # so instead i break apart host name and compare elements from the rear
      domain = "";
      nso = split(s, sp, ".");
      for (l in local) {
        nl = split(l, lp, ".");
        ns = nso;
        found = 1;
        while ( nl > 0 ) {
          if ( lp[nl--] != sp[ns--] ) {
            found = 0; nl=0;
          }
        }
        if (found) domain = "*." l;
      }
      # special-case f*cked up cs dept machines that won't tell me their names
      if (!domain && sp[1] == "128" && sp[2] == "213") domain = "*.cs.rpi.edu";
      if (domain) {
        rep_sys[domain] += systems[s];
        rep_art[domain] += articles[s];
        rep_grp[domain] += groups[s];
        rep_pst[domain] += post[s];
      } else {
        rep_sys[s] = systems[s];
        rep_art[s] = articles[s];
        rep_grp[s] = groups[s];
        rep_pst[s] = post[s];
      }
    }
    for (r in rep_sys) {
      printf("%-25s %5d %8d %6d %4d\n", r, rep_sys[r], rep_art[r], rep_grp[r], rep_pst[r]);
    }
    printf("------------------------- ----- -------- ------ ----\n");
    printf("%-25s %5d %8d %6d %4d\n\n", "TOTALS", nconn, nart, ngrp, npost);
  }

###############################################################################
  if (curious) {
    printf("Unknown NNTP server explorers\n\n");
    printf("System                     Conn\n");
    for(s in unknown) {
      printf("%-25s %5d\n", s, unknown[s]);
    }
    printf("\n");
  }
###############################################################################
  if (timeouts) {
    printf("nntpd timeouts\n");
    for(s in timeout) {
      printf("%-25s %5d\n", s, timeout[s]);
    }
    printf("\n");
  }
  if (splut) {
    printf("Refused connexions\n");
    for(s in refused) {
      if (refused[s] > 0)
        printf("%-25s %5d\n", s, refused[s]);
    }
    printf("\n");
  }
}
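
Two idioms from the script above are easier to see in isolation: rolling a
reader's host name up into its local domain by comparing name components
from the right, and the rounded percentage with its guard against a zero
denominator.  This is a sketch in Python, not a replacement for the awk; the
function names `summarize_domain` and `pct` are invented for illustration.

```python
def summarize_domain(host, local_domains):
    """Return "*.<domain>" if host is, or lies within, a listed domain; else None."""
    host_parts = host.split(".")
    for domain in local_domains:
        dom_parts = domain.split(".")
        # Compare whole components from the right, as the awk loop does.
        if (len(dom_parts) <= len(host_parts)
                and host_parts[-len(dom_parts):] == dom_parts):
            return "*." + domain
    return None


def pct(part, whole):
    """Percentage rounded to the nearest integer, treating a zero total as 1."""
    if whole == 0:
        whole = 1
    return int(part / whole * 100 + 0.5)
```

Comparing whole components rather than raw string suffixes is what keeps a
host in ecs.rpi.edu from being filed under cs.rpi.edu.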

lmb@vicom.com (Larry Blair) (12/15/89)

In article <9A~{L|@rpi.edu> tale@cs.rpi.edu (David C Lawrence) writes:
=I intend to re-invent an awk processor for C News's log, but haven't
=gotten around to it yet.  Does someone else already have something
=that gives a nice report after running over $NEWSCTL/log?

I've posted this a few times.  As yet, Henry and Geoff have shown no
interest in incorporating the minor logging change or distributing the awk
script.

First the patch.  I haven't tried to re-diff it against all the subsequent
patches, but I know that it still works; it may just need a slightly larger
fuzz factor.

*** relay/history.c.org	Sat Jun 17 23:14:20 1989
--- relay/history.c	Mon Jun 26 14:39:33 1989
***************
*** 184,191 ****
  
  	if (startlog) {
  		timestamp(stdout, &now);
! 		if (printf(" %s + %s", sendersite(nullify(art->h.h_path)),
! 		    msgid) == EOF)
  			fulldisk(art, "stdout");
  	} else
  		now = time(&now);
--- 184,199 ----
  
  	if (startlog) {
  		timestamp(stdout, &now);
! 		if(art->h.h_ngs == NULL) {
! 			if (printf(" %s f %s",
! 			    sendersite(nullify(art->h.h_path)), msgid) == EOF)
! 				fulldisk(art, "stdout");
! 		} else if (printf(" %s %c %s %s",
! 		    sendersite(nullify(art->h.h_path)),
! 		    (art->h.h_ctlcmd) ? 'c' : '+', msgid,
! 		    art->h.h_ngs) == EOF)
! 			fulldisk(art, "stdout");
! 		if ( art->h.h_ctlcmd && printf(" %s", art->h.h_ctlcmd) == EOF)
  			fulldisk(art, "stdout");
  	} else
  		now = time(&now);
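
To make the patched log format concrete, here is a sketch in Python of how
one such line might be taken apart.  It assumes the timestamp occupies the
first three whitespace-separated fields, so that the sender, the flag
character, and the Message-ID land in fields 4, 5, and 6; that layout is
inferred from the awk script's field numbers, not from the C News sources.
The name `parse_log_line` and the dictionary keys are invented for
illustration.

```python
def parse_log_line(line):
    """Split one patched C News log line into its parts, or return None."""
    fields = line.split()
    if len(fields) < 6:
        return None
    msgid = fields[5]
    if not msgid.endswith(">"):
        # Message-ID contains whitespace; skip the line, as the awk script does.
        return None
    entry = {
        "time": " ".join(fields[:3]),
        "sender": fields[3],
        "flag": fields[4],
        "msgid": msgid,
        "newsgroups": [],
        "rest": fields[6:],
    }
    # With the patch, "+" and "c" lines carry the newsgroup list after the ID.
    if entry["flag"] in ("+", "c") and entry["rest"]:
        entry["newsgroups"] = entry["rest"][0].split(",")
        entry["rest"] = entry["rest"][1:]
    return entry
```

For a "+" line, whatever is left in "rest" is the list of sites the article
was sent to; for a "c" line it starts with the control command and its
arguments.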

Here's the awk script:

#  USAGE: awk -f report_awk /usr/lib/news/log
#  AWK script which eats netnews log files and produces a summary of USENET
#  traffic over the period of time that the log was collected.
#
#  C news version - for use with log file patches
#
#  6/30/89
#
#  Erik E. Fair <dual!fair>
#  Original Author, May 22, 1984
#
#  Brad Eacker <onyx!brad>
#  Modified to simplify the record processing and to sort the output.
#
#  Erik E. Fair <dual!fair>
#  Modified to provide information about control messages.
#
#  Erik E. Fair <dual!fair>
#  Bug in system name extraction fixed. It was assumed that the fourth field
#  (system name) always had a dot. local is one that doesn't. Some others
#  (including 2.9 sites) don't either.
#
#  Earl Wallace <pesnta!earlw>
#  The "sent" field was changed from $5 to $6 in 2.10.2 (beta)
#  named "newstats" and called with no arguments.
#
#  Erik E. Fair <dual!fair>
#  Remove support for 2.10.1, revise for 2.10.2 to provide information
#  about junked articles, garbled articles, and bad newsgroups
#
#  Erik E. Fair <ucbvax!fair>
#  Minor bug fix to bad newsgroup reporting, also now counting ``old''
#  articles as junked, with counter for number that are `old'.
#
#  Erik E. Fair <ucbvax!fair>
#  Fix up the domain & local hosts support
#
#  Erik E. Fair <ucbvax!fair>
#  Fix up the counting of gatewayed material, add counting of "linecount"
#  problems. Additional cleanup to make things faster.
#
#  Larry Blair <lmb@vicom.com>
#  Rewritten for C news with modified logging.  Removed many of the B news
#  counts, such as linecount mismatch.
#
BEGIN{
#	"ourname" is the C news name of our system.  The old lprefix stuff
#	doesn't apply for C news, since a common naming scheme is provided.

	ourname = "vsi1";

#
#	For phony name, create real entries.  They divide into two classes.
#	Most are additive.  Some are subtractive, meaning that when the phony
#	group appears, you need to subtract for a site that was added to
#	in a previous alias.
#
#	This stuff is used if you are running a group batching scheme with
#	a phony site name.  We also use it to map stuff sent to "news",
#	which is ames' netnews system.
#
#	Example:
#	alias_add["leaf_main"]="sitea,siteb,sitec"
#	alias_sub["leaf_rest"]="sitec"
#
#	leaf_main would be attributed to sitea, siteb, and sitec
#	leaf_main, leaf_rest would be attributed to sitea and siteb
#
	alias_add["leaf_main"]="daver,teraida,zorch,frame,ubvax,octela,altos"
	alias_sub["leaf_rest"]="zorch"

	alias_add["news"]="ames"

#	If you do bi-directional USENET gatewaying (e.g. mailing list
#	to newsgroup where the material flows both ways freely), this
#	should be the name in the sys file that you use to mail stuff
#	to the mailing lists.
#
#	NOTE: I have not tested this stuff with C news. {lmb}
#
	pseudo = "internet";
	rptname = "(GATEWAY)";
#
#	Top level domain names and what network they represent
#	(for use in counting stuff that is gatewayed)
#
	domains["ARPA"] = rptname;
	domains["arpa"] = rptname;
	domains["EDU"] = rptname;
	domains["edu"] = rptname;
	domains["GOV"] = rptname;
	domains["gov"] = rptname;
	domains["COM"] = rptname;
	domains["com"] = rptname;
	domains["MIL"] = rptname;
	domains["mil"] = rptname;
	domains["ORG"] = rptname;
	domains["org"] = rptname;
	domains["NET"] = rptname;
	domains["net"] = rptname;
	domains["UK"] = rptname;
	domains["uk"] = rptname;
	domains["DEC"] = rptname;
	domains["dec"] = rptname;
	domains["CSNET"] = rptname;
	domains["csnet"] = rptname;
	domains["BITNET"] = rptname;
	domains["bitnet"] = rptname;
	domains["MAILNET"] = rptname;
	domains["mailnet"] = rptname;
	domains["UUCP"] = rptname;
	domains["uucp"] = rptname;
	domains["OZ"] = rptname;
	domains["oz"] = rptname;
	domains["AU"] = rptname;
	domains["au"] = rptname;
#
#	tilde chosen because it is ASCII 126 (don't change this)
#
	invalid = "~~~~~~";
#
	accept[invalid]   = 0;
	reject[invalid]   = 0;
	xmited[invalid]   = 0;
	control[invalid]  = 0;
	junked[invalid]   = 0;
	tossed[invalid]   = 0;
	neighbor[invalid] = 0;
	canfail = 0;
}
{
#	Henry says that whitespace in Message-ID's is ok.  Awk doesn't
#	like that, so we just won't count those ones.
	
	if(substr($6, length($6), 1) != ">")
		next;
#
#	Get the name of the system that did this,
#	taking into account that not everyone believes in domains.
#	[[This stuff is extraneous for C news ]]
#
#	if we get a route addr (we shouldn't, but...), take the last one
#	[[Particularly with C news - lmb]]
#
	nhosts = split($4, hosts, "@");
	hostname = hosts[nhosts];
#
#	get the root domain name, and the hostname
#
	ndoms = split(hostname, doms, ".");
	domain = doms[ndoms];
	sys = doms[1];
#
#	check for local system, and if not that, then internet sites.
#	special case the network name replacement of specific host names,
#	such that the network name is there only on a `local' posting
#	(which is really gatewaying in disguise)
#
	if(sys == ourname)
	{
		sys = "local";
	} else {
		dom = domains[domain];
		if (dom) sys = dom;
	}
}
#
#	Accepted articles.  Count the newsgroups and who we sent it to.
#
$5 == "+" {

	accept[sys]++;
	neighbor[sys] = 1;
	nng = split($7, ngl, ",");
	for(i = 1; i <= nng; i++) {
		dot = index(ngl[i], ".");
		if (dot) ng = substr(ngl[i], 1, (dot - 1));
		else ng = ngl[i];
		if (ng) newsgcnt[ng]++;
	}
	for(j = 8; j <= NF; j++) {
		if ($(j) == pseudo) $(j) = rptname;
		else neighbor[$(j)] = 1;
		xmited[$(j)]++;
	}
	next;
}

#
#	Rejected article.  At this point, we just count them.  The "tossed"
#	count is for groups that were "x'ed" in the active file, but it's
#	not currently being printed in the report.  This section should
#	be expanded.
#
$5 == "-" {
	reject[sys]++;
	if($7 == "all")	 tossed[sys]++;
	next;
}
#	These are the cancels that precede the article being cancelled.
#	Erik used to call them "failed", so I left it alone.  Note that
#	the cancel has already been counted on the "c" line.
#
$5 == "f"			{ canfail++; next }
#  
#	Count the junk.
# 
$5 == "j"		{ junked[sys]++; next }
#
#	Control messages.  This is not fully tested; there may be some
#	others that use more than one field.
#
$5 == "c"	{
	ctot++;
	accept[sys]++;
	control[sys]++;
	ctlcnt[$(8)]++;
	j = 9;
	if($8 == "cancel" || $8 == "rmgroup")
		j = 10;
	else if($8 == "newgroup")
	{
		if ($10 == "moderated") j = 11;
		else j = 10;
	}
	for( ; j <= NF; j++) {
		if ($(j) == pseudo) $(j) = rptname;
		else neighbor[$(j)] = 1;
		xmited[$(j)]++;
	}
	next;
}
#
#	Summarize and print the report
#
END{
#	special processing for Duplicates, because we can't tell if
#	they came from a netnews neighbor or from the gatewaying
#	activities until we have processed the entire log.
#
	for( hostname in reject ) {
#
#	get the root domain name, and the hostname
#
		ndoms = split(hostname, doms, ".");
		domain = doms[ndoms];
		sys = doms[1];
		if (! neighbor[sys]) {
			if (sys == ourname) {
				sys = "local";
			} else {
				dom = domains[domain];
				if (dom) sys = dom;
			}
		}
		i = reject[hostname];
		reject[hostname] = 0;
		reject[sys] += i;
	}

	rtot = 0;
	for( i in reject ) {
		if (reject[i] > 0) {
			list[i] = 1;
			rtot += reject[i];
		}
	}

	atot = 0;
	for( i in accept ) {
		list[i] = 1;
		atot += accept[i];
	}

	xtot = 0;
	for( i in xmited ) {
		if(alias_add[i] != "")
		{
			split(alias_add[i], ala, ",");
			for (j in ala)
			{
				list[ala[j]] = 1;
				xmited[ala[j]] += xmited[i];
			}
			xmited[i] = 0;
			continue;
		}
		if(alias_sub[i] != "")
		{
			split(alias_sub[i], als, ",");
			for (j in als)
			{
				xmited[als[j]] -= xmited[i];
			}
			xmited[i] = 0;
		}
	}
	for( i in xmited ) {
		if(xmited[i] != 0)
			list[i] = 1;
		xtot += xmited[i];
	}

	ctot = 0;
	for( i in control ) {
		list[i] = 1;
		ctot += control[i];
	}

	jtot = 0;
	for( i in junked ) {
		list[i] = 1;
		jtot += junked[i];
	}
#
# ctot is part of atot, so we don't add it in to the grand total.
#
	totarticles = atot + rtot;
	if (totarticles == 0) totarticles = 1;

	printf("\nSystem       \tAccept\tReject\tJunked\tXmit to\tControl\t%% total\t%% rejct\n");
	for( ; ; ) {
# selection sort
		i = invalid;
		for( j in list ) {
			if ( list[j] > 0 && j < i ) i = j;
		}
		if ( i == invalid ) break;
		list[i] = 0;
#
#	control & junked are counted under accept.
#
		sitetot = accept[i] + reject[i];
		if (sitetot == 0) sitetot = 1;
		articles[i] = sitetot;
#
# What an 'orrible printf spec
#
		printf("%-14s\t%6d\t%6d\t%6d\t%7d\t%7d\t%6d%%\t%6d%%\n", i, accept[i], reject[i], junked[i], xmited[i], control[i], (sitetot * 100) / totarticles, (reject[i] * 100) / sitetot);
#
	}
	printf("\nTOTALS        \t%6d\t%6d\t%6d\t%7d\t%7d\t%6d%%\t%6d%%\n", atot, rtot, jtot, xtot, ctot, 100, (rtot * 100) / totarticles);
	printf("\nTotal Articles processed %d", totarticles);
	printf("\n");

	if (ctot) {
		printf("\nControl	Invocations\n");
		for( i in ctlcnt ) {
			if (i == "cancel") {
				printf("%-12s %6d", i, ctlcnt[i]);
				if (canfail) printf(", %d failed", canfail);
				printf("\n");
			} else {
				printf("%-12s %6d\n", i, ctlcnt[i]);
			}
		}
	}

	if (atot) {
		printf("\nNetnews Categories Received\n");
		l = 0;
		for( i in newsgcnt ) {
			if (l < length(i)) l = length(i);
		}
		fmt = sprintf("%%-%ds %%6d\n", l);
		for( ; ; ) {
# selection sort
			max = 0;
			for( j in newsgcnt ) {
				if (newsgcnt[j] > max) {
					i = j;
					max = newsgcnt[j];
				}
			}
			if (max == 0) break;
			printf(fmt, i, newsgcnt[i]);
			newsgcnt[i] = 0;
		}
	}
}
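
The alias bookkeeping in the END block is the part most likely to need local
adjustment, so here is the idea on its own as a Python sketch.  It assumes
the semantics described in the BEGIN comments (counts for a phony name are
added to every site in its alias_add list and subtracted from every site in
its alias_sub list) and builds the result in a fresh table, so the answer
does not depend on the order in which names are visited.  The function name
`expand_aliases` is invented for illustration.

```python
def expand_aliases(xmited, alias_add, alias_sub):
    """Map per-name transmit counts onto real sites, resolving phony names."""
    result = {}
    for name, count in xmited.items():
        if name in alias_add:
            # Additive alias: credit every real site behind the phony name.
            for site in alias_add[name].split(","):
                result[site] = result.get(site, 0) + count
        elif name in alias_sub:
            # Subtractive alias: take back what an additive alias over-credited.
            for site in alias_sub[name].split(","):
                result[site] = result.get(site, 0) - count
        else:
            result[name] = result.get(name, 0) + count
    return result
```

With 10 articles sent to leaf_main and 3 of them also to leaf_rest, sitea and
siteb each end up with 10 and sitec with 7, whichever name is seen first.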
-- 
Larry Blair   ames!vsi1!lmb   lmb@vicom.com

henry@utzoo.uucp (Henry Spencer) (12/16/89)

In article <1989Dec14.173151.13369@vicom.com> lmb@vicom.COM (Larry Blair) writes:
>I've posted this a few times.  As yet, Henry and Geoff have shown no
>interest in incorporating the minor logging change or distributing the awk
>script.

Don't confuse lack of time with lack of interest.  There is a *whole bunch*
of stuff sitting in our "look at -- low priority" queue.  I have hopes of
possibly getting it cleaned out over the holidays.

Given that news is functioning satisfactorily, we get paid to work on other
things, and C News gets whatever moments we can spare.  Not many, of late.
-- 
1755 EST, Dec 14, 1972:  human |     Henry Spencer at U of Toronto Zoology
exploration of space terminates| uunet!attcan!utzoo!henry henry@zoo.toronto.edu