mangoe@umcp-cs.UUCP (Charley Wingate) (11/11/85)
Here are the average article sizes and top 25 groups minus top 25 users for
Nov. 10:
     Orig. Avg. Art. New      %
Rank Rank    Size   Kbytes   Chg.   Group 
  1    1      1.4    563.8    4.1%  net.news.group
  2    2      1.7    319.3   23.0%  net.politics
  3    6      1.5    308.4    7.5%  net.flame
  4   10      1.5    244.0    3.7%  net.news
  5   13      1.0    234.0    0.0%  net.movies
  6   14      1.3    233.6    0.0%  net.women
  7    9      1.3    224.9   20.5%  net.micro.mac
  8    7      2.4    208.8   32.7%  net.religion
  9    3      4.0    208.7   46.7%  net.sources
 10    4      1.1    197.1   48.1%  net.music
 11    5      2.3    189.8   50.1%  net.philosophy
 12   12      2.0    188.0   21.1%  net.religion.christian
 13   17      0.6    185.8    0.0%  net.sf-lovers
 14   19      1.1    181.9    0.0%  net.audio
 15   20      0.8    174.9    1.9%  net.unix-wizards
 16   15      2.8    158.3   28.8%  net.politics.theory
 17   23      0.8    151.0    0.0%  net.cooks
 18   21      1.0    150.5   10.4%  net.lang.c
 19   25      0.6    142.2    2.2%  net.jokes
 20   11      3.4    141.1   44.1%  net.origins
 21   22      0.8    139.9    8.8%  net.unix
 22   16      1.1    138.6   27.9%  net.micro.amiga
 23   24      1.2    136.0    7.2%  net.arch
 24   18      1.5     89.3   51.2%  net.micro.pc
 25    8      8.9     55.0   81.8%  net.sources.mac
Again, note the number of large drops.
For comparison, Rich Rosen (the top user) would have been 3rd on this list.
Charley Wingate
ems@amdahl.UUCP (ems) (11/27/85)
It would be interesting to see what percentage of total
volume and what percentage of each group volume was made up of
headers and footers.  By what percent would total net volume
drop if cute .signatures were eliminated and a standard
disclaimer were appended to all articles.  (Rather than having
each person come up with the obligatory disclaimer...)
This should save at least the couple of percent that the major
groups consume.
-- 
E. Michael Smith  ...!{hplabs,ihnp4,amd,nsc}!amdahl!ems
'If you can dream it, you can do it'  Walt Disney
This is the obligatory disclaimer of everything. (Including but
not limited to: typos, spelling, diction, logic, and nuclear war)
adams@calma.UUCP (Robert Adams) (11/27/85)
> E. Michael Smith  ...!{hplabs,ihnp4,amd,nsc}!amdahl!ems
> It would be interesting to see what percentage of total
> volume and what percentage of each group volume was made up of
> headers and footers.  By what percent would total net volume
> drop if cute .signatures were eliminated and a standard
> disclaimer were appended to all articles.  (Rather than having
> each person come up with the obligatory disclaimer...)
>
> This should save at least the couple of percent that the major
> groups consume.

I wrote a program to scan all news on our system and gather
such statistics.  What follows is the output of same.
A "signature" is the lines after a line of "-- " which doesn't get
them all but...  "Included" lines are ones beginning with ">".
Average article length is skewed because of a few large files --
maps and sources.

Notice that 1/4 of the characters are in headers and that 4% of
the total characters stored are in the Path: line.

adams@calma.UUCP -- Robert Adams ...!ucbvax!calma!adams

------------------ cut here ----------------
files = 9328, lines = 523997, characters = 21082117
average lines per file = 56, average chars per file = 2260
header lines = 119323, characters = 5099675, percent = 24%
signature lines = 28212, characters = 957140, percent = 5%
inserted lines = 47111, characters = 2401935, percent = 11%

                                              percent   percent
Header            occurances  total chars  of headers  of total
Relay-Version           9327       522336       10.2%      2.5%
Posting-Version         9322       585931       11.5%      2.8%
Path                    9327       803948       15.8%      3.8%
From                    9327       344330        6.8%      1.6%
Newsgroups              9327       268699        5.3%      1.3%
Subject                 9327       362454        7.1%      1.7%
Message-ID              9327       285049        5.6%      1.4%
Date                    9327       260814        5.1%      1.2%
Article-I.D.               0            0        0.0%      0.0%
Posted                     0            0        0.0%      0.0%
Date-Received           9327       345084        6.8%      1.6%
References              5279       286359        5.6%      1.4%
Distribution            2743        46750        0.9%      0.2%
Organization            8531       393317        7.7%      1.9%
Lines                   9327        82888        1.6%      0.4%
Xref                    2381       126036        2.5%      0.6%
Approved                 394        12167        0.2%      0.1%
Nf-ID                    603        27965        0.5%      0.1%
Nf-From                  605        30231        0.6%      0.1%
Control                  275         8581        0.2%      0.0%
Reply-To                2515       109362        2.1%      0.5%
Sender                  1150        34935        0.7%      0.2%
Xpath                     61         1379        0.0%      0.0%
Keywords                 560        17544        0.3%      0.1%
Summary                  727        17643        0.3%      0.1%
Followup-To              132         3405        0.1%      0.0%
Expires                   67         2054        0.0%      0.0%
Cc                         7           65        0.0%      0.0%
Apparently-To              5          160        0.0%      0.0%
In-reply-to                1           63        0.0%      0.0%
This-Account               1           26        0.0%      0.0%
Reply-tp                   1           27        0.0%      0.0%
Original-Subject           3          179        0.0%      0.0%
Followups-to               2           50        0.0%      0.0%
other                      2          117        0.0%      0.0%
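
The program itself was not posted, so what follows is only a rough sketch of how such a scan might work, not the original. It uses the definitions given in the article: headers are everything before the first blank line, a "signature" is whatever follows a line of "-- ", and "included" lines begin with ">". The names `scan_article` and `percent` are hypothetical, and so are two choices the article does not settle: each line is counted with its trailing newline, and percentages are rounded to the nearest whole number.

```python
from collections import Counter


def percent(part, whole):
    """Whole-number percentage; rounding to nearest is an assumption."""
    return round(100 * part / whole) if whole else 0


def scan_article(text):
    """Tally header, signature and included-text sizes for one article.

    Returns (totals, header_count, header_chars). Character counts
    include one newline per line, which is an assumption.
    """
    totals = Counter()
    header_count = Counter()   # header name -> number of occurrences
    header_chars = Counter()   # header name -> characters used
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()            # drop the empty piece after the final newline
    in_header, in_signature, current = True, False, None
    for line in lines:
        size = len(line) + 1
        totals["lines"] += 1
        totals["chars"] += size
        if in_header:
            if line == "":     # first blank line ends the headers
                in_header = False
                continue
            totals["header_lines"] += 1
            totals["header_chars"] += size
            if line[0] in " \t" and current is not None:
                header_chars[current] += size   # continuation line
                continue
            current = line.split(":", 1)[0] if ":" in line else "other"
            header_count[current] += 1
            header_chars[current] += size
        elif in_signature:
            totals["signature_lines"] += 1
            totals["signature_chars"] += size
        elif line == "-- ":    # delimiter itself is not counted as signature
            in_signature = True
        elif line.startswith(">"):
            totals["included_lines"] += 1
            totals["included_chars"] += size
    return totals, header_count, header_chars


if __name__ == "__main__":
    sample = "Path: a!b\nFrom: x@y\n\n> quoted\nreply\n-- \nsig line\n"
    totals, count, chars = scan_article(sample)
    print(dict(totals))
    print("headers:", percent(totals["header_chars"], totals["chars"]), "%")
```

Summing the per-article counters over a spool directory would give totals of the kind shown in the output above. As a check on the rounding assumption, the posted figures are consistent with it: 5099675 of 21082117 characters comes out at 24%, 957140 at 5%, and 2401935 at 11%.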