reid@decwrl.UUCP (02/01/87)
USENET READERSHIP SUMMARY REPORT for Jan 87

This is the first article in a monthly posting series from the Network Measurement Project at the DEC Western Research Laboratory in Palo Alto, California.

This survey is based on a sample of data taken from various USENET sites. At the end of this message there is a short explanation of the measurement techniques and the meaning of the various statistics. The messages that follow this one show survey data sorted by various criteria. The complete set of readership data (of which this is a summary) is posted in mod.newslists. The software that will let your site participate in the survey is in net.sources.

        Brian Reid

OVERALL SUMMARY:
                                        This      Estimated for
                                        Sample    entire net
Sites:                                  405       5700
Fraction reporting:                     7.11%     100%
Users with accounts:                    51659     727000
Netreaders:                             11121     156000

Average readers per site:               27
Percent of users who are netreaders:    21.53%
Average traffic per day (megabytes):    2.010
Average traffic per day (messages):     864
Traffic measurement interval:           last 21 days
Readership measurement interval:        last 75 days
Sites used to measure propagation:      203

Valid data received from these sites:

3comvax a60 aaec abic abnji acctel acetes acornrc adelie aeras akov68.dec.com alberta alice alliant altos.dec.com alv amdahl amdcad ames amulet.dec.com ant.dec.com arthur.cs.purdue.edu ascvax asd.dec.com asic.dec.com aspen2.dec.com astro astroatc astrovax atari athena axis b-tech bach.dec.com bass basser bcm5000 bdmrrr bene beno beta.dec.com bigbang bigtex bizet.dec.com bms-at bnl brand brspyr1 btnix bu-cs bucket bute.tcom.stc.co.uk cacilj cad.dec.com cadomin cadsys.dec.com caip.rutgers.edu calgary cascade casee.dec.com cavell cbdkc1 cbosgd cca celica.dec.com cesare.dec.com cgfsv1.dec.com cgfsv2.dec.com cgl.ucsf.edu chalmers charlie chas2 chovax.dec.com ci-dandelion circe cisden cisunx cit-vax closet.dec.com cod cognos concurrent.co.uk cookie.dec.com cp1 cpro cpsc53 cpw.columbia.edu crcge1 crin cs.nott.ac.uk csadfa csustan cuae2
curie.dec.com curium.dec.com cuuxb cvl cwruecmp cxsea daimi dalcs darth davasun dayton dciem dcl-cs dcl-csvax decwrl desint devon dgis diamond.bbn.com dievms.dec.com dlb dmcnh doshita drexel drums.dec.com dssdev.dec.com dukempd dvinci.dec.com dycom ecc.dec.com ector.cs.purdue.edu edison elbereth.rutgers.edu elrond elroy elsie ems endor enea eneevax enmasse entropy eros eta ethos ewok.dec.com exodus.dec.com fai felix firqb.dec.com fisher flinders fortune fritz furilo.dec.com ganash garfield.mun.cdn gatech geac genrad geocub glacier goons.dec.com gouldsd grc97 gt-stratus haddock.isc.com hao hcx1 hjuxa hoptoad hpcea hpldora hpscad.dec.com hqda-ai hscfvax husc4 ico ihnp4 ileaf ima imagen imt3b2 indian.dec.com infinet intrin invest iscuva ishtar isis isl izimbra.css.gov jasper jimi jplgodo kaoa01.dec.com kefren killer kirk.dec.com kodak korppi kosman kpe kryptn.dec.com labrea lerouf.dec.com litp liuida lll-crg macbeth maccs macs marlin masscomp maynard mcf mcgill-vision mck-csc me-ncr meccsd meccts mind mit-eddie mntgfx mordred.cs.purdue.edu moscom mss msudoc mtblue.dec.com mtgzy mtgzz mulga munnari munsell naakka navajo nbires ncr-sd ncrcae nears nesterc nexus.dec.com nike noao novavax nplpsg nssg.dec.com nttlab nucsrl oblio oblio.dec.com ocean oddjob oktext olamb omepd onecom opus orion osi3b2 osiris osu-eddie osupyr panda parity.dec.com pbhya pbhyc pbhyd pbhye pdn pegasus penet percival peregrine philtis phoenix phri piaget plaid pogo polaris pompeo.dec.com popeye popvax poseidon potak.dec.com potaru.dec.com potomac princeton psivax ptovax.dec.com ptsfa ptsfb ptsfc ptsfd pulman.dec.com pyramid qantel qnda01 qtc quad1 ra rayssd reality1 remsit rlgvax rlvd rochester rocky rosevax rsts32.dec.com rti-sel sandia sandoz santra saturn sauron scgvaxd scicom sdcsvax se-sd shasta sicsten sigma sjuvax soma spar sphinx sri-spam sstmv1.dec.com star.dec.com stb stride styx suadb sunybcs tallis.dec.com teddy teklds temvax termin tesla tipple.dec.com tmsoft topaz.rutgers.edu tropix 
trwhal trwspf tuck tucos turtlevax tutctl tutor tymix ubc-cs ucbarpa.berkeley.edu ujocs ulowell ultra.dec.com umd5.umd.edu umn-cs umn-d-ub umndub uokmax uqcspe.oz usc-oberon usiv03.dec.com utacs utcs utcsri utrtsc.dec.com uwmacc valmet vianet vino.dec.com viper voder voodoo vrdxhq vu-vlsi vulcan walldata walrus wanginst watale watarts watcal watcgl watdaisy watdcsu watdragon wateng water watlion watmath watmum watnot watopt watrose watvlsi wjh12 wnuxb wolf wp3b01 wuphys xios xylar.dec.com yarra yetti yogi.dec.com zeus zorro

------------------------------------------------------------------------------

EXPLANATION OF THE MEASUREMENTS AND STATISTICS

Survey data is taken by having one person at each site run a program called "arbitron", which looks at the news or notes files and determines the newsgroups that the user has read within a recent interval.

To "read" a newsgroup means to have been presented with the opportunity to look at at least one message in it. Going through a newsgroup with the "n" key counts as reading it.

For a news site, "user X reads group Y" means that user X's .newsrc file has marked at least one unexpired message in Y. If there is no traffic in a newsgroup for the measurement period, then the survey will show that nobody reads the group.

For a notes site, "user X reads group Y" means that user X has been in the notesfile with the sequencer in the last 14 days. The "14 days" interval for notesfiles corresponds to "unexpired" for news.

The "arbitron" program is periodically posted to net.sources, or is available from me (decwrl!reid). The notesfiles version of the program should be available through standard notesfiles software distribution channels as well.

SITES SURVEYED IN THIS SAMPLE

"This Sample" means the set of sites that have sent in an arbitron report within the past "Readership measurement interval" days. In every case the most recent report from each site is used. At the moment, some of the readership reports are several months old.
In future postings those reports will have expired and will not be included.

One might argue that the sample is self-selected, and therefore biased. It does in fact have a certain self-selection factor in it, because we only get data from sites at which someone participates in the survey. However, we do not require the participation of every user at a site, only one user. The survey program returns data for every user on the system on which it was run. Since there are an average of about 27 people per site reading news, there is a certain amount of randomness introduced that way. Of course, the sample is biased in favor of large sites (they are more likely to have a user willing to run the survey program) and software-development-oriented sites (more likely to have a user *able* to run the survey program). I intend to post, reasonably soon, some breakdowns of statistics about the sites that have responded.

NETWORK SIZE

I determine the network size by looking at the set of sites that are mentioned in the Path lines of news articles arriving at decwrl. This number is consistently higher than the number of sites that posted a message (as measured and posted from Seismo) because it includes passive sites that are on the paths between posting sites and decwrl. Each month I store the names of the hosts that are named that month, and for this report I used the past 10 months' worth of data.

There are 5633 different sites in the Path lines of articles that arrived at decwrl in the last 10 months. There are 5093 different sites in the mod.map data, but mod.map includes every site that participates in uucp; there is a considerable number of machines that exchange uucp mail but do not get USENET. Of those 5633 sites, 59 (1%) are DEC E-net hosts that are not part of uucp, and which therefore are not included in the 5093 figure.

Despite these various difficulties, I believe that 5700 is the best estimate for the size of USENET.
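
The Path-line counting described above can be sketched roughly as follows. This is only an illustration, not the program actually used at decwrl: the function names and the sample Path values are made up, and it assumes that the components of a Path line are separated by "!" and that the last component is the poster's login name rather than a site.

```python
def sites_in_path(path_value):
    """Return the site names in one Path value.

    Assumption: hops are separated by '!' and the final component is the
    poster's login name, so it is dropped.
    """
    hops = [hop for hop in path_value.strip().split("!") if hop]
    return hops[:-1]


def count_distinct_sites(path_values):
    """Count the distinct sites named across many Path values."""
    seen = set()
    for value in path_values:
        seen.update(sites_in_path(value))
    return len(seen)


# Hypothetical Path values, for illustration only.
sample_paths = [
    "decwrl!sun!ihnp4!cbosgd!alice",
    "decwrl!glacier!navajo!bob",
    "decwrl!sun!amdahl!carol",
]

print(count_distinct_sites(sample_paths))
```

A site that only relays articles (such as "sun" in the made-up data) is counted even if nobody there ever posts, which is why a count taken this way comes out higher than a count of posting sites.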
Because this figure is actually a measurement of the number of sites that have posted a message or that are on the path to a site that has posted a message, it will be slightly smaller than the number of sites that actually read netnews. Any site that believes it is not being counted can just ensure that it posts at least one message a year, so that it will be counted.

NUMBER OF USERS

The number of users at each site is determined in a site-specific fashion. Sometimes it is done by counting the number of user accounts that have shells and login directories. Sometimes it is done by counting the number of people who have logged in to the machine in some interval. Sometimes other techniques are used. This number is probably not very accurate--certainly not more accurate than to within a factor of two.

ESTIMATED TOTAL NUMBER OF PEOPLE WHO READ THIS GROUP, WORLDWIDE

There are two sources of error in this number. The number is computed by multiplying the number of people in the sample who actually read the group by the ratio of estimated network size to sample size. The estimated total can therefore be biased by errors in the network size estimate (see above) and also by errors in the determination of whether or not someone reads a group. Assuming that "reading a group" is roughly the same as "thumbing through a magazine", in that you don't necessarily have to read anything, but you have to browse through it and see what is there, then the measurement error will come primarily from inability to locate .newsrc files, which can either be protected or moved out of root directories. There is no way of measuring the effect on the measurements from unlocated .newsrc files, but it is not likely to be more than a few percent of the total news readers.

PROPAGATION: HOW MANY SITES RECEIVE THIS GROUP AT ALL

This number is the percent of the sites that are even receiving this newsgroup.
The information necessary to compute propagation was not generated by early versions of the arbitron program, so the "basis" (number of sites) used to generate the Propagation figure is smaller than the "Sites in this sample" figure. A site's data will be used to compute propagation if either (a) it reports zero readers for at least one group, or (b) it is using an arbitron with an explicit version number that is high enough.

MESSAGES PER MONTH AND KILOBYTES PER MONTH

Traffic is measured at decwrl, in Palo Alto, California. Any message that has arrived at decwrl within the last "Traffic measurement interval" days is counted, regardless of when it was posted. Monthly rates are computed by taking the total traffic, dividing by the number of days in the traffic measurement interval, and multiplying by 30.

Decwrl runs 2.10.3 news, which does not store the "Date-Received", "Relay-version" or "Posting-version" header lines; the amount of space occupied at your site might be higher, and the number of bytes transmitted between machines is probably higher. By definition this number is correct, because it is an exact measurement, but it may differ from the traffic at your site by as much as 15% due to timing differences and news version differences. Timing differences will be random, but will average out in the long run. News version differences will cause a systematic error that is additively uniform across all newsgroups, and which therefore does not significantly affect ratios.

If a message is crossposted to several groups simultaneously, it is charged only to the first-named group in the list.

PARTICIPATION RATIO: MESSAGES per MONTH per 1000 READERS

This number is exactly what it says: the number of messages per month in that newsgroup, divided by the number of readers in thousands. It is an indication of how involved the readers of the group are in the traffic, of whether they are mostly listeners or mostly talkers. Its accuracy is limited by the accuracy of its two components.
The messages per month figure is exact; the reader count is only as accurate as the network size estimate, which is in the worst case accurate to within 40%. Therefore you should treat this number as having an error margin of plus or minus 40%. However, ratios between participation ratios for different newsgroups are quite accurate, since the network-size component divides out.

COST RATIO: DOLLARS PER MONTH PER READER

The most controversial field in the survey report is the "$US per month per reader". It is the estimated number of dollars that are being spent on behalf of each reader, worldwide, on telephone costs to transmit this newsgroup. The cost ratio does not include the cost of disk storage to store the news or of computer time to process it; both of those are assumed to be free.

The cost ratio is computed as follows:

    $US/month/reader   = ($USPerMonthPerSite * numberOfSites) / numberOfReaders
    $USPerMonthPerSite = KBytesTrafficPerMonth * $USPerKByte
    $USPerKByte        = ($USperMinute / KBytesPerMinute) * (1 - CompressionFactor)
    $USperMinute       = 0.10   [ten cents per minute avg phone cost]
    KBytesPerMinute    = 60 * BytesPerSecond / 1000
    BytesPerSecond     = 100    [average transfer rate over 1200-baud line]
    CompressionFactor  = 0.4    [40% compression is typical for netnews]

Combining all these gives

    $USPerMonthPerSite = KBytesTrafficPerMonth * (0.10 / 6) * (1 - 0.4)
                       = KBytesTrafficPerMonth / 100

Therefore:

    $US/month/reader = (KBytesTrafficPerMonth * numberOfSites) / (100 * numberOfReaders)

The accuracy of this number is in fact better than the accuracy of the participation ratio, because the source of error--the network size estimate--is present both in the numerator and the denominator, and therefore cancels out. The primary source of bias in this number comes from the bias in the "estimated number of readers, worldwide", which is described above. Treat this value as being accurate to within about 25%.

SITE PARTICIPATION

I would like to receive data from every site on USENET.
The arbitron programs (posted to net.sources along with this report) work on news 2.9, 2.10.[1-3], 2.11, and on many versions of notesfiles.

        Brian Reid
        DEC Western Research Laboratory, Palo Alto CA
        reid@decwrl.DEC.COM
        {ihnp4,allegra,decvax,ucbvax,sun,glacier}!decwrl!reid
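
For readers who want to check the arithmetic, the statistics defined in the report can be sketched in a few lines of Python. This is not part of arbitron or of the report's own software; the function names are hypothetical, and the default constants are the ones stated in the report (405 sample sites, 5700 estimated sites, a 21-day traffic interval, 10 cents per minute, 100 bytes per second, 40% compression).

```python
def estimate_for_net(sample_count, sample_sites=405, net_sites=5700):
    """Scale a count seen in the sample up to the estimated whole net."""
    return sample_count * net_sites / sample_sites


def monthly_rate(total_in_interval, interval_days=21):
    """Convert traffic seen in the measurement interval to a 30-day rate."""
    return total_in_interval / interval_days * 30


def participation_ratio(messages_per_month, readers):
    """Messages per month per 1000 readers."""
    return messages_per_month / (readers / 1000)


def cost_per_reader(kbytes_per_month, sites, readers,
                    usd_per_minute=0.10, bytes_per_second=100,
                    compression=0.4):
    """$US per month per reader, following the formulas in the report."""
    kbytes_per_minute = 60 * bytes_per_second / 1000
    usd_per_kbyte = usd_per_minute / kbytes_per_minute * (1 - compression)
    usd_per_month_per_site = kbytes_per_month * usd_per_kbyte
    return usd_per_month_per_site * sites / readers


# Figures taken from the OVERALL SUMMARY table.
print(estimate_for_net(51659))   # users with accounts, whole net
print(estimate_for_net(11121))   # netreaders, whole net
print(monthly_rate(864 * 21))    # messages per month from 864 per day

# Made-up group: 3000 KB/month, 5700 sites, 10000 readers.
print(cost_per_reader(3000, 5700, 10000))
```

With the default constants the per-kilobyte cost works out to one cent, so cost_per_reader reduces to the report's final formula, KBytesTrafficPerMonth * numberOfSites / (100 * numberOfReaders). The first two values printed agree with the "Estimated for entire net" column once rounded to the nearest thousand or so.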
reid@decwrl.UUCP (Brian Reid) (02/01/87)
USENET READERSHIP SUMMARY REPORT for Jan 87

This is the first article in a monthly posting series from the Network Measurement Project at the DEC Western Research Laboratory in Palo Alto, California. I have posted it by hand several hours late because we had a slight hiccup at this end and it didn't go out on schedule.