[news.groups] accuracy of arbitron "reader" data

randall@uvaarpa.virginia.edu (Randall Atkinson) (09/21/89)

In article <18401@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>Greg has often written to me that he doesn't feel there is any significant
>accuracy to the readers/machine figures that one can derive from Brian's
>arbitron figures.   One thing is clear, the readers/machine figure is
>exact for arbitron sites.  

Until last month, and starting who knows when, edison.cho.ge.com was reporting
wildly inaccurate counts of people reading newsgroups.  The reason was
that the expire we used didn't have the options that arbitron expected
us to use, so our active file misled arbitron about which articles were
recent.  This has been fixed somewhat, but now arbitron reports the
correct number of newsreaders but 0 users on the system.  The arbitron
data is not exact for arbitron sites. QED

Nevertheless, I think that the arbitron and inpaths data are useful
benchmarks as long as we recognise upfront that they are not accurate.
I do trust them for broad trends and for propagation and flow information.

The point about large sites running rrn and sending in misleading
"who reads what" statistics is very true.  This university (UVa)
primarily uses rrn to a single machine and so any reports which might
be sent from here are going to be much lower than reality with
respect to the "reader" data.

DISCLAIMER:  I don't speak for the corporation which is the University
	of Virginia; these are my own views.

chuq@Apple.COM (Chuq Von Rospach) (09/22/89)

>Nevertheless, I think that the arbitron and inpaths data are useful
>benchmarks as long as we recognise upfront that they are not accurate.
>I do trust them for broad trends and for propagation and flow information.

Agreed. Arbitron has generally shown readership of rec.mag.otherrealms in the
3500-5000 range. I've done independent surveys where, by checking the
percentage of returns in an area with known levels of readership and
extrapolating those out to the entire survey response, my readership is
closer to 11,000 to 15,000. 
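
The extrapolation described above can be sketched roughly as follows.
The function name and every number in the example call are hypothetical,
chosen only to show the arithmetic; they are not the actual survey figures.

```python
def extrapolate_readership(known_readers, known_returns, total_returns):
    """Estimate total readership from survey returns.

    known_readers  -- readers known to exist in a calibration area
    known_returns  -- survey returns received from that same area
    total_returns  -- survey returns received overall
    Assumes the response rate in the calibration area holds everywhere.
    """
    if known_readers <= 0 or known_returns <= 0:
        raise ValueError("calibration area needs positive readers and returns")
    # total / (known_returns / known_readers), kept as one expression
    return total_returns * known_readers / known_returns


# Hypothetical inputs: 10 returns from an area with 200 known readers
# (a 5% response rate) and 600 returns overall.
estimate = extrapolate_readership(known_readers=200, known_returns=10,
                                  total_returns=600)
print(round(estimate))
```

The estimate is only as good as the assumption that the calibration area
responds at the same rate as everyone else.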

That's at least a 50% difference between arbitron and my numbers. My numbers
may be a little high. They may also be a little low (+- 5% or so). It's
impossible to tell. All I do know is that Brian's numbers are a lot
different and significantly lower, well beyond his margin of error.

What I think you can tell from Brian's numbers are relative things -- I
don't think the relative number of readers will change. I don't think the
relative interest in a group would change. etc. Absolute numbers, though,
they aren't.

-- 

Chuq Von Rospach <+> Editor,OtherRealms <+> Member SFWA/ASFA
chuq@apple.com <+> CI$: 73317,635 <+> [This is myself speaking. I am not Appl
Segmentation Fault. Core dumped.

brad@looking.on.ca (Brad Templeton) (09/22/89)

In article <34921@apple.Apple.COM> chuq@Apple.COM (Chuq Von Rospach) writes:
>
>Agreed. Arbitron has generally shown readership of rec.mag.otherrealms in the
>3500-5000 range. I've done independent surveys where, by checking the
>percentage of returns in an area with known levels of readership and
>extrapolating those out to the entire survey response, my readership is
>closer to 11,000 to 15,000. 

OtherRealms, as you know, is a very special case, because it is 12 articles
over 3 days once every N months.  Many sites are down to 1 week expire now.
(Hell, I used to run 1 day expire on most groups) and so it makes sense
that a once/month arbitron report would be off by a factor of 4.
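
One way to read the factor-of-4 argument, as a minimal sketch: assume a
reader is only counted if the articles are still on disk when the monthly
report runs, and that the report lands at a random point in the month.
Both assumptions and the function name are illustrative simplifications,
not a description of how arbitron itself is implemented.

```python
def undercount_factor(expire_days, report_interval_days):
    """Rough factor by which a periodic report undercounts a burst posting.

    Assumes the report can only see the group while its articles are
    unexpired, and that the report date is uniformly random within the
    interval (both are simplifying assumptions).
    """
    if expire_days <= 0 or report_interval_days <= 0:
        raise ValueError("both durations must be positive")
    # Fraction of the interval during which the articles are visible.
    visible_fraction = min(1.0, expire_days / report_interval_days)
    return 1.0 / visible_fraction


# One-week expire against a once-a-month report: 30 / 7, a bit over 4.
print(round(undercount_factor(7, 30), 1))
```

With a one-day expire the same arithmetic gives a factor of 30, which is
why a long expire time on the group matters so much here.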

In fact, didn't Chuq once deliberately arrange to expire R.M.O well beyond
the default time, in part to avoid having to deal with back-issue requests,
but also in part to make sure it showed up on the Arbitron surveys?

Actually, my personal opinion is that the survey results are too high,
by as much as a factor of 5 in some cases.  71,000 for rec.humor.funny?
Not a chance in the world.  But that's because we have no really strong
reliable way to find out what percentage of total netreaders the 719
arbitron sites represent.  Brian says 4.5% based on the fact that there are
15,000 machines in net maps.  A survey I did said that about 21% of all
USENET articles are posted by readers at arbitron sites.  The real answer
lies between.
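
To see how much the coverage fraction matters, here is a small sketch. It
assumes the 71,000 figure was produced by dividing a raw count at arbitron
sites by the 4.5% coverage figure; that is an assumption about the method,
and the function name is made up for illustration.

```python
def rescale_estimate(published_estimate, assumed_coverage, alternative_coverage):
    """Re-derive a network-wide estimate under a different coverage fraction.

    Assumes published_estimate = raw_count / assumed_coverage, which is
    a guess at how the published number was computed.
    """
    if not (0 < assumed_coverage <= 1 and 0 < alternative_coverage <= 1):
        raise ValueError("coverage fractions must be in (0, 1]")
    raw_count = published_estimate * assumed_coverage
    return raw_count / alternative_coverage


high = 71_000                                  # estimate at 4.5% coverage
low = rescale_estimate(high, 0.045, 0.21)      # same raw count at 21% coverage
print(round(low), round(high / low, 2))
```

The two coverage figures differ by a ratio of 21/4.5, a little under 5,
so the same raw count gives anything from about 15,000 to 71,000 readers.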

But all this is not important in calculating readers per site, unless you
feel that the readers/site figure for arbitron sites is extremely
atypical.   NNTP doesn't affect this figure, other than to plump it up,
since adding reports for sites that read by NNTP would add thousands of
single-user sites that can't possibly increase the readers/site figure over 1.
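
A quick sketch of the averaging effect, with entirely hypothetical site and
reader counts (none of these numbers come from the arbitron reports):

```python
def pooled_readers_per_site(sites_a, readers_a, sites_b, readers_b):
    """Mean readers/site after pooling two groups of reporting sites."""
    total_sites = sites_a + sites_b
    if total_sites <= 0:
        raise ValueError("need at least one site")
    return (readers_a + readers_b) / total_sites


# Hypothetical: 700 reporting sites with 14,000 readers (20 per site),
# then 2,000 single-user NNTP client machines added as sites of their own.
before = pooled_readers_per_site(700, 14_000, 0, 0)
after = pooled_readers_per_site(700, 14_000, 2_000, 2_000)
print(before, round(after, 2))
```

Counting each single-user machine as a site can only pull the average
down toward 1, so leaving such machines out keeps the figure higher than
it would otherwise be.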
-- 
Brad Templeton, Looking Glass Software Ltd.  --  Waterloo, Ontario 519/884-7473

david@indetech.com (David Kuder) (09/24/89)

In article <19376@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>71,000 for rec.humor.funny?  Not a chance in the world.

If you want a better number, why not ask for it?  Just post a demand in
your playground/newsgroup.  Ask every reader to send you e-mail to
some address that has a counter on it.

>But all this is not important in calculating readers per site, unless
>you feel that the readers/site figure for arbitron sites is extremely
>atypical.  NNTP doesn't affect this figure, other than to plump it
>up, since adding reports for sites that read by NNTP would add
>thousands of single-user sites that can't possibly increase the
>readers/site figure over 1.

Well, the problem is "feel".  I may feel that the readers/site from
Brian's data is bogus while you feel otherwise.  We won't be able to
agree because it isn't great data.  The selection mechanism is poor,
the data collected overlooks at least one mechanism for reading news, the
definition of site is strange (anything that ever showed up in a
"Path:", we used to send out workstation names here while running
arbitron from our server, what's that do to the numbers), the
timeliness of the data is queer (we last sent in an arbitron about 3
months ago, this will probably be the first month since then that we
aren't counted), and there are all the things Brian lists himself.

Not that any of this makes the data less interesting.  It is
interesting.  It just isn't good enough to prove anything.  You, Mr.
Survey, ought to know better than putting trust in anything collected
from the net.
-- 
David A. Kuder					david@indetech.com
415 438-2003	              {sun,sharkey,pacbell}!indetech!david

vnend@phoenix.Princeton.EDU (D. W. James) (10/03/89)

In article <19376@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
)But all this is not important in calculating readers per site, unless you
)feel that the readers/site figure for arbitron sites is extremely
)atypical.   NNTP doesn't affect this figure, other than to plump it up,
)since adding reports for sites that read by NNTP would add thousands of
)single-user sites that can't possibly increase the readers/site figure over 1.

	I disagree.  NNTP *does* affect it, if you view readership/site
as based on the *server*, rather than reader machine (which is much like
claiming that it should be based on per terminal...)  NNTP *significantly*
affects your (questionable) method of evaluating the success of groups
if you view it that way.

	Another factor that is becoming more and more important (in
terms of readers and posters) on the net is BITNET sites, for which
there isn't an ARBITRON analogue.  Here at Princeton PUCC is usually
right behind phoenix in terms of posted volume, and it certainly
has more accounts than Phoenix.  So, I agree that the stats are  
only valid as comparisons between groups, not for use as
absolute numbers (which really throws your "stat" out the
window.)

-- 
Later Y'all,  Vnend                       Ignorance is the mother of adventure.   
SCA event list? Mail?  Send to:vnend@phoenix.princeton.edu or vnend@pucc.bitnet   
        Anonymous posting service (NO FLAMES!) at vnend@ms.uky.edu
       "First, they stood guard over us.  Then, they sat guard over us.
        Then they wandered off to find some corn plasters and we escaped."