[net.news] net readership poll

bobr@zeus.UUCP (Robert Reed) (03/29/86)

> Some groups have very low volume, such that it is possible for no articles
> to be current in the group when the survey is run.  If that were the case,
> the survey would show no readers when in fact many people may read the
> group.
>	David Eppstein, eppstein@cs.columbia.edu, seismo!columbia!cs!eppstein

> What effect, if any, is there from hosts that do not permit access to all
> newsgroups? 
>	wmartin@brl-smoke.ARPA (Will Martin )

These are both valid concerns if the number of individual samples is small,
but as the sample size increases, both of these anomalies will get lost in
the noise.  The major problems in such a survey are:

	1.  The possibility of error in the collection mechanism.  For
	example, a bug in the posted arbitron script could leave every
	reporting site with intrinsic and random errors in its reported
	data.  A merely systematic error (e.g., consistently reporting half
	the number of readers in each site report), if detected, could be
	accounted for and weighted out of the sample.

	One possible source for this kind of error exists in the nature of
	the responders.  If arbitron is run without root privileges,
	readers whose home directories or .newsrc files are protected are
	counted as users but not readers.  Similarly, sites which have a set
	of machines, with accounts for all users but with preferred home
	machines for partitions of this user set, will have a similar skew
	in the user/readership ratio.  But in either of these cases the
	effect is systemic, reducing the percentages without skewing
	towards any particular newsgroup.

	2.  Lack of sample size.  Most of the complaints about the
	readership poll have been concerns about skew from a set of
	samples which do not reflect the interests of the complainant.
	The easiest fix is to increase the sample set size.
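
A minimal sketch of the uniform-undercount argument in point 1, using
invented reader counts (the group names and numbers are assumptions for
illustration, not survey data).  If every group is undercounted by the same
factor, each group's share of the total should come out unchanged:

```python
# Hypothetical true reader counts per newsgroup (invented for illustration).
true_readers = {"net.unix": 400, "net.jokes": 900, "net.news": 200}


def shares(counts):
    """Return each group's fraction of the total number of readers."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def undercount(counts, factor):
    """Apply the same undercount factor to every group."""
    return {group: n * factor for group, n in counts.items()}


true_shares = shares(true_readers)
halved_shares = shares(undercount(true_readers, 0.5))

# Print each group's share before and after the uniform undercount.
for group in true_readers:
    print(group, round(true_shares[group], 3), round(halved_shares[group], 3))
```

The absolute counts are halved, but the relative standing of the groups is
the same, which is why a detected uniform error can be scaled back out.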

roy@phri.UUCP (Roy Smith) (03/30/86)

	In article <89@zeus.UUCP> bobr@zeus.UUCP (Robert Reed) points out 2
ways that arbitron will miscount: non-readable .newsrc files and people
with accounts on several machines at a site, but who usually only read news
at one of them.  He claims that "in either of these cases the effect is
systemic, reducing the percentages without skewing towards any
particular newsgroup."

	Looking around my site (about 100 users, maybe a dozen of whom
read news on a regular basis) I can divide the population into two groups.
The sophisticated ones (by and large) know how to change the mode of their
files, and can deal with the idea of accounts on multiple machines.  The
non-sophisticated users (the majority) have enough trouble figuring out how
nroff works.  The first group also has the people most likely to read the
technical groups.  Thus, I claim that both of the types of errors described
by Robert discriminate against technical groups.

	Now, don't get me wrong.  This is just anecdotal evidence with
absolutely no hard data to back it up.  At any rate, I suspect that the
effect is quite small and not worth worrying about too much.
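
	A minimal sketch of the counter-argument above, again with invented
numbers.  The two kinds of user, the visibility fractions, and the group
names are all assumptions for illustration.  If the users arbitron fails to
see are mostly the sophisticated ones, the groups they favor lose share:

```python
# Hypothetical reader counts, split by kind of user (invented numbers).
readers = {
    "net.unix": {"sophisticated": 300, "casual": 100},
    "net.jokes": {"sophisticated": 200, "casual": 700},
}

# Assumed fraction of each kind of user whose .newsrc the survey can see.
visible = {"sophisticated": 0.6, "casual": 0.95}


def counted(by_kind):
    """Readers of one group that the survey would actually count."""
    return sum(n * visible[kind] for kind, n in by_kind.items())


def share(counts, group):
    """One group's fraction of the total across all groups."""
    return counts[group] / sum(counts.values())


true_counts = {g: sum(by_kind.values()) for g, by_kind in readers.items()}
seen_counts = {g: counted(by_kind) for g, by_kind in readers.items()}

print("true share of net.unix:", round(share(true_counts, "net.unix"), 3))
print("seen share of net.unix:", round(share(seen_counts, "net.unix"), 3))
```

With these made-up inputs the technical group's share drops, so the
undercount is no longer neutral across groups.  How large the effect is in
practice depends entirely on the real fractions, which nobody here has
measured.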

	I fully support Brian Reid's efforts to gather statistics, and have
confidence that all the minor problems that have been pointed out will end
up not making much difference (as long as the sample set is large enough).
Will 2.10.3 support some kind of "sendstats" control message when it
finally gets released?
-- 
Roy Smith, {allegra,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016

dave@lsuc.UUCP (David Sherman) (04/03/86)

>  non-readable .newsrc files

This problem can be avoided if the sysadmin runs arbitron as root.
In fact, Brian, if you get two sets of results from a site and one
indicates more readership than the other, this could explain it.
(More likely non-searchable home directories, since people wouldn't
be likely to specifically protect their .newsrc's.)
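
A minimal sketch of how a survey script could tell the two cases apart
(non-searchable home directory versus protected .newsrc).  This is not how
arbitron works; the function name and the status strings are hypothetical,
and the check reflects only what the calling process is permitted to do:

```python
import os


def newsrc_status(home):
    """Classify whether this process can read the .newsrc under `home`."""
    if not os.path.isdir(home):
        return "no home directory"
    # Search (execute) permission on the directory is needed to reach files in it.
    if not os.access(home, os.X_OK):
        return "home directory not searchable"
    newsrc = os.path.join(home, ".newsrc")
    if not os.path.exists(newsrc):
        return "no .newsrc"
    if not os.access(newsrc, os.R_OK):
        return ".newsrc not readable"
    return "readable"


print(newsrc_status(os.path.expanduser("~")))
```

Run as root, a check like this would normally report "readable" wherever a
.newsrc exists, which is consistent with two runs at one site giving
different readership counts.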

Dave Sherman
The Law Society of Upper Canada
Toronto
-- 
{ ihnp4!utzoo  pesnta  utcs  hcr  decvax!utcsri  } !lsuc!dave

rees@apollo.uucp (Jim Rees) (04/07/86)

    I don't know about the rest of you, but is it really that hard to get
    your local administrator to have arbitron run with root priviledges.

(I assume you mean "privileges".)

Sorry to flame on this, but this is the third time I've seen this opinion,
and this time it's from a system administrator.  It scares me.

If someone has one of his files protected in such a way that I can't read
it, I respect that.  I don't abuse my position of authority by snooping in
files that have been protected against access by the general public.
I don't even consider the reason that the file is protected, or the reason
that I want to look into the file.  I don't go looking through protected
files any more than I would look through one of my co-worker's desk drawers
if I had a master key to all the desks.

There are always exceptions, of course.  If your desk is belching smoke and
flame, I might break in.  If your file is growing by 1 Mbyte a minute, and
the disk is starting to fill up, I might break in.  But under normal
circumstances, I don't go snooping into locked places, whether the lock
is physical or electronic, without asking you first.

Am I the only one who feels this way?  Is it normal practice at other companies
to snoop?