msb@sq.UUCP (11/06/87)
Greg (greg@maypo.berkeley.edu) writes:
> ... I would like to propose a new statistics-gathering feature for
> rn in addition to the Arbitron ratings. In this scheme, rn would
> monitor which articles each reader actually reads, and how long each
> reader spends on those articles.

This is an intriguing idea, but there are technical problems.

The basic problem is that you can't tell, just because the image remains on the screen for a long time, that the reader is really reading that article. Similarly, you can't tell, just because they spend 20 minutes in Rnmail composing a response, that they are really responding to that article. (In particular, they could be !'d out, dealing with something more urgent.)

Some compensation for this could be achieved by finally reporting the median time rather than the mean, or perhaps some other percentile if (as I suspect) the median proves to be 0 for almost all articles.

In addition, readers may not want such monitoring. I don't think *I* would, and I make no bones about the fact that I spend a lot of time reading news. My reaction to this might well be to stop using rn and spend 20 minutes writing a Very Simple News Reading Program.

Besides that, admins might not care for the extra system load... if I was in that position, I'd be inclined to turn it off, just like other accounting logs.

I've fixed the vague Subject line of the article, but left it crossposted to news.misc and news.admin, because I thought it was appropriate. Followuppers beware.

Mark Brader                 On our campus the UNIX system has proved to be not
SoftQuad Inc.               only an effective software tool, but an agent of
Toronto                     technical and social change within the University.
utzoo!sq!msb, msb@sq.com                                         John Lions
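The point about the median versus the mean can be illustrated with a small sketch. The reading times below are invented for illustration, not measured data: most readers skip the article, a few read it, and one leaves it on screen for two hours.

```python
import statistics

# Hypothetical seconds spent on one article by ten readers.
# Six skipped it (0 s), three read it, one walked away for two hours.
times = [0, 0, 0, 0, 0, 0, 45, 60, 90, 7200]

mean = statistics.mean(times)
median = statistics.median(times)
# quantiles(n=10) returns the nine decile cut points; the last one is
# the 90th percentile.
p90 = statistics.quantiles(times, n=10, method="inclusive")[-1]

print(f"mean   = {mean:.1f} s")    # pulled far upward by the idle reader
print(f"median = {median:.1f} s")  # 0, because most readers skipped it
print(f"p90    = {p90:.1f} s")
```

With data shaped like this the median is 0, which is the situation Mark suspects would hold for almost all articles, so a higher percentile may be the more informative summary.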
chuq@plaid.Sun.COM (Chuq Von Rospach) (11/09/87)
>> In this scheme, rn would
>> monitor which articles each reader actually reads, and how long each
>> reader spends on those articles.

>This is an intriguing idea, but there are technical problems.

Erik Fair wrote an article a while back about this for Login:. The concept was called an Accolade (design by Erik, cute name by me). The whole idea was that you could keep track of what other people read, and only read those articles that other folks (whose reading taste you trust) felt were worth reading. Sort of like a net-wide kill file at a very high level of abstraction.

One minor problem. If everyone starts sending out Accolade messages on everything they read, what do you think will happen to network traffic? Even if you limit it to sending one message per rn session and sending it to a single point (a la Arbitron) instead of broadcasting it to the net, the amount of data being slogged around is astounding. If you figure 7,000 people read usenet once a day (very low numbers! VERY low numbers) and the package is 1,000 bytes, the receiving end needs to handle seven megabytes of data a day. And these numbers are ridiculously small -- you could triple them and still not be realistic.

It's a very nice idea. But from a technical point of view, it is a cure much worse than the disease.

chuq
---
Chuq "Fixed in 4.0" Von Rospach    chuq@sun.COM    Delphi: CHUQ
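The load estimate above is simple multiplication, sketched here for reference. The function name is hypothetical, and reading "triple them" as tripling both the reader count and the report size is an assumption, not something the post spells out.

```python
READERS_PER_DAY = 7_000    # Chuq's deliberately low estimate
BYTES_PER_REPORT = 1_000   # one report per rn session


def daily_load_bytes(readers, report_bytes, sessions_per_reader=1):
    """Bytes per day arriving at a single central collection point."""
    return readers * sessions_per_reader * report_bytes


base = daily_load_bytes(READERS_PER_DAY, BYTES_PER_REPORT)
# Assumption: "triple them" applies to both figures at once.
tripled = daily_load_bytes(3 * READERS_PER_DAY, 3 * BYTES_PER_REPORT)

print(f"base:    {base / 1_000_000:.0f} MB/day")
print(f"tripled: {tripled / 1_000_000:.0f} MB/day")
```

Tripling both inputs multiplies the total by nine, which is why the estimate grows so quickly once the low figures are relaxed.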
greg@jiff.berkeley.edu (Greg) (11/10/87)
Greg (greg@maypo.berkeley.edu) writes:
> rn would
> monitor which articles each reader actually reads, and how long each
> reader spends on those articles.

Mark Brader writes:
>The basic problem is that you can't tell, just because the image
>remains on the screen for a long time, that the reader is really
>reading that article.

A good point. I propose some reasonable cutoff time, like five minutes, i.e. if a reader spends two hours on one article, five minutes is averaged in instead. There will still be some error from readers walking away from their terminals, performing shell escapes, and so on. I figure that the error will be roughly randomly distributed among articles in proportion to their popularity anyway; to some extent this variation will merely add a constant factor to the "true" statistics. It may also be more appropriate to report the median reading time rather than the mean.

>In addition, readers may not want such monitoring [because of invasion
>of privacy].

I find it difficult to fret over the fate of a list of numbers about me, without my name attached, that are promptly averaged in with thousands of other such numbers. I think most readers are the same way. The few who object are free to turn off the feature. I'd prefer the voting power to privacy. In any case, as with the Arbitron ratings, the Nielsens need not poll EVERY user on the net.

>Besides that, admins might not care for the extra system load...

Chuq Von Rospach also brought up the load issue, in the context of network load rather than system load:

>If you figure 7,000
>people read usenet once a day (very low numbers! VERY low numbers) and the
>package is 1,000 bytes, the receiving end needs to handle seven megabytes of
>data a day. And these numbers are [ridiculous underestimates]...
>It's a very nice idea. But from a technical point of view, it is a cure much
>worse than the disease.

I don't see why such an expensive scheme is necessary.
In my scheme the network load for trading statistics is necessarily much smaller than the load of the news groups themselves. A news feed in a local network doles out articles to rn programs running on various hosts; the rn programs would reply with the stats for their news sessions. The news feed would then compress the statistics on its own by taking local averages and sums. Every week or so the Nielsen program would collect data from all of the news feed hosts. The data on all of the articles would be in the same report; there would be about 20-40 bytes of data per article. Estimating that the articles themselves are 1K long on average, the Nielsens would be only a small fraction of both the global network load and the local system load.

--
Greg
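A minimal sketch of the local compression step described in this post, under several assumptions: the function name, the `(article_id, seconds)` report shape, and the sample message IDs are all hypothetical, and the five-minute cutoff is the one suggested earlier in the same post. It is not how any actual rn or news feed software works.

```python
from collections import defaultdict

CUTOFF_SECONDS = 5 * 60  # the suggested five-minute cap per article


def summarize_feed(session_reports):
    """Collapse (article_id, seconds) pairs from local rn sessions.

    Returns {article_id: (readers, capped_seconds_total)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for article_id, seconds in session_reports:
        entry = totals[article_id]
        entry[0] += 1                             # one more reader saw it
        entry[1] += min(seconds, CUTOFF_SECONDS)  # two hours counts as five minutes
    return {article: tuple(entry) for article, entry in totals.items()}


# Hypothetical reports from three local rn sessions.
reports = [("<1@siteA>", 30), ("<1@siteA>", 7200), ("<2@siteB>", 10)]
summary = summarize_feed(reports)
print(summary)

# Relative size of the weekly report, using the figures from the post
# and taking "1K" to mean 1024 bytes.
ARTICLE_BYTES = 1024
for stat_bytes in (20, 40):
    share = stat_bytes / ARTICLE_BYTES
    print(f"{stat_bytes} bytes/article is {share:.1%} of article traffic")
```

One design consequence worth noting: counts and sums from each feed are enough to rebuild a net-wide mean, but not a median. A feed that wanted to support medians would have to forward something richer, for example a small histogram of capped times.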
eric@snark.UUCP (Eric S. Raymond) (11/10/87)
In article <1987Nov6.124824.20374@sq.uucp>, msb@sq.UUCP writes:
> This is an intriguing idea, but there are technical problems.
> The basic problem is that you can't tell, just because the image
> remains on the screen for a long time, that the reader is really
> reading that article.

I finessed the problems you point out by not recording reading time, only the fact that you've seen the text of an article and whether you praised or condemned it.

> Besides that, admins might not care for the extra system load...
> if I was in that position, I'd be inclined to turn it off, just like
> other accounting logs.

Not necessary. It's really cheap. My news readers keep a trail of where-you've-been locations (the '-' command works to *arbitrary* depth). At the end of the session I run through the trail, appending a small record to a logfile with write(2) (for atomicity, and to avoid the buffering overhead). It all happens so fast you don't even notice the delay after typing 'q'.

As for space... I suppose the file could get large if you had lots of readers reading lots of stuff -- but that's why you want to run an abstract generator on it every night, truncating the logfile when you do it. Once a month you optionally ship your results off to a central collection point.

It's all very painless, really.

--
Eric S. Raymond
UUCP: {{seismo,ihnp4,rutgers}!cbmvax,sdcrdcf!burdvax,vu-vlsi}!snark!eric
Post: 22 South Warren Avenue, Malvern, PA 19355    Phone: (215)-296-5718
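The end-of-session logging described above might look roughly like the following sketch. It is written in Python rather than the C a news reader of the time would use, and the record format, the verdict strings, and the `log_session` name are all assumptions made for illustration. The only parts taken from the post are the trail of visited articles and the use of one unbuffered write(2) per record on a file opened for appending.

```python
import os
import tempfile


def log_session(log_path, trail):
    """Append one small record per visited article.

    Each record goes out in a single os.write() call (a thin wrapper over
    write(2)) on a descriptor opened with O_APPEND, so there is no stdio
    buffer and the kernel positions every write at the end of the file.
    """
    fd = os.open(log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        for article_id, verdict in trail:
            # Hypothetical record format: "<article-id> <verdict>\n"
            os.write(fd, f"{article_id} {verdict}\n".encode("ascii"))
    finally:
        os.close(fd)


# Hypothetical trail from one session: seen, praised, or condemned.
trail = [
    ("<101@siteA>", "seen"),
    ("<102@siteB>", "praise"),
    ("<103@siteC>", "condemn"),
]

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "readlog")
    log_session(log_path, trail)
    with open(log_path, encoding="ascii") as log:
        lines = log.read().splitlines()

print(lines)
```

On a local filesystem, small appends like these from several readers should not interleave within a record; over a network filesystem that guarantee may not hold, so a shared logfile there would need more care.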