draper (01/19/83)
I wish to draw attention to a very sensible point made recently by Simon Gibbs of Toronto, and also (I now realise on re-reading this article) to reply in effect to points made by Steven McGeady of Tektronix on the proliferation of newsgroups. These points all start from the observation that usenet is expanding rapidly, and that means more newsgroups and more news. This follows simply from the observation that the number of sites and users is growing fast: if the rate of submissions per user remains even approximately constant, there will be more articles in total.

The first consequence is for the number of newsgroups. The mechanism netnews offers to help readers select a manageable subset for reading is the newsgroup, so it is vital, if the average user's appetite for articles remains constant, for the number of (active) newsgroups to grow. This was Simon's point, and it seems quite right to me. Therefore it is wrong to argue, as some have, that there should be growing restraint on creating new newsgroups, such as requiring a larger number of positive responses to suggestions to create a new one. Not creating a new group will not reduce the number of articles, only our ability to filter them out easily from our personal reading.

The real problem is whether sites can afford to process and store a swelling volume of news, much of which might not be read by their own users if the amount each person reads remains roughly constant while the total volume grows -- I take this up below. Meanwhile, we can reasonably predict a growing number of newsgroups as a proper consequence of the swelling amount of news.

(The growth in newsgroups seems entirely OK to me.
It is no problem to expire defunct newsgroups periodically and on a local basis with simple scripts that check whether a group has been idle for longer than some threshold, delete the newsgroup directory, dotfile, and line in the active file, and then run an overnight script that updates all .newsrc files by deleting all lines referring to groups not in the active file.)

Various other consequences seem likely to follow, besides the proliferation of newsgroups.

1) The growing use of secretary programs to help users pre-sort articles. Readnews -t does a crude version of this. For instance, "readnews -p -t burger > /dev/null" will junk all messages that admit in their title that they are about the hamburger war. Tom Neuendorffer's recent program presumably does it better.

2) More and more sites will have to think hard about limiting their news intake because of the growing cost. I am experimenting with scripts that measure disk storage costs, count how many groups never get read, and split disk costs between users. Eventually this might be used to change sys files automatically so as not to receive groups that are never read. In effect an unsubscribe command for a whole site would be required -- new groups would be received, but after a grace period of 2 weeks, if no-one was reading a group, the sys file would be changed and the group deleted. (See below.)

3) CPU and transmission costs are also already considerable and growing with the net. Here at UCSD we have gone over locally to a batching system, developed from the one published on the net, that effects very considerable savings in uucp/berknet overhead: if we were paying phone charges as well we would be even more keen on it. In addition, on my machine we take care to have only one copy of inews at a time processing incoming news, and to run it niced so as (we hope) to prevent it from loading the machine noticeably.
As the size of daily news grows, I suspect new sites may not even consider bringing up netnews without batching software, so it may be time to develop some standard version and to distribute it with the other netnews software. Batching may be considered a fairly cheap and painless way of overcoming a lot of the horror of uucp overhead. Saving on CPU and transmission overheads is a further motive for reducing the number of groups received, if the sender's sys file can be altered dynamically (as opposed to simply altering one's own, which would only achieve savings in storage, not in transmission and processing). Presumably the best way to do this would be to have a new control message, recognised by inews, that changes a sender's sys file at the request of the receiver.

4) As the cost of netnews rises, I expect to see the informal hierarchical structure of the net grow more and more pronounced. The software is set up to support a net without any structure -- multiple paths are fine, and any connections and patterns of connections will work. I am sure that that is a very good design choice. What I am getting at is that in practice there is a kind of hierarchy with major and minor sites. Looking at the published maps shows that the net actually consists of a "big 6" connected in a ring (UCB, decvax, harpo, ...), then main branches off that, and finally local networks. This could become more pronounced if people go in for reduced subscription lists, since that will only work for sites that get news from a major site that is itself guaranteed to get all newsgroups. Only one machine in a local cluster need get all groups; the rest can economise. There are already the beginnings of this division of labor in that only a few sites keep archives of old articles -- a service appreciated by other sites that can draw on them occasionally.
Clearly some acknowledgement of the service rendered will more and more have to be made, since while an overall saving is achieved by not duplicating costs unnecessarily, the central servers are having to pay more than their share of CPU and storage resources.

Note that these considerations favor a star network. This is already favored by recipients because it reduces the delay before they get new news, and means they don't have to forward it, but it is disliked by senders since it means they have increased transmission costs. A suitable quid pro quo might be to have recipients pay the phone bill if senders are carrying higher machine costs. Similarly, sending news batches only at night reduces CPU "cost" as well as phone charges.

5) A similar division of the labor involved with netnews is appearing at the personal level between users at a given site. There is too much news for a normal person to keep up with, so we each tend to select a subset. However, if we're lucky, colleagues will forward articles from the groups that they see and we don't, if they think they are exceptionally interesting. I am about to investigate making it easier for people to produce these personal digests of netnews to be posted to a local group (a kind of "the best of netnews" idea) by adding a "digest" command to readnews, and/or having a "postdigest" command. Less formally, readnews needs a "forward" command so a user can send an article to a colleague they think will appreciate it.

6) It is possible that digests like this may catch on as distributed newsgroups on a net-wide basis. However this is less likely, as making a good digest depends on having a sense for what your colleagues would like to see -- this is the net's weakest point, leading to flames, endless discussions about what should be published etc. etc. However if people are interested in pursuing the idea of moderated (i.e.
edited) newsgroups, then I think this is how they should be done -- as separate groups run in competition with the standard ones. If they are doing their job, then many readers will ONLY read the digests and not the original groups, and this will become apparent to measuring software at individual sites.

CONCLUSION: Netnews is based on division into newsgroups, so newsgroups should continue to be created easily, even when readership is low. This helps the rest of us to avoid articles we don't want to see. It could also become a basis for systems to reduce their costs by not subscribing to groups that their users don't read.

Mailing lists for low circulation discussions are one proposed alternative to spawning new groups, and indeed they would save on costs. However they would abandon the unique characteristic of netnews -- that readers can see articles without knowing in advance that they want to see them and without going to any special trouble to obtain them. As newspaper articles show, there is a whole class of reading matter that people do not know in advance they want to read but do read and enjoy when it is put under their noses (I mean "eyes" I guess). Only a few will care enough to write in and say they want a particular column, but many more read it. However when no-one even opens the paper (newsgroup) any more, it is time to stop subscribing to it.

Steve Draper
UCSD, San Diego
ucbvax!sdcsvax!sdcsla!draper
draper@nprdc
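
The two site-level pruning rules described in the article above (expiring groups that have gone idle, and dropping groups that nobody reads once a two-week grace period has passed) can be illustrated with a minimal Python sketch. It is not the scripts the article mentions. The `GroupStats` record and both function names are hypothetical, and the sketch assumes some other tool has already collected, for each group, when the site first received it, when its newest article arrived, and how many local users read it.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class GroupStats:
    # Hypothetical per-group record; all times are seconds since the epoch.
    name: str
    first_seen: float    # when this site first received the group
    last_article: float  # arrival time of the newest article in the spool
    readers: int         # local users whose .newsrc shows any reading


def idle_groups(stats, now, idle_days):
    """Return names of groups with no new article for more than idle_days."""
    limit = idle_days * SECONDS_PER_DAY
    return sorted(g.name for g in stats if now - g.last_article > limit)


def unread_groups(stats, now, grace_days=14):
    """Return names of groups past the grace period that no local user reads."""
    grace = grace_days * SECONDS_PER_DAY
    return sorted(
        g.name
        for g in stats
        if g.readers == 0 and now - g.first_seen > grace
    )
```

The first list is a candidate set for local expiry; the second is a candidate set for removal from the feed. Actually editing the active file or a sys file is deliberately left out, since those formats vary between netnews versions.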
djj (01/20/83)
Steve Draper's recent article was very well-written. I agree with much of what he said. However, I must take exception to one quote (from item 1, I think): "and an overnight batch program can update all .newsrc files to remove all references to deleted newsgroups" [that's from memory; the original already scrolled out of screen memory].

For my money, this smacks of 1984 and the re-writing of history. I prefer to keep all groups that I read and used to read in my .newsrc; it indicates the changes in my reading patterns, reminds me of past battles fought on the net (e.g. net.women), and also reminds me about trying to start inconsequential newsgroups (net.suicide, etc.).

Perhaps a better way might be to "mark" such affected newsgroups within the users' .newsrc files so that they will have no effect when reading news. This would notify the users of the deletion of said newsgroups and preserve the desirable features I mentioned above.

Dave Johnson
BTL - Piscataway
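
The difference between the two proposals in this thread, deleting .newsrc lines for defunct groups versus marking them, can be shown side by side. The following is a minimal Python sketch under stated assumptions, not either author's program: it assumes the conventional .newsrc layout in which a subscribed group appears as `group: ranges` and an unsubscribed one as `group! ranges`, it uses `!` as the "mark", and the function names are hypothetical. Lines are expected without trailing newlines, and lines that do not look like group entries (such as an options line) are passed through unchanged.

```python
import re

# Assumed .newsrc entry shape: group name, then ':' or '!', then article ranges.
_ENTRY = re.compile(r"^([^:!\s]+)([:!])(.*)$")


def prune_newsrc(lines, active):
    """Deletion approach: drop entries for groups not in the active set."""
    kept = []
    for line in lines:
        m = _ENTRY.match(line)
        if m is None or m.group(1) in active:
            kept.append(line)
    return kept


def mark_newsrc(lines, active):
    """Marking approach: keep every entry, but flag defunct groups with '!'."""
    out = []
    for line in lines:
        m = _ENTRY.match(line)
        if m and m.group(1) not in active:
            out.append(m.group(1) + "!" + m.group(3))
        else:
            out.append(line)
    return out
```

With the marking approach the article ranges survive, so the file still records which groups were once read and how far the reader got in them.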