[net.unix] The traffic volume problems on Usenet and ARPANET

lauren@rand-unix.ARPA (02/25/85)

The problems with traffic volume, both in these ARPANET lists and in
the linked Usenet groups, are becoming significant on both sides.  Simply
blaming the Usenet side doesn't do much good--I've seen just as many
useless postings from the ARPA side of the fence.  And while the cost
(other than time and disk space) of each posting on the ARPA side
is essentially nil to the participants, the costs are very high,
ultimately, on the Usenet side, where in most cases EVERY message
gets sent to EVERY machine, usually by dialup phone line at 1200 bps,
and frequently via long distance, not local, calls.

I for one am trying to discourage the creation of more specialized
Usenet groups for a while, in the hope of getting people to instead
spend the time to establish coordinated mailing lists that will only
involve the people who are actually interested in particular topics.
As ARPANET people know, mailing lists, if properly managed, can provide
much faster distribution than the current point-to-point netnews
system on Usenet.  Of course, mailing lists are more trouble to
maintain than just blasting a message out all over the world (and
ignoring the costs) but when people are trying to get real work
done (for example, the various collections of people working on 
the various phases of the UUCP Project) mailing lists can be far
more efficient in both time and money than blanket netnews
discussions and distributions.  Mailing lists, while preferable
to netnews distribution in many cases, can still have problems
of volume--as we're seeing on ARPANET now.  But I think that mailing
lists are still superior to netnews distributions in many cases if
properly coordinated and planned.

But in any case, am I alone in getting the feeling that we
(both on ARPANET and even more on Usenet) have crossed over some
sort of volume "threshold"?  It's getting almost impossible to
deal with the volume of submissions being entered, on whatever
topics, from an ever growing crowd of users, most of whom have
no idea what has ever been discussed in these lists before.

As this problem continues to grow, more and more people will
be forced to drop off the lists (as they are doing now with
Usenet newsgroups) since they simply won't have TIME to deal
with all this material, much of which is not very useful and just
represents (in many cases) useless quips or repetitive questions/answers.

In my opinion, the models under which both the major ARPANET
lists and the Usenet groups were founded are not scaling
up well to the growing user population, as almost anyone
on these lists/groups must realize by now.  For Usenet,
some progress can be made by discouraging many new newsgroups 
and promoting coordinated mailing lists as a step forward.
On the ARPANET side, where many lists already exist, the next 
step isn't so clear.

--Lauren--

Conde.osbunorth@XEROX.ARPA (02/26/85)

Lauren,

Here's an approach taken by some Xerox mailing lists which may be
adapted to your situation.

Some lists in digest form mail out only the table of contents. If the
user is interested, he retrieves the entire contents to his machine.
The file copy command is typically embedded in the table of contents to
make this easier. In the particular mail system that I am
using, there is no equivalent of the Unix netnews command to share news
messages.

I do not know if this is feasible, but here's how this may be adapted to
USENET sites. If you are willing to deal with 1-2 day delays in reading
messages:

- Each digest mails out its table of contents. A non-digest message
sends out the subject line only.
- A user uses some program to peruse the table of contents (TOC). If
the message is available locally (for some value of "local"), the user
has the option of reading it. Otherwise, it is simply marked for later
retrieval.
- During that evening, a program will try to retrieve all messages
which are marked "interested" but are not already available locally.
The messages will be retrieved from a set of hosts which may have them.
- The following day, the user may read the messages.
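The overnight pass in the steps above could be sketched roughly as
follows. Everything concrete here is an assumption for illustration:
the spool directory, the list of candidate hosts, and the
fetch_from_host() placeholder (which stands in for whatever transfer
mechanism a site actually has, e.g. uucp) are all hypothetical, not
part of any existing netnews software.

```python
import os

def have_locally(spool_dir, msg_id):
    """True if the article is already in the local spool."""
    return os.path.exists(os.path.join(spool_dir, msg_id))

def fetch_from_host(host, msg_id):
    """Placeholder for a real transfer (uucp, etc.); returns the
    article text, or None if this host does not have a copy."""
    return None  # real transfer code would go here

def nightly_fetch(spool_dir, wanted, hosts):
    """Try to retrieve every marked message not already in the spool.
    Returns the list of IDs that no host could supply (retry later)."""
    os.makedirs(spool_dir, exist_ok=True)
    missing = []
    for msg_id in wanted:
        if have_locally(spool_dir, msg_id):
            continue  # already fetched on an earlier night
        for host in hosts:
            text = fetch_from_host(host, msg_id)
            if text is not None:
                with open(os.path.join(spool_dir, msg_id), "w") as out:
                    out.write(text)
                break
        else:
            missing.append(msg_id)  # no host had it
    return missing
```

Articles already present locally are skipped, so re-running the pass is
harmless; anything no host could supply is carried over to the next
night.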

As an implementation issue, some kind of universal message id scheme and
a database could be used to index into message contents/subject lines.
This way, the user could ignore all messages which say: "What's the
termcap entry for a Trash-80?".
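A minimal sketch of that index, assuming nothing more than a mapping
from a (hypothetical) universal message ID to its subject line; the
kill-pattern filter and the sample IDs are mine, invented for
illustration:

```python
import re

# Hypothetical index: universal message ID -> subject line.
subject_index = {
    "<123@rand-unix>": "What's the termcap entry for a Trash-80?",
    "<124@xerox>":     "UUCP Project: phase 2 schedule",
}

def interesting(kill_patterns, msg_ids):
    """Return only the IDs whose subject lines match no kill pattern,
    so repetitive questions never reach the user's TOC."""
    kills = [re.compile(p, re.IGNORECASE) for p in kill_patterns]
    return [m for m in msg_ids
            if not any(k.search(subject_index.get(m, "")) for k in kills)]
```

With a kill pattern like "termcap", the Trash-80 question is dropped
from the list the user sees, while other subjects pass through.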

It may be possible on ARPA sites, but I do not know if this will even be
worth considering for usenet sites that redistribute messages to other
sites.  The hard part is knowing who has the replicated copies, and
when one can do cleanup operations (i.e. zapping files) without
causing hardship to others. Some kind of expiration date scheme may work
too...

Daniel Conde
conde.pa@Xerox.ARPA