[net.news.adm] usenet volume problems...

ramin@rtgvax.UUCP (06/06/86)

[don't give up...]

Noticing a recent resurgence of discussion on over-loaded usenet traffic,
I thought I might suggest a solution that I've been playing with for a
while. It might help solve this issue and other network traffic problems...

Now as I understand it, every bit of news (up to several Mbytes daily)
gets shipped around to various sites around the world. Of the articles I
have been looking at carefully recently, a very large number are either
cross-postings or follow-ups to older articles with large quantities
of material quoted.

The readnews software that I use here shows only one copy of a cross-posted
message, but I'm not sure whether the news software is smart enough to avoid
actually *sending* multiple copies... For the sake of argument I'll
assume it sends only one (and if it doesn't, there's one place things can be
tightened up). The real issue is that the sheer *volume* of quotes in
follow-ups constitutes a pretty major factor in the traffic problem.

Given such a premise it appears that one solution might be to avoid sending
multiple copies of the quoted text around. The solution that immediately
comes to mind is to have the posting software replace all quoted
regions with a context-diff-style notation, in which each marker carries
not only the beginning and ending lines (or bytes) but also the
article-id, i.e. (<####@xyz.UUCP>12,14) instead of the full text lines.
The larger the quote, the more efficient the scheme.
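
To make the posting side a bit more concrete, here is a rough sketch of
what the replacement step might look like. It is only an illustration
under my own assumptions: quoted lines are taken to start with "> ",
the parent article is taken to be available as a list of lines, and the
names compress_quotes and unquote (and the article-id in the example)
are made up for this sketch and are not part of any existing news software.

```python
QUOTE_PREFIX = "> "  # assumed quoting convention; the proposal does not fix one


def unquote(line):
    """Return the text after the quote prefix, or None if the line is not quoted."""
    if line.startswith(QUOTE_PREFIX):
        return line[len(QUOTE_PREFIX):]
    return None


def compress_quotes(followup, parent_id, parent):
    """Replace runs of lines quoted verbatim from the parent article with
    (<article-id>first,last) markers, using 1-based line numbers."""
    out, i = [], 0
    while i < len(followup):
        text = unquote(followup[i])
        if text is None or text not in parent:
            out.append(followup[i])  # new text, or a quote we cannot locate
            i += 1
            continue
        start = end = parent.index(text)
        i += 1
        # Extend the run while consecutive quoted lines match consecutive parent lines.
        while (i < len(followup) and end + 1 < len(parent)
               and unquote(followup[i]) == parent[end + 1]):
            end += 1
            i += 1
        out.append("(<%s>%d,%d)" % (parent_id, start + 1, end + 1))
    return out


# Hypothetical example data.
parent = ["line one", "line two", "line three", "line four"]
followup = ["> line two", "> line three", "I disagree.", "> line four"]
compressed = compress_quotes(followup, "123@xyz.UUCP", parent)
# compressed == ["(<123@xyz.UUCP>2,3)", "I disagree.", "(<123@xyz.UUCP>4,4)"]
```

Each marker costs a roughly fixed number of bytes no matter how many lines
it stands for, which is why long quotes gain the most.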

The *de-quoting* step (swapping the quoted text for references) would be
taken after the user has edited the file (in "postnews" after typing
*send*).  On the reading side, the news-reader software would access the
file as usual, but whenever a certain escape character is reached in the
stream it would tell the news software to look the referenced article up
through the history file and fetch lines a to b (or bytes a to b) from it.
Also, readnews could probably cache the file pointers to save the overhead of
reopening the same file within a single message, or even within a single
reading session. It would then add the suitable indentation markers and
proceed with the rest of the message. Naturally, to allow for punsters and
pundits alike (:-), the included text would itself be searched for such
escape sequences to allow for nested follow-ups (to a pre-defined depth,
with a strategy of tossing out the top levels of a nesting, i.e. the oldest
quote gets shoved out and replaced by the most recent quotes).
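
The reading side could be sketched roughly as follows. Again this is only
an illustration under assumptions of mine: a marker is taken to be a line
consisting solely of (<article-id>a,b), a plain dictionary stands in for
the history-file lookup (and for the file-pointer cache), and the names
expand, TOKEN and max_depth are hypothetical.

```python
import re

# A line holding nothing but a marker such as (<123@xyz.UUCP>12,14).
TOKEN = re.compile(r"^\(<([^>]+)>(\d+),(\d+)\)$")


def expand(lines, store, max_depth=3, depth=0):
    """Expand quote markers recursively, indenting each level with '> '.

    store maps article-id -> list of lines; it is a stand-in for locating
    the article through the history file. Markers nested deeper than
    max_depth (i.e. the oldest quotes) are dropped.
    """
    out = []
    for line in lines:
        m = TOKEN.match(line)
        if m is None:
            out.append(line)  # ordinary text passes through untouched
            continue
        if depth >= max_depth:
            continue  # too deeply nested: the oldest quote gets tossed
        msgid, a, b = m.group(1), int(m.group(2)), int(m.group(3))
        article = store.get(msgid)
        if article is None:
            out.append("[article <%s> unavailable, lines %d-%d]" % (msgid, a, b))
            continue
        quoted = expand(article[a - 1:b], store, max_depth, depth + 1)
        out.extend("> " + q for q in quoted)
    return out


# Hypothetical example data: article 2@b quotes article 1@a.
store = {
    "1@a": ["alpha", "beta"],
    "2@b": ["(<1@a>1,1)", "reply to alpha"],
}
message = ["(<2@b>1,2)", "my comment"]
full = expand(message, store)
# full == ["> > alpha", "> reply to alpha", "my comment"]
shallow = expand(message, store, max_depth=1)
# shallow == ["> reply to alpha", "my comment"]
```

With max_depth=1 only the most recent quote survives, which is the
"oldest quote gets shoved out" behaviour described above.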

The only problem here is obviously that the quotes cannot extend beyond
the date the quoted article is expired. The solution here can be to expire
selected newsgroups over longer periods (e.g. technical ones) and the ones
currently proposed as talk groups over shorter periods... Since the
useful life of an article would generally depend on the expiration
cycle of its newsgroup (most places I've talked to keep it at around 2 weeks),
the necessity for temporary groups such as net.politics.terror would be
eliminated, since discussions should generally run down shortly after
most of the quoted articles expire.
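
As a sketch of what per-group expiration might look like, assuming a simple
table keyed by newsgroup name or hierarchy prefix: the 14-day default
reflects the "around 2 weeks" figure above, but the other numbers, the
table itself and the function names are purely hypothetical.

```python
DEFAULT_DAYS = 14  # the "around 2 weeks" most sites seem to use

# Hypothetical per-group overrides; longest matching prefix wins.
EXPIRY_DAYS = {
    "net.unix-wizards": 60,  # a technical group, kept longer (made-up value)
    "talk": 7,               # talk groups, expired sooner (made-up value)
}


def expiry_days(group):
    """Return the expiration period for a group, trying the full name
    first and then each shorter dotted prefix."""
    parts = group.split(".")
    while parts:
        key = ".".join(parts)
        if key in EXPIRY_DAYS:
            return EXPIRY_DAYS[key]
        parts.pop()
    return DEFAULT_DAYS


def can_quote(group, age_days):
    """True if an article of this age should still be on disk to quote from."""
    return age_days <= expiry_days(group)
```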

For important matters, a mechanism could be implemented to allow users
(or the system manager on their behalf) to request a copy of
an expired article from a central archive to access a quote beyond the 
expiration date of that article...

The only other issue I can think of is, obviously, the processing overhead.
If there is enough interest in this solution I will try to gather
some figures based on average news traffic loads and amounts of followups
across various groups (plus file access and read times, etc...) to come up
with a rough estimate for the trade-off...

I think this solution is particularly fitting for net news, where the
variations in the format of the contents of a given transmission are
finite and lexically determinable. One can generalize this *active*
transmission approach to other networks (e.g. compress selectively based
on the contents of the message). But I think one of the
problems of the news network is its passive approach (i.e. send down
everything, everywhere).  Another way to look at this is that the
"References:" field currently applies only at file-level granularity.
This would take that to line-level (or even byte-level)...

I apologize for the length of this note, but I thought an extended
elaboration would help present the solution better... I've addressed all
followups to *net.news* to avoid clogging other groups. I've also
included net.unix-wizards since other people there would be more
familiar with the guts of the news system and the applicability of
this suggestion...


Looking forward to hearing people's thoughts on this...

ramin

-- 
=--------------------------------------=-------------------------------------=
: alias: ramin firoozye'               :   USps: Systems Control Inc.        :
: uucp: ...!shasta \                   :         1801 Page Mill Road         :
:       ...!lll-lcc \                  :         Palo Alto, CA  94303        :
:       ...!ihnp4    \...!ramin@rtgvax :   ^G:   (415) 494-1165 x-1777       :
=--------------------------------------=-------------------------------------=