ramin@rtgvax.UUCP (06/06/86)
[don't give up...]

Noticing a recent resurgence of discussion on overloaded Usenet traffic, I thought I might suggest a solution that I've been playing with for a while. It might help solve this issue and other network traffic problems...

Now as I understand it, every bit of mail (up to several Mbytes daily) gets shipped around to various sites around the world. Of the articles I have been looking at carefully recently, a very large number are either cross-postings or follow-ups to older articles with large quantities of material quoted. The readnews software that I use here shows only one copy of a cross-posted message, and I'm not sure whether the news software is intelligent enough to avoid actually *sending* multiple copies... For the sake of argument I'll assume it doesn't send them (and if it does, there's one place things can be tightened up), but the issue is that the actual *volume* of quotes in follow-ups constitutes a pretty major factor in the traffic problem.

Given such a premise, it appears that one solution might be to avoid sending multiple copies of the quoted text in these follow-ups around. The solution immediately coming to mind is to have the posting software replace all quoted regions with a context-diff type of notation, except that each difference marks not only the beginning and ending lines (or bytes) but also the article-id, e.g. (<####@xyz.UUCP>12,14) instead of the full text lines. The larger the quote, the more efficient the scheme. The *de-quoting* step would be taken after the user has edited the file (in "postnews" after typing *send*).

Now the news-reader software would access the file as usual, but whenever a certain escape character is reached in the stream, it would be an indicator to the news software that it should go to the history file, locate article so-and-so, and fetch lines a to b (or bytes a to b) from it.
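As a rough illustration of the posting-side *de-quoting* step, here is a minimal Python sketch. It is not part of any news software; the names `compress_quotes` and `quoted_index`, the "> " quote prefix, and the requirement that quoted lines match the parent article exactly are all assumptions made for the example. Only the (<id>first,last) notation comes from the proposal above.

```python
# Hypothetical quote marker; the proposal does not fix one.
QUOTE_PREFIX = "> "


def quoted_index(line, parent_lines):
    """Return the 0-based index of the parent line that `line` quotes, or None."""
    if not line.startswith(QUOTE_PREFIX):
        return None
    text = line[len(QUOTE_PREFIX):]
    try:
        return parent_lines.index(text)
    except ValueError:
        return None  # quoted text was edited or is not from this parent


def compress_quotes(followup_lines, parent_id, parent_lines):
    """Replace runs of verbatim quoted lines with (<id>first,last) references.

    Line numbers in the reference are 1-based and inclusive.
    """
    out = []
    i = 0
    while i < len(followup_lines):
        start = quoted_index(followup_lines[i], parent_lines)
        if start is None:
            out.append(followup_lines[i])  # new text, or a quote we cannot match
            i += 1
            continue
        # Extend the run while the follow-up keeps quoting consecutive parent lines.
        end = start
        j = i + 1
        while (j < len(followup_lines)
               and end + 1 < len(parent_lines)
               and followup_lines[j] == QUOTE_PREFIX + parent_lines[end + 1]):
            end += 1
            j += 1
        out.append(f"(<{parent_id}>{start + 1},{end + 1})")
        i = j
    return out
```

With a parent article `123@xyz.UUCP` whose lines are `alpha`, `beta`, `gamma`, `delta`, a follow-up that quotes `beta` and `gamma`, adds a comment, and then quotes `delta` would be sent as `(<123@xyz.UUCP>2,3)`, the comment, and `(<123@xyz.UUCP>4,4)`.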
Also, readnews could probably cache the file pointers to save on the overhead of reopening the file within a single message, or even within a single reading session. It would then place the suitable indentation markers and proceed with the rest of the message. Naturally, to allow for punsters and pundits alike (:-), the included text would itself be searched for such escape sequences to allow for nested follow-ups (to a pre-defined depth, with a strategy of tossing out the top levels of a nesting, i.e. the oldest quote gets shoved out and replaced by the most recent quotes).

The only problem here is obviously that the quotes cannot extend beyond the date news is expired. The solution here can be to expire selected newsgroups over longer periods (e.g. technical ones) and the ones currently proposed in the talk groups over shorter periods... Since the life of an article and its relevance would generally depend on the expiration cycle of the newsgroup (most places I've talked to keep it at around 2 weeks), the necessity for temporary groups such as net.politics.terror would be eliminated, since discussions should generally run down shortly after most of the quoted articles expire. For important matters, a mechanism could be implemented to allow users (or the system manager on their behalf) to request a copy of an expired article from a central archive, to access a quote beyond the expiration date of that article...

The only other issue I can think of is, obviously, the processing overhead. If there is enough interest in this solution I will try to come up with some figures based on average news traffic loads and amounts of followups across various groups (plus file access and read times, etc...) to come up with a rough estimate for the trade-off... I think this solution is particularly fitting for net news, where the variations in the format of the contents of a given transmission are finite and lexically determinable.
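The reader-side expansion described above (cached lookups, indentation markers, nested quotes up to a fixed depth, and quotes whose article has expired) might look roughly like the following Python sketch. Everything here is an assumption for illustration: the function `expand`, the `fetch` callback standing in for a history-file lookup, the "> " indentation marker, the default depth of 3, and the placeholder text for expired articles. A whole-line (<id>first,last) reference is used in place of an escape character, and dropping references beyond the depth limit is one reading of "the oldest quote gets shoved out".

```python
import re

# A line consisting solely of a reference such as (<123@xyz.UUCP>12,14).
REF = re.compile(r"^\(<([^>]+)>(\d+),(\d+)\)$")
INDENT = "> "  # hypothetical indentation marker


def expand(lines, fetch, max_depth=3, depth=0, cache=None):
    """Expand reference lines into indented quoted text.

    `fetch(article_id)` is a stand-in for the history-file lookup; it must
    return the article body as a list of lines, or None if it has expired.
    """
    if cache is None:
        cache = {}  # article-id -> body, so each article is fetched once
    out = []
    for line in lines:
        m = REF.match(line)
        if m is None:
            out.append(line)
            continue
        art_id, first, last = m.group(1), int(m.group(2)), int(m.group(3))
        if depth >= max_depth:
            continue  # deepest (oldest) quotes are dropped
        if art_id not in cache:
            cache[art_id] = fetch(art_id)
        body = cache[art_id]
        if body is None:
            out.append(f"{INDENT}[quoted article <{art_id}> has expired]")
            continue
        # Line numbers are 1-based and inclusive; quoted text may itself
        # contain references, so expand it one level deeper.
        quoted = expand(body[first - 1:last], fetch, max_depth, depth + 1, cache)
        out.extend(INDENT + q for q in quoted)
    return out
```

A dictionary's `get` method is enough to try this out as `fetch`; a real reader would map article-ids to spool files and could cache open file pointers rather than whole bodies.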
One can generalize this *active* transmission approach to other networks (i.e. compress selectively based on the contents of the message). But I think one of the problems of the news network is its passive approach (i.e. send down everything, everywhere). Another way to look at this is that the "References:" field currently applies only at file-level granularity. This would take that to line-level (or even byte-level)...

I apologize for the length of this note, but I thought an extended elaboration would help present the solution better... I've addressed all followups to *net.news* to avoid clogging other groups. I've also included net.unix-wizards since other people there would be more familiar with the guts of the news system and the applicability of this suggestion...

Looking forward to hearing people's thoughts on this...

ramin

--
=--------------------------------------=-------------------------------------=
: alias: ramin firoozye'               : USps: Systems Control Inc.          :
:  uucp: ...!shasta \                  :       1801 Page Mill Road           :
:        ...!lll-lcc \                 :       Palo Alto, CA 94303           :
:        ...!ihnp4 \...!ramin@rtgvax   :   ^G: (415) 494-1165 x-1777         :
=--------------------------------------=-------------------------------------=