[net.news.b] ihave/sendme and 2.10.3

mike@peregrine.UUCP (Mike Wexler) (09/25/85)

	I have looked at the code for netnews 2.10.2 and it looks like it would be
pretty easy to modify it so that ihave and sendme would work as follows:
	1. The sendihave script (a replacement for sendbatch) would look in
	/usr/spool/batch, get a list of filenames (or article names, if that is
	more convenient), and create an ihave (or maybe ihavelist) control
	message with *all* the articles listed.
	2. Upon receipt of an ihave message, a sendme (or sendme list) message
	would be generated listing all the articles needed.
	3. Upon receipt of a sendme message, all the requested articles would
	be batched up and sent.
This would let sites set up redundant news feeds relatively inexpensively.
It could even be used to set up a cross-country feed to pick things up from
the other side of the country quickly and reduce propagation delays.
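
	Just to sketch what I have in mind (the header layout and the sample
IDs below are only my guesses, not anything 2.10.2 produces today), the
batched ihave could be a single control article whose body lists one
article ID per line:

	Control: ihave peregrine
	Newsgroups: to.feedsite

	<190@peregrine.UUCP>
	<1021@felix.UUCP>
	<877@trwrb.UUCP>

The receiving site would compare that list against its history file and
build the sendme the same way, listing just the IDs it is missing.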

	Another feature could be added that would allow a site to ask what
articles are available.  This would allow people to recover lost articles and
also to set up a redundant feed for a limited set of newsgroups (maybe call up
a backbone site to get *.sources and mod.*):
	1. A doyouhave message would be sent that includes a pattern
	specifying a set of newsgroups (*.sources, mod.*, etc.).
	2. In reply to a doyouhave message, an ihave would be sent listing all
	available articles that matched the given pattern.
	3. The ihave message would be treated as above (step 2).
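
	Again just a sketch (I am inventing the syntax here): the request
might be no more than

	Control: doyouhave *.sources,mod.*

and the reply would be an ordinary ihave control message, built like the
one above, listing every available article that matches the pattern.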

	Does version 2.10.3 have this feature?  If it doesn't,
has anybody implemented something like this?  If I implemented it,
could it be put in 2.10.3?  2.10.4?  When will 2.10.3 become generally
available, and where will I get it?

	Now for the far-out stuff.  It would be really nice if the USENET
network were truly redundant.  With this type of functionality it would
be possible for everyone to hook up to more than one site.

	If each article had a checksum at the end (or the beginning), a receiving
machine could generate a sendme message asking for all the articles whose
checksums were incorrect.
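
	For example (a toy only; the particular checksum and the idea of
carrying it in a trailer line are my inventions):

	#include <stdio.h>

	/*
	 * Checksum an article in the spirit of sum(1): rotate the running
	 * sum right one bit and add each byte, keeping 16 bits.  A receiving
	 * site could compute this over the body, compare it with a
	 * (hypothetical) "Checksum:" line, and queue a sendme for any
	 * article that disagrees.
	 */
	unsigned int artsum(FILE *fp)
	{
		int c;
		unsigned int sum = 0;

		while ((c = getc(fp)) != EOF) {
			sum = (sum >> 1) + ((sum & 01) << 15);	/* rotate right */
			sum = (sum + c) & 0xffff;		/* add the byte */
		}
		return sum;
	}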
-- 
Mike(always a dreamer) Wexler
15530 Rockfield, Building C
Irvine, Ca 92718
(714)855-3923
(trwrb|scgvaxd)!felix!peregrine!mike

jerry@oliveb.UUCP (Jerry Aguirre) (09/26/85)

I have modified news version 2.10.2 to include "avail" and "iwant"
control messages.  These implement an efficient form of the ihave/sendme
control messages.

The "avail" message consists of a list of article IDs and optionally,
the associated spool filename.  The list can be generated via a log
entry in the sys file or extracted from the history file with an editor
script.
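
For example (the layout is just how I happen to format it), an "avail"
body is one article per line, with the spool name optional:

	<1234@ucbvax.ARPA> net/sources/1021
	<567@hplabs.UUCP> net/news/b/877
	<89@fortune.UUCP>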

The receiving system checks each article ID in the "avail" list against
the history and, if it is a new one, requests it.  There are two
options for requesting the articles.

The first is to send an "iwant" control message listing the articles
being requested.  Including the spool filename speeds things up
because it is not necessary to search the history file.  The originating
system can then send the requested articles via (compressed) batching or
whatever.
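
In C, the receiving side's filter amounts to something like this (a
sketch, not my actual patch; a real version would use the dbm index
instead of rescanning the history file):

	#include <stdio.h>
	#include <string.h>

	#define HISTORY	"/usr/lib/news/history"

	/* Is this article ID already recorded in the history file?
	 * History lines begin with the article ID followed by a tab. */
	static int in_history(const char *id)
	{
		FILE *hf = fopen(HISTORY, "r");
		char line[BUFSIZ];
		int found = 0;

		if (hf == NULL)
			return 0;
		while (fgets(line, sizeof line, hf) != NULL) {
			line[strcspn(line, "\t\n")] = '\0';
			if (strcmp(line, id) == 0) {
				found = 1;
				break;
			}
		}
		fclose(hf);
		return found;
	}

	/* Read an "avail" list on stdin, one "<article-id> [spoolname]"
	 * per line, and print the lines we do not already have; the
	 * output becomes the body of the "iwant" reply. */
	int main(void)
	{
		char line[BUFSIZ], id[BUFSIZ];

		while (fgets(line, sizeof line, stdin) != NULL) {
			if (sscanf(line, "%s", id) != 1)
				continue;
			if (!in_history(id))
				fputs(line, stdout);
		}
		return 0;
	}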

The second alternative is to directly access the article via uucp.  A
command of the form:
	uux -r -z !rnews < remotesys!pathname

is executed to fetch the news article from the remote system.  This
reduces the delay in transmitting the article but doesn't allow for
batching or compression.

Which transmission method you would use depends on the volume of the
requested articles.

I had originally intended this scheme for exactly what you proposed.
Two sites could send each other a list of the articles that they have
received and then only send the articles that were lost in regular
transmission.  This is a very low-overhead operation.  A day's worth of
articles can be listed in a message of a few K bytes.  The receiving
system can check them against the history in < 30 seconds.

I tried it out on my own systems and ran into a serious problem.  The
news history is just not reliable enough.  There can be large numbers of
articles that exist locally but are not in the history.  One of my
systems wound up requesting several hundred articles that it already had
but didn't know about.  After numerous user complaints about the
duplications I stopped testing until I can modify the news history to
work better.

I have in mind a scheme to map the article ID into a pathname.  Then the
rnews program could simply attempt a creat (mode 644) on the
pathname, and if the article already existed the creat would fail.  This
should be faster than that silly DBM mechanism and more portable.  The
created pathname would become the base name for the article.  All the
net.* cross postings would be links to the base name.  In this way the
history becomes the articles and there can be no disagreement between
them.  Given the xref mod, the base article would reference all the
links made to it so expire could remove them all.

This makes referencing a parent article absurdly simple and fast.  One
need only map the article ID to its base pathname to have the
article.  The current readnews code to reference a parent article is
not only circuitous but is also just plain broken.

Given that Unix allows anything but a null or / in a filename, mapping
the article ID into a pathname is simple.  The only critical part is
creating enough subdirectories to keep the access time fast.
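
Very roughly, the creation step could look like this (a sketch of the
idea, not working news code; the two-character directory split and the
exclusive-create open are just my first guesses):

	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>

	/* Map an article ID like "<123@foo.UUCP>" into a spool-relative
	 * name such as "ids/12/123@foo.UUCP".  Since only null and '/'
	 * are illegal in a filename, any '/' becomes a comma. */
	static void idtopath(const char *id, char *path, size_t len)
	{
		char clean[256];
		size_t i, j = 0;

		for (i = 0; id[i] != '\0' && j < sizeof clean - 1; i++) {
			if (id[i] == '<' || id[i] == '>')
				continue;		/* drop the angle brackets */
			clean[j++] = (id[i] == '/') ? ',' : id[i];
		}
		clean[j] = '\0';
		snprintf(path, len, "ids/%.2s/%s", clean, clean);
	}

	/* Exclusive create, mode 644: it fails with EEXIST if an earlier
	 * copy of the article already made the file, so the spool itself
	 * is the history and nothing can disagree with it. */
	static int claim_article(const char *id)
	{
		char path[512];
		int fd;

		idtopath(id, path, sizeof path);
		fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
		if (fd < 0 && errno == EEXIST)
			return -1;	/* duplicate -- we already have this one */
		return fd;		/* >= 0: new article, caller writes it here */
	}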

Anybody want to take on the project of improving the news history
processing?

					Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|tolerant|allegra|tymix|olhqma}!oliveb!jerry

greg@ncr-sd.UUCP (Greg Noel) (09/27/85)

In article <190@peregrine.UUCP> mike@peregrine.UUCP (Mike Wexler) writes:
>	I have looked at the code for netnews 2.10.2 and it looks like it would
>be pretty easy to modify it so that ihave and sendme would work as follows:

I have considered this as well, but haven't had the time to look at it at all.
My thought is that inews should create the file in /usr/spool/batch to look like
"<article-id><TAB>path/suffix/of/file" since inews has all that information
when it is making the entry.  This file should be sent to the remote inews as
a control message.  (To be sure of compatibility, the control names should be
changed; my name for this message is "douwant" and the reply an "iwant".)  The
critical point is that the remote inews should take the article-id, decide
whether the article is wanted, and reply with just the path/suffix/of/file,
which is all the local inews needs to deliver the file.  You don't want the
full path name, both to save on transmission costs and to avoid a possible
security hole that would occur if you could ask inews to send you an arbitrary
file.  There is also the issue of how to specify what program(s) should be
run to deliver the files -- compression and batching are the two most obvious
things that would need to be known.
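
For concreteness (none of this is implemented; the syntax is made up on
the spot), the batch file entry and the two messages might look like:

	entry inews writes into /usr/spool/batch:
		<1234@peregrine.UUCP>	net/news/b/1021

	body of the "douwant" control message sent to the remote inews:
		<1234@peregrine.UUCP>	net/news/b/1021

	body of the "iwant" reply, naming only what is actually wanted:
		net/news/b/1021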

I also like the general idea of a "douhave" control sequence, but I think that
will require a \lot/ of careful thought to be sure that every desirable request
is possible and that undesirable requests are \not/.

I'm sure that many people will tell you that USENET is indeed redundant -- not
all sites are like yours with only one feed.  We have two full feeds and some
partial feeds; well-connected backbone sites have many more.  However, your
point that it would be much better if sites could be cheaply interconnected
is a valid one; in fact, I'd bet that if one percent of the current telephone
costs for transmission of duplicated articles were applied to doing something
like this, it would be accomplished in a day.....  Not only that, it would make
Chuqui's cost estimates for the breakpoint between newsgroups and mailing lists
a \lot/ closer to reality.
-- 
-- Greg Noel, NCR Rancho Bernardo    Greg@ncr-sd.UUCP or Greg@nosc.ARPA

usenet@ucbvax.ARPA (USENET News Administration) (10/05/85)

The inherent problem with all of these schemes is that our base level
transport mechanism (UUCP) is batched, not interactive. To illustrate
what I mean, let's look over the base case:

Three sites A, B, and C, in the following configuration:

	A - B - C

Let's say that both A & C are backbone sites, so that B is gonna get it
with both barrels. Let's further state that in an attempt to alleviate
this problem, B has turned on the IHAVE/SENDME protocol on both links.

IDEAL sequence:

01) A gets article X from elsewhere and queues an IHAVE X to B
02) A & B connect and speak UUCP
03) B processes the IHAVE X, and queues a SENDME X to A
04) A & B connect and speak UUCP
05) A processes the SENDME X, and queues article X to B
06) A & B connect and speak UUCP
07) B processes article X, and queues an IHAVE X to C
08) B & C connect and speak UUCP
09) C processes the IHAVE X, and queues a SENDME X to B
10) B & C connect and speak UUCP
11) B processes the SENDME X, and queues article X to C
12) B & C connect and speak UUCP
13) C processes article X

Whew! Pretty laborious, wouldn't you say? One possibility I don't note
here is that the processing and queueing of a reply to a message your
neighbor just sent might get done quickly enough to go back out during
the same conversation the message arrived in. However, it's not that
likely, and it still wouldn't cut
down on the number of messages that have to go flying about to move
an article from A to B to C.

Now I'd like to consider a slightly less-than-ideal sequence.

BAD sequence:

01) A gets article X from elsewhere and queues an IHAVE X to B
02) A & B connect and speak UUCP
03) B processes the IHAVE X, and queues a SENDME X to A
04) C gets article X from elsewhere and queues an IHAVE X to B
05) B & C connect and speak UUCP
06) B processes the IHAVE X, and queues a SENDME X to C
07) A & B connect and speak UUCP
08) A processes the SENDME X, and queues article X to B
09) B & C connect and speak UUCP
10) C processes the SENDME X, and queues article X to B
11) A & B connect and speak UUCP
12) B gets article X from A, accepts it, and queues an IHAVE X to C
13) B & C connect and speak UUCP
14) B gets article X from C, and rejects it as duplicate
15) C processes the IHAVE X, and ignores it because it already has X

The problem here occurs because `B' keeps no record of having asked
any remote system for an article and not yet having gotten it.

If you were to add that safeguard against asking for something twice
(or `n' times), you then have another problem. UUCP is known to drop
things on the floor occasionally, and if it drops your request for an
article into the bit bucket, you will never get the article.

Unless you have a timeout... which further delays transmission of
articles. And suppose that `A' (or `C') has expired the article by
the second (or third) time that you ask for it?
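
If you did keep such a record, it might look something like this
(purely hypothetical, nobody's code; the pending file and the three-day
timeout are names and numbers I pulled out of the air):

	#include <stdio.h>
	#include <string.h>
	#include <time.h>

	#define PENDING	"/usr/lib/news/pending"
	#define TIMEOUT	(3L * 24 * 60 * 60)	/* three days, a pure guess */

	/* One line per outstanding request: "article-id system time-asked".
	 * Decide whether to queue (another) SENDME for this article. */
	int should_request(const char *id)
	{
		FILE *fp = fopen(PENDING, "r");
		char pid[256], sys[64];
		long asked;

		if (fp == NULL)
			return 1;	/* no record at all: go ahead and ask */
		while (fscanf(fp, "%255s %63s %ld", pid, sys, &asked) == 3) {
			if (strcmp(pid, id) == 0 &&
			    (long)time((time_t *)0) - asked < TIMEOUT) {
				fclose(fp);
				return 0;	/* asked already; not timed out yet */
			}
		}
		fclose(fp);
		return 1;	/* never asked, or the old request timed out */
	}

And every one of those choices is one more place for an article to get
delayed or lost.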

To make the IHAVE/SENDME protocol work, we need an interactive
transport level, where two netnews transmission agents can actually
converse both ways: one says IHAVE, and the response is either SENDME
or DONTSENDME (and if the response is SENDME, transmission should
occur immediately, to keep the window during which the article could
arrive from some other system to an absolute minimum).
So long as we're stuck with the batch nature of UUCP, we lose.
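
The conversation I'm imagining would run something like this (no such
protocol exists today; the verbs are just the control message names
pressed into service as responses):

	A: IHAVE <123@ucbvax.ARPA>
	B: SENDME
	A: (transmits article <123@ucbvax.ARPA> right then)
	A: IHAVE <456@ucbvax.ARPA>
	B: DONTSENDME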

The best, most reliable method for propagating netnews articles is the
one in common use today: check the path, and if the article hasn't
already been there, send it.

	keeper of the network news for ucbvax,
		and guardian of the gateway,

	Erik E. Fair	ucbvax!fair	fair@ucbarpa.BERKELEY.EDU