[news.admin] IDEA: reader-initiated sendme protocol

pete@octopus.UUCP (Pete Holzmann) (07/01/88)

Maybe this is an old idea, but I think it might provide a good long-term
solution to some of our troubles:

We have an IHAVE/SENDME protocol that works automagically. Suppose that
the moderator of a particular newsgroup posted a message along the lines
of:

Archived-Name: <c.s.u.newsprog.pt1>
Available-At:  csu-archives

This is part 1 of 10 of the new newsprogram....


<end of example>

The reader could respond simply by hitting the 'sendme' key. A control
message would then be forwarded towards the nearest archive site (csu-archives
being an alias for the comp.sources.unix archive sites), until it reaches
a site that either already has the message in question, or [an efficiency
improvement that might be optional at first] until the message reaches a
site that has already requested the message. In the latter case, requests
would pile up until the requested message arrives, then be filled.
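
Here is a rough sketch of what each news-running machine along the way
might do with such a request. Every name in it is made up for
illustration; none of this exists in the current news code:

/*
 * Sketch only: hypothetical per-site handling of a reader-initiated
 * "sendme" request.  The lookups and transports are stubbed out.
 */
#include <stdio.h>
#include <string.h>

#define MAXPEND 64

struct pend {
    char msgid[128];            /* article being waited for */
    char requester[128];        /* who gets a copy when it shows up */
};

static struct pend pending[MAXPEND];
static int npend = 0;

/* stand-ins for the real history lookup and transport calls */
static int have_article(const char *msgid) { (void)msgid; return 0; }
static void fill_request(const char *msgid, const char *req)
    { printf("send %s to %s\n", msgid, req); }
static void forward_toward_archive(const char *msgid)
    { printf("pass request for %s toward csu-archives\n", msgid); }

void
handle_sendme(const char *msgid, const char *requester)
{
    int i, already_waiting = 0;

    if (have_article(msgid)) {          /* it's here: fill it and stop */
        fill_request(msgid, requester);
        return;
    }
    for (i = 0; i < npend; i++)
        if (strcmp(pending[i].msgid, msgid) == 0)
            already_waiting = 1;        /* somebody here asked first */

    if (npend < MAXPEND) {              /* remember who wants a copy */
        strncpy(pending[npend].msgid, msgid,
            sizeof pending[0].msgid - 1);
        strncpy(pending[npend].requester, requester,
            sizeof pending[0].requester - 1);
        npend++;
    }
    if (!already_waiting)               /* only the first request travels on */
        forward_toward_archive(msgid);
}

int
main(void)
{
    handle_sendme("<c.s.u.newsprog.pt1>", "octopus!pete");
    handle_sendme("<c.s.u.newsprog.pt1>", "ukma!david");
    return 0;
}

The second request for the same thing never leaves the site; it just
piles up behind the first one until the article shows up.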

This method would limit distribution of big stuff to just those sites
that request it. To make everything wonderful, we'd probably want to
add information to the sys file (and maybe the maps) regarding which paths
may be used for such distributions.
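
For instance (pure guesswork on syntax; the 'R' flag below is invented,
not a real news sys flag), a sys entry might grow a marker saying "this
link may carry request-filled distributions":

barney:world,comp:FR:/usr/spool/batch/barney

where F is the usual batching flag and the made-up R would mean that
sendme-filled traffic is allowed to be routed over this link.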

This seems like a nice combination of the best of both worlds (mailed requests
and flooding-distribution) to me. It almost seems like it could be implemented
partially on top of IHAVE/SENDME. Instead of IHave/SendMe, it is
IHave/I'llGetIt4U.

Do any of you new-news-version-implementors have any idea whether this, or
some useful subset, is implementable without too much pain?

Or is this just pie in the sky?

Pete
-- 
  OOO   __| ___      Peter Holzmann, Octopus Enterprises
 OOOOOOO___/ _______ USPS: 19611 La Mar Court, Cupertino, CA 95014
  OOOOO \___/        UUCP: {hpda,pyramid}!octopus!pete
___| \_____          Phone: 408/996-7746

david@ms.uky.edu (David Herron -- One of the vertebrae) (07/04/88)

In article <270@octopus.UUCP> pete@octopus.UUCP (Pete Holzmann) writes:

[ ... deleted text about reading a posting describing some archived object]

>The reader could respond simply by hitting the 'sendme' key. A control
>message would then be forwarded towards the nearest archive site (csu-archives
>being an alias for the comp.sources.unix archive sites), until it reaches
>a site that either already has the message in question, or [an efficiency
>improvement that might be optional at first] until the message reaches a
>site that has already requested the message. In the latter case, requests
>would pile up until the requested message arrives, then be filled.

This is oh so very similar to something which has been bubbling
away in my mind for a long time.  As I see it though, that phrase
about "would then be forwarded towards the nearest archive site"
is the sticking point that nobody short of peter honeyman could
solve.  The problem I see is that this network doesn't live just
within the UUCP network, but also lives within BITNET and ArpaNet.
If it just lived within the UUCP world, then finding the nearest
archive site is simply a matter of running pathalias.  But if there's
an archive site in BITNET land or Arpa land then it's not so trivial
to find the nearest one.  Or a user in BITland or Arpaland also
has a difficult time finding the nearest archive.

The problem here is that nobody (short of peter honeyman) has a
good map of ALL of the networks which includes weights to different
parts of ALL of the networks.  Within each network there are maps
from which weights can be derived (er.. maybe not for the Arpanet,
I'm not sure).  But since there is no joint map, the "nearest archive"
cannot be found in the general case.

Assuming that this can be solved then the rest of it will fall into
place in any number of ways.  Given current software it would be
easiest to use something like Brian Reid's archive server and build
into it the knowledge of which is the closest archive site for any 
particular site.  When a request comes into the server from a distant
site, it looks up the closest known server in its database and
forwards the request to that one.
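
In other words, the server's "database" could start out as nothing
fancier than a table mapping the front of the requesting address to a
preferred server, roughly like this (every site and address below is
invented for illustration):

/*
 * Sketch, not real archive-server code: pick a "closest" server for
 * a requesting address by prefix match against a hand-kept table.
 */
#include <stdio.h>
#include <string.h>

struct route {
    const char *prefix;         /* start of the requesting address */
    const char *server;         /* archive server to hand the job to */
};

static struct route closest[] = {
    { "psuvm",   "cunyvm!archive-server"   },  /* one side of BITNET */
    { "ucbjade", "ucbjade!archive-server"  },  /* the other side */
    { "uunet",   "uunet!archive-server"    },  /* UUCP neighbours */
    { NULL,      "backbone!archive-server" },  /* everyone else */
};

const char *
server_for(const char *requester)
{
    struct route *r;

    for (r = closest; r->prefix != NULL; r++)
        if (strncmp(requester, r->prefix, strlen(r->prefix)) == 0)
            return r->server;
    return r->server;                   /* fell through to the default */
}

int
main(void)
{
    /* a request from a BITNET neighbour is handed off, not filled here */
    printf("%s\n", server_for("psuvm!somebody"));
    return 0;
}

Of course a human still has to keep the table honest, which is the same
map problem in miniature.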

Just because two sites are on the Arpanet doesn't mean that they're
close to each other, or that you can get good throughput to that site,
or that it makes sense to fill archive requests from that site.
The same holds true for BITNET, but there it comes down more to which
side of the ohstvma<->psuvm<->cunyvm<->ucbjade links you live on.

But while I've been sitting here two methods have come to mind.  One
is a control message type of thing which floats out into the network.
It's marked in some special way to cause it to go into an archive
server whenever it reaches a site running one.  The idea is to have
the first server it reaches handle the request, but how to stop
secondary filling of requests?  hmmm.. The server could send out a
cancel message for the request as soon as it receives the request.
In addition, instead of immediately sending out the requested file
it could generate a mail message which, when replied to, will
cause the request to be filled.  We'd have to trust the user to
not reply to redundant responses from servers...
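
To make that handshake concrete (the request article, addresses and
Subject lines below are all invented; only the Control: cancel syntax
is ordinary news usage), the server's two outgoing pieces might look
roughly like this.  First, a cancel for the request article, so no
other server tries to fill it:

  Newsgroups: comp.sources.unix.request
  Control: cancel <request.123@octopus.UUCP>
  Subject: request being handled by archive-site.UUCP

Then, mail to the requestor instead of the file itself:

  To: octopus!pete
  Subject: reply to this message to receive c.s.u.newsprog, parts 1-10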

Another idea is to handle the "find closest archive" problem by
hand.  That is, instead of allowing requests from arbitrary sites,
require each site to register itself with a particular archive.
The director of the archive would probably be fluent enough with
the network to know a better server to use if the site is far away.
Maybe part of this would be to strongly urge sites to make
direct connections with their archive servers, if possible.

Maybe part of this is a file which the local news maintainer
maintains, like the moderators file.  Or maybe it *is* the moderators
file.  A user making a request would post to a moderated group,
and the entry for that group in the file would point at the
closest archiver.
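
If it were a separate file, it could be as simple as the moderators
file: a newsgroup and the address to aim requests for that group at.
Everything below is invented for illustration:

# hypothetical "archivers" file, kept up by the local news admin
comp.sources.unix      uunet!archive-server
comp.sources.misc      osu-cis!archive-server
comp.sources.games     closest-regional-site!archive-server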

>This method would limit distribution of big stuff to just those sites
>that request it. To make everything wonderful, we'd probably want to
>add information to the sys file (and maybe the maps) regarding which paths
>may be used for such distributions.


YES EXACTLY!
-- 
<---- David Herron -- The E-Mail guy                         <david@ms.uky.edu>
<---- ska: David le casse\*'      {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<----
<---- I'm not bad, I'm just coded that way!

len@netsys.UUCP (Len Rose) (07/04/88)

Nice idea .. it has appeal.

Such a scheme could greatly reduce costs and has the potential
for lightening loads netwide.  Now all you have to do is find
sites willing to be archive servers in each "region" ... If you
meet that goal, then the inter-archiver links can stay at redundant
levels with each other.  Speedy propagation to say the least, and
it would also provide archive resource sharing or perhaps
specialization of services..

Of course, you are also talking about completely changing the face
of usenet software, headers, backbone organizations, etc.

No small feat... <smile>


-- 
Len Rose - Netsys,Inc. 301-520-5677 
len@ames.arc.nasa.gov  or len@netsys

pete@octopus.UUCP (Pete Holzmann) (07/04/88)

In article <9823@g.ms.uky.edu> david@ms.uky.edu (David Herron) writes:
>... As I see it though, that phrase
>about "would then be forwarded towards the nearest archive site"
>is the sticking point that nobody short of peter honeyman could
>solve.

Well, to begin with, I'd just use normal maps to fire the request in
the direction of *some* archive server. Maybe let the user pick which one
from the list, maybe have the SA be able to modify a list similar to the
moderators file.

Since the request would be filled by the first news-running machine that
has either already received the package in question or [a blue-sky
extra feature?] already has the same request outstanding for somebody else,
the worst-case scenario is that a bunch of requests would cause the net to
become quickly flooded with a popular software package... which is exactly
what happens right now for ALL postings.

The only thing lost is that less-popular things would not be transmitted
along the most efficient or most desirable route. I think this tweak
[and admittedly a pain to solve, as David points out] can be added later.

Pete
-- 
  OOO   __| ___      Peter Holzmann, Octopus Enterprises
 OOOOOOO___/ _______ USPS: 19611 La Mar Court, Cupertino, CA 95014
  OOOOO \___/        UUCP: {hpda,pyramid}!octopus!pete
___| \_____          Phone: 408/996-7746

mack@inco.UUCP (Dave Mack) (07/06/88)

In article <9823@g.ms.uky.edu> david@ms.uky.edu (David Herron -- One of the vertebrae) writes:
>In article <270@octopus.UUCP> pete@octopus.UUCP (Pete Holzmann) writes:
>
>[ ... deleted text about reading a posting describing some archived object]
>
>>The reader could respond simply by hitting the 'sendme' key. A control
>>message would then be forwarded towards the nearest archive site (csu-archives
>>being an alias for the comp.sources.unix archive sites), until it reaches
>>a site that either already has the message in question, or [an efficiency
>>improvement that might be optional at first] until the message reaches a
>>site that has already requested the message. In the latter case, requests
>>would pile up until the requested message arrives, then be filled.
>
>This is oh so very similar to something which has been bubbling
>away in my mind for a long time.  As I see it though, that phrase
>about "would then be forwarded towards the nearest archive site"
>is the sticking point that nobody short of peter honeyman could
>solve.  The problem I see is that this network doesn't live just
>within the UUCP network, but also lives within BITNET and ArpaNet.
>If it just lived within the UUCP world, then finding the nearest
>archive site is simply a matter of running pathalias.  But if there's
>an archive site in BITNET land or Arpa land then it's not so trivial
>to find the nearest one.  Or a user in BITland or Arpaland also
>has a difficult time of finding the nearest archive.

I don't see why this should be so difficult. What's wrong with this:

1) Post source abstracts with a volume number and the Followup-To: line
   set to comp.sources.{machine,OS}.request.

2) In the active file, replace the moderation flag (fourth field) with
   an 'a' for the comp.sources.{machine,OS}.request groups.

3) Hack inews to recognize the 'a' flag as an archive request, which
   it would treat the same as a posting to a moderated group, except
   that it would add an Archive-Request: line to the header which
   would include the volume requested and the address of the requestor
   (a sketch of what such a posting might look like follows this list).

4) Either get the backbone sites that maintain moderator lists to keep
   a list of archive sites (preferred) or add an archivelist: field to
   the mailpaths file designating the nearest system that maintains an
   archive site list. I believe Bill Wisner is currently maintaining
   such a list. Perhaps he would be willing to mail it to backbone sites
   on a monthly basis? Or I'll shoot this dog.

5) Hack the news at backbone sites to convert postings to c.s.{}.request
   into mail to the nearest archive server as determined by pathalias.
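
To be concrete about steps 2 and 3 (the 'a' flag and the
Archive-Request: line are both inventions of this proposal, so the
details below are only a guess at what they might end up looking like):

The active file entry for a request group:

   comp.sources.unix.request 00017 00001 a

And the header of a request after inews has had its way with it:

   Newsgroups: comp.sources.unix.request
   Subject: archive request
   Archive-Request: comp.sources.unix/volume13/newsprog pete@octopus.UUCP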

Presumably, both BITNET and Arpanet sites can post to moderated USENET
groups. If that's true, it seems that 90% of the software needed to do
this is already in place, and the changes necessary aren't too horrendous.
The cooperation of the backbone in this is the critical factor.

>The problem here is that nobody (short of peter honeyman) has a
>good map of ALL of the networks which includes weights to different
>parts of ALL of the networks.  Within each network there are maps
>from which weights can be derived (er.. maybe not for the Arpanet,
>I'm not sure).  But since there is no joint map, the "nearest archive"
>cannot be found in the general case.

This is only a problem if you insist on an optimal path. Obviously, this
is desirable, but not essential. In principle, all of the archive servers
could be USENET sites. I'm not sure how the folks on BITNET and Arpanet
would feel about this, but I'm sure they'll give me a clue. In any event,
they wouldn't have to be able to find the "nearest archive", just the
nearest backbone site. This ought to be a simpler problem.

The real problem is that mail transport on the net is a much shakier
proposition than news is. Smart mailers are hell.

Yours for blue sky, handwaving, smoke and mirrors,
Dave (short of peter honeyman) Mack
...uunet!inco!mack
...sun!sundc!inco!mack

matt@oddjob.UChicago.EDU (Schizophrenic Solipsist) (07/07/88)

One possible way to find the "nearest archive" ...

Have each archive site periodically post something, perhaps a list of
holdings.  Let each site compare the arrival time of each such
article to its posting time and keep track of the quickest, with some
filtering to remove glitches.

This supposes that the propagation of these articles resembles the
route that archive requests and responses will take.  It should be
mandated to be so, since other routes may not be approved for netnews
use.
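
A sketch of the bookkeeping a site might do (site names invented; the
only "filtering" shown is throwing out negative delays from clock skew,
so a real version would want something smarter):

/*
 * Sketch only: track how quickly each archive's periodic postings
 * arrive here, and call the fastest one the "nearest".
 */
#include <stdio.h>
#include <string.h>

#define NSITES 3

struct archive {
    const char *site;
    long best_delay;    /* seconds from posting to arrival, -1 = none yet */
};

static struct archive archives[NSITES] = {
    { "uunet",   -1 },
    { "osu-cis", -1 },
    { "pyramid", -1 },
};

/* call once per holdings posting, with the Date: header and the local
 * arrival time already turned into seconds */
void
note_arrival(const char *site, long posted, long arrived)
{
    long delay = arrived - posted;
    int i;

    if (delay < 0)
        return;                         /* clock skew; throw it out */
    for (i = 0; i < NSITES; i++)
        if (strcmp(archives[i].site, site) == 0 &&
            (archives[i].best_delay < 0 || delay < archives[i].best_delay))
            archives[i].best_delay = delay;     /* new quickest arrival */
}

/* the "nearest" archive is the one whose postings get here fastest */
const char *
nearest_archive(void)
{
    int i, best = -1;

    for (i = 0; i < NSITES; i++)
        if (archives[i].best_delay >= 0 &&
            (best < 0 || archives[i].best_delay < archives[best].best_delay))
            best = i;
    return best < 0 ? "unknown" : archives[best].site;
}

int
main(void)
{
    note_arrival("uunet", 0L, 5400L);   /* ninety minutes in transit */
    note_arrival("osu-cis", 0L, 1800L); /* half an hour */
    printf("nearest archive so far: %s\n", nearest_archive());
    return 0;
}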
________________________________________________________
Matt Crawford	     		matt@oddjob.uchicago.edu

eric@snark.UUCP (Eric S. Raymond) (07/07/88)

In article <14877@oddjob.uchicago.edu>, matt@oddjob.UChicago.EDU writes:
>One possible way to find the "nearest archive" ...
>
>Have each archive site periodically post something, perhaps a list of
>holdings.  Let each site compare the arrival time of each such
>article to its posting time and keep track of the quickest, with some
>filtering to remove glitches.

Somehow I missed the previous articles in this thread. This sounds a lot like
some aspects of the hypertext design I'm working on as a follow-on to News 3.0.

If the originators of <270@octopus.uucp> and <9823@g.ms.uky.edu> still have
these on line, please send me your suggestions (yes, I know, this is itself
an instance of the sort of problem such a protocol would solve).
-- 
      Eric S. Raymond                     (the mad mastermind of TMN-Netnews)
      UUCP: ..!{uunet,att,rutgers!vu-vlsi}!snark!eric   Smail: eric@snark.UUCP
      Post: 22 South Warren Avenue, Malvern, PA 19355   Phone: (215)-296-5718