[list.ietf-nntp] should batch/ibatch/image/et al apply to the article command?

lear@turbo.bio.net (Eliot) (06/25/91)

There are several implementations out there that use the
NEWNEWS/ARTICLE commands to download information in bulk.  In
addition, if we do add support for binary transfer, should the ARTICLE
command handle that, too?

Eliot Lear
[lear@turbo.bio.net]

sob@tmc.edu (Stan Barber) (06/25/91)

NEWNEWS and ARTICLE were really invented for news readers, not news transfer

Therefore, I believe these features should not be added to NEWNEWS/ARTICLE.

Perhaps part of NNRP, but not NNTP.

-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

Harri.Salminen@funet.fi (Harri K Salminen) (06/25/91)

Your message dated: Mon, 24 Jun 91 19:14:40 CDT

> NEWNEWS and ARTICLE were really invented for news readers, not news transfer
> Therefore, I believe these features should not be added to NEWNEWS/ARTICLE.
> Perhaps part of NNRP, but not NNTP.

As I said in my earlier letter about extending NEWNEWS et al., I think it's
very important to improve NEWNEWS and ARTICLE support for NNTP transfer as
well.  In particular, it is VERY important to support newsgroup and article
selection, so that a site can transfer at least the articles from some
interval of time in a SPECIFIC newsgroup.  My vision is that we NEED
support for building a hierarchical, caching news distribution backbone, in
which at least leaf sites don't need to keep a copy of ALL articles in ALL
available newsgroups for a year just because some users might occasionally
need them.  The news servers might keep a list of newsgroups, or even a
subject index (a la nn, but applied to a news server), but not the bodies
of all articles; from that index a user could select an article, which
would then be fetched into a local cache from a backbone or archival server
that keeps longer-term copies.  Keeping archives on anonymous ftp is not a
solution every end user likes, or even manages to use.

Basic support for this, using newsgroup masks and time (or article-number?)
intervals, shouldn't be complicated to implement, but it would help a lot
in managing traffic growth.  There is a growing need for sparse public and
closed distributions, which could be handled more efficiently this way.  I
also hope to merge the coming listserv-like services with netnews gateways,
so that the user sees a single, easy-to-use service that is also easy for
us to manage.

In this sense, I don't think too strong a division between NNTP and NNRP
makes sense when building cache-style hierarchical support.  You might even
support user-level authentication several levels up, or to several sites or
archives.

The more complex future feature would be an intelligent and robust routing
protocol for newsgroups, but I agree it might be out of scope for the basic
set of services, and it needs further study before being made even
optional.  In any case, we need option support for experimental
enhancements as well.

I think we definitely NEED OPTIONS negotiation, not some general MODEs.
You could do it quickly by sending an OPTIONS option-count command followed
by a sorted list of options you'd like to have.  The other end would
respond with the subset of options it accepts.  This could continue until
the originating side agrees to the reply, by simply starting the normal
session.  Since the system could remember the values from the last session,
it could just reuse the previous set next time, unless there's a change of
version number (did we already have something similar to an SOA record
check?) or an explicit request.  Also, the most common sets of options
could have a shorthand name, like (NNRP-BASIC-OPTIONS), if you insist on
something looking like MODE NNRP support.  This way startup would normally
be very fast even if we had dozens of options.
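The negotiation described above amounts to intersecting option sets until both sides agree.  A minimal sketch, assuming the OPTIONS exchange works as Harri proposes; the option names (IMAGE, COMPRESS-LZW, and so on) are hypothetical, not part of any spec:

```python
# Sketch of the OPTIONS negotiation: the client offers a sorted list of
# option names, and the server replies with the subset it accepts.  The
# client then starts the normal session, implicitly agreeing to that set.
# All option names here are invented for illustration.

def negotiate(client_wants, server_supports):
    """Return the options both sides agree on, in sorted order."""
    return sorted(set(client_wants) & set(server_supports))

# Client opens with something like "OPTIONS 3" followed by its list;
# the server answers with the agreed subset.
client = ["COMPRESS-LZW", "IMAGE", "SEARCH-SUBJECT"]
server = ["AUTH-SIMPLE", "COMPRESS-LZW", "IMAGE"]

agreed = negotiate(client, server)
```

Remembering `agreed` between sessions, as suggested, would make the usual startup a no-op.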

We definitely need options for image and multimedia transfer, several
authentication schemes, several search facilities (subject/keyword/etc.,
a la nngrab, or full text), experiments, etc.



Harri

sob@tmc.edu (Stan Barber) (06/25/91)

If the need is to transfer the contents of a particular group from
a master server to a caching server, NEWNEWS/ARTICLE is not the solution.

I could see a GROUP command to set the group, and then some kind of BATCH
command to transfer the entire contents of a group as one batch.  This
minimizes the number of transactions and really looks like a TRANSPORT.
NEWNEWS/ARTICLE (as stated previously) is for newsreaders.
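A session under this suggestion might look something like the following; the BATCH verb and the response codes are invented for illustration, not part of any spec:

```
C: GROUP comp.protocols.nntp
S: 211 1042 3000 4041 comp.protocols.nntp group selected
C: BATCH
S: 250 batch follows (multi-line response, terminated by a lone ".")
S: ...every article in the group, one after another...
S: .
```

One round trip per group, rather than one per article, is what makes this look like transport rather than reading.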


-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

rsalz@bbn.com (Rich Salz) (06/25/91)

>NEWNEWS and ARTICLE were really invented for news readers, not news transfer
This is wrong.  There are servers out there that use NEWNEWS.
	/r$

sob@tmc.edu (Stan Barber) (06/25/91)

Please list these. The only one I know of is nntpxfer.


-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

brian@UCSD.EDU (Brian Kantor) (06/25/91)

>NEWNEWS and ARTICLE were really invented for news readers, not news transfer

No, whilst they are currently of primary interest for newsreaders, it
was my idea that they could be used for demand feeds ("suck" instead of
"blow").  In fact, when we were inventing NNTP, that was a major point
of contention: whether news feeds were to inhale or exhale.  The
compromise was to include facilities for both.

As you may recall, I've often stated that people shouldn't use
'nntpxfer' to suck in news articles because it's a hack.  That refers
to the PROGRAM, not the technique.  I still think it's a good way to do
news if you have anything less than a full feed, but I ascribe the fact
that few are doing it to programmer laziness rather than any flaw in
the concept.
	- Brian

Harri.Salminen@funet.fi (Harri K Salminen) (06/25/91)

Your message dated: Tue, 25 Jun 91 08:27:35 CDT


> I could see a GROUP command to set the group and then some kind of BATCH
> command to transfer all the contents of a a group as one batch. This 
> minimizes the number of transactions and really looks like a TRANSPORT.
> NEWNEWS/ARTICLE (as stated previously) is for newsreaders.

I don't really care whether the functionality is achieved with some new
commands or by extending NEWNEWS (NEWNEWS is preferred here for transport
by some sites); what I care about is the flexible functionality.  Maybe
there could be commands like "GROUP (filter-number or newsgroup-name)
start-time end-time LIST/BATCH/XHEADERS ..." that could be used to restrict
the scope of the old-style, more general commands?  If the system doesn't
support the restriction, you have to fetch more and throw away the
extraneous information received.  This way it might be easier to implement
on existing systems, and you could use Message-IDs and batch-style
transfer, or article numbers and database searches, on that restricted set
of articles.  How about this, then?
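The server-side work behind such a restricted command is just filtering the spool by newsgroup mask and time interval.  A minimal sketch under that assumption; the article tuples, field layout, and the `select` helper are all invented for illustration:

```python
# Sketch of the filtering behind a hypothetical
# "GROUP <mask> <start-time> <end-time>" restriction: pick the articles
# whose newsgroup matches the mask and whose arrival time falls in the
# interval.  The spool contents below are fabricated.
from fnmatch import fnmatch

def select(articles, mask, start, end):
    """articles: iterable of (message_id, newsgroup, arrival_time)."""
    return [msg_id
            for msg_id, group, when in articles
            if fnmatch(group, mask) and start <= when <= end]

spool = [
    ("<1@a>", "comp.protocols.nntp", 100),
    ("<2@b>", "comp.lang.c",         150),
    ("<3@c>", "comp.protocols.tcp",  200),
    ("<4@d>", "rec.music",           120),
]

wanted = select(spool, "comp.protocols.*", 50, 180)
```

The resulting Message-ID list could then feed either batch-style transfer or per-article fetches, as Harri suggests.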

I still think it's sometimes useful to allow reading-style access to
certain groups, or to certain time intervals of articles.


tytso@ATHENA.MIT.EDU (Theodore Ts'o) (06/25/91)

   From: Rich Salz <rsalz@bbn.com>
   Date: Tue, 25 Jun 91 09:29:07 EDT

   >NEWNEWS and ARTICLE were really invented for news readers, not news
   >transfer 
   This is wrong.  There are servers out there that use NEWNEWS.

Indeed.  Can anyone name a _news_reader_ that uses NEWNEWS?  As far as I
can tell, it's only used by people who are too lazy to set up a real
news server.  As Rich Salz mentions in another message, NEWNEWS is also
a good way of hammering the victim host and sending its performance to
hell, since it's fairly inefficient in the way it's implemented.

Is there a good reason to even keep NEWNEWS in the spec?  I would love
to see it go... at the very least, nntpxfer should be removed from the
NNTP distribution.

							- Ted

P.S.  This is one of my pet peeves.  Can you tell?  :-)

rickert@cs.niu.edu (Neil Rickert) (06/25/91)

In article <9106250014.AA25541@tmc.edu> sob@tmc.edu (Stan Barber) writes:
>NEWNEWS and ARTICLE were really invented for news readers, not news transfer

 ARTICLE and HEAD can be very useful debugging tools.  If NNRP is separated
from NNTP, some primitive support would still be useful.  Hosts that
receive a feed with NNTP will probably not be in the NNRP permissions, but
they should not lose the ability to manually select an occasional article
to inspect it while tracing problems.

 During the recent flap over C news rejecting articles, it has been useful
to occasionally fetch a rejected article for inspection so as to see what is
wrong.

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

fletcher@cs.utexas.edu (Fletcher Mattox) (06/25/91)

I second that.  I would hate to see HEAD disappear from NNTP.  
I use it all the time for diagnosing problems with my xfer peers.

sob@tmc.edu (Stan Barber) (06/25/91)

I am not advocating the removal of any current functionality from NNTP.  I
am only advocating that we not spend a lot of time improving NNTP
for the benefit of news readers.  I would like to see effort spent toward
making NNTP a better TRANSPORT.

-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

rsalz@bbn.com (Rich Salz) (06/25/91)

>I would like to see effort spent towards making NNTP a better TRANSPORT.

The following is obnoxious and somewhat self-serving.  I apologize,
but will say it anyway as a way of getting a real data-point in there:

    InterNetNews, currently in beta-test, processes IHAVE at better than
    10 to 20 articles per second for accepting, verifying, storing, and
    arranging for articles to be transmitted.  It can reject more than 100
    duplicates per second.

    INN's implementation techniques aren't for everybody.  Just those
    on the Internet with a reasonable server, like a Sparc-class
    machine.

How much better does NNTP have to be? :-)

More seriously, expending effort to handle unusual cases like
compression or batching might be worthwhile, but the IHAVE transaction
is really quite good.
	/r$

sob@tmc.edu (Stan Barber) (06/25/91)

Tell me, Rich: how well does InterNetNews work over 9.6 kbps lines?

If you think it is good enough, then why participate in this discussion?


-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

rsalz@bbn.com (Rich Salz) (06/25/91)

My message said IHAVE/SENDME is good enough, and I thought I
specifically said that other stuff is worth doing.

Apparently I offended you; I'm sorry.
	/r$

Jim.Thompson@Central.Sun.COM (Jim Thompson) (06/25/91)

One of the things that I've suggested to R$ in the past few weeks
is a mode whereby two co-operating NNTP implementations participate
in the following exchange:


	machinea		machineb

	  res+n	----------------->  119   (1)

	    	<-----------------    (2)

	  res+o  ------------------>  res+p  (3)

	 	<------------------   (4)

Connections 1 and 2 are the half-duplex halves of a normal TCP stream, with
the NNTP port (119) on the 'b' side.

Connections 3 and 4 are the same, but with the NNTP port on the 'a' side.

Machine 'a' is sending news to 'b'.

During the setup phase, machine 'a' sends machine 'b' the equivalent of
ftp's 'PORT' message, specifying a connection endpoint where it will accept
articles.

'A' sends 'B' nothing but message-IDs on (1).  'B' sends back on (2) the
message-IDs of the articles that it would like to see.  'A' sends the
articles on (3), and 'B' responds on (4) as to the success or failure of
the article or batch.

Note that the batch (possibly compressed) can trickle across the link as
it's produced.  One needn't wait for a fully-formed batch before it's
sent.  In fact, if there are other users on your 9600 baud line, they'll
quite likely be pissed off otherwise, because every few time quanta the
equivalent of 'rcp bigfile machineb:/tmp' will take place, and that will
tromp on their telnet sessions.  Machine 'A' must (obviously) keep track of
which articles are in the batch being sent across the wire until a 'got it
all, man, thanks' message comes back from 'B', but since we're talking
about message-IDs here, that shouldn't be a big deal, even on a 386
machine.

Note as well that machine 'B' can keep track of the articles that it
has requested, and if it doesn't get one (or more), it is free to
insert the message-IDs into the 'gimme' stream (2) at some later point,
possibly to some other server.
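The bookkeeping Jim describes can be sketched in a single-process toy model.  This collapses the four streams into sequential steps (a real implementation would run them concurrently); the spool contents and the `feed` helper are invented for illustration:

```python
# Toy model of the two-stream flow: 'A' offers message-IDs on stream (1),
# 'B' answers with the ones it wants on (2), 'A' ships those articles on
# (3), and 'B' acknowledges on (4).  'A' remembers what is in flight until
# the acknowledgement arrives.

def feed(a_spool, b_has):
    offered = list(a_spool)                          # stream (1): offers
    wanted = [m for m in offered if m not in b_has]  # stream (2): gimme list
    in_flight = {m: a_spool[m] for m in wanted}      # A's pending batch
    received = dict(in_flight)                       # stream (3): bodies
    acked = set(received)                            # stream (4): 'got it all'
    return wanted, acked

a_spool = {"<1@a>": "body1", "<2@a>": "body2", "<3@a>": "body3"}
b_has = {"<2@a>"}

wanted, acked = feed(a_spool, b_has)
```

Since only message-IDs are tracked, the in-flight state stays small, which is Jim's point about this being cheap even on a 386.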

Why all the complexity?

I don't want to see the proposed batching reduce what we've gained in
temporal diameter reduction.  News is now getting very fast.  With INN
I can forward articles from one spool to another in less than 200ms.
I really don't want to see a day when we go back to N hour delays just
because 10,000 sites have joined the Internet via 9600 baud V.32 connections,
and they're all batching news.  I've got nothing against 9600 baud
connectivity, in fact, I think I'm trying to help.

Note that this will require a "MODE" verb, as well as a "PORT" verb.

Fire at will, I'm sure someone will take issue with this.

Jim

geoff@world.std.com (06/26/91)

> During the recent flap over C news rejecting articles, it has been useful
> to occasionally fetch a rejected article for inspection so as to see what
> is wrong.

Or rather, would have been useful if the major NNTP sites hadn't cut off
NNTP access to all but a few feed sites.  From Toronto, we basically
couldn't retrieve articles from any of the interesting sites.  Surely
there should be some way to permit (indeed, mandate) low-volume remote
debugging while preventing full-scale news transmission.

sob@tmc.edu (Stan Barber) (06/26/91)

Gee, let's reactivate the WIZARD command in sendmail while we are at it.
:-)
STAN

-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

sob@tmc.edu (Stan Barber) (06/26/91)

I like Jim's proposal.  It truly addresses transport issues, and it
looks like it could be an effective solution to many problems.
It probably obviates the need for nntplink.

-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

lear@turbo.bio.net (Eliot) (06/27/91)

Ok.  Here are some comments about your method, Jim.

1.  Setup cost would be higher: establishing an extra TCP connection.
This is OK if the NNTP connection is supposed to linger for a long
period of time.

2.  One of the nice things about the existing model is that it could be
implemented on any reliable medium.  Your method pretty much nails it
down to TCP, and protocols that support multiplexing.

3.  Somewhere in there we'll have to exchange byte count information if
the BINARY or IMAGE options are also enabled.  

Now.  What does this buy for 9600 baud connections that normal
IHAVE/SENDME doesn't?

Eliot Lear
[lear@turbo.bio.net]

Jim.Thompson@Central.Sun.COM (Jim Thompson) (06/28/91)

	From lear@turbo.bio.net Thu Jun 27 14:05:47 1991
	
	Ok.  Here are some comments about your method, Jim.
	
	1.  Setup cost would be higher - establishing an extra TCP connection. 
	This is OK, if the NNTP connection is supposed to linger for a long
	period of time.

Temporal diameter will be reduced.  This is good.  We already have
an implementation that can forward articles in less than 150 ms.  Just
how much less, we have yet to determine.  This obviously depends on the
wire time for sending the article.  Someone has to do something with
all those T3 bits.  (Hmm, will the whole of an average USENET article fit
in the wire between, say, Boston and San Francisco?)
	
	2.  One of the nice things about the existing model is that it could be
	implemented on any reliable medium.  Your method pretty much nails it
	down to TCP, and protocols that support multiplexing.

Not really; any protocol that can support two stream connections between
the same endpoints qualifies.  I think I could do the same thing over an
OSI stack (though I don't want to).  :-)  X.25 could also support this
technique.
	
	3.  Somewhere in there we'll have to exchange byte count information if
	the BINARY or IMAGE options are also enabled.  

Not really.  This is TCP-specific, but you could have the serving end
'shutdown' its half of the article stream after it is done sending;
the far end then sees EOF.  Much like ftp.

You can then send the articles batched in, say, 'rnews' format, to
make extraction easy.  This requires IMAGE, of course.  Even without
that, you can send the articles, separated by "\r\n.\r\n", one right after
the other.
	
Given that you've already negotiated what compression technique you're
using, you're still set.  LZW (don't shoot me, Len) handles EOF quite
well.  Note that sending the articles in 'rnews' format under compression
is likely an efficiency win, but it isn't required.

	Now.  What does this buy for 9600 baud connections that normal
	IHAVE/SENDME doesn't?

Compression is more efficient for a longer (larger?) data stream.  If you
follow the current model of IHAVE/SENDME, then you can compress,
at most, the contents of one article.  Thus, the 9600 baud connections
will benefit because they'll have more bandwidth left for other
applications.  Also, a compressed batch will have less effect on the 9600
baud line, since it should 'trickle' across, rather than being blasted.
News transfers should appear to be background noise.
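The per-article versus whole-batch difference is easy to demonstrate.  A rough sketch using zlib as a stand-in for LZW (LZW itself isn't in the Python standard library); the articles are fabricated, but the repeated headers mimic real news:

```python
# Compare compressing each article separately (the per-article IHAVE case)
# against compressing one long batch: the shared dictionary built up across
# articles makes the batch markedly smaller.
import zlib

articles = [
    ("Path: site!user\nNewsgroups: comp.protocols.nntp\n\nbody %d\n" % i).encode()
    for i in range(50)
]

per_article = sum(len(zlib.compress(a)) for a in articles)
whole_batch = len(zlib.compress(b"".join(articles)))
```

Because news articles share so much header boilerplate, `whole_batch` comes out well under `per_article`, which is the bandwidth win Jim is after on slow lines.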

Article propagation time may decrease for any site using this technique.
	
Jim