schoch@trident.arc.nasa.gov (Steve Schoch) (09/21/90)
We have a problem with C news. When we first installed it, incoming NNTP connections would spool articles into /usr/spool/news/in.coming faster than relaynews could process them. I installed a patch that has rnews run relaynews -r immediately, which solved the problem of the large in.coming backlogs (there are none now), but now I have about 23 nntpds waiting for relaynews -r to complete. The 23 relaynews -r processes we have running are all waiting for the current relaynews to finish with the LOCK file. We are using dbz. Neighbors are complaining that their nntpxmits to us run too slowly. What am I doing wrong?

	Steve
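[The serialization described above suggests the usual workaround: have each nntpd just drop its batch in the spool and return, with a single runner draining the spool under a lock. A minimal sketch, assuming hypothetical paths, a mkdir-based lock, and a RELAYNEWS hook; this is not stock C news code:]

```shell
# newsrun: drain every batch in $SPOOL through relaynews, one process
# at a time.  The mkdir lock is atomic, so concurrent callers return
# at once instead of queueing up the way 23 "relaynews -r"s do.
newsrun() {
    SPOOL=${SPOOL:-/usr/spool/news/in.coming}
    LOCKDIR=${LOCKDIR:-"$SPOOL/.newsrun.lock"}
    RELAYNEWS=${RELAYNEWS:-"relaynews -r"}
    mkdir "$LOCKDIR" 2>/dev/null || return 0    # another drainer is active
    for batch in "$SPOOL"/*
    do
        [ -f "$batch" ] || continue
        $RELAYNEWS <"$batch" && rm -f "$batch"
    done
    rmdir "$LOCKDIR"
}
```

[With this arrangement the transmitting neighbors only ever wait for the spool write, never for relaynews's LOCK file.]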
henry@zoo.toronto.edu (Henry Spencer) (09/23/90)
In article <1990Sep20.212757.12868@news.arc.nasa.gov> schoch@trident.arc.nasa.gov (Steve Schoch) writes:
>The 23 relaynews -r we have running are all waiting for the current relaynews
>to finish with the LOCK file.

The integration of relaynews with NNTP is far from seamless at present. :-)

The underlying problem is that NNTP does not take notice of the fact that transferring and processing articles a batch at a time is much more efficient than doing them one at a time. The UUCP community learned that most of a decade ago, but the Internet community has generally been reluctant to admit that UUCP has anything to teach them. There is work in progress, on various fronts, aimed at doing something about the issue. It's not a solved problem with a canned solution ready to hand.
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (09/23/90)
>The underlying problem is that NNTP does not take notice of the fact that
>transferring and processing articles a batch at a time is much more
>efficient than doing them one at a time.

I've modified a version of nntpxfer so that it batches a number of articles into memory and then feeds them to relaynews. Others have done similar things. It is MUCH more efficient - I'm surprised that the distribution version doesn't do something similar.
-- 
Jon Zeeff (NIC handle JZ)	 zeeff@b-tech.ann-arbor.mi.us
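[The batch format involved is simple: each article in the pile is preceded by a "#! rnews <bytecount>" separator line, so relaynews can swallow many articles in one invocation. A minimal sketch of building such a batch; a plain shell illustration, not Jon's actual code:]

```shell
# makebatch: glue individual article files into one batch, each
# article preceded by the "#! rnews <bytecount>" line that relaynews
# recognizes, so the whole pile goes through in a single run.
makebatch() {
    for art in "$@"
    do
        size=$(($(wc -c <"$art")))   # arithmetic strips wc's padding
        printf '#! rnews %d\n' "$size"
        cat "$art"
    done
}
```

[Feeding the output of makebatch to one relaynews replaces N invocations, and N trips through the LOCK file, with one.]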
brian@ucsd.Edu (Brian Kantor) (09/24/90)
In article <8?P*Z-=@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>I've modified a version of nntpxfer so that it batches a number of articles
>into memory and then feeds them to relaynews. Others have done similar
>things. It is MUCH more efficient - I'm surprised that the distribution
>version doesn't do something similar.

That's because nntpxfer is a hack kluge and was never really supposed to be used in production systems. I can say that: I wrote it.

Batch transmission will be supported in NNTP v2.
	- Brian
jerry@olivey.olivetti.com (Jerry Aguirre) (09/25/90)
In article <1990Sep23.000826.15925@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>The underlying problem is that NNTP does not take notice of the fact that
>transferring and processing articles a batch at a time is much more
>efficient than doing them one at a time. The UUCP community learned that
>most of a decade ago, but the Internet community has generally been
>reluctant to admit that UUCP has anything to teach them.

Henry, I think this is more a case of a square peg and a round hole than of NIH. UUCP is inherently a batched operation in the sense that one submits a job and it is processed sometime later. That is what makes the ihave/sendme protocol so inefficient over UUCP connections. Adding batches, in the sense of multiple articles sent as one file, is a natural optimization with only minor disadvantages. The use of compression makes batching for UUCP even more appealing.

The type of transmission used for NNTP establishes a real-time connection. This creates the potential to virtually eliminate the wasted overhead of sending again an article already on the receiver's system. NNTP lends itself to multiple feeds for a number of reasons, and the number of duplicates grows proportionally. (It is the norm rather than the exception for me to see the same article being offered multiple times within a few seconds.)

With B news there was little advantage to using batches (in the news processing itself). For most IP network connections there is little advantage to using compression. Therefore the advantage of eliminating the extra overhead of the duplicate copies very much outweighed the almost nonexistent advantages of batching. The release of C news may have shifted that balance, but it is hardly fair to criticize NNTP because you changed the rules. NNTP doesn't break when one uses batching; it just loses a couple of its advantages. Some of us just happen to think they are important advantages.
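[The per-article exchange at issue is RFC 977's IHAVE: the sender offers a message-ID, and the receiver answers 335 (send it) or 435 (already have it). A toy sketch of the receiver's decision, using grep on a flat history file where C news would really use a dbz lookup; HISTORY and the function name are illustrative:]

```shell
# ihave_reply: what the receiving end does for each IHAVE offer.
# 335 tells the sender to transmit the article, 435 that we already
# have it (response codes per RFC 977).  The grep stands in for the
# real dbz lookup, and the HISTORY default is an assumption.
ihave_reply() {
    if grep -F -q -- "$1" "${HISTORY:-/usr/lib/news/history}" 2>/dev/null
    then echo "435 $1"    # duplicate: decline
    else echo "335 $1"    # new: please send
    fi
}
```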
The performance advantage of C news seems to rest primarily on a deliberate delaying of the processing and retransmission of news articles. This is at odds with the goal of many NNTP developers, who wanted to reduce the propagation delay of articles. Obviously we are dealing with different design goals here. A leaf UUCP site that is short on CPU cycles is going to have a different set of requirements than an NNTP site with 10 neighbors and a faster CPU. If I were such a leaf site I would have converted to C news long ago. As it is, I am still waiting for that "seam" to become less obvious.

					Jerry Aguirre
henry@zoo.toronto.edu (Henry Spencer) (09/25/90)
In article <49453@olivea.atc.olivetti.com> jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>The type of transmission used for NNTP establishes a real time
>connection...

There seems to be a general illusion that real-time connections are exempt from considerations of efficiency. With the volume of news we currently see, this is not true. Real-time or not, the most efficient way to transfer news is to pump data bytes, in bulk, from one end to the other, without control handshaking or other time-wasting complications interspersed. Rev 2 of NNTP includes a batching protocol for this.

Our reaction to the way a lot of NNTP sites currently do their news transmission is roughly: "Jesus, are they all running on Crays?!?". The waste of resources is mind-boggling. We wish we could afford to squander so many cycles on ruinously inefficient transmission methods; it would make life a lot easier.

>The performance advantage of C news seems to rest primarily on a
>deliberate delaying of the processing and retransmission of news
>articles. This is at odds with the goal of many NNTP developers who
>wanted to reduce the propagation delay of articles...

I confess that we fail to understand why some of the NNTP folks are so obsessed with propagating talk.religion in seconds rather than minutes. However, this need not imply a contradiction with C News's philosophy of processing in bulk for efficiency. You just have to do things more cleverly to combine the two. Work is in progress on this.
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry
jerry@olivey.olivetti.com (Jerry Aguirre) (09/26/90)
In article <1990Sep25.153101.2437@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>There seems to be a general illusion that real-time connections are exempt
>from considerations of efficiency. With the volume of news we currently
>see, this is not true. Real-time or not, the most efficient way to
>transfer news is to pump data bytes, in bulk, from one end to the other,
>without control handshaking or other time-wasting complications interspersed.
>Rev 2 of NNTP includes a batching protocol for this.

Henry, it is a pretty convincing illusion. Certainly the NNTP/network connections transfer a lot more news with less CPU load than UUCP did. The nntpxmit asks if the receiver has a particular message ID, and if the answer is no it sends the article. Granted, there are turnaround delays, but they affect throughput, not system or network load. On most systems the serial input interrupts for each character; the typical network card interrupts once per packet, resulting in an order of magnitude less overhead. My experience with running both UUCP and NNTP bears out this theoretical conclusion. A uucico can be hogging 50% of the system while an nntpd is transferring twice the articles and is not even in the top 10 processes.

Just how do you propose to prevent massive transmission of duplicates if "rnews" squirrels away the articles without updating the history file? I seriously want to know your philosophy on this.

>Our reaction to the way a lot of NNTP sites currently do their news
>transmission is roughly: "Jesus, are they all running on Crays?!?".

I manage a full feed quite nicely, and without a Cray. Even a 0.69-MIPS B news system can handle multiple full NNTP feeds (if it weren't for the damn UUCP connections).

>However, this need not imply a contradiction with C News's philosophy
>of processing in bulk for efficiency. You just have to do things more
>cleverly to combine the two. Work is in progress on this.

Glad to hear it.

					Jerry Aguirre
I.G.Batten@fulcrum.bt.co.uk (Ian G Batten) (09/26/90)
jerry@olivey.olivetti.com (Jerry Aguirre) writes:
> It is a pretty convincing illusion. Certainly the NNTP/network
> connections transfer a lot more news with less CPU load than UUCP
> had.

This is no advert for NNTP, merely a statement that UUCP over serial lines is an I/O bandwidth hog in a way that TCP isn't. My newsfeed is via a 64K leased line, which replaced 2K4 modems and 2K4 X.25. Since I already had UUCP over TCP running here for local purposes, we initially ran our existing 100K 8-bit compressed batches over ``e'' protocol. This screamed, and the inbound uucico consumed almost no resources. I then switched to NNTP for reasons of modernity and suddenly found performance going through the floor, with the nntpd consuming significant resources. I'm essentially a leaf site, so I rarely get an article presented more than once. I now run faster with bizarre tweaks to the NNTP batching, but I often think it would be neat to go back to UUCP over TCP...

ian
henry@zoo.toronto.edu (Henry Spencer) (09/27/90)
In article <49460@olivea.atc.olivetti.com> jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>>There seems to be a general illusion that real-time connections are exempt
>>from considerations of efficiency...
>
>It is a pretty convincing illusion. Certainly the NNTP/network
>connections transfer a lot more news with less CPU load than UUCP
>did...

This is more a function of the nature of the hardware -- typically handling a packet at a time rather than a character at a time -- than of the protocol, I would say. I'm told the costs of NNTP are *not* trivial. And processing articles one at a time is massively inefficient, even if the inefficiency is spread out enough not to be noticeable. Your machine is being gnawed to death by mice rather than trampled by an elephant, but it's still losing just as much blood.

>Just how do you propose to prevent massive transmission of duplicates if
>"rnews" squirrels away the articles without updating the history file?
>I seriously want to know your philosophy on this.

There are several possible tactics. The one I would personally favor is to keep a record of received-but-not-yet-processed articles separate from the history file -- they *are* on disk, so there is no reliability impact from not processing them immediately -- but I haven't had a chance to experiment with this yet.
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday| henry@zoo.toronto.edu utzoo!henry
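[The scheme Henry outlines can be sketched as a duplicate check against both the real history file and a separate pending list that is appended to the moment an article is spooled, so a second offer seconds later is refused even though relaynews hasn't run yet. File names and the helper are invented for illustration; this is not C news code:]

```shell
# have_article: return 0 (have it) if the message-ID appears in the
# history file *or* in the pending list of spooled-but-unprocessed
# articles; otherwise record it in the pending list and return 1.
have_article() {
    hist=${HISTORY:-/usr/lib/news/history}
    pend=${PENDING:-/usr/spool/news/in.coming/.pending}
    grep -F -q -- "$1" "$hist" 2>/dev/null && return 0
    grep -F -q -- "$1" "$pend" 2>/dev/null && return 0
    echo "$1" >>"$pend"    # remember it before relaynews gets there
    return 1
}
```

[When relaynews finally processes a batch and updates the history file proper, the corresponding pending entries can simply be discarded.]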