loverso@Xylogics.COM (John Robert LoVerso) (04/12/90)
In article <G&L$Z=$@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
> >... nntpxfer ...
> >This puppy is slow.  It took about 3 hours to fetch 4K articles, and it
> >
> I have some fixes for these things that will hopefully be included in a
> future nntp release.  Until then, copies are available on request.

After seeing how slow it really was, I just changed it to do just the
NEWNEWS and return a list of message-ids I need.  You can then turn
that into a sendme control message to the other host.  Thus, the
articles will get to your machine using whatever standard xmit channel
you already use (nntpxmit/nntplink/etc).

As an aside, I just found out that this is an easy way to retrieve
articles that the nntp access file won't let you have.  I.e., a
"newnews *" will list message-ids for all articles, even if you are
not allowed to transfer them.  However, that restriction doesn't exist
on sendme control messages.  Of course, this only works against sites
that feed you news to begin with...

John
--
John Robert LoVerso			Xylogics, Inc.		617/272-8140 x284
loverso@Xylogics.COM			Annex Terminal Server Development Group
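[The NEWNEWS-to-sendme conversion described above could be sketched roughly
as follows.  This is an illustration, not LoVerso's actual code; the header
layout follows the usual RFC 1036 control-message conventions, and the
"feedsite" name (and the to.feedsite targeting group) are placeholders for
whatever your feeding neighbor is called.]

```c
/*
 * Sketch only: turn NEWNEWS output -- one message-ID per line on
 * `in` -- into a sendme control article on `out`, ready to pipe
 * into inews.  "site" is the feeding host's news name; posting to
 * the "to.<site>" group is the conventional way to aim a control
 * message at a single neighbor.
 */
#include <stdio.h>

static void emit_sendme(FILE *in, FILE *out, const char *site)
{
	char id[512];

	fprintf(out, "Newsgroups: to.%s\n", site);
	fprintf(out, "Subject: cmsg sendme %s\n", site);
	fprintf(out, "Control: sendme %s\n", site);
	fprintf(out, "\n");		/* blank line ends the headers */
	while (fgets(id, sizeof id, in) != NULL)
		if (id[0] == '<')	/* skip NNTP status and "." lines */
			fputs(id, out);
}
```

[Wired up to stdin/stdout, the output can go straight into inews; the
remote sendme handling then uses whatever transmit channel already exists.]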
urlichs@smurf.sub.org (Matthias Urlichs) (04/13/90)
In news.software.nntp, article <8876@xenna.Xylogics.COM>,
	loverso@Xylogics.COM (John Robert LoVerso) writes:
< In article <G&L$Z=$@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
< > >... nntpxfer ...
< > >This puppy is slow.  It took about 3 hours to fetch 4K articles, and it
< >
< > I have some fixes for these things that will hopefully be included in a
< > future nntp release.  Until then, copies are available on request.
<
< After seeing how slow it really was, I just changed it to do just
< the NEWNEWS and return a list of message-ids I need.  You can then
< turn that into a sendme control message to the other host.  Thus,
< the articles will get to your machine using whatever standard xmit
< channel you already use (nntpxmit/nntplink/etc).

Assuming you already have one.  But once you have the IDs, nntpxfer may
actually be faster than xmit because it needs only one request-reply
interaction instead of two -- assuming you don't have to lower-case the
ID in order to get the article; see below.

There are a whole bunch of things you can do to nntpxfer if you want to
speed things up, and/or just feel inclined to add some features:

- Use alarm()/signal() instead of select() and reading one character
  at a time.
- Use fdopen() and fgets/fputs if your stdio library lets you.
- Ignore 5xx results on ARTICLE requests -- some sites say 5xx when
  you ask for an article which local policy forbids you to get.
- Open two channels concurrently -- one to get the IDs and one to get
  the data.  (This will make nntpxfer faster than nntpxmit on lines
  with large ping times.)
- Use signal() to block the SIGPIPE you get when the forked inews
  aborts before reading the whole article.
- Drop the buggers into files instead of forking every time.
- Ask for the article by its original ID.  If the other side doesn't
  have that, convert the message-ID's post-@ part to lower case (which
  is the primitive version of what RFC 822 says about this topic).
  If that resulted in a change, ask again.  If the article is still
  not present, lower-case the whole message-ID and ask once more,
  again only if you actually changed something.  (The _current_ nntpd
  code suggests that it should be sufficient either to rfc822ize the
  ID or to leave it alone, but this seems not always to work.  Anyone
  know for sure?)
- Batch articles by calling the batching code (../server/batch.c).
  This involves modifying batch.c to (a) optionally read from
  somewhere other than stdin and (b) not say 2xx "Give me the article"
  in case of (a).
- Log via (fake)syslog.
- Make -d an incremental switch instead of a toggle, and make the
  debug-printing logic somewhat more clever.
- Log progress into an almost-temporary file which also serves as a
  lock to make sure that no two nntpxfers concurrently access the
  same site.  Rename this file when done, so that one can see why the
  last xfer failed while watching the current one die.  :-(
- Control all of the above with some new option letters.

I've done most of these -- it's not much work, but my code is really
ugly right now.  (You thought it was ugly before?  Ha!)  It also lacks
a whole lot of error checking and safe termination, and the
aforementioned alarm()-type stuff isn't even tested yet.
Unfortunately I don't have time to prettify it all -- I'll mail my
version to anyone who wants to do that.

Nntpxfer is now reasonably fast; about the only dead time is spent
forking newsrun (C News) or rnews -U (B News).  Besides waiting for
the negative responses to ARTICLE commands, of course.  ;-) :-(

Next project: convincing the NEWNEWS code not to report IDs of expired
articles.  This is harder because it's not my nntpd which has that
problem, but the nntpds of the machines we xfer news from.  Not good.

Aside: Does anyone keep at least two weeks of news online?  I'd like
to have a backup to xfer our news from if our Internet link drops dead
again, as it did last week...
--
Matthias Urlichs
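[The message-ID fallback from the list above might look something like
this.  A sketch only, with hypothetical helper names (rfc822ize,
lcase_all); the real nntpd has its own version of the first step, and the
return value just tells the caller whether a retry is worth making.]

```c
/*
 * Sketch of the two-step message-ID canonicalization: first lower-case
 * only the host part after '@' (the crude RFC 822 reading: domains are
 * case-insensitive, local parts are not), and only as a last resort
 * lower-case the whole ID.  Each helper returns 1 if it changed the
 * string, 0 otherwise, so the caller skips pointless re-asks.
 */
#include <ctype.h>
#include <string.h>

static int rfc822ize(char *id)
{
	char *p = strchr(id, '@');
	int changed = 0;

	if (p == NULL)
		return 0;
	for (p++; *p != '\0'; p++)
		if (isupper((unsigned char)*p)) {
			*p = tolower((unsigned char)*p);
			changed = 1;
		}
	return changed;
}

static int lcase_all(char *id)
{
	int changed = 0;
	char *p;

	for (p = id; *p != '\0'; p++)
		if (isupper((unsigned char)*p)) {
			*p = tolower((unsigned char)*p);
			changed = 1;
		}
	return changed;
}
```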
loverso@Xylogics.COM (John Robert LoVerso) (04/13/90)
I should have been clearer...

NEWNEWS has a security hole in that it can advertise message-ids of
articles that the other end is restricted from getting via the access
file.  LIST does the same thing.  They should both filter their output
based upon restricted groups.

John
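[The filtering fix suggested here amounts to a group-level access check
before NEWNEWS or LIST emits anything.  The sketch below is illustrative
only: the deny-list shape and prefix matching are assumptions, not the
nntpd access-file syntax -- the real server would use its own wildcard
matcher (ngmatch) against the groups the access file denies this client.]

```c
/*
 * Sketch: return 1 if `group` may be advertised to this client,
 * 0 if it falls under a denied hierarchy.  Deny entries are treated
 * as exact names or hierarchy prefixes ("local" also covers
 * "local.general" but not "localhost.news").
 */
#include <string.h>

static int group_allowed(const char *group, const char **deny, int ndeny)
{
	int i;
	size_t n;

	for (i = 0; i < ndeny; i++) {
		n = strlen(deny[i]);
		if (strncmp(group, deny[i], n) == 0 &&
		    (group[n] == '\0' || group[n] == '.'))
			return 0;	/* denied: don't advertise */
	}
	return 1;
}
```

[NEWNEWS would call this per newsgroup before collecting message-ids, and
LIST before printing each active-file line, closing the leak described above.]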