stealth@caen.engin.umich.edu (arrakis) (09/07/90)
When we cut over to NNTP 1.5.10 from 1.5.8, GNUS stopped operating
properly. It gets through the group selection process fine, but when
you start to read a newsgroup, it gets stuck at the "0% of headers
received" point. Changing nntp-maximum-request to a low number, which is
the implied solution to such a problem, doesn't fix it.
We're running GNUS 3.13. We're currently operating by having a 1.5.8
server on a separate port to which the GNUS reader can connect.
Any ideas? I'll post a summary.
--
Michael V. Pelletier | "We live our lives with our hands on the
CAEN UseNet News Administrator | rear-view mirror, striving to get a better
Systems Group Programmer | view of the road behind us. Imagine what's
| possible if we look ahead and steer..."
aglew@crhc.uiuc.edu (Andy Glew) (09/12/90)
GNUS users at UIUC were plagued by the same problem - GNUS 3.13 or 3.12
hanging when our news server updated to nntp 1.5.10. A bit of
experimentation showed that any value of nntp-maximum-request > 2
produced this hang. I set nntp-maximum-request back to 1, and am using
GNUS fine now.
--
Andy Glew, a-glew@uiuc.edu
[get ph nameserver from uxc.cso.uiuc.edu:net/qi]
stealth@caen.engin.umich.edu (Mike Pelletier) (09/14/90)
Add a line "setbuf(stdin, NULL);" at line 255 of server/serve.c
(just before the "for(;;) {" line)
and the problem will go away. This may have some performance
costs, but it's by far the simplest solution that's been sent to
me thus far.
sdk@shadow.twinsun.com (Scott D Kalter) (09/15/90)
So far two solutions have been mentioned to the GNUS vs. NNTP 1.5.10
problem.

1. (setq nntp-maximum-request 1), which makes emacs and the NNTP server
   work in a synchronous lock-step -- emacs makes one request, NNTP
   answers, emacs makes one request, etc. It seems intuitive to allow
   emacs to make several requests without making it wait for NNTP to
   respond.

2. setbuf(stdin,NULL), which solves the problem by forcing NNTP to not
   ignore buffered requests (by eliminating the buffer). This isn't so
   hot a solution since it forces NNTP to do a read() for each and every
   input character!

I have spent some time looking at both ends of this problem. My
current opinion is that NNTP has some problems in trying to use both
select() and buffered I/O through fgets(). After staring at this code
for an hour I realized that there are a couple of things to note:

1. The select is used as nothing more than a timeout mechanism to
   decide when the server has been idle too long and should be shut
   down.

2. There is an evil bug lurking in this code: if a client sends a
   string with no terminating newline and then stops sending anything,
   it will hang the NNTP server, and that select() isn't going to help
   one bit (the server will hang in the fgets()).

Granted, (2) is much less likely than having a well behaved client that
only makes complete requests and then sits idle for two hours (the
standard timeout for select here), indicating the server should just
give up. However, I see three options to solve the problem:

1. Remove the use of select and simply assume that clients will not
   sit around idle for two or more hours.

2. Put in an explicit test (before the select call) to see if there is
   something still in the buffer and don't make the select call if there
   is. This could be done (non-portably) by poking in _iobuf->_cnt
   described in stdio.h or by building one's own buffering scheme
   (portably).

3. Give up on using select and use an alarm instead.

I believe the 3rd option makes the most sense given that we basically
are trying to implement something like a watchdog timer. If the timer
goes off, just shut down this server process (which is what select does
if it times out). Apparently the NNTP author is working on a fix, but
any of the above could be implemented without too much difficulty and
without penalizing all newsreaders with a setbuf(stdin, NULL).

-sdk
pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/18/90)
On 14 Sep 90 17:33:08 GMT, sdk@shadow.twinsun.com (Scott D Kalter) said:

sdk> So far two solutions have been mentioned to the GNUS vs. NNTP 1.5.10
sdk> problem.

[ ... both are bad, because setting nntp-maximum-request to 1 prevents
batching of requests, and thus leads to many small one line IPC
transactions, and unbuffering stdio means that fgets must read from the
socket one char at a time ... ]

sdk> I have spent some time looking at both ends of this problem. My
sdk> current opinion is that NNTP has some problems in trying to use both
sdk> select() and buffered I/O through fgets(). [ ... ]

Correct -- select(2) can only know that there are bytes waiting at the
socket; it does not know about bytes waiting in the stdio buffer.

sdk> However, I see three options to solve the problem:

sdk> 1. Remove the use of select and simply assume that clients will not
sdk> sit around idle for two or more hours.

Best quick solution! Just do not define 'TIMEOUT' in "common/conf.h" --
this is by far the easiest and most efficient solution. It is also
catered for in the configuration options, requires no modifications, and
removes only a very limited use facility, analogous to the c-shell
'autologout'.

sdk> 2. Put in an explicit test (before the select call) to see if there is
sdk> something still in the buffer and don't make the select call if there
sdk> is. [ ... ]

As you say, this is very unpalatable. It requires perforating
abstraction layers, or rewriting parts of stdio functionality, and also
it is very unportable.

sdk> 3. Give up on using select and use an alarm instead.

sdk> I believe the 3rd option makes the most sense given that we basically
sdk> are trying to implement something like a watchdog timer.

Yes, I was going to implement it like that, but there is a comment that
SIGPIPE is explicitly ignored because in any case we get to know about a
severed connection when trying to read. I have also noticed that EINTR
is ignored when reading from the socket.

This leads me to think: we may want to hang up either because the
connection has been dropped, or because the client has been inactive
while keeping the connection alive. I think it is not appropriate to
implement the latter option ('TIMEOUT'); if the client has been
inactive, without dropping the connection, that means that its machine
is still up and it is still "logged in". If the client has to be
auto-logged out, let's leave the task to the client's machine.

sdk> Apparently the NNTP author is working on a fix but any of the above
sdk> could be implemented without too much difficulty and without costing
sdk> all newsreaders with a setbuf(stdin, NULL);.

Well, I have just disabled TIMEOUT. I think that is the best all around
solution. Granted, if you want to implement auto-logout on a live but
inactive socket in the server, alarm(3) is probably the best way.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
leres@ace.ee.lbl.gov (Craig Leres) (09/22/90)
The TIMEOUT code in nntpd used to use ALRM. But the ALONE code also
uses ALRM so you couldn't use TIMEOUT and ALONE. Since I wanted to do
this, I rewrote serve() to use select() and submitted the new code to
Stan.

Another reason for the rewrite is that there are other things I want to
run off a timer. The version of nntpd I'm currently running has the
following timers:

- The standard idle timer; close the connection and exit after
  TIMEOUT seconds of idle time. The reason I do this is to release
  resources that are not in use.

- A cnews batch check timer; launch a partial batch file after
  BATCHCHECK seconds of idle time. This works really well when
  you're being fed by an nntplink site.

- A /etc/nologin check; look for /etc/nologin every LOGINCHECK
  seconds and shut down the connection if it's found. This at least
  gives news readers the option of gracefully handling a shutdown
  of the news server.

These timers are run from a generic interface. It's easy to add more
and there aren't any restrictions on which combinations you can use.

Anyway, back to the problem; if fgets() reads two lines from the remote
side, fgets() returns the first line, buffers the second one and we
deadlock at the next call to select(). The solution I like the best
(and the one I'm currently testing) is to look and see if there are
characters in the stdio buffer:

    #define BUFFERED_DATA(f) ((f)->_cnt > 0)

This probably isn't 100% portable but I believe that it'll work on most
systems that have select(). And I'm sure we can come up with something
equivalent for those other systems.

Craig
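
The BUFFERED_DATA macro depends on a stdio internal (_cnt) that not every C library exposes. One possible equivalent for such systems, along the lines of the "build your own buffering" idea raised earlier in the thread, is sketched below. This is not Craig's patch: struct linebuf, line_pending and next_line are hypothetical names, and the 1024-byte line limit is an arbitrary assumption. The server keeps its own input buffer, so it can tell whether a complete request is already in hand and only calls select() when it is not.

```c
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical private input buffer; zero len before first use. */
struct linebuf {
    int    fd;            /* client descriptor */
    size_t len;           /* bytes currently held in data */
    char   data[1024];    /* assumed maximum request length */
};

/* Nonzero when a complete line is already buffered: the portable
 * counterpart of BUFFERED_DATA(f). */
static int line_pending(const struct linebuf *lb)
{
    return memchr(lb->data, '\n', lb->len) != NULL;
}

/*
 * Copy the next request line (newline included, NUL-terminated) into out.
 * select() is called only when no complete line is buffered.
 * Returns 1 on success, 0 on idle timeout, -1 on EOF, error or overlong line.
 */
int next_line(struct linebuf *lb, char *out, size_t outsz, long timeout_sec)
{
    char *nl;
    size_t n;

    while (!line_pending(lb)) {
        fd_set rfds;
        struct timeval tv;
        ssize_t got;
        int ready;

        if (lb->len == sizeof lb->data)
            return -1;                      /* line too long */

        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        FD_ZERO(&rfds);
        FD_SET(lb->fd, &rfds);
        ready = select(lb->fd + 1, &rfds, NULL, NULL, &tv);
        if (ready == 0)
            return 0;                       /* client idle too long */
        if (ready < 0)
            return -1;

        got = read(lb->fd, lb->data + lb->len, sizeof lb->data - lb->len);
        if (got <= 0)
            return -1;                      /* EOF or read error */
        lb->len += (size_t)got;
    }

    nl = memchr(lb->data, '\n', lb->len);
    n = (size_t)(nl - lb->data) + 1;
    if (n >= outsz)
        return -1;
    memcpy(out, lb->data, n);
    out[n] = '\0';
    memmove(lb->data, lb->data + n, lb->len - n);   /* keep what follows */
    lb->len -= n;
    return 1;
}
```

Since the timeout also applies while waiting for the rest of a partial line, a client that sends an unterminated request and then goes quiet times out instead of hanging the server, at the cost of maintaining a second buffering layer next to stdio.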