[gnu.emacs.gnus] gnus broken!?

cedman@golem.ps.uci.edu (Carl Edman) (12/22/90)

In article <ay0prv.eor@wang.com> lee@wang.com (Lee Story) writes:
   parker@mprgate.mpr.ca (Ross Parker) writes:

   >I have just installed Cnews (PATCHDATE 7-Sep-1990) and nntp version
   >1.5.10 on our news serving system. Until this change, we had been
   >running Bnews 2.11.?. This change has caused gnus to break... everyone
   >using gnus reports that as soon as a newsgroup is selected, gnus
   >will hang. One user has reported that newsgroups with fewer than 4
   >articles in them work fine (this is the number of subject lines that
   >is displayed...)

   >Anyone seen this?

   Yes, sort of.  I built gnus for SCO V.3.2 and emacs 18.55, using the
   "tcp.c" program from the gnus distribution to access the remote nntp
   server.  When I run "tcp" to talk to nntp interactively, it works JUST
   FINE!!  When run from gnus (nntp.el/nntp.elc OR tcp.el/tcp.elc) it
   displays the group list (after a painfully long time), but if a group
   of more than two or three articles is selected, gnus hangs.  I've
   tried reducing the number of requests parameter, as the installation
   documentation suggests (from 400 down to 100, 20, and 10).
   I've tried setting "nntp-buggy-select".  It's time for SERIOUS
   debugging.

   I'd love to see a post if anyone else has solved this.

This should be around my 30th post (not counting a large number of
private mails) on this newsgroup and news.software.nntp that solves
this problem. This has been asked a huge number of times. I understand
that some newsservers throw away messages after 2 or 3 days, but there
has to be a way to prevent the reposting of this solution twice a week.
OK, from now on I will just repost this file every time this
question is asked (updating only the number).

The problem is a bug in the NNTP software revision 1.5.10. It didn't
exist in 1.5.9 and has been promised to be fixed in 1.5.11. Basically
the problem is that NNTP mixes 2 levels of file access. It uses
buffered IO, and at the same time uses select(). Usually this doesn't
cause any trouble as long as the sender of commands doesn't send them
faster than they can be processed, so that the buffer is always empty
when NNTP looks for the next command. But GNUS requires a large number
of commands at a decent speed, so it sends up to 400 commands ahead
before waiting for a reply.

To solve this problem there are several possibilities.

First: If you don't have access to the NNTP server software, then this
is your only possibility. Change nntp-maximum-request to 1, and set
nntp-buggy-select to t in your "~/.emacs" file. To do this, add lines
like this:

(setq nntp-maximum-request 1
      nntp-buggy-select t)

This will work, but makes GNUS very slow. You won't want to read large
newsgroups like this. You probably will stop using GNUS. You may even
stop reading news. (Every reader can judge for himself whether that
would be an advantage :-).

Second: This is the most common and easiest way to fix the problem.
It is the one this NNTP server uses, and it seems to be fairly
widespread. NNTP uses select() to time out connections. If you disable
the timeout feature, this solves the problem. To do this, find where
TIMEOUT is defined in the source, and undefine it. Now there won't be
any timeouts, select() won't be used and GNUS is happy. The
disadvantages of this approach are that connections don't time out
anymore, and that it is slightly kludgy.

Third: You can really fix the problem by rewriting the timeout parts
of NNTP. If anyone has done this, please let me know.
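
One way such a rewrite could look is to stop using stdio for input and
keep a private buffer, calling select() only when that buffer holds no
complete line. The following is a sketch under that assumption. It is
not taken from any NNTP release, and struct line_reader and
read_line_timeout() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical reader state: one descriptor plus our own input buffer. */
struct line_reader {
    int    fd;
    size_t len;              /* bytes currently held in buf */
    char   buf[1024];
};

/*
 * Copy the next line, without its CR-LF, into out.
 * Returns 1 on a line, 0 on timeout, -1 on EOF or error.
 * The timeout applies to each wait for new bytes, not to the whole line.
 */
int read_line_timeout(struct line_reader *r, char *out, size_t outsz,
                      int timeout_sec)
{
    assert(outsz > 0);
    for (;;) {
        fd_set readfds;
        struct timeval tv;
        ssize_t got;
        int rc;
        char *nl = memchr(r->buf, '\n', r->len);

        if (nl != NULL) {
            /* A full line is already buffered: no select() needed. */
            size_t used = (size_t)(nl - r->buf) + 1;
            size_t n = used - 1;

            if (n > 0 && r->buf[n - 1] == '\r')
                n--;
            if (n >= outsz)
                n = outsz - 1;           /* truncate overlong lines */
            memcpy(out, r->buf, n);
            out[n] = '\0';
            memmove(r->buf, r->buf + used, r->len - used);
            r->len -= used;
            return 1;
        }
        if (r->len == sizeof r->buf)
            return -1;                   /* line too long for buffer */

        /* Buffer has no complete line, so now it is safe to wait. */
        FD_ZERO(&readfds);
        FD_SET(r->fd, &readfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        rc = select(r->fd + 1, &readfds, NULL, NULL, &tv);
        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (rc == 0)
            return 0;                    /* timed out */

        got = read(r->fd, r->buf + r->len, sizeof r->buf - r->len);
        if (got <= 0)
            return -1;                   /* EOF or read error */
        r->len += (size_t)got;
    }
}
```

Because select() and read() now look at the same descriptor and
nothing else buffers input behind their back, commands sent ahead by
the client are returned one after another without waiting.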

        Carl Edman


Theoretical Physicist, N.: A physicist whose | Send mail
existence is postulated, to make the numbers |  to
balance but who is never actually observed   | cedman@golem.ps.uci.edu
in the laboratory.                           | edmanc@uciph0.ps.uci.edu

nntp@tmc.edu (12/22/90)

In article <CEDMAN.90Dec21124521@lynx.ps.uci.edu> cedman@golem.ps.uci.edu (Carl Edman) writes:
>The problem is a bug in the NNTP software revision 1.5.10. It didn't
>exist in 1.5.9 and has been promised to be fixed in 1.5.11. Basically
>the problem is that NNTP mixes 2 levels of file access. It uses
>buffered IO, and at the same time uses select(). Usually this doesn't
>cause any trouble as long as the sender of commands doesn't send them
>faster than they can be processed, so that the buffer is always empty
>when NNTP looks for the next command. But GNUS requires a large number
>of commands at a decent speed, so it sends up to 400 commands ahead
>before waiting for a reply.

We are in alpha testing of nntp 1.5.11 now and early results are that
the "gnus" problem is fixed. Expect to see 1.5.11 hit the streets around
February 1, 1991.

It will also fix the "pipe" problem reported by some.

We are also looking into fixing the 4096 articles/group problem that
some people are afraid of running into.

We hope to release a new version of clientlib that will work with TLI
as well.

Please send any questions to "nntp@tmc.edu"


-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: {rutgers,mailrus}!bcm!sob   and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

palkovic@calvin.fnal.gov (John A. Palkovic) (12/22/90)

In article <CEDMAN.90Dec21124521@lynx.ps.uci.edu> 
	cedman@golem.ps.uci.edu (Carl Edman) writes:
>This should be around my 30th post (not counting a large number of
>[...]
>To solve this problem there are several possibilities.
>[...]
>Third: You can really fix the problem by rewriting the timeout parts
>of NNTP. If anyone has done this, please let me know.

Fourth: Get nntp 1.5.9 and install it. Works fine at this site. I
still have a copy if anyone needs it.

-- 
John Palkovic (708) 840-3527	| palkovic@linac.fnal.gov
"A Superconductor generates electricity without resistance when cooled." 
- Chicago Tribune, Oct. 21, 1990, A-18 
-- 
John Palkovic | palkovic@linac.fnal.gov, {smart host}!linac!palkovic
"A Superconductor generates electricity without resistance when cooled." 
- Chicago Tribune, Oct. 21, 1990, A-18 | Home: linac!jpmac!johnny

stealth@engin.umich.edu (Mike Pelletier) (01/03/91)

In article <CEDMAN.90Dec21124521@lynx.ps.uci.edu>
	cedman@golem.ps.uci.edu (Carl Edman) writes:
>Third: You can really fix the problem by rewriting the timeout parts
>of NNTP. If anyone has done this, please let me know.

I just replaced the timeout parts of nntp 1.5.10 with the timeout parts
of 1.5.9.  I have a patch...



*** nntp-1.5.10/server/serve.c.orig	Sun Aug 12 05:33:30 1990
--- nntp-1.5.10/server/serve.c	Fri Sep  7 20:39:42 1990
***************
*** 20,41 ****
  # endif not USG
  #endif
  
- #ifdef TIMEOUT
- /* Not all systems define these */
- #ifndef FD_SETSIZE
- #define FD_SET(n, p)	((p)->fds_bits[0] |= (1<<(n)))
- #define FD_CLR(n, p)	((p)->fds_bits[0] &= ~(1<<(n)))
- #define FD_ISSET(n, p)	((p)->fds_bits[0] & (1<<(n)))
- #define FD_ZERO(p)	((p)->fds_bits[0] = 0)
- #endif
- #endif
- 
  extern	int	ahbs(), group(), help(), ihave();
  extern	int	list(), newgroups(), newnews(), nextlast(), post();
  extern	int	slave(), stat(), xhdr();
  
- extern int errno;
- 
  #ifdef AUTH
  extern	int	doauth();
  #endif AUTH
--- 20,29 ----
***************
*** 107,116 ****
  #ifdef POSTER
  	struct passwd	*pp;
  #endif
! #ifdef TIMEOUT
! 	struct timeval timeout;
! 	fd_set readfds;
! #endif
  #ifdef LOG
  # ifdef USG
  	struct tms	cpu;
--- 95,103 ----
  #ifdef POSTER
  	struct passwd	*pp;
  #endif
! # ifdef TIMEOUT
! 	void		timeout();
! # endif
  #ifdef LOG
  # ifdef USG
  	struct tms	cpu;
***************
*** 249,286 ****
  	 */
  
  #ifdef TIMEOUT
! 	timeout.tv_sec = TIMEOUT;
! 	timeout.tv_usec = 0;
! #endif
! 	for (;;) {
  #ifdef TIMEOUT
! 		/* Do timeout with select() (i.e. the intelligent way) */
! 		FD_ZERO(&readfds);
! 		FD_SET(fileno(stdin), &readfds);
! 		errno = 0;
! 		i = select(fileno(stdin) + 1,
! 		    &readfds, (fd_set*)0, (fd_set*)0, &timeout);
! 		if (i < 0) {
! 			/* "Interrupted system call" isn't a real error */
! 			if (errno == EINTR)
! 				continue;
! 			syslog(LOG_ERR, "%s read select: %m", hostname);
! 			break;
! 		}
! 		if (!FD_ISSET(fileno(stdin), &readfds)) {
! 			printf(
! 		    "%d Timeout after %d seconds, closing connection.\r\n",
! 				ERR_FAULT, TIMEOUT);
! 			(void) fflush(stdout);
  
- #ifdef LOG
- 			syslog(LOG_ERR, "%s timeout", hostname);
- #endif LOG
- 			exit(1);
- 		}
- #endif
- 		if (fgets(line, sizeof(line), stdin) == NULL)
- 			break;
  		/* Strip trailing CR-LF */
  		cp = line + strlen(line) - 1;
  		while (cp >= line && (*cp == '\n' || *cp == '\r'))
--- 236,250 ----
  	 */
  
  #ifdef TIMEOUT
! 	(void) signal(SIGALRM, timeout);
! 	(void) alarm(TIMEOUT);
! #endif TIMEOUT
! 
! 	while (fgets(line, sizeof(line), stdin) != NULL) {
  #ifdef TIMEOUT
! 		(void) alarm(0);
! #endif TIMEOUT
  
  		/* Strip trailing CR-LF */
  		cp = line + strlen(line) - 1;
  		while (cp >= line && (*cp == '\n' || *cp == '\r'))
***************
*** 322,327 ****
--- 286,294 ----
  				(void) fflush(stdout);
  			}
  		}
+ #ifdef TIMEOUT
+ 		(void) alarm(TIMEOUT);
+ #endif TIMEOUT
  	}
  
  	printf("%d %s closing connection.  Goodbye.\r\n",
***************
*** 393,395 ****
--- 360,384 ----
  #endif
  	exit(0);
  }
+ 
+ 
+ #ifdef TIMEOUT
+ /*
+  * No activity for TIMEOUT seconds, so print an error message
+  * and close the connection.
+  */
+ 
+ void
+ timeout()
+ {
+ 	printf("%d Timeout after %d seconds, closing connection.\r\n",
+ 		ERR_FAULT, TIMEOUT);
+ 	(void) fflush(stdout);
+ 
+ #ifdef LOG
+ 	syslog(LOG_ERR, "%s timeout", hostname);
+ #endif LOG
+ 
+ 	exit(1);
+ }
+ #endif TIMEOUT
-----------------------------
This is what I'm running now, and I've had no problems for months.

-- 
	Mike Pelletier - Usenet News Admin & Programmer
"Wind, waves, etc. are breakdowns in the face of the commitment to getting
 from here to there.  But they are the conditions for sailing -- not
 something to be gotten rid of, but something to be danced with."