[news.software.nntp] NNTP's IHAVE gets carried away

merlyn@iwarp.intel.com (Randal Schwartz) (02/27/90)

I just ran across a bug in the internal handshaking between
gethistent() and ihave() in the NNTP server rev 1.5.7, and wonder if
anyone has a clean fix.

If the client says "IHAVE <123@foo.com>", and <123@foo.com> has
expired or been cancelled, the NNTP server gleefully replies with
CONT_XFER (go ahead, make my day...), which the client then interprets
as a request to feed it.  But, as soon as inews gets a hold of it
(well, OK, relaynews in Cnews's case), it gets dropped as a duplicate.

This is because gethistent() has only two things to say: "yes I have
it, and it is in file /usr/spool/news/gnu/emacs/123", or "nope, don't
have it", neither of which is true for an expired/cancelled article.

I suppose a quick fix would be to have gethistent() return "/dev/null"
for an expired article, but that reeeeeeeks of a kludge.  And I don't
know what that would break.

Has anyone else fixed this?  Or am I the only one that is worrying
about hundreds of dups a day from my three NNTP feeds?

Just another confused news admin,
-- 
/=Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095 ==========\
| on contract to Intel's iWarp project, Beaverton, Oregon, USA, Sol III      |
| merlyn@iwarp.intel.com ...!any-MX-mailer-like-uunet!iwarp.intel.com!merlyn |
\=Cute Quote: "Welcome to Portland, Oregon, home of the California Raisins!"=/

pst@ack.Stanford.EDU (Paul Traina) (02/27/90)

The problem is that gethistent really does two related but different functions.

Gethistent is called by ihave() to determine if an article has been received
by the system.

Gethistent is called by openartbyid() to actually give the path of the file.

In the ihave() instance, there is no need to actually dig through the file
to find the damn entry.  I've hacked arround this (it's not the best fix,
but it was simple), by adding a second parameter to gethistent, which is
a flag "returnpath".  I then placed the fragment:

	if (!returnpath)
		return "";

after the "if (content.dptr == NULL)" line.  This will return a valid non-NULL
pointer.  Then add a ",1" to the gethistent call in openartbyid() and a ",0"
in the call in ihave().

Simple, cheap, ugly.  It solves the duplicate message problem *and* it gives
you an extra performance boost on servicing ihave()'s in general.

Now of course, the proper way to do it would be to split the routine up into
int artinhistfile() and char *gethistent().

Regards.

tale@cs.rpi.edu (David C Lawrence) (02/27/90)

> I've hacked arround this (it's not the best fix, but it was simple),
> by adding a second parameter to gethistent, which is a flag "returnpath".

Well, it's essentially the fix used by this; though this still hadn't
made it into the 1.5.7 release a few of us (well, me and John Coolidge
at least.  I have good guesses about a handful of others that have
likely applied it to) have been running with it for several months.

>From news.software.nntp Thu Jul 13 18:04:47 1989
>From: david@elroy.jpl.nasa.gov (David Robinson)
Newsgroups: news.software.nntp
Subject: ihave speedup
Message-ID: <1989Jul13.031603.13790@elroy.jpl.nasa.gov>
Date: 13 Jul 89 03:16:03 GMT
Organization: Image Analysis Systems Grp, JPL
Lines: 128

Below is a noticeable speedup in the processing of IHAVE requests,
especially if you receive many duplicates.

In the function ihave(), gethistent() is called to see if
the proposed article exists if it returns non-NULL a "Got It"
message is sent back, otherwise the article is accepted.  No
where is the value returned (the path of the article) ever used
by ihave().

In gethistent(), a database lookup is performed and the resulting
value is used to seek into the history file and then read the
line.  Since ihave() doesn't really need the path, the extra lseek()
and fgets() (which usually forces a hard disk read) is wasted.

These diffs add a lookup variable as an argument to gethistent,
it it is non-zero only a history lookup is performed and a bogus
non-NULL pointer is returned.

Gethistent should probably be rewritten into two pieces, one part
to get back the database value and the other to actually read the
history file.

I also noticed a rewind() that is done on the history file right
before the fseek() to an absolute offset.  I see no good reason to
do this and it has the side effect of flushing the stdio input
buffer.

	-David Robinson
==================
*** misc.c-orig	Wed Jun 14 19:32:32 1989
--- misc.c	Wed Jul 12 20:09:46 1989
***************
*** 72,77 ****
--- 72,78 ----
   *
   *	Parameters:	"msg_id" is the message ID of the
   *			article, enclosed in <>'s.
+  *			"lookup", only check if article exists
   *
   *	Returns:	A char pointer to a static data area
   *			containing the full pathname of the
***************
*** 91,98 ****
  #endif not DBM
  
  char *
! gethistent(msg_id)
  	char		*msg_id;
  {
  	char		line[MAXBUFLEN];
  	char		*tmp;
--- 92,100 ----
  #endif not DBM
  
  char *
! gethistent(msg_id, lookup)
  	char		*msg_id;
+ 	int		lookup;
  {
  	char		line[MAXBUFLEN];
  	char		*tmp;
***************
*** 167,172 ****
--- 169,181 ----
  	if (content.dptr == NULL)
  		return (NULL);
  
+ 	/*
+ 	 * If we are just checking to see if it exists return a non-NULL
+ 	 * result
+ 	 */
+ 	if (lookup)
+ 		return ((char *)1);
+ 
  	if (hfp == NULL) {
  		hfp = fopen(historyfile, "r");
  		if (hfp == NULL) {
***************
*** 177,182 ****
--- 186,192 ----
  			return (NULL);
  		}
  	} else {
+ /* Why do this if we are going to do an absolute fseek below? XXX */
  		rewind(hfp);
  	}
  
***************
*** 241,247 ****
  {
  	char	*path;
  
! 	path = gethistent(msg_id);
  	if (path != NULL)
  		return (fopen(path, "r"));
  	else
--- 251,257 ----
  {
  	char	*path;
  
! 	path = gethistent(msg_id, 0);
  	if (path != NULL)
  		return (fopen(path, "r"));
  	else
*** ihave.c-orig	Wed Jun 14 19:32:32 1989
--- ihave.c	Mon Jul 10 20:00:46 1989
***************
*** 30,36 ****
  		return;
  	}
  
! 	cp = gethistent(argv[1]);
  	if (cp != NULL) {
  		printf("%d Got it.\r\n", ERR_GOTIT);
  		(void) fflush(stdout);
--- 30,36 ----
  		return;
  	}
  
! 	cp = gethistent(argv[1], 1);
  	if (cp != NULL) {
  		printf("%d Got it.\r\n", ERR_GOTIT);
  		(void) fflush(stdout);
-- 
	David Robinson		elroy!david@csvax.caltech.edu     ARPA
				david@elroy.jpl.nasa.gov	  ARPA
				{cit-vax,ames}!elroy!david	  UUCP
Disclaimer: No one listens to me anyway!