[net.news] trying to locate duplicate articles

smk@linus.UUCP (Steven M. Kramer) (06/21/83)

	The following are two articles we received.  The first
appeared via seismo!harpo!decvax!genrad and the second thru
seismo!philabs.  Can anyone narrow down the problem now?  Which sites
look suspect?

From solomon@uwvax.UUCP Sat Jun 18 15:17:22 1983
Relay-Version: version B 2.10 5/3/83; site linus.UUCP
Path: linus!genrad!decvax!harpo!seismo!uwvax!solomon
From: solomon@uwvax.UUCP
Newsgroups: net.news
Subject: Re: A gripe about 2.10 readnews - (nf)
Message-ID: <222@crystal.UUCP>
Date: Sat, 18-Jun-83 15:17:22 EDT
Article-I.D.: crystal.222
Posted: Sat Jun 18 15:17:22 1983
Date-Received: Sun, 19-Jun-83 07:14:19 EDT
Lines: 72

Yep, if PAGER is /usr/ucb/more, having it read its input from the news
article itself rather than a pipe is a real win.  It's even better if
	...

From solomon@uwvax.UUCP Sat Jun 18 15:17:22 1983
Relay-Version: version B 2.10 5/3/83; site linus.UUCP
Posting-Version: version B 2.10 5/3/83; site crystal.ARPA
Path: linus!philabs!seismo!uwvax!solomon
From: solomon@uwvax.UUCP
Newsgroups: net.news
Subject: Re: A gripe about 2.10 readnews - (nf)
Message-ID: <222@crystal.ARPA>
Date: Sat, 18-Jun-83 15:17:22 EDT
Article-I.D.: crystal.222
Posted: Sat Jun 18 15:17:22 1983
Date-Received: Sun, 19-Jun-83 15:30:42 EDT
References: <2248@uiucdcs.UUCP>
Organization: U of Wisconsin CS Dept
Lines: 72

Yep, if PAGER is /usr/ucb/more, having it read its input from the news
article itself rather than a pipe is a real win.  It's even better if
	...
-- 
--steve kramer
	{allegra,genrad,ihnp4,utzoo,philabs,uw-beaver}!linus!smk	(UUCP)
	linus!smk@mitre-bedford						(ARPA)

jim@uw-beaver.UUCP (06/25/83)

This dup was caused by someone dropping the message-id.  Crystal is
using ".ARPA" as their domain, so any time the message-id is dropped,
you get a dup.

I know the article was ok when it left seismo, so I would guess the
problem is at philabs.  I also got a duplicate here, and the message-id
in this case was definitely dropped by microsoft.

I'm afraid we'll have to put up with dups until sites like these get
around to updating their software, or until we decide to drop
message-ids and go back to article-ids.
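
A minimal Python sketch of why a dropped message-id produces a dup.  It
assumes that a site which loses the Message-ID rebuilds one from the
Article-I.D. and appends a default ".UUCP" domain; that matches the two
headers quoted at the top of the thread, but the real code path is not
shown here.  The names rebuild_message_id, is_duplicate and history are
hypothetical, not taken from the news software.

```python
def rebuild_message_id(article_id, default_domain=".UUCP"):
    """Rebuild a Message-ID from an old-style 'site.number' Article-I.D.

    Assumption: the rebuilding site knows nothing about the poster's
    real domain, so it falls back to default_domain.
    """
    site, _, number = article_id.rpartition(".")
    return "<%s@%s%s>" % (number, site, default_domain)


def is_duplicate(message_id, history):
    """Return True if this ID was already seen; otherwise record it."""
    if message_id in history:
        return True
    history.add(message_id)
    return False


history = set()
original = "<222@crystal.ARPA>"               # ID as posted by crystal
rebuilt = rebuild_message_id("crystal.222")   # ID after being dropped and rebuilt

first = is_duplicate(original, history)
second = is_duplicate(rebuilt, history)
print(rebuilt, first, second)
```

Because the rebuilt ID differs from the posted one, neither copy is
rejected and both end up in the spool.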

chris@grkermit.UUCP (Chris Hibbert) (06/27/83)

... And uw-beaver!jim's article got here twice, with different Message-ID's.
Both copies had "Article-I.D.: uw-beave.679", but the Message-ID's
differed:  one was "Message-ID: <679@uw-beaver>", and the other was
"<679@uw-beave.UUCP>".  The second was clearly reconstructed from the
article-ID, which had gotten truncated.  The paths they used were:

"grkermit!genrad!linus!wivax!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim"

for <679@uw-beaver> and 

"grkermit!genrad!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim" for
<679@uw-beave.UUCP>

Maybe decvax<==>genrad is at fault?  Genrad did recently change over to
news 2.10, and I don't think decvax has.  
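
One way to see where the two copies parted company is to compare the
Path lines mechanically.  The Python sketch below is only an
illustration of that comparison; split_paths is a hypothetical helper,
and it shows where the routes differ, not which site is to blame.

```python
def split_paths(path_a, path_b):
    """Split two bang paths into shared head, differing middles, shared tail.

    A Path reads from the receiving site (left) back to the poster (right),
    so the shared tail is the route both copies took before they diverged.
    """
    a = path_a.split("!")
    b = path_b.split("!")
    shortest = min(len(a), len(b))

    tail = 0
    while tail < shortest and a[-1 - tail] == b[-1 - tail]:
        tail += 1

    head = 0
    while head < shortest - tail and a[head] == b[head]:
        head += 1

    return (a[:head],
            a[head:len(a) - tail],
            b[head:len(b) - tail],
            a[len(a) - tail:])


good = "grkermit!genrad!linus!wivax!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim"
bad = "grkermit!genrad!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim"

shared_head, middle_good, middle_bad, shared_tail = split_paths(good, bad)
print("last shared hop from the poster:", shared_tail[0])
print("extra hops on the intact copy:  ", middle_good)
print("extra hops on the damaged copy: ", middle_bad)
print("first shared hop on arrival:    ", shared_head[-1])
```

Both routes agree up to decvax and rejoin at genrad; the intact copy
went by way of wivax and linus, the damaged one went directly.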

jim@mcvax.UUCP (06/28/83)

Our only news-feed from "Over There" is through philabs, and we
are getting a lot of duplicate articles. I have had a prod around,
and it looks like genrad<=>decvax to me too:

diff 454 457
	2,3c2
	< Posting-Version: version B 2.10 5/3/83; site uw-beaver
	< Path: mcvax!philabs!linus!wivax!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim
	---
	> Path: mcvax!philabs!linus!genrad!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim
	7c6
	< Message-ID: <679@uw-beaver>
	---
	> Message-ID: <679@uw-beave.UUCP>
	11,13c10
	< Date-Received: Mon, 27-Jun-83 23:37:40 (DT)
	< References: <26911@linus.UUCP>
	< Organization: U of Washington Computer Science
	---
	> Date-Received: Mon, 27-Jun-83 23:50:16 (DT)

Other duplicate messages that we receive always have genrad in their
paths, but others have decvax (without genrad) and are not duplicated.

Note that the Posting-Version, References and Organization fields
are also gone. We had a similar problem when we changed to 2.10, as
a couple of sites we talk to "Over Here" were running VERY old 2.9
versions, which didn't understand all the fields (like they threw the
Path field away and used the From field, so we got ALL the news back!).
They also stripped the above fields out.
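
The same comparison can be done without diff.  The Python sketch below
uses only the header lines visible in the diff above (the rest of each
header is not reproduced in this article, so it is left out);
parse_headers and compare are hypothetical helper names.

```python
def parse_headers(text):
    """Turn 'Name: value' lines into a dict keyed by field name."""
    headers = {}
    for line in text.strip().splitlines():
        name, _, value = line.partition(": ")
        headers[name] = value
    return headers


def compare(full, stripped):
    """Return (fields missing from stripped, fields whose values differ)."""
    dropped = sorted(set(full) - set(stripped))
    changed = sorted(k for k in full if k in stripped and full[k] != stripped[k])
    return dropped, changed


copy_454 = parse_headers("""
Posting-Version: version B 2.10 5/3/83; site uw-beaver
Path: mcvax!philabs!linus!wivax!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim
Message-ID: <679@uw-beaver>
Date-Received: Mon, 27-Jun-83 23:37:40 (DT)
References: <26911@linus.UUCP>
Organization: U of Washington Computer Science
""")

copy_457 = parse_headers("""
Path: mcvax!philabs!linus!genrad!decvax!harpo!floyd!vax135!cornell!uw-beaver!jim
Message-ID: <679@uw-beave.UUCP>
Date-Received: Mon, 27-Jun-83 23:50:16 (DT)
""")

dropped, changed = compare(copy_454, copy_457)
print("dropped:", dropped)
print("changed:", changed)
```

A copy that has lost Posting-Version, References and Organization all
at once has probably passed through something that does not understand
the newer header fields.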

Does that help?

Jim McKie		....decvax!mcvax!jim
			...philabs!mcvax!jim

smk@linus.UUCP (Steven M. Kramer) (06/30/83)

Has anyone considered that there may still be a bug in the news code?
It looks like every site identified along the routes (except decvax, but
I'm not accusing them) has checked out its code.  The information we
have to go on is so fragile that it's hard to debug.
-- 
--steve kramer
	{allegra,genrad,ihnp4,utzoo,philabs,uw-beaver}!linus!smk	(UUCP)
	linus!smk@mitre-bedford						(ARPA)

mark@cbosgd.UUCP (07/01/83)

One possible cause of a dropped message id is if a site sends
news to another site in A format.  I just discovered that decvax,
while running 2.10 and having mostly 2.10 neighbors, is sending
news to several sites in A format.  A format will lose much
information, including the message ID.

Could I ask news admins to double check their sys files to make
sure that A format isn't being used, unless their neighbor is
really running A news?  (A+ understands B format.)  I don't know
of many sites on the net still running A (microsoft is the only
one I can think of, and I'm not sure about them), so A's are
unlikely to be needed anymore.
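
A rough Python sketch of such a check.  It ASSUMES that each sys entry
is a single line of the form "site:subscriptions:flags:command" and
that an "A" in the flags field selects A format; check both against
your own documentation before trusting the result.  The site names in
SAMPLE are made up, and continuation lines are not handled.

```python
def sites_fed_a_format(sys_text):
    """List the sites whose flags field contains 'A' (assumed: A format)."""
    flagged = []
    for line in sys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blanks and comments
        fields = line.split(":", 3)       # site, subscriptions, flags, command
        flags = fields[2] if len(fields) > 2 else ""
        if "A" in flags:
            flagged.append(fields[0])
    return flagged


SAMPLE = """\
# hypothetical sys entries, for illustration only
sitea:net.all:A:uux - -r sitea!rnews
siteb:net.all:B:uux - -r siteb!rnews
sitec:net.all::uux - -r sitec!rnews
"""

flagged = sites_fed_a_format(SAMPLE)
print(flagged)
```

Any site the check reports is worth a second look, unless that
neighbor really is still running A news.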

	Mark Horton

rob@genrad.UUCP (Rob Wood) (07/02/83)

There have been a number of people suggesting that genrad is at the
root of the multiple copies.  This is not correct.  I appreciate
everyone trying to find out what is happening, but while my site
does pass both articles, it is because someone has already truncated
the r from uw-beaver in one of the articles, so our news thinks
that they are different articles.  We are running 2.10.  We have
exactly the same software as most of the New England sites.  Our
link to decvax is a 2.10 link.  I hope that someone can find the
problem, as about 50% of our articles received are duplicates and
thus our spool directory is overutilized.

	Rob Wood	(decvax!genrad!rob)

jim@uw-beaver.UUCP (07/06/83)

I got a call this morning from Dan at philabs who assures me that he
isn't dropping the message-id.  He is running news 2.10 with the
header.c fixes.  I'm out of ideas now, because I was pretty sure that
all the other sites in that path were OK.

My apologies for falsely accusing philabs.  They apparently take at
least as much pride as I do in running bug-free software, and were
justifiably annoyed at being publicly maligned.