[news.software.b] C news and bogus distributions

pst@ack.Stanford.EDU (Paul Traina) (10/11/90)

[Henry/Geoff: I tried to send mail to you about this first, but must have
 had the wrong address for comments, so I decided to open up to the world.]

Some programs, like Pnews in rn, insist on generating articles with bogus
distribution lines.  Since C news does the right thing by separating out
distributions from newsgroups in the sys file, I've run into a slight problem.

My sys entry might look like:

kink.com:alt.sex.bondage/ca,usa,na,world:l:

Now the problem is that if an article comes in with:

Newsgroups: alt.sex.bondage
Distribution: alt

C news won't send the message along.

For a while, I was doing negative distributions along the lines of:

kink.com:alt.sex.bondage/all,!local,!csd,!sdc,!su,!ba:l:

But this is bogus (and if you think about it, really the wrong way to do
it).
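
To make the two sys entries above concrete, here is a rough sketch in Python
(not the C of relaynews) of how a feed decision that requires both a newsgroup
match and a distribution match might behave.  The function names and the
simplified matching rules (plain prefix match, "all" as a wildcard, any
negated hit vetoes) are assumptions made for illustration; they are not taken
from the C news source, and the real sys file syntax has more features than
this handles.

```python
def parse_sys_line(line):
    """Split 'site:groups/dists:flags:cmd' into its parts (simplified)."""
    site, subs, flags, cmd = line.split(":", 3)
    groups, _, dists = subs.partition("/")
    # Assumption: no distribution list means every distribution is accepted.
    dist_list = dists.split(",") if dists else ["all"]
    return site, groups.split(","), dist_list, flags, cmd


def matches(patterns, name):
    """True if name hits a positive pattern and no negated ('!') pattern."""
    def hit(pat):
        return pat == "all" or name == pat or name.startswith(pat + ".")

    ok = False
    for pat in patterns:
        if pat.startswith("!"):
            if hit(pat[1:]):
                return False      # a negated pattern vetoes the name
        elif hit(pat):
            ok = True
    return ok


def would_send(sys_line, newsgroups, distribution="world"):
    """An article goes out only if a newsgroup AND a distribution match."""
    _, groups, dists, _, _ = parse_sys_line(sys_line)
    group_ok = any(matches(groups, g) for g in newsgroups.split(","))
    dist_ok = any(matches(dists, d) for d in distribution.split(","))
    return group_ok and dist_ok


if __name__ == "__main__":
    positive = "kink.com:alt.sex.bondage/ca,usa,na,world:l:"
    negative = "kink.com:alt.sex.bondage/all,!local,!csd,!sdc,!su,!ba:l:"
    print(would_send(positive, "alt.sex.bondage", "alt"))   # False
    print(would_send(negative, "alt.sex.bondage", "alt"))   # True
    print(would_send(negative, "alt.sex.bondage", "ba"))    # False
```

With the positive list, "Distribution: alt" matches nothing and the article
stays put; with the negative list it passes, because only the named local
distributions are excluded.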

I was going to patch oktransmit() in relaynews so that if the distribution
was singular (ba as opposed to ba,ca) and was the same as the top level
of the newsgroup hierarchy, it would consider the distribution to be
DEFDIST (in this case world).


However, I find this to be an ugly solution, because I was considering creating
two distributions for our university groups.  We would end up with a newsgroup:

	su.sex

and two distribution levels: su and su-export.  Only messages with distribution
su-export get sent off campus.  With the patch, however, "su" gets mapped to
world, and distribution "su" messages get sent off campus.
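
The proposed rule and its failure mode can be sketched in a few lines of
Python.  This is only an illustration of the idea described above, not the
actual oktransmit() code; the function name effective_distribution is made up,
and DEFDIST is assumed to be "world" as stated in the text.

```python
DEFDIST = "world"  # assumed default distribution, per the text above


def effective_distribution(newsgroups, distribution):
    """Map a lone distribution equal to a top-level hierarchy to DEFDIST."""
    dists = [d.strip() for d in distribution.split(",") if d.strip()]
    # Top-level hierarchy of each newsgroup, e.g. "alt" for "alt.sex.bondage".
    tops = {g.strip().split(".")[0] for g in newsgroups.split(",") if g.strip()}
    if len(dists) == 1 and dists[0] in tops:
        return DEFDIST
    return ",".join(dists) if dists else DEFDIST


if __name__ == "__main__":
    print(effective_distribution("alt.sex.bondage", "alt"))  # world
    print(effective_distribution("ba.singles", "ba,ca"))     # ba,ca
    print(effective_distribution("su.sex", "su-export"))     # su-export
    print(effective_distribution("su.sex", "su"))            # world
```

The last call is the problem case: a deliberately campus-only "su" article is
indistinguishable from a bogus "alt" one, so it gets promoted to world.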

However, we've got two cases -- one where there are legitimate distributions
with similar names (e.g. ba, ca, su) and then "national" groups with fake
distributions (such as comp, fj, bit)...  I could read in a distribution
file when relaynews starts and map all invalid distributions to world...
	sigh.
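
For what it's worth, the distribution-file idea might look roughly like the
Python sketch below.  The file format (one distribution name per line, an
optional description after it, "#" comments) and both function names are
assumptions invented for this example; nothing here is an existing C news
file or interface.  Unknown names are replaced one by one with the default,
which is only one possible reading of "map all invalid distributions to world".

```python
def load_valid_distributions(lines):
    """Collect distribution names from lines of a (hypothetical) list file."""
    valid = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if line:
            valid.add(line.split()[0])          # first word is the name
    return valid


def normalize_distribution(distribution, valid, default="world"):
    """Replace every unknown distribution with the default, keeping order."""
    dists = [d.strip() for d in distribution.split(",") if d.strip()]
    if not dists:
        return default
    out = []
    for d in dists:
        d = d if d in valid else default
        if d not in out:                        # avoid "world,world"
            out.append(d)
    return ",".join(out)


if __name__ == "__main__":
    valid = load_valid_distributions([
        "# site-maintained list of real distributions",
        "world",
        "na   North America",
        "ca",
        "ba",
        "su",
        "su-export",
    ])
    print(normalize_distribution("comp", valid))     # world
    print(normalize_distribution("su", valid))       # su
    print(normalize_distribution("comp,ba", valid))  # world,ba
```

Unlike the top-level-hierarchy test, this keeps "su" local, at the price of
maintaining the list by hand.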

Mapping to "world" seems to do the right thing in 99% of the cases,
but I find it a losing situation.  Has anyone else thought about this?

(p.s. Would people consider it anti-social if we started re-writing headers
      so the message actually gets a legitimate distribution?)
--
At AIR,  if we can't fix something, it isn't broken.
			(ruthlessly stolen from a lab manager at HP via
			 rec.humor.funny)

henry@zoo.toronto.edu (Henry Spencer) (10/12/90)

In article <pst.655622474@ack.Stanford.EDU> pst@ack.Stanford.EDU (Paul Traina) writes:
>For a while, I was doing negative distributions along the lines of:
>
>kink.com:alt.sex.bondage/all,!local,!csd,!sdc,!su,!ba:l:
>
>But this is bogus (and if you think about it, really the wrong way to do
>it).

Actually, we tend to think that negative distributions end up being the
simplest approach.  However, if you've got a lot of local distributions
and a lot of connections that cross distribution boundaries, that does
make life complicated.

We've generally got a low opinion of people who generate (or whose software
generates) bogus distributions, and can't get too excited about maximizing
propagation of such articles.

>(p.s. Would people consider it anti-social if we started re-writing headers
>      so the message actually gets a legitimate distribution?)

Any header rewriting is a sin.  However, this strikes me as a relatively
minor one, provided you're sure you know what the legitimate distributions
are.
-- 
"...the i860 is a wonderful source     | Henry Spencer at U of Toronto Zoology
of thesis topics."    --Preston Briggs |  henry@zoo.toronto.edu   utzoo!henry

markw@gvlf1-c.gvl.unisys.com (Mark H. Weber) (10/22/90)

In article <1990Oct11.210812.19392@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <pst.655622474@ack.Stanford.EDU> pst@ack.Stanford.EDU (Paul Traina) writes:
>>For a while, I was doing negative distributions along the lines of:
>>
>>kink.com:alt.sex.bondage/all,!local,!csd,!sdc,!su,!ba:l:
>>
>>But this is bogus...
>
>We've generally got a low opinion of people who generate (or whose software
>generates) bogus distributions, and can't get too excited about maximizing
>propagation of such articles.
>

Rats! I thought I had distributions figured out, and that entries like 
"comp" and "alt" were annoying, but not technically bogus. Here's what I
based my understanding on (extracted from an article by David Lawrence):

|
|> I have a question on the proper entries for the Distribution header. 
|
|Ok.  First there should be a common definition for what the
|Distribution header means.  This is provided by RFC 1036.
|
|RFC> 2.2.7.  Distribution
|RFC>  
|RFC>     This line is used to alter the distribution scope of the message.
|RFC>     It is a comma separated list similar to the "Newsgroups" line.  User
|RFC>     subscriptions are still controlled by "Newsgroups", but the message
|RFC>     is sent to all systems subscribing to the newsgroups on the
|RFC>     "Distribution" line in addition to the "Newsgroups" line.  For the
|RFC>     message to be transmitted, the receiving site must normally receive
|RFC>     one of the specified newsgroups AND must receive one of the
|RFC>     specified distributions.  Thus, a message concerning a car for sale
|RFC>     in New Jersey might have headers including:
|RFC>  
|RFC>                    Newsgroups: rec.auto,misc.forsale
|RFC>                    Distribution: nj,ny
|RFC>  
|RFC>     so that it would only go to persons subscribing to rec.auto or misc.
|RFC>     for sale within New Jersey or New York.  The intent of this header
|RFC>     is to restrict the distribution of a newsgroup further, not to
|RFC>     increase it.  A local newsgroup, such as nj.crazy-eddie, will
|RFC>     probably not be propagated by hosts outside New Jersey that do not
|RFC>     show such a newsgroup as valid.  A follow-up message should default
|RFC>     to the same "Distribution" line as the original message, but the
|RFC>     user can change it to a more limited one, or escalate the
|RFC>     distribution if it was originally restricted and a more widely
|RFC>     distributed reply is appropriate.
| 
|> As I understand it, the Distribution header is to limit the distribution
|> of an article to some logical or geographically related set of machines.
|
|Right.  But note especially the words, 'sent to all systems subscribing
|to the newsgroups on the "Distribution" line'.
|
|> Are Distribution: entries of comp, rec, sci etc bogus?
|
|No.
|

It appears to me that the source of the confusion regarding the use of this
header is the very definition of the header itself. Does anyone know what
intent the original framer(s) of RFC 1036 had when they introduced the 
concept of newsgroup hierarchy names on the "Distribution: " line? 
Was this done simply because the format of the B news sys file was considered
to be frozen, and the newsgroups and distributions had to share the same field?

Now that C news is catching on, with its modified sys file format, is there
any effort underway to revise RFC 1036 to separate the newsgroup and 
distribution definitions? Anyone who is working on this, please contact me
via email.

A truly functional system of distributions for netnews could greatly reduce
the volume of world-wide postings. Many question-and-answer type newsgroups
could be successful as limited distribution groups. Why does a question need
to be posted to the entire world, when there is probably someone nearby who
has the answer? Some help from newsreader software would be needed for this
to be effective, also. If I could set my newsreader to only show me articles
that were posted in a specific distribution, and post by default to that
same distribution, then I could easily set the "focus" of my newsviewer
to be as narrow or as wide as I wanted. As a site administrator, I could
set my default distribution for new users to be fairly local, and then
let them discover the larger world for themselves. Current newsreader 
software does not appear to have this capacity (please correct me if 
I am wrong), and even if it did, the lack of a functional distribution 
mechanism would hamper its effectiveness.
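
As a rough illustration of the "focus" idea, here is a small Python sketch.
It is purely hypothetical: articles are modeled as plain dicts of headers,
the function names are invented, and an article without a Distribution header
is assumed to count as "world".  No existing newsreader is being described.

```python
def in_focus(article, focus):
    """True if any of the article's distributions is in the reader's focus."""
    header = article.get("Distribution", "world")   # assumed default
    dists = {d.strip() for d in header.split(",") if d.strip()}
    return bool(dists & set(focus))


def visible(articles, focus):
    """Keep only the articles whose distribution falls inside the focus."""
    return [a for a in articles if in_focus(a, focus)]


def new_article_headers(newsgroups, default_distribution):
    """Headers for a new posting, defaulting to the reader's own focus."""
    return {"Newsgroups": newsgroups, "Distribution": default_distribution}


if __name__ == "__main__":
    articles = [
        {"Newsgroups": "comp.lang.c", "Distribution": "pa"},
        {"Newsgroups": "comp.lang.c"},
        {"Newsgroups": "comp.lang.c", "Distribution": "ba,ca"},
    ]
    print(len(visible(articles, {"pa"})))            # 1
    print(len(visible(articles, {"pa", "world"})))   # 2
    print(new_article_headers("comp.lang.c", "pa"))
```

Widening or narrowing the focus set is then all it takes to change how much
of the net a reader sees.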

Yes, there are many "broken" sites out there, and B news will be with us
for some time to come. But before we can eliminate these "broken" headers,
we need a clear idea of what we would change them to. These changes can
then be applied such that any re-writing of the Distribution: header lines
can be done in a uniform fashion. 

-- 
  Mark H. Weber                   | Internet: markw@GVL.Unisys.COM  
  Unisys - Great Valley Labs      |     UUCP: ...!uunet!cbmvax!gvlv2!markw
  Paoli, PA  USA  (215) 648-7111  |           ...!psuvax1!burdvax!gvlv2!markw

pst@laura-palmer.Stanford.EDU (Paul Traina) (10/22/90)

markw@gvlf1-c.gvl.unisys.com (Mark H. Weber) writes:
>It appears to me that the source of the confusion regarding the use of this
>header is the very definition of the header itself. Does anyone know what
>intent the original framer(s) of RFC 1036 had when they introduced the 
>concept of newsgroup hierarchy names on the "Distribution: " line? 
>Was this done simply because the format of the B news sys file was considered
>to be frozen, and the newsgroups and distributions had to share the same field?

The whole thing started when regional newsgroups such as "ba" or "ca" started
forming.  The authors of a few "post" programs took to a simple way of
producing distribution lines.  (Set the dist field the same as the top level
newsgroup field).  Unfortunately,  what they should have done was NOT create
a distribution line at all, or left it blank for the user to fill in.

It's (semi) reasonable to put a Distribution: ba line on a message posted to
ba.singles,  but it isn't necessary, as the distribution should follow the
regional newsgroup, so Distribution: world should have the same effect.

However, I should be able to further limit the distribution of a regional
group (such as "Distribution: su" posted to ba.singles).

My personal feeling is the RFC is fuzzy on this, and times have changed since
this problem was last looked into.  I don't want to get into a flame war
about rewriting headers (which I do think is wrong;  but sometimes terrorist
tactics do have good effect).  My approach is to modify C news so that
if a message is posted to newsgroup "top.group.name" and the distribution
ONLY contains "top" (as opposed to "top,canada,su,whatever") I'll just refuse
to acknowledge that the distribution line exists.

It catches all but the most bizarre convolutions of headers and distributions
(see my message about "su" and "su-export").

Frankly, I'm just pissed that the whole thing started in the first place.
Distribution was always an optional field that NEVER should have been
defaulted by any posting software.