[news.misc] Take a sniff of

jh@mancol.UUCP (John Hanley) (03/27/88)

In article <1509@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>The net is for the *readers* and not the posters.  If a software check makes
>posting 100 times harder, and it helps eliminate 10% of the unwanted articles,
>that's worth it.

Hear, hear!

And now for something completely different... (others have mentioned keywords)

Why must any keyword-based system be doomed to inflexibility?
To be effective, a system with lots of rules like Gnews would need to be able
to automatically pull new rules off the net.  Clearly, we can't make this too
easy, or the first time Fred Twit pissed off Joe Blow, Joe would instruct
Gnews's all over the world to abuse people posting articles on Mr. Twit.
We _could_, however, do this sort of thing by consensus:  if you feel an
article doesn't belong in a group, and that this can be distinguished by
some combination of keywords and Boolean operators, you could hit ^F (for
Flame about article content), but then rather than entering Pnews to flame
to the net, you would be prompted for keywords.  A keyword program prompts
you for the offending keywords, verifies that the specified operators and
keywords do in fact produce a match on the current article, maybe lets you
review other articles in the news group that also match so you can see if they
fit the same mold of 'inappropriate' articles, then posts your message to
moderated group news.group.foo.keywords.  If news.group.foo is moderated,
then its moderator handles the keywords sub-group, otherwise some small
number of volunteer moderators handle the vast wasteland of unmoderated groups.
Moderating a keywords newsgroup is actually not time-consuming because the
postings are automatically processed by a vote-tallying program, which only
needs attention when 100 or 1000 votes or whatever are collected.
Then the moderator looks at the proposed 'naughty' keywords, decides if they
are reasonable, based on his knowledge as a reader of news.group.foo, and maybe
sends out a control message of some sort telling Pnews's the world over to
add a rule to their rulesets for foo encouraging people to think before
posting if their article matches the keywords.  This technique would not,
for example, screen out non-source/binaries from binaries groups (this is
a separate and probably easier problem -- non-English text is pretty easy to
automatically detect), but it _would_ have had some effect on the prolonged
series of "please don't post noise" noise messages seen recently in
comp.binaries.ibm.pc.
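
The "verify that the operators and keywords match the current article, then
review other matching articles" step could be sketched roughly as below.
This is only an illustration in Python: the nested-tuple rule format, the
crude tokenizer, and the names words_of, matches and matching_articles are
all assumptions made for the sketch, not part of Gnews, Pnews or any
existing news software.

```python
import re

def words_of(article_text):
    """Lowercased set of words in an article (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9']+", article_text.lower()))

def matches(rule, words):
    """Evaluate a nested-tuple rule against a set of words.

    Rule forms (hypothetical, chosen only for this sketch):
      ("kw", "noise")        true if the word appears
      ("and", r1, r2, ...)   true if every sub-rule matches
      ("or", r1, r2, ...)    true if any sub-rule matches
      ("not", r)             true if the sub-rule does not match
    """
    op, *args = rule
    if op == "kw":
        return args[0].lower() in words
    if op == "and":
        return all(matches(r, words) for r in args)
    if op == "or":
        return any(matches(r, words) for r in args)
    if op == "not":
        return not matches(args[0], words)
    raise ValueError("unknown operator: %r" % (op,))

def matching_articles(rule, articles):
    """Ids of the articles the rule matches, so a reader can review them before voting."""
    return [art_id for art_id, text in articles.items()
            if matches(rule, words_of(text))]
```

A reader would check the rule against the current article first, then call
matching_articles over the rest of the group to see whether the other hits
fit the same mold before the vote is sent.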

The moderation aspect is necessary for two reasons:
  - a distributed vote-tally algorithm would generate O(N^2) net traffic 
	while the send-to-one-site-then-post-result-to-all-sites scheme
	generates O(N) traffic (where N is the number of hosts with news)
  - complicated combinations of keywords require a human in the loop to
	figure out readers' intended meaning.  The vote-tallier would be
	very liberal in its counting, e.g.
	    if [ `grep -c "$SOMEKEYWORD" everyones_keywords` -gt 100 ]; then
	        send_mail_to_moderator
	    fi
	so that the moderator could then sort out all the possibilities
	that people came up with and decide on some representative rules
	to send out.
	Also, the moderator would look at things like a suspicious number
	of postings coming from a single institution.
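
A slightly fuller version of the liberal tally, including the check for a
suspicious number of votes from one institution, might look like the sketch
below.  It is written in Python purely for illustration; the vote format
(site, keywords), the function name tally, and the threshold and
max_site_share parameters are assumptions, not part of any existing
software.

```python
from collections import Counter, defaultdict

def tally(votes, threshold=100, max_site_share=0.5):
    """Count votes per keyword and report the keywords that need attention.

    votes: iterable of (site, keywords) pairs, one pair per submitted vote.
    A keyword is reported once it has more than `threshold` votes.  It is
    marked suspicious when a single site cast more than `max_site_share`
    of those votes (both cutoffs are arbitrary assumptions).
    """
    keyword_votes = Counter()
    site_votes = defaultdict(Counter)
    for site, keywords in votes:
        # Count each keyword at most once per vote, ignoring case.
        for kw in set(k.lower() for k in keywords):
            keyword_votes[kw] += 1
            site_votes[kw][site] += 1

    report = {}
    for kw, count in keyword_votes.items():
        if count > threshold:
            top_site, top_count = site_votes[kw].most_common(1)[0]
            report[kw] = {
                "votes": count,
                "top_site": top_site,
                "suspicious": top_count / count > max_site_share,
            }
    return report
```

The report only tells the moderator where to look; deciding which rules are
reasonable stays with the human in the loop.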

Of course, probably the _best_ way to improve newsgroup quality is to let
someone read an article, hit 'F', compose their followup, enqueue it, then
require them to wait at least 30 minutes, then require them to spend at
least 5 minutes in their favorite editor reviewing their followup before
dequeueing it and actually posting it.

            --John Hanley
              System Programmer, Manhattan College
              ..!cmcl2.nyu.edu!manhat!jh  or  hanley@nyu.edu   (CMCL2<=>NYU.EDU)

weemba@garnet.berkeley.edu (Obnoxious Emacs Weenie) (03/27/88)

I'd like to clarify a few things, and then shut up.

In article <357@mancol.UUCP>, jh@mancol (John Hanley) writes:
>Why must any keyword-based system be doomed to inflexibility?
>To be effective, a system with lots of rules like Gnews would need to be able
>to automatically pull new rules off the net.  Clearly, we can't make this too
>easy, or the first time Fred Twit pissed off Joe Blow, Joe would instruct
>Gnews's all over the world to abuse people posting articles on Mr. Twit. [...]

Gnews is just one news-reader/poster/mailer of many.  Its primary
advantage over others is that it is written in Emacs Lisp, and so it
can be programmed at a very high level.  Systems like the ones being thrown
around for screening articles are no more inherent to Gnews than any
other poster; and implementing them is only effective if they become
widespread.

I mentioned Gnews for the advertising, and because I think any new
user-restricting software, like the 50% rule, should be tested more widely
before being implemented, to see just how users will respond.

I agree with Chuq & Greg that a genuine keyword system for USENET is
hopelessly impossible.  And I agree with Greg that any system that gets
implemented cannot be one that must adapt to stay alive and be useful;
in particular, I reject John's suggestions above as impossible.

The proposal that I did make was merely a very rough sketch--tests, when
possible, for generally accepted ungood postings.  Programming *this*
seems neither too easy nor outrageously hard.

I'll also point out that Gnews does have two restrictions built in to
its posting mode:
 (*) The *gnews*reply* buffer is in auto-fill-mode when the width of its
     window exceeds 80 columns.
 (*) Double signature appending is blocked, using some very trivial
     heuristics.
I have yet to hear a single complaint about either of these.
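
For readers wondering what a trivial double-signature check could look
like, here is one possible sketch.  It is in Python rather than Emacs Lisp
and is not the Gnews code; the function name append_signature and the
"body already ends with the signature" test are assumptions chosen for
illustration.

```python
def append_signature(body, signature):
    """Append signature to body unless body already ends with it.

    Trailing whitespace is ignored on every line, so a signature that was
    pasted in by hand with a stray space still counts as already present.
    """
    sig_lines = [ln.rstrip() for ln in signature.strip().splitlines()]
    body_lines = [ln.rstrip() for ln in body.rstrip().splitlines()]
    # If the last lines of the body are the signature, leave the body alone.
    if sig_lines and body_lines[-len(sig_lines):] == sig_lines:
        return body.rstrip() + "\n"
    return body.rstrip() + "\n" + "\n".join(sig_lines) + "\n"
```

Calling it twice on the same article gives the same result as calling it
once, which is the whole point of the check.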

ucbvax!garnet!weemba	Matthew P Wiener/Brahms Gang/Berkeley CA 94720

webber@porthos.rutgers.edu (Bob Webber) (03/28/88)

In article <8061@agate.BERKELEY.EDU>, weemba@garnet.berkeley.edu (Obnoxious Emacs Weenie) writes:
> I'll also point out that Gnews does have two restrictions built in to
> its posting mode: ...
>  (*) Double signature appending is blocked, using some very trivial
>      heuristics.
> I have yet to hear a single complaint about either of these.

I COMPLAIN.  Double signature inclusion is one of the easiest automatic
idiot detectors to install in software.  Of course, a -- indicating that
a person who has just typed a 20+ line message finds typing their own
signature too tedious is also a good indicator.  Automatic cross-correlation
with people on the moderators list and people who post votes to news.groups
is still available though.  Hmmm, I guess it is still easy to spot idiots.
I UNCOMPLAIN.

------ BOB (webber@athos.rutgers.edu ; rutgers!athos.rutgers.edu!webber)

p.s., so why is garnet the only site in creation running ftp that doesn't
recognize anonymous?

Script started on Sun Mar 27 19:38:00 1988
porthos[2,1] ftp garnet.berkeley.edu
Connected to garnet.berkeley.edu.
220 garnet.berkeley.edu FTP server (Version 4.16 Tue Aug 11 12:23:14 PDT 1987) ready.
Name (garnet.berkeley.edu:(null)): anonymous
Password (garnet.berkeley.edu:anonymous): 
530 User anonymous unknown.
Login failed.
ftp> quit
221 Goodbye.
porthos[2,2] ftp arizona.edu
Connected to arizona.edu.
220 megaron.arizona.edu FTP server (Version 4.109 Mon Sep 14 17:06:01 MST 1987) ready.
Name (arizona.edu:(null)): anonymous
Password (arizona.edu:anonymous): 
331 Guest login ok, send ident as password.
230 Guest login ok, access restrictions apply.
ftp> quit
221 Goodbye.
porthos[2,3] exit
script done on Sun Mar 27 19:40:06 1988