[news.misc] natural language recognition suitable for netnews?

emv@math.lsa.umich.edu (Edward Vielmetti) (10/13/90)

Cross-posted to a few unlikely groups because I don't know the
sub-specialty that knows how to deal with this problem.  Followups
to comp.ai.

In article <1990Oct12.105710.29940@chinet.chi.il.us> laird@chinet.chi.il.us (Laird J. Heal) writes:

   What really needs to be done is to redirect inappropriate articles into
   more appropriate newsgroups, as with the reposted articles comprising 
   comp.archives.  

Not exactly.  comp.archives does not view any article as
"inappropriate".  Rather it culls out particularly appropriate
articles, dusts them off and cleans them up a bit, and places them
into a new additional group.  Inappropriate articles are silently
ignored.

   The problem with doing this is threefold.  First, I would
   not trust a program to automatically inspect an article for content and
   place it in a newsgroup.  Second, I would not trust any old Joe on the net
   to put my article in any old newsgroup, and third, anyone or any group of
   people who would be trustworthy to catalog the articles would burn out
   within a very short period of time.

I wouldn't want to have software automatically take articles and
rebroadcast them around the net; that has too many opportunities for
feedback loops.  A better solution would be to retroactively
cross-post the possibly interesting or appropriate articles into a new
local newsgroup, which you could then read locally and have a
tremendous competitive advantage over other people who are slogging
through junk.  

I think it's possible to build an expert system (or whatever the
current word for those things is these days -- knowledge-based system?
natural language understander?) which finds things suitable for
comp.archives, and by extension any other group with a similar narrow
focus.  At least I would hope so.  After all, I have a few thousand
articles which a human system has found; something should be trainable
to mimic what I have done.
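
A minimal sketch of what "trainable to mimic" could look like, assuming
a plain two-class naive Bayes word model.  This is one possible
technique, not anything comp.archives is known to use; the class name
NaiveBayesFilter, the tokenizer, and the sample articles are all
hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesFilter:
    """Two-class multinomial naive Bayes with add-one smoothing.

    Class True = articles the moderator selected, False = the rest.
    """

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, selected):
        """Record one article and whether the moderator selected it."""
        self.doc_counts[selected] += 1
        self.word_counts[selected].update(tokenize(text))

    def log_odds(self, text):
        """Return log P(selected | text) - log P(not selected | text)."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        score = 0.0
        for label, sign in ((True, 1.0), (False, -1.0)):
            prior = (self.doc_counts[label] + 1) / (total_docs + 2)
            total_words = sum(self.word_counts[label].values())
            # The extra +1 in the denominator reserves mass for unseen words.
            denom = total_words + len(vocab) + 1
            log_p = math.log(prior)
            for word in tokens:
                log_p += math.log((self.word_counts[label][word] + 1) / denom)
            score += sign * log_p
        return score

    def is_candidate(self, text, threshold=0.0):
        """True if the article looks more like past selections than not."""
        return self.log_odds(text) > threshold


# Made-up training data standing in for the moderator's past choices.
nb = NaiveBayesFilter()
nb.train("new release available by anonymous ftp from archive site", True)
nb.train("sources posted to the archive server, ftp directory pub", True)
nb.train("does anyone know a good restaurant in chicago", False)
nb.train("flame war about signature length continues", False)

print(nb.is_candidate("patch available by anonymous ftp"))
print(nb.is_candidate("good restaurant in chicago"))
```

With a few thousand hand-picked articles as the True class, the
threshold could be raised so that only strong candidates reach the
moderator, who still makes the final call.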

--Ed

Edward Vielmetti, U of Michigan math dept <emv@math.lsa.umich.edu>
moderator, comp.archives
after 1 November 1990: <emv@ox.com>

laird@chinet.chi.il.us (Laird J. Heal) (10/19/90)

In article <EMV.90Oct13031622@josephus.math.lsa.umich.edu> emv@math.lsa.umich.edu (Edward Vielmetti) writes:
>Cross-posted  Followups to comp.ai. [and news.misc where it started]

>In article <1990Oct12.105710.29940@chinet.chi.il.us> laird@chinet.chi.il.us (Laird J. Heal) writes:

>   What really needs to be done is to redirect inappropriate articles into
>   more appropriate newsgroups, as with the reposted articles comprising 
>   comp.archives.  

>Not exactly.  comp.archives does [about what I said, but I emphasize reposting]

>I think it's possible to build an expert system (or [...]) [...] for
>comp.archives, and by extension any other group with a similar narrow
>focus.

I was thinking _prospectively_; that is, have inews categorize the article or
carry out the recategorization done by upstream sites.  It is clearly untenable
to flip control messages seven ways from Sunday reclassifying articles; 
canceling them is hard enough.

However, in Eric Fair's article starting this thread, he gave a very clear 
opinion that better news presentation tools are required.  Another of my ideas 
would be to have the news readers add keywords.  Then, an expert system proper 
could base its censorship on what the current reader has found interesting in 
the past correlated to the choice of keywords.
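
A rough sketch of the keyword idea, assuming each article carries a
list of keywords and the news reader records whether the user found
the article interesting.  The KeywordProfile class, its methods, and
the smoothing rule are hypothetical, chosen only to illustrate
correlating past interest with keywords.

```python
class KeywordProfile:
    """Per-reader record of how often each keyword marked a liked article."""

    def __init__(self):
        self.seen = {}   # keyword -> number of articles read with it
        self.liked = {}  # keyword -> number of those found interesting

    def record(self, keywords, interesting):
        """Update the profile after the reader has judged one article."""
        for kw in {k.lower() for k in keywords}:
            self.seen[kw] = self.seen.get(kw, 0) + 1
            if interesting:
                self.liked[kw] = self.liked.get(kw, 0) + 1

    def score(self, keywords):
        """Average smoothed like-rate over the article's keywords.

        Unknown keywords score 0.5, so new topics are neither hidden
        nor promoted until the reader has seen them.
        """
        kws = {k.lower() for k in keywords}
        if not kws:
            return 0.5
        rates = [
            (self.liked.get(k, 0) + 1) / (self.seen.get(k, 0) + 2)
            for k in kws
        ]
        return sum(rates) / len(rates)


# Made-up reading history for one reader.
profile = KeywordProfile()
profile.record(["ftp", "archive"], interesting=True)
profile.record(["flame"], interesting=False)

print(profile.score(["ftp"]))
print(profile.score(["flame"]))
print(profile.score(["never-seen"]))
```

A news reader could sort or hide articles by this score; where to put
the cutoff is a policy choice left to each reader.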

I would prefer to be able to read all of the postings on topics of my interest 
by reading particular newsgroups.  That means requesting that those posting the 
article post to whichever newsgroup is appropriate, and that unfortunately is 
not always done.  What I generally do is find the time to read news and strip 
out the interesting articles, which I later go through and make use of.
-- 
My .signature is on vacation ------------- like me!

zed@mdbs.uucp (Bill Smith) (10/28/90)

In article <1990Oct19.112936.6561@chinet.chi.il.us> laird@chinet.chi.il.us (Laird J. Heal) writes:
>In article <EMV.90Oct13031622@josephus.math.lsa.umich.edu> emv@math.lsa.umich.edu (Edward Vielmetti) writes:
>>Cross-posted  Followups to comp.ai. [and news.misc where it started]
> 
>>In article <1990Oct12.105710.29940@chinet.chi.il.us> laird@chinet.chi.il.us (Laird J. Heal) writes:
> 
>>   What really needs to be done is to redirect inappropriate articles into
>>   more appropriate newsgroups, as with the reposted articles comprising 
>>   comp.archives.  
> 
>>Not exactly.  comp.archives does [about what I said, but I emphasize reposting]
> 
>>I think it's possible to build an expert system (or [...]) [...] for
>>comp.archives, and by extension any other group with a similar narrow
>>focus.
> 
>I was thinking _prospectively_; that is, have inews categorize the article or
>carry out the recategorization done by upstream sites.  It is clearly untenable
>to flip control messages seven ways from Sunday reclassifying articles; 
>canceling them is hard enough.
> 
>However, in Eric Fair's article starting this thread, he gave a very clear 
>opinion that better news presentation tools are required.  Another of my ideas 
>would be to have the news readers add keywords.  Then, an expert system proper 
>could base its censorship on what the current reader has found interesting in 
>the past correlated to the choice of keywords.
> 
>I would prefer to be able to read all of the postings on topics of my interest 
>by reading particular newsgroups.  That means requesting that those posting the 
>article post to whichever newsgroup is appropriate, and that unfortunately is 
>not always done.  What I generally do is find the time to read news and strip 
>out the interesting articles, which I later go through and make use of.
>-- 
>My .signature is on vacation ------------- like me!

Why don't you people call TRW at Redondo Beach?  They have already built
hardware and software that will make categorization unnecessary.  It is 
cool and it works.   The group that built that project deserves more public
recognition.

Basically it is called a Fast Data Finder.  It searches through data with
(arbitrary) regular expressions as fast as the disk array that feeds it 
can ship it through.  It can search through an entire month's worth of the 
net in a second or so.   I haven't heard what the current
status is,  so by now it probably has a perfectly friendly user interface.
If it doesn't, I'm sure someone on the net will volunteer to adapt their
(already written) pet project to the task.
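
For illustration only, here is a software sketch of the same idea:
scanning a stream of articles with an arbitrary regular expression.
It uses Python's standard re module and says nothing about how the
Fast Data Finder hardware works or how fast it is; find_articles, the
message IDs, and the sample spool are hypothetical.

```python
import re


def find_articles(articles, pattern, flags=re.IGNORECASE):
    """Yield (article_id, matched_text) for each article that matches.

    `articles` is any iterable of (article_id, body) pairs, so the
    spool can be streamed from disk rather than held in memory.
    """
    regex = re.compile(pattern, flags)
    for article_id, body in articles:
        match = regex.search(body)
        if match:
            yield article_id, match.group(0)


# Made-up spool contents with invented message IDs.
spool = [
    ("<1@example>", "Sources are available by anonymous FTP."),
    ("<2@example>", "Has anyone seen my signature file?"),
    ("<3@example>", "Mail the archive-server for a copy."),
]

for article_id, text in find_articles(spool, r"anonymous\s+ftp|archive-server"):
    print(article_id, text)
```

Searching on demand like this sidesteps categorization entirely, which
is the point being made above: the query replaces the newsgroup.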

Bill Smith
pur-ee!mdbs!zed
[Keep on smiling.  Otherwise, people might think you are an axe murderer.]

reynolds@park.bu.edu (John Reynolds) (10/31/90)

We would be interested in seeing such a newsgroup.