[net.news.group] keyword based news

mason@utcsri.UUCP (Dave Mason) (09/22/85)

I also am coming to believe a keyword based system is correct.  A couple
of ideas follow, but first I must point out a problem.
My understanding is that many sites receive only a part of the postings
(particularly Europe, Australia) because of transmission costs.
Currently this is based on news groups.  If these disappear it MAY be
more difficult for them to get the articles they are interested in without
the drivel.  The only solution I can see is that they receive all articles
with keywords not on a list, with a report being produced each day of
new keywords that have arrived, which could be added to the censor list.
....maybe it's not as bad a problem as I thought.
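
A minimal sketch of the filter described above, in Python.  The article
layout (a dict with a "keywords" list) and the names filter_feed,
censor_list and known_keywords are assumptions made for illustration.
It also assumes that a single censored keyword is enough to block an
article, which is only one possible reading of the proposal.

```python
def filter_feed(articles, censor_list, known_keywords):
    """Return (articles to transfer, report of newly arrived keywords)."""
    accepted = []
    new_keywords = set()
    for article in articles:
        keywords = set(article["keywords"])
        if keywords & censor_list:
            continue  # at least one censored keyword: do not transfer
        accepted.append(article)
        # Anything the site has not seen before goes into the daily report.
        new_keywords |= keywords - known_keywords
    return accepted, sorted(new_keywords)


articles = [
    {"id": "a1", "keywords": ["unix", "bugs"]},
    {"id": "a2", "keywords": ["flame", "unix"]},
    {"id": "a3", "keywords": ["cricket"]},
]
accepted, report = filter_feed(articles, {"flame"}, {"unix"})
```

The second return value plays the role of the daily report: keywords the
site has not seen before, which could then be added to the censor list.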

One thing that I think is critical to the use of key words is that the
author of an article must choose her own keywords, even if following-up.
If this is not done, the situation will be little better than at present
with the title becoming totally meaningless.

The big advantages of (correctly chosen) keywords are:
  1) we would not have parallel discussions going on in different news groups.
	Therefore we could read all the articles of interest to us without
	scanning the universe.
  2) as discussions drifted away from our interests, we would not be sucked
	along behind them (although some simple mechanism for following a
	discussion as it drifts might be handy).

The biggest question it seems to me is how to present the discussions to
the reader.  I suggest that we could define 'interest trees' along the lines of
	os | operating_system !mvs !cpm !bugs
		unix
			4.2bsd
				bugs
			bugs
		cpm internals
This would present:
  1) os articles (however spelled) that dealt with anything but mvs,
	unix, bugs or cpm
  2) os & unix articles not about 4.2bsd or bugs
  3) os & unix & 4.2bsd articles not about bugs
  4) os & unix & 4.2bsd & bugs
  5) os & unix & bugs
  6) os & cpm & internals
where of course once seen, an article would not reappear unless you postponed
it (to another group or another time).
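
One way the tree above might be evaluated, as a rough Python sketch.  The
Node class, the classify function and the "/"-joined group labels are all
hypothetical.  The sketch assumes that children get first claim on an
article, and that a node's "!" exclusions apply only to articles shown at
that node itself (which is how "cpm internals" can sit under a root that
excludes cpm).  Tracking of already-seen articles is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    terms: list  # list of sets; an article needs one keyword from each set
    exclude: set = field(default_factory=set)
    children: list = field(default_factory=list)


def classify(node, keywords, path=()):
    """Return the label of the group an article lands in, or None."""
    if not all(group & keywords for group in node.terms):
        return None
    here = path + (node.label,)
    for child in node.children:  # more specific groups claim first
        found = classify(child, keywords, here)
        if found is not None:
            return found
    if node.exclude & keywords:  # "!" words hide it at this level only
        return None
    return "/".join(here)


TREE = Node(
    "os",
    [{"os", "operating_system"}],
    exclude={"mvs", "cpm", "bugs"},
    children=[
        Node("unix", [{"unix"}], children=[
            Node("4.2bsd", [{"4.2bsd"}], children=[Node("bugs", [{"bugs"}])]),
            Node("bugs", [{"bugs"}]),
        ]),
        Node("cpm internals", [{"cpm"}, {"internals"}]),
    ],
)

group = classify(TREE, {"os", "unix", "bugs"})
```

Under these assumptions an article tagged os, unix and bugs falls into
group 5 of the list above, while one tagged only os and cpm is not
presented at all.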

Comments welcome.
-- 
Usenet:	{dalcs dciem garfield musocs qucis sask titan trigraph ubc-vision
 	 utzoo watmath allegra cornell decvax decwrl ihnp4 uw-beaver}
	!utcsri!mason		Dave Mason, U. Toronto CSRI
CSNET:	mason@Toronto
ARPA:	mason%Toronto@CSNet-Relay

lauren@vortex.UUCP (Lauren Weinstein) (09/23/85)

The issues of keyword-based news came up a couple of (years?) ago,
and were pretty completely discussed at that time.  I'll try to find
some of my old messages on the topic--I don't think I want to 
try to generate them again from memory!

The summary though, was that I felt (and feel) that keyword-based
news won't work well in our environment.  I had a number of objections
at the time, including:

1) People don't even (much of the time) keep simple subject lines initially 
   relevant nor up-to-date on followups.  I have little faith that
   we'll see better results with keywords, the selection of which
   is very critical (see below).  Research in database systems has indicated
   that poor user choices of keywords is one of the biggest problems
   in making keyword systems useful.

2) Inappropriate use of keyword-based systems can make life very difficult
   for people who find they are missing useful articles since the original
   keywords were badly chosen.  At least with newsgroups there's a chance
   of finding things of interest (in particular groups) regardless of how
   badly subject lines may have been chosen.  The lack of newsgroups in
   a keyword-based system essentially is like putting TOTAL faith in the
   keywords (analogous to the subject line) for finding articles of interest
   (that is, there are no newsgroups to provide a higher level reference).
   People often just don't take the time and effort to choose appropriate
   keywords, and the situation could get very ugly with followups as the
   topics drift but the keywords tend to remain the same (through laziness
   or whatever...)

3) Keyword-based systems may encourage vast increases in the volume of
   postings.  Right now we only tend to find high volume in established
   topic newsgroups, but with a keyword-based system my gut feeling is that
   people would feel much more free to post anything and everything anytime
   they wanted.  This could clearly accelerate many of the problems that
   we've already been seeing as topics splinter off in all directions and
   volume balloons.  This is made even worse since...

4) ... it will be very difficult for systems to control the types of
   material they are willing to pass on in a keyword-based system.
   With newsgroups, a site can at least consider dropping some of the
   "junk" groups if they have to/want to, but how do you make such
   decisions with a keyword system?

   Some people may think this is great--a way to force all sites to pass
   everything.  Friends, all that will do is force many sites to stop
   passing anything at all.  We're starting to see sites faced with the
   alternative of cutting off some of the junk groups or not being able
   to hire some new people to do real work.  If we try to set things
   up so that sites can't easily control what netnews they're paying
   for, we're just asking many important sites to vanish.  Some sites
   simply don't have the money, disk or CPU cycles to handle all groups.
   If you put them in a situation where they can't easily subscribe (or pass
   on) topic groups of particular interest, you're saying they can't
   participate at all.  This "take it all or take nothing" quality of keyword
   systems is one of their most negative aspects.  Newsgroups provide an upper
   level of organization whose importance cannot be overemphasized, since they
   cause users to fit their postings into some established level of
   organization that isn't totally tied to users' own (arbitrary) keyword 
   choices. 

More later...

--Lauren--

mason@utcsri.UUCP (Dave Mason) (09/24/85)

In article <808@vortex.UUCP> lauren@vortex.UUCP (Lauren Weinstein) writes:
>...
>1) People don't even (much of the time) keep simple subject lines initially 
>   relevant nor up-to-date on followups.  I have little faith that
>   we'll see better results with keywords, the selection of which
>   is very critical (see below).  Research in database systems has indicated
>   that poor user choices of keywords is one of the biggest problems
>   in making keyword systems useful.
I agree whole-heartedly.

>2) Inappropriate use of keyword-based systems can make life very difficult
>   for people who find they are missing useful articles since the original
>   keywords were badly chosen.  At least with newsgroups there's a chance
>...
>   People often just don't take the time and effort to choose appropriate
>   keywords, and the situation could get very ugly with followups as the
>   topics drift but the keywords tend to remain the same (through laziness
>   or whatever...)
I proposed that the original keywords would NOT be propagated, and each
successive poster would choose the appropriate keywords.  There presumably
would be a way to track articles that were follow-ups to ones you found of
interest, but the original choice of keywords would not affect the followup.

>3) Keyword-based systems may encourage vast increases in the volume of
>   postings.  Right now we only tend to find high volume in established
>   topic newsgroups, but with a keyword-based system my gut feeling is that
>   people would feel much more free to post anything and everything anytime
>   they wanted.  This could clearly accelerate many of the problems that
>   we've already been seeing as topics splinter off in all directions and
>   volume balloons.  This is made even worse since...
Maybe..but I'm not convinced.

>4) ... it will be very difficult for systems to control the types of
>   material they are willing to pass on in a keyword-based system.
>   With newsgroups, a site can at least consider dropping some of the
>   "junk" groups if they have to/want to, but how do you make such
>   decisions with a keyword system?
I proposed a method, and although I'm not claiming it is fool-proof, I
would find it more useful if it were critiqued rather than ignored.
I was talking about trans-oceanic links, but the same would hold for
any site that didn't want to accept some classes of articles.

The only argument I can see against the approach (limiting articles
based on keywords) is the use of trojan horse words:  keyword mvs-xa
meaning aberrant-sex-with-children (there are those who would lump the
two topics together anyway :-) , but I can't see this being much worse
than the current news-group situation.

bmg@mck-csc.UUCP (Bernard M. Gunther) (09/25/85)

> I also am coming to believe a keyword based system is correct. 

I have been hearing about people wanting a keyword based system and I would
like everyone who advocates this to try a little experiment.  Today, write
out a list of all the articles which you would like to read in TOMORROW'S
newspaper.  Just try it and you will see why I advocate newsgroups.  

Bernie Gunther
mit-eddie!mck-csc!bmg

chuqui@nsc.UUCP (Chuq Von Rospach) (09/29/85)

In article <132@mck-csc.UUCP> bmg@mck-csc.UUCP (Bernard M. Gunther) writes:
>> I also am coming to believe a keyword based system is correct. 
>
>I have been hearing about people wanting a keyword based system and I would
>like everyone who advocates this to try a little experiment.  Today, write
>out a list of all the articles which you would like to read in TOMORROW'S
>newspaper.  Just try it and you will see why I advocate newsgroups.  

Going to the keyword list allows you to define a list of all the things you
DON'T want to read, which is a very different proposition. For instance, I
can easily drop out classified, the sports section (except possibly for
browsing) and anything having to do with religion. Being able to build
exclusions lists is quite simple -- every time you find something you don't
want to read, you add it to the exclusion list. You could also set up a
'must read' list as well (consider it a clipping service). Also remember
that having a computer around means that you don't need to worry about
reading the things you don't want to read -- if I was going through a paper
manually I'd probably just toss the sports section (unsubscribe to
net.sports) but if I have a computer around I can throw out the car ads,
the basketball, football and baseball stuff and still be able to see the
cricket and bike racing stuff. As it currently stands, I have to wade
through a lot of paper (or messages) because of the problems of the
newsgroups.
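
The exclusion and 'must read' lists could be sketched roughly as follows
in Python.  The function name should_show and the rule that a must-read
keyword overrides an exclusion are assumptions, not something stated
above.

```python
def should_show(keywords, exclude, must_read):
    """Decide whether a reader sees an article, given its keywords."""
    keywords = set(keywords)
    if keywords & must_read:
        return True  # the "clipping service" wins over any exclusion
    return not (keywords & exclude)


exclude = {"classified", "football", "baseball", "basketball", "religion"}
must_read = {"cricket"}

shown = should_show(["sports", "cricket"], exclude, must_read)
hidden = should_show(["sports", "football"], exclude, must_read)

# Growing the exclusion list is a one-line operation for the reader.
exclude.add("car_ads")
```

Articles carrying keywords on neither list are shown by default, which
matches the idea of listing only what you DON'T want to read.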

Keeping to the newspaper analogy, usenet currently does its first cut using
newsgroups, which translates well to the major newspaper sections (news,
weather, business, sports, opinion). This, unfortunately, creates
ambiguities, since a sports medicine article might go under sports or
medicine, and if it is in sports I'll miss it, but if it is in medicine
I'll read it. Going to the keyword system means that the primary piece of
available information is the subject line, which is analogous to being able
to scan the paper based on the headlines instead. Now, are you more likely
to decide to read an article because of the headline or because of the
section of the paper it is in? For me, at least, the section it is placed
in is a lot less important than what the article is about...
-- 
:From under the bar at Callahan's:   Chuq Von Rospach 
nsc!chuqui@decwrl.ARPA               {decwrl,hplabs,ihnp4,pyramid}!nsc!chuqui

If you can't talk below a bellow, you can't talk...

edward@ukecc.UUCP (Edward C. Bennett) (09/30/85)

	But in order for all this to work, you must still rely on posters
to label their postings with the proper keywords. What's to prevent some
malicious type person from labeling a particularly offensive posting
with "Keywords: sex UNIX pontiac"?

-- 
Edward C. Bennett

UUCP: ihnp4!cbosgd!ukma!ukecc!edward

/* A charter member of the Scooter bunch */

"Goodnight M.A."

putnam@steinmetz.UUCP (jefu) (09/30/85)

I personally rather like the idea of keyword based news, but think that
there are definite problems.  The worst is having the user choose the
keywords.  I don't believe this could ever work.  Instead, there should be
some way to get the software to generate them, perhaps using Zipf's law
and some weighting on keywords for the posting (if any) the person is
responding to.  It might also help to add `not` operators (but how to 
get the keyword generating software to do this?  Sounds tough to me).

Until this, at least, can be answered, the other questions are academic.
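
As a very rough illustration of what such software might do, here is a
Python sketch.  It is only one possible reading of the idea: Zipf's law
says a handful of very common words dominate any text, so a small
stopword list stands in for discarding that head of the distribution,
and the remaining words are ranked by frequency.  The names
suggest_keywords, STOPWORDS and parent_weight, and the weighting scheme
itself, are assumptions.  The `not` operators are not attempted.

```python
import re
from collections import Counter

# Stand-in for the very frequent words that carry little topical meaning.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it",
             "that", "on", "for", "with", "as", "this", "be", "are", "i"}


def suggest_keywords(body, parent_keywords=(), n=3, parent_weight=2.0):
    """Suggest up to n keywords for an article body."""
    words = re.findall(r"[a-z0-9]+(?:[._-][a-z0-9]+)*", body.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    parent = set(parent_keywords)
    # Words that were keywords of the article being answered count extra.
    scores = {w: c * (parent_weight if w in parent else 1.0)
              for w, c in counts.items()}
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:n]


body = ("The unix kernel panics when the unix scheduler runs. "
        "The scheduler bug is in the kernel scheduler code.")
suggested = suggest_keywords(body, parent_keywords=["unix"])
```

A real system would need a far better notion of which words matter; this
only shows that the mechanical part is small.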





-- 
               O                      -- jefu
       tell me all about              -- UUCP: edison!steinmetz!putnam
Anna Livia! I want to hear all....    -- ARPA: putnam@kbsvax.decnet@GE-CRD

gordon@cae780.UUCP (Brian Gordon) (10/03/85)

In article <265@ukecc.UUCP> edward@ukecc.UUCP (Edward C. Bennett) writes:
>	But in order for all this to work, you must still rely on posters
>to label their postings with the proper keywords. What's to prevent some
>malicious type person from labeling a particularly offensive posting
>with "Keywords: sex UNIX pontiac"?

The same thing that keeps that poster from sending the article to
net.singles, net.social, net.unix, net.unix_wizzards, net.auto, AND
net.general -- viz. nothing.  So what would we lose?  A jerk is still
a jerk in either case.  However, the equivalent of "innocent"
mis-postings to, for example, net.general should be much lower.

FROM:   Brian G. Gordon, CAE Systems Division of Tektronix, Inc.
UUCP:   tektronix!teklds!cae780!gordon
	{ihnp4, decvax!decwrl}!amdcad!cae780!gordon 
        {nsc, hplabs, resonex, qubix, leadsv}!cae780!gordon 

loverso@sunybcs.UUCP (John Robert LoVerso) (10/13/85)

> Today, write
> out a list of all the articles which you would like to read in TOMORROW'S
> newspaper.  Just try it and you will see why I advocate newsgroups.  
> 
> Bernie Gunther

Wouldn't the default be that you would automatically see any articles with
new keywords?

bmg@mck-csc.UUCP (Bernard M. Gunther) (10/22/85)

> > Today, write
> > out a list of all the articles which you would like to read in TOMORROW'S
> > newspaper.  Just try it and you will see why I advocate newsgroups.  
> > 
> > Bernie Gunther
> 
> Wouldn't the default be that you would automatically see any articles with
> new keywords?

Yes, you could do it that way, but what about meta discussions?  You don't
want to read about zebras but someone is talking about the problems about
talking about zebras which you do want to read.  How do you deal with this?

If you have a list of approved keywords for topics, this is no different
from the current newsgroups, outside of being a wider newsspace (ie more
names therefore smaller newsgroups).  

How about those who advocate the idea of keywords building it such that it
works with the current system?  Have the keyword system work within the
newsgroup frame.  If it works well, newsgroups which presently have
subdivisions will lose the subdivisions because they are no longer
necessary.  The same goes for subdivisions (read: newsgroups) on the rest
of the network.  The net will then be in the state you want.  

If it can't be done on the newsgroup level, I will find it *very* hard
to believe you can do it on the network level.  If it's good, people 
will try it and like it.  If it's bad or doesn't do what people want,
you can still keep on using it.  Before you re-build the world, try
working on a small piece and seeing how that works.

Bernie Gunther