[news.groups] Future of USENET

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (06/25/91)

[note crossposts, followup]

 mathews@hadar.cs.Buffalo.EDU (Ryan Mathews) writes:

> But I just have to ask one question:
> Don't you think we're creating too many groups?

No, quite to the contrary, I think we are creating
far too few, and our news maintenance and news
reading software is poorly set up for the growth
which should occur to make up for this.

My reasoning is based on weak data, but to the best
of my ability to determine, in the last five years,
the volume of postings has risen 25-fold, while the
number of newsgroups has risen only tenfold. That
makes each newsgroup 2.5 times as crowded on average
as was the case back when it was still (barely)
possible for a single human being to read the entire
net as a full time occupation.

I attribute this deficiency of organizational
improvements to the cumbersome newsgroup creation
process. Once having determined that the net needs
to get back to a much finer split of newsgroups to
give readers any hope of reading interesting
material without wading through ten times as much
uninteresting material, why should anyone but the
current readers of a newsgroup be involved in the
partition of that group into subgroups?

The rest of the net has only three possible
responses when presented with a vote for an
unfamiliar group: 1) ignore the vote; 2) vote in
ignorance; or 3) vote NO on the "principle", however
misguided, that there are "too many newsgroups".

What is needed instead is a re-examination of this
whole question, and the creation of software and
operating paradigms to satisfy the following poorly
met needs:

1) Index the net, so that groups of interest can be
found by keyword searches; even a full text search
of the entire online news spool, while slow as mud,
would be a help in this direction.

This would actually _lessen_ the need for group
creation, by showing the user that topic X is
already heavily discussed in newsgroup Y, and so
doesn't need a newsgroup of its own to get a
discussion going.
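
A minimal sketch of the keyword index idea in point 1, assuming articles are available as (newsgroup, text) pairs. That layout, the function names, and the sample data are all invented for illustration and do not describe any real news software.

```python
import re
from collections import defaultdict

def build_index(articles):
    """Map each lowercase word to the set of newsgroups whose articles use it.

    `articles` is an iterable of (newsgroup, text) pairs; this layout is an
    assumption made for the sketch, not a real news spool format.
    """
    index = defaultdict(set)
    for group, text in articles:
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(group)
    return index

def groups_for(index, keywords):
    """Rank newsgroups by how many of the given keywords appear in them."""
    hits = defaultdict(int)
    for kw in keywords:
        for group in index.get(kw.lower(), ()):
            hits[group] += 1
    # Most matching keywords first, ties broken alphabetically.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))

# Made-up sample data, for illustration only.
sample = [
    ("rec.pets.cats", "My cat refuses dry food"),
    ("sci.bio", "Feline diet and food allergies"),
    ("rec.food.cooking", "Best food processor for dough"),
]
index = build_index(sample)
ranked = groups_for(index, ["cat", "food"])
```

Here the group that mentions both keywords ranks first, which is the "topic X is already discussed in newsgroup Y" answer. Note that "Feline" does not match "cat", a small instance of why plain keyword matching is a weak tool.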

2) Index and automate feed sys file maintenance, so
that, while all groups propagate to those who want to
read them, uninterested sites are omitted from
carrying, and to the extent possible from passing,
unwanted newsgroups. Among other things this will
require a much denser set of interconnections for
the net than now exists, and software to accomplish
the much more complex feed contact protocols and
expiry protocols needed.

This will save scads of spool space and telecomm
charges.

3) Change the news base to a hypertext style, to
limit the actual volume used for passing context
material in followups.

This would save space, and if actually presented by
some news readers as hot buttons, would also
dramatically decrease reading time for a subscriber
following a thread who already has the context in
mind and doesn't need to see it again.

4) Present newsgroup choices hierarchically, to let
the user view the actual newsgroup organization, and
to limit screen painting time for newsgroup
selection; change from a typing to a pointing
interface.

The more I read news, the less satisfied I find
myself with _any_ particular order of presentation
of newsgroups; I tend to read in different orders on
different days or even different hours of the same
day. None of the current interfaces I've seen make
random access to newsgroups easy.

5) Take much more advantage of user-local processing
power; this one is tough because of the wide variety
of news reading hardware, but lots of stuff that I
have to access over slow dial up lines repeatedly
during my session could be downloaded silently to a
database on my local hardware while I read other
articles, and painted on my screen much faster (about
30 times) from local store.

This would actually _decrease_ the communications
load on the host machine.

6) Create an easy to use complement to kill files:
interest files, such that only files that meet some
positive criterion are presented for reading, rather
than negative criteria being avoided.

As an example, show me articles containing at least
five of twenty keywords, or articles starting new
threads. Make a global facility that pulls forward
articles from _anywhere_ I subscribe, or even
anywhere at all, containing ten of twenty
particularly hot keywords, and presents them to me
before I enter any newsgroup, in case that is all I
have time for right now.
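
The "interest file" test in point 6 could be sketched roughly as follows. The article layout (a dict with "body" and "references" fields) and the function names are assumptions made for this example, not part of rn, trn, or any other newsreader.

```python
import re

def keyword_hits(text, keywords):
    """Return the keywords that occur in text as whole words, ignoring case."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return {kw for kw in keywords if kw.lower() in words}

def is_interesting(article, keywords, minimum=5):
    """Accept an article that starts a new thread or mentions enough keywords.

    `article` is a dict with hypothetical 'body' and 'references' fields;
    an empty or missing 'references' field is taken to mean a new thread.
    """
    starts_thread = not article.get("references")
    enough = len(keyword_hits(article["body"], keywords)) >= minimum
    return starts_thread or enough

# Made-up example with a short keyword list and a lower threshold.
keywords = ["index", "hypertext", "spool", "feed"]
followup = {"body": "Index the spool before expiry runs.", "references": "<1@a>"}
offtopic = {"body": "Me too!", "references": "<1@a>"}
new_thread = {"body": "Me too!", "references": ""}

show_followup = is_interesting(followup, keywords, minimum=2)
show_offtopic = is_interesting(offtopic, keywords, minimum=2)
show_new_thread = is_interesting(new_thread, keywords, minimum=2)
```

The global "pull forward" facility would apply the same test across every subscribed group instead of one, with a stricter threshold.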

7) Improve the software to cope gracefully with lots
more newsgroups, with much deeper hierarchies, with
longer, less typable, fully qualified newsgroup
names.

For example, I just found out that one of the two
leaf site packages for my local hardware has a very
hard limit of 30 characters in a fully qualified
newsgroup name, because it makes that a directory
name rather than using a hierarchical directory
structure. Unfortunately, my local site already has
several newsgroups whose fully qualified name is
longer than 30 characters!
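
To illustrate the difference, here is a small sketch contrasting the flat layout (whole group name as one directory name) with a hierarchical one (one directory per name component). The spool root, the 30-character limit constant, and the sample group name are assumptions chosen for the example.

```python
from pathlib import PurePosixPath

NAME_LIMIT = 30  # the per-name limit described above, taken as an assumption

def flat_dir(group):
    """Flat layout: the whole newsgroup name becomes a single directory name."""
    return group

def hierarchical_path(group, root="/usr/spool/news"):
    """Hierarchical layout: one directory level per dot-separated component."""
    return PurePosixPath(root, *group.split("."))

def longest_component(group):
    """Length of the longest single directory name in the hierarchical layout."""
    return max(len(part) for part in group.split("."))

# Made-up group name, 37 characters long.
group = "misc.example.very.long.hierarchy.name"
flat_fits = len(flat_dir(group)) <= NAME_LIMIT
tree_fits = longest_component(group) <= NAME_LIMIT
path = str(hierarchical_path(group))
```

With the hierarchical layout only the longest single component has to fit the limit, so deeper hierarchies do not run into it.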

Naturally, typing one of these behemoths in rn or trn
to jump directly to the newsgroup is a royal pain.

There are lots more ideas along the same direction.
The net has become an information overload for any
one person, and even individual newsgroups are such
for many of us. Lacking better access mechanisms,
finer newsgroup partitioning is at least a start.

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) (06/28/91)

 sksircar@stroke.princeton.edu (Subrata Sircar) writes:
> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:

>> The rest of the net has only three possible
>> responses when presented with a vote for an
>> unfamiliar group: 1) ignore the vote; 2) vote in
>> ignorance; or 3) vote NO on the "principle",
>> however misguided, that there are "too many
>> newsgroups".

> Very few groups ever get rejected, however. The
> opposite argument is more true; that the only
> people who vote are the people in favor of the
> split.

Not quite; there are a steady 45 or so folks who
vote against every group, no matter the merits,
because they think the net has too many newsgroups.
This means you'd better aim for 150 YES votes to
pass a group, not just 100.

>> 1) Index the net, so that groups of interest can
>> be found by keyword searches; even a full text
>> search of the entire online news spool, while
>> slow as mud, would be a help in this direction.

> With proper naming, this is easy; just grep the
> news spool for directory names.

I wish it were that easy; read a few research papers
on the relative success rate of keyword searches
against even full text indexed databases; the
results are pretty sorry. Humans do _not_ have a
good unspoken agreement about what words should be
used to talk about which subjects, so you have to
use lots of keywords against lots of pertinent text
to have a good chance of finding what you seek.

Take a look at another posting in this thread, from
Richard Miller, that bemoans the difficulty of
keeping conversations correctly slotted in a mere
_three_ education newsgroups. Just looking at the
group names isn't nearly enough, though it can of
course help; I use it myself a lot, but less in
looking for a subject than in finding a fully
qualified group name I only remember in part.

>> 3) Change the news base to a hypertext style, to
>> limit the actual volume used for passing context
>> material in followups.

> This is unfortunately extremely difficult, given
> the number of character based interfaces to the
> net. How do you generate hypertext interfaces that
> can be manipulated only through 7-bit ascii codes,
> which is what the majority of the net uses?

Up until someone got a little too clever installing
fascist options in inews, there was a common
agreement on the net that a leading ">" (or several)
indicated included material, so reserving a marker
seems the right thing to do. The Thinker(tm)
hypertext package encloses words which are hypertext
link hotbuttons in "<", ">" pairs. Mix this with a
message id, start byte, end byte contents (which
need not be displayed that way to the user) and you
have the essence of a hypertext link, done in
printable ASCII.  I'd prefer that the display to the
user in the hot button show the user-id of the author
of the included material, with a level number in case
the thread contains quotes from that author from more
than one prior article.  So what the user sees would
look like "<xanthian-1>" to indicate a most recent
level quote from me had been included by the present
article's author.

We could continue the convention of keeping this
left adjusted on a line alone, or tag it on the end
of the previous paragraph to save space if our news
displayer did real time paragraph flowing and worked
in meaningful (SGML) units of text.
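
One way to picture such a link is the sketch below. The field layout, the space-separated wire form, the message id, and the article store are all hypothetical; nothing here is an existing news or Thinker format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuoteLink:
    message_id: str  # id of the article being quoted
    start: int       # offset of the first quoted byte
    end: int         # offset one past the last quoted byte
    author: str      # user-id shown in the hot button
    level: int       # 1 = most recent quote from this author

    def encode(self):
        """Printable-ASCII form carried in the article (hypothetical layout)."""
        return f"<{self.message_id} {self.start} {self.end}>"

    def label(self):
        """What the reader sees in place of the quoted text."""
        return f"<{self.author}-{self.level}>"

    def resolve(self, store):
        """Fetch the quoted bytes from a local store of article bodies."""
        return store[self.message_id][self.start:self.end].decode("ascii")

# Made-up article store and message id, for illustration only.
body = b"No, quite to the contrary, I think we are creating far too few."
store = {"example-1@zorch": body}
quoted = b"quite to the contrary"
start = body.index(quoted)
link = QuoteLink("example-1@zorch", start, start + len(quoted), "xanthian", 1)

wire = link.encode()
button = link.label()
text = link.resolve(store)
```

The followup carries only the short encoded link; a reader who wants the context expands the button, which looks the span up in the original article.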

>> 6) Create an easy to use complement to kill
>> files: interest files, such that only files that
>> meet some positive criterion are presented for
>> reading, rather than negative criteria being
>> avoided.

> This is possible with rn, I don't know about other
> newsreaders.

The operative word is "possible"; I use this with
alt.flame to pull out napalm aimed at my personal
carcass, but it is an inconvenient side effect of
trn mechanisms meant for other purposes, and quite
clunky. An "interest" filter built explicitly for
this purpose could be far less awkward to use.

>> As an example, show me articles containing at
>> least five of twenty keywords, or articles
>> starting new threads. Make a global facility that
>> pulls forward articles from _anywhere_ I
>> subscribe, or even anywhere at all, containing
>> ten of twenty particularly hot keywords, and
>> presents them to me before I enter any newsgroup,
>> in case that is all I have time for right now.

> The first is conceptually easy; run twenty "mark"
> files on the newsgroup, and only present articles
> which are marked by five or more (storing the
> numbers in separate files, unmarking all when
> done).

Thinking harder about that, what I'd probably want
is "N occurrences of some subset of these M keywords
with at least R different keywords appearing";
persistent mention is a better clue than casual
mention.
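
That refined rule can be sketched as below, with N as `min_total` and R as `min_distinct`. The function name, parameter names, and sample text are invented for illustration.

```python
import re
from collections import Counter

def persistent_mention(text, keywords, min_total, min_distinct):
    """True if the keywords occur at least min_total times in all,
    with at least min_distinct different keywords among them."""
    wanted = {kw.lower() for kw in keywords}
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(w for w in words if w in wanted)
    return sum(counts.values()) >= min_total and len(counts) >= min_distinct

# Made-up sample: "feeds" appears twice, "spool" three times, "expiry" never.
text = "Usenet feeds and more feeds: spool, spool, spool"
keywords = ["feeds", "spool", "expiry"]

two_kinds = persistent_mention(text, keywords, min_total=5, min_distinct=2)
three_kinds = persistent_mention(text, keywords, min_total=5, min_distinct=3)
too_many = persistent_mention(text, keywords, min_total=6, min_distinct=2)
```

Requiring both totals means one word repeated many times is not enough, and neither is a single passing mention of several words.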

> The second is conceptually just as easy, but
> tremendously difficult in current practice.

Not conceptually harder, just that our machines are
nowhere near fast enough to do the job; a Connection
Machine wired into the disk drive hardware would be
Just About Right.

>> Naturally typing one of these behemoths in rn or
>> trn to jump directly to the newsgroup is a royal
>> pain.

> This can be done with filename completion, as
> t-shell editing or Mach on the NeXT provide;
> simply type part of the name and hit a hot key and
> it finishes unique extensions.

Yeah, except that hierarchy names by design aren't
unique until you get close to the end. Doing
completion a level at a time would be better, but
a point and click interface would be _much_ better.
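
Completion a level at a time might look like the following sketch: only the last, partly typed component is completed, against the groups that match the components already typed. The function name and the sample group list are assumptions for the example.

```python
def complete_level(groups, typed):
    """Return the possible completions for the last component of `typed`.

    `typed` is a partial newsgroup name such as "comp.sys.a"; everything
    before the final dot is treated as already chosen.
    """
    *done, partial = typed.split(".")
    depth = len(done)
    candidates = set()
    for group in groups:
        parts = group.split(".")
        if (len(parts) > depth
                and parts[:depth] == done
                and parts[depth].startswith(partial)):
            candidates.add(parts[depth])
    return sorted(candidates)

# Sample group list, for illustration only.
groups = [
    "comp.sys.amiga.hardware",
    "comp.sys.amiga.games",
    "comp.sys.atari.st",
    "comp.lang.c",
]

ambiguous = complete_level(groups, "comp.sys.a")
unique = complete_level(groups, "comp.sys.am")
next_level = complete_level(groups, "comp.")
```

Each call returns a short list for one level of the hierarchy, which is also the list a pointing interface would put on screen.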

Kent, the man from xanth.
<xanthian@Zorch.SF-Bay.ORG> <xanthian@well.sf.ca.us>