[bionet.general] building an interstate

IAE@CU.NIH.GOV ("Irene Anne Eckstrand") (06/30/91)
== For Your Information ==

MAIL VIA BITNET FROM KIDSNET@PITTVMS  MONDAY  06/24/91  10:20:38 P.M.

Date: Mon, 24 Jun 91 21:48 EDT
From: KIDSNET MAILING LIST <KIDSNET@pittvms>
Subject: Re: building an interstate (data) highway with no roadmaps
To: kids-l@pittvms
Message-id: <560A3D8684FF202923@vms.cis.pitt.edu>
X-Envelope-to: IAE@NIHCU.BITNET
X-VMS-To: IN%"kids-l"

Date: Sun, 23 Jun 91 11:09 EDT
From: "ROBERT D. CARLITZ" <RDC@vms.cis.pitt.edu>
Subject: RE: building an interstate (data) highway with no roadmaps

There have been a number of interesting ideas brought up in the
discussion going on in the Usenet newsgroup comp.protocols.tcp-ip.  I
attach a selection below, with apologies for its length.  Those who
would like to follow this further can either read the newsgroup directly
or can look at the affiliated BITNET list TCP-IP@UTDALLAS.

--------------------------------------------------------------------------

From: J.Crowcroft@CS.UCL.AC.UK (Jon Crowcroft)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 17 Jun 91 14:07:58 GMT

in the medium term, this may be helped a bit by the massive effort in
the ietf directory group...

in the longer term, shared file systems with clever caching and
replication, and shared window systems,
and good navigation tools will be needed too...

shared filesystem technology
will obviate the need for
(ftp foo.bar; ls; cd dirX)*
type activity,
but will lead to semi-efficient
find -print
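
A minimal sketch of what "find -print" over a mounted shared file
system amounts to, assuming the remote archive simply appears as a local
directory tree. The function name find_print and the output order are
illustrative assumptions; the order is not guaranteed to match find(1).

```python
import os


def find_print(root):
    """Yield root and every path beneath it, roughly like `find root -print`."""
    yield root
    # os.walk visits each directory once; on a large shared tree this is
    # the "semi-efficient" full traversal the post alludes to.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(dirnames) + sorted(filenames):
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Walk the current directory as a stand-in for a mounted archive.
    for path in find_print("."):
        print(path)
```

Every query of this kind touches the whole tree, which is why caching
and replication matter once the tree spans many sites.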

shared window systems could lead to very fast combined help desk/bboard
functionality ... instead of day-long exchanges like...
mail fred "where'sA";
repl bloggs "in docs/goo.tar.Z" on fiasco.obscureUniversity.ac.uk
repl fred "what do i do with .tar.Z files?"

by (pure) coincidence, we have just been talking about rationalising
ftp (re-)distribution here via AFS...as one example of order out of chaos

the greatest hope is that the library/information service people who
are used to tera-reference searchspaces will go on increasing their
involvement the way they have started (e.g. our UCL/London
librarians just ran a video conference last week over part of the
internet to talk to some MIT and other librarians)...

jon

---------------------------------------------------------------------------

From: clw@MERIT.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 17 Jun 91 16:12:42 GMT

The Directory Group at MERIT, Chris Weider and Mark Knopper, are starting
to address some of these issues.  I do think that Directory Services are
a good medium term answer, and we're starting to put everything which
fits the X.500 philosophy into X.500....
Chris

---------------------------------------------------------------------------

From: jqj@DUFF.UOREGON.EDU
Newsgroups: comp.protocols.tcp-ip
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 17 Jun 91 16:42:50 GMT

I think online directory services are only one aspect of the problem that
EMV raised.  I'm not convinced that a global shared file system will help
too much (a shared file system uniting AFS/unix with Novell, Appleshare,
and typical IBM mainframe file systems is still several years away, and
you still need to know how to navigate through that gigantic file
system!), except by improving/regularizing the user interface.  I'm also
not convinced that we should expect the CNI or library folks to rescue us
(unless you believe that the research-library card catalog model is the
right one for dynamic network information).

From my perspective, EMV's important point is that there is no equivalent
of a AAA (or any other national Automobile Association) to provide travel
information services, personalized maps, triptics, towing services, motel
certifications, etc. for normal people (rather than hackers like us).
Perhaps what we need is a users' group that is tied to international
networking rather than to a particular computer vendor or even to a
particular internet or particular protocol stack.  "American Networking
Association" anyone?

Where should I turn for advice about where I might take my family for its
next network vacation?

Come to think of it, in addition to a AAA analog, perhaps we need an AA
analog too.  "Networkers Anonymous" to help those poor souls who have
become addicted to cyberspace.

[p.s. please note the lack of :-).  I'm more serious than it appears.]

JQ Johnson
Director of Network Services            Internet: jqj@oregon.uoregon.edu
University of Oregon                    voice:  (503) 346-1746
250E Computing Center                   BITNET: jqj@oregon
Eugene, OR  97403-1212                  fax: (503) 346-4397

---------------------------------------------------------------------------

From: emv@msen.com (Ed Vielmetti)
Newsgroups: comp.protocols.tcp-ip,comp.archives.admin
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 18 Jun 91 04:03:49 GMT

In article <9106171612.AA01441@mazatzal.merit.edu> clw@MERIT.EDU writes:

   The Directory Group at MERIT, Chris Weider and Mark Knopper, are starting
   to address some of these issues.  I do think that Directory Services are
   a good medium term answer, and we're starting to put everything which
   fits the X.500 philosophy into X.500....

All due respects, Chris, but X.500 doesn't address many of these
issues at all, and the ones it does sort of fit into can be more
easily addressed with other tools.

X.500 Directory services assume a neat, structured, hierarchical name
space and a clear line of authority running from the root all the way
to the leaves.  Indeed, most X.500 services in place on the internet
today that work well enough to be useful run off of centrally
organized, centrally verified, and bureaucratically administered
information -- the campus phone book.  For what this is, it's great --
i'm happy that I can finger user@host.edu at any number of sites and
get something back.  But that is of little relevance to the archives
problem.

X.500 services are hard to run -- the technology is big, bulky,
osified.  So the people who are most interested in running it are the
"computer center" folks.  If you look for the innovative, interesting,
and desirable applications that you'd want to find on the net, you'll
see that many of them are being done out in the field in departmental
computing environments or increasingly in small focused private
commercial or non-commercial efforts.  There's not a terribly good
reason for these two groups to communicate, and so most X.500 projects
have much more structure than substance.

X.500 services are directory oriented.  The data in them is relatively
small, of known value, and highly structured.  Information about
archive sources is just about completely counter to these basic
principles.  The amount of information about any particular service
which you'd like to have on hand can be quite considerable; perhaps at
minimum access instructions, but more likely some text describing the
service, who its intended audience is, sample output, etc.  In
addition it would be valuable to keep information on user reactions to
the system close to the official provided directory notice; these
reviews (a la the michelin guide) are often more valuable than the
official propaganda put out by the designer.  To search this mass of
information, you'll want something much more expressive than the
relatively pitiful X.500 directory access tools -- full text
searching, at the very minimum, with a way to sensibly deal both with
structured data and with more fuzzy matches on "similar" items.
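
A minimal sketch of the kind of search asked for here: full-text lookup
over free-form service descriptions, with fuzzy matching on "similar"
words. This is not WAIS or Z39.50; it is a toy inverted index using
Python's standard difflib module. The function names, the sample
descriptions, and the 0.8 similarity cutoff are all assumptions made
for illustration.

```python
import difflib
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(descriptions):
    """Map each word to the set of services whose description contains it."""
    index = defaultdict(set)
    for name, text in descriptions.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query, cutoff=0.8):
    """Return services matching every query word, allowing near-misses."""
    results = None
    vocabulary = list(index.keys())
    for word in tokenize(query):
        # Indexed words similar enough to the query word (exact hits included).
        close = difflib.get_close_matches(word, vocabulary, n=5, cutoff=cutoff)
        hits = set()
        for candidate in close:
            hits |= index[candidate]
        results = hits if results is None else results & hits
    return results or set()


# Hypothetical descriptions, written only for this example.
SAMPLE = {
    "wais": "full text search server using Z39.50",
    "archie": "index of anonymous ftp archive sites",
}

if __name__ == "__main__":
    idx = build_index(SAMPLE)
    print(search(idx, "archives"))  # plural still finds the "archive" entry
    print(search(idx, "serch"))     # misspelling still finds "search"
```

User reviews could be indexed the same way by adding their text to the
description before building the index.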

X.500 is a holy grail, there's a lot of money which seems to be being
thrown at it these days in the hope to make it useful.  Good luck, I
wish you well.  But please, don't try to cram all the world's data
into it, because it doesn't all fit.  It's a shame that equivalent
amounts of effort aren't being spent on developing other protocols
more suited to the task. I'm thinking in particular of the Z39.50
implementation in WAIS [*] which holds a lot of potential for
providing a reasonable structure for searching and sifting through
databases which have rich textual information.  Perhaps it's just as
well that federal subsidy hasn't intruded here and clouded people's
judgments on the applicability of a particular technology to a
certain task.

--
Edward Vielmetti, MSEN Inc.     moderator, comp.archives        emv@msen.com

"often those with the power to appoint will be on one side of a
controversial issue and find it convenient to use their opponent's
momentary stridency as a pretext to squelch them"

[*] think.com:/public/wais/,
    also quake.think.com:/pub/wais/

---------------------------------------------------------------------------

From: worley@compass.com (Dale Worley)
Newsgroups: comp.protocols.tcp-ip,comp.archives.admin
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 18 Jun 91 13:49:57 GMT

In article <EMV.91Jun18000345@bronte.aa.ox.com>
   emv@msen.com (Ed Vielmetti) writes:
   X.500 services are directory oriented.  The data in them is relatively
   small, of known value, and highly structured.  Information about
   archive sources is just about completely counter to these basic
   principles.

   X.500 is a holy grail, there's a lot of money which seems to be being
   thrown at it these days in the hope to make it useful.

What can be done to produce good catalogs?  As Ed notes, archive
information is likely to be bulky, chaotic, and of unknown (probably
small) value.  Given how much money is needed to get a directory
system for information without these problems running, it will
probably take much more to get a good system for archive information
working.

Perhaps the analogy to road maps can be a guide -- Roads have been
around for thousands of years, but road maps have only been available
for fifty(?) years.  What happened?  One thing is that it is now
possible to make a map and then sell thousands (hundreds of
thousands?) of copies, thus making each copy reasonably inexpensive.
Until the development of the automobile this was not possible, there
were too few potential users.  (Or even necessary, since a horse cart
is slow enough that stopping to ask directions in each town isn't a
burden.)

One possibility is to make a service that charges you for use.  A good
archive information system should see enough use that each query can
be quite inexpensive.  And the authorization and billing should be
easy enough to automate!

Dale Worley             Compass, Inc.                   worley@compass.com
--
Perhaps this excerpt from the pamphlet, "So You've Decided to
Steal Cable" (as featured in a recent episode of _The_Simpsons_)
will help:
        Myth:  Cable piracy is wrong.
        Fact:  Cable companies are big faceless corporations,
                which makes it okay.

---------------------------------------------------------------------------

From: ajw@manta.mel.dit.CSIRO.AU (Andrew Waugh)
Newsgroups: comp.protocols.tcp-ip,comp.archives.admin
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 19 Jun 91 00:14:11 GMT

In article <EMV.91Jun18000345@bronte.aa.ox.com>
   emv@msen.com (Ed Vielmetti) writes:
> X.500 Directory services assume a neat, structured, hierarchical name
> space and a clear line of authority running from the root all the way
> to the leaves.

While this is certainly true, it is important to understand why this is
so. X.500 is intended to support a distributed directory service. It
is assumed that there will be thousands, if not millions, of
repositories of data (DSAs). These will co-operate to provide the
illusion of a single large directory.

The problem with this model is how you return a negative answer in a
timely fashion. Say you ask your local DSA for a piece of information.
If the local DSA holds the information you want, it will return it.
But what if it doesn't hold the information? Well, the DSA could
ask another DSA, but what if this second DSA also doesn't hold the
information? How many DSAs do you contact before you return the
answer "No, that piece of information does not exist"? All of them?

X.500 solves this problem by structuring the stored data hierarchically
and using this hierarchy as the basis for distributing the data
amongst DSAs. Using a straightforward navigation algorithm, a query
for information can always progress towards the DSA which should hold
the information. If the information does not exist that DSA can
authoritatively answer "No such information exists." You don't have to
visit all - or even a large proportion - of the DSAs in the world.
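
A minimal sketch of the navigation idea described above, assuming each
DSA is authoritative for one subtree of a hierarchical name space and
hands the rest of the name down to a child DSA. This is not the X.500
protocol or any real DSA implementation; the class, the method names,
and the sample names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Name = Tuple[str, ...]


class DSA:
    """Toy directory server that is authoritative for one subtree."""

    def __init__(self, prefix: Name = ()):
        self.prefix = prefix                    # subtree this DSA answers for
        self.entries: Dict[Name, str] = {}      # entries held locally
        self.children: Dict[str, "DSA"] = {}    # delegated subtrees

    def delegate(self, component: str) -> "DSA":
        """Create a child DSA responsible for prefix + (component,)."""
        child = DSA(self.prefix + (component,))
        self.children[component] = child
        return child

    def lookup(self, name: Name, hops: int = 1) -> Tuple[Optional[str], int]:
        """Return (value or None, number of DSAs contacted)."""
        if name in self.entries:
            return self.entries[name], hops
        depth = len(self.prefix)
        if len(name) > depth and name[depth] in self.children:
            # The name lies in a delegated subtree: move toward its owner.
            return self.children[name[depth]].lookup(name, hops + 1)
        # No delegation covers this name, so this DSA can answer "no such
        # entry" authoritatively without asking anyone else.
        return None, hops


def build_example() -> DSA:
    """Build a small hypothetical tree: root -> AU -> CSIRO, root -> US -> MERIT."""
    root = DSA()
    csiro = root.delegate("AU").delegate("CSIRO")
    csiro.entries[("AU", "CSIRO", "ajw")] = "Andrew Waugh"
    merit = root.delegate("US").delegate("MERIT")
    merit.entries[("US", "MERIT", "clw")] = "Chris Weider"
    return root


if __name__ == "__main__":
    root = build_example()
    print(root.lookup(("AU", "CSIRO", "ajw")))
    print(root.lookup(("AU", "CSIRO", "nobody")))
```

In this toy tree the negative answer for ("AU", "CSIRO", "nobody") comes
back after contacting three DSAs, and the US branch is never visited.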

It is important to realise that this is a generic problem with highly
distributed databases. The X.500 designers chose to solve it by
structuring the data. This means that X.500 is suitable for storing
data which can be represented hierarchically and is less suitable
for storing data which cannot. Exactly what data will be suitable for
storing in X.500 is currently an open question - there is simply not
sufficient experience.

The proposed archive database which started this thread will have
exactly the same problem. The solution chosen, if different from the
one X.500 uses, will have problems as well. There is no such thing as
a perfect networking solution!

As for the rest of the posting, all I can say is that it must be great
to know so much about the costs and benefits of using X.500.
From my perspective, it is obvious that X.500 will not solve all
the world's problems (nothing ever does :-) but it is way too early
to be so dogmatic.  When we have had
        1) The necessary experience of implementing X.500, running
        X.500 databases and storing different types of data in such
        a database; and
        2) experience in alternative highly distributed databases.
        (X.500 might prove to be extremely poor for storing certain
        types of data - but the alternatives might be even worse.)
then we can be dogmatic.

andrew waugh

---------------------------------------------------------------------------

From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
Newsgroups: comp.protocols.tcp-ip,comp.archives.admin
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 19 Jun 91 02:53:15 GMT

Instead of complaining about how inappropriate X.500 is for all but the
simplest problems, why don't we identify the problems we're really
trying to solve?

I think that the Internet People Problem---make a *usable* database of
people on the Internet---embodies most, if not all, of the technical and
political difficulties that an archive service has to overcome. You want
to find that vocrpt package? I want to find Don Shearson Jr. You want to
find a SLIP-over-the-phone package? I want to find a SLIP-over-the-phone
expert. You want to know where you got this collection of poems? I want
to know where I got this phone number. You want to see what everyone
thinks of vocrpt now that you've found it? DISCO wants to get references
for Shearson from anyone who's willing to admit he's worked with him.

One advantage of starting from the Internet People Problem is that it
has a lot more prior art than the archive problem, from telephone
directories on up. Once we've solved it we can see how well the same
mechanisms handle data retrieval.

---Dan

---------------------------------------------------------------------------

From: carroll@ssc-vax (Jeff Carroll)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 18 Jun 91 20:47:08 GMT

In article <9106171742.AA26647@phloem.uoregon.edu> jqj@DUFF.UOREGON.EDU writes:
>From my perspective, EMV's important point is that there is no equivalent
>of a AAA (or any other national Automobile Association) to provide travel
>information services, personalized maps, triptics, towing services, motel
>certifications, etc. for normal people (rather than hackers like us).
>Perhaps what we need is a users' group that is tied to international
>networking rather than to a particular computer vendor or even to a
>particular internet or particular protocol stack.  "American Networking
>Association" anyone?

        Yup. Even with distributed file systems there's no way of knowing
_a_priori_ which volumes to mount, or to attempt to mount.

        There's something of a start being made toward this type of service
in the form of servers which provide information on other servers. An
example which comes to mind is "archie", a server (running at a location
that I haven't yet committed to memory) which serves as a pointer to other
archive sites for various types of code and documentation.

        Before netland becomes tractable to the general public, it will
have to emulate the functionality of a fairly complete videotex system.

        Another problem which will eventually have to be addressed (once
the issue of commercial use of the Internet is put to bed) is protocols
for handling commercial transactions (how to pay your dues to the Network
Association, or make a contribution to FTPers Anonymous).

Jeff Carroll            carroll@ssc-vax.boeing.com

"...and of their daughters it is written, 'Cursed be he who lies with
any manner of animal.'" - Talmud

---------------------------------------------------------------------------

From: randall@Virginia.EDU (Randall Atkinson)
Newsgroups: comp.protocols.tcp-ip
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 19 Jun 91 13:55:22 GMT

  I for one would like to see any NREN proposal include the keeping of
a "archive library catalog" that contains pointers to software
archives kept at various sites across the Internet.  It needn't keep
pointers to all of the tiny anonymous ftp sites, but an index by item
that included information on whether it were located on one or more of
the major archive sites (Simtel20, UUNET, WUarchive, prep, ucbvax,
grape, ncsa, etc.).
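
  A minimal sketch of such a catalog, assuming it is just an index from
item name to the major sites that carry it. The class and method names
are hypothetical, and the item-to-site placements in the example are
invented for illustration; only the site names come from the list
above.

```python
from collections import defaultdict

# Site names taken from the post; everything else here is illustrative.
MAJOR_SITES = {"Simtel20", "UUNET", "WUarchive", "prep", "ucbvax", "grape", "ncsa"}


class ArchiveCatalog:
    """Index by item of which major archive sites hold it."""

    def __init__(self, major_sites):
        self.major_sites = set(major_sites)
        self._sites_by_item = defaultdict(set)

    def register(self, item, site):
        """Record that site holds item; ignore sites outside the major list."""
        if site not in self.major_sites:
            return False
        self._sites_by_item[item.lower()].add(site)
        return True

    def where(self, item):
        """Return the sorted list of major sites known to hold item."""
        return sorted(self._sites_by_item.get(item.lower(), set()))


if __name__ == "__main__":
    catalog = ArchiveCatalog(MAJOR_SITES)
    catalog.register("vocrpt", "UUNET")         # hypothetical placement
    catalog.register("vocrpt", "prep")          # hypothetical placement
    catalog.register("vocrpt", "tiny-ftp-site") # skipped: not a major site
    print(catalog.where("VOCRPT"))
```

  Keeping only the major sites keeps the index small, at the cost of
missing items that live solely on the tiny anonymous ftp sites.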

  It would be *nice* if something like Simtel20 were made available
and kept orderly as part of the NREN, but it isn't essential because
there are enough kind sites already that the net could get by.  The
catalog pointing to where to look for things is already becoming a
problem.  When I was at GE, people generally used *me* or a couple of
other people as the catalog and that wasn't fun.  I know that the same
practice exists at a lot of sites -- someone who has been around a
little while becomes the catalog of "the usual places to look."

Just a few partly-baked thoughts...

Randall Atkinson
randall@Virginia.EDU

---------------------------------------------------------------------------

From: lars@spectrum.CMC.COM (Lars Poulsen)
Newsgroups: comp.protocols.tcp-ip,comp.archives.admin
Subject: Re: building an interstate (data) highway with no roadmaps
Date: 20 Jun 91 07:05:16 GMT

In article <EMV.91Jun18000345@bronte.aa.ox.com>
   emv@msen.com (Ed Vielmetti) writes:
>   X.500 services are directory oriented.  The data in them is relatively
>   small, of known value, and highly structured.  Information about
>   archive sources is just about completely counter to these basic
>   principles.

In article <WORLEY.91Jun18094957@sn1987a.compass.com>
   WOrley@compass.com (Dale Worley) writes:
>What can be done to produce good catalogs?  As Ed notes, archive
>information is likely to be bulky, chaotic, and of unknown (probably
>small) value.  Given how much money is needed to get a directory
>system for information without these problems running, it will
>probably take much more to get a good system for archive information
>working.

Actually, we know quite well what it takes to raise the signal-to-noise
ratio. Administration and moderation.

One possible option would be for the Internet Society to sponsor an
archive registration facility. Maybe each of the IETF task forces can
identify valuable programs that need to be archived, with mirrored
servers on each continent, available for NFS mounting as well as
anonymous FTP. It should be worth $50 for each site to have access to
good easily accessible archives instead of having to keep disk space
for everything in our own space. (I know; not every "hobby site" can
afford $50, but there are many commercial sites, including my own, that
would be happy to help feed such a beast; I'm sure many academic sites
would be able to help, too).
--
/ Lars Poulsen, SMTS Software Engineer
  CMC Rockwell  lars@CMC.COM