[list.ietf] draft-osids-resdescripx500

emv@ox.com (Edward Vielmetti) (06/18/91)

i'm not fond of this document, and i'd like to try to figure out why.

first problem is that there's already a perfectly good protocol
(Z39.50) for accessing structured data in MARC-like tagged formats.
it's extensible, much more so than any ASN.1 encoding is going to be,
and there are appropriate tools now being built that use this
information.  throwing all of this information into X.500 seems to be
a step backwards, since you're losing all of the good properties of
the tagged datastream.

i'd think you'd expect to look up internet resources in the card
catalog, not in the phone book.

some quibbles with the individual fields:

it's a shame that they need to be fixed length.  i don't know where
you picked the numbers from but it's not hard to believe that some
reasonable resources are going to want or need more than the size of
the field you have available.
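
to make the worry concrete, here is a small python sketch of what a
fixed-length field does to a value that doesn't fit.  the helper names
and the sample policy text are made up for illustration; 128 is only
used as an example width.

```python
def store_fixed(value, width):
    """Pad or truncate value to exactly `width` characters,
    the way a fixed-length field would have to."""
    return value[:width].ljust(width)


def lost(value, width):
    """Return the part of value that does not fit in the field."""
    return value[width:]


# Hypothetical charging policy, invented for this example.
SAMPLE = (
    "commercial: $12/connect-hour 0800-1800 weekdays, $6 evenings and "
    "weekends; educational: free off-peak, $3/connect-hour peak; "
    "surcharge for priority service"
)

if __name__ == "__main__":
    print(repr(store_fixed(SAMPLE, 128)))
    print("dropped:", repr(lost(SAMPLE, 128)))
```

anything past the width is silently gone unless the schema says what
to do with the overflow.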

the schema are too loosely defined to be of much value for automatic
processing. for instance, it would be ideal if the entry for the
t-shirt server gave enough of a precisely detailed description of what
commands needed to be sent where that it would be possible for the
system to connect you to it at the touch of a key.  descriptions like
"a string describing how to access the resource" make it sound like
this field is going to be useless for automatic processing, and the
unfortunate user is going to have to negotiate their way through the
inevitable cryptic prompts on their own.
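
a rough sketch of the kind of machine-readable access description i
have in mind, in python.  none of these field names come from the
draft; the host name, prompts and replies are invented, and a real
schema would need more than this.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AccessStep:
    expect: str  # prompt text the client waits for
    send: str    # what the client sends in reply


@dataclass
class AccessMethod:
    protocol: str  # e.g. "telnet"
    host: str
    port: int
    steps: List[AccessStep] = field(default_factory=list)


def script_for(method: AccessMethod) -> List[Tuple[str, str]]:
    """Flatten the dialogue into (expect, send) pairs that a client
    could replay without the user reading any prompts."""
    return [(s.expect, s.send) for s in method.steps]


# Hypothetical entry for the t-shirt server; every value is made up.
tshirt = AccessMethod(
    protocol="telnet",
    host="tshirt.example.org",
    port=23,
    steps=[AccessStep("login:", "guest"), AccessStep("menu>", "order")],
)

if __name__ == "__main__":
    for expect, send in script_for(tshirt):
        print(f"wait for {expect!r}, send {send!r}")
```

with something like this in the entry, "connect at the touch of a key"
becomes a matter of replaying the steps rather than reading a free-text
string.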

similar problems arise with the alternateProviders string (a good
scheme would also encode enough detail for you to select one),
costOfUse (measured in what?  dollars?  rubles?), sourceMachine
(subject to the same problems as the internet HINFO record, and not
necessarily relevant), and i could go on.

the costOfUse field is almost certainly too small -- 128 characters
doesn't even begin to address all of the possible variations on
charging policies (by time of day, quality of service, different for
commercial or educational users).  it's not going to be useful.
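
for comparison, a sketch of a structured cost record that names its
currency and unit and varies by user class and time of day.  the
field names, rates and units are all assumptions made up for this
example, not anything the draft defines.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rate:
    user_class: str  # e.g. "commercial" or "educational"
    start_hour: int  # inclusive, 0-23
    end_hour: int    # exclusive, 1-24
    amount: float
    currency: str    # e.g. "USD"
    unit: str        # e.g. "connect-hour"


def rate_for(rates: List[Rate], user_class: str, hour: int) -> Optional[Rate]:
    """Return the first rate matching the user class and hour of day,
    or None if the provider lists nothing that applies."""
    for r in rates:
        if r.user_class == user_class and r.start_hour <= hour < r.end_hour:
            return r
    return None


# Hypothetical charging policy; every number here is invented.
RATES = [
    Rate("educational", 0, 24, 0.0, "USD", "connect-hour"),
    Rate("commercial", 8, 18, 12.0, "USD", "connect-hour"),
    Rate("commercial", 18, 24, 6.0, "USD", "connect-hour"),
]

if __name__ == "__main__":
    print(rate_for(RATES, "commercial", 9))
    print(rate_for(RATES, "commercial", 3))
```

even this toy version says nothing about quality of service, which is
part of why a single short string is unlikely to be enough.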

the real problem is whether you can expect people to actually fill
in all of this data for new resources they bring on line, or to find
willing volunteers to do it for them.  this data is expensive to
collect, unimaginably bureaucratic, and largely useless to the
potential customer of these services.  I suspect that services based
on these schema will get voluntary compliance rates on the order of
the 3-5% that the Internet Resource Guide gets for archives -- only
the people who are the most dedicated to self-promotion will bother to
fill out and verify the 39 separate data items involved.  I have even
less confidence that they will keep the resources up to date.

I hope i'm not sounding too negative :-).  it's important to look at
standardizing the way we describe internet resources, and to provide
people browsing through internet directories with easy access to the
resources they run across or are looking for.  X.500 appears to be an
artificial choice to store this information in, and the descriptions of
the tables for each resource are not adequately defined to make them
usable by automatic processes.  Other existing protocols, namely
Z39.50, provide a more natural structure from which to approach this
problem.

-- 
Edward Vielmetti, moderator, comp.archives, MSEN Inc. 	emv@msen.com

"With all of the attention and publicity focused on gigabit networks,
not much notice has been given to small and largely unfunded research
efforts which are studying innovative approaches for dealing with
technical issues within the constraints of economic science."  
							RFC 1216

yeongw@spartacus.psi.com (06/19/91)

> i'm not fond of this document, and i'd like to try to figure out why.

I was not overly enthused with this document either, when I first
saw it, and have (privately) sent comments to Chris Weider about it.

However, I feel obliged to say something in Chris's defense here:

> first problem is that there's already a perfectly good protocol
> (Z39.50) for accessing structured data in MARC-like tagged formats.
>		[ ... ]
> i'd think you'd expect to look up internet resources in the card
> catalog, not in the phone book.

While I agree with Ed that Z39.50 is a more appropriate protocol
for the purposes of searching bibliographic databases, I have to
disagree with the implication ("i'd think you'd expect ... in the
card catalog, not the phone book") that Z39.50 is the answer to
the entire problem.

Basically, I view the problem of networked information retrieval
as actually consisting of three subproblems:

	- discovery: finding potential sources ("providers") of
	  information. In other words, finding out what's out there,
	  and who is providing what's out there.

	- searching: having found potential sources of information,
	  this involves meaningfully searching the provider's
	  data for the subset that is of interest.

	- delivery: having identified "interesting" data, having
	  it delivered.

Essentially, I think X.500 is a good mechanism to solve the discovery
problem/subproblem, by providing an "Information Yellow Pages" that
would list potential sources of information.

On the other hand, Z39.50 is a potentially good mechanism to address
and solve the second problem of searching. While some work still
needs to be done (and is being done) to get Z39.50 into a useful
form, I think it shows great potential to be a very powerful
search protocol. And for more than bibliographic data too.

So, I don't totally agree with either Chris or Ed: neither X.500 nor
Z39.50 represents the complete answer to the problem. Instead, each
forms part of the solution, in my opinion.
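
To illustrate how the three subproblems fit together, here is a
minimal Python sketch. It does not speak X.500 or Z39.50; the
dictionaries are in-memory stand-ins for a directory and for each
provider's data, and every name and record in it is invented for the
example.

```python
# Stand-in for an "Information Yellow Pages" (the discovery role).
DIRECTORY = {
    "library catalogs": ["catalog-a", "catalog-b"],
}

# Stand-in for each provider's searchable data (the searching role).
PROVIDERS = {
    "catalog-a": {"rec1": "X.500 directory services", "rec2": "gardening"},
    "catalog-b": {"rec3": "Z39.50 and X.500 compared"},
}


def discover(topic):
    """Discovery: find providers that claim to cover a topic."""
    return DIRECTORY.get(topic, [])


def search(provider, term):
    """Searching: find the record ids at one provider that match a term."""
    records = PROVIDERS[provider]
    return [rid for rid, text in records.items() if term.lower() in text.lower()]


def deliver(provider, record_id):
    """Delivery: fetch the content of one identified record."""
    return PROVIDERS[provider][record_id]


def retrieve(topic, term):
    """Chain the three steps: discover, then search, then deliver."""
    results = []
    for provider in discover(topic):
        for record_id in search(provider, term):
            results.append(deliver(provider, record_id))
    return results


if __name__ == "__main__":
    for item in retrieve("library catalogs", "x.500"):
        print(item)
```

The point of keeping the three functions separate is that each could
be backed by a different protocol without the others changing.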


Wengyik