laws@ai.sri.com (Kenneth I. Laws) (06/23/91)
'Scuse me, I'm new here. I've been following with great interest Ed's discussion of the volunteer-moderator problem, together with the cogent comments of others about sharing the load, growing the net into a real-world service, and giving people an incentive to properly catalog their own submissions.
I'm curious about one issue: is the goal to create a single (distributed) archive? I can see some efficiency advantages in avoiding duplicate storage, and some administrative advantages in having a single indexing system, but I don't see real-world analogies showing that this is the way to go. The Library of Congress is a special case, and one could argue for NASA engineering archives as a model. But there is no way that you can compete on a governmental scale.
Why are you not aiming for separate archives (cross-linked, of course) for each of the different discussion topics? Each would have its own librarians, consultants, or priests, and each would serve a fairly well-defined community. Access from outside the community would be by asking someone on the inside.
This is particularly pertinent if you wish to grow a commercial service -- as I believe you should. The existence of a free service like comp.archives makes the next step very difficult. (In a like manner, Prof. John McCarthy claims that the existence of the Arpanet eventually interfered with commercial network development, leading to the current revolutionary acceptance of a rather poor FAX standard.) Because the transition will be difficult, you will have to pay very close attention to market forces and realistic business principles.
The current comp.archives appears to be driven by "technology push": you have the data available, so you're saving it. Business doesn't work that way; it works by "pull." You have to find customers who need a specific type of data, then you let them pay for the archiving, indexing, and knowledgeable data experts.
As an extreme case, you can imagine a host of consultants, each with his or her own archive. Each consultant advertises a specialty, collects related data, indexes it according to personal needs, seeks out customers, prepares reports, and occasionally even publishes a book. Instead of following the consultant model, you seem to be following the public library model. Why? There's no money in it. -- Ken Laws
emv@msen.com (Ed Vielmetti) (06/24/91)
<excerpt> The current comp.archives appears to be driven by "technology push": you have the data available, so you're saving it. Business doesn't work that way; it works by "pull." You have to find customers who need a specific type of data, then you let them pay for the archiving, indexing, and knowledgeable data experts. </excerpt>
There's something that you're missing here, I think. No doubt there's some domain-specific knowledge involved in the production of comp.archives; it's useful to have a feel for which of the 1000+ archive sites in the world have the greatest likelihood of having current stuff, which authors are most reliable, who is best organized.
But there's more to it than that. One of the fundamental technologies involved is taking a piece of text and answering the question "Is this interesting?", or more likely "Is this likely to be interesting to Ed Vielmetti, or Chris Torek, or Mark Moraes, or Richard Stallman, or Mitch Kapor?" That's not an easy question, but if you can solve it (for free) for the person involved, then you can instantly market what you have to everyone else in the world who respects these people's opinions.
<excerpt> As an extreme case, you can imagine a host of consultants, each with his or her own archive. Each consultant advertises a specialty, collects related data, indexes it according to personal needs, seeks out customers, prepares reports, and occasionally even publishes a book. </excerpt>
That's a good model to follow, and I would hope to start following it. One of the things that's going to be part of the <tm> MSEN Archive Service </tm> which is not in comp.archives now is a further breakdown by subject classification; you'll be able to subscribe to "msen.archives.tex" and get just the latest and greatest on TeX software announcements and reviews, or "msen.archives.x" to track the progress of X11 stuff. You'll particularly want the last one once X11R5 rolls around.
Each of these collections will have its own archivist, who is responsible for quality control and additional research. I'm planning to apply the same technology to related fields as well, subject to the availability of some copyrighted information (and the time and investment to pull it off). For instance, an <tm> MSEN Patent Watch </tm> subscription would get you news of patent filings, cross-license agreements, technical information (and raw speculation) on the viability and challengeability of <kw> software patents </kw>, etc., culled from every available source and tagged (by experts) with an assessment of quality and value. I'd bet that this one could even make a go of it on paper.
<excerpt> Instead of following the consultant model, you seem to be following the public library model. Why? There's no money in it. </excerpt>
One of the problems with the consultant model is that it doesn't scale too well; you have to do all of the development yourself, and it's hard to find like-minded people because you're hoarding all of your efforts. By pursuing a strategy that includes some component of public service / pro bono / for the good of the net, and by aggressively tracking Internet standards (like the multipart, multimedia "richmail" spec), it's possible to get a substantial amount of goodwill, and perhaps enough visibility for people to take you seriously.
After all, this sort of thing is very old; it's just a high-tech "clipping service". It's something that I would do <o>just for myself</o> except that that hasn't been lucrative enough to buy the necessary hardware and software I'd need to store all of the interesting things I find, or to license the necessary rights to the copyrighted newsfeeds (let alone have anything left over for me). It doesn't matter if there's "no money in it", so long as the venture is self-supporting and sustainable.
<sig> Edward Vielmetti, vice president for research, MSEN Inc. 
emv@msen.com "MSEN Archive Service" and "MSEN Patent Watch" are trademarks of MSEN, Inc. <snappy-quote> On the Net, the Net-way is best. It's just that we are trying to figure out what the Net-way is. e. miya </snappy-quote> </sig> <comment> Markup information provided for use by news readers which implement the experimental "Mechanisms for Specifying and Describing Internet Message Bodies", available for anonymous ftp from <msen-archive-information> <site>thumper.bellcore.com</site> <directory>/pub/nsb</directory> </msen-archive-information> This text has been marked up in the hopes that someone will be able to print it out on paper and make it pretty! A five dollar reward goes to the first nice paper copy. Send submissions to <snail> Edward Vielmetti MSEN, Inc. 317 S. Division, Suite 218 Ann Arbor, MI 48104-2203 USA </snail> <markup> <kw> key words </kw> <o> emphasis </o> <tm> trademark </tm> <sig> signature </sig> <snail> paper mail ("snail mail") address </snail> <snappy-quote> when in doubt, quote an RFC. </snappy-quote> <msgid> message id </msgid> <from> from </from> <excerpt> <msgid> LAWS.91Jun22223423@sunset.ai.sri.com </msgid> <from> laws@ai.sri.com (Kenneth I. Laws) </from> </excerpt> </markup> </comment>
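The tag legend above hints at what a news reader could do with this markup, such as pulling out the keywords and trademarks of a post for indexing. What follows is only a minimal sketch in Python, not the richtext draft's own algorithm: the function name `collect_tagged` and the choice of `kw` and `tm` as the tags of interest are assumptions made for illustration, and it only handles simple, fully closed tag pairs of the kind used in the post.

```python
import re
from collections import defaultdict


def collect_tagged(text, wanted=("kw", "tm")):
    """Return {tag: [contents, ...]} for simple <tag> ... </tag> pairs.

    Hypothetical helper: only the tags named in `wanted` are matched, so a
    wanted tag nested inside some other tag (e.g. <excerpt>) is still found.
    Nesting of a tag inside itself is not handled.
    """
    names = "|".join(re.escape(t) for t in wanted)
    pair = re.compile(
        rf"<(?P<tag>{names})>\s*(?P<body>.*?)\s*</(?P=tag)>", re.DOTALL
    )
    found = defaultdict(list)
    for match in pair.finditer(text):
        # Collapse internal whitespace so contents wrapped across lines compare equal.
        found[match.group("tag")].append(" ".join(match.group("body").split()))
    return dict(found)


# Sample text assembled from phrases in the post above.
sample = (
    "an <tm> MSEN Patent Watch </tm> subscription would get you news on "
    "<kw> software patents </kw>, and the <tm> MSEN Archive Service </tm> too"
)

if __name__ == "__main__":
    print(collect_tagged(sample))
```

Because every end tag is written out in full, a single regular expression is enough here; no document type definition or SGML parser is needed.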
rodney@sun.ipl.rpi.edu (Rodney Peck II) (06/24/91)
In article <LAWS.91Jun22223423@sunset.ai.sri.com> laws@ai.sri.com (Kenneth I. Laws) writes:
>(... Prof. John McCarthy claims
>that the existence of the Arpanet eventually interfered with
>commercial network development, leading to the current
>revolutionary acceptance of a rather poor FAX standard.)
I think Prof. John McCarthy is making an awful lot of assumptions. FAXes and the internet are not all that closely related. Maybe if you want to make some sort of argument that the internet had stalled commercial development of telephone switching networks and their digital side, you might have something. Then again, you probably wouldn't, since the internet (including the global portions) is extremely small compared to the phone switching networks.
>Instead of following the consultant model, you seem to be
>following the public library model. Why? There's no money
>in it.
Because there's more to life than money. Comp.archives seemed to me to be a project that developed as a Neat Thing that was useful to many people, not a way for some people to get rich.
-- Rodney
cmf851@anu.oz.au (Albert Langer) (06/25/91)
In article <EMV.91Jun23144034@bronte.aa.ox.com> emv@msen.com (Ed Vielmetti) writes (many things related to an interesting discussion I don't have time to participate in, so I'm just responding on the occasional side issue):
>Markup information provided for use by news readers which implement
>the experimental "Mechanisms for Specifying and Describing Internet
>Message Bodies", available for anonymous ftp from
The markup appears to be based on SGML (Standard Generalized Markup Language, which has an ISO standard and is indeed suitable for maintaining both text databases and revisable-form rich text documents via news). However if a suitable SGML document type HAS been defined for your purposes then you ought to publish it and reference it as a public text. Then you can use a MUCH less verbose (but equally readable) notation - e.g. omitting or shortening most of the end markers and making use of various abbreviations and typist techniques.
-- Opinions disclaimed (Authoritative answer from opinion server) Header reply address wrong. Use cmf851@csc2.anu.edu.au
emv@msen.com (Ed Vielmetti) (06/25/91)
<par>
As far as it is feasible, the IETF "richmail" project is being pushed to use as simple a subset of SGML as possible so that people can type it in by hand and not have it distract too much from the actual text.
</par>
<excerpt>
In article 1991Jun24.193928.21180@newshost.anu.edu.au cmf851@anu.oz.au (Albert Langer) writes: However if a suitable SGML document type HAS been defined for your purposes then you ought to publish it and reference it as a public text. Then you can use a MUCH less verbose (but equally readable) notation - e.g. omitting or shortening most of the end markers and making use of various abbreviations and typist techniques.
</excerpt>
<par>
There are good reasons not to use the SGML minimization rules, not the least of which is to minimize the amount of work that "dumb" user agents have to do to strip out the formatting information. To quote from the internet draft --
<excerpt>
NOTE ON THE RELATIONSHIP OF RICHTEXT TO SGML: Richtext is decidedly not SGML, and should not be used to transport arbitrary SGML documents. Those who wish to use SGML document types as a mail transport format should define a new text-plus subtype, e.g. "text-plus/sgml-dtd-whatever". Richtext is designed to be compatible with SGML, and specifically so that it will be possible to define a richtext DTD if that is desired. However, this does not imply that arbitrary SGML can be called richtext, nor that richtext implementors have any need to understand SGML; the description in this memo is a complete definition of richtext.
</excerpt>
The approach of avoiding the complicated minimization rules facilitates treatment of the text by more general systems, such as Open Text System's PAT, which can be taught to recognize very simple tagging schemes but which doesn't have facilities for disambiguating whether a minimized end-tag matches one or more begin-tags.
I also hope to have a system built in GNU Emacs, and while the richtext scheme seems easy enough with it, I don't have any intention of hacking full-blown SGML in emacs.
</par>
<par>
As an extreme example, all of the markup in this document is one tag per line, which is extremely easy to wipe out even with grep -v.
</par>
<sig>
Edward Vielmetti, vice president for research, MSEN Inc. emv@msen.com
<snappy-quote>
By the way, Ed, I think you may be the first person in the history of the world to successfully send a multifont email message to someone who wasn't using the same software with which the message was composed. Congratulations! nsb@thumper.bellcore.com
</snappy-quote>
</sig>
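The point about "dumb" user agents can be made concrete. The following is a rough Python sketch of the two stripping strategies the post alludes to: dropping whole lines that contain only a tag (the effect of a `grep -v` with a suitable pattern, which the post does not spell out), and removing tags that appear inline. The function names and the assumption that tag names consist of lowercase letters and hyphens are illustrative, inferred from the tags used in this thread rather than taken from the richtext draft.

```python
import re

# Assumption: tag names are lowercase letters and hyphens, as in this thread
# (<par>, <sig>, <snappy-quote>, ...). Article IDs such as
# <LAWS.91Jun22223423@sunset.ai.sri.com> do not match and are left alone.
TAG_ONLY_LINE = re.compile(r"^\s*</?[a-z-]+>\s*$")
INLINE_TAG = re.compile(r"</?[a-z-]+>")


def drop_tag_lines(text):
    """Line-based stripping: discard every line that holds nothing but a tag."""
    kept = [line for line in text.splitlines() if not TAG_ONLY_LINE.match(line)]
    return "\n".join(kept)


def strip_inline_tags(text):
    """Remove tags embedded in running text and tidy the leftover spacing."""
    without_tags = INLINE_TAG.sub("", text)
    return re.sub(r"[ \t]{2,}", " ", without_tags).strip()


# A message laid out one tag per line, as the post describes.
message = "\n".join([
    "<par>",
    "As an extreme example, all of the markup in this document",
    "is one tag per line.",
    "</par>",
    "<sig>",
    "Edward Vielmetti",
    "</sig>",
])

if __name__ == "__main__":
    print(drop_tag_lines(message))
    print(strip_inline_tags("something I would do <o>just for myself</o> except"))
```

Neither function needs to know which begin-tag an end-tag closes, which is the property that would be lost if minimized end-tags were allowed.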