emv@msen.com (Ed Vielmetti) (06/24/91)
<excerpt> The current comp.archives appears to be driven by "technology push": you have the data available, so you're saving it. Business doesn't work that way; it works by "pull." You have to find customers who need a specific type of data, then you let them pay for the archiving, indexing, and knowledgeable data experts. </excerpt> There's something that you're missing here, I think. No doubt there's some domain-specific knowledge involved in the production of comp.archives; it's useful to have a feel for which of the 1000+ archive sites in the world have the greatest likelihood of having current stuff, which authors are most reliable, who is best organized. But there's more to it than that. One of the fundamental technologies involved is taking a piece of text and answering the question "Is this interesting?", or more likely "Is this likely to be interesting to Ed Vielmetti, or Chris Torek, or Mark Moraes, or Richard Stallman, or Mitch Kapor?" That's not an easy question, but if you can solve it (for free) for the person involved, then you can instantly market what you have to everyone else in the world who respects these people's opinions. <excerpt> As an extreme case, you can imagine a host of consultants, each with his or her own archive. Each consultant advertises a specialty, collects related data, indexes it according to personal needs, seeks out customers, prepares reports, and occasionally even publishes a book. </excerpt> That's a good model to follow, and I would hope to start following it. One of the things that's going to be part of the <tm> MSEN Archive Service </tm> which is not in comp.archives now is a further breakdown by subject classification; you'll be able to subscribe to "msen.archives.tex" and get just the latest and greatest on TeX software announcements and reviews, or "msen.archives.x" to track the progress of X11 stuff. You'll particularly want the last one once X11R5 rolls around. 
Each of these collections will have its own archivist, who is responsible for quality control and additional research. I'm planning to apply the same technology to related fields as well, subject to the availability of some copyrighted information (and the time and investment to pull it off). For instance, an <tm> MSEN Patent Watch </tm> subscription would get you news of patent filings, cross-license agreements, technical information (and raw speculation) on the viability and challengeability of <kw> software patents </kw>, etc., culled from every available source and tagged (by experts) with an assessment of quality and value. I'd bet that this one could even make a go of it on paper. <excerpt> Instead of following the consultant model, you seem to be following the public library model. Why? There's no money in it. </excerpt> One of the problems with the consultant model is that it doesn't scale too well; you have to do all of the development yourself, and it's hard to find like-minded people because you're hoarding all of your efforts. By pursuing a strategy that includes some component of public service / pro bono / for the good of the net, and by aggressively tracking Internet standards (like the multipart, multimedia "richmail" spec), it's possible to get a substantial amount of goodwill, and perhaps enough visibility for people to take you seriously. After all, this sort of thing is very old; it's just a high-tech "clipping service". It's something that I would do <o>just for myself</o> except that that hasn't been lucrative enough to buy the necessary hardware and software I'd need to store all of the interesting things I find, or to license the necessary rights to the copyrighted newsfeeds (let alone have anything left over for me). It doesn't matter if there's "no money in it", so long as the venture is self-supporting and sustainable. <sig> Edward Vielmetti, vice president for research, MSEN Inc. 
emv@msen.com "MSEN Archive Service" and "MSEN Patent Watch" are trademarks of MSEN, Inc. <snappy-quote> On the Net, the Net-way is best. It's just that we are trying to figure out what the Net-way is. e. miya </snappy-quote> </sig> <comment> Markup information provided for use by news readers which implement the experimental "Mechanisms for Specifying and Describing Internet Message Bodies", available for anonymous ftp from <msen-archive-information> <site>thumper.bellcore.com</site> <directory>/pub/nsb</directory> </msen-archive-information> This text has been marked up in the hopes that someone will be able to print it out on paper and make it pretty! A five dollar reward goes to the first nice paper copy. Send submissions to <snail> Edward Vielmetti MSEN, Inc. 317 S. Division, Suite 218 Ann Arbor, MI 48104-2203 USA </snail> <markup> <kw> key words </kw> <o> emphasis </o> <tm> trademark </tm> <sig> signature </sig> <snail> paper mail ("snail mail") address </snail> <snappy-quote> when in doubt, quote an RFC. </snappy-quote> <msgid> message id </msgid> <from> from </from> <excerpt> <msgid> LAWS.91Jun22223423@sunset.ai.sri.com </msgid> <from> laws@ai.sri.com (Kenneth I. Laws) </from> </excerpt> </markup> </comment>
emv@msen.com (Ed Vielmetti) (06/25/91)
<par> As far as is feasible, the IETF "richmail" project is being pushed to use as simple a subset of SGML as possible, so that people can type it in by hand and not have it distract too much from the actual text. </par> <excerpt> in article 1991Jun24.193928.21180@newshost.anu.edu.au cmf851@anu.oz.au (Albert Langer) writes: However if a suitable SGML document type HAS been defined for your purposes then you ought to publish it and reference it as a public text. Then you can use a MUCH less verbose (but equally readable) notation - e.g. omitting or shortening most of the end markers and making use of various abbreviations and typist techniques. </excerpt> <par> There are good reasons not to use the SGML minimization rules, not the least of which is to minimize the amount of work that "dumb" user agents have to do to strip out the formatting information. To quote from the internet draft -- <excerpt> NOTE ON THE RELATIONSHIP OF RICHTEXT TO SGML: Richtext is decidedly not SGML, and should not be used to transport arbitrary SGML documents. Those who wish to use SGML document types as a mail transport format should define a new text-plus subtype, e.g. "text-plus/sgml-dtd-whatever". Richtext is designed to be compatible with SGML, and specifically so that it will be possible to define a richtext DTD if that is desired. However, this does not imply that arbitrary SGML can be called richtext, nor that richtext implementors have any need to understand SGML; the description in this memo is a complete definition of richtext. </excerpt> The approach of avoiding the complicated minimization rules facilitates treatment of the text by more general systems, such as Open Text System's PAT, which can be taught to recognize very simple tagging schemes but which don't have facilities for disambiguating whether a minimized end-tag matches one or more begin-tags. 
I also hope to have a system built in GNU Emacs, and while the richtext scheme seems easy enough with it, I don't have any intention of hacking full-blown SGML in Emacs. </par> <par> As an extreme example, all of the markup in this document is one tag per line, which is extremely easy to wipe out even with grep -v. </par> <sig> Edward Vielmetti, vice president for research, MSEN Inc. emv@msen.com <snappy-quote> By the way, Ed, I think you may be the first person in the history of the world to successfully send a multifont email message to someone who wasn't using the same software with which the message was composed. Congratulations! nsb@thumper.bellcore.com </snappy-quote> </sig>
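The "one tag per line" point above can be illustrated with a rough sketch of what a "dumb" user agent might do. This is only an assumption-laden example, not anything from the richtext draft: the `TAG_LINE` pattern and the `strip_tag_lines` helper are hypothetical names, and the sketch assumes each tag sits alone on its own line, as in the original posting. It behaves roughly like running grep -v over the message with a pattern that matches tag-only lines.

```python
import re

# Hypothetical pattern (an assumption, not from the richtext draft): a line
# that holds exactly one opening or closing tag, e.g. "<excerpt>" or "</sig>".
TAG_LINE = re.compile(r"^\s*</?[A-Za-z][A-Za-z0-9-]*>\s*$")


def strip_tag_lines(text: str) -> str:
    """Drop every line that consists of nothing but a single tag."""
    kept = [line for line in text.splitlines() if not TAG_LINE.match(line)]
    return "\n".join(kept)


# A small message laid out one tag per line.
sample = "\n".join([
    "<par>",
    "Richtext is decidedly not SGML.",
    "</par>",
    "<sig>",
    "emv@msen.com",
    "</sig>",
])

plain = strip_tag_lines(sample)
print(plain)
```

Because no end tags are minimized, the filter never has to work out which begin-tag a close belongs to; it only has to recognize that a line is a tag. Tags embedded in the middle of a text line are deliberately left alone in this sketch.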