[comp.archives] comp.archives disappears for a while

emv@msen.com (Edward Vielmetti, moderator) (04/26/91)

Archive-name: archives/admin/administrivia/0--
Original-posting-by: emv@msen.com (Edward Vielmetti, moderator)
Original-subject: comp.archives disappears for a while
Reposted-by: emv@msen.com (Edward Vielmetti, MSEN)

comp.archives will be on hiatus until some time in May; if
you're not getting your usual feed of articles, it's because there
aren't any being posted.  I'll still be screening incoming articles
for likely candidates, so nothing will be completely missed; the
challenge will be for me to weed through an enormous number of postings
to find the good stuff.

Here's how you, as someone who wants to have their postings noticed
and distributed through comp.archives, can help.

- Informative subject headers.  They should include the name of the
  package and the word "available", e.g.
	Subject: (package) -- (bright shiny wrapped package) available

- A magic phrase in the body of the posting.  This is what I would
  prefer, all on one line:
	(package) is available via anonymous ftp from

- A description of the location of the package, in the same
  unambiguous notation that comp.archives uses, all on one line:
	ftp.domain.org:/pub/package-1.00.tar.Z

- A nice long healthy hunk of text describing what the package does,
  who worked on it, what other packages it is similar to, maybe a
  little bit about how it works.  About 5K of text is more or less
  right.  Pretend that there's a full text indexer that is going to be
  storing the text, and lard it with all of the appropriate buzzwords
  that people in your field would be looking for if they were going to
  search for it.

If you follow these guidelines, there's a good chance your
announcement will be picked out from the 1000-odd other random things that
will no doubt pile up over time.  The other reasonable way to get my
attention is to drop a short note to me (emv@msen.com) and let me know
when and where you posted something; that should let me find it
quickly too.
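
For posters who want to check an announcement before sending it, here
is a rough sketch in Python of the conventions listed above.  The
phrase and the host:/path notation come from the guidelines; the
function name, the regular expressions, and the field names are
assumptions made up for illustration, not the tools actually used to
produce comp.archives.

```python
import re

# The phrase and the host:/path notation are taken from the guidelines;
# the regular expressions and names below are illustrative assumptions.
MAGIC_PHRASE = re.compile(r"\bis available via anonymous ftp from\b", re.IGNORECASE)
LOCATION_LINE = re.compile(
    r"^[ \t]*[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+:/\S*[ \t]*$", re.MULTILINE
)


def check_posting(subject, body):
    """Report which of the suggested conventions a posting follows."""
    return {
        "subject_says_available": "available" in subject.lower(),
        "has_magic_phrase": MAGIC_PHRASE.search(body) is not None,
        "has_location_line": LOCATION_LINE.search(body) is not None,
        "body_chars": len(body),  # compare against the "about 5K" suggestion
    }
```

Because the phrase pattern uses literal spaces, a phrase that has been
wrapped across two lines will not match, which is the point of asking
for it all on one line.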

comp.archives is not (as I might have misled some people into
believing!) completely automated!  A fair amount of the work of
locating articles is automated, but the tools that
disambiguate between "package is available via anonymous ftp from" and
"where can i ftp a copy from" and "we need a net connection so we can
telnet, ftp, etc" are in their formative stages.  
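
To give a feel for the disambiguation problem, here is a deliberately
naive sketch in Python that sorts the three example phrases above into
different bins.  The patterns, the category names, and the function
name are all assumptions for illustration; real postings are far
messier, and this is not the tool described here.

```python
import re

# Hand-written rules, checked from most to least specific.
# All three patterns are illustrative guesses, not a tested rule set.
ANNOUNCE = re.compile(r"\bavailable (?:via|by|for) anonymous ftp\b", re.IGNORECASE)
REQUEST = re.compile(r"\b(?:where|how) can i (?:ftp|get|find)\b", re.IGNORECASE)
MENTIONS_FTP = re.compile(r"\bftp\b", re.IGNORECASE)


def classify(line):
    """Label one line as an announcement, a request, a mention, or unrelated."""
    if ANNOUNCE.search(line):
        return "announcement"
    if MENTIONS_FTP.search(line) and REQUEST.search(line):
        return "request"
    if MENTIONS_FTP.search(line):
        return "mention"
    return "unrelated"
```

Only the "announcement" bin would be worth passing along for a human
to look at; everything else still mentions ftp but points at nothing
that can be fetched.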

There's a database mapping packages to categories that makes the
Archive-name production reasonably straightforward, though my date
parser goofs up every so often, and it's sometimes hard to categorize
things.  I don't know how useful that all is; I try mostly to keep
related packages together to the extent that that's even possible.

A sticky part is getting the Archive: or Archive-directory: piece
right so that people can use comp.archives to drive an automatic ftp
fetcher.  archie helps if something's been around for a while, but for
new stuff it's relatively time-consuming to track down exactly where
on a site a package lives and how its name is spelled if the poster
gets it wrong.
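
Here is a small sketch in Python of why the exact spelling matters to
an automatic fetcher: it splits a location written in the host:/path
notation shown earlier into the pieces an ftp client would need.  The
names ArchiveLocation and parse_location are hypothetical, and the
real layout of the Archive: and Archive-directory: headers may differ
from what is assumed here.

```python
import posixpath
from typing import NamedTuple


class ArchiveLocation(NamedTuple):
    host: str       # e.g. ftp.domain.org
    directory: str  # e.g. /pub
    filename: str   # empty when the location names a directory


def parse_location(spec):
    """Split 'host:/dir/file' into host, directory, and filename."""
    host, sep, path = spec.strip().partition(":")
    if not sep or not host or not path.startswith("/"):
        raise ValueError("expected host:/path, got %r" % (spec,))
    directory, filename = posixpath.split(path)
    return ArchiveLocation(host, directory, filename)
```

A fetcher would connect to the host, change to the directory, and
retrieve the filename; a single wrong character in any of the three
makes the fetch fail, so the location has to be checked by hand when
the poster gets it wrong.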

comp.archives in the future...

MSEN (as you've noticed from the headers) is slowly becoming the
primary place from which I will be producing comp.archives.  This
moves it out of the realm of "done on company time and they don't mind
that much" to "done as a service of a for-profit company with the hope
of getting a return on investment".  MSEN, Inc. is a small startup
company whose primary goals are not to lose too much money, to
put itself on the Internet, and to provide a range of network
services.  As such I'll be trying to put comp.archives on a
self-supporting basis, i.e. charging money to someone somewhere, or
getting money from someone somewhere.

What that means is that I need a credible threat :-) and to say "if
you like comp.archives, support it, because if it's too much of a
drain on MSEN's time and my time it'll go away in a year".  What
exactly that support is is left intentionally vague for the moment
because I don't have a good answer.  It might mean spin-off products,
like putting some extracts of the collection on paper and selling them
on late-night TV "THOUSANDS of programs absolutely FREE with your paid
Internet subscription!".  There's a reasonable chance of providing
pay-per-view or subscription access to a nice full-text full-screen
browser for the collection; that would make a lot of
comp.sources.wanted obsolete, sort of an "archie" for comp.archives.
I suppose that I could make comp.archives into "msen.archives", with a
Clarinet-like distribution policy, though the techno-legal nonsense of
restricting distribution of other people's words puts a sour taste in
my mouth.

Indeed there are any number of things that would be awful nice to have
but which would take a lot of work.  *It's rapidly approaching the
stage where it's going to take a serious infusion of technology to
keep comp.archives running.*  I really want to end up with something
much more generalizable, something you could tell "look for
all of the interesting articles about frame relay and whatever related
subjects on whatever lists, and just show me the interesting ones",
and it would find them for you; the newsreaders of the 1990's have
huge piles of news to weed through and only the puniest of newsgroups
to split them down with.  Thus, I see no choice but to try and force
the issue and say "if the technology doesn't get better within a year,
I give up, I can't keep up with it all."

Technology costs money, at least the technology that I'm aware of.
Topic (by Verity) could be a very useful tool for scanning through
news.  There's some very good full-text searching and X11 based
presentation software made by the folks who did the New OED which
would be fun to throw at the task.  No doubt there are expert systems
and lexical analyzers which could chew on articles and spit out just
exactly the ones that I would pick, all neatly tagged like I would tag
them, and send them to comp.archives.  (I wish.)

whew.

Comments should go directly to me (emv@msen.com) or to comp.sources.d,
and you should all vote YES on comp.archives.admin (send votes to
kent@uunet.uu.net). 

-- 
 Msen	Edward Vielmetti
/|---	moderator, comp.archives
	emv@msen.com

"(6) The Plan shall identify how agencies and departments can
collaborate to ... expand efforts to improve, document, and evaluate
unclassified public-domain software developed by federally-funded
researchers and other software, including federally-funded educational
and training software; "
			High-Performance Computing Act of 1991, S. 218