[comp.archives] [administrivia] format of comp.archives articles

emv@ox.com (Edward Vielmetti) (12/03/90)

Archive-name: comp.archives/administrivia/format/1990-12-02
Archive-directory: cs.toronto.edu:/comp.archives/ [128.100.1.65]

This posting describes what's in a comp.archives article other than
the text that the original author put in it.

The Subject: header has an extra thing stuck on the front of it which
denotes the newsgroup that the posting originally came from for ease
of article searching or killing.  The original newsgroups are also
preserved intact in the X-Original-Newsgroups line.

Occasionally I'll put on a new subject if the old one was particularly
uninformative.

The Reply-To: and Followup-To: lines are filled out as best I can to
point replies back to the original author and followups back to an
appropriate newsgroup.

There are several "auxiliary" headers that I add on.  These cannot
ordinarily be hidden by newsreaders.

The first is the Archive-name: header.  This is intended as a suitable
file name under which you might store the article using an archive
program like "rkive" or others of its ilk.  Starting in December 1990,
this header is formatted as follows.  Note that the dates should sort
properly.  I do not have the means at this point to guarantee that two
articles will not be posted with the same Archive-name.

Archive-name: category/subcategory/package/yyyy-mm-dd
Archive-name: x11/kanji/kterm/1990-12-01
Archive-name: fonts/chinese/crl.nmsu.edu/1990-12-01
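
For tool writers, here is a minimal sketch of splitting an
Archive-name: header into its four parts.  The names ArchiveName and
parse_archive_name are made up for this illustration and are not part
of rkive or any other existing tool; the sketch assumes the value
always has exactly four slash-separated fields, as in the examples
above.

```python
from datetime import date
from typing import NamedTuple


class ArchiveName(NamedTuple):
    """Hypothetical container for the four Archive-name fields."""
    category: str
    subcategory: str
    package: str
    posted: date


def parse_archive_name(header_line: str) -> ArchiveName:
    """Split 'Archive-name: category/subcategory/package/yyyy-mm-dd'."""
    name, _, value = header_line.partition(":")
    if name.strip().lower() != "archive-name":
        raise ValueError("not an Archive-name header: %r" % header_line)
    parts = value.strip().split("/")
    if len(parts) != 4:
        raise ValueError("expected category/subcategory/package/yyyy-mm-dd")
    category, subcategory, package, stamp = parts
    # The yyyy-mm-dd layout is what makes plain string sorting agree
    # with chronological order.
    year, month, day = (int(field) for field in stamp.split("-"))
    return ArchiveName(category, subcategory, package, date(year, month, day))
```

Because the date comes last and is zero-padded, sorting the raw header
values as strings puts postings about the same package in date order.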

I keep a table of package -> category mappings, which might help this
stay consistent.  The categorization scheme is only as good as the
information that I get; what seems to be most helpful in this regard
are the periodic comparative reviews of 4-12 programs in the same
general area.  

The Archive: or Archive-directory: header follows.  It is intended as
a complete reference by which you can grab the entire package
automatically.  Given the vagaries of FTP, there is no guarantee that
the thing will not have moved before you get there, but it should be
at worst a good clue.  If there is a whole directory full of files to
be retrieved, this is noted by an Archive-directory: header in roughly
the same format.

Archive: host.domain.org:/pub/directory/package-nn.n.tar.Z [128.64.32.16]
Archive: expo.lcs.mit.edu:/contrib/kterm-4.1.1.tar.Z [18.30.0.212]

Archive-directory: host.domain.org:/pub/directory/ [128.64.32.16]
Archive-directory: crl.nmsu.edu:/pub/chinese/ [128.123.1.14]
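
The layout above (host, colon, path, then the numeric address in
square brackets) can be picked apart with one regular expression.
This is only a sketch: ArchiveRef and parse_archive_ref are
hypothetical names, and treating the bracketed address as optional is
an assumption, since every example here includes it.

```python
import re
from typing import NamedTuple, Optional


class ArchiveRef(NamedTuple):
    """Hypothetical container for one Archive: or Archive-directory: line."""
    is_directory: bool
    host: str
    path: str
    address: Optional[str]


# header name, host, ':' separator, path, optional "[a.b.c.d]" address
_ARCHIVE_RE = re.compile(
    r"^(Archive|Archive-directory):\s*"
    r"([^\s:]+):(\S+)"
    r"(?:\s+\[([0-9.]+)\])?\s*$"
)


def parse_archive_ref(header_line: str) -> ArchiveRef:
    """Parse 'Archive: host:/path [address]' into its parts."""
    match = _ARCHIVE_RE.match(header_line)
    if match is None:
        raise ValueError("not an Archive header: %r" % header_line)
    kind, host, path, address = match.groups()
    return ArchiveRef(kind == "Archive-directory", host, path, address)
```

The numeric address is kept separately so a fetching tool can fall
back on it if the host name does not resolve.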

If you are a GNU emacs user, the "ange-ftp" package from
Archive: tut.cis.ohio-state.edu:/pub/gnu/emacs/elisp-archive/packages/ange-ftp.el.Z [128.146.8.60]
will (more or less) let you point to an Archive: reference and
retrieve the file without further typing.  Work is in progress to come
up with tools that you can pipe a comp.archives article into and have
it drop into a sensible place.  (Probably it would be enough to hack
on batchftp.)

The last header is the Reposted-by: header, which lets you know which
of the multiple moderators (if there were to be such a thing) was
responsible.  Also to remind you who is doing the work :-).

Any questions on archive formats should go to me.  If you are writing
tools that archive comp.archives postings or that fetch stuff from
them, let me know.  

Note the address change (effective 1 Dec 1990) to "emv@ox.com" instead
of "emv@math.lsa.umich.edu".  

--Ed
Edward Vielmetti
moderator, comp.archives
emv@ox.com
archives@ox.com

emv@ox.com (Edward Vielmetti) (12/22/90)

Archive-name: comp.archives/administrivia/format/1990-12-21
Archive-directory: cs.toronto.edu:/comp.archives/ [128.100.1.65]

This posting describes what's in a comp.archives article other than
the text that the original author put in it.

The Subject: header has an extra thing stuck on the front of it which
notes the newsgroup that the posting originally came from for ease of
article searching or killing.  The original newsgroups are also
preserved intact in the X-Original-Newsgroups line.

Occasionally I'll put on a new subject if the old one was particularly
uninformative.

The Reply-To: and Followup-To: lines are filled out as best I can to
point replies back to the original author and followups back to an
appropriate newsgroup.

There are several "auxiliary" headers that I add on.  These cannot
ordinarily be hidden by newsreaders.

The first is the Archive-name: header.  This is intended as a suitable
file name under which you might store the article using an archive
program like "rkive" or others of its ilk.  Starting in December 1990,
this header is formatted as follows.  Note that the dates should sort
properly.  I do not have the means at this point to guarantee that two
articles will not be posted with the same Archive-name.

Archive-name: category/subcategory/package/yyyy-mm-dd
Archive-name: x11/kanji/kterm/1990-12-01
Archive-name: fonts/chinese/crl.nmsu.edu/1990-12-01

I keep a table of package -> category mappings, which helps this stay
consistent.  The categorization scheme is only as good as the
information that I get; what seems to be most helpful in this regard
are the periodic comparative reviews of 4-12 programs in the same
general area.

The Archive: or Archive-directory: header follows.  It is intended as
a complete reference by which you can grab the entire package
automatically.  Given the vagaries of FTP, there is no guarantee that
the thing will not have moved before you get there, but it should be
at worst a good clue.  If there is a whole directory full of files to
be retrieved, this is noted by an Archive-directory: header in roughly
the same format.

Archive: host.domain.org:/pub/directory/package-nn.n.tar.Z [128.64.32.16]
Archive: expo.lcs.mit.edu:/contrib/kterm-4.1.1.tar.Z [18.30.0.212]

Archive-directory: host.domain.org:/pub/directory/ [128.64.32.16]
Archive-directory: crl.nmsu.edu:/pub/chinese/ [128.123.1.14]

The Archive: header may include a wildcard in the file name, suitable
for use in an FTP "mget".  If anything it will point to too few files
rather than too many; i.e. an index file rather than an entire
(possibly huge) directory.  Be extra cautious when fetching an entire
directory, for it may be huge!
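
One cautious way to handle a wildcard is to list the directory first
and see which names the pattern would pick up before fetching
anything.  The sketch below uses Python's fnmatch module as a
stand-in for the globbing an FTP client or server would do on "mget";
real matching rules may differ from site to site.  The function names
and the sample pattern and listing in the usage note are made up for
illustration.

```python
import fnmatch
import posixpath


def has_wildcard(path: str) -> bool:
    """True if the file name part of the path contains glob characters."""
    return any(ch in posixpath.basename(path) for ch in "*?[")


def select_files(archive_path: str, listing: list) -> list:
    """Return full paths of the listed names matching the wildcard.

    archive_path is the path part of an Archive: header; listing is a
    list of bare file names from the directory holding that path.
    """
    directory, pattern = posixpath.split(archive_path)
    return [
        posixpath.join(directory, name)
        for name in listing
        if fnmatch.fnmatchcase(name, pattern)  # case-sensitive match
    ]
```

For example, select_files("/contrib/kterm-*.tar.Z", names) keeps only
the kterm tar files out of whatever names the listing returned, so you
can check the count and sizes before committing to the transfer.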

If you are a GNU emacs user, the "ange-ftp" package from
Archive: tut.cis.ohio-state.edu:/pub/gnu/emacs/elisp-archive/packages/ange-ftp.el.Z [128.146.8.60]
will let you (more or less) point to an Archive: reference and
retrieve the file without further typing. 

A good tool to have would be one which reads a comp.archives article
and issues the proper commands to FTP to fetch the file.  Work is in
progress on such a thing.  The author(s) of the best program available
to do this task on 1 May 1991 will receive a prize (yet to be
determined).
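
As a starting point only (and certainly not the prize-winning
program), here is a sketch of the idea: read the headers of an article
and produce the commands one would type at a standard "ftp" prompt.
It opens no connection and leaves login to the client.  The function
name ftp_commands is hypothetical, and two assumptions are built in:
the headers end at the first blank line, and an Archive-directory:
reference is only listed with "dir", never fetched, since the
directory may be huge.

```python
import posixpath
import re

# header name, host, path; the bracketed numeric address is ignored here
_ARCHIVE_RE = re.compile(
    r"^(Archive|Archive-directory):\s*([^\s:]+):(\S+)(?:\s+\[[0-9.]+\])?\s*$"
)


def ftp_commands(article: str) -> list:
    """Turn the Archive headers of one article into ftp client commands."""
    commands = []
    for line in article.splitlines():
        if not line.strip():
            break  # assumption: first blank line ends the headers
        match = _ARCHIVE_RE.match(line)
        if match is None:
            continue
        kind, host, path = match.groups()
        commands.append("open " + host)
        if kind == "Archive-directory":
            # List only; fetching a whole directory is left to the user.
            commands += ["cd " + path, "dir"]
        else:
            directory, filename = posixpath.split(path)
            verb = "mget" if any(ch in filename for ch in "*?[") else "get"
            commands += ["binary", "cd " + directory, verb + " " + filename]
        commands.append("close")
    return commands
```

The resulting list can be written out one command per line and fed to
an ftp client, or simply shown to the user for review first.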

The last header is the Reposted-by: header, which lets you know which
of the multiple moderators (if there were to be such a thing) was
responsible.  Also to remind you who is doing the work :-).

Any questions on archive formats should go to me.  If you are writing
tools that archive comp.archives postings or that fetch stuff from
them, let me know.  

Note the address change (effective 1 Dec 1990) to "emv@ox.com" instead
of "emv@math.lsa.umich.edu".  

--Ed
Edward Vielmetti
MSEN
moderator, comp.archives
emv@ox.com (or) archives@ox.com