[comp.archives.admin] Revised man page for archie

bajan@cs.mcgill.ca (Alan Emtage) (06/13/91)

Thanks to the kind work of R. Rodgers (rodgers@maxwell.mmwb.ucsf.edu) and
Nelson Beebe (beebe@math.utah.edu), archie now has a revised man page. It has
been cleaned up and rewritten for clarity, and updated to include the latest
additions to the email interface.

It can be retrieved from quiche.cs.mcgill.ca (132.206.2.3) via anonymous
ftp in the directory ~ftp/archie/doc as archie.man.roff for the *roff
(man) version or as formatted ASCII text in archie.man.txt. Compressed
versions of both of these files (.Z) are in the directory as well.

For those of you without direct Internet access, here it is. Enjoy.

-Alan

------------------------------------------------------------------------------
.TH ARCHIE 1L "20 May 1991"
.SH NAME
archie \- Internet archive server listing service
.SH SYNOPSIS
.B archie
.SH DESCRIPTION
The
.I archie
system allows the user to query a database containing a list of software
available on hosts connected to the Internet.
For hosts connected to the Internet,
software located through this service can be obtained by means of
.IR ftp (1);
otherwise,
for hosts with access to BITNET/NetNorth/EARN,
it can be obtained by electronic mail through the Princeton
.IR bitftp (1L)
service.
.LP
The system can be accessed in an interactive fashion or via electronic mail.
.SS "Using the Interactive Interface"
In order to use the interactive system:
.TP
1)
Connect to host
quiche.cs.mcgill.ca
(132.206.2.3 or 132.206.51.1) with
.IR telnet (1).
.TP
2)
Login as user
.B archie
(no capitals, no password required).
The system prints a banner message and status report.
.TP
3)
Type ``help'' for further information.
.LP
For full details,
refer to the section entitled
.SM THE
.SM INTERACTIVE
.SM INTERFACE
which appears below.
.SS "Using the Electronic Mail Interface"
In order to use the email interface, send requests to:
.IP
archie@cs.mcgill.ca
.LP
Send the word ``help'' in a message to obtain a list of available commands
and features.
This is a completely automated interface,
acting without human intervention.
.LP
For full details,
refer to the section entitled
.SM THE
.SM ELECTRONIC
.SM MAIL
.SM INTERFACE
which appears below.
.SS "Communicating with the Database Administrators"
This experimental database service is maintained by the
Computer Science Department of McGill University.
General comments and suggestions should be sent to:
.IP
archie-l@cs.mcgill.ca
.LP
Communications requesting additions to the set of hosts surveyed
for the database,
modifications to the Software Description Database,
or pertaining to other administrative matters,
should be sent to:
.IP
archie-admin@cs.mcgill.ca
.SH "THE INTERACTIVE INTERFACE"
.SS "Commands"
Arguments to commands shown in square brackets '[]' are optional;
all others are mandatory.
.TP
.B help
List the valid
.I archie
commands.
.TP
\fBlist\fP [\fIpattern\fP]
List the sites currently stored in the database,
and the time at which they were last updated.
The optional regular expression argument can be used to limit the list
to specific sites.
.IP
Note that the numerical (IP) address associated with a site name was
valid at the listed time,
but may have changed since.
Furthermore,
the listed IP address is the primary address
as listed in the DNS database
(secondary addresses are not stored).
.IP
Example:
.RS
.IP
\fClist\fP
.RE
.IP
lists all sites in the database,
while
.RS
.IP
\fClist \e.de$\fP
.RE
.IP
lists all German sites.
.TP
\fBmail\fP [\fIaddress1\fP[,\fIaddress2\fP...]]
Mail the output of the last command
to the specified address or comma-separated list of addresses
(the address list must not contain spaces).
.IP
Example:
.RS
.IP
\fCmail user1@hello.edu,user2@goodbye.com\fP
.RE
.IP
In the absence of an argument, the mail is sent to the address
specified by the
.B mailto
variable.
.IP
Example:
.RS
.IP
\fCmail\fP
.RE
.IP
Conventional Internet addressing styles are understood.
BITNET sites should use the convention:
.RS
.IP
\fCuser@sitename.bitnet\fP
.RE
.IP
UUCP addresses can be specified as
.RS
.IP
\fCuser@sitename.uucp\fP
.RE
.TP
.BI prog " pattern"
Find all occurrences of programs with names matching
.IR pattern .
The interpretation of
.I pattern
depends upon the value of the
.B search
variable.
The output lists the names of hosts with matching entries,
the size of the matching program,
its last modification date, and its path.
The results are sorted according to the value of the
.B sortby
variable,
and are limited in number by the
.B maxhits
variable.
.TP
\fBset\fP \fIvariable-name\fP [\fIvalue\fP]
Set the specified variable, giving it a value where one is required
(boolean variables take no value).
See the section below concerning available variables,
as well as the entries for
.B unset
and
.BR show .
.TP
\fBshow\fP [\fIvariable-name\fP]
Display the value of a particular variable.
If no variable is specified, display
.I all
variables.
.IP
Example:
.RS
.IP
\fCshow maxhits\fP
.RE
.TP
.BI site " sitename"
Produce a full table of contents for a specified
.IR ftp (1)
site in the
.I archie
database.
The output format is similar to that of the UNIX command:
.RS
.IP
\fCls -lR\fP
.RE
.IP
Example:
.RS
.IP
\fCsite col.hp.com\fP
.RE
.TP
.BI unset " variable"
Remove any value associated with the specified variable.
This may cause counter-intuitive behavior in some cases;
for example, if
.B maxhits
is not defined by the user,
.B prog
will print the default number of matches rather than an
unlimited number of matches.
.TP
.BI whatis " substring"
Search the Software Description Database for the given substring,
ignoring case.
This database consists of names and short descriptions of many
software packages,
documents (like RFCs and educational material),
and data files stored on the Internet.
.IP
Example:
.RS
.IP
\fCwhatis uucp\fP
.RE
.IP
in part gives as a result:
.RS
.IP
\fCfindpath.sh             UUCP Pathfinder
.br
logfile-stats           UUCP LOGFILE analyzer
.br
mapstats                UUCP map statistics program\fP
.RE
.SS "Variable Types"
The behavior of
.I archie
can be modified by certain variables,
the values of which may be changed using the
.B set
command, or removed entirely by the
.B unset
command.
There are three variable types:
.TP 15
.B boolean
(Set or unset)
.TP
.B numeric
(Integer within a defined range)
.TP
.B string
(String of characters, may or may not be restricted).
.SS "Boolean Variables"
.TP
.B pager
Filter all output through the pager
.IR less (1L)
(default: unset).
When using the pager you may also want to set the
.B term
variable to your terminal type (see
.B term
variable).
.IP
Example:
.RS
.IP
\fCset pager\fP
.RE
.TP
.B status
During the database search,
display a status-line containing the number of matches and percentage of
the database searched (default: set).
.SS "Numeric Variables"
.TP
.B autologout
Set the length of idle time
(in minutes)
allowed before automatic logout
(permissible range: 1-300; default: 60).
.IP
Example:
.RS
.IP
\fCset autologout 45\fP
.RE
.IP
logs the user out after 45 minutes of idle time.
.TP
.B maxhits
Allow the
.B prog
command to generate at most the specified number of matches
(permissible range: 0-1000; default: 1000).
Set this to a smaller value if
.I archie
is too slow.
.IP
Example:
.RS
.IP
\fCset maxhits 100\fP
.RE
.IP
halts
.B prog
after 100 matches have been found.
.SS "String Variables"
.TP
.B mailto
If the
.B mail
command is issued with no arguments,
mail the output of the last command to the address
specified by this string variable,
which may contain a single mail address,
or a comma-separated list of addresses
(lists must not contain whitespace).
.IP
Example:
.RS
.IP
\fCset mailto user@frobozz.com\fP
.RE
.IP
Example:
.RS
.IP
\fCset mailto user1@hello.edu,user2@goodbye.com\fP
.RE
.IP
Conventional Internet addressing styles are understood.
BITNET sites should use the convention:
.RS
.IP
\fCuser@sitename.bitnet\fP
.RE
.IP
UUCP addresses can be specified as
.RS
.IP
\fCuser@sitename.uucp\fP
.RE
.TP
.B search
.IP
Define the type of search to be performed by the
.B prog
command.
The following values are permitted:
.RS
.TP
.B exact
Exact match (the fastest method).
A match occurs if the file (or directory)
name in the database corresponds
.I exactly
to the user-given string (including case).
.IP
For example,
this type of search could be used to locate all
.I xlock.tar.Z
files.
.TP
.B regex
Allow user-specified (search) strings to take the form of
.IR ed (1)
regular expressions (the default search method).
.IP
.BR Note :
unless specifically anchored to the beginning (with ^) or end
(with $) of a line,
.IR ed (1)
regular expressions (effectively) have ``.*'' prepended and
appended to them.
For example,
it is not necessary to type
.RS
.IP
\fCprog .*xnlock.*\fP
.RE
.IP
because
.RS
.IP
\fCprog xnlock\fP
.RE
.IP
suffices.
In this instance,
the
.B regex
match is equivalent to a simple substring match.
Those unfamiliar with regular expressions should refer to the
section entitled
.SM REGULAR
.SM EXPRESSIONS
which appears below.
.TP
.B sub
Substring (case insensitive).
A match occurs if the file (or directory)
name in the database contains the user-given substring,
without regard to case.
.IP
Example:
.IP
The pattern:
.RS
.IP
\fCis\fP
.RE
.IP
matches any of the following:
.RS
.IP
\fCislington
.br
this
.br
poison\fP
.RE
.TP
.B subcase
Substring (case sensitive).
As above,
but taking case as significant.
.IP
Example:
.IP
The pattern:
.RS
.IP
\fCTeX\fP
.RE
.IP
will match:
.RS
.IP
\fCLaTeX\fP
.RE
.IP
but neither of the following:
.RS
.IP
\fCLatex
.br
TExTroff\fP
.RE
.RE
.TP
.B sortby
.IP
Set the method of sorting to be applied to output from
.BR prog .
Typing the keyboard interrupt character (generally Ctrl-C on UNIX hosts)
aborts a search.
Results obtained to that point will be sorted according to the
.B sortby
variable and sent as output.
The output phase may be aborted by typing the abort character a second time.
The five permitted methods (and their associated reverse orders) are:
.RS
.TP
.B none
Unsorted (default; no reverse order, though
.B rnone
is accepted)
.TP
.B filename
Sort files/directories by name, using lexical order (reverse order:
.BR rfilename )
.TP
.B hostname
Sort on the archive hostname, in lexical order (reverse order:
.BR rhostname )
.TP
.B size
Sort by size, largest files/directories first (reverse order:
.BR rsize )
.TP
.B time
Sort by modification time,
with the most recent file/directory names first (reverse order:
.BR rtime )
.RE
.TP
.B term
Specify the type of terminal in use
(and optionally, its size in rows and columns).
This information is used by the pager.
.IP
The usage is:
.RS
.IP
\fCset term <terminal-type> [<#rows> [<#columns>]]\fP
.RE
.IP
The terminal type is mandatory,
but the number of rows and columns is optional;
specify either rows only,
or both rows and columns (default: 24 rows, 80 columns).
.IP
Examples:
.RS
.IP
\fCset term vt100
.br
set term xterm 60
.br
set term xterm 24 100\fP
.RE
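.LP
As an illustration only,
a short interactive session combining the commands and variables
described above might consist of the following input
(the search string and mail address are placeholders,
and it is assumed that
.B search
and
.B sortby
are assigned with the same
.B set
syntax shown for the other variables):
.RS
.IP
\fCset search sub
.br
set maxhits 100
.br
set sortby time
.br
prog xlock
.br
mail user@sitename.bitnet\fP
.RE
.LP
This requests a case-insensitive substring search for at most 100 matches,
most recently modified first,
and mails the results to the given address.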
.SH "THE ELECTRONIC MAIL INTERFACE"
The
.I archie
email interface currently accepts a limited subset of
the interactive interface commands, plus a few of its own.
Variables are not supported in the email interface.
The ``Subject:'' line in incoming mail is processed as if it
were part of the main message body.
The
.B help
command is exclusive;
all other commands in the same message are ignored.
A message not containing a valid request will be treated as a
.B help
request.
The server recognizes the following commands:
.TP
.BI compress
Process the reply message with the
.IR compress (1)
and
.IR uuencode (1)
programs.
Upon receiving the reply,
the recipient should
remove the mail header and run the rest of the file through
.IR uudecode (1),
producing a file with a name of the form:
.RS
.IP
\fCfile.Z\fP
.RE
.IP
Process this file with
.IR uncompress (1)
to obtain the results of the request.
.TP
.BI help
Send a message describing how to use the email interface.
.TP
.BI path " path"
Override the return address that would normally be extracted from the header.
The path describes how to mail a message from
.IR cs.mcgill.ca ,
which is fully connected to the Internet,
to your address.
Consider adding a
.B path
command to a request to provide an explicit return address if the
.I archie
server does not respond to the original request within several hours.
BITNET users should use the convention:
.RS
.IP
\fCuser@site.bitnet\fP
.RE
.IP
UUCP users should use the convention:
.RS
.IP
\fCuser@site.uucp\fP
.RE
.TP
\fBprog\fP <\fIreg exp1\fP> [<\fIreg exp2\fP> ...]
Search the database for each
.RI < "reg exp" >
(an
.IR ed (1)-style
regular expression),
and return any matches.
Multiple
regular expressions may be placed on one line,
in which case the results will be mailed back in one message.
Where regular expressions appear on multiple lines,
multiple messages will be returned,
one for each line (this feature does not yet work correctly).
Any regular expression containing spaces must be quoted with single
or double quotes.
Searches are case sensitive.
The
.B prog
command is executed as if the
.B search
variable were set to
.BR regex .
Those unfamiliar with regular expressions should refer to the
section entitled
.SM REGULAR
.SM EXPRESSIONS
which appears below.
.TP
.BI quit
Stop interpreting the request.
This prevents the inadvertent interpretation of
text in an email signature which might accidentally resemble a valid
.I archie
command.
.TP
.BI site " <site name> | <site IP address>"
Return a list of the contents of the specified
.RI < "site name" >.
The fully qualified domain name or IP address may be used.
.TP
\fBlist\fP <\fIreg exp1\fP> [<\fIreg exp2\fP> ...]
List all of the site names currently stored in the
database that match
.RI < "reg exp" >
(an
.IR ed (1)-style
regular expression).
The format of the resulting
list is: site name, site IP address, and date
last updated in the database.
.TP
\fBwhatis\fP <\fIsubstring1\fP> [<\fIsubstring2\fP> ...]
Search the Software Description Database (SDD) for
.RI < substring >.
The SDD is a text
database containing the names and short descriptions of
about 3500 software packages, documents, and datasets
available on the Internet.
Send corrections or additions to:
.RS
.IP
archie-admin@cs.mcgill.ca
.RE
.IP
Multiple
.RI < substring >
arguments may be placed on the same
.B whatis
command line.

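.LP
As an illustration only
(the return address and search strings are placeholders),
the body of a request message combining several of the commands above
might read:
.RS
.IP
\fCpath user@site.bitnet
.br
prog xlock
.br
whatis uucp
.br
quit\fP
.RE
.LP
The
.B quit
line keeps any signature text that follows from being interpreted.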
.SH "REGULAR EXPRESSIONS"
Regular expressions follow the conventions of the
.IR ed (1)
command,
allowing sophisticated pattern matching.
In the following discussion,
the string containing a regular expression will be called
the ``pattern'',
and the string against which it is to be matched is called
the ``reference string''.
Regular expressions imbue certain characters with special meaning,
providing a quoting mechanism to remove this special meaning
when required.
.LP
The rules governing regular expressions are:
.TP
.B c
A character
.B c
matches itself unless it has been assigned a special meaning as listed below.
A special character loses its special meaning
when preceded by the character '\fC\e\fP'.
The exception is '\fC{\fP',
which is an ordinary character
.I unless
it is preceded by '\fC\e\fP'.
Thus, although '\fC*\fP' normally has special meaning,
the pattern '\fC\e*\fP' matches a literal '\fC*\fP'.
.IP
Example:
.IP
The pattern
.RS
.IP
\fCacdef\fP
.RE
.IP
matches either of the following:
.RS
.IP
\fCs83acdeffff
.br
acdefsecs\fP
.RE
.IP
but neither of the following:
.RS
.IP
\fCaccdef
.br
aacde1f\fP
.RE
.IP
Example:
.IP
Normally the characters '*' and '$' are special, but in the pattern
.RS
.IP
\fCa\e*bse\e$\fP
.RE
.IP
both are escaped and match only themselves.
Any reference string containing:
.RS
.IP
\fCa*bse$\fP
.RE
.IP
as a substring will be flagged as a match.
.TP
.B \&.
A period
(known as a
.I wildcard
character)
matches any character except the newline character.
.IP
Example:
.IP
The pattern
.RS
.IP
\&\fC....\fP
.RE
.IP
will match any four consecutive characters in the reference string,
except a newline character.
.TP
.B ^
A caret (\fC^\fP) appearing at the beginning of a pattern
requires that the reference string
.B start
with the specified pattern
(an escaped caret,
or a caret appearing elsewhere in the pattern,
is treated as a non-special character).
.IP
Example:
.IP
The pattern
.RS
.IP
\fC^efghi\fP
.RE
.IP
matches only those reference strings starting with
\fCefghi\fP;
thus, it will match either of the following:
.RS
.IP
\fCefghi\fP
.br
\fCefghijlk\fP
.RE
.IP
but not:
.RS
.IP
\fCabcefghi\fP
.RE
.TP
.B $
A dollar sign (\fC$\fP) appearing at the end of a pattern
requires that the pattern appear at the end of a reference string
(an escaped dollar sign, or a dollar sign appearing elsewhere,
is treated as a regular character).
.IP
Example:
.IP
The pattern
.RS
.IP
\fCefghi$\fP
.RE
.IP
will match either of the following:
.RS
.IP
\fCefghi\fP
.br
\fCabcdefghi\fP
.RE
.IP
but not:
.RS
.IP
\fCefghijkl\fP
.RE
.TP
.B \e<
Constrain the match to the beginning of a
.I word
(that is, the beginning of a line,
or a position just before a letter,
digit,
or underscore
and just after a character which is not one of these).
.IP
Example:
.IP
The pattern
.RS
.IP
\fC\e<abc\fP
.RE
.IP
matches the last \fCabc\fP in the reference string:
.RS
.IP
\fC@hijabc#+abc\fP
.RE
.IP
but not the first, since the first \fCabc\fP does not start on a word boundary.
.TP
.B \e>
Constrain the match to the end of a word,
as defined above.
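.IP
Example (assuming the usual
.IR ed (1)
convention that a word ends just before a character which is not a letter,
digit, or underscore):
.IP
The pattern
.RS
.IP
\fCabc\e>\fP
.RE
.IP
matches the first \fCabc\fP in the reference string:
.RS
.IP
\fCabc#+abcdef\fP
.RE
.IP
but not the second, since the second \fCabc\fP is followed by a letter
and so does not end a word.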
.TP
.RB [ string ]
Match any single character within the brackets.
The caret (\fC^\fP) has a special meaning if it is the first character in
the series:
the pattern will match any character
.I other
than one in the list.
.IP
Example:
.IP
The pattern
.RS
.IP
\fC[^abc]\fP
.RE
.IP
will match any character
.I except
one of:
.RS
.IP
\fCa
.br
b
.br
c\fP
.RE
.IP
To match a right bracket (\fC]\fP) in the list,
put it first, as in:
.RS
.IP
\fC[]ab01]\fP
.RE
.IP
A caret appearing anywhere but in the first position is treated as a
regular character.
.IP
The minus (\fC-\fP) character is special within square brackets.
It is used to define a range of ASCII characters to be matched.
For example, the pattern:
.RS
.IP
\fC[a-z]\fP
.RE
.IP
matches any lower case letter.
The minus can be made non-special by placing it first or last
within the square brackets.
The characters '\fC$\fP', '\fC*\fP' and '\fC.\fP'
are not special within square brackets.
.IP
Example:
.IP
The pattern
.RS
.IP
\fC[ab01]\fP
.RE
.IP
matches a single occurrence of a character from the set:
.RS
.IP
\fCa
.br
b
.br
0
.br
1\fP
.RE
.IP
Example:
.IP
The pattern
.RS
.IP
\fC[^ab01]\fP
.RE
.IP
will match any single character other than one from the set:
.RS
.IP
\fCa
.br
b
.br
0
.br
1\fP
.RE
.IP
Example:
.IP
The pattern
.RS
.IP
\fC[a0-9b]\fP
.RE
.IP
matches one of the characters:
.RS
.IP
\fCa
.br
b\fP
.RE
.IP
or a digit between \fC0\fP and \fC9\fP,
inclusive.
.IP
Example:
.IP
The pattern
.RS
.IP
\fC[^a0-9b.$]\fP
.RE
.IP
matches any single character which is not in the set:
.RS
.IP
\fCa
.br
b
.br
\&.
.br
$\fP
.RE
.IP
or a digit between 0 and 9, inclusive.
.TP
.B *
Match zero or more occurrences of an immediately preceding regular expression.
.IP
Example:
.IP
The pattern
.RS
.IP
\fCa*\fP
.RE
.IP
matches zero or more occurrences of the character:
.RS
.IP
\fCa\fP
.RE
.IP
Example:
.IP
The pattern
.RS
.IP
\fC[A-Z]*\fP
.RE
.IP
matches zero or more consecutive upper-case letters.
.TP
\fB\e{\fP\fIm\fP\fB\e}\fP
Match exactly
.I m
occurrences of the preceding regular expression,
where
.I m
is a non-negative integer between 0 and 255 (inclusive).
.IP
Example:
.IP
The pattern
.RS
.IP
\fCab\e{3\e}\fP
.RE
.IP
matches any substring in the reference string consisting of the character
`\fCa\fP' followed by exactly three `\fCb\fP' characters.
.TP
\fB\e{\fP\fIm\fB,\e}\fP
Match at least
.I m
occurrences of the preceding regular expression.
.IP
Example:
.IP
The pattern
.RS
.IP
\fCab\e{3,\e}\fP
.RE
.IP
matches any substring in the reference string of the character `\fCa\fP'
followed by at least three `\fCb\fP' characters.
.TP
\fB\e{\fP\fIm\fP\fB,\fP\fIn\fP\fB\e}\fP
Match between
.I m
and
.I n
occurrences of the preceding regular expression
(where
.I n
is a non-negative integer between 0 and 255, and
.IR n > m ).
.IP
Example:
.IP
The pattern
.RS
.IP
\fCab\e{3,5\e}\fP
.RE
.IP
matches any substring in the reference string consisting of the character
`\fCa\fP' followed by at least three but at most five `\fCb\fP' characters.
.SS "Tips for Using Regular Expressions"
.TP
1)
When matching a substring it is not necessary to use the wildcard
character to match the part of the reference string preceding and
following the substring.
.IP
Example:
.IP
The pattern
.RS
.IP
\fCabcd\fP
.RE
.IP
will match any reference string containing this pattern.
It is not necessary to use
.RS
.IP
\fC\&.*abcd.*\fP
.RE
.IP
as the pattern.
.TP
2)
To constrain a pattern to match the entire reference string,
use the construction:
.RS
.IP
\fC^pattern$\fP
.RE
.TP
3)
The '\fC[]\fP' operator provides an easy mechanism
to obtain case insensitivity.
For example,
to match the word:
.RS
.IP
\fChello\fP
.RE
.IP
regardless of case, use the pattern:
.RS
.IP
\fC[Hh][Ee][Ll][Ll][Oo]\fP
.RE
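.LP
These tips can be combined.
As a sketch (the file name is only an example),
the pattern
.RS
.IP
\fC^[Rr][Ee][Aa][Dd][Mm][Ee]$\fP
.RE
.LP
matches reference strings consisting solely of the word \fCreadme\fP
in any mixture of upper and lower case,
such as \fCREADME\fP or \fCReadMe\fP,
but not \fCREADME.txt\fP.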
.SH "THE ARCHIE DATABASE"
The
.I archie
database subsystem maintains a list of about 600 Internet
.IR ftp (1)
archive sites.
Each night, the database subsystem executes an anonymous
.IR ftp (1)
to a subset of these sites and fetches a recursive directory listing (or
a file containing the recursive directory listing if this exists).
Currently, each site gets updated approximately once a month.
The directory listings are stored on
.I quiche.cs.mcgill.ca
(132.206.2.3),
where they are available to the Internet community via anonymous
.IR ftp (1).
They appear in the directory
.I ~ftp/archie/listings
in compressed form.
.SH "BUGS AND LIMITATIONS"
.TP
1)
Only UNIX sites are included in the database.
.TP
2)
The user cannot limit searches to specific sites.
.TP
3)
There is no graphical user interface.
.TP
4)
There is no way to abort the help facility completely.
.SH "LONG TERM PLANS"
The
.I archie
system is regarded as developmental,
and is not presently being released to outside sites.
The current database requires about 70 MB of disk storage,
and the updates and searches put a noticeable load
on the Sun 4/280 on which it runs.
We hope to distribute
.I archie
to several other sites throughout the world at a later date.
.LP
We welcome comments and suggestions;
please send them to
.IR archie-l@cs.mcgill.ca .
.SH "SEE ALSO"
.IR bitftp (1L),
.IR ftp (1),
.IR telnet (1)
.SH AUTHORS
Alan Emtage (bajan@cs.mcgill.ca) and
Bill Heelan (wheelan@cs.mcgill.ca), McGill University.
Manual page by R. P. C. Rodgers,
UCSF School of Pharmacy, San Francisco,
California 94143 (rodgers@maxwell.mmwb.ucsf.edu),
Nelson H. F. Beebe (beebe@math.utah.edu),
and Alan Emtage.
.\" end of file