[alt.sources.wanted] [comp.archives...] Announcing Archie version 2.0

bajan@cs.mcgill.ca (Alan Emtage) (03/13/91)

Archive-name: ftp/database/archie/1991-03-12
Archive-directory: quiche.cs.mcgill.ca:/archie/listings/ [132.206.2.3]
Original-posting-by: bajan@cs.mcgill.ca (Alan Emtage)
Original-subject: Announcing Archie version 2.0
Reposted-by: emv@ox.com (Edward Vielmetti)


			Archie 2.0
			----------

The "Archie Group" of McGill University is pleased to announce Archie,
the "Archive Server Server" Version 2.0.


 McGill University Operating "archie"
 ------------------------------------------------------
 - An Internet Archive Server Listing Service
 ----------------------------------------------
 
 Given the number of hosts being used as archive sites nowadays, there can
 be great difficulty in finding needed software in a distributed
 environment. You may know that the software that you need is out there,
 but it can sometimes be difficult to find.  The School of Computer
 Science at McGill University has one solution to the problem - "archie".

 Since the announcement of the dedicated-database version of archie in
 November, the popularity of the program has grown by leaps and bounds.
 From an average of about 30 logins/day in November we are now averaging
 over 550 with our all-time high coming in at 700 for a single day.
 (Our ex-boss owes us lunch for the 500+ mark :-). Archie's email
 interface averages about 40/day and anonymous ftp to quiche (for
 retrieval of the compressed site listings files in ~ftp/archie/listings)
 is over 70/day. Needless to say, quiche is a well-used system right
 about now :-)


 Getting To The Point:
 ---------------------
 
 So how do you get to use archie? If you are Internet connected, it's
 easy. Telnet to quiche.cs.mcgill.ca (132.206.2.3 or 132.206.51.1) and
 login as user "archie". You should get a banner message and status
 report on our latest additions (there's no password, although we do log
 the sessions to provide rudimentary stats). "help" gets a list of valid
 commands. Feedback is welcome and can be sent to archie-l@cs.mcgill.ca.


 NOTE:  The following changes only apply to the interactive version of
 archie (the one you see when you telnet or rlogin to quiche) and NOT to
 the E-Mail interface. We will hopefully be overhauling that interface in
 the coming week(s).


 Quick Summary
 -------------

 For those of you who don't want to read the whole thing, here's a quick
 summary of what's new in V2.0. If you want the full explanation, skip to
 the next section. Otherwise, see the archie online help facility.


 (a) Speed and performance under load should be improved. Feedback (to
     archie-l@cs.mcgill.ca) on this would be appreciated.

 (b) 3 new searching methods added. See help section under "set search".

 (c) Output may now be sorted. See help under "set sortby".

 (d) New Software Description database to help you find the names of
     packages to do what you want done, as well as an RFC index and other
     useful information. See help under "whatis".

 (e) New "mail" command allows you to mail archie results back to you.
     Say goodbye to those hated script sessions :-). See help under
     "mail" and "set mailto".

 (f) "list" command now tells the truth. Help "list".

 (g) A "status" variable allows you to turn on or off search progress
     information. Help "set status".



 Changes in Version 2.0 
 ---------------------- 

 Thanks to all the feedback we've gotten over the past couple of months,
 we have modified archie into what we hope will be a more friendly and
 efficient service. 


The changes in V2.0 are:



(1) Speed & Implementation
    ----------------------

     For faster execution, Archie has been rewritten using a shared memory
     model which greatly improves execution times especially when the
     host on which archie is running is under load [which, as those of
     you who use archie regularly know, quiche has been for some time
     now :-]. This model also allows for much faster database
     updates. We'd appreciate feedback on what kind of response times you
     are getting (subjective rather than objective).


(2) Searching
    ---------

     Wider range of search methods. Until this point, archie could only
     search using regular expressions (as defined in ed(1)). Since most
     users don't require the power of regex's (and many who don't use them
     regularly have (understandably) trouble composing them), 3 new search
     methods have been added, bringing the total to 4.


     To change the search method, set the "search" variable and use the
     "prog" command per usual.  Command line options are in the works
     but have not yet been incorporated into this version of archie. The
     value of the search variable for each method is listed in brackets
     '[ ]' below.  Type "help set search" at the "archie>" prompt if you
     want more info.

     (1) Substring (case sensitive) ["subcase"]. A simple, everyday substring
	 search. A match occurs if the file (or directory) name in
	 the database contains the user-given substring. Slightly faster
	 than the equivalent regex.

     (2) Substring (case insensitive) ["sub"]. As above but ignoring
	 the case of the strings involved. Speed about on par with the
	 regex equivalent.

     (3) Exact match ["exact"]. The fastest search method of all.  The
	 restriction is that the user (search) string has to exactly
	 match (including case) the string in the database. Provided for
	 those of you who know just what you are looking for. For example,
	 if you wanted to know where all the xlock.tar.Z files were, this
	 is the kind of search to use. [For those of you that are
	 interested, the search is O(1) in this case via the magic of
	 dbm].

     (4) Regex ["regex"]. The "old" method. Searches the database with
	 the user (search) string which is given in the form of an ed(1)
	 regular expression. This is the DEFAULT search method.
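
     The four search methods can be sketched roughly as follows. This is
     only an illustration, not archie's actual code: the NAMES list, the
     EXACT_INDEX dictionary and the prog() function are hypothetical
     stand-ins, a Python dict plays the part of the dbm lookup, and
     Python's "re" syntax is not identical to ed(1) regular expressions.
     The regex is assumed to match anywhere in the name.

```python
import re

# Hypothetical in-memory stand-in for archie's file-name database.
NAMES = ["xlock.tar.Z", "XLock.README", "emacs-18.57.tar.Z", "xlockmore.tar.Z"]

# Exact matching as a keyed lookup (a dict here; the post credits dbm).
EXACT_INDEX = {}
for name in NAMES:
    EXACT_INDEX.setdefault(name, []).append(name)


def prog(pattern, search="regex"):
    """Return the names matching `pattern` under the given search method."""
    if search == "exact":
        # Keyed lookup: no scan of the whole list is needed.
        return list(EXACT_INDEX.get(pattern, []))
    if search == "subcase":
        # Case-sensitive substring test.
        return [n for n in NAMES if pattern in n]
    if search == "sub":
        # Case-insensitive substring test.
        lowered = pattern.lower()
        return [n for n in NAMES if lowered in n.lower()]
    if search == "regex":
        # Python's re syntax only approximates ed(1) regular expressions.
        compiled = re.compile(pattern)
        return [n for n in NAMES if compiled.search(n)]
    raise ValueError("unknown search method: " + search)
```

     For instance, prog("xlock.tar.Z", "exact") finds only the file with
     exactly that name, while prog("xlock", "sub") also picks up
     "XLock.README".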


 Note: The "status" line that used to appear when the "pager" variable
	 was set and the search was proceeding (showing the number of
	 matches found and the percentage of the database searched) can be
	 enabled or disabled by the use of the "status" variable, which can
	 either be set or unset depending on whether you want the line to be
	 displayed or not. If it is unset, there will be no search output
	 displayed until the search is complete or aborted by the user.


(3) Sorting
    -------

     Ordering the output. Archie V1.X had no concept of sorted output,
     except for the fact that we tried to do the updates in lexical order
     so that the output would be (mostly) sorted in that order. It didn't
     work. Consequently, you may now sort your 'prog' command output in 5
     different ways.  For each method, the "natural" sort order (or at
     least, what we consider to be the natural order) is the default.

     To change the sort method, set the "sortby" variable.  The value of
     the sortby variable for each method is listed in brackets '[ ]'
     below. Command line options are not available at this time. 

     The reverse sorting orders from those described here are obtained by
     prepending "r" to the sortby value given. (E.g. the reverse of
     hostname order, "hostname", is "rhostname").

     (1) Hostname order ["hostname"]. Output is sorted on the archive
	 hostname in lexical order. 

     (2) File/Directory name modification time ["time"]. Output is sorted
	 with the most recent modification times of the found
	 file/directory names coming first (youngest -> oldest).

     (3) File/Directory size ["size"]. Output is sorted by the size of
	 the found files/directories, largest first.

     (4) File/Directory name lexical order ["filename"]. 

     (5) Database order ["none"]. In other words, effectively non sorted.
	 This is the default order and is the one that most users of
	 archie 1.X versions will be used to.
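
     As a rough sketch of how the sortby values and the "r" prefix fit
     together (an illustration only, not archie's implementation): the
     Match record, the SORT_KEYS table and sort_matches() are hypothetical
     names, and the record layout is assumed since the post does not
     describe the database format.

```python
from collections import namedtuple

# Hypothetical record layout; the real database format is not described.
Match = namedtuple("Match", "hostname filename size mtime")

# Key function and "natural" direction for each sortby value.
# True means the natural order is descending (largest or youngest first).
SORT_KEYS = {
    "hostname": (lambda m: m.hostname, False),
    "time": (lambda m: m.mtime, True),
    "size": (lambda m: m.size, True),
    "filename": (lambda m: m.filename, False),
}


def sort_matches(matches, sortby="none"):
    """Order matches as the sortby value asks; "none" keeps database order."""
    if sortby == "none":
        return list(matches)
    name, reverse = sortby, False
    # A leading "r" on a known value flips its natural order.
    if name not in SORT_KEYS and name.startswith("r"):
        name, reverse = name[1:], True
    if name not in SORT_KEYS:
        raise ValueError("unknown sortby value: " + sortby)
    key, descending = SORT_KEYS[name]
    return sorted(matches, key=key, reverse=(descending != reverse))
```

     With this layout, sort_matches(results, "size") puts the largest
     file first and sort_matches(results, "rsize") the smallest.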



 Note: Typing the keyboard interrupt character ( Ctl-C for most people on
       UNIX) during a search will cause the search to be aborted. The
       results up to that time will be sorted (determined by the value of
       the sortby variable) and the results output. Typing an abort character
       during the sort will cause that to be aborted. Results up to that
       point will be output.
       

(4) PD Software Description Database
    --------------------------------


     A new database, similar to the one that the man(1) UNIX command uses
     when doing a "keyword" ( -k option ) lookup has been added to
     archie. The database currently contains about 2600 entries that we
     have gleaned from various sources (such as the comp.sources.*,
     alt.sources and RFC indices).  The format is basically the name of a
     PD program, document, or software package followed by a short
     description of said object. 

     The command is "whatis" and takes a (sub)string as an argument. All
     lines in the database containing that substring (case insensitive)
     will be printed.
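
     The lookup itself amounts to a case-insensitive substring test over
     each line. A minimal sketch, assuming a "name, short description"
     layout; the WHATIS entries and the whatis() function below are made
     up for illustration and are not taken from the real database:

```python
# Hypothetical entries in the "name, short description" shape described above.
WHATIS = [
    ("xlock", "Lock an X11 display until a password is entered"),
    ("RFC 959", "File Transfer Protocol"),
    ("perl", "Practical Extraction and Report Language"),
]


def whatis(substring):
    """Return every database line containing `substring`, ignoring case."""
    needle = substring.lower()
    hits = []
    for name, description in WHATIS:
        line = name + "\t" + description
        if needle in line.lower():
            hits.append(line)
    return hits
```

     Here whatis("LOCK") and whatis("lock") return the same single line,
     the one for xlock.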

     I think such a beast would be very useful if it were properly
     maintained. These current entries should be considered the mere start
     of the database and I'm depending on all you authors and maintainers
     out there to send me additions, corrections and updates to the
     various entries in the database. All such info should be sent to 


                   archie-admin@cs.mcgill.ca


     All entries are welcome, and I'll endeavour to keep the database
     up to date. I have not finalized what will and will not be in it so
     send whatever you have along and I'll make up the policy as we go
     along.




(5) Getting rid of those crummy "script" sessions
    ---------------------------------------------

    Your days of typing "script" before every interactive archie session
    are now over: archie can now mail you the results of your interactive 
    sessions. It works like this: 

    (a) Set the "mailto" variable to your E-mail address

    (b) Run archie as you normally would. When you get a result that you
	want to keep a record of (and after you have finished browsing
	through it if you have the pager set on) type "mail". Archie will
	automatically forward the results of the last request (site,
	prog, etc) to the email address set before. If you have not set
	the address in the mailto variable you may specify one on the
	command line to the "mail" command. [If you do neither, and type
	"mail", archie will tell you].

    (c) The mail is sent asynchronously (you don't have to wait for it to
	be sent). You will be informed when it is complete.

    If the generated output from archie is greater than 45K bytes, it
    will automatically be split into as many parts as required to get
    it to you in chunks this size or less. This is so as to cooperate with
    certain mail systems which don't handle 50+ K chunks. [Many thanks to
    Mark Crispin's c-client library of mail routines which made this code
    SOOOO much easier]
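
    The splitting step might look something like the sketch below. It is
    not the c-client code archie uses: split_for_mail() and
    MAX_PART_BYTES are hypothetical names, "45K" is assumed to mean
    45 * 1024 bytes, and breaking on line boundaries where possible is
    also an assumption.

```python
MAX_PART_BYTES = 45 * 1024  # assumes "45K" means 45 * 1024 bytes


def split_for_mail(output, limit=MAX_PART_BYTES):
    """Split bytes into parts of at most `limit` bytes, breaking on lines."""
    parts = []
    current = b""
    for line in output.splitlines(keepends=True):
        # A single line longer than the limit is cut into fixed-size pieces.
        while len(line) > limit:
            if current:
                parts.append(current)
                current = b""
            parts.append(line[:limit])
            line = line[limit:]
        # Start a new part when this line would overflow the current one.
        if len(current) + len(line) > limit:
            parts.append(current)
            current = b""
        current += line
    if current:
        parts.append(current)
    return parts
```

    Joining the returned parts in order gives back the original output,
    so nothing is lost or reordered by the split.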

    Note: For those of you who have to do source routing for your email,
	  remember that the mail address given has to be a path from our
	  machines to yours. Our mail setup here is pretty darned good
	  (if I might say so myself) so the results should get to you in
	  reasonable time (there's no queueing on our part unless the load
	  gets abnormally high).


(6)  What archive sites does archie know about?
     ------------------------------------------

     The "list" command which has been out for a couple of weeks under
     version 1.3 is now formally part of archie. This command allows you
     to specify a regular expression as an argument and prints the site
     names in the database which match that expression, along with the
     primary IP address of the site and the date that archie last updated
     the site for the database. "list" without an argument prints the
     data on all sites that archie knows about.
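
     In outline, the command filters the site table by a regular
     expression. The sketch below is illustrative only: the SITES table
     holds made-up example hosts, addresses and dates, list_sites() is a
     hypothetical name, the pattern is assumed to match anywhere in the
     site name, and Python's "re" syntax stands in for whatever regex
     flavour archie accepts here.

```python
import re

# Hypothetical site table: name -> (primary IP address, date of last update).
# The hosts, addresses and dates are placeholders, not real archie data.
SITES = {
    "ftp.example.edu": ("192.0.2.10", "1991-03-01"),
    "archive.example.com": ("192.0.2.20", "1991-02-15"),
}


def list_sites(pattern=None):
    """Return (name, ip, last_update) rows for sites matching `pattern`.

    With no pattern, every known site is returned.
    """
    rows = []
    for name in sorted(SITES):
        if pattern is None or re.search(pattern, name):
            ip, updated = SITES[name]
            rows.append((name, ip, updated))
    return rows
```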



(7)  Getting kicked off for loitering
     --------------------------------

     Archie now has an autologout feature (well, actually it has had one
     for the past couple of weeks, but we're now telling you about it
     :-). If you hang around for too long without doing anything, we'll
     bump you off and free up the resources for the next person along. We
     aren't very strict on this and, in fact, you can set the autologout
     period yourself, varying from 1 minute to 5 hours, with 1 hour being
     the default. The variable "autologout" controls this feature.

 

  Things to be done
  -----------------

      A couple of things on our wishlist that still haven't been done:

      (1) Restricting searches to specific sites (soon hopefully).

      (2) Non UNIX sites aren't in the database (soon, maybe).

      (3) GUI interface (a little further off).


      The email interface will have to be brought up to the level of the
      interactive interface (as well as fixing some pretty annoying bugs
      in it), and hopefully that will be done fairly soon.


That's all for the moment folks. We would really like to see that
"whatis" database get off of the ground and all contributions are welcome.

If you have any comments, suggestions or constructive criticism, please
don't hesitate to drop us a line at

	archie-l@cs.mcgill.ca


It was your comments which led to the above improvements and we'd like to
keep hearing from you.


- The "Archie Group":  Bill Heelan (wheelan@cs.mcgill.ca)
		       Peter Deutsch (peterd@cc.mcgill.ca)
		       Alan Emtage (bajan@cs.mcgill.ca)