[comp.sys.next] indexing & searching with dbindex routines

dz@pumpkin.ucsb.edu (Daniel James Zerkle) (08/29/89)

A few people here (including myself) have been making noise about the
search capabilities of the NeXT (used by Digital Librarian).  I wrote
this little bit of code which demonstrates (to a small extent) these
capabilities.  To use it, cd to the directory containing the files you
want to search (these should be previously indexed, probably with
index(1)).  Then just type the name of the program.  You will be
prompted for the keywords for which to search.  These keywords work
just like the keywords in Digital Librarian.  For each file where
there is a match, the file name and the info line will be displayed.
Considerably more information is retrieved (see dbfilecell(3) and
/usr/include/text/filecell.h), but only these two fields are displayed.
See the manual entry for dbindex(3) for information on the assorted
data structures used.
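
To make the pointer chasing in the program below easier to follow, here
is a rough sketch of the shape of the data it walks.  These are
hypothetical stand-ins, not the real definitions: the Mock* types and
format_matches() are names made up for illustration, and they model only
the members the program actually touches (n, r, f, file and desc).  The
real structures in the headers carry more fields, so treat this as a
reading aid and not as the library's layout.

```c
#include <stdio.h>

/*
 * HYPOTHETICAL stand-ins, not the real NeXT definitions.  Only the
 * members that the search program dereferences are modelled here.
 */
typedef struct {
    const char *file;   /* file name, as in r->r[j].f->file */
    const char *desc;   /* info line, as in r->r[j].f->desc */
} MockFileCell;

typedef struct {
    MockFileCell *f;    /* each reference points at one file cell */
} MockRef;

typedef struct {
    int n;              /* number of matches */
    MockRef *r;         /* array of n references */
} MockRefList;

/*
 * Format every match into buf (at most size bytes, always terminated),
 * using the same layout the search program prints.  Returns the number
 * of matches that fit completely.
 */
static int format_matches(const MockRefList *list, char *buf, size_t size)
{
    size_t used = 0;
    int j;

    if (size == 0)
        return 0;
    buf[0] = '\0';
    for (j = 0; j < list->n; j++) {
        const MockFileCell *cell = list->r[j].f;
        int len = snprintf(buf + used, size - used,
                           "file: %s\ninfo: %s\n\n", cell->file, cell->desc);
        if (len < 0 || (size_t)len >= size - used) {
            buf[used] = '\0';   /* drop the partial entry and stop */
            break;
        }
        used += (size_t)len;
    }
    return j;
}
```

Writing into a caller-supplied buffer instead of printing directly is
just a convenience for checking the output; the loop over n and the
r[j].f indirection are the parts that mirror the real program.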

WARNING:  Although this program is based on an example given in the
manual for dbindex, that example is wrong, wrong, wrong.  It is probably
based on an old version of the routines.  Hopefully, the documentation
in 1.0 will catch up with the code in 0.9....  Two other errors are
present in the documentation.  The manual entry for dbfilecell does
not actually include the structure in question (thus, you'll have to
look in filecell.h, where it is fairly well commented).  Also, it is
nowhere documented that you must put -ldb_p at the end of the command
line when you are compiling, but you have no chance of compiling without
it.  I had to wade through the contents of all the library files before
I found that one....

I was quite impressed with the performance of the indexing routines.
I ran this program from /usr/man and searched for a few arbitrary
keys, and it searched almost instantly.

		---------- cut here ---------
/*
 * When run from some directory that is indexed, this program will accept
 * keywords until EOF.  For each line of keys, it will search the index
 * and display the name of the file and the info line.
 *
 * Compile this file with cc -o searcher searcher.c -ltext -ldb_p
 * Note that the -ldb_p is *not* documented anywhere, but must be in there.
 * See the manual entry for dbindex(3) for more information.
 */
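#include <stdio.h>
#include <stdlib.h>  /* malloc() */
#include <string.h>  /* strlen() */
#import <text/libtext.h>

int main()
  {
  char keys[1000];
  Index *i;
  RefList *r;
  int j;
  int len;

  i = dbOpenIndex (".","r"); /* "." means current directory */
  r = (RefList *)malloc(sizeof(RefList));
  if (r == NULL)
    {
    fprintf(stderr, "out of memory\n");
    return 1;
    }

  printf("Enter keys> ");
  /* fgets() never writes past the end of keys[]; gets() can */
  while(fgets(keys, sizeof(keys), stdin)!=NULL)
    {
    /* fgets() keeps the newline, so strip it before searching */
    len = strlen(keys);
    if (len > 0 && keys[len-1] == '\n')
      keys[len-1] = '\0';

    dbGetRefList(r, i, keys);
    printf("%d match(es) found\n", r->n);

    /* each match carries a pointer to the FileCell for that file */
    for (j=0;j<r->n;j++)
      printf("file: %s\ninfo: %s\n\n",r->r[j].f->file,r->r[j].f->desc);
    printf("Enter keys> ");
    } /* while */
  dbCloseIndex(i);
  return 0;
  } /* main() */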

		---------- cut here ---------

| Dan Zerkle home:(805) 968-4683 morning:961-2434 afternoon:687-0110  |
| dz@cornu.ucsb.edu dz%cornu@ucsbuxa.bitnet ...ucbvax!hub!cornu!dz    |
| Snailmail: 6681 Berkshire Terrace #5, Isla Vista, CA  93117         |
| Disclaimer: If it's wrong or stupid, pretend I didn't do it.        |