zhahai@gaia.UUCP (Zhahai Stewart) (12/17/86)
There are several commercial text indexing products on the market that keep
a master index for a set of files (or articles); for any keyword, the index
can quickly tell which files contain it. Further, one can query for a boolean
combination of keywords (OR, AND, maybe NOT), and (best trick yet) ask for two
keywords to be found within N words of each other. One must of course let the
text indexer know when a file is added, deleted, or revised.
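To pin down the behavior I mean, here is a minimal in-memory sketch in Python
of the naive approach: a table from each word to the files containing it, with
word positions kept so the "within N words" test is possible. It makes no claim
about how the commercial products work. The names (PositionalIndex, add,
remove, files, near) are made up for illustration, and the tokenizer (lowercase
runs of letters and digits) is an assumption.

```python
import re
from collections import defaultdict


class PositionalIndex:
    """Naive positional inverted index (illustrative sketch only)."""

    def __init__(self):
        # word -> {file name -> list of word positions in that file}
        self.postings = defaultdict(dict)
        # file name -> set of distinct words, so a file can be un-indexed
        self.words_in_file = {}

    def add(self, name, text):
        """Index a new file, or re-index a revised one."""
        self.remove(name)
        words = re.findall(r"[a-z0-9]+", text.lower())
        for pos, word in enumerate(words):
            self.postings[word].setdefault(name, []).append(pos)
        self.words_in_file[name] = set(words)

    def remove(self, name):
        """Drop every posting that belongs to a deleted file."""
        for word in self.words_in_file.pop(name, ()):
            del self.postings[word][name]
            if not self.postings[word]:
                del self.postings[word]

    def files(self, word):
        """Return the set of files containing the keyword."""
        return set(self.postings.get(word.lower(), {}))

    def near(self, a, b, n):
        """Return files where a and b occur within n words of each other."""
        pa = self.postings.get(a.lower(), {})
        pb = self.postings.get(b.lower(), {})
        hits = set()
        for name in pa.keys() & pb.keys():
            # Brute-force comparison of every pair of positions.
            if any(abs(i - j) <= n for i in pa[name] for j in pb[name]):
                hits.add(name)
        return hits


idx = PositionalIndex()
idx.add("a.txt", "the quick brown fox jumps")
idx.add("b.txt", "the fox sleeps while the brown dog barks")

both = idx.files("quick") & idx.files("fox")    # AND
either = idx.files("quick") | idx.files("dog")  # OR
no_dog = idx.files("fox") - idx.files("dog")    # AND NOT
close = idx.near("brown", "fox", 1)             # within 1 word
```

Boolean queries fall out of ordinary set operations on the results of files().
The trouble is that this stores one position for every word occurrence, and
near() compares every pair of positions in each candidate file.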
My question is how this is done, algorithmically. The most obvious approaches
are slow and would build rather large indices. I am looking for either a
description or a reference to some source which treats this subject with enough
detail to support an implementation (assume a decent foundation in data
structures and basic algorithms).
Thanks for any help. ~z~
--
Zhahai Stewart
{hao | nbires}!gaia!zhahai