zhahai@gaia.UUCP (Zhahai Stewart) (12/17/86)
There are several commercial text indexing products on the market that keep a master index for a set of files (or articles). For any keyword, the index can quickly tell which files contain that keyword. Further, one can query for a boolean combination of keywords (OR, AND, maybe NOT), and (best trick yet) ask for two keywords to be found within N words of each other. One must of course let the text indexer know when a file is deleted, added, or revised.

My question is how this is done, algorithmically. The most obvious approaches are slow and would build rather large indices. I am looking for either a description or a reference to some source which treats this subject in enough detail to support an implementation (assume a decent foundation in data structures and basic algorithms).

Thanks for any help.
~z~
--
Zhahai Stewart
{hao | nbires}!gaia!zhahai
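
To make the question concrete, here is a minimal sketch of the kind of "obvious approach" referred to above: an in-memory table mapping each keyword to the files that contain it, along with the word positions in each file. The class name `PositionalIndex` and its methods are made up for illustration, and nothing here is claimed to match how any commercial product works. It stores one entry per word occurrence, which is exactly why the index gets large.

```python
import re
from collections import defaultdict


class PositionalIndex:
    """Hypothetical index: keyword -> {file name -> ascending word positions}."""

    def __init__(self):
        self.postings = defaultdict(dict)  # word -> {name: [positions]}
        self.words_in = {}                 # name -> set of words (needed for deletion)

    def add(self, name, text):
        """Index a file; calling it again for the same name handles a revision."""
        self.remove(name)
        words = re.findall(r"[a-z0-9]+", text.lower())
        for pos, word in enumerate(words):
            self.postings[word].setdefault(name, []).append(pos)
        self.words_in[name] = set(words)

    def remove(self, name):
        """Forget a deleted file; a no-op if the file was never indexed."""
        for word in self.words_in.pop(name, ()):
            del self.postings[word][name]
            if not self.postings[word]:
                del self.postings[word]

    def files(self, word):
        """Set of files containing the keyword (combine with &, |, - for AND, OR, NOT)."""
        return set(self.postings.get(word.lower(), {}))

    def near(self, a, b, n):
        """Files in which keywords a and b occur within n words of each other."""
        hits = set()
        pa = self.postings.get(a.lower(), {})
        pb = self.postings.get(b.lower(), {})
        for name in pa.keys() & pb.keys():
            xs, ys = pa[name], pb[name]
            i = j = 0
            # Walk both sorted position lists, always advancing the smaller one.
            while i < len(xs) and j < len(ys):
                if abs(xs[i] - ys[j]) <= n:
                    hits.add(name)
                    break
                if xs[i] < ys[j]:
                    i += 1
                else:
                    j += 1
        return hits
```

With this sketch, `idx.files("fox") & idx.files("dog")` would answer an AND query and `idx.near("quick", "fox", 2)` a proximity query. The open part of the question is how to get the same answers from an index kept on disk that is neither this large nor this expensive to update.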