andy@garnet.berkeley.edu (Andy Lieberman) (08/31/89)
We're looking for a full text information retrieval package that runs on UNIX (the more flavors the better). The plan is to write our own user interface (client) that communicates with a UNIX server that we also write. The UNIX server will take the client's search requests, communicate them to the search engine, and pass the search results back to the client.

The search engine must be able to do keyword Boolean searches. Word-proximity searching and weighted searching are also of interest.

The search engine must have a good programming interface. It would be nice if there were a run-time version of the search engine available so we don't have to buy a full package for every machine, although utilities for maintaining the database are also a consideration.

I am currently looking at BRS/Search and Topic by Verity. These both seem acceptable, but I would like to know of any other choices on the market. Comments about BRS/Search and Topic are also welcome (I already have their literature, so I'd be more interested in opinion than fact).

Are there any journals or magazines I should be reading? It seems that everything I hear about is SQL...

Please mail responses and I'll post a summary.

Thanks,
Andy Lieberman
Library Systems Office
UC Berkeley
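For readers new to the terminology, the keyword Boolean and word-proximity searches asked for above can be sketched in a few lines of Python. This is only an illustration built on a toy in-memory positional index; the function names (`build_index`, `docs_with`, `near`) and the sample documents are made up for the example and have nothing to do with the programming interface of BRS/Search, Topic, or any other package. Weighted searching is left out.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to {doc_id: [word positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"[a-z0-9]+", text.lower())):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def docs_with(index, word):
    """Set of document ids that contain the word."""
    return set(index.get(word.lower(), {}))


def near(index, a, b, distance):
    """Documents where words a and b occur within `distance` words of each other."""
    hits = set()
    for doc_id in docs_with(index, a) & docs_with(index, b):
        pos_a = index[a.lower()][doc_id]
        pos_b = index[b.lower()][doc_id]
        if any(abs(i - j) <= distance for i in pos_a for j in pos_b):
            hits.add(doc_id)
    return hits


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Full text retrieval on UNIX with keyword searching",
    2: "Relational databases and SQL on UNIX",
    3: "Text files are kept offline; retrieval takes a day",
}
index = build_index(docs)

both = docs_with(index, "text") & docs_with(index, "retrieval")   # text AND retrieval
either = docs_with(index, "sql") | docs_with(index, "keyword")    # sql OR keyword
without = docs_with(index, "unix") - docs_with(index, "sql")      # unix NOT sql
close = near(index, "text", "retrieval", 1)                       # adjacent words only
```

With these sample documents, `both` is {1, 3}, `either` is {1, 2}, `without` is {1}, and `close` is {1}: document 3 contains both words, but they are five positions apart. Boolean operators reduce to set operations on the document lists, while proximity needs the word positions as well, which is one reason a positional index takes more space than a plain keyword index.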
paul@csnz.co.nz (Paul Gillingwater) (09/06/89)
In article <1989Aug30.204014.27985@agate.uucp> andy@garnet.berkeley.edu (Andy Lieberman) writes:
>We're looking for a full text information retrieval package that runs on
>UNIX (the more flavors the better).

It's easier for me to post -- replies bounce too often with many US sites that refuse to use fully-qualified domains.

Disclaimer: I work for a dealer that sells BRS/Search.

I believe that BRS/Search is probably a good choice for some applications. We have developed many applications that use it, and find that it has a very good programming interface. I'm less happy with the documentation, although there have been big improvements recently.

Where BRS/Search falls down is that it does not have any relational links between records or strong subrecord structuring. We have programmed around that limitation by writing an interface between BRS/Search and Informix SQL, which gives the best of both worlds, i.e. BRS is used for a full MARC catalogue, and Informix is used for the circulation control system for our library package.

My only other grumble with BRS is its price -- I think it's a bit high -- but now that BRS has been taken over by a publishing empire, I think we may see some more aggressive marketing....

--
Paul Gillingwater, Computer Sciences of New Zealand Limited
Domain: paul@csnz.co.nz  Bang: uunet!vuwcomp!dsiramd!csnz!paul
Call Magic Tower BBS V21/23/22/22bis 24 hrs NZ+64 4 767 326
SpringBoard BBS for Greenies! V22/22bis/HST NZ+64 4 767 742
andy@garnet.berkeley.edu (Andy Lieberman) (09/09/89)
In article <1989Aug30.204014.27985@agate.uucp> I wrote:
>We're looking for a full text information retrieval package that runs on
>UNIX (the more flavors the better). The plan is to write our own user interface
>(client) that communicates with a UNIX server that we also write. The UNIX
>server will take the client's search requests and communicate them to the
>search engine and pass the search results back to the client.
>The search engine must be able to do keyword Boolean searches. Word-proximity
>searching and weighted searching are also of interest.
>

Here's what I got:

----------
From: billr@brspyr1.brs.com (Bill Rowe)

FYI. We, here at BRS, are on the NET. (Big surprise, huh?) :-)

Since you already have documentation, you probably know that BRS/Search does just about everything that you require, except provide information on how to write an interface for our search engine. Well, that's about to change: My current project involves producing a manual which documents how to develop your own interface to the BRS Search Engine.

If you have questions of a technical nature, email me and I'll do my best to get you the answer. If you are interested in an advance copy of this manual, contact your Sales Rep. (Whoever that is?).

-----------
The distributor I talked to was Main Street Software in New York, (212)779-8398. They also suggested ordering the MNS Reference Manual for $35. MNS is their 4GL. The manual explains everything that can be done through a C program interface.

-------------
From: rcsmith@anagld.berkeley.edu (Ray Smith)

My company markets a hardware based search engine you may find is more flexible than the two software packages you mentioned, depending on the type of application you are working with and the amount of data. Our search engine conducts a serial search of the data at disk throughput speeds.
Since it searches the data serially there is no need for the overhead associated with indexing (typically the size of an index is 2-3 times the size of the original text).

Since our product is hardware based we are limited in the number of platforms we support. Currently the UNIX versions of our system are supported on Sun 3's & Sun 4's, Solbourne and Gould PowerNode systems. With both the Sun and Solbourne systems we support an Ethernet based search server where you can build a smart frontend on a number of different platforms (PC's, MAC, etc.)

For customization, our system comes with a full development library.

Below is a copy of an article I posted to "comp.newprod" a while back. I also have a product description if you would like me to send it to you. You can reach me at the phone number listed in the .signature for more information.

In addition to the features listed below we have added a few new ones recently. These include support for Soundex queries, a technique to catch typical typographical errors (handles four common errors), and a SunView point-and-shoot interface. We are currently working on a MAC HyperCard interface and a MS/DOS interface as well as a point-and-shoot interface under X Window.

-Ray

-------------------------- Begin included text ----------------------------

"TEXTRACT" HIGH SPEED TEXT SEARCH AND RETRIEVAL SYSTEM

Analytics, Inc. is pleased to announce the availability and full technical support of Textract, a hardware-based full text search and retrieval engine. Textract allows very high speed search and retrieval of ASCII data stored in any format. The user can query up to 100 information files at once, with natural English language words and phrases, using up to 256 unique terms. Both search time and data file size are decreased by Textract's ability to encode and compress data.
The Textract engine is capable of searching at speeds of 8 Megabytes per second, although the effective search speed is limited by bus and disk data transfer rates on host systems. Textract currently runs on DEC MicroVAX II's under VMS, Sun 3 & 4 under SunOS3.5 and SunOS4.0, and Gould PowerNode platforms under UTX/32 R2.1. Prices vary depending on host computer.

Features include:
-English language query with full Boolean logic operators
-Single and multicharacter wild cards
-Numerical ranging
-Vocabulary and Thesaurus lookup
-Ordered or unordered terms
-Word proximity
-ANSI terminal based menu interface
-Complete "C" libraries for customization
-Full Sun networking support including the use of NFS and the capability to use boards physically located on another machine if your local boards are busy

For more information on Textract, feel free to contact

Ray Smith at Analytics, Inc.
9891 Broken Land Pkwy.
Suite 200
Columbia, MD 21046
(301) 381-4300
Email: rcsmith@anagld.UUCP

---------------------------------
From: Jkrueger <mtxinu!uunet.UU.NET!dgis!daitc.daitc.mil!jkrueger@ucbvax.berkeley.edu>

You can do these things with relational database management systems. We're doing it. There are advantages and disadvantages with respect to text engines.

--------------------------------
From: paul@csnz.co.nz (Paul Gillingwater)

Disclaimer: I work for a dealer that sells BRS/Search.

I believe that BRS/Search is probably a good choice for some applications. We have developed many applications that use it, and find that it has a very good programming interface. I'm less happy with the documentation, although there have been big improvements recently.

Where BRS/Search falls down is that it does not have any relational links between records or strong subrecord structuring. We have programmed around that limitation by writing an interface between BRS/Search and Informix SQL, which gives the best of both worlds, i.e. BRS is used for a full MARC catalogue, and Informix is used for the circulation control system for our library package.

My only other grumble with BRS is its price -- I think it's a bit high -- but now that BRS has been taken over by a publishing empire, I think we may see some more aggressive marketing....

---------------------
I agree that the price seems a bit high... That's one of the main reasons I'm trying to find other packages to choose from.

--------------------
From: mtxinu!prlb.philips.be!sunbim!od@ucbvax.berkeley.edu (Olivier Declerfayt)

"TOPIC is the first full-text search and retrieval system designed for networked computing environments. TOPIC allows you to store, manage, and retrieve any document regardless of its format or location on a network. Employing Concept-Based Retrieval technology, TOPIC features a rule-based approach to searching documents by subject of interest and presenting search results in relevance-ranked order.

TOPIC can be used not only for managing internal text databases, but also for managing external information sources. The TOPIC Real-Time System can automatically monitor any number of live sources of time sensitive text information, such as newswire feeds, and classify and disseminate selected information tailored to each individual user's interest profile. When used as a retrospective searching tool, TOPIC enables organizations to manage and provide access to the increasing amounts of unstructured text and image data, such as reports, manuals, evaluations, and e-mail, that traditional database systems cannot.

The TOPIC SQL-Bridge allows TOPIC full-text databases to be integrated with popular SQL-based RDBMS systems. TOPIC, written in C, runs in heterogeneous networked environments of computers and supports Digital VAX/VMS; UNIX-based workstations and minicomputers, such as Sun Microsystems, Pyramid Series 9000 and MIPS; and DOS and Xenix-based microcomputer systems.
TOPIC is available in two configurations: as a stand alone multi-user system and as a network-based system. TOPIC supports most major file sharing networks, including NFS, DecNet, Novell, 3Com, TOPS, Banyan, TCP/IP, and ArcNet."

Verity's address:
1550 Plymouth Street, Mountain View, CA 94043-1230
Phone: (415) 960 7600, Fax: (415) 960 7698

---------------
Someone asked me if I had an e-mail address for Verity. I did, but can't find it. Maybe someone from Verity could make themselves known...

Andy Lieberman, Library Systems Office, UC Berkeley