[comp.databases] Full text retrieval data products

msa@clinet.UUCP (10/01/87)

In article <4031@well.UUCP> nightjob@well.UUCP (Hank Roberts) writes:
>
>This site has been investigating a full text retrieval database product
>called BRS/SEARCH from BRS Information Technologies in New York; ...
>... asked me to see if I could get any comments on the product. ...

	I'm also interested in comparisons between different
	products offering full text retrieval. About a year
	ago I attempted to find a suitable product by reading
	brochures and other similar material. The target machine
	was VAX/VMS.

	I had to resort to comparing what you could do with
	each of them: can you search for words by their beginning,
	by their ending, or even with a complicated pattern? Can
	you search for consecutive words, or for words in the same
	sentence, paragraph, etc.? Can you build search sets and
	combine them with boolean operators (AND, OR, NOT) to
	form new sets?
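
	To make the checklist above concrete, here is a minimal
	Python sketch of wildcard search, consecutive-word search
	and boolean combination of search sets. It is not how any
	of the products named here work internally; the function
	names (build_index, search, adjacent) and the sample
	documents are made up for illustration.

```python
import fnmatch
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, pattern):
    """Return the search set for a wildcard pattern such as 'retriev*' or '*base'."""
    hits = set()
    for word, doc_ids in index.items():
        if fnmatch.fnmatchcase(word, pattern.lower()):
            hits |= doc_ids
    return hits


def adjacent(docs, first, second):
    """Return ids of documents where `first` is immediately followed by `second`."""
    hits = set()
    for doc_id, text in docs.items():
        words = re.findall(r"\w+", text.lower())
        if any(a == first and b == second for a, b in zip(words, words[1:])):
            hits.add(doc_id)
    return hits


# Invented sample documents, keyed by document id.
docs = {
    1: "Full text retrieval on VAX/VMS",
    2: "A relational database product",
    3: "Text retrieval database with boolean search sets",
}
index = build_index(docs)

s1 = search(index, "retriev*")  # words beginning with "retriev"
s2 = search(index, "*base")     # words ending with "base"

both = s1 & s2        # s1 AND s2
either = s1 | s2      # s1 OR s2
only_first = s1 - s2  # s1 AND NOT s2
phrase = adjacent(docs, "text", "retrieval")  # consecutive words
```

	Search sets are plain Python sets here, so AND, OR and
	NOT map directly onto set intersection, union and
	difference.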

	I read brochures of BRS/SEARCH, STATUS, BASIS and
	some Scandinavian products, like Polydoc, NOTIS-IR
	and TRIP/TDBS. BRS/SEARCH and TRIP/TDBS clearly had
	almost everything you could think of.

	TRIP/TDBS boasted a clear separation of the actual
	database from the user interface, which I consider important.
	I also like the idea of giving a logical structure to the
	archived document (like: headline, ingress, body text
	and any other required fields like numbers, dates etc.).
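
	As a rough illustration of what a logical document
	structure buys you, the Python sketch below models an
	archived article with separate fields and restricts a
	search to one of them. The Article class, its field
	names, the search_field helper and the sample articles
	are all hypothetical; this is not the TRIP/TDBS data
	model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    headline: str
    ingress: str      # lead paragraph
    body: str
    published: date


def search_field(articles, field, word):
    """Return the articles whose named text field contains `word` as a whole word."""
    word = word.lower()
    return [a for a in articles if word in getattr(a, field).lower().split()]


# Invented sample data, for illustration only.
articles = [
    Article("Dock strike ends", "Workers return on Monday",
            "The strike lasted two weeks", date(1987, 9, 14)),
    Article("Markets rally", "Strike settlement lifts shares",
            "Trading was brisk", date(1987, 9, 15)),
]

in_headline = search_field(articles, "headline", "strike")
in_ingress = search_field(articles, "ingress", "strike")
```

	With the fields kept apart, "strike in the headline" and
	"strike in the ingress" become different queries, which a
	flat text file cannot express.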

	For testing purposes I set up various kinds of databases.
	Among others:

	USENET	20-30Mb of *unsorted* material (blew up the thing
		on the first attempt -- it didn't much like the
		UUENCODED and some other funny messages. This
		slight omission was promptly corrected by Paralog --
		after that it ate everything :-).

	STTFNB	for 2 1/2 months we received everything
		from the Finnish News Agency (STT-FNB) and archived
		it. The total amount of archived text was about 25Mb.
		With indexes the required disk space is 75Mb (3 times
		the raw text -- mainly because of Finnish, I suspect).
		(This was the main test)

	Excluding minor bugs in the user interface (in forms input & design),
	it has worked according to the manuals. I'm pretty satisfied
	with it, but as I have no "hands on" experience of the other
	products, I would be interested to hear other opinions.
	Any other users/testers of TRIP/TDBS (descendant of 3RIP)?
	Any hands-on comparisons with BRS/SEARCH or BASIS? Are there
	any suitable "benchmarks" that I could try on our system?

Note:	I cannot send/reply with direct/private mail from this
	host (clinet) -- I can post news (obviously :-)
--
Markku Savela,     msa@clinet.FI
Nokia Information Systems, P.O. BOX 780, SF-00101 Helsinki, Finland