[comp.databases] Summary: Full Text Database Products

tony@rata.vuw.ac.nz (Tony Martindale) (02/20/90)

		Summary: Full Text Database products
		====================================

  As promised, here is the summary.  Thanks to those people who
replied; there were seven replies in total.  I have edited them
slightly, removing quotations of the original posting and inessential
material from the mail headers.  Any more information concerning full
text database products would be appreciated.
  A brief summary of the products that the replies cover is as follows:

	Empress 	- relational database with capability of handling text.
	TITAN 		- full text database system from Australia!
	Ful/Text	- full text database system from Fulcrum Technologies.
	Information	- Pick with indexes maintained for you, from Prime.

  See the replies for more details.

  The original posting:
  --------------------

  If you know of any products that allow textual storage, indexing,
searching and retrieval please tell me about them (via mail
please!:-).  The name of the product and a contact address and/or
phone number of someone who sells it in the U.S. or in our corner of
the world would be nice.

  Products/systems that can be used on VAX/VMS and/or (preferably
*and*) Unix are of primary interest, but we would also be interested
in knowing what is available in the Mac and PC worlds.

  So far we have heard of the following products: Basis(plus),
BRS/Search and Topic.  If there are more than one or two replies I
will post a summary.

  The replies:
  -----------

From: louk@tslwat.tsl.com

>Newsgroups: comp.databases
>Organization: Teleride Sage, Ltd., Waterloo

Empress from Empress Software in Toronto is a relational database that
supports arbitrarily long byte strings.    You can define your own
SQL operators so just about any text operation should be possible.
Empress is available on VAX/VMS, UNIX and in a single user MS-DOS version
(they are working on the multi-user MS-DOS version).   They advertise
in UNIX World.


From: djc@mbunix.mitre.org (Cazier)
Posted-From: The MITRE Corp., Bedford, MA
X-Alternate-Route: user%node@mbunix.mitre.org
Organization: MITRE Corp., Houston, TX

The UniForum show had a few vendors touting their wares in text processing -
or better yet compound document manipulation. Few are there yet. If you
could get a listing of the UniForum attendees at 312-938-1228 that might
help. One other source is actually calling a VENDOR like VERITY at
415-960-7698 to get started and see what can be done. They might
even tell you of other vendors they compete against.
-- 
Jacques Cazier (713)-333-0966
{decvax,philabs}!linus!mbunix!jak or jak@mbunix.mitre.org


From: munkew@csli.stanford.edu (Mun-Kew Leong)
Organization: Center for the Study of Language and Information, Stanford U.

Try something closer to home. I have used a product called TITAN from
a firm called Knowledge Engineering, located in Melbourne. It has affiliations
with the Royal Melbourne Institute of Technology. Sorry I don't have a
more detailed address. TITAN runs on any flavour of UNIX and has a screen
oriented interface, though a C interface was promised. I've looked at
several full-text systems and TITAN is the most impressive.

Disclaimer: (sigh) I ain't got nothing to do with TITAN, KE, or RMIT.
	    I just like TITAN (a lot).

Mun-Kew LEONG                                    MUNKEW@CSLI.STANFORD.EDU
Institute of Systems Science        currently:   c/o Philosophy Dept.
National University of Singapore                 Stanford University
Kent Ridge, Singapore.                           Stanford, CA 94305, USA


Return-Path: <dgh%unify%csusac%lll-crg%lll-winken%uwm@uunet.uu.net>
From: dgh%uwm%unify%csusac%lll-crg@uunet.uu.net (David Harrington)
Organization: Unify Corporation, Sacramento, CA, USA

Check out Fulcrum Technologies Inc.
	  560 Rochester St.
	  Ottawa, Canada K1S 5K2
	  (613) 238-1761

They have a pretty powerful text retrieval engine.

-- 
David Harrington		                       internet: dgh@unify.UUCP
Unify Corporation		                 ...!{csusac,pyramid}!unify!dgh
3870 Rosin Court                                          voice: (916) 920-9092
Sacramento, CA 95834                                        fax: (916) 921-5340


Return-Path: <@qucdn.queensu.ca:nordin@qucis.queensu.ca>
From: nordin@qucis.queensu.ca

I saw your note about full text retrieval.  You should look into a product
from Fulcrum Technologies Inc.  The stuff is called Ful/Text and runs on
just about any platform you could want (except IBM mainframe style O/Ss).
Their product is sold primarily to OEMs and has been incorporated into a
large number of manufacturers' offerings.  They are also playing a big role
in CD-ROMs and electronic publishing.  Here is the contact info:

Fulcrum Technologies Inc.
560 Rochester St.
Ottawa, Ontario
CANADA
K1S 5K2
ph: (613)238-1761
fax: (613)238-7695

I am not involved with Fulcrum at all.  I used their software three years
ago in a research environment - it was very powerful.

Brent Nordin
nordin@qucis.queensu.ca
nordin@qucis.bitnet
(613)531-9226 if all else fails


Return-Path: <burke%seachg%darkover%attcan%gpu.utcs.utoronto.ca%utai@uunet.uu.net>
From: burke%seachg@uunet.uu.net ()

A few years ago, I had an opportunity to use Fultext from a company called
Fulcrum in Ottawa, Ontario, Canada. 

The package was fast (for its day) and demo'd very well.

You may wish to contact them to see what O/S bases they now support. When I
last had dealings with them, they only had a Unix-based product. But that was
nearly 4 years ago.

Michael Burke
Sea Change Corporation
Mississauga, Ontario
Canada


From: ghm@ccadfa.cc.adfa.oz.au (Geoff Miller)
Organization: Computer Centre, Australian Defence Force Academy

Prime's "Information" (which is, as you may know, a Pick clone with some
extra goodies) can be used to provide a very effective system providing 
text storage, indexing and retrieval.  All you have to do for this is 
write a simple routine to remove unwanted keywords (the, and, of...)
from the index, which Information maintains automatically.  A similar
system could be set up very easily in generic Pick, except you would have
to write your own index-maintenance routine.
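
A minimal sketch of the stop-word idea described above, written in Python
rather than Information or Pick (whose actual routines are not shown in
this thread).  The stop-word list, the function names and the sample
documents are all illustrative assumptions.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real installation would tune this.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is"}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each non-stop word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            if word not in STOP_WORDS:
                index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every non-stop word in the query."""
    words = [w for w in tokenize(query) if w not in STOP_WORDS]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up sample documents, keyed by document id.
documents = {
    1: "The storage and retrieval of text",
    2: "Indexing of text in a relational database",
    3: "The history of Unix",
}
index = build_index(documents)
print(sorted(search(index, "retrieval of text")))
print(sorted(search(index, "text")))
```

In generic Pick the equivalent of build_index would have to be rerun (or
updated incrementally) whenever a record changes, which is the
index-maintenance routine mentioned above.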

Pick is available in the Unix environment (UniVerse), while Information
Release 8 (which provides automatic indexing) is only available under Prime's
Primos O/S.  However, Prime do also market Unix boxes, and are talking about
providing the full functionality of Release 8 Information on them sometime
(probably late) next year.

Geoff Miller  (ghm@cc.adfa.oz.au)


Tony Martindale                           Computing Services Centre,
Domain: tony@rata.vuw.ac.nz               Victoria University of Wellington,
Path: ...!uunet!vuwcomp!rata!tony         P.O. Box 600, NEW ZEALAND.

jkrueger@dgis.dtic.dla.mil (Jon) (02/20/90)

tony@rata.vuw.ac.nz (Tony Martindale) writes:

>Empress from Empress Software in Toronto is a relational database that
>supports arbitrarily long byte strings.    You can define your own
>SQL operators so just about any text operation should be possible.

You can't define indexing on these objects, however.  So the
operation will execute in time linear in table size.
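
The linear cost can be illustrated with a small sketch.  It uses SQLite
through Python's sqlite3 module as a stand-in, not Empress; the table
name, the contains_word function and the sample rows are assumptions made
up for the example.  The user-defined operator is called once per row and
the query plan reports a full table scan.

```python
import sqlite3

# Counts how many times the user-defined function is invoked.
call_count = {"n": 0}


def contains_word(body, word):
    """User-defined SQL function: 1 if `word` occurs in the byte string."""
    call_count["n"] += 1
    return int(word.encode() in bytes(body))


conn = sqlite3.connect(":memory:")
conn.create_function("contains_word", 2, contains_word)
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body BLOB)")

# 1000 made-up rows; every hundredth one mentions "text".
rows = []
for i in range(1, 1001):
    topic = "text" if i % 100 == 0 else "other"
    rows.append((i, f"document {i} about {topic} things".encode()))
conn.executemany("INSERT INTO docs VALUES (?, ?)", rows)

query = "SELECT id FROM docs WHERE contains_word(body, ?)"
matches = sorted(r[0] for r in conn.execute(query, ("text",)))

# The last column of each EXPLAIN QUERY PLAN row is the readable detail.
plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + query, ("text",))]

print(len(matches), "matches after", call_count["n"], "function calls")
print(plan)
```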

There's nothing inherent in the relational model that prevents one from
handling user-defined types as first-class objects.  But to date no
commercial RDBMS has done so.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
The Philip Morris Companies, Inc: without question the strongest
and best argument for an anti-flag-waving amendment.

anders@penguin (Anders Wallgren) (02/21/90)

In article <1990Feb19.235547.8915@comp.vuw.ac.nz>, tony@rata (Tony Martindale) writes:
>From: djc@mbunix.mitre.org (Cazier)
>Posted-From: The MITRE Corp., Bedford, MA
>X-Alternate-Route: user%node@mbunix.mitre.org
>Organization: MITRE Corp., Houston, TX
>
>The UniForum show had a few vendors touting their wares in text processing -
>or better yet compound document manipulation. Few are there yet. If you
>could get a listing of the UniForum attendees at 312-938-1228 that might
>help. One other source is actually calling a VENDOR like VERITY at
>415-960-7698 to get started and see what can be done. They might
>even tell you of other vendors they compete against.
>-- 
>Jacques Cazier (713)-333-0966
>{decvax,philabs}!linus!mbunix!jak or jak@mbunix.mitre.org
>
>

You might have better luck calling Verity if you use 415-960-7600,
unless you speak FAX that is.

Anders Wallgren
Verity, Inc.

tim@ohday.sybase.com (Tim Wood) (02/21/90)

In article <768@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>
>You can't define indexing on [arbitrarily long byte strings], however.  
>So the operation will execute in time linear to table size.
>
>There's nothing inherent in the relational model that prevents one from
>handling user-defined types as first-class objects.  But to date no
>commercial RDBMS has done so.

One can index BLOBs (basic large objects--conceptual instances of 
a user-defined datatype) by maintaining an identifier column value
of a simple datatype for each row containing a BLOB.  The indexing 
on the simple datatype is bound to be faster and more reliable than
the one-off indexing code a user writes for his/her BLOB datatype.
Moreover, one can change the "ordering" of the BLOBs by changing 
the identifier values, whereas with directly-indexed BLOBs one would need
to change the type-specific indexing algorithm.  This amounts to changing 
part of the datatype (or object type) definition.  
-TW
---
Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA / 94608	  415-596-3500
tim@sybase.com          {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
		This message is solely my personal opinion.
		It is not a representation of Sybase, Inc.  OK.
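
A rough sketch of the identifier-column approach described in the post
above, using SQLite via Python's sqlite3 module as a stand-in rather than
Sybase.  The table, column and index names are hypothetical.  The BLOB
column itself is never indexed; lookups and ordering go through the
simple sort_key column, and reordering means updating sort_key values
only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs ("
    " doc_id INTEGER PRIMARY KEY,"
    " sort_key INTEGER,"   # simple identifier column that carries the ordering
    " body BLOB)"          # the large object, left unindexed
)
conn.execute("CREATE INDEX docs_by_sort_key ON docs (sort_key)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [(1, 30, b"third"), (2, 10, b"first"), (3, 20, b"second")],
)


def ordered_ids(connection):
    """Return doc ids in the order imposed by the identifier column."""
    cursor = connection.execute("SELECT doc_id FROM docs ORDER BY sort_key")
    return [row[0] for row in cursor]


before = ordered_ids(conn)

# Reorder the BLOBs by changing an identifier value; the BLOB is untouched.
conn.execute("UPDATE docs SET sort_key = 5 WHERE doc_id = 1")
after = ordered_ids(conn)

# A lookup by identifier can use the index instead of scanning every row.
lookup = "SELECT body FROM docs WHERE sort_key = ?"
plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + lookup, (20,))]

print(before, after)
print(plan)
```

Note that nothing in the database ties sort_key to the content of body;
keeping the two consistent is left to whoever assigns the identifier
values.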

jkrueger@dgis.dtic.dla.mil (Jon) (02/21/90)

tim@ohday.sybase.com (Tim Wood) writes:

>One can index BLOBs ...  by maintaining an identifier column value
>of a simple datatype for each row containing a BLOB.  The indexing 
>on the simple datatype is bound to be faster and more reliable than
>the one-off indexing code a user writes for his/her BLOB datatype.

I can show counterexamples (they happen to be textual datatypes).
So it's not "bound to be".  Might be on average.  What it's bound to
be is more subvertible, less safe, less productive.  Simulated data
types are like that.

>Moreover, one can change the "ordering" of the BLOBs by changing 
>the identifier values, whereas with directly-indexed BLOBs one would need
>to change the type-specific indexing algorithm.  This amounts to changing 
>part of the datatype (or object type) definition.  

Sounds like a bad idea to me.  I prefer object ordering to be defined
by the object.  E.g. you mind if I change ordering of ints?  Say by one 
application without other applications' knowledge?
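
To illustrate the distinction, here is a small Python sketch (not tied to
any RDBMS).  The Title class and its ordering rule are made-up
assumptions.  When the type defines its own ordering, every caller gets
the same order; when an application supplies its own key, two
applications can disagree about the order of the same objects.

```python
from functools import total_ordering


@total_ordering
class Title:
    """A text type that carries its own ordering rule."""

    def __init__(self, text):
        self.text = text

    def sort_key(self):
        # Hypothetical rule: ignore case and a leading article.
        words = self.text.lower().split()
        if words and words[0] in {"a", "an", "the"}:
            words = words[1:]
        return " ".join(words)

    def __eq__(self, other):
        if not isinstance(other, Title):
            return NotImplemented
        return self.sort_key() == other.sort_key()

    def __lt__(self, other):
        if not isinstance(other, Title):
            return NotImplemented
        return self.sort_key() < other.sort_key()


titles = [Title("The Zoo"), Title("apple"), Title("A Banana")]

# Ordering defined by the object: the same for every application.
by_type = [t.text for t in sorted(titles)]

# Ordering imposed by one application (raw character order on the text).
by_app = [t.text for t in sorted(titles, key=lambda t: t.text)]

print(by_type)
print(by_app)
```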

Tim, it's strange to hear a Sybase guy argue for moving functionality
from the database into the application.  You feeling ok?  :-)

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
The Philip Morris Companies, Inc: without question the strongest
and best argument for an anti-flag-waving amendment.