[comp.databases] DATABASE ACCELERATOR

don@grc.UUCP (Donald D. Woelz) (09/19/87)

We are in the process of specifying and designing a hardware database
accelerator.  This product will be able to perform searches and/or
sorts at extremely high speeds (100 to 200 times faster than the host
CPU could do them).

My question to the net is this:  What do people who design database
software believe to be the bottlenecks to performance?

This ends up being a lot of questions.  What percentage of time is
spent searching?  Sorting?  The two are interrelated, so which one
would be more important to the database designer to have optimized?
What are the parameters of most database searches in terms of the
length of the keys?  The length of the records?  Are the records
mostly fixed length or variable length?  The same questions apply
to sorting.

Would a database designer be particularly put out if there were
restrictions such that a certain record length would produce (say) 5 times
more performance than another record length (for instance, record
lengths that are 2**N bytes long, or have an even number of bytes)?
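
As a concrete (hypothetical) example of the kind of restriction I mean,
here is a small sketch of how a host program might pad each record out to
a power-of-two length before handing it to the accelerator.  The record
contents and sizes below are made up for illustration.

/* Pad a record buffer to the next power-of-two length (illustrative only). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Round len up to the next power of two (minimum 1). */
static size_t round_up_pow2(size_t len)
{
    size_t p = 1;
    while (p < len)
        p <<= 1;
    return p;
}

int main(void)
{
    const char rec[] = "SMITH|JOHN|1987-09-19";  /* a 21-byte sample record */
    size_t raw = sizeof rec - 1;                 /* exclude the trailing NUL */
    size_t padded = round_up_pow2(raw);          /* 32 bytes in this case */
    char *buf = calloc(padded, 1);               /* padding bytes are zeroed */

    if (buf == NULL)
        return 1;
    memcpy(buf, rec, raw);
    printf("raw length %zu -> padded length %zu\n", raw, padded);
    free(buf);
    return 0;
}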

I don't think I have even begun to scratch the surface on the questions
that have to be answered (or asked?), but I would appreciate any input
from knowledgeable people on this topic.  If you email me your
responses, I will summarize and post to the net at some time in the
future.

Thanks
Don Woelz
GENROCO, Inc.
*********************************************************
{rutgers, harvard, ames, seismo(?)}!uwvax!uwmcsd1!grc!don
*********************************************************

larry@xanadu.uucp (Larry Rowe) (09/23/87)

In article <740@grc.UUCP> don@grc.UUCP (Donald D. Woelz) writes:
>We are in the process of specifying and designing a hardware database
>accelerator.  This product will be ...
>
i suggest that you be careful before you pursue this product
strategy.  i know of at least one hardware sorting box that was
linked into a commercial dbms that only sped up queries by at
most a factor of 7.  this was best case performance.  average was
much lower (2-3).  so, you have to ask yourself the following question:
``how many people will buy my box to go twice as fast?''  remember
that you have to be price-competitive with a bigger processor from the
current vendor AND you have to maintain your performance advantage as new
products are delivered by the hardware vendor.  also, remember to include
the cost of disk storage in your sort box if you aren't going to sort
in memory or in stages.  (if you sort in stages, why will your processor
be faster than vanilla hardware?)  if you don't want to buy separate
disks, then you've got a really serious problem because you have to
understand the page layout of a random dbms (all implementations are
different) and the lock manager abstraction as well (most dbms's don't
give user access to the lock manager).
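
to put rough numbers on the speedup point: it's basically amdahl's law.
here is a back-of-the-envelope calculation (the 60% sort fraction and the
150x figure below are made-up assumptions, not measurements from any dbms)
showing how a 100-200x sort engine collapses to a 2-3x query speedup:

/* why a very fast sort engine yields only a modest query speedup */
#include <stdio.h>

int main(void)
{
    double sort_fraction = 0.60;   /* assume 60% of query time is sorting */
    double accel = 150.0;          /* assumed speedup on the sort itself */
    double overall = 1.0 / ((1.0 - sort_fraction) + sort_fraction / accel);

    printf("overall query speedup: %.2f\n", overall);   /* about 2.5x */
    return 0;
}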

what are the big problems?  well, first off, you have to support all
of the data types and comparison operators implemented by the dbms.
that may be easier now that there is an sql standard.  you must also
support multi-field keys.
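
as a concrete illustration of the multi-field key point, here is a tiny
sketch (the record layout is invented) of the kind of composite comparator
the box would have to reproduce, field by field in key order:

/* compare two records on a two-field key: dept first, then empno */
#include <stdio.h>
#include <string.h>

struct rec {
    char dept[8];    /* key field 1: fixed-width character */
    int  empno;      /* key field 2: signed integer */
};

static int cmp_rec(const struct rec *a, const struct rec *b)
{
    int c = strncmp(a->dept, b->dept, sizeof a->dept);
    if (c != 0)
        return c;
    return (a->empno > b->empno) - (a->empno < b->empno);
}

int main(void)
{
    struct rec x = { "SALES", 42 };
    struct rec y = { "SALES", 7 };
    printf("cmp = %d\n", cmp_rec(&x, &y));   /* positive: x sorts after y */
    return 0;
}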

if you haven't heard about it, jim gray wrote a parallel sort package
for the tandem system that performs a tournament sort in parallel.  very
elegant and fast.  rumor from the ozone....the teradata Y-net does not
significantly impact system performance.  they'd go just as fast with
a vanilla ethernet.  now, suppose they implemented a parallel sort
algorithm on their local net....
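
for the curious, here is a toy in-memory illustration of the selection-tree
idea behind a tournament sort.  the runs below are made up, and a real
implementation would replay only the one leaf-to-root path per output
record (and could run the leaf comparisons on separate processors or
stream the runs from disk):

/* merge K sorted runs by repeatedly picking the tournament winner */
#include <stdio.h>
#include <limits.h>

#define K 4   /* number of runs (a power of two keeps the tree simple) */

static const int runs[K][4] = {
    { 3, 8, 15, INT_MAX },   /* INT_MAX marks the end of a run */
    { 1, 9, 20, INT_MAX },
    { 2, 5, 30, INT_MAX },
    { 4, 6,  7, INT_MAX },
};
static int pos[K];           /* current position within each run */

static int head(int r) { return runs[r][pos[r]]; }

int main(void)
{
    int tree[2 * K];         /* tree[K..2K-1] = leaves, tree[1..K-1] = winners */
    int i, r;

    for (;;) {
        /* this toy rebuilds the whole tree each time; a real tournament
           tree replays only the path from the leaf that changed */
        for (i = 0; i < K; i++)
            tree[K + i] = i;
        for (i = K - 1; i >= 1; i--) {
            int a = tree[2 * i], b = tree[2 * i + 1];
            tree[i] = (head(a) <= head(b)) ? a : b;
        }
        r = tree[1];
        if (head(r) == INT_MAX)
            break;           /* all runs exhausted */
        printf("%d\n", head(r));
        pos[r]++;            /* advance the winning run */
    }
    return 0;
}
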
	larry