don@grc.UUCP (Donald D. Woelz) (09/19/87)
We are in the process of specifying and designing a hardware database accelerator. This product will be capable of doing searches and/or sorts at extremely high speeds (100 to 200 times faster than the CPU would be capable of).

My question to the net is this: What do people who design database software believe to be the bottlenecks to performance?

This ends up being a lot of questions. What percentage of time is spent searching? Sorting? The two are inter-related, so which one would be more important to the database designer to have optimized? What are the parameters of most database searches in terms of the length of the keys? The length of the records? Are the records mostly fixed length or variable length? The same questions apply to sorting.

Would a database designer be particularly put out if there were restrictions such that a certain record length will produce (say) 5 times more performance than another record length (for instance record lengths that are 2**N bytes long, or have an even number of bytes)?

I don't think I have even begun to scratch the surface on the questions that have to be answered (or asked?), but I would appreciate any input from knowledgeable people on this topic.

If you email me your responses, I will summarize and post to the net at some time in the future.

Thanks

Don Woelz
GENROCO, Inc.
*********************************************************
{rutgers, harvard, ames, seismo(?)}!uwvax!uwmcsd1!grc!don
*********************************************************
larry@xanadu.uucp (Larry Rowe) (09/23/87)
In article <740@grc.UUCP> don@grc.UUCP (Donald D. Woelz) writes:
>We are in the process of specifying and designing a hardware database
>accelerator. This product will be ...
>

i suggest that you be careful before you pursue this product strategy. i know of at least one hardware sorting box that was linked into a commercial dbms that only speeded up queries by at most a factor of 7. this was best case performance. average was much lower (2-3). so, you have to ask yourself the following question: ``how many people will buy my box to go twice as fast?'' remember that you have to be price-competitive with a bigger processor from the current vendor AND you have to maintain your performance advantage as new products are delivered by the hardware vendor.

also, remember to include the cost of disk storage in your sort box if you aren't going to sort in memory or in stages. (if you sort in stages, why will your processor be faster than vanilla hardware?) if you don't want to buy separate disks, then you've got a really serious problem because you have to understand the page layout of a random dbms (all implementations are different) and the lock manager abstraction as well (most dbms's don't give user access to the lock manager).

what are the big problems? well, first off, you have to support all of the data types and comparison operators implemented by the dbms. that may be easier now that there is an sql standard. you must support multiple field keys.

if you haven't heard about it, jim gray wrote a parallel sort package for the tandem system that performs a tournament sort in parallel. very elegant and fast.

rumor from the ozone....the teradata Y-net does not significantly impact system performance. they'd go just as fast with a vanilla ethernet. now, suppose they implemented a parallel sort algorithm on their local net....

larry
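
One way to read the numbers in this thread is through Amdahl's law: if only the sort portion of a query is accelerated, the rest of the query bounds the overall gain. The sketch below is illustrative only and is not taken from either post. It assumes a sort speedup of 150 (an arbitrary midpoint of the 100 to 200 range quoted above), assumes the accelerated and unaccelerated parts do not overlap, and uses hypothetical function names.

```python
def overall_speedup(sort_fraction, sort_speedup):
    """Amdahl's law: whole-query speedup when only the sort part is accelerated.

    sort_fraction -- share of query time spent sorting (0.0 to 1.0)
    sort_speedup  -- how much faster the accelerator sorts than the CPU
    """
    return 1.0 / ((1.0 - sort_fraction) + sort_fraction / sort_speedup)


def sort_fraction_for(overall, sort_speedup):
    """Invert the formula: the sort share needed to observe a given overall speedup."""
    # From overall = 1 / ((1 - p) + p / s), solve for p.
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / sort_speedup)


if __name__ == "__main__":
    ASSUMED_SORT_SPEEDUP = 150.0  # assumption: midpoint of the quoted 100-200x range

    # How the overall gain varies with the share of time spent sorting.
    for fraction in (0.25, 0.50, 0.75, 0.90):
        gain = overall_speedup(fraction, ASSUMED_SORT_SPEEDUP)
        print(f"sort share {fraction:.2f} -> overall speedup {gain:.2f}x")

    # What sort share would be consistent with the reported 2x, 3x and 7x figures.
    for observed in (2.0, 3.0, 7.0):
        share = sort_fraction_for(observed, ASSUMED_SORT_SPEEDUP)
        print(f"observed {observed:.0f}x -> implied sort share {share:.2f}")
```

Under these assumptions, an overall gain of about 2 is what the model gives when roughly half of the query time is spent sorting, no matter how fast the sort hardware itself is.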