robf@squid.rtech.com (Robert Fair) (02/03/90)
>From: may@28182.dec.com (Patrick May) writes:
>>In article <1990Jan31.010510.15242@welch.jhu.edu>, francois@welch.jhu.edu (Francois Schiettecatte) writes...
>>Does anyone out there know of any books which might help
>>me to optimise sql statements which are to be sent to a
>
>An issue of _Database Programming and Design_ sometime around October or
>so 1988 had an article on this.  I'll try and dig up a more precise
>reference.
>
>Their major points concerned avoiding the use of IN, ANY, and ALL in the
>WHERE clause -- the same functionality can be achieved with EXISTS and
>at the same time force the use of an index if available.  There were
>also some suggestions on ordering the table names in the FROM clause,
>but these were more implementation dependent.

1) The order of table names in the FROM clause should make NO DIFFERENCE
to the query plan generated if the DBMS has any kind of decent optimizer.
Having said that, some DBMS products today still don't have proper
optimizers, so you may still be forced to munge the FROM clause to get
decent performance.

2) IN, ANY, and ALL are indeed equivalent to EXISTS in many cases, but
that doesn't mean you shouldn't use them.  If the DBMS has a good
optimizer, equivalent expressions should usually generate very similar
query plans.  It sounds like the article was written after using a DBMS
whose optimizer was only capable of handling EXISTS effectively.

So, look for a product which has a smart optimizer and save yourself a
lot of hassle.  Things to look for include:

 - A statistical optimizer which has data distribution information
   available for smart use of indexes.
 - Flattening of subqueries into joins automatically when appropriate.
 - Multiple join strategies (tuple substitution, sort-merge etc).
 - A cost-based method which accounts for things like CPU overhead and
   disk I/O when choosing the best query plan.
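To make the IN-versus-EXISTS point concrete, here's a minimal sketch using Python's sqlite3 module (SQLite, not any of the products under discussion; the emp/dept tables are invented for the example).  The two forms express the same predicate and return the same rows; whether they get the same plan is up to the optimizer:

```python
import sqlite3

# Invented example schema -- two tables, employees referencing departments.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dept (deptno INTEGER PRIMARY KEY, loc TEXT);
    CREATE TABLE emp  (empno INTEGER PRIMARY KEY, name TEXT, deptno INTEGER);
    INSERT INTO dept VALUES (10, 'NY'), (20, 'LA');
    INSERT INTO emp  VALUES (1, 'Smith', 10), (2, 'Jones', 20), (3, 'Brown', 10);
""")

# Form 1: IN subquery
q_in = """SELECT name FROM emp
          WHERE deptno IN (SELECT deptno FROM dept WHERE loc = 'NY')
          ORDER BY name"""

# Form 2: correlated EXISTS -- same functionality, rewritten
q_exists = """SELECT name FROM emp e
              WHERE EXISTS (SELECT 1 FROM dept d
                            WHERE d.deptno = e.deptno AND d.loc = 'NY')
              ORDER BY name"""

r1 = [row[0] for row in con.execute(q_in)]
r2 = [row[0] for row in con.execute(q_exists)]
print(r1, r2)   # both: ['Brown', 'Smith']
```

With a decent optimizer the choice between the two should be a matter of readability, not performance.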
If you have a distributed database (spread across several different
machines), the optimizer should also be capable of allowing for things
like network overhead etc.

Given a smart optimizer which turns your SQL into an intelligent query
plan, a more important issue becomes your database schema (layout): do
you have appropriate indexes/storage structures on the columns which can
use them?  Is the data normalized appropriately?  Etc.  If you have a
huge database with absolutely no indexes, the smartest optimizer in the
world won't help (although it'll find the best way of getting the result
without using any indexes!).

Robert L. Fair
Ingres Corporation
Technical Support Group.
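The index point above can be illustrated with SQLite's EXPLAIN QUERY PLAN facility (a sketch only -- the mechanics differ per product, the table is invented, and the exact plan wording varies by SQLite version):

```python
import sqlite3

# Invented table; no index to start with.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (empno INTEGER, name TEXT, deptno INTEGER)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row describes the step taken
    return " / ".join(row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT name FROM emp WHERE deptno = 10"

before = plan(q)   # without an index the only plan is a full scan of emp
con.execute("CREATE INDEX emp_deptno ON emp(deptno)")
after = plan(q)    # with the index the optimizer can search instead

print(before)      # e.g. 'SCAN emp'
print(after)       # e.g. 'SEARCH emp USING INDEX emp_deptno (deptno=?)'
```

Which is exactly the point: the same query, the same optimizer, and the plan is only as good as the storage structures you gave it to work with.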
jkrueger@dev.dtic.dla.mil (Jonathan Krueger) (02/05/90)
francois@welch.jhu.edu (Francois Schiettecatte) writes:
>Does anyone out there know of any books which might help
>me to optimise sql statements which are to be sent to a
>database.  What I really need are a set of guidelines, of
>dos and don'ts, of what type of query to use, etc.

It isn't this simple.  First design your database.  There are books on
this.  If you do it right, you'll find that most commonly executed
queries can be made to go fast without special attention from the
application programmer.

Next get the right performance analysis tools.  At a minimum, you need
a way to measure how expensive your queries are and a way to find out
how your engine is executing them.  Use these tools.  You'll likely
find that the queries you thought were problems can indeed be made to
go faster, but that fixing them won't help your application nearly as
much as optimizing others you never considered.  This may lead you back
to database design again.

Then, and only then, is it worth playing games with "exists" versus
"in".  Yes, certain families of queries will be poorly optimized across
broad classes of vendor products and versions.  You're unlikely to
learn which ones from advertiser rags like "Database Programming and
Design".

-- Jon
--
Jonathan Krueger    jkrueger.dtic.dla.mil   uunet!dgis!jkrueger

Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?
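The "measure first" advice above can be sketched in a few lines.  This is a toy workload profiler in Python over SQLite (real products ship their own monitoring tools; the table, queries, and run counts are all invented for illustration).  The idea is to rank queries by *total* elapsed cost over the workload, since a cheap query run thousands of times can matter more than the slow one you were staring at:

```python
import sqlite3
import time

# Invented table with some filler rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (k INTEGER, v TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?)", ((i, "x") for i in range(1000)))

# Invented workload: (sql, parameters, how many times it runs).
workload = {
    "point lookup":   ("SELECT v FROM t WHERE k = ?", (500,), 200),
    "full aggregate": ("SELECT COUNT(*) FROM t WHERE v = ?", ("x",), 1),
}

# Measure total elapsed time per query over the whole workload.
costs = {}
for name, (sql, args, runs) in workload.items():
    start = time.perf_counter()
    for _ in range(runs):
        con.execute(sql, args).fetchall()
    costs[name] = time.perf_counter() - start

# Rank by total cost -- optimize the top of this list first.
for name, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {cost:.4f}s total")
```

Pairing this kind of measurement with the engine's own plan output (however your product exposes it) tells you both *which* queries to work on and *why* they are slow.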
nigelc@cognos.UUCP (Nigel Campbell) (02/12/90)
From experience with Interbase, DG/SQL, VAX Rdb, and a little Oracle, I
would strongly recommend that you go on the vendors' courses.  There are
some very subtle features of each query optimizer that a generic SQL
book cannot hope to bring out.  E.g. does the optimizer process multiple
indexes in parallel?  Does it place more significance on a unique vs a
repeating index?  Does an index segment with a null value affect the
decision?  These and many more cannot be found in an SQL book.

The major success lies in investing the time in getting a correct
logical entity-relationship model that is then transposed into a
normalised database design.  It is also worthwhile determining what the
nature and volume of the data will be, as this could have an impact on
the long-term performance of the system.  Many sites I have visited have
not considered archival of data, or, due to a poor design, have not
investigated the life cycle of an entity and the impact of its death on
others.  This usually leads to unwanted data being left in tables.
sweeneyd@handel.CS.ColoState.Edu (daniel j sweeney) (02/12/90)
In article <3601@dev.dtic.dla.mil> jkrueger@dev.dtic.dla.mil (Jonathan Krueger) writes:
>francois@welch.jhu.edu (Francois Schiettecatte) writes:
>
>>Does anyone out there know of any books which might help
>>me to optimise sql statements which are to be sent to a
>>database.  What I really need are a set of guidelines, of
>>dos and don'ts, of what type of query to use, etc.
>
>It isn't this simple.

[some of the response deleted]

>Next get the right performance analysis tools.  At a minimum, you need
>a way to measure how expensive your queries are and a way to find out
>how your engine is executing them.  Use these tools.  You'll likely
>find that the queries you thought were problems can indeed be made to
>go faster, but that it won't help your application nearly as much as
>optimizing others you never considered.  This may lead you back to
>database design again.

[some of the response deleted]

>Jonathan Krueger    jkrueger.dtic.dla.mil   uunet!dgis!jkrueger

I'm a student at CSU taking my first database class.  I recently read
an article in ACM Transactions on Database Systems, March 1989, entitled
"On Estimating the Cardinality of the Projection of a Database Relation"
by Ahad, Rao, and McLeod.  The authors present a formula for estimating
the cost of a query (in disk accesses) with acceptable error bounds.
The formula does not require a scan of the relation.

This article has intrigued me.  I would like to do a project for my
class on estimating the cost of queries.  However, I am not familiar
with what performance analysis tools are available or which schools are
doing research in this area.  If anyone can point me towards articles,
research, and software (for Vax, HP, Sequent Balance, or any UNIX-based
OSs), I would be grateful.

Thanks,
--
Dan Sweeney                       sweeneyd@handel.cs.colostate.edu
Department of Computer Science    sweeneyd%colostate.csnet@relay.cs.net
Colorado State University (RAMS)
Fort Collins, CO 80523
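The Ahad/Rao/McLeod formula itself isn't reproduced in this thread, but as a starting point for such a project, here is the much simpler classic estimate (often attributed to Cardenas) for the expected cardinality of a projection under a uniform-distribution assumption, checked against a small simulation.  A sketch only -- this is NOT the paper's method:

```python
import random

def expected_distinct(n, d):
    """Expected number of distinct values seen when n tuples each draw one
    of d equally likely values: d * (1 - (1 - 1/d)**n).
    Classic uniformity estimate, NOT the Ahad/Rao/McLeod formula."""
    return d * (1.0 - (1.0 - 1.0 / d) ** n)

# Compare the estimate against a simulated relation of n tuples --
# note the estimate itself never scans the data, it only needs n and d.
random.seed(1)
n, d = 1000, 500
observed = len({random.randrange(d) for _ in range(n)})
estimate = expected_distinct(n, d)
print(observed, round(estimate, 1))   # the two should be close
```

The interesting part of the TODS paper is precisely where this toy model breaks down: real attributes are not uniformly distributed, which is why better scan-free estimators are worth a class project.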