[comp.databases] Query Optimization

cluet@bdblues.altair.fr (Sophie Cluet) (01/14/91)

I am looking for references on the optimization of
nested SQL or SQL-like queries. You know, that kind:

select a1 a2 ... an 
from R 
where ai is in (select ... 
                from ...
                where ...).

I know of one paper
by W. Kim called "on optimizing SQL-like nested queries".
Do you know others?

Thanks in advance, 

Sophie.

peter@doe.utoronto.ca (Peter Mielke) (02/15/91)

I was just at a talk on relational database query optimization using
the technique of "Magic Sets".  I was wondering what RDBMSes use any
form of query optimization (i myself have been bitten quite a number
of times by not writing my SQL query the right way)? Is it still on
the user of SQL to format the query in the right way to get the answer
the fastest way?
-- 
Peter Mielke                                    peter@doe.utoronto.ca
Dictionary of Old English Project               utgpu!utzoo!utdoe!peter
University of Toronto

isaac@goanna.cs.rmit.oz.au (Isaac Balbin) (02/15/91)

peter@doe.utoronto.ca (Peter Mielke) writes:


>I was just at a talk on relational database query optimization using
>the technique of "Magic Sets".  

Was this a talk on using the well known magic sets technique in the
context of a relational database based implementation of a *deductive
database*? Can you tell us who gave the talk?
Even with magic sets for deductive databases, it still may well be
the users responsibility to choose  the best sideways information
passing strategy (SIPS) implemented by the magic set.
I do not know of any work that has automated the choice of best SIPS.
-- 

peter@doe.utoronto.ca (Peter Mielke) (02/19/91)

In <4772@goanna.cs.rmit.oz.au>, Isaac Balbin writes:
> >I was just at a talk on relational database query optimization using
> >the technique of "Magic Sets".  
> 
> Was this a talk on using the well known magic sets technique in the
> context of a relational database based implementation of a *deductive
> database*?

I'm not quite sure. This was the first time i'd every heard of such
things.

> Can you tell us who gave the talk?

The speaker was Inderpal Mumick, Stanford University.
-- 
Peter Mielke                                    peter@doe.utoronto.ca
Dictionary of Old English Project               utgpu!utzoo!utdoe!peter
University of Toronto

nobody@blia.sharebase.com (Nobody at all) (02/19/91)

In article <1991Feb14.192012.6084@doe.utoronto.ca> peter@doe.utoronto.ca (Peter Mielke) writes:
>
>I was just at a talk on relational database query optimization using
>the technique of "Magic Sets".  I was wondering what RDBMSes use any
>form of query optimization (i myself have been bitten quite a number
>of times by not writing my SQL query the right way)? Is it still on
>the user of SQL to format the query in the right way to get the answer
>the fastest way?
>-- 
>Peter Mielke                                    peter@doe.utoronto.ca
>Dictionary of Old English Project               utgpu!utzoo!utdoe!peter
>University of Toronto

Just for the record:  ShareBase I and ShareBase II use an optimiser
that is good for up to about a 4 way join and sometimes to ok on larger
queries.  They are not dependant on the way the query is stated
(ordering of tables and/or qualification clauses (but certain SQL
queries can be stated in a variety very different ways so these might
effect the optimizer).  ShareBase III (in beta test, scheduled for
release around June) does a complete optimization search for up to
about 10 way joins and a pruned search up to 32 way.  As far as I know
it will generate the same query plan for any semanticly equivalent SQL
statements.
(Both optimizers accept explicit directives from the user to set the
query plan if the optimizer does not generate an acceptible one.)

wahl@shodha.enet.dec.com (David Wahl) (02/20/91)

Most commercially available database managers have some form of query
optimizer, although it may not be a cost-based rewriter.  Magic sets
is a general technique for handling queries which are recursively 
defined.  SQL doesn't allow the specification of recursion, so you
won't find a declarative (i.e., without specifying an execution plan)
way of doing recursion in any commercial relational system.

There are a lot of different wrinkles to the query optimization
problem.  If you are looking for a general introduction to the field,
almost any modern text on relational databases has a chapter on the
subject.  A lot of work on recursive query processing has come from
the field of logic and databases.  If you are interested in magic sets
and related subjects, you might find the chapter on deductive
databases in Valduriez & Gardarin's book on Relational Databases and
Knowledge Bases helpful; it's particularly accessible to the general
reader.

As far as commercialization is concerned, more work is needed to
determine the application needs for recursion in relational queries
and methods for specifying and implementing them which suit
applications programmers and end users.  We have been studying the
problem here; there are several other industrial labs looking at the
problem as well.

Regards,
Dave Wahl
===================================================================
Digital Equipment Corporation
Database Systems Research (CXN1/2)
1175 Chapel Hills Drive
Colorado Springs, CO 80920-2080

Tel 719-260-2758
Email: wahl@cookie.enet.dec.com
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The opinions expressed are my own, not Digital's.               %
% "I suppose you would have to be very well educated to get that  %
%  kind of job."                                                  %
% "Extremely well educated.  Typing, everything."                 %
% -- Donald Barthelme, "The Emerald"                              %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

mao@eden.Berkeley.EDU (Mike Olson) (02/20/91)

In <13511@blia.sharebase.com>, mike@woodstock.UUCP (Mike Ubell) writes

> ShareBase III (in beta test, scheduled for
> release around June) does a complete optimization search for up to
> about 10 way joins and a pruned search up to 32 way.  As far as I know
> it will generate the same query plan for any semanticly equivalent SQL
> statements.

does this optimizer consider nested subqueries to be semantically equivalent
to their flattened expressions?  do you have to flatten subqueries in order
to make this happen?  this seems kind of tricky.

					mike olson
					postgres research group
					uc berkeley
					mao@postgres.berkeley.edu

nobody@blia.sharebase.com (Nobody at all) (02/26/91)

In article <11262@pasteur.Berkeley.EDU> mao@eden.Berkeley.EDU (Mike Olson) writes:
>In <13511@blia.sharebase.com>, mike@woodstock.UUCP (Mike Ubell) writes
>
>> ShareBase III (in beta test, scheduled for
>> release around June) does a complete optimization search for up to
>> about 10 way joins and a pruned search up to 32 way.  As far as I know
>> it will generate the same query plan for any semanticly equivalent SQL
>> statements.
>
>does this optimizer consider nested subqueries to be semantically equivalent
>to their flattened expressions?  do you have to flatten subqueries in order
>to make this happen?  this seems kind of tricky.
>
>					mike olson
>					postgres research group
>					uc berkeley
>					mao@postgres.berkeley.edu

I like a good straight man, thanks Mike:

The answer is that it does optimizations across subqueries.  Remembering that
subqueries form an existensial join with the outer query makes this
somewhat tricky since it is not legal to generate multiple values where
the original query would only generate one.

An example using a query that was popular when I was at Berkeley:
The following are not equivalent since the second query is required to
generate one m.name per qualifying record in the cross product while
the first generates one m.name per qualifying record in the single
table.

select m.name from emp m where m.eno =
	(select e.manager from emp e where e.sal > m.sal)

select m.name from emp m, emp e where m.eno = e.manager and e.sal > m.sal;

ShareBase III, however, can use an index on eno to resolve the first query
if that is the best way to reslove it and will get the correct cardinality
of the result.  That is the reason that you might want to flatten the
query.  The first query is equivalent to a select distinct of the second
query.

				Michael Ubell
				ShareBase Corp.
				mike@sharebase.com