[comp.databases] Join Contest

gus@plains.NoDak.edu (07/13/90)

Recently I heard that an IBM-type db person claimed that it's
not uncommon in commercial db applications to join as many as
10-15 (maybe even 30) tables in a single query.

This seems like an incredible number of joins, especially if the
tables are large, on the order of 1/2 million tuples.

Any comment/references, etc.?

miket@blia.BLI.COM (Mike Tossy) (07/14/90)

In article <5265@plains.UUCP>, gus@plains.NoDak.edu writes:
> 
> 
> Recently I heard that an IBM-type db person claimed that it's
> not uncommon in commercial db applications to join as many as
> 10-15 (maybe even 30) tables in a single query.
> 
> This seems like an incredible number of joins, especially if the
> tables are large, on the order of 1/2 million tuples.
> 
> Any comment/references, etc.?

We have an installation in the U.S. Navy (a personnel application) that
routinely runs 15 way joins.  They were very pleased when we increased
the limit with ShareBase II (from 16 tables under ShareBase I).

My observation is that the number of tables used by an application 
increases dramatically if the database design is well normalized.

--
						Teradata Corporation
Mike Tossy                                      ShareBase Division
miket@blia.bli.com                              14600 Winchester Blvd
(408) 378-7575 ext2200                          Los Gatos, CA 95030
					(Formerly: Britton Lee, Inc.)

These are only my opinions.

wengland@stephsf.stephsf.com (Bill England) (07/16/90)

In article <5265@plains.UUCP> gus@plains.NoDak.edu writes:
>
>
>Recently I heard that an IBM-type db person claimed that it's
>not uncommon in commercial db applications to join as many as
>10-15 (maybe even 30) tables in a single query.
>
>This seems like an incredible number of joins, especially if the
>tables are large, on the order of 1/2 million tuples.
>
>Any comment/references, etc.?

  In a project this spring we regularly created selects that would 
join 5 to 10 tables.  The rationale was that the database engine can 
gather all of the required data faster than building several selects and 
asking the engine to do them separately.  Many of the joins were outer 
joins, collecting data that may or may not exist for a primary record.

    If you have data from several tables to display or report, it is almost 
always more efficient to join the tables and issue a single select than
to look up data with several selects inside of a loop.
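
  For instance, something like this (schematic only -- the table and
column names here are invented):

      select o.order_num, c.cust_name, i.description
          from orders o, customer c, item i
          where o.cust_num = c.cust_num
          and o.item_num = i.item_num

is one select and one trip to the engine, versus fetching the orders
and then, inside the application loop, doing a lookup per row:

      select cust_name from customer where cust_num = ?
      select description from item where item_num = ?

which pays the per-select overhead over and over, once for every row.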

   The largest problem I have found with joining many tables is that it
becomes less intuitive what your result will be as the number of tables 
joined increases.    < This is, perhaps, an understatement.  :-) >

PS.  The database system is Informix.4gl/compiled running on a Sequent.

 +--------
 |  Bill England
 |  Stephen Software Systems, Inc.,   Tacoma Wa.
 |  wengland@stephsf.com              +1 206 564 2122
 |
  * *      H -> He +24Mev
 * * * ... Oooo, we're having so much fun making itty bitty suns *
  * *

davidm@uunet.UU.NET (David S. Masterson) (07/16/90)

In article <12072@blia.BLI.COM> miket@blia.BLI.COM (Mike Tossy) writes:

   My observation is that the number of tables used by an application 
   increases dramatically if the database design is well normalized.

Is that good or bad?  (Speaking from your observations that is :-)
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
"If someone thinks they know what I said, then I didn't say it!"

miket@blia.BLI.COM (Mike Tossy) (07/17/90)

In article <CIMSHOP!DAVIDM.90Jul15215312@uunet.UU.NET>, cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
> In article <12072@blia.BLI.COM> miket@blia.BLI.COM (Mike Tossy) writes:
> 
>    My observation is that the number of tables used by an application 
>    increases dramatically if the database design is well normalized.
> 
> Is that good or bad?  (Speaking from your observations that is :-)

I think it is good.  The normal reason to denormalize is to gain
performance.  Assuming that your RDBMS is good at joins this is not
necessary, and you can even gain speed by normalizing because the resulting
database is smaller (requiring less I/O and making it more likely to fit in
cache).  If your DBMS can handle a well normalized database design,
why not use a well normalized design?  You gain in data consistency and
in a smaller database.

(In my experience most database designs are under normalized.  I have
consulted on several projects where further normalization resulted in
significant performance improvements.  I have seen only a few cases
where the solution to a performance problem was to denormalize the data,
and in all these cases the solution was to keep some value pre-summarized
so as to avoid an aggregate.  I have not yet seen a case where performance
was helped by combining what should be two logical tables.  (If I
had consulted on projects using other RDBMS perhaps my observations would 
be different :-)  ))
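
(By "pre-summarized" I mean trading an aggregate query -- table and
column names invented for illustration --

    select sum(amount) from order_item where order_num = 1001

for a total that is carried in the database and merely read:

    select order_total from orders where order_num = 1001

with the application, or a trigger, keeping order_total current.)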

For systems with good join performance, the only disadvantage that I've ever
seen to a well normalized database is that some users prefer to think about
the data as if it were one big table.  They really want a "big spreadsheet".
Normally these are read-only users and their needs can be handled with views.
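
Something as simple as this view (names invented for illustration):

    create view big_sheet as
        select e.emp_name, d.dept_name, l.city
        from employee e, department d, location l
        where e.dept_num = d.dept_num and d.loc_num = l.loc_num

lets them "select * from big_sheet" as if the data really were one big
table, while the base tables stay normalized underneath.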

-------
						Teradata Corporation
Mike Tossy                                      ShareBase Division
miket@blia.bli.com                              14600 Winchester Blvd
(408) 378-7575 ext2200                          Los Gatos, CA 95030
					(Formerly: Britton Lee, Inc.)
	 These are only my opinions.

jkrueger@dgis.dtic.dla.mil (Jon) (07/17/90)

gus@plains.NoDak.edu writes:

>Recently I heard that an IBM-type db person claimed that it's
>not uncommon in commercial db applications to join as many as
>10-15 (maybe even 30) tables in a single query.

It is uncommon.  Ask your IBM-type db person to substantiate
his claim.  "Extraordinary claims require extraordinary
justification".  In fact it's difficult to design a query that
makes sense with more than about five joins.  Try it.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Drop in next time you're in the tri-planet area!

gus@plains.UUCP (jim gustafson) (07/17/90)

In article <933@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:

>>Recently I heard that an IBM-type db person claimed that it's
>>not uncommon in commercial db applications to join as many as
>>10-15 (maybe even 30) tables in a single query.
 
>It is uncommon.  Ask your IBM-type db person to substantiate
>his claim.  "Extraordinary claims require extraordinary
>justification".  In fact it's difficult to design a query that
>makes sense with more than about five joins.  Try it.
 
>-- Jon
>-- 
>Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
>Drop in next time you're in the tri-planet area!


Jon, in case you missed it, I'll repost what I guess would be 
a main contender and an example of an "uncommon" join.  I have 
seen/heard of several other examples of "monster" joins, but I 
am still interested in hearing from anyone who works with large, 
m-way joins.

Thanks,
Jim Gustafson	gus@plains.nodak.edu	
uunet!plains!gus (UUCP)	gus@plains (Bitnet)


+------------------- mike's --------- mail -------------------
| We have an installation in the U.S. Navy (a personnel application) that
| routinely runs 15 way joins.  They were very pleased when we increased
| the limit with ShareBase II (from 16 tables under ShareBase I).
| 
| My observation is that the number of tables used by an application 
| increases dramatically if the database design is well normalized.
| 
| --
|						  Teradata Corporation
| Mike Tossy                                      ShareBase Division
| miket@blia.bli.com                              14600 Winchester Blvd
| (408) 378-7575 ext2200                          Los Gatos, CA 95030
|					(Formerly: Britton Lee, Inc.)
| 
| These are only my opinions.
-- 
Jim Gustafson	gus@plains.nodak.edu	
uunet!plains!gus (UUCP)	gus@plains (Bitnet)

tom@eds.com (Tom H. Meyer) (07/17/90)

In article <933@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>gus@plains.NoDak.edu writes:
>
>In fact it's difficult to design a query that
>makes sense with more than about five joins.  Try it.
>
>Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger

It is indeed difficult for human beings to create queries with lots
of joins. However, as more and more CASE tools which automatically
generate queries come into common use, you will see increasingly
large joins occurring. My company, EDS, has such a CASE tool (INCA)
which has generated 7 way joins for one of our customers. We also
have a research project under way which routinely generates queries
with upwards of 40 joins. 

tom meyer, EDS Research               | If I don't see you in the future
tom@edsr.eds.com or ...uunet!edsr!tom | I'll see you in the pasture

davidm@uunet.UU.NET (David S. Masterson) (07/18/90)

In article <933@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:

   gus@plains.NoDak.edu writes:

   >Recently I heard that an IBM-type db person claimed that it's
   >not uncommon in commercial db applications to join as many as
   >10-15 (maybe even 30) tables in a single query.

   It is uncommon.  Ask your IBM-type db person to substantiate
   his claim.  "Extraordinary claims require extraordinary
   justification".  In fact it's difficult to design a query that
   makes sense with more than about five joins.  Try it.

Already been tried.  Try to do a parts explosion problem (of any sort) without
the recursive join operator that Codd speaks of.  Tain't nothing extraordinary
about it.
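
Without the recursive operator you unroll the hierarchy by hand, one
self-join per level.  A sketch (the bom table here is invented):

    select b1.component, b2.component, b3.component
    from bom b1, bom b2, bom b3
    where b1.assembly = 'TOP'
    and b2.assembly = b1.component
    and b3.assembly = b2.component

That's only three levels of the explosion; go ten levels deep and you
have a ten-way join before the query does anything else.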
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
"If someone thinks they know what I said, then I didn't say it!"

chiu@brahms.amd.com (Timothy Chiu) (07/18/90)

In article <5314@plains.UUCP> gus@plains.UUCP (jim gustafson) writes:
>In article <933@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>
>>>Recently I heard that an IBM-type db person claimed that it's
>>>not uncommon in commercial db applications to join as many as
>>>10-15 (maybe even 30) tables in a single query.
> 
>>It is uncommon.  Ask your IBM-type db person to substantiate
>>his claim.  "Extraordinary claims require extraordinary
>>justification".  In fact it's difficult to design a query that
>>makes sense with more than about five joins.  Try it.
> 
>>-- Jon
>>-- 
>>Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
>>Drop in next time you're in the tri-planet area!
>

Well, let's see: I used to work for a major money center bank on
Wall Street, and we used DB2 (IBM's hideous monster).  The 
applications I worked with used joins of up to 18 different 
tables (each table represented a different business activity
of the bank, but contained information from more than one 
subsidiary).  To find information about one subsidiary it was
not uncommon to have huge table joins; this, added to the fact that
we had truly "brain-dead" users, made for some interesting QMF
problems.  If anyone's truly interested I can send them more
details via e-mail.

--
Timothy Z. Chiu      | -- This space under construction --  
chiu@brahms.amd.com  |
Advanced Micro Devices, M/S 167 P.O. Box 3453, Sunnyvale, CA 94088-3000
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

jean@beno.CSS.GOV (Jean Anderson) (07/18/90)

Mike Tossy writes:
> In article <CIMSHOP!DAVIDM.90Jul15215312@uunet.UU.NET>, cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
> > In article <12072@blia.BLI.COM> miket@blia.BLI.COM (Mike Tossy) writes:
> > 
> >    My observation is that the number of tables used by an application 
> >    increases dramatically if the database design is well normalized.
> > 
> > Is that good or bad?  (Speaking from your observations that is :-)
> 
> I think it is good.  The normal reason to denormalize is to gain
> performance.  Assuming that your RDBMS is good at joins this is not
> necesary, and you can even gain speed by normalizing because the resulting

I think "well normalized" is a matter or style and frequently gets confused
with excessive decomposition.  I have seen tables that were so decomposed 
they consisted of little more than primary and foreign keys.  This may be OK
for an application which is entirely canned, but it makes the logical design
artificially complex for any database which has to support interactive SQL 
users.

Also, my experience with two different products so far tells me that the higher
the number of joins, the longer the query takes to run.  With a large amount
of data or complex queries, this can become prohibitive.  I may get flamed 
off the net, but I try to normalize to third normal form, then denormalize 
for performance.  Here are several ways I back off:

   1	If there is a one to one relationship between two tables, they
	share the same primary key and are frequently retrieved together,
	I consider making them a single table so long as the table doesn't 
	get too wide. "Too wide" is a matter of style; I start getting 
	uncomfortable with tables that have more than 20-25 columns.

   2	If a join is used frequently to do a simple lookup, I include
	that field as redundant data and eliminate the join.  I take a
	serious look at it to see if this will introduce update anomalies.

   3	I use repeating columns to store aggregate information rather than 
	have it be computed on the fly.
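
As an example of number 2 (names invented for illustration): rather than
joining on every retrieval just to pick up a name,

	select o.office_id, r.region_name
	from office o, region r
	where o.region_code = r.region_code

I carry region_name redundantly in office and select it directly:

	select office_id, region_name from office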

I realize that breaking rules like this would drive a purist nuts, but I 
answer to a user community that daily wants to see it all go faster with
the current product on the current platform.

   -	Jean Anderson
	SAIC, Geophysics Division
	jean@seismo.css.gov

            +++++++++++++++++++++++++++++++++++++++++++++++++
            ++   Any opinions are mine, not my employer's. ++
            +++++++++++++++++++++++++++++++++++++++++++++++++

nigelc@cognos.UUCP (Nigel Campbell) (07/18/90)

In article <CIMSHOP!DAVIDM.90Jul15215312@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>In article <12072@blia.BLI.COM> miket@blia.BLI.COM (Mike Tossy) writes:
>
>   My observation is that the number of tables used by an application 
>   increases dramatically if the database design is well normalized.
>
>Is that good or bad?  (Speaking from your observations that is :-)

From my own relational / non-relational experiences the gains frequently
have outweighed the losses:

	1. you usually have small record sizes so most systems' caches
		perform well

	2. you reduce the locking (often) that a non-normalised system
		would experience

	3. transactions tend to become simpler and thus easier
		to model and maintain in the language and probably in
		most CASE tools

	4. you can also model time very easily even if originally
		historical data was not a system requirement.

Denormalisation helps when necessary; however, I have often reviewed
applications where the programmers spent more time coding around the
design, thus increasing the cost of the system.  You really have to
weigh up the pros and cons of not going to 3NF or BCNF.

The only hassle we have faced occasionally is when an end user wants
to write his own reports and is then faced with how to traverse
reference tables etc. to flesh out his report.  Views obviously come to
the rescue in a relational system; in, say, an RMS-based system you have
to provide a similar interface yourself.

One client we have (on a Data General) did find that the query optimiser
for the resident DBMS made some very odd optimizations which killed the
view processing he defined.  They originally opted to collapse the tables
and then found that by altering the stats on the tables they could fool
the optimiser into a more optimal query. 

-- 
Nigel Campbell          Voice: (613) 783-6828                P.O. Box 9707
Cognos Incorporated       FAX: (613) 738-0002                3755 Riverside Dr.
uucp: nigelc@cognos.uucp || uunet!mitel!sce!cognos!nigelc    Ottawa, Ontario
                                                             CANADA  K1G 3Z4

davidm@uunet.UU.NET (David S. Masterson) (07/19/90)

In article <49001@seismo.CSS.GOV> jean@beno.CSS.GOV (Jean Anderson) writes:

   I think "well normalized" is a matter or style and frequently gets confused
   with excessive decomposition.

   [...line eater was here...]

   I may get flamed off the net, but I try to normalize to third normal form, 
   then denormalize for performance.

Well, I'm not going to flame, but some of your definitions have me intrigued.

   Here are several ways I back off:

      1	If there is a one to one relationship between two tables, they
	   share the same primary key and are frequently retrieved together,
	   I consider making them a single table so long as the table doesn't 
	   get too wide. "Too wide" is a matter of style; I start getting 
	   uncomfortable with tables that have more than 20-25 columns.

If the two tables share the same primary key, are they not representative of
the same entity and, therefore, should be one table after normalization?
Especially since they are frequently retrieved together?  What would lead to
separating them?

      2	If a join is used frequently to do a simple lookup, I include
	   that field as redundant data and eliminate the join.  I take a
	   serious look at it to see if this will introduce update anomalies.

To include the data in the primary table, the lookup data must be one-to-one
related with the primary key of the primary table, correct?  If data from a
secondary table is used for a simple lookup on the primary table and that data
is 1-1 related to the primary table, it sounds like the tables were overly
decomposed in the first place.

      3	I use repeating columns to store aggregate information rather than 
	   have it be computed on the fly.

This is a typical solution to a perceived problem.  You've obviously taken
into account the problems of update and the "window of vulnerability" between
change to information and recomputing the aggregate.  I imagine Sybase style
triggers can make this automatic.
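
Roughly this, for instance (a sketch only -- Sybase-flavored syntax,
and the table names are invented):

    create trigger order_total_ins on order_item for insert as
        update orders
        set order_total = order_total +
            (select sum(i.amount) from inserted i
             where i.order_num = orders.order_num)
        where orders.order_num in (select order_num from inserted)

so the stored total is maintained at the moment of change instead of
being computed at query time (deletes and updates would need their own
triggers, of course).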

   I realize that breaking rules like this would drive a purist nuts, but I 
   answer to a user community that daily wants to see it all go faster with
   the current product on the current platform.

I'm not sure whether you're breaking the rules in the second place or going
too far in the first place.
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
"If someone thinks they know what I said, then I didn't say it!"

miket@blia.BLI.COM (Mike Tossy) (07/19/90)

By email Jon (jkrueger@dtic.dla.mil) asked for a copy of my 15 way query or
"even of any query with more than 6 joins".  In comp.databases Jon wrote
"Extraordinary claims require extraordinary justification".

I agree.  I will try to get permission from our customer to release their
queries.  In the meantime here is a nine way join (using only five tables)
that we wrote for a proposal to the Veterans Administration (VA).


In English:
list, by medical district, readmissions to the same bed section for
patients readmitted within 14 days of their previous discharge

In SQLish:
select e2.patid, f.distid
    from episode e1, episode e2, action a1, action a2, detail d1,
    detail d2, location l1, location l2, facility f
where e1.patid = e2.patid and e1.patseq + 1 = e2.patseq  -- consecutive stays
    and e2.edate - e1.ddate <= 14                -- readmitted within 14 days
    and e2.epid = a2.epid and a2.actid=1 and a2.eactid=d2.eactid
    and e1.epid = a1.epid and a1.actid=1 and a1.eactid=d1.eactid
    and d1.loc_code = l1.loc_code and d2.loc_code=l2.loc_code
    and l1.bed_section = l2.bed_section          -- same bed section both times
    and e2.facid = f.facid
    group by f.distid


Can't say that I'd like to make my living writing this stuff; but as
tom meyer of EDS Research (tom@edsr.eds.com) pointed out, automated 
CASE tools (GUI front ends like CQT (Claris Query Tool) or Metaphor)
do generate complex queries like the one above.
--
						Teradata Corporation
Mike Tossy                                      ShareBase Division
miket@blia.bli.com                              14600 Winchester Blvd
(408) 378-7575 ext2200                          Los Gatos, CA 95030
					(Formerly: Britton Lee, Inc.)

	 These are only my opinions.

jean@beno.CSS.GOV (Jean Anderson) (07/19/90)

In <12096@blia.BLI.COM>, miket@blia.BLI.COM (Mike Tossy):
> [I have] seen enough poor database designs to lead me to believe that the 'state-of-
> the-practice' of the average commercial programmer does not yet extend to
> third normal form.

I think it is the case that many of us inherit existing database designs.
We try to make positive changes given opportune moments and new development.
I feel fortunate that the database structure I inherited was for the most part
intelligently designed.  It was the ad hoc extensions to it that caused 
the most problems.

In <CIMSHOP!DAVIDM.90Jul18104049@uunet.UU.NET>, cimshop!davidm@uunet.UU.NET (David S. Masterson):
> If the two tables share the same primary key, are they not representative of
> the same entity and, therefore, should be one table after normalization?

Not always.  For example, this is a seismic database with diverse entities
which share the same primary key: location hypotheses, 
magnitudes, and events really do share the same key. 

> To include the data in the primary table, the lookup data must be one-to-one
> related with the primary key of the primary table, correct?

I think you are essentially right.  This case is actually one-to-many, but
the main table's primary key is a composite key which, when broken down,
contains the foreign key to the table in question (who's on first?).
So this doesn't seem totally bizarre: there is a many-to-many relationship 
between seismic signals and location hypotheses.  Multiple seismic stations 
contribute data for a given location hypothesis.  An intermediate table 
keeps track of the associations:

      station
      info
    +----------+         +----------------------+        +--------------+
    |  seismic | <--->>  |     association of   | <<---> |   location   |
    |  signals |         | signals and locations|        |  hypotheses  |
    +----------+         +----------------------+        +--------------+

In this case, I put the station in the association table to avoid doing a 
lookup join to the signal table to retrieve assoc info by station.
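
Schematically, and much simplified (the column names here are just for
illustration):

	create table signal
	    (sigid   integer not null,    -- signal id, primary key
	     sta     char(6) not null,    -- recording station
	     sigtime float   not null)

	create table sig_hyp_assoc
	    (sigid   integer not null,    -- -> signal
	     hypid   integer not null,    -- -> hypothesis
	     sta     char(6) not null)    -- redundant copy of signal.sta

	create table hypothesis
	    (hypid   integer not null,    -- hypothesis id, primary key
	     lat     float,
	     lon     float)

The redundant sta column is what lets me retrieve associations by
station without joining back through the signal table:

	select hypid from sig_hyp_assoc where sta = 'ANMO'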

> I'm not sure whether you're breaking the rules in the second place or going
> too far in the first place.

Bless you for saying that.  :)  It's sometimes difficult to make some folks 
realize that more is not necessarily better.

   -	Jean Anderson
	SAIC Geophysics Division
	Most reliable return path: jean@seismo.css.gov


            +++++++++++++++++++++++++++++++++++++++++++++++++
            ++   Any opinions are mine, not my employer's. ++
            +++++++++++++++++++++++++++++++++++++++++++++++++

miket@blia.BLI.COM (Mike Tossy) (07/20/90)

I wrote:
CASE tools (GUI front ends like CQT (Claris Query Tool) or Metaphor)
I meant to say:
CASE tools (and GUI front ends like CQT (Claris Query Tool) or Metaphor)

--miket

nigelc@cognos.UUCP (Nigel Campbell) (07/20/90)

In article <933@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>gus@plains.NoDak.edu writes:
>
>>Recently I heard that an IBM-type db person claimed that it's
>>not uncommon in commercial db applications to join as many as
>>10-15 (maybe even 30) tables in a single query.
>

>It is uncommon. 

	really ..... 

>Ask your IBM-type db person to substantiate
>his claim.  "Extraordinary claims require extraordinary
>justification".  In fact it's difficult to design a query that
>makes sense with more than about five joins.

	Quite an extraordinary claim 

> Try it.

	I just took a peek at the 40+ reports in a large bus
scheduling/driver bidding system I have worked on.  Most reports
are non-trivial (but real-life) and have 4-8 tables.  Not much
above 5, I admit, but those larger joins do make sense.  Given that
most SQL implementations do not support a recursive join, I would
not find it unusual to see 5+ reflexive joins in BOM-style structures.
In fact a museum's taxonomic dictionary can be up to 24 levels deep
depending upon the discipline.



-- 
Nigel Campbell          Voice: (613) 783-6828                P.O. Box 9707
Cognos Incorporated       FAX: (613) 738-0002                3755 Riverside Dr.
uucp: nigelc@cognos.uucp || uunet!mitel!sce!cognos!nigelc    Ottawa, Ontario
                                                             CANADA  K1G 3Z4

chesky@portia.Stanford.EDU (Snehylata Gupta) (07/25/90)

In article <8619@cognos.UUCP> nigelc@cognos.UUCP (Nigel Campbell) writes:
>>gus@plains.NoDak.edu writes:
>>
>>>Recently I heard that an IBM-type db person claimed that it's
>>>not uncommon in commercial db applications to join as many as
>>>10-15 (maybe even 30) tables in a single query.
>>
>
>>It is uncommon. 
>
>	really ..... 
>
>>Ask your IBM-type db person to substantiate
>>his claim.  "Extraordinary claims require extraordinary
>>justification".  In fact it's difficult to design a query that
>>makes sense with more than about five joins.

Once I had come up with a design which required a query where there was
a 26 way join.  It was very well normalized.  How it would perform I do
not know, because the tool was INGRES and INGRES has a limit of 30 tables
in a query.  It was an unusual case, I suppose, but it was the best
design at the time.  It would have been interesting to see the query
execution plan.  However, I routinely use 8 way joins.
This is also because INGRES likes thin tables more than wide ones.

In that project I had to denormalize it and work with a 4 way join
(if I remember correctly).

If you insist I can give the design, but the logic is rather long
and I don't type at 80 wpm :-).  However, if there is enough interest
then I shall ...

Thanks

Sanjay

wes@loft386.uucp (Wes Peters) (07/27/90)

In article <5265@plains.UUCP>, gus@plains.NoDak.edu writes:
> Recently I heard that an IBM-type db person claimed that it's
> not uncommon in commercial db applications to join as many as
> 10-15 (maybe even 30) tables in a single query.
> This seems like an incredible number of joins, especially if the
> tables are large, on the order of 1/2 million tuples.
> Any comment/references, etc.?

My company has an application for tracking software problems and
changes that runs on Oracle.  Last week, I tried running a selection
that joined 3 tables*.  On our system, a cluster of 3 VAX 8250s
running VMS 4.7 and Oracle 5.1.22, this selection would not produce
any output in 45 minutes.  Dropping one table from the selection
resulted in output typically in 30-40 seconds.

*Actually we were selecting from 3 public synonyms for tables "owned"
by another Oracle user.  I'm sure our selection was pretty pessimal in
its order of query, but still, 45 minutes?  Yuck!

tgreenla@oracle.uucp (Terry Greenlaw) (07/31/90)

In article <1990Jul27.042508.10645@loft386.uucp> wes@loft386.uucp (Wes Peters) writes:
>In article <5265@plains.UUCP>, gus@plains.NoDak.edu writes:
>> Recently I heard that an IBM-type db person claimed that it's
>> not uncommon in commercial db applications to join as many as
>> 10-15 (maybe even 30) tables in a single query.
>> This seems like an incredible number of joins, especially if the
>> tables are large, on the order of 1/2 million tuples.
>> Any comment/references, etc.?
>
>My company has an application for tracking software problems and
>changes that runs on Oracle.  Last week, I tried running a selection
>that joined 3 tables*.  On our system, a cluster of 3 VAX 8250s
>running VMS 4.7 and Oracle 5.1.22, this selection would not produce
>any output in 45 minutes.  Dropping one table from the selection
>resulted in output typically in 30-40 seconds.
>
>*Actually we were selecting from 3 public synonyms for tables "owned"
>by another Oracle user.  I'm sure our selection was pretty pessimal in
>it's order of query, but still 45 minutes?  Yuck!

An application I worked on a while back was based on collecting data from 
many different sub-offices and compiling it into one report for the main
office.  The report was written using SQL*ReportWriter ver. 1.1 against a
ver. 6 database, and the main query had to join 22 tables to get the primary
tracking data out!  The original program was written in C and took 45 minutes
to two hours to complete.  Our version in ReportWriter took between 45 seconds
and 1 minute.

I would say that well over 95% of the queries I have seen were 3 tables or
less.  The Surgeon General has warned that anything exceeding this amount is
hazardous to your health, and may result in high blood pressure, temporary
loss of mental faculties, death, or the irresistible urge to deploy
high-powered explosive devices against computing machinery ;-}



Terry O. Greenlaw             Sheathed within the Walkman, 
Staff Engineer                Wear a halo of distortion.
Oracle Corporation            Aural contraceptive,
tgreenla@oracle.oracle.com    Aborting pregnant conversation - Marillion