[comp.databases] Problem with SQL join/group-by in Unify database

dag@fciva.FRANKCAP.COM (Daniel A. Graifer) (01/21/91)

Let me appologize in advance for the lengthly example.
I am having a problem with group-by with joins in Unify/Accell SQL:

$ SQL -version
4.0-880916 SYS5LCK USSLCK  M8BIT (international)
$ #I have added "*" to the field listings to mark key fields
$ SQL
UNIFY TURBO/SQL
Copyright Unify Corporation 1986
sql>fields apmaster
NAME					     TYPE    LENGTH
ap_region                                    INTEGER 2 *
ap_brch                                      INTEGER 2 *
ap_reg_num                                   LONG    6 *
:
ap_loan_amt                                  LONG    8
:
ap_funding_date                              DATE    2
:
ap_broker                                    STRING  8
:
sql>fields rolodex
NAME					     TYPE    LENGTH
rd_co_id                                     STRING  8 *
rd_co_name                                   STRING  30
:
sql>!# ap_broker should have been a ref to rd_co_id, but isn't (historical)
sql>select ap_broker, count(*), sum(ap_loan_amt) from apmaster
sql>   where ap_funding_date between 12/01/90 and 12/31/90
sql>	and ap_region = 2
sql>	group by ap_broker /
recognized query!

ap_broker| count(*)|sum(ap_loan_amt)
-------------------------------------
:
141 TUS  |        3|    966250.00000
:
960 TUS  |        1|    169200.00000
:
sql> select rd_co_id, rd_co_name, count(*), sum(ap_loan_amt) from apmaster,
sql>	rolodex where ap_funding_date between 12/01/90 and 12/31/90
sql>	and ap_region = 2 and rd_co_id = ap_broker
sql>	group by rd_co_name /
recognized query!

rd_co_id|rd_co_name                    | count(*)|sum(ap_loan_amt)
-------------------------------------------------------------------
960 TUS |REPLACED FOR PRIVACY 1        |        2|    374900.00000
141 TUS |REPLACED FOR PRIVACY 2        |        1|    169200.00000
:
select rd_co_id, rd_co_name, count(*), sum(ap_loan_amt) from apmaster, rolodex
   where ap_funding_date between 12/01/90 and 12/31/90
	and ap_region = 2 and rd_co_id = ap_broker
	group by ap_broker /
recognized query!

rd_co_id|rd_co_name                    | count(*)|sum(ap_loan_amt)
-------------------------------------------------------------------
141 TUS |REPLACED FOR PRIVACY 2        |        2|    388750.00000
:
178 TUS |REPLACED FOR PRIVACY 1        |        2|    326950.00000
:
960 TUS |REPLACED FOR PRIVACY 3        |        2|    246435.00000
:
sql>select ap_broker, ap_loan_amt from apmaster where ap_region = 2
sql>	and ap_funding_date between 12/01/90 and 12/31/90
sql>	and ap_broker in <'141 TUS','178 TUS','960 TUS'> /
recognized query!

ap_broker|ap_loan_amt
----------------------
960 TUS  |     169200
141 TUS  |     600000
178 TUS  |     187450
178 TUS  |     187450
141 TUS  |     146250
141 TUS  |     220000
sql>end

I've only shown two lines of the about 200 that come out, but as you can
see, joining to the rolodex table to get the broker's company name gives
me 'incorrect' answers (both the id/co pairs are wrong, and the totals).

I spoke with someone at Unify Tech support, and they were sure the problem
was damaged explicit relationship tables.  I tried rebuilding the hash
tables and the explicit relationships yesterday (this took 8 hours), but
the problem hasn't gone away.  So now I turn to the net.

Either I don't understand how joins work with 'group by', or I need to be
very frightened about every report I run that does this.  Fortunately, most
of our reports need both loan details and totals, so we don't do a lot
of group-bys,  we pull detail and summarize in the report writer.

Any Ideas?  Anxiously awaiting any input, and thanks in advance...
Dan
-- 
Daniel A. Graifer			Coastal Capital Funding Corp.
Sr. Vice President, Financial Systems	7900 Westpark Dr. Suite A-130
(703)821-3244				McLean, VA  22102
uunet!fciva!dag				fciva.FRANKCAP.COM!dag@uunet.uu.net

allbery@NCoast.ORG (Brandon S. Allbery KB8JRR) (01/21/91)

As quoted from <584@fciva.FRANKCAP.COM> by dag@fciva.FRANKCAP.COM (Daniel A. Graifer):
+---------------
| sql> select rd_co_id, rd_co_name, count(*), sum(ap_loan_amt) from apmaster,
| sql>	rolodex where ap_funding_date between 12/01/90 and 12/31/90
| sql>	and ap_region = 2 and rd_co_id = ap_broker
| sql>	group by rd_co_name /
+---------------

There is indeed a bug here... but not the one you think.  A selection using
GROUP BY must group by all the non-aggregates in the SELECT clause or the
result is undefined.  Most ANSI SQL's catch this; Unify 4.0 SQL does not, and
instead silently mis-groups things.  Try grouping by both rd_co_id and
rd_co_name here.  I noticed the same problem in the third example of GROUP
BY; the first example appeared correct compared to the actual data shown in
the final select, and (unsurprisingly) the GROUP BY clause was correct.

++Brandon
-- 
Me: Brandon S. Allbery			    VHF/UHF: KB8JRR on 220, 2m, 440
Internet: allbery@NCoast.ORG		    Packet: KB8JRR @ WA8BXN
America OnLine: KB8JRR			    AMPR: KB8JRR.AmPR.ORG [44.70.4.88]
uunet!usenet.ins.cwru.edu!ncoast!allbery    Delphi: ALLBERY

dag@fciva.FRANKCAP.COM (Daniel A. Graifer) (01/22/91)

In article <1991Jan21.001428.21987@NCoast.ORG> allbery@ncoast.ORG (Brandon S. Allbery KB8JRR) writes:
>There is indeed a bug here... but not the one you think.  A selection using
>GROUP BY must group by all the non-aggregates in the SELECT clause or the
>result is undefined.  Most ANSI SQL's catch this; Unify 4.0 SQL does not, and
>instead silently mis-groups things.  Try grouping by both rd_co_id and
>rd_co_name here.  
>
>++Brandon

This fixed it.  Thanks a million Brandon!  Gosh, I love the net :-)

Has this bug been submitted to the list (uug-old-bug@Transact.COM) that
Steven List is building for Unify?

Dan
-- 
Daniel A. Graifer			Coastal Capital Funding Corp.
Sr. Vice President, Financial Systems	7900 Westpark Dr. Suite A-130
(703)821-3244				McLean, VA  22102
uunet!fciva!dag				fciva.FRANKCAP.COM!dag@uunet.uu.net

meh@ufycorp.Unify.Com (Mark Hansen) (01/22/91)

In article <584@fciva.FRANKCAP.COM> dag@fciva.UUCP (Daniel A. Graifer) writes:
>Let me appologize in advance for the lengthly example.
>I am having a problem with group-by with joins in Unify/Accell SQL:
>
>$ SQL -version
>4.0-880916 SYS5LCK USSLCK  M8BIT (international)

>sql> select rd_co_id, rd_co_name, count(*), sum(ap_loan_amt) from apmaster,
>sql>	rolodex where ap_funding_date between 12/01/90 and 12/31/90
>sql>	and ap_region = 2 and rd_co_id = ap_broker
>sql>	group by rd_co_name /
>recognized query!
>
[Detail of querries and results removed ...]

>Any Ideas?  Anxiously awaiting any input, and thanks in advance...
>Dan

  The one basic problem that I see with your queries off hand, 
is the fact that you are creating an aggregate query,  but not 
all the projected columns are aggregates.  For example, in the 
above query you select  rd_co_id,  rd_co_name,  count(*),  and 
sum( ap_loan_amt )  from   apmaster grouped by the  rd_co_name 
column. The count(*) and sum( ap_loan_amt ) are aggregates for 
obvious reasons.  The rd_co_name is an aggregate because it is 
in the 'group by' clause.  The remaining column,  rd_co_id, is 
not an aggregate, however, and is what is causing your problem. 

  The SQL manual, on page  16-26, goes into this a little, and 
although doesn't come right out and say it, tells you that you 
can't have any projected columns that are not aggregates in an 
aggregate query.

  The actual  BUG  is that the  SQL doesn't come back and tell 
you that this is a syntax error. In later versions, it does.

Hope this is helpful.

>-- 
>Daniel A. Graifer			Coastal Capital Funding Corp.
>Sr. Vice President, Financial Systems	7900 Westpark Dr. Suite A-130
>(703)821-3244				McLean, VA  22102
>uunet!fciva!dag				fciva.FRANKCAP.COM!dag@uunet.uu.net

Mark E. Hansen                                     internet: meh@Unify.Com
Manager, Client Support Services            ...!{csusac,pyramid}!unify!meh
Unify Corporation                                    voice: (916) 922-1177
3870 Rosin Court, Sacramento, CA 95834                 fax: (916) 920-5306

meh@ufycorp.Unify.Com (Mark Hansen) (02/12/91)

In article <584@fciva.FRANKCAP.COM> dag@fciva.UUCP (Daniel A. Graifer) writes:
>Let me appologize in advance for the lengthly example.
>I am having a problem with group-by with joins in Unify/Accell SQL:
>
>$ SQL -version
>4.0-880916 SYS5LCK USSLCK  M8BIT (international)
>$ #I have added "*" to the field listings to mark key fields
>$ SQL
>UNIFY TURBO/SQL
>Copyright Unify Corporation 1986
>sql>fields apmaster
>NAME					     TYPE    LENGTH
>ap_region                                    INTEGER 2 *
>ap_brch                                      INTEGER 2 *
>ap_reg_num                                   LONG    6 *
>:
>ap_loan_amt                                  LONG    8
>:
>ap_funding_date                              DATE    2
>:
>ap_broker                                    STRING  8
>:
>sql>fields rolodex
>NAME					     TYPE    LENGTH
>rd_co_id                                     STRING  8 *
>rd_co_name                                   STRING  30
>:
>sql>!# ap_broker should have been a ref to rd_co_id, but isn't (historical)
>sql>select ap_broker, count(*), sum(ap_loan_amt) from apmaster
>sql>   where ap_funding_date between 12/01/90 and 12/31/90
>sql>	and ap_region = 2
>sql>	group by ap_broker /
>recognized query!
>
>ap_broker| count(*)|sum(ap_loan_amt)
>-------------------------------------
>:
>141 TUS  |        3|    966250.00000
>:
>960 TUS  |        1|    169200.00000
>:
>sql> select rd_co_id, rd_co_name, count(*), sum(ap_loan_amt) from apmaster,
>sql>	rolodex where ap_funding_date between 12/01/90 and 12/31/90
>sql>	and ap_region = 2 and rd_co_id = ap_broker
>sql>	group by rd_co_name /
>recognized query!
>
[ ... rest of example deleted ...]

>
>I've only shown two lines of the about 200 that come out, but as you can
>see, joining to the rolodex table to get the broker's company name gives
>me 'incorrect' answers (both the id/co pairs are wrong, and the totals).
>
>I spoke with someone at Unify Tech support, and they were sure the problem
>was damaged explicit relationship tables.  I tried rebuilding the hash
>tables and the explicit relationships yesterday (this took 8 hours), but
>the problem hasn't gone away.  So now I turn to the net.
>
>Either I don't understand how joins work with 'group by', or I need to be
>very frightened about every report I run that does this.  Fortunately, most
>of our reports need both loan details and totals, so we don't do a lot
>of group-bys,  we pull detail and summarize in the report writer.
>
>Any Ideas?  Anxiously awaiting any input, and thanks in advance...
>Dan

Yes. The second example has a fundamental problem. You have selected
a non-aggregate column in an aggregate query. In the following
example:

select rd_co_id, rd_co_name, count(*), sum(ap_loan_amt) from apmaster,
rolodex where ap_funding_date between 12/01/90 and 12/31/90
and ap_region = 2 and rd_co_id = ap_broker
group by rd_co_name /

You select rd_co_id, rd_co_name, count(*), and sum( ap_loan_amt), and
group by the rd_co_name. The GROUP BY clause causes this to become an 
aggregate query. In other words, SQL is going to group records into 
a set, based on rd_co_name, and fill in the values for the other 
aggregates accordingly ( count(*) and sum() ). 

  The problem is that the rd_co_id column is not an aggregate. 
Although all the values for rd_co_id might be the same for the group, 
SQL should generate an error for this ( it is a BUG that it allows you
to even run this query; Later releases would generate an error ).

  If your data were better normalized, say by putting the rd_co_name
in an rd_company table, and then joining it in the query, it would 
work correctly.

  This functionality is documented in the Unify DBMS Reference manual
on Page 16-26.

Hope this helps.

>-- 
>Daniel A. Graifer			Coastal Capital Funding Corp.
>Sr. Vice President, Financial Systems	7900 Westpark Dr. Suite A-130
>(703)821-3244				McLean, VA  22102
>uunet!fciva!dag				fciva.FRANKCAP.COM!dag@uunet.uu.net

Mark E. Hansen                                     internet: meh@Unify.Com
Manager, Client Support Services            ...!{csusac,pyramid}!unify!meh
Unify Corporation                                    voice: (916) 922-1177
3870 Rosin Court, Sacramento, CA 95834                 fax: (916) 920-5306