[comp.databases] Effecient merging of tables in SQL ?

nomann@rimfaxe.diku.dk (Ole Nomann Thomsen) (11/07/90)

Hello all. I am doing some programming in informix-sql V2.10, and have run 
into a problem that I can solve (:-)) , but only inefficiently (:-().

So, if you have the time and energy, please lend me a hand.

This is the problem:
====================

I have build a temporary table T1, containing the columns "no" and "lnum" (both
integers).

The main table contains (among other things) the columns "s_no" (serial) and
"l_num" (integer).

I need to update the main table, so that the "l_num" is set to the "lnum" from
T1 when "no" and "s_no" matches. T1 contains unique "no" to match some of the
main table's "s_no" (but not all).

Now this statement (repeated from jelly-ware memory) does it:

update main
  set l_num = (
    select lnum from T1                             {1}
    where T1.no = main.s_no                         {1}
  )
where exists                                        {2}
    select no from T1                               {2}
    where T1.no = main.s_no;                        {2}

But it's cruciatingly slow:

The {1} part of the statement seems to involve searching the entire T1 table,
(approx. 500 entries), for each row in main (approx 800).

The {2} part seemingly searches T1 *again* but if I leave it out, all the
main.l_num with no main.s_no matched by T1.no get updated with NULL.
This is catastrophic.

A naive estimate (mine :-}) of the involved amount of search follows:

{1} for each 800 in main, search the 500 in T1 :  40000
{2} well ... um, the same again                :  40000
                                       result  :  80000 tests.

It certainly seems to take long enough for this, but can it really be the
only way to do it ? what if the main table had 10000 rows and T1 8000 ? 
(shudder!). Both T1 and main's no/ s_no are indexed, which does help, but
not nearly enough.

Now, what I'd like to see, was a stmt. like this:

MERGE (main ordered by s_no), (T1 ordered by no)
  if main.s_no = T1.no then
    set main.l_num = T1.lnum;

But that's whishfull thinking. Still, can any of you tell me of an efficient
way to do this ? I can't seem to find any help in my manual.

Please reply by E-mail. Thanks in advance.

- Ole. (nomann@diku.dk).

"Information is not knowledge" - Frank Zappa. 

randall@informix.com (Randall Rhea) (11/08/90)

In article <1990Nov7.115344.27844@diku.dk> nomann@rimfaxe.diku.dk (Ole Nomann Thomsen) writes:
>Hello all. I am doing some programming in informix-sql V2.10, and have run 
>into a problem that I can solve (:-)) , but only inefficiently (:-().
>
>Now this statement (repeated from jelly-ware memory) does it:
>
>update main
>  set l_num = (
>    select lnum from T1                             {1}
>    where T1.no = main.s_no                         {1}
>  )
>where exists                                        {2}
>    select no from T1                               {2}
>    where T1.no = main.s_no;                        {2}
>
>But it's cruciatingly slow:

Temporary tables can really speed things up in Informix, especially for
ACE reports.  This will work:

SELECT
   no, lnum
   FROM T1, main
   WHERE T1.no = main.s_no
   INTO TEMP tempmain;
UPDATE main SET
   l_num = (SELECT lnum FROM tempmain WHERE tempmain.no = main.s_no)
   WHERE s_no IN (SELECT no FROM tempmain);

Hopefully, that will be a lot faster.   There may be a better way to
do it. (without a TEMP table)  However, Informix-SQL is a bit limited
when it comes to performing an UPDATE while looking up information
in more than one table.

-- 

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Randall Rhea                                          Informix Software, Inc. 
Senior Programmer/Analyst, MIS                    uunet!pyramid!infmx!randall

jfr@locus.com (Jon Rosen) (11/10/90)

In article <1990Nov8.021252.27609@informix.com> randall@informix.com (Randall Rhea) writes:
>In article <1990Nov7.115344.27844@diku.dk> nomann@rimfaxe.diku.dk (Ole Nomann Thomsen) writes:
>>Hello all. I am doing some programming in informix-sql V2.10, and have run 
>>into a problem that I can solve (:-)) , but only inefficiently (:-().
>>
>>Now this statement (repeated from jelly-ware memory) does it:
>>
>>update main
>>  set l_num = (
>>    select lnum from T1                             {1}
>>    where T1.no = main.s_no                         {1}
>>  )
>>where exists                                        {2}
>>    select no from T1                               {2}
>>    where T1.no = main.s_no;                        {2}
>>
>>But it's cruciatingly slow:
>

Keep in mind that this SQL is non-standard... There is no way in ANSI SQL
or most other mainstream SQLs such as DB2 or RDB to update a column using
a Subselect... the ANSI spec (that is ANSI SQL1, not SQL2 or SQL3) 
requires a value as the right side of an update assignment... In those
types of SQLs, the only way to do this is to use either:
  
  1) a single cursor that joins the two tables and then issues
     regular single row update statements for each row assigning
     the appropraite value to l_num; or

  2) a cursor that retrieves the main table one row and a time, and
     then issues a singleton Select to get the matching row in the
     secondary table and does an Update Where Current Of... on each
     row in the main table; or
  
  3) a cursor that retrieves the main table one row at a time, and
     a second cursor the retrieves the secondary table one row at
     a time, each in the same sequence, and do a stepped "join" to
     find matching rows and update the main table with a Where 
     Current Of...

I have done *NOTHING* to test the effectiveness of these solutions...
Understand that they require embedded SQL and can not be done from
a normal interactive SQL environment...
 
Jon Rosen