bill@twg.bc.ca (Bill Irwin) (02/03/91)
I have an SQL that is supposed to select the clients that are due
for Preventative Maintenance (PM) in the coming month and, for
each client selected, add a new PM record to the pm table for each
hardware item whose most recently completed PM record is over 60
days old.  It actually works, but...

The first time this SQL was run, with the pm table empty, it
selected 11 clients and about 200 items to add PM records for.
It took about 5 minutes to run.  The second time it was run it
selected about 7 clients, and it still hasn't completed running
as I write this article.  The elapsed time has been over 4 hours
already, with the CPU time being 150 minutes - almost 3 hours!

With this kind of dramatic increase between the first two runs of
this SQL, I am visualizing a week to run the third one.  Below is
the SQL in question.  Any suggestions on ways to make it run
faster would be greatly appreciated.  I am using Unify Turbo-SQL.

insert into pm (item_num):
select item.item_number from item, client, inv
    [the (inv)entory table stores item descriptions]
    [the item table records individual components]
where
    client.key = item.key and        # item belongs to client
    inv.key = item_id and            # item key matches inv key
    inv_group = 'H' and              # hardware items only
    quit_date = **/**/** and         # client hasn't terminated support
    item.inst_date > **/**/** and    # item has been installed
    [
    pm_month_1 = 2 or                # one of the 3 PM months for the
    pm_month_2 = 2 or                # client is February
    pm_month_3 = 2
    ]
    and [
    item.item_number =               # don't add a new pm record if
    select pm.item_num from pm       # the most recent one is within
    where pm.item_num = item.item_number  # 2 months
    group by pm.item_num having 02/02/91 - max(pm.pm_date) > 60
    ;
    or 0 =                           # handles the case where the
    select count(*) from pm          # item is new and has no PMs
    where pm.item_num = item.item_number  # on file
    ;
    ]
/

I know it is the two inner selects that are slowing it down, but
I need to be selective about whether to add a new PM record for
an item.  Thanks for any suggestions.
--
Bill Irwin - The Westrheim Group - Vancouver, BC, Canada
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uunet!van-bc!twg!bill   (604) 431-9600 (voice)  |  Your Computer
bill@twg.bc.ca          (604) 430-4329 (fax)    |  Systems Partner
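[For readers following along, the two-branch "due for PM" test in Bill's query can be sketched in modern terms with SQLite.  The schema, the sample data, and the 1991-02-02 cutoff are illustrative assumptions only, not Unify Turbo-SQL syntax; dates are stored as ISO strings so julianday() can difference them.]

```python
import sqlite3

# Minimal SQLite stand-in for the pm and item tables described above
# (table and column names are assumptions based on the post).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE item (item_number INTEGER PRIMARY KEY);
    CREATE TABLE pm (item_num INTEGER, pm_date TEXT);
    INSERT INTO item VALUES (1), (2), (3);
    -- item 1: last PM well over 60 days before 1991-02-02
    INSERT INTO pm VALUES (1, '1990-10-01');
    -- item 2: recent PM, should be skipped
    INSERT INTO pm VALUES (2, '1991-01-15');
    -- item 3: brand new, no PM rows at all
""")

# Select items whose most recent PM is more than 60 days old, or
# which have no PM history at all -- the same two cases the original
# query tests with its two inner selects, here folded into one
# correlated subquery per item (COALESCE supplies a large value when
# the MAX over an empty set comes back NULL).
due = [row[0] for row in con.execute("""
    SELECT i.item_number FROM item i
    WHERE COALESCE(
        (SELECT julianday('1991-02-02') - MAX(julianday(p.pm_date))
           FROM pm p WHERE p.item_num = i.item_number),
        9999) > 60
""")]
print(due)   # items 1 and 3 are due
```

The cost problem the thread goes on to discuss is visible in this shape: the correlated subquery re-scans pm once per item row, so work grows with (items x pm rows) as the pm table fills up.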
lugnut@sequent.UUCP (Don Bolton) (02/05/91)
In article <600@twg.bc.ca> bill@twg.bc.ca (Bill Irwin) writes:
>
>I have an SQL that is supposed to select the clients that are due
>for Preventative Maintenance (PM) in the coming month and for
>each selected, add a new PM record to the pm table for each
>hardware item whose most recently completed pm record is 60 days
>old.  It actually works, but...
>
>The first time this SQL was run with the pm table empty, it
>selected 11 clients and about 200 items to add pm records for.
>It took about 5 minutes to run.  The 2nd time it was run it
>selected about 7 clients and still hasn't completed running as I
>write this article.  The elapsed time has been over 4 hours
>already, with the CPU time being 150 minutes - almost 3 hours!
>
>With this kind of dramatic increase between the first two runs of
>this SQL, I am visualizing a week to run the third one.  Below is
>the update SQL.  Any suggestions on ways to make it run faster
>would be greatly appreciated.  I am using Unify Turbo-SQL.
>

This sort of thing seems common to SQL based products in general.
I had an instance with Oracle where I had 20,000 rows, of which
approx 7,000 needed updating with values from another table (we
had a zip+4 pass done on our contact lists by an external source).

An "update where exists (select ...)" chewed and chewed and
chewed.  The solution was to create YAT (yet another table) as
the "sum" of the selected criteria and do renames.  That went
from 20 hours to 20 minutes.

I've encountered a similar situation with Informix and opted for
a similar type of solution.
>insert into pm (item_num):
>select item.item_number from item, client, inv
>    [the (inv)entory table stores item descriptions]
>    [the item table records individual components]
>where
>    client.key = item.key and        # item belongs to client
>    inv.key = item_id and            # item key matches inv key
>    inv_group = 'H' and              # hardware items only
>    quit_date = **/**/** and         # client hasn't terminated support
>    item.inst_date > **/**/** and    # item has been installed
>    [
>    pm_month_1 = 2 or                # one of the 3 PM months for the
>    pm_month_2 = 2 or                # client is February
>    pm_month_3 = 2
>    ]
>    and [
>    item.item_number =               # don't add a new pm record if
>    select pm.item_num from pm       # the most recent one is within
>    where pm.item_num = item.item_number  # 2 months
>    group by pm.item_num having 02/02/91 - max(pm.pm_date) > 60
>    ;
>    or 0 =                           # handles the case where the
>    select count(*) from pm          # item is new and has no PMs
>    where pm.item_num = item.item_number  # on file
>    ;
>    ]
>/
>
>I know it is the two inner selects that are slowing it down, but
>I need to be selective about whether to add a new PM record for
>an item.  Thanks for any suggestions.
>--
>Bill Irwin - The Westrheim Group - Vancouver, BC, Canada
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>uunet!van-bc!twg!bill   (604) 431-9600 (voice)  |  Your Computer
>bill@twg.bc.ca          (604) 430-4329 (fax)    |  Systems Partner

Good luck
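[Don's "yet another table" tactic can be sketched with the same assumed SQLite stand-in schema: materialize the expensive grouped criteria into a scratch table in one pass, then run the insert as a cheap set operation against it, rather than re-running a correlated subquery per row.  All names and data here are illustrative assumptions.]

```python
import sqlite3

# Illustrative schema and data (assumptions, not the poster's real tables).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE item (item_number INTEGER PRIMARY KEY);
    CREATE TABLE pm (item_num INTEGER, pm_date TEXT);
    INSERT INTO item VALUES (1), (2);
    INSERT INTO pm VALUES (1, '1990-10-01'), (2, '1991-01-15');
""")

# Step 1: a single pass over pm builds the scratch table of due items
# (most recent PM more than 60 days before the cutoff).
con.executescript("""
    CREATE TEMP TABLE due_items AS
        SELECT item_num FROM pm
        GROUP BY item_num
        HAVING julianday('1991-02-02') - MAX(julianday(pm_date)) > 60;
""")

# Step 2: the insert now just reads the small scratch table, so the
# cost no longer grows with (rows in pm) per candidate item.
con.execute("INSERT INTO pm (item_num) SELECT item_num FROM due_items")
added = con.execute(
    "SELECT COUNT(*) FROM pm WHERE pm_date IS NULL").fetchone()[0]
print(added)   # one new PM record, for item 1
```

The design choice is the same one Don describes for Oracle: trade a little temporary storage for turning a nested, per-row lookup into two sequential set operations.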
bill@twg.bc.ca (Bill Irwin) (02/05/91)
While I was waiting for responses to come in from the Net, I
thought I'd poke around with the SQL a bit.  I remembered
something that I read in a DB theory guide, to the effect: when
you are listing selection criteria, list the criterion that is
most likely to fail first and the criterion that is least likely
to fail last.  As soon as a record fails a test, the remaining
criteria are not evaluated for that record.  By listing the
criteria most likely to fail first, you stop many records from
even being evaluated at the second step.

I also thought that perhaps the query was getting royally
confused because of the inner query using the same table as the
outer one, so I decided to try a self join.  I wrote:

:insert into pm (item_num):
:select item.item_number from item, client, inv
:    [the (inv)entory table stores item descriptions]
:    [the item table records individual components]
:where
=============================================
This next section used to be last, now first.
Records are most likely to fail this test.
=============================================
:    [
:    pm_month_1 = 2 or                # one of the 3 PM months for the
:    pm_month_2 = 2 or                # client is February
:    pm_month_3 = 2
:    ] and
:    client.key = item.key and        # item belongs to client
:    inv.key = item_id and            # item key matches inv key
:    inv_group = 'H' and              # hardware items only
:    quit_date = **/**/** and         # client hasn't terminated support
:    item.inst_date > **/**/** and    # item has been installed
:    [
=============================================
Did a self join on the pm table, since this
is the one I am adding to.  Called the temp
pm table "x".
=============================================
:    item.item_number =               # don't add a new pm record if
:    select x.item_num from pm x      # the most recent one is within
:    where x.item_num = item.item_number   # 2 months
:    group by x.item_num having 02/02/91 - max(x.pm_date) > 60
:    ;
:    or 0 =                           # handles the case where the
:    select count(*) from pm          # item is new and has no PMs
:    where pm.item_num = item.item_number  # on file
:    ;
:    ]
:/

Guess what?  It now runs in under three minutes, compared to over
four hours previously (I never did wait to see if it would
finish, as it had brought the system to its knees and response
time was nonexistent).

Steven List offered a suggestion that agreed with one of the two
changes I had made:

:Here's what occurs to me: this query is recursive.  That is, it
:is inserting into a table that it is also searching.  Based on
:my knowledge of Unify, it is possible that Bill is creating all
:sorts of confusing loops for SQL and causing it to lose its
:place.
:
:That is, Unify tracks the CURRENT RECORD in each table.  When
:you traverse a set, unless you're using an index, my
:recollection is that that CURRENT RECORD pointer is used too.
:So if both the subquery and the insertion are working with the
:same table, then the key question is: what's the CURRENT RECORD
:at any given moment, and how does changing it affect the query
:at that moment?
:
:Let's see if I can clarify...  If I'm doing a simple search
:through a set, and as a result of finding a record I insert a
:record, then I've changed the CURRENT RECORD to the one I've
:just added.  But I'm still traversing a "set".  So my position
:in the set has changed.  If I'm doing a sequential search, and
:the new record is added (through the delete chain) BEFORE the
:record that triggered the insertion, then will I look at all the
:intervening records again?  If I'm traversing an explicit
:relation (link) and the same thing happens such that the
:inserted record is in a different set, then what's the effect?
:It gets VERY confusing!!
:
:I hope Bill will let us all know what the resolution of this is
:- it's fascinating!
:--
:+----------------------------------------------------------------------------+
:: Steven List @ Transact Software, Inc.                                 :^>~ :
:: Chairman, Unify User Group of Northern California                         :
:: {apple,coherent,limbo,mips,pyramid,ubvax}!itkin@guinan.Transact.COM       :
:: Voice: (415) 961-6112                                                     :
:+----------------------------------------------------------------------------+

Here's your wish, Steven.  I hope this exercise is of use to
others in optimizing their queries.  It sure can make a huge
difference to processing time to write the thing correctly!
Thanks to all who responded.
--
Bill Irwin - The Westrheim Group - Vancouver, BC, Canada
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
uunet!van-bc!twg!bill   (604) 431-9600 (voice)  |  Your Computer
bill@twg.bc.ca          (604) 430-4329 (fax)    |  Systems Partner
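[Editorial postscript: the fixed query's key idea - scanning the pm table only through an alias ("x") so the subquery's position is kept distinct from the table being inserted into, and materializing the matches before writing - can be sketched in SQLite under the same assumed stand-in schema as earlier.  Names, data, and the 1991-02-02 cutoff are illustrative assumptions, not Unify syntax.]

```python
import sqlite3

# Illustrative schema and data (assumptions, not the poster's real tables).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE item (item_number INTEGER PRIMARY KEY);
    CREATE TABLE pm (item_num INTEGER, pm_date TEXT);
    INSERT INTO item VALUES (1), (2), (3);
    INSERT INTO pm VALUES (1, '1990-10-01'), (2, '1991-01-15');
""")

# The same two-branch test as the final query, with pm read only
# through the alias "x" in the subqueries.  The SELECT is evaluated
# to completion first, then the inserts are applied, so the scan
# never sees the rows it is about to create.
due = [row[0] for row in con.execute("""
    SELECT item.item_number FROM item
    WHERE item.item_number IN
          (SELECT x.item_num FROM pm x
            GROUP BY x.item_num
            HAVING julianday('1991-02-02') - MAX(julianday(x.pm_date)) > 60)
       OR 0 = (SELECT COUNT(*) FROM pm x
                WHERE x.item_num = item.item_number)
""")]
con.executemany("INSERT INTO pm (item_num) VALUES (?)",
                [(i,) for i in due])
print(due)   # item 1 (stale PM) and item 3 (no PM history)
```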