christie@kylie.oz (Chris Tham) (07/03/89)
I found the following anomaly and wondered if it is a bug in Sybase or normal in SQL: (All lines beginning with '>' are typed by me) > create table hello(a int) > create table world(b int) > insert into hello(a) values (1) (1 row affected) > select * from hello where a = 1 a ----------- 1 (1 row affected) > select * from hello where a = 1 or a in (select b from world) a ----------- (0 rows affected) > insert into world(b) values (2) (1 row affected) > select * from hello where a = 1 or a in (select b from world) a ----------- 1 (1 row affected) > drop table hello,world As you can see, the "select" with a subselect in an "or" clause fails if the subselect fails but succeeds if the subselect succeeds. If this is in fact not a bug but proper behaviour, can anyone tell me the proper way to phrase this query? I await comments with anticipation. Thank you. -- Internet: christie@kylie.otr.oz "I'm Pink, therefore I'm Spam" JANET: christie%kylie.oz@uk.cc.ucl.cs Phone: +612 235-0255 UUCP: {uunet,hplabs,mcvax,ukc,nttlab}!munnari!christie@kylie.otr.oz Mail: Optech Research Pty Ltd, Level 60 MLC Centre Sydney NSW 2000 AUSTRALIA
jkrueger@daitc.daitc.mil (Jonathan Krueger) (07/04/89)
In article <613@kylie.oz>, christie@kylie (Chris Tham) writes: >I found the following anomaly and wondered if it is a bug in Sybase or >normal in SQL: [test case looks like: +-------+ +-------+ | hello | a | world | b +-------+------+ +-------+------+ | 1 | +------+ +------+ select * from hello where a = 1 or a in (select b from world); +--------+ | result | a +--------+------+ +------+ It's normal. The semantics are usually unintended, however. The subquery implies a join to an empty table. This means a cartesian product where one matrix has size zero. Or if you prefer, in the relational model intersection is undefined on two relations if either is empty. To force the semantics you intend, use UNION: select a from hello where a = 1 UNION select a from hello where a in (select b from world); +--------+ | result | a +--------+------+ | 1 | +------+ -- Jon -- --
jklein@oracle.uucp (Jonathan Klein) (07/04/89)
st! omp: O: O:___rrrssstttu-I-I--s: Fon:on:on@orps s sO18181on:IDArlelelOra 8 t 8 t 9 ": w: w: 1 16 16 8@o@o@ubs n@eiMs sOr.c.c."FctZylttGMFFFU-! atiatiauburfffbutnRep: jZy-R-R-up'''B Kgegeg1rssq=n&g?sa1S hverouleinn g Kgpppsgl 8 P P anF Danijkjm(J: P&gcppppp`lrism( thag?ttclczRrZCa:2hzzzacle veraclFrourB= 8 ga:! F 8 t B=48Refr(J:aoooclcZyp.g?sq22abagup'e:st-I-T>.d19,j2news.daceMna1natiJbui$flq%#!INriF#!v#!v#acle ttlqlqlO O tiontime:acle tnttld jkjmsgl
lcain@cuc1.UUCP (Leroy Cain) (07/05/89)
In article <571@daitc.daitc.mil>, jkrueger@daitc.daitc.mil (Jonathan Krueger) writes: > In article <613@kylie.oz>, christie@kylie (Chris Tham) writes: > >I found the following anomaly and wondered if it is a bug in Sybase or > >normal in SQL: [test case looks like: > > +-------+ +-------+ > | hello | a | world | b > +-------+------+ +-------+------+ > | 1 | +------+ > +------+ > > select * from hello where a = 1 or a in (select b from world); > > +--------+ > | result | a > +--------+------+ > +------+ > > It's normal. The semantics are usually unintended, however. The > subquery implies a join to an empty table. This means a cartesian > product where one matrix has size zero. Or if you prefer, in the > relational model intersection is undefined on two relations if either > is empty. That is an interesting explanation but I don't buy it. I checked Informix-SQL and Unify 3.2 and they both product the correct results. +--------+ | result | a +--------+------+ | 1 | +------+ [ example of union delete ] So why the example using a union ( which is correct ) if you think there is an intersection? Or am I just misunderstand you? MS-DOS Just say NO!!!!! OS/2 Why???? Leroy Cain; Columbia Union College; Mathematical Sciences Department 7600 Flower Ave. WH406; Takoma Park, Md 20912 (301) 891-4172 uunet!cucstud!lcain
gupta@cullsj.UUCP (Yogesh Gupta) (07/08/89)
In article <613@kylie.oz>, christie@kylie.oz (Chris Tham) writes: > I found the following anomaly and wondered if it is a bug in Sybase or > normal in SQL: > [...] > > select * from hello where a = 1 or a in (select b from world) > a > ----------- > > (0 rows affected) > -- According to the ANSI SQL86 standard, a the result of a <search condition> is "The result derived by the application of the specified boolean operators to the conditions that result from the application of EACH specified <predicate> to a given row of a table or a given group of a grouped table." (General Rule #1). [Emphasis on EACH is mine]. Thus, the predicate a = 1 should be applied to each row of the table hello, and would be true for the one row that exists in table hello. Also, according to the truth tables described in General Rule #2, the search condition would be true for the row. Thus, the result should contain 1 row. Yogesh Gupta. ---- The above opinions are my own, and not those of Cullinet.
jeffl@sybase.Sybase.COM (Jeff Lichtman) (07/09/89)
>> > select * from hello where a = 1 or a in (select b from world) > > Thus, the predicate a = 1 should be applied to each row of the table > hello, and would be true for the one row that exists in table hello... > Thus, the result should contain 1 row. > > Yogesh Gupta. The result of a SQL query is supposed to be as if the following operations were performed (in this order): 1) Cartesian product 2) Restriction (where clause) 3) Projection (select list) (Note that this doesn't necessarily describe the actual processing strategy.) In this case, the Cartesian product of the tables contains no rows, so the answer contains no rows. A previous posting had it right: 'or' does not mean 'union'. --- Jeff Lichtman at Sybase {mtxinu,pacbell}!sybase!jeffl -or- jeffl@sybase.com "Saints should always be judged guilty until they are proved innocent..."
lcain@cucstud.UUCP (Leroy Cain) (07/10/89)
In article <4982@sybase.sybase.com>, jeffl@sybase.Sybase.COM (Jeff Lichtman) writes: > >> > select * from hello where a = 1 or a in (select b from world) > > > > Thus, the predicate a = 1 should be applied to each row of the table > > hello, and would be true for the one row that exists in table hello... > > Thus, the result should contain 1 row. > > > > Yogesh Gupta. This is correct! [ stuff about query plan deleted] > A previous posting had it right: 'or' does not mean 'union'. Well in that case a number of text books need rewriting. A quote from "An Introduction to Database Systems" by C.J. Date page 264: "2. R WHERE c1 OR c2 is defined to be equivalent to ( R WHERE c1 ) UNION ( R WHERE c2 )" > --- > Jeff Lichtman at Sybase > {mtxinu,pacbell}!sybase!jeffl -or- jeffl@sybase.com > "Saints should always be judged guilty until they are proved innocent..." MS-DOS Just say NO!!!!! OS/2 Why???? Leroy Cain; Columbia Union College; Mathematical Sciences Department 7600 Flower Ave. WH406; Takoma Park, Md 20912 (301) 891-4172 uunet!cucstud!lcain
ajs@datlog.co.uk ( Andy Simms ) (07/10/89)
In article <4982@sybase.sybase.com> jeffl@sybase.Sybase.COM (Jeff Lichtman) writes: >>> > select * from hello where a = 1 or a in (select b from world) >> > >The result of a SQL query is supposed to be as if the following operations >were performed (in this order): > > 1) Cartesian product > 2) Restriction (where clause) > 3) Projection (select list) > >In this case, the Cartesian product of the tables contains no rows, so the >answer contains no rows. > >A previous posting had it right: 'or' does not mean 'union'. >--- >Jeff Lichtman at Sybase You're wrong, because there is NO cartesian product involved here. The (conceptual) cartesian product only applies when there is more than one table named in the FROM clause; here we have a predicate based on a single table with two conditions joined by a disjunction, the second condition being a subquery. Now, given that the predicate "a = 1" is TRUE for the single row in table "hello" and that "TRUE OR x" is TRUE for all three truth values of x, the result of the subquery is immaterial: the predicate is satisfied. In fact, irrelevant though it may be, the value of x is treated as FALSE (rather then MAYBE) by ANSI SQL (see under <quantified predicate> case 2(d)). If the Sybase query optimiser is transforming subqueries into joins, then I suspect you have a nice little problem to solve here. Furthermore, lazy evaluation suggests the subquery does not need to be executed at all (so you certainly don't need to do any transformation). Andy Simms Data Logic (ajs@datlog.co.uk).
gupta@cullsj.UUCP (Yogesh Gupta) (07/11/89)
In article <4982@sybase.sybase.com>, jeffl@sybase.Sybase.COM (Jeff Lichtman) writes: > >> > select * from hello where a = 1 or a in (select b from world) > > > > Thus, the predicate a = 1 should be applied to each row of the table > > hello, and would be true for the one row that exists in table hello... > > Thus, the result should contain 1 row. > > > > Yogesh Gupta. > > The result of a SQL query is supposed to be as if the following operations > were performed (in this order): > > 1) Cartesian product You missed a very important point though - The cartesian product is of only those tables that are in the <from clause>. (see lengthy ANSI quote below). Thus, in our query, there is NO cartesian product involved! > 2) Restriction (where clause) > 3) Projection (select list) > > (Note that this doesn't necessarily describe the actual processing strategy.) > In this case, the Cartesian product of the tables contains no rows, so the > answer contains no rows. Again, there is no Cartesian product involved. The following is a long discussion which tries to explain how subqueries are different from simple joins: What is the difference between the following the queries? Q1. Select a from hello where a = 1 or a in (select b from world); Q2. Select a from hello, world where a = 1 or a = b; Note that the behaviour of the subquery is different in the sense that there is NO initial cartesian product! Thus, to quote ANSI again (X3.135-1986): <table expression> ::= <from clause> [<where clause>] [<group by clause>] [<having clause>] General Rules 1) If all the optional clauses are omitted, then the table is the result of the <from clause>. Otherwise, each specified clause is applied to the result of the previously specified clause and the table is the result of of the application of the last specified clause. ... <from clause> General Rules: 2) Case: a) If the <from clause> contains a single <table name>, then the result of the <from clause> is the table identified by that <table name>. <where clause> General Rules: 1) Let R denote the result of the <from clause>. 2) The <search condition> is applied to each row of R. The result of the <where clause> is a table of those rows of R for which the result of the <search condition> is true. 3) Each <subquery> in the <search condition> is effectively executed for each row of R and the results used in the application of the <search condition> to the given row of R. If any executed <subquery> contains an outer reference ... From the above, the logical order of processing (not necessarily the actual order of execution) of the query Q1 is as follows: Step A) The <from clause> has the table "hello". The result of the from clause is the single row with the value of "a" being 1. Step B) The <where clause> contains two predicates. Applying each of them to the value of the single row, the first predicate (a = 1) yields true while the second (subquery) yields false. The result of True OR False is True. The the result of applying the <where clause> is also the single row with the value of "a" being 1. However, the logical order of processing of the query Q2 is as follows: Step A) The <from clause> has the tables "hello" and "world". The result of the from clause is the extended cartesian product of the two tables. As the number of rows in the table "world" is zero, the result has no rows. I hope this makes it clear. (I shouldn't have to explain this to two people from Sybase, should I :-) Yogesh Gupta. Cullinet Software, Inc. ----- The opinions expressed in this article are those of the author only and do not represent those of Cullinet.
jeffl@sybase.Sybase.COM (Jeff Lichtman) (07/11/89)
I hate being wrong.... My previous analysis of the problem was incorrect. The query: select * from hello where a = 1 or a in (select b from world) should return rows even if the table "world" contains no rows. The rule about forming the Cartesian product first applies to tables in the same "from" clause. The SQL 86 standard says that each row from the Cartesian product is applied to the where clause one at a time. If there is a subquery, it is evaluated once per Cartesian product row from the outer query. > "2. R WHERE c1 OR c2 > is defined to be equivalent to > ( R WHERE c1 ) UNION ( R WHERE c2 )" I can't find this in my edition of Date, but I have an old 3rd edition. I don't think it is correct. At least, it doesn't match the ANSI standard. What edition of Date do you have, and what section are you quoting? --- Jeff Lichtman at Sybase {mtxinu,pacbell}!sybase!jeffl -or- jeffl@sybase.com "Saints should always be judged guilty until they are proved innocent..."
lcain@cuc1.UUCP (Leroy Cain) (07/12/89)
In article <5003@sybase.sybase.com>, jeffl@sybase.Sybase.COM (Jeff Lichtman) writes: > I hate being wrong.... > > My previous analysis of the problem was incorrect. The query: > > select * from hello where a = 1 or a in > (select b from world) [ a clarification of SQL 86 delete] Here another way of viewing the problem. first the query in SQL. 1. select * from hello where a = 1 or a in (select b from world) Then we converted it into logical equivalent form. ( Relation Calculus ) 2. FORALL (hello) ( hello.a = 1 OR EXIST(world) (hello.a = world.b) ) We now convert the relational calculus into relation algebra. 3. (hello WHERE a=1) UNION ( hello X world WHERE hello.a=world.b )[hello.a] As you can see we move from an OR to a UNION. My quote below was not completely relavent to the discussion since R in the case we are examining are two different things. Though the general principal seems to still apply. > > "2. R WHERE c1 OR c2 > > is defined to be equivalent to > > ( R WHERE c1 ) UNION ( R WHERE c2 )" > > I can't find this in my edition of Date, but I have an old 3rd edition. > I don't think it is correct. At least, it doesn't match the ANSI > standard. What edition of Date do you have, and what section are > you quoting? I am sorry it is the 4rd edition section 13.4. > --- > Jeff Lichtman at Sybase > {mtxinu,pacbell}!sybase!jeffl -or- jeffl@sybase.com > "Saints should always be judged guilty until they are proved innocent..." Leroy Cain; Columbia Union College; Mathematical Sciences Department 7600 Flower Ave. WH406; Takoma Park, Md 20912 (301) 891-4172 uunet!cucstud!lcain
jkrueger@daitc.daitc.mil (Jonathan Krueger) (07/13/89)
Leroy Cain writes: >in the 4rd edition section 13.4. > "2. R WHERE c1 OR c2 > is defined to be equivalent to > ( R WHERE c1 ) UNION ( R WHERE c2 )" This is a relational algebra used by Date. SQL may agree. Then again it may not. Perhaps someone with the ANSI spec could say. -- Jon --