kent@manzi.unx.sas.com (Paul Kent) (05/17/91)
hello, SQL queries with summary functions when the where clause eliminates all rows from consideration have a strange quirk that i'd like to know more about. > create table gr ( a int, b char ); > insert into gr values( 1, 'a' ); > insert into gr values( 2, 'b' ); > select count(*) from gr; this gets predictable results.. a single row with "2" > select count(*) from gr where a > 10; this too. there are no rows that satisfy the where clause, so we get a single row showing that "0" records were counted. > select count(*) from gr where a > 10 group by b ; > select b, count(*) from gr where a > 10 group by b ; this is wierd. there were no rows, so presumably there are no groups. however o) DB2(v2r2) o) Oracle(v6 on a vax) o) RDB(sorry, no version details, vms5.3 however) dont produce any output for these latter queries. does anyone know the rationale behind this. can they point at the rules in the standard (ansi 135.1 or draft stndard for sql2) that say that this they way it should be? our implementation of SQL produces a single row for all of these queries. and a user has asked why the difference. unfortunately, thats the way we read it in the standard. what does your favorite SQL database do? and do you like it that way? thanks, paul. -- Paul Kent (SQL r&d) " nothing ventured, nothing disclaimed " kent@unx.sas.com SAS Institute Inc, SAS Campus Dr, Cary NC 27513-2414.
jfr@locus.com (Jon Rosen) (05/17/91)
In article <1991May16.212202.22391@unx.sas.com> kent@manzi.unx.sas.com (Paul Kent) writes: >SQL queries with summary functions when the where clause >eliminates all rows from consideration have a strange >quirk that i'd like to know more about. >> create table gr ( a int, b char ); >> insert into gr values( 1, 'a' ); >> insert into gr values( 2, 'b' ); > >> select count(*) from gr; >this gets predictable results.. a single row with "2" >> select count(*) from gr where a > 10; >this too. there are no rows that satisfy the where clause, so we >get a single row showing that "0" records were counted. >> select count(*) from gr where a > 10 group by b ; >> select b, count(*) from gr where a > 10 group by b ; >this is wierd. there were no rows, so presumably there are >no groups. however > o) DB2(v2r2) > o) Oracle(v6 on a vax) > o) RDB(sorry, no version details, vms5.3 however) >dont produce any output for these latter queries. >does anyone know the rationale behind this. >can they point at the rules in the standard >(ansi 135.1 or draft stndard for sql2) >that say that this they way it should be? >our implementation of SQL produces a single row for all >of these queries. and a user has asked why the difference. >unfortunately, thats the way we read it in the standard. >what does your favorite SQL database do? >and do you like it that way? I went and looked at Date's book on the SQL Standard (I unfortunately don't have the actual Standard definition book here at work today). It is obvious that the issues involving COUNT have all sorts of weird exceptions... (a) for other aggregates, you can add DISTINCT before the column name; for COUNT, the DISTINCT is required UNLESS you use COUNT(*) in which case the DISTINCT is NOT ALLOWED. (b) Nulls are eliminated in other expressions EXCEPT for COUNT where nulls are never eliminated. (c) If the argument happens to be a empty set, other functions return null; COUNT returns 0. This last one is probably on point here. SELECT COUNT(*) without a GROUP BY is specially defined to return one row with the COUNT of all rows in the table (after the WHERE clause has been applied). Even if the WHERE clause eliminates all of the rows, you still get a value of zero returned. Apparently, when the GROUP BY is added, the WHERE clause is used PRIOR to the grouping and there are no rows left to partition so no results are returned. This is clearly one of the problems with SQL's lack of orthogonality. By the way, one could argue that your implementation is incorrect also. Since SELECT COUNT(*) returns on row (representing the rows of the table) why shouldn't the GROUP BY clause then partition the table and count the rows for each group, returning all B's with the COUNT of zero for each row (this would mostly make sense in your 4th example where B is included in the query). I will further verify this in my standard definition manual. Jon Rosen