knechod%peruvian.utah.edu@cs.utah.edu (Kevin Nechodom) (05/13/91)
Here is a question that I have used to harass the database companies that bring in their "Wonderful World of Our Database" shows. This is just a base case; any response to this question will quite likely inspire a flurry of counter-questions. Anyway... Imagine that you have a single table of many columns (>500) and many rows (>5000). Let's allow _any_ of those rows to be null (with the exception of the key variable). I would like to write a report that returns all cases where the data is missing. Since I am allowing for many columns, it can't be simply a columnar report. I envision this report to look like: --------------------------- | Key Value | Column Name | |-------------------------| | Key 1 | Field 1 | | Key 1 | Field 2 | | ... | ... | | Key 1 | Field xxx | | Key 2 | Field nn | | ... | ... | --------------------------- Is there _anything_ in the database universe that does something like this? Where is it? Who does it? etc. I think this would be of great interest, so I'll summarize any response and post it here. Thanks in advance. Kevin Nechodom University of Utah CSSRD/STACC nechodom@cc.utah.edu Disclaimers assume that I know what I'm talking about, don't they?
jfr@locus.com (Jon Rosen) (05/15/91)
In article <1991May13.162936.14729@hellgate.utah.edu> knechod%peruvian.utah.edu@cs.utah.edu (Kevin Nechodom) writes: >Here is a question that I have used to harass the database companies >that bring in their "Wonderful World of Our Database" shows. This is just >a base case; any response to this question will quite likely inspire a flurry >of counter-questions. Anyway... > >Imagine that you have a single table of many columns (>500) and many rows >(>5000). Let's allow _any_ of those rows to be null (with the exception of >the key variable). > >I would like to write a report that returns all cases where the data is >missing. Since I am allowing for many columns, it can't be simply a columnar >report. I envision this report to look like: > > --------------------------- > | Key Value | Column Name | > |-------------------------| > | Key 1 | Field 1 | > | Key 1 | Field 2 | > | ... | ... | > | Key 1 | Field xxx | > | Key 2 | Field nn | > | ... | ... | > --------------------------- > >Is there _anything_ in the database universe that does something like this? >Where is it? Who does it? etc. > To directly answer your question, the answer, to the best of my knowledge, is No, Nowhere, No-one :-) However, there are several solutions (none of them optimal of course, but can be used in interesting situations :-) First of all, the obvious solution here if you have a relational DBMS like DB2 or Ingres with an embedded SQL capability, is to just write a program to do it. The program would be pretty trivial (albeit long) in that it would look something like (Pascal psuedo-code here, thank you, please accept my apologies to all the C bigots :-): exec sql declare cursor csr as select key,a,b,c,... from tbl where ...; exec sql open csr; repeat exec sql fetch csr into :key,:a:anull,:b:bnull,:c:cnull,...; if sqlcode = 0 then begin if anull = -1 then writeln(key,"Field A is null"); if bnull = -1 then writeln(key,"Field B is null"); if cnull = -1 then writeln(key,"Field C is null"); ... end; until sqlcode <> 0; exec sql close csr; If the embedded SQL supports dynamic SQL, this can even be done with a general purpose routine that doesn't know what columns are in the table until execution time. THe routine becomes more complex but it can use a DESCRIBE to find out what the attribute columns are, and have a general loop that cycles through all the null column data for each row retrieved. I leave this as a exercise for the reader :-) There is also a kludgy SQL solution for small tables (I know your question asked about large column tables but I can't do that one). In any case, you can do the following: SELECT KEY,"FIELD A IS NULL" FROM TBL WHERE A IS NULL UNION SELECT KEY,"FIELD B IS NULL" FROM TBL WHERE B IS NULL UNION SELECT KEY,"FIELD C IS NULL" FROM TBL WHERE C IS NULL ... (ad nauseum until your system croaks from too many unions) ORDER BY 1,2 I know, I know, this is gross and disgusting... Don't blame me, I didn't invent SQL (Simplistic Query Language :-)... Jon Rosen
billc@ingres.com (Bill Coffin, x3387) (05/16/91)
From article <1991May13.162936.14729@hellgate.utah.edu>, by knechod%peruvian.utah.edu@cs.utah.edu (Kevin Nechodom): > Here is a question that I have used to harass the database companies > that bring in their "Wonderful World of Our Database" shows. This is just > a base case; any response to this question will quite likely inspire a flurry > of counter-questions. Anyway... Sounds like you're out for blood. Anyway, here's a quick knockoff on Ingres. It requires some "dictionary" support, but I'd be surprised to find an RDBMS that didn't have a table describing table columns. select i.id, c.column_name from iemps i, iicolumns c where c.table_name='iemps' and ( (c.column_name = 'name' and i.name is null) or (c.column_name = 'dept' and i.dept is null) or (c.column_name = 'salary' and i.salary is null) ) You could easily write a little tool that would generate this query automatically, just by feeding it a table name. -- Bill Coffin, aka billc@ingres.com FETCH INTO :standard_disclaimer, :gratuitous_witticism, :obscure_quote