[comp.databases] Bizarre Database Reporting

knechod%peruvian.utah.edu@cs.utah.edu (Kevin Nechodom) (05/13/91)

Here is a question that I have used to harass the database companies
that bring in their "Wonderful World of Our Database" shows.  This is just
a base case; any response to this question will quite likely inspire a flurry
of counter-questions.  Anyway...

Imagine that you have a single table of many columns (>500) and many rows
(>5000).  Let's allow _any_ of those rows to be null (with the exception of
the key variable).

I would like to write a report that returns all cases where the data is
missing.  Since I am allowing for many columns, it can't be simply a columnar
report.  I envision this report to look like:

    ---------------------------
    | Key Value | Column Name |
    |-------------------------|
    | Key 1     | Field 1     |
    | Key 1     | Field 2     |
    | ...       | ...         |
    | Key 1     | Field xxx   |
    | Key 2     | Field nn    |
    | ...       | ...         |
    ---------------------------

Is there _anything_ in the database universe that does something like this?
Where is it?  Who does it? etc.  

I think this would be of great interest, so I'll summarize any response and
post it here.

Thanks in advance.

Kevin Nechodom
University of Utah
CSSRD/STACC
nechodom@cc.utah.edu
Disclaimers assume that I know what I'm talking about, don't they?

jfr@locus.com (Jon Rosen) (05/15/91)

In article <1991May13.162936.14729@hellgate.utah.edu> knechod%peruvian.utah.edu@cs.utah.edu (Kevin Nechodom) writes:
>Here is a question that I have used to harass the database companies
>that bring in their "Wonderful World of Our Database" shows.  This is just
>a base case; any response to this question will quite likely inspire a flurry
>of counter-questions.  Anyway...
>
>Imagine that you have a single table of many columns (>500) and many rows
>(>5000).  Let's allow _any_ of those rows to be null (with the exception of
>the key variable).
>
>I would like to write a report that returns all cases where the data is
>missing.  Since I am allowing for many columns, it can't be simply a columnar
>report.  I envision this report to look like:
>
>    ---------------------------
>    | Key Value | Column Name |
>    |-------------------------|
>    | Key 1     | Field 1     |
>    | Key 1     | Field 2     |
>    | ...       | ...         |
>    | Key 1     | Field xxx   |
>    | Key 2     | Field nn    |
>    | ...       | ...         |
>    ---------------------------
>
>Is there _anything_ in the database universe that does something like this?
>Where is it?  Who does it? etc.  
>

To directly answer your question, the answer, to the best of my knowledge,
is No, Nowhere, No-one :-)
 
However, there are several solutions (none of them optimal of course, but
can be used in interesting situations :-)
 
First of all, the obvious solution here if you have a relational DBMS
like DB2 or Ingres with an embedded SQL capability, is to just write
a program to do it.  The program would be pretty trivial (albeit long)
in that it would look something like (Pascal psuedo-code here, thank you,
please accept my apologies to all the C bigots :-):  
 

  exec sql declare cursor csr as select key,a,b,c,... from tbl where ...; 
  exec sql open csr;
  repeat
    exec sql fetch csr into :key,:a:anull,:b:bnull,:c:cnull,...;
    if sqlcode = 0 then begin
      if anull = -1 then writeln(key,"Field A is null");
      if bnull = -1 then writeln(key,"Field B is null");
      if cnull = -1 then writeln(key,"Field C is null");
        ...
    end;
  until sqlcode <> 0;
  exec sql close csr;  
 
If the embedded SQL supports dynamic SQL, this can even be done with
a general purpose routine that doesn't know what columns are in the
table until execution time.  THe routine becomes more complex but
it can use a DESCRIBE to find out what the attribute columns are,
and have a general loop that cycles through all the null column
data for each row retrieved.  I leave this as a exercise for the
reader :-)
 
There is also a kludgy SQL solution for small tables (I know your
question asked about large column tables but I can't do that one).
In any case, you can do the following:
 
   SELECT KEY,"FIELD A IS NULL" FROM TBL WHERE A IS NULL 
   UNION
   SELECT KEY,"FIELD B IS NULL" FROM TBL WHERE B IS NULL 
   UNION
   SELECT KEY,"FIELD C IS NULL" FROM TBL WHERE C IS NULL
    ... (ad nauseum until your system croaks from too many unions)
   ORDER BY 1,2
 
I know, I know, this is gross and disgusting... Don't blame me,
I didn't invent SQL (Simplistic Query Language :-)...
 
Jon Rosen

billc@ingres.com (Bill Coffin, x3387) (05/16/91)

From article <1991May13.162936.14729@hellgate.utah.edu>, by knechod%peruvian.utah.edu@cs.utah.edu (Kevin Nechodom):
> Here is a question that I have used to harass the database companies
> that bring in their "Wonderful World of Our Database" shows.  This is just
> a base case; any response to this question will quite likely inspire a flurry
> of counter-questions.  Anyway...

Sounds like you're out for blood.
 
Anyway, here's a quick knockoff on Ingres.  It requires some "dictionary" support, but
I'd be surprised to find an RDBMS that didn't have a table describing table columns.

    select i.id, c.column_name
	from iemps i, iicolumns c
	where c.table_name='iemps'
	and (
		(c.column_name = 'name' and i.name is null)
		or (c.column_name = 'dept' and i.dept is null)
		or (c.column_name = 'salary' and i.salary is null)
	)

You could easily write a little tool that would generate this query automatically, just by
feeding it a table name.
--

Bill Coffin, aka billc@ingres.com 
 FETCH INTO :standard_disclaimer, :gratuitous_witticism, :obscure_quote