[comp.databases] Ingres and referential integrities

slc@rti.UUCP (Lynn Cohen) (07/21/89)

I'm curious as to why Ingres doesn't provide referential integrities.
I'm aware that you can build in referential constraints through QBF
and VIFRED but not as part of the data definition. Does anyone know
why Ingres chose not to provide this part of the relational model?

Thanks,
Lynn

debra@alice.UUCP (Paul De Bra) (07/21/89)

In article <3081@rti.UUCP> slc@rti.UUCP (Lynn Cohen) writes:
>I'm curious as to why Ingres doesn't provide referential integrities.
>I'm aware that you can build in referential constraints through QBF
>and VIFRED but not as part of the data definition. Does anyone know
>why Ingres chose not to provide this part of the relational model?
>

My guess would be performance.

I once attended a talk by an IBM person (a techie, not a salesman)
who explained that when moving from their hierarchical database system
to the relational one there still was an (average) performance
loss of about a factor 3. This was even without referential integrity.

The reason he gave for having difficulties with referential integrity
was that in the hierarchical databases referential integrity comes
natural, it is almost built into the model. With a relational database
checking for referential integrity involves real (computational) work.

Paul.

-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

jkrueger@daitc.daitc.mil (Jonathan Krueger) (07/23/89)

>in the hierarchical databases referential integrity comes naturally, it
>is almost built into the model. With a relational database checking
>for referential integrity involves real (computational) work.

The reasoning is entirely valid but skips two relevant points: HOW
MUCH is the performance difference?  and how much is the performance
ADVANTAGE when you need to perform queries not expressible in
hierarchical terms?

So my hunch on INGRES's lack of referential integrity isn't
performance, but development complexity, and RTI's desire to implement
it in a more general way that includes derived columns, user-defined
rules, arbitrary grouping, and the like.  How do current commercial
relational DBMS define the integrity "no person can be his own
ancestor or descendent"?  Sorry, folks, that's not referential.  How
do current commercial hierarchical DBMS define it?  I'd be interested,
particularly in how concise and general the form is, and how data
independent and and separable from user interface.  Could somone post
how one states this in ADABASE, IMS, other?

-- Jon
--

davidm@cimshop.UUCP (David Masterson) (07/25/89)

>...  How do current commercial
>relational DBMS define the integrity "no person can be his own
>ancestor or descendent"?

I'm curious, Jon, can you give an example of how Ingres implements "a more
general way" of referential integrity (hope I'm not paraphrasing too much).
Can it be done at the DBMS level or must it be done by application program?
Oh, is it v5 or v6?

David Masterson					(preferred)
uunet!cimshop!davidm		or		DMasterson@cup.portal.com

robf@squid.rtech.com (Robert Fair) (07/25/89)

Lynn writes:
>I'm curious as to why Ingres doesn't provide referential integrities.
>I'm aware that you can build in referential constraints through QBF
>and VIFRED but not as part of the data definition. Does anyone know
>why Ingres chose not to provide this part of the relational model?
>
>Thanks,
>Lynn

If you mean single-table constraints, INGRES has  had them for many
years, for example:

	CREATE INTEGRITY ON employee IS age>15 AND  age<75;

will enforce the integrity that all employees are between 15 and 75.

Starting with release 6, multi-table update integrity can be enforced by
using database procedures, typically linked updates are coded using
dbprocs similar to:

	CREATE PROCEDURE del_emp (emp_no=integer) AS
	BEGIN
		DELETE FROM emp
		WHERE  emp_no= :emp_no;

		IF iirowcount!=1 THEN
			MESSAGE 'Error deleting employee';
			ROLLBACK;
			RETURN  0;
		ENDIF;

		DELETE FROM emp_info
		WHERE  emp_no= :emp_no

		IF iirowcount!=1 THEN
			MESSAGE 'Error deleting emp_info';
			ROLLBACK;
			RETURN  0;
		ENDIF;

		COMMIT;
		RETURN 1;
	END;

The trick here to ensure integrity is that permits can be granted to 
execute procedures accessing base tables, even when there is no permission 
on the base table itself, e.g:

	GRANT SELECT ON emp TO PUBLIC;
	GRANT SELECT ON emp_info TO PUBLIC;
	GRANT EXECUTE ON PROCEDURE del_emp TO PUBLIC;

would ensure the only updates to the "emp" table come through database
procedures, so providing proper referential integrity. Similar procedures
can be defined to insert & update data in tables. dbprocs can be called
from ESQL and ABF/4GL.

Robert Fair
Relational Technology 
Technical Support.

jkrueger@daitc.daitc.mil (Jonathan Krueger) (07/25/89)

In article <442@cimshop.UUCP>, davidm@cimshop (David Masterson) writes:
>I'm curious, Jon, can you give an example of how Ingres implements "a more
>general way" of referential integrity (hope I'm not paraphrasing too much).

It doesn't.  Thus my phrasing: "RTI's desire to implement it in a more
general way".  Between their desire and delivery of a robust set of
tools lies a tough problem, which neither RTI nor any other vendor has
solved.

-- Jon
--

debra@alice.UUCP (Paul De Bra) (07/25/89)

In article <3225@rtech.rtech.com> robf@squid.UUCP (Robert Fair) writes:
}Lynn writes:
}>I'm curious as to why Ingres doesn't provide referential integrities.
}>I'm aware that you can build in referential constraints through QBF
}>and VIFRED but not as part of the data definition. Does anyone know
}>why Ingres chose not to provide this part of the relational model?
}>
}>Thanks,
}>Lynn
}
}If you mean single-table constraints, INGRES has  had them for many
}years, for example:
}
}	CREATE INTEGRITY ON employee IS age>15 AND  age<75;
}
}will enforce the integrity that all employees are between 15 and 75.
}
}Starting with release 6, multi-table update integrity can be enforced by
}using database procedures, typically linked updates are coded using
}dbprocs similar to:
}
} [example procedure deleted]
}
}Robert Fair
}Relational Technology 
}Technical Support.

Well, it doesn't look good if RTI's technical support doesn't even know
what referential integrity means, does it?

Now, as far as i know, it means the following:
given for instance 2 relations:
EMP (EMPNO, DNO)      employees with their department
DEPT(DNO, MANAGER)    departments with their manager

Then referential integrity (given EMPNO is a key in EMP and DNO a key in
DEPT) would imply that EMP cannot contain a DNO which does not occur in
DEPT.

There should be no need for procedures or other low-level tricks to
enforce this constraint.

In a hierarchical database the link between the DNO's in EMP and in DEPT
would come natural, and referential integrity is therefore easy.
In a relational database there is no automatic link between the two
DNO's which makes it harder to implement referential integrity.

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

gupta@cullsj.UUCP (Yogesh Gupta) (07/27/89)

In article <9677@alice.UUCP>, debra@alice.UUCP (Paul De Bra) writes:
> [...]
> Then referential integrity (given EMPNO is a key in EMP and DNO a key in
> DEPT) would imply that EMP cannot contain a DNO which does not occur in
> DEPT.

Referential integrity:
If DNO is the primary key in DEPT and EDNO is the foreign key in the
EMP table that refers to DNO in DEPT, then EMP.EDNO can contain no non-NULL
value that does not exist in DEPT.DNO.

Note that the above only defines what a property of the database to maintain
consistency.  However, it says nothing about what action to take in case of
an attempted violation.  Let us say that you are trying to delete a row
from DEPT which has some rows in EMP referring to it.  The following actions
are possible:
	Disallow the delete (RESTRICT).
	Delete the rows from EMP as well (CASCADE).
	Set the EDNO value for those rows in EMP to NULL (SET NULL).
	Set the EDNO value for those rows in EMP to NULL some default value
	    (SET DEFAULT).

Any of these actions maintains consistency in the database.  It should be up
to the DBA to define these.

> There should be no need for procedures or other low-level tricks to
> enforce this constraint.
> 

I wholeheartedly agree with the above statement.  If you write procedures,
you have just eliminated ALL the flexibility of SQL.  Using Robert Fair's
example:
In article <3225@rtech.rtech.com>, robf@squid.rtech.com (Robert Fair) writes:

>Starting with release 6, multi-table update integrity can be enforced by
>using database procedures, typically linked updates are coded using
>dbprocs similar to:
>
>	CREATE PROCEDURE del_emp (emp_no=integer) AS
>	BEGIN
>		DELETE FROM emp
>		WHERE  emp_no= :emp_no;
>
>		IF iirowcount!=1 THEN
>			MESSAGE 'Error deleting employee';
>			ROLLBACK;
>			RETURN  0;
>		ENDIF;
>
>		DELETE FROM emp_info
>		WHERE  emp_no= :emp_no
>
>		IF iirowcount!=1 THEN
>			MESSAGE 'Error deleting emp_info';
>			ROLLBACK;
>			RETURN  0;
>		ENDIF;
>
>		COMMIT;
>		RETURN 1;
>	END;
>
>The trick here to ensure integrity is that permits can be granted to 
>execute procedures accessing base tables, even when there is no permission 
>on the base table itself, e.g:
>
>	GRANT SELECT ON emp TO PUBLIC;
>	GRANT SELECT ON emp_info TO PUBLIC;
>	GRANT EXECUTE ON PROCEDURE del_emp TO PUBLIC;

Note that at this point all the power of SQl is lost!  The only criterion
usable in the where clause is empno = :x!  For example, I can not say:

	DELETE from emp where performance = poor;

A much more flexible implementation would be defining the referential
integrity in the declarative form, say:

	create table emp_info (empno integer
			(refers to emp.empno on delete CASCADE), ...);

The above is just an example of syntax that would specify the action
to be taken, and not how.  Also, it would not matter how you deleted
a row from the emp table, the DBMS would guarantee the deletetion of
the appropriate row from emp_info.

> In a hierarchical database the link between the DNO's in EMP and in DEPT
> would come natural, and referential integrity is therefore easy.
> In a relational database there is no automatic link between the two
> DNO's which makes it harder to implement referential integrity.
>
> Paul.

This is an interesting myth.  Just to cite a counter example, Cullinet's
Enterprise:DB supports referntial integrity WITHOUT any physycal links.
And it performs very well (unfortunately, there are no referential
integrity benchmarks around).

Also, there is nothing that prevents a DBMS from internally maintaining
links, as long as those are NOT VISIBLE AND NOT NECESSARY for the
functionality, and only a performance enhancement.

robf@squid.rtech.com (Robert Fair) (07/27/89)

>From: debra@alice.UUCP (Paul De Bra):
>
>Well, it doesn't look good if RTI's technical support doesn't even know
>what referential integrity means, does it?
>
Actually, I know very well what referential integrity means, and so do
many others at RTI !

The essential point of  referential integrity is that the DBMS does not
allow users to violate the integrity (e.g. by writing an ESQL program).
There are two basic ways to do this (both implemented in the DBMS):

- Passive (rules/triggers). This allows users to do any kind of update,
  with rules being fired when update occurs to perform the associated
  referential checks, cascaded updates etc. Rules are very nice, but
  can be complex to setup and maintain (how about cascaded rules ?
  recursive rules ? etc)

- Active (dbprocedures). This defines one or more well-defined interfaces
  to  update  sets of tables, complete with error checking, transaction
  control etc to enforce referential integrity. Database procedures may
  not be as  transparent/elegent as rules but they provide just as
  much data integrity, and are often easier to set up/manage for
  real applications.
  
The choice between active and passive often depends on personal preference,
neither is a perfect solution for all possible cases.

Robert Fair
Technical Support
Relational Technology

pnnbkr@dean.Berkeley.EDU (=steven pennebaker) (07/31/89)

>> Article 3431 of comp.databases:
>> Path: pasteur!ucbvax!decwrl!sun-barr!apple!oliveb!amdahl!rtech!squid!robf
>> From: robf@squid.rtech.com (Robert Fair)
>> Newsgroups: comp.databases
>> Subject: Re: Ingres and referential integrities
>> Date: 27 Jul 89 12:41:50 GMT
>> Organization: Relational Technology, Inc.
>> 
>> >From: debra@alice.UUCP (Paul De Bra):
>> >
>> >Well, it doesn't look good if RTI's technical support doesn't even know
>> >what referential integrity means, does it?
>> >
>> Actually, I know very well what referential integrity means, and so do
>> many others at RTI !

who are they?  what are their names??  :-)
>> 
>> The essential point of  referential integrity is that the DBMS does not
>> allow users to violate the integrity (e.g. by writing an ESQL program).
>> There are two basic ways to do this (both implemented in the DBMS):
>> 
>> - Passive (rules/triggers). This allows users to do any kind of update,
>>   with rules being fired when update occurs to perform the associated
>>   referential checks, cascaded updates etc. Rules are very nice, but
>>   can be complex to setup and maintain (how about cascaded rules ?
>>   recursive rules ? etc)
>> 
>> - Active (dbprocedures). This defines one or more well-defined interfaces
>>   to  update  sets of tables, complete with error checking, transaction
>>   control etc to enforce referential integrity. Database procedures may
>>   not be as  transparent/elegent as rules but they provide just as
>>   much data integrity, and are often easier to set up/manage for
>>   real applications.
>>   

this is a good summary.  the point that i think is worth making, and seems
to be the underlying assumption of yogesh gupta's article 3424, is that 
in most cases simple cascade/restrict/nullify rules suffice for maintaining 
referential integrity.  database procedures have advantages, but seem like 
an awful tedious way of solving a problem that can be modeled so much 
more easily.

my solution to the problem of enforcing referential integrity in ingres rel. 5
has been to develop a single foriegn key table that is read by routines
that enforce the rules.  it is simple, centralized, easily managed and
changed.  in fact, given triggers ala sybase, i'll still use the table
to generate the triggers!  i suppose you could gen the procedures, but
it still seems like the long way around.

in one database that has ~160 delete rules in it, i found 3 cases where 
the rules alone were not sufficient.  in 2 of those cases, i cheated and 
decided that rather than cascading i should restrict which did work 
out and the last case, which really needs to be a cascade,
could be (but won't be) resolved w/ a bit of normalization (and subsequent 
degredation in reporting performance!).

i'm not certain how database procedures allow you to avoid tricky
cascade/recursion problems as you suggest above.  regardless of whether
you choose a rule based approach or a db procedure approach, you will
still have to work out the consequences of a transaction.

personally, i'd like both triggers and database procedures.  triggers
because they're easily managed, simple and, given a well designed
database, usually all you need, and database procedures because
every now and then you've gotta do something odd and that's a good
way to handle it.

>> The choice between active and passive often depends on personal preference,
>> neither is a perfect solution for all possible cases.

and they call it computer "science"! :-)

>> 
>> Robert Fair
>> Technical Support
>> Relational Technology

steven pennebaker

disclaimer:  the opinions are mine!

bsa@telotech.UUCP (Brandon S. Allbery) (08/01/89)

In article <9677@alice.UUCP>, debra@alice (Paul De Bra) writes:
+---------------
| Then referential integrity (given EMPNO is a key in EMP and DNO a key in
| DEPT) would imply that EMP cannot contain a DNO which does not occur in
| DEPT.
| 
| There should be no need for procedures or other low-level tricks to
| enforce this constraint.
+---------------

Why?  Because referential integrity is, by natural law, the lowest-level form
of integrity possible?

Multi-table integrity via procedures allows you to define referential
integrity.  Moreover, it allows you to define forms of integrity which a
hierarchical DBMS can not handle:  for example, a database I designed once
required the condition for integrity to be "... between xxx and yyy" instead
of simply "=".  Certainly, I could have designed it otherwise... and then
would have been forced to explain why the database had to be fed magic codes
instead of the values everyone was used to.  Sorry, folks, but forcing people
to conform to the DBMS's way of doing things isn't what I consider the right
way to do things.  (...segue to comp.info-systems....  1/2 :-)

RTI's got it right:  their system permits referential integrity as *you*
define it as a specific case of a much more general mechanism.

++Brandon
-- 
  Brandon S. Allbery @ telotech, inc.   (I do not speak for telotech.  Ever.)
*This article may only be redistributed if all articles listed in the header's*
*  References: field, and all articles listing this one in their References:  *
*		       fields, are also redistributed.			      *