[comp.databases] Referential Integrity

trogers%yosemite@Sun.COM (Tom Rogers) (03/01/88)

>TO THE NET PEOPLE:  Does anyone know what other DBMSs support
>referential integrity?
>>The only "well-known" system that I have seen that supports referential
>>integrity is the Sybase DataServer.  It does this through triggers that are
>>embedded in the database itself and activate no matter what program is
>>accessing the data.

Interbase has the same trigger facility as Sybase. SunUnify, Unify, and Cincom's
Supra all have real referential integrity that is not implemented via triggers.

greg@mdbs.UUCP (Greg Feldman) (03/12/88)

>
>>From: DMasterson@cup.portal.com
>Newsgroups: comp.databases
>Subject: Re: Informix 4GL Question?
>In message <1320@polyslo.UUCP> lchirica@polyslo.UUCP writes:
>>In article <2314@geac.UUCP> daveb@geac.UUCP (David Collier-Brown) writes:
>>>In article <714@uel.uel.co.uk> andrew@uel.uel.co.uk (Andrew Josey) writes:
>>>>When deleting records in my database I check whether they are
>>>>referenced by any other records. This is currently done by code
>>>>similar to the following :
	^^^^^
	I contend this is the DBMS's job.

>
>>TO THE NET PEOPLE:  Does anyone know what other DBMSs support
>>referential integrity?
>>
MDBSIII supports referential integrity.  Deleting (or updating)
a record (tuple) will AUTOMATICALLY maintain all relationships
referencing that record.  It will NOT delete (cascade) any other
records.  However, it will dynamically maintain all indexes, so
no need to run a data base re-organize program.

MDBSIII can do this because there is no redundancy in a properly 
designed schema.  So, rather than having to run around the data
base looking for potential anamolies, all references to that data
are explicitly connected to that particular record.  As a result,
referential integrity can be had by the DBMS software, not the
application program.

#include ".signature"
Greg Feldman--MDBS  (317) 448-6187

UUCP:	{rutgers,ihnp4,decvax,ucbvax}!pur-ee!mdbs!support

Note:  "These are my opinions, so if anyone asks, I didn't do it!"

bobd@bloom.UUCP (Bob Donaldson) (12/23/88)

The discussion on binary data types has prompted me to ask another feature-
related question -

Why don't any of the RDBMS packages we all know and love REALLY support
referential integrity??

HOLD ON - before you all chime in with 'but MINE does', let me say that I
do NOT mean the simplistic approach that is so often implemented that
merely prevents you from inserting or deleting a record that would 
violate the principle.  Let me explain.

C.S. Date is the source for most of my terminology, but let me clarify what
I mean so you can disagree with that, rather than what you think I might have
meant.

Foreign Key	The attribute in relation A that uniquely identifies a tuple
		in some (probably different) relation B.  

Target		The specific tuple identified by some instance of a foreign 
		key

BLOCK		An action taken by the system that prevents a user-specified
		action from completing if it violates referential integrity

CASCADE		An action taken by the system that propagates a user-specified
		action to other tuples and/or relations in order to maintain
		referential integrity

NULLIFY		An action taken by the system to set foreign keys to NULL in
		order to maintain referential integrity without cascading.
		NOTE - SEE BELOW ON EXISTENCE DEPENDENCY BEFORE YOU JUMP TO
		A CONFUSION.

Referential Integrity	"If an attribute is designated as a foreign key, then
			 the database may not assume a state such that that
			 attribute contains a value which does not exist in
			 some target tuple." 

Existence-Dependence	Note that the definition of referential integrity
			does NOT rule out NULL foreign keys.  This is
			because some real-world applications require 
			existence INdependence, rather than existence 
			dependence.  This can best be illustrated with an 
			example.  A student may sign up for a course, and the
			information about this might be in the ENROLLMENT
			table.  Clearly, if either the student or the course
			do not exist, the ENROLLMENT entry will be incon-
			sistent.  The same student may (or may not) have an
			advisor.  This information might be stored in the
			STUDENT table.  Clearly, if he HAS an advisor, the
			advisor must exist (i.e. he must have an entry in
			the PROFESSOR table), but it is also possible that
			no advisor is declared (STUDENT.advisor = NULL).
			The first case illustrates that ENROLLMENT is
			existence-dependent on STUDENT and COURSE.  The second
			ilustrates that STUDENT is existence-independent of
			PROFESSOR.

NOW, if anybody is still with me - Here is what I WANT:

create table ENROLLMENT
	attribute	student_id
			foreign key into STUDENT (id)
			/* referential integrity; join field is STUDENT.id */
				delete of target CASCADES
				/* If the student quits, delete the
				   enrollment too  */
				modify of target BLOCKED
				/* Don't allow the student id to change (at
				   least if he is currently enrolled  */

	attribute	course_id
			foreign key into COURSE (num)
			/* referential integrity; join field is COURSE.num */
				delete of target CASCADES
				/* If you cancel the course, get rid of the
				   associated enrollments  */
				modify of target CASCADES
				/* If you renumber the courses, fix the
				   enrollments too  */
	attribute	...
	...

create table STUDENT
	attribute	student_id
	attribute	advisor
			foreign key into PROFESSOR (name)
			/* referential integrity; join field is PROFESSOR.name
			 */
				delete of target NULLIFIES
				/* If my advisor quits, let him (just don't
				   throw me out too)  */
				modify of target CASCADES
				/* If my professor suffers an identity crisis
				   and changes his name, don't leave me
				   hanging there without an advisor  */
	attribute	...
	...

I obviously stretched a little for these examples;  PLEASE DON'T CRITIQUE
THE DATABASE DESIGN!  That's not the point.  

My point is that while I recognize that such things as CASCADES are both
demanding in terms of resources and also potentially dangerous, I maintain
that the database designer NEEDS to specify these various nuances of
referential integrity SOMEHOW.  Currently, we must choose from several
unattractive options:

	#1	Document the decision and pray that all the application
		programmers build the code into all of their applications
		AND strictly control all SQL documentation so that no real
		users will get in and foul something up.

	#2	Implement the subset of RefInt provided by your favorite
		RDBMS and get an answering machine for your phone to handle
		user complaints when they have to do something like renumber
		all the COURSES (or some other keystone relation).

	#3	Use record numbers or some such for ALL keys and try to claim
		(with a straight face) that you are implementing a relational
		model

	#4	Hire somebody to check the database daily for inconsistencies
		and figure out how to fix them.

Any comments? criticisms? PRODUCTS???

I'm braced for the response this is sure to draw.
-=-
Bob Donaldson              ...!cs.utexas.edu!natinst!radian!bobd
Radian Corporation                    ...!sun!texsun!radian!bobd
PO Box 201088       
Austin, TX  78720       (512) 454-4797

Views expressed are my own, not necessarily those of my employer.

jeffl@sybase.sybase.com (Jeff Lichtman) (12/25/88)

> Why don't any of the RDBMS packages we all know and love REALLY support
> referential integrity??

Sybase supports referential integrity, although the syntax is not anything
like what you showed.  With Sybase, you can put a trigger on a table which
will be executed whenever the table is updated.  Within triggers, one can
tell which columns have been updated and select the newly inserted and
deleted rows.  Here is an example of a cascading delete trigger:

	create trigger student_delete on student for delete as
		delete enrollment from enrollment, deleted
			where enrollment.student_id = deleted.student_id

As I said, the syntax isn't the same.  There main advantage to this approach
is that the integrity conditions and actions can be arbitrarily complex.
That is, you're not limited to simple existence checks or simple cascade and
block actions.
-- 
Jeff Lichtman at Sybase
{mtxinu,pacbell}!sybase!jeffl
"Saints should always be judged guilty until they are proved innocent..."

jeffw@spider.sybase.com (Jeffrey Ward) (12/28/88)

In article <560@bloom.UUCP> bobd@bloom.UUCP (Bob Donaldson) writes:
] The discussion on binary data types has prompted me to ask another feature-
] related question -
] 
] Why don't any of the RDBMS packages we all know and love REALLY support
] referential integrity??
] 
] HOLD ON - before you all chime in with 'but MINE does', let me say that I
] do NOT mean the simplistic approach that is so often implemented that
] merely prevents you from inserting or deleting a record that would 
] violate the principle.

Have you taken a look at the SYBASE system?  While the referential integrity
mechanisms used in this system are not implemented precisely the way that
you specify, SYBASE uses special database objects called "rules", "defaults",
and, in particular, "triggers" to control database integrity.

In particular, triggers are not used merely to prevent an insert or update
that conflicts with integrity rules; they can also be used to force cascading
inserts, updates, and deletes, print messages, rollback entire transactions,
make calls to stored procedures or (in the upcoming 4.0 release) remote stored
procedures on other server nodes, raise exceptions, make decisions on the
basis of the data or the operation, and so forth.

I believe that other vendors are also implementing trigger-like objects,
but I do not know what their schedules for implementation are.  It sounds
like this sort of database object fulfills your need for non-simplistic
referential integrity, even if it does not use precisely the mechanism
that you cite in your example.

-----------------------------------------------------------------------------
	Jeffrey Ward		{pacbell,mtxinu,pyramid}!sybase!jeffw
	Sybase, Inc.
	6475 Christie Avenue	Disclaimer:  The opinions expressed above may
	Emeryville, CA  94608	not even be my own, much less my employers'.
-----------------------------------------------------------------------------

john@riddle.UUCP (Jonathan Leffler) (01/03/89)

In article <560@bloom.UUCP> bobd@bloom.UUCP (Bob Donaldson) writes:
>Why don't any of the RDBMS packages we all know and love REALLY support
>referential integrity??

There is an ANSI SQL committee working on the next versions (yes,
plural) of ANSI SQL -- known as SQL2 and SQL3.  I don't have up
to date info, but as of March 1988, SQL2 (and therefore SQL3 too)
had the following (outline) syntax in BNF:

<table definition> ::=
		CREATE TABLE <table name>
		(	<table element> [ { , <table element> } ... ] )

<table element>	::=
		<column definition>
	|	<table constraint definition>

<table constraint definition> ::=
		<table constraint>
		<constraint name definition>

<table constraint> ::=
		<unique constraint definition>
	|	<referential constraint definition>
	|	<check constraint definition>

<column definition> ::=
		<column name> [ <data type> |  <domain name> ]
		[ <default clause> ]
		[ <column constraint definition> ... ]

<column constraint definition> ::=
		<column constraint>
		<constraint name definition>

<column constraint> ::=
		NOT NULL
	|	<unique specification>
	|	<references specification> [ <triggered action> ]
	|	CHECK ( <search condition> )

<unique constraint definition> ::=
		<unique specification> ( <unique column list> )

<unique specification> ::=
		UNIQUE | PRIMARY KEY

<referential constraint definition> ::=
		<referential constraint>
		[ <triggered action> ]

<referential constraint> ::=
		FOREIGN KEY ( <referencing columns> )
		<references specification>

<references specification> ::=
		REFERENCES <referenced table and columns>
		[ MATCH <match columns> ]

<match columns> ::=
		NONE | ALL

<triggered action> ::=
		<update rule> [ <delete rule> ]
	|	<delete rule> [ <update rule> ]

<update rule> ::=
		ON UPDATE <referential action>

<delete rule> ::=
		ON DELETE <referential action>

<referential action> ::=
		CASCADE
	|	SET NULL
	|	SET DEFAULT

There are a good many definitions I have not listed; they are
either not relevant (well, not very) or trivial (e.g. a list of
things is a thing or a thing followed by a comma and a list of
things).

There are many pages of spiel about what it all means, but this
should give you:

	a)	An idea of how turgid standards documents can be.

	b)	How referential integrity will be specified.

I haven't yet worked out the method of specifying a RESTRICTED
(or BLOCK) constraint -- I think you omit the <triggered action>.

I hope this is some help.

Jonathan Leffler (john@sphinx.co.uk)