[comp.databases] Entity Models and BCNF

nigelc@cognos.UUCP (Nigel Campbell) (12/14/90)

This is a followup to the recent thread on Entity Models and 
Normalization from a colleague who is having problems getting
into the net . You can reach him at colim@cognos.UUCP or try 
3132050@mcimail.com which may reach him faster 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Subject: ERMs & Normalization
Status: OR

Using Normalization to check your ERM is of course redundant, 
but only if your ERM is perfect!

In practice people are rarely perfect and shifting to another paradigm to look
at the same problem another way is a standard way of getting the Human
mind to stop overlooking things.

So in practice checking your ERM against Normalization is far from 
redundant, even if you find nothing it proves your design is right
(or at least internally complete & consistent).

ERM's which only carry Entity names don't have enough information to
derive Normalization. The approach we use is to identify the
Attributes & Key for each Entity, then the dependencies between them.

You can then produce a first cut Relation for each Entity and start checking
it's Normalization.  Personally I've never seen the usefulness in using
an automated package at this point. Identifying the Dependencies is the
real Analysis step, checking whether the Relation is now in 3NF, and fixing it
is a trivial exercise.

As a side issue ERM's allow 1 to 1 relationships between things
which are seperate logical business entities, but have the same unique
identifier (could be Employees & Phones in the current phone system),
Normalization would put these in the same relation. I would advocate keeping them 
apart at the Logical level to keep your options open, and consider whether to 
collapse them into one Relation at the Physical Level.


Colin Moden



-- 
Nigel Campbell          Voice: (613) 783-6828                P.O. Box 9707
Cognos Incorporated       FAX: (613) 738-0002                3755 Riverside Dr.
uucp: nigelc@cognos.uucp || uunet!mitel!cunews!cognos!nigelc Ottawa, Ontario
                                                             CANADA  K1G 3Z4

aaron@grad2.cis.upenn.edu (Aaron Watters) (12/17/90)

In article <9129@cognos.UUCP> nigelc@cognos.UUCP (Nigel Campbell) writes:
on behalf of colim@cognos.UUCP or try 
             3132050@mcimail.com which may reach him faster 
>Using Normalization to check your ERM is of course redundant, 
>but only if your ERM is perfect!
>
>In practice people are rarely perfect and shifting to another paradigm to look
>at the same problem another way is a standard way of getting the Human
>mind to stop overlooking things.

Okay, please give me an example where the paradigm shift actually
helps.  Someone previously mentioned something about removing
a foriegn key from an entity -- which seemed to me to be a silly
mistake in the first place -- the modeller should have asked
`Is this thing really an attribute of this entity?' to get the
right answer without worrying about dependancies at all.

Let me make a strong rhetorical claim:
	Any normalization that is NOT justified by an entity
	relationship (or similar) analysis is suspect and probably
	invalid -- ie, some user in the future will probably
	want to violate it.
comments?  (Keep in mind here that I'm talking about logical
modelling -- not performance tuning or anything else.)
		-aaron.

davidm@uunet.UU.NET (David S. Masterson) (12/19/90)

>>>>> On 17 Dec 90 15:10:21 GMT, aaron@grad2.cis.upenn.edu (Aaron Watters)
>>>>> said:

Aaron> Okay, please give me an example where the paradigm shift actually
Aaron> helps.  Someone previously mentioned something about removing a foriegn
Aaron> key from an entity -- which seemed to me to be a silly mistake in the
Aaron> first place -- the modeller should have asked `Is this thing really an
Aaron> attribute of this entity?' to get the right answer without worrying
Aaron> about dependancies at all.

Well, you may quibble about talk of "functional dependencies", but the logical
grounds are sound.  In a logical model, its better to identify relationships
separately from the entities so that attributes may be properly assigned.  For
instance, the Married-To relationship might have a Date attribute that
logically does not make sense as part of one the entities (Employee or Wife).
However, in final analysis, it may be decided to represent the Married-To
relationship as a foreign key on the Employee relation.

To be more explicit:

	The denormalization process should only occur when physically
	implementing the logical model.  A normalized logical model
	presents DB analyst with a more complete view of the world
	for both the current implementation and future changes.

--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

zuker@6sigma.UUCP (Hunter Zuker) (12/20/90)

In article <9129@cognos.UUCP> nigelc@cognos.UUCP (Nigel Campbell) writes:

>Using Normalization to check your ERM is of course redundant, 
>but only if your ERM is perfect!

Agreed.

>You can then produce a first cut Relation for each Entity and start checking
>it's Normalization.  Personally I've never seen the usefulness in using
>an automated package at this point. Identifying the Dependencies is the
>real Analysis step, checking whether the Relation is now in 3NF, and fixing it
>is a trivial exercise.

Well I agree that identifying the dependencies is where the real work is
done.  Normalization is just drudge work, like long division.  Sure I can
do it my head, but why bother when I have a tool that does much quicker.
For some people I guess normalization is trivial and others it's not.
It also depends on how large and how complex the database is.

A lot of designers model and normalize at the same time.  (That's one 
of the reasons that we see confusion about what is normalization here 
on the net and throughout the industry.)  I find this slows down the 
modeling process.  

>As a side issue ERM's allow 1 to 1 relationships between things
>which are seperate logical business entities, but have the same unique
>identifier (could be Employees & Phones in the current phone system),
>Normalization would put these in the same relation. I would advocate keeping them 
>apart at the Logical level to keep your options open, and consider whether to 
>collapse them into one Relation at the Physical Level.

Hmmm.  If I understand this correctly it seems to me that the best place
to keep your options open is by specifying this option at the modeling
level.  Using a one to many relationship which will keep the entities in
separate tables.  (This also documents this option so that the next person 
that comes around doesn't just plop them together.)

I tend to model, then normalize, model, normalize, etc.  This is done
until all those involved are happy with and understand the final design.
With a modeling tool and automated normalization I can do this pretty fast.

Hunter
-- 
Hunter Zuker		Six Sigma CASE, Inc.	 13456 SE 27, Suite 210
zuker@6sigma.UUCP       (206) 643-6911           Bellevue, WA 98005-4211

colinm@cognos.UUCP (Colin Moden) (12/21/90)

Aaron, I think you and I are on the same side. 
I too believe that Entity Modelling can be used at the "Logical" level
as well as the "Conceptual", and can provide a full replacement for
Normalization where Entities & Relations correspond 1-1.

Anyway you asked for a real example, so here is one from a customer in
Belgium.  (The context of this was that I was teaching them to Model
before programming).

The system is Car insurance and they came up with the Entities

Policy Types (Key : Policy type No)

Policy (Key : License plate, Policy type No, Start_Date)

Vehicle (Key : License plate)

Owner (Key : Owner Reference No)


I suspected there was more to it than that and asked what the Attributes
of Vehicle were, they gave.

License Plate,
Manufacturer,
Model,
Engine Capacity,
Date of Manufacture,
Car Body type,
Insurance group,
...

By using normalization I was able to show them that they had missed out
entities like : Car manufacturer, Model, Insurance Group

Similar work on Policies identified : Claims, Payments, Quotes, Receipts etc


Of course a good entity modeller would have found these anyway, but few of us are
perfect, and Entity Modelling has no way of checking when you are Done.
By listing attributes and dependencies & normalizing you do get an extra
viewpoint and a check that you are complete.

Of course if you miss attributes or dependencies nothing will save you, but
you are at least internally complete & consistent.


Part of my job is to train people to design systems, by using this combination
of Entity Modelling and Normalization people can check their own work and learn 
from it. Initally people spend their time 50% on each, and find 75% of
their  entities when Normlizing. After a couple of years the Normalization checks 
become almost subconscious and they spend 90% of the time on the ERM, 
5% writing down the attributes and 5% checking that it is fully normalized
with a 95% success rate. 

But it's still a useful check and the addition of attributes at this stage is
the next step anyway.

Colin Moden


-- 
Colin Moden               Voice: +44 344 486668              Westerly Point
Cognos Incorporated       FAX: +44 344 485124                Market St 
uucp: colinm@cognos.uucp || uunet!mitel!cunews!cognos!colinm Brackell, Berks
                                                             UK  RG12 1QB