[comp.databases] Do you normalize?

blm@6sigma.UUCP (Brian Matthews) (07/19/89)

A few questions for people designing databases.  Do you normalize your
databases?  If you do normalize, do you do so by hand, or with an
automated tool?  How far do you take it - 3rd, 4th, or 5th normal form?
Do you think it was worth the effort?  What DBMS do you use?

If you don't normalize, why not?  Too difficult, or you don't think it's
worth doing?

Just curious if people normalize and if they think it's worth it...

-- 
Brian L. Matthews	Six Sigma CASE, Inc.
+1 206 854 6578		blm@6sigma.UUCP

hargrove@harlie.sgi.com (Mark Hargrove) (07/20/89)

In article <1581@cod.NOSC.MIL> dberg@cod.NOSC.MIL (David I. Berg) writes:

>
>I always normalize.  Remember, a database is a mapping of a bounded enterprise
>environment.  Each relation, or table, in the database is mapped from some
>entity set or object type in the enterprise.  If you want your database to
>accurately map the enterprise it represents, you will normalize it.  (And
>this is true for any database supporting any application.)   When
>your users apply logical transactions against the database, they will properly
>reflect the actual operation that generated the transaction.
>
>I usually normalize to 3NF, I do it by hand, and I use INFORMIX.
>

I've got to (mostly) agree with David.  We always normalize to at least 2NF and
usually to 3NF.  I can recall one or two instances where 4NF was appropriate.
We also do the job by hand, but this isn't really hard.  After designing
a few databases, you start to *think* in normal form.

My only disagreement is over the assertion that normal form represents
an 'accurate' mapping of an enterprise.  To me, 'normal form' is a
series of rules that describe the "right" way to represent objects,
entities and relationships *within the relational model*.  The
relational model is NOT the most natural or 'accurate' way to represent
all enterprises.  In fact, for representing objects with very complex
structure (such as graphical objects, or the data structures
representing the state of a manufacturing shop-floor), the relational
model often forces peculiar representations.

On the other hand:  if what David means is that normal form gives the
most accurate representation of an entrprise *given that you must operate 
within the relation model*, then I agree completely :-).



--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mark Hargrove                                       Silicon Graphics, Inc.
email: hargrove@harlie.sgi.com                      2011 N.Shoreline Drive
voice: 415-962-3642                                 Mt.View, CA 94039

dberg@cod.NOSC.MIL (David I. Berg) (07/20/89)

In article <242@6sigma.UUCP>, blm@6sigma.UUCP (Brian Matthews) writes:
> Do you normalize your databases? 

I always normalize.  Remember, a database is a mapping of a bounded enterprise
environment.  Each relation, or table, in the database is mapped from some
entity set or object type in the enterprise.  If you want your database to
accurately map the enterprise it represents, you will normalize it.  (And
this is true for any database supporting any application.)   When
your users apply logical transactions against the database, they will properly
reflect the actual operation that generated the transaction.

I usually normalize to 3NF, I do it by hand, and I use INFORMIX.

-- 
David I. Berg (dberg@nosc.mil)
GENISYS Information Systems, Inc., 4250 Pacific Hwy #118, San Diego, CA 92110
MILNET: dberg@nosc.mil
UUCP:   {akgua decvax dcdwest ucbvax}!sdcsvax!noscvax!dberg

dmc@sei.cmu.edu (Dawn Cappelli) (07/20/89)

When I first learned to design a relational database (6 years ago) I was 
taught to create an entity-relationship diagram, then create the tables 
from the final diagram, then normalize those tables.  I've found that to be
an effective method, although as I became more experienced a lot of the
stringent steps became second nature and intuitively obvious.  

At this point I still use entity-relationship diagrams.  I find they are
quite valuable as a communication tool between the database developers
and the end-users of the database.  An e-r diagram provides a roadmap of
the database.  As they say, a picture is worth a thousand words!  I have
used them to design a database for end-users who have absolutely no 
concept of what a database is, yet they understood the e-r diagram quite
easily.

After the e-r diagram is completed, I then design the tables.  As I design
the tables, I no longer consciously normalize the data, however, 
normalization is an intuitive process at this point for me.  In addition,
I sometimes make a conscious decision to denormalize tables in order to 
enhance performance.

So, in summary, I think anyone learning to design a database should use
normalization, although it becomes more intuitive and less disciplined as
experience develops.  

-- 
Dawn Cappelli	dmc@sei.cmu.edu   sei!dmc     (412) 268-6170

This is only my opinion, and doesn't necessarily reflect the opinion of the SEI.

mithomas@bsu-cs.bsu.edu (Michael Thomas Niehaus) (07/20/89)

In article <1581@cod.NOSC.MIL>, dberg@cod.NOSC.MIL (David I. Berg) writes:
> 
> > Do you normalize your databases? 
> 
... 
> I usually normalize to 3NF, I do it by hand, and I use INFORMIX.

I will clain to normalise to 3NF all the time as well, but sometimes it is not
worth the trouble.  Especially when having to use dBase.  If I am using
Oracle or some other SQL product that will do joins for me, 3NF is not a
problem.

I, too, do it by hand, but I do not claim to perform this normalization to
perfection.  (At least I don't have tables that have 128 fields in them.)

-Michael

-- 
Michael Niehaus        UUCP: <backbones>!{iuvax,pur-ee}!bsu-cs!mithomas
Apple Student Rep      ARPA:  mithomas@bsu-cs.bsu.edu
Ball State University  AppleLink: ST0374 (from UUCP: st0374@applelink.apple.com)

englandr@phoenix.Princeton.EDU (Scott Englander) (07/20/89)

I admit it.  I feel really stupid.  But could someone please explain to
me what normalization is, in this sense?  Thanks.
-- 

                                               - Scott

jkrueger@daitc.daitc.mil (Jonathan Krueger) (07/20/89)

In article <242@6sigma.UUCP>, blm@6sigma (Brian Matthews) writes:
>Do you normalize your databases?

Yes.

>If you do normalize, do you do so by hand, or with an
>automated tool?

By hand.  Usually I put a pen in my hand first, though :-)

>How far do you take it - 3rd, 4th, or 5th normal form?

I aim for 3rd.  Sometimes I make it.  Sometimes the source of data
can't define entities in a concise enough manner.

>Do you think it was worth the effort?

Always.  When I can make it to third, applications behave more
predictably, I'm more productive, and performance isn't the natural
enemy of correctness.  When I can't, it still pays to define the
problem.  A useful further step is to state the problem in a simple
declarative sentence.  Examples on request.

>What DBMS do you use?

Mostly INGRES.

-- Jon
-- 

rz02+@andrew.cmu.edu (Rudolph R. Zung) (07/21/89)

Normalization is a process of specifying the structure of the database
to excise any anomalies in it. The anomalies in this case would be
instances where certain operations on the database might render it to
be inconsistent (eg if a piece of information is stored in more than
one location, a change in that number in one location would render
the other instances of that number to be inconsistent.)
eg
 Customer:("Adams, James", "1 Elm Street", "555-1212")
 Order:("Adams, James", "555-1212", "3 pounds suger")

Now if James Adams changes his address (and hence phone) before the
order is processed, then the order table would have a wrong phone number
listed, unless someone went in to changed every instance of "555-1212"
in the order table.

Any good IS/DBMS book should be able to tell you the various normal forms.
Each normal form (1NF, 2NF, 3NF...) prevents a specific anomaly. It's a
little hard to fully explain in a bb post.


...Rudy

InterNet: rz02+@andrew.cmu.edu                               All I want is an
BITnet  : rz02+@andrew                                       all-expense paid
UUCP    : ...!{ucbvax, harvard}!andrew.cmu.edu!rz02+         life.
BELLnet : (412) 681-4237 | 0100 < time in (EDT, DST) < 0800
USMnet  : R Zung \ CMU Box 231 \ Pittsburgh PA 15213

Tastes filling; less great.             Roaches, and spiders, and ants, ohmy!
"Oh, forget the small change... hit the quarter... hit the quarter..."
                           -Woman from WVa Banker's Assoc, via bob boag
Names and places have been identified to protect the anonymous.

itkin@mrspoc.Transact.COM (Steven M. List) (07/21/89)

In article <3665@fp.sei.cmu.edu> dmc@sei.cmu.edu (Dawn Cappelli) writes:
> 
> After the e-r diagram is completed, I then design the tables.  As I design
> the tables, I no longer consciously normalize the data, however, 
> normalization is an intuitive process at this point for me.  In addition,
> I sometimes make a conscious decision to denormalize tables in order to 
> enhance performance.
> 
Bravo, Dawn!  Dawn's was the only response that I saw that made what, to
me, is a reasonable statement about denormalization (is that a word?).
I have found, in designing applications AND databases (and they DO usually
go together), that sometimes normalization is a significant detriment to
performance.

When retrieving data for reporting purposes, the trade-off between such
things as normalization and speed of retrieval can be a significant design
issue.  While normalization is an excellent concept, it is not sufficient
in and of itself.  It is a concept that is a tool to designing "better"
applications.  Where it obstructs rather than improves the process, it
is better ignored.
-- 
 +----------------------------------------------------------------------------+
 :                Steven List @ Transact Software, Inc. :^>~                  :
 :           Chairman, Unify User Group of Northern California                :
 :     {apple,coherent,limbo,mips,pyramid,ubvax}!itkin@guinan.Transact.COM    :

Bron@cup.portal.com (George Bron Faison) (07/21/89)

On 7/18/89 14:44 blm@6sigma.UUCP (Brian Matthews) writes:
>A few questions for people designing databases.  Do you normalize your
>databases?  If you do normalize, do you do so by hand, or with an
>automated tool?  How far do you take it - 3rd, 4th, or 5th normal form?
>Do you think it was worth the effort?  What DBMS do you use?

Yes, by hand (but looking at various CASE tools), to 3rd as a minimum.
Definitely worth the effort.  Don't believe you can properly implement
a relational database without normalizing your data first.

> Just curious if people normalize and if they think it's worth it...

Most problems I see in systems are due to an improper design, which
usually means not in 3rd normal form or better.  Understanding the
lingo of normalization (while learning) is a lot harder than actually
doing it.  Once you've been through it a few times it rapidly becomes
"natural" and intuitive.

You'll also learn that a normalized system behaves normally! :-)

(You'll have a lot fewer problems down the road.  If I run into
unexpected difficulties or find that it's "hard" to do something
I want to do, my first step is to step back and see if something
is not truly in 3rd normal form.  4th, and Boyce-Codd/5th, etc.
forms are far less commonly needed.)
-----------------------------------------------------------------------------
  George "Bron" Faison         Email: Bron@cup.portal.com
                              USMAIL: Modern Office Systems Technology, Inc.
       "Be Still!"                    6455 Penrith Drive
                                      Mechanicsville, VA 23111
      Oracle & UNIX            Phone: (804) 730-1467 (voice)
-----------------------------------------------------------------------------

P. S.  I use ORACLE.

allbery@nc386.UUCP (Brandon S. Allbery) (07/25/89)

Not having been able to find anything describing normalization (maybe I need
to take a day off and raid a college bookstore?), I go no farther than the 3NF
descibed in the Unify 4.0 (Accell 1.4) manuals.  At least, I think I go no
farther; not knowing what 4NF is, how would I know?)

Any suggestions as to book titles to look for when I descend upon the local
university?

++Brandon
-- 
Brandon S. Allbery, moderator of comp.sources.misc	     allbery@NCoast.ORG
uunet!hal.cwru.edu!ncoast!allbery		    ncoast!allbery@hal.cwru.edu
   * This message brought to you courtesy the "Watcher" for the 4th NCoast *

brianc@daedalus (Brian Colfer) (07/26/89)

In article <242@6sigma.UUCP> blm@6sigma.UUCP (Brian Matthews) writes:
>A few questions for people designing databases.  Do you normalize your
>databases?  If you do normalize, do you do so by hand, or with an
>automated tool?  How far do you take it - 3rd, 4th, or 5th normal form?
>Do you think it was worth the effort?  What DBMS do you use?
>
>If you don't normalize, why not?  Too difficult, or you don't think it's
>worth doing?

1) yes... the more experience I get the more I realize my teachers
were right.
2) By hand... wish I had a tool but I do what I can...
3) 4th most of the time 3rd at least.
4) Yes, my biggest head aches have come from un-normalized data
structures.  For example I inherted a bone marrow tracking
database which was not normalized and we cannot ask the system to 
reliably select all marrows for a particular patient.
5) Informix-4GL and Sybase.

6) When I don't normalize its in prototyping and if the applcation is
used more than just once or twice I always pay the price later in having 
to restructure the application in normalized format.

7) It takes time and thought.

Bite the bullet and NORMALIZE it will save you down the line!
=============================================================================
Brian  | UC San Francisco        | E-mail: USENET, Internet, BITNET
Colfer | Dept. of Lab. Medicine  |...!{ucbvax,uunet}!daedalus.ucsf.edu!brianc
       | S.F. CA, 94143-0134 USA | brianc@daedalus.ucsf.edu 
       | PH. (415) 476-2325      | BRIANC@UCSFCCA.BITNET
-----------------------------------------------------------------------------
       "Leave your body and soul at the door ..." -- Oingo Boingo
=============================================================================

donovan@hpindwa.HP.COM (Donovan Hsieh) (07/26/89)

/ hpindwa:comp.databases / allbery@nc386.UUCP (Brandon S. Allbery) /  3:14 pm  Jul 24, 1989 /
> Not having been able to find anything describing normalization (maybe I need
> to take a day off and raid a college bookstore?), I go no farther than the 3NF
> descibed in the Unify 4.0 (Accell 1.4) manuals.  At least, I think I go no
> farther; not knowing what 4NF is, how would I know?)

> Any suggestions as to book titles to look for when I descend upon the local
> university?

If you hate mathematics, read the follwoing two books :

	1. Database System Concepts  by Henry F. Korth &
					Abraham Silberschatz

		McGraw-Hill International Editions

		page 200 - 209

	2. Data Models	by Dionysios C. Tsichritzis &
			   Frederick H. Lochovsky

		Prentice-Hall

		page 292 - 297


Otherswise, read the following two books :

	1. The Theory of Relational Databases  by David Maier

		Computer Science Press

		page 124 - 142


	2. Principles of Database and Knowledge-Base Systems Volume I by
		Jeffrey D. Ullman

		Computer Science Press

		page 420 - 422 & 443


Donovan Hsieh
Business Network Division
Hewlett-Packard

dberg@cod.NOSC.MIL (David I. Berg) (07/27/89)

In article <1989Jul24.221425.3611@nc386.uucp>, allbery@nc386.UUCP (Brandon S. Allbery) writes:
> Any suggestions as to book titles to look for when I descend upon the local
> university?
> 
Christopher Date: An Introduction to Database Systems, Addison Wesley
Tsichritzis & Lochovsky: Data Models, Prentice Hall

The latter is my book of choice.

-- 
David I. Berg (dberg@nosc.mil)
GENISYS Information Systems, Inc., 4250 Pacific Hwy #118, San Diego, CA 92110
MILNET: dberg@nosc.mil
UUCP:   {akgua decvax dcdwest ucbvax}!sdcsvax!noscvax!dberg