[comp.databases] ER versus dependency normalization methods.

aaron@grad2.cis.upenn.edu (Aaron Watters) (11/27/90)

I've always suspected that the notion of designing a 
database using data dependencies -- i.e., the notion
that you start with a pile of attributes and functional
(and/or other) dependencies and proceed to slice up
the database into this or that normal form -- was somewhat
silly and artificial.

Personally, it seems to me that the less mathematical
methods of drawing Entity-Relationship diagrams (or other,
perhaps more advanced conceptual analysis methods) do
as well or better as a database design method, with no
mumbo-jumbo involved.

Consider the scenario:  To design a database you first
1.	-draw an ER-diagram of the information of interest
2.	-by inspecting the ER-diagram derive a heap of
	dependencies and attributes.
3.	-toss the lot into a normalization procedure.
4.	-if the output doesn't look like the ER-diagram at
	step 1, go back to (2) and figure out what you
	left out.
Close analysis will reveal that steps 2-4 are redundant.
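
[For readers who have not seen one, a minimal sketch in Python of what a
step-3 "normalization procedure" might look like -- a Bernstein-style 3NF
synthesis that simply groups functional dependencies by their left-hand
sides.  The attribute names are invented, and a real tool would also compute
a minimal cover and check that some scheme carries a candidate key:]

def synthesize_3nf(attributes, fds):
    """fds is a list of (lhs, rhs) pairs of attribute sets, read as lhs -> rhs."""
    schemes = {}
    for lhs, rhs in fds:                      # group the FDs by determinant
        schemes.setdefault(frozenset(lhs), set()).update(rhs)
    relations = [set(lhs) | rhs for lhs, rhs in schemes.items()]
    covered = set().union(*relations) if relations else set()
    leftover = set(attributes) - covered      # attributes mentioned in no FD
    if leftover:
        relations.append(leftover)
    return relations

# EMP -> DEPT, DEPT -> MGR: the classic transitive dependency.
print(synthesize_3nf({"EMP", "DEPT", "MGR"},
                     [({"EMP"}, {"DEPT"}), ({"DEPT"}, {"MGR"})]))
# prints two schemes: {EMP, DEPT} and {DEPT, MGR}

[The argument above is that the two schemes this prints are exactly what a
sensible ER diagram would have shown in the first place.]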

Now I'm not interested in getting into some high level
shouting match -- but could someone come up with a simple
compelling example which demonstrates why I'm wrong about
this?		-aaron.

ghm@ccadfa.adfa.oz.au (Geoff Miller) (11/28/90)

aaron@grad2.cis.upenn.edu (Aaron Watters) writes:

[...some good common sense deleted just to save a bit of space...]

>Consider the scenario:  To design a database you first
>1.	-draw an ER-diagram of the information of interest
>2.	-by inspecting the ER-diagram derive a heap of
>	dependencies and attributes.
>3.	-toss the lot into a normalization procedure.
>4.	-if the output doesn't look like the ER-diagram at
>	step 1, go back to (2) and figure out what you
>	left out.
>Close analysis will reveal that steps 2-4 are redundant.

>Now I'm not interested in getting into some high level
>shouting match -- but could someone come up with a simple
>compelling example which demonstrates why I'm wrong about
>this?		-aaron.

Basically I don't think you are wrong -- although you may have oversimplified
a bit!  Some years ago I read DeMarco's "Structured Analysis and System
Specification", and while I can't say I follow his (or anyone else's)
methodology exactly, he put a lot of emphasis on *simple* diagrammatic
representations and on the use of different *levels* of detail.  While he
mostly used this for processes, the same approach can easily be applied to
ER diagrams or variants thereof.  Keeping each diagram simple makes the
diagrams much more manageable and -- as you suggest -- much more directly useful.

I guess the formal methodologies may have a place in the design of very
complex systems, although I would always prefer to approach these in a less
formal way as simpler sub-systems.  However, to paraphrase Sturgeon's Law
(90% of everything is crud), "90% of computing is common sense dressed up
in jargon".  People were using normalisation and relational file structures
before Codd formalised that approach, and people were drawing pictures long
before they were formally defined as ER diagrams.  The way to approach any
of the formal methodologies is to look at what underlies them and take what
is useful for your particular application.

Geoff Miller  (ghm@cc.adfa.oz.au)
Computer Centre, Australian Defence Force Academy

lugnut@sequent.UUCP (Don Bolton) (11/28/90)

In article <33445@netnews.upenn.edu> aaron@grad1.cis.upenn.edu (Aaron Watters) writes:
>I've always suspected that the notion of designing a 
>database using data dependencies -- i.e., the notion
>that you start with a pile of attributes and functional
>(and/or other) dependencies and proceed to slice up
>the database into this or that normal form -- was somewhat
>silly and artificial.
>
>Personally, it seems to me that the less mathematical
>methods of drawing Entity-Relationship diagrams (or other,
>perhaps more advanced conceptual analysis methods) do
>as well or better as a database design method, with no
>mumbo-jumbo involved.
>
>Consider the scenario:  To design a database you first
>1.	-draw an ER-diagram of the information of interest
>2.	-by inspecting the ER-diagram derive a heap of
>	dependencies and attributes.
>3.	-toss the lot into a normalization procedure.
>4.	-if the output doesn't look like the ER-diagram at
>	step 1, go back to (2) and figure out what you
>	left out.
>Close analysis will reveal that steps 2-4 are redundant.
>
>Now I'm not interested in getting into some high level
>shouting match -- but could someone come up with a simple
>compelling example which demonstrates why I'm wrong about
>this?		-aaron.

I'll take a stab at this...

By proper identification of the information of interest and a
proper ER-diorama right off, you substantially reduce the "time
to market" on your project.  Other, more -- ahem -- "structured" programmers
will find this bothersome, undignified, and un-($nationality).

You might well find yourself knifed in your sleep. :-)

Actually, it may take a process this involved for those who
can only see the "mechanical" side of the data requirements.  On
the other hand, if the designer is more attuned to the "human"
use of the data, then steps 2-4 are indeed redundant.  A lot really
depends on one's background and understanding of how the data is used
by the people.

cdm@gem-hy.Berkeley.EDU (Dale Cook) (11/29/90)

In article <33445@netnews.upenn.edu>, aaron@grad2.cis.upenn.edu (Aaron
Watters) writes:
|> 
|> Personally, it seems to me that the less mathematical
|> methods of drawing Entity-Relationship diagrams (or other,
|> perhaps more advanced conceptual analysis methods) do
|> as well or better as a database design method, with no
|> mumbo-jumbo involved.
|> 
|> Consider the scenario:  To design a database you first
|> 1.	-draw an ER-diagram of the information of interest
|> 2.	-by inspecting the ER-diagram derive a heap of
|> 	dependencies and attributes.
|> 3.	-toss the lot into a normalization procedure.
|> 4.	-if the output doesn't look like the ER-diagram at
|> 	step 1, go back to (2) and figure out what you
|> 	left out.
|> Close analysis will reveal that steps 2-4 are redundant.
|> 

This is it in a nutshell.  It seems to me that you have described the
conceptual (step 1) to logical (steps 2-4) model development cycle.
I might only add 2 things:

1) Step 2 needs to include more than inspection of the E-R diagram.
   One typically gathers attributes and keys through examination
   of existing reports, processes, and any other information you
   have about the system provided by the users/customers.  I was
   always taught to keep the attribution of the E-R diagram to a
   minimum, so as to keep the conceptual model as simple as possible.
   Too many attributes tend to obscure the purpose of the conceptual
   model: to show the existence of entities and their relationships.

This brings up the second point:

2) After completion of step 4, it may become apparent that step 1
   is incomplete/wrong.  My point:  you may need to iterate on all
   four steps.

I agree that there is no need for "mumbo jumbo" when simpler methods
which produce reliable results are used.  In fact, someone who uses
"mumbo jumbo" when simpler methods are available is sabotaging a
prime benefit of data modeling: clarity.  If more people understood
and followed the method you have outlined, our lives would all be
easier.

--- Dale Cook    cdm@inel.gov

davidm@uunet.UU.NET (David S. Masterson) (11/29/90)

>>>>> On 26 Nov 90 19:48:39 GMT, aaron@grad2.cis.upenn.edu (Aaron Watters) said:

Aaron> I've always suspected that the notion of designing a database using
Aaron> data dependancies -- ie, the notion that you start with a pile of
Aaron> attributes and functional (and/or other) dependencies and proceed to
Aaron> slice up the database into this or that normal form -- was somewhat
Aaron> silly and artificial.

Have you seen the book "Conceptual Schema and Relational Database Design - A
Fact Oriented Approach" by G.M. Nijssen and T.A. Halpin?  It describes a
cookbook method (known as NIAM) for relational database design that begins by
working with real-world examples and then, step by step, transforms them into
a list of facts showing the entities involved, the roles they play, constraints
on entities in roles, and the types of facts and constraints in diagrammatic
form.  There are also methodologies for transforming one design into an
equivalent but more manageable form, which can then be optimally normalized
(the book talks about optimal normalization being the balance between
performance needs and reduction of redundancy).  Perhaps this method would
seem a little less "silly and artificial"?
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

randall@informix.com (Randall Rhea) (11/29/90)

In article <2095@ccadfa.adfa.oz.au> ghm@ccadfa.adfa.oz.au (Geoff Miller) writes:

>I guess the formal methodologies may have a place in the design of very
>complex systems, although I would always prefer to approach these in a less
>formal way as simpler sub-systems.  However, to paraphrase Sturgeon's Law
>(90% of everything is crud), "90% of computing is common sense dressed up
>in jargon".  People were using normalisation and relational file structures
>before Codd formalised that approach, and people were drawing pictures long
>before they were formally defined as ER diagrams.  The way to approach any
>of the formal methodologies is to look at what underlies them and take what
>is useful for your particular application.
>

Amen!!!! This guy has obviously designed and built real databases.  I have
a lot of respect for Codd, DeMarco et al., but in a real project, under
a real budget, a real computer, and real deadlines, you can't always follow 
them to the letter. 


-- 

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Randall Rhea                                          Informix Software, Inc. 
Senior Programmer/Analyst, MIS                    uunet!pyramid!infmx!randall

jdudeck@polyslo.CalPoly.EDU (John R. Dudeck) (11/29/90)

In an article cimshop!davidm@uunet.UU.NET (David S. Masterson) wrote:
>Have you seen the book "Conceptual Schema and Relational Database Design - A
>Fact Oriented Approach" by G.M. Nijssen and T.A. Halpin.  It discusses a
>cookbook method (known as NIAM) for relational database design beginning by
>working with real world examples and then step by step transforming them into
>a list of facts showing the entities involved, roles they play, constraints on
>entities in roles, types of facts and constraints in diagrammatic form.  There
>is also methodologies for transforming one design into another equivalent, but
>more manageable form which then can be optimally normalized (the book talks
>about optimal normalization being the balance between performance needs and
>reduction of redundancy).  Perhaps this method would seem a little less "silly
>and artificial"?

I read this book rather hastily last year and made a mental note that it bears
further study.  My impression was that their graphical visualization method
is basically similar to the ER diagramming that the Consoi-ERM program provides
on the Mac, but the NIAM approach puts more detailed information into the
diagrams.  To me, this seemed to make it more difficult to learn to use.
One of the advantages of Consoi-ERM is its simplicity.  I suspect that 
NIAM is more powerful, but until it is embodied in a simple, elegant
program for a popular computer such as the PC or Mac, it will never become
known and used.

HINT: Does anyone feel like writing such a program?

-- 
John Dudeck                                  "If it's Object Oriented then by
jdudeck@Polyslo.CalPoly.Edu                    definition it's A Good Thing".
ESL: 62013975 Tel: 805-545-9549                                 -- D. Stearns

aaron@grad2.cis.upenn.edu (Aaron Watters) (11/29/90)

In article <1990Nov29.032532.23631@informix.com> randall@informix.com (Randall Rhea) writes:
>
> I have
>a lot of respect for Codd, DeMarco et al., but in a real project, under
>a real budget, a real computer, and real deadlines, you can't always follow 
>them to the letter. 
>Randall Rhea                                          Informix Software, Inc. 

Issues of respect aside, how does one `follow' them at all?
I still don't see how one extracts functional dependencies
from the customer without getting the customer to draw something
like an ER diagram first.  And once the ER-diagram is drawn,
why bother with the notion of dependency based normalization
at all?  I'm purposely adopting a strong position in the hopes
of extracting a good example from someone (and maybe learning
something).		-aaron.

ghm@ccadfa.adfa.oz.au (Geoff Miller) (11/30/90)

aaron@grad2.cis.upenn.edu (Aaron Watters) writes:

>In article <1990Nov29.032532.23631@informix.com> randall@informix.com (Randall Rhea) writes:
>>
>> I have
>>a lot of respect for Codd, DeMarco et al., but in a real project, under
>>a real budget, a real computer, and real deadlines, you can't always follow 
>>them to the letter. 
>>Randall Rhea                                          Informix Software, Inc. 

>Issues of respect aside, how does one `follow' them at all?
>I still don't see how one extracts functional dependencies
>from the customer without getting the customer to draw something
>like an ER diagram first.  And once the ER-diagram is drawn,
>why bother with the notion of dependency based normalization
>at all?  I'm purposely adopting a strong position in the hopes
>of extracting a good example from someone (and maybe learning
>something).		-aaron.

But you don't *start* with the final ER diagram.

Before you get to see an ER diagram someone has (hopefully) done a hell of a
lot of work with the customer, helping them to specify their requirements.
ER diagrams, DFDs (Data Flow Diagrams) and other techniques are tools to 
assist in this process, which normally involves a lot of iterations.  One 
of the biggest problems I have found is persuading even willing customers to
accept just how much of *their* time will be involved in going over the 
diagrams and documentation with me to make sure *I* haven't missed something.

In many ways you can regard Data Dictionaries, ER diagrams and other tools
as alternative representations of the same data.  Each form of representation
emphasises something different, and each contains information that the 
others do not, but the better CASE tools will let you transform from one
to the other.  In the process of getting to a final ER diagram I have
effectively done the dependency-based normalisation on the way, because you
don't work exclusively with one form of data representation, be it ER, DFD 
or whatever.

Geoff Miller  (ghm@cc.adfa.oz.au)
Computer Centre, Australian Defence Force Academy

haim@taichi.uucp (24122-Haim Kilov(3786)m000) (12/01/90)

The most important thing is to explain to the customer the concepts you are
going to use. Usually you (the "modeler") and the customer use different
concepts, so that in order to be able to understand each other a common
(semantic) language has to be created and explained.  ER diagrams are usually
not sufficient: they _look_ like something reasonable, but vaguely defined
concepts lead to different, incomplete, or incorrect interpretations of the
picture. A precise definition (yes, and a formal one) of each concept is
needed, otherwise your customers will not distinguish (e.g.) between a
component and a subtype, etc.  Of course, the concept of object type
definition (using predefined operations and assertions) comes to mind.
We use such an approach, and I have published some papers on it.
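
[A toy Python illustration of the component-versus-subtype point -- all the
names are invented, and this only hints at what a formal definition of each
concept buys you:]

class Vehicle:                      # an entity type, identified by its VIN
    def __init__(self, vin):
        self.vin = vin

class Truck(Vehicle):               # SUBTYPE: a Truck *is* a Vehicle, same identity
    pass

class Engine:                       # COMPONENT: its own identity, plus a link to its Vehicle
    def __init__(self, serial_no, fitted_to):
        assert isinstance(fitted_to, Vehicle)   # an assertion on the role "fitted_to"
        self.serial_no = serial_no
        self.fitted_to = fitted_to

[In a box-and-line picture, both Truck and Engine can end up hanging off
Vehicle by an anonymous line, which is exactly the ambiguity the formal
definitions remove.]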

-Haim Kilov
haim@bcr.cc.bellcore.com

zuker@6sigma.UUCP (Hunter Zuker) (12/07/90)

>In article <33445@netnews.upenn.edu> aaron@grad1.cis.upenn.edu (Aaron Watters) writes:
>>Consider the scenario:  To design a database you first
>>1.	-draw an ER-diagram of the information of interest
>>2.	-by inspecting the ER-diagram derive a heap of
>>	dependencies and attributes.
>>3.	-toss the lot into a normalization procedure.
>>4.	-if the output doesn't look like the ER-diagram at
>>	step 1, go back to (2) and figure out what you
>>	left out.
>>Close analysis will reveal that steps 2-4 are redundant.
>>
>>Now I'm not interested in getting into some high level
>>shouting match -- but could someone come up with a simple
>>compelling example which demonstrates why I'm wrong about
>>this?		-aaron.

Well, my only concern with this is the ability to go from step 2 to step 3.
From what I've seen of ER diagrams there is not enough information to go
to third normal form.  You easily get 1st normal form and might scramble
to 2nd.  But an automated normalization procedure doesn't get enough
information from ER diagrams to go to third normal form.

I know this because we are doing just that right now.  We have a normalization
routine which works with an Extended Relational Analysis-like model.  We
are now converting it to work with a couple of ER diagrammers.  In general
there is not enough information about keys -- like which one is the primary
key, or which keys are concatenated.  This makes it difficult to get to 2NF or
3NF.
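
[A small Python illustration of the missing-key problem -- the attribute
names are made up, and the second dependency is the kind of fact that a bare
entity box does not record, yet it is what forces the split:]

def partial_dependencies(key, fds):
    """2NF check: non-key attributes that depend on a proper subset of the key."""
    violations = []
    for lhs, rhs in fds:
        if set(lhs) < set(key) and rhs - set(key):
            violations.append((set(lhs), rhs - set(key)))
    return violations

# An ORDER-LINE entity box with attributes ORDER#, PART#, QTY, PART-DESC.
fds = [({"ORDER#", "PART#"}, {"QTY"}),
       ({"PART#"}, {"PART-DESC"})]   # only an interview or a key declaration reveals this
print(partial_dependencies({"ORDER#", "PART#"}, fds))
# [({'PART#'}, {'PART-DESC'})] -- PART-DESC has to move to a PART table for 2NF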

Other than that your cycle works just fine.

Hunter Zuker
-- 
Hunter Zuker		Six Sigma CASE, Inc.	 13456 SE 27, Suite 210
zuker@6sigma.UUCP       (206) 643-6911           Bellevue, WA 98005-4211

vinay@ai.toronto.edu ("Vinay K. Chaudhri") (12/07/90)

In article <33684@netnews.upenn.edu> aaron@grad1.cis.upenn.edu (Aaron Watters) writes:
>In article <1990Nov29.032532.23631@informix.com> randall@informix.com (Randall Rhea) writes:
>>
>> I have
>>a lot of respect for Codd, DeMarco et al., but in a real project, under
>>a real budget, a real computer, and real deadlines, you can't always follow 
>>them to the letter. 
>>Randall Rhea                                          Informix Software, Inc. 
>
>Issues of respect aside, how does one `follow' them at all?
>I still don't see how one extracts functional dependencies
>from the customer without getting the customer to draw something
>like an ER diagram first.  And once the ER-diagram is drawn,
>why bother with the notion of dependency based normalization
>at all?  I'm purposely adopting a strong position in the hopes
>of extracting a good example from someone (and maybe learning
>something).		-aaron.

I tried to achieve this during my master's thesis.  The idea was to
design a dialogue that engages the user in a conversation and,
as a result, infers the dependencies.  I came up with a prototype
which worked well in our initial experimentation.  We wanted
to do more experimentation, fine-tune it, and make it a marketable
product, but (sigh!) I moved on to other research problems and the
ideas were left there.  In case anyone is interested, the thesis is
available as Lecture Notes in Computer Science #402, authored by
T. P. Bagchi and myself (Vinay K. Chaudhri).
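
[Not the system from the thesis -- just a toy Python sketch of the kind of
dialogue that can elicit functional dependencies without ever using the word
"dependency":]

def elicit_fds(attributes, ask=input):
    """Ask pairwise yes/no questions and record the implied dependencies."""
    fds = []
    for a in attributes:
        for b in attributes:
            if a == b:
                continue
            question = f"If you know the {a}, is there exactly one possible {b}? (y/n) "
            if ask(question).strip().lower().startswith("y"):
                fds.append(({a}, {b}))
    return fds

# elicit_fds(["FLIGHT#", "GATE", "PILOT"]) would ask questions like
# "If you know the FLIGHT#, is there exactly one possible GATE?"

[A real tool would presumably prune the questions using dependencies already
confirmed, which is where the interesting work is.]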


-- 

---------------------------------------------------------------------------
Vinay K Chaudhri

Email:                                  Mail:

aaron@grad2.cis.upenn.edu (Aaron Watters) (12/08/90)

In article <358@6sigma.UUCP> zuker@6sigma.UUCP (Hunter Zuker) writes:
>From what I've seen of ER diagrams there is not enough information to go
>to third normal form.  You easily get 1st normal form and might scramble
>to 2nd.  But an automated normalization procedure doesn't get enough
>information from ER diagrams to go to third normal form.

Okay: example?  Also, how do you get the information (dependencies)
from the user so you can say whether you are at 3rd normal form
or not?  I find it hard to imagine a user casually mentioning
`by the way, FLIGHTSPEED conditionally-multidetermines WINGSTRENGTH
given WEIGHT and WINGSPAN.'		-aaron.

cdm@gem-hy.Berkeley.EDU (Dale Cook) (12/11/90)

In article <34324@netnews.upenn.edu>, aaron@grad2.cis.upenn.edu (Aaron
Watters) writes:
|> In article <358@6sigma.UUCP> zuker@6sigma.UUCP (Hunter Zuker) writes:
|> >From what I've seen of ER diagrams there is not enough information to go
|> >to third normal form.  You easily get 1st normal form and might scramble
|> >to 2nd.  But an automated normalization procedure doesn't get enough
|> >information from ER diagrams to go to third normal form.
|> 
|> Okay: example? 

A factory has the following ER model:

 (1)  Employees assemble many parts.
 (2)  A part is assembled by one employee.

Thus, we have the entities EMPLOYEE and PART, and a single relationship,
call it ASSEMBLING.

Q: Is this normalized? 

A: That depends. 

On what, you ask.  Well, it obviously meets 1NF (no repeating groups).  How
about 2NF (full functional dependency)?  That depends on the attributes
of each entity and relationship.  If we have identified the attributes
EMPLOYEE.NAME, PART.DESCRIPTION, and EMPLOYEE-PART-ASSEMBLY-TIME.INSPECTOR,
then the answer is yes.  But suppose we have identified an attribute called
EMPLOYEE-PART-ASSEMBLY-TIME.AVG-ASSEMBLY-RATE.  Is it normalized now? NO!
The average rate at which an employee assembles his PARTs is NOT fully
functionally dependent on the relationship ASSEMBLING.  It is most properly
attributed to EMPLOYEE.  How about 3NF (nontransitivity)?  The 2NF example
above is also in 3NF.  But what if we add a foreign key to PART, namely,
PART.EMPLOYEE-ID?  Away goes 3NF, because this is a redundant relationship.
However, this is not necessarily a BAD THING.  It is a matter of the
policies of the business.  Which brings me to the point of my 
ramblings:  In order to normalize, you must (1) fully attribute the
logical data model (one does not typically attribute an ER diagram -
it is a conceptual tool) and (2) fully understand and take into 
consideration all business rules and policies.    
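
[The 2NF point restated in dependency terms as a Python fragment -- the
attribute names are invented to match the prose above:]

# ASSEMBLING is keyed by (EMPLOYEE-ID, PART-ID).  An attribute that a proper
# subset of that key already determines does not belong on the relationship.
assembling_key = {"EMPLOYEE-ID", "PART-ID"}
determinants = {
    "INSPECTOR":         {"EMPLOYEE-ID", "PART-ID"},  # needs the whole key: fine where it is
    "AVG-ASSEMBLY-RATE": {"EMPLOYEE-ID"},             # determined by the employee alone
}
for attr, det in determinants.items():
    place = "ASSEMBLING" if det == assembling_key else "the entity keyed by " + ", ".join(sorted(det))
    print(f"{attr} belongs on {place}")
# AVG-ASSEMBLY-RATE ends up on the entity keyed by EMPLOYEE-ID, i.e. EMPLOYEE.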
   
|> Also, how do you get the information (dependencies)
|> from the user so you can say whether you are at 3rd normal form
|> or not?  I find it hard to imagine a user casually mentioning
|> `by the way, FLIGHTSPEED conditionally-multidetermines WINGSTRENGTH
|> given WEIGHT and WINGSPAN.'		-aaron.

You get them from the user by translating his English statements about
the current system and by looking at existing reports, screens, etc., from
which you derive the attributes.  You would no more ask the user to describe
his data model in your terminology than you would ask a user to
describe his business process in terms of: for (cust=0; cust<last_cust; cust++)
{if (cust.bal > MAX_CREDIT_LIMIT) printf("...");}.  The
keys to look for in his English statements are nouns, adjectives, and
verbs.  Nouns translate roughly into entities, verbs into relationships,
and adjectives into attributes.  Normally, however, the user will NOT
be able to give you all of the attributes in English (there are far
too many).  This is why you look at reports, screens, forms, etc. to
discover the attributes.

----------------------------------------------------------------------
--- Dale Cook     cdm@inel.gov
"The only stupid question is the unasked one."
----------------------------------------------------------------------

brian@edat.UUCP (brian douglass personal account) (12/12/90)

I'm starting a new project and have been reading this conversation with
great interest.  In a former life I was developing about 4 different
DBMS-based systems per year, and had to do all of the analysis and
modeling -- Yourdon flows with ERs -- by hand.  What a pain!

In this life I am now looking at CASE tools to automate the whole 
analysis and design process.  In particular I have been interested in
the teamwork series by Ingres/Cadre and IDE's software through
pictures.  

Since these products claim not only to assist in the diagramming
of systems, but also in the creation of database schemas and relationships,
how useful has anyone found them to be?  I mean, going from user
interviews to actual data models was always a hair-puller at best,
but that may now be a moot point -- the tool does it all for you (scary!).
Some tools will not only generate all of your schemas, but even your
4GL code!

So how about it?  Are CASE tools going to relieve us all of the need
to decide between ER and DN, and do it all for us?  Sort of like
microwave brownies.  Pour in the interviews, mix all around,
nuke for 3 minutes and presto, instant system!


Brian Douglass			Voice: 702-361-1510 X311
Electronic Data Technologies	FAX #: 702-361-2545
1085 Palms Airport Drive	brian@edat.uucp
Las Vegas, NV 89119-3715

cdm@gem-hy.Berkeley.EDU (Dale Cook) (12/12/90)

In article <2370@edat.UUCP>, brian@edat.UUCP (brian douglass personal
account) writes:
|> 
|> In this life I am now looking at CASE tools to automate the whole 
|> analysis and design process.  In particular I have been interested in
|> the teamwork series by Ingres/Cadre and IDE's software through
|> pictures.  
|> 
|> Since these products claim not only to assist in the diagramming
|> of systems, but also in the creation of database schemas and relationships,
|> how useful has anyone found them to be?  I mean, going from user
|> interviews to actual data models was always a hair-puller at best,
|> but that may now be a moot point -- the tool does it all for you (scary!).
|> Some tools will not only generate all of your schemas, but even your
|> 4GL code!
|> 

We use IDE's STP.  While it is a good product, it won't do the hard part
that you mention: going from user interviews to ER to logical design.  It is
more of a documentation tool for these purposes, to give you pretty pictures.
The mundane job of taking a good logical model to physical schemas is
handled.  However, the really tough job of interview->ER->logical design
is still a human process.

|> So how about it?  Are CASE tools going to relieve us all of the need
|> to decide between ER and DN, and do it all for us?  Sort of like
|> microwave brownies.  Pour in the interviews, mix all around,
|> nuke for 3 minutes and presto, instant system!
|> 
|> 
Maybe someday.  Remember when 3GL code generators (e.g., COBOL) were
going to make programmers obsolete?  The same claims are surfacing
in the data analysis world.  I'll believe it when I see it.

Don't get me wrong.  CASE tools are valuable.  They can take care of
many mundane tasks, and make fewer errors in the process.  A large 
part of their value lies in the forced rigor involved in getting the
specifications to them.  The key here, however, is that CASE tools are
just that - tools.  They still require human operators, and skilled
ones at that.  Until a complete CASE system arrives which takes 
some yet to be determined specification language and turns it into
a finished, working system, they will not replace the current methods,
IMHO.  And even IF that day arrives, you'll need a data analyst, system
analyst, and programmer rolled into one to use it.

----------------------------------------------------------------------
--- Dale Cook     cdm@inel.gov
"The only stupid question is the unasked one."
----------------------------------------------------------------------

zuker@6sigma.UUCP (Hunter Zuker) (12/13/90)

brian@edat.UUCP (Brian Douglass) writes:

>I'm starting a new project . . .

>In this life I am now looking at CASE tools to automate the whole 
>analysis and design process.  In particular I have been interested in
>the teamwork series by Ingres/Cadre and IDE's software through
>pictures.  

It's funny you should ask ;-)   As some of you know, we have a data
normalization product called the Canonizer (because it uses the
canonical synthesis approach to normalization).  A bridge between CADRE's
Teamwork Entity Relationship Diagrams (ERDs) and our product has already been 
built, because the ERDs don't take you to third normal form.  And we 
are just completing a bridge to IDE's Software Through Pictures ERDs for 
the same reason.

>Since these products claim not only to assist in the diagramming
>of systems, but also in the creation of database schemas and relationships,
>how useful has anyone found them to be?  I mean, going from user
>interviews to actual data models was always a hair-puller at best,
>but that may now be a moot point -- the tool does it all for you (scary!).
>Some tools will not only generate all of your schemas, but even your
>4GL code!

Well, they generate your schemas, but the schemas have tables that
are just copies of the entities and don't necessarily have anything to do
with higher levels of normalization (3NF, 4NF, or BCNF).

I can't comment on the generation of 4GL, but real work and decisions
have to happen someplace.

>So how about it?  Are CASE tools going to relieve us all of the need
>to decide between ER and DN, and do it all for us?  Sort of like
>microwave brownies.  Pour in the interviews, mix all around,
>nuke for 3 minutes and presto, instant system!

Obviously I don't think ERDs are going to do it.  There is too much
ambiguity.  You can generate 3NF ERDs, but just because you have an
ERD doesn't mean you'll get any higher level of normalization than
first normal form.

Though we have found that our product is really useful for going from interview,
to model, to normalized schema, it is still an iterative process.  There
is real work to be done to identify the correct relationships
in the data.

Hunter 

-- 
Hunter Zuker		Six Sigma CASE, Inc.	 13456 SE 27, Suite 210
zuker@6sigma.UUCP       (206) 643-6911           Bellevue, WA 98005-4211

ghm@ccadfa.adfa.oz.au (Geoff Miller) (12/13/90)

cdm@gem-hy.Berkeley.EDU (Dale Cook) writes:

...stuff deleted...
>Don't get me wrong.  CASE tools are valuable.  They can take care of
>many mundane tasks, and make fewer errors in the process.  A large 
>part of their value lies in the forced rigor involved in getting the
>specifications to them.  The key here, however, is that CASE tools are
>just that - tools.  They still require human operators, and skilled
>ones at that.  Until a complete CASE system arrives which takes 
>some yet to be determined specification language and turns it into
>a finished, working system, they will not replace the current methods,
>IMHO.  And even IF that day arrives, you'll need a data analyst, system
>analyst, and programmer rolled into one to use it.

My problem with the CASE tools I have seen is that they are relatively
inflexible, being implementations of particular methodologies.  If you 
want to do something slightly different or non-standard, this can become
difficult.  There is also the question of transforming the design from 
the CASE software into a DBMS  -  maybe what would be useful here would
be a standard interface specification.  Application generator packages 
for a particular DBMS could then use a defined input format for taking
data from any CASE software.  This seems like a good idea  -  does anyone
know if it exists?

Geoff Miller  (ghm@cc.adfa.oz.au)
Computer Centre, Australian Defence Force Academy

cdm@gem-hy.Berkeley.EDU (Dale Cook) (12/14/90)

In article <361@6sigma.UUCP>, zuker@6sigma.UUCP (Hunter Zuker) writes:
|> 
|> I can't comment on the generation of 4GL, but real work and decisions
|> have to happen someplace.
|> 

Bingo.  All the CASE tools to date have at best allowed us to concentrate
our efforts where there is a better payoff, e.g., analysis.  Many have the
benefit of automating repetitive processes (as Hunter mentions, canonical
synthesis is a great example).  Some actually cause more work and grief,
after factoring in the awful interfaces.  Even the ultimate CASE system
will still require the human specification of data and processes.

I'd like to pose a question to the net.  It seems we have a terminology
gap when we speak of ER diagrams.  My posts have been from the view that
database design is a 3-step process:

1) Conceptual modeling.  This is what I see as the ER model.  To me, an
   ER diagram rarely contains any more than entities, relationships,
   and cardinalities (1-1, 1-N, N-N, etc).  The ER model includes the
   ER diagram and complete english definitions of the E's and R's.

2) Logical modeling.  This is where I see the introduction of keys and
   attributes.  Many methods exist for doing this; I like bubble charts.
   User views, which include existing file definitions, reports, screens,
   and any paper forms people use, are used to obtain the keys and 
   attributes.  This results in a series of views (bubble charts in the
   method I use) which can then be synthesized into a logical model,
   fully keyed and attributed.

3) The first 2 steps are database independent.  This step is the 
   transformation of the logical model to a physical schema.  It is
   here that I feel physical concerns such as performance should be
   considered (i.e., should I denormalize?).

There is a similar 3-step process in process modeling.  Anyway, my
question is this:  How does this fit in with your development process?
Does your definition of ER diagram contain keys and attributes?
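
[A toy Python version of step 3 -- the logical model below is keyed and
attributed by hand, the column type is a placeholder, and a real
transformation would also weigh indexes and denormalization decisions:]

logical_model = {
    "EMPLOYEE": {"key": ["EMPLOYEE_ID"], "attrs": ["NAME", "AVG_ASSEMBLY_RATE"]},
    "PART":     {"key": ["PART_ID"],     "attrs": ["DESCRIPTION", "EMPLOYEE_ID"]},
}

def to_ddl(model):
    """Turn a fully keyed and attributed logical model into CREATE TABLE text."""
    for table, spec in model.items():
        columns = ",\n  ".join(f"{col} CHAR(30)" for col in spec["key"] + spec["attrs"])
        keys = ", ".join(spec["key"])
        yield f"CREATE TABLE {table} (\n  {columns},\n  PRIMARY KEY ({keys})\n);"

print("\n".join(to_ddl(logical_model)))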

----------------------------------------------------------------------
--- Dale Cook     cdm@inel.gov
"The only stupid question is the unasked one."
----------------------------------------------------------------------