marti@ethz.UUCP (Robert Marti) (08/05/89)
In article <36270005@hpindwa.HP.COM>, donovan@hpindwa.HP.COM (Donovan Hsieh) writes in repsonse to some earlier message of mine: > My point is, it is difficult to formulate all possible MVDs for a given > relational database scheme in the real world applications. Certainly, finding > some or several of the MVDs are easier. If the designer choose to use the > synthese algorithm to normalize the database scheme, incomplete set of FDs & > MVDs can lead to incorrect result. I see your point, and it's certainly valid. However -- as I am sure you know -- a designer does not have to specify ALL functional and multi- valued dependencies to achieve correct third or fourth normal form. All she needs to do is provide a superset of an FD cover (or superset of a dependency basis in the case of MVDs) from which all FDs (MVDs) can be inferred. For real world examples, finding these FDs and MVDs is essentially the same thing as finding the relationships between different entity types with their association cardinalities in an ER model. In either case, you get an incorrect model of the world if the designer overlooks some relationships between attributes (in the relational model) or entity types (in the ER model). -- Robert Marti Phone: +41 1 256 52 36 Institut fur Informationssysteme ETH-Zentrum CSNET/ARPA: marti%inf.ethz.ch@relay.cs.net CH-8092 Zurich, Switzerland UUCP: ...uunet!mcvax!ethz!marti
donovan@hpindwa.HP.COM (Donovan Hsieh) (08/08/89)
In the note, Robert Marti <marti@ethz.UUCP> at ETH Zuerich wrote : > I see your point, and it's certainly valid. However -- as I am sure you > know -- a designer does not have to specify ALL functional and multi- > valued dependencies to achieve correct third or fourth normal form. > All she needs to do is provide a superset of an FD cover (or superset of > a dependency basis in the case of MVDs) from which all FDs (MVDs) can be > inferred. For real world examples, finding these FDs and MVDs is > essentially the same thing as finding the relationships between different > entity types with their association cardinalities in an ER model. > In either case, you get an incorrect model of the world if the designer > overlooks some relationships between attributes (in the relational model) > or entity types (in the ER model). In theory, the suggested way to design a fully normalized relational database scheme is as follows : 1. Formulate all possible set of FDs & MVDs from the real world enterprise 2. Compute the "optimal covers" from the set of FDs & MVDs to derive the optimal set of FDs & MVDs. Note that a minimal set is not necessarily optimal. 3. Apply whatever algorithms you may choose to normalize the set of optimal FDs & MVDs to the highest possible normal form. But in reality, the difficulties are : 1. You'll probably never be able to prove that you have obtained all possible FDs & MVDs for the real world enterprise except for those toy databases. Not to mention what is the superset of the the FDs & MVDs. It is different to say that "I have found the complete set of FDs & MVDs" and "I can produce the superset of FDS & MVDs for a GIVEN set of FDs & MVDs from which all original dependencies can be inferred". 2. At least for the current time, there is probably no polinomial time algorithm for finding an "optimal cover" for a set of FDs & MVDs. This is a NP-complete problem. 3. Even if you have successfully completed the above two steps, the net result of a mechanical normalization process does not always produce the most efficient schema for real world queries. This is due to that all normalization algorithms are purely "syntactical" instead of "semantic". They do not have the knowledge of what are the real world database requirements (from the perspective of query execution efficiency or referential integrity). This is why "Semantic" and "Object-Oriented" data models have become popular. Donovan Hsieh Hewlett-Packard Business Network Division
marti@ethz.UUCP (Robert Marti) (08/10/89)
In article <36270009@hpindwa.HP.COM>, donovan@hpindwa.HP.COM (Donovan Hsieh) writes as contribution to our ongoing (never-ending :-) discussion: > In theory, the suggested way to design a fully normalized relational database > scheme is as follows: [ ... ] > But in reality, the difficulties are : > 1. You'll probably never be able to prove that you have obtained > all possible FDs & MVDs for the real world enterprise except for > those toy databases. [ ... ] I agree. Absolutely. I never claimed anything else in my previous postings. The point I was trying to make was: If a designer is not aware of certain relationships (dependencies) between different data items, or if the designer is mistaken about the kind of relationship between data items (i.e one-to-many instead of many-to-many, or FD instead of MVD), you will end up with incorrect models of the world no matter what design methodology you use. > 3. [ ... ] the net result of a mechanical normalization process does > not always produce the most efficient schema for real world queries. Again I agree. I might even go further and claim that it hardly ever produces the most efficient schema. > This is due to that all normalization algorithms are purely > "syntactical" instead of "semantic". They do not have the knowledge > of what are the real world database requirements (from the > perspective of query execution efficiency or referential integrity). Well, in the end, all information in the computer is purely syntactic, even if you use a so-called semantic model. In order to capture what you call "real world database requirements", you'd probably want a design tool / algorithm which also takes into consideration - arbitrary integrity constraints (including referential integrity) - expected data volumes (if known) - "canned" update transactions and queries and their frequency (if known) (The paper by Finkelstein et al. in ACM ToDS, Vol.13, No.1, March 88 presents a design tool which represents a step in this direction.) -- Robert Marti Phone: +41 1 256 52 36 Institut fur Informationssysteme ETH-Zentrum CSNET/ARPA: marti%inf.ethz.ch@relay.cs.net CH-8092 Zurich, Switzerland UUCP: ...uunet!mcvax!ethz!marti