donovan@julius.csl.sri.com (Donovan Hsieh) (09/22/90)
Recently, the paper "Third-Generation Database System Manifesto" was published in the 1990 ACM SIGMOD Conference Proceedings. Its authors include some of the most influential researchers in the relational database community. The paper recommends a set of characteristics for the next generation of database systems. Often, this becomes a discussion of whether a pure object-oriented model or an extended relational model is the better approach. The paper was also written partly in response to an earlier paper, "The Object-Oriented Database System Manifesto," which was authored by some of the leading researchers in the OODB community.

As a knowledgeable and impartial observer of this debate, I found that the arguments and criticisms in the "Third-Generation Database System Manifesto", which I will call "the manifesto" hereafter, contain both JUSTIFIED and MISLEADING opinions. Through this news file, I would like to comment objectively on the manifesto. I would also like to see opinions from others on this issue. The following comments are my own opinions, and are not those of my current employer or other organizations.

The manifesto uses the term "third-generation database systems" to distinguish its approach from the one taken by the OODB community. It states that next-generation databases should evolve and be extended from current relational systems, rather than be a new approach built from the ground up like OODBs. The manifesto lists four groups of propositions that describe the fundamental requirements for next-generation DBMSs. I shall comment on those propositions that I feel require further clarification. Less critical or generally agreeable propositions, such as "Inheritance is a good idea" (proposition 1.2), are not discussed in the following comments.

Proposition 1.1 of the manifesto says that "A third generation DBMS must have a rich type system."
I agree that entirely new database systems such as OODBs are not needed to support abstract data types (ADTs). But I doubt that ALL types can be added to or extended from the current relational systems. Stretching existing relational systems beyond their inherent limits would most likely result in an inefficient implementation. I feel that a pure OODB approach is more suitable and will be better able to provide full ADT features that are compatible with existing object-oriented programming languages, such as C++.

Proposition 1.3 of the manifesto says that "Functions, including database procedures and methods, and encapsulation are a good idea." However, it makes the criticism that some OODBs require users to use only functions to access data elements (attributes) of a collection (object instance). In fact, there are some OODB systems that allow an object class to specify public and private attributes, where a public attribute can be directly accessed by database query languages, and a private attribute can only be accessed through pre-defined methods to protect object integrity.

Proposition 1.4 of the manifesto says that "Unique identifiers (UIDs) for records should be assigned by the DBMS only if a user-defined primary key is not available". It also argues that a human-readable, immutable primary key in relational systems is superior to the UID or OID used by OODBs. A UID in an OODB has a different meaning and purpose than a primary key in a relational system. A UID guarantees that no two object instances will have the same ID during the lifetime of the system. It also facilitates the internal referencing and representation of objects. Without a UID, the unique identification of an object must be properly defined and enforced in the schema. For example, although a social security number suffices to uniquely identify a person, it still must be defined as a unique primary key in the relational schema definition.
It is also possible for a person to die, and then have his/her SS# mistakenly entered or reused without this error being caught by a relational system. Also, it is very costly when a primary key is CHANGED in a relational database, such as when someone's SS# is initially entered incorrectly. It must then be changed everywhere it was used as a foreign key. If a person is referenced by UID, the SS# only has to be changed in one place.

Proposition 1.5 of the manifesto says that "Rules (triggers, constraints) will become a major feature in future systems. They should not be associated with a specific function or collection." I agree that the use of methods in OODBs to define arbitrary constraints, rules, or triggers is an undesirable approach. A more declarative language should be used. I prefer PROLOG-like languages, which are more declarative, expressive, and powerful than SQL. However, their query optimizers are also more difficult to implement.

Proposition 2.1 of the manifesto says that "Essentially, all programmatic access to a database should be through a non-procedural, high-level access language". It argues that the navigational approach used in OODBs is undesirable and inefficient compared with the use of non-procedural query languages in relational systems. I feel that this proposition is rather misleading. The manifesto claims that a well-written and well-tuned query optimizer can almost always produce a better execution method than a human. A query optimizer could probably do a better job for repetitive and straightforward accesses. However, there are cases where human navigation is required, and a query optimizer cannot foresee all patterns and usages. An example is computing the transitive closure of a given parent object.
First of all, the standard relational algebra does not support a query like "Find all children belonging to a given parent" in a single query expression (although some extended relational systems allow queries to compute transitive closures). A common solution is to implement a "for loop" in the application code to compute the closure one record at a time. The result is that the "select" query must be optimized on each iteration of the loop (some smarter query optimizers will detect the looped query and store its optimized query graph and execution method in pre-compiled modules so that they can be reused). Furthermore, the query optimizer cannot take advantage of buffered records, because the next query will use the previous child value as its current parent search value, and that record is most likely not in the same buffer (or page). On the other hand, an OODB user could use procedure calls to write the same loop without going through time-consuming optimizations and could dereference child pointers recursively. If the OODB schema defines a clustering based on this reference mode, users will be able to gain even more performance with fewer disk accesses because most child objects will have been cached into the buffer during the initial access.

Another example would be a traversal of objects that involves computation, like a CAD application where some optimization of the connections between objects involves computation in the application language. I would argue that (1) navigation is much more "natural" for these computations than using a mixture of queries and programming, and (2) the mixture is inefficient for the reasons suggested above.

As for the impact of schema evolution, I agree that the use of "views" in relational databases offers good insulation for applications from changes to the database schema definition. However, specifying the data elements with a declarative query language does not guarantee insulation if the primitive data element definition is changed.
Also, some OODBs support "derived" objects, which provide a service like views. (Derived objects are defined procedurally or declaratively by a set of pre-conditions and post-conditions to instantiate or modify the objects.)

In the same proposition, the manifesto questioned the performance benefit for OODBs that use low-level calls to navigate individual objects. It also criticized CAD programmers as being closed-minded for not using query optimizers provided by database systems. I feel that both arguments are rather misleading. First, there are various techniques proposed by many OODB researchers to address and resolve the performance issue. For example, one OODB vendor has reported tremendous performance gains from a direct memory-mapping technique over other indexed or hash-based dereferencing techniques, such as those that were mentioned in the manifesto. Numerous published cases have also reported poor performance when using off-the-shelf relational databases to support object navigation, such as in the closure computation example described earlier.

Proposition 2.2 of the manifesto says that "There should be at least two ways to specify collections, one using enumeration of members and one using the query language to specify membership." I agree that defining objects "intensionally" with declarative expressions does offer more powerful abstractions than defining objects "extensionally".

Proposition 2.4 of the manifesto says that "Performance indicators have almost nothing to do with data models and must not appear in them." I disagree with this claim. Although performance is heavily influenced by individual implementation techniques, there exist inherent limitations on the performance achievable for the underlying data models. For example, the relational model explicitly disallows the storing of ordered tuples. This makes it very inefficient to represent lists, and users are forced to sort on a sequence number implemented by their applications.
It is always possible to extend existing database models with new features and constraints through arbitrary implementations. But the end result would be undesirable if the extension exceeds the limitations of the model, or lacks the support of a formal mathematical representation.

Propositions 3.1 and 3.2 of the manifesto say that "Third generation DBMSs must be accessible from multiple HLLs" and "Persistent X for a variety of Xs is a good idea. They will all be supported on top of a single DBMS by compiler extensions and a (more or less) complex run time system". In theory, I agree that next-generation databases (either third-generation or OODB) should be accessible from multiple HLLs (High Level Languages), and the DBMS should provide multiple run-time type translations between declarative query languages and HLLs. However, it is impractical for DBMSs to support all HLLs. For example, many MIS programmers are interested in adopting new object-oriented technologies (that is, using object-oriented design methodology and object-oriented programming languages) to implement new MIS applications, if they are given the opportunity to do so, rather than revamping and retrofitting existing COBOL code. If they are given the opportunity to choose a DBMS to match their new object-oriented applications, most likely they will use a fully supported OODB product because it provides a better match.

Many CAD/CAM and CASE software vendors have long since abandoned their use of (extended) relational DBMSs because of poor performance and data modeling capabilities. Another important reason is that they have also adopted object-oriented programming languages as their language of choice. It is natural for these software vendors to use OODBs rather than stretching existing (extended) relational systems to match their needs. The advantage of OODBs and object-oriented programming languages is that they carry very little of the backward-compatibility burden that third-generation DBMSs do.
Providing a gateway to access old DBMSs from OODBs is helpful, but it is not critical. A similar situation occurred when the computer industry adopted the cheaper and faster RISC processors over the old CISC. History has proven that the migration was the correct choice.

In relational databases, the type "impedance mismatch" between SQL and HLLs has long been criticized as being inefficient and unnatural. Even if the third-generation DBMSs provide brilliant ways to bridge the gap between all HLLs and SQL, new object-oriented users will always opt for the OODB because it is a natural fit.

As for OODBs, although the lack of declarative query languages and a formal object algebra/calculus currently makes them less intuitive for end users, many researchers have proposed different solutions and approaches to resolve this deficiency. We must allow more time for this new technology to be refined and improved, just as it took more than a decade for relational databases to become mature and popular, and replace the old network and hierarchical databases.

In summary, I feel that there is room for both technologies to co-exist, and new database models will always be proposed to address existing deficiencies. We will probably see some fusion of both database approaches in the near future that will benefit database users. In the long run, I foresee OODBs replacing (extended) relational DBMSs in selected market segments. I would also predict that the next wave after OODBs will be fully-integrated, intelligent database (or knowledge-based) systems that will combine both AI and database technologies.

Donovan Hsieh
Computer Science Lab
SRI International
Menlo Park, CA
davidm@uunet.UU.NET (David S. Masterson) (09/24/90)
In article <21178@hercules.csl.sri.com> donovan@julius.csl.sri.com (Donovan Hsieh) writes:

Recently, the paper "Third-Generation Database System Manifesto" was published in the 1990 ACM SIGMOD Conference Proceedings. [...deleted...] The paper was also partly written in response to an earlier published paper, "The Object-Oriented Database System Manifesto,"

Both papers were very interesting. Too bad there's such a schism in the database community.

Proposition 1.1 of the manifesto says that "A third generation DBMS must have a rich type systems." I agree that entirely new database systems such as OODBs are not needed to support abstract data types (ADT). But I doubt that ALL types can be added to or extended from the current relational systems. Stretching existing relational systems beyond their inherent limits would most likely cause an inefficient implementation. I feel that a pure OODB approach is more suitable and will be better able to provide full ADT features that are compatible with the existing object-oriented programming languages, such as C++.

Examples please. There's too much "feeling" in this paragraph. By the way, one criticism I have of the OO paper is the lack of credible reasoning on why a system implementing the relational model (the current relational systems are not really *completely* relational) is not a basic OODB that can be built upon to full object-oriented status.

Proposition 1.3 of the manifesto says that "Functions, including database Procedures and methods, and encapsulation are a good idea." However, it makes the criticism that some OODBs require users to use only functions to access data elements (attributes) of a collection (object instance). In fact, there are some OODB systems that allow an object class to specify public and private attributes, where a public attribute can be directly accessed by database query languages, and a private attribute can only be accessed through pre-defined methods to protect object integrity.
You contradict yourself. "Some OODBs require" means just that -- some. Also, what is the definition of "object integrity" in this case, and how do these methods differ from "constraints and triggers"? (IMHO, there is no difference.)

Proposition 1.4 of the manifesto says that "Unique identifiers (UIDs) for records should be assigned by the DBMS only if a user-defined primary key is not available". It also argues that a human-readable, immutable primary key in relational systems is superior to the UID or OID used by OODBs. A UID in an OODB has different meanings and purposes than a primary key in a relational system. A UID guarantees that no two object instances will contain the same ID during the lifetime of the system. Also, it is very costly when a primary key is CHANGED in a relational database, such as when someone's SS# is initially entered incorrectly. It must then be changed everywhere it was used as a foreign key. If a person is referenced by UID, the SS# only has to be changed in one place.

Actually, the immutable, system-assigned primary key was a central part of the Codd/Date RM/T model (Date's Introduction to Database Systems V2), so the UID concept doesn't disagree with the relational model. This is one point where, IMHO, the manifesto disagrees with the model. IMO, though, the question of foreign key updating could be handled by the proper use of constraints and triggers.

Proposition 2.1 of the manifesto says that "Essentially, all programmatic access to a database should be through a non-procedural, high-level access language". It argues that the navigational approach used in OODBs is undesirable and inefficient comparing with the use of non-procedural query languages in relational systems. I feel that this proposition is rather misleading. The manifesto claims that a well-written and well-tuned query optimizer can almost always produce a better execution method than a human.
A query optimizer could probably do a better job for repetitive and straightforward accesses. However, there are cases where human navigation is required and a query optimizer cannot foresee all patterns and usages.

Perhaps true, but the manifesto was addressing the tradeoffs of the two approaches in that the majority of the cases in an information management system will tend toward the ad-hoc, so a query optimizer will generally do better. It's the same argument that was made for 3GLs over Assembler.

An example is computing the transitive closure of a given parent object. First of all, the standard relational algebra does not support a query like "Find all children belonging to a given parent" in a single query expression (although some extended relational systems allow queries to compute transitive closures).

Actually, Codd's new book addresses this with the recursive outer-join, so the new "standard" for the relational model addresses this. Now if we could only get the relational vendors to realize this.

A common solution is to implement a "for loop" in the application code to compute the closure one record at a time. The result is that the "select" query must be optimized for each loop (some smarter query optimizers will detect the looped query and stored its optimized query graph and execution method in pre-compiled modules so that they can be reused). Furthermore, the query optimizer cannot take advantage of buffered records because the next query will use the previous child value as its current parent search value that is most likely not in the same buffer (or page).

On the other hand, another common approach is to do an n-way outer-join from parent to child to grandchild, where n is the known number of levels in the hierarchy. This is the best that the current generation of relational systems can do, and it does allow the query optimizer to do some optimizations. However, if the hierarchy is sparsely populated, this isn't that good.
On the other hand, an OODB user could use procedure calls to write the same loop without going through time-consuming optimizations and could dereference child pointers recursively. If the OODB schema defines a clustering based on this reference mode, users will be able to gain even more performance with fewer disk accesses because most child objects will have been cached into the buffer during the initial access.

A relational system can take advantage of the same clustering idea. "Time consuming optimizations", IMHO, are in the eye of the beholder. An OODB wouldn't have the full understanding of relationships that can occur in a database and, so, couldn't take advantage of a recursive join operation (unless the OODB was a relational DB).

Another example would be a traversal of objects that involves computation, like a CAD application where some optimization of the connections between objects involves computation in the application language. I would argue that (1) navigation is much more "natural" for these computations than using a mixture of queries and programming, and (2) the mixture is inefficient for the reasons suggested above.

But the Manifesto's Prop 1.1 states the need for a "rich type system". This should include all the computational capability needed to support the new type. So, the "true" third generation system should merge more seamlessly the programming idea with the query idea.

As for the impact of schema evolution, I agree that the use of "views" in relational databases offers good insulation for applications from changes to the database schema definition. However, specifying the data elements with a declarative query language does not guarantee insulation if the primitive data element definition is changed. Also, some OODBs support "derived" objects, which provide a service like views. (Derived objects are defined procedurally or declaratively by a set of pre-conditions and post-conditions to instantiate or modify the objects.)
I think views are better thought of as a composition operation than a derivation operation. The current generation of relational database systems, IMHO, does not really support enough of the view concept to truly insulate applications from the database design (lack of multi-table updates).

In the same proposition, the manifesto questioned the performance benefit for OODBs that use low-level calls to navigate individual objects. It also criticized CAD programmers as being close-minded for not using query optimizers provided by databases systems. I feel that both arguments are rather misleading. First, there are various techniques proposed by many OODB researchers to address and resolve the performance issue. For example, a direct memory map technique currently used by one OODB vendor has reported tremendous performance gains over other indexed or hash-based dereferencing techniques, such as those that were mentioned in the manifesto. Numerous published cases have also reported poor performance when using off-the-shelf relational databases to support object navigation, such as in the closure computation example described earlier.

Do you think that relational systems could not improve themselves with the above solutions?

Proposition 2.4 of the manifesto says that "Performance indicators have almost nothing to do with data models and must not appear in them." I disagree with this claim. Although performance is heavily influenced by individual implementation techniques, there exists inherent limitations on the performance achievable for the underlying data models. For example, the relational model explicitly disallows the storing of ordered tuples. This makes it very inefficient to represent lists, and users are forced to sort on a sequence number implemented by their applications.

What?!? True, relations are inherently non-ordered things in the model, but that does not eliminate indexing in an implementation of the relational model.
Therefore, the optimizer can determine that an "order by" has no work to do because the tuples are already ordered by the index in the needed fashion.

It is always possible to extend existing database models with new features and constraints through arbitrary implementations. But the end result would be undesirable if the extension exceeds the limitations of the model, or lacks the support of a formal mathematical representation.

As has been seen, current relational implementations are in no way approaching the limits of the relational model (see Codd's new book).

Proposition 3.1 and 3.2 of manifesto say that "Third generation DBMSs must be accessible from multiple HLLs" and "Persistent X for a variety of Xs is a good idea. They will all be supported on top of a single DBMS by compiler extensions and a (more or less) complex run time system". In theory, I agree that next-generation databases (either third-generation or OODB) should be accessible from multiple HLLs (High Level Languages), and the DBMS should provide a multiple run-time type translations between declarative query languages and HLLs. However it is impractical for DBMSs to support all HLLs. For example, many MIS programmers are interested in adopting new object-oriented technologies (that is, to use object-oriented design methodology and object-oriented programming languages) to implement new MIS applications if they are given the opportunity to do so rather than revamping and retrofitting existing COBOL code. If they are given the opportunity to choose a DBMS to match with their new object-oriented applications, most likely they will use a fully supported OODB product because it provides a better match.

However, as has been seen, the "existing COBOL code" doesn't go away as it is usually the revenue producing code. Therefore, the adoption of new technologies (object-oriented or otherwise) usually requires a "ramping up" to full implementation with old code being adjusted to "try out" the new ideas.
In relational databases, the type "impedance mismatch" between SQL and HLLs have long been criticized as being inefficient and unnatural. Even if the third-generation DBMSs provide brilliant ways to bridge the gap between all HLLs and SQL, new object-oriented users will always opt for the OODB because they are a natural fit.

Even relational people criticize SQL as not being really relational. I don't believe the manifesto called for standardizing on SQL (but I don't have it in front of me).

As for OODBs, although the lack of declarative query languages and a formal object algebra/calculus make it less intuitive for end users to use currently, many researchers have proposed different solutions and approaches to resolve this deficiency. We must allow more time for this new technology to be refined and improved, just as it took more than a decade for relational databases to become mature and popular, and replace the old network and hierarchical databases.

I agree as long as the ultimate OODB programmer (not the OODBMS programmer) isn't really just a network or hierarchical database programmer with a new name.

In summary, I feel that there is room for both technologies to co-exist, and new database models will always be proposed to address existing deficiencies. We will probably see some fusion of both database approaches in the near future that will benefit database users. In the long run, I foresee OODBs replacing (extended) relational DBMSs in selected market segments. I would also predict that the next wave after OODBs will be fully-integrated, intelligent database (or knowledge-based) systems that will combine both AI and database technologies.

I also see the database technologies merging in the future, but I also see more complete implementations of the current systems solving many of the problems that have been seen. I wouldn't abandon the current technology quite yet.
;-)
--
====================================================================
David Masterson
Consilium, Inc.
uunet!cimshop!davidm
Mtn. View, CA 94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
dlw@odi.com (Dan Weinreb) (09/24/90)
In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
Examples please. There's too much "feeling" in this paragraph. By the way,
one criticism I have of the OO paper is the lack of credible reasoning on why
a system implementing the relational model (the current relational systems are
not really *completely* relational) is not a basic OODB that can be built
upon to full object-oriented status.
It seems to me that the OO paper is simply an attempt to define the
term OODB. It does not comment, one way or another, about whether it
is possible to build an OODB on top of some other kind of database, or
to extend a non-OODB into an OODB by adding things. I'm sure
the authors all have opinions about that, but the "manifesto" only
addresses the question of what an OODB is.
Actually, the immutable, system-assigned primary key was a central part of the
Codd/Date RM/T model (Date's Introduction to Database Systems V2), so the UID
concept doesn't disagree with the relational model.
Well, the RM/T model is not the same thing as the Relational model.
The people are the same (so to speak) but RM/T is a distinct model,
with new elements (such as the system-assigned primary key) designed
to remedy some problems of the relational model. The OODBs also
provide the equivalent of a system-assigned primary key.
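
A rough sketch of the trade-off argued over in this part of the thread, assuming a system-assigned key that sits alongside a user-visible one. The table and column names (`person`, `employment`, `uid`, `ssn`) and the helper functions are hypothetical, and SQLite merely stands in for any DBMS; nothing here is taken from RM/T or from a particular OODB.

```python
import sqlite3


def build_demo():
    """Create a tiny schema where other tables reference a surrogate key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            uid  INTEGER PRIMARY KEY,   -- system-assigned, never shown to users
            ssn  TEXT UNIQUE NOT NULL,  -- user-visible key, uniqueness still enforced
            name TEXT NOT NULL
        );
        CREATE TABLE employment (
            person_uid INTEGER NOT NULL REFERENCES person(uid),
            employer   TEXT NOT NULL
        );
        """
    )
    # The SS# is deliberately mistyped here (last digit).
    cur = conn.execute(
        "INSERT INTO person (ssn, name) VALUES (?, ?)", ("123-45-6780", "Pat")
    )
    uid = cur.lastrowid
    conn.executemany(
        "INSERT INTO employment (person_uid, employer) VALUES (?, ?)",
        [(uid, "Acme"), (uid, "Initech")],
    )
    return conn, uid


def correct_ssn(conn, uid, new_ssn):
    """Fix the SS# in one place; returns the number of rows touched."""
    cur = conn.execute("UPDATE person SET ssn = ? WHERE uid = ?", (new_ssn, uid))
    return cur.rowcount


def employers_by_ssn(conn, ssn):
    """Look up employers through the user-visible key."""
    rows = conn.execute(
        "SELECT e.employer FROM employment AS e "
        "JOIN person AS p ON p.uid = e.person_uid "
        "WHERE p.ssn = ? ORDER BY e.employer",
        (ssn,),
    )
    return [r[0] for r in rows]
```

Correcting the mistyped SS# updates a single row in `person`; the `employment` rows refer to `uid` and are left alone. Had `ssn` itself been the foreign key, every referencing row would need the same change (or a cascading constraint to do it).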
Proposition 2.1 of the manifesto says that "Essentially, all programmatic
access to a database should be through a non-procedural, high-level access
language". It argues that the navigational approach used in OODBs is
undesirable and inefficient comparing with the use of non-procedural query
languages in relational systems.
In my own opinion, this is one of those issues that is clouded by
the code words used by adherents of dogma. The extended-relational
people criticize the OODB people because an OODB can say things like
"get me the father of X", which they denounce because it is imperative
rather than declarative. In a relational database, you'd say
something like "for all people in the database, find all people Z such
that Z's father-id attribute has the value X, and print out (?) the
value of the primary-key-id field for all of them" in order to do the
same thing. The latter (my English translation of SQL) is considered
ideologically pure because it is considered to be declarative rather
than imperative. It is also argued that it will run faster because of
the omniscient wisdom of a general-purpose query optimizer. As you can
guess from my incendiary tone (forgive me), I remain unconvinced that
this is true. On the other hand, I certainly do feel that for some
database operations, it is highly desirable to use a high-level,
"non-procedural" (in the sense in which they mean it) query construct
and that it is appropriate to use a query optimizer, and that the
optimizer can do the best job. It depends on the operation. But
applying the "declarative is always virtuous, procedural is always
evil" principle, particularly without clear definitions of those
terms, leads to neither enlightenment nor efficient execution.
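
To make the two styles concrete, here is a minimal sketch of the same "father of X" lookup done both ways. The `Person` class, the `person` table, and the function names are all hypothetical; the in-memory object reference only imitates what an OODB would do with persistent objects, and SQLite stands in for a relational system.

```python
import sqlite3


class Person:
    """In-memory object holding a direct reference to its father."""

    def __init__(self, name, father=None):
        self.name = name
        self.father = father


def father_navigational(person):
    # Navigational style: follow the stored reference directly.
    return person.father.name if person.father else None


def make_db():
    """Relational version of the same data: the father is a foreign key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT UNIQUE NOT NULL,"
        " father_id INTEGER REFERENCES person(id))"
    )
    conn.execute("INSERT INTO person VALUES (1, 'Tom', NULL)")
    conn.execute("INSERT INTO person VALUES (2, 'Sam', 1)")
    return conn


def father_declarative(conn, name):
    # Declarative style: describe the result and let the DBMS pick the plan.
    row = conn.execute(
        "SELECT f.name FROM person AS c "
        "JOIN person AS f ON f.id = c.father_id "
        "WHERE c.name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None
```

Both functions answer the same question. Which one is faster depends on the system underneath and on the operation, which is the point being argued here.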
Some people seem to think that anybody who is an advocate of OODBs is
utterly opposed to nonprocedural query languages. I don't know why.
My feeling is that the application writer should be provided with a
toolkit that provides a range of useful tools, so that he or she can
use the appropriate tool for the particular job at hand. Any serious
OODB should have a query language and a query optimizer.
A relational system can take advantage of the same clustering idea. "Time
consuming optimizations", IMHO, are in the eye of the beholder. An OODB
wouldn't have the full understanding of relationships that can occur in a
database and, so, couldn't take advantage of a recursive join operation
(unless the OODB was a relational DB).
And why wouldn't an OODB have a full understanding of relationships?
Your paragraph above must be read in the light of some understanding
of "what is an OODB" and "what is a relational DB". You say that it
can't have understanding unless it is a relational DB. Do you
seriously mean that it has to follow Codd's Rules or else such an
optimization is impossible? I suspect you mean something less
stringent. Is what you mean incompatible with the definition of
OODB, as given in the OODB manifesto?
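
For reference, the closure computation at issue can be sketched in both forms: a loop in application code that issues one query per visited node, and a single recursive query. This is only an illustration under stated assumptions: the `part` table and function names are hypothetical, the data is an acyclic hierarchy, and SQLite's `WITH RECURSIVE` is used as a stand-in for the "recursive join" mentioned above, not as a claim about what any 1990 product offered.

```python
import sqlite3


def make_db():
    """A small parent/child hierarchy; 'x' is unrelated to 'a'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part (id TEXT PRIMARY KEY, parent TEXT)")
    conn.executemany(
        "INSERT INTO part VALUES (?, ?)",
        [("a", None), ("b", "a"), ("c", "a"), ("d", "b"), ("e", "d"), ("x", None)],
    )
    return conn


def descendants_by_loop(conn, root):
    """Application-driven closure: one SELECT per visited node.

    Returns the sorted descendants and the number of queries issued.
    """
    found, frontier, queries = [], [root], 0
    while frontier:
        parent = frontier.pop()
        rows = conn.execute(
            "SELECT id FROM part WHERE parent = ?", (parent,)
        ).fetchall()
        queries += 1
        for (child,) in rows:
            found.append(child)
            frontier.append(child)
    return sorted(found), queries


def descendants_by_recursive_query(conn, root):
    """Same closure expressed as one recursive query."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM part WHERE parent = ?
            UNION ALL
            SELECT part.id FROM part JOIN sub ON part.parent = sub.id
        )
        SELECT id FROM sub ORDER BY id
        """,
        (root,),
    ).fetchall()
    return [r[0] for r in rows]
```

The loop version issues one query for the root and one for every descendant, while the recursive version hands the whole traversal to the DBMS in a single statement.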
Do you think that relational systems could not improve themselves with the
above solutions?
This statement is another one that always crops up in these
discussions. For about 2/3 of the advantages that the OODB people
claim, the relational ("post-relational"?) people answer by saying
"well, we could do that too". What this really points out is two
things. First, many of the differences between the two camps don't
actually hinge on the question of data model (OO vs. relational), but
really have to do with totally unrelated questions of implementation
technique, programming language interfaces, and so on. Second, for
application programmers who are trying to obtain a DBMS in some
particular year to accomplish some particular job, the key question is
not what is theoretically possible but what can be acquired promptly.
>Proposition 2.4 of the manifesto says that "Performance indicators have
>almost nothing to do with data models and must not appear in them." I
>disagree with this claim. Although performance is heavily influenced by
>individual implementation techniques, there exist inherent limitations on
>the performance achievable for the underlying data models.
As I said above, I don't agree with this. The data model merely
specifies the external interface to the DBMS. There are many, many
clever tricks that can be used to implement such an interface to make
it run more quickly. The only things that are inherent in the data
model are whether it is possible to say certain things, and how clear
and convenient it is to say them. It would be impossible to somehow
prove that nobody could come up with a clever implementation, one that
might work in a totally non-obvious way, to make some particular
database operation fast. When you look at performance, you really
have to examine the speeds of particular DBMSs on particular tasks.
>As for OODBs, although the lack of declarative query languages and a formal
>object algebra/calculus make it less intuitive for end users to use
>currently,
Oh, come now. Several existing OODBs have declarative query languages.
As for the great benefits of formal algebra/calculus, please read
Codd's essay criticizing SQL. SQL is based neither on the relational
algebra nor the relational calculus but on a sort of hybrid of both;
it's inconsistent and counterintuitive in many ways. SQL is not
prevailing in the relational DBMS industry right now because of its
clarity or its theoretical purity, but because of political factors
involving IBM endorsement and the need for standardization of
interfaces. OODB query languages are not less intuitive than SQL.
These are purely my own late-night musings and are not necessarily
the official position of my employer.
Dan Weinreb dlw@odi.com Object Design, Inc.
davidm@uunet.UU.NET (David S. Masterson) (09/25/90)
In article <1990Sep24.071412.3561@odi.com> dlw@odi.com (Dan Weinreb) writes:
>In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> (I) write:
>>Actually, the immutable, system-assigned primary key was a central
>>part of the Codd/Date RM/T model (Date's Introduction to Database
>>Systems V2), so the UID concept doesn't disagree with the relational
>>model.
>
>Well, the RM/T model is not the same thing as the Relational model.
>The people are the same (so to speak) but RM/T is a distinct model,
>with new elements (such as the system-assigned primary key) designed
>to remedy some problems of the relational model. The OODB's also
>provide the equivalent of a system-assigned primary key.

This is an interesting point. Is the contention, therefore, that the
current relational implementations are representative of the relational
model? Codd seems to contend that no one has yet implemented even RMV1
(let alone RM/T), but all these versions go into the definition of the
relational model. Your paragraph seems to suggest that the current
implementations are representative of the relational model and Codd's 12
rules (and 300+ newer rules) are something entirely new.

>>A relational system can take advantage of the same clustering idea.
>>"Time consuming optimizations", IMHO, are in the eye of the beholder.
>>An OODB wouldn't have the full understanding of relationships that can
>>occur in a database and, so, couldn't take advantage of a recursive
>>join operation (unless the OODB was a relational DB).
>
>And why wouldn't an OODB have a full understanding of relationships?
>Your paragraph above must be read in the light of some understanding
>of "what is an OODB" and "what is a relational DB". You say that it
>can't have understanding unless it is a relational DB. Do you
>seriously mean that it has to follow Codd's Rules or else such an
>optimization is impossible? I suspect you mean something less
>stringent. Is what you mean incompatible with the definition of
>OODB, as given in the OODB manifesto?

No it isn't. Looking at it now, my paragraph was more stringent than I
wanted it to be. What I meant was that, when you get down to it,
considerations such as these can be accomplished by both the OODB and the
relational DB, so there really may not be that much of a difference
between the two.
--
====================================================================
David Masterson Consilium, Inc.
uunet!cimshop!davidm Mtn. View, CA 94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
dlw@odi.com (Dan Weinreb) (09/26/90)
In article <CIMSHOP!DAVIDM.90Sep24104403@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>In article <1990Sep24.071412.3561@odi.com> dlw@odi.com (Dan Weinreb) writes:
>>Well, the RM/T model is not the same thing as the Relational model.
>>The people are the same (so to speak) but RM/T is a distinct model,
>>with new elements (such as the system-assigned primary key) designed
>>to remedy some problems of the relational model. The OODB's also
>>provide the equivalent of a system-assigned primary key.
>
>This is an interesting point. Is the contention, therefore, that the
>current relational implementations are representative of the relational
>model? Codd seems to contend that no one has yet implemented even RMV1
>(let alone RM/T), but all these versions go into the definition of the
>relational model. Your paragraph seems to suggest that the current
>implementations are representative of the relational model and Codd's 12
>rules (and 300+ newer rules) are something entirely new.

Sorry, I don't follow this at all. I must be misunderstanding you. I
don't see why you think I am suggesting what you said. What I am saying
is that RM/T is quite distinct from what is usually called "relational".
Current "relational database system" products do not exactly implement
the Relational Model, but they are a lot closer to the Relational Model
than to RM/T. From my reading of Codd, the "12 rules" are an attempt to
be more precise about the meaning of the Relational Model, whereas RM/T
is a new proposed data model that is based on the Relational Model but
has several crucial changes, considered to be improvements. Anyway, this
was just a minor point; it didn't have much to do with the original issue
under discussion.

>No it isn't. Looking at it now, my paragraph was more stringent than I
>wanted it to be. What I meant was that, when you get down to it,
>considerations such as these can be accomplished by both the OODB and the
>relational DB, so there really may not be that much of a difference
>between the two.

I agree that the differences between the two are often exaggerated.
When you look closely, they aren't as different as they seem at first.
aaron@grad2.cis.upenn.edu (Aaron Watters) (09/27/90)
In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>
>Both papers were very interesting. Too bad there's such a schism in the
>database community.
>

I wouldn't characterize it as a schism, I'd call it a healthy debate.
It's only `too bad' for vendors who want to claim their products are
`next generation' without any objections.

Why is it that criticism and disagreement is so often characterized
as `bad' in the CS community? The most minor criticism seems to
reduce some people to fuming, clawing, scorpions. EG, see the letters
to the CACM in response to `Program Verification: the Very Idea.'
-aaron
dlw@odi.com (Dan Weinreb) (09/29/90)
In article <30205@netnews.upenn.edu> aaron@grad2.cis.upenn.edu (Aaron Watters) writes:
>I wouldn't characterize it as a schism, I'd call it a healthy debate.
>It's only `too bad' for vendors who want to claim their products are
>`next generation' without any objections.
>
>Why is it that criticism and disagreement is so often characterized
>as `bad' in the CS community? The most minor criticism seems to
>reduce some people to fuming, clawing, scorpions. EG, see the letters
>to the CACM in response to `Program Verification: the Very Idea.'
>-aaron
From the point of view of intellectual progress, free debate, and the
open marketplace of ideas, a healthy debate is not only desirable but
essential. From this point of view, if there were grounds for
complaint, they'd be about the quality and persuasiveness of the
arguments, and so on.
In the real world, though, papers, especially those called
"manifestos" and "countermanifestos", and especially those co-authored
by academics of many different affiliations, end up being more than
just free debate, whether intentionally or not. The degree to which
such papers are believed and accepted can affect who gets research
funds, which theses are considered good enough to deserve a degree,
who gets tenure, and so on. I think it's possible that some of the
fuming may result from this aspect of the papers. If this is what
Masterson thinks is too bad, I agree, although I don't see what to
do about it.
davidm@uunet.UU.NET (David S. Masterson) (09/30/90)
In article <30205@netnews.upenn.edu> aaron@grad2.cis.upenn.edu (Aaron Watters) writes:
>In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> (I) write:
>>
>>Both papers were very interesting. Too bad there's such a schism in the
>>database community.
>>
>I wouldn't characterize it as a schism, I'd call it a healthy debate.

Perhaps, but, thus far, I've only seen movement on one side of the
debate.

>It's only `too bad' for vendors who want to claim their products are
>`next generation' without any objections.

Or even complete with respect to the current generation.

>Why is it that criticism and disagreement is so often characterized
>as `bad' in the CS community? The most minor criticism seems to
>reduce some people to fuming, clawing, scorpions. EG, see the letters
>to the CACM in response to `Program Verification: the Very Idea.'

Why think it only occurs within the CS community? Seems to be quite a
schism in the government debating the budget deficits and how to solve
them. Of course, discussion of that doesn't belong in comp.databases.
--
====================================================================
David Masterson Consilium, Inc.
uunet!cimshop!davidm Mtn. View, CA 94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
roelw@cs.vu.nl (Wieringa j Roel) (10/01/90)
In article <1990Sep28.173803.15043@odi.com> dlw@odi.com writes:
>
>In the real world, though, papers, especially those called
>"manifestos" and "countermanifestos", and especially those co-authored
>by academics of many different affiliations, end up being more than
>just free debate, whether intentionally or not. The degree to which
>such papers are believed and accepted can affect who gets research
>funds, which theses are considered good enough to deserve a degree,
>who gets tenure, and so on. I think it's possible that some of the
>fuming may result from this aspect of the papers. If this is what
>Masterson thinks is too bad, I agree, although I don't see what to
>do about it.

That is exactly the problem with these manifestos.

Recently, John Maddox, chief editor of the journal Nature, visited The
Netherlands to give a speech at a symposium held at the 40th anniversary
of the major Dutch funding organization for pure scientific research, the
NWO. He warned that the extremely high pressure to publish or perish can
cause scientists to publish results too early and to be secretive about
the results they have obtained so far. In addition, in sciences like
biochemistry, commercially lucrative contracts sit just around the corner
of the research laboratory, which does not foster open exchange of ideas
either. The result is that people give press conferences of their
discoveries before they are published in and tested by the scientific
community, and secure patents before publication. Large press coverage
may result in large funding. The pressures of funding and publication and
the lure of commercial results do not foster a climate of open exchange
of ideas in which people build upon the (acknowledged) results of their
predecessors. Maddox reminded us of the fact that Newton stood upon the
shoulders of giants. We do not want the rare researcher who is a genius
to discover that there are only shoulders of cripples to stand on.

Manifestos, Beach reports and reports from invitational NSF workshops are
ostensibly only about the objectively best definition of concepts and
promising research directions. In addition to this, IMHO they must also
be viewed as moves in a game of power and money: which projects get
funded, who determines which projects are worth their money, and in
general who has the clout to make his definition of what is the case and
what should be the case stick. They may not be intended as such by their
authors, but I think they cannot be seen in isolation from the political
(funding) dimension of science.

Recently, I was in a workshop in Aigen, Austria, visited by European
researchers from the database theory community. Someone suggested the
name ``Moneyfesto'' for the kind of Manifesto now being published.
Someone else pointed out that, just when Eastern Europe is abolishing the
political system whose roots go back to the Communist Manifesto, we in
the free West (and within the free West, in the heartland of freedom,
scientific research), start publishing Manifestos about what Authorities
deem to be the Truth.

On a lighter note, a few months ago I was at an IFIP conference in
Windermere, U.K., where Stonebraker presented the 3rd Generation DB
Manifesto for the European research community. In my talk I proposed the
Manifesto of Windermere: All Manifestos are Mere Wind.

Hope to have offended no one,
Roel
aaron@grad2.cis.upenn.edu (Aaron Watters) (10/03/90)
In article <7798@star.cs.vu.nl> roelw@cs.vu.nl (Wieringa j Roel) writes:
>Manifestos, Beach reports and reports from invitational NSF workshops are
>ostensibly only about the objectively best definition of concepts and
>promising research directions. In addition to this, IMHO they must also
>be viewed as moves in a game of power and money: which projects get
>funded, who determines which projects are worth their money, and in
>general who has the clout to make his definition of what is the case and
>what should be the case stick.

What is wrong with that? If the people who disagree can't
mount a convincing counterargument, perhaps they shouldn't be
funded. Or are you arguing that
administrators of funding sources are stupid
and easily misled? This last seems to be a common assumption
among academics, perhaps because they are occasionally denied
funding.

There is a pervasive tendency within academia to view any attempt
at value-judgment as evil -- this, I think, is the source of
people's problems with manifestos. The idea is that `ignorant
outsiders' (to be read, `anyone who doesn't agree with me') should
not influence the `direction of research.' This is nonsense.

I think a large
number of researchers would be reinvigorated if they were wrenched
off their tired toilings and forced to consider some hairy industrial
problem. There may be one or two geniuses who would not benefit
from such an experience, but I'm willing to risk it and hypothesize
that the overall effect would be for the better. -aaron.
roelw@cs.vu.nl (Wieringa j Roel) (10/03/90)
In article <30441@netnews.upenn.edu> aaron@grad2.cis.upenn.edu.UUCP (Aaron Watters) writes:
>In article <7798@star.cs.vu.nl> roelw@cs.vu.nl (Wieringa j Roel) writes:
>
>>Manifestos, Beach reports and reports from invitational NSF workshops are
>>ostensibly only about the objectively best definition of concepts and
>>promising research directions. In addition to this, IMHO they must also
>>be viewed as moves in a game of power and money: which projects get
>>funded, who determines which projects are worth their money, and in
>>general who has the clout to make his definition of what is the case and
>>what should be the case stick.
>
>What is wrong with that?

In itself, nothing. I am simply saying that the manifesto game is a game
of arguments as well as one of power.

>If the people who disagree can't
>mount a convincing counterargument, perhaps they shouldn't be
>funded.

Perhaps. People get convinced of things by other means than good
arguments. And the person who wins in a power game may also be the one
with the best argument, but this is not necessarily so.

>Or are you arguing that
>administrators of funding sources are stupid
>and easily misled?

No, I assume they are intelligent and that you need a lot of forceful
argument to mislead them. And those who win, may really believe that they
have the best arguments and need not aim at misleading funding
administrators at all. But then it still is the case that the person who
wins the power game need not be the person with the best arguments.

>[stuff deleted ...]
> I think a large
>number of researchers would be reinvigorated if they were wrenched
>off their tired toilings and forced to consider some hairy industrial
>problem. There may be one or two geniuses who would not benefit
>from such an experience, but I'm willing to risk it and hypothesize
>that the overall effect would be for the better. -aaron.

My point is not that confrontation with practical problems does not cause
new ideas -it generally does- or that such confrontation is not in
general healthy for scientific research -in general, it is healthy. My
point is that the pressure of funding may divert too much energy from the
research effort to the fund-raising effort, that the closeness of
commercially lucrative results may hinder the open exchange of ideas, and
that an excessive pressure to publish results may cause the publication
of immature results. None of these dangers may materialize, but only if
we remain aware of them.

Roel
aaron@grad2.cis.upenn.edu (Aaron Watters) (10/04/90)
In article <7824@star.cs.vu.nl> roelw@cs.vu.nl (Wieringa j Roel) writes:
> My point is that the
>pressure of funding may divert too much energy from the research effort
>to the fund-raising effort, that the closeness of commercially lucrative
>results may hinder the open exchange of ideas, and that an excessive pressure
>to publish results may cause the publication of immature results. None
>of these dangers may materialize, but only if we remain aware of them.

I certainly agree. What does this have to do with the manifestos?
It seems to me this is an orthogonal issue. You also argue that
the people with the most forceful arguments may be `wrong' (in some
sense). I can't deny this either. What alternative to open discussion
and the taking of bold and controversial positions do you suggest?
No discussion? -aaron