[comp.databases] Comment on the "Third-Generation Database System Manifesto"

donovan@julius.csl.sri.com (Donovan Hsieh) (09/22/90)

Recently, the paper "Third-Generation Database System Manifesto" was 
published in the 1990 ACM SIGMOD Conference Proceedings.  Its authors 
include some of the most influential researchers in the relational 
database community. The main theme of this paper recommends a set of
characteristics for the next generation of database systems. Often,
this becomes a discussion of whether a pure object-oriented model
or an extended relational model is a better approach. The paper was 
also partly written in response to an earlier published paper, 
"The Object-Oriented Database System Manifesto," which was authored by 
some of the leading researchers in the OODB community.

As a knowledgeable and impartial observer of this debate, I found that 
the arguments and criticisms that appeared in the paper "Third-Generation 
Database System Manifesto", which I will call "the manifesto" hereafter, 
contain both JUSTIFIED and MISLEADING opinions. Through this news file, 
I would like to comment objectively on the manifesto.  I would also like 
to see opinions from others on this issue. The following comments are my own 
opinions, and are not those of my current employer or other organizations.
 
The manifesto uses the term "third-generation database systems" to
distinguish itself from the approach taken by the OODB community. It states 
that next generation databases should evolve and be extended from current 
relational systems, rather than be a new approach built from the ground up 
like OODBs. The manifesto lists four groups of propositions that
describe the fundamental requirements for next generation DBMS. I shall 
comment on those propositions where I feel they require further clarification.
Less critical or generally agreeable propositions, such as "Inheritance is a 
good idea" (proposition 1.2), are not discussed in the following comments.

Proposition 1.1 of the manifesto says that "A third generation DBMS must 
have a rich type system." I agree that entirely new database systems 
such as OODBs are not needed to support abstract data types (ADT). But I 
doubt that ALL types can be added to or extended from the current 
relational systems. Stretching existing relational systems beyond their 
inherent limits would most likely cause an inefficient implementation. 
I feel that a pure OODB approach is more suitable and will be better able 
to provide full ADT features that are compatible with the existing 
object-oriented programming languages, such as C++.

Proposition 1.3 of the manifesto says that "Functions, including database
procedures and methods, and encapsulation are a good idea." However, it 
makes the criticism that some OODBs require users to use only functions to 
access data elements (attributes) of a collection (object instance). In fact, 
there are some OODB systems that allow an object class to specify public 
and private attributes, where a public attribute can be directly accessed 
by database query languages, and a private attribute can only be accessed 
through pre-defined methods to protect object integrity. 

Proposition 1.4 of the manifesto says that "Unique identifiers (UIDs) for
records should be assigned by the DBMS only if a user-defined primary key is 
not available". It also argues that a human-readable, immutable primary key in 
relational systems is superior to the UID or OID used by OODBs. A UID in an 
OODB has different meanings and purposes than a primary key in a relational 
system. A UID guarantees that no two object instances will contain the 
same ID during the lifetime of the system. It also facilitates the internal 
referencing and representation of objects. Without a UID, the unique 
identification of an object must be properly defined and enforced in 
the schema. For example, although a social security number suffices to 
uniquely identify a person, it still must be defined as a unique primary 
key in the relational schema definition. It is also possible for a person 
to die, and then have his/her SS# mistakenly entered or reused without 
this error being caught by a relational system. 

Also, it is very costly when a primary key is CHANGED in a relational 
database, such as when someone's SS# is initially entered incorrectly. 
It must then be changed everywhere it was used as a foreign key. If a 
person is referenced by UID, the SS# only has to be changed in one place.
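
The cost difference can be sketched in a few lines. This is only an 
illustration, not anything taken from the manifesto: the table and column 
names are hypothetical, and Python with SQLite is assumed purely for 
convenience. Design A copies the SS# into referencing rows, while design B 
references a system-assigned UID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design A: the SS# itself is the primary key and is copied into job rows.
    CREATE TABLE person_a (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE job_a (ssn TEXT REFERENCES person_a(ssn), title TEXT);

    -- Design B: a system-assigned UID is the key and the SS# is a plain attribute.
    CREATE TABLE person_b (uid INTEGER PRIMARY KEY, ssn TEXT UNIQUE, name TEXT);
    CREATE TABLE job_b (person_uid INTEGER REFERENCES person_b(uid), title TEXT);

    INSERT INTO person_a VALUES ('123-45-6780', 'Pat');
    INSERT INTO job_a VALUES ('123-45-6780', 'Engineer'), ('123-45-6780', 'Manager');

    INSERT INTO person_b VALUES (1, '123-45-6780', 'Pat');
    INSERT INTO job_b VALUES (1, 'Engineer'), (1, 'Manager');
""")

def fix_ssn_natural_key(old, new):
    """Correct a mistyped SS# in design A; returns the number of rows touched."""
    touched = conn.execute(
        "UPDATE person_a SET ssn = ? WHERE ssn = ?", (new, old)).rowcount
    # Every row that used the SS# as a foreign key has to be rewritten too.
    touched += conn.execute(
        "UPDATE job_a SET ssn = ? WHERE ssn = ?", (new, old)).rowcount
    return touched

def fix_ssn_surrogate_key(uid, new):
    """Correct a mistyped SS# in design B; only the person row changes."""
    return conn.execute(
        "UPDATE person_b SET ssn = ? WHERE uid = ?", (new, uid)).rowcount

rows_a = fix_ssn_natural_key("123-45-6780", "123-45-6789")
rows_b = fix_ssn_surrogate_key(1, "123-45-6789")
```

In this toy data set the natural-key design touches one person row plus 
every referencing job row, while the UID design touches a single row. In a 
real schema the gap grows with the number of referencing tables.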

Proposition 1.5 of the manifesto says that "Rules (triggers, constraints) 
will become a major feature in future systems. They should not be associated 
with a specific function or collection." I agree that the use of methods 
in OODBs to define arbitrary constraints, rules, or triggers is an 
undesirable approach. A more declarative language should be used. I 
prefer PROLOG-like languages, which are more declarative, expressive, 
and powerful than SQL. However, their query optimizers are also more 
difficult to implement.

Proposition 2.1 of the manifesto says that "Essentially, all programmatic 
access to a database should be through a non-procedural, high-level access 
language".  It argues that the navigational approach used in OODBs is 
undesirable and inefficient compared with the use of non-procedural query 
languages in relational systems. I feel that this proposition is rather 
misleading. The manifesto claims that a well-written and well-tuned query 
optimizer can almost always produce a better execution method than a human.
A query optimizer could probably do a better job for repetitive and 
straightforward accesses. However, there are cases where human navigation 
is required and a query optimizer cannot foresee all patterns and usages. An 
example is computing the transitive closure of a given parent object. First 
of all, the standard relational algebra does not support a query like "Find all 
children belonging to a given parent" in a single query expression (although 
some extended relational systems allow queries to compute transitive 
closures). A common solution is to implement a "for loop" in the application 
code to compute the closure one record at a time. The result is that the 
"select" query must be optimized for each loop (some smarter query optimizers
will detect the looped query and store its optimized query graph and execution
method in pre-compiled modules so that they can be reused). Furthermore, the 
query optimizer cannot take advantage of buffered records because the next 
query will use the previous child value as its current parent search value
that is most likely not in the same buffer (or page). On the other hand, an 
OODB user could use procedure calls to write the same loop without going 
through time-consuming optimizations and could dereference child pointers 
recursively. If the OODB schema defines a clustering based on this reference 
mode, users will be able to gain even more performance with fewer disk 
accesses because most child objects will have been cached into the buffer 
during the initial access.
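
The two styles can be put side by side in a small sketch. Everything here 
is assumed for illustration: the parent/child data, the `family` table, and 
the `Node` class are hypothetical, and Python with SQLite merely stands in 
for an application program talking to a relational DBMS on one side and to 
an object store on the other.

```python
import sqlite3

# Hypothetical parent/child data as (parent, child) pairs.
EDGES = [("a", "b"), ("a", "c"), ("b", "d"), ("d", "e"), ("x", "y")]

def descendants_by_repeated_select(conn, root):
    """Relational style: the application issues one SELECT per visited node."""
    found, frontier, queries = set(), [root], 0
    while frontier:
        parent = frontier.pop()
        queries += 1
        rows = conn.execute(
            "SELECT child FROM family WHERE parent = ?", (parent,)).fetchall()
        for (child,) in rows:
            if child not in found:
                found.add(child)
                frontier.append(child)
    return found, queries

class Node:
    """Navigational style: each object holds direct references to its children."""
    def __init__(self, name):
        self.name = name
        self.children = []

def descendants_by_navigation(node):
    """Follow child references recursively; no query is parsed or optimized."""
    found = set()
    for child in node.children:
        found.add(child.name)
        found |= descendants_by_navigation(child)
    return found

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (parent TEXT, child TEXT)")
conn.executemany("INSERT INTO family VALUES (?, ?)", EDGES)

# Build the equivalent in-memory object graph.
nodes = {}
for parent, child in EDGES:
    parent_node = nodes.setdefault(parent, Node(parent))
    child_node = nodes.setdefault(child, Node(child))
    parent_node.children.append(child_node)

via_sql, query_count = descendants_by_repeated_select(conn, "a")
via_pointers = descendants_by_navigation(nodes["a"])
```

Both functions return the same set of names. The difference lies in how 
the work is done: the first issues a separate query for the root and for 
every descendant it reaches, while the second only follows references. The 
sketch says nothing about buffering or clustering, which depend on the 
storage engine.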

Another example would be a traversal of objects that involves computation, 
like a CAD application where some optimization of the connections between 
objects involves computation in the application language. I would argue that 
(1) navigation is much more "natural" for these computations than using a 
mixture of queries and programming, and (2) the mixture is inefficient for 
the reasons suggested above.

As for the impact of schema evolution, I agree that the use of "views" in 
relational databases offers good insulation for applications from changes 
to the database schema definition. However, specifying the data elements 
with a declarative query language does not guarantee insulation if the 
primitive data element definition is changed. Also, some OODBs support 
"derived" objects, which provide a service like views. (Derived objects 
are defined procedurally or declaratively by a set of pre-conditions and 
post-conditions to instantiate or modify the objects.)

In the same proposition, the manifesto questioned the performance benefit 
for OODBs that use low-level calls to navigate individual objects. It also
criticized CAD programmers as being close-minded for not using query 
optimizers provided by database systems. I feel that both arguments are 
rather misleading. First, there are various techniques proposed by many 
OODB researchers to address and resolve the performance issue. For example, 
a direct memory map technique currently used by one OODB vendor has reported 
tremendous performance gains over other indexed or hash-based dereferencing 
techniques, such as those that were mentioned in the manifesto. Numerous 
published cases have also reported poor performance when using off-the-shelf 
relational databases to support object navigation, such as in the closure 
computation example described earlier.

Proposition 2.2 of the manifesto says that "There should be at least two 
ways to specify collections, one using enumeration of members and one using 
the query language to specify membership." I agree that defining objects 
"intensionally" with declarative expressions does offer more powerful 
abstractions than defining objects "extensionally".
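
A minimal sketch of the two ways to specify a collection follows. The 
`Part` type, the sample data, and the 10 kg threshold are all hypothetical, 
and a Python predicate stands in for a real query language.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Part:
    name: str
    weight_kg: float

PARTS = [Part("bolt", 0.02), Part("bracket", 0.4),
         Part("motor", 12.0), Part("frame", 30.0)]

# Extensional: the collection is defined by listing its members.
heavy_by_enumeration = {PARTS[2], PARTS[3]}

def heavy_by_query(parts, threshold_kg=10.0):
    """Intensional: membership is defined by a predicate over the data."""
    return {p for p in parts if p.weight_kg > threshold_kg}

# A newly added part joins the intensional collection automatically,
# while the enumerated collection stays exactly as it was listed.
PARTS.append(Part("gearbox", 18.0))
heavy_now = heavy_by_query(PARTS)
```

The enumerated set has to be maintained by hand, whereas the 
predicate-defined set follows the data, which is the extra abstraction the 
proposition is after.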

Proposition 2.4 of the manifesto says that "Performance indicators have 
almost nothing to do with data models and must not appear in them." I 
disagree with this claim. Although performance is heavily influenced by 
individual implementation techniques, there exist inherent limitations on 
the performance achievable for the underlying data models. For example, the 
relational model explicitly disallows the storing of ordered tuples. This 
makes it very inefficient to represent lists, and users are forced to sort 
on a sequence number implemented by their applications.
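
The sequence-number workaround looks roughly like this. It is a sketch 
under assumptions: the `steps` table and its contents are made up, and 
Python with SQLite stands in for an application on top of a relational 
DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steps (seq INTEGER, label TEXT)")
conn.executemany("INSERT INTO steps VALUES (?, ?)",
                 [(1, "cut"), (2, "weld"), (3, "paint")])

def insert_at(position, label):
    """Insert a list element at `position`, shifting later elements by one."""
    shifted = conn.execute(
        "UPDATE steps SET seq = seq + 1 WHERE seq >= ?", (position,)).rowcount
    conn.execute("INSERT INTO steps VALUES (?, ?)", (position, label))
    return shifted

def as_list():
    """The list order exists only through an explicit sort on the sequence number."""
    rows = conn.execute("SELECT label FROM steps ORDER BY seq")
    return [label for (label,) in rows]

rows_renumbered = insert_at(2, "drill")
ordered = as_list()
```

Inserting into the middle of the list rewrites every later row, and the 
application, not the data model, is responsible for keeping the numbering 
consistent.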

It is always possible to extend existing database models with new features
and constraints through arbitrary implementations. But the end result would
be undesirable if the extension exceeds the limitations of the model, or 
lacks the support of a formal mathematical representation. 

Propositions 3.1 and 3.2 of the manifesto say that "Third generation DBMSs 
must be accessible from multiple HLLs" and "Persistent X for a variety of Xs 
is a good idea. They will all be supported on top of a single DBMS by compiler 
extensions and a (more or less) complex run time system". In theory, I agree 
that next-generation databases (either third-generation or OODB) should be 
accessible from multiple HLLs (High Level Languages), and that the DBMS should 
provide multiple run-time type translations between declarative query 
languages and HLLs. However, it is impractical for DBMSs to support all HLLs. 
For example, many MIS programmers 
are interested in adopting new object-oriented technologies (that is, to use
object-oriented design methodology and object-oriented programming languages) 
to implement new MIS applications if they are given the opportunity to do so 
rather than revamping and retrofitting existing COBOL code. If they are given
the opportunity to choose a DBMS to match with their new object-oriented 
applications, most likely they will use a fully supported OODB product because
it provides a better match. 

Many CAD/CAM and CASE software vendors have long abandoned their use of 
(extended) relational DBMSs because of poor performance and data modeling 
capabilities. Another important reason is that they have also adopted 
object-oriented programming languages as their language of choice. It is 
natural for these software vendors to use OODBs rather than stretching
existing (extended) relational systems to match their needs.
The advantage of OODBs and object-oriented programming languages is that
they carry far less of a backward-compatibility burden than third-generation 
DBMSs do. Providing a gateway to access old DBMSs from OODBs is helpful,
but it is not critical. A similar situation occurred when the computer 
industry adopted the cheaper and faster RISC processors over the old CISC. 
History has proven that the migration is the correct choice. 

In relational databases, the type "impedance mismatch" between SQL and 
HLLs has long been criticized as being inefficient and unnatural. Even if the
third-generation DBMSs provide brilliant ways to bridge the gap between
all HLLs and SQL, new object-oriented users will always opt for the OODB 
because it is a natural fit.

As for OODBs, although the lack of declarative query languages and a formal 
object algebra/calculus makes them less intuitive for end users to use currently,
many researchers have proposed different solutions and approaches to resolve 
this deficiency. We must allow more time for this new technology to be refined
and improved, just as it took more than a decade for relational databases to 
become mature and popular, and replace the old network and hierarchical 
databases.

In summary, I feel that there is room for both technologies to co-exist, 
and new database models will always be proposed to address existing 
deficiencies.  We will probably see some fusion of both database approaches 
in the near future that will benefit database users. In the long run, I 
foresee OODBs replacing (extended) relational DBMSs in selected market 
segments. I would also predict that the next wave after OODBs will be 
fully-integrated, intelligent database (or knowledge-based) systems that 
will combine both AI and database technologies.




Donovan Hsieh

Computer Science Lab
SRI International
Menlo Park, CA

davidm@uunet.UU.NET (David S. Masterson) (09/24/90)

In article <21178@hercules.csl.sri.com> donovan@julius.csl.sri.com 
(Donovan Hsieh) writes:

   Recently, the paper "Third-Generation Database System Manifesto" was
   published in the 1990 ACM SIGMOD Conference Proceedings.
   [...deleted...]			The paper was also partly written in
   response to an earlier published paper, "The Object-Oriented Database
   System Manifesto,"

Both papers were very interesting.  Too bad there's such a schism in the
database community.

   Proposition 1.1 of the manifesto says that "A third generation DBMS must
   have a rich type system." I agree that entirely new database systems such
   as OODBs are not needed to support abstract data types (ADT). But I doubt
   that ALL types can be added to or extended from the current relational
   systems. Stretching existing relational systems beyond their inherent
   limits would most likely cause an inefficient implementation.  I feel that
   a pure OODB approach is more suitable and will be better able to provide
   full ADT features that are compatible with the existing object-oriented
   programming languages, such as C++.

Examples please.  There's too much "feeling" in this paragraph.  By the way,
one criticism I have of the OO paper is the lack of credible reasoning on why
a system implementing the relational model (the current relational systems are
not really *completely* relational) is not a basic OODB that can be built
upon to full object-oriented status.

   Proposition 1.3 of the manifesto says that "Functions, including database
   Procedures and methods, and encapsulation are a good idea." However, it
   makes the criticism that some OODBs require users to use only functions to
   access data elements (attributes) of a collection (object instance). In
   fact, there are some OODB systems that allow an object class to specify
   public and private attributes, where a public attribute can be directly
   accessed by database query languages, and a private attribute can only be
   accessed through pre-defined methods to protect object integrity.

You contradict yourself.  "Some OODBs require" means just that -- some.  Also,
what is the definition of "object integrity" in this case and how do these
methods differ from "constraints and triggers" (IMHO, there is none).

   Proposition 1.4 of the manifesto says that "Unique identifiers (UIDs) for
   records should be assigned by the DBMS only if a user-defined primary key
   is not available". It also argues that a human-readable, immutable primary
   key in relational systems is superior to the UID or OID used by OODBs. A
   UID in an OODB has different meanings and purposes than a primary key in a
   relational system. A UID guarantees that no two object instances will
   contain the same ID during the lifetime of the system.

   Also, it is very costly when a primary key is CHANGED in a relational
   database, such as when someone's SS# is initially entered incorrectly.  It
   must then be changed everywhere it was used as a foreign key. If a person
   is referenced by UID, the SS# only has to be changed in one place.

Actually, the immutable, system-assigned primary key was a central part of the
Codd/Date RM/T model (Date's Introduction to Database Systems V2), so the UID
concept doesn't disagree with the relational model.  This is one point where,
IMHO, the manifesto disagrees with the model.  IMO, though, the question
of foreign key updating could be handled by the proper use of constraints and
triggers.

   Proposition 2.1 of the manifesto says that "Essentially, all programmatic
   access to a database should be through a non-procedural, high-level access
   language".  It argues that the navigational approach used in OODBs is
   undesirable and inefficient compared with the use of non-procedural query
   languages in relational systems. I feel that this proposition is rather
   misleading. The manifesto claims that a well-written and well-tuned query
   optimizer can almost always produce a better execution method than a human.
   A query optimizer could probably do a better job for repetitive and
   straightforward accesses. However, there are cases where human navigation
   is required and a query optimizer cannot foresee all patterns and usages.

Perhaps true, but the manifesto was addressing the tradeoffs of the two
approaches in that the majority of the cases in an information management
system will tend toward the ad-hoc, so a query optimizer will generally do
better.  It's the same argument that was made for 3GLs over Assembler.

   An example is computing the transitive closure of a given parent object.
   First of all, the standard relational algebra does not support a query like
   "Find all children belonging to a given parent" in a single query
   expression (although some extended relational systems allow queries to
   compute transitive closures). 

Actually, Codd's new book addresses this with the recursive outer-join, so the
new "standard" for the relational model addresses this.  Now if we could only
get the relational vendors to realize this.
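
For readers who want to see what a single-expression closure query can look 
like, here is a sketch. It does not reproduce Codd's recursive outer-join. 
It uses the recursive common table expression syntax that SQLite accepts, 
which is an assumed stand-in, and the `family` table and its data are 
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO family VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "d"), ("d", "e"), ("x", "y")])

# One declarative statement: start from the direct children of the given
# parent, then repeatedly join back to the table until nothing new appears.
CLOSURE_QUERY = """
    WITH RECURSIVE descendants(name) AS (
        SELECT child FROM family WHERE parent = ?
        UNION
        SELECT family.child
        FROM family JOIN descendants ON family.parent = descendants.name
    )
    SELECT name FROM descendants ORDER BY name
"""

all_descendants = [name for (name,) in conn.execute(CLOSURE_QUERY, ("a",))]
```

The application issues one statement and the DBMS decides how to iterate, 
which is the kind of support the poster wishes vendors would provide.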

   				A common solution is to implement a "for
   loop" in the application code to compute the closure one record at a time.
   The result is that the "select" query must be optimized for each loop (some
   smarter query optimizers will detect the looped query and store its
   optimized query graph and execution method in pre-compiled modules so that
   they can be reused). Furthermore, the query optimizer cannot take advantage
   of buffered records because the next query will use the previous child
   value as its current parent search value that is most likely not in the
   same buffer (or page).

On the other hand, another common approach used is to do an n-way outer-join
from parent to child to grandchild where n is the known number of levels in
the hierarchy.  This is the best that the current generation of relational
systems can do and it does allow the query optimizer to do some optimizations.
However, if the hierarchy is sparsely populated, this isn't that good.

   			On the other hand, an OODB user could use procedure
   calls to write the same loop without going through time-consuming
   optimizations and could dereference child pointers recursively. If the OODB
   schema defines a clustering based on this reference mode, users will be
   able to gain even more performance with fewer disk accesses because most
   child objects will have been cached into the buffer during the initial
   access.

A relational system can take advantage of the same clustering idea.  "Time
consuming optimizations", IMHO, are in the eye of the beholder.  An OODB
wouldn't have the full understanding of relationships that can occur in a
database and, so, couldn't take advantage of a recursive join operation
(unless the OODB was a relational DB).

   Another example would be a traversal of objects that involves computation,
   like a CAD application where some optimization of the connections between
   objects involves computation in the application language. I would argue
   that (1) navigation is much more "natural" for these computations than
   using a mixture of queries and programming, and (2) the mixture is
   inefficient for the reasons suggested above.

But the Manifesto's Prop 1.1 states the need for a "rich type system".  This
should include all the computational capability needed to support the new
type.  So, the "true" third generation system should merge more seamlessly the
programming idea with the query idea.

   As for the impact of schema evolution, I agree that the use of "views" in
   relational databases offers good insulation for applications from changes
   to the database schema definition. However, specifying the data elements
   with a declarative query language does not guarantee insulation if the
   primitive data element definition is changed. Also, some OODBs support
   "derived" objects, which provide a service like views. (Derived objects are
   defined procedurally or declaratively by a set of pre-conditions and
   post-conditions to instantiate or modify the objects.)

I think views are better thought of as a composition operation than a
derivation operation.  The current generation of relational database systems,
IMHO, does not really support enough of the view concept to truly insulate
applications from the database design (lack of multi-table updates).

   In the same proposition, the manifesto questioned the performance benefit
   for OODBs that use low-level calls to navigate individual objects. It also
   criticized CAD programmers as being close-minded for not using query
   optimizers provided by database systems. I feel that both arguments are
   rather misleading. First, there are various techniques proposed by many
   OODB researchers to address and resolve the performance issue. For example,
   a direct memory map technique currently used by one OODB vendor has
   reported tremendous performance gains over other indexed or hash-based
   dereferencing techniques, such as those that were mentioned in the
   manifesto. Numerous published cases have also reported poor performance
   when using off-the-shelf relational databases to support object navigation,
   such as in the closure computation example described earlier.

Do you think that relational systems could not improve themselves with the
above solutions?

   Proposition 2.4 of the manifesto says that "Performance indicators have
   almost nothing to do with data models and must not appear in them." I
   disagree with this claim. Although performance is heavily influenced by
   individual implementation techniques, there exist inherent limitations on
   the performance achievable for the underlying data models. For example, the
   relational model explicitly disallows the storing of ordered tuples. This
   makes it very inefficient to represent lists, and users are forced to sort
   on a sequence number implemented by their applications.

What?!?  True, relations are inherently non-ordered things in the model, but
that does not eliminate indexing in an implementation of the relational model.
Therefore, the optimizer can determine that an "order by" has no work to do
because the tuples are already ordered by the index in the needed fashion.
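
This can be checked on a small scale. The sketch below assumes SQLite 
through Python, a hypothetical `steps` table, and SQLite's habit of 
reporting a "TEMP B-TREE" step in its plan when it has to sort for an 
ORDER BY. The exact plan text depends on the SQLite version.

```python
import sqlite3

QUERY = "SELECT label FROM steps WHERE list_id = 1 ORDER BY seq"

def plan_for(with_index):
    """Return SQLite's query-plan text for QUERY, with or without an index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE steps (list_id INTEGER, seq INTEGER, label TEXT)")
    if with_index:
        conn.execute("CREATE INDEX steps_order ON steps (list_id, seq)")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    # The last column of each plan row is the human-readable description.
    return " | ".join(row[-1] for row in rows)

plan_without_index = plan_for(with_index=False)
plan_with_index = plan_for(with_index=True)

# True when the plan contains an explicit sorting step for the ORDER BY.
needs_sort_without = "TEMP B-TREE" in plan_without_index
needs_sort_with = "TEMP B-TREE" in plan_with_index
```

With the index on (list_id, seq) in place, the sorting step disappears from 
the plan, so the "order by" costs nothing beyond the index scan.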

   It is always possible to extend existing database models with new features
   and constraints through arbitrary implementations. But the end result would
   be undesirable if the extension exceeds the limitations of the model, or
   lacks the support of a formal mathematical representation.

As has been seen, current relational implementations are in no way approaching
the limits of the relational model (see Codd's new book).

   Propositions 3.1 and 3.2 of the manifesto say that "Third generation DBMSs must
   be accessible from multiple HLLs" and "Persistent X for a variety of Xs is
   a good idea. They will all be supported on top of a single DBMS by compiler
   extensions and a (more or less) complex run time system". In theory, I
   agree that next-generation databases (either third-generation or OODB)
   should be accessible from multiple HLLs (High Level Languages), and the
   DBMS should provide multiple run-time type translations between
   declarative query languages and HLLs.  However, it is impractical for DBMSs
   to support all HLLs. For example, many MIS programmers are interested in
   adopting new object-oriented technologies (that is, to use object-oriented
   design methodology and object-oriented programming languages) to implement
   new MIS applications if they are given the opportunity to do so rather than
   revamping and retrofitting existing COBOL code. If they are given the
   opportunity to choose a DBMS to match with their new object-oriented
   applications, most likely they will use a fully supported OODB product
   because it provides a better match.

However, as has been seen, the "existing COBOL code" doesn't go away as it is
usually the revenue producing code.  Therefore, the adoption of new
technologies (object-oriented or otherwise) usually requires a "ramping up" to
full implementation with old code being adjusted to "try out" the new ideas.

   In relational databases, the type "impedance mismatch" between SQL and HLLs
   has long been criticized as being inefficient and unnatural. Even if the
   third-generation DBMSs provide brilliant ways to bridge the gap between all
   HLLs and SQL, new object-oriented users will always opt for the OODB
   because it is a natural fit.

Even relational people criticize SQL as not being really relational.  I don't
believe the manifesto called for standardizing on SQL (but I don't have it in
front of me).

   As for OODBs, although the lack of declarative query languages and a formal
   object algebra/calculus makes them less intuitive for end users to use
   currently, many researchers have proposed different solutions and
   approaches to resolve this deficiency. We must allow more time for this new
   technology to be refined and improved, just as it took more than a decade
   for relational databases to become mature and popular, and replace the old
   network and hierarchical databases.

I agree as long as the ultimate OODB programmer (not the OODBMS programmer)
isn't really just a network or hierarchical database programmer with a new
name. 

   In summary, I feel that there is room for both technologies to co-exist,
   and new database models will always be proposed to address existing
   deficiencies.  We will probably see some fusion of both database approaches
   in the near future that will benefit database users. In the long run, I
   foresee OODBs replacing (extended) relational DBMSs in selected market
   segments. I would also predict that the next wave after OODBs will be
   fully-integrated, intelligent database (or knowledge-based) systems that
   will combine both AI and database technologies.

I also see the database technologies merging in the future, but I also see
more complete implementations of the current systems solving many of the
problems that have been seen.  I wouldn't abandon the current technology quite
yet.   ;-)
--
====================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

dlw@odi.com (Dan Weinreb) (09/24/90)

In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:


   Examples please.  There's too much "feeling" in this paragraph.  By the way,
   one criticism I have of the OO paper is the lack of credible reasoning on why
   a system implementing the relational model (the current relational systems are
   not really *completely* relational) is not a basic OODB that can be built
   upon to full object-oriented status.

It seems to me that the OO paper is simply an attempt to define the
term OODB.  It does not comment, one way or another, about whether it
is possible to build an OODB on top of some other kind of database, or
to extend a non-OODB into an OODB by adding things.  I'm sure
the authors all have opinions about that, but the "manifesto" only
addresses the question of what an OODB is.

   Actually, the immutable, system-assigned primary key was a central part of the
   Codd/Date RM/T model (Date's Introduction to Database Systems V2), so the UID
   concept doesn't disagree with the relational model.

Well, the RM/T model is not the same thing as the Relational model.
The people are the same (so to speak) but RM/T is a distinct model,
with new elements (such as the system-assigned primary key) designed
to remedy some problems of the relational model.  OODBs also
provide the equivalent of a system-assigned primary key.

      Proposition 2.1 of the manifesto says that "Essentially, all programmatic
      access to a database should be through a non-procedural, high-level access
      language".  It argues that the navigational approach used in OODBs is
      undesirable and inefficient compared with the use of non-procedural query
      languages in relational systems.

In my own opinion, this is one of those issues that is clouded up by
the use of code words of adherents of dogma.  The extended-relational
people criticize the OODB people because an OODB can say things like
"get me the father of X", which they denounce because it is imperative
rather than declarative.  In a relational database, you'd say
something like "for all people in the database, find all people Z such
that Z's father-id attribute has the value X, and print out (?) the
value of the primary-key-id field for all of them" in order to do the
same thing.  The latter (my English translation of SQL) is considered
ideologically pure because it is considered to be declarative rather
than imperative.  It is also argued that it will run faster because of
the omniscient wisdom of a general-purpose query optimizer.  As you can
guess from my incendiary tone (forgive me), I remain unconvinced that
this is true.  On the other hand, I certainly do feel that for some
database operations, it is highly desirable to use a high-level,
"non-procedural" (in the sense in which they mean it) query construct
and that it is appropriate to use a query optimizer, and that the
optimizer can do the best job.  It depends on the operation.  But
applying the "declarative is always virtuous, procedural is always
evil" principle, particularly without clear definitions of those
terms, leads to neither enlightenment nor efficient execution.

Some people seem to think that anybody who is an advocate of OODBs is
utterly opposed to nonprocedural query languages.  I don't know why.
My feeling is that the application writer should be provided with a
toolkit that provides a range of useful tools, so that he or she can
use the appropriate tool for the particular job at hand.  Any serious
OODB should have a query language and a query optimizer.

   A relational system can take advantage of the same clustering idea.  "Time
   consuming optimizations", IMHO, are in the eye of the beholder.  An OODB
   wouldn't have the full understanding of relationships that can occur in a
   database and, so, couldn't take advantage of a recursive join operation
   (unless the OODB was a relational DB).

And why wouldn't an OODB have a full understanding of relationships?
Your paragraph above must be read in the light of some understanding
of "what is an OODB" and "what is a relational DB".  You say that it
can't have understanding unless it is a relational DB.  Do you
seriously mean that it has to follow Codd's Rules or else such an
optimization is impossible?  I suspect you mean something less
stringent.  Is what you mean incompatible with the definition of
OODB, as given in the OODB manifesto?

   Do you think that relational systems could not improve themselves with the
   above solutions?

This statement is another one that always crops up in these
discussions.  For about 2/3 of the advantages that the OODB people
claim, the relational ("post-relational"?) people answer by saying
"well, we could do that too".  What this really points out is two
things.  First, many of the differences between the two camps don't
actually hinge on the question of data model (OO vs. relational), but
really have to do with totally unrelated questions of implementation
technique, programming language interfaces, and so on.  Second, for
application programmers who are trying to obtain a DBMS in some
particular year to accomplish some particular job, the key question is
not what is theoretically possible but what can be acquired promptly.

      Proposition 2.4 of the manifesto says that "Performance indicators have
      almost nothing to do with data models and must not appear in them." I
      disagree with this claim. Although performance is heavily influenced by
      individual implementation techniques, there exist inherent limitations on
      the performance achievable for the underlying data models. 

As I said above, I don't agree with this.  The data model merely
specifies the external interface to the DBMS.  There are many, many
clever tricks that can be used to implement such an interface to make
it run more quickly.  The only things that are inherent in the data
model are whether it is possible to say certain things, and how clear
and convenient it is to say them.  It would be impossible to somehow
prove that nobody could come up with a clever implementation, one that
might work in a totally non-obvious way, to make some particular
database operation fast.  When you look at performance, you really
have to examine the speeds of particular DBMSs on particular tasks.
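
As a minimal sketch of that point, consider two stores that present the
same external interface but answer it in different ways.  The names
ScanStore, IndexedStore, and find_by are hypothetical, chosen only for
illustration, and stand in for no real DBMS.

```python
from collections import defaultdict


class ScanStore:
    """Answers find_by() with a linear scan over every record."""

    def __init__(self, records):
        self._records = list(records)

    def find_by(self, field, value):
        return [r for r in self._records if r.get(field) == value]


class IndexedStore:
    """Same interface, but builds a hash index per field on first use."""

    def __init__(self, records):
        self._records = list(records)
        self._indexes = {}

    def find_by(self, field, value):
        if field not in self._indexes:
            index = defaultdict(list)
            for r in self._records:
                index[r.get(field)].append(r)
            self._indexes[field] = index
        return list(self._indexes[field].get(value, []))


records = [
    {"id": 1, "dept": "db"},
    {"id": 2, "dept": "os"},
    {"id": 3, "dept": "db"},
]
```

A caller cannot tell the two apart by the answers they return, only by
how long they take on a given workload.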

      As for OODBs, although the lack of declarative query languages and a formal
      object algebra/calculus makes it less intuitive for end users to use
      currently,

Oh, come now.  Several existing OODBs have declarative query languages.
As for the great benefits of formal algebra/calculus, please read
Codd's essay criticizing SQL.  SQL is based neither on the relational
algebra nor the relational calculus but on a sort of hybrid of both;
it's inconsistent and counterintuitive in many ways.  SQL is not
prevailing in the relational DBMS industry right now because of its
clarity or its theoretical purity, but because of political factors
involving IBM endorsement and the need for standardization of
interfaces. OODB query languages are not less intuitive than SQL.

These are purely my own late-night musings and are not necessarily
the official position of my employer.

Dan Weinreb	dlw@odi.com		Object Design, Inc.

davidm@uunet.UU.NET (David S. Masterson) (09/25/90)

In article <1990Sep24.071412.3561@odi.com> dlw@odi.com (Dan Weinreb) writes:

   In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> (I) write:

      Actually, the immutable, system-assigned primary key was a central part
      of the Codd/Date RM/T model (Date's Introduction to Database Systems
      V2), so the UID concept doesn't disagree with the relational model.

   Well, the RM/T model is not the same thing as the Relational model.  The
   people are the same (so to speak) but RM/T is a distinct model, with new
   elements (such as the system-assigned primary key) designed to remedy some
   problems of the relational model.  The OODB's also provide the equivalent
   of a system-assigned primary key.

This is an interesting point.  Is the contention, therefore, that the current
relational implementations are representative of the relational model?  Codd
seems to contend that no one has yet implemented even RMV1 (let alone RM/T),
but all these versions go into the definition of the relational model.  Your
paragraph seems to suggest that the current implementations are representative
of the relational model and Codd's 12 rules (and 300+ newer rules) are
something entirely new.

      A relational system can take advantage of the same clustering idea.
      "Time consuming optimizations", IMHO, are in the eye of the beholder.
      An OODB wouldn't have the full understanding of relationships that can
      occur in a database and, so, couldn't take advantage of a recursive join
      operation (unless the OODB was a relational DB).

   And why wouldn't an OODB have a full understanding of relationships?  Your
   paragraph above must be read in the light of some understanding of "what is
   an OODB" and "what is a relational DB".  You say that it can't have
   understanding unless it is a relational DB.  Do you seriously mean that it
   has to follow Codd's Rules or else such an optimization is impossible?  I
   suspect you mean something less stringent.  Is what you mean incompatible
   with the definition of OODB, as given in the OODB manifesto?

No it isn't.  Looking at it now, my paragraph was more stringent than I wanted
it to be.  What I meant was that, when you get down to it, considerations such
as these can be accomplished by both the OODB and the relational DB, so there
really may not be that much of a difference between the two.
--
====================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

dlw@odi.com (Dan Weinreb) (09/26/90)

In article <CIMSHOP!DAVIDM.90Sep24104403@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:

   In article <1990Sep24.071412.3561@odi.com> dlw@odi.com (Dan Weinreb) writes:

      Well, the RM/T model is not the same thing as the Relational model.  The
      people are the same (so to speak) but RM/T is a distinct model, with new
      elements (such as the system-assigned primary key) designed to remedy some
      problems of the relational model.  The OODB's also provide the equivalent
      of a system-assigned primary key.

   This is an interesting point.  Is the contention, therefore, that the current
   relational implementations are representative of the relational model?  Codd
   seems to contend that no one has yet implemented even RMV1 (let alone RM/T),
   but all these versions go into the definition of the relational model.  Your
   paragraph seems to suggest that the current implementations are representative
   of the relational model and Codd's 12 rules (and 300+ newer rules) are
   something entirely new.

Sorry, I don't follow this at all.  I must be misunderstanding you.  I
don't see why you think I am suggesting what you said.  What I am
saying is that RM/T is quite distinct from what is usually called
"relational".  Current "relational database system" products do not
exactly implement the Relational Model, but they are a lot closer to
the Relational Model than to RM/T.  From my reading of Codd, the "12
rules" are an attempt to be more precise about the meaning of the
Relational Model, whereas RM/T is a new proposed data model that is
based on the Relational Model but has several crucial changes,
considered to be improvements.  Anyway, this was just a minor point;
it didn't have much to do with the original issue under discussion.

   No it isn't.  Looking at it now, my paragraph was more stringent than I wanted
   it to be.  What I meant was that, when you get down to it, considerations such
   as these can be accomplished by both the OODB and the relational DB, so there
   really may not be that much of a difference between the two.

I agree that the differences between the two are often exaggerated.
When you look closely, they aren't as different as they seem at first.

aaron@grad2.cis.upenn.edu (Aaron Watters) (09/27/90)

In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>
>Both papers were very interesting.  Too bad there's such a schism in the
>database community.
>
I wouldn't characterize it as a schism, I'd call it a healthy debate.
It's only `too bad' for vendors who want to claim their products are
`next generation' without any objections.

Why is it that criticism and disagreement is so often characterized
as `bad' in the CS community?  The most minor criticism seems to
reduce some people to fuming, clawing, scorpions.  EG, see the letters
to the CACM in response to `Program Verification: the Very Idea.'
-aaron

dlw@odi.com (Dan Weinreb) (09/29/90)

In article <30205@netnews.upenn.edu> aaron@grad2.cis.upenn.edu (Aaron Watters) writes:

   I wouldn't characterize it as a schism, I'd call it a healthy debate.
   It's only `too bad' for vendors who want to claim their products are
   `next generation' without any objections.

   Why is it that criticism and disagreement is so often characterized
   as `bad' in the CS community?  The most minor criticism seems to
   reduce some people to fuming, clawing, scorpions.  EG, see the letters
   to the CACM in response to `Program Verification: the Very Idea.'
   -aaron

From the point of view of intellectual progress, free debate, and the
open marketplace of ideas, a healthy debate is not only desirable but
essential.  From this point of view, if there were grounds for
complaint, they'd be about the quality and persuasiveness of the
arguments, and so on.

In the real world, though, papers, especially those called
"manifestos" and "countermanifestos", and especially those co-authored
by academics of many different affiliations, end up being more than
just free debate, whether intentionally or not.  The degree to which
such papers are believed and accepted can affect who gets research
funds, which theses are considered good enough to deserve a degree,
who gets tenure, and so on.  I think it's possible that some of the
fuming may result from this aspect of the papers.  If this is what
Masterson thinks is too bad, I agree, although I don't see what to
do about it.

davidm@uunet.UU.NET (David S. Masterson) (09/30/90)

In article <30205@netnews.upenn.edu> aaron@grad2.cis.upenn.edu
(Aaron Watters) writes:

   In article <CIMSHOP!DAVIDM.90Sep23204350@uunet.UU.NET> (I) write:
   >
   >Both papers were very interesting.  Too bad there's such a schism in the
   >database community.
   >

   I wouldn't characterize it as a schism, I'd call it a healthy debate.

Perhaps, but, thus far, I've only seen movement on one side of the debate.

   It's only `too bad' for vendors who want to claim their products are
   `next generation' without any objections.

Or even complete with respect to the current generation.

   Why is it that criticism and disagreement is so often characterized
   as `bad' in the CS community?  The most minor criticism seems to
   reduce some people to fuming, clawing, scorpions.  EG, see the letters
   to the CACM in response to `Program Verification: the Very Idea.'

Why think it only occurs within the CS community?  Seems to be quite a schism
in the government debating the budget deficits and how to solve them.  Of
course, discussion of that doesn't belong in comp.databases.

roelw@cs.vu.nl (Wieringa j Roel) (10/01/90)

In article <1990Sep28.173803.15043@odi.com> dlw@odi.com writes:
>
>In the real world, though, papers, especially those called
>"manifestos" and "countermanifestos", and especially those co-authored
>by academics of many different affiliations, end up being more than
>just free debate, whether intentionally or not.  The degree to which
>such papers are believed and accepted can affect who gets research
>funds, which theses are considered good enough to deserve a degree,
>who gets tenure, and so on.  I think it's possible that some of the
>fuming may result from this aspect of the papers.  If this is what
>Masterson thinks is too bad, I agree, although I don't see what to
>do about it.


That is exactly the problem with these manifestos. Recently, John Maddox,
chief editor of the journal Nature, visited The Netherlands to give a speech
at a symposium held at the 40th anniversary of the major Dutch funding
organization for pure scientific research, the NWO. He warned that the 
extremely high pressure to publish or perish can cause scientists to
publish results too early and to be secretive about the results they have
obtained so far. In addition, in sciences like biochemistry, commercially
lucrative contracts sit just around the corner of the research laboratory,
which does not foster open exchange of ideas either. The result is that people
give press conferences about their discoveries before they are published in and
tested by the scientific community, and secure patents before publication.
Large press coverage may result in large funding.

The pressures of funding and publication and the lure of commercial results 
do not foster a climate of open exchange of ideas in which people build upon
the (acknowledged) results of their predecessors. Maddox reminded us of the
fact that Newton stood upon the shoulders of giants. We do not want the
rare researcher who is a genius to discover that there are only shoulders
of cripples to stand on.

Manifestos, Beach reports and reports from invitational NSF workshops are
ostensibly only about the objectively best definition of concepts and
promising research directions. In addition to this, IMHO they must also
be viewed as moves in a game of power and money: which projects get
funded, who determines which projects are worth their money, and in 
general who has the clout to make his definition of what is the case and
what should be the case stick. They may not be intended as such by
their authors, but I think they cannot be seen in isolation from the
political (funding) dimension of science.

Recently, I was in a workshop in Aigen, Austria, visited by European 
researchers from the database theory community. Someone suggested the name
``Moneyfesto'' for the kind of Manifesto now being published. Someone else
pointed out that, just when Eastern Europe is abolishing the political system
whose roots go back to the Communist Manifesto, we in the free West (and
within the free West, in the heartland of freedom, scientific research),
start publishing Manifestos about what Authorities deem to be the Truth.

On a lighter note, a few months ago I was at an IFIP conference in Windermere,
U.K., where Stonebraker presented the  3rd Generation DB Manifesto for the
European research community. In my talk I proposed 
the Manifesto of Windermere:
All Manifestos are Mere Wind.


Hope to have offended no one,

Roel

aaron@grad2.cis.upenn.edu (Aaron Watters) (10/03/90)

In article <7798@star.cs.vu.nl> roelw@cs.vu.nl (Wieringa j Roel) writes:

>Manifestos, Beach reports and reports from invitational NSF workshops are
>ostensibly only about the objectively best definition of concepts and
>promising research directions. In addition to this, IMHO they must also
>be viewed as moves in a game of power and money: which projects get
>funded, who determines which projects are worth their money, and in 
>general who has the clout to make his definition of what is the case and
>what should be the case stick.

What is wrong with that?  If the people who disagree can't 
mount a convincing counterargument, perhaps they shouldn't be
funded.  Or are you arguing that 
administrators of funding sources are stupid
and easily misled?  This last seems to be a common assumption
among academics, perhaps because they are occasionally denied
funding.

There is a pervasive tendency within academia to view any attempt at
value-judgment as evil -- this, I think, is the source of people's
problems with manifestos.
The idea is that `ignorant outsiders' 
(to be read, `anyone who doesn't agree with me') should not influence
the `direction of research.'  This is nonsense.  I think a large
number of researchers would be reinvigorated if they were wrenched
off their tired toilings and forced to consider some hairy industrial
problem.  There may be one or two geniuses who would not benefit
from such an experience, but I'm willing to risk it and hypothesize
that the overall effect would be for the better.		-aaron.

roelw@cs.vu.nl (Wieringa j Roel) (10/03/90)

In article <30441@netnews.upenn.edu> aaron@grad2.cis.upenn.edu.UUCP (Aaron Watters) writes:
>In article <7798@star.cs.vu.nl> roelw@cs.vu.nl (Wieringa j Roel) writes:
>
>>Manifestos, Beach reports and reports from invitational NSF workshops are
>>ostensibly only about the objectively best definition of concepts and
>>promising research directions. In addition to this, IMHO they must also
>>be viewed as moves in a game of power and money: which projects get
>>funded, who determines which projects are worth their money, and in 
>>general who has the clout to make his definition of what is the case and
>>what should be the case stick.
>
>What is wrong with that?  

In itself, nothing. I am simply saying that the manifesto game is
a game of arguments as well as one of power.

>If the people who disagree can't 
>mount a convincing counterargument, perhaps they shouldn't be
>funded.  

Perhaps. People get convinced of things by other means than good arguments.
And the person who wins in a power game may also be the one with the
best argument, but this is not necessarily so.

>Or are you arguing that 
>administrators of funding sources are stupid
>and easily misled?  

No, I assume they are intelligent and that you need a lot of forceful
argument to mislead them. And those who win, may really believe that
they have the best arguments and need not aim at misleading funding
administrators at all. But then it still is the case that the person
who wins the power game need not be the person with the best arguments.

>[stuff deleted ...]
>  I think a large
>number of researchers would be reinvigorated if they were wrenched
>off their tired toilings and forced to consider some hairy industrial
>problem.  There may be one or two geniuses who would not benefit
>from such an experience, but I'm willing to risk it and hypothesize
>that the overall effect would be for the better.		-aaron.

My point is not that confrontation with practical problems does not
cause new ideas -it generally does- or that such confrontation
is not in general healthy for 
scientific research -in general, it is healthy. My point is that the 
pressure of funding may divert too much energy from the research effort
to the fund-raising effort, that the closeness of commercially lucrative
results may hinder the open exchange of ideas, and that an excessive pressure
to publish results may cause the publication of immature results. None
of these dangers need materialize, but only if we remain aware of them.

Roel

aaron@grad2.cis.upenn.edu (Aaron Watters) (10/04/90)

In article <7824@star.cs.vu.nl> roelw@cs.vu.nl (Wieringa j Roel) writes:
> My point is that the 
>pressure of funding may divert too much energy from the research effort
>to the fund-raising effort, that the closeness of commercially lucrative
>results may hinder the open exchange of ideas, and that an excessive pressure
>to publish results may cause the publication of immature results. None
>of these dangers need materialize, but only if we remain aware of them.

I certainly agree.  What does this have to do with the manifestos?
It seems to me this is an orthogonal issue.
You also argue that the people with the most forceful arguments 
may be `wrong' (in some sense).  I can't deny this either.  What
alternative to open discussion and the taking of bold and
controversial positions do you suggest?  No discussion?		-aaron