[comp.databases] Extended RDB vs OODB

mitchell@wdl1.UUCP (Jo Mitchell) (08/03/89)

  For those of us who are interested in CAD/CAM, CASE applications ...

  After watching the oodb action and "extended" rdb action for awhile I'm
  of the opinion that all the extended rdb's will eventually turn into an
  oodb (at least at the conceptual level).

  Because of this it seems most application developers will decide to "convert"
  via the route with the least slope - by staying with an evolving rdb... 

  Comments?
==============================================================================

     //        o    "Mommy mommy I don't want to go to Europe"  
   (O)        ( )    Jo Mitchell 
  /    \_____( )     (408)473-6273
 o  \         |      ARPA: mitchell@wdl1.fac.ford.com 
    /\____\__/
  _/_/   _/_/
~

dlw@odi.com (Dan Weinreb) (08/09/89)

In article <3560052@wdl1.UUCP> mitchell@wdl1.UUCP (Jo Mitchell) writes:

     For those of us who are interested in CAD/CAM, CASE applications ...

     After watching the oodb action and "extended" rdb action for awhile I'm
     of the opinion that all the extended rdb's will eventually turn into an
     oodb (at least at the conceptual level).

     Because of this it seems most application developers will decide to "convert"
     via the route with the least slope - by staying with an evolving rdb... 

     Comments?

Many CAD and CASE applications currently don't use any existing DBMS,
relational or otherwise.  Or if they do, they only use it at a high
level of granularity, or for peripheral functions.  Few or none of
them use a relational DBMS to store, say, individual transistors, or
whatever are the small elements in which the program primarily deals.
Since they're not using a relational DBMS now, there's no issue of
"staying with an evolving rdb".

Dan Weinreb		Object Design, Inc.		dlw@odi.com

rich@osc.COM (Richard Fetik) (08/09/89)

In article <3560052@wdl1.UUCP> mitchell@wdl1.UUCP (Jo Mitchell) writes:
>
>
>  For those of us who are interested in CAD/CAM, CASE applications ...
>
>  After watching the oodb action and "extended" rdb action for awhile I'm
>  of the opinion that all the extended rdb's will eventually turn into an
>  oodb (at least at the conceptual level).
>
>  Because of this it seems most application developers will decide to "convert"
>  via the route with the least slope - by staying with an evolving rdb... 
>
>  Comments?

The problem that relational dbs and their to-be-expected object offspring and
frontends will encounter is a relative (pun) lack of performance.  For the
set of sufficiently complex applications object database performance will be
some orders of magnitude faster than could be achieved with an rdbms.  This
set of applications includes engineering tasks such as CAE, CASE, CAD/CAM,
CIM, etc.  There will always be a set of applications for which relational
databases are the right choice, but they can not be 'grown' into object 
databases without throwing large portions of them out and redesigning from
the ground up.  These large portions may include their storage managers, etc.

If the application really needs an oodb, then use one now - why wait ?
If you're satisfied with the performance you're getting from the commercial
rdb or in-house database you're using, then the evolution strategy that the
relational companies are following is not going to make much difference to
you.  Clearly, the object database market is in the position that the
relational market was in some years ago, and the tools and company
stability/size are yet to be released from the object companies, etc.
The customer that can afford to wait for a year or so may choose to do so,
but for the user community that requires the performance and ease of real-world
modelling that can be provided by oo programming and data storage there are
real solutions arriving in your neighborhood shortly.

And yes, I do work for an oodb company, but I chose to do so because I
really believe that this type of product/modelling paradigm will become
the most common eventually, due to the technical advantages.

Whew!!!

Please stick IMHOs in everywhere above;-).  (flame in email please).

Well, you asked for comments...

-- 
					rich@osc.osc.com     415-325-2300
					uunet!lll-winken!pacbell!osc!rich

dennism@menace.rtech.COM (Dennis Moore (x2435, 1080-276) INGRES/teamwork) (08/10/89)

In article 3386, dlw@odi.com (Dan Weinreb) writes:

||In article <3560052@wdl1.UUCP> mitchell@wdl1.UUCP (Jo Mitchell) writes:
||
||For those of us who are interested in CAD/CAM, CASE applications ...
||
||After watching the oodb action and "extended" rdb action for awhile I'm
||of the opinion that all the extended rdb's will eventually turn into an
||oodb (at least at the conceptual level).
||
||Because of this it seems most application developers will decide to "convert"
||via the route with the least slope - by staying with an evolving rdb... 
||
||Comments?
|
|Many CAD and CASE applications currently don't use any existing DBMS,
|relational or otherwise.  Or if they do, they only use it at a high
|level of granularity, or for peripheral functions.  Few or none of
|them use a relational DBMS to store, say, individual transistors, or
|whatever are the small elements in which the program primarily deals.
|Since they're not using a relational DBMS now, there's no issue of
|"staying with an evolving rdb".
|
|Dan Weinreb		Object Design, Inc.		dlw@odi.com

This is common disinformation that OODB companies have been spreading in
an attempt to generate a "need" for their product.  Most CASE companies
use RELATIONAL databases at the hearts of their products.  For instance,
Cadre (teamwork) have used a number of commercial databases on different
platforms, and are forging a MUCH CLOSER relationship with my company (RTI).
IDE (Software through Pictures) uses an in-house RDBMS called TROLL, and are
forging a MUCH CLOSER relationship with Sybase.

Many databases have substantial object-oriented features;  many of these
databases are traditional RDBMSs.  There is no exclusivity between OO- and
R- DBMSs.  For instance, INGRES (from RTI, my company) has stored database
procedures and an object-oriented dictionary in the current product.  We
have announced at our users' convention that we will have rules (i.e. when
something happens to data, perform some action), triggers, alerters, and
substantial OO constructs in our 4GL.  These features will improve our
usability in several types of applications: data dictionaries
(CASE/CAD/CAM/CAE/etc.), computer integrated manufacturing (CIM), expert
systems, and others.
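
To make the "rule" idea above concrete (when something happens to data,
perform some action), here is a minimal sketch using an SQLite trigger
from Python.  It is not INGRES syntax; the table, column, and trigger
names are invented for the illustration.

```python
import sqlite3

def demo_rule():
    """Fire an action automatically when a row changes (SQLite, not INGRES)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE part  (id INTEGER PRIMARY KEY, name TEXT, qty INTEGER);
        CREATE TABLE audit (part_id INTEGER, old_qty INTEGER, new_qty INTEGER);
        -- The "rule": whenever qty is updated, record the change.
        CREATE TRIGGER part_qty_rule AFTER UPDATE OF qty ON part
        BEGIN
            INSERT INTO audit VALUES (OLD.id, OLD.qty, NEW.qty);
        END;
    """)
    db.execute("INSERT INTO part VALUES (1, 'resistor', 100)")
    db.execute("UPDATE part SET qty = 90 WHERE id = 1")
    rows = db.execute("SELECT part_id, old_qty, new_qty FROM audit").fetchall()
    db.close()
    return rows
```

The application only issues the UPDATE; the audit row is written by the
database itself.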

I suggest that we will have all these object oriented features before these
OODBMS companies have distributed database, development tools, bug elimination,
installed base, customer-driven features, third party developers, high
performance, and all the other things we expect for our $2K per user.

Dennis Moore, my own opinions etc etc etc

NOTE:  this was not intended as a commercial endorsement;  I merely used the
examples I know best.

jack@odi.com (Jack Orenstein) (08/10/89)

In article <3560052@wdl1.UUCP>, mitchell@wdl1.UUCP (Jo Mitchell) writes:
> 
> 
>   For those of us who are interested in CAD/CAM, CASE applications ...
> 
>   After watching the oodb action and "extended" rdb action for awhile I'm
>   of the opinion that all the extended rdb's will eventually turn into an
>   oodb (at least at the conceptual level).
> 
>   Because of this it seems most application developers will decide to "convert"
>   via the route with the least slope - by staying with an evolving rdb...

I assume that by "extended rdb" you mean a relational database system
that permits instances of object classes to be stored in fields of
tuples, instead of just numbers and strings.

There are different kinds of application developers, and they will
have different paths to (or around) DBMSs.  An important question is
what are these developers doing about persistent storage now?  Some
CAx application developers store data in DBMSs, using "long fields",
in which they can dump and then retrieve byte strings that have to be
interpreted in the application code. Retrieval based on the content of
such fields is possible only to the extent that other int or string
attributes store summary information.  An extended relational DBMS can
certainly be built on top of a long field facility, and it would be
surprising if a number of relational DBMSs did not go this route,
turning their relational systems into "born-again" OO DBMSs. This
would probably be appealing to CAx developers who are currently using
long fields.
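
A minimal sketch of the "long field" pattern described above, assuming
an SQLite BLOB column stands in for the long field.  The shape table,
the n_points summary column, and the packing format are all hypothetical.

```python
import sqlite3
import struct

def pack_polygon(points):
    """Serialize (x, y) integer pairs into an opaque byte string."""
    flat = [c for p in points for c in p]
    return struct.pack(f"<{len(flat)}i", *flat)

def unpack_polygon(blob):
    """Only the application knows how to interpret the bytes."""
    n = len(blob) // 4                      # 4 bytes per packed int
    flat = struct.unpack(f"<{n}i", blob)
    return list(zip(flat[0::2], flat[1::2]))

def demo_long_field():
    db = sqlite3.connect(":memory:")
    # n_points is the summary attribute the DBMS can actually query on.
    db.execute("CREATE TABLE shape "
               "(id INTEGER PRIMARY KEY, n_points INTEGER, geometry BLOB)")
    shapes = [(1, [(0, 0), (4, 0), (4, 3)]),
              (2, [(0, 0), (1, 0), (1, 1), (0, 1)])]
    for sid, pts in shapes:
        db.execute("INSERT INTO shape VALUES (?, ?, ?)",
                   (sid, len(pts), pack_polygon(pts)))
    # The DBMS can select on the summary column ...
    row = db.execute(
        "SELECT id, geometry FROM shape WHERE n_points = 3").fetchone()
    db.close()
    # ... but the geometry itself is decoded in application code.
    return row[0], unpack_polygon(row[1])
```

A query such as "all shapes containing the point (4, 3)" cannot be
expressed against the geometry column here; it would need either another
summary attribute or a scan that decodes every blob in the application.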

On the other hand, many CAD/CAM and CASE developers have bypassed
relational DBMSs due to performance problems.  Such people have built
their own, or have built on top of file systems. Since these users are
not using any DBMSs currently, I don't agree that the proposed
conversion scenario applies to them. Many users who have gone this
route are becoming aware that OO DBMSs will meet their requirements
for expressiveness, integration with the host language and, maybe most
importantly, performance. I don't think that such users will be
convinced by a thin layer of paint on top of something they considered
and rejected before.

Jack Orenstein
Object Design, Inc.

davidm@cimshop.UUCP (David Masterson) (08/11/89)

>On the other hand, many CAD/CAM and CASE developers have bypassed
>relational DBMSs due to performance problems.
>
Could you elaborate on the performance problems in this case?  I mean is it
inherent in what the CAD/CAM or CASE developers are doing that leads to
performance problems with relational databases or is it just they chose the
wrong (aka. slow) relational database system?  Perhaps there are things that
OODBMSs can be optimized to do better than a relational database system, but I
still think that a relational database can provide a strong foundation for an
object-oriented database.  I'd like to hear why that can't be true.

David Masterson
uunet!cimshop!davidm

speyer@joy.cad.mcc.com (Bruce Speyer) (08/11/89)

In article <458@cimshop.UUCP> davidm@cimshop.UUCP (David Masterson) writes:
>>On the other hand, many CAD/CAM and CASE developers have bypassed
>>relational DBMSs due to performance problems.
>>
>Could you elaborate on the performance problems in this case?  I mean is it
>inherent in what the CAD/CAM or CASE developers are doing that leads to
>performance problems with relational databases or is it just they chose the
>wrong (aka. slow) relational database system?  Perhaps there are things that
>OODBMSs can be optimized to do better than a relational database system, but I
>still think that a relational database can provide a strong foundation for an
>object-oriented database.  I'd like to hear why that can't be true.
>
>David Masterson
>uunet!cimshop!davidm

If an application must cross its process boundary in order to communicate with
the database system it probably is at least two orders of magnitude too slow.
That is why all of the C++ based OODBMS efforts are using the application memory
heap for the cache.

CAE applications (I don't know about CASE) are highly iterative and work around
directed graphs.  The cost of navigating or accessing attributes must be as few
instructions as possible.

I don't know of a relational system which can or could support this type of
activity.  The C++ based OODBMS systems are being developed to meet this
criterion.

Bruce Speyer / MCC CAD Program                        WORK: [512] 338-3668
3500 W. Balcones Center Dr.,  Austin, TX. 78759       ARPA: speyer@mcc.com 


mitchell@wdl1.UUCP (Jo Mitchell) (08/11/89)

	So does this mean an object oriented dbms can not, by performance
	necessity, be built above a rdbms?  (If performance is the key, then
	an oodbms built in such a manner would require some heavy duty
	optimization - the joins would be incredible).  It also
	makes sense that adding another layer would make a slower
	system - comparable to an extended rdbms.

	Now, if the oodbms is not built above the "proven" relational algebra
	what should it be built above? Prolog? Now what are we talking about?
	Object-oriented DBs or deductive DBs?
==============================================================================

     //        o      ARPA: mitchell@wdl1.fac.ford.com
   (O)        ( )     
  /    \_____( )	"Mommy mommy I don't want to go to Europe"	
 o  \         |  
    /\____\__/
  _/_/   _/_/
~

dennism@menace.rtech.COM (Dennis Moore (x2435, 1080-276) INGRES/teamwork) (08/11/89)

There are some very complicated queries that CASE tools generally need to
do, which can be slow, regardless of whether an OODB or RDBMS is used,
depending on the implementation.  For instance, "get me all systems which
reference this variable" can be a many-table join when coded in the simplest,
most normalized, least RDBMS-efficient way.  Another example is "save this
object as a new version of this object," which can involve obtaining a new
logical key (a bottleneck) and saving a new version of the object in a
catalog (another bottleneck).  There are efficient and inefficient ways to
code this and other CASE-necessary functions.
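
As an illustration of the first query above, here is a minimal sketch of
"get me all systems which reference this variable" as a multi-table join
over a fully normalized schema.  The schema, table names, and sample rows
are hypothetical; a real CASE dictionary would have many more tables.

```python
import sqlite3

def systems_referencing(variable_name):
    """Return names of systems with a module that references the variable."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE system   (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE module   (id INTEGER PRIMARY KEY, system_id INTEGER, name TEXT);
        CREATE TABLE variable (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE var_ref  (module_id INTEGER, variable_id INTEGER);
        INSERT INTO system   VALUES (1, 'billing'), (2, 'payroll'), (3, 'inventory');
        INSERT INTO module   VALUES (10, 1, 'invoice'), (20, 2, 'paycheck'), (30, 3, 'stock');
        INSERT INTO variable VALUES (100, 'tax_rate'), (200, 'bin_count');
        INSERT INTO var_ref  VALUES (10, 100), (20, 100), (30, 200);
    """)
    # Four tables, three joins: one per hop from system down to variable.
    rows = db.execute("""
        SELECT DISTINCT s.name
        FROM system s
        JOIN module m   ON m.system_id = s.id
        JOIN var_ref r  ON r.module_id = m.id
        JOIN variable v ON v.id = r.variable_id
        WHERE v.name = ?
        ORDER BY s.name
    """, (variable_name,)).fetchall()
    db.close()
    return [name for (name,) in rows]
```

Whether this is fast or slow in practice depends on indexing and on how
far the schema is denormalized, which is the point being made above.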

Another area of concern is generally that CASE models are highly volatile.
For instance, the name (which is often a primary key) of an object may be
changed many times; the same object may appear in multiple locations of the
same or different models; objects are inserted, deleted, and resurrected;
and many thousands of objects may be in the same "databases."

I think many CASE vendors (our own partners [Cadre] included) tried to
implement poorly-defined data models of their CASE dictionaries on top of
poorly chosen RDBMS's with very little in-house knowledge of RDBMS techniques
at some time in the past, and came up with (surprise!) a poorly performing
system.  In addition, CASE vendors tend to be proprietary-minded and suffer
from a large dose of NIH syndrome, not to mention "we can do anything better
than the big guys" disease.  In our partnership, we have educated our
counterparts, prototyped a better design, and convinced them that RDBMS's
are the way to go.

-- Dennis Moore, Manager, INGRES/teamwork CASE product development
My own opinions blahhhhhhhhh

dlw@odi.com (Dan Weinreb) (08/11/89)

In article <3324@rtech.rtech.com> dennism@menace.rtech.COM (Dennis Moore (x2435, 1080-276) INGRES/teamwork) writes:

   |Many CAD and CASE applications currently don't use any existing DBMS,
   |relational or otherwise.  Or if they do, they only use it at a high
   |level of granularity, or for peripheral functions.  Few or none of
   |them use a relational DBMS to store, say, individual transistors, or
   |whatever are the small elements in which the program primarily deals.
   |Since they're not using a relational DBMS now, there's no issue of
   |"staying with an evolving rdb".

   This is common disinformation that OODB companies have been spreading in
   an attempt to generate a "need" for their product.  

If you intend to use comp.databases as a forum for insult and
invective rather than information and discussion, I won't continue to
reply to your postings.

						       Most CASE companies
   use RELATIONAL databases at the hearts of their products.  For instance,
   Cadre (teamwork) have used a number of commercial databases on different
   platforms, and are forging a MUCH CLOSER relationship with my company (RTI).
   IDE (Software through Pictures) uses an in-house RDBMS called TROLL, and are
   forging a MUCH CLOSER relationship with Sybase.

I stand by my statement, above.  The largest U.S. CASE company, Index
Technologies, does not use any DBMS in its product.  They have
carefully considered the question and decided that existing DBMS
technology is inadequate for what they want to do.  The major ECAD
companies do not use any RDBMS for anything, or at least not for
anything at the heart of their systems.

To explain what I mean by this, consider the following questions:

Can you name one serious ECAD system in which each gate and each wire
of a schematic is represented by one or more tuples in a relational
database system?  That runs simulations or design rule checks by accessing
the relational database system for each circuit element?

Can you name one serious CASE system in which each, say, box and
connection of a dataflow diagram is represented by one or more tuples
in a relational database system?  That refreshes its video display by
accessing a relational database for each of these little elements?

This isn't generally done because (a) the performance would be
unacceptable, and (b) the amount of programming needed to translate
between the datatypes and computational constructs of a relational
database and of a programming language would be too painful.

Those CAD systems that are built on relational databases use them
almost exclusively for selection, and sometimes for projection, but
perform or precompute joins in the program memory of the design tools.
Such tools read the appropriate chunk of the database at startup, and
build internal record and pointer structures from it in virtual
memory.  Thus, a design session starts by copying one part of the disk
into another.  At the end of the session, the internal structures are
converted back to tuples and written to the database.  This loading
and unloading can be very slow compared to the time to make a small
change in a design.  Most CAD systems don't even work this way: they
store their data in files, in the operating system's file system.
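
The load/unload cycle described above can be sketched as follows.  This
is only an illustration under assumed names: the gate and wire tables and
the Gate class are hypothetical, and SQLite stands in for the relational
system.

```python
import sqlite3

class Gate:
    def __init__(self, gate_id, kind):
        self.id, self.kind = gate_id, kind
        self.outputs = []      # direct references to downstream Gate objects

def load(db):
    """Session start: read tuples and rebuild the pointer structure in memory."""
    gates = {gid: Gate(gid, kind)
             for gid, kind in db.execute("SELECT id, kind FROM gate")}
    for src, dst in db.execute("SELECT src, dst FROM wire ORDER BY src, dst"):
        gates[src].outputs.append(gates[dst])   # the join is precomputed here
    return gates

def unload(db, gates):
    """Session end: flatten the in-memory objects back into tuples."""
    with db:                   # one transaction for the whole write-back
        db.execute("DELETE FROM wire")
        db.execute("DELETE FROM gate")
        db.executemany("INSERT INTO gate VALUES (?, ?)",
                       [(g.id, g.kind) for g in gates.values()])
        db.executemany("INSERT INTO wire VALUES (?, ?)",
                       [(g.id, o.id) for g in gates.values() for o in g.outputs])

def demo_session():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE gate (id INTEGER PRIMARY KEY, kind TEXT);
        CREATE TABLE wire (src INTEGER, dst INTEGER);
        INSERT INTO gate VALUES (1, 'AND'), (2, 'OR');
        INSERT INTO wire VALUES (1, 2);
    """)
    gates = load(db)
    gates[3] = Gate(3, "NOT")              # one small design change ...
    gates[2].outputs.append(gates[3])
    unload(db, gates)                      # ... still rewrites everything
    wires = db.execute("SELECT src, dst FROM wire ORDER BY src, dst").fetchall()
    db.close()
    return wires
```

Note that the single added gate triggers a full write-back of every
tuple, which is the cost being contrasted with the small size of the edit.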

   Many databases have substantial object-oriented features;  

Unfortunately, the phrase "object-oriented" is used to cover so many
different areas that it's hard to conduct a meaningful conversation
about what "object-oriented database" means.  Certainly, you can store
strings representing SQL strings in a relational database.  Certainly,
you can add "rule" and "trigger" features to a relational database
system.  And these are useful, and people will use them.

However, the introduction of these features still won't result in each
gate and each wire being represented as an element in the database
system.  The really interesting data like gates and wires will still
have to be stored in files in a file system, just as they are now in
every real ECAD system.

In the view of the people in the OODBMS companies, we will have
succeeded when there is no longer any need for a CAD/CASE company to
use the operating system's file system for anything at all, and when
the CAD/CASE tool can be written in a single, unified language, with
no translation between normalized relational tuples on the one hand,
and a programming language with its type system on the other hand.
And all this without performance degradation.

A particular requirement is that fetching a data value out of the
database system must be as fast as fetching a component of a structure
(record) in the programming language.

That's what I really mean by "object-oriented database system".  A
relational system with extra added "object-oriented features" like
rules and triggers, while it has its uses, does not solve the problem
that we are trying to solve.  Those "features" are beside the point;
beside our point, anyway.  These problems cannot be solved by taking a
relational database system and adding some new "features".

We have presented our ideas to a range of leading CAD, CASE, and
related companies.  Many of them are very interested in seeing such a
system, and most realize that they aren't likely to get it from
relational database technology.  The leading technical people at these
companies are quite sophisticated about software technology; they
can't be "fooled" by simplistic "disinformation".  Being
sophisticated, they are of course skeptical of all future claims until
they see real working systems.  But they also have a good
understanding of what exists now, and why it works the way it does.

In fact, several of these companies, who can't wait for the new
OODBMS's and obviously can't be sure when those new systems will
appear, have been working on in-house solutions, usually very
specialized OODBMS's (in the sense that I use the term) tailored for
their own application.  They realize that their in-house systems are
stopgap measures, and hope to replace them with a more general,
complete, tuned commercial OODBMS product at some point.  They would
not be doing this if they saw the solution coming soon from the
relational companies.  (The solution to the problem I'm talking about
-- not to the ones that you're talking about, for which relational
technology works OK.)

Certainly many CAD/CASE systems will still need to communicate with
relational database systems, since a lot of important data is and will
be stored in RDBMS's.  OODBMS's will not replace RDBMS's, and are not
trying to.  Rather, they are trying to answer new needs that will not
be answered by RDBMS's.  The mature, advanced CAD/CASE systems of the
future will use both kinds of database system.

For anyone interested in a deeper discussion of these points, I
recommend a short paper called "Making Database Systems Fast Enough
for CAD Applications", by David Maier.  It's in an excellent anthology
entitled "Object-Oriented Concepts, Databases, and Applications",
edited by Won Kim and Frederick Lochovsky, ACM Press/Addison Wesley
1989, ISBN 0-201-14410-7.  It's a new book and should be relatively
easy to find.

dlw@odi.com (Dan Weinreb) (08/11/89)

In article <3560053@wdl1.UUCP> mitchell@wdl1.UUCP (Jo Mitchell) writes:

   So does this mean an object oriented dbms can not, by performance
   necessity, be built above a rdbms?  (If performance is the key, then
   an oodbms built in such a manner would require some heavy duty
   optimization - the joins would be incredible).  It also
   makes sense that adding another layer would make a slower
   system - comparable to an extended rdbms.

That's right, by my definition of "object-oriented DBMS".  Although a
relational database system can be enhanced with many features that are
often associated with "object-orientation", the kind of DBMS needed by
CAD/CASE/CAP/etc systems for "high performance for fine-grain
manipulation of small, persistent objects" requires a totally
different underlying storage architecture.  Bruce Speyer's point about
the high cost of switching address space/process context is quite
right; this is one of the important factors.

(The relational database fans will point out that there are drawbacks
to using the application process's address space, mainly that
application bugs causing "wild stores" can damage data.  Yes, that's
true; it's a tradeoff.  It's not all that much worse than the current
situation, in which application bugs can write garbage to files.)

   Now, if the oodbms is not built above the "proven" relational algebra
   what should it be built above? Prolog? Now what are we talking about?
   Object-oriented DBs or deductive DBs?

No, Prolog and "deduction" don't have much to do with the goals that
we're trying to achieve for CAD/CASE/etc. systems, although these
ideas are interesting and useful for their own domains.  Generally, we
see a need for seamless integration with the underlying programming
language.  The language in question, for our applications, is C++; the
CAD and CASE vendors have made it clear that C++ is what they want.
So C++ must be the starting point.  Then it's possible to go beyond
C++ to provide access to things that a database system does well, such
as queries over sets and representations of relationships.  This is an
area that's still being explored.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

jack@odi.com (Jack Orenstein) (08/12/89)

In article <458@cimshop.UUCP> davidm@cimshop.UUCP (David Masterson)
writes (quoting a posting of mine):

>>On the other hand, many CAD/CAM and CASE developers have bypassed
>>relational DBMSs due to performance problems.
>>
>Could you elaborate on the performance problems in this case?  I mean is it
>inherent in what the CAD/CAM or CASE developers are doing that leads to
>performance problems with relational databases or is it just they chose the
>wrong (aka. slow) relational database system?  Perhaps there are things that
>OODBMSs can be optimized to do better than a relational database system, but I
>still think that a relational database can provide a strong foundation for an
>object-oriented database.  I'd like to hear why that can't be true.
>
>David Masterson
>uunet!cimshop!davidm

Dan Weinreb, a co-worker of mine, has just posted a very thorough
discussion of this point, so I'll refer you to his article.

Jack Orenstein
Object Design, Inc.

jack@odi.com (Jack Orenstein) (08/12/89)

In article <3560053@wdl1.UUCP> mitchell@wdl1.UUCP (Jo Mitchell) writes:
>
>	So does this mean an object oriented dbms can not, by performance
>	necessity, be built above a rdbms?  (If performance is the key, then
>	an oodbms built in such a manner would require some heavy duty
>	optimization - the joins would be incredible).

That's my opinion, anyway. Think of the relational schema that you'd
use to represent a circuit. Every time you wanted to follow some
connection, e.g. from a pin to a wire, a join would be required. To do
any useful work, e.g. a circuit simulation, a very large number of
joins would be required. Not only is there a big join optimization
problem, but the efficiency of the join implementation is crucial.

In an OO DBMS schema, the connections would be represented by
pointers, and following a pointer can be done extremely quickly, on
the order of a few machine instructions. Following a pointer will be
faster than a join (amortized over the number of connections made by
the join), no matter how the join is implemented (judging by the
algorithms currently available).
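
To make the contrast concrete, here is a minimal sketch that answers the
same connectivity question both ways: by following object references and
by a self-join.  The Pin and Wire classes and the pin table are
hypothetical, and the sketch shows only the shape of each approach; it
measures nothing, and since SQLite runs in-process it does not model the
cost of crossing a process boundary.

```python
import sqlite3

class Wire:
    def __init__(self, name):
        self.name = name
        self.pins = []

class Pin:
    def __init__(self, name, wire):
        self.name = name
        self.wire = wire           # direct reference: the "pointer"
        wire.pins.append(self)

def neighbors_by_pointer(pin):
    """Follow pin -> wire -> pins with plain attribute access."""
    return sorted(p.name for p in pin.wire.pins if p is not pin)

def neighbors_by_join(db, pin_name):
    """The same question as a self-join on a pin table keyed by wire."""
    rows = db.execute("""
        SELECT b.name
        FROM pin a JOIN pin b ON a.wire = b.wire
        WHERE a.name = ? AND b.name <> a.name
        ORDER BY b.name
    """, (pin_name,)).fetchall()
    return [name for (name,) in rows]

def demo():
    net = Wire("net1")
    pins = {n: Pin(n, net) for n in ("U1.out", "U2.in", "U3.in")}
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE pin (name TEXT PRIMARY KEY, wire TEXT)")
    db.executemany("INSERT INTO pin VALUES (?, ?)",
                   [(p.name, p.wire.name) for p in pins.values()])
    result = (neighbors_by_pointer(pins["U1.out"]),
              neighbors_by_join(db, "U1.out"))
    db.close()
    return result
```

Both calls return the same pins.  A simulator would repeat this step for
every connection it traverses, which is where the per-step cost matters.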

This is why I believe that a relational DBMS (or an OO DBMS built on
top of a relational DBMS) cannot deliver the performance required for
dealing with large numbers of small objects. 

>
>	Now, if the oodbms is not built above the "proven" relational algebra
>	what should it be built above? Prolog? Now what are we talking about?
>	Object-oriented DBs or deductive DBs?

I don't see how Prolog enters the picture.

Many OO DBMSs start with a programming language and add persistence
and possibly semantic data modeling constructs.  Of the "first
generation" OO DBMSs, Vbase started with an extension of C, GemStone
started with Smalltalk, and Statice started with Lisp. Object Design
and other "second generation" companies are starting with C++.  The
advantage of this approach is that application developers no longer
have to worry about two type systems (one for the host language and
one for the DBMS) and two namespaces. Also, the problems inherent in
translating complex objects between host language structures and
relations in the database disappear.

CAx developers are in such a difficult position because they need the
expressiveness of a general-purpose programming language, and the
persistence, atomicity and recoverability of DBMSs. DBMS query
languages aren't expressive enough, and languages offer very little in
the way of persistence and nothing to support atomicity and
recoverability.  Most of the CAx developers we've talked to have
resolved this problem by using a programming language and then
providing their own persistence on top of file systems.  The goal of
an OO DBMS is to provide the best features of general-purpose (OO)
languages and DBMSs in a single package.

Jack Orenstein
Object Design, Inc.

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/12/89)

dennism@menace.rtech.COM (Dennis Moore, INGRES/teamwork) writes:
>I suggest that we will have all these object oriented features before these
>OODBMS companies have distributed database, development tools, bug elimination,
>installed base, customer-driven features, third party developers, high
>performance, and all the other things we expect for our $2K per user.

How about just shared, persistent data?  Is anyone out there willing to
describe his production OODB and state how many concurrent users access
it?  How many are actively updating it on a typical day?

-- Jon

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/12/89)

jack@odi.com (Jack Orenstein) writes:
>I assume that by "extended rdb" you mean a relational database system
>that permits instances of object classes to be stored in fields of
>tuples, instead of just numbers and strings.

What advantages of this approach would you see over support for
domains and abstract data types?

-- Jon

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/12/89)

speyer@joy.cad.mcc.com (Bruce Speyer) writes:

>If an application must cross its process boundary in order to
>communicate with the database system it probably is at least two orders
>of magnitude too slow.  That is why all of the C++ based OODBMS efforts
>are using the application memory heap for the cache.

Could you provide some performance measurement data that qualify
and quantify this assertion?

-- Jon

dgh@unify.UUCP (David Harrington) (08/12/89)

In article <3324@rtech.rtech.com> dennism@menace.UUCP (Dennis Moore (x2435, 1080-276) INGRES/teamwork) writes:
>
>
>I suggest that we will have all these object oriented features before these
>OODBMS companies have distributed database, development tools, bug elimination,
>installed base, customer-driven features, third party developers, high
>performance, and all the other things we expect for our $2K per user.
>

I agree.  Look at the OODBMS companies like Ontologic.  They are either living
off an existing RDBMS which they are trying to re-cast as OO, or they are
dying.

Look at Servio Logic.  It has been building GemStone for at least 5 years, and
as of April of this year had maybe 30 systems installed -- mostly 4 user
systems in R&D labs.  The only reason Servio is still around is that they are
funded by the House of Sampoerna, an Indonesian tobacco company run by a
41-year old Chinese "Tai-Pan" named T. Pao Liem who has MUCH more money than
he needs.  

GemStone has no front-end, no distributed database, no 3rd party
developers (other than a small group of Servio employees in Alameda trying to
build an MRP system in Smalltalk (!) that uses GS as a structure server).

They have based their marketing strategy, such as it is, on an assumption that
the market for OODBMS, which they say is "applications requiring a LOT of
COMPLEX data", will mature at least 5X as fast as the relational market did.

Figuring the R-market took 12-15 years from academia to maturity, they project
(or have been projecting for some time) that the OODBMS market would take off
in 1989. 

I think the path to OODBMS is evolutionary, especially given the huge install
base of RDBMS and applications that use them.

jack@odi.com (Jack Orenstein) (08/14/89)

In article <19@dgis.daitc.mil> jkrueger@dgis.daitc.mil (Jonathan Krueger) writes:
>jack@odi.com (Jack Orenstein) writes:
>>I assume that by "extended rdb" you mean a relational database system
>>that permits instances of object classes to be stored in fields of
>>tuples, instead of just numbers and strings.
>
>What advantages of this approach would you see over support for
>domains and abstract data types?

I don't have Codd's 1970 paper in front of me, but in his 1979 paper
on RM/T, Codd defines a relation as a subset of D1 x ... x Dn, where
each Di is a domain. He then defines a relational database as a
collection of relations whose defining domains are simple, i.e.
"nondecomposable by the database management system".

Several relational DBMS vendors go beyond this definition, and provide
non-simple domains, in the same way that many vendors of Pascal
compilers add features that go beyond the language definition.
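
As a small worked example of the definition above, a relation over
domains D1, ..., Dn is just a set of tuples drawn from their Cartesian
product.  The domains and the parts relation below are made up for
illustration.

```python
from itertools import product

def is_relation(tuples, *domains):
    """True if every tuple lies in the Cartesian product D1 x ... x Dn."""
    return set(tuples) <= set(product(*domains))

# Two simple (nondecomposable) domains and a relation over them.
PART_IDS = {1, 2, 3}
COLORS = {"red", "green"}
parts = {(1, "red"), (2, "green")}
```

Here is_relation(parts, PART_IDS, COLORS) holds, while a tuple such as
(1, "blue") falls outside the product and so cannot appear in the relation.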

I'm not sure what distinction you're drawing between an extended
relational DBMS and one storing domains and ADT (instances). Simple
domains are accommodated by Codd's definition of the relational model.
ADTs and object classes require, in addition, the capability to refer
to functions of the ADT or object class from the query language.  The
set of all instances of an ADT or object class are certainly
comparable to a domain, but the procedural part of these constructs
might violate the "nondecomposable" requirement. This is a bit fuzzy
to me - how is nondecomposable defined? I cannot make a convincing (to
me) argument explaining why stacks are any more or less decomposable
than ints. 

If your question is about ADTs vs. object classes, I see a couple of
differences. First, an ADT is defined axiomatically, e.g. pop(push(s:
stack, x: int)) --> s, while an object class is more of a programming
language construct. (Of course, any practical implementation of an ADT
facility is not axiomatic.) Second, object classes are usually
associated with language features such as inheritance, polymorphism,
and other ideas that have turned up in various combinations in recent
languages.
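
The first difference can be made concrete with a minimal sketch in
Python. It assumes a hypothetical immutable Stack type (the names Stack,
push, pop and CountingStack are illustrative and not from any product
discussed here). The axiom pop(push(s, x)) --> s is checked as an
equation, while the subclass shows the inheritance an object class
would add.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Stack:
    """Hypothetical immutable stack; items holds the bottom element first."""
    items: Tuple[int, ...] = ()


def push(s: Stack, x: int) -> Stack:
    # Return a new stack with x on top; s itself is unchanged.
    return Stack(s.items + (x,))


def pop(s: Stack) -> Stack:
    # Return a new stack without the top element.
    if not s.items:
        raise IndexError("pop from empty stack")
    return Stack(s.items[:-1])


class CountingStack(Stack):
    """Object-class flavour: inherits the representation, adds behaviour."""

    def depth(self) -> int:
        return len(self.items)


s = Stack((1, 2))
# The ADT axiom from the text, stated as an equality on values.
axiom_holds = pop(push(s, 3)) == s
```

The axiom holds here only because the sketch uses value equality on an
immutable representation; a mutable implementation would need a
different way to state it.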

The comparison that I am more interested in is relational (plus
objects or ADTs) vs. OO DBMSs. There are at least three aspects to
this comparison:

	1. From the point of view of an application developer, what is
	   the correct paradigm of database usage: host language +
	   DBMS (as is currently the situation with relational DBMSs)?
	   A database programming language (in which certain constructs
	   - relations - have special properties such as
	   persistence)? A persistent programming language?

	2. Performance issues - this has been the focus of much of
	   the recent discussion.

	3.  What role can the relational model play in
	    "non-traditional" (CAx) applications? Suppose that you
	    have a persistent programming language for use in 
	    developing CAx applications, and that you were interested
	    in extending the language by adding, in some form, 
	    operations of the relational algebra. How would this
	    be done? Would you add a relation data type as a builtin
	    type? Could it be an object class? Would all relations be
	    of the same class, or would there be a different class for
	    each tuple type? What happens when a join is done? An
	    alternative is to draw an analogy between relations and
	    functions, or relations and object classes. A natural join
	    would then generate a new function or object class.
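
One possible reading of the last alternative in item 3, sketched in
Python under stated assumptions: each tuple type is its own class, and a
natural join builds a new class for its result. The function
natural_join and the Component/Part data are hypothetical, chosen only
for illustration.

```python
from collections import namedtuple


def natural_join(left, right, name="Joined"):
    """Join two lists of namedtuples on every field name they share.

    Returns (new_tuple_class, rows): the join creates a fresh tuple type.
    """
    if not left or not right:
        raise ValueError("this sketch reads field names from the first rows")
    lf, rf = left[0]._fields, right[0]._fields
    shared = [f for f in lf if f in rf]
    extra = [f for f in rf if f not in lf]
    Joined = namedtuple(name, list(lf) + extra)
    rows = []
    for l in left:
        for r in right:
            # Rows match when they agree on all shared fields.
            if all(getattr(l, f) == getattr(r, f) for f in shared):
                rows.append(Joined(*l, *(getattr(r, f) for f in extra)))
    return Joined, rows


Component = namedtuple("Component", ["part_no", "diagram_no"])
Part = namedtuple("Part", ["part_no", "description"])

components = [Component(7, 1), Component(9, 1), Component(7, 2)]
parts = [Part(7, "NAND gate"), Part(9, "resistor")]

PlacedPart, placed = natural_join(components, parts, "PlacedPart")
```

Here every join result gets its own class, which answers "a different
class for each tuple type" in the affirmative; a single generic Relation
class would be the other design.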

Because I work on an OO DBMS product, the marketplace determines
issues 1 and 2 for me, and the performance requirements are so strict
that they influence just about every other issue. Issue 3 is primarily
of academic interest to me, as I'm interested in how the relational and
OO models relate to one another.

Jack Orenstein
Object Design, Inc.

speyer@joy.cad.mcc.com (Bruce Speyer) (08/14/89)

In article <20@dgis.daitc.mil> jkrueger@dgis.daitc.mil (Jonathan Krueger) writes:
>speyer@joy.cad.mcc.com (Bruce Speyer) writes:
>
>>If an application must cross its process boundary in order to
>>communicate with the database system it probably is at least two orders
>>of magnitude too slow.  That is why all of the C++ based OODBMS efforts
>>are using the application memory heap for the cache.
>
>Could you provide some performance measurement data that qualify
>and quantify this assertion?
>
>-- Jon

No, I don't have the numbers or the time to work them up.  Perhaps somebody else
could provide actual statistics and even disprove my assertion.  It would be
interesting to hear from somebody involved with the HP Iris system which is
based upon a relational database.

About 3 years ago I tried putting an electronic information model on top of a
relational system.  It took about 30-40 times longer to netlist a circuit than
it did using a fairly inefficient internally developed memory-based database
system. An operation such as packaging the electronics is much worse since it
must traverse much more of the electronic information model and be constantly
referring to the library portion of the model which was distributed to another
database (making the join operation much more expensive).

Compare the cost of processing a tuple at a time to a C++ style database.  If
the object is in-memory then optimally an indirect reference and a test is all
that is required to traverse a relation or access an attribute.
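
A minimal sketch of what following in-memory references looks like,
written in Python rather than C++ and using hypothetical Component, Pin
and Net classes (not the MCC model). It illustrates the access pattern
only and makes no performance claim.

```python
class Component:
    def __init__(self, name):
        self.name = name
        self.pins = []


class Pin:
    def __init__(self, component, label):
        self.component = component   # direct reference, not a foreign key
        self.label = label
        self.net = None
        component.pins.append(self)


class Net:
    def __init__(self, name):
        self.name = name
        self.pins = []

    def connect(self, pin):
        pin.net = self
        self.pins.append(pin)


def netlist(nets):
    """Walk each net's pin references and list 'component.pin' strings."""
    return {net.name: [f"{p.component.name}.{p.label}" for p in net.pins]
            for net in nets}


u1, r1 = Component("U1"), Component("R1")
vcc = Net("VCC")
vcc.connect(Pin(u1, "14"))
vcc.connect(Pin(r1, "1"))
result = netlist([vcc])
```

Each step from net to pin to component is an attribute access on an
object already in memory; the relational equivalent would express the
same walk as joins on key columns.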

My apologies for not being able to back up my statements with benchmarks.
Bruce Speyer / MCC CAD Program                        WORK: [512] 338-3668
3500 W. Balcones Center Dr.,  Austin, TX. 78759       ARPA: speyer@mcc.com

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/15/89)

jack@odi.com (Jack Orenstein) writes:
>I'm not sure what distinction you're drawing between an extended
>relational DBMS and one storing domains and ADT (instances).

That was my line!  What distinction are you drawing?

Better yet, could you give a simple example, please?  Something that
highlights the difference between using ADT's versus objects for
defining domains.  Thank you.

-- Jon
-- 
Jonathan Krueger    jkrueger@dgis.daitc.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

marti@ethz.UUCP (Robert Marti) (08/15/89)

With respect to the ongoing debate concerning OODBs vs extended RDBs,
I'd like to see proof (make that circumstantial evidence, if you prefer)
that an OODB which supports traditional basic DBMS features such as
concurrency control, transactions, set-oriented data manipulation,
the ability to define views and to dynamically add new tables/columns,
etc. is

1) faster than a relational system for typical technical/engineering
   applications, and

2) not much slower than a relational system for traditional business
   oriented applications.

How about some benchmarks, controversial as they may be?

Btw:  For me, functionality is a much more important point than
performance.  However, most OODB followers emphasize the superior
performance of OODBs.  So:  Put up or ...  :-)

-- 
Robert Marti                      Phone:      +41 1 256 52 36
Institut fur Informationssysteme
ETH-Zentrum                       CSNET/ARPA: marti%inf.ethz.ch@relay.cs.net
CH-8092 Zurich, Switzerland       UUCP:       ...uunet!mcvax!ethz!marti

dennism@menace.rtech.COM (Dennis Moore (x2435, 1080-276) INGRES/teamwork) (08/15/89)

In article 3452 of comp.databases, speyer@joy.cad.mcc.com (Bruce Speyer) writes:

>In article <20@dgis.daitc.mil> jkrueger@dgis.daitc.mil (Jonathan Krueger) writes:
>>speyer@joy.cad.mcc.com (Bruce Speyer) writes:
>>
>>>If an application must cross its process boundary in order to
>>>communicate with the database system it probably is at least two orders
>>>of magnitude too slow.  That is why all of the C++ based OODBMS efforts
>>>are using the application memory heap for the cache.
>>
>>Could you provide some performance measurement data that qualify
>>and quantify this assertion?
>>
>>-- Jon
>
>No, I don't have the numbers or the time to work them up.  Perhaps somebody else
>could provide actual statistics and even disprove my assertion.  It would be
>interesting to hear from somebody involved with the HP Iris system which is
>based upon a relational database.
>

It is true that changing contexts takes a small number of milliseconds,
depending primarily on the architecture of the CPU (i.e. an 80x86 takes a long
time, because it is a segmented architecture, 68x00 takes the same amount of
time to do a kernel call as a non-kernel call).  However, you must do a context
switch to call a C++ library routine or to call a database routine, so there
is not much difference there.  The difference in instantaneous response present
currently in most DBMS's (OO *OR* R) is that they are client-server (or multi-
server, in the case of INGRES (caveat -- I work for RTI and INGRES is our
product)).  This means to access data, you use IPC (inter-process communication)
rather than a function call.  IPC generally is much slower than a function call,
but let's not forget one *MAJOR* saver here -- the SAME server can serve
literally hundreds of users.  If each had its own linked copy of the C++ data
access routines, there would be so much swapping/paging going on on the host,
that nothing would get done.  Even if linked libraries were used, each user
would have her own data segments etc., and would use many more resources than
the DBMS does currently.  Therefore, I have no issue with the claim that a
single user system is better off with a highly tuned, memory hogging,
specialized access method, than an RDBMS.

>About 3 years ago I tried putting an electronic information model on top of a
>relational system.  It took about 30-40 times longer to netlist a circuit than
>it did using a fairly inefficient internally developed memory-based database
>system. An operation such as packaging the electronics is much worse since it
>must traverse much more of the electronic information model and be constantly
>referring to the library portion of the model which was distributed to another
>database (making the join operation much more expensive).
>

Excuse me, have you heard of distributed database?  INGRES*STAR would allow you
to keep your packaging information in a separate "database," and still do joins
just as if the data were in the same database.  The concept of "a database" (as
opposed to "a different database") basically goes away, as the user can pick
and choose tables from multiple "databases" to be in one STAR database.  Maybe
the reason it was slow was that you didn't know what you were doing.

Let me posit a different architecture for your electronic information model.
Could you have read in all the data into memory from an RDBMS and performed
the same manipulations in-core that you did in your system?  The advantage to
this architecture is that you can lock the records while you are manipulating
them (with THREE WORDS ("FOR DIRECT UPDATE"), as opposed to many lines of code),
you get all the transaction processing capabilities of the DBMS (i.e. rollback,
savepoints, commit), you get all the utilities of the DBMS, etc.  To put it
in a few words, YOU GET THE *MS* FROM THE DBMS, and you do your own processing.

>Compare the cost of processing a tuple at a time to a C++ style database.  If
>the object is in-memory then optimally an indirect reference and a test is all
>that is required to traverse a relation or access an attribute.
>

What a surprise!  In INGRES, there is a concept called a TABLE FIELD (NOTE --
many other databases (such as Gupta's RESULT SETS, Sybase's SETS, etc.) have
the same concept with other terms).  You select a SET AT A TIME, NOT A TUPLE
AT A TIME into the TABLE FIELD.  BTW, do you know that a database oriented to
TUPLE AT A TIME processing is not relational?  By definition, a relational
database can process a SET AT A TIME.  For instance, if the diagram tuple has
a surrogate key DIAGRAM#, which is a foreign key for the components table (which
I will call COMPONENTS), then you could find all the components of a diagram
by the following SQL statement:
	SELECT * FROM COMPONENTS WHERE DIAGRAM# = :diagram_number;
where diagram_number is a C variable (for instance) containing the number of
the host diagram.  The results of this select could be stored in a table field
and manipulated in core.  BTW, all the table field manipulations (i.e.
INSERTROW, DELETEROW, etc.) are in our language, so you don't have to write
list processing classes -- we already did.
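
The set-at-a-time pattern described above can be sketched in Python with
the standard sqlite3 module standing in for INGRES. The table layout,
the column name diagram_no (used here in place of DIAGRAM#) and the
sample rows are assumptions for illustration; the plain Python list is
only a loose analogue of a table field, not the INGRES feature itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE components (part TEXT, diagram_no INTEGER)")
conn.executemany(
    "INSERT INTO components VALUES (?, ?)",
    [("U1", 1), ("R1", 1), ("C3", 2)],
)

diagram_number = 1  # plays the role of the host variable in the post

# One statement fetches the whole set of components for the diagram.
rows = conn.execute(
    "SELECT part, diagram_no FROM components "
    "WHERE diagram_no = :diagram_number ORDER BY part",
    {"diagram_number": diagram_number},
).fetchall()
fetched = list(rows)

# The result set is now in memory and can be edited like any list.
rows.append(("D7", diagram_number))
del rows[0]
```

Nothing in the list edits reaches the database; writing the changes back
would take further INSERT, UPDATE or DELETE statements inside a
transaction.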

So, in summary, whether you use an OO system or an RDBMS (which has OO
features and capabilities), you can process the data in memory.  You STILL
have to get that data from disk and to disk SOMETIME, and the RDBMS will be
better at that.  In addition, the RDBMS already comes with the in-memory
manipulation features.  The RDBMS also protects against hardware and software
(i.e. the break key) failures and provides you with the capability to start
off a process and then back out if you don't like the results.  The RDBMS is
optimized to provide consistency and concurrency for the data.  The OO "faction"
here keeps talking about what RDBMS's don't do, and yet every example so far
has been doable with an RDBMS today.  I am *SURE* that there *ARE* things that
an OODB can do, but RDBMS's are developing new features faster (there are more
people in engineering in *MY* company than in their whole company) and faster.

I would like to point out that only two people are doing this rather poor
defense of the entire OODB industry.  After all, if OO was not a good idea,
we wouldn't be developing even more OO features now.

>My apologies for not being able to back up my statements with benchmarks.

'Nuff said ...

>Bruce Speyer / MCC CAD Program                        WORK: [512] 338-3668
>3500 W. Balcones Center Dr.,  Austin, TX. 78759       ARPA: speyer@mcc.com 
>
>

-- Dennis Moore, my own opinions, etc etc etc

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/16/89)

dennism@menace.rtech.COM (Dennis Moore, INGRES/teamwork) writes:

>...the SAME server can serve literally hundreds of users...
>Therefore, I have no issue with the claim that a single user system is
>better off with a highly tuned, memory hogging, specialized access
>method, than an RDBMS.

We run the latest release of INGRES that RTI sells for Berkeley UNIX on
Pyramid, VAX, and Gould.  None of them supports servers yet.  Our
INGRES applications use about a megabyte of physical memory per
additional active concurrent user on Pyramid.

We regard this performance as adequate.  We bought our system to serve
users, not ration resources.  It would be nice to serve more users with
the same resources, as we anticipate when we receive INGRES 6.0.  But
our users would not be served at all without the development tools that
RTI has been providing since INGRES 3.0.

Therefore I'd like to divide the question: efficient implementation of
a data model versus inherently bad performance of some models for some
operations.  Recent traffic has confused the two issues without
addressing either.  It tells us very little that a current DBMS
performs poorly.  References to applications without specifying their
operations or describing their design tell us nothing.

For instance, Bruce alludes to operations like "netlist a circuit" and
"package the electronics".  It would be wonderful indeed to understand
the electronics that underlies all the computing we do, but I'll settle
for characterizing some operations that engineers need.  Can you
specify these operations in some terms we can understand?  Or simpler
ones?  How might one implement them with a relational data model?  Are
there data models that can be shown inherently better for some of these
operations?

-- Jon
-- 
Jonathan Krueger    jkrueger@dgis.daitc.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your advanced powerful
color bitmapped windowing workstation on a network is emulate an ASR33?

simpson@trwarcadia.uucp (Scott Simpson) (08/17/89)

>Dan Weinreb of Object Design writes:

>I stand by my statement, above.  The largest U.S. CASE company, Index
>Technologies, does not use any DBMS in its product.  They have
>carefully considered the question and decided that existing DBMS
>technology is inadequate for what they want to do.  The major ECAD
>companies do not use any RDBMS for anything, or at least not for
>anything at the heart of their systems.

	Index Technologies recently announced they have selected OB2,
Ontologic's C++ OODB for their products.  Ontologic is one of Object
Design's competitors.
	See "An Object-Oriented VLSI CAD Framework" by Rajiv Gupta,
Wesley H. Cheng, Rajesh Gupta, Ido Hardonag and Melvin A. Breuer in
the May 1989 IEEE Computer.

>Jack Orenstein of Object Design writes:

>Many OO DBMSs start with a programming language and add persistence
>and possibly semantic data modeling constructs.  Of the "first
>generation" OO DBMSs, Vbase started with an extension of C, GemStone
>started with Smalltalk, and Statice started with Lisp.  Object Design
>and other "second generation" companies are starting with C++.  The
>advantage of this approach is that application developers no longer
>have to worry about two type systems (one for the host language and
>one for the DBMS) and two namespaces. Also, the problems inherent in
>translating complex objects between host language structures and
>relations in the database disappear.

I believe that when you stick to straight C++, you also lose seamlessness
(that is, you must now use library calls rather than having persistence
built directly in your language).  C++ doesn't have keywords for persistence.
You could extend C++, but then it wouldn't be C++.  We use Ontologic's VBase
OODB.  This version had its own proprietary language that was seamless.
They dropped it due to the market's resistance to accepting a new language.
They are now coming out with a non-seamless C++ version called OB2.  We don't
know if we'll switch to it.  Our contract is ending.

	Lastly, say hi to Rich Fetik at Object Design.  I knew him when
he worked for Ontologic.  I was wondering where he went...
	Scott Simpson
	TRW Space and Defense Sector
	usc!trwarcadia!simpson  	(UUCP)
	trwarcadia!simpson@usc.edu	(Internet)

davidm@uunet.UU.NET (David S. Masterson) (08/17/89)

>With respect to the ongoing debate concerning OODBs vs extended RDBs,
>I'd like to see proof (make that circumstantial evidence, if you prefer)
>that an OODB which supports traditional basic DBMS features [is]
>	[better than a relational system]
>
The one flaw in this request is that proof of concept can't be provided if the
concept hasn't been defined.  I agree with Jon Krueger in that there is too
much hand-waving in this discussion ("our system is better than yours")
without defining the problem that is trying to be met.

1.  Relational DBs provide things necessary for a multi-user world
(concurrency control, security, etc.) that may or may not be needed in the
object oriented world (perhaps only a specific area [CAD/CASE]).

2.  Object DBs provide things necessary in a single-user world (extreme speed)
that may or may not be needed in the relational world.  Thus the need for
cached objects.

3.  What is the crossover point between the models?  Object-oriented
methodologies include relational methodologies?  Or is it vice versa?

David Masterson
uunet!cimshop!davidm

jack@odi.com (Jack Orenstein) (08/17/89)

In article <1765@ethz.UUCP> marti@ethz.UUCP (Robert Marti) writes:
>With respect to the ongoing debate concerning OODBs vs extended RDBs,
>I'd like to see proof (make that circumstantial evidence, if you prefer)
>that an OODB which supports traditional basic DBMS features such as
>concurrency control, transactions, set-oriented data manipulation,
>the ability to define views and to dynamically add new tables/columns,
>etc. is
>
>1) faster than a relational system for typical technical/engineering
>   applications, and
>
>2) not much slower than a relational system for traditional business
>   oriented applications.
>
>How about some benchmarks, controversial as they may be?

Speaking for the system we're building at Object Design: The system is
based on C++. Concurrency control, transactions, and set-oriented data
manipulation (as well as one-at-a-time processing) will all be present
in our system. View definition is tricky to define - how does it
differ from simply writing another C++ object class? As for
dynamically adding tables and columns: We have set-valued types,
modeled as C++ classes (which are analogous to relational tables),
instances of which can be dynamically allocated, as is the case with
any C++ object class. "Adding columns" is a relational notion that
does not have a clear OO counterpart. (I'd be interested in hearing
about analogies that anyone would care to offer.)

No benchmarks (yet), but a forthcoming posting addresses one aspect of
the performance issue for CAx applications.

Jack Orenstein
Object Design, Inc.

jack@odi.com (Jack Orenstein) (08/17/89)

In article <CIMSHOP!DAVIDM.89Aug16173259@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>>With respect to the ongoing debate concerning OODBs vs extended RDBs,
>>I'd like to see proof (make that circumstantial evidence, if you prefer)
>>that an OODB which supports traditional basic DBMS features [is]
>>	[better than a relational system]
>>
>The one flaw in this request is that proof of concept can't be provided if the
>concept hasn't been defined.  I agree with Jon Krueger in that there is too
>much hand-waving in this discussion ("our system is better than yours")
>without defining the problem that is trying to be met.

Dan Weinreb and I have tried to define the concept in earlier
postings.

From Dan:

   > Unfortunately, the phrase "object-oriented" is used to cover so many
   > different areas that it's hard to conduct a meaningful conversation
   > about what "object-oriented database" means.  Certainly, you can store
   > strings representing SQL strings in a relational database.  Certainly,
   > you can add "rule" and "trigger" features to a relational database
   > system.  And these are useful, and people will use them.
   > 
   > However, the introduction of these features still won't result in each
   > gate and each wire being represented as an element in the database
   > system.  The really interesting data like gates and wires will still
   > have to be stored in files in a file system, just as they are now in
   > every real ECAD system.
   > 
   > In the view of the people in the OODBMS companies, we will have
   > succeeded when there is no longer any need for a CAD/CASE company to
   > use the operating system's file system for anything at all, and when
   > the CAD/CASE tool can be written in a single, unified language, with
   > no translation between normalized relational tuples on the one hand,
   > and a programming language with its type system on the other hand.
   > And all this without performance degradation.
   > 
   > A particular requirement is that fetching a data value out of the
   > database system must be as fast as fetching a component of a structure
   > (record) in the programming language.
   > 
   > That's what I really mean by "object-oriented database system".  A
   > relational system with extra added "object-oriented features" like
   > rules and triggers, while it has its uses, does not solve the problem
   > that we are trying to solve.  Those "features" are beside the point;
   > beside our point, anyway.  These problems cannot be solved by taking a
   > relational database system and adding some new "features".

From me:

   > Based on [interviews with potential customers], we concluded that
   > there is a need for high performance for fine-grain manipulation of
   > small, persistent objects.


>1.  Relational DBs provide things necessary for a multi-user world
>(concurrency control, security, etc.) that may or may not be needed in the
>object oriented world (perhaps only a specific area [CAD/CASE]).
>
>2.  Object DBs provide things necessary in a single-user world (extreme speed)
>that may or may not be needed in the relational world.  Thus the need for
>cached objects.

CAx applications need transactions and multi-user capabilities also,
and we are building in these features. Yes, CAx needs extreme speed,
but design projects are typically carried out by teams, and the
multi-user issues are at least as important as in business
applications.



Jack Orenstein
Object Design, Inc.

dcmartin@lisp.eng.sun.com (David C. Martin) (08/17/89)

Although this discussion has centered around the questions of the speed of an RDBMS versus
an OODBMS, one area which I think should be noted is the abilities and inabilities of each
type of DBMS to provide functionality under certain circumstances.

I believe Dennis mentioned the Postgres Papers (Rowe & Stonebraker) which discuss the
design and proposed implementation of Postgres, a next-generation RDBMS. In Mike's
design of Postgres he took into consideration many of the "problems" of typical applications
in CA*, AI and other non-traditional DBMS customers. Many of the new features in
Postgres were designed to improve performance (e.g. tuple fields which evaluate off-line to
produce data which can then be retrieved quickly when needed -- the canonical example,
if I remember correctly, was computing some employee's list of children).

Many of the performance increases necessary for non-traditional environments can be provided
via NF**2 (non-first normal form) data, in which a field may contain an entire subtuple,
not simply a reference to another tuple in another table. An object-oriented DBMS may
take advantage of an object's identity, i.e. its unique ID throughout all space and time,
in order to cluster data efficiently (I'm a little lost for words here, what I mean to
say is that the ID will allow the data to be located regardless of the necessity for it
being located in a particular table).

I could ramble on about speed, but the more important question is functionality. One
of the biggest problems *I* have with traditional RDBMS (and I am sure that many
non-traditional users also have) is the inability to provide certain functionality to
the user-community (i.e. non-traditional) with *support* from the DBMS. For example,
if I wish to take an object-oriented language (I will use the Common LISP Object System)
and store the objects in the DBMS, I would like the backend to support the same types
of operations which the frontend provides, e.g. when I change the value of a database
field (a CLOS slot) using what is called a setf function in lisp (basically the inverse
of get), I would like the same operation to occur. One example might be that when I
setf the children list of an employee to no longer contain some child, I would like the
non-referenced child to be removed from the DB if it is no longer referenced from anywhere.
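
The behaviour asked for in the children example can be sketched as a toy
in-memory store in Python. The Store class and its set_children method
are hypothetical stand-ins for the setf on a CLOS slot; no real DBMS API
is implied, and the sketch assumes that employees are roots which are
always kept.

```python
class Store:
    """Toy store: an object is kept only while a root or a parent refers to it."""

    def __init__(self):
        self.objects = {}    # object id -> list of child ids
        self.roots = set()   # ids that are kept unconditionally

    def add(self, oid, children=(), root=False):
        self.objects[oid] = list(children)
        if root:
            self.roots.add(oid)

    def set_children(self, oid, children):
        # Plays the role of the setf in the post: assign, then clean up.
        self.objects[oid] = list(children)
        self._sweep()

    def _sweep(self):
        # Mark everything reachable from the roots, then drop the rest.
        reachable, stack = set(), list(self.roots)
        while stack:
            oid = stack.pop()
            if oid in reachable or oid not in self.objects:
                continue
            reachable.add(oid)
            stack.extend(self.objects[oid])
        for oid in list(self.objects):
            if oid not in reachable:
                del self.objects[oid]


db = Store()
db.add("ann")
db.add("bob")
db.add("emp1", ["ann", "bob"], root=True)
db.add("emp2", ["bob"], root=True)
db.set_children("emp1", [])   # ann is now unreferenced; bob is still emp2's child
```

The point of the example is where the cleanup lives: inside the store's
own assignment operation, so every front end gets the same behaviour
without reimplementing it.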

Now I am sure that Dennis will tell me that he has millions of hackers banging on that
example right now :-) and perhaps even Larry and Mike will tell me I'm all wrong, but I
think that the position of most OODBMS vendors is to provide this type of extended
functionality in the DBMS, not necessarily in frontend support systems.

dcm

--
-----
Stupidity got us into this mess; why can't it get us out? - Will Rogers
-----
David C. Martin arpa: dcmartin@cs.wisc.edu
University of Wisconsin - Madison uucp: uunet!ucbarpa!dcmartin
Computer Sciences Department at&t: 608/262-6624 (O)

dlw@odi.com (Dan Weinreb) (08/17/89)

In article <CIMSHOP!DAVIDM.89Aug16173259@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:

   The one flaw in this request is that proof of concept can't be provided if the
   concept hasn't been defined.  I agree with Jon Krueger in that there is too
   much hand-waving in this discussion ("our system is better than yours")
   without defining the problem that is trying to be met.

Indeed.  This cannot be emphasized strongly enough.

The term "object-oriented database system" is currently being used by
a fairly large number of research and product development groups.
It's clear from the published literature that the term covers quite a
lot of ground.  Some of these systems have things in common with
others in some respects, while differing greatly in other respects.

I have been at least as guilty as anyone of adding to the confusion,
with my recent postings, and I apologize for not being more specific
and clear.  While I have been following the research reports of quite
a few different groups and consider many of them very interesting, my
own attention has (naturally) been focused on the specific product
that I'm working on at Object Design.  It's only one of a vast range
of approaches that can legitimately call themselves "object-oriented
database systems".  In the future, I will be more explicit about what
I'm referring to.

There's no official definition of "object-oriented database systems".
Various groups of people have proposed definitions and criteria, but
naturally there is no particular group that is obviously qualified to
rule on a universal definition for the whole database community.

Anyone who wants to get a better sense of the diversity of the field,
and also get an overall feeling for what kinds of things are being
worked on, might want to read:

The Proceedings of the 1986 International Workshop on Object-Oriented
Database Systems, edited by Dittrich and Dayal, ACM order number 472861,
ISBN 0-8186-0734-3, IEEE Computer Society Order Number 734.  235 pages.

Proceedings of the 1989 ACM SIGMOD, also published as SIGMOD Record
Vol 18, No. 2, June 1989.  There are six papers in the two sessions on
object-oriented databases, reasonably representative of the latest
work in the area.  Much other interesting work has appeared in the
SIGMOD proceedings of the last five years.

ACM Transactions on Office Information Systems, Vol. 5, No. 1, Jan 87,
Special Issue On Object-Oriented Systems.  This special issue contains
five extensive articles about five research efforts in object-oriented
database systems, all different.  The articles have extensive references,
through which the interested reader can find a wealth of related material.

Also, all three proceedings of the OOPSLA conference have interesting
papers on the topic.

ACM Computing Surveys, Vol. 19, No. 2, June 1987 has an excellent and
extensive article called "Types and Persistence in Database
Programming Languages", by Atkinson and Buneman.  It discusses the
question of integration of languages with database systems, which is
of great interest to some, but not all, of the object-oriented
database efforts.

ACM Computing Surveys, Vol. 19, No. 3, Sept 1987 has another excellent
and extensive article, this one called "Semantic Database Modelling:
Survey, Applications, and Research Issues" by Hull and King.  Semantic
database models have some relationship to object-oriented database
models, although what the relation consists in is something of a topic
of debate.  Nonetheless, I think the article is practically required
reading for anyone who intends to work on object-oriented database
systems.

simpson@trwarcadia.uucp (Scott Simpson) (08/18/89)

You may wish to look at the article 
	
	"Intermedia: A Case Study of the Difference Between Relational
and Object-Oriented Database Systems", Karen E. Smith and Stanley B.
Zdonik, Brown University, OOPSLA '87 Proceedings.

Basically, the authors implemented a hypermedia system on top of a
relational and then an object-oriented database system and they
determined that the object-oriented model suited their problem domain
better.  Some of the conclusions they came to:

	o Relational databases are awkward as information made up of
	  complex, hierarchical data structures need to be flattened
	  into relations and un-flattened when you retrieve data.
	  The authors had to mimic hierarchies with identification keys
	  rather than use pointers or direct references.  There was
	  a mismatch between the programming language data structures
	  and the database data structures.

	o Object-oriented databases provide class-extensibility.  With
	  relational databases, there is a single parameterizable type,
	  relation.  With an OODB, each object is associated with a class
	  and you can add new classes that are on the same level as the
	  system types.  Your new objects are first-class citizens.  In
	  fact, you can simulate a relational database by creating a
	  tuple class.

	o Object-oriented databases provide data abstraction.  The behavior
	  of an object is described by a class definition.  These class
	  definitions provide data abstraction at the level of the database,
	  not at the level of the application.

	o OODBs can store active objects.  Since methods are stored in
	  the database, an application can ask the database to invoke
	  a method on an object.  Since methods are expressed in terms of
	  a programming language, any operation can be performed.  This is
	  in contrast to relational query languages that are not 
	  computationally complete.  Also, since the bulk of the application
	  code resides in the database, it can utilize the built-in
	  concurrency, versioning, etc.

	o There is no need to copy data to virtual memory in an OODB.
	  The application can send messages to objects and have them invoke
	  methods on those objects.  The computation is done on the database
	  side, not on the application side.  This is especially critical
	  if your database resides over a network.  The data does not need
	  to be shipped across the network.  In a RDBMS, data needs to be
	  shipped across for each projection, join, etc.  Stored procedures
	  have been added to many databases to address this problem, but I
	  think it is a cleaner solution to have it in your original model
	  than as an add on hack.  With an OODB, sending one message can
	  take the place of many relational queries.

	o OODBs have automatic type checking at the point of use.  Since methods
	  are executed locally, the database can perform type checking as
	  soon as a method is called.  In an RDBMS, type failures are detected
	  when the tuples are checked back in.

	o OODBs provide a better granularity of locking facilities.
	  In an RDBMS, because hierarchies are represented across a number
	  of relations, the application needs to explicitly lock each record
	  to lock the hierarchy.  In an OODB, an application can lock a
	  hierarchy in a single operation.
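
A minimal sketch of the flattening described in the first point above,
assuming a made-up three-level document and a single "node" table (the
table, column, and function names are illustrative and are not taken
from the Smith and Zdonik paper).  The hierarchy is mimicked with
identification keys (parent_id) and has to be rebuilt by the host
program, one query per node, when it is read back.

```python
import sqlite3

# Hypothetical nested document: each node has a label and a list of children.
doc = {"label": "document", "children": [
    {"label": "section 1", "children": [
        {"label": "paragraph 1.1", "children": []},
        {"label": "paragraph 1.2", "children": []},
    ]},
    {"label": "section 2", "children": []},
]}

def flatten(conn, node, parent_id=None, position=0):
    """Store one node as a row, then recurse; parent_id is the identification key."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, position, label) VALUES (?, ?, ?)",
        (parent_id, position, node["label"]),
    )
    node_id = cur.lastrowid
    for i, child in enumerate(node["children"]):
        flatten(conn, child, node_id, i)
    return node_id

def unflatten(conn, node_id):
    """Rebuild the nested structure, querying once per node for its children."""
    (label,) = conn.execute(
        "SELECT label FROM node WHERE id = ?", (node_id,)
    ).fetchone()
    child_ids = [row[0] for row in conn.execute(
        "SELECT id FROM node WHERE parent_id = ? ORDER BY position", (node_id,)
    )]
    return {"label": label, "children": [unflatten(conn, c) for c in child_ids]}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
    "position INTEGER, label TEXT)"
)
root_id = flatten(conn, doc)      # five rows for five nodes
rebuilt = unflatten(conn, root_id)
```

The round trip recovers the original structure, but only because the
application carries the knowledge that parent_id and position together
encode the hierarchy.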

My personal view is that relational databases were fine for tabular
data such as invoices in the data processing world, but the scientific
world deals in objects, not tabular data.  The twisting of objects onto
a relational structure causes poor performance and a wide semantic
gap.
	Scott Simpson
	TRW Space and Defense Sector
	usc!trwarcadia!simpson  	(UUCP)
	trwarcadia!simpson@usc.edu	(Internet)

hughes@math.berkeley.edu (Eric Hughes) (08/18/89)

In article <28@dgis.daitc.mil>, jkrueger@dgis (Jonathan Krueger) writes:
>Therefore I'd like to divide the question: efficient implementation of
>a data model versus inherently bad performance of some models for some
>operations.  Recent traffic has confused the two issues without
>addressing either.

"Inherently bad performance" is a slippery term.  It is important to
remember that a database model is an abstraction, and that there are
many different implementations of the same abstraction.  To speak of
"inherently bad performance" would require that one find some predicate
which is invariant over all possible implementations and show that
such an invariant has certain undesirable minimum time and/or space
growth rates.  I can see both information theory and complexity theory
useful in this regard, but as far as I know there has been no work
done in this area.  For example, one could calculate some measure of
the information present in a query and relate that to some critical
path of operation to get a lower bound.

In short, I don't think it makes much sense to talk of "inherently"
bad performance, at least for now.

The situation is not so hopeless, because such kinds of operation
counting are possible when one fully specifies the type of
implementation.  For example, one could analyze the performance of an
RDBMS whose implementation consisted of nothing but flat files with no
indexes.  You can get time and space estimates in terms of the file
sizes.  By adding indices to the implementation one can reduce some
O(n) operations to O(log n), and by adding pointer rings to represent
a one-to-many relationship (redundant data for speed sake) one can
reduce some O(log n) operations to O(log log n) (which, for most
applications, is as good as constant).  This is not inherent
performance, but might be termed "practically inherent."
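
A rough sketch of the first step in that argument, assuming records
held in a Python list sorted by key.  The function names are
illustrative; the "index" here is simply bisection over the sorted
keys, and the pointer-ring step is not modelled.  It counts how many
records each strategy examines to find the same key.

```python
def scan_lookup(records, key):
    """Flat file, no index: examine records in order. Returns (record, probes)."""
    probes = 0
    for rec in records:
        probes += 1
        if rec[0] == key:
            return rec, probes
    return None, probes

def index_lookup(records, key):
    """Sorted index searched by bisection. Returns (record, probes)."""
    lo, hi, probes = 0, len(records), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if records[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(records) and records[lo][0] == key:
        return records[lo], probes
    return None, probes

n = 100_000
records = [(k, f"row {k}") for k in range(n)]   # already sorted by key
scan_rec, scan_probes = scan_lookup(records, n - 1)     # worst case for the scan
index_rec, index_probes = index_lookup(records, n - 1)
```

The scan examines every record in the worst case, while bisection
examines on the order of log2(n) of them; the counts describe this one
implementation, not the relational model itself.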

Let's be careful when we talk about the performance of a model.  Only
programs have performance measures per se, and so to measure the
performance of a model one must somehow relate it to the performance of
some programs.

Finally, I have a two-part conjecture: Asymptotic performance
measurements (i.e. "order of" type measurement) of the relational and
object-oriented models (and all their children :-) are identical.  No
model has a set of constants for such asymptotic measurement (i.e. the
constant that gets rid of the "order of" symbol) whose inverses
dominate (i.e. are everywhere better than) the respective constants of
all other models.  To paraphrase, both models have inherently the
same order of magnitude performance, and certain models are better
suited to certain operations than others.

Eric Hughes
hughes@math.berkeley.edu   ucbvax!math!hughes

dlw@odi.com (Dan Weinreb) (08/18/89)

In article <1765@ethz.UUCP> marti@ethz.UUCP (Robert Marti) writes:

   With respect to the ongoing debate concerning OODBs vs extended RDBs,
   I'd like to see proof (make that circumstantial evidence, if you prefer)
   that an OODB which supports traditional basic DBMS features such as
   concurrency control, transactions, set-oriented data manipulation,
   the ability to define views and to dynamically add new tables/columns,
   etc. is

   1) faster than a relational system for typical technical/engineering
      applications, and

The proposition is that for certain applications, i.e. when being used
certain ways in certain computation environments, we believe that the
approaches that we're taking will result in substantially higher
performance than using a relational database system in those same
circumstances.  So there are two problems.  First, it all rests on
what sort of benchmarks you use, i.e. it depends on what you are
trying to test.  Second, it's not a claim about existing systems, but
about what some of us believe we can accomplish.

   2) not much slower than a relational system for traditional business
      oriented applications.

Actually, I'm sure that some of the OODBMS's indeed *will* be much
slower than relational database systems for traditional business
oriented applications.  I, for one, certainly do *not* believe that the
kind of OODBMS that I am working on is going to replace, subsume or
displace relational database systems.  There are plenty of fine
relational database systems in existence.  They were designed to do a
certain kind of job, and they generally do those jobs fine.  When I
talk about object-oriented database management systems, I mean a
substantially different kind of DBMS designed to deal with a different
kind of problem, with different needs and tradeoffs.  (There are other
OODBMS efforts that might not take the same position, so let me
emphasize again that I'm speaking for myself.)

   How about some benchmarks, controversial as they may be?

If I had in front of me the sort of OODBMS that I envision existing in
the near future, I am sure that I could devise benchmarks that would
make the OODBMS look far faster than the RDBMS, *and* vice versa,
simply by designing the benchmarks with that goal in mind, because the
two systems would be so different.  So a benchmark would not "prove"
that system A is N times the speed of system B, but rather would
illustrate what sort of things each system is particularly good at.
That is, the interesting result would not be the numerical wall clock
times, but rather the general assumptions and philosophy underlying
the design of the benchmark.

I've been trying to think of a good analogy.  Suppose we benchmark a
car against a motorboat; the real question is not which one was
faster, but whether the benchmark took place on the interstate or on
the lake.  In my view of OODBMS, we are talking about two different
tools for two different jobs, and so a direct benchmark isn't really
relevant.  When some of us talk about "superior performance of an
OODBMS" or something, what we are really trying to say is that there
are interesting new data management tasks that are quite unlike
traditional business (DP, MIS) applications and for which existing
relational systems would not perform well.  I hope this makes
everything more clear than my previous postings.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

morrison@grads.cs.ubc.ca (Rick Morrison) (08/18/89)

In article <1989Aug17.141620.24941@odi.com> jack@odi.com (Jack Orenstein) writes:
> ... "Adding columns" is a relational notion that
>does not have a clear OO counterpart, (I'd be interested in hearing
>about analogies that anyone would care to offer.)
>
How about adding new instance variables to a class definition?
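
A small sketch of the analogy, assuming SQLite on the relational side
and a plain Python class standing in for an object system (the Widget
class and the "colour" variable are made up for illustration, and a
class-level default is only one possible way to model schema evolution
for existing instances).

```python
import sqlite3

# Relational side: add a column; existing rows pick up the default.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widget (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO widget (id, name) VALUES (1, 'frammis')")
conn.execute("ALTER TABLE widget ADD COLUMN colour TEXT DEFAULT 'grey'")
row = conn.execute("SELECT name, colour FROM widget WHERE id = 1").fetchone()

# Object side: add a variable to the class definition; existing instances
# see the class-level default until they are given a value of their own.
class Widget:
    def __init__(self, name):
        self.name = name

w = Widget("frammis")          # created before the class was changed
Widget.colour = "grey"         # the 'new instance variable', with a default
```

In both cases the object that existed before the change answers for
the new attribute with the default value.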
------------------------------
Rick Morrison		 | {alberta,uw-beaver,uunet}!ubc-cs!morrison
Dept. of Computer Science| morrison@cs.ubc.ca
Univ. of British Columbia| morrison%ubc.csnet@csnet-relay.arpa
Vancouver, B.C. V6T 1W5  | morrison@ubc.csnet (ubc-csgrads=128.189.97.20)
(604) 228-4327

joshua@athertn.Atherton.COM (Flame Bait) (08/18/89)

In an article cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>1. Relational DBs provide things necessary for a multi-user world
>(concurrency control, security, etc.) that may or may not be needed in the
>object oriented world (perhaps only a specific area [CAD/CASE]).

CAD and CASE do need concurrency!  It is critical that many programmers
(or chip designers) be able to work on (different parts of) the same
project at the same time!  Security is also important to many people 
doing CAD or CASE for the military.

Background: I work for Atherton Technology which produces a CASE tool
(actually an IPSE) built on OODB technology.

Joshua Levy
--------                Quote: "If you haven't ported your program, it's not
Addresses:                      a portable program.  No exceptions."  
joshua@atherton.com          
{decwrl|sun|hpda}!athertn!joshua    work:(408)734-9822    home:(415)968-3718

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/18/89)

simpson@trwarcadia.uucp (Scott Simpson) writes:

>	"Intermedia: A Case Study of the Difference Between Relational
>and Object-Oriented Database Systems", Karen E. Smith and Stanley B.
>Zdonik, Brown University, OOPSLA '87 Proceedings.

>the authors implemented a hypermedia system on top of a
>relational and then an object-oriented database system and they
>determined that the object-oriented model suited their problem domain
>better

What did they say about shared access to persistent data?

-- Jon
-- 
Jonathan Krueger    jkrueger@dgis.daitc.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/18/89)

hughes@math.berkeley.edu (Eric Hughes) writes:

>"inherently bad performance" would require that one find some predicate
>which is invariant over all possible implementations and show that
>such an invariant has certain undesirable minimum time and/or space
>growth rates.

Example: find all descendents.
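
A sketch of why this operation is a useful test case, assuming a
hypothetical "part" table with a parent_id column and a query language
with no recursion or closure operator.  The host program has to loop,
issuing one query per level of the hierarchy, because no single fixed
query reaches an unknown depth.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, parent_id INTEGER)")
# Hypothetical hierarchy: 1 -> (2, 3), 2 -> (4, 5), 5 -> (6); 7 is unrelated.
conn.executemany(
    "INSERT INTO part (id, parent_id) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 2), (6, 5), (7, None)],
)

def descendants(conn, root_id):
    """Return (set of descendant ids, number of queries issued)."""
    found, frontier, queries = set(), [root_id], 0
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id FROM part WHERE parent_id IN ({marks})", frontier
        ).fetchall()
        queries += 1
        # Children not seen before become the next level to expand.
        frontier = [r[0] for r in rows if r[0] not in found]
        found.update(frontier)
    return found, queries

ids, queries = descendants(conn, 1)
```

For this three-level hierarchy the loop issues four queries: one per
level, plus a final one that comes back empty.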

-- Jon
-- 
Jonathan Krueger    jkrueger@dgis.daitc.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

jack@odi.com (Jack Orenstein) (08/18/89)

In this first round of the comp.databases relational vs. OO DBMS wars,
Jon Krueger has asked some very reasonable questions that go to the
heart of the issues that Dennis Moore (RTI), Dan Weinreb and I (Object
Design), and Bruce Speyer (MCC) have been discussing.  I think the
essential statements are the following:

   ... I'd like to divide the question: efficient implementation of
   a data model versus inherently bad performance of some models for some
   operations.  Recent traffic has confused the two issues without
   addressing either.  It tells us very little that a current DBMS
   performs poorly.  References to applications without specifying their
   operations or describing their design tell us nothing.
   
   For instance, Bruce alludes to operations like "netlist a circuit" and
   "package the electronics".  It would be wonderful indeed to understand
   the electronics that underlies all the computing we do, but I'll settle
   for characterizing some operations that engineers need.  Can you
   specify these operations in some terms we can understand?  Or simpler
   ones?  How might one implement them with a relational data model?  Are
   there data models that can be shown inherently better for some of these
   operations? 

The starting point has to be the second statement, which addresses
user requirements.  One of the major conclusions of our (Object
Design's) requirements analysis was, as Dan and I have stated in
earlier postings, that "object fetching" - finding an object given its
id - must be as fast as possible.  I will therefore try to answer Mr.
Krueger's questions by focussing on this one operation.

These statements are asking, (correct me if I'm wrong), whether there
is something in the design of a given model that leads to or precludes
certain implementation techniques required for efficient
implementation of operations that are important in the application
areas being considered, (object fetching for now).

Consider an application that has to access some persistent data and
manipulate it, possibly updating it, using subroutines written in a
programming language. This is an extremely common scenario among the CAx
developers that Object Design has talked to.

Writing such an application on top of a relational DBMS requires the use
of two languages, the host programming language and the query language
of the RDBMS. The programmer therefore has to deal with two type
systems, and except for the simplest types such as integer and maybe
string, conversions are required between the DBMS representation of a
type and that of the host language. 

This problem is most severe for object ids.  On the host language
side, object ids are simply addresses, or pointers, and object
fetching involves following the pointer, (e.g.  thing* p; ...; widget*
w = p->frammis; in C or C++). On the RDBMS side, object fetching
involves at least a selection starting from a specific key value, or a
join. Each retrieval from the RDBMS will load some data that can be
accessed through the host language, but when the "boundaries" of the
retrieved data are reached, it is time to submit another query.

Earlier postings have identified three patterns of "interleaving" of
host language and RDBMS actions:

        1. Retrieve only the object required at the moment by passing
	   its key to the DBMS.

        2. Retrieve all the objects that will be required for some
	   part of the application. This can be done by grouping 
	   objects (e.g. using a view), and retrieving the group members
	   by providing the key of the group.

        3. Same as 2, but the objects to be manipulated are not
           stored individually in the database. Instead, the data
           is organized as a "blob" or "long field", which is
           requested by its key, i.e. groups are replaced by blobs.
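
The three patterns can be sketched as follows, assuming a toy "gate"
table, a circuit id used as the group key, and JSON text standing in
for the long field.  All table, column, and function names are
illustrative.  Each function returns the objects it retrieved together
with the number of queries it issued.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gate (id INTEGER PRIMARY KEY, circuit_id INTEGER, kind TEXT)")
conn.execute("CREATE TABLE circuit_blob (circuit_id INTEGER PRIMARY KEY, data TEXT)")

gates = [(1, 10, "AND"), (2, 10, "OR"), (3, 10, "NOT")]
conn.executemany("INSERT INTO gate VALUES (?, ?, ?)", gates)
conn.execute(
    "INSERT INTO circuit_blob VALUES (?, ?)",
    (10, json.dumps([{"id": g, "kind": k} for g, _, k in gates])),
)

def fetch_one_at_a_time(conn, gate_ids):
    """Pattern 1: one query per object, keyed by object id."""
    out = []
    for gid in gate_ids:
        row = conn.execute("SELECT id, kind FROM gate WHERE id = ?", (gid,)).fetchone()
        out.append({"id": row[0], "kind": row[1]})
    return out, len(gate_ids)

def fetch_group(conn, circuit_id):
    """Pattern 2: one query returns every member of the group."""
    rows = conn.execute(
        "SELECT id, kind FROM gate WHERE circuit_id = ? ORDER BY id", (circuit_id,)
    ).fetchall()
    return [{"id": r[0], "kind": r[1]} for r in rows], 1

def fetch_blob(conn, circuit_id):
    """Pattern 3: one query returns an opaque long field the host must decode."""
    (data,) = conn.execute(
        "SELECT data FROM circuit_blob WHERE circuit_id = ?", (circuit_id,)
    ).fetchone()
    return json.loads(data), 1

one, q1 = fetch_one_at_a_time(conn, [1, 2, 3])
group, q2 = fetch_group(conn, 10)
blob, q3 = fetch_blob(conn, 10)
```

All three return the same objects.  Pattern 1 pays one query per
object, pattern 2 always fetches the whole group, and in pattern 3 the
DBMS sees only an uninterpreted string.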
        
It sounds like Mr. Speyer used approach 1 in his application:

   About 3 years ago I tried putting an electronic information model on
   top of a relational system.  It took about 30-40 times longer to
   netlist a circuit than it did using a fairly inefficient internally
   developed memory-based database system. An operation such as packaging
   the electronics is much worse since it must traverse much more of
   the electronic information model and be constantly referring to the
   library portion of the model which was distributed to another database
   (making the join operation much more expensive).

Mr. Moore suggested that he should have used approach 2 or 3
instead, (the description is not specific enough to say which):

   Let me posit a different architecture for your electronic information
   model.  Could you have read in all the data into memory from an RDBMS
   and performed the same manipulations in-core that you did in your
   system?  The advantage to this architecture is that you can lock the
   records while you are manipulating them (with THREE WORDS ("FOR DIRECT
   UPDATE"), as opposed to many lines of code), you get all the
   transaction processing capabilities of the DBMS (i.e. rollback,
   savepoints, commit), you get all the utilities of the DBMS, etc.  To
   put it in a few words, YOU GET THE *MS* FROM THE DBMS, and you do your
   own processing.

Elsewhere, he is specifically suggesting approach 3:

   For instance, you could store a CASE diagram as a BLOB in real-time,
   and fire off an asynch database procedure which invokes a method which
   does all kinds of stuff, including storing the thing in a normalized
   fashion (for reports etc.), and potentially invoking a compiler to
   create a new whole version, etc.  Would this not be good enough?
   There will be a tradeoff between disk space and access and storage
   times, though, regardless of OO or R/OO.

#1 is too slow to be practical, as suggested by Mr. Speyer and by our
discussions with our potential customers. A query per (small) object
is too expensive.

#2 has the drawback that all members of a group must be retrieved in
order to gain access to any members. This is wasteful if only a
handful of objects were actually needed.  Furthermore, there may be a
large number of joins and selections necessary to extract the required
data, and the data then has to be converted to host-language
structures.  This is pure overhead due to the use of two languages.

#3 fails to capture any relationships internal to the long field (the
"blob") unless the programmer explicitly asks for the information to be
captured and sent back to the DBMS (as pointed out by Mr. Moore).
Again, this is overhead due to the use of two languages.

Going back to Mr. Krueger's question: are these problems inherent in
the relational model? No, they are due to the two-language paradigm
supported by all RDBMS vendors. In fact, the relational model doesn't
address the issue of how to interact with a more powerful
general-purpose programming language. Languages like Pascal/R, RIGEL,
and Aldat show that a smoother integration is possible. (See Atkinson
and Buneman's extremely thorough review of DB programming languages in
ACM Surveys, June 1987.) It is extremely unlikely that any of these
languages will see widespread use, since they are non-standard (i.e.
non-C and non-SQL) replacements of existing query languages AND
programming languages.

An OO DBMS does not present users with the two-language problem
characteristic of RDBMSs.  Or at least this is true of the system we're
building at Object Design.  Instead, there is a single type system, and
a type may have both transient and persistent instances.  Once a
persistent object has been created, it can be accessed and manipulated
in the same way as any other object. Our system will be C++-based, so we
have adopted the C++ type system.

The programmer does not have to fire off a query to a DBMS in a second
language in order to access persistent data.  A pointer can be
followed in the usual way (e.g.  *p, or p->field), even if the target
is persistent.  If the object happens to be in working memory, then
nothing out of the ordinary happens, and the speed of the access is
the same as for access to a transient object (and the same as what a
C++ programmer is used to). Otherwise, the requested object, along
with some objects stored nearby, is brought in from the database
automatically.
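
A toy sketch of that access pattern, not a description of Object
Design's implementation.  It assumes objects are clustered into
fixed-size pages by id, and that a miss on one object brings in its
whole page.  The PagedStore class and its deref method are
hypothetical names chosen for the illustration.

```python
class PagedStore:
    """Toy 'database': objects are grouped into fixed-size pages by id."""

    def __init__(self, objects, page_size=4):
        self.page_size = page_size
        self.disk = dict(objects)      # stands in for the database on disk
        self.memory = {}               # objects currently in working memory
        self.faults = 0                # number of times we went to 'disk'

    def deref(self, oid):
        """Follow a reference; fault in the whole page on a miss."""
        if oid not in self.memory:
            self.faults += 1
            first = (oid // self.page_size) * self.page_size
            for neighbour in range(first, first + self.page_size):
                if neighbour in self.disk:
                    self.memory[neighbour] = self.disk[neighbour]
        return self.memory[oid]

# Hypothetical linked chain: object i refers to object i + 1.
store = PagedStore({i: {"value": i * i, "next": i + 1} for i in range(8)})

total, oid = 0, 0
while oid in store.disk:               # walk the chain by following references
    obj = store.deref(oid)
    total += obj["value"]
    oid = obj["next"]
```

Walking all eight objects goes to the database only twice, once per
page; every other dereference is an ordinary in-memory lookup.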

Concurrency control and recovery are present, as with any DBMS.  There
is certainly nothing inherent in the relational model or lacking from
any OO model that limits these features to relational DBMSs.


CONCLUSION

The two-language paradigm of RDBMSs complicates the writing of
applications, and has performance consequences as well.  This is a
problem with implementations of the relational model, and not inherent
in the relational model. OO DBMSs avoid these problems by offering a
single language in which to write applications, pushing the
responsibility for database access into the system, away from the
user.



Jack Orenstein
Object Design, Inc.

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/18/89)

dcmartin@lisp.eng.sun.com (David C. Martin) writes:

>I think that the position of most OODBMS vendors is to provide this
>type of extended functionality in the DBMS, not necessarily in frontend
>support systems.

This is also precisely the goal of those adding OO capabilities to
relational engines.  However, they start by assuming the engine must
provide shared access to persistent data.

-- Jon
-- 
Jonathan Krueger    jkrueger@dgis.daitc.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

rich@osc.COM (Richard Fetik) (08/18/89)

In article <5259@wiley.UUCP> simpson@trwarcadia.UUCP (Scott Simpson) writes:
>	Lastly, say hi to Rich Fetik at Object Design.  I knew him when
>he worked for Ontologic.  I was wondering where he went...
>	Scott Simpson
>	TRW Space and Defense Sector

Nope, I've gone to Object-Sciences Corporation.  Sorry to confuse you.  And Hi. :-)

-- 
					rich@osc.osc.com     415-325-2300
					uunet!lll-winken!pacbell!osc!rich
 Disclaimer: These are not the words of Object-Sciences Corporation or its
 affiliates, except when they first hear them from me.

dlw@odi.com (Dan Weinreb) (08/18/89)

In article <1037@unify.UUCP> dgh@unify.UUCP (David Harrington) writes:

   I agree.  Look at the OODBMS companies like Ontologic.  They are either living
   off an existing RDBMS which they are trying to re-cast as OO, or they are
   dying.

This is hardly a convincing argument.  Yes, Ontologic and Servio-Logic
might be having problems or not growing as fast as they might, but
it's not logical to leap to the conclusion that OODBMS technology
isn't going to happen, or is only going to happen as a series of
changes to existing RDBMS implementations.  There are many other
factors that have shaped the courses of those two companies.  Merely
because A is true and B is true does not mean that A caused B.  (It
would be improper and highly rude of me to speculate about what those
other factors are, but I can suggest that one has to do with the use
of proprietary or unusual computer languages.  It's also possible (as
you suggested) that they started a bit too early.)  If you look at the
early history of relational databases, in fact, you'll see a lot of
early failed startups, obviously not because of fundamental problems
with RDBMS technology.

(By the way, there aren't any OODBMS companies that are living off an
existing RDBMS which they are trying to re-cast, etc.  Also by the
way, it's not fair to call Ontologic "dying"; they've announced that
they are coming out with an entirely new product on which judgements
at this time would be premature.  I expect we'll all learn more at the
OOPSLA conference.)

Usual notice: I work for a start-up company producing an OODBMS.  In
case anyone on this list doesn't recognize the name "Unify", Mr.
Harrington works for a company that produces a relational DBMS and
associated products.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

dlw@odi.com (Dan Weinreb) (08/18/89)

In article <32@dgis.daitc.mil> jkrueger@dgis.daitc.mil (Jonathan Krueger) writes:

   dcmartin@lisp.eng.sun.com (David C. Martin) writes:

   >I think that the position of most OODBMS vendors is to provide this
   >type of extended functionality in the DBMS, not necessarily in frontend
   >support systems.

   This is also precisely the goal of those adding OO capabilities to
   relational engines.  However, they start by assuming the engine must
   provide shared access to persistent data.

But of course.  So does any serious OODBMS vendor.  In fact, there is
a commercially-available OODBMS product right now that does pretty
much exactly what Mr.  Martin asked for, and provides shared access to
persistent data (concurrency control, recovery, backup, etc., using
two-phase locking, write-ahead logging, etc).  It's called Statice,
and is a product of Symbolics, Inc.  Its main drawback is that it
currently is only available for Symbolics computers.

Did someone give you the impression that proposed OODBMS systems do
not provide shared access to persistent data?  Of course, as I said
before, the term OODBMS is used for all kinds of things.  However,
speaking for myself and my own use of the term, an OODBMS does not
deserve to be called that unless it provides shared access to
persistent data.

Caveat department: I was a co-founder of Symbolics Inc. and was the
chief designer and developer of Statice.  The work I am doing at
Object Design is also in the OODBMS area, but is substantially
different from Statice in most respects.  (So even within my own
definition of "OODBMS", there is a lot of room for different kinds of
systems!)

Dan Weinreb		Object Design, Inc.		dlw@odi.com

dlw@odi.com (Dan Weinreb) (08/18/89)

In article <1989Aug17.180057.2623@agate.berkeley.edu> hughes@math.berkeley.edu (Eric Hughes) writes:

   "Inherently bad performance" is a slippery term.  It is important to
   remember that a database model is an abstraction, and that there are
   many different implementations of the same abstraction.

Yes, indeed.  My colleague Jack Orenstein also pointed this out.
Performance is usually not inherent in an abstract data model.  The
most interesting performance differences between conventional DBMS's,
and the new CAx-oriented DBMS's, have less to do with the abstract
model and more to do with the implementation of the model.  The
claimed benefits of using an object-oriented model have more to do
with such areas as expressiveness and abstraction than performance.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

jkrueger@dgis.daitc.mil (Jonathan Krueger) (08/19/89)

dlw@odi.com (Dan Weinreb) writes:

>There is a commercially-available OODBMS product right now that...
>provides shared access to persistent data...  It's called Statice, and
>is a product of Symbolics, Inc.  Its main drawback is that it currently
>is only available for Symbolics computers.

That's interesting.  How does one build multiuser systems out of
Symbolics computers?

>Did someone give you the impression that proposed OODBMS systems do
>not provide shared access to persistent data?

Rather that no one gave me the impression that they did.  Do they?
Would someone out there describe his production OODB and state how many
concurrent users access it?  How many are actively updating it on a
typical day?

-- Jon
-- 
Jonathan Krueger    jkrueger@dgis.daitc.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

davidm@uunet.UU.NET (David S. Masterson) (08/19/89)

Based on Jack Orenstein's message, I have a couple of questions:

1. In implementing an OODB on top of C++ using the notion of persistent and
transient type objects, when you refer to information in the OODB, is it
always by an object identifier?  How, therefore, would you find objects
meeting some qualification if you don't know its identifier?  Is this even a
type of query you would ask in an OODB world?  (you ALWAYS know the identifier
because even a qualification would be wrapped in an object which contains the
identifier?)

2.  Again using the architecture of persistent and transient objects,  is a
persistent object ever in memory?  Or is it just a transient copy of a
persistent object that is in memory?  Then, how are persistent objects
created?

David Masterson
uunet!cimshop!davidm
415-691-6311

dlw@odi.com (Dan Weinreb) (08/22/89)

In article <35@dgis.daitc.mil> jkrueger@dgis.daitc.mil (Jonathan Krueger) writes:

   That's interesting.  How does one build multiuser systems out of
   Symbolics computers?

Same as any other workstation.  You connect them on an Ethernet.  One
(at least) of the workstations is a Statice server, which has the disk
that holds the nonvolatile data itself.  The others act as clients,
retrieving and storing data via a network protocol built on TCP/IP.

   >Did someone give you the impression that proposed OODBMS systems do
   >not provide shared access to persistent data?

   Rather that no one gave me the impression that they did.  Do they?

Well, as I said, people use the term "OODBMS" to cover a wide range of
things.  Our product certainly does, and I strongly expect other
forthcoming OODBMS products to do so.  Statice already does, and so
does Servio-Logic's Gemstone.

   Would someone out there describe his production OODB and state how many
   concurrent users access it?  How many are actively updating it on a
   typical day?

You're looking for a benchmark, and a benchmark result.
Unfortunately, it's not easy; see my previous posting on the subject.
I can tell you that during debugging of Statice, we ran up to 140 or
so client workstations (mostly in Cambridge Mass and some in the Los
Angeles area), each accessing the single server about once every two
minutes to do a simple associative update transaction.  The server was
able to keep up with this.  We didn't try saturating the server with
transactions.  Each of these transactions was a true database
transaction, setting locks and forcing the log to disk and so on.
This doesn't prove anything about performance, of course, but it does
mean that I was serious when I said "shared access to persistent
data".
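
For scale, the aggregate load those figures imply can be worked out
directly, taking "140 or so" as exactly 140 clients and "about once
every two minutes" as exactly 120 seconds (both roundings are
assumptions).

```python
clients = 140                 # "up to 140 or so" client workstations
seconds_between_txns = 120    # each client: about one transaction every two minutes

txns_per_second = clients / seconds_between_txns
txns_per_hour = txns_per_second * 3600
```

That comes to a little over one transaction per second at the server,
or roughly 4200 per hour.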

Daniel Weinreb		Object Design, Inc.		dlw@odi.com

dgh@unify.UUCP (David Harrington) (08/23/89)

In article <1989Aug18.135935.29299@odi.com> dlw@odi.com writes:
>In article <1037@unify.UUCP> dgh@unify.UUCP (David Harrington) writes:
>
>   I agree.  Look at the OODBMS companies like Ontologic.  They are either living
>   off an existing RDBMS which they are trying to re-cast as OO, or they are
>   dying.
>
>This is hardly a convincing argument.  Yes, Ontologic and Servio-Logic
>might be having problems or not growing as fast as they might, but
>it's not logical to leap to the conclusion that OODBMS technology
>isn't going to happen, or is only going to happen as a series of
>changes to existing RDBMS implementations. 

I didn't intend to make that point; indeed, I do believe that OODBMS
technology IS going to happen.  It was my understanding while I was at Servio
(yes, I'll admit to having a personal opinion about that company) that
Ontologic was grafting an OO layer onto their existing technology.  This, of
course, was seen by Servio as the wrong approach.  Subsequent events such as
the Index Technology deal (which I knew of) seem to tell me that the Ontologic
efforts are meeting some success.  I wish them well.

>There are many other
>factors that have shaped the courses of those two companies.  Merely
>because A is true and B is true does not mean that A caused B.  (It
>would be improper and highly rude of me to speculate about what those
>other factors are, but I can suggest that one has to do with the use
>of proprietary or unusual computer languages.  It's also possible (as
>you suggested) that they started a bit too early.)  

I was also suggesting that Servio has a fundamental problem with top
management, which is not qualified to run a technology company, much less
a leading edge technology company. (There is more than just opinion in this
statement; I'll be happy to share more info. over email).

The staff and managers at
Servio's GemStone group are fine people; too bad they won't ever get to
realize the fruits of their considerable labor in the market place.  (Example:
the CEO once said he wanted GS to be a "dBase killer" -- somewhat unclear on the
concept, I would suggest).

>
>Usual notice: I work for a start-up company producing an OODBMS.  In
>case anyone on this list doesn't recognize the name "Unify", Mr.
>Harrington works for a company that produces a relational DBMS and
>associated products.
>

Indeed, I do not speak against OODBMS either for myself or for Unify.  Anybody
in this business would be foolish to ignore this technology.

moiram@tekcae.CAX.TEK.COM (Moira Mallison) (08/24/89)

In article <1765@ethz.UUCP> marti@ethz.UUCP (Robert Marti) writes:
>With respect to the ongoing debate concerning OODBs vs extended RDBs...
>How about some benchmarks, controversial as they may be?

>Btw:  For me, functionality is a much more important point than
>performance.  However, most OODB followers emphasize the superior
>performance of OODBs.  So:  Put up or ...  :-)

We have recently published a technical report specifying the
Tektronix HyperModel Benchmark[1], an application-oriented evaluation
strategy for engineering DBMSs based on a generic hypertext system.
We outline functionality requirements as well as the performance
requirement that resulted from our study of engineering applications.

The functionality requirements include data model requirements (e.g.,
ability to model complex object structures, data type extensibility)
and database system requirements (e.g., client-server architecture,
concurrency control).

To measure performance, we use 22 operations.  Several of the 
operations are similar to those specified in the (Sun) Simple
Database Operations Benchmark [2].  In addition, we define
several variations of closure traversals over a 1-N hierarchy
and an M-N hierarchy, and editing operations on the formnodes and
textnodes.
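
For readers unfamiliar with the term, a closure traversal visits every
node reachable from a starting node.  The following is a minimal sketch
only, not taken from the technical report: the dictionary-of-lists
layout and the names closure_1n and closure_mn are assumptions made for
illustration.  In a 1-N hierarchy each node has one parent, so a plain
walk is enough; in an M-N hierarchy nodes can be shared, so the walk
has to remember what it has already seen.

```python
from collections import deque


def closure_1n(children, start):
    """Preorder closure over a 1-N hierarchy (each node has one parent).

    `children` maps a node id to the list of its child ids (assumed layout).
    """
    result = []
    stack = [start]
    while stack:
        node = stack.pop()
        result.append(node)
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(children.get(node, [])))
    return result


def closure_mn(refs, start):
    """Breadth-first closure over an M-N hierarchy.

    Nodes may be reachable by several paths (or via cycles), so a
    `seen` set keeps each node from being visited more than once.
    """
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in refs.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

A benchmark run would time calls like these against nodes fetched from
the database rather than from an in-memory dictionary.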

The technical report presents the results of implementing the
benchmark using the GemStone and Vbase systems.  I am currently
working on RDBMS implementations (Ingres, Unify).  However, for
reasons irrelevant to this discussion, the motivating factor 
behind the RDBMS implementations was not to compare them with
OODBMS, so the development is on a different platform. It is 
uncertain when we will get everything on a single platform,
so that we can make some real comparisons.

Moira Mallison
CAX Data Management

[1] Anderson, Berre, Mallison, Porter, Schneider; The Tektronix
HyperModel Benchmark Specification, Tektronix TR #89-05, August, 1989.

[2] Rubenstein, Kubicar, Cattell; "Benchmarking Simple Database Operations,"
in Proceedings of the 1987 ACM SIGMOD Int'l Conference on the Management of
Data, SF, CA, May, 1987.

bsa@telotech.UUCP (Brandon S. Allbery) (08/28/89)

In article <1989Aug17.211534.28345@odi.com>, jack@odi (Jack Orenstein) writes:
+---------------
| The two-language paradigm of RDBMSs complicates the writing of
| applications, and has performance consequences as well.  This is a
| problem with implementations of the relational model, and not inherent
| in the relational model. OO DBMSs avoid these problems by offering a
| single language in which to write applications, pushing the
| responsibility for database access into the system, away from the
| user.
+---------------

Which raises the possibility that a version of C could be written that would
be able to treat transient and persistent RDBMS data in the same way.  Heck,
I'm actually *doing* this, after a fashion -- a certain program I've been
working on wants to be able to select data from various "tables" (not in an
RDBMS, actually, but they can be collectively treated as an RDBMS without
indexes), and I'm writing a selection language which is somewhat C-like.  (Not
complete C, because it's not needed for what I'm doing in this case.)  Even in
existing systems this could be precompiled into mixed C and SQL code.
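
The idea of selecting from "tables" that have no indexes can be
pictured roughly as follows.  This is an illustrative sketch only, not
the program described above, and it is written in Python rather than a
C-like selection language; the select function, the parts table, and
its column names are all hypothetical.

```python
def select(table, predicate, columns=None):
    """Return the rows of `table` for which `predicate(row)` is true.

    `table` is a list of dicts (one dict per row).  With no indexes,
    every selection is a full scan.  If `columns` is given, only those
    columns are kept in each returned row.
    """
    rows = []
    for row in table:
        if predicate(row):
            if columns is None:
                rows.append(dict(row))
            else:
                rows.append({c: row[c] for c in columns})
    return rows


# Hypothetical data, for illustration only.
parts = [
    {"id": 1, "name": "bolt", "qty": 40},
    {"id": 2, "name": "nut", "qty": 0},
    {"id": 3, "name": "washer", "qty": 12},
]

# Roughly "select name from parts where qty > 0".
in_stock = select(parts, lambda r: r["qty"] > 0, ["name"])
# in_stock == [{"name": "bolt"}, {"name": "washer"}]
```

The predicate is ordinary host-language code, which is the point of the
single-language approach: there is no second query language to embed.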

BTW, what exactly do you mean by your assertion that a pointer to persistent
data need not be declared differently from transient data?  The program needs
to be notified of the association between a particular data type and/or an
instance of that data type and an external data store *somehow*.  I can see
two such instances (variables) being declared the same way, but one of them
will have *some* operation done on it which will associate it with an external
object, thus effecting a "dynamic" change in definition.  Also, what does that
operation look like (in the code)?
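
To make the question concrete, here is one way the situation could
look.  This is a sketch under stated assumptions, not a description of
how any OODBMS mentioned in this thread works: it uses the shelve
module from Python's standard library as a stand-in external store, and
the names bump and open_persistent are hypothetical.  The code that
uses the data is identical for both variables; only the one binding
call differs.

```python
import shelve


def bump(counter, key):
    """Increment counter[key]; works on any dict-like mapping.

    Nothing here knows whether `counter` is transient or persistent.
    """
    counter[key] = counter.get(key, 0) + 1
    return counter[key]


def open_persistent(path):
    """The single operation that binds a mapping to an external store.

    shelve.open returns a dict-like object backed by a file at `path`
    (keys must be strings).  The caller should close() it when done.
    """
    return shelve.open(path)
```

A transient counter would be created with `{}` and a persistent one
with `open_persistent("some/path")`; after that, both are passed to
bump in exactly the same way.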

++Brandon (RDBMS hacker -- but I truly want to know about OODB's)
-- 
-=> Brandon S. Allbery @ telotech, inc.   (I do not speak for telotech.) <=-
Any comp.sources.misc postings sent to this address will be DISCARDED -- use
allbery@uunet.UU.NET instead. My boss doesn't pay me to moderate newsgroups.
** allbery@NCoast.ORG ** uunet!hal.cwru.edu!ncoast!{allbery,telotech!bsa} **