[comp.databases] What is an OODBMS

wlp@calmasd.GE.COM (Walter L. Peterson, Jr.) (09/01/88)

In article <408@cullsj.UUCP>, gupta@cullsj.UUCP (Yogesh Gupta) writes:
>[...] 
> What do people think of when they think of an OODBMS?  Is it an Object
> Oriented System with the ability to have persistent objects?  Shared
> Objects?  Transactions?  Recoverability?  Authorization and Security?
> What is the interface to it? Is an OODBMS a complete environment, or
> not? ...
> 
> I believe that the question of the interface is very important because
> it defines the behaviour of the system.
> 
> OOPSLA 88 should be interesting...
> 


Persistent objects within an Object-Oriented System are, of course,
part of the answer.  It is very nice to be able to enter an
application, work for a while and then exit the program without
having to explicitly write storage methods on all of your objects.
This can be done by embedding a save/restore mechanism within the 
language itself.  While this capability is very nice to have, it is
not what comes to mind when one hears the term "Database".

The other elements that you mention are closer to the common perception
of what a database is.  Data sharing is critical; without that most
of the power and indeed the reason for having a database go away.  The
need for transaction support is, I believe, also critical.  You have
to be able to establish logical units of work and then commit or rollback
those transactions, just as in an RDBMS.  The concept of transactions 
also provides a basis for data sharing, since it is usually the framework
around which locking occurs.
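
A minimal sketch of the commit/rollback idea above, in Python. The
`ObjectStore` class and its method names are hypothetical and exist only
for illustration; a real system would also need locking and durable
storage, which this toy leaves out.

```python
import copy


class ObjectStore:
    """Toy in-memory object store with one transaction at a time (illustrative only)."""

    def __init__(self):
        self._committed = {}   # object id -> last committed state
        self._working = None   # private working copy while a transaction is open

    def begin(self):
        if self._working is not None:
            raise RuntimeError("a transaction is already open")
        # Work on a copy so uncommitted changes never touch committed state.
        self._working = copy.deepcopy(self._committed)

    def put(self, oid, state):
        if self._working is None:
            raise RuntimeError("no open transaction")
        self._working[oid] = state

    def get(self, oid):
        source = self._committed if self._working is None else self._working
        return source[oid]

    def commit(self):
        if self._working is None:
            raise RuntimeError("no open transaction")
        self._committed = self._working   # the whole unit of work becomes visible
        self._working = None

    def rollback(self):
        self._working = None              # discard the whole unit of work


store = ObjectStore()

store.begin()
store.put("part-1", {"name": "bracket", "qty": 10})
store.commit()

store.begin()
store.put("part-1", {"name": "bracket", "qty": 0})
store.put("part-2", {"name": "hinge", "qty": 5})
store.rollback()   # neither change survives
```

Both updates in the second transaction are discarded together, which is
the "logical unit of work" property: either all of the changes are kept
or none are.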

Recovery is important in both the storage of persistent objects and in
an OODBMS.  Murphy is, after all, an optimist.  If things can go wrong
they will and any storage system has to be able to recover as gracefully
and as fully as possible from system crashes, power outages and accidentally
hitting CTRL-C.  If it does not or cannot do this, then the system
will never be more than a laboratory curiosity ( which is all that many
OODBMSs are ).

From the point of view of a definition of what an OODBMS is or should
be, the problems of authorization and security are secondary; from the
standpoint of producing a commercially viable product they are not,
however.  Any commercial product will have to address these as closely as,
if not more closely than, the current most successful RDBMSs do.

The problem of the interface is FAR too broad ( to say nothing of
divisive ) to go into in this posting ( I hope to have more to say on
this soon - before OOPSLA, if possible ).

This collection of issues only scratches the surface of the OODBMS question.
It leaves out one of the main issues that I feel is not being properly
addressed; that is "what about the methods?"  An object without methods 
isn't much of an object. How should an OODBMS handle the storage and 
retrieval of the methods on an object? How could the methods be brought 
into a running application, in a runnable form, when the application
retrieves the objects from the DB?  Is this possible ? ( I believe it is ).
Is it desirable? ( I also believe that it is ).

Another issue that is frequently not addressed by even the more commercially
successful RDBMSs is that of data integrity.  Even integrity issues as
simple as primary key - foreign key consistency are simply not addressed.
Most, if not all, data integrity checks can be made by a properly
designed ACTIVE data dictionary. I emphasized the word ACTIVE, since
far too many people, who should know better, still think that a data
dictionary is just a system documentation tool and not the actual
heart of the DBMS.  I know from first hand experience that an active
data dictionary can solve many of the problems that are encountered in
producing a commercial DBMS and that, given the power of object-oriented
programming, such a data dictionary is really quite easy to design and
implement.
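
As a rough illustration of what "active" could mean here, the Python
sketch below has the dictionary itself check primary key - foreign key
consistency on every insert and delete.  The `DataDictionary` class, its
methods and the sample classes are all made up for this example and are
not taken from any product.

```python
class DataDictionary:
    """Toy "active" dictionary: it holds the schema and also polices every update."""

    def __init__(self):
        self._foreign_keys = {}   # class name -> {field name: referenced class name}
        self._objects = {}        # class name -> {primary key: record}

    def define(self, class_name, foreign_keys=None):
        self._foreign_keys[class_name] = dict(foreign_keys or {})
        self._objects[class_name] = {}

    def insert(self, class_name, key, record):
        # Every foreign-key field must point at an existing primary key.
        for field, target in self._foreign_keys[class_name].items():
            if record[field] not in self._objects[target]:
                raise ValueError(
                    f"{class_name}.{field}={record[field]!r} has no matching {target}"
                )
        self._objects[class_name][key] = record

    def delete(self, class_name, key):
        # Refuse to delete a record that other records still reference.
        for other, fields in self._foreign_keys.items():
            for field, target in fields.items():
                if target != class_name:
                    continue
                if any(rec[field] == key for rec in self._objects[other].values()):
                    raise ValueError(f"{class_name} {key!r} is still referenced by {other}")
        del self._objects[class_name][key]


dd = DataDictionary()
dd.define("Department")
dd.define("Employee", foreign_keys={"dept": "Department"})

dd.insert("Department", "R&D", {"site": "San Diego"})
dd.insert("Employee", 1, {"name": "Ann", "dept": "R&D"})

errors = []
for action in (
    lambda: dd.insert("Employee", 2, {"name": "Bob", "dept": "Sales"}),  # no such department
    lambda: dd.delete("Department", "R&D"),                              # still referenced
):
    try:
        action()
    except ValueError as exc:
        errors.append(str(exc))
```

Because the checks live in the dictionary rather than in each
application, no caller can skip them.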

Well, I've stirred the pot long enough; let's see what responses this
brings.

(P.S. - Yes. OOPSLA '88 should be fun!)
-- 
Walt Peterson   GE-Calma San Diego R&D
"The opinions expressed here are my own and do not necessarily reflect those
of GE, GE-Calma, or anyone else."
...{ucbvax|decvax}!sdcsvax!calmasd!wlp        wlp@calmasd.GE.COM

gupta@cullsj.UUCP (Yogesh Gupta) (09/02/88)

In article <49@calmasd.GE.COM>, wlp@calmasd.GE.COM (Walter L. Peterson, Jr.) writes:
< 
< The other elements that you mention are closer to the common perception
< of what a database is.  Data sharing is critical; without that most
< of the power and indeed the reason for having a database go away.  The
< need for transaction support is, I believe, also critical.  You have
< to be able to establish logical units of work and then commit or rollback
< those transactions, just as in an RDBMS.  The concept of transactions 
< also provides a basis for data sharing, since it is usually the framework
< around which locking occurs.

I think that one of the things that is being noticed by a lot of OODBMS
developers/researchers is that two-phase locking does not lead to much
concurrency, especially in the case of OODBMS (I agree that the problem
also exists in RDBMS).  For example, ObServer (Brown Univ.) supports
more than just two-phase locking  (I do not know whether ENCORE
[OODBMS on top of ObServer] utilizes it or allows the user to specify
the locking protocol).

< [...]
< From the point of view of a definition of what an OODBMS is or should
< be, the problems of authorization and security are secondary; from the
< standpoint of producing a commercially viable product they are not,
< however.  Any commercial product will have to address these as closely as,
< if not more closely than, the current most successful RDBMSs do.

The reason to bring this up was that so far the authorization mechanisms
have been based on relations - a user may or may not access data from
a specific relation, or specific columns within a relation, or where the
value in some field is a specific value.  I think that for an OODBMS,
authorization and security would be based more on relationships; that
is, a user may be allowed to look at an object within a class iff it is
accessed through "relationship X" and not if it is accessed through
relationship Y.  Thus, the specification and implementation of access
authorization could be quite different from those that we see today.
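
One way to picture relationship-based access, as a Python sketch under my
own assumptions: rights are granted on named relationships, and the check
is made when a relationship is traversed.  The `Obj` and `Session`
classes and the relationship names are hypothetical.

```python
class Obj:
    """A stored object with named relationships to other objects."""

    def __init__(self, name):
        self.name = name
        self.links = {}   # relationship name -> list of related objects

    def relate(self, relationship, other):
        self.links.setdefault(relationship, []).append(other)


class Session:
    """A user's view of the database; rights are granted per relationship."""

    def __init__(self, user, allowed_relationships):
        self.user = user
        self.allowed = set(allowed_relationships)

    def follow(self, obj, relationship):
        # The check is on the path taken, not on the class of the target object.
        if relationship not in self.allowed:
            raise PermissionError(f"{self.user} may not traverse {relationship!r}")
        return list(obj.links.get(relationship, []))


ann = Obj("Ann")
dept = Obj("R&D")
payroll = Obj("Payroll")
dept.relate("members", ann)      # "relationship X"
payroll.relate("paid_to", ann)   # "relationship Y"

clerk = Session("clerk", allowed_relationships={"members"})
visible = [o.name for o in clerk.follow(dept, "members")]

try:
    clerk.follow(payroll, "paid_to")
    denied = False
except PermissionError:
    denied = True
```

The same object is reachable through one relationship and refused
through the other, which a purely table- or column-based grant cannot
express.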

< 
< The problem of the interface is FAR too broad ( to say nothing of

  That is why I brought it up :-).

< divisive ) to go into in this posting ( I hope to have more to say on
< this soon - before OOPSLA, if possible ).
<
< This collection of issues only scratches the surface of the OODBMS question.
< It leaves out one of the main issues that I feel is not being properly
< addressed; that is "what about the methods?"  An object without methods 
< isn't much of an object. How should an OODBMS handle the storage and 
< retrieval of the methods on an object? How could the methods be brought 
< into a running application, in a runnable form, when the application
< retrieves the objects from the DB?  Is this possible ? ( I believe it is ).
< Is it desirable? ( I also believe that it is ).

When I said "store Objects" I implied "objects and methods".  I agree with
your statements that the methods should be retrieved and be executable for
an OODBMS to be efficient enough to be a viable product.  I also agree that
it is doable.  However, the problems of sharing, modification and security
become worse - how does one prevent a user from applying incorrect methods
to an object?  How does one prevent a user from corrupting the methods and then
applying them on the objects? etc.

< Another issue that is frequently not addressed by even the more commercially
< successful RDBMSs is that of data integrity.

  We are changing that :-).

<                                               Even integrity issues as
< simple as primary key - foreign key consistency are simply not addressed.
< Most, if not all, data integrity checks can be made by a properly
< designed ACTIVE data dictionary. I emphasized the word ACTIVE, since
< far too many people, who should know better, still think that a data
< dictionary is just a system documentation tool and not the actual
< heart of the DBMS.  I know from first hand experience that an active
< data dictionary can solve many of the problems that are encountered in
< producing a commercial DBMS and that, given the power of object-oriented
< programming, such a data dictionary is really quite easy to design and
< implement.

I agree that integrity constraints are indispensable in a DBMS.  I also
think that this is something that would fall out of an OODBMS just because
one can specify methods that enforce integrity along with the definition of
the class.  In fact, the OODBMS really is nothing more than a repository :-),
and the methods of a class define the operations on the objects in that
class.  So, the collection of the methods IS the active data dictionary.
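
A small Python sketch of integrity "falling out" of the class
definition: state can only be changed through methods, and the methods
carry the rules.  The `Part` class and its constraint are invented for
illustration.

```python
class Part:
    """Illustrative class whose methods are the only way to change its state."""

    def __init__(self, part_no, quantity):
        if not part_no:
            raise ValueError("part_no must not be empty")
        self._part_no = part_no
        self._quantity = 0
        self.set_quantity(quantity)   # reuse the checked path

    @property
    def part_no(self):
        return self._part_no          # read-only: the key cannot be reassigned

    @property
    def quantity(self):
        return self._quantity

    def set_quantity(self, value):
        # The integrity rule lives in the class definition itself.
        if not isinstance(value, int) or value < 0:
            raise ValueError("quantity must be a non-negative integer")
        self._quantity = value


bolt = Part("B-100", 25)
bolt.set_quantity(30)

try:
    bolt.set_quantity(-1)
    rejected = False
except ValueError:
    rejected = True
```

Python only discourages direct access to `_quantity` by convention; a
database would have to enforce that boundary for real.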

< 
< Well, I've stirred the pot long enough; let's see what responses this
< brings.
< 
< Walt Peterson   GE-Calma San Diego R&D
< "The opinions expressed here are my own and do not necessarily reflect those
< GE, GE-Calma nor anyone else.
< ...{ucbvax|decvax}!sdcsvax!calmasd!wlp        wlp@calmasd.GE.COM

Thanks for your response.  I hope that others get into this also.
-- 
Yogesh Gupta                    | If you think my company will let me
Cullinet Software, Inc.         | speak for them, you must be joking.

wlp@calmasd.GE.COM (Walter L. Peterson, Jr.) (09/03/88)

In article <410@cullsj.UUCP>, gupta@cullsj.UUCP (Yogesh Gupta) writes:
> In article <49@calmasd.GE.COM>, wlp@calmasd.GE.COM (Walter L. Peterson, Jr.) writes:
> < [... my stuff about transactions & locking deleted...]
> 
> I think that one of the things that is being noticed by a lot of OODBMS
> developers/researchers is that two-phase locking does not lead to much
> concurrency, especially in the case of OODBMS (I agree that the problem
> also exists in RDBMS).  For example, ObServer (Brown Univ.) supports
> more than just two-phase locking  (I do not know whether ENCORE
> [OODBMS on top of ObServer] utilizes it or allows the user to specify
> the locking protocol).

Locking has always been a problem no matter what data model is being
used, and it doesn't seem to be getting any better with OODBMSs.  One
of the major problems that we face is that many of the commercially
available OODBMS systems are designed to work with CAD systems.  CAD
introduces its own special problems to any locking mechanism, in that
the frequency, granularity and longevity of CAD data accesses are very
different from the typical MIS application.  CAD data accesses tend to
be infrequent, long-term accesses of very large amounts of closely 
related data.  Most OODBMSs that have been produced to support CAD
data have opted for a "check-in-check-out" type of system, of the kind a code
management system (such as SCCS) uses, rather than relying on a locking
scheme.  The problem is still there and I think that it will require
some original thinking to solve, not just adapting what has been done
a million times before.
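
For readers who have not used such a system, here is a toy Python sketch
of the check-out/check-in pattern.  The `Repository` class and the sample
design are hypothetical; real products differ in how they handle
versions, merges and abandoned check-outs.

```python
import copy


class Repository:
    """Toy check-out/check-in store for long-lived design sessions."""

    def __init__(self):
        self._designs = {}       # design name -> stored design
        self._checked_out = {}   # design name -> user holding it

    def add(self, name, design):
        self._designs[name] = copy.deepcopy(design)

    def read(self, name):
        # Readers always see the last checked-in version.
        return copy.deepcopy(self._designs[name])

    def check_out(self, name, user):
        design = self._designs[name]
        holder = self._checked_out.get(name)
        if holder is not None:
            raise RuntimeError(f"{name} is already checked out by {holder}")
        self._checked_out[name] = user
        return copy.deepcopy(design)   # private working copy, held for hours or days

    def check_in(self, name, user, design):
        if self._checked_out.get(name) != user:
            raise RuntimeError(f"{name} is not checked out by {user}")
        self._designs[name] = copy.deepcopy(design)
        del self._checked_out[name]


repo = Repository()
repo.add("gearbox", {"rev": 1, "parts": ["shaft", "gear"]})

work = repo.check_out("gearbox", "designer-a")
work["parts"].append("housing")
work["rev"] = 2

during = repo.read("gearbox")          # others still see rev 1
repo.check_in("gearbox", "designer-a", work)
after = repo.read("gearbox")
```

The check-out is held across the whole editing session rather than for
the length of a short transaction, which is the difference from ordinary
locking that matters for CAD.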

> 
> < [...my stuff about authorization deleted...]
> 
> The reason to bring this up was that so far the authorization mechanisms
> have been based on relations - a user may or may not access data from
> a specific relation, or specific columns within a relation, or where the
> value in some field is a specific value.  I think that for an OODBMS,
> authorization and security would be based more on relationships; that
> is, a user may be allowed to look at an object within a class iff it is
> accessed through "relationship X" and not if it is accessed through
> relationship Y.  Thus, the specification and implementation of access
> authorization could be quite different from those that we see today.
>

Oh-oh. I think we have just hit one of those "terminology barriers" as
I call them, that seem to be plaguing object-oriented technology.
From what you say about "columns" I think that you are referring to
relations as in "tables".  However, there is another usage in some
object-oriented languages that would be ideal for authorization and
security.  This type of "relation" is a semantic construct ( in at least
one OO-language that I know of ) that expresses relationships, associations,
and constraints between objects.  That language is DSM, which has been
developed by GE Corporate R&D and GE-Calma R&D.  It is the language
in which I work and have had some small part in helping to develop.
Using DSM relations, one could leave the issue of authorization and
security completely up to the applications and not have the database
system ITSELF have to worry about it.  ( For more on DSM see :
Rumbaugh, J.E. "Relations as Semantic Constructs in an
Object-Oriented Language" OOPSLA-87 p. 466   and
Blaha, M.R., W.J. Premerlani and J.E. Rumbaugh "Relational Database
Design Using an Object-Oriented Methodology"  CACM April, 88 V. 31 No. 4 )
( Jim Rumbaugh is the original designer of DSM ).

I did not mean to down-play the importance of authorization. I did,
however, mean to show that if we simply map objects onto the basic
architecture of the "traditional" DBMS, we will be accepting a lot of
system overhead that is simply not needed in an OODBMS, where the
semantics of the objects and their relations can do the work.

> 
> < [...my stuff about storage & retrieval of methods deleted...]
> 
> When I said "store Objects" I implied "objects and methods".  I agree with
> your statements that the methods should be retrieved and be executable for
> an OODBMS to be efficient enough to be a viable product.  I also agree that
> it is doable.  However, the problems of sharing, modification and security
> become worse - how does one prevent a user from applying incorrect methods
> to an object?  How does one prevent a user from corrupting the methods and then
> applying them on the objects? etc.

Preventing users from applying incorrect methods on an object is relatively
straightforward.  That can easily be handled by the method search
mechanism of the Object-Oriented language that is being used.  If the
method exists for an object class, then it is used; if not, the
object's superclass is searched for a method of that name, then its
superclass, and so on until the top of the class hierarchy is reached.
If at that point there is still no method by that name or none with the
proper invocation, then it is up to the language system to gracefully
handle the error.
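
The search described above can be sketched in a few lines of Python.
This is a hand-rolled illustration, not how DSM or any particular
language implements it; `ObjClass`, `send` and `MessageNotUnderstood` are
names made up for the example, and the check for a "proper invocation"
(matching arguments) is left out.

```python
class MessageNotUnderstood(Exception):
    """Raised when no class in the hierarchy defines the requested method."""


class ObjClass:
    """A class record: a name, an optional superclass, and a method table."""

    def __init__(self, name, superclass=None, methods=None):
        self.name = name
        self.superclass = superclass
        self.methods = dict(methods or {})

    def lookup(self, selector):
        cls = self
        while cls is not None:            # walk up toward the root of the hierarchy
            if selector in cls.methods:
                return cls.methods[selector]
            cls = cls.superclass
        raise MessageNotUnderstood(f"{self.name} has no method {selector!r}")


def send(cls, state, selector, *args):
    """Find the method for `selector` starting at `cls` and apply it to `state`."""
    return cls.lookup(selector)(state, *args)


shape = ObjClass("Shape", methods={"describe": lambda state: "a shape"})
rectangle = ObjClass(
    "Rectangle",
    superclass=shape,
    methods={"area": lambda state: state["w"] * state["h"]},
)

box = {"w": 2, "h": 3}
area = send(rectangle, box, "area")             # found on Rectangle itself
description = send(rectangle, box, "describe")  # inherited from Shape

try:
    send(rectangle, box, "rotate")
    understood = True
except MessageNotUnderstood:
    understood = False
```

A method that exists nowhere in the chain ends in a well-defined error
instead of being applied to the wrong kind of object.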

The ability of a user to corrupt a retrieved method is a bit more
involved, but if there is an adequate authorization scheme in use then
the user class(es) can be restricted to allow only certain users to 
modify existing methods.  This could even be carried a step further by
not allowing "run-time" access to the 'text' of the methods and
requiring that methods be loaded into the system only by some
"super-user" like the DBA. ( By the way, we at GE-Calma have coined the
term Object-base Administrator or OBA, since we see the duties as
being a bit more complex due to this very issue of method use.  I made
the unfortunate mistake of producing the 'strawman' document on our need
for an OBA and as a result I was selected to be it. Lucky me. :-) )

> 
> < [...my stuff about data integrity and data dictionaries deleted...]
> 
> I agree that integrity constraints are indispensable in a DBMS.  I also
> think that this is something that would fall out of an OODBMS just because
> one can specify methods that enforce integrity along with the definition of
> the class.  In fact, the OODBMS really is nothing more than a repository :-),
> and the methods of a class define the operations on the objects in that
> class.  So, the collection of the methods IS the active data dictionary.
> 

Using methods to enforce data integrity, or more properly referential
integrity, is part of the solution, but it is not a complete solution.
Method enforcement of integrity constraints would be a bit cumbersome
for some applications, especially if the OODBMS were to be used by
clerical personnel, with little or no training in programming. We
can't always expect that the users of the system will have a large DP
shop at their disposal to write ad hoc methods for them. The users
have to have some sort of database language like SQL ( note: that
means "on the general idea of SQL"; NOT SQL itself !!! ) that will
allow the use of a simple data definition language which has the
necessary syntax and semantics to define the integrity constraints
when the user is defining data objects.  These requirements can, to
some extent, be implemented within the objects being defined, but much
of it will still have to be done by a data dictionary.  In this
context the class hierarchy ( as in Smalltalk ) may be thought of as
being at least part of the data dictionary, if not actually being the
DD itself.

I think it is very important for us ( those involved in OODBMS R&D )
to think in terms of flexibility.  One of the reasons for the
success of the relational model is its (perceived) greater flexibility
than the hierarchical or network models.  With object-oriented
systems, I believe that we can attain even greater flexibility, and we
cannot allow ourselves to be constrained by accepting the paradigms
of the systems built upon the previous models.


Notice that once again I have studiously avoided talking about the
user interface !


>
> Thanks for your response.  I hope that others get into this also.


-- 
Walt Peterson   GE-Calma San Diego R&D (Object and Data Management Group)
"The opinions expressed here are my own and do not necessarily reflect those
of GE, GE-Calma, or anyone else."
...{ucbvax|decvax}!sdcsvax!calmasd!wlp        wlp@calmasd.GE.COM