clamen@CS.CMU.EDU (Stewart Clamen) (05/06/91)
I am doing research in the area of type evolution in databases
(object-oriented databases in particular) and wish to learn a bit more
about how existing systems deal with the problem. I'd much appreciate
it if those readers with intimate knowledge of a particular DBMS or
OODBMS would drop me a note telling me if and how the system provides
support for the evolution of types.

By "type evolution", I am referring to the process of redefining an
existing type (or schema in database parlance) in a DBMS and
provisions for the reformatting of the data representing the instances
of that type present in the database prior to the redefinition.

I am already familiar with the services provided by ORION and
GemStone.

[Please don't post your replies to the 'net. I will post a summary if
people express an interest.]
--
Stewart M. Clamen                Internet: clamen@cs.cmu.edu
School of Computer Science       UUCP:     uunet!"clamen@cs.cmu.edu"
Carnegie Mellon University       Phone:    +1 412 268 3620
Pittsburgh, PA 15213-3890, USA   Fax:      +1 412 268 1793
clamen@CS.CMU.EDU (Stewart Clamen) (05/22/91)
In article <CLAMEN.91May5174756@BYRON.SP.CS.CMU.EDU> I made the
following request:
I am doing research in the area of type evolution in databases
(object-oriented databases in particular) and wish to learn a bit more
about how existing systems deal with the problem. I'd much appreciate
it if those readers with intimate knowledge of a particular DBMS or
OODBMS would drop me a note telling me if and how the system provides
support for the evolution of types.
By "type evolution", I am referring to the process of redefining an
existing type (or schema in database parlance) in a DBMS and
provisions for the reformatting of the data representing the instances
of that type present in the database prior to the redefinition.
I am already familiar with the services provided by ORION and
GemStone.
I have so far received relevant information on the following
systems:
Research Systems    Commercial Systems
----------------    ------------------
ORION               Symbolics' Statice
IRIS                Versant
GemStone            Object Design
Encore              Objectivity
COOL/COCOON         ObjectStore
Exodus              Ontos (formerly VBase)
Machiavelli         NMP-CAD's Base/OPEN
ConceptBase         PICK
AVANCE              Persistent Data Systems' IDL products
Before I consider my survey complete, post what I have collected
to the net, and mail it to those who so requested, I'd like to make
a directed request for information on schema evolution and database
conversion as it pertains to other particular systems I have read about.
As you might have noticed, most of these systems are OODBMS. I am
interested in learning about the more traditional DBMSs, as well as
other non-OO systems. Notably:
ADAPLEX
CODASYL
CLASSIC
DAPLEX
POSTGRES
I would also like to hear something about Altair/O_2, an OODB
project in France.
Thank you again.
--
Stewart M. Clamen                Internet: clamen@cs.cmu.edu
School of Computer Science       UUCP:     uunet!"clamen@cs.cmu.edu"
Carnegie Mellon University       Phone:    +1 412 268 3620
Pittsburgh, PA 15213-3890, USA   Fax:      +1 412 268 1793
marcs@slc.com (Marc San Soucie) (05/23/91)
Stewart M. Clamen writes:
>
> In article <CLAMEN.91May5174756@BYRON.SP.CS.CMU.EDU> I made the
> following request:
>
> I am doing research in the area of type evolution in databases
> (object-oriented databases in particular) and wish to learn a bit more
> about how existing systems deal with the problem.
>
> I have so far, received relevant information now on the following
> systems:
>
> Research Systems    Commercial Systems
> ----------------    ------------------
> ORION               Symbolics' Statice
> IRIS                Versant
> GemStone            Object Design
> Encore              Objectivity
> COOL/COCOON         ObjectStore
> Exodus              Ontos (formerly VBase)
> Machiavelli         NMP-CAD's Base/OPEN
> ConceptBase         PICK
> AVANCE              Persistent Data Systems' IDL products

GemStone

Please be aware that GemStone is in fact a commercial product, not a
research product, and has been a commercial product since 1987.

Marc San Soucie
Servio Corporation
Beaverton, Oregon
marcs@slc.com
clamen@CS.CMU.EDU (Stewart Clamen) (05/30/91)
The following is the result of my public survey into the schema
evolution and database conversion support exhibited by known research
and commercial DB and OODBMS. Further contributions are welcome.

--------------------*-*---------------------

Direct quotes are attributed by including the email address of the
poster directly following the information. Prose written by the
poster, but with primary information provided by email, is so
identified. Information gleaned from publications is so noted.

The information included here is not intended to completely describe
the systems addressed, but rather to describe what support, if any,
is provided by the system for the evolution of schemas and the
conversion of database objects (class instances) resulting from the
schema change.

SMC

----------*-*----------

<<< EXTENDED RELATIONAL DB MODEL >>>

<< Research Systems >>

> POSTGRES (Berkeley)

You ask explicitly about type evolution. We support schema
modification on all classes, including user classes. This means that
you can add attributes (instance slots) and methods at any time.
Further, since postgres is a shared database system, such changes are
instantly visible to any other user of the class. The language syntax
supports attribute deletion, but the system won't do it yet. Since
all data is persistent, removing attributes from a class requires some
work -- you need to either get rid of or ignore all the values you've
already stored.
[mao@postgress.berkeley.edu]

<<< OO DATA MODEL >>>

<< Research Systems >>

> COOL/COCOON (ETH Zurich)

No implementation as yet.
Project goals are:
- to develop a general formal framework for investigations of all
  kinds of schema changes in object-oriented database systems
  (including schema design, schema modification, schema tailoring,
  and schema integration);
- to find implementation techniques for evolving database schemas,
  such that changes on the logical level propagate automatically to
  adaptations of the physical level (without the need to modify all
  instances, if possible).

Contact Markus Tresch <tresch@inf.ethz.ch> for more information.

> Encore (Brown)

Objects are never converted; rather, classes are versioned, and the
user can specify filters to make old-style instances appear as new
instances to new applications (and vice versa).

REFS: Andrea H. Skarra and Stanley B. Zdonik. "Type Evolution in an
Object-Oriented Database." In the book "Research Directions in
Object-Oriented Programming", by Shriver and Wegner. (An earlier
version of the paper appears in the proceedings of OOPSLA86.)
[clamen]

> ORION (MCC/Itasca System, Inc.)

ORION is a prototype OODBMS developed at MCC, an American consortium.
It is built on top of Common Lisp, and is intended to support
applications in the CAD/CAM, AI, and OIS domains. Advanced functions
supported include [object] versions, change notification, composite
objects, dynamic schema evolution, and multimedia data.

For schema evolution, ORION identifies a list of database-consistency
constraints that must be preserved across any class evolution
operation. They then list the types of evolution operations you can
perform, and how the relevant instances can be converted. Conversion
is performed as the instances are accessed.

I have found nearly a dozen papers published by the ORION folks. The
most recent and general one is:

W. Kim, N. Ballou, H-T. Chou, J.F. Garza, D. Woelk, and J. Banerjee.
"Integrating an Object-Oriented Programming System with a Database
System." Proceedings of OOPSLA88.
[Pointers to the previous papers documenting each of the advanced
features listed above are cited therein.]

The paper most relevant to the issue of schema evolution is the
following:

J. Banerjee, W. Kim, H.J. Kim, H.F. Korth. "Semantics and
Implementation of Schema Evolution in Object-Oriented Databases."
Proceedings of SIGMOD87.
[clamen]

> Exodus (UWisc)

No solution for the problem of schema evolution is provided.
Emulation is rejected by the authors, who claim that the addition of
a layer between the EXODUS Storage Manager and the E program would
seriously reduce efficiency. Automatic conversion, whether lazy or
eager, is also rejected, as it does not mesh well with the C++ data
layout. To implement immediate references to other classes and
structures, C++ embeds class and structure instances within the
object that refers to them. A change in the size of an embedded
object therefore changes the size of its container, which might
invalidate remote pointer references.

REFS: Joel E. Richardson and Michael J. Carey. "Persistence in the E
Language: Issues and Implementation." Appeared in "Software --
Practice and Experience", 19(12):1115-1150, December 1989.
[clamen]

> Machiavelli (UPenn)

Machiavelli is a statically-typed persistent programming language
project at the University of Pennsylvania. It does not address type
evolution.
[communication with limsoon@saul.cis.upenn.edu]

> ConceptBase

We have developed a deductive object-oriented database called
ConceptBase where everything (tokens, classes, meta-classes,
meta-meta-classes, attributes, instantiations, specializations) is
treated as an object. That means that you may update the "schema"
(classes) at any time just as any other ordinary object. The system
has (user-defined and builtin) integrity constraints that prevent
inconsistency (e.g. violation of referential integrity).

Integrity constraints in ConceptBase are (as in most other systems)
static, i.e., they are conditions that each database "state" must
satisfy. The data model we use does not distinguish schema level
information (i.e. classes) from instance level information. If you
change for example some classes and this change violates some
integrity constraints, e.g. some instances now don't have the right
attribute types anymore, then you have the choice either to reject
the update or to change the existing DB.

Currently, ConceptBase simply rejects such updates. We are thinking
of exploiting abduction (see VLDB'90 article of Kakas & Mancarella)
to make more clever reactions in the sense of "reformatting"
instances.
[Manfred Jeusfeld <jeusfeld@forwiss.uni-passau.de>]

> AVANCE (SYSLAB)

An object-oriented, distributed database programming language. Its
most interesting feature is the presence of system-level version
control, which is used to support schema evolution, system-level
versioning (as a way of improving concurrency), and objects with
their own notion of history. The system consists of a programming
language (PAL) and a distributed persistent object manager.

REFS: Anders Bjornerstedt and Stefan Britts. "AVANCE: An Object
Management System". Proceedings of OOPSLA88.
[clamen]

> Altair/O_2 (INRIA)

Neither of the two articles I have (bibliographic information below)
addresses the issue of schema evolution or database conversion.

REFS: F. Bancilhon, G. Barbedette, V. Benzaken, C. Delobel,
S. Gamerman, C. Lecluse, P. Pfeffer, P. Richard, and F. Velez. "The
Design and Implementation of O2, an Object-Oriented Database
System". Advances in Object-Oriented Database Systems, Springer
Verlag. (Lecture Notes in Computer Science series, Number 334.)

C. Lecluse, P. Richard, and F. Velez. "O2, an Object-Oriented Data
Model". Proceedings of SIGMOD88. Also appears in Zdonik and Maier,
"Readings in Object-Oriented Database Systems", Morgan Kaufmann, 1990.
[clamen]

> OTGen (CMU)

OTGen describes a scheme for computer-assisted schema evolution.
A wide variety of changes (wider than those supported by ORION or
GemStone) can be expressed in the evolution "mini-language", which
describes a procedure for transforming instances from their old to
new representations. Objects are converted as databases (which in
the envisioned OTGen system are rather small) are opened.

REFS: Barbara Staudt Lerner and A. Nico Habermann. "Beyond Schema
Evolution to Database Reorganization" in Proceedings of
OOPSLA/ECOOP '90.
[clamen, blerner@cs.umass.edu]

<< Commercial Systems >>

> CLOS

Not persistent, but implementations must support redefinition of
classes and the conversion (either lazy or eager) of existing
instances. [cf. CLtL II] In spite of this freedom, implementations
seem to convert lazily.
[communication with gregor@parc.xerox.com, hornig@symbolics.com,
dussud@lucid.com]

> Statice (Symbolics)

I'm familiar with Statice, sold by Symbolics Inc. The Statice
command "Update Database Schema" brings an existing database into
conformance with a modified schema. Changes are classified as either
compatible (lossless, i.e., completely information-preserving) or
incompatible (i.e., potentially information-losing in the current
implementation). Basically, any change is compatible except for the
following:

-- If an attribute's type changes, all such attributes extant are
   re-initialized (nulled out). Note that Statice permits an
   attribute to be of type T, the universal type. Such an attribute
   can then take on any value without schema modification or
   information loss.

-- If a type's inheritance (list of parents) changes, the type must
   be deleted and re-created, losing all extant instances of that
   type. This is Statice's most serious current limitation. The
   simplest workaround is to employ a database dumper/loader (either
   the one supplied by Symbolics or a customized one) to save the
   information elements and then reload them into the modified schema.
[lgm@iexist.att.com]

> Versant

Versant provides schema evolution.
But in the current release, only leaf classes in the schema can be
modified. Leaf classes can be added, dropped, renamed and individual
attributes and methods changed. The class instances are modified
later as they are accessed. There are no security mechanisms for
preventing users from changing schema. Schema changes are done using
a separate utility which compares files (with .sch extension) which
contain new schema definitions with those of a database and changes
the database schema so that there is no difference. In case of
conflicting class names or other situations, the user has control
over resolving the conflict.
[h.subramanian@trl.OZ.AU]

I've been looking at the C++ database vendors. Versant has schema
evolution at the leaf class level only. They're trying to come up
with a good way to do it for the general case. They talk about using
versioning to mark class evolution. Then they want to test
timestamps when an object is retrieved to see whether its class has
been changed. If it has, they reformat the object to conform to the
new definition at that time.
[arc!chet@apple.com]

> Object Design

Object Design, to the best of my knowledge, do[es]n't support schema
evolution at this time.
[arc!chet@apple.com]

> Objectivity

Objectivity, to the best of my knowledge, do[es]n't support schema
evolution at this time.
[arc!chet@apple.com]

> ObjectStore

ObjectStore does not provide schema evolution as yet but it has
promised to provide schema evolution in the next release.
[h.subramanian@trl.OZ.AU]

> Ontos [formerly VBase] (Ontologic)

Ontos provides schema evolution. It allows any class to be modified.
The major drawback is that data does not migrate, i.e., instances are
not modified to adapt to the new class definition. So schema changes
can be done only on classes that do not contain instances and do not
have subclasses that contain instances.
[h.subramanian@trl.OZ.AU]

As a system for experiments, we are currently using ONTOS from
Ontologic Inc.
Unfortunately, there is no transparent concept of schema evolution
for populated databases. Thus, we are still investigating how it
works.
[Markus Tresch <tresch@inf.ethz.ch>]

> GemStone (Servio-Logic)

The authors reject the emulation scheme and the lazy conversion
approach as previously outlined. Instead, they favor a mixed
strategy, which involves lazy conversion until the next garbage
collection, at which point all remaining old instances are upgraded.
(Their current implementation, however, does not yet support this
feature; conversion is done eagerly for the time being.)

They identify a list of constraints which must be preserved across
modification to type descriptions and to the inheritance hierarchy.
The authors then proceed to enumerate a number of categories of
object updates that are permitted, and what changes to the dependent
instances and subclasses must be performed in order to maintain the
integrity of the database (i.e., to preserve the above constraints).

REFS: Robert Bretl, David Maier, Allan Otis, Jason Penney, Bruce
Schuchardt, Jacob Stein, E. Harold Williams, Monty Williams. "The
GemStone Data Management System." Chapter 12 of "Object-Oriented
Concepts, Databases and Applications", by Kim and Lochovsky.
[clamen]

> Base/OPEN (NMP-CAD)

A structurally object-oriented system (i.e., methods are not stored);
only schema extension is supported. Instances of older type-versions
are never converted, but can coexist in the database with newer
objects.
[communication with tomas@basf.nmpcad.se]

<<< OTHER MODELS >>>

<< Commercial Systems >>

> Pick

With Pick and its variants you only have problems if you want to
redefine an existing field. Because of the way the data are stored
and the separation of the data and the dictionary you can define
additional fields in the dictionary without having to do anything to
the data - a facility which we have found very useful in a number of
systems.
There is no general facility to redefine an existing field - you just
make whatever changes are required in the dictionary then write an
Info Basic program to change the data. We have seldom needed to do
this, but it has not been complicated to do.
[Geoff Miller <ghm@ccadfa.cc.adfa.oz.au>]

> IDL (Persistent Data Systems)

IDL is a schema definition language. Schema modifications are
defined in IDL, requiring ad-hoc offline transformations of the
database, in general. A simple class of transformations can be
handled by IDL->ASCII and ASCII->IDL translators (e.g., integer
format changes, list->array, attribute addition).
[conversation with Ellen Borison of Persistent Data Systems]

<< Research Systems >>

> IRIS (HP Labs)

Objects in the Iris system may acquire or lose types dynamically.
Thus, if an object no longer matches a changed definition, the user
can choose to remove the type from the object instead of modifying
the object to match the type. In general, Iris tends to restrict
class modifications so that object modifications are not necessary.
For example, a class cannot be removed unless it has no instances and
new supertype-subtype relationships cannot be established.

REFS: D.H. Fishman, D. Beech, H.P. Cate, E.C. Chow, T. Connors,
J.W. Davis, N. Derrett, C.G. Hoch, W. Kent, P. Lyngbaek, B. Mahbod,
M.A. Neimat, T.A. Ryan, M.C. Shan. "Iris: An Object-Oriented
Database Management System". ACM Transactions on Office Information
Systems 5(1):48-69, Jan 1987.
[clamen]
--
Stewart M. Clamen                Internet: clamen@cs.cmu.edu
School of Computer Science       UUCP:     uunet!"clamen@cs.cmu.edu"
Carnegie Mellon University       Phone:    +1 412 268 3620
Pittsburgh, PA 15213-3890, USA   Fax:      +1 412 268 1793
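
[Editorial illustration, not part of the original post.] The survey
contrasts emulation through filters (as described for Encore) with
eager conversion (as described for GemStone's current
implementation). The difference can be sketched roughly as below.
This is a minimal sketch under stated assumptions: the record layout,
the `upgrade`, `convert_all`, and `V2Filter` names, and the "split a
name field" schema change are all hypothetical and do not come from
any of the systems surveyed.

```python
# Hypothetical illustration only; no name here is taken from Encore,
# GemStone, or any other surveyed system.

# Version 1 of a "person" type stores a single name field.
# Version 2 (the redefined type) stores first and last names separately.
OLD_RECORD = {"version": 1, "name": "Ada Lovelace"}


def upgrade(record):
    """Build a version-2 record from a version-1 record."""
    first, _, last = record["name"].partition(" ")
    return {"version": 2, "first": first, "last": last}


def convert_all(database):
    """Eager conversion: rewrite every old-style record in one pass."""
    return [upgrade(r) if r["version"] == 1 else r for r in database]


class V2Filter:
    """Emulation: present a stored version-1 record through the
    version-2 interface without ever rewriting the stored data."""

    def __init__(self, record):
        self._record = record

    @property
    def first(self):
        return self._record["name"].partition(" ")[0]

    @property
    def last(self):
        return self._record["name"].partition(" ")[2]
```

With eager conversion the stored data changes once and old-style
instances disappear; with a filter the stored data never changes and
the translation cost is paid on every access.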
jgk@osc.COM (Joe Keane) (06/04/91)
In article <CLAMEN.91May29225024@BYRON.SP.CS.CMU.EDU> clamen+@CS.CMU.EDU
compiles the following comments:
>Versant provides schema evolution. But in the current release, only
>leaf classes in the schema can be modified. Leaf classes can be
>added, dropped, renamed and individual attributes and methods
>changed. The class instances are modified later as they are accessed.
>There are no security mechanisms for preventing users from
>changing schema. Schema changes are done using a separate utility
>which compares files (with .sch extension) which contain new schema
>definitions with those of a database and changes the database schema
>so that there is no difference. In case of conflicting class names
>or other situations user has control on resolving the conflict.
>[h.subramanian@trl.OZ.AU]

This is an accurate description.

>I've been looking at the C++ database vendors. Versant has schema
>evolution at the leaf class level only. They're trying to come up with
>a good way to do it for the general case.

It's true that we currently only support evolution of leaf classes.
However, I'd like to point out that this is only an implementation
restriction of the current release, and there are no architectural or
technical problems with doing it.

As a practical matter, I haven't heard our customers complaining about
this restriction. I'd guess that if you're changing the design of
your base classes, then you have more to worry about than evolving
your current databases.

>They talk about using
>versioning to mark class evolution. Then they want to test timestamps
>when an object is retrieved to see whether its class has been changed.
>If it has, they reformat the object to conform to the new definition
>at that time.
>[arc!chet@apple.com]

These are all things we do in our released product, with the
restriction given above. A minor correction is that we use internal
class version identifiers, rather than comparing timestamps.

Disclaimer: I work for Versant.
--
Joe Keane, professional C programmer
jgk@osc.com (...!uunet!stratus!osc!jgk)
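
[Editorial illustration, not part of the original post.] The scheme
described above, in which each stored object carries an internal
class version identifier and is reformatted when it is retrieved, can
be sketched roughly as follows. This is a minimal sketch under stated
assumptions: `Store`, `UPGRADES`, `CURRENT_VERSION`, and the example
attributes are hypothetical names chosen for illustration and are not
Versant's API or storage format.

```python
# Hypothetical sketch of lazy conversion keyed on class version
# identifiers. None of these names are taken from Versant.

CURRENT_VERSION = 3

# One upgrade step per class version: UPGRADES[n] turns a version-n
# object into a version-(n + 1) object.
UPGRADES = {
    1: lambda obj: {**obj, "email": None},  # version 2 added "email"
    2: lambda obj: {k: v for k, v in obj.items() if k != "fax"},  # version 3 dropped "fax"
}


class Store:
    """Toy object store that records a class version with each object."""

    def __init__(self):
        self._objects = {}

    def put(self, oid, obj, version=CURRENT_VERSION):
        self._objects[oid] = (version, dict(obj))

    def stored_version(self, oid):
        return self._objects[oid][0]

    def get(self, oid):
        """Return the object, upgrading it first if its class has changed."""
        version, obj = self._objects[oid]
        while version < CURRENT_VERSION:
            obj = UPGRADES[version](obj)
            version += 1
        # Write the converted form back so the work is done only once.
        self._objects[oid] = (version, obj)
        return obj
```

Comparing small integer version identifiers avoids any dependence on
clocks, and chaining one upgrade step per version lets an object that
has not been touched for several schema changes catch up in a single
retrieval.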