rowley@bath.cs.ucla.edu (Michael T Rowley) (05/29/91)
On the topic of whether the extension of a class (a collection of all its instances) should be available:

|> In article <4863@osc.COM>, jgk@osc.COM (Joe Keane) writes:
|>
|> But how about `find all instances of a given class'? What does it mean? Find
|> all instances in memory? In general only some of the objects you're working
|> with are actually in main memory at a given time, and the whole point of an
|> ODBMS is to hide whether objects are `in' or `out'. Find all instances in all
|> databases you're connected to? Again, which databases you're connected to
|> should not be such an important part of your state. Find all instances in all
|> databases in existence? This makes sense, but it has some obvious practical
|> problems.

This last approach is the approach taken by relational databases, assuming you map 'relation' in relational databases to 'class' in OODBs. Relational database systems also manage to provide decent concurrency control mechanisms in addition to this facility. Is there anything about OODBs which makes the problem of reaching all instances of a class more difficult for them than it is for relational databases?

In general, a database which is only used for application programming can get away without support for listing the extension of a class. However, if there is to be an associative query language, it must be possible for the user to say "give me all objects which meet these conditions". One of the conditions would be the object's class, since this determines what other aspects of the objects may be specified in the conditions (in relational databases this means specifying the relation).

When relational databases took over from network databases, one of their main selling points was their support for an associative query language. If OODBs do not also support this capability, many people will write them off, incorrectly, as souped-up network databases.

It should be possible for OODBs to provide an associative query language, as long as there is a facility for retrieving all instances of a class.

Michael Rowley
dlw@odi.com (Dan Weinreb) (05/29/91)
In article <1991May28.232832.28284@cs.ucla.edu> rowley@bath.cs.ucla.edu (Michael T Rowley) writes:
   Is there
   anything about OODBs which makes the problem of reaching all instances
   of a class more difficult for them than it is for relational databases?

No, there isn't. An OODB can work either way. It can be designed so
that it always maintains an explicit, semantically-visible extent for
all objects of a certain class within a certain database, or it can be
designed so that it does not necessarily do so.

   However, if there is to be an associative query language, it must be
   possible for the user to say "give me all objects which meet these
   conditions".

No, that's not true. If you have an OODB that does not keep extents,
you can still have an associative query language. Each query simply
needs to be handed a "collection" object. So a typical query might be
"find all of the employees within this set of employees for which the
salary is greater than 42", just like the standard mathematical
notation {x element-of X | x.salary > 42} (for "element-of" read a
little epsilon). The queries can be as complicated as you like; more
than one collection can be involved; and automatic optimization can be
performed.
So an OODB can have associative queries even if it does not
automatically maintain extents. These are two separate, orthogonal
issues.
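
The point above, that a query only needs to be handed a collection, can be illustrated with a minimal sketch. It is not any particular OODB's query interface: the `Employee` type, the `select` helper, and the `night_shift` set are hypothetical names invented for illustration. The set is an ordinary application-owned collection, and no class-wide extent is consulted.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Employee:
    # Hypothetical record type, used only for this illustration.
    name: str
    salary: int


def select(collection: Iterable[Employee],
           predicate: Callable[[Employee], bool]) -> List[Employee]:
    """Return the members of `collection` that satisfy `predicate`.

    This mirrors the set-builder form {x element-of X | condition(x)}:
    the caller supplies X explicitly instead of relying on an extent.
    """
    return [x for x in collection if predicate(x)]


# An ordinary set owned by the application; the system keeps no extent.
night_shift = {Employee("ann", 50), Employee("bob", 30), Employee("cy", 45)}

well_paid = select(night_shift, lambda x: x.salary > 42)
names = sorted(e.name for e in well_paid)
```

The query states only the condition; how the collection is scanned (or whether an index is used) is left to `select`, which is where an optimizer could plausibly sit.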
rowley@bath.cs.ucla.edu (Michael T Rowley) (05/30/91)
In article <1991May29.134314.6850@odi.com>, dlw@odi.com (Dan Weinreb) writes:
|> In article <1991May28.232832.28284@cs.ucla.edu> rowley@bath.cs.ucla.edu (Michael T Rowley) writes:
|>
|>    However, if there is to be an associative query language, it must be
|>    possible for the user to say "give me all objects which meet these
|>    conditions".
|>
|> No, that's not true. If you have an OODB that does not keep extents,
|> you can still have an associative query language. Each query simply
|> needs to be handed a "collection" object. So a typical query might be
|> "find all of the employees within this set of employees for which the
|> salary is greater than 42", just like the standard mathematical
|> notation {x element-of X | x.salary > 42} (for "element-of" read a
|> little epsilon). The queries can be as complicated as you like; more
|> than one collection can be involved; and automatic optimization can be
|> performed.
|>
|> So an OODB can have associative queries even if it does not
|> automatically maintain extents. These are two separate, orthogonal
|> issues.

It may be true that associative queries can work with arbitrary collections, in which case the term "associative query language" may not be the best descriptor of the features I'm trying to describe. Maybe a better description would be that the query language is declarative, rather than imperative. It is desirable to be able to specify a query by stating the conditions which the result objects must meet --- without specifying a navigational strategy for retrieving the objects.

All queries must start with some known object. This may be a globally known object (relation or class) or it may be the result of a previous query. In network databases not very much could be retrieved by a single query using only global objects, so the user would have to make multiple queries, each building off the previously retrieved data. In order to facilitate this, the query language looked very much like an imperative programming language, complete with "cursors" to keep track of past results.

Relational databases, on the other hand, easily reach all objects in the database from queries starting with only globally known objects (relations). As a result, it is easy for these databases to provide a declarative query language.

In the example you wrote above:

   "find all of the employees within this set of employees
    for which the salary is greater than 42"

the important words are "within this set of employees". The interesting set of employees may only be reachable through a multi-step navigation through the network. In which case, the user would be reduced to writing imperative programs to retrieve his data. I think the user community will balk at this.

Hence, I think all objects in an OODB should be easily reachable from globally known objects. The most obvious candidates for such objects in an OODB would be collections representing the extents of the classes.

Another, less desirable solution would be to provide a single collection representing every object in the database. The problem with this is that it would be hard to write the conditions for the result objects, since the questions that can be asked of objects depend on their types.

Michael Rowley
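
The contrast drawn in this post, between navigating to the interesting employees and stating a condition over a globally known extent, might look roughly like the following. This is a sketch under assumptions: the `Company`/`Department`/`Employee` schema, both function names, and the hand-built `employee_extent` list are hypothetical and stand in for whatever a real system would provide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Employee:
    name: str
    salary: int


@dataclass
class Department:
    name: str
    staff: List[Employee] = field(default_factory=list)


@dataclass
class Company:
    departments: List[Department] = field(default_factory=list)


ann, bob, cy = Employee("ann", 50), Employee("bob", 30), Employee("cy", 45)
company = Company([Department("eng", [ann, bob]), Department("ops", [cy])])


def well_paid_by_navigation(root: Company, threshold: int) -> List[str]:
    """Navigational style: the caller spells out the path step by step."""
    found = []
    for dept in root.departments:      # step 1: company -> departments
        for emp in dept.staff:         # step 2: department -> staff
            if emp.salary > threshold:
                found.append(emp.name)
    return found


# Stand-in for a system-maintained class extent (assumed, built by hand here).
employee_extent = [ann, bob, cy]


def well_paid_from_extent(extent: List[Employee], threshold: int) -> List[str]:
    """Declarative style: only the condition is stated, not the access path."""
    return [e.name for e in extent if e.salary > threshold]
```

Both functions return the same names for this data; the difference is that the first must change whenever the path from `Company` to `Employee` changes, while the second depends only on the extent being available.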
dlw@odi.com (Dan Weinreb) (05/31/91)
In article <1991May30.003525.21161@cs.ucla.edu> rowley@bath.cs.ucla.edu (Michael T Rowley) writes:
   the important words are "within this set of employees". The
   interesting set of employees may only be reachable through a multi-step
   navigation through the network. In which case, the user would be
   reduced to writing imperative programs to retrieve his data. I think
   the user community will balk at this.

(1) May or may not. If you are using a non-enforced-extent OODB, you
are perfectly free to maintain extents anyway. Or there might be
several sets of employees, all of which are stored in global
variables. A non-enforced-extent OODB gives you the freedom to build
up complex structures, but you don't have to use it if you don't want to.
(2) Which user community? Different user communities have different
ideas about how they want to organize data. Consider the existing
community of ECAD software developers. They write in programming
languages such as Pascal or C++, and they generally don't store their
transistors and so on in database systems. I think you'll find that
they don't all maintain a concept of "the set of all resistors" and so
on for every data type in the entire system. Just because they want
to switch to using an OODB does not necessarily mean that they want to
change their whole idea of how to organize their data structures. In
fact, it might even be a goal to *not* be forced into changing all of
the data structures. Perhaps you would argue that they are all wrong;
that they really ought to change their data structures; that they have
been suffering for years with an inferior notion, and they are simply
unaware of their own pain. I don't know how many of them you'd
persuade. A key question is why databases should be so different from
programming languages in this regard. Surely questions of
persistence, concurrency control, and recovery have nothing to do with
it. Why doesn't Pascal have a way to say "iterate over all the records
of type foo?", and why don't people think that this lack makes Pascal
unacceptable?

   Hence, I think all objects in an OODB should be easily reachable from
   globally known objects. The most obvious candidates for such objects
   in an OODB would be collections representing the extents of the
   classes.

I certainly think any OODB should allow you to do this. The only
question is whether it should force you to do this.
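
The "allow but not force" position can be sketched as an opt-in mechanism. Everything here is an assumption made for illustration: the `with_extent` decorator, the `extent_of` function, and the `Employee` and `Resistor` classes are hypothetical, and the registry is an in-memory dictionary rather than anything persistent.

```python
from typing import Dict, List

# Registry of extents, keyed by class. Only opted-in classes appear here.
_extents: Dict[type, List[object]] = {}


def with_extent(cls):
    """Class decorator: record every new instance of `cls` in its extent."""
    _extents[cls] = []
    original_init = cls.__init__

    def __init__(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        _extents[cls].append(self)   # register after successful construction

    cls.__init__ = __init__
    return cls


def extent_of(cls) -> List[object]:
    """Return a copy of the extent, or fail if the class never opted in."""
    if cls not in _extents:
        raise LookupError(f"{cls.__name__} does not maintain an extent")
    return list(_extents[cls])


@with_extent
class Employee:
    def __init__(self, name: str) -> None:
        self.name = name


class Resistor:
    # No decorator: instances are reachable only through whatever
    # structures the application itself builds.
    def __init__(self, ohms: int) -> None:
        self.ohms = ohms


Employee("ann")
Employee("bob")
Resistor(470)
```

With this arrangement `extent_of(Employee)` yields both employees, while `extent_of(Resistor)` raises `LookupError`, which matches the ECAD example: the resistor code keeps its existing data structures and pays no extent-maintenance cost.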
bobm@server.Berkeley.EDU (Bob Muller) (06/01/91)
In article <1991May30.003525.21161@cs.ucla.edu>, rowley@bath.cs.ucla.edu (Michael T Rowley) writes:
|> In article <1991May29.134314.6850@odi.com>, dlw@odi.com (Dan Weinreb) writes:
|> |> In article <1991May28.232832.28284@cs.ucla.edu> rowley@bath.cs.ucla.edu (Michael T Rowley) writes:
|> |>
|> |>    However, if there is to be an associative query language, it must be
|> |>    possible for the user to say "give me all objects which meet these
|> |>    conditions".
|> |>
|> |> No, that's not true. If you have an OODB that does not keep extents,
|> |> you can still have an associative query language. Each query simply
|> |> needs to be handed a "collection" object. So a typical query might be
|> |> "find all of the employees within this set of employees for which the
|> |> salary is greater than 42", just like the standard mathematical
|> |> notation {x element-of X | x.salary > 42} (for "element-of" read a
|> |> little epsilon). The queries can be as complicated as you like; more
|> |> than one collection can be involved; and automatic optimization can be
|> |> performed.
|> |>
|> |> So an OODB can have associative queries even if it does not
|> |> automatically maintain extents. These are two separate, orthogonal
|> |> issues.
|>
|> It may be true that associative queries can work with arbitrary
|> collections, in which case the term "associative query language" may
|> not be the best descriptor of the features I'm trying to describe.
|> Maybe a better description would be that the query language is
|> declarative, rather than imperative. It is desirable to be able to
|> specify a query by stating the conditions which the result objects
|> must meet --- without specifying a navigational strategy for
|> retrieving the objects.
|>
|> All queries must start with some known object. This may be a globally
|> known object (relation or class) or it may be the result of a previous
|> query. In network databases not very much could be retrieved by a
|> single query using only global objects, so the user would have to make
|> multiple queries, each building off the previously retrieved data.
|> In order to facilitate this, the query language looked very much like
|> an imperative programming language, complete with "cursors" to keep
|> track of past results.
|>
|> Relational databases, on the other hand, easily reach all objects in
|> the database from queries starting with only globally known objects
|> (relations). As a result, it is easy for these databases to provide a
|> declarative query language.
|>
|> In the example you wrote above:
|>
|>    "find all of the employees within this set of employees
|>     for which the salary is greater than 42"
|>
|> the important words are "within this set of employees". The
|> interesting set of employees may only be reachable through a multi-step
|> navigation through the network. In which case, the user would be
|> reduced to writing imperative programs to retrieve his data. I think
|> the user community will balk at this.
|>
|> Hence, I think all objects in an OODB should be easily reachable from
|> globally known objects. The most obvious candidates for such objects
|> in an OODB would be collections representing the extents of the
|> classes.
|>
|> Another, less desirable solution would be to provide a single
|> collection representing every object in the database. The problem
|> with this is that it would be hard to write the conditions for the
|> result objects, since the questions that can be asked of objects
|> depend on their types.
|>
|> Michael Rowley

Sorry for including all of the above, but my response won't make much sense without it. I'm with Mr. Weinreb for the most part; but Mr. Rowley's approach is not really in opposition, it just makes some invalid assumptions.

The key to understanding declarative query languages is in understanding scope. Mr.
Rowley is correct in supposing that there must be globally known objects such as relations. You've got to use some kind of global name in the declaration of what you want. However, where he errs is in thinking that the only valid candidates for such names are classes or types. In fact, an OODB can make available various kinds of storage extents as global objects, and this can be orthogonal to the type system. In a type forest, it can be a separate type hierarchy (storage hierarchy), the objects of which can be found by either declarative specification (if they have names, for example) or by navigation.

The second bad assumption made by Mr. Rowley is that if you don't have global names you must navigate. This may be true in the old navigational (and relational) databases, but not in modern OODBs, most of which implement encapsulation and functional composition in some way. This kind of nested (or networked) scoping can allow for both full data independence and for declarative access through constructs such as Mr. Weinreb suggests. So "within this set of employees" could be fully declarative, not navigational, if the query language provides declarative constructs for specifying the set using storage object declarations in addition to type declarations. You could say "IN DB 'OBJY'", for example, to look only at employees in the Objectivity employee database named "OBJY". This is hardly navigational; what it does is to allow the query system to do the navigation for you. Obviously this language could include the standard logical operators to combine extents in whatever way you wish.

But getting back to the original post (way back, I guess)--I think any DBMS must provide a way to look at "all the objects of a certain class", regardless of logical storage location or any other orthogonal attribute. Clearly this is a useful kind of query.
Just as clearly, an OODBMS must not restrict you to this kind of query, as it may involve too much overhead for the limited query you really want to do. Certainly this is true in engineering and CASE applications. You also don't want queries to be limited to "extents of classes", because you may want to mix objects of different types in a certain extent, querying both at once; this is usually called "clustering", but looks very different from the clustering in a relational database because of the orthogonality.

So my general response is that you should talk about the problem declaratively, not navigationally, both for type-based and for storage-based queries. Talking about "reaching" objects from global objects is not the point; you should be able to describe _any_ set of objects with a fully declarative language that takes advantage of whatever global or local names are available, and in an OODBMS there are more such names than just the global type names.

-- The opinions expressed here are mine, not my employer's.
-- Bob Muller    Objectivity, Inc.    bobm@objy.com
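
The idea of scoping a query by a named storage extent as well as by type, and of combining extents with logical operators, could be sketched as below. This is not Objectivity's query language or API: the `storage` dictionary, the `query` function, and the `Employee` and `Contractor` types are hypothetical, and Python set union stands in for the "standard logical operators" mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Employee:
    name: str
    salary: int


@dataclass(frozen=True)
class Contractor:
    name: str
    rate: int


# Named storage extents (hypothetical). Each may mix objects of several
# types, so storage scope is independent of the type system.
storage: Dict[str, Set[object]] = {
    "OBJY": {Employee("ann", 50), Employee("bob", 30), Contractor("dee", 90)},
    "ARCHIVE": {Employee("cy", 45)},
}


def query(scope: Set[object], kind: type,
          where: Callable[[object], bool]) -> List[str]:
    """Names of objects in `scope` that have type `kind` and satisfy `where`.

    The caller declares a storage scope, a type, and a condition;
    how the scope is traversed is left to this function.
    """
    return sorted(obj.name for obj in scope
                  if isinstance(obj, kind) and where(obj))


# Roughly "employees IN DB 'OBJY' with salary > 42".
in_objy = query(storage["OBJY"], Employee, lambda e: e.salary > 42)

# Extents combined with set union before the same condition is applied.
everywhere = query(storage["OBJY"] | storage["ARCHIVE"], Employee,
                   lambda e: e.salary > 42)
```

Because the `isinstance` check runs before the condition, the `Contractor` in "OBJY" is filtered out without the condition ever touching its missing `salary` attribute, which is one way to read the remark that conditions depend on types while scopes need not.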