plogan@apd.mentorg.com (Patrick Logan) (06/06/91)
In article <1991May31.175612.16655@objy.com> bobm@server.Berkeley.EDU (Bob Muller) writes: In article <1991May30.003525.21161@cs.ucla.edu>, rowley@bath.cs.ucla.edu (Michael T Rowley) writes: |> In article <1991May29.134314.6850@odi.com>, dlw@odi.com (Dan Weinreb) writes: |> |> Each query simply needs to be handed a "collection" object. |> |> So a typical query might be "find all of the employees within |> |> this set of employees for which the salary is greater than |> |> 42" |> |> It may be true that associative queries can work with arbitrary |> collections, ... It is desirable to be able to specify a query |> by stating the conditions which the result objects must meet --- |> without specifying a navigational strategy for retrieving the |> objects. |> |> All queries must start with some known object. This may be a |> globally known object (relation or class) or it may be the |> result of a previous query. |> |> Relational databases ... easily reach all objects in |> the database from queries starting with only globally known objects |> (relations). Relations and _tables_. The tables are the globally known objects. The relations relate them. In an OODB there could be a top level collection of employees. This could be the set of employees mentioned in the example query Dan Weinreb used above, e.g. "Find all employees in the top level employee collection for which...". |> In the example you wrote above: |> |> "find all of the employees within this set of employees |> for which the salary is greater than 42" |> |> the important words are "within this set of employees". The |> interesting set of employees may only be reachable through a |> multi-step navigation through the network. This could be true. There could be a top level collection of departments, each with a collection of employees. Then the query would have to state which department's employees should be used, e.g. "Find all employees in all departments for which..." or "Find all employees in department A for which...". Or there could be both the top level collection and each department collection. Each employee could still be stored just once. Essentially, any graph you create in your run-time structures can be stored in the database. The relational model has all top level tables (collections) and relationships between them, there is no opportunity to store nested tables. In _some_ applications, the top level tables would have to be restructured to represent the more complex graph at run time. This is supposed to be the advantage of OODBs, they represent the graph more efficiently. If you are maintaining the furniture inventory in a building, a relational DB is probably better. If you are designing buildings, an OODB is probably better. |> Hence, I think all objects in an OODB should be easily reachable |> from globally known objects. The most obvious candidates for |> such objects in an OODB would be collections representing the |> extents of the classes. Sorry for including all of the above, but my response won't make much sense without it. Ditto, but I did more editing. The key to understanding declarative query languages is in understanding scope. Mr. Rowley is correct in supposing that there must be globally known objects such as relations. You've got to use some kind of global name in the declaration of what you want. However, where he errs is in thinking that the only valid candidate for such names are classes or types. That is, I think, because he failed to think of tables as instances (objects) in a database. He was comparing tables to classes (types) and so he was comparing apples to oranges. -- Patrick Logan, Try: plogan@dad.mentor.com, plogan@ws.mentor.com, plogan@mentorg.com or substitute patrick_logan for plogan and try that. You can also try going through uunet!(mntgfx, mentorg, mentor.com)!plogan [Can you tell things are changing around here?]
bobm@server.Berkeley.EDU (Bob Muller) (06/11/91)
In article <1991Jun6.143624.2485@apd.mentorg.com>, plogan@apd.mentorg.com (Patrick Logan) writes: |> In article <1991May31.175612.16655@objy.com> bobm@server.Berkeley.EDU (Bob Muller) writes: |> In article <1991May30.003525.21161@cs.ucla.edu>, rowley@bath.cs.ucla.edu (Michael T Rowley) writes: |> |> In article <1991May29.134314.6850@odi.com>, dlw@odi.com (Dan Weinreb) writes: ---- lots of stuff deleted ------- |> Relations and _tables_. The tables are the globally known objects. The |> relations relate them. In an OODB there could be a top level |> collection of employees. This could be the set of employees mentioned |> in the example query Dan Weinreb used above, e.g. "Find all employees |> in the top level employee collection for which...". |> -- More stuff deleted ---------- |> |> That is, I think, because he failed to think of tables as instances |> (objects) in a database. He was comparing tables to classes (types) |> and so he was comparing apples to oranges. Well, I've gotten the included part down to half a page, anyway :). This whole issue is pretty interesting, and I've heard lots of arguments for various positions on it. First, a bit of clarification: a "table" is the same thing as a "relation" in a relational database. The term relation is the technical name, the term table is the "marketing" name that SQL chose to use (as well as "column" for "attribute" and "row" for "tuple"). I wonder why ?-). The whole point of the relational calculus is that relationships between tuples, rows, entities, objects, or whatever you want to call them is in the data, not explicitly represented through relationship structures. This was the answer that Codd gave to the hierarchical and network DBMS people. When you want to relate two tables, you join them on the attributes--there is no other way to relate them in a relational database. Now, on to the messy part. How does the relation correspond to the structures in an OODBMS? There are at least two ways to look at it, and they aren't really mutually exclusive. First, you can make the table/relation correspond to class, and the rows of the table correspond to objects of the class. This works until you consider inheritance, which implies that a row in one table "is-a" row in another table as well, which is not possible in a relational database. Second, you can make the table correspond to an object, with the rows as internal data in the object. That is, a collection object containing a bunch of other objects (presumably with a single tuple type). This doesn't behave as you would expect a "relation" to behave, and implementing something like SQL on top of this is not possible. Manipulating objects of unknown cardinality is quite difficult mathematically. This approach requires a pretty complete development of the notion of a tuple type, and you would need to be able to create new tuple types dynamically as a consequence of projection of the contents of the collection. I've come down on the side of the first option--I would like to treat SQL tables as classes or types (and hence be able to use them to declare columns, extending the SQL type system through embedding objects) and SQL rows as instances of the class. I would also add inheritance and subclassing and object identity, so that a row in one table may actually be the same thing as a row in another table (more or less like a dynamic view, I guess). The big benefit here is type safety--you don't have tuple types, just object types (each "tuple" becomes an object typed with the "table" class). But you still need to be able to generate dynamic types because of projection--they're just not tuple types, just classes with data members; no need to extend the mathematics of the type system. Interestingly, you can develop collection classes, but a "collection" is not at the table level, just at the row or object level, with the collected objects contained or nested within each collection object. The original issue of how to specify queries using global names becomes an issue with using the global type names and the abstract interface (data, function, and association members) to declare what objects you want to retrieve. If you specify a collection class, you get back a collection; you could use the methods on the collection class to project any of the internal objects into new classes. Using the above example, you don't ask for THE employee collection, but AN employee collection (there can be more than one, possibly with overlapping contents). If "employee" is a type, you ask for employees, not a collection object; this produces an iterator that lets you iterate over the employee objects. You can see that this is saying the same thing in a slightly different way; the benefit is that you don't need to add any concepts to the OO model at the top level. I'd really like to hear from anybody out there with opinions on this stuff. -- -- Bob Muller Objectivity, Inc. bobm@objy.com