[comp.object] Has anyone taken an approach like this?

rodent@netcom.COM (Ben Discoe) (03/19/91)

  This is NOT an object-oriented Programming article.  No C++, etc.
This concerns object-oriented philosophy and mechanics.

  I've been thinking heavily about an approach to object-oriented databases
that mixes in elements of hypertext, knowledge-representation, etc.
It's basis is this: is there a way to represent objects and the relationship
between then in a fashion that isn't redundant or brute force? I think
there is, but I'm having some trouble implementing it.

 Rather than store every relationship twice:
 (The following Capitalized words represent CLASSES, not instances)

 Dog		Bird
 ----		----
 ...		...
 eats Bird	is eaten by Dog

 I'd rather store the relationship once:

 Dog		"A eats B" relations	 Bird
 ----		--------------------	 ----
 ...		 ...........		 ...
 eats	->	DogOID/BirdOID	   <-	eaten by

  This way we can query "What does the Dog eat" and "What eats the bird"
and the searches can be non-brute force: we just follow the link to the
relationship, which contains the ID of the two involved objects.
  Can generality also be represented this way?  We can say "The Dog eats
meat." which will form a "A eats B" relation between the dog and an
aspect of the Animal class, a superclass of Bird.  Then when we query what
eats the bird, we can search upwards through the class heirarchy for
any "eaten by" relations.  These relations may be dependent on the value
of a class field, in which case we check to see if the object in question
(the bird) has the appropriate value for the realtionship to apply to it.
  Now, how do we represent this in the relations list?

I'd be interested to know:
 1. Has anyone ever seriously taken this approach?
 2. Are there any obvious drawbacks?
 3. This method seems appropriate for modeling anything from an encyclopedia
to a rainforest to a coin collection.  Why can't I find anyone else pursuing
it?  I've scanned all the papers and conference proceedings I can get my
hands on.
 4. Am I being foolishly naive and grossly underestimating the potential
	difficulties?

----------------
Ben in San Jose, idealistic visionary, computer scientist, ecologist.

d_muller@suisse.enet.dec.com (Dan Muller) (03/20/91)

In article <1991Mar19.042444.20834@netcom.COM>, rodent@netcom.COM (Ben
Discoe) writes:
|
| <text deleted>
|
| I'd rather store the relationship once:
|
| Dog		"A eats B" relations	 Bird
| ----		--------------------	 ----
| ...		 ...........		 ...
| eats	->	DogOID/BirdOID	   <-	eaten by
|
| <text deleted>
|
|I'd be interested to know:
| 1. Has anyone ever seriously taken this approach?
| 2. Are there any obvious drawbacks?
| 3. This method seems appropriate for modeling anything from an encyclopedia
|to a rainforest to a coin collection.  Why can't I find anyone else pursuing
|it?  I've scanned all the papers and conference proceedings I can get my
|hands on.
| 4. Am I being foolishly naive and grossly underestimating the potential
|	difficulties?
|

I believe that in relational database theory, this organization of
data is used to form many-to-many relationships.  Treating Dog and
Bird as data tables, then the "A eats B" column is a special type of
data table called a "dictionary", and is in fact the _only_ way of
constructing many-to-many relationships between Dogs and Birds in a
properly normalized database system.

Given a Dog instance, you can look up all Bird instances eaten by the
Dog by searching the dictionary for entries containing that Dog's ID.

If you don't have a many-to-many relationship, then I don't think this
arrangement of the data gets you anything.

If there is a many-to-one relationship of Dogs to Birds, it's easier
to have Bird ID be part of the Dog object.

If there's a one-to-many relationship (a more likely situation in this
example), each Bird object can contain a Dog ID and all of a Dog's
meals can be found by searching for matching Dog IDs in the table of
Birds.

Of course, if you feel that a particular application can benefit from
treating the relationship itself as a separate object (for instance,
if the same type of relationship can hold among lots of unrelated
classes of objects and you wish to manipulate the relations per se
independently of the referenced objects), then the organization you
describe has obvious advantages.

--
| Dan Muller / KA1UXL / d_muller@took.enet.dec.com
| "So THAT's what an invisible barrier looks like!"

gt4084c@prism.gatech.EDU (SRINIVASAN,K) (03/20/91)

In article <1991Mar19.042444.20834@netcom.COM> rodent@netcom.COM (Ben Discoe) writes:
>
	<Deleted description of a scheme where relations will be classes>

>I'd be interested to know:
> 1. Has anyone ever seriously taken this approach?

	Yes, Wilensky proposed KODIAK representation scheme which does exactly what you had described - you only have to replace 'objects' in your posting with 'frames.'

> 2. Are there any obvious drawbacks?

	The major objections I see are:
	
	1. It does not fit the classical definition of objects as nouns from domain description (of course Booch, Rumbaugh, etc., do not seem to insist on this requirement any more).

	2. Lack of cognitive correspondence - we think in terms of entities and not in terms of relationships (Wilensky refutes this argument at length).


> 3. This method seems appropriate for modeling anything from an encyclopedia
>to a rainforest to a coin collection.  Why can't I find anyone else pursuing
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>it?  I've scanned all the papers and conference proceedings I can get my
^^
>hands on.

	You sure do have some enviable company!

> 4. Am I being foolishly naive and grossly underestimating the potential
>	difficulties?
	
	With the current representation, it is indeed a choice between redundancy and brute force as you  had said.  The Cyc people (the only large implementation attempt well documented and pubilicized) seem to have chosen redundancy.  I am really curious why they settled for it rather than trying something like what you, Wilensky, etc., have described.

>Ben in San Jose, idealistic visionary, computer scientist, ecologist.


-- 
SRINIVASAN,K
School of Textile Engineering   Georgia Tech.
uucp: ...!{allegra,amd,hplabs,seismo,ut-ngp}!gatech!prism!gt4084c
ARPA: gt4084c@prism.gatech.edu

davidm@uunet.UU.NET (David S. Masterson) (03/21/91)

>>>>> On 19 Mar 91 04:24:44 GMT, rodent@netcom.COM (Ben Discoe) said:

Ben> I've been thinking heavily about an approach to object-oriented databases
Ben> that mixes in elements of hypertext, knowledge-representation, etc.  It's
Ben> basis is this: is there a way to represent objects and the relationship
Ben> between then in a fashion that isn't redundant or brute force? I think
Ben> there is, but I'm having some trouble implementing it.

I understand "redundant", but what do you mean by "brute force"?

Ben>  Rather than store every relationship twice:
Ben>  (The following Capitalized words represent CLASSES, not instances)

Ben>  Dog		Bird
Ben>  ----		----
Ben>  eats Bird		is eaten by Dog

Ben>  I'd rather store the relationship once:

Ben>  Dog		"A eats B" relations	 Bird
Ben>  ----		--------------------	 ----
Ben>  eats	->	DogOID/BirdOID	   <-	eaten by

From an object-oriented view, the obvious problem here is that the
relationship now becomes an object by itself rather than a property of either
Dog or Bird (or both).  Since anything can be an object, this may not be
wrong, but it is a difference in the basic philosophy of the problem as
expressed in the first representation.

Ben> This way we can query "What does the Dog eat" and "What eats the bird"
Ben> and the searches can be non-brute force: we just follow the link to the
Ben> relationship, which contains the ID of the two involved objects.

In generalizing the problem this way, it would be better to express the
relationship as "Eater object" A "eats" "Eatable object" B with all the
derivations involved there.  The relationship then becomes a "hub" of
interaction between a number of different types of objects (all of which are a
KindOf Eater or Eatable types).

Ben> Can generality also be represented this way?  We can say "The Dog eats
Ben> meat." which will form a "A eats B" relation between the dog and an
Ben> aspect of the Animal class, a superclass of Bird.  Then when we query
Ben> what eats the bird, we can search upwards through the class heirarchy for
Ben> any "eaten by" relations.  These relations may be dependent on the value
Ben> of a class field, in which case we check to see if the object in question
Ben> (the bird) has the appropriate value for the realtionship to apply to it.

As you said, finding whether an object is used in a relationship (as in "The
Dog eats") is relatively easy in the general case (get the ID of Dog and see
if ID is in relationship list).  However (as you also said), going the other
way is much more expensive because of having to scan a (potentially large)
number of different types of objects to find out what an ID identifies.  This
kind of goes with the "asking an object its type" thread that's been raging in
comp.lang.c++. 

Ben> Now, how do we represent this in the relations list?

Do you need to do anything more than represent in the relations list the
generalized IDs of the objects participating in the relationship?  That is,
the IDs in the relationship are of type EaterID or EatableID.

Ben> I'd be interested to know:

Ben>  2. Are there any obvious drawbacks?

Especially in the generalized sense, there is a strict typing problem to worry
about.  That is, taken very far, this approach will lead to massive typing
trees with some connections that are very difficult to represent (how do you
represent the ID of something that can both Eat and be Eaten?).

Ben>  3. This method seems appropriate for modeling anything from an
Ben> encyclopedia to a rainforest to a coin collection.  Why can't I find
Ben> anyone else pursuing it?  I've scanned all the papers and conference
Ben> proceedings I can get my hands on.

Probably because of the massiveness of the typing problem.  This seems more
readily doable in a relational system because relationships are derivable
directly from the data rather than definition of the data.  Is this a drawback
to the object-oriented model that doesn't bother the relational model?

--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

rodent@netcom.COM (Ben Discoe) (03/21/91)

d_muller@suisse.enet.dec.com (Dan Muller) writes:
>rodent@netcom.COM (Ben
>Discoe) writes:
>| I'd rather store the relationship once:
>| Dog		"A eats B" relations	 Bird
>| ----		--------------------	 ----
>| eats	->	DogOID/BirdOID	   <-	eaten by

>I believe that in relational database theory, this organization of
>data is used to form many-to-many relationships.  Treating Dog and
>Bird as data tables, then the "A eats B" column is a special type of
>data table called a "dictionary", and is in fact the _only_ way of
>constructing many-to-many relationships between Dogs and Birds in a
>properly normalized database system.

>Given a Dog instance, you can look up all Bird instances eaten by the
>Dog by searching the dictionary for entries containing that Dog's ID.

  I was hoping to avoid any kind of brute-force searching.  Every object
(class) should have links to any related object (class) that the
application is likely to correlate.

>If there's a one-to-many relationship (a more likely situation in this
>example), each Bird object can contain a Dog ID and all of a Dog's
>meals can be found by searching for matching Dog IDs in the table of
>Birds.

  Wouldn't it be much more elegant to form a relationship between the
Dog and an aspect of Animal, a superclass of Bird?  That way instead of
each class of bird containing a link to each kind of object that eats
it, the query "What eats this bird" can search upward through the
object heirarchy, looking for links to "eaten by" relations, finding
the link under Animal and concluding that the bird is lunch.

  This way of forming many-to-many relationships (generality, saying
"Dog eats [Animal with certain values in its fields]") seems like the
most desirable (conceptually and implementationally) way.

>Of course, if you feel that a particular application can benefit from
>treating the relationship itself as a separate object (for instance,
>if the same type of relationship can hold among lots of unrelated
>classes of objects and you wish to manipulate the relations per se
>independently of the referenced objects)

Isn't this true of the "real world" and the way we handle it conceptually?
Our minds re-use the concept "A eats B" for all kinds of classes.
(Uh-oh, I'm starting to sound like a Marvin Minsky wanna-be)
Not that human metal processes can be readily modeled with Objects,
but why not use the Object facilities we're comfortable with to
describe relationships also.

>, then the organization you
>describe has obvious advantages.

OK, everyone, now how should we describe the operations on Classes
that determine whether or not a particular relationship holds true for
it?  Perhaps if the Class pointed to the relationship Class which in
turn evaluated the Class's fields... no, that ruins the reusability
of the relationship.  Hmmm, any thoughts? 

>| Dan Muller / KA1UXL / d_muller@took.enet.dec.com
>| "So THAT's what an invisible barrier looks like!"

------------------
Ben in San Jose, struggling to be a generalist in a world of specialists.

rodent@netcom.COM (Ben Discoe) (03/22/91)

cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>>>>>> On 19 Mar 91 04:24:44 GMT, rodent@netcom.COM (Ben Discoe) said:
>Ben> [...] is there a way to represent objects and the relationship
>Ben> between then in a fashion that isn't redundant or brute force? I think
>Ben> there is, but I'm having some trouble implementing it.

>I understand "redundant", but what do you mean by "brute force"?

By brute force I mean doing a sequential search through a long list of
classes looking for some characteristic, versus organizing the structure
of your class relationships so as to make conceptually related classes
reachable to each other by means of a pointer or Class ID.  Many people
feel that redundancy or brute-force are the only options, but what I'm
trying to develop is a representation that minimizes both.

>Ben>  I'd rather store the relationship once:
>Ben>  Dog		"A eats B" relations	 Bird
>Ben>  ----		--------------------	 ----
>Ben>  eats	->	DogOID/BirdOID	   <-	eaten by

>Ben> This way we can query "What does the Dog eat" and "What eats the bird"
>Ben> and the searches can be non-brute force: we just follow the link to the
>Ben> relationship, which contains the ID of the two involved objects.

>In generalizing the problem this way, it would be better to express the
>relationship as "Eater object" A "eats" "Eatable object" B with all the
>derivations involved there.

  Depending on what you mean by "there", that's the approach I'm talking
about, except:

>  The relationship then becomes a "hub" of
>interaction between a number of different types of objects (all of which are a
>KindOf Eater or Eatable types).

  Since Dog and Bird are already classes, how can they also be Eater and
Eatable types unless you have multiple inheritance?  I consider "Type" and
"Class" interchangeable.  This kind of typing I find far from intuitive...
We don't think of "being eatable" as a property of a Bird, we think of
"eating" as being an aspect of the relationship between the Bird and
some carnivore.

>Ben> Can generality also be represented this way?  We can say "The Dog eats
>Ben> meat." which will form a "A eats B" relation between the dog and an
>Ben> aspect of the Animal class, a superclass of Bird.  Then when we query
>Ben> what eats the bird, we can search upwards through the class heirarchy for
>Ben> any "eaten by" relations.  These relations may be dependent on the value
>Ben> of a class field, in which case we check to see if the object in question
>Ben> (the bird) has the appropriate value for the realtionship to apply to it.

> [...] However (as you also said), going the other
>way is much more expensive because of having to scan a (potentially large)
>number of different types of objects to find out what an ID identifies.

If you mean locating an object once you have its ID, that's an implementation
problem with all kinds of solutions.  It's not a conceptual problem.
The brute force that I'm trying to avoid is having to do potentially large
scans of the Class and Relationship lists.

>Ben> Now, how do we represent this in the relations list?

>Do you need to do anything more than represent in the relations list the
>generalized IDs of the objects participating in the relationship?  That is,
>the IDs in the relationship are of type EaterID or EatableID.

Yes, there more!  Most relationships (both in our minds and in "the real
world") are conditional.  To make a lousy example, say "Dog eats Bird objects
which are blue and >1 year old".  The challenge is to encode the conditions
such that the bidirectional nature of the links is maintained, so that we
never do brute-force scanning.

The way to model this that come to my mind is to require that color and age
are aspects of the Bird class, then, under the Bird class, have a link to
the relationship that encodes "only if condition1(age) and condition2(color)"
  The relationship object, will have to have a link not only to the Bird
object but also point to the specific link inside the Class.  (All links
should bidirectionally point at each other so that query algorithms can
traverse the trees and logical constructs we simply following pointers!)

>Ben>  2. Are there any obvious drawbacks?

>Especially in the generalized sense, there is a strict typing problem to worry
>about.  That is, taken very far, this approach will lead to massive typing
>trees with some connections that are very difficult to represent (how do you
>represent the ID of something that can both Eat and be Eaten?).

  This isn't a problem if you abandon the idea that, for instance, all eatable
objects are subtypes of Eatable type.  See above for why this isn't intuitive
or desireable.

>Ben> Why can't I find
>Ben> anyone else pursuing it?  I've scanned all the papers and conference
>Ben> proceedings I can get my hands on.

>Probably because of the massiveness of the typing problem.

Either I don't understand what you mean by this problem, or its not real?
Since my original post I've received helpful replies pointing that indeed,
many people ARE working on "my" approach.  I am now more confident that there
are no severe inherent difficulties...

>  This seems more
>readily doable in a relational system because relationships are derivable
>directly from the data rather than definition of the data.  Is this a drawback
>to the object-oriented model that doesn't bother the relational model?

I don't see any drawback to the OO representation of Classes/Relationships
(assuming my current concerns are ultimately solvable, an open question)
and I see all kinds of problems with the classical relational model (I work
for a company that makes a "relational" database).  It seems that those
shortcomings are what attract many people to OODBMSs.

>David Masterson					Consilium, Inc.
---------------
Ben in San Jose.  Can I contribute to the (OODB?) world without becoming
		  a specialist?