[comp.object] Pre-computing objects

vinoski@apollo.HP.COM (Stephen Vinoski) (10/26/89)

  Some time ago a discussion arose in comp.lang.c++ about reading objects in
from, say, a disk file.  There were several answers, but none that seemed all
that great, so I hope to start a discussion of the subject here.

  The subject line of this article comes from Bertrand Meyer's "Object-oriented
Software Construction" book, page 290.  The problem he is solving on that page
is how to create the appropriate object to fulfill a command within an editor
program; the user enters the command, and the program must construct a command
object to carry out the desired operation.  He advocates associating an integer
identifier with each command, pre-computing an array of command elements, and
initializing the nth array element with a command object having integer
identifier n.  When the user enters an editor command, the integer identifier
associated with that command is used to index into the array to obtain a
command object which is copied and used to carry out the command.
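
  A minimal C++ sketch of the table described above may make the scheme
concrete.  It is only an illustration, not Meyer's code: the names Command,
InsertCommand, DeleteCommand, CommandTable, clone() and make() are all
assumptions, and the integer identifiers 0 and 1 are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command hierarchy; every name here is illustrative.
class Command {
public:
    virtual ~Command() = default;
    virtual std::unique_ptr<Command> clone() const = 0;  // copy the prototype
    virtual std::string execute() const = 0;
};

class InsertCommand : public Command {
public:
    std::unique_ptr<Command> clone() const override {
        return std::make_unique<InsertCommand>(*this);
    }
    std::string execute() const override { return "insert"; }
};

class DeleteCommand : public Command {
public:
    std::unique_ptr<Command> clone() const override {
        return std::make_unique<DeleteCommand>(*this);
    }
    std::string execute() const override { return "delete"; }
};

// Pre-computed array: element n holds the prototype with identifier n.
class CommandTable {
public:
    CommandTable() {
        // This initialization is the single place that lists every command.
        add(0, std::make_unique<InsertCommand>());
        add(1, std::make_unique<DeleteCommand>());
    }

    // Returns a copy of the prototype, or nullptr for an unknown identifier.
    std::unique_ptr<Command> make(std::size_t id) const {
        if (id >= prototypes_.size()) return nullptr;
        return prototypes_[id]->clone();
    }

private:
    void add(std::size_t id, std::unique_ptr<Command> prototype) {
        assert(id == prototypes_.size());  // identifiers must be dense, in order
        prototypes_.push_back(std::move(prototype));
    }

    std::vector<std::unique_ptr<Command>> prototypes_;
};
```

  The CommandTable constructor plays the role of the initialization function
that has to be maintained whenever a command is added.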

  This problem seems pervasive; other examples are reading objects from disk
storage and receiving objects from another process.  Basically, how does one
know what type of object to construct to receive the object unless the type of
the blob coming in from the external source is already known?  This appears to
be one area of OOP that requires something like a switch or case statement to
work properly.

  Doesn't Meyer's method trade the maintenance woes of a switch or case
statement for having to maintain the initialization function for the command
object array?  It's a good tradeoff, but isn't there anything better?  It also
seems that his method is appropriate only for collections of objects which are
somehow related.

  Does anyone out there know of any good solutions for this problem?  Pointers
to literature would be appreciated (I'm working in C++).


-steve

| Steve Vinoski       | Hewlett-Packard Apollo Div. | ARPA: vinoski@apollo.com |
| (508)256-6600 x5904 | Chelmsford, MA    01824     | UUCP: ...!apollo!vinoski |
| "Yikes, not another OOP book using the tired old 'stack' class example!!!"   |

davidm@uunet.UU.NET (David S. Masterson) (10/27/89)

In article <4671be53.20b6d@apollo.HP.COM> vinoski@apollo.HP.COM (Stephen Vinoski) writes:

   Some time ago a discussion arose in comp.lang.c++ about reading objects in
   from, say, a disk file.  There were several answers, but none that seemed
   all that great, so I hope to start a discussion of the subject here.

I may have started that discussion (although I think there was one that
predates me) and I still have not seen a good answer to the problem in C++.  In
particular, I am attempting to use a relational database (with no object
oriented features) in a C++ world.  Therefore, structures (tuples from tables)
come out of the database with no inheritance structure and we are trying to
give them an inheritance structure.  Thus far, we've had little success.

The problem is pervasive as we are/will be running into the problem when
moving objects between processes.  The one advantage, though, in moving
objects between processes is that at least they started with an inheritance
hierarchy and, therefore, it's just a matter of wrapping the object up into a
shippable form.  The database (disk file, non-object oriented process),
though, does not know of object hierarchies.

In C++, the only type-safe, compile-time method for supporting this that I
have found involves having parent objects knowing about their children's
structure so that if they get such an object on their initializer function,
they know what to do with it.  There is a basic translation mechanism that
must be invoked somewhere and, no matter what I have tried to do, I have only
succeeded in moving the problem from one layer of objects to another.

Anyone have any ideas?


--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
		"Nobody here but us chickens..."

render@m.cs.uiuc.edu (10/28/89)

Written 11:52 am  Oct 26, 1989 by chewy@apple.com :
>Good!  The discussion that you're trying to start essentially revolves 
>around what are usually called "persistent objects."
> ...
>Ok, so these are some of the more obvious challenges.  I'd recommend 
>looking at lots of OOPSLA proceedings to get some idea as to where to look 
>next.  I'd also recommend reading up on distributed object systems.

Also look at papers on object-oriented database management systems (OODBMSs).
Their main goal is integrating OO PLs and databases, and persistence is a
big part of this.  I can post references if necessary.

hal.

jwd@cbnewsc.ATT.COM (joseph.w.davison) (10/28/89)

In article <CIMSHOP!DAVIDM.89Oct26104730@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>>
>>   Some time ago a discussion arose in comp.lang.c++ about reading objects in
>>   from, say, a disk file.  There were several answers, but none that seemed
>>   all that great, so I hope to start a discussion of the subject here.
>>
...

>In C++, the only type-safe, compile-time method for supporting this that I
>have found involves having parent objects knowing about their children's
>structure so that if they get such an object on their initializer function,
>they know what to do with it.  There is a basic translation mechanism that
>must be invoked somewhere and, no matter what I have tried to do, I have only
>succeeded in moving the problem from one layer of objects to another.
>
>Anyone have any ideas?

Well, ignoring the issue about methods being stored with the objects, which
is hardly ignorable, I'll address the simpler problem of reading a text
file, recognizing the next "object" and creating one of "them".  I encountered
the problem in Smalltalk while trying to process some computer generated
files of "messages" that had a variety of formats.  I had created a class
Message with various subclasses, one for each different message format.
The Message objects had methods to allow users to "interpret" or, more
generally, to interact with them.  The problem was to create objects of the
proper type as the file was read.  I intend/intended to do this same job
in C++, and have done a similar job as you suggest -- make the parent's
constructor know about the child's structure.  That just doesn't seem
right.  

The way I wish I could do it is with OOYACC -- Like Yacc, but designed to
work in an OO system.  Basically, I want yylex() and yyparse() and their
relatives to be member functions.  I want to be able to tell when a parser
is successful/fails -- perhaps a static data member of the parent could
be used for this info.  Then the parent class needs only to know the names
of its child classes.  Each class provides a constructor that takes a 
stream as input and applies its own yyparse() to it.  If the parse is
successful, the constructor can do its job, and the static data member is
set to SUCCESS.  If the parse fails, the static data member is set to FAILURE
and the constructor does nothing else.  The parent constructor tries
to create instances of each child, in some order, until one succeeds,
or it fails --  and does whatever constructors do when they fail.
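
A rough C++ sketch of the "try each child until one parses" idea follows.
Since OOYACC does not exist, it swaps in a hand-written static parse()
function per class where the generated yyparse() would go, and it reports
failure by returning a null pointer rather than through a static data
member.  The classes LoginMessage and AlarmMessage, their text formats,
and read_message() are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

class Message {
public:
    virtual ~Message() = default;
    virtual std::string kind() const = 0;
};

// Hypothetical format: "LOGIN <user>"
class LoginMessage : public Message {
public:
    explicit LoginMessage(std::string user) : user_(std::move(user)) {
        assert(!user_.empty());
    }
    std::string kind() const override { return "login"; }

    // Stand-in for a per-class parser; returns nullptr if the text does not match.
    static std::unique_ptr<Message> parse(const std::string& line) {
        std::istringstream in(line);
        std::string keyword, user;
        if (in >> keyword >> user && keyword == "LOGIN")
            return std::make_unique<LoginMessage>(user);
        return nullptr;
    }

private:
    std::string user_;
};

// Hypothetical format: "ALARM <level>"
class AlarmMessage : public Message {
public:
    explicit AlarmMessage(int level) : level_(level) {}
    std::string kind() const override { return "alarm"; }
    int level() const { return level_; }

    static std::unique_ptr<Message> parse(const std::string& line) {
        std::istringstream in(line);
        std::string keyword;
        int level = 0;
        if (in >> keyword >> level && keyword == "ALARM")
            return std::make_unique<AlarmMessage>(level);
        return nullptr;
    }

private:
    int level_;
};

using Parser = std::unique_ptr<Message> (*)(const std::string&);

// The "parent" side needs only the list of child parsers, tried in order.
std::unique_ptr<Message> read_message(const std::string& line) {
    static const Parser parsers[] = {&LoginMessage::parse, &AlarmMessage::parse};
    for (Parser parse : parsers) {
        if (std::unique_ptr<Message> msg = parse(line)) return msg;
    }
    return nullptr;  // no child recognized the input
}
```

The parser list in read_message() is still a table that must be kept up to
date, but it names the children without knowing their layout.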

This idea at least limits the amount of information the parent needs about
the child.  

Of course, OOYACC does not exist...
-- 
 Joe Davison      jwd@ihlts.att.com 

kurtl@fai.UUCP (Kurt Luoto) (10/31/89)

In article <4671be53.20b6d@apollo.HP.COM> vinoski@apollo.HP.COM (Stephen Vinoski) writes:
>
>  This problem seems pervasive; other examples are reading objects from disk
>storage and receiving objects from another process.  Basically, how does one
>know what type of object to construct to receive the object unless the type of
>the blob coming in from the external source is already known?  This appears to
>be one area of OOP that requires something like a switch or case statement to
>work properly.
>
>  Doesn't Meyer's method trade the maintenance woes of a switch or case
>statement for having to maintain the initialization function for the command
>object array?  It's a good tradeoff, but isn't there anything better?  It also
>seems that his method is appropriate only for collections of objects which are
>somehow related.
>
>-steve
>
>| Steve Vinoski       | Hewlett-Packard Apollo Div. | ARPA: vinoski@apollo.com |
>| (508)256-6600 x5904 | Chelmsford, MA    01824     | UUCP: ...!apollo!vinoski |

I have also run into this problem (translating external messages
into objects) while working in C++, and I could think of nothing
more pragmatic at the time than the equivalent of Meyer's solution,
i.e. a table lookup based on message type.  Indeed, it seems that at
least C++ (and apparently Eiffel) does not help to "distribute the case
statement" in this situation as it does for virtual functions.

There are other related situations that I have run into where the
C++ language could not provide any magic:
    
    I.  I have a particular inheritance tree of classes. The instances
	of these classes are small in size and high in volume.  Further-
	more, I would like to define virtual functions for them. However,
	in current implementations of C++, using the language-provided
	virtual function mechanism involves an extra storage overhead
	of a pointer (to a vtable) per object. While this is perfectly
	acceptable in most cases, it is not in this case because of the
	large number of instances. Hence I have to avoid the language
	provided mechanism. I can easily tell the class of an object by
	the value of a tag field. C++ could have provided a way for me
	to tie the value of this field to its class, and thus provide an
	alternate mechanism for discriminating virtual functions at runtime.
	However, the author(s) of C++ wisely left out such a provision.
	I must resort to putting case statements in the (now non-virtual)
	member functions, or resort to non-standard hackery to simulate
	virtual functions based on a tag field.

    II. I have a particular inheritance tree of classes. A client module
	needs to allocate storage in a particular place for an instance
	of one of these classes. The client needs to allocate a buffer
	large enough to hold an instance of any class in the tree. But
	we would like to not have to burden the client code with explicit
	knowledge about all classes. The pragmatic solution is to create
	a header file which defines a union type. The union contains
	members for all classes in the hierarchy. This is a good solution,
	but now we need to remember to update this header file every
	time we introduce a new class to the tree. It does not seem
	reasonable to expect C++ to handle this more gracefully, given
	its C-like separate compilation facilities.
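
For situation II, the shared header might look something like the sketch
below.  It is written in later C++ (C++14 or newer) than was available
for this discussion, and it computes the maximum size and alignment
directly instead of declaring a union, which is a substitution for the
union approach, not the same thing.  Shape, Triangle, Square, ShapeBuffer,
construct_in() and destroy() are all made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <new>

struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};
struct Square : Shape {
    double edge = 1.0;
    int sides() const override { return 4; }
};

// The one list that must be updated when a class joins the tree.
constexpr std::size_t kMaxShapeSize =
    std::max({sizeof(Triangle), sizeof(Square)});
constexpr std::size_t kMaxShapeAlign =
    std::max({alignof(Triangle), alignof(Square)});

// Raw storage large enough, and aligned enough, for any class listed above.
struct ShapeBuffer {
    alignas(kMaxShapeAlign) unsigned char bytes[kMaxShapeSize];
};

// Constructs a T inside the caller's buffer; fails to compile if T was
// left out of the lists above and does not fit.
template <class T>
Shape* construct_in(ShapeBuffer& buf) {
    static_assert(sizeof(T) <= kMaxShapeSize, "add T to kMaxShapeSize");
    static_assert(alignof(T) <= kMaxShapeAlign, "add T to kMaxShapeAlign");
    return new (buf.bytes) T();
}

// Placement-new objects are destroyed explicitly, never with delete.
void destroy(Shape* shape) {
    assert(shape != nullptr);
    shape->~Shape();
}
```

The client only declares a ShapeBuffer; the maintenance burden described
above remains, but a forgotten class is at least caught at compile time.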

In both of these situations, the pragmatic solution is less than elegant.
I would not propose changing C++ to better handle these cases, but I can
imagine how another language might simplify them.  Both cases would seem
to require that the compiler/system know about all the classes in the
inheritance tree, even when the client module under compilation only
refers to the root class of the tree.  This is not normally true in C++,
but it is probably true in at least some other languages.  Given the
availability of such knowledge, the compiler could pick off the information
that it needs, such as the size of an instance of each class and therefore
the maximum size over the overall tree.  To handle the first case above,
the language could provide a way for the programmer to enter the tag field
value (or some other discriminating information) into or along with the
class declaration itself. The compiler could then generate appropriate
code for runtime determination of the proper virtual function.

It seems to me that such a language could also help simplify the problems
that Stephen has run into. In the case of external messages, the "tag field"
class identification method would be extended to allow the programmer to
define a "default class" that would handle otherwise invalid tag field
values.
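
In today's C++ the tag-field scheme with a "default class" has to be
written by hand.  One way to sketch it, assuming a two-byte Packet record
and handler names that are purely illustrative, is a table of functions
indexed by tag with a fallback for out-of-range values:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Small, high-volume record: a tag byte instead of a vtable pointer.
struct Packet {
    unsigned char tag;
    unsigned char payload;
};

// One free function per "class"; these stand in for virtual functions.
std::string describe_ping(const Packet&) { return "ping"; }
std::string describe_data(const Packet& p) {
    return "data:" + std::to_string(static_cast<int>(p.payload));
}
// The "default class": handles any tag value with no table entry.
std::string describe_unknown(const Packet&) { return "unknown"; }

using Describe = std::string (*)(const Packet&);

// Element n is the handler for tag value n.
constexpr Describe kDescribeTable[] = {describe_ping, describe_data};
constexpr std::size_t kTagCount =
    sizeof(kDescribeTable) / sizeof(kDescribeTable[0]);

// Hand-rolled dispatch on the tag field.
std::string describe(const Packet& p) {
    Describe fn = p.tag < kTagCount ? kDescribeTable[p.tag] : describe_unknown;
    assert(fn != nullptr);
    return fn(p);
}
```

This is the table lookup again, moved from object construction to function
dispatch; the language does nothing to keep the table and the tag values
in step.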

So, from my humble experience, there are no easy answers, at least not as
long as you're working with C++. But it is interesting food for thought.

P.S. Sorry for the length of the posting.
-- 

----------------
Kurt W. Luoto     kurtl@fai.fai.com   or   ...!sun!fai!kurtl

davidm@uunet.UU.NET (David S. Masterson) (10/31/89)

In article <4158@cbnewsc.ATT.COM> jwd@cbnewsc.ATT.COM (joseph.w.davison) writes:

   Well, ignoring the issue about methods being stored with the objects, which
   is hardly ignorable,...

Hmmm, anyone have any information on the idea of storing methods with objects?
Especially in a non-object-oriented storage area?

   The way I wish I could do it is with OOYACC -- Like Yacc, but designed to
   work in an OO system.

[munch]

   The parent constructor tries to create instances of each child, in some
   order, until one succeeds, or it fails -- and does whatever constructors do
   when they fail.

   This idea at least limits the amount of information the parent needs about
   the child.

Of course the problem with this is that the parent object should never know
about its children as this violates the idea of encapsulation of ideas that
object-orientation is supposed to promote.  Two other methods have come to
light that bear examination, but I'm not happy with them either:

1.  Assume that the child knows about its parent initializer function.  It
would call that initializer function when it is done with the persistent data
that it got.  The drawback is that if the parent of the child changes, the
child must know about it.  Also, some mechanism must still be found by which
the child can convert its persistent information into a form that the parent's
initializer function will accept.

2.  Translating data from one form into another can be done through explicit
casting.  Therefore, a structure that represents data as it resides in the
persistent storage area could have a cast function on it that would convert it
to its parent type of object.  The drawback here is this is something that you
would expect C++ to do for you automatically, but persistent data seems to be
the exception to the idea.
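
As a rough illustration of method 2, the structure that mirrors the stored
tuple can carry a conversion function that builds the real object.  The
sketch uses an explicit conversion operator, which is later C++ than was
available for this discussion.  The Person/Employee hierarchy, the
EmployeeRow layout, and the 16-byte name column are assumptions made for
the example, not anything taken from an actual database schema.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>

class Person {
public:
    explicit Person(std::string name) : name_(std::move(name)) {}
    virtual ~Person() = default;
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

class Employee : public Person {
public:
    Employee(std::string name, int grade)
        : Person(std::move(name)), grade_(grade) {}
    int grade() const { return grade_; }

private:
    int grade_;
};

// Flat tuple as it might come out of a relational table: no inheritance.
struct EmployeeRow {
    char name[16];  // fixed-width column, not necessarily NUL-terminated
    int grade;

    // The "cast function": converts the stored form into a real object.
    explicit operator Employee() const {
        const char* end = std::find(name, name + sizeof(name), '\0');
        assert(grade >= 0);
        return Employee(std::string(name, end), grade);
    }
};
```

A caller writes static_cast<Employee>(row).  The translation still has to
live somewhere; here it has moved into the row structure, which must know
the constructor of the class it converts to.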

--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
		"Nobody here but us chickens..."

ttwang@polyslo.CalPoly.EDU (Thomas Wang) (10/31/89)

kurtl@fai.fai.com (Kurt Luoto) writes:
>In article <4671be53.20b6d@apollo.HP.COM> vinoski@apollo.HP.COM (Stephen Vinoski) writes:

>There are other related situations that I have run into where the
>C++ language could not provide any magic:
    
>    II. I have a particular inheritance tree of classes. A client module
>	needs to allocate storage in a particular place for an instance
>	of one of these classes. The client needs to allocate a buffer
>	large enough to hold an instance of any class in the tree. But
>	we would like to not have to burden the client code with explicit
>	knowledge about all classes.

Doesn't this waste some storage space, in addition to complicating the
design?  Why can't the client contain a 'reference' to the superclass
of all the objects which it could contain?

This is the scheme usually used in systems with automatic memory
management.
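
In C++ terms, a small sketch of the reference scheme might look like the
following.  C++ has no automatic memory management, so the sketch assumes
a std::unique_ptr owns the object (a later addition to the language);
Item, SmallItem, LargeItem and Client are made-up names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Item {
    virtual ~Item() = default;
    virtual std::string name() const = 0;
};
struct SmallItem : Item {
    std::string name() const override { return "small"; }
};
struct LargeItem : Item {
    char padding[256] = {};  // makes this class much bigger than SmallItem
    std::string name() const override { return "large"; }
};

// The client stores only a pointer to the root class, so its own size
// does not depend on which classes exist in the tree.
class Client {
public:
    void hold(std::unique_ptr<Item> item) {
        assert(item != nullptr);
        item_ = std::move(item);
    }
    std::string describe() const { return item_ ? item_->name() : "empty"; }

private:
    std::unique_ptr<Item> item_;
};
```

The price is a heap allocation per object, which is exactly what a client
that must place the object "in a particular place" may be unable to pay.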


>Kurt W. Luoto     kurtl@fai.fai.com   or   ...!sun!fai!kurtl


 -Thomas Wang ("This is a fantastic comedy that Ataru and his wife Lum, an
                invader from space, cause excitement involving their neighbors."
                  - from a badly translated Urusei Yatsura poster)

                                                     ttwang@polyslo.calpoly.edu