vinoski@apollo.HP.COM (Stephen Vinoski) (10/26/89)
Some time ago a discussion arose in comp.lang.c++ about reading objects in
from, say, a disk file.  There were several answers, but none that seemed
all that great, so I hope to start a discussion of the subject here.

The subject line of this article comes from Bertrand Meyer's "Object-oriented
Software Construction" book, page 290.  The problem he is solving on that page
is how to create the appropriate object to fulfill a command within an editor
program; the user enters the command, and the program must construct a command
object to carry out the desired operation.

He advocates associating an integer identifier with each command,
pre-computing an array of command elements, and initializing the nth array
element with a command object having integer identifier n.  When the user
enters an editor command, the integer identifier associated with that command
is used to index into the array to obtain a command object which is copied and
used to carry out the command.

This problem seems pervasive; other examples are reading objects from disk
storage and receiving objects from another process.  Basically, how does one
know what type of object to construct to receive the object unless the type of
the blob coming in from the external source is already known?  This appears to
be one area of OOP that requires something like a switch or case statement to
work properly.

Doesn't Meyer's method trade the maintenance woes of a switch or case
statement for having to maintain the initialization function for the command
object array?  It's a good tradeoff, but isn't there anything better?  It also
seems that his method is appropriate only for collections of objects which are
somehow related.

Does anyone out there know of any good solutions for this problem?  Pointers
to literature would be appreciated (I'm working in C++).

-steve

| Steve Vinoski       | Hewlett-Packard Apollo Div. | ARPA: vinoski@apollo.com |
| (508)256-6600 x5904 | Chelmsford, MA 01824        | UUCP: ...!apollo!vinoski |
| "Yikes, not another OOP book using the tired old 'stack' class example!!!"   |
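
The prototype-array scheme attributed to Meyer above might look roughly like the following in C++. This is only a sketch under assumptions: the class names (`Command`, `InsertLine`, `DeleteLine`), the identifiers, and the helper functions are hypothetical and do not come from Meyer's book, and it is written in present-day C++ rather than the 1989 dialect.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical command hierarchy, for illustration only.
class Command {
public:
    virtual ~Command() = default;
    // Each command knows how to copy itself, so the table can hand out copies.
    virtual std::unique_ptr<Command> clone() const = 0;
    virtual std::string execute() const = 0;
};

class InsertLine : public Command {
public:
    std::unique_ptr<Command> clone() const override {
        return std::make_unique<InsertLine>(*this);
    }
    std::string execute() const override { return "insert"; }
};

class DeleteLine : public Command {
public:
    std::unique_ptr<Command> clone() const override {
        return std::make_unique<DeleteLine>(*this);
    }
    std::string execute() const override { return "delete"; }
};

// Integer identifiers double as indices into the prototype table.
enum CommandId : std::size_t { kInsertLine = 0, kDeleteLine = 1 };

// The single initialization function that must be maintained
// whenever a new command class is added.
std::vector<std::unique_ptr<Command>> make_prototypes() {
    std::vector<std::unique_ptr<Command>> table;
    table.push_back(std::make_unique<InsertLine>());  // slot kInsertLine
    table.push_back(std::make_unique<DeleteLine>());  // slot kDeleteLine
    return table;
}

// Index the table with the identifier and copy the prototype found there.
// Returns a null pointer for an identifier outside the table.
std::unique_ptr<Command> make_command(
        const std::vector<std::unique_ptr<Command>>& table, std::size_t id) {
    if (id >= table.size()) return nullptr;
    return table[id]->clone();
}
```

The switch statement is gone, but `make_prototypes()` plays the role of the initialization function whose upkeep the post asks about: it is still the one place that names every concrete class.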
davidm@uunet.UU.NET (David S. Masterson) (10/27/89)
In article <4671be53.20b6d@apollo.HP.COM> vinoski@apollo.HP.COM (Stephen Vinoski) writes:
   Some time ago a discussion arose in comp.lang.c++ about reading objects in
   from, say, a disk file.  There were several answers, but none that seemed
   all that great, so I hope to start a discussion of the subject here.
I may have started that discussion (although I think there was one that
predates me) and I still have not seen a good answer to the problem in C++.  In
particular, I am attempting to use a relational database (with no object
oriented features) in a C++ world.  Therefore, structures (tuples from tables)
come out of the database with no inheritance structure and we are trying to
give them an inheritance structure.  Thus far, we've had little success.
The problem is pervasive as we are/will be running into the problem when
moving objects between processes.  The one advantage, though, in moving
objects between processes is that at least they started with an inheritance
hierarchy and, therefore, it's just a matter of wrapping the object up into a
shippable form.  The database (disk file, non-object oriented process),
though, does not know of object hierarchies.
In C++, the only type-safe, compile-time method for supporting this that I
have found involves having parent objects knowing about their children's
structure so that if they get such an object on their initializer function,
they know what to do with it.  There is a basic translation mechanism that
must be invoked somewhere and, no matter what I have tried to do, I have only
succeeded in moving the problem from one layer of objects to another.
Anyone have any ideas?
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
		"Nobody here but us chickens..."
render@m.cs.uiuc.edu (10/28/89)
Written 11:52 am Oct 26, 1989 by chewy@apple.com :
>Good! The discussion that you're trying to start essentially revolves
>around what are usually called "persistent objects."
> ...
>Ok, so these are some of the more obvious challenges. I'd recommend
>looking at lots of OOPSLA proceedings to get some idea as to where to look
>next. I'd also recommend reading up on distributed object systems.

Also look at papers on object-oriented database management systems (OODBMSs).
Their main goal is integrating OO PLs and databases, and persistence is a big
part of this.  I can post references if necessary.

hal.
jwd@cbnewsc.ATT.COM (joseph.w.davison) (10/28/89)
In article <CIMSHOP!DAVIDM.89Oct26104730@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>
>> Some time ago a discussion arose in comp.lang.c++ about reading objects in
>> from, say, a disk file.  There were several answers, but none that seemed
>> all that great, so I hope to start a discussion of the subject here.
>> ...
>In C++, the only type-safe, compile-time method for supporting this that I
>have found involves having parent objects knowing about their children's
>structure so that if they get such an object on their initializer function,
>they know what to do with it.  There is a basic translation mechanism that
>must be invoked somewhere and, no matter what I have tried to do, I have only
>succeeded in moving the problem from one layer of objects to another.
>
>Anyone have any ideas?

Well, ignoring the issue about methods being stored with the objects, which
is hardly ignorable, I'll address the simpler problem of reading a text file,
recognizing the next "object" and creating one of "them".

I encountered the problem in Smalltalk while trying to process some
computer-generated files of "messages" that had a variety of formats.  I had
created a class Message with various subclasses, one for each different
message format.  The Message objects had methods to allow users to "interpret"
or, more generally, to interact with them.  The problem was to create objects
of the proper type as the file was read.

I intend/intended to do this same job in C++, and have done a similar job as
you suggest -- make the parent's constructor know about the child's structure.
That just doesn't seem right.

The way I wish I could do it is with OOYACC -- like Yacc, but designed to
work in an OO system.  Basically, I want yylex() and yyparse() and their
relatives to be member functions.  I want to be able to tell when a parser
succeeds or fails -- perhaps a static data member of the parent could be used
for this info.

Then the parent class needs only to know the names of its child classes.
Each class provides a constructor that takes a stream as input and applies its
own yyparse() to it.  If the parse is successful, the constructor can do its
job, and the static data member is set to SUCCESS.  If the parse fails, the
static data member is set to FAILURE and the constructor does nothing else.

The parent constructor tries to create instances of each child, in some
order, until one succeeds, or it fails -- and does whatever constructors do
when they fail.

This idea at least limits the amount of information the parent needs about
the child.

Of course, OOYACC does not exist...

--
Joe Davison	jwd@ihlts.att.com
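
The "try each child in turn" idea can be approximated without OOYACC. The sketch below is an assumption-laden illustration, not the scheme exactly as posted: hand-written `try_parse` functions stand in for the per-class yyparse(), a null return replaces the static SUCCESS/FAILURE member, and the class names (`Message`, `Alarm`, `Note`) and the line formats are invented for the example.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

class Message {
public:
    virtual ~Message() = default;
    virtual std::string kind() const = 0;
    // Tries each known child parser in order; null if none accepts the line.
    static std::unique_ptr<Message> read(const std::string& line);
};

class Alarm : public Message {
public:
    explicit Alarm(int code) : code_(code) {}
    std::string kind() const override { return "alarm"; }
    int code() const { return code_; }
    // Accepts lines of the (hypothetical) form "ALARM <integer>".
    static std::unique_ptr<Message> try_parse(const std::string& line) {
        std::istringstream in(line);
        std::string word;
        int code = 0;
        if (in >> word >> code && word == "ALARM")
            return std::make_unique<Alarm>(code);
        return nullptr;
    }
private:
    int code_;
};

class Note : public Message {
public:
    explicit Note(std::string text) : text_(std::move(text)) {}
    std::string kind() const override { return "note"; }
    const std::string& text() const { return text_; }
    // Accepts lines of the (hypothetical) form "NOTE <rest of line>".
    static std::unique_ptr<Message> try_parse(const std::string& line) {
        std::istringstream in(line);
        std::string word;
        if (!(in >> word) || word != "NOTE") return nullptr;
        std::string rest;
        std::getline(in, rest);
        return std::make_unique<Note>(rest);
    }
private:
    std::string text_;
};

// The parent needs only the names of its children, not their structure.
std::unique_ptr<Message> Message::read(const std::string& line) {
    using Parser = std::unique_ptr<Message> (*)(const std::string&);
    static const Parser parsers[] = { &Alarm::try_parse, &Note::try_parse };
    for (Parser parse : parsers) {
        if (auto message = parse(line)) return message;
    }
    return nullptr;
}
```

As in the post, the parent still lists its children by name, so adding a message format means touching `Message::read`; what it no longer needs is any knowledge of how each format is laid out.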
kurtl@fai.UUCP (Kurt Luoto) (10/31/89)
In article <4671be53.20b6d@apollo.HP.COM> vinoski@apollo.HP.COM (Stephen Vinoski) writes:
>
>  This problem seems pervasive; other examples are reading objects from disk
>storage and receiving objects from another process.  Basically, how does one
>know what type of object to construct to receive the object unless the type of
>the blob coming in from the external source is already known?  This appears to
>be one area of OOP that requires something like a switch or case statement to
>work properly.
>
>  Doesn't Meyer's method trade the maintenance woes of a switch or case
>statement for having to maintain the initialization function for the command
>object array?  It's a good tradeoff, but isn't there anything better?  It also
>seems that his method is appropriate only for collections of objects which are
>somehow related.
>
>-steve
>
>| Steve Vinoski       | Hewlett-Packard Apollo Div. | ARPA: vinoski@apollo.com |
>| (508)256-6600 x5904 | Chelmsford, MA 01824        | UUCP: ...!apollo!vinoski |

I have also run into this problem (translating external messages into
objects) while working in C++, and I could think of nothing more pragmatic
at the time than the equivalent of Meyer's solution, i.e. a table lookup
based on message type.  Indeed, it seems that at least C++ (and apparently
Eiffel) does not help to "distribute the case statement" in this situation
as it does for virtual functions.

There are other related situations that I have run into where the
C++ language could not provide any magic:

I.  I have a particular inheritance tree of classes.  The instances
    of these classes are small in size and high in volume.  Furthermore,
    I would like to define virtual functions for them.  However,
    in current implementations of C++, using the language-provided
    virtual function mechanism involves an extra storage overhead of
    a pointer (to a vtable) per object.  While this is perfectly
    acceptable in most cases, it is not in this case because of the
    large number of instances.
    Hence I have to avoid the language-provided mechanism.  I can easily
    tell the class of an object by the value of a tag field.  C++ could
    have provided a way for me to tie the value of this field to its
    class, and thus provide an alternate mechanism for discriminating
    virtual functions at runtime.  However, the author(s) of C++ wisely
    left out such a provision.  I must resort to putting case statements
    in the (now non-virtual) member functions, or resort to non-standard
    hackery to simulate virtual functions based on a tag field.

II. I have a particular inheritance tree of classes.  A client module
    needs to allocate storage in a particular place for an instance
    of one of these classes.  The client needs to allocate a buffer
    large enough to hold an instance of any class in the tree.  But
    we would like to not have to burden the client code with explicit
    knowledge about all classes.  The pragmatic solution is to create
    a header file which defines a union type.  The union contains
    members for all classes in the hierarchy.  This is a good solution,
    but now we need to remember to update this header file every time
    we introduce a new class to the tree.  It does not seem reasonable
    to expect C++ to handle this more gracefully, given its C-like
    separate compilation facilities.

In both of these situations, the pragmatic solution is less than elegant.
I would not propose changing C++ to better handle these cases, but I can
imagine how another language might simplify them.

Both cases would seem to require that the compiler/system know about all
the classes in the inheritance tree, even when the client module under
compilation only refers to the root class of the tree.  This is not
normally true in C++, but it is probably true in at least some other
languages.  Given the availability of such knowledge, the compiler could
pick off the information that it needs, such as the size of an instance
of each class and therefore the maximum size over the overall tree.
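
The hand-maintained union header from situation II might be sketched as below. The class names (`Shape`, `Circle`, `Polygon`), the `Slot` buffer type, and `make_circle` are hypothetical, and the classes are kept trivially constructible so that they may legally appear as union members; this is an illustration under those assumptions rather than the code described in the post.

```cpp
#include <cstddef>
#include <new>

// A small hypothetical inheritance tree without virtual functions.
struct Shape   { int tag; };
struct Circle  : Shape { double radius; };
struct Polygon : Shape { double xs[8]; double ys[8]; };

// The header that must be updated by hand for every new class:
// one member per class, so sizeof(AnyShape) is the maximum over the tree.
union AnyShape {
    Shape   shape;
    Circle  circle;
    Polygon polygon;
};

// Client code reserves raw storage big enough, and suitably aligned,
// for any class in the tree without naming the classes itself.
struct Slot {
    alignas(AnyShape) unsigned char bytes[sizeof(AnyShape)];
};

// Builds a Circle inside the caller's storage using placement new.
Circle* make_circle(Slot& slot, double radius) {
    Circle* circle = new (slot.bytes) Circle;
    circle->tag = 1;          // hypothetical tag value for Circle
    circle->radius = radius;
    return circle;
}

// A forgotten union member could be caught wherever the new class is
// visible, for example with a check like this one next to its definition.
static_assert(sizeof(Slot) >= sizeof(Polygon), "Slot too small for Polygon");
```

The maintenance burden described in the post remains: nothing forces a new class to be added to `AnyShape`, so a per-class `static_assert` is only a partial safety net.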
To handle the first case above, the language could provide a way for the
programmer to enter the tag field value (or some other discriminating
information) into or along with the class declaration itself.  The compiler
could then generate appropriate code for runtime determination of the
proper virtual function.

It seems to me that such a language could also help simplify the problems
that Stephen has run into.  In the case of external messages, the "tag
field" class identification method would be extended to allow the programmer
to define a "default class" that would handle otherwise invalid tag field
values.

So, from my humble experience, there are no easy answers, at least not as
long as you're working with C++.  But it is interesting food for thought.

P.S. Sorry for the length of the posting.

--
----------------
Kurt W. Luoto     kurtl@fai.fai.com or ...!sun!fai!kurtl
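
One way to simulate virtual functions from a tag field, including a default handler for invalid tags, is a table of function pointers indexed by the tag. The sketch below is a guess at what such "hackery" could look like; the names (`Item`, `Point`, `Segment`, `kNameTable`) and tag values are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>

// Small, high-volume objects: one byte of tag, no hidden vtable pointer.
struct Item {
    unsigned char tag;            // may hold an invalid value if read from outside
    std::string name() const;     // non-virtual; dispatches on the tag
};
struct Point   : Item { short x, y; };
struct Segment : Item { short x0, y0, x1, y1; };

// Hypothetical tag values, used as indices into the dispatch table.
enum : unsigned char { kPointTag = 0, kSegmentTag = 1 };

// One handler per class, plus a "default class" handler for unknown tags.
static std::string point_name(const Item&)   { return "point"; }
static std::string segment_name(const Item&) { return "segment"; }
static std::string unknown_name(const Item&) { return "unknown"; }

using NameFn = std::string (*)(const Item&);
static const NameFn kNameTable[] = { point_name, segment_name };

// The hand-written equivalent of a virtual call: look up by tag,
// falling back to the default handler when the tag is out of range.
std::string Item::name() const {
    const std::size_t count = sizeof(kNameTable) / sizeof(kNameTable[0]);
    const std::size_t index = static_cast<std::size_t>(tag);
    NameFn fn = index < count ? kNameTable[index] : unknown_name;
    return fn(*this);
}

static_assert(sizeof(Item) == 1, "Item carries only its tag");
```

The table has to be kept in step with the tag values by hand, which is exactly the bookkeeping the post suggests a compiler with whole-tree knowledge could generate.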
davidm@uunet.UU.NET (David S. Masterson) (10/31/89)
In article <4158@cbnewsc.ATT.COM> jwd@cbnewsc.ATT.COM (joseph.w.davison) writes:
   Well, ignoring the issue about methods being stored with the objects, which
   is hardly ignorable,...
Hmmm, anyone have any information on the idea of storing methods with objects?
Especially in a non-object-oriented storage area?
   The way I wish I could do it is with OOYACC -- Like Yacc, but designed to
   work in an OO system.
[munch]
   The parent constructor tries to create instances of each child, in some
   order, until one succeeds, or it fails -- and does whatever constructors do
   when they fail.
   This idea at least limits the amount of information the parent needs about
   the child.
Of course the problem with this is that the parent object should never know
about its children as this violates the idea of encapsulation of ideas that
object-orientation is supposed to promote.  Two other methods have come to
light that bear examination, but I'm not happy with them either:
1.  Assume that the child knows about its parent initializer function.  It
would call that initializer function when it is done with the persistent data
that it got.  The drawback is that if the parent of the child changes, the
child must know about it.  Also, some mechanism must still be found by which
the child can convert its persistent information into a form that the parent's
initializer function will accept.
2.  Translating data from one form into another can be done through explicit
casting.  Therefore, a structure that represents data as it resides in the
persistent storage area could have a cast function on it that would convert it
to its parent type of object.  The drawback here is this is something that you
would expect C++ to do for you automatically, but persistent data seems to be
the exception to the idea.
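
Method 2 could be sketched with a user-defined conversion function on the stored record. Everything named here (`EmployeeRow`, `Employee`, the field layout) is hypothetical and only illustrates the idea of a cast from the persistent form to the in-memory class; the row's name field is assumed to be NUL-terminated.

```cpp
#include <string>
#include <utility>

// In-memory class with whatever behaviour the application needs.
class Employee {
public:
    Employee(std::string name, int grade)
        : name_(std::move(name)), grade_(grade) {}
    const std::string& name() const { return name_; }
    int grade() const { return grade_; }
private:
    std::string name_;
    int grade_;
};

// Flat tuple as it might come back from a relational table.
struct EmployeeRow {
    char name[32];   // assumed NUL-terminated
    int grade;
    // The "cast function": converts the stored form into the object.
    explicit operator Employee() const { return Employee(name, grade); }
};
```

A caller would write `Employee e = static_cast<Employee>(row);`. The translation step has not disappeared; it has moved into the row type, which is the drawback the post points out.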
--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
		"Nobody here but us chickens..."
ttwang@polyslo.CalPoly.EDU (Thomas Wang) (10/31/89)
kurtl@fai.fai.com (Kurt Luoto) writes:
>In article <4671be53.20b6d@apollo.HP.COM> vinoski@apollo.HP.COM (Stephen Vinoski) writes:
>There are other related situations that I have run into where the
>C++ language could not provide any magic:
> II. I have a particular inheritance tree of classes.  A client module
>     needs to allocate storage in a particular place for an instance
>     of one of these classes.  The client needs to allocate a buffer
>     large enough to hold an instance of any class in the tree.  But
>     we would like to not have to burden the client code with explicit
>     knowledge about all classes.

Doesn't this waste some storage space, in addition to adding to the
complexity of the design?  Why can't the client contain a 'reference' to the
superclass of all the objects it could contain?  This is the scheme usually
used for systems with automatic memory management.

>Kurt W. Luoto     kurtl@fai.fai.com or ...!sun!fai!kurtl

 -Thomas Wang ("This is a fantastic comedy that Ataru and his wife Lum, an
                invader from space, cause excitement involving their neighbors."
                  - from a badly translated Urusei Yatsura poster)

                                                     ttwang@polyslo.calpoly.edu