[comp.lang.c++] Design Problem

sdm@cs.brown.edu (Scott Meyers) (12/04/89)

I'm having trouble figuring out how to solve the following problem.  Often
I find myself with a collection of objects of a given type, where some of
them define a particular function and some of them don't, and I want to
invoke the function on only those objects for which it is defined.  For
example, I might have a collection of objects of type B, but only those
objects of type D (derived from B) define a function f; I'll want to invoke
f on those objects for which it is defined.  How can I best accomplish
this?

To make things more concrete, I'm building graphs for representing
programs, and one of my base types is ARC.  Derived from ARC are two
classes, EXECUTABLE_ARC and NONEXECUTABLE_ARC.  Each of these types also
has subtypes, but that is immaterial.  EXECUTABLE_ARCs define a function
evaluate();  NONEXECUTABLE_ARCs do not.  The graph for a program consists
of a set of nodes and a set of arcs, and to execute the program, I need to
evaluate the nodes and arcs.  So what I have is a set of ARCs, and I want
to call evaluate() on them, but only some of them define that function.  

It seems I have three possible approaches:

    1.  Define evaluate() as a virtual function in ARC, and redefine it
        appropriately in EXECUTABLE_ARC.  In ARC, make it a noop, so when
        called on NONEXECUTABLE_ARCs, it won't do anything.

    2.  Add an enumerated type to ARC that enumerates each of the subtypes,
        and add a virtual function arc_type() that is redefined in each
        subclass to return the appropriate value.  Then I can write code
        like "if (arc->arc_type() == EXECUTABLE) arc->evaluate()".  

    3.  Separate my ARC class into two distinct classes, so I never have
        this problem.

None of these designs is attractive.  The first one doesn't extend well to
situations in which there may be many functions like evaluate() down in the
hierarchy -- each will have to be defined in ARC, even though they have
nothing to do with generic ARCness.  In addition, it suffers from the same
problem that the second one does: adding new classes to the hierarchy may
require that I modify the base class (e.g., adding new enumerated type
values).  The third option is impractical because, although the class ARC
breaks down well along executability, there are many other aspects of ARCs
that don't separate so easily -- I really do want to have a single ARC
superclass. 
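
A minimal sketch of approaches 1 and 2, to make the tradeoff concrete.  The
class names follow the description above; the two namespaces, the runs
counter, the Kind enumeration, and evaluate_all() are assumptions made up
for illustration, and std::vector stands in for whatever collection the real
code uses.

```cpp
#include <cstddef>
#include <vector>

namespace approach1 {
// evaluate() is declared in the base class as a no-op.
class ARC {
public:
    virtual ~ARC() {}
    virtual void evaluate() {}          // does nothing unless overridden
};

class EXECUTABLE_ARC : public ARC {
public:
    int runs;                           // hypothetical: counts evaluations
    EXECUTABLE_ARC() : runs(0) {}
    void evaluate() { ++runs; }
};

class NONEXECUTABLE_ARC : public ARC {};

// Every arc accepts the call; non-executable ones silently ignore it.
void evaluate_all(const std::vector<ARC*>& arcs) {
    for (std::size_t i = 0; i < arcs.size(); ++i)
        arcs[i]->evaluate();
}
}  // namespace approach1

namespace approach2 {
// The base class knows only a type tag; evaluate() lives in the subclass.
class ARC {
public:
    enum Kind { EXECUTABLE, NONEXECUTABLE };  // grows with the hierarchy
    virtual ~ARC() {}
    virtual Kind arc_type() const = 0;
};

class EXECUTABLE_ARC : public ARC {
public:
    int runs;
    EXECUTABLE_ARC() : runs(0) {}
    Kind arc_type() const { return EXECUTABLE; }
    void evaluate() { ++runs; }
};

class NONEXECUTABLE_ARC : public ARC {
public:
    Kind arc_type() const { return NONEXECUTABLE; }
};

// The caller tests the tag, then downcasts to reach evaluate().
void evaluate_all(const std::vector<ARC*>& arcs) {
    for (std::size_t i = 0; i < arcs.size(); ++i)
        if (arcs[i]->arc_type() == ARC::EXECUTABLE)
            static_cast<EXECUTABLE_ARC*>(arcs[i])->evaluate();
}
}  // namespace approach2
```

In both versions a new kind of operation or a new subclass touches the base
class: a new virtual function in the first, a new enumerator in the second.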

What I really want to be able to say is "if function f() is defined for
object o, then call o.f()".  Is there a good way to achieve this effect?

Scott
sdm@cs.brown.edu

keith@csli.Stanford.EDU (Keith Nishihara) (12/05/89)

sdm@cs.brown.edu (Scott Meyers) writes:

>I'm having trouble figuring out how to solve the following problem.  Often
>I find myself with a collection of objects of a given type, where some of
>them define a particular function and some of them don't, and I want to
>invoke the function on only those objects for which it is defined.  For
>example, I might have a collection of objects of type B, but only those
>objects of type D (derived from B) define a function f; I'll want to invoke
>f on those objects for which it is defined.  How can I best accomplish
>this?

>To make things more concrete, I'm building graphs for representing
>programs, and one of my base types is ARC.  Derived from ARC are two
>classes, EXECUTABLE_ARC and NONEXECUTABLE_ARC.  Each of these types also
>has subtypes, but that is immaterial.  EXECUTABLE_ARCs define a function
>evaluate();  NONEXECUTABLE_ARCs do not.  The graph for a program consists
>of a set of nodes and a set of arcs, and to execute the program, I need to
>evaluate the nodes and arcs.  So what I have is a set of ARCs, and I want
>to call evaluate() on them, but only some of them define that function.  

>What I really want to be able to say is "if function f() is defined for
>object o, then call o.f()".  Is there a good way to achieve this effect?

It seems to me that the virtual function defined on ARC, with no-ops
for non-executable arcs, is what you need.  However, one other
approach might be considered if you have lots of functions
like execute, and especially if the functions applicable to executable arcs
come from an open-ended set:

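class executable_arc;
class non_executable_arc;

class arc
{
public:
	// Returns 0 unless the arc is executable.
	virtual executable_arc *	executable() { return 0; }
...
};

class executable_arc : public arc
{
public:
	// Overrides arc::executable() to expose the derived interface.
	executable_arc *		executable() { return this; }
...
};

// first, next() and execute() are assumed to be declared in the elided parts.
int main()
{
	arc *p;
	executable_arc *e;

	for(p = first; p; p = p->next())
		if((e = p->executable()) != 0)
			e->execute();
	return 0;
}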
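
For reference, a self-contained sketch of the same idea with the elided
parts filled in by assumption.  The next()/set_next() list plumbing, the
executed counter, and execute_all() are hypothetical names added only so
the example compiles on its own; they are not from the code above.

```cpp
class executable_arc;   // forward declaration for the return type below

class arc {
public:
    arc() : next_(0) {}
    virtual ~arc() {}
    // Returns this arc viewed as an executable_arc, or 0 if it is not one.
    virtual executable_arc* executable() { return 0; }
    arc* next() const { return next_; }
    void set_next(arc* n) { next_ = n; }    // hypothetical list plumbing
private:
    arc* next_;
};

class executable_arc : public arc {
public:
    executable_arc() : executed(0) {}
    executable_arc* executable() { return this; }
    void execute() { ++executed; }
    int executed;                           // hypothetical: counts calls
};

class non_executable_arc : public arc {};

// Walks the list and executes only the arcs that report themselves as
// executable.  Returns how many arcs were executed.
int execute_all(arc* first) {
    int count = 0;
    for (arc* p = first; p; p = p->next())
        if (executable_arc* e = p->executable()) {
            e->execute();
            ++count;
        }
    return count;
}
```

The base class needs one such conversion function per interesting subclass,
but not one entry per operation.  Later standard C++ added dynamic_cast,
which performs this kind of checked downcast without a hand-written virtual
function.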

Neil/.		Neil%teleos.com@ai.sri.com	...decwrl!argosy!teleos!neil

rich@Rice.edu (Carey R. Murphey) (12/05/89)

In article <11258@csli.Stanford.EDU> keith@csli.Stanford.EDU (Keith Nishihara) writes:

   >What I really want to be able to say is "if function f() is defined for
   >object o, then call o.f()".  Is there a good way to achieve this effect?

   It seems to me that the virtual function defined on ARC with no ops
   for non executable arcs is what you need.  However, one other
   approach which might be considered if you have lots of functions
   like execute, and especially if the functions applicable to executable arcs
   are from an open ended set, is as below:

[code followed...]

Would it be reasonable to set the virtual member function pointer
to NULL (0) by default to indicate there is no function defined by
default?  Here's an idea of how that might work:

class ARC
{
...
  virtual void execute() = 0; // there is no execute method by default.
}

ARC an_arc;

if (an_arc.execute)	
  an_arc.execute(); // execute the method if it exists.

I dunno... I haven't actually tried this to see if it works.  I think
libg++ uses this in its `Bag' container classes.

--
Rich@rice.edu

sdm@brunix (Scott Meyers) (12/05/89)

In article <RICH.89Dec4144121@kalliope.Rice.edu> rich@Rice.edu (Carey R. Murphey) writes:
>Would it be reasonable to set the virtual member function pointer
>to NULL (0) by default to indicate there is no function defined by
>default?  Here's an idea of how that might work:
>
>class ARC
>{
>  virtual void execute() = 0; // there is no execute method by default.
>}
>
>ARC an_arc;
>
>if (an_arc.execute)	
>  an_arc.execute(); // execute the method if it exists.
>
>I dunno... I haven't actually tried this to see if it works.  I think
>libg++ uses this in its `Bag' container classes.

This is a good idea, and in fact it works with g++, but the AT&T compiler
won't take it.  (Actually, CC generates a warning -- "address of bound
function" -- and then cc chokes.)  I played around with this a bit, and it
doesn't look like there's any obvious way to dynamically determine the
address of a member function for a polymorphic object.  That is, given a
pointer to an object p and a virtual member function mf, there doesn't seem
to be any way to get the address of p->mf.  If anybody comes up with a
way to do this, I'd like to hear about it!
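
A small sketch of what standard C++ does offer here, under the assumption
that evaluate() is declared in the base class.  A pointer to member such as
&ARC::evaluate can be formed and called through, and the call dispatches
virtually, but the pointer identifies the virtual function rather than the
body a particular object would run, so it does not answer "is f defined for
this object?".  The string return values, the ArcMemFn typedef, and
call_through() are hypothetical, chosen only to make the dispatch visible.

```cpp
#include <string>

class ARC {
public:
    virtual ~ARC() {}
    virtual std::string evaluate() { return "ARC::evaluate (no-op)"; }
};

class EXECUTABLE_ARC : public ARC {
public:
    std::string evaluate() { return "EXECUTABLE_ARC::evaluate"; }
};

// Pointer to a member function of ARC taking no arguments.
typedef std::string (ARC::*ArcMemFn)();

// Calls the member through the pointer; the object's dynamic type still
// decides which override runs.
std::string call_through(ARC* p, ArcMemFn mf) {
    return (p->*mf)();
}
```

The same pointer value, &ARC::evaluate, selects different bodies depending
on the object it is applied to.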

Scott
sdm@cs.brown.edu

dlw@odi.com (Dan Weinreb) (12/05/89)

In article <22137@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:

   I'm having trouble figuring out how to solve the following problem.  Often
   I find myself with a collection of objects of a given type, where some of
   them define a particular function and some of them don't, and I want to
   invoke the function on only those objects for which it is defined.  

I believe that your analysis is completely correct: the options are as
you have stated them.  In my own personal understanding of the
philosophy behind C++, approach 1 is the recommended course.  (That
is, it's just my humble opinion. Approach 2 strikes me as clumsier
than approach 1, and approach 3 throws the baby out with the bathwater.)

In OO languages in which variables are not typed, such as Smalltalk-80
and Lisp-with-Flavors/CLOS, you wouldn't have any problem.  In fact,
"old Flavors" had a message called :SEND-IF-HANDLED, which was used
for exactly the kind of purpose you describe.  There are equivalent
ways to do this in new Flavors, CLOS, and Smalltalk-80.  In these
languages, only the objects themselves have types, and you can just
check at runtime whether an operation is handled.  I spent many years
programming with Flavors; this little trick is rarely used, but every
once in a while it comes in quite handy.

In C++, though, there is a principle that says that for every variable
(every expression, really), there is a type that is lexically apparent
(i.e. it can be known at compile time), and the type specifies the
exact set of function members that can be called for that object.  C++
has lexically-apparent typing; the meaning of "type" is (or
"includes", anyway) "what operations can be done on this thing".

   None of these designs is attractive.  The first one doesn't extend well to
   situations in which there may be many functions like evaluate() down in the
   hierarchy -- each will have to be defined in ARC, even though they have
   nothing to do with generic ARCness.  

Well, from the point of view of C++ philosophy (IMHO), they sort of
*do* have something to do with ARCness because you call these
functions on ARCs of all sorts, executable and not-executable.  This
is the key point.

					In addition, it suffers from the same
   problem that the second one does: adding new classes to the hierarchy may
   require that I modify the base class (e.g., adding new enumerated type
   values).  

That's because the base class does *two* things in C++: it defines a
particular class, and it defines a "protocol" for a family of classes.
I am using the word "protocol" in the sense in which it is used by the
Smalltalk-80 people; roughly, a definition of a set of messages, their
names and arguments, independent of the classes that provide methods
for those messages (if I may be allowed to use the Smalltalk lingo for
one sentence).  Adding a new operation changes the protocol, so you
have to redefine the base class.

You could look at it as a tradeoff.  The C++ style of typed variables
provides you with extra checking; but you have to do extra work while
programming in exchange.  There are similar tradeoffs to be found in
comparisons of the typed-variable and untyped-variable languages.

Actually, if I were programming in Flavors and were trying to do
precisely what you are trying to do, I would define a default method
on the base class.  It seems to me more stylistically desirable,
probably because the send-if-handled thing is more powerful than
necessary; if my program accidentally did one of these operations to a
non-ARC, I'd like an exception to be signalled at runtime.  That's
another way that these operations might really be said to have
something to do with ARCness.

So what I'm trying to say is that maybe your approach 1 isn't really
so bad after all.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

dog@cbnewsl.ATT.COM (edward.n.schiebel) (12/05/89)

From article <22137@brunix.UUCP>, by sdm@cs.brown.edu (Scott Meyers):
> ... description of design problem.  A list of base class objects, but
> Scott wants to iterate over it and call functions which are only defined
> on some of the derived classes...
>
> It seems I have three possible approaches:
> 
>     1.  Define evaluate() as a virtual function in ARC, and redefine it
>         appropriately in EXECUTABLE_ARC.  In ARC, make it a noop, so when
>         called on NONEXECUTABLE_ARCs, it won't do anything.
IMHO, this is the purest solution.  If there are operations you want to
perform on a collection of objects, then all those objects better
define that operation.  From a philosophical point of view, I see nothing
wrong with considering  NONEXECUTABLE_ARCS as ARCs whose evaluate() is
a noop.
> 
>     2.  Add an enumerated type to ARC that enumerates each of the subtypes,
>         and add a virtual function arc_type() that is redefined in each
>         subclass to return the appropriate value.  Then I can write code
>         like "if (arc->arc_type() == EXECUTABLE) arc->evaluate()".  
I found a need to do this once (or twice :-), but (as you mention later)
it suffers from having to modify the base class when you derive a 
new type from it.  I have a variation on this theme which does not
suffer from this weakness:
class base {
public:
	virtual String type();
};

class derived : public base {
public:
	static String Stype() {return type_string;}
	String type() {return Stype();}
private:
	static String type_string; // initialized to "derived" in .c file
};

Now the test becomes:
	if(base_ptr->type() == derived::Stype()) doSomething();
(assuming class String has operator== defined of course).

This test (a String comparison) is not that much more expensive than
an integer test, since in many cases, the comparison will fail in the
first character (unless your class hierarchy looks like arc_generic, 
arc_this, arc_that, arc_theotherthing).
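
A compilable sketch of this variation, assuming std::string in place of the
String class and function-local statics in place of the member initialized
in the .c file.  The doSomething() body and the do_if_derived() helper are
hypothetical.

```cpp
#include <string>

class base {
public:
    virtual ~base() {}
    static const std::string& Stype() {
        static const std::string s("base");
        return s;
    }
    virtual const std::string& type() const { return Stype(); }
};

class derived : public base {
public:
    static const std::string& Stype() {
        static const std::string s("derived");
        return s;
    }
    const std::string& type() const { return Stype(); }
    int doSomething() { return 42; }    // hypothetical derived-only operation
};

// Calls doSomething() only when the dynamic type tag matches derived's tag.
// Returns true if the call was made.
bool do_if_derived(base* p) {
    if (p->type() == derived::Stype()) {
        static_cast<derived*>(p)->doSomething();
        return true;
    }
    return false;
}
```

One caveat with an exact string match: a class derived further from derived
that returns its own tag will fail the test even though it inherits
doSomething().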
> 
>     3.  Separate my ARC class into two distinct classes, so I never have
>         this problem.
If there are specific places in the code where you want to deal with
one type of ARC only, and use it as a SPECIAL_ARC, not the generic 
variety, then maybe you need to keep a collection of SPECIAL_ARCs around
for this purpose.

> None of these designs is attractive.  The first one doesn't extend well to
> situations in which there may be many functions like evaluate() down in the
> hierarchy -- each will have to be defined in ARC, even though they have
> nothing to do with generic ARCness.  
Then, maybe you shouldn't be trying to use the special functions in
elements of a GENERIC_ARC collection.

> In addition, it suffers from the same
> problem that the second one does: adding new classes to the hierarchy may
> require that I modify the base class (e.g., adding new enumerated type
> values).  
See my solution above.

> The third option is impractical because, although the class ARC
> breaks down well along executability, there are many other aspects of ARCs
> that don't separate so easily -- I really do want to have a single ARC
> superclass. 
But maybe also keep the special cases collected together too.

Hope this is of some help.

	Ed Schiebel
	AT&T Bell Laboratories
	dog@vilya.att.com

jean@paradim.UUCP (Jean Pierre LeJacq) (12/11/89)

In article <22137@brunix.UUCP>, sdm@cs.brown.edu (Scott Meyers) writes:
> I find myself with a collection of objects of a given type, where some of
> them define a particular function and some of them don't, and I want to
> invoke the function on only those objects for which it is defined. ....
> 
> It seems I have three possible approaches:
>     1.  Define evaluate() as a virtual function in ARC, and redefine it
>         appropriately in EXECUTABLE_ARC.  In ARC, make it a noop, so when
>         called on NONEXECUTABLE_ARCs, it won't do anything.
>     2.  Add an enumerated type to ARC that enumerates each of the subtypes,
>         and add a virtual function arc_type() that is redefined in each
>         subclass to return the appropriate value. ...
>     3.  Separate my ARC class into two distinct classes, so I never have
>         this problem.

Another approach is to ensure that only instances of EXECUTABLE_ARC
are inserted into a collection. You are then assured that all members
of the collection, including public subclasses, have evaluate() defined.

This will require redesigning the solution to your problem at a higher
level. For example, two collections may be required to manage the
two fundamental ARC classes. The goal is to use polymorphism to your
advantage instead of reverting to C "switch" approaches or to de-tune
your class definitions.
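
One way this might look, as a sketch only.  The Graph class, its add()
overloads, the counting functions, and the runs counter are hypothetical,
and std::vector stands in for the real collection type.

```cpp
#include <cstddef>
#include <vector>

class ARC {
public:
    virtual ~ARC() {}
};

class EXECUTABLE_ARC : public ARC {
public:
    EXECUTABLE_ARC() : runs(0) {}
    void evaluate() { ++runs; }
    int runs;                           // hypothetical: counts evaluations
};

class NONEXECUTABLE_ARC : public ARC {};

// Hypothetical owner of the arcs.  Overloading add() on the static type
// routes each arc into the collection that matches its interface.
class Graph {
public:
    void add(EXECUTABLE_ARC* a) { executable_.push_back(a); }
    void add(NONEXECUTABLE_ARC* a) { nonexecutable_.push_back(a); }

    // No type test is needed: everything in executable_ has evaluate().
    void evaluate_arcs() {
        for (std::size_t i = 0; i < executable_.size(); ++i)
            executable_[i]->evaluate();
    }

    std::size_t arc_count() const {
        return executable_.size() + nonexecutable_.size();
    }
    std::size_t executable_count() const { return executable_.size(); }

private:
    std::vector<EXECUTABLE_ARC*> executable_;
    std::vector<NONEXECUTABLE_ARC*> nonexecutable_;
};
```

The compiler now rejects any attempt to evaluate a NONEXECUTABLE_ARC, at the
cost of deciding which collection an arc belongs to when it is inserted.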

marks@cbnewsl.ATT.COM (mark.e.smith) (12/15/89)

> > them define a particular function and some of them don't, and I want to
> > invoke the function on only those objects for which it is defined. ....
> > 
> > It seems I have three possible approaches:
> >     1.  Define evaluate() as a virtual function in ARC, and redefine it
> >         appropriately in EXECUTABLE_ARC.  In ARC, make it a noop, so when
> >         called on NONEXECUTABLE_ARCs, it won't do anything.
> > < other stuff >
> Another approach is to ensure that only instances of EXECUTABLE_ARC
> are inserted into a collection. You are then assured that all members
> of the collection, including public subclasses, have evaluate() defined.
> 
> This will require redesigning the solution to your problem at a higher
> level. For example, two collections may be required to manage the
> two fundamental ARC classes. The goal is to use polymorphism to your
> advantage instead of reverting to C "switch" approaches or to de-tune
> your class definitions.

(Yes.  And add to that casting.)

Many times, the "natural" way of doing this is with parameterized types,
where each collection can only have members of one class hierarchy on it.
The base for a particular hierarchy defines the "weird" functions not 
shared with more general superclasses.  But the collection operations
are then shared across all the class hierarchies.

As of right now, parameterized types can only be done with macros (referred
to as "generic classes"), but I hope that changes sometime fairly soon.  
Having gone through extensive production code using generic classes,
I find the benefit of being able to specify parameterized types is 
evenly balanced by the drawback of having to manage those nasty macros.
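
A rough sketch of the macro style, for readers who have not seen it.  The
DECLARE_COLLECTION macro, the fixed capacity of 16, and the class names are
all made up for illustration; this is not the interface of any particular
generic-class header.

```cpp
#include <cstddef>

// Hypothetical macro: expands to a fixed-capacity collection class called
// NAME that holds pointers to TYPE and nothing else.
#define DECLARE_COLLECTION(NAME, TYPE)                              \
class NAME {                                                        \
public:                                                             \
    NAME() : size_(0) {}                                            \
    bool add(TYPE* item) {                                          \
        if (size_ == static_cast<std::size_t>(capacity))            \
            return false;                                           \
        items_[size_++] = item;                                     \
        return true;                                                \
    }                                                               \
    std::size_t size() const { return size_; }                      \
    TYPE* at(std::size_t i) const { return items_[i]; }             \
private:                                                            \
    enum { capacity = 16 };                                         \
    TYPE* items_[capacity];                                         \
    std::size_t size_;                                              \
};

class EXECUTABLE_ARC {
public:
    EXECUTABLE_ARC() : runs(0) {}
    void evaluate() { ++runs; }
    int runs;                           // hypothetical: counts evaluations
};

// One macro invocation per element type.
DECLARE_COLLECTION(ExecutableArcCollection, EXECUTABLE_ARC)

// at() returns EXECUTABLE_ARC*, so evaluate() needs no cast or type test.
void evaluate_all(const ExecutableArcCollection& arcs) {
    for (std::size_t i = 0; i < arcs.size(); ++i)
        arcs.at(i)->evaluate();
}
```

The backslash continuations and the lack of comments inside the macro body
give a taste of the management burden.  C++ templates, added to the language
later, express the same idea directly as a class parameterized on its
element type.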

Mark Smith
gc3ba!mark