[comp.lang.c++] When to make member functions?

kurtl@fai.UUCP (Kurt Luoto) (08/28/89)

In article <1267@pc.ecn.purdue.edu> fai!amdahl!ames!mailrus!iuvax!pur-ee!pc.ecn.purdue.edu!filoj filoj@pc.ecn.purdue.edu (Jeffrey J Filo) writes:
>I am doing thesis work with directed graphs for mechanical systems and
>am interested in finding out if there are any class libraries available
>for graph representation and manipulation.
[ ... ]
>Jeff Filo
>filoj@pc.ecn.purdue.edu

I don't know of any available libraries for graphs, but this article brings
up some questions for me.  I am currently writing a set of classes for graph
representation.  I am basing the implementation on that proposed in the article

    "A Versatile Data Structure For Edge-Oriented Graph Algorithms",
    Jurgen Ebert, CACM June 1987, Vol 30 No. 6, pp 513-519

My humble question for the wisdom of the net: When is it desirable to make a
member function of a class?  Obvious candidates are functions for traversing
the nodes and edges of a graph.  Less obvious are algorithms such as finding
a minimum path between two vertices of a graph.  In the latter case, the
algorithm can be written in terms of the primitive functions provided by the
graph class, and so is independent of the graph representation.

Since there are so many things that you can do with a graph, it seems silly
to encumber a simple graph class with all kinds of algorithmic functions.
On the other hand, in other OOP languages you have no choice. E.g. in Eiffel,
functions only exist as members of some class.  What is appropriate for C++?
(Honest, I am not trying to start any flame wars here.)

-------------------
Kurt W. Luoto,  neophyte net-poster.
(If I knew what my mailer address was, I'd give it.)

wright@hsi.UUCP (Gary Wright) (08/29/89)

In article <2506@fai.UUCP> kurtl@fai.fai.com (Kurt Luoto) writes:
>Since there are so many things that you can do with a graph, it seems silly
>to encumber a simple graph class with all kinds of algorithmic functions.
>On the other hand, in other OOP languages you have no choice. E.g. in Eiffel,
>functions only exist as members of some class.  What is appropriate for C++?
>(Honest, I am not trying to start any flame wars here.)

1. Design an abstract graph class with no commitment to any 
   particular implementation.
2. Inherit from your abstract class, and define the implementation.
3. Inherit once more from a specific implementation of your graph
   class and add the computationally expensive algorithms at
   this step.  This step may also include adding state to the
   various nodes in the graph in order to implement some graph
   algorithms.

This is an over-simplification but my point is that there is no need to
design one and only one graph class.  By building your classes from
more primitive classes, you create future possibilities for reuse.  I
am also ignoring the fact that you will probably want a class that
describes nodes in the graph, and a class that describes the graph
itself.

I would suggest reading Chapter 12, "Graphs" in Software Components
with Ada by Grady Booch.  Regardless of what you think about Ada, Booch
does give some nice abstract definitions of various data structures in
his book.
-- 
Gary Wright 					...!uunet!hsi!wright
Health Systems International                    wright@hsi.com

sakkinen@tukki.jyu.fi (Markku Sakkinen) (08/30/89)

In article <565@hsi86.hsi.UUCP> wright@hsi.com (Gary Wright) writes:
-In article <2506@fai.UUCP> kurtl@fai.fai.com (Kurt Luoto) writes:
->Since there are so many things that you can do with a graph, it seems silly
->to encumber a simple graph class with all kinds of algorithmic functions.
->On the other hand, in other OOP languages you have no choice. E.g. in Eiffel,
->functions only exist as members of some class.  What is appropriate for C++?
->(Honest, I am not trying to start any flame wars here.)
-
-1. Design an abstract graph class with no commitment to any 
-   particular implementation.
-2. Inherit from your abstract class, and define the implementation.
-3. Inherit once more from a specific implementation of your graph
-   class and add the computationally expensive algorithms at
-   this step.  This step may also include adding state to the
-   various nodes in the graph in order to implement some graph
-   algorithms.
-
-This is an over-simplification but [...]
- ...

Recommendations 1 and 2 above make sense,
but I think that recommendation 3 goes exactly the wrong way round.
The original question was about functions that are independent
of the particular realisation, i.e. written in terms of the general
public interface of the abstract graph class.
In my opinion, such functions should be _implemented_ in the abstract class.
This is both conceptually logical and avoids code duplication.
For safety's sake you could declare some or all of those functions
virtual if you might some day want to write essentially more efficient
versions of them for some specific cases.

My suggestion for the more general question (When to make functions
members of classes and when not?) is much more a matter of taste.
1. If a function needs to access private or protected members of the class,
   make it a member function unless there are strong logical reasons
   to the contrary (it must then be declared a friend, of course).
   There is some more technical discussion about this in Stroustrup's
   book (6.10).
2. If a function needs to access private or protected members of
   _several_ classes, make it a friend of all those classes;
   but if it is clearly most intimately connected with one class,
   make it a member of that one a friend of the others.
3. If a function needs no access to any private or protected class members,
   make it an ordinary function unless there are strong logical reasons
   for making it a member function of some class. (I think there are
   such reasons in the graph example.)

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland

wright@hsi.UUCP (Gary Wright) (08/30/89)

In article <1230@tukki.jyu.fi> markku@jytko.jyu.fi (Markku Sakkinen) SAKKINEN@FINJYU.bitnet (alternative) writes:
>In article <565@hsi86.hsi.UUCP> wright@hsi.com (Gary Wright) writes:
>-In article <2506@fai.UUCP> kurtl@fai.fai.com (Kurt Luoto) writes:
>->Since there are so many things that you can do with a graph, it seems silly
>->to encumber a simple graph class with all kinds of algorithmic functions.
>->On the other hand, in other OOP languages you have no choice. E.g. in Eiffel,
>->functions only exist as members of some class.  What is appropriate for C++?
>->(Honest, I am not trying to start any flame wars here.)
>-
>-1. Design an abstract graph class with no commitment to any 
>-   particular implementation.
>-2. Inherit from your abstract class, and define the implementation.
>-3. Inherit once more from a specific implementation of your graph
>-   class and add the computationally expensive algorithms at
>-   this step.  This step may also include adding state to the
>-   various nodes in the graph in order to implement some graph
>-   algorithms.
>-
>-This is an over-simplification but [...]
>- ...
>
>Recommendations 1 and 2 above make sense,
>but I think that recommendation 3 goes exactly the wrong way round.
>The original question was about functions that are independent
>of the particular realisation, i.e. written in terms of the general
>public interface of the abstract graph class.
>In my opinion, such functions should be _implemented_ in the abstract class.

I have a feeling that we are both using terminology in subtly different ways.
I also think that while I have multiple inheritance in mind, Markku does not.

I'll try to elaborate.  Disclaimer: I am more familiar with Eiffel and
it's inheritance mechanisms than with C++.  I will try to make things
clear enough but it is my general feeling that it may not be possible
to convert certain Eiffel techniques directly into C++.

Clearly, you could have any number of abstract graph classes.  Some
would not include any complex "algorithmic" queries of the underlying
graph, some would.  It is also clear that some algorithms to determine
characteristics of a graph can be "written in terms of the general
public interface of the abstract graph class."  

When designing the abstract graph class, you would like to include 
enough features so that any algorithm can be written in terms of this
interface.  This is the problem of determining the primitive operations
that an abstract graph class should provide.  Some algorithms
written in this way may be efficient, other may not.  For efficiency,
you may want an algorithm to know about the internal implementation of
the graph.

For simplicity, I will refer to the three classes I suggested in my 
previous posting as (1) abstract, (2) simple, (3) complex.

My understanding is that Markku is suggesting that functions that can
be written in terms of the abstract interface be included in (1) and as
such will also be come part of the abstract interface.  My problem with
this approach is that it prevents reuse of the abstract class.  What if
later on, I need to use a graph in some other application and it is not
necessary to have all the complicated graph algorithms?  

As mentioned above, some algorithms may need more than the primitive
graph structure in order to be implemented in an efficient way.  This
extra information should be in a separate class, not in (1) or direct
heirs of (1).  I would suggest using multiple inheritance and
deferred classes (abstract classes) to achieve this.  

I would go about creating my family of graph classes by first
defining a deferred graph class "GRAPH" that only contained the minimum number
of member functions, and none of the more complex "graph theoretic functions".

I would inherit from GRAPH and define the various functions in terms
of a specific implementation (i.e. FIXED_GRAPH, a graph with a fixed maximum
number of nodes and arcs).  This class will still be a basic graph class, 
but it would be fully implemented.

Now, let's assume I am interested in the shortest path between two nodes
of a graph.  Let's define the function as:

	short_path(start: NODE, finish: NODE): LIST[NODE];

Instead of including this function as part of GRAPH or FIXED_GRAPH, I'll
put it in a new class called GRAPH_QUERY.  This class will be a deferred
class, any of the graph primitives that my algorithm might need are deferred.
The function short_path() is defined in terms of the deferred primitives.
I will also need a weight associated with each arc of the graph, so I'll
create a class WEIGHTED_GRAPH by inheriting from FIXED_GRAPH, and 
providing a new implementation of arcs.  

Actually, I might want to use parameterized types here:
	
	class WEIGHTED_GRAPH exports
		...
		weight,
		...
	inherits
		FIXED_GRAPH[ WEIGHTED_ARC, NODE];
	features
		weight( arc: WEIGHTED_ARC) : INTEGER is do
			Result := arc.weight;	-- Result is the returned value
		end;
		...
	end -- WEIGHTED_GRAPH

So now for my final class, I use multiple inheritance to get the implementation
of a graph from WEIGHTED_GRAPH, and the implementation (e.g. Dijkstra's) for
short_path from GRAPH_QUERY.

	class FINAL_GRAPH exports
		...
		short_path,
		...
	inherits
		WEIGHTED_GRAPH;
		GRAPH_QUERY
	features
		...
	end -- FINAL_GRAPH

The idea is that WEIGHTED_GRAPH provides an implementation of
weight() that GRAPH_QUERY needs (i.e., weight() is deferred in
GRAPH_QUERY).  As long as there is only one parent that provides an
implementation of a particular function, it doesn't matter how many
parents provide a deferred version of it.

GRAPH_QUERY may actually contain a number of interesting algorithms
packaged together instead of having a separate class for each
algorithm.

>This is both conceptually logical and avoids code duplication.
>For safety's sake you could declare some or all of those functions
>virtual if you might some day want to write essentially more efficient
>versions of them for some specific cases.

I believe the best technique is to always declare functions virtual
and then to inline them when it is determined that that will improve
efficiency.  In Eiffel all functions are virtual.  An optimizing pass
can examine the text of a class and inline appropriate functions or
create static linkage if the function is never redefined, like
weight in my example which is just a simple assignment.

>
>My suggestion for the more general question (When to make functions
>members of classes and when not?) is much more a matter of taste.
>1. If a function needs to access private or protected members of the class,
>   make it a member function unless there are strong logical reasons
>   to the contrary (it must then be declared a friend, of course).
>   There is some more technical discussion about this in Stroustrup's
>   book (6.10).

I think that the use of multiple inheritance can remove the need for
using friend functions.

>2. If a function needs to access private or protected members of
>   _several_ classes, make it a friend of all those classes;
>   but if it is clearly most intimately connected with one class,
>   make it a member of that one a friend of the others.

Why not just create a new class through inheritance and add the desired
function to the class?  I think you are focusing too much on functions
and not enough on objects.  You really need to change the way you look
at a problem in order to really utilize OOP techniques.

>3. If a function needs no access to any private or protected class members,
>   make it an ordinary function unless there are strong logical reasons
>   for making it a member function of some class. (I think there are
>   such reasons in the graph example.)

This function should not be sitting around globally, it should also
be a member function of some class.  This class may be a deferred class
so that the function can be combined with other classes to build a more
complex class that you need for a specific application.

I understand that with single inheritance, many of these techniques can
not be used, and you do indeed need to use friend functions and the
like.  I think the graph example illustrates the power of multiple
inheritance very well.  My use of parameterized types is not possible in
C++ 2.0 (anyone know about g++?).  I don't know if deferred routines
can be combined with non-deferred routines using multiple inheritance
in C++.  Maybe someone else can comment on how to handle this in C++.

I apologize for the length of this posting.  I hope it was worth it.
-- 
Gary Wright 					...!uunet!hsi!wright
Health Systems International                    wright@hsi.com

sakkinen@tukki.jyu.fi (Markku Sakkinen) (09/01/89)

In article <569@hsi86.hsi.UUCP> wright@hsi.com (Gary Wright) writes:
-In article <1230@tukki.jyu.fi> markku@jytko.jyu.fi (Markku Sakkinen) SAKKINEN@FINJYU.bitnet (alternative) writes:
->In article <565@hsi86.hsi.UUCP> wright@hsi.com (Gary Wright) writes:
->-In article <2506@fai.UUCP> kurtl@fai.fai.com (Kurt Luoto) writes:
-> ...
- ...

-I have a feeling that we are both using terminology in subtly different ways.
-I also think that while I have multiple inheritance in mind, Markku does not.

I omitted MU to simplify the discussion a little,
because I don't think it can solve _all_ these questions (see below).

-I'll try to elaborate.  Disclaimer: I am more familiar with Eiffel and
-it's inheritance mechanisms than with C++.  I will try to make things
-clear enough but it is my general feeling that it may not be possible
-to convert certain Eiffel techniques directly into C++.

Watchers-on, please note the above! I think the most important things
in this debate are rather language-independent and would belong
to comp.object if it already existed (remember to send in your YES
votes for establishing that group!). The small corrections about
details of C++ in the sequel are mostly cosmetic.

- ...
-When designing the abstract graph class, you would like to include 
-enough features so that any algorithm can be written in terms of this
-interface.  This is the problem of determining the primitive operations
-that an abstract graph class should provide.  Some algorithms
-written in this way may be efficient, other may not.  For efficiency,
-you may want an algorithm to know about the internal implementation of
-the graph.
-
-For simplicity, I will refer to the three classes I suggested in my 
-previous posting as (1) abstract, (2) simple, (3) complex.
-
-My understanding is that Markku is suggesting that functions that can
-be written in terms of the abstract interface be included in (1) and as
-such will also be come part of the abstract interface.  My problem with
-this approach is that it prevents reuse of the abstract class.  What if
-later on, I need to use a graph in some other application and it is not
-necessary to have all the complicated graph algorithms?  

According to my recommendation 3, you would _not_ include the non-primitive
functions in the definition of the abstract class if you deem the cost
to be greater than the benefit. Moreover, "prevents reuse" is a slight
overstatement: in C++ you need not drag the implementations into the
other applications that don't need those fancy functions, only the
prototype declarations in the class definition (although even that can
be irritating extraneous luggage that slows down compilations).

-As mentioned above, some algorithms may need more than the primitive
-graph structure in order to be implemented in an efficient way.  This
-extra information should be in a separate class, not in (1) or direct
-heirs of (1).  I would suggest using multiple inheritance and
-deferred classes (abstract classes) to achieve this.  

Why not in direct heirs? Why should one add more levels than absolutely
necessary to the inheritance hierarchy?
C++ does not have a concept similar to the deferred classes of Eiffel;
such an effect must be simulated by programming discipline.

I think using multiple inheritance is not a good idea here.
In fact, I think _inheritance_ is often applied too much in object-oriented
programming (cf. my paper "Disciplined inheritance" at ECOOP'89).
In part this is caused by the properties of typical OOP languages,
e.g. that objects cannot directly contain other objects, only references
to them (here C++ is better off!). Some recent discussion in comp.lang.eiffel
is concerned with such problems.
Sometimes there may be a (perhaps unconscious) fallacy:
if two classes have any business whatever with each other,
then one of them must be a subclass of the other.

-
-I would go about creating my family of graph classes by first
-defining a deferred graph class "GRAPH" that only contained the minimum number
-of member functions, and none of the more complex "graph theoretic functions".
-
-I would inherit from GRAPH and define the various functions in terms
-of a specific implementation (i.e. FIXED_GRAPH, a graph with a fixed maximum
-number of nodes and arcs).  This class will still be a basic graph class, 
-but it would be fully implemented.
-
-Now, let's assume I am interested in the shortest path between two nodes
-of a graph.  Let's define the function as:
-
-	short_path(start: NODE, finish: NODE): LIST[NODE];
-
-Instead of including this function as part of GRAPH or FIXED_GRAPH, I'll
-put it in a new class called GRAPH_QUERY.  This class will be a deferred
-class, any of the graph primitives that my algorithm might need are deferred.
-The function short_path() is defined in terms of the deferred primitives.

Right, it is only a matter of taste (in C++) whether short_path
is made an ordinary function or a member function of a new class;
I'll return to that later.
There is no reason (even in Eiffel) why GRAPH_QUERY should necessarily be
deferred: the _interface_ of all routines in GRAPH is defined and known
although the _implementation_ is deferred.

-I will also need a weight associated with each arc of the graph, so I'll
-create a class WEIGHTED_GRAPH by inheriting from FIXED_GRAPH, and 
-providing a new implementation of arcs.  
- ...
-So now for my final class, I use multiple inheritance to get the implementation
-of a graph from WEIGHTED_GRAPH, and the implementation (e.g. Dijkstra's) for
-short_path from GRAPH_QUERY.
-	class FINAL_GRAPH [...]
- ...
-GRAPH_QUERY may actually contain a number of interesting algorithms
-packaged together instead of having a separate class for each
-algorithm.

This approach certainly works, but it has severe drawbacks
especially on code reuse:
- You cannot apply the functions in GRAPH_QUERY to all instances of GRAPH.
- For every "concrete" subclass of GRAPH that you define later,
  you will potentially need to define a companion class like FINAL_GRAPH.
I would prefer to make GRAPH_QUERY a module or package of functions
that take an additional GRAPH argument (in comparison to the above approach);
a function like short_path that takes two NODE arguments would probably
not need even that addition. Unfortunately, there is no module/package
concept in C++, nor are there class functions (corresponding to
Smalltalk's class methods) although there are "static data members"
(corresponding to class variables). So, one must make GRAPH_QUERY a
class, create one instance of it and call the functions through
that instance. That's why I would rather omit the class the class
and make the functions ordinary functions. Apparently the situation
is similar in Eiffel, but as the original posting by Kurt Luoto
pointed out, there you have no choice.

-I believe the best technique is to always declare functions virtual
-and then to inline them when it is determined that that will improve
-efficiency.  In Eiffel all functions are virtual. [...]
- ...

Again a matter of taste: I appreciate that Simula, C++, and some other
languages give the programmer a choice. Still, it might more logically
depend on the "kind of inheritance" than on declarations in the base class
when redefinitions are allowed and when not.

- ...
-I think that the use of multiple inheritance can remove the need for
-using friend functions.

This is true in many cases, but not in all!
See the following:

->2. If a function needs to access private or protected members of
->   _several_ classes, make it a friend of all those classes;
->   but if it is clearly most intimately connected with one class,
->   make it a member of that one a friend of the others.
-
-Why not just create a new class through inheritance and add the desired
-function to the class?  I think you are focusing too much on functions
-and not enough on objects.  You really need to change the way you look
-at a problem in order to really utilize OOP techniques.

OOP does not mean: "try to solve (almost) everything by inheritance".
For instance, an efficient function for multiplying a vector by a matrix
could maybe need to access non-public features of both class Vector
and Matrix (I think this was one example discussed by Kasper Osterbye in
SIGPLAN Notices, June 1988).  It makes no sense to define a class that
inherits from both Vector and Matrix.

->3. If a function needs no access to any private or protected class members,
->   make it an ordinary function unless there are strong logical reasons
->   for making it a member function of some class. (I think there are
->   such reasons in the graph example.)
-
-This function should not be sitting around globally, it should also
-be a member function of some class.  This class may be a deferred class
-so that the function can be combined with other classes to build a more
-complex class that you need for a specific application.

See earlier comments.

- ...
-I apologize for the length of this posting.  I hope it was worth it.
--- 
-Gary Wright 					...!uunet!hsi!wright
-Health Systems International                    wright@hsi.com

I apologise likewise.
Gary, if we don't see more general interest in continuing
this discussion, let's turn to e-mail!

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland

dog@cbnewsl.ATT.COM (edward.n.schiebel) (09/03/89)

In article <1254@tukki.jyu.fi>, sakkinen@tukki.jyu.fi (Markku Sakkinen) writes:
> In article <569@hsi86.hsi.UUCP> wright@hsi.com (Gary Wright) writes:
> -In article <1230@tukki.jyu.fi> markku@jytko.jyu.fi (Markku Sakkinen) SAKKINEN@FINJYU.bitnet (alternative) writes:
> ->In article <565@hsi86.hsi.UUCP> wright@hsi.com (Gary Wright) writes:
> ->-In article <2506@fai.UUCP> kurtl@fai.fai.com (Kurt Luoto) writes:
> -> ...
> - ...
>  ...
> 
> C++ does not have a concept similar to the deferred classes of Eiffel;
> such an effect must be simulated by programming discipline.

Not exactly.  Pure virtual fuctions are provided in Release 2.0:

class AbstractBaseClass {
public:
	virtual int f() = 0;
};

AbstractBaseClass::f is a pure virtual function. Any attempt to call
it is an error, and an instance of AbstractBaseClass may not be 
created.  Any class derived from AbstractBaseClass must either restate
the pure virtual declaration or provide an implementation.

> ... stuff deleted...
> I would prefer to make GRAPH_QUERY a module or package of functions
> that take an additional GRAPH argument (in comparison to the above approach);
> a function like short_path that takes two NODE arguments would probably
> not need even that addition. Unfortunately, there is no module/package
> concept in C++, nor are there class functions (corresponding to
> Smalltalk's class methods) although there are "static data members"
> (corresponding to class variables).

In 2.0, there are static member functions as well:

class foo {
public:
	static void g();
};

foo::g is invoked just like that:

	foo::g();

It is not associated with any particular foo, and there is no "this"
within its body. 

This doesn't have a whole lot to do about the discussion, but I thought
the details were important.

	Ed Schiebel
	AT&T Bell Laboratories
	201-386-3416