[comp.lang.c++] generic arithmetic question

soiffer@tekcrl.CRL.TEK.COM (Neil Soiffer) (11/05/88)

I am fairly new to c++ and would appreciate suggestions on how to write
a generic arithmetic package in c++.  By "generic", I mean I would
like to be able to write functions whose arguments can be of type "number"
and not a specific type like int, float, complex, rational, etc.
As a trivial example, I would like to be able to write a generic exponentiation
function that works by repeated squaring.  A generic class is useful not
only for writing general purpose generic arithmetic functions, but also for
writing programs where the specific type of the operands is unknown (eg, in a
calculator that reads many types of numbers).

The "obvious" solution to this problem is to write an abstract class
"number" and derive concrete classes (such as fixnum, rational, etc) from
it.  A solution along this line of thought is shown below:

------------------------------------------------

#include <stream.h>

class fixnum;	// forward ref
class rational;	// forward ref

class number {
public:
    virtual number& operator+(number&) {return *this;};	 // abstract function
    virtual number& operator+(fixnum&) {return *this;};	 // abstract function
    friend ostream& operator<<(ostream& s, number& num) {return num.print(s);};
protected:
    virtual ostream& print(ostream& s) {return s;};	 // abstract function
};


class fixnum: public number {
public:
    fixnum(int num) { n = num; };
private:
    int n;
    number& operator+(number& num) { return(num + *this); };  // double dispatch
    number& operator+(fixnum& num) { return *new fixnum(n + num.n); };
    ostream& print(ostream& s) {return s << n;};
};


void testfun(number &a, number &b)
{
    cout << a+b << "\n";
}

main()
{
    fixnum a = 3;
    fixnum b = 4;
    cout << "fix+fix: "; testfun(a, b);
}

--------------------------------------------------

The reason that there are two "+" functions is because in the first "+"
function, the type of the second argument is "number", and not "fixnum".
Thus, a double dispatch is used (Ingals refers to this as multiple-polymorphism
in his OOPSLA '86 paper).  Double dispatch can be avoided in languages that
provide a mechanism for multiple-polymorphism such as the Common Lisp Object
System (CLOS).  Double dispatch can also be used for handling arguments
whose types differ.

The solution presented above has at three problems with it.  The biggest
problem is that the class number must declare the abstract function
	virtual number& operator+(fixnum&)
In the general case, one such function must be declared for each derived
class's functions that use double dispatch (eg, +(number, fixnum),
*(number, fixnum), etc).  This violates an important condition for reuse
of inherited classes:  a base class should *never* need to know what its
derived classes are.

The second problem with the above solution is that even though the second
dispatch is written as an inline function, at least one more function call is
performed because of the vtable lookup (at least this is true of cfront 1.2.1).
Thus, this solution has a fairly high time overhead penalty.

The third problem with the above solution is that g++ (v1.27) won't
compile it.  The error messages are [including the spelling error]:
num.C:23: conficting specification deriving from virtual function `struct number
	 &number::operator + (struct number &)'
num.C:39: warning: excess elements in aggregate initializer


For those who are interested, Smalltalk does not use the double dispatch idea.
Smalltalk (by virtue of the language definition), does not have the modularity
problem mentioned above (function prototypes don't have to be declared).
Smalltalk's generic number code handles addition by using the following
two functions to check the type of the (second) argument:
    isMemberOf: aClass	true if self is an instance of 'aClass'
    isKindOf: aClass	true if self is an instance of 'aClass' or
			    any superclass of 'aClass'
Smalltalk's implementation uses these functions in a disciplined, modular
manner:  they are used only to check against the current class name (in
the above example, 'aClass' would be "fixnum").  A type check that uses
other than the current class's type hampers reusability because if a new type
is added (eg, 'gaussian integer'), then old code must be modified to
include the new type.

Smalltalk handles the addition of different types using the notion of
generality (to define the result type) and a method called "retry:coercing".
I'll let you guess what that does.

As I see it, the problem with c++ is that I can't specify the types of the
arguments precisely enough.  If we ignore type conversion possibilities,
the "+" operation in the above example really wants to have a signature:
    virtual T& operator+(T&, T&): T < number
Ie, "+" is a function on two arguments, both of exactly the same type and
that type is also the return type.  T must be a subtype of 'number'.
This would allow me to write just one function and avoid the second dispatch.

As mentioned way back at the beginning of this message, I'm interested
in any suggestions anyone has for a good solution.

	Neil Soiffer
	Textronix Computer Research Lab
	UUCP:  ...!tektronix!tekchips!soiffer
	ARPA:  soiffer%tekchips.tek.com@relay.cs.net
	CSNET: soiffer%tekchips.tek.com

bs@alice.UUCP (Bjarne Stroustrup) (11/08/88)

***> ``soiffer@tekcrl.CRL.TEK.COM (Neil Soiffer)'' presents one of my
	favorite problems. Unfortunately, I don't have a really good
	solution, but I'll comment on it never the less. My comments
	are marked with ***>. The plain text is Neil's. Like Neil,
	I'd be most interested if someone could come up with a really
	good solution that fitted into the C++ framework.

I am fairly new to c++ and would appreciate suggestions on how to write
a generic arithmetic package in c++.  By "generic", I mean I would
like to be able to write functions whose arguments can be of type "number"
and not a specific type like int, float, complex, rational, etc.
As a trivial example, I would like to be able to write a generic exponentiation
function that works by repeated squaring.  A generic class is useful not
only for writing general purpose generic arithmetic functions, but also for
writing programs where the specific type of the operands is unknown (eg, in a
calculator that reads many types of numbers).

The "obvious" solution to this problem is to write an abstract class
"number" and derive concrete classes (such as fixnum, rational, etc) from
it.  A solution along this line of thought is shown below:

***> Personally, I'm not convinced that the `generic number' type is a useful
	abstraction. What you do with numbers tend to depend on the kind of
	numbers you have and you tend to chose the representation of numbers
	to suit what you want to do with them. For numbers, I would probably
	first consider writing several unrelated ``number classes'.

	However, even if I have my doubts about this particular application
	I have no doubt that the fundamental problem it demonstrates is real.
	I would be interested in any ideas anyone might have in estimating
	how often multiple polymorphism is ``the ideal solution'' compared
	to how often the ordinary garden variety ``single polymorphism''
	suffices.

------------------------------------------------

#include <stream.h>

class fixnum;	// forward ref
class rational;	// forward ref

class number {
public:
    virtual number& operator+(number&) {return *this;};	 // abstract function
    virtual number& operator+(fixnum&) {return *this;};	 // abstract function
    friend ostream& operator<<(ostream& s, number& num) {return num.print(s);};
protected:
    virtual ostream& print(ostream& s) {return s;};	 // abstract function
};


class fixnum: public number {
public:
    fixnum(int num) { n = num; };
private:
    int n;
    number& operator+(number& num) { return(num + *this); };  // double dispatch
    number& operator+(fixnum& num) { return *new fixnum(n + num.n); };
    ostream& print(ostream& s) {return s << n;};
};


void testfun(number &a, number &b)
{
    cout << a+b << "\n";
}

main()
{
    fixnum a = 3;
    fixnum b = 4;
    cout << "fix+fix: "; testfun(a, b);
}

--------------------------------------------------

The reason that there are two "+" functions is because in the first "+"
function, the type of the second argument is "number", and not "fixnum".
Thus, a double dispatch is used (Ingals refers to this as multiple-polymorphism
in his OOPSLA '86 paper).  Double dispatch can be avoided in languages that
provide a mechanism for multiple-polymorphism such as the Common Lisp Object
System (CLOS).  Double dispatch can also be used for handling arguments
whose types differ.
***> I personally think that multiple-polymorphism is the right way to
	cope with these problems provided we can find a way to deal with
	ambiguities at compile time.

	However, we don't have it in C++. We can only handle the
	equivalent problem given the static types (function name and operator
	overloading) or in the case where the detailed type isn't known at
	compile time by use of the ``single dispatch' on the object (or
	some C-style grubbiness).

The solution presented above has at three problems with it.  The biggest
problem is that the class number must declare the abstract function
	virtual number& operator+(fixnum&)
In the general case, one such function must be declared for each derived
class's functions that use double dispatch (eg, +(number, fixnum),
*(number, fixnum), etc).  This violates an important condition for reuse
of inherited classes:  a base class should *never* need to know what its
derived classes are.
***> Agreed. On the other hand there are may applications where a violation
	of that rule is not a serious problem. This is the case where the
	base class isn't really intented as the base for arbitrary extension
	but simply as a means of providing a common interface to a fixed or
	only slowly changing set of derived classes.

The second problem with the above solution is that even though the second
dispatch is written as an inline function, at least one more function call is
performed because of the vtable lookup (at least this is true of cfront 1.2.1).
Thus, this solution has a fairly high time overhead penalty.
***> Really? compared with what? A CLOS multiple-polymorphic call? I doubt it.
	I haven't tried measuring - and would be most curious if somebody
	would take the bother - but I suspect that in a reasonably realistic
	mix of operations the C++ code would still be noticeably faster than
	the equivalent, but more elegant, CLOS.

	Also, is it possible in CLOS to construct a call that in C++ would
	be considered ambiguous, but that CLOS resolves by an order dependence?

The third problem with the above solution is that g++ (v1.27) won't
compile it.  The error messages are [including the spelling error]:
num.C:23: conficting specification deriving from virtual function `struct number
	 &number::operator + (struct number &)'
num.C:39: warning: excess elements in aggregate initializer
***> I think it is time to send Mike Tiemann a bug report. Cfront compiles the
	program as written:
		$ CC x.c
		CC  x.c:
		cc   x.i -lC
		$ a.out
		fix+fix: 7
		$ 

For those who are interested, Smalltalk does not use the double dispatch idea.
Smalltalk (by virtue of the language definition), does not have the modularity
problem mentioned above (function prototypes don't have to be declared).
Smalltalk's generic number code handles addition by using the following
two functions to check the type of the (second) argument:
    isMemberOf: aClass	true if self is an instance of 'aClass'
    isKindOf: aClass	true if self is an instance of 'aClass' or
			    any superclass of 'aClass'
Smalltalk's implementation uses these functions in a disciplined, modular
manner:  they are used only to check against the current class name (in
the above example, 'aClass' would be "fixnum").  A type check that uses
other than the current class's type hampers reusability because if a new type
is added (eg, 'gaussian integer'), then old code must be modified to
include the new type.

Smalltalk handles the addition of different types using the notion of
generality (to define the result type) and a method called "retry:coercing".
I'll let you guess what that does.
***> I didn't know of that - and would appreciate being told rather than
	guessing. I have seen the double dispatch technique recommended
	for Smalltalk in a form that was at least as shabby as your C++
	variant. What I have seen (in Smalltalk) looked more like a C
	switch statement against a type field than anything else.

	The key to a good solution is, as you point out, to avoid mentioning
	the names of the derived classes in the base class definition (or
	worse: in the base class member functions). Can that really be done
	in Smalltalk?
 
As I see it, the problem with c++ is that I can't specify the types of the
arguments precisely enough.  If we ignore type conversion possibilities,
the "+" operation in the above example really wants to have a signature:
    virtual T& operator+(T&, T&): T < number
Ie, "+" is a function on two arguments, both of exactly the same type and
that type is also the return type.  T must be a subtype of 'number'.
This would allow me to write just one function and avoid the second dispatch.
***> but that wouldn't solve the more general problem because you typically
	wants to be able to cope with ``mixed mode arithmetic''. A solution
	would be to introduce coercions, the extreeme variant of this to
	introduce coercions from every type to a single standard number
	type (e.g. something like Jerry Schwarz's Reals), but that is,
	I guess, cheating. Anyway, for other applications where the
	fundamental problem turns up there can be no coercions. Consider,
	for example, a general intersect function for a shape class.
	
As mentioned way back at the beginning of this message, I'm interested
in any suggestions anyone has for a good solution.

	Neil Soiffer
	Textronix Computer Research Lab
	UUCP:  ...!tektronix!tekchips!soiffer
	ARPA:  soiffer%tekchips.tek.com@relay.cs.net
	CSNET: soiffer%tekchips.tek.com

akwright@watdragon.waterloo.edu (Andrew K. Wright) (11/11/88)

In article <8407@alice.UUCP> bs@alice.UUCP (Bjarne Stroustrup) writes:
>
>***> ``soiffer@tekcrl.CRL.TEK.COM (Neil Soiffer)'' presents one of my
>	favorite problems. Unfortunately, I don't have a really good
>	solution, but I'll comment on it never the less. My comments
>	are marked with ***>. The plain text is Neil's. Like Neil,
>	I'd be most interested if someone could come up with a really
>	good solution that fitted into the C++ framework.
>
>I am fairly new to c++ and would appreciate suggestions on how to write
>a generic arithmetic package in c++.  By "generic", I mean I would
>like to be able to write functions whose arguments can be of type "number"
>and not a specific type like int, float, complex, rational, etc.
>As a trivial example, I would like to be able to write a generic exponentiation
>function that works by repeated squaring.  A generic class is useful not
>only for writing general purpose generic arithmetic functions, but also for
>writing programs where the specific type of the operands is unknown (eg, in a
>calculator that reads many types of numbers).

At the risk of preaching in the camp of the converted, I will present
a non-object-oriented solution to this problem.  This solution uses
parametric polymorphism, rather than inheritance, to obtain reusability.

The solution involves writing the operations of your generic arithmetic
package as polymorphic operations.  These operations do not belong to a
class; in fact, no operations in the language system belong to a class.

We introduce two new types of formal parameters, alreadly familiar
in formal logic.  Universal parameters will be written in uppercase,
and may be bound to an actual parameter of any type.
	*: [ a:T, b:T ] => T
specifies that * is an operation which takes two parameters of any
identical type, and returns a result of the same type.  But this is
not sufficient to write the body of *.  Existential parameters are
introduced with the keyword "exists".  No actual parameter is supplied
by the user for the existential parameter; it is searched for by
name by the compiler at the call site of the operation.
	*: [ a:T, b:T, exists +:[T,T]=>T ] => T
This definition of * take two actual parameters of the same type,
returns a result of that same type, but requires that where it is
called there exist a definition of + for those types.  Now you can
imagine writing the body of * as an loop doing additions.

Here is an implementation of a function called "exponentiation",
which operates not by repeated squaring, but by repeated multiplication
(a slight generalization of your problem).
"exponentiation" will take two parameters, a something and an integer,
and raise something to the integer power.  in order to operate,
"exponentiation" requires that * (multiply) exist in the calling
environment.
	exponentiation: [ a:T, b:int, exists *: [T,T]=>T ] => T
	{
		x:T;
		x = a;
		while( --b > 0 )
			x = x * a;
		return x;
	}
Now exponentiation will operate on all your builtin numeric types, because
they all define *.  If you introduce a new type which you wish to
exponentiate, all you have to do is define * for your new type.

Now here is one crucial advantage to this approach: if you wish
to use exponentiate on an existing type, you do not have to modify
that type, you simply introduce a * for that type.   For example
if I wish to apply exponentiate to strings, i need only define *
for strings.
	*: [a:string, b:string] => string    { return concat(a,b); }
	exponentiate( "abc", 3 )   \\ returns "abcabcabc"

For more information, please see
"Polymorphism in the Compiled Language ForceOne", Cormack and Wright,
Proc. of the 20th Annual Hawaii Int. Conf. on System Sciences, 1987.
This paper is a bit dated; we should be publishing some more up-to-date
stuff soon.
---------
Andrew K. Wright      akwright@watmath.waterloo.edu
CS Dept., University of Waterloo.
-- 
Andrew K. Wright      akwright@watmath.waterloo.edu
CS Dept., University of Waterloo.