soiffer@tekcrl.CRL.TEK.COM (Neil Soiffer) (11/05/88)
I am fairly new to c++ and would appreciate suggestions on how to write a generic arithmetic package in c++. By "generic", I mean I would like to be able to write functions whose arguments can be of type "number" and not a specific type like int, float, complex, rational, etc. As a trivial example, I would like to be able to write a generic exponentiation function that works by repeated squaring. A generic class is useful not only for writing general purpose generic arithmetic functions, but also for writing programs where the specific type of the operands is unknown (eg, in a calculator that reads many types of numbers). The "obvious" solution to this problem is to write an abstract class "number" and derive concrete classes (such as fixnum, rational, etc) from it. A solution along this line of thought is shown below: ------------------------------------------------ #include <stream.h> class fixnum; // forward ref class rational; // forward ref class number { public: virtual number& operator+(number&) {return *this;}; // abstract function virtual number& operator+(fixnum&) {return *this;}; // abstract function friend ostream& operator<<(ostream& s, number& num) {return num.print(s);}; protected: virtual ostream& print(ostream& s) {return s;}; // abstract function }; class fixnum: public number { public: fixnum(int num) { n = num; }; private: int n; number& operator+(number& num) { return(num + *this); }; // double dispatch number& operator+(fixnum& num) { return *new fixnum(n + num.n); }; ostream& print(ostream& s) {return s << n;}; }; void testfun(number &a, number &b) { cout << a+b << "\n"; } main() { fixnum a = 3; fixnum b = 4; cout << "fix+fix: "; testfun(a, b); } -------------------------------------------------- The reason that there are two "+" functions is because in the first "+" function, the type of the second argument is "number", and not "fixnum". Thus, a double dispatch is used (Ingals refers to this as multiple-polymorphism in his OOPSLA '86 paper). Double dispatch can be avoided in languages that provide a mechanism for multiple-polymorphism such as the Common Lisp Object System (CLOS). Double dispatch can also be used for handling arguments whose types differ. The solution presented above has at three problems with it. The biggest problem is that the class number must declare the abstract function virtual number& operator+(fixnum&) In the general case, one such function must be declared for each derived class's functions that use double dispatch (eg, +(number, fixnum), *(number, fixnum), etc). This violates an important condition for reuse of inherited classes: a base class should *never* need to know what its derived classes are. The second problem with the above solution is that even though the second dispatch is written as an inline function, at least one more function call is performed because of the vtable lookup (at least this is true of cfront 1.2.1). Thus, this solution has a fairly high time overhead penalty. The third problem with the above solution is that g++ (v1.27) won't compile it. The error messages are [including the spelling error]: num.C:23: conficting specification deriving from virtual function `struct number &number::operator + (struct number &)' num.C:39: warning: excess elements in aggregate initializer For those who are interested, Smalltalk does not use the double dispatch idea. Smalltalk (by virtue of the language definition), does not have the modularity problem mentioned above (function prototypes don't have to be declared). Smalltalk's generic number code handles addition by using the following two functions to check the type of the (second) argument: isMemberOf: aClass true if self is an instance of 'aClass' isKindOf: aClass true if self is an instance of 'aClass' or any superclass of 'aClass' Smalltalk's implementation uses these functions in a disciplined, modular manner: they are used only to check against the current class name (in the above example, 'aClass' would be "fixnum"). A type check that uses other than the current class's type hampers reusability because if a new type is added (eg, 'gaussian integer'), then old code must be modified to include the new type. Smalltalk handles the addition of different types using the notion of generality (to define the result type) and a method called "retry:coercing". I'll let you guess what that does. As I see it, the problem with c++ is that I can't specify the types of the arguments precisely enough. If we ignore type conversion possibilities, the "+" operation in the above example really wants to have a signature: virtual T& operator+(T&, T&): T < number Ie, "+" is a function on two arguments, both of exactly the same type and that type is also the return type. T must be a subtype of 'number'. This would allow me to write just one function and avoid the second dispatch. As mentioned way back at the beginning of this message, I'm interested in any suggestions anyone has for a good solution. Neil Soiffer Textronix Computer Research Lab UUCP: ...!tektronix!tekchips!soiffer ARPA: soiffer%tekchips.tek.com@relay.cs.net CSNET: soiffer%tekchips.tek.com
bs@alice.UUCP (Bjarne Stroustrup) (11/08/88)
***> ``soiffer@tekcrl.CRL.TEK.COM (Neil Soiffer)'' presents one of my favorite problems. Unfortunately, I don't have a really good solution, but I'll comment on it never the less. My comments are marked with ***>. The plain text is Neil's. Like Neil, I'd be most interested if someone could come up with a really good solution that fitted into the C++ framework. I am fairly new to c++ and would appreciate suggestions on how to write a generic arithmetic package in c++. By "generic", I mean I would like to be able to write functions whose arguments can be of type "number" and not a specific type like int, float, complex, rational, etc. As a trivial example, I would like to be able to write a generic exponentiation function that works by repeated squaring. A generic class is useful not only for writing general purpose generic arithmetic functions, but also for writing programs where the specific type of the operands is unknown (eg, in a calculator that reads many types of numbers). The "obvious" solution to this problem is to write an abstract class "number" and derive concrete classes (such as fixnum, rational, etc) from it. A solution along this line of thought is shown below: ***> Personally, I'm not convinced that the `generic number' type is a useful abstraction. What you do with numbers tend to depend on the kind of numbers you have and you tend to chose the representation of numbers to suit what you want to do with them. For numbers, I would probably first consider writing several unrelated ``number classes'. However, even if I have my doubts about this particular application I have no doubt that the fundamental problem it demonstrates is real. I would be interested in any ideas anyone might have in estimating how often multiple polymorphism is ``the ideal solution'' compared to how often the ordinary garden variety ``single polymorphism'' suffices. ------------------------------------------------ #include <stream.h> class fixnum; // forward ref class rational; // forward ref class number { public: virtual number& operator+(number&) {return *this;}; // abstract function virtual number& operator+(fixnum&) {return *this;}; // abstract function friend ostream& operator<<(ostream& s, number& num) {return num.print(s);}; protected: virtual ostream& print(ostream& s) {return s;}; // abstract function }; class fixnum: public number { public: fixnum(int num) { n = num; }; private: int n; number& operator+(number& num) { return(num + *this); }; // double dispatch number& operator+(fixnum& num) { return *new fixnum(n + num.n); }; ostream& print(ostream& s) {return s << n;}; }; void testfun(number &a, number &b) { cout << a+b << "\n"; } main() { fixnum a = 3; fixnum b = 4; cout << "fix+fix: "; testfun(a, b); } -------------------------------------------------- The reason that there are two "+" functions is because in the first "+" function, the type of the second argument is "number", and not "fixnum". Thus, a double dispatch is used (Ingals refers to this as multiple-polymorphism in his OOPSLA '86 paper). Double dispatch can be avoided in languages that provide a mechanism for multiple-polymorphism such as the Common Lisp Object System (CLOS). Double dispatch can also be used for handling arguments whose types differ. ***> I personally think that multiple-polymorphism is the right way to cope with these problems provided we can find a way to deal with ambiguities at compile time. However, we don't have it in C++. We can only handle the equivalent problem given the static types (function name and operator overloading) or in the case where the detailed type isn't known at compile time by use of the ``single dispatch' on the object (or some C-style grubbiness). The solution presented above has at three problems with it. The biggest problem is that the class number must declare the abstract function virtual number& operator+(fixnum&) In the general case, one such function must be declared for each derived class's functions that use double dispatch (eg, +(number, fixnum), *(number, fixnum), etc). This violates an important condition for reuse of inherited classes: a base class should *never* need to know what its derived classes are. ***> Agreed. On the other hand there are may applications where a violation of that rule is not a serious problem. This is the case where the base class isn't really intented as the base for arbitrary extension but simply as a means of providing a common interface to a fixed or only slowly changing set of derived classes. The second problem with the above solution is that even though the second dispatch is written as an inline function, at least one more function call is performed because of the vtable lookup (at least this is true of cfront 1.2.1). Thus, this solution has a fairly high time overhead penalty. ***> Really? compared with what? A CLOS multiple-polymorphic call? I doubt it. I haven't tried measuring - and would be most curious if somebody would take the bother - but I suspect that in a reasonably realistic mix of operations the C++ code would still be noticeably faster than the equivalent, but more elegant, CLOS. Also, is it possible in CLOS to construct a call that in C++ would be considered ambiguous, but that CLOS resolves by an order dependence? The third problem with the above solution is that g++ (v1.27) won't compile it. The error messages are [including the spelling error]: num.C:23: conficting specification deriving from virtual function `struct number &number::operator + (struct number &)' num.C:39: warning: excess elements in aggregate initializer ***> I think it is time to send Mike Tiemann a bug report. Cfront compiles the program as written: $ CC x.c CC x.c: cc x.i -lC $ a.out fix+fix: 7 $ For those who are interested, Smalltalk does not use the double dispatch idea. Smalltalk (by virtue of the language definition), does not have the modularity problem mentioned above (function prototypes don't have to be declared). Smalltalk's generic number code handles addition by using the following two functions to check the type of the (second) argument: isMemberOf: aClass true if self is an instance of 'aClass' isKindOf: aClass true if self is an instance of 'aClass' or any superclass of 'aClass' Smalltalk's implementation uses these functions in a disciplined, modular manner: they are used only to check against the current class name (in the above example, 'aClass' would be "fixnum"). A type check that uses other than the current class's type hampers reusability because if a new type is added (eg, 'gaussian integer'), then old code must be modified to include the new type. Smalltalk handles the addition of different types using the notion of generality (to define the result type) and a method called "retry:coercing". I'll let you guess what that does. ***> I didn't know of that - and would appreciate being told rather than guessing. I have seen the double dispatch technique recommended for Smalltalk in a form that was at least as shabby as your C++ variant. What I have seen (in Smalltalk) looked more like a C switch statement against a type field than anything else. The key to a good solution is, as you point out, to avoid mentioning the names of the derived classes in the base class definition (or worse: in the base class member functions). Can that really be done in Smalltalk? As I see it, the problem with c++ is that I can't specify the types of the arguments precisely enough. If we ignore type conversion possibilities, the "+" operation in the above example really wants to have a signature: virtual T& operator+(T&, T&): T < number Ie, "+" is a function on two arguments, both of exactly the same type and that type is also the return type. T must be a subtype of 'number'. This would allow me to write just one function and avoid the second dispatch. ***> but that wouldn't solve the more general problem because you typically wants to be able to cope with ``mixed mode arithmetic''. A solution would be to introduce coercions, the extreeme variant of this to introduce coercions from every type to a single standard number type (e.g. something like Jerry Schwarz's Reals), but that is, I guess, cheating. Anyway, for other applications where the fundamental problem turns up there can be no coercions. Consider, for example, a general intersect function for a shape class. As mentioned way back at the beginning of this message, I'm interested in any suggestions anyone has for a good solution. Neil Soiffer Textronix Computer Research Lab UUCP: ...!tektronix!tekchips!soiffer ARPA: soiffer%tekchips.tek.com@relay.cs.net CSNET: soiffer%tekchips.tek.com
akwright@watdragon.waterloo.edu (Andrew K. Wright) (11/11/88)
In article <8407@alice.UUCP> bs@alice.UUCP (Bjarne Stroustrup) writes: > >***> ``soiffer@tekcrl.CRL.TEK.COM (Neil Soiffer)'' presents one of my > favorite problems. Unfortunately, I don't have a really good > solution, but I'll comment on it never the less. My comments > are marked with ***>. The plain text is Neil's. Like Neil, > I'd be most interested if someone could come up with a really > good solution that fitted into the C++ framework. > >I am fairly new to c++ and would appreciate suggestions on how to write >a generic arithmetic package in c++. By "generic", I mean I would >like to be able to write functions whose arguments can be of type "number" >and not a specific type like int, float, complex, rational, etc. >As a trivial example, I would like to be able to write a generic exponentiation >function that works by repeated squaring. A generic class is useful not >only for writing general purpose generic arithmetic functions, but also for >writing programs where the specific type of the operands is unknown (eg, in a >calculator that reads many types of numbers). At the risk of preaching in the camp of the converted, I will present a non-object-oriented solution to this problem. This solution uses parametric polymorphism, rather than inheritance, to obtain reusability. The solution involves writing the operations of your generic arithmetic package as polymorphic operations. These operations do not belong to a class; in fact, no operations in the language system belong to a class. We introduce two new types of formal parameters, alreadly familiar in formal logic. Universal parameters will be written in uppercase, and may be bound to an actual parameter of any type. *: [ a:T, b:T ] => T specifies that * is an operation which takes two parameters of any identical type, and returns a result of the same type. But this is not sufficient to write the body of *. Existential parameters are introduced with the keyword "exists". No actual parameter is supplied by the user for the existential parameter; it is searched for by name by the compiler at the call site of the operation. *: [ a:T, b:T, exists +:[T,T]=>T ] => T This definition of * take two actual parameters of the same type, returns a result of that same type, but requires that where it is called there exist a definition of + for those types. Now you can imagine writing the body of * as an loop doing additions. Here is an implementation of a function called "exponentiation", which operates not by repeated squaring, but by repeated multiplication (a slight generalization of your problem). "exponentiation" will take two parameters, a something and an integer, and raise something to the integer power. in order to operate, "exponentiation" requires that * (multiply) exist in the calling environment. exponentiation: [ a:T, b:int, exists *: [T,T]=>T ] => T { x:T; x = a; while( --b > 0 ) x = x * a; return x; } Now exponentiation will operate on all your builtin numeric types, because they all define *. If you introduce a new type which you wish to exponentiate, all you have to do is define * for your new type. Now here is one crucial advantage to this approach: if you wish to use exponentiate on an existing type, you do not have to modify that type, you simply introduce a * for that type. For example if I wish to apply exponentiate to strings, i need only define * for strings. *: [a:string, b:string] => string { return concat(a,b); } exponentiate( "abc", 3 ) \\ returns "abcabcabc" For more information, please see "Polymorphism in the Compiled Language ForceOne", Cormack and Wright, Proc. of the 20th Annual Hawaii Int. Conf. on System Sciences, 1987. This paper is a bit dated; we should be publishing some more up-to-date stuff soon. --------- Andrew K. Wright akwright@watmath.waterloo.edu CS Dept., University of Waterloo. -- Andrew K. Wright akwright@watmath.waterloo.edu CS Dept., University of Waterloo.