[comp.lang.c++] the best match for an overloaded function

stan@hcr.UUCP (Stan Jarzabek) (08/24/89)

What does it mean to be "the best match" for an overloaded function?

The new C++ manual says: "the best-matching function is the intersection of sets
of functions that best match on each argument".

This definition disagrees with the intuition and, fortunately, that is not 
how the AT&T 2.0 implements selecting the best matching function.

Here is an example:

struct X {};
struct Z
{	Z(X) {}
};

int f(int,X) {}     
int f(short,Z) {}  
main()
{
	short s; 
	X x;
	f(s,x);
}

According to the definition from the manual, there is no best match for f(s,x), 
as the intersection of sets is empty. The 2.0 selects f(int,X), which confirms
the intuition and contradicts the manual definition.

So, what is the definition of the best match for an overloaded function, as
implemented in the AT&T 2.0?

In the example below, the compiler behaves inconsistently, and contradicts both 
the intuition and the definition from the manual.

struct X
{
	int a;
	X(int i) { a = i; }
	operator int() { return a; }
};

int f(short,short,short)
{	return 1; }

int f(char,char,short)
{	return 2; }

int f(X,short,short)
{	return 3; }

main()
{
	short s = 1;
	X x(1);
	f(x,s,x);  //2.0: f(X,short,short) is called
	f(x,x,s);  //2.0: ambiguous call
}


The last example shows that the intuitive meaning of "the best match" may not
be obvious. The overloading resolution mechanism should be based on some 
reasonably simple principle which would allow the user to staticly determine 
whether a given function call is ambiguous or not, and if not, which function
will be called. In particular, function calls in the last example should be 
resolved in the same way, i.e. either 
	a) function f(X,short,short) should be selected for both calls, or 
	b) both calls should be considered ambiguous.

The interpretation a) in the last example reflects the principle: "the best 
match has the minimum number of arguments that match at the highest level" (I
refer to the conversion levels [1] - [6], pp. 88-89 in the Reference Manual.)
In interpretation b), the underlying principle is: "if there are more than 
one functions requiring argument conversions at the highest level the function 
call is ambiguous." The former interpretation is more general, but the latter is
safer. 

The AT&T 2.0 algorithm for finding the best match seems to be based on some
more complex principle. The overloading resolution mechanism in the AT&T 2.0 
cfront sometimes behaves inconsistently which is probably due to some bugs, but 
does anybody know how this was intended to work? At the language definition
level, what does it mean to be the best match for an overloaded function with
many arguments?

Stan Jarzabek, HCR Corporation, Toronto

ark@alice.UUCP (Andrew Koenig) (08/25/89)

In article <1834@hcr.UUCP>, stan@hcr.UUCP (Stan Jarzabek) writes:

> What does it mean to be "the best match" for an overloaded function?

> Here is an example:

> struct X {};
> struct Z
> {	Z(X) {}
> };

> int f(int,X) {}     
> int f(short,Z) {}  
> main()
> {
> 	short s; 
> 	X x;
> 	f(s,x);
> }

> According to the definition from the manual, there is no best match for f(s,x), 
> as the intersection of sets is empty. The 2.0 selects f(int,X), which confirms
> the intuition and contradicts the manual definition.

I think this is a bug in cfront.  The best match for `s' is
clearly `short' and the best match for `x' is equally clearly
`X', so there's no reason to choose f(int,X) over f(short,Z).

> In the example below, the compiler behaves inconsistently, and contradicts both 
> the intuition and the definition from the manual.

> struct X
> {
> 	int a;
> 	X(int i) { a = i; }
> 	operator int() { return a; }
> };

> int f(short,short,short) {	return 1; }

> int f(char,char,short) {	return 2; }

> int f(X,short,short) {	return 3; }

> main()
> {
> 	short s = 1;
> 	X x(1);
> 	f(x,s,x);  //2.0: f(X,short,short) is called
> 	f(x,x,s);  //2.0: ambiguous call
> }

> The last example shows that the intuitive meaning of "the best match" may not
> be obvious.

Here I think cfront is correct and matches what the manual says.

Let's look first at the first call, f(x,s,x).

The best match for the first `x' argument is surely X.  The
best match for `s' is `short', and the best match for `x'
is also `short' -- not the greatest but it's the best available.
Therefore this calls f(X,short,short).

In the second call, f(x,x,s), the best match for the first `x' is
again X.  The second argument is an X and the choice is between
short and char.  There is no conversion from X to short or char,
just to int.  There is no reason to prefer X->int->short or
x->int->char, so the second argument is ambiguous.  Hence the
entire call is ambiguous.

Note that the third argument for each f() is short.  Thus for
all practical purposes the third argument does not participate
in function matching.


For curiosity's sake, what do you find counterintuitive about
the `best match' rule?
-- 
				--Andrew Koenig
				  ark@europa.att.com

stan@hcr.UUCP (Stan Jarzabek) (08/29/89)

In response to the question:
	What does it mean to be "the best match" for an overloaded function?
Andrew Koenig (ark@europa.att.com) writes:

>In article <1834@hcr.UUCP>, stan@hcr.UUCP (Stan Jarzabek) writes:

>> What does it mean to be "the best match" for an overloaded function?

>> Here is an example:

>> struct X {};
>> struct Z
>> {	Z(X) {}
>> };

>> int f(int,X) {}     
>> int f(short,Z) {}  
>> main()
>> {
>> 	short s; 
>> 	X x;
>> 	f(s,x);
>> }

>> According to the definition from the manual, there is no best match for 
>> f(s,x), as the intersection of sets is empty. The 2.0 selects f(int,X), which
>> confirms the intuition and contradicts the manual definition.

>I think this is a bug in cfront.  The best match for `s' is
>clearly `short' and the best match for `x' is equally clearly
>`X', so there's no reason to choose f(int,X) over f(short,Z).

I hope this is not a bug in cfront but an intended behavior. The cfront
seems to notice that f(int,X) is a better match for f(s,x) than f(short,Z),as
it requires only standard promotion (short->int), while the second requires
user-defined conversion (X->Z). The behavior of cfront in other, similar cases 
make me think that cfront understands "the best match" in that way. If I am 
right, then the definition of the "best match" from the manual cannot be the one
which is implemented in cfront. For me, the above example also shows that the
manual definition (i.e. "the best-matching function is the intersection of sets 
of functions that best match on each argument") is counterintuitive in certain
cases.

>> In the example below, the compiler behaves inconsistently, and contradicts both 
>> the intuition and the definition from the manual.

>> struct X
>> {
>> 	int a;
>> 	X(int i) { a = i; }
>> 	operator int() { return a; }
>> };

>> int f(short,short,short) {	return 1; }

>> int f(char,char,short) {	return 2; }

>> int f(X,short,short) {	return 3; }

>> main()
>> {
>> 	short s = 1;
>> 	X x(1);
>> 	f(x,s,x);  //2.0: f(X,short,short) is called
>> 	f(x,x,s);  //2.0: ambiguous call
>> }

>> The last example shows that the intuitive meaning of "the best match" may not
>> be obvious.

>Here I think cfront is correct and matches what the manual says.

>Let's look first at the first call, f(x,s,x).

>The best match for the first `x' argument is surely X.  The
>best match for `s' is `short', and the best match for `x'
>is also `short' -- not the greatest but it's the best available.
>Therefore this calls f(X,short,short).

>In the second call, f(x,x,s), the best match for the first `x' is
>again X.  The second argument is an X and the choice is between
>short and char.  There is no conversion from X to short or char,
>just to int.  There is no reason to prefer X->int->short or
>x->int->char, so the second argument is ambiguous.  Hence the
>entire call is ambiguous.

>				--Andrew Koenig
>				  ark@europa.att.com

As I understand this, according to the definition from the manual, in both cases
function f(X,sort,short) should be called: To find the best match, for both 
function calls you intersect sets:

{ f(X,short,short) } , { all functions } , { all functions }

The intersection of sets gives you f(X,short,short).

You can have two conversions for one argument and this does not
necessarily mean that the function call in ambiguous. The statement saying
that "if conversions for any one argument are ambiguous, the call is ambiguous"
was in the draft manual, but I could not find it in the new language manual.
(Fortunately, as cfront never behaved in that way, and if it did, I would
again call this counterintuitive.)
The new manual specifically says that "where identical conversions exist for an
argument, matching is determined by other arguments (if any)".
It seems to me that, although not always consistently, cfront tries to implement
the ambiguity test according to what the new manual says.

To support my interpretation, here are two more examples:
Assuming function declarations from the last example, in the function call
f(x,i,i) second argument "is ambiguous", but cfront finds f(X,short,short) as
the best match; also, in f(1,s,s) first argument "is ambiguous", but cfront
finds f(short,short,short) as the best match. I think cfront is right in those
particular cases.

There are bugs in the cfront's overloading resolution mechanism, for example,
it is enough to change the order of function declarations in the last example
to obtain different results - definitely, the result of overloading resolution
should be function-declaration-independent. But here I am concerned with the
underlying principle rather than with small bugs.

So, the problem remains: what is the definition of the best match for an 
overloaded function with many arguments?

Stan Jarzabek, HCR Corporation, Toronto
e-mail: stan!hcr