[comp.lang.c] Dialects, C++, Overloading

rob@raksha.eng.ohio-state.edu (Rob Carriere) (02/19/90)

I hope the attribution is correct, rn and postnews were both on the blitz...
In article <00017@meph.UUCP>, gsarff@meph.UUCP (Gary Sarff) writes:
>I think [...] the reason I have always distrusted
>overloading, [...] is that when you have overloading, you must have
>all source code, headers, module def's, whatever available to you (the way
>the compiler does) or the meaning of any statement you see that has operators
>in it cannot be deduced by you.  This is ease of maintainability??
>Even if you can determine the types of a,b, and c, and you see
>
>  a = b + c;
>
>do you _REALLY_ know the semantics of this, what if I overloaded + to do
>something really strange, like b + c = 1 if b and c are mutually prime?

Hmmm...  Do you claim it is any clearer if I write:

   assign( a, plus( b, c ));

instead?
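
To make the comparison concrete, here is a minimal C++ sketch. The type `Num`, the helper `gcd_of` and the function name `add` are hypothetical, and returning 0 for the non-coprime case is an assumption, since the quoted post does not say what happens there. The operator and the named function hide exactly the same surprise.

```cpp
#include <cassert>

// Hypothetical wrapper type, for illustration only.
struct Num { int n; };

// Euclid's algorithm on non-negative ints.
int gcd_of(int a, int b) {
    while (b != 0) { int t = a % b; a = b; b = t; }
    return a;
}

// The "strange" overload from the quoted post: b + c is 1 when b and c are
// mutually prime, and (an assumption made here) 0 otherwise.
Num operator+(Num b, Num c) { return {gcd_of(b.n, c.n) == 1 ? 1 : 0}; }

// The same surprise behind an ordinary, innocent-looking function name.
Num add(Num b, Num c) { return {gcd_of(b.n, c.n) == 1 ? 1 : 0}; }
```

Neither spelling tells the reader what happens; only the definition does.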

>Doesn't make sense to you, (or me either), but that isn't the point.  The
>compiler won't stop me from doing this, the language definition allows it.
>The only restraint we have that something like this won't happen is the
>restraint of the originating programmer.  Features like that in any language
>are at least a bit scary to contemplate.

No more so than any type of procedure or function definition capability.  On
the other hand, you claim that such features should not be allowed because
they destroy the programmer's ability to read based on convention.  If the
programmer's only experience is in the writing of, say, C, this is true.
However, what is such a programmer doing with vectors (or whatever) anyway?
Such a person is a danger to himself and all around him.  If instead we make
the much more reassuring assumption that the programmer knows about the things
he is programming, then we have a second set of conventions to deal with:
those of the subject at hand.  Anybody who has linear algebra will have no
problem reading:

  x = A*x + B*u;

with x and u vectors, A and B matrices.  However,

  assign_vector( x, add_vector( mult_mat_vect( A, x ), mult_mat_vect( B, u )));

requires some deciphering[1].
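
As a rough illustration, the overloaded form might look like this in C++. This is only a sketch: the types `Vec` and `Mat`, the fixed size `N` and the function `step` are hypothetical names chosen for the example, not anything from an existing library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size types, introduced only for this illustration.
constexpr std::size_t N = 2;
struct Vec { std::array<double, N> v; };
struct Mat { std::array<std::array<double, N>, N> m; };

// Vector addition: the usual component-wise sum.
Vec operator+(const Vec& a, const Vec& b) {
    Vec r{};
    for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Matrix-vector product: row i of A dotted with x.
Vec operator*(const Mat& A, const Vec& x) {
    Vec r{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            r.v[i] += A.m[i][j] * x.v[j];
    return r;
}

// One update step, written the way the algebra reads.
Vec step(const Mat& A, const Mat& B, const Vec& x, const Vec& u) {
    return A * x + B * u;
}
```

The operators here follow the linear-algebra conventions, which is the whole point: the body of `step` can be checked against the formula at a glance.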

Of course, you can screw your readers by using definitions of =, + and * that
run counter to the conventions observed in the area, but then, you can do the
same with the procedural version...

  make( glob( squeeze( x, A ), squeeze( u, B )), x );

is already rather sure to have everybody run for the docs, and it still hasn't
even touched the semantics... [2]

SR
[1] x.assign( (A.mult_vect( x )).plus( B.mult_vect( u ) ));
    isn't much better.
[2] If you want to drive everybody nuts, just give all your vectors uppercase
    names and all your matrices lowercase ones, and then use row-vector matrix
    multiplies...  Lynch mobs guaranteed, and you didn't even need overloading
    :-) 
---

brnstnd@stealth.acf.nyu.edu (02/19/90)

Rob, the issue is not user-defined operator syntax. The issue is overloading.

Say a language provides a vector library with a vector addition routine.
You might write ``a = b + c'' in full as ``a = b vec.+ c''. Overloading
means abbreviating vec.+ and string.+ and matrix.+ and complex.+ as a
single operator. We're not asking you to write a = vec.plus(b,c); we're
pointing out the problems of ambiguous abbreviation.

My main concern is maintainability. I hate debugging Ada---particularly
someone else's Ada---because I have to learn an entirely new language
just to understand each program. I have to force myself to look
suspiciously at each operator and function call, asking ``what are the
types?'' and ``is it overloaded?'' If I don't read Ada so painfully, I'm
liable to overlook an overloaded operator, automatically assuming that
(for example) a = b + c has no side effects or that bar(x) + foo(y) is
the same construction that I just debugged when I saw it in the last
routine.

I often have a lot of trouble finding the definition of an overloaded
function---my searching tools are of no help when the syntax sucks and
the definition could be in any of ten libraries. But this is only a
lesser evil compared to an overloaded operator insidiously taking over
the innocent plus sign while the programmer looks away.

> If instead we make
> the much more reassuring assumption that the programmer knows about the things
> he is programming, then we have a second set of conventions to deal with:
> those of the subject at hand.

This is the root of the problem: you have to learn a new language just
to read each new program. When that new language doesn't even have the
same semantics for a = b, the reader is in big trouble. In contrast, if
overloaded functions and operators had to have a special syntax, they'd
be an easy-to-notice, natural first spot to look for bugs. The special
syntax doesn't have to be anything more than, e.g., an initial period;
but it has to be there. If the language provided a standard way to
expand all overloaded operators into unambiguous versions, life would
be wonderful.

---Dan

rob@raksha.eng.ohio-state.edu (Rob Carriere) (02/20/90)

In article <5137:06:46:38@stealth.acf.nyu.edu> brnstnd@stealth.acf.nyu.edu
(Dan Bernstein) writes: 
>Rob, the issue is not user-defined operator syntax. The issue is overloading.

Yes.  Sorry I didn't make myself more clear.  I was trying to point out that
you can get just as confused without overloading.  A point that I _did_ miss
was that you are also talking about function overloading, not just operator
overloading.  That makes part of my last post beside the point.  Thanks for
clarifying.

>Say a language provides a vector library with a vector addition routine.
>You might write ``a = b + c'' in full as ``a = b vec.+ c''. Overloading
>means abbreviating vec.+ and string.+ and matrix.+ and complex.+ as a
>single operator. We're not asking you to write a = vec.plus(b,c); we're
>pointing out the problems of ambiguous abbreviation.

1] I don't have a problem with seeing a = b + c; here.  I know what vectors,
   matrices and complex numbers do when you add them.

2] This starts out nice, but the example I gave in my post is going to look
   like: 

x vec.= (A mat_vec.* x) vec.+ (B mat_vec.* u);

   or perhaps even:

x complex_vec.=               (A complex_mat__complex_vec.* x) 
                complex_vec.+ (B complex_mat__complex_vec.* u);

   I claim that x = A*x + B*u; is drastically more readable.

>My main concern is maintainability. I hate debugging Ada---particularly
>someone else's Ada---because I have to learn an entirely new language
>just to understand each program. I have to force myself to look
>suspiciously at each operator and function call, asking ``what are the
>types?'' and ``is it overloaded?'' If I don't read Ada so painfully, I'm
>liable to overlook an overloaded operator, automatically assuming that
>(for example) a = b + c has no side effects or that bar(x) + foo(y) is
>the same construction that I just debugged when I saw it in the last
>routine.

But how is this different from assuming that func in 

   frubs = func( fobble );

has no side-effects?  Or that the same line in the next file will mean the
same thing?

>I often have a lot of trouble finding the definition of an overloaded
>function---my searching tools are of no help when the syntax sucks and
>the definition could be in any of ten libraries. But this is only a
>lesser evil compared to an overloaded operator insidiously taking over
>the innocent plus sign while the programmer looks away.

See above.  I don't understand why the evil is lesser.

>> [...] we have a second set of conventions to deal with [apart from those of
>> the programming language]: those of the subject at hand.
>
>This is the root of the problem: you have to learn a new language just
>to read each new program. 

I would hope not.  If you know linear algebra, the notation a + b for vectors
a and b will have meaning to you.  If you do not know linear algebra and you
still have to maintain the program, notation is the least of your problems.

>When that new language doesn't even have the
>same semantics for a = b, the reader is in big trouble. In contrast, if
>overloaded functions and operators had to have a special syntax, they'd
>be an easy-to-notice, natural first spot to look for bugs. The special
>syntax doesn't have to be anything more than, e.g., an initial period;
>but it has to be there. 

OK, but this is much weaker than what you started out with.  The first example
now becomes:

  x .= A.*x .+ B.*u;

That's a lot closer to readability than all those decorations we had first.

>If the language provided a standard way to
>expand all overloaded operators into unambiguous versions, life would
>be wonderful.

That seems an implementation issue to me.  As long as the language is
statically typed, figuring out which operator goes where can be done
statically.  So your implementation could provide you with some kind of
hypertext-like help for this expansion.  In fact, it could do that without the
language needing those periods.  Then, if you have a confusing piece of code,
you could tell your implementation to show you a fully disambiguated version
of the code.
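
For what it's worth, C++ already lets you write such a disambiguated version by hand, which gives an idea of what a tool could show. A minimal sketch, with hypothetical 2-D types `Vec` and `Mat` and hypothetical function names `brief` and `expanded`:

```cpp
#include <cassert>

// Hypothetical 2-D types, for illustration only.
struct Vec { double x, y; };
struct Mat { double a, b, c, d; };   // rows (a b) and (c d)

Vec operator+(Vec p, Vec q) { return {p.x + q.x, p.y + q.y}; }
Vec operator*(Mat m, Vec v) {
    return {m.a * v.x + m.b * v.y, m.c * v.x + m.d * v.y};
}

// The abbreviated form.
Vec brief(Mat A, Mat B, Vec x, Vec u) { return A * x + B * u; }

// The same expression with every operator spelled out as the function call
// the compiler selects. Overload resolution happens at compile time, so a
// tool could in principle derive this text from the line above.
Vec expanded(Mat A, Mat B, Vec x, Vec u) {
    return operator+(operator*(A, x), operator*(B, u));
}
```

Both functions compute the same thing; the second merely names the overloads explicitly, without the source language needing any extra marker on the operators.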

SR
"And v2.0 will show you a correct version of the code..."
---

cik@l.cc.purdue.edu (Herman Rubin) (02/21/90)

In article <5137:06:46:38@stealth.acf.nyu.edu>, brnstnd@stealth.acf.nyu.edu writes:
> Rob, the issue is not user-defined operator syntax. The issue is overloading.
> 
> Say a language provides a vector library with a vector addition routine.
> You might write ``a = b + c'' in full as ``a = b vec.+ c''. Overloading
> means abbreviating vec.+ and string.+ and matrix.+ and complex.+ as a
> single operator. We're not asking you to write a = vec.plus(b,c); we're
> pointing out the problems of ambiguous abbreviation.

In the same line of reasoning, one should have "a = b long.+ c", and
similarly for byte, short, etc.  I am not saying we should not have
these, in fact we should when we want to override type defaults.  But
as long as an operator is well-defined for a given set of types for its
arguments, I do not see any real problems.  The type-override operations
are likely to be machine dependent.  However, as Dan states, this
situation is self-flagging.
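
A small C++ sketch of the point that the built-in + is already selected by operand type. The function name `add_as_long` is hypothetical; the `static_assert` lines only restate the language's usual arithmetic conversions.

```cpp
#include <cassert>
#include <type_traits>

// The built-in + already picks a different operation from the operand types.
static_assert(std::is_same<decltype(1 + 1), int>::value,
              "int + int is int");
static_assert(std::is_same<decltype(1L + 1), long>::value,
              "long + int is long");
static_assert(std::is_same<decltype(1 + 1.0), double>::value,
              "int + double is double");
static_assert(std::is_same<decltype(short(1) + short(1)), int>::value,
              "short + short promotes to int");

// A "type override" written out: force the addition to be done in long,
// roughly what "a = b long.+ c" would ask for.
long add_as_long(int b, int c) { return static_cast<long>(b) + c; }
```

Whether the override buys anything is machine dependent: it only widens the range where long is actually wider than int.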

> My main concern is maintainability. I hate debugging Ada---particularly
> someone else's Ada---because I have to learn an entirely new language
> just to understand each program. I have to force myself to look
> suspiciously at each operator and function call, asking ``what are the
> types?'' and ``is it overloaded?'' If I don't read Ada so painfully, I'm
> liable to overlook an overloaded operator, automatically assuming that
> (for example) a = b + c has no side effects or that bar(x) + foo(y) is
> the same construction that I just debugged when I saw it in the last
> routine.

To require that a program be maintainable by someone who does not understand
what the program is doing may be desirable, but not at the expense of making
it difficult to produce good programs in the first place.  Mathematicians are
quite used to extremely overloaded operators and terminology.  Clarification
of such problems should be in comments.

> I often have a lot of trouble finding the definition of an overloaded
> function---my searching tools are of no help when the syntax sucks and
> the definition could be in any of ten libraries. But this is only a
> lesser evil compared to an overloaded operator insidiously taking over
> the innocent plus sign while the programmer looks away.
> 
> > If instead we make
> > the much more reassuring assumption that the programmer knows about the things
> > he is programming, then we have a second set of conventions to deal with:
> > those of the subject at hand.
> 
> This is the root of the problem: you have to learn a new language just
> to read each new program. When that new language doesn't even have the
> same semantics for a = b, the reader is in big trouble. In contrast, if
> overloaded functions and operators had to have a special syntax, they'd
> be an easy-to-notice, natural first spot to look for bugs. The special
> syntax doesn't have to be anything more than, e.g., an initial period;
> but it has to be there. If the language provided a standard way to
> expand all overloaded operators into unambiguous versions, life would
> be wonderful.

It is possible for an operator to be changed in "standard" situations if
one is not careful.  I have used an assembler which allowed this.  Possibly
a block could be put in for this purpose; I did not think it was a good
idea there.

It seems that both of the preceding posters would not object to new operator
symbols, or symbol strings.  This certainly will help.

I have broadened the posting of this followup, and because of the general
nature of the problem, I am directing followups to comp.lang.misc.

> ---Dan


-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)