[comp.compilers] Defining polymorphism vs. overloading

oliver@karakorum.berkeley.edu (Oliver Sharp) (08/31/90)

Once upon a time, shortly before an exam in a programming language course, 
I asked myself "exactly what is the difference between these two things?"  
The answer being "Ummm ... well ...", I decided to poke around.  The
result is that things seem a bit murky.  It seems as though:

  o  Overloading means using the same name (or symbol) to invoke different
    code depending on the types of the arguments (or operands).  So, an example
    is +, which may do radically different things for integers, floats, and
    complex numbers.  In other words, you have different chunks of code
    depending on the type.  You may need to do some implicit casting, as in
    promoting an integer to a complex number, before you can actually perform
    the operation.

  o  Polymorphism means using the same piece of code to operate on objects of
    different types.  In LISP, cons is polymorphic because you can give it
    arguments of different types but it goes ahead and does its thing without
    worrying about it.

Well, that seems pretty straight-forward so far, but in practice the
distinction isn't so clean.  For example, think about square braces in C.
I've heard people say that they are overloaded, because they can apply to
arrays of different types ... but, by our just agreed on definitions, they
are really polymorphic.  I'd look at someone funny if they said "braces are
polymorphic in C", though, wouldn't you?  I've also heard people describe
things that are clearly overloaded as being polymorphic.

Having just taken a Linguistics class from Lakoff, I could explain all this 
away with a lot of learned talk about radial categories, and prototypical
cases, and so forth ... but I'm really curious.  What do YOU think the
distinction is?  Or isn't there one?  Do you use these terms consistently?

- Oliver Sharp
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.

burley@world.std.com (James C Burley) (09/01/90)

In article <9008310419.AA06194@karakorum.berkeley.edu> oliver@karakorum.berkeley.edu (Oliver Sharp) writes:

     o  Overloading means using the same name (or symbol) to invoke different
       code depending on the types of the arguments (or operands)....

     o  Polymorphism means using the same piece of code to operate on objects of
       different types.  In LISP, cons is polymorphic because you can give it
       arguments of different types but it goes ahead and does its thing without
       worrying about it.

   Well, that seems pretty straight-forward so far, but in practice the
   distinction isn't so clean.  For example, think about square braces in C.
   I've heard people say that they are overloaded, because they can apply to
   arrays of different types ... but, by our just agreed on definitions, they
   are really polymorphic.  I'd look at someone funny if they said "braces are
   polymorphic in C", though, wouldn't you?  I've also heard people describe
   things that are clearly overloaded as being polymorphic.

                            ...I'm really curious.  What do YOU think the
   distinction is?  Or isn't there one?  Do you use these terms consistently?

   - Oliver Sharp

It seems to me you kind of made the distinction already in your bullet items,
except when you talked about LISP (and being LISP-ignorant, I can't be sure
about that, either).

Overloading means using the same name to "invoke different code", you
say, and polymorphism uses the same "piece of code to operate on objects of
different types".

Conceptually, what is the difference?  Let's look at our favorite overloaded
operator, addition:

    y + z

Now if y and z can be of several valid types, we say that + is overloaded
because we know it must perform different actions depending on the types.

Now let's look at a function:

    foo(y,z)

Here, if y and z can be of several valid types, we say that foo is polymorphic.

But this isn't quite right either: in fact, if you maintain a consistent view
of what the phrase "of several valid types" means in the above two examples,
then the outcome changes.  For example, let's take the view that it means "of
several valid types, but the particular type for each y and z is known at the
time when the operation is recognized" (compile time).  In that case, we'd
view both "y + z" and "foo(y,z)" as cases of overloaded operations.

On the other hand, let's say that it means "of several valid types, and the
particular type for each y and z is not determined until the point at which
the operation is to be performed" (i.e. run-time).  In that case, I think it
is clear that not only would we view "foo(y,z)" as indeed polymorphic, but
"y + z" also, because we'd recognize that either the machine code would have
to be embedded in a separate function, making "y + z" the same as saying
"add(y,z)" where add is that separate function, or the machine hardware would
itself make the determination when performing its ADD operation, making the
operation polymorphic.
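
To make that concrete, here is a minimal sketch (C++-ish, names hypothetical)
of the two readings:

    // Compile-time reading: each operand type is known when "y + z" is
    // recognized, so a specific body is selected statically -- overloading.
    long   add(long y, long z)     { return y + z; }
    double add(double y, double z) { return y + z; }

    // Run-time reading: the types are not pinned down until the operation
    // happens, so one body must carry and test a type tag -- polymorphism.
    struct Num { int is_float; union { long i; double f; } v; };
    double as_double(Num n) { return n.is_float ? n.v.f : (double)n.v.i; }
    Num add(Num y, Num z) {
        Num r;
        r.is_float = y.is_float || z.is_float;
        if (r.is_float) r.v.f = as_double(y) + as_double(z);
        else            r.v.i = y.v.i + z.v.i;
        return r;
    }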

So it appears that a person-on-the-street description of the difference between
"overloaded" and "polymorphic" depends on assuming an environment where
recognizing (i.e. interpreting or compiling) a program is an activity distinct
from, and a necessary prelude to, performing (i.e. running) that program.

I'm sure that to academics, this is an inadequate description, but it makes
sense to me for now.  I view LISP as primarily an interpreted language where
type information is carried around along with "variables", so that's why I
think LISP is kind of out of place here: I'd guess that in LISP, the only
places where overloading and polymorphism are different have to do with
performing compilation tasks or things like macro handling and such.

Square brackets in C++ therefore strike me as overloaded, and possibly
polymorphic: overloaded because foo[3] results in an operation that depends
(at compile time) on the size (thus the type) of foo (so a single
implementation involving a particular size is chosen for placement in the
program being created); possibly polymorphic because objA[objB] can result
(via overloading) in calling a function that takes objA and objB as arguments,
but which is implemented so that the actual types of the arguments may be
classA and classB or their subclasses -- the same function code is run
(in most implementations, anyway) even if the types are classK and classY.
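
A sketch of that combination (hypothetical classes, and assuming a typical
C++ implementation):

    class Shape { public: virtual double area() const { return 0; } };

    class ShapeList {
        Shape **elems;
    public:
        // Overloaded: this operator[] is selected at compile time from
        // the static type of the ShapeList expression...
        Shape &operator[](int i) { return *elems[i]; }
    };
    // ...and possibly polymorphic: the Shape& it yields may actually be
    // a Circle or a Square; the same calling code runs for any subclass.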

I'd disagree that square brackets in C, however, are polymorphic: a given
piece of code using square brackets executes the same code each time it is
run, because when it is compiled, the types of its operands are known.  You
can't write a function in C that expects an int array, uses [] to access
elements of that array, then at run time pass it a char array and expect it
to make the adjustment -- it'll treat the char array as an int array.
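
In code (a sketch of the pitfall; the cast is what a caller would have to
write to force the issue):

    int sum_first_three(int *a) {       /* expects an int array */
        return a[0] + a[1] + a[2];      /* strides by sizeof(int), fixed
                                           when this is compiled */
    }
    /* sum_first_three((int *)some_char_array) quietly reinterprets the
       chars as ints; no per-type adjustment happens at run time. */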

Another way of looking at it: overloading teaches a compiler's symbol table
management some new tricks, but code generation is not usually much different.
Polymorphism teaches a compiler's code generation (and run-time library) some
new tricks, but symbol table management is not much different.

Like my guess about LISP, my recollection of Smalltalk is that it makes no
distinction between overloading and polymorphism, as it is an interpreted
language.  (Well, they "compile", but it means something different to them
than it would to a C++ programmer...and of course real Smalltalk compilers
are probably available, but to the extent they don't change the language
itself, I think the lack of distinction remains.)

James Craig Burley, Software Craftsperson    burley@world.std.com

wright@gefion.rice.edu (Andrew Wright) (09/03/90)

In article <9008310419.AA06194@karakorum.berkeley.edu> oliver@karakorum.berkeley.edu (Oliver Sharp) writes:
>  o  Overloading means using the same name (or symbol) to invoke different
>    code depending on the types of the arguments (or operands).  So, an example
>   ...
>  o  Polymorphism means using the same piece of code to operate on objects of
>    different types.  In LISP, cons is polymorphic because you can give it
>    arguments of different types but it goes ahead and does its thing without
>    worrying about it.

C. Strachey originated the terms "ad-hoc" and "universal" polymorphism
to distinguish these issues.  Overloading is an example of ad-hoc polymorphism;
automatic coercions such as int->real are another.
ML is the best example of universal polymorphism (what you call simply
polymorphism above).  See:
@article{Cardelli86,
 author =  "Luca Cardelli and Peter Wegner",
 year =    "1985",
 month =   "December",
 journal = "ACM Computing Surveys",
 volume =  "17",
 number =  "4",
 pages =   "471-522",
 title =   "On Understanding Types, Data Abstraction, and Polymorphism"
}

stephen@estragon.uchicago.edu (stephen p spackman) (09/03/90)

In article <9008310419.AA06194@karakorum.berkeley.edu>
oliver@karakorum.berkeley.edu (Oliver Sharp) conjectures:

     o  Overloading means using the same name (or symbol) to invoke different
       code depending on the types of the arguments (or operands).  So, an example
       is +, which may do radically different things for integers, floats, and
       complex numbers.  In other words, you have different chunks of code
       depending on the type.  You may need to do some implicit casting, as in
       promoting an integer to a complex number, before you can actually perform
       the operation.

     o  Polymorphism means using the same piece of code to operate on objects of
       different types.  In LISP, cons is polymorphic because you can give it
       arguments of different types but it goes ahead and does its thing without
       worrying about it.

I'd say that just about hits the nail on the head (-: in my humble idiolect
:-). Overloading implies multiple implementation discriminating by type,
while polymorphism implies multiple interpretation (of the same code). I
don't feel, however, that we can allow the coercion of the arguments to make
an overload apply as part of the overloading process; rather, any function
that causes its arguments to be coerced is, insofar as the coercion is
properly a property of the function, POLYMORPHIC in that respect (though it
may also be overloaded)
- because the coercion process is just a (trivial) mechanism to allow the
same implementation text to manipulate arguments of different types.

On the other hand, it should also be borne in mind that one (and to my mind,
the neatest) model of coercion is that it is itself the application of an
OVERLOADED function with a (textually) null name {footnote: We have to
resolve an ambiguity here, of course: between any two symbols, how many null
names did we read? Workable solutions, at different points along the
syntax/semantics scale, include requiring (a) that all coercion paths
commute (this is so semantic that you'd want to implement it by restricting
the forms or the directions of the coercions), (b) that all coercion paths,
including T->T, have length 1, or (c), and most subtly, that coercion paths
of length 1 are preferred to coercion paths of any other length (if we
include zero we can work implicit normalisers into this framework), and that
any ambiguity not resolved by this rule is forbidden.

Is this approach (a) stupid, (b) obvious, (c) known or (d) in need of
writing up someplace, does anyone know?}.

So:

  o Overloading is type-resolved multiple implementation.

  o Polymorphism is type-resolved multiple interpretation.

  o Coercion is, in effect, polymorphism implemented through CALL-SIDE
    normalisation by a distinguished overloaded function (unlike the
    other two, of course, this can't affect the result type, though in
    the other cases it need not, either).

  o There is also a fourth mechanism, dependent typing; this will
    allow us either to build explicitly type-parameterised functions
    (if we have dependent maps (ITT PI types)) or dynamically typed
    systems (if we have dependent products (SIGMA types)). {footnote:
    Going one step further and giving types properties, we get
    late-resolving object-orientation a la Smalltalk (which is NOT
    polymorphism because Smalltalk objects, since they "know" their
    class dynamically, only have a single type IMHO).}

Naively, there is no ambiguity at all; naive polymorphism would be compiled
by macro expansion (implying that C's .[.] operator is PRECISELY polymorphic
- it is, as I recall, surface syntax for *(.+.)), while naive overloading
can be seen as including parts of the type signature in an object's name.
That's pretty much the Ada model for both (I'm sure I'll be corrected if I
misremember me).

Things *can* get murky though. There are obviously other ways of getting
multiple implementations; any clausal-equational definition style will give
you this:

	0! = 1;
	(n : N)' ! = n' * n!;

What this implies is that as soon as types acquire value-like properties
(such as a however-indirectly user-accessible type comparison), the
distinction rapidly dissolves;

	f (x : (X : TYPE)) = .if X = T .then tf x .else uf x .

provides only one body, but how different is it from

	f (x : T) = tf x;
	f (x : U) = uf x;

?

In case user-accessible type comparison seems far-fetched, incidentally,
note that the rather (IMHO) mathematically ill-considered unions of Algol68
relied on them, and that any language with non-injective coercions induces
(an admittedly not very usable) one.

Maybe I should also comment that from a type-inference point of view,
polymorphism is a little more straightforward, since it only implies that we
have some kind of quantification over types, while overloading implies that
we can discriminate them (a much stronger requirement, especially for those
of us uncomfortable with the everything-is-a-set outlook). But from an
implementation (particularly a compilation) standpoint, overloads are more
static and therefore nicer: you get fewer yucky descriptors and less
quasi-interpretive code. In both domains the distinction is pretty important
from a TECHNICAL point of view, even though it can be
difficult-to-impossible for the user of a function to tell which he (gender
apology, but syntax requires) has got.

Finally, we need to observe a curious fact: that while we can characterise
languages according to which of polymorphism and overloading they provide to
the user, you would not normally have user-polymorphism without overloading
in the implementation. The reason for this is that any polymorphic text is
going to have to resolve to something executable, and unless you are going
to construct everything from combinators and operations manipulating objects
with homogeneous underlying representations (such as pointers), this implies
the existence of illative atoms (like, say, +) with overloaded
implementations.

(-:

   Having just taken a Linguistics class from Lakoff, I could explain all this
   away with a lot of learned talk about radial categories, and prototypical
   cases, and so forth ... but I'm really curious.  What do YOU think the
   distinction is?  Or isn't there one?  Do you use these terms consistently?

I'd personally be interested and amused to hear you do just that; but
maybe this isn't the forum for it...? :-)

stephen p spackman stephen@estragon.uchicago.edu

daveg@near.cs.caltech.edu (Dave Gillespie) (09/03/90)

In article <9008310419.AA06194@karakorum.berkeley.edu> oliver@karakorum.berkeley.edu (Oliver Sharp) writes:

  o  Overloading means using the same name (or symbol) to invoke different
     code depending on the types of the arguments (or operands)....

  o  Polymorphism means using the same piece of code to operate on objects of
     different types.  In LISP, cons is polymorphic because you can give it
     arguments of different types but it goes ahead and does its thing without
     worrying about it.

>>>>> On 1 Sep 90 05:31:13 GMT, burley@world.std.com (James C Burley) said:
> Conceptually, what is the difference?

Here's my interpretation.  Suppose you have several kinds of linked
lists in C or C++.  Each is a struct with "next" and "prev" pointers,
plus other stuff.  You want to define some functions, like "list_length"
or "append_lists," that operate on lists of any type.

Overloading approach:

  Define several versions of each function, and call them all
  "list_length", using C++'s function overloading to get the compiler
  to choose the right function for each type of list.

Polymorphism approach:

  Define each struct with "next" and "prev" as the first and second
  elements, respectively.  Write a single "list_length" that takes
  a "void *", which must point to a list of one of these kinds of
  structs.  Inside, the function casts this to a pointer to a
  standard struct with only "next" and "prev" fields.  (I believe
  the ANSI standard guarantees this to work.)

The advantage of overloading is that each type's "list_length" function can
be customized.  The advantage of polymorphism is that the whole definition of
"list_length" sits in one place.

But from the point of view of a user of "list_length", there is no difference
here.  The language hides which type of implementation the writer of
"list_length" chose.  (Although because C was not designed as a polymorphic
language, some of the implementation details of the polymorphic "list_length"
show through to the user because the structs must be specially arranged.)

The built-in "+" operator can be considered overloaded or polymorphic; the
distinction only matters if you are implementing the operator yourself.

								-- Dave
--
Dave Gillespie
  256-80 Caltech Pasadena CA USA 91125
  daveg@csvax.cs.caltech.edu, ...!cit-vax!daveg

hrubin@l.cc.purdue.edu (Herman Rubin) (09/03/90)

In article <BURLEY.90Sep1013113@world.std.com>, burley@world.std.com (James C Burley) writes:
> In article <9008310419.AA06194@karakorum.berkeley.edu> oliver@karakorum.berkeley.edu (Oliver Sharp) writes:
> 
>     o  Overloading means using the same name (or symbol) to invoke different
>        code depending on the types of the arguments (or operands)....
> 
>     o  Polymorphism means using the same piece of code to operate on objects
>         of different types.
			..................

> Conceptually, what is the difference?  Let's look at our favorite overloaded
> operator, addition:
> 
>     y + z
> 
> Now if y and z can be of several valid types, we say that + is overloaded
> because we know it must perform different actions depending on the types.

Thus if y and z are long integers, this becomes (different assembler languages
have somewhat different notation, and this may not correspond to any)

ADDL y,z

and if they are floating

ADDF y,z

Now ADDL and ADDF will operate on either, so those are polymorphic and
not overloaded.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

px@fctunl.rccn.pt (Joaquim Baptista [pxQuim]) (09/04/90)

I usually use the following definitions of "overloading",
"polymorphism", and "generic code":

- Overloading is the ability to have the same name (symbol) attached
to different entities (functions, procedures, variables or whatsoever).

- Polymorphism is the ability to have the compiler decide *at runtime*
which entity to use depending on the runtime type (class or whatever)
of the contents of variables.

- Generic code is a chunk of code that has the ability of performing
the same algorithm with more than one type. The actual types may be
known at runtime (as in Smalltalk) or at compile time (as in Ada).


Elaborating...

The distinction between overloading and polymorphism is, for me, a matter of
binding time.  Overloading does not imply any runtime decision; it is just
syntactic sugar for a family of different names that I, the programmer,
usually collapse into one.  No power is lost if individual names are used,
i.e., integer+ and float+.

The same is not true with polymorphism.  When I use (as in Smalltalk)
"Graphic_object display", expecting an appropriate display procedure to be
called for circles and rectangles depending on what happens to be in the
variable Graphic_object at runtime, I cannot use a circle_display and a
rectangle_display unless I use a case statement, as in (Pascal-like):

  case Graphic_object is
    circle:    display_circle(Graphic_object);
    rectangle: display_rectangle(Graphic_object);
  end;

You may argue that it is still a matter of convenience and syntactic
sugar, but I think that it is more than that.
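
For comparison, here is the same example sketched in C++ (hypothetical
classes): the run-time type of the object selects the body, and the caller
writes no case statement at all.

    #include <stdio.h>

    class GraphicObject {
    public:
        virtual void display() = 0;     // bound at run time
        virtual ~GraphicObject() {}
    };
    class Circle : public GraphicObject {
    public: void display() { printf("circle\n"); }
    };
    class Rectangle : public GraphicObject {
    public: void display() { printf("rectangle\n"); }
    };

    void show(GraphicObject &g) { g.display(); }    // no case needed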

Some languages do not quite separate the concepts of overloading and
polymorphism, with some confusing results (at least for me). I say that this
is the case with C++.

Examples of generic code are:
  - the same "Graphic_object display" (an uninteresting one, by the
    way);
  - a function to compute the length of a list (which does not depend
    on the type of its elements);
  - a function to sort some data structure (which depends on the
    existence of a <= relation between its elements).

So, I say that the called functions are polymorphic functions, while
the calling code is said to be generic.

This genericity may involve dynamic binding (as in Smalltalk) or not
(as with the packages in Ada).

In article <9008310419.AA06194@karakorum.berkeley.edu>
oliver@karakorum.berkeley.edu (Oliver Sharp) writes:

>o  Overloading means using the same name (or symbol) to invoke different
>   code depending on the types of the arguments (or operands).  So, an example
>   is +, which may do radically different things for integers, floats, and
>   complex numbers.  In other words, you have different chunks of code
>   depending on the type.  You may need to do some implicit casting, as in
>   promoting an integer to a complex number, before you can actually perform
>   the operation.

>o  Polymorphism means using the same piece of code to operate on objects of
>   different types.  In LISP, cons is polymorphic because you can give it
>   arguments of different types but it goes ahead and does its thing without
>    worrying about it.

> Well, that seems pretty straight-forward so far, but in practice the
> distinction isn't so clean.  For example, think about square braces in C.
> I've heard people say that they are overloaded, because they can apply to
> arrays of different types ... but, by our just agreed on definitions, they
> are really polymorphic.  I'd look at someone funny if they said "braces are
> polymorphic in C", though, wouldn't you?  I've also heard people describe
> things that are clearly overloaded as being polymorphic.

Oliver Sharp seems to use polymorphic where I use generic, and uses
overloading to mean my polymorphism with overloading.

Or did I read it wrong?
--
Joaquim Manuel Soares Baptista, aka px@fctunl.rccn.pt, px@unl.uucp
Snail: CRIA, UNINOVA, FCT/UNL, 2825 Mt Caparica, Portugal

plph@caen.engin.umich.edu (Mark Montague) (09/04/90)

Are the square braces in C polymorphic or overloaded?  Which term should be
used?  I am no expert, but my personal opinion is NEITHER.  As a programmer
who has just begun delving into the mysteries of compiler construction, I
prefer to think of the square braces as a SHORTHAND NOTATION which is expanded
by the preprocessor (even though the preprocessor doesn't).  So I imagine that
every expression of the form

		myarray[index]

is silently transformed into

		*(myarray+index)

before the compiler even sees it.  Even though it probably doesn't happen
this way for most compilers, this fictional device lets me ignore questions
of polymorphism and operator overloading in a non-object-oriented language
such as C.

Mark Montague
plph@caen.engin.umich.edu
[In the C compilers I've seen subscripts are turned into the intermediate code
equivalent of *(a+b) very early in the compiler, typically as the intermediate
form is being built. -John]

dmw9q@uvacs.cs.Virginia.EDU (David M. Warme) (09/05/90)

In article <4c98dcd0b.001a92c@caen.engin.umich.edu> Mark Montague writes:

>So I imagine that every expression of the form
>		myarray[index]
>is silently transformed into
>		*(myarray+index)
>before the compiler even sees it.  Even though it probably doesn't happen
>this way for most compilers, this fictional device lets me ignore questions
>of polymorphism and operator overloading in a non-object-oriented language
>such as C.

     This does not remove polymorphism and/or overloading; it merely
transfers the burden onto + and *.  The expansion you state is THE
DEFINITION of the operator.  A C compiler I wrote does this during
parsing by building two parse-tree nodes (* and +) rather than a
single node for [].  The only detail to consider is that the node used
for * in this case is a specially-flavored * that differs in only one
respect from the "normal" * node:  error messages say "invalid
subscript", rather than "illegal indirection".

					- Dave Warme

pase@orville.nas.nasa.gov (Douglas M. Pase) (09/06/90)

I've watched this discussion a little and have become confused by its intent.
Is this simply a round of "this is my idea of how polymorphism/overloading/
whatever ought to be defined"?  Certainly Cardelli, Strachey, and others
had (have, if still alive) a clear idea of what they meant when they used the
words.  I find Cardelli's definition illuminating as well.  (I'm only loosely
familiar with others so I'm not really able to judge.)  The notions are tied
to the types of objects, and the type signature of the operator or function.
They are not tied in any interesting way to when the functions do their thing,
whether at runtime or compile time.

In most languages, the operator "+" is considered overloaded (ad hoc
polymorphic) under this taxonomy, because different algorithms are required
for each of its possible uses.  E.g., ADD is used for integer add, FADD for
floating point, etc.  It is irrelevant that both use 32-bit arguments, and
therefore could conceptually operate on the same data.  For x,y of type float
and i,j of type int:

	x + y	=> FADD x,y

	i + j	=> ADD	i,j

	x + i	=> CIF	i,t
		   FADD x,t

and so forth.  Although it is possible that the instruction

		   ADD	x,y

could appear somewhere, it is questionable that this represents an add in
any real sense.  It does not represent a floating point add, because it is
an integer operation.  It does not represent an integer add, because it does
not have integer operands.  At best it seems that it is a bit-pattern
transformation operation that might be interpreted in special ways, because
the types float and int are both specializations of the type bit-pattern.
In a traditional assembler language the only supported data types are bit-
patterns of various sizes, so in that context I suppose it makes some sense.

Under the same taxonomy, the function

	define list_count = lambda list
		if empty list then
			0
		else
			1 + list_count ( tail list )
		;

is (universal) polymorphic, because the same algorithm is used regardless of
the type of list.  The compiler has the option of optimizing the code in a
certain way if the data types are suitable, but it does not have to be done,
and it does not fundamentally alter the nature of the beast.  The operation of
the function is *independent* of the type of the function arguments.

If we look at the type signatures, we get

	+ : int   x int   -> int
	  : int   x float -> float
	  : float x float -> float
		...

	list_count : list *a -> int

There is just one type signature for list_count, which happens to involve a
type variable, *a.

I suppose that to some people words mean what they want them to mean, and I
have no problem with that.  However, Cardelli's approach has advantages that
I can see (even with my limited understanding), and I guess I don't see how
some of these other proposals improve on it any.

wg@relay.EU.net (Wolfgang Grieskamp) (09/06/90)

wright@gefion.rice.edu (Andrew Wright) writes:

>C. Strachey originated the terms "ad-hoc" and "universal" polymorphism
>to distinguish these issues.  overloading is an example of ad-hoc
>polymorphism; automatic coercions such as int->real are another.

The unsightliness of "ad-hoc" polymorphism may be one of the origins of the
confusion.  In

	Wadler P, Blott S: How to make ad-hoc polymorphism less ad-hoc,
	Proc. 16th Symp. on Principles of Programming Languages (Jan. 1989),
	60-76

... the authors describe so-called "type classes" (related to, but not the
same as, classes in object-oriented languages).  This is just one approach,
not the ultimate solution; it grew out of the efforts of the Haskell
committee (Haskell being a new-generation functional language, based on ML
and others).

Regarding the pragmatic points raised in the discussion, I guess that what
is run time, what is compile time, and what is the same or different code is
not a sufficient classification.  All this depends highly on the compiler's and
target machine's architecture. For example, function *length* over lists may
or may not generate the same code, depending on the garbage collection
scheme (lists of primitive objects may be collected in another way than
lists of non-primitive objects). 

In object-oriented languages, the compiler may or may not derive the actual 
class of a message receiver - depending on the sender's context (and the 
compiler's capabilities), but not on the class declaration. So, is the same
message in one context overloaded, but in another polymorphic?

Even the square bracket operator in C may or may not generate different code.
If sizeof(int)=sizeof(long), then the square brackets for a pointer to "int"
will generate the same code as for a pointer to "long", otherwise not.

Returning to the mentioned paper, the actual problem sketched out there is
illustrated by the impossibility of defining (in most languages)

	square(x) == x*x

... for any x (int, real, complex, etc.).  The problem here is the "ad-hoc"
polymorphism (or overloading) of "*".  To put it in C terms, you cannot define

	swap(vec,i,j) void *vec; int i,j;
	{  void tmp; tmp = vec[i]; vec[i] = vec[j]; vec[j] = tmp;
	}

... for any vector vec.  This shows clearly that square brackets are the
worst case of "ad-hoc" polymorphism (or overloading).  The question, it
seems to me, is how to eliminate it.
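
(For what it's worth, parametric definitions eliminate it: here is a sketch
using C++-style templates, where the overloaded "*" and "=" are re-resolved
for each instantiation.)

    template <class T>
    T square(T x) { return x * x; }     /* any T that has a "*" */

    template <class T>
    void swap_elems(T vec[], int i, int j) {
        T tmp = vec[i];                 /* the declaration C forbids */
        vec[i] = vec[j];
        vec[j] = tmp;
    }
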
[There was a PS on this message that got smashed in transmission.  Hope it
didn't say anything vital. -John]

pase@orville.nas.nasa.gov (Douglas M. Pase) (09/07/90)

In article <4c98dcd0b.001a92c@caen.engin.umich.edu> plph@caen.engin.umich.edu (Mark Montague) writes:
>Are the square braces in C polymorphic or overloaded?  Which term should be
>used?
>[In the C compilers I've seen subscripts are turned into the intermediate code
>equivalent of *(a+b) very early in the compiler, typically as the intermediate
>form is being built. -John]

Yes, well, it is important to note that a[b] => *(a+b).  So what is the type
signature of this function?  You can look at it several ways.  In the first
case it depends on the type of a.  (It also depends on whether a[b] occurs on
the rhs or lhs of an assignment, but I will ignore that additional
complication.)  Given the address that a represents, the amount added to that
address will depend on the size of its elements.  For example, in a byte
oriented 32-bit machine,

	char a[MAX];

	... a[b] ...

gives something like

	LDW,r6	b	; load the contents of b
	ADD,r6	r6,a	; add the address a
	LDB,r7	r6	; load the byte contents of address in r6

but

	int a[MAX];

	... a[b] ...

gives

	LDW,r6	b	; load the contents of b
	MUL,r6	r6,4	; multiply b by the size of an int
	ADD,r6	r6,a	; add the address a
	LDW,r7	r6	; load the int contents of the address in r6

Judging from this, it seems that the type signatures are

	[] : array of char x int -> char
	   : array of int  x int -> int
	     ...

This seems to suggest that [] is overloaded, and represents an infinite family
of functions, all differing in the size of the elements of a.  Because []
seems to fit the definition, one could say [] is overloaded.  It's hard to
believe, though, that much useful information is gained by looking at it in
this way.

On the other hand, it might also be said that it is universal polymorphic.
Its type signature would be

	[] : array of *a x int -> *a

Looking at it in this way assumes either that it is a primitive polymorphic
function, and is therefore polymorphic almost by definition, or that it is
constructed from functions which are.  That means loads, sizeofs, etc., must
be universal polymorphic.  The use of the "+" function doesn't give us any
problem as long as only one definition of "+" is used, namely

	+ : byte_address x int -> byte_address

If the "+" somehow depended on the type of "a", [] wouldn't be universal
polymorphic.  It would be ad hoc polymorphic (i.e., overloaded).  Assuming
that the above mentioned restrictions are met, it seems much simpler and
more useful to view [] as universal polymorphic.
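
Written out, the universal polymorphic reading is a single parametric body
(a sketch in C++-template style; C's built-in [] behaves as if it were
defined this way):

    template <class T>
    T &index(T *a, int i) {
        return *(a + i);    /* one text; the compiled code differs only
                               in the scaling implied by sizeof(T) */
    }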

--
Dr. Douglas M. Pase, Computer Sciences Corporation
NASA Ames Research Center MS 258-6, Moffett Field, CA  94035
(415) 604-6394, pase@orville.nas.nasa.gov

ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe) (09/07/90)

In article <2509@l.cc.purdue.edu>, hrubin@l.cc.purdue.edu (Herman Rubin) writes:
> In article <BURLEY.90Sep1013113@world.std.com>, burley@world.std.com (James C Burley) writes:
> > In article <9008310419.AA06194@karakorum.berkeley.edu> oliver@karakorum.berkeley.edu (Oliver Sharp) writes:
> >     o  Overloading means using the same name (or symbol) to invoke different
> >        code depending on the types of the arguments (or operands)....
> >     o  Polymorphism means using the same piece of code to operate on objects
> >         of different types.

Let me summarise that:
	DIFFERENT code for different types => OVERLOADING
	     SAME code for different types => POLYMORPHISM
(ignoring implementation detail; the SELF compiler will compile different
versions of what's _conceptually_ the same code.)
They are means to different ends.

> Thus if y and z are long integers, this becomes (different assembler languages
> have somewhat different notation, and this may not correspond to any)
>	ADDL y,z
> and if they are floating
>	ADDF y,z

This sounds like _different_ code for the different types, so overloading.

> Now ADDL and ADDF will operate on either, so those are polymorphic and
> not overloaded.

And now I am completely lost.  The NS32?32 has addd (equivalent of ADDL)
and addf (equivalent of ADDF) and they _won't_ operate on either, they
are very specific.  Same for every machine I've ever used except the
B6700 and Xerox Lisp machine (both of which have _one_ polymorphic add).

-- 
[Herman's definition does seem to be the reverse of the one favored by
other people. -John]



pardo@cs.washington.edu (David Keppel) (09/07/90)

>[Ongoing discussion of overloading and polymorphism]

To summarize:

Given the name of an operator where that operator applies to
several types, the operator is

* Polymorphic if all implementations are derived from a single
  specification.
* Ad-hoc polymorphic or overloaded if the implementations are derived
  from several specifications.

Oversimplified, but close?

	;-D on  ( A rose by any other name would be polymorphic )  Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo

pardo@cs.washington.edu (David Keppel) (09/08/90)

>[Ongoing discussion of polymorphism & overloading.]

Given two uses (`T' denotes type):

	func (T) -> T
	func (T, T) -> T

is `func' overloaded, polymorphic, or neither?  Note that the
parameter types do not change but that the type signature of the whole
call does change.  Does the O/P/N status depend on whether the
specification is

	func (T: x) -> T  { ... }
	func (T: x, T: y) -> T  { ... }

or

	func (T: x, optional T: y) -> T  { ... }

?

Ada allows default parameters, while some LISPs allow both defaults
and genuinely optional parameters.  Does that make a difference?
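
(Sketched in C++ with T fixed to int; names hypothetical:)

    int f1(int x)        { return x;     }  // two specifications:
    int f1(int x, int y) { return x + y; }  // overloading, by most counts

    int f2(int x, int y = 0) { return x + y; }  // one specification with a
                                                // default: arguably neither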

	;-D on  ( And is Eiffel's `feature' a documented bug? )  Pardo
-- 
		    pardo@cs.washington.edu
    {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo

jamin@aludra.usc.edu (Sugih Jamin) (09/08/90)

In article <3709@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe) writes:
>Let me summarise that:
>	DIFFERENT code for different types => OVERLOADING
>	     SAME code for different types => POLYMORPHISM

My prof, Larry Rowe, once said that polymorphism means the exact same code
works for different types, whereas calling different pieces of code with the
same name depending on the type of the argument(s) is "dynamic procedure
call."  Using one name for more than one piece of code is name overloading.


sugih


freek@fwi.uva.nl (Freek Wiedijk) (09/10/90)

ok@goanna.cs.rmit.OZ.AU (Richard A. O'Keefe) writes:
>Let me summarise that:
>	DIFFERENT code for different types => OVERLOADING
>	     SAME code for different types => POLYMORPHISM

And what if the only difference in the code is that two different instances of
an overloaded function are being called?  Is this an instance of overloading
(the code is "really" different, because two different functions are being
called), or is this an instance of polymorphism (the code "looks the same")?

In other words, is the property of being overloaded "inherited" by all
functions that make use of an overloaded function?  Or can overloading
"evaporate" into polymorphism?

Freek "the Pistol Major" Wiedijk                      E-mail: freek@fwi.uva.nl

mmengel@cuuxb.ATT.COM (~XT6561110~Marc Mengel~C25~M27~6184~) (09/11/90)

Okay.  Time to settle this once and for all :-) 

Polymorphism (poly == many, morph == form), i.e. many-form-ism,
	is one operation that works on all (or at least many)
	types.

Overloading is having the same token represent different operations.

I am using "operation" here in a general sense; certainly integer and
floating point addition are mechanically different in detail, yet for the
"same" values (i.e. 5.0 + 5.0 and 5 + 5) an addition operator will yeild the
"same" result (i.e. 10.0 and 10), floating point and integer addition are
then the "same" operation.

So if you're going to be polymorphic, you need to define what the operation
is, and pick one that makes sense across the set of types you are dealing
with.  You could, for example, define "<=" as a partial order on basically
any type and still have a consistent operator that a sorting algorithm
would work on.
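
For example (a sketch): one sorting body then serves every type that
supplies such a "<=".

    template <class T>
    void insertion_sort(T a[], int n) {
        for (int i = 1; i < n; i++) {
            T key = a[i];
            int j = i - 1;
            while (j >= 0 && key <= a[j]) {  // only "<=" is asked of T
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }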

Overloading is when the same token stands for different operations
altogether (i.e. "<<" being used for both bit shifts and standard I.O. in
C++).  This is (IMHO) a Bad Thing.  Overloading (as per my definition anyway
:-)) is something that makes code difficult if not impossible to read,
especially when pushed to the limit.

Polymorphism, on the other hand, makes code easier to read, as it gets rid of
things like having both a "div" and a "/" operator in Pascal.

[Incidentally, to soundly thump the dead horse of subscripting in C, please
remember that "a[b]" in C is equivalent to "*(a+b)" and that it is the "+"
operator on pointers and integers that is polymorphic, and always yields the
pointer address plus the integer times the size of the object the pointer
points to.]
-- 
Marc Mengel					mmengel@cuuxb.att.com
					attmail!mmengel
					...!{lll-crg|att}!cuuxb!mmengel


pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/13/90)

On 7 Sep 90 16:52:16 GMT, pardo@cs.washington.edu (David Keppel) said:

>[Ongoing discussion of overloading and polymorphism]

> To summarize:

> Given the name of an operator where that operator applies to
> several types, the operator is

> * Polymorphic if all implementations are derived from a single
>   specification.
> * Ad-hoc polymorphic or overloaded if the implementations are derived
>   from several specifications.

> Oversimplified, but close?

Agreeably so, under current definitions.  However, the entire discussion
about *understanding* polymorphism and overloading (as well as defining
them) is seriously vitiated by one problem that is always present:

	why would we want to have overloading or polymorphism?

This is a less stupid question than it looks. The answer is *reuse*.
My usual mantra is worth repeating here:

What we really want is to be able to express notationally:

	* reuse of interface

	* reuse of semantics

	* reuse of implementation

These are very different concepts. The various *mechanisms* that are
related (overloading, polymorphism, generic packages, private scopes,
inheritance, delegation, ...) do not directly address the support of
these concepts.

Just to go back to the mechanisms at hand, overloading is reuse of part
of an interface, and to a certain extent reuse of part of the
semantics; when we have + overloaded for ints and floats the name is
reused, and so is part of the semantics, in that the mathematical
properties are assumed to be similar (hints of category theory).  Things
start to look fishy when we overload + for string concatenation, as the
reuse of semantics thing starts to look very flimsy. Note that reuse of
interface is also partial here, because all these overloadings are
actually defined over two operands of the same type, even if the type is
different each time.

Polymorphism is reuse of implementation, with some hints of reuse of
semantics; for example you can use much the same logic to calculate the
length of a string and that of a file. Of course this relies on reuse of
semantics, because it will usually happen that the semantics of the
types over which we are polymorphic are somewhat similar.  But, and this is
rarely considered, it may happen that the same implementation (shape of
code) can be applied with entirely different results to different types;
this is especially obvious with functionals. Polymorphism also implies
some partial reuse of interface, as using the same implementation more
or less implies using the same interface.
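
A tiny illustration of that last point (a sketch): the same functional
shape, reused over different element types and combining operations,
computes entirely different things.

    template <class T, class F>
    T fold(const T *xs, int n, T init, F combine) {
        T acc = init;
        for (int i = 0; i < n; i++)
            acc = combine(acc, xs[i]);  // one skeleton: sum, product,
        return acc;                     // max, concatenation, ...
    }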

What I would really like is for language designers to provide notation
to directly address the three distinct aspects of reuse, not with
non-orthogonal and clumsy solutions.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

voss@suna0.cs.uiuc.edu (09/15/90)

> What we really want is to be able to express notationally:
>	* reuse of interface
>	* reuse of semantics
>	* reuse of implementation

How about a positive example of what you want?  From my Smalltalk-80 & C++
background, it looks to me as though the following are basically equivalent.

	Overloading	<=>    * reuse of interface
	Polymorphism	<=>    * reuse of semantics
	Inheritance	<=>    * reuse of implementation

NOTE: reuse of implementation seems to require reuse of semantics.
--
Bill Voss -- Graduate Student -- Department of Computer Science
voss@cs.uiuc.edu                 University of Illinois at Urbana/Champaign

stt@inmet.inmet.com (09/15/90)

Here is my favorite so far:

> - Overloading is the ability to have the same name (symbol) attached
> to different entities (functions, procedures, variables or whatsoever).
> 
> - Polymorphism is the ability to have the compiler decide *at runtime*
> which entity to use depending on the runtime type (class or whatever)
> of the contents of variables.
> 
> - Generic code is a chunk of code that has the ability of performing
> the same algorithm with more than one type. The actual types may be
> known at runtime (as in Smalltalk) or at compile time (as in Ada).
> . . .
> Joaquim Manuel Soares Baptista, aka px@fctunl.rccn.pt, px@unl.uucp

However, I would revise the definition of polymorphism as follows:

- Polymorphism is the ability to have a single entity (usually a parameter or
a pointer) refer to objects/values of more than one type.  Run-time
polymorphism (like C++) allows the dispatch to type-specific code to be made
at run-time.  Compile-time polymorphism (like Ada's generics) performs the
"dispatch" to type-specific code at compile-time.

- Generic code is code that manipulates polymorphic parameters/pointers, by
taking advantage of the commonality across the multiple types represented by
them.  In most cases, however, it must eventually "dispatch" to type-specific
code.
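
In rough C++ terms (a sketch, with templates standing in for Ada-style
generics):

    struct Shape { virtual double area() { return 0; } };

    double total(Shape **v, int n) {    // run-time polymorphism: each
        double s = 0;                   // v[i]->area() dispatches on the
        for (int i = 0; i < n; i++)     // object's actual class
            s += v[i]->area();
        return s;
    }

    template <class T>                  // compile-time "dispatch": the
    double total_g(T *v, int n) {       // call is resolved anew for each
        double s = 0;                   // instantiation of T
        for (int i = 0; i < n; i++)
            s += v[i].area();
        return s;
    }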

S. Tucker Taft    stt@inmet.inmet.com;  uunet!inmet!stt
Intermetrics, Inc.
Cambridge, MA  02138

pcg@compsci.aberystwyth.ac.uk (Piercarlo Grandi) (09/21/90)

I wrote:
  
pcg> What we really want is to be able to express notationally:
pcg>	* reuse of interface
pcg>	* reuse of semantics
pcg>	* reuse of implementation
  
Bill Voss (voss@suna0.cs.uiuc.edu) wrote:

  How about a positive example of what you want?  From my Smalltalk-80 & C++
  background, it looks to me as though the following are basically equivalent.
  
  	Overloading	<=>    * reuse of interface
  	Polymorphism	<=>    * reuse of semantics
  	Inheritance	<=>    * reuse of implementation

I'd tend to disagree as to the details; for example inheritance normally
implies reuse of interface as well.  Polymorphism seems to be mostly reuse of
implementation, but of course also reuse of semantics.

While my ideas on the subject are not well crystallized, I'd like to see
something along the following lines:

*	The ability to define notationally a "protocol" or interface, and
	the ability to have concrete implementations that say "I am
	accessible via that protocol". Something like Ada packages, but
	you should be able to have *simultaneous* multiple implementations
	accessible.  You also probably want some algebra on protocols
	(this protocol extends that one; these two protocols are unified
	in this one, except for these and those aspects).

*	The ability to define notationally any set of pre- and post-conditions
	and invariants, or other ways of specifying semantics, and have
	concrete implementations say that they adopt a particular semantics.
	This of course must be possible at many levels. We also want some
	algebra on semantics of course.

*	The ability to have implementations that are parametric with respect
	to control flow (control abstraction), functions (higher order
	functionals), and types (generic or polymorphic). Some algebra
	is obviously implicit here (apply this implementation skeleton to
	this domain).

Notice that the same specification can be applied to radically different
implementations, and even to different interfaces; and so on. The same
implementation can have radically different semantics, and even interface,
when instantiated on a specific domain. The same protocol can be associated
with wildly different semantics or implementations.

We want to reuse interface to build flexible systems; we want to reuse
semantics to build reliable systems; we want to reuse implementation to
build cheap systems.

As an example of the benefits of reuse of interface, just consider UNIX
pipelines; as long as a program has a filter interface (read stdin, write
stdout), you can combine them a lot. Functionals in Lisp are an example of
implementation reuse; you can (map) a lot of different things over a list.
As to reuse of semantics, examples are harder to find; but one may be
that no matter which interface or implementation you use, a program that
depends on a simple string search function will not have to be carefully
reexamined again each time you try a new one, because they are all
understood to do the same thing.

In Modula-3 we have some idea about reuse of interfaces; in C++ and
Smalltalk we have abstract classes for interfaces, but they are an
extra-linguistic convention.  In Eiffel we have inheritance of some semantics.  In
polymorphic languages we have some more flexible reuse of implementation. In
ML we have powerful functional abstraction, and in SL5 or Scheme we have
some control abstraction as well.

Notice that supporting all this flexibility requires a lot of effort.  There
are some hints that the best environment is some form of Scheme with
symbolic reduction, i.e. something akin to a supercompiler.

But, and this is the more important thing, what it really requires is better
insight into the various aspects of the reuse problem, not the blind or ad-hoc
choice of notational features for languages we have now.

voss>  NOTE: reuse of implementation seems to require reuse of semantics.

Not really. You can use exactly the same code skeleton to do things that are
radically different, *as long as it is parametric*, which is of course
essential if it is to be reusable.

-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

px@fctunl.rccn.pt (Joaquim Baptista [pxQuim]) (09/21/90)

In article <19400003@inmet> stt@inmet.inmet.com writes:

   Here is my favorite so far:

   > - Overloading is the ability to have the same name (symbol) attached
   > to different entities (functions, procedures, variables or whatsoever).
   > 
   > - Polymorphism is the ability to have the compiler decide *at runtime*
   > which entity to use depending on the runtime type (class or whatever)
   > of the contents of variables.
   > 
   > - Generic code is a chunk of code that has the ability of performing
   > the same algorithm with more than one type. The actual types may be
   > known at runtime (as in Smalltalk) or at compile time (as in Ada).
   > . . .
   > Joaquim Manuel Soares Baptista, aka px@fctunl.rccn.pt, px@unl.uucp

   However, I would revise the definition of polymorphism as follows:

   - Polymorphism is the ability to have a single entity (usually a parameter or
   a pointer) refer to objects/values of more than one type.  Run-time
   polymorphism (like C++) allows the dispatch to type-specific code to be made
   at run-time.  Compile-time polymorphism (like Ada's generics) performs the
   "dispatch" to type-specific code at compile-time.

It's always nice to be appreciated, but still I must disagree with you.

I disagree with your distinction between run-time and compile-time
polymorphism.  I call the first polymorphism and the second one overloading.

Although they look similar, they have a different expressive power.  Run-time
dispatching is more powerful than compile-time "dispatching", and I think
that they deserve different names, i.e., polymorphism and overloading.

I can elaborate on this "expressive power" business if that proves to
be necessary.
--
Joaquim Manuel Soares Baptista, aka px@fctunl.rccn.pt, px@unl.uucp
Snail: CRIA, UNINOVA, FCT/UNL, 2825 Mt Caparica, Portugal

chip@soi.com (Chip Morris) (09/21/90)

It is very difficult to draw a meaningful abstract distinction among
polymorphism, overloading and "generic functions" in the absence of
first-class procedures. After all, "polymorphism" is an attribute of
procedures, but if procedures don't _really_ exist in your language, you are
almost always going to be able to "fake" polymorphism by compile-time or
run-time analysis.

Indeed, this compiler fakery is just what's confusing most people.  I can
write a program in which '+' is applied to pairs of 32-bit integers and to
pairs of complex numbers (user defined), and in a lot of language
implementations the compiler does some analysis and picks the code before
run-time.  But suppose I assign the function denoted by '+' to a variable (as
in Lisp #'+), and then pass it through some code that makes it undecidable
what the types of x and y are in a subsequent call to 'x + y'.  Overloading
ain't possible.
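
The same trap can be seen even in C++ (a sketch, with a hypothetical
overloaded `add' standing in for '+'): the moment the function must become
a first-class value, one meaning has to be picked.

    int    add(int a, int b)       { return a + b; }
    double add(double a, double b) { return a + b; }

    // Legal only because the pointer's type forces the choice; a bare
    // `add' with no expected type is ambiguous -- the overloaded "add"
    // is not a single first-class object.
    int (*f)(int, int) = add;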

Now, polymorphism can be "supported" in a variety of ways by a language
implementation.  An implementation could allow the user to give several
definitions of '+', and wrap them in type-checking code to yield the "real"
'+'.
 
In a language without first-class procedures you won't be able (without
peeking) to tell the difference between this method and a variety of run-time
dispatching schemes that never create a single '+' object.

Personally, I like to think of overloading (ad-hoc) and non-polymorphic
run-time dispatching as compiler optimizations (constant-folding, really) of
polymorphism.  Rather like in-lining once-called procedures.  But I also
think that languages are much better off being founded on first-class
procedures and compiler optimization than on some idea of "generic function".
(Flames to comp.lang.misc.)

---
Chip Morris                             chip@soi.com
617-497-5054                            ..!uupsi!soi!chip
Software Options, Inc., 22 Hilliard St., Cambridge, MA 02140