[comp.compilers] Polymorphism vs. overloading

johnson@cs.uiuc.edu (Ralph Johnson) (09/10/90)

I don't like any of the definitions that I have seen so far.
Instead of just being rude, I'll give my own.

"Polymorphic" refers to a procedure definition.  A polymorphic procedure
is one that can take arguments of many different types.  Most people
seem to get this one right.

Overloading (in Ada and C++, at least) means that there are several
different procedure definitions that define procedures with the same
names, but with different argument types (or, in the case of Ada,
result types).  The compiler figures out which one to call based on
the types of the arguments (or, in Ada, expected return type).  This
has nothing to do with any kind of polymorphism.

Most of the answers seem to imply that the late-binding in
object-oriented programming is a form of overloading.  This is
not true, since overloading means something that can be handled
by the compiler.  In fact, late-binding can be considered to be
normal polymorphism, i.e. a procedure that can take arguments of
many different types.

Consider the C++ expression figure->display().  This can be thought
of as "select the right function for the type of figure and run it",
in which case one might think that it was like overloading, i.e.
there are several different procedure definitions and one of them
is being selected based on the type of one of the arguments.  However,
it is more correct to think that it means to retrieve a field of
record figure, which is a pointer to a function, and execute the
function to which it points.  In this case, the function being
executed can be of one of several types.  Thus, the "select field"
operation is polymorphic (because it takes different types of
records and returns different types of functions) and the "evaluate
function" operation is polymorphic (because it takes functions of
many types and returns values whose type depends on the function),
but there is no overloading.

I like to call this distinguishing characteristic of object-oriented
languages "polymorphism implemented by late-binding of procedure
calls".  It is not very short, but it is accurate.  There are other
ways to achive polymorphism, but this is the kind that gives
object-oriented programming its power.

Ralph Johnson - University of Illinois at Urbana-Champaign
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.

norman@a.cs.okstate.edu (09/12/90)

>From article <9009100341.AA17336@m.cs.uiuc.edu>, by johnson@cs.uiuc.edu (Ralph J
ohnson):
> I don't like any of the definitions that I have seen so far.
> Instead of just being rude, I'll give my own.

What's wrong with using the standard definitions as they occur in the
literature? Has no one bothered to read Cardelli and Wegner's excellent
article on this topic? [1] And how about Danforth and Tomlinson's article on
types and OOP? [2] Surely we owe some consideration to Strachey, the first
person to distinguish between 'parametric' and 'ad-hoc' polymorphism. [3]
(I'll sketch Cardelli and Wegner's definitions at the end of this article
for those of you who don't want to read the articles. But please read the
articles: They provide much more than definitions and I'm sure you'll enjoy
them.)

> "Polymorphic" refers to a procedure definition.  [...]

That's not quite correct. Any entity that can have a type can also have a
polymorphic type: This includes variables, constants, functions, modules,
etc. Don't fall into the trap of limiting a concept merely because some
language does not provide a complete implementation of that concept.

> Overloading (in Ada and C++, at least) means that there are several
> different procedure definitions that define procedures with the same
> names, but with different argument types (or, in the case of Ada,
> result types).  The compiler figures out which one to call based on
> the types of the arguments (or, in Ada, expected return type).  This
> has nothing to do with any kind of polymorphism.

Au contraire (see below).

> Most of the answers seem to imply that the late-binding in
> object-oriented programming is a form of overloading.  This is
> not true, since overloading means something that can be handled
> by the compiler.  In fact, late-binding can be considered to be
> normal polymorphism, i.e. a procedure that can take arguments of
> many different types.
>
> [Example omitted]
>
> I like to call this distinguishing characteristic of object-oriented
> languages "polymorphism implemented by late-binding of procedure
> calls".  It is not very short, but it is accurate.  There are other
> ways to achieve polymorphism, but this is the kind that gives
> object-oriented programming its power.

You're correct--late binding in oop is not a form of overloading; but
your definition of overloading seems to be amiss. Also, the polymorphism
in object-oriented languages has already been described in the literature,
so you needn't give it a new name.

Cardelli and Wegner describe the polymorphism hierarchy as follows:

  [Since most people seem to be interested only in polymorphic
  functions, I'll play along and give the definitions in terms
  of functions. Feel free to extend these definitions to included
  other values in your language of choice.]

    1) Polymorphism

         1.1) Universal Polymorphism--normally, functions that work on
                an infinite number of types, with all types sharing
                some common structure.

                1.1.1) Parametric Polymorphism--obtained when a function
                         works uniformly on a range of types that normally
                         exhibit some common structure. In this type of
                         polymorphism, the function has an implicit or
                         explicit type parameter that determines the
                         type of the argument for each application of
                         that function. These functions are also called
                         generic functions.

                1.1.2) Inclusion Polymorphism--objects can be viewed as
                         belonging to many different classes that need
                         not be disjoint (i.e. there may be an inclusion
                         of classes).

         1.2) Ad-hoc Polymorphism--exhibited by functions that operate
                on a finite set of different and potentially unrelated
                types.

                1.2.1) Overloading--the same name is used to denote
                         different functions and _context_ is used to
                         decide which particular function is denoted
                         by an instance of that name. This is just
                         a convenient syntactic abbreviation as each
                         function could be given a unique name.

                1.2.2) Coercion--a semantic operation that converts
                         an argument to the type expected by a function.
                         Coercions are used to prevent type errors.
                         Coercions may be provided at compile time or
                         they may be provided at run-time via tests on
                         the arguments.

[1] L. Cardelli and P. Wegner, "On understanding types, data abstraction,
    and polymorphism," Computing Surveys, Vol. 17, No. 4, December 1985,
    pp. 471-522.

[2] S. Danforth and C. Tomlinson, "Type theories and object-oriented
    programming," Computing Surveys, Vol. 20, No. 1, March 1988, pp. 29-72.

[3] C. Strachey, "Fundamental concepts in programming languages," Lecture
    notes for International Summer School in Computer Programming,
    Copenhagen, August 1967.
--
Norman Graham                            Oklahoma State University
  Internet:  norman@a.cs.okstate.edu     Computing and Information Sciences
  BangPath:                              219 Mathematical Sciences Building
     {cbosgd,rutgers}!okstate!norman     Stillwater, OK  USA  74078-0599
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.

sakkinen@jyu.fi (Markku Sakkinen) (09/14/90)

In article <9009100341.AA17336@m.cs.uiuc.edu>
johnson@cs.uiuc.edu (Ralph Johnson) writes:
> ...
> [...]  In fact, late-binding can be considered to be
>normal polymorphism, i.e. a procedure that can take arguments of
>many different types.
>
>Consider the C++ expression figure->display().  This can be thought
>of as "select the right function for the type of figure and run it",
>in which case one might think that it was like overloading, i.e.
>there are several different procedure definitions and one of them
>is being selected based on the type of one of the arguments.  However,
>it is more correct to think that it means to retrieve a field of
>record figure, which is a pointer to a function, and execute the
>function to which it points.  In this case, the function being
>executed can be of one of several types. [...]
> ...

This is somewhat inexact for C++. I would like to adjust
three points.

1. In an expression like the above, the variable 'figure'
   cannot be a record (structure, class object) but a pointer to such.

2. The function that gets executed _cannot_ be of one of several types,
   it must be of the type that has been declared for 'display'.

3. The explanation, "... to retrieve a field of record figure,
   which is a pointer to a function ...", is correct _if_ the field
   'display' has indeed been declared as a _pointer_ to function
   in the class definition. This is fully possible already in C
   without needing C++. In C++, 'display' would then be a "data member"
   (instance variable) and not a "member function" (instance method)
   of the class. - BTW, I think that the implicit dereferencing of
   a function pointer that happens here is a bad feature.

3 bis. If 'display' has been declared as a _virtual_ member function
   in the class definition, we do have a case of ordinary object-oriented
   (and more specifically, class-based) late binding.
   The variable 'figure' may point to an instance of any (public)
   subclass of its declared class, and the version of 'display'
   defined in the "lowest" (most specialised) subclass will be invoked.
   In this case there is no pointer-to-function field corresponding
   to 'display' in the object itself!

Markku Sakkinen
Department of Computer Science and Information Systems
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland
          SAKKINEN@FINJYU.bitnet (alternative network address)
-- 
Send compilers articles to compilers@esegue.segue.boston.ma.us
{ima | spdcc | world}!esegue.  Meta-mail to compilers-request@esegue.