[comp.lang.misc] Runtime Polymorphism -- To Have and Have Not

gudeman@cs.arizona.edu (David Gudeman) (03/01/91)

Runtime polymorphism (or runtime typing, or dynamic typing) means that
identifiers in a language do not have types that are fixed at compile
time, they can reference values of different types at different times
in the program.  Languages with runtime polymorphism include Lisp,
Smalltalk, Icon, and Prolog.

There are several advantages to this.  First, it makes program
prototyping and development much faster because declarations are
minimized.  For example, a structure to represent a binary tree of
integers might be declared as

  record tree(data,left,right)

rather than

  struct tree {
    some_type data;
    struct tree left,right;
  }

I've saved maybe half of the work for an extremely simple structure.
I'd guess that for complex structures, the declarations for a language
with runtime polymorphism run 10% or so of the size of declarations in
a strongly typed language.  This can save a lot of time when you first
start writing the program and can save even more time later when you
change the representations (since most of the polymorphic declarations
needn't be changed).  In fact all of the languages named above have
powerful enough built-in data types that you can get by with almost no
declarations at all.  And such powerful built-in data types can only
be defined in languages that have runtime polymorphism.
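To make the point concrete, here is a minimal sketch in Python (a
dynamically typed language in the same family as those above; the data
and names are mine, purely illustrative) of a built-in structure that
mixes types freely with no declarations at all:

```python
# A single built-in table holding values of three different types.
# No declarations are needed: each value carries its own type tag
# at runtime, which is what runtime polymorphism provides.
record = {
    "label": "binary tree demo",   # a string
    "size": 3,                     # an integer
    "weights": [0.5, 1.0, 2.0],    # a list of floats
}

# Generic code can still recover the type when it matters:
kinds = {key: type(value).__name__ for key, value in record.items()}
```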

Another advantage of runtime polymorphism is code generality.  You can
write generic procedures that work on many data types (as long as they
only make use of operations that are defined for all the data
types).  For example you can write a procedure to insert something in
a tree without knowing the type of the object: (Icon code)

  # insert x into tree t and return the new tree
  procedure tree_insert(t,x)
    if t === &null then return tree(x)
    if x <<< t.data then 
      t.left := tree_insert(t.left,x)
    else
      t.right := tree_insert(t.right,x)
    return t
  end

The only operations on x are object comparison (<<<) and inserting
into a record, so the procedure is entirely generic.  (As an aside,
does anyone really want to claim that the lack of semi-colons is
confusing here?  I don't know of anyone who has actually programmed in
Icon who found the semi-colon rule to be any problem at all.)
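For readers who don't know Icon, a rough Python rendering of the same
procedure may help (my code, not Gudeman's; note that Python's `<` only
orders mutually comparable values, so it is a weaker stand-in for
Icon's universal object comparison `<<<`):

```python
# Hypothetical Python rendering of the Icon tree_insert above.
class Tree:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def tree_insert(t, x):
    """Insert x into tree t and return the new tree."""
    if t is None:                         # Icon: t === &null
        return Tree(x)
    if x < t.data:                        # Icon: x <<< t.data (roughly)
        t.left = tree_insert(t.left, x)
    else:
        t.right = tree_insert(t.right, x)
    return t
```

The same code inserts integers or strings unchanged, which is the
genericity being claimed.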

There are two problems with runtime typing (or I should say, one
problem and one complaint of questionable importance).  First, it is
very hard to make these programs run fast.  In practice, most programs
use most variables in a type-consistent way, but the user pays the
performance penalty of having runtime type checks everywhere.  These
type checks can be largely eliminated by type inference, but not
entirely.  And there still must be some mechanism for cleanly
combining values of unknown type with values of known type.

The questionable complaint about runtime typing is that it leads to
hard-to-find errors that are not caught until runtime.  It is true
that type errors in such languages cannot be found until runtime, and
that the place where a runtime error actually occurs is not always
near where the error originated.  On rare occasions these errors
require some serious debugging.  But in my experience this is a minor
problem, and the time spent looking for these errors is quite small
compared to the time that would have been spent writing strongly typed
declarations.

Even so, I believe that there is a solution to both of these problems
that still leaves the advantages of runtime polymorphism intact.  I'll
discuss the solution in a later posting (and maybe answer some of the
flames that this article may generate :-).
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/02/91)

It is a simple matter to define a language that has the advantages of
runtime polymorphism without the disadvantages.  More specifically, it
is possible to allow runtime polymorphism in some places and not in
others, and you only pay the penalties for the feature when you are
actually making use of it.

Start with a strongly typed language with automatic type conversion.
Add a new type called "top".  A value of this type contains two parts,
(1) a value of some other type, and (2) a tag to tell what the type of
the value is.  Expressions of type top can occur anywhere.  For
example

  top x, y; double z; int arr[20];
  x = 3;
  y = 4.2;
  z = x;
  z = x + y;
  x = arr[y];
  y = x;
  ...

The rule is that in a dereferencing context, a value of type top can
be automatically converted to any other type, but the conversion
involves a test to make sure that the type actually is correct.  For
example, in the above code fragment all statements are legal except
for "x = arr[y]", since during the conversion of y to an integer, it
is discovered that y is in fact a float.  (Of course this assumes that
floats are not converted automatically into ints.  If floats are
converted automatically into ints then all the above expressions are
legal.)
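The tagged representation can be sketched in Python (the names `Top`
and `deref` are mine, purely illustrative; Python values already carry
tags, so this just makes the check explicit):

```python
class Top:
    """A runtime-polymorphic value: a type tag plus a payload."""
    def __init__(self, value):
        self.tag = type(value)
        self.value = value

def deref(x, expected):
    """Dereferencing context: convert a Top value to a concrete type.

    The conversion involves a runtime test that the tag actually
    matches -- the check described above.
    """
    if x.tag is not expected:
        raise TypeError("expected %s, got %s"
                        % (expected.__name__, x.tag.__name__))
    return x.value

x = Top(3)                  # x = 3;   tag is int
y = Top(4.2)                # y = 4.2; tag is float

z = float(deref(x, int))    # z = x is legal: int converts to float
ok = True
try:
    i = deref(y, int)       # x = arr[y] fails here: y holds a float
except TypeError:
    ok = False
```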

On the left hand side of an assignment, a variable of type top gets
the value of the right hand side plus the type of the right hand side.
In "x = 3", x gets the type int.  In "y = 4.2", y gets the type float.
In "y = x", y gets the current type of x.

One caveat is that type information must be propagated down the parse
tree as well as up.  For example, in "z = x + y", there has to be some
information passed down the parse tree to x and y to generate code
that will convert them to floats.  By extension, type information must
also be propagated from the left side of assignments to the right
side.

In the above, I slid past another potential problem: what to do for
polymorphic operators (and functions).  In "z = x + y", you want the
expression to behave at runtime just like it would have behaved if you
had known the types of x and y at compile time.  That means that if x
and y are both integers, then you want integer addition performed, and
the result coerced to a float for assignment.  In general, the
decision of what version of a polymorphic operator to apply must be
made at runtime using the same algorithms that would be used at
compile time if the types were known.  However, this usually is a
simple case-by-case analysis of the types of the parameters.
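That runtime case analysis might look like the following sketch
(`top_add` is a hypothetical helper, mirroring what a compiler would
decide statically):

```python
def top_add(a_tag, a_val, b_tag, b_val):
    """Choose the version of '+' at runtime from the operand tags.

    The cases mirror the compile-time rules: integer addition for
    int/int, float addition (with coercion) when a float is involved.
    """
    if a_tag is int and b_tag is int:
        return int, a_val + b_val
    if a_tag in (int, float) and b_tag in (int, float):
        return float, float(a_val) + float(b_val)
    raise TypeError("no applicable version of '+'")
```

So `top_add(int, 3, int, 4)` performs integer addition and returns
`(int, 7)`; an assignment like "z = x + y" would then coerce the
result to float.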

Declarations in a language with top are a superset of the language
without top.  Using C-with-top as an example, you might declare a tree
struct as follows:

  struct tree {data; left; right;}

where names given without types are assumed to have type top.  Thus,
the data is fully polymorphic, and so are the branches.  Or you might
want a little more type security so you could define

  struct tree {data; struct tree left, right;}

which does type checking on the branches, but leaves the data in the
trees fully polymorphic.

Program development then becomes a process of starting with skeleton
declarations and adding type details as required for debugging or
performance needs.

grier@marx.enet.dec.com (03/02/91)

In article <559@coatimundi.cs.arizona.edu>, gudeman@cs.arizona.edu
(David Gudeman) writes:

|> Runtime polymorphism (or runtime typing, or dynamic typing) means that
|> identifiers in a language do not have types that are fixed at compile
|> time, they can reference values of different types at different times
|> in the program.  Languages with runtime polymorphism include Lisp,
|> Smalltalk, Icon, and Prolog.

   Not necessarily, if you accept that there's a difference between
"type" and "class".  In most models I've seen, variables have a type,
and instances have a class.  The type of the variable limits the class
of the instances which can be represented by that variable.  At
least in the object-oriented space, which seems to be what you're
talking about.  (Note that sometimes people invert the meaning of
type and class as I've presented it above.  This is the way I've seen
it and worked with it, so it's the way I talk about it.  Let's not war
over the terms...)

   Some languages, Trellis, for instance, allow the programmer to
narrow the type of a variable during a particular scope.  I believe the
Trellis syntax is something like:

   s : Shape;

   type_case s of
      Circle : begin ... end;
      Square : begin ... end;
      end;

   Within the blocks, the compiler (and thus the type system)
knows that the type of "s" has been narrowed to include only Circle
and subclasses or Square and subclasses. (the only problem here is that
if you follow the common notion that classes partition instances, and
the tags on the case-arms are types, there can be ambiguity, for
instance:

   s : Object; -- root of the hierarchy presumably

   type_case s of
      sortable : begin ... end;
      number : begin ... end;
      end;

   In practice however, the types chosen will partition the hierarchy, and
some simple rule such as "take the first type-case branch which matches"
can be applied successfully.)
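The first-match rule can be illustrated with a small Python sketch
(class names invented; `isinstance` chains stand in for Trellis's
type_case):

```python
class Shape: pass
class Circle(Shape): pass
class Square(Shape): pass

def type_case(s):
    # Arms are tried in order and the first matching type wins,
    # which resolves any ambiguity when the arm types overlap.
    if isinstance(s, Circle):
        return "circle arm"
    elif isinstance(s, Square):
        return "square arm"
    else:
        return "no arm matched"
```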

   This feature isn't necessarily particular to OO languages and systems
either; I believe Trellis inherited the concept from CLU, which had
something similar in relation to its REFANY type.

|> 
|> There are several advantages to this... <stuff deleted>

   Your comments seem to apply more to weakly typed systems
rather than systems which include polymorphism and late binding.
Is this perhaps what you meant?

|> 
|> Another advantage of runtime polymorphism is code generality...
|> <more deleted>

   This has little to do with runtime polymorphism.  Ada generics,
CLU and Trellis type generators and C++ templates all provide for
generation of generic modules of code.  Of the three, only Trellis
has late binding.  (It's coming back to me...  CLU has the REFANY
type, but the CLU equivalent of type_case has to be applied to
it to a particular type before applying any operations except
assignment to it...)

|> 
|> There are two problems with runtime typing (or I should say, one
|> problem and one complaint of questionable importance)  First, it is
|> very hard to make these programs run fast.  In practice, most programs
|> use most variables in a type-consistent way, but the user pays the
|> performance penalty of having runtime type checks everywhere.  These
|> type checks can be largely eliminated by type inference, but not
|> entirely.  And there still must be some mechanism for cleanly
|> combining values of unknown type with values of known type.

   With respect to polymorphism and late binding, wrong.  A good
compiler with strong typing and enough information can generate
code which is just as good.  Largely, it needs the ability to perform
some broader analysis of the type system of the program than is
typically available in languages/compilers where your code jumps
from being a source form into an object which can be munged by
the link-editor into an executable program.  DEC's Ada compiler,
for instance, is able to perform inter-module inlining quite well
because of its use of a program library - something which a language
like C++ has to resort to making the programmer put detailed
implementation information in the public definition in order to
gain some efficiency.  Trellis also has the same characteristic, since
it owns the workspace/program library in which the code is compiled
and executed.

   (C++ could fix this if the language definition wasn't so slanted
towards a specific technique to implement the language structures.
As it stands, such analysis could probably only be performed by
"smart" link-editors, which while they might be smart enough to
do some inlining, certainly can't be smart enough to see the larger
patterns of usage which might better drive decisions around inlining
and genericity.)

   Even in weakly-typed languages, a lot can be done here.  See some
recent postings to comp.arch from an author who had a compiler which
performed a large amount of optimization and inlining (is that a word?)
on a weakly typed language.

|> The questionable complaint about runtime typing is that it leads to
|> hard-to-find errors that are not caught until runtime.  It is true
|> that type errors in such languages cannot be found until runtime, and
|> that the place where a runtime error actually occurs is not always
|> near where the error originated.  On rare occasions these errors
|> require some serious debugging.  But in my experience this is a minor
|> problem, and the time spent looking for these errors is quite small
|> compared to the time that would have been spent writing strongly typed
|> declarations.

   Aha!  We are talking about weak typing vs. strong typing!

   What's your goal here?  Research or engineering?

   If you're researching some idea, or even prototyping a product, I
agree with you - the main factor has to be eliminating any overhead
for development so that the idea can be turned into something which
runs.

   If you're engineering a product which you're planning on selling
to people which they're going to use to run their business, adjust
control surfaces on a plane or monitor a nuclear power-plant, I strongly
disagree with you.  The cost of long-term maintenance of software far
outweighs and exceeds the initial development cost, and warrants
additional discipline during the development phase.  Not to mention
that even if you don't hold a legal responsibility for damages incurred,
you hopefully hold some moral responsibility.  Sorry, this kind of
stuff belongs in comp.software-eng...

   Neither philosophy is good in isolation however.  If it's a major
research project which might last several years and reach a size of
some hundreds of thousands of lines of code, the cost of applying
good engineering principles up front will pay for itself in
being able to advance the model with the safety that (a) clients
of the code you're changing are adhering to documented exported
interfaces and (b) if they don't, the compiler will catch it instead of
letting it run for three days and then crashing mysteriously.

   And there are times when performing Software Engineering (in the
most pure form!) when you need to mung bits the way only a weakly
typed language can.  (However, most strongly typed languages
have escapes to do this in a controlled fashion anyways.)

   (My opinion is that programming with a strongly typed language
is a simple discipline which once you get used to it doesn't slow you
down or hurt your ability to produce code at all.  Sort of like
getting in the habit of showering and brushing your teeth in the
morning.  If you don't do them, it probably seems like a hard thing
to get used to, but if you can just discipline yourself long enough
to get used to it, it's second nature and doesn't take any additional
effort.  And people don't put their hand over their nose when you're
talking to them in close quarters! ;-)

|> 
|> Even so, I believe that there is a solution to both of these problems
|> that still leaves the advantages of runtime polymorphism intact.  I'll
|> discuss the solution in a later posting (and maybe answer some of the
|> flames that this article may generate :-).

   No flames here!  I just think that either you were using the wrong
terminology, or hadn't realized that we have the technology to do these
things right.

------------------------------------------------------------------------
I'm saying this, not Digital.  Don't hold them responsible for it!

Michael J. Grier                           Digital Equipment Corporation
(508) 496-8417                             grier@leno.dec.com
Littleton, Mass, USA                       Mailstop OGO1-1/R6

gudeman@cs.arizona.edu (David Gudeman) (03/03/91)

In article  <1991Mar2.002608.20387@engage.enet.dec.com> grier@marx.enet.dec.com writes:
]In article <559@coatimundi.cs.arizona.edu>, gudeman@cs.arizona.edu
](David Gudeman) writes:
]
]|> Runtime polymorphism (or runtime typing, or dynamic typing) means that
]|> identifiers in a language do not have types that are fixed at compile
]|> time,...
]
]   Not necessarily...

You can't answer "not necessarily" to a definition.  The above was a
definition of "runtime polymorphism" as I planned to use it later.
All of your comments that are made to a different definition of
"runtime polymorphism" are therefore irrelevant.

]   Your comments seem to apply more to weakly typed systems
]rather than systems which include polymorphism and late binding.
]Is this perhaps what you meant?

I meant what I said.  You can introduce a different term with the same
definition if you want, but I don't see what it accomplishes.

]|> Another advantage of runtime polymorphism is code generality...
]|> <more deleted>
]
]   This has little to do with runtime polymorphism.  Ada generics,
]CLU and Trellis type generators and C++ templates all provide for
]generation of generic modules of code.

I don't know about CLU and Trellis, but Ada and C++ both have
painfully inadequate support for this -- at least when compared to
languages with runtime polymorphism.

]  Of the three, only Trellis
]has late binding.  (It's coming back to me...  CLU has the REFANY
]type, but the CLU equivalent of type_case has to be applied to
]it to a particular type before applying any operations except
]assignment to it...)

]|> There are two problems with runtime typing (or I should say, one
]|> problem and one complaint of questionable importance)  First, it is
]|> very hard to make these programs run fast.

]   With respect to polymorphism and late binding, wrong.  A good
]compiler with strong typing and enough information can generate
]code which is just as good.

Well then, your definition of "late binding" is not equivalent to my
definition of runtime polymorphism.  Given a function read() that
reads stdin and returns a float or an integer, depending on what the
next token looks like, how do you generate code for

  x + read()

that is as efficient as addition in a statically typed language?  You
can't.  You need to test the return type of read() at runtime.
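The point can be made concrete with a sketch (`read_token` is an
illustrative stand-in for Gudeman's read()): which branch runs depends
on the input, so no compiler can eliminate the test.

```python
def read_token(token):
    """Stand-in for read(): returns a float or an integer depending
    on what the token looks like -- unknowable at compile time."""
    return float(token) if "." in token else int(token)

def add(x, token):
    v = read_token(token)
    # This runtime type test is the unavoidable cost: the branch
    # taken depends on data that only exists at runtime.
    if isinstance(v, int):
        return x + v              # integer addition
    else:
        return float(x) + v       # float addition
```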

]   Aha!  We are talking about weak typing vs. strong typing!

Depends on your definition of weak typing.  If it matches my
definition of runtime polymorphism, then that's what I'm talking
about.

]   No flames here!  I just think that either you were using the wrong
]terminology, or hadn't realized that we have the technology to do these
]things right.

How can my terminology be wrong when I invented it myself?  And yes, I
know that the technology exists to do these things right -- it is done
right in all the languages that have dynamic polymorphism.  It just
isn't as efficient as it could be.

sommar@enea.se (Erland Sommarskog) (03/05/91)

Also sprach David Gudeman (gudeman@cs.arizona.edu):
>There are several advantages to this.  First, it makes program
>prototyping and development much faster because declarations are
>minimized.

Hm, I don't have the figures here, but which phase takes the
most time in the life of a software product? Prototyping?
Development? Testing? Integration? Maintenance? I could
be wrong, but I don't think it is in the prototyping and
early development stages that you could sneak some typing
time with shorter declarations. But the test and integration
phase will take longer. Maintenance and support will
be more expensive. (Particularly support, since you are
likely to get more crashes due to typing errors and the like.)

>Another advantage of runtime polymorphism is code generality.  You can
>write generic procedures that work on many data types (as long as it
>only makes use of operations that have definitions for all the data
>types).  For example you can write a procedure to insert something in
>a tree without knowing the type of the object: (Icon code)
>...
>The only operations on x are object comparison (<<<) and inserting
>into a record, so the procedure is entirely generic.  (As an aside,

Fine. I can do this in Ada and Eiffel. And ensure at compile time
that I don't pass incomparable or uninsertable types to the generic
routine.

>But in my experience this is a minor
>problem, and the time spent looking for these errors is quite small
>compared to the time that would have been spent writing strongly typed
>declarations.

You've tried this in a 500,000-line system? What did the customers
say?
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

grier@marx.enet.dec.com (03/06/91)

In article <567@coatimundi.cs.arizona.edu>, gudeman@cs.arizona.edu
(David Gudeman) writes:
|> 
|> You can't answer "not necessarily" to a definition.  The above was a
|> definition of "runtime polymorphism" as I planned to use it later.
|> All of your comments that are made to a different definition of
|> "runtime polymorphism" are therefore 
|> 

   Sorry for the confusion, your "definition" wasn't clear about what
you meant.  Over the weekend, I was thinking about it and realized
that I might have misinterpreted you.  I'd never seen a definition
of anything called "runtime polymorphism" before, and you didn't
include any additional information or examples which made what you
meant clear.

   I apologize for going off on a tangent, although an interesting
one...

|> <stuff deleted>
|> 
|> Depends on your definition of weak typing.  If it matches my
|> definition of runtime polymorphism, then that's what I'm talking
|> about.
|> 

   It wasn't.

   Your definition of "runtime polymorphism" doesn't really make sense
in a type-safe system.  In a type-safe language, you wouldn't have
these kinds of issues or problems, so clearly what you're talking about
can only occur in a weakly typed language - thus making optimization
and other smart things extremely difficult, although not impossible.


new@ee.udel.edu (Darren New) (03/06/91)

In article <2741@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>Hm, I don't have the figures here, but which phases takes
>most time in the life of a software product? Prototyping?
>Development? Testing? Integration? Maintenance? 

It depends on whether you are building a prototype to
be thrown away or whether you start out with the
entire 500,000-line program and try to get it right
on the first try.  It seems reasonable to me that you
may want a "quickie" language to prototype a medium-size
(~10K-30K lines) application which can then be rewritten
into the final application in a different language.

Remember: always plan on writing it at least twice.
You will write it twice anyway, so you might as well
plan on it. 
		    -- Darren

-- 
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---
----- Network Protocols, Graphics, Programming Languages, 
      Formal Description Techniques (esp. Estelle), Coffee, Amigas -----
              =+=+=+ Let GROPE be an N-tuple where ... +=+=+=

gudeman@cs.arizona.edu (David Gudeman) (03/06/91)

In article  <2741@enea.se> Erland Sommarskog writes:
]Also sprach David Gudeman (gudeman@cs.arizona.edu):
]>There are several advantages to this.  First, it makes program
]>prototyping and development much faster because declarations are
]>minimized.
]
]Hm, I don't have the figures here, but which phases takes
]most time in the life of a software product? Prototyping?
]Development? Testing? Integration? Maintenance?

Depends on what sort of product it is.  And if it isn't a product, but
a utility for personal use, then the figures probably differ even more
dramatically.

] I could
]be wrong, but I don't think it is the prototyping and
]early development stages when you could sneak some typing
]time with shorter declarations.

Typing time isn't the main issue.  It's having to make too many
decisions up front, and setting them in stone early in the
development.  (Data abstraction helps solve this problem, but runtime
polymorphism is more flexible.)  Another problem is that when you
write more code you make more mistakes.  In Icon or Lisp, you _never_
make mistakes in the declarations because there are (almost) no
declarations.

]... Maintenance and support will
]be more expensive.

And those are the areas where runtime polymorphism gives the biggest
advantages.  It is a lot easier to add features when the language is
more flexible.  Systems written in languages with runtime polymorphism
tend to grow astronomically since they are so easy to modify.  (This
is not an unmixed blessing.)

](Particulary support since you are
]likely to get more crashes due to typing errors and similar.)

There is no reason why a program written in a language with runtime
polymorphism should be less robust.  I do almost all my work in GNU
emacs.  Huge amounts of the system are written in Emacs Lisp (a
language with runtime polymorphism), and I have never encountered a
bug in distributed code.  (There _are_ bugs of course, but they are no
more common than in any other system with comparable functionality.)

]>Another advantage of runtime polymorphism is code generality...
]]Fine. I can do this in Ada and Eiffel.

You have to write separate code for each type that might be used.  The
amount of code you have to write increases linearly with the number of
types used in the program.  And increased code means increased bugs.

] And assure at compile time, 
]that I don't pass incomparable or uninsertable types to the generic 
]routine.

Why is that such a big deal?  What's the big difference whether you
find a bug during compilation or during the first couple of test runs?

]>But in my experience this is a minor
]>problem, and the time spent looking for these errors is quite small
]>compared to the time that would have been spent writing strongly typed
]>declarations.
]
]You've tried this in 500.000 line system? What did the customers
]say?

Not all programs are 500,000 line systems, and not all programs have
customers.  However, there have been plenty of huge programs written
in these languages (most notably lisp) and distributed to others, and
these programs have not shown greater problems with robustness than
programs written in other languages.  I wouldn't be surprised if,
feature-for-feature, programs written in these languages are _more_
robust, since feature-for-feature they are much smaller than programs
written in strongly typed languages.

billk@hawk.cs.ukans.edu (Bill Kinnersley) (03/06/91)

In article <559@coatimundi.cs.arizona.edu> gudeman@cs.arizona.edu 
  (David Gudeman) writes:
: Runtime polymorphism (or runtime typing, or dynamic typing) means that
: identifiers in a language do not have types that are fixed at compile
: time, they can reference values of different types at different times
: in the program.  Languages with runtime polymorphism include Lisp,
: Smalltalk, Icon, and Prolog.
: 
Runtime polymorphism is one of the worst features of Smalltalk and Icon.

In Smalltalk, people miss the typing so much that they name all of their
variables "anObject" or "aNumber" to make up for it, thus forgoing the
opportunity to use meaningful names.

And in Icon, the absence of typing has led to an incredible proliferation
of operators.  Since the variables are polymorphic, the operators must all
be monomorphic.  It's entirely the operator's responsibility to decide 
whether to compare two variables as numbers (=) as strings (==) or as 
values (===).  Thus we get monstrous operations like >>=:= and --:= 
and |||:= and ~===:= that no one can remember.

sommar@enea.se (Erland Sommarskog) (03/10/91)

Also sprach Darren New (new@ee.udel.edu):
>It depends on whether you are building a prototype to
>be thrown away or whether you start out with the
>entire 500.000 line program and try to get it right
>on the first try.  It seems reasonable to me that you
>may want a "quicky" language to prototype a medium-size
>(~10K-30K lines) application which can then be rewritten
>into the final application in a different language.
>
>Remember: always plan on writing it at least twice.
>You will write it twice anyway, so you might as well
>plan on it.

Good point. However, there is a trap hiding here. Once
you have your prototype, be sure that some management
person will give you a very blank stare when you tell
him that you will throw the prototype away and start
anew. I am not agreeing with him, I just think that
if you are not sure that you have management behind
you on such a point, abandon the idea.

As for writing it twice, you do not necessarily make
it more organized. I worked for two and a half years
with a 500,000-line information system. We did an
extensive revision of the system once we had installed
it at the first customer. Not a complete rewrite,
that was impossible since we constantly added new
functions, but we took care of database access and
other general functions, extracted them and renamed
them. But a lot of the things we did couldn't have
been written that way originally when the project
first went into coding (which was eight months before
I came in), since several tools, both local and
vendor-supplied, came along the way.

sommar@enea.se (Erland Sommarskog) (03/10/91)

David Gudeman (gudeman@cs.arizona.edu) replies to my article:
>Typing time isn't the main issue.  It's having to make too many
>decisions up front, and setting them in stone early in the
>development.  (Data abstraction helps solve this problem, but runtime
>polymorphism is more flexible.)  Another problem is that when you
>write more code you make more mistakes.  In Icon or Lisp, you _never_
>make mistakes in the declarations because there are (almost) no
>declarations.

But do you make fewer type errors? And when do you find them?

I spend quite some time declaring stuff and setting up
the types. And when I have done this I also know a lot more
about what I am about to do.

>]>Another advantage of runtime polymorphism is code generality...
>]]Fine. I can do this in Ada and Eiffel.
>
>You have to write seperate code for each type that might be used.  The
>amount of code you have to write increases linearly with the number of
>types used in the program.  And increased code means increased bugs.

What do you mean? OK, in Ada you have to write

    PACKAGE Integer_list IS NEW Linked_list(integer);
    PACKAGE Array_list IS NEW Linked_list(Array_type);
    -- etc.

and then declare each list you need. But you don't have to rewrite
the list package for any type that you might think of. In Eiffel
you only declare them:

    Int_list   : Linked_list[integer];
    Array_list : Linked_list[Array[T]];

Is this so frustrating? Or do you simply not know what you are 
talking about?

>Why is that such a big deal?  What's the big difference whether you
>find a bug during compilation or during the first couple of test runs?

1) People pay me to program. The earlier I find my errors, the
   less it costs my contractor.

2) You find a type error in the first couple of test runs if it
   is a common case. An uncommon case, e.g. in an error handler,
   might not show up until you integrate, or, horror, after shipping.

>Not all programs are 500,000 line systems, and not all programs have
>customers.

True, but I have to admit that programs that have customers
are much more interesting to me. And between 500,000 and 5,000
lines there is not so much difference. The more errors the
compiler can find for me at an early stage, the happier I am.

gudeman@cs.arizona.edu (David Gudeman) (03/10/91)

In article  <1991Mar5.214727.8630@hawk.cs.ukans.edu> Bill Kinnersley writes:
]...
]Runtime polymorphism is one of the worst features of Smalltalk and Icon.
]
]In Smalltalk, people miss the typing so much that they name all of their
variables "anObject" or "aNumber" to make up for it, thus forgoing the
]opportunity to use meaningful names.

I'd guess that this is actually a result of people blindly following
poor examples from books.  It isn't seen in other languages with
dynamic typing.

]And in Icon, the absence of typing has led to an incredible proliferation
]of operators.

That's completely wrong.  The proliferation of operators in Icon is a
result of two things: (1) there are a lot of powerful built-in types,
and (2) the designers liked operator syntax for these things better
than function syntax.  It has absolutely nothing to do with dynamic
typing.

] Since the variables are polymorphic, the operators must all
]be monomorphic.

That too is wrong.  There are many polymorphic operators in Icon.
All the arithmetic operators are polymorphic the same way they
are in C.  I can off-hand think of three operators that operate on
lists, strings, and tables: element generation, random element choice,
and subscripting.  The set operators work for sets as well as csets,
and many of the built-in functions are polymorphic also.

]  It's entirely the operator's responsibility to decide 
]whether to compare two variables as numbers (=) as strings (==) or as 
]values (===).

These are all distinct operations that would generally be just as
useful in a statically typed language except for the operators that
express things that can't be expressed in statically typed languages.
"=": do numerical comparison, but it is an error if one of the
arguments is not a number or convertible to one.  "==": do lexical
comparison, but it is an error if one of the arguments is not a string
or convertible to one.  These two operators are distinct in most
languages that have string comparison.  "===" expresses something that
you can't even say in a statically typed language -- compare the
values numerically, lexically, or by some other criteria, depending on
the dynamic type of the expression.

]  Thus we get monstrous operations like >>=:= and --:= 
]and |||:= and ~===:= that no one can remember.

I've never had trouble.  All you need to do is remember some simple
rules.  These examples in particular are silly since they are compound
assignment operators that are easy to remember if you can remember the
base operator.  For all operators "@", "x @:= y" is equivalent to "x
:= x @ y" except that x only gets evaluated once:

  x >>= y	return y if x is lexically greater than y, otherwise fail.
  x >>=:= y	x gets y if x is lexically greater than y.
  x -- y	return the set difference of x and y.
  x --:= y	x gets the set difference of x and y.
  x ||| y	return the list concatenation of x and y.
  x |||:= y     x gets the list concatenation of x and y.
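For readers who don't know Icon, Python's augmented assignment follows the
same rule (the analogy is mine, not from the post): for an operator @,
"x @= y" behaves like "x = x @ y" with x evaluated only once.

```python
# Python analogues of two of the Icon compound operators above
# (illustrative only; Icon's "--" and "|||" map roughly onto
# Python's "-" on sets and "+" on lists):
x = {1, 2, 3}
x -= {2}              # like Icon "x --:= y": set difference, assigned back
assert x == {1, 3}

y = [1, 2]
y += [3, 4]           # like Icon "x |||:= y": concatenation, assigned back
assert y == [1, 2, 3, 4]
```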
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/11/91)

In article  <2773@enea.se> Erland Sommarskog writes:
]
]But do you make fewer type errors? And when do you find them?

Yes, I make fewer type errors.  Most type errors in statically typed
languages are not really errors in logic, they are errors in assuming
the language is smart enough to make obvious conversions.  Smarter
languages just do the right thing.

I find type errors in the same amount of time it takes me to find
other blatant runtime errors.

]I spend quite some time declaring stuff and setting up
]the types. And when I have done this I also know a lot more
]about what I am about to do.

I prefer to spend the time writing experimental code.  After you have
decided how to organize your data, the vast majority of time spent in
declarations is trivial stuff that can (and should) be done by the
computer.

]What do you mean? OK, in Ada you have to write
]
]    PACKAGE Integer_list IS NEW Linked_list(integer);
]    PACKAGE Array_list IS NEW Linked_list(Array_type);

]and then declare each list you need. But you don't have to rewrite
]the list package for any type that you might think of.

You aren't getting the same functionality that dynamic typing gives.
To get that functionality you would have to declare a tagged union of
every type in the program, and for each list function that deals with
list values you have to do the right thing with the member of the list
depending on its actual type.  To compare two values you have to
write code to check the type and do the right thing based on the type.
You have to (in general) write separate code for every type that is
used in the program.
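By contrast, in a dynamically typed language one definition covers every
comparable type.  Here is a rough Python rendering of the Icon tree_insert
posted earlier in the thread (Python standing in as another dynamically
typed language; the dict-as-record representation is my own):

```python
def tree_insert(t, x):
    # Insert x into binary search tree t and return the new tree.
    # No type declarations: any x supporting "<" works.
    if t is None:
        return {"data": x, "left": None, "right": None}
    if x < t["data"]:
        t["left"] = tree_insert(t["left"], x)
    else:
        t["right"] = tree_insert(t["right"], x)
    return t

# The same code builds a tree of ints and a tree of strings:
ints = None
for n in (5, 2, 8):
    ints = tree_insert(ints, n)
words = None
for w in ("pear", "apple"):
    words = tree_insert(words, w)
```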

]1) People pay me to program. The earlier I find my errors, the
]   less it costs my contractor.

Suppose you spend 2 hours writing type declarations and 2 hours
writing code.  A single compile finds your type errors.  I'll spend no
time writing declarations and 2 hours writing code (probably less
since dynamic typed languages have such powerful built-in data types).
I'll find the type error, on the average, after 10 to 20 minutes of
testing.  Who has saved more money for his contractor?

Of course these numbers are just taken out of thin air.  The point is
that programming is _faster_ with dynamic polymorphism, so it makes no
sense to complain that it will cost more money looking for blatant
errors.  Type errors are a trivial concern that get blown way out of
proportion by people who are used to statically typed languages and
are looking for justifications not to learn and use better
technologies.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

sakkinen@tukki.jyu.fi (Markku Sakkinen) (03/11/91)

In article <595@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:

[Quote from the very end of the posting:]
> [...] Type errors are a trivial concern that get blown way out of
>proportion by people who are used to statically typed languages and
>are looking for justifications not to learn and use better
>technologies.

Many proponents of dynamic typing in turn seem to be obsessed out of
proportion by completely polymorphic lists and the like,
which have very limited usefulness.

>In article  <2773@enea.se> Erland Sommarskog writes:
>] ...
> ...
>I prefer to spend the time writing experimental code.  After you have
>decided how to organize your data, the vast majority of time spent in
>declarations is trivial stuff that can (and should) be done by the
>computer.
>
>]What do you mean? OK, in Ada you have to write
>]
>]    PACKAGE Integer_list IS NEW Linked_list(integer);
>]    PACKAGE Array_list IS NEW Linked_list(Array_type);
>
>]and then declare each list you need. But you don't have to rewrite
>]the list package for any type that you might think of.
>
>You aren't getting the same functionality that dynamic typing gives.
>To get that functionality you would have to declare a tagged union of
>every type in the program, and for each list function that deals with
>list values you have to do the right thing with the member of the list
>depending on its actual type.  To compare two values you have to
>write code to check the type and do the right thing based on the type.
>You have to (in general) write separate code for every type that is
>used in the program.

Even if you are not "hampered" by static typing, you have to do pretty
much the same things in order to handle a totally polymorphic collection
sensibly.  There are hardly any operations that are applicable to all
possible types of objects.  If you want to compare two arbitrary objects,
you just have to look at their types to see whether the comparison
makes any sense.

It is not a good idea to prohibit static typing totally, as Gudeman
seems to suggest.  Many languages that have static typing available
actually allow various degrees of dynamic typing:

1. Object-oriented languages typically allow the type of a variable
   to be defined as a given class or any subclass of it.  This is much
   more often what is really needed than unbounded polymorphism.

2. Several languages even allow a type definition of "any" (i.e.
   a totally polymorphic variable) whenever that is really wanted.
   I think Eiffel (on which Sommarskog gave an example) currently
   has this facility.

I think Objective-C, which originally was based on totally dynamic
typing like Smalltalk, currently allows static type declarations
of any or all variables.  So there seems to be a convergence
to "the best of both worlds" from both sides.

I certainly fail to see that preventing a programmer from declaring
any static type information would be "better technology".
Neither do I share Gudeman's belief that testing will quickly
and easily reveal all (typing) errors.

Markku Sakkinen
Department of Computer Science and Information Systems
University of Jyvaskyla (a's with umlauts)
PL 35
SF-40351 Jyvaskyla (umlauts again)
Finland
          SAKKINEN@FINJYU.bitnet (alternative network address)

gudeman@cs.arizona.edu (David Gudeman) (03/13/91)

In article  <1991Mar11.114638.2220@tukki.jyu.fi> Markku Sakkinen writes:
]In article <595@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]
]Many proponents of dynamic typing in turn seem to be obsessed out of
]proportion by completely polymorphic lists and the like,
]which have very limited usefulness.

It isn't that completely polymorphic lists are so useful, in fact
they are rarely used.  What is important is the ability to change the
types that can go in the list without making pervasive changes to the
program -- and without having to look for a declaration for the list.

]>... To compare two values you have to
]>write code to check the type and do the right thing based on the type.
]>You have to (in general) write separate code for every type that is
]>used in the program.
]
]Even if you are not "hampered" by static typing, you have to do pretty
]much the same things in order to handle a totally polymorphic collection
]sensibly.  There are hardly any operations that are applicable to all
]possible types of objects.  If you want to compare two arbitrary objects,
]you just have to look at their types to see whether the comparison
]makes any sense.

No you don't.  Languages with dynamic typing have built-in operators
that work on all types as determined at runtime (that's why I started
with the term "runtime polymorphism").  I can write completely generic
routines to compare, copy, print, etc., all the elements of a generic
list.  And completely generic lists are rare anyway.  As you suggest,
the elements of the list are usually restricted to some set of
types with even more operations in common.  You can just assume that
the list has elements of one of the correct types, and if that
assumption proves wrong, you get a runtime error message.
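A small Python illustration of that point (Python used as a stand-in
dynamically typed language; `smallest` is a made-up example routine):

```python
def smallest(items):
    # Works on any list whose elements are mutually comparable;
    # the runtime types of the elements decide what "<" means.
    return min(items)

assert smallest([3, 1, 2]) == 1
assert smallest(["pear", "apple", "plum"]) == "apple"

# A wrong homogeneity assumption surfaces as a runtime error:
try:
    smallest([1, "apple"])
    mixed_ok = True
except TypeError:
    mixed_ok = False
```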

]It is not a good idea to prohibit static typing totally, as Gudeman
]seems to suggest.

You must not have read part 2.  There I give my preference, one that
allows both dynamic and static typing.

]2. Several languages even allow a type definition of "any" (i.e.
]   a totally polymorphic variable) whenever that is really wanted.
]   I think Eiffel (on which Sommarskog gave an example) currently
]   has this facility.

These features do not give dynamic typing.  They only give built-in
tagged union types, and the only static typing problem they solve is
that you don't have to change the declarations of a structure when you
change the types of elements.  They still require you to check the
types of elements before you do anything with them.
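The tagged-union style being criticized looks roughly like this sketch
(Python used only to mimic the pattern; the `show` function is
hypothetical): every operation must branch on the stored type before it
can do anything with the value.

```python
def show(v):
    # With a built-in tagged union ("any" type), each use site
    # still has to check the tag before operating on the value:
    if isinstance(v, int):
        return "number %d" % v
    elif isinstance(v, str):
        return "string %s" % v
    else:
        raise TypeError("no case for %s" % type(v).__name__)
```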

]...Neither do I share Gudeman's belief that testing will quickly
]and easily reveal all (typing) errors.

This is something that I have discovered from experience, not
something that I have tried to intuit.  I doubt anyone with serious
experience with dynamically typed languages would take these
complaints about type errors seriously.  They just aren't a problem.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

cs450a03@uc780.umd.edu (03/13/91)

When I first hit this subject chain a couple days ago, I decided that David
Gudeman was a breath of fresh air.  I wanted to follow up immediately, but
promised myself to wait till I'd read through to the end of the newsgroup
before I posted.  So, I'm responding to a large number of articles, and
hopefully I'm remembering everything right.

First, I get paid to maintain/develop code on a large system (gigabytes
of data, millions of lines of code) on what Mr Gudeman would probably
consider a strongly typed (runtime type checking) system.

Quick summary of personal experience:

The runtime errors I've run into include
          divide by zero
          array reference out of bounds
          file (or file system) full
          access denied (permissions set wrong)
          out of memory
          undefined variable

Oddly enough, these errors tend to show up much more frequently in
poor code than in good code, and in good code they generally only
show up shortly after the code goes into testing.

Oh yes, forgot a couple:
          race conditions
          data corruption

An example of the last was when I changed a file naming convention
from using a date encoded in the file name to using date as an
independent property, ran a routine to convert existing files but
had not actually installed the function that dealt with the new
convention.
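Most of the errors in that list are easy to reproduce in any language
with runtime checking; a Python sketch (my own, purely illustrative):

```python
# Each failure is reported at the point it happens, at runtime:
caught = []
for thunk in (lambda: 1 / 0,            # divide by zero
              lambda: [1, 2, 3][10],    # array reference out of bounds
              lambda: undefined_name):  # undefined variable
    try:
        thunk()
    except (ZeroDivisionError, IndexError, NameError) as e:
        caught.append(type(e).__name__)
```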

I fail to see how a statically typed language would save me from
any of these circumstances.  At best, the language could forbid me
from doing whatever operation caused the problem (array index out
of bounds?  well, you shouldn't be using array indices... use our
Ronco brand array substitutes instead!)

The more general problem of incorrect logic has only one good solution:
(1) figure out what the problem is you are trying to solve
(2) figure out how to solve it
(3) implement it
(4) check the implementation to be sure it is correct

And if step 3 only takes me a few hours instead of a few days or a
few weeks, well, maybe I'm "undisciplined", but we're beating our
competition.

Tough luck, huh?

Raul Rockwell