[comp.lang.misc] Limitations of ML type inference

db@cs.ed.ac.uk (Dave Berry) (03/22/91)

In article <2400034@otter.hpl.hp.com> sfk@otter.hpl.hp.com (Steve Knight) writes:
Steve Knight posted an article which looked (rather sarcastically) at
some of the limitations of ML's type inference.  Most of these are well-known,
well-understood, and the subject of current research.  Several don't often
arise in real programs.

>True enough.  Good job the example wasn't this, though ....
>
>- fun foo x = x + x;
>Error: overloaded variable "+" cannot be resolved

Yes, overloading doesn't mix well with ML's type inference.  Fortunately,
this rarely arises; real programs usually have enough context to disambiguate
whether the operation is applied to integers or reals.

There are a number of ways to improve this situation.  A simple one is to
have a single numeric type encompassing both float and integer, though this
is obviously unsatisfactory.  A better approach is to have a hierarchy
of types, with appropriate overloadings or coercions.  Haskell includes
one such scheme, and could infer a type for the above declaration.
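OCaml, a close cousin of SML, sidesteps the overloading question entirely by
giving integers and floats distinct arithmetic operators, so inference never
faces the ambiguity.  A small sketch (the names foo and foo_float are just
for illustration):

```ocaml
(* OCaml avoids overloaded arithmetic: + works only on ints, +. only on
   floats, so the type of a doubling function is inferred unambiguously. *)
let foo x = x + x          (* inferred: int -> int *)
let foo_float x = x +. x   (* inferred: float -> float *)

let () =
  assert (foo 21 = 42);
  assert (foo_float 1.5 = 3.0);
  print_endline "ok"
```

This is a third design point alongside the two mentioned above: no
hierarchy and no coercions, at the cost of writing the operator you mean.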

>
>- datatype IntOrBool = Int of int | Bool of bool;
>- val foo = [Bool false, Int 0];
>
>Just look how ML doesn't need type declarations!  I hope the intelligent reader
>is beginning to get an idea how this splendid type inference is achieved.

If you mix types in a construct, then you need some way of tagging
them, unless you have dynamic types.  The tags aren't type declarations - 
they're equivalent to the tags in a tagged variant or union, not to the
declaration of the variant or union itself.
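The same construction in OCaml syntax makes the point concrete: the
constructors are value-level tags that inject each value into one sum type,
and pattern matching recovers the value from its tag (the helper describe
is just for illustration):

```ocaml
(* Int and Bool are tags, not type declarations: they mark which arm of
   the single sum type int_or_bool each value belongs to. *)
type int_or_bool = Int of int | Bool of bool

let foo = [Bool false; Int 0]

(* Pattern matching on the tag recovers the underlying value. *)
let describe = function
  | Int n  -> "int " ^ string_of_int n
  | Bool b -> "bool " ^ string_of_bool b

let () = List.iter (fun x -> print_endline (describe x)) foo
```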

None the less, this sort of expression can be ugly, and it's possible to do
better.  I've got a student working on a scheme that will infer the
appropriate tags in expressions like this.  The work is at an early stage,
but shows promise.

>- val x = ref [];
>Error: nongeneric weak type variable
>  x : '0S list ref
>
>Gosh.  A jolly good job I was protected there.  That could have led to a 
>terrible run-time error.

If that declaration was accepted, then the following two lines would
also be accepted:

x := [2];
x := [false];

If you would allow those, then we have different understandings of what is
meant by the term "strong typing".

An early approach to typing polymorphic references would have allowed the
declaration, refining the type from the polymorphic list to an integer
list at the first assignment.  This proved to be unwieldy in practice,
since the first assignment to a reference could be some distance from
the initial declaration, and this made code hard to read.
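The usual way out is to state the intended element type at the declaration
itself, which gives a perfectly usable monomorphic reference.  A sketch in
OCaml, which restricts "ref []" in the same way SML does:

```ocaml
(* Without the annotation, ref [] gets a non-generalised ("weak") type.
   Annotating at the declaration keeps the type local and readable. *)
let x : int list ref = ref []

let () =
  x := [2];          (* fine: int list *)
  (* x := [false];      would be rejected: bool list, not int list *)
  assert (!x = [2]);
  print_endline "ok"
```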

>- fun f x y = if y = 0 then 0 else f f (y-1);

The problem here is the self-application "f f": unifying the type of f
with a type that contains it fails the "occurs check" in the unification
algorithm.  The Hindley-Milner type inference algorithm used in ML only
allows generic instantiation of a type variable if it's bound in a "let"
declaration, so a function can't be used at two different types within its
own definition.  In practice this doesn't cause many problems.  It's still
an area of research; recently Rana Ghosh-Roy claimed to have an improved
algorithm that would handle cases like this.
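The let-bound restriction is easy to see directly.  A sketch in OCaml,
whose inference behaves the same way here:

```ocaml
(* A let-bound identity is generalised, so it can be used at two
   different types in the same body... *)
let pairs =
  let id x = x in
  (id 3, id true)    (* ok: id : 'a -> 'a *)

(* ...but a lambda-bound one is not, so the analogous term
     (fun id -> (id 3, id true)) (fun x -> x)
   is rejected: the parameter id gets a single monomorphic type. *)

let () =
  let (a, b) = pairs in
  assert (a = 3 && b = true);
  print_endline "ok"
```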

In addition to the topics mentioned here, people are working on subtyping
systems, inheritance, higher-order functors, dynamic types, first-class
environments, and a wider range of operations on records.  There are some
theoretical limits to how far type inference can be extended, but it's not
clear how many of these will limit the design of practical programming
languages.  Even if they do, it should be possible to embed powerful
type-inference systems in larger type systems, so that the number of
type declarations required is kept to a minimum.

--
 Dave Berry, LFCS, Edinburgh Uni.      db%lfcs.ed.ac.uk@nsfnet-relay.ac.uk

		George Bush: America's Chamberlain.

cs450a03@uc780.umd.edu (03/24/91)

Dave Berry writes:
>An early approach to typing polymorphic references would have allowed the
>declaration, refining the type from the polymorphic list to an integer
>list at the first assignment.  This proved to be unwieldy in practice,
>since the first assignment to a reference could be some distance from
>the initial declaration, and this made code hard to read.

One of the things I've been trying to point out is that type is a
property of data as much as it is of a name (currently, I believe type
belongs more to data than to names).  A related argument might be that it
is a property of assignment.

The way I interpret Dave's comment: somebody allowed assignment of a
(constant) type to a name, which had implications for all later
assignments to that name.  To me, this is an old, familiar stylistic
problem (nonlocality).

Personally, I favor aiding "type inference" by providing (possibly
trivial) functions which guarantee results of a certain type.  If you
like, this could be extended to meta-functions and meta-types (but
meta-meta-* is silly).
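Such trivial type-guaranteeing functions are easy to write in an ML-family
language.  A sketch in OCaml (the names the_int and doubled are just for
illustration):

```ocaml
(* A "trivial" function whose only job is to pin down a type: it is the
   identity on ints, but applying it forces its argument to be int. *)
let the_int : int -> int = fun x -> x

(* At a use site it acts as a checked annotation. *)
let doubled y = the_int y + the_int y

let () =
  assert (doubled 5 = 10);
  print_endline "ok"
```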

Raul Rockwell