[comp.lang.functional] Type Inference in ML

mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) (01/17/91)

There *is* an anomaly in the LML type inference algorithm - one which
has recently prevented my use of the language for some very elegant
continuation based solutions to problems.  The simplest form of the
problem is that the expression

	let x = hd o hd in x

will typecheck properly - to (List (List *a))->*a) - whereas

	(\x.x o x) hd

will not.  The problem with the second expression seems to be that the
inference algorithm erroneously expects both of the 'x' values in the
x o x to have the same type signature, when in fact it is only the _form_
of the signature that should be shared (perhaps a new set of type
variables should be created for each occurrence of x, and then unified
in the normal way).  Thus the type of x in the expression is given as
(*a->*a).

I have three questions: can anyone point me to an algorithm or paper
describing a polymorphic type system which does not share this
anomaly?  Are there any readily available lazy functional languages
which use this system?  Has anyone modified the LML compiler to do
this? (looks around hopefully... :-)

Cheers,

    Mike.
-- 
Mike McGaughey			ACSNET:	mmcg@bruce.cs.monash.oz

"His state is kingly; thousands at his bidding speed,
 And post o'er land and ocean without rest."		- Milton.

brian@comp.vuw.ac.nz (Brian Boutel) (01/17/91)

In article <3603@bruce.cs.monash.OZ.AU>, mmcg@bruce.cs.monash.OZ.AU
(Mike McGaughey) writes:
|> There *is* an anomaly in the LML type inference algorithm - one
|> which
|> has recently prevented my use of the language for some very elegant
|> continuation based solutions to problems.  The simplest form of the
|> problem is that the expression
|> 
|> 	let x = hd o hd in x
|> 
|> will typecheck properly - to (List (List *a))->*a) - whereas
|> 
|> 	(\x.x o x) hd
|> 
|> will not.  The problem with the second expression seems to be that
|> the
|> inference algorithm erroneously expects both of the 'x' values in
|> the
|> x o x to have the same type signature, when in fact it is only the
|> _form_
|> of the signature that should be shared (perhaps a new set of type
|> variables should be created for each occurrence of x, and then
|> unified
|> in the normal way).  Thus the type of x in the expression is given
|> as
|> (*a->*a).
|>

This is NOT erroneous. It is quite normal Hindley/Milner typing, and
other functional languages behave the same way. It is usually expressed
by saying that the type variables associated with lambda-bound variables
are non-generic.

Indeed it is the reason why functional languages need the let (or where)
syntax. In all respects other than type inference, let a = b in c is
equivalent to (\a.c)b. If you want the type behaviour of the let form,
use it.

The type given to \x.x o x  is Forall t. (t -> t) -> (t -> t), but it
would appear that a more general type is  Forall T,t. (T t -> t) -> (T(T
t) -> t), where T is a type constructor variable, e.g T could
instantiate to List or Tree. etc. Unfortunately, quantification over
type constructors is second order, and undecidable.

--brian
-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000

farrell@batserver.cs.uq.oz.au (Friendless) (01/17/91)

In <3603@bruce.cs.monash.OZ.AU> mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) writes:
>	let x = hd o hd in x
>will typecheck properly - to (List (List *a))->*a) - whereas
>	(\x.x o x) hd
>will not.  The problem with the second expression seems to be that the
>inference algorithm erroneously expects both of the 'x' values in the
>x o x to have the same type signature, 

  This is the correct behaviour of Milner typing - parameters only have one
type. I was surprised too when I found it out, but I think it complicates the
algorithm a lot to fix it.

Friendless

kh@cs.glasgow.ac.uk (Kevin Hammond) (01/17/91)

In article <3603@bruce.cs.monash.OZ.AU> mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) writes:
>There *is* an anomaly in the LML type inference algorithm - one which
>has recently prevented my use of the language for some very elegant
>continuation based solutions to problems.  The simplest form of the
>problem is that the expression
>
>	let x = hd o hd in x
>
>will typecheck properly - to (List (List *a))->*a) - whereas
>
>	(\x.x o x) hd
>
>will not.  

NO, LML is doing the "right thing".  This is not a problem with LML,
it's a problem with Hindley/Milner polymorphic type inference.
let-bound definitions are polymorphic (ignore letrec, which is a
special case), whereas lambda-bound definitions are monomorphic.

The people at Nijmegen (Eric Nocker/Sjaak Smetsers if my memory's
correct) have a more permissive type system implemented for the Clean
GRS (based on "intersection types"), but it has a few vices of its
own...  [try rinus@cs.kun.nl -- Rinus Plasmeijer -- I don't have
Eric's address handy].

Maybe you should be using an untyped language (LML with typechecking
off?).

Kevin
-- 
This Signature Intentionally Left Blank.	E-mail: kh@cs.glasgow.ac.uk

kfr@dip.eecs.umich.edu (K. Fritz Ruehr) (01/19/91)

  mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) writes:

>	I have three questions: can anyone point me to an algorithm or paper
>	describing a polymorphic type system which does not share this
>	anomaly?  Are there any readily available lazy functional languages

  I don't know if he has implemented the type system for a programming
language, but a system of conjuctive types devised by R. Ghosh-Roy at Essex
attempts to resolve just the problem you describe.  The system is written up
in:

  "Conjunction-Type Standard ML Polymorphism", R. Ghosh-Roy.
  LISP and Symbolic Computation: An International Journal, 3, 381-409, 1990.
  Kluwer Academic Publishers.

His address is given at the front of the paper as:

  R. Ghosh-Roy  (ghosr@essex.ac.uk)
  Room Number 4B.526, Department of Computer Science
  University of Essex, Wivenhoe Park
  Colchester, CO4 3SQ, Essex, England, UK

He claims the following results for his system: decidability (of well-typing),
semantic soundness for all types, semantic soundness and completeness for 
basic types.

  --  Fritz Ruehr
      kfr@dip.eecs.umich.edu

doug@nixtdc.uucp (Doug Moen) (01/19/91)

>In article <3603@bruce.cs.monash.OZ.AU>, mmcg@bruce.cs.monash.OZ.AU
>(Mike McGaughey) writes:
>|> There *is* an anomaly in the LML type inference algorithm - one which
>|> has recently prevented my use of the language for some very elegant
>|> continuation based solutions to problems.  The simplest form of the
>|> problem is that the expression
>|> 
>|> 	let x = hd o hd in x
>|> 
>|> will typecheck properly - to (List (List *a))->*a) - whereas
>|> 
>|> 	(\x.x o x) hd
>|> 
>|> will not.  The problem with the second expression seems to be that the
>|> inference algorithm erroneously expects both of the 'x' values in the
>|> x o x to have the same type signature, when in fact it is only the _form_
>|> of the signature that should be shared (perhaps a new set of type
>|> variables should be created for each occurrence of x, and then unified
>|> in the normal way).  Thus the type of x in the expression is given as
>|> (*a->*a).

In article <1991Jan16.225903.2540@comp.vuw.ac.nz>, brian@comp.vuw.ac.nz
(Brian Boutel) writes:
>This ... is quite normal Hindley/Milner typing, and other functional
>languages behave the same way. It is usually expressed by saying that
>the type variables associated with lambda-bound variables are non-generic.
>
>Indeed it is the reason why functional languages need the let (or where)
>syntax. In all respects other than type inference, let a = b in c is
>equivalent to (\a.c)b. ...
>
>The type given to \x.x o x  is Forall t. (t -> t) -> (t -> t), but it
>would appear that a more general type is  Forall T,t. (T t -> t) -> (T(T
>t) -> t), where T is a type constructor variable, e.g T could
>instantiate to List or Tree. etc. Unfortunately, quantification over
>type constructors is second order, and undecidable.

In the latest issue of Lisp and Symbolic Computation (3, 1990),
R. Ghosh-Roy presents a new type-inference system based on conjunctive
types.  His system finds more general types, and can typecheck
a larger set of programs, than the Hindley/Milner system.  His system
is decidable and semantically sound for all types.  As far as I can tell,
'let a = b in c' is completely equivalent to '(\a.c)b' in his system.

The GR system gives '\x.x o x' the type '((*a->*b) + (*b->*c)) -> (*a->*c)'.
The + means type conjunction.  This type can be read as meaning:
  the first occurrence of 'x' in the function has type (*a->*b)
  the second occurrence of 'x' in the function has type (*b->*c)

Under GR, the expression (\x.x o x)(hd) is well typed.  Given:
  (\x.x o x) : ((*a->*b) + (*b->*c)) -> (*a->*c)
  hd : (List *d)->*d
the unification algorithm is able to find the type assignments
  *a = List(List *e)
  *b = List(*e)
  *c = *e
which results in (\x.x o x)(hd) being assigned the type List(List *e)->*e.

So ...
Is conjunctive type inference as good as it seems?
Can we expect a new family of functional languages with conjunctive type
systems to appear?  Can SML (with its module system) and Haskell (with its
type classes) be cleanly extended to use conjunctive types?
-- 
Doug Moen
{mnetor,alias,geac,torsqnt,lsuc}!nixtdc!doug
77 Carlton #1504, Toronto, Ontario, Canada M5B 2J7

ramaer@cs.utwente.nl (Mark Ramaer) (01/21/91)

In article <3603@bruce.cs.monash.OZ.AU>, mmcg@bruce.cs.monash.OZ.AU
(Mike McGaughey) writes:
|> There *is* an anomaly in the LML type inference algorithm - one
which
|> has recently prevented my use of the language for some very elegant
|> continuation based solutions to problems.  The simplest form of the
|> problem is that the expression
|> 
|> 	let x = hd o hd in x
|> 
|> will typecheck properly - to (List (List *a))->*a) - whereas
|> 
|> 	(\x.x o x) hd
|> 
|> will not.  The problem with the second expression seems to be that
the
|> inference algorithm erroneously expects both of the 'x' values in
the
|> x o x to have the same type signature, when in fact it is only the
_form_
|> of the signature that should be shared (perhaps a new set of type
|> variables should be created for each occurrence of x, and then
unified
|> in the normal way).  Thus the type of x in the expression is given
as
|> (*a->*a).
|> 
|> I have three questions: can anyone point me to an algorithm or paper
|> describing a polymorphic type system which does not share this
|> anomaly?  Are there any readily available lazy functional languages
|> which use this system?  Has anyone modified the LML compiler to do
|> this? (looks around hopefully... :-)
|> 
|> Cheers,
|> 
|>     Mike.

Read the PhD thesis "M. Gerritsen, Type Assignment Functions" 1988.
University of Twente

She built a type inference algorithm where the occurrences of variables
not necessarily have the same type.
Instead she uses `conjuction types': *a & *b .
(\x.x o x) has the type ((*b->*c)&(*a->*b))->*a->*c which sais that x
must be
instantiatable as *a->*b and as *b->*c .
Application of (\x.x o x) to hd yields the equations:
	*a->*b = List *d->*d  (and so: *a=List *d,   *b=*d)
	*b->*c = List *e->*e  (and so: *b=List *e,   *c=*e)
The two occurrences of x have types List *d->*d and List *e->*e
respectively.
Solving the equations yields:
	*a=List (List *e)
	*b=List *e
	*c=*e
	*d=List *e
and so
	(\x.x o x) hd::*a->*c = List (List *e)->*e

In this type inference system there is no more need for an explicit
let-construct.
All Milner-typable expressions are also typable in this scheme, but
some
non Milner-typable expressions too.
Try `twice twice k' where
	twice f x = f (f x)
	k x y = x

Mark Ramaer

joosten@cs.utwente.nl (Stef Joosten) (01/23/91)

  mmcg@bruce.cs.monash.OZ.AU (Mike McGaughey) writes:

>	can anyone point me to an algorithm or paper
>	describing a polymorphic type system which does not share this
>	anomaly?

A very nice solution is offered in the thesis of Mirjam Gerritsen:

   M. Gerritsen
   Type Assignment Functions
   PhD thesis, University of Twente, Comp. Sc., the Netherlands, 1988.

postal address:
   Functional Programming Group, Comp. Sc.
   University of Twente,
   postbus 217
   7500 AE  ENSCHEDE
   the Netherlands

In this thesis Mirjam Gerritsen gives a typing algorithm (proof included),
and shows that Mike's problem is covered, and also proves that
Milner/Hindley typing is still completely covered.
Furthermore, she shows how to build an efficient implementation.

Good luck to all of you.
Stef Joosten