[net.lang] Type-System Questions

richw@ada-uts.UUCP (10/24/85)

Does anyone know of any languages that either:

1) Attempts to do compile-time type-checking via "type inference"?
   That is, by analyzing the data flow of a program, the compiler
   tells you when it's sure that a type mismatch occurs -- this allows
   the programmer to not explicitly declare types.  Yes, I know that
   this can lead to situations where a variable may refer to objects
   of several types, and is thus not perfect.  I'm still curious to
   know if systems exist which try this.

or...

2) Includes BOTH compile-time types and run-time, "manifest" types?
   By compile-time type, I mean that the type of an object is only
   known by the compiler during compilation -- at run-time, there
   are no flags (or the like) attached to data marking its type.
   Manifest types refer to objects which do have such flags indicating
   type.  As an example, Lisp objects will have bits associated with
   them indicating whether they're atoms, lists, etc.
       Since I believe both type-systems have advantages, I'm curious
   to see if any languages have tried getting the benefits of both.

No preference for posting responses vs. personal E-mail.

Thanks "in advance",
---  Rich Wagner

guido@boring.UUCP (10/27/85)

In article <15100005@ada-uts.UUCP> richw@ada-uts.UUCP writes:
>Does anyone know of any languages that either:
>1) Attempts to do compile-time type-checking via "type inference"?
>   That is, by analyzing the data flow of a program, the compiler
>   tells you when it's sure that a type mismatch occurs -- this allows
>   the programmer to not explicitly declare types.

"B" does exactly this.  E.g., after looking at
	PUT t[a+1] IN b		\ this is B syntax for   b := t[a+1]
the system knows that 't' is a table (array) whose keys are numbers, that
'a' is a number, and that 'b' has the same type as the elements of 't'.
See reference [1].

>   Yes, I know that this can lead to situations where a variable may refer
>   to objects of several types, and is thus not perfect.

What you see here as a defect can actually be a useful extension to
a language: "generic" procedures are now for free!  This, too is provided
by B.  See also reference [2].

>2) Includes BOTH compile-time types and run-time, "manifest" types?
>   By compile-time type, I mean that the type of an object is only
>   known by the compiler during compilation -- at run-time, there
>   are no flags (or the like) attached to data marking its type.
>   Manifest types refer to objects which do have such flags indicating
>   type.

We are planning an extension to B to allow this, without disturbing
the existent type system.  (We don't have publishable material on this,
yet.)

References:

[1] L.G.L.T. Meertens, Draft Proposal for the B Programming Language.
    Mathematisch Centrum, Amsterdam, 1981.  ISBN 90 6169 238 2.

[2] R. Milner, A theory of type polymorphism in programming, Journal
    of Computer and System Sciences, vol 17 (1978), 348-375.

Disclaimer:

We are well aware of the fact that there is another language called "B",
in use at Waterloo University for system programming on old hardware.
Please don't post followups saying this again!
In the near future, "our" B will be renamed to ABC and at the same time
revised at many minor points.

-- 
	Guido van Rossum, CWI, Amsterdam (guido@mcvax.UUCP)

db@cstvax.UUCP (Dave Berry) (10/28/85)

In article <15100005@ada-uts.UUCP> richw@ada-uts.UUCP writes:
>Does anyone know of any languages that either:
>1) Attempts to do compile-time type-checking via "type inference"?

ML is a strongly-typed functional language with type inference.
See "A proposal for Standard ML" by Robin Milner in the 1984 LISP
conference.  The language is older than that, but has just been standardised.
It includes abstract datatypes, higher-order functions & polymorphism.
You can define your own infix operators, and updatable values are provided
if you need them.
There's also a variant called lazy ML.

The original paper detailing the inference scheme is "A theory of type
polymorphism in programming", by Robin Milner in the Journal of Computer
& System Sciences 17 (1978) pp 348-375.
-- 
	Dave Berry. CS postgrad, Univ. of Edinburgh		
					...mcvax!ukc!cstvax!db

craig@dcl-cs.UUCP (Craig Wylie) (10/28/85)

In reference to your question about languages using compile and runtime
type checking I believe that POLY developed at 
the University of Glagow, Scotland, Great Britain - may include
facilities such as this. There is a book on the subject :-


	Polymorphic programming languages
	Design and implementation 	publ: Ellis Horwood
	by David M. Harland


			Craig.

bob@cstvax.UUCP (Bob Harper) (10/29/85)

In article <15100005@ada-uts.UUCP> richw@ada-uts.UUCP writes:
>
>Does anyone know of any languages that either:
>
>1) Attempts to do compile-time type-checking via "type inference"?
>   That is, by analyzing the data flow of a program, the compiler
>   tells you when it's sure that a type mismatch occurs -- this allows
>   the programmer to not explicitly declare types.  Yes, I know that
>   this can lead to situations where a variable may refer to objects
>   of several types, and is thus not perfect.  I'm still curious to
>   know if systems exist which try this.
>
The programming language ML developed here at Edinburgh C.S. does just that.
The use of type inference in programming was pioneered by Robin Milner in
the design of ML (references below).  ML is a strongly--typed, polymorphic
functional programming language that was originally designed as the meta--
language (hence the name) for LCF, a logic for reasoning about programs.
It has recently been revamped and standardized, and is now known as Standard ML.
The language can be separated into two major components: the core
language and the module facility.  The core language implementation is complete
and available; the module facility implementation is in progress.

Enough advertising.  The type system of ML has the property that no well--typed
program can incur a type error at run time (where a type error includes things
like trying to add a string to a boolean), and furthermore every program has a
unique most--general type that can be automatically inferred without any type
declarations by the user.  The phrase ``most--general'' can be made mathemat--
ically precise, but, roughly speaking, it means that the ML compiler assigns
the broadest meaningful type that it can.  An example will help.  Consider the
function defined as follows:

	fun id(x) = x;

This function takes a value x as argument and returns x as result.  What type
should id have?  It clearly does not care whether the argument is an integer
or a boolean or a function or a list or whatever --- the functional behavior
is the same in all cases.  Therefore the type assigned to id is not, say,
int->int or bool->bool or ... but rather 'a->'a, where 'a (pronounced ``alpha'')
is a type variable ranging over types.  For id to have the type 'a->'a means
that it can be applied to an argument of type 'a (for any type at all), and
it returns a value of type 'a (the same 'a).  This means that we can apply id
to several arguments of vastly different types.  All of the following uses are
legal, and their type is given:

	id(3)			int
	id(true)		bool
	id(id)			('a->'a)->('a->'a) [!]

The combined fact of inferring a most--general type together with the ability
to use functions at several distinct types (all instances of the most general
type) is called polymorphism.  The function id is polymorphic (literally, many
shapes) because it can take and returns values of many different types.  (The
term was coined by Strachey; he distinguished two forms: parametric and ad--hoc
polymorphism; the former is what ML gives, the latter is now called ``overloading'',
whereby the same identifier is used to refer to *different* functions, whereas
here the identifier id refers to exactly *one* function: the compiler does *not*
build different versions of id, one for each type used.  Thus ML provides a much
more elegant way of achieving polymorphism than does, say, ADA, which simply
macro expands each use for each type argument.)

By the way, there is no problem about a variable referring to objects of
several different types.  SML accounts for polymorphic assignment in more
or less the same way as outlined above for functions, and thus preserves
strong typing without sacrificing type inference or compile--time typechecking.

Here are some salient points about SML:

	1. Functions are first--class objects (that is what we mean by saying
	   that SML is a functional language).

	2. SML is statically (lexically) scoped.  (I'll be happy to take on any
	   dynamic scoping fans if they'd like.)

	3. SML has imperative constructs (assignment, iteration), but also
	   has a purely functional subset.

	4. SML has an extensible type system, including simple type abbreviation,
	   definition of new structural types (like trees or whatever), and
	   definition of abstract types (stacks and the like).

	5. SML is polymorphic and provides complete compile--time type inference.

	6. SML has a module facility to support the structuring of large programs
	   and to provide facilities for separate compilation and linking.

This is getting long--winded, so I'll end with a reference list:

1. Robin Milner, ``A theory of type polymorphism in programming'', JCSS 17(3),
December, 1978.

2. Michael Gordon, Robin Milner, and Christopher Wadsworth, ``Edinburgh LCF'',
Springer Verlag LNCS 78 (1979).

3. Luis Damas and Robin Milner, ``Principal type schemes for functional
programs'', Proc. 9th POPL, 1982.

4. Robert Harper, David MacQueen, and Robin Milner, ``The definition of
Standard ML'' (to appear as Edinburgh CS Technical Report).

5. David MacQueen, ``Modules for Standard ML'', Bell Labs, October, 1985.
(see also POPL in January 86).

6. Luca Cardelli, ``Amber'', Bell Labs, 1985.  (a descendant of ML).

>or...
>
>2) Includes BOTH compile-time types and run-time, "manifest" types?
>   By compile-time type, I mean that the type of an object is only
>   known by the compiler during compilation -- at run-time, there
>   are no flags (or the like) attached to data marking its type.
>   Manifest types refer to objects which do have such flags indicating
>   type.  As an example, Lisp objects will have bits associated with
>   them indicating whether they're atoms, lists, etc.
>       Since I believe both type-systems have advantages, I'm curious
>   to see if any languages have tried getting the benefits of both.
>
>No preference for posting responses vs. personal E-mail.
>
>Thanks "in advance",
>---  Rich Wagner

Luca Cardelli's language Amber, a direct descendant of ML, provides for
an elegant scheme of associating a type with a value at run time in such a
way that strong typing is preserved.  Objects with a type attached are called
``dynamics'' because they have a dynamically (rather than statically) determined
type.  Amber successfully combines polymorphism, type inference, type inheritance
(as in Smalltalk), and dynamic types into a single well--defined, manageable
type system.

Bob Harper

UUCP:	... mcvax!ukc!{hwcs,kcl-cs,ucl-cs,edcaad}!cstvax!bob
ARPA:	bob%cstvax.ed@ucl-cs.arpa
tel:	+44 31 667 1081 ext. 2697
mail:	Department of Computer Science, University of Edinburgh, The King's
	Buildings (JCMB), Mayfield Road, Edinburgh EH9 3JZ, Great Britain

-- 
Bob Harper

UUCP:	... mcvax!ukc!{hwcs,kcl-cs,ucl-cs,edcaad}!cstvax!bob
ARPA:	bob%cstvax.ed@ucl-cs.arpa
tel:	+44 31 667 1081 ext. 2697
mail:	Department of Computer Science, University of Edinburgh, The King's
	Buildings (JCMB), Mayfield Road, Edinburgh EH9 3JZ, Great Britain

msp@ukc.UUCP (M.S.Parsons) (10/29/85)

In article <6@cstvax.UUCP> db@cstvax.UUCP (Dave Berry) writes:
>In article <15100005@ada-uts.UUCP> richw@ada-uts.UUCP writes:
>>Does anyone know of any languages that either:
>>1) Attempts to do compile-time type-checking via "type inference"?
>
>ML is a strongly-typed functional language with type inference.
>
Another language that does this is "Miranda", developed by David Turner
at the University of Kent. Ref - Miranda: a non strict functional language with 
polymorphic types, in Functional Programming Languages and Computer Architecture
(Lecture Notes in Computer Science 201), Springer-Verlag 1985

rcd@opus.UUCP (Dick Dunn) (10/30/85)

> Does anyone know of any languages that either:
> ...(item deleted)
> or...
> 
> 2) Includes BOTH compile-time types and run-time, "manifest" types?
>    By compile-time type, I mean that the type of an object is only
>    known by the compiler during compilation -- at run-time, there
>    are no flags (or the like) attached to data marking its type.
>    Manifest types refer to objects which do have such flags indicating
>    type...

Terminology problem here:  "Manifest" types are types known to the
compiler.  Types known only at runtime are called "latent" types.
Roughly the same meanings are associated with "static types" (known to the
compiler) and "dynamic types" (known at runtime; more to the point,
changeable at runtime).

There are various examples of languages which have both manifest and latent
types.  In the common examples, latent types occur only in a few well-
defined situations--in order to allow the compiler to make as much use of
manifest type information as possible.

Pascal's variant record is a weak example; the active variant of the
record is determined by a "flag", namely the tagfield just ahead of the
variant portion of the record.  I say it's a weak example because the
burden is on the programmer to examine the tagfield and act accordingly.

A better example is the ALGOL 68 union.  Access to a particular element
("variant" as it were) of the union is achieved by means such as a
case-conformity clause, which is a case statement whose selection is based
on the (hidden) type indicator of a union.  (Yes, I know I should be saying
things like "object" and "united from types..."  Forgive me, ALGOL 68
aficionados; I'm trying to use common terms.)  In a case-conformity, you
can't get to a section of code that assumes an inappropriate type.  Another
method of access is the "conforms-to-and-becomes", an assignment operator
that guards against inappropriate type.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...At last it's the real thing...or close enough to pretend.

richw@ada-uts.UUCP (10/31/85)

First of all, thanks for all the replies; I've got quite a bit of
reading to do...

>>> 2) Includes BOTH compile-time types and run-time, "manifest" types?

I apologize for not being more specific about "manifest" types
(see below for a reference to the text from which I grabbed the term).
I realize now that unions (of C) or variants (of Pascal) or
oneofs/variants (of CLU) might be lumped in this category.
These aren't quite what I was curious about because they
are types which represent a union of a _finite_ set of other types.
I was curious about data which stored the name (or something)
of its type, and thus could be ANY one type of the "infinite"
number of types available (assuming the language provided for
user-defined types).

Since the term, "manifest", apparently isn't as universal as
I'd assumed:

> Terminology problem here:  "Manifest" types are types known to the
> compiler.  Types known only at runtime are called "latent" types.
> Roughly the same meanings are associated with "static types" (known to the
> compiler) and "dynamic types" (known at runtime; more to the point,
> changeable at runtime).

...let me quote my source. I used the term "manifest" after seeing
it in:

    "Structure and Interpretation of Computer Programs",
     by Abelson, Sussman, & Sussman

Specifically, page 132 says:

"A data object that has a type that can be recognized and tested
is said to have _manifest_ type."

The text then goes on to describe a scheme (no pun intended) for
doing generic arithmetic using tagged data.

Again, thanks for the replies.

-- Rich Wagner