[comp.lang.misc] Typing

torek@elf.ee.lbl.gov (Chris Torek) (05/05/91)

[Beware, I am discarding everyone's existing definitions in order to
talk about this in a different way.  I am not sure if this will get
anywhere....]

Maybe I am not close enough to the problem, but it seems to me that the
real distinction between static and dynamic typing is that of decision
time.  There is a continuum here between `very static' and `very
dynamic': the more `static' a language is, the more type decisions it
forces you to make up front.  A completely static type language has you
write the type of *every* operation and operand *every* time (e.g.,
machine language) and a completely dynamic type language never has you
write any types (not even when the operation or operand came from a
completely unknown source, such as user input).

Assembler is often considered `untyped' but the reality is that the
types are not persistent (i.e., it is not the registers and memory
which are considered typed, but rather the operations upon them).
I will have more to say as to what `typing' *means* in a while.

Note that neither of these has all that much to do with type
*checking*.  Typically `static typing' implies `less type checking at
runtime' and dynamic typing implies `more type checking at runtime',
but this is because of, well, see below.  The distinction between
`strong' and `weak' typing, in this classification system, is that
`strong' typing systems bind `operations' and `types' together, and
require (possibly by automatically guaranteeing, possibly by
complaining at compile time or run time, possibly some mix) that
operations and operands have `matching' types (the criteria for what
makes something match are a rather thorny issue as well).  Again there
is a continuum; the weaker the typing, the more freely the language
allows you to mix-and-match.

Often the design of a language guarantees that some condition will not
occur, and often people think of one property as another because *all*
languages that have one have the other.  For instance, `very dynamic'
languages rarely or never have `weak' typing because they usually or
always have dynamic type checking.  But the following might be an
example of a run of a weakly typed, unchecked, dynamic typed language:

	Give me the value of K
	prompt> int 1000000000
	Okay, now I will add K to your inputs using your add op
	prompt> float, float 4
	Sum is 4.004724
	prompt>

Here we told the language at runtime to do a float-add of float-4 and
int-1000000000 but we did not bother to implement type tags, so what we
got was the addition of two bit patterns.  I have cheated here: how did
we know what to print?
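
The run above can be reproduced with a short sketch.  This is Python,
chosen only for illustration, and the helper names int_bits_as_float
and to_single are made up for the example.  It assumes 32-bit ints and
IEEE single-precision floats, and it hard-codes the answer to the
question just asked: the result is printed as a float.

```python
import struct

def int_bits_as_float(n):
    """Reinterpret the 32-bit pattern of a signed int as an IEEE single."""
    return struct.unpack(">f", struct.pack(">i", n))[0]

def to_single(x):
    """Round a Python float (a double) to IEEE single precision."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

k = int_bits_as_float(1000000000)  # the int's bits, read as a float
total = to_single(4.0 + k)         # float-add with no type tags to stop us
print("Sum is %f" % total)         # Sum is 4.004724
```

The bit pattern of int 1000000000 is 0x3B9ACA00, which read as a float
is roughly 0.0047238; hence the sum.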

Now:

In article <OLSON.91May3124040@lear.juliet.ll.mit.edu>
olson@juliet.ll.mit.edu (Steve Olson) writes:
>If one puts an integer into a floating point instruction, one might
>get a deterministic result. But so what?  The result wouldn't be
>meaningful.

But (as Herman Rubin will tell you all too often :-) ) sometimes the
result *is* meaningful.

I think this really is getting somewhere.  What this points out is that
the result of some operation is meaningful if and only if the operation
and its parameters were the kind the programmer *wanted*.  What you
`want' depends on what concept you have of the type.  The object
itself, and the operation on the object, are just a bag of bits and a
defined (whether exactly or loosely) series of manipulations of those
bits.  The meaning of the result is something you assign in your head.
That 4.004724 answer I got above (which really is the result of adding
int-1000000000 to float-4 under IEEE arithmetic on a Sun-4) is just a
collection of bits interpreted as a float.  Indeed, the `4.004724' on
your screen is just a collection of bits; the chain between the bits,
the ASCII characters, the glyphs on the screen, the neural responses of
your retina, and the ultimate result in your head is all that gives it
meaning.

What does this have to do with typing?

A static-typed language makes the assumption that the programmer knows
in advance what type or types he wants.  A dynamic-typed language does
not make this assumption.  The types provided, directly or indirectly,
represent sets of meanings to the user.

A strongly typed language makes the assumption that the programmer does
not mean to `mix up' the types of objects and/or operations.  It will
object to things like `a + b' when a is a string and b is an integer.
A weakly typed language does not make this assumption (but then must
either assign a meaning to the operation, or make it system-dependent).
A better term for this might be `strict typing'.  Strict typing may
come with or without vigorous type checking.  (C can then be called
either strictly typed with lax checking, or weakly typed with strict
checking, depending on how you want to label errors.)  The more strict
it is, the less it lets you get away with (no casts, no automatic
conversions, or whatever).
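
As one concrete point on that scale, here is how Python behaves.  It is
used purely as an illustration and is not one of the languages under
discussion; try_add is a made-up helper.  Python is dynamically typed,
checks at run time, refuses string-plus-integer, but still performs
some automatic numeric conversions.

```python
def try_add(a, b):
    """Return a + b, or the name of the error the language raises."""
    try:
        return a + b
    except TypeError as err:
        return type(err).__name__

print(try_add("abc", 1))    # TypeError: str + int is rejected at run time
print(try_add(1, 2.5))      # 3.5: the int is converted to float for us
print(try_add("abc", "d"))  # abcd
```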

[Note that in order to do automatic conversions in `completely dynamic'
typed languages, you must have runtime type tags.  If you opt not to
provide such conversions you can omit the tags, at the cost of being
able to produce meaningless answers.  If you opt to provide the
conversions Herman Rubin (and others) will want some way of
deliberately lying about the type tags (to produce meaningful but
rather peculiar answers).]
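
A minimal sketch of what the bracketed note describes, in Python and
under stated assumptions: values are a tag plus a raw 32-bit pattern,
and Tagged, make_int, make_float, value, add and retag are all
hypothetical names invented for the example.  The add operation uses
the tags to convert automatically; retag is the deliberate lie.

```python
import struct
from collections import namedtuple

# Hypothetical representation: a type tag plus a raw 32-bit pattern.
Tagged = namedtuple("Tagged", ["tag", "bits"])

def make_int(n):
    # Ints outside the signed 32-bit range are not handled in this sketch.
    return Tagged("int", struct.unpack(">I", struct.pack(">i", n))[0])

def make_float(x):
    # Packing as ">f" rounds the value to IEEE single precision.
    return Tagged("float", struct.unpack(">I", struct.pack(">f", x))[0])

def value(v):
    """Decode the bits according to the tag."""
    fmt = ">i" if v.tag == "int" else ">f"
    return struct.unpack(fmt, struct.pack(">I", v.bits))[0]

def add(a, b):
    """Tag-directed add: int+int stays int, anything else becomes float."""
    if a.tag == "int" and b.tag == "int":
        return make_int(value(a) + value(b))
    return make_float(float(value(a)) + float(value(b)))

def retag(v, tag):
    """Deliberately lie about the type: same bits, different tag."""
    return Tagged(tag, v.bits)

honest = add(make_int(3), make_float(4.0))
lying = add(retag(make_int(1000000000), "float"), make_float(4.0))
print(value(honest))        # 7.0
print("%f" % value(lying))  # 4.004724
```

With the tags honoured, the int is converted before the add.  With the
tag forged, the int's bits are taken as a float and the add proceeds on
the raw pattern.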

The static type, strict checking people believe that:

  - you know what your meanings are already, or can define them all
    at `compile time';

  - you want to introduce redundancy to tell you (early on) when
    you have mixed up your meanings.

The dynamic type, strict (tagged) checking people believe that:

  - you want to be able to add new meanings `on the fly';

  - you want the system to choose the proper meaning of an operation
    based at least in part on the data types involved.

Some people in both groups believe in redundancy, some do not.
Some think it is useful for efficiency and nothing else.

The `Herman Rubins' appear to believe that:

  - the language does not know what the user's meanings are;

  - the user's meanings change from machine to machine.

Most people disagree with both to a large extent.  They would like
to have the language know what they mean, and are often willing to
pay some cost in syntax and/or runtime efficiency to get this.

It is interesting to note that natural languages and bureaucracies
both make extensive use of redundancy.

Well, this was not as coherent as it should be, but then I have not
had breakfast yet.  I apologize for not taking more time to clean this
up....
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

adam@visix.com (05/08/91)

In article <12822@dog.ee.lbl.gov>, torek@elf.ee.lbl.gov (Chris Torek) writes:

[stuff]

Hooray!  Great posting.  Worth saving and rereading.

Adam