[comp.lang.misc] An Interesting View of "Strong" Vs. "Weak" Typing

eberard@ajpo.sei.cmu.edu (Edward Berard) (01/05/90)

Folks,

I recently posted a large message to comp.object on the evaluation of
object-oriented programming languages. I have received a number of
responses. I would be interested in getting your reaction to the
following quotes from one of the respondents:

>>         object-oriented programming. Smalltalk has, in effect, no
>>         types, but there are typed extensions to Smalltalk, e.g.,
>
> Again, CL and Smalltalk have *run-time* types - restrictions on
> the use of *objects* during program *execution*.  Languages such
> as Ada have *compile-time* types - restrictions on the use of
> *identifiers* during program *compilation.*

[...]

>
> More accurately, run-time typing places the burden on the run-time
> system, whereas compile-time typing places the burden on the
> compiler *and* the programmer (who must provide explicit type
> declarations and obey them at all times during program
> development).  I would ask again what solid *experimental*
> evidence anyone has to show that compile-time typing actually
> results in programs more reliable than those possible with
> run-time typing, *in the presence of the requirements evolution
> that is present in any large, complex software project*.

Are there any references documenting the benefits of strong vs. weak
typing? What are your reactions to the definitions given?

                                -- Ed Berard
                                   (301) 353-9652

P.S.: For those who don't know, "CL" stands for Common Lisp.

david@cs.washington.edu (David Callahan) (01/05/90)

>> More accurately, run-time typing places the burden on the run-time
>> system, whereas compile-time typing places the burden on the
>> compiler *and* the programmer (who must provide explicit type
>> declarations and obey them at all times during program
>> development). 

This statement is misleading. Many research languages perform
compile-time type checking based on type-inference algorithms while
allowing a powerful, polymorphic type system. Although research
indicates that type inference at reasonable generality is undecidable,
this does not seem to be a tremendous problem in practice.

-- 
David Callahan  (david@tera.com, david@june.cs.washington.edu,david@rice.edu)
Tera Computer Co. 	400 North 34th Street  		Seattle WA, 98103

reino@cs.eur.nl (Reino de Boer) (01/05/90)

eberard@ajpo.sei.cmu.edu (Edward Berard) writes:

>I would be interested in getting your reaction to the
>following quotes from one of the respondents:

>>>         object-oriented programming. Smalltalk has, in effect, no
>>>         types, but there are typed extensions to Smalltalk, e.g.,
>>
>> Again, CL and Smalltalk have *run-time* types - restrictions on
>> the use of *objects* during program *execution*.  Languages such
>> as Ada have *compile-time* types - restrictions on the use of
>> *identifiers* during program *compilation.*

I tend to agree with `>>>' that Smalltalk has no types. What feels like
run-time typing is really an object's `not being able to respond to
certain messages'.

>Are there any references documenting the benefits of strong vs. weak
>typing? What are your reactions to the definitions given?

My reaction:
I think languages like Miranda have shown us that the effects of
run-time type-checking can for a very large part be accomplished by
compile-time evaluation of possible types. Although something can be
said in favor of the statement
	all interpreters (CL, Smalltalk, etc.) by definition do only
	run-time checking (not being able to compile at all).

P.S. Thank you for your overview of Object-Oriented Design.

Reino


-- 
Reino R. A. de Boer
Erasmus University Rotterdam ( Informatica )
e-mail: reino@cs.eur.nl

nick@lfcs.ed.ac.uk (Nick Rothwell) (01/05/90)

In article <641@ajpo.sei.cmu.edu>, eberard@ajpo (Edward Berard) writes:
>I would be interested in getting your reaction to the
>following quotes from one of the respondents:
>> More accurately, run-time typing places the burden on the run-time
>> system, whereas compile-time typing places the burden on the
>> compiler *and* the programmer (who must provide explicit type
				      ^^^^^^^^^^^^...
>> declarations and obey them at all times during program
>> development).

This is false for a polymorphic type system.

>> I would ask again what solid *experimental*
>> evidence anyone has to show that compile-time typing actually
>> results in programs more reliable than those possible with
>> run-time typing, *in the presence of the requirements evolution
>> that is present in any large, complex software project*.

Do you mean "practical experience"? I wrote 8000 lines in ML, and
hand-translated to 20000 lines of C, in 4 months, resulting in a
reliable piece of software. I don't know how to quantify that in any
objective way, though.

I suggest that compile-time typing serves the same purpose as
module interfaces: it lets you decide what you want different parts of
a project to see, and what operations you want to provide (I'm
thinking of a parameterised, abstract type system here).

		Nick.
--
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
  "...all these moments... will be lost in time... like tears in rain."

gudeman@cs.arizona.edu (David Gudeman) (01/06/90)

[anonymous quotes...]
>>>         object-oriented programming. Smalltalk has, in effect, no
>>>         types, but there are typed extensions to Smalltalk, e.g.,
>>
>> Again, CL and Smalltalk have *run-time* types - restrictions on
>> the use of *objects* during program *execution*.  Languages such
>> as Ada have *compile-time* types - restrictions on the use of
>> *identifiers* during program *compilation.*

I think there is some confusion here between "having types" and "being
typed".  In particular, it is possible for a language to have types
(as Smalltalk and CL do) and still to be untyped (as Smalltalk and CL
are).  These languages are not "weakly typed", they are "untyped".

All languages "have types", in the sense that values in the language
can be categorized (even if there is only one category).  CL has the
types "integer", "real", "string", "cons", etc.

A language is said to "be typed" if the identifiers of the language
(variable, function names, etc.)  are constrained to represent values
of specific types.  The difference between strong and weak typing
involves how much you can say about the types of values that the
identifiers can take.

Here are examples in C (a typed language) and Icon (an untyped
language with a syntactic resemblance to C):

  int add(a,b) int a,b; {return a + b;}   /* C */

  procedure add(a,b) return a + b end     # Icon

The C procedure is typed because the identifiers "a" and "b" are
restricted to represent values of type int, and the identifier "add"
is restricted to represent a value of type (int X int) -> int
(actually the type restriction on "add" is a rather trivial one since it
is a constant).

In the Icon procedure, "a" and "b" may have values of any type: int,
real, or even procedure.  If one or both of them is not numeric (or
convertible to numeric) then a run-time error will occur when you try
to add them.  Likewise, the identifier "add" in the Icon program is
untyped.  It may take on a value of a different type (even int) during
the execution of the program.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/06/90)

In article <10287@june.cs.washington.edu> david@june.cs.washington.edu (David Callahan) writes:
>>> More accurately, run-time typing places the burden on the run-time
>>> system, whereas compile-time typing places the burden on the
>>> compiler *and* the programmer (who must provide explicit type
>>> declarations and obey them at all times during program
>>> development). 
>
>This statement is misleading. Many research languages perform
>compile-time type checking based on type-inference algorithms and
>allowing a powerful, polymorphic type system. While research work
>indicates that type-inference at reasonable generality is undecidable,
>it does not seem to be a tremendous problem in practice.

Consider a dynamically constructed list of objects such that:

a) The types of the elements have no necessary relation - i.e.,
the list is totally heterogeneous;

b) The types of the elements are not knowable at compile time, but
are each knowable at run time - i.e., a TYPE-OF or CLASS-OF
function is available;

c) The length of the list is not knowable at compile time, but is
knowable at run time.

Programs in run-time typed languages make ubiquitous use of such
lists.  If one cannot create and manipulate such lists easily and
efficiently in the research languages you describe, I suggest that
there is indeed a "problem in practice."  The distinction between
compile-time typing and run-time typing remains a valid one.
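
For concreteness, here is a small sketch in Common Lisp of such a list
and its use (the data are arbitrary, and any implementation should
accept it):

        ;; A heterogeneous list whose length and element types are
        ;; known only at run time.
        (defvar *items* (list 42 "forty-two" 'forty-two 4.2 #'+))

        ;; TYPE-OF (or CLASS-OF) lets the program dispatch on each
        ;; element's run-time type, with no declarations anywhere.
        (dolist (x *items*)
          (format t "~S has type ~S~%" x (type-of x)))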

Moreover, a programming environment which requires specification
of all program parts in advance of compilation (e.g., in order to
perform type inferencing across function calls) faces essentially
the problems of compile-time typing:  hindrance of incremental
system development and obstruction of smooth system evolution.

Type inferencing within a function, or on programmer request
(e.g., via optional declarations), is of course a welcome optional
optimization.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/06/90)

In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>compile-time evaluation of possible types. Although something can be
>said in favor of the statement
>	all interpreters (CL, Smalltalk, etc.) by definition do only
>	run-time checking (not being able to compile at all).

Common Lisp programs are typically compiled into native machine
instructions.  For example, one typical implementation on the
SPARC architecture compiles the following function

        (defun add (x y)
          (+ x y)
          )

into exactly 11 machine instructions.  Such a function properly
computes the sum of any two numbers - integers of any size,
floating-point, rational, or complex.  (Only 5 instructions are
executed if each argument happens to be an integer that fits into
a single machine word.)  If one annotates this definition with
optional type declarations thus:

        (defun add (x y)
          (declare (fixnum x y))
          (the fixnum (+ x y))
          )

the compiler emits exactly 4 machine instructions.  (If this latter
function is passed arguments of unexpected types - e.g., the rational
numbers 4/5 and 7/8 - the run-time system traps the attempt and
asks the user whether it should proceed with the addition anyway.  This
exception-handling behavior can of course be overridden by the
programmer.)

An interpreter is usually also available but most CL programmers
use it primarily to execute top-level environment commands such as
ED, COMPILE-FILE, LOAD, TRACE, DESCRIBE, INSPECT, and DISASSEMBLE.
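
Anyone who wants to check such instruction counts can do so directly;
the exact output of DISASSEMBLE is of course implementation- and
machine-specific:

        (compile 'add)        ; compile the definition to machine code
        (disassemble 'add)    ; print the implementation's assembly for it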


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/06/90)

In article <1493@castle.ed.ac.uk> nick@lfcs.ed.ac.uk (Nick Rothwell) writes:
>Do you mean "practical experience"? I wrote 8000 lines in ML, and
>hand-translated to 20000 lines of C, in 4 months, resulting in a
>reliable piece of software. I don't know how to quantify that in any
>objective way, though.

Since ML and C are both essentially compile-time typed, your
experience is not a comparison as such between compile-time and
run-time typing.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

keith@curry.uchicago.edu (Keith Waclena) (01/07/90)

In article <16622@megaron.cs.arizona.edu>, gudeman@cs (David Gudeman) writes:
>I think there is some confusion here between "having types" and "being
>typed".  In particular, it is possible for a language to have types
>(as Smalltalk and CL do) and still to be untyped (as Smalltalk and CL
>are).  These languages are not "weakly typed", they are "untyped".
> [...]
>A language is said to "be typed" if the identifiers of the language
>(variable, function names, etc.)  are constrained to represent values
>of specific types.  The difference between strong and weak typing
>involves how much you can say about the types of values that the
>identifiers can take.
>

Don't you think the distinction between ``dynamic typing'' and
``static typing'' is a useful one?  The distinction is, who knows the
types: the compiler or the run-time system?

I would say that Icon is weakly typed statically and strongly typed
dynamically, in that identifiers aren't type-constrained in any way,
and yet the Icon runtime system knows the types of all values at all
times.

I would say that C is weakly typed (or untyped) dynamically, and
strongly-typed (but not too strongly) statically, in that identifiers
in C are somewhat type-constrained, and yet the C runtime system knows
nothing at all about the types of values.

In other words, there are two sorts of things that can have varying
degrees of ``typing'' applied to them: L-values (i.e., locations or
identifiers) and R-values (i.e., values, or data).  Dynamic (runtime)
typing refers to the typing of the R-values, while static (compile
time) typing refers to the typing of the L-values.

I think this nicely ties together some of the other points people have
made in this thread: dynamic typing implies tagging your data, and
happens to be most often done in interpreter-based implementations,
while static typing implies some amount of compiler-checking, perhaps
as elaborate as a polymorphic type inference system.  If you want to
do no runtime checking, you can substitute static checking (at the
cost of some flexibility); if you want to do no compile-time checking,
you can substitute run-time checking at the cost of some runtime
speed.  And of course, you can mix the two approaches as much as you
like.
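
To make the R-value side concrete, here is a quick Common Lisp
illustration (the exact type names TYPE-OF reports vary by
implementation):

    (let ((x 42))               ; the identifier X is not type-constrained...
      (print (type-of x))       ; => FIXNUM (or (INTEGER ...), etc.)
      (setf x "now a string")   ; ...it may later hold a value of another type
      (print (type-of x)))      ; => SIMPLE-STRING or similar: the *values*
                                ;    carry the type information, via runtime tags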

Here are some example languages and the sort of type checking they do
(we could of course argue about where a given language falls on the
continuum between ``strong'' and ``weak''); the ``Traditionally
compiled?'' column indicates whether implementations of that language
usually compile native code or interpret pseudocode (of course, any
language can go either way).

                                       Traditionally
Language        Dynamic         Static   Compiled?   Comments
--------        -------         ------   ---------   -------------------------
BCPL            Untyped         Untyped     yes      Only value is bit pattern
Forth           Untyped         Untyped     no       Ditto
C               Untyped         Moderate    yes
Icon            Very strong     Untyped     no
Scheme          Very strong     Untyped     no
Miranda         Weak?           Strong      no       Polymorphic type inference
Lazy ML         Weak?           Strong      yes      Polymorphic type inference


(I'm not *sure* that Miranda and Lazy ML do no runtime tagging...)

Clearly, most languages that provide a type-query function that
returns the type of any value at run-time must be said to be strongly
typed in *some* sense; I'd say that sense is ``dynamically'', because
I'd guess such languages use runtime tags.  For example, Icon has
``type''; Common Lisp has ``type-of''; Snobol4 has something similar.
(However, I think Algol 68 went to some pains to allow a typeof
function that worked without runtime tags, using instead some unusual
language structures (like the case-type statement that was used with
unions).  Does anyone know about this?)

BCPL and Forth have similar typing approaches, though one is usually
compiled and the other usually interpreted.  Scheme and Miranda are
similar languages, usually interpreted, which take (mutually) opposite
approaches to typing.

Note that in languages like BCPL and Forth, which are completely
untyped, it is the programmer who supplies the type information by
choosing the appropriate operators to apply to particular values.

(I don't suppose it makes much sense to apply the adverb
``traditionally'' to recent languages like Miranda and Lazy ML, but I
hope you get the picture.)

							Keith

--
Keith WACLENA                             keith@curry.uchicago.edu
CILS / TIRA / U of Chicago                keith%curry@uchimvs1.bitnet
1100 E.57th.St Chi IL 60637 USA           ...!uunet!curry.uchicago.edu!keith

"Immer Sicherheitsfruchthalter benutzen!"               -- A. Boerner, Gmbh.

gudeman@cs.arizona.edu (David Gudeman) (01/09/90)

In article  <7049@tank.uchicago.edu> keith@curry.uchicago.edu (Keith Waclena) writes:

>Don't you think the distinction between ``dynamic typing'' and
>``static typing'' is a useful one?  The distinction is, who knows the
>types: the compiler or the run-time system?

Well, you make a good case for the distinction, but I have some
reservations.  First, the distinction between compiler and run-time
system is itself somewhat artificial.  Granted that the large majority
of modern programming languages make this distinction at some level,
the distinction is not fundamental to the idea of a programming
language.  Rather, it is an artifact of implementation (or so I
claim).

>I would say that Icon is weakly typed statically and strongly typed
>dynamically...
>
>I would say that C is weakly typed (or untyped) dynamically, and
>strongly-typed (but not too strongly) statically...

This last sentence bothers me.  It would be quite simple to implement
a C operator such as "(typeof) x" that returns some representation of
the type of the variable x.  Granted this operator _does_ have to
refer to the type information gathered at compile time (it probably
would be treated as a compile-time constant, like "(sizeof) x"), but
the type of x is still available at run-time.

I think a better method to make this sort of distinction is to use the
traditional definition of "typed" vs. "untyped" languages that I
described previously, and further to distinguish between kinds of type
systems.  Define "tagged type systems" as systems where the
representation of the data uniquely identifies both the type of the
data and its value.  Define "untagged type systems" as those systems
where representations are overloaded (i.e.: a single representation
stands for a different value in several different types), and the type
must be known before it is possible to determine exactly what value is
being represented.
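
To make the contrast concrete, here is a Common Lisp fragment (Lisp
being a tagged system); the bit-pattern comment assumes IEEE
single-precision floats and is only meant to suggest how an untagged
system would overload the same representation:

  (type-of 3.0f0)       ; => SINGLE-FLOAT: the datum identifies its own type
  (type-of 1077936128)  ; => FIXNUM, BIGNUM, or (INTEGER ...), per implementation

  ;; On many machines #x40400000 (= 1077936128) is also the bit pattern
  ;; of the single float 3.0.  In an untagged system those 32 bits alone
  ;; do not determine a value; the type must be supplied from outside.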

The advantage of this terminology is that it does not refer to the
distinction between compiler and run-time system, except where the
distinction is an inherent part of what you are talking about.  That
is, you _have_ to distinguish between compiler and run-time system in
order to talk about whether types are known at compile time, but it is
not necessary to make this distinction when you are talking about the
way data is represented.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

nick@lfcs.ed.ac.uk (Nick Rothwell) (01/09/90)

In article <12610@cbnewsc.ATT.COM>, lgm@cbnewsc (lawrence.g.mayka) writes:
>Since ML and C are both essentially compile-time typed, your
>experience is not a comparison as such between compile-time and
>run-time typing.

True, but there's a difference between a language "being typed" and a
language providing a rigorous and structured framework in which to
construct, reason about, and exercise abstractions and interfaces.  I
can't do the latter in C even though it knows about integers, reals,
records etc. since the type "system" isn't sufficiently powerful
to handle interfaces and abstraction.

>	Lawrence G. Mayka

		Nick.
--
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
  "...all these moments... will be lost in time... like tears in rain."

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/10/90)

In article <16678@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>I think a better method to make this sort of distinction is to use the
>traditional definition of "typed" vs. "untyped" languages that I

The difficulty with these definitions is that they so easily
mislead people.  "Conventional wisdom" has decided - based on
comparisons between strong and weak compile-time typing - that
Types are Good and therefore, Strong Types are Better.  This
"conventional wisdom" is then often applied disparagingly to
run-time typing.  Run-time typing encourages a programming style
radically different from that of "weak" compile-time typing and
should not be placed in the same category.

>systems.  Define "tagged type systems" as systems where the
>representation of the data uniquely identifies both the type of the
>data and its value.  Define "untagged type systems" as those systems
>where representations are overloaded (i.e.: a single representation
>stands for a different value in several different types), and the type
>must be known before it is possible to determine exactly what value is
>being represented.

Any of these seems a reasonable and roughly equivalent distinction:

compile-time typing vs run-time typing
static typing vs. dynamic typing
untagged typing vs. tagged typing
identifier typing vs. object typing
textual typing vs. behavioral typing

	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

sakkinen@tukki.jyu.fi (Markku Sakkinen) (01/10/90)

In article <16678@megaron.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <7049@tank.uchicago.edu> keith@curry.uchicago.edu (Keith Waclena) writes:
> [...]
>> [...]
>>I would say that C is weakly typed (or untyped) dynamically, and
>>strongly-typed (but not too strongly) statically...
>
>This last sentence bothers me.  It would be quite simple to implement
>a C operator such as "(typeof) x" that returns some representation of
>the type of the variable x.  Granted this operator _does_ have to
>refer to the type information gathered at compile time (it probably
>would be treated as a compile-time constant, like "(sizeof) x"), but
>the type of x is still available at run-time.

I think Waclena's original characterisation is accurate.
Gudeman is thinking about what C could be if it were different!
The semantics of C is very heavily based on the premise that
there is no run-time descriptive information associated with
any programme objects. So, if one extended C with a 'typeof'
operator, an expression like "typeof *p" could only give the base type
of p, _not_ the type of the object to which p points at run time.

One reason why I do not consider C++ a good object-oriented language
is that the above characterisation holds for C++ as well.
It, too, keeps no run-time type information except for the bare minimum
needed to handle virtual functions.

There _are_ strongly/statically typed languages that also have complete
run-time type information and a 'typeof' operator. One example is the
object-oriented language Mode, created by Juha Vihavainen at the
University of Helsinki (vihavain@uhecs.helsinki.fi).

Markku Sakkinen
Department of Computer Science
University of Jyvaskyla (a's with umlauts)
Seminaarinkatu 15
SF-40100 Jyvaskyla (umlauts again)
Finland

rh@ecs.soton.ac.uk (R Harrison) (01/10/90)

In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>eberard@ajpo.sei.cmu.edu (Edward Berard) writes:
>
>>Are there any references documenting the benefits of strong vs. weak
>>typing? 

The advantages of strong typing have been recognised for some time
- consider, for example, Gries and Gehani ("Some ideas on data types
in high level languages", CACM 20, 1977, pp. 414-420), in which the case
for controlled polymorphic programming is discussed.

 Rachel Harrison        rh@uk.ac.soton.ecs 
 Department of Electronics and Computer Science,
 University of Southampton, Southampton SO9 5NH, UK 
 Tel. (0703) 593055  International 44 703 593055 
 Fax 0703 593045 International Fax 44 703 593045

throopw@sheol.UUCP (Wayne Throop) (01/11/90)

> From: nick@lfcs.ed.ac.uk (Nick Rothwell)
> there's a difference between a language "being typed" and a
> language providing a rigorous and structured framework in which to
> construct, reason about, and exercise abstractions and interfaces.  I
> can't do the latter in C even though it knows about integers, reals,
> records etc. since the type "system" isn't sufficiently powerful
> to handle interfaces and abstraction.

C (at least draft ANSI C) also "knows about" prototypes, structure types,
and pointers to unelaborated structure types.  These tools are enough
to "handle" interface and data abstraction well enough to get by, but
of course these tools are (at best) ad hoc.  So while I agree that C does
NOT have a very good type description notation, it has enough to get
by for many practical uses.

But more generally, I think that the issue of type "strength" is being
confused by a difference between language examples I don't see brought
out in the tables in any of the articles posted so far.  That is the
distinction between languages where names are bound to containers (such
as FORTRAN) and languages where names are bound to values (such as
LISP).  This is related to the issue of type tagged languages, but it
isn't quite the same issue.  In some languages, containers are typed, in
others values are typed.  In languages where containers are typed, it is
possible that containers be referred to by values, and hence either of
these kinds of languages can be tagged to one extent or another. 

So we have compiletime/runtime, static/dynamic, non-tagged/tagged,
named-container/named-value, typed-container/typed-value distinctions
to worry about, and strongly-typed/weakly-typed is not strictly tied
to any of these distinctions (which makes the strong/weak distinction
all the harder to characterize).  (And we COULD bring in
descriptive/procedural, modular/monolithic namespace, and so on and on.)
And to top it off, all of these distinctions are more or less fuzzy.


I try nowadays not to use the terms "strongly typed" or "weakly typed",
because so many people mean so many things by them.  If I had to state
what they meant to me, I'd say something like

    Strongly typed languages are those languages where the correspondence
    between typed values and the operations the programmer intends to allow
    on those typed values can be made known to the language system, and
    weakly typed languages are those languages where this correspondence
    must be managed by the programmer. 

Under this definition of "strong" and "weak" types, I'd say that C is a
somewhat strongly typed language which is often used as a somewhat
weakly typed language by people who take advantage of the traditional
lack of diagnostic messages produced by the traditional language
processors for the language. 

Or to put it another way, C is like a machine tool from which the
safety shields can be removed (that is, one isn't forced to run lint).
While this is intended to be done only for efficiency's sake, and only
when the area is cleared (when producing object code for source code
known not to violate the rules of the language), many operators run
with shields removed.  It's really quite fortunate that C compilers
don't snatch off fingers or whole limbs like machine tools do.
They just produce bad code.


> From: keith@curry.uchicago.edu (Keith Waclena)
> [.. A typeof operator for the C language ..]
> can't be done as a compile-time operator (like
> sizeof) except for identifiers (see below), while the type-of-like
> operators of Icon and Common Lisp work on values.  Because C allows
> unions and pointers you would have to have runtime tags to do this.
> I think the real distinction is between associating type information
> with identifiers (l-values) and associating it with data (r-values).
> A language can do one, or the other, or both, and it can do either
> strongly or weakly.

Actually, a language can associate type information with identifiers,
containers, or values.  C associates types with containers and values,
but sometimes the values of certain expressions are undefined, which
can happen in several ways.

So, the thing to note is that a typeof operator COULD be added to
C as a compile-time operation on both r-values and l-values.  There
is no ambiguity about what type any expression or identifier has...
unions and pointer casts only add ambiguity about whether the values
of some expressions are defined.

It is this property of C (and most algol-related languages' type
systems) that leads me to my contention that strength of typing isn't
related only to how much one knows about the type of various
expressions.  It seems to me that in essentially all algol-related
languages (those without admixtures of LISP or functionally pure
notation at least), this information is always statically known, and is
unambiguous.  Yet in C for example, one must keep track "by hand" of
what operations are allowed at what times, because of the ambiguities
introduced by pointer casts and unions (among other things). 
--
Wayne Throop <backbone>!mcnc!rti!sheol!throopw or sheol!throopw@rti.rti.org

keith@curry.uchicago.edu (Keith Waclena) (01/11/90)

In article <16678@megaron.cs.arizona.edu>, gudeman@cs (David Gudeman) writes:
>In article  <7049@tank.uchicago.edu> keith@curry.uchicago.edu (Keith Waclena [me]) writes:
>
>>Don't you think the distinction between ``dynamic typing'' and
>>``static typing'' is a useful one?  The distinction is, who knows the
>>types: the compiler or the run-time system?
>
>Well, you make a good case for the distinction, but I have some
>reservations.  First, the distinction between compiler and run-time
>system is itself somewhat artifical.

This is true; the distinction is barely there in a language like
Forth, and is fuzzed in a language like Miranda (where some code
optimization happens the first time a particular piece of (combinator)
code is executed).

>
>>I would say that C is weakly typed (or untyped) dynamically, and
>>strongly-typed (but not too strongly) statically...
>
>This last sentence bothers me.  It would be quite simple to implement
>a C operator such as "(typeof) x" that returns some representation of
>the type of the variable x.  Granted this operator _does_ have to
>refer to the type information gathered at compile time (it probably
>would be treated as a compile-time constant, like "(sizeof) x"), but
>the type of x is still available at run-time.

I'm not sure in what sense it would be simple to implement a typeof in
C (only a compiler-writer could do so; you can't implement typeof *in
the language*).  It can't be done as a compile-time operator (like
sizeof) except for identifiers (see below), while the type-of-like
operators of Icon and Common Lisp work on values.  Because C allows
unions and pointers you would have to have runtime tags to do this.

>
>The advantage of this terminology is that it does not refer to the
>distinction between compiler and run-time system, except where the
>distinction is an inherent part of what you are talking about.

Perhaps I was being a little too informal in using those terms.  I
think the real distinction is between associating type information
with identifiers (l-values) and associating it with data (r-values).
A language can do one, or the other, or both, and it can do either
strongly or weakly.  For an interesting article discussing this, see:


    %A Hamish I. E. Gunn
    %A David M. Harland
    %T Degrees of Constancy in Programming Languages
    %J Information Processing Letters
    %V 13
    %N 1
    %D 27 October 1981
    %P 35-38
    %K dynamic constants; tagged architectures


/Keith

--
Keith WACLENA                             keith@curry.uchicago.edu
CILS / TIRA / U of Chicago                keith%curry@uchimvs1.bitnet
1100 E.57th.St Chi IL 60637 USA           ...!uunet!curry.uchicago.edu!keith

"Immer Sicherheitsfruchthalter benutzen!"               -- A. Boerner, Gmbh.

reino@cs.eur.nl (Reino de Boer) (01/12/90)

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) writes:

>In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>>	all interpreters (CL, Smalltalk, etc.) by definition do only
>>	run-time checking (not being able to compile at all).

>Common Lisp programs are typically compiled into native machine
>instructions. 
>An interpreter is usually also available but most CL programmers
>use it primarily to execute top-level environment commands such as
>ED, COMPILE-FILE, LOAD, TRACE, DESCRIBE, INSPECT, and DISASSEMBLE.

The point I was trying to make is that an interpreter can BY DEFINITION
do no compile-time checking.
Some `compilers' allow a form of run-time checking because they include
a run-time interpreter in the compiled code. I would like to classify
these compilers under interpreters.
Some `interpreters' allow a form of compile-time checking because they
interpret some kind of internal code. Depending on the semantics of the
internal code, these could be classified under compilers.

The only reasonable way for a compiler to allow run-time type checking
would be to include type-checking code in the compiled code.
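
For example (a Common Lisp sketch; CHECK-TYPE is standard, and the
checks survive compilation):

	(defun safe-add (x y)
	  ;; These checks are part of the compiled code and run on every call.
	  (check-type x number)
	  (check-type y number)
	  (+ x y))

Even without the explicit checks, a generic + typically compiles into
code that inspects the tags of its operands at run time.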

However, there is more than one issue being discussed here:
1. What would be reasonable definitions for
   o run-time type checking;
   o compile-time type checking.
2. Given a solution to (1), what are the advantages and disadvantages of
   both kinds of type-checking.
3. Given an answer to (2),
   o what would be the best way to let a programmer influence the
	 checking mechanism (e.g., type declarations as in Pascal, or
	 polymorphic typing as in Miranda)
   o does type checking have to be `strong' or `weak' (and what would be
	 suitable definitions for `strong' and `weak').
	 Example:
	 Strong type checking in Pascal would give the programmer the
	 opportunity to get a compiler error on the following:
	 program wrong( output );
	 type distance = real;
		  money    = real;
	 var  d        : distance;
		  m        : money;
	 begin d := 5.5; m := 10.0; writeln( d + m : 10 : 2 ) end.

Any comments? -- Reino

-- 
Reino R. A. de Boer
Erasmus University Rotterdam ( Informatica )
e-mail: reino@cs.eur.nl

franka@mentor.com (Frank A. Adrian) (01/13/90)

In article <2197@ecs.soton.ac.uk> rh@landin (R Harrison) writes:
>In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>>eberard@ajpo.sei.cmu.edu (Edward Berard) writes:
>>
>>>Are there any references documenting the benefits of strong vs. weak
>>>typing? 
>
>The advantages of strong typing have been recognised for some time
>- consider, for example, Gries and Gehani ("Some ideas on data types
>in high level languages", CACM 20, 1977, pp. 414-420), in which the case
>for controlled polymorphic programming is discussed.
>

And the disadvantages of strong typing during early development have also
been recognized for some time - consider, for example, Sandewall
("Programming in an Interactive Environment: The LISP Experience",
Computing Surveys, 1978, pp. 35-72), where we find the quote (p. 38):

	"The contents and use of declarations is one of the important
	issues that one wants to experiment with in a research system,
	and therefore should not be frozen in the system kernel.  In-
	stead the internal form of programs should be declaration free,
	and one of the services of the developing programming system
	should be to account for declarations as input by the user."

I take Sandewall's view.  Maybe the systems that you work on are so well
specified, or the concepts you deal with in your programs so well
understood, even in the early stages, that you have no need of "playing
with the program".  I don't.  Nobody I know does.  I have no objections
to having a well specified type structure in a delivered product (and
in fact see the necessity for one at that stage, both for correctness
and efficiency), but hamstringing languages to REQUIRE it in ALL stages
of development seems to be a bit of overkill.  We may ask a somewhat
cynical question as to whether the difficulty of setting up and changing
type hierarchies in the strongly-typed languages leads to programs which:

	a) are "correct" with respect to the initial specification, but 
	b) have not explored the possible type configuration space
		sufficiently to have a maintainable type structure, so
	c) significant changes to these strongly-typed structural behemoths
		are next to impossible.

I guess the bottom line is that, at least for me (and, I believe, for
many others), the case for having less strongly typed languages is far
from closed.
-- 

Frank A. Adrian
Mentor Graphics, Inc.
franka@mntgfx.com

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/14/90)

In article <2197@ecs.soton.ac.uk> rh@landin (R Harrison) writes:
>The advantages of strong typing have been recognised for some time
>- consider, for example, Gries and Gehani ("Some ideas on data types
>in high level languages", CACM 20, 1977, pp. 414-420), in which the case
>for controlled polymorphic programming is discussed.

Gries and Gehani bring up the alternative of run-time typed
languages (the authors call them "typeless") in a single paragraph
at the end of their article.  Interestingly, they mention as
examples APL and Snobol but omit Lisp, which I suspect was the
most widely used of the three even in those days.  (The Interlisp
and MIT Lisp machine programming environments already existed in
1977, though they were perhaps not yet well known.) The authors
reject such languages for three reasons.  First, Gries and Gehani
complain of interpretive implementations.  This accusation no
longer applies - not to Common Lisp, anyway.  Second, the authors
mention the unparsability of APL.  This difficulty, for which
Gries and Gehani give an example, does not apply to other run-time
typed languages.  Indeed, the accusation today is much more
applicable in the opposite direction:  Common Lisp is particularly
easy to parse, whereas C++ is practically unparsable.  (Send me
mail for a particularly nasty example.)

The authors' third reason for rejecting run-time typing is that
"one cannot tell just by looking at a procedure just what it does;
it will depend too heavily on the inputs to that procedure." This
objection is equally applicable, of course, to the authors' own
proposal of parameterized types.  More ironic is the fact that
many popular language abstractions, including object-oriented
programming itself, are equally vulnerable to this same accusation.

And of course, Gries and Gehani give no concrete evidence of the
superiority of their preferences.  Indeed, their primary criteria
- listed as readability, understandability, provability of
correctness, and run-time efficiency - apparently do not include
productivity, extensibility, or evolvability.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

sommar@enea.se (Erland Sommarskog) (01/14/90)

Reino de Boer (reino@cs.eur.nl) writes:
>	 Example:
>	 Strong type checking in Pascal would give the programmer the
>	 opportunity to get a compiler error on the following:
>	 program wrong( output );
>	 type distance = real;
>		  money    = real;
>	 var  d        : distance;
>		  m        : money;
>	 begin d := 5.5; m := 10.0; writeln( d + m : 10 : 2 ) end.

WOULD give, yes, if Pascal were a strongly typed language.  Alas,
the typing in Pascal is fairly weak, so there is no possibility for
a programmer to enforce a distinction between two entities that have
no connection to each other beyond being represented by a real or
integer number.
  Of course there are languages which give you this possibility,
but I doubt that Wirth designed any of them.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Unix is a virus.

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/15/90)

In article <1990Jan13.155115.4809@mentor.com> franka@mntgfx.UUCP (Frank A. Adrian) writes:
>of development seems to be a bit of overkill.  We may ask a somewhat
>cynical question as to if the difficulty of setting up and changing type
>hierarchies in the strongly-typed languages leads to programs which:
>
>	a) are "correct" with respect to the initial specification, but 
>	b) have not explored the possible type configuration space suf-
>		ficiently to have a maintainable type structure, so
>	c) significant changes to these strongly-typed structural behemoths
>		are next to impossible.

For this reason I submit that run-time typing is particularly
appropriate for large software systems that must evolve
continuously over a long lifecycle in response to unforeseen
customer needs - e.g., feature-rich telecommunications switching
systems.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

nick@lfcs.ed.ac.uk (Nick Rothwell) (01/16/90)

In article <12843@cbnewsc.ATT.COM>, lgm@cbnewsc (lawrence.g.mayka) writes:
>For this reason I submit that run-time typing is particularly
>appropriate for large software systems that must evolve
>continuously over a long lifecycle in response to unforeseen
>customer needs - e.g., feature-rich telecommunications switching
>systems.

I see no reason why you can't do that kind of thing with a static
type system, as long as it's flexible enough. If you have
polymorphism, type abstraction, parameterised types and a decent
type-secure module system then the abstraction mechanisms limit
the amount of dependency between modules, so that changes are
as localised as possible. Even for the inevitable global changes,
if you trust the type system, then you can tend to make the changes to
the support modules which provide the basic types, and let any
changes to the types and sharing constraints propagate through.
The work is done when the typechecker accepts the whole system
once again.

>	Lawrence G. Mayka

	Nick.
--
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
  "...all these moments... will be lost in time... like tears in rain."

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/17/90)

In article <1633@castle.ed.ac.uk> nick@lfcs.ed.ac.uk (Nick Rothwell) writes:
>type-secure module system then the abstraction mechanisms limit
>the amount of dependency between modules, so that changes are
>as localised as possible. Even for the inevitable global changes,
>if you trust the type system, then you can tend to make the changes to
>the support modules which provide the basic types, and let any
>changes to the types and sharing constraints propagate through.
>The work is done when the typechecker accepts the whole system
>once again.

My basic objection to this scenario is that multiple
modifications, sometimes spanning across the entire system, must
be made simultaneously in order to transition from one stable
system state to the next.  This presents several problems:

a) Source code may not be available for all parts of the system.
The wide-ranging adjustments which static typing can make
necessary - e.g., of inheritance relationships - may not be
possible.

b) Even if such source code is available, customer modification
may void warranties or other support.  Again, wide-ranging
adjustments may have to be ruled out for this reason.

c) Introducing a set of wide-ranging modifications simultaneously
increases the risk of catastrophic failure.  To prevent this, one
must typically conduct more extensive testing, delaying
introduction of new functionality - perhaps beyond the window of
profit opportunity.

Testing itself becomes more difficult, because the tester must
deal not with her/his own individual change, but rather with an entire
set of (perhaps wide-ranging) changes, often made by different
people.

d) Simultaneous wide-ranging and interlocking modification also
increases the difficulty of change management.  The usual solution
- a more complex and bureaucratic change management system - slows
down development, again delaying feature introduction.

e) Some applications have extremely stringent availability
requirements (limits on downtime).  Under such conditions, taking
down a production system in order to bring up an entirely new
image is out of the question, except perhaps once every two years;
but a two-year delay in feature introduction is no longer
tolerable.  Such markets now demand "incremental update."

Even in lab testing situations, continual building and bringup of
entire images is an undesirable overhead and slows down the
development process.

f) Restricting each identifier to a single predetermined and
(usually) predeclared type, and restricting each function/method
to a predetermined and predeclared number and sequence of
arguments and return values, is fundamentally a bookkeeping
exercise which hinders programmer productivity.  Such restrictions
may be symptomatic of an attitude of rigid, centralized control
which, if applied uniformly throughout software development, leads
to higher prices and lower profits, and in the extreme case to
eventual exit from the market.


Perhaps my main point is a general one, applicable to many debates
of this kind:  Don't be closed to new approaches, even seemingly
radical ones, unless your current approach is absolutely
*delighting* your customers.  And even then, keep looking over
your shoulder...


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

nick@lfcs.ed.ac.uk (Nick Rothwell) (01/18/90)

In article <12886@cbnewsc.ATT.COM>, lgm@cbnewsc (lawrence.g.mayka) writes:
>My basic objection to this scenario is that multiple
>modifications, sometimes spanning across the entire system, must
>be made simultaneously in order to transition from one stable
>system state to the next.

True. I've only been involved in projects where the number of
programmers was no greater than three.

>f) Restricting each identifier to a single predetermined and
>(usually) predeclared type, and restricting each function/method
>to a predetermined and predeclared number and sequence of
>arguments and return values, is fundamentally a bookkeeping
>exercise which hinders programmer productivity.

I'm not sure how much this is a symptom of languages which don't have
good polymorphic type systems. There is certainly some bookkeeping
involved, but I would argue that the time involved in this is offset
by the faster development time overall. I might also suggest that it
is a good thing, since the programmer is forced to examine the
overall structure and hierarchy of the project on various occasions,
rather than let it just grow.

>Such restrictions
>may be symptomatic of an attitude of rigid, centralized control

Presumably you're thinking of large teams of programmers with some
kind of centralised management. In the case I was thinking of, there
was just myself, delivering a piece of software under contract.

>Perhaps my main point is a general one, applicable to many debates
>of this kind:  Don't be closed to new approaches, even seemingly
>radical ones, unless your current approach is absolutely
>*delighting* your customers.

My customers were extremely pleased that the exercise of prototyping
in a strongly-typed, higher-order language enabled me to write the
eventual source code many times faster than had I used a conventional
language to start with. However, your point is well taken. In fact,
my approach *was* the new one, since I was using Standard ML on
a Unix workstation in order to deliver code to a software house
accustomed to PCs, MS-DOS and Microsoft C.

>And even then, keep looking over your shoulder...

Absolutely. Your other points are well taken.

>	Lawrence G. Mayka

		Nick.
--
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
"I like music which sounds like a flying DeLorean being struck by lightning"

jeff@aiai.ed.ac.uk (Jeff Dalton) (02/23/90)

In article <12820@cbnewsc.ATT.COM> lgm@cbnewsc.ATT.COM (lawrence.g.mayka,ihp,) writes:
 >Gries and Gehani bring up the alternative of run-time typed
 >languages (the authors call them "typeless") in a single paragraph
 >at the end of their article.  Interestingly, they mention as
 >examples APL and Snobol but omit Lisp, which I suspect was the
 >most widely used of the three even in those days.  (The Interlisp
 >and MIT Lisp machine programming environments already existed in
 >1977, though they were perhaps not yet well known.) The authors
 >reject such languages for three reasons.  First, Gries and Gehani
 >complain of interpretive implementations.  This accusation no
 >longer applies - not to Common Lisp, anyway. 

Lisp, and Lisp compilers, have been around since at least the early
60's.  It's strange how some people think certain languages are
inherently interpreted even though there is evidence to the contrary
for anyone who cares to look.  For example, the Dartmouth
implementations of Basic were always compilers.

>Second, the authors mention the unparsability of APL. 

As if this were because it was "typeless".

>The authors' third reason for rejecting run-time typing is that
>"one cannot tell just by looking at a procedure just what it does;
>it will depend too heavily on the inputs to that procedure."

I wonder how many procedures they actually looked at.  In many cases,
it's clear what sorts of objects are involved, either because there
are explicit tests or because type-specific procedures are called.
Of course, being sure that the argument types will really be what
they should be is another matter.

Besides, knowing the types of the parameters doesn't tell you what
the procedure does.  It helps in some cases, but hardly matters at
all in others.