[comp.software-eng] An Interesting View of "Strong" Vs. "Weak" Typing

eberard@ajpo.sei.cmu.edu (Edward Berard) (01/05/90)

Folks,

I recently posted a large message to comp.object on the evaluation of
object-oriented programming languages. I have received a number of
responses. I would be interested in getting your reaction to the
following quotes from one of respondees:

>>         object-oriented programming. Smalltalk has, in effect, no
>>         types, but there are typed extensions to Smalltalk, e.g.,
>
> Again, CL and Smalltalk have *run-time* types - restrictions on
> the use of *objects* during program *execution*.  Languages such
> as Ada have *compile-time* types - restrictions on the use of
> *identifiers* during program *compilation.*

[...]

>
> More accurately, run-time typing places the burden on the run-time
> system, whereas compile-time typing places the burden on the
> compiler *and* the programmer (who must provide explicit type
> declarations and obey them at all times during program
> development).  I would ask again what solid *experimental*
> evidence anyone has to show that compile-time typing actually
> results in programs more reliable than those possible with
> run-time typing, *in the presence of the requirements evolution
> that is present in any large, complex software project*.

Are there any references documenting the benefits of strong vs. weak
typing? What are your reactions to the definitions given?

                                -- Ed Berard
                                   (301) 353-9652

P.S.: For those that don't know, "CL" stands for Common Lisp.

david@cs.washington.edu (David Callahan) (01/05/90)

>> More accurately, run-time typing places the burden on the run-time
>> system, whereas compile-time typing places the burden on the
>> compiler *and* the programmer (who must provide explicit type
>> declarations and obey them at all times during program
>> development). 

This statement is misleading. Many research languages perform
compile-time type checking based on type-inference algorithms and
allowing a powerful, polymorphic type system. While research work
indicates that type-inference at reasonable generality is undecidable,
it does not seem to be a tremendous problem in practice.

-- 
David Callahan  (david@tera.com, david@june.cs.washington.edu,david@rice.edu)
Tera Computer Co. 	400 North 34th Street  		Seattle WA, 98103

reino@cs.eur.nl (Reino de Boer) (01/05/90)

eberard@ajpo.sei.cmu.edu (Edward Berard) writes:

>I would be interested in getting your reaction to the
>following quotes from one of respondees:

>>>         object-oriented programming. Smalltalk has, in effect, no
>>>         types, but there are typed extensions to Smalltalk, e.g.,
>>
>> Again, CL and Smalltalk have *run-time* types - restrictions on
>> the use of *objects* during program *execution*.  Languages such
>> as Ada have *compile-time* types - restrictions on the use of
>> *identifiers* during program *compilation.*

I tend to agree with `>>>' that Smalltalk has no types. The feeling of
run-time types is actually the `not being able to respond to certain
messages'.

>Are there any references documenting the benefits of strong vs. weak
>typing? What are your reactions to the definitions given?

My reaction:
I think languages like Miranda have shown us that the effects of
run-time type-checking can for a very large part be accomplished by
compile-time evaluation of possible types. Although something can be
said in favor of the statement
	all interpreters (CL, Smalltalk, etc.) by definition do only
	run-time checking (not being able to compile at all).

P.S. Thank you for your overview of Object-Oriented Design.

Reino


-- 
Reino R. A. de Boer
Erasmus University Rotterdam ( Informatica )
e-mail: reino@cs.eur.nl

nick@lfcs.ed.ac.uk (Nick Rothwell) (01/05/90)

In article <641@ajpo.sei.cmu.edu>, eberard@ajpo (Edward Berard) writes:
I would be interested in getting your reaction to the
>following quotes from one of respondees:
>> More accurately, run-time typing places the burden on the run-time
>> system, whereas compile-time typing places the burden on the
>> compiler *and* the programmer (who must provide explicit type
				      ^^^^^^^^^^^^...
>> declarations and obey them at all times during program
>> development).

This is false for a polymorphic type system.

>> I would ask again what solid *experimental*
>> evidence anyone has to show that compile-time typing actually
>> results in programs more reliable than those possible with
>> run-time typing, *in the presence of the requirements evolution
>> that is present in any large, complex software project*.

Do you mean "practical experience"? I wrote 8000 lines in ML, and
hand-translated to 20000 lines of C, in 4 months, resulting in a
reliable piece of software. I don't know how to quantify that in any
objective way, though.

I suggest that compile-time typing serves the same purpose as
module interfaces: it lets you decide what you want different parts of
a project to see, and what operations you want to provide (I'm
thinking of a parameterised, abstract type system here).

		Nick.
--
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
  "...all these moments... will be lost in time... like tears in rain."

gudeman@cs.arizona.edu (David Gudeman) (01/06/90)

[anonymous quotes...]
>>>         object-oriented programming. Smalltalk has, in effect, no
>>>         types, but there are typed extensions to Smalltalk, e.g.,
>>
>> Again, CL and Smalltalk have *run-time* types - restrictions on
>> the use of *objects* during program *execution*.  Languages such
>> as Ada have *compile-time* types - restrictions on the use of
>> *identifiers* during program *compilation.*

I think there is some confusion here between "having types" and "being
typed".  In particular, it is possible for a language to have types
(as Smalltalk and CL do) and still to be untyped (as Smalltalk and CL
are).  These languages are not "weakly typed", they are "untyped".

All languages "have types", in the sense that values in the language
can be categorized (even if there is only one category).  CL has the
types "integer", "real", "string", "cons", etc.

A language is said to "be typed" if the identifiers of the language
(variable, function names, etc.)  are constrained to represent values
of specific types.  The difference between strong and weak typing
involves how much you can say about the types of values that the
identifiers can take.

Here are examples in C (a typed language) and Icon (an untyped
language with a syntactic resemblance to C):

  int add(a,b) int a,b; {return a + b;}   /* C */

  procedure add(a,b) return a + b end     # Icon

The C procedure is typed because the identifiers "a" and "b" are
restricted to represent values of type int, and the identifier "add"
is restricted to represent a value of type (int X int) -> int
(actually the type restriction "add" is a rather trivial one since it
is a constant).

In the Icon procedure, "a" and "b" may have values of any type: int,
real, or even procedure.  If one of both of them is not numeric (or
convertible to numeric) then a run-time error will occur when you try
to add them.  Likewise, the identifier "add" in the Icon program is
untyped.  It may take on a value of a different type (even int) during
the execution of the program.
-- 
					David Gudeman
Department of Computer Science
The University of Arizona        gudeman@cs.arizona.edu
Tucson, AZ 85721                 noao!arizona!gudeman

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/06/90)

In article <10287@june.cs.washington.edu> david@june.cs.washington.edu (David Callahan) writes:
>>> More accurately, run-time typing places the burden on the run-time
>>> system, whereas compile-time typing places the burden on the
>>> compiler *and* the programmer (who must provide explicit type
>>> declarations and obey them at all times during program
>>> development). 
>
>This statement is misleading. Many research languages perform
>compile-time type checking based on type-inference algorithms and
>allowing a powerful, polymorphic type system. While research work
>indicates that type-inference at reasonable generality is undecidable,
>it does not seem to be a tremendous problem in practice.

Consider a dynamically constructed list of objects such that:

a) The types of the elements have no necessary relation - i.e.,
the list is totally heterogeneous;

b) The types of the elements are not knowable at compile time, but
are each knowable at run time - i.e., a TYPE-OF or CLASS-OF
function is available;

c) The length of the list is not knowable at compile time, but is
knowable at run time.

Programs in run-time typed languages make ubiquitous use of such
lists.  If one cannot create and manipulate such lists easily and
efficiently in the research languages you describe, I suggest that
there is indeed a "problem in practice."  The distinction between
compile-time typing and run-time typing remains a valid one.

Moreover, a programming environment which requires specification
of all program parts in advance of compilation (e.g., in order to
perform type inferencing across function calls) faces essentially
the problems of compile-time typing:  hindrance of incremental
system development and obstruction of smooth system evolution.

Type inferencing within a function, or on programmer request
(e.g., via optional declarations), is of course a welcome optional
optimization.

	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/06/90)

In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>compile-time evaluation of possible types. Although something can be
>said in favor of the statement
>	all interpreters (CL, Smalltalk, etc.) by definition do only
>	run-time checking (not being able to compile at all).

Common Lisp programs are typically compiled into native machine
instructions.  For example, one typical implementation on the
SPARC architecture compiles the following function

        (defun add (x y)
          (+ x y)
          )

into exactly 11 machine instructions.  Such a function properly
computes the sum of any two numbers - integers of any size,
floating-point, rational, or complex.  (Only 5 instructions are
executed if each argument happens to be an integer that fits into
a single machine word.)  If one annotates this definition with
optional type declarations thus:

        (defun add (x y)
          (declare (fixnum x y))
          (the fixnum (+ x y))
          )

the compiler emits exactly 4 machine instructions.  If this latter
function is passed arguments of unexpected types - e.g., rational
numbers 4/5 and 7/8 - the run-time system traps the attempt and
asks the user if it should proceed with the addition anyway.  This
exception handling behavior can of course be overridden by the
programmer.)

An interpreter is usually also available but most CL programmers
use it primarily to execute top-level environment commands such as
ED, COMPILE-FILE, LOAD, TRACE, DESCRIBE, INSPECT, and DISASSEMBLE.

	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

rh@ecs.soton.ac.uk (R Harrison) (01/10/90)

In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>eberard@ajpo.sei.cmu.edu (Edward Berard) writes:
>
>>Are there any references documenting the benefits of strong vs. weak
>>typing? 

The advantages of strong typing have been recognised for some time
- consider, for example, Gries and Gehani ("Some ideas on data type 
in high level languages, CACM 20, 1977, pp.414-420), in which the case 
for controlled polymorphic programming is discussed.

 Rachel Harrison        rh@uk.ac.soton.ecs 
 Department of Electronics and Computer Science,
 University of Southampton, Southampton SO9 5NH, UK 
 Tel. (0703) 593055  International 44 703 593055 
 Fax 0703 593045 International Fax 44 703 593045

reino@cs.eur.nl (Reino de Boer) (01/12/90)

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) writes:

>In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>>	all interpreters (CL, Smalltalk, etc.) by definition do only
>>	run-time checking (not being able to compile at all).

>Common Lisp programs are typically compiled into native machine
>instructions. 
>An interpreter is usually also available but most CL programmers
>use it primarily to execute top-level environment commands such as
>ED, COMPILE-FILE, LOAD, TRACE, DESCRIBE, INSPECT, and DISASSEMBLE.

The point I was trying to make, is that an interpreter can BY DEFINITION
do no compile-time checking.
Some `compilers' allow a form of run-time checking because they include
a run-time interpreter in the compiled code. I would like to classify
these compilers under interpreters.
Some `interpreters' allow a form of compile-time checking because they
interpret some kind of internal code. Depending on the semantics of the
internal code, these could be classified under compilers.

The only reasonable way for a compiler to allow run-time type checking
would be to include type-checking code in the compiled code.

However, there is more than one issue being discussed here:
1. What would be reasonable definitions for
   o run-time type checking;
   o compile-time type checking.
2. Given a solution to (1), what are the advantages and disadvantages of
   both kinds of type-checking.
3. Given an answer to (2),
   o what would be the best way to let a programmer influence the
	 checking mechanism (e.g., type declarations as in Pascal, or
	 polymorphic typing as in Miranda)
   o does type checking have to be `strong' or `weak' (and what would be
	 suitable definitions for `strong' and `weak').
	 Example:
	 Strong type checking in Pascal would give the programmer the
	 opportunity to get a compiler error on the following:
	 program wrong( output );
	 type distance = real;
		  money    = real;
	 var  d        : distance;
		  m        : money;
	 begin d := 5.5; m := 10.0; writeln( d + m : 10 : 2 ) end.

Any comments? -- Reino

-- 
Reino R. A. de Boer
Erasmus University Rotterdam ( Informatica )
e-mail: reino@cs.eur.nl

franka@mentor.com (Frank A. Adrian) (01/13/90)

In article <2197@ecs.soton.ac.uk> rh@landin (R Harrison) writes:
>In article <1990Jan5.084746.17836@cs.eur.nl> reino@cs.eur.nl (Reino de Boer) writes:
>>eberard@ajpo.sei.cmu.edu (Edward Berard) writes:
>>
>>>Are there any references documenting the benefits of strong vs. weak
>>>typing? 
>
>The advantages of strong typing have been recognised for some time
>- consider, for example, Gries and Gehani ("Some ideas on data type 
>in high level languages, CACM 20, 1977, pp.414-420), in which the case 
>for controlled polymorphic programming is discussed.
>

And the disadvantages of strong typing during early development has also
been recognized for some time - consider, for example, Sandewall
("Programming in an Interactive Environment: The LISP Experience",
Computing Surveys, 1978, pp. 35-72), where we find the quote (p. 38):

	"The contents and use of declarations is one of the important
	issues that one wants to experiment with in a research system,
	and therefore should not be frozen in the system kernel.  In-
	stead the internal form of programs should be declaration free,
	and one of the services of the developing programming system
	should be to account for declarations as input by the user."

I take Sandewall's view.  Maybe the systems that you work on are so well
specified or the concepts you deal with in your programs are so well un-
derstood, even in the early stages, that you have no need of "playing
with the program".  I don't.  Nobody I know does.  I have no objections
to having a well specified type structure in a delivered product (and
in fact see see the necessity for one at that stage both for correctness
and efficiency), but hamstringing languages to REQUIRE it in ALL stages
of development seems to be a bit of overkill.  We may ask a somewhat
cynical question as to if the difficulty of setting up and changing type
hierarchies in the strongly-typed languages leads to programs which:

	a) are "correct" with respect to the initial specification, but 
	b) have not explored the possible type configuration space suf-
		ficiently to have a maintainable type structure, so
	c) significant changes to these strongly-typed structural behemoths
		are next to impossible.

I guess the bottom line is that, at least for me (and, I believe, for
many others), the case for having less strongly typed languages is far
from closed.
-- 

Frank A. Adrian
Mentor Graphics, Inc.
franka@mntgfx.com

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/14/90)

In article <2197@ecs.soton.ac.uk> rh@landin (R Harrison) writes:
>The advantages of strong typing have been recognised for some time
>- consider, for example, Gries and Gehani ("Some ideas on data type 
>in high level languages, CACM 20, 1977, pp.414-420), in which the case 
>for controlled polymorphic programming is discussed.

Gries and Gehani bring up the alternative of run-time typed
languages (the authors call them "typeless") in a single paragraph
at the end of their article.  Interestingly, they mention as
examples APL and Snobol but omit Lisp, which I suspect was the
most widely used of the three even in those days.  (The Interlisp
and MIT Lisp machine programming environments already existed in
1977, though they were perhaps not yet well known.) The authors
reject such languages for three reasons.  First, Gries and Gehani
complain of interpretive implementations.  This accusation no
longer applies - not to Common Lisp, anyway.  Second, the authors
mention the unparsability of APL.  This difficulty, for which
Gries and Gehani give an example, does not apply to other run-time
typed languages.  Indeed, the accusation today is much more
applicable in the opposite direction:  Common Lisp is particularly
easy to parse, whereas C++ is practically unparsable.  (Send me
mail for a particularly nasty example.)

The authors' third reason for rejecting run-time typing is that
"one cannot tell just by looking at a procedure just what it does;
it will depend too heavily on the inputs to that procedure." This
objection is equally applicable, of course, to the authors' own
proposal of parameterized types.  More ironic is the fact that
many popular language abstractions, including object-oriented
programming itself, are equally vulnerable to this same accusation.

And of course, Gries and Gehani give no concrete evidence of the
superiority of their preferences.  Indeed, their primary criteria
- listed as readability, understandability, provability of
correctness, and run-time efficiency - apparently do not include
productivity, extensibility, or evolvability.

	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

sommar@enea.se (Erland Sommarskog) (01/14/90)

Reino de Boer (reino@cs.eur.nl) writes:
>	 Example:
>	 Strong type checking in Pascal would give the programmer the
>	 opportunity to get a compiler error on the following:
>	 program wrong( output );
>	 type distance = real;
>		  money    = real;
>	 var  d        : distance;
>		  m        : money;
>	 begin d := 5.5; m := 10.0; writeln( d + m : 10 : 2 ) end.

WOULD give, yes, if Pascal had been a strongly typed language. Alas,
the typing in Pascal is fairly weak, so there is no possibility for
a programmer to enforce distinction between two entities that have
no connection to each other than being represented by a real or
integer number.
  But of course there are languages which give you this possibility,
but I doubt that Wirth designed any of them.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
Unix is a virus.

lgm@cbnewsc.ATT.COM (lawrence.g.mayka) (01/15/90)

In article <1990Jan13.155115.4809@mentor.com> franka@mntgfx.UUCP (Frank A. Adrian) writes:
>of development seems to be a bit of overkill.  We may ask a somewhat
>cynical question as to if the difficulty of setting up and changing type
>hierarchies in the strongly-typed languages leads to programs which:
>
>	a) are "correct" with respect to the initial specification, but 
>	b) have not explored the possible type configuration space suf-
>		ficiently to have a maintainable type structure, so
>	c) significant changes to these strongly-typed structural behemoths
>		are next to impossible.

For this reason I submit that run-time typing is particularly
appropriate for large software systems that must evolve
continuously over a long lifecycle in response to unforeseen
customer needs - e.g., feature-rich telecommunications switching
systems.


	Lawrence G. Mayka
	AT&T Bell Laboratories
	lgm@ihlpf.att.com

Standard disclaimer.

nick@lfcs.ed.ac.uk (Nick Rothwell) (01/16/90)

In article <12843@cbnewsc.ATT.COM>, lgm@cbnewsc (lawrence.g.mayka) writes:
>For this reason I submit that run-time typing is particularly
>appropriate for large software systems that must evolve
>continuously over a long lifecycle in response to unforeseen
>customer needs - e.g., feature-rich telecommunications switching
>systems.

I see no reason why you can't do that kind of thing with a static
type system, as long as it's flexible enough. If you have
polymorphism, type abstraction, parameterised types and a decent
type-secure module system then the abstraction mechanisms limit
the amount of dependency between modules, so that changes are
as localised as possible. Even for the inevitable global changes,
if you trust the type system, then you can tend to make the changes to
the support modules which provide the basic types, and let any
changes to the types and sharing constraints propagate through.
The work is done when the typechecker accepts the whole system
once again.

>	Lawrence G. Mayka

	Nick.
--
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
		nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcvax!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
  "...all these moments... will be lost in time... like tears in rain."

jeff@aiai.ed.ac.uk (Jeff Dalton) (02/23/90)

In article <12820@cbnewsc.ATT.COM> lgm@cbnewsc.ATT.COM (lawrence.g.mayka,ihp,) writes:
 >Gries and Gehani bring up the alternative of run-time typed
 >languages (the authors call them "typeless") in a single paragraph
 >at the end of their article.  Interestingly, they mention as
 >examples APL and Snobol but omit Lisp, which I suspect was the
 >most widely used of the three even in those days.  (The Interlisp
 >and MIT Lisp machine programming environments already existed in
 >1977, though they were perhaps not yet well known.) The authors
 >reject such languages for three reasons.  First, Gries and Gehani
 >complain of interpretive implementations.  This accusation no
 >longer applies - not to Common Lisp, anyway. 

Lisp, and Lisp compilers, have been around since at least the early
60's.  It's strange how some people think certain languages are
inherently interpreted even though there is evidence to the contrary
for anyone who cares to look.  For example, the Dartmouth
implementations of Basic were always compilers.

>Second, the authors mention the unparsability of APL. 

As if this was because it was "typeless".  

>The authors' third reason for rejecting run-time typing is that
>"one cannot tell just by looking at a procedure just what it does;
>it will depend too heavily on the inputs to that procedure."

I wonder how many procedures they actually looked at.  In many cases,
it's clear what sorts of objects are involved, either because there
are explicit tests or because type-specific procedures are called.
Of course, being sure that the argument types will really be what
they should be is another matter.

Besides, knowing the types of the parameters doesn't tell you what
the procedure does.  It helps in some cases, but hardly matters at
all in others.