[net.lang] Strong Typing and Ignorance

donald@utcsrgv.UUCP (Don Chan) (12/05/83)

It would be nice if someone would define what "strong typing" is.
Everybody seems to think "strong typing" is just Pascal's types, e.g.
"strong typing" implies fixed array bounds, etc.

Nonsense.  This is burning paper tigers-- nobody ever claimed Pascal's
types are perfect, or that they define "strong typing".  Notice I quote
"strong typing".  I don't use the term myself; I think it is an
ill-defined term used by people who don't quite know what they're
talking about (like "Object-orientation").

Part of the problem is confusion in the meaning of "type".  The type
systems of various languages are really not type *definition* systems
but type *representation* systems, and we're basically arguing the form
that such representation systems ought to take.  If you're lost at this
point I can suggest the section on types in the book "Programming
Methodology:  A Collection of Papers by IFIP WG2.3", edited by Gries.
Gries's overview and Guttag's paper are good intros to the nature of
types.

About the only reasonable meaning I can ascribe to "strong typing" is
"compile-time type-checking".  This still leaves a lot of leeway for
expansion!  In CLU if you don't like the system-provided arrays you can
make your own arrays (perhaps optimized for sparse matrices) and use
them just like the system arrays.  CLU still manages to do all its
type-checking at compile time.

Sometimes the compile-time checking constraint seems inconvenient for
the programmer, but it does have its advantages, just like forbidding
GOTOs seems inconvenient, but has advantages.

BTW, it may surprise some people that C is "strongly typed".  Read the
reference manual if you don't believe it.  Don't confuse lax compilers
with a vague language definition.

I find it amusing that the people who think "Actor" (what the heck are
they?) languages have a superior notion of types, have never actually
used them.  Having read some SmallTalk and Flavoured Lisp documents, it
seems that their main advantage is deferral of type-checking until run
time and good parameterization.  Nice at times, but certainly not superior
(or inferior).

As always, some more precision in this discussion would be appreciated.
-- 
Don Chan, University of Toronto Department of Computer Science
{ utzoo linus ihnp4 floyd allegra uw-beaver
  ubc-vision cornell watmath hcr decwrl }!utcsrgv!donald

condict@csd1.UUCP (Michael Condict) (12/06/83)

I agree with Don's comment about the vagueness of the phrase "strong typing".
Consequently, when I want to distinguish between the kind of typing done in
LISP and that found in Pascal, I call the former "static typing" and the
latter "dynamic typing".  These phrases can be precisely defined, although
most languages will fall somewhere between the two extremes.  Here are my
definitions:

static typing - the language has a set of type expressions one of which is
required to be associated with each variable at the point of its declaration.
The type expression specifies the set of values (not bit patterns, but values
in an abstract mathematical world) that the declared variable may take on.
Different type expressions may denote the same set of values, but the question
of equality between types is decidable (computable).
Furthermore, a program is invalid if it contains a comparison between
two variables of unequal type or if it contains an assignment (explicit or
implicit) of an expression of type tau to a variable of type sigma and tau is
not a subset of sigma.

dynamic typing - a language is dynamically typed if variables are not associ-
ated with a type expression when declared, and any variable may be assigned
any value.

The purpose of the restrictions on types in the definition of static typing
is so that it is possible to ensure at compile time that no variable ever
takes on, or is compared to, a value that is outside of its declared type.
Of course, most staticly typed languages allow some sort of type coersion, at
least between integers and floating point.  Also, Pascal allows a value of
type integer to be assigned to a variable of type m..n, which violates the
subset rule given above -- this is exactly why Pascal needs run-time type
checking at the point of such assignments.

I find these definitions to be quite useful in categorizing languages and
discussing the benefits of various typing strategies.  I hope others agree
and that they will lend some precision to this discussion.

One more point.  Many people want to distinguish between languages that they
view as being "weakly typed" (by which they usually include LISP) and those
that are "untyped".  I always find it impossible to discover exactly what they
mean by untyped.  For any programming language I know, including the formal
untyped lambda-calculus, there are many possible mathematical domains that
can equally well be viewed as the collection of objects which are computable
within the programming language.  Furthermore, it is always possible to
partition this domain into subsets and to call each subset a different type.
Apparently, when one says a language is untyped, one is saying that no such
partition is reasonable.  This may be, but I do not know of any such cases.
Even in the lambda-calculus, one constructs certain expressions that are to
denote integer constants and others that are to denote arithmetic operations
on these integers, and thereby sets up a type system -- expressions that
denote integers are of a different type from expressions that do.  The lesson
is just that the question of whether or not a language has types, and what
those types are, depends on the intended usage of the language.

M. Condict    ...cmcl2!csd1!condict
New York Univ.

guy@rlgvax.UUCP (Guy Harris) (12/07/83)

A sidenote to the article by Michael Condict:

He mentions that the assignment of an integer to a subrange-typed variable
requires runtime checking.  Has any work been done on proving such assignments
correct given constraints on the values assumed by the integers in question?
I.e., if you have

procedure foo(var i, j : integer);

var k : 2 .. 12;

    begin
	{ ASSERT 1 <= i <= 6, 1 <= j <= 6 }
	k = i + j;
    end

if the assertion in question is true, 2 <= i+j <= 12 so the assignment
will always be valid.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

mark@cbosgd.UUCP (12/07/83)

In my view, languages are classified as follows:

Statically typed (a strongly typed language is a very picky
statically typed language - how strong is how picky the compilers
are able to be) means that the type of a variable is determined
at compile time, and each variable can have only one type stored
in it (although that type might be a union).  This includes Pascal,
Ada, and C.

Dynamically typed languages have the types determined at runtime,
this includes APL, Snobol, and Lisp.  The runtime system will make
any reasonable translations between types automatically.

Untyped languages do not keep track of types - everything is generally
of type "word", and it is up to the programmer to keep everything straight.
This includes Bliss and most assembly languages.  If you mess up a type,
you'll usually get garbage.

spaf@gatech.UUCP (Gene Spafford) (12/08/83)

		From: guy@rlgvax.UUCP
		Newsgroups: net.lang
		Subject: Re: Strong Typing and Ignorance
		Message-ID: <1457@rlgvax.UUCP>

		[...]
		procedure foo(var i, j : integer);

		var k : 2 .. 12;

		    begin
			{ ASSERT 1 <= i <= 6, 1 <= j <= 6 }
			k = i + j;
		    end

		if the assertion in question is true, 2 <= i+j <= 12 so the
		assignment will always be valid.

Yes, that assertion will make the result true in all cases, but it is not
a completely correct assertion.  That is, input of the values -37 and 40
for i and j would also produce a correct result, as would 0 and 3 (for 
example).  Depending on how integer arithmetic is defined and implemented,
it is entirely possible that the values -40000 and -25532 could also
produce an answer in range (assuming 16 bit, 2's complement, with no
error triggered for underflow).

The question concerns the exact type, and associated semantics, used in
the evaluation of the expression on the right hand side.  Is it done in
the mode of the LHS?  Or is it done in the mode of the "most precise"
(whatever that is) mode on the RHS?  In general, I see the assignment
operator as a function implying type conversion, and the operation
on the right is done according to whatever semantics apply to those
variables -- without regard to the LHS.  That way, definition of
a subrange is a restriction on the returned value of the assignment
function, akin to the restrictions in some languages about assigning
reals to integers, or defining the range on a trig function.

Look at it this way -- if the assignment statement was
	i := i +j;
would that assertion still be valid?

-- 
Off the Wall of Gene Spafford
School of ICS, Georgia Tech, Atlanta GA 30332
CSNet:	Spaf @ GATech		ARPA:	Spaf.GATech @ CSNet-Relay
uucp:	...!{akgua,allegra,rlgvax,sb1,unmvax,ulysses,ut-sally}!gatech!spaf

barmar@mit-eddie.UUCP (Barry Margolin) (12/08/83)

	... the assignment of an integer to a subrange-typed variable
	requires runtime checking.  Has any work been done on proving
	such assignments correct given constraints on the values assumed
	by the integers in question?

I don't know about research, but isn't that what the whole idea
of types is supposed to obviate?  Also, unless some runtime check
has been done the constraints cannot be assumed to be true, since
the values of most variables are determined by something external
to the program, namely user input.  This is also the only time
you should ever run into an assignment from a superset to a
subset, i.e. the assignment of the input variable to an internal
variable of a subtype.  If the program explicitly checks that the
user's input is valid then an ideal optimizing compiler would
recognize that a check was not needed in the assignment code
(hmm, did I just repeat what the original submission was saying?)
On the other hand, another way to do this would be to have a
handler for the condition signaled if the range-checking failed.
The latter doesn't require as much smarts from the compiler, and
also has a chance of producing better code in some types of type
conversion.  For instance, a good way in PL/I to turn a user
input into a numeric value is:

	on conversion begin;
	     /* Do something about bad input */
	     ...
	end;

	fixed_bin_var = fixed (char_var);

	revert conversion;

This could produce better code than something like

	if valid_fixed_bin (char_var) then
	     fixed_bin_var = fixed (char_var);
	else /* Do something about bad input */
	     ...

because you only have to examine the char_var variable once.
-- 
			Barry Margolin
			ARPA: barmar@MIT-Multics
			UUCP: ..!genrad!mit-eddie!barmar

tjt@kobold.UUCP (T.J.Teixeira) (12/08/83)

On the other hand, C is weakly typed in that it does not perform type
checking of procedure parameters (although you can get lint to do this
for you).  In addition, the ability to coerce integers and pointers
with reckless abandon can subvert the type checking.  On the other, C
would not be as useful for systems programming without the ability to
sidestep the type checking occasionally.  Consider writing printf or a
UNIBUS device driver in Pascal.
-- 
	Tom Teixeira,  Massachusetts Computer Corporation.  Westford MA
	...!{ihnp4,harpo,decvax,ucbcad,tektronix}!masscomp!tjt   (617) 692-6200

cjl@iuvax.UUCP (12/09/83)

#R:utcsrgv:-289200:iuvax:11800005:000:933
iuvax!cjl    Dec  8 14:51:00 1983


     I agree it is very hard to  find  an  untyped  language,  so  let
assume  every  existing language is typed. What we concern now is WHEN
and HOW DECENT the type checking is done.

     The definition of static type checking and dynamic type  checking
given  by  Condict describes WHEN the type checking is done, but prob-
ably should not include HOW DECENT it is done.

     We can reserve the strong type checking and weak type checking to
describe  HOW DECENT the type checking is enforced by the language. Of
course, the degree of adequacy is ambiguous. I usually think  the  set
of  complete type checking rules (name compatibility or structure com-
patibility) where the Pascal family enforces is decent. But  the  non-
enforced  type  checking of LISP and the incomplete type checking of C
are examples of weak type checking.

C. Jim Lo, Dept. of Computer and  Info  Sci.,  IU-PU  at  Indianapolis
cjl.Indiana@UDEL-relay

bprice@bmcg.UUCP (12/12/83)

<inedible line for voracious mailers>

Thank you for asking the question--it brings up one of the advantages of 
Pascal's variable-oriented strong typing, as opposed to Lisp's form.  The
particular advantage is that the typing is an assertion that is to be used
in a partial correctness-proof.

>From: guy@rlgvax.UUCP
>Organization: CCI Office Systems Group, Reston, VA
>A sidenote to the article by Michael Condict:
>He mentions that the assignment of an integer to a subrange-typed variable
>requires runtime checking.  Has any work been done on proving such assignments
>correct given constraints on the values assumed by the integers in question?
>I.e., if you have

>procedure foo(var i, j : integer);
>var k : 2 .. 12;
>    begin
>       { ASSERT 1 <= i <= 6, 1 <= j <= 6 }
>       k = i + j;
>    end

>if the assertion in question is true, 2 <= i+j <= 12 so the assignment
>will always be valid.

>      Guy Harris

Work has been done on the question, although most that I know of has cast
the assertion in Pascal form.  The subrange assertion in the comment of your
example carries exactly the information conveyed by subrange-type, so if we
rewrite the example:

type halfdozenorless: 1..6;
...
procedure foo(var i, j : halfdozenorless);
var k : 2 .. 12;
    begin
       k := i + j;
    end

then we have the assertion in the standard Pascal form.

Jim Welsh (Economic Range Checks in Pascal, S-P&E 8 pp85-97, 1978) observes
that many examples, like this one, have enough information that run-time
range-checking cannot be violated.  He reports on an implementation.  
Several other implementations exist now--I think the next CDC release will
include this improvement, but don't count on my fallible word.  I did an
implementation, too, for the DECSystem-10.
-- 
--Bill Price    uucp:   {decvax!ucbvax  philabs}!sdcsvax!bmcg!bprice
                arpa:?  sdcsvax!bmcg!bprice@nosc

hal@cornell.UUCP (Hal Perkins) (12/13/83)

>>>>> On the other hand, C is weakly typed in that it does not perform type
>>>>> checking of procedure parameters (although you can get lint to do this
>>>>> for you).


ARRRRGHHHHH!!!!!!!

The properties of a language are independent of its implementation!  Please
don't confuse the two.  If a Pascal compiler failed to perform any type-
checking, would that change your opinion of whether Pascal was "strongly
typed"?  It shouldn't.  (It might well change your opinion of whether you
wanted to spend any money for that compiler however.)



Hal Perkins                         UUCP: {decvax|vax135|...}!cornell!hal
Cornell Computer Science            ARPA: hal@cornell  BITNET: hal@crnlcs

kurt@fluke.UUCP (Kurt Guntheroth) (12/13/83)

>>>>>	>>>>> On the other hand, C is weakly typed in that it does not perform
>>>>>	>>>>> type checking of procedure parameters (although you can get lint
>>>>>	>>>>> to do this for you).


>>>>>	ARRRRGHHHHH!!!!!!!

>>>>>	The properties of a language are independent of its implementation!
>>>>>	Please don't confuse the two.  If a Pascal compiler failed to perform
>>>>>	any type-checking, would that change your opinion of whether Pascal
>>>>>	was "strongly typed"?  It shouldn't.  (It might well change your
>>>>>	opinion of whether you wanted to spend any money for that compiler
>>>>>	however.)

>>>>>	Hal Perkins

A R R R R G H H H H ! (etc) (meta-ARRRGHHHH ?)

C does not do type checking of procedure parameters because the language
definition says it does not.  Thus weak type checking is a property of C and
not of a particular implementation.
-- 
Kurt Guntheroth
John Fluke Mfg. Co., Inc.
{uw-beaver,decvax!microsof,ucbvax!lbl-csam,allegra,ssc-vax}!fluke!kurt

cjl@iuvax.UUCP (12/14/83)

#R:utcsrgv:-289200:iuvax:11800006:000:1030
iuvax!cjl    Dec 13 15:15:00 1983


>>> ARRRRGHHHHH!!!!!!!
>>> The properties of a language are independent of its implementation!  Please
>>> don't confuse the two.  If a Pascal compiler failed to perform any type-

I would say this is not an implementation issue. The separation of function
between lint and c compiler has its practical rationale. 
As a system programming language, C chooses not to enforce strongly type 
checking but an optional part, i.e. a weak type checking approach, so 
system programming can be done by cheating. 

In page 3 of the "The C programming language" : 

   C is not a strongly-typed language in the sense of Pascal or Algol 68...

So apparently lint is only treated as a software tool, not part of the C
language definition. The C reference manual does not even mention lint.

In contrast, Modula-2 defines low level types ADDRESS WORD and type
transfer functions such that strongly type checking is still enforced 
while the flavor of system programming is retained.

Chingmin Jim Lo, CS dept, IUPUI
cjl.Indiana@UDEL-Relay

rpw3@fortune.UUCP (12/14/83)

#R:utcsrgv:-289200:fortune:15100004:000:1017
fortune!rpw3    Dec 14 03:37:00 1983

When people called BLISS "un-typed", what they really meant was that
the contents of a field of storage was uninterpreted by the compiler.

In fact, while (simple) variables were untyped, the OPERATORS were
typed. There was integer add ("+"), and floating point add ("FADR"), etc.
No coercion. If you wanted to write it (and since BLISS was for
writing operating systems and libraries, you sometimes did), you could:

	x = (.y FADR .z) + 1^27 ;

which takes the contents of "y" and the contents of "z", adds them with
floating-point arithmetic, and then integer adds 2**27 (1<<27, in C) to
the result (bumping the exponent, on a PDP-10), and store it in "x".

"Un-typed" generally means "no type checking", although in some contexts,
such as APL and SNOBOL, it means self-described typing (descripter-based
variables). Is that sufficiently muddy?

Rob Warnock

UUCP:	{sri-unix,amd70,hpda,harpo,ihnp4,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphins Drive, Redwood City, CA 94065

patcl@tekecs.UUCP (Pat Clancy) (12/14/83)

Suggest this discussion be moved to net.religion.

tim@unc.UUCP (12/16/83)

I was taught that the assertion "language X is strongly typed" is equivalent
to the assertion "if E is a legal expression in language X, then the type of
the value of E can be determined at compile time".  This is the standard
meaning of "strong typing" -- if someone knows of a well-defined contender
for the phrase, by all means post it to the net, but until that time I will
assume that this is the only valid meaning.

Contrary to the assertions of some recent articles, Pascal is not strongly
typed.  There are two exceptions.  The first concerns variant records, which
have a type which must be determined at run time by examination of the tag
field.  The second involves integer range types; it is legal to assign a
normal integer variable to an integer range variable, but it cannot be
determined at compile time whether the integer variable is of the integer
range type or the integer type: if the latter, the assignment is illegal.

C is not strongly typed, but probably not for the reasons you'd expect
(argument mismatch and pointer typing escapes do not make the expression
type indeterminate at compile time, they just complicate the typing
semantics).  The "?:" operator, which is a distributed conditional, can have
differently typed alternative values.  For instance, in the expression
"i>0 ? 1 : 4.2", the type of the value of the expression cannot be
determined until the value of i is known -- it might be int, or double.  I
don't know of any other exceptions to strong typing in C.

I hope this definition will help to counteract some of the wildly-inaccurate
articles on the subject that have appeared lately, and satisfy the curiosity
of those who wanted to know.
--
Tim Maroney, University of North Carolina at Chapel Hill
duke!unc!tim (USENET), tim.unc@csnet-relay (ARPA)

mauney@ncsu.UUCP (12/16/83)

References: unc.6421

In their book "Programming Language Concepts" (Wiley, 1982),
Ghezzi and Jazayeri define a language to be strongly typed
"if it allows all type checking to be done statically".
This definition is similar that given by Tim Maroney (unc.6421).

Although the above is an adequate theoretical definition of strong typing,
it leaves something to be desired from the practical point of view.
For example,  Ghezzi and Jazayeri go on to illustrate that Pascal is not
strongly typed and that ALGOL 68 is.  They discuss several problems with
Pascal,  such as variant record usage and formal procedure parameters,
which ALGOL 68 avoids by better design.  But then they get to subranges
and array indexing,  and they point out that ALGOL 68 does not have a
problem with ranges because the bounds of an array are considered to
be part of the value,  not part of the type.  Well,  OK.  That is an
interesting idea from the language designer's point of view,  but to the
programmer nothing has changed -- out of bounds indices are still wrong,
and still require a run-time check.  Who cares if it's a type checking
question or a value checking question?

To many people,  strong typing means that (a) there is a wide variety of
types, (b) there are few automatic coercions between types, (c)  things
that are not immediately compatible nor coercible are flagged as wrong, and
(d) most of the flagging is done at compile time.  This is not a precise
definition,  but it more accurately reflects what the programmer needs.

                           Jon Mauney,    mcnc!ncsu!mauney
                           North Carolina State University

tjt@kobold.UUCP (T.J.Teixeira) (12/29/83)

Using Tim's definition of strong typing, he should rest at ease.
According to the reference manual in K&R, section 7.13, the "?:"
operator is strongly typed.  The two expressions must be of an
arithmetic type (in which case the usual arithmetic conversions are
applied), both pointers of the same type, or one a pointer and the
other 0.

I concede that Tim's definition of "strong typing" makes sense, and I
won't submit an alternative definition.  However, I will point out that
sloppy usage substitutes "strong typing" for "strong type checking".
"strong type checking" means that the compiler checks for non-sensical
types of expressions in all contexts.  I think that we can agree that
using a floating point number as a pointer is non-sensical, as are the
results of most argument type mismatches.
-- 
	Tom Teixeira,  Massachusetts Computer Corporation.  Westford MA
	...!{ihnp4,harpo,decvax,ucbcad,tektronix}!masscomp!tjt   (617) 692-6200

guy@rlgvax.UUCP (Guy Harris) (01/01/84)

<FIX ME!>

	The "?:" operator, which is a distributed conditional, can have
	differently typed alternative values.  For instance, in the expression
	"i>0 ? 1 : 4.2", the type of the value of the expression cannot be
	determined until the value of i is known -- it might be int, or double.  I
	don't know of any other exceptions to strong typing in C.

According to the C Reference Manual:

	7.13 Conditional operator

	...If possible, the usual arithmetic conversions are performed
	to bring the second and third expressions to a common type;
	otherwise, if both are pointers of the same type, the result
	has the common type; otherwise, one must be a pointer and the
	other the constant 0, and the result has the type of the pointer.

So the type of "i>0 ? 1 : 4.2" is double; the "1" is coerced to double
so this expression is equivalent to "i>0 ? 1.0 : 4.2".

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

cjl@iuvax.UUCP (01/03/84)

#R:utcsrgv:-289200:iuvax:11800007:000:1122
iuvax!cjl    Jan  2 22:37:00 1984


I am against Tim Moroney's definition on "strong typing" which can also
be found in p.38 od Horowitz's "Fundamentals of Programming Languages".
Expressions consist of operators and operands which are just syntactic 
sugar of functions. It is neither consistent nor practical to leave out
procedure argument mismatch checking in a strongly typed language.

Most recent textbooks define a strongly typed language to allow complete,
statical type checking, e.g. p90 in Ghezzi's "Prog. Lang. Concepts",
p129 in MacLennan's "Principles of Programming Languages",
p184 in Ledgard's "The Programming Language Landscape".
The ambiguity lies on the COMPLETENESS.  Generally it should exclude
(1) type-ambiguity of expression which can be found in 
    (a) conditional expression in C,
    (b) record variant in Pascal,(In ISO Pascal, subrange is treated
        as range constraint not a new type) 
    (c) COMMON and EQUIVALENT in FORTRAN,
(2) untyped pointers (in PL/I) and dangerous pointer operation (in C),
(3) assignment incompatibility (which includes procedure parameter checking).

C.J.L. , IUPUI
cjl@Indiana@Rand-Relay