[comp.lang.misc] Dynamic typing

gudeman@cs.arizona.edu (David Gudeman) (03/12/91)

I have become convinced that many people still don't know what I mean
by "dynamic typing".  This is an indictment of computer science
educations in general, that such an important and pervasive paradigm
has been so neglected.  Dynamic typing is not a new or strange way to
program -- Lisp and SNOBOL are both nearly as old as Fortran, and they
are dynamically typed languages.  Mathematical notation is generally
closer to dynamically typed languages than to statically typed
languages.

In fact, I'd go so far as to claim that it is static type checking
that is the peculiar notion, not dynamic typing.  Static typing
originated, as near as I can determine, with low-level languages like
Fortran an Algol that were little more than glorified assemblers.
They had to give type declarations so that they could generate code
that an assembly language programmer expected to be generated for
expressions.  Much later, static typing took on religious significance
as an element of the great god Structured Programming, and became a
living force.  Now people have actually come to believe, against all
evidence, that static typing is important for program reliability.

Part of this belief is rooted in the confusion between weak typing and
dynamic typing.  The two are completely unrelated.  In fact, I'd claim
that most dynamicly typed languages are strongly typed in some sense.
What is the essential difference between strong and weak typing?  I
claim that it is that weakly typed languages can do bizarre things due
to type errors.  For example if the length of an array is not included
in the definition of the type, you can do arbitrary things to memory
by setting out-of-bounds values in the array.  Semantically we say
that the behavior of the program becomes undefined due to a type
error.

If we ignore the idea of when type checking is done, we can define
strong typing as follows:

strong typing		a language feature that guarantees that the
			behavior of a program never becomes undefined
			as a result of a type error.

All dynamically typed language I know have this feature.  In fact for
most of these languages there are no programs that have undefined
behavior,  I don't know whether any statically typed language can
make that claim.

With static typing you need a great deal of information at compile
time to be able to guarantee strong typing.  This has two
consequences: (1) you have to limit the forms of expressions to some
set for which you know a type-checking decision procedure, and (2) you
have to acquire type information somewhere.

The first consequence is unacceptable in my view -- I don't like it
when the set of things I can express are arbitrarily limited.  The
second consequence is also unacceptable if it means that the programer
is burdened with the job of providing this information (and that is
the case for the huge majority of statically, strongly typed
languages).  The computer is supposed to to busy work like checking
type consistency, the programmer should no more be burdened with this
than he should have to calculate constants.  How would you like it if
you could not rely on constant folding, so you had to calculate the
values of all your constants?  Yet this is the same sort of busy work
as writing type declarations.

I think that static typing took on religious significance as a result
of the problems with weak typing -- people didn't understand that the
two issues are orthogonal, and that there was an alternative to static
typing.  They that weak typing is a problem because it can lead to
undefined behavior -- which means it can be very hard to debug.  This
is a reasonable position to take, however the solution they came up
with was not reasonable.  The solution was to require more and more
restrictions to the set of legal expressions, and more and more
declarations from the programmer.

This was so burdensome that enormour amounts of further research went
into finding ways to undo the unpleasant effects of these decisions.
Thus we have 'typecase', polymorphic functions, polymorphic modules,
virtual functions, etc., ad nauseum.  Each feature adds semantic
burden to the language, making it harder to learn and harder to
implement.  And you still don't get the full expressiveness of dynamic
typing, which was the obvious solution to begin with.

The only real problem with dynamic typing is that it is (slightly)
less efficient that static typing.  However, there are two obvious
solutions to this problem, one of which I presented in part 2.  The
other is to start with a dynamically typed language and add optional
type declarations.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

yodaiken@chelm.cs.umass.edu (victor yodaiken) (03/12/91)

In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>are dynamically typed languages.  Mathematical notation is generally
>closer to dynamically typed languages than to statically typed
>languages.
>
>In fact, I'd go so far as to claim that it is static type checking
>that is the peculiar notion, not dynamic typing.  Static typing
>originated, as near as I can determine, with low-level languages like
>Fortran an Algol that were little more than glorified assemblers.
>They had to give type declarations so that they could generate code
>that an assembly language programmer expected to be generated for
>expressions.  Much later, static typing took on religious significance
>as an element of the great god Structured Programming, and became a
>living force.  Now people have actually come to believe, against all
>evidence, that static typing is important for program reliability.
>


When I write:
             let S be a set
             let f:S -> X
             let Y = f(S) and let Z = {g(s): for s IN Y}
             for each  s in S let k_s = f(S)

a human being could figure out that:
           X, Y and Z are also sets
          that by f(S) I mean to apply f to each element of S
and that the last line makes more sense if we interpret f(S) as a typo
that should really be f(s)

How is a dynamically typed programming language going to figure this
out?

peter@ficc.ferranti.com (Peter da Silva) (03/12/91)

In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> Now people have actually come to believe, against all
> evidence, that static typing is important for program reliability.

No, they believe that dynamic typing imposes a certain unavoidable runtime
overhead, and aren't willing to pay that cost. As computers get faster this
becomes more of a moot point outside the frantic world of embedded controllers
and videogames.

You can argue how great that cost is... and certainly when people are happy
with the performance of X, Windows, and Multifinder it's pretty daft to
quibble over less than a factor of 10 or so cost... but it's the real reason
people are still working in C and other Algol-derived languages.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

gudeman@cs.arizona.edu (David Gudeman) (03/13/91)

In article  <27794@dime.cs.umass.edu> victor yodaiken writes:
]When I write:
]             let S be a set
]             let f:S -> X
]             let Y = f(S) and let Z = {g(s): for s IN Y}
]             for each  s in S let k_s = f(S)
]
]a human being could figure out that:
]... that the last line makes more sense if we interpret f(S) as a typo
]that should really be f(s)

A human is a lot smarter than a static type checker.  A static type
checker can only tell you there is an inconsistency between the
declaration of f and its use.  It doesn't know which is the error.

]How is a dynamically typed programming language going to figure this
]out?

A dynamic language might tell you more than the static language does,
given that you actually execute this code.  During the execution of f
you will probably call some function on the argument that is only
defined for sets, and you will get an appropriate error message
telling you what value should be a set and isn't.

Why should a programming language be expected to catch this type of
error at compile time, and not some other class of errors?  Why are
type errors so special?  There are many other sorts of errors that
could be detected at compile time by requiring enough work from the
programmer.  They aren't checked because it is rightly seen that the
extra work outweighs the potential benifits.  The same is true of type
declarations, which is obvious to everyone who programs in a language
without them.  Required declarations are not in languages because they
are a good idea, they are an accident of history.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

yodaiken@chelm.cs.umass.edu (victor yodaiken) (03/13/91)

In article <609@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <27794@dime.cs.umass.edu> victor yodaiken writes:
>]When I write:
>]             let S be a set
>]             let f:S -> X
>]             let Y = f(S) and let Z = {g(s): for s IN Y}
>]             for each  s in S let k_s = f(S)
>]
>]a human being could figure out that:
>]... that the last line makes more sense if we interpret f(S) as a typo
>]that should really be f(s)
>
>A human is a lot smarter than a static type checker.  A static type
>checker can only tell you there is an inconsistency between the
>declaration of f and its use.  It doesn't know which is the error.
>
>]How is a dynamically typed programming language going to figure this
>]out?
>
>A dynamic language might tell you more than the static language does,
>given that you actually execute this code.  During the execution of f
>you will probably call some function on the argument that is only
>defined for sets, and you will get an appropriate error message
>telling you what value should be a set and isn't.
>

I'm agnostic on this issue, but your argument is very unpersuaive.
Won't a runtime type language attempt to do something sensible with the
error, something that might obscure the error, but leave it lying around
for later? If the language decides to apply f to each element of S,then
I may get a perfectly reasonable value for my test data, but suffer from
mysterious errors at some future date.

I'll come up with a more worked out example if needed.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/13/91)

In article <609@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> Why should a programming language be expected to catch this type of
> error at compile time, and not some other class of errors?  Why are
> type errors so special?

Compiler writers generally find it convenient to catch type errors.
Programmers generally find it convenient to know the type of a variable
and to have types always checked at compile time (rather than as an
optimization that a few compilers might provide). Who are you to argue
with taste?

> Required declarations are not in languages because they
> are a good idea,

I have to disagree. Declarations help catch typos, if nothing else.

---Dan

aipdc@castle.ed.ac.uk (Paul Crowley) (03/13/91)

I'm going to split languages in two by type extensibility.  In C and ML,
you can explicitly make up a new type from old types: a widget is made
up of two foos and a bar.  In Logo, there are only three types: words,
numbers, and lists.  If you want to define an imaginary number, you
could use a list of two integers.  If you want to define a UNIX-style
time as a list of two integers [secs, usecs], you can do that too.  If
you accidentally feed an imaginary number to a function that wants a
date, the language won't say a word.  Prolog behaves this way too. 

What are the words for these two? (I know that if you really wanted to,
you could preface every imaginary list with a typename, and write
functions that checked that the elements of these lists were the types
they were supposed to be.  This means that the types of all the elements
of a large structure are checked often.  Doubleplusungood.)

Are these two strong and weak typing?

Also, some languages do type-checking at compile-time, and some at
run-time.  Some (ML and others) typecheck at compile-time _but_ it does
all the work itself.  Run-time typecheckers maintain the type of a value
with the value itself, and on each operation checks that the operation
is an appropriate thing to do.

Are these two static and dynamic typechecking?

Thanks, and sorry for asking here but many people have said "X is the
wrong word for Y" that I felt that the only people who could tell me
what the words mean in the debate were the participants.
                                         ____
\/ o\ Paul Crowley aipdc@uk.ac.ed.castle \  /
/\__/ Part straight. Part gay. All queer. \/

cs450a03@uc780.umd.edu (03/13/91)

The Question, as I understand it, is:
        Is it a "good thing" or a "bad thing" to require programmers
        to specify what types of information can be associated with
        a variable name.

Sample argument for:  Static typing catches errors.
Sample argument against: Static typing encourages errors.

Other argument against: Static typing slows development.
Other argument for: Static typing allows faster code.


Is that the gist of this, so far?
Do I need to add that people favoring dynamic typing also favor dynamic
testing?  Or that existing popular systems (X-windows and emacs come to
mind) are not paragons of efficiency?


My own, biased, observation is that keeping information and control
paths simple goes a long, long way towards catching errors.  Static
typing, as a sort of globalish way of specifying data by side effect
often does little towards simplifying either control or data flow.
The problem being, essentially, that data applicable to function F
may or may not be applicable for function G.  (Or, similarly, datum
X may be used to qualify datum Y, but not datum Z).


For example, consider a function which finds the square of a number.
Nice, simple (almost trivial) function.  Yet on most computers,
you can give it arguments which do not have representable squares.
If you were to solve this problem with static typing, I suppose you
could invent some special numeric type of limited magnitude, and say
that only numbers of that type could be passed to this function.
Does anyone detect a note of insanity here?  Just how are you supposed
to verify that the result of some arbitrary computation, each stage
of which has its own domain (and its own range) will generate results
which are in the domain of the next relation down the line?

You might as well say "Lets find the solution to all problems, rather
than this one problem, because then we'll be more correct."

Yeah, right.

May I humbly suggest, if you really think this (static typing) is a
useful approach, that you spend a little time reviewing completeness
and consistency?  (Ask your favorite mathematician)  Who knows, maybe
you'll be able to solve the knapsack problem.


Finally, to Mr. Grier, who posed the rhetorical problem about trusting
software in critical systems which might have latent bugs:  would you
really trust such software if it had never been tested?  Would it make
you feel safer if each type of data required seperate chunks of code,
with the associated tests and branches and variant storage mechanisms?


This is slightly off the topic of data polymorphism, but I'd like to
point out at work it is considered good form to write code that does
not branch or loop (or use recursion), as much as possible.  Remember,
control flow (such as branching) is just another way of representing
information.  With the proper primitives, you can do an incredible
amount of decision making with zero branching (sort of "super-structured").


Raul Rockwell

gudeman@cs.arizona.edu (David Gudeman) (03/13/91)

In article  <YR+9VUB@xds13.ferranti.com> Peter da Silva writes:
]In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]> Now people have actually come to believe, against all
]> evidence, that static typing is important for program reliability.
]
]No, they believe that dynamic typing imposes a certain unavoidable runtime
]overhead, and aren't willing to pay that cost.

You haven't been following the thread.  Only one person of all that
replied has mentioned the overhead.  Everyone else is worried that
types errors aren't detected by the compiler.  BTW, I mentioned the
high overhead in my first posting on the subject.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

grier@marx.enet.dec.com (03/13/91)

In article <YR+9VUB@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter
da Silva) writes:
|> In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David
|> Gudeman) writes:
|> > Now people have actually come to believe, against all
|> > evidence, that static typing is important for program reliability.
|> 
|> No, they believe that dynamic typing imposes a certain unavoidable
|> runtime
|> overhead, and aren't willing to pay that cost. As computers get
|> faster this
|> becomes more of a moot point outside the frantic world of embedded
|> controllers
|> and videogames.
|> 

   Actually, reliability is exactly my argument.  I think that in
most cases, the static typing of an expression can be inferred,
and thus the best instruction sequence be selected.  (Going out
on a limb here, but I've never been known to be timid in making
statements or speculations.)

   My argument is for reliability in engineering of large software
systems, where a type or operator change might produce an error
embedded somewhere which cannot be determined statically
(remember, there still *has* to be a case where the type inference
logic can't determine the type of an expression, and thus has to
assume the worst,) and might not be found for quite some time.

   If this is your favorite game, this might be annoying but not truly
harmful.  If this is a nuclear power plant or air traffic controller station,
well, let's just hope it doesn't happen.

   I also claim agnostance(is that a real word?) on the utility of dynamic
typing.  Other than BASIC, I can't say I've ever used it to build
any real software, and I'd be a lot happier to have a hard and fast
notion of the type to which a reference/variable refers at any given
point in the program text than to freely toss about dynamically typed
variables.  But I'm poisoned from learning FORTRAN and BASIC at a
frightfully early age and I'm probably ruined for life in that respect.  :-)

					-mjg
------------------------------------------------------------------------
I'm saying this, not Digital.  Don't hold them responsibile for it!

Michael J. Grier                           Digital Equipment Corporation
(508) 496-8417                             grier@leno.dec.com
Stow, Mass, USA                            Mailstop OGO1-1/R6

gudeman@cs.arizona.edu (David Gudeman) (03/13/91)

In article  <25381:Mar1221:07:3891@kramden.acf.nyu.edu> Dan Bernstein writes:
]
]Compiler writers generally find it convenient to catch type errors.

In a language with dynamic typing, everything from the parser to the
code generator tends to be much simpler.  Only the runtime system is
more complex (and not much more complex if you don't add all the extra
features that dynamic typing opens up).

]Programmers generally find it convenient to know the type of a variable

Programmers do know the types of variables with dynamic typing, they
just don't have to tell the computer.

]and to have types always checked at compile time (rather than as an
]optimization that a few compilers might provide). Who are you to argue
]with taste?

I'm someone who has experience with both methods, arguing mostly (I
suspect) with people who don't.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

multics@acm.rpi.edu (Richard Shetron) (03/13/91)

My personal choice would be a language that supported both static and dynamic
type variables.  There have been times when one or two dynamic typed variables
would have been very useful.  I also like static typing as I've had good
luck with compilers finding type mismatches that were errors or possible
sources of problems.  I'm much more experienced with static typed languages
then dynamic, but I would much prefer a language that supported both.

-- 
A good bureaucracy is the best tool of oppression ever invented.
Richard Shetron   USERFXLG@rpi.mts.edu  multics@mts.rpi.edu

Chris.Holt@newcastle.ac.uk (Chris Holt) (03/13/91)

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In article <609@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:

>> Required declarations are not in languages because they
>> are a good idea,

>I have to disagree. Declarations help catch typos, if nothing else.

So in prototype-style code, leave declarations out, and in production
code put them in.  Would you want to have to declare input types
when using a pocket calculator?  On the other hand, would you want
a theorem prover to have to work from scratch, without any hints
as to the legitimate domains of variables?
-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
       "A peace I hope with honour." - Disraeli 1878

sakkinen@tukki.jyu.fi (Markku Sakkinen) (03/13/91)

In article <613@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <YR+9VUB@xds13.ferranti.com> Peter da Silva writes:
>]In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>]> Now people have actually come to believe, against all
>]> evidence, that static typing is important for program reliability.
>]
>]No, they believe that dynamic typing imposes a certain unavoidable runtime
>]overhead, and aren't willing to pay that cost.
>
>You haven't been following the thread.  Only one person of all that
>replied has mentioned the overhead.  Everyone else is worried that
>types errors aren't detected by the compiler.  BTW, I mentioned the
>high overhead in my first posting on the subject.

Your original statement would seem to refer to people (programmers,
managers who decide upon programming languages, etc.) in general,
not only us Usenet freaks who are taking part in this illustrious
thread of discussion.  Surely at least Peter da Silva meant it that
way, and I think very often the reason he suggested is the most
important one.

Just think about "typical" C hackers.  Although C has some degree
of static typing, they could not care less for "program reliability".
On the other hand, they easily get mad about any byte of runtime
overhead that might want to creep into their objects.  The same
horror of overhead seems to be common even among C++ programmers.

Markku Sakkinen
Department of Computer Science and Information Systems
University of Jyvaskyla (a's with umlauts)
PL 35
SF-40351 Jyvaskyla (umlauts again)
Finland
          SAKKINEN@FINJYU.bitnet (alternative network address)

grier@marx.enet.dec.com (03/14/91)

In article <13MAR91.00251986@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:

|> My own, biased, observation is that keeping information and control
|> May I humbly suggest, if you really think this (static typing) is a
|> useful approach, that you spend a little time reviewing completeness
|> and consistency?  (Ask your favorite mathematician)  Who knows, maybe
|> you'll be able to solve the knapsack problem.

   Well, my favorite mathematician is about a hundred miles away right now,
but I studied under him for a few years, so I might speak up here.

   In mathematics, symbols bear more relation to unbound variables in
lambda calculus (well, they bear a LOT of relation :-) 

   In mathematics, it makes NO sense to talk about applying a function
or operation to a symbol unless the symbol is known to be in the
domain of the operation/function.  I.e. writing something like "for all
x, exp(x) is greater than zero" is nonsense.  exp's domain is commonly
the reals, and may be extended to bigger domains like the complexes, but
that doesn't mean it applies to the empty set, or the tree outside my
house's door.  The correct way to write that would be, "for all x, if
x is a real number, exp(x) is greater than zero."

   That's a type-case.  I fully expect to make complete knowledge of the
type of a symbol when I apply an operation to it, and any proof where
applying an operation which might not be valid to a value is incorrect.
(And programs are proofs of algorithms.)

   Now, that's the hard stance.  Following this philosophy, if x is in the
set of integers, x is also NOT in the set of reals (a nice copy of it and the
rest of the integers are embedded there.)  So, it's not uncommon to
sleaze through your typing when there is a clear an unambiguous
conversion implied.

|> 
|> 
|> Finally, to Mr. Grier, who posed the rhetorical problem about trusting
|> software in critical systems which might have latent bugs:  would you
|> really trust such software if it had never been tested?  Would it make
|> you feel safer if each type of data required seperate chunks of code,
|> with the associated tests and branches and variant storage mechanisms?

   Yes and no.  No first.

   "No", because that's why I think that subtyping and inheritance is
so wonderful.  If there's an obvious way to specialize an operation to a
subtype, it *is* clearer and still absolutely statically correct to apply the
supertype's operation to a value of the subtype.

   "Yes", because otherwise you end up putting operations up artifically
high in the tree.  The one which comes to mind and really burns me is 
when there's a "Shape" type, and because people want to have values of
type "Shape" and want to apply the "diameter()" operation to them.  Shapes
don't all have diameters!  At least if they do, it's not with the implicit
meaning implied by the C++ folks who want to define a diameter()
virtual member function for Shape.

   When the operations which can be performed by subtypes of a given
variable's type differ from the operations valid for the type, that's
where "Yes" comes into play.  In David Gudeman's example, I most
certainly don't want some "read()" operation which can return ANY
type of value to allow my code to blindly attempt to apply the "+" or
"Log()" operator to it.  My code should have some reason to believe
that "Log()" makes sense when applied to the value.  Statically checked,
before allowing me or anyone else to run it.

   My motivation is that I believe that compilers should try to do their
best to ensure correctness of a program implementation before allowing 
it to be executed.  If you don't believe in that, well, we differ and
that's life. I just wouldn't do my banking or trust my life to software which
relies on extensive testing rather than some level of ensurred correctness.

   (I don't claim that compilers will ever be able to prove correctness
of the semantics of an algorithm, but I'd like them to at least ensure the
correctness of the proof/implementation.)

   In addition, it's a little late now, but in my original posting, I conceded
that this sort of feature would be appropriate for research and/or
prototyping.

   Type checking is something like a spelling checker.  If you're in a burst
of creativity, it's a pain in the ass, so you turn it off - the flow of
information and ideas is much more important than the static
correctness of the spellings.  However, for producing real reports or
papers or books, you want to ensure not only that the ideas are
generally correct, but that the "i"s are dotted and the "t"s are crossed.
That's what strong static typing is.

   Heck, there's an even more direct parallel.  Syntax checking!  Next thing,
why don't we have languages where they "try" to interpret commands and
do what their best guess to the requested operations are!

   I don't WANT a smart computer, I don't WANT a computer which can
misinterpret ambiguous commands, I don't WANT a computer which
can forget.  (machine checks/hardware failures don't count.)

   I want a computer which does EXACTLY what I tell it, really fast.  If what
I tell it to do doesn't or might not make sense, I want it to let me know
so I can clear up any ambiguity as early as possible, rather than letting it
find it out later (rush hour at a busy airport, suddenly over 256 planes are
in the airspace, and some program writes past the end of an array, blasting
away the stack, crashing the ATC computer,) or make some guess (it's fiction,
but my fav. example is from a book by <forget first name> Hogan, _The Two
Faces of Tomorrow_, where a semi-intelligent computer misinterprets an
ambiguously worded command and ends up demolishing things...)

   (I also want to be able to build up a large library of things that I've
already told the computer how to do and now either want to refine or
use again.  I also want it to tell me if I'm using something
incorrectly, and/or
if I  change an existing component if it breaks other existing
components
which I may or may not know about.)

   This is tiring.  If you don't agree with me, that's OK, but I hope you
stay in the research world rather than producing software which people
pay for and expect to work reliably.  But if you're going to claim that
this approach of dynamic typing increases {productivity, quality, 
performance, correctness, robustness}, I'll continue to differ with you.

|> 
|> Raul Rockwell
|> 
------------------------------------------------------------------------
I'm saying this, not Digital.  Don't hold them responsibile for it!

Michael J. Grier                           Digital Equipment Corporation
(508) 496-8417                             grier@leno.dec.com
Littleton, Mass, USA                       Mailstop OGO1-1/R6

pcg@test.aber.ac.uk (Piercarlo Antonio Grandi) (03/14/91)

On 13 Mar 91 01:21:07 GMT, gudeman@cs.arizona.edu (David Gudeman) said:

gudeman> In article <25381:Mar1221:07:3891@kramden.acf.nyu.edu> Dan
gudeman> Bernstein writes:

brnstnd> Compiler writers generally find it convenient to catch type
brnstnd> errors.

Yes, and this is simply part of the general principle that symbolic
reduction should go as far as possible during compile time. Everything
that is resolvable at compile time, i.e. that does not depend on
variables that will be bound only at runtime, or during further
compilation, should be resolved.

Another principle is that the programmer should have the ability to
decorate the source code with statements of his assumptions, to be
checked by the compiler or runtime system as appropriate.

Both principles above are embodied in something like Common Lisp's
'proclaim', on in the various type inference and explicit declaration
rules of many functional languages.

These two things mean that indeed static type checking should go as far
as possible, which is what you say, but not that there should be no
dynamic typing.

gudeman> In a language with dynamic typing, everything from the parser
gudeman> to the code generator tends to be much simpler.  Only the
gudeman> runtime system is more complex (and not much more complex if
gudeman> you don't add all the extra features that dynamic typing opens
gudeman> up).

No, because the compiler will still have to do as much static checking
as it is possible, either by symbolic reduction or by checking
programmer's assumptions. I believe that compiler or runtime complexity
is not an issue, because one must have *both* static and dynamic
checking. Limiting onself only to one or the other seems constricting
(just static checking), as dynamic checking has then to be simulated or
many applications classes avoided, or hazardous (just dynamic checking),
as checks that do not depend on the particular paths of program
execution are not performed.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@aber.ac.uk

gudeman@cs.arizona.edu (David Gudeman) (03/14/91)

In article  <9106@castle.ed.ac.uk> Paul Crowley writes:
--[description of a Logo's lack of type extension]--
]...  Prolog behaves this way too. 

In prolog you can create a new type by making it a set of terms with
the same name/arity.  That makes it pretty much the same thing as a C
struct.

]What are the words for these two?

I'd call Logo "a language without type extension facilities". :-).

]...  This means that the types of all the elements
]of a large structure are checked often.  Doubleplusungood.)

That is what you have to do in general in dynamically typed languages,
and is what people are talking about when they say that dynamic typing
has a high overhead.  It is also the main reason why I want both
dynamic and static typing -- I want dynamic typing to speed code
developement and upgrades, and I want static typing to eliminate the
overhead for bottlenecks in the code.

]Are these two strong and weak typing?

Strong and weak typing are relative terms.  They refer to how many
areas the static type checking can break down.  For example in K&R C,
you can pass an int to a function that expects a double, and you get
undefined behavior that tends to be hard to track down.  In more
strongly typed languages, the types of values are checked across
procedure calls.  There are hordes of niggling little details to
consider when trying to increase the "strength" of static typing.
Often "strong typing" is intended to be an absolute term, only
refering to languages with no known type checking loopholes.

]Also, some languages do type-checking at compile-time, and some at
]run-time.  Some (ML and others) typecheck at compile-time _but_ it does
]all the work itself.

I thought you had to declare the types of functions in ML...

]Are these two static and dynamic typechecking?

Yes.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

aipdc@castle.ed.ac.uk (Paul Crowley) (03/14/91)

In article <618@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <9106@castle.ed.ac.uk> Paul Crowley writes:
>--[description of a Logo's lack of type extension]--
>]...  Prolog behaves this way too. 
>
>In prolog you can create a new type by making it a set of terms with
>the same name/arity.  That makes it pretty much the same thing as a C
>struct.

How is a language in which you can say [imaginary,5,3] different from
one in which you can say imaginary(5,3)?  My problem with both is the
way they're both perfectly happy with imaginary(this_doesnt,make_sense).

You could solve it with a dynamically typed OO language, I suppose.
                                         ____
\/ o\ Paul Crowley aipdc@uk.ac.ed.castle \  /
/\__/ Part straight. Part gay. All queer. \/

schwartz@groucho.cs.psu.edu (Scott Schwartz) (03/14/91)

gudeman@cs.arizona.edu (David Gudeman) writes:
| I thought you had to declare the types of functions in ML...

% sml
Standard ML of New Jersey, Version 0.66, 15 September 1990
- fun foo x = x + 1;
val foo = fn : int -> int
- fun bar x = x + 1.0;
val bar = fn : real -> real

kers@hplb.hpl.hp.com (Chris Dollin) (03/14/91)

[Abbreviations: STL - statically typed language; DTL - dyamically T'd L.]

Dan Bernstein writes:

   Compiler writers generally find it convenient to catch type errors.

Er ... only if (begging the question) they're dealing with STL's. [Even then, 
it's not *convenient*, it's *necessary* - otherwise the compiler would be
wrong.]

   Programmers generally find it convenient to know the type of a variable
   and to have types always checked at compile time (rather than as an
   optimization that a few compilers might provide). Who are you to argue
   with taste?

Alternative A: I'm not a programmer; I don't find it "convenient to know the
type of a variable and to have types always checked at compile time".

I am reluctant to endorse this alternative because it seems to leave the
majority of my life's income (and the behaviour of a non-empty set of
computers) unexplained.

Alternative B: It is not true that "programmers generally ...". Programmers in
STL's have no choice. (Programmers in DTL's sometimes have the opposite
choice.) 

Since I know several individuals who would satisfy the usual crieria for being
programmers who don't "find it convenient ...", alternative B seems plausible.

[Other alternatives are imaginable.]

Observation: this is all a bit silly anyway. [Didn't we have the B&D debate a
few months ago?] The poles of argument seem to be:

* one ought to be able to write anything that makes sense in its context
* it's much nicer to catch errors at compile-time than run-time, if possible

Since I suspect that no-one would seriously disagree with these two poles, we
can concentrate on

* how much expressiveness are you prepared to give up for the cost of being
able to check correctness (be it type-correctness or whatever) at compile-time?

Bear in mind that in (say) Pascal, checking that an expression E has type T may
require a run-time check (suppose T is 1..10 and E is i+1, i:T), so a
presumptive STL may require run-time checks.

Given the choice between (say) C and Pop (or Lisp) I'd pick Pop (or, if I had
to, Lisp). As David G says, type errors turn out to be just not that frequent,
because you just don't write ill-typed code very often, and when you do, it
usually surfaces quickly. [The argument that it might not, and remain a logic
bomb lurking in your code ready to pounce on an unsuspecting user, is moot;
"real" logic errors can so lurk, but because we know they cannot be cheaply
detected at compile-time, no-one is upset that thay are not so detected;
instead they deploy design, and code reviews, and testing; in short, Software
Engineering.] 

[Of couse, for Pepper (the Pop-like language I'm building at the moment) I'd
like to design a static type system which will spot likely type errors at
compile-time, but not prevent you from overriding it; since Pepper programs
already work quite happily with no static typechecks, I'm in no hurry to build
a grotty little pedant rather than a pleasant cardboard cutout.]

--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

gudeman@cs.arizona.edu (David Gudeman) (03/14/91)

In article  <1991Mar13.163629.12630@engage.enet.dec.com> grier@marx.enet.dec.com writes:
]
]   In mathematics, it makes NO sense to talk about applying a function
]or operation to a symbol unless the symbol is known to be in the
]domain of the operation/function.  I.e. writing something like "for all
]x, exp(x) is greater than zero" is nonsense.

Nonsense.  Anybody with a clue immediately understands that x is
restricted to values in the domain of exp.

]...  In David Gudeman's example, I most
]certainly don't want some "read()" operation which can return ANY
]type of value to allow my code to blindly attempt to apply the "+" or
]"Log()" operator to it.

In the first place, the read() only produced ints and floats, and the
return value was used in a context where either was a legal value.  In
the second place, if the read() function were defined to return other
sorts of values and you wrote "x + read()" the problem would be in the
program, not in the definition of "read()".

]My code should have some reason to believe
]that "Log()" makes sense when applied to the value.  Statically checked,
]before allowing me or anyone else to run it.

Then you are going to need types "positive int" and "positive float".
Do you also want type security for division?  Then you are going to
need "non-0 int", "non-0 float", as well as "positive non-0 int", etc.
I count 8 numeric types needed just to get type security for two
common operations.  The point is that static type checking involves an
essentially arbitrary set of restrictions, and there is no evidence
that it adds to program security at all.

I will concede that optional type declarations might help program
security in the same way that optional assert clauses do.  (I don't
object to having type declarations in a language, I object to
_required_ type declarations.)  However, for many variables there is
no need to declare the type since it is obvious from the var's use,
and programmers should be smart enough to know the difference.
Whatever makes a language designer think --at language design time--
that he knows more about the requirements of a program than the
programmer will know --at program design time?

]... I just wouldn't do my banking or trust my life to software which
]relies on extensive testing rather than some level of ensurred correctness.

You must be joking.  Static type checking doesn't give any reasonable
level of assurance at all -- it is never the case that simply because
program compiles without errors, there is reason to believe that it
has some level of reliability.  Testing is the _only_ known way to
give any assurance at all.  And a given amount of testing generally
provides more assurance for a language with dynamic typing than it
would for a language with static typing.  (Because programs in
dynamically typed languages are usually much smaller and have fewer
paths to test.)

]   Heck, there's an even more direct parallel.  Syntax checking!  Next thing,
]why don't we have languages where they "try" to interpret commands and
]do what their best guess to the requested operations are!
]
]   I don't WANT a smart computer, I don't WANT a computer which can
]misinterpret ambiguous commands, I don't WANT a computer which
]can forget...

There is no parallel there at all.  Dynamically typed languages in no
sense try to interpret ambiguous commands or "guess" what is wanted.
The typing rules are just as unambiguous in a dynamically typed
language as in a statically typed language.  But in the dynamically
typed language the rules are simpler, more intuitive, and easier to
use.  (Depending on the language of course -- I wouldn't argue that
Smalltalk's rules are any of the above...)

]... and some program writes past the end of an array, blasting
]away the stack, crashing the ATC computer,)

You are confusing weak typing with dynamic typing.  I don't know of
any dynamically typed language that lets you write past the end of an
array (unless it first expands the array).

]   This is tiring.  If you don't agree with me, that's OK, but I hope you
]stay in the research world rather than producing software which people
]pay for and expect to work reliably.

I'm sure that the thousands of people currently using and relying on
software written in dynamically typed languages are touched by your
concern.  We aren't worried about it though, we'll just go on using
easily modifiable software that was written in half the time with
twice the functionality, with no loss of reliabilty.  (Although we
_do_ have to buy faster machines or put up with slower response.)
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/14/91)

In article  <1991Mar13.143805.28242@tukki.jyu.fi> Markku Sakkinen writes:
]In article <613@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]>]In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]>]> Now people have actually come to believe, against all
]>]> evidence, that static typing is important for program reliability.
]
]Your original statement would seem to refer to people

Not if you were following the thread.  And frankly, not even if you
read the article carefully.  I was not speaking in opposition to
static type checking, I was specifically talking about this religious
belief that static type checking is necessary for program reliability.
I was the first person in this discussion to say that dynamic typing
is less efficient, and that efficiency was a good reason to use static
typing sometimes.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

anw@maths.nott.ac.uk (Dr A. N. Walker) (03/14/91)

In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu
(David Gudeman) writes:

[Disclaimer:  I usually find DG's contributions to be thoughtful and
accurate.  Exceptionally, I think this one was ill-conceived, and
dangerously close to a polemic.  This may have coloured my reaction
to it. -- ANW]

>				   Mathematical notation is generally
>closer to dynamically typed languages than to statically typed
>languages.

	& it was invented and had evolved long before the whole modern
problem of trying to describe problems and algorithms in a precise way
for computer consumption.  Such examples as we have of attempts at
precision in maths [eg, by Russell] are not very convincing.  We write
sloppy maths because "we all know what we mean" -- I don't think that's
a very good model for computer languages.

>						   Static typing
>originated, as near as I can determine, with low-level languages like
>Fortran an Algol that were little more than glorified assemblers.

	[I assume that Algol 60, rather than older or modern versions,
is meant.]  This statement is just historically ignorant.  Algol 60
was *specifically* designed to describe algorithms, independently of
the computer.  Static typing was certainly not put in to make code
generation easier;  indeed, it was widely thought that Algol 60 would
never be used in real compilers.  Fortran did not, for the most part,
have declarations at all;  and many people argued bitterly, using just
the same "busy work" arguments, against having them in Algol.  The view
that you should say what variables you want to use, and what you want
to use them for, prevailed (for good reason, in my opinion).

>		  For example if the length of an array is not included
>in the definition of the type, you can do arbitrary things to memory
>by setting out-of-bounds values in the array.  Semantically we say
>that the behavior of the program becomes undefined due to a type
>error.

	Well this is a matter of semantics [:-)].  The C fragment
"int a[10]; a[23] = 17;" might, in many implementations, do something
arbitrary to memory, but in my opinion it contains an error.  Making
the behaviour undefined is [perhaps wrong-headedly] a convenience
for the compiler writer.  Would C become more strongly typed if the
behaviour became defined in some way?

>With static typing you need a great deal of information at compile
>time to be able to guarantee strong typing.  This has two
>consequences: (1) you have to limit the forms of expressions to some
>set for which you know a type-checking decision procedure, and (2) you
>have to acquire type information somewhere.

	Ie, (1) you have to know what your expression is intended to do,
and (2) you have to use variables in a disciplined way.  I don't find
these "consequences" either irksome or undesirable.  When I write programs
in dynamically typed languages, I try hard to follow the same precepts.

>	      The computer is supposed to to busy work like checking
>type consistency, the programmer should no more be burdened with this
>than he should have to calculate constants.

	Type *consistency* is indeed the compiler's job, but I don't
find it unreasonable that I should document my identifiers.

>					      How would you like it if
>you could not rely on constant folding, so you had to calculate the
>values of all your constants?

	[Irrelevant aside:  you *can not* rely on constant folding.
The fragment "i := 2; j := 3;  if i/j != 2/3 then ..." will indeed
"fail" on many [arguably broken] systems, which may matter to the
Numerical Analyst.]

	There is a place for dynamic typing (I enjoy writing shell
scripts!), and a case for rapid prototyping, but there is also a
case for traditional declarations;  there is no need for either
"camp" to knock the other.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

grier@marx.enet.dec.com (03/15/91)

In article <626@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David
Gudeman) writes:
|> In article  <1991Mar13.163629.12630@engage.enet.dec.com>
|> grier@marx.enet.dec.com writes:
|> ]
|> ]   In mathematics, it makes NO sense to talk about applying a
|> function
|> ]or operation to a symbol unless the symbol is known to be in the
|> ]domain of the operation/function.  I.e. writing something like "for
|> all
|> ]x, exp(x) is greater than zero" is nonsense.
|> 
|> Nonsense.  Anybody with a clue immediately understands that x is
|> restricted to values in the domain of exp.
|> 

   Wrong.   If you consider a proof to have some static notion of
correctness (which is exactly what they are,) this kind of error is
absolutely wong in a formal exhibition of the proof.

   True, when we're talking and scribbling on the board working
through examples, we'd use the "usual" meanings and not necessarily
be exact, but if you're submitting the proof for formal inspection, in
a journal or book or on an exam, IT IS WRONG.

   But then, I believe this is also the root of my belief that static typing
of variables, included with inheritance for reuse and genericity and some
sort of a "type_case" construct to permit more specialized operations in
a controlled fashion, is correct.  (i.e. it's implicit that if I write
"5+6 = 11"
on the board, I'm referring to the addition operator for the integers
or
possibly naturals.  If you think I'm arguing for specially writing code
to deal with all of the possibly addition definitions, you're missing
the
part of my argument about subtyping.)

|> ]...  In David Gudeman's example, I most
|> ]certainly don't want some "read()" operation which can return ANY
|> ]type of value to allow my code to blindly attempt to apply the "+" or
|> ]"Log()" operator to it.
|> 
|> In the first place, the read() only produced ints and floats, and the
|> return value was used in a context where either was a legal value.  In
|> the second place, if the read() function were defined to return other
|> sorts of values and you wrote "x + read()" the problem would be in the
|> program, not in the definition of "read()".

   Where is this knowledge known?  Your arguments so far have been to
not explicitly state the type of value accepted by or returned by an
operator.  And it still doesn't help when "Read()" is extended in the
future to return strings, cats and dogs.

|> 
|> ]My code should have some reason to believe
|> ]that "Log()" makes sense when applied to the value.  Statically checked,
|> ]before allowing me or anyone else to run it.
|> 
|> Then you are going to need types "positive int" and "positive float".
|> Do you also want type security for division?  Then you are going to
|> need "non-0 int", "non-0 float", as well as "positive non-0 int", etc.
|> I count 8 numeric types needed just to get type security for two
|> common operations.  The point is that static type checking involves an
|> essentially arbitrary set of restrictions, and there is no evidence
|> that it adds to program security at all.

   That's silly.  I'm arguing this in the light of subtyping, so you could
have some sort of abstract subtype called "Algebraic" or "Number" where
it has the "usual" operations defined.

   ReadNumber() would return a "Number" to permit at least type-safety.
This is a case where the compiler would not have a choice but defer the
binding of the operation invocation to actual code/methods until run-time,
but it's still working in a statically typed type-safe environment.

|> 
|> I will concede that optional type declarations might help program
|> security in the same way that optional assert clauses do.  (I don't
|> object to having type declarations in a language, I object to
|> _required_ type declarations.)  However, for many variables there is
|> no need to declare the type since it is obvious from the var's use,
|> and programmers should be smart enough to know the difference.
|> Whatever makes a language designer think --at language design time--
|> that he knows more about the requirements of a program than the
|> programmer will know --at program design time?
|> 

   She doesn't.  That's why I argue for subtyping/inheritance/genericity/etc.

   The language is the framework and syntax to work in.  It should be
minimal and formal.

   There's no one language for mathematics, but if you DO pick one, it's
very formal and minimal (usually it just involves notions of n-ary operators,
groupings via parenthesis - not necessary if you use prefix notation - and
quantifiers.  But look what you can do once you've build up your base of
operations and theorems!)

|> ]... I just wouldn't do my banking or trust my life to software which
|> ]relies on extensive testing rather than some level of ensurred correctness.
|> 
|> You must be joking.  Static type checking doesn't give any reasonable
|> level of assurance at all -- it is never the case that simply because
|> program compiles without errors, there is reason to believe that it
|> has some level of reliability.  Testing is the _only_ known way to
|> give any assurance at all.  And a given amount of testing generally
|> provides more assurance for a language with dynamic typing than it
|> would for a language with static typing.  (Because programs in
|> dynamically typed languages are usually much smaller and have fewer
|> paths to test.)

   Actually, formal proof is the only known way to ensure any measure of
static correctness.  Testing is like the old joke about "all odd numbers are
prime"... (let's see... 3, 5, 7, wow, yeah, I guess they are!)  Exhaustive
testing is also possible, but the claim nowadays is that software systems are
the most complex things mankind has ever built.  Now, I know that
simulations of electronic circuits usually run for hours/months trying to
exhaustively test strange conditions.  And they *still* find bugs years later
due to some strange signal path.

   You're going to tell me that some poking around by either the developers
or users is going to exhaustively test a software system?

   Formal proof is rather unwieldy in my opinion in software right now, but
I believe it's because we don't have a good enough set of theorems to work
with.  (The CS proofs I read tend to either be extremely vague, or end up
doing the equivalent of re-proving that 1+1=2 over and over again.  I argue
that code is a proof of an algorithm, and the compiler should do as much as
it can to check it before allowing it to be executed.  Certain semantics are
IMO beyond the realm of computers to verify, but correctness of structure
and syntax are certainly possible.)

   We're well beyond the topic now though... back to dynamic typing.

|> 
|> ]   Heck, there's an even more direct parallel.  Syntax checking! 
|> Next thing,
|> ]why don't we have languages where they "try" to interpret commands
|> and
|> ]do what their best guess to the requested operations are!
|> ]
|> ]   I don't WANT a smart computer, I don't WANT a computer which can
|> ]misinterpret ambiguous commands, I don't WANT a computer which
|> ]can forget...
|> 
|> There is no parallel there at all.  Dynamically typed languages in
|> no
|> sense try to interpret ambiguous commands or "guess" what is wanted.
|> The typing rules are just as unambiguous in a dynamically typed
|> language as in a statically typed language.  But in the dynamically
|> typed language the rules are simpler, more intuitive, and easier to
|> use.  (Depending on the language of course -- I wouldn't argue that
|> Smalltalk's rules are any of the above...)
|> 

   Once you've built up a large type library, the chances that any one
person has a detailed gestalt of the whole thing is very unlikely.  Therefore,
unless you explicitly restrict the type of a value you receive, you can only
assume that it's any type.  Therefore the complex operation application rules
(well, they're probably not complex, but when you have a library of
hundreds or thousands of types you start to hit a level of complexity where
it *looks* almost like guessing,) aren't totally predictable and you can get
an unexpected operation invoked.

   That's clearly a loose argument.

   Again, one more time, with feeling, my concern is with large software
systems.  If you're toying around with some new means to perform
parallel decomposition of a relational join, your domain is probably small
enough that you could maintain the simulation correctly yourself with
a dynamically typed system.  This is great for prototyping and research.
Let the ideas flow!

   Real-life systems get quite large and no one person has the complete and
total understanding of the whole system that you might in a limited domain.
This is my argument.  I'm sorry, David, if I've taken away from the
pure discussion of the topic and it's an interesting one even outside of
the engineering aspects, but I'm responding largely to the claims that
some testing by (a) people who know the system or (b) people who are
likely to only push the buttons that the (a) folks told them to are going to
ensure anything.

   Having something of a mathematical background, I believe that the
True Path to Software Engineering and Correctness is by building up
layers and levels of provably correct software.  (If I'm proving some sort
of extension theorem in topology, I don't have to continually prove
basic identities of fields.)  I'm also a fan of Eiffel, which allows for
building
up of these layers, refining them as you go along.  (It has its
problems,
but more than any other language I know of today, I think it's on the
right track for enabling building of large extensible systems.)

|> ]... and some program writes past the end of an array, blasting
|> ]away the stack, crashing the ATC computer,)
|> 
|> You are confusing weak typing with dynamic typing.  I don't know of
|> any dynamically typed language that lets you write past the end of an
|> array (unless it first expands the array).
|> 

   You're right.  I was in unnaturally early yesterday and not everything
came out as it would have after a good cup of coffee.  :-)

|> ]   This is tiring.  If you don't agree with me, that's OK, but I hope you
|> ]stay in the research world rather than producing software which people
|> ]pay for and expect to work reliably.
|> 
|> I'm sure that the thousands of people currently using and relying on
|> software written in dynamically typed languages are touched by your
|> concern.  We aren't worried about it though, we'll just go on using
|> easily modifiable software that was written in half the time with
|> twice the functionality, with no loss of reliabilty.  (Although we
|> _do_ have to buy faster machines or put up with slower response.)

   Yup, and it's going to cost when we have to maintain those
systems for the next 5-20 years.  Or are we going to just re-develop
them all over again when we need to extend them?  There go all your
claims of productivity and efficiency for the programmer.  Remember that
most of the cost of a software system is in the long term maintenance
and evolution - not in the initial development.

|> --
|> 					David Gudeman
|> gudeman@cs.arizona.edu
|> noao!arizona!gudeman
|> 
------------------------------------------------------------------------
I'm saying this, not Digital.  Don't hold them responsibile for it!

Michael J. Grier                           Digital Equipment Corporation
(508) 496-8417                             grier@leno.dec.com
Littleton, Mass, USA                       Mailstop OGO1-1/R6

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/15/91)

In article <1991Mar13.124811.1380@newcastle.ac.uk> Chris.Holt@newcastle.ac.uk (Chris Holt) writes:
> So in prototype-style code, leave declarations out, and in production
> code put them in.

Well, I usually start writing a program by seeing how the data will be
organized. Once I know this, I might as well tell the computer about it.
(This is an observation about my programming style, nothing more; I
never decided that---what do they call it now?---``data structure
design'' was the Right Way to Code, and I can't argue that it's better
or worse than any other style.) What would I gain from pushing off
explicit declarations until the last moment? Are you saying that it's
not important to catch typos and type errors during debugging?

---Dan

cs450a03@uc780.umd.edu (03/15/91)

Grier:
>   In mathematics, it makes NO sense to talk about applying a function
>or operation to a symbol unless the symbol is known to be in the
>domain of the operation/function.  I.e. writing something like "for all
>x, exp(x) is greater than zero" is nonsense.  exp's domain is commonly
>the reals, and may be extended to bigger domains like the complexes, but
>that doesn't mean it applies to the empty set, or the tree outside my
>house's door.  The correct way to write that would be, "for all x, if
>x is a real number, exp(x) is greater than zero."

Hmm... so what is the correct way of finding if x is a member of set
y?  What is the domain of a function which selects the first 5 from
an ordered sequence?  What is the domain of = (test-if-equal)?  What
is the domain of encapsulation (obtain-a-pointer-to)?  What is the 
domain of search (look-up-in-table)?

Why should I limit assignment (name-association) based on the domain
of, for instance, addition?

> ...
>  Now, that's the hard stance.  Following this philosophy, if x
>is in the set of integers, x is also NOT in the set of reals (a nice
>copy of it and the rest of the integers are embedded there.)  So,
>it's not uncommon to sleaze through your typing when there is a clear
>an unambiguous conversion implied.

And which of those types is the domain of log() ?  Can you think of a
good reason to not have runtime type-checking for log?

>|> Finally, to Mr. Grier, who posed the rhetorical problem about trusting
>|> software in critical systems which might have latent bugs:  would you
>|> really trust such software if it had never been tested?  Would it make
>|> you feel safer if each type of data required seperate chunks of code,
>|> with the associated tests and branches and variant storage mechanisms?
> 
>   Yes and no.  No first.
> 
>   "No", because that's why I think that subtyping and inheritance is
>so wonderful.  If there's an obvious way to specialize an operation to a
>subtype, it *is* clearer and still absolutely statically correct to apply the
>supertype's operation to a value of the subtype.

No this makes you feel safer?  Or no it doesn't make you feel safer?

>   "Yes", because otherwise you end up putting operations up artifically
>high in the tree.  The one which comes to mind and really burns me is 
>when there's a "Shape" type, and because people want to have values of
>type "Shape" and want to apply the "diameter()" operation to them.  Shapes
>don't all have diameters!  At least if they do, it's not with the implicit
>meaning implied by the C++ folks who want to define a diameter()
>virtual member function for Shape.

Sounds like a good reason to avoid testing to me.  (Right folks?)

>   My motivation is that I believe that compilers should try to do their
>best to ensure correctness of a program implementation before allowing 
>it to be executed.  If you don't believe in that, well, we differ and
>that's life. I just wouldn't do my banking or trust my life to software which
>relies on extensive testing rather than some level of ensurred correctness.

I have no objection to compilers doing their best.  Nor do I claim
that testing somehow negates the responsibility of the designer to
ensure correctness.  Nor do I claim that the compiler relieves the
designer of the responsibility for correctness. 

On the other hand, I do claim that IF the language lets me, I can
write programs which can be verified quite simply.  The technique is
very similar to induction.  (Check boundary conditions, check for
typical case).  Sometimes you can make your testing into a proof of
correctness (given basic assumptions about the correctness of the
underlying machinery).  Sometimes not.  

As an aside, you can test both your understanding of a concept, and
its correctness by implementing it and seeing if it works.  Proofs can
have bugs in them too, you know.

>   (I don't claim that compilers will ever be able to prove correctness
>of the semantics of an algorithm, but I'd like them to at least ensure the
>correctness of the proof/implementation.)

And how is a compiler supposed to prove that a program fulfills its
purpose?  At best, the compiler can prove that the program can be
compiled. 

>...
>   I don't WANT a smart computer, I don't WANT a computer which can
>misinterpret ambiguous commands, I don't WANT a computer which
>can forget.  (machine checks/hardware failures don't count.)

I do not believe I was arguing for any of these features.

>   I want a computer which does EXACTLY what I tell it, really fast.  If what
>I tell it to do doesn't or might not make sense, I want it to let me know
>so I can clear up any ambiguity as early as possible, rather than letting it
>find it out later (rush hour at a busy airport, suddenly over 256 planes are
>in the airspace, and some program writes past the end of an array, blasting
>away the stack, crashing the ATC computer,) or make some guess ...

A very good reason for runtime type checking.

>   This is tiring.  If you don't agree with me, that's OK, but I hope you
>stay in the research world rather than producing software which people
>pay for and expect to work reliably.  But if you're going to claim that
>this approach of dynamic typing increases {productivity, quality, 
>performance, correctness, robustness}, I'll continue to differ with you.

I must admit that following your line of thought (or attempting to--
I'm not sure I succeeded) is tiring for me as well.  I should point
out that I am not in the research world, and work very hard to see
that what people pay for works well.  And while I disclaim knowledge
of what you think dynamic typing is, I do claim that type as a
property of DATA as opposed to a property of a NAME does increase
productivity, quality, correctness as well as robustness.

In fact, I claim that assigning types to names, independent of the
assigned values, is a akin to side-effect driven programming.  I
realize that it is necessary at a low level.  I don't see it as a
productivity boost.  


Raul Rockwell

grier@marx.enet.dec.com (03/15/91)

In article <14MAR91.22372006@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
|> 
|> Hmm... so what is the correct way of finding if x is a member of set
|> y?  What is the domain of a function which selects the first 5 from
|> an ordered sequence?  What is the domain of = (test-if-equal)?  What
|> is the domain of encapsulation (obtain-a-pointer-to)?  What is the 
|> domain of search (look-up-in-table)?

   I don't see what point you're trying to make.  If these aren't
take a look at most any object-oriented system designed in the last
decade or so with strong typing and parametrized types.  Or even outside
the OO world per se., in perhaps CLU.

   (In direct answer, picking a syntax, perhaps "contains(y, x)",
"select_first_n(seq, 5)", "Object", huh?  let's keep pointers out of this..,
assuming that the Table type is parametrized ala "Table[KeyType,
ValueType]", it's the type specified for KeyType when the particular
Table value instance was created.)

|> 
|> Why should I limit assignment (name-association) based on the domain
|> of, for instance, addition?

   I don't know, why should you?  I think you're trying to twist what I'm
saying somehow but I don't see your point.  Perhaps it's your terminology.
I'll spell it out.

   If in some scope you have a variable, let's call it X, which is
declared to be of type "Number" (presuming all the types with
algebraic-type operations fall under there for the sake of argument,)
the compiler should ensure that any value I attempt to bind
X to is compatible with the type of X.  (I.e. you can assign 5, pi,
e, 27, 42, 0.00000001, maybe even 2+3i to X.  But you couldn't
assign "my car", "the red firetruck" or "The Soviet Union" to X.
Insert appropriate type-checking rules for generics/parametrized
types.)

|> 
|> > ...
|> >  Now, that's the hard stance.  Following this philosophy, if x
|> >is in the set of integers, x is also NOT in the set of reals (a nice
|> >copy of it and the rest of the integers are embedded there.)  So,
|> >it's not uncommon to sleaze through your typing when there is a clear
|> >an unambiguous conversion implied.
|> 
|> And which of those types is the domain of log() ?  Can you think of a
|> good reason to not have runtime type-checking for log?
|> 

   I can think of a better reason not to have runtime type checking for
log: if I'm doing it quite a bit, the runtime costs are high.  Wasn't that
part of the whole original premise or why David Gudeman made his
original postings?

   I believe that Ada's model here is correct.  No correct program may
depend on exceptions for its operation.  (Implying that if you have
a correct program, you can turn off all the nice bounds-testing and
such and end up with a nice and *fast* program.  Commercial Ada
compilers from both Rational and Digital do this.)

|> >|> Finally, to Mr. Grier, who posed the rhetorical problem about trusting
|> >|> software in critical systems which might have latent bugs:  would you
|> >|> really trust such software if it had never been tested?  Would it make
|> >|> you feel safer if each type of data required seperate chunks of code,
|> >|> with the associated tests and branches and variant storage mechanisms?
|> > 
|> >   Yes and no.  No first.
|> > 
|> >   "No", because that's why I think that subtyping and inheritance is
|> >so wonderful.  If there's an obvious way to specialize an operation to a
|> >subtype, it *is* clearer and still absolutely statically correct to
|> apply the
|> >supertype's operation to a value of the subtype.
|> 
|> No this makes you feel safer?  Or no it doesn't make you feel safer?

   Beings I discusses my thoughts on both counts, does it matter other
than you wanting to pick nits?

|> 
|> >   "Yes", because otherwise you end up putting operations up artifically
|> >high in the tree.  The one which comes to mind and really burns me is 
|> >when there's a "Shape" type, and because people want to have values of
|> >type "Shape" and want to apply the "diameter()" operation to them.  Shapes
|> >don't all have diameters!  At least if they do, it's not with the implicit
|> >meaning implied by the C++ folks who want to define a diameter()
|> >virtual member function for Shape.
|> 
|> Sounds like a good reason to avoid testing to me.  (Right folks?)
|> 

   Testing what?

   What does that have to do with placing operations artificially high in
an inheritance/subtyping tree?  This is a design issue more than a
testing one.

   I am most certainly not arguing against testing.  However, if you
haven't had an error reported, it doesn't mean it doesn't exist - it
just means that either the tester hasn't found it or didn't feel like
reporting it that day.

|> 
|> I have no objection to compilers doing their best.  Nor do I claim
|> that testing somehow negates the responsibility of the designer to
|> ensure correctness.  Nor do I claim that the compiler relieves the
|> designer of the responsibility for correctness. 
|> 

   Ok, we're in total agreement here.  I feel that there is/should be
an important synergy between programmer and compiler here.
(Good) compilers can find a great deal more optimizations than
people can on silly things like removing loop invariants, etc.  People
can do much more important optimizations on newer and better
spiffy algorithms.  However, people shouldn't be wasting their
time second-guessing the compiler.  But I digress as usual...

|> On the other hand, I do claim that IF the language lets me, I can
|> write programs which can be verified quite simply.  The technique is
|> very similar to induction.  (Check boundary conditions, check for
|> typical case).  Sometimes you can make your testing into a proof of
|> correctness (given basic assumptions about the correctness of the
|> underlying machinery).  Sometimes not.  
|> 

   Right, for a single program in a single instance.  If you've been
following what I've been saying, my argument has been around the
long term maintenance of large systems.  I could add, subtract and
multiple and even take derivatives and come up with anti-derivatives
long before I understood any of the theory.  Loose induction and
random sampling don't make for proofs.

   Simple programs can be verified simply.  Complex ones are harder.

   This is one of several places where "programs as proofs" falls apart.
My central argument is in long term management of large software
projects.

|> As an aside, you can test both your understanding of a concept, and
|> its correctness by implementing it and seeing if it works.  Proofs can
|> have bugs in them too, you know.
|> 

   No arguments here! Again your comments make me believe that
you don't understand my stance.  Please re-read my postings if
you have them available.

|> >   (I don't claim that compilers will ever be able to prove correctness
|> >of the semantics of an algorithm, but I'd like them to at least ensure the
|> >correctness of the proof/implementation.)
|> 
|> And how is a compiler supposed to prove that a program fulfills its
|> purpose?  At best, the compiler can prove that the program can be
|> compiled. 
|> 

   Re-read my previous posting today.  (well, yesterday at this
point.)

|> >...
|> >   I don't WANT a smart computer, I don't WANT a computer which can
|> >misinterpret ambiguous commands, I don't WANT a computer which
|> >can forget.  (machine checks/hardware failures don't count.)
|> 
|> I do not believe I was arguing for any of these features.

   (again) Read my posting earlier today.

   I claim that once a type hierarchy grows beyond the immediate and
total understanding of one person at a time, and into a depth of complexity
where the application of the algorithms for operation selection tend
towards guessing/smartness.  (I don't believe in AI per se, but I can
imagine that applying even simple pattern matching rules against a large
enough rule base could possibly maybe someday potentially doubtfully
pass a turing test.  This is exactly the situation when you have a large
projects.  Whether you're dealing with a decent modular language like
Ada or Modula-2, or into the OO world with an Eiffel, Trellis or Modula-3,
you need automated ensurance of type safety.)

   You're sitting on the bridge of the enterprise.  "Sulu, press the red
button".
(Unbeknownst to you, Captain Kirk, a new Red self-destruct button was
installed right next to the Warp Factor 1.0 button which used to be
called red
but might now be considered a dingy maroon.  ka-bam-ooo...  it's a
shame,
there was once a time when you knew that ship from stem to stern.  If
only
Star Fleet had had the sense to make a regulation that only one Red
button
would be on the navigator's console, but nobody thought that such
regulations and checks were necessary...)

   Once again, this is a problem I see in large systems evolving over time.
My motivation isn't so much to prove correctness as to ensure that changes
and growth don't invalidate other algorithms' implementations.  My
extending "Read()" to return baby names in addition to numbers broke
David Gudeman's program which assumes they're numbers, unfortunately.

   I want the system to be able to recognize these problems and prevent
them from being made into a running system.  We all understand that
even in a strongly typed language where type information isn't made
explicitly available, the most specific actual type of an expression
can't always be inferred.  Which means that either (a) we don't do static
checking because we don't want those blasted type error or syntax
error messages anyways, or (b) there are cases where it is impossible
for the compiler to statically type-check the program.

   In the (a) case, well, damn the torpedoes, full speed ahead, and I'll
just make sure I don't fly your airline or do banking with your bank.

   In the (b) case now either you've lost type-safety (which I claim is
good and worthy of our respect,) or you have to start inserting clues
about types of expressions back into your program.  Oh, but wait,
that's sooo difficult, we can't do that.  We'll just let it run... (tick,
tick, tick...)

|> 
|> >   I want a computer which does EXACTLY what I tell it, really fast.
|> If what
|> >I tell it to do doesn't or might not make sense, I want it to let me
|> know
|> >so I can clear up any ambiguity as early as possible, rather than
|> letting it
|> >find it out later (rush hour at a busy airport, suddenly over 256
|> planes are
|> >in the airspace, and some program writes past the end of an array,
|> blasting
|> >away the stack, crashing the ATC computer,) or make some guess ...
|> 
|> A very good reason for runtime type checking.
|> 

   Or proving your algorithms and implementations before putting
them in life-critical situations.

|> >   This is tiring.  If you don't agree with me, that's OK, but I hope you
|> >stay in the research world rather than producing software which people
|> >pay for and expect to work reliably.  But if you're going to claim that
|> >this approach of dynamic typing increases {productivity, quality, 
|> >performance, correctness, robustness}, I'll continue to differ with you.
|> 
|> I must admit that following your line of thought (or attempting to--
|> I'm not sure I succeeded) is tiring for me as well.  I should point
|> out that I am not in the research world, and work very hard to see
|> that what people pay for works well.  And while I disclaim knowledge
|> of what you think dynamic typing is, I do claim that type as a
|> property of DATA as opposed to a property of a NAME does increase
|> productivity, quality, correctness as well as robustness.
|> 
|> In fact, I claim that assigning types to names, independent of the
|> assigned values, is a akin to side-effect driven programming.  I
|> realize that it is necessary at a low level.  I don't see it as a
|> productivity boost.  

   I claim it doesn't.  I also write software for a living that people
depend on running their business.  I've worked with languages
where the programmer has the option of being able
to specify that only certain values are permitted in a given context,
and ones where looser bindings are permitted.  There is no doubt in
my mind that in the environments where the ability to specify the
types of values which can be associated with a given name in a given
scope, the quality of software was higher and the debugging cycle
was shorter than in an environment which had a looser typing scheme
and where I spent a few too many late nights tracking down where
what was eventually an obvious typing error occurred.  The problems
occurred exactly because of the problems I've been outlining, and
they would not/could not have occurred in a language where names
are only permitted to be bound to certain types of values.  The
error would have been reported during compilation with a
hopefully expressive message which would have explained the
difficulty.)

   If you can remember all those details so that when modules
evolve you can feel quite certain that interfaces won't be broken,
I congratulate you.  It's just too bad that you're using an environment
where your skill and expertise are wasted debugging rather than
designing and implementing.

   (I really don't understand the motivation.  Is there some amazing
wonderful power that you get out of dynamic typing?  Other than
being able to ask for bigger hardware and personnel budgets when
you need bigger computers and more people to maintain your
software?  I don't see the costs of static typing being high, and the
benefits are numerous.  I'm trying to see where this claimed
productivity and quality gain came from.  Please enlighten me!)

   Sorry for rambling, it's past my bedtime.

|> 
|> 
|> Raul Rockwell
|> 
------------------------------------------------------------------------
I'm saying this, not Digital.  Don't hold them responsibile for it!

Michael J. Grier                           Digital Equipment Corporation
(508) 496-8417                             grier@leno.dec.com
Littleton, Mass, USA                       Mailstop OGO1-1/R6

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/15/91)

In article <1991Mar14.183323.27020@engage.enet.dec.com> grier@marx.enet.dec.com () writes:
> In article <626@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David
> Gudeman) writes:
> |> In article  <1991Mar13.163629.12630@engage.enet.dec.com>
> |> grier@marx.enet.dec.com writes:
> |> ]   In mathematics, it makes NO sense to talk about applying a
> |> function
> |> ]or operation to a symbol unless the symbol is known to be in the
> |> ]domain of the operation/function.  I.e. writing something like "for
> |> all
> |> ]x, exp(x) is greater than zero" is nonsense.
> |> Nonsense.  Anybody with a clue immediately understands that x is
> |> restricted to values in the domain of exp.
>    Wrong.

No, you're wrong. Suppes, for example, defines functions so that you can
say ``The set of x such that exp(x) is smaller than or equal to zero is
empty.'' You don't have to qualify x as ``x in the reals'' for the
statement to make sense and be perfectly correct.

What this has to do with dynamic typing is beyond me.

---Dan

gudeman@cs.arizona.edu (David Gudeman) (03/15/91)

In article  <1991Mar14.183323.27020@engage.enet.dec.com> grier@marx.enet.dec.com writes:
]In article <626@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David
]Gudeman) writes:
]
]   ...If you consider a proof to have some static notion of
]correctness (which is exactly what they are,) this kind of error is
]absolutely wong in a formal exhibition of the proof.

It depends on your mathematical tradition.  There are lots of people
who don't worry about that stuff (although the current vogue may tend
toward the picky side).

]|> In the first place, the read() only produced ints and floats, and the
]|> return value was used in a context where either was a legal value.  In
]|> the second place, if the read() function were defined to return other
]|> sorts of values and you wrote "x + read()" the problem would be in the
]|> program, not in the definition of "read()".
]
]   Where is this knowledge known?  Your arguments so far have been to
]not explicitly state the type of value accepted by or returned by an
]operator.  And it still doesn't help when "Read()" is extended in the
]future to return strings, cats and dogs.

No, I often explicitely state the types used by operators, I just
don't tell the computer.  The documentation for a function should give
any relevant type restrictions.  Whenever you change the way a fuction
behaves, you (in general) have to check all the places it is used to
make sure the change is OK.  All static type checking provides is a
compile-time warning if (1) you have missed any uses _and_ and (2) the
function change was of the rare sort that changes the types.  It
doesn't seem very significant for that.

]|> Then you are going to need types "positive int" and "positive float"...
]
]   That's silly.  I'm arguing this in the light of subtyping, so you could
]have some sort of abstract subtype called "Algebraic" or "Number" where
]it has the "usual" operations defined.

The point is the essential arbitrariness of type declarations.  There
is no good verification-based reason to distinguish between ints and
floats and not distinguish between natural and integer, or between
non-zero and with-zero.

]   ReadNumber() would return a "Number" to permit at least type-safety.
]This is a case where the compiler would not have a choice but defer the
]binding of the operation invocation to actual code/methods until run-time,
]but it's still working in a statically typed type-safe environment.

No, the environment is type-safe, but it isn't statically typed.
There are now expressions whose types can't be uniquely determined at
compile time.  Dynamically typed languages are type safe, they just
have less of the type checking done at compile time.

]   She doesn't.  That's why I argue for subtyping/inheritance/genericity/etc.
]
]   The language is the framework and syntax to work in.  It should be
]minimal and formal.

Thank you for brightening my evening.  I got a good chuckle out of
that.  Asking for subtyping, inheritence, genericity and etc. in a
"minimal" language.  What a card.

]   Actually, formal proof is the only known way to ensure any measure of
]static correctness.

Please stop.  You're killing me.  Where did you get this bizarre sense
of humor?

]   Once you've built up a large type library, the chances that any one
]person has a detailed gestalt of the whole thing is very unlikely.  Therefore,
]unless you explicitly restrict the type of a value you receive, you can only
]assume that it's any type.

(Whew.  Back to serious stuff.)  You generally don't build up large
type libraries like that with dynamically typed languages.  There are
a few generic aggregate types that serve for most data structuring
purposes.  You don't have to implement stacks, hash tables,
concatenatable arrays or other things.

]   Again, one more time, with feeling, my concern is with large software
]systems.

Again, I don't believe that required static typing is in any sense, in
any software system, more reliable than dynamic typing.

]... I'm responding largely to the claims that
]some testing by (a) people who know the system or (b) people who are
]likely to only push the buttons that the (a) folks told them to are going to
]ensure anything.

Absolute correctness no.  But the level of security you get from
testing is so far above any you might get from static typing, that the
static typing is insignificant.

]   Yup, and it's going to cost when we have to maintain those
]systems for the next 5-20 years.

On the contrary, that's where the advantages of dynamically typed
languages really shine.  The programs are much more maintainable and
"evolvable".
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

cs450a03@uc780.umd.edu (03/15/91)

>In article <14MAR91.22372006@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
>   I don't see what point you're trying to make.  If these aren't
>take a look at most any object-oriented system designed in the last
>decade or so with strong typing and parametrized types.  

The points I'm trying to make are that strong typing is a good thing,
that dynamic type checking can be a good thing, and that two functions
with different domains may have a subdomain in common.  I'm claiming
that a function's domain is the basis for typing.  I'm claiming that
there is no need for me to declare the type of result of function F if
the result of function F is also the result of some other
'sub-function' G which has results of some known type (e.g. aggregates
of 0 or more non-negative integers with a maximum value somewhere
below a billion).  I'm claiming that there is no need for me to
declare that argument X to function F must have type K, if argument X
is also an argument to function H which only accepts type K.  

I see no reason this couldn't be put into a compiler.  Well, actually,
I use a compiler which does this sort of thing.  I rarely use it,
unless I find code that's eating up cpu doing type checking/bounds
checking.  Also, there are cases where compiled code has worse
performance than interpreted code (the interpreter has had some recent
changes, since the release of the compiler).

Incidentally, I have no objection to allowing declarations (e.g.
restrict function F's argument X to single values, no aggregates).  I
prefer those declarations to be used sensibly, though.  For example,
if F is applied to datum Y, which has N single values, apply F n times
and return an aggregate of n results).  I do not see this as being in
any way ambiguous.

>   (In direct answer, picking a syntax, perhaps "contains(y, x)",
>"select_first_n(seq, 5)", "Object", huh?  let's keep pointers out of this..,
>assuming that the Table type is parametrized ala "Table[KeyType,
>ValueType]", it's the type specified for KeyType when the particular
>Table value instance was created.)

Why keep pointers out of it?  Why must I keep types independent of the
typed information?  You are going to waste my time issuing
bean-certificates if you force me into that kind of situation.

>|> Why should I limit assignment (name-association) based on the domain
>|> of, for instance, addition?
> 
>   I don't know, why should you?  I think you're trying to twist what I'm
>saying somehow but I don't see your point.  Perhaps it's your terminology.
>I'll spell it out.
> 
>   If in some scope you have a variable, let's call it X, which is
>declared to be of type "Number" (presuming all the types with
>algebraic-type operations fall under there for the sake of argument,)
>the compiler should ensure that any value I attempt to bind
>X to is compatible with the type of X.

Ok, let me restate myself.  If X is going to be applied to function Y
which only accepts numbers as arguments, then consider X to be
declared type Y.  If you are in an interpretive environment
(reasonable if you are adding functionality on a day-to-day basis), it
is more efficient to wait till X - Y closure to check type.  In a
compiler, you do not gain anything by REQUIRING that I declare that X
can only be a number.

>   I can think of a better reason not to have runtime type checking for
>log: if I'm doing it quite a bit, the runtime costs are high.  Wasn't that
>part of the whole original premise or why David Gudeman made his
>original postings?

They might be high, especially if you aren't working with aggragates.
That's what compilers are for.  That is, to eliminate redudant
operations where possible, including redundant type checks.

>   I believe that Ada's model here is correct.  No correct program may
>depend on exceptions for its operation.  (Implying that if you have
>a correct program, you can turn off all the nice bounds-testing and
>such and end up with a nice and *fast* program.  Commercial Ada
>compilers from both Rational and Digital do this.)

I'd hate to have to do development work in Ada.  I wouldn't mind
having Ada available to RE-write time critical sections of code.

Me:
>|> >|> Finally, to Mr. Grier, who posed the rhetorical problem about trusting
>|> >|> software in critical systems which might have latent bugs:  would you
>|> >|> really trust such software if it had never been tested?  Would it make
>|> >|> you feel safer if each type of data required seperate chunks of code,
>|> >|> with the associated tests and branches and variant storage mechanisms?

Grier:
>|> >   Yes and no.  No first.
>|> > 
>|> > "No", because that's why I think that subtyping and inheritance
>|> >is so wonderful.  If there's an obvious way to specialize an
>|> >operation to a subtype, it *is* clearer and still absolutely
>|> >statically correct to apply the supertype's operation to a value
>|> >of the subtype.

Me:
>|> No this makes you feel safer?  Or no it doesn't make you feel safer?

Grier:
>   Beings I discusses my thoughts on both counts, does it matter other
>than you wanting to pick nits?

My apologies for being sarcastic.  I keep forgetting how time and
distance affect meaning.  What I meant was, I asked a simple yes/no
question (ok, somewhat rhetorical) and the first few paragraphs of
answer did not address what I thought was the key issue.  This apology
also good for my remarks after the "Yes" half of the answer.

..
>|> And how is a compiler supposed to prove that a program fulfills its
>|> purpose?  At best, the compiler can prove that the program can be
>|> compiled. 
>|> 
> 
>   Re-read my previous posting today.  (well, yesterday at this time)

I did not see anything that answers this question.

Raul Rockwell

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/15/91)

In article <651@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> The point is the essential arbitrariness of type declarations.  There
> is no good verification-based reason to distinguish between ints and
> floats and not distinguish between natural and integer, or between
> non-zero and with-zero.

So? If you want to distinguish between the naturals with and without
zero, use Ada.

> ]   Once you've built up a large type library, the chances that any one
> ]person has a detailed gestalt of the whole thing is very unlikely.  Therefore,
> ]unless you explicitly restrict the type of a value you receive, you can only
> ]assume that it's any type.
> (Whew.  Back to serious stuff.)  You generally don't build up large
> type libraries like that with dynamically typed languages.  There are
> a few generic aggregate types that serve for most data structuring
> purposes.  You don't have to implement stacks, hash tables,
> concatenatable arrays or other things.

Unfortunately, the programming world is full of naturally complex data
structures. You can point to at least half of the typically fifty-odd
elements of the UNIX inode structure, for example, and give good reasons
why those elements must be in the structure, even though they require
all sorts of special operations. Even simple data structures like
Patricia require dozens of operations for a complete implementation.

> ]   Again, one more time, with feeling, my concern is with large software
> ]systems.
> Again, I don't believe that required static typing is in any sense, in
> any software system, more reliable than dynamic typing.

Fer cryin' out loud, who cares? I don't think any language designer will
take pains to *stop* you from using dynamic typing; if enough people
agree with you that dynamic typing is useful, then you can distribute
your library and be done with it. Nobody's stopping you from writing
your software with dynamic typing in practically any language. Just stop
pestering the people who prefer static typing in the same languages.

> On the contrary, that's where the advantages of dynamically typed
> languages really shine.  The programs are much more maintainable and
> "evolvable".

This smacks of religion. Have you compared dynamically typed and
statically typed programs solving similar problems? Have you observed
the programs over their useful life and seen how much programmer effort
went into them?

---Dan

sfk@otter.hpl.hp.com (Steve Knight) (03/15/91)

gudeman@cs.arizona.edu (David Gudeman) writes:
> I thought you had to declare the types of functions in ML...

Sort of.  ML is a very interesting strongly typed language in which the
types of expressions can often be inferred without type-annotations.

Scott Schwartz writes:
> % sml
> Standard ML of New Jersey, Version 0.66, 15 September 1990
> - fun foo x = x + 1;
> val foo = fn : int -> int
> - fun bar x = x + 1.0;
> val bar = fn : real -> real

Despite what follows, I commend ML to anyone who is interested in the 
development of strongly typed languages.  But enough of the Mr.Nice stuff
and onto the language bashing you're all waiting for.

{ .. DISGRUNTLED ML PROGRAMMER ALERT ... WARNING ... <FLASH!> ... WARNING ... }

- foo x = x + 1;
val foo = fn : int -> int

Ah yes.  True enough.  Good job the example wasn't this, though ....

- fun foo x = x + x;
Error: overloaded variable "+" cannot be resolved

It was equally fortunate that this example wasn't used either ....

- datatype Foo = Bool of bool | Int of int | Fn of int -> int;
- fun foo (x as Fn _, y as Fn _) = false
= |   foo ( x, y ) = x = y;
Error: rules don't agree (equality type required)

You see it's vitally important that this function is never executed.  Because
ML doesn't allow comparison of functions, which is arguable, it cannot allow
the comparison of data-types which include functions.  Even when the programmer
has specifically checked against that possibility .....

Another, obviously wicked, faulty program that the type-checker fortunately
throws out on its ear before I do myself any damage is ...

- val foo = [false, 0];
Error: operator and operand don't agree (tycon mismatch)

Notice the very dangerous consequences of using a list containing a boolean
and an integer?  This is just one of the many cases where the type system of
ML is too conservative, in order to provide the user with type inference.

- datatype IntOrBool = Int of int | Bool of bool;
- val foo = [Bool false, Int 0];

Just look how ML doesn't need type declarations!  I hope the intelligent reader
is beginning to get an idea how this splendid type inference is achieved.

Polymorphism is very nice, too.  I thought I'd construct an updateable 
reference to the empty list.  Whoops!  (ML is primarily a functional 
programming language.  The interaction between imperative data structures
and polymorphism isn't all one might desire.)

- val x = ref [];
Error: nongeneric weak type variable
  x : '0S list ref

Gosh.  A jolly good job I was protected there.  That could have led to a 
terrible run-time error.

The other elegant aspect of the ML type inference system is its ability to
resolve the most general type of an equation.  In this function "f", the first
parameter is never used.  So the type inference system deduces that it can be
any value it likes ... <clank!>

- fun f x y = if y = 0 then 0 else f f (y-1);
Error: operator is not a function

As you can see, modern programming languages that support polymorphic
type-checking and type-inference give you all the security and performance
enhancements of a strongly typed language and all the flexibility and
convenience of a dynamically typed language.

Yes you certainly get excellent performance from ML.  It is not possible to
write a polymorphic hash-function.  So you can't write polymorphic hash-tables.
So you can't write reusable-code for hash-tables.  So people use linked lists.
Good job it goes fast.  (Actually, this is more than a little unfair.  You
only have to write out hash-functions for each type used in a hash-table.
This isn't *really* acceptable but you can sort of live with it.  Also some
folks, like us, use binary trees rather than linked lists, which makes the
comparison less ludicrous.)

{ END OF MAD RAVINGS ... }

My belief is that, to date, there's been no programming languages that 
gives the obvious benefits of strong typing without some significant penalty.
The penalty is typically both superfluous type-declarations in the program
and the elimination of valuable programming idioms.  (Furthermore, the issue
of performance is greatly complicated by the elimination of these idioms.)

Unsurprisingly, I am inclined to believe that strong Vs dynamic
typing is an unresolved debate.  In practical terms, it is still horses for
courses.  When writing low-level code, strong typing is (in my view)
invaluable.  When writing in "very" high-level languages, such as Lisp or
ML, the penalties are less tolerable (in my view) and, on balance, I prefer
dynamically typing.

One of the under-explored regions of this topic is that of heuristic type-
checking for dynamically typed languages.  My belief is that this hybrid
approach can be made effective enough to be useful.  I know the Scheme folks
have made progress in this area but I've not kept up to date on it.

Steve

wright@datura.rice.edu (Andrew Wright) (03/15/91)

I would like to see a concise summary of the claimed advantages of
dynamic typing.  The claimed advantages of static typing are numerous
and often argued (verification, efficiency, documentation, ...) but
the advantages of dynamic typing are less often discussed.  To start:

 ***  expressiveness (probably the single largest advantage)
 *    terseness (in that declaration of types is not required).
      This is a mild point given the advances in type inference,
      as witnessed by languages like ML.

Andrew Wright
Rice University

peter@ficc.ferranti.com (Peter da Silva) (03/16/91)

In article <24547:Mar1506:28:2591@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> No, you're wrong. Suppes, for example, defines functions so that you can
> say ``The set of x such that exp(x) is smaller than or equal to zero is
> empty.'' You don't have to qualify x as ``x in the reals'' for the
> statement to make sense and be perfectly correct.

That's because "real exp(real);" is programmed into everyone's head. If
you have "complex y;" then "y <= 0;" is a syntax error.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you debugged your wolf today?"

gudeman@cs.arizona.edu (David Gudeman) (03/16/91)

In article  <26221:Mar1510:17:4991@kramden.acf.nyu.edu> Dan Bernstein writes:
]In article <651@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]> The point is the essential arbitrariness of type declarations.  There
]> is no good verification-based reason to ...
]
]So? If you want to distinguish between the naturals with and without
]zero, use Ada.

The point is the essential arbitrariness of type declarations.  That
means the point is not (1) that I want to distinguish between the
naturals with and without reals, and not (2) that I want to eliminate
type declarations alltogether.  Just that arguments that type
declarations are necessary for security are problematic if for no
other reason than because they are taking a system designed for A and
claiming that it's purpose is B (where A is efficiency and B is
security).

As I said before, I believe this is a historical accident.  Weak
typing actually does cause security problems, and when people started
making static typing stronger for program security, they started to
believe that the typing itself gives program security.  This is not
the case.  It was necessary to make static type checking stronger to
overcome the problems that static type checking caused in the first
place -- lack of type security.  Dynamically typed languages never had
this problem.

]> ...  You generally don't build up large
]> type libraries like that with dynamically typed languages.

]Unfortunately, the programming world is full of naturally complex data
]structures.

I did say "generally".  However the point is well taken that built-in
aggregate types can't handle everything, and that you still need
ADT's.  So let me answer the original question in another way:

]> ]   Once you've built up a large type library, the chances that any one
]> ]person has a detailed gestalt of the whole thing is very unlikely.
]> ]Therefore, unless you explicitly restrict the type of a value you
]> ]receive, you can only assume that it's any type.

Yes, that is correct.  You generally have to assume that the inputs to
a function can be any type.  That means that you have to write
explicit type checks in some circumstances.  I will even admit that
writing these explicit checks generally requires more time than
declarations would, since you have to explicitly do something if the
type is wrong (usually exit with an error message).  However, it seems
to me that the places where this is necessary are rare since most
operations on ADT's seem to be generic (on arguments other than the
aggregate itself).  Also, these explicit type checks generally only
have to be written once and never changed, unlike static declarations
that may have to be changed when you upgrade the program.

]> ]   Again, one more time, with feeling, my concern is with large software
]> ]systems.
]> Again, I don't believe that required static typing is in any sense, in
]> any software system, more reliable than dynamic typing.
]
]Fer cryin' out loud, who cares?

That's what this whole argument is about.

]... Just stop
]pestering the people who prefer static typing in the same languages.

Calm down, Dan.  I've already said it 4 or 5 times, but here it is
once more: I am not arguing against type declarations.  I am arguing
against _required_ type declarations and against the argument that
static type checking increases program security.  I am not arguing
that static typing has no uses at all, in fact I have said several
times that it is important for efficiency.  I've also mentioned once
or twice that as an option it can make debugging easier.  The only
people who should feel pestered by me are people who think that static
typing is necessary for program robustness and who think that
programmers should be forced by the language to do things in a certain
way.

]> On the contrary, that's where the advantages of dynamically typed
]> languages really shine.  The programs are much more maintainable and
]> "evolvable".
]
]This smacks of religion. Have you compared dynamically typed and
]statically typed programs solving similar problems? Have you observed
]the programs over their useful life and seen how much programmer effort
]went into them?

I haven't done anything resembling an experiment, no.  My opinions are
based on experience and observation (of several large systems).

These things seem rather self evident if you accept the premise that
the amount of work needed to maintain and upgrade a program is roughly
related to the size of the program and to the number of places where a
given piece of information has to be duplicated.  Programs in
dynamically typed languages are generally half to a tenth the size of
programs in statically typed languages that do the same thing.  Also,
many statically typed languages require duplicate type information in
two different places (a declaration and a definition), but dynamically
typed languages have no similar maintenance problem.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/16/91)

In article  <1991Mar15.062756.3781@engage.enet.dec.com> grier@marx.enet.dec.com writes:
]
]   I believe that Ada's model here is correct.  No correct program may
]depend on exceptions for its operation.  (Implying that if you have
]a correct program, you can turn off all the nice bounds-testing and
]such and end up with a nice and *fast* program.  Commercial Ada
]compilers from both Rational and Digital do this.)

Good grief.  You don't think dynamic typing is safe enough but you are
willing to turn off bounds checking?  That is going to make your
programs a _lot_ more dangerous than any program in a dynamically
typed language.

]...  Loose induction and
]random sampling don't make for proofs.

No, in the realm of software they give you much better reliability
than proofs.  I didn't want to get into another topic, but this just
too much.  Vendor A offers a product that has never been tested but
that has an computer-verified proof of correctness.  Vendor B offers
an identical product that has never been proved correct but that has
been through the normal testing process.  Which one do you want?  Any
sane person would pick the product from vendor B.

Incidentally, there is no reason why you can't have computer verified
proofs of correctness combined with dynamic typing.  Of course, just
as for statically typed languages, it wouldn't be worth the effort to
write the proofs for any but a tiny number of programs.

]   Once again, this is a problem I see in large systems evolving over time.
]My motivation isn't so much to prove correctness as to ensure that changes
]and growth don't invalidate other algorithms' implementations.  My
]extending "Read()" to return baby names in addition to numbers broke
]David Gudeman's program which assumes they're numbers, unfortunately.

If you change read() in such a way as to invalidate its original
documentation then you should have gone through all code that uses the
fuction and made sure they were still correct.  If you didn't do so,
then you did something stupid, and there has never yet been a
programming language made that can prevent bugs due to stupidity.  And
after you changed the function you should have tested the system
again, and that should have found the problem.  And if you didn't find
the problem that way (then your testing was slipshod, but also...),
then the worst that could happen during the actual use of the software
is for the system to detect the error and report it, failing to do
what was expected -- a minor problem for the vast majority of
applications.  A lot worse can happen if you turn off array bounds
checking, and nobody seems overly concerned about that.

]   I want the system to be able to recognize these problems and prevent
]them from being made into a running system.

Yeah, I want the system to recognize all my errors at compile time.
I'm just not willing to give up all the expressiveness I'd have to
give up to make that possible.

](a) we don't do static
]checking because we don't want those blasted type error or syntax
]error messages anyways, (b) there are cases where it is impossible
]for the compiler to statically type-check the program.
]
]   In the (a) case, well, damn the torpedoes, full speed ahead, and I'll
]just make sure I don't fly your airline or do banking with your bank.

The banking example is a non-problem.  I don't believe its possible to
get an accounting error due to lack of static type checking.  If you
get a type error the worst that can happen is for the program to
abort.  As to the airplanes, a program abort might be a little more
inconvient, but the program should be written in such a way that it is
protected from that even if type errors occur.

]   In the (b) case now either you've lost type-safety (which I claim is
]good and worthy of our respect,) or you have to start inserting clues
]about types of expressions back into your program.  Oh, but wait,
]that's sooo difficult, we can't do that.  We'll just let it run... (tick,
]tick, tick...)

Dynamically typed languages are type-safe.  And how many times do I
have to say that it isn't simply the writing of declarations that I
object to?

]There is no doubt in
]my mind that in the environments where the ability to specify the
]types of values which can be associated with a given name in a given
]scope, the quality of software was higher and the debugging cycle
]was shorter than in an environment which had a looser typing scheme

So who is arguing that?  It isn't the _abilty_ to specify types that
is the problem, it is the _requirement_ of specifying types.  A
requirement that leads to either (1) weak typing -- I think you will
agree that that is a problem, or (2) the further requirement of
excessively detailed declarations and excessive interdependency of
modules -- which I claim are more problems.  With optional type
declarations and the ability of the language to insert runtime type
checks you get better reliability.

]and where I spent a few too many late nights tracking down where
]what was eventually an obvious typing error occurred.

There are a few dynamically typed languages with inadequate error
messages or debugging facilities.  That is a problem with the
language (or implementation), not the concept of dynamic typing.

]   If you can remember all those details so that when modules
]evolve you can feel quite certain that interfaces won't be broken,
]I congratulate you.  It's just too bad that you're using an environment
]where your skill and expertise are wasted debugging rather than
]designing and implementing.

I spend a lot more time debugging when I program in C than when I
program in dynamically typed languages.

]   (I really don't understand the motivation.  Is there some amazing
]wonderful power that you get out of dynamic typing?
]  I don't see the costs of static typing being high, and the
]benefits are numerous.

That's because you you don't see the costs at all.  You are so used to
it that you think the problems caused by static typing are fundamental
to programming, and don't see the true cause.

]  I'm trying to see where this claimed
]productivity and quality gain came from.  Please enlighten me!)

Mostly the fact that you have to write a lot less code to get the same
functionality.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

cs450a03@uc780.umd.edu (03/16/91)

Steve Knight writes:
 ... (ML examples ommitted) ...
>My belief is that, to date, there's been no programming languages that 
>gives the obvious benefits of strong typing without some significant penalty.
>The penalty is typically both superfluous type-declarations in the program
>and the elimination of valuable programming idioms.  (Furthermore, the issue
>of performance is greatly complicated by the elimination of these idioms.)
> 
>Unsurprisingly, I am inclined to believe that strong Vs dynamic
>typing is an unresolved debate.  In practical terms, it is still horses for
>courses.  When writing low-level code, strong typing is (in my view)
>invaluable.  When writing in "very" high-level languages, such as Lisp or
>ML, the penalties are less tolerable (in my view) and, on balance, I prefer
>dynamically typing.

Well, yess.. 

Since you didn't define what you meant by Strong and Dynamic typing,
and since I don't think those definitions conflict in the way they
have been used in this subject thread, well... hmm...

If you say strong typing is what you presented in the ML examples,
what would you say dynamic typing is?

Also, I believe that some of the 'limitations' of ML are tolerable.
For example, I don't compare functions, I compare their character
representations (not that I need much of that, except for library
maintainence).  Also, I almost always live with homogenous data
structures (all elements of the same type)--there are a lot of
performance advantages to this (e.g. 10000 elements need only one type
header, which only need be checked once per pass over the data).

Finally, though this is somewhat of a self-imposed limitation, I 
don't use a lot of the facilities that are available, unless
absolutely necessary (in other words, since the primitives will
already deal nicely with those 10000 numbers, or whatever, I don't
bother with a loop which would have to (a) extract each element, and
build a type/header for it, and (b) have the overhead of calling
primitives 10000 times).

Of course, I don't use ML, so I don't know how applicable these
comments are to programs written in ML.  Your mileage may vary.  Void
where taxed or prohibitted.  Prices may be higher west of the
Mississippi.  Batteries not included.  NO WARRANTEE (This information
provided without warra...

Raul Rockwell

sommar@enea.se (Erland Sommarskog) (03/17/91)

Also sprach David Gudeman (gudeman@cs.arizona.edu):
>All dynamically typed language I know have this feature.  In fact for
>most of these languages there are no programs that have undefined
>behavior,  I don't know whether any statically typed language can
>make that claim.

So when the types don't match, you crash. Sure, this is well-
defined, but in many application it is just as unacceptable
as undefined behaviour.

If variables never can be uninitiated in your dynamicly typed 
language, I guess their behaviour can be well-defined.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

campbell@redsox.bsw.com (Larry Campbell) (03/17/91)

In article <25381:Mar1221:07:3891@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
-In article <609@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
-> Why should a programming language be expected to catch this type of
-> error at compile time, and not some other class of errors?  Why are
-> type errors so special?

Given the difficulty of completely testing all possible functions in any
large software product, the difference between catching the error at compile
time and catching it at runtime is the difference between YOU discovering the
error during development and the CUSTOMER discovering the error in the field.
-- 
Larry Campbell             The Boston Software Works, Inc., 120 Fulton Street
campbell@redsox.bsw.com    Boston, Massachusetts 02109 (USA)

jpiitulainen@cc.helsinki.fi (03/18/91)

In article <2400034@otter.hpl.hp.com>, sfk@otter.hpl.hp.com (Steve Knight)
writes:
> One of the under-explored regions of this topic is that of heuristic type-
> checking for dynamically typed languages.  My belief is that this hybrid
> approach can be made effective enough to be useful.  I know the Scheme folks
> have made progress in this area but I've not kept up to date on it.

You might be interested in the following paper:
  Olin Shivers, "Data-Flow Analysis and Type Recovery in Scheme",
  March 30, 1990, CMU-CS-90-115, to appear in Peter Lee (ed.),
  _Topics in Advanced Language Implementation_, MIT Press

cs450a03@uc780.umd.edu (03/18/91)

Dr. Walker writes:
>In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu
>(David Gudeman) writes:
>>				   Mathematical notation is generally
>>closer to dynamically typed languages than to statically typed
>>languages.
> 
>	& it was invented and had evolved long before the whole modern
>problem of trying to describe problems and algorithms in a precise way
>for computer consumption.  Such examples as we have of attempts at
>precision in maths [eg, by Russell] are not very convincing.  We write
>sloppy maths because "we all know what we mean" -- I don't think that's
>a very good model for computer languages.

As one of my profs had to continually explain to me (when I made
similar complaints about mathematical notations), it's primarily a
matter of definitions.  

In any event, the number of statements you use to make a declaration
has very little to do with how 'mathematical' you are being.  I've
seen many, many statements of the form 'x element of set y, x takes on
value z.' ...   Quite similar to the sort of things required in a
statically typed language.

On the other hand, just because a mathematical statement doesn't fit
within the confines of some brand C statically typed language is no
reason to call it sloppy.  Mind you, I'm not trying to argue for
Russell's stuff...  I'd hate to try and implement code for it :)

>>With static typing you need a great deal of information at compile
>>time to be able to guarantee strong typing.  This has two
>>consequences: (1) you have to limit the forms of expressions to some
>>set for which you know a type-checking decision procedure, and (2) you
>>have to acquire type information somewhere.
> 
>	Ie, (1) you have to know what your expression is intended to do,
>and (2) you have to use variables in a disciplined way.  I don't find
>these "consequences" either irksome or undesirable.  When I write programs
>in dynamically typed languages, I try hard to follow the same precepts.

Indeed, but I find generalizing quite difficult in the statically
typed languages I've run across.  There is, of course, C's approach
(declare a prototype array, alloc some memory, declare it to be of
type 'prototype array', and ignore the phony array boundaries), but
that's just too much busy work for me.  Then there's the wonderful
idea of allocating a gob of memory and hoping it's enough.  Then
there's the technique of re-compiling the program for each instance of
the problem (a real speed demon if you need to run several hundred
thousand cases of each 'instance').

>	There is a place for dynamic typing (I enjoy writing shell
>scripts!), and a case for rapid prototyping, but there is also a
>case for traditional declarations;  there is no need for either
>"camp" to knock the other.

Quite true.

However, it's a bit ironic that things as inefficient as shell scripts
(with the attendant overhead of massive file operations and forks and
so on) go hand in hand with statically typed languages (with claims of
awesome efficiencies).  It's even more ironic when a program written
in a statically typed language will rely on heavy file manipulation
because they can not deal with memory with enough abstraction.

I'm also rather amused (if only it weren't so painful) by the amount
of memory required by statically linked programs.  And by things like
fixed sized buffers, with system calls that haven't a clue as to that
size.  Ah... the wonderous maintainability of statically typed code...

;)  *sigh* :(

Raul Rockwell

brm@neon.Stanford.EDU (Brian R. Murphy) (03/18/91)

In article <1991Mar17.161210.5574@cc.helsinki.fi> jpiitulainen@cc.helsinki.fi writes:
>In article <2400034@otter.hpl.hp.com>, sfk@otter.hpl.hp.com (Steve Knight)
>writes:
>> One of the under-explored regions of this topic is that of heuristic type-
>> checking for dynamically typed languages.  My belief is that this hybrid
>> approach can be made effective enough to be useful.  I know the Scheme folks
>> have made progress in this area but I've not kept up to date on it.
>
>You might be interested in the following paper:
>  Olin Shivers, "Data-Flow Analysis and Type Recovery in Scheme",
>  March 30, 1990, CMU-CS-90-115, to appear in Peter Lee (ed.),
>  _Topics in Advanced Language Implementation_, MIT Press

You might also be interested in:
  A. Aiken and B. Murphy, "Static Type Inference in a Dynamically Typed
  Language", in {\em Proceedings of the Seventeenth Annual ACM Symposium
  on the Principles of Programming Languages}, Orlando, 1991, pp. 279-290.

While this is specifically about type inference for the functional
language FL (successor of FP), it's also easily applicable to a
functional subset of Lisp, and, with a bit more effort, to a
non-functional subset (I think---haven't actually done it).  It hinges
a lot on our representation of types, to be described in an upcoming
paper (also described somewhat less completely & intelligibly in my
1990 MIT MS thesis, "A Type Inference System for FL").

					-Brian

gudeman@cs.arizona.edu (David Gudeman) (03/18/91)

In article  <1991Mar17.131413.13312@redsox.bsw.com> Larry Campbell writes:
]
]Given the difficulty of completely testing all possible functions in any
]large software product, the difference between catching the error at compile
]time and catching it at runtime is the difference between YOU discovering the
]error during development and the CUSTOMER discovering the error in the field.

Millions of lines worth of code written in dynamically typed languages
is in regular use throughout the word, and no general problems with
robustness have been noticed.  Until there is some evidence strong
enough to counter this wealth of experience, I wish people would stop
making unsupported statements like the one above.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/18/91)

In article  <-B0A9_3@xds13.ferranti.com> Peter da Silva writes:
]
]That's because "real exp(real);" is programmed into everyone's head. If
]you have "complex y;" then "y <= 0;" is a syntax error.

That's the point!  If you _don't_ write "complex y", and somewhere you
have "y <= 0", then you know that y is not complex.  There is no
pressing reason why you should have to write "real y" in most cases.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/18/91)

In article <693@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> Programs in
> dynamically typed languages are generally half to a tenth the size of
> programs in statically typed languages that do the same thing.

I don't believe you. Give an example.

---Dan

gudeman@cs.arizona.edu (David Gudeman) (03/18/91)

In article  <1991Mar14.151707.11686@maths.nott.ac.uk> Dr A. N. Walker writes:
]In article <602@optima.cs.arizona.edu> gudeman@cs.arizona.edu
](David Gudeman) writes:
]
]We write
]sloppy maths because "we all know what we mean" -- I don't think that's
]a very good model for computer languages.

I object to the term "sloppy".  We often don't give unnecessary and
redundant information because it isn't needed.  I think it's an
excellent model for computer languages.

]>						   Static typing
]>originated, as near as I can determine, with low-level languages like
]>Fortran an Algol that were little more than glorified assemblers.
]
]	[I assume that Algol 60, rather than older or modern versions,
]is meant.]  This statement is just historically ignorant.

No, I was refering to the earliest Algol tradition.

]  Fortran did not, for the most part,
]have declarations at all;

I said "static typing", not "type declarations".  Early FORTRAN had
static typing but some of the declarations were implicit.

]	Well this is a matter of semantics [:-)].  The C fragment
]"int a[10]; a[23] = 17;" might, in many implementations, do something
]arbitrary to memory, but in my opinion it contains an error.

You are just arguing for array bounds checking -- arguably a form of
dynamic typing.

]  Making
]the behaviour undefined is [perhaps wrong-headedly] a convenience
]for the compiler writer.

Not just a convenience.  It makes it _possible_ to generate faster
code, not just easier.

]  Would C become more strongly typed if the
]behaviour became defined in some way?

Yes.

]>With static typing you need a great deal of information at compile
]>time to be able to guarantee strong typing.  This has two
]>consequences: (1) you have to limit the forms of expressions to some
]>set for which you know a type-checking decision procedure, and (2) you
]>have to acquire type information somewhere.

]	Ie, (1) you have to know what your expression is intended to do,
]and (2) you have to use variables in a disciplined way.

Your (1) is meaningless.  Of course you have to know what an
expression is intended to do in any language.  This has nothing to do
with when type checking is done.  Your (2) is a subjective judgement
that can mean anything.  Static typing certainly does not enforce any
sort of discipline in the way you use variables unless by "discipline"
you mean "restricted to a single language type".  Then (2) becomes a
tautology: "static type enforcement forces you to use static types."
There is no inherent advantage (other than efficiency) to following
the "discipline" of restricting all your variables to a single
language type.

]  I don't find
]these "consequences" either irksome or undesirable.  When I write programs
]in dynamically typed languages, I try hard to follow the same precepts.

I don't find your "consequences" undesireable either (using my own
definition of "disciplined").  I find the unrelated two consequence I
listed to be extremely undesirable.

]	Type *consistency* is indeed the compiler's job, but I don't
]find it unreasonable that I should document my identifiers.

I find it unreasonable that I should be forced by some language
designer who has no idea what I'm trying to do, to document my
variables in a specific way which may be completely worthless in the
specific task at hand.  And not just identifiers.  In a strongly
typed, statically typed language you have to declare the recursively
complete type of every data location reachable in the program.

]	There is a place for dynamic typing (I enjoy writing shell
]scripts!), and a case for rapid prototyping, but there is also a
]case for traditional declarations;  there is no need for either
]"camp" to knock the other.

In the first place, dynamic typing is useful for full-scale working
programs, not just prototypes.  In the second, I'm not knocking the
usefulness of _optional_ type declarations.  I'm trying to point out
that there better alternatives to enforced static typing, and that
dynamic typing is not "dangerous".  There is absolutely no evidence
that dynamic typing is in any way unsafe, and considerable evidence to
the contrary.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/18/91)

In article <-B0A9_3@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
> In article <24547:Mar1506:28:2591@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> > No, you're wrong. Suppes, for example, defines functions so that you can
> > say ``The set of x such that exp(x) is smaller than or equal to zero is
> > empty.'' You don't have to qualify x as ``x in the reals'' for the
> > statement to make sense and be perfectly correct.
> That's because "real exp(real);" is programmed into everyone's head. If
> you have "complex y;" then "y <= 0;" is a syntax error.

No, it isn't, at least not in the works of Suppes (and Tarski), but I
was referring to the function exp on the reals.

---Dan

cs450a03@uc780.umd.edu (03/18/91)

Erland Sommarskog writes:
>So when the types don't match, you crash. Sure, this is well-
>defined, but in many application it is just as unacceptable
>as undefined behaviour.

Or you backtrack to some pre-defined point, with a warning, and
continue from there.  Not ideal, but hopefully saving significant work
on the part of the user.  Or if it's a known ill-conditioned case,
maybe you can recover in some other way.

Or, your type int is promoted to type float and you have a little
floating point over-head.

Or, if you really want, you crash... but in an interactive environment
where you have the option of examining the situation right there (with
files still opened and in the same state, for instance), possibly
hand-patching the data and nursing it along to completion, possibly
re-writing a function, popping back a few levels and re-starting, or
possibly you dump core (though it's hardly necessary to call the file
'core').  Not, I'll admit, a requirement in a dynamically typed
language (to support all these options), but much less challenging to
implement than in a statically typed language.  (Dynamic typing makes
dynamic linking somewhat easier, I believe, and in any event, both
seem easy in an 'interpretive' environment.)

>If variables never can be uninitiated in your dynamicly typed
>language, I guess their behaviour can be well-defined.

yep.  If you try and use it with no value, it'll probably (depending
on your language, of course) be a run-time error, with all that
implies (see above).  On the other hand, you can probably also test
it, to see if it has a value...  And there are always code sequences
that guarantee that the variable has a value.

**********************************************************************

Repeat of one of Gudeman's arguments:

When using run-time type checking, practically all type errors are
caught in early stages of testing.

Corollary I:

The software must be tested.

Corollary II:

If the code is ill-conditioned, such that it is likely that invalid
arguments will not show up during testing, the code is poorly written.

(This last is not intended to be proved, but is more on the lines of a
partial definition.)

**********************************************************************

'nuff said?

Raul Rockwell

peter@ficc.ferranti.com (Peter da Silva) (03/19/91)

In article <17MAR91.21285518@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
> I'm also rather amused (if only it weren't so painful) by the amount
> of memory required by statically linked programs.

Smoke and mirrors time... notice how "statically typed" has become
"statically linked". The two concepts are orthogonal.

Besides:

-rwx--x--x   3 bin      bin        29217 Sep 28  1988 /bin/sh
-rw-r--r--   1 root     root      206380 Nov  8 12:02 libXlisp.a
-rwxr-xr-x   1 root     root      175889 Nov  8 12:11 xlisp

And this is a "toy" language!
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

peter@ficc.ferranti.com (Peter da Silva) (03/19/91)

In article <730@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> In article  <-B0A9_3@xds13.ferranti.com> Peter da Silva writes:
> ]That's because "real exp(real);" is programmed into everyone's head. If
> ]you have "complex y;" then "y <= 0;" is a syntax error.

> That's the point!  If you _don't_ write "complex y", and somewhere you
> have "y <= 0", then you know that y is not complex.  There is no
> pressing reason why you should have to write "real y" in most cases.

But if you don't know that y is complex, you don't know if "y <= 0" is
legal or not... working code or a bug. The only reason you can get away
with this in mathematics is everyone has had a bunch of Fortran-style'
default typing rules programmed into their head in college. And even so,
different mathematical languages disagree and you have to introduce people
to these new typing rules...
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (03/19/91)

Peter da Silva writes:
>In article <17MAR91.21285518@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>> I'm also rather amused (if only it weren't so painful) by the amount
>> of memory required by statically linked programs.
> 
>Smoke and mirrors time... notice how "statically typed" has become
>"statically linked". 

Nahh... I was just griping...

Raul

gudeman@cs.arizona.edu (David Gudeman) (03/19/91)

In article  <2837@enea.se> Erland Sommarskog writes:
]
]So when the types don't match, you crash. Sure, this is well-
]defined, but in many application it is just as unacceptable
]as undefined behaviour.

First of all, this "crash" cannot do arbitrary things like overwrite
files.  Second, it may not necessarily crash, there are other things
that can happen, depending on the languages exception-handling
facilities.  Third, there are very few applications where a crash is
as bad as undefined behavior.  Undefined behavior can do all kinds of
nasty things to the system.  For example, referencing past the end of
an array in C causes undefined behavior:

void trouble()
{ int arr[4]
  int delete_flag = 0;

  ...
  arr[4] = 1;  /* undefined behavior -- might set delete_flag to 1 */
  ...
  if (delete_flag) delete_all_files();
  ...
}

This is much worse than what you get with dynamically typed languages.
The only place where an error stop can be this bad is in critical
real-time applications, and such applications should have exception
handling to avoid error stops.

]If variables never can be uninitiated in your dynamicly typed 
]language, I guess their behaviour can be well-defined.

I can't think of any dynamically typed language where the use of an
unitialized variable causes undefine behavior, but there surely are
some.  I can think of several statically typed languages where it
causes undefined behavior.  In any case, undefined variable semantics
is orthogonal to typing.  I have no objection to compilers that issue
errors for undefined variables in any language.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/19/91)

In article  <H-2A982@xds13.ferranti.com> Peter da Silva writes:
]
]Besides:
]
]-rwx--x--x   3 bin      bin        29217 Sep 28  1988 /bin/sh
]-rw-r--r--   1 root     root      206380 Nov  8 12:02 libXlisp.a
]-rwxr-xr-x   1 root     root      175889 Nov  8 12:11 xlisp
]
]And this is a "toy" language!

And this is a bogus comparison.  If you want to compare the sizes of
the tools, you have to add enough tools to give /bin/sh similar
functionality to the "toy" language.  Xlisp is far from a toy, it is
more powerful than C (including C's standard libraries).  It is just a
"toy" compared to other Lisp's.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/19/91)

In article  <L-2A3G2@xds13.ferranti.com> Peter da Silva writes:

]But if you don't know that y is complex, you don't know if "y <= 0" is
]legal or not...

If you don't know the length of A, you don't know whether "A[i]" is
legal or not.  So?  You should know the length of A and you should
know the type of y.  In someone elses code, if you see "y <= 0",
then you can assume that y is not complex.  In you own code, if you
know that y is not complex you can write "y <= 0".  There is no
ambiguity.

]with this in mathematics is everyone has had a bunch of Fortran-style'
]default typing rules programmed into their head in college.

No, the reason you can get away with this in math is that everyone who
has a clue knows that you can't compare magnitudes of complex numbers,
so if you are comparing magnitudes, the numbers must not be compex.
This is a trivial logical inference:

(1) y is complex implies "y <= 0" is not meaningful
(2) "y <= 0" is meaningful
therefore
(3) y is not complex

There is nothing sloppy or ambiguous about this reasoning.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

sfk@otter.hpl.hp.com (Steve Knight) (03/19/91)

In response to a claim (or observation) made by David Gudeman that "dynamically 
typed languages are generally half to a tenth the size of programs in
statically typed languages"  Dan Bernstein reasonably writes back.

] I don't believe you. Give an example.

Of course, this gets us into the foolish realms of anecdotes and virility 
tests.  However, just because a challenge is foolish...  Well, perhaps it is
sufficient to say that this exchange reminded me of an amusing incident a 
couple of years ago between myself and a friend of mine.  (Who shall remain
nameless; unless you're reading this, Tim.)

After a rather boring stint at work, Tim into my office with a broad smile on
his face -- evidently having completed an exceptionally useless task 
successfully.  "Watch this," he said, and proceeded to demonstrate the 
capabilities of a simple program he'd prepared earlier while watching the
C compiler dither over a small pile of files.

This program read a file of syllables and printed out "passwords" composed
from three random syllables.  It was vaguely entertaining in that sort of
Friday afternoon-going-on-evening, let's-out-for-a-pizza-later kind of way.
Tim then showed me the source code.  Now Tim, I should explain, makes no
pretension of being a great C hacker (well, not then) but had composed the
program in a very workman-like way.  However, at several pages of code it 
seemed rather lengthy to me.

I leaned over my workstation and said, "Oh, it would have been a bit easier
in Pop11" and proceeded to <tappity-tappity-tap> for a few moments.  (Pop11 is
the dynamically typed language of this tale, by the way.)  What emerged was the
following few lines ...

    define program();             ;;; build a list of syllables
        lvars syllables = 'syllables'.discin.incharline.pdtolist;
        repeat 3 times
            syllables.oneof.pr    ;;; print a syllable 3 times
        endrepeat;
        nl( 1 );                  ;;; throw 1 new line
    enddefine;
    
I ran it a couple of times.  No problem -- too simple for an error.  Even for
me.  Suddenly, I felt two hands close around my windpipe.  As everything went 
black, I heard, "You <****>, it took me two hours to debug that <******-
******>."

Of course, being choked to death by irate C programmers is only one of the
many hazards of using a dynamically typed language ...

wallace@hpdtczb.HP.COM (David Wallace) (03/20/91)

>> = David Gudeman
> = Dan Bernstein

>> Programs in
>> dynamically typed languages are generally half to a tenth the size of
>> programs in statically typed languages that do the same thing.
>
>I don't believe you. Give an example.

It's only a single data point, but my first prototype version of ATV (the
abstract timing verifier I wrote for my dissertation work) was 1600 lines of
C code.  At that point I changed to Lisp, and got the same functionality in
less than 300 lines of Common Lisp code.  I also got additional functionality
for free: command scripts, the next thing I wanted to add to the program,
took 0 lines of code in Lisp.  (load "file") worked just fine.  The 5+:1 code
ratio here is certainly consistent with David's ranges.

Dave W.		(wallace@hpdtl.ctgsc.hp.com)

yodaiken@chelm.cs.umass.edu (victor yodaiken) (03/20/91)

In article <815@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <L-2A3G2@xds13.ferranti.com> Peter da Silva writes:
>
>]But if you don't know that y is complex, you don't know if "y <= 0" is
>]legal or not...
>
>If you don't know the length of A, you don't know whether "A[i]" is
>legal or not.  So?  You should know the length of A and you should
>know the type of y.  In someone elses code, if you see "y <= 0",
>then you can assume that y is not complex.  In you own code, if you
>know that y is not complex you can write "y <= 0".  There is no
>ambiguity.
>
>]with this in mathematics is everyone has had a bunch of Fortran-style'
>]default typing rules programmed into their head in college.
>
>No, the reason you can get away with this in math is that everyone who
>has a clue knows that you can't compare magnitudes of complex numbers,
>so if you are comparing magnitudes, the numbers must not be compex.
>This is a trivial logical inference:
>

I believe that clarifying notation and type theory has been the subject
of an enormous body of mathematical research over the last century. 
The "type" of an object is often quite unobvious. For example, 
when we write ax = bx -> a=b we are assuming that the "type" of
a,b, and x and the type of the operation has the cancellation property.
If we are working within semigroup theory, this assertion is false. 
Similarly, it is well known that  if x = y - z + z' we can get an
equivalent assertion by rewriting to x = y + -1(z - z')  but if we apply
this rule to the equation x = 1 -1 +1 -1 +1 ... (infinite), we can
prove that 1=0 --- the type of the infinite sequence 1-1 +1 ....
is not "number", so the rules we have invoked cannot be applied. 

I'm not at all sure that type determination is always something that one
can entrust to a compiler.

peter@ficc.ferranti.com (Peter da Silva) (03/20/91)

In article <814@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> Xlisp is far from a toy, it is
> more powerful than C (including C's standard libraries).

For some definition of "powerful", perhaps. In my definition, which comes
down to "how well does it cover the problems I want to solve" it's a toy.
It doesn't even allow you to link to arbitrary libraries, for heavens' sake.

> It is just a "toy" compared to other Lisp's.

Like lisp 1.5?
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

peter@ficc.ferranti.com (Peter da Silva) (03/20/91)

In article <815@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> In article  <L-2A3G2@xds13.ferranti.com> Peter da Silva writes:
> ]But if you don't know that y is complex, you don't know if "y <= 0" is
> ]legal or not...

> If you don't know the length of A, you don't know whether "A[i]" is
> legal or not.  So?  You should know the length of A and you should
> know the type of y.  In someone elses code, if you see "y <= 0",
> then you can assume that y is not complex.  In you own code, if you
> know that y is not complex you can write "y <= 0".  There is no
> ambiguity.

This is fine if the code is:

	Known to be correct and debugged,
and/or	Your code,
and	You wrote it recently,
or	You just finished tracing it all and thus know it intimately.

For maintainable code you need to document all these things. Now you can
either type in declaractions, or you can type in comments that are the
equivalent, but the compiler can no longer use this information to help
you support the software.

> This is a trivial logical inference:

> (1) y is complex implies "y <= 0" is not meaningful
> (2) "y <= 0" is meaningful
> therefore
> (3) y is not complex

> There is nothing sloppy or ambiguous about this reasoning.

Sure, it's got an unstated and unsupported assumption: that the code is
correct.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (03/20/91)

David Gudeman writes:

>In article  <28149@dime.cs.umass.edu> victor yodaiken writes:
>]
>]I'm not at all sure that type determination is always something that one
>]can entrust to a compiler. 
> 
>I wasn't advocating that.  In most dynamically typed languages the
>compiler has no idea what the type of anything is.  The programmer
>knows the types and, where it isn't obvious, he should document the
>types.  There is no reason (other than efficiency) that he should be
>required to specify the type to the compiler.

On the other hand, I was.  At least for those cases where it's clear cut.  

I know that it's usually clear cut for the code I write.  Generally
it's pretty clear what I'm doing, at least in the sense of what type
something should be.  But I'm allowed to stick in declarations for the
cases where the compiler's type inference gets hopelessly lost, and
there's a lot of information it just ignores (*sigh*).

I've been working on collecting ideas and information for the last two
or three years... lots of it goes into coding style, but eventually I
hope to make a compiler which lives up to what I think's possible.
Since I'm not working on it for real, yet, I just do things like annoy
you guys with what's possible but poorly implemented...  (And I have
yet to come up with very elegant ways of representing this sort of
information... I'm afraid that if I were to start working on it now
it'd be one god-awful huge compiler.)

The confessions of a wanna-be 8-)  (It would make my job easier,
though, if someone ELSE were to do this...)

Raul Rockwell

gudeman@cs.arizona.edu (David Gudeman) (03/20/91)

In article  <28149@dime.cs.umass.edu> victor yodaiken writes:
]
]I'm not at all sure that type determination is always something that one
]can entrust to a compiler. 

I wasn't advocating that.  In most dynamically typed languages the
compiler has no idea what the type of anything is.  The programmer
knows the types and, where it isn't obvious, he should document the
types.  There is no reason (other than efficiency) that he should be
required to specify the type to the compiler.

The same applies to math.  Mathematical notation should be such that
the types of variables are clear.  But there is no reason why the
notation should include formal "declarations" when there are other
easy ways for a human to tell what the types are.  Mathematics is
written for humans to read, not computers.  I claim that programming
languages should also --as much as possible-- be designed for humans
to read, not computers.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/20/91)

In article <9106@castle.ed.ac.uk>, aipdc@castle.ed.ac.uk (Paul Crowley) writes:
> I'm going to split languages in two by type extensibility.  In C and ML,
> you can explicitly make up a new type from old types: a widget is made
> up of two foos and a bar.  In Logo, there are only three types: words,
> numbers, and lists.  If you want to define an imaginary number, you
> could use a list of two integers.  If you want to define a UNIX-style
> time as a list of two integers [secs, usecs], you can do that too.  If
> you accidentally feed an imaginary number to a function that wants a
> date, the language won't say a word.  Prolog behaves this way too. 

Just an observation here:  any Prolog programmer who chooses such
representations should be kicked in the backside until he or she
does it *right*.  A better way to represent complex numbers would be
as "tagged pairs"
	complex(RealPart, ImaginaryPart)
and a better way to represent times would be
	time(Seconds, MicroSeconds)
These cannot be confused with each other.  They also use 3/4 of the
memory that a two-element list would use.  In fact, with the DEC-10
Prolog type checker, you would declare

	:- type complex --> complex(number,number).
	:- type time --> time(integer,integer).

and have all the reliability benefits of static type checking (but
not the efficiency benefits).

What's more, there isn't one teeny tiny thing in C (or, for that
matter, ML) which *makes* you use separate types.  In C
	struct num2 { double x[2]; };
	struct num2 i = {0.0, 1.0};	/* a complex number */
	struct num2 t = {1.2, 0.0};	/* a time */
When you think of the number of conceptually distinct types which
C programmers overlay onto 'int', you realise that "static typing"
*may* fail to buy you much reliability.  Note the way that C,
most Pascals, and many Fortrans overlay the notion "bit vector"
onto integers.

One great *reliability* advantage of languages like Lisp is that
there is no equivalent of casts (or for PL/I fans, no UNSPEC).
There _is_ a typing system and you _can't_ defeat it.

-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/20/91)

In article <1991Mar15.153342.10670@rice.edu>, wright@datura.rice.edu (Andrew Wright) writes:
> I would like to see a concise summary of the claimed advantages of
> dynamic typing.  The claimed advantages of static typing are numerous
> and often argued (verification, efficiency, documentation, ...) but
> the advantages of dynamic typing are less often discussed.

Read "Object-Oriented Programming, an evolutionary approach"
by Brad J. Cox, Addison-Wesley, ISBN 0-201-10393-1.

If I can summarise his argument adequately in two sentences
(and I really don't think I can do justice to it):

	- The fewer irrelevant constraints a software component
	  includes in its interface, the easier it is to re-use
	  that component or to continue to use it in an evolving
	  application.

	- Latent typing is an effective way of removing some
	  irrelevant constraints from an interface.

This argument also supports Ada-style generics and ML-style
polymorpism.  Cox also discusses Ada generics.

The static/dynamic typing issue is just one instance of the
general early-binding/late-binding issue.

The argument for dynamic typing can be put in a nutshell:
    "don't put anything in writing if you're going to regret it later".

-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

gudeman@cs.arizona.edu (David Gudeman) (03/20/91)

In article  <3352:Mar1803:04:5491@kramden.acf.nyu.edu> Dan Bernstein writes:
]In article <693@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]> Programs in
]> dynamically typed languages are generally half to a tenth the size of
]> programs in statically typed languages that do the same thing.
]
]I don't believe you. Give an example.

One example would convince you?  Say, I've got this real estate I'm
trying to unload...

No, I'm not willing to go to the work to dig up (or generate)
examples.  Sorry.  I've seen examples, and those numbers are thrown
around quite a lot, and they agree with my experience, but I'm not
willing to work that hard just to win an argument (I may like to argue
but I don't really care if I win or not...).

You can probably find some examples of your own though.  Try comparing
the sizes of various GNU Emacs elisp packages with C programs that do
the same thing.  There are also articles scattered around the
programming languages literature that make these sorts of comparisons.

You will probably have to look back 10 years or more, since these days
the advocates of dynamic typing tend to feel that the size advantage
is so obvious it doesn't have to be proven any more.  Now they are
trying to prove it can be made as efficient as static typing (and I
don't think they will succeed).
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

cs450a03@uc780.umd.edu (03/20/91)

Steve Knight writes:
>
>In response to a claim (or observation) made by David Gudeman that
>"dynamically typed languages are generally half to a tenth the size
>of programs in statically typed languages" Dan Bernstein reasonably
>writes back.
> 
>] I don't believe you. Give an example.
> 
>Of course, this gets us into the foolish realms of anecdotes and virility 
>tests.

It's somewhat worse, IMHO, in that there is no such thing as a true
example.  A true example would be a drop-in replacement for another
program, maintaining its quirks and 'optimizations' as though they
were mandated by <<insert your favorite authority here>>.

It is also false (as Peter da Silva has indirectly pointed out) to
place all the credit for compactness on dynamic typing...  some
compactness is from dynamic linking (or its equivalent), some is
because startup code isn't needed, some might be because the program
is kept in a higher-level form (e.g. threaded code).

And some benefits come from incidental features (it being very easy
for example, to implement overlays on a system which doesn't have them
in the os, if you can manipulate 'object code' with your language
primitives, and call such routines with low overhead).  Perhaps what
we've been arguing is not so much for dynamic typing, but the
inclusion of additional types of high utility?

But what if that additional type is an array whose bounds are
maintained at run-time?  What if there are a large number of
primitives to support manipulations of the list, and so on?

Basically, dynamic-typing (the way I've been thinking about it)
applies to languages which allow more structure that if/then/else and
do/while.  Things like data-selection operations and iteration over
some sequence of values (where it DOESN'T MATTER what the data is, you
just want to select it, and so on).  I'm talking about making these
first-class functions in the language, not in the sense of 'this is an
ad-hoc feature which can be used to speed up performance on platform
X', but 'this is a language feature, intended to replace lower-level
operations'.  And to any argument that 'This can be done in C', I
respond, it can be done in assembly language too.  (And to any who say
"that's not portable", I add that it can even be done in 8086
assembly).

In so-called statically typed languages, you are forced to think and
program in a low-level fashion.  Dynamically typed languages (If I say
'array'-oriented or 'list'-oriented, do I exclude Icon or ML?) tend to
be higher-level languages.  It doesn't matter how fast your solution
is if it doesn't run.

Raul Rockwell

cs450a03@uc780.umd.edu (03/20/91)

Peter da Silva writes:
>In article <815@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
  [[ examples of type-checking, with slant towards doing it dynamically ]]

>This is fine if the code is:
> 
>	Known to be correct and debugged,
>and/or	Your code,
>and	You wrote it recently,
>or	You just finished tracing it all and thus know it intimately.

Hmm... I spend a lot of time debugging and upgrading old code.  There
is a lot of such code around.  Some of it I just replace (like when
there's a better algorithm), some of it I patch.  Lots of it had bugs
that showed up under obscure circumstances--things in C that would
involve type-casting, mallocs, unions, etc.  (and segvs and either
mysterious creeping bugs or core-dumps.)

Oddly enough, the code is still useful, even the sloppily written
stuff.

Oddly enough, I didn't write it.

Oddly enough, I rarely have to trace it.

Even odder, I maintain this code but still manage to spend around half
(often more) my time on development.

Odder still, usually the only part of the comments I find useful are
the ones that identify the purpose of the function, or perhaps the
purpose of a variable.

I will admit that it took me a few months to get 'up to speed', and
that I'm still learning things, but I disagree quite emphatically with
Peter's "must" list.

Raul Rockwell

yodaiken@chelm.cs.umass.edu (victor yodaiken) (03/20/91)

In article <878@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <28149@dime.cs.umass.edu> victor yodaiken writes:
>]
>]I'm not at all sure that type determination is always something that one
>]can entrust to a compiler. 
>
>I wasn't advocating that.  In most dynamically typed languages the
>compiler has no idea what the type of anything is.  The programmer
>knows the types and, where it isn't obvious, he should document the
>types.  There is no reason (other than efficiency) that he should be
>required to specify the type to the compiler.
>
>The same applies to math.  Mathematical notation should be such that
>the types of variables are clear.  But there is no reason why the
>notation should include formal "declarations" when there are other
>easy ways for a human to tell what the types are.  Mathematics is

Mathematical literature is full of statements of the form
"let X be a finite set", let $f: S x Y -> Z$", "k ranges over
the naturals", etc. etc., these are type declarations.  For example,
here's the Serge Lang in "algebra" (1984, Addison-Wesely)

	The collection of all morphisms in a category A will 
	bedenoted by AR(A) ("arrows of A"). We shall sometimes use the
	symbols "f \in A(A)" to mean that f is a morphism of A ...

How else are we going to make the types of the variables "clear", except
by type definitions of this kind?
I don't see how one can avoid declarations in either programming or
math, except when the domain is very simple or very stylized.
Here's an example. Suppose we define an encoding of sequences into
integers with a map  element(i,j)$ that picks out the ith element
of the sequence encoded in j. When I write  "if i=j then P" there
is now an ambiguity, do I mean to compare i and j as integers, or as
sequences? Since the same sequence may be encoded by distinct integers
the choice is significant. In a math paper, one might dispell the ambiguity
by using "u" and "v" as sequence variables, while reservng i and j for
integers. So "if u = v then P" could be interpreted without ambiguity
as calling for sequence comparison. But, we can only disambiguate if we
have declared  the types of "u" and"v".

>written for humans to read, not computers.  I claim that programming
>languages should also --as much as possible-- be designed for humans
>to read, not computers.

How can one disagree with this sentiment?

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/20/91)

In article <2400035@otter.hpl.hp.com> sfk@otter.hpl.hp.com (Steve Knight) writes:
> This program read a file of syllables and printed out "passwords" composed
> from three random syllables.

sed 's/$/XXXXXXXXXX/' | travesty -s -o10 | ... Sorry, couldn't resist.

>     define program();             ;;; build a list of syllables
>         lvars syllables = 'syllables'.discin.incharline.pdtolist;
>         repeat 3 times
>             syllables.oneof.pr    ;;; print a syllable 3 times
>         endrepeat;
>         nl( 1 );                  ;;; throw 1 new line
>     enddefine;

Fine, this is sort of example I was looking for. I claim that the
conciseness of this program comes from the libraries available, not from
the dynamic typing.

#include <stdio.h>
#include "sop.h"
#include "strinf.h"
main() {
SOP(strinf) *syl; strinf *s; int i; syl = SOPempty(strinf);
while (s = strinfgets(stdin)) { strinfchop(s); SOPadd(syl,s,strinf); }
for (i = 0;i < 3;++i) puts(strinftos(SOPrandpick(syl,strinf)));
putchar('\n');
}

(Yes, I admit I stole the name and concept of strinfchop from Perl.)
Here strinf is a library for handling arbitrary-length strings, and sop
is a generic set-of-pointers library.

Surely you agree that, syntax aside, the C and Pop11 versions work the
same way to accomplish the same results. The difference? I get better
type checking and almost certainly better efficiency. For longer
programs this means better turnaround time.

---Dan

peter@ficc.ferranti.com (Peter da Silva) (03/20/91)

In article <2400035@otter.hpl.hp.com> sfk@otter.hpl.hp.com (Steve Knight) writes:
> Of course, being choked to death by irate C programmers is only one of the
> many hazards of using a dynamically typed language ...

Looks to me like this is more a factor of the richness of the runtime
library, rather than the fact that it's dynamically typed. Of course,
dynamically typed languages often have a rich subroutine library, but
that's not always true. Consider Lisp 1.5, or CScheme.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

anw@maths.nott.ac.uk (Dr A. N. Walker) (03/21/91)

In article <731@optima.cs.arizona.edu> gudeman@cs.arizona.edu
(David Gudeman) writes:
>In article  <1991Mar14.151707.11686@maths.nott.ac.uk> Dr A. N. Walker writes:

[I assumed that "Algol", in DG's bracketing of Algol and Fortran as low-level
languages, meant Algol 60]

>No, I was refering to the earliest Algol tradition.

	Have you actually *read* the Algol 58 (or IAL) report?  The Algol
tradition is of a patrician disregard for efficiency, which argues against
your assertions that various features were put in to make code generation
easier.  As late as Algol 68, the arguments are mostly about whether
things can, *in principle*, be compiled;  it was assumed, as a matter of
faith, that compiler technology would eventually catch up.

[re Fortran]
>I said "static typing", not "type declarations".  Early FORTRAN had
>static typing but some of the declarations were implicit.

	Just so.  This saves the "busy work" to which you objected;
nevertheless, reputable authors recommend that Fortran users should
put the busy work back in.

>You are just arguing for array bounds checking -- arguably a form of
>dynamic typing.

	Semantics.  It's not what most people think of as dynamic typing,
and it exists in Algol N (for all N!), Pascal, C, Fortran, etc., which
you surely don't think of as dynamically typed languages.

[Static typing has two]
>]>consequences: (1) you have to limit the forms of expressions to some
>]>set for which you know a type-checking decision procedure, and (2) you
>]>have to acquire type information somewhere.
>
>]	Ie, (1) you have to know what your expression is intended to do,
>]and (2) you have to use variables in a disciplined way.
>
>Your (1) is meaningless.  Of course you have to know what an
>expression is intended to do in any language.  This has nothing to do
>with when type checking is done.

	If *I* can look at (say) "a+b" and decide whether the operation
is defined, and what types the operands might have, and what the
consequences might be for the rest of the program, so can the compiler.
If you intend to make use of the *freedom* that dynamic typing can give
you [and I agree that it is sometimes, even often, useful], then it
follows that you *can't* know what your expression might do (except in
the very boring sense that a particular language might define otherwise
undefined operations to deliver 0, or some such).

>				   Your (2) is a subjective judgement [...]

	Certainly.  On the other hand, if writing out each identifier
that you use just once extra, with [usually] one word and a semicolon,
is a significant load in a serious production program, there's something
wrong somewhere.  Of course there are times when it's a nuisance;  there
are times when *any* form of documentation or commenting is a nuisance.

>]	There is a place for dynamic typing (I enjoy writing shell
>]scripts!), and a case for rapid prototyping, [...]
>
>In the first place, dynamic typing is useful for full-scale working
>programs, not just prototypes.

	I did distinguish the two!

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

dc@sci.UUCP (D. C. Sessions) (03/21/91)

In article <626@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
# In article  <1991Mar13.163629.12630@engage.enet.dec.com> grier@marx.enet.dec.com writes:

  [Deleted for brevity -- DCS]

# ]... I just wouldn't do my banking or trust my life to software which
# ]relies on extensive testing rather than some level of ensurred correctness.
# 
# You must be joking.  Static type checking doesn't give any reasonable
# level of assurance at all -- it is never the case that simply because
# program compiles without errors, there is reason to believe that it
# has some level of reliability.  Testing is the _only_ known way to
# give any assurance at all.  And a given amount of testing generally
# provides more assurance for a language with dynamic typing than it
# would for a language with static typing.  (Because programs in
# dynamically typed languages are usually much smaller and have fewer
# paths to test.)

  [More deletions -- DCS]

# 					David Gudeman

Let's try this theory on a real-life test case:

  Once upon a time, there was an engineering team which got stuck 
  maintaining a (big) mess of spaghetti.  This spaghetti contained
  records -- *lots* of records -- which got passed around all over the
  place.

  Most of the record types (Type_A through Type_M) were of the form:

        <header>
        <foo>
        <stuff>

  But record type Type_N was of the form:

        <header>
        <bar>
        <more stuff>

  (And by the way, <foo> and <bar> were both integers.)

  Now, since this was in the bad old days before function prototypes, 
  all of the functions which expected a Type_A would quite cheerfully 
  accept a Type_N instead.  This happy circumstance was widely exploited 
  to allow common handling of the header record, which contained
  information controlling the routing of the record between concurrent 
  threads.

  One day, a problem showed up.  On examination, it turned out that some 
  values of <foo> dictated an alternate routing, so one of the 
  message-routing functions was modified to handle the situation.  What 
  everyone missed was the fact that the routing function sometimes 
  handled Type_N messages.

  Of course, the results of interpreting <bar> as if it were <foo> could
  be amusing in the extreme.  Especially so since the consequences
  usually didn't show up until the record had been passed along to
  another concurrent thread.  Testing didn't discover the gotcha since 
  the anomalous value never turned up in <bar>.

  At least, it didn't turn up until a certain major corporation with a 
  three-letter name revised its communications protocol.  Oops.  *Big* 
  oops.  Banks.  Airlines.  Can you say "panic debug"?  It didn't take 
  long to find the Record from Mars; tracing back to where it came from 
  was a bit slower.  Lots slower.  Especially since working backwards 
  just showed the mutant being handled correctly, right back to the 
  point where Type_D records shouldn't come from.

The upshot of this little affair was the conversion of an entire shop 
full of C hackers into Modula-2 fanatics, purely because they *never* 
wanted to give up intermodule type-safety again.

So: for the purposes of the current discussion, how do our ideal 
dynamically-typed languages ensure that a similar little 
misunderstanding doesn't happen again?  Of course, sufficient human 
discipline would avoid the problem (by doing a manual static 
type-check?) but this is one of those little things that some 
programmers have come to expect computers to do for them.
-- 
| The above opinions may not be original, but they are mine and mine alone. |
|            "While it may not be for you to complete the task,             |
|                 neither are you free to refrain from it."                 |
+-=-=-    (I wish this _was_ original!)        D. C. Sessions          -=-=-+

peter@ficc.ferranti.com (Peter da Silva) (03/21/91)

In article <20MAR91.08580313@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
> Peter da Silva writes:
> >This is fine if the code is:
> >	Known to be correct and debugged,
> >and/or	Your code,
> >and	You wrote it recently,
> >or	You just finished tracing it all and thus know it intimately.

> Hmm... I spend a lot of time debugging and upgrading old code. [etc...]

You're over-generalising my response. I am talking about the particular
case where you know what the "type" of a value is because of the immediate
context assuming that the usage is valid. I don't: I have to look at a
declaration or look at more than just the immediate context.

> Oddly enough, I rarely have to trace it.

I didn't intend imply that you have to. What I mean here is that *unless
you have* traced it you can't look at a random piece of code and know
the types of all the objects being dealt with. The ones you're familiar
with, yes.

[the only useful comments are]
> the ones that identify the purpose of the function, or perhaps the
> purpose of a variable.

That is, type declarations. Hmmmm.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

gudeman@cs.arizona.edu (David Gudeman) (03/21/91)

In article  <28190@dime.cs.umass.edu> victor yodaiken writes:

]Mathematical literature is full of statements of the form
]"let X be a finite set", let $f: S x Y -> Z$", "k ranges over
]the naturals", etc. etc., these are type declarations.

Come on, people.  I didn't say that you never see anything like
declarations in mathematics.  All I said is that you can often do
without them.  The same is true of dynamically typed languages:
sometimes you have to check the types of variables, but often you can
do without it.  And in either case, "doing without it" is neither
sloppy nor ambiguous.  If it is ambiguous then you can't do without
it in the first place; and "sloppy" is in the eye of the beholder.

]	The collection of all morphisms in a category A will 
]	bedenoted by AR(A) ("arrows of A"). We shall sometimes use the
]	symbols "f \in A(A)" to mean that f is a morphism of A ...

That isn't a declaration, it's a comment.  In other words, no formal
notation is being used to describe the type, an English description is
being given.  If you want something like a declaration in math, try

  f : A -> B
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

yodaiken@chelm.cs.umass.edu (victor yodaiken) (03/21/91)

In article <922@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> [A quote from Lang]
>]	The collection of all morphisms in a category A will 
>]	bedenoted by AR(A) ("arrows of A"). We shall sometimes use the
>]	symbols "f \in A(A)" to mean that f is a morphism of A ...
>
>That isn't a declaration, it's a comment.  In other words, no formal
>notation is being used to describe the type, an English description is
>being given.  If you want something like a declaration in math, try
>
>  f : A -> B

This I don't understand at all.  Here are two "comments:

comment A: "In the following we use the symbols a,b to denote 
elements of a sequence and u and v to denote sequences"

comment B: "In the following we use the symbols u,v to denote 
elements of a sequence and a and b to denote sequences"

If I write "Comment A" and f(null)= 0, f(av) = h(g(a), f(v))
     I've defined f as a recursive map on sequences
But if I write "Comment B" and  f(null)= 0, f(av) = h(g(a), f(v))
I've defined a quite different function. 
If I precede either with the comment "We use juxtaposition to
indicate iterated multiplication, so that x<y1,.... yn> = 
<xy1,... , xyn>" I could change the meaning of either statment
again by refuting our expectation that av indicates concatenation.

So all these "comments" are really "type declarations" not just comments that
can be stripped from the text without altering the semantics.

Note that I have not had to state what elements can be placed
in a sequence, so both A and B declare only a structural property
If you mean to argue that type declarations in mathematical notation
are generally more concise and more to the point than those offered
by Fortran or C, I'll agree. But it seems as if you are arguing a
larger point.

kers@hplb.hpl.hp.com (Chris Dollin) (03/21/91)

Dan writes (following up to Steve's message):

   > This program read a file of syllables and printed out "passwords" composed
   > from three random syllables.

   sed 's/$/XXXXXXXXXX/' | travesty -s -o10 | ... Sorry, couldn't resist.

   >     define program();             ;;; build a list of syllables
   >         lvars syllables = 'syllables'.discin.incharline.pdtolist;
   >         repeat 3 times
   >             syllables.oneof.pr    ;;; print a syllable 3 times
   >         endrepeat;
   >         nl( 1 );                  ;;; throw 1 new line
   >     enddefine;

   Fine, this is sort of example I was looking for. I claim that the
   conciseness of this program comes from the libraries available, not from
   the dynamic typing.

   #include <stdio.h>
   #include "sop.h"
   #include "strinf.h"
   main() {
   SOP(strinf) *syl; strinf *s; int i; syl = SOPempty(strinf);
   while (s = strinfgets(stdin)) { strinfchop(s); SOPadd(syl,s,strinf); }
   for (i = 0;i < 3;++i) puts(strinftos(SOPrandpick(syl,strinf)));
   putchar('\n');
   }

   (Yes, I admit I stole the name and concept of strinfchop from Perl.)
   Here strinf is a library for handling arbitrary-length strings, and sop
   is a generic set-of-pointers library.

   Surely you agree that, syntax aside, the C and Pop11 versions work the
   same way to accomplish the same results. The difference? I get better
   type checking and almost certainly better efficiency. For longer
   programs this means better turnaround time.

Since neither Steve nor Dan has chosen to enlighten us as to the meanings of
the identifiers used, how are we supposed to know that they are ``the same''?
(Close watchers ill not that I do in fact know what the Pop routines are, but
that's not the point).

I do note that Dan's example had to be all squashed up to fit in a similar
amount of space; laid out in a similar style it would take another five or so
lines. It's not clear that Dan gets ``better'' type-checking: Steve's code
works if -syllables- is a list or a vector, for example.

Dan certainly gets better efficiency: -oneof- is slow on lists (because
-length- and indexing are slow on lists). Given Steve's propensity for
speed-hacking, I'm surprised he didn't write the definition of -syllables- as

    lvars syllables = {% 'syllables'.discin.incharline.pdtolist.dl %};

to generate a vector rather than a list. Still, if we're only going round three
times, it hardly matters, does it? [Perhaps Steve should add -pdtovec- to our
local library.]

Maybe we should stop muttering about static vs dynamic typing and instead look
to the *real* issue here: how would one capture the advantages of the
dynamically typed systems that David is advocating but still be able to
typecheck at compile-time? It's pretty clear (to me, at least) that some notion
of sub-typing would be required, but I suspect that means run-time checks are
needed after all. [Note for object enthusiasts: classes are NOT types.] ML
doesn't cut it as it stands.

Perhaps David should show us some examples where he thinks dynamic types are
``essential'' and we should attempt to devise type systems that capture that
information at compile-time in a reasonablly checkable way. Perhaps Dan should
present examples where static typing is all that's required - or, contrariwise,
where even he would like more looseness in the type system.

And please bear in mind that C is hardly a good example of a statically typed
system....

--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/21/91)

In article <1991Mar13.010946.4536@engage.enet.dec.com>, grier@marx.enet.dec.com writes:
>    I also claim agnostance(is that a real word?) on the utility of dynamic
> typing.  Other than BASIC, I can't say I've ever used it to build
> any real software...

It's news to me that BASIC has dynamic typing.  Surely a string variable
X$ can't have numbers or arrays as values, an integer variable X% can't
have strings or arrays as values, a string array variable B$() can't
be a string or a number, a function FNA% must be an integer function,
and so on.  In short, surely BASIC uses static typing?

-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

db@cs.ed.ac.uk (Dave Berry) (03/21/91)

In article <2400034@otter.hpl.hp.com> sfk@otter.hpl.hp.com (Steve Knight) writes:
>As you can see, modern programming languages that support polymorphic
>type-checking and type-inference give you all the security and performance
>enhancements of a strongly typed language and all the flexibility and
>convenience of a dynamically typed language.

No-one claimed that ML gives you all the flexibility of dynamic typing.
Someone pointed out that you don't need to declare types in a strongly
typed language, which ML shows is true.  ML also shows that you can
have incremental compilation and interactive environments with strongly
typed languages.  It's a pity that so many people seem unable to separate
one feature of a language (or set of languages) from other features of
those langauges.

Some of Steve's examples show the sort of thing that can't be done in a
(purely) strongly typed language.  For example,

>- val foo = [false, 0];
>Error: operator and operand don't agree (tycon mismatch)

There's no way that you can know the type of the head of a list
containing both integer and booleans without either:

	1. Dynamic typing.
	2. Tagging each element.

A subtype hierarchy (of some sort or other) would let you specify a general
type for the elements in the list, and dynamic binding would let you call
type-specific functions on each element.  This is the sort of thing done
in Eiffel and C++.  It still doesn't give you the actual type of each element.

There are schemes for adding dynamic types to strongly typed languages that
would let you do things like this.  They preserve type-safety by requiring
explict coercions from the type "dynamic" to the desired type, which only
succeeds if the object really is of the desired type.  (It's likely that one
of these will appear in ML at some point.)  So it seems that designers of
strongly-typed languages are looking at facilities to use dynamic types when
they are needed, and designers of dynamically-typed languages are looking at
facilities to have type-checking when needed.  Perhaps we should stop arguing
and start co-operating.

>It is not possible to write a polymorphic hash-function.

I don't see how it's possible to write a polymorphic hash-function in any
language unless you know all the possible types or type-representations
that the language will ever have, at the time when you write the function.
Is this necessarily the case with dynamically-typed lanaguges?  How could
you write a polymorphic hash-function in SmallTalk?  Redefining the function
in each sub-class is not acceptable - you can make the equivalent definition
in ML (you don't have dynamic binding, but that wasn't Steve's point).

Steve's other comments about ML aren't relevant top the subject, so I'll
comment on them in a different message.

--
 Dave Berry, LFCS, Edinburgh Uni.      db%lfcs.ed.ac.uk@nsfnet-relay.ac.uk

		George Bush: America's Chamberlain.

sfk@otter.hpl.hp.com (Steve Knight) (03/22/91)

Chris writes:
> Dan certainly gets better efficiency: -oneof- is slow on lists (because
> -length- and indexing are slow on lists). Given Steve's propensity for
> speed-hacking, I'm surprised he didn't write the definition of -syllables- as
>     lvars syllables = {% 'syllables'.discin.incharline.pdtolist.dl %};
> to generate a vector rather than a list. 

No fair, Chris :-)  It was only an anecdote ...  Still if you insist[*].
    define pdtovector( p ); lvars p; {% apprepeater( p, identfn ) %} enddefine;

Footnote:

Besides, you know I'd have written it like this for speed
    define program();
        lvars n = #| apprepeater( '/tmp/syllables'.discinline, identfn ) |#;
        repeat 3 times
            n.random.subscr_stack.pr
        endrepeat;
        nl( 1 );
    enddefine;

cs450a03@uc780.umd.edu (03/22/91)

Peter da Silva's comments are   >  and >> >
Mine are                               >>
>> >This is fine if the code is: 	Known to be correct and
>> >debugged, and/or	Your code, and	You wrote it recently,
>> >or	You just finished tracing it all and thus know it intimately.
>> Hmm... I spend a lot of time debugging and upgrading old code. [etc...]
>You're over-generalising my response. I am talking about the particular
>case where you know what the "type" of a value is because of the immediate
>context assuming that the usage is valid. I don't: I have to look at a
>declaration or look at more than just the immediate context.

Gack!  I think I know what you're saying.  I think that the types that
I deal with are much simpler than the types you deal with.  Or maybe I
should say that each variable has its own unique type?  

Essentially, I deal with:  character, integer, bit, file, function,
and homogenous collections of each, there are also rare hetrogenous
collections, and not-so-rare relationships between data of different
type in different variables.

Generally, each assignment to a variable is unique.  (I try not to
re-assign, and when I do, I try and make sure re-executing that
section of code would not cause a problem).  Exception made for loop
counters, but not for other assignments made within the loop.

So I suppose you could say I'm writing using a static typing style of
coding.  But if I had to have a separate declaration for each
assignment, well, that would increase my code size by 50-70% right off
the top (probably more, if I constructed C-like structs for each
assignment I make).  Not to mention it would probably triple the
number of 'library routines' I'd need to deal with.  All this for
what?  To make sure I'm protected from 'making mistakes'?

Or, maybe to make my code "more efficient".

Or, is someone going to try and claim that my code would become "more
understandable"?

>> Oddly enough, I rarely have to trace it.
>I didn't intend imply that you have to. What I mean here is that
>*unless you have* traced it you can't look at a random piece of code
>and know the types of all the objects being dealt with. The ones
>you're familiar with, yes.

If it's being used with addition, it's probably gonna be a number.  If
it's being used with string search, it's probably a string.  If it's
being used with logical and it's probably a boolean.  If that doesn't
apply to your language, then maybe typing isn't strong enough in your
language.

>[the only useful comments are]
>> the ones that identify the purpose of the function, or perhaps the
>> purpose of a variable.
>That is, type declarations. Hmmmm.

Yeah, type declarations.  Neither dynamic type declarations nor static
type declarations, but descriptive type declarations.  A feature
supported in every language I've ever heard of (even machine
language).

Raul Rockwell

cs450a03@uc780.umd.edu (03/22/91)

Dr. A. N. Walker writes:
>	If *I* can look at (say) "a+b" and decide whether the operation
>is defined, and what types the operands might have, and what the
>consequences might be for the rest of the program, so can the compiler.
>If you intend to make use of the *freedom* that dynamic typing can give
>you [and I agree that it is sometimes, even often, useful], then it
>follows that you *can't* know what your expression might do (except in
>the very boring sense that a particular language might define otherwise
>undefined operations to deliver 0, or some such).

Er, what do you mean by "make use of the *freedom* that dynamic typing
can give"?

I'd think that using a primitive which has meaning for more than one
type would qualify.  And so should writing some operation intended to
have the same effect on different types of data.  And so should using
a primitive which would be impossible to define if it's results had to
be of some singular type.

My first thought is rather weak, I know.  Statically typed languages
usually let you do things like check equality between objects of the
same type.  And a died-in-wool static-typing-hardliner would claim
that it is meaningless to find out that objects of different types are
not equal.

Examples of the second also can be done in statically typed languages,
things like sorting, searching, and selection.  But even C (which some
people have said we shouldn't be using as an example of a statically
typed language) doesn't have a hashed search (for un-ordered items) in
any of the libraries I know of.  And I'd like to hear about such
things in a language other than C...  Who knows, I might be in an
environment where I'll have to use another language.

For the last one (primitives returning data of any type), I'll give
two examples:

(1)  Typed data on file (including array bounds, which some people
might claim is not type information).  (For those who haven't read my
earlier postings, I've been claiming that static type checking isn't
strong enough).

(2)  An 'Eval' primitive.  Before this is safe, I know, you have to be
able to validate each of the symbols used (make sure they are
appropriate for the problem).  But how could you even provide such a
service in some other language?  Or is somebody going to claim that if
I need the semantics of Eval, I'm supposed to write my own from ground
up?

Also, I know that with C you can write a meta-program which writes
another program, and that often serves the same purpose.  But how do
you validate that the meta-program wrote the right stuff into the
dynamically created program?  I find these more of a pain to write and
debug than they should be.  I've also found other people's work along
the same lines to often be a little non-portable (address errors from
lex/yacc code, for instance).

Finally, there are a number of things that I'd dread even trying to
describe in a statically typed language (things which take functions
as arguments and yield new functions as a result).  Yet I claim that I
can understand what the derived functions would do when applied to my
code.  Of course, I suppose many of these are equivalent to control
structures in a static language (but control structures have their own
problems... )

    The world of statements is a disorderly one, with few useful
    mathematical properties.  Structured programming can be seen as a
    modest effort to introduce some order into this chaotic world, but
    it accomplishes little in attacking the fundamental problems
    created by the word-at-a-time von Neumann style of programming,
    with its primitive use of loops, subscripts, and branching flow of
    control.
                    John Backus -- 1977 Turing Award Lecture

Raul Rockwell

gudeman@cs.arizona.edu (David Gudeman) (03/22/91)

In article  <KERS.91Mar21130048@cdollin.hpl.hp.com> Chris Dollin writes:
]
]Perhaps David should show us some examples where he thinks dynamic types are
]``essential''

Can't think of a one.  Dynamic typing is a convenience and a
productivity enhancer, but it isn't essential.

]And please bear in mind that C is hardly a good example of a statically typed
]system....

Actually, C is as pure a statically typed language as you can get, it
just isn't strongly typed.  Languages that are more strongly typed
usually get that way by flavoring with a little dynamic typing.  This
appears in such things as array bounds checking and ensuring that a
pointer is non-null before dereferencing it.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (03/22/91)

In article  <1991Mar20.185308.8275@maths.nott.ac.uk> Dr A. N. Walker writes:
]
]	Have you actually *read* the Algol 58 (or IAL) report?

I don't recall.  Not in a long time if ever.

]  The Algol
]tradition is of a patrician disregard for efficiency, which argues against
]your assertions that various features were put in to make code generation
]easier.

I'll bet though that what you consider a patrician disregard for
efficiency I will see as a psychotic fixation on same.

]  As late as Algol 68, the arguments are mostly about whether
]things can, *in principle*, be compiled;

Yep, psychotic fixation on efficiency.  They weren't arguing on
whether something could be _implemented_, they were worried about
producing machine code.  Meanwhile the lisp people were happily
implementing things without worrying how fast they would run.

]	Certainly.  On the other hand, if writing out each identifier
]that you use just once extra, with [usually] one word and a semicolon,

No, that is not the problem with static typing.  It is not uncommon to
have to declare variables in dynamically typed languages, you just
don't specify the type.  When I complain about too many declarations,
I'm not talking about declaring simple variables, I'm talking about
the large detailed declarations needed to create any data structure
with even moderate complexity.  I believe that these declarations are
responsible for an inordinately large percentage of the errors in
programs.

Another problem is that static type declarations usually require the
programmer to over-specify the solution.  This is a problem for
initial developement, maintenance, and code re-usability.  In initial
developement, it is not uncommon to get continuous requirements
changes and these are easier to deal with if fewer details have been
engraved in code.  Similar comments apply to maintenance.

Dan mentions that the power of a language is related to the amount of
re-usable code written in the language, but that is an incomplete
observation.  There is also the question of how much a language
encourages the writing of re-usable code.  In statically typed
languages it is always more difficult to write a generic package than
a special-purpose package.  The difference in difficulty varies a
little over languages, but the fact remains that if it is _any_ more
difficult to write re-usable code, less re-usable code will be
written.  In dynamically typed languages it is just as easy to produce
re-usable code as not.  In fact it takes some imagination to make code
not re-usable.

Ironically, those programing features that are intended to make it
more possible to write re-usable code tend to lead to more voluminuous
declarations, thereby making the error problem worse.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

peter@ficc.ferranti.com (Peter da Silva) (03/22/91)

In article <1610006@hpdtczb.HP.COM> wallace@hpdtczb.HP.COM (David Wallace) writes:
> I also got additional functionality
> for free: command scripts, the next thing I wanted to add to the program,
> took 0 lines of code in Lisp.  (load "file") worked just fine.

Forth is a statically "typed" language, and has the same advantage. This
has nothing to do with dynamic typing.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

peter@ficc.ferranti.com (Peter da Silva) (03/22/91)

In article <21MAR91.23594992@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
> Gack!  I think I know what you're saying.  I think that the types that
> I deal with are much simpler than the types you deal with.  Or maybe I
> should say that each variable has its own unique type?  

I'm not even necessarily talking about any particular language. I'm an
agnostic on the whole issue of which is "better". I use each tool for
what it's good for, subject to the availability of good compilers. But
I do find declarations a useful tool when I'm using a statically typed
language with a sufficiently rich type system (neither C nor Pascal
really qualifies here, for different reasons).

> Generally, each assignment to a variable is unique.  (I try not to
> re-assign, and when I do, I try and make sure re-executing that
> section of code would not cause a problem).  Exception made for loop
> counters, but not for other assignments made within the loop.

This is an unusual coding style, in my experience. Are you actually
limitin assignments, or are you hiding those assignments in call by
reference? Perhaps a code fragment would help.

> >[the only useful comments are]
> >> the ones that identify the purpose of the function, or perhaps the
> >> purpose of a variable.
> >That is, type declarations. Hmmmm.

> Yeah, type declarations.  Neither dynamic type declarations nor static
> type declarations, but descriptive type declarations.  A feature
> supported in every language I've ever heard of (even machine
> language).

But wouldn't it be nice if the language understood those declarations?
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

peter@ficc.ferranti.com (Peter da Silva) (03/23/91)

In article <22MAR91.09242511@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
> (2)  An 'Eval' primitive.

If you need the semantics of eval, you need an interpreted (or incrementally
compiled) language. As a counterexample, Forth provides for this tool
but is not a dynamically typed language in any sense of the word.

Of course, any dynamically typed language has an "inner interpreter", but
that's a different beast from the outer interpreter that understands the
source language.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (03/23/91)

Peter da Silva   > 
Me               >>
>> Generally, each assignment to a variable is unique.  (I try not to
>> re-assign, and when I do, I try and make sure re-executing that
>> section of code would not cause a problem).  Exception made for loop
>> counters, but not for other assignments made within the loop.
>This is an unusual coding style, in my experience. Are you actually
>limitin assignments, or are you hiding those assignments in call by
>reference? Perhaps a code fragment would help.

It's not really that unusual...  especially when you consider that I
try but don't always succeed ;-)  A typical C ferinstance would be any
code where you initialize a table.  Other C examples include things
like |= or &= (after initializing with some neutral value).

Code example, hmm.. anything I can think of is boring or long or too
complicated to remember off the top of my head.  

meta-sample code (this code would really run, given appropriate
functions, and the assignments are pure (each function mallocs a new
array or whatever, which gets freed when its ref count goes to zero)).

text =. F_READ: file
text =. (HANDLE_BACKSLASH: '.\r..\0.') TEXTREPL: text
   text F_REPLACE: file

Ok, normally I do a lot more than just kill garbage characters
(carriage returns and ascii nulls in this example) the point here is
that you could execute any of these lines as many times as you like.
(Just try and execute them in order, please.)

Is that what you mean by an example?  (And I hope you don't mind if I
don't provide "real code" -- I have the pleasure and misfortune
getting paid to write in APL.  Pleasure because it's so easy,
misfortune because it's so hard to get code fragments through your
typical text editor, or news feed.)

Incidentally, the trailing : means that the name is a true constant
(can't be erased or re-assigned, though localization has normal
affects).  Not a feature I get to use at work (well, not in any
consistent fashion), but one I'd like.  (The language is J, which I'd
prefer to use over APL once a sufficiently fast version exists.)

[refering to descriptive comments as type declarations]
>But wouldn't it be nice if the language understood those declarations?

Then they'd be code.  

Raul Rockwell

cs450a03@uc780.umd.edu (03/23/91)

>If you need the semantics of eval, you need an interpreted (or incrementally
>compiled) language. As a counterexample, Forth provides for this tool
>but is not a dynamically typed language in any sense of the word.

True, but I'm not sure it's very statically typed either.

Raul

throopw@sheol.UUCP (Wayne Throop) (03/24/91)

- gudeman@cs.arizona.edu (David Gudeman)
-] As late as Algol 68, the arguments are mostly about whether
-] things can, *in principle*, be compiled;
- Yep, psychotic fixation on efficiency.  They weren't arguing on
- whether something could be _implemented_, they were worried about
- producing machine code.

It seems clear that David is misinterpreting the above statement.  There
is NOT a fixation on efficency evident here...  the phrase "can, in
principle, be compiled" means exactly what David means by "can be
implemented" in this context. 
--
Wayne Throop  ...!mcnc!dg-rtp!sheol!throopw

throopw@sheol.UUCP (Wayne Throop) (03/24/91)

- wallace@hpdtczb.HP.COM (David Wallace)
>> = David Gudeman
> = Dan Bernstein

>> Programs in dynamically typed languages are generally half to a tenth
>> the size of programs in statically typed languages that do the same thing. 
>I don't believe you. Give an example.

- It's only a single data point, but my first prototype version of ATV (the
- abstract timing verifier I wrote for my dissertation work) was 1600 lines of
- C code.  At that point I changed to Lisp, and got the same functionality in
- less than 300 lines of Common Lisp code.

As another data point (one well-studied one of a cluster of such
I'm familiar with), a configuration management and automated build
program was implemented in about 1000 lines of LISP code, and swelled
to 5000 lines when implemented in C.  The reimplementation was well
studied, because there was controversy over whether any future
tools should be done in LISP.

In this examples, and some other similar but less thoroughly studied
examples of the same controversy, there was essentially no advantage in
terms of the cpu time consumption of this particular application.  There
was, however, a 3-to-1 reduction in memory working set size in favor of
C, as well as a whopping 30-to-1 reduction in memory paged in during
startup of the application in favor of C.  And, of course, the C could
be made available on many more of the platforms we were interested in. 

The studies undertaken at that time showed that the LISP system could be
made more competitive in terms of its memory consumption, especially its
startup thrashing, but the ultimate conclusion was that this wasn't
worth the rather large investment it was estimated to take, at least not
when amortized over the anticipated applications that might be
implemented with the improved system. 

- I also got additional functionality
- for free: command scripts, the next thing I wanted to add to the program,
- took 0 lines of code in Lisp.

This was our experience also (in reverse: that is, we lost a great
deal of flexibility in our command scripts going to C).

But...

As Peter da Silva points out, much of this isn't due to dynamic typing. 
In fact, much of the benefit of the Common Lisp environment as we saw it
was due to the large library of pre-canned functions that are part of
the environment, and the power and generality of the syntax.  Now SOME
of the flexibility and size of the standard available functions can be
attributed to dynamic typing, but MOST of it cannot. 

Note also that much of the DISadvantage of Common Lisp's memory
consumption (both address space and working set) was also NOT due to
dynamic typing, but rather due to run-time evaluation, so that even if
the application code didn't mention (say) complex arithmetic, it
couldn't be assumed that a runtime evaluation of a text string wouldn't
involve it. 
--
Wayne Throop  ...!mcnc!dg-rtp!sheol!throopw

cs450a03@uc780.umd.edu (03/24/91)

Dave Berry writes:
>Perhaps we should stop arguing and start co-operating.

Umm...  I really appreciate the sweetness and light and everything,
but I'm not totally sure I'm clear on the difference as long as
information is being freely exchanged.

Steve Knight: It is not possible to write a polymorphic hash-function.
>I don't see how it's possible to write a polymorphic hash-function in
>any language unless you know all the possible types or
>type-representations that the language will ever have, at the time
>when you write the function.

If you're talking about a small-talkish language, you just require
that the 'type' allow whatever methods you need to implement your
hashing.

Raul Rockwell

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/25/91)

In article <22MAR91.09242511@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
> Examples of the second also can be done in statically typed languages,
> things like sorting, searching, and selection.  But even C (which some
> people have said we shouldn't be using as an example of a statically
> typed language) doesn't have a hashed search (for un-ordered items) in
> any of the libraries I know of.

Log on to a UNIX System V box and type
	man hsearch
It has been around for _years_.  (See also lsearch and tsearch.)
-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/25/91)

In article <8078@skye.cs.ed.ac.uk>, db@cs.ed.ac.uk (Dave Berry) writes:
> >- val foo = [false, 0];
> >Error: operator and operand don't agree (tycon mismatch)
> 
> There's no way that you can know the type of the head of a list
> containing both integer and booleans without either:
> 
> 	1. Dynamic typing.
> 	2. Tagging each element.

Alan Mycroft wrote a paper in the early 80s about adding dynamic
typing to languages like ML in a fully type-safe way.  (Basically,
dynamically typed values are "tagged".  To get at a particular
value you had to use something like a conformity-case.)

> >It is not possible to write a polymorphic hash-function.
> 
> I don't see how it's possible to write a polymorphic hash-function in any
> language unless you know all the possible types or type-representations
> that the language will ever have, at the time when you write the function.

In the case of something like Lisp, we do know all the possible type
representations.  There is no particular reason why ML couldn't have
a built-in polymorphic "hash" function that applies to any "equality"
type.  Abstract data types are a problem, but if you can replace '='
for a data type you can replace 'hash' for it as well.  (Lisp has a
short-cut here which ML lacks; a hash function in Lisp has to
approximate EQ rather than EQUAL.)

-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/25/91)

In article <1610006@hpdtczb.HP.COM> wallace@hpdtczb.HP.COM (David Wallace) writes:
> It's only a single data point, but my first prototype version of ATV (the
> abstract timing verifier I wrote for my dissertation work) was 1600 lines of
> C code.  At that point I changed to Lisp, and got the same functionality in
> less than 300 lines of Common Lisp code.  I also got additional functionality
> for free: command scripts, the next thing I wanted to add to the program,
> took 0 lines of code in Lisp.  (load "file") worked just fine.  The 5+:1 code
> ratio here is certainly consistent with David's ranges.

But how many of those 1300 lines appeared anyway in the language
library? Command scripts, for example, are just a function of the
programming environment, and don't prove anything about dynamic typing.
What was it about dynamic typing that made your code so much shorter?

---Dan

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/26/91)

In article <KERS.91Mar21130048@cdollin.hpl.hp.com> kers@hplb.hpl.hp.com (Chris Dollin) writes:
> I do note that Dan's example had to be all squashed up to fit in a similar
> amount of space; laid out in a similar style it would take another five or so
> lines.

Feh. Here are the two pieces of code, side by side, with a few changes
to fit a change in the strinf library and to show the parallels:

#include <stdio.h>
#include "sop.h"
#include "strinf.h"
main() {                             define program();
 strinf *s; int i;
 SOP(strinf) *syl;                   lvars syllables =
 syl = SOPempty(strinf);
 while (s = strinfgets(stdin)) {     'syllables'.discin.incharline.pdtolist;
   strinfchop(s);
   SOPadd(syl,s,strinf);
 }
 for (i = 0;i < 3;++i) {                     repeat 3 times
   puts(strinfs(SOPrandpick(syl,strinf)));   syllables.oneof.pr
 }                                           endrepeat;
 putchar('\n');                      nl( 1 );
}                                    enddefine;

Sure, the C version is longer: it #includes library header files (to
declare functions), and declares variables. But everything else is
parellel. I didn't define repeat or nl, and I didn't hide discin or
incharline inside a subroutine, but I'm sure we can all manage to pay
attention to the real issue rather than these syntactic trivialities.

The C code is no more complex than the original. End of example.

> It's not clear that Dan gets ``better'' type-checking: Steve's code
> works if -syllables- is a list or a vector, for example.

What's your point? I do get better typechecking: the compiler tells me
immediately that I haven't made any type errors. Given that it didn't
take me a significant amount of time to include the declarations, what
do I lose?

  [ speed ]

Actually, if I needed this program for something, I'd be more worried
about memory. I'd use one of the selection algorithms in Knuth to pick
three (or however many) random syllables from the list on the fly,
without having the whole thing in memory at once; then I'd output those
syllables in a random order. In any case, the superior speed of the C
version doesn't really matter given that it already provides superior
typechecking.

> Maybe we should stop muttering about static vs dynamic typing and instead look
> to the *real* issue here: how would one capture the advantages of the
> dynamically typed systems that David is advocating but still be able to
> typecheck at compile-time?

*What* advantages? If any advantages exist, they certainly didn't matter
for this example.

> And please bear in mind that C is hardly a good example of a statically typed
> system....

It's adequate.

---Dan

john@clouseau.mitre.org (John D. Burger) (03/26/91)

db@cs.ed.ac.uk (Dave Berry) wrote:

  I don't see how it's possible to write a polymorphic hash-function
  in any language unless you know all the possible types or
  type-representations that the language will ever have, at the time
  when you write the function.

and ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) replied:

  In the case of something like Lisp, we do know all the possible type
  representations.  There is no particular reason why ML couldn't have
  a built-in polymorphic "hash" function that applies to any
  "equality" type.  Abstract data types are a problem, but if you can
  replace '=' for a data type you can replace 'hash' for it as well.
  (Lisp has a short-cut here which ML lacks; a hash function in Lisp
  has to approximate EQ rather than EQUAL.)

Actually, in Common Lisp, when a programmer creates a hash table she
specifies the equality test she wants to use.  The implementation must
provide EQ, EQL, EQUAL and EQUALP hash tables, which cover the basic
flavors of equality in Common Lisp.

A programmer also has access to an underlying hashing function, if he
wants to implement his own hashed structures.  This function is
SXHASH, which implements EQUAL equality.  That is:

  (EQUAL X Y) implies (= (SXHASH X) (SXHASH Y))

However, none of these built-in predicates implement component-wise
equality for user-defined datatypes, so it's not possible for two
different instances of a particular datatype to (necessarily) be
hashed alike, even if their components are identical.
--
John Burger                                               john@mitre.org

"You ever think about .signature files? I mean, do we really need them?"
  - alt.andy.rooney

cs450a03@uc780.umd.edu (03/26/91)

Richard O'Keefe   >
Me                >>
>> But even C ... doesn't have a hashed search (for un-ordered items)
>> in any of the libraries I know of.
> 
>Log on to a UNIX System V box and type
>	man hsearch
>It has been around for _years_.  (See also lsearch and tsearch.)

I did.  Thought I was going to see a horrendous call with a table arg,
function args, etc. (leaving all the work to be done elsewhere).
Instead I saw something that scared me worse: A global hash table
(now, THAT's useful)!  I want to use hashing to speed up this little
old function here, and all I have to do is make sure it's a unique
instance of the solution...

Maybe my nonlocality gripes are petty, but geez...

Raul Rockwell

peter@ficc.ferranti.com (Peter da Silva) (03/26/91)

In article <22MAR91.20485982@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
Orig-Paul> Generally, each assignment to a variable is unique.  (I try not to
Orig-Paul> re-assign, and when I do, I try and make sure re-executing that
Orig-Paul> section of code would not cause a problem).  Exception made for loop
Orig-Paul> counters, but not for other assignments made within the loop.

Peter>This is an unusual coding style, in my experience. Are you actually
Peter>limitin assignments, or are you hiding those assignments in call by
Peter>reference? Perhaps a code fragment would help.

Paul> It's not really that unusual...  especially when you consider that I
Paul> try but don't always succeed ;-)  A typical C ferinstance would be any
Paul> code where you initialize a table.  Other C examples include things
Paul> like |= or &= (after initializing with some neutral value).

OK, I still don't follow what the point is. Why avoid assignments, and how do
you do things like state machines or stepping through a list?

Paul> [refering to descriptive comments as type declarations]
Peter>But wouldn't it be nice if the language understood those declarations?

Paul> Then they'd be code.  

That's the point.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

peter@ficc.ferranti.com (Peter da Silva) (03/26/91)

In article <22MAR91.22190193@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
> >If you need the semantics of eval, you need an interpreted (or incrementally
> >compiled) language. As a counterexample, Forth provides for this tool
> >but is not a dynamically typed language in any sense of the word.

> True, but I'm not sure it's very statically typed either.

It's weakly statically typed. Basically, it has two types, the character
and the cell. You can build more complex types with <builds-does> or
whatever the latest fashionable term for that is, but it remains statically
typed. Of course you can build a dynamically typed language on top of it
with appropriate definitions... but the basic language is weakly and
statically typed.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (03/26/91)

Wayne Throop   >   (and, I presume  >-]  )
David Gudeman  >-
>-] As late as Algol 68, the arguments are mostly about whether
>-] things can, *in principle*, be compiled;
>- Yep, psychotic fixation on efficiency.  They weren't arguing on
>- whether something could be _implemented_, they were worried about
>- producing machine code.
>It seems clear that David is misinterpreting the above statement.  There
>is NOT a fixation on efficency evident here...  the phrase "can, in
>principle, be compiled" means exactly what David means by "can be
>implemented" in this context. 

Well, I wouldn't call it psychotic, more of a blind spot.  

I posted an earlier article on J (the one about run-time syntax
checking).  Just to give you the creeping horrors, consider the
following line of code:

  x=. +x 1 2 3 ,(*(x=./) 4 5 6) x 7 8 9 (x=.+) 10

Here's how it would parse:

  x=. +x 1 2 3 ,(*(x=./) 4 5 6) x 7 8 9 (x=.+) 10
  x=. +x 1 2 3 ,(*(x=./) 4 5 6) x 7 8 9 (+) 10
  x=. +x 1 2 3 ,(*(x=./) 4 5 6) x 7 8 9 + 10
  x=. +x 1 2 3 ,(*(x=./) 4 5 6) + 7 8 9 + 10
  x=. +x 1 2 3 ,(*(x=./) 4 5 6) + 17 18 19
  x=. +x 1 2 3 ,(*(/) 4 5 6) + 17 18 19
  x=. +x 1 2 3 ,(*/ 4 5 6) + 17 18 19
note: */4 5 6  <-->  4*5*6
  x=. +x 1 2 3 ,(120) + 17 18 19
  x=. +x 1 2 3 ,120 + 17 18 19
  x=. +x 1 2 3 ,137 138 139
  x=. +/ 1 2 3 ,137 138 139
  x=. +/ 1 2 3 137 138 139
  x=. 420
  420

In this case, implementing machine code is trivial -- the whole line
boils down to the constant 420.  However, one could easily replace any
of the objects on the above line with a variable, or a read from
standard input, etc.   Not that any of that would be particularly
meaningful in this case.

Now, the line of code I choose is particularly ill-conditioned, which
makes it ideal for pointing out that things can be implemented which
do not have an efficient implementation for all cases.  I think that a
lot of things might have been knocked out of ALGOL, not because they
could not be implemented, but because they didn't map well into the
framework the implementors were using.

Raul Rockwell

wallace@hpdtczb.HP.COM (David Wallace) (03/26/91)

> peter@ficc.ferranti.com (Peter da Silva) 
>> Me

>> I also got additional functionality
>> for free: command scripts, the next thing I wanted to add to the program,
>> took 0 lines of code in Lisp.  (load "file") worked just fine.
>
>Forth is a statically "typed" language, and has the same advantage. This
>has nothing to do with dynamic typing.

Agreed.  I was supplying information about the one example I have
hard data on, not arguing that all the difference was due to static vs.
dynamic typing.  This wasn't a controlled experiment in which only
one independent variable was changed, it's a historical anecdote that
happens to meet the basic conditions David G. was describing (basically
the same program in a dynamic vs. a statically typed language).

My comment above was intended to clarify the degree to which
the two programs had the same functionality: they did, in that all the code
I re-wrote in Lisp was intended to replicate the existing functionality of
the C version, but they also didn't, because I was able to add this new
functionality to the Lisp version without writing any new code.
If I had added command scripts to the C version also, we would have had
two functionally equivalent comparisons: 1600:300 for the comparison without
command scripts, and (1600+X):300 for the comparison with them.

Obligatory caveats: this is the one example I happened to have data for.
It is consistent with David Gudeman's assertion.  It does not prove it, though
it does add to the credibility of the assertion for me.  Two potential
sources of bias are:

	1) This was early prototype code.  As such, both versions lacked the
	extensive error checking and recovery typical of more bulletproof
	production code.  I do not know what the ratio of code size in such
	code would be.  I suspect it would still favor Lisp (some errors just
	don't happen in Common Lisp, such as integer overflow), but the ratio
	could very well be more even.  (Note though that the reason for CL not
	having integer overflow is because CL exploits dynamic typing to give
	you automatic bignums as needed.)

	2) I re-wrote the code in Lisp because I felt it would be easier to
	develop this particular program in Lisp than in C.  The data suggests
	that I was right.  The same results would not necessarily hold for
	applications that were more naturally oriented towards C than Lisp.

	In particular, there were chunks of the initial C code that were
	devoted to list manipulation, storage management, and input parsing,
	each of which became much simpler or a non-issue in Lisp.  None of
	these are directly a result of dynamic typing, but they all tend
	to be at least weakly related (e.g., tagged storage supports both
	dynamic typing and garbage collection, so the two tend to go together;
	garbage collection in turn helps support many of the Lisp list
	manipulation functions; and polymorphic read functions (supported by
	dynamic typing) facilitate input parsing).  In any case, David's
	specific assertion was about programs in dynamically typed languages
	(with all the baggage that goes along with them), not about the
	effects of dynamic typing in isolation.

In summary: David Gudeman asserted that programs written in dynamically typed
languages were generally half to a tenth the size of equivalent programs
in statically typed languages.
Dan Bernstein questioned the assertion and asked for an example.
I supplied the data from the one non-trivial example I have known where
a direct comparison was possible, and noted that the data was consistent
with David's assertion.

Dave W.		(wallace@hpdtl.ctgsc.hp.com)

gudeman@cs.arizona.edu (David Gudeman) (03/26/91)

In article  <1492@sheol.UUCP> Wayne Throop writes:
]
]It seems clear that David is misinterpreting the above statement.  There
]is NOT a fixation on efficency evident here...  the phrase "can, in
]principle, be compiled" means exactly what David means by "can be
]implemented" in this context. 

It will take a great deal of evidence to convince me that the Algol
committee members were so ignorant that they actually argued over the
implementability of anything in Algol 60.  There is nothing in the
language that any knowledgeable person would doubt can be implemented.

In fact, I can only think of two things that can't obviously be
compiled to machine code: nested scope and call-by-name.  Everything
else in the language has a fairly obvious machine implementation.

I want to clarify that my use of the term "psychotic" was intended as
hyperbole, not a serious criticism of the Algol committee.  But it is
just wrong to claim that they were trying to design a general notation
for describing algorithms without regard to implementation.  If that
were true they surely would have included sets, strings, graphs,
concatenatable sequences and other useful things in the language.  But
they did not because they wanted the language to be close in spirit to
computers.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (03/26/91)

I mentioned that System V C has hsearch.
In article <25MAR91.21515980@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
> Instead I saw something that scared me worse: A global hash table
> (now, THAT's useful)!  I want to use hashing to speed up this little
> old function here, and all I have to do is make sure it's a unique
> instance of the solution...

> Maybe my nonlocality gripes are petty, but geez...

I guess I should have put a few smileys in my posting.  I think the
hsearch() interface in the System V library is very bad.  I would
try hard to fail any of my students who did that.  Actually, I am of
the opinion that typing *values* (dynamic typing) is a Good Idea --
not least because mistakes I make in C just can't be made in Scheme
or won't go un-noticed if they _can_ be made.  I also like languages
where it's impossible for an uninitialised variable to go un-noticed
(Dijkstra's notation -- I often design in that then code in C,
Scheme, Pop-2, Pop-11, Arthur Sale's Pascal compiler for B6700s, ...).

	(declare (optimize safety))

-- 
Seen from an MVS perspective, UNIX and MS-DOS are hard to tell apart.

kers@hplb.hpl.hp.com (Chris Dollin) (03/26/91)

David responds:

   In article  <KERS.91Mar21130048@cdollin.hpl.hp.com> Chris Dollin writes:
   ]
   ]Perhaps David should show us some examples where he thinks dynamic types are
   ]``essential''

   Can't think of a one.  Dynamic typing is a convenience and a
   productivity enhancer, but it isn't essential.

I made myself less than clear. I don't use ``essential'' in the sense of
``can't be programmed without'', because almost nothing is essential in that
sense; I meant that, without dynamic types, the expression of the code would be
more obscure, or longer, or plain inelegant.

Incidentally, David, are we arguing for dynamic typing (as in Lisp, Pop11, etc)
or simple absence of mandatory type declarations (as in ML)? It makes a
difference. 

--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

cs450a03@uc780.umd.edu (03/27/91)

Peter da Silva writes:
 [little peter/paul dialog about making assignments unique, ending:
>OK, I still don't follow what the point is. Why avoid assignments,
>and how do you do things like state machines or stepping through a
>list?]

Umm.. I like the Peter/Paul style, but my name starts with an 'R'
(other than that, pronounced like Paul).

(1) The point is not to avoid assignments.  The point is to be able to
see the program all at once.  (Mentally, if not physically).

(2) I generally don't need to make state machines.  When I do, I try
to represent the machine as a single data object (the machine state)
which is mapped by a simple function from one state to another.
Another thing is that usually, I'll want a history of the machine, so
each state is stored in a separate location.  In functional
programming, there is a significant difference between initialization
and re-assignment.

(3) I generally don't need to step through lists.  Generally, this is
done for me automatically by primitives.  Most of the time, I consider
the stepping process trivia (especially when the primitive simulates
parallel execution, as   1 1 0 0 and 1 0 1 0   yields: 1 0 0 0).

Pure functions are much easier to validate than dirty code, for the
same kind of reasons that static code is easier to validate than
self-modifying code.

Again, the point is not to prohibit re-defining of some value, the
point is to use simpler techniques where possible.  And, again, the
advantages are in development and debugging times.

When you're used to it, C statements like
    for (j=0; j<max ; j++) { list[j] = f(list[j], j); }
seem pretty simple.  However, they aren't much faster than
    new = (type *) malloc( ... );
    for (j=0; j<max ; j++) { new[j] = f(list[j], j); }
    free(list);

Why the extra hassle?  Because now you can abort in the middle, if you
hit an error, or whatever.  Also, your new type doesn't have to be the
same as the old type.  (Though keeping track of types is even more
overhead in a language like C).  If you're in a debugging environment,
you could easily change the argument array, and restart.

Thus, I could write the above loop as 
    new =. f (list ; i. # list)
where i. # creates a little array of indices.  Or, even simpler, I
could write 
   new =. f (list)
since f has full access to all the details of its argument (type,
dimensions, etc.)  This is vaugely like the argc, argv convention of
unix, but not so specialized for file names and option switches.

   Yes, there is overhead.  Usually, for arrays smaller than 16
elements it is significantly smaller to go with a statically typed
language.  On larger data objects, the type checking and array bounds
checking vanish in the noise.  (The exact crossover depends on the
function being used, the architecture of the machine you are on, and
that sort of thing.)

   There is also storage overhead, but that isn't always significant.
The efficiencies of scale are suprisingly similar to the benefits you
get from a compiler.  Except with a compiler you can get lots of
speed-up by adding more code.  Here, the speedup comes in simplifying
the code and going with more data.

   In fact, since C makes it so awkward to dynamically allocate result
arrays, people often go to various tree structures when it would be
more efficient to store the results as a flat array.

   For some things, you NEED to get down to a low level, and be able
to do things in a serial fashion, or just modify one little piece of a
tree.  Most of the time, you can provide that functionality in some
primitive, and get on with solving the big problems.

   Enough time is spent re-inventing the wheel as it is.

>Paul> [refering to descriptive comments as type declarations]
>Peter>But wouldn't it be nice if the language understood those declarations?
>Paul> Then they'd be code.  
>That's the point.

Would you settle for a language which provided no special syntax for
comments (i.e. where comments are just pieces of data that you throw
into the program)?

Raul Rockwell

gudeman@cs.arizona.edu (David Gudeman) (03/27/91)

In article  <KERS.91Mar26115416@cdollin.hpl.hp.com> Chris Dollin writes:
]
]   In article  <KERS.91Mar21130048@cdollin.hpl.hp.com> Chris Dollin writes:
]   ]
]   ]Perhaps David should show us some examples where he thinks
]   ]dynamic types are ``essential''...
]
]...I meant that, without dynamic types, the expression of the code would be
]more obscure, or longer, or plain inelegant.

The examples of heterogenous lists posted by others are persuasive,
more so than I would have expected.  Other examples include the
ability to define polymorphic functions easily:

  procedure max(a,b) return if a < b then b else a; end

which is perfectly clear and works for any type that "<" works on.
Statically typed languages either forbid this altogether, or require
extra declarations that obscure the simple meaning.

Another convenience is the ability to use different types to mean
different things.

  procedure string_to_num(s) # returns the integer or float
  # represented by s.  In case of an error, the return value is a
  # string describing the type of error.

Statically typed languages either need a global variable or an
exception handling mechanism to do this.  And with the global variable
method, the programmer can forget to check the value of the global,
leading to undefined behavior.  With the solution above, forgetting to
check the result will eventually lead to a type-error that will
probably be much easier to track down.

]Incidentally, David, are we arguing for dynamic typing (as in Lisp,
]Pop11, etc) or simple absence of mandatory type declarations (as in
]ML)? It makes a difference. 

ML does not have dynamic typing so I'm not arguing for that sort of
scheme.  It gives some of the advantages of dynamic typing, but
probably not the more important ones.  I'm arguing (1) dynamic typing
is useful enough that it should be supported directly by programming
languages, (2) dynamic typing is not "dangerous" in any sense, (3)
static typing is important enough that it should be supported by
programmig languages, and (4) it is not clear the the ability of
static type declarations to catch errors outweighs the number of
errors that occur in the declarations.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

rh@smds.UUCP (Richard Harter) (03/27/91)

In article <1493@sheol.UUCP>, throopw@sheol.UUCP (Wayne Throop) writes:

> As another data point (one well-studied one of a cluster of such
> I'm familiar with), a configuration management and automated build
> program was implemented in about 1000 lines of LISP code, and swelled
> to 5000 lines when implemented in C.  The reimplementation was well
> studied, because there was controversy over whether any future
> tools should be done in LISP...

> As Peter da Silva points out, much of this isn't due to dynamic typing. 
> In fact, much of the benefit of the Common Lisp environment as we saw it
> was due to the large library of pre-canned functions that are part of
> the environment, and the power and generality of the syntax.  Now SOME
> of the flexibility and size of the standard available functions can be
> attributed to dynamic typing, but MOST of it cannot. 

Just another data point.

We developed Lakota internally as a scripting language and have found
rather similar results.  For example, a 500 NCL C program was replaced
by a 35 NCL Lakota program.  An approoximate breakdown for the reasons
for reduction in code size is:

	15%	Elimination of declarations
	35%	Use of intrinsic language functions
	20%	Dynamic list operations
	20%	Free form string concatenation

To the point of this particular discussion -- Lakota is an untyped
language.  (More precisely, it has a single type, lists of strings;
types, if needed, are implemented as ADT packages; numeric operations
are recognized in functional context.)  Some points that may be relevant:

(a)	One of the features that reduced code size and made for increased
readibility was the elimination of the need for lots of "helper" variables,
i.e. loop indices, auxiliary arrays, variables containing sizes and
counts, etc.  I rather suspect that the same would be true for the cited
lisp program.  Much of this can be attributed to intrinsic dynamic data
structures.  Calculation of sizes, allocation of storage, and variable
"declarations" are handled by the language processor rather than by the
programmer.

(b)	Intrinsic functions such as sorting and splitting strings into
a list of fields played a major role in reducing size.  There are several
factors here.  (1) The ability of function operators to act on and return
lists, (2) consistent syntax which makes composition of operators convenient,
and (3) the elimination of "set-up" operations, i.e. allocation,
initialization, etc.  The fundamental gain over C (even given the
availibility  of equivalent library routines) is in the smoothness of
the interface between operations.

Speed:  Lakota is an intrinisically interpreted language.  On the whole
the execution times are 2-5 times slower than the equivalent C programs.
[The intended usage is high-level administrative with compute intensive
grunt work being done by 3GL subroutines and programs; ergo execution
time is not a penalty factor.]  Development time of Lakota vs C (or
equivalent 3GL) is at least an order of magnitude faster -- the salient
points are reduced code size, reduced syntax errors, elimination of the
compile and link step, and the ability to execute individual routines
in standalone mode.

A final note.  In the context of this discussion Lakota is not
execeptional; smaller code size, sharply reduced development time,
dynamic data structures, and functional composition are features of
many of the other languages mentioned.  I used Lakota as an example
because I am working with it at the moment.
-- 
Richard Harter, Software Maintenance and Development Systems, Inc.
Net address: jjmhome!smds!rh Phone: 508-369-7398 
US Mail: SMDS Inc., PO Box 555, Concord MA 01742
This sentence no verb.  This sentence short.  This signature done.

quale@saavik.cs.wisc.edu (Douglas E. Quale) (03/28/91)

>typed languages are generally half to a tenth the size of programs in
>statically typed languages"  Dan Bernstein reasonably writes back.
>
>] I don't believe you. Give an example.
>

In Common Lisp,

(defun compose (f g)
  (lambda (&rest) (funcall f (apply g &rest))))

I think it looks better in Scheme,

(define (compose f g)
  (lambda x (f (apply g x))))

These definitions use dynamic typing to obtain polymorphism.

I think it would take at least two orders of magnitude (probably three) to
do this in C.

-- Doug Quale
quale@saavik.cs.wisc.edu

peter@ficc.ferranti.com (Peter da Silva) (03/28/91)

In article <26MAR91.18541581@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
> Umm.. I like the Peter/Paul style, but my name starts with an 'R'
> (other than that, pronounced like Paul).

Oops.

> (2) I generally don't need to make state machines.  When I do, I try
> to represent the machine as a single data object (the machine state)
> which is mapped by a simple function from one state to another.
> Another thing is that usually, I'll want a history of the machine, so
> each state is stored in a separate location.  In functional
> programming, there is a significant difference between initialization
> and re-assignment.

Hmmm. looks like we're dealing with radically different problem spaces.
You deal with problems where it's desirable and possible to avoid information
loss.  I have to deal with that entropy and give up reversibility to allow
unlimited runtime with limited memory storage. This is particularly important
in interactive programs.

> When you're used to it, C statements like
>     for (j=0; j<max ; j++) { list[j] = f(list[j], j); }
> seem pretty simple.  However, they aren't much faster than
>     new = (type *) malloc( ... );
>     for (j=0; j<max ; j++) { new[j] = f(list[j], j); }
>     free(list);

It may not be significantly slower, but it uses up significantly more
memory. If list is large enought that can easily make the difference between
being able to run the program and not.

Also, I tend to work through dynamically allocated lists, or even lists
that are implicitly defined. In that case "max" is unknown and potentially
unbounded.

	while(msg = GetMsg(port)) {
		switch(msg->request) {
			...
			case FOO:
				state = BAR;
		}
		Reply(msg);
	}

I don't care how GetMsg is implemented, or Reply, but I do need to assume
that there is no limit on the number of times this loop will be run through.
This means I *have* to discard old state information.

>    In fact, since C makes it so awkward to dynamically allocate result
> arrays, people often go to various tree structures when it would be
> more efficient to store the results as a flat array.

This is a limitation in the standard library: it doesn't have a "sack of
data" set of routines. Other C runtime libraries do have such things.

> Most of the time, you can provide that functionality in some
> primitive, and get on with solving the big problems.

That's true, but I don't see how it relates to this question. This is
the smart way to operate in any langauge, no matter how the type rules
work.

> >Raul> [refering to descriptive comments as type declarations]
> >Peter>But wouldn't it be nice if the language understood those declarations?
> >Raul> Then they'd be code.  
> >That's the point.

> Would you settle for a language which provided no special syntax for
> comments (i.e. where comments are just pieces of data that you throw
> into the program)?

Regarding what?
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (03/28/91)

Brian Boutel writes:
>This typing can be inferred at compile time, which is what is meant
>by static.

Typing inferred at compile time is a good thing, no matter what you
call it.  

Raul

brian@comp.vuw.ac.nz (Brian Boutel) (03/28/91)

In article <1991Mar27.161304.17666@daffy.cs.wisc.edu>,
quale@saavik.cs.wisc.edu (Douglas E. Quale) writes:
|> >typed languages are generally half to a tenth the size of programs
|> in
|> >statically typed languages"  Dan Bernstein reasonably writes back.
|> >
|> >] I don't believe you. Give an example.
|> >
|> 
|> In Common Lisp,
|> 
|> (defun compose (f g)
|>   (lambda (&rest) (funcall f (apply g &rest))))
|> 
|> I think it looks better in Scheme,
|> 
|> (define (compose f g)
|>   (lambda x (f (apply g x))))
|> 
|> These definitions use dynamic typing to obtain polymorphism.
|> 
|> I think it would take at least two orders of magnitude (probably
|> three) to
|> do this in C.
|> 
|> -- Doug Quale
|> quale@saavik.cs.wisc.edu

Why do you think this is dynamic typing? Definitions parallel to this
exist in statically typed languages like Haskell and ML.

In Haskell, write

 compose f g = \ x -> f (g x)

This is even shorter than the lisp/scheme versions. Its type (in the
normal Hindley/Milner type system) is

(forall T1, T2, T3), (T1->T2) -> (T3->T1) -> (T3->T2)

This is polymorphic because any types can be (consistantly) substituted
for the type variables T1,T2 and T3.

This typing can be inferred at compile time, which is what is meant by static.

So try again.

--brian
 
-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/28/91)

In article <1162@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> The examples of heterogenous lists posted by others are persuasive,
> more so than I would have expected.  Other examples include the
> ability to define polymorphic functions easily:
>   procedure max(a,b) return if a < b then b else a; end

#define max(a,b,lt) ((lt)((a),(b)) ? (b) : (a))

Or

any max(a,b,lt)
any a;
any b;
int lt();
{
 if (lt(a,b))
   return a;
 else
   return b;
}

>   procedure string_to_num(s) # returns the integer or float
>   # represented by s.  In case of an error, the return value is a
>   # string describing the type of error.
> Statically typed languages either need a global variable or an
> exception handling mechanism to do this.  And with the global variable
> method, the programmer can forget to check the value of the global,
> leading to undefined behavior.  With the solution above, forgetting to
> check the result will eventually lead to a type-error that will
> probably be much easier to track down.

But an exception handling mechanism obviously produces the best results:
you need to take positive action to catch the exception, so if you
forget to handle it then you're guaranteed to get an error. In contrast,
``will eventually lead to a type-error'' is not particularly comforting.
Your program may be doing something entirely different in the meantime,
and lots of other data may be corrupted.

> (1) dynamic typing
> is useful enough that it should be supported directly by programming
> languages,

Supported, yes; but why does the support have to be direct? Why can't it
just be a set of library routines, plus the syntactic sugar necessary to
make you happy?

> (4) it is not clear the the ability of
> static type declarations to catch errors outweighs the number of
> errors that occur in the declarations.

Fair enough. However, just in case the benefits of static typing *do*
outweigh the errors---possibly by a lot---doesn't it seem wise to
provide the support?

---Dan

augustss@cs.chalmers.se (Lennart Augustsson) (03/28/91)

In article <1991Mar27.161304.17666@daffy.cs.wisc.edu> quale@saavik.cs.wisc.edu (Douglas E. Quale) writes:
>In Common Lisp,
>
>(defun compose (f g)
>  (lambda (&rest) (funcall f (apply g &rest))))
>
>I think it looks better in Scheme,
>
>(define (compose f g)
>  (lambda x (f (apply g x))))
>
>These definitions use dynamic typing to obtain polymorphism.
>
Yes, in LISP it's dynamic typing that gives you the polymorphism, but
you don't have to have dynamic typing.  Let's do the same definition
in ML, which is statically typed:
  - fun compose(f,g) x = f(g x);
  > val compose = fn : (('a -> 'b) * ('c -> 'a)) -> ('c -> 'b)
(I wrote the line starting with '-', the system responded with the the
line starting with '>'.)
This compose function is typed and still polymorphic.  Some examples:
  - fun inc x = x+1;
  > val inc = fn : int -> int
  - compose(not,not) true;
  > true : bool
  - compose(inc,compose(inc,inc)) 0;
  > 3 : int
  - compose(inc,not) true;
  Type clash  in:  (compose (inc,not))
  Looking  for a:  int
  I have found a:  bool




	-- Lennart Augustsson
[This signature is intentionally left blank.]

gudeman@cs.arizona.edu (David Gudeman) (03/28/91)

In article  <25317:Mar2803:26:1291@kramden.acf.nyu.edu> Dan Bernstein writes:
]In article <1162@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]
]#define max(a,b,lt) ((lt)((a),(b)) ? (b) : (a))

Not a proper function on args with side effects.  Though it does
demonstrate that macros can sometimes be used to simulate static
polymorphism.

]any max(a,b,lt)
]any a;
]any b;
]int lt();
]{
] if (lt(a,b))
]   return a;
] else
]   return b;
]}

That is a tiny fraction of the whole solution.  The whole solution
requires implementation of tagged structures, and garbage collection,
and requires that all of your numbers be specified with constructors
of some type.  That is, you can't just write for your dynamically
typed value, you have to write something like integer(3).  Of course,
integer(3) is completely different from 3 and you can't do normal C
arithmetic on it, so you have to define all the arithmetic functions
in dynamic form.  And your code becomes almost unreadable since
everything has to be done in function call syntax.

Yes, I know the syntax wouldn't be a problem in C++.  Everything else
still is.

]... In contrast,
]``will eventually lead to a type-error'' is not particularly comforting.
]Your program may be doing something entirely different in the meantime,
]and lots of other data may be corrupted.

I should have said "will eventually lead to a type-error, almost
certainly during the first few test runs", since testing with bogus
input should be high on the agenda.  (Although it is true that an
exception handling machanism is the best solution.)

]> (1) dynamic typing
]> is useful enough that it should be supported directly by programming
]> languages,
]
]Supported, yes; but why does the support have to be direct? Why can't it
]just be a set of library routines, plus the syntactic sugar necessary to
]make you happy?

First, dynamic typing is a simpler, more general, and more expressive
underlying semantic model.  It makes no sense to design a language
under a less satisfactory model and then to patch on the better model
with libraries and preprocessors.  Static typing is a hack to gain
efficiency on current architectures, so if anything it is static
typing that should be an add-on.  And type declarations are often
superfluous so they should not be required in the underlying model.

Second, it is not possible to tack dynamic typing on a statically
typed language (using a preprocessor and a library) without paying
some penalty somewhere.  Dynamic typing needs direct support from the
debugger, for one thing.  For another, there are optimizations that
the compiler can't do without understanding dynamic typing.  For
example if

  if (integer(i)) { /* a bunch of code using i */ }

is understood by the compiler, then it can optimize the use of i,
by generating integer arithmetic and not checking the type for each
operation.  But if gets translated to

  if (i.tag == INTEGER) { /* a bunch of code using i */ }

then the compiler doesn't have any idea what's going on.  Of course
you could have the preprocessor do some optimization itself and
provide debugging information to the compiler; but then what you are
calling a "preprocessor", I'd call an analysis phase of the compiler,
and what you call your "language", I'd call an intermediate
implementation language.

]> (4) it is not clear the the ability of
]> static type declarations to catch errors outweighs the number of
]> errors that occur in the declarations.
]
]Fair enough. However, just in case the benefits of static typing *do*
]outweigh the errors---possibly by a lot---doesn't it seem wise to
]provide the support?

Did you see number (3)?
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

nick@cs.edinburgh.ac.uk (Nick Rothwell) (03/29/91)

In article <1991Mar27.161304.17666@daffy.cs.wisc.edu>, quale@saavik.cs.wisc.edu (Douglas E. Quale) writes:
> >typed languages are generally half to a tenth the size of programs in
> >statically typed languages"  Dan Bernstein reasonably writes back.
> >
> >] I don't believe you. Give an example.
> >
> 
> In Common Lisp,
> 
> (defun compose (f g)
>   (lambda (&rest) (funcall f (apply g &rest))))
> 
> I think it looks better in Scheme,
> 
> (define (compose f g)
>   (lambda x (f (apply g x))))
> 
> These definitions use dynamic typing to obtain polymorphism.

ML (statically typed):

	fun compose (f, g) x = f(g x)

This definition uses polymorphism to obtain polymorphism.

BTW, I undercut you by a few characters (and several brackets).

>I think it would take at least two orders of magnitude (probably three) to
>do this in C.

C is trashy, we know this already.

	Nick.

-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
                nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
           "I see what you see: Nurse Bibs on a rubber horse."

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/29/91)

In article <1211@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> In article  <25317:Mar2803:26:1291@kramden.acf.nyu.edu> Dan Bernstein writes:
> ]In article <1162@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> ]#define max(a,b,lt) ((lt)((a),(b)) ? (b) : (a))
> Not a proper function on args with side effects.

Fair enough; the problem here is that C doesn't let you make a variable
without making a block. (``The data scope matches the control scope,''
or something like that.) If you don't want to use a static variable
(like Fortran parameter passing) here, and you need an expression rather
than a block, then you'd better use a true expression language. This is
independent of static versus dynamic typing, though.

  [ function version, given a polymorphic ``any'' type ]
> That is a tiny fraction of the whole solution.  The whole solution
> requires implementation of tagged structures, and garbage collection,
> and requires that all of your numbers be specified with constructors
> of some type.

Yes, so? All of this is hidden below the implementation of ``any'' and
operations on it.

> First, dynamic typing is a simpler, more general, and more expressive
> underlying semantic model.  It makes no sense to design a language
> under a less satisfactory model and then to patch on the better model
> with libraries and preprocessors.

First, static typing is a simpler, more general, and more expressive
underlying semantic model. (After all, it lets you say what your
variable types are, without worrying about some screwy syntax for the
job.) It makes no sense to design a language under a less satisfactory
model and then to patch on the better model with libraries and
preprocessors. (Ya know, assertions like ``x is an int. Really!'')

The problem with this argument is that neither model is more general
than the other, or simpler, or much more expressive. Given a statically
typed language with structs, an ``all'' type, and the set of types, you
can set up pairs (value,type) and poof! there's dynamic typing. Given a
dynamically typed language with assertions, you can assert that a value
has some type and poof! there's static typing. A good preprocessor will
smooth out the differences either way; so, all else being equal, you
might as well choose the model that produces better code, i.e., static
typing (at least with current technology).

> Second, it is not possible to tack dynamic typing on a statically
> typed language (using a preprocessor and a library) without paying
> some penalty somewhere.  Dynamic typing needs direct support from the
> debugger, for one thing.

Yeah, so? As long as your debugger is built in to your interpreter, this
is not a problem.

> For another, there are optimizations that
> the compiler can't do without understanding dynamic typing.  For
> example if
>   if (integer(i)) { /* a bunch of code using i */ }
  [ vs. ]
>   if (i.tag == INTEGER) { /* a bunch of code using i */ }

Ah! A real issue! In Q's any.h, an ``any'' variable comes with a number
of constraints: if x.type == anytype(int), for example, then
eq(typeof(x.value),int). (Here anytype and typeof are functions
evaluated at the preprocessing level; anytype is defined by any.h and
produces a run-time value, but typeof merely produces a token that must
be used at compile time.) So in your example, it just takes simple data
flow analysis for the compiler to understand that i.type won't change
inside the block, and hence all operations on i are integer operations.

> ]> (4) it is not clear the the ability of
> ]> static type declarations to catch errors outweighs the number of
> ]> errors that occur in the declarations.
> ]Fair enough. However, just in case the benefits of static typing *do*
> ]outweigh the errors---possibly by a lot---doesn't it seem wise to
> ]provide the support?
> Did you see number (3)?

Okay, I was just making sure that you didn't mean (4) as a
recommendation of any sort.

---Dan

quale@saavik.cs.wisc.edu (Douglas E. Quale) (03/29/91)

In article <1991Mar28.124634.28106@mathrt0.math.chalmers.se> augustss@cs.chalmers.se (Lennart Augustsson) writes:
>In article <1991Mar27.161304.17666@daffy.cs.wisc.edu> quale@saavik.cs.wisc.edu (Douglas E. Quale) writes:
>>In Common Lisp,
>>
>>(defun compose (f g)
>>  (lambda (&rest) (funcall f (apply g &rest))))
>>
>>I think it looks better in Scheme,
>>
>>(define (compose f g)
>>  (lambda x (f (apply g x))))
>>
>>These definitions use dynamic typing to obtain polymorphism.
>>
>Yes, in LISP it's dynamic typing that gives you the polymorphism, but
>you don't have to have dynamic typing.  Let's do the same definition
>in ML, which is statically typed:
>  - fun compose(f,g) x = f(g x);
>  > val compose = fn : (('a -> 'b) * ('c -> 'a)) -> ('c -> 'b)

Actually the SML code isn't an exact replacement for the lisp code.
In SML the function g must be unary, while the lisp code uses a rest arg so
that g is n-ary.  I don't know if the type inferencing algorithm used in
SML (or in Haskell, re: an earlier post mentioning the same point) can be
extended to handle functions that take a variable number of arguments.  Some
people would say that varargs are bad.  I think that they're useful, but I
can live without them.

I never claimed dynamic typing was *required* for polymorphism, just that it
is an inherent benefit.

Dan Bernstein doesn't believe that programs written in languages with dynamic
typing are shorter than those written in C.  My example (which was originally
posted by someone else) is merely to show that polymorphism which is automatic
for a dynamically typed language makes it easy to write code that is very
difficult to duplicate in C.  With the exception of the varargs situation,
SML and Haskell and Hope and Miranda and Ponder and Orwell and probably many
other modern languages have no difficulty with this example.

I probably shouldn't use such a mature language as lisp to pick on C, but all
the languages mentioned above are younger than C and their type systems are
much more powerful.  The ANSII C type system would have been state of the art
20 years ago when C was invented, but today it's just a 20 year old type
system.  I have no bones to pick with modern type systems -- I like SML.
The problem with C's type system is that it falls between two stools:  it
doesn't give you the power and freedom of expression of a dynamic type system,
and it also doesn't give you the type safety of a truly strong type system
like SML's.  (And it certainly doesn't give the polymorphism of either.)
The lack of type safety across separately compiled modules is the last nail
in C's coffin.

Twenty years ago, when C was invented, many programmers thought that they had
to program in assembly language because high level languages were too
inefficient.  Fortunately, some folks looked into the future instead of the
past.  The success of C helped changed that attitude.  Today it's time to
look past C.  Some won't, of course, but progress is usually slow.

-- Doug Quale
quale@saavik.cs.wisc.edu

peter@ficc.ferranti.com (Peter da Silva) (03/29/91)

In article <3073:Mar2820:38:5191@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> A good preprocessor will
> smooth out the differences either way; so, all else being equal, you
> might as well choose the model that produces better code, i.e., static
> typing (at least with current technology).

If raw execution speed is the most important criterion, this is true. But
then you should probably be coding in assembly. If coding speed is the most
important criterion, then you might as well use a dynamically typed language.

Use the right tool for the job. Your job, apparently, is pumping bits.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (03/30/91)

Peter da Silva   >
Me               >>

>> (2) I generally don't need to make state machines.  ...
>> Another thing is that usually, I'll want a history of the machine, so
>> each state is stored in a separate location.  ...
> 
>Hmmm. looks like we're dealing with radically different problem spaces.
>You deal with problems where it's desirable and possible to avoid information
>loss.  I have to deal with that entropy and give up reversibility to allow
>unlimited runtime with limited memory storage. This is particularly important
>in interactive programs.

Hmmm... but my programs are interactive.  Example from last week:  I
wrote a little "diff" style routine to compare two articles.  It was
implemented as a state machine -- the state history indicated the
simplest way to change from program a to program b.  I needed to keep
this in memory because once I've got how many lines are changed in old
and new I give the person the option of displaying only the sort of
information that looks interesting (e.g. if article A is intended to
replace 3 pages of article B (which is 50 pages) you can extract the
exact lines in article A that changes from article B--maybe only a
couple words in this application).

  [my example of dynamic array allocation elided]
>It may not be significantly slower, but it uses up significantly more
>memory. If list is large enought that can easily make the difference between
>being able to run the program and not.
>
>Also, I tend to work through dynamically allocated lists, or even lists
>that are implicitly defined. In that case "max" is unknown and potentially
>unbounded.

Um, yes..  You have to be careful to break up large problems into
manageable chunks.  In exchange for that, exception handling becomes
simpler (er... assuming a few key operations are atomic).

Remember, though, that I'm not working in C.  "max" for a dynamic
array is implicitly defined, and potentially unbounded, but known.
Programming style is a bit different than with lists too.  Again, if
the problem is too big to be handled as a single piece, you break it
up into pieces, and maintain a directory to keep track of the pieces.
(Er.. don't confuse the general concept of a directory with specific
implementations of a os).

> while(msg = GetMsg(port))
> { switch(msg->request) { ... case FOO: state = BAR; } Reply(msg); }
>I don't care how GetMsg is implemented, or Reply, but I do need to assume
>that there is no limit on the number of times this loop will be run through.
>This means I *have* to discard old state information.

Er... I didn't mean to say that I never discard state information.  I
meant to say that I am very careful, when I do so, to make sure that
my code will behave properly during exceptions.  This may be more
important to me than to you, because during debugging it is likely
I'll execute code sections multiple times--stopping at a bug, fixing
it, then back tracking a short distance and restarting.

On the descriptive comments/code thing, consider a language that does
not define a comment syntax, but allows you to include constant data
wherever syntatically convenient.  Then you could put strings in as
comments.

What the language does with those strings is up to you.  (You the
programmer, not you the hypothesizer.)

Raul Rockwell

anw@maths.nott.ac.uk (Dr A. N. Walker) (03/30/91)

In article <1124@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman)
writes:

>It will take a great deal of evidence to convince me that the Algol
>committee members were so ignorant that they actually argued over the
>implementability of anything in Algol 60.  There is nothing in the
>language that any knowledgeable person would doubt can be implemented.

	Whoa!  There are two different meanings of "implement" and three
different meanings of "Algol" that are being mixed up, if not actually
confused.  When this started, DG asserted that Algol was [essentially]
intended as a clever assembler, and that static typing was intended as
an efficiency hack.  In response to my queries, he asserted that he
intended the earliest Algol tradition [ie Algol 58, or IAL], denied
intending Algol 60, and couldn't remember whether he'd actually read
the Algol 58 report.  ["I rest my case m'lud."  However, I cannot resist.]

Algol 58:  Never [?] implemented, except in the Jovial realisation,
	probably unimplementable, certainly not designed with machine
	efficiency in mind.

Algol 60:  Intended primarily as a descriptive language, and certainly
	not efficient (call by name, dynamic typing [:-), but consider
	whether "2^-j" is integer or real, and whether "2" is an integer
	or a label], dynamic own arrays).  No "clever assembler" of the
	period would have included recursive procedures, or dynamic
	arrays, for example, as indeed Fortran didn't.  I deduce that
	it was not designed with efficiency in mind;  this is confirmed
	in the pages of Algol Bulletin and in histories of the period
	(where the whole venture sounds utterly chaotic).

Algol 68:  Designed to be *run* efficiently, with essentially *no*
	regard to compilability other than "possible in principle".
	No-one with a real compiler in mind would have designed a
	language that needs at least three passes and that cannot
	be adequately lexed by a finite state machine.  There is much
	in Algol 68 that would make any pre-doc compiler-writer despair;
	the committee satisfied itself that parsing was technically
	possible, that code generation was technically possible, and
	assumed that compiler technology would do the rest.  You only
	have to look at the coercion engines or at the operator
	identification to assure yourself that highly intelligent
	people could spend a long time wondering whether something
	was compilable, and indeed some ambiguities were detected only
	at a very late stage.  Results:  Algol 68 is a dream to write,
	a nightmare to compile, and a bat-out-of-hell to run (given a
	good compiler!).


>							     But it is
>just wrong to claim that they were trying to design a general notation
>for describing algorithms without regard to implementation.

	Well, they were thinking about *computing* algorithms, and
more specifically, *numerical* algorithms.  Having an eye on what
real computers were used for is different from having an eye on how
to turn a notation into practical code.

>							      If that
>were true they surely would have included sets, strings, graphs,
>concatenatable sequences and other useful things in the language.

	Strings and other useful things *are* in Algol 60 [2 out of 5
isn't bad!].  In 1960, the other things were scarcely recognised as
being useful:  my undergraduate degree [1961-64] included *no* mention
of sets or graphs or concatenation;  as an innovation, 5 or so lectures
on computing [Edsac Autocode] were included in the NA course.

	By 1968, the emphasis was on orthogonality and composition from
a small number of primitives.  Graphs, for example, scarcely count as
primitive -- but it is straightforward to write a graph-handling package
in Algol.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/01/91)

In article <1991Mar29.012352.23889@daffy.cs.wisc.edu> quale@saavik.cs.wisc.edu (Douglas E. Quale) writes:
> Dan Bernstein doesn't believe that programs written in languages with dynamic
> typing are shorter than those written in C.

When did I say that?

Doug, are you capable of correctly paraphrasing anything I've ever said?

---Dan

Path: kramden.acf.nyu.edu!brnstnd
From: brnstnd@kramden.acf.nyu.edu (Dan Bernstein)
Newsgroups: comp.lang.misc
Subject: Re: Dynamic typing (part 3)
Message-ID: <3352:Mar1803:04:5491@kramden.acf.nyu.edu>
Date: Mon Mar 18 03:04:54 GMT 1991
References: <693@optima.cs.arizona.edu>
Organization: IR
Lines: 8

In article <693@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
> Programs in
> dynamically typed languages are generally half to a tenth the size of
> programs in statically typed languages that do the same thing.

I don't believe you. Give an example.

---Dan

brm@neon.Stanford.EDU (Brian R. Murphy) (04/01/91)

In article <3073:Mar2820:38:5191@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>The problem with this argument is that neither model is more general
>than the other, or simpler, or much more expressive. Given a statically
>typed language with structs, an ``all'' type, and the set of types, you
>can set up pairs (value,type) and poof! there's dynamic typing. Given a
>dynamically typed language with assertions, you can assert that a value
>has some type and poof! there's static typing. A good preprocessor will
>smooth out the differences either way; so, all else being equal, you
>might as well choose the model that produces better code, i.e., static
>typing (at least with current technology).

Unfortunately, given a statically-typed language with higher-order
functions and an "all" type, type inference appears to be
undecideable.  Thus your statically-typed language _requires_ type
declarations, whereas in a dynamically-typed language we can get by
without them.

Note, however, that I'm not certain of its undecideability.  Thatte, in
    @INPROCEEDINGS{Thatte88,
        AUTHOR = "S. Thatte",
        TITLE = "Type Inference with Partial Types",
        BOOKTITLE = "Automata, Languages and Programming: 15th
			International Colloquium",
        MONTH = jul,
        YEAR = 1988,
        PAGES = "615-629",
        Publisher = "Springer-Verlag Lecture Notes in Computer Science, 
				vol. 317"}
attempted such a system and failed to find one.  The problem is that,
whereas in standard ML-like type inference you wind up with a set of
equations to solve, here you have a set of inclusion constraints.
When all type constructors are monotonic (as all ML type constructors
are except for "->"), and your only inclusion rules are:
	x <= All       for all x
	x <= y         if  x == y
	e1(x) <= e2(y) if  x <= y  and  e1(y) <= e2(y)
then solving a set of such constraints is pretty simple.  If you allow
nonmonotonic type constructors:
	x->y  <=  a->b if  y <= b  and  a <= x
					------  (antimonotonic part)
then things get trickier, and it seems quite likely that sets of
inclusion constraints cannot in general be solved.

If anyone knows a reference for whether or not this problem is
decideable, please send me email (as I don't read these type flame
wars too carefully).  If it is decideable, then I retract my complaint
above...

						-Brian Murphy
						 brm@cs.stanford.edu

boehm@parc.xerox.com (Hans Boehm) (04/02/91)

brm@neon.Stanford.EDU (Brian R. Murphy) writes:

>In article <3073:Mar2820:38:5191@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>The problem with this argument is that neither model is more general
>>than the other, or simpler, or much more expressive. Given a statically
>>typed language with structs, an ``all'' type, and the set of types, you
>>can set up pairs (value,type) and poof! there's dynamic typing. Given a
>>dynamically typed language with assertions, you can assert that a value
>>has some type and poof! there's static typing. A good preprocessor will
>>smooth out the differences either way; so, all else being equal, you
>>might as well choose the model that produces better code, i.e., static
>>typing (at least with current technology).

>Unfortunately, given a statically-typed language with higher-order
>functions and an "all" type, type inference appears to be
>undecideable.  Thus your statically-typed language _requires_ type
>declarations, whereas in a dynamically-typed language we can get by
>without them.

  I think this is getting into an area where we need a bit more precision.
The argument implied by Brian seems to be that with a dynamically typed
language and some automatic type inference, we can get similar performance
to a statically typed language.  I think we are no longer addressing
reliability issues.
  The main remaining problem then is that there are a huge number of different
formulations of the type inference problem, which vary greatly in the programs
to which they can assign types, especially if those programs are library
functions for which we have incomplete information about the calling
context.  Most systems for assigning types to expressions in dynamically
typed languages seem to assign basically simple types.  They might determine
that 3+5 has type integer, but they won't determine that the identity function
has type T -> T, for any T.  (It sounds like the reference given by Brian is
an exception, but it appears to have other problems.)  In more practical
terms, if list operations aren't built in to the language, and their
implementation is in a library, unavailable for reanalysis by the type checker,
then I have no way to determine that head(cons(1,NIL)) has type integer.
(This is of course exactly the same problem that C++ s type system has
with void *;  I lose information that's easily statically available.  You
have no way of keeping track of the fact that two void *'s are the same type.)
  For statically typed languages without subtyping, I know of the following
type inference problems that have been studied:

1. Only simple (Pascal-like) types are assigned. Type inference is easily
possible.

2. ML-style polymorphic types are assigned.  (The identity function gets
a reasonable type.  The function that applies its argument to itself (lambda x .
x x) is not typable, even though this function can reasonably be applied to the
identity function.)  Type inference is decidable (and done in ML.)

3. ML modified to allow "polymorphic" functions to be used on the right side
of their own (recursive) definitions.  Type inference is undecidable.

4. The Girard-Reynolds lambda-calculus.  Functions can be parametrized with
respect to polymorphic functions.  Lambda x . x x can be assigned a type.
Type inference is very sensitive to the problem statement.  It's either
known to be undecidable or an open problem, depending on the exact statement.
But there are possible compromises involving smallish amounts of explicit
type information.

5. Any of the above with subtyping.  I'm not up on all the results here.

6. Some of the above with certain recursive types getting automatically
inferred.

My impression is that the type inference procedures usually proposed for
dynamically typed languages usually are near the weak end of this scale.
If they were near the strong end, they would effectively have something
similar to a static type system, with nontrivial programmer supplied type
assertions.

This leaves some interesting questions I can't answer:

1. How much performance do you lose by performing type inference without
a notion of polymorphic types?  This is probably environment dependent ...

2. Why do most of us program in statically typed languages that are so
near the weak end of the above scale that this is all an academic excercise?
Ada and C++ are barely at level 2, if you stretch things, and they do
require lots of explicit type information.

Hans

gudeman@cs.arizona.edu (David Gudeman) (04/02/91)

In article  <3073:Mar2820:38:5191@kramden.acf.nyu.edu> Dan Bernstein writes:
]First, static typing is a simpler, more general, and more expressive
]underlying semantic model.

Nonsense.  How can it be simpler if it requires more details from the
programmer?  Statically typed languages are always more complicated in
the sense that they impose extra restrictions on expressions to make
static typing typing possible.  These extra restrictions are
complications.  By the same argument, static typing cannot possibly be
more general or more expressive.  You can write any expression with
dynamic typing that you can write with static typing, the reverse is
not true.  I will not credit the notion that being able to have the
computer check the static types of variables adds to the power or
expressiveness of a language.  Those terms are not rigorously defined
at this point, but my intuition rebels at the very notion.  Creating
restrictions cannot add to power or expressiveness.

]Ah! A real issue!

If you don't think the other issues are "real", why do you keep
arguing them?

] In Q's any.h, an ``any'' variable comes with a number
]of constraints: if x.type == anytype(int), for example, then
]eq(typeof(x.value),int). (Here anytype and typeof are functions
]evaluated at the preprocessing level; anytype is defined by any.h and
]produces a run-time value, but typeof merely produces a token that must
]be used at compile time.)

I thought so.  Your preprocessor is not a preprocessor at all.

If you can write expressions that look just like dynamic typing, then
you have dynamic typing.  Implementation details are unimportant.  It
is of no value to the programmer to know at what stage of the compiler
the dynamic typing gets done.  Dividing compilation into transparent
stages is an unnecessary complexity.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (04/02/91)

In article  <1991Mar29.191210.9369@maths.nott.ac.uk> Dr A. N. Walker writes:
]In article <1124@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman)
]writes:
]
]>It will take a great deal of evidence to convince me that the Algol
]>committee members were so ignorant that they actually argued over the
]>implementability of anything in Algol 60.  There is nothing in the
]>language that any knowledgeable person would doubt can be implemented.
]
]	Whoa!  There are two different meanings of "implement" and three
]different meanings of "Algol" that are being mixed up, if not actually
]confused.

Nope.  I said Algol 60 because I meant Algol 60.  And _I'm_ not
confusing "implement" with "compile", although it looked like you
were.

]...  In response to my queries, he asserted that he
]intended the earliest Algol tradition [ie Algol 58, or IAL], denied
]intending Algol 60, and couldn't remember whether he'd actually read
]the Algol 58 report.  ["I rest my case m'lud."  However, I cannot resist.]

In the first place, I said the earliest "tradition", not "language"
[ie not Algol 58 or IAL].  What I meant was the design philosophy that
resulted in Algol 60, including all the decisions that were made
before that point.  The most important decision being that the
language should be compilable.  Whether the designers ever mentioned
this decision, or were even aware of it, I don't know.  But I do know
that this was an implicit part of the Algol 60 design.  I'm comparing
Algol 60 to Lisp and SNOBOL, not to FORTRAN and C.

As to the sarcasm: for your information I _have_ read the Algol 60
report and the Algol 68 report, and enough other information to think
I have a good feel for the way programming languages evolved.  This is
not a refereed forum, and I'll not accept being chastized for not
making a literature review before posting.

]Algol 60:  Intended primarily as a descriptive language, and certainly
]	not efficient (call by name, dynamic typing
                                     -------
I assume you meant "static".

]	No "clever assembler" of the
]	period would have included recursive procedures, or dynamic
]	arrays, for example, as indeed Fortran didn't.

Both features can be compiled.  They were less efficient on the
machines of the day.

]	Strings and other useful things *are* in Algol 60 [2 out of 5
]isn't bad!].

I should have been more specific: concatenatable, sectionable strings
with automatic storage management.  Those pathetic things that
statically typed languages call "strings" are really character arrays.

]  In 1960, the other things were scarcely recognised as
]being useful:

Mathematicians had thought they were useful for decades or centuries,
and were the designers of Lisp and SNOBOL also recognized their
importance.  They just didn't fit machine architectures well, so
languages designers who were worried about efficiency didn't deal with
them.

Basically, there are two traditions of programming language design:
Type A designers are concerned about efficiency and type B designers
are concerned about expressiveness.  Historically, you could tell
these designs apart because type A languages were compiled and type B
languages were interpreted (this no longer holds).  Type A languages
were designed with the idea that there should be a fairly direct
mapping from the language to machine languages.  Thus they needed
static typing, functions that were defined before compilation, and
simple data structures that could be mapped onto machine memory
directly.

Type B language designers didn't worry about efficiency, and created
languages with no simple relationship to machine languages (thus
needing to implement them with interpreters).  Type B languages almost
universally have dynamic typing and generalized data structures with
automatic storage management, most also have dynamically created
functions.

Within each camp, there were different levels of concern for
efficiency, but within the type A camp there was always this strong
aversion to any feature that would require a lot of runtime support.
Algol 60 clearly and unambiguously fits into type A.  It is true that
_within the type A camp_, they were somewhat cavalier about
efficiency, but they still didn't introduce anything that required a
great deal of runtime support.  In relationship to the programming
language community as a whole, they were extremely concerned with
efficiency.  Only someone who is unfamiliar of the huge type B
community would claim that Algol 60 was designed without regard to
efficiency.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

cs450a03@uc780.umd.edu (04/02/91)

Caution: about 40 lines of quoted material

Hans Boehm    >     [theorist of types and other maths]
Brian Murphy  >>    [your basic dynamic typing advocate]
Dan Bernstein >>>   [bit muncher & tty basher extraordinarie]
and me, your basic wild-eyed pragmatist
>>>The problem with this argument is that neither model is more
>>>general than the other, or simpler, or much more expressive. Given
>>>a statically typed language with structs, an ``all'' type, and the
>>>set of types, you can set up pairs (value,type) and poof! there's
>>>dynamic typing. Given a dynamically typed language with assertions,
>>>you can assert that a value has some type and poof! there's static
>>>typing.  [then mentions efficiency of static implementations]

Ok, poof, no argument.  And I almost hesitate to mention that static
language implentations have an efficiency advantage (as they better
model things like machine limits) while dynamic language
implementations have an efficiency advantage (they better model human
thought mechanisms as exemplified by thousands of years of
mathematical development).

>>... your statically-typed language _requires_ type declarations,
>>whereas in a dynamically-typed language we can get by without them.

er... which is an example of what I was trying to say.

> The main remaining problem then is that there are a huge number of
>different formulations of the type inference problem, which vary
>greatly in the programs to which they can assign types, especially if
>those programs are library functions for which we have incomplete
>information about the calling context.  Most systems for assigning
>types to expressions in dynamically typed languages seem to assign
>basically simple types. ...

er... simplicity is often an advantage.  

>For statically typed languages without subtyping, I know of the
>following type inference problems that have been studied:
>  [ 1. (simple) ... 6 (not simple) ]
>My impression is that the type inference procedures usually proposed
>for dynamically typed languages usually are near the weak end of this
>scale.  If they were near the strong end, they would effectively have
>something similar to a static type system, with nontrivial programmer
>supplied type assertions.
>
>This leaves some interesting questions I can't answer:
> 
>1. How much performance do you lose by performing type inference
>without a notion of polymorphic types?  This is probably environment
>dependent ...

environment and application dependent, otherwise you have no metric
for performance.  Worse, it is rather impractical to rigidly define
"without a notion" (not impossible, mind you).

>2. Why do most of us program in statically typed languages that are
>so near the weak end of the above scale that this is all an academic
>excercise?  Ada and C++ are barely at level 2, if you stretch things,
>and they do require lots of explicit type information.

Caution: Rampant assertions follow...

In a word, expressiveness.  Typing, is very similar to side-effect
driven programming, and thus difficult to manage if complex.  The
advantage of static typing in programming comes from the ability of
the typing system to reflect and model machine limits.  To take typing
beyond this generally means an attempt to characterize computer
routines as relations, and to re-define them in terms of abstract
domains.

In other words, strong typing attempts to make you write each computer
"function" twice: once in terms of step-by-step instructions as to
what needs to be done, then again in terms of step-by-step
instructions as to what it is allowed to do.

It is simpler, I believe, to just write the thing once.

Or rather, I should say it's simpler for the programmer.  The closer
you can get to machine language, the simpler things are for the
computer.  But as has been observed many, many times:  10% of the code
takes 90% of the time to execute (or similar statistics).

Assuming that programmer time is limited, it makes sense to profile
your code and spend most of the programming effort on that 10% CPU
hog.  (Which, if I understand aright, is what Dan Bernstein is doing:
working on optimizing some problems which are CPU hogs on current
machines.)

Now, it probably some of the topologies that are being developed to
describe type systems will have applications which are quite useful.
On the other hand, there are some very rich bodies of mathematics
which have been developed without these formalisms.  Which is to say,
"typing can be used to describe a programming" does not mean "typing
must be used to describe a program".

More specific to programming, it is often more useful to have
miscelaneous information about a function (does argument order matter?
is it pure computation or does it deal with external objects? does
this function apply to any representation of number for which we have
a math library, or only a specific word size and specific coding? ...)
than it is to concentrate on a completely general description of the
function.

Finally, note that describing functions as static objects is an
outgrowth of a couple very good assembly language practices: 
(1) describe what's allocated where, and (2) keep the code constant so
you can trust it.

On the other hand, I find myself more comfortable with a language
which allows me to "build up" a function from component functions and
utilities than I do with a language which wants me to write as if I
were allocating raw memory.  In other words, if I have some general
function f, it is nice to be able to use it, in all its gory
inefficiency and then, once it is tested, make a function g which is f
restricted to some domain chosen to reflect properties of the specific
problem and the machine I'm working on.

I suppose this is partially personal preference, but note that the
90/10 "rule" implies this is a good approach.

Also note that this system tends to favor simple types.  Function
domain is, of course, part of the function definition, but there is a
difference between the overall limits a function has for correctness,
and the specific limits imposed on a function for machine efficiency.

The typing system built into each function, for correctness, is (in
general) unique to the definition of that function.  In practice, this
means you can even define functions which have no domain (always give
a domain error on closure), and other odd things like that.  Note that
this typing usually becomes less complicated when you restrict the
problem to a specific machine domain (e.g. 32 bit signed integers).

I'm leaving out some key issues (such as the ability of a language to
deal with structured data, which is something type systems often try
to address), but I think you'll find this rule applies in those areas
too:

    The more direct you are, the simpler the problem.

Raul Rockwell

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/02/91)

In article <1APR91.23564447@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>
>Ok, poof, no argument.  And I almost hesitate to mention that static
>language implentations have an efficiency advantage (as they better
>model things like machine limits) while dynamic language
>implementations have an efficiency advantage (they better model human
>thought mechanisms as exemplified by thousands of years of
>mathematical development).
>

People keep saying this, but it just ain't so. Type declarations are
an integral part of mathematics. Abbreviations and conventions, which
make declarations implicit are used extensively, but that's not the
same as no type declarations. Type declarations are so common as
to be cliches in mathematical speech: e.g.  "let G be a group", 
"let P be a proposition", "let R be a ring",
"(forall x in X)" ....  
Maybe I'm missing an essential difference
between these type declarations, and the declarations of programming
languages. If so, I'd be happy if someone could explain.

peter@ficc.ferranti.com (Peter da Silva) (04/02/91)

Fine. There are type A languages and type B languages. There are good
reasons for having both. I prefer type A languages because they are simpler
to implement, which means I can get a decent implementation of them on
a consumer machine. I would love to have a good type B language on my
Amiga, though I'd still use a type A one for a lot of stuff: real-time
response seems to require it (I can't see doing MIDI processing in Lisp
on a 68000).

The problem is, a *decent* type B language compiler is about the size of
GCC (an *excellent* type A language compiler). They won't fit on PCs, and
the market share of other consumer machines is just too small.

*MY* challenge: give me an alternative.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

peter@ficc.ferranti.com (Peter da Silva) (04/03/91)

In article <962@mgt3.sci.UUCP> dc@mgt3.sci.com (D. C. Sessions) writes:
> The upshot of this little affair was the conversion of an entire shop 
> full of C hackers into Modula-2 fanatics, purely because they *never* 
> wanted to give up intermodule type-safety again.

My response is "convert to ANSI C, which gives you intermodule type safety".

> So: for the purposes of the current discussion, how do our ideal 
> dynamically-typed languages ensure that a similar little 
> misunderstanding doesn't happen again?

That's easy: the function that gets a foo wouldn't misinterpret it as being
a bar in the first place. It'd either have kicked out an error, or the test
for bar.magic would have failed.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

cs450a03@uc780.umd.edu (04/03/91)

Victor Yodaiken    >
Me                 >>
>>static language implentations have an efficiency advantage (as they
>>better model things like machine limits) while dynamic language
>>implementations have an efficiency advantage (they better model
>>human thought mechanisms as exemplified by thousands of years of
>>mathematical development).
>People keep saying this, but it just ain't so. Type declarations are
>an integral part of mathematics. Abbreviations and conventions, which
>make declarations implicit are used extensively, but that's not the
>same as no type declarations.

Did I say no type declarations?  I said dynamic typing.

The advantage of dynamic typing is not "no type declarations", the
advantage is that redundant type declarations can be eliminated.
[Actually, this property is not unique to dynamically typed languages.
But DTLs are more consistent in this regard than are STLs.]

For example, if F(x) is defined as G(h(q(x))), and the domain of q is
balogna sandwiches, then the domain of F is balogna sandwiches, and
x can only take on values that are balogna sandwiches.  Using q in
this manner is sufficient to declare the type of F.

Another advantage of dynamic typing is that you get a defined behavior
when a function gets a value which is out of its domain (e.g. divide
by zero, or array index with index too large).  Again, this is not
unique to DTLs -- and again, DTLs are more consistent about this than
STLs.

Raul Rockwell

cs450a03@uc780.umd.edu (04/03/91)

I wrote:
>For example, if F(x) is defined as G(h(q(x))), and the domain of q is
>balogna sandwiches, then the domain of F is balogna sandwiches, and
>x can only take on values that are balogna sandwiches.  Using q in
>this manner is sufficient to declare the type of F.

gack

This should say "Using q in this manner is sufficient to declare the
domain (argument type) of F."  The result type of F depends on G.

Raul

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/03/91)

In article <3APR91.00020019@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>Did I say no type declarations?  I said dynamic typing.
>
>The advantage of dynamic typing is not "no type declarations", the
>advantage is that redundant type declarations can be eliminated.
>[Actually, this property is not unique to dynamically typed languages.
>But DTLs are more consistent in this regard than are STLs.]
>
>For example, if F(x) is defined as G(h(q(x))), and the domain of q is
>balogna sandwiches, then the domain of F is balogna sandwiches, and
>x can only take on values that are balogna sandwiches.  Using q in
>this manner is sufficient to declare the type of F.
>

This makes sense, but it seems to be an instance of an abbreviation
or convention. There is a preface: all functions are asumed to be over
balogna sandwiches, implied in what you write.   If what you mean by
"dynamic typing" is simply greater facility in defining domains and
higher level declarations of type (e.g., all functions in the block are
over integers), then I believe that I understand. But, it appeared
from earlier discussion that dynamic typing was more involved.

>Another advantage of dynamic typing is that you get a defined behavior
>when a function gets a value which is out of its domain (e.g. divide
>by zero, or array index with index too large).  Again, this is not
>unique to DTLs -- and again, DTLs are more consistent about this than
>STLs.
>

I'm really lost now. What is the connection between type dynamism and
well definedness of operations?

cs450a03@uc780.umd.edu (04/04/91)

Victor Yodaiken >
 [responding to an illustration of how constructing a function
  specifies the type of the function]
>This makes sense, but it seems to be an instance of an abbreviation
>or convention. There is a preface: all functions are asumed to be
>over balogna sandwiches, implied in what you write.  If what you mean
>by "dynamic typing" is simply greater facility in defining domains
>and higher level declarations of type (e.g., all functions in the
>block are over integers), then I believe that I understand. But, it
>appeared from earlier discussion that dynamic typing was more
>involved.

Dynamic typing is simple in *concept*.  The problem is that it often
does not map well onto machine architecture.  For example, if you have
a function which is defined over the domain of numbers (take addition,
for example), then you usually see a dynamic typing system faithfully
providing that functionality for a variety of machine implementations
of numeric types.

There is no reason you couldn't have just a single numeric type (say
complex numbers where both real and imaginary components are 96 bit
IEEE floats).  Except that's pretty silly when all you want to do is
represent array indices.  Or worse, 1's and 0's.

>>Another advantage of dynamic typing is that you get a defined behavior
>>when a function gets a value which is out of its domain (e.g. divide
>>by zero, or array index with index too large).  Again, this is not
>>unique to DTLs -- and again, DTLs are more consistent about this than
>>STLs.

>I'm really lost now. What is the connection between type dynamism and
>well definedness of operations?

Another implementation detail of dynamic types is that there is a lot
more information present than is usually acted on.  For example, if
you have a function f(x) defined as F(g(x), h(x)), and g(x) is valid
over the integer domain, and h(x) is valid only for non-negative
numbers, then the domain of f is unsigned integers.  If f where
compiled, the compiler could probably make good use of this fact, and
issue warnings or error messages if f where used (a) where it is
possible to give f values outside of its domain or (b) where it is
guaranteed that f will be given values outside of its domain.  A
similar situation might be where the domain of f is integers greater
than negative 8.

In a statically typed language, type is considered to be a property of
variables, so if a variable can have a value outside the domain of a
function then that is not considered a type error.  In a dynamically
typed language, where practically all functions have type checks to
ensure their arguments' validity, more is usually checked than just
the "wordsize" or the "typetag" of the arguments.

Statically typed languages leave many such details to the programmer.
The advantage is that it is possible to write fast programs, simply by
leaving out some of the type-checks.  The costs are that errors may
not be caught as soon as they would be otherwise, and that the
programmer must spend a fair bit of his time programming and debugging
type structures.

Raul Rockwell

jgk@osc.COM (Joe Keane) (04/04/91)

In article <22MAR91.20485982@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>Generally, each assignment to a variable is unique.  (I try not to
>re-assign, and when I do, I try and make sure re-executing that
>section of code would not cause a problem).  Exception made for loop
>counters, but not for other assignments made within the loop.

In article <039AIL3@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da
Silva) writes:
>This is an unusual coding style, in my experience. Are you actually
>limitin assignments, or are you hiding those assignments in call by
>reference? Perhaps a code fragment would help.

I don't think this is so unusual.  I do the same thing to a large extent when
programming in C or C++.  Perhaps it could be my exposure to functional
languages, but i find that this approach goes well with C too.  I'm not strict
about it, but i find that almost all variables can be done this way.  We have
to make a few exceptions, which i'll describe later.

There's another thing i do which goes together with this.  I generally try to
keep variables' scope as small as possible.  For example, if a variable is
used inside a loop but it's not carried between iterations, then i can declare
it inside the loop body.  Given the `small as possible' rule, that means i
should declare it there.  Sometimes i make blocks just to declare variables
in.  This may sound crazy, but it makes sense because these blocks generally
correspond to some conceptual operation.

I think the result is that code written this way is easier to read and
understand what's going on.  By making the lifetime of a variable clear, you
help the human reader understand the general structure of the algorithm.
Optimizing compilers can figure it out too, but people don't want to spend
time doing this.

Loops are an exception to this rule.  Obviously, you have to increment or
somehow advance your loop variable in the loop, or else it gets stuck.  And
often the point of a loop is to accumlate some value.

You asked for an example.  I don't think a short example would illustrate the
point well.  So free with this posting, i'm going to give you a real program.
This is actually a useful program, so you may want to save the good version.

First i show the `bad' version of the program.  It suffers from what i call
`Pascal variable declaration syndrome'.  All the variables are declared at the
top of the function, far away from where they're actually used.  You can't see
where variables are used or how long they're expected to live.  Variables are
re-used without this being clear.

You may say that the problem is that my main function is too big.  I don't
agree with this, although i suppose this is a matter of taste.  I believe that
chopping it up into little functions would not improve readability.  With
proper attention to scoping, which is what i'm talking about after all, each
block is like a small function.  But the blocks are ordered exactly like
they're used, so you don't have to go jumping all around the source to see
something simple.

-- don't cut here -- don't cut here -- don't cut here -- don't cut here --
#include <stdio.h>
#include <sys/time.h>

extern char* malloc();
extern char* realloc();

struct line
{
  int length;
  char* ptr;
};

int main (argc, argv)
  int argc;
  char* argv[];
{
  struct line* master_ptr;
  int master_size;
  int master_capacity;
  char *buffer_ptr;
  int buffer_capacity;
  int c;
  int buffer_size;
  char* line_ptr;
  int pass;
  struct timeval tv;
  int line_number;
  int other_line;
  struct line temp;
  char* ptr;
  char* end;

  master_capacity = 256;
  master_ptr = (struct line*)malloc(master_capacity * sizeof(struct line));
  if (!master_ptr)
   goto out_of_memory;
  master_size = 0;
  buffer_capacity = 256;
  buffer_ptr = malloc(buffer_capacity);
  if (!buffer_ptr)
    goto out_of_memory;
  for (;;)
  {
    c = getchar();
    if (c == EOF)
      goto eof;
    if (master_size >= master_capacity)
    {
      master_capacity *= 2;
      master_ptr = (struct line*)realloc(master_ptr, master_capacity * sizeof (struct line));
      if (!master_ptr)
	goto out_of_memory;
    }
    buffer_size = 0;
    while (c != '\n')
    {
      if (buffer_size >= buffer_capacity)
      {
	buffer_capacity *= 2;
	buffer_ptr = realloc(buffer_ptr, buffer_capacity);
      }
      buffer_ptr[buffer_size] = c;
      buffer_size++;
      c = getchar();
      if (c == EOF)
      {
	fputs("shuffle: adding newline at end of file\n", stderr);
	break;
      }
    }
    line_ptr = malloc(buffer_size);
    if (!line_ptr)
      goto out_of_memory;
    memcpy(line, buffer_ptr, buffer_size);
    master_ptr[master_size].length = buffer_size;
    master_ptr[master_size].ptr = line_ptr;
    master_size++;
    if (c == EOF)
      goto eof;
  }
 eof:
  free(buffer_ptr);
  for (pass = 0; pass < 16; pass++)
  {
    gettimeofday(&tv, 0);
    srandom(tv.tv_sec ^ tv.tv_usec);
    for (line_number = 0; line_number < master_size; line_number++)
    {
      other_line = random() % (master_size - line_number) + line_number;
      temp = master_ptr[line];
      master_ptr[line_number] = master_ptr[other_line];
      master_ptr[other_line] = temp;
    }
  }
  for (line_number = 0; line_number < master_size; line_number++)
  {
    ptr = master_ptr[line_number].ptr;
    end = ptr + master_ptr[line_number].length;
    while (ptr < end)
    {
      putchar(*ptr);
      ptr++;
    }
    putchar('\n');
  }
  return 0;
 out_of_memory:
  fputs("shuffle: out of memory\n", stderr);
  return 1;
}
-- don't cut here -- don't cut here -- don't cut here -- don't cut here --

Now i'm going to show you how i actually wrote it.  None of the variables are
changed, but they're declared in the right place.  I've also added a couple
blocks, like i said before.  I've also restored some debugging code, because
you may find it useful.  It also satisfies the single-assignment property much
better than the `bad' version, because variables are created when they're
needed rather than at the beginning of the function.

-- cut here -- cut here -- cut here -- cut here -- cut here -- cut here --
#include <stdio.h>
#include <sys/time.h>

#define DEBUG 0

extern char* malloc();
extern char* realloc();

struct line
{
  int length;
  char* ptr;
};

int main (argc, argv)
  int argc;
  char* argv[];
{
  struct line* master_ptr;
  int master_size;

#if DEBUG
  fputs("shuffle: reading lines...\n", stderr);
#endif
  {
    int master_capacity;
    char *buffer_ptr;
    int buffer_capacity;

    master_capacity = 256;
    master_ptr = (struct line*)malloc(master_capacity * sizeof(struct line));
    if (!master_ptr)
      goto out_of_memory;
    master_size = 0;
    buffer_capacity = 256;
    buffer_ptr = malloc(buffer_capacity);
    if (!buffer_ptr)
      goto out_of_memory;
    for (;;)
    {
      int c;
      int buffer_size;
      char* line_ptr;

      c = getchar();
      if (c == EOF)
	goto eof;
      if (master_size >= master_capacity)
      {
	master_capacity *= 2;
	master_ptr = (struct line*)realloc(master_ptr, master_capacity * sizeof (struct line));
	if (!master_ptr)
	  goto out_of_memory;
      }
      buffer_size = 0;

      while (c != '\n')
      {
	if (buffer_size >= buffer_capacity)
	{
	  buffer_capacity *= 2;
	  buffer_ptr = realloc(buffer_ptr, buffer_capacity);
	}
	buffer_ptr[buffer_size] = c;
	buffer_size++;
	c = getchar();
	if (c == EOF)
	{
	  fputs("shuffle: adding newline at end of file\n", stderr);
	  break;
	}
      }

      line_ptr = malloc(buffer_size);
      if (!line_ptr)
	goto out_of_memory;
      memcpy(line_ptr, buffer_ptr, buffer_size);
      master_ptr[master_size].length = buffer_size;
      master_ptr[master_size].ptr = line_ptr;
      master_size++;
      if (c == EOF)
	goto eof;
    }

  eof:
    free(buffer_ptr);
  }
#if DEBUG
  fprintf(stderr, "shuffle: total of %d lines read\n", master_size);
#endif

  {
    int pass;

    for (pass = 0; pass < 16; pass++)
    {
      struct timeval tv;
      int line_number;

      gettimeofday(&tv, 0);
#if DEBUG
      fprintf(stderr, "shuffle: doing pass %d, time is %d seconds, %d microseconds...\n", pass, tv.tv_sec, tv.tv_usec);
#endif
      srandom(tv.tv_sec ^ tv.tv_usec);
      for (line_number = 0; line_number < master_size; line_number++)
      {
	int other_line;
	struct line temp;

	other_line = random() % (master_size - line_number) + line_number;
	temp = master_ptr[line_number];
	master_ptr[line_number] = master_ptr[other_line];
	master_ptr[other_line] = temp;
      }
    }
  }

#if DEBUG
  fputs("shuffle: writing lines...\n", stderr);
#endif
  {
    int line_number;

    for (line_number = 0; line_number < master_size; line_number++)
    {
      char* ptr;
      char* end;

      ptr = master_ptr[line_number].ptr;
      end = ptr + master_ptr[line_number].length;
      while (ptr < end)
      {
	putchar(*ptr);
	ptr++;
      }
      putchar('\n');
    }
  }

#if DEBUG
  fputs("shuffle: all done\n", stderr);
#endif
  return 0;

 out_of_memory:
  fputs("shuffle: out of memory\n", stderr);
  return 1;
}
-- cut here -- cut here -- cut here -- cut here -- cut here -- cut here --
--
Joe Keane, C++ hacker
jgk@osc.com (...!uunet!stratus!osc!jgk)

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/05/91)

In article <3APR91.20574161@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>Victor Yodaiken >
> [responding to an illustration of how constructing a function
>  specifies the type of the function]
>>This makes sense, but it seems to be an instance of an abbreviation
>>or convention. There is a preface: all functions are asumed to be
>>over balogna sandwiches, implied in what you write.  If what you mean
>>by "dynamic typing" is simply greater facility in defining domains
>>and higher level declarations of type (e.g., all functions in the
>>block are over integers), then I believe that I understand. But, it
>>appeared from earlier discussion that dynamic typing was more
>>involved.
>
>Dynamic typing is simple in *concept*.  The problem is that it often
>does not map well onto machine architecture.  For example, if you have
>a function which is defined over the domain of numbers (take addition,
>for example), then you usually see a dynamic typing system faithfully
>providing that functionality for a variety of machine implementations
>of numeric types.
>

Your explanation is confusing me even more.
Most of  the programming languages that I know do something like this.
For example, x +y works fine in "C" for floating and integer types,
and can also be extended to other types in C++  I believe. 
Maybe a precise definition of "dynamic typing" would be useful. From
previous postings, I had the impression that "dynamic typing" involved
compiler/run-time inferences about the type of a function or variable
from the context and use of the variable or function.  On the other
hand, "static typing" requires the programmer to indicate exactly
the type of the object. Is this incorrect?

>In a statically typed language, type is considered to be a property of
>variables, so if a variable can have a value outside the domain of a
>function then that is not considered a type error.  In a dynamically
>typed language, where practically all functions have type checks to
>ensure their arguments' validity, more is usually checked than just
>the "wordsize" or the "typetag" of the arguments.
>

Again, I'm not following you here. Is  this extra type checking in 
dynamic languages part of the language or a coding style? Clearly
there are limits to how much the compiler can do: e.g.
x * limit(f,x, x-> 0) is a "type" error of some kind if 
f(x) does not converge  --- multiplication has only numbers in
its domain.

new@ee.udel.edu (Darren New) (04/06/91)

In article <28875@dime.cs.umass.edu> yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>Your explanation is confusing me even more.
>Maybe a precise definition of "dynamic typing" would be useful.

Dynamic typing is when the type of a *variable* is unknown (and unspecified)
at compile time.  Any given value assigned to that variable will have
a type, but the variable does not restrict what types of values may
be assigned to it.  Contrast these:

int f(float f, int i) { blah blah blah }
/* here, f must only get floats and i must only get ints. */

(defun f (f i) (blah blah ))
Here, f may be a float, and integer, a bignum, a list, a closure, etc.

Basically, expressions, variables, function names, and so on do not have
types.  Nothing that is purely syntactical has types.  Only actual
values have types.               -- Darren

-- 
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---
----- Network Protocols, Graphics, Programming Languages, FDTs -----
  +=+=+ My time is very valuable, but unfortunately only to me +=+=+
+ When you drive screws with a hammer, screwdrivers are unrecognisable +

cs450a03@uc780.umd.edu (04/06/91)

Victor Yodaiken writes:
>>[Raul:] For example, if you have a function which is defined over the
>>domain of numbers (take addition, for example), then you usually see
>>a dynamic typing system faithfully providing that functionality for
>>a variety of machine implementations of numeric types.

>Your explanation is confusing me even more.

sorry..

>Most of  the programming languages that I know do something like this.
>For example, x +y works fine in "C" for floating and integer types,
>and can also be extended to other types in C++  I believe. 

But in C, you have to go through some red tape if you need to store
both integer and floating types in the same variable.  You also have
to go through this red tape to define a function that takes more than
one type as an argument (for example, functions which manage a
run-time symbol table).  C++ allows you a little more abstraction, but
still requires quite a bit of paper work.

>Maybe a precise definition of "dynamic typing" would be useful. 

See Darren New's article (5 April, I think, same Subject: as this
article). 

>From previous postings, I had the impression that "dynamic typing"
>involved compiler/run-time inferences about the type of a function or
>variable from the context and use of the variable or function.  On
>the other hand, "static typing" requires the programmer to indicate
>exactly the type of the object. Is this incorrect?

Dynamic typing requires compiler type inferences (though these are
trivial in the presence of declarations, or well-conditioned code).
There is no requirement for run-time type inferences.  Type
information is always available at run-time.

If a compiler can not infer type, it will provide for run-time type
checking as well--it's fairly trivial to implement.

>>[Me (Raul) again:] ...  In a dynamically typed language, where
>>practically all functions have type checks to ensure their
>>arguments' validity, more is usually checked than just the
>>"wordsize" or the "typetag" of the arguments.

>Again, I'm not following you here. Is this extra type checking in
>dynamic languages part of the language or a coding style?

For language primitives, it's part of the language.  For user defined
functions, it is both [the programmer must provide any necessary type
checking that is not implicit in the primitives].

Raul Rockwell

cs450a03@uc780.umd.edu (04/06/91)

Victor Yodaiken >
Darren New      >>

>>Dynamic typing is when the type of a *variable* is unknown (and unspecified)
>>at compile time.  Any given value assigned to that variable will have
>>a type, but the variable does not restrict what types of values may
>>be assigned to it.

I just realized: that should begin "Dynamic typing is when the type of
a variable _may_ be unknown at compile time."

>Makes sense. But, then I don't believe that dynamic typing is similar
>to standard mathematical usage. In fact, it seems that the trend in
>mathematics has been towards greater use of "static typing".  The
>problem with "untyped" expressions is that they are inherently
>ambiguous. If I'm writing about semigroups and regular languages I
>must tell you when a*b is concatenation and when it is semigroup
>addition.

Here, * is a variable, and "telling me when a*b is cantenation" "and
when it is semigroup addition" are both declarations (or assignment
statements).  That can get confusing, I agree, which is a good reason
for using a separate symbol for each usage.

Raul Rockwell

cs450a03@uc780.umd.edu (04/06/91)

I wrote (paraphrasing Darren New)

>"Dynamic typing is when the type of a variable _may_ be unknown at
>compile time."

An even yet more exactly stated version would be :-)

"Dynamic typing is when the most efficient of the available primitive
machine representation for some value(s) is chosen at run-time."

An analogy to C might be choosing an "array of short" or "array of
long" depending on the size of the values.  (I'd say "array of double"
but C has completely different semantics for "/" for that case.)

Raul Rockwell

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/06/91)

In article <49907@nigel.ee.udel.edu> new@ee.udel.edu (Darren New) writes:
>
>Dynamic typing is when the type of a *variable* is unknown (and unspecified)
>at compile time.  Any given value assigned to that variable will have
>a type, but the variable does not restrict what types of values may
>be assigned to it.  Contrast these:
>
>int f(float f, int i) { blah blah blah }
>/* here, f must only get floats and i must only get ints. */
>
>(defun f (f i) (blah blah ))
>Here, f may be a float, and integer, a bignum, a list, a closure, etc.
>
>Basically, expressions, variables, function names, and so on do not have
>types.  Nothing that is purely syntactical has types.  Only actual
>values have types.               -- Darren

Makes sense. But, then I don't believe that dynamic typing  is similar
to standard mathematical usage. In fact, it seems that the trend in mathematics
has been towards greater use of "static typing". 
The problem with "untyped" expressions is that they are inherently
ambiguous. If I'm writing about  semigroups and regular languages I must
tell you when a*b is concatenation and when it is semigroup addition.
Figuring this out from the context alone is difficult for most people.
What makes 
classical programming languages so annoying, however, is that the
type schemes are so much more clumsy than we are used to in mathematics.

	At the simplest level, one might have a hierarchy of types of
	objects, and demand that when two objects appear together, they are
	always converted to the type that appears highest on the hierarchy.
	In practice, however,  types of mathematical objects do not form any
	kind of strict hierarchy, In some respects,they form a network, or
	directed graph, and one could imagine always converting objects to
	the type of their common ancestor" in the network. But even this
	much more elaborate mechanism does not adequately capture
	mathematical usage. Different forms of algebraic expressions, say
	factored and expanded out, can be thought of as different types. In
	most cases, the rules for when to convert between these types are
	much more complicated than you can represent by a simple fixed
	network of type conversions.
               Stephen Wolfram, Mathematica

olson@lear.juliet.ll.mit.edu ( Steve Olson) (04/07/91)

In article <28924@dime.cs.umass.edu> yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
   The problem with "untyped" expressions is that they are inherently
   ambiguous. If I'm writing about  semigroups and regular languages I must
   tell you when a*b is concatenation and when it is semigroup addition.
   Figuring this out from the context alone is difficult for most people.

This sounds more like overuse of operator overloading rather than anything
having to do with static vs. dynamic typing.  This problem could occur just as
easily in C++ as in CLOS.  If you never want to mix the two types then 
renaming the operators will remove all confusion.  If you do want to mix the
two types then the problem is the same whether you are using a dynamically
typed language or a statically typed sortof-dynamic-typing-through-subclassing-
and-virtual-functions language.

- Steve Olson
  MIT Lincoln Laboratory
  olson@juilet.ll.mit.edu

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/07/91)

In article <OLSON.91Apr6155734@lear.juliet.ll.mit.edu> olson@lear.juliet.ll.mit.edu ( Steve Olson) writes:
>
>In article <28924@dime.cs.umass.edu> yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
>   The problem with "untyped" expressions is that they are inherently
>   ambiguous. If I'm writing about  semigroups and regular languages I must
>   tell you when a*b is concatenation and when it is semigroup addition.
>   Figuring this out from the context alone is difficult for most people.
>
>This sounds more like overuse of operator overloading rather than anything
>having to do with static vs. dynamic typing.  This problem could occur just as
>easily in C++ as in CLOS.  If you never want to mix the two types then 
>renaming the operators will remove all confusion.  If you do want to mix the
>two types then the problem is the same whether you are using a dynamically
>typed language or a statically typed sortof-dynamic-typing-through-subclassing-
>and-virtual-functions language.

I don't want to have to give up overloading --
one of the real conveniences of mathematical notation. But if I declare
variable types, there is no problem.
So, if I write: let a,b be elements of the alphabet and
let a',b' be the corresponding elements of the semigroup, then
a*b and a'*b' are easy to distinguish. By making the types of the
variables explicit, I make the overloading of * unambiguous. An
expression a*b' where the two variables have different types makes no
sense in this context and I'd like the compiler to catch it.

gudeman@cs.arizona.edu (David Gudeman) (04/08/91)

In article  <28673@dime.cs.umass.edu> victor yodaiken writes:
]In article <1APR91.23564447@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
]...dynamic language[s]... better model human
]>thought mechanisms as exemplified by thousands of years of
]>mathematical development).
]>
]
]People keep saying this, but it just ain't so. Type declarations are
]an integral part of mathematics...
]Maybe I'm missing an essential difference
]between these type declarations, and the declarations of programming
]languages. If so, I'd be happy if someone could explain.

OK, try this.  In mathematics you don't have to describe the types of
objects unless the type is important.  In statically typed languages,
you have to declare the type whether it has any relevance or not (or
fix it at compile time whether it is really knowable or not).  In
mathematics or dynamically typed languages you can describe the
concept of a set or a sequence, and operations over those data
structures, without refering to the "type" of elements.  For example
in math you could say

  If f is a function and s is a sequence, then map(f,s) is the
  sequence t such that for all i . t[i] = f(s[i]).

In a Icon (a dynamically typed language) you could define

  procedure map(f,s)
  local t,i
  t := list(*s)
  every i := 1 to *s do t[i] := f(s[i])
  return t
  end

Try to write this function in a statically typed language so that it
has all the generality of the math and Icon versions.  (Actually, the
math version works for infinite sequences and the Icon one doesn't.
There are languages that fix that...)  No, forget the challenge.
Someone is sure to post a solution (or near solution) using some
baroque system of static declarations from Ada or some wierd C trick
(I can think of one...).  The point is that even if you can sort of do
it in these languages, it is obvious that the Icon version is much
closer to the mathematical (I say "reasonable" or "natural") way of
doing it.

Furthermore, when you do use something like a type declaration in
mathematics, it is to disambiguate or add clarify, and the nature of
the declarations reflects that.  In statically typed programming
languages the purpose of the types is to let the compiler generate
better code, and the nature of the declarations reflects that.  For
example, you cannot simply declare something as a "number", you have
to decide whether you want it represented in floating point format or
integer, and what size you want.  You can't declare that something is
a set, you have to implement a set using fixed-size memory blocks.

I just depends on what you want in a programming language.  If you
think that programming languages should always give fairly transparent
compilation, or are willing to go to a great deal of extra work for a
few extra cycles, then you won't be too happy with dynamic typing.
But it is just ridiculous to claim that static typing is more natural
than dynamic typing -- I am talking here about static typing in the
sense that _every_ expression has to be give a fixed type of the form
found in programming languages.

I only have to show one example of "dynamic typing" in mathematics to
show that mathematics is not "statically typed" in the universal sense
of statically typed programming languages (and I did so above).
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (04/08/91)

In article  <6APR91.10374005@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
]...
]"Dynamic typing is when the most efficient of the available primitive
]machine representation for some value(s) is chosen at run-time."

I don't like that definition at all.  In the first place, it is an
implementation-based definition rather than a semantic one.  In the
second place, I don't know of any dynamically typed language that
claims to choose the "most efficient ... representation" -- they
choose a convenient representation.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/08/91)

In article <1593@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>Furthermore, when you do use something like a type declaration in
>mathematics, it is to disambiguate or add clarify, and the nature of
>the declarations reflects that.  In statically typed programming
>languages the purpose of the types is to let the compiler generate
>better code, and the nature of the declarations reflects that.  For
>example, you cannot simply declare something as a "number", you have
>to decide whether you want it represented in floating point format or
>integer, and what size you want.  You can't declare that something is
>a set, you have to implement a set using fixed-size memory blocks.
>

Agreeing that the type declarations in classical algol/fortran/c ...
programming languages are hopelessly rigid, I'm not sure that
"dynamic" versus "static" is at the core of the argument. There seems
no reason why we cant have a "static" programming language in which
type declarations are more flexible than current definitions. Thus,
if I declare "x,y" to range over numbers, then "x+y, x*y " are
well defined, but "x/y" is not, because the semantics of /
depends on more than the numberness of the arguments -- i.e.
integer/integer -> integer, float/float -> float, integer/integer ->(q,r)
are all different functions. I sympathize with those who find
it absurd to have to describe variables by storage cell length (
essentially the "c" model) but I still don't see what this has to do
with dynamic "run-time" type checking. Even with function variables,
it seems plausible that one could be forced to provide a proof
that any expression x(y,z) should be type consistent:
(y,z) in Domain(x). Why should this proof be given in temrs of a dyanmic
test, rather than a "compile time" calculation?

bbc@rice.edu (Benjamin Chase) (04/08/91)

gudeman@cs.arizona.edu (David Gudeman) writes:

> OK, try this.  In mathematics you don't have to describe the types
> of objects unless the type is important.  In statically typed
> languages, you have to declare the type whether it has any relevance
> or not (or fix it at compile time whether it is really knowable or
> not).

As we shall see below, delaring the type may not be quite as painful
as you suggest.

> In mathematics or dynamically typed languages you can describe the
> concept of a set or a sequence, and operations over those data
> structures, without refering to the "type" of elements.  For example
> in math you could say

>  If f is a function and s is a sequence, then map(f,s) is the
>  sequence t such that for all i . t[i] = f(s[i]).

> In a Icon (a dynamically typed language) you could define

>  procedure map(f,s)
>  local t,i
>  t := list(*s)
>  every i := 1 to *s do t[i] := f(s[i])
>  return t
>  end

> Try to write this function in a statically typed language so that it
> has all the generality of the math and Icon versions.  

I think you can write "map" in Russell, which is statically typed.

My Russell is a little rusty, so I could be wrong.  There might be a
hitch with the declaring the result type of "map"; I'm not really
certain either way.  This works by using a different notion of types
and polymorphism than you were probably expecting...

The type declarations for "map" look sort of parameterized in Russell.
Informally, they'd run something like

map(f:S->T; s:list of S) : list of T;
  S has {V}	/* value of (ie. rvalue of an lvalue, basically) */
  T has {V,:=}	/* value of, assignment operator */
{
}

I'm certain that I've got the syntax wrong.  Anyhow, the gist is that
I require the types S (type of elements of s) and T (types of elements
of t) to support certain operations.  This definitely has an o-o feel
to it, yes, and it might be helpful for purposes of your comprehension
to rephrase it as "classes S and T must support certain methods, named
V and :=".

> ... Furthermore, when you do use something like a type declaration
> in mathematics, it is to disambiguate or add clarify, and the nature
> of the declarations reflects that.  In statically typed programming
> languages the purpose of the types is to let the compiler generate
> better code, and the nature of the declarations reflects that.

Such an efficiency-minded purpose is not so apparent in Russell.  I
found the feel of type declarations for a polymorphic function to be
more of a flavor of "to use the function, your argument types must
support the following operations: ...".  That seemed like a very
natural requirement to me, one that would hold for most languages,
either at compile time or at run time.  I liked it from a
software-engineering aspect, too.

> I only have to show one example of "dynamic typing" in mathematics
> to show that mathematics is not "statically typed" in the universal
> sense of statically typed programming languages (and I did so
> above).

Careful.  You may have instead shown a narrow view of what is actually
possible.

Now, having said all this, I will say that I am not an unflagging fan
of static typing nor of Russell.  There are hard problems in Russell
regarding arrays, to which I never quite found a good solution.  I
recall it being difficult to write a completely general-purpose matrix
math package, without having explicit (ie. run-time) checks that the
sizes of arrays matched appropriately.  In particular, if you want to
code up a matrix multiply operation:

mult(A: array[m,n] of T;
     B: array[o,p] of T) : array [m,p] of T
  ...
{
	/* Ooops, need to basically do a run-time type check */
	if (n != o) {
	   print("Hey, these arrays can't be multiplied!");
	   print("Their sizes don't match!");
	}
}

(Of course, Russell's polymorphism over the element type T does work
like a champ.  Oh, yeah, and it supports operator overloading and
precendence, so the above function was actually called "*", and you'd
say "A*B", like you'd want.)

So, I guess I'd like something like Russell, but maybe with a
dynamically typed loophole.  I'd tell the compiler what I know, or
what I care to let it know, about my code.  It would figure out what
it can (and perhaps informs me of what is left unchecked), and then
puts off the rest of the checking until run-time.  I wish there had
been a way for the above run-time array size check to be done
automagically (I suppose you'd declare the arrays mxn and nxp then,
and unify the two n's?), and for the handling of such an error to be
done in some standard fashion (Modula-3 style raises/exceptions?).
--
	Ben Chase <bbc@rice.edu>, Rice University, Houston, Texas

cs450a03@uc780.umd.edu (04/09/91)

I wrote:
    "Dynamic typing is when the most efficient of the available
    primitive machine representation for some value(s) is chosen at
    run-time."

David Gudeman wrote:
    I don't like that definition at all.  In the first place, it is an
    implementation-based definition rather than a semantic one.

Darren New wrote:
    I don't think it has anything to do with machine representations
    or efficiency.

Ok, ok.. I apologize.  My fault for using such an inexact expression
as "most efficient"...  [My thoughts were along the line of "most
efficient of the implemented types for the computational model" but
even that isn't very useful as a description.]

Umm... the concept I was trying to capture was something along the
lines of the difference between a type-tag + a c-struct vs a type tag
+ a c-union.  But that is totally implementation, and only one of many
possible implementations at that.  (First alternate implementation
that comes to mind the case where address ranges are used to
distinguish between different word sizes.  ... still only vaguely
related to dynamic/static typing).

Once again, sorry.

Raul Rockwell

new@ee.udel.edu (Darren New) (04/09/91)

In article <6APR91.10374005@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>I wrote (paraphrasing Darren New)
>>"Dynamic typing is when the type of a variable _may_ be unknown at
>>compile time."

Actually, unless you are talking about type inferencing, I would say
"*must* be unknown."  If the type is known at compile time, it is
static typing, period.  If you don't specify the type but the compiler
can figure it out for you, then that is static typing, because the
type of the values assigned to the variable is known statically.

I never intended to imply that dynamic typing and static typing
cannot be mixed in the same language.

>An even yet more exactly stated version would be :-)
>"Dynamic typing is when the most efficient of the available primitive
>machine representation for some value(s) is chosen at run-time."

I don't think it has anything to do with machine representations or
efficiency.  Your definition would exclude Objective-C, Smalltalk, C++,
and every other dynamically typed language I know except APL.  For
example, Smalltalk does not chose the most efficient version of the
data: it uses what the programmer defines.  Maybe languages like
Mathematica or ML do such a thing, but chosing an efficient
representation is orthagonal to dynamic typing.  For example, in Ada
(as I understand it), one can say "I need a float with this range and
that accuracy at least" and the compiler will chose the most efficient
representation.  In Smalltalk, one must define all the operations on
lists. Ada choses efficient reps but is statically typed, and Smalltalk
does not chose efficient reps but is dynamically typed.

           -- Darren
-- 
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---
----- Network Protocols, Graphics, Programming Languages, FDTs -----
  +=+=+ My time is very valuable, but unfortunately only to me +=+=+
+ When you drive screws with a hammer, screwdrivers are unrecognisable +

cs450a03@uc780.umd.edu (04/09/91)

Victor Yodaiken writes:

>There seems no reason why we cant have a "static" programming
>language in which type declarations are more flexible than current
>definitions. Thus, if I declare "x,y" to range over numbers, then
>"x+y, x*y " are well defined, but "x/y" is not, because the semantics
>of / depends on more than the numberness of the arguments -- i.e.
>integer/integer -> integer, float/float -> float, integer/integer
>->(q,r) are all different functions.

In general, you need to be able to do more with a computer than add
and multiply :-)

Consider a function Lookup(symbol_table, symbol) -- what should the
type of the result be?  Or consider Set_value(symbol_table, symbol, value) 
what type should "value" be?

Could you imagine a symbolic math package which had to invoke a
compiler at each stage to construct "statically typed code"
appropriate to an expression?  How about developing such a package
in a language where you had to change declarations each time you
wanted to work on a different class of problem?

-----------------------------------

Consider the problem of function-rewriting.  Let's say you have some
function F(g(x), g(y)).  Let's say F and g are both O(n squared), but
that F(g, g) has properties that allow it to be computed in O(n).  In
a language where a significant part of the code is spent declaring how
x and y (and presumably the results of F and g) are stored, how is a
compiler supposed to be able to determine that it is appropriate to
re-write this usage of F?

I tend to think of static typing in the same light as side-effect
driven programming.  Both have "non-local" properties that make
re-writing code difficult.  [If static typing is purely local, then
you are not typing a variable, but a value.  If that "variable" is a
constant, then there may be no significant difference between static
and dynamic typing.]

Function-rewriting involves a number of issues besides dynamic/static
typing, but that's what I've been spending my spare time on...

-----------------------------------

Let me pose a "classic typing problem":

F(x) and f(x) are defined for some domain D -> D, but are not one-to-one
G(x) is an inverse to F(x) for some values in D
g(x) is an inverse to f(x) for some values in D

F, f, G and g are all implemented as user defined functions on some
computer system.  All are pure computation (no side-effects).

What type must x be for G(g(x)) to be meaningful?

An easy way to deal with this problem is issue "message not
understood" where x is invalid.  Note that significant computation (g)
may occur before this situation is recognized.

Sometimes it may be recognized that if x is limited to some sub-domain
d, then G(g(x)) will always be meaningful.  Sometimes the computer
might recognize this, sometimes the programmer might.  Sometimes the
problem is too hard.

Raul Rockwell

olson@lear.juliet.ll.mit.edu ( Steve Olson) (04/09/91)

In article <28937@dime.cs.umass.edu> yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
   In article <OLSON.91Apr6155734@lear.juliet.ll.mit.edu> olson@lear.juliet.ll.mit.edu ( Steve Olson) writes:
   >
   >In article <28924@dime.cs.umass.edu> yodaiken@chelm.cs.umass.edu (victor yodaiken) writes:
   >   The problem with "untyped" expressions is that they are inherently
   >   ambiguous. If I'm writing about  semigroups and regular languages I must
   >   tell you when a*b is concatenation and when it is semigroup addition.
   >   Figuring this out from the context alone is difficult for most people.
   >
   >This sounds more like overuse of operator overloading rather than anything
   >having to do with static vs. dynamic typing.  This problem could occur just as
   >easily in C++ as in CLOS.  If you never want to mix the two types then 
   >renaming the operators will remove all confusion.  If you do want to mix the
   >two types then the problem is the same whether you are using a dynamically
   >typed language or a statically typed sortof-dynamic-typing-through-subclassing-
   >and-virtual-functions language.

   I don't want to have to give up overloading --
   one of the real conveniences of mathematical notation. But if I declare
   variable types, there is no problem.
   So, if I write: let a,b be elements of the alphabet and
   let a',b' be the corresponding elements of the semigroup, then
   a*b and a'*b' are easy to distinguish. By making the types of the
   variables explicit, I make the overloading of * unambiguous. An
   expression a*b' where the two variables have different types makes no
   sense in this context and I'd like the compiler to catch it.

You know, my line about renaming the operators was a little weak; let me
try to do better.  First of all, you may want to mix the two, not in your
core routines, but in support code that manipulates container classes.  This
is the old heterogeneous list problem, already hotly discussed on this list.
Most solutions to this problem from the static typers are based on subclassing
and virtual functions, which, as I orginally noted, lands you back where you 
started.  

Second, using a type system to either provide documentation or to catch
errors (you arguments seem to center around this) is overrated.  Just how
often do you mistakenly confuse variables of totally different types?
Everybody's experience is different, but for me the answer is virtually
never.  My viewpoint is that static typing *creates* the potential for type
errors by demanding all sorts of diddly-squat decisions about is this number a
int,float,double, or whatever.  This can lead to efficiency gains, but it
seems like a far cry from the concept of mathamatical type you have
been talking about in other posts.

I am aware that demanding detailed decisions about storage layout or machine
data-types is not inherent to the concept of static typeing.  I would be 
interested in hearing about a language where you could do something
like declare x to be a "number".  Does such exist?  Efficiency aside, does
there exist a language where a static type system catches lots of errors but
dosen't get in your way the way current examples do?  I don't know myself
-- thats why I read this list!

- Steve Olson
  MIT Lincoln Laboratory
  olson@juliet.ll.mit.edu

brm@neon.Stanford.EDU (Brian R. Murphy) (04/09/91)

In article <boehm.670539603@siria> boehm@parc.xerox.com (Hans Boehm) writes:
>brm@neon.Stanford.EDU (Brian R. Murphy) writes:
>>Unfortunately, given a statically-typed language with higher-order
>>functions and an "all" type, type inference appears to be
>>undecideable.  Thus your statically-typed language _requires_ type
>>declarations, whereas in a dynamically-typed language we can get by
>>without them.
>
>  I think this is getting into an area where we need a bit more precision.
>The argument implied by Brian seems to be that with a dynamically typed
>language and some automatic type inference, we can get similar performance
>to a statically typed language.  I think we are no longer addressing
>reliability issues.

I didn't say anything at all about performance.  I would agree that
with a statically typed language you can achieve higher performance
than with a dynamic language, in general.  Doing certain sorts of type
inference can reduce the performance differential, but it's still true
that statically typed programs will be more efficient in general.

My complaint about statically typed languages is that I _can't_ do
some things in them that I _do_ in dynamically typed languages (such
as Lisp).  For example, I can't I write a function which returns
either a boolean or an integer in a complex way.  I can't write my own
Y combinator.  I can't write a function which allows either a sequence
of functions which range over numbers or a sequence of numbers as an
argument.

Let's consider what a type system might do for me:

(1) It constrains the behavior of primitive functions so the program
doesn't do undefined things.  If "+" when applied to non-numbers has
some undefined behavior, then such applications should be prevented,
or else debugging programs becomes a nightmare.  Any strongly typed
language does this.  [ This is essential, in my opinion, in either
kind of type system. ]

(1') In the case of overloaded primitives, a type system selects which
underlying operation to use on given arguments.  A static type
system allows this to be done at compile time, a dynamic type system
does it at run time, although analysis can do some at compile time. 
[ Static wins here. ]

(2) It constrains the use of procedures that I write so that they
aren't applied to things they weren't intended to apply to.  Thus, I
might declare an argument to have a particular type, and applications
to objects not in that type are prevented.  Statically typed languages
usually _force_ me to make such declarations.  The problem with this
is that often a static type language is neither specific nor general
enough to adequately describe the types in my programs.  If I really
want true safety here, I'd have to have a type language which allows
me to express complex relationships (such as "sorted sequence", for
example).  Thus it must be very specific.  In other cases, argument
types (or certain parts of them) don't matter at all, and I'd like to
be able to omit such declarations.  Thus I want a type system which
  (a) allows me to omit many type declarations (where unnecessary)
  (b) allows me to be very specific in constraining some arguments

With dynamic typing, I can simply write predicates to constrain
arguments/variables (Common Lisp, FL, some implementations of Scheme
do this).

You static typing advocates claim that a static type system can do
this for me, but I claim that it can't.  A type language powerful
enough to constrain arguments anywhere near as precisely as I need
won't allow me to omit many types.  Type inference is only possible
for a limited class of type languages, and they tend to be fairly
weak. In addition, certain programs are forbidden simply because they
utilize types which can't be described by the type language used.  
[ dynamic wins here ]

(3) It shows efficient representations of certain objects.
A static type system allows this.  Given sophisticated analyses, a
dynamic type system allows this in some cases. [ static wins here ]

IN Conclusion, I observe that static typing wins on points (1) and
(3), which are merely performance issues.  Dynamic typing wins big on
point (2), which is a programming expressiveness issue.  Neither is
really adequate, but dynamic typing's problems are merely performance
issues; improved compiler technology will gradually eat away the
performance advantages of static typing (plus you can always just buy
a faster machine).  Static typing's problems seem a little harder to
work around (required declarations, some valid programs forbidden,
constraints on function use overly restrictive/unrestrictive), and
trip up the way I write programs, by forcing me to do certain things
in some ways.  Thus, although I regret the loss of performance, I feel
that dynamic typing is necessary and we should make the best of it.

I'll observe one final point, which is that static type systems which
satisfy point (2) generally allow such constraints to be resolved at
compile time.  This might reduce debugging time, but the expressible
constraints in such systems aren't really powerful enough to provide
any guarantee of program correctness, so thorough testing is still
necessary.  Thus I don't buy the argument that dynamically-typed
languages aren't adequate for production code; the user won't get a
run-time error message if you debugged your code well.  Even if he
does, better that than undefined behavior from a statically typed
program whose producer put too much confidence in the type system to
catch his errors.

					-Brian Murphy
					brm@cs.stanford.edu

augustss@cs.chalmers.se (Lennart Augustsson) (04/09/91)

In article <1593@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>OK, try this.  In mathematics you don't have to describe the types of
>objects unless the type is important.  In statically typed languages,
>you have to declare the type whether it has any relevance or not (or
>fix it at compile time whether it is really knowable or not).  In
>mathematics or dynamically typed languages you can describe the
>concept of a set or a sequence, and operations over those data
>structures, without refering to the "type" of elements.  For example
>in math you could say
>
>  If f is a function and s is a sequence, then map(f,s) is the
>  sequence t such that for all i . t[i] = f(s[i]).
>
>In a Icon (a dynamically typed language) you could define
>
>  procedure map(f,s)
>  local t,i
>  t := list(*s)
>  every i := 1 to *s do t[i] := f(s[i])
>  return t
>  end
>
>Try to write this function in a statically typed language so that it
>has all the generality of the math and Icon versions.  (Actually, the
>math version works for infinite sequences and the Icon one doesn't.
>There are languages that fix that...)  No, forget the challenge.

Sorry, I can't forget it.  I'll just have to make another plug for
polymorhic type deduction.  Here's a version of map in Haskell (ML
would be very similar):

  map f [] = []
  map f (x:xs) = f x : map f xs

This also handles infinite lists (of course).  Haskell is statically
typed, and from the definition of map the compiler deduces (i.e. you 
don't have to write it, but you can if you like):

  map :: (a->b) -> [a] -> [b]

Just because some statically typed languages requires you to declare
the type whether it has any relevance or not does not mean that this is
true for all of them.

	-- Lennart Augustsson
[This signature is intentionally left blank.]

mathew@mantis.co.uk (mathew) (04/09/91)

new@ee.udel.edu (Darren New) writes:
> I don't think it has anything to do with machine representations or
> efficiency.  Your definition would exclude Objective-C, Smalltalk, C++,
> and every other dynamically typed language I know except APL.

When did C++ become dynamically typed?


mathew

--
If you're a John Foxx fan, please mail me!

mpj@prg.ox.ac.uk (Mark Jones) (04/09/91)

In article <1593@optima.cs.arizona.edu> David Gudeman writes:
| OK, try this.  
| [description of a map function which applies a given function to each
|  element of a sequence ... includes both a mathematical version and
|  an Icon version.]
|
|Try to write this function in a statically typed language so that it
|has all the generality of the math and Icon versions.

Well Haskell is a statically typed language and the map function can be
defined very neatly:    

                      map f xs = [ f x | x<-xs ]
 
That's it.  No need for any type declarations (although Haskell does
infer that map has type (a->b) -> [a] -> [b] for any types a and b).
 
As far as satisfying the mathematical definition that you gave:
|  If f is a function and s is a sequence, then map(f,s) is the
|  sequence t such that for all i . t[i] = f(s[i]).
 
In Haskell, the ith element of a sequence xs is written xs!!i.  So your
condition can be written as:   (map f xs)!!i = f (xs!!i).  This property
can be proved by simple structural induction (you will need to make the
assumption that xs has an ith element (i.e. that xs!!i exists); a point
which is implicit in your definition).
 
In other words, the Haskell definitions give you precisely the mathematical
behaviour that you've asked for!

|                                                       (Actually, the
|math version works for infinite sequences and the Icon one doesn't. 
|There are languages that fix that...) 
 
Haskell is one of them.  
 
|                                 In statically typed programming 
|languages the purpose of the types is to let the compiler generate 
|better code, and the nature of the declarations reflects that.  
 
In Haskell, you don't usually need to give any type declarations ...
perhaps some clever compilers will be able to use type declarations to 
generate better code ... but from my point of view, the main uses of 
type declarations are: 
 - documentation ... just knowing the type of a function can give you
   a lot of information about it.
 - consistency checking ... if I choose to give an explicit type 
   declaration, that reflects my intention as a programmer about how 
   an object will behave.  The Haskell type system will let me know 
   if the definition of the object does not agree with my intentions.

|                                                                For
|example, you cannot simply declare something as a "number", you have
|to decide whether you want it represented in floating point format or
|integer, and what size you want.
 
You can do this in (.. you guessed ..) Haskell.  In fact, even numeric
constants are treated as objects of type Num a => a meaning any type a
so long as a is a numeric type (which might be integers, floating points,
complex numbers ... even polynomials if you wanted).
 
Haskell takes this further ... in an expression of the form x+y one does
not have to insist that x,y are both Integers/Floats etc.  All that is
necessary is that they both have the same type a, and that a is a numeric
type.

This posting was not meant to be an advert for Haskell, but while I'm
on the subject, how would you use your Icon program to add one to a list
of lists of integers nss?  I'd be surprised if you could do it more naturally
than Haskell:
 
               map (map succ) nss where succ n = n+1
 
Which brings me onto a final point; if I can solve a problem P more naturally
in a language L1 than I can in a language L2, does that mean that L1 is
better than L2 (or that the type system of L1 is better than that of L2)?
I certainly wouldn't want to claim superiority of Haskell over Icon (or
anything else) in that way.
 
[Best quit now before we get back to first-class functions...]
 
Mark

mhcoffin@tolstoy.waterloo.edu (Michael Coffin) (04/10/91)

In article <1991Apr9.110217.10963@mathrt0.math.chalmers.se> augustss@cs.chalmers.se (Lennart Augustsson) writes:

>...  I'll just have to make another plug for
>polymorhic type deduction.  Here's a version of map in Haskell (ML
>would be very similar):
>
>  map f [] = []
>  map f (x:xs) = f x : map f xs
>
>This also handles infinite lists (of course).

Not to put down Haskell, but in some ways this isn't nearly as
powerful as the Icon version.   Icon lists aren't constrained to be
homogenous, as Haskell lists are.  So while "map" works for any 
Haskell list, it only manages to do so by putting artificial
constraints on what lists can contain.

-mike

nick@cs.edinburgh.ac.uk (Nick Rothwell) (04/10/91)

In article <1593@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David Gudeman) writes:
> In a Icon (a dynamically typed language) you could define
> 
>   procedure map(f,s)
>   local t,i
>   t := list(*s)
>   every i := 1 to *s do t[i] := f(s[i])
>   return t
>   end
> 
> Try to write this function in a statically typed language so that it
> has all the generality of the math and Icon versions.

(me)	- fun map(f, s) =
	    case s of x :: y => f x :: map(f, y)
	            | nil => nil;
(ml)	> val map = fn : ('a -> 'b) * 'a list -> 'b list

Do I get a prize?

> No, forget the challenge.
> Someone is sure to post a solution (or near solution) using some
> baroque system of static declarations from Ada or some wierd C trick
> (I can think of one...).

Not me, however.

	Nick.

-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
                nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
           "I see what you see: Nurse Bibs on a rubber horse."

gudeman@cs.arizona.edu (David Gudeman) (04/11/91)

In article  <1991Apr9.110217.10963@mathrt0.math.chalmers.se> Lennart Augustsson writes:
]In article <1593@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]>  If f is a function and s is a sequence, then map(f,s) is the
]>  sequence t such that for all i . t[i] = f(s[i]).
]>
]>In a Icon (a dynamically typed language) you could define...
]>
]>Try to write this function in a statically typed language so that it
]>has all the generality of the math and Icon versions.

]Sorry, I can't forget it.  I'll just have to make another plug for
]polymorhic type deduction.  Here's a version of map in Haskell (ML
]would be very similar):
]
]  map f [] = []
]  map f (x:xs) = f x : map f xs

I'm getting really tired of pointing this out: the program above does
not have the full generality of the mathematical or the dynamically
typed version.  This is a point I've made several times on several
Haskell and ML programs, and I wish people would get the idea so I
could stop repeating myself.  The statically typed program only works
on structures in which all elements have the same type -- and only
when the compiler can infer that type.

Type inference has some nice features, but it does _not_ give you the
expressive power of dynamic typing.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

mauls@warwick.ac.uk (The Chief Slime Monster) (04/11/91)

In article <8687@skye.cs.ed.ac.uk> nick@lfcs.ed.ac.uk writes:
>
>(me)	- fun map(f, s) =
>	    case s of x :: y => f x :: map(f, y)
>	            | nil => nil;
>(ml)	> val map = fn : ('a -> 'b) * 'a list -> 'b list
>
>Do I get a prize?
>

No, but ML does.

brian@comp.vuw.ac.nz (Brian Boutel) (04/11/91)

In article <1707@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David
Gudeman) writes:
|> In article  <1991Apr9.110217.10963@mathrt0.math.chalmers.se> Lennart
|> Augustsson writes:
|> ]  map f [] = []
|> ]  map f (x:xs) = f x : map f xs
|> 
|> I'm getting really tired of pointing this out: the program above
|> does
|> not have the full generality of the mathematical or the dynamically
|> typed version.  This is a point I've made several times on several
|> Haskell and ML programs, and I wish people would get the idea so I
|> could stop repeating myself.  The statically typed program only
|> works
|> on structures in which all elements have the same type -- and only
|> when the compiler can infer that type.
|> 
|> Type inference has some nice features, but it does _not_ give you
|> the
|> expressive power of dynamic typing.

Let us suppose that we apply a function f to a list in which not all
elements have the same type, and let us suppose further that by some
stroke of luck all the elements of the list "understand" something with
the name f, so in mapping f through the list there is no exception or
failure, and we get a new list. What useful operations can we then
perform on the result list? We know nothing about its members, except
that they all can all be the result of applying a function called f to
something or other, which is not very helpful.

Even if these values can tell you their own types, you can't write code
to deal with all possibilities unless you know in advance what types can
occur, in which case you can use a statically typed language, declaring
a type which is a discriminated union of the possible types in the
list.

Any problem solution can be programmed in a statically typed language,
if the programmer is prepared to *design* the program before writing it.
I would be far more confident trusting my life/safety/money to such a
program than to some piece of hackery written in a language where
"expressive power" is more important than solid engineering principle.

--brian

-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000   Fax: +64 4 712070

cs450a03@uc780.umd.edu (04/11/91)

Brian Boutel writes:
>Let us suppose that we apply a function f to a list in which not all
>elements have the same type, ...  We know nothing about its members,
>except that they all can all be the result of ... f ...
> 
>Even if these values can tell you their own types, you can't write
>code to deal with all possibilities unless you know in advance what
>types can occur, in which case you can use a statically typed
>language, ...
>
>Any problem solution can be programmed in a statically typed
>language, ... union ...  [then complains about run-time errors]

Eh...?  You might as well say that any dynamically typed language is a
statically typed language.

My experience is that I can develop a program about 6 times as fast in
a dynamically typed language (APL in my case) than I can in a
statically typed language (C in my case--I'd be even slower in
FORTRAN).  This is anecdotal, but you've chosen to ignore dozens of
postings giving reasons why this might be so.  I've seen postings
which indicate other dynamically typed languages (Icon, Smalltalk)
have similar advantages.

Incidentally, I'd have to say C takes the cake as having the WORST
run-time-error behavior of any language I've seen (besides machine
language/assembly language).  Please don't confuse the sort of things
C does with its "dynamic types" with the sort of things that happen in
a "true" dynamically typed language.

Raul Rockwell

nick@cs.edinburgh.ac.uk (Nick Rothwell) (04/11/91)

In article <1707@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David Gudeman) writes:
> This is a point I've made several times on several
> Haskell and ML programs, and I wish people would get the idea so I
> could stop repeating myself.

Sorry. I agree with your point, but your mention of maths confused me. You
said

>  If f is a function and s is a sequence, then map(f,s) is the
>  sequence t such that for all i . t[i] = f(s[i]).

Now, using such general terms as "function" and "sequence" in a mathematical
context, I assumed your were implying a homogeneous list. If you're assuming
heterogeneous sequences, then the kind of mathematics you're dealing with
gets more complicated than I assumed. I think.

If you're into the world of heterogeneous types, then you have to address
the situation where the function does something which makes no sense for
one of the elements (I agree that applying lambda x. (x :: nil) will work
for heterogeneous lists with more generality than ML or Haskell can
typecheck) - but then, as I say, the mathematics is getting complicated.

But, I dropped maths more than ten years ago...

> The statically typed program only works
> on structures in which all elements have the same type -- and only
> when the compiler can infer that type.

Er, the last sentence isn't true (if I were to nit-pick...). Given the
ML/Haskell definition of map, I can then write

	- fun double list = map (lambda x. (x, x)) list

to turn a list into a list of pairs. The compiler still doesn't know the
type of the list (beyond a general polymorphic type) but it'll work anyway.
The list is, of course, homogeneous.

> Type inference has some nice features, but it does _not_ give you the
> expressive power of dynamic typing.

Fair enough. I choose the benefits of static typing (no runtime errors,
plus the ability to build specifications, interfaces and datatypes) and
find the expressive power denied to me to be no handicap. I have no
problem with other peoples' views, applications, whatever.

So. End of thread, right? :-)

	Nick.

-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
                nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
           "I see what you see: Nurse Bibs on a rubber horse."

cs450a03@uc780.umd.edu (04/12/91)

Nick Rothwell writes:
>Fair enough. I choose the benefits of static typing (no runtime errors,
                                                      ^^^^^^^^^^^^^^^^^
Seriously?  How is this accomplished?

Raul Rockwell

olson@lear.juliet.ll.mit.edu ( Steve Olson) (04/12/91)

In article <1991Apr11.053440.13401@comp.vuw.ac.nz> brian@comp.vuw.ac.nz (Brian Boutel) writes:
   Let us suppose that we apply a function f to a list in which not all
   elements have the same type, and let us suppose further that by some
   stroke of luck all the elements of the list "understand" something with
   ^^^^^^^^^^^^^^
   the name f, so in mapping f through the list there is no exception or
   failure, and we get a new list. What useful operations can we then
   perform on the result list? We know nothing about its members, except
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   that they all can all be the result of applying a function called f to
   something or other, which is not very helpful.

Thats ridiculous.  Your argument assumes a bad design.  No language outlaws
stupidity.  Do you believe that Lisp programmers go around making up lists
of completely random elements just to make things challanging?

Consider f=differentiate-symbolic-expression, f=print (the result 
being the screen spaced used), f=insert-in-symbol-table, f=eval, 
f=window-system-operation, f=describe-object, f=size-of, 
f=database-query-pattern-match, ....

   Even if these values can tell you their own types, you can't write code
   to deal with all possibilities unless you know in advance what types can
   occur, in which case you can use a statically typed language, declaring
   a type which is a discriminated union of the possible types in the
   list.

Of course you can use a statically typed language.  The question is how much
programmer resources (usually more) vs. how much machine resources (usually
less) will the static solution consume?  The choice depends on the application.

   Any problem solution can be programmed in a statically typed language,
   if the programmer is prepared to *design* the program before writing it.

Any problem solution can be programmed in a statically typed language, period.
So?  Are you trying to imply that using a dynamically typed language somehow
prevents one from designing?

   I would be far more confident trusting my life/safety/money to such a
   program than to some piece of hackery written in a language where
   "expressive power" is more important than solid engineering principle.

Dynamic typing implies programs are a "piece of hackery"?  Expressive power
is somehow incompatable with solid engineering principle?  Many (most?) 
large complex programs written in statically typed languages have little
bits of roll-your-own dynamic typing in them.  This is a prime source of
"hackery".  Screw up one of those explicit type tags and there is no
end to the troubles you face.

   --brian

- Steve Olson
  MIT Lincoln Laboratory
  olson@juliet.ll.mit.edu

brian@comp.vuw.ac.nz (Brian Boutel) (04/12/91)

In article <11APR91.08052192@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:

|> 
|> My experience is that I can develop a program about 6 times as fast in
|> a dynamically typed language (APL in my case) than I can in a
|> statically typed language (C in my case--I'd be even slower in
|> FORTRAN).  This is anecdotal, but you've chosen to ignore dozens of
|> postings giving reasons why this might be so.  I've seen postings
|> which indicate other dynamically typed languages (Icon, Smalltalk)
|> have similar advantages.
|> 

You might be familiar with Fred Brookes' book "The Mythical Man Month".
In this he says

	Productivity seems constant in terms of elementary statements,...

        Programming productivity may be increased by as much as five 			
times when a suitable high-level language is used

So APL would be expected to lead to greater productivity than C because
APL programs are shorter than C programs. And this is not do do with the
absence of declarations. Anyway, static typing does not imply a
requirement to declare everything.

Besides, my concern is with reliability, safety, not how fast you can
hack something together. Let's talk about the software engineering
aspects of this topic.

|> Incidentally, I'd have to say C takes the cake as having the WORST
|> run-time-error behavior of any language I've seen (besides machine
|> language/assembly language).  Please don't confuse the sort of things
|> C does with its "dynamic types" with the sort of things that happen in
|> a "true" dynamically typed language.
|> 

Did I even mention C?


--brian

-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000   Fax: +64 4 712070

cs450a03@uc780.umd.edu (04/12/91)

Brian Boutel  >
Me            >|

>| My experience is that I can develop a program about 6 times as fast
>| in a dynamically typed language ... [and more anecdotal stuff]

>You might be familiar with Fred Brookes' book "The Mythical Man Month".
>In this he says
>       Productivity seems constant in terms of elementary
>       statements,...  Programming productivity may be increased by
>       as much as five times when a suitable high-level language is
>       used
>So APL would be expected to lead to greater productivity than C
>because APL programs are shorter than C programs. And this is not do
>do with the absence of declarations. Anyway, static typing does not
>imply a requirement to declare everything.

(1)  Since when are declarations not elementary statements?
(2)  While static typing does not require that you declare everything,
     it does require declarations.  
(3)  Static typing tends to make expensive the kind of generalization
     that I've seen in high-level languages.

>Besides, my concern is with reliability, safety, not how fast you can
>hack something together. Let's talk about the software engineering
>aspects of this topic.

>| Incidentally, I'd have to say C takes the cake as having the WORST
>| run-time-error behavior of any language I've seen (besides machine
>| language/assembly language).  Please don't confuse the sort of
>| things C does with its "dynamic types" with the sort of things that
>| happen in a "true" dynamically typed language.

>Did I even mention C?

Wow, that was a fun talk about the software engineering aspects of
this topic ;-)  (And I left out FORTH again .. oops)

But no, you (Brian Boutel) didn't mention C.  (Or any other language.)
A number of other people did though.

Just for fun, the software engineering aspects of this topic:

(1) runtime vs purely static type checking
(2) local declarations (e.g. functions) vs nonlocal declarations (e.g.
    storage allocation for later operations).
(3) degrees of abstraction (can we look at operation blarg() as pure
    computation, or must it have side effects and/or be effected by
    other side effects).

You can carry any of these concepts to heights of absurdity which I
don't care to contemplate...  

Raul Rockwell

sfk@otter.hpl.hp.com (Steve Knight) (04/13/91)

Nick Rothwell writes:
> If you're into the world of heterogeneous types, then you have to address
> the situation where the function does something which makes no sense for
> one of the elements [...]

Unless I misunderstand Nick, this seems to be nothing more complex than the
business of dealing with partial functions.

Steve

gudeman@cs.arizona.edu (David Gudeman) (04/15/91)

In article  <8742@skye.cs.ed.ac.uk> Nick Rothwell writes:
]
]Now, using such general terms as "function" and "sequence" in a mathematical
]context, I assumed your were implying a homogeneous list. If you're assuming
]heterogeneous sequences, then the kind of mathematics you're dealing with
]gets more complicated than I assumed. I think.

Why would you assume that?  Math makes no distinction between
homogenous and heterogeneous lists.  What complication are you talking
about?  It is trivial to define a heterogeneous list in math:

Let S be the sequence beginning with S[1] defined as follows:
  if i is odd then S[i] = i
  if i is even then S[i] = {i, i-1}

It is equally trivial to define a function to act on heterogenous
elements:

Let f x be defined as follows:
  if x is an integer then x
  else if x is a set then the min element of x

All of this can be programmed directly in a language with dynamic
typing and requires extra effort in a language with static typing.
My main criterion for the expresiveness of a language is "does it let
me say what I want to directly, instead of making me work around
limitations in the language?"
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

cs450a03@uc780.umd.edu (04/15/91)

Brian Boutel writes:

>First, let me try to define what I mean ...
>- A type is a set of operations.
>- a value has type T if it understands all the operations of type T

Thus a value can have multiple types?

Note that 0, 1, 33, -200 and 2000000000 might all share some types in
common, yet not share other types.

>- a type violation occurs when there is an attempt to apply to a value an
>  operation which is not in its type.

At run-time, although some values may be computed by the compiler.

>- Languages can be classified as untyped, dynamic typed, static typed.
>- Untyped means that it is solely the programmer's responsibility to
>ensure that no type violation occurs. Failure to ensure this will not
>be detected at compile time or by the system at run-time. ...

I'm a little foggy here...  If a language has some run-time checking,
and some compile time checking, and doesn't catch some things, how
would you classify it.

[e.g. C on a machine that has some arithmetic exception hardware.]

>- Dynamic typed means that values/objects contain type information at
>run-time, so can detect attempted type violations and respond. This
>is fine unless the response is unanticipated.

The response is unanticipated?  I presume you mean when the programmer
did not intend to apply a particular function to some particular
datum? 

>- Static typed means that a type is associated at compile time with
>each variable. (In a functional language, which is what interests me,
>this normally means with each defined function and by implication its
>formal parameters.)  It is possible at compile time to ensure that a
>value is never bound to a variable of the wrong type, and that only
>type-valid operations are performed on variables, thus preventing
>type violations.

How is this possible?  Especially in a machine with finite limits?
Note that arithmetic overflow means that addition can produce values
which don't have the type for addition.  You can't use subtraction to
generate a value for a function which does not accept 0, or negative
numbers, unless static analysis of the code reveals that this is ok.
Multiplication hits the overflow problem even harder.  Etc.

And that's just basic arithmetic.

Raul Rockwell

cs450a03@uc780.umd.edu (04/15/91)

Brian Boutel writes:

>The kind of languages I prefer, e.g. ML, Haskell, do not require
>declarations of variables or functions. Types are inferred from the
>context. (It may sometimes be pregmatically desirable to include some
>declarations to limit the generality of deduced types for purposes of
>run-time efficiency). 

Curious.  Sounds to me like ML and Haskell are [at least partially]
dynamically typed.  [I presume by "do not require declarations" you
mean "do not require declaration of 'type' or 'storage class'" -- I
can't quite see you not having to declare any values :-) ]

>I'm not objecting to dynamic typed languages, in fact I quite like
>them, and occasionally use them. I do, however, react badly to some
>of the claims made about their superiority, claims which ignore the
>most important issues, and advocate use which, at the present time,
>increases the risk to the public of computer systems.

Maybe if I didn't use the phrase "dynamic type" ?

I claim that a language that inserts run-time checks on the values
which are applied to functions, and only removes these checks where
static analysis of the code shows them to be redundant, is going to
catch more errors than a language which does not.

I claim that a language which allows a programmer to specify an
operation concisely is going to result in better productivity (in
terms of bug-free functionality produced) than a language which always
requires the programmer to deal with trivial details of the
implementation.

If that means I'm arguing for ML, so be it :-)  One of these days,
I might just learn it and find out for myself.  But note that I'm not
argueing for any particular language -- more a class of languages, and
a coding style.

Do you react badly to that?

Raul Rockwell

brian@comp.vuw.ac.nz (Brian Boutel) (04/15/91)

In article <OLSON.91Apr12012507@lear.juliet.ll.mit.edu>,
olson@lear.juliet.ll.mit.edu ( Steve Olson) writes:
|> [about something I wrote]
 
|> Thats ridiculous.  Your argument assumes a bad design.  No language
|> outlaws
|> stupidity.  Do you believe that Lisp programmers go around making up
|> lists
|> of completely random elements just to make things challanging?
|> 

The ability to do just this has been touted by others as one of the
advantages of dynamic typing. My point was that if they do not do this,
they could use static typing, as you yourself say.

|> Of course you can use a statically typed language.  The question is how much
|> programmer resources (usually more) vs. how much machine resources (usually
|> less) will the static solution consume?  The choice depends on the
|> application.
|> 

Again we agree -(see below)

|>[me]    Any problem solution can be programmed in a statically typed
language,
|>       if the programmer is prepared to *design* the program before
|> writing it.
|> 
|> Any problem solution can be programmed in a statically typed
|> language, period.
|> So?  Are you trying to imply that using a dynamically typed language
|> somehow
|> prevents one from designing?
|> 

It's a strange kind of logic that has my claim imply what you suggest.
There is, though a conection between the design effort I associated with
using static types and the "programmer resources (usually more)" that
you associate with static types.


|>  Many (most?) 
|> large complex programs written in statically typed languages have little
|> bits of roll-your-own dynamic typing in them.  This is a prime source of
|> "hackery".  Screw up one of those explicit type tags and there is no
|> end to the troubles you face.

This may well be true, but what does it show? A lack of design effort.

Let me make a statement - up to now I have just responded to other postings.
First, let me try to define what I mean (sorry if this is boring).
- A type is a set of operations.
- a value has type T if it understands all the operations of type T
- a type violation occurs when there is an attempt to apply to a value an
  operation which is not in its type.
- Languages can be classified as untyped, dynamic typed, static typed.
- Untyped means that it is solely the programmer's responsibility to
ensure that
  no type violation occurs. Failure to ensure this will not be detected
at
  compile time or by the system at run-time. Usually the code
implementing the
  operation is applied to the bit pattern representing the value as
though that
  pattern represented a value of the appropriate type for the operation,
with
  serious results. E.g. doing a floating point add to an integer value.
- Dynamic typed means that values/objects contain type information at
run-time,
  so can detect attempted type violations and respond. This is fine
unless the
  response is unanticipated.
- Static typed means that a type is associated at compile time with each
  variable. (In a functional language, which is what interests me, this
normally
  means with each defined function and by implication its formal
parameters.)
  It is possible at compile time to ensure that a value is never bound
to a
  variable of the wrong type, and that only type-valid operations are
performed
  on variables, thus preventing type violations.

You made a trade-off between development costs and run-time costs. Fine,
there is nothing wrong with that. I prefer to note the trade-off between
development cost and run-time unreliability. I.e. it's better, in an
important application, to take the extra time in development to protect
against run-time type violations. This is not a complete guarantee of
program correctness, but it is a step in the right direction.

If I had to accept liability for damages resulting from the failure of a
software system, which had to be developed from scratch, I would, other
things being equal, (programmer skills and experience, etc), insist on
the use of a static typed language. Since I moved from industry to
Academia, my own programming efforts carry no such burden of
responsibility, so I could freely use lisp or apl or smalltalk in my own
little world, but I believe my responsibility to students, most of whom
will become practitioners in industry, is to encourage them to use safe
practices, or "solid engineering principle". 

--brian
-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000   Fax: +64 4 712070

brian@comp.vuw.ac.nz (Brian Boutel) (04/15/91)

In article <12APR91.08192346@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
  [in response to something I said]
|> 
|> (1)  Since when are declarations not elementary statements?
|> (2)  While static typing does not require that you declare everything,
|>      it does require declarations.  

The kind of languages I prefer, e.g. ML, Haskell, do not require
declarations of variables or functions. Types are inferred from the
context. (It may sometimes be pregmatically desirable to include some
declarations to limit the generality of deduced types for purposes of
run-time efficiency). So there is really no difference between volumes
of declarations required in these and in dynamic typed languages like
Smalltalk.

|> (3)  Static typing tends to make expensive the kind of generalization
|>      that I've seen in high-level languages.
|> 

I'm unsure what you mean. Are you saying that statically typed languages
are not high-level?

I would distinguish between "generalization" and "useful
generalization". I'm sure you can show me some clever code, but do I
need it at the cost of the additional safety that I get from using a
language which disallows a large class of erroneous programs that might
otherwise get into production.


|> Just for fun, the software engineering aspects of this topic:
|> 
|> (1) runtime vs purely static type checking
|> (2) local declarations (e.g. functions) vs nonlocal declarations (e.g.
|>     storage allocation for later operations).
|> (3) degrees of abstraction (can we look at operation blarg() as pure
|>     computation, or must it have side effects and/or be effected by
|>     other side effects).
|> 
|> You can carry any of these concepts to heights of absurdity which I
|> don't care to contemplate...  
|> 
|> Raul Rockwell

Perhaps some more contemplation would help?

Engineering is about reliability, about using established techniques
with a solid underlying theoretical foundation, about safety margins.
For me, these are the important issues. Nice new ideas have their place,
but it is not in production systems. 

I'm not objecting to dynamic typed languages, in fact I quite like them,
and occasionally use them. I do, however, react badly to some of the
claims made about their superiority, claims which ignore the most
important issues, and advocate use which, at the present time, increases
the risk to the public of computer systems.

--brian
-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000   Fax: +64 4 712070

nick@cs.edinburgh.ac.uk (Nick Rothwell) (04/15/91)

In article <11APR91.19025118@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
> Nick Rothwell writes:
> >Fair enough. I choose the benefits of static typing (no runtime errors,
>                                                       ^^^^^^^^^^^^^^^^^
> Seriously?  How is this accomplished?

Oh nuts, my fault for being too hasty.

No runtime *type* errors. Runtime errors are restricted to a small set of
exception conditions defined by the language. These can be caught and
handled, and the exception mechanism allows you to define you own exception
values and raise and handle them. I'm talking about ML here, btw.

	Nick.

-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
                nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
           "I see what you see: Nurse Bibs on a rubber horse."

anw@maths.nott.ac.uk (Dr A. N. Walker) (04/16/91)

In article <1366@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman)
replied rather peevishly to my <1991Mar29.191210.9369@maths.nott.ac.uk>:

	[On April 1st!  Thanks to some time warp, DG's article has only just
reached sunny Nottingham.]

	Well, you don't want to read it again.  I'm happy to let readers
decide which of us is the more confused about Algol and its implementation.

One or two more substantive points:

>]Algol 60:  Intended primarily as a descriptive language, and certainly
>]	not efficient (call by name, dynamic typing
>                                     -------
>I assume you meant "static".

	You omitted to quote the "smiley" & the examples I gave.  "2^j"
is an integer if "j" is +ve, a real if it's -ve, so the type must be
handled dynamically.  In "f(2)", the compiler cannot tell, in general,
whether "2" is a number or a label, so must provide enough information
for the called routine to handle both [in real life, most (?all) actual
implementations banned numeric labels].  More generally, a procedure which
is passed as a parameter cannot have its parameters specified, so that
uses of such parameters have dynamic type.

>]	 recursive procedures, or dynamic arrays,
>
>Both features can be compiled.

	That is obvious with hindsight.  I have a vague memory that
recursive procedures only got into Algol 60 by oversight;  on the last
day, someone realised that they hadn't been *excluded*, and there was a
furious debate with arguments about whether it was possible or useful,
on the successful conclusion of which they were left in.  Anyone know
better, or have I just started an urban legend?  In any case, dynamic
*own* arrays are a pig to compile;  again, they were [?universally]
omitted from implementations.

>I should have been more specific: concatenatable, sectionable strings
>with automatic storage management.  Those pathetic things that
>statically typed languages call "strings" are really character arrays.

	Algol 68 is statically typed;  its strings have all those properties.

>Mathematicians had thought [graphs, eg] were useful for decades or centuries,

	Perhaps so, and there may have been lots going on at the research
level, but graphs were not part of the normal armoury of the typical maths
graduate or postgraduate in 1960.  Even today, I would guess that most
mathematicians who avoid the specialist pure or computing courses will learn
rather little about graphs.  I checked some of my textbooks [mostly 60's
vintage].  I couldn't find a mention of graphs in *any* of the maths texts.
I recall that in about 1974, a [reasonably distinguished] colleague was
berated in public by his [rather more distinguished] head of department
for wasting his time researching in graph theory -- "Time you did some
*real* maths".  I think that expecting graphs to appear in a computing
language mainly intended for expressing numerical algorithms in 1960 is
simply anachronistic.

	DG's assertion started out in an admirably succinct form, with
two parts:  (i) Algol is a low-level language, (ii) into which static
typing was introduced as an efficiency hack.  I can make sense of this,
but strongly disagree with it.

	He now says that by Algol he really means "the design philosophy
that resulted in Algol 60", and [as far as I can make out] that by
efficiency he really means the frame of mind of "Type A designers"
(concerned with efficiency, as opposed to type B expressivists);  he
adds

>				    Thus [type A languages] needed
>static typing, [...]
>Algol 60 clearly and unambiguously fits into type A.  It is true that
>_within the type A camp_, they were somewhat cavalier about
>efficiency, [...]

	But Algol 60 was designed expressly to be machine *in*dependent,
in so far as its designers were able.  People were fed up with having to
invent a new high-level language for every new computer.  The results may
not have been as untainted with hardware considerations as DG would like,
but the *intent* and the *philosophy* was high-level.  And assertion (ii)
now seems to read that static typing was introduced as an efficiency hack
into a language that needed static typing by people who were cavalier, as
far as languages that need static typing go, about efficiency.  I can
discern a dim glimmer of meaning behind this;  but I just don't agree
with it.  Perhaps that's a good place to stop.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

cs450a03@uc780.umd.edu (04/16/91)

Brian Boutel   >|>  or >
Me             <|

>|>First, let me try to define what I mean ...
>|>- A type is a set of operations.
>|>- a value has type T if it understands all the operations of type T
<| Thus a value can have multiple types?
>Yes, within a type hierarchy, a value may have a type and that type's
>supertype and so on upwards.

I'm not sure if a "type hierarchy" is always appropriate.  Consider
the following functions: factorial, sine, reciprocal, log, addition,
and multiplication.  Describe a type hierarchy which allows me to use
each of these functions (or operators, if you prefer) in a meaningful
fashion, without arbitrary restrictions.  I don't see how you can get
away from type errors occuring at runtime.

What I'm trying to say is that a general purpose function will quite
likely be able to generate different values, which have different
types.  This occurs because the definition of "type" deals with what
happens when a function is _applied_ to data.  Type tags are
implementation specific, and are not required to get run-time type
errors.

Maybe you're wondering what good it does it do to have a run-time
error?  Assuming you don't have some special case procedure to handle
a specific error (like renormalization *gack* *gack* *cough* *gasp*),
you could maybe back up to the last user input (or some generic user
prompt), and suggest, wistfully, that the user trys something else.
Much better than spewing out garbage, or crashing the program.

>|>It is possible at compile time to ensure that a value is never
>|>bound to a variable of the wrong type, and that only type-valid
>|>operations are performed on variables, thus preventing type
>|>violations.
<| How is this possible?  Especially in a machine with finite limits?
<| Note that arithmetic overflow means that addition can produce values
<| which don't have the type for addition.
>This is the cause of the misunderstanding between us. I was not
>including arithmetic exceptions such as overflow or divide by zero as
>type violations. ...

>So my type system has deficiencies, but it can still catch a lot of
>errors at compile time that might otherwise get through testing.  I
>think that is worthwhile.

I think you're on weak ground here.

Essentially, you seem to be saying:  subtle errors I don't consider,
but blatantly obvious errors are worth catching.

I also think that if you were working with a system which gave better
support to run-time testing, you might catch more errors in testing.

Finally, a statement which is formally accepted by the compiler is not
necessarily a statement which is correct in the context of a specific
program.  If a language cuts the number of branches in half (by
allowing the same statement to deal with more cases), _each_ branch
removed cuts the testing time roughly in half (or by n, in the case of
an n-way branch).  A program which has 24 independent branches may be
impossible to test adequately.  A program with no branches can be
almost trivial to test.

I should mention that I spend much of my time tracking down and
squashing lingering race conditions left by long-departed programmers.
This kind of problem, in particular, seems to be inadequately
addressed by static type checking (and I am relieved that I don't have
to bother with it when I don't want to).

Raul Rockwell

brian@comp.vuw.ac.nz (Brian Boutel) (04/16/91)

In article <15APR91.00191631@uc780.umd.edu>, cs450a03@uc780.umd.edu writes:
|> Brian Boutel writes:
|> 
|> >First, let me try to define what I mean ...
|> >- A type is a set of operations.
|> >- a value has type T if it understands all the operations of type T
|> 
|> Thus a value can have multiple types?
|> 

Yes, within a type hierarchy, a value may have a type and that type's
supertype and so on upwards.

|> 
|> >- Languages can be classified as untyped, dynamic typed, static typed...

|> I'm a little foggy here...  If a language has some run-time checking,
|> and some compile time checking, and doesn't catch some things, how
|> would you classify it.
|> 
|> [e.g. C on a machine that has some arithmetic exception hardware.]
|> 

I wuld classify C as statically typed, but see below.

|> 
|> >- Static typed means that a type is associated at compile time with
|> >each variable. (In a functional language, which is what interests me,
|> >this normally means with each defined function and by implication its
|> >formal parameters.)  It is possible at compile time to ensure that a
|> >value is never bound to a variable of the wrong type, and that only
|> >type-valid operations are performed on variables, thus preventing
|> >type violations.
|> 
|> How is this possible?  Especially in a machine with finite limits?
|> Note that arithmetic overflow means that addition can produce values
|> which don't have the type for addition.  You can't use subtraction to
|> generate a value for a function which does not accept 0, or negative
|> numbers, unless static analysis of the code reveals that this is ok.
|> Multiplication hits the overflow problem even harder.  Etc.
|> 
|> And that's just basic arithmetic.
|> 
|> Raul Rockwell

This is the cause of the misunderstanding between us. I was not
including arithmetic exceptions such as overflow or divide by zero as
type violations.
In the type systems I am talking about, the type Integer supports the
integer divide operation, (say, "div"), and the type of div is 
Integer X Integer -> Integer
This does not take account of the fact that the type of div should
really be 
Integer X NonZeroInteger -> Integer, where NonZeroInteger is a subtype
of Integer. So my type system has deficiencies, but it can still catch a
lot of errors at compile time that might otherwise get through testing.
I think that is worthwhile.

--brian

 
-- 
Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000   Fax: +64 4 712070

gudeman@cs.arizona.edu (David Gudeman) (04/16/91)

In article  <8872@skye.cs.ed.ac.uk> Nick Rothwell writes:
]
]No runtime *type* errors. Runtime errors are restricted to a small set of
]exception conditions defined by the language.

The only runtime error you have eliminated with static type checking
is a "message not understood" or "domain error".  The elimination of
this single type of error hardly seems to justify the limitations on
expressiveness -- particularily since this is the easiest type of
error to find by testing.

The only reason you can kind find type errors at compile time is
because the errors are so trivial that the test is decidable.  How
much are you willing to give up in expressiveness to find the most
trivial and obvious programming error a few minutes earlier?
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

wallace@hpdtczb.HP.COM (David Wallace) (04/16/91)

> brian@comp.vuw.ac.nz (Brian Boutel) /  9:34 pm  Apr 10, 1991 /
> Any problem solution can be programmed in a statically typed language,
> if the programmer is prepared to *design* the program before writing it.
> I would be far more confident trusting my life/safety/money to such a
> program than to some piece of hackery written in a language where
> "expressive power" is more important than solid engineering principle.

I have heard there's a quote attributed to von Neumann to the effect that any
competent programmer shouldn't need to use floating point - if you understand
the problem well enough and *design* the program appropriately, you should be
able to use appropriately scaled fixed point values throughout.  (Someone
with a better set of references than I might be able to trace the original
quote.  I've just heard it as oral folklore.)

Anyway, programmers (and users) find floating point rather useful.  Think of it
as "dynamic scaling" vs. "static scaling."  It may not be a perfect analogy,
but it's worth thinking about.  Sometimes not having to anticipate every detail
up front can give you significantly more flexibility.  (But if it's a
life-critical application, I would still like the floating-point algorithms
designed and/or checked by a competent numerical analyst.  There are
reliability issues associated with floating-point, too.)

Dave W.		(david_wallace@hpdtl.ctgsc.hp.com)

mathew@mantis.co.uk (mathew) (04/16/91)

cs450a03@uc780.umd.edu writes:
> Curious.  Sounds to me like ML and Haskell are [at least partially]
> dynamically typed.  [I presume by "do not require declarations" you
> mean "do not require declaration of 'type' or 'storage class'" -- I
> can't quite see you not having to declare any values :-) ]

What ML actually does is to infer types from information given implicitly.
For example, if you define a function f(x) = x + x, we can infer that f(x) is
a function from some type 'a to a value of the same type 'a ( 'a -> 'a ),
since we know that + is of type 'b * 'b -> 'b.

If we then define a function g(y) = f(f(y)), we can infer that y must be of
type 'a (in order to be a valid argument for f()) and hence that g(y) is a
function 'a -> 'a.

If we then do  z = sin(g(0.2)), we can infer that type 'a must in fact be
a floating-point number, since we know that sin is of type float -> float.

Using such rules, ML cleverly deduces the type of functions without your ever
having to explicitly state the type. It then does static compile-time type
checking. (Apologies to ML experts if I've glossed over anything or made any
silly errors; I haven't used ML in quite a while.)

In order to allow this sort of deduction to take place, you must place
certain limitations on what is allowed. For example, if you were to allow
heterogeneous lists, you would be unable to deduce a single type for any
list, leading to dead-ends in the type deduction.

Sometimes it can be necessary to state a type explicitly, if there isn't
enough information in the code for ML to deduce what you intend.

It is always permissible to state the type explicitly if you wish to do so;
although doing so removes most of ML's benefits.

> I claim that a language that inserts run-time checks on the values
> which are applied to functions, and only removes these checks where
> static analysis of the code shows them to be redundant, is going to
> catch more errors than a language which does not.

Indeed. One of the most stupid things which people do is to remove all the
run-time checking from released versions of programs, even when the programs
are not time-critical.

> I claim that a language which allows a programmer to specify an
> operation concisely is going to result in better productivity (in
> terms of bug-free functionality produced) than a language which always
> requires the programmer to deal with trivial details of the
> implementation.

Agreed. I also think that a language which allows features such as (for
example) heterogeneous lists is, in certain applications, going to result in
enormously greater programmer productivity. Why use unions and implement your
own type system using tags, when you can use a type system which is already
there?

If only we could solve the problem of generalized lambda-2 typing, we might be
able to have a statically-typed language which was as powerful as Lisp...

mathew

--
If you're a John Foxx fan, please mail me!

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/16/91)

In article <1883@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <8742@skye.cs.ed.ac.uk> Nick Rothwell writes:
>]
>]Now, using such general terms as "function" and "sequence" in a mathematical
>]context, I assumed your were implying a homogeneous list. If you're assuming
>]heterogeneous sequences, then the kind of mathematics you're dealing with
>]gets more complicated than I assumed. I think.
>
>Why would you assume that?  Math makes no distinction between
>homogenous and heterogeneous lists.  What complication are you talking
>about?  It is trivial to define a heterogeneous list in math:
>
>Let S be the sequence beginning with S[1] defined as follows:
>  if i is odd then S[i] = i
>  if i is even then S[i] = {i, i-1}
>
>It is equally trivial to define a function to act on heterogenous
>elements:
>
>Let f x be defined as follows:
>  if x is an integer then x
>  else if x is a set then the min element of x
>
>All of this can be programmed directly in a language with dynamic
>typing and requires extra effort in a language with static typing.
>My main criterion for the expresiveness of a language is "does it let
>me say what I want to directly, instead of making me work around
>limitations in the language?"
>--


Your example seems to involve the use of type definitions. If I now write
f((1,2,3,4)) the compiler should be able to inform me that I have
attempted to apply "f" to a list, but "f" is only defined on sets and
integers.

anw@maths.nott.ac.uk (Dr A. N. Walker) (04/16/91)

In article <1991Apr9.021700.2688@neon.Stanford.EDU>
brm@neon.Stanford.EDU (Brian R. Murphy) writes:

>My complaint about statically typed languages is that I _can't_ do
>some things in them that I _do_ in dynamically typed languages (such
>as Lisp).

	But your examples are perfectly OK in *some* statically typed
languages, which suggests that it's a problem with your specific languages
rather than with typing.

>	    For example, I can't I write a function which returns
>either a boolean or an integer in a complex way.

	proc fred = union (bool, int): if random < 0.5 then true else 1 fi;
  # eg, #
	print ("found a" + case fred in (bool): " bool" out "n int" esac)

>						   I can't write my own
>Y combinator.

	Pass.

>		I can't write a function which allows either a sequence
>of functions which range over numbers or a sequence of numbers as an
>argument.

	mode integer = union (int, long int),
	     fraction = struct (integer numerator, denominator),
	     number = union (integer, real, long real, fraction);
	proc (number) number bert = (skip);

	proc jim = (union ([] proc (number) number, [] number) a) void: (skip);
  # eg, #
	jim (1); jim (pi); jim ((1, pi)); jim ((bert, bert, bert));
		# no guarantees that these will compile without casts! #

What you are asking for is almost the standard "print" specification,

	proc print = ([] union (proc (ref file) void, int, real, etc)): ...,

which is defined that way so that you can print arbitrary sequences of
printable things, like arrays, strings, numbers, interleaved with
procedures like "newline" that move around the output without printing.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

yodaiken@chelm.cs.umass.edu (victor yodaiken) (04/16/91)

In article <OLSON.91Apr9012119@lear.juliet.ll.mit.edu> olson@lear.juliet.ll.mit.edu ( Steve Olson) writes:
>
>I am aware that demanding detailed decisions about storage layout or machine
>data-types is not inherent to the concept of static typeing.  I would be 
>interested in hearing about a language where you could do something
>like declare x to be a "number".  Does such exist?  Efficiency aside, does
>there exist a language where a static type system catches lots of errors but
>dosen't get in your way the way current examples do?  I don't know myself
>-- thats why I read this list!

Ignoring, for just a moment, the always fascinating topic of 
lisp versus C, this seems to get at what I want to understand from this
discussion: namely, is there something inherent in "static" typing
and type checking that makes it so tedious and baroque? Is it not
possible to have a programming language in which types are used as in
mathematics to clarify, disambiguate, and catch errors? I do not
buy the argument that there is a real plus in being able to write
expressions that may not have any meaning. That is, if I write
$f(X)$ I should be able to ensure, during design time, that 
f is defined over all possible instantiations of X. Why, one would
find it reasonable to do otherwise is beyond me.
Sorry if this is just a result of ignorance.
On the other hand, the clumsy type hierarchies of the programming
languages that I am familiar with, are not that impressive either. 
It is not clear that "types" should form a "hierarchy" at all:
integers are not simply a stripped down version of reals, the
ring Z/3 is not just a smaller version of the ring Z/170, etc. etc.

dmg@ssc-vax (David M Geary) (04/17/91)

David Gudeman writes:

]In article  <8872@skye.cs.ed.ac.uk> Nick Rothwell writes:
]]
]]No runtime *type* errors. Runtime errors are restricted to a small set of
]]exception conditions defined by the language.

]The only runtime error you have eliminated with static type checking
]is a "message not understood" or "domain error".  The elimination of
]this single type of error hardly seems to justify the limitations on
]expressiveness -- particularily since this is the easiest type of
]error to find by testing.

  Yes, a statically typed language, of course, still leaves all kinds of holes
for all kinds of bugs...

]The only reason you can kind find type errors at compile time is
]because the errors are so trivial that the test is decidable.  How
]much are you willing to give up in expressiveness to find the most
]trivial and obvious programming error a few minutes earlier?

  What about the type error that only shows up under rare circumstances?  It
is quite possible that production software may contain some hidden type error
that only shows up if the user performs action A, followed immediately by
action B, and then immediately performs action A again.  (OSTTE).  This may
or may *not* be caught by testing.

  However, with a statically (or dynamically) typed language, it is of course,
possible to have hidden bugs that have nothing to do with type.   

  Many C (or C++) programmers can be found who write code like this:

void  printXValue( someType  *p)
{
  printf("%d\n", p->x);
}

  The code will type check, and thus get by the compiler if x is a valid
member of the type someType.  However, what if, under some strange circumstance,
the function printXValue() is passed a NULL pointer?  I am amazed by the number
of people who insist that a language type check, but turn around and assume that
their functions will never be passed a bad pointer.

  Writing code that *appears* to be type-safe to a compiler is only a small
part of the whole issue of writing correct code.  There is no question that
a dynamically typed language increases the risk that an error may occur during
runtime that the software is not prepared to handle.  Of course, dynamically
typed languages permit more generic (and therefore reusable) code.  Thus,
we have a tradeoff.  Two questions then arise:

1)  What is the possiblity that a type error will occur at runtime? 
2)  Is the expressiveness provided by a dynamically typed language worth the 
    possiblity of a runtime type error?

new@ee.udel.edu (Darren New) (04/17/91)

In article <3857@ssc-bee.ssc-vax.UUCP> dmg@ssc-vax.UUCP (David M Geary) writes:
>void  printXValue( someType  *p)
>{ printf("%d\n", p->x);}
>  The code will type check, and thus get by the compiler if x is a valid
>member of the type someType.  

Not even considering what happens if p->x is a long, or a double, or a 
pointer, or ....
		     -- Darren
-- 
--- Darren New --- Grad Student --- CIS --- Univ. of Delaware ---
----- Network Protocols, Graphics, Programming Languages, FDTs -----
     +=+=+ My time is very valuable, but unfortunately only to me +=+=+
+=+ Nails work better than screws, when both are driven with screwdrivers +=+

gudeman@cs.arizona.edu (David Gudeman) (04/17/91)

In article  <3857@ssc-bee.ssc-vax.UUCP> David M Geary writes:
]
]  ...  There is no question that
]a dynamically typed language increases the risk that an error may occur during
]runtime that the software is not prepared to handle.

There certainly is a question about that.  I claim that dynamically
typed languages _reduce_ the risk that such an error may occur.  And
I've been claiming that for a month now, but people seem to forget it
two replies later.

Every time I see something like "I use statically typed languages for
security..." I grit my teeth and heroically try not to hit the
follow-up key (with more-or-less success).  No one has yet presented
anything resembling evidence that the above is a true statement. 

My argument is that: (1) the type errors that are caught by static
typing are the easiest kind of errors to find by testing -- so static
type checking is of no real value for product security.  And (2) the
complex declarations required by static typing can be sources of
hard-to-find errors.  (Complex declarations does not mean the
requirement to declare the types of variables, but of structures and
generic functions.)
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

sfk@otter.hpl.hp.com (Steve Knight) (04/17/91)

Just to expand on Brian Boutel's remarks about statically typed languages
such as ML, HOPE, and Haskell.  He comments:

> The kind of languages I prefer, e.g. ML, Haskell, do not require
> declarations of variables or functions. Types are inferred from the
> context.  [...] So there is really no difference between volumes
> of declarations required in these and in dynamic typed languages like
> Smalltalk.

These languages are completely statically typed & are probably amongst the
best examples of statically typed languages.  The "type inference" mechanism
is guaranteed to find the most general type expression (with a few trivial
caveats).  Dynamic typing is not involved.

However, this compromise is not achieved without a certain cost.  In 
particular, all collections (tuples don't count) are homogenous. This is
a fundamental constraint that allows the type inference mechanism to do
its stuff.  The result of this is that "anonymous" type unions aren't supported.

So if you want to construct a list with an integer and a boolean in it
you write (in ML)
    datatype IntOrBool = Int of int | Bool of bool;
where "Int" and "Bool" are constructors invented by this declaration.  We
can then construct the list as
    val LIST = [Int 3, Bool true, Int 99];

As you can see, although we managed to create a list that roughly corresponds
to our initial specification without any type declarations, we had to invent
a new datatype instead.  I call these type declarations "cosmetic" in that
they mask the type differences of elements in a collection.

These cosmetic type definitions are not as densely used as type declarations
in other statically typed languages such as Pascal, Ada, C, or (to pick a
sensible example) POLY.  But they do proliferate in even relatively moderate
applications.  

It should also be appreciated that type inferencing works well in functional
programming languages, such as ML and Haskell, but is much less effective in
general imperative languages.  This is evident in ML, which has a handful of
imperative features, where the type system becomes considerably less
attrative and more complex.

So programming in a language such as ML is not as comfortable as writing in
LISP, SmallTalk, or Prolog.  But it is certainly a great deal more comfortable
than writing in the other statically typed languages.  (I haven't used Haskell,
so I can't comment on whether or not it is an improvement -- my expectation
is that the "class" layer does provide real benefits -- but that's another
story.)  

Please note that I use the word "comfortable".  By this I am 
implying ease of use for a programmer rather than any other qualities.

Steve

nick@cs.edinburgh.ac.uk (Nick Rothwell) (04/17/91)

In article <1957@optima.cs.arizona.edu>, gudeman@cs.arizona.edu (David Gudeman) writes:
> In article  <8872@skye.cs.ed.ac.uk> Nick Rothwell writes:
> ]
> ]No runtime *type* errors. Runtime errors are restricted to a small set of
> ]exception conditions defined by the language.
> 
> The only runtime error you have eliminated with static type checking
> is a "message not understood" or "domain error".  The elimination of
> this single type of error hardly seems to justify the limitations on
> expressiveness

It does if that error accounts for a huge proportion of errors.

> How
> much are you willing to give up in expressiveness to find the most
> trivial and obvious programming error a few minutes earlier?

If it were always a case of a few minutes (and if the runtime system
were as informative as the compiler when errors occured), then it wouldn't
be worth it. But, runtime type errors are indeterminate since not every
execution path is guaranteed to be followed within a fixed period of
testing time.

-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
                nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
"Playing strip poker with an exhibitionist somehow defeats the object...."

nick@cs.edinburgh.ac.uk (Nick Rothwell) (04/17/91)

In article <3857@ssc-bee.ssc-vax.UUCP>, dmg@ssc-vax (David M Geary) writes:
>   Many C (or C++) programmers can be found who write code like this:
> 
> void  printXValue( someType  *p)
> {
>   printf("%d\n", p->x);
> }

>   The code will type check

`printf' isn't typechecked.

-- 
Nick Rothwell,	Laboratory for Foundations of Computer Science, Edinburgh.
                nick@lfcs.ed.ac.uk    <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
"Playing strip poker with an exhibitionist somehow defeats the object...."

wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (04/17/91)

gudeman@cs.arizona.edu (David Gudeman) writes:

  >Every time I see something like "I use statically typed languages for
  >security..." I grit my teeth and heroically try not to hit the
  >follow-up key (with more-or-less success).  No one has yet presented
  >anything resembling evidence that the above is a true statement. 

Since static or strong typing is a well established paradigm of software
engineering I guess you are in the position to present some 
arguments of evidence against it, arent you? Anyway, you did ...

  >My argument is that: (1) the type errors that are caught by static
  >typing are the easiest kind of errors to find by testing -- so static
  >type checking is of no real value for product security.  

To repeat, what several people already pointed out:
This depends strongly on the kind of application. I dont know what
kind of application you are dealing with. I would suspect at least
with the following ones you might get some serious troubles "trusting
in testing":

a)	Real-time applications, e.g. network-protocols. Here holds
	the Heisenberg syndrome ... you cannot observe them objectively.

b)	Sophisticated interactive systems: you cannot imagine all the
	possible action-sequences the user might performe.

c)	Complex data transformations, e.g. optimising compilers:
	there are always some exotic combinations of input
	data you missed to take into account in your test suite.

You claim, that the errors caught by static typing are trivial ones.
This is usally right. However, my experience is that most bugs of the
kind you are searching for days over days are finally of trivial nature. 
Logical errors are much more easy to recover; actually, most of 
them will be detected during the implementation phase.

The principal point here is related to the paradigm of compositionality
of software pieces. This claims it should be possible to compose 
correct pieces of software yielding a new correct piece of software 
with semantics formed from the semantics of the software pieces and the 
semantics of the composition operator. 

A trivial error in one piece under composition becomes a less trivial
error in the composed result and so on. 

One way to avoid such errors is to force some kind of compatibility 
condition for the software pieces under composition which is checkable 
by a tool. The easiest and indeed most restrictive way for this 
is monomorphic typing as known from C, MODULA-2, etc. But there are
more sophisticated models of typing already applicated 
(parameterization, subtyping) and even more sophisticated under research
(dependent types, parameter constraints, etc.), which the 
fans of dynamic typing permantly ignore.

  >And (2) the
  >complex declarations required by static typing can be sources of
  >hard-to-find errors.  (Complex declarations does not mean the
  >requirement to declare the types of variables, but of structures and
  >generic functions.)

I cannot follow you here. Several people already pointed out that
the requirement of type declarations is not connected
with the requirement of strong or static typing (see ML, Haskell). 

Its right that especially in a language with type unification some
raised type errors are hard to fix (ever get the message "Type 
error in RHS of equation" or how it was called in HOPE?). But this
just shows that there are also non-trivial type errors.

>--
>					David Gudeman
>gudeman@cs.arizona.edu
>noao!arizona!gudeman

--
Wolfgang Grieskamp 
wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

gudeman@cs.arizona.edu (David Gudeman) (04/18/91)

In article  <29350@dime.cs.umass.edu> victor yodaiken writes:
]
]Your example seems to involve the use of type definitions. If I now write
]f((1,2,3,4)) the compiler should be able to inform me that I have
]attempted to apply "f" to a list, but "f" is only defined on sets and
]integers. 

For any variable v in any context whether mathematical, programming,
or natural language, you can find some set V such that all the
possible values of v are in V.  However, static typing implies more
than that: it implies that expressions are constrained statically such
a way that all expressions can be assigned types from a restricted
set.  The restricted set is such that machine representations can be
decided on at compile time and such that no type information has to be
associated with a value at runtime.

My example cannot be implemented in any programming language without
someone -- either the programmer or the language implementer --
associating type information with the values of the sequence.  If you
associate type information with a value at runtime then you have
dynamic typing.

My claim is not now, and never has been, that all static type checking
is bad, evil, or even rude.  I am only saying that in situations such
as the above, it is the language implementator who should be taking
care of type tags rather than the programmer.  When the implementator
does it, it leads to less code, less complexity, and fewer bugs.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (04/18/91)

In article  <29352@dime.cs.umass.edu> victor yodaiken writes:
]... I do not
]buy the argument that there is a real plus in being able to write
]expressions that may not have any meaning.

If you can't write an expression that may not have any meaning then
your language is not expressive enough to write programs in.

] That is, if I write
]$f(X)$ I should be able to ensure, during design time, that 
]f is defined over all possible instantiations of X. Why, one would
]find it reasonable to do otherwise is beyond me.

That is impossible in a Turing-complete language.  All you can do is
make an arbitrary approximation to that kind of safety.  And static
typing approximations (ones that allow the representation of all
values to be decided at compile time) are overly restrictive for the
amount of safety they provide.  The fact that they are overly
restrictive is easy to see by how much effort is put into getting
around the restrictions.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (04/18/91)

[LANGUAGE WARNING: You might going to see a lot of spelling errors,
 notion errors, style errors, etc.]  

brm@neon.Stanford.EDU (Brian R. Murphy) writes:

 >My complaint about statically typed languages is that I _can't_ do
 >some things in them that I _do_ in dynamically typed languages (such
 >as Lisp).  For example, I can't I write a function which returns
 >either a boolean or an integer in a complex way.  I can't write my own
 >Y combinator.  I can't write a function which allows either a sequence
 >of functions which range over numbers or a sequence of numbers as an
 >argument.

Your examples are quite abstract. Maybe its missing phantasy 
-- but i cannot imagine the use of a function which returns
either a boolean or an integer unless there is some relationship
between this values, e.g. the boolean is an error value and 
normally an integer is expected, in which case either subsorting 
(multiple supersorts, of course) or parameterization is sufficient. 

And i see no need to write your own Y combinator, unless you use a
language for a ground course in lambda calculus. (BTW, you can
model it in strongly typed functional languages using recursive
data types; of course a bit cumbersome).

 >Let's consider what a type system might do for me:

 >(1) It constrains the behavior of primitive functions so the program
 >doesn't do undefined things.  
 >[ ... ]
 >[ This is essential, in my opinion, in either kind of type system. ]

 >(2) It constrains the use of procedures that I write so that they
 >aren't applied to things they weren't intended to apply to.  Thus, I
 >might declare an argument to have a particular type, and applications
 >to objects not in that type are prevented.  

I do not see the difference between primitive functions, in which 
case you regard type safety as essential, and user defined onces.

 >				   [Thus] I want a type system which
 >  (a) allows me to omit many type declarations (where unnecessary)
 >  (b) allows me to be very specific in constraining some arguments

Yes, i'am too.

 >With dynamic typing, I can simply write predicates to constrain
 >arguments/variables (Common Lisp, FL, some implementations of Scheme
 >do this).

... but i regard it as an abuse of notion to call such dynamic 
calculated predicates types. You can model them just using conditionals 
and an error halt function. What Common Lisp gives you here is simply
syntactic sugar in an essential untyped environment, and some
(rather restricted) kind of partial evaluation of this
predicates.

 >You static typing advocates claim that a static type system can do
 >this for me, but I claim that it can't.  A type language powerful
 >enough to constrain arguments anywhere near as precisely as I need
 >won't allow me to omit many types.  

My experience is the other way around. It would allow me to omit
many types, and in some rare cases, if have to add declarations.

However, unless I'am going to hack a one-night program, i would
always declare signatures at least for the exported functions
of a module even if they could be inferred. This seems to me
a reasonable kind of documentation.

 >Type inference is only possible
 >for a limited class of type languages, and they tend to be fairly
 >weak. In addition, certain programs are forbidden simply because they
 >utilize types which can't be described by the type language used.  

This is theoretically right. But the question is, how often 
instances outside this limited, weak class are required in
practice? For example, tell me an instance of practical evidence
for the need of self application!

Whats irritating me more and more in this discussion (as long as
I'am following it; the last week) is that the advocates of dynamic 
typing claim for more expressivness, which *looks* like 
the arguments of practical men, but in fact seems to be 
more of philosophical nature. (This would be quite alright, but say 
it like it is.)

I mean, the arguments often cummulate in some existential
view to software development: anything a man is able to do
should him be allowed to do ... pull down restrictions since
they are restrictive ... redundancy is for softys ... etc.

>					-Brian Murphy
>					brm@cs.stanford.edu

--
Wolfgang Grieskamp 
wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (04/18/91)

gudeman@cs.arizona.edu (David Gudeman) writes:

 >							   If you
 >associate type information with a value at runtime then you have
 >dynamic typing.

I have heard of this definition before, but I think its to weak.
For instance, to implement C++ virtual messages, you have to add type
information to values. (If C++ was classified as a dynamical typed
language in the course of this discussion, I must apologize.)

To implement Haskells type classes (which is surely statically typed)
you dont have to add type information to values, but at least
to functions performing computations with these values. Its
an interesting observation that the restrictions of Haskell which
dont allow the usage of "heterogeneous" lists are caused to the fact that 
type information is associated only with values passed to functions 
but not with values stored in data types. This seems to be quite 
unnaturaly in a framework based on the lambda calculus, where
data types can be expressed through higher-order functions. 

--
Wolfgang Grieskamp 
wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

cs450a03@uc780.umd.edu (04/18/91)

Victor Yodaiken    >
Me                 >>

>B: Type declarations do not necessarily only have to do with how data
>is stored. A type declaration: "z in Domain(g) is a number" or "z in
>Range x is a binary tree" or "Domain of f is ordered under >" all
>seem like useful type declarations, but they have nothing to do with
>storage.

Ok, let's take "z in Domain(g) is a number" as an example.  What is
the result of that statement?

If you're dealing with static typing, then I suppose you would say
that the compiler dutifully takes note of the statement and rejects
usages of z which conflict with this declaration.  But what if
computing Domain(g) is a problem?  Let's say z is an offset into a
file.  If you want this to be statically typed, you've just specified
a fixed-sized file.

>>Let me pose a "classic typing problem":
>>F(x) and f(x) are defined for some domain D -> D, but are not one-to-one
>>G(x) is an inverse to F(x) for some values in D
>>g(x) is an inverse to f(x) for some values in D
>>F, f, G and g are all implemented as user defined functions on some
>>computer system.  All are pure computation (no side-effects).
>>What type must x be for G(g(x)) to be meaningful?
>>
>>An easy way to deal with this problem is issue "message not
>>understood" where x is invalid.  Note that significant computation (g)
>>may occur before this situation is recognized.
>>
>>Sometimes it may be recognized that if x is limited to some sub-domain
>>d, then G(g(x)) will always be meaningful.  Sometimes the computer
>>might recognize this, sometimes the programmer might.  Sometimes the
>>problem is too hard.

>I don't believe that there is a good reason for writing programs with
>non-deterministic behavior

There is nothing non-deterministic in that problem.  For example,
let's say that G(y) is 1/p(y) where p(y) is 0 when y is a positive
integer.  Let's say g(x) always gives results which are less than 0 --
pretty easy to deal with, eh?  But what if g(x) can give results which
are positive integers?

Raul Rockwell

olson@juliet.ll.mit.edu ( Steve Olson) (04/18/91)

In article <3086@opal.cs.tu-berlin.de> wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) writes:
   Since static or strong typing is a well established paradigm of software
   engineering I guess you are in the position to present some 
   arguments of evidence against it, arent you? Anyway, you did ...

Oops, not at all the same thing (didn't this come up a few weeks ago?).
In fact, most dynaically-typed languages (Lisp, Smalltalk) are strongly
typed.  Statically typed languages vary in the "strength" of thier typing.
I do not keep up with the software engineering literature, but I would venture
a guess that strong typing is the well established paradigm, and static
typing is, well, what we are arguing over ...

   You claim, that the errors caught by static typing are trivial ones.
   This is usally right. However, my experience is that most bugs of the
   kind you are searching for days over days are finally of trivial nature. 

Um, you're overloading the word "trivial".  Trivial as in "is not a
fundamental error in medium or large scale program design" and trivial as
in "easy to find".  In a typical dynamicaly-typed language, type errors
tend to be trivial in both senses of the word.  The trival (in the first
sense of the word) errors that cause me real problems are when I type "+"
when I meant "-", or I type "ymin" when I meant "xmin".  Static typing won't
help there.

Static typing will help catch errors in areas of the a program that are
not exercised well by test runs.  However I am dubious about the net positive
effect of this because todays typical statically-typed language requires
you to do bizzare things to get around the typing system - thus causing
type errors.  I spotted the following in the the copy of SunExpert that
showed up in my mailbox today (p 31) -

char *rv;
rv = mmap( ..... );
if ( rv == (char *)-1) {

Hmm, rv is either a pointer or a flag value ... and you tell 'em apart at
run time ... dynamic typing!  Except that you are an order of magnitude
more likely to make a hard-to-find error than something equivalent in a
dynamically typed language.  If mmap really returns 0 as its flag value,
then its say hello to Mr. mystery core dump.

   --
   Wolfgang Grieskamp 
   wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

- Steve Olson
  MIT Lincoln Laboratory
  olson@juliet.ll.mit.edu

cs450a03@uc780.umd.edu (04/19/91)

Wolfgang Grieskamp posts:

> Cardelli & Wegner: "On Understanding Types, Data Abstractions, and
> Polymorphims"
> 
> "Programming languages in which the type of every expression can be
>  determined by static program analysis are said to be statically
>  typed. ... [The] weaker requirement [is] that all expressions are
>  guaranteed to be type consistent although the type itself maybe
>  statically unknown; this can be generally done by adding some
>  run-time type checking [1]. Languages in which all expressions are
>  type consistent are called strongly typed languages. If a language
>  is strongly typed, its compiler can guarantee that the programs it
>  accepts will execute without type errors [2]."

At first, I thought that this was a definition which is exactly the
opposite of the terminology we've been using in this newsgroup.  But
the second time around, I decided that these are requirements for
statically typed languages.

This definition does not cover dynamically typed languages, and most
definitely does not say anything about languages which are strongly
dynamically typed.

Furthermore, within the context of the above definition, I might have
to say that C is a strongly statically typed language (depends on what
the definition of a "type error" is in that book).

Raul Rockwell

wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (04/19/91)

olson@juliet.ll.mit.edu ( Steve Olson) writes:

  >In article <3086@opal.cs.tu-berlin.de> wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) writes:
  >   Since static or strong typing is a well established paradigm of software
  >   engineering I guess you are in the position to present some 
  >   arguments of evidence against it, arent you? Anyway, you did ...

 >Oops, not at all the same thing (didn't this come up a few weeks ago?).
 >In fact, most dynaically-typed languages (Lisp, Smalltalk) are strongly
 >typed.  Statically typed languages vary in the "strength" of thier typing.
 >I do not keep up with the software engineering literature, but I would venture
 >a guess that strong typing is the well established paradigm, and static
 >typing is, well, what we are arguing over ...

I dont't know the result of the result of the dispute some weeks ago,
but since I began to feel insecure (somebody e-mailed claiming too that
Lisp & Smalltalk are strongly typed) I looked up a reference:

	Cardelli & Wegner: "On Understanding Types, Data Abstractions,
	and Polymorphims"

	"Programming languages in which the type of every expression
	 can be determined by static program analysis are said to
	 be statically typed. ... [The] weaker requirement [is] 
	 that all expressions are guaranteed to be type consistent
	 although the type itself maybe statically unknown; this
	 can be generally done by adding some run-time type
	 checking [1]. Languages in which all expressions are
	 type consistent are called strongly typed languages. If
	 a language is strongly typed, its compiler can guarantee
	 that the programs it accepts will execute without type
	 errors [2]."

I interpret (1) that with a strongly typed language it might become
necessary to perform type checks at run-time (and thus associate
type information with values) in order to REALIZE THE SEMANTICS of a type
construct, but not to check type errors. (2) clearly drops Lisp
and Smalltalk out of the candidates for strongly typed languages.

Thats the way I learned it, I dont know if other references bring up
other definitions. :-)


--
Wolfgang Grieskamp 
wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

cs450a03@uc780.umd.edu (04/19/91)

Dan Bernstein writes:
	
>Exception handling is hardly dynamic typing.

You mean because it doesn't require type tags?  Or maybe because it
has hardware support for some cases, on some machines?

Maybe I'm not totally clear on the conceptual difference between
checking for index-out-of-bounds on an array, and checking for
arithmetic overflow on some arithmetic calculation?

As near as I can figure, exception handling is a special case of
dynamic typing.

Raul Rockwell

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/19/91)

In article <OLSON.91Apr18021608@lear.juliet.ll.mit.edu> olson@juliet.ll.mit.edu ( Steve Olson) writes:
> char *rv;
> rv = mmap( ..... );
> if ( rv == (char *)-1) {
> Hmm, rv is either a pointer or a flag value ... and you tell 'em apart at
> run time ... dynamic typing!

Exception handling is hardly dynamic typing.

---Dan

kers@hplb.hpl.hp.com (Chris Dollin) (04/22/91)

Here's a hetrogenous list:

        [[
            widget          % LabelWidget %
            name            % name.as_string %
            translations    % standard_translations %
            install         % set_actions(% action %) %
            borderWidth     % 1 %
            borderColor     % 'black' %
            background      % 'gray' %
        ]]

The %-ed items are evaluated, the others (eg "widget") are not. This list is
one of the arguments to a widget instantiator. If I wrote it in a statically
typed language, I expect I'd have to (a) gather each name-value pair into a
twople [*1] and (b) inject each of the values into a suitable union type.

If anyone's really interested, I'll explain what it means. I'm happy to admit
that type errors may occur at run-time here; in fact, I'll lay odds that many
of those same type errors would occur in a statically-typed language (when
projecting a value back out of the presumed union type). So we need a way of
handling exceptions; but we knew *that* already.

It would be bootless to tell me that I ought to have an interface to the
instantiator that took (say) the widget as a separate argument, the colour(s)
as others, etc; one of the points about this interface is that given a widget
description, it can be modified (usually by appending stuff on the end) without
instantiating it, or inspected.
--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

brm@neon.Stanford.EDU (Brian R. Murphy) (04/23/91)

In article <1991Apr16.151100.7221@maths.nott.ac.uk> anw@maths.nott.ac.uk (Dr A. N. Walker) writes:
>In article <1991Apr9.021700.2688@neon.Stanford.EDU>
>brm@neon.Stanford.EDU (Brian R. Murphy) writes:
>
>>My complaint about statically typed languages is that I _can't_ do
>>some things in them that I _do_ in dynamically typed languages (such
>>as Lisp).
>
>	But your examples are perfectly OK in *some* statically typed
>languages, which suggests that it's a problem with your specific languages
>rather than with typing.

Your solutions to my examples all have type declarations in them.
In my posting, I believe I clarified the my above complaint with 
"without type declarations."

					-Brian Murphy
					brm@cs.stanford.edu

brm@neon.Stanford.EDU (Brian R. Murphy) (04/23/91)

In article <3092@opal.cs.tu-berlin.de> wg@opal.cs.tu-berlin.de writes:
>[LANGUAGE WARNING: You might going to see a lot of spelling errors,
> notion errors, style errors, etc.]  
>
>brm@neon.Stanford.EDU (Brian R. Murphy) writes:
>
> >My complaint about statically typed languages is that I _can't_ do
> >some things in them that I _do_ in dynamically typed languages (such
> >as Lisp).  For example, I can't I write a function which returns
> >either a boolean or an integer in a complex way.  I can't write my own
> >Y combinator.  I can't write a function which allows either a sequence
> >of functions which range over numbers or a sequence of numbers as an
> >argument.
>
>Your examples are quite abstract. Maybe its missing phantasy 
>-- but i cannot imagine the use of a function which returns
>either a boolean or an integer unless there is some relationship
>between this values, e.g. the boolean is an error value and 
>normally an integer is expected, in which case either subsorting 
>(multiple supersorts, of course) or parameterization is sufficient. 

I fail to see how parameterization is relevant.  But yes, the case of
of a possible error value _is_ a more common case of my example.  For
example, a Lisp expression such as
	(if (= y 0) (error) (/ x y))
would have type [something like] (error + int).  So either we must
automatically include errors in every base type, introduce an explicit
union declaration in just about every function, or complicate things a
bit by going to a continuation-passing semantics.  Other examples
besides errors _do_ come up, though, so why not just adopt a more
general solution?

>And i see no need to write your own Y combinator, unless you use a
>language for a ground course in lambda calculus. (BTW, you can
>model it in strongly typed functional languages using recursive
>data types; of course a bit cumbersome).

Almost any sort of `meta-interpreter/compiler' (a la Abelson &
Sussman) which constructs functions in a generalized way contains
expressions with recursive function types.  The Y combinator was just
a particularly simple example.  Such translators are very convenient
to be able to write.

>I do not see the difference between primitive functions, in which 
>case you regard type safety as essential, and user defined onces.

Well, to facilitate debugging, the primitive functions have to behave
in some defined way on _every_ input, preferably some sort of error
signal/report to the user.  No static type system I know of can
eliminate all run-time errors (cf divide by 0, array index out of
bounds, disk full, out of memory, etc), so there must be a run-time
mechanism.

Similarly, there must be some mechanism for a user-defined partial
function to signal an error when given bad arguments.  The language
designer cannot possibly anticipate which domains a user-defined
function should be defined on, so the user must be given a way to
_explicitly_ signal an error (for example, if that sorted input array
turns out not to be sorted).  Providing strong static typing is merely
a partial solution; no type system/language can describe all function
domains (sorted arrays, for example).  An error-signalling mechanism
must be there anyway.

In a dynamically-typed language, the user has no static typing
`crutch' to lean on: if he wants to signal errors on bad arguments, he
can code his own test to discover these arguments; if not (perhaps on
quick-and-dirty throwaway code), he can omit them.  Presumably
user-defined functions intended for reuse and use by others would
signal an error on bad arguments.  Type declarations cover certain
cases, making it easy for programmers in statically-typed languages to
believe more precise argument validity checks (is that array sorted?)
aren't needed.  

[To be honest, I wouldn't really check an array for sortedness in most
cases, but just specify what happens if it's not, but hopefully you
get the idea.]

> >				   [Thus] I want a type system which
> >  (a) allows me to omit many type declarations (where unnecessary)
> >  (b) allows me to be very specific in constraining some arguments
>
>Yes, i'am too.
>
> >With dynamic typing, I can simply write predicates to constrain
> >arguments/variables (Common Lisp, FL, some implementations of Scheme
> >do this).
>
>... but i regard it as an abuse of notion to call such dynamic 
>calculated predicates types. You can model them just using conditionals 
>and an error halt function. What Common Lisp gives you here is simply
>syntactic sugar in an essential untyped environment, and some
>(rather restricted) kind of partial evaluation of this
>predicates.

Actually, what Common Lisp gives you isn't adequate at all.  It allows
primitives to behave in an undefined way on arguments out of their
specified domain.  Perhaps I'm biased, but having been exposed to CLU
and FL, which both have completely specified primitive functions (with
specific errors signalled on bad arguments or other error conditions),
I really don't like a language with incompletely determined behavior.
Your program may work one minute, but not the next, or on a different
machine, and there will be no way to tell why...

> >You static typing advocates claim that a static type system can do
> >this for me, but I claim that it can't.  A type language powerful
> >enough to constrain arguments anywhere near as precisely as I need
> >won't allow me to omit many types.  
> 
>My experience is the other way around. It would allow me to omit
>many types, and in some rare cases, if have to add declarations.

Perhaps you don't demand as powerful a type language as I do...
(arbitrary type union, intersection, dependent types, higher-order
functions)

>However, unless I'am going to hack a one-night program, i would
>always declare signatures at least for the exported functions
>of a module even if they could be inferred. This seems to me
>a reasonable kind of documentation.

So would I.  But, as I keep saying, any type language which allows
internal declarations to be inferred won't allow me to specify the
interfaces with the precision I need...

> >Type inference is only possible
> >for a limited class of type languages, and they tend to be fairly
> >weak. In addition, certain programs are forbidden simply because they
> >utilize types which can't be described by the type language used.  
>
>This is theoretically right. But the question is, how often 
>instances outside this limited, weak class are required in
>practice? For example, tell me an instance of practical evidence
>for the need of self application!

I can't come up with a quick example of self-application.
However, I have written Lisp programs for which no type inference
system could ever figure out what arguments certain functions would be
applied to---for example, a lookup table of functions, each constrained
in a complex way by program control to only be applied to correct
arguments.  Sure, I could build a big union type describing what
sorts of functions could be in the table, and explicitly extract one
of the appropriate type, but that just increases the number of things
I have to keep up-to-date when I modify the program and add a new
function to the table.  I'd rather not have to cope with that when
first trying something out.

>Whats irritating me more and more in this discussion (as long as
>I'am following it; the last week) is that the advocates of dynamic 
>typing claim for more expressivness, which *looks* like 
>the arguments of practical men, but in fact seems to be 
>more of philosophical nature. (This would be quite alright, but say 
>it like it is.)

Actually, I think the practical argument is this: In certain
situations (such as in protyping), I want to program in a completely
unconstrained way.  If one way of expressing things makes things
easier, I should be able do it.  ANYTHING which forces me to think
about my program in an unnatural way will distract me from the problem
at hand.  A static type system forces upon me someone else's idea of
how things should be done.  A dynamic type system leaves me more free
to do what ever moves me.  Later I can worry about going back and
making things safe for other people to use, or more efficient, or
whatever.

>Wolfgang Grieskamp 
>wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

					-Brian Murphy
					 brm@cs.stanford.edu

diamond@jit345.swstokyo.dec.com (Norman Diamond) (04/24/91)

In article <2046@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:

>However, static typing implies more
>than that: it implies that expressions are constrained statically such
>a way that all expressions can be assigned types from a restricted
>set.  The restricted set is such that machine representations can be
>decided on at compile time

Yes.

>and such that no type information has to be
>associated with a value at runtime.

No.

>My claim is not now, and never has been, that all static type checking
>is bad, evil, or even rude.  I am only saying that in situations such
>as the above, it is the language implementator who should be taking
>care of type tags rather than the programmer.

Why not both?  A lot of people might agree that dynamic typing is
necessary in some cases too.  Maybe someday the two sides might agree
that the programmer should be allowed to specify types and/or classes
when it helps catch errors and/or improve efficiency, etc.; and that
the programmer should be allowed to omit specifications when it helps
speed development time or extend reusability.

>When the implementator
>does it, it leads to less code, less complexity, and fewer bugs.

No.  When the programmer is allowed to do the portions that need it,
it can also lead to less complexity and fewer bugs.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.

cs450a03@uc780.umd.edu (04/24/91)

Norman Diamond:
David Gudeman:

>>... static typing implies ... all expressions can be assigned types
>>from a restricted set ... such that machine representations can be
>>decided on at compile time 

>Yes.

>>                           and such that no type information has to
>>be associated with a value at runtime.

>No.

No, "type information has to be associated with a value at runtime" ?

Then what is dynamic typing?

Or, was this a statement that "constant type tags can be assiciated
with a value at runtime"?  If so, it is far from a clear statement.

>Why not both [dynamic and static typing] ?

"Both" is, by definition, not static typing -- such a language would
allow you to express statements which can not be statically typed.  On
the other hand, it is often a simple matter (of strength reduction) to
extract static typing information from statements in a dynamically
typed language.

One distinction is that in a dynamically typed language type
assertions are "just another operator" and most likely will use the
same syntactic conventions as other language operators.  In statically
typed language this does not have to be the case, and usually is not.

This was hashed out in comp.lang.misc about a month ago, though it
might not have been expressed fully in any one article.

Raul Rockwell

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (04/24/91)

In article <1991Apr15.183805.10544@maths.nott.ac.uk>,
 anw@maths.nott.ac.uk (Dr A. N. Walker) writes [about Algol 60]:
> In any case, dynamic
> *own* arrays are a pig to compile;  again, they were [?universally]
> omitted from implementations.

If I remember correctly, Burroughs Algol supported them.
There never was any particular problem in compiling them;
the problem was in chosing from several interpretations of what
they should mean (not unlike C++, just like a head cold is not
altogether unlike the Plague).  The interpretation which was
adopted for Algol 60.1 is that the bounds are determined the
first time that the declaration is elaborated, and don't change
after that.  So
	for n := 1, n + 2 while n < 13 do begin
	    own integer array a[1:n];
	    ...
	end;
would have the array declaration compiled to something like
	    static int already_done = 0;
	    static struct Int_1D a;
	    if (!already_done) {
		a.lb1 = 1;
		a.ub1 = n;
		a.data = malloc((a.ub1-a.lb1+1) * sizeof *(a.data));
		already_done = 1;
	    }
This is a pig to compile?
Of course, if the loop is
	for n := 11, n - 2 while n > 0 do begin
	    own integer array a[1:n];
	    ...
	end;
then you may be in trouble ...
-- 
Bad things happen periodically, and they're going to happen to somebody.
Why not you?					-- John Allen Paulos.

gudeman@cs.arizona.edu (David Gudeman) (04/25/91)

In article  <1991Apr24.051522.28988@tkou02.enet.dec.com> Norman Diamond writes:
]
]Why not both?  A lot of people might agree that dynamic typing is
]necessary in some cases too.  Maybe someday the two sides might agree
]that the programmer should be allowed to specify types and/or classes
]when it helps catch errors and/or improve efficiency, etc.; and that
]the programmer should be allowed to omit specifications when it helps
]speed development time or extend reusability.

That was the suggestion that spawned all of these threads.  Some
people seemed to feel that any dynamic typing is too "dangerous" and
discussions of that led to all the rest.

]>When the implementator
]>does it, it leads to less code, less complexity, and fewer bugs.
[       ^ (meaing dynamic typing)]

]No.  When the programmer is allowed to do the portions that need it,
]it can also lead to less complexity and fewer bugs.

That doesn't make any sense.  How can it lead to less complexity and
fewer bugs when the implementer has to explicitely handle details
instead of letting them be handled by the language?  Does it lead to
less complexity and fewer bugs when implementers handle array bounds
checking without language help?  What about checking for dereferencing
a null pointer?  Whenever you leave such things up to the implementer
you are providing opportunities for careless bugs.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

wall@pa.dec.com (David Wall) (04/25/91)

David Gudeman writes, of the claim that static typing leads to less
complexity:
    That doesn't make any sense.  How can it lead to less complexity and
    fewer bugs when the implementer has to explicitely handle details
    instead of letting them be handled by the language?  Does it lead to
    less complexity and fewer bugs when implementers handle array bounds
    checking without language help?  What about checking for dereferencing
    a null pointer?  Whenever you leave such things up to the implementer
    you are providing opportunities for careless bugs.

Since when does static typing preclude runtime checks?  Ever hear of
Pascal?  Or Modula-2?

The Modula-2 I use checks for range errors, following nil pointers,
even following bad but non-nil pointers.

Moreover, the optimizer is good enough to get rid of most of these
checks, when they are provably redundant.

David W. Wall - - - - - - - - - - - - - - - - < wall@decwrl.dec.com >

gudeman@cs.arizona.edu (David Gudeman) (04/25/91)

You left off the context necessary to see that I was not claiming
anything so stupid as the above.  The above reply is a complete
non-sequiter.  Please read more carefully before you follow-up, David.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

gudeman@cs.arizona.edu (David Gudeman) (04/25/91)

xxx

That's twice in the last week I seem to have been bitten by the
mythical line eater that I never believed in before.  Let me try
again:

In article  <1991Apr24.212350.27855@pa.dec.com> David Wall writes:
]
]Since when does static typing preclude runtime checks?  Ever hear of
]Pascal?  Or Modula-2?

You left off the context necessary to see that I was not claiming
anything so stupid as the above.  The above reply is a complete
non-sequiter.  Please read more carefully before you follow-up, David.

Let's see if that gets through...
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

markv@pixar.com (Mark VandeWettering) (04/25/91)

In article <1707@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <1991Apr9.110217.10963@mathrt0.math.chalmers.se> Lennart Augustsson writes:
>]In article <1593@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>]  map f [] = []
>]  map f (x:xs) = f x : map f xs

>I'm getting really tired of pointing this out: the program above does
>not have the full generality of the mathematical or the dynamically
>typed version.  This is a point I've made several times on several
>Haskell and ML programs, and I wish people would get the idea so I
>could stop repeating myself.  The statically typed program only works
>on structures in which all elements have the same type -- and only
>when the compiler can infer that type.

Well, in a language with strong typing, the function f has a particular 
type (alpha -> beta).  It doesn't make sense to have a function which 
applies to more than a single type.

Note: I do understand the motivations for weak (you may call it dynamic if
you like) typing.   There are many distinct advantages to both side.

>Type inference has some nice features, but it does _not_ give you the
>expressive power of dynamic typing.

Hard statement to prove, and almost meaningless.  Does it mean you can't
express the same programs?  In the same or fewer lines of code?  On what
kind of an example....

Mark

wall@pa.dec.com (David Wall) (04/26/91)

David Gudeman admonishes:
  You left off the context necessary to see that I was not claiming
  anything so stupid as the above.  The above reply is a complete
  non-sequiter.  Please read more carefully before you follow-up, David.

and on reviewing his article I see that he is right; he was not claiming
that static typing precluded runtime checks, he was drawing an analogy
between runtime typing and runtime checking.  I apologize.

I think it's a false analogy, which is why I misunderstood, but I do
apologize.  :-)

David W. Wall - - - - - - - - - - - - - - - - < wall@decwrl.dec.com >

gudeman@cs.arizona.edu (David Gudeman) (04/26/91)

In article  <1991Apr25.165342.14354@pixar.com> Mark VandeWettering writes:
]In article <1707@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]>...the program above does
]>not have the full generality of the mathematical or the dynamically
]>typed version.
]Well, in a language with strong typing, the function f has a particular 
]type (alpha -> beta).  It doesn't make sense to have a function which 
]applies to more than a single type.

It certainly does.  I do it all the time.

]Note: I do understand the motivations for weak (you may call it dynamic if
]you like) typing.   There are many distinct advantages to both side.

There is a difference between weak typing and dynamic typing.  Lack of
understanding of this has resulted in far too much controversy,
especially since the definitions have been posted.

To repeat: weak typing means that the language may have undefined
behavior like infinite loops or core dumps from a type error.  Dynamic
typing means that it is not generally possible to assign a machine
representation to the values that will be returned by all syntactic
expressions.  You can still get strong typing in this case by
including type information with the data at runtime.

]>Type inference has some nice features, but it does _not_ give you the
]>expressive power of dynamic typing.

]Hard statement to prove, and almost meaningless.  Does it mean you can't
]express the same programs?  In the same or fewer lines of code?  On what
]kind of an example....

It means that most programs are smaller and simpler in dynamically
typed languages than in statically typed languages.  It is not all
that hard to show this: just compare the size of all the .h files in C
programs to the total size of the program.  As a rough approximation,
the .h files are the extra baggage caused by static typing.  That is a
very rough approximation though, since dynamic typing enhances
reusability of code.  So a lot of reduncancy in the .c files should be
eliminated also.
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

diamond@jit533.swstokyo.dec.com (Norman Diamond) (04/26/91)

David Gudeman
Norman Diamond
Raul Rockwell

dg>>>... static typing implies ... all expressions can be assigned types
dg>>>from a restricted set ... such that machine representations can be
dg>>>decided on at compile time 
nd>>Yes.
dg>>>                           and such that no type information has to
dg>>>be associated with a value at runtime.
nd>>No.

rr>No, "type information has to be associated with a value at runtime" ?

Type information might or might not be needed at runtime, depending on
the type.  Actually did not distinguish between static typing and static
classing, since no one else seems to be interested in distinguishing
these.

rr>Then what is dynamic typing?

From what I've seen in this argument, dynamic typing is the absence of
programmer specification of types.

rr>Or, was this a statement that "constant type tags can be assiciated
rr>with a value at runtime"?  If so, it is far from a clear statement.

Why would type tags have to be constant?

nd>>Why not both [dynamic and static typing] ?

rr>"Both" is, by definition, not static typing -- such a language would
rr>allow you to express statements which can not be statically typed.

Indeed, "both" means that the language would allow you to express
statements which cannot be statically typed, and also statements which
can be statically typed, and also statements which are constrained to
static types by administrative fiat.

rr>On the other hand, it is often a simple matter (of strength reduction) to
rr>extract static typing information from statements in a dynamically
rr>typed language.

"Often" is the key word in that sentence.  It is often possible to write
C programs without the use of "void *", but someone decided that "often"
isn't "often enough."

>One distinction is that in a dynamically typed language type
>assertions are "just another operator" and most likely will use the
>same syntactic conventions as other language operators.

Oh, this sounds like a "both" ... haven't seen one of those yet.

>In statically
>typed language this does not have to be the case, and usually is not.

Once upon a time, "uniformity of reference" was a buzz-phrase.  People
liked Fortran and PL/I better than Algol because array element selection
looked like function calls, and they would have liked structure selection
to look like function calls too, so that the client would not have to
"know" the kind (I don't want to say "type" this time) of structuring
involved to retrieve the data.  However, the Pascal and C style of square
brackets for arrays and dot for structures (and parentheses for function
calls) seems to have won.  I like it too.  It seems to reduce confusion.

I think that to use different syntax to distinguish static properties
of an entity from its dynamic properties also might reduce confusion.

>This was hashed out in comp.lang.misc about a month ago, though it
>might not have been expressed fully in any one article.

Sorry, I wasn't reading comp.lang.misc then.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.

diamond@jit533.swstokyo.dec.com (Norman Diamond) (04/26/91)

In article <2397@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
>In article  <1991Apr24.051522.28988@tkou02.enet.dec.com> Norman Diamond writes:
nd>> Maybe someday the two sides might agree
nd>>that the programmer should be allowed to specify types and/or classes
nd>>when it helps catch errors and/or improve efficiency, etc.; and that
nd>>the programmer should be allowed to omit specifications when it helps
nd>>speed development time or extend reusability.

dg>>>When the implementator
dg>>>does it, it leads to less code, less complexity, and fewer bugs.
dg>       ^ (meaing dynamic typing)
          ^ sorry, I thought you just meant "typing"

nd>>No.  When the programmer is allowed to do the portions that need it,
nd>>it can also lead to less complexity and fewer bugs.

dg>That doesn't make any sense.

Agreed.  I think the implementor should do dynamic typing.  However, the
programmer should specify the types of the static parts and specify which
parts are dynamic.  (Well never mind; this opinion got me fired from my
previous job, and is neither understood nor relevant at my present one.)
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.

mathew@mantis.co.uk (mathew) (04/26/91)

wall@pa.dec.com (David Wall) writes:
> David Gudeman writes, of the claim that static typing leads to less
> complexity:
>     That doesn't make any sense.  How can it lead to less complexity and
>     fewer bugs when the implementer has to explicitely handle details
>     instead of letting them be handled by the language?  Does it lead to
>     less complexity and fewer bugs when implementers handle array bounds
>     checking without language help?  What about checking for dereferencing
>     a null pointer?  Whenever you leave such things up to the implementer
>     you are providing opportunities for careless bugs.
> 
> Since when does static typing preclude runtime checks?

He didn't claim that it did.

Someone said "It's less complicated to let the programmer handle dynamic
typing rather than the implementor".

He said "Does it lead to less complication to let the programmer handle
array bounds checking rather than the implementor?"  The expected answer was
"No", as you rightly point out; the question was asked in order to show up a
flaw in the original statement.

To reiterate:

In general, it is less complicated to use implementor-provided features than
to have to implement them yourself.

It is simply not true that it is easier, less complicated or safer to write
your own dynamic typing system than it is to use someone else's debugged
system.

> The Modula-2 I use checks for range errors, following nil pointers,
> even following bad but non-nil pointers.

Yes, but if we were to use the static typing enthusiasts' arguments, we
should be writing a language which doesn't have pointers.

That way you don't have to do time-consuming pointer-checking tests at run
time, and you don't run the risk of having a pointer error occur during
program execution.

[ Replace "pointer" with "type" in the above paragraph and you have exactly
  their argument. ]

> Moreover, the optimizer is good enough to get rid of most of these
> checks, when they are provably redundant.

Right. Which is what the dynamic type-checking people are suggesting: give
the programmer a CHOICE. And if you can remove the dynamic typing at compile
time then do so.

mathew
[ With subtitles for the hard of thinking ]

--
If you're a John Foxx fan, please mail me!

anw@maths.nott.ac.uk (Dr A. N. Walker) (04/26/91)

In article <5388@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au
(Richard A. O'Keefe) writes:

>If I remember correctly, Burroughs Algol supported [dynamic own arrays].
>There never was any particular problem in compiling them;
>the problem was in chosing from several interpretations of what
>they should mean (not unlike C++, just like a head cold is not
>altogether unlike the Plague).  The interpretation which was
>adopted for Algol 60.1 is that the bounds are determined the
>first time that the declaration is elaborated, and don't change
>after that.

	Well, yes, there's no difficulty compiling that interpretation.
But it could be [and was] argued that the bounds should be re-determined
every time, and that any common elements should be preserved.  Eg,
imagine an implementation of Conway's Life, in which, for efficiency,
only the active part of the board is retained.  We might have

	own Boolean array board [lorow:hirow, locol:hicol];

where "lorow" (etc) are integer procedures returning the first row (etc)
that needs to be considered.  As the "game" evolves, this declaration is
re-elaborated every generation, and the active region might march all
over the place and then back again.  Perfectly reasonable from the user's
point of view, not so nice for the compiler.  Possible, but not nice.

>This is a pig to compile?

	Quote from "Algol 60 Implementation", Randell & Russell, Academic
Press, 1964 [a detailed description of the Walgol interpreter, which has
very few restrictions], p271:

	"(vi) Dynamic Own Arrays are not Allowed
	 [...]
	 The difficulty of implementing dynamic own arrays is that the
	 size and shape of an own array, whose elements must be retained
	 when the block in which it is declared is left, may change on
	 subsequent entries to the block.  A proposed technique for
	 implementing dynamic own arrays is described in a paper by
	 Ingerman" [CACM 4, 1, pp 59-65, 1961].

So R&R clearly interpreted dynamic own arrays to include the case above.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

tmb@ai.mit.edu (Thomas M. Breuel) (04/29/91)

   To repeat: weak typing means that the language may have undefined
   behavior like infinite loops or core dumps from a type error.  Dynamic
   typing means that it is not generally possible to assign a machine
   representation to the values that will be returned by all syntactic
   expressions.  You can still get strong typing in this case by
   including type information with the data at runtime.

Conversely, you can also get dynamic typing in statically typed
languages by creating unions ("datatype" in SML). At least in a
language like SML, the static type system is an additional, very
useful facility offered by the compiler. Most people who program in
statically typed languages write dynamically typed code in some
places; they simply restrict it to particular sections of their
programs.

Unfortunately, the notation for expressing dynamic typing in many
statically typed languages is a little cumbersome, requiring explicit
tagging and untagging by the programmer. However, C++, because of
user-definable conversion operators, lets you use dynamic typing a
little more easily, and ECL (an old Lisp-like language) had an even
smoother integration of dynamic and static typing.

   It means that most programs are smaller and simpler in dynamically
   typed languages than in statically typed languages.

Dynamically typed programs are also impossible to prove type-correct,
and many trivial errors go unchecked.

   It is not all
   that hard to show this: just compare the size of all the .h files in C
   programs to the total size of the program.  As a rough approximation,
   the .h files are the extra baggage caused by static typing.  That is a
   very rough approximation though, since dynamic typing enhances
   reusability of code.  So a lot of reduncancy in the .c files should be
   eliminated also.

As in most fields of human endeavour, redundancy is something to be
aimed for, not something to be eliminated, since it helps catch
mistakes.

   >>Type inference has some nice features, but it does _not_ give you the
   >>expressive power of dynamic typing.

Dynamic typing is necessary for serious program development, since
there are some things that are simply too cumbersome to express
in a static type system.

However, dynamic typing should not be the rule. For the proverbial
"99%" of all programming tasks, static typing is more robust, and
sufficiently expressive (at least, if it is polymorphic). As a bonus,
statically typed languages tend to be easier to compile into efficient
code with current compiler technology and machine hardware.

					Thomas.

Chris.Holt@newcastle.ac.uk (Chris Holt) (04/29/91)

mathew@mantis.co.uk (mathew) writes:

>Yes, but if we were to use the static typing enthusiasts' arguments, we
>should be writing a language which doesn't have pointers.

Just to add to the diversity of opinion :-), some of us think that
static typing is often very useful, that dynamic typing is sometimes
the best thing to use, but that pointers should never be visible to
the programmer.  But I don't want to restart the pointer wars again...

-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "And when they die by thousands why, he laughs like anything." G Chesterton

mathew@mantis.co.uk (mathew) (04/29/91)

diamond@jit533.swstokyo.dec.com (Norman Diamond) writes:
>          I think the implementor should do dynamic typing.  However, the
> programmer should specify the types of the static parts and specify which
> parts are dynamic.  (Well never mind; this opinion got me fired from my
> previous job                          ^^^^^^^^^^^^^^^^^^^^^^^^^

You *are* joking, aren't you?


mathew

--
If you're a John Foxx fan, please mail me!

wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (04/30/91)

[Excuse me, but this is somehow out of time...]

brm@neon.Stanford.EDU (Brian R. Murphy) writes:

 >I fail to see how parameterization is relevant.  But yes, the case of
 >of a possible error value _is_ a more common case of my example.  For
 >example, a Lisp expression such as
 >	(if (= y 0) (error) (/ x y))
 >would have type [something like] (error + int).  So either we must
 >automatically include errors in every base type, introduce an explicit
								^^^^^^^^
 >union declaration in just about every function, or complicate things a
 >bit by going to a continuation-passing semantics.  Other examples
 >besides errors _do_ come up, though, so why not just adopt a more
 >general solution?

Yes, why not? The general solution (for this class of problems) may
be "anonymous unions". For the example, its inferrable.

Parameterization in combination with anonymous unions might be 
of use for instance to define error tolerant function composition:

  compose:: ('a -> ('b + error)) * ('b -> ('c + error)) -> 'a -> ('c + error)
  compose(f,g)(x) = if f(x) isa error then f(x) else g(f(x))

The signature of compose will be inferrable again. 

 >Perhaps you don't demand as powerful a type language as I do...
 >(arbitrary type union, intersection, dependent types, higher-order
 >functions)

Well, maybe I'am not so idealistic in respect to the level
of the workaday problems in program developement.

However, probably no one ever worked with higher-order functions 
will dispense with it. So it might become some day with 
dependent types, although its hard to imagine (for me at least, 
currently. printf, *the* example for dependent typing, is not the
function I use all the day ...).

I agree completely with the statement someone else posted recently:
essential, playing with dynamic types opens the mind for recovering 
some more avangardistic programming techniques and inspires the resarch
in typing theory. 

Theory about typing is theory about programming. I wonder if theory
can be build up reasonable from dynamic typing :-?  

--
Wolfgang Grieskamp 
wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET

stephen@estragon.uchicago.edu (Stephen P Spackman) (05/02/91)

In article <2450@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
|There is a difference between weak typing and dynamic typing.  Lack of
|understanding of this has resulted in far too much controversy,
|especially since the definitions have been posted.
|
|To repeat: weak typing means that the language may have undefined
|behavior like infinite loops or core dumps from a type error.  Dynamic
|typing means that it is not generally possible to assign a machine
|representation to the values that will be returned by all syntactic
|expressions.  You can still get strong typing in this case by
|including type information with the data at runtime.

This statement confuses the *hell* out of me. I thought I knew all
about this issue; goodness knows I've been immersed in enough of it.
But it seems to me that it is **DYNAMIC** typing that is generally
implemented with every object an (underlying) dependant product - i.e.
with every type having the SAME and thus PREDICATBLE machine
representation; and it is **STATIC** typing which permits the
representational consistency constraints to be optimised out so that
the machine representations are arbitrary and NOT interpretable except
by reference to the inductive type structure of the actual code!

DYNAMIC typing uses tags always, STATIC typing uses tags on programmer
request (i.e. in unoptimisable discriminant fields).  DYNAMIC typing
uses uniform representation, STATIC typing uses variant
representation. It's just what you'd expect: the more work the
compiler does, the more "compiled" the result.

Look at the identity function. In a dynamically typed system (assuming
strong typing throughout, here) it has "type" OBJECT -> OBJECT (i.e.,
it's a monadic function, that's the whole story). OBJECT has an
underlying representation like this:
	[TYPE T; T t]
(though maybe the tag T is moved up into the pointer in some cases),
and this is true whether the value of T is Integer, Float, List,
whatever. Conversely, in a statically typed system, I has type 
FORALL T:TYPE. T -> T, and the underlying representation of the
argument and of the result can ONLY be determined by examination of
the lexical call site (or, presumably, in the machine representation,
by the examination of the particular instance of T passed in as an
extra, hidden, parameter). In particular, while the value of the
formal argument type T may be passed in as part of the calling
convention, I don't see why it should ever be passed out again....

-very confused stephen
----------------------------------------------------------------------
stephen p spackman         Center for Information and Language Studies
systems analyst                                  University of Chicago
----------------------------------------------------------------------

gudeman@cs.arizona.edu (David Gudeman) (05/03/91)

In article  <STEPHEN.91May2030111@estragon.uchicago.edu> Stephen P Spackman writes:
]In article <2450@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:
]|...  Dynamic
]|typing means that it is not generally possible to assign a machine
]|representation to the values that will be returned by all syntactic
]|expressions.  You can still get strong typing in this case by
]|including type information with the data at runtime.

]This statement confuses the *hell* out of me.

This response confuses the hell out of me.

]DYNAMIC typing uses tags always,

What did you think I meant by "including the type information at
runtime"?
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

nix@asd.sgi.com (Sold to the highest Buddha) (05/03/91)

stephen@estragon.uchicago.edu (Stephen P Spackman) writes:

> DYNAMIC typing uses tags always

... unless the typing can be completely determined at compile time,
and the tags can be optimized out.

	It seems to me that a language with a good type inference
algorithm and optional type constraints provided by the programmer
should be able to do this in many cases, eliminating type tags when
they aren't necessary.  If you're a static typing fan, you tell the
compiler to whine at you if it can't optimize out all type tags at
compile time, and you have to provide extra type information to get
the code to compile.

	Is there any reason why this can't be done?  Are there any
"dynamically typed" languages that do this?

Nix Thompson		nix@sgi.com		...!uunet!sgi!nix

	  follow the black valley - trail of death
		into the beautiful sunshine.

olson@juliet.ll.mit.edu ( Steve Olson) (05/03/91)

In article <NIX.91May2231654@valis.asd.sgi.com> nix@asd.sgi.com (Sold to the highest Buddha) writes:
	   It seems to me that a language with a good type inference
   algorithm and optional type constraints provided by the programmer
   should be able to do this in many cases, eliminating type tags when
   they aren't necessary.  If you're a static typing fan, you tell the
   compiler to whine at you if it can't optimize out all type tags at
   compile time, and you have to provide extra type information to get
   the code to compile.

	   Is there any reason why this can't be done?  Are there any
   "dynamically typed" languages that do this?

Your preceeding wish list sounds like a design document for CMU's
"Python" Commom Lisp compiler.  I can't vouch for it personally,
(am eagerly awaiting a SunOS version) but the user's manual and compiler
design notes (FTP-able from CMU) look promising.

   Nix Thompson		nix@sgi.com		...!uunet!sgi!nix



--
-- Steve Olson
-- MIT Lincoln Laboratory
-- olson@juliet.ll.mit.edu
--

rockwell@socrates.umd.edu (Raul Rockwell) (05/03/91)

Stephen P Spackman
> > DYNAMIC typing uses tags always

Nix Thompson		
> ... unless the typing can be completely determined at compile time,
> and the tags can be optimized out.

> Is there any reason why this can't be done?  Are there any
> "dynamically typed" languages that do this?

The compiler I use does this almost all the time.  However, it is
mostly used for compiling "leaf" functions to be used in an
interpreted environment.  Basically, its only purpose is to cut out
interpreter overhead (including type-checking) for problems that
involve a lot of little operations on the same piece of data.

The interpreter has better performance in other problem domains.
(the sort of optimizations compilers do IS a tradeoff...).

Raul Rockwell