[net.lang] strong typing and the "magic compiler"

leichter@yale-com.UUCP (Jerry Leichter) (12/26/83)

A couple of comments:

The single thing that most people seemed to fall back on as producing a need
for declarations was the problem that misspelling a variable name would just
have the compiler make up a new variable.  This is a "strawman" argument; there
are MUCH better solutions to the problem than introducing a typing system.  Over
the years, I've written cross-reference programs for several different languages.
One thing I ALWAYS include is a special, very visible flag for variables that
occur only once.  This will catch MOST typos.  A compiler could do the same
thing, without typing, and without a full cref.  (I prefer the full cref because
I make it a practice to read through the cref of any significant program at
least once.  I watch for variables used only a few times, which may be
repeated errors; variables that look very similar, again possible errors or
at best sources of confusion; variables whose use I can't determine just from
the name; and so on.  Some of this could be automated...I've never felt the
need.)  A compiler could do MUCH more.  For example, it can flag variables
that are used but not defined/defined but not used - both excellent pointers
to errors.  An optimizing compiler that does data-flow analysis can go even
further:  It can check that every variable used at a given point is defined
for ALL flow paths that reach that point.  This information is available using
well-understood algorithms, and would be EXTREMELY useful - but few compilers
give it to you.  All of this is way beyond what a simple-minded typing system
gives you.
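
To show how little machinery the once-only check needs, here is a minimal
sketch in C (my own illustration).  The tokenizer is a toy: it takes any
letter-or-underscore run as a name and makes no attempt to skip keywords,
comments, or strings - a real cref program would.

	/* once.c - flag identifiers that occur exactly once on stdin.
	   Toy tokenizer: a "name" is a letter or '_' followed by
	   letters, digits, or '_'.  Fixed-size tables keep it short. */
	#include <stdio.h>
	#include <ctype.h>
	#include <string.h>

	#define MAXNAMES 2000
	#define MAXLEN   64

	static char names[MAXNAMES][MAXLEN];
	static int  counts[MAXNAMES];
	static int  nnames;

	static void note(const char *id)
	{
	    int i;
	    for (i = 0; i < nnames; i++)
	        if (strcmp(names[i], id) == 0) { counts[i]++; return; }
	    if (nnames < MAXNAMES) {
	        strncpy(names[nnames], id, MAXLEN - 1);
	        counts[nnames++] = 1;
	    }
	}

	int main(void)
	{
	    char buf[MAXLEN];
	    int c, n = 0, i;

	    while ((c = getchar()) != EOF) {
	        if (isalpha(c) || c == '_' || (n > 0 && isdigit(c))) {
	            if (n < MAXLEN - 1) buf[n++] = (char)c;
	        } else if (n > 0) {
	            buf[n] = '\0'; note(buf); n = 0;
	        }
	    }
	    if (n > 0) { buf[n] = '\0'; note(buf); }

	    for (i = 0; i < nnames; i++)
	        if (counts[i] == 1)
	            printf("*** \"%s\" occurs only once - possible typo\n",
	                   names[i]);
	    return 0;
	}

Run over a source file, a name misspelled once shows up immediately -
exactly the case declarations are supposed to guard against, with no type
system in sight.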

I find nothing at all wrong with the compiler doing automatic conversions
between string representations of numbers and the numbers themselves - as long
as the semantics involved are carefully defined.  Sure, this has a (large)
cost in efficiency - and if the compiler is in a position to determine that
this will happen, and I ask for "optimization" information, it should tell
me.  The people who find this an ARGUMENT for typing should try to justify
maintaining the distinction WITHOUT REFERENCE TO TYPING AS A JUSTIFICATION.
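
To make "carefully defined" concrete, here is one possible rule set,
sketched in C with a tagged value type of my own invention (not any
particular language's): a string operand is read with the usual numeric
syntax wherever a number is required.

	/* Sketch of implicit string<->number coercion with defined
	   semantics.  The "value" type and conversion rule here are
	   illustrative, not any existing language's. */
	#include <stdio.h>
	#include <stdlib.h>

	typedef struct {
	    enum { NUM, STR } tag;
	    double num;          /* valid when tag == NUM */
	    const char *str;     /* valid when tag == STR */
	} value;

	/* Rule: a STR is read as a number with the usual syntax; an
	   unconvertible string could be defined as a runtime error. */
	static double as_number(value v)
	{
	    return v.tag == NUM ? v.num : strtod(v.str, NULL);
	}

	static double add(value a, value b)
	{
	    return as_number(a) + as_number(b);
	}

	int main(void)
	{
	    value a = { NUM, 3.0,  NULL };
	    value b = { STR, 0.0, "4.5" };
	    printf("%g\n", add(a, b));   /* 7.5: "4.5" coerced implicitly */
	    return 0;
	}

A compiler that can see both operand forms at compile time could, as
suggested above, warn that the conversion (and its cost) will happen.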

BTW, APL does NOT have implicit conversions between characters and numbers.
It DOES have only one "language-level" notion of a number, and converts
freely among bits, integers, and reals; the most recent APL (IBM's APL-2)
adds complex numbers to this list.  As far as I can see, there are EXACTLY
two justifications for maintaining the distinction between reals and
integers:

	Integers are more efficient.  This may make the distinction worth
	making in some languages and some programs on some machines, but
	it is not significant in any larger view, and it does not belong
	in a language's semantics.

	It allows overloading of arithmetic operators - mainly, the use of
	"/" to do integer division.  I think this was a neat hack from
	the early days of FORTRAN and should have died years and years
	ago.  Let "/" be division; provide a different operator for
	"truncating division", an operation with quite different
	semantics (see the sketch after this list).  I've seen more
	confused (and confusing) code come out of mixed-mode operations
	trying to do rounding than I care to think about.  Numbers are
	numbers.
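
C, ironically, preserves the FORTRAN hack: the same "/" is ordinary or
truncating division depending on operand types.  Here is a sketch of the
alternative, with truncation under a separate name (divq is an invented
name, not anyone's standard library):

	/* "/" overloaded by operand type, versus a separately named
	   truncating division.  Link with -lm on some systems. */
	#include <stdio.h>
	#include <math.h>

	static double divq(double a, double b)  /* truncating division */
	{
	    return trunc(a / b);
	}

	int main(void)
	{
	    printf("%d\n",  7 / 2);           /* 3   - integer "/" truncates */
	    printf("%g\n",  7.0 / 2.0);       /* 3.5 - same operator, reals  */
	    printf("%g\n",  divq(7.0, 2.0));  /* 3   - explicit truncation   */
	    return 0;
	}

With one meaning per operator, mixed-mode expressions stop being a
rounding puzzle.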

I'd like to conclude with a little story that illustrates the way in which
we have allowed certain beliefs to become blinding religions.

A couple of years back, an article appeared in (I think) Software - Practice
and Experience.  It discussed the addition to SITBOL, an implementation of
SNOBOL, of some facilities for user access to various transitions within the
code.  (The details are of no interest here.)  One problem that came up was
that there was a need for a test to be made every time around the innermost
loop of the interpreter; if a flag had gotten set somehow (asynchronously,
perhaps), the interpreter was to go off and do something special.  (This was
assumed to be a fairly rare event.)  Because the test would have to be
executed so often, the developers put a big effort into making it as cheap as
possible.  They discussed a large number of possible implementations, and
eventually settled on one that uses the PDP-10 EXECUTE instruction to execute
a cell that normally contains a NO-OP, but is "set" by inserting a subroutine
call to some special routine.  This turns out to be the cheapest way to do
things...or does it?  There is in fact a simple method with ZERO cost when
the special event is not to take place, and minimal cost when it is to take
place.  Do you see it?....



I'm willing to bet that the majority of readers did NOT see the trick.  It's
that old bugaboo, self-modifying code.  In the "unset" state, the last
instruction of the interpreter loop is simply a branch back to the top; to
"set" the flag, you overwrite that branch with a branch to the special code.
(To do this with "pure" code - as the PDP-10 allows - you just copy the inner
loop - which is very small - into the "impure" area on program startup, and
execute it from there.  This will work fine on any architecture except those
with separate I and D spaces/segments.)
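
For the curious, the shape of the trick can be sketched in C without PDP-10
assembler, using GCC's computed-goto extension.  This is only an analog:
instead of overwriting the branch instruction itself, the back-edge target
lives in a variable that gets "patched", so the common case pays one
indirect jump rather than literally zero.  The n == 5 test below merely
simulates the asynchronous flag-setter.

	/* Branch-patching sketch: the loop's back edge is an indirect
	   jump through back_edge; "setting the flag" rewrites that
	   target.  Requires GCC/clang (computed goto is an extension). */
	#include <stdio.h>

	int main(void)
	{
	    void *back_edge = &&top;   /* "unset": branch back to top */
	    long n = 0;

	top:
	    n++;                       /* the interpreter's real work  */
	    if (n >= 10) goto done;    /* demo scaffolding only        */
	    if (n == 5)                /* simulate the async event:    */
	        back_edge = &&special; /* "patch" the back branch      */
	    goto *back_edge;

	special:
	    printf("special event handled at iteration %ld\n", n);
	    back_edge = &&top;         /* restore the plain branch     */
	    goto *back_edge;

	done:
	    return 0;
	}

The genuinely self-modifying version stores the new branch directly over
the old one, which is what makes its unset-state cost exactly zero.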

Now, there may be reasons NOT to use this technique; that's not the point.
The point is that this technique is NOT EVEN MENTIONED - and apparently was
not even thought of by the authors - DESPITE their great concern for squeezing
every last bit of inefficiency out of their code.  Why?  Because the outright
inadmissibility of self-modifying code has been raised to the level of a
religious belief.  At one time, self-modifying code was extremely common;
when index registers were developed, 99+% of the NEED for such code went
away, and with it a lot of hard-to-debug code.  This was fine, in fact, a major
advance.  BUT we over-learned; we now refuse to apply a perfectly good
technique in the remaining <1% (more likely, <.01%) of cases where it would help,
sometimes greatly.

While this is a glaring example, I've seen others.  In only 30 years, this
field has created more than its share of taboos, unjustified beliefs, and
religious sentiments.  My own feeling is:  Any rule or rule of thumb that
you can't provide good reasons for is probably at least partly wrong.

							-- Jerry
					decvax!yale-comix!leichter leichter@yale