[comp.lang.misc] type checking problem

cok@islsun.Kodak.COM (David Cok) (05/17/91)

Here is a real-life problem which advocates of various languages/typing systems
are invited to comment on. 
 
I want to write a routine to optimize a function (numerically) over one of its 
arguments, returning the abscissa of the extremum.  I'll settle for 
optimization over the first argument.  I'll even restrict myself to that 
argument and the return value being doubles.  So I want a function 
                optimize() 
which returns a double and takes as an argument a pointer to a function which
returns a double and takes as arguments a double *and other arguments*.  It 
would be used like this

		optimum_x = optimize(f,...)

The problem is what to do with the other arguments to f.  One solution is to
allow creation of functions on the fly.  Given, say, myf of four arguments,
one would write
		optimum_x = optimize(myf(#1,a,b,c))
in which myf(#1,a,b,c) is a function of one argument obtained by supplying
values for three of the four arguments of f.  This can be done in interpreted
environments like Mathematica.  Can it be done in any compiled programming
languages?  Smalltalk with blocks perhaps?

A similar approach is to write a wrapper function which takes only one
argument -- the one to be optimized -- but then the other arguments would
have to be passed in as globals, which is the sort of design I want to avoid.

Instead I want to declare something like this: 
                double optimize(double (*f)(double,...),...) 
but where the types of the two ...'s match up.  Note that all optimize does 
with this list of arguments is to call the function f on them, i.e. 
                double optimize(double (*f)(double,...),...) { 
                        double x, result; 
                        while (not converged) 
                           {     
                                pick a new x; 
                                result = f(x,...); 
                           } 
                        return x; 
                } 
Given some syntax to indicate that the two ...'s denote the same number and
sequence of types, both the implementation of optimize and the call of
optimize could be statically type-checked.  However, no statically
type-checked language that I know of allows this.  Are there any?  Note
that the problem involves no dynamically created inhomogeneous lists
requiring run-time checking of type tags, nor other such examples proposed
by dynamic-typing advocates; it does, however, require flexibility in the
number of arguments, which is not common in dynamically typed languages.
Would anyone care to point me to an example?

One potential partial statically-typed solution is C++ with function templates:

double optimize(double (*f)()) {... f() ... }

template<class T1> double optimize(double (*f)(T1), T1 a1) { ... f(a1) ...}

template<class T1, class T2> double optimize(double (*f)(T1,T2), T1 a1, T2 a2)
					{  ... f(a1,a2) ...}

This gives flexibility in type but not in number of arguments (without writing 
a template for each possible number of arguments).  However, it sure would be
a shame to have to have separate instantiations of the code for each possible
combination of arguments.
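[A much later footnote: C++ eventually grew variadic templates, which let a
single parameter pack appear both inside the function-pointer type and in
optimize's own trailing parameters, so the two ...'s are forced to agree at
compile time -- exactly the matched-lists syntax asked for above.  A sketch,
with a deliberately trivial stand-in for the real optimizer loop:

```cpp
#include <cassert>
#include <cmath>

// One template covers every arity: Args... must match between the
// function-pointer type and optimize's own trailing parameters, so both
// the body and every call site are statically type-checked.
// (Variadic templates are far newer than this thread; sketch only.)
template <class... Args>
double optimize(double (*f)(double, Args...), Args... rest) {
    // Trivial stand-in for a real optimizer: just probe a grid of x
    // values and keep the abscissa of the smallest f(x, rest...).
    double best_x = 0.0, best_val = f(0.0, rest...);
    for (double x = -10.0; x <= 10.0; x += 0.01) {
        double v = f(x, rest...);
        if (v < best_val) { best_val = v; best_x = x; }
    }
    return best_x;
}

// Illustrative objective: a parabola whose minimum sits at x = a.
double myf(double x, double a, double b) {
    return (x - a) * (x - a) + b;
}
```

Here optimize(myf, 3.0, 1.0) deduces Args as (double, double) and returns
an x near 3.0; calling it with the wrong number or types of extra
arguments is rejected by the compiler.]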

Comments and solutions are welcome.

David R. Cok
Eastman Kodak Company -- Imaging Science Lab
cok@Kodak.COM

quale@saavik.cs.wisc.edu (Douglas E. Quale) (05/17/91)

In article <1991May16.182932.26327@kodak.kodak.com> cok@islsun.Kodak.COM (David Cok) writes:
>Here is a real-life problem which advocates of various languages/typing systems
>are invited to comment on. 
> 
>I want to write a routine to optimize a function (numerically) over one of its 
>arguments, returning the abscissa of the extremum.  I'll settle for 
>optimization over the first argument.  I'll even restrict myself to that 
>argument and the return value being doubles.  So I want a function 
>                optimize() 
>which returns a double and takes as an argument a pointer to a function which
>returns a double and takes as arguments a double *and other arguments*.  It 
>would be used like this
>
>		optimum_x = optimize(f,...)
>
>The problem is what to do with the other arguments to f.  One solution is to
>allow creation of functions on the fly.  Given, say, myf of four arguments,
>one would write
>		optimum_x = optimize(myf(#1,a,b,c))
>in which myf(#1,a,b,c) is a function of one argument obtained by supplying
>values for three of the four arguments of f.  This can be done in interpreted
>environments like Mathematica.  Can it be done in any compiled programming
>languages?  Smalltalk with blocks perhaps?
>

Languages that can create new functions at runtime can handle this without
even breaking a sweat.  It's a one-liner.  (Warning folks, these are
professional languages.  Do not try this in C.)

Smalltalk can indeed do this with blocks, but as I don't know Smalltalk I'll
demonstrate it in lisp and SML.  Someone else is sure to be able to give you
the Smalltalk equivalent.

The function optimize is written to take a unary function to optimize.

In lisp
	(optimize #'(lambda (x) (f x a b c)) ...)

In SML
	optimize(fn x => f(x, a, b, c), ...)

The function objects that are created are called closures, because they
close over the environment containing the values of a, b and c.
This process is partial application (often loosely called currying), and
it is a standard tool of the trade in languages based on the lambda
calculus.  In languages that can't create new functions it's a standard
pain in the ass.
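[Editorial aside: the same one-liner did eventually become expressible in a
compiled, statically typed setting.  A C++ sketch -- lambdas and
std::function postdate this thread by many years, and the function names
here are invented for illustration:

```cpp
#include <cassert>
#include <functional>

// Hypothetical four-argument objective, standing in for the thread's f.
double f(double x, double a, double b, double c) {
    return a * x * x + b * x + c;
}

// The compiled analogue of (lambda (x) (f x a b c)): the returned lambda
// closes over a, b and c, yielding the unary function optimize() wants.
std::function<double(double)> make_unary(double a, double b, double c) {
    return [=](double x) { return f(x, a, b, c); };
}
```

One would then call optimize(make_unary(a, b, c)), or write the lambda
inline at the call site, just as in the Lisp and SML versions.]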

-- Doug Quale
quale@saavik.cs.wisc.edu

kers@hplb.hpl.hp.com (Chris Dollin) (05/17/91)

David Cok asks:

   Here is a real-life problem which advocates of various languages/typing 
   systems are invited to comment on. 

   [stuff deleted]

   The problem is what to do with the other arguments to f.  One solution is to
   allow creation of functions on the fly.  Given, say, myf of four arguments,
   one would write
		   optimum_x = optimize(myf(#1,a,b,c))
   in which myf(#1,a,b,c) is a function of one argument obtained by supplying
   values for three of the four arguments of f.  

In ML, if you choose the argument order of myf right, currying will do the job.
Thus we have

   fun myf x y z = ... expression ...
   ...
   optimise( myf X Y )

optimise is called with a one-parameter function, obtained (as it were) by
``specialising'' myf with the known values of x (ie X) and y (ie Y). Since you
can write argument-jigglers easily, or use lambda-expressions, there's no
problem. [ML is statically typed].

[You can do essentially the same trick in Pop11, Scheme, or Common Lisp, but 
they are not statically typed.]
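[The curried style can also be spelled out by hand in a language without
built-in currying: write myf to take its known arguments first and return
a function of the remaining one.  A C++ sketch, with an invented
arithmetic body purely for illustration:

```cpp
#include <cassert>
#include <functional>

// Hand-rolled currying: supply x and y up front, get back a unary
// function of z -- the analogue of ML's partial application  myf X Y.
std::function<double(double)> myf(double x, double y) {
    return [=](double z) { return x * y + z; };
}
```

optimise(myf(X, Y)) would then receive a one-parameter function, just as
in the ML version; the price is committing to one argument order.]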
--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

anw@maths.nott.ac.uk (Dr A. N. Walker) (05/21/91)

In article <1991May16.182932.26327@kodak.kodak.com> cok@islsun.Kodak.COM
(David Cok) writes:

>					  Given, say, myf of four arguments,
>one would write
>		optimum_x = optimize(myf(#1,a,b,c))
>in which myf(#1,a,b,c) is a function of one argument obtained by supplying
>values for three of the four arguments of f.

	Cf Algol Bulletin #37, July 1974, p 24 [the third time I've given
this reference in a few weeks!] for the Algol partial parametrisation
proposals:

		optimum_x := optimise (myf ( , a, b, c))

On the other hand, good luck in finding a commercially available Algol
compiler with p-p implemented.

> [2nd style of solution]				       The problem
>also requires flexibility in number of arguments which is not common in
>dynamically typed languages.

	Yes, it's a nice example.  I don't know of many real situations
in which the programmer may reasonably claim ignorance of both the number
and type of the parameters to a procedure.  I suspect that a good solution
would require a much more visible type system than is common in compiled
languages [statically or dynamically typed], which is why it's easier in
interpreted languages.  It can *nearly* be done in Algol with modals
(same issue of Algol Bulletin, and same caution as above):

	PROC optimise = (MODE Z, PROC (REAL, REF Z) REAL f, REF Z args) REAL:
	  (REAL result, x;
	   WHILE result := f (x := pick_an_x, args);
		 NOT converged (x, result)
	      DO SKIP OD; x);

which allows things like

	MODE THISZ = STRUCT (REAL p, COMPL q, CHAR f);
	optimum_x := optimise (THISZ, myf, THISZ := (a, b, c))

[I think -- no way of trying it here] at the expense of some restrictions
on the nature of "myf".  [Note that "Z" needn't be instantiated as a
STRUCTure, it could be eg an array of UNIONs.]

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

kers@hplb.hpl.hp.com (Chris Dollin) (05/22/91)

Dr A. N. Walker says:

	   Cf Algol Bulletin #37, July 1974, p 24 [the third time I've given
   this reference in a few weeks!] for the Algol partial parametrisation
   proposals:

		   optimum_x := optimise (myf ( , a, b, c))

My God! Did they *seriously* propose to do partial parameterisation by simply
*omitting* the remaining arguments to a call? No extra syntax - like a
place-holder symbol - *at all*?

The typechecker would spot some silly mistakes, but I'd have thought that
having eg a new keyword to make it clear what was going on would have been
better ...

    myf( _, a, b, c )		or	myf( ARG, a, b, c )

[clear to the *reader*, not the compiler. Currying, incidentally, doesn't
suffer from this so much - the order in which you can partially parameterise is
fixed, and there aren't the commas in the functional-call syntax either
(usually).]
--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

doug@netcom.COM (Doug Merritt) (05/23/91)

In article <1991May21.144739.23901@maths.nott.ac.uk> anw@maths.nott.ac.uk (Dr A. N. Walker) writes:
>	Yes, it's a nice example.  I don't know of many real situations
>in which the programmer may reasonably claim ignorance of both the number
>and type of the parameters to a procedure.

The inner loop of an interpreter: eval(func, parameters).

Since interpreters are applied to zillions of different applications,
not just classic languages, that's "many real situations" right there.
	Doug
-- 
Doug Merritt	doug@netcom.com		apple!netcom!doug

anw@maths.nott.ac.uk (Dr A. N. Walker) (05/25/91)

In article <KERS.91May22091153@cdollin.hpl.hp.com> kers@hplb.hpl.hp.com
(Chris Dollin) writes:

>My God! Did they *seriously* propose to do partial parameterisation by simply
>*omitting* the remaining arguments to a call? No extra syntax - like a
>place-holder symbol - *at all*?

	Yup!  Just like the partial indexisation [:-)] that was in from
the beginning:  eg "vector := matrix [ , j]" to copy the "j"th column.
In fact, since parentheses are a permitted substitution for brackets in
implementations with restricted character sets, the *syntax* (in the
traditional sense) is already provided.

>The typechecker would spot some silly mistakes,

	Well, it's not that bad.  You have to work quite hard to create
a silly mistake that would compile [you can't usually interchange a
"procedure with parameter foo returning bar" and a "bar" without the
compiler complaining].  The only simple one I can think of is where
two adjacent parameters happen to have the same mode and empty and
provided parameters are swapped;  but that seems less likely to happen
than swapping two parameters anyway, which can be a pig to debug in
*any* programming language.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

kers@hplb.hpl.hp.com (Chris Dollin) (05/28/91)

Dr A. N. Walker reminds me:

   (Chris Dollin) writes:
   >My God! Did they *seriously* propose to do partial parameterisation by
   >simply *omitting* the remaining arguments to a call? No extra syntax - 
   >like a place-holder symbol - *at all*?

	   Yup!  Just like the partial indexisation [:-)] that was in from
   the beginning:  eg "vector := matrix [ , j]" to copy the "j"th column.
   In fact, since parentheses are a permitted substitution for brackets in
   implementations with restricted character sets, the *syntax* (in the
   traditional sense) is already provided.

Wow, I'd forgotten that.

It must have been edited out of my memory as an ``undesirable feature'' ....
--

Regards, Kers.      | "You're better off  not dreaming of  the things to come;
Caravan:            | Dreams  are always ending  far too soon."

anw@maths.nott.ac.uk (Dr A. N. Walker) (05/28/91)

In article <1991May22.185944.10208@netcom.COM> doug@netcom.COM (Doug
Merritt) writes:

[example of ignorance of both number and type of parameters to a procedure]

>The inner loop of an interpreter: eval(func, parameters).

	But that has [just] two parameters.  If the source is previously
unscanned, then presumably "func" and "parameters" are of types "string"
and "array of string" (or perhaps "list of string").  If the source has
been tokenised, and possibly syntax-checked, etc., then "func" could be,
for example, "pointer to symbol-table-entry" and "parameters" could be
"pointer to emulated-stack-frame" or some such.  [A symbol-table-entry
will naturally be of a fixed type, known to the interpreter, even though
it may well include a description of dynamic types constructed on the fly
by the program being interpreted.]

	There are many other possibilities, but I cannot see any reason
why "eval" has to be ignorant of how it is to be called [even if the
language being interpreted is inherently extensible, or the interpreter
can itself be interpreted].

	There are plenty of examples of interpreters and compilers written
in languages in which the number and type of procedure parameters must be
statically determinable.

	This example, "eval (func, params)", differs from the previous
example, "optimise (func, params)", because "optimise" was intended to
apply to functions and parameters generated [independently] inside the
same program, whereas "eval" is intended to apply to functions and
parameters generated by other programs being inspected by the interpreter.
Even if "eval" is applied to its own text, it can call on all the other
facilities of the interpreter, whereas "optimise" has no access to the
program source.  I hope this distinction is clear!

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

doug@netcom.COM (Doug Merritt) (05/29/91)

In article <1991May28.141856.26234@maths.nott.ac.uk> anw@maths.nott.ac.uk (Dr A. N. Walker) writes:
>In article <1991May22.185944.10208@netcom.COM> doug@netcom.COM (Doug
>Merritt) writes:
>
>[example of ignorance of both number and type of parameters to a procedure]
>
>>The inner loop of an interpreter: eval(func, parameters).
>
>	But that has [just] two parameters.  If the source is previously
>unscanned, then presumably "func" and "parameters" are of types "string"
>and "array of string" (or perhaps "list of string").  If the source has
>been tokenised, and possibly syntax-checked, etc., then "func" could be,
>for example, "pointer to symbol-table-entry" and "parameters" could be
>"pointer to emulated-stack-frame" or some such.

Nope! I mean the more usual case, where "func" and "parameters" are
completely processed into some internal model, and in the *body* of 'eval'
(as opposed to the *call* of eval()), you have to then call 'func' on
the appropriate parameters.

In the case where 'func' is a builtin function, you need to be able
to call it on an arbitrary number of parameters:
	if (func->addr == sqrt)  (*func->addr)(param[0]);
	if (func->addr == pow)   (*func->addr)(param[0], param[1]);
	if (func->addr == qsort) (*func->addr)(param[0], param[1], param[2], param[3]);
	etc.
I'm not making this up, this has been an irritation every time I've done
an interpreter (e.g. for lisp or "little languages"). C does not lend
itself to such situations of variable type & number of parameters at all.
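[One standard way to tame the per-arity switch, sketched here in C++ with
invented names: give every builtin a single uniform signature taking its
operands as a vector, and dispatch through a table.  The arity check then
lives in data rather than in an ever-growing chain of cases:

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Every builtin is wrapped to the same shape, so eval needs only one
// call form regardless of how many operands the underlying function takes.
using Builtin = double (*)(const std::vector<double>&);

double b_sqrt(const std::vector<double>& a) { return std::sqrt(a[0]); }
double b_pow (const std::vector<double>& a) { return std::pow(a[0], a[1]); }

struct Entry { Builtin fn; std::size_t arity; };

double eval_builtin(const std::map<std::string, Entry>& table,
                    const std::string& name,
                    const std::vector<double>& args) {
    const Entry& e = table.at(name);
    assert(args.size() == e.arity);  // checked at run time, not compile time
    return e.fn(args);
}
```

The wrappers still have to be written once per builtin, but the interpreter
core never repeats the arity enumeration, and the check is data-driven.]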

>	There are plenty of examples of interpreters and compilers written
>in languages in which the number and type of procedure parameters must be
>statically determinable.

But that's just the point. C is such a language, and it's a real pain.
If you want your interpreter to be able to dynamically generate code,
so that functions defined at runtime can be efficiently called as
if they were hardwired primitives, then it gets even worse. You've got
to explicitly enumerate all possible cases of parameter type and number.
Or build C stack frames on the fly via either external assembler code or
kludgy tricks, all of which are highly nonportable (I've done all these
approaches, and I don't like any of them).

>	This example, "eval (func, params)", differs from the previous
>example, "optimise (func, params)", because "optimise" was intended to
>apply to functions and parameters generated [independently] inside the
>same program, whereas "eval" is intended to apply to functions and
>parameters generated by other programs being inspected by the interpreter.
>Even if "eval" is applied to its own text, it can call on all the other
>facilities of the interpreter, whereas "optimise" has no access to the
>program source.  I hope this distinction is clear!

I hope my explanation of my example makes it clear that there is no
such distinction here. In the Lisp world in particular, there is little
or no distinction between statically defined functions and those defined
at runtime, nor in access to interpreter facilities. (More generally,
in any language in which functions are first class citizens, and even
more especially those in which code generation may be done at runtime.)

Interpreters which operate purely on unscanned strings or even tokenized
sequences are much less common and less powerful -- "quick and dirty" --
and while they admittedly make some tricky semantic cases easier to
handle, they are also less interesting for real-world applications.
	Doug
-- 
Doug Merritt	doug@netcom.com		apple!netcom!doug

anw@maths.nott.ac.uk (Dr A. N. Walker) (05/29/91)

In article <1991May28.191001.12219@netcom.COM> doug@netcom.COM (Doug
Merritt) writes:

>Nope! I mean the more usual case,

	[as opposed to "unscanned" or "tokenised" -- ANW]

>				   where "func" and "parameters" are
>completely processed into some internal model, and in the *body* of 'eval'
>(as opposed to the *call* of eval()), you have to then call 'func' on
>the appropriate parameters.

	But then there are two cases:

  a) "func" is user-provided.  The parameters are "arbitrary", but the
     whole mechanism is under the interpreter's control.

  b) "func" is built-in.  The parameters are dictated by the requirements
     of the host libraries.  Somewhere there will be some indication of
     what parameters are required [checked either during processing or by
     "eval", according to taste], and we get something like your code
     [deleted].  Slightly messy, agreed, but not disastrous, and really
     the opposite of the case in question -- so far from the parameters
     being unknown, they are only too well known.  Given something like a
     "lint" library, you could construct the switch body and the relevant
     part of the symbol table automatically, in case the list of built-ins
     is changing.

>If you want your interpreter to be able to dynamically generate code,
>so that functions defined at runtime can be efficiently called as
>if they were hardwired primitives, then it gets even worse.

	This is right at the compiled end of the spectrum between
text-based interpreters, through tokenisers and threaded code, towards
full-blown compilers.  It's not surprising that you have to write
code comparable in complexity [and non-portability] with a full
compiler in order to interface your interpreter with compiled code.

	Note that in (Classic) C, the number and type of function
parameters *is* unspecified, and it doesn't seem to have bought you
anything ("printf" and friends always excepted).  In other words,
I'm happy to sympathise with your problems, and agree that C doesn't
provide all the answers in a convenient way;  but this doesn't have
much to do with allowing arbitrary parameters.

>Interpreters which operate purely on unscanned strings or even tokenized
>sequences are much less common,

	"Hush, hush, whisper who dares,
	    Before BASIC catches you, all unawares." [:-)]

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk