[comp.lang.functional] Higher-Order Removal Transformations

wnc@ivax.doc.ic.ac.uk.doc.ic.ac.uk (Wei N Chin) (05/03/90)

Belated Reply To:

| >From: nelan@enuxha.eas.asu.edu (George Nelan)
| Ok, here's a summary of responses to my posting:

| I'm looking for references to work done on *transforming* higher-order
| functional programs to first-order.  Can anyone help?  E-mail: sufficient
| interest -> will post.  Thank you for your support.
|-------------------------------------------------------------------------------
| From: Paul Kelly <phjk%doc.imperial.ac.uk@NSFnet-Relay.AC.UK>
| 
| A research student here called Chin is just finishing a thesis which includes
| work on this, and has associated papers.  I suggest you email him, "wnc"

That's me! 

| Here is part of the abstract of a term paper I recently finished for a class in
| verification and transformation taught by one of my PhD advisors, Ed Ashcroft.
| The paper is titled "From Higher Order To First Order":
|
|       "Claim: an algorithm is presented that transforms most finitely typed
|	flattened higher order functional programs into equivalent first order
|	programs.  A finitely typed program is, in simple terms, a program that
|	passes (for instance) the type checking algorithm of Miranda or Haskell;
|	......
===============================================================================

My recent PhD thesis, entitled "Automated Methods for Program
Transformation", contains a chapter on higher-order removal. My
result appears to be quite similar (maybe you can confirm this)
to your above claim as it also removes 'most but not all higher-order 
features' and works with only 'well-typed' (or finitely typed) programs.

To structure the higher-order removal transformation, I have broken down 
the transformation into three algorithms. These transformation algorithms are
based on the define/unfold/fold rules of [Burstall&Darlington77] and serve 
the following purposes
	(1) remove 'general' applications
	(2) remove most of the instantiated (non-variable)
	    higher-order arguments
	(3) preserve laziness
Brief descriptions of these three algorithms (using examples)
follow. Note that the examples are based on an uncurried language (Hope-like).
I find 'uncurried' programs to be easier to deal with in unfold/fold
transformations because all parameter arguments are always present for
unfolding.

(1) Remove 'general' applications.
---------------------------------
This is similar (or identical) to what a number of people have
described as 'eta-abstraction'. Lester [Lester89] used this
transformation to simplify his stacklessness analysis.

For example, given:
	prg(u)	= inc(u)(4)
	inc(x)	= lambda y => y+4 end
The 'general' application, inc(u)(4), could be removed by transforming
(in this case just unfolds) to:
	prg(u)	= u+4

'General' applications (in an uncurried program) are terms of the form,
 fid (t1,..,tn)..(ty,..,tz), where fid is a function identifier and each
tx is an argument term. 'Function' applications are terms of the form
fid (t1,..,tn), while 'Variable' applications are terms of the form
v (t1,..,tn)..(ty,..,tz) where v is a variable. These are my terminologies
(apologies to anyone who have defined them differently), and I have shown
that all 'general' applications (in well-typed programs) could be 
eliminated (converted to function applications) by a simple transformation
algorithm. This may perhaps be what you mean by 'flattenning'.

(2) Remove 'instantiated' higher-order arguments
------------------------------------------------
Given a higher-order program:
	f(x)		= alt_map(x,lambda a=>a*a end,lambda a=>a*a*a end)
	alt_map([],F,G)	= []
	alt_map(x:xs,F,G)= F(x):alt_map(xs,G,F)
The second transformation algorithm will remove the 'instantiated'
higher-order function call in f by transforming it to:
	f([])	= []
	f(x:xs)	= x*x:f'(xs)
	f'([])	= []
	f'(x:xs)= x*x*x:f(xs)
This transformation is a form of function specialisation (or partial
evaluation) which specialises function calls with instantiated 
higher-order arguments. We have to be careful which of the higher-order
arguments we are allowed to specialise for recursive functions. If we are not
careful, we may get a non-terminating transformation algorithm. The parameter
criterion I used to determine if an argument can be specialised or not
is called, the 'variable-only' criterion. For example, the above
alt_map function contains two higher-order parameters, F and G, which are
both 'variable-only'. This is so because the corresponding recursive call, 
alt_map(xs,G,F), in the 2nd defining equation of alt_map contains variables 
(constants allowed as well) in those parameter positions.

In contrast, another higher-order function:
	acc_map([],F)	= []
	acc_map(x:xs,F) = F(x):acc_map(xs,lambda a=>a+F(x) end)
does not have any 'variable-only' higher-order parameter. This is because 
its corresponding recursive function call, acc_map(xs,lambda a=>a+F(x) end),
contains an accumulating term, lambda a=>a+F(x) end, in its ho-parameter
position. Such parameters are more difficult to specialise/remove and
may require laws to do so. I have not offered any solutions towards 
the removal of ho-arguments without the variable-only criterion.

3) Preserving Full Laziness
---------------------------

Higher-order parameters which are non-linear may, at times, result in 
loss of efficiency when they are specialised. For example, the map function
below:
	map([],F) 	= []
	map(x:xs,F)	= F(x):map(xs,F)
contains a non-linear higher-order parameter, F (occurred twice in the RHS
of the second equation). As a result, if we were to directly specialise the 
map function call in:
	g(xs,y)		= map(xs,lambda a=>sqr(y)+a end)
we get a less efficient program (note the re-computed sub-term
sqr(y)):
	g([],y)		= []
	g(x:xs,y)	= (sqr(y)+x):g(xs,y)
This issue is related to Wadworth's
full laziness graph reduction technique and Hughes's fully lazy lambda
lifting technique. To avoid this loss of laziness. I formulated a
transformation algorithm which would extract out 'ground-type' maximal
free expressions (MFEs - see [Hughes82]). A ground-type MFE for the 
above example is sqr(y). This could be extracted out to give:
	g(xs,y)		= g'(xs,sqr(y))
	g'(xs,Y)	= map(xs,lambda a=>Y+a end)
before g' is specialised to:
	g'([],Y)	= []
	g'(x:xs,Y)	= (Y+x):g'(xs,Y)
We are interested in extracting only ground-type MFEs and not function-type 
MFE. This is because the extraction of function-type MFEs, e.g. + sqr(y),
actually introduces (rather than remove) ho-functions - contradicting 
with the goal of ho-removal transformation. 

The laziness preserving transformation is quite complex because we may need to
extract out deeply 'buried' ground-type free expressions. I shall not go into
details here. The transformation algorithm can be found in my thesis and 
in a paper I am currently revising (e-mail me for details).

Additional Issues
-----------------

The transformation algorithms described are correct because they are based
on unfold/fold and satisfies Scherlis/Kott's total correctness criterion.
In addition, I have also proved the termination of these transformation
algorithms. The proof relied on the fact that programs are well-typed. An 
informal explanation for this well-typed requirement (to ensure termination)
is that the essence of the ho-removal algorithms is to reduce
the 'order' of programs. As well-typed programs have finite 'order', they
could always be transformed/removed within a finite number of iteration steps 
by the algorithm.

Lastly, the above algorithms does not directly eliminate list/trees of
function elements. However, when they are combined with an extended form
of Wadler's deforestation algorithm [Wadler88], we have a more 
powerful technique which is able to eliminate list/trees of functions. 
This is interesting because (in the opposite direction) we are using 
the higher-order removal technique to  help extend Wadler's first-order 
deforestation algorithm to higher-order programs. 

Related Works
-------------

[Goguen88] and [Wadler88] both proposed a macro expansion (unfolding) technique
to remove a restricted class of higher-order functions. More
specifically, their technqiue could be applied to higher-order functions 
whose function-type parameters are 'fixed' (e.g. map function above 
but not alt_map nor acc_map).

[Nielson&Nielson89] proposed the higher-order function removal 
transformation as a special form of partial evaluation 
with the help of BTA (Binding-time analysis).
I think their proposed technique is presently only able to handle
the removal of function-type arguments without free variables.

References
----------
[Burstall&Darlington77] A Transformation System for Developing 
		recursive programs, JACM 1977
[Goguen88] Higher-Order Functions considered Unnecessary for Higher-order
		Programming, SRI-CSL-88-1
[Lester89] Stacklessness: Compiling Recursion for a Distributed Architecture
		FPCA '89
[Hughes82] Supercombinators: A new implementation technique for Applicative
		Languages ACM LFP 1982
[Nielson&Nielson89] Transformation on Higher-order Functions FPCA'89.
[Wadler88] Deforestation - Transforming Programs to Eliminate Trees,
		ESOP '88
==========================================================================

..WN Chin
Dept of Computing, Imperial College

PS I have not yet implemented the described transformation techniques.