[comp.lang.scheme] Oh Continuations!

pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/26/90)
There was some discussion somewhere somewhen about continuations;
somebody asked me for a discussion of continuations, and I wrote a quick
introduction, and it was appreciated. Here I am posting it for a wider
public, the N-th in a long series of treatises :->.

Some disclaimers: I am using C-like syntax to show that it is not that
impossible to add continuations even to such a language; I would not
however use precisely the same syntax to add continuations to C (I would
probably want to use library functions instead). The facility described
is very general and analytic, and not exactly equivalent to that found
in say Scheme; the closest thing I can think of is SL5, the predecessor
to Icon. This article has been crossposted to comp.object; closures are
very similar to objects (consider Simula 67, actor languages).

Finally, this is my work, interests, ideas (I am doing an OS that is
entirely based on capabilities to continuations, CAP or PDP-1 style),
and in no way the University College of Wales is involved, culpable (or
even interested :->) etc... in this work.


The best book on advanced language implementation I know (and I know a lot
of them :->) is Allen, "The Anatomy of Lisp", Academic Press. It does not
apply really only to Lisp, and it makes clear a lot of very fundamental
issues of programming. If one has intellectual curiosity about insight on
what languages and compilers are about and are designed and implemented,
this is the book.  It gives away the whole show.

Now my own words:

-------------------------------------------------------------------------

Environment

	A set of (variable name,value) pairs; if for some pair the value is
	missing, the variable is called 'free', 'unbound', (it's a
	parameter) and the environment is a template.

Procedure

	a procedure is a lump of code plus an enviroment template; example:

		f(x,y) { return x+y; }

	The lump of code is between the braces, the environment template is
	between the parenthesis. Note that local variables, if any belong to
	the environment; local variables are those in the environment
	that are set by the code before being used. Global variables are just
	implicit arguments.

	In languages like C and PL/1 you can refer to procedure values via
	pointers.

Closure

	A procedure where all or a part of the free variables have been
	bound. In languages like C or PL/1 you *cannot* refer to a closure;
	in others you can do something like


		succ(x) : f(x,1)

	to mean that succ(x) is a closure obtained by taking a copy of
	f's environment, then binding in it y to 1.

Partial application

	The process of creating a closure. A closure is a procedure that has
	been partially applied, i.e. where some free variables have been
	bound. With partial application you can create very general procedures
	and create more specialized ones out of them by binding some of
	their arguments. Partial application is not available in C or PL/1.

Procedure instance

	A procedure instance is a closure where there are no free variables
	and a processor context (program counter+registers) is associated
	with it. It is not accessible as such in C or PL/1, but it is the
	result of calling a procedure.

Continuation

	A procedure instance that can be manipulated by the program. In
	languages like C or PL/1 only the language runtime system can handle
	continuations. In C for example you can get a pointer to the current
	continuation using setjmp(cont), which is a runtime system
	procedure; the continuation you get is not however fully general. In
	PL/1 you have to use the tasking features to get a continuation (a
	PL/1 task *is* a continuation), but you cannot use it to transfer
	control to.

In more implementation oriented terms, the stack frame of a procedure is its
closure, and its continuation is that stack frame plus the contents of the
register set. When a procedure in C or PL/1 calls another, it will save the
contents of the register set on the stack under the stack frame, thus
putting its entire continuation on the stack, it will create a closure for
the called procedure by pushing the arguments on the stack, and thus
assigning a value to the free variables in the called procedure, and then
will pass a pointer to its own continuation. The called procedure will
eventually use this pointer to return control to the caller's continuation.

In a language like scheme, you can not only create continuations as you
want, you can also transfer control to them when you want, using call/cc.

In C and PL/1 a procedure always continues (returns) to the calling
continuation. In C, using longjmp(), you can return to a continuation
only *above* the calling procedure.

This is because a continuation is essentially a stack frame; on languages
that put stack frames on a stack, instead of allocating them with malloc(),
you cannot return upwards a pointer to a stack frame, only pass it downwards.

There is a continuation passing style of control flow; when a procedure
calls another it always passes an explicit pointer to the continuation the

called procedure is to continue to, and it need not be the calling
procedure's. This for example makes coroutines trivially easy. Indeed you
can consider a continuation as a coroutine instance to which you can get a
pointer.

With continuation passing you can easily implement tail recursion; you just
continue to the current continuation. Normal recursion instead creates a new
continuation out of the current procedure. Example, first with conventional
recursion (I have not used conditional expressions to show the control flow
better):

	fact(n)
	{
	    if (n == 1)
		    return 1;
	    else    return n * fact(n-1);
	}

	main()
	{
	    printf("fact(4): %d\n",fact(4));
	    return 0;
	}

This will create a new continuation with a new closure for n and fact_n_1 at
each level of recursion. Let's assume you have a few new keywords, that give
access to the basic of the implementation to user programs (note that as
defined they presume that we have garbage collection):

closure

	The type of a closure

continuation

	The type of a continuation

another PROC => CLOSURE

	Make a new closure using the PROC procedure as a template
	without binding any of its parameters, and return a pointer
	to it.

bind CLOSURE([EXPR1],...) => CLOSURE

	Take the given closure pointer CLOSURE and set its parameters to the
	given expressions, and return the closure pointer. If a parameter is
	not to be bound, it can be left empty, or omitted if it is a trailing
	one. To do partial application, one will start binding the rightmost
	parameters, so that it is normally advisable to put the least
	"general" parameter to the left.

with CLOSURE => CONTINUATION

	Make a continuation out of CLOSURE pointed, with the program counter
	at the entry point of the code denoted by the closure, and return a
	pointer to it.

continue CONTINUATION

	Transfer control to the given CONTINUATION pointer; if this is
	nil, the processor is halted.

resume LABEL => CONTINUATION

	Return pointer to a copy of the current continuation, with the
	program counter set to label LABEL.

We could rewrite the recursive example above as

	fact(returnto,result,n)
	    continuation *returnto;
	    int *result;
	    int n;
	{
	    if (n == 1)
		*result = 1;
	    else
	    {
		int fact_n_1;

		continue with bind another fact(resume fact_n,&fact_n_1,n-1);

	    fact_n:
		*result = n * fact_n_1;
	    }

	    continue returnto;
	}

	main(kernel,returncode)
	    continuation *kernel;
	    int *returncode;
	{
	    int fact_4;

	    continue with bind another fact(resume print,&fact_4,4);

	print:
	    continue with bind another printf
		(resume exit,"fact(4): %d\n",fact_4);

	exit:
	    *returncode = 0;

	    continue kernel;
	}

Notice that this version is exactly equivalent to the one above; all the
aspects of the implementation of a procedure call have been exposed to the
naked eye, when they are usually hidden:

another fact
	the allocation of the called procedure stack frame

resume fact_n	    
	the return address is the first instruction after the call

&fact_n_1
	where to store the result value

bind ...(...)
	pushing the implicit (return address, return value location) and
	explicit (n) arguments in the called procedure stack frame.

*result = ...
	setting the value to be returned

continue returnto
	return;

Notice also that for example the C 'goto L' can also be expressed as
'continue resume L'.

Now we can see the tail recursive rewrite of the above example (I will
rewrite just fact iself here); at each continuation n will be drecreased by
1, and *result will have increased by n.

	fact(returnto,n,result)
	    continuation *returnto;
	    int n; int *result;
	{
	    if (n < 1)
		    continue returnto;
	    else    *result = 1, /* the unity for multiplication */
		    continue with bind another factorial(returnto,result);
	}

	factorial(n,returnto,result)
	    int n; int *result;
	    continuation *returnto;
	{
	self:
	    if (n == 1)
		    continue returnto;
	    else    *result *= n, continue bind resume self(n-1);
	}

Note several interesting aspects of this; factorial does not return to fact,
but directly to fact's invoker; factorial's 'n' gets rebound at every
tail-recursive call, and it is obvious that there is reuse of factorial's
continuation, given that there is no call to 'another', etc...

Another example could be a nice preemptive scheduler. Let's assume that
we have a few library helps:

continueafter(continueto,resumeto,millisecond)
continuation *continueto;
continuation *resumeto;
unsigned milliseconds;

	The procedure will continue to 'continueto' after having setup
	things so that after at least the given number of 'milliseconds', the
	current continuation will be forced to execute the following line:

	{ continue rebind resumeto{resume restart}; restart:; }

struct queue

	A fifo queue of continuation pointers

enqueue(returnto,last,queue)
continuation *returnto;
struct queue *queue;
continuation *last;

	Append to the queue the last continuation

dequeue(returnto,first,queue)
continuation *returnto;
struct queue *queue;
continuation **first;

	Remove the first continuation from the queue

Our preemptive scheduler (roundrobin, fixed slice size) may then be
(omitting details such as ensuring atomicity here and there):

	scheduler(process,timeslice)
    	    continuation *process;
	    unsigned timeslice;
	{
	    struct queue runnable;
	    closure *timeslicer;

	    timeslicer = bind another continueafter
		(,resume passivate,timeslice);

	passivate:

	    continue with bind another enqueue
		(resume choose,process,&runnable);

	choose:

	    continue with bind another dequeue
		(resume activate,&process,&runnable);

	activate:

	    continue with bind timeslicer(process);
	}


What happens here is that we specialize 'continueafter', then we enter a
loop, where continuation passing works as follows:

	schedule:passivate	(we have got the current process)
	enqueue			(put it at the end of the queue)
	schedule:choose		(get the process at the start of the queue)
	schedule:activate	(the time slicing logic)
	continueafter		(post an interrupt request...)
	process			(...and continue with the chosen process)
	[ an interrupt ]	(continue to schedule, set the current process)
	schedule:passivate
	...

Notice the extreme clarity and flexibility and power. For example it is
possible to have multiple instances of the scheduler, e.g. for a multiple
processor machine; it is very easy to change the scheduling policy, the
paragraph 'choose:', which in my example is simply always choosing the
first runnable process, etc...

--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk