pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/26/90)
There was some discussion somewhere somewhen about continuations;
somebody asked me for a discussion of continuations, and I wrote a quick
introduction, and it was appreciated. Here I am posting it for a wider
public, the N-th in a long series of treatises :->.
Some disclaimers: I am using C-like syntax to show that it is not that
impossible to add continuations even to such a language; I would not
however use precisely the same syntax to add continuations to C (I would
probably want to use library functions instead). The facility described
is very general and analytic, and not exactly equivalent to that found
in say Scheme; the closest thing I can think of is SL5, the predecessor
to Icon. This article has been crossposted to comp.object; closures are
very similar to objects (consider Simula 67, actor languages).
Finally, this is my work, interests, ideas (I am doing an OS that is
entirely based on capabilities to continuations, CAP or PDP-1 style),
and in no way the University College of Wales is involved, culpable (or
even interested :->) etc... in this work.
The best book on advanced language implementation I know (and I know a lot
of them :->) is Allen, "The Anatomy of Lisp", Academic Press. It does not
apply really only to Lisp, and it makes clear a lot of very fundamental
issues of programming. If one has intellectual curiosity about insight on
what languages and compilers are about and are designed and implemented,
this is the book. It gives away the whole show.
Now my own words:
-------------------------------------------------------------------------
Environment
A set of (variable name,value) pairs; if for some pair the value is
missing, the variable is called 'free', 'unbound', (it's a
parameter) and the environment is a template.
Procedure
a procedure is a lump of code plus an enviroment template; example:
f(x,y) { return x+y; }
The lump of code is between the braces, the environment template is
between the parenthesis. Note that local variables, if any belong to
the environment; local variables are those in the environment
that are set by the code before being used. Global variables are just
implicit arguments.
In languages like C and PL/1 you can refer to procedure values via
pointers.
Closure
A procedure where all or a part of the free variables have been
bound. In languages like C or PL/1 you *cannot* refer to a closure;
in others you can do something like
succ(x) : f(x,1)
to mean that succ(x) is a closure obtained by taking a copy of
f's environment, then binding in it y to 1.
Partial application
The process of creating a closure. A closure is a procedure that has
been partially applied, i.e. where some free variables have been
bound. With partial application you can create very general procedures
and create more specialized ones out of them by binding some of
their arguments. Partial application is not available in C or PL/1.
Procedure instance
A procedure instance is a closure where there are no free variables
and a processor context (program counter+registers) is associated
with it. It is not accessible as such in C or PL/1, but it is the
result of calling a procedure.
Continuation
A procedure instance that can be manipulated by the program. In
languages like C or PL/1 only the language runtime system can handle
continuations. In C for example you can get a pointer to the current
continuation using setjmp(cont), which is a runtime system
procedure; the continuation you get is not however fully general. In
PL/1 you have to use the tasking features to get a continuation (a
PL/1 task *is* a continuation), but you cannot use it to transfer
control to.
In more implementation oriented terms, the stack frame of a procedure is its
closure, and its continuation is that stack frame plus the contents of the
register set. When a procedure in C or PL/1 calls another, it will save the
contents of the register set on the stack under the stack frame, thus
putting its entire continuation on the stack, it will create a closure for
the called procedure by pushing the arguments on the stack, and thus
assigning a value to the free variables in the called procedure, and then
will pass a pointer to its own continuation. The called procedure will
eventually use this pointer to return control to the caller's continuation.
In a language like scheme, you can not only create continuations as you
want, you can also transfer control to them when you want, using call/cc.
In C and PL/1 a procedure always continues (returns) to the calling
continuation. In C, using longjmp(), you can return to a continuation
only *above* the calling procedure.
This is because a continuation is essentially a stack frame; on languages
that put stack frames on a stack, instead of allocating them with malloc(),
you cannot return upwards a pointer to a stack frame, only pass it downwards.
There is a continuation passing style of control flow; when a procedure
calls another it always passes an explicit pointer to the continuation the
called procedure is to continue to, and it need not be the calling
procedure's. This for example makes coroutines trivially easy. Indeed you
can consider a continuation as a coroutine instance to which you can get a
pointer.
With continuation passing you can easily implement tail recursion; you just
continue to the current continuation. Normal recursion instead creates a new
continuation out of the current procedure. Example, first with conventional
recursion (I have not used conditional expressions to show the control flow
better):
fact(n)
{
if (n == 1)
return 1;
else return n * fact(n-1);
}
main()
{
printf("fact(4): %d\n",fact(4));
return 0;
}
This will create a new continuation with a new closure for n and fact_n_1 at
each level of recursion. Let's assume you have a few new keywords, that give
access to the basic of the implementation to user programs (note that as
defined they presume that we have garbage collection):
closure
The type of a closure
continuation
The type of a continuation
another PROC => CLOSURE
Make a new closure using the PROC procedure as a template
without binding any of its parameters, and return a pointer
to it.
bind CLOSURE([EXPR1],...) => CLOSURE
Take the given closure pointer CLOSURE and set its parameters to the
given expressions, and return the closure pointer. If a parameter is
not to be bound, it can be left empty, or omitted if it is a trailing
one. To do partial application, one will start binding the rightmost
parameters, so that it is normally advisable to put the least
"general" parameter to the left.
with CLOSURE => CONTINUATION
Make a continuation out of CLOSURE pointed, with the program counter
at the entry point of the code denoted by the closure, and return a
pointer to it.
continue CONTINUATION
Transfer control to the given CONTINUATION pointer; if this is
nil, the processor is halted.
resume LABEL => CONTINUATION
Return pointer to a copy of the current continuation, with the
program counter set to label LABEL.
We could rewrite the recursive example above as
fact(returnto,result,n)
continuation *returnto;
int *result;
int n;
{
if (n == 1)
*result = 1;
else
{
int fact_n_1;
continue with bind another fact(resume fact_n,&fact_n_1,n-1);
fact_n:
*result = n * fact_n_1;
}
continue returnto;
}
main(kernel,returncode)
continuation *kernel;
int *returncode;
{
int fact_4;
continue with bind another fact(resume print,&fact_4,4);
print:
continue with bind another printf
(resume exit,"fact(4): %d\n",fact_4);
exit:
*returncode = 0;
continue kernel;
}
Notice that this version is exactly equivalent to the one above; all the
aspects of the implementation of a procedure call have been exposed to the
naked eye, when they are usually hidden:
another fact
the allocation of the called procedure stack frame
resume fact_n
the return address is the first instruction after the call
&fact_n_1
where to store the result value
bind ...(...)
pushing the implicit (return address, return value location) and
explicit (n) arguments in the called procedure stack frame.
*result = ...
setting the value to be returned
continue returnto
return;
Notice also that for example the C 'goto L' can also be expressed as
'continue resume L'.
Now we can see the tail recursive rewrite of the above example (I will
rewrite just fact iself here); at each continuation n will be drecreased by
1, and *result will have increased by n.
fact(returnto,n,result)
continuation *returnto;
int n; int *result;
{
if (n < 1)
continue returnto;
else *result = 1, /* the unity for multiplication */
continue with bind another factorial(returnto,result);
}
factorial(n,returnto,result)
int n; int *result;
continuation *returnto;
{
self:
if (n == 1)
continue returnto;
else *result *= n, continue bind resume self(n-1);
}
Note several interesting aspects of this; factorial does not return to fact,
but directly to fact's invoker; factorial's 'n' gets rebound at every
tail-recursive call, and it is obvious that there is reuse of factorial's
continuation, given that there is no call to 'another', etc...
Another example could be a nice preemptive scheduler. Let's assume that
we have a few library helps:
continueafter(continueto,resumeto,millisecond)
continuation *continueto;
continuation *resumeto;
unsigned milliseconds;
The procedure will continue to 'continueto' after having setup
things so that after at least the given number of 'milliseconds', the
current continuation will be forced to execute the following line:
{ continue rebind resumeto{resume restart}; restart:; }
struct queue
A fifo queue of continuation pointers
enqueue(returnto,last,queue)
continuation *returnto;
struct queue *queue;
continuation *last;
Append to the queue the last continuation
dequeue(returnto,first,queue)
continuation *returnto;
struct queue *queue;
continuation **first;
Remove the first continuation from the queue
Our preemptive scheduler (roundrobin, fixed slice size) may then be
(omitting details such as ensuring atomicity here and there):
scheduler(process,timeslice)
continuation *process;
unsigned timeslice;
{
struct queue runnable;
closure *timeslicer;
timeslicer = bind another continueafter
(,resume passivate,timeslice);
passivate:
continue with bind another enqueue
(resume choose,process,&runnable);
choose:
continue with bind another dequeue
(resume activate,&process,&runnable);
activate:
continue with bind timeslicer(process);
}
What happens here is that we specialize 'continueafter', then we enter a
loop, where continuation passing works as follows:
schedule:passivate (we have got the current process)
enqueue (put it at the end of the queue)
schedule:choose (get the process at the start of the queue)
schedule:activate (the time slicing logic)
continueafter (post an interrupt request...)
process (...and continue with the chosen process)
[ an interrupt ] (continue to schedule, set the current process)
schedule:passivate
...
Notice the extreme clarity and flexibility and power. For example it is
possible to have multiple instances of the scheduler, e.g. for a multiple
processor machine; it is very easy to change the scheduling policy, the
paragraph 'choose:', which in my example is simply always choosing the
first runnable process, etc...
--
Piercarlo "Peter" Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk