pcg@cs.aber.ac.uk (Piercarlo Grandi) (09/26/90)
There was some discussion somewhere somewhen about continuations; somebody asked me for a discussion of continuations, and I wrote a quick introduction, and it was appreciated. Here I am posting it for a wider public, the N-th in a long series of treatises :->. Some disclaimers: I am using C-like syntax to show that it is not that impossible to add continuations even to such a language; I would not however use precisely the same syntax to add continuations to C (I would probably want to use library functions instead). The facility described is very general and analytic, and not exactly equivalent to that found in say Scheme; the closest thing I can think of is SL5, the predecessor to Icon. This article has been crossposted to comp.object; closures are very similar to objects (consider Simula 67, actor languages). Finally, this is my work, interests, ideas (I am doing an OS that is entirely based on capabilities to continuations, CAP or PDP-1 style), and in no way the University College of Wales is involved, culpable (or even interested :->) etc... in this work. The best book on advanced language implementation I know (and I know a lot of them :->) is Allen, "The Anatomy of Lisp", Academic Press. It does not apply really only to Lisp, and it makes clear a lot of very fundamental issues of programming. If one has intellectual curiosity about insight on what languages and compilers are about and are designed and implemented, this is the book. It gives away the whole show. Now my own words: ------------------------------------------------------------------------- Environment A set of (variable name,value) pairs; if for some pair the value is missing, the variable is called 'free', 'unbound', (it's a parameter) and the environment is a template. Procedure a procedure is a lump of code plus an enviroment template; example: f(x,y) { return x+y; } The lump of code is between the braces, the environment template is between the parenthesis. Note that local variables, if any belong to the environment; local variables are those in the environment that are set by the code before being used. Global variables are just implicit arguments. In languages like C and PL/1 you can refer to procedure values via pointers. Closure A procedure where all or a part of the free variables have been bound. In languages like C or PL/1 you *cannot* refer to a closure; in others you can do something like succ(x) : f(x,1) to mean that succ(x) is a closure obtained by taking a copy of f's environment, then binding in it y to 1. Partial application The process of creating a closure. A closure is a procedure that has been partially applied, i.e. where some free variables have been bound. With partial application you can create very general procedures and create more specialized ones out of them by binding some of their arguments. Partial application is not available in C or PL/1. Procedure instance A procedure instance is a closure where there are no free variables and a processor context (program counter+registers) is associated with it. It is not accessible as such in C or PL/1, but it is the result of calling a procedure. Continuation A procedure instance that can be manipulated by the program. In languages like C or PL/1 only the language runtime system can handle continuations. In C for example you can get a pointer to the current continuation using setjmp(cont), which is a runtime system procedure; the continuation you get is not however fully general. In PL/1 you have to use the tasking features to get a continuation (a PL/1 task *is* a continuation), but you cannot use it to transfer control to. In more implementation oriented terms, the stack frame of a procedure is its closure, and its continuation is that stack frame plus the contents of the register set. When a procedure in C or PL/1 calls another, it will save the contents of the register set on the stack under the stack frame, thus putting its entire continuation on the stack, it will create a closure for the called procedure by pushing the arguments on the stack, and thus assigning a value to the free variables in the called procedure, and then will pass a pointer to its own continuation. The called procedure will eventually use this pointer to return control to the caller's continuation. In a language like scheme, you can not only create continuations as you want, you can also transfer control to them when you want, using call/cc. In C and PL/1 a procedure always continues (returns) to the calling continuation. In C, using longjmp(), you can return to a continuation only *above* the calling procedure. This is because a continuation is essentially a stack frame; on languages that put stack frames on a stack, instead of allocating them with malloc(), you cannot return upwards a pointer to a stack frame, only pass it downwards. There is a continuation passing style of control flow; when a procedure calls another it always passes an explicit pointer to the continuation the called procedure is to continue to, and it need not be the calling procedure's. This for example makes coroutines trivially easy. Indeed you can consider a continuation as a coroutine instance to which you can get a pointer. With continuation passing you can easily implement tail recursion; you just continue to the current continuation. Normal recursion instead creates a new continuation out of the current procedure. Example, first with conventional recursion (I have not used conditional expressions to show the control flow better): fact(n) { if (n == 1) return 1; else return n * fact(n-1); } main() { printf("fact(4): %d\n",fact(4)); return 0; } This will create a new continuation with a new closure for n and fact_n_1 at each level of recursion. Let's assume you have a few new keywords, that give access to the basic of the implementation to user programs (note that as defined they presume that we have garbage collection): closure The type of a closure continuation The type of a continuation another PROC => CLOSURE Make a new closure using the PROC procedure as a template without binding any of its parameters, and return a pointer to it. bind CLOSURE([EXPR1],...) => CLOSURE Take the given closure pointer CLOSURE and set its parameters to the given expressions, and return the closure pointer. If a parameter is not to be bound, it can be left empty, or omitted if it is a trailing one. To do partial application, one will start binding the rightmost parameters, so that it is normally advisable to put the least "general" parameter to the left. with CLOSURE => CONTINUATION Make a continuation out of CLOSURE pointed, with the program counter at the entry point of the code denoted by the closure, and return a pointer to it. continue CONTINUATION Transfer control to the given CONTINUATION pointer; if this is nil, the processor is halted. resume LABEL => CONTINUATION Return pointer to a copy of the current continuation, with the program counter set to label LABEL. We could rewrite the recursive example above as fact(returnto,result,n) continuation *returnto; int *result; int n; { if (n == 1) *result = 1; else { int fact_n_1; continue with bind another fact(resume fact_n,&fact_n_1,n-1); fact_n: *result = n * fact_n_1; } continue returnto; } main(kernel,returncode) continuation *kernel; int *returncode; { int fact_4; continue with bind another fact(resume print,&fact_4,4); print: continue with bind another printf (resume exit,"fact(4): %d\n",fact_4); exit: *returncode = 0; continue kernel; } Notice that this version is exactly equivalent to the one above; all the aspects of the implementation of a procedure call have been exposed to the naked eye, when they are usually hidden: another fact the allocation of the called procedure stack frame resume fact_n the return address is the first instruction after the call &fact_n_1 where to store the result value bind ...(...) pushing the implicit (return address, return value location) and explicit (n) arguments in the called procedure stack frame. *result = ... setting the value to be returned continue returnto return; Notice also that for example the C 'goto L' can also be expressed as 'continue resume L'. Now we can see the tail recursive rewrite of the above example (I will rewrite just fact iself here); at each continuation n will be drecreased by 1, and *result will have increased by n. fact(returnto,n,result) continuation *returnto; int n; int *result; { if (n < 1) continue returnto; else *result = 1, /* the unity for multiplication */ continue with bind another factorial(returnto,result); } factorial(n,returnto,result) int n; int *result; continuation *returnto; { self: if (n == 1) continue returnto; else *result *= n, continue bind resume self(n-1); } Note several interesting aspects of this; factorial does not return to fact, but directly to fact's invoker; factorial's 'n' gets rebound at every tail-recursive call, and it is obvious that there is reuse of factorial's continuation, given that there is no call to 'another', etc... Another example could be a nice preemptive scheduler. Let's assume that we have a few library helps: continueafter(continueto,resumeto,millisecond) continuation *continueto; continuation *resumeto; unsigned milliseconds; The procedure will continue to 'continueto' after having setup things so that after at least the given number of 'milliseconds', the current continuation will be forced to execute the following line: { continue rebind resumeto{resume restart}; restart:; } struct queue A fifo queue of continuation pointers enqueue(returnto,last,queue) continuation *returnto; struct queue *queue; continuation *last; Append to the queue the last continuation dequeue(returnto,first,queue) continuation *returnto; struct queue *queue; continuation **first; Remove the first continuation from the queue Our preemptive scheduler (roundrobin, fixed slice size) may then be (omitting details such as ensuring atomicity here and there): scheduler(process,timeslice) continuation *process; unsigned timeslice; { struct queue runnable; closure *timeslicer; timeslicer = bind another continueafter (,resume passivate,timeslice); passivate: continue with bind another enqueue (resume choose,process,&runnable); choose: continue with bind another dequeue (resume activate,&process,&runnable); activate: continue with bind timeslicer(process); } What happens here is that we specialize 'continueafter', then we enter a loop, where continuation passing works as follows: schedule:passivate (we have got the current process) enqueue (put it at the end of the queue) schedule:choose (get the process at the start of the queue) schedule:activate (the time slicing logic) continueafter (post an interrupt request...) process (...and continue with the chosen process) [ an interrupt ] (continue to schedule, set the current process) schedule:passivate ... Notice the extreme clarity and flexibility and power. For example it is possible to have multiple instances of the scheduler, e.g. for a multiple processor machine; it is very easy to change the scheduling policy, the paragraph 'choose:', which in my example is simply always choosing the first runnable process, etc... -- Piercarlo "Peter" Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk