[comp.lang.scheme] Formatted IO and continuations in C/C++

pcg@cs.aber.ac.uk (Piercarlo Grandi) (08/28/90)

I have recently noticed an exchange of views in comp.lang.scheme and
comp.lang.c++ about implementing the call/cc continuation passing style
of programming in C or C++ without assembler assist, and without getting
in too much trouble about tail recursion in the compiler.

	This is a topic in which I am keenly interested, because my
	current research is about an OS design in which something like
	call/cc is the one crucial thing (capabilities to continuations
	are *the* IPC mechanism).

I have been most amused by an article by Simon Peyton L. Jones, in which
he observed that if you keep the arguments on a separate stack, you can
have each procedure reduced to a parameterless state (as far as C or C++
are concerned) and returning a pointer to the next procedure to call,
and with a simple explicit loop the need for tail recursion can be
obviated.

This implies that (as readers of Greussay, Baker, Steele and Allen will
have recognized) the control graph gets divorced from the environment
graph, which is indeed the prerequisite for a lot of funky call/cc style
technology.

This is very similar to an old idea of mine, which I applied in a
formatted transput library for C; it could probably be applied to C++;
indeed the operator << overloading trick of C++ is the staticized
version.

In C's stdio library the problem with formatted transput (note the use
of Algol 68 terminology; the insiders will easily recognize why) is that
each format string is a program within the program, and unfortunately
built to totally different syntax and semantics. This causes a number of
difficulties with abstraction. In particular, the representation of each
entity must be exposed in the format string, and if the representation
changes, the format string must be manually edited to reflect it. While
C has typedef or #define to handle abstraction of types, this cannot be
used with printf(3) format strings.

The alternative used e.g. in Modula-2 and Simula 67 is to have each item
to be transput handled by a separate procedure call, which is not only
verbose, it also wastes code space (well, I still do care about such
unfashionable trivialities) because of many procedure call sequences to
be generated (note that in general a procedure call *is* required, time
wise). Another alternative is to map everything onto strings, and then
transput strings, which is also "inefficient".

Another possibility is to transform the printf(3) string into a string
of procedure calls, and printf(3) itself from a parser into an
interpreter, leaving the parsing to C, so that all the abstraction
facilities of C can be used.

We only would need one such interpreter, whose first argument identifies
the transput file, and whose subsequent arguments, in a variable number,
are a string of format specifications, described as a transput function
followed by its arguments, then another transput function followed by
arguments.

Each transput function will be called with two arguments; the first will
be the transput file, the second the argument list after it. It take
from this its own arguments, and return the reamining part of the
argument list.

The generic transput function will have a shape like:

    typedef va_list function(/* char *,va_list */);
    #define EOL ((function *) 0)
    extern void transput(/* char * */);

in some header, say <transput.h>, and then as implementation:

    #include <varargs.h>
    #include <transput.h>

    /* VARARGS1 */
    void transput(file,va_alist)
	    char *file; va_dcl
    {
	    va_list list;
	    function *formatter;

	    va_start(list);
	    while (formatter = va_arg(list,function *))
		    list = (*formatter)(file,list);
	    va_end(list);
    }

Notice that we use a generic C pointer type for the file, because in
this way this function can be used with any style of underlying transput
channel, not just say stdio. In other words this function is really a
functional that abstracts over *both* the io channel implementation and
the actual transput procedures. Actually, as we will notice later, it
can be used outside the transput field at all.

A format function, for example to write a complex, may be

    #include <stdio.h>
    #include <varargs.h>
    #include "cplx.h" /* contains "struct cplx { float re,im; };" */

    va_list cplx_o(file,list)
	    char *file; va_list list;
    {
	    struct cplx *c = &va_arg(list,struct cplx);
	    fprintf((FILE *) file,"{%g,%g}",c->re,c->im);
	    return list;
    }

and invoked as

    #include <stdio.h>
    #include <transput.h>
    #include "cplx.h"

    struct cplx c = {0.0,1.0};

    int main() { transput(stdout,cplx_o,c,EOL); return 0; }

Note that is also becomes possible to have meta formatters; for example,
a list formatter:

    #include <varargs.h>
    #include <transput.h>

    va_list list_of(file,list)
	    char *file; va_list list;
    {
	    register unsigned elements = va_arg(list,unsigned);
	    register function *formatter = va_arg(list,function *);

	    for (elements; elements != 0; --elements)
		list = (*formatter)(file,list); 
	    return list;
    }

(note that it can be used for both inout and output) to be used as

    #include <stdio.h>
    #include <transput.h>
    #include "cplx.h"

    struct cplx c0 = {0.0,1.0}, c1 = {1.0,0.0}, c2 = {1.0,1.0};

    int main () { transput(stdout,list_of,4,cplx_o,c0,c1,c2,EOL); return 0; }

It is possible to further extend the technique to any sort of transput
function, for example to position the file, or to flush it (reminiscent
of C++ 2.0 iomanip), or even to write actual programs as a sequence of
transput functions; it may be possible to define transput functions that
skip arguments, test for the truth of expressions, or else.

	Note that all this may seem unduly inefficient because of
	extra procedure call overheads, but note also that we are seeing
	here an implementation layered on stdio only for convenience.

	Also note that all this is terribly type unsafe, but so is, and
	worse, the regular format string style.

It is also possible to have a slightly different organization, which is
less transput oriented, and even more applicative or continuation
passing in style; for example something like

    #include <varargs.h>

    typedef va_list function(/* va_list */);

    void perform(va_alist)
	    va_dcl
    {
	    va_list list;
	    function *continuation;

	    va_start(list);
	    while (continuation = va_arg(list,function *))
		    list = (*continuation)(list);
	    va_end(list);
    }

This is actually a full continuation passing interpreter (without upward
funargs, which are hard to do in unabridged C -- but again this is not a
control issue, it is an enviroment one), as the return list from each
call to a function can be another, newly manufactured argument list
prefixed with any function pointer. Of course each function has to
slowly extract its arguments from the list, but this is inevitable with
dynamic dispatching. The alternative, for example in KCL, or suggested
above by Peyton Jones, is to keep the arguments off the C/C++ stack.

	Note: hints of reflexivity here. But reflexivity is so hyped :-).

It is equally easy to construct an interesting but maybe more limited
variation for dynamic method dispatching for an object oriented system,
e.g. for Objective C.  After all, this is not terribly different in
flavour from the virtual functions of C++. The same optimizations could
apply, e.g. in avoiding the dynamic dispatch when it can be resolved at
compile time.

Most of this will not be news to readers of Allen's Anatomy of Lisp, or
Greussay's thesis and papers, or Steele's and Baker's MIT AI reports,
but here we see it all in C, which is fairly rare.

I think that all this supports in some way my long standing contention
that higher order functions and any form of control abstraction, are the
dark underdeveloped side of algorithmic languages (that have grown more
sophisticated w.r.t. data abstraction, vide C++), and this is pity, as
they can be done even in C/C++. It may also give a hint on why I believe
that programs are made of layers where each layer has control
abstraction glue holding together data abstraction tranformers (sorry
for the imprecise language).

	Note: I am writing an article to be sent to some journal (CJ) on
	this, and much else besides, but do not hold your breath; it has
	been "in progress" for the last three years :-) :-(.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk