[comp.lang.c] Tail recursion in C

moore@utah-cs.UUCP (Tim Moore) (10/30/87)

Are there any C compilers which perform tail recursive optimization?
As an example, are there any C compilers under which the following
(stupid) program 

main()
{
    change(1);
}

change(a)
     int a;
{
    int b;

    b = -a;
    change(b);
}

will run forever and not run out of stack?

landon@Apple.COM (Landon Dyer) (07/14/89)

(This is partially a response to a posting on su.computers, but I have an
additional question).


Whenever someone asks "why can't I tail-recurse in C?" my eyes bug out, my
vision gets fuzzy, and I basically lose it.  I love tail recursion when I'm
using Scheme, but I'm completely revolted by the concept in C.


Here's what ANSI sez about the semantics of auto scoped variables (section
2.1.2.3 for you language lawyers -- uppercase "italics" are my own):

| An instance of each object with automatic storage duration is associated
| with each entry into its block.  Such an object exists and RETAINS ITS
| LAST-STORED VALUE during the execution of the block and WHILE THE BLOCK IS
| SUSPENDED (BY A CALL OF A FUNCTION or receipt of a signal).


What this means is that you can't tail-recurse if the address of a local can
"leak out" of its scope.  Consider this fragment (sorry about its length):

	--------------------------------
	typedef struct Link {
	    struct Link*  fNext;
	    int           fValue;
	} Link
	
	Link* gLinkList = 0;
	
	void f(int n) {
	    Link elem;
	
	    /* link 'elem' into a global list of 'elem's */
	    elem.fNext = gLinkList;
	    elem.fValue = i;
	    gLinkList = &elem;
	
	    if (n >= 0)
	        f(n-1);		/* recurse */
	    else {
		/* walk 'gLinkList' or something */
	    }
	}
	--------------------------------

At first glance, 'f()' is a tail-recursive function that recurses 'n+1'
times.  But we're in trouble, because third parties (the global 'gLinkList'
in this case) have pointers to locals that are contained in the control
stack.  Fortunately this is something a compiler can detect.

Question: does GCC do this right?


Things get much worse in C++, since destructors for objects have to be
called when the objects go out of scope.  That is, there may be objects to
destroy after the call to 'f(n-1)'.

I would hate to maintain code like that -- especially since the
tail-recursion-ness of a particular function can evaporate the moment
someone adds a constructor / destructor to a class that didn't have one
before.  [C++ has worse problems, but who needs the headache?]


My biggest objection is purely pragmatic.  Your carefully tailored tail-
recursive functions may become real-recursive when you move to a different
compiler.

Now, about that garbage collection....


----
Landon Dyer, Apple Computer, Inc.            "C++ is the FORTH of the 1990s;
Development Systems Group (MPW)               they don't call it 'Uh-oh'
Everything I said here is utter nonsense.     programming for nothing!"

jwl@ernie.Berkeley.EDU (James Wilbur Lewis) (07/14/89)

In article <33133@apple.Apple.COM> landon@Apple.COM (Landon Dyer) writes:
-Whenever someone asks "why can't I tail-recurse in C?" my eyes bug out, my
-vision gets fuzzy, and I basically lose it.  I love tail recursion when I'm
-using Scheme, but I'm completely revolted by the concept in C.
-
-
-Here's what ANSI sez about the semantics of auto scoped variables (section
-2.1.2.3 for you language lawyers -- uppercase "italics" are my own):
-
-[...]
-
-What this means is that you can't tail-recurse if the address of a local can
-"leak out" of its scope.

Eh?  You can tail-recurse just fine, but maybe not do the optimization you'd
want...is that what you meant?  Or did I completely miss your point?

-[...for example, if you take the address of an auto variable and store it
-in a global variable...]

-	Link* gLinkList = 0;
-	
-	void f(int n) {
-	    Link elem;
-	    [...]
-	    gLinkList = &elem;
-	    [...]
-	}

Maybe I'm just being dense, but why would anyone ever want to do this?
When the block containing the auto variable is exited, the variable
ceases to exist and you're left with a (dangling) pointer into the
stack somewhere...right?   Perhaps it's legal C, but under what 
circumstances could a program pulling such a stunt be correct? I mean,
to me it seems almost as dumb as, say, adding pointers. :-)

-I would hate to maintain code like that -- 

Jeez, you're not kidding!  But this appears to be a spectacularly bad
programming practice -- does it make sense to criticize C because it
also happens to be legal?  After all, as they say, there has never been
a language in which it was the least bit difficult to write bad programs... 

-- Jim Lewis
   U.C. Berkeley

gwyn@smoke.BRL.MIL (Doug Gwyn) (07/15/89)

In article <33133@apple.Apple.COM> landon@Apple.COM (Landon Dyer) writes:
>At first glance, 'f()' is a tail-recursive function that recurses 'n+1'
>times.  But we're in trouble, because third parties (the global 'gLinkList'
>in this case) have pointers to locals that are contained in the control
>stack.

C allows general recursion, not just tail recursion.  Of course you're
constrained in that you can't legally use variables once their lifetime
has expired; however in your example all autos in the function nest
would still be alive for the n<0 branch of the innermost call.  What
you DO have to watch out for is not relying on the auto storage
mechanism to provide storage outside the dynamic scoping inherent in
the use of a stack discipline.  That's what the heap (malloc()) is for.

bill@twwells.com (T. William Wells) (07/16/89)

In article <30050@ucbvax.BERKELEY.EDU> jwl@ernie.Berkeley.EDU.UUCP (James Wilbur Lewis) writes:
: Maybe I'm just being dense, but why would anyone ever want to do this?
: When the block containing the auto variable is exited, the variable
: ceases to exist and you're left with a (dangling) pointer into the
: stack somewhere...right?   Perhaps it's legal C, but under what
: circumstances could a program pulling such a stunt be correct? I mean,
: to me it seems almost as dumb as, say, adding pointers. :-)

Well, here's an example: I had a directed graph I wanted to traverse
and wanted to process paths of a specified through it, from the leaves
toward the "root". The code looked something like:

/* traverse(root, length, (NODELIST *)0); length > 0 */

void
traverse(node, length, next)
NODE    *node;
int     length;
NODELIST *next;
{
	NODELIST *np;
	NODELIST link;

	link.next = next;
	link.node = node;
	if (length == 1) {
		process_path(&link);
	} else {
		for (np = node->list; np; np = np->next) {
			traverse(np->node, length - 1, &link);
		}
	}
}

The idea is that on the stack is a linked list of NODELISTs which can
be scanned for each NODE on the path by the process_path() routine.

Note that there are no dangling pointers.

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

richard@aiai.ed.ac.uk (Richard Tobin) (07/19/89)

In article <10530@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <33133@apple.Apple.COM> landon@Apple.COM (Landon Dyer) writes:
>>At first glance, 'f()' is a tail-recursive function that recurses 'n+1'
>>times.  But we're in trouble, because third parties (the global 'gLinkList'
>>in this case) have pointers to locals that are contained in the control
>>stack.
>
>C allows general recursion, not just tail recursion.  

I think the original poster was referring to tail-recursion
optimization, which means turning calls immediately before return into
jumps, thus avoiding stack overflow by re-using the stack frame.  Some
languages (eg Scheme and Postscript) specify this behaviour as part of
the language definition.

To do this in C, the compiler must be sure that there are no pointers
into the stack frame that's about to be re-used.  And if the arguments
are popped of the stack by the caller, you have to make sure it will
still work.  When I last looked at gcc, it generated jumps only for
tail-calls to the same function, and only when the stack frame is empty.

A C compiler that did this more generally would be very useful for
people who want to compile Scheme and other tail-recursive Lisps to C.

-- Richard
-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin