[comp.lang.scheme] Tail recursion - it's not just for breakfast any more.

carlton@husc10.harvard.edu (david carlton) (05/12/91)

(Terminological warning: throughout this article I am going to use the
phrase "Tail-recursive procedure" to mean "procedure guaranteed to use
a finite amount of stack space", and similarly for related phrases.
Apologies if this grates on anybody's nerves.  Actually, it doesn't
mean that, either - not all Scheme implementations use a stack, after
all.  But I'm at a bit of a loss as to how to describe it in an
implementation-independent manner.)

Everybody knows that writing procedures in a tail-recursive way is a
Good Thing.  I'm curious as to just how much of a good thing people
think it is, though, and how much it is worth to people to write a
procedure in a tail-recursive manner as opposed to in some other
manner which might have other benefits.  For example, consider the
following two definitions of "append":

(define append1
  (lambda (x y)
     (let loop ((x x))
        (if (null? x) y
            (cons (car x) (loop (cdr x)))))))

(define append2
  (lambda (x y)
     (reverse
        (let loop ((y y)
                   (acc (reverse x)))
           (if (null? y) acc
               (loop (cdr y)
                     (cons (car y) acc)))))))

The second is tail-recursive (assuming, of course, that reverse is
written in a tail-recursive manner); the first isn't.  But the second
is (on all implementations that I've tried it out on, at any rate)
slower than the first, though only by a constant factor.  (Somewhere
around 2 to 4, usually.)  And while it uses up about the same amount
of live space in the heap as the first one does, it conses more and so
will invoke the garbage collector more often.  Which one do you use?
The second one really is more likely to work, in the sense that it
will cause the implementation to run out of space much less often than
the first one does.  But it's slower, and so you lose out on the vast
majority of the times that you call append.

So what are your heuristics for choosing whether or not to write a
procedure in a tail-recursive manner?  Of course, if the
tail-recursive version uses up more heap space than the
non-tail-recursive version, effectively transferring memory usage from
one place to another, then you just use whichever one is faster.  And
if you are writing something like a repl loop which may never exit
then you write it in a tail-recursive manner.  But what about other
situations?  What if the tail-recursive version takes time on the
order of the square of the length of its input, say, while the other
version takes time on the order of the length of its input?  What
factors in the
implementation of your Scheme system could cause you to choose one
method or another?  Etc...

To be sure, the example I gave isn't the best.  (Though it is the best
that I know of, and one that actually has come up in my programming;
but I haven't looked very hard, and would really prefer a better
example.  Anybody?)  The careful reader has noticed by now that the
procedures aren't in fact equivalent, and in particular the
tail-recursive version isn't quite the append in the standard - it
copies its second argument, where the standard append shares the
second argument's structure with the result.  (I'm ignoring, of
course, the fact that they should both be taking a variable number of
arguments, anyways.)  Once you fix that and do some more
simple speedups (involving replacing some calls to reverse by calls to
reverse! or other slightly modified versions of reverse), you find
that you can actually make the tail-recursive version about as fast
and about as space-efficient as the non-tail-recursive version.  Is
this in fact an inherent property of tail-recursion?  (Seems unlikely
to me, but I don't have any real evidence one way or another.)  Can
one in fact come up with a procedure whose fastest tail-recursive
implementation has slower order than its fastest implementation?  What
about memory usage?
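For concreteness, here is one way such a fixed-up version might look.
This is a sketch only, and the names append3 and rev-onto are mine,
not anything standard.  It is tail-recursive (given a tail-recursive
reverse), it shares its second argument with its result the way the
append in the standard does, and it conses only about twice what
append1 does:

```scheme
;; rev-onto conses the elements of rev onto tail, reversing them
;; in the process; every call is a tail call.
(define rev-onto
  (lambda (rev tail)
    (if (null? rev) tail
        (rev-onto (cdr rev) (cons (car rev) tail)))))

;; append3: reverse x once, then cons it back onto y.  y itself is
;; shared with the result, not copied, unlike in append2 above.
(define append3
  (lambda (x y)
    (rev-onto (reverse x) y)))
```

Using reverse! on the intermediate copy, as suggested above, would
trim the consing further still, at the cost of mutating a fresh list.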

david carlton
carlton@husc10.harvard.edu

Chris.Holt@newcastle.ac.uk (Chris Holt) (05/14/91)

carlton@husc10.harvard.edu (david carlton) writes:

>Everybody knows that writing procedures in a tail-recursive way is a
>Good Thing.  I'm curious as to just how much of a good thing people
>think it is, though...
>             What if the tail-recursive version takes time on the
>order of the square of the length of its input, say, while the other
>version takes time on the order of the length of its input?
>                                                                Can
>one in fact come up with a procedure whose fastest tail-recursive
>implementation has slower order than its fastest implementation?  What
>about memory usage?

Aren't tail-recursive sorts bounded by O(n**2), while doubly-recursive
sorts are bounded by O(n*log n)?  [Unless you're Dan Bernstein, in which
case radix sort is linear :-)].
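For what it's worth, the obvious candidate here is insertion sort,
which can be coded with every loop tail-recursive and which does take
O(n**2) comparisons.  A sketch (the code and names insert-asc,
insertion-sort, and rev-onto are mine, for illustration):

```scheme
;; rev-onto conses the elements of rev onto tail, reversing them;
;; tail-recursive throughout.
(define rev-onto
  (lambda (rev tail)
    (if (null? rev) tail
        (rev-onto (cdr rev) (cons (car rev) tail)))))

;; Insert x into the ascending list sorted, keeping the elements
;; already passed over in pre (reversed) so the loop is a tail call.
(define insert-asc
  (lambda (x sorted)
    (let loop ((pre '()) (rest sorted))
      (if (or (null? rest) (<= x (car rest)))
          (rev-onto pre (cons x rest))
          (loop (cons (car rest) pre) (cdr rest))))))

;; Insertion sort: O(n**2) worst case, but no non-tail calls anywhere.
(define insertion-sort
  (lambda (lst)
    (let loop ((lst lst) (acc '()))
      (if (null? lst) acc
          (loop (cdr lst) (insert-asc (car lst) acc))))))
```

Whether every tail-recursive sort must pay this quadratic price is
exactly the question.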

-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "And when they die by thousands why, he laughs like anything." G Chesterton

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/18/91)

In article <1991May14.134257.16696@newcastle.ac.uk>, Chris.Holt@newcastle.ac.uk (Chris Holt) writes:
> Aren't tail-recursive sorts bounded by O(n**2), while doubly-recursive
> sorts are bounded by O(n*log n)?

If you are willing to use an explicit stack to hold runs, a bottom-up
implementation of "natural" merge sort (see Knuth for what "natural"
means here) can be coded tail-recursively.  The code is pretty short, too.
With a slight generalisation of the idea of a run (which I tried to publish
in IPL some years ago, without success), a tail recursive merge sort can
sort already ascending or already descending sequences in O(n), and has
worst case ~ n.lg n.
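A bottom-up natural merge sort along these lines might be sketched as
follows.  The code is mine, not O'Keefe's; the list of runs plays the
role of the explicit stack, and every loop, including the merge, is
coded tail-recursively:

```scheme
;; rev-onto conses the elements of rev onto tail, reversing them;
;; tail-recursive.
(define rev-onto
  (lambda (rev tail)
    (if (null? rev) tail
        (rev-onto (cdr rev) (cons (car rev) tail)))))

;; Split the input into its maximal ascending runs, e.g.
;; (runs '(3 1 2 5 4)) gives ((3) (1 2 5) (4)).  run is kept reversed
;; so that extending it is a cons.
(define runs
  (lambda (lst)
    (let loop ((lst lst) (run '()) (acc '()))
      (cond ((null? lst)
             (reverse (if (null? run) acc (cons (reverse run) acc))))
            ((or (null? run) (<= (car run) (car lst)))
             (loop (cdr lst) (cons (car lst) run) acc))
            (else
             (loop (cdr lst) (list (car lst)) (cons (reverse run) acc)))))))

;; Merge two ascending lists via an accumulator; tail-recursive.
(define merge-asc
  (lambda (a b)
    (let loop ((a a) (b b) (acc '()))
      (cond ((null? a) (rev-onto acc b))
            ((null? b) (rev-onto acc a))
            ((<= (car a) (car b)) (loop (cdr a) b (cons (car a) acc)))
            (else (loop a (cdr b) (cons (car b) acc)))))))

;; Merge adjacent pairs of runs; each pass halves the stack of runs,
;; giving the ~ n.lg n worst case.
(define merge-pairs
  (lambda (rs)
    (let loop ((rs rs) (acc '()))
      (cond ((null? rs) (reverse acc))
            ((null? (cdr rs)) (reverse (cons (car rs) acc)))
            (else (loop (cddr rs)
                        (cons (merge-asc (car rs) (cadr rs)) acc)))))))

(define natural-merge-sort
  (lambda (lst)
    (if (null? lst) '()
        (let loop ((rs (runs lst)))
          (if (null? (cdr rs)) (car rs)
              (loop (merge-pairs rs)))))))
```

An already-ascending input is a single run, so this version sorts it
in O(n); recognising descending runs as well would need the slight
generalisation of a run mentioned above.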

-- 
There is no such thing as a balanced ecology; ecosystems are chaotic.