Recursion optimization

Tail and mutual recursion optimization

sapid lisp implements tail recursion optimization. This means that functions calling them self in terminal position do not use stack for each call.

(de foo (...) ... body ... (foo ...))

is interpreted more or less in the same way than:

(de foo (...)(while ... ... body ...))

Mutual recursion extends this to any number of nested definitions with calls in terminal position:

(de foo1 (...) ... (foo2 ...))
(de foo2 (...) ... (foo3 ...))
(de foo3 (...) ... (foo1 ...))

Three versions of the routine implementing call evaluation for user defined functions ("expr" in lisp jargon) are detailed here. Starting with no optimization and continuing with the tail recursion optimized version helps to explain the full version implementing mutual recursion optimization.

No recursion optimization

1.01
1.02
1.03
1.04
1.05
1.06
1.07
def evalexpr(func, fval, larg, form):
    bind(func, fval, fval.car, larg)
    try:
        x = eprogn(fval.cdr)
    finally:
        unbind()
    return x

This is the basis. The call to bind evaluates arguments, stores current values and updates the environment with new values. The body of the function is evaluated with the eprogn in the correct environment. The try .. finally ensures that the environment will be restored whatever happens.

Note that the try ... finally construct is equivalent to the following try ... except ... else. This enables to have a special process if required for some exception. This is put at work in mutual recursion optimization.

1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
def evalexpr(func, fval, larg):
    bind(func, fval, fval.car, larg)
    try:
        x = eprogn(fval.cdr)
    except:
        unbind()
        raise
    else:
        unbind()
    return x

Tail recursion optimization

Tail recursion optimization is implemented in the following version.

1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.13
1.14
1.15
1.16
1.17
1.18
def evalexpr(func, fval, larg, form):

    bind(func, fval, fval.car, larg)
    if istailrec(fval):
        Stack.pop()
        raise TailRecException(fval)
        
    try:
        while True:
            try:
                x = eprogn(fval.cdr)
                break
            except TailRecException:
                pass              
    finally:
        unbind()

    return x

The eprogn is embedded in a while loop and a try ... except. If no recursion is detected, the break stops the loop and the execution flow continues as usual. However, if a recursion is detected, the TailRecException is triggered and caught in the except clause of the previous call. This by-passes the break and makes the loop to continue, explicitely tranforming recursion into iteration.

The following details the working in case of tail recursion:

1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.13
1.14
1.15
1.16
1.17
1.18
def evalexpr(func, fval, larg, form):

    bind(func, fval, fval.car, larg)
    if istailrec(fval):
        Stack.pop()
        raise TailRecException(fval)
        
    try:
        while True:
            try:
                x = eprogn(fval.cdr)
                break
            except TailRecException:
                pass              
    finally:
        unbind()

    return x
  |->
  |
  |
  |
  |
  |
  |
  |
  |
  |
->| 
2.01
2.02
2.03
2.04
2.05
2.06
2.07
2.08
2.09
2.10
2.11
2.12
2.13
2.14
2.15
2.16
2.17
2.18
def evalexpr(func, fval, larg, form):

    bind(func, fval, fval.car, larg)
    if istailrec(fval):
        Stack.pop()
        raise TailRecException(fval)
        
    try:
        while True:
            try:
                x = eprogn(fval.cdr)
                break
            except TailRecException:
                pass              
    finally:
        unbind()

    return x

Suppose we are evaluating a tail recursion call in 2.01. This call must be the last expression of the eprogn at 1.11.  At this point, the bind record is on the stack and variables have they new value (2.03). When the tail recursion is detected (2.04), the binding record is removed (2.05) and the tail recursion exception is raised (2.06). The exception is caught at 1.13. Being in the loop (1.09), execution returns to the eprogn (1.11).

Mutual recursion optimization

Mutual recursion optimization requires a little more mechanics to be handled:

  • The identity of the function must be tested to apply optimization at the right place.
  • The environment of the intermediate calls must be preserved. This is done by merging the binding record with the one of the previous call.
1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.13
1.14
1.15
1.16
1.17
1.18
1.19
1.20
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
def evalexpr(func, fval, larg, form):
bind(func, fval, fval.car, larg) if istailrec(fval): Stack.pop() raise TailRecException(fval) try: while True: try: x = eprogn(fval.cdr) break except TailRecException as e: if e.fval == fval: pass else: bindmerge() Stack.pop() raise
except TailRecException: raise except: unbind() raise else: unbind()
return x

The following details the working in case of mutual recursion:

1.01
1.02
1.03
1.04
1.05
1.06
1.07
1.08
1.09
1.10
1.11
1.12
1.13
1.14
1.15
1.16
1.17
1.18
1.19
1.20
1.21
1.22
1.23
1.24
1.25
1.26
1.27
1.28
1.29
def evalexpr(func, fval, larg, form):
bind(func, fval, fval.car, larg) if istailrec(fval): Stack.pop() raise TailRecException(fval) try: while True: try: x = eprogn(fval.cdr) break except TailRecException as e: if e.fval == fval: pass else: bindmerge() Stack.pop() raise
except TailRecException: raise except: unbind() raise else: unbind()
return x
  |->
  |
  |
  |
  |
  |
  |
  |
  |
  |
->|

2.01
2.02
2.03
2.04
2.05
2.06
2.07
2.08
2.09
2.10
2.11
2.12
2.13
2.14
2.15
2.16
2.17
2.18
2.19
2.20
2.21
2.22
2.23
2.24
2.25
2.26
2.27
2.28
2.29
def evalexpr(func, fval, larg, form):
bind(func, fval, fval.car, larg) if istailrec(fval): Stack.pop() raise TailRecException(fval) try: while True: try: x = eprogn(fval.cdr) break except TailRecException as e: if e.fval == fval: pass else: bindmerge() Stack.pop() raise
except TailRecException: raise except: unbind() raise else: unbind()
return x
  |->
  |
  |
  |
  |
  |
  |
  |
  |
  |
->|
3.01
3.02
3.03
3.04
3.05
3.06
3.07
3.08
3.09
3.10
3.11
3.12
3.13
3.14
3.15
3.16
3.17
3.18
3.19
3.20
3.21
3.22
3.23
3.24
3.25
3.26
3.27
3.28
3.29
def evalexpr(func, fval, larg):
bind(func, fval, fval.car, larg) if istailrec(fval): Stack.pop() raise TailRecException(fval) try: while True: try: x = eprogn(fval.cdr) break except TailRecException as e: if e.fval == fval: pass else: bindmerge() Stack.pop() raise
except TailRecException: raise except: unbind() raise else: unbind()
return x

Suppose we are evaluating a mutual recursion call in 3.01, i.e. f12 in (f11 ...) --> (f2 ...) --> (f12 ...). This call must be the last expression of the eprogn at 2.11.

When the mutual recursion is detected (3.04), the binding record is removed (3.05) and the tail recursion exception is raised (3.06). The exception is caught at 2.13. Being an intermediate call (f2), the bindings are merged at 2.17 and the current binding record is popped. The exception is reraised at 2.19 and finally caught at 1.13. This time, the test at 1.14 is true and this is handled at this point as tail recursion.

Note

The current implementation adds an argument to evalexpr and bind routines to specify whether function arguments are evaluated or not. This enables to use the same routine for eval and apply.

Tail recursion detection

The tail recursion calls are detected in the history of calls. This history is a stack and it is handled explicitly in sapid implementation. This stack contains various types of records. The ones uses to store arguments are tagged with the keyword lambda. These records contain the definition of called functions and this enables to compare the function being evaluated and the calling ones.

A simple function definition and the resulting record:

(de foo (x y) (if (= x 0) y (foo (1- x) (1+ y)))

... ------------- lambda (x y) (if (= x 0) y (foo (1- x) (1+ y))) x 5 y 7 ------------- ...

In that case, just after binding, the stack could look like this:

-------------
lambda  (x y) (if (= x 0) y (foo (1- x) (1+ y)))
x 4
y 8
-------------
lambda  (x y) (if (= x 0) y (foo (1- x) (1+ y)))
x 5
y 7
-------------
...

The tail recursion is detected by comparing the top function value to the previous one. The exception mechanism will clean the host stack and the top record is removed from the evaluation stack. It is clear that there is no need to restore the value of variables x and y to 4 and 8 if they are immediately replaced with values 5 and 7.

The same mechanism is used for mutual recursion with the precaution that intermedate records must be merged with the first record of the recursive function.

When the recursion is not terminal, something must be in added in the stack to avoid detecting a tail recursion. This is the role of stack records of type eval.

(de bar (x y) (if (= x 0) y (+ 1 (bar (1- x) y))))
------------- lambda (x y) (if (= x 0) y (* 1 (bar (1- x) y)))) x 4 y 7 ------------- eval (+ 1 (bar (1- x) y)) ------------- lambda (x y) (if (= x 0) y (+ 1 (bar (1- x) y)))) x 5 y 7 ------------- ...

The records of type eval are set (except in some cases) by the function evalprotect which protects a call to eval from false tail recursion detection by adding and removing the record:

def evalprotect(expr):
    try:
        Stack.append({'eval': expr})
        return eval(expr)
    finally:
        Stack.pop()

Typically, the argument of a subr is not a tail call and its evaluation must be protected:

def evalsubr1(func, fval, larg, form):
    x = evalprotect(larg.car)
    return fval(x)

Extension

Function calls which could be terminal less for a call to a subr could be optimized by storing only the values of the deeper parameters. This is for instance the case of:
(de foo (n) (if (= n 0) 0 (+ (foo (- n 1)) 1)))
When calling foo, the stack could be as follow:
-------------
lambda   (n) (if (= n 0) 0 (+ (foo (- n 1)) 1)))
n 4
-------------
eval     (+ (foo (- n 1)) 1) 1)
-------------
lambda   (n) (if (= n 0) 0 (+ (foo (- n 1)) 1)))
n 5
-------------
...

It is clear that saving a value (4 for instance) is not necessary if the previous value (5 in that case) will override it immediately. This optimization is not available in current implementation. It could be extended to mutual recursion and to any number of subr calls between calls of the recursive function.

Timing

Running the test suites defined in tests.l enables to estimate the computation overhead needed by the optimizations presented here. Note that the interpretor can be configured to handle no recursion optimization, tail recursion optimization or mutual recursion optimization.The next table gives the figures obtained with the following conditions:

  • Pentium 4, 3.2 GHz, 2 GB RAM
  • each suite repeated 10 times
  • svn revision 80

Timings are given in seconds.

 Configuration \ suite  no optimization suite  tail optimization suite  mutual optimization suite
 no recursion optimization  160  cannot be processed
 cannot be processed
 tail recursion optimization  180  89  cannot be processed
 mutual recursion optimization  181  91 64

With the usual precautions, the conclusion is that the technique presented here have a cost of about 12% compared with running without optimization. This is quite acceptable given the benefits of being able to process any size of data when using tail and mutual recursion.

References

Recursion optimization is usually done by simulating in some way a stack machine. This is of course powerful but leads to code far from the natural non recursive writing. Playing with exceptions and after some experimentation, I got the tail recursion version and managed to extend it to mutual recursion optimization.

The technique seeming unusual, I have spent some time googling for a precedent. Actually, a similar technique is used to implement python decorators transforming tail recursion into explicit loop. For instance, this decorator has a structure very similar to the one used in sapid lisp.

Some discussions at comp.lang.lisp and stackoverflow have brought some additional information:

  • Henry Baker's "Cheney on the MTA" strategy have been quoted on both forums
  • some links to academic examples of using longjmp for looping have been proposed

However, both cases are not related to the technique described here which is definitely closer to the python decorator mentioned above.


Comments