Deferred Object Creation

Deferred Object Creation (DOC) is a form of escape analysis that delays creating small short lived objects such as tuples and dicts for parameters, frames and slices until required, which is often never.

Once traces have been specialised, they often include a lot of redundant code, especially across calls and call-return pairs.The DOC pass can eliminate most of this redundant code.

Deferred Object Creation (DOC) works by tracking the creation and initialisation of common objects and deferring the creation of those objects until necessary. DOC works in a single pass, and is thus fast. The following types are handled by DOC: tuples, slices, frames, bound-methods, floats, lists and dicts. (Since dicts and lists are mutable, they are handled only under limited, but still useful, circumstances).It is possible that the D.O.C. pass could be extended to handle some user-defined types in the future.

When a bytecode to create a new object (of one of the above types) is encountered, DOC records the state of the VM as it should be, but does not emit the code to create the object, unless it is required to do so. Quite often, the objects that are deferred never need to created, saving a lot of time.

As a simple example of how it works consider the following function:

def id(self):
return self

Such a function could be quite common, say as the implementation of the __iter__ method of an iterator.

Suppose id(x) is called in a trace, it will get inlined to a sequence which does the following:

  • load x
  • pack 1 -- Pack x into a single-valued tuple
  • empty_dict -- Push an empty dict
  • func_enter -- Enter the id function, creating a new frame from the tuple and dict
  • load self (which is x) from the frame
  • return -- Discards the current frame

The above code creates a tuple, an empty dict and a frame. All of which are redundant. Assume for the sake of this example that x is a complex expression that DOC cannot optimise.

The DOC pass defers the creation of those objects as follows:

  • load x -- DOC cannot handle x, so emits code for x.
  • pack 1 -- Defer the creation of the tuple. In order to do this DOC needs to store TOS into a register, so it emits store_register 0, noting that TOS is now (register[0],)
  • empty_dict -- -- Defer, deferred stack is now (register[0],), {}
  • func_enter -- Enter id function. Deferred stack is emptied and a deferred frame is created containing register[0] as its sole item.
  • Load self. Defer the load of self, TOS is self, which is register[0].
  • Return. The deferred frame is discarded.

At the end of the sequence, no instructions have to be emitted, and the DOC knows that the TOS is in cache[0]. Should the next instruction force the DOC pass to create TOS it can be done with the single instruction:

  • load_register 0

In this example the DOC pass has removed the need to create 3 objects, and reduced the code from six to three instructions:

  • load x
  • store_register 0
  • load_register 0

Once the DOC pass is completed the trace is passed to the Register Allocator. The Register Allocator will convert this to code that evaluates x into register 0.