Register Allocator

The register allocator converts stack-based traces to register-based code. This is done in two stages. The first is to remap the stack to the registers, reducing the number of loads and stores. The second is to coalesce the registers to reduce the number of (ideally to eliminate) register-to-register moves.

Converting Stack to Register Code

It is worth noting that the Deferred Object Creation optimisation pass will already have used some registers for temporary storage, so some registers will be pre-allocated.

Firstly all stack locations are alloted a register.
Local variables are also alloted a register, in which the value can be held temporarily if it loaded repeatedly.
Stack locations are allocated a new 'virtual'
For example local variables (in the current frame) might be alloted registers 1 upwards, and the stack locations alloted registers 63 downward.

Loads and stores can thus be converted into register to register moves. For example, a LOAD_REGISTER bytecode (generated by the D.O.C. pass) can be translated to a register register move. LOAD_REGISTER 4 would become MOVE 4, 62 if the TOS were previously register 63 (allocating the stack downward).

Binary operations, which take their operands from the stack are replaced with register-based versions:
BINARY_ADD becomes BINARY_ADD 63, 62, 63 if the TOS were register 62.

Register Coalescing

The register-based code generated by the first stage is inefficient; it contains many register-to-register move. By coalesing registers, many of these moves can be eliminated. For example:

  1. move R1 -> R63
  2. move R2 -> R62
  3. add (R63 + R62) -> R63
  4. move R63 -> R3

By scanning backwards over the code it is easy to determine the lifetimes of values in registers. For example, R63 has two lifetimes; from instruction 1 to instruction 3 and from instruction 3 to instruction 4. For the first lifetime R63 can be coalesced with R1, and the for the second lifetime it can be coalesced with R3. Coalescing R62 with R2 results in:

  1. add (R1 + R2) -> R3
In the example from the Deferred Object Creation page, the final code generated by the D.O.C. pass was:
  1. load x
  2. store_register 0
  3. load_register 0
The first stage converts this to:
  1. load x -> R63
  2. move R63 -> R0
  3. move R0 -> R63
The second stage can eliminate the two move instructions and, depending on the subsequent use of R0 and R63, can redirect the initial load of x to the appropriate register.

Comments