Project 4: Instruction Scheduler

In this project, you will implement the most interest part of an an instruction scheduling algorithm: building the dependency graph and computing the latency-weighted distance to root for each node. Your will extend tipc to output a distance-labeled graph.

Problem Specification

Your input is a single basic block of code consisting of instructions from the following subset of ILOC. All letters are case-sensitive. Note that latencies have changed from Projects 1 & 2. You should discard any nop instructions, as they do not have to be scheduled.

Syntax Meaning Latency
load r1 => r2 r2 ← MEM (r1) 3
loadI x => r2 r2 ← x 1
store r1 => r2 MEM(r2) ← r1 3
add r1, r2 => r3 r3 ← r1 + r2 1
sub r1, r2 => r3 r3 ← r1 - r2 1
mult r1, r2 => r3 r3 ← r1 * r2 2
lshift r1, r2 => r3 r3 ← r1 ≪ r2 1
rshift r1, r2 => r3 r3 ← r1 ≫ r2 1
output x prints MEM(x) to stdout 1
nop idle for one cycle 1

There are four components to the project. First, you MUST modify the IR to label your instructions with indexes starting at 1, which should be printed with a leading n. Second, you MUST construct a dependency graph that includes register edges, and output the graph. Third, you MUST include serialization edges, compute the weighted distance to root, and print the weights. Finally, you MAY schedule the instructions using either of the algorithms discussed in class.

Dependence Graph

The dependence graph will be a graph over the instruction labels. You are responsible for devising an algorithm to compute this graph. An edge (n1,n2) implies that n1 depends on n2 in some way, and that n2 must be completed before n1 can be executed. A register edge exists if one of the source registers of n1 is defined by n2. Serialization edges exist between instructions that deal with memory, in the following cases.

If n1 is an output instruction, it depends upon the most recent output instruction, as values must be output in the same order.
If n1 is an output, load, or store instruction, there is a dependence on the most recent store instruction. The previous store instruction may modify the memory location n1 accesses.
If n1 is a store instruction, it depends upon the most recent output instruction and every previous load instruction. The n1 instruction may overwrite the memory location that the output or load instruction accesses. As an optimization, you can omit edges to load instructions that occur before the most recent store instruction. Since dependency is transitive, the ordering will still be enforced.

Latency-Weighted Distance to Root

You must compute the latency-weighted distance to a root for every edge. The graph will be acyclic, but you cannot assume it is weakly connected (i.e. only one node has no parents) or that it is a forest (i.e. every node has at most one parent). To compute the distance, you will need to find the roots: instructions which have no incoming edges. The weight of a root is the latency of its operation. Then, follow edges from parents to children. The weight of the child is the weight of the parent plus the latency of the child. Be careful: if a child has multiple parents, you must use the maximum weight of all parents. You are again responsible for devising an algorithm to compute these weights: I suggest a worklist.

Interface

Your code MUST compile on the janus machines, using a Makefile in the project directory, to an executable named tipc in that directory. Your program MUST implement the following command line arguments when specified by the short flags. You SHOULD support the long flags as well. You SHOULD use a library, such as getopt, to handle these flags. Your default for this version SHOULD be the -s flag. Aside from the default behavior, your code MUST be backwards-compatible with Projects 1 and 2.

-a --alloc Run the allocator algorithm on the input block of code. By default, pretty-print the results.
-s --sched Compute the dependency graph for the input block of code, along with the latency-weighted distance to root.
By default, print the graph to standard output.
-k num Use num registers when allocating the code.
-l --lexer Print a list of the tokens, one per line. You should print both the token type and the specific lexeme.
-p --pretty-print Print legal ILOC code. You do not have to preserve white-space or comments.
-t --table-print Print the intermediate representation in tabular form.
-h --help Print a help summary and exit.
-d --debug Print debugging information during execution. All debugging information SHOULD be printed as a comment.

If the -s flag is provided, by default you MUST print the graph in the format specified below. You SHOULD print the virtual registers. If the -p and -s flag are both specified, you MAY instead print legal ILOC code that has been scheduled according to one of the scheduling algorithms discussed in class. Error messages for invalid arguments or an invalid file SHOUDL be reasonable and printed to stderr. You may include other command-line arguments or flags for your program.

Output

When the -s flag is used, the output will consist of three components. Each component maps a component label, on a single line, to by an indented map from node labels to values.

nodes:

A list of unique labeled instructions in the block. The instruction SHOULD list the virtual registers, instead of the source registers. You MAY use any alphanumeric sequence as a label, but you SHOULD use increasing integers, starting with 1, prefixed with 'n'. The syntax is label : instruction. For instance, if the first instruction is loadI 1024 => r1, and the source register r1 becomes the virtual register r12, then the corresponding map might be n1 : loadI 1024 => r12. Each instruction MUST be on a new line. For debugging, you MAY include the original instruction as a comment following the renumbered instruction: i.e. n1 : loadI 1024 => r12 //loadI 1024 => r1.

edges:

The set of outgoing edges from each instruction in the format label : { labels }, where label is an instruction label from nodes, and labels is a comma-separated list. Each edge MUST point to a dependency of the labelled instruction. If there are no dependencies, then use the empty set { }. Each set MUST be on a new line.

### weights:

The latency-weighted distance from root for each instruction label, with the format label : weight. Each label MUST be on a new line.

Example

An example file and corresponding output is provided below. Note the serialization edges from n13 to n2, n5, n8 and n11. Further, note that n2’s weight is 13, inherited from n3, not 6 inherited from n13.

Input:

loadI 1024 => r0

load r0 => r1

add r1, r1 => r1

loadI 1032 => r3

load r3 => r2

mult r1, r2 => r1

loadI 1040 => r3

load r3 => r2

mult r1, r2 => r1

loadI 1048 => r3

load r3 => r2

mult r1, r2 => r1

store r1 => r0

Output:

nodes:

n1 : loadI 1024 => r1

n2 : load r1 => r11

n3 : add r11, r11 => r8

n4 : loadI 1032 => r10

n5 : load r10 => r9

n6 : mult r8, r9 => r5

n7 : loadI 1040 => r7

n8 : load r7 => r6

n9 : mult r5, r6 => r2

n10 : loadI 1048 => r4

n11 : load r4 => r3

n12 : mult r2, r3 => r0

n13 : store r0 => r1

edges:

n1 : { }

n2 : { n1 }

n3 : { n2 }

n4 : { }

n5 : { n4 }

n6 : { n3, n5 }

n7 : { }

n8 : { n7 }

n9 : { n6, n8 }

n10 : { }

n11 : { n10 }

n12 : { n9, n11 }

n13 : { n1, n12, n11, n8, n5, n2 }

weights:

n1 : 14

n2 : 13

n3 : 10

n4 : 13

n5 : 12

n6 : 9

n7 : 11

n8 : 10

n9 : 7

n10 : 9

n11 : 8

n12 : 5

n13 : 3