Lesson 11: Assembler

It is time to write some S3 machine language programs and put to use everything you have built so far. Programming a processor in machine language requires an understanding of its architecture. You created this S3 processor yourself, so you already know the details of how it works. What remains is to assemble a sequence of instructions supported by the S3 processor into an executable program.

Your S3 processor is now about 95% complete. What remains is to add a data memory, which you will do in the last lesson.

Required knowledge

Programming elements. To run an S3 program on the card or to simulate it in ISE, you must:

    • Write the assembly code in a prog.S3 file.
    • Compile the code with S3asm.bat to produce a prog.coe file.
    • Load prog.coe into the instruction memory insmem by clicking on its symbol in the hierarchy of your project. The third screen lets you select the file and also view it.
    • Regenerate insmem with this new initialization.
    • Update the symbol in S3.sch and save S3.sch.
    • Create the S3 symbol.
    • Update the S3 symbol in toplevel and save.
    • Simulate or synthesize.

Control instructions

By construction, each instruction fetch increments CO so that it points to the next instruction to execute, that is, the following instruction in the Von Neumann model. A break in sequence is obtained only by explicitly loading a different address into CO. The S3 processor provides six instructions for this.

    • MOV source_register CO: A 16-bit value can be generated by the program in a register. You can therefore write a program that computes the address of the next instruction to execute, or reads it from the switches via the Rsw register.
    • MVZ source_register CO: CO is updated only if the content of Rdest is equal to x0000. This lets you set up a sequence break conditioned by the result of the last ALU operation.
    • MVNZ source_register CO: The same, but CO is updated only if Rdest is not equal to x0000.
    • MVI 8_bits CO: With immediate addressing, you can put an explicit address in CO. This address is between x00 and xFF.
    • MIZ 8_bits CO: Works like MVI, but only if Rdest equals x0000.
    • MINZ 8_bits CO: Works like MVI, but only if Rdest is not equal to x0000.
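
The six instructions above differ only in where the new address comes from and in the condition tested on Rdest. The following Python sketch models that decision. It is not S3 code: the function name next_co and its parameters are hypothetical, and it deliberately ignores the pipeline.

```python
MASK16 = 0xFFFF  # registers are 16 bits wide

def next_co(op, operand, co, rdest, regs=None):
    """Return the value of CO after a CO-writing instruction (no pipeline).

    op      -- one of MOV, MVZ, MVNZ, MVI, MIZ, MINZ
    operand -- register name for MOV/MVZ/MVNZ, 8-bit immediate for MVI/MIZ/MINZ
    co      -- CO after its automatic increment (address of the next instruction)
    rdest   -- current content of Rdest, tested by the conditional forms
    regs    -- dict of register contents (hypothetical model of the register file)
    """
    regs = regs or {}
    if op in ("MOV", "MVZ", "MVNZ"):
        target = regs[operand] & MASK16   # full 16-bit address from a register
    elif op in ("MVI", "MIZ", "MINZ"):
        target = operand & 0xFF           # immediate address, x00 to xFF
    else:
        raise ValueError("not a CO-writing instruction: " + op)

    if op in ("MOV", "MVI"):
        taken = True                      # unconditional
    elif op in ("MVZ", "MIZ"):
        taken = rdest == 0x0000           # branch if Rdest is zero
    else:
        taken = rdest != 0x0000           # branch if Rdest is not zero
    return target if taken else co

# MIZ does not branch when Rdest is non-zero, MINZ does.
print(hex(next_co("MIZ", 0x10, 5, 0x0001)))   # 0x5
print(hex(next_co("MINZ", 0x10, 5, 0x0001)))  # 0x10
```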

First, we must understand how the processor pipeline operates. When the instruction that changes the value of CO executes, the pipeline has already fetched the next instruction and begun decoding it. The change in program flow therefore takes effect one cycle late, that is, one instruction time later. This is called delayed branching.

In this example the NEXT instruction is always executed, since the jump takes effect only in the following cycle.

I suggest testing this feature both on your card and in simulation. You just need to run this program.

The simulation displaying q(15:0) of the CO, RI and R7seg registers shows this. Observe that MVI 02 R7seg takes effect before the sequence break: the display shows x0002, not x0001. The value x0003 is never displayed.

Figure 125 Delayed branch simulation

It is not always possible to find a useful instruction to place after the update of CO. At worst you can always place a NOP there, in which case you lose one cycle on each branch.
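
To make the delayed branch concrete, here is a small behavioral model in Python. It is only a sketch under simplifying assumptions: the program is a hypothetical list of (mnemonic, value, destination) tuples, not the listing from this lesson, and the pipeline is reduced to a single rule, namely that the instruction following a write to CO is still executed.

```python
def run(program, max_steps=50):
    """Execute (op, value, dest) tuples with a one-instruction delayed branch.

    Returns the list of values written to R7seg, in order.
    """
    displayed = []
    co = 0
    pending = None  # branch target waiting for the delay slot to finish
    for _ in range(max_steps):
        if co >= len(program):
            break
        op, value, dest = program[co]
        # Default: sequential execution. If a branch was issued by the
        # previous instruction, this one is the delay slot and the jump
        # becomes effective right after it.
        next_co = co + 1
        if pending is not None:
            next_co = pending
            pending = None
        if op == "MVI" and dest == "CO":
            pending = value              # jump is delayed by one instruction
        elif op == "MVI" and dest == "R7seg":
            displayed.append(value)
        # NOP: nothing to do
        co = next_co
    return displayed

# Hypothetical test program (addresses are list indices).
PROGRAM = [
    ("MVI", 0x01, "R7seg"),  # 0: display x0001
    ("MVI", 0x05, "CO"),     # 1: jump to address 5 ...
    ("MVI", 0x02, "R7seg"),  # 2: ... but this delay-slot instruction still runs
    ("MVI", 0x03, "R7seg"),  # 3: skipped, x0003 is never displayed
    ("NOP", 0, None),        # 4: skipped
    ("NOP", 0, None),        # 5: branch target
]

print(run(PROGRAM))  # [1, 2]
```

Replacing instruction 2 with a NOP gives the "at worst" case: the branch still works, but the cycle in the delay slot does no useful work.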

Predicate calculation

Conditional register transfer instructions test only whether the value of the register Rdest is equal to x0000 or not.

Booleans are an encoding of information, usually defined by the language you use. For the S3 processor, if you decide that TRUE = x0000 and FALSE = xFFFF, with all other values undefined, the operators of your ALU let you perform AND, OR and INV directly. The results produced in Rdest are then used to decide whether or not to branch.
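
One point is worth checking by hand with this encoding: because TRUE is all zeros, the bitwise operators exchange roles. A logical AND is obtained with a bitwise OR, and a logical OR with a bitwise AND, while INV still gives the logical NOT. The Python sketch below verifies this on the four input combinations. The function names are illustrative only and do not correspond to S3 mnemonics.

```python
MASK16 = 0xFFFF
TRUE, FALSE = 0x0000, 0xFFFF  # encoding chosen in the text

def logical_not(a):
    """INV: flips x0000 <-> xFFFF."""
    return ~a & MASK16

def logical_and(a, b):
    """TRUE (x0000) only if both inputs are x0000: a bitwise OR."""
    return (a | b) & MASK16

def logical_or(a, b):
    """TRUE (x0000) as soon as one input is x0000: a bitwise AND."""
    return (a & b) & MASK16

def name(v):
    return "TRUE" if v == TRUE else "FALSE"

# Print the truth table for the two binary operators.
for a in (TRUE, FALSE):
    for b in (TRUE, FALSE):
        print(name(a), name(b),
              "AND:", name(logical_and(a, b)),
              "OR:", name(logical_or(a, b)))
```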

For a language like shell, you may choose the following encoding for Boolean values: TRUE = x0000 and FALSE = any other value (in C it is the opposite!). In this case, these are the codes to test the following predicates: A, NOT A, A AND B, A NAND B, and a code that performs B := NOT A. De Morgan's formulas will be very useful ... [1] The code for IF A = B is useful for all types of data. These examples assume that the branch address is stored in R3, A in R1 and B in R2. You could do the same with MIZ, MINZ and immediate addressing.

I suggest you write the following program, which manipulates booleans with this encoding and an OR.

Input R1
Input R2
if (R1 or R2) then
    Rdest := x0001
else
    Rdest := x0002
endif

The program displays x0001 if at least one of the two input values is equal to x0000, and x0002 otherwise. Apply De Morgan's formula: A OR B is equal to NAND(NOT A, NOT B). You can analyze and test the following program (for simulation, replace PAUSE with NOP).
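
Independently of the S3 listing, the De Morgan rewriting can be checked with a short Python model. This is a sketch, not the S3 program: it assumes that the inputs are exactly x0000 (TRUE) or xFFFF (FALSE), and the names nand, or_via_nand and program are hypothetical.

```python
MASK16 = 0xFFFF
TRUE, FALSE = 0x0000, 0xFFFF  # assumed strict encoding of the inputs

def inv(a):
    """Logical NOT on the strict encoding."""
    return ~a & MASK16

def nand(a, b):
    """Logical NAND. With TRUE = x0000, 'both true' is a bitwise OR
    equal to x0000, so NAND is the inverse of that OR."""
    return inv(a | b)

def or_via_nand(a, b):
    """De Morgan: A OR B = NAND(NOT A, NOT B)."""
    return nand(inv(a), inv(b))

def program(r1, r2):
    """Value sent to the display by the pseudo-code above."""
    return 0x0001 if or_via_nand(r1, r2) == TRUE else 0x0002

for r1 in (TRUE, FALSE):
    for r2 in (TRUE, FALSE):
        print(format(r1, "04X"), format(r2, "04X"), "->", program(r1, r2))
```

Only the input pair xFFFF, xFFFF produces 2. Note that this bitwise shortcut relies on the strict encoding: with arbitrary non-zero values standing for FALSE, each input would first have to be tested against x0000.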

The relative branch

Sequence-breaking instructions force CO to a new address. Sometimes you want a jump relative to the current value of CO, for example to go back 3 instructions or to skip over the next 5 instructions. In this case the new value of CO must be computed from its current value. This technique avoids having to recompute the branch address every time the portion of code moves within the instruction memory.

To simulate this behavior you need to include the instructions that recompute the value of CO. This approach of course has a cost compared to direct addressing. Remember that CO points to the next instruction. Here is a generic code that you can reuse later.

Here is a code with a backward jump.

The simulation with registers CO, RI and R7seg should look like this.

Figure 126 Infinite loop with relative branch

We can do the same thing for a forward jump.

Figure 127 Forward jump

Simulating this program lets you observe the forward jump to address x0008. R7seg does not receive the value x04.
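
The address arithmetic behind a relative branch can be sketched in Python. This is only a model of the calculation, not S3 code: relative_target is a hypothetical name, and the sketch assumes that the value read from CO is the address of the instruction that follows the read, as recalled above. The exact value seen in your pipeline should be confirmed in simulation.

```python
MASK16 = 0xFFFF  # CO is a 16-bit register

def relative_target(co, offset):
    """New value of CO for a jump of `offset` instructions.

    co     -- value read from CO (assumed to point to the next instruction)
    offset -- signed number of instructions, negative to go backward
    """
    return (co + offset) & MASK16  # wrap around like a 16-bit adder

# Hypothetical examples: CO was read as x0005, go back 3 instructions.
print(hex(relative_target(0x0005, -3)))  # 0x2
# CO was read as x0003, skip forward by 5 instructions.
print(hex(relative_target(0x0003, 5)))   # 0x8
```

Because the target depends only on the offset, the same code keeps working when the block is moved elsewhere in the instruction memory.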

I suggest you build a program that uses an IF THEN ELSE. Design a program that displays the list

3 4 7 8 11 12 15 ... or 3 4 7 8 B C F 10 ... in hexadecimal

N := 1
Loop:
    IF N even THEN
        dest := 2N
    ELSE
        dest := 2N + 1
    ENDIF
    Display dest
    Pause
    N := N + 1
    GOTO Loop
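
Before writing the assembly version, it can help to check the expected values with a direct Python transcription of this pseudo-code. This is a reference model only: the function name sequence and the count parameter, which replaces the infinite loop, are assumptions made for the sketch.

```python
MASK16 = 0xFFFF

def sequence(count):
    """Return the first `count` values displayed by the pseudo-code."""
    values = []
    n = 1
    while len(values) < count:
        if n % 2 == 0:          # N even
            dest = 2 * n
        else:                   # N odd
            dest = 2 * n + 1
        values.append(dest & MASK16)  # keep the result on 16 bits
        n += 1
    return values

vals = sequence(8)
print(vals)                               # [3, 4, 7, 8, 11, 12, 15, 16]
print([format(v, "X") for v in vals])     # ['3', '4', '7', '8', 'B', 'C', 'F', '10']
```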

Here is the assembly program that a compiler could produce. The delayed branch is put to use here, since useful instructions could be placed after each sequence break.

You can run the program on the card. You can also simulate it in ISim to follow the evolution of CO and of the registers Rsrc1, Rsrc2, Rdest and R1. For the simulation it is better to replace PAUSE with NOP, which avoids waiting for the btn0 button to be pressed.

The While construction

You will test a program that displays the i-th element of the Fibonacci sequence, where i is a number between 0 and 255 entered on the switches. With 16-bit registers, however, we will not get very far. In pseudo-code, the program would be:

Loop:
    N := 0
    R1 := 0
    R2 := 1
    -- Set the switches
    Pause
    While N <> Rsw do
        N := N + 1
        Dest := R1 + R2
        R1 := R2
        R2 := Dest
    EndWhile
    Display R1
    Goto Loop
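
To know what the display should show, here is a Python reference model of the loop body. It is a sketch, not S3 code: fib16 is a hypothetical name, the switch value is passed as a parameter, and the addition is masked to 16 bits to mimic the register width.

```python
MASK16 = 0xFFFF

def fib16(rsw):
    """Value of R1 when the While loop ends, for switch value rsw."""
    n, r1, r2 = 0, 0, 1
    while n != rsw:
        n += 1
        dest = (r1 + r2) & MASK16  # 16-bit addition, overflow wraps around
        r1 = r2
        r2 = dest
    return r1

print(fib16(10))  # 55
print(fib16(24))  # 46368, the last exact value on 16 bits
print(fib16(25))  # 9489, since 75025 no longer fits in 16 bits
```

This shows how quickly the 16-bit limit is reached: from the 25th element onward the display shows the value modulo 65536.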

Test the program on the card. This is a good test for your S3 processor!

Then simulate it. It is better to replace PAUSE with a NOP, and do not forget to set an initial value on the switches in toplevel_tb.

Your turn

Question 1: In the sequence 3 4 7 8 11 12 15, we alternately add +1 and +3. Write this code with a loop that produces one element of the series at each iteration.

Question 2: Write another version in which the loop produces two elements at each iteration!
