# 2017 exam ex. 1/pipelining

Hi,

I saw that there were other questions regarding this, but I didn't quite understand the answers, so I wanted to ask again myself.

In a) there is a 5 stage pipeline with no conflicts, so I assume the pipeline looks like this: IF ID EX MA WB. Additionally, without forwarding I got:

after ALU: 5-2-1 = 2

after branch: 5-1-1 = 3

and with forwarding:

ALU: 0

branch: 1.

What I don't understand is why there are not 2 nops between instr. 2 and 3 (counting from instr. 0), and not 2 nops between 6 and 7. Also, why are there 3 nops after 8? In the 2018 exam, why are there no nops after instr. 6, and why is there only one nop after 7? Back in 2017, I also don't understand why there is no nop after the first ldi (2), only after the one at 4.

Also a rather general question: What is the difference between Sheet3 ex.2 pipelining and this one above?


The standard processor without conflict resolution (not without conflicts, as you wrote) has no forwarding logic, but makes use of register bypassing. With the standard five-stage pipeline, we get the following numbers of nops:

• after ALU: pWB − pID − 1 = 5 − 2 − 1 = 2
• after LOAD: pWB − pID − 1 = 5 − 2 − 1 = 2
• after BRANCH: pWB − pIF − 1 = 5 − 1 − 1 = 3

and with forwarding, we get

• after ALU: pEX − pID − 1 = 3 − 2 − 1 = 0
• after LOAD: pMA − pID − 1 = 4 − 2 − 1 = 1
• after BRANCH: pEX − pIF − 1 = 3 − 1 − 1 = 1
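As a quick cross-check of the numbers above, here is a minimal Python sketch of the same formula (stage where the result becomes available, minus stage where it is needed, minus 1). The names `STAGE`, `nops_needed` and `table` are made up for this illustration and are not part of Abacus or the lecture material.

```python
# Stage positions in the five-stage pipeline IF ID EX MA WB (1-based).
STAGE = {"IF": 1, "ID": 2, "EX": 3, "MA": 4, "WB": 5}

def nops_needed(produce_stage, consume_stage):
    """Nops required between a producer and a directly following consumer."""
    return max(0, STAGE[produce_stage] - STAGE[consume_stage] - 1)

# (stage where the result is available, stage where it is needed)
WITHOUT_FORWARDING = {"ALU": ("WB", "ID"), "LOAD": ("WB", "ID"), "BRANCH": ("WB", "IF")}
WITH_FORWARDING = {"ALU": ("EX", "ID"), "LOAD": ("MA", "ID"), "BRANCH": ("EX", "IF")}

def table(config):
    """Evaluate the formula for every instruction class in a configuration."""
    return {kind: nops_needed(*stages) for kind, stages in config.items()}

print(table(WITHOUT_FORWARDING))
print(table(WITH_FORWARDING))
```

These are upper bounds per instruction class; they only apply when the following instruction actually conflicts with the producer.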

So, your calculations are correct. The program you have doubts about, with the nop instructions inserted, is as follows:

```
00: mov $0,0
nop,nop
nop,nop
02 wh: ldi $2,$1,10
03: subi $1,$1,1
nop,nop
04: ldi $3,$1,10
nop,nop
nop,nop
06: sti $3,$1,10
07: bez $1,ex
nop,nop,nop
08: j wh
nop,nop,nop
09 ex: ldi $7,$1,10
```

There are no nops between instructions 2 and 3 since there is no conflict that we have to resolve. Note that the numbers of stalls/nops calculated above are only necessary if there is a conflict between instructions that directly follow each other. Note further that it may also be sufficient to add only one nop after a load even without forwarding, e.g. in case of the following program

```
ldi $0,$1,0
```

which results in

```
ldi $0,$1,0
nop
```

We just have to make sure that a read operation following, for example, a load operation is at least 2 instructions away from it in the above case, and we add nop operations to increase the distance where needed. This does not mean that we always have to add two nops after every load operation. It does mean, however, that we never have to add more than two.
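The distance rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `insert_nops`, the tuple format, and the three-instruction sample program are hypothetical, the destination and source registers are given by hand rather than decoded, and only RAW conflicts on registers in straight-line code are handled (no branches or jumps).

```python
def insert_nops(program, gap=2):
    """Insert nops so that every read is at least `gap` slots after its writer.

    program: list of (text, dest_register_or_None, list_of_source_registers).
    gap:     nops needed when the reader directly follows the writer
             (2 for ALU/LOAD without forwarding, as calculated above).
    """
    out = []  # emitted (text, dest) pairs, including inserted nops
    for text, dest, sources in program:
        needed = 0
        # A writer k slots back already has k - 1 instructions in between.
        for k in range(1, min(gap, len(out)) + 1):
            prev_dest = out[-k][1]
            if prev_dest is not None and prev_dest in sources:
                needed = max(needed, gap - (k - 1))
        out.extend([("nop", None)] * needed)
        out.append((text, dest))
    return [text for text, _ in out]

# Hypothetical program: the read of $0 is already one instruction away from the load.
program = [
    ("ldi $0,$1,0", "$0", ["$1"]),
    ("subi $2,$1,1", "$2", ["$1"]),
    ("addi $3,$0,1", "$3", ["$0"]),
]
print(insert_nops(program))
```

In this sample only one nop is added, because the unrelated subi already provides one slot of distance. If the read followed the load directly, two nops would be added, and never more than that.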

As far as I understand, we are only dealing with RAW conflicts, right? With that, this exam makes sense, but what about the 2018 exam, where there is only 1 nop after addi? And do jump instructions always need 3 nops after them, just like branch instructions?
The same applies there: after the addi, there is a nop and then a jump instruction before the seq instruction performs the read. The RAW conflict addi -> seq is resolved since the two instructions are two instruction cycles apart.

Jump instructions are a bit delicate: before this semester, we said that in Abacus, jump instructions need three nops, as branches do. However, this has now been changed: jump instructions immediately assign the new pc while they are in the decode stage, so we no longer need nop instructions after jump instructions. The same can be achieved in Abacus for branch instructions if the comparison and the computation of the new pc are done in the decode stage. You can enable this feature with the parameter BranchInDecode in the simulator.

We will specify this clearly in the exam; it was not necessary to specify this in former exams.