# Pipeline conflicts

I have a question regarding pipeline conflicts (slide 14 from the RoSy-04-Pipelining slide-set).

The struggle understanding what exactly i and p is. Is p the amount of pipeline stages?

If I got the meaning of a RAW conflict right, it means that if an instruction I in line i writes in a register, it of course needs some steps to be completely written and therefore can not be read in line i+1 yet. Also if I remember correctly, the amount of steps the pipeline needs to completely write the value is dependent on the amount of pipeline stages.

Can someone tell me in which line (depending on the amount of pipeline stages) a register that has been written in in line i can be safely read so no RAW conflict appears? Is it maybe also dependent on other settings like forwarding or register bypassing?

I hope I've made my problem clear enough. Thank you in advance for any responses.

The slide shows a pipeline with p pipeline stages numbers as 1..p where instructions enter at stage 1 and are then processed by the stages until they leave the pipeline at stage p. As we consider a stream of instructions I(0),I(1),I(2),... executed by the processor, we look at some time t at the pipeline and find there the following instructions:

```    stage   instruction
1       I(t+i-1)
2       I(t+i-2)
:
i-1     I(t+1)
i       I(t)
i+1     I(t-1)
:
p-1     I(t+i-p+1)
p       I(t+i-p)```

We don't know much about this general pipeline, so at any stage the instruction may read and write certain memory locations (registers). Writing may furthermore be done with or without a delay (using register bypassing or not).

Now we consider the instruction at some stage i which is above the instruction I(t). Instruction I(t) has a RAW conflict to one of its preceeding instructions I(t-1),...,I(t+i-p) which are still in execution in case I(t) reads now, i.e., in stage i, some memory location that will be written by one of the instructions I(t-1),...,I(t+i-p) while they continue their way through the pipeline.

To be more precise, immediate writes of the instructions I(t-1),...,I(t+i-p) at that point of time are okay (since these are already read by I(t) in stage i or can at least be forwarded), but delayed writes at that point of time of the instructions I(t-1),...,I(t+i-p) are already too late. Also later writes, regardless whether with or without delay of the instructions I(t-1),...,I(t+i-p) are too late, since I(t) has already done a read which was too early. Note that in a sequential execution, I(t) will read what has been written by I(t-1),...,I(t+i-p) before (since all of those writes are done before I(t) is executed).

The situation is explained in more detail in the slides that follow, but it is admittely, not that easy to express.

by (147k points)
selected by

Lets be precise considering the following Abacus program:

1    mul \$2,\$1,\$1
2    //some code
3    //some code
4    //some code
5    mul \$2,\$1,\$1

If I got RAW conflicts right, there is a RAW conflict from line 0 to 1 because the value of \$1 + 2 has not yet been written into \$1 in line 1. Also, according to the exercise lessons, I can fix that by adding two nops between line 0 and 1. Therefore I assume that the pipeline needs 3 stages to completely write into a register. Also there should be no RAW conflict from line 0 to line 5 since there should be enough stages between.

How can one know how many nops have to be added? Is that number dependent on the amount of pipeline stages? How many pipeline stages must the pipeline the program above is running on have? Is that number also dependent on forwarding and/or register bypassing?

I've found another slide (90) with a nice table on it. It states the number of required stalls and that actually might be exactly what I am searching for. Unfortunately I do not fully understand the slide. First of all, why is there one possible combination missing (forward + no bypass)? Second of all, how do I know which line I have to consider? Are the stages that are needed to compute the required stalls (p_IF to p_WB) given in a possible exam excercise and if not how do I compute them?
First of all, you should not say "that the pipeline needs 3 stages to completely write into a register". Instead, recall that operands are read in the decode stage and are produced in later stages, i.e., in EX for ALU instructions and MA for Load instructions. The distance between the decode stage and the producing stages is what matters for the number of instructions that follow a write instruction for having a RAW conflict. In other words, this distance determines the number of NOPs required to resolve such a conflict since those NOPs increase the distance between the two instructions that have the conflict.

Slide 90 summarizes these observations for a general pipeline where pIF, pID, pEX, pMA and pWB are defined as explained on that slide. We may also consider the fourth condition (forward + no bypass) which would be one more than forward + bypass. Usually that is however not considered since bypassing is a kind of forwarding that is usually used also when forwarding is used. Bypassing may however also be used without forwarding.