Alternatively, we can stall the pipeline. Remember, when we stall the pipeline, the instruction in the ID phase (here, the branch) is not issued; a nop is issued instead. We also said that the IF and ID phases simply re-execute, i.e., their registers are not updated.
As it turns out, this is not completely true. If the PC were not updated, the IF phase would fetch the same instruction again (the sub), and we would be in exactly the same situation. Thus, the PC is updated for a control hazard. The ID register and PC1 are not updated, however, because if they were, the sub instruction would make it to the ID phase, which is exactly what we want to avoid.
Here is how the program would execute using pipeline interlock.
| Cycle | IF | ID | EX | MA | WB | Notes |
|---|---|---|---|---|---|---|
| 1 | add | | | | | |
| 2 | j | add | | | | |
| 3 | sub | j | add | | | We detect a stall |
| 4 | xor | j | nop | add | | PC is updated so when the IF re-executes, a different instruction is loaded |
| 5 | ori | xor | j | nop | add | |
Try it out. Run this program using branch interlock.
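The stall behaviour in the table can also be sketched as a toy model. The addresses (0, 4, 8, ...), the jump target of 12, and the names `PROGRAM` and `simulate` are assumptions made for illustration; the text does not give the actual memory layout. The model only tracks which instruction occupies each phase, not register contents.

```python
# Hypothetical program layout: address -> (mnemonic, jump target or None).
PROGRAM = {
    0: ("add", None),
    4: ("j", 12),      # assumed: the jump targets the xor at address 12
    8: ("sub", None),  # must never reach the ID phase
    12: ("xor", None),
    16: ("ori", None),
}


def simulate(program, cycles):
    """Return one (IF, ID, EX, MA, WB) tuple per cycle."""
    pc = 0
    in_id = None        # instruction held in the ID phase
    ex = ma = wb = ""
    redirected = False  # True once the stalled jump has updated the PC
    rows = []
    for _ in range(cycles):
        in_if = program.get(pc, ("", None))
        id_name = in_id[0] if in_id else ""
        rows.append((in_if[0], id_name, ex, ma, wb))

        # A jump in ID stalls once: it is not issued, a nop is issued instead.
        stall = in_id is not None and in_id[1] is not None and not redirected
        wb, ma = ma, ex
        if stall:
            ex = "nop"
            pc = in_id[1]      # the PC IS updated, so IF fetches the target
            redirected = True  # the ID register is NOT updated: j stays in ID
        else:
            ex = id_name
            in_id = in_if
            pc += 4
            redirected = False
    return rows


for cycle, row in enumerate(simulate(PROGRAM, 5), start=1):
    print(cycle, row)
```

In this sketch the sub is fetched in cycle 3 but discarded, because the ID register keeps the jump while the PC moves on to the target.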
Branch Prediction
Finally, there is a third option: branch prediction. Branch prediction uses a Branch Target Buffer (BTB) to predict what the next value of the PC will be. The BTB is an associative cache. Each entry in this cache lists the address of a branch instruction (the tag), together with its predicted target. This predicted target is based on the last time the branch was executed.
When the IF phase executes, it sends the contents of the PC to the BTB (1). The BTB then compares this with the addresses (tags) in its cache. If the address is in the cache, we must have executed the branch before. We then predict that the branch will do the same as the last time, so we send the associated value in the BTB to the PC (2).
If the address is not there, we cannot predict the branch. This happens either because we never saw the branch before (compulsory miss), or because we did see the branch before, but so long ago that it has been replaced by another branch in the BTB (capacity miss or conflict miss). In that case we simply calculate PC+4, as if we did not have branch prediction. This is also called predict-not-taken, because we "predict" that we do not take the branch and simply execute the next instruction in memory.
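The hit and miss cases described above can be sketched as a small fully associative BTB. This is a minimal illustration, not the actual hardware design: the class name `BranchTargetBuffer`, the capacity, and the replacement policy (evict the least recently updated entry) are all assumptions, since the text does not specify them.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Toy BTB: tag (branch address) -> predicted next PC."""

    def __init__(self, capacity=4):
        self.capacity = capacity      # assumed size, for illustration only
        self.entries = OrderedDict()

    def predict(self, pc):
        """Hit: return the stored target. Miss: predict not taken (PC + 4)."""
        if pc in self.entries:
            return self.entries[pc]
        return pc + 4

    def update(self, pc, actual_next_pc):
        """Remember what the branch at `pc` did the last time it executed."""
        self.entries[pc] = actual_next_pc
        self.entries.move_to_end(pc)
        if len(self.entries) > self.capacity:
            # Assumed policy: evict the least recently updated entry.
            self.entries.popitem(last=False)


btb = BranchTargetBuffer(capacity=2)
print(btb.predict(4))   # compulsory miss: falls back to PC + 4
btb.update(4, 12)       # the branch at address 4 went to address 12
print(btb.predict(4))   # hit: predicts the same target as last time
```

With a capacity of 2, recording two further branches evicts the entry for address 4, after which `predict(4)` falls back to PC+4 again. That corresponds to the capacity miss case.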