Analog Devices EE157 Application Notes

Engineer To Engineer Note EE-157
Technical Notes on using Analog Devices’ DSP components and development tools
Phone: (800) ANALOG-D, FAX: (781) 461-3010, EMAIL: dsp.support@analog.com, FTP: ftp.analog.com, WEB: www.analog.com/dsp
Explaining the Branch Target Buffer on the ADSP-TS101
Last modified: 2/02 Contributed By: C.L.
The following Engineer-to-Engineer note will discuss the branch target buffer (BTB) on the ADSP-TS101 TigerSHARC DSP. It will explain how the branch target buffer works, when it is advantageous to use it, and some of the errors that can occur because of it.
Introduction
The TigerSHARC can achieve its fast execution rate in part because of its eight-cycle deep pipeline. However, the drawback to this deep instruction pipeline is that when a branch instruction is executed, the pipeline must be flushed. The new instruction that is being branched to must then traverse the entire pipeline before it can be executed. This can cause a latency of three to six cycles. Using the BTB can reduce this latency to zero cycles.
The BTB is a 4-way set-associative cache for branch instructions, including interrupt returns, call returns, and computed jump instructions. It holds 128 entries and uses a Least Recently Used (LRU) replacement policy.
Each entry in the BTB has a TAG and a TARGET field. The TAG field stores the quad address of the instruction line that contained the branch instruction. The TARGET field stores the address that the program sequencer will jump to if the branch is taken.
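The organization described above can be sketched as a small behavioral model. This is only an illustration, not a description of the actual hardware: the class and method names are hypothetical, the split into 32 sets simply follows from 128 entries divided by 4 ways, and the choice of which address bits select a set is an assumption that the note does not specify.

```python
from collections import OrderedDict

NUM_ENTRIES = 128                    # from the note
NUM_WAYS = 4                         # from the note
NUM_SETS = NUM_ENTRIES // NUM_WAYS   # 32 sets, derived from the two figures above


class BranchTargetBuffer:
    """Toy model of a 4-way set-associative BTB with LRU replacement."""

    def __init__(self):
        # One OrderedDict per set, mapping TAG -> TARGET.
        # Ordering doubles as LRU order (least recently used first).
        self.sets = [OrderedDict() for _ in range(NUM_SETS)]

    @staticmethod
    def _quad_address(pc):
        # Clear the two low bits to obtain the quad-aligned address.
        return pc & ~0x3

    def _set_for(self, tag):
        # ASSUMPTION: the set index comes from the low bits of the
        # quad address; the note does not say how sets are selected.
        return self.sets[(tag >> 2) % NUM_SETS]

    def lookup(self, pc):
        """Return the stored TARGET on a hit, or None on a miss."""
        tag = self._quad_address(pc)
        ways = self._set_for(tag)
        if tag in ways:
            ways.move_to_end(tag)        # mark as most recently used
            return ways[tag]
        return None

    def record_taken_branch(self, pc, target):
        """Create or refresh an entry after a predicted branch is taken."""
        tag = self._quad_address(pc)
        ways = self._set_for(tag)
        if tag not in ways and len(ways) == NUM_WAYS:
            ways.popitem(last=False)     # evict the least recently used entry
        ways[tag] = target
        ways.move_to_end(tag)
```

In this model, a first `lookup` for a branch returns `None` (a miss); after `record_taken_branch` the same address returns the stored TARGET. A fifth branch that maps to an already full set displaces the entry that was used least recently.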
Copyright 2002, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of customers’ products or for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property of their respective holders. Information furnished by Analog Devices Applications and Development Tools Engineers is believed to be accurate and reliable, however no responsibility is assumed by Analog Devices regarding the technical accuracy of the content provided in all Analog Devices’ Engineer-to-Engineer Notes.
How the BTB saves cycles
The first time a branch instruction with a true condition occurs in code, the BTB will not have an entry for it. Therefore, a new entry is created, with the TAG set to the value in the PC. This first occurrence incurs a penalty of two cycles.
However, every time the same branch instruction is encountered again, the value of the PC will match the TAG that was stored the previous time. The program sequencer will see that there is a BTB hit, and instead of fetching the next sequential instruction, it will fetch the instruction at the address stored in the TARGET field. Therefore, if the branch condition ends up being true and the branch is taken, the correct instruction is already moving through the pipeline. The cycle penalty for taking this branch is now reduced to zero cycles.
When Is It Advantageous To Use the BTB?
Using the BTB saves the most cycles when the same branch instruction is executed over and over again, and the condition that controls the branch is true each time. For example, if a loop contains a branch instruction that is always taken, then the first time through the loop there will be a two-cycle penalty. Each subsequent time through the loop there will be no penalty. Compare this to the same code without the BTB. Depending on whether the condition depends on the IALU or the compute blocks, there will be a 3 or 6 cycle penalty every time through the loop.
The worst situation in which to use the BTB is one where the branch instruction is executed over and over again as before, but the condition is true only the first time. Because the branch is taken the first time, it is entered in the BTB and will be predicted taken on each subsequent pass. A predicted branch that is not taken incurs the same 3 or 6 cycle penalty. Compare this to the case with no branch prediction. The first time through the loop, you will suffer the 3 or 6 cycle penalty, but each additional time through the loop there is no penalty.
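The two cases above can be compared with a small cycle-count model. This is a sketch under stated assumptions: the function name and parameters are hypothetical, the cycle figures are the ones quoted in this note, and the model assumes that a predicted branch with no BTB entry that is not taken costs nothing and that an entry stays in the BTB once created.

```python
def total_penalty(outcomes, predicted, mispredict_penalty=3, first_taken_penalty=2):
    """Sum the branch penalty cycles over successive executions of one branch.

    outcomes: iterable of booleans, True when the branch is taken.
    predicted: False models a branch marked as not predicted.
    mispredict_penalty: 3 or 6 in the note, depending on whether the
        condition comes from the IALU or the compute blocks.
    first_taken_penalty: cost of the first taken, predicted occurrence.
    """
    cycles = 0
    in_btb = False  # becomes True after the first taken, predicted branch
    for taken in outcomes:
        if not predicted:
            # No prediction: only a taken branch costs cycles.
            cycles += mispredict_penalty if taken else 0
        elif in_btb:
            # BTB hit: free when taken, full penalty when not taken.
            cycles += 0 if taken else mispredict_penalty
        elif taken:
            # First taken occurrence: the entry is created.
            cycles += first_taken_penalty
            in_btb = True
        # ASSUMPTION: predicted, no BTB entry, not taken -> no penalty.
    return cycles


always_taken = [True] * 100
taken_once = [True] + [False] * 99

print(total_penalty(always_taken, predicted=True))   # 2
print(total_penalty(always_taken, predicted=False))  # 300
print(total_penalty(taken_once, predicted=True))     # 299
print(total_penalty(taken_once, predicted=False))    # 3
```

With a 3-cycle penalty and 100 passes, prediction wins by a wide margin when the branch is always taken and loses by a similar margin when it is taken only once.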
In general, if the branch is going to be taken more often than not, then it should be predicted. If the branch is not taken most of the time, then it should not be predicted. You can force a branch instruction to not be predicted by adding the (NP) suffix as shown below.
If jeq, jump 0x0000 (NP);;
Putting Two Branches in One Quad Word
One caveat of using the Branch Target Buffer is that two branch instructions should not be placed in the same quad word in memory. The compiler will issue a warning saying “Detected two instruction lines with predicted jumps ending within 4 words” if it thinks this might be happening.
Understanding how to avoid this situation begins with understanding how instructions move from internal memory to the core. Every cycle, the DSP takes advantage of its 128-bit bus to grab four instructions from internal memory. The first of these four instructions always has an address that is quad aligned (it is a multiple of four).
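The quad-alignment rule above can be expressed in a few lines. This is a minimal sketch, assuming addresses count 32-bit instruction words so that one 128-bit fetch covers four consecutive addresses; the function names are hypothetical, and the check does not attempt to reproduce the exact rule behind the tool warning.

```python
def quad_address(word_address):
    """Return the quad-aligned address of the fetch containing word_address."""
    return word_address & ~0x3   # clear the two low bits


def share_quad_word(addr_a, addr_b):
    """True when both word addresses fall in the same quad word."""
    return quad_address(addr_a) == quad_address(addr_b)


print(hex(quad_address(0x1006)))        # 0x1004
print(share_quad_word(0x1004, 0x1007))  # True
print(share_quad_word(0x1003, 0x1004))  # False
```

Two branches at 0x1004 and 0x1007 arrive in the same fetch, while 0x1003 and 0x1004 are adjacent in memory but belong to different quad words.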
Remember that the TigerSHARC can execute anywhere from one to four instructions in a single instruction line. So while four instructions are loaded into the core every cycle, those four instructions do not necessarily belong to the same instruction line. To handle this situation, the instructions are stored in a five-word FIFO called the Instruction Alignment Buffer (IAB). Here, they are aligned into the proper instruction lines as defined in the source code. For example, consider code structured as shown below.
Instr1;;
Instr2; Instr3; Instr4; Instr5;;
Instr6; Instr7;;
Instr8;;
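For a layout like this one, the fetch that each instruction line has to wait for can be worked out with a short sketch. It assumes that every instruction occupies one 32-bit word, that Instr1 sits at a quad-aligned address, and that a line can issue once the quad word holding its last instruction has been fetched. The function name is hypothetical, and the model ignores the IAB depth and all other pipeline effects.

```python
def fetch_needed(line_lengths, start_address=0):
    """For each instruction line, return the index of the quad-word fetch
    that delivers its last instruction (fetch 0 is the first fetch).

    line_lengths: number of instructions in each line, in program order.
    start_address: word address of the first instruction.
    """
    first_fetch = start_address // 4
    result = []
    address = start_address
    for length in line_lengths:
        last_word = address + length - 1           # address of the line's last instruction
        result.append(last_word // 4 - first_fetch)
        address += length                          # next line starts right after
    return result


# Lines of 1, 4, 2, and 1 instructions, as in the layout above.
print(fetch_needed([1, 4, 2, 1]))  # [0, 1, 1, 1]
```

Only the first line is complete after the first fetch; the four-instruction line straddles two quad words, so it has to wait for the second fetch.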
In the first cycle, the first quad word of instructions is fetched from memory. This includes instructions 1, 2, 3, and 4. The first instruction line can be executed because it only contains Instr1. The second instruction line must wait for the next quad word of instructions to be loaded from memory since it needs Instr5. Instructions 2, 3, 4, and 5 are aligned into one instruction line in the IAB. The instructions for