Engineer To Engineer Note EE-157
Technical Notes on using Analog Devices’ DSP components and development tools
Phone: (800) ANALOG-D, FAX: (781) 461-3010, EMAIL: dsp.support@analog.com, FTP: ftp.analog.com, WEB: www.analog.com/dsp
Explaining the Branch Target Buffer
on the ADSP-TS101
Last modified: 2/02
Contributed By: C.L.
This Engineer-to-Engineer note discusses the
branch target buffer (BTB) on the ADSP-TS101
TigerSHARC DSP. It explains how the BTB works,
when it is advantageous to use it, and some of
the errors that can occur because of it.
Introduction
The TigerSHARC can achieve its fast execution
rate in part because of its eight-cycle deep
pipeline. The drawback of this deep
instruction pipeline is that when a branch
instruction is executed, the pipeline must be
flushed. The instruction at the branch target
must then traverse the entire pipeline before it
can be executed, which can cause a latency of
three to six cycles. Using the BTB can reduce
this latency to zero cycles.
The BTB is a 4-way set-associative cache for
branch instructions, including interrupt
returns, call returns, and computed jump
instructions. It is 128 entries deep and uses a
least-recently-used (LRU) replacement policy.
Each entry in the BTB has a TAG and a
TARGET field. The TAG field stores the quad
address of the instruction line that contained the
branch instruction. The TARGET field stores
the address that the program sequencer will
jump to if the branch is taken.
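The organization described above can be sketched as a small software model. This is only an illustration, not the hardware implementation: the class and method names are hypothetical, and the way a set is selected from the quad address is an assumption, since this note does not describe the indexing scheme.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Toy model of a 4-way set-associative BTB with LRU replacement."""

    def __init__(self, entries=128, ways=4):
        self.ways = ways
        self.num_sets = entries // ways  # 32 sets for 128 entries
        # One ordered dict per set; the least recently used entry comes first.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def _set_for(self, tag):
        # ASSUMPTION: the set is chosen from the low bits of the quad number.
        return self.sets[(tag >> 2) % self.num_sets]

    def lookup(self, tag):
        """Return the stored TARGET on a hit, or None on a miss."""
        entries = self._set_for(tag)
        if tag in entries:
            entries.move_to_end(tag)  # mark as most recently used
            return entries[tag]
        return None

    def insert(self, tag, target):
        """Store TAG -> TARGET, evicting the LRU way if the set is full."""
        entries = self._set_for(tag)
        if tag in entries:
            entries.move_to_end(tag)
        elif len(entries) >= self.ways:
            entries.popitem(last=False)  # drop the least recently used entry
        entries[tag] = target
```

In this model, a fifth branch that maps to an already full set evicts whichever of the four stored entries was used least recently.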
Copyright 2002, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of customers’ products or
for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property of their respective holders. Information
furnished by Analog Devices Applications and Development Tools Engineers is believed to be accurate and reliable, however no responsibility is assumed by Analog Devices
regarding the technical accuracy of the content provided in all Analog Devices’ Engineer-to-Engineer Notes.
How the BTB saves cycles
The first time a branch instruction with a true
condition is encountered in code, the BTB has no
entry for it. A new entry is therefore created,
with the TAG set to the value in the PC, and a
penalty of two cycles is incurred at this
point.
However, every time the same branch
instruction is encountered again, the value of
the PC matches the TAG stored the previous
time. The program sequencer sees that there is
a BTB hit, and instead of fetching the next
sequential instruction, it fetches the instruction
at the address stored in the TARGET field.
Therefore, if the branch condition turns out to
be true and the branch is taken, the correct
instruction is already moving through the
pipeline. The cycle penalty for taking this branch
is now reduced to zero cycles.
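The miss-then-hit behavior can be illustrated with a minimal sketch. It assumes a branch whose condition is true on every execution and represents the BTB as a plain dictionary from TAG to TARGET; the function name and the dictionary are hypothetical stand-ins, and only the two-cycle and zero-cycle figures come from this note.

```python
FIRST_TAKEN_PENALTY = 2  # cycles when a taken branch misses the BTB (from the text)


def run_taken_branch(btb, pc, target, times):
    """Return the penalty of each execution of an always-taken branch."""
    penalties = []
    for _ in range(times):
        if pc in btb:
            # TAG match: the sequencer is already fetching from TARGET.
            penalties.append(0)
        else:
            # Miss: allocate an entry with TAG = PC and pay the penalty once.
            btb[pc] = target
            penalties.append(FIRST_TAKEN_PENALTY)
    return penalties


penalties = run_taken_branch({}, pc=0x100, target=0x40, times=4)
```

With an empty BTB, only the first of the four executions pays the two-cycle penalty.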
When Is It Advantageous To Use the BTB?
Using the BTB will save the most cycles when
the same branch instruction is executed over and
over again, and each time the condition that
controls the branch is true. For example, if you
are in a loop that contains a branch instruction
that is always taken, then the first time through
the loop there will be a two-cycle penalty. Each
subsequent time through the loop there will be
no penalty at all. Compare this to the same
code that does not use the BTB. Depending on
whether the condition depends on the IALU or
the compute blocks, there will be a 3 or 6 cycle
penalty every time through the loop.
The worst situation in which to use the BTB is
one where the branch instruction is executed
over and over again as before, but the condition
is true only the first time. If the branch is taken
the first time, then each subsequent time the
branch will be predicted. A predicted branch
that is not taken incurs the same 3 or 6 cycle
penalty. Compare this to the case where there is
no branch prediction. The first time through the
loop, you will suffer the 3 or 6 cycle penalty,
but each additional time through the loop you
will suffer no penalty.
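The two scenarios can be compared with a little arithmetic. The sketch below simply totals the penalties quoted in the text for a loop of a given length; the function name and dictionary keys are illustrative, and the model assumes the BTB entry is never evicted while the loop runs.

```python
FIRST_TAKEN_PENALTY = 2  # cycles for the first taken, predicted branch (from the text)


def loop_penalties(iterations, penalty):
    """Total penalty cycles for the best and worst cases described above.

    penalty is 3 or 6, depending on whether the condition comes from the
    IALU or the compute blocks. iterations must be at least 1.
    """
    return {
        # Branch taken on every iteration.
        "always taken, predicted": FIRST_TAKEN_PENALTY,
        "always taken, not predicted": iterations * penalty,
        # Branch taken on the first iteration only.
        "taken once, predicted": FIRST_TAKEN_PENALTY + (iterations - 1) * penalty,
        "taken once, not predicted": penalty,
    }


totals = loop_penalties(iterations=100, penalty=3)
```

For 100 iterations with a 3-cycle penalty, prediction turns 300 penalty cycles into 2 in the best case, and turns 3 penalty cycles into 299 in the worst case.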
In general, if the branch is going to be taken
more often than not, then it should be predicted.
If the branch is not taken most of the time, then
it should not be predicted. You can force a
branch instruction to not be predicted by adding
the (NP) suffix as shown below.
if jeq, jump 0x0000 (NP);;
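The "more often than not" rule follows from comparing expected penalties. The sketch below is a simplified steady-state model, not a cycle-accurate one: it assumes the BTB entry is already present and never evicted, ignores the one-time two-cycle fill cost, and uses the same 3 or 6 cycle penalty for both kinds of misstep. The function names are hypothetical.

```python
def expected_penalty(p_taken, penalty, predicted):
    """Steady-state expected penalty cycles per execution of one branch."""
    if predicted:
        # A predicted branch pays only when it turns out not to be taken.
        return (1.0 - p_taken) * penalty
    # A non-predicted (NP) branch pays only when it is taken.
    return p_taken * penalty


def should_predict(p_taken, penalty=3):
    """True when prediction gives the lower expected penalty."""
    return (expected_penalty(p_taken, penalty, True)
            < expected_penalty(p_taken, penalty, False))
```

Under these assumptions the break-even point is a taken probability of one half, regardless of whether the penalty is 3 or 6 cycles.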
Putting Two Branches in One Quad Word
One caveat of using the Branch Target Buffer is
that two branch instructions should not be
placed in the same quad word in memory.
The compiler will issue a warning saying
“Detected two instruction lines with predicted
jumps ending within 4 words” if it thinks this
might be happening.
Understanding how to avoid this situation
begins with understanding how instructions
move from internal memory to the core. Every
cycle, the DSP takes advantage of its 128-bit
bus to grab four instructions from internal
memory. The first of these four instructions
always has an address that is quad aligned (it is
a multiple of four).
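Quad alignment is easy to check numerically. The sketch below assumes addresses are counted in 32-bit words, which is consistent with "a multiple of four" above; the helper names are hypothetical.

```python
QUAD_WORDS = 4  # one 128-bit fetch holds four 32-bit words


def quad_base(word_address):
    """Quad-aligned address of the fetch that contains word_address."""
    return word_address & ~(QUAD_WORDS - 1)


def same_quad_word(addr_a, addr_b):
    """True if both word addresses are delivered by the same fetch."""
    return quad_base(addr_a) == quad_base(addr_b)
```

Two branch instructions whose addresses satisfy `same_quad_word` sit in the same quad word, which is the layout this section cautions against.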
Remember that the TigerSHARC can execute
anywhere from one to four instructions per
instruction line. So while four instructions are
loaded into the core every cycle, those four
instructions do not necessarily belong to the
same instruction line. To handle this situation,
the instructions are stored in a five-word FIFO
called the Instruction Alignment Buffer (IAB).
Here, they are aligned into the proper
instruction lines as defined in the source
code. For example, consider code structured as
shown below.
Instr1;;
Instr2; Instr3; Instr4; Instr5;;
Instr6; Instr7;;
Instr8;;
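As a rough illustration of how these four instruction lines map onto fetched quad words, the sketch below lists the quad words each line occupies. It assumes Instr1 sits at a quad-aligned address and that every instruction occupies one 32-bit word; the function name is hypothetical.

```python
def quad_words_per_line(line_lengths, start_address=0):
    """For each instruction line, list the quad-word indices it occupies."""
    spans = []
    address = start_address
    for length in line_lengths:
        first = address // 4                 # quad word holding the first instruction
        last = (address + length - 1) // 4   # quad word holding the last instruction
        spans.append(list(range(first, last + 1)))
        address += length
    return spans


# The four lines above hold 1, 4, 2 and 1 instructions respectively.
spans = quad_words_per_line([1, 4, 2, 1])
```

Only the second line straddles two quad words, so it is the only one that cannot be assembled from a single fetch.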
In the first cycle, the first quad word of
instructions is fetched from memory. This
includes instructions 1, 2, 3, and 4. The first
instruction line can be executed because it only
contains Instr1. The second instruction line
must wait for the next quad word of instructions
to be loaded from memory since it needs Instr5.
Instructions 2, 3, 4, and 5 are aligned into one
instruction line in the IAB. The instructions for