Technical notes on using Analog Devices DSPs, processors and development tools
Contact our technical support at dsp.support@analog.com and at dsptools.support@analog.com
Or vi sit our o n-li ne r esou rces htt p:/ /www.analog.com/ee-notes and http://www.analog.com/processors
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide
Contributed by Andrew Caldwell and Maikel Kokaly-Bannourah Rev 1 – July 14, 2004
1 Introduction
The following Engineer-to-Engineer note discusses the differences between Analog Devices Inc. (ADI)
first and second generation ADSP-2106x and ADSP-2116x SHARC® DSPs and ADSP-TS101 and
ADSP-TS20x TigerSHARC® processors.
Over the years, the SHARC DSP architecture has become the world leader for high-end multiprocessor
applications. The introduction of the ultra-high performance TigerSHARC architecture makes it the ideal
upgrade device for existing SHARC systems looking for a performance boost and a system cost reduction.
This document is a step-by-step, how-to guide for porting SHARC code to its TigerSHARC equivalent.
Architectural differences are discussed, and multiple code examples are provided to help you upgrade
existing SHARC source code to the next generation of floating-point processors, the TigerSHARC
processor family.
Figure 1. SHARC DSP and TigerSHARC Processor Block Diagrams
Copyright 2004, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of
customers’ products or for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property
of their respective holders. Information furnished by Analog Devices applications and development tools engineers is believed to be accurate and reliable, however
no responsibility is assumed by Analog Devices regarding technical accuracy and topicality of the content provided in Analog Devices’ Engineer-to-Engineer Notes.
2 Table of Contents ..............................................................................................................................................................................................2
4.2 Data Addressing......................................................................................................................................................................................10
4.2.2 Addressing in SISD and SIMD.......................................................................................................................................................11
4.3 Program Sequencer .................................................................................................................................................................................12
4.3.2 Instruction Cache and BTB............................................................................................................................................................. 12
4.3.3 Program Flow Variations................................................................................................................................................................13
5.2 Data Addressing......................................................................................................................................................................................22
5.2.1 Post-Modify and Pre-Modify ..........................................................................................................................................................22
5.2.3 Addressing in SISD and SIMD.......................................................................................................................................................23
5.3 Program Sequencer .................................................................................................................................................................................24
5.4.2 Internal Memory - Internal Memory of other DSPs (Multiprocessing)........................................................................................... 33
5.4.3 Internal Memory - Link Port I/O.....................................................................................................................................................34
6 C Run-Time Environment ...............................................................................................................................................................................36
6.1 Memory .SECTION and SECTION{} Names........................................................................................................................................36
6.4 Code Conversion from SHARC DSPs to TigerSHARC Processors........................................................................................................ 39
6.4.2 Function Prologue...........................................................................................................................................................................41
6.4.3 Pushing Additional Data to the Stack..............................................................................................................................................42
6.4.5 Popping Data from the Stack...........................................................................................................................................................46
6.4.7 Function Epilogue...........................................................................................................................................................................48
6.4.8 Using Mixed C/C++ and Assembly Naming Conventions.............................................................................................................. 49
7.1.3 Call DB ...........................................................................................................................................................................................57
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 2 of 64
7.1.4 Data Addressing..............................................................................................................................................................................57
7.2 FIR ..........................................................................................................................................................................................................57
7.2.1 FIR One-to-One Conversion ...........................................................................................................................................................57
Code 5. SHARC vs. TigerSHARC Register File Sets........................................................................................................................................ 22
Code 6. Post-modify and Pre-modify Data Addressing Operations ...................................................................................................................23
Code 10. Single Loop......................................................................................................................................................................................... 25
Code 12. Standard Timer Interrupt.....................................................................................................................................................................27
Code 15. User Software Exception.....................................................................................................................................................................31
Code 16. SHARC and TigerSHARC to External Memory Device DMA Example...........................................................................................32
Code 17. SHARC and TigerSHARC Multiprocessor DMA Example................................................................................................................ 34
Code 18. SHARC and TigerSHARC Multiprocessor DMA Example................................................................................................................ 35
Code 22. SHARC Function Call Macro .............................................................................................................................................................41
Code 23. TigerSHARC Function Call Macro..................................................................................................................................................... 41
Code 24. TigerSHARC Function Prologue Macros............................................................................................................................................41
Code 25. Post-Modify Store to the J Stack.........................................................................................................................................................42
Code 32. C Source for Function Block FIR Function Call.................................................................................................................................45
Code 33. Example for Retrieval or Arguments Passed to Block FIR.................................................................................................................46
Code 38. "restore_reg" Macro on SHARC DSPs...............................................................................................................................................47
Code 39. "restore_reg" Macro on TigerSHARC Processors...............................................................................................................................47
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 3 of 64
Code 40. Pushing and Popping the Stack Example............................................................................................................................................48
Code 41. ADSP-21020 and ADSP-21160 Function Epilogue Macros............................................................................................................... 48
Code 42. SHARC Function Epilogue Macros....................................................................................................................................................49
Code 43. TigerSHARC Function Epilogue Macros............................................................................................................................................49
Code 44. SHARC Heap Declaration in "seg_init.asm" ......................................................................................................................................51
Code 45. SHARC Heap MEMORY Section Placement within the .LDF...........................................................................................................51
Code 46. TigerSHARC Heap Declaration in "ts_hdr.asm".................................................................................................................................51
Code 47. TigerSHARC Heap Memory Section Placement within the .LDF......................................................................................................51
Code 48. SHARC Multiple Heap Declaration in "seg_init.asm"........................................................................................................................52
Code 49. SHARC Multiple Heap Memory Section Placement within the .LDF................................................................................................53
Code 50. TigerSHARC Multiple Heap Declaration in "ts_hdr.asm".................................................................. ................................................ 53
Code 52. SHARC and TigerSHARC DFT Example ..........................................................................................................................................56
Code 53. Floating Point Block FIR One-to-One Conversion .............................................................................................................................60
Code 54. Optimized Floating Point Block FIR...................................................................................................................................................62
a
2.2 Table Listing
Table 1. SHARC and TigerSHARC Features Overview................................................................................ ...................................................... 6
Table 7. Registers used for Passing Arguments..................................................................................................................................................44
Table 9. Accessing C from Assembly ................................................................................................................................................................49
Table 10. Accessing Assembly from C...............................................................................................................................................................49
Table 11. Accessing C++ from Assembly..........................................................................................................................................................50
Table 12. Accessing Assembly from C++..........................................................................................................................................................50
Table 13. SHARC and TigerSHARC Heap Management Functions..................................................................................................................54
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 4 of 64
a
3 Architecture Overview
The first and second generation SHARC - Super
Harvard Architecture Computer – DSPs, ADSP-
2106x and ADSP-2116x, build on the ADSP21000 DSP core family to form a complete
system-on-chip, adding dual-ported on-chip
SRAM, and integrated I/O peripherals.
The SHARC architecture combines a highperformance floating-point DSP core with
integrated on-chip features including a host
processor interface, DMA controller, serial ports,
link ports, and shared bus connectivity for
glueless multiprocessing for up to six SHARC
processors.
This architecture balances a high-performance
DSP core with high-performance buses, yielding
an ideal solution for audio, military,
communications, test equipment, motor control,
imaging and many other applications.
The ADSP-TSxxx TigerSHARC processor
family sets a new standard of performance for
digital signal processors, combining multiple
computation units for floating-point and fixedpoint processing, as well as very large word
widths.
This new architecture maintains a system-on-achip, scalable computing design philosophy,
including up to 24 Mbits of on-chip memory,
integrated I/O peripherals, a host processor
interface, DMA controllers, link ports, and
shared bus connectivity for glueless
multiprocessing of up to eight TigerSHARC
processors.
The TigerSHARC processor’s extremely highperformance core and increased feature set
makes it the ideal upgrade device for existing
SHARC systems looking for a performance
boost and a system cost reduction.
Due to the architectural changes (added units,
modified pipeline, increased memory structure,
internal bus architecture improvement, etc.)
introduced when moving from SHARC DSPs to
TigerSHARC processors, the source code
requires modification to overcome code
incompatibility.
This document first examines the main
differences between the two architectures and
then shows you how to convert SHARC source
code to its TigerSHARC equivalent.
Table 1 summarizes these two architectures main
features.
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 5 of 64
Production Process 0.35-0.5 microns 0.18-0.25 microns 0.13 microns 0.13 microns
Link Ports 6 x 4-bits 6 x 4/8 bits 4 x 8-bits 2-4 x 8-bits (LVDS)
Link Ports Compatibility ADSP-2116x ADSP-2106x N/C N/C
Serial Ports 2 2-4 N/A N/A
IRQ Lines 3 3 4 4
1x40-bit and 1x48-
bit
2x64–bit 3x128-bit 4x128-bit
External Interfaces
SRAM, SBSRAM,
SDRAM
Branch Target Buffer
(BTB)
SRAM, SBSRAM,
SDRAM
Branch Target Buffer
(BTB)
SRAM,SBSRAM,
SDRAM
G-P I/O Pins 4 4 4 4
N/A: Not Applicable, N/C: Not Compatible
Table 1. SHARC and TigerSHARC Features Overview
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 6 of 64
a
4 SHARC-to-TigerSHARC
Conversion Guidelines
The SHARC-to-TigerSHARC register
map shown throughout this EE-Note is
not fixed. It is simply an example of
Table 2 shows the ADSP-2106x and ADSP-
how this can be done.
2116x SHARC DSPs registers, along with their
TigerSHARC processor (ADSP-TS101 and
ADSP-TS20x) equivalent.
The given mapping scheme has been
selected to provide code translation in
the simplest way, with the SHARC DSP
The main differences between the two
family features in mind.
architectures, and how they can be mapped to
each other, are discussed.
This does not necessarily mean that it
will produce the most efficient
TigerSHARC code. It will, however,
help you to translate source code to run
on TigerSHARC platforms.
Register Type ADSP-2106x ADSP-2116x ADSP-TS101 ADSP-TS201
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 8 of 64
Register Type ADSP-2106x ADSP-2116x ADSP-TS101 ADSP-TS201
a
II3-0, IM3-0, C3-0,
Serial Port
Status DMASTAT DMASTAT DSTAT DSTAT
Buffer & Control LBUF5-0, LCTL
Common LCOM LCOM LSTAT3-0 LSTAT3-0
Assignment LAR LAR N/A N/A
Service Request LSRQ LSRQ LSTAT3-0 LSTAT3-0
Buffer & Control
P/R: Primary Registers, S/R: Secondary Registers, N/A: Not Applicable
CP3-0, GP3-0, DB3-0,
DA3-0
TX1-0, RX1-0,
STCTL1-0,
SRCTL1-0
MR, MR2-0, MRF,
MRF2-0, MRB,
MRB2-0
II3-0, IM3-0, C3-0,
CP3-0, GP3-0, DB3-0,
DA3-0
Link Port Control
LBUF5-0, LCTL1-0,
LIRPTL
Serial Port Control
TX1-0, RX1-0,
STCTL1-0,
SRCTL1-0
Multiplier Registers
MR, MR2-0, MRF,
MRF2-0, MRB,
MRB2-0
N/A N/A
LCTL3-0,
LBUFTX3-0,
LBUFRX3-0
N/A N/A
MR3-0, MR4 MR3-0, MR4
LRCTL3-0,
LTCTL3-0,
LBUFTX3-0,
LBUFRX3-0
Table 2. SHARC DSP and TigerSHARC Processor Registers Mapping Scheme
4.1 Register File
The register file of first-generation ADSP-2106x
processors features two sets (primary and
secondary) of 16 40-bit-wide registers (R0-R15)
for fast context switching.
The same applies to the ADSP-2116x SHARC
DSP register file, with the addition of a second
register file set (
(Single-Instruction Multiple-Data). The two
register files and the included arithmetic units are
referred to as Processing Element X (PEx) and
Table 2lists TigerSHARC processor
registers that are relevant to the SHARC
DSPs registers. It does not list all
TigerSHARC registers.
For details on all TigerSHARC
registers, refer to the ADSP-TS101
TigerSHARC Processor Hardware
Reference [5] and the ADSP-TS201
TigerSHARC Processor Hardware
Reference [7].
Processing Element Y (PEy).
ADSP-2106x DSPs are SISD machines
(Single-Instruction Single-Data) and
therefore do not have a PEy unit.
S0-S15) for SIMD operations
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 9 of 64
a
The TigerSHARC processor’s Compute Block X
(CBx) register file set has 32 32-bit registers
(XR0-XR31), but has no extra set for fast context
switching. However, since the register file has
twice the number of registers, direct register
mapping can be accomplished.
TigerSHARC processors also have a second
processing unit, Compute Block Y (CBy), with a
set of 32 32-bit registers (YR0-YR31). These
registers can be mapped directly to the PEy
register set of the ADSP-2116x SHARC DSPs
when performing SIMD operations.
The register file mapping used throughout this
document is as follows:
Refer to section 5.1 Register File for SHARC
DSP programming examples and their
TigerSHARC processor equivalent.
alternate or secondary register set.
address any memory block (i.e., there are no
DM/PM limitations).
Also, each IALU contains a register file (32 32bit registers) with dedicated registers for circular
buffer addressing.
Because of their flexible register set and ability
to address any memory block, each IALU
register can be mapped to any DAG register.
Throughout this EE-Note, the following
SHARC-to-TigerSHARC data addressing
register map is used:
Similar to the Processing Element register files,
each DAG has a secondary register set on the
SHARC architecture. Since TigerSHARC
processors do not have these extra sets and have
only 32 IALU registers per unit (J and K), the
number of alternate modifier registers is limited
to 3 instead of 8. This, however, should not
impact most (if not all) applications, because not
all other available modifier registers (J/K19-12,
J/K30-28) will be in use at the very same time.
4.2 Data Addressing
As shown in Table 2, the SHARC DSP Data
Address Generators (DAGs) are replaced by the
TigerSHARC Integer Arithmetic Logic Units
(IALUs).
Generally, the IALUs (JALU and KALU) have
the same functionality as the DAGs (DAG1 and
DAG2). Additionally, IALUs can also perform
arithmetic and logical operations (add and
subtract, arithmetic and logic shift, logical
operations, as well as some mathematical
functions – ABS, MIN, MAX, etc.), resulting in extra
capacity for computationally demanding
applications.
Unlike DAGs, both IALUs are connected to all
internal memory blocks, enabling each IALU to
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 10 of 64
DAGs support loading/storing of data using the
PM and DM buses in the same instruction (e.g.,
dm(i0,m0)=r0, f8=pm(i8,m8);).
IALUs support the use of the JALU and KALU
in parallel. However, due to the register mapping
scheme used throughout this EE-Note, dedicating
CBx registers to Pex (CBy to Pey), JALU and
Unlike the other IALU registers, J31
and K31 cannot be used as generalpurpose registers.
For more details, refer to the ADSP-
TS101 TigerSHARC Processor
Programming Reference [6] and the
ADSP-TS201 TigerSHARC Processor
Programming Reference [8].
a
KALU cannot be used in the same instruction for
storing data from CBx:
//Parallel STORE to memory–NOT ALLOWED
[j4+=j12] = xr2; [k4+=k12] = xr3;;
// Parallel STORE to memory-ALLOWED
[j4+=j12] = xr2; [k4+=k12] = yr3;;
Code 1. JALU and KALU Parallel Instructions
The second instruction in Code 1 results in a
resource violation since the number of required
CBx output ports (2) exceeds the allowed
maximum (1). As shown in the third instruction,
using the CBy register for one of the stores to
memory is allowed, resulting in the correct use
of J and K in the same instruction.
To keep things as simple as possible and
Refer to section 5.2.1 Post-Modify and Pre-
Modify for SHARC DSP programming examples
and their TigerSHARC Processor equivalent.
4.2.1 Circular Buffers
Although TigerSHARC processors have 32 J and
32 K IALU registers, only eight circular buffers
can be used at a time (via
their respective JL/KL and JB/KB registers).
For this reason, I7-I0 and I15-I8 have been
mapped to J11-J4 and K11-K4, allowing J3-J0
and
to comply with the register map selected
for this EE-Note, parallel DM and PM
stores will be translated as two
individual instructions (e.g., [j4+=j12]
= xr2;; [k4+=k12] = xr3;;
For SHARC DSPs, parallel instructions
are separated by a comma “,” and the
end of an instruction line is denoted by a
single semicolon “;”.
For TigerSHARC processors, a
semicolon “
instructions, and a double semicolon
“;;” terminates an instruction line.
Refer to section 5.2.2 Circular Buffers for
SHARC DSP programming examples and their
TigerSHARC processor equivalent.
4.2.2 Addressing in SISD and SIMD
The following section applies only to ADSP2116x SHARC DSPs. It does not apply to firstgeneration ADSP-2106x SHARC DSPs.
SIMD mode does not change the addressing
operations in the DAGs; it changes the amount of
data that moves during each access. The DAGs
put the same addresses on the buses in SIMD and
SISD modes. In SIMD mode, the DSP’s memory
and processing elements get data from the
locations named (explicit) in the instruction
syntax and the complementary (implicit)
locations.
This differs in TigerSHARC processors. In this
case, SIMD is no longer a processor mode.
TigerSHARC SIMD operation are controlled at
instruction level. Specifying “x” as part of the
instruction performs an operation using the CBx.
Specifying “y” as part of the instruction performs
an operation in CBy. Using “xy” or “yx” results
in an operation in both compute blocks, CBx and
CBy.
The order in which “x” and “y” are specified
influences the way data is moved between
memory and the compute blocks. Specifying
“xy” (e.g., xyR0) moves the lower portion of the
data to/from CBy and the higher portion to/from
CBx.
On the other hand, specifying “yx” (e.g., yxR0)
moves the lower portion of the data into or from
CBx and the higher portion into or from CBy.
Additionally, when neither “x” nor “y” precedes
the register file name (e.g.,
will be performed in one of the following two
ways: in the same manner as for “xy” (i.e., the
lower portion to/from CBy and the higher
JB3-0, B15-8 Ö KB3-0
R0), the data move
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 11 of 64
a
to/from CBx) or the same data to/from both CBx
and CBy. This depends on the number of
registers specified as the source or destination of
the transaction, as well as the length of the data
being transferred.
Moving the same data from/to both compute
blocks is equivalent to Broadcast mode of the
ADSP-2116x SHARC DSPs. In this mode,
identical data is moved to/from each processing
element.
Refer to section 5.2.3 Addressing in SISD and
SIMD for SHARC DSP programming examples
and their TigerSHARC processor equivalent.
accesses, pipeline depth does not pose a problem
when converting SHARC DSP code to the
TigerSHARC processor. The pipeline comes into
effect for non-sequential accesses such as jumps,
subroutine calls and returns, interrupts, and
loops. Due to the fully interlocked pipeline of the
TigerSHARC processors, you do not need to be
aware of the order of execution of instructions
and when data will be available with regard to
correct functionality. However, from a
performance perspective, this is an important
area and is covered in great detail in the ADSP-
TS101 TigerSHARC Processor Programming
Reference [6] and the ADSP-TS201 TigerSHARC
Processor Programming Reference [8].
4.3 Program Sequencer
This section details some of the fundamental
differences in the program sequencer between
SHARC DSPs and TigerSHARC processors.
This is not an in-depth comparison of the two
program sequencers; only specific parts deemed
important for the successful conversion of source
code from the SHARC DSP to the TigerSHARC
processor are covered. For detailed information
on SHARC DSP and TigerSHARC processor
program sequencers, refer to the hardware
documentation listed in section 9 References.
The following subjects will be covered briefly,
focusing on the main differences with regards to
programming between the SHARC DSPs and the
TigerSHARC processor.
Instruction pipeline
4.3.2 Instruction Cache and BTB
Engineers familiar with the SHARC DSP family
should be aware of the instruction cache that is
located within the program sequencer. The
instruction cache allows for simultaneous
fetching of an instruction and a program memory
data access. TigerSHARC processors do not
require an instruction cache in the program
sequencer due to the number of memory blocks.
There are a sufficient number of memory blocks
and internal buses so that an instruction fetch and
two data accesses can take place without an
instruction cache. This, however, is largely
dependent upon how data and program memory
is structured within the Linker Description File
.LDF).
(
The TigerSHARC program sequencer does have
a form of cache known as a branch target buffer
Instruction cache and BTB
Program flow variations
Interrupts
4.3.1 Instruction Pipeline
As shown in Table 1, SHARC DSPs have an
instruction pipeline depth of three thus
processing instructions in three clock cycles.
TigerSHARC processors have a much larger,
fully interlocked pipeline. For ADSP-TS101
processors, the pipeline depth is 8; for ADSP-
(BTB). The BTB is a 32 entry 4-way set
associative cache that has been implemented to
help reduce the number of stalls incurred with
non-sequential accesses on these deeply
pipelined processors. The destination address of
a branch instruction can be stored to the BTB,
acting as an early indication for the sequencer on
the next iteration where to continue fetching
code. The BTB becomes especially important in
loop execution in which incorrect usage can
result in significant loss of performance.
TS20x processors, the depth is 10. For sequential
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 12 of 64
a
To use the BTB, the BTB must be enabled and
the sequencer must predict the instruction flow.
Predicted instruction flow is the default method
of a branch instruction. However, you can
specify that a branch is not predicted to occur.
This is especially useful for conditional branch
instructions in which the condition is more likely
to be false than true. Non-predicted branches do
not update the BTB.
Examples of writing predicted and non predicted
branch instructions are shown below. Refer to
section 5.3.2 Loops for a complete loop example.
/* conditional branch based on the
result of an X compute block
computation being equal to zero */
If xaeq, jump label;;
/* conditional branch with prediction
based on the result of an X compute
block computation being equal to
zero */
If xaeq, jump label (P);;
/* conditional branch with no
prediction based on the result of
an X compute block computation
being equal to zero */
If xaeq, jump label (NP);;
Code 2. Predicted and Non-predicted Branch
In the first example above, no option has been
placed at the end of the instruction. By default,
this will be predicted but can be forced to be
predicted or non-predicted through the tools with
the use of an assembler switch. Refer to the
VisualDSP++ 3.5 Assembler and Preprocessor
Manual for TigerSHARC Processors [13] for
further details.
For a detailed description of the BTB and its
operation, Refer to the ADSP-TS101 TigerSHARC Processor Programming Reference
[6] and the ADSP-TS201 TigerSHARC Processor Programming Reference [8].
4.3.3 Program Flow Variations
The program flow of SHARC DSPs and
TigerSHARC processors is mostly linear. This
linear flow varies, however, when the program
uses non-sequential program structures such as:
Jumps
Loops
Subroutines
Interrupts
Idle
For a list of typical sequencer instructions on the
SHARC DSPs and their TigerSHARC processor
equivalent, refer to Table 3 at the end of this
section.
There is a difference in the way that a
CALL
instruction is handled by the program sequencers
on SHARC DSPs and TigerSHARC processors.
Because TigerSHARC processors have no PC
stack, the return address from the
CALL is instead
saved to the CJMP register. Therefore, before
performing another CALL, the CJMP register must
be saved to memory, otherwise the return
location for the preceding CALL will be lost. A
return from a CALL on TigerSHARC processors
is performed using the CJMP instruction. Thus,
the CJMP instruction on TigerSHARC processors
maps directly to the RTS instruction on SHARC
DSPs.
One fundamental difference between branching
on the SHARC DSPs and TigerSHARC
processors is that the TigerSHARC program
sequencer does not support the delayed branch
feature. Thus, when converting unconditional
delayed branches (i.e.,
call label (db);),
move the two instructions immediately following
the delayed branch to before the branch.
However, when converting conditional delayed
branches (e.g.,
IF EQ JUMP(PC,label) (db);),
copy the two instructions immediately following
the delayed branch (without deleting them from
their original location) to the beginning of the
target branch. Refer to section 5.3.1 Pipeline for
a delayed branch example.
Both SHARC DSPs and TigerSHARC
processors support zero-overhead looping
execution. SHARC DSPs support up to six
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 13 of 64
a
nested loops using the program sequencer’s loop
support registers.
The setup of a loop requires a loop counter
register, an instruction to decrement the counter,
and a conditional instruction to terminate the
loop at the required time. The main difference
between setting up a loop on the SHARC DSPs
and the TigerSHARC processors is that the
SHARC DSPs require the use of a DO/UNTIL
instruction as the conditional instruction. The
instruction immediately following the DO/UNTIL
is the first instruction of the loop, and the last
instruction of the loop is indicated by a label.
LCNTR = count, DO label UNTIL LCE;
/* first instruction */
Instruction;
.....
.....
/* last instruction in loop */
label: Instruction;
Code 3. SHARC DSP Loop
When executing the DO/UNTIL instruction, the
program sequencer pushes the address of the
loop’s last instruction and the loop’s termination
condition onto the loop address stack. The
sequencer also pushes the address of the
instruction following the DO/UNTIL instruction
onto the PC stack.
The TigerSHARC sequencers do not work in this
manner. They do not have a loop address stack or
a PC stack from which the required instructions
can be read. Instead, they have two dedicated
loop counter registers (
LC0 and LC1) and special
loop counter conditions (IF NLC0E, IF NLC1E,
IF LC0E, and IF LC1E) for setting up zero-
overhead loops. If more than two nested loops
are required, set up the additional loops using the
IALU registers. The effect of these differences
between the sequencers means that the loop has a
different structure from that of the SHARC
DSPs. On TigerSHARC processors, the
beginning of the loop is indicated by a label, and
a conditional jump instruction is required at the
end of the loop to jump back to this label. If the
test on the conditional jump is true, the
instruction immediately following the
conditional instruction is then fetched.
LCx = count;;
/*first instruction of loop */
label: Instruction;;....
.....
/* last instruction of loop */
IF NLCxE, jump label; instruction;;
Code 4. TigerSHARC Processor Loop
Refer to section 5.3.2 Loops for examples of
setting up and performing loops on SHARC
DSPs and how this same operation is translated
for operation on TigerSHARC processors.
4.3.4 Interrupts
There are significant differences between the
way that interrupts are set up on the SHARC
DSPs and the TigerSHARC processors. The first
point to note is that the interrupt vector addresses
on TigerSHARC are not in internal or external
program memory as they are on the SHARC
DSPs. TigerSHARC processors have dedicated
registers within the interrupt controller in which
the vector addresses are stored. This register set,
known as the Interrupt Vector Table (IVT),
contains 30 registers. On SHARC DSPs, there
are effectively five steps to process an interrupt,
assuming the interrupt is enabled:
1. Output the interrupt vector address.
2. Push the current PC value onto PC stack.
3. Depending on the interrupt that occurred,
push the
ASTAT and MODE1 registers onto
status stack.
4. Set the appropriate bit in IRPTL.
5. Alter IMASKP to reflect the current interrupt
nesting state.
TigerSHARC processors react to interrupts in a
different manner; this depends on whether the
interrupt is a hardware interrupt or a software
exception. For hardware interrupts, assuming the
interrupt is enabled:
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 14 of 64
a
1. Set the appropriate bit in ILATL/ILATH.
2. Output the interrupt vector address from
3. Store the current PC to
RETI.
IVT.
4. Upon entry to the interrupt service routine
(ISR), set the appropriate PMASKL/PMASKH bit
and set PMASKH bit 29 to block all hardware
interrupts on ADSP-TS101. For ADSPTS20x processors, instead of setting
bit 29, set
SQSTAT bit 21 to block all
PMASKH
hardware interrupts.
For software exceptions, assuming they are
enabled:
1. For ADSP-TS101 processors, set the
appropriate bit in ILATH. For ADSP-TS20x
processors, set appropriate bit in SQSTAT.
2. Output the interrupt vector address.
3. Store the current PC to RETS.
Because of differences in the way that SHARC
DSPs and TigerSHARC processors handle
interrupts, one question immediately comes to
mind: how do you nest interrupts on the
TigerSHARC processors?
This is one of the significant differences between
the two families. Because TigerSHARC
processors have no PC stack and do not save
registers (there is no status stack), the nesting of
interrupts must be performed in the ISR itself.
There is no nesting enable bit in a control
register. For this reason, all interrupts are
disabled upon entry to the ISR, this allows you to
save registers to memory (or the C run-time
stack). This includes the saving of the
RETI
register. Failure to save the contents of the RETI
register to memory before enabling the nesting or
re-using of interrupts will result in the loss of the
return address. Interrupts on TigerSHARC
processors are nested or re-used by execution of
special instructions. For nested interrupts, you
must save the
RETIB register to memory results in the
the
RETIB register to memory. Saving
following two actions being performed:
1. Saves the contents of
current contents of
RETI to memory. The
RETI are the return
address from the interrupt.
2. On ADSP-TS101 processor, bit 29 of the
PMASKH register is cleared, allowing for
higher priority interrupts to now occur. On
ADSP-TS20x processors, bit 21 of the
SQSTAT register is cleared, allowing for
higher priority interrupts to occur.
If the interrupt service routine is not nested,
restore registers that were saved to a stack at the
beginning of the ISR before executing the RTI
instruction, which returns to normal program
flow. If the interrupt was nested, however, before
restoring any registers, disable the interrupts
again so as not to clobber any data and risk
losing the correct return address. This is
performed by restoring the
RETIB register from
the location from which it was stored in memory.
Once this register has been restored, no further
interrupts may occur. This allows for safe
restoration of any registers. The ISR is then
exited using the RTI instruction.
Re-usable interrupts are enabled on
TigerSHARC processors by using the reduce to
subroutine instruction (RDS). RDS is not
equivalent to the RTS instruction on SHARC
DSPs. The RDS instruction on TigerSHARC
processors has an equivalent effect, clearing the
interrupt status option (
CI), which is appended to
a JUMP instruction within the interrupt vector
table on SHARC DSPs. Similar to nested
interrupts, save the status and any required
registers to the stack, including the
before executing the
RDS instruction. To safely
RETI register,
restore all registers at the end of the ISR that has
been reduced to a subroutine (including the
correct return address), all interrupts must be
disabled. This can be achieved by using a similar
method to that of nested interrupts, by restoring
the return address to the RETIB register. The
processor status registers and any registers saved
to the stack can then be safely restored from the
stack before returning from the ISR.
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 15 of 64
a
To return from an ISR that has been reduced to
subroutine level, execute the RETI instruction.
This instruction returns from the interrupt
without modifying any of the mask pointer
PMASK) register bits. However, since interrupts
(
are still disabled, execute this instruction in
parallel with a dummy save of the
RETIB
register. This results in the interrupts being
enabled again after return from the ISR.
This method of returning from a subroutine is
assuming provides a safe method of effectively
nesting not only higher priority interrupts, but
also the same interrupt and lower priority
interrupts.
Since there is no PC stack on TigerSHARC
processors, the only limit to the number of
interrupts that may be nested correctly is the size
of the stack defined in memory. For details on
complying with the C run-time environment
stack within assembly routines, refer to section 6
C Run-Time Environment. Examples of setting
up interrupts, nested interrupts, and re-usable
interrupts on SHARC DSPs and TigerSHARC
processors are provided in section 5.3.3
Interrupts. The TigerSHARC processor’s
program sequencer has a different method for
handling software exceptions. On SHARC DSPs,
software exceptions can result from:
TigerSHARC processors have extended
functionality to the above software exceptions.
The exceptions are enabled and handled in a
different manner from the SHARC DSPs. On
SHARC DSPs, the exceptions previously
introduced are enabled from within the
IMASK
register, where each exception has its own
interrupt vector. On TigerSHARC processors,
there is only one interrupt handler for all
software exceptions. This interrupt handler must
read all required registers to determine the cause
of the exception. The cause of the exception is
determined by reading the EXCAUSE field of the
the SQCTL register for ADSP-TS20x processors,
for any invalid floating-point operation to
generate software exceptions the Invalid enable
bit (IVEN) of the XSTAT/YSTAT register must be
set. For an underflow or overflow operations to
generate a software exception, the underflow
enable bit (UEN) or overflow enable bit (OEN)
must be set in the XSTAT/YSTAT register.
The four user software exceptions on SHARC
DSPs can be implemented on TigerSHARC
processors by using the
TRAP instruction. On
SHARC DSPs, you would force the setting of the
corresponding software exception bit in the
IRPTL register; instead, replace this instruction
with
TRAP. TigerSHARC processors can support
up to 32
for 32 user software exceptions. The
TRAP instructions, effectively allowing
TRAP
instruction is appended with a 5-bit value. When
the instruction is executed, this 5-bit value is
stored in the
SQSTAT register. The software
exception ISR would determine that a TRAP
instruction has occurred by reading the value
obtained from the
action to be taken for the
EXCAUSE field. The specific
TRAP instruction is
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 16 of 64
a
determined by reading the 5-bit value from the
SPVCMD field of the SQSTAT register.
Similar to SHARC DSPs, software exceptions on
the TigerSHARC processors have a higher
priority than hardware interrupts. Upon entry to
the software exception ISR, the return address is
stored to the RETS register. However to exit the
ISR, the RTI instruction is executed. As
described earlier, the execution of the RTI
instruction results in the modification of the
mask pointer bits as well as the return of
program flow to the address stored in RETI. For
this reason, a special procedure is required to
saving the current value in RETI to memory, so
as not to corrupt the return address if a hardware
interrupt is being served. Next, load the value
stored in RETS into RETI; this sets up the correct
return address from the software ISR. Lastly,
execute the RTI instruction in parallel with the
restoring the RETI register from the place it was
saved to in memory.
Refer to section 5.3.3 Interrupts for an example
of setting up a user software exception. Table 7
lists program flow control instructions for
SHARC DSPs and TigerSHARC processor
equivalents.
Conditional return from subroutine or interrupt / compute
IF condition RTS;
IF condition RTI; IF condition RTI(ABS);;
IF condition RTS, compute;
IF condition RTI, compute; IF condition RTI(ABS); compute;;
Conditional return from subroutine or interrupt or compute
IF condition RTS, ELSE compute;
IF condition RTI, ELSE compute; IF condition RTI(ABS); ELSE compute;;
Do until counter expired
LCNTR = <data16>, Do <addr24> UNTIL LCE;
LCNTR = ureg, Do <addr24> UNTIL LCE;
LCNTR = <data16>, Do (PC, <reladdr24>) UNTIL LCE;
LCNTR = ureg, Do (PC, <reladdr24>) UNTIL LCE;
IF condition, CJMP(ABS);;
IF condition, RETI(ABS);;
IF condition, CJMP(ABS); compute;;
IF condition, RETI(ABS); compute;;
IF condition, CJMP(ABS); ELSE compute;;
IF condition, RETI(ABS); ELSE compute;;
LCx = <data32>;;
label:
......
IF NLCxE, JUMP label;;
Do until
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 19 of 64
a
DO <addr24> UNTIL termination;
DO (PC, <reladdr24>) UNTIL termination;
Table 3. Sequencer Instructions
Notice that all RTS commands on SHARC DSPs
have two equivalent instructions on
TigerSHARC processors. This depends on
whether the RTS instruction is used from a simple
call or it is used to reduce an interrupt to
subroutine level as described earlier.
Table 4 maps the SHARC DSP conditions to
those available on TigerSHARC processors.
Some of the conditions cannot be mapped
directly to the TigerSHARC. To perform the
required action for some of these conditions, the
Condition SHARC DSPs TigerSHARC Processors
ALU Conditions
ALU equal zero EQ {X | Y | XY}AEQ
label:
......
IF not termination JUMP label;;
STATUS flag on the TigerSHARC can be masked,
and the unmasked bit copied to the static flag
register (
based on
SF0, SF1); then the condition may be
SF0, SF1, NSF0, or NSF1. For details on
static flag registers, refer to the ADSP-TS101
TigerSHARC Processor Programming Reference
[6] and the ADSP-TS201 TigerSHARC Processor
Programming Reference [8]. Note that some
additional condition codes on TigerSHARC
processors are not available on SHARC DSPs.
ALU less than zero LT {X | Y | XY}ALT
ALU less than or equal to zero LE {X | Y | XY}ALE
ALU carry AC not available, use static flag
ALU overflow AV not available, use static flag
ALU not equal to zero NE {X | Y | XY}NAEQ
ALU greater than zero GT {X | Y | XY}NALE
ALU greater than or equal to zero GE {X | Y | XY}NALT
Not ALU carry NOT AC not available, use static flag
Not ALU overflow NOT AV not available, use static flag
Multiplier Conditions
Multiplier overflow MV not available, use static flag
Multiplier sign (less than zero) MS {X | Y | XY}MLT
Multiplier not overflow NOT MV not available, use static flag
Multiplier not sign (greater than or equal to zero) NOT MS {X | Y | XY}NMLT
Shifter Conditions
Shifter overflow SV not available, use static flag
Shifter zero SZ {X | Y | XY}SEQ
Shifter not overflow NOT SV not available, use static flag
Shifter not zero NOT SZ {X | Y | XY}NSEQ
SHARC® DSPs to TigerSHARC® Processors Code Porting Guide (EE-241) Page 20 of 64
Loading...
+ 44 hidden pages
You need points to download manuals.
1 point = 1 manual.
You can buy points or you can get point for every manual you upload.