tions, optimized for large, demanding multiprocessor DSP
applications
Performs exceptionally well on DSP algorithm and I/O benchmarks (see benchmarks in Table 1)
Supports low overhead DMA transfers between internal memory, external memory, memory-mapped peripherals, link ports, host processors, and other (multiprocessor) DSPs
Eases DSP programming through extremely flexible instruction set and high-level-language friendly DSP architecture
Enables scalable multiprocessing systems with low communications overhead
Functional Block Diagram (figure not reproduced). Labeled blocks: 12M bits internal memory (memory blocks with page cache), 4x crossbar connect, S-bus address and data, compute block register file (32 × 32) with MUL, ALU, and SHIFT units, SOC interface and SOC bus, JTAG port, host and multiprocessor interface, SDRAM controller, C-bus arbiter, DMA, external port (32-bit address, 64-bit data, control), external DMA requests, and link ports L0 to L3 (in and out).
TigerSHARC and the TigerSHARC logo are registered trademarks of Analog Devices, Inc.
Rev. 0
Information furnished by Analog Devices is believed to be accurate and reliable.
However, no responsibility is assumed by Analog Devices for its use, nor for any
infringements of patents or other rights of third parties that may result from its use.
Specifications subject to change without notice. No license is granted by implication
or otherwise under any patent or patent rights of Analog Devices. Trademarks and
registered trademarks are the property of their respective owners.
Rev. 0 | Page 2 of 44 | November 2004
GENERAL DESCRIPTION
ADSP-TS202S
The ADSP-TS202S TigerSHARC processor is an ultrahigh performance, static superscalar processor optimized for large signal
processing tasks and communications infrastructure. The DSP
combines very wide memory widths with dual computation
blocks (supporting 32- and 40-bit floating-point and 8-, 16-, 32-, and 64-bit fixed-point processing) to set a new
standard of performance for digital signal processors. The
TigerSHARC static superscalar architecture lets the DSP execute up to four instructions each cycle, performing 24 fixed-point (16-bit) operations or six floating-point operations.
Four independent 128-bit wide internal data buses, each connecting to the six 2M bit memory banks, enable quad-word
data, instruction, and I/O accesses and provide 28G bytes per
second of internal memory bandwidth. Operating at 500 MHz,
the ADSP-TS202S processor’s core has a 2.0 ns instruction cycle
time. Using its single-instruction, multiple-data (SIMD) features, the ADSP-TS202S processor can perform four billion 40-bit MACs or one billion 80-bit MACs per second. Table 1 shows
the DSP’s performance benchmarks.
Table 1. General-Purpose Algorithm Benchmarks at 500 MHz

Benchmark                                                   Speed         Clock Cycles
32-bit algorithm, one billion MACs/s peak performance
  1K point complex FFT (Radix 2) (1)                        18.8 µs       9419
  64K point complex FFT (Radix 2) (1)                       2.8 ms        1397544
  FIR filter (per real tap)                                 1 ns          0.5
  [8 × 8][8 × 8] matrix multiply (complex, floating-point)  2.8 µs        1399
16-bit algorithm, four billion MACs/s peak performance
  256 point complex FFT (Radix 2) (1)                       1.9 µs        928
I/O DMA transfer rate
  External port                                             1G bytes/s    n/a
  Link ports (each)                                         1G bytes/s    n/a

(1) Cache preloaded
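The speed and clock cycle columns of Table 1 are related by the 2.0 ns instruction cycle time at 500 MHz. The following Python sketch illustrates that relationship; the names CCLK_HZ, BENCHMARKS, and cycles_to_seconds are illustrative only and are not part of any Analog Devices tool.

```python
# Convert Table 1 cycle counts to execution time at the 500 MHz core clock.
CCLK_HZ = 500e6  # core clock rate stated in the Table 1 heading

# (benchmark, clock cycles) pairs copied from Table 1
BENCHMARKS = [
    ("1K point complex FFT (Radix 2)", 9419),
    ("64K point complex FFT (Radix 2)", 1397544),
    ("FIR filter (per real tap)", 0.5),
    ("[8 x 8][8 x 8] complex matrix multiply", 1399),
    ("256 point complex FFT (Radix 2)", 928),
]


def cycles_to_seconds(cycles, clock_hz=CCLK_HZ):
    """Return the execution time in seconds for a cycle count at clock_hz."""
    return cycles / clock_hz


if __name__ == "__main__":
    for name, cycles in BENCHMARKS:
        print(f"{name}: {cycles_to_seconds(cycles) * 1e6:.3f} us")
```

For example, 9419 cycles at 500 MHz is about 18.8 µs, which agrees with the table entry for the 1K point FFT.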
The ADSP-TS202S processor is code-compatible with the other
TigerSHARC processors.
The Functional Block Diagram on Page 1 shows the ADSP-TS202S processor’s architectural blocks. These blocks include:
• Dual compute blocks, each consisting of an ALU, multiplier, 64-bit shifter, and 32-word register file and associated
data alignment buffers (DABs)
• Dual integer ALUs (IALUs), each with its own 31-word
register file for data addressing and a status register
• A program sequencer with instruction alignment buffer
(IAB) and branch target buffer (BTB)
• An interrupt controller that supports hardware and software interrupts, supports level- or edge-triggers, and
supports prioritized, nested interrupts
• Four 128-bit internal data buses, each connecting to the six
2M bit memory banks
• On-chip DRAM (12M bit)
• An external port that provides the interface to host processors, multiprocessing space (DSPs), off-chip memory-mapped peripherals, and external SRAM and SDRAM
• A 14-channel DMA controller
• Four full-duplex LVDS link ports
• Two 64-bit interval timers and timer expired pin
• An IEEE 1149.1 compliant JTAG test access port for on-chip
emulation
Figure 2 on Page 3 shows a typical single-processor system with
external SRAM and SDRAM. Figure 4 on Page 8 shows a typical
multiprocessor system.
Figure 2. ADSP-TS202S Single-Processor System with External SDRAM (figure not reproduced). The diagram shows the ADSP-TS202S connected to a clock and reference, optional SDRAM memory, an optional boot EPROM, optional external memory, an optional host processor interface, an optional DMA device, and up to four optional link devices.
The TigerSHARC DSP uses a Static Superscalar architecture (Static Superscalar is a trademark of Analog Devices, Inc.).
This architecture is superscalar in that the ADSP-TS202S processor’s core can execute simultaneously from one to four 32-bit
instructions encoded in a very large instruction word (VLIW)
instruction line using the DSP’s dual compute blocks. Because
the DSP does not perform instruction re-ordering at run-time
(the programmer selects which operations will execute in parallel
prior to run-time), the order of instructions is static.
With few exceptions, an instruction line, whether it contains
one, two, three, or four 32-bit instructions, executes with a
throughput of one cycle in a ten-deep processor pipeline.
For optimal DSP program execution, programmers must follow
the DSP’s set of instruction parallelism rules when encoding an
instruction line. In general, the selection of instructions that the
DSP can execute in parallel each cycle depends on the instruction line resources each instruction requires and on the source
and destination registers used in the instructions. The programmer has direct control of three core components—the IALUs,
the compute blocks, and the program sequencer.
The ADSP-TS202S processor, in most cases, has a two-cycle
execution pipeline that is fully interlocked, so—whenever a
computation result is unavailable for another operation dependent on it—the DSP automatically inserts one or more stall
cycles as needed. Efficient programming with dependency-free
instructions can eliminate most computational and memory
transfer data dependencies.
In addition, the ADSP-TS202S processor supports SIMD operations in two ways: SIMD compute blocks and SIMD
computations. The programmer can load both compute blocks
with the same data (broadcast distribution) or different data
(merged distribution).
DUAL COMPUTE BLOCKS
The ADSP-TS202S processor has compute blocks that can execute computations either independently or together as a single-instruction, multiple-data (SIMD) engine. The DSP can issue up
to two compute instructions per compute block each cycle,
instructing the ALU, multiplier, or shifter to perform independent, simultaneous operations. Each compute block can execute
eight 8-bit, four 16-bit, two 32-bit, or one 64-bit SIMD computations in parallel with the operation in the other block.
The compute blocks are referred to as X and Y in assembly syntax, and each block contains three computational units—an
ALU, a multiplier, a 64-bit shifter—and a 32-word register file.
• Register File: each compute block has a multiported 32-word, fully orthogonal register file used for transferring
data between the computation units and data buses and for
storing intermediate results. Instructions can access the
registers in the register file individually (word-aligned), in
sets of two (dual-aligned), or in sets of four (quad-aligned).
• ALU: the ALU performs a standard set of arithmetic operations in both fixed- and floating-point formats. It also
performs logic and PERMUTE operations.
• Multiplier: the multiplier performs both fixed- and floating-point multiplication and fixed-point multiply and
accumulate.
• Shifter: the 64-bit shifter performs logical and arithmetic
shifts, bit and bitstream manipulation, and field deposit
and extraction operations.
Using these features, the compute blocks can:
• Provide 8 MACs per cycle peak and 7.1 MACs per cycle
sustained 16-bit performance and provide 2 MACs per
cycle peak and 1.8 MACs per cycle sustained 32-bit performance (based on FIR)
• Execute six single-precision floating-point or execute 24
fixed-point (16-bit) operations per cycle, providing
3 GFLOPS or 12.0 GOPS performance
• Perform two complex 16-bit MACs per cycle
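The peak figures in the list above follow from multiplying the per-cycle operation counts by the 500 MHz core clock. The Python sketch below checks that arithmetic; peak_rate is a hypothetical helper name used only for illustration.

```python
# Peak throughput = operations per cycle x core clock rate.
CCLK_HZ = 500e6  # core clock rate used throughout this data sheet


def peak_rate(ops_per_cycle, clock_hz=CCLK_HZ):
    """Return operations per second for a given per-cycle operation count."""
    return ops_per_cycle * clock_hz


float_ops = peak_rate(6)    # six floating-point operations per cycle
fixed_ops = peak_rate(24)   # 24 fixed-point (16-bit) operations per cycle
macs_16 = peak_rate(8)      # eight 16-bit MACs per cycle (peak)
macs_32 = peak_rate(2)      # two 32-bit MACs per cycle (peak)

print(float_ops / 1e9, "GFLOPS")
print(fixed_ops / 1e9, "GOPS")
```

The same calculation gives the four billion 16-bit MACs per second and one billion 32-bit MACs per second quoted as peak performance in Table 1.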
DATA ALIGNMENT BUFFER (DAB)
The DAB is a quad-word FIFO that enables loading of quad-word data from nonaligned addresses. Normally, load instructions must be aligned to their data size so that quad words are
loaded from a quad-aligned address. Using the DAB significantly improves the efficiency of some applications, such as FIR
filters.
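As a rough software model of why nonaligned quad-word loads help, the Python sketch below builds a four-word window at an arbitrary word address out of the two aligned quad words that cover it. This is only an illustration of the alignment idea, not a description of the DAB hardware; the function names and the list-based memory are assumptions made for the example.

```python
QUAD = 4  # 32-bit words per quad word


def load_quad_aligned(mem, addr):
    """Load four words; the address must be quad-aligned, as for a normal load."""
    if addr % QUAD != 0:
        raise ValueError("quad-word load needs a quad-aligned address")
    return mem[addr:addr + QUAD]


def load_quad_unaligned(mem, addr):
    """Model a nonaligned quad-word load using the covering aligned quad words."""
    base = addr - (addr % QUAD)          # aligned quad word containing addr
    offset = addr - base                 # word offset inside that quad word
    window = load_quad_aligned(mem, base)
    if offset:
        # The window spills into the next aligned quad word.
        window = window + load_quad_aligned(mem, base + QUAD)
    return window[offset:offset + QUAD]


memory = list(range(16))                 # word i holds the value i
print(load_quad_unaligned(memory, 5))    # words 5 through 8
```

A FIR filter that slides its data window by one sample per output needs exactly this kind of access pattern, which is why the text singles out FIR filters.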
DUAL INTEGER ALU (IALU)
The ADSP-TS202S processor has two IALUs that provide powerful address generation capabilities and perform many general-purpose integer operations. The IALUs are referred to as J and
K in assembly syntax and have the following features:
• Provide memory addresses for data and update pointers
• Support circular buffering and bit-reverse addressing
As address generators, the IALUs perform immediate or indirect (pre- and post-modify) addressing. They perform modulus
and bit-reverse operations with no constraints placed on memory addresses for the modulus data buffer placement. Each
IALU can specify either a single-, dual-, or quad-word access
from memory.
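Bit-reverse addressing is typically used to reorder radix-2 FFT data. The Python sketch below shows the index sequence such addressing produces for a power-of-two buffer; it is a generic illustration of the technique, and the function names are hypothetical rather than IALU instructions.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Return the bit-reversed index sequence for a buffer of n elements."""
    bits = n.bit_length() - 1
    if n <= 0 or (1 << bits) != n:
        raise ValueError("n must be a power of two")
    return [bit_reverse(i, bits) for i in range(n)]


print(bit_reversed_order(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
```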
The IALUs have hardware support for circular buffers, bit
reverse, and zero-overhead looping. Circular buffers facilitate
efficient programming of delay lines and other data structures
required in digital signal processing, and they are commonly
used in digital filters and Fourier transforms. Each IALU provides registers for four circular buffers, so applications can set
up a total of eight circular buffers. The IALUs handle address
pointer wraparound automatically, reducing overhead, increasing performance, and simplifying implementation. Circular
buffers can start and end at any memory location.
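The pointer wraparound described above can be modeled in a few lines of Python. This is a sketch of the general circular-buffer technique only; the parameter names (base, length, modify) are assumptions for the example and do not name actual IALU registers.

```python
def circular_post_modify(pointer, modify, base, length):
    """Return the next pointer, wrapped into [base, base + length)."""
    return base + (pointer - base + modify) % length


# A five-word buffer at an arbitrary (not power-of-two aligned) base address.
base, length = 0x1003, 5
pointer = base
offsets = []
for _ in range(6):
    offsets.append(pointer - base)
    pointer = circular_post_modify(pointer, 2, base, length)

print(offsets)  # [0, 2, 4, 1, 3, 0]
```

Because the wrap is computed relative to the base address, the buffer does not need any particular alignment, which matches the statement that circular buffers can start and end at any memory location.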
Because the IALU’s computational pipeline is one cycle deep, in
most cases integer results are available in the next cycle. Hardware (register dependency check) causes a stall if a result is
unavailable in a given cycle.
PROGRAM SEQUENCER
The ADSP-TS202S processor’s program sequencer supports the
following:
• A fully interruptible programming model with flexible programming in assembly and C/C++ languages; handles
hardware interrupts with high throughput and no aborted
instruction cycles
• A ten-cycle instruction pipeline—four-cycle fetch pipe and
six-cycle execution pipe—computation results available
two cycles after operands are available
• Supply of instruction fetch memory addresses; the
sequencer’s instruction alignment buffer (IAB) caches up
to five fetched instruction lines waiting to execute; the program sequencer extracts an instruction line from the IAB
and distributes it to the appropriate core component for
execution
• Management of program structures and program flow
determined according to JUMP, CALL, RTI, RTS instructions, loop structures, conditions, interrupts, and software
exceptions
• Branch prediction and a 128-entry branch target buffer
(BTB) to reduce branch delays for efficient execution of
conditional and unconditional branch instructions and
zero-overhead looping; correctly predicted branches that
are taken occur with zero overhead cycles, overcoming the
five-to-nine stage branch penalty
• Compact code without the requirement to align code in
memory; the IAB handles alignment
Interrupt Controller
The DSP supports nested and nonnested interrupts. Each interrupt type has a register in the interrupt vector table. Also, each
has a bit in both the interrupt latch register and the interrupt
mask register. All interrupts are fixed as either level-sensitive or
edge-sensitive, except the IRQ3–0 hardware interrupts, which
are programmable.
The DSP distinguishes between hardware interrupts and software exceptions, handling them differently. When a software
exception occurs, the DSP aborts all other instructions in the
instruction pipe. When a hardware interrupt occurs, the DSP
continues to execute instructions already in the instruction pipe.
Flexible Instruction Set
The 128-bit instruction line, which can contain up to four 32-bit
instructions, accommodates a variety of parallel operations for
concise programming. For example, one instruction line can
direct the DSP to conditionally execute a multiply, an add, and a
subtract in both computation blocks while it also branches to
another location in the program. Some key features of the
instruction set include:
• Algebraic assembly language syntax
• Direct support for all DSP, imaging, and video arithmetic
types
• Eliminates toggling DSP hardware modes because modes
are supported as options (for example, rounding, saturation, and others) within instructions
• Branch prediction encoded in instruction; enables zero-overhead loops
• Parallelism encoded in instruction line
• Conditional execution optional for all instructions
• User defined partitioning between program and data
memory
DSP MEMORY
The DSP’s internal and external memory is organized into a
unified memory map, which defines the location (address) of all
elements in the system, as shown in Figure 3.
The memory map is divided into four memory areas—host
space, external memory, multiprocessor space, and internal
memory—and each memory space, except host memory, is subdivided into smaller memory spaces.
The ADSP-TS202S processor internal memory has 12M bits of
on-chip DRAM memory, divided into six blocks of 2M bits
(64K words × 32 bits). Each block—M0, M2, M4, M6, M8, and
M10—can store program instructions, data, or both, so applications can configure memory to suit specific needs. Placing
program instructions and data in different memory blocks,
however, enables the DSP to access data while performing an
instruction fetch. Each memory segment contains a 128K bit
cache to enable single cycle accesses to internal DRAM.
The six internal memory blocks connect to the four 128-bit wide
internal buses through a crossbar connection, enabling the DSP
to perform four memory transfers in the same cycle. The DSP’s
internal bus architecture provides a total memory bandwidth of
28G bytes per second, enabling the core and I/O to access eight
32-bit data-words and four 32-bit instructions each cycle. The
DSP’s flexible memory structure enables:
• DSP core and I/O accesses to different memory blocks in
the same cycle
• DSP core access to three memory blocks in parallel—one
instruction and two data accesses
• Programmable partitioning of program and data memory
• Program access of all memory as 32-, 64-, or 128-bit
words—16-bit words with the DAB
EXTERNAL PORT (OFF-CHIP
MEMORY/PERIPHERALS INTERFACE)
The ADSP-TS202S processor’s external port provides the DSP’s
interface to off-chip memory and peripherals. The 4G word
address space is included in the DSP’s unified address space.
The separate on-chip buses—four 128-bit data buses and four
32-bit address buses—are multiplexed at the SOC interface and
transferred to the external port over the SOC bus to create an
external system bus transaction. The external system bus provides a single 64-bit data bus and a single 32-bit address bus.
The external port supports data transfer rates of 1G bytes per
second over the external bus.
Figure 3. ADSP-TS202S Memory Map (figure not reproduced; the address labels are listed below)

Global space (start address, region):
0x80000000  Host (MSH), up to 0xFFFFFFFF
0x74000000  Reserved
0x70000000  External memory space: MSSD bank 3 (MSSD3)
0x64000000  Reserved
0x60000000  External memory space: MSSD bank 2 (MSSD2)
0x54000000  Reserved
0x50000000  External memory space: MSSD bank 1 (MSSD1)
0x44000000  Reserved
0x40000000  External memory space: MSSD bank 0 (MSSD0)
0x38000000  External memory space: bank 1 (MS1)
0x30000000  External memory space: bank 0 (MS0)
0x2C000000  Multiprocessor memory space: processor ID7
0x28000000  Multiprocessor memory space: processor ID6
0x24000000  Multiprocessor memory space: processor ID5
0x20000000  Multiprocessor memory space: processor ID4
0x1C000000  Multiprocessor memory space: processor ID3
0x18000000  Multiprocessor memory space: processor ID2
0x14000000  Multiprocessor memory space: processor ID1
0x10000000  Multiprocessor memory space: processor ID0
0x0C000000  Multiprocessor memory space: broadcast
            Reserved
0x00000000  Internal memory, up to 0x03FFFFFF
Each processor ID region and the broadcast region is a copy of internal space.

Internal space (address range, region; the gaps between the listed ranges, up to 0x03FFFFFF, are reserved):
0x001F0000 to 0x001F03FF  SOC registers (Uregs)
0x001E0000 to 0x001E03FF  Internal registers (Uregs)
0x00140000 to 0x0014FFFF  Internal memory block 10
0x00100000 to 0x0010FFFF  Internal memory block 8
0x000C0000 to 0x000CFFFF  Internal memory block 6
0x00080000 to 0x0008FFFF  Internal memory block 4
0x00040000 to 0x0004FFFF  Internal memory block 2
0x00000000 to 0x0000FFFF  Internal memory block 0
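The global-space boundaries in Figure 3 can be turned into a simple address decoder. The Python sketch below is a minimal illustration that assumes each region extends from its listed start address up to the next listed boundary; the table name REGION_STARTS and the function decode are hypothetical.

```python
import bisect

# (start address, region name) pairs taken from the Figure 3 address labels.
REGION_STARTS = (
    [
        (0x00000000, "internal memory"),
        (0x04000000, "reserved"),
        (0x0C000000, "broadcast"),
    ]
    # Processor ID0 to ID7 occupy eight equal regions starting at 0x10000000.
    + [(0x10000000 + n * 0x04000000, f"processor ID{n}") for n in range(8)]
    + [
        (0x30000000, "bank 0 (MS0)"),
        (0x38000000, "bank 1 (MS1)"),
        (0x40000000, "MSSD bank 0 (MSSD0)"),
        (0x44000000, "reserved"),
        (0x50000000, "MSSD bank 1 (MSSD1)"),
        (0x54000000, "reserved"),
        (0x60000000, "MSSD bank 2 (MSSD2)"),
        (0x64000000, "reserved"),
        (0x70000000, "MSSD bank 3 (MSSD3)"),
        (0x74000000, "reserved"),
        (0x80000000, "host (MSH)"),
    ]
)
_STARTS = [start for start, _ in REGION_STARTS]


def decode(addr):
    """Return the global-space region name that contains a 32-bit address."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address must fit in 32 bits")
    # Find the last region whose start address is <= addr.
    return REGION_STARTS[bisect.bisect_right(_STARTS, addr) - 1][1]


print(decode(0x00140000))  # internal memory
print(decode(0x2C000000))  # processor ID7
```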
The external bus can be configured for 32- or 64-bit, little-endian operations. When the system bus is configured for 64-bit
operations, the lower 32 bits of the external data bus connect to
even addresses, and the upper 32 bits connect to odd addresses.
The external port supports pipelined, slow, and SDRAM protocols. Addressing of external memory devices and memory-mapped peripherals is facilitated by on-chip decoding of high
order address lines to generate memory bank select signals.
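The even and odd address mapping described above for the 64-bit bus configuration can be expressed as a small lookup. This Python sketch only restates that mapping; the function name data_lanes and its return format are assumptions for illustration.

```python
def data_lanes(word_address, bus_width=64):
    """Return (low_bit, high_bit) of the external data bus that carries a 32-bit word."""
    if bus_width == 32:
        return (0, 31)                  # a 32-bit bus has a single lane
    if bus_width != 64:
        raise ValueError("bus width must be 32 or 64")
    # 64-bit bus: even word addresses use the lower half, odd use the upper half.
    return (0, 31) if word_address % 2 == 0 else (32, 63)


print(data_lanes(0x30000000))  # (0, 31)
print(data_lanes(0x30000001))  # (32, 63)
```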
The ADSP-TS202S processor provides programmable memory,
pipeline depth, and idle cycle for synchronous accesses, and
external acknowledge controls to support interfacing to pipelined or slow devices, host processors, and other memory-mapped peripherals with variable access, hold, and disable time
requirements.
Host Interface
The ADSP-TS202S processor provides an easy and configurable
interface between its external bus and host processors through
the external port. To accommodate a variety of host processors,
the host interface supports pipelined or slow protocols for
ADSP-TS202S processor accesses of the host as slave or pipelined for host accesses of the ADSP-TS202S processor as slave.
Each protocol has programmable transmission parameters,
such as idle cycles, pipe depth, and internal wait cycles.
The host interface supports burst transactions initiated by a host
processor. After the host issues the starting address of the burst
and asserts the BRST signal, the DSP increments the address
internally while the host continues to assert BRST.
The host interface provides a deadlock recovery mechanism that
enables a host to recover from deadlock situations involving the
DSP. The BOFF signal provides the deadlock recovery mechanism. When the host asserts BOFF, the DSP backs off the
current transaction and asserts HBG and relinquishes the
external bus.
The host can directly read or write the internal memory of the
ADSP-TS202S processor, and it can access most of the DSP registers, including DMA control (TCB) registers. Vector
interrupts support efficient execution of host commands.
Multiprocessor Interface
The ADSP-TS202S processor offers powerful features tailored
to multiprocessing DSP systems through the external port and
link ports. This multiprocessing capability provides highest
bandwidth for interprocessor communication, including:
• Up to eight DSPs on a common bus
• On-chip arbitration for glueless multiprocessing
• Link ports for point-to-point communication
The external port and link ports provide integrated, glueless
multiprocessing support.
The external port supports a unified address space (see Figure 3)
that enables direct interprocessor accesses of each ADSP-TS202S processor’s internal memory and registers. The DSP’s
on-chip distributed bus arbitration logic provides simple, glueless connection for systems containing up to eight ADSP-TS202S processors and a host processor. Bus arbitration has a
rotating priority. Bus lock supports indivisible read-modify-write sequences for semaphores. A bus fairness feature prevents
one DSP from holding the external bus too long.
The DSP’s four link ports provide a second path for interprocessor communications with throughput of 4G bytes per second.
The cluster bus provides 1G bytes per second throughput—with
a total of 4G bytes per second interprocessor bandwidth (limited by SOC bandwidth).
SDRAM Controller
The SDRAM controller controls the ADSP-TS202S processor’s
transfers of data to and from external synchronous DRAM
(SDRAM) at a throughput of 32 or 64 bits per SCLK cycle using
the external port and SDRAM control pins.
The SDRAM interface provides a glueless interface with standard SDRAMs—16M bit, 64M bit, 128M bit, and 256M bit. The
DSP supports directly a maximum of four banks of
64M words × 32 bits of SDRAM. The SDRAM interface is
mapped in external memory in each DSP’s unified
memory map.
EPROM Interface
The ADSP-TS202S processor can be configured to boot from an
external 8-bit EPROM at reset through the external port. An
automatic process (which follows reset) loads a program from
the EPROM into internal memory. This process uses 16 wait
cycles for each read access. During booting, the BMS pin functions as the EPROM chip select signal. The EPROM boot
procedure uses DMA Channel 0, which packs the bytes into
32-bit instructions. Applications can also access the EPROM
(write flash memories) during normal operation through DMA.
The EPROM or flash memory interface is not mapped in the
DSP’s unified memory map. It is a byte address space limited to
a maximum of 16M bytes (24 address bits). The EPROM or
flash memory interface can be used after boot via a DMA.
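To make the byte packing and the address-space limit concrete, the Python sketch below packs a stream of 8-bit EPROM reads into 32-bit words and computes the size implied by 24 address bits. The byte order within each word is an assumption for the example (least significant byte first), as are the names pack_bytes_to_words and EPROM_MAX_BYTES; consult the boot loader documentation for the actual packing order.

```python
EPROM_ADDRESS_BITS = 24
EPROM_MAX_BYTES = 1 << EPROM_ADDRESS_BITS  # 16M bytes of byte-addressed space


def pack_bytes_to_words(data, little_endian=True):
    """Pack groups of four bytes into 32-bit words (byte order is an assumption)."""
    if len(data) % 4 != 0:
        raise ValueError("byte count must be a multiple of four")
    order = "little" if little_endian else "big"
    return [int.from_bytes(data[i:i + 4], order) for i in range(0, len(data), 4)]


words = pack_bytes_to_words(bytes([0x78, 0x56, 0x34, 0x12]))
print([hex(w) for w in words])  # ['0x12345678']
```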
DMA CONTROLLER
The ADSP-TS202S processor’s on-chip DMA controller, with
14 DMA channels, provides zero-overhead data transfers without processor intervention. The DMA controller operates
independently and invisibly to the DSP’s core, enabling DMA
operations to occur while the DSP’s core continues to execute
program instructions.
The DMA controller performs DMA transfers between internal
memory and external memory and memory-mapped peripherals, the internal memory of other DSPs on a common bus, a host
processor, or link port I/O; between external memory and external peripherals or link port I/O; and between an external bus
master and internal memory or link port I/O. The DMA controller performs the following DMA operations:
• External port block transfers. Four dedicated bidirectional
DMA channels transfer blocks of data between the DSP’s
internal memory and any external memory or memory-mapped peripheral on the external bus. These transfers
support master mode and handshake mode protocols.
• Link port transfers. Eight dedicated DMA channels (four
transmit and four receive) transfer quad-word data only
between link ports and between a link port and internal or
external memory. These transfers only use handshake
mode protocol. DMA priority rotates between the four
receive channels.
• AutoDMA transfers. Two dedicated unidirectional DMA
channels transfer data received from an external bus master
to internal memory or to link port I/O. These transfers only
use slave mode protocol, and an external bus master must
initiate the transfer.
The DMA controller provides these additional features:
• Flyby transfers. Flyby operations only occur through the
external port (DMA channel 0) and do not involve the
DSP’s core. The DMA controller acts as a conduit to transfer data between an external I/O device and external SDRAM
memory. During a transaction, the DSP relinquishes the
external data bus; outputs addresses and memory selects
(MSSD3–0
RD
• DMA chaining. DMA chaining operations enable applications to automatically link one DMA transfer sequence to
another for continuous transmission. The sequences can
occur over different DMA channels and have different
transmission attributes.
• Two-dimensional transfers. The DMA controller can
access and transfer two-dimensional memory arrays on any
DMA transmit or receive channel. These transfers are
implemented with index, count, and modify registers for
both the X and Y dimensions.
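The index, count, and modify scheme for two-dimensional transfers can be illustrated with a generic strided traversal. The Python sketch below is an assumption-laden model, not the controller's register semantics: it treats the Y modify value as the distance between the starts of consecutive rows, and the function and parameter names are hypothetical.

```python
def dma_2d_addresses(index, x_count, x_modify, y_count, y_modify):
    """Return the word addresses visited by a generic 2D strided transfer.

    Assumption: y_modify is the stride between the first elements of
    consecutive rows. Real hardware may define it differently.
    """
    addresses = []
    for row in range(y_count):
        row_start = index + row * y_modify
        for col in range(x_count):
            addresses.append(row_start + col * x_modify)
    return addresses


# Copy a 3-wide, 2-high sub-block out of an array with 8 words per row.
print(dma_2d_addresses(index=0, x_count=3, x_modify=1, y_count=2, y_modify=8))
# [0, 1, 2, 8, 9, 10]
```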
Figure 4. ADSP-TS202S Shared Memory Multiprocessing System
LINK PORTS (LVDS)
The DSP’s four full-duplex link ports each provide additional
four-bit receive and four-bit transmit I/O capability, using Low-Voltage, Differential-Signal (LVDS) technology. With the ability to operate at a double data rate (latching data on both the
rising and falling edges of the clock) while running at 500 MHz, each
link port can support up to 500M bytes per second per direction, for a combined maximum throughput of 4G bytes
per second.
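The link port figures above follow directly from the bus width, the clock rate, and double data rate operation. The Python sketch below reproduces the arithmetic; the function name and parameters are illustrative only.

```python
def link_port_bytes_per_second(clock_hz=500e6, data_bits=4, edges_per_cycle=2):
    """Return the one-direction throughput of a single link port in bytes/s."""
    bits_per_second = clock_hz * data_bits * edges_per_cycle
    return bits_per_second / 8


per_direction = link_port_bytes_per_second()   # one port, one direction
per_port = per_direction * 2                   # full duplex: transmit + receive
combined = per_port * 4                        # four link ports

print(per_direction / 1e6, "M bytes/s per direction")
print(combined / 1e9, "G bytes/s combined")
```

The per-port value of 1G bytes per second (both directions together) matches the "Link ports (each)" entry in Table 1.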
The link ports provide an optional communications channel
that is useful in multiprocessor systems for implementing point-to-point interprocessor communications. Applications can also
use the link ports for booting.
Each link port has its own triple-buffered quad-word input and
double-buffered quad-word output registers. The DSP’s core
can write directly to a link port’s transmit register and read from
a receive register, or the DMA controller can perform DMA
transfers through eight (four transmit and four receive) dedicated link port DMA channels.
Each link port direction has three signals that control its operation. For the transmitter, LxCLKOUT is the output transmit
clock, LxACKI is the handshake input to control the data flow,
and the LxBCMPO output indicates that the block transfer is
complete. For the receiver, LxCLKIN is the input receive clock,
LxACKO is the handshake output to control the data flow, and
the LxBCMPI input indicates that the block transfer is complete. The LxDATO3–0 pins are the data output bus for the
transmitter and the LxDATI3–0 pins are the input data bus for
the receiver.
Applications can program separate error detection mechanisms
for transmit and receive operations (applications can use the
checksum mechanism to implement consecutive link port
transfers), the size of data packets, and the speed at which bytes
are transmitted.
TIMER AND GENERAL-PURPOSE I/O
The ADSP-TS202S processor has a timer pin (TMR0E) that
generates output when a programmed timer counter has
expired and four programmable general-purpose I/O pins
(FLAG3–0) that can function as either single-bit input or output. As outputs, these pins can signal peripheral devices; as
inputs, they can provide the test for conditional branching.
RESET AND BOOTING
The ADSP-TS202S processor has three levels of reset:
• Power-up reset: after power-up of the system (SCLK, all
static inputs, and strap pins are stable), the RST_IN pin
must be asserted (low).
• Normal reset: for any chip reset following the power-up
reset, the RST_IN pin must be asserted (low).
• DSP-core reset: when setting the SWRST bit in EMUCTL,
the DSP core is reset, but not the external port or I/O.
For normal operations, tie the RST_OUT pin to the
POR_IN pin.
After reset, the ADSP-TS202S processor has four boot options
for beginning operation:
• Boot from EPROM.
• Boot by an external master (host or another ADSP-TS202S
processor).
• Boot by link port.
• No boot: start running from memory address selected
with one of the IRQ3–0 interrupt signals. See Table 2.
Using the “no boot” option, the ADSP-TS202S processor must
start running from memory when one of the interrupts is
asserted.
For more information on boot options, see the EE-200: ADSP-TS20x TigerSHARC Processor Boot Loader Kernels Operation on
the Analog Devices website (www.analog.com).
CLOCK DOMAINS
The DSP uses calculated ratios of the SCLK clock to operate as
shown in Figure 5. The instruction execution rate is equal to
CCLK. A PLL generates CCLK from SCLK, and CCLK is phase-locked to SCLK. The SCLKRATx pins define the clock multiplication of
SCLK to CCLK (see Table 4 on Page 12). The link port clock is
generated from CCLK via a software programmable divisor, and
the SOC bus operates at 1/2 CCLK. Memory transfers to external and link port buffers operate at the SOCCLK rate. SCLK also
provides clock input for the external bus interface and defines
the AC specification reference for the external bus signals. The
external bus interface runs at the SCLK frequency. The maximum SCLK frequency is one quarter the internal DSP clock
(CCLK) frequency.
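The relationships among the clock domains can be summarized in a short calculation. In the Python sketch below, the multiplication ratio and link divisor are passed in as parameters because their permitted values are defined elsewhere (Table 4 and the link control register); the example values of 4 and 1 are illustrative assumptions, and the function name clock_domains is hypothetical.

```python
def clock_domains(sclk_hz, cclk_ratio, link_divisor):
    """Derive CCLK, SOCCLK, and the link output clock from SCLK.

    cclk_ratio models the SCLKRATx multiplication and link_divisor models
    the software programmable link clock divisor; both are inputs here.
    """
    if cclk_ratio < 4:
        # SCLK may be at most one quarter of CCLK, so the ratio is at least 4.
        raise ValueError("SCLK must not exceed one quarter of CCLK")
    cclk = sclk_hz * cclk_ratio
    return {
        "CCLK": cclk,                      # instruction rate
        "SOCCLK": cclk / 2,                # SOC (peripheral) bus rate
        "LxCLKOUT": cclk / link_divisor,   # link output rate
    }


clocks = clock_domains(sclk_hz=125e6, cclk_ratio=4, link_divisor=1)
for name, hz in clocks.items():
    print(f"{name}: {hz / 1e6:.0f} MHz")
```

With a 125 MHz SCLK and a ratio of 4, the sketch gives a 500 MHz CCLK and a 250 MHz SOCCLK, consistent with the 500 MHz operating point used in the rest of this data sheet.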
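Figure 5. Clock Domains (figure not reproduced). The diagram shows SCLK from the external interface feeding the PLL, with the SCLKRATx pins setting the multiplication that produces CCLK (instruction rate). CCLK divided by 2 produces SOCCLK (peripheral bus rate), and CCLK divided by CR, set by the SPD bits of the LCTLx register, produces LxCLKOUT (link output rate).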
POWER DOMAINS
The ADSP-TS202S processor has separate power supply connections for internal logic (VDD), analog circuits (VDD_A), I/O
buffer (VDD_IO), and internal DRAM (VDD_DRAM) power supply.
Note that the analog (VDD_A) supply powers the clock generator
PLLs. To produce a stable clock, systems must provide a clean
power supply to power input VDD_A. Designs must pay critical
attention to bypassing the VDD_A supply.
FILTERING REFERENCE VOLTAGE AND CLOCKS
Figure 6 and Figure 7 show possible circuits for filtering VREF
and SCLK_VREF. These circuits provide the reference voltages
for the switching voltage reference and system clock reference.
The ADSP-TS202S processor core always exits from reset in the
idle state and waits for an interrupt. Some of the interrupts in
the interrupt vector table are initialized and enabled after reset.
Figure 6. VREF Filtering Scheme (figure not reproduced). The circuit connects between VDD_IO and VSS and produces VREF using these components:
R1: 2 kΩ series resistor (±1%)
R2: 2.87 kΩ series resistor (±1%)
C1: 1 µF capacitor (SMD)
C2: 1 nF capacitor (HF SMD) placed close to DSP’s pins
Figure 7. SCLK_VREF Filtering Scheme (figure not reproduced). The circuit connects between a supply labeled "clock driver voltage or VDD_IO" (with the note "if clock driver voltage > VDD_IO") and VSS, and produces SCLK_VREF using the same R1, R2, C1, and C2 values as Figure 6.
DEVELOPMENT TOOLS
The ADSP-TS202S processor is supported with a complete set
of CROSSCORE® software and hardware development tools,
including Analog Devices emulators and VisualDSP++® development environment. The same emulator hardware that
supports other TigerSHARC processors also fully emulates the
ADSP-TS202S processor.
The VisualDSP++ project management environment lets programmers develop and debug an application. This environment
includes an easy to use assembler (which is based on an algebraic syntax), an archiver (librarian/library builder), a linker, a
loader, a cycle-accurate instruction-level simulator, a C/C++
compiler, and a C/C++ run-time library that includes DSP and
mathematical functions. A key point for these tools is C/C++
code efficiency. The compiler has been developed for efficient
translation of C/C++ code to DSP assembly. The DSP has architectural features that improve the efficiency of compiled
C/C++ code.
The VisualDSP++ debugger has a number of important features. Data visualization is enhanced by a plotting package that
offers a significant level of flexibility. This graphical representation of user data enables the programmer to quickly determine
the performance of an algorithm. As algorithms grow in complexity, this capability can have increasing significance on the
designer’s development schedule, increasing productivity. Statistical profiling enables the programmer to nonintrusively poll
the processor as it is running the program. This feature, unique
to VisualDSP++, enables the software developer to passively
gather important code execution metrics without interrupting
the real-time characteristics of the program. Essentially, the
developer can identify bottlenecks in software quickly and efficiently. By using the profiler, the programmer can focus on
those areas in the program that impact performance and take
corrective action.
CROSSCORE is a registered trademark of Analog Devices, Inc.
VisualDSP++ is a registered trademark of Analog Devices, Inc.
When debugging both C/C++ and assembly programs with the
VisualDSP++ debugger, programmers can:
• View mixed C/C++ and assembly code (interleaved source
and object information)
• Insert breakpoints
• Set conditional breakpoints on registers, memory,
and stacks
• Trace instruction execution
• Perform linear or statistical profiling of program execution
• Fill, dump, and graphically plot the contents of memory
• Perform source level debugging
• Create custom debugger windows
The VisualDSP++ IDE lets programmers define and manage
DSP software development. Its dialog boxes and property pages
let programmers configure and manage all of the TigerSHARC
processor development tools, including the color syntax highlighting in the VisualDSP++ editor. This capability permits
programmers to:
• Control how the development tools process inputs and
generate outputs
• Maintain a one-to-one correspondence with the tool’s
command line switches
The VisualDSP++ Kernel (VDK) incorporates scheduling and
resource management tailored specifically to address the memory and timing constraints of DSP programming. These
capabilities enable engineers to develop code more effectively,
eliminating the need to start from the very beginning when
developing new application code. The VDK features include
threads, critical and unscheduled regions, semaphores, events,
and device flags. The VDK also supports priority-based, preemptive, cooperative, and time-sliced scheduling approaches. In
addition, the VDK was designed to be scalable. If the application
does not use a specific feature, the support code for that feature
is excluded from the target system.
Because the VDK is a library, a developer can decide whether to
use it or not. The VDK is integrated into the VisualDSP++
development environment, but can also be used via standard
command line tools. When the VDK is used, the development
environment assists the developer with many error-prone tasks
and assists in managing system resources, automating the generation of various VDK-based objects, and visualizing the
system state when debugging an application that uses the VDK.
VCSE is Analog Devices’ technology for creating, using, and
reusing software components (independent modules of substantial functionality) to quickly and reliably assemble software
applications. It also can be used for downloading components
from the Web, dropping them into the application, and publishing component archives from within VisualDSP++. VCSE
supports component implementation in C/C++ or assembly
language.
Use the expert linker to visually manipulate the placement of
code and data on the embedded system. View memory utilization in a color-coded graphical form, easily move code and data
to different areas of the DSP or external memory with a drag of
the mouse, and examine run-time stack and heap usage. The expert
linker is fully compatible with existing linker definition files
(LDFs), allowing the developer to move between the graphical
and textual environments.
Analog Devices DSP emulators use the IEEE 1149.1 JTAG test
access port of the ADSP-TS202S processor to monitor and control the target board processor during emulation. The emulator
provides full speed emulation, allowing inspection and modification of memory, registers, and processor stacks. Nonintrusive
in-circuit emulation is assured by the use of the processor’s
JTAG interface—the emulator does not affect target system
loading or timing.
In addition to the software and hardware development tools
available from Analog Devices, third parties provide a wide
range of tools supporting the TigerSHARC processor family.
Hardware tools include TigerSHARC processor PC plug-in
cards. Third party software tools include DSP libraries, realtime operating systems, and block diagram design tools.
EVALUATION KIT
Analog Devices offers a range of EZ-KIT Lite®† evaluation platforms
to use as a cost-effective method to learn more about
developing or prototyping applications with Analog Devices
processors, platforms, and software tools. Each EZ-KIT Lite
includes an evaluation board along with an evaluation suite of
the VisualDSP++ development and debugging environment
with the C/C++ compiler, assembler, and linker. Also included
are sample application programs, power supply, and a USB
cable. All evaluation versions of the software tools are limited
for use only with the EZ-KIT Lite product.
The USB controller on the EZ-KIT Lite board connects the
board to the USB port of the user’s PC, enabling the
VisualDSP++ evaluation suite to emulate the on-board processor in-circuit. This permits the customer to download, execute,
and debug programs for the EZ-KIT Lite system. It also allows
in-circuit programming of the on-board flash device to store
user-specific boot code, enabling the board to run as a standalone unit, without being connected to the PC.
With a full version of VisualDSP++ installed (sold separately),
engineers can develop software for the EZ-KIT Lite or any custom-defined system. Connecting one of Analog Devices JTAG
emulators to the EZ-KIT Lite board enables high speed, nonintrusive emulation.
DESIGNING AN EMULATOR-COMPATIBLE
DSP BOARD (TARGET)
The Analog Devices family of emulators are tools that every
DSP developer needs to test and debug hardware and software
systems. Analog Devices has supplied an IEEE 1149.1 JTAG test
access port (TAP) on each JTAG DSP. The emulator uses the
TAP to access the internal features of the DSP, allowing the
developer to load code, set breakpoints, observe variables,
observe memory, and examine registers. The DSP must be
halted to send data and commands, but once an operation has
been completed by the emulator, the DSP system is set running
at full speed with no impact on system timing.
To use these emulators, the target board must include a header
that connects the DSP’s JTAG port to the emulator.
For details on target board design issues including mechanical
layout, single processor connections, multiprocessor scan
chains, signal buffering, signal termination, and emulator pod
logic, see the EE-68: Analog Devices JTAG Emulation Technical Reference on the Analog Devices website (www.analog.com).
Use the string “EE-68” in site search. This document is updated
regularly to keep pace with improvements to emulator support.
ADDITIONAL INFORMATION
This data sheet provides a general overview of the ADSP-TS202S processor’s architecture and functionality. For detailed
information on the ADSP-TS202S processor’s core architecture
and instruction set, see the ADSP-TS201 TigerSHARC Processor
Hardware Reference and the ADSP-TS201 TigerSHARC Processor Programming Reference. For detailed information on the development tools for this processor, see the VisualDSP++
User’s Guide for TigerSHARC Processors.
† EZ-KIT Lite is a registered trademark of Analog Devices, Inc.
PIN FUNCTION DESCRIPTIONS
While most of the ADSP-TS202S processor’s input pins are normally synchronous—tied to a specific clock—a few are
asynchronous. For these asynchronous signals, an on-chip synchronization circuit prevents metastability problems. Use the ac
specification for asynchronous signals when the system design
requires predictable, cycle-by-cycle behavior for these signals.
The output pins can be three-stated during normal operation.
The DSP three-states all outputs during reset, allowing these
pins to get to their internal pull-up or pull-down state. Some
pins have an internal pull-up or pull-down resistor (±30% tolerance) that maintains a known value during transitions between
different drivers.
Table 3. Pin Definitions—Clocks and Reset
Signal | Type | Term | Description
SCLKRAT2–0 | I (pd) | na | Core Clock Ratio. The DSP’s core clock (CCLK) rate = n × SCLK, where n is user-programmable using the SCLKRATx pins to the values shown in Table 4. These pins may change only during reset; connect these pins to VDD_IO or VSS. All reset specifications in Table 22, Table 23, and Table 24 must be satisfied. The core clock rate (CCLK) is the instruction cycle rate.
SCLK | I | na | System Clock Input. The DSP’s system input clock for cluster bus. The core clock rate is user-programmable using the SCLKRATx pins. For more information, see Clock Domains on Page 9.
RST_IN | I/A | na | Reset. Sets the DSP to a known state and causes program to be in idle state. RST_IN must be asserted a specified time according to the type of reset operation. For details, see Reset and Booting on Page 9, Table 18 on Page 24, and Figure 13 on Page 27.
RST_OUT | O | na | Reset Output. Indicates that the DSP reset is complete. Connect to POR_IN.
POR_IN | I/A | na | Power-On Reset for internal DRAM. Connect to RST_OUT.
I = input; A = asynchronous; O = output; OD = open drain output; T = three-state; P = power supply; G = ground;
pd = internal pull-down 5 kΩ; pu = internal pull-up 5 kΩ; pd_0 = internal pull-down 5 kΩ on DSP ID = 0; pu_0 = internal pull-up 5 kΩ on DSP ID = 0;
pu_od_0 = internal pull-up 500 Ω on DSP ID = 0; pd_m = internal pull-down 5 kΩ on DSP bus master; pu_m = internal pull-up 5 kΩ on DSP bus master;
pu_ad = internal pull-up 40 kΩ. For more pull-down and pull-up information, see Electrical Characteristics on Page 22.
Term (termination of unused pins) column symbols: epd = external pull-down approximately 5 kΩ to VSS; epu = external pull-up approximately 5 kΩ to VDD_IO; nc = not connected; na = not applicable (always used); VDD_IO = connect directly to VDD_IO; VSS = connect directly to VSS.
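The core clock relationship in Table 3 (CCLK = n × SCLK) can be sketched in C as follows. This is an illustrative helper only: the function names are hypothetical, and the ratio n is supplied by the caller because the legal SCLKRATx settings are defined in Table 4 and are not checked here.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock rate from CCLK = n x SCLK.
 * n is the ratio selected on the SCLKRATx pins. Legal values come from
 * Table 4 and are NOT validated here (assumption: caller passes a legal n). */
uint64_t cclk_hz(uint64_t sclk_hz, unsigned n)
{
    assert(n > 0u);               /* a zero ratio is meaningless */
    return sclk_hz * n;
}

/* Instruction cycle time in picoseconds. Table 3 states that the core
 * clock rate (CCLK) is the instruction cycle rate. */
uint64_t instruction_cycle_ps(uint64_t sclk_hz, unsigned n)
{
    return 1000000000000ULL / cclk_hz(sclk_hz, n);
}
```

For example, with an assumed 125 MHz SCLK and an assumed ratio of 4, cclk_hz(125000000, 4) gives 500 MHz and a 2000 ps instruction cycle. Both input values are chosen for illustration and should be checked against Table 4 and the clock specifications.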
Table 5. Pin Definitions—External Port Bus Controls
Signal | Type | Term | Description
ADDR31–0 | I/O/T (pu_ad) | nc | Address Bus. The DSP issues addresses for accessing memory and peripherals on these pins. In a multiprocessor system, the bus master drives addresses for accessing internal memory or I/O processor registers of other ADSP-TS202S processors. The DSP inputs addresses when a host or another DSP accesses its internal memory or I/O processor registers.
DATA63–0 | I/O/T (pu_ad) | nc | External Data Bus. The DSP drives and receives data and instructions on these pins. Pull-up or pull-down resistors on unused DATA pins are unnecessary.
RD | I/O/T (pu_0) | epu¹ | Memory Read. RD is asserted whenever the DSP reads from any slave in the system, excluding SDRAM. When the DSP is a slave, RD is an input and indicates read transactions that access its internal memory or universal registers. In a multiprocessor system, the bus master drives RD. RD changes concurrently with ADDR pins.
WRL | I/O/T (pu_0) | epu¹ | Write Low. WRL is asserted in two cases: when the ADSP-TS202S processor writes to an even address word of external memory or to another external bus agent; and when the ADSP-TS202S processor writes to a 32-bit zone (host, memory, or DSP programmed to 32-bit bus). An external master (host or DSP) asserts WRL for writing to a DSP’s low word of internal memory. In a multiprocessor system, the bus master drives WRL. WRL changes concurrently with ADDR pins. When the DSP is a slave, WRL is an input and indicates write transactions that access its internal memory or universal registers.
WRH | I/O/T (pu_0) | epu¹ | Write High. WRH is asserted when the ADSP-TS202S processor writes a long word (64 bits) or writes to an odd address word of external memory or to another external bus agent on a 64-bit data bus. An external master (host or another DSP) must assert WRH for writing to a DSP’s high word of 64-bit data bus. In a multiprocessing system, the bus master drives WRH. WRH changes concurrently with ADDR pins. When the DSP is a slave, WRH is an input and indicates write transactions that access its internal memory or universal registers.
ACK | I/O/T/OD (pu_od_0) | epu¹ | Acknowledge. External slave devices can deassert ACK to add wait states to external memory accesses. ACK is used by I/O devices, memory controllers and other peripherals on the data phase. The DSP can deassert ACK to add wait states to read and write accesses of its internal memory. The pull-up is 50 Ω on low-to-high transactions and is 500 Ω on all other transactions.
BMS | O/T (pu_0) | na | Boot Memory Select. BMS is the chip select for boot EPROM or flash memory. During reset, the DSP uses BMS as a strap pin (EBOOT) for EPROM boot mode. In a multiprocessor system, the DSP bus master drives BMS. For details, see Reset and Booting on Page 9 and see the EBOOT signal description in Table 16 on Page 19.
MS1–0 | O/T (pu_0) | nc | Memory Select. MS0 or MS1 is asserted whenever the DSP accesses memory banks 0 or 1, respectively. MS1–0 are decoded memory address pins that change concurrently with ADDR pins. When ADDR31:27 = 0b00110, MS0 is asserted. When ADDR31:27 = 0b00111, MS1 is asserted. In multiprocessor systems, the master DSP drives MS1–0.
MSH | O/T (pu_0) | nc | Memory Select Host. MSH is asserted whenever the DSP accesses the host address space (ADDR31 = 0b1). MSH is a decoded memory address pin that changes concurrently with ADDR pins. In a multiprocessor system, the bus master DSP drives MSH.
BRST | I/O/T (pu_0) | epu¹ | Burst. The current bus master (DSP or host) asserts this pin to indicate that it is reading or writing data associated with consecutive addresses. A slave device can ignore addresses after the first one and increment an internal address counter after each transfer. For host-to-DSP burst accesses, the DSP increments the address automatically while BRST is asserted.
I = input; A = asynchronous; O = output; OD = open drain output; T = three-state; P = power supply; G = ground;
pd = internal pull-down 5 kΩ; pu = internal pull-up 5 kΩ; pd_0 = internal pull-down 5 kΩ on DSP ID = 0; pu_0 = internal pull-up 5 kΩ on DSP ID = 0;
pu_od_0 = internal pull-up 500 Ω on DSP ID = 0; pd_m = internal pull-down 5 kΩ on DSP bus master; pu_m = internal pull-up 5 kΩ on DSP bus master;
pu_ad = internal pull-up 40 kΩ. For more pull-down and pull-up information, see Electrical Characteristics on Page 22.
Term (termination of unused pins) column symbols: epd = external pull-down approximately 5 kΩ to VSS; epu = external pull-up approximately 5 kΩ to VDD_IO; nc = not connected; na = not applicable (always used); VDD_IO = connect directly to VDD_IO; VSS = connect directly to VSS.
¹ This external pull-up may be omitted for the ID = 000 TigerSHARC processor.
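The address decoding described for MS1–0 and MSH in Table 5 can be modeled with a short C sketch. It covers only the three conditions stated in the table (ADDR31:27 = 0b00110, ADDR31:27 = 0b00111, and ADDR31 = 0b1); every other address returns SEL_NONE, since other selects are outside this table. The type and function names are hypothetical and not part of any Analog Devices tool or library.

```c
#include <assert.h>
#include <stdint.h>

/* Which decoded select Table 5 says is asserted for an external address. */
typedef enum { SEL_NONE, SEL_MS0, SEL_MS1, SEL_MSH } ext_select_t;

ext_select_t decode_select(uint32_t addr)
{
    if (addr >> 31)               /* ADDR31 = 0b1: host address space */
        return SEL_MSH;

    switch (addr >> 27) {         /* ADDR31:27 */
    case 0x06u: return SEL_MS0;   /* 0b00110: memory bank 0 */
    case 0x07u: return SEL_MS1;   /* 0b00111: memory bank 1 */
    default:    return SEL_NONE;  /* not covered by MS1-0 or MSH */
    }
}

/* First address of memory bank 0 or 1, derived from the same bit patterns. */
uint32_t ms_bank_base(unsigned bank)
{
    assert(bank <= 1u);           /* only MS0 and MS1 exist */
    return (uint32_t)(0x06u | bank) << 27;
}
```

Under this model, bank 0 spans 0x30000000 through 0x37FFFFFF and bank 1 spans 0x38000000 through 0x3FFFFFFF, which follows directly from the five-bit patterns in the table.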
Table 6. Pin Definitions—External Port Arbitration
Signal | Type | Term | Description
BR7–0 | I/O | VDD_IO¹ | Multiprocessing Bus Request Pins. Used by the DSPs in a multiprocessor system to arbitrate for bus mastership. Each DSP drives its own BRx line (corresponding to the value of its ID2–0 inputs) and monitors all others. In systems with fewer than eight DSPs, set the unused BRx pins high (VDD_IO).
ID2–0 | I (pd) | na | Multiprocessor ID. Indicates the DSP’s ID, from which the DSP determines its order in a multiprocessor system. These pins also indicate to the DSP which bus request (BR0–BR7) to assert when requesting the bus: 000 = BR0, 001 = BR1, 010 = BR2, 011 = BR3, 100 = BR4, 101 = BR5, 110 = BR6, or 111 = BR7. ID2–0 must have a constant value during system operation and can change during reset only.
BM | O | na | Bus Master. The current bus master DSP asserts BM. For debugging only. At reset this is a strap pin. For more information, see Table 16 on Page 19.
BOFF | I | epu | Back Off. A deadlock situation can occur when the host and a DSP try to read from each other’s bus at the same time. When deadlock occurs, the host can assert BOFF to force the DSP to relinquish the bus before completing its outstanding transaction.
BUSLOCK | O/T (pu_0) | na | Bus Lock Indication. Provides an indication that the current bus master has locked the bus. At reset, this is a strap pin. For more information, see Table 16 on Page 19.
HBR | I | epu | Host Bus Request. A host must assert HBR to request control of the DSP’s external bus. When HBR is asserted in a multiprocessing system, the bus master relinquishes the bus and asserts HBG once the outstanding transaction is finished.
HBG | I/O/T (pu_0) | epu² | Host Bus Grant. Acknowledges HBR and indicates that the host can take control of the external bus. When relinquishing the bus, the master DSP three-states the ADDR31–0, DATA63–0, MSH, MSSD3–0, MS1–0, RD, WRL, WRH, BMS, BRST, IORD, IOWR, IOEN, RAS, CAS, SDWE, SDA10, SDCKE, LDQM and HDQM pins, and the DSP puts the SDRAM in self-refresh mode. The DSP asserts HBG until the host deasserts HBR. In multiprocessor systems, the current bus master DSP drives HBG, and all slave DSPs monitor it.
CPA | I/O/OD (pu_od_0) | epu² | Core Priority Access. Asserted while the DSP’s core accesses external memory. This pin enables a slave DSP to interrupt a master DSP’s background DMA transfers and gain control of the external bus for core-initiated transactions. CPA is an open drain output, connected to all DSPs in the system. If not required in the system, leave CPA unconnected (external pull-ups will be required for DSP ID = 1 through ID = 7).
DPA | I/O/OD (pu_od_0) | epu² | DMA Priority Access. Asserted while a high priority DSP DMA channel accesses external memory. This pin enables a high priority DMA channel on a slave DSP to interrupt transfers of a normal priority DMA channel on a master DSP and gain control of the external bus for DMA-initiated transactions. DPA is an open drain output, connected to all DSPs in the system. If not required in the system, leave DPA unconnected (external pull-ups will be required for DSP ID = 1 through ID = 7).
I = input; A = asynchronous; O = output; OD = open drain output; T = three-state; P = power supply; G = ground;
pd = internal pull-down 5 kΩ; pu = internal pull-up 5 kΩ; pd_0 = internal pull-down 5 kΩ on DSP ID = 0; pu_0 = internal pull-up 5 kΩ on DSP ID = 0;
pu_od_0 = internal pull-up 500 Ω on DSP ID = 0; pd_m = internal pull-down 5 kΩ on DSP bus master; pu_m = internal pull-up 5 kΩ on DSP bus master;
pu_ad = internal pull-up 40 kΩ. For more pull-down and pull-up information, see Electrical Characteristics on Page 22.
Term (termination of unused pins) column symbols: epd = external pull-down approximately 5 kΩ to VSS; epu = external pull-up approximately 5 kΩ to VDD_IO; nc = not connected; na = not applicable (always used); VDD_IO = connect directly to VDD_IO; VSS = connect directly to VSS.
¹ The BRx pin matching the ID2–0 input selection for the processor should be left nc if unused. For example, the processor with ID = 000 has BR0 = nc and BR7–1 = VDD_IO.
² This external pull-up resistor may be omitted for the ID = 000 TigerSHARC processor.
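As a board-planning aid, the BR7–0 and ID2–0 rules in Table 6 can be expressed as a small C sketch. It classifies each BRx pin of one DSP as its own request line, a line shared with another populated DSP, or an unused line to be tied to VDD_IO. The names br_role_t and br_role and the populated_ids bitmask (bit n set when a DSP with ID = n is fitted) are hypothetical conventions introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    BR_OWN,       /* this DSP's request line: ID2-0 = n selects BRn */
    BR_SHARED,    /* request line of another populated DSP, monitored */
    BR_TIE_HIGH   /* no DSP with that ID: set the pin high (VDD_IO) */
} br_role_t;

/* Role of pin BR<br> on the DSP strapped with ID2-0 = my_id.
 * populated_ids: bit n is set when a DSP with ID = n is in the system. */
br_role_t br_role(unsigned my_id, uint8_t populated_ids, unsigned br)
{
    assert(my_id <= 7u && br <= 7u);
    assert(populated_ids & (1u << my_id));   /* this DSP must be listed */

    if (br == my_id)
        return BR_OWN;
    if (populated_ids & (1u << br))
        return BR_SHARED;
    return BR_TIE_HIGH;
}
```

For a single processor with ID = 000 (populated_ids = 0x01), the sketch reports BR0 as the DSP's own line and BR7–1 as tie-high, which matches the footnote example of BR0 = nc and BR7–1 = VDD_IO when the own line is unused.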