Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections,
modifications, enhancements, improvements, and other changes to its products and services at any
time and to discontinue any product or service without notice. Customers should obtain the latest
relevant information before placing orders and should verify that such information is current and
complete. All products are sold subject to TI’s terms and conditions of sale supplied at the time of order
acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of sale
in accordance with TI’s standard warranty. Testing and other quality control techniques are used to the
extent TI deems necessary to support this warranty. Except where mandated by government
requirements, testing of all parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are
responsible for their products and applications using TI components. To minimize the risks associated
with customer products and applications, customers should provide adequate design and operating
safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any TI
patent right, copyright, mask work right, or other TI intellectual property right relating to any
combination, machine, or process in which TI products or services are used. Information published by
TI regarding third-party products or services does not constitute a license from TI to use such products
or services or a warranty or endorsement thereof. Use of such information may require a license from
a third party under the patents or other intellectual property of the third party, or a license from TI under
the patents or other intellectual property of TI.
Reproduction of information in TI data books or data sheets is permissible only if reproduction is without
alteration and is accompanied by all associated warranties, conditions, limitations, and notices.
Reproduction of this information with alteration is an unfair and deceptive business practice. TI is not
responsible or liable for such altered documentation.
Resale of TI products or services with statements different from or beyond the parameters stated by
TI for that product or service voids all express and any implied warranties for the associated TI product
or service and is an unfair and deceptive business practice. TI is not responsible or liable for any such
statements.
Following are URLs where you can obtain information on other Texas Instruments products and
application solutions:
Products                                 Applications
Amplifiers        amplifier.ti.com       Audio        www.ti.com/audio
Data Converters   dataconverter.ti.com   Automotive   www.ti.com/automotive
The TMS320C6000™ digital signal processor (DSP) platform is part of the
TMS320™ DSP family. The TMS320C62x™ DSP generation and the
TMS320C64x™ DSP generation comprise fixed-point devices in the
C6000™ DSP platform, and the TMS320C67x™ DSP generation comprises
floating-point devices in the C6000 DSP platform.
The TMS320C67x+™ DSP is an enhancement of the C67x™ DSP with added
functionality and an expanded instruction set. This document describes the
CPU architecture, pipeline, instruction set, and interrupts of the C67x and
C67x+™ DSPs.
Notational Conventions
Preface
Read This First
This document uses the following conventions.
Any reference to the C67x DSP or C67x CPU also applies, unless otherwise
noted, to the C67x+ DSP and C67x+ CPU, respectively.
Hexadecimal numbers are shown with the suffix h. For example, the
following number is 40 hexadecimal (decimal 64): 40h.
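As a quick illustration of the h-suffix convention, the following Python sketch converts between this notation and ordinary integers. The function names parse_h_suffix and format_h_suffix are hypothetical helpers introduced only for this example; they are not part of any TI tool.

```python
def parse_h_suffix(text):
    """Convert a hexadecimal literal with a trailing 'h' (e.g. '40h') to an int."""
    stripped = text.strip()
    if not stripped or stripped[-1] not in "hH":
        raise ValueError(f"expected a hexadecimal literal ending in 'h': {text!r}")
    # Everything before the suffix is interpreted as base-16 digits.
    return int(stripped[:-1], 16)


def format_h_suffix(value):
    """Render a non-negative int in the h-suffix notation used in this document."""
    return f"{value:X}h"


print(parse_h_suffix("40h"))  # 64
print(format_h_suffix(64))    # 40h
```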
Related Documentation From Texas Instruments
The following documents describe the C6000™ devices and related support
tools. Copies of these documents are available on the Internet at www.ti.com.
Tip: Enter the literature number in the search box provided at www.ti.com.
The current documentation that describes the C6000 devices, related peripherals, and other technical collateral, is available in the C6000 DSP product
folder at: www.ti.com/c6000.
number SPRU723) describes the peripherals available on the
TMS320C672x DSPs.
TMS320C6000 Technical Brief (literature number SPRU197) gives an
introduction to the TMS320C62x and TMS320C67x DSPs, development
tools, and third-party support.
TMS320C6000 Programmer’s Guide (literature number SPRU198)
describes ways to optimize C and assembly code for the TMS320C6000
DSPs and includes application program examples.
TMS320C6000 Code Composer Studio Tutorial (literature number
SPRU301) introduces the Code Composer Studio integrated development environment and software tools.
Code Composer Studio Application Programming Interface Reference
Guide (literature number SPRU321) describes the Code Composer
Studio application programming interface (API), which allows you to program custom plug-ins for Code Composer.
TMS320C6x Peripheral Support Library Programmer’s Reference
(literature number SPRU273) describes the contents of the
TMS320C6000 peripheral support library of functions and macros. It lists
functions and macros both by header file and alphabetically, provides a
complete description of each, and gives code examples to show how
they are used.
TMS320C6000 Chip Support Library API Reference Guide (literature
number SPRU401) describes a set of application programming interfaces
(APIs) used to configure and control the on-chip peripherals.
Trademarks
Code Composer Studio, C6000, C64x, C67x, C67x+, TMS320C2000,
TMS320C5000, TMS320C6000, TMS320C62x, TMS320C64x,
TMS320C67x, TMS320C67x+, TMS320C672x, and VelociTI are trademarks
of Texas Instruments.
Trademarks are the property of their respective owners.
Describes the assembly language instructions of the TMS320C67x DSP. Also described are
parallel operations, conditional operations, resource constraints, and addressing modes.
The TMS320C6000™ digital signal processor (DSP) platform is part of the
TMS320™ DSP family. The TMS320C62x™ DSP generation and the
TMS320C64x™ DSP generation comprise fixed-point devices in the C6000™
DSP platform, and the TMS320C67x™ DSP generation comprises floating-point devices in the C6000 DSP platform. All three DSP generations use the
VelociTI™ architecture, a high-performance, advanced very long instruction
word (VLIW) architecture, making these DSPs excellent choices for multichannel and multifunction applications.
The TMS320C67x+ DSP is an enhancement of the C67x DSP with added
functionality and an expanded instruction set.
Any reference to the C67x DSP or C67x CPU also applies, unless otherwise
noted, to the C67x+ DSP and C67x+ CPU, respectively.
1.1 TMS320 DSP Family Overview
The TMS320™ DSP family consists of fixed-point, floating-point, and multiprocessor digital signal processors (DSPs). TMS320™ DSPs have an architecture
designed specifically for real-time signal processing.
Table 1−1 lists some typical applications for the TMS320™ family of DSPs. The
TMS320™ DSPs offer adaptable approaches to traditional signal-processing
problems. They also support complex applications that often require multiple
operations to be performed simultaneously.
1.2 TMS320C6000 DSP Family Overview
With a performance of up to 6000 million instructions per second (MIPS) and
an efficient C compiler, the TMS320C6000 DSPs give system architects
unlimited possibilities to differentiate their products. High performance, ease
of use, and affordable pricing make the C6000 generation the ideal solution
for multichannel, multifunction applications, such as:
Pooled modems
Wireless local loop base stations
Remote access servers (RAS)
Digital subscriber loop (DSL) systems
Cable modems
Multichannel telephony systems
The C6000 generation is also an ideal solution for exciting new applications;
for example:
Personalized home security with face and hand/fingerprint recognition
Advanced cruise control with global positioning systems (GPS) navigation
and accident avoidance
Remote medical diagnostics
Beam-forming base stations
Virtual reality 3-D graphics
Speech recognition
Audio
Radar
Atmospheric modeling
Finite element analysis
Imaging (examples: fingerprint recognition, ultrasound, and MRI)
Table 1−1. Typical Applications for the TMS320 DSPs

Automotive
  Adaptive ride control
  Antiskid brakes
  Cellular telephones
  Digital radios
  Engine control
  Global positioning
  Navigation
  Vibration analysis
  Voice commands

Consumer
  Digital radios/TVs
  Educational toys
  Music synthesizers
  Pagers
  Power tools
  Radar detectors
  Solid-state answering machines

Control
  Disk drive control
  Engine control
  Laser printer control
  Motor control
  Robotics control
  Servo control

General-Purpose
  Adaptive filtering
  Convolution
  Correlation
  Digital filtering
  Fast Fourier transforms
  Hilbert transforms
  Waveform generation
  Windowing

Graphics/Imaging
  (no entries recovered)

Industrial
  Numeric control
  Power-line monitoring
  Robotics
  Security access

Instrumentation
  Digital filtering
  Function generation
  Pattern matching
  Phase-locked loops
  Seismic processing
  Spectrum analysis
  Transient analysis

Medical
  (no entries recovered)

Military
  Image processing
  Missile guidance
  Navigation
  Radar processing
  Radio frequency modems
  Secure communications
  Sonar processing

Telecommunications
  1200- to 56600-bps modems
  Adaptive equalizers
  ADPCM transcoders
  Base stations
  Cellular telephones
  Channel multiplexing
  Data encryption
  Digital PBXs
  Digital speech interpolation (DSI)
  DTMF encoding/decoding
  Echo cancellation
  Faxing
  Future terminals
  Line repeaters
  Personal communications systems (PCS)
  Personal digital assistants (PDA)
  Speaker phones
  Spread spectrum communications
  Digital subscriber loop (xDSL)
  Video conferencing
  X.25 packet switching

Voice/Speech
  (no entries recovered)
The C6000 devices execute up to eight 32-bit instructions per cycle. The C67x
CPU consists of 32 general-purpose 32-bit registers and eight functional units.
These eight functional units contain:
Two multipliers
Six ALUs
The C6000 generation has a complete set of optimized development tools,
including an efficient C compiler, an assembly optimizer for simplified
assembly-language programming and scheduling, and a Windows™ based
debugger interface for visibility into source code execution characteristics. A
hardware emulation board, compatible with the TI XDS510™ and XDS560™
emulator interface, is also available. This tool complies with IEEE Standard
1149.1−1990, IEEE Standard Test Access Port and Boundary-Scan
Architecture.
Features of the C6000 devices include:
Advanced VLIW CPU with eight functional units, including two multipliers
and six arithmetic units
Executes up to eight instructions per cycle for up to ten times the
performance of typical DSPs
Allows designers to develop highly effective RISC-like code for fast
development time
Instruction packing
Gives code size equivalence for eight instructions executed serially or
in parallel
Reduces code size, program fetches, and power consumption
Conditional execution of all instructions
Reduces costly branching
Increases parallelism for higher sustained performance
Efficient code execution on independent functional units
Industry’s most efficient C compiler on DSP benchmark suite
Industry’s first assembly optimizer for fast development and improved
parallelization
8/16/32-bit data support, providing efficient memory support for a variety
of applications
TMS320C67x DSP Features and Options
40-bit arithmetic options add extra precision for vocoders and other
computationally intensive applications
Saturation and normalization provide support for key arithmetic
operations
Field manipulation and instruction extract, set, clear, and bit counting
support common operations found in control and data manipulation
applications.
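To make the saturation and field-manipulation features in the list above more concrete, here is a minimal Python sketch of what such operations compute on 32-bit values. It is a generic illustration only: the function names and the (lsb, width) parameter convention are assumptions made for this example and do not reproduce the operand encoding or exact semantics of any C6000 instruction.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1
MASK32 = 0xFFFFFFFF


def saturating_add32(a, b):
    """Add two signed 32-bit values, clamping to the 32-bit range instead of wrapping."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def extract_field(word, lsb, width):
    """Return the unsigned field of `width` bits whose lowest bit is bit `lsb`."""
    return (word >> lsb) & ((1 << width) - 1)


def set_field(word, lsb, width):
    """Set every bit of the field to 1 and return the 32-bit result."""
    mask = ((1 << width) - 1) << lsb
    return (word | mask) & MASK32


def clear_field(word, lsb, width):
    """Clear every bit of the field to 0 and return the 32-bit result."""
    mask = ((1 << width) - 1) << lsb
    return word & ~mask & MASK32


def count_set_bits(word):
    """Count the 1 bits in the low 32 bits of `word`."""
    return bin(word & MASK32).count("1")
```

With wrapping arithmetic, INT32_MAX + 1 would roll over to a large negative number; the saturating version stays at INT32_MAX, which is the behavior signal-processing code usually wants when a sum overflows.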
The C67x devices include these additional features:
Hardware support for single-precision (32-bit) and double-precision
(64-bit) IEEE floating-point operations.
32 × 32-bit integer multiply with 32-bit or 64-bit result.
In addition to the features of the C67x device, the C67x+ device is enhanced
for code size improvement and floating-point performance. These additional
features include:
Execute packets can span fetch packets.
Register file size is increased to 64 registers (32 in each datapath).
Floating-point addition and subtraction capability in the .S unit.
Mixed-precision multiply instructions.
32-KByte instruction cache that supports execution from both on-chip
RAM and ROM as well as from external memory through a VBUSP-based
external memory interface (EMIF).
Unified memory controller features support for flat on-chip data RAM and
ROM organizations for zero wait-state accesses from both load store units
of the CPU. The memory controller supports different banking organizations for RAM and ROM arrays. The memory controller also supports
VBUSP interfaces (two master and one slave) for transfer of data from the
system peripherals to and from the CPU and internal memory. A VBUSP-based DMA controller can interface to the CPU for programmable bulk
transfers through the VBUSP slave port.
The VelociTI architecture of the C6000 platform of devices makes them the first
off-the-shelf DSPs to use advanced VLIW to achieve high performance
through increased instruction-level parallelism. A traditional VLIW architecture
consists of multiple execution units running in parallel, performing multiple
instructions during a single clock cycle. Parallelism is the key to extremely high
performance, taking these DSPs well beyond the performance capabilities of
traditional superscalar designs. VelociTI is a highly deterministic architecture,
having few restrictions on how or when instructions are fetched, executed, or
stored. It is this architectural flexibility that is key to the breakthrough efficiency
levels of the TMS320C6000 Optimizing C compiler. VelociTI’s advanced
features include:
Instruction packing: reduced code size
All instructions can operate conditionally: flexibility of code
Variable-width instructions: flexibility of data types
Fully pipelined branches: zero-overhead branching.
1.4 TMS320C67x DSP Architecture
Figure 1−1 is the block diagram for the C67x DSP. The C6000 devices come
with program memory, which, on some devices, can be used as a program
cache. The devices also have varying sizes of data memory. Peripherals such
as a direct memory access (DMA) controller, power-down logic, and external
memory interface (EMIF) usually come with the CPU, while peripherals such
as serial ports and host ports are on only certain devices. Check the data sheet
for your device to determine the specific peripheral configurations you have.
Figure 1−1. TMS320C67x DSP Block Diagram
[Block diagram. Blocks shown: program cache/program memory (32-bit address, 256-bit data); C6000 CPU containing program fetch, instruction dispatch (see note), instruction decode, data path A (register file A with units .L1, .S1, .M1, .D1), data path B (register file B with units .L2, .S2, .M2, .D2), control registers, control logic, test, emulation, and interrupts; data cache/data memory (32-bit address; 8-, 16-, 32-bit data); DMA, EMIF; power down; additional peripherals (timers, serial ports, etc.).]
1.4.1 Central Processing Unit (CPU)
The C67x CPU, in Figure 1−1, is common to all the C62x/C64x/C67x devices.
The CPU contains:
Program fetch unit
Instruction dispatch unit
Instruction decode unit
Two data paths, each with four functional units
32 32-bit registers
Control registers
Control logic
Test, emulation, and interrupt logic
The program fetch, instruction dispatch, and instruction decode units can
deliver up to eight 32-bit instructions to the functional units every CPU clock
cycle. The processing of instructions occurs in each of the two data paths (A
and B), each of which contains four functional units (.L, .S, .M, and .D) and 16
32-bit general-purpose registers. The data paths are described in more detail
in Chapter 2. A control register file provides the means to configure and control
various processor operations. To understand how instructions are fetched,
dispatched, decoded, and executed in the data path, see Chapter 4.
1.4.2 Internal Memory
The C67x DSP has a 32-bit, byte-addressable address space. Internal
(on-chip) memory is organized in separate data and program spaces. When
off-chip memory is used, these spaces are unified on most devices to a single
memory space via the external memory interface (EMIF).
The C67x DSP has two 32-bit internal ports to access internal data memory.
The C67x DSP has a single internal port to access internal program memory,
with an instruction-fetch width of 256 bits.
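The 256-bit instruction-fetch width follows directly from the figures given earlier: up to eight 32-bit instructions are delivered per cycle. The short Python sketch below works through that arithmetic, along with the size of a 32-bit byte-addressable address space. The constant names are chosen for this example only.

```python
INSTRUCTION_BITS = 32        # each instruction is 32 bits wide
INSTRUCTIONS_PER_FETCH = 8   # up to eight instructions delivered per cycle
ADDRESS_BITS = 32            # 32-bit, byte-addressable address space

# Eight 32-bit instructions fill one 256-bit program fetch.
fetch_width_bits = INSTRUCTION_BITS * INSTRUCTIONS_PER_FETCH
fetch_width_bytes = fetch_width_bits // 8

# Every byte has its own address, so 32 address bits reach 2**32 bytes.
address_space_bytes = 1 << ADDRESS_BITS

print(fetch_width_bits)     # 256
print(fetch_width_bytes)    # 32
print(address_space_bytes)  # 4294967296
```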
1.4.3 Memory and Peripheral Options
A variety of memory and peripheral options are available for the C6000
platform:
and other asynchronous memories for a broad range of external memory
requirements and maximum system performance.
DMA Controller (C6701 DSP only) transfers data between address ranges
in the memory map without intervention by the CPU. The DMA controller
has four programmable channels and a fifth auxiliary channel.
EDMA Controller performs the same functions as the DMA controller. The
EDMA has 16 programmable channels, as well as a RAM space to hold
multiple configurations for future transfers.
HPI is a parallel port through which a host processor can directly access
the CPU’s memory space. The host device has ease of access because
it is the master of the interface. The host and the CPU can exchange information via internal or external memory. In addition, the host has direct
access to memory-mapped peripherals.
Expansion bus is a replacement for the HPI, as well as an expansion of
the EMIF. The expansion provides two distinct areas of functionality (host
port and I/O port) which can co-exist in a system. The host port of the
expansion bus can operate in either asynchronous slave mode, similar to
the HPI, or in synchronous master/slave mode. This allows the device to
interface to a variety of host bus protocols. Synchronous FIFOs and
asynchronous peripheral I/O devices may interface to the expansion bus.
McBSP (multichannel buffered serial port) is based on the standard serial
port interface found on the TMS320C2000™ and TMS320C5000™
devices. In addition, the port can buffer serial samples in memory automatically with the aid of the DMA/EDMA controller. It also has multichannel
capability compatible with the T1, E1, SCSA, and MVIP networking
standards.
Timers in the C6000 devices are two 32-bit general-purpose timers used
for these functions:
Time events
Count events
Generate pulses
Interrupt the CPU
Send synchronization events to the DMA/EDMA controller.
Power-down logic allows reduced clocking to reduce power consumption.
Most of the operating power of CMOS logic dissipates during circuit
switching from one logic state to another. By preventing some or all of the
chip’s logic from switching, you can realize significant power savings without losing any data or operational context.
For an overview of the peripherals available on the C6000 DSP, refer to the
TMS320C6000 DSP Peripherals Overview Reference Guide (SPRU190).
Chapter 2
CPU Data Paths and Control
This chapter focuses on the CPU, providing information about the data paths and
control registers. The two register files and the data cross paths are described.
The components of the data path for the TMS320C67x CPU are shown in
Figure 2−1. These components consist of:
Two general-purpose register files (A and B)
Eight functional units (.L1, .L2, .S1, .S2, .M1, .M2, .D1, and .D2)
Two load-from-memory data paths (LD1 and LD2)
Two store-to-memory data paths (ST1 and ST2)
Two data address paths (DA1 and DA2)
Two register file data cross paths (1X and 2X)
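One way to keep the unit naming straight is to note that the trailing digit of each functional unit name identifies its data path: units ending in 1 belong to data path A and units ending in 2 to data path B. The Python sketch below groups the eight units this way. The function name units_by_side is a hypothetical helper for this example.

```python
FUNCTIONAL_UNITS = (".L1", ".L2", ".S1", ".S2", ".M1", ".M2", ".D1", ".D2")


def units_by_side(units=FUNCTIONAL_UNITS):
    """Group functional units by data path using the trailing digit (1 -> A, 2 -> B)."""
    sides = {"A": [], "B": []}
    for unit in units:
        side = "A" if unit.endswith("1") else "B"
        sides[side].append(unit)
    return sides


print(units_by_side())
# {'A': ['.L1', '.S1', '.M1', '.D1'], 'B': ['.L2', '.S2', '.M2', '.D2']}
```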
2.2 General-Purpose Register Files
There are two general-purpose register files (A and B) in the C6000 data paths.
For the C67x DSP, each of these files contains 16 32-bit registers (A0–A15 for
file A and B0–B15 for file B), as shown in Table 2−1. For the C67x+ DSP, the
register file size is doubled to 32 32-bit registers (A0–A31 for file A and B0–B31
for file B), as shown in Table 2−1. The general-purpose registers can be used
for data, data address pointers, or condition registers.
The C67x DSP general-purpose register files support data ranging in size from
packed 16-bit data through 40-bit fixed-point and 64-bit floating point data.
Values larger than 32 bits, such as 40-bit long and 64-bit float quantities, are
stored in register pairs. In these pairs, the 32 LSBs of data are placed in an even-numbered register and the remaining 8 or 32 MSBs in the next upper register
(that is always an odd-numbered register). Packed data types store either four
8-bit values or two 16-bit values in a single 32-bit register, or four 16-bit values
in a 64-bit register pair.
There are 16 valid register pairs for 40-bit and 64-bit data in the C67x DSP
cores. In assembly language syntax, a colon between the register names
denotes the register pairs, and the odd-numbered register is specified first.
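The pairing rule can be sketched in a few lines of Python: each pair combines an even-numbered register with the next higher odd-numbered register of the same file, written odd first. The helper names register_pairs and is_valid_pair are hypothetical and exist only for this illustration; they are not assembler features.

```python
def register_pairs(file_name, registers_per_file):
    """List the register pairs of one file, odd-numbered register first (e.g. 'A1:A0')."""
    return [
        f"{file_name}{even + 1}:{file_name}{even}"
        for even in range(0, registers_per_file, 2)
    ]


def is_valid_pair(pair, registers_per_file=16):
    """Check that `pair` has the form odd:even, same file, with odd == even + 1."""
    try:
        odd_name, even_name = pair.split(":")
        odd, even = int(odd_name[1:]), int(even_name[1:])
    except ValueError:
        return False
    same_file = odd_name[:1] == even_name[:1] and odd_name[:1] in ("A", "B")
    return same_file and even % 2 == 0 and odd == even + 1 and odd < registers_per_file


# C67x: 16 registers per file give 8 pairs per file, 16 pairs in total.
print(register_pairs("A", 16))
# ['A1:A0', 'A3:A2', 'A5:A4', 'A7:A6', 'A9:A8', 'A11:A10', 'A13:A12', 'A15:A14']
```

Passing 32 as registers_per_file models the larger C67x+ register files, where pairs up to A31:A30 and B31:B30 become valid.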
The additional registers are addressed by using the previously unused fifth
(msb) bit of the source and register specifiers. All 64-bit register writes and
reads are performed over 2 cycles as per the current C67x devices.
Figure 2−2 shows the register storage scheme for 40-bit long data. Operations
requiring a long input ignore the 24 MSBs of the odd-numbered register.
Operations producing a long result zero-fill the 24 MSBs of the odd-numbered
register. The even-numbered register is encoded in the opcode.
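The storage scheme for 40-bit data can be modeled as simple bit masking. The Python sketch below splits a 40-bit value into the contents of an odd/even register pair and reassembles it, following the rules stated above: writes zero-fill the 24 MSBs of the odd register, and reads ignore them. The function names write_long and read_long are hypothetical, and values are treated as raw bit patterns without sign interpretation.

```python
MASK32 = 0xFFFFFFFF
MASK40 = (1 << 40) - 1


def write_long(value):
    """Split a 40-bit value into (odd, even) register contents.

    The even register receives the 32 LSBs. The odd register receives the
    remaining 8 MSBs in its low byte, with its upper 24 bits zero-filled.
    """
    value &= MASK40
    even = value & MASK32
    odd = (value >> 32) & 0xFF
    return odd, even


def read_long(odd, even):
    """Reassemble a 40-bit value, ignoring the 24 MSBs of the odd register."""
    return ((odd & 0xFF) << 32) | (even & MASK32)


odd, even = write_long(0x123456789A)
print(hex(odd), hex(even))                     # 0x12 0x3456789a
print(hex(read_long(0xFFFFFF12, 0x3456789A)))  # 0x123456789a
```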
Figure 2−1. TMS320C67x CPU Data Paths
[Diagram. Data path A: register file A (A0−A15) with functional units .L1, .S1, .M1, and .D1, load paths LD1 32 MSB and LD1 32 LSB, store path ST1, and data address path DA1. Data path B: register file B (B0−B15) with functional units .L2, .S2, .M2, and .D2, load paths LD2 32 MSB and LD2 32 LSB, store path ST2, and data address path DA2. Unit ports are labeled src1, src2, and dst, with additional long src and long dst ports; bus widths are labeled 8 and 32. Cross paths 1X and 2X link the two data paths, and the control register file is also shown.]
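Table 2−1. 40-Bit/64-Bit Register Pairs

Register file A   Register file B   Devices
A1:A0             B1:B0             C67x DSP
A3:A2             B3:B2             C67x DSP
A5:A4             B5:B4             C67x DSP
A7:A6             B7:B6             C67x DSP
A9:A8             B9:B8             C67x DSP
A11:A10           B11:B10           C67x DSP
A13:A12           B13:B12           C67x DSP
A15:A14           B15:B14           C67x DSP
A17:A16           B17:B16           C67x+ DSP only
A19:A18           B19:B18           C67x+ DSP only
A21:A20           B21:B20           C67x+ DSP only
A23:A22           B23:B22           C67x+ DSP only
A25:A24           B25:B24           C67x+ DSP only
A27:A26           B27:B26           C67x+ DSP only
A29:A28           B29:B28           C67x+ DSP only
A31:A30           B31:B30           C67x+ DSP only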
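Figure 2−2. Storage Scheme for 40-Bit Data in a Register Pair
[Diagram. Read from registers: bits 31–8 of the odd register are ignored; bits 7–0 of the odd register supply bits 39–32 of the 40-bit data, and bits 31–0 of the even register supply bits 31–0. Write to registers: bits 31–8 of the odd register are zero-filled; bits 39–32 of the 40-bit data are written to bits 7–0 of the odd register, and bits 31–0 are written to the even register.]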