– Enhanced Direct-Memory-Access Controller•Supports 32-Bit Integer, SP (IEEE Single
(EDMA3)Precision/32-Bit) and DP (IEEE Double
– Serial ATA (SATA) Controller
– DDR2/Mobile DDR Memory Controller
– Two Multimedia Card (MMC)/Secure Digital
(SD) Card Interface
– LCD Controller
– Video Port Interface (VPIF)
– 10/100 Mb/s Ethernet MAC (EMAC):
– Programmable Real-Time Unit Subsystem
– Three Configurable UART Modules
– USB 1.1 OHCI (Host) With Integrated PHY
– USB 2.0 OTG Port With Integrated PHY
– One Multichannel Audio Serial Port
– Two Multichannel Buffered Serial Ports
• 375/456-MHz C674x VLIW DSP
• C674x Instruction Set Features
– Superset of the C67x+™ and C64x+™ ISAs
– Up to 3648/2746 C674x MIPS/MFLOPS
– Byte-Addressable (8-/16-/32-/64-Bit Data)
– 8-Bit Overflow Protection
– Bit-Field Extract, Set, Clear
– Normalization, Saturation, Bit-Counting
– Compact 16-Bit Instructions
• C674x Two Level Cache Memory Architecture
– 32K-Byte L1P Program RAM/Cache
– 32K-Byte L1D Data RAM/Cache
– 256K -Byte L2 Unified Mapped RAM/Cache
– Flexible RAM/Cache Partition (L1 and L2)
Please be aware that an important notice concerning availability, standard warranty, and use in critical applications of Texas
Instruments semiconductor products and disclaimers thereto appears at the end of this data sheet.
2TMS320C6000, C6000 are trademarks of Texas Instruments.
ADVANCE INFORMATION concerns new products in the sampling
or preproduction phaseof development. Characteristic dataand other
specifications are subjectto change without notice.
Support
– 64 General-Purpose Registers (32 Bit)
Precision/64-Bit) Floating Point
•Supports up to Four SP Additions Per
Clock, Four DP Additions Every 2 Clocks
•Supports up to Two Floating Point (SP or
DP) Reciprocal Approximation (RCPxP)
and Square-Root Reciprocal
Approximation (RSQRxP) Operations Per
Cycle
– Two Multiply Functional Units
•Mixed-Precision IEEE Floating Point
Multiply Supported up to:
– 2 SP x SP -> SP Per Clock
– 2 SP x SP -> DP Every Two Clocks
– 2 SP x DP -> DP Every Three Clocks
– 2 DP x DP -> DP Every Four Clocks
•Fixed Point Multiply Supports Two 32 x
32-Bit Multiplies, Four 16 x 16-Bit
Multiplies, or Eight 8 x 8-Bit Multiplies per
Clock Cycle, and Complex Multiples
– Instruction Packing Reduces Code Size
– All Instructions Conditional
– Hardware Support for Modulo Loop
Operation
– Protected Mode Operation
– Exceptions Support for Error Detection and
Program Redirection
• Software Support
– TI DSP/BIOS™
– Chip Support Library and DSP Library
• 128K-Byte RAM Memory
• 1.8V or 3.3V LVCMOS IOs (except for USB and
DDR2 interfaces)
•16-Bit DDR2 SDRAM With 512 MB– 128-channel TDM
Address Space or
•16-Bit mDDR SDRAM With 256 MB
Address Space
• Three Configurable 16550 type UART Modules:
– With Modem Control Signals
– 16-byte FIFO
– 16x or 13x Oversampling Option
• LCD Controller
• Two Serial Peripheral Interfaces (SPI) Each
With Multiple Chip-Selects
• Two Multimedia Card (MMC)/Secure Digital (SD)
Card Interface with Secure Data I/O (SDIO)
Interfaces
• Two Master/Slave Inter-Integrated Circuit (I2C
Bus™)
• One Host-Port Interface (HPI) With 16-Bit-Wide
Muxed Address/Data Bus For High Bandwidth
• Programmable Real-Time Unit Subsystem
(PRUSS)
– Two Independent Programmable Realtime
Unit (PRU) Cores
•32-Bit Load/Store RISC architecture
•4K Byte instruction RAM per core
•512 Bytes data RAM per core
•PRU Subsystem (PRUSS) can be disabled
via software to save power
•Register 30 of each PRU is exported from
the subsystem in addition to the normal
R31 output of the PRU cores.
– Standard power management mechanism
•Clock gating
•Entire subsystem under a single PSC
clock gating domain
– Dedicated interrupt controller
– FIFO buffers for Transmit and Receive
• 10/100 Mb/s Ethernet MAC (EMAC):
– IEEE 802.3 Compliant
– MII Media Independent Interface
– RMII Reduced Media Independent Interface
– Management Data I/O (MDIO) Module
• Video Port Interface (VPIF):
– Two 8-bit SD (BT.656), Single 16-bit or Single
Raw (8-/10-/12-bit) Video Capture Channels
– Two 8-bit SD (BT.656), Single 16-bit Video
Display Channels
• Universal Parallel Port (uPP):
– High-Speed Parallel Interface to FPGAs and
Data Converters
– Data Width on Each of Two Channels is 8- to
16-bit Inclusive
– Single Data Rate or Dual Data Rate Transfers
– Supports Multiple Interfaces with START,
ENABLE and WAIT Controls
• Serial ATA (SATA) Controller:
– Supports SATA I (1.5 Gbps) and SATA II (3.0
Gbps)
– Supports all SATA Power Management
Features
– Hardware-Assisted Native Command
Queueing (NCQ) for up to 32 Entries
– Supports Port Multiplier and
Command-Based Switching
• Real-Time Clock With 32 KHz Oscillator and
Separate Power Rail
• Three 64-Bit General-Purpose Timers (Each
configurable as Two 32-Bit Timers)
• One 64-bit General-Purpose/Watchdog Timer
(Configurable as Two 32-bit General-Purpose
– Dedicated switched central resourceTimers)
• USB 1.1 OHCI (Host) With Integrated PHY• Two Enhanced Pulse Width Modulators
(USB1)(eHRPWM):
• USB 2.0 OTG Port With Integrated PHY (USB0)– Dedicated 16-Bit Time-Base Counter With
– USB 2.0 High-/Full-Speed Client
– USB 2.0 High-/Full-/Low-Speed Host
– End Point 0 (Control)
– End Points 1,2,3,4 (Control, Bulk, Interrupt or
ISOC) Rx and Tx
• One Multichannel Audio Serial Port:
– Two Clock Zones and 16 Serial Data Pins
– Supports TDM, I2S, and Similar Formats
– DIT-Capable
– FIFO buffers for Transmit and Receive
• Two Multichannel Buffered Serial Ports:
– Supports TDM, I2S, and Similar Formats
– AC97 Audio Codec Interface
Period And Frequency Control
– 6 Single Edge, 6 Dual Edge Symmetric or 3
Dual Edge Asymmetric Outputs
– Dead-Band Generation
– PWM Chopping by High-Frequency Carrier
– Trip Zone Input
The device is a Low-power applications processor based on a C674x DSP core. It provides significantly
lower power than other members of the TMS320C6000™ platform of DSPs.
The device enables OEMs and ODMs to quickly bring to market devices featuring robust operating
systems support, rich user interfaces, and high processing performance life through the maximum
flexibility of a fully integrated mixed processor solution.
The device DSP core uses a two-level cache-based architecture. The Level 1 program cache (L1P) is a
32KB direct mapped cache and the Level 1 data cache (L1D) is a 32KB 2-way set-associative cache. The
Level 2 program cache (L2P) consists of a 256KB memory space that is shared between program and
data space. L2 memory can be configured as mapped memory, cache, or combinations of the two.
Although the DSP L2 is accessible by other hosts in the system, an additional 128KB RAM shared
memory is available for use by other hosts without affecting DSP performance.
The peripheral set includes: a 10/100 Mb/s Ethernet MAC (EMAC) with a Management Data Input/Output
(MDIO) module; one USB2.0 OTG interface; one USB1.1 OHCI interface; two inter-integrated circuit (I2C)
Bus interfaces; one multichannel audio serial port (McASP) with 16 serializers and FIFO buffers; two
multichannel buffered serial ports (McBSP) with FIFO buffers; two SPI interfaces with multiple chip
selects; four 64-bit general-purpose timers each configurable (one configurable as watchdog); a
configurable 16-bit host port interface (HPI) ; up to 9 banks of 16 pins of general-purpose input/output
(GPIO) with programmable interrupt/event generation modes, multiplexed with other peripherals; three
UART interfaces (each with RTS and CTS); two enhanced high-resolution pulse width modulator
(eHRPWM) peripherals; 3 32-bit enhanced capture (eCAP) module peripherals which can be configured
as 3 capture inputs or 3 auxiliary pulse width modulator (APWM) outputs; and 2 external memory
interfaces: an asynchronous and SDRAM external memory interface (EMIFA) for slower memories or
peripherals, and a higher speed DDR2/Mobile DDR controller.
SPRS590B–JUNE 2009–REVISED AUGUST 2010
The Ethernet Media Access Controller (EMAC) provides an efficient interface between the device and a
network. The EMAC supports both 10Base-T and 100Base-TX, or 10 Mbits/second (Mbps) and 100 Mbps
in either half- or full-duplex mode. Additionally an Management Data Input/Output (MDIO) interface is
available for PHY configuration. The EMAC supports both MII and RMII interfaces.
The SATA controller provides a high-speed interface to mass data storage devices. The SATA controller
supports both SATA I (1.5 Gbps) and SATA II (3.0 Gbps).
The Universal Parallel Port (uPP) provides a high-speed interface to many types of data converters,
FPGAs or other parallel devices. The UPP supports programmable data widths between 8- to 16-bits on
each of two channels. Single-data rate and double-data rate transfers are supported as well as START,
ENABLE and WAIT signals to provide control for a variety of data converters.
A Video Port Interface (VPIF) is included providing a flexible video input/output port.
The rich peripheral set provides the ability to control external peripheral devices and communicate with
external processors. For details on each of the peripherals, see the related sections later in this document
and the associated peripheral reference guides.
The device has a complete set of development tools for the DSP. These include C compilers, a DSP
assembly optimizer to simplify programming and scheduling, and a Windows™ debugger interface for
visibility into source code execution.
NOTE: This is a placeholder for the Revision History Table for future revisions of the document.
This data manual revision history highlights the changes made to the SPRS590A device-specific data
manual to make it an SPRS590B revision.
Table 2-1. Revision History
ADDITIONS/MODIFICATIONS/DELETIONS
Global - Added MPU Content
Global - Replaced all "CLKIN" references with "OSCIN"
Global - Updated td(SCSL_SPC)S min from P to 2P
Global - Made changes in the document to reflect the following detail.
"The DSP L2 ROM is used for boot purposes and cannot be programmed with application code".
Global - Updated the pin map graphic to fix typos.
Global -
•All instances of EMU[0] updated to EMU0
•All instances of EMU[1] updated toEMU1
•All instances of UART1_RTS updated to have an overbar
•All instances of UART2_RTS updated to have an overbar
•All instances of SPI1_SCS[0] updated to have an overbar
•All instances of EMA_CS[4] updated to have an overbar
•All instances of SPI1_ENA updated to have an overbar
•All instances of SATA_TXN updated to have an overbar
•All instances of LCD_AC_ENB_CS updated to have an overbar
•All instances of DDR_CS updated to have an overbar
•All instances of UHPI_HRDY updated to have an overbar
•All instances of UHPI_HDS1 updated to have an overbar
•All instances of UHPI_HCS updated to have an overbar
Added Table 3-3 C674x L1/L2 Memory Protection Registers
Added Section 3.9 Unused Pin Configurations
Added Section 6.6.3- Dynamic Voltage and Frequency Scaling (DVFS)
AddedSection 4.3 Pullup/Pulldown Resistors
Added Section 6.14.3 - SATA Unused Signal Configuration
Added sections -Section 6.14.2 - SATA Interface, Section 6.14.2.1 - SATA Interface Schematic, Section 6.14.2.2 - Compatible SATA
The following documents are available on the Internet at www.ti.com. Tip: Enter the literature number in
the search box provided at www.ti.com.
DSP Reference Guides
SPRUG82TMS320C674x DSP Cache User's Guide. Explains the fundamentals of memory caches
and describes how the two-level cache-based internal memory architecture in the
TMS320C674x digital signal processor (DSP) can be efficiently used in DSP applications.
Shows how to maintain coherence with external memory, how to use DMA to reduce
memory latencies, and how to optimize your code to improve cache efficiency. The internal
memory architecture in the C674x DSP is organized in a two-level hierarchy consisting of a
dedicated program cache (L1P) and a dedicated data cache (L1D) on the first level.
Accesses by the CPU to the these first level caches can complete without CPU pipeline
stalls. If the data requested by the CPU is not contained in cache, it is fetched from the next
lower memory level, L2 or external memory.
SPRUFE8TMS320C674x DSP CPU and Instruction Set Reference Guide. Describes the CPU
architecture, pipeline, instruction set, and interrupts for the TMS320C674x digital signal
processors (DSPs). The C674x DSP is an enhancement of the C64x+ and C67x+ DSPs with
added functionality and an expanded instruction set.
SPRS590B–JUNE 2009–REVISED AUGUST 2010
SPRUFK5TMS320C674x DSP Megamodule Reference Guide. Describes the TMS320C674x digital
signal processor (DSP) megamodule. Included is a discussion on the internal direct memory
access (IDMA) controller, the interrupt controller, the power-down controller, memory
protection, bandwidth management, and the memory and cache.
EMIFA
Flash Card InterfaceMMC and SD cards supported.
EDMA3
Timers
UART3 (each with RTS and CTS flow control)
SPI2 (Each with one hardware chip select)
Peripherals
Not all peripherals pins
are available at the
same time (for more
detail, see the Device
Configurations section).
On-Chip Memory
I2C2 (both Master/Slave)
Multichannel Audio Serial Port [McASP]1 (each with transmit/receive, FIFO buffer, 16 serializers)
Multichannel Buffered Serial Port [McBSP]2 (each with transmit/receive, FIFO buffer, 16)
10/100 Ethernet MAC with Management Data I/O1 (MII or RMII Interface)
UHPI1 (16-bit multiplexed address/data)
USB 2.0 (USB0)High-Speed OTG Controller with on-chip OTG PHY
USB 1.1 (USB1)Full-Speed OHCI (as host) with on-chip PHY
General-Purpose Input/Output Port9 banks of 16-bit
LCD Controller1
SATA Controller1 (Support both SATA I and SATAII)
Universal Parallel Port (uPP)1
Video Port Interface (VPIF)1 (video in and video out)
PRU Subsystem (PRUSS)2 Programmable PRU Cores
Size (Bytes)448KB RAM
Organization
www.ti.com
Table 3-1. Characteristics of C6748
DDR2, 16-bit bus width, up to 150 MHz
Mobile DDR, 16-bit bus width, up to 133 MHz
Asynchronous (8/16-bit bus width) RAM, Flash,
16-bit SDRAM, NOR, NAND
64 independent channels, 16 QDMA channels,
2 channel controllers, 3 transfer controllers
4 64-Bit General Purpose (each configurable as 2 separate
32-bit timers, one configurable as Watch Dog)
4 Single Edge, 4 Dual Edge Symmetric, or
2 Dual Edge Asymmetric Outputs
DSP
32KB L1 Program (L1P)/Cache (up to 32KB)
32KB L1 Data (L1D)/Cache (up to 32KB)
256KB Unified Mapped RAM/Cache (L2)
DSP Memories can be made accessible to EDMA3 and
other peripherals.
ADDITIONAL MEMORY
128KB RAM
C674x CPU ID + CPU
Rev ID
C674x Megamodule
Revision
JTAG BSDL_IDDEVIDR0 Register0x0B7D_102F
CPU FrequencyMHz674x DSP 375 MHz (1.2V) or 456 MHz (1.3V)
(1) ADVANCE INFORMATION concerns new products in the sampling or preproduction phase of development. Characteristic data and
other specifications are subject to change without notice. PRODUCTION DATA information is current as of publication date. Products
conform to specifications per the terms of the Texas Instruments standard warranty. Production processing does not necessarily include
testing of all parameters.
(1)
Product Preview (PP),
Advance Information (AI),
or Production Data (PD)
375 MHz versions - PD
456 MHz versions - AI
3.3Device Compatibility
The C674x DSP core is code-compatible with the C6000™ DSP platform and supports features of both
the C64x+ and C67x+ DSP families.
The C674x Central Processing Unit (CPU) consists of eight functional units, two register files, and two
data paths as shown in Figure 3-2. The two general-purpose register files (A and B) each contain
32 32-bit registers for a total of 64 registers. The general-purpose registers can be used for data or can be
data address pointers. The data types supported include packed 8-bit data, packed 16-bit data, 32-bit
data, 40-bit data, and 64-bit data. Values larger than 32 bits, such as 40-bit-long or 64-bit-long values are
stored in register pairs, with the 32 LSBs of data placed in an even register and the remaining 8 or
32 MSBs in the next upper register (which is always an odd-numbered register).
The eight functional units (.M1, .L1, .D1, .S1, .M2, .L2, .D2, and .S2) are each capable of executing one
instruction every clock cycle. The .M functional units perform all multiply operations. The .S and .L units
perform a general set of arithmetic, logical, and branch functions. The .D units primarily load data from
memory to the register file and store results from the register file into memory.
The C674x CPU combines the performance of the C64x+ core with the floating-point capabilities of the
C67x+ core.
Each C674x .M unit can perform one of the following each clock cycle: one 32 x 32 bit multiply, one 16 x
32 bit multiply, two 16 x 16 bit multiplies, two 16 x 32 bit multiplies, two 16 x 16 bit multiplies with
add/subtract capabilities, four 8 x 8 bit multiplies, four 8 x 8 bit multiplies with add operations, and four
16 x 16 multiplies with add/subtract capabilities (including a complex multiply). There is also support for
Galois field multiplication for 8-bit and 32-bit data. Many communications algorithms such as FFTs and
modems require complex multiplication. The complex multiply (CMPY) instruction takes for 16-bit inputs
and produces a 32-bit real and a 32-bit imaginary output. There are also complex multiplies with rounding
capability that produces one 32-bit packed output that contain 16-bit real and 16-bit imaginary values. The
32 x 32 bit multiply instructions provide the extended precision necessary for high-precision algorithms on
a variety of signed and unsigned 32-bit data types.
SPRS590B–JUNE 2009–REVISED AUGUST 2010
The .L or (Arithmetic Logic Unit) now incorporates the ability to do parallel add/subtract operations on a
pair of common inputs. Versions of this instruction exist to work on 32-bit data or on pairs of 16-bit data
performing dual 16-bit add and subtracts in parallel. There are also saturated forms of these instructions.
The C674x core enhances the .S unit in several ways. On the previous cores, dual 16-bit MIN2 and MAX2
comparisons were only available on the .L units. On the C674x core they are also available on the .S unit
which increases the performance of algorithms that do searching and sorting. Finally, to increase data
packing and unpacking throughput, the .S unit allows sustained high performance for the quad 8-bit/16-bit
and dual 16-bit instructions. Unpack instructions prepare 8-bit data for parallel 16-bit operations. Pack
instructions return parallel results to output precision including saturation support.
Other new features include:
•SPLOOP - A small instruction buffer in the CPU that aids in creation of software pipelining loops where
multiple iterations of a loop are executed in parallel. The SPLOOP buffer reduces the code size
associated with software pipelining. Furthermore, loops in the SPLOOP buffer are fully interruptible.
•Compact Instructions - The native instruction size for the C6000 devices is 32 bits. Many common
instructions such as MPY, AND, OR, ADD, and SUB can be expressed as 16 bits if the C674x
compiler can restrict the code to use certain registers in the register file. This compression is
performed by the code generation tools.
•Instruction Set Enhancement - As noted above, there are new instructions such as 32-bit
multiplications, complex multiplications, packing, sorting, bit manipulation, and 32-bit Galois field
multiplication.
•Exceptions Handling - Intended to aid the programmer in isolating bugs. The C674x CPU is able to
detect and respond to exceptions, both from internally detected sources (such as illegal op-codes) and
from system events (such as a watchdog time expiration).
•Privilege - Defines user and supervisor modes of operation, allowing the operating system to give a
basic level of protection to sensitive resources. Local memory is divided into multiple pages, each with
read, write, and execute permissions.
•Time-Stamp Counter - Primarily targeted for Real-Time Operating System (RTOS) robustness, a
free-running time-stamp counter is implemented in the CPU which is not sensitive to system stalls.
For more details on the C674x CPU and its enhancements over the C64x architecture, see the following
documents:
•TMS320C64x/C64x+ DSP CPU and Instruction Set Reference Guide (literature number SPRUFE8)
•TMS320C64x Technical Overview (literature number SPRU395)
A. On .M unit, dst2 is 32 MSB.
B. On .M unit, dst1 is 32 LSB.
C. On C64x CPU .M unit, src2 is 32 bits; on C64x+ CPU .M unit, src2 is 64 bits.
D. On .L and .S units, odd dst connects to odd register files and even dst connects to even register files.
The DSP memory map is shown in .
By default the DSP also has access to most on and off chip memory areas.
Additionally, the DSP megamodule includes the capability to limit access to its internal memories through
its SDMA port; without needing an external MPU unit.
3.4.2.1External Memories
The DSP has access to the following External memories:
•Asynchronous EMIF / SDRAM / NAND / NOR Flash (EMIFA)
•SDRAM (DDR2)
3.4.2.2DSP Internal Memories
The DSP has access to the following DSP memories:
•L2 RAM
•L1P RAM
•L1D RAM
3.4.2.3C674x CPU
The C674x core uses a two-level cache-based architecture. The Level 1 Program cache (L1P) is 32 KB
direct mapped cache and the Level 1 Data cache (L1D) is 32 KB 2-way set associated cache. The Level 2
memory/cache (L2) consists of a 256 KB memory space that is shared between program and data space.
L2 memory can be configured as mapped memory, cache, or a combination of both.
www.ti.com
Table 3-2 shows a memory map of the C674x CPU cache registers for the device.
Extensive use of pin multiplexing is used to accommodate the largest number of peripheral functions in
the smallest possible package. Pin multiplexing is controlled using a combination of hardware
configuration at device reset and software programmable register settings.
3.6.1Pin Map (Bottom View)
The following graphics show the bottom view of the ZCE and ZWT packages pin assignments in four
quadrants (A, B, C, and D). The pin assignments for both packages are identical.
Device level pin multiplexing is controlled by registers PINMUX0 - PINMUX19 in the SYSCFG module.
For the device family, pin multiplexing can be controlled on a pin-by-pin basis. Each pin that is multiplexed
with several different functions has a corresponding 4-bit field in one of the PINMUX registers.
Pin multiplexing selects which of several peripheral pin functions controls the pin's IO buffer output data
and output enable values only. The default pin multiplexing control for almost every pin is to select 'none'
of the peripheral functions in which case the pin's IO buffer is held tri-stated.
Note that the input from each pin is always routed to all of the peripherals that share the pin; the PINMUX
registers have no effect on input from a pin.