This document is an overview of the MCF5407 ColdFire processor, focusing on feature
enhancements over the MCF5307. It includes general descriptions of features and of the
various modules incorporated in the MCF5407. It describes the V4 programming model as it
is implemented in the MCF5407.
1.1 Features
The MCF5407 integrated microprocessor combines a Version 4 ColdFire processor core with
the following components, as shown in Figure 1:
•Harvard architecture memory system with 16-Kbyte instruction cache and 8-Kbyte
data cache
•Two, 2-Kbyte on-chip SRAMs
•Integer/fractional multiply-accumulate (MAC) unit
•Divide unit
•System debug interface
•DRAM controller for synchronous and asynchronous DRAM
•Four-channel DMA controller
•Two general-purpose timers
•Two UARTs, one that supports synchronous operations
•I²C™ module
•Parallel I/O interface
•System integration module (SIM)
Designed for embedded control applications, the MCF5407 delivers 316 Dhrystone MIPS at
220 MHz or 233 Dhrystone MIPS at 162 MHz.
Although the MCF5407 offers obvious performance upgrade advantages, its rich memory and peripheral
integration at inexpensive prices should not be overlooked. Features common to many embedded
applications, such as DMAs, various DRAM controller interfaces, and on-chip memories, are integrated in
a cost-effective manner using aggressive process technologies.
The MCF5407 extends the legacy of Motorola’s 68K family by providing a compatible path for 68K and
ColdFire customers in which development tools and customer code are quickly leveraged. In fact, customers
moving from 68K to ColdFire can use code translation and emulation tools that facilitate modifying 68K
assembly code to the ColdFire architecture. The package, pinout, and integration mix of the MCF5407
create an especially simple upgrade for current MCF5307 designs with over triple the system performance.
The revolutionary ColdFire microprocessor architecture provides new levels of price and performance to
cost-sensitive markets. Based on the concept of variable-length RISC technology, the ColdFire family
combines the architectural simplicity of conventional 32-bit RISC with a memory-saving, variable-length
instruction set. In defining the ColdFire architecture for embedded processing applications, a 68K-code
compatible core was created that combines the performance advantages of a RISC architecture with the
optimum code density of a streamlined, variable-length M68000 instruction set.
By using a variable-length instruction set architecture, embedded system designers using ColdFire RISC
processors enjoy significant advantages over conventional fixed-length RISC architectures. The denser
binary code for ColdFire processors consumes less memory than many fixed-length instruction set RISC
processors available. This improved code density means more efficient system memory use for a given
application, and allows use of slower, less costly memory to help achieve a target performance level.
The MCF5407 is the first standard product to implement the Version 4 ColdFire microprocessor core. The
V4 microarchitecture implements a number of advanced techniques, including a Harvard memory
architecture, branch cache acceleration logic, and limited superscalar support (dual-instruction issue), which
contribute to the 316 Dhrystone MIPS performance level. Increasing the internal speed of the core also
allows higher performance while providing the system designer with an easy-to-use lower speed system
interface. The processor complex frequency is an integer multiple, 3 to 6 times, of the external bus
frequency. The core clock can be stopped to support a low-power mode in the MCF5407.
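As a rough illustration of the clock relationship described above, the following sketch computes the processor complex frequency from an external bus frequency and a multiplier. The function name and the range check are illustrative assumptions, not part of any Motorola tool; only the 3-to-6 integer multiplier range comes from the text.

```python
def core_frequency_mhz(bus_mhz: float, multiplier: int) -> float:
    """Return the processor complex frequency for a given bus clock.

    Hypothetical helper: the 3..6 integer multiplier range is taken from
    the text; the function itself is only an illustration.
    """
    if multiplier not in range(3, 7):
        raise ValueError("multiplier must be an integer from 3 to 6")
    return bus_mhz * multiplier


# A 54-MHz bus clock multiplied by 3 gives a 162-MHz core clock.
print(core_frequency_mhz(54.0, 3))
```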
Serial communication channels are provided by two programmable full-duplex UARTs, one of which
provides synchronous communications for soft-modem applications, and an I²C interface module. Four
channels of DMA allow for fast data transfer using a programmable burst mode independent of processor
execution. The two 16-bit general-purpose multimode timers provide separate input and output signals. For
system protection, the processor includes a programmable 16-bit software watchdog timer. In addition,
common system functions such as chip selects, interrupt control, bus arbitration, and an IEEE 1149.1 JTAG
module are included.
A sophisticated debug interface supports background-debug mode plus real-time trace and debug with an
expanded set of on-chip breakpoint registers. This interface is present in all ColdFire standard products and
allows common emulator support across the entire family of microprocessors.
1.2 MCF5407 Features
The following list summarizes MCF5407 features:
•ColdFire processor core
— Variable-length RISC, clock-multiplied Version 4 microprocessor core
— Implements Revision B of the ColdFire instruction set architecture (ISA), which leverages the
68K programming model
MOTOROLA MCF5407 Integrated ColdFire® Microprocessor Product Brief
For More Information On This Product,
Go to: www.freescale.com
Freescale Semiconductor, Inc.
— Two independent decoupled pipelines: four-stage instruction fetch pipeline (IFP) and
five-stage operand execution pipeline (OEP)
— Ten-instruction FIFO buffer provides decoupling between the pipelines
— Limited superscalar design achieves performance levels close to dual-issue performance
— Programmable two-level branch acceleration mechanism with an 8-entry branch cache plus a
128-entry prediction table for increased performance
— 32-bit internal address bus supporting 4 Gbytes of linear address space
— 32-bit data bus
— 16 user-accessible, 32-bit-wide, general-purpose registers
— Supervisor/user modes for system protection
— Vector base register to relocate exception-vector table
— Optimized for high-level language constructs
•Multiply and accumulate unit (MAC)
— Provides high-speed, complex arithmetic processing for DSP applications
— Tightly coupled to the OEP
— Three-stage execute pipeline with one clock issue rate for 16 x 16 operations
— Supports 16 x 16 and 32 x 32 multiplies, all with 32-bit accumulate
— Supports signed or unsigned integers, plus signed fractional operands
•Hardware integer divide unit
— Supports unsigned and signed integer divides
— Tightly coupled to the OEP
— Supports 32/16 and 32/32 operations producing quotient and/or remainder results
•16-Kbyte instruction cache, 8-Kbyte data cache
— Four-way set-associative organization
— Operates at higher processor core frequency
— Provides pipelined, single-cycle access to critical code and data
— Data cache supports write-through and copyback modes
— Four-entry, 32-bit store buffer to improve performance of operand writes
•Two, 2-Kbyte SRAMs
— Programmable location anywhere within 4-Gbyte linear address space
— Operates at higher core frequency
— Provides pipelined, single-cycle access to critical code and/or data
— Each block can be mapped to either the instruction or data operand bus
•DMA controller
— Four fully-programmable channels: two support external requests and external acknowledges
— Supports dual-address and single-address transfers with 8-, 16-, and 32-bit data capability
— Source/destination address pointers that can increment or remain constant
— 24-bit transfer counter per channel
— Operand packing and unpacking supported
— Auto-alignment transfers supported for efficient block movement
— Supports bursting and cycle steal
— Provides two bus clock internal access
— Automatic DMA transfers from on-chip UARTs using internal interrupts
•DRAM controller
— Support for synchronous DRAM (SDRAM), extended-data-out (EDO) DRAM, and fast page
mode
— Supports up to 512 Mbytes of DRAM
— Programmable timer provides CAS-before-RAS refresh for asynchronous DRAMs
— Support for two separate memory blocks
•Two UARTs
— One UART offers synchronous mode with expanded buffers for soft modem support
— Modem control signals available (CTS, RTS)
— Processor-interrupt capability
•Dual 16-bit general-purpose multiple-mode timers
— 8-bit prescaler
— Timer input and output pins
— Processor-interrupt capability
— Up to 18.5-ns resolution at 54 MHz
•I²C module
— Interchip bus interface for EEPROMs, LCD controllers, A/D converters, keypads
— Fully compatible with industry-standard I²C bus
— Master or slave modes support multiple masters
— Automatic interrupt generation with programmable level
•System interface module (SIM)
— Chip selects provide direct interface to 8-, 16-, and 32-bit SRAM, ROM, FLASH, and
memory-mapped I/O devices
— Eight, fully-programmable chip selects, each with a base address register
— Programmable wait states and port sizes per chip select
— User-programmable processor clock/input clock frequency ratio
•16-bit general-purpose I/O interface
•IEEE 1149.1 test (JTAG) module
•System debug support
— Real-time trace for determining dynamic execution path while in emulator mode
— Background debug mode (BDM) for debug features while halted
— Real-time debug support, including 13 user-visible hardware breakpoint registers
— Supports servicing of critical, real-time interrupt requests while the BDM is in emulator mode
— Supports comprehensive emulator functions through trace and breakpoint logic
•On-chip PLL
— Accepts various clock input (CLKIN) frequencies between 25 and 54 MHz
— Supports core frequencies between 100 and 162 MHz
— Supports low-power mode
•Product offerings
— 316 Dhrystone MIPS at 220 MHz
— 233 Dhrystone MIPS at 162 MHz
— Implemented in 0.22 µ, quad-layer-metal process technology with 1.8-V operation (3.3-V
compliant I/O pads)
— 208-pin plastic QFP package
— 0 to 70° C operating temperature at 162 and 220 MHz
— -40 to 85° C operating temperature at 162 MHz
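Two of the figures in the list above can be cross-checked with simple arithmetic. The sketch below assumes the timer counts one tick per cycle of a 54-MHz clock (prescaler of 1), which is an assumption about how the 18.5-ns figure is obtained. The helper names are illustrative.

```python
def timer_resolution_ns(clock_mhz: float) -> float:
    """Period of one timer tick in nanoseconds, assuming one tick per clock."""
    return 1000.0 / clock_mhz


def mips_per_mhz(dhrystone_mips: float, core_mhz: float) -> float:
    """Dhrystone MIPS delivered per MHz of core clock."""
    return dhrystone_mips / core_mhz


print(round(timer_resolution_ns(54.0), 1))   # 18.5
print(round(mips_per_mhz(316, 220), 2))      # 1.44
print(round(mips_per_mhz(233, 162), 2))      # 1.44
```

Both product offerings work out to roughly the same throughput per MHz, which is consistent with performance scaling with core frequency.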
1.2.1 Process
The MCF5407 is manufactured in a 0.22-µ CMOS process with quad-layer-metal routing technology. This
process combines the high performance and low power needed for embedded system applications. Inputs
are 3.3-V tolerant; outputs are CMOS or open-drain CMOS with outputs operating from VDD + 0.5 V to
GND - 0.5 V, with guaranteed TTL-level specifications.
1.3 ColdFire Module Description
The following sections provide overviews of the various modules incorporated in the MCF5407.
1.3.1 ColdFire Core
The Version 4 ColdFire core consists of two, independent and decoupled pipelines to maximize
performance—the instruction fetch pipeline (IFP) and the operand execution pipeline (OEP).
1.3.1.1 Instruction Fetch Pipeline (IFP)
The four-stage instruction fetch pipeline (IFP) is designed to prefetch instructions for the operand execution
pipeline (OEP). Because the fetch and execution pipelines are decoupled by a ten-instruction FIFO buffer,
the fetch mechanism can prefetch instructions in advance of their use by the OEP, thereby minimizing the
time stalled waiting for instructions. To maximize the performance of conditional branch instructions, the
Version 4 IFP implements a sophisticated two-level acceleration mechanism.
The first level is an 8-entry, direct-mapped branch cache with a 2-bit prediction state (strongly/weakly,
taken/not-taken) for each entry. The branch cache implements instruction folding techniques that allow
conditional branch instructions which are predicted correctly as taken to execute in zero cycles.
For those conditional branches with no information in the branch cache, a second-level, direct-mapped
prediction table containing 128 entries is accessed. Again, each entry uses the same 2-bit prediction state
definition as the branch cache. This branch prediction state is then used to predict the direction of prefetched
conditional branch instructions.
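The 2-bit prediction state described above can be modeled as a saturating counter per table entry. The sketch below is a generic model of that scheme, not the MCF5407's actual logic. The index function, the initial state, and the exact state transitions are all assumptions, since the brief gives only the table size and the four state names.

```python
class TwoBitPredictor:
    """Direct-mapped table of 2-bit saturating counters.

    States: 0 = strongly not-taken, 1 = weakly not-taken,
            2 = weakly taken,       3 = strongly taken.
    Indexing and the initial state are illustrative assumptions.
    """

    def __init__(self, entries: int = 128, initial_state: int = 1):
        self.entries = entries
        self.table = [initial_state] * entries

    def _index(self, branch_addr: int) -> int:
        # Assumed mapping: drop bit 0 (instructions are 16-bit aligned)
        # and use the low-order address bits as the table index.
        return (branch_addr >> 1) % self.entries

    def predict(self, branch_addr: int) -> bool:
        """Return True if the branch is predicted taken."""
        return self.table[self._index(branch_addr)] >= 2

    def update(self, branch_addr: int, taken: bool) -> None:
        """Move the counter one step toward the actual outcome."""
        i = self._index(branch_addr)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


predictor = TwoBitPredictor()
loop_branch = 0x1000
for outcome in (True, True, False):
    predictor.update(loop_branch, outcome)
# After two taken outcomes, a single not-taken outcome does not flip
# the prediction: the state moves from strongly to weakly taken.
print(predictor.predict(loop_branch))
```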
Other change-of-flow instructions, including unconditional branches, jumps, and subroutine calls, use a
similar mechanism where the IFP calculates the target address. The performance of subroutine return
instructions is improved through the use of a four-entry, LIFO return stack.
In all cases, these mechanisms allow the IFP to redirect the fetch stream down the path predicted to be taken
well in advance of the actual instruction execution. The net effect is significantly improved performance.
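The four-entry LIFO return stack mentioned above can be pictured as a small bounded stack of predicted return addresses. In this sketch the overflow policy (the oldest entry is discarded) and the behavior on an empty stack are assumptions. The brief specifies only the depth and the LIFO ordering.

```python
from collections import deque


class ReturnStack:
    """Bounded LIFO of predicted subroutine return addresses."""

    def __init__(self, depth: int = 4):
        # deque(maxlen=...) silently drops the oldest entry when full;
        # that overflow policy is an assumption made for illustration.
        self._stack = deque(maxlen=depth)

    def push(self, return_addr: int) -> None:
        """Record the return address when a subroutine call is fetched."""
        self._stack.append(return_addr)

    def pop(self):
        """Return the predicted return address, or None if none is held."""
        return self._stack.pop() if self._stack else None


stack = ReturnStack()
for addr in (0x100, 0x200, 0x300, 0x400, 0x500):
    stack.push(addr)
# The fifth push displaced 0x100, so only the four newest remain.
print([hex(stack.pop()) for _ in range(4)])
```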
1.3.1.2 Operand Execution Pipeline (OEP)
The prefetched instruction stream is gated from the FIFO buffer into the five-stage OEP. The OEP consists
of two traditional two-stage RISC compute engines with a register file access feeding an arithmetic/logic
unit (ALU). The compute engine located at the top of the OEP is typically used for operand memory address
calculations (the address ALU), while the compute engine located at the bottom of the pipeline is used for
instruction execution (the execution ALU). The resulting structure provides 3.9 Gbytes/s data operand
bandwidth at 162 MHz to the two compute engines and supports single-cycle execution speeds for most
instructions, including all load, store, and most embedded-load operations. In response to users and
developers, the V4 design supports execution of the ColdFire Revision B instruction set, which adds a small
number of new instructions to improve performance and code density.
The OEP also implements two advanced performance features. It dynamically determines the appropriate
location of instruction execution (either in the address ALU or the execution ALU) based on the pipeline
state. The address compute engine, in conjunction with register renaming resources, can be used to execute
a number of heavily-used opcodes and forward the results to subsequent instructions without any pipeline
stalls. Additionally, the OEP implements instruction folding techniques involving MOVE instructions so
that two instructions can be issued in a single machine cycle. The resulting microarchitecture approaches
the performance of a full superscalar implementation, but at a much lower silicon cost.
1.3.1.3 MAC Module
The MAC unit provides signal processing capabilities for the MCF5407 in a variety of applications
including digital audio and servo control. Integrated as an execution unit in the processor's OEP, the MAC
unit implements a three-stage arithmetic pipeline optimized for 16 x 16 multiplies. Both 16- and 32-bit input
operands are supported by this design in addition to a full set of extensions for signed and unsigned integers
plus signed, fixed-point fractional input operands.
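To make the multiply-accumulate operation concrete, here is a minimal software model of a signed 16 x 16 multiply with a 32-bit accumulate, applied to a dot product. Plain two's-complement wraparound on overflow is an assumption made for simplicity. The brief does not describe overflow handling, and the helper names are illustrative.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac16(acc: int, a: int, b: int) -> int:
    """Return acc + a * b for signed 16-bit operands and a 32-bit accumulator.

    Overflow wraps around (assumption); real hardware may behave differently.
    """
    product = to_signed(a, 16) * to_signed(b, 16)
    return to_signed(acc + product, 32)


def dot_product(xs, ys) -> int:
    """Accumulate the products of paired samples, one MAC per pair."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac16(acc, x, y)
    return acc


print(dot_product([1, 2, 3], [4, 5, 6]))   # 32
```

A filter or servo control loop is essentially a sequence of such MAC steps, which is why a one-clock issue rate for 16 x 16 operations matters in those applications.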
1.3.1.4 Integer Divide Module
Some embedded applications can benefit greatly from the integer divide unit. Integrated as another engine
in the processor’s OEP, the divide module performs a variety of operations using signed and unsigned
integers. The module supports word and longword divides producing quotients and/or remainders.
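The quotient and remainder results can be illustrated with a small model of signed integer division. This sketch assumes the 68K convention of truncating the quotient toward zero, so that the remainder takes the sign of the dividend. It does not model operand widths or overflow conditions, and the function name is illustrative.

```python
def divide_trunc(dividend: int, divisor: int):
    """Return (quotient, remainder) with the quotient truncated toward zero.

    Assumed convention: remainder has the sign of the dividend, so that
    dividend == quotient * divisor + remainder always holds.
    """
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    remainder = dividend - quotient * divisor
    return quotient, remainder


print(divide_trunc(7, 2))    # (3, 1)
print(divide_trunc(-7, 2))   # (-3, -1)
```

Note that this differs from Python's own `//` and `%` operators, which round the quotient toward negative infinity.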
1.3.2 Harvard Architecture
A Harvard memory architecture is implemented to support the increased bandwidth requirements of the V4
processor pipelines. In this design featuring separate instruction and data buses to the processor-local
memories, available bandwidth to the processor reaches 1.3 Gbytes/s at 162 MHz and conflicts between
instruction fetches and operand accesses are removed.
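One way to arrive at the 1.3-Gbyte/s figure is to assume that the instruction bus and the data bus each transfer 32 bits on every core clock. That per-cycle transfer width is an assumption used for this back-of-the-envelope check, not a statement from the brief.

```python
def harvard_bandwidth_gbytes(core_mhz: float, bus_bits: int = 32, buses: int = 2) -> float:
    """Peak bandwidth in Gbytes/s for `buses` parallel buses of `bus_bits` each.

    Assumes one transfer per bus per core clock (illustrative assumption).
    """
    bytes_per_cycle = buses * bus_bits // 8
    return core_mhz * 1e6 * bytes_per_cycle / 1e9


# 162 MHz x 8 bytes per cycle = 1.296 Gbytes/s, which rounds to 1.3.
print(round(harvard_bandwidth_gbytes(162), 1))
```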
1.3.2.1 16-Kbyte Instruction Cache/8-Kbyte Data Cache
Attached to the Harvard memory architecture are a 16-Kbyte instruction cache and an 8-Kbyte data cache.
These four-way, set-associative designs improve system performance by providing pipelined, single-cycle
access on instruction fetches and operand accesses that hit in these memories.
As with all ColdFire caches, these controllers implement a non-lockup, streaming design to maximize
performance. The use of processor-local memories decouples performance from external memory speeds
and increases available bandwidth for external devices or the on-chip 4-channel DMA.
Both caches implement line-fill buffers to optimize the performance of line-sized (16-byte) burst accesses.
Additionally, the data cache supports copyback, write-through, or noncacheable modes of operation. A
4-entry, 32-bit buffer is used for cache line push operations and can be configured for deferred write
buffering while in write-through or noncacheable modes.
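The cache sizes, four-way associativity, and 16-byte line size given above determine the number of sets in each cache. The sketch below derives that geometry using the conventional tag/index/offset address split. The split itself is a textbook assumption, as the brief does not describe the actual tag layout.

```python
def cache_geometry(size_bytes: int, ways: int = 4, line_bytes: int = 16, addr_bits: int = 32):
    """Return sets and address field widths for a set-associative cache.

    Assumes power-of-two sizes and a conventional tag/index/offset split.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2(line size)
    index_bits = sets.bit_length() - 1          # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return {"sets": sets, "offset_bits": offset_bits,
            "index_bits": index_bits, "tag_bits": tag_bits}


print(cache_geometry(16 * 1024))   # instruction cache: 256 sets
print(cache_geometry(8 * 1024))    # data cache: 128 sets
```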
The new INTOUCH instruction can be used to prefetch instructions to be locked in the instruction cache
using the cache locking feature. This function may be desirable in certain systems where deterministic
real-time performance is critical.
1.3.2.2 Internal 2-Kbyte SRAMs
The two 2-Kbyte on-chip SRAM modules are also connected to the Harvard memory architecture, and
provide pipelined, single-cycle access to those memory regions mapped to these devices. Each memory can
be independently mapped to any 0-modulo-2K location within the 4-Gbyte address space, and configured
to respond either to instruction or to data accesses. Time-critical functions can be mapped onto the
instruction memory bus, while the system stack and/or heavily-referenced data operands can be mapped
onto the data memory bus.
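The 0-modulo-2K mapping rule means that a valid SRAM base address is any multiple of 2048 within the 32-bit address space. A minimal sketch of that check and of the resulting address match follows. The function names and the example addresses are hypothetical and do not correspond to actual MCF5407 register fields.

```python
SRAM_SIZE = 2 * 1024          # each on-chip SRAM block is 2 Kbytes
ADDRESS_SPACE = 1 << 32       # 4-Gbyte linear address space


def is_valid_sram_base(base: int) -> bool:
    """True if base lies in the address space on a 2-Kbyte boundary."""
    return 0 <= base < ADDRESS_SPACE and base % SRAM_SIZE == 0


def sram_hit(base: int, addr: int) -> bool:
    """True if addr falls inside the 2-Kbyte block mapped at base."""
    return base <= addr < base + SRAM_SIZE


print(is_valid_sram_base(0x20000800))        # True: multiple of 2K
print(is_valid_sram_base(0x20000400))        # False: only 1K aligned
print(sram_hit(0x20000000, 0x200007FF))      # True: last byte of the block
```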
1.3.3 DRAM Controller
The MCF5407 DRAM controller provides a direct interface for up to two blocks of DRAM. The controller
supports 8-, 16-, or 32-bit memory widths, and can easily interface to PC-100 DIMMs. A unique addressing
scheme allows for increases in system memory size without rerouting address lines and rewiring boards.
The controller operates in normal mode or in page mode and supports SDRAMs and EDO DRAMs.
1.3.4 DMA Controller
The MCF5407 provides four fully-programmable DMA channels for quick data transfer. Dual- and
single-address modes provide the ability to program bursting and cycle steal. Data transfers are 32 bits long
with packing and unpacking supported along with an auto-alignment option for efficient block transfers.
Automatic block transfers from on-chip serial UARTs are also supported through the DMA channels.
The MCF5407 contains two UARTs, which function independently. One UART has been enhanced to
provide synchronous operation and a CODEC interface for soft modem support. Each UART can be clocked
by the system bus clock, eliminating the need for an external crystal. Each UART module interfaces directly
to the CPU, as shown in Figure 2.
Figure 2. UART Module Block Diagram
(Diagram labels: serial communications channel with CTS, RTS, RxD, and TxD signals; 16-bit timer for
baud-rate generation; system bus clock or external clock (TIN); internal channel control logic; interrupt
control logic.)
Each UART module consists of the following major functional areas:
•Serial communication channel
•16-bit timer for baud-rate generation
•Internal channel control logic
•Interrupt control logic
In addition, UART1 is enhanced to provide a CODEC interface for soft modem support. UART1 can be
programmed to function like UART0 or in one of the following three modem modes:
•An 8-bit CODEC interface
•A 16-bit CODEC interface
•An audio CODEC ‘97 (AC97) digital interface controller
Each UART contains an on-chip baud-rate generator, which provides both standard and nonstandard baud
rates. Data formats can be 5, 6, 7, or 8 bits with even, odd, or no parity, and up to 2 stop bits in 1/16
increments. The UARTs include the following transmit and receive FIFO buffers:
•UART0 has a 4-byte FIFO receive buffer and a 2-byte FIFO transmit buffer.
•In UART1, the Tx and Rx FIFOs can hold the following:
— 32 1-byte samples when programmed as a UART or as an 8-bit CODEC interface
— 16 2-byte samples when programmed as a 16-bit CODEC interface
— 16 20-bit samples when programmed as a Digital AC ’97 Controller
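The data formats listed above determine how many bit times each character occupies on the line. The sketch below counts the bits in one asynchronous frame and the resulting character rate. It assumes the standard single start bit, which the brief does not state explicitly, and the function names are illustrative.

```python
def frame_bits(data_bits: int, parity: str, stop_bits: float) -> float:
    """Bit times per character: start + data + optional parity + stop.

    Assumes one start bit (standard asynchronous framing).
    """
    if data_bits not in (5, 6, 7, 8):
        raise ValueError("data_bits must be 5, 6, 7, or 8")
    if parity not in ("even", "odd", "none"):
        raise ValueError("parity must be 'even', 'odd', or 'none'")
    if not 0 < stop_bits <= 2:
        raise ValueError("stop_bits must be greater than 0 and at most 2")
    parity_bits = 0 if parity == "none" else 1
    return 1 + data_bits + parity_bits + stop_bits


def chars_per_second(baud: float, data_bits: int = 8,
                     parity: str = "none", stop_bits: float = 1.0) -> float:
    """Maximum character throughput at the given baud rate."""
    return baud / frame_bits(data_bits, parity, stop_bits)


print(frame_bits(8, "none", 1))      # 10 bit times per character
print(chars_per_second(9600))        # 960.0 characters per second
```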
The UART modules also provide several error-detection and maskable-interrupt capabilities. Modem
support includes request-to-send (RTS) and clear-to-send (CTS) lines.