Freescale Semiconductor, Inc.
Technical Information Center, EL516
2100 East Elliot Road
Tempe, Arizona 85284
1-800-521-6274 or +1-480-768-2130
www.freescale.com/support
Europe, Middle East, and Africa:
Freescale Halbleiter Deutschland GmbH
Technical Information Center
Schatzbogen 7
81829 Muenchen, Germany
+44 1296 380 456 (English)
+46 8 52200080 (English)
+49 89 92103 559 (German)
+33 1 69 35 48 48 (French)
www.freescale.com/support
Japan:
Freescale Semiconductor Japan Ltd. Headquarters
ARCO Tower 15F
1-8-1, Shimo-Meguro, Meguro-ku,
Tokyo 153-0064
Japan
0120 191014 or +81 3 5437 9125
support.japan@freescale.com
Asia/Pacific:
Freescale Semiconductor China Ltd.
Exchange Building 23F
No. 118 Jianguo Road
Chaoyang District
Beijing 100022
China
+86 10 5879 8000
support.asia@freescale.com
Information in this document is provided solely to enable system and
software implementers to use Freescale Semiconductor products. There are
no express or implied copyright licenses granted hereunder to design or
fabricate any integrated circuits or integrated circuits based on the
information in this document.
Freescale Semiconductor reserves the right to make changes without further
notice to any products herein. Freescale Semiconductor makes no warranty,
representation or guarantee regarding the suitability of its products for any
particular purpose, nor does Freescale Semiconductor assume any liability
arising out of the application or use of any product or circuit, and specifically
disclaims any and all liability, including without limitation consequential or
incidental damages. “Typical” parameters that may be provided in Freescale
Semiconductor data sheets and/or specifications can and do vary in different
applications and actual performance may vary over time. All operating
parameters, including “Typicals”, must be validated for each customer
application by customer’s technical experts. Freescale Semiconductor does
not convey any license under its patent rights nor the rights of others.
Freescale Semiconductor products are not designed, intended, or authorized
for use as components in systems intended for surgical implant into the body,
or other applications intended to support or sustain life, or for any other
application in which the failure of the Freescale Semiconductor product could
create a situation where personal injury or death may occur. Should Buyer
purchase or use Freescale Semiconductor products for any such unintended
or unauthorized application, Buyer shall indemnify and hold Freescale
Semiconductor and its officers, employees, subsidiaries, affiliates, and
distributors harmless against all claims, costs, damages, and expenses, and
reasonable attorney fees arising out of, directly or indirectly, any claim of
personal injury or death associated with such unintended or unauthorized
use, even if such claim alleges that Freescale Semiconductor was negligent
regarding the design or manufacture of the part.
Freescale Semiconductor Literature Distribution Center
P.O. Box 5405
Denver, Colorado 80217
1-800-441-2447 or +1-303-675-2140
Fax: +1-303-675-2150
LDCForFreescaleSemiconductor@hibbertgroup.com
Freescale™ and the Freescale logo are trademarks of Freescale
Semiconductor, Inc. All other product or service names are the property of their
respective owners.
MCF5282 and MCF5216 ColdFire Microcontroller User’s Manual, Rev. 3
About This Book
The primary objective of this user’s manual is to define the functionality of the MCF5282 processor for
use by software and hardware developers.
The information in this book, except for changes to the flash and Ethernet functionality, also applies to the
MCF5280, MCF5281, MCF5216, and MCF5214.
The information in this book is subject to change without notice, as described in the disclaimers on the title
page. As with any technical documentation, it is the reader’s responsibility to be sure they are using the most
recent version of the documentation.
To locate any published errata or updates for this document, refer to the world-wide web at
http://www.freescale.com/coldfire.
Audience
This manual is intended for system software and hardware developers and applications programmers who
want to develop products with the MCF5282. It is assumed that the reader understands operating systems,
microprocessor system design, basic principles of software and hardware, and basic details of the
ColdFire® architecture.
Suggested Reading
This section lists additional reading that provides background for the information in this manual as well as
general information about the ColdFire architecture.
General Information
The following documentation provides useful information about the ColdFire architecture and computer
architecture in general:
•Using Microprocessors and Microcomputers: The Motorola Family, by William C. Wray, Ross
Bannatyne, and Joseph D. Greenfield.
•Computer Architecture: A Quantitative Approach, Second Edition, by John L. Hennessy and David
A. Patterson.
•Computer Organization and Design: The Hardware/Software Interface, Second Edition, by David A.
Patterson and John L. Hennessy.
ColdFire Documentation
ColdFire documentation is available from the sources listed on the back cover of this manual, as well as
our web site, http://www.freescale.com/coldfire.
•User’s manuals — These books provide details about individual ColdFire implementations and are
intended to be used in conjunction with the ColdFire Programmers Reference Manual.
•Data sheets — Data sheets provide specific data regarding pin-out diagrams, bus timing, signal
behavior, and AC, DC, and thermal characteristics, as well as other design considerations.
•Product briefs — Each device has a product brief that provides an overview of its features. This
document is roughly equivalent to the overview (Chapter 1) of a device’s reference manual.
•Application notes — These short documents address specific design issues useful to programmers
and engineers working with Freescale Semiconductor processors.
Additional literature is published as new processors become available. For a current list of ColdFire
documentation, refer to http://www.freescale.com/coldfire.
Conventions
This document uses the following notational conventions:
MNEMONICS     In text, instruction mnemonics are shown in uppercase.
mnemonics     In code and tables, instruction mnemonics are shown in lowercase.
italics       Italics indicate variable command parameters.
              Book titles in text are set in italics.
0x0           Prefix to denote hexadecimal number
0b0           Prefix to denote binary number
REG[FIELD]    Abbreviations for registers are shown in uppercase. Specific bits, fields, or ranges
              appear in brackets. For example, RAMBAR[BA] identifies the base address field
              in the RAM base address register.
nibble        A 4-bit data unit
byte          An 8-bit data unit
word          A 16-bit data unit (see footnote 1)
longword      A 32-bit data unit
x             In some contexts, such as signal encodings, x indicates a don’t care.
n             Used to express an undefined numerical value
~             NOT logical operator
&             AND logical operator
|             OR logical operator
Footnote 1: The only exceptions to this appear in the discussion of serial communication modules that support
variable-length data transmission units. To simplify the discussion, these units are referred to as words
regardless of length.
Chapter 1
Overview
This chapter provides an overview of the microprocessor features, including the major functional
components.
1.1 Key Features
A block diagram of the MCF528x and MCF521x is shown in Figure 1-1. The main features are as follows:
•Static Version 2 ColdFire variable-length RISC processor
— Static operation
— On-chip 32-bit address and data path
— Processor core and bus frequency up to 80 MHz
— Sixteen general-purpose 32-bit data and address registers
— ColdFire ISA_A with extensions to support the user stack pointer register, and four new
instructions for improved bit processing
— Enhanced Multiply-Accumulate (EMAC) unit with four 48-bit accumulators to support 32-bit
signal processing algorithms
— Illegal instruction decode that allows for 68K emulation support
•System debug support
— Real-time trace for determining dynamic execution path
— Background debug mode (BDM) for in-circuit debugging
— Real time debug support, with one user-visible hardware breakpoint register (PC and address
with optional data) that can be configured into a 1- or 2-level trigger
•On-chip memories
— 2-Kbyte cache, configurable as instruction-only, data-only, or split I-/D-cache
— 64-Kbyte dual-ported SRAM on CPU internal bus, accessible by core and non-core bus masters
(e.g., DMA, FEC) with standby power supply support
— 512 Kbytes of interleaved Flash memory supporting 2-1-1-1 accesses
(256 Kbytes on the MCF5281 and MCF5214, no Flash on MCF5280)
– This product incorporates SuperFlash® technology licensed from SST.
•Power management
— Fully-static operation with processor sleep and whole chip stop modes
— Very rapid response to interrupts from the low-power sleep mode (wake-up feature)
— Clock enable/disable for each peripheral when not used
•Fast Ethernet Controller (FEC) (not available on the MCF5214 and MCF5216)
— 10BaseT capability, half- or full-duplex
— 100BaseT capability, half- or limited-throughput full-duplex
— On-chip transmit and receive FIFOs
— Built-in dedicated DMA controller
•FlexCAN 2.0B Module
— Includes all existing features of the Freescale TouCAN module
— Full implementation of the CAN protocol specification version 2.0B
– Standard data and remote frames (up to 109 bits long)
– Extended data and remote frames (up to 127 bits long)
– 0–8 bytes data length
– Programmable bit rate up to 1 Mbit/sec
— Up to 16 message buffers (MBs)
– Configurable as receive (Rx) or transmit (Tx)
– Support standard and extended messages
— Unused message buffer (MB) space can be used as general-purpose RAM space
— Listen-only mode capability
— Content-related addressing
— No read/write semaphores
— Three programmable mask registers
– Global (for MBs 0-13)
– Special for MB14
– Special for MB15
— Programmable transmit-first scheme: lowest ID or lowest buffer number
— “Time stamp” based on 16-bit free-running timer
— Global network time, synchronized by a specific message
— Programmable I/O modes
— Maskable interrupts
•Three universal asynchronous receiver/transmitters (UARTs)
— Interrupt control logic
— Maskable interrupts
— DMA support
— Data formats can be 5, 6, 7, or 8 bits with even, odd, or no parity
— Up to 2 stop bits in 1/16 increments
— Error-detection capabilities
— Modem support includes request-to-send (URTS) and clear-to-send (UCTS) lines for two UARTs
— Transmit and receive FIFO buffers
•I2C module
— Interchip bus interface for EEPROMs, LCD controllers, A/D converters, and keypads
— Fully compatible with industry-standard I2C bus
— Master or slave modes support multiple masters
— Automatic interrupt generation with programmable level
•Queued serial peripheral interface (QSPI)
— Full-duplex, three-wire synchronous transfers
— Up to four chip selects available
— Master mode operation only
— Programmable master bit rates
— Up to 16 pre-programmed transfers
•Queued analog-to-digital converter (QADC)
— 8 direct, or up to 18 multiplexed, analog input channels
— 10-bit resolution +/- 2 counts accuracy
— Minimum 7 μs conversion time
— Internal sample and hold
— Programmable input sample time for various source impedances
— Two conversion command queues with a total of 64 entries
— Sub-queues possible using pause mechanism
— Queue complete and pause software interrupts available on both queues
— Queue pointers indicate current location for each queue
— Automated queue modes initiated by:
– External edge trigger and gated trigger
– Periodic/interval timer, within QADC module [Queue 1 and 2]
– Software command
— Single-scan or continuous-scan of queues
— Output data readable in three formats:
– Right-justified unsigned
– Left-justified signed
– Left-justified unsigned
— Unused analog channels can be used as digital I/O
— Low pin-count configuration implemented
•Four 32-bit DMA timers
— 15-ns resolution at 80 MHz (66 MHz for MCF5214 and MCF5216)
— Programmable sources for clock input, including an external clock option
— Programmable prescaler
— Input-capture capability with programmable trigger edge on input pin
— Output-compare with programmable mode for the output pin
— Free run and restart modes
— Maskable interrupts on input capture or reference-compare
— DMA trigger capability on input capture or reference-compare
•Two 4-channel general purpose timers
— Four 16-bit input capture/output compare channels per timer
— 16-bit architecture
— Programmable prescaler
— Pulse widths variable from microseconds to seconds
— Single 16-bit pulse accumulator
— Toggle-on-overflow feature for pulse-width modulator (PWM) generation
— One dual-mode pulse accumulation channel per timer
•Software watchdog timer
— 16-bit counter
— Low-power mode support
•Phase locked loop (PLL)
— Crystal or external oscillator reference
— 2- to 10-MHz reference frequency for normal PLL mode
— 33- to 80-MHz (66 MHz for MCF5214/16) oscillator reference frequency for 1:1 mode
— Low-power modes supported
— Separate clock output pin
•Two interrupt controllers
— Support for up to 63 interrupt sources per interrupt controller (a total of 126), organized as
7 levels with 9 interrupt sources per level
— Seven external interrupt signals
— Unique vector number for each interrupt source
— Ability to mask any individual interrupt source or all interrupt sources (global mask-all)
— Support for hardware and software interrupt acknowledge (IACK) cycles
— Combinatorial path to provide wake-up from low-power modes
•DMA controller
— Four fully programmable channels
— Dual-address transfer support with 8-, 16- and 32-bit data capability along with support for
16-byte (4 x 32-bit) burst transfers
— Source/destination address pointers that can increment or remain constant
— 24-bit byte transfer counter per channel
— Auto-alignment transfers supported for efficient block movement
— Bursting and cycle steal support
— Software-programmable connections between the 11 DMA requesters in the UARTs (3), 32-bit
timers (4) plus external logic (4) and the four DMA channels
•External bus interface
— Glueless connections to external memory devices (e.g., SRAM, Flash, ROM, etc.)
— SDRAM controller supports 8-, 16-, and 32-bit wide memory devices
— Glueless interface to SRAM devices with or without byte strobe inputs
— Programmable wait state generator
— 32-bit bidirectional data bus
— 24-bit address bus
— Up to seven chip selects available
— Byte/write enables (byte strobes)
— Ability to boot from internal Flash memory or external memories that are 8, 16, or 32 bits wide
•Reset
— Separate reset in and reset out signals
— Seven sources of reset:
– Power-on reset (POR)
– External
– Software
– Watchdog
– Loss of clock
– Loss of lock
– Low-voltage detection (LVD)
— Status flag indication of source of last reset
•Chip integration module (CIM)
— System configuration during reset
— Support for single chip, master, and test modes
— Selects one of four clock modes
— Sets boot device and its data port width
— Configures output pad drive strength
— Unique part identification number and part revision number
•General purpose I/O interface
— Up to 142 bits of general purpose I/O for MCF5280/1/2
— Up to 134 bits of general purpose I/O for MCF5214/6
— Coherent 32-bit control
— Bit manipulation supported via set/clear functions
— Unused peripheral pins may be used as extra GPIO
•JTAG support for system-level board testing
[Figure 1-1 is a block diagram that is not reproduced here. Blocks shown: ColdFire V2 core with EMAC, DIV,
debug module, and 2-Kbyte D-cache/I-cache; internal bus arbiter; system control module (SCM); Flash module;
64K SRAM; DMA controller; DRAM controller; external interface module; chip selects; clock module (PLL);
reset controller; chip configuration; power management module; ports module; Edgeport; interrupt
controllers 0 and 1; FEC; UART0, UART1, and UART2 serial I/O; I2C module; QSPI; FlexCAN; QADC; DMA timer
modules (DTIM0–DTIM3); PIT timers (PIT0–PIT3); general purpose timers A and B; watchdog timer; JTAG port
and test controller. Two notes in the figure mark blocks as not present on the MCF5214 and MCF5216 and
as not present on the MCF5280.]
Figure 1-1. MCF528x and MCF521x Block Diagram
1.1.1 Version 2 ColdFire Core
The processor core is comprised of two separate pipelines that are decoupled by an instruction buffer. The
two-stage instruction fetch pipeline (IFP) is responsible for instruction-address generation and instruction
fetch. The instruction buffer is a first-in-first-out (FIFO) buffer that holds prefetched instructions awaiting
execution in the operand execution pipeline (OEP). The OEP includes two pipeline stages. The first stage
decodes instructions and selects operands (DSOC); the second stage (AGEX) performs instruction
execution and calculates operand effective addresses, if needed.
The V2 core implements the ColdFire instruction set architecture revision A with added support for a
separate user stack pointer register and four new instructions to assist in bit processing. Additionally, the
MCF5282 core includes the enhanced multiply-accumulate unit (EMAC) for improved signal processing
capabilities. The EMAC implements a 4-stage execution pipeline, optimized for 32 x 32 bit operations,
with support for four 48-bit accumulators. Supported operands include 16- and 32-bit signed and unsigned
integers, signed fractional operands, and a complete set of instructions to process these data types. The
EMAC provides superb support for execution of DSP operations within the context of a single processor
at a minimal hardware cost.
1.1.1.1 Cache
The 2-Kbyte cache can be configured into one of three possible organizations: a 2-Kbyte instruction cache,
a 2-Kbyte data cache or a split 1-Kbyte instruction/1-Kbyte data cache. The configuration is
software-programmable by control bits within the privileged cache configuration register (CACR). In all
configurations, the cache is a direct-mapped single-cycle memory, organized as 128 lines, each containing
16 bytes of data. The memories consist of a 128-entry tag array (containing addresses and control bits) and
a 2-Kbyte data array, organized as 512 x 32 bits. The tag and data arrays are accessed in parallel using the
following address bits:
If the desired address is mapped into the cache memory, the output of the data array is driven onto the
ColdFire core's local data bus, completing the access in a single cycle. If the data is not mapped into the
tag memory, a cache miss occurs and the processor core initiates a 16-byte line-sized fetch. The cache
module includes a 16-byte line fill buffer used as temporary storage during miss processing. For all data
cache configurations, the memory operates in write-through mode and all operand writes generate an
external bus cycle.
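The lookup described above can be sketched in C. This is a minimal illustration, not the hardware's documented bit assignment: the index ranges (address bits [10:4] for the 128-entry tag array, bits [10:2] for the 512 x 32-bit data array, and bits [31:11] as the compared tag) are assumptions derived only from the stated geometry of 128 lines of 16 bytes, and the function names are hypothetical. The split-cache configuration may assign bits differently.

```c
#include <stdint.h>

/* Geometry stated in the text: 128 lines of 16 bytes; data array 512 x 32 bits. */
#define CACHE_LINE_BYTES  16u
#define CACHE_NUM_LINES   128u
#define CACHE_DATA_WORDS  512u

/* Tag-array index: selects one of 128 lines (assumed address bits [10:4]). */
static inline uint32_t cache_tag_index(uint32_t addr)
{
    return (addr / CACHE_LINE_BYTES) % CACHE_NUM_LINES;
}

/* Data-array index: selects one of 512 longwords (assumed address bits [10:2]). */
static inline uint32_t cache_data_index(uint32_t addr)
{
    return (addr / 4u) % CACHE_DATA_WORDS;
}

/* Upper address bits compared against the stored tag (assumed bits [31:11]). */
static inline uint32_t cache_tag_value(uint32_t addr)
{
    return addr >> 11;
}
```

Two addresses that differ only above bit 10 share a tag index and therefore compete for the same line, which is the defining property of a direct-mapped organization.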
1.1.1.2 SRAM
The SRAM module provides a general-purpose 64-Kbyte memory block that the ColdFire core can access
in a single cycle. The location of the memory block can be set to any 64-Kbyte boundary within the
4-Gbyte address space. The memory is ideal for storing critical code or data structures, for use as the
system stack, or for storing FEC data buffers. Because the SRAM module is physically connected to the
processor's high-speed local bus, it can quickly service core-initiated accesses or memory-referencing
commands from the debug module.
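Because the block must sit on a 64-Kbyte boundary, a candidate base address can be checked before it is programmed. The following C sketch shows only that alignment arithmetic; the helper names are hypothetical and no register access is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

#define SRAM_SIZE_BYTES (64u * 1024u)   /* 64-Kbyte block, as stated in the text */

/* True if base lies on a 64-Kbyte boundary (low 16 address bits are zero). */
static bool sram_base_is_valid(uint32_t base)
{
    return (base & (SRAM_SIZE_BYTES - 1u)) == 0u;
}

/* Rounds an arbitrary address down to the enclosing 64-Kbyte boundary. */
static uint32_t sram_base_align_down(uint32_t base)
{
    return base & ~(SRAM_SIZE_BYTES - 1u);
}
```

For example, 0x20000000 passes the check, while 0x20008000 does not and rounds down to 0x20000000.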
The SRAM module is also accessible by non-core bus masters, for example the DMA and/or the FEC. The
dual-ported nature of the SRAM makes it ideal for implementing applications with double-buffer schemes,
where the processor and a DMA device operate in alternate regions of the SRAM to maximize system
performance. As an example, system performance can be increased significantly if Ethernet packets are
moved from the FEC into the SRAM (rather than external memory) prior to any processing.
1.1.1.3 Flash
This product incorporates SuperFlash® technology licensed from SST. The ColdFire Flash Module (CFM)
is a non-volatile memory (NVM) module for integration with the processor core. The CFM is constructed
with eight banks of 32K x 16-bit Flash arrays to generate 512 Kbytes of 32-bit Flash memory.
NOTE
The CFM on the MCF5281 and MCF5214 is constructed with four banks of
32K x 16-bit Flash arrays to generate 256 Kbytes of 32-bit Flash memory.
The MCF5280 does not contain a CFM.
These arrays serve as electrically erasable and programmable, non-volatile program and data memory. The
Flash memory is ideal for program and data storage for single-chip applications allowing for field
reprogramming without requiring an external programming voltage source. The CFM interfaces to the
V2 ColdFire core through an optimized read-only memory controller which supports interleaved accesses
from the 2-cycle Flash arrays. A “backdoor” mapping of the Flash memory is used for all program, erase,
and verify operations. It also provides a read datapath for non-core masters (for example, DMA).
1.1.1.4 Debug Module
The ColdFire processor core debug interface is provided to support system debugging in conjunction with
low-cost debug and emulator development tools. Through a standard debug interface, users can access
real-time trace and debug information. This allows the processor and system to be debugged at full speed
without the need for costly in-circuit emulators. The debug interface is a superset of the BDM interface
provided on Freescale’s 683xx family of parts.
The on-chip breakpoint resources include a total of 6 programmable registers—a set of address registers
(with two 32-bit registers), a set of data registers (with a 32-bit data register plus a 32-bit data mask
register), and one 32-bit PC register plus a 32-bit PC mask register. These registers can be accessed through
the dedicated debug serial communication channel or from the processor’s supervisor mode programming
model. The breakpoint registers can be configured to generate triggers by combining the address, data, and
PC conditions in a variety of single or dual-level definitions. The trigger event can be programmed to
generate a processor halt or initiate a debug interrupt exception.
To support program trace, the Version 2 debug module provides processor status (PST[3:0]) and debug
data (DDATA[3:0]) ports. These buses and the CLKOUT output provide execution status, captured
operand data, and branch target addresses defining the dynamic execution path of the processor at the
CPU’s clock rate.
1.1.2 System Control Module
This section details the functionality of the System Control Module (SCM) which provides the
programming model for the System Access Control Unit (SACU), the system bus arbiter, a 32-bit Core
Watchdog Timer (CWT), and the system control registers and logic. Specifically, the system control
includes the internal peripheral system base address register (IPSBAR), the processor’s dual-port RAM
base address register (RAMBAR), and system control registers that include low-power and core watchdog
timer control.
1.1.3 External Interface Module (EIM)
The external interface module handles the transfer of information between the internal core and memory,
peripherals, or other processing elements in the external address space.
Programmable chip-select outputs provide signals to enable external memory and peripheral circuits,
providing all handshaking and timing signals for automatic wait-state insertion and data bus sizing.
Base memory address and block size are programmable, with some restrictions. For example, the starting
address must be on a boundary that is a multiple of the block size. Each chip select can be configured to
provide read and write enable signals suitable for use with most popular static RAMs and peripherals. Data
bus width (8-bit, 16-bit, or 32-bit) is programmable on all chip selects, and further decoding is available
for protection from user mode access or read-only access.
1.1.4 Chip Select
Programmable chip select outputs provide a glueless connection to external memory and peripheral
circuits, providing all handshaking and timing signals for automatic wait-state insertion and data bus
sizing.
1.1.5 Power Management
The MCF5282 incorporates several low-power modes of operation which are entered under program
control and exited by several external trigger events. An integrated Power-On Reset (POR) circuit
monitors the input supply and forces an MCU reset as the supply voltage rises. The Low Voltage Detect
(LVD) section monitors the supply voltage and is configurable to force a reset or interrupt condition if it
falls below the LVD trip point. The RAM standby switch provides power to RAM when the supply voltage
is higher than the standby voltage. If the supply voltage to the chip falls below the standby battery voltage, the
RAM is switched over to the standby supply.
1.1.6 General Input/Output Ports
All of the pins associated with the external bus interface may be used for several different functions. Their
primary function is to provide an external memory interface to access off-chip resources. When not used
for this function, all of the pins may be used as general-purpose digital I/O pins. In some cases, the pin
function is set by the operating mode, and the alternate pin functions are not supported.
The digital I/O pins on the MCF5282 are grouped into 8-bit ports. Some ports do not use all eight bits.
Each port has registers that configure, monitor, and control the port pins.
1.1.7 Interrupt Controllers (INTC0/INTC1)
There are two interrupt controllers on the MCF5282, each of which can support up to 63 interrupt sources
for a total of 126. Each interrupt controller is organized as 7 levels with 9 interrupt sources per level. Each
interrupt source has a unique interrupt vector, and 56 of the 63 sources of a given controller provide a
programmable level [1-7] and priority within the level.
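The source counts and the level/priority ordering can be illustrated with a short C sketch. The counts come from the text above. The comparison rule (a numerically higher level wins, and within a level a numerically higher priority wins) and all type and function names are assumptions made for illustration; the interrupt controller chapter defines the actual arbitration.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    INTC_LEVELS            = 7,  /* levels 1-7, from the text */
    INTC_SOURCES_PER_LEVEL = 9,  /* from the text */
    INTC_CONTROLLERS       = 2   /* INTC0 and INTC1 */
};

/* 7 levels x 9 sources per level = 63 sources per controller. */
static unsigned intc_sources_per_controller(void)
{
    return INTC_LEVELS * INTC_SOURCES_PER_LEVEL;
}

/* Two controllers give 126 sources in total. */
static unsigned intc_sources_total(void)
{
    return INTC_CONTROLLERS * intc_sources_per_controller();
}

/* Hypothetical software view of one pending request. */
typedef struct {
    uint8_t level;     /* 1..7 */
    uint8_t priority;  /* position within the level */
} intc_request_t;

/* Assumed ordering: compare level first, then priority within the level. */
static bool intc_outranks(intc_request_t a, intc_request_t b)
{
    if (a.level != b.level) {
        return a.level > b.level;
    }
    return a.priority > b.priority;
}
```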
1.1.8 SDRAM Controller
The SDRAM controller provides all required signals for glueless interfacing to a variety of
JEDEC-compliant SDRAM devices. SRAS/SCAS address multiplexing is software configurable for
different page sizes. To maintain refresh capability without conflicting with concurrent accesses on the
address and data buses, SRAS, SCAS, DRAMW, SDRAM_CS[1:0], and SCKE are dedicated SDRAM
signals.
1.1.9 Test Access Port
The MCF5282 supports circuit board test strategies based on the Test Technology Committee of IEEE and
the Joint Test Action Group (JTAG). The test logic includes a test access port (TAP) consisting of a 16-state
controller, an instruction register, and three test registers (a 1-bit bypass register, a 256-bit boundary-scan
register, and a 32-bit ID register). The boundary scan register links the device’s pins into one shift register.
Test logic, implemented using static logic design, is independent of the device system logic.
The MCF5282 implementation supports the following:
•Perform boundary-scan operations to test circuit board electrical continuity
•Sample MCF5282 system pins during operation and transparently shift out the result in the
boundary scan register
•Bypass the MCF5282 for a given circuit board test by effectively reducing the boundary-scan
register to a single bit
•Disable the output drive to pins during circuit-board testing
•Drive output pins to stable levels
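The 16-state TAP controller mentioned above follows the state diagram of IEEE 1149.1, in which the next state depends only on the current state and the TMS value sampled at each TCK rising edge. The C sketch below models that standard diagram as a lookup table; the enum and function names are illustrative and are not taken from this manual.

```c
/* The 16 TAP controller states defined by IEEE 1149.1. */
typedef enum {
    TAP_TEST_LOGIC_RESET, TAP_RUN_TEST_IDLE,
    TAP_SELECT_DR, TAP_CAPTURE_DR, TAP_SHIFT_DR, TAP_EXIT1_DR,
    TAP_PAUSE_DR, TAP_EXIT2_DR, TAP_UPDATE_DR,
    TAP_SELECT_IR, TAP_CAPTURE_IR, TAP_SHIFT_IR, TAP_EXIT1_IR,
    TAP_PAUSE_IR, TAP_EXIT2_IR, TAP_UPDATE_IR,
    TAP_NUM_STATES
} tap_state_t;

/* Next state for each current state: column 0 is TMS = 0, column 1 is TMS = 1. */
static const tap_state_t tap_table[TAP_NUM_STATES][2] = {
    [TAP_TEST_LOGIC_RESET] = { TAP_RUN_TEST_IDLE, TAP_TEST_LOGIC_RESET },
    [TAP_RUN_TEST_IDLE]    = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
    [TAP_SELECT_DR]        = { TAP_CAPTURE_DR,    TAP_SELECT_IR },
    [TAP_CAPTURE_DR]       = { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    [TAP_SHIFT_DR]         = { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    [TAP_EXIT1_DR]         = { TAP_PAUSE_DR,      TAP_UPDATE_DR },
    [TAP_PAUSE_DR]         = { TAP_PAUSE_DR,      TAP_EXIT2_DR },
    [TAP_EXIT2_DR]         = { TAP_SHIFT_DR,      TAP_UPDATE_DR },
    [TAP_UPDATE_DR]        = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
    [TAP_SELECT_IR]        = { TAP_CAPTURE_IR,    TAP_TEST_LOGIC_RESET },
    [TAP_CAPTURE_IR]       = { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    [TAP_SHIFT_IR]         = { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    [TAP_EXIT1_IR]         = { TAP_PAUSE_IR,      TAP_UPDATE_IR },
    [TAP_PAUSE_IR]         = { TAP_PAUSE_IR,      TAP_EXIT2_IR },
    [TAP_EXIT2_IR]         = { TAP_SHIFT_IR,      TAP_UPDATE_IR },
    [TAP_UPDATE_IR]        = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR },
};

/* Advances the model by one TCK rising edge with the given TMS level. */
static tap_state_t tap_next(tap_state_t state, int tms)
{
    return tap_table[state][tms ? 1 : 0];
}
```

A useful property of this diagram is that holding TMS high for five TCK cycles reaches Test-Logic-Reset from any state, which lets a board tester synchronize with the controller without a dedicated reset pin.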
1.1.10 UART Modules
The MCF5282 contains three full-duplex UARTs that function independently. The three UARTs can be
clocked by the system clock, eliminating the need for an external crystal.
Each UART has the following features:
•Each can be clocked by the system clock, eliminating a need for an external UART clock
•Independently programmable receiver and transmitter clock sources
•Programmable data format:
— 5–8 data bits plus parity
— Odd, even, no parity, or force parity
— One, one-and-a-half, or two stop bits
•Each channel programmable to normal (full-duplex), automatic echo, local loop-back, or remote
loop-back mode
•Automatic wake-up mode for multidrop applications
•Four maskable interrupt conditions
•All three UARTs have DMA request capability
•Parity, framing, and overrun error detection
•False-start bit detection
•Line-break detection and generation
•Detection of breaks originating in the middle of a character
•Start/end break interrupt/status
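The programmable data format determines how many bit times each character occupies on the line. The C sketch below computes the frame length for the formats listed above, assuming the usual asynchronous framing of one start bit, the data bits, an optional parity bit, and the stop bits. Lengths are kept in half-bit units so that one-and-a-half stop bits can be represented exactly; the function names are hypothetical.

```c
/* Frame length in half-bit units.
 * data_bits:      5..8
 * parity_enabled: nonzero when a parity bit is transmitted
 * stop_half_bits: 2 (one stop bit), 3 (one and a half), or 4 (two) */
static unsigned uart_frame_half_bits(unsigned data_bits, int parity_enabled,
                                     unsigned stop_half_bits)
{
    unsigned whole_bits = 1u + data_bits + (parity_enabled ? 1u : 0u); /* start + data + parity */
    return 2u * whole_bits + stop_half_bits;
}

/* Maximum back-to-back characters per second at a given baud rate. */
static unsigned uart_chars_per_second(unsigned baud, unsigned frame_half_bits)
{
    return (2u * baud) / frame_half_bits;
}
```

For an 8-data-bit, no-parity, one-stop-bit format the frame is 20 half-bits (10 bit times), so a 115200 baud link carries at most 11520 characters per second.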
1.1.11 DMA Timers (DTIM0-DTIM3)
There are four independent, DMA-transfer-generating 32-bit timers (DTIM0, DTIM1, DTIM2, DTIM3)
on the MCF5282. Each timer module incorporates a 32-bit timer with a separate register set for
configuration and control. The timers can be configured to operate from the system clock or from an
external clock source using one of the DTINx signals. If the system clock is selected, it can be divided by
16 or 1. The selected clock is further divided by a user-programmable 8-bit prescaler which clocks the
actual timer counter register (TCRn). Each of these timers can be configured for input capture or reference
compare mode. By configuring the internal registers, each timer may be configured to assert an external
signal, generate an interrupt on a particular event, or cause a DMA transfer.
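The clock chain described above (system clock, optional divide by 16, then the 8-bit prescaler) can be expressed as a small calculation. This C sketch is illustrative only: it assumes that a prescaler register value of N divides by N + 1, and the function names are hypothetical rather than part of any Freescale header.

```c
#include <stdint.h>

/* Rate at which the timer counter increments, in Hz.
 * sys_clk_hz: system clock frequency
 * div16:      nonzero selects system clock / 16, zero selects system clock / 1
 * prescaler:  8-bit prescaler value; assumed to divide by (prescaler + 1) */
static uint32_t dtim_count_hz(uint32_t sys_clk_hz, int div16, uint8_t prescaler)
{
    uint32_t input_hz = div16 ? sys_clk_hz / 16u : sys_clk_hz;
    return input_hz / ((uint32_t)prescaler + 1u);
}

/* Number of counter ticks that span an interval of usec microseconds. */
static uint32_t dtim_ticks_for_us(uint32_t count_hz, uint32_t usec)
{
    return (uint32_t)(((uint64_t)count_hz * usec) / 1000000u);
}
```

With an 80 MHz system clock, the divide-by-16 option, and a prescaler value of 4, the counter advances at 1 MHz under these assumptions, so a 1 ms interval corresponds to 1000 ticks.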
1.1.12 General-Purpose Timers (GPTA/GPTB)
The two general-purpose timers (GPTA and GPTB) are 4-channel timer modules. Each timer consists of a
16-bit programmable counter driven by a 7-stage programmable prescaler. Each of the four channels for
each timer can be configured for input capture or output compare. Additionally, one of the channels,
channel 3, can be configured as a pulse accumulator.
A timer overflow function allows software to extend the timing capability of the system beyond the 16-bit
range of the counter. The input capture and output compare functions allow simultaneous input waveform
measurements and output waveform generation. The input capture function can capture the time of a
selected transition edge. The output compare function can generate output waveforms and timer software
delays. The 16-bit pulse accumulator can operate as a simple event counter or a gated time accumulator.
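Two common software patterns follow from the 16-bit counter and its overflow function. The C sketch below shows them under stated assumptions: the overflow count is a variable that software increments in an overflow handler (not a hardware register), and the helper names are hypothetical.

```c
#include <stdint.h>

/* Extends the 16-bit counter to 32 bits by placing a software-maintained
 * overflow count in the upper half. */
static uint32_t gpt_extended_time(uint16_t overflow_count, uint16_t counter)
{
    return ((uint32_t)overflow_count << 16) | counter;
}

/* Ticks elapsed between two input captures. Unsigned 16-bit subtraction
 * gives the right answer even if the counter wrapped once in between. */
static uint16_t gpt_elapsed_ticks(uint16_t start, uint16_t end)
{
    return (uint16_t)(end - start);
}
```

Real code also has to handle an overflow that occurs between reading the counter and reading the overflow count; that race is outside the scope of this sketch.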
1.1.13 Periodic Interrupt Timers (PIT0-PIT3)
The four periodic interrupt timers (PIT0, PIT1, PIT2, PIT3) are 16-bit timers that provide precise interrupts
at regular intervals with minimal processor intervention. Each timer can either count down from the value
written in its PIT modulus register, or it can be a free-running down-counter.
1.1.14 Software Watchdog Timer
The watchdog timer is a 16-bit timer that facilitates recovery from runaway code. The watchdog counter
is a free-running down-counter that generates a reset on underflow. To prevent a reset, software must
periodically restart the countdown.
1.1.15 Phase Locked Loop (PLL)
The clock module contains a crystal oscillator (OSC), phase-locked loop (PLL), reduced frequency divider
(RFD), status/control registers, and control logic. To improve noise immunity, the PLL and OSC have their
own power supply inputs, VDDPLL and VSSPLL. All other circuits are powered by the normal supply
pins, VDD and VSS.
1.1.16 DMA Controller
The Direct Memory Access (DMA) controller module provides an efficient way to move blocks of data
with minimal processor interaction. The DMA module provides four channels (DMA0–DMA3) that allow
byte, word, longword, or 16-byte burst line transfers. These transfers are triggered either by software
explicitly setting a DCRn[START] bit or by a hardware event from one of the on-chip peripheral
devices, such as a capture event or an output reference event in a DMA timer (DTIMn) for each channel.
The DMA controller supports dual-address mode to on-chip devices.
1.1.17 Reset
The reset controller is provided to determine the cause of reset, assert the appropriate reset signals to the
system, and keep track of what caused the last reset. The power management registers for the internal
low-voltage detect (LVD) circuit are implemented in the reset module. There are seven sources of reset:
• External
• Power-on reset (POR)
• Watchdog timer
• Phase-locked loop (PLL) loss of lock
• PLL loss of clock
• Software
• Low-voltage detection (LVD) reset
External reset on the RSTO pin is software-assertable independent of chip reset state. There are also
software-readable status flags indicating the cause of the last reset, and LVD control and status bits for
setup and use of LVD reset or interrupt.
1.2 MCF5282-Specific Features
1.2.1 Fast Ethernet Controller (FEC)
The MCF5282’s integrated Fast Ethernet Controller (FEC) performs the full set of IEEE 802.3/Ethernet
CSMA/CD media access control and channel interface functions. The FEC supports connection and
functionality for the 10/100 Mbps 802.3 media independent interface (MII). It requires an external
transceiver (PHY) to complete the interface to the media.
NOTE
The MCF5214 and MCF5216 devices do not contain an FEC module.
1.2.2 FlexCAN
The FlexCAN module is a communication controller implementing the CAN protocol. The CAN protocol
can be used as an industrial control serial data bus, meeting the specific requirements of real-time
processing, reliable operation in a harsh EMI environment, cost-effectiveness, and required bandwidth.
FlexCAN contains 16 message buffers.
1.2.3 I2C Bus
The I2C bus is a two-wire, bidirectional serial bus that provides a simple, efficient method of data
exchange, minimizing the interconnection between devices. This bus is suitable for applications requiring
occasional communications over a short distance between many devices.
1.2.4 Queued Serial Peripheral Interface (QSPI)
The queued serial peripheral interface module provides a synchronous serial peripheral interface with
queued transfer capability. It allows up to 16 transfers to be queued at once, eliminating CPU intervention
between transfers.
1.2.5 Queued Analog-to-Digital Converter (QADC)
The QADC is a 10-bit, unipolar, successive approximation converter. A maximum of 8 analog input
channels can be supported using internal multiplexing. A maximum of 18 input channels can be supported
in the internal/external multiplexed mode.
The QADC consists of an analog front-end and a digital control subsystem. The analog section includes
input pins, an analog multiplexer, and sample and hold analog circuits. The analog conversion is performed
by the digital-to-analog converter (DAC) resistor-capacitor array and a high-gain comparator.
The digital control section contains queue control logic to sequence the conversion process and interrupt
generation logic. Also included are the periodic/interval timer, control and status registers, the 64-entry
conversion command word (CCW) table, and the 64-entry result table.
Chapter 2
ColdFire Core
2.1 Introduction
This section describes the organization of the Version 2 (V2) ColdFire® processor core and an overview
of the program-visible registers. For detailed information on instructions, see the ISA_A+ definition in the
ColdFire Family Programmer’s Reference Manual.
2.1.1 Overview
As with all ColdFire cores, the V2 ColdFire core is comprised of two separate pipelines decoupled by an
instruction buffer.
The instruction fetch pipeline (IFP) is a two-stage pipeline for prefetching instructions. The prefetched
instruction stream is then gated into the two-stage operand execution pipeline (OEP), which decodes the
instruction, fetches the required operands, and then executes the required function. Because the IFP and
OEP are decoupled by an instruction buffer serving as a FIFO queue, the IFP is able to prefetch
instructions in advance of their actual use by the OEP, thereby minimizing time stalled waiting for
instructions.
Figure 2-1. V2 ColdFire Core Pipelines
The V2 ColdFire core pipeline stages include the following:
• Two-stage instruction fetch pipeline (IFP) (plus optional instruction buffer stage)
— Instruction address generation (IAG) — Calculates the next prefetch address
— Instruction fetch cycle (IC)—Initiates prefetch on the processor’s local bus
— Instruction buffer (IB) — Optional buffer stage minimizes fetch latency effects using FIFO
queue
• Two-stage operand execution pipeline (OEP)
— Decode and select/operand fetch cycle (DSOC)—Decodes instructions and fetches the
required components for effective address calculation, or the operand fetch cycle
— Address generation/execute cycle (AGEX)—Calculates operand address or executes the
instruction
When the instruction buffer is empty, opcodes are loaded directly from the IC cycle into the operand
execution pipeline. If the buffer is not empty, the IFP stores the contents of the fetched instruction in the
IB until it is required by the OEP.
For register-to-register and register-to-memory store operations, the instruction passes through both OEP
stages once. For memory-to-register and read-modify-write memory operations, an instruction is
effectively staged through the OEP twice: the first time to calculate the effective address and initiate the
operand fetch on the processor’s local bus, and the second time to complete the operand reference and
perform the required function defined by the instruction.
The resulting pipeline and local bus structure allow the V2 ColdFire core to deliver sustained high
performance across a variety of demanding embedded applications.
2.2 Memory Map/Register Description
The following sections describe the processor registers in the user and supervisor programming models.
The programming model is selected based on the processor privilege level (user mode or supervisor mode)
as defined by the S bit of the status register (SR). Table 2-1 lists the processor registers.
The user-programming model consists of the following registers:
• EMAC registers (described fully in Chapter 3, “Enhanced Multiply-Accumulate Unit (EMAC)”):
  — Four 48-bit accumulator registers partitioned as follows:
    – Four 32-bit accumulators (ACC0–ACC3)
    – Eight 8-bit accumulator extension bytes (two per accumulator). These are grouped into two
      32-bit values for load and store operations (ACCEXT01 and ACCEXT23).
    Accumulators and extension bytes can be loaded, copied, and stored, and results from EMAC
    arithmetic operations generally affect the entire 48-bit destination.
  — One 16-bit mask register (MASK)
  — One 32-bit status register (MACSR) including four indicator bits signaling product or
    accumulation overflow (one for each accumulator: PAV0–PAV3)
The supervisor programming model is to be used only by system control software to implement restricted
operating system functions, I/O control, and memory management. All accesses that affect the control
features of ColdFire processors are in the supervisor programming model, which consists of registers
available in user mode as well as the following control registers:
• 16-bit status register (SR)
• 32-bit supervisor stack pointer (SSP)
• 32-bit vector base register (VBR)
• 32-bit cache control register (CACR)
• 32-bit access control registers (ACR0, ACR1)
• Two 32-bit memory base address registers (RAMBAR, FLASHBAR)
Table 2-1. ColdFire Core Programming Model

BDM¹                             Register                                     Width   Access  Reset Value                       Written      Section/Page
                                                                              (bits)                                            with MOVEC
Load: 0x080, Store: 0x180        Data Register 0 (D0)                         32      R/W     0xCF20_6080                       No           2.2.1/2-4
Load: 0x081, Store: 0x181        Data Register 1 (D1)                         32      R/W     0x13B0_1080                       No           2.2.1/2-4
Load: 0x082–7, Store: 0x182–7    Data Register 2–7 (D2–D7)                    32      R/W     Undefined                         No           2.2.1/2-4
Load: 0x088–8E, Store: 0x188–8E
Load: 0x08F, Store: 0x18F
0x804                            MAC Status Register (MACSR)                  32      R/W     0x0000_0000                       No           3.2.1/3-3
0x805                            MAC Address Mask Register (MASK)             32      R/W     0xFFFF_FFFF                       No           3.2.2/3-5
0x806, 0x809, 0x80A, 0x80B
0x807                            MAC Accumulator 0,1 Extension Bytes
0x80F                            Program Counter (PC)                         32      R/W     Contents of location 0x0000_0004  No           2.2.5/2-7

Supervisor Access Only Registers
0x002                            Cache Control Register (CACR)                32      R/W     0x0000_0000                       Yes          2.2.6/2-7
0x004–5                          Access Control Register 0–1 (ACR0–1)         32      R/W     See Section                       Yes          2.2.7/2-7
0x800                            User/Supervisor A7 Stack Pointer (OTHER_A7)  32      R/W     Contents of location 0x0000_0000  No           2.2.3/2-5
0x801                            Vector Base Register (VBR)                   32      R/W     0x0000_0000                       Yes          2.2.8/2-7
0x80E                            Status Register (SR)                         16      R/W     0x27--                            No           2.2.9/2-8
0xC04                            Flash Base Address Register (FLASHBAR)       32      R/W     0x0000_0000                       Yes          2.2.10/2-8
0xC05                            RAM Base Address Register (RAMBAR)           32      R/W     See Section                       Yes          2.2.10/2-8

¹ The values listed in this column represent the Rc field used when accessing the core registers via the BDM port. For more
  information see Chapter 30, “Debug Support”.
2.2.1 Data Registers (D0–D7)
D0–D7 data registers are for bit (1-bit), byte (8-bit), word (16-bit) and longword (32-bit) operations; they
can also be used as index registers.
NOTE
Registers D0 and D1 contain hardware configuration details after reset. See
Section 2.3.4.15, “Reset Exception” for more details.
2.2.2 Address Registers (A0–A6)
These registers can be used as software stack pointers, index registers, or base address registers. They can
also be used for word and longword operations.
2.2.3 Supervisor/User Stack Pointers (A7 and OTHER_A7)
This ColdFire architecture supports two independent stack pointer (A7) registers—the supervisor stack
pointer (SSP) and the user stack pointer (USP). The hardware implementation of these two
program-visible 32-bit registers does not identify one as the SSP and the other as the USP. Instead, the
hardware uses one 32-bit register as the active A7 and the other as OTHER_A7. Thus, the register contents
are a function of the processor operation mode, as shown in the following:
if SR[S] = 1
then A7 = Supervisor Stack Pointer
     OTHER_A7 = User Stack Pointer
else A7 = User Stack Pointer
     OTHER_A7 = Supervisor Stack Pointer
The BDM programming model supports direct reads and writes to A7 and OTHER_A7. It is the
responsibility of the external development system to determine, based on the setting of SR[S], the mapping
of A7 and OTHER_A7 to the two program-visible definitions (SSP and USP). This functionality is
enabled by setting the enable user stack pointer bit, CACR[EUSP]. If this bit is cleared, only a single stack
pointer (A7), defined for ColdFire ISA_A, is available. EUSP is cleared at reset.
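The mapping that an external development system has to perform can be sketched as follows. This is a minimal illustration under stated assumptions: the type and function names are hypothetical, the raw A7 and OTHER_A7 values are assumed to have been read already (for example, through BDM), and CACR[EUSP] is assumed to be set so that both stack pointers exist. SR[S] is bit 13 of the status register.

```c
#include <assert.h>
#include <stdint.h>

#define SR_S_BIT 0x2000u   /* SR[S], bit 13: 1 = supervisor mode */

/* Hypothetical container for the two program-visible stack pointers. */
typedef struct {
    uint32_t ssp;   /* supervisor stack pointer */
    uint32_t usp;   /* user stack pointer */
} stack_pointers_t;

/* Map the hardware registers (active A7, OTHER_A7) onto SSP and USP
 * according to the current privilege level held in SR[S]. */
stack_pointers_t resolve_stack_pointers(uint16_t sr, uint32_t a7, uint32_t other_a7)
{
    stack_pointers_t sp;
    if (sr & SR_S_BIT) {        /* supervisor mode: A7 is the SSP */
        sp.ssp = a7;
        sp.usp = other_a7;
    } else {                    /* user mode: A7 is the USP */
        sp.ssp = other_a7;
        sp.usp = a7;
    }
    return sp;
}
```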
To support dual stack pointers, the following two supervisor instructions are included in the ColdFire
instruction set architecture to load/store the USP:
move.l Ay,USP    ;move to USP
move.l USP,Ax    ;move from USP
These instructions are described in the ColdFire Family Programmer’s Reference Manual. All other
instruction references to the stack pointer, explicit or implicit, access the active A7 register.
NOTE
The SSP is loaded during reset exception processing with the contents of
location 0x0000_0000.
Figure 2-4. Stack Pointer Registers (A7 and OTHER_A7)
2.2.4 Condition Code Register (CCR)
The CCR is the LSB of the processor status register (SR). Bits 4–0 act as indicator flags for results
generated by processor operations. The extend bit (X) is also an input operand during multiprecision
arithmetic computations. The CCR register must be explicitly loaded after reset and before any compare
(CMP), Bcc, or Scc instructions are executed.
BDM: LSB of Status Register (SR)                Access: User read/write
                                                        BDM read/write
Bit      7    6    5    4    3    2    1    0
R        0    0    0    X    N    Z    V    C
W                       X    N    Z    V    C
Reset:   0    0    0    —    —    —    —    —
Figure 2-5. Condition Code Register (CCR)
Table 2-2. CCR Field Descriptions
Field   Description
7–5     Reserved, must be cleared.
4 X     Extend condition code bit. Set to the C-bit value for arithmetic operations; otherwise not affected or set to a
        specified result.
3 N     Negative condition code bit. Set if most significant bit of the result is set; otherwise cleared.
2 Z     Zero condition code bit. Set if result equals zero; otherwise cleared.
1 V     Overflow condition code bit. Set if an arithmetic overflow occurs implying the result cannot be represented in
        operand size; otherwise cleared.
0 C     Carry condition code bit. Set if a carry out of the operand msb occurs for an addition or if a borrow occurs in a
        subtraction; otherwise cleared.
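The flag definitions in Table 2-2 can be illustrated with a software model of a 32-bit addition. This is only a sketch of the rules as stated in the table, not a cycle-accurate or instruction-accurate model; the function name `ccr_after_add32` and the `CCR_*` constants are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions follow Table 2-2. */
enum {
    CCR_C = 1 << 0,   /* carry */
    CCR_V = 1 << 1,   /* overflow */
    CCR_Z = 1 << 2,   /* zero */
    CCR_N = 1 << 3,   /* negative */
    CCR_X = 1 << 4    /* extend */
};

/* Model the CCR produced by a 32-bit add of a and b.
 * If sum_out is non-NULL, the 32-bit result is stored there. */
uint8_t ccr_after_add32(uint32_t a, uint32_t b, uint32_t *sum_out)
{
    uint32_t sum = a + b;               /* wraps modulo 2^32 */
    uint8_t ccr = 0;

    if (sum & 0x80000000u)
        ccr |= CCR_N;                   /* msb of result set */
    if (sum == 0u)
        ccr |= CCR_Z;                   /* result equals zero */
    /* Signed overflow: operands have the same sign, result differs. */
    if (~(a ^ b) & (a ^ sum) & 0x80000000u)
        ccr |= CCR_V;
    if (sum < a)
        ccr |= CCR_C | CCR_X;           /* carry out; X takes the C value */

    if (sum_out != NULL)
        *sum_out = sum;
    return ccr;
}
```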
2.2.5 Program Counter (PC)
The PC contains the currently executing instruction address. During instruction execution and exception
processing, the processor automatically increments contents of the PC or places a new value in the PC, as
appropriate. The PC is a base address for PC-relative operand addressing.
The PC is initially loaded during reset exception processing with the contents of location 0x0000_0004.
2.2.6 Cache Control Register (CACR)
The CACR controls operation of the instruction/data cache memories. It includes bits for enabling,
freezing, and invalidating cache contents. It also includes bits for defining the default cache mode and
write-protect fields. The CACR is described in Section 4.2.1, “Cache Control Register (CACR).”
2.2.7 Access Control Registers (ACRn)
The access control registers define attributes for user-defined memory regions. These attributes include the
definition of cache mode, write protect, and buffer write enables. The ACRs are described in Section 4.2.2,
“Access Control Registers (ACR0, ACR1).”
2.2.8 Vector Base Register (VBR)
The VBR contains the base address of the exception vector table in memory. To access the vector table,
the displacement of an exception vector is added to the value in VBR. The lower 20 bits of the VBR are
not implemented by ColdFire processors. They are assumed to be zero, forcing the table to be aligned on
a 1 MB boundary.
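The address of a vector table entry follows from the two facts above: the lower 20 bits of the VBR are treated as zero, and the displacement of a vector is four times its vector number. The sketch below is a minimal illustration; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Only the upper 12 bits of the VBR are implemented; the lower 20 bits
 * are assumed to be zero, so the table sits on a 1 MB boundary. */
#define VBR_BASE_MASK 0xFFF00000u

/* Address of the table entry for a given vector number (0..255):
 * table base + 4 * vector number. */
uint32_t vector_entry_address(uint32_t vbr_written, uint8_t vector_number)
{
    uint32_t base = vbr_written & VBR_BASE_MASK;
    return base + 4u * (uint32_t)vector_number;
}
```

For example, a value of 0x2001_2345 written to the VBR behaves as a base of 0x2000_0000, so the access error vector (vector 2) would be fetched from 0x2000_0008.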
2.2.9 Status Register (SR)
The SR stores the processor status and includes the CCR, the interrupt priority mask, and other control
bits. In supervisor mode, software can access the entire SR. In user mode, only the lower 8 bits (CCR) are
accessible. The control bits indicate the following states for the processor: trace mode (T bit), supervisor
or user mode (S bit), and master or interrupt state (M bit). All defined bits in the SR have read/write access
when in supervisor mode. The lower byte of the SR (the CCR) must be loaded explicitly after reset and
before any compare (CMP), Bcc, or Scc instructions execute.
BDM: 0x80E (SR)                                 Access: Supervisor read/write
                                                        BDM read/write
         |------------ System Byte ------------|---- Condition Code Register (CCR) ---|
Bit      15   14   13   12   11   10   9    8    7    6    5    4    3    2    1    0
R        T    0    S    M    0    I (10–8)       0    0    0    X    N    Z    V    C
W        T         S    M         I (10–8)                      X    N    Z    V    C
Reset    0    0    1    0    0    1    1    1    0    0    0    —    —    —    —    —
Figure 2-8. Status Register (SR)
Table 2-3. SR Field Descriptions
Field      Description
15 T       Trace enable. When set, the processor performs a trace exception after every instruction.
14         Reserved, must be cleared.
13 S       Supervisor/user state.
           0 User mode
           1 Supervisor mode
12 M       Master/interrupt state. Bit is cleared by an interrupt exception and software can set it during execution of
           the RTE or move to SR instructions.
11         Reserved, must be cleared.
10–8 I     Interrupt level mask. Defines current interrupt level. Interrupt requests are inhibited for all priority levels
           less than or equal to current level, except edge-sensitive level 7 requests, which cannot be masked.
7–0 CCR    Refer to Section 2.2.4, “Condition Code Register (CCR)”.
2.2.10 Memory Base Address Registers (RAMBAR, FLASHBAR)
The memory base address registers are used to specify the base address of the internal SRAM and flash
modules and indicate the types of references mapped to each. Each base address register includes a base
address, write-protect bit, address space mask bits, and an enable bit. FLASHBAR determines the base
address of the on-chip flash, and RAMBAR determines the base address of the on-chip RAM. For more
information, refer to Section 5.3.1, “SRAM Base Address Register (RAMBAR)” and Section 6.3.2,
“Flash Base Address Register (FLASHBAR)”.
2.3 Functional Description
2.3.1 Version 2 ColdFire Microarchitecture
From the block diagram in Figure 2-1, the non-Harvard architecture of the processor is readily apparent.
The processor interfaces to the local memory subsystem via a single 32-bit address bus and two unidirectional
32-bit data buses. This structure minimizes the core size without compromising performance to a large
degree.
A more detailed view of the hardware structure within the two pipelines is presented in Figure 2-9 and
Figure 2-10 below. In these diagrams, the internal structure of the instruction fetch and operand execution
pipelines is shown:
Figure 2-9. Version 2 ColdFire Processor Instruction Fetch Pipeline Diagram
Figure 2-10. Version 2 ColdFire Processor Operand Execution Pipeline Diagram
The instruction fetch pipeline prefetches instructions from local memory using a two-stage structure. For
sequential prefetches, the next instruction address is generated by adding four to the last prefetch address.
This function is performed during the IAG stage and the resulting prefetch address gated onto the core bus
(if there are no pending operand memory accesses assigned a higher priority). After the prefetch address
is driven onto the core bus, the instruction fetch cycle accesses the appropriate local memory and returns
the instruction read data back to the IFP during the cycle. If the accessed data is not present in a local
memory (e.g., an instruction cache miss, or an external access cycle is required), the IFP is stalled in the
IC stage until the referenced data is available. As the prefetch data arrives in the IFP, it can be loaded into
the FIFO instruction buffer or gated directly into the OEP.
The V2 design uses a simple static conditional branch prediction algorithm (forward-assumed as
not-taken, backward-assumed as taken), and all change-of-flow operations are calculated by the OEP and
the target instruction address fed back to the IFP.
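The static prediction rule can be expressed in a few lines. This is a minimal sketch of the rule as stated above, not a model of the hardware; the function name is hypothetical, and the branch direction is assumed to be given by the sign of the branch displacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Static conditional branch prediction:
 *   backward branch (negative displacement) -> predicted taken
 *   forward branch (non-negative displacement) -> predicted not taken */
bool predict_branch_taken(int32_t displacement)
{
    return displacement < 0;
}
```

A loop-closing branch at the bottom of a loop has a negative displacement and is therefore predicted taken, which matches the common case for loops.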
The IFP and OEP are decoupled by the FIFO instruction buffer, allowing instruction prefetching to occur
with the available core bus bandwidth not used for operand memory accesses. For the V2 design, the
instruction buffer contains three 32-bit locations.
Consider the operation of the OEP for three basic classes of non-branch instructions:
• Register-to-register:
  op Ry,Rx
• Embedded load:
  op <mem>y,Rx
• Register-to-memory (store):
  move Ry,<mem>x
For simple register-to-register instructions, the first stage of the OEP performs the instruction decode and
fetching of the required register operands (OC) from the dual-ported register file (RGF), while the actual
instruction execution is performed in the second stage (EX) in one of the execute engines (e.g., ALU,
barrel shifter, divider, EMAC). There are no operand memory accesses associated with this class of
instructions, and the execution time is typically a single machine cycle. See Figure 2-11.
Figure 2-11. V2 OEP Register-to-Register
For memory-to-register (embedded-load) instructions, the instruction is effectively staged through the
OEP twice with a basic execution time of three cycles. First, the instruction is decoded and the components
of the operand address (base register from the RGF and displacement) are selected (DS). Second, the
operand effective address is generated using the ALU execute engine (AG). Third, the memory read
operand is fetched from the core bus, while any required register operand is simultaneously fetched (OC)
from the RGF. Finally, the instruction is executed (EX). The heavily-used 32-bit load
instruction (move.l <mem>y,Rx) is optimized to support a two-cycle execution time. The following example
in Figure 2-12 shows an effective address of the form <ea>y = (d16,Ay), i.e., a 16-bit signed displacement
added to a base register Ay.
Figure 2-12. V2 OEP Embedded-Load Part 1
Figure 2-13. V2 OEP Embedded-Load Part 2
For register-to-memory (store) operations, the stage functions (DS/OC, AG/EX) are effectively performed
simultaneously allowing single-cycle execution. See Figure 2-14 where the effective address is of the form
<ea>x = (d16,Ax), i.e., a 16-bit signed displacement added to a base register Ax.
For read-modify-write instructions, the pipeline effectively combines an embedded-load with a store
operation for a three-cycle execution time.
Figure 2-14. V2 OEP Register-to-Memory
The pipeline timing diagrams of Figure 2-15 depict the execution templates for these three classes of
instructions. In these diagrams, the x-axis represents time, and the various instruction operations are shown
progressing down the operand execution pipeline.
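The base execution times quoted in this section can be collected into a small lookup for rough cycle estimates. This is a sketch only: the enum and function names are hypothetical, the figures are the basic OEP times stated above (register-to-register 1, embedded load 3, move.l load 2, store 1, read-modify-write 3), and memory wait states, pipeline stalls, and multi-cycle execute engines are deliberately ignored.

```c
#include <assert.h>

/* Instruction classes discussed in this section (hypothetical names). */
typedef enum {
    OEP_REG_TO_REG,
    OEP_EMBEDDED_LOAD,
    OEP_MOVE_L_LOAD,        /* optimized move.l <mem>y,Rx */
    OEP_STORE,
    OEP_READ_MODIFY_WRITE
} oep_class_t;

/* Basic execution time in cycles for one instruction of the given class. */
unsigned oep_base_cycles(oep_class_t c)
{
    switch (c) {
    case OEP_REG_TO_REG:        return 1u;
    case OEP_EMBEDDED_LOAD:     return 3u;
    case OEP_MOVE_L_LOAD:       return 2u;
    case OEP_STORE:             return 1u;
    case OEP_READ_MODIFY_WRITE: return 3u;
    }
    return 0u;   /* unknown class */
}

/* Sum of base cycles for a straight-line sequence of n instructions. */
unsigned sequence_base_cycles(const oep_class_t *seq, unsigned n)
{
    unsigned total = 0u;
    for (unsigned i = 0u; i < n; i++)
        total += oep_base_cycles(seq[i]);
    return total;
}
```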
[Timing diagram for Figure 2-15. Panels: Register-to-Register; Embedded-Load (operand read on the core bus);
Register-to-Memory (Store) (operand write on the core bus). Signals shown: core clock, OEP.DSOC, OEP.AGEX, core bus.]
2.3.2 Instruction Set Architecture (ISA_A+)
The original ColdFire Instruction Set Architecture (ISA_A) was derived from the M68000 family opcodes
based on extensive analysis of embedded application code. The ISA was optimized for code compiled
from high-level languages where the dominant operand size was the 32-bit integer declaration. This
approach minimized processor complexity and cost, while providing excellent performance for compiled
applications.
After the initial ColdFire compilers were created, developers noted there were certain ISA additions that
would enhance code density and overall performance. Additionally, as users implemented ColdFire-based
designs into a wide range of embedded systems, they found certain frequently-used instruction sequences
that could be improved by the creation of additional instructions.
The original ISA definition minimized support for instructions referencing byte- and word-sized operands.
Full support for the move byte and move word instructions was provided, but the only other opcodes
supporting these data types are CLR (clear) and TST (test). A set of instruction enhancements has been
implemented in subsequent ISA revisions, ISA_B and ISA_C. The new opcodes primarily addressed three
areas:
1. Enhanced support for byte and word-sized operands
2. Enhanced support for position-independent code
3. Miscellaneous instruction additions to address new functionality
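Three of the ISA_A+ additions summarized in Table 2-4 (BITREV, BYTEREV, and FF1) are pure bit manipulations on a 32-bit data register, so their results can be modeled in portable C. The functions below are reference sketches for checking expected values, with hypothetical names; they say nothing about encoding or timing. The FF1 result of 32 for an all-zero input is an assumption here and should be confirmed in the ColdFire Family Programmer’s Reference Manual.

```c
#include <assert.h>
#include <stdint.h>

/* BITREV: new Dn[31] = old Dn[0], ..., new Dn[0] = old Dn[31]. */
uint32_t model_bitrev(uint32_t d)
{
    uint32_t r = 0u;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (d & 1u);    /* shift lowest bit of d into r */
        d >>= 1;
    }
    return r;
}

/* BYTEREV: new Dn[31:24] = old Dn[7:0], ..., new Dn[7:0] = old Dn[31:24]. */
uint32_t model_byterev(uint32_t d)
{
    return (d << 24) | ((d & 0x0000FF00u) << 8) |
           ((d >> 8) & 0x0000FF00u) | (d >> 24);
}

/* FF1: offset from bit 31 of the first set bit, scanning from the msb.
 * ASSUMPTION: returns 32 when no bit is set. */
uint32_t model_ff1(uint32_t d)
{
    uint32_t offset = 0u;
    while (offset < 32u && !(d & 0x80000000u)) {
        d <<= 1;
        offset++;
    }
    return offset;
}
```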
Figure 2-15. V2 OEP Pipeline Execution Templates
Table 2-4 summarizes the instructions added to revision ISA_A to form revision ISA_A+. For more details,
see the ColdFire Family Programmer’s Reference Manual.
Table 2-4. Instruction Enhancements over Revision ISA_A
Instruction      Description
BITREV           The contents of the destination data register are bit-reversed; new Dn[31] equals old Dn[0], new
                 Dn[30] equals old Dn[1],..., new Dn[0] equals old Dn[31].
BYTEREV          The contents of the destination data register are byte-reversed; new Dn[31:24] equals old
                 Dn[7:0],..., new Dn[7:0] equals old Dn[31:24].
FF1              The data register, Dn, is scanned, beginning from the most-significant bit (Dn[31]) and ending
                 with the least-significant bit (Dn[0]), searching for the first set bit. The data register is then
                 loaded with the offset count from bit 31 where the first set bit appears.
Move from USP    USP → Destination register
Move to USP      Source register → USP
STLDSR           Pushes the contents of the status register onto the stack and then reloads the status register
                 with the immediate data value.
2.3.3 Exception Processing Overview
Exception processing for ColdFire processors is streamlined for performance. The ColdFire processors
differ from the M68000 family because they include:
• A simplified exception vector table
• Reduced relocation capabilities using the vector-base register
• A single exception stack frame format
• Use of separate system stack pointers for user and supervisor modes.
All ColdFire processors use an instruction restart exception model. However, Version 2 ColdFire
processors require more software support to recover from certain access errors. See Section 2.3.4.1,
“Access Error Exception” for details.
Exception processing includes all actions from fault condition detection to the initiation of fetch for the first
handler instruction. Exception processing comprises four major steps:
1. The processor makes an internal copy of the SR, then enters supervisor mode by setting the S
   bit and disables trace mode by clearing the T bit. The interrupt exception also forces the M bit to
   be cleared and the interrupt priority mask to be set to the current interrupt request level.
2. The processor determines the exception vector number. For all faults except interrupts, the
   processor performs this calculation based on exception type. For interrupts, the processor
   performs an interrupt-acknowledge (IACK) bus cycle to obtain the vector number from the
   interrupt controller. The IACK cycle is mapped to special locations within the interrupt
   controller’s address space with the interrupt level encoded in the address.
3. The processor saves the current context by creating an exception stack frame on the system stack.
   The exception stack frame is created at a 0-modulo-4 address on top of the system stack pointed to
   by the supervisor stack pointer (SSP). As shown in Figure 2-16, the processor uses a simplified
   fixed-length stack frame for all exceptions. The exception type determines whether the program
   counter placed in the exception stack frame defines the location of the faulting instruction (fault)
   or the address of the next instruction to be executed (next).
4. The processor calculates the address of the first instruction of the exception handler. By definition,
   the exception vector table is aligned on a 1 MB boundary. This instruction address is generated by
   fetching an exception vector from the table located at the address defined in the vector base register.
   The index into the exception table is calculated as (4 × vector number). After the exception vector
   has been fetched, the vector contents determine the address of the first instruction of the desired
   handler. After the instruction fetch for the first opcode of the handler has initiated, exception
   processing terminates and normal instruction processing continues in the handler.
All ColdFire processors support a 1024-byte vector table aligned on any 1 MB address boundary (see
Table 2-5).
The table contains 256 exception vectors; the first 64 are defined for the core and the remaining 192 are
device-specific peripheral interrupt vectors. See Chapter 10, “Interrupt Controller Modules” for details on
the device-specific interrupt sources.
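The status register changes made in the first step of exception processing can be modeled directly from the SR bit assignments in Section 2.2.9. The sketch below is illustrative only; the function and macro names are hypothetical, and it covers just the SR update (set S, clear T, and for interrupts clear M and load the interrupt mask), not the stacking of the saved SR copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SR_T        0x8000u   /* bit 15: trace enable */
#define SR_S        0x2000u   /* bit 13: supervisor state */
#define SR_M        0x1000u   /* bit 12: master/interrupt state */
#define SR_I_MASK   0x0700u   /* bits 10-8: interrupt level mask */
#define SR_I_SHIFT  8

/* Return the SR value in effect after exception entry.
 * level is the interrupt request level (1..7) and is used only when
 * is_interrupt is true. */
uint16_t sr_on_exception_entry(uint16_t sr, bool is_interrupt, unsigned level)
{
    uint16_t next = (uint16_t)((sr | SR_S) & ~SR_T);    /* S = 1, T = 0 */

    if (is_interrupt) {
        assert(level >= 1u && level <= 7u);
        next = (uint16_t)(next & ~(SR_M | SR_I_MASK));  /* M = 0, clear mask */
        next = (uint16_t)(next | (level << SR_I_SHIFT));/* mask = level */
    }
    return next;
}
```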
Table 2-5. Exception Vector Assignments
Vector       Vector          Stacked Program   Assignment
Number(s)    Offset (Hex)    Counter
0            0x000           —                 Initial supervisor stack pointer
1            0x004           —                 Initial program counter
2            0x008           Fault             Access error
3            0x00C           Fault             Address error
4            0x010           Fault             Illegal instruction
5            0x014           Fault             Divide by zero
Fault refers to the PC of the instruction that caused the exception. Next refers to the PC
of the instruction that follows the instruction that caused the fault.
All ColdFire processors inhibit interrupt sampling during the first instruction of all exception handlers.
This allows any handler to disable interrupts effectively, if necessary, by raising the interrupt mask level
contained in the status register. In addition, the ISA_A+ architecture includes an instruction (STLDSR)
that stores the current interrupt mask level and loads a value into the SR. This instruction is specifically
intended for use as the first instruction of an interrupt service routine that services multiple interrupt
requests with different interrupt levels. For more details, see ColdFire Family Programmer’s Reference Manual.
2.3.3.1 Exception Stack Frame Definition
Figure 2-16 shows the exception stack frame. The first longword contains the 16-bit format/vector word (F/V)
and the 16-bit status register, and the second longword contains the 32-bit program counter address.
The 16-bit format/vector word contains three unique fields:
• A 4-bit format field at the top of the system stack is always written with a value of 4, 5, 6, or 7 by
the processor, indicating a two-longword frame format. See Table 2-6.
• A 4-bit fault status field, FS[3:0], at the top of the system stack. This field is defined for
access and address errors only and is written as zeros for all other exceptions. See Table 2-7.
• The 8-bit vector number, vector[7:0], defines the exception type. The processor calculates it
for all internal faults; for an interrupt, it is the value supplied by the interrupt controller.
See Table 2-5.

Table 2-7. Fault Status Encodings
FS[3:0]   Definition
00xx      Reserved
0100      Error on instruction fetch
0101      Reserved
011x      Reserved
1000      Error on operand write
1001      Attempted write to write-protected space
101x      Reserved
1100      Error on operand read
1101      Reserved
111x      Reserved
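As an illustration of how a handler might pick these fields apart, the sketch below decodes the first longword of the frame. The bit positions are an assumption taken from the usual ColdFire frame layout (Figure 2-16 is not reproduced in this text): format in bits 31-28, FS[3:2] in bits 27-26, vector in bits 25-18, FS[1:0] in bits 17-16, and the status register in bits 15-0. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint8_t  format;  /* 4-bit format field; 4..7 for a valid frame */
    uint8_t  fs;      /* 4-bit fault status, FS[3:0] */
    uint8_t  vector;  /* 8-bit vector number */
    uint16_t sr;      /* stacked status register */
} frame_word_t;

/* Split the first longword of an exception stack frame into its fields.
 * Assumed layout: [31:28] format, [27:26] FS[3:2], [25:18] vector,
 * [17:16] FS[1:0], [15:0] SR. */
static inline frame_word_t decode_frame_word(uint32_t w)
{
    frame_word_t f;
    f.format = (uint8_t)((w >> 28) & 0xFu);
    f.fs     = (uint8_t)((((w >> 26) & 0x3u) << 2) | ((w >> 16) & 0x3u));
    f.vector = (uint8_t)((w >> 18) & 0xFFu);
    f.sr     = (uint16_t)(w & 0xFFFFu);
    return f;
}
```

Under these assumptions, the longword 0x48082700 decodes to format 4, FS 0b1000 (error on operand write), vector 2 (access error), and a stacked SR of 0x2700.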
2.3.4 Processor Exceptions
2.3.4.1 Access Error Exception
The exact processor response to an access error depends on the memory reference being performed. For
an instruction fetch, the processor postpones the error reporting until the faulted reference is needed by an
instruction for execution. Therefore, faults during instruction prefetches followed by a change of
instruction flow do not generate an exception. When the processor attempts to execute an instruction with
a faulted opword and/or extension words, the access error is signaled and the instruction aborted. For this
type of exception, the programming model has not been altered by the instruction generating the access
error.
If the access error occurs on an operand read, the processor immediately aborts the current instruction’s
execution and initiates exception processing. In this situation, any address register updates attributable to
the auto-addressing modes, (for example, (An)+,-(An)), have already been performed, so the programming
model contains the updated An value. In addition, if an access error occurs during a MOVEM instruction
loading from memory, any registers already updated before the fault occurs contain the operands from
memory.
The V2 ColdFire processor uses an imprecise reporting mechanism for access errors on operand writes.
Because the actual write cycle may be decoupled from the processor’s issuing of the operation, the
signaling of an access error appears to be decoupled from the instruction that generated the write.
Accordingly, the PC contained in the exception stack frame merely represents the location in the program
when the access error was signaled. All programming model updates associated with the write instruction
are completed. The NOP instruction can collect access errors for writes. This instruction delays its
execution until all previous operations, including all pending write operations, are complete. If any
previous write terminates with an access error, it is guaranteed to be reported on the NOP instruction.
2.3.4.2 Address Error Exception
Any attempted execution transferring control to an odd instruction address (that is, bit 0 of the target
address is set) results in an address error exception.
Any attempted use of a word-sized index register (Xn.w) or a scale factor of eight on an indexed effective
addressing mode generates an address error, as does an attempted execution of a full-format indexed
addressing mode, which is defined by bit 8 of extension word 1 being set.
If an address error occurs on a JSR instruction, the Version 2 ColdFire processor calculates the target
address and then pushes the return address onto the stack. If an address error occurs on an RTS instruction,
the Version 2 ColdFire processor overwrites the faulting return PC with the address error stack frame.
2.3.4.3 Illegal Instruction Exception
The ColdFire variable-length instruction set architecture supports three instruction sizes: 16, 32, or 48 bits.
The first instruction word is known as the operation word (or opword), while the optional words are known
as extension word 1 and extension word 2. The opword is further subdivided into three sections: the upper
four bits segment the entire ISA into 16 instruction lines, the next 6 bits define the operation mode
(opmode), and the low-order 6 bits define the effective address. See Figure 2-17. The opword line
definition is shown in Table 2-8.
Bits     15–12    11–6      5–3     2–0
Field    Line     OpMode    Effective Address
                            Mode    Register
Figure 2-17. ColdFire Instruction Operation Word (Opword) Format

Table 2-8. ColdFire Opword Line Definition
Opword[Line]   Instruction Class
0x0            Bit manipulation, Arithmetic and Logical Immediate
0x1            Move Byte
0x2            Move Long
0x3            Move Word
0x4            Miscellaneous
0x5            Add (ADDQ) and Subtract Quick (SUBQ), Set according to Condition Codes (Scc)
0x6            PC-relative change-of-flow instructions:
               Conditional (Bcc) and unconditional (BRA) branches, subroutine calls (BSR)
0x7            Move Quick (MOVEQ), Move with sign extension (MVS) and zero fill (MVZ)
0x8            Logical OR (OR)
0x9            Subtract (SUB), Subtract Extended (SUBX)
0xA            EMAC, Move 3-bit Quick (MOV3Q)
0xB            Compare (CMP), Exclusive-OR (EOR)
0xC            Logical AND (AND), Multiply Word (MUL)
0xD            Add (ADD), Add Extended (ADDX)
0xE            Arithmetic and logical shifts (ASL, ASR, LSL, LSR)
In the original M68000 ISA definition, lines A and F were effectively reserved for user-defined operations
(line A) and co-processor instructions (line F). Accordingly, there are two unique exception vectors
associated with illegal opwords in these two lines.
Any attempted execution of an illegal 16-bit opcode (except for line-A and line-F opcodes) generates an
illegal instruction exception (vector 4). Additionally, any attempted execution of any non-MAC line-A or
most line-F opcodes generates the corresponding unique exception type, vector number 10 or 11, respectively.
ColdFire cores do not provide illegal instruction detection on the extension words on any instruction,
including MOVEC.
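The opword fields described in this section can be extracted with shifts and masks. The sketch below is illustrative only: the type and function names are hypothetical, and the split of the effective address into a 3-bit mode (bits 5-3) and a 3-bit register (bits 2-0) is an assumption based on the Mode/Register labels of Figure 2-17. It does not decide whether an opcode is legal; it only identifies the line.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t line;     /* bits 15-12: one of 16 instruction lines */
    uint8_t opmode;   /* bits 11-6: operation mode */
    uint8_t ea_mode;  /* bits 5-3: effective address mode (assumed split) */
    uint8_t ea_reg;   /* bits 2-0: effective address register (assumed split) */
} opword_fields_t;

static inline opword_fields_t decode_opword(uint16_t opword)
{
    opword_fields_t f;
    f.line    = (uint8_t)((opword >> 12) & 0xFu);
    f.opmode  = (uint8_t)((opword >> 6) & 0x3Fu);
    f.ea_mode = (uint8_t)((opword >> 3) & 0x7u);
    f.ea_reg  = (uint8_t)(opword & 0x7u);
    return f;
}

/* Line-A opwords have bits 15-12 equal to 0b1010. */
static inline bool is_line_a(uint16_t opword)
{
    return ((opword >> 12) & 0xFu) == 0xAu;
}

/* Line-F opwords have bits 15-12 equal to 0b1111. */
static inline bool is_line_f(uint16_t opword)
{
    return ((opword >> 12) & 0xFu) == 0xFu;
}
```

A software emulator or disassembler could use the line value as the first-level dispatch index into Table 2-8.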
2.3.4.4 Divide-By-Zero
Attempting to divide by zero causes an exception (vector 5, offset 0x014).
2.3.4.5 Privilege Violation
The attempted execution of a supervisor mode instruction while in user mode generates a privilege
violation exception. See ColdFire Programmer’s Reference Manual for a list of supervisor-mode
instructions.
There is one special case involving the HALT instruction. Normally, this opcode is a supervisor mode
instruction, but if the debug module's CSR[UHE] is set, then this instruction can also be executed in
user mode for debugging purposes.
2.3.4.6 Trace Exception
To aid in program development, all ColdFire processors provide an instruction-by-instruction tracing
capability. While in trace mode, indicated by setting of the SR[T] bit, the completion of an instruction
execution (for all but the stop instruction) signals a trace exception. This functionality allows a debugger
to monitor program execution.
The stop instruction has the following effects:
1. The instruction before the stop executes and then generates a trace exception. In the exception stack
frame, the PC points to the stop opcode.
2. When the trace handler is exited, the stop instruction executes, loading the SR with the immediate
operand from the instruction.
3. The processor then generates a trace exception. The PC in the exception stack frame points to the
instruction after the stop, and the SR reflects the value loaded in the previous step.
If the processor is not in trace mode and executes a stop instruction where the immediate operand sets
SR[T], hardware loads the SR and generates a trace exception. The PC in the exception stack frame points
to the instruction after the stop, and the SR reflects the value loaded in step 2.
Because ColdFire processors do not support any hardware stacking of multiple exceptions, it is the
responsibility of the operating system to check for trace mode after processing other exception types. As
an example, consider a TRAP instruction execution while in trace mode. The processor initiates the trap
exception and then passes control to the corresponding handler. If the system requires that a trace exception
be processed, it is the responsibility of the trap exception handler to check for this condition (SR[T] in the
exception stack frame set) and pass control to the trace handler before returning from the original
exception.
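The check described above can be sketched as a test of the stacked status register. This is only an outline under two assumptions that this section does not state: that the stacked SR occupies the low 16 bits of the first frame longword, and that SR[T] is bit 15 of the SR. The macro and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define SR_T_BIT 0x8000u  /* assumption: SR[T] is bit 15 of the status register */

/* Returns true if the SR saved in the exception stack frame had trace mode
 * enabled, meaning the handler should pass control to the trace handler
 * before returning from the original exception.
 * frame_word0 is the first longword of the frame (format/vector word + SR). */
static inline bool stacked_trace_pending(uint32_t frame_word0)
{
    uint16_t stacked_sr = (uint16_t)(frame_word0 & 0xFFFFu);
    return (stacked_sr & SR_T_BIT) != 0u;
}
```

A trap handler would typically call such a check just before its exit path.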
2.3.4.7 Unimplemented Line-A Opcode
A line-A opcode is defined when bits 15-12 of the opword are 0b1010. This exception is generated by the
attempted execution of an undefined line-A opcode.
2.3.4.8 Unimplemented Line-F Opcode
A line-F opcode is defined when bits 15-12 of the opword are 0b1111. This exception is generated when
attempting to execute an undefined line-F opcode.
2.3.4.9 Debug Interrupt
See Chapter 30, “Debug Support,” for a detailed explanation of this exception, which is generated in
response to a hardware breakpoint register trigger. The processor does not generate an IACK cycle, but
rather calculates the vector number internally (vector number 12). Additionally, SR[M,I] are unaffected by
the interrupt.
2.3.4.10 RTE and Format Error Exception
When an RTE instruction is executed, the processor first examines the 4-bit format field to validate the
frame type. For a ColdFire core, any attempted RTE execution where the format is not equal to {4,5,6,7}
generates a format error. The exception stack frame for the format error is created without disturbing the
original RTE frame, and the stacked PC points to the RTE instruction.
The selection of the format value provides some limited debug support for porting code from M68000
applications. On M68000 family processors, the SR was located at the top of the stack. On those
processors, bit 30 of the longword addressed by the system stack pointer is typically zero. Thus, if an RTE
is attempted using this old format, it generates a format error on a ColdFire processor.
If the format field defines a valid type, the processor: (1) reloads the SR operand, (2) fetches the second
longword operand, (3) adjusts the stack pointer by adding the format value to the auto-incremented address
after the fetch of the first longword, and then (4) transfers control to the instruction address defined by the
second longword operand within the stack frame.
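The format check and the stack pointer adjustment in step (3) can be written out as follows. This is a minimal model of the arithmetic only, not of the processor: the function name and the out-parameter convention are hypothetical, and frame_addr stands for the stack pointer value when RTE begins.

```c
#include <stdint.h>
#include <stdbool.h>

/* Models the RTE stack pointer adjustment.
 * Returns false if the format field is not 4, 5, 6, or 7 (the processor
 * would take a format error instead). Otherwise writes the restored stack
 * pointer to *sp_out and returns true. */
static inline bool rte_restored_sp(uint32_t frame_addr, uint8_t format,
                                   uint32_t *sp_out)
{
    if (format < 4u || format > 7u) {
        return false;
    }
    /* After the first longword is fetched, the address has auto-incremented
     * by 4; the format value is then added to that address. */
    *sp_out = (frame_addr + 4u) + (uint32_t)format;
    return true;
}
```

With a frame at 0x1000, format 4 restores the stack pointer to 0x1008 (just past the two-longword frame), while format 7 restores it to 0x100B.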
2.3.4.11 TRAP Instruction Exception
The TRAP #n instruction always forces an exception as part of its execution and is useful for implementing
system calls. The TRAP instruction may be used to change from user to supervisor mode.
2.3.4.12 Unsupported Instruction Exception
If execution of a valid instruction is attempted but the required hardware is not present in the processor, an
unsupported instruction exception is generated. The instruction functionality can then be emulated in the
exception handler, if desired.
All ColdFire cores record the processor hardware configuration in the D0 register immediately after the
negation of RESET. See Section 2.3.4.15, “Reset Exception,” for details.
2.3.4.13 Interrupt Exception
Interrupt exception processing includes interrupt recognition and the fetch of the appropriate vector from
the interrupt controller using an IACK cycle. See the interrupt controller chapter for details.
2.3.4.14 Fault-on-Fault Halt
If a ColdFire processor encounters any type of fault during the exception processing of another fault, the
processor immediately halts execution with the catastrophic fault-on-fault condition. A reset is required
to exit this state.
2.3.4.15 Reset Exception
Asserting the reset input signal (RESET) to the processor causes a reset exception. The reset exception has
the highest priority of any exception; it provides for system initialization and recovery from catastrophic
failure. Reset also aborts any processing in progress when the reset input is recognized. Processing cannot
be recovered.
The reset exception places the processor in the supervisor mode by setting the SR[S] bit and disables
tracing by clearing the SR[T] bit. This exception also clears the SR[M] bit and sets the processor’s SR[I]
field to the highest level (level 7, 0b111). Next, the VBR is initialized to zero (0x0000_0000). The control
registers specifying the operation of any memories (e.g., cache and/or RAM modules) connected directly
to the processor are disabled.
NOTE
Other implementation-specific registers are also affected. Refer to each
module in this reference manual for details on these registers.
After the processor is granted the bus, it performs two longword read-bus cycles. The first longword at
address 0x0000_0000 is loaded into the supervisor stack pointer and the second longword at address
0x0000_0004 is loaded into the program counter. After the initial instruction is fetched from memory,
program execution begins at the address in the PC. If an access error or address error occurs before the first
instruction is executed, the processor enters the fault-on-fault state.
ColdFire processors load hardware configuration information into the D0 and D1 general-purpose
registers after system reset. The hardware configuration information is loaded immediately after the
reset-in signal is negated. This allows an emulator to read out the contents of these registers via the BDM
to determine the hardware configuration.
Information loaded into D0 defines the processor hardware configuration as shown in Figure 2-18.

BDM: Load: 0x080 (D0), Store: 0x180 (D0)
Access: User read-only, BDM read-only

Bits     31–24        23–20    19–16
Field    PF           VER      REV
Reset    1100_1111    0010     0000

Bits     15     14     13      12     11–8    7–4     3–0
Field    MAC    DIV    EMAC    FPU    0000    ISA     DEBUG
Reset    0      1      1       0      0000    1000    0000

Figure 2-18. D0 Hardware Configuration Info

Table 2-9. D0 Hardware Configuration Info Field Description
Field        Description
31–24 PF     Processor family. This field is fixed to a hex value of 0xCF indicating a ColdFire core is present.
23–20 VER    ColdFire core version number. Defines the hardware microarchitecture version of ColdFire core.
             0001 V1 ColdFire core
             0010 V2 ColdFire core (This is the value used for this device.)
             0011 V3 ColdFire core
             0100 V4 ColdFire core
             0101 V5 ColdFire core
             Else Reserved for future use
19–16 REV    Processor revision number. The default is 0b0000.
15 MAC       MAC present. This bit signals if the optional multiply-accumulate (MAC) execution engine is
             present in processor core.
             0 MAC execute engine not present in core. (This is the value used for this device.)
             1 MAC execute engine is present in core.
14 DIV       Divide present. This bit signals if the hardware divider (DIV) is present in the processor core.
             0 Divide execute engine not present in core.
             1 Divide execute engine is present in core. (This is the value used for this device.)
13 EMAC      EMAC present. This bit signals if the optional enhanced multiply-accumulate (EMAC) execution
             engine is present in processor core.
             0 EMAC execute engine not present in core.
             1 EMAC execute engine is present in core. (This is the value used for this device.)
12 FPU       FPU present. This bit signals if the optional floating-point (FPU) execution engine is present
             in processor core.
             0 FPU execute engine not present in core. (This is the value used for this device.)
             1 FPU execute engine is present in core.
11–8         Reserved.
7–4 ISA      ISA revision. Defines the instruction-set architecture (ISA) revision level implemented in
             ColdFire processor core.
             0000 ISA_A
             0001 ISA_B
             0010 ISA_C
             1000 ISA_A+ (This is the value used for this device.)
             Else Reserved
3–0 DEBUG    Debug module revision number. Defines revision level of the debug module used in the ColdFire
             processor core.
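The D0 fields of Table 2-9 can be unpacked with shifts and masks, for example by startup code that saves D0 before it is overwritten. The sketch below is illustrative; the structure and function names are hypothetical, and only the bit positions come from the table.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint8_t pf;     /* bits 31-24: processor family, 0xCF for ColdFire */
    uint8_t ver;    /* bits 23-20: core version */
    uint8_t rev;    /* bits 19-16: processor revision */
    bool    mac;    /* bit 15: MAC present */
    bool    div;    /* bit 14: hardware divider present */
    bool    emac;   /* bit 13: EMAC present */
    bool    fpu;    /* bit 12: FPU present */
    uint8_t isa;    /* bits 7-4: ISA revision */
    uint8_t debug;  /* bits 3-0: debug module revision */
} d0_config_t;

/* Unpack the hardware configuration word loaded into D0 at reset. */
static inline d0_config_t decode_d0(uint32_t d0)
{
    d0_config_t c;
    c.pf    = (uint8_t)((d0 >> 24) & 0xFFu);
    c.ver   = (uint8_t)((d0 >> 20) & 0xFu);
    c.rev   = (uint8_t)((d0 >> 16) & 0xFu);
    c.mac   = ((d0 >> 15) & 1u) != 0u;
    c.div   = ((d0 >> 14) & 1u) != 0u;
    c.emac  = ((d0 >> 13) & 1u) != 0u;
    c.fpu   = ((d0 >> 12) & 1u) != 0u;
    c.isa   = (uint8_t)((d0 >> 4) & 0xFu);
    c.debug = (uint8_t)(d0 & 0xFu);
    return c;
}
```

Applied to the reset value shown in Figure 2-18 (0xCF20_6080), this yields PF 0xCF, VER 2, DIV and EMAC present, MAC and FPU absent, and ISA 0b1000 (ISA_A+).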
Information loaded into D1 defines the local memory hardware configuration as shown in the figure below.

BDM: Load: 0x1 (D1), Store: 0x1 (D1)
Access: User read-only, BDM read-only

Bits     31–30    29–28    27–24    23–19      18–16
Field    CLSZ     CCAS     CCSZ     FLASHSZ    000
Reset    00       01       0011     10110      000

Bits     15–14    13–8          7–3       2–0
Field    MBSZ     UCAS, 0000    SRAMSZ    000
Reset    00       010000        10000     000

Figure 2-19. D1 Hardware Configuration Info

Table 2-10. D1 Hardware Configuration Information Field Description
Field            Description
31–30 CLSZ       Cache line size. This field is fixed to a hex value of 0x0 indicating a 16-byte cache line size.
29–28 CCAS       Configurable cache associativity.
                 00 Four-way
                 01 Direct mapped (This is the value used for this device)
                 Else Reserved for future use
27–24 CCSZ       Configurable cache size. Indicates the amount of instruction/data cache. The cache
                 configuration options available are 50% instruction/50% data, 100% instruction, or
                 100% data, and are specified in the CACR register.
                 0000 No configurable cache
                 0001 512B configurable cache
                 0010 1KB configurable cache
                 0011 2KB configurable cache (This is the value used for this device)
                 0100 4KB configurable cache
                 0101 8KB configurable cache
                 0110 16KB configurable cache
                 0111 32KB configurable cache
                 Else Reserved
23–19 FLASHSZ    Flash bank size.
                 0000-0111 No flash
                 1000 64-KB flash
                 1001 128-KB flash
                 1010 256-KB flash
                 1011 512-KB flash (This is the value used for this device)
                 Else Reserved for future use.
18–16            Reserved
15–14 MBSZ       Bus size. Defines the width of the ColdFire master bus datapath.
                 00 32-bit system bus datapath (This is the value used for this device)
                 01 64-bit system bus datapath
                 Else Reserved
13–8             Reserved, resets to 0b010000
7–3 SRAMSZ       SRAM bank size.
                 00000 No SRAM
                 00010 512 bytes
                 00100 1 KB
                 00110 2 KB
                 01000 4 KB
                 01010 8 KB
                 01100 16 KB
                 01110 32 KB
                 10000 64 KB (This is the value used for this device)
                 10010 128 KB
                 Else Reserved for future use
2–0              Reserved.
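A similar sketch applies to D1. The names below are hypothetical. The size helpers encode the patterns visible in Table 2-10 (each CCSZ step doubles the cache size starting at 512 bytes; the listed SRAMSZ codes are even and double in the same way). FLASHSZ is returned as the raw 5-bit field because the table lists 4-bit encodings for a 5-bit field, and how the two line up is not stated here.

```c
#include <stdint.h>

typedef struct {
    uint8_t clsz;     /* bits 31-30: cache line size code */
    uint8_t ccas;     /* bits 29-28: cache associativity code */
    uint8_t ccsz;     /* bits 27-24: configurable cache size code */
    uint8_t flashsz;  /* bits 23-19: flash bank size, raw field */
    uint8_t mbsz;     /* bits 15-14: bus size code */
    uint8_t sramsz;   /* bits 7-3: SRAM bank size code */
} d1_config_t;

static inline d1_config_t decode_d1(uint32_t d1)
{
    d1_config_t c;
    c.clsz    = (uint8_t)((d1 >> 30) & 0x3u);
    c.ccas    = (uint8_t)((d1 >> 28) & 0x3u);
    c.ccsz    = (uint8_t)((d1 >> 24) & 0xFu);
    c.flashsz = (uint8_t)((d1 >> 19) & 0x1Fu);
    c.mbsz    = (uint8_t)((d1 >> 14) & 0x3u);
    c.sramsz  = (uint8_t)((d1 >> 3) & 0x1Fu);
    return c;
}

/* Cache size in bytes for CCSZ codes 0001..0111; 0 for none or reserved. */
static inline uint32_t cache_bytes(uint8_t ccsz)
{
    return (ccsz >= 1u && ccsz <= 7u) ? (256u << ccsz) : 0u;
}

/* SRAM size in bytes for the listed even codes 00010..10010;
 * 0 for no SRAM or reserved codes. */
static inline uint32_t sram_bytes(uint8_t sramsz)
{
    if (sramsz < 2u || sramsz > 18u || (sramsz & 1u) != 0u) {
        return 0u;
    }
    return 256u << (sramsz >> 1);
}
```

For the reset value in Figure 2-19 (0x13B0_1080), this gives a direct-mapped 2-KB configurable cache, a 32-bit bus, and 64 KB of SRAM.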
2.3.5 Instruction Execution Timing
This section presents processor instruction execution times in terms of processor-core clock cycles. The
number of operand references for each instruction is enclosed in parentheses following the number of
processor clock cycles. Each timing entry is presented as C(R/W) where:
• C is the number of processor clock cycles, including all applicable operand fetches and writes, and
all internal core cycles required to complete the instruction execution.
• R/W is the number of operand reads (R) and writes (W) required by the instruction. An operation
performing a read-modify-write function is denoted as (1/1).
This section includes the assumptions concerning the timing values and the execution time details.
2.3.5.1 Timing Assumptions
For the timing data presented in this section, these assumptions apply:
1. The OEP is loaded with the opword and all required extension words at the beginning of each
instruction execution. This implies that the OEP does not wait for the IFP to supply opwords and/or
extension words.
2. The OEP does not experience any sequence-related pipeline stalls. The most common example of
stall involves consecutive store operations, excluding the MOVEM instruction. For all STORE
operations (except MOVEM), certain hardware resources within the processor are marked as busy
for two clock cycles after the final decode and select/operand fetch cycle (DSOC) of the store
instruction. If a subsequent STORE instruction is encountered within this 2-cycle window, it is
stalled until the resource again becomes available. Thus, the maximum pipeline stall involving
consecutive STORE operations is two cycles. The MOVEM instruction uses a different set of
resources and this stall does not apply.
3. The OEP completes all memory accesses without any stall conditions caused by the memory itself.
Thus, the timing details provided in this section assume that an infinite zero-wait state memory is
attached to the processor core.
4. All operand data accesses are aligned on the same byte boundary as the operand size; for example,
16-bit operands aligned on 0-modulo-2 addresses, 32-bit operands aligned on 0-modulo-4
addresses.
The processor core decomposes misaligned operand references into a series of aligned accesses as
shown in Table 2-11.
Table 2-11. Misaligned Operand References
address[1:0]   Size   Bus Operations      Additional C(R/W)
01 or 11       Word   Byte, Byte          2(1/0) if read, 1(0/1) if write
01 or 11       Long   Byte, Word, Byte    3(2/0) if read, 2(0/2) if write
10             Long   Word, Word          2(1/0) if read, 1(0/1) if write
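Table 2-11 can be expressed as a small lookup that returns the additional C(R/W) cost of a misaligned operand access. The sketch below is a direct transcription of the table for use in cycle estimates; the type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum { SIZE_BYTE = 1, SIZE_WORD = 2, SIZE_LONG = 4 } op_size_t;

typedef struct {
    int cycles;  /* additional processor clock cycles, C */
    int reads;   /* additional operand reads, R */
    int writes;  /* additional operand writes, W */
} penalty_t;

/* Additional C(R/W) for an operand access, per Table 2-11.
 * Aligned accesses (and all byte accesses) incur no penalty. */
static inline penalty_t misaligned_penalty(uint32_t addr, op_size_t size,
                                           bool is_write)
{
    penalty_t p = {0, 0, 0};
    uint32_t a = addr & 3u;  /* address[1:0] */

    if (size == SIZE_WORD && (a & 1u) != 0u) {
        /* Byte, Byte */
        if (is_write) { p.cycles = 1; p.writes = 1; }
        else          { p.cycles = 2; p.reads = 1; }
    } else if (size == SIZE_LONG && (a & 1u) != 0u) {
        /* Byte, Word, Byte */
        if (is_write) { p.cycles = 2; p.writes = 2; }
        else          { p.cycles = 3; p.reads = 2; }
    } else if (size == SIZE_LONG && a == 2u) {
        /* Word, Word */
        if (is_write) { p.cycles = 1; p.writes = 1; }
        else          { p.cycles = 2; p.reads = 1; }
    }
    return p;
}
```

For example, a longword read at an address ending in 0b01 adds 3(2/0) to the base timing of the instruction.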
2.3.5.2 MOVE Instruction Execution Times
Table 2-12 lists execution times for MOVE.{B,W} instructions; Table 2-13 lists timings for MOVE.L.
NOTE
For all tables in this section, the execution time of any instruction using the
PC-relative effective addressing modes is the same for the comparable
An-relative mode.
ET with {<ea> = (d16,PC)} equals ET with {<ea> = (d16,An)}
ET with {<ea> = (d8,PC,Xi*SF)} equals ET with {<ea> = (d8,An,Xi*SF)}
The nomenclature xxx.wl refers to both forms of absolute addressing, xxx.w
and xxx.l.
Storing an accumulator requires one additional processor clock cycle when saturation is enabled, or fractional
rounding is performed (MACSR[7:4] equals 1---, -11-, --11)
NOTE
The execution times for moving the contents of the Racc, Raccext[01,23],
MACSR, or Rmask into a destination location <ea>x shown in this table
represent the best-case scenario when the store instruction is executed and
there are no load or M{S}AC instructions in the EMAC execution pipeline.
In general, these store operations require only a single cycle for execution,
but if preceded immediately by a load, MAC, or MSAC instruction, the
depth of the EMAC pipeline is exposed and the execution time is four
cycles.
2.3.5.7 Branch Instruction Execution Times
Table 2-18. General Branch Instruction Execution Times
Enhanced Multiply-Accumulate Unit (EMAC)
3.1 Introduction
This chapter describes the functionality, microarchitecture, and performance of the enhanced
multiply-accumulate (EMAC) unit in the ColdFire family of processors.
3.1.1 Overview
The EMAC design provides a set of DSP operations that can improve the performance of embedded code
while supporting the integer multiply instructions of the baseline ColdFire architecture.
The MAC provides functionality in three related areas:
1. Signed and unsigned integer multiplication
2. Multiply-accumulate operations supporting signed and unsigned integer operands as well as
signed, fixed-point, and fractional operands
3. Miscellaneous register operations
The ColdFire family supports two MAC implementations with different performance levels and
capabilities. The original MAC features a three-stage execution pipeline optimized for 16-bit operands,
with a 16x16 multiply array and a single 32-bit accumulator. The EMAC features a four-stage pipeline
optimized for 32-bit operands, with a fully pipelined 32 × 32 multiply array and four 48-bit accumulators.
The first ColdFire MAC supported signed and unsigned integer operands and was optimized for 16x16
operations, such as those found in applications including servo control and image compression. As
ColdFire-based systems proliferated, the desire for more precision on input operands increased. The result
was an improved ColdFire MAC with user-programmable control to optionally enable use of fractional
input operands.
EMAC improvements target three primary areas:
• Improved performance of 32 × 32 multiply operation
• Addition of three more accumulators to minimize MAC pipeline stalls caused by exchanges
between the accumulator and the pipeline’s general-purpose registers
• A 48-bit accumulation data path to allow a 40-bit product, plus 8 extension bits to increase the
dynamic number range when implementing signal processing algorithms
The three areas of functionality are addressed in detail in following sections. The logic required to support
this functionality is contained in a MAC module (Figure 3-1).
Figure 3-1 (MAC module; diagram not reproduced). Labeled elements: Operand Y, Operand X, multiplier (X),
shift (0, 1, -1), add/subtract (+/-), and accumulator(s).

The MAC is an extension of the basic multiplier in most microprocessors. It is typically implemented in
hardware within an architecture and supports rapid execution of signal processing algorithms in fewer
cycles than comparable non-MAC architectures. For example, small digital filters can tolerate some
variance in an algorithm’s execution time, but larger, more complicated algorithms such as orthogonal
transforms may have more demanding speed requirements beyond the scope of any processor architecture and
may require full DSP implementation.
To balance speed, size, and functionality, the ColdFire MAC is optimized for a small set of operations that
involve multiplication and cumulative additions. Specifically, the multiplier array is optimized for
single-cycle pipelined operations with a possible accumulation after product generation. This functionality
is common in many signal processing applications. The ColdFire core architecture is also modified to
allow an operand to be fetched in parallel with a multiply, increasing overall performance for certain DSP
operations.
Consider a typical filtering operation where the filter is defined as in Equation 3-1.

$$y(i) = \sum_{k=1}^{N-1} a(k)\,y(i-k) + \sum_{k=0}^{N-1} b(k)\,x(i-k)$$
Eqn. 3-1

Here, the output y(i) is determined by past output values and past input values. This is the general form of
an infinite impulse response (IIR) filter. A finite impulse response (FIR) filter can be obtained by setting
coefficients a(k) to zero. In either case, the operations involved in computing such a filter are multiplies
and product summing. To show this point, reduce Equation 3-1 to a simple, four-tap FIR filter, shown in
Equation 3-2, in which the accumulated sum is a sum of products of past data values and coefficients.

$$y(i) = \sum_{k=0}^{3} b(k)\,x(i-k) = b(0)x(i) + b(1)x(i-1) + b(2)x(i-2) + b(3)x(i-3)$$
Eqn. 3-2
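The four-tap FIR filter of Equation 3-2 maps onto a loop of multiply-accumulate steps. The sketch below is plain portable C intended only to show the arithmetic; it does not use EMAC instructions, and the function name, 16-bit sample type, and 32-bit accumulator are illustrative assumptions.

```c
#include <stdint.h>
#include <stddef.h>

#define FIR_TAPS 4

/* Computes y(i) = b(0)x(i) + b(1)x(i-1) + b(2)x(i-2) + b(3)x(i-3).
 * The caller must ensure x[i-3] through x[i] are valid samples (i >= 3). */
static inline int32_t fir4(const int16_t b[FIR_TAPS], const int16_t *x, size_t i)
{
    int32_t acc = 0;
    for (size_t k = 0; k < FIR_TAPS; ++k) {
        /* One multiply-accumulate step per tap. */
        acc += (int32_t)b[k] * (int32_t)x[i - k];
    }
    return acc;
}
```

Each loop iteration is one multiply followed by one accumulation, which is the operation the MAC hardware is designed to pipeline.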
3.2 Memory Map/Register Definition
The following table and sections explain the MAC registers:
The values listed in this column represent the Rc field used when accessing the core registers via the BDM port. For more
information see Chapter 43, “Debug Module.”
Register
Width
(bits)
AccessReset ValueSection/Page
3.2.1 MAC Status Register (MACSR)
The MAC status register (MACSR) contains a 4-bit operational mode field and condition flags.
Operational mode bits control whether operands are signed or unsigned and whether they are treated as
integers or fractions. These bits also control the overflow/saturation mode and the way in which rounding
is performed. Negative, zero, and multiple overflow condition flags are also provided.
Table 3-2. MACSR Field Descriptions
Field        Description
11–8 PAVn    Product/accumulation overflow flags. Contains four flags, one per accumulator, that indicate if
             past MAC or MSAC instructions generated an overflow during product calculation or the 48-bit
             accumulation. When a MAC or MSAC instruction is executed, the PAVn flag associated with the
             destination accumulator forms the general overflow flag, MACSR[V]. Once set, each flag remains
             set until V is cleared by a move.l, MACSR instruction or the accumulator is loaded directly.
             Bit 11: Accumulator 3
             ...
             Bit 8: Accumulator 0
7 OMC        Overflow saturation mode. Enables or disables saturation mode on overflow. If set, the
             accumulator is set to the appropriate constant (see S/U field description) on any operation
             that overflows the accumulator. After saturation, the accumulator remains unaffected by any
             other MAC or MSAC instructions until the overflow bit is cleared or the accumulator is
             directly loaded.
6 S/U        Signed/unsigned operations.
             In integer mode:
             S/U determines whether operations performed are signed or unsigned. It also determines the
             accumulator value during saturation, if enabled.
             0 Signed numbers. On overflow, if OMC is enabled, an accumulator saturates to the most positive
               (0x7FFF_FFFF) or the most negative (0x8000_0000) number, depending on the instruction and the
               product value that overflowed.
             1 Unsigned numbers. On overflow, if OMC is enabled, an accumulator saturates to the smallest
               value (0x0000_0000) or the largest value (0xFFFF_FFFF), depending on the instruction.
             In fractional mode:
             S/U controls rounding while storing an accumulator to a general-purpose register.
             0 Move accumulator without rounding to a 16-bit value. Accumulator is moved to a
               general-purpose register as a 32-bit value.
             1 The accumulator is rounded to a 16-bit value using the round-to-nearest (even) method when
               moved to a general-purpose register. See Section 3.3.1.1, “Rounding”. The resulting 16-bit
               value is stored in the lower word of the destination register. The upper word is zero-filled.
               This rounding procedure does not affect the accumulator value.
5 F/I        Fractional/integer mode. Determines whether input operands are treated as fractions or integers.
             0 Integers can be represented in signed or unsigned notation, depending on the value of S/U.
             1 Fractions are represented in signed, fixed-point, two’s complement notation. Values range
               from -1 to 1 - 2^-15 for 16-bit fractions and -1 to 1 - 2^-31 for 32-bit fractions.
               See Section 3.3.4, “Data Representation.”
4 R/T        Round/truncate mode. Controls rounding procedure for move.l ACCx,Rx, or MSAC.L instructions
             when in fractional mode.
             0 Truncate. The product’s lsbs are dropped before it is combined with the accumulator.
               Additionally, when a store accumulator instruction is executed (move.l ACCx,Rx), the 8 lsbs
               of the 48-bit accumulator logic are truncated.
             1 Round-to-nearest (even). The 64-bit product of two 32-bit, fractional operands is rounded to
               the nearest 40-bit value. If the low-order 24 bits equal 0x80_0000, the upper 40 bits are
               rounded to the nearest even (lsb = 0) value. See Section 3.3.1.1, “Rounding”. Additionally,
               when a store accumulator instruction is executed (move.l ACCx,Rx), the lsbs of the 48-bit
               accumulator logic round the resulting 16- or 32-bit value. If MACSR[S/U] is cleared and
               MACSR[R/T] is set, the low-order 8 bits are used to round the resulting 32-bit fraction.
               If MACSR[S/U] is set, the low-order 24 bits are used to round the resulting 16-bit fraction.
3 N          Negative. Set if the msb of the result is set, otherwise cleared. N is affected only by MAC,
             MSAC, and load operations; it is not affected by MULS and MULU instructions.
2 Z          Zero. Set if the result equals zero, otherwise cleared. This bit is affected only by MAC, MSAC,
             and load operations; it is not affected by MULS and MULU instructions.
1 V          Overflow. Set if an arithmetic overflow occurs on a MAC or MSAC instruction, indicating that
             the result cannot be represented in the limited width of the EMAC. V is set only if a product
             overflow occurs or the accumulation overflows the 48-bit structure. V is evaluated on each MAC
             or MSAC operation and uses the appropriate PAVn flag in the next-state V evaluation.
0 EV         Extension overflow. Signals that the last MAC or MSAC instruction overflowed the 32 lsbs in
             integer mode or the 40 lsbs in fractional mode of the destination accumulator. However, the
             result remains accurately represented in the combined 48-bit accumulator structure. Although
             an overflow has occurred, the correct result, sign, and magnitude are contained in the 48-bit
             accumulator. Subsequent MAC or MSAC operations may return the accumulator to a valid
             32/40-bit result.
Table 3-3 summarizes the interaction of the MACSR[S/U,F/I,R/T] control bits.
Table 3-3. Summary of S/U, F/I, and R/T Control Bits
S/U  F/I  R/T  Operational Modes
0    0    x    Signed, integer
0    1    0    Signed, fractional
               Truncate on MAC.L and MSAC.L
               No round on accumulator stores
0    1    1    Signed, fractional
               Round on MAC.L and MSAC.L
               Round-to-32-bits on accumulator stores
1    0    x    Unsigned, integer
1    1    0    Signed, fractional
               Truncate on MAC.L and MSAC.L
               Round-to-16-bits on accumulator stores
1    1    1    Signed, fractional
               Round on MAC.L and MSAC.L
               Round-to-16-bits on accumulator stores
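The control-bit combinations in Table 3-3 can be sketched as a small decoder. This is an illustrative C model rather than code from the manual: the names emac_decode, emac_config_t, and the MACSR_* masks are hypothetical. S/U and F/I are taken as MACSR bits 6 and 5, as in the MACSR[6:5] pseudocode in this chapter, and R/T is assumed to be bit 4.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions: S/U = bit 6, F/I = bit 5, R/T = bit 4. */
#define MACSR_SU (1u << 6)
#define MACSR_FI (1u << 5)
#define MACSR_RT (1u << 4)

typedef enum { EMAC_SIGNED_INT, EMAC_UNSIGNED_INT, EMAC_SIGNED_FRAC } emac_mode_t;

typedef struct {
    emac_mode_t mode;
    int round_on_mac;      /* 1: round on MAC.L/MSAC.L, 0: truncate */
    int store_round_bits;  /* 0: no rounding on stores, else 16 or 32 */
} emac_config_t;

/* Decode the S/U, F/I, and R/T bits following the rows of Table 3-3. */
emac_config_t emac_decode(uint32_t macsr)
{
    emac_config_t c = { EMAC_SIGNED_INT, 0, 0 };

    if (!(macsr & MACSR_FI)) {            /* integer modes: R/T is don't-care */
        c.mode = (macsr & MACSR_SU) ? EMAC_UNSIGNED_INT : EMAC_SIGNED_INT;
        return c;
    }
    c.mode = EMAC_SIGNED_FRAC;            /* F/I = 1: always signed fractional */
    c.round_on_mac = (macsr & MACSR_RT) != 0;
    if (macsr & MACSR_SU)
        c.store_round_bits = 16;          /* S/U = 1: round stores to 16 bits */
    else if (macsr & MACSR_RT)
        c.store_round_bits = 32;          /* S/U = 0, R/T = 1: round to 32 bits */
    return c;
}

/* Hypothetical usage: check two rows of the table. */
void emac_decode_example(void)
{
    assert(emac_decode(MACSR_FI).store_round_bits == 0);
    assert(emac_decode(MACSR_SU | MACSR_FI | MACSR_RT).store_round_bits == 16);
}
```

Note that in fractional mode the S/U bit no longer selects signedness; in this model it only selects the width used when rounding accumulator stores.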
3.2.2 Mask Register (MASK)
The 32-bit MASK implements the low-order 16 bits to minimize the alignment complications involved
with loading and storing only 16 bits. When the MASK is loaded, the low-order 16 bits of the source
operand are actually loaded into the register. When it is stored, the upper 16 bits are all forced to ones.
This register performs a simple AND with the operand address for MAC instructions. The processor
calculates the normal operand address and, if enabled, that address is then ANDed with {0xFFFF,
MASK[15:0]} to form the final address. Therefore, with certain MASK bits cleared, the operand address
can be constrained to a certain memory region. This is used primarily to implement circular queues with
the (An)+ addressing mode.
This minimizes the addressing support required for filtering, convolution, or any routine that implements
a data array as a circular queue. For MAC + MOVE operations, the MASK contents can optionally be
included in all memory effective address calculations. The syntax is as follows:
mac.sz Ry,RxSF,<ea>y&,Rw
The & operator enables the MASK use and causes bit 5 of the extension word to be set. The exact
algorithm for the use of MASK is:
if extension word, bit [5] = 1, the MASK bit, then
    if <ea> = (An)
        oa = An & {0xFFFF, MASK}
    if <ea> = (An)+
        oa = An
        An = (An + 4) & {0xFFFF, MASK}
    if <ea> = -(An)
        oa = (An - 4) & {0xFFFF, MASK}
        An = (An - 4) & {0xFFFF, MASK}
    if <ea> = (d16,An)
        oa = (An + se_d16) & {0xFFFF, MASK}
Here, oa is the calculated operand address and se_d16 is a sign-extended 16-bit displacement. For
auto-addressing modes of post-increment and pre-decrement, the updated An value calculation is also
shown.
Use of the post-increment addressing mode, {(An)+} with the MASK is suggested for circular queue
implementations.
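The post-increment case of the MASK algorithm can be sketched in C as follows. This is a minimal model under stated assumptions, not the hardware implementation: the function names emac_full_mask and emac_postinc_masked are hypothetical, and the queue address used in the note below is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* {0xFFFF, MASK[15:0]}: the upper 16 bits are forced to ones. */
static uint32_t emac_full_mask(uint16_t mask)
{
    return 0xFFFF0000u | mask;
}

/* Model of <ea> = (An)+ with the MASK bit set:
 *   oa = An;  An = (An + 4) & {0xFFFF, MASK}
 * Returns the operand address and updates *an in place. */
uint32_t emac_postinc_masked(uint32_t *an, uint16_t mask)
{
    assert(an != 0);
    uint32_t oa = *an;
    *an = (*an + 4u) & emac_full_mask(mask);
    return oa;
}
```

With a hypothetical 16-byte queue at 0x2000_0000 and MASK = 0xFFEF (bit 4 cleared), An steps through 0x2000_0000, 0x2000_0004, 0x2000_0008, and 0x2000_000C, then wraps back to 0x2000_0000. The sketch assumes the queue base is aligned so that the cleared MASK bit is also zero in the base address.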
Each pair of 8-bit accumulator extension fields is concatenated with the corresponding 32-bit
accumulator register to form the 48-bit accumulator. For more information, see Section 3.3, “Functional
The MAC speeds execution of ColdFire integer-multiply instructions (MULS and MULU) and provides
additional functionality for multiply-accumulate operations. By executing MULS and MULU in the MAC,
execution times are minimized and deterministic compared to the 2-bit/cycle algorithm with early
termination that the OEP normally uses if no MAC hardware is present.
The MAC instructions added to the ColdFire ISA provide for the multiplication of two numbers, followed
by the addition or subtraction of the product to or from the value in an accumulator. Optionally, the product
may be shifted left or right by 1 bit before addition or subtraction. Hardware support for saturation
arithmetic can be enabled to minimize software overhead when dealing with potential overflow conditions.
Multiply-accumulate operations support 16- or 32-bit input operands in these formats:
• Signed integers
• Unsigned integers
• Signed, fixed-point, fractional numbers
The EMAC is optimized for single-cycle, pipelined 32 × 32 multiplications. For word- and
longword-sized integer input operands, the low-order 40 bits of the product are formed and used with the
destination accumulator. For fractional operands, the entire 64-bit product is calculated and truncated or
rounded to the most-significant 40-bit result using the round-to-nearest (even) method before it is
combined with the destination accumulator.
For all operations, the resulting 40-bit product is extended to a 48-bit value (using sign-extension for
signed integer and fractional operands, zero-fill for unsigned integer operands) before being combined
with the 48-bit destination accumulator.
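The reduction of a 64-bit fractional product to its most-significant 40 bits can be sketched in C. This is an illustrative model only: the name product_to_40 is hypothetical, it covers the signed fractional case (truncate or round-to-nearest-even on the discarded low-order 24 bits), and it does not model saturation or the overflow flags. Returning the result in an int64_t stands in for the sign-extension to the wider accumulator.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a 64-bit product to its most-significant 40 bits.
 * round_mode != 0: round-to-nearest-even using the discarded low 24 bits.
 * round_mode == 0: truncate.
 * The 40-bit result is returned sign-extended in an int64_t. */
int64_t product_to_40(int64_t product, int round_mode)
{
    uint32_t low24 = (uint32_t)((uint64_t)product & 0xFFFFFFu);
    /* product - low24 is an exact multiple of 2^24, so this division acts
     * as an arithmetic right shift without implementation-defined behavior. */
    int64_t upper = (product - (int64_t)low24) / 16777216;

    if (round_mode) {
        /* Round up above the half-way point, or at exactly half-way
         * (0x80_0000) when the kept value is odd. */
        if (low24 > 0x800000u || (low24 == 0x800000u && (upper & 1)))
            upper += 1;
    }
    return upper;
}

/* Hypothetical usage: a tie with an odd upper part rounds up to even. */
void product_to_40_example(void)
{
    assert(product_to_40(0x1800000, 1) == 2);
}
```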
Figure 3-7 and Figure 3-8 show relative alignment of input operands, the full 64-bit product, the resulting
40-bit product used for accumulation, and 48-bit accumulator formats.
Figure 3-7. Fractional Alignment
(Diagram labels: Operand Y, Operand X, Product, Extended Product, Accumulator, Extension Byte Upper [7:0], Accumulator [31:0], Extension Byte Lower [7:0])
Figure 3-8. Signed and Unsigned Integer Alignment
(Diagram labels: Operand Y, Operand X, Product, Extended Product, Accumulator, Extension Byte Upper [7:0], Extension Byte Lower [7:0], Accumulator [31:0])
Therefore, the 48-bit accumulator definition is a function of the EMAC operating mode. Given that each
48-bit accumulator is the concatenation of 16-bit accumulator extension register (ACCextn) contents and
32-bit ACCn contents, the specific definitions are:
The four accumulators are represented as an array, ACCn, where n selects the register.
Although the multiplier array is implemented in a four-stage pipeline, all arithmetic MAC instructions
have an effective issue rate of 1 cycle, regardless of input operand size or type.
All arithmetic operations use register-based input operands, and summed values are stored in an
accumulator. Therefore, an additional MOVE instruction is needed to store data in a general-purpose
register. One new feature in EMAC instructions is the ability to choose the upper or lower word of a
register as a 16-bit input operand. This is useful in filtering operations if one data register is loaded with
the input data and another is loaded with the coefficient. Two 16-bit multiply accumulates can be
performed without fetching additional operands between instructions by alternating word choice during
calculations.
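The upper/lower word selection described above can be illustrated with a short C model. It is a sketch under assumptions: the names word_operand and mac_packed_pair are hypothetical, operands are treated as signed integers, and the accumulator is modeled as a plain int64_t without the EMAC's 48-bit width, saturation, or status flags.

```c
#include <assert.h>
#include <stdint.h>

/* Select the upper (sel_upper != 0) or lower 16-bit word of a register
 * and sign-extend it to 32 bits, as for signed word-sized operands. */
int32_t word_operand(uint32_t reg, int sel_upper)
{
    uint32_t w = sel_upper ? (reg >> 16) : (reg & 0xFFFFu);
    return (w >= 0x8000u) ? (int32_t)w - 0x10000 : (int32_t)w;
}

/* Two 16-bit multiply-accumulates from one data register and one
 * coefficient register, with no operand fetch between them. */
int64_t mac_packed_pair(int64_t acc, uint32_t data, uint32_t coef)
{
    acc += (int64_t)word_operand(data, 1) * word_operand(coef, 1); /* upper words */
    acc += (int64_t)word_operand(data, 0) * word_operand(coef, 0); /* lower words */
    return acc;
}

/* Hypothetical usage: data = {2, -1}, coefficients = {3, 4}. */
void mac_packed_pair_example(void)
{
    assert(mac_packed_pair(0, 0x0002FFFFu, 0x00030004u) == 2);
}
```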
The EMAC has four accumulator registers versus the MAC’s single accumulator. The additional registers
improve the performance of some algorithms by minimizing pipeline stalls needed to store an accumulator
value back to general-purpose registers. Many algorithms require multiple calculations on a given data set.
By applying different accumulators to these calculations, it is often possible to store one accumulator
without any stalls while performing operations involving a different destination accumulator.
The need to move large amounts of data presents an obstacle to obtaining high throughput rates in DSP
engines. Existing ColdFire instructions can accommodate these requirements. A MOVEM instruction can
efficiently move large data blocks by generating line-sized burst references. The ability to load an operand
simultaneously from memory into a register and execute a MAC instruction makes some DSP operations
such as filtering and convolution more manageable.
The programming model includes a mask register (MASK), which can optionally be used to generate an
operand address during MAC + MOVE instructions. The register application with auto-increment
addressing mode supports efficient implementation of circular data queues for memory operands.
3.3.1 Fractional Operation Mode
This section describes behavior when the fractional mode is used (MACSR[F/I] is set).
3.3.1.1 Rounding
When the processor is in fractional mode, there are two operations during which rounding can occur:
1. Execution of a store accumulator instruction (move.l ACCx,Rx). The lsbs of the 48-bit accumulator
logic are used to round the resulting 16- or 32-bit value. If MACSR[S/U] is cleared, the low-order
8 bits round the resulting 32-bit fraction. If MACSR[S/U] is set, the low-order 24 bits are used to
round the resulting 16-bit fraction.
2. Execution of a MAC (or MSAC) instruction with 32-bit operands. If MACSR[R/T] is zero,
multiplying two 32-bit numbers creates a 64-bit product truncated to the upper 40 bits; otherwise,
it is rounded using round-to-nearest (even) method.
To understand the round-to-nearest-even method, consider the following example involving the rounding
of a 32-bit number, R0, to a 16-bit number. Using this method, the 32-bit number is rounded to the closest
16-bit number possible. Let the high-order 16 bits of R0 be named R0.U and the low-order 16 bits be R0.L.
• If R0.L is less than 0x8000, the result is truncated to the value of R0.U.
• If R0.L is greater than 0x8000, the upper word is incremented (rounded up).
• If R0.L is 0x8000, R0 is half-way between two 16-bit numbers. In this case, rounding is based on
  the lsb of R0.U, so the result is always even (lsb = 0).
  - If the lsb of R0.U equals 1 and R0.L equals 0x8000, the number is rounded up.
  - If the lsb of R0.U equals 0 and R0.L equals 0x8000, the number is rounded down.
This method minimizes rounding bias and creates as statistically correct an answer as possible.
The rounding algorithm is summarized in the following pseudocode:
if R0.L < 0x8000
then Result = R0.U
else if R0.L > 0x8000
then Result = R0.U + 1
else if lsb of R0.U = 0 /* R0.L = 0x8000 */
then Result = R0.U
else Result = R0.U + 1
The round-to-nearest-even technique is also known as convergent rounding.
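The rounding pseudocode above maps directly onto C. The sketch below is illustrative only: the function name round_nearest_even_16 is hypothetical, and a carry out of R0.U (for example, when R0.U is 0xFFFF and the value rounds up) is returned as-is rather than saturated, since the pseudocode does not describe that case.

```c
#include <assert.h>
#include <stdint.h>

/* Round a 32-bit value R0 to 16 bits using round-to-nearest-even.
 * Returns R0.U, incremented when the value rounds up. */
uint32_t round_nearest_even_16(uint32_t r0)
{
    uint32_t upper = r0 >> 16;       /* R0.U */
    uint32_t lower = r0 & 0xFFFFu;   /* R0.L */

    if (lower < 0x8000u)
        return upper;                /* below half-way: truncate */
    if (lower > 0x8000u)
        return upper + 1u;           /* above half-way: round up */
    /* Exactly half-way: choose the even neighbor. */
    return (upper & 1u) ? upper + 1u : upper;
}

/* Hypothetical usage: both half-way cases land on the even value 2. */
void round_nearest_even_16_example(void)
{
    assert(round_nearest_even_16(0x00018000u) == 2u);
    assert(round_nearest_even_16(0x00028000u) == 2u);
}
```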
3.3.1.2 Saving and Restoring the EMAC Programming Model
The presence of rounding logic in the EMAC output datapath requires special care during the EMAC’s
save/restore process. In particular, any result rounding modes must be disabled during the save/restore
process so the exact bit-wise contents of the EMAC registers are accessed. Consider the memory structure
containing the EMAC programming model:
struct macState {
    int acc0;
    int acc1;
    int acc2;
    int acc3;
    int accext01;   /* extension bytes for ACC0 and ACC1 */
    int accext23;   /* extension bytes for ACC2 and ACC3 */
    int mask;
    int macsr;
} macState;
The following assembly language routine shows the proper sequence for a correct EMAC state save. This
code assumes all Dn and An registers are available for use, and the memory location of the state save is
defined by A7.
EMAC_state_save:
    move.l  macsr,d7        ; save the macsr
    clr.l   d0              ; zero the register to ...
    move.l  d0,macsr        ; disable rounding in the macsr
    move.l  acc0,d0         ; save the accumulators
    move.l  acc1,d1
    move.l  acc2,d2
    move.l  acc3,d3
    move.l  accext01,d4     ; save the accumulator extensions
    move.l  accext23,d5
    move.l  mask,d6         ; save the address mask
    movem.l #0x00ff,(a7)    ; move the state to memory
This code performs the EMAC state restore:
EMAC_state_restore:
    movem.l (a7),#0x00ff    ; restore the state from memory
    move.l  #0,macsr        ; disable rounding in the macsr
    move.l  d0,acc0         ; restore the accumulators
    move.l  d1,acc1
    move.l  d2,acc2
    move.l  d3,acc3
    move.l  d4,accext01     ; restore the accumulator extensions
    move.l  d5,accext23
    move.l  d6,mask         ; restore the address mask
    move.l  d7,macsr        ; restore the macsr
Executing this type of sequence correctly saves and restores the exact state of the EMAC programming
model.
3.3.1.3 MULS/MULU
MULS and MULU are unaffected by fractional-mode operation; operands remain assumed to be integers.
3.3.1.4 Scale Factor in MAC or MSAC Instructions
The scale factor is ignored while the MAC is in fractional mode.
3.3.2 EMAC Instruction Set Summary
Table 3-8 summarizes EMAC unit instructions.
Table 3-8. EMAC Instruction Summary
Command                          Mnemonic                    Description
Multiply Signed                  muls <ea>y,Dx               Multiplies two signed operands yielding a signed result
Multiply Unsigned                mulu <ea>y,Dx               Multiplies two unsigned operands yielding an unsigned result
Multiply Accumulate              mac Ry,RxSF,ACCx            Multiplies two operands and adds/subtracts the product
                                 msac Ry,RxSF,ACCx           to/from an accumulator
Multiply Accumulate with Load    mac Ry,Rx,<ea>y,Rw,ACCx     Multiplies two operands and combines the product to an
                                 msac Ry,Rx,<ea>y,Rw,ACCx    accumulator while loading a register with the memory operand
Load Accumulator                 move.l {Ry,#imm},ACCx       Loads an accumulator with a 32-bit operand
Store Accumulator                move.l ACCx,Rx              Writes the contents of an accumulator to a CPU register
Copy Accumulator                 move.l ACCy,ACCx            Copies a 48-bit accumulator
Load MACSR                       move.l {Ry,#imm},MACSR      Writes a value to MACSR
Store MACSR                      move.l MACSR,Rx             Writes the contents of MACSR to a CPU register
Store MACSR to CCR               move.l MACSR,CCR            Writes the contents of MACSR to the CCR
Load MAC Mask Reg                move.l {Ry,#imm},MASK       Writes a value to the MASK register
Store MAC Mask Reg               move.l MASK,Rx              Writes the contents of the MASK to a CPU register
Load Accumulator Extensions 01   move.l {Ry,#imm},ACCext01   Loads the accumulator 0,1 extension bytes with a 32-bit operand
Load Accumulator Extensions 23   move.l {Ry,#imm},ACCext23   Loads the accumulator 2,3 extension bytes with a 32-bit operand
Store Accumulator Extensions 01  move.l ACCext01,Rx          Writes the contents of accumulator 0,1 extension bytes into a CPU register
Store Accumulator Extensions 23  move.l ACCext23,Rx          Writes the contents of accumulator 2,3 extension bytes into a CPU register
3.3.3 EMAC Instruction Execution Times
The instruction execution times for the EMAC can be found in Section 2.3.5.6, “EMAC Instruction
Execution Times”.
The EMAC execution pipeline overlaps the AGEX stage of the OEP (the first stage of the EMAC pipeline
is the last stage of the basic OEP). EMAC units are designed for sustained, fully-pipelined operation on
accumulator load, copy, and multiply-accumulate instructions. However, instructions that store contents
of the multiply-accumulate programming model can generate OEP stalls that expose the EMAC execution
pipeline depth:
    mac.w   Ry, Rx, Acc0
    move.l  Acc0, Rz
The MOVE.L instruction that stores the accumulator to an integer register (Rz) stalls until the
program-visible copy of the accumulator is available. Figure 3-9 shows EMAC timing.
Figure 3-9. EMAC-Specific OEP Sequence Stall
In Figure 3-9, the OEP stalls the store-accumulator instruction for three cycles: the EMAC pipeline depth
minus 1. The minus 1 factor is needed because the OEP and EMAC pipelines overlap by a cycle, the
AGEX stage. As the store-accumulator instruction reaches the AGEX stage where the operation is
performed, the recently updated accumulator 0 value is available.
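The stall arithmetic above (pipeline depth minus the one overlapping AGEX stage) can be written as a small C helper. This is a rough model, not a timing specification: the name store_acc_stall_cycles and the two constants are hypothetical, and the intervening parameter is a modeling assumption that each unrelated single-cycle instruction placed between the MAC and the store hides one stall cycle.

```c
#include <assert.h>

enum {
    EMAC_PIPELINE_DEPTH = 4,  /* four-stage EMAC pipeline (EX1..EX4) */
    OEP_EMAC_OVERLAP    = 1   /* AGEX is shared between OEP and EMAC */
};

/* Stall cycles seen by a store-accumulator instruction that follows a MAC
 * to the same accumulator, with `intervening` unrelated single-cycle
 * instructions between them (assumption: each hides one stall cycle). */
int store_acc_stall_cycles(int intervening)
{
    assert(intervening >= 0);
    int stall = (EMAC_PIPELINE_DEPTH - OEP_EMAC_OVERLAP) - intervening;
    return stall > 0 ? stall : 0;
}
```

With no intervening instructions the model returns the three-cycle stall described for Figure 3-9.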
As with change or use stalls between accumulators and general-purpose registers, introducing intervening
instructions that do not reference the busy register can reduce or eliminate sequence-related store-MAC
instruction stalls. A major benefit of the EMAC is the addition of three accumulators to minimize stalls
caused by exchanges between accumulator(s) and general-purpose registers.
3.3.4 Data Representation
MACSR[S/U,F/I] selects one of the following three modes, where each mode defines a unique operand
type:
1. Two’s complement signed integer: In this format, an N-bit operand value lies in the range
   -2^(N-1) ≤ operand ≤ 2^(N-1) - 1. The binary point is right of the lsb.
2. Unsigned integer: In this format, an N-bit operand value lies in the range 0 ≤ operand ≤ 2^N - 1. The
   binary point is right of the lsb.
3. Two’s complement, signed fractional: In an N-bit number, the first bit is the sign bit. The remaining
   bits signify the first N-1 bits after the binary point. Given an N-bit number, a_(N-1) a_(N-2) a_(N-3) ... a_2 a_1 a_0,
   its value is given by Equation 3-3.
       value = -(1 \cdot a_{N-1}) + \sum_{i=0}^{N-2} 2^{(i+1-N)} \cdot a_i        Eqn. 3-3
   This format can represent numbers in the range -1 ≤ operand ≤ 1 - 2^-(N-1).
For words and longwords, the largest negative number that can be represented is -1, whose internal
representation is 0x8000 and 0x8000_0000, respectively. The largest positive word is 0x7FFF or (1 - 2^-15);
the most positive longword is 0x7FFF_FFFF or (1 - 2^-31).
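The signed fractional format can be checked numerically with a short C sketch. This is an illustrative helper, not part of the EMAC programming model: the name frac_value is hypothetical. It takes the sign bit a_(N-1) with weight -1 and gives each remaining bit a_i the weight 2^(i+1-N), so bit N-2 is worth 1/2, bit N-3 is worth 1/4, and so on.

```c
#include <assert.h>
#include <stdint.h>

/* Value of an N-bit two's complement signed fraction held in the low
 * n bits of `bits`: sign bit weight -1, bit i weight 2^(i+1-N). */
double frac_value(uint32_t bits, int n)
{
    assert(n >= 2 && n <= 32);

    double value  = ((bits >> (n - 1)) & 1u) ? -1.0 : 0.0; /* sign bit */
    double weight = 0.5;                                   /* weight of bit N-2 */

    for (int i = n - 2; i >= 0; --i) {
        if ((bits >> i) & 1u)
            value += weight;
        weight *= 0.5;
    }
    return value;
}
```

For example, the 16-bit pattern 0x8000 evaluates to -1, 0x4000 to 0.5, and 0x7FFF to 1 - 2^-15, matching the limits stated above.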
3.3.5 MAC Opcodes
MAC opcodes are described in the ColdFire Programmer’s Reference Manual.
Remember the following:
• Unless otherwise noted, the value of MACSR[N,Z] is based on the result of the final operation that
  involves the product and the accumulator.
• The overflow (V) flag is managed differently. It is set if the complete product cannot be represented
  as a 40-bit value (this applies to 32 × 32 integer operations only) or if the combination of the
  product with an accumulator cannot be represented in the given number of bits. The EMAC design
  includes an additional product/accumulation overflow bit for each accumulator; these bits are treated as
  sticky indicators and are used to calculate the V bit on each MAC or MSAC instruction. See
  Section 3.2.1, “MAC Status Register (MACSR)”.
• For the MAC design, the assembler syntax of the MAC (multiply and add to accumulator) and
  MSAC (multiply and subtract from accumulator) instructions does not include a reference to the
  single accumulator. For the EMAC, assemblers support this syntax and no explicit reference to an
  accumulator is interpreted as a reference to ACC0. Assemblers also support syntaxes where the
  destination accumulator is explicitly defined.
• The optional 1-bit shift of the product is specified using the notation {<< | >>} SF, where <<1
  indicates a left shift and >>1 indicates a right shift. The shift is performed before the product is
  added to or subtracted from the accumulator. Without this operator, the product is not shifted. If the
  EMAC is in fractional mode (MACSR[F/I] is set), SF is ignored and no shift is performed. Because
  a product can overflow, the following guidelines are implemented:
  - For unsigned word and longword operations, a zero is shifted into the product on right shifts.
  - For signed, word operations, the sign bit is shifted into the product on right shifts unless the
    product is zero. For signed, longword operations, the sign bit is shifted into the product unless
    an overflow occurs or the product is zero, in which case a zero is shifted in.
  - For all left shifts, a zero is inserted into the lsb position.
The following pseudocode explains basic MAC or MSAC instruction functionality. This example is
presented as a case statement covering the three basic operating modes with signed integers, unsigned
integers, and signed fractionals. Throughout this example, a comma-separated list in curly brackets, {},
indicates a concatenation operation.
switch (MACSR[6:5])    /* MACSR[S/U, F/I] */
{
case 0:                /* signed integers */
    if (MACSR.OMC == 0 || MACSR.PAVn == 0)
    then {
        MACSR.PAVn = 0
        /* select the input operands */
        if (sz == word)
        then {if (U/Ly == 1)
                 then operandY[31:0] = {sign-extended Ry[31], Ry[31:16]}
                 else operandY[31:0] = {sign-extended Ry[15], Ry[15:0]}
              if (U/Lx == 1)
                 then operandX[31:0] = {sign-extended Rx[31], Rx[31:16]}
The cache is a direct-mapped, single-cycle memory. It may be configured as an instruction cache, a
write-through data cache, or a split instruction/data cache. The cache storage is organized as 128 lines,
each containing 16 bytes. The memory storage consists of a 128-entry tag array (containing addresses and
a valid bit), and a data array containing 2 Kbytes, organized as 512 × 32 bits.
Cache configuration is controlled by bits in the cache control register (CACR), detailed later in this
chapter. For the instruction or data-only configurations, only the associated instruction or data line-fill
buffer is used. For the split cache configuration, one-half of the tag and storage arrays is used for an
instruction cache and one-half is used for a data cache. The split cache configuration uses the instruction
and the data line-fill buffers. The core’s local bus is a unified bus used for instruction and data fetches.
Therefore, the cache can have only one fetch, instruction or data, active at one time.
For the instruction- or data-only configurations, the cache tag and storage arrays are accessed in parallel:
fetch address bits [10:4] addressing the tag array, and fetch address bits [10:2] addressing the storage array.
For the split cache configuration, the cache tag and storage arrays are accessed in parallel. The msb of the
tag array address is set for instruction fetches and cleared for operand fetches; fetch address bits [9:4]
provide the rest of the tag array address. The tag array outputs the address mapped to the given cache
location along with the valid bit for the line. This address field is compared to bits [31:11] for instruction-
or data-only configurations and to bits [31:10] for a split configuration of the fetch address from the local
bus to determine if a cache hit has occurred. If the desired address is mapped into the cache memory, the
output of the storage array is driven onto the ColdFire core's local data bus, thereby completing the access
in a single cycle.
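The address decomposition described above can be sketched in C. This is a simplified model for illustration: the type cache_lookup_t and both function names are hypothetical. For the split configuration the text specifies only the tag array address and the compared bits, so the data array index shown for that case ({half, bits [9:2]}) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t tag_index;   /* 0..127: selects one of 128 tag entries */
    uint32_t data_index;  /* 0..511: selects one of 512 longwords   */
    uint32_t tag;         /* address bits compared with the stored tag */
} cache_lookup_t;

/* Instruction-only or data-only configuration. */
cache_lookup_t cache_lookup_unified(uint32_t addr)
{
    cache_lookup_t r;
    r.tag_index  = (addr >> 4) & 0x7Fu;    /* bits [10:4]  */
    r.data_index = (addr >> 2) & 0x1FFu;   /* bits [10:2]  */
    r.tag        = addr >> 11;             /* bits [31:11] */
    return r;
}

/* Split configuration: the msb of the tag index is set for instruction
 * fetches and cleared for operand fetches. */
cache_lookup_t cache_lookup_split(uint32_t addr, int is_instruction)
{
    cache_lookup_t r;
    uint32_t half = is_instruction ? 1u : 0u;
    r.tag_index  = (half << 6) | ((addr >> 4) & 0x3Fu);  /* {half, bits [9:4]} */
    r.data_index = (half << 8) | ((addr >> 2) & 0xFFu);  /* assumption: {half, bits [9:2]} */
    r.tag        = addr >> 10;                           /* bits [31:10] */
    return r;
}

/* Hypothetical usage with an arbitrary fetch address. */
void cache_lookup_example(void)
{
    assert(cache_lookup_unified(0x00001234u).tag_index == 0x23u);
    assert(cache_lookup_split(0x00001234u, 1).tag_index == 0x63u);
}
```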
The tag array maintains a single valid bit per line entry. Accordingly, only entire 16-byte lines are loaded
into the cache.
The cache also contains separate 16-byte instruction and data line-fill buffers that provide temporary
storage for the last line fetched in response to a cache miss. With each fetch, the contents of the associated
line fill buffer are examined. Thus, each fetch address examines the tag memory array and the associated
line fill buffer to see if the desired address is mapped into either hardware resource. A cache hit in the
memory array or the associated line-fill buffer is serviced in a single cycle. Because the line fill buffer
maintains valid bits on a longword basis, hits in the buffer can be serviced immediately without waiting
for the entire line to be fetched.
If the referenced address is not contained in the memory array or the associated line-fill buffer, the cache
initiates the required external fetch operation. In most situations, this is a 16-byte line-sized burst
reference.
The hardware implementation is a nonblocking design, meaning the ColdFire core's local bus is released
after the initial access of a miss. Thus, the cache or the SRAM module can service subsequent requests
while the remainder of the line is being fetched and loaded into the fill buffer.
4.2 Memory Map/Register Definition
Three supervisor registers define the operation of the cache and local bus controller: the cache control
register (CACR) and two access control registers (ACR0, ACR1). Table 4-1 below shows the memory map
Figure 4-1. 2-Kbyte Cache Block Diagram