Unpublished rights (if any) reserved under the copyright laws of the United States of America and other countries.
This document contains information that is proprietary to MIPS Tech, LLC, a Wave Computing company (“MIPS”) and MIPS’
affiliates as applicable. Any copying, reproducing, modifying or use of this information (in whole or in part) that is not expressly
permitted in writing by MIPS or MIPS’ affiliates as applicable or an authorized third party is strictly prohibited. At a minimum,
this information is protected under unfair competition and copyright laws. Violations thereof may result in criminal penalties
and fines. Any document provided in source format (i.e., in a modifiable form such as in FrameMaker or Microsoft Word
format) is subject to use and distribution restrictions that are independent of and supplemental to any and all confidentiality
restrictions. UNDER NO CIRCUMSTANCES MAY A DOCUMENT PROVIDED IN SOURCE FORMAT BE DISTRIBUTED TO A THIRD
PARTY IN SOURCE FORMAT WITHOUT THE EXPRESS WRITTEN PERMISSION OF MIPS (AND MIPS’ AFFILIATES AS APPLICABLE).
MIPS and MIPS’ affiliates reserve the right to change the information contained in this document to improve function, design or otherwise.
MIPS and MIPS’ affiliates do not assume any liability arising out of the application or use of this information, or of any error or
omission in such information. Any warranties, whether express, statutory, implied or otherwise, including but not limited to the
implied warranties of merchantability or fitness for a particular purpose, are excluded. Except as expressly provided in any
written license agreement from MIPS or an authorized third party, the furnishing of this document does not give recipient any
license to any intellectual property rights, including any patent rights, that cover the information in this document.
The information contained in this document shall not be exported, reexported, transferred, or released, directly or indirectly, in
violation of the law of any country or international law, regulation, treaty, Executive Order, statute, amendments or
supplements thereto. Should a conflict arise regarding the export, reexport, transfer, or release of the information contained in
this document, the laws of the United States of America shall be the governing law.
The information contained in this document constitutes one or more of the following: commercial computer software,
commercial computer software documentation or other commercial items. If the user of this information, or any related
documentation of any kind, including related technical data or manuals, is an agency, department, or other entity of the United
States government ("Government"), the use, duplication, reproduction, release, modification, disclosure, or transfer of this
information, or any related documentation of any kind, is restricted in accordance with Federal Acquisition Regulation 12.212
for civilian agencies and Defense Federal Acquisition Regulation Supplement 227.7202 for military agencies. The use of this
information by the Government is further restricted in accordance with the terms of the license agreement(s) and/or applicable
contract terms and conditions covering this information from MIPS Technologies or an authorized third party.
MIPS, MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPSr3, MIPS32, MIPS64, microMIPS32, microMIPS64, MIPS-3D, MIPS16,
MIPS16e, MIPS-Based, MIPSsim, MIPSpro, MIPS-VERIFIED, Aptiv logo, microAptiv logo, interAptiv logo, microMIPS logo, MIPS
Technologies logo, MIPS-VERIFIED logo, proAptiv logo, 4K, 4Kc, 4Km, 4Kp, 4KE, 4KEc, 4KEm, 4KEp, 4KS, 4KSc, 4KSd, M4K, M14K,
5K, 5Kc, 5Kf, 24K, 24Kc, 24Kf, 24KE, 24KEc, 24KEf, 34K, 34Kc, 34Kf, 74K, 74Kc, 74Kf, 1004K, 1004Kc, 1004Kf, 1074K, 1074Kc,
1074Kf, R3000, R4000, R5000, Aptiv, ASMACRO, Atlas, "At the core of the user experience.", BusBridge, Bus Navigator, CLAM,
CorExtend, CoreFPGA, CoreLV, EC, FPGA View, FS2, FS2 FIRST SILICON SOLUTIONS logo, FS2 NAVIGATOR, HyperDebug,
HyperJTAG, IASim, iFlowtrace, interAptiv, JALGO, Logic Navigator, Malta, MDMX, MED, MGB, microAptiv, microMIPS, Navigator,
OCI, PDtrace, the Pipeline, proAptiv, Pro Series, SEAD-3, SmartMIPS, SOC-it, and YAMON are trademarks or registered
trademarks of MIPS and MIPS’ affiliates as applicable in the United States and other countries.
All other trademarks referred to herein are the property of their respective owners.
MIPS32® M14K™ Processor Core Family Software User’s Manual, Revision 02.04
Table of Contents
Chapter 1: Introduction to the MIPS32® M14K™ Processor Core
1.1: Features
2.6: Data Bypassing
3.2: Modes of Operation
3.2.2: User Mode
3.4: System Control Coprocessor
Chapter 4: Exceptions and Interrupts in the M14K™ Core
6.2.2: Coprocessor 0 State
Chapter 7: Power Management of the M14K™ Core
7.1: Register-Controlled Power Management
7.2: Instruction-Controlled Power Management
Chapter 8: EJTAG Debug Support in the M14K™ Core
8.1: Debug Control Register
8.3.1: Checking for Presence of Complex Break Support
8.3.2: General Complex Break Behavior
8.3.3: Usage of Pass Counters
8.3.4: Usage of Tuple Breakpoints
8.3.5: Usage of Priming Conditions
8.3.6: Usage of Data Qualified Breakpoints
8.3.7: Usage of Stopwatch Timers
8.4: Test Access Port (TAP)
8.4.1: EJTAG Internal and External Interfaces
8.4.2: Test Access Port Operation
8.5: EJTAG TAP Registers
8.9.1: PC Sampling in Wait State
8.9.2: Data Address Sampling
8.10: Fast Debug Channel
8.10.1: Common Device Memory Map
8.10.2: Fast Debug Channel Interrupt
Chapter 9: Instruction Set Overview
9.1: CPU Instruction Formats
9.2: Load and Store Instructions
9.2.1: Scheduling a Load Delay Slot
9.3.1: Cycle Timing for Multiply and Divide Instructions
9.4: Jump and Branch Instructions
9.4.1: Overview of Jump Instructions
9.4.2: Overview of Branch Instructions
9.5: Control Instructions
9.8: MCU ASE Instructions
Figure 1.6: cJTAG Support
Figure 2.1: M14K™ Core Pipeline Stages with high-performance MDU
Figure 2.2: M14K™ Core Pipeline Stages with area-efficient MDU
Figure 2.3: MDU Pipeline Behavior During Multiply Operations
Figure 2.4: MDU Pipeline Flow During a 32x16 Multiply Operation
Figure 2.5: MDU Pipeline Flow During a 32x32 Multiply Operation
Figure 2.6: High-Performance MDU Pipeline Flow During a 8-bit Divide (DIV) Operation
Figure 2.7: High-Performance MDU Pipeline Flow During a 16-bit Divide (DIV) Operation
Figure 2.8: High-Performance MDU Pipeline Flow During a 24-bit Divide (DIV) Operation
Figure 2.9: High-Performance MDU Pipeline Flow During a 32-bit Divide (DIV) Operation
Figure 2.10: M14K™ Area-Efficient MDU Pipeline Flow During a Multiply Operation
Figure 2.11: M14K™ Core Area-Efficient MDU Pipeline Flow During a Multiply Accumulate Operation
Figure 2.12: M14K™ Core Area-Efficient MDU Pipeline Flow During a Divide (DIV) Operation
Figure 2.13: IU Pipeline Branch Delay
Figure 2.14: IU Pipeline Data Bypass
Figure 2.15: IU Pipeline M to E bypass
Figure 2.16: IU Pipeline A to E Data bypass
Figure 2.17: IU Pipeline Slip after a MFHI
Figure 8.17: DVM Register Format
Figure 8.18: CBTC Register Format
Figure 8.19: PrCndA Register Format
Figure 8.20: STCtl Register Format
Figure 8.21: STCnt Register Format
Figure 8.22: TAP Controller State Diagram
Figure 8.23: Concatenation of the EJTAG Address, Data and Control Registers
Figure 8.24: TDI to TDO Path When in Shift-DR State and FASTDATA Instruction is Selected
Figure 8.25: Device Identification Register Format
Figure 8.26: Implementation Register Format
Figure 8.27: EJTAG Control Register Format
Figure 8.28: Endian Formats for the PAD Register
Table 5.2: CP0 Register R/W Field Types
Table 5.4: HWREna Register Field Descriptions
Table 5.3: UserLocal Register Field Descriptions
Table 5.5: BadVAddr Register Field Description
Table 5.6: BadInstr Register Field Descriptions
Table 5.8: Count Register Field Description
Table 5.7: BadInstrP Register Field Descriptions
Table 5.9: Compare Register Field Description
Table 5.10: Status Register Field Descriptions
Table 5.11: IntCtl Register Field Descriptions
Table 5.12: SRSCtl Register Field Descriptions
Table 5.13: Sources for new SRSCtlCSS on an Exception or Interrupt
Table 5.14: SRSMap Register Field Descriptions
Table 5.15: View_IPL Register Field Descriptions
Table 5.16: SRSMap Register Field Descriptions
Table 5.17: Cause Register Field Descriptions
Table 5.18: Cause Register ExcCode Field
Table 5.19: View_RIPL Register Field Descriptions
Table 5.20: NestedExc Register Field Descriptions
Table 5.21: EPC Register Field Description
Table 5.22: NestedEPC Register Field Descriptions
Table 5.23: PRId Register Field Descriptions
Table 5.24: EBase Register Field Descriptions
Table 5.25: CDMMBase Register Field Descriptions
Table 5.26: Config Register Field Descriptions
Table 8.31: EJTAG Control Register Descriptions
Table 8.32: Fastdata Register Field Description
Table 8.33: Operation of the FASTDATA access
Table 8.34: EJ_DisableProbeDebug Signal Overview
Table 8.35: Data Bus Encoding
Table 8.36: Tag Bit Encoding
Table 8.37: Control/Status Register Field Descriptions
Table 8.38: ITCBTW Register Field Descriptions
Table 8.39: ITCBRDP Register Field Descriptions
Table 8.40: ITCBWRP Register Field Descriptions
Table 8.41: drseg Registers that Enable/Disable Trace from Breakpoint-Based Triggers
Table 8.42: FDC TAP Register Field Descriptions
Table 8.48: FDC Transmit Register Field Descriptions
Table 9.1: Byte Access Within a Word
Table 10.1: Encoding of the Opcode Field
Table 10.2: Special Opcode Encoding of Function Field
Table 10.3: Special2 Opcode Encoding of Function Field
Table 10.4: Special3 Opcode Encoding of Function Field
Table 10.5: RegImm Encoding of rt Field
Table 10.6: COP2 Encoding of rs Field
Table 10.7: COP2 Encoding of rt Field When rs=BC2
Table 10.8: COP0 Encoding of rs Field
Table 10.9: COP0 Encoding of Function Field When rs=CO
Table 11.8: 32-bit Instructions introduced within microMIPS
Chapter 1
Introduction to the MIPS32® M14K™ Processor Core
The MIPS32® M14K™ core from MIPS Technologies is a high-performance, low-power, 32-bit MIPS RISC processor core intended for custom system-on-silicon applications. The core is designed for semiconductor manufacturing
companies, ASIC developers, and system OEMs who want to rapidly integrate their own custom logic and peripherals with a high-performance RISC processor. The M14K core is fully synthesizable to allow maximum flexibility; it
is highly portable across processes and can easily be integrated into full system-on-silicon designs. This allows developers to focus their attention on end-user specific characteristics of their product.
The M14K core is especially well suited for microcontrollers and for real-time applications that also require
high performance efficiency and security.
The M14K core implements the MIPS Architecture in a 5-stage pipeline. It includes support for the microMIPS™
ISA, an Instruction Set Architecture with optimized MIPS32 16-bit and 32-bit instructions that provides a significant
reduction in code size with a performance equivalent to MIPS32. The M14K core is a successor to the M4K® and is
designed from the same microarchitecture. It includes the Microcontroller Application-Specific Extension (MCU™
ASE), enhanced interrupt handling, lower interrupt latency, a memory protection unit (MPU), a reference design of
an optimized interface for flash memory, and a built-in native AMBA®-3 AHB-Lite Bus Interface Unit (BIU), along with
additional power-saving, debug, and profiling features.
The M14K core is cacheless; in lieu of caches, it includes a simple interface to SRAM-style devices. This interface
may be configured for independent instruction and data devices or combined into a unified interface. The SRAM
interface allows deterministic latency to memory, while still maintaining high performance.
The core includes one of two different Multiply/Divide Unit (MDU) implementations, selectable at build time, allowing the user to trade off performance and area for integer multiply and divide operations. The high-performance
MDU option implements single-cycle multiply and multiply-accumulate (MAC) instructions that enable DSP algorithms to be performed efficiently. It allows 32-bit x 16-bit MAC instructions to be issued every cycle, while a 32-bit
x 32-bit MAC instruction can be issued every other cycle. The area-efficient MDU option handles multiplies with a
one-bit-per-clock iterative algorithm.
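The issue rates stated above can be turned into a rough cycle estimate. The following Python sketch is illustrative only: the function names and the operand-size labels are hypothetical, and it models nothing beyond the rates given in this section (one 32x16 MAC per cycle, one 32x32 MAC every other cycle, and one result bit per clock for the area-efficient option). It ignores pipeline start-up, interlocks, and the cost of reading results back.

```python
# Rough issue-slot model for the two MDU options (illustrative sketch only).
# Assumption: a 32x16 MAC occupies one issue slot and a 32x32 MAC occupies
# two, which is how "every cycle" and "every other cycle" are read here.
HIGH_PERF_ISSUE_SLOTS = {"32x16": 1, "32x32": 2}

# Assumption: one-bit-per-clock iteration over a 32-bit operand.
AREA_EFFICIENT_MULTIPLY_CLOCKS = 32


def high_perf_issue_cycles(ops):
    """Return the cycles needed to issue a back-to-back MAC sequence."""
    return sum(HIGH_PERF_ISSUE_SLOTS[op] for op in ops)


def area_efficient_multiply_cycles(count):
    """Return the clocks spent on `count` multiplies at one bit per clock."""
    return count * AREA_EFFICIENT_MULTIPLY_CLOCKS
```

For example, `high_perf_issue_cycles(["32x16"] * 8)` models an eight-tap filter with 16-bit coefficients, while `area_efficient_multiply_cycles(8)` gives the corresponding figure for the iterative option under the same simplifications.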
The MMU consists of a simple Fixed Mapping Translation (FMT) mechanism, for applications that do not require
the full capabilities of a Translation Lookaside Buffer- (TLB-) based MMU available on other MIPS cores.
The basic Enhanced JTAG (EJTAG) features provide CPU run control with stop, single-stepping and re-start, and
with software breakpoints using the SDBBP instruction. Additional EJTAG features such as instruction and data virtual address hardware breakpoints, complex hardware breakpoints, connection to an external EJTAG probe through
the Test Access Port (TAP), and PC/Data tracing, may be included as an option.
1.1 Features
• 5-stage pipeline
• 32-bit Address and Data Paths
• MIPS32 Instruction Set Architecture
• MIPS32 Enhanced Architecture Features
  • Vectored interrupts and support for external interrupt controller
  • Programmable exception vector base
  • Atomic interrupt enable/disable
  • GPR shadow registers (one, three, seven, or fifteen additional shadows can be optionally added to minimize
    latency for interrupt handlers)
  • Bit field manipulation instructions
• microMIPS Instruction Set Architecture
  • microMIPS ISA is a build-time configurable option that reduces code size over MIPS32, while maintaining
    MIPS32 performance.
  • Combining both 16-bit and 32-bit opcodes, microMIPS supports all MIPS32 instructions (except
    branch-likely instructions) with new optimized encoding. Frequently used MIPS32 instructions are available
    as 16-bit instructions.
  • Added fifteen new 32-bit instructions and thirty-nine 16-bit instructions.
  • Stack pointer implicit in instruction.
  • MIPS32 assembly and ABI-compatible.
  • Supports MIPS architecture Modules and User-defined Instructions (UDIs).
• MCU™ ASE
  • Increases the number of interrupt hardware inputs from 6 to 8 for Vectored Interrupt (VI) mode, and from 63
    to 255 for External Interrupt Controller (EIC) mode.
  • Separate priority and vector generation. 16-bit vector address is provided.
  • Hardware assist combined with the use of Shadow Register Sets to reduce interrupt latency during the prologue and epilogue of an interrupt.
  • Two memory-to-memory atomic read-modify-write instructions (ASET and ACLR) ease commonly used
    semaphore manipulation in microcontroller applications. Interrupts are automatically disabled during the
    operation to maintain coherency.
• Memory Management Unit
  • Simple Fixed Mapping Translation (FMT) mechanism
• Memory Protection Unit
  • Optional feature that improves system security by restricting access, execution, and trace capabilities from
    untrusted code in predefined memory regions.
• Simple SRAM-Style Interface
  • Cacheless operation enables deterministic response and reduces die size
  • 32-bit address and data; input byte-enables enable simple connection to narrower devices
  • Single or multi-cycle latencies
  • Configuration option for dual or unified instruction/data interfaces
  • Redirection mechanism on dual I/D interfaces permits D-side references to be handled by I-side
  • Transactions can be aborted
• Reference Design
  • A typical SRAM reference design is provided.
  • An AHB-Lite BIU reference design is provided between the SRAM interface and AHB-Lite Bus.
  • An optimized interface for slow memory (Flash) access using prefetch buffer scheme is provided.
• Parity Support
  • The ISRAM and DSRAM support optional parity detection.
• Multiply/Divide Unit (area-efficient configuration)
  • 32 clock latency on multiply
  • 34 clock latency on multiply-accumulate
  • 33-35 clock latency on divide (sign-dependent)
• Multiply/Divide Unit (high-performance configuration)
  • Maximum issue rate of one 32x16 multiply per clock via on-chip 32x16 hardware multiplier array.
  • Maximum issue rate of one 32x32 multiply every other clock
  • Early-in iterative divide. Minimum 11 and maximum 34 clock latency (dividend (rs) sign extension-dependent)
• CorExtend® User-Defined Instruction Set Extensions
  • Allows user to define and add instructions to the core at build time
  • Maintains full MIPS32 compatibility
  • Supported by industry-standard development tools
  • Single or multi-cycle instructions
• Multi-Core Support
  • External lock indication enables multi-processor semaphores based on LL/SC instructions
  • External sync indication allows memory ordering
  • Debug support includes cross-core triggers
• Coprocessor 2 interface
  • 32-bit interface to an external coprocessor
• Power Control
  • Minimum frequency: 0 MHz
  • Power-down mode (triggered by WAIT instruction)
  • Support for software-controlled clock divider
  • Support for extensive use of local gated clocks
• EJTAG Debug/Profiling and iFlowtrace™ Mechanism
  • CPU control with start, stop, and single stepping
  • Virtual instruction and data address/value breakpoints
  • Hardware breakpoint supports both address match and address range triggering
  • Optional simple hardware breakpoints on virtual addresses; 8I/4D, 6I/2D, 4I/2D, 2I/1D breakpoints, or no
    breakpoints
  • Optional complex hardware breakpoints with 8I/4D, 6I/2D simple breakpoints
  • TAP controller is chainable for multi-CPU debug
  • Supports EJTAG (IEEE 1149.1) and is compatible with the cJTAG 2-wire (IEEE 1149.7) extension protocol
  • Cross-CPU breakpoint support
  • iFlowtrace support for real-time instruction PC and special events
  • PC and/or load/store address sampling for profiling
  • Performance Counters
  • Support for Fast Debug Channel (FDC)
• SecureDebug
  • An optional feature that disables access via EJTAG in an untrusted environment
• Testability
  • Full scan design achieves test coverage in excess of 99% (dependent on library and configuration options)
1.2 M14K™ Core Block Diagram
The M14K core contains both required and optional blocks, as shown in the block diagram in Figure 1.1. Required
blocks are the lightly shaded areas of the block diagram and are always present in any core implementation. Optional
blocks may be added to the base core, depending on the needs of a specific implementation. The required blocks are
as follows:
• Instruction Decode
• Execution Unit
• General Purpose Registers (GPR)
• Multiply/Divide Unit (MDU)
• System Control Coprocessor (CP0)
• Memory Management Unit (MMU)
• I/D SRAM Interfaces
• Power Management
Optional blocks include:
• Configurable instruction decoder supporting three ISA modes: MIPS32-only, MIPS32 and microMIPS, or microMIPS-only
• Memory Protection Unit (MPU)
• Reference Design of I/D-SRAM, BIU, Slow Memory Interface
• Debug/Profiling with Enhanced JTAG (EJTAG) Controller, Breakpoints, Sampling, Performance counters, Fast
  Debug Channel, and iFlowtrace logic
1.2.1.2 General Purpose Register (GPR) Shadow Registers
The M14K core contains thirty-two 32-bit general-purpose registers used for integer operations and address calculation. Optionally, one, three, seven or fifteen additional register file shadow sets (each containing thirty-two registers)
can be added to minimize context switching overhead during interrupt/exception processing. The register file consists
of two read ports and one write port and is fully bypassed to minimize operation latency in the pipeline.
1.2.1.3 Multiply/Divide Unit (MDU)
The M14K core includes a multiply/divide unit (MDU) that contains a separate, dedicated pipeline for integer multiply/divide operations. This pipeline operates in parallel with the integer unit (IU) pipeline and does not stall when the
IU pipeline stalls. This allows the long-running MDU operations to be partially masked by system stalls and/or other
integer unit instructions.
The MIPS architecture defines that the result of a multiply or divide operation be placed in a pair of HI and LO
registers. Using the Move-From-HI (MFHI) and Move-From-LO (MFLO) instructions, these values can be transferred to
the general-purpose register file.
There are two configuration options for the MDU: 1) a higher-performance 32x16 multiplier block; 2) an area-efficient iterative multiplier block. The selection of the MDU style allows the implementor to determine the appropriate
performance and area trade-off for the application.
MDU with 32x16 High-Performance Multiplier
The high-performance MDU consists of a 32x16 Booth-recoded multiplier, a pair of result/accumulation registers (HI
and LO), a divide state machine, and the necessary multiplexers and control logic. The first number shown (‘32’ of
32x16) represents the rs operand. The second number (‘16’ of 32x16) represents the rt operand. The M14K core only
checks the value of the rt operand to determine how many times the operation must pass through the multiplier. The
16x16 and 32x16 operations pass through the multiplier once. A 32x32 operation passes through the multiplier twice.
The MDU supports execution of one 16x16 or 32x16 multiply or multiply-accumulate operation every clock cycle;
32x32 multiply operations can be issued every other clock cycle. Appropriate interlocks are implemented to stall the
issuance of back-to-back 32x32 multiply operations. The multiply operand size is automatically determined by logic
built into the MDU.
MDU with Area-Efficient Option
With the area-efficient option, multiply and divide operations are implemented with a simple 1-bit-per-clock iterative
algorithm. Any attempt to issue a subsequent MDU instruction while a multiply/divide is still active causes an MDU
pipeline stall until the operation is completed.
Regardless of the multiplier array implementation, divide operations are implemented with a simple 1-bit-per-clock
iterative algorithm. An early-in detection checks the sign extension of the dividend (rs) operand. If rs is 8 bits wide,
23 iterations are skipped. For a 16-bit-wide rs, 15 iterations are skipped, and for a 24-bit-wide rs, 7 iterations are
skipped. Any attempt to issue a subsequent MDU instruction while a divide is still active causes an IU pipeline stall
until the divide operation has completed.
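The early-in detection described above can be illustrated with a short C sketch. It is not taken from the core's implementation: the function names are hypothetical, and it assumes that "N bits wide" means the dividend equals the sign extension of its low N bits.

```c
#include <stdint.h>

/* Returns nonzero if v equals the sign extension of its low `bits` bits,
 * i.e. v lies in [-2^(bits-1), 2^(bits-1) - 1].
 * Assumption: this is what the text means by an rs that is "N bits wide". */
static int sign_extends_from(int32_t v, int bits)
{
    int32_t lo = -((int32_t)1 << (bits - 1));
    int32_t hi = ((int32_t)1 << (bits - 1)) - 1;
    return v >= lo && v <= hi;
}

/* Number of 1-bit-per-clock divide iterations skipped for dividend rs,
 * using the figures quoted in the text (23, 15, or 7). */
int divide_iterations_skipped(int32_t rs)
{
    if (sign_extends_from(rs, 8))  return 23;
    if (sign_extends_from(rs, 16)) return 15;
    if (sign_extends_from(rs, 24)) return 7;
    return 0; /* full-width dividend: nothing is skipped */
}
```

Under these assumptions, a small loop counter used as a dividend would skip 23 iterations, while a full 32-bit value would skip none.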
1.2.1.4 System Control Coprocessor (CP0)
In the MIPS architecture, CP0 is responsible for the virtual-to-physical address translation, the exception control system, the processor’s diagnostics capability, the operating modes (kernel, user, and debug), and whether interrupts are
enabled or disabled. Configuration information, such as presence of build-time options like microMIPS, CorExtend
Module or Coprocessor 2 interface, is also available by accessing the CP0 registers.
Coprocessor 0 also contains the logic for identifying and managing exceptions. Exceptions can be caused by a variety
of sources, including boundary cases in data, external events, or program errors.
Interrupt Handling
The M14K core includes support for eight hardware interrupt pins, two software interrupts, and a timer interrupt.
These interrupts can be used in any of three interrupt modes, as defined by Release 2 of the MIPS32 Architecture:
•Interrupt compatibility mode, which acts identically to that in an implementation of Release 1 of the Architecture.
•Vectored Interrupt (VI) mode, which adds the ability to prioritize and vector interrupts to a handler dedicated to
that interrupt, and to assign a GPR shadow set for use during interrupt processing. The presence of this mode is
denoted by the VInt bit in the Config3 register. This mode is architecturally optional, but it is always present on
the M14K core, so the VInt bit always reads as 1.
•External Interrupt Controller (EIC) mode, which redefines the way in which interrupts are handled to provide full
support for an external interrupt controller handling prioritization and vectoring of interrupts. The presence of
this mode is denoted by the VEIC bit in the Config3 register. Again, this mode is architecturally optional. On the
M14K core, the VEIC bit is set externally by the static input, SI_EICPresent, to allow system logic to indicate the
presence of an external interrupt controller.
The reset state of the processor is interrupt compatibility mode, such that a processor supporting Release 2 of the
Architecture, the M14K core for example, is fully compatible with implementations of Release 1 of the Architecture.
VI or EIC interrupt modes can be combined with the optional shadow registers to specify which shadow set should be
used on entry to a particular vector. The shadow registers further improve interrupt latency by avoiding the need to
save context when invoking an interrupt handler.
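As a rough illustration of how software might determine which of the three interrupt modes is in effect, the following C sketch combines the VInt and VEIC bits with three control fields. The selection rule and every bit position (Status BEV at bit 22, Cause IV at bit 23, IntCtl VS at bits 9:5, Config3 VInt at bit 5, Config3 VEIC at bit 6) are assumptions drawn from the general MIPS32 Release 2 architecture rather than from this section, and should be checked against Chapter 5. The type and function names are hypothetical. Register values are passed in as plain integers so that the logic can be exercised off-target.

```c
#include <stdint.h>

typedef enum { INT_MODE_COMPAT, INT_MODE_VI, INT_MODE_EIC } int_mode;

/* Assumed bit positions (not stated in this section). */
#define STATUS_BEV     (1u << 22)   /* bootstrap exception vectors     */
#define CAUSE_IV       (1u << 23)   /* use special interrupt vector    */
#define INTCTL_VS_MASK (0x1Fu << 5) /* vector spacing, 0 = no spacing  */
#define CONFIG3_VINT   (1u << 5)    /* vectored interrupts implemented */
#define CONFIG3_VEIC   (1u << 6)    /* external interrupt controller   */

int_mode interrupt_mode(uint32_t status, uint32_t cause,
                        uint32_t intctl, uint32_t config3)
{
    /* Compatibility mode applies whenever vectoring is not fully enabled;
     * this includes the reset state, where BEV is assumed to be set. */
    if ((status & STATUS_BEV) || !(cause & CAUSE_IV) ||
        !(intctl & INTCTL_VS_MASK))
        return INT_MODE_COMPAT;

    if (config3 & CONFIG3_VEIC)
        return INT_MODE_EIC;
    if (config3 & CONFIG3_VINT)
        return INT_MODE_VI;
    return INT_MODE_COMPAT;
}
```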
In the M14K core, interrupt latency is reduced by:
•Speculative interrupt vector prefetching during the pipeline flush.
•Interrupt Automated Prologue (IAP) in hardware: Shadow Register Sets remove the need to save GPRs, and IAP
removes the need to save specific Control Registers when handling an interrupt.
•Interrupt Automated Epilogue (IAE) in hardware: Shadow Register Sets remove the need to restore GPRs, and
IAE removes the need to restore specific Control Registers when returning from an interrupt.
•Allow interrupt chaining. When servicing an interrupt and interrupt chaining is enabled, there is no need to return
from the current Interrupt Service Routine (ISR) if there is another valid interrupt pending to be serviced. The
control of the processor can jump directly from the current ISR to the next ISR without IAE and IAP.
GPR Shadow Registers
The MIPS32 Architecture optionally removes the need to save and restore GPRs on entry to high-priority interrupts
or exceptions, and to provide specified processor modes with the same capability. This is done by introducing multiple copies of the GPRs, called shadow sets, and allowing privileged software to associate a shadow set with entry to
kernel mode via an interrupt vector or exception. The normal GPRs are logically considered shadow set zero.
The number of GPR shadow sets is a build-time option. The M14K core allows 1 (the normal GPRs), 2, 4, 8, or 16
shadow sets. The highest number actually implemented is indicated by the HSS field of the SRSCtl register. If this field is zero, only
the normal GPRs are implemented.
Shadow sets are new copies of the GPRs that can be substituted for the normal GPRs on entry to kernel mode via an
interrupt or exception. When a shadow set is bound to a kernel-mode entry condition, references to GPRs operate
exactly as one would expect, but they are redirected to registers that are dedicated to that condition. Privileged software may need to reference all GPRs in the register file, even specific shadow registers that are not visible in the current mode, and the RDPGPR and WRPGPR instructions are used for this purpose. The CSS field of the SRSCtl
register provides the number of the current shadow register set, and the PSS field of the SRSCtl register provides the
number of the previous shadow register set that was current before the last exception or interrupt occurred.
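The SRSCtl fields named in this section can be pulled out of a register value as sketched below. The field positions (HSS at bits 29:26, ESS at bits 15:12, PSS at bits 9:6, CSS at bits 3:0) are assumptions based on the general MIPS32 privileged architecture and are not stated in this section; the structure and function names are hypothetical. The register value is passed in as a plain integer, so reading it from CP0 is left to the caller.

```c
#include <stdint.h>

/* Assumed field positions within SRSCtl; confirm against Chapter 5. */
#define SRSCTL_HSS_SHIFT 26
#define SRSCTL_ESS_SHIFT 12
#define SRSCTL_PSS_SHIFT 6
#define SRSCTL_CSS_SHIFT 0
#define SRSCTL_FIELD_MASK 0xFu

typedef struct {
    unsigned hss; /* highest shadow set implemented                   */
    unsigned ess; /* set bound to exceptions/non-vectored interrupts  */
    unsigned pss; /* previous shadow set                              */
    unsigned css; /* current shadow set                               */
} srsctl_fields;

srsctl_fields srsctl_decode(uint32_t srsctl)
{
    srsctl_fields f;
    f.hss = (srsctl >> SRSCTL_HSS_SHIFT) & SRSCTL_FIELD_MASK;
    f.ess = (srsctl >> SRSCTL_ESS_SHIFT) & SRSCTL_FIELD_MASK;
    f.pss = (srsctl >> SRSCTL_PSS_SHIFT) & SRSCTL_FIELD_MASK;
    f.css = (srsctl >> SRSCTL_CSS_SHIFT) & SRSCTL_FIELD_MASK;
    return f;
}

/* Total number of register sets, counting the normal GPRs as set zero. */
unsigned shadow_set_count(uint32_t srsctl)
{
    return srsctl_decode(srsctl).hss + 1;
}
```

With this layout, an HSS value of zero yields a count of one (only the normal GPRs), and an HSS value of 15 yields sixteen sets.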
If the processor is operating in VI interrupt mode, binding of a vectored interrupt to a shadow set is done by writing to
the SRSMap register. If the processor is operating in EIC interrupt mode, the binding of the interrupt to a specific
shadow set is provided by the external interrupt controller and is configured in an implementation-dependent way.
Binding of an exception or non-vectored interrupt to a shadow set is done by writing to the ESS field of the SRSCtl
register. When an exception or interrupt occurs, the value of the SRSCtl CSS field is copied to the SRSCtl PSS field,
and the CSS field is set to the value taken from the appropriate source. On an ERET, the value of the SRSCtl PSS field
is copied back into the SRSCtl CSS field to restore the shadow set of the mode to which control returns.
Refer to Chapter 5, “CP0 Registers of the M14K™ Core” on page 88 for more information on the CP0 registers.
Refer to Chapter 8, “EJTAG Debug Support in the M14K™ Core” on page 155 for more information on EJTAG
debug registers.
1.2.1.5 Memory Management Unit (MMU)
Modes of Operation
The M14K core implements three modes of operation:
•User mode is most often used for application programs.
•Kernel mode is typically used for handling exceptions and operating-system kernel functions, including CP0
management and I/O device accesses.
•Debug mode is used during system bring-up and software development. Refer to the EJTAG section for more
information on debug mode.
Figure 1.2 shows the virtual address map of the MIPS Architecture.
Figure 1.2 M14K™ Core Virtual Address Map
Address Range              Segment   Description
0xFF400000 - 0xFFFFFFFF    kseg3     Fix Mapped
0xFF200000 - 0xFF3FFFFF    kseg3     Memory/EJTAG (see note 1)
0xE0000000 - 0xFF1FFFFF    kseg3     Fix Mapped
0xC0000000 - 0xDFFFFFFF    kseg2     Kernel Virtual Address Space, Fix Mapped, 512 MB
0xA0000000 - 0xBFFFFFFF    kseg1     Kernel Virtual Address Space, Unmapped, 512 MB, Uncached
0x80000000 - 0x9FFFFFFF    kseg0     Kernel Virtual Address Space, Unmapped, 512 MB
0x00000000 - 0x7FFFFFFF    kuseg     User Virtual Address Space, Mapped, 2048 MB
1. This space is mapped to memory in user or kernel mode,
and by the EJTAG module in debug mode.
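The segment boundaries of the virtual address map in Figure 1.2 can be expressed as a small C classifier. This is only a sketch of the address ranges; the enum and function names are hypothetical, and access permissions per operating mode are not modeled.

```c
#include <stdint.h>

typedef enum { SEG_KUSEG, SEG_KSEG0, SEG_KSEG1, SEG_KSEG2, SEG_KSEG3 } segment;

/* Classify a 32-bit virtual address by the upper bound of each segment. */
segment classify_vaddr(uint32_t va)
{
    if (va <= 0x7FFFFFFFu) return SEG_KUSEG; /* 2048 MB, mapped        */
    if (va <= 0x9FFFFFFFu) return SEG_KSEG0; /* 512 MB, unmapped       */
    if (va <= 0xBFFFFFFFu) return SEG_KSEG1; /* 512 MB, unmapped, uncached */
    if (va <= 0xDFFFFFFFu) return SEG_KSEG2; /* 512 MB, fix mapped     */
    return SEG_KSEG3;                        /* remainder, fix mapped  */
}

/* The Memory/EJTAG window inside kseg3 (note 1 of the figure). */
int in_memory_ejtag_window(uint32_t va)
{
    return va >= 0xFF200000u && va <= 0xFF3FFFFFu;
}
```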
Memory Management Unit (MMU)
The M14K core contains a simple Fixed Mapping Translation (FMT) MMU that interfaces between the execution
unit and the SRAM controller.
•Fixed Mapping Translation (FMT)
An FMT is smaller and simpler than the full Translation Lookaside Buffer (TLB) style MMU found in other MIPS
cores. Like a TLB, the FMT performs virtual-to-physical address translation and provides attributes for the different segments. Those segments that are unmapped in a TLB implementation (kseg0 and kseg1) are translated
identically by the FMT.
Figure 1.3 shows how the FMT is implemented in the M14K core.
Figure 1.3 Address Translation During SRAM Access with FMT Implementation
[Figure: the instruction address calculator and the data address calculator each supply a virtual address to the FMT. The FMT supplies the corresponding physical addresses to the SRAM interface, which connects to the instruction SRAM and the data SRAM.]
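For the two segments that the text says are translated as in a TLB-based core, the translation can be sketched in C as follows. The sketch assumes the usual MIPS32 convention that kseg0 and kseg1 both alias the low 512 MB of physical address space (the upper three address bits are cleared); the fixed mappings of kuseg, kseg2, and kseg3 are not described in this section and are deliberately left out. The function name is hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Translate a kseg0 or kseg1 virtual address to a physical address.
 * Returns false, leaving *pa untouched, for any other segment.
 * Assumption: both segments map onto physical 0x00000000-0x1FFFFFFF. */
bool unmapped_to_physical(uint32_t va, uint32_t *pa)
{
    if (va >= 0x80000000u && va <= 0xBFFFFFFFu) {
        *pa = va & 0x1FFFFFFFu; /* drop the three segment-select bits */
        return true;
    }
    return false;
}
```

Under this assumption, the kseg0 address 0x80001000 and the kseg1 address 0xA0001000 refer to the same physical location and differ only in cacheability attributes.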
1.2.1.6 SRAM Interface Controller
Instead of caches, the M14K core contains an interface to SRAM-style memories that can be tightly coupled to the
core. This permits deterministic response time with less area than is typically required for caches. The SRAM interface includes separate uni-directional 32-bit buses for address, read data, and write data.
Dual or Unified Interfaces
The SRAM interface includes a build-time option to select either dual or unified instruction and data interfaces.
The dual interface enables independent connection to instruction and data devices. It generally yields the highest performance, because the pipeline can generate simultaneous I and D requests, which are then serviced in parallel.
For simpler or cost-sensitive systems, it is also possible to combine the I and D interfaces into a common interface
that services both types of requests. If I and D requests occur simultaneously, priority is given to the D side.
Back-stalling
Typically, read and write transactions will complete in a single cycle. However, if multi-cycle latency is desired, the
interface can be stalled to allow connection to slower devices.
Redirection
When the dual I/D interface is present, a mechanism exists to divert D-side references to the I-side, if desired. The
mechanism can be explicitly invoked for any other D-side references, as well. When the DS_Redir signal is asserted,
a D-side request is diverted to the I-side interface in the following cycle, and the D-side will be stalled until the transaction is completed.
Transaction Abort
The core may request a transaction (fetch/load/store/sync) to be aborted. This is particularly useful in case of interrupts. Because the core does not know whether transactions are re-startable, it cannot arbitrarily interrupt a request
that has been initiated on the SRAM interface. However, cycles spent waiting for a multi-cycle transaction to complete can directly impact interrupt latency. In order to minimize this effect, the interface supports an abort mechanism. The core requests an abort whenever an interrupt is detected and a transaction is pending (abort of an
instruction fetch may also be requested in other cases). The external system logic can choose to acknowledge or to
ignore the abort request.
Connecting to Narrower Devices
The instruction and data read buses are always 32 bits in width. To facilitate connection to narrower memories, the
SRAM interface protocol includes input byte-enables that can be used by system logic to signal validity as partial
read data becomes available. The input byte-enables conditionally register the incoming read data bytes within the
core, and thus eliminate the need for external registers to gather the entire 32 bits of data. External muxes are required
to redirect the narrower data to the appropriate byte lanes.
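The byte-enable gathering described above can be modeled in software as a small accumulator. This is a behavioral sketch only: the structure and function names are hypothetical, and the assumption that byte-enable bit i corresponds to data bits 8i+7:8i is made for illustration rather than taken from the interface specification.

```c
#include <stdint.h>
#include <stdbool.h>

/* Models the register inside the core that collects partial read data. */
typedef struct {
    uint32_t data;  /* bytes captured so far                   */
    uint8_t  valid; /* bit i set once byte lane i was captured */
} read_gather;

void gather_reset(read_gather *g)
{
    g->data = 0;
    g->valid = 0;
}

/* One transfer beat: capture the lanes flagged in the 4-bit byte_en.
 * The external muxes are assumed to have already placed each byte on
 * its proper lane of rdata. Returns true once all 32 bits are valid. */
bool gather_beat(read_gather *g, uint32_t rdata, uint8_t byte_en)
{
    for (int lane = 0; lane < 4; lane++) {
        if (byte_en & (1u << lane)) {
            uint32_t mask = 0xFFu << (8 * lane);
            g->data = (g->data & ~mask) | (rdata & mask);
            g->valid |= (uint8_t)(1u << lane);
        }
    }
    return g->valid == 0xF;
}
```

In this model an 8-bit memory would need four beats with one byte enable each, while a 16-bit memory would need two beats with two byte enables each.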
Lock Mechanism
The SRAM interface includes a protocol to identify a locked sequence, and is used in conjunction with the LL/SC
atomic read-modify-write semaphore instructions.
Sync Mechanism
The interface includes a protocol that externalizes the execution of the SYNC instruction. External logic might
choose to use this information to enforce memory ordering between various elements in the system.
External Call Indication
The instruction fetch interface contains signals that indicate that the core is fetching the target of a subroutine
call-type instruction such as JAL or BAL. At some point after a call, there will typically be a return to the original
code sequence. If a system prefetches instructions, it can make use of this information to save instructions that were
prefetched and are likely to be executed after the return.
1.2.1.7 Power Management
The M14K core offers a number of power management features, including low-power design, active power management, and power-down modes of operation. The core is a static design that supports slowing or halting the clocks,
which reduces system power consumption during idle periods.
The M14K core provides two mechanisms for system-level low-power support:
•Register-controlled power management
•Instruction-controlled power management
Register-Controlled Power Management
The RP bit in the CP0 Status register provides a software mechanism for placing the system into a low-power state.
The state of the RP bit is available externally via the SI_RP signal. The external agent then decides whether to place
the device in a low-power mode, such as reducing the system clock frequency.
Three additional bits, EXL and ERL in the Status register and DM in the Debug register, support the power management function by allowing the
user to change the power state if an exception or error occurs while the M14K core is in a low-power state. Depending
on what type of exception is taken, one of these three bits will be asserted and reflected on the SI_EXL, SI_ERL, or
EJ_DebugM outputs. The external agent can look at these signals and determine whether to leave the low-power state
to service the exception.
The following four power-down signals are part of the system interface and change state as the corresponding bits in
the CP0 registers are set or cleared:
•The SI_RP signal represents the state of the RP bit (27) in the CP0 Status register.
•The SI_EXL signal represents the state of the EXL bit (1) in the CP0 Status register.
•The SI_ERL signal represents the state of the ERL bit (2) in the CP0 Status register.
•The EJ_DebugM signal represents the state of the DM bit (30) in the CP0 Debug register.
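The relationship between the CP0 bits and the four output signals listed above can be sketched in C. The bit positions are those given in the list; the structure, the function names, and the wake-up policy of the external agent are hypothetical and only illustrate one possible use of the signals.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool si_rp;     /* Status RP, bit 27 */
    bool si_exl;    /* Status EXL, bit 1 */
    bool si_erl;    /* Status ERL, bit 2 */
    bool ej_debugm; /* Debug DM, bit 30  */
} power_signals;

/* Derive the signal levels from raw Status and Debug register values. */
power_signals power_signals_from_cp0(uint32_t status, uint32_t debug)
{
    power_signals s;
    s.si_rp     = (status >> 27) & 1u;
    s.si_exl    = (status >> 1) & 1u;
    s.si_erl    = (status >> 2) & 1u;
    s.ej_debugm = (debug >> 30) & 1u;
    return s;
}

/* Hypothetical external-agent policy: leave the low-power state when
 * reduced power was requested but an exception, error, or debug
 * condition is now being signaled. */
bool should_leave_low_power(power_signals s)
{
    return s.si_rp && (s.si_exl || s.si_erl || s.ej_debugm);
}
```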
Instruction-Controlled Power Management
The second mechanism for invoking power-down mode is by executing the WAIT instruction. When the WAIT
instruction is executed, the internal clock is suspended; however, the internal timer and some of the input pins
(SI_Int[5:0], SI_NMI, SI_Reset, and SI_ColdReset) continue to run. When the CPU is in instruction-controlled power
management mode, any interrupt, NMI, or reset condition causes the CPU to exit this mode and resume normal operation.
The M14K core asserts the SI_Sleep signal, which is part of the system interface bus, whenever the WAIT instruction
is executed. The assertion of SI_Sleep indicates that the clock has stopped and the M14K core is waiting for an interrupt.
Local clock gating
The majority of the power consumed by the M14K core is in the clock tree and clocking registers. The core has support for extensive use of local gated clocks. Power-conscious implementors can use these gated clocks to significantly
reduce power consumption within the core.
Refer to Chapter 7, “Power Management of the M14K™ Core” on page 153 for more information on power management.
1.2.2 Optional Logic Blocks
The core consists of the following optional logic blocks as shown in the block diagram in Figure 1.1.
1.2.2.1 Reference Design
The M14K core contains a reference design that shows a typical usage of the core with:
•Dual I-SRAM and D-SRAM interface with fast memories (i.e., SRAM) for instruction and data storage.
•Optimized interface for slow memory (i.e., Flash memory) access by having a prefetch buffer and a wider Data
Read bus (i.e., IS_RData[127:0]) to speed up I-Fetch performance.
•AHB-lite bus interface to the system bus if the memory accesses are outside the memory map for the SRAM and
Flash regions. AHB-Lite is a subset of the AHB bus protocol that supports a single bus master. The interface
shares the same 32-bit Read and Write address bus and has two unidirectional 32-bit buses for Read and Write
data.
The reference design is optional and can be modified by the user to better fit the SOC design requirement.
An optional CorExtend User-defined Instruction (UDI) block enables the implementation of a small number of application-specific instructions that are tightly coupled to the core’s execution unit. The interface to the UDI block is
external to the M14K core.
Such instructions may operate on a general-purpose register, immediate data specified by the instruction word, or
local state stored within the UDI block. The destination may be a general-purpose register or local UDI state. The
operation may complete in one cycle or multiple cycles, if desired.
Refer to Table 10.3 “Special2 Opcode Encoding of Function Field” for a specification of the opcode map available
for user-defined instructions.
1.2.2.6 EJTAG Debug Support
The M14K core provides for an optional Enhanced JTAG (EJTAG) interface for use in the software debug of application and kernel code. In addition to standard user and kernel modes of operation, the M14K core provides a Debug
mode that is entered after a debug exception (derived from a hardware breakpoint, single-step exception, etc.) is taken
and continues until a debug exception return (DERET) instruction is executed. During this time, the processor executes the debug exception-handler routine.
The EJTAG interface operates through the Test Access Port (TAP), a serial communication port used for transferring
test data in and out of the M14K core. In addition to the standard JTAG instructions, special instructions defined in
the EJTAG specification specify which registers are selected and how they are used.
Debug Registers
Four debug registers (DEBUG, DEBUG2, DEPC, and DESAVE) have been added to the MIPS Coprocessor 0 (CP0)
register set. The DEBUG and DEBUG2 registers show the cause of the debug exception and are used for setting up
single-step operations. The DEPC (Debug Exception Program Counter) register holds the address on which the debug
exception was taken, which is used to resume program execution after the debug operation finishes. Finally, the
DESAVE (Debug Exception Save) register enables the saving of general-purpose registers used during execution of
the debug exception handler.
To exit debug mode, a Debug Exception Return (DERET) instruction is executed. When this instruction is executed,
the system exits debug mode, allowing normal execution of application and system code to resume.
EJTAG Hardware Breakpoints
There are several types of simple hardware breakpoints defined in the EJTAG specification. These stop the normal
operation of the CPU and force the system into debug mode. There are two types of simple hardware breakpoints
implemented in the M14K core: Instruction breakpoints and Data breakpoints. Additionally, complex hardware
breakpoints can be included, which allow detection of more intricate sequences of events.
The M14K core can be configured with the following breakpoint options:
•No data, instruction, or complex breakpoints
•One data and two instruction breakpoints, without complex breakpoints
•Two data and four instruction breakpoints, without complex breakpoints
•Two data and six instruction breakpoints, with or without complex breakpoints
•Four data and eight instruction breakpoints, with or without complex breakpoints
Instruction breakpoints occur on instruction execution operations, and the breakpoint is set on the virtual address. A
mask can be applied to the virtual address to set breakpoints on a binary range of instructions.
Data breakpoints occur on load/store transactions, and the breakpoint is set on a virtual address value, with the same
single address or binary address range as the Instruction breakpoint. Data breakpoints can be set on a load, a store, or
both. Data breakpoints can also be set to match on the operand value of the load/store operation, with byte-granularity
masking. Finally, masks can be applied to both the virtual address and the load/store value.
In addition, the M14K core has a configurable feature to support data and instruction address-range triggered breakpoints, where a breakpoint can occur when a virtual address is either within or outside a pair of 32-bit addresses.
Unlike the traditional address-mask control, address-range triggering is not restricted to a power-of-two binary
boundary.
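The difference between address-mask matching and address-range matching can be shown with two small C predicates. These are sketches of the comparison logic only. The function names are hypothetical, the polarity of the mask (a set bit means the address bit is ignored) is an assumption, and the range bounds are assumed to be inclusive.

```c
#include <stdint.h>
#include <stdbool.h>

/* Mask-based match: address bits whose mask bit is 1 are not compared,
 * so the matched region is always a power-of-two-sized, aligned block. */
bool mask_match(uint32_t va, uint32_t bp_addr, uint32_t ignore_mask)
{
    return ((va ^ bp_addr) & ~ignore_mask) == 0;
}

/* Range-based match: any pair of 32-bit bounds, matching either inside
 * or outside the range. */
bool range_match(uint32_t va, uint32_t lo, uint32_t hi, bool match_outside)
{
    bool inside = va >= lo && va <= hi;
    return match_outside ? !inside : inside;
}
```

A 0x300-byte buffer starting at 0x80001000, for example, cannot be covered exactly by one mask, because 0x300 is not a power of two, but it can be covered by a single range.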
Complex breakpoints utilize the simple instruction and data breakpoints and break when combinations of events are
seen. Complex break features include:
•Pass Counters - Each time a matching condition is seen, a counter is decremented. The break or trigger will only
be enabled when the counter has counted down to 0.
•Tuples - A tuple is the pairing of an instruction and a data breakpoint. The tuple will match if both the virtual
address of the load or store instruction matches the instruction breakpoint, and the data breakpoint of the resulting load or store address and optional data value matches.
•Priming - This allows a breakpoint to be enabled only after other break conditions have been met. Also called
sequential or armed triggering.
•Qualified - This feature uses a data breakpoint to qualify when an instruction breakpoint can be taken. When a
load matches the data address and the data value, the instruction break will be enabled. If a load matches the
address, but has mis-matching data, the instruction break will be disabled.
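The pass-counter behavior in the list above can be modeled with a few lines of C. This sketch follows one reading of the description, in which a match that finds the counter at a nonzero value only decrements it, and a match that finds the counter already at zero takes the break. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t remaining; /* matches still to be passed over */
} pass_counter;

/* Call once per matching condition. Returns true when the break or
 * trigger is enabled, i.e. the counter has already counted down to 0. */
bool pass_counter_on_match(pass_counter *pc)
{
    if (pc->remaining > 0) {
        pc->remaining--;
        return false;
    }
    return true;
}
```

Under this reading, a counter loaded with N lets N matches pass and breaks on match N+1, which is useful for stopping on a late iteration of a loop.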
Performance Counters
Performance counters are used to accumulate occurrences of internal predefined events/cycles/conditions for program analysis, debug, or profiling. A few examples of event types are clock cycles, instructions executed, specific
instruction types executed, loads, stores, exceptions, and cycles while the CPU is stalled. There are two 32-bit
counters. Each can count one of the 64 internal predefined events selected by a corresponding control register. A
counter overflow can be programmed to generate an interrupt, where the interrupt-handler software can maintain
larger total counts.
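One way for interrupt-handler software to maintain larger total counts is to fold the 32-bit hardware value into a 64-bit software total each time the overflow interrupt fires. The sketch below shows only that bookkeeping. The names are hypothetical, and the handler is assumed to read the hardware counter and reset it to zero before calling `perf_fold`; the exact overflow condition and register accesses are not modeled.

```c
#include <stdint.h>

typedef struct {
    uint64_t total; /* events accumulated from earlier counter periods */
} perf_total;

/* Called from the (hypothetical) overflow interrupt handler with the
 * value read from the 32-bit counter just before it was reset to 0. */
void perf_fold(perf_total *t, uint32_t hw_count)
{
    t->total += hw_count;
}

/* Current 64-bit event count: the folded total plus the live counter. */
uint64_t perf_read(const perf_total *t, uint32_t hw_count_now)
{
    return t->total + hw_count_now;
}
```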
PC/Address Sampling
This sampling function is used for program profiling and hot-spots analysis. Instruction PC and/or Load/Store
addresses can be sampled periodically. The result is scanned out through the EJTAG port. The Debug Control
Register (DCR) is used to specify the sample period and the sample trigger.
Fast Debug Channel (FDC)
The M14K core includes an optional FDC as a mechanism for high bandwidth data transfer between a debug
host/probe and a target. FDC provides a FIFO buffering scheme to transfer data serially, with low CPU overhead and
minimized waiting time. The data transfer occurs in the background, and the target CPU can choose either to check
the status of the transfer periodically or to be interrupted at the end of the transfer.
Figure 1.5 FDC Overview
[Figure: the probe connects to the EJTAG TAP controller of the M14K core through TDI, TDO, and TMS. Inside the core, the FDC contains two 32-bit FIFOs: a receive FIFO (probe to core) and a transmit FIFO (core to probe).]
iFlowtrace™
The M14K core has an option for a simple trace mechanism named iFlowtrace. This mechanism only traces the
instruction PC, not data addresses or values. This simplification allows the trace block to be smaller and the trace
compression to be more efficient. iFlowtrace memory can be configured as off-chip, on-chip, or both.
iFlowtrace also offers special-event trace modes when normal tracing is disabled, namely:
•Function Call/Return and Exception Tracing mode to trace the PC value of function calls and returns and/or
exceptions and returns.
•Breakpoint Match mode traces the breakpoint ID of a matching breakpoint and, for data breakpoints, the PC
value of the instruction that caused it.
•Filtered Data Tracing mode traces the ID of a matching data breakpoint, the load or store data value, access type
and memory access size, and the low-order address bits of the memory access, which is useful when the data
breakpoint is set up to match a binary range of addresses.
•User Trace Messages. The user can instrument their code to add their own 32-bit value messages into the trace by
writing to the Cop0 UTM register.
•Delta Cycle mode works in combination with the above trace modes to provide a timestamp between stored
events. It reports the number of cycles that have elapsed since the last message was generated and put into the
trace.
Refer to Chapter 8, “EJTAG Debug Support in the M14K™ Core” on page 155 for more information on the EJTAG
features.
cJTAG Support
The M14K core provides an external conversion block which converts the existing EJTAG (IEEE 1149.1) 4-wire
interface at the M14K core to a cJTAG (IEEE 1149.7) 2-wire interface. cJTAG reduces the number of wires from 4 to
2 and enables the support of Star-2 scan topology in the system debug environment.
Figure 1.6 cJTAG Support
[Figure: the EJTAG TAP controller in the M14K core drives the 4-wire EJTAG interface (TDI, TDO, TCK, TMS). The cJTAG conversion block converts this to the 2-wire cJTAG interface (TMSC, TCK).]
SecureDebug
SecureDebug improves security by disabling untrusted EJTAG debug access. An input signal is used to disable debug
features, such as Probe Trap, Debug Interrupt Exception (EjtagBrk and DINT), EJTAGBOOT instruction, and PC
Sampling.
Chapter 2
Pipeline of the M14K™ Core
The M14K processor core implements a 5-stage pipeline similar to the original M4K pipeline, with similar performance. The pipeline allows
the processor to achieve high frequency while minimizing device complexity, reducing both cost and power consumption.
The M14K core pipeline consists of five stages:
•Instruction (I Stage)
•Execution (E Stage)
•Memory (M Stage)
•Align (A Stage)
•Writeback (W Stage)
2.1 Pipeline Stages
The M14K core implements a bypass mechanism that allows the result of an operation to be forwarded directly to the
instruction that needs it without having to write the result to the register and then read it back.
The M14K soft core includes a build-time option that determines the type of multiply/divide unit (MDU) implemented. The MDU can be either a high-performance 32x16 multiplier array or an iterative, area-efficient array. The
MDU choice has a significant effect on the MDU pipeline, and the latency of multiply/divide instructions executed on
the core. Software can query the type of MDU present on a specific implementation of the core by querying the MDU
bit in the Config register (CP0 register 16, select 0); see Chapter 5, “CP0 Registers of the M14K™ Core” on page 88
for more details.
Figure 2.1 shows the operations performed in each pipeline stage of the M14K processor core, when the high-performance
multiplier is present.
Figure 2.1 M14K™ Core Pipeline Stages with High-Performance MDU
[Figure legend: D-SRAM read; load data aligner; register file write; MUL instruction; multiply, multiply-accumulate, and divide; result can be read from MDU; one or more cycles.]
2.1.1 I Stage: Instruction Fetch
During the Instruction fetch stage:
•An instruction is fetched from the instruction SRAM.
•If both MIPS32 and microMIPS ISAs are supported, microMIPS instructions are converted to MIPS32-like
instructions. If the MIPS32 ISA is not supported, 16-bit microMIPS instructions will be first recoded into 32-bit
microMIPS equivalent instructions, and then decoded in native microMIPS ISA format.
2.1.2 E Stage: Execution
During the Execution stage:
•Operands are fetched from the register file.
•Operands from the M and A stage are bypassed to this stage.
•The Arithmetic Logic Unit (ALU) begins the arithmetic or logical operation for register-to-register instructions.
•The ALU calculates the data virtual address for load and store instructions and the MMU performs the fixed virtual-to-physical address translation.
•The ALU determines whether the branch condition is true and calculates the virtual branch target address for
branch instructions.
•Instruction logic selects an instruction address and the MMU performs the fixed virtual-to-physical address
translation.
•All multiply and divide operations begin in this stage.
2.1.3 M Stage: Memory Fetch
During the Memory fetch stage:
•The arithmetic ALU operation completes.
•The data SRAM access is performed for load and store instructions.
•A 16x16 or 32x16 multiply calculation completes (high-performance MDU option).
•A 32x32 multiply operation stalls the MDU pipeline for one clock in the M stage (high-performance MDU
option).
•A multiply operation stalls the MDU pipeline for 31 clocks in the M stage (area-efficient MDU option).
•A multiply-accumulate operation stalls the MDU pipeline for 33 clocks in the M stage (area-efficient MDU
option).
•A divide operation stalls the MDU pipeline for a maximum of 34 clocks in the M stage. Early-in sign extension
detection on the dividend will skip 7, 15, or 23 stall clocks (only the divider in the fast MDU option supports
early-in detection).
2.1.4 A Stage: Align
During the Align stage:
•Load data is aligned to its word boundary.
•A multiply/divide operation updates the HI/LO registers (area-efficient MDU option).
•A multiply operation performs the carry-propagate-add. The actual register writeback is performed in the W stage
(high-performance MDU option).
•A MUL operation makes the result available for writeback. The actual register writeback is performed in the W
stage.
•EJTAG complex break conditions are evaluated.
2.1.5 W Stage: Writeback
During the Writeback stage:
•For register-to-register or load instructions, the result is written back to the register file.
2.2 Multiply/Divide Operations
The M14K core implements the standard MIPS II™ multiply and divide instructions. Additionally, several new
instructions were standardized in the MIPS32 architecture for enhanced performance.
The targeted multiply instruction, MUL, specifies that multiply results be placed in the general-purpose register file
instead of the HI/LO register pair. By avoiding the explicit MFLO instruction, required when using the LO register,
and by supporting multiple destination registers, the throughput of multiply-intensive operations is increased.
Four instructions, multiply-add (MADD), multiply-add-unsigned (MADDU), multiply-subtract (MSUB), and multiply-subtract-unsigned (MSUBU), are used to perform the multiply-accumulate and multiply-subtract operations. The
MADD/MADDU instruction multiplies two numbers and then adds the product to the current contents of the HI and
LO registers. Similarly, the MSUB/MSUBU instruction multiplies two operands and then subtracts the product from
the HI and LO registers. The MADD/MADDU and MSUB/MSUBU operations are commonly used in DSP algorithms.
All multiply operations (except the MUL instruction) write to the HI/LO register pair. All integer operations write to
the general purpose registers (GPR). Because MDU operations write to different registers than integer operations,
integer instructions that follow can execute before the MDU operation has completed. The MFLO and MFHI instructions are used to move data from the HI/LO register pair to the GPR file. If an MFLO or MFHI instruction is issued
before the MDU operation completes, it will stall to wait for the data.
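The accumulate behavior described above can be sketched in a few lines. This is only an illustrative model, assuming HI and LO are treated as one 64-bit accumulator (HI in the upper half); the function and parameter names are hypothetical and do not correspond to any core signal or API.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def multiply_accumulate(hi, lo, rs, rt, subtract=False, unsigned=False):
    """Model MADD/MADDU (subtract=False) or MSUB/MSUBU (subtract=True).

    Returns the new (hi, lo) pair. Hypothetical helper for illustration.
    """
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    if unsigned:
        product = (rs & MASK32) * (rt & MASK32)
    else:
        product = to_signed32(rs) * to_signed32(rt)
    acc = (acc - product if subtract else acc + product) & MASK64
    return (acc >> 32) & MASK32, acc & MASK32


# A short dot-product style loop, the typical DSP use mentioned above.
hi, lo = 0, 0
for a, b in [(3, 4), (5, 6)]:
    hi, lo = multiply_accumulate(hi, lo, a, b)
```

After the loop, LO holds the sum of products and would be read with MFLO.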
2.3 MDU Pipeline - High-performance MDU
The M14K processor core contains an autonomous multiply/divide unit (MDU) with a separate pipeline for multiply
and divide operations. This pipeline operates in parallel with the integer unit (IU) pipeline and does not stall when the
IU pipeline stalls. This allows multi-cycle MDU operations, such as a divide, to be partially masked by system stalls
and/or other integer unit instructions.
The MDU consists of a 32x16 Booth-encoded multiplier array, a carry propagate adder, result/accumulation registers
(HI and LO), multiply and divide state machines, and all necessary multiplexers and control logic. The first number
shown (‘32’ of 32x16) represents the rs operand. The second number (‘16’ of 32x16) represents the rt operand. The
core only checks the latter (rt) operand value to determine how many times the operation must pass through the multiplier array. The 16x16 and 32x16 operations pass through the multiplier array once. A 32x32 operation passes
through the multiplier array twice.
The MDU supports execution of a 16x16 or 32x16 multiply operation every clock cycle; 32x32 multiply operations
can be issued every other clock cycle. Appropriate interlocks are implemented to stall the issue of back-to-back
32x32 multiply operations. Multiply operand size is automatically determined by logic built into the MDU. Divide
operations are implemented with a simple 1-bit-per-clock iterative algorithm with early-in detection of sign extension on the dividend (rs). Any attempt to issue a subsequent MDU instruction while a divide is still active causes an
IU pipeline stall until the divide operation is completed.
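As a rough sketch of the operand-size check, the following assumes that rt counts as a 16-bit operand when its upper bits are pure sign extension (signed multiply) or zero (unsigned multiply). The manual does not spell out the exact test, so that rule and the function names are assumptions made for illustration.

```python
def rt_fits_16_bits(rt, signed=True):
    """Assumed size test: True if rt can be represented in 16 bits."""
    rt &= 0xFFFFFFFF
    if signed:
        # Bits 31..15 must all be equal (sign extension of a 16-bit value).
        return (rt >> 15) in (0, 0x1FFFF)
    return (rt >> 16) == 0


def multiplier_passes(rt, signed=True):
    """Passes through the 32x16 array: one for 16x16/32x16, two for 32x32."""
    return 1 if rt_fits_16_bits(rt, signed) else 2


def multiply_issue_interval(rt, signed=True):
    """Clocks between back-to-back multiplies: every clock or every other clock."""
    return multiplier_passes(rt, signed)
```

Because only rt is examined, placing the narrower operand in rt keeps a multiply on the single-pass path.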
Table 2.1 lists the latencies (number of cycles until a result is available) for multiply and divide instructions. The
latencies are listed in terms of pipeline clocks. In this table, ‘latency’ refers to the number of cycles necessary for the
first instruction to produce the result needed by the second instruction.
[1] For multiply operations, this is the rt operand. For divide operations, this is the rs operand.
[2] Integer Operation refers to any integer instruction that uses the result of a previous MDU operation.
[3] This does not include the 1 or 2 IU pipeline stalls (16 bit or 32 bit) that the MUL operation causes irrespective
of the following instruction. These stalls do not add to the latency of 2.
[4] If both operands are positive, then the Sign Adjust stage is bypassed. Latency is then the same as for
DIVU.
In Table 2.1, a latency of one means that the first and second instructions can be issued back-to-back in the code,
without the MDU causing any stalls in the IU pipeline. A latency of two means that if issued back-to-back, the IU
pipeline will be stalled for one cycle. MUL operations are special, because the MDU needs to stall the IU pipeline in
order to maintain its register file write slot. As a result, the MUL 16x16 or 32x16 operation will always force a one-cycle stall of the IU pipeline, and the MUL 32x32 will force a two-cycle stall. If the integer instruction immediately
following the MUL operation uses its result, an additional stall is forced on the IU pipeline.
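The relationship between latency and stall cycles can be written down as a small sketch. The general formula (latency minus one, reduced by any independent instructions placed in between) extrapolates from the two cases stated above and is an assumption; the function names are hypothetical.

```python
def iu_stall_cycles(latency, independent_gap=0):
    """Stall cycles seen by the consumer of an MDU result.

    latency: value from the latency table.
    independent_gap: unrelated instructions between producer and consumer
    (assumed to hide one cycle each).
    """
    return max(latency - 1 - independent_gap, 0)


def mul_stall_cycles(is_32x32, next_uses_result):
    """Stalls forced by MUL: 1 (16x16 or 32x16) or 2 (32x32) for the register
    file write slot, plus one more if the next instruction uses the result."""
    base = 2 if is_32x32 else 1
    return base + (1 if next_uses_result else 0)
```

For example, a 32x32 MUL followed immediately by a consumer of its result costs three stall cycles under this model.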
Table 2.2 lists the repeat rates (peak issue rate, in cycles until the operation can be reissued) for multiply
accumulate/subtract instructions. The repeat rates are listed in terms of pipeline clocks. In this table, ‘repeat rate’ refers to the
case where the first MDU instruction (in the table below) is issued back-to-back with the second instruction.
Figure 2.3 below shows the pipeline flow for the following sequence:
1. 32x16 multiply (Mult1)
2. Add
3. 32x32 multiply (Mult2)
4. Subtract (Sub)
The 32x16 multiply operation requires one clock of each pipeline stage to complete. The 32x32 multiply operation
requires two clocks in the M_MDU pipe-stage. The MDU pipeline is shown as the shaded areas of Figure 2.3 and
always starts a computation in the final phase of the E stage. As shown in the figure, the M_MDU pipe-stage of the
MDU pipeline occurs in parallel with the M stage of the IU pipeline, the A_MDU stage occurs in parallel with the A
stage, and the W_MDU stage occurs in parallel with the W stage. In general this need not be the case. Following the 1st
cycle of the M stages, the two pipelines need not be synchronized. This does not present a problem because results in
the MDU pipeline are written to the HI and LO registers, while the integer pipeline results are written to the register
file.
Figure 2.3 MDU Pipeline Behavior During Multiply Operations
The following is a cycle-by-cycle analysis of Figure 2.3.
1. The first 32x16 multiply operation (Mult1) is fetched from the instruction cache and enters the I stage.
2. An Add operation enters the I stage. The Mult1 operation enters the E stage. The integer and MDU pipelines
share the I and E pipeline stages. At the end of the E stage in cycle 2, the MDU pipeline starts processing the
multiply operation (Mult1).
3. In cycle 3, a 32x32 multiply operation (Mult2) enters the I stage and is fetched from the instruction cache. Since
the Add operation has not yet reached the M stage by cycle 3, there is no activity in the M stage of the integer
pipeline at this time.
4. In cycle 4, the Subtract instruction enters the I stage. The second multiply operation (Mult2) enters the E stage, and
the Add operation enters the M stage of the integer pipe. Since the Mult1 multiply is a 32x16 operation, only one
clock is required for the M_MDU stage, hence the Mult1 operation passes to the A_MDU stage of the MDU pipeline.
5. In cycle 5, the Subtract instruction enters the E stage. The Mult2 multiply enters the M_MDU stage. The Add operation
enters the A stage of the integer pipeline. The Mult1 operation completes and is written back to the HI/LO register
pair in the W_MDU stage.
6. Since a 32x32 multiply requires two passes through the multiplier, with each pass requiring one clock, the 32x32
Mult2 remains in the M_MDU stage in cycle 6. The Sub instruction enters the M stage in the integer pipeline. The Add
operation completes and is written to the register file in the W stage of the integer pipeline.
7. The Mult2 multiply operation progresses to the A_MDU stage, and the Sub instruction progresses to the A stage.
8. The Mult2 operation completes and is written to the HI/LO register pair in the W_MDU stage, while the Sub
instruction writes to the register file in the W stage.
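The cycle-by-cycle walk above can be reproduced with a small timing sketch. It assumes one instruction is fetched per cycle and that no slips or interlocks occur, so it is only valid for sequences like this one; the function name and the tuple layout are hypothetical.

```python
def schedule(ops):
    """Compute stage timing for a straight-line sequence without slips.

    ops: list of (name, m_clocks), where m_clocks is the number of clocks
    spent in the M (or M_MDU) stage: 1 normally, 2 for a 32x32 multiply.
    Returns {name: {"I": c, "E": c, "M": (first, last), "A": c, "W": c}}.
    """
    timeline = {}
    for index, (name, m_clocks) in enumerate(ops):
        i_cycle = index + 1          # one fetch per cycle
        e_cycle = i_cycle + 1
        m_first = e_cycle + 1
        m_last = m_first + m_clocks - 1
        timeline[name] = {
            "I": i_cycle,
            "E": e_cycle,
            "M": (m_first, m_last),
            "A": m_last + 1,
            "W": m_last + 2,
        }
    return timeline


flow = schedule([("Mult1", 1), ("Add", 1), ("Mult2", 2), ("Sub", 1)])
```

In this model Mult2 and Sub both reach their writeback stage in cycle 8, which is harmless because one writes HI/LO and the other writes the register file.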
2.3.1 32x16 Multiply (High-Performance MDU)
The 32x16 multiply operation begins in the last phase of the E stage, which is shared between the integer and MDU
pipelines. In the latter phase of the E stage, the rs and rt operands arrive and the Booth-recoding function occurs at
this time. The multiply calculation requires one clock and occurs in the M_MDU stage. In the A_MDU stage, the
carry-propagate-add (CPA) function occurs and the operation is completed. The result is ready to be read from the
HI/LO registers in the W_MDU stage.
Figure 2.4 shows a diagram of a 32x16 multiply operation.
Figure 2.4 MDU Pipeline Flow During a 32x16 Multiply Operation
Clock      1       2       3       4
Stage      E       M_MDU   A_MDU   W_MDU
Operation  Booth   Array   CPA     Res
2.3.2 32x32 Multiply (High-Performance MDU)
The 32x32 multiply operation begins in the last phase of the E stage, which is shared between the integer and MDU
pipelines. In the latter phase of the E stage, the rs and rt operands arrive and the Booth-recoding function occurs at
this time. The multiply calculation requires two clocks and occurs in the M_MDU stage. In the A_MDU stage, the CPA
function occurs and the operation is completed.
Figure 2.5 shows a diagram of a 32x32 multiply operation.
Figure 2.5 MDU Pipeline Flow During a 32x32 Multiply Operation
Clock      1       2              3       4       5
Stage      E       M_MDU          M_MDU   A_MDU   W_MDU
Operation  Booth   Array, Booth   Array   CPA     Res
2.3.3 Divide (High-Performance MDU)
Divide operations are implemented using a simple non-restoring division algorithm. This algorithm works only for
positive operands, hence the first cycle of the M_MDU stage is used to negate the rs operand (RS Adjust) if needed. Note
that this cycle is spent even if the adjustment is not necessary. During the next maximum 32 cycles (3-34) an iterative
add/subtract loop is executed. In cycle 3 an early-in detection is performed in parallel with the add/subtract. The
adjusted rs operand is detected to be zero extended on the upper most 8, 16 or 24 bits. If this is the case the following
7, 15 or 23 cycles of the add/subtract iterations are skipped.
The remainder adjust (Rem Adjust) cycle is required if the remainder was negative. Note that this cycle is spent even
if the remainder was positive. A sign adjust is performed on the quotient and/or remainder if necessary. The sign
adjust stage is skipped if both operands are positive. In this case the Rem Adjust is moved to the A_MDU stage.
Figure 2.6, Figure 2.7, Figure 2.8 and Figure 2.9 show the latency for 8, 16, 24 and 32 bit divide operations, respectively.
The repeat rate is either 11, 19, 27 or 35 cycles (one less if the sign adjust stage is skipped) as a second divide
can be in the RS Adjust stage when the first divide is in the Reg WR stage.
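The repeat rates quoted above can be tabulated in a short sketch. It assumes that the divide width class is determined by how many upper bits of the adjusted (non-negative) dividend are zero, and that a zero operand counts as positive for the purpose of skipping the sign adjust stage; both points, and all names, are assumptions for illustration only.

```python
MASK32 = 0xFFFFFFFF
REPEAT_RATE = {8: 11, 16: 19, 24: 27, 32: 35}  # cycles, from the text above


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def divide_width_class(rs):
    """Return 8, 16, 24 or 32 depending on the size of the adjusted dividend."""
    magnitude = abs(to_signed32(rs))  # RS Adjust negates a negative dividend
    for width in (8, 16, 24):
        if magnitude < (1 << width):
            return width
    return 32


def div_repeat_rate(rs, rt):
    """Repeat rate of a signed DIV in the high-performance MDU."""
    rate = REPEAT_RATE[divide_width_class(rs)]
    both_positive = to_signed32(rs) >= 0 and to_signed32(rt) >= 0
    return rate - 1 if both_positive else rate
```

Only the dividend (rs) influences the width class; the divisor matters only through its sign.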
Figure 2.6 High-Performance MDU Pipeline Flow During a 8-bit Divide (DIV) Operation
Clock   Stage   Operation
1       E
2       M_MDU   RS Adjust
3       M_MDU   Add/Subtract, Early In
4-10    M_MDU   Add/Subtract
11      M_MDU   Rem Adjust
12      A_MDU   Sign Adjust
13      W_MDU   MDU Res Rdy
Figure 2.7 High-Performance MDU Pipeline Flow During a 16-bit Divide (DIV) Operation
Clock   Stage   Operation
1       E
2       M_MDU   RS Adjust
3       M_MDU   Add/Subtract, Early In
4-18    M_MDU   Add/Subtract
19      M_MDU   Rem Adjust
20      A_MDU   Sign Adjust
21      W_MDU   MDU Res Rdy
Figure 2.8 High-Performance MDU Pipeline Flow During a 24-bit Divide (DIV) Operation
Clock   Stage   Operation
1       E
2       M_MDU   RS Adjust
3       M_MDU   Add/Subtract, Early In
4-26    M_MDU   Add/Subtract
27      M_MDU   Rem Adjust
28      A_MDU   Sign Adjust
29      W_MDU   MDU Res Rdy
Figure 2.9 High-Performance MDU Pipeline Flow During a 32-bit Divide (DIV) Operation
Clock   Stage   Operation
1       E
2       M_MDU   RS Adjust
3       M_MDU   Add/Subtract, Early In
4-34    M_MDU   Add/Subtract
35      M_MDU   Rem Adjust
36      A_MDU   Sign Adjust
37      W_MDU   MDU Res Rdy
2.4 MDU Pipeline - Area-Efficient MDU
The area-efficient multiply/divide unit (MDU) is a separate autonomous block for multiply and divide operations.
The MDU is not pipelined, but rather performs the computations iteratively in parallel with the integer unit (IU) pipeline and does not stall when the IU pipeline stalls. This allows the long-running MDU operations to be partially
masked by system stalls and/or other integer unit instructions.
The MDU consists of one 32-bit adder, result-accumulate registers (HI and LO), a combined multiply/divide state
machine, and all multiplexers and control logic. A simple 1-bit-per-clock recursive algorithm is used for both multiply and divide operations. Using Booth’s algorithm, all multiply operations complete in 32 clocks. Two extra clocks
are needed for multiply-accumulate. The non-restoring algorithm used for divide operations will not work with negative
numbers. Adjustments before and after are thus required depending on the sign of the operands. All divide operations complete in 33 to 35 clocks.
Table 2.3 lists the latencies (number of cycles until a result is available) for multiply and divide instructions. The
latencies are listed in terms of pipeline clocks. In this table ‘latency’ refers to the number of cycles necessary for the
second instruction to use the results of the first.
[Table 2.3: columns are Instruction Sequence (1st Instruction, 2nd Instruction) and Latency Clocks. The surviving second-instruction entries are "MADD/MADDU, MSUB/MSUBU, or MFHI/MFLO" and "Integer operation [1]"; the surviving latency values, in order, are 32, 34, 32, 34, 2, and 1.]
[1] Integer Operation refers to any integer instruction that uses the result of a previous MDU operation.
2.4.1 Multiply (Area-Efficient MDU)
Multiply operations are executed using a simple iterative multiply algorithm. Using Booth’s approach, this algorithm
works for both positive and negative operands. The operation uses 32 cycles in the M_MDU stage to complete a multiplication.
The register writeback to HI and LO is done in the A stage. For MUL operations, the register file writeback is
done in the W_MDU stage.
Figure 2.10 shows the latency for a multiply operation. The repeat rate is 33 cycles as a second multiply can be in the
first M_MDU stage when the first multiply is in the A_MDU stage.
Figure 2.10 M14K™ Area-Efficient MDU Pipeline Flow During a Multiply Operation
Clock   Stage   Operation
1       E
2-33    M_MDU   Add/sub-shift
34      A_MDU   HI/LO Write
35      W_MDU   Reg WR
2.4.2 Multiply Accumulate (Area-Efficient MDU)
Multiply-accumulate operations use the same multiply machine as used for multiply only. Two extra stages are
needed to perform the addition/subtraction. The operation uses 34 cycles in the M_MDU stage to complete the
multiply-accumulate. The register writeback to HI and LO is done in the A stage.
Figure 2.11 shows the latency for a multiply-accumulate operation. The repeat rate is 35 cycles as a second
multiply-accumulate can be in the E stage when the first multiply is in the last M_MDU stage.
Figure 2.11 M14K™ Core Area-Efficient MDU Pipeline Flow During a Multiply Accumulate Operation
Clock   Stage   Operation
1       E
2-33    M_MDU   Add/Subtract Shift
34      M_MDU   Accumulate/LO
35      M_MDU   Accumulate/HI
36      A_MDU   HI/LO Write
37      W_MDU
2.4.3 Divide (Area-Efficient MDU)
Divide operations also implement a simple non-restoring algorithm. This algorithm works only for positive operands,
hence the first cycle of the M_MDU stage is used to negate the rs operand (RS Adjust) if needed. Note that this cycle is
executed even if negation is not needed. The next 32 cycles (3-34) execute an iterative add/subtract-shift function.
Two sign adjust (Sign Adjust 1/2) cycles are used to change the sign of one or both the quotient and the remainder.
Note that one or both of these cycles are skipped if they are not needed. The rule is, if both operands were positive or
if this is an unsigned division; both of the sign adjust cycles are skipped. If the rs operand was negative, one of the
sign adjust cycles is skipped. If only the rs operand was negative, none of the sign adjust cycles are skipped. Register
writeback to HI and LO is done in the A stage.
Figure 2.12 shows the pipeline flow for a divide operation. The repeat rate is either 34, 35 or 36 cycles (depending on
how many sign adjust cycles are skipped) as a second divide can be in the E stage when the first divide is in the last
M_MDU stage.
Figure 2.12 M14K™ Core Area-Efficient MDU Pipeline Flow During a Divide (DIV) Operation
Clock   Stage   Operation
1       E
2       M_MDU   RS Adjust
3-34    M_MDU   Add/Subtract
35      M_MDU   Sign Adjust 1
36      M_MDU   Sign Adjust 2
37      A_MDU   HI/LO Write
38      W_MDU
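The repeat rates given in Sections 2.4.1 through 2.4.3 can be collected into one helper, which is convenient when estimating the cost of back-to-back MDU operations on the area-efficient unit. The sketch below assumes that the 34-cycle divide case corresponds to both sign adjust cycles being skipped and the 36-cycle case to neither being skipped; the operation labels and function name are hypothetical.

```python
def area_efficient_repeat_rate(operation, sign_adjust_cycles=0):
    """Repeat rate in cycles for the area-efficient MDU.

    operation: "multiply", "multiply-accumulate" or "divide" (assumed labels).
    sign_adjust_cycles: sign adjust cycles actually executed by a divide (0-2).
    """
    if operation == "multiply":
        return 33
    if operation == "multiply-accumulate":
        return 35
    if operation == "divide":
        if sign_adjust_cycles not in (0, 1, 2):
            raise ValueError("a divide executes 0, 1 or 2 sign adjust cycles")
        return 34 + sign_adjust_cycles
    raise ValueError("unknown operation: " + operation)


def back_to_back_cycles(operation, count, sign_adjust_cycles=0):
    """Rough lower bound on cycles consumed by count identical operations."""
    return count * area_efficient_repeat_rate(operation, sign_adjust_cycles)
```

An unsigned divide, for which both sign adjust cycles are skipped, falls into the 34-cycle case under this reading.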
2.5 Branch Delay
The pipeline has a branch delay of one cycle. The one-cycle branch delay is a result of the branch decision logic operating during the E pipeline stage. This allows the branch target address to be used in the I stage of the instruction following 2 cycles after the branch instruction. By executing the 1st instruction following the branch instruction
sequentially before switching to the branch target, the intervening branch delay slot is utilized. This avoids bubbles
being injected into the pipeline on branch instructions. Both the address calculation and the branch condition check
are performed in the E stage.
The pipeline begins the fetch of either the branch path or the fall-through path in the cycle following the delay slot.
After the branch decision is made, the processor continues with the fetch of either the branch path (for a taken branch)
or the fall-through path (for the non-taken branch).
The branch delay means that the instruction immediately following a branch is always executed, regardless of the
branch direction. If no useful instruction can be placed after the branch, then the compiler or assembler must insert a
NOP instruction in the delay slot.
Figure 2.13 illustrates the branch delay.
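Figure 2.13 IU Pipeline Branch Delay
Jump or Branch             I  E  M  A  W
Delay Slot Instruction        I  E  M  A  W
Jump Target Instruction          I  E  M  A  W
(Each stage takes one cycle. The delay slot instruction occupies the one-clock branch delay.)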
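The effect of the delay slot on execution order can be illustrated with a toy sequencer. This is a sketch under simplifying assumptions: instructions are tuples of (name, branch target index or None, taken flag), and a branch placed in a delay slot is simply treated as an ordinary instruction. None of these names come from the manual.

```python
def execution_order(program, max_steps=16):
    """Return instruction names in the order they execute.

    program: list of (name, target_index_or_None, taken).
    The instruction after a branch always executes before control
    transfers to the branch target.
    """
    order = []
    pc = 0
    pending_target = None  # set while the delay slot is still to be executed
    while pc < len(program) and len(order) < max_steps:
        name, target, taken = program[pc]
        order.append(name)
        if pending_target is not None:
            # The delay slot has now executed; switch to the branch target.
            pc, pending_target = pending_target, None
        else:
            if target is not None and taken:
                pending_target = target
            pc += 1
    return order


taken_path = execution_order([
    ("BEQ", 3, True),
    ("delay slot", None, False),
    ("fall-through", None, False),
    ("target", None, False),
])
```

With the branch taken, the delay slot instruction still runs but the fall-through instruction does not.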
2.6 Data Bypassing
Most MIPS32 instructions use one or two register values as source operands. These operands are fetched from the
register file in the first part of E stage. The ALU straddles the E-to-M boundary, and can present the result early in the
M stage. However, the result is not written to the register file before the W stage. If no precautions were taken, it
would take 3 cycles before the result was available for the following instructions. To avoid this, data bypassing is
implemented.
Between the register file and the ALU a data-bypass multiplexer is placed on both operands (see figure below). This
enables the M14K core to forward data from a preceding instruction whose target is a source register of a following
instruction. An M to E bypass and an A to E bypass feed the bypass multiplexers. A W to E bypass is not needed, as
the register file is capable of making an internal bypass of Rd write data directly to the Rs and Rt read ports.
Figure 2.14 IU Pipeline Data Bypass
[Figure: block diagram spanning the I, E, M, A, and W stages. Labeled elements: Instruction; Rs Addr, Rt Addr, Rs Read, Rt Read, and Rd Write ports of the Reg File; Bypass multiplexers; ALU E stage; ALU M stage; M to E bypass; A to E bypass; Load data, HI/LO Data or CP0 data.]
Figure 2.15 shows the data bypass for an Add1 instruction followed by a Sub2 and another Add3 instruction. The Sub2
instruction uses the output from the Add1 instruction as one of the operands, and thus the M to E bypass is used. The
following Add3 uses the result from both the first Add1 instruction and the Sub2 instruction. Since the Add1 data is
now in A stage, the A to E bypass is used, and the M to E bypass is used to bypass the Sub2 data to the Add3 instruction.
Figure 2.15 IU Pipeline M to E bypass
ADD1  R3=R2+R1    I  E  M  A  W
SUB2  R4=R3-R7       I  E  M  A  W     (R3 via M to E bypass)
ADD3  R5=R3+R4          I  E  M  A  W  (R3 via A to E bypass, R4 via M to E bypass)
(Each stage takes one cycle.)
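The choice of bypass path follows from how far apart the producer and consumer are. The sketch below covers only producers whose result is available from the ALU (such as ADD and SUB), assumes no slips, and uses hypothetical names throughout.

```python
def operand_source(distance):
    """Where a consumer in the E stage gets a value produced `distance`
    instructions earlier (1 = the immediately preceding instruction)."""
    if distance == 1:
        return "M to E bypass"
    if distance == 2:
        return "A to E bypass"
    if distance == 3:
        return "register file internal bypass"  # producer is in the W stage
    return "register file"


def bypass_plan(program):
    """program: list of (name, destination, sources). Returns, for each
    instruction, the path used for every source register."""
    last_writer = {}
    plan = []
    for index, (name, dest, sources) in enumerate(program):
        chosen = {}
        for reg in sources:
            if reg in last_writer:
                chosen[reg] = operand_source(index - last_writer[reg])
            else:
                chosen[reg] = "register file"
        plan.append((name, chosen))
        last_writer[dest] = index
    return plan


plan = bypass_plan([
    ("ADD1", "R3", ("R2", "R1")),
    ("SUB2", "R4", ("R3", "R7")),
    ("ADD3", "R5", ("R3", "R4")),
])
```

For the Add1/Sub2/Add3 sequence this yields the M to E path for Sub2 and both the A to E and M to E paths for Add3, matching the description above.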
2.6.1 Load Delay
Load delay refers to the fact that data fetched by a load instruction is not available in the integer pipeline until after
the load aligner in A stage. All instructions need the source operands available in the E stage. An instruction immediately following a load instruction will, if it has the same source register as was the target of the load, cause an instruction interlock pipeline slip in the E stage (see 2.10 “Instruction Interlocks” on page 38). If an instruction following
the load by 1 or 2 cycles uses the data from the load, the A to E bypass (see Figure 2.30) serves to reduce or avoid
stall cycles. An instruction flow of this is shown in Figure 2.16.
Figure 2.16 IU Pipeline A to E Data bypass
[Figure: pipeline timing (I, E, M, A, W, one cycle per stage) for a Load Instruction and a Consumer of Load Data Instruction, showing the data bypass from A to E and the one-clock load delay.]
2.6.2 Move from HI/LO and CP0 Delay
As indicated in Figure 2.30, not only load data, but also data moved from the HI or LO registers (MFHI/MFLO) and
data moved from CP0 (MFC0) enters the IU-Pipeline in the A stage. That is, data is not available in the integer pipeline until early in the A stage. The A to E bypass is available for this data. But as for Loads, an instruction following
immediately after one of these move instructions must be paused for one cycle if the target of the move is among the
sources of the following instruction and this causes an interlock slip in the E stage (see 2.10 “Instruction Interlocks”
on page 38). An interlock slip after a MFHI is illustrated in Figure 2.17.
Figure 2.17 IU Pipeline Slip after a MFHI
MFHI (to R3)      I  E         M  A  W
ADD (R4=R3+R5)       I  E (slip)  E  M  A  W
(Each stage takes one cycle. The MFHI data is bypassed from A to E.)
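The load delay and the move-from delay follow the same rule, which the sketch below captures: a producer whose data enters the integer pipeline in the A stage costs one E-stage slip only when the consumer follows immediately. The mnemonic set is taken from the text and Table 2.5; the function name is hypothetical, and any additional wait for an unfinished MDU operation is not modeled.

```python
# Instructions whose result enters the integer pipeline in the A stage.
A_STAGE_PRODUCERS = {
    "LB", "LBU", "LH", "LHU", "LL", "LW", "LWL", "LWR",
    "MFHI", "MFLO", "MFC0",
}


def e_stage_slip_cycles(producer, distance):
    """Slip cycles for a consumer issued `distance` instructions after the
    producer (1 = immediately following)."""
    if producer in A_STAGE_PRODUCERS and distance == 1:
        return 1
    return 0  # A to E bypass, or the result was already available earlier
```

Scheduling one independent instruction between a load and its consumer is therefore enough to hide the delay in this model.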
2.7 Coprocessor 2 Instructions
If a coprocessor 2 is attached to the M14K core, a number of transactions must take place on the CP2 Interface for
each coprocessor 2 instruction. First, if the CU[2] bit in the CP0 Status register is not set, then no coprocessor 2
related instruction will start a transaction on the CP2 Interface; instead, a Coprocessor Unusable exception will be
signaled. If the CU[2] bit is set, and a coprocessor 2 instruction is fetched, the following transactions will occur on the
CP2 Interface:
1. The instruction is presented on the instruction bus in the E stage. Coprocessor 2 can do a decode in the same cycle.
2. The instruction is validated from the core in the M stage. From this point, the core will accept control and data signals back from coprocessor 2. All control and data signals from coprocessor 2 are captured on input latches to the
core.
3. If all the expected control and data signals were presented to the core in the previous M stage, the core will proceed to execute the A stage. If some return information is missing, the A stage will not advance and cause a slip
in all I, E, and M stages (see 2.9 “Slip Conditions” on page 37).
If this instruction sent data from the core to coprocessor 2, this data is sent in the A stage.
4. The instruction completion is signaled to coprocessor 2 in the W stage. Potential data from the coprocessor is
written to the register file.
Figure 2.18 shows the timing relationship between the M14K core and coprocessor 2 for all coprocessor 2 instructions.
Figure 2.18 Coprocessor 2 Interface Transactions
[Figure: timing chart over the I, E, M, A, and W stages (one cycle each) for a COP2 instruction, with rows for Core internal operations, Core to CP2 info., CP2 to Core info., and CP2 internal operations. Labeled activities include: Fetch instruction; Decode and setup valid; Instruction; Validate inst.; ToData; Complete; Control & FromData; Capture Control & FromData; Get ToData from memory; Get ready for new inst.; Decode & get FromData; See Valid; Capture ToData; Complete instruction.]
As can be seen in the figure, all control and data from the coprocessor must occur in the M stage. If this is not the
case, the A stage will start slipping in the following cycle and thus stall the I, E, M, and A stages; but if all expected
control and data is available in the M stage, coprocessor 2 instructions can execute with no pipeline stalls. The only
exception to this is the Branch on Coprocessor conditions (BC2) instruction. All branch instructions, including the
regular BEQ, BNE, etc., must be resolved in the E stage. The M14K core does not have branch prediction logic, and
thus the target address must be available before the end of the E stage. The BC2 instruction has to follow the same
protocol as all other coprocessor 2 instructions on the CP2 Interface. All core interface operations belonging to the E,
M, and A stages will have to occur in the E stage for BC2 instructions. This means that a BC2 instruction always slips
for a minimum of 2 cycles in the E stage, and any delay in the return of branch information from coprocessor 2 will
add to the number of slip cycles. All other Coprocessor 2 instructions can operate without slips, provided that all control and data information from coprocessor 2 is transferred in the M stage.
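The decision flow for a coprocessor 2 instruction can be summarized in a small sketch. It relies on the MIPS32 Status register layout, in which CU2 is bit 30, and assumes that each cycle of late return information costs one slip cycle; the function name, return strings, and parameters are hypothetical.

```python
STATUS_CU2 = 1 << 30  # CU[2] bit of the CP0 Status register


def cop2_issue(status, is_bc2=False, return_delay=0):
    """Return (outcome, slip_cycles) for one coprocessor 2 instruction.

    status: CP0 Status register value.
    is_bc2: True for a Branch on Coprocessor 2 condition instruction.
    return_delay: cycles by which coprocessor 2 is late with its
    control/data (assumed to translate one-to-one into slip cycles).
    """
    if not status & STATUS_CU2:
        # No transaction starts on the CP2 Interface.
        return ("Coprocessor Unusable exception", 0)
    if is_bc2:
        # E, M and A stage interface operations all happen in the E stage.
        return ("slip in E stage", 2 + return_delay)
    if return_delay:
        return ("slip in A stage", return_delay)
    return ("no slip", 0)
```

The two-cycle minimum for BC2 is unavoidable in this model, which is why BC2-heavy code pays more than other coprocessor 2 traffic.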
2.8 Interlock Handling
Smooth pipeline flow is interrupted when cache misses occur or when data dependencies are detected. Interruptions
handled entirely in hardware, such as cache misses, are referred to as interlocks. At each cycle, interlock conditions
are checked for all active instructions.
Table 2.4 lists the types of pipeline interlocks for the M14K processor core.
Table 2.4 Pipeline Interlocks
Interlock Type                  Sources                                                    Slip Stage
I-side SRAM Stall               SRAM Access not complete                                   E Stage
Instruction                     Producer-consumer hazards                                  E/M Stage
                                Hardware Dependencies (MDU)                                E Stage
                                BC2 waiting for COP2 Condition Check                       E Stage
D-side SRAM Stall               SRAM Access not complete                                   A Stage
Coprocessor 2 completion slip   Coprocessor 2 control and/or data delay from coprocessor   A Stage
In general, MIPS processors support two types of hardware interlocks:
•Stalls, which are resolved by halting the pipeline
•Slips, which allow one part of the pipeline to advance while another part of the pipeline is held static
In the M14K processor core, all interlocks are handled as slips.
2.9 Slip Conditions
On every clock, internal logic determines whether each pipe stage is allowed to advance. These slip conditions propagate backwards down the pipe. For example, if the M stage does not advance, neither does the E or I stage.
Slipped instructions are retried on subsequent cycles until they issue. The back end of the pipeline advances normally
during slips. This resolves the conflict when the slip was caused by a missing result. NOPs are inserted into the bubble in the pipeline. Figure 2.19 shows an instruction cache miss that causes a two-cycle slip.
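The backward propagation of a slip can be modeled with a one-cycle step function. This is a simplified sketch: the pipeline is a dictionary from stage name to instruction, a bubble is represented by the string "NOP", and the names are hypothetical rather than taken from the core.

```python
STAGES = ["I", "E", "M", "A", "W"]


def step(pipeline, next_fetch, slip_stage=None):
    """Advance the pipeline by one clock.

    pipeline: dict mapping each stage to the instruction it holds.
    next_fetch: instruction that enters the I stage if it can advance.
    slip_stage: stage that cannot advance this clock, or None.
    Stages at or before slip_stage hold; later stages advance, and a
    NOP bubble enters the stage just after the slipping one.
    """
    held = STAGES[: STAGES.index(slip_stage) + 1] if slip_stage else []
    following = {}
    for position, stage in enumerate(STAGES):
        if stage in held:
            following[stage] = pipeline[stage]
        elif position == 0:
            following[stage] = next_fetch
        else:
            previous = STAGES[position - 1]
            following[stage] = "NOP" if previous in held else pipeline[previous]
    return following


full = {"I": "I3", "E": "I2", "M": "I1", "A": "I0", "W": None}
slipped = step(full, "I4", slip_stage="E")
```

Here I2 and I3 are retried in place while I1 and I0 continue to drain through the back end.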
Figure 2.19 Instruction Cache Miss Slip
[Figure: occupancy of the I, E, M, and A stages by instructions I0 through I6 over clocks 1 to 6, with bubbles (0) inserted behind the slipped instruction. Markers: 1 = Cache miss detected; 2 = Critical word received; 3 = Execute E-stage.]
In the first clock cycle in Figure 2.19, the pipeline is full and the cache miss is detected. Instruction I0 is in the A
stage, instruction I1 is in the M stage, instruction I2 is in the E stage, and instruction I3 is in the I stage. The cache
miss occurs in clock 2 when the I4 instruction fetch is attempted. I4 advances to the E stage and waits for the instruction to be fetched from main memory. In this example, two clocks (3 and 4) are required to fetch the I4 instruction
from memory. After the cache miss has been resolved in clock 4 and the instruction is bypassed to the E stage, the
pipeline is restarted, causing I4 to finally execute its E-stage operations.
2.10 Instruction Interlocks
Most instructions can be issued at a rate of one per clock cycle. In order to adhere to the sequential programming
model, the issue of an instruction must sometimes be delayed to ensure that the result of a prior instruction is available. Table 2.5 details the instruction interactions that prevent an instruction from advancing in the processor pipeline.
Table 2.5 Instruction Interlocks
First Instruction                                    Second Instruction                 Issue Delay (in Clock Cycles)   Slip Stage
LB/LBU/LH/LHU/LL/LW/LWL/LWR                          Consumer of load data              1                               E stage
MFC0                                                 Consumer of destination register   1                               E stage
MULTx/MADDx/MSUBx (high-performance MDU), 16bx32b    MFLO/MFHI                          0
MULTx/MADDx/MSUBx (high-performance MDU), 32bx32b    MFLO/MFHI                          1                               M stage
MUL (high-performance MDU), 16bx32b                  Consumer of target data            2                               E stage
MUL (high-performance MDU), 32bx32b                  Consumer of target data            3                               E stage
2.11 Hazards
In general, the M14K core ensures that instructions are executed following a fully sequential program model in which
each instruction in the program sees the results of the previous instruction. There are some deviations to this model,
referred to as hazards.
Prior to Release 2 of the MIPS32® Architecture, hazards (primarily CP0 hazards) were relegated to implementation-dependent cycle-based solutions, primarily based on the SSNOP instruction. This has been an insufficient and
error-prone practice that must be addressed with a firm compact between hardware and software. As such, new
instructions have been added to Release 2 of the architecture which act as explicit barriers that eliminate hazards. To
the extent that it was possible to do so, the new instructions have been added in such a way that they are backward-compatible with existing MIPS processors.
2.11.1 Types of Hazards
With one exception, all hazards were eliminated in Release 1 of the Architecture for unprivileged software. The
exception occurs when unprivileged software writes a new instruction sequence and then wishes to jump to it. Such
an operation remained a hazard, and is addressed by the capabilities of Release 2.
In privileged software, there are two types of hazards: execution hazards and instruction hazards.
Execution hazards are those created by the execution of one instruction, and seen by the execution of another instruction. Table 2.6 lists execution hazards.
Table 2.6 Execution Hazards
Producer → Consumer                                   Hazard On                     Spacing (Instructions)
MTC0 → Coprocessor instruction execution depends      StatusCU                      1
       on the new value of StatusCU
MTC0 → ERET                                           EPC, DEPC, ErrorEPC           1
MTC0 → ERET                                           Status                        0
MTC0, EI, DI → Interrupted Instruction                StatusIE                      1
MTC0 → Interrupted Instruction                        CauseIP                       3
MTC0 → RDPGPR, WRPGPR                                 SRSCtlPSS                     1
MTC0 → Instruction not seeing a Timer Interrupt       Compare update that clears    4¹
                                                      Timer Interrupt
MTC0 → Instruction affected by change                 Any other CP0 register        2
1. This is the minimum value. Actual value is system-dependent since it is a function of the sequential logic between the SI_TimerInt
output and the external logic which feeds SI_TimerInt back into one of the SI_Int inputs, or a function of the method for handling
SI_TimerInt in an external interrupt controller.
Instruction hazards are those created by the execution of one instruction, and seen by the instruction fetch of another
instruction. Table 2.7 lists instruction hazards.
Table 2.7 Instruction Hazards
Producer → Consumer                                   Hazard On                     Spacing (Instructions)
MTC0 → Instruction fetch seeing the new value         Status
       (including a change to ERL followed by an
       instruction fetch from the useg segment)
Instruction stream write via redirected store →       Cache entries                 3
       Instruction fetch seeing the new instruction
       stream
2.11.2 Instruction Listing
Table 2.8 lists the instructions designed to eliminate hazards. See the document titled MIPS32® Architecture for
Programmers Volume II: The MIPS32® Instruction Set (MD00086) for a more detailed description of these instructions.
Table 2.8 Hazard Instruction Listing
Mnemonic    Function
EHB         Clear execution hazard
JALR.HB     Clear both execution and instruction hazards
JR.HB       Clear both execution and instruction hazards
SYNCI       Synchronize caches after instruction stream write
2.11.2.1 Instruction Encoding
The EHB instruction is encoded using a variant of the NOP/SSNOP encoding. This encoding was chosen for compatibility with the Release 1 SSNOP instruction, such that existing software may be modified to be compatible with both
Release 1 and Release 2 implementations. See the EHB instruction description for additional information.
The JALR.HB and JR.HB instructions are encoded using bit 10 of the hint field of the JALR and JR instructions.
These encodings were chosen for compatibility with existing MIPS implementations, including many which pre-date
the MIPS32 architecture. Because a pipeline flush clears hazards on most early implementations, the JALR.HB or
JR.HB instructions can be included in existing software for backward and forward compatibility. See the JALR.HB
and JR.HB instructions for additional information.
The SYNCI instruction is encoded using a new encoding of the REGIMM opcode. This encoding was chosen
because it causes a Reserved Instruction exception on all Release 1 implementations. As such, kernel software running on processors that don’t implement Release 2 can emulate the function using the CACHE instruction.
2.11.3 Eliminating Hazards
The Spacing column shown in Table 2.6 and Table 2.7 indicates the number of unrelated instructions (such as NOPs
or SSNOPs) that, prior to the capabilities of Release 2, would need to be placed between the producer and consumer
of the hazard in order to ensure that the effects of the first instruction are seen by the second instruction. Entries in the
table that are listed as 0 are traditional MIPS hazards which are not hazards on the M14K core.
With the hazard elimination instructions available in Release 2, the preferred method to eliminate hazards is to place
one of the instructions listed in Table 2.8 between the producer and consumer of the hazard. Execution hazards can be
removed by using the EHB, JALR.HB, or JR.HB instructions. Instruction hazards can be removed by using the
JALR.HB or JR.HB instructions, in conjunction with the SYNCI instruction. Since the M14K core does not contain
caches, the SYNCI instruction is not strictly necessary, but is still recommended to create portable code that can be
run on other MIPS processors that may contain caches.
Chapter 3
Memory Management of the M14K™ Core
The M14K™ processor core includes a Memory Management Unit (MMU) that interfaces between the execution
unit and the SRAM controller. The core implements a simple Fixed Mapping Translation (FMT) style MMU.
This chapter contains the following sections:
•Section 3.1 “Introduction”
•Section 3.2 “Modes of Operation”
•Section 3.3 “Fixed Mapping MMU”
•Section 3.4 “System Control Coprocessor”
3.1 Introduction
The MMU in an M14K processor core translates a virtual address to a physical address before the request is sent to the
SRAM interface for an external memory reference.
In the M14K processor core, the MMU is based on a simple algorithm to translate virtual addresses to physical
addresses via a Fixed Mapping Translation (FMT) mechanism. These translations are different for various regions of
the virtual address space (useg/kuseg, kseg0, kseg1, kseg2/3).
3.1.1 Memory Management Unit (MMU)
The M14K core contains a simple Fixed Mapping Translation (FMT) MMU that interfaces between the execution
unit and the SRAM controller.
3.1.1.1 Fixed Mapping Translation (FMT)
An FMT is smaller and simpler than the full Translation Lookaside Buffer (TLB) style MMU found in other MIPS
cores. Like a TLB, the FMT performs virtual-to-physical address translation and provides attributes for the different
segments. Those segments that are unmapped in a TLB implementation (kseg0 and kseg1) are translated identically
by the FMT.
Figure 3.1 shows how the memory management unit interacts with the SRAM access in the M14K core.
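Figure 3.1 Address Translation During SRAM Access
[Figure: the Instruction Address Calculator and the Data Address Calculator each supply a virtual address to the FMT; the FMT supplies the physical addresses that reach the Instn SRAM and Data SRAM through the SRAM Interface.]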
3.2 Modes of Operation
The M14K core implements three modes of operation:
•User mode is most often used for application programs.
•Kernel mode is typically used for handling exceptions and operating-system kernel functions, including CP0
management and I/O device accesses.
•Debug mode is used during system bring-up and software development. Refer to the EJTAG section for more
information on debug mode.
User mode is most often used for application programs. Kernel mode is typically used for handling exceptions and
privileged operating system functions, including CP0 management and I/O device accesses. Debug mode is used for
software debugging and most likely occurs within a software development tool.
The address translation performed by the MMU depends on the mode in which the processor is operating.
3.2.1 Virtual Memory Segments
The virtual memory segments differ depending on the mode of operation. Figure 3.2 shows the segmentation for the
4 GByte (2³² bytes) virtual memory space addressed by a 32-bit virtual address, for the three modes of operation.
The core enters Kernel mode both at reset and when an exception is recognized. While in Kernel mode, software has
access to the entire address space, as well as all CP0 registers. User mode accesses are limited to a subset of the virtual address space (0x0000_0000 to 0x7FFF_FFFF) and can be inhibited from accessing CP0 functions. In User
mode, virtual addresses 0x8000_0000 to 0xFFFF_FFFF are invalid and cause an exception if accessed.
Debug mode is entered on a debug exception. While in Debug mode, the debug software has access to the same
address space and CP0 registers as for Kernel mode. In addition, while in Debug mode the core has access to the
debug segment dseg. This area overlays part of the kernel segment kseg3. dseg access in Debug mode can be turned
on or off, allowing full access to the entire kseg3 in Debug mode, if so desired.
Each of the segments shown in Figure 3.2 is either mapped or unmapped. The following two sub-sections explain
the distinction. Sections 3.2.2 “User Mode”, 3.2.3 “Kernel Mode” and 3.2.4 “Debug Mode” then specify which
segments are actually mapped and unmapped.
3.2.1.1 Unmapped Segments
An unmapped segment does not use the FMT to translate from virtual-to-physical addresses.
Unmapped segments have a fixed simple translation from virtual to physical address. This is much like the translations the FMT provides for the M14K core, but we will still make the distinction.
All segments are treated as uncached within the M14K core. Cache coherency attributes of cached or uncached can be
specified and this information will be sent with the request to allow the system to make a distinction between the two.
All valid user mode virtual addresses have their most significant bit cleared to 0, indicating that user mode can only
access the lower half of the virtual memory map. Any attempt to reference an address with the most significant bit set
while in user mode causes an address error exception.
The system maps all references to useg through the FMT.
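The User mode address rule above can be sketched as a one-line check. This is an illustrative model only: the function name is hypothetical, and a false result stands for the address error exception the text describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when a User mode reference to vaddr is
 * legal, i.e. the most significant bit is 0 (0x0000_0000 through
 * 0x7FFF_FFFF). A false result models the address error exception. */
bool user_vaddr_is_valid(uint32_t vaddr)
{
    return (vaddr & UINT32_C(0x80000000)) == 0;
}
```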
3.2.3 Kernel Mode
The processor operates in Kernel mode when the DM bit in the Debug register is 0 and the Status register contains one
or more of the following values:
•UM = 0
•ERL = 1
•EXL = 1
When a non-debug exception is detected, EXL or ERL will be set and the processor will enter Kernel mode. At the end
of the exception handler routine, an Exception Return (ERET) instruction is generally executed. The ERET instruction jumps to the exception return address (ErrorEPC if ERL = 1, otherwise EPC) and clears ERL if it was set, or EXL otherwise. This may return the processor to User mode.
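The Kernel mode condition can be written as a single predicate. The sketch below is a software model, not the hardware implementation; the function name is hypothetical and each argument is assumed to hold the already-extracted value of the named bit (DM from Debug; UM, ERL and EXL from Status).

```c
#include <stdbool.h>

/* Hypothetical model of the Kernel mode condition:
 * DM = 0 and (UM = 0 or ERL = 1 or EXL = 1). */
bool in_kernel_mode(bool dm, bool um, bool erl, bool exl)
{
    return !dm && (!um || erl || exl);
}
```

For example, UM = 1 with EXL = 1 still selects Kernel mode, which is why taking an exception from User mode switches the processor to Kernel mode without modifying UM.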
Kernel mode virtual address space is divided into regions differentiated by the high-order bits of the virtual address,
as shown in Figure 3.4. Also, Table 3.2 lists the characteristics of the Kernel mode segments.
Figure 3.4 Kernel Mode Virtual Address Space
0xE000_0000 - 0xFFFF_FFFF   kseg3   Kernel virtual address space, Fixed Mapped, 512MB
0xC000_0000 - 0xDFFF_FFFF   kseg2   Kernel virtual address space, Fixed Mapped, 512MB
0xA000_0000 - 0xBFFF_FFFF   kseg1   Kernel virtual address space, Unmapped, Uncached, 512MB
0x8000_0000 - 0x9FFF_FFFF   kseg0   Kernel virtual address space, Unmapped, 512MB
0x0000_0000 - 0x7FFF_FFFF   kuseg   Fixed Mapped, 2048MB
Table 3.2 Kernel Mode Segments
Address Bit       Status Register Is One of These Values       Segment  Address Range                        Segment Size
Values            (UM, EXL, ERL)                               Name
A(31) = 0         (UM = 0 or EXL = 1 or ERL = 1) and DM = 0    kuseg    0x0000_0000 through 0x7FFF_FFFF      2 GBytes (2³¹ bytes)
A(31:29) = 100₂   (UM = 0 or EXL = 1 or ERL = 1) and DM = 0    kseg0    0x8000_0000 through 0x9FFF_FFFF      512 MBytes (2²⁹ bytes)
A(31:29) = 101₂   (UM = 0 or EXL = 1 or ERL = 1) and DM = 0    kseg1    0xA000_0000 through 0xBFFF_FFFF      512 MBytes (2²⁹ bytes)
A(31:29) = 110₂   (UM = 0 or EXL = 1 or ERL = 1) and DM = 0    kseg2    0xC000_0000 through 0xDFFF_FFFF      512 MBytes (2²⁹ bytes)
A(31:29) = 111₂   (UM = 0 or EXL = 1 or ERL = 1) and DM = 0    kseg3    0xE000_0000 through 0xFFFF_FFFF      512 MBytes (2²⁹ bytes)
3.2.3.1 Kernel Mode, User Space (kuseg)
In Kernel mode, when the most-significant bit of the virtual address (A31) is cleared, the 32-bit kuseg virtual address
space is selected and covers the full 2³¹ bytes (2 GBytes) of the current user address space mapped to addresses
0x0000_0000 - 0x7FFF_FFFF.
When the Status register’s ERL = 1, the user address region becomes a 2²⁹-byte unmapped and uncached address
space. While in this setting, the kuseg virtual address maps directly to the same physical address.
3.2.3.2 Kernel Mode, Kernel Space 0 (kseg0)
In Kernel mode, when the most-significant three bits of the virtual address are 100₂, the 32-bit kseg0 virtual address
space is selected; it is the 2²⁹-byte (512-MByte) kernel virtual space located at addresses 0x8000_0000 - 0x9FFF_FFFF. References to kseg0 are unmapped; the physical address selected is defined by subtracting
0x8000_0000 from the virtual address. The K0 field of the Config register controls cacheability.
3.2.3.3 Kernel Mode, Kernel Space 1 (kseg1)
In Kernel mode, when the most-significant three bits of the 32-bit virtual address are 101₂, the 32-bit kseg1 virtual
address space is selected. kseg1 is the 2²⁹-byte (512-MByte) kernel virtual space located at addresses 0xA000_0000 - 0xBFFF_FFFF. References to kseg1 are unmapped; the physical address selected is defined by subtracting
0xA000_0000 from the virtual address.
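The unmapped translations described so far (kseg0, kseg1, and kuseg while ERL = 1) can be modeled in a few lines. This is a minimal sketch under stated assumptions: the function name and the boolean return convention are hypothetical, and every case that goes through the FMT is deliberately left out because its mapping is not covered here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the unmapped Kernel mode translations.
 * Returns true and stores the physical address in *paddr for kseg0, kseg1,
 * and kuseg while ERL = 1. Returns false for every other case, which this
 * sketch leaves to the FMT. */
bool translate_unmapped(uint32_t vaddr, bool erl, uint32_t *paddr)
{
    switch (vaddr >> 29) {
    case 4: /* A(31:29) = 100 binary: kseg0 */
        *paddr = vaddr - UINT32_C(0x80000000);
        return true;
    case 5: /* A(31:29) = 101 binary: kseg1 */
        *paddr = vaddr - UINT32_C(0xA0000000);
        return true;
    default:
        if ((vaddr >> 31) == 0 && erl) {
            /* kuseg with ERL = 1: virtual address equals physical address */
            *paddr = vaddr;
            return true;
        }
        return false;
    }
}
```

Under this model the kseg1 address 0xBFC0_0000 and the kseg0 address 0x9FC0_0000 both resolve to physical address 0x1FC0_0000.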
3.2.3.4 Kernel Mode, Kernel Space 2 (kseg2)
In Kernel mode, when UM = 0, ERL = 1, or EXL = 1 in the Status register, and DM = 0 in the Debug register, and the
most-significant three bits of the 32-bit virtual address are 110₂, the 32-bit kseg2 virtual address space is selected. In the
M14K core, this 2²⁹-byte (512-MByte) kernel virtual space is located at physical addresses 0xC000_0000 - 0xDFFF_FFFF.
3.2.3.5 Kernel Mode, Kernel Space 3 (kseg3)
In Kernel mode, when the most-significant three bits of the 32-bit virtual address are 111₂, the kseg3 virtual address
space is selected. In the M14K core, this 2²⁹-byte (512-MByte) kernel virtual space is located at physical addresses
0xE000_0000 - 0xFFFF_FFFF.
3.2.4 Debug Mode
Debug mode address space is identical to Kernel mode address space with respect to mapped and unmapped areas,
except for kseg3. In kseg3, a debug segment dseg co-exists in the virtual address range 0xFF20_0000 to
0xFF3F_FFFF. The layout is shown in Figure 3.5.
unpredictable, and writes are ignored to any unimplemented register in the drseg. Refer to Chapter 8, “EJTAG Debug
Support in the M14K™ Core” on page 155 for more information on the DCR.
The allowed access size is limited for the drseg. Only word size transactions are allowed. Operation of the processor
is undefined for other transaction sizes.
3.2.4.2 Conditions and Behavior for Access to dmseg, EJTAG Memory
The behavior of CPU access to the dmseg address range at 0xFF20_0000 to 0xFF2F_FFFF is determined as shown in
Table 3.5.
Table 3.5 CPU Access to dmseg Address Range
Transaction     ProbEn bit in DCR register   LSNM bit in Debug register   Access
Load / Store    Don’t care                   1                            Kernel mode address space (kseg3)
Fetch           1                            Don’t care                   dmseg
Load / Store    1                            0                            dmseg
Fetch           0                            Don’t care                   See comments below
Load / Store    0                            0                            See comments below
The case with access to the dmseg when the ProbEn bit in the DCR register is 0 is not expected to happen. Debug
software is expected to check the state of the ProbEn bit in the DCR register before attempting to reference dmseg. If
such a reference does happen, the reference hangs until it is satisfied by the probe. The probe can not assume that
there will never be a reference to dmseg if the ProbEn bit in the DCR register is 0 because there is an inherent race
between the debug software sampling the ProbEn bit as 1 and the probe clearing it to 0.
3.3 Fixed Mapping MMU
The M14K core implements a simple Fixed Mapping (FM) memory management unit that is smaller than a full
translation lookaside buffer (TLB) and more easily synthesized. Like a TLB, the FMT performs virtual-to-physical
address translation and provides attributes for the different memory segments. Those memory segments which are
unmapped in a TLB implementation (kseg0 and kseg1) are translated identically by the FMT MMU.
The FMT also determines the cacheability of each segment. These attributes are controlled via bits in the Config register. Table 3.6 shows the encoding for the K23 (bits 30:28), KU (bits 27:25) and K0 (bits 2:0) fields of the Config register.
The M14K core does not contain caches and will treat all references as uncached, but these Config fields will be sent
out to the system with the request, and the system can choose to use them to control any external caching that may be present.
Table 3.6 Cacheability of Segments with Block Address Translation
Segment       Virtual Address Range        Cacheability
useg/kuseg    0x0000_0000-0x7FFF_FFFF      Controlled by the KU field (bits 27:25) of the Config register.
kseg0         0x8000_0000-0x9FFF_FFFF      Controlled by the K0 field (bits 2:0) of the Config register.
Chapter 4
Exceptions and Interrupts in the M14K™ Core
The M14K™ processor core receives exceptions from a number of sources, including arithmetic overflows, I/O interrupts, and system calls. When the CPU detects one of these exceptions, the normal sequence of instruction execution
is suspended and the processor enters kernel mode.
In kernel mode the core disables interrupts and forces execution of a software exception processor (called a handler)
located at a specific address. The handler saves the context of the processor, including the contents of the program
counter, the current operating mode, and the status of the interrupts (enabled or disabled). This context is saved so it
can be restored when the exception has been serviced.
When an exception occurs, the core loads the Exception Program Counter (EPC) register with a location where execution can restart after the exception has been serviced. Most exceptions are precise, which means that EPC can be
used to identify the instruction that caused the exception. For precise exceptions, the restart location in the EPC register is the address of the instruction that caused the exception or, if the instruction was executing in a branch delay slot,
the address of the branch instruction immediately preceding the delay slot. To distinguish between the two, software
must read the BD bit in the CP0 Cause register. Bus error exceptions and CP2 exceptions may be imprecise. For
imprecise exceptions the instruction that caused the exception cannot be identified.
This chapter contains the following sections:
•Section 4.1 “Exception Conditions”
•Section 4.2 “Exception Priority”
•Section 4.3 “Interrupts”
•Section 4.4 “GPR Shadow Registers”
•Section 4.5 “Exception Vector Locations”
•Section 4.6 “General Exception Processing”
•Section 4.7 “Debug Exception Processing”
•Section 4.8 “Exception Descriptions”
•Section 4.9 “Exception Handling and Servicing Flowcharts”
4.1 Exception Conditions
When an exception condition occurs, the instruction causing the exception and all those that follow it in the pipeline
are cancelled (“flushed”). Accordingly, any stall conditions and any later exception conditions that might have referenced this instruction are inhibited—obviously there is no benefit in servicing stalls for a cancelled instruction.
When an exception condition is detected on an instruction fetch, the core aborts that instruction and all instructions
that follow. When this instruction reaches the W stage, the core writes the exception state to the relevant CP0 registers,
changes the current program counter (PC) to the appropriate exception vector address, and clears the exception bits
of earlier pipeline stages.
This implementation allows all preceding instructions to complete execution and prevents all subsequent instructions
from completing. Thus, the value in the EPC (ErrorEPC for errors, or DEPC for debug exceptions) is sufficient to
restart execution. It also ensures that exceptions are taken in the order of execution; an instruction taking an exception
may itself be killed by an instruction further down the pipeline that takes an exception in a later cycle.
4.2 Exception Priority
Table 4.1 contains a list and a brief description of all exception conditions. The exceptions are listed in the order of
their relative priority, from highest priority (Reset) to lowest priority. When several exceptions occur simultaneously,
the exception with the highest priority is taken.
Table 4.1 Priority of Exceptions
Exception                         Description
Reset                             Assertion of SI_ColdReset signal.
Soft Reset                        Assertion of SI_Reset signal.
DSS                               EJTAG Debug Single Step.
DINT                              EJTAG Debug Interrupt. Caused by the assertion of the external EJ_DINT
                                  input, or by setting the EjtagBrk bit in the ECR register.
NMI                               Asserting edge of SI_NMI signal.
Interrupt                         Assertion of unmasked hardware or software interrupt signal.
Protection - Instruction fetch    Instruction fetch access to a protected memory region was attempted.
Instruction Validity Exceptions   An instruction could not be completed because it was not allowed access to the
                                  required resources (Coprocessor Unusable) or was illegal (Reserved Instruction).
                                  If both exceptions occur on the same instruction, the Coprocessor Unusable
                                  Exception takes priority over the Reserved Instruction Exception.
Protection - Instr Execution      Attempted to write EBase when not allowed by MPU.
Tr                                Execution of a trap (when trap condition is true).
Protection - Data access          Data access to a protected memory region was attempted.
DDBL / DDBS                       EJTAG Data Address Break (address only) or EJTAG Data Value Break on
                                  Store (address and value).
AdEL                              Load address alignment error.
                                  User mode load reference to kernel address.
AdES                              Store address alignment error.
                                  User mode store to kernel address.
DSRAM Parity Error                Parity error on D-SRAM access.
DBE                               Load or store bus error.
DDBL                              EJTAG data hardware breakpoint matched in load data compare.
CBrk                              EJTAG complex breakpoint.
4.3 Interrupts
In the MIPS32® Release 1 architecture, support for exceptions included two software interrupts, six hardware interrupts, and a special-purpose timer interrupt. The timer interrupt was provided external to the core and was typically
combined with hardware interrupt 5 in a system-dependent manner. Interrupts were handled either through the general exception vector (offset 0x180) or the special interrupt vector (0x200), based on the value of CauseIV. Software
was required to prioritize interrupts as a function of the CauseIP bits in the interrupt handler prologue.
Release 2 of the Architecture, implemented by the M14K core, adds a number of upward-compatible extensions to
the Release 1 interrupt architecture, including support for vectored interrupts and the implementation of a new interrupt mode that permits the use of an external interrupt controller.
The M14K core also includes the Microcontroller Application-Specific Extension (MCU ASE) that provides
enhanced interrupt delivery and interrupt-latency reduction.
4.3.1 Interrupt Modes
The M14K core includes support for three interrupt modes, as defined by Release 2 of the Architecture:
•Interrupt Compatibility mode, in which the behavior of the M14K is identical to the behavior of a Release 1
implementation.
•Vectored Interrupt (VI) mode, which adds the ability to prioritize and vector interrupts to a handler dedicated to
that interrupt, and to assign a GPR shadow set for use during interrupt processing. The presence of this mode is
denoted by the VInt bit in the Config3 register. Although this mode is architecturally optional, it is always present
on the M14K processor, so the VInt bit will always read as a 1.
•External Interrupt Controller (EIC) mode, which redefines the way interrupts are handled to provide full support
for an external interrupt controller that handles prioritization and vectoring of interrupts. As with VI mode, this
mode is architecturally optional. The presence of this mode is denoted by the VEIC bit in the Config3 register. On
the M14K core, the VEIC bit is set externally by the static input, SI_EICPresent, to allow system logic to indicate
the presence of an external interrupt controller.
Following reset, the M14K processor defaults to Compatibility mode, which is fully compatible with all implementations of Release 1 of the Architecture.
Table 4.2 shows the current interrupt mode of the processor as a function of the Coprocessor 0 register fields that can
affect the mode.
Table 4.2 Interrupt Modes
StatusBEV   CauseIV   IntCtlVS   Config3VINT   Config3VEIC   Interrupt Mode
1           x         x          x             x             Compatibility
x           0         x          x             x             Compatibility
x           x         =0         x             x             Compatibility
0           1         ≠0         1             0             Vectored Interrupt
0           1         ≠0         x             1             External Interrupt Controller
0           1         ≠0         0             0             Can’t happen - IntCtlVS can not be non-zero if neither
                                                             Vectored Interrupt nor External Interrupt Controller mode
                                                             is implemented.
“x” denotes don’t care
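The decision encoded in Table 4.2 can be expressed as a short function. This is a sketch for illustration only: the enum, the function name, and the parameter names are hypothetical, and each argument is assumed to hold the already-extracted value of the corresponding CP0 field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the modes listed in Table 4.2. */
enum interrupt_mode {
    MODE_COMPATIBILITY,
    MODE_VECTORED,
    MODE_EIC,
    MODE_INVALID /* the "can't happen" row */
};

/* bev = StatusBEV, iv = CauseIV, vs = IntCtlVS,
 * vint = Config3VINT, veic = Config3VEIC */
enum interrupt_mode select_interrupt_mode(bool bev, bool iv, uint32_t vs,
                                          bool vint, bool veic)
{
    if (bev || !iv || vs == 0)
        return MODE_COMPATIBILITY; /* first three rows of the table */
    if (veic)
        return MODE_EIC;           /* VINT is a don't care when VEIC = 1 */
    if (vint)
        return MODE_VECTORED;
    return MODE_INVALID;
}
```

Checking VEIC before VINT mirrors the table: an external interrupt controller takes precedence whenever it is present.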
4.3.1.1 Interrupt Compatibility Mode
This is the default interrupt mode for the processor and is entered when a Reset exception occurs. In this mode, interrupts are non-vectored and dispatched through exception vector offset 0x180 (if CauseIV = 0) or vector offset 0x200
(if CauseIV = 1). This mode is in effect if any of the following conditions are true:
•CauseIV = 0
•StatusBEV = 1
•IntCtlVS = 0, which would be the case if vectored interrupts are not implemented, or have been disabled.
Here is a typical software handler for interrupt compatibility mode:
/*
 * Assumptions:
 * - CauseIV = 1 (if it were zero, the interrupt exception would have to
 *   be isolated from the general exception vector before getting
 *   here)
 * - GPRs k0 and k1 are available (no shadow register switches invoked in
 *   compatibility mode)
 * - The software priority is IP9..IP0 (HW7..HW0, SW1..SW0)
 *
 * Location: Offset 0x200 from exception base
 */
IVexception:
    mfc0    k0, C0_Cause            /* Read Cause register for IP bits */
    mfc0    k1, C0_Status           /* and Status register for IM bits */
    andi    k0, k0, M_CauseIM       /* Keep only IP bits from Cause */
    and     k0, k0, k1              /* and mask with IM bits */
    beq     k0, zero, Dismiss       /* no bits set - spurious interrupt */
    clz     k0, k0                  /* Find first bit set, IP9..IP0; k0 = 14..23 */
    /*
     * Assumed reconstruction: the steps that convert the clz result into a
     * vector offset are not legible in this copy of the listing. VS is
     * assumed to be an assembler constant holding log2 of the vector spacing.
     */
    li      k1, 23
    subu    k0, k1, k0              /* 14..23 => 9..0 */
    sll     k0, k0, VS              /* Scale by the software vector spacing */
    la      k1, VectorBase          /* Get base of 10 interrupt vectors */
    addu    k0, k0, k1              /* Compute target from base and offset */
    jr      k0                      /* Jump to specific exception routine */
    nop
/*
 * Each interrupt processing routine processes a specific interrupt, analogous
 * to those reached in VI or EIC interrupt mode. Since each processing routine
 * is dedicated to a particular interrupt line, it has the context to know
 * which line was asserted. Each processing routine may need to look further
 * to determine the actual source of the interrupt if multiple interrupt requests
 * are ORed together on a single IP line. Once that task is performed, the
 * interrupt may be processed in one of two ways:
 *
 * - Completely at interrupt level (e.g., a simple UART interrupt). The
 *   SimpleInterrupt routine below is an example of this type.
 * - By saving sufficient state and re-enabling other interrupts. In this
 *   case the software model determines which interrupts are disabled during
 *   the processing of this interrupt. Typically, this is either the single
 *   StatusIM bit that corresponds to the interrupt being processed, or some
 *   collection of other StatusIM bits so that “lower” priority interrupts are
 *   also disabled. The NestedException routine below is an example of this type.
 */
SimpleInterrupt:
/*
 * Process the device interrupt here and clear the interrupt request
 * at the device. In order to do this, some registers may need to be
 * saved and restored. The coprocessor 0 state is such that an ERET
 * will simply return to the interrupted code.
 */
    eret                            /* Return to interrupted code */
NestedException:
/*
 * Nested exceptions typically require saving the EPC and Status registers,
 * any GPRs that may be modified by the nested exception routine, disabling
 * the appropriate IM bits in Status to prevent an interrupt loop, putting
 * the processor in kernel mode, and re-enabling interrupts. The sample code
 * below can not cover all nuances of this processing and is intended only
 * to demonstrate the concepts.
 */
    /* Save GPRs here, and setup software context */
    mfc0    k0, C0_EPC              /* Get restart address */
    sw      k0, EPCSave             /* Save in memory */
    mfc0    k0, C0_Status           /* Get Status value */
    sw      k0, StatusSave          /* Save in memory */
    li      k1, ~IMbitsToClear      /* Get IM bits to clear for this interrupt */
                                    /* this must include at least the IM bit */
                                    /* for the current interrupt, and may include */
                                    /* others */
    and     k0, k0, k1              /* Clear bits in copy of Status */
    ins     k0, zero, S_StatusEXL, (W_StatusKSU+W_StatusERL+W_StatusEXL)
                                    /* Clear KSU, ERL, EXL bits in k0 */
    mtc0    k0, C0_Status           /* Modify mask, switch to kernel mode, */
                                    /* re-enable interrupts */
/*
 * Process interrupt here, including clearing device interrupt.
 * In some environments this may be done with a thread running in
 * kernel or user mode. Such an environment is well beyond the scope of
 * this example.
 */
/*
 * To complete interrupt processing, the saved values must be restored
 * and the original interrupted code restarted.
 */
    di                              /* Disable interrupts - may not be required */
    lw      k0, StatusSave          /* Get saved Status (including EXL set) */
    lw      k1, EPCSave             /* and EPC */
    mtc0    k0, C0_Status           /* Restore the original value */
    mtc0    k1, C0_EPC              /* and EPC */
    /* Restore GPRs and software state */
    eret                            /* Dismiss the interrupt */
4.3.1.2 Vectored Interrupt (VI) Mode
In Vectored Interrupt (VI) mode, a priority encoder prioritizes pending interrupts and generates a vector which can be
used to direct each interrupt to a dedicated handler routine. This mode also allows each interrupt to be mapped to a
GPR shadow register set for use by the interrupt handler. VI mode is in effect when all the following conditions are
true:
•Config3VInt = 1
•Config3VEIC = 0
•IntCtlVS ≠ 0
•CauseIV = 1
•StatusBEV = 0
In VI interrupt mode, the eight hardware interrupts are interpreted as individual hardware interrupt requests. The
timer interrupt is combined in a system-dependent way (external to the core) with the hardware interrupts (the interrupt with which it is combined is indicated by the IntCtlIPTI field) to provide the appropriate relative priority
of the timer interrupt with that of the hardware interrupts. The processor interrupt logic ANDs each of the CauseIP
bits with the corresponding StatusIM bits. If any of these values is 1, and if interrupts are enabled (StatusIE = 1,
StatusEXL = 0, and StatusERL = 0), an interrupt is signaled and a priority encoder scans the values in the order shown
in Table 4.3.
Table 4.3 Relative Interrupt Priority for Vectored Interrupt Mode
Relative           Interrupt   Interrupt   Interrupt Request   Vector Number Generated
Priority           Type        Source      Calculated From     by Priority Encoder
Highest Priority   Hardware    HW7         IP9 and IM9         9
                               HW6         IP8 and IM8         8
                               HW5         IP7 and IM7         7
                               HW4         IP6 and IM6         6
                               HW3         IP5 and IM5         5
                               HW2         IP4 and IM4         4
                               HW1         IP3 and IM3         3
                               HW0         IP2 and IM2         2
                   Software    SW1         IP1 and IM1         1
Lowest Priority                SW0         IP0 and IM0         0
The priority order places a relative priority on each hardware interrupt and places the software interrupts at a priority
lower than all hardware interrupts. When the priority encoder finds the highest priority pending interrupt, it outputs
an encoded vector number that is used in the calculation of the handler for that interrupt, as described below. This is
shown pictorially in Figure 4.1.
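The masking and priority-encoding steps for VI mode can be modeled in software as follows. This is a minimal sketch, not a description of the hardware: the function name is hypothetical, and ip and im are assumed to be the IP9..IP0 and IM9..IM0 fields already extracted into bits 9..0, so no register bit positions are implied.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the VI mode priority encoder.
 * ip, im: IP9..IP0 and IM9..IM0 in bits 9..0 (assumed pre-extracted).
 * ie, exl, erl: the StatusIE, StatusEXL and StatusERL bits.
 * Returns the vector number (9 = highest priority, 0 = lowest),
 * or -1 when no interrupt is signaled. */
int vi_vector_number(uint32_t ip, uint32_t im, bool ie, bool exl, bool erl)
{
    uint32_t pending = ip & im & 0x3FFu; /* AND each IP bit with its IM bit */

    if (pending == 0 || !ie || exl || erl)
        return -1;

    int vector = 9;
    while ((pending & (1u << vector)) == 0)
        vector--; /* scan from HW7 down to SW0 */
    return vector;
}
```

With HW7 and HW0 both pending and all mask bits set, the model reports vector 9; masking IM9 lets the HW0 request through as vector 2.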
    li      k1, ~IMbitsToClear      /* Get IM bits to clear for this interrupt */
                                    /* this must include at least the IM bit */
                                    /* for the current interrupt, and may include */
                                    /* others */
    and     k0, k0, k1              /* Clear bits in copy of Status */
    /* If switching shadow sets, write new value to SRSCtlPSS here */
    ins     k0, zero, S_StatusEXL, (W_StatusKSU+W_StatusERL+W_StatusEXL)
                                    /* Clear KSU, ERL, EXL bits in k0 */
    mtc0    k0, C0_Status           /* Modify mask, switch to kernel mode, */
                                    /* re-enable interrupts */
/*
 * If switching shadow sets, clear only KSU above, write target
 * address to EPC, and execute an eret to clear EXL, switch
 * shadow sets, and jump to routine
 */
    /* Process interrupt here, including clearing device interrupt */
/*
 * To complete interrupt processing, the saved values must be restored
 * and the original interrupted code restarted.
 */
    di                              /* Disable interrupts - may not be required */
    lw      k0, StatusSave          /* Get saved Status (including EXL set) */
    lw      k1, EPCSave             /* and EPC */
    mtc0    k0, C0_Status           /* Restore the original value */
    lw      k0, SRSCtlSave          /* Get saved SRSCtl */
    mtc0    k1, C0_EPC              /* and EPC */
    mtc0    k0, C0_SRSCtl           /* Restore shadow sets */
    ehb                             /* Clear hazard */
    eret                            /* Dismiss the interrupt */
4.3.1.3 External Interrupt Controller Mode
External Interrupt Controller Mode redefines the way that the processor interrupt logic is configured to provide support for an external interrupt controller. The interrupt controller is responsible for prioritizing all interrupts,
including hardware, software, timer, and performance counter interrupts, and directly supplying to the processor the
priority level and vector number of the highest priority interrupt. EIC interrupt mode is in effect if all of the following
conditions are true:
• Config3VEIC = 1
• IntCtlVS ≠ 0
• CauseIV = 1
• StatusBEV = 0
In EIC interrupt mode, the processor sends the state of the software interrupt requests (CauseIP1..IP0), the timer interrupt request (CauseTI), the performance counter interrupt request (CausePCI) and Fast Debug Channel Interrupt
(CauseFDCI) to the external interrupt controller, where it prioritizes these interrupts in a system-dependent way with
other hardware interrupts. The interrupt controller can be a hard-wired logic block, or it can be configurable based on
control and status registers. This allows the interrupt controller to be more specific or more general as a function of
the system environment and needs.
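The conditions that select VI mode and EIC mode can be summarised as a small decision function. The following Python sketch is illustrative only: the function name `interrupt_mode` and its plain-integer arguments are assumptions, and treating every other combination of fields as Interrupt Compatibility Mode is an assumption of this sketch rather than a statement made in this section.

```python
def interrupt_mode(vint, veic, vs, iv, bev):
    """Classify the interrupt mode from Config3VInt, Config3VEIC,
    IntCtlVS, CauseIV and StatusBEV (all passed as integers)."""
    # Both vectored modes require IntCtlVS != 0, CauseIV = 1, StatusBEV = 0.
    vectored = vs != 0 and iv == 1 and bev == 0
    if vectored and veic == 1:
        return "EIC"
    if vectored and vint == 1 and veic == 0:
        return "VI"
    # Assumption: anything else falls back to Interrupt Compatibility Mode.
    return "compatibility"
```

Checking VEIC first reflects the listed conditions: EIC mode does not depend on Config3VInt, whereas VI mode requires Config3VEIC = 0.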
The external interrupt controller prioritizes its interrupt requests and produces the priority level and the vector number of the highest priority interrupt to be serviced. The priority level, called the Requested Interrupt Priority Level
(RIPL), is an 8-bit encoded value in the range 0..255, inclusive. A value of 0 indicates that no interrupt requests are
pending. The values 1..255 represent the lowest (1) to highest (255) RIPL for the interrupt to be serviced. The interrupt controller passes this value on the 8 hardware interrupt lines, which are treated as an encoded value in EIC interrupt mode. There are two implementation options available for the vector offset:
1. The first option is to send a separate vector number along with the RIPL to the processor.
2. A second option is to send an entire vector offset along with the RIPL to the processor. This option is
enabled through the core’s configuration GUI, and it is not affected by software.
The M14K core does not support the option to treat the RIPL value as the vector number for the processor.
StatusIPL (which overlays StatusIM9..IM2) is interpreted as the Interrupt Priority Level (IPL) at which the processor is
currently operating (with a value of zero indicating that no interrupt is currently being serviced). When the interrupt
controller requests service for an interrupt, the processor compares RIPL with StatusIPL to determine if the requested
interrupt has higher priority than the current IPL. If RIPL is strictly greater than StatusIPL, and interrupts are enabled
(StatusIE = 1, StatusEXL = 0, and StatusERL = 0), an interrupt request is signaled to the pipeline. When the processor
starts the interrupt exception, it loads RIPL into CauseRIPL (which overlays CauseIP9..IP2) and signals the external
interrupt controller to notify it that the request is being serviced. Because CauseRIPL is only loaded by the processor
when an interrupt exception is signaled, it is available to software during interrupt processing. The vector number that
the EIC passes to the core is combined with the IntCtlVS to determine where the interrupt service routine is located.
The vector number is not stored in any software-visible registers.
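The acceptance test applied to a request from the external interrupt controller can be written as a single predicate. This Python sketch is a simplified model under stated assumptions: the function name `eic_interrupt_signaled` is hypothetical, and RIPL and StatusIPL are passed as plain integers in the range 0..255.

```python
def eic_interrupt_signaled(ripl, status_ipl, ie=1, exl=0, erl=0):
    """Return True if a request at priority `ripl` is signaled to the pipeline.

    A RIPL of 0 means no request is pending, so it can never exceed
    StatusIPL and is rejected by the same comparison.
    """
    enabled = ie == 1 and exl == 0 and erl == 0
    # The request must be strictly higher than the current operating level.
    return enabled and ripl > status_ipl
```

Because the comparison is strict, a request at the same level as the interrupt currently being serviced does not preempt it.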
In EIC interrupt mode, the external interrupt controller is also responsible for supplying the GPR shadow set number
to use when servicing the interrupt. As such, the SRSMap register is not used in this mode, and the mapping of the
vectored interrupt to a GPR shadow set is done by programming (or designing) the interrupt controller to provide the
correct GPR shadow set number when an interrupt is requested. When the processor loads an interrupt request into
CauseRIPL, it also loads the GPR shadow set number into SRSCtlEICSS, which is copied to SRSCtlCSS when the interrupt is serviced.
The operation of EIC interrupt mode is shown pictorially in Figure 4.2.
    sw    k0, StatusSave        /* Save in memory */
    ins   k0, k1, S_StatusIPL, 6 /* Set IPL to RIPL in copy of Status */
    mfc0  k1, C0_SRSCtl         /* Save SRSCtl if changing shadow sets */
    sw    k1, SRSCtlSave
    /* If switching shadow sets, write new value to SRSCtlPSS here */
    ins   k0, zero, S_StatusEXL, (W_StatusKSU+W_StatusERL+W_StatusEXL)
                                /* Clear KSU, ERL, EXL bits in k0 */
    mtc0  k0, C0_Status         /* Modify IPL, switch to kernel mode, */
                                /* re-enable interrupts */
    /*
     * If switching shadow sets, clear only KSU above, write target
     * address to EPC, and execute an eret to clear EXL, switch
     * shadow sets, and jump to routine
     */
    /* Process interrupt here, including clearing device interrupt */
    /*
     * The interrupt completion code is identical to that shown for VI mode above.
     */

4.3.2 Generation of Exception Vector Offsets for Vectored Interrupts
For vectored interrupts (in either VI or EIC interrupt mode), a vector number is produced by the interrupt control
logic. This number is combined with IntCtlVS to create the interrupt offset, which is added to 16#200 to create the
exception vector offset. For VI interrupt mode, the vector number is in the range 0..9, inclusive. For EIC interrupt
mode, the vector number is in the range 0..63, inclusive. The IntCtlVS field specifies the spacing between vector locations. If this value is zero (the default reset state), the vector spacing is zero and the processor reverts to Interrupt
Compatibility Mode. A non-zero value enables vectored interrupts, and Table 4.4 shows the exception vector offset
for a representative subset of the vector numbers and values of the IntCtlVS field.
Table 4.4 Exception Vector Offsets for Vectored Interrupts

                 Value of IntCtlVS Field
Vector Number    2#00001    2#00010    2#00100    2#01000    2#10000
0                16#0200    16#0200    16#0200    16#0200    16#0200
1                16#0220    16#0240    16#0280    16#0300    16#0400
2                16#0240    16#0280    16#0300    16#0400    16#0600
3                16#0260    16#02C0    16#0380    16#0500    16#0800
4                16#0280    16#0300    16#0400    16#0600    16#0A00
5                16#02A0    16#0340    16#0480    16#0700    16#0C00
6                16#02C0    16#0380    16#0500    16#0800    16#0E00
7                16#02E0    16#03C0    16#0580    16#0900    16#1000
...              ...        ...        ...        ...        ...
61               16#09A0    16#1140    16#2080    16#3F00    16#7C00
62               16#09C0    16#1180    16#2100    16#4000    16#7E00
63               16#09E0    16#11C0    16#2180    16#4100    16#8000
The general equation for the exception vector offset for a vectored interrupt is:
vectorOffset ← 16#200 + (vectorNumber × (IntCtlVS || 2#00000))
When using large vector spacing and EIC mode, the offset value can overlap with bits that are specified in the EBase
register. Software must ensure that any overlapping bits are specified as 0 in EBase. This implementation ORs
together the offset and base registers, but it is architecturally undefined and software should not rely on this behavior.
Although there are 255 EIC priority interrupts, only 64 vectors are provided. There is no one-to-one mapping for each
EIC interrupt to its interrupt vector. The 255 priority interrupts will share the 64 interrupt vectors as specified by the
SI_EICVector[5:0] input pins. However, as mentioned in option 2 of Section 4.3.1.3 “External Interrupt Controller
Mode”, the SI_Offset[17:1] input pins can be used to provide each EIC interrupt with a unique interrupt handler location.
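The offset calculation described in this section (a vector number scaled by the IntCtlVS spacing and added to 16#200) can be checked against Table 4.4 with a short Python sketch. The function name `vector_offset` is an assumption for illustration; the shift by five bits models the concatenation of IntCtlVS with five zero bits, which gives a spacing of 32 bytes for IntCtlVS = 2#00001.

```python
def vector_offset(vector_number, intctl_vs):
    """Exception vector offset for a vectored interrupt.

    intctl_vs is the 5-bit IntCtlVS field value; appending five zero bits
    turns it into the byte spacing between vectors.
    """
    spacing = intctl_vs << 5          # IntCtlVS || 2#00000
    return 0x200 + vector_number * spacing


# Reproduce one column of Table 4.4 (IntCtlVS = 2#00100).
for n in (0, 1, 2, 63):
    print(f"vector {n:2d}: 16#{vector_offset(n, 0b00100):04X}")
```

With IntCtlVS = 0 the spacing is zero and every vector number maps to offset 16#200, which matches the reversion to Interrupt Compatibility Mode described above.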
4.3.3 MCU ASE Enhancement for Interrupt Handling
The MCU ASE extends the MIPS/microMIPS32 Architecture with a set of new features designed for the microcontroller market. The MCU ASE contains enhancements in two key areas: interrupt delivery and interrupt latency. For
more details, refer to the MCU Privileged Resource Architecture chapter of the MIPS® Architecture for Programmers Volume IV-h: The MCU Application-Specific Extension to the MIPS32 Architecture [10] or MIPS® Architecture for Programmers Volume IV-h: The MCU Application-Specific Extension to the microMIPS32™ Architecture
[11].
4.3.3.1 Interrupt Delivery
The MCU ASE extends the number of hardware interrupt sources from 6 to 8. For legacy and vectored-interrupt
mode, this represents 8 external interrupt sources. For EIC mode, the widened
IPL and RIPL fields can now represent
256 external interrupt sources.
4.3.3.2 Interrupt Latency Reduction
The MCU ASE includes a package of extensions to MIPS/microMIPS32 that decrease the latency of the processor’s response to a signalled interrupt.
Interrupt Vector Prefetching
Normally on MIPS architecture processors, when an interrupt or exception is signalled, execution pipelines must be
flushed before the interrupt/exception handler is fetched. This is necessary to avoid mixing the contexts of the interrupted/faulting program and the exception handler. The MCU ASE introduces a hardware mechanism in which the
interrupt exception vector is prefetched whenever the interrupt input signals change. The prefetch memory transaction occurs in parallel with the pipeline flush and exception prioritization. This decreases the overall latency of the
execution of the interrupt handler’s first instruction.
Automated Interrupt Prologue
The use of Shadow Register Sets avoids the software steps of having to save general-purpose registers before handling an interrupt.
The MCU ASE adds additional hardware logic that automatically saves some of the COP0 state in the stack and automatically updates some of the COP0 registers in preparation for interrupt handling.
Automated Interrupt Epilogue
A mirror to the Automated Prologue, this feature automates the restoration of some of the COP0 registers from the
stack and the preparation of some of the COP0 registers for returning to non-exception mode. This feature is implemented within the IRET instruction, which is introduced in this ASE.
Interrupt Chaining
An optional feature of the Automated Interrupt Epilogue, this feature allows handling a second interrupt after a primary interrupt is handled, without returning to non-exception mode (and the related pipeline flushes that would normally be necessary).
4.4 GPR Shadow Registers
Release 2 of the Architecture optionally removes the need to save and restore GPRs on entry to high priority interrupts or exceptions, and to provide specified processor modes with the same capability. This is done by introducing
multiple copies of the GPRs, called shadow sets, and allowing privileged software to associate a shadow set with
entry to kernel mode via an interrupt vector or exception. The normal GPRs are logically considered shadow set zero.
The number of GPR shadow sets is a build-time option on the M14K core. Although Release 2 of the Architecture
defines a maximum of 16 shadow sets, the core allows one (the normal GPRs), two, four, eight or sixteen shadow sets.
The highest number actually implemented is indicated by the SRSCtlHSS field. If this field is zero, only the normal
GPRs are implemented.
Shadow sets are new copies of the GPRs that can be substituted for the normal GPRs on entry to kernel mode via an
interrupt or exception. When a shadow set is bound to a kernel mode entry condition, references to GPRs work exactly
as one would expect, but they are redirected to registers that are dedicated to that condition. Privileged software may
need to reference all GPRs in the register file, even specific shadow registers that are not visible in the current mode.
The RDPGPR and WRPGPR instructions are used for this purpose. The CSS field of the SRSCtl register provides the
number of the current shadow register set, and the PSS field of the SRSCtl register provides the number of the previous shadow register set (that which was current before the last exception or interrupt occurred).
If the processor is operating in VI interrupt mode, binding of a vectored interrupt to a shadow set is done by writing to
the SRSMap register. If the processor is operating in EIC interrupt mode, the binding of the interrupt to a specific
shadow set is provided by the external interrupt controller, and is configured in an implementation-dependent way.
Binding of an exception or non-vectored interrupt to a shadow set is done by writing to the ESS field of the SRSCtl
register. When an exception or interrupt occurs, the value of SRSCtlCSS is copied to SRSCtlPSS, and SRSCtlCSS is set
to the value taken from the appropriate source. On an ERET, the value of SRSCtlPSS is copied back into SRSCtlCSS
to restore the shadow set of the mode to which control returns. More precisely, the rules for updating the fields in the
SRSCtl register on an interrupt or exception are as follows:
1. No field in the SRSCtl register is updated if any of the following conditions is true. In this case, steps 2 and 3 are
skipped.
   • The exception is one that sets StatusERL: Reset, Soft Reset, or NMI.
   • The exception causes entry into EJTAG Debug Mode.
   • StatusBEV = 1
   • StatusEXL = 1
2. SRSCtlCSS is copied to SRSCtlPSS.
3. SRSCtlCSS is updated from one of the following sources:
   • The appropriate field of the SRSMap register, based on IPL, if the exception is an interrupt, CauseIV = 1,
     Config3VEIC = 0, and Config3VInt = 1. These are the conditions for a vectored interrupt.
   • The EICSS field of the SRSCtl register if the exception is an interrupt, CauseIV = 1, and Config3VEIC = 1.
     These are the conditions for a vectored EIC interrupt.
   • The ESS field of the SRSCtl register in any other case. This is the condition for a non-interrupt exception, or
     a non-vectored interrupt.
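The three steps above can be traced with a small Python model. It is a sketch under stated assumptions, not a description of the hardware: the function name `srsctl_on_exception`, the dictionary representation of the SRSCtl fields, and the `srsmap_set` argument (standing in for the SRSMap field already selected by IPL) are all hypothetical.

```python
def srsctl_on_exception(srsctl, *, sets_erl=False, debug_entry=False,
                        bev=0, exl=0, is_interrupt=False,
                        iv=0, vint=0, veic=0, srsmap_set=0):
    """Return the SRSCtl fields after an exception or interrupt is taken.

    srsctl is a dict with the keys "CSS", "PSS", "ESS" and "EICSS".
    """
    # Step 1: no field is updated in any of these cases.
    if sets_erl or debug_entry or bev == 1 or exl == 1:
        return dict(srsctl)
    new = dict(srsctl)
    # Step 2: the current shadow set becomes the previous shadow set.
    new["PSS"] = srsctl["CSS"]
    # Step 3: pick the new current shadow set from the appropriate source.
    if is_interrupt and iv == 1 and veic == 0 and vint == 1:
        new["CSS"] = srsmap_set            # vectored interrupt (VI mode)
    elif is_interrupt and iv == 1 and veic == 1:
        new["CSS"] = srsctl["EICSS"]       # vectored EIC interrupt
    else:
        new["CSS"] = srsctl["ESS"]         # exception or non-vectored interrupt
    return new
```

For instance, a VI-mode interrupt taken while running in shadow set 0, with the SRSMap entry for that vector set to 3, leaves PSS = 0 and CSS = 3.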
Similarly, the rules for updating the fields in the SRSCtl register at the end of an exception or interrupt are as follows:
1. No field in the SRSCtl register is updated if any of the following conditions is true. In this case, step 2 is skipped.
   • A DERET is executed.
   • An ERET is executed with StatusERL = 1.
2. SRSCtlPSS is copied to SRSCtlCSS.
These rules have the effect of preserving the SRSCtl register in any case of a nested exception or one which occurs
before the processor has been fully initialized (StatusBEV = 1).
Privileged software may switch the current shadow set by writing a new value into SRSCtlPSS, loading EPC with a
target address, and doing an ERET.

4.5 Exception Vector Locations
The Reset, Soft Reset, and NMI exceptions are always vectored to location 16#BFC0.0000. EJTAG Debug exceptions are vectored to location 16#BFC0.0480, or to location 16#FF20.0200 if the ProbTrap bit is zero or one, respectively, in the EJTAG Control register. Addresses for all other exceptions are a combination of a vector offset and a
vector base address. In Release 1 of the architecture, the vector base address was fixed. In Release 2 of the architecture, software is allowed to specify the vector base address via the EBase register for exceptions that occur when
StatusBEV equals 0. Table 4.5 gives the vector base address as a function of the exception and whether the BEV bit is
set in the Status register. Table 4.6 gives the offsets from the vector base address as a function of the exception. Note
that the IV bit in the Cause register causes Interrupts to use a dedicated exception vector offset, rather than the general
exception vector. For implementations of Release 2 of the Architecture,
Table 4.4 shows the offset from the base address in the case where StatusBEV = 0 and CauseIV = 1. For implementations of Release 1 of the architecture in which CauseIV = 1, the vector offset is as if IntCtlVS were 0. Table 4.7 combines these two tables into one that contains all possible vector addresses as a function of the state that can affect the
vector selection. To avoid complexity in the table, the vector address value assumes that the EBase register, as implemented in Release 2 devices, is not changed from its reset state and that IntCtlVS is 0.

Table 4.5 Exception Vector Base Addresses

Exception                          StatusBEV = 0                                        StatusBEV = 1
Reset, Soft Reset, NMI             16#BFC0.0000                                         16#BFC0.0000
EJTAG Debug (with ProbEn = 0 in    16#BFC0.0480                                         16#BFC0.0480
the EJTAG Control Register)
EJTAG Debug (with ProbEn = 1 in    16#FF20.0200                                         16#FF20.0200
the EJTAG Control Register)
SRAM Parity Error                  EBase31..30 || 1 || EBase28..12 || 16#000            16#BFC0.0300
                                   Note that EBase31..30 have the fixed value 2#10
Other                              For Release 1 of the architecture:                   16#BFC0.0200
                                   16#8000.0000
                                   For Release 2 of the architecture:
                                   EBase31..12 || 16#000
                                   Note that EBase31..30 have the fixed value 2#10

Table 4.6 Exception Vector Offsets

Exception                  Vector Offset
General Exception          16#180
Interrupt, CauseIV = 1     16#200 (In Release 2 implementations, this is the base of the vectored
                           interrupt table when StatusBEV = 0)
Reset, Soft Reset, NMI     None (Uses Reset Base Address)
Table 4.7 Exception Vectors

Vector values are for Release 2 implementations and assume that EBase retains its reset state and that IntCtlVS = 0.

Exception                StatusBEV  StatusEXL  CauseIV  EJTAG ProbEn  Vector
Reset, Soft Reset, NMI   x          x          x        x             16#BFC0.0000
EJTAG Debug              x          x          x        0             16#BFC0.0480
EJTAG Debug              x          x          x        1             16#FF20.0200
SRAM Parity Error        0          x          x        x             EBase[31:30] || 2#1 || EBase[28:12] || 16#100
SRAM Parity Error        1          x          x        x             16#BFC0.0300
Interrupt                0          0          0        x             16#8000.0180
Interrupt                0          0          1        x             16#8000.0200
Interrupt                1          0          0        x             16#BFC0.0380
Interrupt                1          0          1        x             16#BFC0.0400
All others               0          x          x        x             16#8000.0180
All others               1          x          x        x             16#BFC0.0380

‘x’ denotes don’t care
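Table 4.7 can be read as a lookup from the exception kind and a few state bits to a vector address. The Python sketch below encodes that lookup under the same assumptions as the table (EBase in its reset state and IntCtlVS = 0). The function name `exception_vector`, the string labels for the exception kinds, and the constant `EBASE_RESET` are hypothetical names introduced for this illustration.

```python
EBASE_RESET = 0x8000_0000  # EBase reset state, as assumed by Table 4.7


def exception_vector(kind, bev=0, exl=0, iv=0, proben=0):
    """Return the exception vector address for the given state.

    kind is one of "reset" (Reset, Soft Reset, NMI), "ejtag_debug",
    "sram_parity", "interrupt" or "other".
    """
    if kind == "reset":
        return 0xBFC0_0000
    if kind == "ejtag_debug":
        return 0xFF20_0200 if proben else 0xBFC0_0480
    if kind == "sram_parity":
        if bev:
            return 0xBFC0_0300
        # EBase[31:30] || 2#1 || EBase[28:12] || 16#100
        return ((EBASE_RESET & 0xC000_0000) | (1 << 29)
                | (EBASE_RESET & 0x1FFF_F000) | 0x100)
    base = 0xBFC0_0200 if bev else EBASE_RESET
    # Only an interrupt taken with EXL = 0 and CauseIV = 1 uses offset 16#200.
    dedicated = kind == "interrupt" and exl == 0 and iv == 1
    return base + (0x200 if dedicated else 0x180)
```

An interrupt taken while StatusEXL = 1 is not listed in the Interrupt rows, so the sketch routes it through the general exception vector like the "All others" rows.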
4.6 General Exception Processing
With the exception of Reset, Soft Reset, NMI, cache error, and EJTAG Debug exceptions, which have their own special processing as described below, exceptions have the same basic processing flow:
• If the EXL bit in the Status register is zero, the EPC register is loaded with the PC at which execution will be
restarted and the BD bit is set appropriately in the Cause register (see Table 5.17). The value loaded into the EPC
register is dependent on whether the processor implements microMIPS, and whether the instruction is in the
delay slot of a branch or jump which has delay slots. Table 4.8 shows the value stored in each of the CP0 PC registers, including EPC. For implementations of Release 2 of the Architecture, if StatusBEV = 0, the CSS field in the
SRSCtl register is copied to the PSS field, and the CSS value is loaded from the appropriate source.
If the EXL bit in the Status register is set, the EPC register is not loaded and the BD bit is not changed in the
Cause register. For implementations of Release 2 of the Architecture, the SRSCtl register is not changed.
• The CE and ExcCode fields of the Cause register are loaded with the values appropriate to the exception. The
CE field is loaded, but not defined, for any exception type other than a coprocessor unusable exception.
• The EXL bit is set in the Status register.
• The processor is started at the exception vector.

Table 4.8 Value Stored in EPC, ErrorEPC, or DEPC on an Exception

microMIPS      In Branch/Jump
Implemented?   Delay Slot?      Value stored in EPC/ErrorEPC/DEPC
No             No               Address of the instruction
No             Yes              Address of the branch or jump instruction (PC-4)
Yes            No               Upper 31 bits of the address of the instruction, combined
                                with the ISA Mode bit
Yes            Yes              Upper 31 bits of the branch or jump instruction (PC-2 or
                                PC-4 depending on size of the instruction in the microMIPS
                                ISA Mode and PC-4 in the 32-bit ISA Mode), combined with
                                the ISA Mode bit
The value loaded into EPC represents the restart address for the exception and need not be modified by exception
handler software in the normal case. Software need not look at the BD bit in the Cause register unless it wishes to
identify the address of the instruction that actually caused the exception.
Note that individual exception types may load additional information into other registers. This is noted in the description of each exception type below.
Operation:
/* If Status
/* and neither EPC nor Cause
if Status
vectorOffset ← 16#180
else
if InstructionInBranchDelaySlot then
EPC ← restartPC/* PC of branch/jump */
Cause
else
EPC ← restartPC/* PC of instruction */
Cause
endif
/* Compute vector offsets as a function of the type of exception */
NewShadowSet ← SRSCtl
if ExceptionType = TLBRefill then
vectorOffset ← 16#000
elseif (ExceptionType = Interrupt) then
if (Cause
else
endif /* if (Cause
endif /* elseif (ExceptionType = Interrupt) then */
is 1, all exceptions go through the general exception vector */
/* Update the shadow set information for an implementation of */
/* Release 2 of the architecture */
if ((ArchitectureRevision ≥ 2) and (SRSCtl
(Status
SRSCtl
SRSCtl
= 0)) then
ERL
← SRSCtl
PSS
← NewShadowSet
CSS
CSS
> 0) and (Status
HSS
= 0) and
BEV
endif
endif /* if Status
Cause
Cause
Status
← FaultingCoprocessorNumber
CE
ExcCode
EXL
← ExceptionType
← 1
= 1 then */
EXL
/* Calculate the vector base address */
if StatusBEV = 1 then
    vectorBase ← 16#BFC0.0200
else
    if ArchitectureRevision ≥ 2 then
        /* The fixed value of EBase31..30 forces the base to be in kseg0 or kseg1 */
        vectorBase ← EBase31..12 || 16#000
    else
        vectorBase ← 16#8000.0000
    endif
endif
/* Exception PC is the sum of vectorBase and vectorOffset */
PC ← vectorBase31..30 || (vectorBase29..0 + vectorOffset29..0)
/* No carry between bits 29 and 30 */

4.7 Debug Exception Processing
All debug exceptions have the same basic processing flow:
• The DEPC register is loaded with the program counter (PC) value at which execution will be restarted and the
DBD bit is set appropriately in the Debug register. The value loaded into the DEPC register is the current PC if
the instruction is not in the delay slot of a branch, or the PC-4 of the branch if the instruction is in the delay slot
of a branch.
•The DSS, DBp, DDBL, DDBS, DIB, DINT, DIBImpr, DDBLImpr, and DDBSImpr bits in the Debug register are
updated appropriately depending on the debug exception type.
•The Debug2 register is updated with additional information for complex breakpoints.
•Halt and Doze bits in the Debug register are updated appropriately.
•DM bit in the Debug register is set to 1.
•The processor is started at the debug exception vector.
The value loaded into DEPC represents the restart address for the debug exception and need not be modified by the
debug exception handler software in the usual case. Debug software need not look at the DBD bit in the Debug register unless it wishes to identify the address of the instruction that actually caused the debug exception.
A unique debug exception is indicated through the DSS, DBp, DDBL, DDBS, DIB, DINT, DIBImpr, DDBLImpr, and
DDBSImpr bits in the Debug register.
No other CP0 registers or fields are changed due to the debug exception, thus no additional state is saved.
Operation:
if InstructionInBranchDelaySlot then
    DEPC ← PC-4
    DebugDBD ← 1
else
    DEPC ← PC
    DebugDBD ← 0
endif
if ECRProbTrap = 1 then
    PC ← 0xFF20_0200
else
    PC ← 0xBFC0_0480
endif
The same debug exception vector location is used for all debug exceptions. The location is determined by the ProbTrap bit in the EJTAG Control register (ECR), as shown in Table 4.9.

Table 4.9 Debug Exception Vector Addresses

ProbTrap bit in ECR Register    Debug Exception Vector Address
0                               0xBFC0_0480
1                               0xFF20_0200 in dmseg

4.8 Exception Descriptions
The following subsections describe each of the exceptions listed in the same sequence as shown in Table 4.1.
4.8.1 Reset/SoftReset Exception
A reset exception occurs when the SI_ColdReset signal is asserted to the processor; a soft reset occurs when the
SI_Reset signal is asserted. These exceptions are not maskable. When one of these exceptions occurs, the processor
performs a full reset initialization, including aborting state machines, establishing critical state, and generally placing
the processor in a state in which it can execute instructions from uncached, unmapped address space. On a Reset/SoftReset exception, the state of the processor is not defined, with the following exceptions:
•The Config register is initialized with its boot state.
•The RP, BEV, TS, SR, NMI, and ERL fields of the Status register are initialized to a specified state.
•The ErrorEPC register is loaded with PC-4 if the state of the processor indicates that it was executing an instruction in the delay slot of a branch. Otherwise, the ErrorEPC register is loaded with PC. Note that this value may or
may not be predictable.
•PC is loaded with 0xBFC0_0000.
Cause Register ExcCode Value:
None
Additional State Saved:
None
Entry Vector Used:
Reset (0xBFC0_0000)
Operation:
Config ← ConfigurationState
StatusRP ← 0
StatusBEV ← 1
StatusTS ← 0
StatusSR ← 0/1 (depending on Reset or SoftReset)
StatusNMI ← 0
StatusERL ← 1
if InstructionInBranchDelaySlot then
    ErrorEPC ← PC - 4
else
    ErrorEPC ← PC
endif
PC ← 0xBFC0_0000
4.8.2 Debug Single Step Exception
A debug single step exception occurs after the CPU has executed one/two instructions in non-debug mode, when
returning to non-debug mode after debug mode. One instruction is allowed to execute when returning to a non
jump/branch instruction, otherwise two instructions are allowed to execute since the jump/branch and the instruction
in the delay slot are executed as one step. Debug single step exceptions are enabled by the SSt bit in the Debug register, and are always disabled for the first one/two instructions after a DERET.
The DEPC register points to the instruction on which the debug single step exception occurred, which is also the next
instruction to single step or execute when returning from debug mode. So the DEPC will not point to the instruction
which has just been single stepped, but rather the following instruction. The DBD bit in the Debug register is never
set for a debug single step exception, since the jump/branch and the instruction in the delay slot are executed in one
step.
Exceptions occurring on the instruction(s) executed with debug single step exception enabled are taken even though
debug single step was enabled. For a normal exception (other than reset), a debug single step exception is then taken
on the first instruction in the normal exception handler. Debug exceptions are unaffected by single step mode, e.g.
returning to a SDBBP instruction with debug single step exceptions enabled causes a debug software breakpoint
exception, and DEPC points to the SDBBP instruction. However, returning to an instruction (not jump/branch) just
before the SDBBP instruction, causes a debug single step exception with the DEPC pointing to the SDBBP instruction.
To ensure proper functionality of single step, the debug single step exception has priority over all other exceptions,
except reset and soft reset.
Debug Register Debug Status Bit Set
DSS
Additional State Saved
None
Entry Vector Used
Debug exception vector
4.8.3 Debug Interrupt Exception
A debug interrupt exception is either caused by the EjtagBrk bit in the EJTAG Control register (controlled through the
TAP), or caused by the debug interrupt request signal to the CPU.
The debug interrupt exception is an asynchronous debug exception which is taken as soon as possible, but with no
specific relation to the executed instructions. The DEPC register is set to the instruction where execution should continue after the debug handler is through. The DBD bit is set based on whether the interrupted instruction was executing in the delay slot of a branch.
Debug Register Debug Status Bit Set
DINT
Additional State Saved
None
Entry Vector Used
Debug exception vector
4.8.4 Non-Maskable Interrupt (NMI) Exception
A non maskable interrupt exception occurs when the SI_NMI signal is asserted to the processor. SI_NMI is an edge
sensitive signal - only one NMI exception will be taken each time it is asserted. An NMI exception occurs only at
instruction boundaries, so it does not cause any reset or other hardware initialization. The state of the cache, memory,
and other processor states are consistent and all registers are preserved, with the following exceptions:
•The BEV, TS, SR, NMI, and ERL fields of the Status register are initialized to a specified state.
•The ErrorEPC register is loaded with PC-4 if the state of the processor indicates that it was executing an instruction in the delay slot of a branch. Otherwise, the ErrorEPC register is loaded with PC.
•PC is loaded with 0xBFC0_0000.
Cause Register ExcCode Value:
None
Additional State Saved:
None
Entry Vector Used:
Reset (0xBFC0_0000)
Operation:
StatusBEV ← 1
StatusTS ← 0
StatusSR ← 0
StatusNMI ← 1
StatusERL ← 1
if InstructionInBranchDelaySlot then
    ErrorEPC ← PC - 4
else
    ErrorEPC ← PC
endif
PC ← 0xBFC0_0000
4.8.5 Interrupt Exception
The interrupt exception occurs when one or more of the eight hardware, two software, or timer interrupt requests is
enabled by the Status register, and the interrupt input is asserted. See 4.3 “Interrupts” on page 57 for more details
about the processing of interrupts.
Cause Register ExcCode Value:
Int
Additional State Saved:
Table 4.10 Register States on an Interrupt Exception
Register State    Value
CauseIP           Indicates the interrupts that are pending.
Entry Vector Used:
See 4.3.2 “Generation of Exception Vector Offsets for Vectored Interrupts” on page 66 for the entry vector used,
depending on the interrupt mode the processor is operating in.
4.8.6 Debug Instruction Break Exception
A debug instruction break exception occurs when an instruction hardware breakpoint matches an executed instruction. The DEPC register and DBD bit in the Debug register indicate the instruction that caused the instruction hardware breakpoint to match. This exception can only occur if instruction hardware breakpoints are implemented.
4.8.7 Address Error Exception
An address error exception occurs on an instruction or data access when an attempt is made to execute one of the following:
•Fetch an instruction, load a word, or store a word that is not aligned on a word boundary
•Load or store a halfword that is not aligned on a halfword boundary
•Reference the kernel address space from user mode
Note that in the case of an instruction fetch that is not aligned on a word boundary, PC is updated before the condition
is detected. Therefore, both EPC and BadVAddr point to the unaligned instruction address. In the case of a data access,
the exception is taken if either an unaligned address or an address that was inaccessible in the current processor mode
was referenced by a load or store instruction.
Cause Register ExcCode Value:
AdEL: Reference was a load or an instruction fetch
AdES: Reference was a store
Additional State Saved:
Table 4.11 CP0 Register States on an Address Exception Error
Register State    Value
BadVAddr          Failing address
Entry Vector Used:
General exception vector (offset 0x180)
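The address checks described in this section can be sketched in C as follows. This is a simplified model under stated assumptions: the caller supplies the access size in bytes (1, 2, or 4), kernel address space is taken to be every address at or above 0x8000_0000, and the numeric ExcCode values (AdEL = 4, AdES = 5) come from the MIPS32 architecture. The type and function names are hypothetical, and unaligned-access instructions such as LWL/LWR are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { ACCESS_FETCH, ACCESS_LOAD, ACCESS_STORE } access_t;
typedef enum { EXC_NONE = -1, EXC_ADEL = 4, EXC_ADES = 5 } addr_exc_t;

/* Returns the address error exception, if any, for one memory reference. */
static addr_exc_t check_address(uint32_t vaddr, unsigned size,
                                access_t kind, bool user_mode)
{
    /* size must be a power of two: 1, 2 or 4 bytes. */
    bool misaligned = (vaddr & (size - 1u)) != 0u;

    /* User mode may not reference the kernel address space. */
    bool forbidden = user_mode && vaddr >= 0x80000000u;

    if (!misaligned && !forbidden) {
        return EXC_NONE;
    }
    /* Stores report AdES; loads and instruction fetches report AdEL. */
    return (kind == ACCESS_STORE) ? EXC_ADES : EXC_ADEL;
}
```

On a real exception, BadVAddr would hold the value passed here as vaddr.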
4.8.8 SRAM Parity Error Exception
An SRAM parity error exception occurs when an instruction or data reference detects a data error. This exception is not
maskable. To avoid disturbing the error in the cache array, the exception vectors to an unmapped, uncached address.
This exception is precise.
Cause Register ExcCode Value:
N/A
Additional State Saved:
Table 4.12 CP0 Register States on an SRAM Parity Error Exception
Register State    Value
CacheErr          Error state
ErrorEPC          Restart PC
Entry Vector Used:
Cache error vector (offset 0x100)
4.8.9 Bus Error Exception — Instruction Fetch or Data Access
A bus error exception occurs when an instruction or data access makes a bus request and that request terminates in an
error. The bus error exception can occur on either an instruction fetch or a data access. Bus error exceptions that occur
on an instruction fetch have a higher priority than bus error exceptions that occur on a data access.
Bus errors taken on any external access on the M14K core are always precise.
Cause Register ExcCode Value:
IBE: Error on an instruction reference
DBE: Error on a data reference
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.10 Protection Exception
The protection exception occurs when an access is attempted to memory that has been protected by the Memory Protection
Unit or, under certain circumstances, when a write to the EBase register is attempted. See "Security Features
of the M14K™ Processor Family" (MD00896) for more information.
Cause Register ExcCode Value:
Prot (Cause Code 29)
Additional State Saved:
MPU Config Register, Triggered Field
MPU StatusN Register, Cause* Fields
Entry Vector Used:
General exception vector (offset 0x180)
4.8.11 Debug Software Breakpoint Exception
A debug software breakpoint exception occurs when an SDBBP instruction is executed. The DEPC register and DBD
bit in the Debug register will indicate the SDBBP instruction that caused the debug exception.
Debug Register Debug Status Bit Set:
DBp
Additional State Saved:
None
Entry Vector Used:
Debug exception vector
4.8.12 Execution Exception — System Call
The system call exception is one of the execution exceptions. All of these exceptions have the same priority. A system call exception occurs when a SYSCALL instruction is executed.
Cause Register ExcCode Value:
Sys
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.13 Execution Exception — Breakpoint
The breakpoint exception is one of the execution exceptions. All of these exceptions have the same priority. A breakpoint exception occurs when a BREAK instruction is executed.
Cause Register ExcCode Value:
Bp
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.14 Execution Exception — Reserved Instruction
The reserved instruction exception is one of the execution exceptions. All of these exceptions have the same priority.
A reserved instruction exception occurs when a reserved or undefined major opcode or function field is executed.
This includes Coprocessor 2 instructions that are decoded as reserved in Coprocessor 2.
Cause Register ExcCode Value:
RI
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.15 Execution Exception — Coprocessor Unusable
The coprocessor unusable exception is one of the execution exceptions. All of these exceptions have the same priority. A coprocessor unusable exception occurs when an attempt is made to execute a coprocessor instruction for one of
the following:
•a corresponding coprocessor unit that has not been marked usable by setting its CU bit in the Status register
•CP0 instructions, when the unit has not been marked usable, and the processor is executing in user mode
Cause Register ExcCode Value:
CpU
Additional State Saved:
Table 4.13 Register States on a Coprocessor Unusable Exception
Register State    Value
CauseCE           Unit number of the coprocessor being referenced
Entry Vector Used:
General exception vector (offset 0x180)
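A handler typically reads the unit number back from the Cause register. The sketch below shows one way to do that on a Cause value already copied into a variable. It assumes the MIPS32 field layout (ExcCode in bits 6:2, CE in bits 29:28) and the CpU encoding of 11. The function names are hypothetical.

```c
#include <stdint.h>

#define CAUSE_EXCCODE_SHIFT 2
#define CAUSE_EXCCODE_MASK  0x1Fu
#define CAUSE_CE_SHIFT      28
#define CAUSE_CE_MASK       0x3u
#define EXCCODE_CPU         11u   /* Coprocessor Unusable (assumed encoding) */

/* Extract the ExcCode field from a saved Cause value. */
static unsigned cause_exccode(uint32_t cause)
{
    return (cause >> CAUSE_EXCCODE_SHIFT) & CAUSE_EXCCODE_MASK;
}

/*
 * Return the coprocessor unit number for a Coprocessor Unusable exception,
 * or -1 when the exception is of another type (CE is then not meaningful).
 */
static int faulting_coprocessor(uint32_t cause)
{
    if (cause_exccode(cause) != EXCCODE_CPU) {
        return -1;
    }
    return (int)((cause >> CAUSE_CE_SHIFT) & CAUSE_CE_MASK);
}
```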
4.8.16 Execution Exception — CorExtend Unusable
The CorExtend unusable exception is one of the execution exceptions. All of these exceptions have the same priority.
A CorExtend Unusable exception occurs when an attempt is made to execute a CorExtend instruction when
StatusCEE is cleared. It is implementation-dependent whether this functionality is supported. Generally, the
functionality will only be supported if a CorExtend block contains local destination registers.
4.8.17 Coprocessor 2 Exception
The Coprocessor 2 exception is one of the execution exceptions. All of these exceptions have the same priority. A
Coprocessor 2 exception occurs when a valid Coprocessor 2 instruction causes a general exception in Coprocessor 2.
Cause Register ExcCode Value:
C2E
Additional State Saved:
Depending on the Coprocessor 2 implementation, additional state information of the exception can be saved in a
Coprocessor 2 control register.
4.8.18 Implementation-Specific 1 Exception
The Implementation-Specific 1 exception is one of the execution exceptions. All of these exceptions have the same
priority. An Implementation-Specific 1 exception occurs when a valid Coprocessor 2 instruction causes an implementation-specific 1 exception in Coprocessor 2.
Cause Register ExcCode Value:
IS1
Additional State Saved:
Depending on the coprocessor 2 implementation, additional state information of the exception can be saved in a
coprocessor 2 control register.
Entry Vector Used:
General exception vector (offset 0x180)
4.8.19 Execution Exception — Integer Overflow
The integer overflow exception is one of the execution exceptions. All of these exceptions have the same priority. An
integer overflow exception occurs when selected integer instructions result in a 2’s complement overflow.
Cause Register ExcCode Value:
Ov
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
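For reference, the overflow condition for a 32-bit two's complement addition can be expressed in portable C as shown below. This is only a sketch of the arithmetic condition, as it would apply to a trapping add such as ADD. It does not model the instruction itself, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True when a + b overflows 32-bit two's complement arithmetic.
 * Overflow happens when both operands have the same sign and the
 * sign of the wrapped sum differs from it.
 */
static bool add_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t sum = ua + ub;   /* unsigned arithmetic wraps, no undefined behavior */

    return ((~(ua ^ ub)) & (ua ^ sum) & 0x80000000u) != 0u;
}
```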
4.8.20 Execution Exception — Trap
The trap exception is one of the execution exceptions. All of these exceptions have the same priority. A trap exception occurs when a trap instruction results in a TRUE value.
Cause Register ExcCode Value:
Tr
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
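The exceptions above all enter at the general exception vector, so a handler usually dispatches on the ExcCode field. The sketch below maps the mnemonics used in this section to numeric codes. The values are assumptions taken from the MIPS32 privileged architecture (only Prot = 29 is stated in this section) and should be verified against the Cause register description. The function name is hypothetical.

```c
#include <stdint.h>

/* Map the ExcCode field (Cause bits 6:2) to the mnemonic used in this section. */
static const char *exccode_name(uint32_t cause)
{
    switch ((cause >> 2) & 0x1Fu) {
    case 0:  return "Int";   /* interrupt */
    case 4:  return "AdEL";  /* address error, load or instruction fetch */
    case 5:  return "AdES";  /* address error, store */
    case 6:  return "IBE";   /* bus error, instruction reference */
    case 7:  return "DBE";   /* bus error, data reference */
    case 8:  return "Sys";   /* system call */
    case 9:  return "Bp";    /* breakpoint */
    case 10: return "RI";    /* reserved instruction */
    case 11: return "CpU";   /* coprocessor unusable */
    case 12: return "Ov";    /* integer overflow */
    case 13: return "Tr";    /* trap */
    case 16: return "IS1";   /* implementation-specific 1 */
    case 17: return "CEU";   /* CorExtend unusable */
    case 18: return "C2E";   /* Coprocessor 2 exception */
    case 29: return "Prot";  /* protection exception */
    default: return "unknown";
    }
}
```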
4.8.21 Debug Data Break Exception
A debug data break exception occurs when a data hardware breakpoint matches the load/store transaction of an executed load/store instruction. The DEPC register and DBD bit in the Debug register will indicate the load/store instruction that caused the data hardware breakpoint to match. The load/store instruction that caused the debug exception
has not completed (for example, it has not updated the register file), so the instruction can be re-executed after
returning from the debug handler.
Debug Register Debug Status Bit Set:
DDBL for a load instruction or DDBS for a store instruction
Additional State Saved:
None
Entry Vector Used:
Debug exception vector
4.8.22 Complex Break Exception
A complex data break exception occurs when the complex hardware breakpoint detects an enabled breakpoint. Complex breaks are taken imprecisely—the instruction that actually caused the exception is allowed to complete and the
DEPC register and DBD bit in the Debug register point to a following instruction.
Debug Register Debug Status Bit Set:
DIBImpr, DDBLImpr, and/or DDBSImpr
Additional State Saved:
Debug2 fields indicate which type(s) of complex breakpoints were detected.
Entry Vector Used:
Debug exception vector
4.9 Exception Handling and Servicing Flowcharts
The remainder of this chapter contains flowcharts for the following exceptions and guidelines for their handlers:
•General exceptions and their exception handler
•Reset, soft reset and NMI exceptions, and a guideline to their handler
•Debug exceptions
Figure 4.3 General Exception Handler (HW)
[Flowchart; the steps below are recovered from the figure text.]
Applies to exceptions other than Reset, Soft Reset, NMI, EJTAG Debug, and cache error, or first-level TLB miss.
Note: Interrupts can be masked by IE or IMs, and Watch is masked if EXL = 1.
•Set Cause ExcCode and CE. BadVA ← VA. (BadVA is set only for AdEL/AdES exceptions. It is not set for a bus error.)
•Check whether the exception occurred within another exception (EXL = 1). If EXL = 0, test whether the instruction is in a branch delay slot. If yes: EPC ← (PC - 4), CauseBD ← 1. If no: EPC ← PC, CauseBD ← 0.
•EXL ← 1. The processor is forced to Kernel Mode and interrupts are disabled.
•If StatusBEV = 0 (normal): PC ← 0x8000_0000 + 0x180 (unmapped, cached). If StatusBEV = 1 (bootstrap): PC ← 0xBFC0_0200 + 0x180 (unmapped, uncached).
•Continue to General Exception Servicing Guidelines.
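The hardware flow in Figure 4.3 can be approximated by the following C model. It is a sketch under stated assumptions, not a description of the actual implementation: the structure cp0_model_t and the function general_exception_entry are hypothetical, the bit positions (EXL = Status bit 1, BEV = Status bit 22, BD = Cause bit 31, ExcCode = Cause bits 6:2) are taken from the MIPS32 architecture, and the CE field is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_EXL (1u << 1)
#define STATUS_BEV (1u << 22)
#define CAUSE_BD   (1u << 31)

/* Hypothetical model of the CP0 state touched on general exception entry. */
typedef struct {
    uint32_t status, cause, epc, badvaddr, pc;
} cp0_model_t;

static void general_exception_entry(cp0_model_t *c, unsigned exccode,
                                    bool in_delay_slot,
                                    bool is_addr_error, uint32_t vaddr)
{
    /* Record the exception type in Cause.ExcCode. */
    c->cause = (c->cause & ~(0x1Fu << 2)) | ((exccode & 0x1Fu) << 2);

    /* BadVAddr is written only for address error exceptions. */
    if (is_addr_error) {
        c->badvaddr = vaddr;
    }

    /* EPC and BD are updated only when not already at exception level. */
    if ((c->status & STATUS_EXL) == 0u) {
        if (in_delay_slot) {
            c->epc = c->pc - 4u;
            c->cause |= CAUSE_BD;
        } else {
            c->epc = c->pc;
            c->cause &= ~CAUSE_BD;
        }
    }

    /* Enter exception level: kernel mode, interrupts disabled. */
    c->status |= STATUS_EXL;

    /* Select the vector base from BEV and add the general offset. */
    c->pc = ((c->status & STATUS_BEV) ? 0xBFC00200u : 0x80000000u) + 0x180u;
}
```

Note that a second exception taken while EXL is already set leaves EPC untouched, which is why handlers save EPC before clearing EXL.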