MIPS Technologies MIPS32 M14K User Manual

MIPS32® M14K™ Processor Core Family
Software User’s Manual
Document Number: MD00668
Revision 02.04
March 24, 2014
Unpublished rights (if any) reserved under the copyright laws of the United States of America and other countries.
This document contains information that is proprietary to MIPS Tech, LLC, a Wave Computing company (“MIPS”) and MIPS’ affiliates as applicable. Any copying, reproducing, modifying or use of this information (in whole or in part) that is not expressly permitted in writing by MIPS or MIPS’ affiliates as applicable or an authorized third party is strictly prohibited. At a minimum, this information is protected under unfair competition and copyright laws. Violations thereof may result in criminal penalties and fines. Any document provided in source format (i.e., in a modifiable form such as in FrameMaker or Microsoft Word format) is subject to use and distribution restrictions that are independent of and supplemental to any and all confidentiality restrictions. UNDER NO CIRCUMSTANCES MAY A DOCUMENT PROVIDED IN SOURCE FORMAT BE DISTRIBUTED TO A THIRD PARTY IN SOURCE FORMAT WITHOUT THE EXPRESS WRITTEN PERMISSION OF MIPS (AND MIPS’ AFFILIATES AS APPLICABLE). MIPS and MIPS’ affiliates reserve the right to change the information contained in this document to improve function, design or otherwise.
MIPS and MIPS’ affiliates do not assume any liability arising out of the application or use of this information, or of any error or omission in such information. Any warranties, whether express, statutory, implied or otherwise, including but not limited to the implied warranties of merchantability or fitness for a particular purpose, are excluded. Except as expressly provided in any written license agreement from MIPS or an authorized third party, the furnishing of this document does not give recipient any license to any intellectual property rights, including any patent rights, that cover the information in this document.
The information contained in this document shall not be exported, reexported, transferred, or released, directly or indirectly, in violation of the law of any country or international law, regulation, treaty, Executive Order, statute, amendments or supplements thereto. Should a conflict arise regarding the export, reexport, transfer, or release of the information contained in this document, the laws of the United States of America shall be the governing law.
The information contained in this document constitutes one or more of the following: commercial computer software, commercial computer software documentation or other commercial items. If the user of this information, or any related documentation of any kind, including related technical data or manuals, is an agency, department, or other entity of the United States government ("Government"), the use, duplication, reproduction, release, modification, disclosure, or transfer of this information, or any related documentation of any kind, is restricted in accordance with Federal Acquisition Regulation 12.212 for civilian agencies and Defense Federal Acquisition Regulation Supplement 227.7202 for military agencies. The use of this information by the Government is further restricted in accordance with the terms of the license agreement(s) and/or applicable contract terms and conditions covering this information from MIPS Technologies or an authorized third party.
MIPS, MIPS I, MIPS II, MIPS III, MIPS IV, MIPS V, MIPSr3, MIPS32, MIPS64, microMIPS32, microMIPS64, MIPS-3D, MIPS16, MIPS16e, MIPS-Based, MIPSsim, MIPSpro, MIPS-VERIFIED, Aptiv logo, microAptiv logo, interAptiv logo, microMIPS logo, MIPS Technologies logo, MIPS-VERIFIED logo, proAptiv logo, 4K, 4Kc, 4Km, 4Kp, 4KE, 4KEc, 4KEm, 4KEp, 4KS, 4KSc, 4KSd, M4K, M14K, 5K, 5Kc, 5Kf, 24K, 24Kc, 24Kf, 24KE, 24KEc, 24KEf, 34K, 34Kc, 34Kf, 74K, 74Kc, 74Kf, 1004K, 1004Kc, 1004Kf, 1074K, 1074Kc, 1074Kf, R3000, R4000, R5000, Aptiv, ASMACRO, Atlas, "At the core of the user experience.", BusBridge, Bus Navigator, CLAM, CorExtend, CoreFPGA, CoreLV, EC, FPGA View, FS2, FS2 FIRST SILICON SOLUTIONS logo, FS2 NAVIGATOR, HyperDebug, HyperJTAG, IASim, iFlowtrace, interAptiv, JALGO, Logic Navigator, Malta, MDMX, MED, MGB, microAptiv, microMIPS, Navigator, OCI, PDtrace, the Pipeline, proAptiv, Pro Series, SEAD-3, SmartMIPS, SOC-it, and YAMON are trademarks or registered trademarks of MIPS and MIPS’ affiliates as applicable in the United States and other countries.
All other trademarks referred to herein are the property of their respective owners.
Table of Contents
Chapter 1: Introduction to the MIPS32® M14K™ Processor Core.....................................................4
1.1: Features ...................................................................................................................................................... 4
1.2: M14K™ Core Block Diagram ...................................................................................................................... 8
1.2.1: Required Logic Blocks ....................................................................................................................... 9
1.2.1.1: Execution Unit .......................................................................................................................... 9
1.2.1.2: General Purpose Register (GPR) Shadow Registers........................................................... 10
1.2.1.3: Multiply/Divide Unit (MDU) .....................................................................................................10
1.2.1.4: System Control Coprocessor (CP0) .......................................................................................10
1.2.1.5: Memory Management Unit (MMU) ......................................................................................... 12
1.2.1.6: SRAM Interface Controller...................................................................................................... 14
1.2.1.7: Power Management ............................................................................................................... 15
1.2.2: Optional Logic Blocks....................................................................................................................... 16
1.2.2.1: Reference Design................................................................................................................... 16
1.2.2.2: microMIPS™ ISA.................................................................................................................... 17
1.2.2.3: Memory Protection Unit..........................................................................................................17
1.2.2.4: Coprocessor 2 Interface .........................................................................................................17
1.2.2.5: CorExtend® User-defined Instruction Extensions..................................................................18
1.2.2.6: EJTAG Debug Support........................................................................................................... 18
Chapter 2: Pipeline of the M14K™ Core.............................................................................................22
2.1: Pipeline Stages.......................................................................................................................................... 22
2.1.1: I Stage: Instruction Fetch.................................................................................................................24
2.1.2: E Stage: Execution...........................................................................................................................24
2.1.3: M Stage: Memory Fetch...................................................................................................................24
2.1.4: A Stage: Align .................................................................................................................................. 25
2.1.5: W Stage: Writeback ......................................................................................................................... 25
2.2: Multiply/Divide Operations......................................................................................................................... 25
2.3: MDU Pipeline - High-performance MDU ................................................................................................... 25
2.3.1: 32x16 Multiply (High-Performance MDU) ........................................................................................ 28
2.3.2: 32x32 Multiply (High-Performance MDU) ........................................................................................ 29
2.3.3: Divide (High-Performance MDU) ..................................................................................................... 29
2.4: MDU Pipeline - Area-Efficient MDU........................................................................................................... 30
2.4.1: Multiply (Area-Efficient MDU)...........................................................................................................31
2.4.2: Multiply Accumulate (Area-Efficient MDU)....................................................................................... 32
2.4.3: Divide (Area-Efficient MDU)............................................................................................................. 32
2.5: Branch Delay.............................................................................................................................................32
2.6: Data Bypassing ......................................................................................................................................... 33
2.6.1: Load Delay....................................................................................................................................... 34
2.6.2: Move from HI/LO and CP0 Delay..................................................................................................... 35
2.7: Coprocessor 2 Instructions........................................................................................................................ 35
2.8: Interlock Handling...................................................................................................................................... 36
2.9: Slip Conditions........................................................................................................................................... 37
2.10: Instruction Interlocks................................................................................................................................ 38
2.11: Hazards...................................................................................................................................................39
2.11.1: Types of Hazards........................................................................................................................... 39
2.11.2: Instruction Listing...........................................................................................................................40
2.11.2.1: Instruction Encoding.............................................................................................................41
2.11.3: Eliminating Hazards.......................................................................................................................41
Chapter 3: Memory Management of the M14K™ Core......................................................................43
3.1: Introduction................................................................................................................................................43
3.1.1: Memory Management Unit (MMU) ..................................................................................................43
3.1.1.1: Fixed Mapping Translation (FMT) ..........................................................................................43
3.2: Modes of Operation...................................................................................................................................44
3.2.1: Virtual Memory Segments................................................................................................................44
3.2.1.1: Unmapped Segments............................................................................................................. 45
3.2.1.2: Mapped Segments ................................................................................................................. 46
3.2.2: User Mode........................................................................................................................................46
3.2.3: Kernel Mode..................................................................................................................................... 47
3.2.3.1: Kernel Mode, User Space (kuseg) .........................................................................................49
3.2.3.2: Kernel Mode, Kernel Space 0 (kseg0)....................................................................................49
3.2.3.3: Kernel Mode, Kernel Space 1 (kseg1)....................................................................................49
3.2.3.4: Kernel Mode, Kernel Space 2 (kseg2)....................................................................................49
3.2.3.5: Kernel Mode, Kernel Space 3 (kseg3)....................................................................................49
3.2.4: Debug Mode.....................................................................................................................................49
3.2.4.1: Conditions and Behavior for Access to drseg, EJTAG Registers........................................... 50
3.2.4.2: Conditions and Behavior for Access to dmseg, EJTAG Memory ........................................... 51
3.3: Fixed Mapping MMU ................................................................................................................................. 51
3.4: System Control Coprocessor..................................................................................................................... 53
Chapter 4: Exceptions and Interrupts in the M14K™ Core...............................................................55
4.1: Exception Conditions................................................................................................................................. 55
4.2: Exception Priority....................................................................................................................................... 56
4.3: Interrupts ................................................................................................................................................... 57
4.3.1: Interrupt Modes................................................................................................................................ 57
4.3.1.1: Interrupt Compatibility Mode................................................................................................... 58
4.3.1.2: Vectored Interrupt (VI) Mode..................................................................................................60
4.3.1.3: External Interrupt Controller Mode .........................................................................................63
4.3.2: Generation of Exception Vector Offsets for Vectored Interrupts...................................................... 66
4.3.3: MCU ASE Enhancement for Interrupt Handling...............................................................................67
4.3.3.1: Interrupt Delivery ....................................................................................................................67
4.3.3.2: Interrupt Latency Reduction ................................................................................................... 67
4.4: GPR Shadow Registers............................................................................................................................. 68
4.5: Exception Vector Locations.......................................................................................................................69
4.6: General Exception Processing..................................................................................................................71
4.7: Debug Exception Processing .................................................................................................................... 73
4.8: Exception Descriptions..............................................................................................................................74
4.8.1: Reset/SoftReset Exception..............................................................................................................74
4.8.2: Debug Single Step Exception .......................................................................................................... 75
4.8.3: Debug Interrupt Exception ............................................................................................................... 76
4.8.4: Non-Maskable Interrupt (NMI) Exception.........................................................................................76
4.8.5: Interrupt Exception........................................................................................................................... 77
4.8.6: Debug Instruction Break Exception..................................................................................................77
4.8.7: Address Error Exception — Instruction Fetch/Data Access.............................................................77
4.8.8: SRAM Parity Error Exception...........................................................................................................78
4.8.9: Bus Error Exception — Instruction Fetch or Data Access................................................................ 78
4.8.10: Protection Exception......................................................................................................................79
4.8.11: Debug Software Breakpoint Exception .......................................................................................... 79
4.8.12: Execution Exception — System Call..............................................................................................79
4.8.13: Execution Exception — Breakpoint................................................................................................80
4.8.14: Execution Exception — Reserved Instruction................................................................................ 80
4.8.15: Execution Exception — Coprocessor Unusable ............................................................................ 80
4.8.16: Execution Exception — CorExtend Unusable................................................................................81
4.8.17: Execution Exception — Coprocessor 2 Exception.........................................................................81
4.8.18: Execution Exception — Implementation-Specific 1 Exception.......................................................81
4.8.19: Execution Exception — Integer Overflow....................................................................................... 82
4.8.20: Execution Exception — Trap.......................................................................................................... 82
4.8.21: Debug Data Break Exception.........................................................................................................82
4.8.22: Complex Break Exception..............................................................................................................83
4.9: Exception Handling and Servicing Flowcharts .......................................................................................... 83
Chapter 5: CP0 Registers of the M14K™ Core ..................................................................................88
5.1: CP0 Register Summary............................................................................................................................. 88
5.2: CP0 Register Descriptions ........................................................................................................................ 90
5.2.1: UserLocal Register (CP0 Register 4, Select 2)................................................................................90
5.2.2: HWREna Register (CP0 Register 7, Select 0)................................................................................. 91
5.2.3: BadVAddr Register (CP0 Register 8, Select 0)................................................................................ 92
5.2.4: BadInstr Register (CP0 Register 8, Select 1)................................................................................... 92
5.2.5: BadInstrP Register (CP0 Register 8, Select 2)................................................................................93
5.2.6: Count Register (CP0 Register 9, Select 0) ...................................................................................... 94
5.2.7: Compare Register (CP0 Register 11, Select 0)...............................................................................94
5.2.8: Status Register (CP0 Register 12, Select 0)....................................................................................95
5.2.9: IntCtl Register (CP0 Register 12, Select 1)...................................................................................... 99
5.2.10: SRSCtl Register (CP0 Register 12, Select 2)..............................................................................103
5.2.11: SRSMap Register (CP0 Register 12, Select 3)............................................................................ 106
5.2.12: View_IPL Register (CP0 Register 12, Select 4)...........................................................................107
5.2.13: SRSMap2 Register (CP0 Register 12, Select 5).......................................................................... 107
5.2.14: Cause Register (CP0 Register 13, Select 0)................................................................................ 108
5.2.15: View_RIPL Register (CP0 Register 13, Select 4)........................................................................113
5.2.16: NestedExc (CP0 Register 13, Select 5)....................................................................................... 113
5.2.17: Exception Program Counter (CP0 Register 14, Select 0)............................................................ 114
5.2.18: NestedEPC (CP0 Register 14, Select 2)...................................................................................... 115
5.2.19: Processor Identification (CP0 Register 15, Select 0)................................................................... 116
5.2.20: EBase Register (CP0 Register 15, Select 1) ............................................................................... 117
5.2.21: CDMMBase Register (CP0 Register 15, Select 2)....................................................................... 118
5.2.22: Config Register (CP0 Register 16, Select 0)................................................................................ 119
5.2.23: Config1 Register (CP0 Register 16, Select 1).............................................................................. 121
5.2.24: Config2 Register (CP0 Register 16, Select 2).............................................................................. 122
5.2.25: Config3 Register (CP0 Register 16, Select 3).............................................................................. 123
5.2.26: Config4 Register (CP0 Register 16, Select 4).............................................................................. 126
5.2.27: Config5 Register (CP0 Register 16, Select 5).............................................................................. 126
5.2.28: Config7 Register (CP0 Register 16, Select 7).............................................................................. 127
5.2.29: Debug Register (CP0 Register 23, Select 0) ............................................................................... 127
5.2.30: Trace Control Register (CP0 Register 23, Select 1)....................................................................132
5.2.31: Trace Control2 Register (CP0 Register 23, Select 2)..................................................................134
5.2.32: User Trace Data1 Register (CP0 Register 23, Select 3)/User Trace Data2 Register (CP0 Register 24, Select 3)............. 135
5.2.33: TraceBPC Register (CP0 Register 23, Select 4) ......................................................................... 136
5.2.34: Debug2 Register (CP0 Register 23, Select 6) ............................................................................. 137
5.2.35: Debug Exception Program Counter Register (CP0 Register 24, Select 0).................................. 138
5.2.36: Performance Counter Register (CP0 Register 25, Select 0-3)..................................................... 139
5.2.37: ErrCtl Register (CP0 Register 26, Select 0).................................................................................144
5.2.38: CacheErr Register (CP0 Register 27, Select 0)...........................................................................144
5.2.39: ErrorEPC (CP0 Register 30, Select 0)......................................................................................... 145
5.2.40: DeSave Register (CP0 Register 31, Select 0).............................................................................146
5.2.41: KScratchn Registers (CP0 Register 31, Selects 2 to 3)...............................................................146
Chapter 6: Hardware and Software Initialization of the M14K™ Core...........................................149
6.1: Hardware-Initialized Processor State......................................................................................................149
6.1.1: Coprocessor 0 State ...................................................................................................................... 149
6.1.2: Bus State Machines.......................................................................................................................150
6.1.3: Static Configuration Inputs............................................................................................................. 150
6.1.4: Fetch Address................................................................................................................................ 150
6.2: Software Initialized Processor State........................................................................................................ 150
6.2.1: Register File................................................................................................................................... 150
6.2.2: Coprocessor 0 State ...................................................................................................................... 150
Chapter 7: Power Management of the M14K™ Core.......................................................................153
7.1: Register-Controlled Power Management ................................................................................................ 153
7.2: Instruction-Controlled Power Management.............................................................................................154
Chapter 8: EJTAG Debug Support in the M14K™ Core..................................................................155
8.1: Debug Control Register...........................................................................................................................155
8.2: Hardware Breakpoints.............................................................................................................................160
8.2.1: Data Breakpoints............................................................................................................................161
8.2.2: Complex Breakpoints..................................................................................................................... 161
8.2.3: Conditions for Matching Breakpoints ............................................................................................. 161
8.2.3.1: Conditions for Matching Instruction Breakpoints .................................................................. 161
8.2.3.2: Conditions for Matching Data Breakpoints ........................................................................... 163
8.2.4: Debug Exceptions from Breakpoints..............................................................................................165
8.2.4.1: Debug Exception by Instruction Breakpoint.......................................................................... 165
8.2.4.2: Debug Exception by Data Breakpoint................................................................................... 165
8.2.5: Breakpoint Used as Triggerpoint.................................................................................................... 166
8.2.6: Instruction Breakpoint Registers....................................................................................................166
8.2.6.1: Instruction Breakpoint Status (IBS) Register (0x1000)......................................................... 167
8.2.6.2: Instruction Breakpoint Address n (IBAn) Register (0x1100 + n * 0x100).............................. 167
8.2.6.3: Instruction Breakpoint Address Mask n (IBMn) Register (0x1108 + n*0x100) ..................... 168
8.2.6.4: Instruction Breakpoint ASID n (IBASIDn) Register (0x1110 + n*0x100) .............................. 168
8.2.6.5: Instruction Breakpoint Control n (IBCn) Register (0x1118 + n*0x100)................................. 168
8.2.6.6: Instruction Breakpoint Complex Control n (IBCCn) Register (0x1120 + n*0x100)...............170
8.2.6.7: Instruction Breakpoint Pass Counter n (IBPCn) Register (0x1128 + n*0x100) .................... 170
8.2.7: Data Breakpoint Registers.............................................................................................................171
8.2.7.1: Data Breakpoint Status (DBS) Register (0x2000) ................................................................ 172
8.2.7.2: Data Breakpoint Address n (DBAn) Register (0x2100 + 0x100 * n)..................................... 172
8.2.7.3: Data Breakpoint Address Mask n (DBMn) Register (0x2108 + 0x100 * n)...........................173
8.2.7.4: Data Breakpoint ASID n (DBASIDn) Register (0x2110 + 0x100 * n)....................................173
8.2.7.5: Data Breakpoint Control n (DBCn) Register (0x2118 + 0x100 * n) ...................................... 173
8.2.7.6: Data Breakpoint Value n (DBVn) Register (0x2120 + 0x100 * n)......................................... 175
8.2.7.7: Data Breakpoint Complex Control n (DBCCn) Register (0x2128 + n*0x100)....................... 176
8.2.7.8: Data Breakpoint Pass Counter n (DBPCn) Register (0x2130 + n*0x100)............................177
8.2.7.9: Data Value Match (DVM) Register (0x2ff0)..........................................................................177
8.2.8: Complex Breakpoint Registers....................................................................................................... 178
8.2.8.1: Complex Break and Trigger Control (CBTC) Register (0x8000) .......................................... 178
8.2.8.2: Priming Condition A (PrCndAI/Dn) Registers.......................................................................179
8.2.8.3: Stopwatch Timer Control (STCtl) Register (0x8900)............................................................181
8.2.8.4: Stopwatch Timer Count (STCnt) Register (0x8908)............................................................. 182
8.3: Complex Breakpoint Usage..................................................................................................................... 182
8.3.1: Checking for Presence of Complex Break Support........................................................................ 182
8.3.2: General Complex Break Behavior.................................................................................................. 183
8.3.3: Usage of Pass Counters................................................................................................................183
8.3.4: Usage of Tuple Breakpoints...........................................................................................................184
8.3.5: Usage of Priming Conditions.......................................................................................................... 184
8.3.6: Usage of Data Qualified Breakpoints............................................................................................. 185
8.3.7: Usage of Stopwatch Timers........................................................................................................... 185
8.4: Test Access Port (TAP)...........................................................................................................................186
8.4.1: EJTAG Internal and External Interfaces......................................................................................... 186
8.4.2: Test Access Port Operation...........................................................................................................187
8.4.2.1: Test-Logic-Reset State......................................................................................................... 188
8.4.2.2: Run-Test/Idle State............................................................................................................... 188
8.4.2.3: Select_DR_Scan State......................................................................................................... 188
8.4.2.4: Select_IR_Scan State .......................................................................................................... 188
8.4.2.5: Capture_DR State ................................................................................................................189
8.4.2.6: Shift_DR State...................................................................................................................... 189
8.4.2.7: Exit1_DR State.....................................................................................................................189
8.4.2.8: Pause_DR State................................................................................................................... 189
8.4.2.9: Exit2_DR State.....................................................................................................................189
8.4.2.10: Update_DR State ............................................................................................................... 189
8.4.2.11: Capture_IR State................................................................................................................ 190
8.4.2.12: Shift_IR State ..................................................................................................................... 190
8.4.2.13: Exit1_IR State..................................................................................................................... 190
8.4.2.14: Pause_IR State .................................................................................................................. 190
8.4.2.15: Exit2_IR State..................................................................................................................... 190
8.4.2.16: Update_IR State.................................................................................................................190
8.4.3: Test Access Port (TAP) Instructions..............................................................................................190
8.4.3.1: BYPASS Instruction.............................................................................................................. 191
8.4.3.2: IDCODE Instruction..............................................................................................................191
8.4.3.3: IMPCODE Instruction ...........................................................................................................191
8.4.3.4: ADDRESS Instruction........................................................................................................... 191
8.4.3.5: DATA Instruction .................................................................................................................. 192
8.4.3.6: CONTROL Instruction ..........................................................................................................192
8.4.3.7: ALL Instruction...................................................................................................................... 192
8.4.3.8: EJTAGBOOT Instruction ......................................................................................................192
8.4.3.9: NORMALBOOT Instruction ..................................................................................................192
8.4.3.10: FASTDATA Instruction .......................................................................................................193
8.4.3.11: PCsample Register (PCSAMPLE Instruction)....................................................................193
8.4.3.12: FDC Instruction................................................................................................................... 193
8.4.3.13: TCBCONTROLA Instruction............................................................................................... 193
8.4.3.14: TCBCONTROLB Instruction............................................................................................... 193
8.4.3.15: TCBDATA Instruction .........................................................................................................193
8.5: EJTAG TAP Registers............................................................................................................................. 193
8.5.1: Instruction Register........................................................................................................................193
8.5.2: Data Registers Overview ............................................................................................................... 194
8.5.2.1: Bypass Register ................................................................................................................... 194
8.5.2.2: Device Identification (ID) Register........................................................................................194
8.5.2.3: Implementation Register....................................................................................................... 195
8.5.2.4: EJTAG Control Register.......................................................................................................196
8.5.3: Processor Access Address Register..............................................................................................202
8.5.3.1: Processor Access Data Register.......................................................................................... 202
8.5.4: Fastdata Register (TAP Instruction FASTDATA)........................................................................... 203
8.6: TAP Processor Accesses........................................................................................................................204
8.6.1: Fetch/Load and Store from/to EJTAG Probe Through dmseg....................................................... 205
8.7: SecureDebug........................................................................................................................................... 206
8.7.1: Disabling EJTAG Debugging ......................................................................................................... 206
8.7.1.1: EJ_DisableProbeDebug Signal ............................................................................................206
8.7.1.2: Override for EjtagBrk and DINT Disable............................................................................... 207
8.7.2: EJTAG Features Unmodified by SecureDebug ............................................................................. 207
8.8: iFlowtrace™ Mechanism.........................................................................................................................207
8.8.1: A Simple Instruction-Only Tracing Scheme ................................................................................... 208
8.8.1.1: Trace Inputs.......................................................................................................................... 208
8.8.1.2: Normal Trace Mode Outputs ................................................................................................208
8.8.2: Special Trace Modes ..................................................................................................................... 209
8.8.2.1: Mode Descriptions................................................................................................................ 209
8.8.2.2: Special Trace Mode Outputs................................................................................................211
8.8.3: ITCB Overview............................................................................................................................... 212
8.8.4: ITCB iFlowtrace Interface............................................................................................................... 212
8.8.5: TCB Storage Representation......................................................................................................... 213
8.8.6: ITCB Register Interface for Software Configurability ..................................................................... 214
8.8.6.1: iFlowtrace Control/Status (IFCTL) Register (Offset 0x3fc0).................................................. 214
8.8.6.2: ITCBTW Register (Offset 0x3f80) ........................................................................................ 216
8.8.6.3: ITCBRDP Register (Offset 0x3f88)....................................................................................... 217
8.8.6.4: ITCBWRP Register (Offset 0x3f90)...................................................................................... 217
8.8.7: ITCB iFlowtrace Off-Chip Interface................................................................................................218
8.8.8: Breakpoint-Based Enabling of Tracing........................................................................................... 218
8.9: PC/Data Address Sampling..................................................................................................................... 219
8.9.1: PC Sampling in Wait State.............................................................................................................220
8.9.2: Data Address Sampling ................................................................................................................. 220
8.10: Fast Debug Channel.............................................................................................................................. 220
8.10.1: Common Device Memory Map..................................................................................................... 221
8.10.2: Fast Debug Channel Interrupt......................................................................................................221
8.10.3: M14K™ Core FDC Buffers........................................................................................................... 221
8.10.4: Sleep mode.................................................................................................................................. 223
8.10.5: FDC TAP Register ....................................................................................................................... 223
8.10.6: Fast Debug Channel Registers.................................................................................................... 224
8.10.6.1: FDC Access Control and Status (FDACSR) Register (Offset 0x0).....................................224
8.10.6.2: FDC Configuration (FDCFG) Register (Offset 0x8)............................................................ 225
8.10.6.3: FDC Status (FDSTAT) Register (Offset 0x10) ...................................................................226
8.10.6.4: FDC Receive (FDRX) Register (Offset 0x18)..................................................................... 227
8.10.6.5: FDC Transmit n (FDTXn) Registers (Offset 0x20 + 0x8*n)................................................227
8.11: cJTAG Interface..................................................................................................................................... 228
Chapter 9: Instruction Set Overview.................................................................................................230
9.1: CPU Instruction Formats.........................................................................................................................230
9.2: Load and Store Instructions..................................................................................................................... 231
9.2.1: Scheduling a Load Delay Slot........................................................................................................ 231
9.2.2: Defining Access Types................................................................................................................... 231
9.3: Computational Instructions......................................................................................................................232
9.3.1: Cycle Timing for Multiply and Divide Instructions........................................................................... 233
9.4: Jump and Branch Instructions.................................................................................................................233
9.4.1: Overview of Jump Instructions....................................................................................................... 233
9.4.2: Overview of Branch Instructions .................................................................................................... 233
9.5: Control Instructions.................................................................................................................................. 233
9.6: Coprocessor Instructions......................................................................................................................... 233
9.7: Enhancements to the MIPS Architecture................................................................................................. 233
9.7.1: CLO - Count Leading Ones............................................................................................................234
9.7.2: CLZ - Count Leading Zeros............................................................................................................ 234
9.7.3: MADD - Multiply and Add Word.....................................................................................................234
9.7.4: MADDU - Multiply and Add Unsigned Word .................................................................................. 234
9.7.5: MSUB - Multiply and Subtract Word .............................................................................................. 234
9.7.6: MSUBU - Multiply and Subtract Unsigned Word............................................................................ 235
9.7.7: MUL - Multiply Word....................................................................................................................... 235
9.7.8: SSNOP - Superscalar Inhibit NOP..................................................................................................235
9.8: MCU ASE Instructions............................................................................................................................. 235
9.8.1: ACLR..............................................................................................................................................235
9.8.2: ASET.............................................................................................................................................. 235
9.8.3: IRET............................................................................................................................................... 235
Chapter 10: M14K™ Processor Core Instructions ..........................................................................236
10.1: Understanding the Instruction Descriptions........................................................................................... 236
10.2: M14K™ Core Opcode Map...................................................................................................................236
10.3: MIPS32® Instruction Set for the M14K™ Core.....................................................................................239
Chapter 11: microMIPS™ Instruction Set Architecture ..................................................................267
11.1: Overview................................................................................................................................................ 267
11.1.1: MIPSr3™ Architecture ................................................................................................................. 267
11.1.2: Default ISA Mode......................................................................................................................... 268
11.1.3: Software Detection.......................................................................................................................268
11.1.4: Compliance and Subsetting.........................................................................................................268
11.1.5: Mode Switch.................................................................................................................................268
11.1.6: Branch and Jump Offsets.............................................................................................................269
11.1.7: Coprocessor Unusable Behavior ................................................................................................. 269
11.2: Instruction Formats................................................................................................................................ 269
11.2.1: Instruction Stream Organization and Endianness........................................................................272
11.3: microMIPS Re-encoded Instructions.....................................................................................................272
11.3.1: 16-Bit Category............................................................................................................................273
11.3.1.1: Frequent MIPS32 Instructions............................................................................................273
11.3.1.2: Frequent MIPS32 Instruction Sequences........................................................................... 275
11.3.1.3: Instruction-Specific Register Specifiers and Immediate Field Encodings........................... 277
11.3.2: 16-bit Instruction Register Set......................................................................................................278
11.3.3: 32-Bit Category............................................................................................................................280
11.3.3.1: New 32-bit instructions .......................................................................................................280
Appendix A: References ....................................................................................................................283
Appendix B: Revision History ...........................................................................................................285
List of Figures
Figure 1.1: M14K™ Processor Core Block Diagram ................................................................................................9
Figure 1.2: M14K™ Core Virtual Address Map ...................................................................................................... 13
Figure 1.3: Address Translation During SRAM Access with FMT Implementation ................................................ 14
Figure 1.4: Reference Design Block Diagram......................................................................................................... 17
Figure 1.5: FDC Overview.......................................................................................................................................20
Figure 1.6: cJTAG Support ..................................................................................................................................... 21
Figure 2.1: M14K™ Core Pipeline Stages with high-performance MDU ...............................................................23
Figure 2.2: M14K™ Core Pipeline Stages with area-efficient MDU .......................................................................23
Figure 2.3: MDU Pipeline Behavior During Multiply Operations ............................................................................28
Figure 2.4: MDU Pipeline Flow During a 32x16 Multiply Operation ....................................................................... 29
Figure 2.5: MDU Pipeline Flow During a 32x32 Multiply Operation ....................................................................... 29
Figure 2.6: High-Performance MDU Pipeline Flow During an 8-bit Divide (DIV) Operation ....................................30
Figure 2.7: High-Performance MDU Pipeline Flow During a 16-bit Divide (DIV) Operation ..................................30
Figure 2.8: High-Performance MDU Pipeline Flow During a 24-bit Divide (DIV) Operation ..................................30
Figure 2.9: High-Performance MDU Pipeline Flow During a 32-bit Divide (DIV) Operation ..................................30
Figure 2.10: M14K™ Area-Efficient MDU Pipeline Flow During a Multiply Operation ...........................................31
Figure 2.11: M14K™ Core Area-Efficient MDU Pipeline Flow During a Multiply Accumulate Operation ............... 32
Figure 2.12: M14K™ Core Area-Efficient MDU Pipeline Flow During a Divide (DIV) Operation ...........................32
Figure 2.13: IU Pipeline Branch Delay ................................................................................................................... 33
Figure 2.14: IU Pipeline Data Bypass ...................................................................................................................34
Figure 2.15: IU Pipeline M to E bypass .................................................................................................................. 34
Figure 2.16: IU Pipeline A to E Data bypass .......................................................................................................... 35
Figure 2.17: IU Pipeline Slip after a MFHI ..............................................................................................................35
Figure 2.18: Coprocessor 2 Interface Transactions ............................................................................................... 36
Figure 2.19: Instruction Cache Miss Slip ................................................................................................................38
Figure 3.1: Address Translation During SRAM Access ......................................................................................... 44
Figure 3.2: M14K™ processor core Virtual Memory Map ...................................................................................... 45
Figure 3.3: User Mode Virtual Address Space ....................................................................................................... 46
Figure 3.4: Kernel Mode Virtual Address Space ...................................................................................................48
Figure 3.5: Debug Mode Virtual Address Space .................................................................................................... 50
Figure 3.6: FMT Memory Map (ERL=0) in the M14K™ Processor Core ............................................................... 52
Figure 3.7: FMT Memory Map (ERL=1) in the M14K™ Processor Core ............................................................... 53
Figure 4.1: Interrupt Generation for Vectored Interrupt Mode ................................................................................ 62
Figure 4.2: Interrupt Generation for External Interrupt Controller Interrupt Mode .................................................. 65
Figure 4.3: General Exception Handler (HW) ........................................................................................................ 84
Figure 4.4: General Exception Servicing Guidelines (SW) .................................................................................... 85
Figure 4.5: Reset, Soft Reset and NMI Exception Handling and Servicing Guidelines ......................................... 86
Figure 5.1: UserLocal Register Format .................................................................................................................. 90
Figure 5.2: HWREna Register Format....................................................................................................................91
Figure 5.3: BadVAddr Register Format .................................................................................................................. 92
Figure 5.4: BadInstr Register Format...................................................................................................................... 93
Figure 5.5: BadInstrP Register Format ................................................................................................................... 93
Figure 5.6: Count Register Format .........................................................................................................................94
Figure 5.7: Compare Register Format ................................................................................................................... 95
Figure 5.8: Status Register Format......................................................................................................................... 95
Figure 5.9: IntCtl Register Format........................................................................................................................... 99
Figure 5.10: SRSCtl Register Format ................................................................................................................... 103
Figure 5.11: SRSMap Register Format................................................................................................................. 106
Figure 5.12: View_IPL Register Format................................................................................................................ 107
Figure 5.13: SRSMap Register Format.................................................................................................................108
Figure 5.14: Cause Register Format..................................................................................................................... 108
Figure 5.15: View_RIPL Register Format.............................................................................................................113
Figure 5.16: NestedExc Register Format..............................................................................................................114
Figure 5.17: EPC Register Format ....................................................................................................................... 115
Figure 5.18: NestedEPC Register Format............................................................................................................116
Figure 5.19: PRId Register Format ......................................................................................................................116
Figure 5.20: EBase Register Format.....................................................................................................................118
Figure 5.21: CDMMBase Register Format............................................................................................................ 118
Figure 5.22: Config Register Format — Select 0 .................................................................................................119
Figure 5.23: Config1 Register Format — Select 1 ...............................................................................................121
Figure 5.24: Config2 Register Format — Select 2 ...............................................................................................122
Figure 5.25: Config3 Register Format...................................................................................................................123
Figure 5.26: Config4 Register Format...................................................................................................................126
Figure 5.27: Config5 Register Format...................................................................................................................127
Figure 5.28: Config7 Register Format .................................................................................................................. 127
Figure 5.29: Debug Register Format ....................................................................................................................128
Figure 5.30: TraceControl Register Format ......................................................................................................... 132
Figure 5.31: TraceControl2 Register Format ....................................................................................................... 134
Figure 5.32: User Trace Data1/User Trace Data2 Register Format .................................................................... 136
Figure 5.33: Trace BPC Register Format .............................................................................................................136
Figure 5.34: Debug2 Register Format ..................................................................................................................137
Figure 5.35: DEPC Register Format ....................................................................................................................139
Figure 5.36: Performance Counter Control Register ............................................................................................140
Figure 5.37: Performance Counter Count Register ..............................................................................................143
Figure 5.38: ErrCtl Register Format .................................................................................................................... 144
Figure 5.39: CacheErr Register (Primary Caches) .............................................................................................. 144
Figure 5.40: ErrorEPC Register Format ............................................................................................................... 146
Figure 5.41: DeSave Register Format ................................................................................................................. 146
Figure 5.42: KScratchn Register Format .............................................................................................................. 147
Figure 8.1: DCR Register Format ......................................................................................................................... 156
Figure 8.2: IBS Register Format .......................................................................................................................... 167
Figure 8.3: IBAn Register Format ........................................................................................................................ 167
Figure 8.4: IBMn Register Format ........................................................................................................................ 168
Figure 8.5: IBASIDn Register Format .................................................................................................................. 168
Figure 8.6: IBCn Register Format ........................................................................................................................169
Figure 8.7: IBCCn Register Format ......................................................................................................................170
Figure 8.8: IBPCn Register Format ...................................................................................................................... 171
Figure 8.9: DBS Register Format ......................................................................................................................... 172
Figure 8.10: DBAn Register Format ..................................................................................................................... 172
Figure 8.11: DBMn Register Format ....................................................................................................................173
Figure 8.12: DBASIDn Register Format ............................................................................................................... 173
Figure 8.13: DBCn Register Format .....................................................................................................................173
Figure 8.14: DBVn Register Format ..................................................................................................................... 175
Figure 8.15: DBCCn Register Format .................................................................................................................. 176
Figure 8.16: DBPCn Register Format...................................................................................................................177
Figure 8.17: DVM Register Format ......................................................................................................................177
Figure 8.18: CBTC Register Format ..................................................................................................................... 178
Figure 8.19: PrCndA Register Format ..................................................................................................................179
Figure 8.20: STCtl Register Format .....................................................................................................................181
Figure 8.21: STCnt Register Format .................................................................................................................... 182
Figure 8.22: TAP Controller State Diagram ......................................................................................................... 188
Figure 8.23: Concatenation of the EJTAG Address, Data and Control Registers ................................................192
Figure 8.24: TDI to TDO Path When in Shift-DR State and FASTDATA Instruction is Selected ......................... 193
Figure 8.25: Device Identification Register Format .............................................................................................. 194
Figure 8.26: Implementation Register Format ......................................................................................................195
Figure 8.27: EJTAG Control Register Format ...................................................................................................... 196
Figure 8.28: Endian Formats for the PAD Register ..............................................................................................203
Figure 8.29: Fastdata Register Format.................................................................................................................203
Figure 8.30: Trace Logic Overview.......................................................................................................................212
Figure 8.31: Control/Status Register..................................................................................................................... 214
Figure 8.32: ITCBTW Register Format ................................................................................................................. 216
Figure 8.33: ITCBRDP Register Format ............................................................................................................... 217
Figure 8.34: ITCBWRP Register Format...............................................................................................................217
Figure 8.35: PCSAMPLE TAP Register Format (MIPS32) ................................................................................... 219
Figure 8.36: Fast Debug Channel Buffer Organization......................................................................................... 222
Figure 8.37: FDC TAP Register Format................................................................................................................ 223
Figure 8.38: FDC Access Control and Status Register......................................................................................... 224
Figure 8.39: FDC Configuration Register.............................................................................................................. 225
Figure 8.40: FDC Status Register......................................................................................................................... 226
Figure 8.41: FDC Receive Register......................................................................................................................227
Figure 8.42: FDC Transmit Register.....................................................................................................................227
Figure 8.43: cJTAG Interface................................................................................................................................ 228
Figure 9.1: Instruction Formats ............................................................................................................................231
Figure 11.1: 16-Bit Instruction Formats................................................................................................................. 270
Figure 11.2: 32-Bit Instruction Formats................................................................................................................. 271
Figure 11.3: Immediate Fields within 32-Bit Instructions.......................................................................................271
List of Tables
Table 2.1: MDU Instruction Latencies (High-Performance MDU)...........................................................................26
Table 2.2: MDU Instruction Repeat Rates (High-Performance MDU).....................................................................27
Table 2.3: M14K™ Core Instruction Latencies (Area-Efficient MDU).....................................................................31
Table 2.4: Pipeline Interlocks.................................................................................................................................. 37
Table 2.5: Instruction Interlocks..............................................................................................................................38
Table 2.6: Execution Hazards................................................................................................................................. 40
Table 2.7: Instruction Hazards................................................................................................................................40
Table 2.8: Hazard Instruction Listing ......................................................................................................................40
Table 3.1: User Mode Segments ............................................................................................................................ 46
Table 3.2: Kernel Mode Segments .........................................................................................................................48
Table 3.3: Physical Address and Cache Attributes for dseg, dmseg, and drseg Address Spaces......................... 50
Table 3.4: CPU Access to drseg Address Range...................................................................................................50
Table 3.5: CPU Access to dmseg Address Range ................................................................................................. 51
Table 3.6: Cacheability of Segments with Block Address Translation....................................................................51
Table 4.1: Priority of Exceptions ............................................................................................................................. 56
Table 4.2: Interrupt Modes...................................................................................................................................... 58
Table 4.3: Relative Interrupt Priority for Vectored Interrupt Mode...........................................................................61
Table 4.4: Exception Vector Offsets for Vectored Interrupts................................................................................... 66
Table 4.5: Exception Vector Base Addresses......................................................................................................... 70
Table 4.6: Exception Vector Offsets .......................................................................................................................70
Table 4.7: Exception Vectors..................................................................................................................................70
Table 4.8: Value Stored in EPC, ErrorEPC, or DEPC on an Exception.................................................................. 71
Table 4.9: Debug Exception Vector Addresses ...................................................................................................... 74
Table 4.10: Register States on an Interrupt Exception ............................................................................................. 77
Table 4.11: CP0 Register States on an Address Exception Error...........................................................................78
Table 4.12: CP0 Register States on a SRAM Parity Error Exception.....................................................................78
Table 4.13: Register States on a Coprocessor Unusable Exception......................................................................81
Table 5.1: CP0 Registers........................................................................................................................................ 88
Table 5.2: CP0 Register R/W Field Types..............................................................................................................90
Table 5.3: UserLocal Register Field Descriptions................................................................................................... 91
Table 5.4: HWREna Register Field Descriptions....................................................................................................91
Table 5.5: BadVAddr Register Field Description.....................................................................................................92
Table 5.6: BadInstr Register Field Descriptions...................................................................................................... 93
Table 5.7: BadInstrP Register Field Descriptions ................................................................................................... 94
Table 5.8: Count Register Field Description ........................................................................................................... 94
Table 5.9: Compare Register Field Description......................................................................................................95
Table 5.10: Status Register Field Descriptions....................................................................................................... 96
Table 5.11: IntCtl Register Field Descriptions....................................................................................................... 100
Table 5.12: SRSCtl Register Field Descriptions ................................................................................................... 103
Table 5.13: Sources for new SRSCtlCSS on an Exception or Interrupt................................................................. 106
Table 5.14: SRSMap Register Field Descriptions................................................................................................. 106
Table 5.15: View_IPL Register Field Descriptions................................................................................................ 107
Table 5.16: SRSMap Register Field Descriptions................................................................................................. 108
Table 5.17: Cause Register Field Descriptions..................................................................................................... 108
Table 5.18: Cause Register ExcCode Field.......................................................................................................... 112
Table 5.19: View_RIPL Register Field Descriptions ............................................................................................. 113
Table 5.20: NestedExc Register Field Descriptions.............................................................................................. 114
Table 5.21: EPC Register Field Description..........................................................................................................115
Table 5.22: NestedEPC Register Field Descriptions ............................................................................................ 116
Table 5.23: PRId Register Field Descriptions.......................................................................................................116
Table 5.24: EBase Register Field Descriptions.....................................................................................................118
Table 5.25: CDMMBase Register Field Descriptions............................................................................................ 119
Table 5.26: Config Register Field Descriptions..................................................................................................... 120
Table 5.27: Cache Coherency Attributes..............................................................................................................121
Table 5.28: Config1 Register Field Descriptions — Select 1................................................................................121
Table 5.29: Config2 Register Field Descriptions — Select 2................................................................................122
Table 5.30: Config3 Register Field Descriptions................................................................................................... 123
Table 5.31: Config4 Register Field Descriptions................................................................................................... 126
Table 5.32: Config5 Register Field Descriptions................................................................................................... 127
Table 5.33: Config7 Register Field Descriptions................................................................................................... 127
Table 5.34: Debug Register Field Descriptions.....................................................................................................128
Table 5.35: TraceControl Register Field Descriptions .......................................................................................... 132
Table 5.36: TraceControl2 Register Field Descriptions ........................................................................................ 134
Table 5.37: UserTraceData1/UserTraceData2 Register Field Descriptions ......................................................... 136
Table 5.38: TraceBPC Register Field Descriptions...............................................................................................136
Table 5.39: Debug2 Register Field Descriptions...................................................................................................138
Table 5.40: DEPC Register Formats.....................................................................................................................139
Table 5.41: Performance Counter Register Selects..............................................................................................139
Table 5.42: Performance Counter Control Register Field Descriptions................................................................140
Table 5.43: Performance Counter Events Sorted by Event Number .................................................................... 140
Table 5.44: Performance Counter Event Descriptions Sorted by Event Type......................................................142
Table 5.45: Performance Counter Count Register Field Descriptions..................................................................144
Table 5.46: ErrCtl Register Field Descriptions....................................................................................................... 144
Table 5.47: CacheErr Register Field Descriptions (Primary Caches)................................................................... 145
Table 5.48: ErrorEPC Register Field Description..................................................................................................146
Table 5.49: DeSave Register Field Description....................................................................................................146
Table 5.50: KScratchn Register Field Descriptions...............................................................................................147
Table 8.1: DCR Register Field Descriptions ......................................................................................................... 156
Table 8.2: Addresses for Instruction Breakpoint Registers................................................................................... 166
Table 8.3: IBS Register Field Descriptions ........................................................................................................... 167
Table 8.4: IBAn Register Field Descriptions ......................................................................................................... 167
Table 8.5: IBMn Register Field Descriptions......................................................................................................... 168
Table 8.6: IBASIDn Register Field Descriptions ................................................................................................... 168
Table 8.7: IBCn Register Field Descriptions.........................................................................................................169
Table 8.8: IBCCn Register Field Descriptions.......................................................................................................170
Table 8.9: IBPCn Register Field Descriptions....................................................................................................... 171
Table 8.10: Addresses for Data Breakpoint Registers.......................................................................................... 171
Table 8.11: DBS Register Field Descriptions........................................................................................................ 172
Table 8.12: DBAn Register Field Descriptions...................................................................................................... 172
Table 8.13: DBMn Register Field Descriptions.....................................................................................................173
Table 8.14: DBASIDn Register Field Descriptions................................................................................................ 173
Table 8.15: DBCn Register Field Descriptions......................................................................................................174
Table 8.16: DBVn Register Field Descriptions...................................................................................................... 175
Table 8.17: DBCCn Register Field Descriptions................................................................................................... 176
Table 8.18: DBPCn Register Field Descriptions...................................................................................................177
Table 8.19: DVM Register Field Descriptions.......................................................................................................177
Table 8.20: Addresses for Complex Breakpoint Registers ................................................................................... 178
Table 8.21: CBTC Register Field Descriptions ..................................................................................................... 178
Table 8.22: PrCndA Register Field Descriptions...................................................................................................180
Table 8.23: Priming Conditions and Register Values for 6I/2D Configuration ...................................................... 180
Table 8.24: Priming Conditions and Register Values for 8I/4D Configuration ...................................................... 180
Table 8.25: STCtl Register Field Descriptions......................................................................................................181
Table 8.26: STCnt Register Field Descriptions......................................................................................................182
Table 8.27: EJTAG Interface Pins ........................................................................................................................186
Table 8.28: Implemented EJTAG Instructions ...................................................................................................... 191
Table 8.29: Device Identification Register.............................................................................................................195
Table 8.30: Implementation Register Descriptions ............................................................................................... 195
Table 8.31: EJTAG Control Register Descriptions................................................................................................ 197
Table 8.32: Fastdata Register Field Description................................................................................................... 203
Table 8.33: Operation of the FASTDATA access ................................................................................................. 204
Table 8.34: EJ_DisableProbeDebug Signal Overview.......................................................................................... 207
Table 8.35: Data Bus Encoding ............................................................................................................................ 213
Table 8.36: Tag Bit Encoding................................................................................................................................ 213
Table 8.37: Control/Status Register Field Descriptions ........................................................................................ 215
Table 8.38: ITCBTW Register Field Descriptions ................................................................................................. 216
Table 8.39: ITCBRDP Register Field Descriptions ............................................................................................... 217
Table 8.40: ITCBWRP Register Field Descriptions...............................................................................................217
Table 8.41: drseg Registers that Enable/Disable Trace from Breakpoint-Based Triggers.................................... 218
Table 8.42: FDC TAP Register Field Descriptions................................................................................................ 223
Table 8.43: FDC Register Mapping.......................................................................................................................224
Table 8.44: FDC Access Control and Status Register Field Descriptions ............................................................ 224
Table 8.45: FDC Configuration Register Field Descriptions ................................................................................. 225
Table 8.46: FDC Status Register Field Descriptions.............................................................................................226
Table 8.47: FDC Receive Register Field Descriptions.......................................................................................... 227
Table 8.48: FDC Transmit Register Field Descriptions......................................................................................... 228
Table 8.49: FDTXn Address Decode....................................................................................................................228
Table 9.1: Byte Access Within a Word.................................................................................................................. 232
Table 10.1: Encoding of the Opcode Field............................................................................................................ 237
Table 10.2: Special Opcode Encoding of Function Field......................................................................................237
Table 10.3: Special2 Opcode Encoding of Function Field....................................................................................237
Table 10.4: Special3 Opcode Encoding of Function Field....................................................................................238
Table 10.5: RegImm Encoding of rt Field..............................................................................................................238
Table 10.6: COP2 Encoding of rs Field ................................................................................................................238
Table 10.7: COP2 Encoding of rt Field When rs=BC2..........................................................................................238
Table 10.8: COP0 Encoding of rs Field ................................................................................................................239
Table 10.9: COP0 Encoding of Function Field When rs=CO................................................................................ 239
Table 10.10: Instruction Set..................................................................................................................................239
Table 11.1: 16-Bit Re-encoding of Frequent MIPS32 Instructions........................................................................ 274
Table 11.2: 16-Bit Re-encoding of Frequent MIPS32 Instruction Sequences.......................................................275
Table 11.3: Instruction-Specific Register Specifiers and Immediate Field Values............................................... 277
Table 11.4: 16-Bit Instruction General-Purpose Registers - $2-$7, $16, $17.......................................................278
Table 11.5: SB16, SH16, SW16 Source Registers - $0, $2-$7, $17.....................................................................279
Table 11.6: 16-Bit Instruction Implicit General-Purpose Registers ....................................................................... 279
Table 11.7: 16-Bit Instruction Special-Purpose Registers.....................................................................................280
Table 11.8: 32-bit Instructions introduced within microMIPS................................................................................280
Chapter 1
Introduction to the MIPS32® M14K™ Processor Core
The MIPS32® M14K™ core from MIPS Technologies is a high-performance, low-power, 32-bit MIPS RISC processor core intended for custom system-on-silicon applications. The core is designed for semiconductor manufacturing companies, ASIC developers, and system OEMs who want to rapidly integrate their own custom logic and peripherals with a high-performance RISC processor. The M14K core is fully synthesizable to allow maximum flexibility; it is highly portable across processes and can easily be integrated into full system-on-silicon designs. This allows developers to focus their attention on end-user specific characteristics of their product.
The M14K core is especially well-suited for microcontrollers and for applications that combine real-time requirements with a need for high performance efficiency and security.
The M14K core implements the MIPS Architecture in a 5-stage pipeline. It includes support for the microMIPS™ ISA, an Instruction Set Architecture with optimized MIPS32 16-bit and 32-bit instructions that provides a significant reduction in code size with performance equivalent to MIPS32. The M14K core is a successor to the M4K® core and is designed from the same microarchitecture. It includes the Microcontroller Application-Specific Extension (MCU™ ASE), enhanced interrupt handling, lower interrupt latency, a memory protection unit (MPU), a reference design of an optimized interface for flash memory, and a built-in native AMBA®-3 AHB-Lite Bus Interface Unit (BIU), along with additional power-saving, debug, and profiling features.
The M14K core is cacheless; in lieu of caches, it includes a simple interface to SRAM-style devices. This interface may be configured for independent instruction and data devices or combined into a unified interface. The SRAM interface allows deterministic latency to memory, while still maintaining high performance.
The core includes one of two different Multiply/Divide Unit (MDU) implementations, selectable at build time, allowing the user to trade off performance and area for integer multiply and divide operations. The high-performance MDU option implements single-cycle multiply and multiply-accumulate (MAC) instructions that enable DSP algorithms to be performed efficiently. It allows 32-bit x 16-bit MAC instructions to be issued every cycle, while a 32-bit x 32-bit MAC instruction can be issued every other cycle. The area-efficient MDU option handles multiplies with a one-bit-per-clock iterative algorithm.
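To make the one-bit-per-clock idea concrete, the following C sketch models an iterative shift-and-add multiply in software. It is only an illustration: the type `mul_result` and the function `iterative_mul_u32` are hypothetical names, and the actual hardware algorithm of the area-efficient MDU is not described in this chapter, so the model should not be read as the core's implementation.

```c
#include <stdint.h>

/* Hypothetical software model of a one-bit-per-step multiplier.
 * This is NOT the M14K hardware design, only an illustration. */
typedef struct {
    uint64_t product;  /* full 64-bit result of a 32 x 32 multiply */
    unsigned steps;    /* number of iterations (one multiplier bit each) */
} mul_result;

static mul_result iterative_mul_u32(uint32_t a, uint32_t b)
{
    mul_result r = { 0, 0 };
    uint64_t addend = a;           /* multiplicand, shifted left each step */

    for (unsigned bit = 0; bit < 32; bit++) {
        if (b & 1u)                /* examine one multiplier bit per step */
            r.product += addend;
        addend <<= 1;
        b >>= 1;
        r.steps++;
    }
    return r;
}
```

In this model every multiply takes 32 steps regardless of the operand values, which mirrors the area-for-time trade described above: a single adder is reused on each step instead of a full multiplier array.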
The MMU consists of a simple Fixed Mapping Translation (FMT) mechanism, for applications that do not require the full capabilities of the Translation Lookaside Buffer (TLB)-based MMU available on other MIPS cores.
The basic Enhanced JTAG (EJTAG) features provide CPU run control with stop, single-stepping and re-start, and with software breakpoints using the SDBBP instruction. Additional EJTAG features such as instruction and data virtual address hardware breakpoints, complex hardware breakpoints, connection to an external EJTAG probe through the Test Access Port (TAP), and PC/Data tracing, may be included as an option.
1.1 Features
5-stage pipeline
32-bit Address and Data Paths
MIPS32® M14K™ Processor Core Family Software User’s Manual, Revision 02.04 4
MIPS32 Instruction Set Architecture
MIPS32 Enhanced Architecture Features
Vectored interrupts and support for external interrupt controller
Programmable exception vector base
Atomic interrupt enable/disable
GPR shadow registers (one, three, seven, or fifteen additional shadows can be optionally added to minimize latency for interrupt handlers)
Bit field manipulation instructions
microMIPS Instruction Set Architecture
microMIPS ISA is a build-time configurable option that reduces code size over MIPS32, while maintaining MIPS32 performance.
Combining both 16-bit and 32-bit opcodes, microMIPS supports all MIPS32 instructions (except branch-likely instructions) with new optimized encoding. Frequently used MIPS32 instructions are available as 16-bit instructions.
Added fifteen new 32-bit instructions and thirty-nine 16-bit instructions.
Stack pointer implicit in instruction.
MIPS32 assembly and ABI-compatible.
Supports MIPS architecture Modules and User-defined Instructions (UDIs).
MCU™ ASE
Increases the number of interrupt hardware inputs from 6 to 8 for Vectored Interrupt (VI) mode, and from 63 to 255 for External Interrupt Controller (EIC) mode.
Separate priority and vector generation. 16-bit vector address is provided.
Hardware assist combined with the use of Shadow Register Sets to reduce interrupt latency during the prologue and epilogue of an interrupt.
An interrupt return with automated interrupt epilogue handling instruction (IRET) improves interrupt latency.
Supports optional interrupt chaining.
Two memory-to-memory atomic read-modify-write instructions (ASET and ACLR) ease commonly used semaphore manipulation in microcontroller applications. Interrupts are automatically disabled during the operation to maintain coherency.
Memory Management Unit
Introduction to the MIPS32® M14K™ Processor Core
Simple Fixed Mapping Translation (FMT) mechanism
Memory Protection Unit
Optional feature that improves system security by restricting access, execution, and trace capabilities from untrusted code in predefined memory regions.
Simple SRAM-Style Interface
Cacheless operation enables deterministic response and reduces die-size
32-bit address and data; input byte-enables enable simple connection to narrower devices
Single or multi-cycle latencies
Configuration option for dual or unified instruction/data interfaces
Redirection mechanism on dual I/D interfaces permits D-side references to be handled by I-side
Transactions can be aborted
Reference Design
A typical SRAM reference design is provided.
An AHB-Lite BIU reference design is provided between the SRAM interface and AHB-Lite Bus.
An optimized interface for slow memory (Flash) access using prefetch buffer scheme is provided.
Parity Support
The ISRAM and DSRAM support optional parity detection.
Multiply/Divide Unit (area-efficient configuration)
32 clock latency on multiply
34 clock latency on multiply-accumulate
33-35 clock latency on divide (sign-dependent)
Multiply/Divide Unit (high-performance configuration)
Maximum issue rate of one 32x16 multiply per clock via on-chip 32x16 hardware multiplier array.
Maximum issue rate of one 32x32 multiply every other clock
Early-in iterative divide. Minimum 11 and maximum 34 clock latency (dividend (rs) sign extension-dependent)
CorExtend® User-Defined Instruction Set Extensions
Allows user to define and add instructions to the core at build time
Maintains full MIPS32 compatibility
Supported by industry-standard development tools
Single or multi-cycle instructions
Multi-Core Support
External lock indication enables multi-processor semaphores based on LL/SC instructions
External sync indication allows memory ordering
Debug support includes cross-core triggers
Coprocessor 2 interface
32-bit interface to an external coprocessor
Power Control
Minimum frequency: 0 MHz
Power-down mode (triggered by WAIT instruction)
Support for software-controlled clock divider
Support for extensive use of local gated clocks
EJTAG Debug/Profiling and iFlowtrace™ Mechanism
CPU control with start, stop, and single stepping
Virtual instruction and data address/value breakpoints
Hardware breakpoint supports both address match and address range triggering
Optional simple hardware breakpoints on virtual addresses; 8I/4D, 6I/2D, 4I/2D, 2I/1D breakpoints, or no breakpoints
Optional complex hardware breakpoints with 8I/4D, 6I/2D simple breakpoints
TAP controller is chainable for multi-CPU debug
Supports EJTAG (IEEE 1149.1) and compatible with cJTAG 2-wire (IEEE 1149.7) extension protocol
Cross-CPU breakpoint support
iFlowtrace support for real-time instruction PC and special events
PC and/or load/store address sampling for profiling
Performance Counters
Support for Fast Debug Channel (FDC)
SecureDebug
An optional feature that disables access via EJTAG in an untrusted environment
Testability
Full scan design achieves test coverage in excess of 99% (dependent on library and configuration options)
1.2 M14K™ Core Block Diagram
The M14K core contains both required and optional blocks, as shown in the block diagram in Figure 1.1. Required blocks are the lightly shaded areas of the block diagram and are always present in any core implementation. Optional blocks may be added to the base core, depending on the needs of a specific implementation. The required blocks are as follows:
Instruction Decode
Execution Unit
General-Purpose Registers (GPR)
Multiply/Divide Unit (MDU)
System Control Coprocessor (CP0)
Memory Management Unit (MMU)
I/D SRAM Interfaces
Power Management
Optional blocks include:
Configurable instruction decoder supporting three ISA modes: MIPS32-only, MIPS32 and microMIPS, or microMIPS-only
Memory Protection Unit (MPU)
Reference Design of I/D-SRAM, BIU, Slow Memory Interface
Coprocessor 2 interface
CorExtend® User-Defined Instruction (UDI) interface
Debug/Profiling with Enhanced JTAG (EJTAG) Controller, Breakpoints, Sampling, Performance counters, Fast Debug Channel, and iFlowtrace logic
1.2.1.2 General-Purpose Register (GPR) Shadow Registers
The M14K core contains thirty-two 32-bit general-purpose registers used for integer operations and address calculation. Optionally, one, three, seven or fifteen additional register file shadow sets (each containing thirty-two registers) can be added to minimize context switching overhead during interrupt/exception processing. The register file consists of two read ports and one write port and is fully bypassed to minimize operation latency in the pipeline.
1.2.1.3 Multiply/Divide Unit (MDU)
The M14K core includes a multiply/divide unit (MDU) that contains a separate, dedicated pipeline for integer multiply/divide operations. This pipeline operates in parallel with the integer unit (IU) pipeline and does not stall when the IU pipeline stalls. This allows the long-running MDU operations to be partially masked by system stalls and/or other integer unit instructions.
The MIPS architecture defines that the result of a multiply or divide operation be placed in a pair of HI and LO registers. Using the Move-From-HI (MFHI) and Move-From-LO (MFLO) instructions, these values can be transferred to the general-purpose register file.
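As a rough illustration of the HI/LO convention, the following C sketch models where the results land. It assumes the usual MIPS32 behavior (the 64-bit product is split with the upper half in HI and the lower half in LO; a divide leaves the quotient in LO and the remainder in HI). The type and function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical software model of the HI/LO register pair. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} hilo_t;

/* Model of a signed 32x32 multiply: the 64-bit product is split
 * across HI (upper 32 bits) and LO (lower 32 bits). */
static hilo_t model_mult(int32_t rs, int32_t rt)
{
    int64_t product = (int64_t)rs * (int64_t)rt;
    hilo_t r;
    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)((uint64_t)product & 0xFFFFFFFFu);
    return r;
}

/* Model of a signed divide: quotient in LO, remainder in HI.
 * The caller must ensure rt != 0 and avoid INT32_MIN / -1,
 * which this sketch does not handle. */
static hilo_t model_div(int32_t rs, int32_t rt)
{
    hilo_t r;
    r.lo = (uint32_t)(rs / rt);
    r.hi = (uint32_t)(rs % rt);
    return r;
}
```

In this model, MFHI and MFLO correspond to simply reading the `hi` and `lo` members of the returned structure.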
There are two configuration options for the MDU: 1) a higher performance 32x16 multiplier block; 2) an area-efficient iterative multiplier block. The selection of the MDU style allows the implementor to determine the appropriate performance and area trade-off for the application.
MDU with 32x16 High-Performance Multiplier
The high-performance MDU consists of a 32x16 Booth-recoded multiplier, a pair of result/accumulation registers (HI and LO), a divide state machine, and the necessary multiplexers and control logic. The first number shown (‘32’ of 32x16) represents the rs operand. The second number (‘16’ of 32x16) represents the rt operand. The M14K core only checks the value of the rt operand to determine how many times the operation must pass through the multiplier. The 16x16 and 32x16 operations pass through the multiplier once. A 32x32 operation passes through the multiplier twice.
The MDU supports execution of one 16x16 or 32x16 multiply or multiply-accumulate operation every clock cycle; 32x32 multiply operations can be issued every other clock cycle. Appropriate interlocks are implemented to stall the issuance of back-to-back 32x32 multiply operations. The multiply operand size is automatically determined by logic built into the MDU.
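The pass-count rule can be sketched in C as follows. This is a minimal model for the signed case only, and it assumes that "16-bit rt" means the value of rt is the sign extension of a 16-bit quantity; the manual text above does not spell out the exact detection logic, and the function name is hypothetical.

```c
#include <stdint.h>

/* Number of passes through the 32x16 multiplier array for a signed
 * multiply, judged only from the rt operand as described above.
 * Assumption: rt counts as 16 bits wide when it fits in int16_t. */
static int multiplier_passes(int32_t rt)
{
    return (rt >= INT16_MIN && rt <= INT16_MAX) ? 1 : 2;
}
```

Under this model, code that keeps the narrower operand in rt gets the one-per-clock issue rate, while a full-width rt falls back to one multiply every other clock.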
MDU with Area-Efficient Option
With the area-efficient option, multiply and divide operations are implemented with a simple 1-bit-per-clock iterative algorithm. Any attempt to issue a subsequent MDU instruction while a multiply/divide is still active causes an MDU pipeline stall until the operation is completed.
Regardless of the multiplier array implementation, divide operations are implemented with a simple 1-bit-per-clock iterative algorithm. An early-in detection checks the sign extension of the dividend (rs) operand. If rs is 8 bits wide, 23 iterations are skipped. For a 16-bit-wide rs, 15 iterations are skipped, and for a 24-bit-wide rs, 7 iterations are skipped. Any attempt to issue a subsequent MDU instruction while a divide is still active causes an IU pipeline stall until the divide operation has completed.
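The early-in rule can be expressed as a small lookup. The sketch below assumes that an "N-bit-wide" dividend is one whose value is the sign extension of an N-bit quantity; the helper names are hypothetical, and the function reports only the number of skipped iterations quoted above, not a full latency figure.

```c
#include <stdint.h>

/* Returns nonzero when v is representable as a signed value of the
 * given width (assumed meaning of "N bits wide" in the text above). */
static int fits_signed(int32_t v, int bits)
{
    int32_t lo = -(1 << (bits - 1));
    int32_t hi = (1 << (bits - 1)) - 1;
    return v >= lo && v <= hi;
}

/* Iterations skipped by early-in detection for a given dividend (rs). */
static int divide_iterations_skipped(int32_t rs)
{
    if (fits_signed(rs, 8))  return 23;
    if (fits_signed(rs, 16)) return 15;
    if (fits_signed(rs, 24)) return 7;
    return 0;
}
```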
1.2.1.4 System Control Coprocessor (CP0)
In the MIPS architecture, CP0 is responsible for the virtual-to-physical address translation, the exception control system, the processor’s diagnostics capability, the operating modes (kernel, user, and debug), and whether interrupts are enabled or disabled. Configuration information, such as presence of build-time options like microMIPS, CorExtend Module or Coprocessor 2 interface, is also available by accessing the CP0 registers.
Coprocessor 0 also contains the logic for identifying and managing exceptions. Exceptions can be caused by a variety of sources, including boundary cases in data, external events, or program errors.
Interrupt Handling
The M14K core includes support for eight hardware interrupt pins, two software interrupts, and a timer interrupt. These interrupts can be used in any of three interrupt modes, as defined by Release 2 of the MIPS32 Architecture:
Interrupt compatibility mode, which acts identically to that in an implementation of Release 1 of the Architecture.
Vectored Interrupt (VI) mode, which adds the ability to prioritize and vector interrupts to a handler dedicated to that interrupt, and to assign a GPR shadow set for use during interrupt processing. The presence of this mode is denoted by the VInt bit in the Config3 register. This mode is architecturally optional, but it is always present on the M14K core, so the VInt bit will always read as a 1 for the M14K core.
External Interrupt Controller (EIC) mode, which redefines the way in which interrupts are handled to provide full support for an external interrupt controller handling prioritization and vectoring of interrupts. The presence of this mode is denoted by the VEIC bit in the Config3 register. Again, this mode is architecturally optional. On the M14K core, the VEIC bit is set externally by the static input, SI_EICPresent, to allow system logic to indicate the presence of an external interrupt controller.
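As a hedged sketch of how software might interpret these bits, the C fragment below decodes a Config3 value that has already been read from CP0 and picks the interrupt mode in effect. The bit positions (VInt = bit 5, VEIC = bit 6) and the selection rule based on StatusBEV, CauseIV, and IntCtlVS are assumptions taken from the MIPS32 Privileged Resource Architecture rather than from this section; all identifiers are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed Config3 bit positions (not stated in this section). */
#define CONFIG3_VINT  (1u << 5)
#define CONFIG3_VEIC  (1u << 6)

typedef enum { IRQ_MODE_COMPAT, IRQ_MODE_VI, IRQ_MODE_EIC } irq_mode_t;

/* Select the interrupt mode from values software has read from CP0.
 * bev       : StatusBEV
 * cause_iv  : CauseIV
 * intctl_vs : IntCtlVS (vector spacing field, 0 = no spacing) */
static irq_mode_t irq_mode(uint32_t config3, bool bev, bool cause_iv,
                           uint32_t intctl_vs)
{
    if (bev || !cause_iv || intctl_vs == 0)
        return IRQ_MODE_COMPAT;          /* also the state after reset */
    if (config3 & CONFIG3_VEIC)
        return IRQ_MODE_EIC;
    if (config3 & CONFIG3_VINT)
        return IRQ_MODE_VI;
    return IRQ_MODE_COMPAT;
}
```

On real hardware the Config3 value would come from a CP0 read (register 16, select 3); the sketch takes it as a parameter so that the logic can be exercised on a host machine.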
The reset state of the processor is interrupt compatibility mode, such that a processor supporting Release 2 of the Architecture, the M14K core for example, is fully compatible with implementations of Release 1 of the Architecture.
VI or EIC interrupt modes can be combined with the optional shadow registers to specify which shadow set should be used on entry to a particular vector. The shadow registers further improve interrupt latency by avoiding the need to save context when invoking an interrupt handler.
In the M14K core, interrupt latency is reduced by:
Speculative interrupt vector prefetching during the pipeline flush.
Interrupt Automated Prologue (IAP) in hardware: Shadow Register Sets remove the need to save GPRs, and IAP removes the need to save specific Control Registers when handling an interrupt.
Interrupt Automated Epilogue (IAE) in hardware: Shadow Register Sets remove the need to restore GPRs, and IAE removes the need to restore specific Control Registers when returning from an interrupt.
Interrupt chaining: when interrupt chaining is enabled and another valid interrupt is pending while an interrupt is being serviced, there is no need to return from the current Interrupt Service Routine (ISR). Control of the processor can jump directly from the current ISR to the next ISR without IAE and IAP.
GPR Shadow Registers
The MIPS32 Architecture optionally removes the need to save and restore GPRs on entry to high-priority interrupts or exceptions, and provides specified processor modes with the same capability. This is done by introducing multiple copies of the GPRs, called shadow sets, and allowing privileged software to associate a shadow set with entry to kernel mode via an interrupt vector or exception. The normal GPRs are logically considered shadow set zero.
The number of GPR shadow sets is a build-time option. The M14K core allows 1 (the normal GPRs), 2, 4, 8, or 16 shadow sets. The highest number actually implemented is indicated by the SRSCtlHSS field. If this field is zero, only the normal GPRs are implemented.
Shadow sets are new copies of the GPRs that can be substituted for the normal GPRs on entry to kernel mode via an interrupt or exception. When a shadow set is bound to a kernel-mode entry condition, references to GPRs operate exactly as one would expect, but they are redirected to registers that are dedicated to that condition. Privileged software may need to reference all GPRs in the register file, even specific shadow registers that are not visible in the current mode, and the RDPGPR and WRPGPR instructions are used for this purpose. The CSS field of the SRSCtl register provides the number of the current shadow register set, and the PSS field of the SRSCtl register provides the number of the previous shadow register set that was current before the last exception or interrupt occurred.
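The HSS, CSS, and PSS fields can be pulled out of an SRSCtl value with a few shifts and masks. The sketch below assumes the field positions defined by the MIPS32 Privileged Resource Architecture (HSS in bits 29:26, PSS in bits 9:6, CSS in bits 3:0), which are not given in this section; the structure and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded view of the SRSCtl fields discussed above. */
typedef struct {
    unsigned hss;   /* highest shadow set number implemented */
    unsigned pss;   /* previous shadow set */
    unsigned css;   /* current shadow set */
} srsctl_fields_t;

/* Decode an SRSCtl value (assumed layout: HSS 29:26, PSS 9:6, CSS 3:0). */
static srsctl_fields_t srsctl_decode(uint32_t srsctl)
{
    srsctl_fields_t f;
    f.hss = (srsctl >> 26) & 0xFu;
    f.pss = (srsctl >> 6) & 0xFu;
    f.css = srsctl & 0xFu;
    return f;
}

/* Total number of register sets, counting the normal GPRs as set zero. */
static unsigned srsctl_num_sets(uint32_t srsctl)
{
    return srsctl_decode(srsctl).hss + 1u;
}
```

For example, an HSS value of 3 corresponds to the four-set build option (the normal GPRs plus three shadow sets).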
If the processor is operating in VI interrupt mode, binding of a vectored interrupt to a shadow set is done by writing to the SRSMap register. If the processor is operating in EIC interrupt mode, the binding of the interrupt to a specific shadow set is provided by the external interrupt controller and is configured in an implementation-dependent way. Binding of an exception or non-vectored interrupt to a shadow set is done by writing to the ESS field of the SRSCtl register. When an exception or interrupt occurs, the value of SRSCtlCSS is copied to SRSCtlPSS, and SRSCtlCSS is set to the value taken from the appropriate source. On an ERET, the value of SRSCtlPSS is copied back into SRSCtlCSS to restore the shadow set of the mode to which control returns.
Refer to Chapter 5, “CP0 Registers of the M14K™ Core” on page 88 for more information on the CP0 registers. Refer to Chapter 8, “EJTAG Debug Support in the M14K™ Core” on page 155 for more information on EJTAG debug registers.
1.2.1.5 Memory Management Unit (MMU)
Modes of Operation
The M14K core implements three modes of operation:
User mode is most often used for application programs.
Kernel mode is typically used for handling exceptions and operating-system kernel functions, including CP0 management and I/O device accesses.
Debug mode is used during system bring-up and software development. Refer to the EJTAG section for more information on debug mode.
Figure 1.2 shows the virtual address map of the MIPS Architecture.
Figure 1.2 M14K™ Core Virtual Address Map
0xFF400000 to 0xFFFFFFFF: kseg3, Fix Mapped
0xFF200000 to 0xFF3FFFFF: kseg3, Memory/EJTAG (see note 1)
0xE0000000 to 0xFF1FFFFF: kseg3, Fix Mapped
0xC0000000 to 0xDFFFFFFF: kseg2, Kernel Virtual Address Space, Fix Mapped, 512 MB
0xA0000000 to 0xBFFFFFFF: kseg1, Kernel Virtual Address Space, Unmapped, 512 MB, Uncached
0x80000000 to 0x9FFFFFFF: kseg0, Kernel Virtual Address Space, Unmapped, 512 MB
0x00000000 to 0x7FFFFFFF: kuseg, User Virtual Address Space, Mapped, 2048 MB
1. This space is mapped to memory in user or kernel mode, and by the EJTAG module in debug mode.
Memory Management Unit (MMU)
The M14K core contains a simple Fixed Mapping Translation (FMT) MMU that interfaces between the execution unit and the SRAM controller.
Fixed Mapping Translation (FMT)
An FMT is smaller and simpler than the full Translation Lookaside Buffer (TLB) style MMU found in other MIPS cores. Like a TLB, the FMT performs virtual-to-physical address translation and provides attributes for the different segments. Those segments that are unmapped in a TLB implementation (kseg0 and kseg1) are translated identically by the FMT.
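The segment boundaries from the virtual address map, together with the kseg0/kseg1 translation, can be sketched in C as follows. The sketch relies on the standard MIPS32 rule that kseg0 and kseg1 addresses map to physical memory by dropping the upper three address bits; it deliberately does not model how the FMT maps kuseg, kseg2, or kseg3, since that is not described in this section. All identifiers are hypothetical.

```c
#include <stdint.h>

typedef enum { SEG_KUSEG, SEG_KSEG0, SEG_KSEG1, SEG_KSEG2, SEG_KSEG3 } segment_t;

/* Classify a 32-bit virtual address by segment. */
static segment_t va_segment(uint32_t va)
{
    if (va < 0x80000000u) return SEG_KUSEG;
    if (va < 0xA0000000u) return SEG_KSEG0;
    if (va < 0xC0000000u) return SEG_KSEG1;
    if (va < 0xE0000000u) return SEG_KSEG2;
    return SEG_KSEG3;
}

/* Translate a kseg0 or kseg1 address to its physical address.
 * Returns 1 and writes *pa on success; returns 0 for segments
 * this sketch does not cover. */
static int unmapped_va_to_pa(uint32_t va, uint32_t *pa)
{
    segment_t s = va_segment(va);
    if (s == SEG_KSEG0 || s == SEG_KSEG1) {
        *pa = va & 0x1FFFFFFFu;   /* keep the low 29 bits */
        return 1;
    }
    return 0;
}
```

With this model, the kseg0 address 0x9FC00000 and the kseg1 address 0xBFC00000 both translate to physical address 0x1FC00000, which is why the two segments alias the same 512 MB of physical space.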
Figure 1.3 shows how the FMT is implemented in the M14K core.
Figure 1.3 Address Translation During SRAM Access with FMT Implementation
[Figure: the Instruction Address Calculator and the Data Address Calculator each pass a virtual address to the FMT; the resulting physical addresses go through the SRAM interface to the Inst SRAM and Data SRAM.]
1.2.1.6 SRAM Interface Controller
Instead of caches, the M14K core contains an interface to SRAM-style memories that can be tightly coupled to the core. This permits deterministic response time with less area than is typically required for caches. The SRAM interface includes separate uni-directional 32-bit buses for address, read data, and write data.
Dual or Unified Interfaces
The SRAM interface includes a build-time option to select either dual or unified instruction and data interfaces.
The dual interface enables independent connection to instruction and data devices. It generally yields the highest performance, because the pipeline can generate simultaneous I and D requests, which are then serviced in parallel.
For simpler or cost-sensitive systems, it is also possible to combine the I and D interfaces into a common interface that services both types of requests. If I and D requests occur simultaneously, priority is given to the D side.
Back-stalling
Typically, read and write transactions will complete in a single cycle. However, if multi-cycle latency is desired, the interface can be stalled to allow connection to slower devices.
Redirection
When the dual I/D interface is present, a mechanism exists to divert D-side references to the I-side, if desired. The mechanism can be explicitly invoked for any other D-side references, as well. When the DS_Redir signal is asserted, a D-side request is diverted to the I-side interface in the following cycle, and the D-side will be stalled until the transaction is completed.
Transaction Abort
The core may request a transaction (fetch/load/store/sync) to be aborted. This is particularly useful in case of interrupts. Because the core does not know whether transactions are re-startable, it cannot arbitrarily interrupt a request that has been initiated on the SRAM interface. However, cycles spent waiting for a multi-cycle transaction to complete can directly impact interrupt latency. In order to minimize this effect, the interface supports an abort mechanism. The core requests an abort whenever an interrupt is detected and a transaction is pending (abort of an instruction fetch may also be requested in other cases). The external system logic can choose to acknowledge or to ignore the abort request.
Connecting to Narrower Devices
The instruction and data read buses are always 32 bits in width. To facilitate connection to narrower memories, the SRAM interface protocol includes input byte-enables that can be used by system logic to signal validity as partial read data becomes available. The input byte-enables conditionally register the incoming read data bytes within the core, and thus eliminate the need for external registers to gather the entire 32 bits of data. External muxes are required to redirect the narrower data to the appropriate byte lanes.
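The byte-gathering behavior can be pictured with a small software model. The sketch below is not the hardware protocol: it assumes a 4-bit byte-enable in which bit n marks byte lane n (data bits 8n+7 down to 8n) as valid for the current beat, and it treats the external mux as already having placed each byte on its lane. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the read-data register inside the core. */
typedef struct {
    uint32_t data;    /* bytes captured so far */
    uint8_t  valid;   /* one bit per byte lane already captured */
} read_gather_t;

/* Capture the lanes flagged in byte_en from the 32-bit read bus.
 * Returns 1 once all four byte lanes have been captured. */
static int gather_beat(read_gather_t *g, uint32_t bus, uint8_t byte_en)
{
    for (int lane = 0; lane < 4; lane++) {
        if (byte_en & (1u << lane)) {
            uint32_t mask = 0xFFu << (8 * lane);
            g->data = (g->data & ~mask) | (bus & mask);
            g->valid |= (uint8_t)(1u << lane);
        }
    }
    return g->valid == 0x0F;
}
```

With an 8-bit memory, four beats with byte-enables 0x1, 0x2, 0x4, and 0x8 would assemble a full word; a 16-bit memory would need two beats with 0x3 and 0xC.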
Lock Mechanism
The SRAM interface includes a protocol to identify a locked sequence, and is used in conjunction with the LL/SC atomic read-modify-write semaphore instructions.
Sync Mechanism
The interface includes a protocol that externalizes the execution of the SYNC instruction. External logic might choose to use this information to enforce memory ordering between various elements in the system.
External Call Indication
The instruction fetch interface contains signals that indicate that the core is fetching the target of a subroutine call-type instruction such as JAL or BAL. At some point after a call, there will typically be a return to the original code sequence. If a system prefetches instructions, it can make use of this information to save instructions that were prefetched and are likely to be executed after the return.
1.2.1.7 Power Management
The M14K core offers a number of power management features, including low-power design, active power management, and power-down modes of operation. The core is a static design that supports slowing or halting the clocks, which reduces system power consumption during idle periods.
The M14K core provides two mechanisms for system-level low-power support:
Register-controlled power management
Instruction-controlled power management
Register-Controlled Power Management
The RP bit in the CP0 Status register provides a software mechanism for placing the system into a low-power state. The state of the RP bit is available externally via the SI_RP signal. The external agent then decides whether to place the device in a low-power mode, such as reducing the system clock frequency.
Three additional bits, StatusEXL, StatusERL, and DebugDM, support the power management function by allowing the user to change the power state if an exception or error occurs while the M14K core is in a low-power state. Depending on what type of exception is taken, one of these three bits will be asserted and reflected on the SI_EXL, SI_ERL, or EJ_DebugM outputs. The external agent can look at these signals and determine whether to leave the low-power state to service the exception.
The following four power-down signals are part of the system interface and change state as the corresponding bits in the CP0 registers are set or cleared:
The SI_RP signal represents the state of the RP bit (27) in the CP0 Status register.
The SI_EXL signal represents the state of the EXL bit (1) in the CP0 Status register.
The SI_ERL signal represents the state of the ERL bit (2) in the CP0 Status register.
The EJ_DebugM signal represents the state of the DM bit (30) in the CP0 Debug register.
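The mapping from register bits to the four signals can be sketched as below, using the bit positions listed above. The decision function at the end is only one possible policy for an external agent and is an assumption of this example, as are all identifier names.

```c
#include <stdint.h>
#include <stdbool.h>

#define STATUS_EXL (1u << 1)    /* EXL bit (1) of the CP0 Status register */
#define STATUS_ERL (1u << 2)    /* ERL bit (2) of the CP0 Status register */
#define STATUS_RP  (1u << 27)   /* RP bit (27) of the CP0 Status register */
#define DEBUG_DM   (1u << 30)   /* DM bit (30) of the CP0 Debug register  */

/* Software view of the four power-down signals. */
typedef struct {
    bool si_rp;
    bool si_exl;
    bool si_erl;
    bool ej_debugm;
} power_signals_t;

/* Derive the signal states from the Status and Debug register values. */
static power_signals_t power_signals(uint32_t status, uint32_t debug)
{
    power_signals_t s;
    s.si_rp     = (status & STATUS_RP) != 0;
    s.si_exl    = (status & STATUS_EXL) != 0;
    s.si_erl    = (status & STATUS_ERL) != 0;
    s.ej_debugm = (debug & DEBUG_DM) != 0;
    return s;
}

/* Hypothetical external-agent policy: remain in the low-power state
 * only while RP is set and no exception, error, or debug mode is active. */
static bool stay_in_low_power(power_signals_t s)
{
    return s.si_rp && !s.si_exl && !s.si_erl && !s.ej_debugm;
}
```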
Instruction-Controlled Power Management
The second mechanism for invoking power-down mode is by executing the WAIT instruction. When the WAIT instruction is executed, the internal clock is suspended; however, the internal timer and some of the input pins (SI_Int[5:0], SI_NMI, SI_Reset, and SI_ColdReset) continue to run. When the CPU is in instruction-controlled power management mode, any interrupt, NMI, or reset condition causes the CPU to exit this mode and resume normal operation.
The M14K core asserts the SI_Sleep signal, which is part of the system interface bus, whenever the WAIT instruction is executed. The assertion of SI_Sleep indicates that the clock has stopped and the M14K core is waiting for an interrupt.
Local clock gating
The majority of the power consumed by the M14K core is in the clock tree and clocking registers. The core has support for extensive use of local gated clocks. Power-conscious implementors can use these gated clocks to significantly reduce power consumption within the core.
Refer to Chapter 7, “Power Management of the M14K™ Core” on page 153 for more information on power management.
1.2.2 Optional Logic Blocks
The core can include the following optional logic blocks, as shown in the block diagram in Figure 1.1.
1.2.2.1 Reference Design
The M14K core contains a reference design that shows a typical usage of the core with:
Dual I-SRAM and D-SRAM interface with fast memories (i.e., SRAM) for instruction and data storage.
Optimized interface for slow memory (i.e., Flash memory) access by having a prefetch buffer and a wider Data Read bus (i.e., IS_RData[127:0]) to speed up I-Fetch performance.
AHB-lite bus interface to the system bus if the memory accesses are outside the memory map for the SRAM and Flash regions. AHB-Lite is a subset of the AHB bus protocol that supports a single bus master. The interface shares the same 32-bit Read and Write address bus and has two unidirectional 32-bit buses for Read and Write data.
The reference design is optional and can be modified by the user to better fit the SOC design requirement.
1.2.2.5 CorExtend® User-defined Instruction Extensions
An optional CorExtend User-defined Instruction (UDI) block enables the implementation of a small number of application-specific instructions that are tightly coupled to the core’s execution unit. The interface to the UDI block is external to the M14K core.
Such instructions may operate on a general-purpose register, immediate data specified by the instruction word, or local state stored within the UDI block. The destination may be a general-purpose register or local UDI state. The operation may complete in one cycle or multiple cycles, if desired.
Refer to Table 10.3 “Special2 Opcode Encoding of Function Field” for a specification of the opcode map available for user-defined instructions.
1.2.2.6 EJTAG Debug Support
The M14K core provides an optional Enhanced JTAG (EJTAG) interface for use in the software debug of application and kernel code. In addition to standard user and kernel modes of operation, the M14K core provides a Debug mode that is entered after a debug exception (derived from a hardware breakpoint, single-step exception, etc.) is taken and continues until a debug exception return (DERET) instruction is executed. During this time, the processor executes the debug exception-handler routine.
The EJTAG interface operates through the Test Access Port (TAP), a serial communication port used for transferring test data in and out of the M14K core. In addition to the standard JTAG instructions, special instructions defined in the EJTAG specification specify which registers are selected and how they are used.
Debug Registers
Four debug registers (DEBUG, DEBUG2, DEPC, and DESAVE) have been added to the MIPS Coprocessor 0 (CP0) register set. The DEBUG and DEBUG2 registers show the cause of the debug exception and are used for setting up single-step operations. The DEPC (Debug Exception Program Counter) register holds the address on which the debug exception was taken, which is used to resume program execution after the debug operation finishes. Finally, the DESAVE (Debug Exception Save) register enables the saving of general-purpose registers used during execution of the debug exception handler.
To exit debug mode, a Debug Exception Return (DERET) instruction is executed. When this instruction is executed, the system exits debug mode, allowing normal execution of application and system code to resume.
EJTAG Hardware Breakpoints
There are several types of simple hardware breakpoints defined in the EJTAG specification. These stop the normal operation of the CPU and force the system into debug mode. There are two types of simple hardware breakpoints implemented in the M14K core: Instruction breakpoints and Data breakpoints. Additionally, complex hardware breakpoints can be included, which allow detection of more intricate sequences of events.
The M14K core can be configured with the following breakpoint options:
No data, instruction, or complex breakpoints
One data and two instruction breakpoints, without complex breakpoints
Two data and four instruction breakpoints, without complex breakpoints
Two data and six instruction breakpoints, with or without complex breakpoints
Four data and eight instruction breakpoints, with or without complex breakpoints
Instruction breakpoints occur on instruction execution operations, and the breakpoint is set on the virtual address. A mask can be applied to the virtual address to set breakpoints on a binary range of instructions.
Data breakpoints occur on load/store transactions, and the breakpoint is set on a virtual address value, with the same single address or binary address range as the Instruction breakpoint. Data breakpoints can be set on a load, a store, or both. Data breakpoints can also be set to match on the operand value of the load/store operation, with byte-granularity masking. Finally, masks can be applied to both the virtual address and the load/store value.
In addition, the M14K core has a configurable feature to support data and instruction address-range triggered breakpoints, where a breakpoint can occur when a virtual address is either within or outside a pair of 32-bit addresses. Unlike the traditional address-mask control, address-range triggering is not restricted to a power-of-two binary boundary.
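The difference between mask-based and range-based matching can be illustrated with two small predicates. This is a sketch under assumptions: mask bits set to 1 are taken to mean "ignore this address bit", and the range bounds are treated as inclusive; neither detail is specified in this section. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Mask-based match: address bits whose mask bit is 1 are ignored
 * (assumed polarity), so only power-of-two aligned blocks can be covered. */
static bool bp_match_masked(uint32_t va, uint32_t bp_addr, uint32_t ignore_mask)
{
    return ((va ^ bp_addr) & ~ignore_mask) == 0;
}

/* Range-based match: triggers inside [lo, hi], or outside that range
 * when match_outside is set. The bounds are assumed inclusive. */
static bool bp_match_range(uint32_t va, uint32_t lo, uint32_t hi,
                           bool match_outside)
{
    bool inside = (va >= lo) && (va <= hi);
    return match_outside ? !inside : inside;
}
```

A range such as 0x80001003 to 0x80001009 cannot be expressed with a single address mask, which is the case the range feature is meant to cover.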
Complex breakpoints utilize the simple instruction and data breakpoints and break when combinations of events are seen. Complex break features include:
Pass Counters - Each time a matching condition is seen, a counter is decremented. The break or trigger will only be enabled when the counter has counted down to 0.
Tuples - A tuple is the pairing of an instruction and a data breakpoint. The tuple will match if both the virtual address of the load or store instruction matches the instruction breakpoint, and the data breakpoint of the resulting load or store address and optional data value matches.
Priming - This allows a breakpoint to be enabled only after other break conditions have been met. Also called sequential or armed triggering.
Qualified - This feature uses a data breakpoint to qualify when an instruction breakpoint can be taken. When a load matches the data address and the data value, the instruction break will be enabled. If a load matches the address, but has mis-matching data, the instruction break will be disabled.
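Of the features above, the pass counter is the simplest to model in software. The following sketch is one reading of the description: matches seen while the counter is nonzero only decrement it, and a match seen once the counter has reached 0 takes the break. The exact hardware behavior at the boundary is an assumption, and the names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of a breakpoint pass counter. */
typedef struct {
    uint32_t count;   /* remaining matches to pass over */
} pass_counter_t;

/* Call on every matching condition.
 * Returns true when the break (or trigger) should be taken. */
static bool pass_counter_on_match(pass_counter_t *pc)
{
    if (pc->count == 0)
        return true;      /* counter exhausted: break is enabled */
    pc->count--;          /* still counting down: pass over this match */
    return false;
}
```

Under this reading, loading the counter with 2 lets two matches pass and breaks on the third, which is handy for stopping on a later iteration of a loop.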
Performance Counters
Performance counters are used to accumulate occurrences of internal predefined events/cycles/conditions for program analysis, debug, or profiling. A few examples of event types are clock cycles, instructions executed, specific instruction types executed, loads, stores, exceptions, and cycles while the CPU is stalled. There are two 32-bit counters. Each can count one of the 64 internal predefined events selected by a corresponding control register. A counter overflow can be programmed to generate an interrupt, where the interrupt-handler software can maintain larger total counts.
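The idea of maintaining a larger total count in software can be sketched as follows. This is a simplified Python model, not driver code: it treats an overflow as a wrap-around at 2^32 and stands in for the interrupt handler with a plain increment; the class and attribute names are hypothetical.

```python
WRAP = 1 << 32  # a 32-bit counter wraps after 2**32 events (simplifying assumption)


class ExtendedCounter:
    """Models a 32-bit hardware counter extended by an overflow handler."""

    def __init__(self):
        self.hw = 0          # value of the 32-bit hardware counter
        self.overflows = 0   # maintained by the (hypothetical) overflow interrupt handler

    def count(self, events):
        total = self.hw + events
        self.overflows += total // WRAP   # one handler invocation per wrap
        self.hw = total % WRAP

    def total(self):
        return self.overflows * WRAP + self.hw


c = ExtendedCounter()
c.count(0xFFFFFFFF)
c.count(2)
```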
PC/Address Sampling
This sampling function is used for program profiling and hot-spots analysis. Instruction PC and/or Load/Store addresses can be sampled periodically. The result is scanned out through the EJTAG port. The Debug Control Register (DCR) is used to specify the sample period and the sample trigger.
Fast Debug Channel (FDC)
The M14K core includes an optional FDC as a mechanism for high bandwidth data transfer between a debug host/probe and a target. FDC provides a FIFO buffering scheme to transfer data serially, with low CPU overhead and minimized waiting time. The data transfer occurs in the background, and the target CPU can choose either to check the status of the transfer periodically or to be interrupted at the end of the transfer.
Figure 1.5 FDC Overview
[Figure: the probe connects through TDI, TDO and TMS to the TAP controller of the EJTAG TAP in the M14K core; the FDC block contains a 32-bit receive FIFO (probe to core) and a 32-bit transmit FIFO (core to probe).]
iFlowtrace™
The M14K core has an option for a simple trace mechanism named iFlowtrace. This mechanism only traces the instruction PC, not data addresses or values. This simplification allows the trace block to be smaller and the trace compression to be more efficient. iFlowtrace memory can be configured as off-chip, on-chip, or both.
iFlowtrace also offers special-event trace modes when normal tracing is disabled, namely:
Function Call/Return and Exception Tracing mode to trace the PC value of function calls and returns and/or exceptions and returns.
Breakpoint Match mode traces the breakpoint ID of a matching breakpoint and, for data breakpoints, the PC value of the instruction that caused it.
Filtered Data Tracing mode traces the ID of a matching data breakpoint, the load or store data value, access type and memory access size, and the low-order address bits of the memory access, which is useful when the data breakpoint is set up to match a binary range of addresses.
User Trace Messages. The user can instrument their code to add their own 32-bit value messages into the trace by writing to the Cop0 UTM register.
Delta Cycle mode works in combination with the above trace modes to provide a timestamp between stored events. It reports the number of cycles that have elapsed since the last message was generated and put into the trace.
Refer to Chapter 8, “EJTAG Debug Support in the M14K™ Core” on page 155 for more information on the EJTAG features.
cJTAG Support
The M14K core provides an external conversion block which converts the existing EJTAG (IEEE 1149.1) 4-wire interface at the M14K core to a cJTAG (IEEE 1149.7) 2-wire interface. cJTAG reduces the number of wires from 4 to 2 and enables the support of Star-2 scan topology in the system debug environment.
Figure 1.6 cJTAG Support
[Figure: the EJTAG TAP controller in the M14K core exposes the 4-wire EJTAG interface (TDI, TDO, TCK, TMS); the cJTAG conversion block converts it to the 2-wire cJTAG interface (TMSC, TCK).]
SecureDebug
SecureDebug improves security by disabling untrusted EJTAG debug access. An input signal is used to disable debug features, such as Probe Trap, Debug Interrupt Exception (EjtagBrk and DINT), EJTAGBOOT instruction, and PC Sampling.
Chapter 2
Pipeline of the M14K™ Core
The M14K processor core implements a 5-stage pipeline similar to the original M4K pipeline. The pipeline allows the processor to achieve high frequency while minimizing device complexity, reducing both cost and power consumption. This chapter contains the following sections:
Section 2.1 “Pipeline Stages”
Section 2.2 “Multiply/Divide Operations”
Section 2.3 “MDU Pipeline - High-performance MDU”
Section 2.4 “MDU Pipeline - Area-Efficient MDU”
Section 2.5 “Branch Delay”
Section 2.6 “Data Bypassing”
Section 2.7 “Coprocessor 2 Instructions”
Section 2.8 “Interlock Handling”
Section 2.9 “Slip Conditions”
Section 2.10 “Instruction Interlocks”
Section 2.11 “Hazards”
2.1 Pipeline Stages
The M14K core implements a 5-stage pipeline with a performance similar to the M4K pipeline. The pipeline allows the processor to achieve high frequency while minimizing device complexity, reducing both cost and power consumption.
The M14K core pipeline consists of five stages:
Instruction (I Stage)
Execution (E Stage)
Memory (M Stage)
Align (A Stage)
Writeback (W stage)
The M14K core implements a bypass mechanism that allows the result of an operation to be forwarded directly to the instruction that needs it without having to write the result to the register and then read it back.
The M14K soft core includes a build-time option that determines the type of multiply/divide unit (MDU) implemented. The MDU can be either a high-performance 32x16 multiplier array or an iterative, area-efficient array. The MDU choice has a significant effect on the MDU pipeline, and the latency of multiply/divide instructions executed on the core. Software can query the type of MDU present on a specific implementation of the core by querying the MDU bit in the Config register (CP0 register 16, select 0); see Chapter 5, “CP0 Registers of the M14K™ Core” on page 88 for more details.
Figure 2.1 shows the operations performed in each pipeline stage of the M14K processor core, when the high-performance multiplier is present.
Figure 2.1 M14K™ Core Pipeline Stages with high-performance MDU
[Figure: operations of the IU pipeline and the MDU pipeline across the I, E, M, A and W stages, including the M->E and A->E bypass paths and the 16x16 and 32x32 multiply paths.]
Legend:
ISRAM: SRAM read
I Dec: Instruction Decode
RegRd: Register file read
I-AC1, I-AC2: Instruction Address Calculation stage 1 and 2
ALU Op: Arithmetic Logic and Shift operations
D-AC: Data Address Calculation
DSRAM: DSRAM read
Align: Load data aligner
RegW: Register file write
MUL: MUL instruction
CPA: Carry Propagate Adder
Mult, Macc: Multiply and Multiply Accumulate instructions
Divide: Divide instructions
Sign Adjust: Last stage of Divide is a sign adjust
MDU Res Rdy: Result can be read from MDU
A graphical marker in the figure denotes one or more cycles.
Figure 2.2 shows the operations performed in each pipeline stage of the M14K processor core, when the area-efficient multiplier is present.
Figure 2.2 M14K™ Core Pipeline Stages with area-efficient MDU
[Figure: operations of the IU pipeline and the MDU pipeline across the I, E, M, A and W stages, including the M->E and A->E bypass paths.]
Legend:
I-SRAM: I-SRAM read
I Dec: Instruction Decode
RegRd: Register file read
I-AC1, I-AC2: Instruction Address Calculation stage 1 and 2
ALU Op: Arithmetic Logic and Shift operations
D-AC: Data Address Calculation
D-SRAM: D-SRAM read
Align: Load data aligner
RegW: Register file write
MUL: MUL instruction
Multiply, Divide: Multiply, Multiply Accumulate and Divide
MDU Res Rdy: Result can be read from MDU
A graphical marker in the figure denotes one or more cycles.
2.1.1 I Stage: Instruction Fetch
During the Instruction fetch stage:
An instruction is fetched from the instruction SRAM.
If both MIPS32 and microMIPS ISAs are supported, microMIPS instructions are converted to MIPS32-like instructions. If the MIPS32 ISA is not supported, 16-bit microMIPS instructions will be first recoded into 32-bit microMIPS equivalent instructions, and then decoded in native microMIPS ISA format.
2.1.2 E Stage: Execution
During the Execution stage:
Operands are fetched from the register file.
Operands from the M and A stage are bypassed to this stage.
The Arithmetic Logic Unit (ALU) begins the arithmetic or logical operation for register-to-register instructions.
The ALU calculates the data virtual address for load and store instructions and the MMU performs the fixed virtual-to-physical address translation.
The ALU determines whether the branch condition is true and calculates the virtual branch target address for branch instructions.
Instruction logic selects an instruction address and the MMU performs the fixed virtual-to-physical address translation.
All multiply and divide operations begin in this stage.
2.1.3 M Stage: Memory Fetch
During the Memory fetch stage:
The arithmetic ALU operation completes.
The data SRAM access is performed for load and store instructions.
A 16x16 or 32x16 multiply calculation completes (high-performance MDU option).
A 32x32 multiply operation stalls the MDU pipeline for one clock in the M stage (high-performance MDU option).
A multiply operation stalls the MDU pipeline for 31 clocks in the M stage (area-efficient MDU option).
A multiply-accumulate operation stalls the MDU pipeline for 33 clocks in the M stage (area-efficient MDU option).
A divide operation stalls the MDU pipeline for a maximum of 34 clocks in the M stage. Early-in sign extension detection on the dividend will skip 7, 15, or 23 stall clocks (only the divider in the fast MDU option supports early-in detection).
2.1.4 A Stage: Align
During the Align stage:
Load data is aligned to its word boundary.
A multiply/divide operation updates the HI/LO registers (area-efficient MDU option).
A multiply operation performs the carry-propagate-add. The actual register writeback is performed in the W stage (high-performance MDU option).
A MUL operation makes the result available for writeback. The actual register writeback is performed in the W stage.
EJTAG complex break conditions are evaluated.
2.1.5 W Stage: Writeback
During the Writeback stage:
For register-to-register or load instructions, the result is written back to the register file.
2.2 Multiply/Divide Operations
The M14K core implements the standard MIPS II™ multiply and divide instructions. Additionally, several new instructions were standardized in the MIPS32 architecture for enhanced performance.
The targeted multiply instruction, MUL, specifies that multiply results be placed in the general-purpose register file instead of the HI/LO register pair. By avoiding the explicit MFLO instruction, required when using the LO register, and by supporting multiple destination registers, the throughput of multiply-intensive operations is increased.
Four instructions, multiply-add (MADD), multiply-add-unsigned (MADDU), multiply-subtract (MSUB), and multiply-subtract-unsigned (MSUBU), are used to perform the multiply-accumulate and multiply-subtract operations. The MADD/MADDU instruction multiplies two numbers and then adds the product to the current contents of the HI and LO registers. Similarly, the MSUB/MSUBU instruction multiplies two operands and then subtracts the product from the HI and LO registers. The MADD/MADDU and MSUB/MSUBU operations are commonly used in DSP algorithms.
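The arithmetic effect of these four instructions on the HI/LO pair can be modeled with a short Python sketch. It treats HI:LO as a single 64-bit accumulator, as described above; the function and parameter names are illustrative only and the model says nothing about timing.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def madd(hi, lo, rs, rt, signed=True, subtract=False):
    """Return the new (hi, lo) after MADD/MADDU, or MSUB/MSUBU when subtract is True."""
    a = to_signed32(rs) if signed else rs & MASK32
    b = to_signed32(rt) if signed else rt & MASK32
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc - a * b if subtract else acc + a * b) & MASK64
    return acc >> 32, acc & MASK32


signed_result = madd(0, 12, 0xFFFFFFFF, 1)                  # MADD: 12 + (-1 * 1)
unsigned_result = madd(0, 12, 0xFFFFFFFF, 1, signed=False)  # MADDU: 12 + 0xFFFFFFFF
```

The same register contents give different results for MADD and MADDU because the operands are interpreted as signed or unsigned before the product is accumulated.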
All multiply operations (except the MUL instruction) write to the HI/LO register pair. All integer operations write to the general purpose registers (GPR). Because MDU operations write to different registers than integer operations, integer instructions that follow can execute before the MDU operation has completed. The MFLO and MFHI instructions are used to move data from the HI/LO register pair to the GPR file. If an MFLO or MFHI instruction is issued before the MDU operation completes, it will stall to wait for the data.
2.3 MDU Pipeline - High-performance MDU
The M14K processor core contains an autonomous multiply/divide unit (MDU) with a separate pipeline for multiply and divide operations. This pipeline operates in parallel with the integer unit (IU) pipeline and does not stall when the IU pipeline stalls. This allows multi-cycle MDU operations, such as a divide, to be partially masked by system stalls and/or other integer unit instructions.
The MDU consists of a 32x16 Booth-encoded multiplier array, a carry propagate adder, result/accumulation registers (HI and LO), multiply and divide state machines, and all necessary multiplexers and control logic. The first number shown (‘32’ of 32x16) represents the rs operand. The second number (‘16’ of 32x16) represents the rt operand. The core only checks the latter (rt) operand value to determine how many times the operation must pass through the multiplier array. The 16x16 and 32x16 operations pass through the multiplier array once. A 32x32 operation passes through the multiplier array twice.
The MDU supports execution of a 16x16 or 32x16 multiply operation every clock cycle; 32x32 multiply operations can be issued every other clock cycle. Appropriate interlocks are implemented to stall the issue of back-to-back 32x32 multiply operations. Multiply operand size is automatically determined by logic built into the MDU. Divide operations are implemented with a simple 1 bit per clock iterative algorithm with an early-in detection of sign extension on the dividend (rs). Any attempt to issue a subsequent MDU instruction while a divide is still active causes an IU pipeline stall until the divide operation is completed.
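The dependence of the number of array passes on the rt operand can be sketched in Python. The exact test the hardware applies is not given in this section; the sketch assumes rt counts as a 16-bit operand when it is a sign-extended 16-bit value (signed multiply) or a zero-extended 16-bit value (unsigned multiply). The function name is hypothetical.

```python
def multiplier_passes(rt, signed=True):
    """Return 1 for a 16x16/32x16 operation and 2 for a 32x32 operation (assumed rule)."""
    rt &= 0xFFFFFFFF
    if signed:
        upper = rt >> 15                  # bits 31..15 must all equal the sign bit
        fits16 = upper in (0, 0x1FFFF)
    else:
        upper = rt >> 16                  # bits 31..16 must all be zero
        fits16 = upper == 0
    return 1 if fits16 else 2


passes = [multiplier_passes(v) for v in (0x7FFF, 0x8000, 0xFFFF8000)]
```

Under this model the pass count is also the minimum number of clocks between issuing the operation and issuing the next multiply, matching the "every clock" and "every other clock" issue rates described above.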
Table 2.1 lists the latencies (number of cycles until a result is available) for multiply and divide instructions. The latencies are listed in terms of pipeline clocks. In this table ‘latency’ refers to the number of cycles necessary for the first instruction to produce the result needed by the second instruction.
Table 2.1 MDU Instruction Latencies (High-Performance MDU)
Size of Operand, 1st Instruction [1] | 1st Instruction | 2nd Instruction | Latency Clocks
16 bit | MULT/MULTU, MADD/MADDU, MSUB/MSUBU | MADD/MADDU, MSUB/MSUBU or MFHI/MFLO | 1
32 bit | MULT/MULTU, MADD/MADDU, or MSUB/MSUBU | MADD/MADDU, MSUB/MSUBU or MFHI/MFLO | 2
16 bit | MUL | Integer operation [2] | 2 [3]
32 bit | MUL | Integer operation [2] | 2 [3]
8 bit | DIVU | MFHI/MFLO | 9
16 bit | DIVU | MFHI/MFLO | 17
24 bit | DIVU | MFHI/MFLO | 25
32 bit | DIVU | MFHI/MFLO | 33
8 bit | DIV | MFHI/MFLO | 10 [4]
16 bit | DIV | MFHI/MFLO | 18 [4]
24 bit | DIV | MFHI/MFLO | 26 [4]
32 bit | DIV | MFHI/MFLO | 34 [4]
any | MFHI/MFLO | Integer operation [2] | 2
any | MTHI/MTLO | MADD/MADDU or MSUB/MSUBU | 1
[1] For multiply operations, this is the rt operand. For divide operations, this is the rs operand.
[2] Integer Operation refers to any integer instruction that uses the result of a previous MDU operation.
[3] This does not include the 1 or 2 IU pipeline stalls (16 bit or 32 bit) that the MUL operation causes irrespective of the following instruction. These stalls do not add to the latency of 2.
[4] If both operands are positive, then the Sign Adjust stage is bypassed. Latency is then the same as for DIVU.
In Table 2.1, a latency of one means that the first and second instructions can be issued back-to-back in the code, without the MDU causing any stalls in the IU pipeline. A latency of two means that if issued back-to-back, the IU pipeline will be stalled for one cycle. MUL operations are special, because the MDU needs to stall the IU pipeline in order to maintain its register file write slot. As a result, the MUL 16x16 or 32x16 operation will always force a one-cycle stall of the IU pipeline, and the MUL 32x32 will force a two-cycle stall. If the integer instruction immediately following the MUL operation uses its result, an additional stall is forced on the IU pipeline.
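The relationship between latency and stall cycles described above can be written as a small Python sketch. The text defines only the cases of latency one and two; extending the rule to larger latencies as "latency minus one" is an assumption, and the function names are hypothetical.

```python
def back_to_back_stalls(latency):
    """IU stall cycles when a dependent instruction is issued right after the first one."""
    return max(latency - 1, 0)


def mul_stalls(is_32x32, next_uses_result):
    """IU stall cycles caused by a MUL, following the rules stated for Table 2.1."""
    forced = 2 if is_32x32 else 1           # stall needed to keep the register file write slot
    extra = 1 if next_uses_result else 0    # additional stall if the next instruction needs the result
    return forced + extra


mul_cases = {
    "16-bit MUL, independent next": mul_stalls(False, False),
    "32-bit MUL, dependent next": mul_stalls(True, True),
}
```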
Table 2.2 lists the repeat rates (peak issue rate, in cycles, until the operation can be reissued) for multiply accumulate/subtract instructions. The repeat rates are listed in terms of pipeline clocks. In this table ‘repeat rate’ refers to the case where the first MDU instruction (in the table below) is issued back-to-back with the second instruction.
Table 2.2 MDU Instruction Repeat Rates (High-Performance MDU)
Operand Size of 1st Instruction | 1st Instruction | 2nd Instruction | Repeat Rate
16 bit | MULT/MULTU, MADD/MADDU, MSUB/MSUBU | MADD/MADDU, MSUB/MSUBU | 1
32 bit | MULT/MULTU, MADD/MADDU, MSUB/MSUBU | MADD/MADDU, MSUB/MSUBU | 2
Figure 2.3 below shows the pipeline flow for the following sequence:
1. 32x16 multiply (Mult1)
2. Add
3. 32x32 multiply (Mult2)
4. Subtract (Sub)
The 32x16 multiply operation requires one clock of each pipeline stage to complete. The 32x32 multiply operation requires two clocks in the M_MDU pipe-stage. The MDU pipeline is shown as the shaded areas of Figure 2.3 and always starts a computation in the final phase of the E stage. As shown in the figure, the M_MDU pipe-stage of the MDU pipeline occurs in parallel with the M stage of the IU pipeline, the A_MDU stage occurs in parallel with the A stage, and the W_MDU stage occurs in parallel with the W stage. In general this need not be the case. Following the 1st cycle of the M stages, the two pipelines need not be synchronized. This does not present a problem because results in the MDU pipeline are written to the HI and LO registers, while the integer pipeline results are written to the register file.
Figure 2.3 MDU Pipeline Behavior During Multiply Operations
[Figure: pipeline timing over cycles 1 to 8 for Mult1, Add, Mult2 and Sub, showing the I, E, M, A and W stages of the integer pipeline and the M_MDU, A_MDU and W_MDU stages of the MDU pipeline.]
The following is a cycle-by-cycle analysis of Figure 2.3.
1. The first 32x16 multiply operation (Mult1) is fetched from the instruction cache and enters the I stage.
2. An Add operation enters the I stage. The Mult1 operation enters the E stage. The integer and MDU pipelines share the I and E pipeline stages. At the end of the E stage in cycle 2, the MDU pipeline starts processing the multiply operation (Mult1).
3. In cycle 3, a 32x32 multiply operation (Mult2) enters the I stage and is fetched from the instruction cache. Since the Add operation has not yet reached the M stage by cycle 3, there is no activity in the M stage of the integer pipeline at this time.
4. In cycle 4, the Subtract instruction enters I stage. The second multiply operation (Mult2) enters the E stage. And the Add operation enters M stage of the integer pipe. Since the Mult1 multiply is a 32x16 operation, only one clock is required for the M_MDU stage, hence the Mult1 operation passes to the A_MDU stage of the MDU pipeline.
5. In cycle 5, the Subtract instruction enters E stage. The Mult2 multiply enters the M_MDU stage. The Add operation enters the A stage of the integer pipeline. The Mult1 operation completes and is written back in to the HI/LO register pair in the W_MDU stage.
6. Since a 32x32 multiply requires two passes through the multiplier, with each pass requiring one clock, the 32x32 Mult2 remains in the M_MDU stage in cycle 6. The Sub instruction enters M stage in the integer pipeline. The Add operation completes and is written to the register file in the W stage of the integer pipeline.
7. The Mult2 multiply operation progresses to the A_MDU stage, and the Sub instruction progresses to the A stage.
8. The Mult2 operation completes and is written to the HI/LO register pair in the W_MDU stage, while the Sub instruction writes to the register file in the W stage.
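The cycle-by-cycle analysis above can be reproduced with a small Python sketch that assigns a stage to each instruction in each cycle. It is a simplified model under stated assumptions: one instruction enters the I stage per cycle, no interlocks or stalls occur, and only the M_MDU stage can take more than one clock. The function name, the instruction kinds and the stage labels with an "_MDU" suffix are conventions of this sketch.

```python
def schedule(program):
    """Map each instruction name to {cycle: stage} for an in-order, stall-free issue."""
    m_cycles = {"int": 1, "mul32x16": 1, "mul32x32": 2}   # clocks spent in M or M_MDU
    timeline = {}
    for issue_cycle, (name, kind) in enumerate(program, start=1):
        suffix = "" if kind == "int" else "_MDU"
        stages = ["I", "E"] + ["M" + suffix] * m_cycles[kind] + ["A" + suffix, "W" + suffix]
        timeline[name] = {issue_cycle + i: stage for i, stage in enumerate(stages)}
    return timeline


timeline = schedule([
    ("Mult1", "mul32x16"),
    ("Add", "int"),
    ("Mult2", "mul32x32"),
    ("Sub", "int"),
])

# Print one row per instruction, one column per cycle.
for name, stages in timeline.items():
    row = [stages.get(cycle, "-") for cycle in range(1, 9)]
    print(f"{name:6s}", " ".join(f"{s:6s}" for s in row))
```

In this model Mult2 occupies M_MDU in cycles 5 and 6 and writes HI/LO in cycle 8, in the same cycle in which Sub writes the register file.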
2.3.1 32x16 Multiply (High-Performance MDU)
The 32x16 multiply operation begins in the last phase of the E stage, which is shared between the integer and MDU pipelines. In the latter phase of the E stage, the rs and rt operands arrive and the Booth-recoding function occurs at this time. The multiply calculation requires one clock and occurs in the M_MDU stage. In the A_MDU stage, the carry-propagate-add (CPA) function occurs and the operation is completed. The result is ready to be read from the HI/LO registers in the W_MDU stage.
Figure 2.4 shows a diagram of a 32x16 multiply operation.
Figure 2.4 MDU Pipeline Flow During a 32x16 Multiply Operation
[Figure: clocks 1 to 4; the operation passes through the E stage, the M_MDU stage, the A_MDU stage and the W_MDU stage, with the labels Booth, Array, CPA and Res.]
2.3.2 32x32 Multiply (High-Performance MDU)
The 32x32 multiply operation begins in the last phase of the E stage, which is shared between the integer and MDU pipelines. In the latter phase of the E stage, the rs and rt operands arrive and the Booth-recoding function occurs at this time. The multiply calculation requires two clocks and occurs in the M_MDU stage. In the A_MDU stage, the CPA function occurs and the operation is completed.
Figure 2.5 shows a diagram of a 32x32 multiply operation.
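Figure 2.5 MDU Pipeline Flow During a 32x32 Multiply Operation
[Figure: clocks 1 to 5; the operation passes through the E stage, two clocks in the M_MDU stage, the A_MDU stage and the W_MDU stage, with the labels Booth, Array, CPA and Res.]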
2.3.3 Divide (High-Performance MDU)
Divide operations are implemented using a simple non-restoring division algorithm. This algorithm works only for positive operands, hence the first cycle of the M_MDU stage is used to negate the rs operand (RS Adjust) if needed. Note that this cycle is spent even if the adjustment is not necessary. During the next maximum 32 cycles (3-34) an iterative add/subtract loop is executed. In cycle 3 an early-in detection is performed in parallel with the add/subtract. The adjusted rs operand is detected to be zero extended on the upper most 8, 16 or 24 bits. If this is the case the following 7, 15 or 23 cycles of the add/subtract iterations are skipped.
The remainder adjust (Rem Adjust) cycle is required if the remainder was negative. Note that this cycle is spent even if the remainder was positive. A sign adjust is performed on the quotient and/or remainder if necessary. The sign adjust stage is skipped if both operands are positive. In this case the Rem Adjust is moved to the A_MDU stage.
Figure 2.6, Figure 2.7, Figure 2.8 and Figure 2.9 show the latency for 8, 16, 24 and 32 bit divide operations, respectively. The repeat rate is either 11, 19, 27 or 35 cycles (one less if the sign adjust stage is skipped) as a second divide can be in the RS Adjust stage when the first divide is in the Reg WR stage.
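The divide latencies of Table 2.1 can be expressed as a function of the dividend size and the operand signs. The following Python sketch is a lookup model only; it assumes the dividend size class is determined by the magnitude of rs (8, 16, 24 or 32 bits) and treats zero as positive. The function names are hypothetical.

```python
def dividend_size_bits(magnitude):
    """Classify the dividend magnitude as an 8, 16, 24 or 32 bit operand."""
    for bits in (8, 16, 24):
        if magnitude < (1 << bits):
            return bits
    return 32


def hp_divide_latency(rs, rt, signed):
    """Latency in clocks from DIV/DIVU to MFHI/MFLO, following Table 2.1."""
    size = dividend_size_bits(abs(rs) if signed else rs)
    latency = size + 1                    # DIVU rows: 9, 17, 25, 33
    if signed and (rs < 0 or rt < 0):
        latency += 1                      # Sign Adjust not bypassed: 10, 18, 26, 34
    return latency


divu_latencies = [hp_divide_latency(v, 7, signed=False)
                  for v in (0x12, 0x1234, 0x123456, 0x12345678)]
```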
Figure 2.6 High-Performance MDU Pipeline Flow During a 8-bit Divide (DIV) Operation
Clock 1: E stage
Clock 2: M_MDU stage, RS Adjust
Clock 3: M_MDU stage, Add/Subtract and Early In
Clocks 4-10: M_MDU stage, Add/Subtract
Clock 11: M_MDU stage, Rem Adjust
Clock 12: A_MDU stage, Sign Adjust
Clock 13: W_MDU stage, MDU Res Rdy
Figure 2.7 High-Performance MDU Pipeline Flow During a 16-bit Divide (DIV) Operation
Clock 1: E stage
Clock 2: M_MDU stage, RS Adjust
Clock 3: M_MDU stage, Add/Subtract and Early In
Clocks 4-18: M_MDU stage, Add/Subtract
Clock 19: M_MDU stage, Rem Adjust
Clock 20: A_MDU stage, Sign Adjust
Clock 21: W_MDU stage, MDU Res Rdy
Figure 2.8 High-Performance MDU Pipeline Flow During a 24-bit Divide (DIV) Operation
Clock 1: E stage
Clock 2: M_MDU stage, RS Adjust
Clock 3: M_MDU stage, Add/Subtract and Early In
Clocks 4-26: M_MDU stage, Add/Subtract
Clock 27: M_MDU stage, Rem Adjust
Clock 28: A_MDU stage, Sign Adjust
Clock 29: W_MDU stage, MDU Res Rdy
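Figure 2.9 High-Performance MDU Pipeline Flow During a 32-bit Divide (DIV) Operation
Clock 1: E stage
Clock 2: M_MDU stage, RS Adjust
Clock 3: M_MDU stage, Add/Subtract and Early In
Clocks 4-34: M_MDU stage, Add/Subtract
Clock 35: M_MDU stage, Rem Adjust
Clock 36: A_MDU stage, Sign Adjust
Clock 37: W_MDU stage, MDU Res Rdy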
2.4 MDU Pipeline - Area-Efficient MDU
The area-efficient multiply/divide unit (MDU) is a separate autonomous block for multiply and divide operations. The MDU is not pipelined, but rather performs the computations iteratively in parallel with the integer unit (IU) pipeline and does not stall when the IU pipeline stalls. This allows the long-running MDU operations to be partially masked by system stalls and/or other integer unit instructions.
The MDU consists of one 32-bit adder, result-accumulate registers (HI and LO), a combined multiply/divide state machine, and all multiplexers and control logic. A simple 1-bit-per-clock recursive algorithm is used for both multiply and divide operations. Using Booth’s algorithm all multiply operations complete in 32 clocks. Two extra clocks are needed for multiply-accumulate. The non-restoring algorithm used for divide operations will not work with negative numbers. Adjustments before and after are thus required depending on the sign of the operands. All divide operations complete in 33 to 35 clocks.
Table 2.3 lists the latencies (number of cycles until a result is available) for multiply and divide instructions. The latencies are listed in terms of pipeline clocks. In this table ‘latency’ refers to the number of cycles necessary for the second instruction to use the results of the first.
Table 2.3 M14K™ Core Instruction Latencies (Area-Efficient MDU)
Operand Signs of 1st Instruction (Rs,Rt) | 1st Instruction | 2nd Instruction | Latency Clocks
any, any | MULT/MULTU | MADD/MADDU, MSUB/MSUBU, or MFHI/MFLO | 32
any, any | MADD/MADDU, MSUB/MSUBU | MADD/MADDU, MSUB/MSUBU, or MFHI/MFLO | 34
any, any | MUL | Integer operation [1] | 32
any, any | DIVU | MFHI/MFLO | 33
pos, pos | DIV | MFHI/MFLO | 33
any, neg | DIV | MFHI/MFLO | 34
neg, pos | DIV | MFHI/MFLO | 35
any, any | MFHI/MFLO | Integer operation [1] | 2
any, any | MTHI/MTLO | MADD/MADDU, MSUB/MSUBU | 1
[1] Integer Operation refers to any integer instruction that uses the result of a previous MDU operation.
2.4.1 Multiply (Area-Efficient MDU)
Multiply operations are executed using a simple iterative multiply algorithm. Using Booth’s approach, this algorithm works for both positive and negative operands. The operation uses 32 cycles in M_MDU stage to complete a multiplication. The register writeback to HI and LO are done in the A stage. For MUL operations, the register file writeback is done in the W_MDU stage.
Figure 2.10 shows the latency for a multiply operation. The repeat rate is 33 cycles as a second multiply can be in the first M_MDU stage when the first multiply is in A_MDU stage.
Figure 2.10 M14K™ Area-Efficient MDU Pipeline Flow During a Multiply Operation
Clock 1: E stage
Clocks 2-33: M_MDU stage, Add/sub-shift
Clock 34: A_MDU stage, HI/LO Write
Clock 35: W_MDU stage, Reg WR
2.4.2 Multiply Accumulate (Area-Efficient MDU)
Multiply-accumulate operations use the same multiply machine as used for multiply only. Two extra stages are needed to perform the addition/subtraction. The operation uses 34 cycles in M_MDU stage to complete the multiply-accumulate. The register writeback to HI and LO are done in the A stage.
Figure 2.11 shows the latency for a multiply-accumulate operation. The repeat rate is 35 cycles as a second multiply-accumulate can be in the E stage when the first multiply is in the last M_MDU stage.
Figure 2.11 M14K™ Core Area-Efficient MDU Pipeline Flow During a Multiply Accumulate Operation
Clock 1: E stage
Clocks 2-33: M_MDU stage, Add/Subtract Shift
Clock 34: M_MDU stage, Accumulate/LO
Clock 35: M_MDU stage, Accumulate/HI
Clock 36: A_MDU stage, HI/LO Write
Clock 37: W_MDU stage
2.4.3 Divide (Area-Efficient MDU)
Divide operations also implement a simple non-restoring algorithm. This algorithm works only for positive operands, hence the first cycle of the M_MDU stage is used to negate the rs operand (RS Adjust) if needed. Note that this cycle is executed even if negation is not needed. The next 32 cycles (3-34) execute an iterative add/subtract-shift function.
Two sign adjust (Sign Adjust 1/2) cycles are used to change the sign of one or both the quotient and the remainder. Note that one or both of these cycles are skipped if they are not needed. The rule is, if both operands were positive or if this is an unsigned division, both of the sign adjust cycles are skipped. If the rt operand was negative, one of the sign adjust cycles is skipped. If only the rs operand was negative, none of the sign adjust cycles are skipped. Register writeback to HI and LO are done in the A stage.
Figure 2.12 shows the pipeline flow for a divide operation. The repeat rate is either 34, 35 or 36 cycles (depending on how many sign adjust cycles are skipped) as a second divide can be in the E stage when the first divide is in the last M_MDU stage.
Figure 2.12 M14K™ Core Area-Efficient MDU Pipeline Flow During a Divide (DIV) Operation
Clock 1: E stage
Clock 2: M_MDU stage, RS Adjust
Clocks 3-34: M_MDU stage, Add/Subtract
Clock 35: M_MDU stage, Sign Adjust 1
Clock 36: M_MDU stage, Sign Adjust 2
Clock 37: A_MDU stage, HI/LO Write
Clock 38: W_MDU stage
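The divide latencies of Table 2.3 follow from the number of sign adjust cycles that are not skipped. A minimal Python sketch of that table, with a hypothetical function name and zero treated as positive, is:

```python
def ae_divide_latency(rs, rt, signed):
    """Latency in clocks from DIV/DIVU to MFHI/MFLO, following Table 2.3."""
    base = 33                 # DIVU, or DIV with both operands positive
    if not signed:
        return base
    if rt < 0:
        return base + 1       # (any, neg): one sign adjust cycle is used
    if rs < 0:
        return base + 2       # (neg, pos): both sign adjust cycles are used
    return base


ae_latencies = [ae_divide_latency(rs, rt, True)
                for rs, rt in ((5, 3), (5, -3), (-5, -3), (-5, 3))]
```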
2.5 Branch Delay
The pipeline has a branch delay of one cycle. The one-cycle branch delay is a result of the branch decision logic operating during the E pipeline stage. This allows the branch target address to be used in the I stage of the instruction following 2 cycles after the branch instruction. By executing the 1st instruction following the branch instruction sequentially before switching to the branch target, the intervening branch delay slot is utilized. This avoids bubbles being injected into the pipeline on branch instructions. Both the address calculation and the branch condition check are performed in the E stage.
The pipeline begins the fetch of either the branch path or the fall-through path in the cycle following the delay slot. After the branch decision is made, the processor continues with the fetch of either the branch path (for a taken branch) or the fall-through path (for the non-taken branch).
The branch delay means that the instruction immediately following a branch is always executed, regardless of the branch direction. If no useful instruction can be placed after the branch, then the compiler or assembler must insert a NOP instruction in the delay slot.
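The architectural effect of the delay slot can be shown with a small Python sketch that lists the order in which instructions execute. The program representation (a list of name, branch target index, taken flag) is invented for this illustration, and a branch placed in a delay slot is not modeled.

```python
def execution_order(program, max_steps=20):
    """Return instruction names in execution order, honoring one branch delay slot."""
    order = []
    pc = 0
    pending = None                      # branch target to apply after the delay slot
    while pc < len(program) and len(order) < max_steps:
        name, target, taken = program[pc]
        order.append(name)
        next_pc = pc + 1
        if pending is not None:         # the delay slot has just executed
            next_pc, pending = pending, None
        elif target is not None and taken:
            pending = target            # redirect only after the delay slot
        pc = next_pc
    return order


taken = execution_order([
    ("branch", 3, True), ("delay_slot", None, False),
    ("fall_through", None, False), ("target", None, False),
])
not_taken = execution_order([
    ("branch", 3, False), ("delay_slot", None, False),
    ("fall_through", None, False), ("target", None, False),
])
```

In both cases the delay slot instruction executes; only the instruction that follows it depends on the branch direction.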
Figure 2.13 illustrates the branch delay.
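Figure 2.13 IU Pipeline Branch Delay
[Figure: IU pipeline timing for a jump or branch, its delay slot instruction and the jump target instruction, each passing through the I, E, M, A and W stages at one cycle per stage, with the one-clock branch delay marked.]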
2.6 Data Bypassing
Most MIPS32 instructions use one or two register values as source operands. These operands are fetched from the register file in the first part of E stage. The ALU straddles the E-to-M boundary, and can present the result early in the M stage. However, the result is not written to the register file before the W stage. If no precautions were taken, it would take 3 cycles before the result was available for the following instructions. To avoid this, data bypassing is implemented.
Between the register file and the ALU a data-bypass multiplexer is placed on both operands (see figure below). This enables the M14K core to forward data from a preceding instruction whose target is a source register of a following instruction. An M to E bypass and an A to E bypass feed the bypass multiplexers. A W to E bypass is not needed, as the register file is capable of making an internal bypass of Rd write data directly to the Rs and Rt read ports.
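Figure 2.14 IU Pipeline Data Bypass
[Figure: across the I, E, M, A and W stages, the instruction supplies the Rs and Rt addresses to the register file; bypass multiplexers between the register file read ports (Rs Read, Rt Read) and the E-stage ALU select between register data, the M to E bypass and the A to E bypass; load data, HI/LO data or CP0 data enter after the M-stage ALU; results return to the register file through Rd Write.]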
Figure 2.15 shows the data bypass for an Add1 instruction followed by a Sub2 instruction and another Add3 instruction. The Sub2 instruction uses the output of the Add1 instruction as one of its operands, so the M to E bypass is used. The following Add3 instruction uses the results of both the Add1 and the Sub2 instructions. Since the Add1 data is now in the A stage, the A to E bypass is used for it, and the M to E bypass is used to forward the Sub2 data to the Add3 instruction.
Figure 2.15 IU Pipeline M to E bypass
(Pipeline diagram of three back-to-back instructions, each passing through the I, E, M, A, and W stages:
ADD1 R3=R2+R1
SUB2 R4=R3-R7, with R3 supplied by the M to E bypass
ADD3 R5=R3+R4, with R3 supplied by the A to E bypass and R4 by the M to E bypass)
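The bypass selection described for the Add1/Sub2/Add3 sequence can be sketched as a small model. This is only an illustration under simplifying assumptions: every instruction is a single-cycle ALU operation, no slips occur, and the function name `bypass_sources` and the tuple layout are hypothetical, not part of the M14K documentation.

```python
def bypass_sources(program):
    """Return, per instruction, where each source operand is read from.

    program: list of (dest_reg, (src_reg, ...)) tuples in program order.
    Assumes single-cycle ALU instructions and no pipeline slips.
    """
    choices = []
    for i, (_dest, sources) in enumerate(program):
        picked = {}
        for reg in sources:
            picked[reg] = "register file"
            # Distance 1: the producer is in M. Distance 2: the producer is in A.
            for distance, path in ((1, "M to E bypass"), (2, "A to E bypass")):
                j = i - distance
                if j >= 0 and program[j][0] == reg:
                    picked[reg] = path
                    break  # the nearest (youngest) producer wins
        choices.append(picked)
    return choices


# The three instructions of Figure 2.15.
figure_2_15 = [
    ("R3", ("R2", "R1")),  # ADD1 R3=R2+R1
    ("R4", ("R3", "R7")),  # SUB2 R4=R3-R7
    ("R5", ("R3", "R4")),  # ADD3 R5=R3+R4
]
for (dest, _), picked in zip(figure_2_15, bypass_sources(figure_2_15)):
    print(dest, picked)
```

A producer three or more instructions ahead is reported as "register file", which matches the statement that no W to E bypass is needed.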
2.6.1 Load Delay
Load delay refers to the fact that data fetched by a load instruction is not available in the integer pipeline until after the load aligner in the A stage. All instructions need their source operands in the E stage. An instruction immediately following a load will, if one of its source registers is the target of the load, cause an instruction interlock pipeline slip in the E stage (see 2.10 “Instruction Interlocks” on page 38). If an instruction following the load by 1 or 2 cycles uses the data from the load, the A to E bypass (see Figure 2.14) serves to reduce or avoid stall cycles. An instruction flow of this kind is shown in Figure 2.16.
Figure 2.16 IU Pipeline A to E Data bypass
(Pipeline diagram: a load instruction followed by a consumer of the load data. The consumer incurs the one-clock load delay, and the load data is bypassed from the A stage to the E stage.)
2.6.2 Move from HI/LO and CP0 Delay
As indicated in Figure 2.14, not only load data but also data moved from the HI or LO registers (MFHI/MFLO) and data moved from CP0 (MFC0) enters the IU pipeline in the A stage. That is, the data is not available in the integer pipeline until early in the A stage. The A to E bypass is available for this data. As with loads, however, an instruction immediately following one of these move instructions must be paused for one cycle if the target of the move is among its sources; this causes an interlock slip in the E stage (see 2.10 “Instruction Interlocks” on page 38). An interlock slip after an MFHI is illustrated in Figure 2.17.
Figure 2.17 IU Pipeline Slip after a MFHI
(Pipeline diagram: MFHI (to R3) followed by ADD (R4=R3+R5). The ADD slips for one cycle in the E stage and then receives R3 through the data bypass from A to E.)
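The one-cycle slip described for loads and for MFHI/MFLO/MFC0 can be counted with a minimal sketch. It assumes a simplified model in which only the immediately following instruction is checked, and the names `A_STAGE_PRODUCERS` and `count_e_stage_slips` are hypothetical helpers introduced here for illustration.

```python
# Instructions whose result only becomes available in the A stage
# (load data, HI/LO data, CP0 data), as named in the text above.
A_STAGE_PRODUCERS = {"LB", "LBU", "LH", "LHU", "LW", "MFHI", "MFLO", "MFC0"}


def count_e_stage_slips(program):
    """Count one-cycle E-stage slips in a list of (mnemonic, dest, sources).

    A slip is charged when an instruction reads the destination register
    of an A-stage producer that immediately precedes it.
    """
    slips = 0
    for previous, current in zip(program, program[1:]):
        mnemonic, dest, _ = previous
        _, _, sources = current
        if mnemonic in A_STAGE_PRODUCERS and dest in sources:
            slips += 1
    return slips


# The sequence of Figure 2.17: MFHI to R3, then ADD R4=R3+R5.
figure_2_17 = [("MFHI", "R3", ()), ("ADD", "R4", ("R3", "R5"))]
print(count_e_stage_slips(figure_2_17))
```

Placing one unrelated instruction between producer and consumer removes the slip in this model, because the A to E bypass then delivers the data in time.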
2.7 Coprocessor 2 Instructions
If a coprocessor 2 is attached to the M14K core, a number of transactions must take place on the CP2 Interface for each coprocessor 2 instruction. First, if the CU[2] bit in the CP0 Status register is not set, then no coprocessor 2 related instruction will start a transaction on the CP2 Interface; instead, a Coprocessor Unusable exception will be signaled. If the CU[2] bit is set, and a coprocessor 2 instruction is fetched, the following transactions will occur on the CP2 Interface:
1. The instruction is presented on the instruction bus in the E stage. Coprocessor 2 can decode it in the same cycle.
2. The instruction is validated by the core in the M stage. From this point, the core will accept control and data signals back from coprocessor 2. All control and data signals from coprocessor 2 are captured on input latches to the core.
3. If all the expected control and data signals were presented to the core in the previous M stage, the core will proceed to execute the A stage. If some return information is missing, the A stage will not advance, causing a slip in the I, E, and M stages (see 2.9 “Slip Conditions” on page 37). If the instruction sends data from the core to coprocessor 2, this data is sent in the A stage.
4. The instruction completion is signaled to coprocessor 2 in the W stage. Potential data from the coprocessor is written to the register file.
Figure 2.18 shows the timing relationship between the M14K core and coprocessor 2 for all coprocessor 2 instructions.
Figure 2.18 Coprocessor 2 Interface Transactions
(Timing diagram: for a COP2 instruction passing through the I, E, M, A, and W stages, the rows show the core internal operations, the core-to-CP2 information (instruction, validate, ToData, complete), the CP2-to-core information (control and FromData), and the CP2 internal operations.)
As can be seen in Figure 2.18, all control and data from the coprocessor must arrive in the M stage. If this is not the case, the A stage will start slipping in the following cycle and thus stall the I, E, M, and A stages; but if all expected control and data is available in the M stage, coprocessor 2 instructions can execute with no pipeline stalls. The only exception to this is the Branch on Coprocessor conditions (BC2) instruction. All branch instructions, including the regular BEQ, BNE, etc., must be resolved in the E stage. The M14K core does not have branch prediction logic, and thus the target address must be available before the end of the E stage. The BC2 instruction has to follow the same protocol as all other coprocessor 2 instructions on the CP2 Interface. All core interface operations belonging to the E, M, and A stages will have to occur in the E stage for BC2 instructions. This means that a BC2 instruction always slips for a minimum of 2 cycles in the E stage, and any delay in the return of branch information from coprocessor 2 will add to the number of slip cycles. All other coprocessor 2 instructions can operate without slips, provided that all control and data information from coprocessor 2 is transferred in the M stage.
2.8 Interlock Handling
Smooth pipeline flow is interrupted when cache misses occur or when data dependencies are detected. Interruptions handled entirely in hardware, such as cache misses, are referred to as interlocks. At each cycle, interlock conditions are checked for all active instructions.
Table 2.4 lists the types of pipeline interlocks for the M14K processor core.
Table 2.4 Pipeline Interlocks
Interlock Type | Sources | Slip Stage
I-side SRAM Stall | SRAM access not complete | E Stage
Instruction | Producer-consumer hazards | E/M Stage
Instruction | Hardware dependencies (MDU) | E Stage
Instruction | BC2 waiting for COP2 condition check | E Stage
D-side SRAM Stall | SRAM access not complete | A Stage
Coprocessor 2 completion slip | Coprocessor 2 control and/or data delay from coprocessor | A Stage
In general, MIPS processors support two types of hardware interlocks:
Stalls, which are resolved by halting the pipeline
Slips, which allow one part of the pipeline to advance while another part of the pipeline is held static
In the M14K processor core, all interlocks are handled as slips.
2.9 Slip Conditions
On every clock, internal logic determines whether each pipe stage is allowed to advance. These slip conditions propagate backwards down the pipe. For example, if the M stage does not advance, neither does the E or I stage.
Slipped instructions are retried on subsequent cycles until they issue. The back end of the pipeline advances normally during slips. This resolves the conflict when the slip was caused by a missing result. NOPs are inserted into the bubble in the pipeline. Figure 2.19 shows an instruction cache miss that causes a two-cycle slip.
Figure 2.19 Instruction Cache Miss Slip
Clock | 1 | 2 | 3 | 4 | 5 | 6
I stage | I3 | I4 | I5 | I5 | I5 | I6
E stage | I2 | I3 | I4 | I4 | I4 | I5
M stage | I1 | I2 | I3 | 0 | 0 | I4
A stage | I0 | I1 | I2 | I3 | 0 | 0
Events marked in the figure: (1) cache miss detected, (2) critical word received, (3) execute E-stage. A 0 denotes a bubble.
In the first clock cycle in Figure 2.19, the pipeline is full and the cache miss is detected. Instruction I0 is in the A stage, instruction I1 is in the M stage, instruction I2 is in the E stage, and instruction I3 is in the I stage. The cache miss occurs in clock 2 when the I4 instruction fetch is attempted. I4 advances to the E stage and waits for the instruction to be fetched from main memory. In this example, two clocks (3 and 4) are required to fetch the I4 instruction from memory. After the cache miss has been resolved in clock 4 and the instruction is bypassed to the E stage, the pipeline is restarted, causing I4 to finally execute its E-stage operations.
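The slip behavior walked through above can be reproduced with a short simulation. This is a sketch under stated assumptions: the stage occupancy at clock 1 and the two-cycle slip of I4 in the E stage are taken from the description of Figure 2.19, while the function name `simulate_e_stage_slip` and the use of `None` for a bubble are choices made here for illustration.

```python
def simulate_e_stage_slip(slipping_instr=4, slip_cycles=2, clocks=6):
    """Return one dict per clock giving the instruction number in each stage.

    None marks a bubble (drawn as 0 in Figure 2.19).
    """
    state = {"I": 3, "E": 2, "M": 1, "A": 0}  # clock 1: pipeline full
    next_fetch = 4
    remaining = slip_cycles
    history = [dict(state)]
    for _ in range(clocks - 1):
        if state["E"] == slipping_instr and remaining > 0:
            # Slip: I and E are held, the back end advances,
            # and a bubble is inserted into the M stage.
            state = {"I": state["I"], "E": state["E"], "M": None, "A": state["M"]}
            remaining -= 1
        else:
            # Normal advance: every instruction moves one stage.
            state = {"I": next_fetch, "E": state["I"], "M": state["E"], "A": state["M"]}
            next_fetch += 1
        history.append(dict(state))
    return history


for clock, stages in enumerate(simulate_e_stage_slip(), start=1):
    print(clock, stages)
```

In this model I4 occupies the E stage in clocks 3, 4, and 5 and reaches the M stage in clock 6, while the instructions ahead of it drain normally.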
2.10 Instruction Interlocks
Most instructions can be issued at a rate of one per clock cycle. In order to adhere to the sequential programming model, the issue of an instruction must sometimes be delayed to ensure that the result of a prior instruction is available. Table 2.5 details the instruction interactions that prevent an instruction from advancing in the processor pipeline.
Table 2.5 Instruction Interlocks
First Instruction | Second Instruction | Issue Delay (in Clock Cycles) | Slip Stage
LB/LBU/LH/LHU/LL/LW/LWL/LWR | Consumer of load data | 1 | E stage
MFC0 | Consumer of destination register | 1 | E stage
MULTx/MADDx/MSUBx (high-performance MDU), 16bx32b | MFLO/MFHI | 0 |
MULTx/MADDx/MSUBx (high-performance MDU), 32bx32b | MFLO/MFHI | 1 | M stage
MUL (high-performance MDU), 16bx32b | Consumer of target data | 2 | E stage
MUL (high-performance MDU), 32bx32b | Consumer of target data | 3 | E stage
MUL (high-performance MDU), 16bx32b | Non-Consumer of target data | 1 | E stage
MUL (high-performance MDU), 32bx32b | Non-Consumer of target data | 2 | E stage
MFHI/MFLO | Consumer of target data | 1 | E stage
MULTx/MADDx/MSUBx (high-performance MDU), 16bx32b | MULT/MUL/MADD/MSUB/MTHI/MTLO/DIV | 0 [1] | E stage
MULTx/MADDx/MSUBx (high-performance MDU), 32bx32b | MULT/MUL/MADD/MSUB/MTHI/MTLO/DIV | 1 [1] | E stage
DIV | MUL/MULTx/MADDx/MSUBx/MTHI/MTLO/MFHI/MFLO/DIV | Until DIV completes | E stage
MULT/MUL/MADD/MSUB/MTHI/MTLO/MFHI/MFLO/DIV (area-efficient MDU) | MULT/MUL/MADD/MSUB/MTHI/MTLO/MFHI/MFLO/DIV | Until 1st MDU op completes | E stage
MUL (area-efficient MDU) | Any Instruction | Until MUL completes | E stage
MFC0/MFC2/CFC2 | Consumer of target data | 1 | E stage
2.11 Hazards
In general, the M14K core ensures that instructions are executed following a fully sequential program model in which each instruction in the program sees the results of the previous instruction. There are some deviations to this model, referred to as hazards.
Prior to Release 2 of the MIPS32® Architecture, hazards (primarily CP0 hazards) were relegated to implementation-dependent cycle-based solutions, primarily based on the SSNOP instruction. This has been an insufficient and error-prone practice that must be addressed with a firm compact between hardware and software. As such, new instructions have been added to Release 2 of the architecture which act as explicit barriers that eliminate hazards. To the extent that it was possible to do so, the new instructions have been added in such a way that they are backward-compatible with existing MIPS processors.
2.11.1 Types of Hazards
With one exception, all hazards were eliminated in Release 1 of the Architecture for unprivileged software. The exception occurs when unprivileged software writes a new instruction sequence and then wishes to jump to it. Such an operation remained a hazard, and is addressed by the capabilities of Release 2.
In privileged software, there are two types of hazards: execution hazards and instruction hazards.
Execution hazards are those created by the execution of one instruction, and seen by the execution of another instruction. Table 2.6 lists execution hazards.
Table 2.6 Execution Hazards
Producer | Consumer | Hazard On | Spacing (Instructions)
MTC0 | Coprocessor instruction execution depends on the new value of Status (CU) | Status (CU) | 1
MTC0 | ERET | EPC, DEPC, ErrorEPC | 1
MTC0 | ERET | Status | 0
MTC0, EI, DI | Interrupted Instruction | Status (IE) | 1
MTC0 | Interrupted Instruction | Cause (IP) | 3
MTC0 | RDPGPR, WRPGPR | SRSCtl (PSS) | 1
MTC0 | Instruction not seeing a Timer Interrupt | Compare update that clears Timer Interrupt | 4 [1]
MTC0 | Instruction affected by change | Any other CP0 register | 2
[1] This is the minimum value. Actual value is system-dependent since it is a function of the sequential logic between the SI_TimerInt output and the external logic which feeds SI_TimerInt back into one of the SI_Int inputs, or a function of the method for handling SI_TimerInt in an external interrupt controller.
Instruction hazards are those created by the execution of one instruction, and seen by the instruction fetch of another instruction. Table 2.7 lists instruction hazards.
Table 2.7 Instruction Hazards
Producer | Consumer | Hazard On | Spacing (Instructions)
MTC0 | Instruction fetch seeing the new value (including a change to ERL followed by an instruction fetch from the useg segment) | Status |
Instruction stream write via redirected store | Instruction fetch seeing the new instruction stream | Cache entries | 3
2.11.2 Instruction Listing
Table 2.8 lists the instructions designed to eliminate hazards. See the document titled MIPS32® Architecture for Programmers Volume II: The MIPS32® Instruction Set (MD00086) for a more detailed description of these instructions.
Table 2.8 Hazard Instruction Listing
Mnemonic | Function
EHB | Clear execution hazard
JALR.HB | Clear both execution and instruction hazards
JR.HB | Clear both execution and instruction hazards
SYNCI | Synchronize caches after instruction stream write
2.11.2.1 Instruction Encoding
The EHB instruction is encoded using a variant of the NOP/SSNOP encoding. This encoding was chosen for compatibility with the Release 1 SSNOP instruction, such that existing software may be modified to be compatible with both Release 1 and Release 2 implementations. See the EHB instruction description for additional information.
The JALR.HB and JR.HB instructions are encoded using bit 10 of the hint field of the JALR and JR instructions. These encodings were chosen for compatibility with existing MIPS implementations, including many which pre-date the MIPS32 architecture. Because a pipeline flush clears hazards on most early implementations, the JALR.HB or JR.HB instructions can be included in existing software for backward and forward compatibility. See the JALR.HB and JR.HB instructions for additional information.
The SYNCI instruction is encoded using a new encoding of the REGIMM opcode. This encoding was chosen because it causes a Reserved Instruction exception on all Release 1 implementations. As such, kernel software running on processors that don’t implement Release 2 can emulate the function using the CACHE instruction.
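The encodings described above can be illustrated by building the instruction words directly. The field layouts used here (SLL with a zero destination for NOP, SSNOP, and EHB, and the JR/JALR function codes) come from the MIPS32 instruction set reference rather than from this section, and the helper names are hypothetical; treat this as a sketch to be checked against that reference.

```python
def sll(rd, rt, sa):
    """Encode SLL rd, rt, sa (SPECIAL opcode 0, function code 0)."""
    return (rt << 16) | (rd << 11) | (sa << 6)


NOP = sll(0, 0, 0)    # SLL r0, r0, 0
SSNOP = sll(0, 0, 1)  # SLL r0, r0, 1
EHB = sll(0, 0, 3)    # SLL r0, r0, 3, a variant of the NOP/SSNOP encoding

HB_BIT = 1 << 10      # bit 10 of the instruction word, within the hint field


def jr_hb(rs):
    """Encode JR.HB rs: the JR encoding (function 0x08) with bit 10 set."""
    return (rs << 21) | HB_BIT | 0x08


def jalr_hb(rd, rs):
    """Encode JALR.HB rd, rs: the JALR encoding (function 0x09) with bit 10 set."""
    return (rs << 21) | (rd << 11) | HB_BIT | 0x09


print(f"EHB        = 0x{EHB:08X}")
print(f"JR.HB ra   = 0x{jr_hb(31):08X}")
print(f"JALR.HB ra, t9 = 0x{jalr_hb(31, 25):08X}")
```

Because EHB differs from NOP only in the shift amount, a processor that does not recognize it executes a shift into register 0 with no architectural effect, which is the compatibility property the text relies on.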
2.11.3 Eliminating Hazards
The Spacing column shown in Table 2.6 and Table 2.7 indicates the number of unrelated instructions (such as NOPs or SSNOPs) that, prior to the capabilities of Release 2, would need to be placed between the producer and consumer of the hazard in order to ensure that the effects of the first instruction are seen by the second instruction. Entries in the table that are listed as 0 are traditional MIPS hazards which are not hazards on the M14K core.
With the hazard elimination instructions available in Release 2, the preferred method to eliminate hazards is to place one of the instructions listed in Table 2.8 between the producer and consumer of the hazard. Execution hazards can be removed by using the EHB, JALR.HB, or JR.HB instructions. Instruction hazards can be removed by using the JALR.HB or JR.HB instructions, in conjunction with the SYNCI instruction. Since the M14K core does not contain caches, the SYNCI instruction is not strictly necessary, but is still recommended to create portable code that can be run on other MIPS processors that may contain caches.
Chapter 3
Memory Management of the M14K™ Core
The M14K™ processor core includes a Memory Management Unit (MMU) that interfaces between the execution unit and the SRAM controller. The core implements a simple Fixed Mapping Translation (FMT) style MMU.
This chapter contains the following sections:
Section 3.1 “Introduction”
Section 3.2 “Modes of Operation”
Section 3.3 “Fixed Mapping MMU”
Section 3.4 “System Control Coprocessor”
3.1 Introduction
The MMU in an M14K processor core translates a virtual address to a physical address before the request is sent to the SRAM interface for an external memory reference.
In the M14K processor core, the MMU is based on a simple algorithm to translate virtual addresses to physical addresses via a Fixed Mapping Translation (FMT) mechanism. These translations are different for various regions of the virtual address space (useg/kuseg, kseg0, kseg1, kseg2/3).
3.1.1 Memory Management Unit (MMU)
The M14K core contains a simple Fixed Mapping Translation (FMT) MMU that interfaces between the execution unit and the SRAM controller.
3.1.1.1 Fixed Mapping Translation (FMT)
An FMT is smaller and simpler than the full Translation Lookaside Buffer (TLB) style MMU found in other MIPS cores. Like a TLB, the FMT performs virtual-to-physical address translation and provides attributes for the different segments. Those segments that are unmapped in a TLB implementation (kseg0 and kseg1) are translated identically by the FMT.
Figure 3.1 shows how the memory management unit interacts with the SRAM access in the M14K core.
Figure 3.1 Address Translation During SRAM Access
(Block diagram: the Instruction Address Calculator and the Data Address Calculator each send a virtual address to the FMT, which supplies the physical address to the SRAM interface for the instruction SRAM and the data SRAM.)
3.2 Modes of Operation
The M14K core implements three modes of operation:
User mode is most often used for applications programs.
Kernel mode is typically used for handling exceptions and operating-system kernel functions, including CP0 management and I/O device accesses.
Debug mode is used during system bring-up and software development. Refer to the EJTAG section for more information on debug mode.
User mode is most often used for application programs. Kernel mode is typically used for handling exceptions and privileged operating system functions, including CP0 management and I/O device accesses. Debug mode is used for software debugging and most likely occurs within a software development tool.
The address translation performed by the MMU depends on the mode in which the processor is operating.
3.2.1 Virtual Memory Segments
The virtual memory segments differ depending on the mode of operation. Figure 3.2 shows the segmentation for the 4 GByte (2^32 bytes) virtual memory space addressed by a 32-bit virtual address, for the three modes of operation.
The core enters Kernel mode both at reset and when an exception is recognized. While in Kernel mode, software has access to the entire address space, as well as all CP0 registers. User mode accesses are limited to a subset of the virtual address space (0x0000_0000 to 0x7FFF_FFFF) and can be inhibited from accessing CP0 functions. In User mode, virtual addresses 0x8000_0000 to 0xFFFF_FFFF are invalid and cause an exception if accessed.
Debug mode is entered on a debug exception. While in Debug mode, the debug software has access to the same address space and CP0 registers as for Kernel mode. In addition, while in Debug mode the core has access to the debug segment dseg. This area overlays part of the kernel segment kseg3. dseg access in Debug mode can be turned on or off, allowing full access to the entire kseg3 in Debug mode, if so desired.
Figure 3.2 M14K™ processor core Virtual Memory Map
Virtual Address | User Mode | Kernel Mode | Debug Mode
0xFF40_0000 - 0xFFFF_FFFF | | kseg3 | kseg3
0xFF20_0000 - 0xFF3F_FFFF | | kseg3 | dseg
0xE000_0000 - 0xFF1F_FFFF | | kseg3 | kseg3
0xC000_0000 - 0xDFFF_FFFF | | kseg2 | kseg2
0xA000_0000 - 0xBFFF_FFFF | | kseg1 | kseg1
0x8000_0000 - 0x9FFF_FFFF | | kseg0 | kseg0
0x0000_0000 - 0x7FFF_FFFF | useg | kuseg | kuseg
Each of the segments shown in Figure 3.2 is either mapped or unmapped. The following two sub-sections explain the distinction. Then sections 3.2.2 “User Mode”, 3.2.3 “Kernel Mode” and 3.2.4 “Debug Mode” specify which segments are actually mapped and unmapped.
3.2.1.1 Unmapped Segments
An unmapped segment does not use the FMT to translate from virtual-to-physical addresses.
Unmapped segments have a fixed simple translation from virtual to physical address. This is much like the translations the FMT provides for the M14K core, but we will still make the distinction.
All segments are treated as uncached within the M14K core. Cache coherency attributes of cached or uncached can be specified and this information will be sent with the request to allow the system to make a distinction between the two.
All valid user mode virtual addresses have their most significant bit cleared to 0, indicating that user mode can only access the lower half of the virtual memory map. Any attempt to reference an address with the most significant bit set while in user mode causes an address error exception.
The system maps all references to useg through the FMT.
3.2.3 Kernel Mode
The processor operates in Kernel mode when the DM bit in the Debug register is 0 and the Status register contains one or more of the following values:
UM = 0
ERL = 1
EXL = 1
When a non-debug exception is detected, EXL or ERL will be set and the processor will enter Kernel mode. At the end of the exception handler routine, an Exception Return (ERET) instruction is generally executed. The ERET instruction jumps to the Exception PC, clears ERL, and clears EXL if ERL=0. This may return the processor to User mode.
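The mode conditions listed above can be written as a small decision function. This is a sketch, not a description of the hardware: the Kernel mode test follows the conditions stated in this section, while treating DM = 1 as Debug mode and everything else as User mode is an inference from the surrounding text. The function names are hypothetical.

```python
def operating_mode(dm, um, exl, erl):
    """Classify the operating mode from Debug.DM and Status.UM/EXL/ERL."""
    if dm == 1:
        return "debug"   # assumption: DM = 1 means Debug mode
    if um == 0 or exl == 1 or erl == 1:
        return "kernel"  # DM = 0 and at least one Kernel condition holds
    return "user"        # assumption: the remaining case is User mode


def eret(um, exl, erl):
    """Model the Status update described for ERET.

    ERL is cleared if it was set; otherwise EXL is cleared.
    Returns the new (um, exl, erl) tuple.
    """
    if erl == 1:
        return um, exl, 0
    return um, 0, erl


# An exception taken from User mode sets EXL, and ERET clears it again.
print(operating_mode(dm=0, um=1, exl=1, erl=0))
print(operating_mode(0, *eret(um=1, exl=1, erl=0)))
```

The second call shows why ERET "may" return to User mode: it does so only when UM is 1 and neither EXL nor ERL remains set afterwards.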
Kernel mode virtual address space is divided into regions differentiated by the high-order bits of the virtual address, as shown in Figure 3.4. Also, Table 3.2 lists the characteristics of the Kernel mode segments.
Figure 3.4 Kernel Mode Virtual Address Space
Address Range | Description | Segment
0xE000_0000 - 0xFFFF_FFFF | Kernel virtual address space, Fix Mapped, 512MB | kseg3
0xC000_0000 - 0xDFFF_FFFF | Kernel virtual address space, Fix Mapped, 512MB | kseg2
0xA000_0000 - 0xBFFF_FFFF | Kernel virtual address space, Unmapped, Uncached, 512MB | kseg1
0x8000_0000 - 0x9FFF_FFFF | Kernel virtual address space, Unmapped, 512MB | kseg0
0x0000_0000 - 0x7FFF_FFFF | Fixed Mapped, 2048MB | kuseg
Table 3.2 Kernel Mode Segments
For every row, the Status register is one of these values: (UM = 0 or EXL = 1 or ERL = 1) and DM = 0.
Address Bit Values | Segment Name | Address Range | Segment Size
A(31) = 0 | kuseg | 0x0000_0000 through 0x7FFF_FFFF | 2 GBytes (2^31 bytes)
A(31:29) = 100 (binary) | kseg0 | 0x8000_0000 through 0x9FFF_FFFF | 512 MBytes (2^29 bytes)
A(31:29) = 101 (binary) | kseg1 | 0xA000_0000 through 0xBFFF_FFFF | 512 MBytes (2^29 bytes)
A(31:29) = 110 (binary) | kseg2 | 0xC000_0000 through 0xDFFF_FFFF | 512 MBytes (2^29 bytes)
A(31:29) = 111 (binary) | kseg3 | 0xE000_0000 through 0xFFFF_FFFF | 512 MBytes (2^29 bytes)
3.2.3.1 Kernel Mode, User Space (kuseg)
In Kernel mode, when the most-significant bit of the virtual address (A31) is cleared, the 32-bit kuseg virtual address space is selected and covers the full 2^31 bytes (2 GBytes) of the current user address space mapped to addresses 0x0000_0000 - 0x7FFF_FFFF.
When the Status register’s ERL = 1, the user address region becomes a 2^29-byte unmapped and uncached address space. While in this setting, the kuseg virtual address maps directly to the same physical address.
3.2.3.2 Kernel Mode, Kernel Space 0 (kseg0)
In Kernel mode, when the most-significant three bits of the virtual address are binary 100, the 32-bit kseg0 virtual address space is selected; it is the 2^29-byte (512-MByte) kernel virtual space located at addresses 0x8000_0000 - 0x9FFF_FFFF. References to kseg0 are unmapped; the physical address selected is defined by subtracting 0x8000_0000 from the virtual address. The K0 field of the Config register controls cacheability.
3.2.3.3 Kernel Mode, Kernel Space 1 (kseg1)
In Kernel mode, when the most-significant three bits of the 32-bit virtual address are binary 101, the 32-bit kseg1 virtual address space is selected. kseg1 is the 2^29-byte (512-MByte) kernel virtual space located at addresses 0xA000_0000 - 0xBFFF_FFFF. References to kseg1 are unmapped; the physical address selected is defined by subtracting 0xA000_0000 from the virtual address.
3.2.3.4 Kernel Mode, Kernel Space 2 (kseg2)
In Kernel mode, when UM = 0, ERL = 1, or EXL = 1 in the Status register, and DM = 0 in the Debug register, and the most-significant three bits of the 32-bit virtual address are binary 110, the 32-bit kseg2 virtual address space is selected. In the M14K core, this 2^29-byte (512-MByte) kernel virtual space is located at physical addresses 0xC000_0000 - 0xDFFF_FFFF.
3.2.3.5 Kernel Mode, Kernel Space 3 (kseg3)
In Kernel mode, when the most-significant three bits of the 32-bit virtual address are binary 111, the kseg3 virtual address space is selected. In the M14K core, this 2^29-byte (512-MByte) kernel virtual space is located at physical addresses 0xE000_0000 - 0xFFFF_FFFF.
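The Kernel mode segment selection and the translations stated in the subsections above can be sketched as follows. Only the rules given in this section are modeled: kseg0 and kseg1 subtract their base address, kseg2 and kseg3 are located at the same physical addresses, and kuseg maps directly when ERL = 1. The fixed mapping of kuseg when ERL = 0 is not stated here, so the sketch returns `None` for that case. The function names are hypothetical.

```python
def kernel_segment(vaddr):
    """Name the Kernel mode segment selected by the top address bits."""
    top3 = (vaddr >> 29) & 0x7
    if top3 < 0b100:  # A(31) = 0
        return "kuseg"
    return {0b100: "kseg0", 0b101: "kseg1", 0b110: "kseg2", 0b111: "kseg3"}[top3]


def kernel_translate(vaddr, erl=0):
    """Translate a Kernel mode virtual address using the rules in this section."""
    seg = kernel_segment(vaddr)
    if seg == "kseg0":
        return vaddr - 0x8000_0000
    if seg == "kseg1":
        return vaddr - 0xA000_0000
    if seg in ("kseg2", "kseg3"):
        return vaddr  # located at the same physical addresses
    if erl == 1:
        return vaddr  # kuseg maps directly to the same physical address
    return None       # kuseg with ERL = 0: mapping not given in this section


print(hex(kernel_translate(0x9FC0_0000)))
print(hex(kernel_translate(0xBFC0_0000)))
```

The two example addresses differ only in segment, so both resolve to the same physical location; this is the usual reason kseg0 and kseg1 are described as two views of one physical region.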
3.2.4 Debug Mode
Debug mode address space is identical to Kernel mode address space with respect to mapped and unmapped areas, except for kseg3. In kseg3, a debug segment dseg co-exists in the virtual address range 0xFF20_0000 to 0xFF3F_FFFF. The layout is shown in Figure 3.5.
unpredictable, and writes are ignored to any unimplemented register in the drseg. Refer to Chapter 8, “EJTAG Debug Support in the M14K™ Core” on page 155 for more information on the DCR.
The allowed access size is limited for the drseg. Only word size transactions are allowed. Operation of the processor is undefined for other transaction sizes.
3.2.4.2 Conditions and Behavior for Access to dmseg, EJTAG Memory
The behavior of CPU access to the dmseg address range at 0xFF20_0000 to 0xFF2F_FFFF is determined as shown in Table 3.5.
Table 3.5 CPU Access to dmseg Address Range
Transaction | ProbEn bit in DCR register | LSNM bit in Debug register | Access
Load / Store | Don’t care | 1 | Kernel mode address space (kseg3)
Fetch | 1 | Don’t care | dmseg
Load / Store | 1 | 0 | dmseg
Fetch | 0 | Don’t care | See comments below
Load / Store | 0 | 0 | See comments below
The case with access to the dmseg when the ProbEn bit in the DCR register is 0 is not expected to happen. Debug software is expected to check the state of the ProbEn bit in the DCR register before attempting to reference dmseg. If such a reference does happen, the reference hangs until it is satisfied by the probe. The probe cannot assume that there will never be a reference to dmseg if the ProbEn bit in the DCR register is 0, because there is an inherent race between the debug software sampling the ProbEn bit as 1 and the probe clearing it to 0.
3.3 Fixed Mapping MMU
The M14K core implements a simple Fixed Mapping (FM) memory management unit that is smaller than a full translation lookaside buffer (TLB) and more easily synthesized. Like a TLB, the FMT performs virtual-to-physical address translation and provides attributes for the different memory segments. Those memory segments which are unmapped in a TLB implementation (kseg0 and kseg1) are translated identically by the FMT MMU.
The FMT also determines the cacheability of each segment. These attributes are controlled via bits in the Config register. Table 3.6 shows the encoding for the K23 (bits 30:28), KU (bits 27:25) and K0 (bits 2:0) fields of the Config register.
The M14K core does not contain caches and will treat all references as uncached, but these Config fields are sent out to the system with the request, and the system can choose to use them to control any external caching that may be present.
Table 3.6 Cacheability of Segments with Block Address Translation
Segment | Virtual Address Range | Cacheability
useg/kuseg | 0x0000_0000-0x7FFF_FFFF | Controlled by the KU field (bits 27:25) of the Config register.
kseg0 | 0x8000_0000-0x9FFF_FFFF | Controlled by the K0 field (bits 2:0) of the Config register.
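The bit positions given above for the K23, KU, and K0 fields can be turned into a short extraction helper. This is a minimal sketch: it only isolates the three 3-bit fields from a 32-bit Config value and does not interpret the cacheability encodings. The function name and the sample value are hypothetical.

```python
def config_cacheability_fields(config):
    """Extract the K23, KU, and K0 fields from a 32-bit Config register value."""
    return {
        "K23": (config >> 28) & 0x7,  # bits 30:28
        "KU": (config >> 25) & 0x7,   # bits 27:25
        "K0": config & 0x7,           # bits 2:0
    }


# Illustrative value only: K23 = 2, KU = 3, K0 = 2.
sample = (2 << 28) | (3 << 25) | 2
print(config_cacheability_fields(sample))
```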
Chapter 4
Exceptions and Interrupts in the M14K™ Core
The M14K™ processor core receives exceptions from a number of sources, including arithmetic overflows, I/O interrupts, and system calls. When the CPU detects one of these exceptions, the normal sequence of instruction execution is suspended and the processor enters kernel mode.
In kernel mode the core disables interrupts and forces execution of a software exception processor (called a handler) located at a specific address. The handler saves the context of the processor, including the contents of the program counter, the current operating mode, and the status of the interrupts (enabled or disabled). This context is saved so it can be restored when the exception has been serviced.
When an exception occurs, the core loads the Exception Program Counter (EPC) register with a location where execution can restart after the exception has been serviced. Most exceptions are precise, which means that EPC can be used to identify the instruction that caused the exception. For precise exceptions, the restart location in the EPC register is the address of the instruction that caused the exception or, if the instruction was executing in a branch delay slot, the address of the branch instruction immediately preceding the delay slot. To distinguish between the two, software must read the BD bit in the CP0 Cause register. Bus error exceptions and CP2 exceptions may be imprecise. For imprecise exceptions, the instruction that caused the exception cannot be identified.
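For a precise exception, the use of EPC together with the BD bit can be sketched as below. Two assumptions are made that this section does not state: BD is taken to be bit 31 of the Cause register, as in the MIPS32 privileged architecture, and all instructions are assumed to be 4-byte MIPS32 encodings, so the sketch does not cover microMIPS code. The names are hypothetical.

```python
CAUSE_BD = 1 << 31  # assumption: BD is bit 31 of the CP0 Cause register


def faulting_instruction_address(epc, cause):
    """Return the address of the instruction that took a precise exception.

    If BD is set, EPC holds the address of the branch, and the faulting
    instruction is the delay slot that follows it (4-byte encodings assumed).
    """
    if cause & CAUSE_BD:
        return epc + 4
    return epc


print(hex(faulting_instruction_address(0x8000_1000, 0)))
print(hex(faulting_instruction_address(0x8000_1000, CAUSE_BD)))
```

The restart address is EPC in both cases; when BD is set, the branch must be re-executed so that the delay slot instruction runs in the correct context.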
This chapter contains the following sections:
Section 4.1 “Exception Conditions”
Section 4.2 “Exception Priority”
Section 4.3 “Interrupts”
Section 4.4 “GPR Shadow Registers”
Section 4.5 “Exception Vector Locations”
Section 4.6 “General Exception Processing”
Section 4.7 “Debug Exception Processing”
Section 4.8 “Exception Descriptions”
Section 4.9 “Exception Handling and Servicing Flowcharts”
4.1 Exception Conditions
When an exception condition occurs, the instruction causing the exception and all those that follow it in the pipeline are cancelled (“flushed”). Accordingly, any stall conditions and any later exception conditions that might have referenced this instruction are inhibited; there is no benefit in servicing stalls for a cancelled instruction.
When an exception condition is detected on an instruction fetch, the core aborts that instruction and all instructions that follow. When this instruction reaches the W stage, the core writes various CP0 registers with the exception state, changes the current program counter (PC) to the appropriate exception vector address, and clears the exception bits of earlier pipeline stages.
This implementation allows all preceding instructions to complete execution and prevents all subsequent instructions from completing. Thus, the value in the EPC (ErrorEPC for errors, or DEPC for debug exceptions) is sufficient to restart execution. It also ensures that exceptions are taken in the order of execution; an instruction taking an exception may itself be killed by an instruction further down the pipeline that takes an exception in a later cycle.
4.2 Exception Priority
Table 4.1 contains a list and a brief description of all exception conditions. The exceptions are listed in the order of their relative priority, from highest priority (Reset) to lowest priority. When several exceptions occur simultaneously, the exception with the highest priority is taken.
Table 4.1 Priority of Exceptions
Exception                        Description
Reset                            Assertion of SI_ColdReset signal.
Soft Reset                       Assertion of SI_Reset signal.
DSS                              EJTAG Debug Single Step.
DINT                             EJTAG Debug Interrupt. Caused by the assertion of the external EJ_DINT input, or by setting the EjtagBrk bit in the ECR register.
NMI                              Asserting edge of SI_NMI signal.
Interrupt                        Assertion of unmasked hardware or software interrupt signal.
Protection - Instruction fetch   Instruction fetch access to a protected memory region was attempted.
DIB                              EJTAG debug hardware instruction break matched.
AdEL                             Fetch address alignment error. User-mode fetch reference to kernel address.
ISRAM Parity Error               Parity error on I-SRAM access.
IBE                              Instruction fetch bus error.
Instruction Validity Exceptions  An instruction could not be completed because it was not allowed access to the required resources (Coprocessor Unusable) or was illegal (Reserved Instruction). If both exceptions occur on the same instruction, the Coprocessor Unusable Exception takes priority over the Reserved Instruction Exception.
Protection - Instr Execution     Attempted to write EBase when not allowed by MPU.
Tr                               Execution of a trap (when trap condition is true).
Protection - Data access         Data access to a protected memory region was attempted.
DDBL / DDBS                      EJTAG Data Address Break (address only) or EJTAG Data Value Break on Store (address and value).
AdEL                             Load address alignment error. User mode load reference to kernel address.
AdES                             Store address alignment error. User mode store to kernel address.
DSRAM Parity Error               Parity error on D-SRAM access.
DBE                              Load or store bus error.
DDBL                             EJTAG data hardware breakpoint matched in load data compare.
CBrk                             EJTAG complex breakpoint.
4.3 Interrupts
In the MIPS32® Release 1 architecture, support for exceptions included two software interrupts, six hardware interrupts, and a special-purpose timer interrupt. The timer interrupt was provided external to the core and was typically combined with hardware interrupt 5 in a system-dependent manner. Interrupts were handled either through the general exception vector (offset 0x180) or the special interrupt vector (0x200), based on the value of CauseIV. Software was required to prioritize interrupts as a function of the CauseIP bits in the interrupt handler prologue.
Release 2 of the Architecture, implemented by the M14K core, adds a number of upward-compatible extensions to the Release 1 interrupt architecture, including support for vectored interrupts and the implementation of a new interrupt mode that permits the use of an external interrupt controller.
The M14K core also includes the Microcontroller Application-Specific Extension (MCU ASE) that provides enhanced interrupt delivery and interrupt-latency reduction.
4.3.1 Interrupt Modes
The M14K core includes support for three interrupt modes, as defined by Release 2 of the Architecture:
Interrupt Compatibility mode, in which the behavior of the M14K is identical to the behavior of a Release 1 implementation.
Vectored Interrupt (VI) mode, which adds the ability to prioritize and vector interrupts to a handler dedicated to that interrupt, and to assign a GPR shadow set for use during interrupt processing. The presence of this mode is denoted by the VInt bit in the Config3 register. Although this mode is architecturally optional, it is always present on the M14K processor, so the VInt bit will always read as a 1.
External Interrupt Controller (EIC) mode, which redefines the way interrupts are handled to provide full support for an external interrupt controller that handles prioritization and vectoring of interrupts. As with VI mode, this mode is architecturally optional. The presence of this mode is denoted by the VEIC bit in the Config3 register. On the M14K core, the VEIC bit is set externally by the static input, SI_EICPresent, to allow system logic to indicate the presence of an external interrupt controller.
Following reset, the M14K processor defaults to Compatibility mode, which is fully compatible with all implementations of Release 1 of the Architecture.
Table 4.2 shows the current interrupt mode of the processor as a function of the Coprocessor 0 register fields that can affect the mode.
Table 4.2 Interrupt Modes
StatusBEV  CauseIV  IntCtlVS  Config3VINT  Config3VEIC  Interrupt Mode
1          x        x         x            x            Compatibility
x          0        x         x            x            Compatibility
x          x        =0        x            x            Compatibility
0          1        ≠0        1            0            Vectored Interrupt
0          1        ≠0        x            1            External Interrupt Controller
0          1        ≠0        0            0            Can’t happen - IntCtlVS cannot be non-zero if neither Vectored Interrupt nor External Interrupt Controller mode is implemented.
“x” denotes don’t care
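The mode selection in Table 4.2 can be expressed as a small decision function. The C sketch below is illustrative only; the enum and function names are hypothetical, and each argument is assumed to hold the already-extracted field value (StatusBEV, CauseIV, IntCtlVS, Config3VInt, Config3VEIC).

```c
/* Interrupt modes from Table 4.2; MODE_INVALID is the "can't happen" row. */
typedef enum {
    MODE_COMPATIBILITY,
    MODE_VECTORED,
    MODE_EIC,
    MODE_INVALID
} interrupt_mode_t;

/* Decode the current interrupt mode from the CP0 fields that affect it. */
static interrupt_mode_t interrupt_mode(unsigned bev, unsigned iv, unsigned vs,
                                       unsigned vint, unsigned veic)
{
    /* Any one of these conditions forces Interrupt Compatibility mode. */
    if (bev == 1 || iv == 0 || vs == 0)
        return MODE_COMPATIBILITY;
    /* VEIC = 1 selects EIC mode regardless of VInt. */
    if (veic)
        return MODE_EIC;
    if (vint)
        return MODE_VECTORED;
    /* IntCtlVS non-zero with neither mode implemented. */
    return MODE_INVALID;
}
```

Because VInt always reads as 1 on the M14K, the last case should not be reachable on this core; it is kept only to mirror the table.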
4.3.1.1 Interrupt Compatibility Mode
This is the default interrupt mode for the processor and is entered when a Reset exception occurs. In this mode, interrupts are non-vectored and dispatched through exception vector offset 16#180 (if CauseIV = 0) or vector offset 16#200 (if CauseIV = 1). This mode is in effect if any of the following conditions are true:
CauseIV = 0
StatusBEV = 1
IntCtlVS = 0, which would be the case if vectored interrupts are not implemented, or have been disabled.
Here is a typical software handler for interrupt compatibility mode:
/*
 * Assumptions:
 * - CauseIV = 1 (if it were zero, the interrupt exception would have to
 *   be isolated from the general exception vector before getting
 *   here)
 * - GPRs k0 and k1 are available (no shadow register switches invoked in
 *   compatibility mode)
 * - The software priority is IP9..IP0 (HW7..HW0, SW1..SW0)
 *
 * Location: Offset 0x200 from exception base
 */
IVexception:
    mfc0 k0, C0_Cause        /* Read Cause register for IP bits */
    mfc0 k1, C0_Status       /* and Status register for IM bits */
    andi k0, k0, M_CauseIM   /* Keep only IP bits from Cause */
    and  k0, k0, k1          /* and mask with IM bits */
    beq  k0, zero, Dismiss   /* no bits set - spurious interrupt */
    clz  k0, k0              /* Find first bit set, IP9..IP0; k0 = 14..23 */
    li   k1, 23              /* Highest clz result (IP0 only) */
    subu k0, k1, k0          /* 23 - clz: 14..23 => 9..0 */
    sll  k0, k0, VS          /* Shift to emulate software IntCtlVS */
    la   k1, VectorBase      /* Get base of 10 interrupt vectors */
    addu k0, k0, k1          /* Compute target from base and offset */
    jr   k0                  /* Jump to specific exception routine */
    nop
/*
 * Each interrupt processing routine processes a specific interrupt, analogous
 * to those reached in VI or EIC interrupt mode. Since each processing routine
 * is dedicated to a particular interrupt line, it has the context to know
 * which line was asserted. Each processing routine may need to look further
 * to determine the actual source of the interrupt if multiple interrupt requests
 * are ORed together on a single IP line. Once that task is performed, the
 * interrupt may be processed in one of two ways:
 *
 * - Completely at interrupt level (e.g., a simple UART interrupt). The
 *   SimpleInterrupt routine below is an example of this type.
 * - By saving sufficient state and re-enabling other interrupts. In this
 *   case the software model determines which interrupts are disabled during
 *   the processing of this interrupt. Typically, this is either the single
 *   StatusIM bit that corresponds to the interrupt being processed, or some
 *   collection of other StatusIM bits so that “lower” priority interrupts are
 *   also disabled. The NestedException routine below is an example of this type.
 */
SimpleInterrupt:
/*
 * Process the device interrupt here and clear the interrupt request
 * at the device. In order to do this, some registers may need to be
 * saved and restored. The coprocessor 0 state is such that an ERET
 * will simply return to the interrupted code.
 */
    eret                     /* Return to interrupted code */
NestedException:
/*
 * Nested exceptions typically require saving the EPC and Status registers,
 * any GPRs that may be modified by the nested exception routine, disabling
 * the appropriate IM bits in Status to prevent an interrupt loop, putting
 * the processor in kernel mode, and re-enabling interrupts. The sample code
 * below can not cover all nuances of this processing and is intended only
 * to demonstrate the concepts.
 */
    /* Save GPRs here, and setup software context */
    mfc0 k0, C0_EPC          /* Get restart address */
    sw   k0, EPCSave         /* Save in memory */
    mfc0 k0, C0_Status       /* Get Status value */
    sw   k0, StatusSave      /* Save in memory */
    li   k1, ~IMbitsToClear  /* Get IM bits to clear for this interrupt */
                             /* this must include at least the IM bit */
                             /* for the current interrupt, and may include */
                             /* others */
    and  k0, k0, k1          /* Clear bits in copy of Status */
    ins  k0, zero, S_StatusEXL, (W_StatusKSU+W_StatusERL+W_StatusEXL)
                             /* Clear KSU, ERL, EXL bits in k0 */
    mtc0 k0, C0_Status       /* Modify mask, switch to kernel mode, */
                             /* re-enable interrupts */
/*
 * Process interrupt here, including clearing device interrupt.
 * In some environments this may be done with a thread running in
 * kernel or user mode. Such an environment is well beyond the scope of
 * this example.
 */
/*
 * To complete interrupt processing, the saved values must be restored
 * and the original interrupted code restarted.
 */
    di                       /* Disable interrupts - may not be required */
    lw   k0, StatusSave      /* Get saved Status (including EXL set) */
    lw   k1, EPCSave         /* and EPC */
    mtc0 k0, C0_Status       /* Restore the original value */
    mtc0 k1, C0_EPC          /* and EPC */
    /* Restore GPRs and software state */
    eret                     /* Dismiss the interrupt */
4.3.1.2 Vectored Interrupt (VI) Mode
In Vectored Interrupt (VI) mode, a priority encoder prioritizes pending interrupts and generates a vector which can be used to direct each interrupt to a dedicated handler routine. This mode also allows each interrupt to be mapped to a GPR shadow register set for use by the interrupt handler. VI mode is in effect when all the following conditions are true:
Config3VInt = 1
Config3VEIC = 0
IntCtlVS ≠ 0
CauseIV = 1
StatusBEV = 0
In VI interrupt mode, the eight hardware interrupts are interpreted as individual hardware interrupt requests. The timer interrupt is combined in a system-dependent way (external to the core) with the hardware interrupts (the interrupt with which they are combined is indicated by the IPTI field in IntCtl) to provide the appropriate relative priority of the timer interrupt with that of the hardware interrupts. The processor interrupt logic ANDs each of the CauseIP bits with the corresponding StatusIM bits. If any of these values is 1, and if interrupts are enabled (StatusIE = 1, StatusEXL = 0, and StatusERL = 0), an interrupt is signaled and a priority encoder scans the values in the order shown in Table 4.3.
Table 4.3 Relative Interrupt Priority for Vectored Interrupt Mode
Relative Priority  Interrupt Type  Interrupt Source  Interrupt Request Calculated From  Vector Number Generated by Priority Encoder
Highest Priority   Hardware        HW7               IP9 and IM9                        9
                                   HW6               IP8 and IM8                        8
                                   HW5               IP7 and IM7                        7
                                   HW4               IP6 and IM6                        6
                                   HW3               IP5 and IM5                        5
                                   HW2               IP4 and IM4                        4
                                   HW1               IP3 and IM3                        3
                                   HW0               IP2 and IM2                        2
                   Software        SW1               IP1 and IM1                        1
Lowest Priority                    SW0               IP0 and IM0                        0
The priority order places a relative priority on each hardware interrupt and places the software interrupts at a priority lower than all hardware interrupts. When the priority encoder finds the highest priority pending interrupt, it outputs an encoded vector number that is used in the calculation of the handler for that interrupt, as described below. This is shown pictorially in Figure 4.1.
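The VI-mode selection logic described above (AND of the IP and IM bits, gating by IE, EXL and ERL, then a priority scan from HW7 down to SW0) can be modeled in a few lines of C. This is a behavioral sketch, not the hardware implementation. It assumes the ten IP and IM bits have already been extracted into the low 10 bits of each argument (bit n = IPn or IMn), which sidesteps the register bit positions; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Return the vector number (0..9) of the highest-priority pending and
 * enabled interrupt, or -1 if no interrupt is signaled.
 * ip, im: 10-bit values with bit n = IPn / IMn (assumed pre-extracted).
 */
static int vi_vector_number(uint32_t ip, uint32_t im,
                            unsigned ie, unsigned exl, unsigned erl)
{
    uint32_t pending = ip & im & 0x3FFu;

    /* Interrupts must be enabled: IE = 1, EXL = 0, ERL = 0. */
    if (!(ie == 1 && exl == 0 && erl == 0) || pending == 0)
        return -1;

    /* Scan from HW7 (bit 9) down to SW0 (bit 0), as in Table 4.3. */
    for (int n = 9; n >= 0; n--) {
        if (pending & (UINT32_C(1) << n))
            return n;
    }
    return -1; /* not reached: pending is non-zero */
}
```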
    li   k1, ~IMbitsToClear  /* Get IM bits to clear for this interrupt */
                             /* this must include at least the IM bit */
                             /* for the current interrupt, and may include */
                             /* others */
    and  k0, k0, k1          /* Clear bits in copy of Status */
    /* If switching shadow sets, write new value to SRSCtlPSS here */
    ins  k0, zero, S_StatusEXL, (W_StatusKSU+W_StatusERL+W_StatusEXL)
                             /* Clear KSU, ERL, EXL bits in k0 */
    mtc0 k0, C0_Status       /* Modify mask, switch to kernel mode, */
                             /* re-enable interrupts */
/*
 * If switching shadow sets, clear only KSU above, write target
 * address to EPC, and execute an eret to clear EXL, switch
 * shadow sets, and jump to routine
 */
    /* Process interrupt here, including clearing device interrupt */
/*
 * To complete interrupt processing, the saved values must be restored
 * and the original interrupted code restarted.
 */
    di                       /* Disable interrupts - may not be required */
    lw   k0, StatusSave      /* Get saved Status (including EXL set) */
    lw   k1, EPCSave         /* and EPC */
    mtc0 k0, C0_Status       /* Restore the original value */
    lw   k0, SRSCtlSave      /* Get saved SRSCtl */
    mtc0 k1, C0_EPC          /* and EPC */
    mtc0 k0, C0_SRSCtl       /* Restore shadow sets */
    ehb                      /* Clear hazard */
    eret                     /* Dismiss the interrupt */
4.3.1.3 External Interrupt Controller Mode
External Interrupt Controller Mode redefines the way that the processor interrupt logic is configured to provide support for an external interrupt controller. The interrupt controller is responsible for prioritizing all interrupts, including hardware, software, timer, and performance counter interrupts, and directly supplying to the processor the priority level and vector number of the highest priority interrupt. EIC interrupt mode is in effect if all of the following conditions are true:
Config3VEIC = 1
IntCtlVS ≠ 0
CauseIV = 1
StatusBEV = 0
In EIC interrupt mode, the processor sends the state of the software interrupt requests (CauseIP1..IP0), the timer interrupt request (CauseTI), the performance counter interrupt request (CausePCI) and Fast Debug Channel Interrupt (CauseFDCI) to the external interrupt controller, where it prioritizes these interrupts in a system-dependent way with other hardware interrupts. The interrupt controller can be a hard-wired logic block, or it can be configurable based on control and status registers. This allows the interrupt controller to be more specific or more general as a function of the system environment and needs.
The external interrupt controller prioritizes its interrupt requests and produces the priority level and the vector number of the highest priority interrupt to be serviced. The priority level, called the Requested Interrupt Priority Level (RIPL), is an 8-bit encoded value in the range 0..255, inclusive. A value of 0 indicates that no interrupt requests are pending. The values 1..255 represent the lowest (1) to highest (255) RIPL for the interrupt to be serviced. The interrupt controller passes this value on the 8 hardware interrupt lines, which are treated as an encoded value in EIC interrupt mode. There are two implementation options available for the vector offset:
1. The first option is to send a separate vector number along with the RIPL to the processor.
2. A second option is to send an entire vector offset along with the RIPL to the processor. This option is enabled through the core’s configuration GUI, and it is not affected by software.
The M14K core does not support the option to treat the RIPL value as the vector number for the processor.
StatusIPL (which overlays StatusIM9..IM2) is interpreted as the Interrupt Priority Level (IPL) at which the processor is currently operating (with a value of zero indicating that no interrupt is currently being serviced). When the interrupt controller requests service for an interrupt, the processor compares RIPL with StatusIPL to determine if the requested interrupt has higher priority than the current IPL. If RIPL is strictly greater than StatusIPL, and interrupts are enabled (StatusIE = 1, StatusEXL = 0, and StatusERL = 0), an interrupt request is signaled to the pipeline. When the processor starts the interrupt exception, it loads RIPL into CauseRIPL (which overlays CauseIP9..IP2) and signals the external interrupt controller to notify it that the request is being serviced. Because CauseRIPL is only loaded by the processor when an interrupt exception is signaled, it is available to software during interrupt processing. The vector number that the EIC passes to the core is combined with the IntCtlVS to determine where the interrupt service routine is located. The vector number is not stored in any software-visible registers.
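The acceptance test applied to an EIC request (RIPL strictly greater than the current IPL, with interrupts enabled) is compact enough to state as code. The following C sketch is a behavioral model under the assumption that RIPL and StatusIPL are passed in as plain 8-bit values; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if an EIC request at priority ripl is signaled to the
 * pipeline while the processor operates at priority ipl.
 */
static bool eic_interrupt_signaled(uint8_t ripl, uint8_t ipl,
                                   unsigned ie, unsigned exl, unsigned erl)
{
    /*
     * RIPL = 0 means "no request pending"; the strict comparison already
     * rejects it because no IPL is below zero.
     */
    return ripl > ipl && ie == 1 && exl == 0 && erl == 0;
}
```

The strict comparison means a request at the same level as the interrupt being serviced stays pending until software lowers StatusIPL.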
In EIC interrupt mode, the external interrupt controller is also responsible for supplying the GPR shadow set number to use when servicing the interrupt. As such, the SRSMap register is not used in this mode, and the mapping of the vectored interrupt to a GPR shadow set is done by programming (or designing) the interrupt controller to provide the correct GPR shadow set number when an interrupt is requested. When the processor loads an interrupt request into CauseRIPL, it also loads the GPR shadow set number into SRSCtlEICSS, which is copied to SRSCtlCSS when the interrupt is serviced.
The operation of EIC interrupt mode is shown pictorially in Figure 4.2.
    sw   k0, StatusSave      /* Save in memory */
    ins  k0, k1, S_StatusIPL, 6 /* Set IPL to RIPL in copy of Status */
    mfc0 k1, C0_SRSCtl       /* Save SRSCtl if changing shadow sets */
    sw   k1, SRSCtlSave
    /* If switching shadow sets, write new value to SRSCtlPSS here */
    ins  k0, zero, S_StatusEXL, (W_StatusKSU+W_StatusERL+W_StatusEXL)
                             /* Clear KSU, ERL, EXL bits in k0 */
    mtc0 k0, C0_Status       /* Modify IPL, switch to kernel mode, */
                             /* re-enable interrupts */
/*
 * If switching shadow sets, clear only KSU above, write target
 * address to EPC, and execute an eret to clear EXL, switch
 * shadow sets, and jump to routine
 */
    /* Process interrupt here, including clearing device interrupt */
/*
 * The interrupt completion code is identical to that shown for VI mode above.
 */
4.3.2 Generation of Exception Vector Offsets for Vectored Interrupts
For vectored interrupts (in either VI or EIC interrupt mode), a vector number is produced by the interrupt control logic. This number is combined with IntCtlVS to create the interrupt offset, which is added to 16#200 to create the exception vector offset. For VI interrupt mode, the vector number is in the range 0..9, inclusive. For EIC interrupt mode, the vector number is in the range 0..63, inclusive. The IntCtlVS field specifies the spacing between vector locations. If this value is zero (the default reset state), the vector spacing is zero and the processor reverts to Interrupt Compatibility Mode. A non-zero value enables vectored interrupts, and Table 4.4 shows the exception vector offset for a representative subset of the vector numbers and values of the IntCtlVS field.
Table 4.4 Exception Vector Offsets for Vectored Interrupts
                 Value of IntCtlVS Field
Vector Number    2#00001    2#00010    2#00100    2#01000    2#10000
0                16#0200    16#0200    16#0200    16#0200    16#0200
1                16#0220    16#0240    16#0280    16#0300    16#0400
2                16#0240    16#0280    16#0300    16#0400    16#0600
3                16#0260    16#02C0    16#0380    16#0500    16#0800
4                16#0280    16#0300    16#0400    16#0600    16#0A00
5                16#02A0    16#0340    16#0480    16#0700    16#0C00
6                16#02C0    16#0380    16#0500    16#0800    16#0E00
7                16#02E0    16#03C0    16#0580    16#0900    16#1000
61               16#09A0    16#1140    16#2080    16#3F00    16#7C00
62               16#09C0    16#1180    16#2100    16#4000    16#7E00
63               16#09E0    16#11C0    16#2180    16#4100    16#8000
The general equation for the exception vector offset for a vectored interrupt is:
vectorOffset ← 16#200 + (vectorNumber × (IntCtlVS || 2#00000))
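The equation can be checked against Table 4.4 with a short C function. Concatenating five zero bits to IntCtlVS is the same as multiplying the field value by 32, so a VS value of 2#00001 gives a 32-byte spacing. This is a sketch for illustration; the function name is hypothetical and the arguments are assumed to be the raw vector number and the 5-bit VS field value.

```c
#include <stdint.h>

/*
 * Exception vector offset for a vectored interrupt.
 * vector_number: 0..9 in VI mode, 0..63 in EIC mode.
 * vs: 5-bit IntCtlVS field value; (vs << 5) is VS || 2#00000.
 */
static uint32_t vectored_interrupt_offset(uint32_t vector_number, uint32_t vs)
{
    return 0x200u + vector_number * (vs << 5);
}
```

For example, vector 7 with VS = 2#10000 (512-byte spacing) yields 16#1000, matching the last column of Table 4.4.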
When using large vector spacing and EIC mode, the offset value can overlap with bits that are specified in the EBase register. Software must ensure that any overlapping bits are specified as 0 in EBase. This implementation ORs together the offset and base registers, but it is architecturally undefined and software should not rely on this behavior.
Although there are 255 EIC priority interrupts, only 64 vectors are provided. There is no one-to-one mapping for each EIC interrupt to its interrupt vector. The 255 priority interrupts will share the 64 interrupt vectors as specified by the SI_EICVector[5:0] input pins. However, as mentioned in option 2 of Section 4.3.1.3 “External Interrupt Controller Mode”, the SI_Offset[17:1] input pins can be used to provide each EIC interrupt with a unique interrupt handler location.
4.3.3 MCU ASE Enhancement for Interrupt Handling
The MCU ASE extends the MIPS/microMIPS32 Architecture with a set of new features designed for the microcontroller market. The MCU ASE contains enhancements in two key areas: interrupt delivery and interrupt latency. For more details, refer to the MCU Privileged Resource Architecture chapter of the MIPS® Architecture for Programmers Volume IV-h: The MCU Application-Specific Extension to the MIPS32 Architecture [10] or MIPS® Architecture for Programmers Volume IV-h: The MCU Application-Specific Extension to the microMIPS32™ Architecture [11].
4.3.3.1 Interrupt Delivery
The MCU ASE extends the number of hardware interrupt sources from 6 to 8. For legacy and vectored-interrupt mode, this represents 8 external interrupt sources. For EIC mode, the widened IPL and RIPL fields can now represent 256 external interrupt sources.
4.3.3.2 Interrupt Latency Reduction
The MCU ASE includes a package of extensions to MIPS/microMIPS32 that decrease the latency of the processor’s response to a signalled interrupt.
Interrupt Vector Prefetching
Normally on MIPS architecture processors, when an interrupt or exception is signalled, execution pipelines must be flushed before the interrupt/exception handler is fetched. This is necessary to avoid mixing the contexts of the interrupted/faulting program and the exception handler. The MCU ASE introduces a hardware mechanism in which the interrupt exception vector is prefetched whenever the interrupt input signals change. The prefetch memory transaction occurs in parallel with the pipeline flush and exception prioritization. This decreases the overall latency of the execution of the interrupt handler’s first instruction.
Automated Interrupt Prologue
The use of Shadow Register Sets avoids the software steps of having to save general-purpose registers before handling an interrupt.
The MCU ASE adds additional hardware logic that automatically saves some of the COP0 state in the stack and automatically updates some of the COP0 registers in preparation for interrupt handling.
Automated Interrupt Epilogue
A mirror to the Automated Prologue, this feature automates the restoration of some of the COP0 registers from the stack and the preparation of some of the COP0 registers for returning to non-exception mode. This feature is implemented within the IRET instruction, which is introduced in this ASE.
Interrupt Chaining
An optional feature of the Automated Interrupt Epilogue, this feature allows handling a second interrupt after a primary interrupt is handled, without returning to non-exception mode (and the related pipeline flushes that would normally be necessary).
4.4 GPR Shadow Registers
Release 2 of the Architecture optionally removes the need to save and restore GPRs on entry to high priority interrupts or exceptions, and to provide specified processor modes with the same capability. This is done by introducing multiple copies of the GPRs, called shadow sets, and allowing privileged software to associate a shadow set with entry to kernel mode via an interrupt vector or exception. The normal GPRs are logically considered shadow set zero.
The number of GPR shadow sets is a build-time option on the M14K core. Although Release 2 of the Architecture defines a maximum of 16 shadow sets, the core allows one (the normal GPRs), two, four, eight or sixteen shadow sets. The highest number actually implemented is indicated by the SRSCtlHSS field. If this field is zero, only the normal GPRs are implemented.
Shadow sets are new copies of the GPRs that can be substituted for the normal GPRs on entry to kernel mode via an interrupt or exception. When a shadow set is bound to a kernel mode entry condition, references to GPRs work exactly as one would expect, but they are redirected to registers that are dedicated to that condition. Privileged software may need to reference all GPRs in the register file, even specific shadow registers that are not visible in the current mode. The RDPGPR and WRPGPR instructions are used for this purpose. The CSS field of the SRSCtl register provides the number of the current shadow register set, and the PSS field of the SRSCtl register provides the number of the previous shadow register set (that which was current before the last exception or interrupt occurred).
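As a small illustration of how software might interpret these fields, the C sketch below extracts HSS, PSS and CSS from a saved SRSCtl value and derives the number of implemented shadow sets (HSS + 1, since HSS = 0 means only the normal GPRs exist). The bit positions used here (HSS at bits 29..26, PSS at bits 9..6, CSS at bits 3..0) are an assumption taken from the generic MIPS32 Release 2 register layout and should be checked against the SRSCtl register description; all names are hypothetical.

```c
#include <stdint.h>

/* Assumed SRSCtl field positions (each field is 4 bits wide). */
#define SRSCTL_HSS_SHIFT 26
#define SRSCTL_PSS_SHIFT 6
#define SRSCTL_CSS_SHIFT 0

/* Extract one 4-bit SRSCtl field starting at the given bit position. */
static unsigned srsctl_field(uint32_t srsctl, unsigned shift)
{
    return (unsigned)((srsctl >> shift) & 0xFu);
}

/* Number of implemented GPR sets, counting the normal GPRs as set zero. */
static unsigned shadow_set_count(uint32_t srsctl)
{
    return srsctl_field(srsctl, SRSCTL_HSS_SHIFT) + 1u;
}
```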
If the processor is operating in VI interrupt mode, binding of a vectored interrupt to a shadow set is done by writing to the SRSMap register. If the processor is operating in EIC interrupt mode, the binding of the interrupt to a specific shadow set is provided by the external interrupt controller, and is configured in an implementation-dependent way. Binding of an exception or non-vectored interrupt to a shadow set is done by writing to the ESS field of the SRSCtl register. When an exception or interrupt occurs, the value of SRSCtlCSS is copied to SRSCtlPSS, and SRSCtlCSS is set to the value taken from the appropriate source. On an ERET, the value of SRSCtlPSS is copied back into SRSCtlCSS to restore the shadow set of the mode to which control returns. More precisely, the rules for updating the fields in the SRSCtl register on an interrupt or exception are as follows:
1. No field in the SRSCtl register is updated if any of the following conditions is true. In this case, steps 2 and 3 are skipped.
   The exception is one that sets StatusERL: Reset, Soft Reset, or NMI.
   The exception causes entry into EJTAG Debug Mode.
   StatusBEV = 1
   StatusEXL = 1
2. SRSCtlCSS is copied to SRSCtlPSS.
3. SRSCtlCSS is updated from one of the following sources:
   The appropriate field of the SRSMap register, based on IPL, if the exception is an interrupt, CauseIV = 1, Config3VEIC = 0, and Config3VInt = 1. These are the conditions for a vectored interrupt.
   The EICSS field of the SRSCtl register if the exception is an interrupt, CauseIV = 1, and Config3VEIC = 1. These are the conditions for a vectored EIC interrupt.
   The ESS field of the SRSCtl register in any other case. This is the condition for a non-interrupt exception, or a non-vectored interrupt.
Similarly, the rules for updating the fields in the SRSCtl register at the end of an exception or interrupt are as follows:
1. No field in the SRSCtl register is updated if any of the following conditions is true. In this case, step 2 is skipped.
   A DERET is executed.
   An ERET is executed with StatusERL = 1.
2. SRSCtlPSS is copied to SRSCtlCSS.
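The two rule sets above can be read as a small state machine on the CSS and PSS fields. The following C sketch models them for illustration only. The structure and function names are hypothetical, the fields are held as plain integers rather than packed register bits, and the SRSMap lookup is assumed to have been done by the caller.

```c
#include <stdbool.h>

/* Unpacked model of the SRSCtl fields involved in the update rules. */
typedef struct {
    unsigned css;    /* current shadow set */
    unsigned pss;    /* previous shadow set */
    unsigned ess;    /* set for exceptions and non-vectored interrupts */
    unsigned eicss;  /* set supplied by the external interrupt controller */
} srsctl_t;

/* Properties of the exception being taken. */
typedef struct {
    bool sets_erl;        /* Reset, Soft Reset, or NMI */
    bool enters_debug;    /* entry into EJTAG Debug Mode */
    bool is_interrupt;
    unsigned srsmap_set;  /* SRSMap field selected for this interrupt */
} exception_info_t;

/* Apply the entry rules (steps 1 to 3). */
static void srsctl_on_exception(srsctl_t *s, const exception_info_t *e,
                                unsigned bev, unsigned exl, unsigned iv,
                                unsigned vint, unsigned veic)
{
    /* Step 1: conditions under which nothing is updated. */
    if (e->sets_erl || e->enters_debug || bev == 1 || exl == 1)
        return;

    /* Step 2: remember the set that was current. */
    s->pss = s->css;

    /* Step 3: choose the new current set. */
    if (e->is_interrupt && iv == 1 && veic == 1)
        s->css = s->eicss;            /* vectored EIC interrupt */
    else if (e->is_interrupt && iv == 1 && veic == 0 && vint == 1)
        s->css = e->srsmap_set;       /* vectored interrupt */
    else
        s->css = s->ess;              /* any other case */
}

/* Apply the exit rule for an ERET (a DERET changes nothing). */
static void srsctl_on_eret(srsctl_t *s, unsigned erl)
{
    if (erl == 1)
        return;
    s->css = s->pss;
}
```

Because the entry rules are skipped when StatusEXL = 1, a nested exception leaves CSS and PSS untouched in this model, which is the preservation property the rules are meant to provide.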
These rules have the effect of preserving the SRSCtl register in any case of a nested exception or one which occurs before the processor has been fully initialized (StatusBEV = 1).
Privileged software may switch the current shadow set by writing a new value into SRSCtlPSS, loading EPC with a target address, and doing an ERET.
4.5 Exception Vector Locations
The Reset, Soft Reset, and NMI exceptions are always vectored to location 16#BFC0.0000. EJTAG Debug exceptions are vectored to location 16#BFC0.0480, or to location 16#FF20.0200 if the ProbTrap bit is zero or one, respectively, in the EJTAG_Control_register. Addresses for all other exceptions are a combination of a vector offset and a vector base address. In Release 1 of the architecture, the vector base address was fixed. In Release 2 of the architecture, software is allowed to specify the vector base address via the EBase register for exceptions that occur when StatusBEV equals 0. Table 4.5 gives the vector base address as a function of the exception and whether the BEV bit is set in the Status register. Table 4.6 gives the offsets from the vector base address as a function of the exception. Note that the IV bit in the Cause register causes Interrupts to use a dedicated exception vector offset, rather than the general exception vector. For implementations of Release 2 of the Architecture, Table 4.4 shows the offset from the base address in the case where StatusBEV = 0 and CauseIV = 1. For implementations of Release 1 of the architecture in which CauseIV = 1, the vector offset is as if IntCtlVS were 0. Table 4.7 combines these two tables into one that contains all possible vector addresses as a function of the state that can affect the
vector selection. To avoid complexity in the table, the vector address value assumes that the EBase register, as implemented in Release 2 devices, is not changed from its reset state and that IntCtlVS is 0.
Table 4.5 Exception Vector Base Addresses
Exception                                                     StatusBEV = 0                                StatusBEV = 1
Reset, Soft Reset, NMI                                        16#BFC0.0000                                 16#BFC0.0000
EJTAG Debug (with ProbEn = 0 in the EJTAG Control Register)   16#BFC0.0480                                 16#BFC0.0480
EJTAG Debug (with ProbEn = 1 in the EJTAG Control Register)   16#FF20.0200                                 16#FF20.0200
SRAM Parity Error                                             EBase31..30 || 1 || EBase28..12 || 16#000    16#BFC0.0300
Other (Release 1 of the architecture)                         16#8000.0000                                 16#BFC0.0200
Other (Release 2 of the architecture)                         EBase31..12 || 16#000                        16#BFC0.0200
Note that EBase31..30 have the fixed value 2#10.
Table 4.6 Exception Vector Offsets
Exception                 Vector Offset
General Exception         16#180
Interrupt, CauseIV = 1    16#200 (In Release 2 implementations, this is the base of the vectored interrupt table when StatusBEV = 0)
Reset, Soft Reset, NMI    None (Uses Reset Base Address)
Table 4.7 Exception Vectors
The Vector column, for Release 2 implementations, assumes that EBase retains its reset state and that IntCtlVS = 0.
Exception               StatusBEV  StatusEXL  CauseIV  EJTAG ProbEn  Vector
Reset, Soft Reset, NMI  x          x          x        x             16#BFC0.0000
EJTAG Debug             x          x          x        0             16#BFC0.0480
EJTAG Debug             x          x          x        1             16#FF20.0200
SRAM Parity Error       0          x          x        x             EBase[31:30] || 2#1 || EBase[28:12] || 16#100
SRAM Parity Error       1          x          x        x             16#BFC0.0300
Interrupt               0          0          0        x             16#8000.0180
Interrupt               0          0          1        x             16#8000.0200
Interrupt               1          0          0        x             16#BFC0.0380
Interrupt               1          0          1        x             16#BFC0.0400
All others              0          x          x        x             16#8000.0180
All others              1          x          x        x             16#BFC0.0380
‘x’ denotes don’t care
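The interrupt rows of Table 4.7 follow directly from adding the Table 4.6 offset to the Table 4.5 base address. The C sketch below reproduces them under the same assumptions the table states: EBase holds its reset value (so the StatusBEV = 0 base is 16#8000.0000), IntCtlVS = 0, and StatusEXL = 0. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Vector address for an interrupt, assuming EBase is at its reset value,
 * IntCtlVS = 0, and StatusEXL = 0 (the assumptions of Table 4.7).
 */
static uint32_t interrupt_vector_address(unsigned bev, unsigned iv)
{
    /* Base from Table 4.5, "Other" row. */
    uint32_t base = bev ? UINT32_C(0xBFC00200) : UINT32_C(0x80000000);
    /* Offset from Table 4.6: dedicated interrupt vector when CauseIV = 1. */
    uint32_t offset = iv ? 0x200u : 0x180u;

    return base + offset;
}
```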
4.6 General Exception Processing
With the exception of Reset, Soft Reset, NMI, cache error, and EJTAG Debug exceptions, which have their own special processing as described below, exceptions have the same basic processing flow:
If the EXL bit in the Status register is zero, the EPC register is loaded with the PC at which execution will be restarted and the BD bit is set appropriately in the Cause register (see Table 5.17). The value loaded into the EPC register is dependent on whether the processor implements microMIPS, and whether the instruction is in the delay slot of a branch or jump which has delay slots. Table 4.8 shows the value stored in each of the CP0 PC registers, including EPC. For implementations of Release 2 of the Architecture, if StatusBEV = 0, the CSS field in the SRSCtl register is copied to the PSS field, and the CSS value is loaded from the appropriate source.
If the EXL bit in the Status register is set, the EPC register is not loaded and the BD bit is not changed in the Cause register. For implementations of Release 2 of the Architecture, the SRSCtl register is not changed.
Table 4.8 Value Stored in EPC, ErrorEPC, or DEPC on an Exception
microMIPS
Implemented?
No No Address of the instruction
No Yes Address of the branch or jump instruction (PC-4)
Yes No Upper31 bits of the address of the instruction, combined
Yes Yes Upper31 bits of the branch or jump instruction (PC-2 or
The CE and ExcCode fields of the Cause registers are loaded with the values appropriate to the exception. The
CE field is loaded, but not defined, for any exception type other than a coprocessor unusable exception.
In Branch/Jump
Delay Slot? Value stored in EPC/ErrorEPC/DEPC
with the ISA Mode bit
PC-4 depending on size of the instruction in the micro­MIPS ISA Mode and PC-4 in the 32-bit ISA Mode), com­bined with the ISA Mode bit
The EXL bit is set in the Status register.
The processor is started at the exception vector.
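The restart-address rules of Table 4.8 can be illustrated with a small C model. This is only a sketch: the function name restart_pc and its parameters are hypothetical and do not appear in the manual, and the caller is assumed to know the size of the preceding branch or jump (always 4 bytes when microMIPS is not implemented).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the manual) modelling Table 4.8.
 *   pc          address of the instruction that took the exception
 *   in_delay    true if that instruction is in a branch/jump delay slot
 *   micromips   true if the core implements microMIPS
 *   isa_mode    current ISA Mode bit (0 = 32-bit ISA, 1 = microMIPS)
 *   branch_size size in bytes of the preceding branch/jump (2 or 4)   */
static uint32_t restart_pc(uint32_t pc, bool in_delay, bool micromips,
                           unsigned isa_mode, unsigned branch_size)
{
    assert(branch_size == 2 || branch_size == 4);
    assert(isa_mode <= 1);

    /* In a delay slot, restart at the branch/jump rather than the slot. */
    uint32_t addr = in_delay ? pc - branch_size : pc;

    if (!micromips)
        return addr;            /* plain address: PC or PC-4 */

    /* Upper 31 bits of the address combined with the ISA Mode bit. */
    return (addr & ~UINT32_C(1)) | isa_mode;
}
```

For example, under these assumptions restart_pc(0x80001008, true, false, 0, 4) yields the branch address 0x80001004.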
The value loaded into EPC represents the restart address for the exception and need not be modified by exception handler software in the normal case. Software need not look at the BD bit in the Cause register unless it wishes to identify the address of the instruction that actually caused the exception.
Note that individual exception types may load additional information into other registers. This is noted in the description of each exception type below.
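As an illustration of how the exception vector is formed from a base and an offset (Tables 4.6 and 4.7), the following C sketch models the selection for a Release 2 implementation. It is a simplified model rather than a description of the hardware: the function name and parameter list are hypothetical, TLB refill and the dedicated Reset, SRAM parity and debug vectors are left out, and the vector number is assumed to be supplied by the caller (from Cause_RIPL in EIC mode or from the priority encoder otherwise).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the general/interrupt vector computation.
 *   bev, exl    Status_BEV and Status_EXL before the exception
 *   is_int      true if the exception type is Interrupt
 *   iv          Cause_IV
 *   intctl_vs   IntCtl_VS (5-bit field; spacing = intctl_vs * 32 bytes)
 *   vec_num     vector number, supplied by the caller
 *   ebase       value of the EBase register                          */
static uint32_t exception_pc(bool bev, bool exl, bool is_int, bool iv,
                             uint32_t intctl_vs, uint32_t vec_num,
                             uint32_t ebase)
{
    assert(intctl_vs < 32);

    /* With EXL already set, everything uses the general vector. */
    uint32_t offset = 0x180;
    if (!exl && is_int && iv) {
        if (bev || intctl_vs == 0)
            offset = 0x200;
        else
            offset = 0x200 + vec_num * (intctl_vs << 5);
    }

    /* Bootstrap base when BEV = 1, otherwise EBase[31:12] || 16#000. */
    uint32_t base = bev ? 0xBFC00200u : (ebase & 0xFFFFF000u);

    /* Add only bits 29..0; no carry propagates into bits 31..30. */
    return (base & 0xC0000000u) | ((base + offset) & 0x3FFFFFFFu);
}
```

With EBase at 0x8000_0000 and IntCtl_VS = 0, this model reproduces the Interrupt and "All others" rows of Table 4.7.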
Operation:
    /* If Status_EXL is 1, all exceptions go through the general exception vector */
    /* and neither EPC nor Cause_BD nor SRSCtl are modified */
    if Status_EXL = 1 then
        vectorOffset ← 16#180
    else
        if InstructionInBranchDelaySlot then
            EPC ← restartPC /* PC of branch/jump */
            Cause_BD ← 1
        else
            EPC ← restartPC /* PC of instruction */
            Cause_BD ← 0
        endif

        /* Compute vector offsets as a function of the type of exception */
        NewShadowSet ← SRSCtl_ESS /* Assume exception, Release 2 only */
        if ExceptionType = TLBRefill then
            vectorOffset ← 16#000
        elseif (ExceptionType = Interrupt) then
            if (Cause_IV = 0) then
                vectorOffset ← 16#180
            else
                if (Status_BEV = 1) or (IntCtl_VS = 0) then
                    vectorOffset ← 16#200
                else
                    if Config3_VEIC = 1 then
                        VecNum ← Cause_RIPL
                        NewShadowSet ← SRSCtl_EICSS
                    else
                        VecNum ← VIntPriorityEncoder()
                        NewShadowSet ← SRSMap[IPL×4+3..IPL×4]
                    endif
                    vectorOffset ← 16#200 + (VecNum × (IntCtl_VS || 2#00000))
                endif /* if (Status_BEV = 1) or (IntCtl_VS = 0) then */
            endif /* if (Cause_IV = 0) then */
        endif /* elseif (ExceptionType = Interrupt) then */

        /* Update the shadow set information for an implementation of */
        /* Release 2 of the architecture */
        if ((ArchitectureRevision ≥ 2) and (SRSCtl_HSS > 0) and (Status_BEV = 0) and
            (Status_ERL = 0)) then
            SRSCtl_PSS ← SRSCtl_CSS
            SRSCtl_CSS ← NewShadowSet
        endif
    endif /* if Status_EXL = 1 then */

    Cause_CE ← FaultingCoprocessorNumber
    Cause_ExcCode ← ExceptionType
    Status_EXL ← 1

    /* Calculate the vector base address */
    if Status_BEV = 1 then
        vectorBase ← 16#BFC0.0200
    else
        if ArchitectureRevision ≥ 2 then
            /* The fixed value of EBase[31..30] forces the base to be in kseg0 or kseg1 */
            vectorBase ← EBase[31..12] || 16#000
        else
            vectorBase ← 16#8000.0000
        endif
    endif

    /* Exception PC is the sum of vectorBase and vectorOffset */
    PC ← vectorBase[31..30] || (vectorBase[29..0] + vectorOffset[29..0])
        /* No carry between bits 29 and 30 */
4.7 Debug Exception Processing
All debug exceptions have the same basic processing flow:
The DEPC register is loaded with the program counter (PC) value at which execution will be restarted and the DBD bit is set appropriately in the Debug register. The value loaded into the DEPC register is the current PC if the instruction is not in the delay slot of a branch, or the PC-4 of the branch if the instruction is in the delay slot of a branch.
The DSS, DBp, DDBL, DDBS, DIB, DINT, DIBImpr, DDBLImpr, and DDBSImpr bits in the Debug register are
updated appropriately depending on the debug exception type.
The Debug2 register is updated with additional information for complex breakpoints.
Halt and Doze bits in the Debug register are updated appropriately.
DM bit in the Debug register is set to 1.
The processor is started at the debug exception vector.
The value loaded into DEPC represents the restart address for the debug exception and need not be modified by the debug exception handler software in the usual case. Debug software need not look at the DBD bit in the Debug register unless it wishes to identify the address of the instruction that actually caused the debug exception.
A unique debug exception is indicated through the DSS, DBp, DDBL, DDBS, DIB, DINT, DIBImpr, DDBLImpr, and
DDBSImpr bits in the Debug register.
No other CP0 registers or fields are changed due to the debug exception, thus no additional state is saved.
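The debug exception entry steps described in this section can be sketched in C as follows. The structure and function names are hypothetical, the sketch covers only the 32-bit ISA mode case stated in the text (PC or PC-4), and the two vector addresses are the ones this chapter gives for ProbTrap = 0 and ProbTrap = 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEBUG_VECTOR_NORMAL 0xBFC00480u /* ProbTrap = 0 */
#define DEBUG_VECTOR_DMSEG  0xFF200200u /* ProbTrap = 1, in dmseg */

/* Hypothetical container for the state written on debug exception entry. */
struct debug_entry {
    uint32_t depc;   /* value loaded into DEPC */
    bool     dbd;    /* value written to the DBD bit of Debug */
    uint32_t vector; /* address at which execution continues */
};

static struct debug_entry debug_exception_entry(uint32_t pc,
                                                bool in_delay_slot,
                                                bool probtrap)
{
    assert((pc & 3u) == 0); /* sketch assumes word-aligned 32-bit ISA code */

    struct debug_entry e;
    e.depc   = in_delay_slot ? pc - 4 : pc; /* restart at the branch */
    e.dbd    = in_delay_slot;
    e.vector = probtrap ? DEBUG_VECTOR_DMSEG : DEBUG_VECTOR_NORMAL;
    return e;
}
```

The model deliberately omits the Debug status bits (DSS, DBp, and so on), Halt, Doze and DM, which depend on the type of debug exception and on core state at the time it is taken.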
Operation:
    if InstructionInBranchDelaySlot then
        DEPC ← PC-4
        Debug_DBD ← 1
    else
        DEPC ← PC
        Debug_DBD ← 0
    endif
    Debug (D* bits) ← DebugExceptionType
    Debug_Halt ← HaltStatusAtDebugException
    Debug_Doze ← DozeStatusAtDebugException
    Debug_DM ← 1
    if EJTAGControlRegister_ProbTrap = 1 then
        PC ← 0xFF20_0200
    else
        PC ← 0xBFC0_0480
    endif
The same debug exception vector location is used for all debug exceptions. The location is determined by the ProbTrap bit in the EJTAG Control register (ECR), as shown in Table 4.9.
Table 4.9 Debug Exception Vector Addresses
ProbTrap bit in ECR Register  Debug Exception Vector Address
0                             0xBFC0_0480
1                             0xFF20_0200 in dmseg
4.8 Exception Descriptions
The following subsections describe each of the exceptions listed in the same sequence as shown in Table 4.1.
4.8.1 Reset/SoftReset Exception
A reset exception occurs when the SI_ColdReset signal is asserted to the processor; a soft reset occurs when the SI_Reset signal is asserted. These exceptions are not maskable. When one of these exceptions occurs, the processor performs a full reset initialization, including aborting state machines, establishing critical state, and generally placing the processor in a state in which it can execute instructions from uncached, unmapped address space. On a Reset/SoftReset exception, the state of the processor is not defined, with the following exceptions:
The Config register is initialized with its boot state.
The RP, BEV, TS, SR, NMI, and ERL fields of the Status register are initialized to a specified state.
The ErrorEPC register is loaded with PC-4 if the state of the processor indicates that it was executing an instruction in the delay slot of a branch. Otherwise, the ErrorEPC register is loaded with PC. Note that this value may or may not be predictable.
PC is loaded with 0xBFC0_0000.
Cause Register ExcCode Value:
None
Additional State Saved:
None
Entry Vector Used:
Reset (0xBFC0_0000)
Operation:
    Config ← ConfigurationState
    Status_RP ← 0
    Status_BEV ← 1
    Status_TS ← 0
    Status_SR ← 0/1 (depending on Reset or SoftReset)
    Status_NMI ← 0
    Status_ERL ← 1
    if InstructionInBranchDelaySlot then
        ErrorEPC ← PC - 4
    else
        ErrorEPC ← PC
    endif
    PC ← 0xBFC0_0000
4.8.2 Debug Single Step Exception
A debug single step exception occurs after the CPU has executed one or two instructions in non-debug mode after returning from debug mode. One instruction is allowed to execute when returning to a non-jump/branch instruction; otherwise two instructions are allowed to execute, since the jump/branch and the instruction in the delay slot are executed as one step. Debug single step exceptions are enabled by the SSt bit in the Debug register, and are always disabled for the first one or two instructions after a DERET.
The DEPC register points to the instruction on which the debug single step exception occurred, which is also the next instruction to single step or execute when returning from debug mode. The DEPC therefore does not point to the instruction that has just been single stepped, but to the following instruction. The DBD bit in the Debug register is never set for a debug single step exception, since the jump/branch and the instruction in the delay slot are executed in one step.
Exceptions occurring on the instruction(s) executed with debug single step exception enabled are taken even though debug single step was enabled. For a normal exception (other than reset), a debug single step exception is then taken on the first instruction in the normal exception handler. Debug exceptions are unaffected by single step mode; for example, returning to an SDBBP instruction with debug single step exceptions enabled causes a debug software breakpoint exception, and DEPC points to the SDBBP instruction. However, returning to an instruction (not jump/branch) just before the SDBBP instruction causes a debug single step exception with the DEPC pointing to the SDBBP instruction.
To ensure proper functionality of single step, the debug single step exception has priority over all other exceptions, except reset and soft reset.
Debug Register Debug Status Bit Set
DSS
Additional State Saved
None
Entry Vector Used
Debug exception vector
4.8.3 Debug Interrupt Exception
A debug interrupt exception is either caused by the EjtagBrk bit in the EJTAG Control register (controlled through the TAP), or caused by the debug interrupt request signal to the CPU.
The debug interrupt exception is an asynchronous debug exception which is taken as soon as possible, but with no specific relation to the executed instructions. The DEPC register is set to the instruction where execution should continue after the debug handler has finished. The DBD bit is set based on whether the interrupted instruction was executing in the delay slot of a branch.
Debug Register Debug Status Bit Set
DINT
Additional State Saved
None
Entry Vector Used
Debug exception vector
4.8.4 Non-Maskable Interrupt (NMI) Exception
A non-maskable interrupt exception occurs when the SI_NMI signal is asserted to the processor. SI_NMI is an edge-sensitive signal: only one NMI exception will be taken each time it is asserted. An NMI exception occurs only at instruction boundaries, so it does not cause any reset or other hardware initialization. The state of the cache, memory, and other processor states are consistent and all registers are preserved, with the following exceptions:
The BEV, TS, SR, NMI, and ERL fields of the Status register are initialized to a specified state.
The ErrorEPC register is loaded with PC-4 if the state of the processor indicates that it was executing an instruction in the delay slot of a branch. Otherwise, the ErrorEPC register is loaded with PC.
PC is loaded with 0xBFC0_0000.
Cause Register ExcCode Value:
None
Additional State Saved:
None
Entry Vector Used:
Reset (0xBFC0_0000)
Operation:
    Status_BEV ← 1
    Status_TS ← 0
    Status_SR ← 0
    Status_NMI ← 1
    Status_ERL ← 1
    if InstructionInBranchDelaySlot then
        ErrorEPC ← PC - 4
    else
        ErrorEPC ← PC
    endif
    PC ← 0xBFC0_0000
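Because Reset, Soft Reset and NMI all enter at 0xBFC0_0000, boot code typically has to tell them apart from the Status fields written above. The following C sketch shows one way to do that. The bit positions (ERL = bit 2, NMI = bit 19, SR = bit 20, BEV = bit 22) are assumptions taken from the usual MIPS32 Status register layout and should be checked against the Status register description in this manual; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed Status register bit positions (verify against the register chapter). */
#define STATUS_ERL (1u << 2)
#define STATUS_NMI (1u << 19)
#define STATUS_SR  (1u << 20)
#define STATUS_BEV (1u << 22)

enum boot_cause { BOOT_COLD_RESET, BOOT_SOFT_RESET, BOOT_NMI };

/* Classify entry at the reset vector from a Status value read before
 * software has modified it. */
static enum boot_cause classify_boot(uint32_t status)
{
    /* All three exceptions set BEV and ERL. */
    assert((status & (STATUS_BEV | STATUS_ERL)) == (STATUS_BEV | STATUS_ERL));

    if (status & STATUS_NMI)
        return BOOT_NMI;        /* NMI: SR = 0, NMI = 1 */
    if (status & STATUS_SR)
        return BOOT_SOFT_RESET; /* Soft Reset: SR = 1, NMI = 0 */
    return BOOT_COLD_RESET;     /* Reset: SR = 0, NMI = 0 */
}
```

On real hardware the Status value would be read with an mfc0 instruction early in the reset handler; that step is outside this sketch.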
4.8.5 Interrupt Exception
The interrupt exception occurs when one or more of the eight hardware, two software, or timer interrupt requests is enabled by the Status register, and the interrupt input is asserted. See 4.3 “Interrupts” on page 57 for more details about the processing of interrupts.
Cause Register ExcCode Value:
Int
Additional State Saved:
Table 4.10 Register States on an Interrupt Exception
Register State  Value
Cause_IP        Indicates the interrupts that are pending.
Entry Vector Used:
See 4.3.2 “Generation of Exception Vector Offsets for Vectored Interrupts” on page 66 for the entry vector used, depending on the interrupt mode the processor is operating in.
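The condition "enabled by the Status register, and the interrupt input is asserted" can be modelled in C for the non-EIC interrupt modes, where Cause_IP and Status_IM are treated as matching 8-bit masks. This is a sketch under assumptions: the bit positions (IE = bit 0, EXL = bit 1, ERL = bit 2, IP/IM in bits 15..8) follow the usual MIPS32 layout rather than anything stated in this section, the function names are hypothetical, and debug mode is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (verify against the Status and Cause descriptions). */
#define STATUS_IE   (1u << 0)
#define STATUS_EXL  (1u << 1)
#define STATUS_ERL  (1u << 2)
#define IP_IM_SHIFT 8

/* True if interrupt line n (0..7) is both pending and unmasked. */
static bool int_line_requesting(uint32_t cause, uint32_t status, unsigned n)
{
    assert(n < 8);
    uint32_t bit = 1u << (IP_IM_SHIFT + n);
    return (cause & bit) != 0 && (status & bit) != 0;
}

/* True if an interrupt exception would be taken in this model. */
static bool interrupt_taken(uint32_t cause, uint32_t status)
{
    /* Interrupts are globally disabled unless IE = 1, EXL = 0 and ERL = 0. */
    if (!(status & STATUS_IE) || (status & (STATUS_EXL | STATUS_ERL)))
        return false;

    for (unsigned n = 0; n < 8; n++)
        if (int_line_requesting(cause, status, n))
            return true;
    return false;
}
```

In EIC mode the comparison is between a requested priority level and the current priority level instead of per-line masks, so this model does not apply there.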
4.8.6 Debug Instruction Break Exception
A debug instruction break exception occurs when an instruction hardware breakpoint matches an executed instruction. The DEPC register and DBD bit in the Debug register indicate the instruction that caused the instruction hardware breakpoint to match. This exception can only occur if instruction hardware breakpoints are implemented.
Debug Register Debug Status Bit Set:
DIB
Additional State Saved:
None
Entry Vector Used:
Debug exception vector
4.8.7 Address Error Exception — Instruction Fetch/Data Access
An address error exception occurs on an instruction or data access when an attempt is made to execute one of the following:
Fetch an instruction, load a word, or store a word that is not aligned on a word boundary
Load or store a halfword that is not aligned on a halfword boundary
Reference the kernel address space from user mode
Note that in the case of an instruction fetch that is not aligned on a word boundary, PC is updated before the condition is detected. Therefore, both EPC and BadVAddr point to the unaligned instruction address. In the case of a data access, the exception is taken if a load or store instruction referenced either an unaligned address or an address that was inaccessible in the current processor mode.
Cause Register ExcCode Value:
AdEL: Reference was a load or an instruction fetch
AdES: Reference was a store
Additional State Saved:
Table 4.11 CP0 Register States on an Address Exception Error
Register State Value
BadVAddr Failing address
Entry Vector Used:
General exception vector (offset 0x180)
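The three address error conditions listed in this subsection can be expressed as a short C check. The sketch assumes the usual MIPS32 memory map, in which user mode may only reference addresses below 0x8000_0000, and it treats an instruction fetch as a 4-byte access (32-bit ISA mode). All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum access_kind { ACCESS_FETCH, ACCESS_LOAD, ACCESS_STORE };
enum addr_err    { ADDR_OK, ADDR_ADEL, ADDR_ADES };

/* Hypothetical model of the address error check.
 *   va        virtual address of the access
 *   size      access size in bytes (1, 2 or 4)
 *   kind      fetch, load or store
 *   user_mode true if the processor is executing in user mode */
static enum addr_err check_address(uint32_t va, unsigned size,
                                   enum access_kind kind, bool user_mode)
{
    assert(size == 1 || size == 2 || size == 4);

    bool misaligned       = (va & (size - 1)) != 0;
    bool kernel_from_user = user_mode && va >= 0x80000000u; /* assumed map */

    if (!misaligned && !kernel_from_user)
        return ADDR_OK;

    /* AdES for stores; AdEL for loads and instruction fetches. */
    return kind == ACCESS_STORE ? ADDR_ADES : ADDR_ADEL;
}
```

When the check fails, the failing address is what the hardware records in BadVAddr, as Table 4.11 states.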
4.8.8 SRAM Parity Error Exception
An SRAM error exception occurs when an instruction or data reference detects a data error. This exception is not maskable. To avoid disturbing the error in the cache array, the exception vector is to an unmapped, uncached address. This exception is precise.
Cause Register ExcCode Value
N/A
Additional State Saved
Table 4.12 CP0 Register States on a SRAM Parity Error Exception
Register State Value
CacheErr Error state
ErrorEPC Restart PC
Entry Vector Used
Cache error vector (offset 16#100)
4.8.9 Bus Error Exception — Instruction Fetch or Data Access
A bus error exception occurs when an instruction or data access makes a bus request and that request terminates in an error. The bus error exception can occur on either an instruction fetch or a data access. Bus error exceptions that occur on an instruction fetch have a higher priority than bus error exceptions that occur on a data access.
Bus errors taken on any external access on the M14K core are always precise.
Cause Register ExcCode Value:
IBE: Error on an instruction reference
DBE: Error on a data reference
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.10 Protection Exception
The protection exception occurs when an access is attempted to memory that has been protected by the Memory Protection Unit or, under certain circumstances, when a write to the EBase register is attempted. See the "Security Features of the M14K™ Processor Family" (MD00896) for more information.
Cause Register ExcCode Value:
Prot (Cause Code 29)
Additional State Saved:
MPU Config Register, Triggered Field
MPU StatusN Register, Cause* Fields
Entry Vector Used
General exception vector (offset 0x180)
4.8.11 Debug Software Breakpoint Exception
A debug software breakpoint exception occurs when an SDBBP instruction is executed. The DEPC register and DBD bit in the Debug register will indicate the SDBBP instruction that caused the debug exception.
Debug Register Debug Status Bit Set:
DBp
Additional State Saved:
None
Entry Vector Used:
Debug exception vector
4.8.12 Execution Exception — System Call
The system call exception is one of the execution exceptions. All of these exceptions have the same priority. A system call exception occurs when a SYSCALL instruction is executed.
Cause Register ExcCode Value:
Sys
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.13 Execution Exception — Breakpoint
The breakpoint exception is one of the execution exceptions. All of these exceptions have the same priority. A breakpoint exception occurs when a BREAK instruction is executed.
Cause Register ExcCode Value:
Bp
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.14 Execution Exception — Reserved Instruction
The reserved instruction exception is one of the execution exceptions. All of these exceptions have the same priority. A reserved instruction exception occurs when a reserved or undefined major opcode or function field is executed. This includes Coprocessor 2 instructions that are decoded as reserved by Coprocessor 2.
Cause Register ExcCode Value:
RI
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.15 Execution Exception — Coprocessor Unusable
The coprocessor unusable exception is one of the execution exceptions. All of these exceptions have the same priority. A coprocessor unusable exception occurs when an attempt is made to execute a coprocessor instruction for one of the following:
a corresponding coprocessor unit that has not been marked usable by setting its CU bit in the Status register
CP0 instructions, when the unit has not been marked usable, and the processor is executing in user mode
Cause Register ExcCode Value:
CpU
Additional State Saved:
Table 4.13 Register States on a Coprocessor Unusable Exception
Register State  Value
Cause_CE        Unit number of the coprocessor being referenced
Entry Vector Used:
General exception vector (offset 0x180)
4.8.16 Execution Exception — CorExtend Unusable
The CorExtend unusable exception is one of the execution exceptions. All of these exceptions have the same priority. A CorExtend Unusable exception occurs when an attempt is made to execute a CorExtend instruction when Status_CEE is cleared. It is implementation-dependent whether this functionality is supported. Generally, the functionality will only be supported if a CorExtend block contains local destination registers.
Cause Register ExcCode Value:
CEU
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.17 Execution Exception — Coprocessor 2 Exception
The Coprocessor 2 exception is one of the execution exceptions. All of these exceptions have the same priority. A Coprocessor 2 exception occurs when a valid Coprocessor 2 instruction causes a general exception in Coprocessor 2.
Cause Register ExcCode Value:
C2E
Additional State Saved:
Depending on the Coprocessor 2 implementation, additional state information of the exception can be saved in a Coprocessor 2 control register.
Entry Vector Used:
General exception vector (offset 0x180)
4.8.18 Execution Exception — Implementation-Specific 1 Exception
The Implementation-Specific 1 exception is one of the execution exceptions. All of these exceptions have the same priority. An implementation-specific 1 exception occurs when a valid Coprocessor 2 instruction causes an implementation-specific 1 exception in Coprocessor 2.
Cause Register ExcCode Value:
IS1
Additional State Saved:
Depending on the coprocessor 2 implementation, additional state information of the exception can be saved in a coprocessor 2 control register.
Entry Vector Used:
General exception vector (offset 0x180)
4.8.19 Execution Exception — Integer Overflow
The integer overflow exception is one of the execution exceptions. All of these exceptions have the same priority. An integer overflow exception occurs when selected integer instructions result in a 2’s complement overflow.
Cause Register ExcCode Value:
Ov
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
4.8.20 Execution Exception — Trap
The trap exception is one of the execution exceptions. All of these exceptions have the same priority. A trap exception occurs when a trap instruction results in a TRUE value.
Cause Register ExcCode Value:
Tr
Additional State Saved:
None
Entry Vector Used:
General exception vector (offset 0x180)
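A general exception handler usually starts by decoding Cause_ExcCode to find out which of the exceptions described above was taken. The C sketch below shows that decoding step. Only the Prot value (29) is stated in this chapter; the field positions (ExcCode in bits 6..2, CE in bits 29..28) and the other numeric codes are assumptions based on the usual MIPS32 encoding and should be confirmed against the Cause register description. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field positions in the Cause register. */
static unsigned cause_exccode(uint32_t cause) { return (cause >> 2) & 0x1Fu; }
static unsigned cause_ce(uint32_t cause)      { return (cause >> 28) & 0x3u; }

/* Map an ExcCode value to the mnemonic used in this chapter.
 * All codes except 29 (Prot) are assumed, not taken from this section. */
static const char *exccode_name(unsigned code)
{
    assert(code < 32);
    switch (code) {
    case 0:  return "Int";
    case 4:  return "AdEL";
    case 5:  return "AdES";
    case 6:  return "IBE";
    case 7:  return "DBE";
    case 8:  return "Sys";
    case 9:  return "Bp";
    case 10: return "RI";
    case 11: return "CpU";
    case 12: return "Ov";
    case 13: return "Tr";
    case 16: return "IS1";
    case 17: return "CEU";
    case 18: return "C2E";
    case 29: return "Prot";
    default: return "other";
    }
}
```

For a coprocessor unusable exception, cause_ce() gives the unit number of the coprocessor that was referenced; for other exception types the CE field is not defined.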
4.8.21 Debug Data Break Exception
A debug data break exception occurs when a data hardware breakpoint matches the load/store transaction of an executed load/store instruction. The DEPC register and DBD bit in the Debug register will indicate the load/store instruction that caused the data hardware breakpoint to match. The load/store instruction that caused the debug exception has not completed (for example, it has not updated the register file), and the instruction can be re-executed after returning from the debug handler.
Debug Register Debug Status Bit Set:
DDBL for a load instruction or DDBS for a store instruction
Additional State Saved:
None
Entry Vector Used:
Debug exception vector
4.8.22 Complex Break Exception
A complex data break exception occurs when the complex hardware breakpoint detects an enabled breakpoint. Complex breaks are taken imprecisely: the instruction that actually caused the exception is allowed to complete, and the DEPC register and DBD bit in the Debug register point to a following instruction.
Debug Register Debug Status Bit Set:
DIBImpr, DDBLImpr, and/or DDBSImpr
Additional State Saved:
Debug2 fields indicate which type(s) of complex breakpoints were detected.
Entry Vector Used:
Debug exception vector
4.9 Exception Handling and Servicing Flowcharts
The remainder of this chapter contains flowcharts for the following exceptions and guidelines for their handlers:
General exceptions and their exception handler
Reset, soft reset and NMI exceptions, and a guideline to their handler
Debug exceptions
Figure 4.3 General Exception Handler (HW)
[Flowchart; the steps recoverable from the figure are listed below.]
Applies to exceptions other than Reset, SoftReset, NMI, EJTAG Debug and cache error, or first-level TLB miss. Note: Interrupts can be masked by IE or IMs, and Watch is masked if EXL = 1.
Set Cause ExcCode and CE; BadVA ← VA. BadVA is set only for AdEL/S exceptions. Note: it is not set for a Bus Error.
Check Status_EXL to determine whether the exception occurred within another exception. If EXL = 0, check whether the instruction is in a branch delay slot: if yes, EPC ← (PC - 4) and Cause_BD ← 1; if no, EPC ← PC and Cause_BD ← 0. If EXL = 1, this step is skipped.
Status_EXL ← 1. The processor is forced to Kernel Mode and interrupts are disabled.
Check Status_BEV: if 0 (normal), PC ← 0x8000_0000 + 0x180 (unmapped, cached); if 1 (bootstrap), PC ← 0xBFC0_0200 + 0x180 (unmapped, uncached).
Continue to the General Exception Servicing Guidelines.