Xilinx MicroBlaze Processor Reference Guide

UG984 (v2018.2) June 21, 2018
Revision History
The following table shows the revision history for this document.
Date Version Revision
06/21/2018 2018.2 Released with Vivado® Design Suite 2018.2 without changes from 2018.1.
04/04/2018 2018.1 Updated for Vivado 2018.1 release:
· Included information about instruction pipeline hazards and forwarding.
· Clarified that software break does not set the BIP bit in MSR.
· Explained memory scrubbing behavior.
· Added more detailed description of sleep and pause usage.
· Clarified use of parallel debug clock and reset.
10/04/2017 2017.3 Updated for Vivado 2017.3 release:
· Added automotive UltraScale+ Zynq and Spartan-7 devices.
· Updated description of debug trace, to add event trace, new in version 10.0.
· Added 4PB extended address size.
· Clarified description of cache trace signals.
04/05/2017 2017.1 Updated for Vivado 2017.1 release:
· Added description of MMU Physical Address Extension (PAE), new in version 10.0.
· Extended privileged instruction list, and updated instruction descriptions.
· Updated information on debug program trace.
· Added reference to the Triple Modular Redundancy (TMR) subsystem.
· Corrected description of BSIFI instruction.
· Updated MFSE instruction description with PAE information.
· Added MTSE instruction used with PAE, new in version 10.0.
· Updated WDC instruction for external cache invalidate and flush.
10/05/2016 2016.3 Updated for Vivado 2016.3 release:
· Added description of frequency optimized 8-stage pipeline, new in version 10.0.
· Described bit field instructions, new in version 10.0.
· Included information on parallel debug interface, new in version 10.0.
· Added version 10.0 to MicroBlaze release version code in PVR.
· Included Spartan-7 target architecture in PVR.
· Updated description of MSR reset value.
· Updated Xilinx
04/06/2016 2016.1 Updated for Vivado 2016.1 release:
· Included description of address extension, new in version 9.6.
· Included description of pipeline pause functionality, new in version 9.6.
· Included description of non-secure AXI access support, new in version 9.6.
· Included description of hibernate and suspend instructions, new in version 9.6.
· Added version 9.6 to MicroBlaze release version code in PVR.
· Corrected references to Table 2-46 and Table 2-47.
· Replaced references to the deprecated Xilinx Microprocessor Debugger (XMD) with Xilinx System Debugger (XSDB).
· Removed C code function attributes svc_handler and svc_table_handler.
04/15/2015 2015.1 Updated for Vivado 2015.1 release:
· Included description of 16 word cache line length, new in version 9.5.
· Added version 9.5 to MicroBlaze release version code in PVR.
· Corrected description of supported endianness and parameter C_ENDIANNESS.
· Corrected description of outstanding reads for instruction and data cache.
· Updated FPGA configuration memory protection document reference [Ref 5].
· Corrected Bus Index Range definitions for Lockstep Comparison in Table 3-14.
· Clarified registers altered for IDIV instruction.
· Corrected PVR assembler mnemonics for MFS instruction.
· Updated performance and resource utilization for 2015.1.
· Added references to training resources.
10/01/2014 2014.3 Updated for Vivado 2014.3 release:
· Corrected semantic description for PCMPEQ and PCMPNE in Table 2.1.
· Added version 9.4 to MicroBlaze release version code in PVR.
· Included description of external program trace, new in version 9.4.
04/02/2014 2014.1 Updated for Vivado 2014.1 release:
· Added v9.3 to MicroBlaze release version code in PVR.
· Clarified availability and behavior of stack protection registers.
· Corrected description of LMB instruction and data bus exception.
· Included description of extended debug features, new in version 9.3: performance monitoring, program trace and non-intrusive profiling.
· Included definition of Reset Mode signals, new in version 9.3.
· Clarified how the AXI4-Stream TLAST signal is handled.
· Added UltraScale and updated performance and resource utilization for 2014.1.
12/18/2013 2013.4 Updated for Vivado 2013.4 release.
10/02/2013 2013.3 Updated for Vivado 2013.3 release.
06/19/2013 2013.2 Updated for Vivado 2013.2 release.
03/20/2013 2013.1 Initial Xilinx release. This User Guide is derived from UG081.
Table of Contents
Chapter 1: Introduction
  Guide Contents
Chapter 2: MicroBlaze Architecture
  Introduction
  Overview
  Data Types and Endianness
  Instructions
  Registers
  Pipeline Architecture
  Memory Architecture
  Privileged Instructions
  Virtual-Memory Management
  Reset, Interrupts, Exceptions, and Break
  Instruction Cache
  Data Cache
  Floating-Point Unit (FPU)
  Stream Link Interfaces
  Debug and Trace
  Fault Tolerance
  Lockstep Operation
  Coherency
  Data and Instruction Address Extension
Chapter 3: MicroBlaze Signal Interface Description
  Introduction
  Overview
  MicroBlaze I/O Overview
  AXI4 and ACE Interface Description
  Local Memory Bus (LMB) Interface Description
  Lockstep Interface Description
  Debug Interface Description
  Trace Interface Description
  MicroBlaze Core Configurability
Chapter 4: MicroBlaze Application Binary Interface
  Introduction
  Data Types
  Register Usage Conventions
  Stack Convention
  Memory Model
  Interrupt, Break and Exception Handling
Chapter 5: MicroBlaze Instruction Set Architecture
  Introduction
  Notation
  Formats
  Instructions
Appendix A: Performance and Resource Utilization
  Performance
  Resource Utilization
  IP Characterization and fMAX Margin System Methodology
Appendix B: Additional Resources and Legal Notices
  Xilinx Resources
  Solution Centers
  Documentation Navigator and Design Hubs
  References
  Training Resources
  Please Read: Important Legal Notices
Chapter 1: Introduction
The MicroBlaze™ Processor Reference Guide provides information about the 32-bit soft processor, MicroBlaze, which is included in Vivado. The document is intended as a guide to the MicroBlaze hardware architecture.
Guide Contents
This guide contains the following chapters:
Chapter 2, MicroBlaze Architecture contains an overview of MicroBlaze features as well as information on Big-Endian and Little-Endian bit-reversed format, 32-bit general purpose registers, cache software support, and AXI4-Stream interfaces.
Chapter 3, MicroBlaze Signal Interface Description describes the types of signal interfaces that can be used to connect MicroBlaze.
Chapter 4, MicroBlaze Application Binary Interface describes the Application Binary Interface important for developing software in assembly language for the processor.
Chapter 5, MicroBlaze Instruction Set Architecture provides notation, formats, and instructions for the Instruction Set Architecture (ISA) of MicroBlaze.
Appendix A, Performance and Resource Utilization contains maximum frequencies and resource utilization numbers for different configurations and devices.
Appendix B, Additional Resources and Legal Notices provides links to documentation and additional resources.
Chapter 2: MicroBlaze Architecture
Introduction
This chapter contains an overview of MicroBlaze™ features and detailed information on MicroBlaze architecture including Big-Endian or Little-Endian bit-reversed format, 32-bit general purpose registers, virtual-memory management, cache software support, and AXI4-Stream interfaces.
Overview
The MicroBlaze embedded processor soft core is a reduced instruction set computer (RISC) optimized for implementation in Xilinx® Field Programmable Gate Arrays (FPGAs). The following figure shows a functional block diagram of the MicroBlaze core.
Figure 2-1: MicroBlaze Core Block Diagram
[Figure: block diagram of the MicroBlaze core. Instruction-side bus interface: M_AXI_IC, M_ACE_IC, M_AXI_IP, ILMB. Data-side bus interface: M_AXI_DC, M_ACE_DC, M_AXI_DP, DLMB. Stream interfaces: M0_AXIS .. M15_AXIS, S0_AXIS .. S15_AXIS. Blocks: Program Counter, Instruction Buffer, Instruction Decode, Branch Target Cache, Special Purpose Registers, Register File (32 x 32b), ALU, Shift, Barrel Shift, Multiplier, Divider, FPU, I-Cache, D-Cache, Bus IF, and Memory Management Unit (MMU) with ITLB, DTLB, and UTLB. The legend marks optional MicroBlaze features.]
Features
The MicroBlaze soft core processor is highly configurable, allowing you to select a specific set of features required by your design.
The fixed feature set of the processor includes:
Thirty-two 32-bit general purpose registers
32-bit instruction word with three operands and two addressing modes
Default 32-bit address bus, extensible to 64 bits
Single issue pipeline
In addition to these fixed features, the MicroBlaze processor is parameterized to allow selective enabling of additional functionality. Older (deprecated) versions of MicroBlaze support a subset of the optional features described in this manual. Only the latest (preferred) version of MicroBlaze (v10.0) supports all options.
RECOMMENDED: Xilinx recommends that all new designs use the latest preferred version of the MicroBlaze processor.
The following table provides an overview of the configurable features by MicroBlaze versions.
Table 2-1: Configurable Feature Overview by MicroBlaze Version
Feature | v9.2 | v9.3 | v9.4 | v9.5 | v9.6 | v10.0
Version Status | deprecated | deprecated | deprecated | deprecated | deprecated | preferred
Processor pipeline depth | 3/5 | 3/5 | 3/5 | 3/5 | 3/5 | 3/5/8
Local Memory Bus (LMB) data side interface | option | option | option | option | option | option
Local Memory Bus (LMB) instruction side interface | option | option | option | option | option | option
Hardware barrel shifter | option | option | option | option | option | option
Hardware divider | option | option | option | option | option | option
Hardware debug logic | option | option | option | option | option | option
Stream link interfaces | 0-16 AXI | 0-16 AXI | 0-16 AXI | 0-16 AXI | 0-16 AXI | 0-16 AXI
Machine status set and clear instructions | option | option | option | option | option | option
Cache line word length | 4, 8 | 4, 8 | 4, 8 | 4, 8, 16 | 4, 8, 16 | 4, 8, 16
Hardware exception support | option | option | option | option | option | option
Pattern compare instructions | option | option | option | option | option | option
Floating-point unit (FPU) | option | option | option | option | option | option
Disable hardware multiplier¹ | option | option | option | option | option | option
Hardware debug readable ESR and EAR | Yes | Yes | Yes | Yes | Yes | Yes
Processor Version Register (PVR) | option | option | option | option | option | option
Area or speed optimized | option | option | option | option | option | option
Hardware multiplier 64-bit result | option | option | option | option | option | option
LUT cache memory | option | option | option | option | option | option
Floating-point conversion and square root instructions | option | option | option | option | option | option
Memory Management Unit (MMU) | option | option | option | option | option | option
Extended stream instructions | option | option | option | option | option | option
Use Cache Interface for All I-Cache Memory Accesses | option | option | option | option | option | option
Use Cache Interface for All D-Cache Memory Accesses | option | option | option | option | option | option
Use Write-back Caching Policy for D-Cache | option | option | option | option | option | option
Branch Target Cache (BTC) | option | option | option | option | option | option
Streams for I-Cache | option | option | option | option | option | option
Victim handling for I-Cache | option | option | option | option | option | option
Victim handling for D-Cache | option | option | option | option | option | option
AXI4 (M_AXI_DP) data side interface | option | option | option | option | option | option
AXI4 (M_AXI_IP) instruction side interface | option | option | option | option | option | option
AXI4 (M_AXI_DC) protocol for D-Cache | option | option | option | option | option | option
AXI4 (M_AXI_IC) protocol for I-Cache | option | option | option | option | option | option
AXI4 protocol for stream accesses | option | option | option | option | option | option
Fault tolerant features | option | option | option | option | option | option
Force distributed RAM for cache tags | option | option | option | option | option | option
Configurable cache data widths | option | option | option | option | option | option
Count Leading Zeros instruction | option | option | option | option | option | option
Memory Barrier instruction | Yes | Yes | Yes | Yes | Yes | Yes
Stack overflow and underflow detection | option | option | option | option | option | option
Allow stream instructions in user mode | option | option | option | option | option | option
Lockstep support | option | option | option | option | option | option
Configurable use of FPGA primitives | option | option | option | option | option | option
Low-latency interrupt mode | option | option | option | option | option | option
Swap instructions | option | option | option | option | option | option
Sleep mode and sleep instruction | Yes | Yes | Yes | Yes | Yes | Yes
Relocatable base vectors | option | option | option | option | option | option
ACE (M_ACE_DC) protocol for D-Cache | option | option | option | option | option | option
ACE (M_ACE_IC) protocol for I-Cache | option | option | option | option | option | option
Extended debug: performance monitoring, program trace, non-intrusive profiling | | option | option | option | option | option
Reset mode: enter sleep or debug halt at reset | | option | option | option | option | option
Extended debug: external program trace | | | option | option | option | option
Extended data addressing | | | | | option | option
Pipeline pause functionality | | | | | Yes | Yes
Hibernate and suspend instructions | | | | | Yes | Yes
Non-secure mode | | | | | Yes | Yes
Bit field instructions² | | | | | | option
Parallel debug interface | | | | | | option
MMU Physical Address Extension | | | | | | option
1. Used for saving DSP48E primitives.
2. Bit field instructions are available when C_USE_BARREL = 1.
Data Types and Endianness
The MicroBlaze processor uses Big-Endian or Little-Endian format to represent data, depending on the selected endianness. The parameter C_ENDIANNESS is set to 1 (little-endian) by default.
The hardware supported data types for MicroBlaze are word, half word, and byte. When using the reversed load and store instructions LHUR, LWR, SHR, and SWR, the bytes in the data are reversed, as indicated by the byte-reversed order.
The following tables show the bit and byte organization for each type.
Table 2-2: Word Data Type
Big-Endian Byte Address | n | n+1 | n+2 | n+3
Big-Endian Byte Significance | MSByte | | | LSByte
Big-Endian Byte Order | n | n+1 | n+2 | n+3
Big-Endian Byte-Reversed Order | n+3 | n+2 | n+1 | n
Little-Endian Byte Address | n+3 | n+2 | n+1 | n
Little-Endian Byte Significance | MSByte | | | LSByte
Little-Endian Byte Order | n+3 | n+2 | n+1 | n
Little-Endian Byte-Reversed Order | n | n+1 | n+2 | n+3
Bit Label | 0 | | | 31
Bit Significance | MSBit | | | LSBit
Table 2-3: Half Word Data Type
Big-Endian Byte Address | n | n+1
Big-Endian Byte Significance | MSByte | LSByte
Big-Endian Byte Order | n | n+1
Big-Endian Byte-Reversed Order | n+1 | n
Little-Endian Byte Address | n+1 | n
Little-Endian Byte Significance | MSByte | LSByte
Little-Endian Byte Order | n+1 | n
Little-Endian Byte-Reversed Order | n | n+1
Bit Label | 0 | 15
Bit Significance | MSBit | LSBit
Table 2-4: Byte Data Type
Byte Address | n
Bit Label | 0 | 7
Bit Significance | MSBit | LSBit
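As a rough illustration of the word byte orders in Table 2-2, the following Python sketch models how a 32-bit word is laid out at byte addresses n to n+3 for each endianness, and what a byte-reversed word load would return. It is a host-side model only, not MicroBlaze code. The helper names store_word and load_word_reversed are hypothetical and exist only for this example.

```python
import struct


def store_word(value: int, little_endian: bool) -> bytes:
    """Return the four bytes found at addresses n, n+1, n+2, n+3 (hypothetical helper)."""
    return struct.pack("<I" if little_endian else ">I", value)


def load_word_reversed(mem: bytes, little_endian: bool) -> int:
    """Model a byte-reversed word load by reading with the opposite byte order."""
    return struct.unpack(">I" if little_endian else "<I", mem)[0]


word = 0x11223344
big = store_word(word, little_endian=False)    # MSByte at address n
little = store_word(word, little_endian=True)  # MSByte at address n+3
reversed_from_big = load_word_reversed(big, little_endian=False)
reversed_from_little = load_word_reversed(little, little_endian=True)
```

In this model the big-endian layout places 0x11 at address n, while the little-endian layout places it at address n+3. A byte-reversed load of either layout yields the same word with its four bytes in the opposite order.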
Instructions
Instruction Summary
All MicroBlaze instructions are 32 bits and are defined as either Type A or Type B. Type A instructions have up to two source register operands and one destination register operand. Type B instructions have one source register and a 16-bit immediate operand (which can be extended to 32 bits by preceding the Type B instruction with an imm instruction).
Type B instructions have a single destination register operand. Instructions are provided in the following functional categories: arithmetic, logical, branch, load/store, and special. Table 2-6 lists the MicroBlaze instruction set. See Chapter 5, MicroBlaze Instruction Set Architecture, for more information on these instructions. The following table describes the instruction set nomenclature used in the semantics of each instruction.
Table 2-5: Instruction Set Nomenclature
Symbol Description
Ra R0 - R31, General Purpose Register, source operand a
Rb R0 - R31, General Purpose Register, source operand b
Rd R0 - R31, General Purpose Register, destination operand
SPR[x] Special Purpose Register number x
MSR Machine Status Register = SPR[1]
ESR Exception Status Register = SPR[5]
EAR Exception Address Register = SPR[3]
FSR Floating-point Unit Status Register = SPR[7]
PVRx Processor Version Register, where x is the register number = SPR[8192 + x]
BTR Branch Target Register = SPR[11]
PC Execute stage Program Counter = SPR[0]
x[y] Bit y of register x
x[y:z] Bit range y to z of register x
x̄ (not x) Bit inverted value of register x
Imm 16 bit immediate value
Immx x bit immediate value
FSLx 4 bit AXI4-Stream port designator, where x is the port number
C Carry flag, MSR[29]
Sa Special Purpose Register, source operand
Sd Special Purpose Register, destination operand
s(x) Sign extend argument x to 32-bit value
*Addr Memory contents at location Addr (data-size aligned)
:= Assignment operator
= Equality comparison
!= Inequality comparison
> Greater than comparison
>= Greater than or equal comparison
< Less than comparison
<= Less than or equal comparison
+ Arithmetic add
* Arithmetic multiply
/ Arithmetic divide
>> x Bit shift right x bits
<< x Bit shift left x bits
and Logic AND
or Logic OR
xor Logic exclusive OR
op1 if cond else op2 Perform op1 if condition cond is true, else perform op2
& Concatenate. For example “0000100 & Imm7” is the concatenation of the fixed field “0000100” and a 7 bit immediate value.
signed Operation performed on signed integer data type. All arithmetic operations are performed on signed word operands, unless otherwise specified
unsigned Operation performed on unsigned integer data type
float Operation performed on floating-point data type
clz(r) Count leading zeros
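The nomenclature above can be made concrete with a small Python model. This is a minimal sketch under stated assumptions, not a description of the hardware. It assumes that bit 0 is the most significant bit (as in the bit labels of the data type tables), and that an imm instruction supplies the upper 16 bits of the 32-bit operand while the following Type B instruction supplies the lower 16 bits. The function names bits, s, and type_b_operand are hypothetical.

```python
from typing import Optional

MASK32 = 0xFFFFFFFF


def bits(x: int, y: int, z: int, width: int = 32) -> int:
    """x[y:z] with bit 0 as the most significant bit of a width-bit value."""
    length = z - y + 1
    shift = width - 1 - z
    return (x >> shift) & ((1 << length) - 1)


def s(x: int, from_bits: int = 16) -> int:
    """s(x): sign extend a from_bits-wide value to a 32-bit value."""
    sign = 1 << (from_bits - 1)
    return ((x ^ sign) - sign) & MASK32


def type_b_operand(imm16: int, imm_prefix: Optional[int] = None) -> int:
    """32-bit immediate operand of a Type B instruction.

    Without a preceding imm instruction the 16-bit field is sign extended.
    With one, the prefix is assumed to form the upper 16 bits.
    """
    if imm_prefix is None:
        return s(imm16)
    return ((imm_prefix & 0xFFFF) << 16) | (imm16 & 0xFFFF)


low_byte = bits(0x12345678, 24, 31)           # Ra[24:31], the least significant byte
extended = type_b_operand(0x8000)             # sign extended 16-bit immediate
combined = type_b_operand(0x5678, 0x1234)     # immediate preceded by imm 0x1234
```

Under these assumptions, x[24:31] selects the least significant byte of a word, which is the field that SEXT8 sign extends.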
Table 2-6: MicroBlaze Instruction Set Summary
Type A 0-5 6-10 11-15 16-20 21-31 Semantics
Type B 0-5 6-10 11-15 16-31 Semantics
ADD Rd,Ra,Rb 000000 Rd Ra Rb 00000000000 Rd := Rb + Ra
RSUB Rd,Ra,Rb 000001 Rd Ra Rb 00000000000 Rd := Rb + not Ra + 1
ADDC Rd,Ra,Rb 000010 Rd Ra Rb 00000000000 Rd := Rb + Ra + C
RSUBC Rd,Ra,Rb 000011 Rd Ra Rb 00000000000 Rd := Rb + not Ra + C
ADDK Rd,Ra,Rb 000100 Rd Ra Rb 00000000000 Rd := Rb + Ra
RSUBK Rd,Ra,Rb 000101 Rd Ra Rb 00000000000 Rd := Rb + not Ra + 1
CMP Rd,Ra,Rb 000101 Rd Ra Rb 00000000001 Rd := Rb + not Ra + 1
Rd[0] := 0 if (Rb >= Ra) else Rd[0] := 1
CMPU Rd,Ra,Rb 000101 Rd Ra Rb 00000000011 Rd := Rb + not Ra + 1 (unsigned)
Rd[0] := 0 if (Rb >= Ra, unsigned) else Rd[0] := 1
ADDKC Rd,Ra,Rb 000110 Rd Ra Rb 00000000000 Rd := Rb + Ra + C
RSUBKC Rd,Ra,Rb 000111 Rd Ra Rb 00000000000 Rd := Rb + not Ra + C
ADDI Rd,Ra,Imm 001000 Rd Ra Imm Rd := s(Imm) + Ra
RSUBI Rd,Ra,Imm 001001 Rd Ra Imm Rd := s(Imm) + not Ra + 1
ADDIC Rd,Ra,Imm 001010 Rd Ra Imm Rd := s(Imm) + Ra + C
RSUBIC Rd,Ra,Imm 001011 Rd Ra Imm Rd := s(Imm) + not Ra + C
ADDIK Rd,Ra,Imm 001100 Rd Ra Imm Rd := s(Imm) + Ra
RSUBIK Rd,Ra,Imm 001101 Rd Ra Imm Rd := s(Imm) + not Ra + 1
ADDIKC Rd,Ra,Imm 001110 Rd Ra Imm Rd := s(Imm) + Ra + C
RSUBIKC Rd,Ra,Imm 001111 Rd Ra Imm Rd := s(Imm) + not Ra + C
MUL Rd,Ra,Rb 010000 Rd Ra Rb 00000000000 Rd := Ra * Rb
MULH Rd,Ra,Rb 010000 Rd Ra Rb 00000000001 Rd := (Ra * Rb) >> 32 (signed)
MULHU Rd,Ra,Rb 010000 Rd Ra Rb 00000000011 Rd := (Ra * Rb) >> 32 (unsigned)
MULHSU Rd,Ra,Rb 010000 Rd Ra Rb 00000000010 Rd := (Ra, signed * Rb, unsigned) >> 32 (signed)
BSRL Rd,Ra,Rb 010001 Rd Ra Rb 00000000000 Rd := 0 & (Ra >> Rb)
BSRA Rd,Ra,Rb 010001 Rd Ra Rb 01000000000 Rd := s(Ra >> Rb)
BSLL Rd,Ra,Rb 010001 Rd Ra Rb 10000000000 Rd := (Ra << Rb) & 0
IDIV Rd,Ra,Rb 010010 Rd Ra Rb 00000000000 Rd := Rb/Ra
IDIVU Rd,Ra,Rb 010010 Rd Ra Rb 00000000010 Rd := Rb/Ra, unsigned
TNEAGETD Rd,Rb 010011 Rd 00000 Rb 0N0TAE00000 Rd := FSL Rb[28:31] (data read)
MSR[FSL] := 1 if (FSL_S_Control = 1)
MSR[C] := not FSL_S_Exists if N = 1
TNAPUTD Ra,Rb 010011 00000 Ra Rb 0N0TA000000 FSL Rb[28:31] := Ra (data write)
MSR[C] := FSL_M_Full if N = 1
TNECAGETD Rd,Rb 010011 Rd 00000 Rb 0N1TAE00000 Rd := FSL Rb[28:31] (control read)
MSR[FSL] := 1 if (FSL_S_Control = 0)
MSR[C] := not FSL_S_Exists if N = 1
TNCAPUTD Ra,Rb 010011 00000 Ra Rb 0N1TA000000 FSL Rb[28:31] := Ra (control write)
MSR[C] := FSL_M_Full if N = 1
FADD Rd,Ra,Rb 010110 Rd Ra Rb 00000000000 Rd := Rb+Ra, float¹
FRSUB Rd,Ra,Rb 010110 Rd Ra Rb 00010000000 Rd := Rb-Ra, float¹
FMUL Rd,Ra,Rb 010110 Rd Ra Rb 00100000000 Rd := Rb*Ra, float¹
FDIV Rd,Ra,Rb 010110 Rd Ra Rb 00110000000 Rd := Rb/Ra, float¹
FCMP.UN Rd,Ra,Rb 010110 Rd Ra Rb 01000000000 Rd := 1 if (Rb = NaN or Ra = NaN, float¹) else
Rd := 0
FCMP.LT Rd,Ra,Rb 010110 Rd Ra Rb 01000010000 Rd := 1 if (Rb < Ra, float¹) else
Rd := 0
FCMP.EQ Rd,Ra,Rb 010110 Rd Ra Rb 01000100000 Rd := 1 if (Rb = Ra, float¹) else
Rd := 0
FCMP.LE Rd,Ra,Rb 010110 Rd Ra Rb 01000110000 Rd := 1 if (Rb <= Ra, float¹) else
Rd := 0
FCMP.GT Rd,Ra,Rb 010110 Rd Ra Rb 01001000000 Rd := 1 if (Rb > Ra, float¹) else
Rd := 0
FCMP.NE Rd,Ra,Rb 010110 Rd Ra Rb 01001010000 Rd := 1 if (Rb != Ra, float¹) else
Rd := 0
FCMP.GE Rd,Ra,Rb 010110 Rd Ra Rb 01001100000 Rd := 1 if (Rb >= Ra, float¹) else
Rd := 0
FLT Rd,Ra 010110 Rd Ra 0 01010000000 Rd := float (Ra)¹
FINT Rd,Ra 010110 Rd Ra 0 01100000000 Rd := int (Ra)¹
FSQRT Rd,Ra 010110 Rd Ra 0 01110000000 Rd := sqrt (Ra)¹
MULI Rd,Ra,Imm 011000 Rd Ra Imm Rd := Ra * s(Imm)
BSRLI Rd,Ra,Imm 011001 Rd Ra 00000000000 & Imm5 Rd := 0 & (Ra >> Imm5)
BSRAI Rd,Ra,Imm 011001 Rd Ra 00000010000 & Imm5 Rd := s(Ra >> Imm5)
BSLLI Rd,Ra,Imm 011001 Rd Ra 00000100000 & Imm5 Rd := (Ra << Imm5) & 0
BSEFI Rd,Ra,ImmW,ImmS 011001 Rd Ra 01000 & ImmW & 0 & ImmS Rd[0:31-ImmW] := 0
Rd[32-ImmW:31] := (Ra >> ImmS)
BSIFI Rd,Ra,Width,ImmS 011001 Rd Ra 10000 & ImmW & 0 & ImmS M := (0xffffffff << (ImmW + 1)) xor (0xffffffff << ImmS)
Rd := ((Ra << ImmS) and M) xor (Rd and not M)
ImmW := ImmS + Width - 1
TNEAGET Rd,FSLx 011011 Rd 00000 0N0TAE000000 & FSLx Rd := FSLx (data read, blocking if N = 0)
MSR[FSL] := 1 if (FSLx_S_Control = 1)
MSR[C] := not FSLx_S_Exists if N = 1
TNAPUT Ra,FSLx 011011 00000 Ra 1N0TA0000000 & FSLx FSLx := Ra (data write, block if N = 0)
MSR[C] := FSLx_M_Full if N = 1
TNECAGET Rd,FSLx 011011 Rd 00000 0N1TAE000000 & FSLx Rd := FSLx (control read, block if N = 0)
MSR[FSL] := 1 if (FSLx_S_Control = 0)
MSR[C] := not FSLx_S_Exists if N = 1
TNCAPUT Ra,FSLx 011011 00000 Ra 1N1TA0000000 & FSLx FSLx := Ra (control write, block if N = 0)
MSR[C] := FSLx_M_Full if N = 1
OR Rd,Ra,Rb 100000 Rd Ra Rb 00000000000 Rd := Ra or Rb
PCMPBF Rd,Ra,Rb 100000 Rd Ra Rb 10000000000 Rd := 1 if (Rb[0:7] = Ra[0:7]) else
Rd := 2 if (Rb[8:15] = Ra[8:15]) else
Rd := 3 if (Rb[16:23] = Ra[16:23]) else
Rd := 4 if (Rb[24:31] = Ra[24:31]) else
Rd := 0
AND Rd,Ra,Rb 100001 Rd Ra Rb 00000000000 Rd := Ra and Rb
XOR Rd,Ra,Rb 100010 Rd Ra Rb 00000000000 Rd := Ra xor Rb
PCMPEQ Rd,Ra,Rb 100010 Rd Ra Rb 10000000000 Rd := 1 if (Rb = Ra) else
Rd := 0
ANDN Rd,Ra,Rb 100011 Rd Ra Rb 00000000000 Rd := Ra and not Rb
PCMPNE Rd,Ra,Rb 100011 Rd Ra Rb 10000000000 Rd := 1 if (Rb != Ra) else
Rd := 0
SRA Rd,Ra 100100 Rd Ra 0000000000000001 Rd := s(Ra >> 1)
C := Ra[31]
SRC Rd,Ra 100100 Rd Ra 0000000000100001 Rd := C & (Ra >> 1)
C := Ra[31]
SRL Rd,Ra 100100 Rd Ra 0000000001000001 Rd := 0 & (Ra >> 1)
C := Ra[31]
SEXT8 Rd,Ra 100100 Rd Ra 0000000001100000 Rd := s(Ra[24:31])
SEXT16 Rd,Ra 100100 Rd Ra 0000000001100001 Rd := s(Ra[16:31])
CLZ Rd,Ra 100100 Rd Ra 0000000011100000 Rd := clz(Ra)
SWAPB Rd,Ra 100100 Rd Ra 0000000111100000 Rd := (Ra)[24:31, 16:23, 8:15, 0:7]
SWAPH Rd,Ra 100100 Rd Ra 0000000111100010 Rd := (Ra)[16:31, 0:15]
WIC Ra,Rb 100100 00000 Ra Rb 00001101000 ICache_Line[Ra >> 4].Tag := 0 if (C_ICACHE_LINE_LEN = 4)
ICache_Line[Ra >> 5].Tag := 0 if (C_ICACHE_LINE_LEN = 8)
ICache_Line[Ra >> 6].Tag := 0 if (C_ICACHE_LINE_LEN = 16)
WDC Ra,Rb 100100 00000 Ra Rb 00001100100 Cache line is cleared, discarding stored data.
DCache_Line[Ra >> 4].Tag := 0 if (C_DCACHE_LINE_LEN = 4)
DCache_Line[Ra >> 5].Tag := 0 if (C_DCACHE_LINE_LEN = 8)
DCache_Line[Ra >> 6].Tag := 0 if (C_DCACHE_LINE_LEN = 16)
WDC.FLUSH Ra,Rb 100100 00000 Ra Rb 00001110100 Cache line is flushed, writing stored
data to memory, and then cleared. Used when
C_DCACHE_USE_WRITEBACK = 1.
WDC.CLEAR Ra,Rb 100100 00000 Ra Rb 00001100110 Cache line with matching address is
cleared, discarding stored data. Used when
C_DCACHE_USE_WRITEBACK = 1.
WDC.CLEAR.EA Ra,Rb 100100 00000 Ra Rb 00011100110 Cache line with matching extended
address Ra & Rb is cleared. Used when
C_DCACHE_USE_WRITEBACK = 1.
MTS Sd,Ra 100101 00000 Ra 11 & Sd SPR[Sd] := Ra, where:
· SPR[0x0001] is MSR
· SPR[0x0007] is FSR
· SPR[0x0800] is SLR
· SPR[0x0802] is SHR
· SPR[0x1000] is PID
· SPR[0x1001] is ZPR
· SPR[0x1002] is TLBX
· SPR[0x1003] is TLBLO[LSH]
· SPR[0x1004] is TLBHI
· SPR[0x1005] is TLBSX
MTSE Sd,Ra 100101 01000 Ra 11 & Sd SPR[Sd] := Ra, where:
· SPR[0x1003] is TLBLO[MSH]
MFS Rd,Sa 100101 Rd 00000 10 & Sa Rd := SPR[Sa], where:
· SPR[0x0000] is PC
· SPR[0x0001] is MSR
· SPR[0x0003] is EAR[LSH]
· SPR[0x0005] is ESR
· SPR[0x0007] is FSR
· SPR[0x000B] is BTR
· SPR[0x000D] is EDR
· SPR[0x0800] is SLR
· SPR[0x0802] is SHR
· SPR[0x1000] is PID
· SPR[0x1001] is ZPR
· SPR[0x1002] is TLBX
· SPR[0x1003] is TLBLO[LSH]
· SPR[0x1004] is TLBHI
· SPR[0x2000-200B] is PVR[0-12][LSH]
MFSE Rd,Sa 100101 Rd 01000 10 & Sa Rd := SPR[Sa][MSH], where:
· SPR[0x0003] is EAR[MSH]
· SPR[0x1003] is TLBLO[MSH]
· SPR[0x2006-2009] is PVR[6-9][MSH]
MSRCLR Rd,Imm 100101 Rd 00001 00 & Imm14 Rd := MSR
MSR := MSR and (not Imm14)
MSRSET Rd,Imm 100101 Rd 00000 00 & Imm14 Rd := MSR
MSR := MSR or Imm14
BR Rb 100110 00000 00000 Rb 00000000000 PC := PC + Rb
BRD Rb 100110 00000 10000 Rb 00000000000 PC := PC + Rb
BRLD Rd,Rb 100110 Rd 10100 Rb 00000000000 PC := PC + Rb
Rd := PC
BRA Rb 100110 00000 01000 Rb 00000000000 PC := Rb
BRAD Rb 100110 00000 11000 Rb 00000000000 PC := Rb
BRALD Rd,Rb 100110 Rd 11100 Rb 00000000000 PC := Rb
Rd := PC
BRK Rd,Rb 100110 Rd 01100 Rb 00000000000 PC := Rb
Rd := PC
MSR[BIP] := 1
BEQ Ra,Rb 100111 00000 Ra Rb 00000000000 PC := PC + Rb if Ra = 0
BNE Ra,Rb 100111 00001 Ra Rb 00000000000 PC := PC + Rb if Ra != 0
BLT Ra,Rb 100111 00010 Ra Rb 00000000000 PC := PC + Rb if Ra < 0
BLE Ra,Rb 100111 00011 Ra Rb 00000000000 PC := PC + Rb if Ra <= 0
BGT Ra,Rb 100111 00100 Ra Rb 00000000000 PC := PC + Rb if Ra > 0
BGE Ra,Rb 100111 00101 Ra Rb 00000000000 PC := PC + Rb if Ra >= 0
BEQD Ra,Rb 100111 10000 Ra Rb 00000000000 PC := PC + Rb if Ra = 0
BNED Ra,Rb 100111 10001 Ra Rb 00000000000 PC := PC + Rb if Ra != 0
BLTD Ra,Rb 100111 10010 Ra Rb 00000000000 PC := PC + Rb if Ra < 0
BLED Ra,Rb 100111 10011 Ra Rb 00000000000 PC := PC + Rb if Ra <= 0
BGTD Ra,Rb 100111 10100 Ra Rb 00000000000 PC := PC + Rb if Ra > 0
BGED Ra,Rb 100111 10101 Ra Rb 00000000000 PC := PC + Rb if Ra >= 0
ORI Rd,Ra,Imm 101000 Rd Ra Imm Rd := Ra or s(Imm)
ANDI Rd,Ra,Imm 101001 Rd Ra Imm Rd := Ra and s(Imm)
XORI Rd,Ra,Imm 101010 Rd Ra Imm Rd := Ra xor s(Imm)
ANDNI Rd,Ra,Imm 101011 Rd Ra Imm Rd := Ra and (not s(Imm))
IMM Imm 101100 00000 00000 Imm Imm[0:15] := Imm
RTSD Ra,Imm 101101 10000 Ra Imm PC := Ra + s(Imm)
RTID Ra,Imm 101101 10001 Ra Imm PC := Ra + s(Imm)
MSR[IE] := 1
RTBD Ra,Imm 101101 10010 Ra Imm PC := Ra + s(Imm)
MSR[BIP] := 0
RTED Ra,Imm 101101 10100 Ra Imm PC := Ra + s(Imm)
MSR[EE] := 1, MSR[EIP] := 0
ESR := 0
BRI Imm 101110 00000 00000 Imm PC := PC + s(Imm)
MBAR Imm 101110 Imm 00010 0000000000000100 PC := PC + 4; Wait for memory
accesses.
BRID Imm 101110 00000 10000 Imm PC := PC + s(Imm)
BRLID Rd,Imm 101110 Rd 10100 Imm PC := PC + s(Imm)
Rd := PC
BRAI Imm 101110 00000 01000 Imm PC := s(Imm)
BRAID Imm 101110 00000 11000 Imm PC := s(Imm)
BRALID Rd,Imm 101110 Rd 11100 Imm PC := s(Imm)
Rd := PC
BRKI Rd,Imm 101110 Rd 01100 Imm PC := s(Imm)
Rd := PC
MSR[BIP] := 1
BEQI Ra,Imm 101111 00000 Ra Imm PC := PC + s(Imm) if Ra = 0
BNEI Ra,Imm 101111 00001 Ra Imm PC := PC + s(Imm) if Ra != 0
BLTI Ra,Imm 101111 00010 Ra Imm PC := PC + s(Imm) if Ra < 0
BLEI Ra,Imm 101111 00011 Ra Imm PC := PC + s(Imm) if Ra <= 0
BGTI Ra,Imm 101111 00100 Ra Imm PC := PC + s(Imm) if Ra > 0
BGEI Ra,Imm 101111 00101 Ra Imm PC := PC + s(Imm) if Ra >= 0
BEQID Ra,Imm 101111 10000 Ra Imm PC := PC + s(Imm) if Ra = 0
BNEID Ra,Imm 101111 10001 Ra Imm PC := PC + s(Imm) if Ra != 0
BLTID Ra,Imm 101111 10010 Ra Imm PC := PC + s(Imm) if Ra < 0
BLEID Ra,Imm 101111 10011 Ra Imm PC := PC + s(Imm) if Ra <= 0
BGTID Ra,Imm 101111 10100 Ra Imm PC := PC + s(Imm) if Ra > 0
BGEID Ra,Imm 101111 10101 Ra Imm PC := PC + s(Imm) if Ra >= 0
LBU Rd,Ra,Rb 110000 Rd Ra Rb 00000000000 Addr := Ra + Rb
Rd[0:23] := 0
Rd[24:31] := *Addr[0:7]
LBUR Rd,Ra,Rb 110000 Rd Ra Rb 01000000000 Addr := Ra + Rb
Rd[0:23] := 0
Rd[24:31] := *Addr[0:7]
LBUEA Rd,Ra,Rb 110000 Rd Ra Rb 00010000000 Addr := Ra & Rb
Rd[0:23] := 0
Rd[24:31] := *Addr[0:7]
LHU Rd,Ra,Rb 110001 Rd Ra Rb 00000000000 Addr := Ra + Rb
Rd[0:15] := 0
Rd[16:31] := *Addr[0:15]
LHUR Rd,Ra,Rb 110001 Rd Ra Rb 01000000000 Addr := Ra + Rb
Rd[0:15] := 0
Rd[16:31] := *Addr[0:15]
LHUEA Rd,Ra,Rb 110001 Rd Ra Rb 00010000000 Addr := Ra & Rb
Rd[0:15] := 0
Rd[16:31] := *Addr[0:15]
LW Rd,Ra,Rb 110010 Rd Ra Rb 00000000000 Addr := Ra + Rb
Rd := *Addr
LWR Rd,Ra,Rb 110010 Rd Ra Rb 01000000000 Addr := Ra + Rb
Rd := *Addr
LWX Rd,Ra,Rb 110010 Rd Ra Rb 10000000000 Addr := Ra + Rb
Rd := *Addr
Reservation := 1
LWEA Rd,Ra,Rb 110010 Rd Ra Rb 00010000000 Addr := Ra & Rb
Rd := *Addr
SB Rd,Ra,Rb 110100 Rd Ra Rb 00000000000 Addr := Ra + Rb
*Addr[0:7] := Rd[24:31]
SBR Rd,Ra,Rb 110100 Rd Ra Rb 01000000000 Addr := Ra + Rb
*Addr[0:7] := Rd[24:31]
SBEA Rd,Ra,Rb 110100 Rd Ra Rb 00010000000 Addr := Ra & Rb
*Addr[0:7] := Rd[24:31]
SH Rd,Ra,Rb 110101 Rd Ra Rb 00000000000 Addr := Ra + Rb
*Addr[0:15] := Rd[16:31]
SHR Rd,Ra,Rb 110101 Rd Ra Rb 01000000000 Addr := Ra + Rb
*Addr[0:15] := Rd[16:31]
SHEA Rd,Ra,Rb 110101 Rd Ra Rb 00010000000 Addr := Ra & Rb
*Addr[0:15] := Rd[16:31]
SW Rd,Ra,Rb 110110 Rd Ra Rb 00000000000 Addr := Ra + Rb
*Addr := Rd
SWR Rd,Ra,Rb 110110 Rd Ra Rb 01000000000 Addr := Ra + Rb
*Addr := Rd
SWX Rd,Ra,Rb 110110 Rd Ra Rb 10000000000 Addr := Ra + Rb
*Addr := Rd if Reservation = 1
Reservation := 0
SWEA Rd,Ra,Rb 110110 Rd Ra Rb 00010000000 Addr := Ra & Rb
*Addr := Rd
LBUI Rd,Ra,Imm 111000 Rd Ra Imm Addr := Ra + s(Imm)
Rd[0:23] := 0
Rd[24:31] := *Addr[0:7]
LHUI Rd,Ra,Imm 111001 Rd Ra Imm Addr := Ra + s(Imm)
Rd[0:15] := 0
Rd[16:31] := *Addr[0:15]
LWI Rd,Ra,Imm 111010 Rd Ra Imm Addr := Ra + s(Imm)
Rd := *Addr
SBI Rd,Ra,Imm 111100 Rd Ra Imm Addr := Ra + s(Imm)
*Addr[0:7] := Rd[24:31]
SHI Rd,Ra,Imm 111101 Rd Ra Imm Addr := Ra + s(Imm)
*Addr[0:15] := Rd[16:31]
SWI Rd,Ra,Imm 111110 Rd Ra Imm Addr := Ra + s(Imm)
*Addr := Rd
1. Due to the many different corner cases involved in floating-point arithmetic, only the normal behavior is described. A full description of the behavior can be found in Chapter 5, “MicroBlaze Instruction Set Architecture.”
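The Type A and Type B field positions used throughout Table 2-6 can be sketched as a small decoder. This is an illustrative model only, not part of any Xilinx tool. The function names `decode_fields` and `encode_type_a` are hypothetical, and the sketch assumes the guide's bit numbering, where bit 0 is the most significant bit of the 32-bit instruction word.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into Type A / Type B fields.

    Bit 0 is the most significant bit, so bits 0-5 are the top six bits.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 0-5
        "rd": (word >> 21) & 0x1F,      # bits 6-10
        "ra": (word >> 16) & 0x1F,      # bits 11-15
        "rb": (word >> 11) & 0x1F,      # bits 16-20 (Type A only)
        "func": word & 0x7FF,           # bits 21-31 (Type A only)
        "imm": word & 0xFFFF,           # bits 16-31 (Type B only)
    }


def encode_type_a(opcode, rd, ra, rb, func=0):
    """Assemble a Type A word from its five fields (hypothetical helper)."""
    return (opcode << 26) | (rd << 21) | (ra << 16) | (rb << 11) | func


# OR r3,r4,r5 uses opcode 100000 and an all-zero function field.
or_word = encode_type_a(0b100000, rd=3, ra=4, rb=5)
# PCMPBF shares the opcode and differs only in bits 21-31.
pcmpbf_word = encode_type_a(0b100000, rd=3, ra=4, rb=5, func=0b10000000000)
```

Which of `rb`/`func` or `imm` is meaningful depends on the instruction type, so a caller would select the fields by opcode.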
Semaphore Synchronization
The LWX and SWX instructions are used to implement common semaphore operations, including test and set, compare and swap, exchange memory, and fetch and add. They are also used to implement spinlocks.
These instructions are typically used by system programs and are called by application programs as needed.
Generally, a program uses LWX to load a semaphore from memory, causing the reservation to be set (the processor maintains the reservation internally). The program can compute a result based on the semaphore value and conditionally store the result back to the same memory location using the SWX instruction. The conditional store is performed based on the existence of the reservation established by the preceding LWX instruction. If the reservation exists when the store is executed, the store is performed and MSR[C] is cleared to 0. If the reservation does not exist when the store is executed, the target memory location is not modified and MSR[C] is set to 1.
If the store is successful, the sequence of instructions from the semaphore load to the semaphore store appears to be executed atomically: no other device modified the semaphore location between the read and the update. Other devices can read from the semaphore location during the operation.
For a semaphore operation to work properly, the LWX instruction must be paired with an SWX instruction, and both must specify identical addresses.
The reservation granularity in MicroBlaze is a word. For both instructions, the address must be word aligned. No unaligned exceptions are generated for these instructions.
The conditional store is always attempted when a reservation exists, even if the store address does not match the load address that set the reservation.
Only one reservation can be maintained at a time. The address associated with the reservation can be changed by executing a subsequent LWX instruction.
The conditional store is performed based upon the reservation established by the last LWX instruction executed. Executing an SWX instruction always clears a reservation held by the processor, whether the address matches that established by the LWX or not.
Reset, interrupts, exceptions, and breaks (including the BRK and BRKI instructions) all clear the reservation.
The following provides general guidelines for using the LWX and SWX instructions:
• The LWX and SWX instructions should be paired and use the same address.
• An unpaired SWX instruction to an arbitrary address can be used to clear any reservation held by the processor.
• A conditional sequence begins with an LWX instruction. It can be followed by memory accesses and/or computations on the loaded value. The sequence ends with an SWX instruction. In most cases, failure of the SWX instruction should cause a branch back to the LWX for a repeated attempt.
• An LWX instruction can be left unpaired when executing certain synchronization primitives if the value loaded by the LWX is not zero. An implementation of Test and Set exemplifies this:
loop: lwx   r5,r3,r0 ; load and reserve
      bnei  r5,next  ; branch if not equal to zero
      addik r5,r5,1  ; increment value
      swx   r5,r3,r0 ; try to store non-zero value
      addic r5,r0,0  ; check reservation
      bnei  r5,loop  ; loop if reservation lost
next:
Performance can be improved by minimizing looping on an LWX instruction that fails to return a desired value. Performance can also be improved by using an ordinary load instruction to do the initial value check. An implementation of a spinlock exemplifies this:
loop: lw    r5,r3,r0 ; load the word
      bnei  r5,loop  ; loop back if word not equal to 0
      lwx   r5,r3,r0 ; try reserving again
      bnei  r5,loop  ; likely that no branch is needed
      addik r5,r5,1  ; increment value
      swx   r5,r3,r0 ; try to store non-zero value
      addic r5,r0,0  ; check reservation
      bnei  r5,loop  ; loop if reservation lost
Minimizing the looping on an LWX/SWX instruction pair increases the likelihood that forward progress is made. The old value should be tested before attempting the store. If the order is reversed (store before load), more SWX instructions are executed and reservations are more likely to be lost between the LWX and SWX instructions.
Self-modifying Code
When using self-modifying code, software must ensure that the modified instructions have been written to memory prior to fetching them for execution. There are several aspects to consider:
• The instructions to be modified could already have been fetched prior to modification:
  - Into the instruction prefetch buffer
  - Into the instruction cache, if it is enabled
  - Into a stream buffer, if instruction cache stream buffers are used
  - Into the instruction cache, and then saved in a victim buffer, if victim buffers are used.
  To ensure that the modified code is always executed instead of the old unmodified code, software must handle all these cases.
• If one or more of the instructions to be modified is a branch, and the branch target cache is used, the branch target address might have been cached.
  To avoid using the cached branch target address, software must ensure that the branch target cache is cleared prior to executing the modified code.
• The modified instructions might not have been written to memory prior to execution:
  - They might be en-route to memory, in temporary storage in the interconnect or the memory controller.
  - They might be stored in the data cache, if write-back cache is used.
  - They might be saved in a victim buffer, if write-back cache and victim buffers are used.
  Software must ensure that the modified instructions have been written to memory before being fetched by the processor.
The annotated code below shows how each of the above issues can be addressed. This code assumes that both instruction cache and write-back data cache are used. If not, the corresponding instructions can be omitted.
The following code exemplifies storing a modified instruction:
swi       r5,r6,0 ; r5 = new instruction
                  ; r6 = physical instruction address
wdc.flush r6,r0   ; flush write-back data cache line
mbar      1       ; ensure new instruction is written to memory
wic       r7,r0   ; invalidate line, empty stream & victim buffers
                  ; r7 = virtual instruction address
mbar      2       ; empty prefetch buffer, clear branch target cache
The physical and virtual addresses above are identical, unless MMU virtual mode is used. If the MMU is enabled, the code sequences must be executed in real mode, because WIC and WDC are privileged instructions. The first instruction after the code sequences above must not be modified, because it might have been prefetched.
Registers
MicroBlaze has an orthogonal instruction set architecture. It has thirty-two 32-bit general purpose registers and up to eighteen 32-bit special purpose registers, depending on configured options.
General Purpose Registers
The thirty-two 32-bit General Purpose Registers are numbered R0 through R31. The register file is reset on bit stream download (reset value is 0x00000000). The following figure is a representation of a General Purpose Register and Table 2-7 provides a description of each register and the register reset value (if existing).
Note: The register file is not reset by the external reset inputs: Reset and Debug_Rst.
Figure 2-2: R0-R31
Table 2-7: General Purpose Registers (R0-R31)
Bits Name Description Reset Value
0:31 R0 Always has a value of zero. Anything written to R0 is discarded. 0x00000000
0:31 R1 through R13 32-bit general purpose registers -
0:31 R14 32-bit register used to store return addresses for interrupts. -
0:31 R15 32-bit general purpose register. Recommended for storing return addresses for user vectors. -
0:31 R16 32-bit register used to store return addresses for breaks. -
0:31 R17 If MicroBlaze is configured to support hardware exceptions, this register is loaded with the address of the instruction following the instruction causing the HW exception, except for exceptions in delay slots that use BTR instead (see Branch Target Register (BTR)); if not, it is a general purpose register. -
0:31 R18 through R31 R18 through R31 are 32-bit general purpose registers. -
See Table 4-2 for software conventions on general purpose register usage.
Special Purpose Registers
Program Counter (PC)
The program counter (PC) is the 32-bit address of the executing instruction. It can be read with an MFS instruction, but it cannot be written with an MTS instruction. When used with the MFS instruction, the PC register is specified by setting Sa = 0x0000. The following figure illustrates the PC and Table 2-8 provides a description and reset value.
Figure 2-3: PC
Table 2-8: Program Counter (PC)
Bits Name Description Reset Value
0:31 PC Program Counter
Address of executing instruction, that is, “mfs r2, 0” stores the address of the mfs instruction itself in R2.
Reset Value: 0x00000000
Machine Status Register (MSR)
The Machine Status Register contains control and status bits for the processor. It can be read with an MFS instruction. When reading the MSR, bit 29 is replicated in bit 0 as the carry copy. MSR can be written using either an MTS instruction or the dedicated MSRSET and MSRCLR instructions.
When writing to the MSR using MSRSET or MSRCLR, the Carry bit takes effect immediately and the remaining bits take effect one clock cycle later. When writing using MTS, all bits take effect one clock cycle later. Any value written to bit 0 is discarded.
When used with an MTS or MFS instruction, the MSR is specified by setting Sx = 0x0001. The following figure illustrates the MSR register and Table 2-9 provides the bit description and reset values.
Figure 2-4: MSR
Table 2-9: Machine Status Register (MSR)
Bits Name Description Reset Value
0 CC Arithmetic Carry Copy
Copy of the Arithmetic Carry (bit 29). CC is always the same as bit C.
Reset Value: 0
1:16 Reserved
17 VMS Virtual Protected Mode Save
Only available when configured with an MMU (if C_USE_MMU > 1 and C_AREA_OPTIMIZED = 0 or 2)
Read/Write
Reset Value: 0
18 VM Virtual Protected Mode
0 = MMU address translation and access protection disabled, with C_USE_MMU = 3 (Virtual). Access protection disabled with C_USE_MMU = 2 (Protection)
1 = MMU address translation and access protection enabled, with C_USE_MMU = 3 (Virtual). Access protection enabled, with C_USE_MMU = 2 (Protection).
Only available when configured with an MMU (if C_USE_MMU > 1 and C_AREA_OPTIMIZED = 0 or 2)
Read/Write
Reset Value: 0
19 UMS User Mode Save
Only available when configured with an MMU (if C_USE_MMU > 0 and C_AREA_OPTIMIZED = 0 or 2)
Read/Write
Reset Value: 0
20 UM User Mode
0 = Privileged Mode, all instructions are allowed
1 = User Mode, certain instructions are not allowed
Only available when configured with an MMU (if C_USE_MMU > 0 and C_AREA_OPTIMIZED = 0 or 2)
Read/Write
Reset Value: 0
21 PVR Processor Version Register exists
0 = No Processor Version Register
1 = Processor Version Register exists
Read only
Reset Value: Based on parameter C_PVR
22 EIP Exception In Progress
0 = No hardware exception in progress
1 = Hardware exception in progress
Only available if configured with exception support (C_*_EXCEPTION or C_USE_MMU > 0)
Read/Write
Reset Value: 0
23 EE Exception Enable (1)
0 = Hardware exceptions disabled
1 = Hardware exceptions enabled
Only available if configured with exception support (C_*_EXCEPTION or C_USE_MMU > 0)
Read/Write
Reset Value: 0
24 DCE Data Cache Enable
0 = Data Cache disabled
1 = Data Cache enabled
Only available if configured to use data cache (C_USE_DCACHE = 1)
Read/Write
Reset Value: 0
25 DZO Division by Zero or Division Overflow (2)
0 = No division by zero or division overflow has occurred
1 = Division by zero or division overflow has occurred
Only available if configured to use hardware divider (C_USE_DIV = 1)
Read/Write
Reset Value: 0
26 ICE Instruction Cache Enable
0 = Instruction Cache disabled
1 = Instruction Cache enabled
Only available if configured to use instruction cache (C_USE_ICACHE = 1)
Read/Write
Reset Value: 0
27 FSL AXI4-Stream Error
0 = get or getd had no error
1 = get or getd control type mismatch
This bit is sticky, that is, it is set by a get or getd instruction when a control bit mismatch occurs. To clear it, an MTS or MSRCLR instruction must be used.
Only available if configured to use stream links (C_FSL_LINKS > 0)
Read/Write
Reset Value: 0
28 BIP Break in Progress
0 = No Break in Progress
1 = Break in Progress
Break Sources can be software break instruction or hardware break from Ext_Brk or Ext_NM_Brk pin.
Read/Write
Reset Value: 0
29 C Arithmetic Carry
0 = No Carry (Borrow)
1 = Carry (No Borrow)
Read/Write
Reset Value: 0
30 IE Interrupt Enable
0 = Interrupts disabled
1 = Interrupts enabled
Read/Write
Reset Value: 0
31 - Reserved 0
1. The MMU exceptions (Data Storage Exception, Instruction Storage Exception, Data TLB Miss Exception, Instruction TLB Miss Exception) cannot be disabled, and are not affected by this bit.
2. This bit is only used for integer divide-by-zero or divide overflow signaling. There is a floating-point equivalent in the FSR. The DZO bit flags divide by zero or divide overflow conditions regardless of whether the processor is configured with exception handling or not.
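Because the guide numbers register bits from the most significant end (bit 0 is the MSB), converting the bit positions in Table 2-9 to software masks is a common source of mistakes. The following is a minimal sketch of that conversion together with the MSRSET and MSRCLR semantics from Table 2-6. All function and table names here are hypothetical, and the one-cycle delay with which most bits take effect is not modeled.

```python
MSR_BITS = {
    "CC": 0, "VMS": 17, "VM": 18, "UMS": 19, "UM": 20, "PVR": 21,
    "EIP": 22, "EE": 23, "DCE": 24, "DZO": 25, "ICE": 26, "FSL": 27,
    "BIP": 28, "C": 29, "IE": 30,
}


def msr_mask(bit):
    """Mask for a bit given in the guide's numbering (bit 0 = MSB)."""
    return 1 << (31 - bit)


def decode_msr(value):
    """Return a dict of field name -> 0 or 1 for a 32-bit MSR value."""
    return {name: (value >> (31 - bit)) & 1 for name, bit in MSR_BITS.items()}


def read_msr(raw):
    """Model a read: the Carry bit (29) is replicated in bit 0 (CC)."""
    carry = (raw >> 2) & 1
    return (raw & 0x7FFFFFFF) | (carry << 31)


def msrset(msr, imm14):
    """Return (Rd, new MSR): Rd := MSR; MSR := MSR or Imm14."""
    return msr, msr | (imm14 & 0x3FFF)


def msrclr(msr, imm14):
    """Return (Rd, new MSR): Rd := MSR; MSR := MSR and (not Imm14)."""
    return msr, msr & ~(imm14 & 0x3FFF) & 0xFFFFFFFF


# Enable interrupts (IE, bit 30), then clear the same bit again.
old, enabled = msrset(0x00000100, msr_mask(30))
_, disabled = msrclr(enabled, msr_mask(30))
```

A 14-bit immediate reaches bits 18 through 31 only, so under this model fields above bit 18 (such as VMS) would have to be written with MTS.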
Exception Address Register (EAR)
The Exception Address Register stores the full load/store address that caused the exception for the following:
• An unaligned access exception that specifies the unaligned access data address
• An M_AXI_DP exception that specifies the failing AXI4 data access address
• A data storage exception that specifies the (virtual) effective address accessed
• An instruction storage exception that specifies the (virtual) effective address read
• A data TLB miss exception that specifies the (virtual) effective address accessed
• An instruction TLB miss exception that specifies the (virtual) effective address read
The contents of this register are undefined for all other exceptions. When read with the MFS or MFSE instruction, the EAR is specified by setting Sa = 0x0003. The EAR register is illustrated in the following figure and Table 2-10 provides bit descriptions and reset values.
When extended data addressing is enabled (parameter C_ADDR_SIZE > 32), the 32 least significant bits of the register are read with the MFS instruction, and the most significant bits with the MFSE instruction.
Figure 2-5: EAR
Table 2-10: Exception Address Register (EAR)
Bits Name Description Reset Value
0:C_ADDR_SIZE-1 EAR Exception Address Register 0
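With extended addressing, an exception handler has to combine two reads to recover the full address. A minimal sketch of that combination follows. The function name `full_ear` is hypothetical, and the sketch assumes that MFSE returns the upper address bits right-aligned in the destination register, which should be checked against the MFSE instruction description.

```python
def full_ear(lsh, msh, addr_size):
    """Combine the MFS (low half) and MFSE (high half) reads of EAR.

    lsh       -- value read with MFS (32 least significant bits)
    msh       -- value read with MFSE (assumed right-aligned upper bits)
    addr_size -- the C_ADDR_SIZE parameter, 32 or greater
    """
    if addr_size <= 32:
        return lsh & 0xFFFFFFFF
    value = ((msh & 0xFFFFFFFF) << 32) | (lsh & 0xFFFFFFFF)
    # Drop anything above the implemented address width.
    return value & ((1 << addr_size) - 1)


address = full_ear(0x12345678, 0x9, addr_size=36)
```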
Exception Status Register (ESR)
The Exception Status Register contains status bits for the processor. When read with the MFS instruction, the ESR is specified by setting Sa = 0x0005. The ESR register is illustrated in the following figure, Table 2-11 provides bit descriptions and reset values, and Table 2-12 provides the Exception Specific Status (ESS).
Figure 2-6: ESR
Table 2-11: Exception Status Register (ESR)
Bits Name Description Reset Value
0:18 Reserved
19 DS Delay Slot Exception.
0 = not caused by delay slot instruction
1 = caused by delay slot instruction
Read-only
Reset Value: 0
20:26 ESS Exception Specific Status
For details, see Table 2-12.
Read-only
Reset Value: See Table 2-12
27:31 EC Exception Cause
00000 = Stream exception
00001 = Unaligned data access exception
00010 = Illegal op-code exception
00011 = Instruction bus error exception
00100 = Data bus error exception
00101 = Divide exception
00110 = Floating-point unit exception
00111 = Privileged instruction exception
00111 = Stack protection violation exception
10000 = Data storage exception
10001 = Instruction storage exception
10010 = Data TLB miss exception
10011 = Instruction TLB miss exception
Read-only
Reset Value: 0
Table 2-12: Exception Specific Status (ESS)
Exception Cause Bits Name Description Reset Value
Unaligned Data Access
20 W Word Access Exception
0 = unaligned halfword access
1 = unaligned word access
Reset Value: 0
21 S Store Access Exception
0 = unaligned load access
1 = unaligned store access
Reset Value: 0
22:26 Rx Source/Destination Register
General purpose register used as source (Store) or destination (Load) in unaligned access
Reset Value: 0
Illegal Instruction
20:26 Reserved 0
Instruction bus error
20 ECC Exception caused by ILMB correctable or uncorrectable error 0
21:26 Reserved 0
Data bus error
20 ECC Exception caused by DLMB correctable or uncorrectable error 0
21:26 Reserved 0
Divide
20 DEC Divide - Division exception cause
0 = Divide-By-Zero
1 = Division Overflow
Reset Value: 0
21:26 Reserved 0
Floating-point unit
20:26 Reserved 0
Privileged instruction
20:26 Reserved 0
Stack protection violation
20:26 Reserved 0
Stream
20:22 Reserved 0
23:26 FSL AXI4-Stream index that caused the exception 0
Data storage
20 DIZ Data storage - Zone protection
0 = Did not occur
1 = Occurred
Reset Value: 0
21 S Data storage - Store instruction
0 = Did not occur
1 = Occurred
Reset Value: 0
22:26 Reserved 0
Instruction storage
20 DIZ Instruction storage - Zone protection
0 = Did not occur
1 = Occurred
Reset Value: 0
21:26 Reserved 0
Data TLB miss
20 Reserved 0
21 S Data TLB miss - Store instruction
0 = Did not occur
1 = Occurred
Reset Value: 0
22:26 Reserved 0
Instruction TLB miss
20:26 Reserved 0
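The ESR field layout in Table 2-11 and the unaligned-access ESS layout in Table 2-12 can be sketched as a small decoder. This is an illustrative model, with hypothetical function names, that again relies on bit 0 being the most significant bit of the register.

```python
EXCEPTION_CAUSES = {
    0b00000: "Stream exception",
    0b00001: "Unaligned data access exception",
    0b00010: "Illegal op-code exception",
    0b00011: "Instruction bus error exception",
    0b00100: "Data bus error exception",
    0b00101: "Divide exception",
    0b00110: "Floating-point unit exception",
    # Both causes share this code in Table 2-11.
    0b00111: "Privileged instruction or stack protection violation exception",
    0b10000: "Data storage exception",
    0b10001: "Instruction storage exception",
    0b10010: "Data TLB miss exception",
    0b10011: "Instruction TLB miss exception",
}


def decode_esr(esr):
    """Split ESR into DS (bit 19), ESS (bits 20:26) and EC (bits 27:31)."""
    return {
        "DS": (esr >> 12) & 0x1,
        "ESS": (esr >> 5) & 0x7F,
        "EC": esr & 0x1F,
    }


def decode_unaligned_ess(ess):
    """Interpret the 7-bit ESS field for an unaligned data access."""
    return {
        "W": (ess >> 6) & 0x1,  # bit 20: 1 = word, 0 = halfword
        "S": (ess >> 5) & 0x1,  # bit 21: 1 = store, 0 = load
        "Rx": ess & 0x1F,       # bits 22:26: register number
    }


# Example value: unaligned word store from r5, not in a delay slot.
fields = decode_esr(0x00000CA1)
cause = EXCEPTION_CAUSES[fields["EC"]]
detail = decode_unaligned_ess(fields["ESS"])
```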
Branch Target Register (BTR)
The Branch Target Register only exists if the MicroBlaze processor is configured to use exceptions. The register stores the branch target address for all delay slot branch instructions executed while MSR[EIP] = 0. If an exception is caused by an instruction in a delay slot (that is, ESR[DS]=1), the exception handler should return execution to the address stored in BTR instead of the normal exception return address stored in R17. When read with the MFS instruction, the BTR is specified by setting Sa = 0x000B. The BTR register is illustrated in the following figure and Table 2-13 provides bit descriptions and reset values.
Figure 2-7: BTR
Table 2-13: Branch Target Register (BTR)
Bits Name Description Reset Value
0:31 BTR Branch target address used by handler when returning from an exception caused by an instruction in a delay slot. Read-only 0x00000000
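The choice between BTR and R17 described above can be sketched as a single decision on ESR[DS]. This is an illustration of the rule only. The function name is hypothetical, and a real handler may adjust the return address further depending on the exception type.

```python
def exception_return_address(esr, r17, btr):
    """Pick the address an exception handler should return to.

    ESR[DS] is bit 19 in the guide's numbering (bit 0 = MSB), which is
    bit 12 counting from the least significant end.
    """
    delay_slot = (esr >> 12) & 0x1
    return btr if delay_slot else r17


normal = exception_return_address(0x00000001, r17=0x1004, btr=0x2000)
in_delay_slot = exception_return_address(0x00001001, r17=0x1004, btr=0x2000)
```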
Floating-Point Status Register (FSR)
The Floating-Point Status Register contains status bits for the floating-point unit. It can be read with an MFS instruction and written with an MTS instruction. When read or written, the register is specified by setting Sa = 0x0007. The bits in this register are sticky: floating-point instructions can only set bits in the register, and the only way to clear the register is by using the MTS instruction. The following figure illustrates the FSR register and Table 2-14 provides bit descriptions and reset values.
Figure 2-8: FSR
Table 2-14: Floating-Point Status Register (FSR)
Bits Name Description Reset Value
0:26 Reserved undefined
27 IO Invalid operation 0
28 DZ Divide-by-zero 0
29 OF Overflow 0
30 UF Underflow 0
31 DO Denormalized operand error 0
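A minimal sketch of how software might interpret an FSR value read with MFS, and of the sticky behavior described above, follows. The names are hypothetical, and the model treats the reserved bits as zero even though Table 2-14 lists their reset value as undefined.

```python
FSR_FLAGS = {"IO": 27, "DZ": 28, "OF": 29, "UF": 30, "DO": 31}


def fsr_flags(fsr):
    """List the names of the status flags set in an FSR value."""
    return [name for name, bit in FSR_FLAGS.items() if (fsr >> (31 - bit)) & 1]


def fsr_after_instruction(fsr, raised):
    """Sticky update: an instruction can set flags but never clear them."""
    return fsr | (raised & 0x1F)


fsr = 0
fsr = fsr_after_instruction(fsr, 0x08)  # a divide-by-zero is flagged
fsr = fsr_after_instruction(fsr, 0x00)  # a clean operation leaves DZ set
```

Clearing the flags corresponds to writing the register with MTS, which in this model is simply assigning a new value to `fsr`.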
Exception Data Register (EDR)
The Exception Data Register stores data read on an AXI4-Stream link that caused a stream exception.
The contents of this register are undefined for all other exceptions. When read with the MFS instruction, the EDR is specified by setting Sa = 0x000D. The following figure illustrates the EDR register and Table 2-15 provides bit descriptions and reset values.
Note: The register is only implemented if C_FSL_LINKS is greater than 0 and C_FSL_EXCEPTION is set to 1.
Figure 2-9: EDR
Table 2-15: Exception Data Register (EDR)
Bits Name Description Reset Value
0:31 EDR Exception Data Register 0x00000000
Stack Low Register (SLR)
The Stack Low Register stores the stack low limit used to detect stack overflow. When the address of a load or store instruction using the stack pointer (register R1) as rA is less than the Stack Low Register, a stack overflow occurs, causing a Stack Protection Violation exception if exceptions are enabled in MSR.
When read with the MFS instruction, the SLR is specified by setting Sa = 0x0800. Figure 2-10 illustrates the SLR register and Table 2-16 provides bit descriptions and reset values.
Note: The register is only implemented if stack protection is enabled by setting the parameter C_USE_STACK_PROTECTION to 1. If stack protection is not implemented, writing to the register has no effect.
Note: Stack protection is not available when the MMU is enabled (C_USE_MMU > 0). With the MMU, page-based memory protection is provided through the UTLB instead.
Figure 2-10: SLR
Table 2-16: Stack Low Register (SLR)
Bits Name Description Reset Value
0:31 SLR Stack Low Register 0x00000000
Stack High Register (SHR)
The Stack High Register stores the stack high limit used to detect stack underflow. When the address of a load or store instruction using the stack pointer (register R1) as rA is greater than the Stack High Register, a stack underflow occurs, causing a Stack Protection Violation exception if exceptions are enabled in MSR.
When read with the MFS instruction, the SHR is specified by setting Sa = 0x0802. The following figure illustrates the SHR register and Table 2-17 provides bit descriptions and reset values.
Note: The register is only implemented if stack protection is enabled by setting the parameter C_USE_STACK_PROTECTION to 1. If stack protection is not implemented, writing to the register has no effect.
Note: Stack protection is not available when the MMU is enabled (C_USE_MMU > 0). With the MMU, page-based memory protection is provided through the UTLB instead.
Figure 2-11: SHR
Table 2-17: Stack High Register (SHR)
Bits Name Description Reset Value
0:31 SHR Stack High Register 0xFFFFFFFF
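The SLR and SHR comparisons can be sketched as one check applied to the address of each load or store that uses R1 as rA. This is a simplified model with a hypothetical function name. The default limits are the reset values from Table 2-16 and Table 2-17, with which no 32-bit address can violate either bound.

```python
def stack_check(addr, slr=0x00000000, shr=0xFFFFFFFF):
    """Classify a stack-pointer-relative access address.

    Returns "overflow" if addr is below SLR, "underflow" if addr is
    above SHR, and None if the access is within the limits.
    """
    if addr < slr:
        return "overflow"
    if addr > shr:
        return "underflow"
    return None


# Hypothetical limits for a 4 KB stack region.
low, high = 0x00010000, 0x00010FFF
inside = stack_check(0x00010800, low, high)
below = stack_check(0x0000FFFC, low, high)
above = stack_check(0x00011000, low, high)
```

In hardware, a violation raises a Stack Protection Violation exception only if exceptions are enabled in MSR; that condition is not modeled here.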
Process Identifier Register (PID)
The Process Identifier Register is used to uniquely identify a software process during MMU address translation. It is controlled by the C_USE_MMU configuration option on MicroBlaze. The register is only implemented if C_USE_MMU is greater than 1 (User Mode) and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency).
When accessed with the MFS and MTS instructions, the PID is specified by setting Sa = 0x1000. The register is accessible according to the memory management special registers parameter C_MMU_TLB_ACCESS.
PID is also used when accessing a TLB entry:
When writing Translation Look-Aside Buffer High (TLBHI) the value of PID is stored in the TID field of the TLB entry
When reading TLBHI and MSR[UM] is not set, the value in the TID field is stored in PID
The following figure illustrates the PID register and Table 2-18 provides bit descriptions and reset values.
Figure 2-12: PID
Table 2-18: Process Identifier Register (PID)
Bits Name Description Reset Value
0:23 Reserved
24:31 PID Used to uniquely identify a software process during MMU address translation. 0x00
Read/Write
Zone Protection Register (ZPR)
The Zone Protection Register is used to override MMU memory protection defined in TLB entries. It is controlled by the C_USE_MMU configuration option on MicroBlaze. The register is only implemented if C_USE_MMU is greater than 1 (User Mode), C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency), and if the number of specified memory protection zones is greater than zero (C_MMU_ZONES > 0). The implemented register bits depend on the number of specified memory protection zones (C_MMU_ZONES). When accessed with the MFS and MTS instructions, the ZPR is specified by setting Sa = 0x1001. The register is accessible according to the memory management special registers parameter C_MMU_TLB_ACCESS.
The following figure illustrates the ZPR register and Table 2-19 provides bit descriptions and reset values.
Figure 2-13: ZPR
Table 2-19: Zone Protection Register (ZPR)
Bits Name Description Reset Value
0:1 ZP0 Zone Protect 0x00000000
2:3 ZP1
...
30:31 ZP15
User mode (MSR[UM] = 1):
00 = Override V in TLB entry. No access to the page is allowed
01 = No override. Use V, WR and EX from TLB entry
10 = No override. Use V, WR and EX from TLB entry
11 = Override WR and EX in TLB entry. Access the page as writable and executable
Privileged mode (MSR[UM] = 0):
00 = No override. Use V, WR and EX from TLB entry
01 = No override. Use V, WR and EX from TLB entry
10 = Override WR and EX in TLB entry. Access the page as writable and executable
11 = Override WR and EX in TLB entry. Access the page as writable and executable
Read/Write
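The zone encoding in Table 2-19 can be expressed as a small decision function. The following C sketch is illustrative only: the type page_access_t and the functions zpr_field and zpr_apply are hypothetical names, and the code assumes all 16 zones are implemented (C_MMU_ZONES = 16). Note that MicroBlaze numbers bits from the most significant bit, so ZP0 sits in the two uppermost bits of the register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Effective page permissions (hypothetical type for this sketch). */
typedef struct {
    bool valid;      /* V  */
    bool writable;   /* WR */
    bool executable; /* EX */
} page_access_t;

/* Extract the 2-bit field ZPn; bit 0 of ZPR is the most significant bit. */
unsigned zpr_field(uint32_t zpr, unsigned zone)
{
    assert(zone < 16u);
    return (zpr >> (30u - 2u * zone)) & 0x3u;
}

/* Apply the zone override selected by ZSEL to the TLB entry permissions. */
page_access_t zpr_apply(uint32_t zpr, unsigned zsel, bool user_mode,
                        page_access_t tlb)
{
    unsigned zp = zpr_field(zpr, zsel);
    page_access_t out = tlb;

    if (user_mode && zp == 0u) {
        out.valid = false;            /* 00 in user mode: no access */
    } else if (zp == 3u || (!user_mode && zp == 2u)) {
        out.writable = true;          /* override WR and EX */
        out.executable = true;
    }
    /* All other encodings: no override, TLB entry values are used. */
    return out;
}
```

For example, a read-only page in a zone whose field is 10 stays read-only in user mode but becomes writable and executable in privileged mode.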
Translation Look-Aside Buffer Low Register (TLBLO)
The Translation Look-Aside Buffer Low Register is used to access MMU Unified Translation Look-Aside Buffer (UTLB) entries. It is controlled by the C_USE_MMU configuration option on MicroBlaze. The register is only implemented if C_USE_MMU is greater than 1 (User Mode), and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency). When accessed with the MFS and MTS instructions, the TLBLO is specified by setting Sa = 0x1003.
When reading or writing TLBLO, the UTLB entry indexed by the TLBX register is accessed. The register is readable according to the memory management special registers parameter C_MMU_TLB_ACCESS.
When the MMU Physical Address Extension (PAE) is enabled (parameters C_USE_MMU = 3 and C_ADDR_SIZE > 32), the 32 least significant bits of TLBLO are accessed with the MFS and MTS instructions, and the most significant bits with the MFSE and MTSE instructions. When writing the register with PAE enabled, the most significant bits must be written first.
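As a rough illustration of the PAE access split, the following C sketch divides a TLBLO value into the two 32-bit words that software would transfer. The type tlblo_words_t and the function tlblo_pae_split are hypothetical, and the sketch only prepares the values; the actual transfers would be done with MTSE followed by MTS.

```c
#include <assert.h>
#include <stdint.h>

/* The two words to transfer, in the required write order (hypothetical type). */
typedef struct {
    uint32_t high; /* most significant bits, written first with MTSE */
    uint32_t low;  /* 32 least significant bits, written second with MTS */
} tlblo_words_t;

/* Split an n-bit TLBLO value, where n = addr_size = C_ADDR_SIZE. */
tlblo_words_t tlblo_pae_split(uint64_t tlblo, unsigned addr_size)
{
    tlblo_words_t w;

    assert(addr_size > 32u && addr_size <= 64u);   /* PAE requires more than 32 bits */
    /* No bits may be set above the n-bit register width. */
    assert(addr_size == 64u || (tlblo >> addr_size) == 0u);

    w.high = (uint32_t)(tlblo >> 32);
    w.low = (uint32_t)(tlblo & 0xFFFFFFFFu);
    return w;
}
```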
The UTLB is reset on bit stream download (reset value is 0x00000000 for all TLBLO entries).
Note: The UTLB is not reset by the external reset inputs: Reset and Debug_Rst. This means that
the entire UTLB must be initialized after reset, to avoid any stale data.
The following figure illustrates the TLBLO register and Table 2-20 provides bit descriptions and reset values. When PAE is enabled, the RPN field of the register is extended according to the C_ADDR_SIZE parameter up to 54 bits to be able to hold up to a 64-bit physical address.
Figure 2-14: TLBLO (two layouts are shown: C_ADDR_SIZE = 32 or C_USE_MMU < 3, and PAE with C_ADDR_SIZE > 32 and C_USE_MMU = 3, where n = C_ADDR_SIZE)
Table 2-20: Translation Look-Aside Buffer Low Register (TLBLO)
Bits (1) Name Description Reset Value
0:21 (PAE: 0:n-11) RPN Real Page Number or Physical Page Number 0x000000
When a TLB hit occurs, this field is read from the TLB entry and is used to form the physical address. Depending on the value of the SIZE field, some of the RPN bits are not used in the physical address. Software must clear unused bits in this field to zero.
Only defined when C_USE_MMU=3 (Virtual).
Read/Write
22 (PAE: n-10) EX Executable 0
When bit is set to 1, the page contains executable code, and instructions can be fetched from the page. When bit is cleared to 0, instructions cannot be fetched from the page. Attempts to fetch instructions from a page with a clear EX bit cause an instruction-storage exception.
Read/Write
23 (PAE: n-9) WR Writable 0
When bit is set to 1, the page is writable and store instructions can be used to store data at addresses within the page.
When bit is cleared to 0, the page is read-only (not writable). Attempts to store data into a page with a clear WR bit cause a data storage exception.
Read/Write
24:27 (PAE: n-8:n-5) ZSEL Zone Select 0x0
This field selects one of 16 zone fields (Z0-Z15) from the zone-protection register (ZPR).
For example, if ZSEL = 0x5, zone field Z5 is selected. The selected ZPR field is used to modify the access protection specified by the TLB entry EX and WR fields. It is also used to prevent access to a page by overriding the TLB V (valid) field.
Read/Write
28 (PAE: n-4) W Write Through 0/1
When the parameter C_DCACHE_USE_WRITEBACK is set to 1, this bit controls caching policy. A write-through policy is selected when set to 1, and a write-back policy is selected otherwise.
This bit is fixed to 1, and write-through is always used, when C_DCACHE_USE_WRITEBACK is cleared to 0.
Read/Write
29 (PAE: n-3) I Inhibit Caching 0
When bit is set to 1, accesses to the page are not cached (caching is inhibited).
When cleared to 0, accesses to the page are cacheable.
Read/Write
30 (PAE: n-2) M Memory Coherent 0
This bit is fixed to 0, because memory coherence is not implemented on MicroBlaze.
Read Only
31 (PAE: n-1) G Guarded 0
When bit is set to 1, speculative page accesses are not allowed (memory is guarded).
When cleared to 0, speculative page accesses are allowed.
The G attribute can be used to protect memory-mapped I/O devices from inappropriate instruction accesses.
Read/Write
1. The bit index n = C_ADDR_SIZE applies when PAE is enabled.
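To make the 32-bit (non-PAE) layout concrete, the following C sketch assembles a TLBLO word from its fields. The type tlblo_fields_t and the function tlblo_pack are hypothetical names chosen for this example. Because bit 0 is the most significant bit, a field ending at bit k is shifted left by 31 - k.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TLBLO fields for the 32-bit layout (hypothetical type for this sketch). */
typedef struct {
    uint32_t rpn;   /* bits 0:21, physical address >> 10 */
    bool ex;        /* bit 22, executable */
    bool wr;        /* bit 23, writable */
    unsigned zsel;  /* bits 24:27, zone select */
    bool w;         /* bit 28, write through */
    bool i;         /* bit 29, inhibit caching */
    bool g;         /* bit 31, guarded */
} tlblo_fields_t;

uint32_t tlblo_pack(tlblo_fields_t f)
{
    assert(f.rpn < (1u << 22));   /* RPN is 22 bits wide */
    assert(f.zsel < 16u);         /* ZSEL is 4 bits wide */

    return (f.rpn << 10)
         | ((uint32_t)f.ex << 9)
         | ((uint32_t)f.wr << 8)
         | ((uint32_t)f.zsel << 4)
         | ((uint32_t)f.w << 3)
         | ((uint32_t)f.i << 2)
         /* bit 30 (M) is fixed to 0 and therefore never set here */
         | (uint32_t)f.g;
}
```

A page at physical address 0x80001000 that is executable and writable would, under these assumptions, be encoded as 0x80001300.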
Translation Look-Aside Buffer High Register (TLBHI)
The Translation Look-Aside Buffer High Register is used to access MMU Unified Translation Look-Aside Buffer (UTLB) entries. It is controlled by the C_USE_MMU configuration option on MicroBlaze. The register is only implemented if C_USE_MMU is greater than 1 (User Mode), and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency). When accessed with the MFS and MTS instructions, the TLBHI is specified by setting Sa = 0x1004. When reading or writing TLBHI, the UTLB entry indexed by the TLBX register is accessed.
The register is readable according to the memory management special registers parameter C_MMU_TLB_ACCESS.
PID is also used when accessing a TLB entry:
When writing TLBHI the value of PID is stored in the TID field of the TLB entry
When reading TLBHI and MSR[UM] is not set, the value in the TID field is stored in PID
The UTLB is reset on bit stream download (reset value is 0x00000000 for all TLBHI entries).
Note: The UTLB is not reset by the external reset inputs: Reset and Debug_Rst.
The following figure illustrates the TLBHI register and Table 2-21 provides bit descriptions and reset values.
Figure 2-15: TLBHI
Table 2-21: Translation Look-Aside Buffer High Register (TLBHI)
Bits Name Description Reset Value
0:21 TAG TLB-entry tag 0x000000
Is compared with the page number portion of the virtual memory address under the control of the SIZE field.
Read/Write
22:24 SIZE Size 000
Specifies the page size. The SIZE field controls the bit range used in comparing the TAG field with the page number portion of the virtual memory address. The page sizes defined by this field are listed in Table 2-38.
Read/Write
25 V Valid 0
When this bit is set to 1, the TLB entry is valid and contains a page-translation entry.
When cleared to 0, the TLB entry is invalid.
Read/Write
26 E Endian 0
When this bit is set to 1, the page is accessed as a big endian page.
When cleared to 0, the page is accessed as a little endian page.
The E bit only affects data read or data write accesses. Instruction accesses are not affected.
The E bit is only implemented when the parameter C_USE_REORDER_INSTR is set to 1, otherwise it is fixed to 0.
Read/Write
27 U0 User Defined 0
This bit is fixed to 0, since there are no user defined storage attributes on MicroBlaze.
Read Only
28:31 Reserved
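A value read from TLBHI can be taken apart with a generic field extractor that follows the MSB-first bit numbering used throughout this guide. The C sketch below is an illustration only; mb_bits, tlbhi_fields_t and tlbhi_unpack are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return bits first:last of value, where bit 0 is the most significant bit. */
uint32_t mb_bits(uint32_t value, unsigned first, unsigned last)
{
    unsigned width;
    uint32_t mask;

    assert(first <= last && last <= 31u);
    width = last - first + 1u;
    mask = (width == 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (value >> (31u - last)) & mask;
}

/* Decoded TLBHI fields (hypothetical type for this sketch). */
typedef struct {
    uint32_t tag;    /* bits 0:21 */
    unsigned size;   /* bits 22:24, page size code */
    bool valid;      /* bit 25 */
    bool big_endian; /* bit 26 */
} tlbhi_fields_t;

tlbhi_fields_t tlbhi_unpack(uint32_t tlbhi)
{
    tlbhi_fields_t f;

    f.tag = mb_bits(tlbhi, 0, 21);
    f.size = (unsigned)mb_bits(tlbhi, 22, 24);
    f.valid = mb_bits(tlbhi, 25, 25) != 0u;
    f.big_endian = mb_bits(tlbhi, 26, 26) != 0u;
    return f;
}
```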
Translation Look-Aside Buffer Index Register (TLBX)
The Translation Look-Aside Buffer Index Register is used as an index to the Unified Translation Look-Aside Buffer (UTLB) when accessing the TLBLO and TLBHI registers. It is controlled by the C_USE_MMU configuration option on MicroBlaze. The register is only implemented if C_USE_MMU is greater than 1 (User Mode), and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency). When accessed with the MFS and MTS instructions, the TLBX is specified by setting Sa = 0x1002.
The following figure illustrates the TLBX register and Table 2-22 provides bit descriptions and reset values.
Figure 2-16: TLBX
Table 2-22: Translation Look-Aside Buffer Index Register (TLBX)
Bits Name Description Reset Value
0 MISS TLB Miss 0
This bit is cleared to 0 when the TLBSX register is written with a virtual address, and the virtual address is found in a TLB entry.
The bit is set to 1 if the virtual address is not found. It is also cleared when the TLBX register itself is written.
Read Only
Can be read if the memory management special registers parameter C_MMU_TLB_ACCESS > 0 (MINIMAL).
1:25 Reserved
26:31 INDEX TLB Index 000000
This field is used to index the Translation Look-Aside Buffer entry accessed by the TLBLO and TLBHI registers. The field is updated with a TLB index when the TLBSX register is written with a virtual address, and the virtual address is found in the corresponding TLB entry.
Read/Write
Can be read and written if the memory management special registers parameter C_MMU_TLB_ACCESS > 0 (MINIMAL).
Translation Look-Aside Buffer Search Index Register (TLBSX)
The Translation Look-Aside Buffer Search Index Register (TLBSX) is used to search for a virtual page number in the Unified Translation Look-Aside Buffer (UTLB). It is controlled by the C_USE_MMU configuration option on the MicroBlaze processor.
The register is only implemented if C_USE_MMU is greater than 1 (User Mode), and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency).
When written with the MTS instruction, the TLBSX is specified by setting Sa = 0x1005. The following figure illustrates the TLBSX register and Table 2-23 provides bit descriptions and reset values.
Figure 2-17: TLBSX
Table 2-23: Translation Look-Aside Buffer Index Search Register (TLBSX)
Bits Name Description Reset Value
0:21 VPN Virtual Page Number
This field represents the page number portion of the virtual memory address. It is compared with the page number portion of the virtual memory address under the control of the SIZE field, in each of the Translation Look-Aside Buffer entries that have the V bit set to 1.
If the virtual page number is found, the TLBX register is written with the index of the TLB entry and the MISS bit in TLBX is cleared to 0. If the virtual page number is not found in any of the TLB entries, the MISS bit in the TLBX register is set to 1.
Write Only
22:31 Reserved
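The interaction between TLBSX and TLBX can be modeled in software as follows. This C sketch is a simplified behavioral model, not the hardware algorithm: the entry type utlb_entry_t and its cmp_bits member are hypothetical, with cmp_bits standing in for the SIZE-dependent compare range (the number of leading TAG bits that take part in the comparison). The model follows only the VPN comparison described in Table 2-23, and assumes 64 entries because INDEX is a 6-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UTLB_ENTRIES 64u          /* INDEX is a 6-bit field */
#define TLBX_MISS    0x80000000u  /* MISS is bit 0, the most significant bit */

/* Simplified UTLB entry (hypothetical type for this sketch). */
typedef struct {
    uint32_t tag;      /* 22-bit TAG field */
    unsigned cmp_bits; /* assumption: leading TAG bits compared, 1..22 */
    bool valid;        /* V bit */
} utlb_entry_t;

/* Model a write of vpn to TLBSX; returns the resulting TLBX value. */
uint32_t tlbsx_write(const utlb_entry_t *utlb, uint32_t vpn, uint32_t old_tlbx)
{
    uint32_t i;

    for (i = 0; i < UTLB_ENTRIES; i++) {
        uint32_t mask;

        if (!utlb[i].valid) {
            continue;             /* only entries with V = 1 are compared */
        }
        assert(utlb[i].cmp_bits >= 1u && utlb[i].cmp_bits <= 22u);
        mask = (0x3FFFFFu << (22u - utlb[i].cmp_bits)) & 0x3FFFFFu;
        if (((utlb[i].tag ^ vpn) & mask) == 0u) {
            return i;             /* hit: MISS = 0, INDEX = i */
        }
    }
    return old_tlbx | TLBX_MISS;  /* miss: MISS = 1, INDEX not updated */
}

bool tlbx_miss(uint32_t tlbx)      { return (tlbx & TLBX_MISS) != 0u; }
unsigned tlbx_index(uint32_t tlbx) { return (unsigned)(tlbx & 0x3Fu); }
```

After a search, software would typically test the MISS bit first and only then read TLBHI and TLBLO for the entry selected by INDEX.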
Processor Version Register (PVR)
The Processor Version Register is controlled by the C_PVR configuration option on MicroBlaze.
When C_PVR is set to 0 (None) the processor does not implement any PVR and MSR[PVR]=0.
When C_PVR is set to 1 (Basic), MicroBlaze implements only the first register: PVR0, and if set to 2 (Full), all 13 PVR registers (PVR0 to PVR12) are implemented.
When read with the MFS or MFSE instruction the PVR is specified by setting Sa = 0x200x, with x being the register number between 0x0 and 0xC.
When extended data addressing is enabled (parameter C_ADDR_SIZE > 32), the 32 least significant bits of PVR8 and PVR9 are read with the MFS instruction, and the most significant bits with the MFSE instruction.
When physical address extension (PAE) is enabled (parameters C_USE_MMU = 3 and C_ADDR_SIZE > 32), the 32 least significant bits of PVR6 and PVR7 are read with the MFS instruction, and the most significant bits with the MFSE instruction.
Table 2-24 through Table 2-36 provide bit descriptions and values.
Table 2-24: Processor Version Register 0 (PVR0)
Bits Name Description Value
0 CFG PVR implementation: Based on C_PVR
0 = Basic, 1 = Full
1 BS Use barrel shifter C_USE_BARREL
2 DIV Use divider C_USE_DIV
3 MUL Use hardware multiplier C_USE_HW_MUL > 0 (None)
4 FPU Use FPU C_USE_FPU > 0 (None)
5 EXC Use any type of exceptions Based on C_*_EXCEPTION
Also set if C_USE_MMU > 0 (None)
6 ICU Use instruction cache C_USE_ICACHE
7 DCU Use data cache C_USE_DCACHE
8 MMU Use MMU C_USE_MMU > 0 (None)
9 BTC Use branch target cache C_USE_BRANCH_TARGET_CACHE
10 ENDI Selected endianness: C_ENDIANNESS
Always 1 = Little endian
11 FT Implement fault tolerant features C_FAULT_TOLERANT
12 SPROT Use stack protection C_USE_STACK_PROTECTION
13 REORD Implement reorder instructions C_USE_REORDER_INSTR
14:15 Reserved 0
16:23 MBV MicroBlaze release version code Release Specific
0x19 = v8.40.b
0x1B = v9.0
0x1D = v9.1
0x1F = v9.2
0x20 = v9.3
0x21 = v9.4
0x22 = v9.5
0x23 = v9.6
0x24 = v10.0
24:31 USR1 User configured value 1 C_PVR_USER1
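Software that has read PVR0 can test individual feature flags and translate the MBV field as sketched below. The function names pvr0_flag and pvr0_release are hypothetical; the version codes are those listed in Table 2-24, and any other code is reported as unknown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Test one feature flag. Flags occupy bits 0:13, bit 0 being the MSB. */
bool pvr0_flag(uint32_t pvr0, unsigned bit)
{
    assert(bit <= 13u);
    return ((pvr0 >> (31u - bit)) & 1u) != 0u;
}

/* Translate the MBV field (bits 16:23) to a release name. */
const char *pvr0_release(uint32_t pvr0)
{
    switch ((pvr0 >> 8) & 0xFFu) {
    case 0x19: return "v8.40.b";
    case 0x1B: return "v9.0";
    case 0x1D: return "v9.1";
    case 0x1F: return "v9.2";
    case 0x20: return "v9.3";
    case 0x21: return "v9.4";
    case 0x22: return "v9.5";
    case 0x23: return "v9.6";
    case 0x24: return "v10.0";
    default:   return "unknown";
    }
}
```

For instance, pvr0_flag(value, 8) reports whether an MMU is present, since MMU is bit 8 in Table 2-24.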
Table 2-25: Processor Version Register 1 (PVR1)
Bits Name Description Value
0:31 USR2 User configured value 2 C_PVR_USER2
Table 2-26: Processor Version Register 2 (PVR2)
Bits Name Description Value
0 DAXI Data side AXI4 or ACE in use C_D_AXI
1 DLMB Data side LMB in use C_D_LMB
2 IAXI Instruction side AXI4 or ACE in use C_I_AXI
3 ILMB Instruction side LMB in use C_I_LMB
4 IRQEDGE Interrupt is edge triggered C_INTERRUPT_IS_EDGE
5 IRQPOS Interrupt edge is positive C_EDGE_IS_POSITIVE
6 CEEXC Generate bus exceptions for ECC correctable errors in LMB memory C_ECC_USE_CE_EXCEPTION
7 FREQ Select implementation to optimize processor frequency C_AREA_OPTIMIZED=2 (Frequency)
8 Reserved 0
9 Reserved 1
10 ACE Use ACE interconnect C_INTERCONNECT = 3 (ACE)
11 AXI4DP Data Peripheral AXI interface uses AXI4 protocol, with support for exclusive access C_M_AXI_DP_EXCLUSIVE_ACCESS
12 FSL Use extended AXI4-Stream instructions C_USE_EXTENDED_FSL_INSTR
13 FSLEXC Generate exception for AXI4-Stream control bit mismatch C_FSL_EXCEPTION
14 MSR Use msrset and msrclr instructions C_USE_MSR_INSTR
15 PCMP Use pattern compare and CLZ instructions C_USE_PCMP_INSTR
16 AREA Select implementation to optimize area with lower instruction throughput C_AREA_OPTIMIZED = 1 (Area)
17 BS Use barrel shifter C_USE_BARREL
18 DIV Use divider C_USE_DIV
19 MUL Use hardware multiplier C_USE_HW_MUL > 0 (None)
20 FPU Use FPU C_USE_FPU > 0 (None)
21 MUL64 Use 64-bit hardware multiplier C_USE_HW_MUL = 2 (Mul64)
22 FPU2 Use floating-point conversion and square root instructions C_USE_FPU = 2 (Extended)
23 IMPEXC Allow imprecise exceptions for ECC errors in LMB memory C_IMPRECISE_EXCEPTIONS
24 Reserved 0
25 OP0EXC Generate exception for 0x0 illegal opcode C_OPCODE_0x0_ILLEGAL
26 UNEXC Generate exception for unaligned data access C_UNALIGNED_EXCEPTIONS
27 OPEXC Generate exception for any illegal opcode C_ILL_OPCODE_EXCEPTION
28 AXIDEXC Generate exception for M_AXI_D error C_M_AXI_D_BUS_EXCEPTION
29 AXIIEXC Generate exception for M_AXI_I error C_M_AXI_I_BUS_EXCEPTION
30 DIVEXC Generate exception for division by zero or division overflow C_DIV_ZERO_EXCEPTION
31 FPUEXC Generate exceptions from FPU C_FPU_EXCEPTION
Table 2-27: Processor Version Register 3 (PVR3)
Bits Name Description Value
0 DEBUG Use debug logic C_DEBUG_ENABLED > 0
1 EXT_DEBUG Use extended debug logic C_DEBUG_ENABLED = 2 (Extended)
2 Reserved
3:6 PCBRK Number of PC breakpoints C_NUMBER_OF_PC_BRK
7:9 Reserved
10:12 RDADDR Number of read address breakpoints C_NUMBER_OF_RD_ADDR_BRK
13:15 Reserved
16:18 WRADDR Number of write address breakpoints C_NUMBER_OF_WR_ADDR_BRK
19 Reserved 0
20:24 FSL Number of AXI4-Stream links C_FSL_LINKS
25:28 Reserved
29:31 BTC_SIZE Branch Target Cache size C_BRANCH_TARGET_CACHE_SIZE
Table 2-28: Processor Version Register 4 (PVR4)
Bits Name Description Value
0 ICU Use instruction cache C_USE_ICACHE
1:5 ICTS Instruction cache tag size C_ADDR_TAG_BITS
6 Reserved 1
7 ICW Allow instruction cache write C_ALLOW_ICACHE_WR
8:10 ICLL The base two logarithm of the instruction cache line length log2(C_ICACHE_LINE_LEN)
11:15 ICBS The base two logarithm of the instruction cache byte size log2(C_CACHE_BYTE_SIZE)
16 IAU The instruction cache is used for all memory accesses within the cacheable range C_ICACHE_ALWAYS_USED
17:18 Reserved 0
19:21 ICV Instruction cache victims 0-3: C_ICACHE_VICTIMS = 0,2,4,8
22:23 ICS Instruction cache streams C_ICACHE_STREAMS
24 IFTL Instruction cache tag uses distributed RAM C_ICACHE_FORCE_TAG_LUTRAM
25 ICDW Instruction cache data width C_ICACHE_DATA_WIDTH > 0
26:31 Reserved 0
Table 2-29: Processor Version Register 5 (PVR5)
Bits Name Description Value
0 DCU Use data cache C_USE_DCACHE
1:5 DCTS Data cache tag size C_DCACHE_ADDR_TAG
6 Reserved 1
7 DCW Allow data cache write C_ALLOW_DCACHE_WR
8:10 DCLL The base two logarithm of the data cache line length log2(C_DCACHE_LINE_LEN)
11:15 DCBS The base two logarithm of the data cache byte size log2(C_DCACHE_BYTE_SIZE)
16 DAU The data cache is used for all memory accesses within the cacheable range C_DCACHE_ALWAYS_USED
17 DWB Data cache policy is write-back C_DCACHE_USE_WRITEBACK
18 Reserved 0
19:21 DCV Data cache victims 0-3: C_DCACHE_VICTIMS = 0,2,4,8
22:23 Reserved 0
24 DFTL Data cache tag uses distributed RAM C_DCACHE_FORCE_TAG_LUTRAM
25 DCDW Data cache data width C_DCACHE_DATA_WIDTH > 0
26 AXI4DC Data Cache AXI interface uses AXI4 protocol, with support for exclusive access C_M_AXI_DC_EXCLUSIVE_ACCESS
27:31 Reserved 0
Table 2-30: Processor Version Register 6 (PVR6)
Bits Name Description Value
0:C_ADDR_SIZE-1 ICBA Instruction Cache Base Address C_ICACHE_BASEADDR
Table 2-31: Processor Version Register 7 (PVR7)
Bits Name Description Value
0:C_ADDR_SIZE-1 ICHA Instruction Cache High Address C_ICACHE_HIGHADDR
Table 2-32: Processor Version Register 8 (PVR8)
Bits Name Description Value
0:C_ADDR_SIZE-1 DCBA Data Cache Base Address C_DCACHE_BASEADDR
Table 2-33: Processor Version Register 9 (PVR9)
Bits Name Description Value
0:C_ADDR_SIZE-1 DCHA Data Cache High Address C_DCACHE_HIGHADDR
Table 2-34: Processor Version Register 10 (PVR10)
Bits Name Description Value
0:7 ARCH Target architecture: Defined by parameter C_FAMILY
0xF = Virtex®-7, Defense Grade Virtex-7 Q
0x10 = Kintex®-7, Defense Grade Kintex-7 Q
0x11 = Artix®-7, Automotive Artix-7, Defense Grade Artix-7 Q
0x12 = Zynq®-7000, Automotive Zynq-7000, Defense Grade Zynq-7000 Q
0x13 = UltraScale™ Virtex
0x14 = UltraScale Kintex
0x15 = UltraScale+™ Zynq, Automotive UltraScale+ Zynq
0x16 = UltraScale+ Virtex
0x17 = UltraScale+ Kintex
0x18 = Spartan®-7, Automotive Spartan-7
8:13 ASIZE Number of extended address bits C_ADDR_SIZE - 32
14:31 Reserved 0
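The ASIZE field makes it possible to recover the configured address width at run time. The following C sketch shows the arithmetic; the function names pvr10_arch and pvr10_addr_size are hypothetical, and the upper bound of 64 address bits is taken from the TLBLO description earlier in this chapter.

```c
#include <assert.h>
#include <stdint.h>

/* ARCH occupies bits 0:7, the most significant byte of PVR10. */
unsigned pvr10_arch(uint32_t pvr10)
{
    return (unsigned)(pvr10 >> 24);
}

/* ASIZE (bits 8:13) holds C_ADDR_SIZE - 32, so add 32 to get the width. */
unsigned pvr10_addr_size(uint32_t pvr10)
{
    unsigned asize = (unsigned)((pvr10 >> 18) & 0x3Fu);

    assert(asize <= 32u);   /* addresses are at most 64 bits wide */
    return 32u + asize;
}
```

A value of 0x18200000, for example, would indicate a Spartan-7 target (ARCH = 0x18) with a 40-bit address size (ASIZE = 8).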
Table 2-35: Processor Version Register 11 (PVR11)
Bits Name Description Value
0:1 MMU Use MMU: C_USE_MMU
0 = None 1 = User Mode 2 = Protection 3 = Virtual
2:4 ITLB Instruction Shadow TLB size log2(C_MMU_ITLB_SIZE)
5:7 DTLB Data Shadow TLB size log2(C_MMU_DTLB_SIZE)
8:9 TLBACC TLB register access: C_MMU_TLB_ACCESS
0 = Minimal 1 = Read 2 = Write 3 = Full
10:14 ZONES Number of memory protection zones C_MMU_ZONES
15 PRIVINS Privileged instructions: C_MMU_PRIVILEGED_INSTR
0 = Full protection 1 = Allow stream instructions
16 Reserved Reserved for future use 0
17:31 RSTMSR Reset value for MSR C_RESET_MSR_IE << 2 | C_RESET_MSR_BIP << 4 | C_RESET_MSR_ICE << 6 | C_RESET_MSR_DCE << 8 | C_RESET_MSR_EE << 9 | C_RESET_MSR_EIP << 10
Table 2-36: Processor Version Register 12 (PVR12)
Bits Name Description Value
0:31 VECTORS Location of MicroBlaze vectors C_BASE_VECTORS
Pipeline Architecture
MicroBlaze instruction execution is pipelined. For most instructions, each stage takes one clock cycle to complete. Consequently, the number of clock cycles necessary for a specific instruction to complete is equal to the number of pipeline stages, and one instruction is completed on every cycle in the absence of data, control or structural hazards.
A data hazard occurs when the result of an instruction is needed by a subsequent instruction. This can result in stalling the pipeline, unless the result can be forwarded to the subsequent instruction. The MicroBlaze GNU Compiler attempts to avoid data hazards by reordering instructions during optimization.
A control hazard occurs when a branch is taken, and the next instruction is not immediately available. This results in stalling the pipeline. MicroBlaze provides delay slot branches and the optional branch target cache to reduce the number of stall cycles.
A structural hazard occurs for a few instructions that require multiple clock cycles in the execute stage or a later stage to complete. This is achieved by stalling the pipeline.
Load and store instructions accessing slower memory might take multiple cycles. The pipeline is stalled until the access completes. MicroBlaze provides the optional data cache to improve the average latency of slower memory.
When executing from slower memory, instruction fetches might take multiple cycles. This additional latency directly affects the efficiency of the pipeline. MicroBlaze implements an instruction prefetch buffer that reduces the impact of such multi-cycle instruction memory latency. While the pipeline is stalled for any other reason, the prefetch buffer continues to load sequential instructions speculatively. When the pipeline resumes execution, the fetch stage can load new instructions directly from the prefetch buffer instead of waiting for the instruction memory access to complete.
If instructions are modified during execution (for example with self-modifying code), the prefetch buffer should be emptied before executing the modified instructions, to ensure that it does not contain the old unmodified instructions.
RECOMMENDED: The recommended way to do this is using an MBAR instruction, although it is also possible to use a synchronizing branch instruction, for example BRI 4.
MicroBlaze also provides the optional instruction cache to improve the average instruction fetch latency of slower memory.
All hazards are independent, and can potentially occur simultaneously. In such cases, the number of cycles the pipeline is stalled is defined by the hazard with the longest stall duration.
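The cycle accounting described in this section can be written down as a short C sketch. It is a simplified model under the stated assumptions (one cycle per stage, one instruction issued per cycle, stalls added on top); the function names combined_stall and pipeline_cycles are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simultaneous hazards: the stall is set by the longest one, not the sum. */
unsigned combined_stall(const unsigned *stalls, size_t count)
{
    unsigned longest = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (stalls[i] > longest) {
            longest = stalls[i];
        }
    }
    return longest;
}

/* Cycles to complete a sequence: fill the pipeline once, then one
 * instruction per cycle, plus every cycle the pipeline was stalled. */
unsigned long pipeline_cycles(unsigned stages, unsigned long instructions,
                              unsigned long stall_cycles)
{
    assert(stages == 3u || stages == 5u || stages == 8u);
    if (instructions == 0u) {
        return 0u;
    }
    return stages + (instructions - 1u) + stall_cycles;
}
```

The model agrees with the three-instruction timing diagrams in the pipeline sections that follow: 3 + 2 + 2 = 7 cycles for the three stage pipeline, 5 + 2 + 2 = 9 for the five stage pipeline, and 8 + 2 + 1 = 11 for the eight stage pipeline.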
Three Stage Pipeline
With C_AREA_OPTIMIZED set to 1 (Area), the pipeline is divided into three stages to minimize hardware cost: Fetch, Decode, and Execute.
               cycle1   cycle2   cycle3   cycle4   cycle5   cycle6   cycle7
instruction 1  Fetch    Decode   Execute
instruction 2           Fetch    Decode   Execute  Execute  Execute
instruction 3                    Fetch    Decode   Stall    Stall    Execute
The three stage pipeline does not have any data hazards. Pipeline stalls are caused by control hazards, structural hazards due to multi-cycle instructions, memory accesses using slower memory, instruction fetch from slower memory, or stream accesses.
The multi-cycle instruction categories are barrel shift, multiply, divide and floating-point instructions.
Five Stage Pipeline
With C_AREA_OPTIMIZED set to 0 (Performance), the pipeline is divided into five stages to maximize performance: Fetch (IF), Decode (OF), Execute (EX), Access Memory (MEM), and Writeback (WB).
               cycle1 cycle2 cycle3 cycle4 cycle5 cycle6 cycle7 cycle8 cycle9
instruction 1  IF     OF     EX     MEM    WB
instruction 2         IF     OF     EX     MEM    MEM    MEM    WB
instruction 3                IF     OF     EX     Stall  Stall  MEM    WB
The five stage pipeline has two kinds of data hazard:
An instruction in OF needs the result from an instruction in EX as a source operand. In this case, the EX instruction categories are load, store, barrel shift, multiply, divide, and floating-point instructions. This results in a 1-2 cycle stall.
An instruction in OF uses the result from an instruction in MEM as a source operand. In this case, the MEM instruction categories are load, multiply, and floating-point instructions. This results in a 1 cycle stall.
Pipeline stalls are caused by data hazards, control hazards, structural hazards due to multi-cycle instructions, memory accesses using slower memory, instruction fetch from slower memory, or stream accesses.
The multi-cycle instruction categories are divide and floating-point instructions.
Eight Stage Pipeline
With C_AREA_OPTIMIZED set to 2 (Frequency), the pipeline is divided into eight stages to maximize possible frequency: Fetch (IF), Decode (OF), Execute (EX), Access Memory 0 (M0), Access Memory 1 (M1), Access Memory 2 (M2), Access Memory 3 (M3) and Writeback (WB).
               cycle1  cycle2  cycle3  cycle4  cycle5  cycle6  cycle7  cycle8  cycle9  cycle10 cycle11
instruction 1  IF      OF      EX      M0      M1      M2      M3      WB
instruction 2          IF      OF      EX      M0      M0      M1      M2      M3      WB
instruction 3                  IF      OF      EX      Stall   M0      M1      M2      M3      WB
The eight stage pipeline has four kinds of data hazard:
An instruction in OF needs the result from an instruction in EX as a source operand. In this case, the EX instruction categories are load, store, barrel shift, multiply, divide, and floating-point instructions. This results in a 1-5 cycle stall.
An instruction in OF uses the result from an instruction in M0 as a source operand. In this case, the M0 instruction categories are load, multiply, divide, and floating-point instructions. This results in a 1-4 cycle stall.
An instruction in OF uses the result from an instruction in M1 or M2 as a source operand. In this case, the M1 or M2 instruction categories are load, divide, and floating-point instructions. This results in a 1-3 or 1-2 cycle stall respectively.
An instruction in OF uses the result from an instruction in M3 as a source operand. In this case, M3 instruction categories are load and floating-point instructions. This results in a 1 cycle stall.
In addition to multi-cycle instructions, there are three other kinds of structural hazards:
An instruction in OF is a stream instruction, and the instruction in EX is a stream, load, store, divide, or floating-point instruction with corresponding exception implemented. This results in a 1 cycle stall.
An instruction in OF is a stream instruction, and the instruction in M0, M1, M2 or M3 is a load, store, divide, or floating-point instruction with corresponding exception implemented. This results in a 1 cycle stall.
An instruction in M0 is a load or store instruction, and the instruction in M1, M2 or M3 is a load, store, divide, or floating-point instruction with corresponding exception implemented. This results in a 1 cycle stall.
Pipeline stalls are caused by data hazards, control hazards, structural hazards, memory accesses using slower memory, instruction fetch from slower memory, or stream accesses.
The multi-cycle instruction categories are divide instructions and floating-point instructions FDIV, FINT, and FSQRT.
Branches
Normally the instructions in the fetch and decode stages (as well as prefetch buffer) are flushed when executing a taken branch. The fetch pipeline stage is then reloaded with a new instruction from the calculated branch address. A taken branch in MicroBlaze takes three clock cycles to execute, two of which are required for refilling the pipeline. To reduce this latency overhead, MicroBlaze supports branches with delay slots and the optional branch target cache.
Delay Slots
When executing a taken branch with delay slot, only the fetch pipeline stage in MicroBlaze is flushed. The instruction in the decode stage (branch delay slot) is allowed to complete. This technique effectively reduces the branch penalty from two clock cycles to one. Branch instructions with delay slots have a D appended to the instruction mnemonic. For example, the BNE instruction does not execute the subsequent instruction (does not have a delay slot), whereas BNED executes the next instruction before control is transferred to the branch location.
A delay slot must not contain the following instructions: IMM, branch, or break. Interrupts and external hardware breaks are deferred until after the delay slot branch has been completed. Instructions that could cause recoverable exceptions (for example unaligned word or halfword load and store) are allowed in the delay slot.
If an exception is caused in a delay slot the ESR[DS] bit is set, and the exception handler is responsible for returning the execution to the branch target (stored in the special purpose register BTR). If the ESR[DS] bit is set, register R17 is not valid (otherwise it contains the address following the instruction causing the exception).
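The choice between BTR and R17 can be sketched in C as shown below. This is an illustration only: the function name exception_resume_address is hypothetical, and the mask ESR_DS_MASK assumes that DS is bit 19 of ESR (counted from the most significant bit), as given in the ESR description elsewhere in this chapter. The function only selects which saved address is meaningful; what the handler does with it depends on the exception type.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: DS is ESR bit 19 with MSB-first numbering, that is 1 << 12. */
#define ESR_DS_MASK 0x00001000u

/* Select the address at which execution continues after the handler. */
uint32_t exception_resume_address(uint32_t esr, uint32_t btr, uint32_t r17)
{
    if ((esr & ESR_DS_MASK) != 0u) {
        /* Exception in a delay slot: R17 is not valid, use the branch target. */
        assert((btr & 3u) == 0u);   /* instructions are word aligned */
        return btr;
    }
    /* Otherwise R17 holds the address following the faulting instruction. */
    assert((r17 & 3u) == 0u);
    return r17;
}
```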
Branch Target Cache
To improve branch performance, MicroBlaze provides a branch target cache (BTC) coupled with a branch prediction scheme. With the BTC enabled, a correctly predicted immediate branch or return instruction incurs no overhead.
The BTC operates by saving the target address of each immediate branch and return instruction the first time the instruction is encountered. The next time it is encountered, it is usually found in the Branch Target Cache, and the Instruction Fetch Program Counter is then simply changed to the saved target address, in case the branch should be taken. Unconditional branches and return instructions are always taken, whereas conditional branches use branch prediction, to avoid taking a branch that should not have been taken and vice versa.
The BTC is cleared when a memory barrier (MBAR 0) or synchronizing branch (BRI 4) is executed. This also occurs when the memory barrier or synchronizing branch follows immediately after a branch instruction, even if that branch is taken. To avoid inadvertently clearing the BTC, the memory barrier or synchronizing branch should not be placed immediately after a branch instruction.
There are three cases where the branch prediction can cause a mispredict, namely:
A conditional branch that should not have been taken, is actually taken,
A conditional branch that should actually have been taken, is not taken,
The target address of a return instruction is incorrect, which might occur when returning from a function called from different places in the code.
All of these cases are detected and corrected when the branch or return instruction reaches the execute stage, and the branch prediction bits or target address are updated in the BTC, to reflect the actual instruction behavior. This correction incurs a penalty of 2 clock cycles for the 5-stage pipeline and 7-9 clock cycles for the 8-stage pipeline.
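As a rough illustration of what the penalty means for throughput, the average overhead per predicted branch is the mispredict rate multiplied by the penalty. The helper below is only a sketch; the function name and the idea of supplying a measured mispredict rate are assumptions, not part of the processor documentation.

```c
/* Average extra cycles per predicted branch, given the fraction of
 * branches that are mispredicted and the per-mispredict penalty
 * (2 cycles for the 5-stage pipeline, 7 to 9 for the 8-stage one). */
static double avg_mispredict_cycles(double mispredict_rate,
                                    unsigned penalty_cycles)
{
    return mispredict_rate * (double)penalty_cycles;
}
```

For example, a hypothetical 25% mispredict rate on the 5-stage pipeline costs half a cycle per branch on average.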
The size of the BTC can be selected with C_BRANCH_TARGET_CACHE_SIZE. The default recommended setting uses one block RAM, and provides 512 entries. When selecting 64 entries or below, distributed RAM is used to implement the BTC, otherwise block RAM is used.
When the BTC uses block RAM, and C_FAULT_TOLERANT is set to 1, block RAMs are protected by parity. In case of a parity error, the branch is not predicted. To avoid accumulating errors in this case, the BTC should be cleared periodically by a synchronizing branch.
The Branch Target Cache is available when C_USE_BRANCH_TARGET_CACHE is set to 1 and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency).
Pipeline Hazard Example
The effect of a data hazard is illustrated in Table 2-37, using the five stage pipeline.
The example shows a data hazard for a multiplication instruction, where the subsequent add instruction needs the result in register r3 to proceed. This means that the add instruction is stalled in OF during cycle 3 and 4 until the multiplication is complete.
Table 2-37: Multiplication Data Hazard Example
Cycle | IF             | OF             | EX             | MEM            | WB
1     | mul r3, r4, r5 |                |                |                |
2     | add r6, r3, r4 | mul r3, r4, r5 |                |                |
3     |                | add r6, r3, r4 | mul r3, r4, r5 |                |
4     |                | add r6, r3, r4 | -              | mul r3, r4, r5 |
5     |                | add r6, r3, r4 | -              | -              | mul r3, r4, r5
6     |                |                | add r6, r3, r4 | -              | -
Avoiding Data Hazards
In some cases, the MicroBlaze GNU Compiler is not able to optimize code to completely avoid data hazards. However, it is often possible to change the source code in order to achieve this, mainly by better utilization of the general purpose registers.
Two C code examples are shown here:
Multiplication of a static array in memory
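static int a[4], b[4], c[4];
register int a0, a1, a2, a3, b0, b1, b2, b3, c0, c1, c2, c3;
/* Load all operands into separate registers first. */
a0 = a[0]; a1 = a[1]; a2 = a[2]; a3 = a[3];
b0 = b[0]; b1 = b[1]; b2 = b[2]; b3 = b[3];
/* Multiply; each product depends only on already loaded values. */
c0 = a0 * b0; c1 = a1 * b1; c2 = a2 * b2; c3 = a3 * b3;
/* Finally store the products. */
c[0] = c0; c[1] = c1; c[2] = c2; c[3] = c3;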
This code ensures that load instructions are first executed to load operands into separate registers, which are then multiplied and finally stored. The code can be extended up to 8 multiplications without running out of general purpose registers.
Fetching a data packet from an AXI4-Stream interface.
#include <mb_interface.h>
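static int a[4];
register int a0, a1, a2, a3;
/* Issue all get instructions first, each to a different register. */
getfsl(a0, rfsl0); getfsl(a1, rfsl0); getfsl(a2, rfsl0); getfsl(a3, rfsl0);
/* Then store the received words. */
a[0] = a0; a[1] = a1; a[2] = a2; a[3] = a3;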
This code ensures that get instructions using different registers are first executed, and then data is stored. The code can be extended to up to 16 accesses without running out of general purpose registers.
Memory Architecture
MicroBlaze is implemented with a Harvard memory architecture; instruction and data accesses are done in separate address spaces.
The instruction address space has a 32-bit virtual address range (that is, handles up to 4GB of instructions), and can be extended up to a 64-bit physical address range when using the MMU in virtual mode.
The data address space has a default 32-bit range, and can be extended up to a 64-bit range (that is, handles from 4GB to 16EB of data). The instruction and data memory ranges can be made to overlap by mapping them both to the same physical memory. The latter is necessary for software debugging.
Both instruction and data interfaces of MicroBlaze are default 32 bits wide and use big endian or little endian, bit-reversed format, depending on the selected endianness. MicroBlaze supports word, halfword, and byte accesses to data memory.
Big endian format is only supported when using the MMU in virtual or protected mode (C_USE_MMU > 1) or when reorder instructions are enabled (C_USE_REORDER_INSTR = 1).
Data accesses must be aligned (word accesses must be on word boundaries, halfword on halfword boundaries), unless the processor is configured to support unaligned exceptions. All instruction accesses must be word aligned.
MicroBlaze prefetches instructions to improve performance, using the instruction prefetch buffer and (if enabled) instruction cache streams. To avoid attempts to prefetch instructions beyond the end of physical memory, which might cause an instruction bus error or a processor stall, instructions must not be located too close to the end of physical memory. The instruction prefetch buffer requires 16 bytes margin, and using instruction cache streams adds two additional cache lines (32, 64 or 128 bytes).
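The margin described above can be computed with a short helper. This is a minimal sketch; the function and parameter names are hypothetical, and it assumes the cache line length is given in 32-bit words (4, 8 or 16).

```c
#include <stdint.h>

/* Bytes to leave unused at the end of physical memory so that
 * instruction prefetch never runs past it.
 * use_streams: nonzero if instruction cache streams are enabled.
 * line_words: instruction cache line length in words (4, 8 or 16). */
static uint32_t prefetch_margin_bytes(int use_streams, uint32_t line_words)
{
    uint32_t margin = 16u;                 /* instruction prefetch buffer */
    if (use_streams)
        margin += 2u * line_words * 4u;    /* two additional cache lines */
    return margin;
}
```

A linker script could use this value to keep the last code section away from the top of the instruction memory range.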
MicroBlaze does not separate data accesses to I/O and memory (it uses memory-mapped I/O). The processor has up to three interfaces for memory accesses:
Local Memory Bus (LMB)
Advanced eXtensible Interface (AXI4) for peripheral access
Advanced eXtensible Interface (AXI4) or AXI Coherency Extension (ACE) for cache access
The LMB memory address range must not overlap with AXI4 ranges.
The C_ENDIANNESS parameter is always set to little endian.
MicroBlaze has a single cycle latency for accesses to local memory (LMB) and for cache read hits, except with C_AREA_OPTIMIZED set to 1 (Area), when data side accesses and data cache read hits require two clock cycles, and with C_FAULT_TOLERANT set to 1, when byte writes and halfword writes to LMB normally require two clock cycles.
The data cache write latency depends on C_DCACHE_USE_WRITEBACK. When C_DCACHE_USE_WRITEBACK is set to 1, the write latency normally is one cycle (more if the cache needs to do memory accesses). When C_DCACHE_USE_WRITEBACK is cleared to 0, the write latency normally is two cycles (more if the posted-write buffer in the memory controller is full).
The MicroBlaze instruction and data caches can be configured to use 4, 8 or 16 word cache lines. When using a longer cache line, more bytes are prefetched, which generally improves performance for software with sequential access patterns. However, for software with a more random access pattern the performance can instead decrease for a given cache size. This is caused by a reduced cache hit rate due to fewer available cache lines.
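The trade-off can be made concrete by counting cache lines for a fixed cache size. The helper below is a sketch with hypothetical names; it assumes 32-bit words, so a line of N words occupies 4N bytes.

```c
#include <stdint.h>

/* Number of cache lines available for a given cache size.
 * cache_bytes: total cache size in bytes.
 * line_words: cache line length in words (4, 8 or 16). */
static uint32_t cache_line_count(uint32_t cache_bytes, uint32_t line_words)
{
    return cache_bytes / (line_words * 4u);
}
```

For an 8 KB cache chosen as an example, 4-word lines give 512 lines while 16-word lines give only 128, which is why a random access pattern can see a lower hit rate with longer lines.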
For details on the different memory interfaces, see Chapter 3, MicroBlaze Signal Interface Description.
Privileged Instructions
The following MicroBlaze instructions are privileged:
GET, GETD, PUT, PUTD (except when explicitly allowed)
WIC, WDC
MTS, MTSE
MSRCLR, MSRSET (except when only the C bit is affected)
BRK
RTID, RTBD, RTED
BRKI (except when jumping to physical address C_BASE_VECTORS + 0x8 or C_BASE_VECTORS + 0x18)
SLEEP, HIBERNATE, SUSPEND
LBUEA, LHUEA, LWEA, SBEA, SHEA, SWEA (except when explicitly allowed)
Attempted use of these instructions when running in user mode causes a privileged instruction exception. When setting the parameter C_MMU_PRIVILEGED_INSTR to 1 or 3, the instructions GET, GETD, PUT, and PUTD are not considered privileged, and can be executed when running in user mode.
CAUTION! It is strongly discouraged to do this, unless absolutely necessary for performance reasons, because it allows application processes to interfere with each other.
When setting the parameter C_MMU_PRIVILEGED_INSTR to 2 or 3, the extended address instructions LBUEA, LHUEA, LWEA, SBEA, SHEA, and SWEA are not considered privileged, and will bypass the MMU translation, treating the extended address as a physical address. This is useful to run software in virtual mode while still having direct access to the full physical address space, but is discouraged in all cases where protection between application processes is necessary.
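The two effects of C_MMU_PRIVILEGED_INSTR described above can be summarized in code. This is only a sketch of the decision logic for the values 0 to 3; the function names are hypothetical.

```c
#include <stdbool.h>

/* Stream instructions (GET, GETD, PUT, PUTD) are usable in user mode
 * when C_MMU_PRIVILEGED_INSTR is 1 or 3. */
static bool user_stream_instr_allowed(unsigned c_mmu_privileged_instr)
{
    return c_mmu_privileged_instr == 1u || c_mmu_privileged_instr == 3u;
}

/* Extended address instructions (LBUEA, LHUEA, LWEA, SBEA, SHEA, SWEA)
 * are usable in user mode when C_MMU_PRIVILEGED_INSTR is 2 or 3. */
static bool user_extended_addr_instr_allowed(unsigned c_mmu_privileged_instr)
{
    return c_mmu_privileged_instr == 2u || c_mmu_privileged_instr == 3u;
}
```

With the value 0 both helpers return false, which corresponds to the fully protected configuration.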
There are six ways to leave user mode and virtual mode:
1. Hardware generated reset (including debug reset)
2. Hardware exception
3. Non-maskable break or hardware break
4. Interrupt
5. Executing "BRALID Re,C_BASE_VECTORS + 0x8" to perform a user vector exception
6. Executing the software break instructions "BRKI" jumping to physical address C_BASE_VECTORS + 0x8 or C_BASE_VECTORS + 0x18
In all of these cases, except hardware generated reset, the user mode and virtual mode status is saved in the MSR UMS and VMS bits.
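The save operation can be modeled as a pure function on the MSR value. The sketch below is a software model, not processor code; the function name is hypothetical and the four bit masks are assumptions that should be verified against the MSR register description.

```c
#include <stdint.h>

/* Assumed MSR bit masks; verify against the MSR register description. */
#define MSR_UM  0x00000800u  /* user mode */
#define MSR_UMS 0x00001000u  /* user mode save */
#define MSR_VM  0x00002000u  /* virtual mode */
#define MSR_VMS 0x00004000u  /* virtual mode save */

/* Hypothetical model of the MSR update when user and virtual mode are
 * left: UM and VM are copied to UMS and VMS, then UM and VM are cleared.
 * All other MSR bits are passed through unchanged. */
static uint32_t msr_leave_user_virtual(uint32_t msr)
{
    uint32_t out = msr & ~(MSR_UM | MSR_UMS | MSR_VM | MSR_VMS);
    if (msr & MSR_UM)
        out |= MSR_UMS;
    if (msr & MSR_VM)
        out |= MSR_VMS;
    return out;
}
```

A return instruction such as RTED would perform the inverse step, restoring UM and VM from the saved bits.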
Application (user-mode) programs transfer control to system-service routines (privileged mode programs) using the BRALID or BRKI instruction, jumping to physical address C_BASE_VECTORS + 0x8. Executing this instruction causes a system-call exception to occur.
The exception handler determines which system-service routine to call and whether the calling application has permission to call that service. If permission is granted, the exception handler performs the actual procedure call to the system-service routine on behalf of the application program.
The execution environment expected by the system-service routine requires the execution of prologue instructions to set up that environment. Those instructions usually create the block of storage that holds procedural information (the activation record), update and initialize pointers, and save volatile registers (the registers that the system-service routine uses). Prologue code can be inserted by the linker when creating an executable module, or it can be included as stub code in either the system-call interrupt handler or the system-library routines.
Returns from the system-service routine reverse the process described above. Epilogue code is executed to unwind and deallocate the activation record, restore pointers, and restore volatile registers. The interrupt handler executes a return from exception instruction (RTED) to return to the application.
Virtual-Memory Management
Programs running on MicroBlaze use effective addresses to access a flat 4 GB address space. The processor can interpret this address space in one of two ways, depending on the translation mode:
In real mode, effective addresses are used to directly access physical memory
In virtual mode, effective addresses are translated into physical addresses by the virtual-memory management hardware in the processor
Virtual mode provides system software with the ability to relocate programs and data anywhere in the physical address space. System software can move inactive programs and data out of physical memory when space is required by active programs and data.
Relocation can make it appear to a program that more memory exists than is actually implemented by the system. This frees the programmer from working within the limits imposed by the amount of physical memory present in a system. Programmers do not need to know which physical-memory addresses are assigned to other software processes and hardware devices. The addresses visible to programs are translated into the appropriate physical addresses by the processor.
Virtual mode provides greater control over memory protection. Blocks of memory as small as 1 KB can be individually protected from unauthorized access. Protection and relocation enable system software to support multitasking. This capability gives the appearance of simultaneous or near-simultaneous execution of multiple programs.
In MicroBlaze, virtual mode is implemented by the memory-management unit (MMU), available when C_USE_MMU is set to 3 (Virtual) and C_AREA_OPTIMIZED is set to 0 (Performance) or 2 (Frequency). The MMU controls effective-address to physical-address mapping and supports memory protection. Using these capabilities, system software can implement demand-paged virtual memory and other memory management schemes.
The MicroBlaze MMU implementation is based upon the PowerPC™ 405 processor.
The MMU features are summarized as follows:
Translates effective addresses into physical addresses
Controls page-level access during address translation
Provides additional virtual-mode protection control through the use of zones
Provides independent control over instruction-address and data-address translation and protection
Supports eight page sizes: 1 kB, 4 kB, 16 kB, 64 kB, 256 kB, 1 MB, 4 MB, and 16 MB. Any combination of page sizes can be used by system software
Software controls the page-replacement strategy
[Figure 2-18 diagram: the 32-bit effective address (effective page number and offset) is combined with the PID taken from bits 24:31 of the Processor ID Register to form a 40-bit virtual address (PID, effective page number, offset). A translation look-aside buffer (TLB) look-up then produces either a 32-bit physical address (real page number and offset) or, with the Physical Address Extension, a physical address of up to 64 bits.]
Real Mode
The processor references memory when it fetches an instruction and when it accesses data with a load or store instruction. Programs reference memory locations using a 32-bit effective address calculated by the processor. When real mode is enabled, the physical address is identical to the effective address and the processor uses it to access physical memory. After a processor reset, the processor operates in real mode. Real mode can also be enabled by clearing the VM bit in the MSR.
Physical-memory data accesses (loads and stores) are performed in real mode using the effective address. Real mode does not provide system software with virtual address translation, but the full memory access-protection is available, implemented when C_USE_MMU > 1 (User Mode) and C_AREA_OPTIMIZED = 0 (Performance) or 2 (Frequency).
Implementation of a real-mode memory manager is more straightforward than a virtual-mode memory manager. Real mode is often an appropriate solution for memory management in simple embedded environments, when access-protection is necessary, but virtual address translation is not required.
Virtual Mode
In virtual mode, the processor translates an effective address into a physical address using the process shown in Figure 2-18. With the Physical Address Extension (PAE) the physical address can be extended up to 64 bits. Virtual mode can be enabled by setting the VM bit in the MSR.
Figure 2-18: Virtual-Mode Address Translation
Each address shown in Figure 2-18 contains a page-number field and an offset field. The page number represents the portion of the address translated by the MMU. The offset represents the byte offset into a page and is not translated by the MMU. The virtual address consists of an additional field, called the process ID (PID), which is taken from the PID register (see Process-ID Register, page 36). The combination of PID and effective page number (EPN) is referred to as the virtual page number (VPN). The value n is determined by the page size, as shown in Table 2-38.
System software maintains a page-translation table that contains entries used to translate each virtual page into a physical page. The page size defined by a page translation entry determines the size of the page number and offset fields. For example, when a 4 kB page size is used, the page-number field is 20 bits and the offset field is 12 bits. The VPN in this case is 28 bits.
The most frequently used page translations are stored in the translation look-aside buffer (TLB). When translating a virtual address, the MMU examines the page-translation entries for a matching VPN (PID and EPN). Rather than examining all entries in the table, only entries contained in the processor TLB are examined. When a page-translation entry is found with a matching VPN, the corresponding physical-page number is read from the entry and combined with the offset to form the physical address. This physical address is used by the processor to reference memory.
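The field widths in the 4 kB example follow directly from the page size. The sketch below computes them for any of the eight page sizes, assuming the SIZE encoding of Table 2-38 (0 for 1 KB up to 7 for 16 MB, each step multiplying the page size by four) and an 8-bit PID. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Field widths for a page of (1 KB << (2 * size)) bytes, where size
 * is the 3-bit SIZE field value, 0 (1 KB) to 7 (16 MB). */
struct page_fields {
    unsigned offset_bits;  /* untranslated byte offset within the page */
    unsigned epn_bits;     /* effective page number of a 32-bit address */
    unsigned vpn_bits;     /* EPN plus the 8-bit PID */
};

static struct page_fields page_fields_for_size(unsigned size)
{
    struct page_fields f;
    f.offset_bits = 10u + 2u * size;   /* 1 KB page has a 10-bit offset */
    f.epn_bits = 32u - f.offset_bits;
    f.vpn_bits = f.epn_bits + 8u;
    return f;
}
```

For size 1 (4 kB) this yields a 12-bit offset, a 20-bit page number and a 28-bit VPN, matching the figures given above.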
System software can use the PID to uniquely identify software processes (tasks, subroutines, threads) running on the processor. Independently compiled processes can operate in effective-address regions that overlap each other. This overlap must be resolved by system software if multitasking is supported. Assigning a PID to each process enables system software to resolve the overlap by relocating each process into a unique region of virtual-address space. The virtual-address space mappings enable independent translation of each process into the physical-address space.
Page-Translation Table
The page-translation table is a software-defined and software-managed data structure containing page translations. The requirement for software-managed page translation represents an architectural trade-off targeted at embedded-system applications. Embedded systems tend to have a tightly controlled operating environment and a well-defined set of application software. That environment enables virtual-memory management to be optimized for each embedded system in the following ways:
The page-translation table can be organized to maximize page-table search performance (also called table walking) so that a given page-translation entry is located quickly. Most general-purpose processors implement either an indexed page table (simple search method, large page-table size) or a hashed page table (complex search method, small page-table size). With software table walking, any hybrid organization can be employed that suits the particular embedded system. Both the page-table size and access time can be optimized.
Independent page sizes can be used for application modules, device drivers, system service routines, and data. Independent page-size selection enables system software to more efficiently use memory by reducing fragmentation (unused memory). For example, a large data structure can be allocated to a 16 MB page and a small I/O device-driver can be allocated to a 1 KB page.
Page replacement can be tuned to minimize the occurrence of missing page translations. As described in the following section, the most-frequently used page translations are stored in the translation look-aside buffer (TLB).
Software is responsible for deciding which translations are stored in the TLB and which translations are replaced when a new translation is required. The replacement strategy can be tuned to avoid thrashing, whereby page-translation entries are constantly being moved in and out of the TLB. The replacement strategy can also be tuned to prevent replacement of critical-page translations, a process sometimes referred to as page locking.
The unified 64-entry TLB, managed by software, caches a subset of instruction and data page-translation entries accessible by the MMU. Software is responsible for reading entries from the page-translation table in system memory and storing them in the TLB. The following section describes the unified TLB in more detail. Internally, the MMU also contains shadow TLBs for instructions and data, with sizes configurable by C_MMU_ITLB_SIZE and C_MMU_DTLB_SIZE respectively.
These shadow TLBs are managed entirely by the processor (transparent to software) and are used to minimize access conflicts with the unified TLB.
Translation Look-Aside Buffer
The translation look-aside buffer (TLB) is used by the MicroBlaze MMU for address translation when the processor is running in virtual mode, memory protection, and storage control. Each entry within the TLB contains the information necessary to identify a virtual page (PID and effective page number), specify its translation into a physical page, determine the protection characteristics of the page, and specify the storage attributes associated with the page.
The MicroBlaze TLB is physically implemented as three separate TLBs:
Unified TLB: The UTLB contains 64 entries and is pseudo-associative. Instruction-page and data-page translation can be stored in any UTLB entry. The initialization and management of the UTLB is controlled completely by software.
Instruction Shadow TLB: The ITLB contains instruction page-translation entries and is fully associative. The page-translation entries stored in the ITLB represent the most-recently accessed instruction-page translations from the UTLB. The ITLB is used to minimize contention between instruction translation and UTLB-update operations. The initialization and management of the ITLB is controlled completely by hardware and is transparent to software.
Data Shadow TLB: The DTLB contains data page-translation entries and is fully associative. The page-translation entries stored in the DTLB represent the most-recently accessed data-page translations from the UTLB. The DTLB is used to minimize contention between data translation and UTLB-update operations. The initialization and management of the DTLB is controlled completely by hardware and is transparent to software.
The following figure provides the translation flow for TLB.
Figure 2-19: TLB Address Translation Flow
[Figure 2-19 flowchart: for an instruction-side or data-side effective address, no translation is performed when translation is disabled (MSR[VM]=0). When translation is enabled (MSR[VM]=1), the ITLB or DTLB is looked up first. On a shadow TLB hit, the real address is extracted from the ITLB or DTLB and the cache access continues. On a shadow TLB miss, the UTLB is looked up: a UTLB hit extracts the real address from the UTLB and routes the address to the ITLB or DTLB, whereas a UTLB miss raises an instruction-side or data-side TLB miss exception.]
TLB Entry Format
The following figure shows the format of a TLB entry. Each TLB entry ranges from 68 bits up to 100 bits and is composed of two portions: TLBLO (also referred to as the data entry), and TLBHI (also referred to as the tag entry).
Figure 2-20: TLB Entry Format (PAE Disabled)
[Figure 2-20 layout: TLBHI contains TAG (bits 0:21), SIZE (22:24), V (25), E (26), U0 (27), and TID (28:35). TLBLO contains RPN (bits 0:21), EX (22), WR (23), ZSEL (24:27), W (28), I (29), M (30), and G (31).]
When the Physical Address Extension (PAE) is enabled, the TLB entry is extended with up to 32 additional bits in the TLBLO RPN field to support up to a 64 bit physical address.
The TLB entry contents are described in more detail in Table 2-20 and Table 2-21, including the TLBLO format with PAE enabled.
The fields within a TLB entry are categorized as follows:
Virtual-page identification (TAG, SIZE, V, TID): These fields identify the page-translation entry. They are compared with the virtual-page number during the translation process.
Physical-page identification (RPN, SIZE): These fields identify the translated page in physical memory.
Access control (EX, WR, ZSEL): These fields specify the type of access allowed in the page and are used to protect pages from improper accesses.
Storage attributes (W, I, M, G, E, U0): These fields specify the storage-control attributes, such as caching policy for the data cache (write-back or write-through), whether a page is cacheable, and how bytes are ordered (endianness).
Table 2-38 shows the relationship between the TLB-entry SIZE field and the translated page size. This table also shows how the page size determines which address bits are involved in a tag comparison, which address bits are used as a page offset, and which bits in the physical page number are used in the physical address. With PAE enabled, the most significant bits of the physical address are directly taken from the extended RPN field.
Table 2-38: Page-Translation Bit Ranges by Page Size
Page Size | SIZE (TLBHI Field) | Tag Comparison Bit Range | Page Offset | PAE Disabled: Physical Page Number | PAE Disabled: RPN Bits Clear to 0 | PAE Enabled: Physical Page Number (1) | PAE Enabled: RPN Bits Clear to 0
1 KB | 000 | TAG[0:21] - Address[0:21] | Address[22:31] | RPN[0:21] | - | RPN[0:n-11] | -
4 KB | 001 | TAG[0:19] - Address[0:19] | Address[20:31] | RPN[0:19] | 20:21 | RPN[0:n-13] | n-12:n-11
16 KB | 010 | TAG[0:17] - Address[0:17] | Address[18:31] | RPN[0:17] | 18:21 | RPN[0:n-15] | n-14:n-11
64 KB | 011 | TAG[0:15] - Address[0:15] | Address[16:31] | RPN[0:15] | 16:21 | RPN[0:n-17] | n-16:n-11
256 KB | 100 | TAG[0:13] - Address[0:13] | Address[14:31] | RPN[0:13] | 14:21 | RPN[0:n-19] | n-18:n-11
1 MB | 101 | TAG[0:11] - Address[0:11] | Address[12:31] | RPN[0:11] | 12:21 | RPN[0:n-21] | n-20:n-11
4 MB | 110 | TAG[0:9] - Address[0:9] | Address[10:31] | RPN[0:9] | 10:21 | RPN[0:n-23] | n-22:n-11
16 MB | 111 | TAG[0:7] - Address[0:7] | Address[8:31] | RPN[0:7] | 8:21 | RPN[0:n-25] | n-24:n-11
1. The bit index n = C_ADDR_SIZE.
TLB Access
When the MMU translates a virtual address (the combination of PID and effective address) into a physical address, it first examines the appropriate shadow TLB for the page translation entry. If an entry is found, it is used to access physical memory. If an entry is not found, the MMU examines the UTLB for the entry. A delay occurs each time the UTLB must be accessed due to a shadow TLB miss. The miss latency ranges from 2-32 cycles. The DTLB has priority over the ITLB if both simultaneously access the UTLB.
Figure 2-21 shows the logical process the MMU follows when examining a page-translation entry in one of the shadow TLBs or the UTLB. All valid entries in the TLB are checked.
A TLB hit occurs when all of the following conditions are met by a TLB entry:
The entry is valid
The TAG field in the entry matches the effective address EPN under the control of the SIZE field in the entry
The TID field in the entry matches the PID
If any of the above conditions are not met, a TLB miss occurs. A TLB miss causes an exception, described as follows:
A TID value of 0x00 causes the MMU to ignore the comparison between the TID and PID. Only the TAG and EA[EPN] are compared. A TLB entry with TID=0x00 represents a process-independent translation. Pages that are accessed globally by all processes should be assigned a TID value of 0x00. A PID value of 0x00 does not identify a process that can access any page. When PID=0x00, a page-translation hit only occurs when TID=0x00. It is possible for software to load the TLB with multiple entries that match an EA[EPN] and PID combination. However, this is considered a programming error and results in undefined behavior.
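The hit conditions and the TID rule can be expressed as a small software model, for example for use in a simulator or in test code for a TLB-miss handler. This is a sketch under stated assumptions: the struct layout and names are hypothetical, and the TAG is assumed to be held in the top 22 bits of a 32-bit word so that it lines up with the effective address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software view of the TLBHI fields. */
struct tlbhi {
    bool     valid;  /* V */
    uint32_t tag;    /* TAG[0:21], held in the top 22 bits of a word */
    unsigned size;   /* SIZE field, 0 (1 KB) to 7 (16 MB) */
    uint8_t  tid;    /* TID */
};

/* Returns true when the entry hits for effective address ea and the
 * current PID, following the three conditions listed above. */
static bool tlb_entry_hit(const struct tlbhi *e, uint32_t ea, uint8_t pid)
{
    unsigned offset_bits = 10u + 2u * e->size;
    if (!e->valid)
        return false;
    /* TID 0x00 is process independent; otherwise TID must equal PID. */
    if (e->tid != 0x00 && e->tid != pid)
        return false;
    /* Compare only the EPN bits selected by SIZE. */
    return (e->tag >> offset_bits) == (ea >> offset_bits);
}
```

Note that a PID of 0x00 only hits entries whose TID is also 0x00, because a nonzero TID never equals it.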
When a hit occurs, the MMU reads the RPN field from the corresponding TLB entry. Some or all of the bits in this field are used, depending on the value of the SIZE field (see Table 2-38).
For example, with PAE disabled, if the SIZE field specifies a 256 kB page size, RPN[0:13] represents the physical page number and is used to form the physical address. RPN[14:21] is not used, and software must clear those bits to 0 when initializing the TLB entry. The remainder of the physical address is taken from the page-offset portion of the EA. If the page size is 256 kB, the 32-bit physical address is formed by concatenating RPN[0:13] with bits 14:31 of the effective address.
Instead, with PAE enabled and assuming a physical address size of 40 bits (C_ADDR_SIZE set to 40), RPN[0:21] represents the physical page number and RPN[22:29] is not used. The 40-bit physical address is formed by concatenating RPN[0:21] with bits 14:31 of the effective address.
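The concatenation for the PAE-disabled case can be sketched in C as follows. The function name is hypothetical, and the RPN is assumed to be held in the top 22 bits of a 32-bit word, mirroring its position in the physical address. The unused low RPN bits are masked defensively, although software is expected to have cleared them already.

```c
#include <stdint.h>

/* Form a 32-bit physical address with PAE disabled.
 * rpn: TLBLO RPN[0:21], held in the top 22 bits of a word.
 * size: SIZE field, 0 (1 KB) to 7 (16 MB).
 * ea: effective address supplying the page offset. */
static uint32_t physical_address(uint32_t rpn, unsigned size, uint32_t ea)
{
    unsigned offset_bits = 10u + 2u * size;
    uint32_t offset_mask = (UINT32_C(1) << offset_bits) - 1u;
    /* Page-number bits come from RPN, the rest from the EA offset. */
    return (rpn & ~offset_mask) | (ea & offset_mask);
}
```

With a 256 kB page (size 4), the upper 14 bits of the result come from the RPN and the lower 18 bits from the effective address, as in the example above.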
Prior to accessing physical memory, the MMU examines the TLB-entry access-control fields. These fields indicate whether the currently executing program is allowed to perform the requested memory access.
If access is allowed, the MMU checks the storage-attribute fields to determine how to access the page. The storage-attribute fields specify the caching policy for memory accesses.
TLB Access Failures
A TLB-access failure causes an exception to occur. This interrupts execution of the instruction that caused the failure and transfers control to an interrupt handler to resolve the failure. A TLB access can fail for two reasons:
A matching TLB entry was not found, resulting in a TLB miss
A matching TLB entry was found, but access to the page was prevented by either the storage attributes or zone protection
When an interrupt occurs, the processor enters real mode by clearing MSR[VM] to 0. In real mode, all address translation and memory-protection checks performed by the MMU are disabled. After system software initializes the UTLB with page-translation entries, management of the MicroBlaze UTLB is usually performed using interrupt handlers running in real mode.
The following figure diagrams the general process for examining a TLB entry.
Figure 2-21: General Process for Examining a TLB Entry
[Figure 2-21 flowchart: the TLB entry is checked using the virtual address. If TLBHI[V] is not 1, the entry misses. If TLBHI[TID] is not 0x00, TLBHI[TID] is compared with the PID, and a mismatch is an entry miss. TLBHI[TAG] is then compared with EA[EPN] using TLBHI[SIZE]; a mismatch is an entry miss and a match is a TLB hit. On a hit, the access is checked, and a disallowed access is an access violation. For an instruction fetch, guarded storage is checked, and a guarded page is a storage violation. Otherwise TLBLO[RPN] is read using TLBHI[SIZE], the offset is extracted from the EA using TLBHI[SIZE], and the physical address is generated from TLBLO[RPN] and the offset.]
The following sections describe the conditions under which exceptions occur due to TLB access failures.
Data-Storage Exception
When virtual mode is enabled (MSR[VM]=1), a data-storage exception occurs when access to a page is not permitted for any of the following reasons:
From user mode:
- The TLB entry specifies a zone field that prevents access to the page (ZPR[Zn]=00). This applies to load and store instructions.
- The TLB entry specifies a read-only page (TLBLO[WR]=0) that is not otherwise overridden by the zone field (ZPR[Zn] ≠ 11). This applies to store instructions.
From privileged mode:
- The TLB entry specifies a read-only page (TLBLO[WR]=0) that is not otherwise overridden by the zone field (ZPR[Zn] ≠ 10 and ZPR[Zn] ≠ 11). This applies to store instructions.
Instruction-Storage Exception
When virtual mode is enabled (MSR[VM]=1), an instruction-storage exception occurs when access to a page is not permitted for any of the following reasons:
From user mode:
- The TLB entry specifies a zone field that prevents access to the page (ZPR[Zn]=00).
- The TLB entry specifies a non-executable page (TLBLO[EX]=0) that is not otherwise overridden by the zone field (ZPR[Zn] ≠ 11).
- The TLB entry specifies a guarded-storage page (TLBLO[G]=1).
From privileged mode:
- The TLB entry specifies a non-executable page (TLBLO[EX]=0) that is not otherwise overridden by the zone field (ZPR[Zn] ≠ 10 and ZPR[Zn] ≠ 11).
- The TLB entry specifies a guarded-storage page (TLBLO[G]=1).
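The data-storage and instruction-storage conditions listed above can be condensed into one decision function. The following is a minimal sketch of those rules as stated in the text, not a description of the hardware; the type and function names are hypothetical, and the zone argument is the 2-bit ZPR[Zn] value already selected for the page.

```c
#include <stdbool.h>

typedef enum { ACCESS_LOAD, ACCESS_STORE, ACCESS_FETCH } access_t;
typedef enum { STORAGE_OK, DATA_STORAGE_EXC, INSN_STORAGE_EXC } storage_result_t;

/* zone: 2-bit ZPR[Zn] field; wr, ex, guarded: TLBLO[WR], TLBLO[EX], TLBLO[G]. */
storage_result_t check_storage(bool user_mode, unsigned zone,
                               bool wr, bool ex, bool guarded, access_t access)
{
    /* The zone field overrides TLB protection for 11 (user mode)
     * and for 10 or 11 (privileged mode). */
    bool override = user_mode ? (zone == 3u) : (zone == 2u || zone == 3u);

    /* User mode with ZPR[Zn]=00: no access at all. */
    if (user_mode && zone == 0u)
        return access == ACCESS_FETCH ? INSN_STORAGE_EXC : DATA_STORAGE_EXC;

    switch (access) {
    case ACCESS_STORE:
        if (!wr && !override)
            return DATA_STORAGE_EXC;   /* read-only page, not overridden */
        break;
    case ACCESS_FETCH:
        if ((!ex && !override) || guarded)
            return INSN_STORAGE_EXC;   /* non-executable or guarded page */
        break;
    case ACCESS_LOAD:
        break;                         /* loads are only blocked by zone 00 in user mode */
    }
    return STORAGE_OK;
}
```

Under this sketch a privileged store to a read-only page with ZPR[Zn]=10 succeeds, while the same store from user mode raises a data-storage exception.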
Data TLB-Miss Exception
When virtual mode is enabled (MSR[VM]=1) a data TLB-miss exception occurs if a valid, matching TLB entry was not found in the TLB (shadow and UTLB). Any load or store instruction can cause a data TLB-miss exception.
Instruction TLB-Miss Exception
When virtual mode is enabled (MSR[VM]=1) an instruction TLB-miss exception occurs if a valid, matching TLB entry was not found in the TLB (shadow and UTLB). Any instruction fetch can cause an instruction TLB-miss exception.
Access Protection
System software uses access protection to protect sensitive memory locations from improper access. System software can restrict memory accesses for both user-mode and privileged-mode software. Restrictions can be placed on reads, writes, and instruction fetches. Access protection is available when virtual protected mode is enabled.
Access control applies to instruction fetches, data loads, and data stores. The TLB entry for a virtual page specifies the type of access allowed to the page.
The TLB entry also specifies a zone-protection field in the zone-protection register that is used to override the access controls specified by the TLB entry.
TLB Access-Protection Controls
Each TLB entry controls three types of access:
Process: Processes are protected from unauthorized access by assigning a unique process ID (PID) to each process. When system software starts a user-mode application, it loads the PID for that application into the PID register. As the application executes, memory addresses are translated using only TLB entries with a TID field in Translation Look-Aside Buffer High (TLBHI) that matches the PID. This enables system software to restrict accesses for an application to a specific area in virtual memory. A TLB entry with TID=0x00 represents a process-independent translation. Pages that are accessed globally by all processes should be assigned a TID value of 0x00.
Execution: The processor executes instructions only if they are fetched from a virtual page marked as executable (TLBLO[EX]=1). Clearing TLBLO[EX] to 0 prevents execution of instructions fetched from a page, instead causing an instruction-storage interrupt (ISI) to occur. The ISI does not occur when the instruction is fetched, but instead occurs when the instruction is executed. This prevents speculatively fetched instructions that are later discarded (rather than executed) from causing an ISI.
The zone-protection register can override execution protection.
Read/Write: Data is written only to virtual pages marked as writable (TLBLO[WR]=1). Clearing TLBLO[WR] to 0 marks a page as read-only. An attempt to write to a read-only page causes a data-storage interrupt (DSI) to occur.
The zone-protection register can override write protection.
TLB entries cannot be used to prevent programs from reading pages. In virtual mode, zone protection is used to read-protect pages. This is done by defining a no-access-allowed zone (ZPR[Zn] = 00) and using it to override the TLB-entry access protection. Only programs running in user mode can be prevented from reading a page. Privileged programs always have read access to a page.
Zone Protection
Zone protection is used to override the access protection specified in a TLB entry. Zones are an arbitrary grouping of virtual pages with common access protection. Zones can contain any number of pages specifying any combination of page sizes. There is no requirement for a zone to contain adjacent pages.
The zone-protection register (ZPR) is a 32-bit register used to specify the type of protection override applied to each of 16 possible zones. The protection override for a zone is encoded in the ZPR as a 2-bit field.
The 4-bit zone-select field in a TLB entry (TLBLO[ZSEL]) selects one of the 16 zone fields from the ZPR (Z0–Z15). For example, zone Z5 is selected when ZSEL = 0101.
Changing a zone field in the ZPR applies a protection override across all pages in that zone. Without the ZPR, protection changes require individual alterations to each page translation entry within the zone.
Unimplemented zones (when C_MMU_ZONES < 16) are treated as if they contained 11.
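Selecting a zone field from the ZPR is a shift-and-mask operation. The sketch below is illustrative and rests on two assumptions that are not stated in this section: that Z0 occupies the two most significant bits of the ZPR (MicroBlaze numbers bit 0 as the most significant bit), and that with C_MMU_ZONES < 16 the implemented zones are the lowest-numbered ones. The function name and the mmu_zones argument are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 2-bit protection field for the zone selected by TLBLO[ZSEL].
 * Assumption: Z0 is in the two most significant bits of the ZPR.
 * Assumption: zones numbered >= mmu_zones are the unimplemented ones. */
unsigned zpr_zone_field(uint32_t zpr, unsigned zsel, unsigned mmu_zones)
{
    assert(zsel < 16u && mmu_zones <= 16u);
    if (zsel >= mmu_zones)
        return 3u;                          /* unimplemented zones behave as 11 */
    return (zpr >> (30u - 2u * zsel)) & 3u; /* Z0 at the top, Z15 at the bottom */
}
```

With ZSEL = 0101 the function returns the Z5 field, which under the stated bit-order assumption is taken from 20 bits above the least significant bit.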
UTLB Management
The UTLB serves as the interface between the processor MMU and memory-management software. System software manages the UTLB to tell the MMU how to translate virtual addresses into physical addresses. When a problem occurs due to a missing translation or an access violation, the MMU communicates the problem to system software using the exception mechanism. System software is responsible for providing interrupt handlers to correct these problems so that the MMU can proceed with memory translation.
Software reads and writes UTLB entries using the MFS and MTS instructions, respectively. With PAE enabled, the MFSE and MTSE instructions are used to access the most significant part of the real page number. These instructions use the TLBX register index (numbered 0 to 63) corresponding to one of the 64 entries in the UTLB. The tag and data portions are read and written separately, so software must execute two MFS or MTS instructions, and also an additional MFSE or MTSE instruction when PAE is enabled, to completely access an entry.
The UTLB is searched for a specific translation using the TLBSX register. TLBSX locates a translation using an effective address and loads the corresponding UTLB index into the TLBX register.
Individual UTLB entries are invalidated using the MTS instruction to clear the valid bit in the tag portion of a TLB entry (TLBHI[V]).
When C_FAULT_TOLERANT is set to 1, the UTLB block RAM is protected by parity. In case of a parity error, a TLB miss exception occurs. To avoid accumulating errors in this case, each entry in the UTLB should be periodically invalidated.
Recording Page Access and Page Modification
Software management of virtual-memory poses several challenges:
In a virtual-memory environment, software and data often consume more memory than is physically available. Some of the software and data pages must be stored outside physical memory, such as on a hard drive, when they are not used. Ideally, the most-frequently used pages stay in physical memory and infrequently used pages are stored elsewhere.
When pages in physical-memory are replaced to make room for new pages, it is important to know whether the replaced (old) pages were modified. If they were modified, they must be saved prior to loading the replacement (new) pages. If the old pages were not modified, the new pages can be loaded without saving the old pages.
A limited number of page translations are kept in the UTLB. The remaining translations must be stored in the page-translation table. When a translation is not found in the UTLB (due to a miss), system software must decide which UTLB entry to discard so that the missing translation can be loaded. It is desirable for system software to replace infrequently used translations rather than frequently used translations.
Solving the above problems in an efficient manner requires keeping track of page accesses and page modifications. MicroBlaze does not track page access and page modification in hardware. Instead, system software can use the TLB-miss exceptions and the data-storage exception to collect this information. As the information is collected, it can be stored in a data structure associated with the page-translation table.
Page-access information is used to determine which pages should be kept in physical memory and which are replaced when physical-memory space is required. System software can use the valid bit in the TLB entry (TLBHI[V]) to monitor page accesses. This requires page translations be initialized as not valid (TLBHI[V]=0) to indicate they have not been accessed. The first attempt to access a page causes a TLB-miss exception, either because the UTLB entry is marked not valid or because the page translation is not present in the UTLB. The TLB-miss handler updates the UTLB with a valid translation (TLBHI[V]=1). The set valid bit serves as a record that the page and its translation have been accessed. The TLB-miss handler can also record the information in a separate data structure associated with the page-translation entry.
Page-modification information is used to indicate whether an old page can be overwritten with a new page or the old page must first be stored to a hard disk. System software can use the write-protection bit in the TLB entry (TLBLO[WR]) to monitor page modification. This requires page translations be initialized as read-only (TLBLO[WR]=0) to indicate they have not been modified. The first attempt to write data into a page causes a data-storage exception, assuming the page has already been accessed and marked valid as described above. If software has permission to write into the page, the data-storage handler marks the page as writable (TLBLO[WR]=1) and returns.
The set write-protection bit serves as a record that a page has been modified. The data-storage handler can also record this information in a separate data structure associated with the page-translation entry.
Tracking page modification is useful when virtual mode is first entered and when a new process is started.
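The accessed/modified bookkeeping described in this section can be modeled on a host as follows. This is a sketch under stated assumptions rather than handler code for a real system: page_state_t, its write_permitted policy flag, and all function names are hypothetical, and the two handler functions stand in for the TLB-miss and data-storage exception handlers.

```c
#include <stdbool.h>

/* Hypothetical per-page bookkeeping kept alongside the page-translation table. */
typedef struct {
    bool valid;           /* models TLBHI[V], initialized to 0 */
    bool writable;        /* models TLBLO[WR], initialized to 0 */
    bool write_permitted; /* policy: is software allowed to write this page? */
    bool accessed;        /* recorded by the TLB-miss handler */
    bool modified;        /* recorded by the data-storage handler */
} page_state_t;

/* Stand-in for the TLB-miss handler: install a valid translation. */
void on_tlb_miss(page_state_t *p)
{
    p->valid = true;
    p->accessed = true;
}

/* Stand-in for the data-storage handler: returns false on a real violation. */
bool on_data_storage(page_state_t *p)
{
    if (!p->write_permitted)
        return false;
    p->writable = true;   /* TLBLO[WR]=1 now records that the page was modified */
    p->modified = true;
    return true;
}

/* Simulates one access; returns false if the access must be rejected. */
bool access_page(page_state_t *p, bool is_write)
{
    if (!p->valid)
        on_tlb_miss(p);                 /* first touch: TLB-miss exception */
    if (is_write && !p->writable)
        return on_data_storage(p);      /* first write: data-storage exception */
    return true;
}
```

In this model a page that has only been read ends up with accessed set and modified clear, so it can be discarded without being written back.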
Reset, Interrupts, Exceptions, and Break
MicroBlaze supports reset, interrupt, user exception, break, and hardware exceptions. The following section describes the execution flow associated with each of these events.
The relative priority starting with the highest is:
1. Reset
2. Hardware Exception
3. Non-maskable Break
4. Break
5. Interrupt
6. User Vector (Exception)
Table 2-39 defines the memory address locations of the associated vectors and the hardware enforced register file locations for return addresses. Each vector allocates two addresses to allow full address range branching (requires an IMM followed by a BRAI instruction). Normally the vectors start at address 0x00000000, but the parameter C_BASE_VECTORS can be used to locate them anywhere in memory.
The address range 0x28 to 0x4F is reserved for future software support by Xilinx. Allocating these addresses for user applications is likely to conflict with future releases of SDK support software.
Table 2-39: Vectors and Return Address Register File Location
Event                              Vector Address                                             Register File Return Address
Reset                              C_BASE_VECTORS + 0x00000000 - C_BASE_VECTORS + 0x00000004  -
User Vector (Exception)            C_BASE_VECTORS + 0x00000008 - C_BASE_VECTORS + 0x0000000C  Rx
Interrupt (1)                      C_BASE_VECTORS + 0x00000010 - C_BASE_VECTORS + 0x00000014  R14
Break: Non-maskable hardware       C_BASE_VECTORS + 0x00000018 - C_BASE_VECTORS + 0x0000001C  R16
Break: Hardware                    C_BASE_VECTORS + 0x00000018 - C_BASE_VECTORS + 0x0000001C  R16
Break: Software                    C_BASE_VECTORS + 0x00000018 - C_BASE_VECTORS + 0x0000001C  R16
Hardware Exception                 C_BASE_VECTORS + 0x00000020 - C_BASE_VECTORS + 0x00000024  R17 or BTR
Reserved by Xilinx for future use  C_BASE_VECTORS + 0x00000028 - C_BASE_VECTORS + 0x0000004F  -
1. With low-latency interrupt mode, the vector address is supplied by the Interrupt Controller.
All of these events will clear the reservation bit, used together with the LWX and SWX instructions to implement mutual exclusion, such as semaphores and spinlocks.
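Each vector holds two instruction words, an IMM carrying the upper 16 bits of the handler address followed by a BRAI carrying the lower 16 bits. The sketch below computes a vector address and the two words to place there. The function names are hypothetical, and the opcode constants 0xB0000000 (IMM) and 0xB8080000 (BRAI) are assumptions derived from the instruction formats rather than from this section; verify them against the instruction set chapter before use.

```c
#include <stdint.h>

/* Vector offsets from Table 2-39, relative to C_BASE_VECTORS. */
typedef enum {
    VEC_RESET        = 0x00,
    VEC_USER         = 0x08,
    VEC_INTERRUPT    = 0x10,
    VEC_BREAK        = 0x18,
    VEC_HW_EXCEPTION = 0x20
} vector_offset_t;

/* Address of the first of the two instruction words of a vector. */
uint32_t vector_address(uint32_t base_vectors, vector_offset_t v)
{
    return base_vectors + (uint32_t)v;
}

/* Builds the IMM + BRAI pair that branches to an arbitrary 32-bit handler.
 * Assumed encodings: IMM = 0xB0000000 | imm16, BRAI = 0xB8080000 | imm16. */
void encode_vector(uint32_t handler, uint32_t words[2])
{
    words[0] = UINT32_C(0xB0000000) | (handler >> 16);     /* IMM  upper half */
    words[1] = UINT32_C(0xB8080000) | (handler & 0xFFFFu); /* BRAI lower half */
}
```

For a handler at 0x12345678 this yields the words 0xB0001234 and 0xB8085678, which would be written to the two addresses of the chosen vector.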
Reset
When a Reset or Debug_Rst (the reset input controlled by the debugger using MDM) occurs, MicroBlaze flushes the pipeline and starts fetching instructions from the reset vector (address 0x0). Both external reset signals are active high and should be asserted for a minimum of 16 cycles. See MicroBlaze Core Configurability in Chapter 3 for more information on the MSR reset value parameters.
Equivalent Pseudocode
PC ← C_BASE_VECTORS + 0x00000000
MSR ← C_RESET_MSR_IE << 2 | C_RESET_MSR_BIP << 4 | C_RESET_MSR_ICE << 6 | C_RESET_MSR_DCE << 8 | C_RESET_MSR_EE << 9 | C_RESET_MSR_EIP << 10
EAR ← 0; ESR ← 0; FSR ← 0
PID ← 0; ZPR ← 0; TLBX ← 0
Reservation ← 0
Hardware Exceptions
MicroBlaze can be configured to trap the following internal error conditions: illegal instruction, instruction and data bus error, and unaligned access. The divide exception can only be enabled if the processor is configured with a hardware divider (C_USE_DIV=1).
When configured with a hardware floating-point unit (C_USE_FPU>0), it can also trap the following floating-point specific exceptions: underflow, overflow, float division-by-zero, invalid operation, and denormalized operand error.
When configured with a hardware memory management unit (MMU), it can also trap the following memory management specific exceptions: Illegal Instruction Exception, Data Storage Exception, Instruction Storage Exception, Data TLB Miss Exception, and Instruction TLB Miss Exception.
A hardware exception causes MicroBlaze to flush the pipeline and branch to the hardware exception vector (address C_BASE_VECTORS + 0x20). The execution stage instruction in the exception cycle is not executed.
The exception also updates the general purpose register R17 in the following manner:
For the MMU exceptions (Data Storage Exception, Instruction Storage Exception, Data TLB Miss Exception, Instruction TLB Miss Exception) the register R17 is loaded with the appropriate program counter value to re-execute the instruction causing the exception upon return. The value is adjusted to return to a preceding IMM instruction, if any. If the exception is caused by an instruction in a branch delay slot, the value is adjusted to return to the branch instruction, including adjustment for a preceding IMM instruction, if any.
For all other exceptions the register R17 is loaded with the program counter value of the subsequent instruction, unless the exception is caused by an instruction in a branch delay slot. If the exception is caused by an instruction in a branch delay slot, the ESR[DS] bit is set. In this case the exception handler should resume execution from the branch target address stored in BTR.
The EE and EIP bits in MSR are automatically reverted when executing the RTED instruction.
The VM and UM bits in MSR are automatically reverted from VMS and UMS when executing the RTED, RTBD, and RTID instructions.
Exception Priority
When two or more exceptions occur simultaneously, they are handled in the following order, from the highest priority to the lowest:
Instruction Bus Exception
Instruction TLB Miss Exception
Instruction Storage Exception
Illegal Opcode Exception
Privileged Instruction Exception or Stack Protection Violation Exception
Data TLB Miss Exception
Data Storage Exception
Unaligned Exception
Data Bus Exception
Divide Exception
FPU Exception
Stream Exception
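The priority order above can be expressed as an ordered enumeration with a lowest-index-wins selection. This is an illustrative sketch: the enumerator names and the pending bitmask representation are hypothetical and do not correspond to ESR[EC] exception cause codes.

```c
#include <stdint.h>

/* Exceptions in priority order, highest first (index 0 wins).
 * These indices are for illustration only, not ESR[EC] values. */
typedef enum {
    EXC_INSN_BUS = 0,
    EXC_INSN_TLB_MISS,
    EXC_INSN_STORAGE,
    EXC_ILLEGAL_OPCODE,
    EXC_PRIVILEGED_OR_STACK, /* privileged instruction or stack protection */
    EXC_DATA_TLB_MISS,
    EXC_DATA_STORAGE,
    EXC_UNALIGNED,
    EXC_DATA_BUS,
    EXC_DIVIDE,
    EXC_FPU,
    EXC_STREAM,
    EXC_COUNT
} exception_t;

/* pending: bit n set means exception_t n was raised in the same cycle.
 * Returns the exception to handle, or EXC_COUNT if none is pending. */
exception_t highest_priority(uint32_t pending)
{
    for (int e = 0; e < EXC_COUNT; ++e) {
        if (pending & (UINT32_C(1) << e))
            return (exception_t)e;
    }
    return EXC_COUNT;
}
```

If a divide exception and a data TLB miss are raised together, the function selects the data TLB miss.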
Exception Causes
Stream Exception: The AXI4-Stream exception is caused by executing a get or getd instruction with the 'e' bit set to '1' when there is a control bit mismatch.
Instruction Bus Exception: The instruction bus exception is caused by errors when reading data from memory.
- The instruction peripheral AXI4 interface (M_AXI_IP) exception is caused by an error response on M_AXI_IP_RRESP.
- The instruction cache AXI4 interface (M_AXI_IC) exception is caused by an error response on M_AXI_IC_RRESP. The exception can only occur when C_ICACHE_ALWAYS_USED is set to 1 and the cache is turned off, or if the MMU Inhibit Caching bit is set for the address. In all other cases the response is ignored.
- The instruction side local memory (ILMB) can only cause an instruction bus exception when either an uncorrectable error occurs in the LMB memory, as indicated by the IUE signal, or C_ECC_USE_CE_EXCEPTION is set to 1 and a correctable error occurs in the LMB memory, as indicated by the ICE signal.
Illegal Opcode Exception: The illegal opcode exception is caused by an instruction with an invalid major opcode (bits 0 through 5 of instruction). Bits 6 through 31 of the instruction are not checked. Optional processor instructions are detected as illegal if not enabled. If the optional feature C_OPCODE_0x0_ILLEGAL is enabled, an illegal opcode exception is also caused if the instruction is equal to 0x00000000.
Data Bus Exception: The data bus exception is caused by errors when reading data from memory or writing data to memory.
- The data peripheral AXI4 interface (M_AXI_DP) exception is caused by an error response on M_AXI_DP_RRESP or M_AXI_DP_BRESP.
- The data cache AXI4 interface (M_AXI_DC) exception is caused by:
  - An error response on M_AXI_DC_RRESP or M_AXI_DC_BRESP,
  - OKAY response on M_AXI_DC_RRESP in case of an exclusive access using LWX.
  The exception can only occur when C_DCACHE_ALWAYS_USED is set to 1 and the cache is turned off, when an exclusive access using LWX or SWX is performed, or if the MMU Inhibit Caching bit is set for the address. In all other cases the response is ignored.
- The data side local memory (DLMB) can only cause a data bus exception when either an uncorrectable error occurs in the LMB memory, as indicated by the DUE signal, or C_ECC_USE_CE_EXCEPTION is set to 1 and a correctable error occurs in the LMB memory, as indicated by the DCE signal. An error can occur for all read accesses, and for byte and halfword write accesses.
Unaligned Exception: The unaligned exception is caused by a word access where the address to the data bus has bits 30 or 31 set, or a half-word access with bit 31 set.
Divide Exception: The divide exception is caused by an integer division (idiv or idivu) where the divisor is zero, or by a signed integer division (idiv) where overflow occurs (-2147483648 / -1).
FPU Exception: An FPU exception is caused by an underflow, overflow, divide-by-zero, illegal operation, or denormalized operand occurring with a floating-point instruction.
- Underflow occurs when the result is denormalized.
- Overflow occurs when the result is not-a-number (NaN).
- The divide-by-zero FPU exception is caused by the rA operand to fdiv being zero when rB is not infinite.
- Illegal operation is caused by a signaling NaN operand or by illegal infinite or zero operand combinations.
Privileged Instruction Exception: The Privileged Instruction exception is caused by an attempt to execute a privileged instruction in User Mode.
Stack Protection Violation Exception: A Stack Protection Violation exception is caused by executing a load or store instruction using the stack pointer (register R1) as rA with an address outside the stack boundaries defined by the special Stack Low and Stack High registers, causing a stack overflow or a stack underflow.
Data Storage Exception: The Data Storage exception is caused by an attempt to access data in memory that results in a memory-protection violation.
Instruction Storage Exception: The Instruction Storage exception is caused by an attempt to access instructions in memory that results in a memory-protection violation.
Data TLB Miss Exception: The Data TLB Miss exception is caused by an attempt to access data in memory, when a valid Translation Look-Aside Buffer entry is not present, and virtual protected mode is enabled.
Instruction TLB Miss Exception: The Instruction TLB Miss exception is caused by an attempt to access instructions in memory, when a valid Translation Look-Aside Buffer entry is not present, and virtual protected mode is enabled.
Should an Instruction Bus Exception, Illegal Opcode Exception, or Data Bus Exception occur when C_FAULT_TOLERANT is set to 1, and an exception is in progress (that is MSR[EIP] set and MSR[EE] cleared), the pipeline is halted, and the external signal MB_Error is set.
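Two of the causes above, the unaligned exception and the divide exception, reduce to simple predicates. The following sketch restates them in C with hypothetical function names. Because MicroBlaze numbers bit 0 as the most significant bit, address bits 30 and 31 are the two least significant bits of the address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Word access: address bits 30 or 31 set; halfword access: bit 31 set.
 * Byte accesses are never unaligned. */
bool is_unaligned(uint32_t addr, unsigned size_bytes)
{
    if (size_bytes == 4u)
        return (addr & 3u) != 0u;
    if (size_bytes == 2u)
        return (addr & 1u) != 0u;
    return false;
}

/* Signed division (idiv): zero divisor, or overflow on -2147483648 / -1. */
bool idiv_raises_exception(int32_t dividend, int32_t divisor)
{
    return divisor == 0 || (dividend == INT32_MIN && divisor == -1);
}

/* Unsigned division (idivu): only a zero divisor raises the exception. */
bool idivu_raises_exception(uint32_t divisor)
{
    return divisor == 0u;
}
```

A software check of this kind is useful before issuing a division when the divide exception is not enabled in the processor configuration.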
Imprecise Exceptions
Normally all exceptions in MicroBlaze are precise, meaning that any instructions in the pipeline after the instruction causing an exception are invalidated, and have no effect.
When C_IMPRECISE_EXCEPTIONS is set to 1 (ECC) an Instruction Bus Exception or Data Bus Exception caused by ECC errors in LMB memory is not precise, meaning that a subsequent memory access instruction in the pipeline might be executed. If this behavior is acceptable, the maximum frequency can be improved by setting this parameter to 1.
Equivalent Pseudocode
ESR[DS] ← exception in delay slot
if ESR[DS] then
  BTR ← branch target PC
  if MMU exception then
    if branch preceded by IMM then
      r17 ← PC - 8
    else
      r17 ← PC - 4
  else
    r17 ← invalid value
else if MMU exception then
  if instruction preceded by IMM then
    r17 ← PC - 4
  else
    r17 ← PC
else
  r17 ← PC + 4
PC ← C_BASE_VECTORS + 0x00000020
MSR[EE] ← 0, MSR[EIP] ← 1
MSR[UMS] ← MSR[UM], MSR[UM] ← 0, MSR[VMS] ← MSR[VM], MSR[VM] ← 0
ESR[EC] ← exception specific value
ESR[ESS] ← exception specific value
EAR ← exception specific value
FSR ← exception specific value
Reservation ← 0
Breaks
There are two kinds of breaks:
Hardware (external) breaks
Software (internal) breaks
Hardware Breaks
Hardware breaks are performed by asserting the external break signal (that is, the Ext_BRK and Ext_NM_BRK input ports). On a break, the instruction in the execution stage completes while the instruction in the decode stage is replaced by a branch to the break vector (address C_BASE_VECTORS + 0x18). The break return address (the PC associated with the instruction in the decode stage at the time of the break) is automatically loaded into general purpose register R16. MicroBlaze also sets the Break In Progress (BIP) flag in the Machine Status Register (MSR).
A normal hardware break (that is, the Ext_BRK input port) is only handled when MSR[BIP] and MSR[EIP] are set to 0 (that is, there is no break or exception in progress). The Break In Progress flag disables interrupts. A non-maskable break (that is, the Ext_NM_BRK input port) is always handled immediately.
The BIP bit in the MSR is automatically cleared when executing the RTBD instruction.
The Ext_BRK signal must be kept asserted until the break has occurred, and deasserted before the RTBD instruction is executed. The Ext_NM_BRK signal must only be asserted one clock cycle.
Software Breaks
To perform a software break, use the brk and brki instructions. Refer to Chapter 5, MicroBlaze Instruction Set Architecture for detailed information on software breaks.
As a special case, when C_DEBUG_ENABLED is greater than zero, and "brki rD,0x18" is executed, a software breakpoint is signaled to the debugger (for example, the Xilinx System Debugger (XSDB) tool), irrespective of the value of C_BASE_VECTORS. In this case the BIP bit in the MSR is not set.
Latency
The time it takes the MicroBlaze processor to enter a break service routine from the time the break occurs depends on the instruction currently in the execution stage and the latency to the memory storing the break vector.
Equivalent Pseudocode
r16 ← PC
PC ← C_BASE_VECTORS + 0x00000018
MSR[BIP] ← 1
MSR[UMS] ← MSR[UM], MSR[UM] ← 0, MSR[VMS] ← MSR[VM], MSR[VM] ← 0
Reservation ← 0
Interrupt
MicroBlaze supports one external interrupt source (connected to the Interrupt input port). The processor only reacts to interrupts if the Interrupt Enable (IE) bit in the Machine Status Register (MSR) is set to 1. On an interrupt, the instruction in the execution stage completes while the instruction in the decode stage is replaced by a branch to the interrupt vector. This is either address C_BASE_VECTORS + 0x10, or with low-latency interrupt mode, the address supplied by the Interrupt Controller.
The interrupt return address (the PC associated with the instruction in the decode stage at the time of the interrupt) is automatically loaded into general purpose register R14. In addition, the processor also disables future interrupts by clearing the IE bit in the MSR. The IE bit is automatically set again when executing the RTID instruction.
Interrupts are ignored by the processor if either of the break in progress (BIP) or exception in progress (EIP) bits in the MSR are set to 1.
By using the parameter C_INTERRUPT_IS_EDGE, the external interrupt can either be set to level-sensitive or edge-triggered:
When using level-sensitive interrupts, the Interrupt input must remain set until MicroBlaze has taken the interrupt, and jumped to the interrupt vector. Software must acknowledge the interrupt at the source to clear it before returning from the interrupt handler. If not, the interrupt is taken again, as soon as interrupts are enabled when returning from the interrupt handler.
When using edge-triggered interrupts, MicroBlaze detects and latches the Interrupt input edge, which means that the input only needs to be asserted one clock cycle. The interrupt input can remain asserted, but must be deasserted at least one clock cycle before a new interrupt can be detected. The latching of an edge-triggered interrupt is independent of the IE bit in MSR. Should an interrupt occur while the IE bit is 0, it will immediately be serviced when the IE bit is set to 1.
With periodic interrupt sources, such as the FIT Timer IP core, that do not have a method to clear the interrupt from software, it is recommended to use edge-triggered interrupts.
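The edge-triggered behavior described above (the edge is latched regardless of MSR[IE] and serviced once IE becomes 1) can be illustrated with a cycle-stepped model. This is a simplified sketch, not the hardware logic: it assumes a rising active edge, ignores the BIP and EIP conditions, and uses hypothetical names throughout.

```c
#include <stdbool.h>

/* Hypothetical per-cycle model of edge-triggered interrupt latching. */
typedef struct {
    bool prev_input; /* Interrupt input sampled in the previous cycle */
    bool pending;    /* latched edge waiting to be serviced */
} edge_irq_t;

/* Advances the model by one clock cycle.
 * Returns true in the cycle in which the interrupt is taken. */
bool edge_irq_clock(edge_irq_t *s, bool input, bool msr_ie)
{
    if (input && !s->prev_input)
        s->pending = true;      /* latch the edge, independent of MSR[IE] */
    s->prev_input = input;

    if (s->pending && msr_ie) {
        s->pending = false;     /* serviced as soon as IE is 1 */
        return true;
    }
    return false;
}
```

Holding the input high after the interrupt is taken does not retrigger it in this model; the input has to return low for at least one cycle before a new edge is latched.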
Low-latency Interrupt Mode
A low-latency interrupt mode is available, which allows the Interrupt Controller to directly supply the interrupt vector for each individual interrupt (using the Interrupt_Address input port). The address of each fast interrupt handler must be passed to the Interrupt Controller when initializing the interrupt system. When a particular interrupt occurs, this address is supplied by the Interrupt Controller, which allows MicroBlaze to directly jump to the handler code.
With this mode, MicroBlaze also directly sends the appropriate interrupt acknowledge to the Interrupt Controller (using the Interrupt_Ack output port), although it is still the responsibility of the Interrupt Service Routine to acknowledge level sensitive interrupts at the source.
This information allows the Interrupt Controller to acknowledge interrupts appropriately, both for level-sensitive and edge-triggered interrupt.
To inform the Interrupt Controller of the interrupt handling events, Interrupt_Ack is set to:
01: When MicroBlaze jumps to the interrupt handler code,
10: When the RTID instruction is executed to return from interrupt,
11: When MSR[IE] is changed from 0 to 1, which enables interrupts again.
The Interrupt_Ack output port is active during one clock cycle, and is then reset to 00.
Latency
The time it takes MicroBlaze to enter an Interrupt Service Routine (ISR) from the time an interrupt occurs, depends on the configuration of the processor and the latency of the memory controller storing the interrupt vectors. If MicroBlaze is configured to have a hardware divider, the largest latency happens when an interrupt occurs during the execution of a division instruction.
With low-latency interrupt mode, the time to enter the ISR is significantly reduced, since the interrupt vector for each individual interrupt is directly supplied by the Interrupt Controller. With compiler support for fast interrupts, there is no need for a common ISR at all. Instead, the ISR for each individual interrupt will be directly called, and the compiler takes care of saving and restoring registers used by the ISR.
Equivalent Pseudocode
r14 ← PC
if C_USE_INTERRUPT = 2
  PC ← Interrupt_Address
  Interrupt_Ack ← 01
else
  PC ← C_BASE_VECTORS + 0x00000010
MSR[IE] ← 0
MSR[UMS] ← MSR[UM], MSR[UM] ← 0, MSR[VMS] ← MSR[VM], MSR[VM] ← 0
Reservation ← 0
User Vector (Exception)
The user exception vector is located at address 0x8. A user exception is caused by inserting a ‘BRALID Rx,0x8’ instruction in the software flow. Although Rx could be any general purpose register, Xilinx recommends using R15 for storing the user exception return address, and to use the RTSD instruction to return from the user exception handler.
Pseudocode
rx ← PC
PC ← C_BASE_VECTORS + 0x00000008
MSR[UMS] ← MSR[UM], MSR[UM] ← 0, MSR[VMS] ← MSR[VM], MSR[VM] ← 0
Reservation ← 0
Instruction Cache
Overview
MicroBlaze can be used with an optional instruction cache for improved performance when executing code that resides outside the LMB address range.
The instruction cache has the following features:
Direct mapped (1-way associative)
User selectable cacheable memory address range
Configurable cache and tag size
Caching over AXI4 interface (M_AXI_IC)
Option to use 4, 8 or 16 word cache-line
Cache on and off controlled using a bit in the MSR
Optional WIC instruction to invalidate instruction cache lines
Optional stream buffers to improve performance by speculatively prefetching instructions
Optional victim cache to improve performance by saving evicted cache lines
Optional parity protection that invalidates cache lines if a Block RAM bit error is detected
Optional data width selection to either use 32 bits, an entire cache line, or 512 bits
General Instruction Cache Functionality
When the instruction cache is used, the memory address space is split into two segments: a cacheable segment and a non-cacheable segment. The cacheable segment is determined by two parameters: C_ICACHE_BASEADDR and C_ICACHE_HIGHADDR. All addresses within this range correspond to the cacheable address segment. All other addresses are non-cacheable.
The cacheable segment size must be 2^N, where N is a positive integer. The range specified by C_ICACHE_BASEADDR and C_ICACHE_HIGHADDR must comprise a complete power-of-two range, such that range = 2^N and the N least significant bits of C_ICACHE_BASEADDR must be zero.
The cacheable instruction address consists of two parts: the cache address, and the tag address. The MicroBlaze instruction cache can be configured from 64 bytes to 64 kB. This corresponds to a cache address of between 6 and 16 bits. The tag address together with the cache address should match the full address of cacheable memory.
When selecting cache sizes below 2 kB, distributed RAM is used to implement the Tag RAM and Instruction RAM. Distributed RAM is always used to implement the Tag RAM, when setting the parameter C_ICACHE_FORCE_TAG_LUTRAM to 1. This parameter is only available with cache size 8 kB and less for 4 word cache-lines, with 16 kB and less for 8 word cache-lines, and with 32 kB and less for 16 word cache-lines.
For example: in a MicroBlaze configured with C_ICACHE_BASEADDR=0x00300000, C_ICACHE_HIGHADDR=0x0030ffff, C_CACHE_BYTE_SIZE=4096, C_ICACHE_LINE_LEN=8, and C_ICACHE_FORCE_TAG_LUTRAM=0; the cacheable memory of 64 kB uses 16 bits of byte address, and the 4 kB cache uses 12 bits of byte address, thus the required address tag width is: 16-12=4 bits. The total number of block RAM primitives required in this configuration is: 2 RAMB16 for storing the 1024 instruction words, and 1 RAMB16 for 128 cache line entries, each consisting of: 4 bits of tag, 8 word-valid bits, 1 line-valid bit. In total 3 RAMB16 primitives.
The following figure shows the organization of the instruction cache.
Figure 2-22: Instruction Cache Organization
Instruction Cache Operation
For every instruction fetched, the instruction cache detects if the instruction address belongs to the cacheable segment. If the address is non-cacheable, the cache controller ignores the instruction and lets the M_AXI_IP or LMB complete the request. If the address is cacheable, a lookup is performed on the tag memory to check if the requested address is currently cached. The lookup is successful if: the word and line valid bits are set, and the tag address matches the instruction address tag segment. On a cache miss, the cache controller requests the new instruction over the instruction AXI4 interface (M_AXI_IC), and waits for the memory controller to return the associated cache line.
C_ICACHE_DATA_WIDTH determines the bus data width, either 32 bits, an entire cache line (128, 256 or 512 bits), or 512 bits.
When C_FAULT_TOLERANT is set to 1, a cache miss also occurs if a parity error is detected in a tag or instruction Block RAM.
The instruction cache issues burst accesses for the AXI4 interface when 32-bit data width is used, otherwise single accesses are used.
Stream Buffers
When stream buffers are enabled, by setting the parameter C_ICACHE_STREAMS to 1, the cache will speculatively fetch cache lines in advance in sequence following the last requested address, until the stream buffer is full.
The stream buffer can hold up to two cache lines. Should the processor subsequently request instructions from a cache line prefetched by the stream buffer, which occurs in linear code, they are immediately available.
The stream buffer often improves performance, since the processor generally has to spend less time waiting for instructions to be fetched from memory.
C_ICACHE_DATA_WIDTH determines the amount of data transferred from the stream buffer each clock cycle, either 32 bits or an entire cache line.
To be able to use instruction cache stream buffers, area optimization must not be enabled.
Victim Cache
The victim cache is enabled by setting the parameter C_ICACHE_VICTIMS to 2, 4 or 8. This defines the number of cache lines that can be stored in the victim cache. Whenever a cache line is evicted from the cache, it is saved in the victim cache. By saving the most recent lines they can be fetched much faster, should the processor request them, thereby improving performance. If the victim cache is not used, all evicted cache lines must be read from memory again when they are needed.
C_ICACHE_DATA_WIDTH determines the amount of data transferred from/to the victim cache each clock cycle, either 32 bits or an entire cache line.
Note: To be able to use the victim cache, area optimization must not be enabled.
Instruction Cache Software Support
MSR Bit
The ICE bit in the MSR provides software control to enable and disable the instruction cache.
The contents of the cache are preserved by default when the cache is disabled. You can invalidate cache lines using the WIC instruction or using the hardware debug logic of MicroBlaze.
WIC Instruction
The optional WIC instruction (C_ALLOW_ICACHE_WR=1) is used to invalidate cache lines in the instruction cache from an application. For a detailed description, see Chapter 5, MicroBlaze Instruction Set Architecture.
The WIC instruction can also be used together with parity protection to periodically invalidate entries in the cache, to avoid accumulating errors.
Data Cache
Overview
The MicroBlaze processor can be used with an optional data cache for improved performance. The cached memory range must not include addresses in the LMB address range. The data cache has the following features:
• Direct mapped (1-way associative)
• Write-through or Write-back
• User selectable cacheable memory address range
• Configurable cache size and tag size
• Caching over AXI4 interface (M_AXI_DC)
• Option to use 4, 8 or 16 word cache-lines
• Cache on and off controlled using a bit in the MSR
• Optional WDC instruction to invalidate or flush data cache lines
• Optional victim cache with write-back to improve performance by saving evicted cache lines
• Optional parity protection for write-through cache that invalidates cache lines if a Block RAM bit error is detected
• Optional data width selection to either use 32 bits, an entire cache line, or 512 bits
General Data Cache Functionality
When the data cache is used, the memory address space is split into two segments: a cacheable segment and a non-cacheable segment. The cacheable area is determined by two parameters: C_DCACHE_BASEADDR and C_DCACHE_HIGHADDR. All addresses within this range correspond to the cacheable address space. All other addresses are non-cacheable.
The cacheable segment size must be $2^N$, where N is a positive integer. The range specified by C_DCACHE_BASEADDR and C_DCACHE_HIGHADDR must comprise a complete power-of-two range, such that range = $2^N$ and the N least significant bits of C_DCACHE_BASEADDR must be zero.
The following figure shows the Data Cache organization.
Figure 2-23: Data Cache Organization
The cacheable data address consists of two parts: the cache address, and the tag address. The MicroBlaze data cache can be configured from 64 bytes to 64 kB. This corresponds to a cache address of between 6 and 16 bits. The tag address together with the cache address should match the full address of cacheable memory. When selecting cache sizes below 2 kB, distributed RAM is used to implement the Tag RAM and Data RAM, except that block RAM is always used for the Data RAM when C_AREA_OPTIMIZED is set to 1 (Area) and C_DCACHE_USE_WRITEBACK is not set. Distributed RAM is always used to implement the Tag RAM, when setting the parameter C_DCACHE_FORCE_TAG_LUTRAM to 1. This parameter is only available with cache size 8 kB and less for 4 word cache-lines, with 16 kB and less for 8 word cache-lines, and with 32 kB and less for 16 word cache-lines.
For example, in a MicroBlaze configured with C_DCACHE_BASEADDR=0x00400000, C_DCACHE_HIGHADDR=0x00403fff, C_DCACHE_BYTE_SIZE=2048, C_DCACHE_LINE_LEN=4, and C_DCACHE_FORCE_TAG_LUTRAM=0; the cacheable memory of 16 kB uses 14 bits of byte address, and the 2 kB cache uses 11 bits of byte address, thus the required address tag width is 14-11=3 bits. The total number of block RAM primitives required in this configuration is 1 RAMB16 for storing the 512 data words, and 1 RAMB16 for 128 cache line entries, each consisting of 3 bits of tag, 4 word-valid bits, 1 line-valid bit. In total, 2 RAMB16 primitives.
Data Cache Operation
The caching policy used by the MicroBlaze data cache, write-back or write-through, is determined by the parameter C_DCACHE_USE_WRITEBACK. When this parameter is set, a write-back protocol is implemented; otherwise write-through is implemented.
However, when configured with an MMU (C_USE_MMU > 1, C_AREA_OPTIMIZED = 0 (Performance) or 2 (Frequency), C_DCACHE_USE_WRITEBACK = 1), the caching policy in virtual mode is determined by the W storage attribute in the TLB entry, whereas write-back is used in real mode.
With the write-back protocol, a store to an address within the cacheable range always updates the cached data. If the target address word is not in the cache (that is, the access is a cache miss), and the location in the cache contains data that has not yet been written to memory (the cache location is dirty), the old data is written over the data AXI4 interface (M_AXI_DC) to external memory before updating the cache with the new data. If only a single word needs to be written, a single word write is used, otherwise a burst write is used. For byte or halfword stores, in case of a cache miss, the address is first requested over the data AXI4 interface, while a word store only updates the cache.
With the write-through protocol, a store to an address within the cacheable range generates an equivalent byte, halfword, or word write over the data AXI4 interface to external memory. The write also updates the cached data if the target address word is in the cache (that is, the write is a cache hit). A write cache-miss does not load the associated cache line into the cache.
Provided that the cache is enabled, a load from an address within the cacheable range triggers a check to determine if the requested data is currently cached. If it is (that is, on a cache hit), the requested data is retrieved from the cache. If not (that is, on a cache miss), the address is requested over the data AXI4 interface using a burst read, and the processor pipeline stalls until the cache line associated with the requested address is returned from the external memory controller.
The parameter C_DCACHE_DATA_WIDTH determines the bus data width, either 32 bits, an entire cache line (128, 256 or 512 bits), or 512 bits.
When C_FAULT_TOLERANT is set to 1 and write-through protocol is used, a cache miss also occurs if a parity error is detected in the tag or data block RAM.
The following table summarizes all types of accesses issued by the data cache AXI4 interface.
Table 2-40: Data Cache Interface Accesses
Policy         State           Direction  Access Type
Write-through  Cache Enabled   Read       Burst for 32-bit interface non-exclusive access and exclusive access with ACE enabled, single access otherwise
Write-through  Cache Enabled   Write      Single access
Write-through  Cache Disabled  Read       Burst for 32-bit interface exclusive access with ACE enabled, single access otherwise
Write-through  Cache Disabled  Write      Single access
Write-back     Cache Enabled   Read       Burst for 32-bit interface, single access otherwise
Write-back     Cache Enabled   Write      Burst for 32-bit interface cache lines with more than one valid word, a single access otherwise
Write-back     Cache Disabled  Read       Burst for 32-bit interface non-exclusive access, discarding all but the desired data, a single access otherwise
Write-back     Cache Disabled  Write      Single access
Victim Cache
The victim cache is enabled by setting the parameter C_DCACHE_VICTIMS to 2, 4 or 8. This defines the number of cache lines that can be stored in the victim cache. Whenever a complete cache line is evicted from the cache, it is saved in the victim cache. By saving the most recent lines they can be fetched much faster, should the processor request them, thereby improving performance. If the victim cache is not used, all evicted cache lines must be read from memory again when they are needed.
With the AXI4 interface, C_DCACHE_DATA_WIDTH determines the amount of data transferred from/to the victim cache each clock cycle, either 32 bits or an entire cache line.
Note: To be able to use the victim cache, write-back must be enabled and area optimization must not be enabled.
Data Cache Software Support
MSR Bit
The DCE bit in the MSR controls whether or not the cache is enabled. When disabling caches the user must ensure that all the prior writes within the cacheable range have been completed in external memory before reading back over M_AXI_DP. This can be done by writing to a semaphore immediately before turning off caches, and then in a loop poll until it has been written. The contents of the cache are preserved when the cache is disabled.
WDC Instruction
The optional WDC instruction (C_ALLOW_DCACHE_WR=1) is used to invalidate or flush cache lines in the data cache from an application. For a detailed description, please refer to Chapter 5, MicroBlaze Instruction Set Architecture.
The WDC instruction can also be used together with parity protection to periodically invalidate entries in the cache, to avoid accumulating errors.
With an external L2 cache, such as the System Cache, connected to MicroBlaze using the ACE interface, external cache invalidate or flush can be performed with WDC. See the LogiCore IP System Cache Product Guide (PG118) [Ref 6] for more information on the System Cache.
Floating-Point Unit (FPU)
Overview
The MicroBlaze floating-point unit is based on the IEEE 754-1985 standard [Ref 18]:
• Uses IEEE 754 single precision floating-point format, including definitions for infinity, not-a-number (NaN), and zero
• Supports addition, subtraction, multiplication, division, comparison, conversion and square root instructions
• Implements round-to-nearest mode
• Generates sticky status bits for: underflow, overflow, divide-by-zero and invalid operation
For improved performance, the following non-standard simplifications are made:
• Denormalized(1) operands are not supported. A hardware floating-point operation on a denormalized number returns a quiet NaN and sets the sticky denormalized operand error bit in FSR; see Floating-Point Status Register (FSR).
• A denormalized result is stored as a signed 0 with the underflow bit set in FSR. This method is commonly referred to as Flush-to-Zero (FTZ).
• An operation on a quiet NaN returns the fixed NaN: 0xFFC00000, rather than one of the NaN operands.
• Overflow as a result of a floating-point operation always returns signed ∞.
Format
An IEEE 754 single precision floating-point number is composed of the following three fields:
1. 1-bit sign
2. 8-bit biased exponent
3. 23-bit fraction (a.k.a. mantissa or significand)
The fields are stored in a 32 bit word as defined in the following figure:
[Figure: bit 0 holds the sign, bits 1 to 8 the exponent, and bits 9 to 31 the fraction]
Figure 2-24: IEEE 754 Single Precision Format
(1) Numbers that are so close to 0 that they cannot be represented with full precision, that is, any number n that falls in the following ranges: ($1.17549 \times 10^{-38} > n > 0$), or ($0 > n > -1.17549 \times 10^{-38}$)
The value of a floating-point number v in MicroBlaze has the following interpretation:
1. If exponent = 255 and fraction <> 0, then v = NaN, regardless of the sign bit
2. If exponent = 255 and fraction = 0, then $v = (-1)^{sign} \cdot \infty$
3. If 0 < exponent < 255, then $v = (-1)^{sign} \cdot 2^{(exponent-127)} \cdot (1.fraction)$
4. If exponent = 0 and fraction <> 0, then $v = (-1)^{sign} \cdot 2^{-126} \cdot (0.fraction)$
5. If exponent = 0 and fraction = 0, then $v = (-1)^{sign} \cdot 0$
For practical purposes only 3 and 5 are useful, while the others all represent either an error or numbers that can no longer be represented with full precision in a 32 bit format.
Rounding
The MicroBlaze FPU only implements the default rounding mode, “Round-to-nearest”, specified in IEEE 754. By definition, the result of any floating-point operation should return the nearest single precision value to the infinitely precise result. If the two nearest representable values are equally near, then the one with its least significant bit zero is returned.
Operations
All MicroBlaze FPU operations use the processor's general purpose registers rather than a dedicated floating-point register file, see General Purpose Registers.
Arithmetic
The FPU implements the following floating-point operations:
• addition, fadd
• subtraction, frsub
• multiplication, fmul
• division, fdiv
• square root, fsqrt (available if C_USE_FPU = 2, EXTENDED)
Comparison
The FPU implements the following floating-point comparisons:
• compare less-than, fcmp.lt
• compare equal, fcmp.eq
• compare less-or-equal, fcmp.le
• compare greater-than, fcmp.gt
• compare not-equal, fcmp.ne
• compare greater-or-equal, fcmp.ge
• compare unordered, fcmp.un (used for NaN)
Conversion
The FPU implements the following conversions (available if C_USE_FPU = 2, EXTENDED):
• convert from signed integer to floating-point, flt
• convert from floating-point to signed integer, fint
Exceptions
The floating-point unit uses the regular hardware exception mechanism in MicroBlaze. When enabled, exceptions are thrown for all the IEEE standard conditions: underflow, overflow, divide-by-zero, and illegal operation, as well as for the MicroBlaze specific exception: denormalized operand error.
A floating-point exception inhibits the write to the destination register (Rd). This allows a floating-point exception handler to operate on the uncorrupted register file.
Software Support
The SDK compiler system, based on GCC, provides support for the floating-point unit compliant with the MicroBlaze API. Compiler flags are automatically added to the GCC command line based on the type of FPU present in the system, when using SDK.
All double-precision operations are emulated in software. Be aware that the xil_printf() function does not support floating-point output. The standard C library printf() and related functions do support floating-point output, but will increase the program code size.
Libraries and Binary Compatibility
The SDK compiler system only includes software floating-point C runtime libraries. To take advantage of the hardware FPU, the libraries must be recompiled with the appropriate compiler switches.
For all cases where separate compilation is used, it is very important that you ensure the consistency of FPU compiler flags throughout the build.
Operator Latencies
The latencies of the various operations supported by the FPU are listed in Chapter 5, “MicroBlaze Instruction Set Architecture.” The FPU instructions are not pipelined, so only one operation can be ongoing at any time.
C Language Programming
To gain maximum benefit from the FPU without low-level assembly-language programming, it is important to consider how the C compiler will interpret your source code. Very often the same algorithm can be expressed in many different ways, and some are more efficient than others.
Immediate Constants
Floating-point constants in C are double-precision by default. When using a single-precision FPU, careless coding could result in double-precision software emulation routines being used instead of the native single-precision instructions. To avoid this, explicitly specify (by cast or suffix) that immediate constants in your arithmetic expressions are single-precision values.
For example:
float x = 0.0;
...
x += (float)1.0; /* float addition */
x += 1.0F;       /* alternative to above */
x += 1.0;        /* warning - uses double addition! */
Note that the GNU C compiler can be instructed to treat all floating-point constants as single-precision (contrary to the ANSI C standard) by supplying the compiler flag -fsingle-precision-constants.
Avoiding Unnecessary Casting
While conversions between floating-point and integer formats are supported in hardware by the FPU, when C_USE_FPU is set to 2 (Extended), it is still best to avoid them when possible.
The following not-recommended example calculates the sum of squares of the integers from 1 to 10 using floating-point representation:
float sum, t;
int i;
sum = 0.0f;
for (i = 1; i <= 10; i++) {
    t = (float)i;
    sum += t * t;
}
The above code requires a cast from an integer to a float on each loop iteration. This can be rewritten as:
float sum, t;
int i;
t = sum = 0.0f;
for (i = 1; i <= 10; i++) {
    t += 1.0f;
    sum += t * t;
}
Note: The compiler is not at liberty to perform this optimization in general, as the two code fragments above might give different results in some cases (for example, very large t).
Using Square Root Runtime Library Function
The standard C runtime math library functions operate using double-precision arithmetic. When using a single-precision FPU, calls to the square root functions (sqrt()) result in inefficient emulation routines being used instead of FPU instructions:
#include <math.h>
...
float x=-1.0F;
...
x = sqrt(x); /* uses double precision */
Here the math.h header is included to avoid a warning message from the compiler.
When used with single-precision data types, the result is a cast to double, a runtime library call is made (which does not use the FPU) and then a truncation back to float is performed.
The solution is to use the non-ANSI function sqrtf() instead, which operates using single precision and can be carried out using the FPU. For example:
#include <math.h>
...
float x=-1.0F;
...
x = sqrtf(x); /* uses single precision */
Note: When compiling this code, the compiler flag -fno-math-errno (in addition to -mhard-float and -mxl-float-sqrt) must be used, to ensure that the compiler does not generate unnecessary code to handle error conditions by updating the errno variable.
Stream Link Interfaces
MicroBlaze can be configured with up to 16 AXI4-Stream interfaces, each consisting of one input and one output port. The channels are dedicated uni-directional point-to-point data streaming interfaces.
For detailed information on the AXI4-Stream interface, please refer to the AMBA 4 AXI4-Stream Protocol Specification, Version 1.0 (Arm IHI 0051A) [Ref 14] document.
The interfaces on MicroBlaze are 32 bits wide. A separate bit indicates whether the sent/received word is of control or data type. The get instruction in the MicroBlaze ISA is used to transfer information from a port to a general purpose register. The put instruction is used to transfer data in the opposite direction. Both instructions come in 4 flavors: blocking data, non-blocking data, blocking control, and non-blocking control. For a detailed description of the get and put instructions, see Chapter 5, MicroBlaze Instruction Set Architecture.
Hardware Acceleration
Each link provides a low latency dedicated interface to the processor pipeline. Thus they are ideal for extending the processor's execution unit with custom hardware accelerators. A simple example is illustrated in the following figure. The code uses RFSLx to indicate the used link.
[Figure: MicroBlaze configures the function f_x in the custom hardware accelerator with cput Rc, RFSLx, stores the two operands from the register file with put Ra, RFSLx (op 1) and put Rb, RFSLx (op 2), and then loads the result from the accelerator's result register over link x.]
Figure 2-25: Stream Link Used with HW Accelerated Function f_x
This method is similar to extending the ISA with custom instructions, but has the benefit of not making the overall speed of the processor pipeline dependent on the custom function. Also, there are no additional requirements on the software tool chain associated with this type of functional extension.
Debug and Trace
Debug Overview
MicroBlaze features a debug interface to support JTAG based software debugging tools (commonly known as BDM or Background Debug Mode debuggers) like the Xilinx System Debugger (XSDB) tool. The debug interface is designed to be connected to the Xilinx Microprocessor Debug Module (MDM) core, which interfaces with the JTAG port of Xilinx FPGAs. Multiple MicroBlaze instances can be interfaced with a single MDM to enable multiprocessor debugging.
To be able to download programs, set software breakpoints and disassemble code, the instruction and data memory ranges must overlap, and use the same physical memory.
Debug registers are accessed using the debug interface, and are not directly visible to software running on the processor, unless the MDM is configured to enable software access to user-accessible debug registers. The debug interface can either use JTAG serial access or AXI4-Lite parallel access, controlled by the parameter C_DEBUG_INTERFACE.
See the MicroBlaze Debug Module (MDM) Product Guide (PG115) [Ref 4] for a detailed description of the MDM features.
The basic debugging features enabled by setting C_DEBUG_ENABLED to 1 (Basic) include:
• Configurable number of hardware breakpoints and watchpoints and unlimited software breakpoints
• External processor control enables debug tools to stop, reset, and single step MicroBlaze
• Read from and write to: memory, general purpose registers, and special purpose registers, except EAR, EDR, ESR, BTR and PVR0 - PVR12, which can only be read
• Support for multiple processors
The extended debugging features enabled by setting C_DEBUG_ENABLED to 2 (Extended) include:
• Configurable number of performance monitoring event and latency counters
• Program Trace:
- Embedded program trace with configurable trace buffer size
- External program trace for multiple processors, provided by the MDM
• Non-intrusive profiling support with configurable profiling buffer size
• Cross trigger support between multiple processors, and external cross trigger inputs and outputs, provided by the MDM
Performance Monitoring
With extended debugging, MicroBlaze provides performance monitoring counters to count various events and to measure latency during program execution. The number of event counters and latency counters can be configured with C_DEBUG_EVENT_COUNTERS and C_DEBUG_LATENCY_COUNTERS respectively, and the counter width can be set to 32, 48 or 64 bits with C_DEBUG_COUNTER_WIDTH. With the default configuration, the counter width is set to 32 bits and there are five event counters and one latency counter.
An event counter simply counts the number of times a certain event has occurred, whereas a latency counter provides the following information:
• Number of times the event has occurred (N)
• The sum of each event latency, measured by counting clock cycles from when the event starts until it finishes (ΣL), used to calculate the mean latency
• The sum of each event latency squared (ΣL²), used to calculate the latency standard deviation
• The minimum, shortest, measured latency for all events (L_min)
• The maximum, longest, measured latency for all events (L_max)
The mean latency (μ) is calculated by the formula:
$$\mu = \frac{\Sigma L}{N}$$
The standard deviation (σ) of the latency is calculated by the formula:
$$\sigma = \frac{\sqrt{N \Sigma L^2 - (\Sigma L)^2}}{N}$$
Counting can be started or stopped using the Performance Counter Command Register or by cross trigger events (see Table 2-62).
When configuring, reading or writing counters, they are accessed sequentially through the performance counter registers. After every access the selected counter item is incremented.
All counters are sampled simultaneously for reading using the Performance Counter Command Register. This can be done while counting, or after counting has been stopped.
When an event counter reaches its maximum value, the overflow status bit is set, and the external interrupt signal Dbg_Intr is set to one. The interrupt signal is reset to zero by clearing the counters using the Performance Counter Command Register.
By using one of the event counters to count the number of clock cycles, and initializing this counter to overflow after a predetermined sampling interval, the external interrupt can be used to periodically sample the performance counters.
The available events are described in Table 2-41, listed in numerical order.
A typical procedure for initializing and using the performance monitoring counters is described in the following steps.
1. Initialize the events to be monitored:
   - Use the Performance Counter Command Register (Table 2-44) to reset the selected counter to the first counter, by setting the Reset bit.
   - Write the desired event numbers for all counters in order, using the Performance Counter Control Register (Table 2-43). With the default configuration this means writing the register five times for the event counters and then once for the latency counter.
2. Clear all counters and start monitoring using the Performance Counter Command Register, by setting the Clear and Start bits.
3. Run the program or function to be monitored.
4. Sample counters and stop monitoring using the Performance Counter Command Register, by setting the Sample and Stop bits.
5. Read the results from all counters:
   - Use the Performance Counter Command Register to reset the selected counter to the first counter, by setting the Reset bit.
   - Read the status for all counters in order, using the Performance Counter Status Register (Table 2-45). With the default configuration this means reading the register five times for the event counters and then once for the latency counter. Ensure that the result is valid by checking that the overflow and full bits are not set.
   - Use the Performance Counter Command Register to reset the selected counter to the first counter, by setting the Reset bit.
   - Read the counter items for all counters in order, using the Performance Counter Data Read Register (Table 2-46). With the default configuration this means reading the register five times for the event counters and then four times for the latency counter as described in Table 2-47.
6. Calculate the final results, depending on the measured events, for example:
   - Use the formulas above to determine the mean latency and standard deviation for any measured latency.
   - The clock cycles per instruction (CPI) can be calculated by E30 / E0.
   - The instruction and data cache hit rates can be calculated by E11 / E10 and E47 / E46.
   - The instruction cache miss latency is determined by (E60(ΣL) - E60(N)) / (E10 - E11), and equivalent formulas can be used to determine the data cache read and write miss latencies.
   - The ratio of floating-point instructions in a program is E29 / E0.
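The example calculations in step 6 can be sketched in Python as follows. This is a minimal illustration, not part of any Xilinx tool: the function name `derived_metrics`, the dictionary layout, and all sample numbers are assumptions made for the example. Because the default configuration has five event counters, the seven event counts used here would likely have to be collected over more than one run.

```python
def derived_metrics(e, lat60_sum, lat60_n):
    """Compute the example metrics from sampled counter values.

    e         -- dict mapping event number to sampled count (hypothetical layout)
    lat60_sum -- E60(ΣL), the latency sum read from the latency counter for event 60
    lat60_n   -- E60(N), the number of events read from the same latency counter
    """
    icache_misses = e[10] - e[11]  # instruction cache requests minus hits
    return {
        "cpi": e[30] / e[0],               # clock cycles per instruction
        "icache_hit_rate": e[11] / e[10],
        "dcache_hit_rate": e[47] / e[46],
        "icache_miss_latency": (lat60_sum - lat60_n) / icache_misses,
        "fp_ratio": e[29] / e[0],          # share of floating-point instructions
    }


# Made-up sample values, for illustration only.
sample = {0: 1000, 10: 1200, 11: 1140, 29: 50, 30: 1500, 46: 400, 47: 380}
metrics = derived_metrics(sample, lat60_sum=660, lat60_n=60)
```

With these made-up numbers the sketch yields a CPI of 1.5, hit rates of 0.95 for both caches, an instruction cache miss latency of 10 cycles, and a floating-point ratio of 0.05. A real script would also need to handle zero denominators, for example when no cache misses occurred.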
Table 2-41: MicroBlaze Performance Monitoring Events
Event  Description
Event Counter Events
0      Any valid instruction executed
1      Load word (lw, lwi, lwx) executed
2      Load halfword (lhu, lhui) executed
3      Load byte (lbu, lbui) executed
4      Store word (sw, swi, swx) executed
5      Store halfword (sh, shi) executed
6      Store byte (sb, sbi) executed
7      Unconditional branch (br, bri, brk, brki) executed
8      Taken conditional branch (beq, ..., bnei) executed
9      Not taken conditional branch (beq, ..., bnei) executed
10     Data request from instruction cache
11     Hit in instruction cache
12     Read data requested from data cache
13     Read data hit in data cache
14     Write data request to data cache
15     Write data hit in data cache
16     Load (lbu, ..., lwx) with r1 as operand executed
17     Store (sb, ..., swx) with r1 as operand executed
18     Logical operation (and, andn, or, xor) executed
19     Arithmetic operation (add, idiv, mul, rsub) executed
20     Multiply operation (mul, mulh, mulhu, mulhsu, muli)
21     Barrel shifter operation (bsrl, bsra, bsll) executed
22     Shift operation (sra, src, srl) executed
23     Exception taken
24     Interrupt occurred
25     Pipeline stalled due to operand fetch stage (OF)
26     Pipeline stalled due to execute stage (EX)
27     Pipeline stalled due to memory stage (MEM)
28     Integer divide (idiv, idivu) executed
29     Floating-point (fadd, ..., fsqrt)
30     Number of clock cycles
31     Immediate (imm) executed
32     Pattern compare (pcmpbf, pcmpeq, pcmpne)
33     Sign extend instructions (sext8, sext16) executed
34     Instruction cache invalidate (wic) executed
35     Data cache invalidate or flush (wdc) executed
36     Machine status instructions (msrset, msrclr)
37     Unconditional branch with delay slot executed
38     Taken conditional branch with delay slot executed
39     Not taken conditional branch with delay slot executed
40     Delay slot with no operation instruction executed
41     Load instruction (lbu, ..., lwx) executed
42     Store instruction (sb, ..., swx) executed
43     MMU data access request
44     Conditional branch (beq, ..., bnei) executed
45     Branch (br, bri, brk, brki, beq, ..., bnei)
46     Read or write data request from/to data cache
47     Read or write data cache hit
48     MMU exception taken
49     MMU instruction side exception taken
50     MMU data side exception taken
51     Pipeline stalled
52     Branch target cache hit for a branch or return
53     MMU instruction side access request
54     MMU instruction TLB (ITLB) hit
55     MMU data TLB (DTLB) hit
56     MMU unified TLB (UTLB) hit
Latency and Event Counter Events
57     Interrupt latency from input to interrupt vector
58     Data cache latency for memory read
59     Data cache latency for memory write
60     Instruction cache latency for memory read
61     MMU address lookup latency
62     Peripheral AXI interface data read latency
63     Peripheral AXI interface data write latency
The debug registers used to configure and control performance monitoring, and to read or write the event and latency counters, are listed in Table 2-42. All of these registers except the Performance Counter Command Register are accessed repeatedly to read or write information, first for all of the event counters and then for all of the latency counters.
The DBG_CTRL value is the value to use in the MDM Debug Register Access Control Register to access the register through MDM software access to debug registers.
Table 2-42: MicroBlaze Performance Monitoring Debug Registers
Register Name                   Size (bits)  MDM Command  DBG_CTRL Value  R/W  Description
Performance Counter Control     8            0101 0001    4A207           W    Select event for each configured counter, according to Table 2-41
Performance Counter Command     5            0101 0010    4A404           W    Command to clear counters, start or stop counting, or sample counters
Performance Counter Status      2            0101 0011    4A601           R    Read the sampled status for each configured performance counter
Performance Counter Data Read   32           0101 0110    4AC1F           R    Read the sampled values for each configured performance counter
Performance Counter Data Write  32           0101 0111    4AE1F           W    Write initial values for each configured performance counter
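The DBG_CTRL values in Table 2-42 appear to follow a regular pattern built from the MDM command and the register size. The Python sketch below checks that observation against the five table rows. It assumes the DBG_CTRL values are hexadecimal and the MDM commands are binary; the helper name `dbg_ctrl` and the bit positions are inferred from the table values only and are not taken from the MDM documentation.

```python
# Rows copied from Table 2-42: name -> (size in bits, MDM command, DBG_CTRL value)
TABLE_2_42 = {
    "Performance Counter Control":    (8,  0b0101_0001, 0x4A207),
    "Performance Counter Command":    (5,  0b0101_0010, 0x4A404),
    "Performance Counter Status":     (2,  0b0101_0011, 0x4A601),
    "Performance Counter Data Read":  (32, 0b0101_0110, 0x4AC1F),
    "Performance Counter Data Write": (32, 0b0101_0111, 0x4AE1F),
}


def dbg_ctrl(mdm_command, size_bits):
    """Rebuild a DBG_CTRL value from the inferred pattern (an assumption).

    Inferred layout: a constant 0x40000, the 8-bit MDM command shifted
    left by 9, and the register size in bits minus one in the low bits.
    """
    return 0x40000 | (mdm_command << 9) | (size_bits - 1)


# Every row of the table matches the inferred pattern.
for name, (size, command, expected) in TABLE_2_42.items():
    assert dbg_ctrl(command, size) == expected, name
```

The pattern also serves as a consistency check on the table itself: the low bits of each DBG_CTRL value agree with the listed register size.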
Performance Counter Control Register
The Performance Counter Control Register (PCCTRLR) is used to define the events that are counted by the configured performance counters. To define the events for all configured counters, the register should be written repeatedly for each of the counters. This register is a write-only register. Issuing a read request has no effect, and undefined data is read.
Every time the register is written, the counter selection advances to the next counter. The Performance Counter Command Register can be used to reset the selection to the first counter again. See the following figure and table.
Figure 2-26: Performance Counter Control Register (bits 31:8 Reserved, bits 7:0 Event)
Table 2-43: Performance Counter Control Register (PCCTRLR)
Bits  Name   Description                                          Reset Value
7:0   Event  Performance counter event, according to Table 2-41.  0
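The counter selection behavior described for this register can be illustrated with a small software model. This is a toy Python sketch that does not access any hardware; the class and method names are hypothetical, and raising an error on a write past the last counter is a choice made for the model, since that case is not described in this section.

```python
class CounterSelectModel:
    """Toy model of counter selection; it does not access any hardware."""

    def __init__(self, num_counters=6):
        # Default configuration in the procedure above: five event counters
        # followed by one latency counter.
        self.events = [None] * num_counters
        self.selected = 0

    def reset_selection(self):
        # Stands in for setting the Reset bit in the Performance Counter
        # Command Register.
        self.selected = 0

    def write_control(self, value):
        # Stands in for one write to PCCTRLR.
        if self.selected >= len(self.events):
            raise IndexError("write past the last configured counter")
        # Only bits 7:0 carry the event number (Table 2-43).
        self.events[self.selected] = value & 0xFF
        # Each write advances the selection to the next counter.
        self.selected += 1


model = CounterSelectModel()
model.reset_selection()
# Five event counter selections, then event 60 for the latency counter.
for event in (0, 30, 10, 11, 29, 60):
    model.write_control(event)
```

After the loop, the model holds one event number per counter in write order, which mirrors step 1 of the procedure: reset the selection once, then write the register once per configured counter.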