STM32F3 Series, STM32F4 Series, STM32L4 Series and
STM32L4+ Series Cortex®-M4 programming manual
Introduction
This programming manual provides information for application and system-level software
developers. It gives a full description of the STM32 Cortex®-M4 processor programming
model, instruction set and core peripherals.
The STM32F3 Series, STM32F4 Series, STM32L4 Series and STM32L4+ Series
Cortex®-M4 processor is a high performance 32-bit processor designed for the
microcontroller market. It offers significant benefits to developers, including:
• Outstanding processing performance combined with fast interrupt handling
• Enhanced system debug with extensive breakpoint and trace capabilities
• Efficient processor core, system and memories
• Ultra-low power consumption with integrated sleep modes
• Platform security
Reference documents
Available from STMicroelectronics web site www.st.com:
•STM32F3 Series, STM32F4 Series, STM32L4 Series and STM32L4+ Series
datasheets
•STM32F3 Series, STM32F4 Series, STM32L4 Series and STM32L4+ Series reference
manuals
This document provides the information required for application and system-level software
development. It does not provide information on debug components, features, or operation.
This material is for microcontroller software and hardware engineers, including those who
have no experience of ARM products.
1.1 Typographical conventions
The typographical conventions used in this document are:
italic                  Highlights important notes, introduces special terminology, denotes
                        internal cross-references, and citations.
< and >                 Enclose replaceable terms for assembler syntax where they appear in
                        code or code fragments. For example:
                        LDRSB<cond> <Rt>, [<Rn>, #<offset>]
bold                    Highlights interface elements, such as menu names. Denotes signal
                        names. Also used for terms in descriptive lists, where appropriate.
monospace               Denotes text that you can enter at the keyboard, such as commands,
                        file and program names, and source code.
monospace (underlined)  Denotes a permitted abbreviation for a command or option. You can
                        enter the underlined text instead of the full command or option name.
monospace italic        Denotes arguments to monospace text where the argument is to be
                        replaced by a specific value.
monospace bold          Denotes language keywords when used outside example code.
1.2 List of abbreviations for registers
The following abbreviations are used in register descriptions:
read/write (rw)      Software can read and write to these bits.
read-only (r)        Software can only read these bits.
write-only (w)       Software can only write to this bit.
                     Reading the bit returns the reset value.
read/clear (rc_w1)   Software can read as well as clear this bit by writing 1.
                     Writing ‘0’ has no effect on the bit value.
read/clear (rc_w0)   Software can read as well as clear this bit by writing 0.
                     Writing ‘1’ has no effect on the bit value.
toggle (t)           Software can only toggle this bit by writing ‘1’. Writing ‘0’ has no effect.
Reserved (Res.)      Reserved bit, must be kept at reset value.
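The write semantics of the rc_w1, rc_w0 and toggle access types can be illustrated with a small host-side model. This is only a sketch: the function names are hypothetical, and it assumes that every bit of the modeled register has the same access type.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of the manual or of CMSIS). Each returns the
 * new register value after software writes `written` to a register that
 * currently holds `cur`. */

/* rc_w1: writing 1 clears the bit, writing 0 leaves it unchanged. */
static uint32_t write_rc_w1(uint32_t cur, uint32_t written)
{
    return cur & ~written;
}

/* rc_w0: writing 0 clears the bit, writing 1 leaves it unchanged. */
static uint32_t write_rc_w0(uint32_t cur, uint32_t written)
{
    return cur & written;
}

/* t: writing 1 toggles the bit, writing 0 leaves it unchanged. */
static uint32_t write_toggle(uint32_t cur, uint32_t written)
{
    return cur ^ written;
}
```

One practical consequence of the rc_w1 behavior: to clear a single flag, software writes only the mask of that flag. A read-modify-write that ORs the mask into the value just read would write 1 to every flag that is currently set and clear them all.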
12/260DocID022708 Rev 6
Page 13
PM0214About this document
[Figure 1 block diagram. Blocks shown: Cortex-M4 processor (processor core, FPU), NVIC, Embedded Trace Macrocell, debug access port, memory protection unit, serial wire viewer, flash patch, data watchpoints, bus matrix, code interface, SRAM and peripheral interface.]
1.3 About the STM32 Cortex-M4 processor and core peripherals
The Cortex-M4 processor is a high performance 32-bit processor designed for the
microcontroller market. It offers significant benefits to developers, including:
•outstanding processing performance combined with fast interrupt handling
•enhanced system debug with extensive breakpoint and trace capabilities
•efficient processor core, system and memories
•ultra-low power consumption with integrated sleep modes
•platform security robustness, with integrated memory protection unit (MPU).
The Cortex-M4 processor is built on a high-performance processor core, with a 3-stage
pipeline Harvard architecture, making it ideal for demanding embedded applications. The
processor delivers exceptional power efficiency through an efficient instruction set and
extensively optimized design, providing high-end processing hardware including
IEEE754-compliant single-precision floating-point computation, a range of single-cycle and SIMD
multiplication and multiply-with-accumulate capabilities, saturating arithmetic and dedicated
hardware division.
Figure 1. STM32 Cortex-M4 implementation
To facilitate the design of cost-sensitive devices, the Cortex-M4 processor implements
tightly-coupled system components that reduce processor area while significantly improving
interrupt handling and system debug capabilities. The Cortex-M4 processor implements a
version of the Thumb® instruction set based on Thumb-2 technology, ensuring high code
density and reduced program memory requirements. The Cortex-M4 instruction set
provides the exceptional performance expected of a modern 32-bit architecture, with the
high code density of 8-bit and 16-bit microcontrollers.
The Cortex-M4 processor closely integrates a configurable nested interrupt controller
(NVIC), to deliver industry-leading interrupt performance. The NVIC includes a
non-maskable interrupt (NMI), and provides up to 256 interrupt priority levels. The tight
integration of the processor core and NVIC provides fast execution of interrupt service
routines (ISRs), dramatically reducing the interrupt latency. This is achieved through the
hardware stacking of registers, and the ability to suspend load-multiple and store-multiple
operations. Interrupt handlers do not require any assembler stubs, removing any code
overhead from the ISRs. Tail-chaining optimization also significantly reduces the overhead
when switching from one ISR to another.
To optimize low-power designs, the NVIC integrates with the sleep modes. These include a
deep sleep function that enables the STM32 to enter Stop or Standby mode.
1.3.1 System level interface
The Cortex-M4 processor provides multiple interfaces using AMBA® technology to provide
high speed, low latency memory accesses. It supports unaligned data accesses and
implements atomic bit manipulation that enables faster peripheral controls, system
spinlocks and thread-safe Boolean data handling.
The Cortex-M4 processor has a memory protection unit (MPU) that provides fine grain
memory control, enabling applications to utilize multiple privilege levels, separating and
protecting code, data and stack on a task-by-task basis. Such requirements are critical in
many embedded applications such as automotive.
1.3.2 Integrated configurable debug
The Cortex-M4 processor implements a complete hardware debug solution. This provides
high system visibility of the processor and memory through either a traditional JTAG port or
a 2-pin Serial Wire Debug (SWD) port that is ideal for small package devices.
For system trace the processor integrates an Instrumentation Trace Macrocell (ITM)
alongside data watchpoints and a profiling unit. To enable simple and cost-effective profiling
of the system events these generate, a Serial Wire Viewer (SWV) can export a stream of
software-generated messages, data trace, and profiling information through a single pin.
The optional Embedded Trace Macrocell™ (ETM) delivers unrivalled instruction trace
capture in an area far smaller than traditional trace units.
1.3.3 Cortex-M4 processor features and benefits summary
•Tight integration of system peripherals reduces area and development costs
•Thumb instruction set combines high code density with 32-bit performance
•IEEE754-compliant single-precision FPU implemented in all STM32 Cortex-M4
microcontrollers
•Power control optimization of system components
•Integrated sleep modes for low power consumption
•Fast code execution permits slower processor clock or increases sleep mode time
•Hardware division and fast multiplier
•Deterministic, high-performance interrupt handling for time-critical applications
•Memory protection unit (MPU) for safety-critical applications
•Extensive debug and trace capabilities: Serial Wire Debug and Serial Wire Trace
reduce the number of pins required for debugging and tracing.
1.3.4 Cortex-M4 core peripherals
The peripherals are:
Nested vectored interrupt controller
The nested vectored interrupt controller (NVIC) is an embedded interrupt controller that
supports low latency interrupt processing.
System control block
The system control block (SCB) is the programmer’s model interface to the processor.
It provides system implementation information and system control, including
configuration, control, and reporting of system exceptions.
System timer
The system timer, SysTick, is a 24-bit count-down timer. Use this as a Real Time
Operating System (RTOS) tick timer or as a simple counter.
Memory protection unit
The Memory protection unit (MPU) improves system reliability by defining the memory
attributes for different memory regions. It provides up to eight different regions, and an
optional predefined background region.
Floating-point unit
The Floating-point unit (FPU) provides IEEE754-compliant operations on single-
precision, 32-bit, floating-point values.
2 The Cortex-M4 processor
2.1 Programmers model
This section describes the Cortex-M4 programmer’s model. In addition to the individual core
register descriptions, it contains information about the processor modes and privilege levels
for software execution and stacks.
2.1.1 Processor mode and privilege levels for software execution
The processor modes are:
Thread mode: Used to execute application software.
The processor enters Thread mode when it comes out of reset.
The CONTROL register controls whether software execution is
privileged or unprivileged, see CONTROL register on page 24.
Handler mode: Used to handle exceptions.
The processor returns to Thread mode when it has finished exception
processing.
Software execution is always privileged.
The privilege levels for software execution are:
Unprivileged: Unprivileged software executes at the unprivileged level and:
•Has limited access to the MSR and MRS instructions, and cannot
use the CPS instruction.
•Cannot access the system timer, NVIC, or system control block.
•Might have restricted access to memory or peripherals.
•Must use the SVC instruction to make a supervisor call to transfer
control to privileged software.
Privileged: Privileged software executes at the privileged level and can use all the
instructions and has access to all resources.
Can write to the CONTROL register to change the privilege level for
software execution.
2.1.2 Stacks
The processor uses a full descending stack. This means the stack pointer indicates the last
stacked item on the stack memory. When the processor pushes a new item onto the stack, it
decrements the stack pointer and then writes the item to the new memory location. The
processor implements two stacks, the main stack and the process stack, with independent
copies of the stack pointer, see Stack pointer on page 18.
In Thread mode, the CONTROL register controls whether the processor uses the main
stack or the process stack, see CONTROL register on page 24. In Handler mode, the
processor always uses the main stack. The options for processor operations are:
Table 1. Summary of processor mode, execution privilege level, and stack usage
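The full descending stack behavior described in Section 2.1.2 can be illustrated with a small host-side model. This is a sketch only: the type and function names are hypothetical, the stack is a plain array rather than SRAM, and the bounds checks exist only in the model.

```c
#include <stdint.h>
#include <stddef.h>

#define STACK_WORDS 8u

/* Hypothetical model of a full descending stack. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    size_t sp;  /* index of the last stacked item; STACK_WORDS when empty */
} model_stack;

static void stack_init(model_stack *s)
{
    s->sp = STACK_WORDS;
}

/* Push: decrement the stack pointer first, then write the item. */
static int stack_push(model_stack *s, uint32_t item)
{
    if (s->sp == 0) {
        return -1;              /* model-only overflow check */
    }
    s->sp--;
    s->mem[s->sp] = item;
    return 0;
}

/* Pop: read the item the stack pointer indicates, then increment. */
static int stack_pop(model_stack *s, uint32_t *item)
{
    if (s->sp == STACK_WORDS) {
        return -1;              /* model-only underflow check */
    }
    *item = s->mem[s->sp];
    s->sp++;
    return 0;
}
```

After two pushes the stack pointer indicates the second item, which sits at the lower index, so the most recently stacked item is always the one the stack pointer refers to.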
2.1.3 Core registers
Table 2. Core register set summary
Name        Type(1)      Required privilege(2)   Reset value       Description
R0-R12      read-write   Either                  Unknown           General-purpose registers on page 18
MSP         read-write   Privileged              See description   Stack pointer on page 18
PSP         read-write   Either                  Unknown           Stack pointer on page 18
LR          read-write   Either                  0xFFFFFFFF        Link register on page 18
PC          read-write   Either                  See description   Program counter on page 18
PSR         read-write   Privileged              0x01000000        Program status register on page 18
APSR        read-write   Either                  Unknown           Application program status register on page 20
IPSR        read-only    Privileged              0x00000000        Interrupt program status register on page 21
EPSR        read-only    Privileged              0x01000000        Execution program status register on page 21
PRIMASK     read-write   Privileged              0x00000000        Priority mask register on page 23
FAULTMASK   read-write   Privileged              0x00000000        Fault mask register on page 23
BASEPRI     read-write   Privileged              0x00000000        Base priority mask register on page 24
CONTROL     read-write   Privileged              0x00000000        CONTROL register on page 24
1. Describes access type during program execution in Thread mode and Handler mode. Debug access can
differ.
2. An entry of either means privileged and unprivileged software can access the register.
General-purpose registers
R0-R12 are 32-bit general-purpose registers for data operations.
Stack pointer
The Stack Pointer (SP) is register R13. In Thread mode, bit[1] of the CONTROL register
indicates the stack pointer to use:
•0: Main Stack Pointer (MSP). This is the reset value.
•1: Process Stack Pointer (PSP).
On reset, the processor loads the MSP with the value from address 0x00000000.
Link register
The Link Register (LR) is register R14. It stores the return information for subroutines,
function calls, and exceptions. On reset, the processor loads the LR value 0xFFFFFFFF.
Program counter
The Program Counter (PC) is register R15. It contains the current program address. On
reset, the processor loads the PC with the value of the reset vector, which is at address
0x00000004. Bit[0] of the value is loaded into the EPSR T-bit at reset and must be 1.
Program status register
The Program Status Register (PSR) combines:
•Application Program Status Register (APSR)
•Interrupt Program Status Register (IPSR)
•Execution Program Status Register (EPSR)
These registers are mutually exclusive bitfields in the 32-bit PSR. The bit assignments are
as shown in Figure 3 and Figure 4.
Figure 3. APSR, IPSR and EPSR bit assignments
APSR: [31] N, [30] Z, [29] C, [28] V, [27] Q, [26:20] Reserved, [19:16] GE[3:0], [15:0] Reserved
IPSR: [31:9] Reserved, [8:0] ISR_NUMBER
EPSR: [31:27] Reserved, [26:25] ICI/IT, [24] T, [23:16] Reserved, [15:10] ICI/IT, [9:0] Reserved
Figure 4. PSR bit assignments
PSR: [31] N, [30] Z, [29] C, [28] V, [27] Q, [26:25] ICI/IT, [24] T, [23:20] Reserved, [19:16] GE[3:0], [15:10] ICI/IT, [9] Reserved, [8:0] ISR_NUMBER
Access these registers individually or as a combination of any two or all three registers,
using the register name as an argument to the MSR or MRS instructions. For example:
•Read all of the registers using PSR with the MRS instruction.
•Write to the APSR N, Z, C, V, and Q bits using APSR_nzcvq with the MSR instruction.
The PSR combinations and attributes are:
Table 3. PSR register combinations
Register   Type                  Combination
PSR        read-write(1), (2)    APSR, EPSR, and IPSR
IEPSR      read-only             EPSR and IPSR
IAPSR      read-write(1)         APSR and IPSR
EAPSR      read-write(2)         APSR and EPSR
1. The processor ignores writes to the IPSR bits.
2. Reads of the EPSR bits return zero, and the processor ignores writes to these bits.
See the instruction descriptions MRS on page 185 and MSR on page 186 for more
information about how to access the program status registers.
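As an illustration of how the APSR, IPSR and EPSR fields share one 32-bit PSR value, the following sketch splits a PSR value (for example one saved on the stack at exception entry) into its fields. The struct and function names are hypothetical; only the bit positions come from the register descriptions in this section.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical field view of a 32-bit PSR value. */
typedef struct {
    bool n, z, c, v, q;   /* APSR bits 31, 30, 29, 28, 27 */
    uint8_t ge;           /* APSR bits 19:16, GE[3:0] */
    bool t;               /* EPSR bit 24, Thumb state */
    uint16_t isr_number;  /* IPSR bits 8:0 */
} psr_fields;

static psr_fields psr_decode(uint32_t psr)
{
    psr_fields f;
    f.n = ((psr >> 31) & 1u) != 0;
    f.z = ((psr >> 30) & 1u) != 0;
    f.c = ((psr >> 29) & 1u) != 0;
    f.v = ((psr >> 28) & 1u) != 0;
    f.q = ((psr >> 27) & 1u) != 0;
    f.ge = (uint8_t)((psr >> 16) & 0xFu);
    f.t = ((psr >> 24) & 1u) != 0;
    f.isr_number = (uint16_t)(psr & 0x1FFu);
    return f;
}
```

For example, decoding 0x6100000F gives Z = 1, C = 1, T = 1 and ISR_NUMBER = 15. Because the fields do not overlap, the same routine can be applied to the combined PSR or to any of the combinations in Table 3.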
Application program status register
The APSR contains the current state of the condition flags from previous instruction
executions. See the register summary in Table 2 on page 17 for its attributes. The bit
assignments are:
Table 4. APSR bit definitions
Bits         Description
Bit 31       N: Negative or less than flag:
             0: Operation result was positive, zero, greater than, or equal
             1: Operation result was negative or less than.
Bit 30       Z: Zero flag:
             0: Operation result was not zero
             1: Operation result was zero.
Bit 29       C: Carry or borrow flag:
             0: Add operation did not result in a carry bit or subtract operation resulted in a
             borrow bit
             1: Add operation resulted in a carry bit or subtract operation did not result in a
             borrow bit.
Bit 28       V: Overflow flag:
             0: Operation did not result in an overflow
             1: Operation resulted in an overflow.
Bit 27       Q: DSP overflow and saturation flag: Sticky saturation flag.
             0: Indicates that saturation has not occurred since reset or since the bit was last
             cleared to zero
             1: Indicates when an SSAT or USAT instruction results in saturation, or indicates a
             DSP overflow.
             This bit is cleared to zero by software using an MSR instruction.
Bits 26:20   Reserved.
Bits 19:16   GE[3:0]: Greater than or Equal flags. See SEL on page 104 for more information.
Bits 15:0    Reserved.
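The N, Z, C and V definitions in Table 4 can be made concrete with a host-side model of a flag-setting 32-bit add and subtract. This is a sketch under the assumption that the flags follow the definitions above; the type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool n, z, c, v;
} nzcv_flags;

/* Flags for the 32-bit addition a + b. */
static nzcv_flags flags_add(uint32_t a, uint32_t b)
{
    uint32_t r = a + b;
    nzcv_flags f;
    f.n = (r >> 31) != 0;                      /* result negative */
    f.z = (r == 0);                            /* result zero */
    f.c = (r < a);                             /* unsigned carry out */
    f.v = (((a ^ r) & (b ^ r)) >> 31) != 0;    /* signed overflow */
    return f;
}

/* Flags for the 32-bit subtraction a - b. C is set when no borrow occurs. */
static nzcv_flags flags_sub(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    nzcv_flags f;
    f.n = (r >> 31) != 0;
    f.z = (r == 0);
    f.c = (a >= b);                            /* no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;    /* signed overflow */
    return f;
}
```

Note the inverted sense of C for subtraction: subtracting 5 from 3 borrows and clears C, while subtracting 5 from 5 does not borrow and sets C together with Z.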
Interrupt program status register
The IPSR contains the exception type number of the current Interrupt Service Routine
(ISR). See the register summary in Table 2 on page 17 for its attributes.
The bit assignments are:
Table 5. IPSR bit definitions
Bits        Description
Bits 31:9   Reserved
Bits 8:0    ISR_NUMBER:
            This is the number of the current exception:
            0: Thread mode
            1: Reserved
            2: NMI
            3: Hard fault
            4: Memory management fault
            5: Bus fault
            6: Usage fault
            7: Reserved
            ....
            10: Reserved
            11: SVCall
            12: Reserved for Debug
            13: Reserved
            14: PendSV
            15: SysTick
            16: IRQ0(1)
            ....
            ....
            97: IRQ81(1)
            See Exception types on page 36 for more information.
1. See STM32 product reference manual/datasheet for more information on interrupt mapping.
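Since IRQ0 has exception number 16, the external interrupt number follows from ISR_NUMBER by subtracting 16. A minimal sketch of that mapping, with hypothetical macro and function names:

```c
#include <stdint.h>
#include <stdbool.h>

#define IPSR_ISR_NUMBER_MASK 0x1FFu  /* bits 8:0 */
#define FIRST_IRQ_EXCEPTION  16u     /* exception number of IRQ0 */

/* True when the IPSR value denotes Handler mode (ISR_NUMBER is nonzero). */
static bool ipsr_in_handler_mode(uint32_t ipsr)
{
    return (ipsr & IPSR_ISR_NUMBER_MASK) != 0;
}

/* Returns the external interrupt number for an IPSR value, or -1 when the
 * value denotes Thread mode or a system exception (numbers 0 to 15). */
static int ipsr_to_irq(uint32_t ipsr)
{
    uint32_t n = ipsr & IPSR_ISR_NUMBER_MASK;
    return (n >= FIRST_IRQ_EXCEPTION) ? (int)(n - FIRST_IRQ_EXCEPTION) : -1;
}
```

Which peripheral a given IRQ number belongs to depends on the device, so the result still has to be looked up in the product reference manual.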
Execution program status register
The EPSR contains the Thumb state bit, and the execution state bits for either the:
•If-Then (IT) instruction
•Interruptible-Continuable Instruction (ICI) field for an interrupted load multiple or store
multiple instruction.
See the register summary in Table 2 on page 17 for the EPSR attributes. The bit
assignments are:
Table 6. EPSR bit definitions
Bits                Description
Bits 31:27          Reserved.
Bits 26:25, 15:10   ICI: Interruptible-continuable instruction bits, see Interruptible-continuable
                    instructions on page 22.
Bits 26:25, 15:10   IT: Indicates the execution state bits of the IT instruction, see IT on page 144.
Bit 24              T: Thumb state bit.
Bits 23:16          Reserved.
Bits 9:0            Reserved.
Attempts to read the EPSR directly through application software using the MRS instruction
always return zero. Attempts to write the EPSR using the MSR instruction in application
software are ignored. Fault handlers can examine the EPSR value in the stacked PSR to
determine the operation that is at fault. See Section 2.3.7: Exception entry and return on
page 41.
Interruptible-continuable instructions
When an interrupt occurs during the execution of an LDM, STM, PUSH, POP, VLDM, VSTM,
VPUSH, or VPOP instruction, the processor:
•Stops the load multiple or store multiple instruction operation temporarily
•Stores the next register operand in the multiple operation to EPSR bits[15:12].
After servicing the interrupt, the processor:
•Returns to the register pointed to by bits[15:12]
•Resumes execution of the multiple load or store instruction.
When the EPSR holds ICI execution state, bits[26:25,11:10] are zero.
If-Then block
The If-Then block contains up to four instructions following a 16-bit IT instruction. Each
instruction in the block is conditional. The conditions for the instructions are either all the
same, or some can be the inverse of others. See IT on page 144 for more information.
Thumb state
The Cortex-M4 processor only supports execution of instructions in Thumb state. The
following can clear the T bit to 0:
•Instructions BLX, BX and POP{PC}
•Restoration from the stacked xPSR value on an exception return
•Bit[0] of the vector value on an exception entry or reset
Attempting to execute instructions when the T bit is 0 results in a fault or lockup. See Lockup
on page 46 for more information.
Exception mask registers
The exception mask registers disable the handling of exceptions by the processor. Disable
exceptions where they might affect timing-critical tasks.
To access the exception mask registers use the MSR and MRS instructions, or the CPS
instruction to change the value of PRIMASK or FAULTMASK. See MRS on page 185, MSR
on page 186, and CPS on page 181 for more information.
Priority mask register
The PRIMASK register prevents the activation of all exceptions with configurable priority.
See the register summary in Table 2 on page 17 for its attributes. Figure 5 shows the bit
assignments.
Figure 5. PRIMASK bit assignments
PRIMASK: [31:1] Reserved, [0] PRIMASK
Table 7. PRIMASK register bit definitions
Bits        Description
Bits 31:1   Reserved
Bit 0       PRIMASK:
            0: No effect
            1: Prevents the activation of all exceptions with configurable priority.
Fault mask register
The FAULTMASK register prevents activation of all exceptions except for Non-Maskable
Interrupt (NMI). See the register summary in Table 2 on page 17 for its attributes. Figure 6
shows the bit assignments.
Figure 6. FAULTMASK bit assignments
FAULTMASK: [31:1] Reserved, [0] FAULTMASK
Table 8. FAULTMASK register bit definitions
Bits        Function
Bits 31:1   Reserved
Bit 0       FAULTMASK:
            0: No effect
            1: Prevents the activation of all exceptions except for NMI.
The processor clears the FAULTMASK bit to 0 on exit from any exception handler except
the NMI handler.
Base priority mask register
The BASEPRI register defines the minimum priority for exception processing. When
BASEPRI is set to a nonzero value, it prevents the activation of all exceptions with the same
or lower priority level as the BASEPRI value. See the register summary in Table 2 on page 17
for its attributes. Figure 7 shows the bit assignments.
Figure 7. BASEPRI bit assignments
BASEPRI: [31:8] Reserved, [7:0] BASEPRI
Table 9. BASEPRI register bit assignments
Bits        Function
Bits 31:8   Reserved
Bits 7:4    BASEPRI[7:4]: Priority mask bits(1)
            0x00: no effect
            Nonzero: defines the base priority for exception processing.
            The processor does not process any exception with a priority value greater than or
            equal to BASEPRI.
Bits 3:0    Reserved
1. This field is similar to the priority fields in the interrupt priority registers. See Interrupt priority registers
(NVIC_IPRx) on page 214 for more information. Remember that higher priority field values correspond to
lower exception priorities.
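The combined effect of PRIMASK, FAULTMASK and BASEPRI can be summarized in a small host-side model. This is a simplified sketch: the names are hypothetical, it assumes the architectural fixed priorities of -3 for reset, -2 for NMI and -1 for hard fault, and it ignores priority grouping and the priority of any exception that is already active.

```c
#include <stdint.h>
#include <stdbool.h>

/* Fixed priorities (assumed here); a lower value means a higher priority. */
#define PRIO_RESET     (-3)
#define PRIO_NMI       (-2)
#define PRIO_HARDFAULT (-1)

typedef struct {
    bool primask;     /* PRIMASK bit 0 */
    bool faultmask;   /* FAULTMASK bit 0 */
    uint8_t basepri;  /* BASEPRI value, 0 means no effect */
} mask_regs;

/* Returns true when the mask registers alone would let an exception with
 * priority value `prio` be activated. */
static bool masks_allow(const mask_regs *m, int prio)
{
    if (m->faultmask && prio >= PRIO_HARDFAULT) {
        return false;  /* everything except NMI (and reset) is blocked */
    }
    if (m->primask && prio >= 0) {
        return false;  /* all configurable priorities are blocked */
    }
    if (m->basepri != 0 && prio >= (int)m->basepri) {
        return false;  /* same or lower priority than BASEPRI is blocked */
    }
    return true;
}
```

With BASEPRI set to 0x40, an interrupt at priority value 0x30 can still be activated while one at 0x40 or 0x80 cannot, which matches the rule that higher priority values correspond to lower priorities.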
CONTROL register
The CONTROL register controls the stack used and the privilege level for software
execution when the processor is in Thread mode and indicates whether the FPU state is
active. See the register summary in Table 2 on page 17 for its attributes.
Table 10. CONTROL register bit definitions
Bits        Function
Bits 31:3   Reserved
Bit 2       FPCA: Indicates whether floating-point context is currently active:
            0: No floating-point context active
            1: Floating-point context active.
            The Cortex-M4 uses this bit to determine whether to preserve floating-point state
            when processing an exception.
Bit 1       SPSEL: Active stack pointer selection. Selects the current stack:
            0: MSP is the current stack pointer
            1: PSP is the current stack pointer.
            In Handler mode this bit reads as zero and ignores writes. The Cortex-M4 updates
            this bit automatically on exception return.
Bit 0       nPRIV: Thread mode privilege level. Defines the Thread mode privilege level.
            0: Privileged
            1: Unprivileged.
Handler mode always uses the MSP, so the processor ignores explicit writes to the active
stack pointer bit of the CONTROL register when in Handler mode. The exception entry and
return mechanisms update the CONTROL register.
In an OS environment, it is recommended that threads running in Thread mode use the
process stack, and the kernel and exception handlers use the main stack.
By default, Thread mode uses the MSP. To switch the stack pointer used in Thread mode to
the PSP, either:
•use the MSR instruction to set the Active stack pointer bit to 1, see MSR on page 186.
•perform an exception return to Thread mode with the appropriate EXC_RETURN
value, see Exception return behavior on page 43.
When changing the stack pointer, software must use an ISB instruction immediately after
the MSR instruction. This ensures that instructions after the ISB execute using the new
stack pointer. See ISB on page 184.
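The CONTROL register bit definitions can be expressed as masks and small helper functions. The sketch below works on plain 32-bit values so it can run on a host; the macro and function names are hypothetical. On the target, the value would be read with MRS and written back with MSR followed by an ISB, as described above.

```c
#include <stdint.h>
#include <stdbool.h>

#define CONTROL_NPRIV (1u << 0)  /* Bit 0: Thread mode privilege level */
#define CONTROL_SPSEL (1u << 1)  /* Bit 1: active stack pointer selection */
#define CONTROL_FPCA  (1u << 2)  /* Bit 2: floating-point context active */

static bool control_thread_is_privileged(uint32_t control)
{
    return (control & CONTROL_NPRIV) == 0;
}

static bool control_thread_uses_psp(uint32_t control)
{
    return (control & CONTROL_SPSEL) != 0;
}

static bool control_fp_context_active(uint32_t control)
{
    return (control & CONTROL_FPCA) != 0;
}

/* Value to write so that Thread mode uses the PSP, leaving the privilege
 * level and FPCA bits unchanged. */
static uint32_t control_select_psp(uint32_t control)
{
    return control | CONTROL_SPSEL;
}
```

Deriving the new value from the current one, rather than writing a constant, avoids accidentally changing nPRIV or FPCA when only the stack selection should change.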
2.1.4 Exceptions and interrupts
The Cortex-M4 processor supports interrupts and system exceptions. The processor and
the Nested Vectored Interrupt Controller (NVIC) prioritize and handle all exceptions. An
exception changes the normal flow of software control. The processor uses Handler mode to
handle all exceptions except for reset. See Exception entry on page 41 and Exception
return on page 43 for more information.
The NVIC registers control interrupt handling. See Nested vectored interrupt controller
(NVIC) on page 207 for more information.
2.1.5 Data types
The processor:
•Supports the following data types:
–32-bit words
–16-bit halfwords
–8-bit bytes
•Manages all memory accesses as little-endian. See Memory regions, types and
attributes on page 28 for more information.
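Little-endian ordering means that the least significant byte of a word or halfword is stored at the lowest address. The following sketch shows the resulting byte layout using portable byte-wise accesses; the function names are hypothetical.

```c
#include <stdint.h>

/* Store a 32-bit word at p with the least significant byte first. */
static void store_le32(uint8_t *p, uint32_t w)
{
    p[0] = (uint8_t)(w & 0xFFu);
    p[1] = (uint8_t)((w >> 8) & 0xFFu);
    p[2] = (uint8_t)((w >> 16) & 0xFFu);
    p[3] = (uint8_t)((w >> 24) & 0xFFu);
}

/* Load a 32-bit word stored in little-endian order at p. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Load a 16-bit halfword stored in little-endian order at p. */
static uint16_t load_le16(const uint8_t *p)
{
    return (uint16_t)((uint16_t)p[0] | ((uint16_t)p[1] << 8));
}
```

Storing the word 0x11223344 places 0x44 at the lowest address and 0x11 at the highest, and a halfword read from the same address yields 0x3344.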
2.1.6 The Cortex microcontroller software interface standard (CMSIS)
For a Cortex-M4 microcontroller system, the Cortex Microcontroller Software Interface
Standard (CMSIS) defines:
•A common way to:
–Access peripheral registers
–Define exception vectors
•The names of:
–The registers of the core peripherals
–The core exception vectors
•A device-independent interface for RTOS kernels, including a debug channel
The CMSIS includes address definitions and data structures for the core peripherals in the
Cortex-M4 processor.
CMSIS simplifies software development by enabling the reuse of template code and the
combination of CMSIS-compliant software components from various middleware vendors.
Software vendors can expand the CMSIS to include their peripheral definitions and access
functions for those peripherals.
This document includes the register names defined by the CMSIS, and gives short
descriptions of the CMSIS functions that address the processor core and the core
peripherals.
This document uses the register short names defined by the CMSIS. In a few cases these
differ from the architectural short names that might be used in other documents.
The following sections give more information about the CMSIS:
•Section 2.5.4: Power management programming hints on page 48
•CMSIS intrinsic functions on page 57
•Interrupt set-enable registers (NVIC_ISERx) on page 209
•NVIC programming hints on page 217
2.2 Memory model
This section describes the processor memory map, the behavior of memory accesses, and
the bit-banding features. The processor has a fixed memory map that provides up to 4 GB of
addressable memory.
Figure 8. Memory map
0xE0100000 - 0xFFFFFFFF   Vendor-specific memory   511MB
0xE0000000 - 0xE00FFFFF   Private peripheral bus   1.0MB
0xA0000000 - 0xDFFFFFFF   External device          1.0GB
0x60000000 - 0x9FFFFFFF   External RAM             1.0GB
0x40000000 - 0x5FFFFFFF   Peripheral               0.5GB
    0x42000000 - 0x43FFFFFF   Bit band alias       32MB
    0x40000000 - 0x400FFFFF   Bit band region      1MB
0x20000000 - 0x3FFFFFFF   SRAM                     0.5GB
    0x22000000 - 0x23FFFFFF   Bit band alias       32MB
    0x20000000 - 0x200FFFFF   Bit band region      1MB
0x00000000 - 0x1FFFFFFF   Code                     0.5GB
The regions for SRAM and peripherals include bit-band regions. Bit-banding provides
atomic operations to bit data, see Section 2.2.5: Bit-banding on page 31.
The processor reserves regions of the Private peripheral bus (PPB) address range for core
peripheral registers, see Section 4.1: About the STM32 Cortex-M4 core peripherals on
page 192.
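Each bit in a 1 MB bit-band region maps to one word in the corresponding 32 MB alias region. The sketch below computes the alias word address for a given byte address and bit number. It assumes the usual Cortex-M mapping, alias address = alias base + (byte offset x 32) + (bit number x 4), which is detailed in Section 2.2.5 rather than here; the macro and function names are hypothetical.

```c
#include <stdint.h>

#define SRAM_BB_BASE    0x20000000u  /* SRAM bit-band region */
#define SRAM_BB_ALIAS   0x22000000u  /* SRAM bit-band alias */
#define PERIPH_BB_BASE  0x40000000u  /* peripheral bit-band region */
#define PERIPH_BB_ALIAS 0x42000000u  /* peripheral bit-band alias */
#define BB_REGION_SIZE  0x00100000u  /* 1 MB */

/* Returns the alias word address for bit `bit` (0 to 7) of the byte at
 * `addr`, or 0 if `addr` is outside both bit-band regions or `bit` > 7. */
static uint32_t bitband_alias(uint32_t addr, unsigned bit)
{
    if (bit > 7u) {
        return 0;
    }
    /* Unsigned subtraction wraps for addr < base, so one compare suffices. */
    if (addr - SRAM_BB_BASE < BB_REGION_SIZE) {
        return SRAM_BB_ALIAS + (addr - SRAM_BB_BASE) * 32u + bit * 4u;
    }
    if (addr - PERIPH_BB_BASE < BB_REGION_SIZE) {
        return PERIPH_BB_ALIAS + (addr - PERIPH_BB_BASE) * 32u + bit * 4u;
    }
    return 0;
}
```

For example, bit 2 of the byte at 0x20000300 maps to the alias word at 0x22006008. The factor of 32 also explains the sizes in the memory map: 1 MB of bit-band region needs 32 MB of alias space.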
2.2.1 Memory regions, types and attributes
The memory map and the programming of the MPU split the memory map into regions.
Each region has a defined memory type, and some regions have additional memory
attributes. The memory type and attributes determine the behavior of accesses to the
region.
The memory types are:
Normal The processor can re-order transactions for efficiency, or
perform speculative reads.
Device The processor preserves transaction order relative to other
transactions to Device or Strongly-ordered memory.
Strongly-ordered The processor preserves transaction order relative to all other
transactions.
The different ordering requirements for Device and Strongly-ordered memory mean that the
memory system can buffer a write to Device memory, but must not buffer a write to
Strongly-ordered memory.
The additional memory attributes include:
Execute Never (XN) Means that the processor prevents instruction accesses. Any
attempt to fetch an instruction from an XN region causes a
memory management fault exception.
2.2.2 Memory system ordering of memory accesses
For most memory accesses caused by explicit memory access instructions, the memory
system does not guarantee that the order in which the accesses complete matches the
program order of the instructions, providing this does not affect the behavior of the
instruction sequence. Normally, if correct program execution depends on two memory
accesses completing in program order, software must insert a memory barrier instruction
between the memory access instructions, see Section 2.2.4: Software ordering of memory
accesses on page 30.
However, the memory system does guarantee some ordering of accesses to Device and
Strongly-ordered memory. For two memory access instructions A1 and A2, if A1 occurs
before A2 in program order, the ordering of the memory accesses caused by the two
instructions is:
Table 11. Ordering of memory accesses(1)
                               A2
A1                             Normal access   Device access,   Device access,   Strongly-ordered
                                               non-shareable    shareable        access
Normal access                  -               -                -                -
Device access, non-shareable   -               <                -                <
Device access, shareable       -               -                <                <
Strongly ordered access        -               <                <                <
1. - means that the memory system does not guarantee the ordering of the accesses.
< means that accesses are observed in program order, that is, A1 is always observed before A2.
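Table 11 can be encoded directly as a lookup table. The sketch below answers whether the memory system guarantees that A1 is observed before A2 for a given pair of access types; the enum and function names are hypothetical.

```c
#include <stdbool.h>

typedef enum {
    ACC_NORMAL = 0,
    ACC_DEVICE_NONSHAREABLE,
    ACC_DEVICE_SHAREABLE,
    ACC_STRONGLY_ORDERED
} access_type;

/* Returns true when A1 is always observed before A2 (a "<" entry in
 * Table 11), false when the ordering is not guaranteed (a "-" entry). */
static bool order_guaranteed(access_type a1, access_type a2)
{
    static const bool table[4][4] = {
        /* A2:                  Normal  Dev-NS  Dev-S   Strongly-ordered */
        /* Normal           */ { false, false,  false,  false },
        /* Device, non-sh.  */ { false, true,   false,  true  },
        /* Device, shareable*/ { false, false,  true,   true  },
        /* Strongly-ordered */ { false, true,   true,   true  },
    };
    return table[a1][a2];
}
```

Any pair for which the function returns false needs a memory barrier instruction if the program depends on the accesses completing in program order.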
2.2.3 Behavior of memory accesses
The behavior of accesses to each region in the memory map is:
Table 12. Memory access behavior
Address range             Memory region               Memory type           XN      Description
0x00000000 - 0x1FFFFFFF   Code                        Normal(1)             -       Executable region for program code. Can also put data here.
0x20000000 - 0x3FFFFFFF   SRAM                        Normal(1)             -       Executable region for data. Can also put code here.
                                                                                    This region includes bit band and bit band alias areas, see Table 13 on page 31.
0x40000000 - 0x5FFFFFFF   Peripheral                  Device(1)             XN(1)   This region includes bit band and bit band alias areas, see Table 14 on page 31.
0x60000000 - 0x9FFFFFFF   External RAM                Normal(1)             -       Executable region for data.
0xA0000000 - 0xDFFFFFFF   External device             Device(1)             XN(1)   External Device memory
0xE0000000 - 0xE00FFFFF   Private Peripheral Bus      Strongly-ordered(1)   XN(1)   This region includes the NVIC, system timer, and system control block.
0xE0100000 - 0xFFFFFFFF   Memory mapped peripherals   Device(1)             XN(1)   This region includes all the STM32 standard peripherals.
1. See Memory regions, types and attributes on page 28 for more information.
The Code, SRAM, and external RAM regions can hold programs. However, it is
recommended that programs always use the Code region, because the
processor has separate buses that enable instruction fetches and data accesses to occur
simultaneously.
The MPU can override the default memory access behavior described in this section. For
more information, see Memory protection unit (MPU) on page 192.
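The default behavior per address range can be captured in a lookup table, which is sometimes useful in host-side tools such as fault log decoders. The sketch below uses the address ranges of the memory map in Figure 8 and the types and XN attributes of the default memory access behavior; the struct, array and function names are hypothetical, and an MPU configuration can override these defaults.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    uint32_t first;      /* first address of the region */
    uint32_t last;       /* last address of the region */
    const char *region;  /* region name */
    const char *type;    /* default memory type */
    bool xn;             /* true when instruction fetches are not allowed */
} mem_region;

static const mem_region default_map[] = {
    { 0x00000000u, 0x1FFFFFFFu, "Code",                      "Normal",           false },
    { 0x20000000u, 0x3FFFFFFFu, "SRAM",                      "Normal",           false },
    { 0x40000000u, 0x5FFFFFFFu, "Peripheral",                "Device",           true  },
    { 0x60000000u, 0x9FFFFFFFu, "External RAM",              "Normal",           false },
    { 0xA0000000u, 0xDFFFFFFFu, "External device",           "Device",           true  },
    { 0xE0000000u, 0xE00FFFFFu, "Private Peripheral Bus",    "Strongly-ordered", true  },
    { 0xE0100000u, 0xFFFFFFFFu, "Memory mapped peripherals", "Device",           true  },
};

/* Returns the default region entry that contains `addr`. */
static const mem_region *lookup_region(uint32_t addr)
{
    for (size_t i = 0; i < sizeof default_map / sizeof default_map[0]; i++) {
        if (addr >= default_map[i].first && addr <= default_map[i].last) {
            return &default_map[i];
        }
    }
    return NULL;  /* not reached: the table covers the full 4 GB range */
}
```

A typical use is checking a faulting program counter: an instruction fetch from any region with `xn` set causes a memory management fault.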
Instruction prefetch and branch prediction
The Cortex-M4 processor:
•Prefetches instructions ahead of execution
•Speculatively prefetches from branch target addresses.
2.2.4 Software ordering of memory accesses
The order of instructions in the program flow does not always guarantee the order of the
corresponding memory transactions. The reason for this is that:
• The processor can reorder some memory accesses to improve efficiency, providing this
does not affect the behavior of the instruction sequence.
• The processor has multiple bus interfaces.
• Memory or devices in the memory map have different wait states.
• Some memory accesses are buffered or speculative.
Section 2.2.2: Memory system ordering of memory accesses on page 28 describes the
cases where the memory system guarantees the order of memory accesses. Otherwise, if
the order of memory accesses is critical, software must include memory barrier instructions
to force that ordering. The processor provides the following memory barrier instructions:
DMB The Data Memory Barrier (DMB) instruction ensures that outstanding memory
transactions complete before subsequent memory transactions. See DMB on
page 182.
DSB The Data Synchronization Barrier (DSB) instruction ensures that outstanding
memory transactions complete before subsequent instructions execute. See DSB
on page 183.
ISB The Instruction Synchronization Barrier (ISB) ensures that the effect of all
completed memory transactions is recognizable by subsequent instructions. See
ISB on page 184.
Use memory barrier instructions in, for example:
• Vector table. If the program changes an entry in the vector table, and then enables the
corresponding exception, use a DMB instruction between the operations. This ensures
that if the exception is taken immediately after being enabled the processor uses the
new exception vector.
• Self-modifying code. If a program contains self-modifying code, use an ISB
instruction immediately after the code modification in the program. This ensures that
the subsequent instruction execution uses the updated program.
• Memory map switching. If the system contains a memory map switching mechanism,
use a DSB instruction after switching the memory map in the program. This ensures
that the subsequent instruction execution uses the updated memory map.
• Dynamic exception priority change. When an exception priority has to change when
the exception is pending or active, use DSB instructions after the change. This ensures
that the change takes effect on completion of the DSB instruction.
• Using a semaphore in multi-master system. If the system contains more than one
bus master, for example, if another processor is present in the system, each processor
must use a DMB instruction after any semaphore instructions, to ensure other bus
masters see the memory transactions in the order in which they were executed.
Memory accesses to Strongly-ordered memory, such as the system control block, do not
require the use of DMB instructions.
For MPU programming, use a DSB followed by an ISB instruction or exception return to
ensure that the new MPU configuration is used by subsequent instructions.
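The vector table and MPU cases above can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the BARRIER_* macros, demo_vectors, demo_irq_enable, demo_handler, install_and_enable() and mpu_reconfigured() are hypothetical names introduced here, and the host fallback assumes a GCC or Clang compiler.

```c
#include <stdint.h>

/* Barrier macros. On an ARMv7-M target they emit the real instructions;
 * elsewhere they fall back to a compiler fence (GCC/Clang builtin) so the
 * sketch also builds on a host machine. */
#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
#define BARRIER_DMB() __asm__ volatile ("dmb" ::: "memory")
#define BARRIER_DSB() __asm__ volatile ("dsb" ::: "memory")
#define BARRIER_ISB() __asm__ volatile ("isb" ::: "memory")
#else
#define BARRIER_DMB() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define BARRIER_DSB() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define BARRIER_ISB() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

typedef void (*handler_t)(void);

/* Hypothetical stand-ins for a relocated vector table and an interrupt
 * enable register; real code would use the device's own definitions. */
volatile uintptr_t demo_vectors[16 + 8];
volatile uint32_t demo_irq_enable;

void demo_handler(void) { }

/* Change a vector table entry, then enable the corresponding interrupt. */
void install_and_enable(unsigned irq, handler_t handler)
{
    demo_vectors[16u + irq] = (uintptr_t)handler;  /* write the new vector   */
    BARRIER_DMB();                                 /* order it before enable */
    demo_irq_enable |= (1u << irq);                /* then enable the IRQ    */
}

/* To be called after the MPU registers have been rewritten: DSB, then ISB. */
void mpu_reconfigured(void)
{
    BARRIER_DSB();
    BARRIER_ISB();
}
```

The DMB sits between the vector write and the enable, so an exception taken immediately after being enabled uses the new vector.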
2.2.5 Bit-banding
A bit-band region maps each word in a bit-band alias region to a single bit in the bit-band
region. The bit-band regions occupy the lowest 1 Mbyte of the SRAM and peripheral
memory regions.
The memory map has two 32 Mbyte alias regions that map to two 1 Mbyte bit-band regions:
• Accesses to the 32 Mbyte SRAM alias region map to the 1 Mbyte SRAM bit-band
region, as shown in Table 13.
• Accesses to the 32 Mbyte peripheral alias region map to the 1 Mbyte peripheral bit-band
region, as shown in Table 14.
Table 13. SRAM memory bit-banding regions
Address range           Memory region              Instruction and data accesses
0x20000000-0x200FFFFF   SRAM bit-band region       Direct accesses to this memory range behave as SRAM memory
                                                   accesses, but this region is also bit addressable through
                                                   bit-band alias.
0x22000000-0x23FFFFFF   SRAM bit-band alias        Data accesses to this region are remapped to bit-band region.
                                                   A write operation is performed as read-modify-write.
                                                   Instruction accesses are not remapped.
Table 14. Peripheral memory bit-banding regions
Address range           Memory region              Instruction and data accesses
0x40000000-0x400FFFFF   Peripheral bit-band region Direct accesses to this memory range behave as peripheral
                                                   memory accesses, but this region is also bit addressable
                                                   through bit-band alias.
0x42000000-0x43FFFFFF   Peripheral bit-band alias  Data accesses to this region are remapped to bit-band region.
                                                   A write operation is performed as read-modify-write.
                                                   Instruction accesses are not permitted.
Note: A word access to the SRAM or peripheral bit-band alias regions maps to a single bit in the
SRAM or peripheral bit-band region.
Bit-band accesses can use byte, halfword, or word transfers. The bit-band transfer size
matches the transfer size of the instruction making the bit-band access.
The following formula shows how the alias region maps onto the bit-band region:
bit_word_offset = (byte_offset x 32) + (bit_number x 4)
bit_word_addr = bit_band_base + bit_word_offset
where:
• Bit_word_offset is the position of the target bit in the bit-band memory region.
• Bit_word_addr is the address of the word in the alias memory region that maps to the
targeted bit.
• Bit_band_base is the starting address of the alias region.
• Byte_offset is the number of the byte in the bit-band region that contains the targeted
bit.
• Bit_number is the bit position, 0-7, of the targeted bit.
Figure 9 on page 32 shows examples of bit-band mapping between the SRAM bit-band
alias region and the SRAM bit-band region:
• The alias word at 0x23FFFFE0 maps to bit[0] of the bit-band byte at
0x200FFFFF: 0x23FFFFE0 = 0x22000000 + (0xFFFFF*32) + (0*4).
• The alias word at 0x23FFFFFC maps to bit[7] of the bit-band byte at
0x200FFFFF: 0x23FFFFFC = 0x22000000 + (0xFFFFF*32) + (7*4).
• The alias word at 0x22000000 maps to bit[0] of the bit-band byte at
0x20000000: 0x22000000 = 0x22000000 + (0*32) + (0*4).
• The alias word at 0x2200001C maps to bit[7] of the bit-band byte at
0x20000000: 0x2200001C = 0x22000000 + (0*32) + (7*4).
Figure 9. Bit-band mapping
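The alias address formula can be expressed as a small C helper. This is a sketch under stated assumptions: the function name bitband_alias_addr() and the macro names are hypothetical, and the base addresses are the SRAM values from Table 13.

```c
#include <stdint.h>

#define SRAM_BITBAND_BASE   0x20000000u  /* start of the SRAM bit-band region */
#define SRAM_ALIAS_BASE     0x22000000u  /* start of the SRAM alias region    */

/* Return the alias word address that maps to bit `bit_number` (0-7) of the
 * byte at `byte_addr`, for a bit-band region starting at `region_base` whose
 * alias region starts at `bit_band_base`. */
uint32_t bitband_alias_addr(uint32_t bit_band_base, uint32_t region_base,
                            uint32_t byte_addr, uint32_t bit_number)
{
    uint32_t byte_offset = byte_addr - region_base;
    uint32_t bit_word_offset = (byte_offset * 32u) + (bit_number * 4u);
    return bit_band_base + bit_word_offset;
}
```

For example, bitband_alias_addr(SRAM_ALIAS_BASE, SRAM_BITBAND_BASE, 0x200FFFFF, 7) evaluates to 0x23FFFFFC, matching the bit[7] example above. The same helper applies to the peripheral region with the base addresses from Table 14.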
Directly accessing an alias region
Writing to a word in the alias region updates a single bit in the bit-band region.
Bit[0] of the value written to a word in the alias region determines the value written to the
targeted bit in the bit-band region. Writing a value with bit[0] set to 1 writes a 1 to the
bit-band bit, and writing a value with bit[0] set to 0 writes a 0 to the bit-band bit.
Bits[31:1] of the alias word have no effect on the bit-band bit. Writing 0x01 has the same
effect as writing 0xFF. Writing 0x00 has the same effect as writing 0x0E.
Reading a word in the alias region returns:
• 0x00000000, which indicates that the targeted bit in the bit-band region is set to 0
• 0x00000001, which indicates that the targeted bit in the bit-band region is set to 1
Directly accessing a bit-band region
Behavior of memory accesses on page 29 describes the behavior of direct byte, halfword,
or word accesses to the bit-band regions.
2.2.6 Memory endianness
The processor views memory as a linear collection of bytes numbered in ascending order
from zero. For example, bytes 0-3 hold the first stored word, and bytes 4-7 hold the second
stored word.
Little-endian format
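In little-endian format, the processor stores the least significant byte of a word at the
lowest-numbered byte, and the most significant byte at the highest-numbered byte. See
Figure 10 for an example.
Figure 10. Little-endian example
(Figure: a word loaded from memory address A into a register. Byte B0 at address A, the
lsbyte, occupies register bits 7-0; B1 at A+1 occupies bits 15-8; B2 at A+2 occupies
bits 23-16; and B3 at A+3, the msbyte, occupies bits 31-24.)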
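The little-endian byte order can be illustrated with a short C sketch. The helper names store_word_le() and load_word_le() are hypothetical and were chosen for this example only.

```c
#include <stdint.h>

/* Store a 32-bit word into four consecutive bytes in little-endian order:
 * the least significant byte goes to the lowest-numbered byte. */
void store_word_le(uint8_t bytes[4], uint32_t word)
{
    bytes[0] = (uint8_t)(word & 0xFFu);          /* B0, lsbyte, address A   */
    bytes[1] = (uint8_t)((word >> 8) & 0xFFu);   /* B1, address A+1         */
    bytes[2] = (uint8_t)((word >> 16) & 0xFFu);  /* B2, address A+2         */
    bytes[3] = (uint8_t)((word >> 24) & 0xFFu);  /* B3, msbyte, address A+3 */
}

/* Rebuild the word from four little-endian bytes. */
uint32_t load_word_le(const uint8_t bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Storing 0x11223344 this way places 0x44 at the lowest address and 0x11 at the highest.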
2.2.7 Synchronization primitives
The Cortex-M4 instruction set includes pairs of synchronization primitives. These provide a
non-blocking mechanism that a thread or process can use to obtain exclusive access to a
memory location. Software can use them to perform a guaranteed read-modify-write
memory update sequence, or for a semaphore mechanism.
A pair of synchronization primitives comprises:
• Load-Exclusive instruction: used to read the value of a memory location, requesting
exclusive access to that location.
• Store-Exclusive instruction: used to attempt to write to the same memory location,
returning a status bit to a register. If this bit is:
  0: the thread or process gained exclusive access to memory, and the write succeeds.
  1: the thread or process did not gain exclusive access to memory, and no write is
  performed.
The pairs of Load-Exclusive and Store-Exclusive instructions are:
• The word instructions LDREX and STREX
• The halfword instructions LDREXH and STREXH
• The byte instructions LDREXB and STREXB.
Software must use a Load-Exclusive instruction with the corresponding Store-Exclusive
instruction.
To perform a guaranteed read-modify-write of a memory location, software must:
1. Use a Load-Exclusive instruction to read the value of the location.
2. Update the value, as required.
3. Use a Store-Exclusive instruction to attempt to write the new value back to the memory
location.
4. Test the returned status bit. If this bit is:
0: The read-modify-write completed successfully.
1: No write was performed. This indicates that the value returned at step 1 might be out
of date. The software must retry the read-modify-write sequence.
Software can use the synchronization primitives to implement a semaphore as follows:
1. Use a Load-Exclusive instruction to read from the semaphore address to check
whether the semaphore is free.
2. If the semaphore is free, use a Store-Exclusive to write the claim value to the
semaphore address.
3. If the returned status bit from step 2 indicates that the Store-Exclusive succeeded then
the software has claimed the semaphore. However, if the Store-Exclusive failed,
another process might have claimed the semaphore after software performed step 1.
The Cortex-M4 includes an exclusive access monitor that tags the fact that the processor
has executed a Load-Exclusive instruction. If the processor is part of a multiprocessor
system, the system also globally tags the memory locations addressed by exclusive
accesses by each processor.
The processor removes its exclusive access tag if:
• It executes a CLREX instruction.
• It executes a Store-Exclusive instruction, regardless of whether the write succeeds.
• An exception occurs. This means the processor can resolve semaphore conflicts
between different threads.
In a multiprocessor implementation, executing a:
• CLREX instruction removes only the local exclusive access tag for the processor.
• Store-Exclusive instruction, or an exception, removes the local exclusive access tags,
and global exclusive access tags for the processor.
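The read-modify-write and semaphore sequences described in this section can be modeled on a host machine as sketched below. This is an illustration only: model_ldrex(), model_strex() and model_clrex() are hypothetical stand-ins that imitate the local exclusive access tag with a flag. On a Cortex-M4 target, they would be replaced by the corresponding exclusive access instructions, for example through the CMSIS intrinsics __LDREXW(), __STREXW() and __CLREX().

```c
#include <stdint.h>

#define SEM_FREE    0u
#define SEM_CLAIMED 1u

/* Model of the local exclusive access tag (hypothetical, host-side only). */
static int monitor_tagged;

uint32_t model_ldrex(volatile uint32_t *addr)
{
    monitor_tagged = 1;            /* Load-Exclusive tags the access */
    return *addr;
}

/* Returns 0 if the write was performed, 1 if no write was performed. */
uint32_t model_strex(uint32_t value, volatile uint32_t *addr)
{
    uint32_t status = 1u;
    if (monitor_tagged) {
        *addr = value;
        status = 0u;
    }
    monitor_tagged = 0;            /* tag removed whether or not the write succeeds */
    return status;
}

void model_clrex(void)
{
    monitor_tagged = 0;
}

/* Guaranteed read-modify-write: retry until the Store-Exclusive succeeds. */
uint32_t atomic_add(volatile uint32_t *addr, uint32_t delta)
{
    uint32_t new_value;
    do {
        new_value = model_ldrex(addr) + delta;     /* read, then update   */
    } while (model_strex(new_value, addr) != 0u);  /* write back and test */
    return new_value;
}

/* Try to claim a semaphore. Returns 1 if claimed, 0 otherwise. */
int semaphore_try_claim(volatile uint32_t *sem)
{
    if (model_ldrex(sem) != SEM_FREE) {  /* semaphore is not free  */
        model_clrex();                   /* drop the tag, give up  */
        return 0;
    }
    return model_strex(SEM_CLAIMED, sem) == 0u;
}
```

In this single-threaded model the Store-Exclusive only fails when no Load-Exclusive preceded it. On real hardware it can also fail because an exception or another bus master intervened, which is why atomic_add() loops.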
For more information about the synchronization primitive instructions, see LDREX and
STREX on page 78 and CLREX on page 79.
2.2.8 Programming hints for the synchronization primitives
ISO/IEC C cannot directly generate the exclusive access instructions. CMSIS provides
intrinsic functions for generation of these instructions:
Table 15. CMSIS functions for exclusive access instructions
value = __LDREXH (address); // load 16-bit value from memory address 0x20001002
2.3 Exception model
This section describes the exception model.
2.3.1 Exception states
Each exception is in one of the following states:
Inactive The exception is not active and not pending.
Pending The exception is waiting to be serviced by the processor. An interrupt
request from a peripheral or from software can change the state of the
corresponding interrupt to pending.
Active An exception that is being serviced by the processor but has not
completed.
Note: An exception handler can interrupt the execution of another exception
handler. In this case both exceptions are in the active state.
Active and pending The exception is being serviced by the processor and there is a
pending exception from the same source.
2.3.2 Exception types
The exception types are:
Reset Reset is invoked on power up or a warm reset. The exception model
treats reset as a special form of exception. When reset is asserted, the
operation of the processor stops, potentially at any point in an
instruction. When reset is deasserted, execution restarts from the
address provided by the reset entry in the vector table. Execution
restarts as privileged execution in Thread mode.
NMI A Non-Maskable Interrupt (NMI) can be signalled by a peripheral or
triggered by software. This is the highest priority exception other than
reset. It is permanently enabled and has a fixed priority of -2. NMIs
cannot be:
• Masked or prevented from activation by any other exception
• Preempted by any exception other than Reset.
Hard fault A hard fault is an exception that occurs because of an error during
exception processing, or because an exception cannot be managed by
any other exception mechanism. Hard faults have a fixed priority of -1,
meaning they have higher priority than any exception with configurable
priority.
Memory management fault A memory management fault is an exception that occurs because of a
memory protection related fault. The MPU or the fixed memory
protection constraints determine this fault, for both instruction and
data memory transactions. This fault is used to abort instruction
accesses to Execute Never (XN) memory regions.
Bus fault A bus fault is an exception that occurs because of a memory related
fault for an instruction or data memory transaction. This might be from
an error detected on a bus in the memory system.
Usage fault A usage fault is an exception that occurs in case of an instruction
execution fault. This includes:
• An undefined instruction
• An illegal unaligned access
• Invalid state on instruction execution
• An error on exception return.
The following can cause a usage fault when the core is configured to
report it:
• An unaligned address on word and halfword memory access
• Division by zero
SVCall A supervisor call (SVC) is an exception that is triggered by the SVC
instruction. In an OS environment, applications can use SVC
instructions to access OS kernel functions and device drivers.
PendSV PendSV is an interrupt-driven request for system-level service. In an
OS environment, use PendSV for context switching when no other
exception is active.
SysTick A SysTick exception is an exception the system timer generates when
it reaches zero. Software can also generate a SysTick exception. In an
OS environment, the processor can use this exception as system tick.
Interrupt (IRQ) An interrupt, or IRQ, is an exception signalled by a peripheral, or
generated by a software request. All interrupts are asynchronous to
instruction execution. In the system, peripherals use interrupts to
communicate with the processor.
Table 16. Properties of the different exception types
Exception     IRQ           Exception type           Priority          Vector address          Activation
number(1)     number(1)                                                or offset(2)
1             -             Reset                    -3, the highest   0x00000004              Asynchronous
2             -14           NMI                      -2                0x00000008              Asynchronous
3             -13           Hard fault               -1                0x0000000C              -
4             -12           Memory management fault  Configurable(3)   0x00000010              Synchronous
5             -11           Bus fault                Configurable(3)   0x00000014              Synchronous when precise, asynchronous when imprecise
6             -10           Usage fault              Configurable(3)   0x00000018              Synchronous
7-10          -             -                        -                 Reserved                -
11            -5            SVCall                   Configurable(3)   0x0000002C              Synchronous
12-13         -             -                        -                 Reserved                -
14            -2            PendSV                   Configurable(3)   0x00000038              Asynchronous
15            -1            SysTick                  Configurable(3)   0x0000003C              Asynchronous
16 and above  0 and above   Interrupt (IRQ)          Configurable(4)   0x00000040 and above(5) Asynchronous
1. To simplify the software layer, the CMSIS only uses IRQ numbers and therefore uses negative values for exceptions other
than interrupts. The IPSR returns the Exception number. For further information see Interrupt program status register on
page 21.
2. See Vector table on page 39 for more information.
3. See System handler priority registers (SHPRx) on page 232.
4. See Interrupt priority registers (NVIC_IPRx) on page 214.
5. Increasing in steps of 4.
For an asynchronous exception other than reset, the processor can execute another
instruction between when the exception is triggered and when the processor enters the
exception handler.
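The numbering in Table 16 follows a regular pattern: the IRQ number is the exception number minus 16, and the vector offset is the exception number times 4. A minimal C sketch of these two relations, with hypothetical function names, is:

```c
#include <stdint.h>

/* IRQ number used by CMSIS for a given exception number.
 * Table 16 lists no IRQ number for Reset (1) or for the reserved
 * exception numbers, so the result is only meaningful for the others. */
int32_t irq_number_from_exception(uint32_t exception_number)
{
    return (int32_t)exception_number - 16;
}

/* Offset of the exception vector from the start of the vector table. */
uint32_t vector_offset_from_exception(uint32_t exception_number)
{
    return exception_number * 4u;
}
```

For SysTick (exception number 15), these give IRQ number -1 and vector offset 0x3C, as listed in the table.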
Privileged software can disable the exceptions that Table 16 on page 37 shows as having
configurable priority. For further information, see:
• System handler control and state register (SHCSR) on page 234
• Interrupt clear-enable registers (NVIC_ICERx) on page 210
For more information about hard faults, memory management faults, bus faults, and usage
faults, see Section 2.4: Fault handling on page 43.
2.3.3 Exception handlers
The processor handles exceptions using:
Interrupt Service Routines (ISRs) Interrupts IRQ0 to IRQ81 are the exceptions handled by ISRs.
Fault handlers Hard fault, memory management fault, usage fault, bus fault are fault
exceptions handled by the fault handlers.
System handlers NMI, PendSV, SVCall, SysTick, and the fault exceptions are all
system exceptions that are handled by system handlers.
2.3.4 Vector table
The vector table contains the reset value of the stack pointer, and the start addresses, also
called exception vectors, for all exception handlers. Figure 11 on page 39 shows the order
of the exception vectors in the vector table. The least-significant bit of each vector must be
1, indicating that the exception handler is Thumb code.
Figure 11. Vector table
(Figure: the vector table by offset. The initial SP value is at offset 0x0000, followed by the
vectors for exception numbers 1 to 15, from Reset at 0x0004 to Systick at 0x003C, and then
IRQ0 at 0x0040 (exception number 16) up to IRQ239 at 0x03FC (exception number 255).)
On system reset, the vector table is fixed at address 0x00000000. Privileged software can
write to the VTOR to relocate the vector table start address to a different memory location, in
the range 0x00000080 to 0x3FFFFF80. For further information see Vector table offset
register (VTOR) on page 226.
2.3.5 Exception priorities
Table 16 on page 37 shows that all exceptions have an associated priority, with:
• A lower priority value indicating a higher priority
• Configurable priorities for all exceptions except Reset, Hard fault, and NMI.
If software does not configure any priorities, then all exceptions with a configurable priority
have a priority of 0. For information about configuring exception priorities see:
• System handler priority registers (SHPRx) on page 232
• Interrupt priority registers (NVIC_IPRx) on page 214
Configurable priority values are in the range 0-15. This means that the Reset, Hard fault,
and NMI exceptions, with fixed negative priority values, always have higher priority than any
other exception.
For example, assigning a higher priority value to IRQ[0] and a lower priority value to IRQ[1]
means that IRQ[1] has higher priority than IRQ[0]. If both IRQ[1] and IRQ[0] are asserted,
IRQ[1] is processed before IRQ[0].
If multiple pending exceptions have the same priority, the pending exception with the lowest
exception number takes precedence. For example, if both IRQ[0] and IRQ[1] are pending
and have the same priority, then IRQ[0] is processed before IRQ[1].
When the processor is executing an exception handler, the exception handler is preempted
if a higher priority exception occurs. If an exception occurs with the same priority as the
exception being handled, the handler is not preempted, irrespective of the exception
number. However, the status of the new interrupt changes to pending.
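The ordering rules in this section can be summarized in a small host-side C model. This is a sketch, not a description of the hardware implementation: the exception_t type and the functions next_pending() and preempts() are hypothetical names used for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t exception_number;
    int32_t  priority;   /* lower value means higher priority */
    int      pending;    /* nonzero if the exception is pending */
} exception_t;

/* Return the index of the pending exception that is processed first,
 * or -1 if none is pending. Ties on priority go to the lowest
 * exception number. */
int next_pending(const exception_t *table, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!table[i].pending) {
            continue;
        }
        if (best < 0 ||
            table[i].priority < table[best].priority ||
            (table[i].priority == table[best].priority &&
             table[i].exception_number < table[best].exception_number)) {
            best = (int)i;
        }
    }
    return best;
}

/* A new exception preempts the active handler only if its priority is
 * strictly higher, that is, its priority value is strictly lower. */
int preempts(int32_t new_priority, int32_t active_priority)
{
    return new_priority < active_priority;
}
```

With IRQ[0] and IRQ[1] both pending at the same priority, next_pending() selects IRQ[0], and preempts() returns 0 for equal priorities, so the new exception stays pending.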
2.3.6 Interrupt priority grouping
To increase priority control in systems with interrupts, the NVIC supports priority grouping.
This divides each interrupt priority register entry into two fields:
• An upper field that defines the group priority
• A lower field that defines a subpriority within the group.
Only the group priority determines preemption of interrupt exceptions. When the processor
is executing an interrupt exception handler, another interrupt with the same group priority as
the interrupt being handled does not preempt the handler.
If multiple pending interrupts have the same group priority, the subpriority field determines
the order in which they are processed. If multiple pending interrupts have the same group
priority and subpriority, the interrupt with the lowest IRQ number is processed first.
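The split into group priority and subpriority can be sketched in C as follows. The sketch assumes a 4-bit priority value (range 0-15) and takes the number of group bits as a plain parameter, group_bits, which is a hypothetical stand-in for the setting configured through the AIRCR. All names are introduced for illustration.

```c
#include <stdint.h>

#define PRIORITY_BITS 4u   /* assumption: priority values in the range 0-15 */

typedef struct {
    uint32_t group;   /* upper field: group priority */
    uint32_t sub;     /* lower field: subpriority    */
} split_priority_t;

/* Split a priority value into group priority and subpriority, using the
 * upper `group_bits` bits (0 to 4) as the group priority. */
split_priority_t split_priority(uint32_t priority, uint32_t group_bits)
{
    split_priority_t s;
    uint32_t sub_bits;

    if (group_bits > PRIORITY_BITS) {
        group_bits = PRIORITY_BITS;
    }
    sub_bits = PRIORITY_BITS - group_bits;
    priority &= (1u << PRIORITY_BITS) - 1u;

    s.group = priority >> sub_bits;
    s.sub = priority & ((1u << sub_bits) - 1u);
    return s;
}

/* Only the group priority determines preemption. */
int group_preempts(uint32_t new_priority, uint32_t active_priority,
                   uint32_t group_bits)
{
    return split_priority(new_priority, group_bits).group <
           split_priority(active_priority, group_bits).group;
}
```

With two group bits, priority values 0x8 and 0xB share group priority 2, so neither preempts the other. The subpriority only decides which one is processed first when both are pending.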
For information about splitting the interrupt priority fields into group priority and subpriority,
see Application interrupt and reset control register (AIRCR) on page 227.
2.3.7 Exception entry and return
Descriptions of exception handling use the following terms:
Preemption When the processor is executing an exception handler, an exception can
preempt the exception handler if its priority is higher than the priority of the
exception being handled. See Section 2.3.6: Interrupt priority grouping for
more information about preemption by an interrupt.
When one exception preempts another, the exceptions are called nested
exceptions. See Exception entry on page 41 for more information.
Return This occurs when the exception handler is completed, and:
• There is no pending exception with sufficient priority to be serviced
• The completed exception handler was not handling a late-arriving
exception.
The processor pops the stack and restores the processor state to the state it
had before the interrupt occurred. See Exception return on page 43 for more
information.
Tail-chaining This mechanism speeds up exception servicing. On completion of an
exception handler, if there is a pending exception that meets the
requirements for exception entry, the stack pop is skipped and control
transfers to the new exception handler.
Late-arriving This mechanism speeds up preemption. If a higher priority exception occurs
during state saving for a previous exception, the processor switches to
handle the higher priority exception and initiates the vector fetch for that
exception. State saving is not affected by late arrival because the state saved
is the same for both exceptions. Therefore the state saving continues
uninterrupted. The processor can accept a late arriving exception until the
first instruction of the exception handler of the original exception enters the
execute stage of the processor. On return from the exception handler of the
late-arriving exception, the normal tail-chaining rules apply.
Exception entry
Exception entry occurs when there is a pending exception with sufficient priority and either:
• The processor is in Thread mode
• The new exception is of higher priority than the exception being handled, in which case
the new exception preempts the original exception.
When one exception preempts another, the exceptions are nested.
Sufficient priority means the exception has more priority than any limits set by the mask
registers. For more information see Exception mask registers on page 22. An exception with
less priority than this is pending but is not handled by the processor.
When the processor takes an exception, unless the exception is a tail-chained or a
late-arriving exception, the processor pushes information onto the current stack. This operation
is referred to as stacking and the structure of eight data words is referred to as the stack frame.
When using floating-point routines, the Cortex-M4 processor automatically stacks the
architected floating-point state on exception entry. Figure 12 on page 42 shows the
Cortex-M4 stack frame layout when floating-point state is preserved on the stack as the result of an
interrupt or an exception. Where stack space for floating-point state is not allocated, the
stack frame is the same as that of ARMv7-M implementations without an FPU. Figure 12 on
page 42 also shows this stack frame.
Figure 12. Cortex-M4 stack frame layout
(Figure: two stack frames, each listed from the pre-IRQ top of stack toward decreasing memory
addresses and ending at the IRQ top of stack. Exception frame without floating-point storage:
{aligner}, xPSR, PC, LR, R12, R3, R2, R1, R0. Exception frame with floating-point storage:
{aligner}, FPSCR, S15 to S0, xPSR, PC, LR, R12, R3, R2, R1, R0.)
Immediately after stacking, the stack pointer indicates the lowest address in the stack frame.
The alignment of the stack frame is controlled via the STKALIGN bit of the Configuration
Control Register (CCR).
The stack frame includes the return address. This is the address of the next instruction in
the interrupted program. This value is restored to the PC at exception return so that the
interrupted program resumes.
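The eight-word stack frame without floating-point state can be described in C by word indices counted from the stack pointer value after stacking, which is the lowest address in the frame. This is a sketch with hypothetical names; it assumes the word order shown in the stack frame layout figure, with R0 at the lowest address.

```c
#include <stdint.h>

/* Word index of each stacked register, counted from the stack pointer
 * value immediately after stacking (lowest address first). */
enum {
    FRAME_R0 = 0,
    FRAME_R1,
    FRAME_R2,
    FRAME_R3,
    FRAME_R12,
    FRAME_LR,
    FRAME_PC,     /* return address */
    FRAME_XPSR,
    FRAME_WORDS   /* eight data words in total */
};

/* Read the return address from a stack frame. */
uint32_t stacked_return_address(const uint32_t *frame)
{
    return frame[FRAME_PC];
}
```

A fault handler that receives the frame address can use stacked_return_address() to report the address at which the interrupted program resumes.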
In parallel to the stacking operation, the processor performs a vector fetch that reads the
exception handler start address from the vector table. When stacking is complete, the
processor starts executing the exception handler. At the same time, the processor writes an
EXC_RETURN value to the LR. This indicates which stack pointer corresponds to the stack
frame and what operation mode the processor was in before the entry occurred.
If no higher priority exception occurs during exception entry, the processor starts executing
the exception handler and automatically changes the status of the corresponding pending
interrupt to active.
If another higher priority exception occurs during exception entry, the processor starts
executing the exception handler for this exception and does not change the pending status
of the earlier exception. This is the late arrival case.
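A handler can inspect the EXC_RETURN value it receives in the LR to find out which stack holds the frame and which mode is resumed. The following C sketch decodes the low bits. The bit assignments are an assumption inferred from the six values listed in Table 17 (Exception return behavior), and the type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int to_thread_mode;  /* 1: return to Thread mode, 0: return to Handler mode */
    int uses_psp;        /* 1: frame on the PSP, 0: frame on the MSP            */
    int has_fp_state;    /* 1: frame includes floating-point state              */
} exc_return_info_t;

/* Decode an EXC_RETURN value (assumed bit layout, see the text above). */
exc_return_info_t decode_exc_return(uint32_t exc_return)
{
    exc_return_info_t info;
    info.to_thread_mode = (exc_return & (1u << 3)) != 0u;
    info.uses_psp       = (exc_return & (1u << 2)) != 0u;
    info.has_fp_state   = (exc_return & (1u << 4)) == 0u;
    return info;
}
```

For 0xFFFFFFFD, this yields a return to Thread mode using the PSP with no floating-point state, which agrees with the table entry for that value.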
Exception return
Exception return occurs when the processor is in Handler mode and executes one of the
following instructions to load the EXC_RETURN value into the PC:
• an LDM or POP instruction that loads the PC
• an LDR instruction with PC as the destination
• a BX instruction using any register.
EXC_RETURN is the value loaded into the LR on exception entry. The exception
mechanism relies on this value to detect when the processor has completed an exception
handler. The lowest five bits of this value provide information on the return stack and
processor mode. Table 17 shows the EXC_RETURN values with a description of the
exception return behavior.
All EXC_RETURN values have bits[31:5] set to one. When this value is loaded into the PC it
indicates to the processor that the exception is complete, and the processor initiates the
appropriate exception return sequence.
Table 17. Exception return behavior
EXC_RETURN[31:0]    Description
0xFFFFFFF1          Return to Handler mode, exception return uses non-floating-point state from
                    the MSP and execution uses MSP after return.
0xFFFFFFF9          Return to Thread mode, exception return uses non-floating-point state from
                    MSP and execution uses MSP after return.
0xFFFFFFFD          Return to Thread mode, exception return uses non-floating-point state from
                    the PSP and execution uses PSP after return.
0xFFFFFFE1          Return to Handler mode, exception return uses floating-point state from MSP
                    and execution uses MSP after return.
0xFFFFFFE9          Return to Thread mode, exception return uses floating-point state from MSP
                    and execution uses MSP after return.
0xFFFFFFED          Return to Thread mode, exception return uses floating-point state from PSP
                    and execution uses PSP after return.
2.4 Fault handling
Faults are a subset of the exceptions. For more information, see Exception model on
page 36. The following elements generate a fault:
• A bus error on:
  – An instruction fetch or vector table load
  – A data access
• An internally-detected error such as an undefined instruction
• Attempting to execute an instruction from a memory region marked as Non-Executable
  (XN).
• A privilege violation or an attempt to access an unmanaged region causing an MPU
  fault.
2.4.1 Fault types
Table 18 shows the types of fault, the handler used for the fault, the corresponding fault
status register, and the register bit that indicates that the fault has occurred. See
Configurable fault status register (CFSR; UFSR+BFSR+MMFSR) on page 236 for more
information about the fault status registers.
Table 18. Faults
Fault                                                 Handler      Bit name       Fault status register
Bus error on a vector read                            Hard fault   VECTTBL        Hard fault status register (HFSR) on page 240
Fault escalated to a hard fault                       Hard fault   FORCED
MPU or default memory map mismatch:                   MemManage    -              Memory management fault address register (MMFAR) on page 241
– on instruction access                               MemManage    IACCVIOL(1)
– on data access                                      MemManage    DACCVIOL
– during exception stacking                           MemManage    MSTKERR
– during exception unstacking                         MemManage    MUNSTKERR
– during lazy floating-point state preservation       MemManage    MLSPERR
Bus error:                                            Bus fault    -              Bus fault address register (BFAR) on page 241
– During exception stacking                           Bus fault    STKERR
– During exception unstacking                         Bus fault    UNSTKERR
– During instruction prefetch                         Bus fault    IBUSERR
– During lazy floating-point state preservation       Bus fault    LSPERR
Precise data bus error                                Bus fault    PRECISERR
Imprecise data bus error                              Bus fault    IMPRECISERR
Attempt to access a coprocessor                       Usage fault  NOCP           Configurable fault status register (CFSR; UFSR+BFSR+MMFSR) on page 236
Undefined instruction                                 Usage fault  UNDEFINSTR
Attempt to enter an invalid instruction set state(2)  Usage fault  INVSTATE
Invalid EXC_RETURN value                              Usage fault  INVPC
Illegal unaligned load or store                       Usage fault  UNALIGNED
Divide By 0                                           Usage fault  DIVBYZERO
1. Occurs on an access to an XN region even if the MPU is disabled.
2. Attempting to use an instruction set other than the Thumb instruction set, or returning to a non
load/store-multiple instruction with ICI continuation.
2.4.2 Fault escalation and hard faults
All fault exceptions except for hard fault have configurable exception priority, as described
in System handler priority registers (SHPRx) on page 232. Software can disable execution
of the handlers for these faults, as described in System handler control and state register
(SHCSR) on page 234.
Usually, the exception priority, together with the values of the exception mask registers,
determines whether the processor enters the fault handler, and whether a fault handler can
preempt another fault handler, as described in Section 2.3: Exception model on page 36.
In some situations, a fault with configurable priority is treated as a hard fault. This is called
priority escalation, and the fault is described as escalated to hard fault. Escalation to hard
fault occurs when:
• A fault handler causes the same kind of fault as the one it is servicing. This escalation
to hard fault occurs because a fault handler cannot preempt itself: it must have
the same priority as the current priority level.
• A fault handler causes a fault with the same or lower priority as the fault it is servicing.
This is because the handler for the new fault cannot preempt the currently executing
fault handler.
• An exception handler causes a fault for which the priority is the same as or lower than
the currently executing exception.
• A fault occurs and the handler for that fault is not enabled.
If a bus fault occurs during a stack push when entering a bus fault handler, the bus fault
does not escalate to a hard fault. This means that if a corrupted stack causes a fault, the
fault handler executes even though the stack push for the handler failed. The fault handler
operates but the stack contents are corrupted.
Only Reset and NMI can preempt the fixed priority hard fault. A hard fault can preempt any
exception other than Reset, NMI, or another hard fault.
2.4.3 Fault status registers and fault address registers
The fault status registers indicate the cause of a fault. For bus faults and memory
management faults, the fault address register indicates the address accessed by the
operation that caused the fault, as shown in Table 19.
Table 19. Fault status and fault address registers
Handler | Status register name | Address register name | Register description
Hard fault | HFSR | - | Hard fault status register (HFSR) on page 240
Memory management fault | MMFSR | MMFAR | Configurable fault status register (CFSR; UFSR+BFSR+MMFSR) on page 236, Memory management fault address register (MMFAR) on page 241
Bus fault | BFSR | BFAR | Configurable fault status register (CFSR; UFSR+BFSR+MMFSR) on page 236, Bus fault address register (BFAR) on page 241
Usage fault | UFSR | - | Configurable fault status register (CFSR; UFSR+BFSR+MMFSR) on page 236
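Because MMFSR, BFSR and UFSR share the single 32-bit CFSR word, a fault handler can read that word once and split it. The following is a minimal sketch, assuming the Cortex-M4 layout in which MMFSR occupies CFSR bits [7:0], BFSR bits [15:8] and UFSR bits [31:16], and in which bit 7 of MMFSR (MMARVALID) and bit 7 of BFSR (BFARVALID) report whether the matching fault address register holds a valid address. The type and function names are hypothetical; on a target the input would typically be the value read from the CFSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three status sub-registers packed in CFSR. */
typedef struct {
    uint8_t  mmfsr;  /* memory management fault status, CFSR[7:0]   */
    uint8_t  bfsr;   /* bus fault status,               CFSR[15:8]  */
    uint16_t ufsr;   /* usage fault status,             CFSR[31:16] */
} cfsr_fields_t;

/* Split a raw CFSR value into its three sub-registers. */
static inline cfsr_fields_t split_cfsr(uint32_t cfsr)
{
    cfsr_fields_t f;
    f.mmfsr = (uint8_t)(cfsr & 0xFFu);
    f.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    f.ufsr  = (uint16_t)(cfsr >> 16);
    return f;
}

/* MMARVALID (assumed bit 7 of MMFSR): MMFAR holds a valid fault address. */
static inline bool mmfar_is_valid(cfsr_fields_t f)
{
    return (f.mmfsr & 0x80u) != 0u;
}

/* BFARVALID (assumed bit 7 of BFSR): BFAR holds a valid fault address. */
static inline bool bfar_is_valid(cfsr_fields_t f)
{
    return (f.bfsr & 0x80u) != 0u;
}
```

A handler would check the validity helper before trusting MMFAR or BFAR, because the address registers are only meaningful for the fault that set the corresponding valid bit.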
2.4.4 Lockup
The processor enters a lockup state if a hard fault occurs when executing the NMI or hard
fault handlers. When the processor is in lockup state it does not execute any instructions.
The processor remains in lockup state until either:
•It is reset
•An NMI occurs
•It is halted by a debugger
If lockup state occurs from the NMI handler a subsequent NMI does not cause the
processor to leave lockup state.
2.5 Power management
The STM32 and Cortex-M4 processor sleep modes reduce power consumption:
•Sleep mode stops the processor clock. All other system and peripheral clocks may still
be running.
•Deep sleep mode stops most of the STM32 system and peripheral clocks. At product
level, this corresponds to either the Stop or the Standby mode. For more details, please
refer to the “Power modes” Section in the STM32 reference manual.
The SLEEPDEEP bit of the SCR selects which sleep mode is used, as described in System
control register (SCR) on page 229. For more information about the behavior of the sleep
modes see the STM32 product reference manual.
This section describes the mechanisms for entering sleep mode, and the conditions for
waking up from sleep mode.
2.5.1 Entering sleep mode
This section describes the mechanisms software can use to put the processor into sleep
mode.
The system can generate spurious wakeup events, for example a debug operation that
wakes up the processor. Therefore software must be able to put the processor back into
sleep mode after such an event. A program might have an idle loop to put the processor
back to sleep mode.
Wait for interrupt
The wait for interrupt instruction, WFI, causes immediate entry to sleep mode (unless the
wake-up condition is true, as shown in Wakeup from WFI or sleep-on-exit on page 47).
When the processor executes a WFI instruction, it stops executing instructions and enters
sleep mode. See WFI on page 191 for more information.
Wait for event
The wait for event instruction, WFE, causes entry to sleep mode depending on the value of
a one-bit event register. When the processor executes a WFE instruction, it checks the
value of the event register:
• 0: the processor stops executing instructions and enters sleep mode
• 1: the processor clears the register to 0 and continues executing instructions without
entering sleep mode.
See WFE on page 190 for more information.
If the event register is 1, this indicates that the processor must not enter sleep mode on
execution of a WFE instruction. Typically, this is because an external event signal is
asserted, or a processor in the system has executed an SEV instruction, as shown in SEV
on page 188. Software cannot access this register directly.
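The event register behaviour described above can be illustrated with a small host-side model. This is only a sketch: the structure and function names are hypothetical, and the model stands in for hardware state that software cannot read directly.

```c
#include <stdbool.h>

/* Hypothetical software model of the one-bit WFE event register. */
typedef struct {
    bool event_register;
} wfe_model_t;

/* Models an SEV instruction or an external event: latch the event. */
static inline void model_set_event(wfe_model_t *m)
{
    m->event_register = true;
}

/* Models WFE: returns true if the processor would enter sleep mode. */
static inline bool model_wfe(wfe_model_t *m)
{
    if (m->event_register) {
        m->event_register = false;  /* clear the register and keep running */
        return false;
    }
    return true;                    /* register is 0: enter sleep mode */
}
```

In this model, an event that arrives before WFE is not lost: the first WFE after it returns immediately, and only the next one sleeps.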
Sleep-on-exit
If the SLEEPONEXIT bit of the SCR is set to 1, when the processor completes the execution
of an exception handler, it returns to Thread mode and immediately enters sleep mode. Use
this mechanism in applications that only require the processor to run when an exception
occurs.
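As a sketch of how the sleep configuration bits might be prepared, the helper below computes a new SCR value. It assumes the Cortex-M4 SCR layout with SLEEPONEXIT at bit 1 and SLEEPDEEP at bit 2; the macro and function names are hypothetical, and writing the result to the SCR on a target is left to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed SCR bit positions (hypothetical macro names). */
#define SCR_SLEEPONEXIT_BIT (1u << 1)
#define SCR_SLEEPDEEP_BIT   (1u << 2)

/* Return scr with SLEEPONEXIT and SLEEPDEEP set as requested,
 * leaving every other bit unchanged. */
static inline uint32_t scr_configure(uint32_t scr, bool sleep_on_exit, bool deep_sleep)
{
    scr &= ~(SCR_SLEEPONEXIT_BIT | SCR_SLEEPDEEP_BIT);
    if (sleep_on_exit) {
        scr |= SCR_SLEEPONEXIT_BIT;
    }
    if (deep_sleep) {
        scr |= SCR_SLEEPDEEP_BIT;
    }
    return scr;
}
```

An interrupt-driven application would typically apply such a value once during initialization, so that each handler return puts the processor back to sleep.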
2.5.2 Wakeup from sleep mode
The conditions for the processor to wake up depend on the mechanism that caused it to
enter sleep mode.
Wakeup from WFI or sleep-on-exit
Normally, the processor wakes up only when it detects an exception with sufficient priority to
cause exception entry.
Some embedded systems might have to execute system restore tasks after the processor
wakes up, and before it executes an interrupt handler. To achieve this, set the PRIMASK bit
to 1 and the FAULTMASK bit to 0. If an interrupt arrives that is enabled and has a higher
priority than the current exception priority, the processor wakes up but does not execute the
interrupt handler until the processor sets PRIMASK to zero. For more information about
PRIMASK and FAULTMASK see Exception mask registers on page 22.
Wakeup from WFE
The processor wakes up if:
•it detects an exception with sufficient priority to cause exception entry
•it detects an external event signal, see Section 2.5.3: External event input / extended
interrupt and event input
•in a multiprocessor system, another processor in the system executes an SEV
instruction.
In addition, if the SEVONPEND bit in the SCR is set to 1, any new pending interrupt triggers
an event and wakes up the processor, even if the interrupt is disabled or has insufficient
priority to cause exception entry. For more information about the SCR see System control
register (SCR) on page 229.
2.5.3 External event input / extended interrupt and event input
The processor provides an external event input signal.
This signal is generated by the External or Extended Interrupt/event Controller (EXTI) on
asynchronous event detection (from external input pins or asynchronous peripheral event).
This signal can wake up the processor from WFE, or set the internal WFE event register to
one to indicate that the processor must not enter sleep mode on a later WFE instruction, as
described in Wait for event on page 47. For more details please refer to the STM32
reference manual, Low power modes section.
2.5.4 Power management programming hints
ISO/IEC C cannot directly generate the WFI and WFE instructions. The CMSIS provides the
following functions for these instructions:
void __WFE(void) // Wait for Event
void __WFI(void) // Wait for Interrupt
3 The STM32 Cortex-M4 instruction set
This chapter is the reference material for the Cortex-M4 instruction set description in a User
Guide. The following sections give general information:
Section 3.1: Instruction set summary on page 49
Section 3.2: CMSIS intrinsic functions on page 57
Section 3.3: About the instruction descriptions on page 59
Each of the following sections describes a functional group of Cortex-M4 instructions.
Together they describe all the instructions supported by the Cortex-M4 processor:
Section 3.4: Memory access instructions on page 68
Section 3.5: General data processing instructions on page 80
Section 3.6: Multiply and divide instructions on page 108
Section 3.7: Saturating instructions on page 124
Section 3.8: Packing and unpacking instructions on page 133
Section 3.9: Bitfield instructions on page 137
Section 3.10: Floating-point instructions on page 148
Section 3.11: Miscellaneous instructions on page 179
3.1 Instruction set summary
The processor implements a version of the Thumb instruction set. Table 20 lists the
supported instructions.
In Table 20:
•Angle brackets, <>, enclose alternative forms of the operand.
•Braces, {}, enclose optional operands.
•The operands column is not exhaustive.
•Op2 is a flexible second operand that can be either a register or a constant.
•Most instructions can use an optional condition code suffix.
For more information on the instructions and operands, see the instruction descriptions.
Table 20. Cortex-M4 instructions
Mnemonic | Operands | Brief description | Flags | Page
ADC, ADCS | {Rd,} Rn, Op2 | Add with carry | N,Z,C,V | 3.5.1 on page 82
ADD, ADDS | {Rd,} Rn, Op2 | Add | N,Z,C,V | 3.5.1 on page 82
ADD, ADDW | {Rd,} Rn, #imm12 | Add | N,Z,C,V | 3.5.1 on page 82
ADR | Rd, label | Load PC-relative address | - | 3.4.1 on page 69
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
AND, ANDS | {Rd,} Rn, Op2 | Logical AND | N,Z,C | 3.5.2 on page 84
ASR, ASRS | Rd, Rm, <Rs|#n> | Arithmetic shift right | N,Z,C | 3.5.3 on page 85
B | label | Branch | - | 3.9.5 on page 141
BFC | Rd, #lsb, #width | Bit field clear | - | 3.9.1 on page 138
BFI | Rd, Rn, #lsb, #width | Bit field insert | - | 3.9.1 on page 138
BIC, BICS | {Rd,} Rn, Op2 | Bit clear | N,Z,C | 3.5.2 on page 84
BKPT | #imm | Breakpoint | - | 3.11.1 on page 180
BL | label | Branch with link | - | 3.9.5 on page 141
BLX | Rm | Branch indirect with link | - | 3.9.5 on page 141
BX | Rm | Branch indirect | - | 3.9.5 on page 141
CBNZ | Rn, label | Compare and branch if non zero | - | 3.9.6 on page 143
CBZ | Rn, label | Compare and branch if zero | - | 3.9.6 on page 143
CLREX | - | Clear exclusive | - | 3.4.9 on page 79
CLZ | Rd, Rm | Count leading zeros | - | 3.5.4 on page 86
CMN | Rn, Op2 | Compare negative | N,Z,C,V | 3.5.5 on page 87
CMP | Rn, Op2 | Compare | N,Z,C,V | 3.5.5 on page 87
CPSID | iflags | Change processor state, disable interrupts | - | 3.11.2 on page 181
CPSIE | iflags | Change processor state, enable interrupts | - | 3.11.2 on page 181
DMB | - | Data memory barrier | - | 3.11.4 on page 183
DSB | - | Data synchronization barrier | - | 3.11.4 on page 183
EOR, EORS | {Rd,} Rn, Op2 | Exclusive OR | N,Z,C | 3.5.2 on page 84
ISB | - | Instruction synchronization barrier | - | 3.11.5 on page 184
IT | - | If-then condition block | - | 3.9.7 on page 144
LDM | Rn{!}, reglist | Load multiple registers, increment after | - | 3.4.6 on page 75
LDMDB, LDMEA | Rn{!}, reglist | Load multiple registers, decrement before | - | 3.4.6 on page 75
LDMFD, LDMIA | Rn{!}, reglist | Load multiple registers, increment after | - | 3.4.6 on page 75
LDR | Rt, [Rn, #offset] | Load register with word | - | 3.4 on page 68
LDRB, LDRBT | Rt, [Rn, #offset] | Load register with byte | - | 3.4 on page 68
LDRD | Rt, Rt2, [Rn, #offset] | Load register with two words | - | 3.4.2 on page 70
LDREX | Rt, [Rn, #offset] | Load register exclusive | - | 3.4.8 on page 78
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
LDREXB | Rt, [Rn] | Load register exclusive with byte | - | 3.4.8 on page 78
LDREXH | Rt, [Rn] | Load register exclusive with halfword | - | 3.4.8 on page 78
LDRH, LDRHT | Rt, [Rn, #offset] | Load register with halfword | - | 3.4 on page 68
LDRSB, LDRSBT | Rt, [Rn, #offset] | Load register with signed byte | - | 3.4 on page 68
LDRSH, LDRSHT | Rt, [Rn, #offset] | Load register with signed halfword | - | 3.4 on page 68
LDRT | Rt, [Rn, #offset] | Load register with word | - | 3.4 on page 68
LSL, LSLS | Rd, Rm, <Rs|#n> | Logical shift left | N,Z,C | 3.5.3 on page 85
LSR, LSRS | Rd, Rm, <Rs|#n> | Logical shift right | N,Z,C | 3.5.3 on page 85
MLA | Rd, Rn, Rm, Ra | Multiply with accumulate, 32-bit result | - | 3.6.1 on page 109
MLS | Rd, Rn, Rm, Ra | Multiply and subtract, 32-bit result | - | 3.6.1 on page 109
MOV, MOVS | Rd, Op2 | Move | N,Z,C | 3.5.6 on page 88
MOVT | Rd, #imm16 | Move top | - | 3.5.7 on page 90
MOVW, MOV | Rd, #imm16 | Move 16-bit constant | N,Z,C | 3.5.6 on page 88
MRS | Rd, spec_reg | Move from special register to general register | - | 3.11.6 on page 185
MSR | spec_reg, Rm | Move from general register to special register | N,Z,C,V | 3.11.7 on page 186
MUL, MULS | {Rd,} Rn, Rm | Multiply, 32-bit result | N,Z | 3.6.1 on page 109
MVN, MVNS | Rd, Op2 | Move NOT | N,Z,C | 3.5.6 on page 88
NOP | - | No operation | - | 3.11.8 on page 187
ORN, ORNS | {Rd,} Rn, Op2 | Logical OR NOT | N,Z,C | 3.5.2 on page 84
ORR, ORRS | {Rd,} Rn, Op2 | Logical OR | N,Z,C | 3.5.2 on page 84
PKHTB, PKHBT | {Rd,} Rn, Rm, Op2 | Pack Halfword | | 3.8.1 on page 134
POP | reglist | Pop registers from stack | - | 3.4.7 on page 77
PUSH | reglist | Push registers onto stack | - | 3.4.7 on page 77
QADD | {Rd,} Rn, Rm | Saturating add | | 3.7.3 on page 127
QADD16 | {Rd,} Rn, Rm | Saturating add 16 | | 3.7.3 on page 127
QADD8 | {Rd,} Rn, Rm | Saturating add 8 | | 3.7.3 on page 127
QASX | {Rd,} Rn, Rm | Saturating add and subtract with exchange | | 3.7.4 on page 128
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
QDADD | {Rd,} Rn, Rm | Saturating double and add | | 3.7.5 on page 129
QDSUB | {Rd,} Rn, Rm | Saturating double and subtract | | 3.7.5 on page 129
QSAX | {Rd,} Rn, Rm | Saturating subtract and add with exchange | | 3.7.4 on page 128
QSUB | {Rd,} Rn, Rm | Saturating subtract | | 3.7.3 on page 127
QSUB16 | {Rd,} Rn, Rm | Saturating subtract 16 | | 3.7.4 on page 128
QSUB8 | {Rd,} Rn, Rm | Saturating subtract 8 | | 3.7.4 on page 128
RBIT | Rd, Rn | Reverse bits | - | 3.7.4 on page 128
REV | Rd, Rn | Reverse byte order in a word | - | 3.5.8 on page 91
REV16 | Rd, Rn | Reverse byte order in each halfword | - | 3.5.8 on page 91
REVSH | Rd, Rn | Reverse byte order in bottom halfword and sign extend | - | 3.5.8 on page 91
ROR, RORS | Rd, Rm, <Rs|#n> | Rotate right | N,Z,C | 3.5.3 on page 85
RRX, RRXS | Rd, Rm | Rotate right with extend | N,Z,C | 3.5.3 on page 85
RSB, RSBS | {Rd,} Rn, Op2 | Reverse subtract | N,Z,C,V | 3.5.1 on page 82
SADD16 | {Rd,} Rn, Rm | Signed add 16 | | 3.5.9 on page 92
SADD8 | {Rd,} Rn, Rm | Signed add 8 | | 3.5.9 on page 92
SASX | {Rd,} Rn, Rm | Signed add and subtract with exchange | | 3.5.14 on page 97
SBC, SBCS | {Rd,} Rn, Op2 | Subtract with carry | N,Z,C,V | 3.5.1 on page 82
SBFX | Rd, Rn, #lsb, #width | Signed bit field extract | - | 3.9.2 on page 139
SDIV | {Rd,} Rn, Rm | Signed divide | - | 3.6.3 on page 111
SEV | - | Send event | - | 3.11.9 on page 188
SHADD16 | {Rd,} Rn, Rm | Signed halving add 16 | - | 3.5.10 on page 93
SHADD8 | {Rd,} Rn, Rm | Signed halving add 8 | - | 3.5.10 on page 93
SHASX | {Rd,} Rn, Rm | Signed halving add and subtract with exchange | - | 3.5.11 on page 94
SHSAX | {Rd,} Rn, Rm | Signed halving subtract and add with exchange | - | 3.5.11 on page 94
SHSUB16 | {Rd,} Rn, Rm | Signed halving subtract 16 | - | 3.5.12 on page 95
SHSUB8 | {Rd,} Rn, Rm | Signed halving subtract 8 | - | 3.5.12 on page 95
SMLABB, SMLABT, SMLATB, SMLATT | Rd, Rn, Rm, Ra | Signed multiply accumulate long (halfwords) | Q | 3.6.3 on page 111
SMLAD, SMLADX | Rd, Rn, Rm, Ra | Signed multiply accumulate dual | Q | 3.6.4 on page 113
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
SMLAL | RdLo, RdHi, Rn, Rm | Signed multiply with accumulate (32 x 32 + 64), 64-bit result | - | 3.6.2 on page 110
SMLALBB, SMLALBT, SMLALTB, SMLALTT | RdLo, RdHi, Rn, Rm | Signed multiply accumulate long, halfwords | - | 3.6.5 on page 114
SMLALD, SMLALDX | RdLo, RdHi, Rn, Rm | Signed multiply accumulate long dual | - | 3.6.5 on page 114
SMLAWB, SMLAWT | Rd, Rn, Rm, Ra | Signed multiply accumulate, word by halfword | Q | 3.6.3 on page 111
SMLSD | Rd, Rn, Rm, Ra | Signed multiply subtract dual | Q | 3.6.6 on page 116
SMLSLD | RdLo, RdHi, Rn, Rm | Signed multiply subtract long dual | - | 3.6.6 on page 116
SMMLA | Rd, Rn, Rm, Ra | Signed most significant word multiply accumulate | - | 3.6.7 on page 118
SMMLS, SMMLR | Rd, Rn, Rm, Ra | Signed most significant word multiply subtract | - | 3.6.7 on page 118
SMMUL, SMMULR | {Rd,} Rn, Rm | Signed most significant word multiply | - | 3.6.8 on page 119
SMUAD | {Rd,} Rn, Rm | Signed dual multiply add | Q | 3.6.9 on page 120
SMULBB, SMULBT, SMULTB, SMULTT | {Rd,} Rn, Rm | Signed multiply (halfwords) | - | 3.6.10 on page 121
SMULL | RdLo, RdHi, Rn, Rm | Signed multiply (32 x 32), 64-bit result | - | 3.6.2 on page 110
SSAT | Rd, #n, Rm {,shift #s} | Signed saturate | Q | 3.7.1 on page 125
SSAT16 | Rd, #n, Rm | Signed saturate 16 | Q | 3.7.2 on page 126
SSAX | {Rd,} Rn, Rm | Signed subtract and add with exchange | GE | 3.5.14 on page 97
SSUB16 | {Rd,} Rn, Rm | Signed subtract 16 | - | 3.5.13 on page 96
SSUB8 | {Rd,} Rn, Rm | Signed subtract 8 | - | 3.5.13 on page 96
STM | Rn{!}, reglist | Store multiple registers, increment after | - | 3.4.6 on page 75
STMDB, STMEA | Rn{!}, reglist | Store multiple registers, decrement before | - | 3.4.6 on page 75
STMFD, STMIA | Rn{!}, reglist | Store multiple registers, increment after | - | 3.4.6 on page 75
STR | Rt, [Rn, #offset] | Store register word | - | 3.4 on page 68
STRB, STRBT | Rt, [Rn, #offset] | Store register byte | - | 3.4 on page 68
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
STRD | Rt, Rt2, [Rn, #offset] | Store register two words | - | 3.4.2 on page 70
STREX | Rd, Rt, [Rn, #offset] | Store register exclusive | - | 3.4.8 on page 78
STREXB | Rd, Rt, [Rn] | Store register exclusive byte | - | 3.4.8 on page 78
STREXH | Rd, Rt, [Rn] | Store register exclusive halfword | - | 3.4.8 on page 78
STRH, STRHT | Rt, [Rn, #offset] | Store register halfword | - | 3.4 on page 68
STRT | Rt, [Rn, #offset] | Store register word | - | 3.4 on page 68
SUB, SUBS | {Rd,} Rn, Op2 | Subtract | N,Z,C,V | 3.5.1 on page 82
SUB, SUBW | {Rd,} Rn, #imm12 | Subtract | N,Z,C,V | 3.5.1 on page 82
SVC | #imm | Supervisor call | - | 3.11.10 on page 189
SXTAB | {Rd,} Rn, Rm {,ROR #n} | Extend 8 bits to 32 and add | - | 3.8.3 on page 136
SXTAB16 | {Rd,} Rn, Rm {,ROR #n} | Dual extend 8 bits to 16 and add | - | 3.8.3 on page 136
SXTAH | {Rd,} Rn, Rm {,ROR #n} | Extend 16 bits to 32 and add | - | 3.8.3 on page 136
SXTB16 | {Rd,} Rm {,ROR #n} | Signed extend byte 16 | - | 3.8.2 on page 135
SXTB | {Rd,} Rm {,ROR #n} | Sign extend a byte | - | 3.9.3 on page 140
SXTH | {Rd,} Rm {,ROR #n} | Sign extend a halfword | - | 3.9.3 on page 140
TBB | [Rn, Rm] | Table branch byte | - | 3.9.8 on page 146
TBH | [Rn, Rm, LSL #1] | Table branch halfword | - | 3.9.8 on page 146
TEQ | Rn, Op2 | Test equivalence | N,Z,C | 3.5.9 on page 92
TST | Rn, Op2 | Test | N,Z,C | 3.5.9 on page 92
UADD16 | {Rd,} Rn, Rm | Unsigned add 16 | GE | 3.5.16 on page 99
UADD8 | {Rd,} Rn, Rm | Unsigned add 8 | GE | 3.5.16 on page 99
USAX | {Rd,} Rn, Rm | Unsigned subtract and add with exchange | GE | 3.5.17 on page 100
UHADD16 | {Rd,} Rn, Rm | Unsigned halving add 16 | - | 3.5.18 on page 101
UHADD8 | {Rd,} Rn, Rm | Unsigned halving add 8 | - | 3.5.18 on page 101
UHASX | {Rd,} Rn, Rm | Unsigned halving add and subtract with exchange | - | 3.5.19 on page 102
UHSAX | {Rd,} Rn, Rm | Unsigned halving subtract and add with exchange | - | 3.5.19 on page 102
UHSUB16 | {Rd,} Rn, Rm | Unsigned halving subtract 16 | - | 3.5.20 on page 103
UHSUB8 | {Rd,} Rn, Rm | Unsigned halving subtract 8 | - | 3.5.20 on page 103
UBFX | Rd, Rn, #lsb, #width | Unsigned bit field extract | - | 3.9.2 on page 139
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
UDIV | {Rd,} Rn, Rm | Unsigned divide | - | 3.6.3 on page 111
UMAAL | RdLo, RdHi, Rn, Rm | Unsigned multiply accumulate accumulate long (32 x 32 + 32 + 32), 64-bit result | - | 3.6.2 on page 110
UMLAL | RdLo, RdHi, Rn, Rm | Unsigned multiply with accumulate (32 x 32 + 64), 64-bit result | - | 3.6.2 on page 110
UMULL | RdLo, RdHi, Rn, Rm | Unsigned multiply (32 x 32), 64-bit result | - | 3.6.2 on page 110
UQADD16 | {Rd,} Rn, Rm | Unsigned saturating add 16 | - | 3.7.7 on page 131
UQADD8 | {Rd,} Rn, Rm | Unsigned saturating add 8 | - | 3.7.7 on page 131
UQASX | {Rd,} Rn, Rm | Unsigned saturating add and subtract with exchange | - | 3.7.6 on page 130
UQSAX | {Rd,} Rn, Rm | Unsigned saturating subtract and add with exchange | - | 3.7.6 on page 130
UQSUB16 | {Rd,} Rn, Rm | Unsigned saturating subtract 16 | - | 3.7.7 on page 131
UQSUB8 | {Rd,} Rn, Rm | Unsigned saturating subtract 8 | - | 3.7.7 on page 131
USAD8 | {Rd,} Rn, Rm | Unsigned sum of absolute differences | - | 3.5.22 on page 105
USADA8 | {Rd,} Rn, Rm, Ra | Unsigned sum of absolute differences and accumulate | - | 3.5.23 on page 106
USAT | Rd, #n, Rm {,shift #s} | Unsigned saturate | Q | 3.7.1 on page 125
USAT16 | Rd, #n, Rm | Unsigned saturate 16 | Q | 3.7.2 on page 126
UASX | {Rd,} Rn, Rm | Unsigned add and subtract with exchange | GE | 3.5.17 on page 100
USUB16 | {Rd,} Rn, Rm | Unsigned subtract 16 | GE | 3.5.24 on page 107
USUB8 | {Rd,} Rn, Rm | Unsigned subtract 8 | GE | 3.5.24 on page 107
UXTAB | {Rd,} Rn, Rm {,ROR #n} | Rotate, extend 8 bits to 32 and add | - | 3.8.3 on page 136
UXTAB16 | {Rd,} Rn, Rm {,ROR #n} | Rotate, dual extend 8 bits to 16 and add | - | 3.8.3 on page 136
UXTAH | {Rd,} Rn, Rm {,ROR #n} | Rotate, unsigned extend and add halfword | - | 3.8.3 on page 136
UXTB | {Rd,} Rm {,ROR #n} | Zero extend a byte | - | 3.8.2 on page 135
UXTB16 | {Rd,} Rm {,ROR #n} | Unsigned extend byte 16 | - | 3.8.2 on page 135
UXTH | {Rd,} Rm {,ROR #n} | Zero extend a halfword | - | 3.8.2 on page 135
VABS.F32 | Sd, Sm | Floating-point absolute | - | 3.10.1 on page 150
VADD.F32 | {Sd,} Sn, Sm | Floating-point add | - | 3.10.2 on page 151
Table 20. Cortex-M4 instructions (continued)
Mnemonic | Operands | Brief description | Flags | Page
VCMP.F32 | Sd, <Sm|#0.0> | Compare two floating-point registers, or one floating-point register and zero | FPSCR | 3.10.3 on page 152
VCMPE.F32 | Sd, <Sm|#0.0> | Compare two floating-point registers, or one floating-point register and zero with Invalid Operation check | FPSCR | 3.10.3 on page 152
VCVT.S32.F32 | Sd, Sm | Convert between floating-point and integer | |
VCVT.S16.F32 | Sd, Sd, #fbits | Convert between floating-point and fixed point | |
VCVTR.S32.F32 | Sd, Sm | Convert between floating-point and integer with rounding | |
VCVT<B|H>.F32.F16 | Sd, Sm | Converts half-precision value to single-precision | |
VCVTT<B|T>.F32.F16 | Sd, Sm | Converts single-precision register to half-precision | |
VDIV.F32 | {Sd,} Sn, Sm | Floating-point divide | - | 3.10.7 on page 156
VFMA.F32 | {Sd,} Sn, Sm | | |
VFNMA.F32 | {Sd,} Sn, Sm | | |
VFMS.F32 | {Sd,} Sn, Sm | | |
VFNMS.F32 | {Sd,} Sn, Sm | | |
VLDM.F<32|64> | Rn{!}, list | | |
VLDR.F<32|64> | <Dd|Sd>, [Rn] | | |
3.2 CMSIS intrinsic functions
ISO/IEC C code cannot directly access some Cortex-M4 instructions. This section describes
intrinsic functions that can generate these instructions, provided by the CMSIS, and that
might be provided by a C compiler. If a C compiler does not support an appropriate intrinsic
function, you might have to use an inline assembler to access some instructions.
The CMSIS provides the intrinsic functions listed in Table 21 to generate instructions that
ISO/IEC C code cannot directly access.
Table 21. CMSIS intrinsic functions to generate some Cortex-M4 instructions
Instruction | CMSIS intrinsic function
CPSIE I | void __enable_irq(void)
CPSID I | void __disable_irq(void)
CPSIE F | void __enable_fault_irq(void)
CPSID F | void __disable_fault_irq(void)
ISB | void __ISB(void)
DSB | void __DSB(void)
DMB | void __DMB(void)
REV | uint32_t __REV(uint32_t value)
REV16 | uint32_t __REV16(uint32_t value)
REVSH | uint32_t __REVSH(uint32_t value)
RBIT | uint32_t __RBIT(uint32_t value)
SEV | void __SEV(void)
WFE | void __WFE(void)
WFI | void __WFI(void)
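To make the effect of the byte-reversal and bit-reversal intrinsics concrete, the following portable C functions model what the REV, REV16, REVSH and RBIT instructions compute. They are an illustrative sketch only: the `_model` names are hypothetical and are not part of CMSIS, and on a target the intrinsics in Table 21 would be used instead.

```c
#include <stdint.h>

/* REV: reverse the byte order of a 32-bit word. */
static inline uint32_t rev_model(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* REV16: reverse the byte order within each halfword independently. */
static inline uint32_t rev16_model(uint32_t v)
{
    return ((v >> 8) & 0x00FF00FFu) | ((v << 8) & 0xFF00FF00u);
}

/* REVSH: reverse the bytes of the bottom halfword, then sign-extend to 32 bits. */
static inline uint32_t revsh_model(uint32_t v)
{
    uint32_t h = ((v >> 8) & 0xFFu) | ((v & 0xFFu) << 8);
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}

/* RBIT: reverse the bit order of a 32-bit word. */
static inline uint32_t rbit_model(uint32_t v)
{
    uint32_t r = 0u;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}
```

A typical use of REV is converting a 32-bit value between little-endian memory order and big-endian network order.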
The CMSIS also provides a number of functions for accessing the special registers using
MRS and MSR instructions (see Table 22).
Table 22. CMSIS intrinsic functions to access the special registers
Special register | Access | CMSIS function
PRIMASK | Read | uint32_t __get_PRIMASK (void)
PRIMASK | Write | void __set_PRIMASK (uint32_t value)
FAULTMASK | Read | uint32_t __get_FAULTMASK (void)
FAULTMASK | Write | void __set_FAULTMASK (uint32_t value)
BASEPRI | Read | uint32_t __get_BASEPRI (void)
BASEPRI | Write | void __set_BASEPRI (uint32_t value)
CONTROL | Read | uint32_t __get_CONTROL (void)
CONTROL | Write | void __set_CONTROL (uint32_t value)
MSP | Read | uint32_t __get_MSP (void)
MSP | Write | void __set_MSP (uint32_t TopOfMainStack)
PSP | Read | uint32_t __get_PSP (void)
PSP | Write | void __set_PSP (uint32_t TopOfProcStack)
3.3 About the instruction descriptions
The following sections give more information about using the instructions:
•Operands on page 59
•Restrictions when using PC or SP on page 59
•Flexible second operand on page 59
•Shift operations on page 61
•Address alignment on page 64
•PC-relative expressions on page 64
•Conditional execution on page 64
•Instruction width selection on page 67
3.3.1 Operands
An instruction operand can be an ARM register, a constant, or another instruction-specific
parameter. Instructions act on the operands and often store the result in a destination
register. When there is a destination register in the instruction, it is usually specified before
the operands.
Operands in some instructions are flexible in that they can either be a register or a constant
(see Flexible second operand).
3.3.2 Restrictions when using PC or SP
Many instructions have restrictions on whether you can use the program counter (PC) or
stack pointer (SP) for the operands or destination register. See instruction descriptions for
more information.
Bit[0] of any address written to the PC with a BX, BLX, LDM, LDR, or POP instruction must
be 1 for correct execution, because this bit indicates the required instruction set, and the
Cortex-M4 processor only supports Thumb instructions.
3.3.3 Flexible second operand
Many general data processing instructions have a flexible second operand. This is shown
as operand2 in the description of the syntax of each instruction.
Operand2 can be a:
•Constant
•Register with optional shift
Constant
You specify an operand2 constant in the form #constant, where constant can be:
•Any constant that can be produced by shifting an 8-bit value left by any number of bits
within a 32-bit word.
•Any constant of the form 0x00XY00XY
•Any constant of the form 0xXY00XY00
•Any constant of the form 0xXYXYXYXY
In the constants shown above, X and Y are hexadecimal digits.
In addition, in a small number of instructions, constant can include a wider range of values.
These are described in the individual instruction descriptions.
When an operand2 constant is used with the instructions MOVS, MVNS, ANDS, ORRS,
ORNS, EORS, BICS, TEQ or TST, the carry flag is updated to bit[31] of the constant, if the
constant is greater than 255 and can be produced by shifting an 8-bit value. These
instructions do not affect the carry flag if operand2 is any other constant.
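The four permitted constant forms listed above can be checked mechanically. The function below is a minimal sketch of such a check, written directly from that list; the function name is hypothetical, and it does not cover the wider ranges that a small number of instructions accept.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if c matches one of the four operand2 constant forms. */
static inline bool is_operand2_constant(uint32_t c)
{
    uint32_t lo = c & 0xFFu;         /* candidate XY taken from bits [7:0]  */
    uint32_t hi = (c >> 8) & 0xFFu;  /* candidate XY taken from bits [15:8] */

    if (c == (lo | (lo << 16))) {
        return true;                 /* 0x00XY00XY (also covers c == 0) */
    }
    if (c == ((hi << 8) | (hi << 24))) {
        return true;                 /* 0xXY00XY00 */
    }
    if (c == lo * 0x01010101u) {
        return true;                 /* 0xXYXYXYXY */
    }

    /* 8-bit value shifted left: after dropping trailing zero bits,
     * at most 8 significant bits may remain. c is non-zero here. */
    while ((c & 1u) == 0u) {
        c >>= 1;
    }
    return c <= 0xFFu;
}
```

For example, 0x000001FE is accepted because it is 0xFF shifted left by one bit, while 0x00000101 is rejected because its set bits span nine bit positions and it matches none of the repeated-byte forms.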
Instruction substitution
The assembler might be able to produce an equivalent instruction if a constant that is not
permitted is specified. For example, the instruction CMP Rd, #0xFFFFFFFE might be
assembled as the equivalent instruction CMN Rd, #0x2.
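The CMP to CMN substitution works because CMP subtracts its constant while CMN adds it, so the two produce the same 32-bit result when the CMN constant is the two's complement negation of the CMP constant. The one-line helper below, with a hypothetical name, shows that arithmetic.

```c
#include <stdint.h>

/* Two's complement negation modulo 2^32: the constant that CMN would
 * need in order to match CMP with constant c. */
static inline uint32_t negated_constant(uint32_t c)
{
    return (uint32_t)(0u - c);
}
```

Here the negation of 0xFFFFFFFE is 0x2, which is a permitted constant, so the substitution is possible.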
Register with optional shift
An operand2 register is specified in the form Rm {, shift}, where:
• Rm is the register holding the data for the second operand
• Shift is an optional shift to be applied to Rm. It can be one of the following:
ASR #n: Arithmetic shift right n bits, 1 ≤ n ≤ 32
LSL #n: Logical shift left n bits, 1 ≤ n ≤ 31
LSR #n: Logical shift right n bits, 1 ≤ n ≤ 32
ROR #n: Rotate right n bits, 1 ≤ n ≤ 31
RRX: Rotate right one bit, with extend
-: If omitted, no shift occurs, equivalent to LSL #0
If you omit the shift, or specify LSL #0, the instruction uses the value in Rm.
If you specify a shift, the shift is applied to the value in Rm, and the resulting 32-bit value is
used by the instruction. However, the contents in the Rm register remain unchanged.
Specifying a register with shift also updates the carry flag when used with certain
instructions. For information on the shift operations and how they affect the carry flag, see
Shift operations.
3.3.4 Shift operations
Register shift operations move the bits in a register left or right by a specified number of bits,
the shift length. Register shift can be performed:
•Directly by the instructions ASR, LSR, LSL, ROR, and RRX. The result is written to a
destination register.
•During the calculation of operand2 by the instructions that specify the second operand
as a register with shift (see Flexible second operand on page 59). The result is used by
the instruction.
The permitted shift lengths depend on the shift type and the instruction (see the individual
instruction description or Flexible second operand). If the shift length is 0, no shift occurs.
Register shift operations update the carry flag except when the specified shift length is 0.
The following sub-sections describe the various shift operations and how they affect the
carry flag. In these descriptions, Rm is the register containing the value to be shifted, and n is
the shift length.
ASR
Arithmetic shift right by n bits moves the left-hand 32-n bits of the Rm register to the right by
n places, into the right-hand 32-n bits of the result. It also copies the original bit[31] of the
register into the left-hand n bits of the result (see Figure 13: ASR #3 on page 61).
You can use the ASR #n operation to divide the value in the Rm register by 2^n, with the
result being rounded towards negative infinity.
When the instruction is ASRS or when ASR #n is used in operand2 with the instructions
MOVS, MVNS, ANDS, ORRS, ORNS, EORS, BICS, TEQ or TST, the carry flag is updated
to the last bit shifted out, bit[n-1], of the Rm register.
Note: 1. If n is 32 or more, all the bits in the result are set to the value of bit[31] of Rm.
2. If n is 32 or more and the carry flag is updated, it is updated to the value of bit[31] of Rm.
Figure 13. ASR #3
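The ASR behaviour described above, including the carry-out and the notes for shift lengths of 32 or more, can be expressed as a small reference model. This is a sketch with hypothetical names; it avoids the C `>>` operator on signed values, whose result for negative operands is implementation-defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result pair: shifted value plus carry flag after the shift. */
typedef struct {
    uint32_t value;
    bool     carry;
} shift_result_t;

/* Model of ASR #n. carry_in is returned unchanged when n is 0. */
static inline shift_result_t asr_model(uint32_t rm, unsigned n, bool carry_in)
{
    shift_result_t r = { rm, carry_in };
    bool sign = (rm >> 31) != 0u;

    if (n == 0u) {
        return r;                              /* no shift, carry unaffected */
    }
    if (n >= 32u) {
        r.value = sign ? 0xFFFFFFFFu : 0u;     /* every bit copies bit[31] */
        r.carry = sign;
    } else {
        uint32_t fill = sign ? ~(0xFFFFFFFFu >> n) : 0u;  /* top n bits */
        r.value = (rm >> n) | fill;
        r.carry = ((rm >> (n - 1u)) & 1u) != 0u;          /* bit[n-1] */
    }
    return r;
}
```

Shifting 0xFFFFFFF9 (-7) right by one bit gives 0xFFFFFFFC (-4), which shows the rounding towards negative infinity rather than towards zero.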
LSR
Logical shift right by n bits moves the left-hand 32-n bits of the Rm register to the right by n
places, into the right-hand 32-n bits of the result. It also sets the left-hand n bits of the result
to 0 (see Figure 14).
You can use the LSR #n operation to divide the value in the Rm register by 2^n, if the value is
regarded as an unsigned integer.
When the instruction is LSRS or when LSR #n is used in operand2 with the instructions
MOVS, MVNS, ANDS, ORRS, ORNS, EORS, BICS, TEQ or TST, the carry flag is updated
to the last bit shifted out, bit[n-1], of the Rm register.
Note: 1. If n is 32 or more, then all the bits in the result are cleared to 0.
2. If n is 33 or more and the carry flag is updated, it is updated to 0.
Figure 14. LSR #3
LSL
Logical shift left by n bits moves the right-hand 32-n bits of the Rm register to the left by n
places, into the left-hand 32-n bits of the result. It also sets the right-hand n bits of the result
to 0 (see Figure 15: LSL #3).
The LSL #n operation can be used to multiply the value in the Rm register by 2^n, if the value
is regarded as an unsigned integer or a two’s complement signed integer. Overflow can
occur without warning.
When the instruction is LSLS or when LSL #n, with non-zero n, is used in operand2 with the
instructions MOVS, MVNS, ANDS, ORRS, ORNS, EORS, BICS, TEQ or TST, the carry flag
is updated to the last bit shifted out, bit[32-n], of the Rm register. These instructions do not
affect the carry flag when used with LSL #0.
Note: 1. If n is 32 or more, then all the bits in the result are cleared to 0.
2. If n is 33 or more and the carry flag is updated, it is updated to 0.
Figure 15. LSL #3
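The LSR and LSL rules, including the special cases for shift lengths of 32 and of 33 or more, can be sketched as the following reference models. The type and function names are hypothetical, and carry_in stands for the carry flag value before the shift.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result pair: shifted value plus carry flag after the shift. */
typedef struct {
    uint32_t value;
    bool     carry;
} shift_result_t;

/* Model of LSR #n. carry_in is returned unchanged when n is 0. */
static inline shift_result_t lsr_model(uint32_t rm, unsigned n, bool carry_in)
{
    shift_result_t r = { rm, carry_in };
    if (n == 0u) {
        return r;
    }
    if (n < 32u) {
        r.value = rm >> n;
        r.carry = ((rm >> (n - 1u)) & 1u) != 0u;        /* bit[n-1] */
    } else {
        r.value = 0u;
        r.carry = (n == 32u) && ((rm >> 31) != 0u);     /* bit[31], else 0 */
    }
    return r;
}

/* Model of LSL #n. carry_in is returned unchanged when n is 0. */
static inline shift_result_t lsl_model(uint32_t rm, unsigned n, bool carry_in)
{
    shift_result_t r = { rm, carry_in };
    if (n == 0u) {
        return r;
    }
    if (n < 32u) {
        r.value = rm << n;
        r.carry = ((rm >> (32u - n)) & 1u) != 0u;       /* bit[32-n] */
    } else {
        r.value = 0u;
        r.carry = (n == 32u) && ((rm & 1u) != 0u);      /* bit[0], else 0 */
    }
    return r;
}
```

The LSL model also shows the silent overflow mentioned above: shifting 0x80000001 left by one bit yields 0x00000002, with the lost top bit visible only in the carry flag.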
ROR
Rotate right by n bits moves the left-hand 32-n bits of the Rm register to the right by n places,
into the right-hand 32-n bits of the result. It also moves the right-hand n bits of the register
into the left-hand n bits of the result (see Figure 16).
When the instruction is RORS or when ROR #n is used in operand2 with the instructions
MOVS, MVNS, ANDS, ORRS, ORNS, EORS, BICS, TEQ or TST, the carry flag is updated
to the last bit rotation, bit[n-1], of the Rm register.
Note: 1. If n is 32, then the value of the result is same as the value in Rm, and if the carry flag is
updated, it is updated to bit[31] of Rm.
2. ROR with shift length, n, more than 32 is the same as ROR with shift length n-32.
Figure 16. ROR #3
RRX
Rotate right with extend moves the bits of the Rm register to the right by one bit. It also
copies the carry flag into bit[31] of the result (see Figure 17).
When the instruction is RRXS or when RRX is used in operand2 with the instructions
MOVS, MVNS, ANDS, ORRS, ORNS, EORS, BICS, TEQ or TST, the carry flag is updated
to bit[0] of the Rm register.
Figure 17. RRX #3
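The two rotate operations can be modelled in the same style as the shifts. The sketch below uses hypothetical names; for ROR it relies on the fact that the last bit rotated, bit[n-1] of Rm, ends up in bit[31] of the result.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result pair: rotated value plus carry flag after the rotate. */
typedef struct {
    uint32_t value;
    bool     carry;
} shift_result_t;

/* Model of ROR #n. carry_in is returned unchanged when n is 0. */
static inline shift_result_t ror_model(uint32_t rm, unsigned n, bool carry_in)
{
    shift_result_t r = { rm, carry_in };
    unsigned m = n % 32u;            /* rotating by 32 is a full turn */

    if (n == 0u) {
        return r;
    }
    if (m != 0u) {
        r.value = (rm >> m) | (rm << (32u - m));
    }
    r.carry = (r.value >> 31) != 0u; /* last bit rotated now sits in bit[31] */
    return r;
}

/* Model of RRX: a 33-bit rotate through the carry flag by one bit. */
static inline shift_result_t rrx_model(uint32_t rm, bool carry_in)
{
    shift_result_t r;
    r.value = (rm >> 1) | (carry_in ? 0x80000000u : 0u);
    r.carry = (rm & 1u) != 0u;       /* bit[0] of Rm becomes the new carry */
    return r;
}
```

Chaining RRX across several registers is a common way to shift a multi-word value right by one bit, because each step passes its dropped bit to the next through the carry flag.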
3.3.5 Address alignment
An aligned access is an operation where a word-aligned address is used for a word, dual
word, or multiple word access, or where a halfword-aligned address is used for a halfword
access. Byte accesses are always aligned.
The Cortex-M4 processor supports unaligned access only for the following instructions:
•LDR, LDRT
•LDRH, LDRHT
•LDRSH, LDRSHT
•STR, STRT
•STRH, STRHT
All other load and store instructions generate a usage fault exception if they perform an
unaligned access, and therefore their accesses must be address aligned. For more
information about usage faults see Fault handling on page 43.
Unaligned accesses are usually slower than aligned accesses. In addition, some memory
regions might not support unaligned accesses. Therefore, ARM recommends that
programmers ensure that accesses are aligned. To avoid accidental generation of
unaligned accesses, use the UNALIGN_TRP bit in the configuration and control register to
trap all unaligned accesses, see Configuration and control register (CCR) on page 230.
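The definition of an aligned access given at the start of this section can be written as a simple predicate. The sketch below uses a hypothetical function name and takes the access size in bytes; following the definition, dual word and multiple word accesses only need word alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an access of size_bytes at address is aligned. */
static inline bool is_aligned_access(uint32_t address, unsigned size_bytes)
{
    if (size_bytes == 1u) {
        return true;                     /* byte accesses are always aligned */
    }
    if (size_bytes == 2u) {
        return (address & 1u) == 0u;     /* halfword-aligned address */
    }
    return (address & 3u) == 0u;         /* word, dual word, multiple word */
}
```

A check like this can be useful in debug builds to flag pointers that would trigger a usage fault when unaligned access trapping is enabled.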
3.3.6 PC-relative expressions
A PC-relative expression or label is a symbol that represents the address of an instruction or
literal data. It is represented in the instruction as the PC value plus or minus a numeric
offset. The assembler calculates the required offset from the label and the address of the
current instruction. If the offset is too big, the assembler produces an error.
•For the B, BL, CBNZ, and CBZ instructions, the value of the PC is the address of the
current instruction plus four bytes.
•For all other instructions that use labels, the value of the PC is the address of the
current instruction plus four bytes, with bit[1] of the result cleared to 0 to make it word-aligned.
•Your assembler might permit other syntaxes for PC-relative expressions, such as a
label plus or minus a number, or an expression of the form [PC, #number].
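The two PC rules above can be illustrated with a short C model. The function names are hypothetical and the sketch only mirrors the arithmetic an assembler performs when it resolves a label; it does not check encoding limits.

```c
#include <assert.h>
#include <stdint.h>

/* PC value seen by B, BL, CBNZ and CBZ: instruction address plus four. */
static uint32_t pc_for_branch(uint32_t insn_addr)
{
    assert((insn_addr & 1u) == 0u);       /* Thumb instructions are halfword-aligned */
    return insn_addr + 4u;
}

/* PC value seen by other label-using instructions: plus four, bit[1] cleared. */
static uint32_t pc_for_literal(uint32_t insn_addr)
{
    assert((insn_addr & 1u) == 0u);
    return (insn_addr + 4u) & ~UINT32_C(2);
}

/* Offset to encode so that pc_value + offset reaches label_addr. */
static int32_t pc_relative_offset(uint32_t label_addr, uint32_t pc_value)
{
    return (int32_t)(label_addr - pc_value);
}
```

For an instruction at 0x08000102, the branch form of the PC is 0x08000106, while the word-aligned form used by the other instructions is 0x08000104.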
3.3.7 Conditional execution
Most data processing instructions can optionally update the condition flags in the application
program status register (APSR) according to the result of the operation (see Application
program status register on page 20). Some instructions update all flags, and some only
update a subset. If a flag is not updated, the original value is preserved. See the instruction
descriptions for the flags they affect.
You can execute an instruction conditionally, based on the condition flags set in another
instruction:
•Immediately after the instruction that updated the flags
•After any number of intervening instructions that have not updated the flags
Conditional execution is available by using conditional branches or by adding condition code
suffixes to instructions. See Table 23: Condition code suffixes on page 66 for a list of the
suffixes to add to instructions to make them conditional instructions. The condition code
suffix enables the processor to test a condition based on the flags. If the condition test of a
conditional instruction fails, the instruction:
•Does not execute.
•Does not write any value to its destination register.
•Does not affect any of the flags.
•Does not generate any exception.
Conditional instructions, except for conditional branches, must be inside an If-then
instruction block. See IT on page 144 for more information and restrictions when using the
IT instruction. Depending on the vendor, the assembler might automatically insert an IT
instruction if you have conditional instructions outside the IT block.
Use the CBZ and CBNZ instructions to compare the value of a register against zero and
branch on the result.
This section describes:
•The condition flags
•Condition code suffixes on page 66
The condition flags
The APSR contains the following condition flags:
•N: Set to 1 when the result of the operation is negative, otherwise cleared to 0.
•Z: Set to 1 when the result of the operation is zero, otherwise cleared to 0.
•C: Set to 1 when the operation results in a carry, otherwise cleared to 0.
•V: Set to 1 when the operation causes an overflow, otherwise cleared to 0.
For more information about the APSR see Program status register on page 18.
A carry occurs:
•If the result of an addition is greater than or equal to 2^32.
•If the result of a subtraction is positive or zero.
•As the result of an inline barrel shifter operation in a move or logical instruction.
Overflow occurs if the sign of a result does not match the sign of the result had the operation
been performed at infinite precision, for example:
•if adding two negative values results in a positive value.
•if adding two positive values results in a negative value.
•if subtracting a positive value from a negative value generates a positive value.
•if subtracting a negative value from a positive value generates a negative value.
The Compare operations are identical to subtracting, for CMP, or adding, for CMN, except
that the result is discarded. See the instruction descriptions for more information.
Most instructions update the status flags only if the S suffix is specified. See the instruction
descriptions for more information.
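The carry and overflow rules above can be made concrete with a small C model of a flag-setting add and subtract. The type and function names are hypothetical, and the sketch assumes the usual ARM convention that a subtraction is computed as Rn + NOT(Operand2) + 1, so that C = 1 means no borrow occurred.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { unsigned n, z, c, v; } flags;

/* Flags produced by a + b + carry_in, as for ADDS and ADCS. */
static flags add_flags(uint32_t a, uint32_t b, unsigned carry_in, uint32_t *result)
{
    assert(carry_in <= 1u);
    uint64_t wide = (uint64_t)a + b + carry_in;
    uint32_t r = (uint32_t)wide;
    flags f;
    f.n = r >> 31;                              /* result is negative */
    f.z = (r == 0u);                            /* result is zero */
    f.c = (unsigned)(wide >> 32);               /* unsigned result reached 2^32 */
    f.v = ((~(a ^ b) & (a ^ r)) >> 31) & 1u;    /* same-sign inputs, different-sign result */
    *result = r;
    return f;
}

/* SUBS and CMP: a - b computed as a + ~b + 1. */
static flags sub_flags(uint32_t a, uint32_t b, uint32_t *result)
{
    return add_flags(a, ~b, 1u, result);
}
```

In this model, adding 1 to 0x7FFFFFFF sets N and V but not C, and subtracting 5 from 3 clears C because the unsigned result would be negative.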
Condition code suffixes
The instructions that can be conditional have an optional condition code, shown in syntax
descriptions as {cond}. Conditional execution requires a preceding IT instruction. An
instruction with a condition code is only executed if the condition code flags in the APSR
meet the specified condition. Table 23 shows the condition codes to use.
You can use conditional execution with the IT instruction to reduce the number of branch
instructions in code.
Table 23 also shows the relationship between condition code suffixes and the N, Z, C, and V
flags.
Table 23. Condition code suffixes
Suffix     Flags                Meaning
EQ         Z = 1                Equal
NE         Z = 0                Not equal
CS or HS   C = 1                Higher or same, unsigned ≥
CC or LO   C = 0                Lower, unsigned <
MI         N = 1                Negative
PL         N = 0                Positive or zero
VS         V = 1                Overflow
VC         V = 0                No overflow
HI         C = 1 and Z = 0      Higher, unsigned >
LS         C = 0 or Z = 1       Lower or same, unsigned ≤
GE         N = V                Greater than or equal, signed ≥
LT         N != V               Less than, signed <
GT         Z = 0 and N = V      Greater than, signed >
LE         Z = 1 or N != V      Less than or equal, signed ≤
AL         Can have any value   Always. This is the default when no suffix is specified.
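The flag tests behind the condition code suffixes can be sketched as a C predicate. The enum and function names are hypothetical. LE is modelled as the logical complement of GT (Z = 1 or N != V), which is how the ARMv7-M architecture defines it.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    COND_EQ, COND_NE, COND_CS, COND_CC, COND_MI, COND_PL, COND_VS, COND_VC,
    COND_HI, COND_LS, COND_GE, COND_LT, COND_GT, COND_LE, COND_AL
} cond_code;

/* Returns true if an instruction with the given suffix would execute. */
static bool cond_passed(cond_code cond, unsigned n, unsigned z, unsigned c, unsigned v)
{
    assert(n <= 1u && z <= 1u && c <= 1u && v <= 1u);
    switch (cond) {
    case COND_EQ: return z == 1u;
    case COND_NE: return z == 0u;
    case COND_CS: return c == 1u;               /* also written HS */
    case COND_CC: return c == 0u;               /* also written LO */
    case COND_MI: return n == 1u;
    case COND_PL: return n == 0u;
    case COND_VS: return v == 1u;
    case COND_VC: return v == 0u;
    case COND_HI: return c == 1u && z == 0u;
    case COND_LS: return c == 0u || z == 1u;
    case COND_GE: return n == v;
    case COND_LT: return n != v;
    case COND_GT: return z == 0u && n == v;
    case COND_LE: return z == 1u || n != v;
    case COND_AL: return true;
    }
    return true;
}
```

Each signed or unsigned pair in the model is complementary: for any flag combination exactly one of GT and LE passes, and likewise for HI and LS.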
Specific example 1: Absolute value shows the use of a conditional instruction to find the
absolute value of a number. R0 = ABS(R1).
Specific example 1: Absolute value
MOVS  R0, R1      ; R0 = R1, setting flags
IT    MI          ; IT instruction for the negative condition
RSBMI R0, R1, #0  ; If negative, R0 = -R1
Specific example 2: Compare and update value shows the use of conditional instructions to
update the value of R4 if the signed values in R0 and R2 are greater than those in R1 and R3
respectively.
Specific example 2: Compare and update value
CMP   R0, R1  ; compare R0 and R1, setting flags
ITT   GT      ; IT instruction for the two GT conditions
CMPGT R2, R3  ; if 'greater than', compare R2 and R3, setting flags
MOVGT R4, R5  ; if still 'greater than', do R4 = R5
3.3.8 Instruction width selection
There are many instructions that can generate either a 16-bit encoding or a 32-bit encoding
depending on the specified operands and destination register. For some of these
instructions, you can force a specific instruction size by using an instruction width suffix.
The .W suffix forces a 32-bit instruction encoding. The .N suffix forces a 16-bit instruction
encoding.
If you specify an instruction width suffix and the assembler cannot generate an instruction
encoding of the requested width, it generates an error.
In some cases it might be necessary to specify the .W suffix, for example if the operand is
the label of an instruction or literal data, as in the case of branch instructions. The reason for
this is that the assembler might not automatically generate the right size encoding.
To use an instruction width suffix, place it immediately after the instruction mnemonic and
condition code, if any.
Specific example 3: Instruction width selection shows instructions with the instruction width
suffix.
Specific example 3: Instruction width selection
BCS.W  label       ; creates a 32-bit instruction even for a short branch
ADDS.W R0, R0, R1  ; creates a 32-bit instruction even though the same
                   ; operation can be done by a 16-bit instruction
3.4 Memory access instructions
Table 24 shows the memory access instructions:
Table 24. Memory access instructions
Mnemonic      Brief description                         See
ADR           Load PC-relative address                  ADR on page 69
CLREX         Clear exclusive                           CLREX on page 79
LDM{mode}     Load multiple registers                   LDM and STM on page 75
LDR{type}     Load register using immediate offset      LDR and STR, immediate offset on page 70
LDR{type}     Load register using register offset       LDR and STR, register offset on page 72
LDR{type}T    Load register with unprivileged access    LDR and STR, unprivileged on page 73
LDR           Load register using PC-relative address   LDR, PC-relative on page 74
LDRD          Load register dual                        LDR and STR, immediate offset on page 70
LDREX{type}   Load register exclusive                   LDREX and STREX on page 78
POP           Pop registers from stack                  PUSH and POP on page 77
PUSH          Push registers onto stack                 PUSH and POP on page 77
STM{mode}     Store multiple registers                  LDM and STM on page 75
STR{type}     Store register using immediate offset     LDR and STR, immediate offset on page 70
STR{type}     Store register using register offset      LDR and STR, register offset on page 72
STR{type}T    Store register with unprivileged access   LDR and STR, unprivileged on page 73
STREX{type}   Store register exclusive                  LDREX and STREX on page 78
3.4.1 ADR
Load PC-relative address.
Syntax
ADR{cond} Rd, label
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64)
•‘Rd’ is the destination register
•‘label’ is a PC-relative expression (see PC-relative expressions on page 64)
Operation
ADR determines the address by adding an immediate value to the PC. It writes the result to
the destination register.
ADR produces position-independent code, because the address is PC-relative.
If you use ADR to generate a target address for a BX or BLX instruction, you must ensure
that bit[0] of the address you generate is set to 1 for correct execution.
Values of label must be within the range -4095 to 4095 from the address in the PC.
Note: You might have to use the .W suffix to get the maximum offset range or to generate
addresses that are not word-aligned (see Instruction width selection on page 67).
Restrictions
Rd must be neither SP nor PC.
Condition flags
This instruction does not change the flags.
Examples
ADR R1, TextMessage  ; write address value of a location labelled as
                     ; TextMessage to R1
3.4.2 LDR and STR, immediate offset
Load and Store with immediate offset, pre-indexed immediate offset, or post-indexed
immediate offset.
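Syntax
op{type}{cond} Rt, [Rn {, #offset}]  ; immediate offset
op{type}{cond} Rt, [Rn, #offset]!    ; pre-indexed
op{type}{cond} Rt, [Rn], #offset     ; post-indexed
opD{cond} Rt, Rt2, [Rn {, #offset}]  ; immediate offset, two words
opD{cond} Rt, Rt2, [Rn, #offset]!    ; pre-indexed, two words
opD{cond} Rt, Rt2, [Rn], #offset     ; post-indexed, two words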
Where:
•‘op’ is either LDR (load register) or STR (store register)
•‘type’ is one of the following:
B: Unsigned byte, zero extends to 32 bits on loads
SB: Signed byte, sign extends to 32 bits (LDR only)
H: Unsigned halfword, zero extends to 32 bits on loads
SH: Signed halfword, sign extends to 32 bits (LDR only)
—: Omit, for word
•‘cond’ is an optional condition code (see Conditional execution on page 64)
•‘Rt’ is the register to load or store
•‘Rn’ is the register on which the memory address is based
•‘offset’ is an offset from Rn. If offset is omitted, the address is the contents of Rn
•‘Rt2’ is the additional register to load or store for two-word operations
Operation
LDR instructions load one or two registers with a value from memory. STR instructions store
one or two register values to memory.
Load and store instructions with immediate offset can use the following addressing modes:
Offset addressing
The offset value is added to or subtracted from the address obtained from the register
Rn. The result is used as the address for the memory access. The register Rn is
unaltered. The assembly language syntax for this mode is: [Rn, #offset].
Pre-indexed addressing
The offset value is added to or subtracted from the address obtained from the register
Rn. The result is used as the address for the memory access and written back into the
register Rn. The assembly language syntax for this mode is: [Rn, #offset]!
Post-indexed addressing
The address obtained from the register Rn is used as the address for the memory
access. The offset value is added to or subtracted from the address, and written back
into the register Rn. The assembly language syntax for this mode is: [Rn], #offset.
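The difference between the three addressing modes lies in which address is used for the access and what Rn holds afterwards. The following C sketch models just that; the struct and function names are hypothetical, and offset range checks are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t access_addr;  /* address used for the memory access */
    uint32_t rn_after;     /* value of Rn once the instruction completes */
} addr_result;

/* [Rn, #offset]: access at Rn + offset, Rn unaltered. */
static addr_result offset_mode(uint32_t rn, int32_t offset)
{
    addr_result r = { rn + (uint32_t)offset, rn };
    return r;
}

/* [Rn, #offset]!: access at Rn + offset, and write that address back to Rn. */
static addr_result pre_indexed(uint32_t rn, int32_t offset)
{
    uint32_t addr = rn + (uint32_t)offset;
    addr_result r = { addr, addr };
    return r;
}

/* [Rn], #offset: access at Rn, then write Rn + offset back to Rn. */
static addr_result post_indexed(uint32_t rn, int32_t offset)
{
    addr_result r = { rn, rn + (uint32_t)offset };
    return r;
}
```

Post-indexed addressing is the form that matches a C idiom such as *p++, while pre-indexed addressing matches *++p.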
The value to load or store can be a byte, halfword, word, or two words. Bytes and halfwords
can either be signed or unsigned (see Address alignment on page 64).
Table 25 shows the range of offsets for immediate, pre-indexed and post-indexed forms.
Table 25. Immediate, pre-indexed and post-indexed offset ranges
Instruction type                   Immediate offset        Pre-indexed             Post-indexed
Word, halfword, signed halfword,   -255 to 4095            -255 to 255             -255 to 255
byte, or signed byte
Two words                          Multiple of 4 in the    Multiple of 4 in the    Multiple of 4 in the
                                   range -1020 to 1020     range -1020 to 1020     range -1020 to 1020
Restrictions
•For load instructions:
–Rt can be SP or PC for word loads only.
–Rt must be different from Rt2 for two-word loads.
–Rn must be different from Rt and Rt2 in the pre-indexed or post-indexed forms.
•When Rt is PC in a word load instruction:
–bit[0] of the loaded value must be 1 for correct execution.
–A branch occurs to the address created by changing bit[0] of the loaded value to 0.
–If the instruction is conditional, it must be the last instruction in the IT block.
•For store instructions:
–Rt can be SP for word stores only.
–Rt must not be PC.
–Rn must not be PC.
–Rn must be different from Rt and Rt2 in the pre-indexed or post-indexed forms
Condition flags
These instructions do not change the flags.
Examples
LDR   R8, [R10]              ; loads R8 from the address in R10.
LDRNE R2, [R5, #960]!        ; loads (conditionally) R2 from a word
                             ; 960 bytes above the address in R5, and
                             ; increments R5 by 960.
STR   R2, [R9,#const-struc]  ; const-struc is an expression evaluating
                             ; to a constant in the range 0-4095.
STRH  R3, [R4], #4           ; Store R3 as halfword data into address in
                             ; R4, then increment R4 by 4
LDRD  R8, R9, [R3, #0x20]    ; Load R8 from a word 32 bytes above the
                             ; address in R3, and load R9 from a word 36
                             ; bytes above the address in R3
STRD  R0, R1, [R8], #-16     ; Store R0 to address in R8, and store R1 to
                             ; a word 4 bytes above the address in R8,
                             ; and then decrement R8 by 16.
3.4.3 LDR and STR, register offset
Load and Store with register offset.
Syntax
op{type}{cond} Rt, [Rn, Rm {, LSL #n}]
Where:
•‘op’ is either LDR (load register) or STR (store register).
•‘type’ is one of the following:
B: Unsigned byte, zero extends to 32 bits on loads.
SB: Signed byte, sign extends to 32 bits (LDR only).
H: Unsigned halfword, zero extends to 32 bits on loads.
SH: Signed halfword, sign extends to 32 bits (LDR only).
—: Omit, for word.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rt’ is the register to load or store.
•‘Rn’ is the register on which the memory address is based.
•‘Rm’ is a register containing a value to be used as the offset.
•‘LSL #n’ is an optional shift, with n in the range 0 to 3.
Operation
LDR instructions load a register with a value from memory. STR instructions store a register
value into memory. The memory address to load from or store to is at an offset from the
register Rn. The offset is specified by the Rm register and can be shifted left by up to 3 bits
using LSL. The value to load or store can be a byte, halfword, or word. For load instructions,
bytes and halfwords can either be signed or unsigned (see Address alignment on page 64).
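The effective address of a register-offset access can be written as a one-line C expression. The helper name is hypothetical; the shift limit of 3 comes from the syntax description above.

```c
#include <assert.h>
#include <stdint.h>

/* Address used by op Rt, [Rn, Rm, LSL #shift]. */
static uint32_t register_offset_addr(uint32_t rn, uint32_t rm, unsigned shift)
{
    assert(shift <= 3u);              /* LSL #n accepts n in the range 0 to 3 */
    return rn + (rm << shift);
}
```

With a shift of 2, Rm acts as an index into an array of 32-bit words based at Rn, so an index of 5 from base 0x20000000 addresses 0x20000014.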
Restrictions
In these instructions:
•Rn must not be PC.
•Rm must be neither SP nor PC.
•Rt can be SP only for word loads and word stores.
•Rt can be PC only for word loads.
When Rt is PC in a word load instruction:
•bit[0] of the loaded value must be 1 for correct execution, and a branch occurs to this
halfword-aligned address.
•If the instruction is conditional, it must be the last instruction in the IT block.
Condition flags
These instructions do not change the flags.
Examples
STR   R0, [R5, R1]          ; store value of R0 into an address equal to
                            ; sum of R5 and R1
LDRSB R0, [R5, R1, LSL #1]  ; read byte value from an address equal to
                            ; sum of R5 and two times R1, sign extend it
                            ; to a word value and put it in R0
STR   R0, [R1, R2, LSL #2]  ; stores R0 to an address equal to sum of R1
                            ; and four times R2
3.4.4 LDR and STR, unprivileged
Load and Store with unprivileged access.
Syntax
op{type}T{cond} Rt, [Rn {, #offset}]  ; immediate offset
Where:
•‘op’ is either LDR (load register) or STR (store register).
•‘type’ is one of the following:
B: Unsigned byte, zero extends to 32 bits on loads.
SB: Signed byte, sign extends to 32 bits (LDR only).
H: Unsigned halfword, zero extends to 32 bits on loads.
SH: Signed halfword, sign extends to 32 bits (LDR only).
—: Omit, for word.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rt’ is the register to load or store.
•‘Rn’ is the register on which the memory address is based.
•‘offset’ is an offset from Rn and can be 0 to 255. If offset is omitted, the address is the
value in Rn.
Operation
These load and store instructions perform the same function as the memory access
instructions with immediate offset (see LDR and STR, immediate offset on page 70). The
difference is that these instructions have only unprivileged access even when used in
privileged software.
When used in unprivileged software, these instructions behave in exactly the same way as
normal memory access instructions with immediate offset.
Restrictions
In these instructions:
•Rn must not be PC.
•Rt must be neither SP nor PC.
Condition flags
These instructions do not change the flags.
Examples
STRBTEQ R4, [R7]      ; conditionally store least significant byte in
                      ; R4 to an address in R7, with unprivileged access
LDRHT   R2, [R2, #8]  ; load halfword value from an address equal to
                      ; sum of R2 and 8 into R2, with unprivileged access
3.4.5 LDR, PC-relative
Load register from memory.
Syntax
LDR{type}{cond} Rt, label
LDRD{cond} Rt, Rt2, label; load two words
Where:
•‘type’ is one of the following:
B: Unsigned byte, zero extends to 32 bits.
SB: Signed byte, sign extends to 32 bits.
H: Unsigned halfword, zero extends to 32 bits.
SH: Signed halfword, sign extends to 32 bits.
—: Omit, for word.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rt’ is the register to load.
•‘Rt2’ is the second register to load.
•‘label’ is a PC-relative expression, see PC-relative expressions on page 64.
Operation
LDR loads a register with a value from a PC-relative memory address.
The memory address is specified by a label or by an offset from the PC.
The value to load or store can be a byte, halfword, or word. For load instructions, bytes and
halfwords can either be signed or unsigned (see Address alignment on page 64).
‘label’ must be within a limited range of the current instruction. Table 26 shows the possible
offsets between label and the PC. You might have to use the .W suffix to get the maximum
offset range (see Instruction width selection on page 67).
Table 26. label-PC offset ranges
Instruction type                                     Offset range
Word, halfword, signed halfword, byte, signed byte   -4095 to 4095
Two words                                            -1020 to 1020
Restrictions
In these instructions:
•Rt2 must be neither SP nor PC
•Rt must be different from Rt2
•Rt can be SP or PC only for word loads
•When Rt is PC in a word load instruction: bit[0] of the loaded value must be 1 for
correct execution, and a branch occurs to this halfword-aligned address. If the
instruction is conditional, it must be the last instruction in the IT block.
Condition flags
These instructions do not change the flags.
Examples
LDR   R0, LookUpTable  ; load R0 with a word of data from an address
                       ; labelled as LookUpTable
LDRSB R7, localdata    ; load a byte value from an address labelled
                       ; as localdata, sign extend it to a word
                       ; value, and put it in R7
3.4.6 LDM and STM
Load and Store Multiple registers.
Syntax
op{addr_mode}{cond} Rn{!}, reglist
Where:
•‘op’ is either LDM (load multiple register) or STM (store multiple register).
•‘addr_mode’ is any of the following:
IA: Increment address after each access (this is the default).
DB: Decrement address before each access.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rn’ is the register on which the memory addresses are based.
•‘!’ is an optional writeback suffix. If ! is present, the final address that is loaded from or
stored to is written back into Rn.
•‘reglist’ is a list of one or more registers to be loaded or stored, enclosed in braces. It
can contain register ranges. It must be comma-separated if it contains more than one
register or register range, see Examples on page 76.
LDM and LDMFD are synonyms for LDMIA. LDMFD refers to its use for popping data from
full descending stacks.
LDMEA is a synonym for LDMDB, and refers to its use for popping data from empty
ascending stacks.
STM and STMEA are synonyms for STMIA. STMEA refers to its use for pushing data onto
empty ascending stacks.
STMFD is a synonym for STMDB, and refers to its use for pushing data onto full descending
stacks.
Operation
LDM instructions load the registers in reglist with word values from memory addresses
based on Rn.
STM instructions store the word values in the registers in reglist to memory addresses
based on Rn.
For LDM, LDMIA, LDMFD, STM, STMIA, and STMEA the memory addresses used for the
accesses are at 4-byte intervals ranging from Rn to Rn + 4 * (n-1), where n is the number of
registers in reglist. The accesses happen in order of increasing register numbers, with the
lowest numbered register using the lowest memory address and the highest numbered
register using the highest memory address. If the writeback suffix is specified, the value of
Rn + 4 * n is written back to Rn.
For LDMDB, LDMEA, STMDB, and STMFD the memory addresses used for the accesses
are at 4-byte intervals ranging from Rn - 4 down to Rn - 4 * n, where n is the number of
registers in reglist. The accesses happen in order of decreasing register numbers, with the
highest numbered register using the highest memory address and the lowest numbered
register using the lowest memory address. If the writeback suffix is specified, the value
Rn - 4 * n is written back to Rn.
The PUSH and POP instructions can be expressed in this form (see PUSH and POP for
details).
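The address arithmetic for the two addressing modes can be sketched in C. The helper names are hypothetical, reglist is assumed to be a 16-bit mask with bit i standing for Ri, and the sketch follows the ARMv7-M architectural definition in which the base register moves by four bytes per transferred register.

```c
#include <assert.h>
#include <stdint.h>

/* Number of registers named in a register list mask (bit i = Ri). */
static unsigned reg_count(uint16_t reglist)
{
    unsigned count = 0u;
    for (unsigned i = 0u; i < 16u; ++i) {
        count += (reglist >> i) & 1u;
    }
    return count;
}

/* Increment-after forms: lowest address is Rn, written-back value is Rn + 4 * n. */
static uint32_t ia_writeback(uint32_t rn, uint16_t reglist)
{
    assert(reglist != 0u);            /* the list must name at least one register */
    return rn + 4u * reg_count(reglist);
}

/* Decrement-before forms: lowest address and written-back value are both Rn - 4 * n. */
static uint32_t db_lowest_addr(uint32_t rn, uint16_t reglist)
{
    assert(reglist != 0u);
    return rn - 4u * reg_count(reglist);
}

/* Address used for register reg, given the lowest address of the transfer. */
static uint32_t reg_addr(uint32_t lowest, uint16_t reglist, unsigned reg)
{
    assert(reg < 16u && ((reglist >> reg) & 1u));
    uint16_t below = (uint16_t)(reglist & ((1u << reg) - 1u));
    return lowest + 4u * reg_count(below);   /* lower-numbered registers sit below */
}
```

For STMDB R1!,{R3-R6,R11,R12} with R1 = 0x20001000, this model gives a lowest address of 0x20000FE8, which is also the value written back to R1, and places R11 at 0x20000FF8.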
Restrictions
In these instructions:
•Rn must not be PC.
•reglist must not contain SP.
•In any STM instruction, reglist must not contain PC.
•In any LDM instruction, reglist must not contain PC if it contains LR.
•reglist must not contain Rn if you specify the writeback suffix.
When PC is in reglist in an LDM instruction:
•bit[0] of the value loaded to the PC must be 1 for correct execution, and a branch
occurs to this halfword-aligned address.
•If the instruction is conditional, it must be the last instruction in the IT block.
Condition flags
These instructions do not change the flags.
Examples
LDM   R8,{R0,R2,R9}  ; LDMIA is a synonym for LDM
STMDB R1!,{R3-R6,R11,R12}
Incorrect examples
STM   R5!,{R5,R4,R9}  ; value stored for R5 is unpredictable
LDM   R2, {}          ; there must be at least one register in the list
3.4.7 PUSH and POP
Push registers onto, and pop registers off a full-descending stack. PUSH and POP are
synonyms for STMDB and LDM (or LDMIA) with the memory addresses for the access
based on SP, and with the final address for the access written back to the SP. PUSH and
POP are the preferred mnemonics in these cases.
Syntax
PUSH{cond} reglist
POP{cond} reglist
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘reglist’ is a non-empty list of registers (or register ranges), enclosed in braces.
Commas must separate register lists or ranges (see Examples on page 76).
Operation
•PUSH stores registers on the stack in order of decreasing register numbers, with the
highest numbered register using the highest memory address and the lowest
numbered register using the lowest memory address.
•POP loads registers from the stack in order of increasing register numbers, with the
lowest numbered register using the lowest memory address and the highest numbered
register using the highest memory address.
•PUSH uses the value in the SP register minus four as the highest memory address,
POP uses the SP register value as the lowest memory address, implementing a
full-descending stack. On completion, PUSH updates the SP register to point to the
location of the lowest stored value, and POP updates the SP register to point to the
location above the highest location loaded.
•If a POP instruction includes PC in its reglist, a branch to this location is performed
when the POP instruction has completed. Bit[0] of the value read for the PC is used to
update the EPSR T-bit. This bit must be 1 to ensure correct operation. See LDM and
STM on page 75 for more information.
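The full-descending stack behavior of PUSH and POP can be modelled in a few lines of C. This is a simplified sketch: the cpu_model type and both functions are hypothetical, the stack is word-indexed rather than byte-addressed, and the special handling of PC and SP in the register list is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

typedef struct {
    uint32_t mem[STACK_WORDS];  /* word-indexed stack memory */
    uint32_t sp;                /* index of the last pushed word (full stack) */
    uint32_t r[16];             /* R0-R15 */
} cpu_model;

/* PUSH: highest numbered register goes to the highest address. */
static void push_model(cpu_model *cpu, uint16_t reglist)
{
    for (int reg = 15; reg >= 0; --reg) {
        if (reglist & (1u << reg)) {
            assert(cpu->sp > 0u);
            cpu->mem[--cpu->sp] = cpu->r[reg];   /* decrement, then store */
        }
    }
}

/* POP: lowest numbered register comes from the lowest address. */
static void pop_model(cpu_model *cpu, uint16_t reglist)
{
    for (int reg = 0; reg <= 15; ++reg) {
        if (reglist & (1u << reg)) {
            assert(cpu->sp < STACK_WORDS);
            cpu->r[reg] = cpu->mem[cpu->sp++];   /* load, then increment */
        }
    }
}
```

After a push in this model, sp indexes the word holding the lowest numbered register, and a pop with the same register list restores both the registers and the original sp.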
Restrictions
In these instructions:
•‘reglist’ must not contain SP.
•For the PUSH instruction, reglist must not contain PC.
•For the POP instruction, reglist must not contain PC if it contains LR.
When PC is in reglist in a POP instruction: bit[0] of the value loaded to the PC must be
1 for correct execution, and a branch occurs to this halfword-aligned address. If the
instruction is conditional, it must be the last instruction in the IT block.
Condition flags
These instructions do not change the flags.
Examples
PUSH {R0,R4-R7} ; Push R0,R4,R5,R6,R7 onto the stack
PUSH {R2,LR} ; Push R2 and the link-register onto the stack
POP {R0,R6,PC} ; Pop r0,r6 and PC from the stack, then branch to new PC.
3.4.8 LDREX and STREX
Load and Store Register Exclusive.
Syntax
LDREX{cond} Rt, [Rn {, #offset}]
STREX{cond} Rd, Rt, [Rn {, #offset}]
LDREXB{cond} Rt, [Rn]
STREXB{cond} Rd, Rt, [Rn]
LDREXH{cond} Rt, [Rn]
STREXH{cond} Rd, Rt, [Rn]
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register for the returned status.
•‘Rt’ is the register to load or store.
•‘Rn’ is the register on which the memory address is based.
•‘offset’ is an optional offset applied to the value in Rn. If offset is omitted, the address is
the value in Rn.
Operation
LDREX, LDREXB, and LDREXH load a word, byte, and halfword respectively from a
memory address.
STREX, STREXB, and STREXH attempt to store a word, byte, and halfword respectively to
a memory address. The address used in any store-exclusive instruction must be the same
as the address in the most recently executed load-exclusive instruction. The value stored by
the store-exclusive instruction must also have the same data size as the value loaded by the
preceding load-exclusive instruction. This means software must always use a
load-exclusive instruction and a matching store-exclusive instruction to perform a
synchronization operation, see Synchronization primitives on page 33.
If a store-exclusive instruction performs the store, it writes 0 to its destination register.
If it does not perform the store, it writes 1 to its destination register.
If the store-exclusive instruction writes 0 to the destination register, it is guaranteed that no
other process in the system has accessed the memory location between the load-exclusive
and store-exclusive instructions.
For reasons of performance, keep the number of instructions between corresponding
load-exclusive and store-exclusive instructions to a minimum.
Note:The result of executing a store-exclusive instruction to an address that is different from that
used in the preceding load-exclusive instruction is unpredictable.
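In C, the same claim-if-free pattern is normally written with C11 atomics rather than with the exclusive instructions directly. The sketch below is an assumption-laden illustration: try_lock and unlock are hypothetical names, and while compilers targeting this processor commonly implement the compare-and-exchange with a load-exclusive and store-exclusive loop, that is toolchain behavior and not something this manual specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Try to claim a lock word: 0 means free, 1 means taken. */
static bool try_lock(atomic_uint *lock)
{
    assert(lock != 0);
    unsigned expected = 0u;
    /* Succeeds only if *lock still held 0 when the store was attempted. */
    return atomic_compare_exchange_strong(lock, &expected, 1u);
}

/* Release the lock by storing the 'free' value. */
static void unlock(atomic_uint *lock)
{
    assert(lock != 0);
    atomic_store(lock, 0u);
}
```

A caller would typically retry try_lock in a loop, which corresponds to branching back to the load-exclusive when the store-exclusive reports failure.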
Restrictions
In these instructions:
•Do not use PC.
•Do not use SP for Rd and Rt.
•For STREX, Rd must be different from both Rt and Rn.
•The value of offset must be a multiple of four in the range 0-1020.
Condition flags
These instructions do not change the flags.
Examples
    MOV     R1, #0x1            ; initialize the 'lock taken' value
try LDREX   R0, [LockAddr]      ; load the lock value
    CMP     R0, #0              ; is the lock free?
    ITT     EQ                  ; IT instruction for STREXEQ and CMPEQ
    STREXEQ R0, R1, [LockAddr]  ; try and claim the lock
    CMPEQ   R0, #0              ; did this succeed?
    BNE     try                 ; no, try again
                                ; yes, we have the lock
3.4.9 CLREX
Clear Exclusive.
Syntax
CLREX{cond}
Where:
‘cond’ is an optional condition code (see Conditional execution on page 64)
Operation
Use CLREX to make the next STREX, STREXB, or STREXH instruction write 1 to its destination
register and fail to perform the store. It is useful in exception handler code to force the failure
of the store exclusive if the exception occurs between a load exclusive instruction and the
matching store exclusive instruction in a synchronization operation.
See Synchronization primitives on page 33 for more information.
Condition flags
These instructions do not change the flags.
Examples
CLREX
3.5 General data processing instructions
Table 27 shows the data processing instructions.
Table 27. Data processing instructions
Mnemonic  Brief description                                      See
ADC       Add with carry                                         ADD, ADC, SUB, SBC, and RSB on page 82
ADD       Add                                                    ADD, ADC, SUB, SBC, and RSB on page 82
ADDW      Add                                                    ADD, ADC, SUB, SBC, and RSB on page 82
AND       Logical AND                                            AND, ORR, EOR, BIC, and ORN on page 84
ASR       Arithmetic Shift Right                                 ASR, LSL, LSR, ROR, and RRX on page 85
BIC       Bit Clear                                              AND, ORR, EOR, BIC, and ORN on page 84
CLZ       Count leading zeros                                    CLZ on page 86
CMN       Compare Negative                                       CMP and CMN on page 87
CMP       Compare                                                CMP and CMN on page 87
EOR       Exclusive OR                                           AND, ORR, EOR, BIC, and ORN on page 84
LSL       Logical Shift Left                                     ASR, LSL, LSR, ROR, and RRX on page 85
LSR       Logical Shift Right                                    ASR, LSL, LSR, ROR, and RRX on page 85
MOV       Move                                                   MOV and MVN on page 88
MOVT      Move Top                                               MOVT on page 90
MOVW      Move 16-bit constant                                   MOV and MVN on page 88
MVN       Move NOT                                               MOV and MVN on page 88
ORN       Logical OR NOT                                         AND, ORR, EOR, BIC, and ORN on page 84
ORR       Logical OR                                             AND, ORR, EOR, BIC, and ORN on page 84
RBIT      Reverse Bits                                           REV, REV16, REVSH, and RBIT on page 91
REV       Reverse byte order in a word                           REV, REV16, REVSH, and RBIT on page 91
REV16     Reverse byte order in each halfword                    REV, REV16, REVSH, and RBIT on page 91
REVSH     Reverse byte order in bottom halfword and sign extend  REV, REV16, REVSH, and RBIT on page 91
ROR       Rotate Right                                           ASR, LSL, LSR, ROR, and RRX on page 85
RRX       Rotate Right with Extend                               ASR, LSL, LSR, ROR, and RRX on page 85
RSB       Reverse Subtract                                       ADD, ADC, SUB, SBC, and RSB on page 82
SADD16    Signed Add 16                                          SADD16 and SADD8 on page 92
SADD8     Signed Add 8                                           SADD16 and SADD8 on page 92
SASX      Signed Add and Subtract with Exchange                  SASX and SSAX on page 97
SSAX      Signed Subtract and Add with Exchange                  SASX and SSAX on page 97
SBC       Subtract with Carry                                    ADD, ADC, SUB, SBC, and RSB on page 82
SHADD16   Signed Halving Add 16                                  SHADD16 and SHADD8 on page 93
SHADD8    Signed Halving Add 8                                   SHADD16 and SHADD8 on page 93
SHASX     Signed Halving Add and Subtract with Exchange          SHASX and SHSAX on page 94
SHSAX     Signed Halving Subtract and Add with Exchange          SHASX and SHSAX on page 94
SHSUB16   Signed Halving Subtract 16                             SHSUB16 and SHSUB8 on page 95
SHSUB8    Signed Halving Subtract 8                              SHSUB16 and SHSUB8 on page 95
SSUB16    Signed Subtract 16                                     SSUB16 and SSUB8 on page 96
SSUB8     Signed Subtract 8                                      SSUB16 and SSUB8 on page 96
SUB       Subtract                                               ADD, ADC, SUB, SBC, and RSB on page 82
SUBW      Subtract                                               ADD, ADC, SUB, SBC, and RSB on page 82
TEQ       Test Equivalence                                       SADD16 and SADD8 on page 92
TST       Test                                                   SADD16 and SADD8 on page 92
UADD16    Unsigned Add 16                                        UADD16 and UADD8 on page 99
UADD8     Unsigned Add 8                                         UADD16 and UADD8 on page 99
UASX      Unsigned Add and Subtract with Exchange                UASX and USAX on page 100
USAX      Unsigned Subtract and Add with Exchange                UASX and USAX on page 100
UHADD16   Unsigned Halving Add 16                                UHADD16 and UHADD8 on page 101
UHADD8    Unsigned Halving Add 8                                 UHADD16 and UHADD8 on page 101
UHASX     Unsigned Halving Add and Subtract with Exchange        UHASX and UHSAX on page 102
UHSAX     Unsigned Halving Subtract and Add with Exchange        UHASX and UHSAX on page 102
UHSUB16   Unsigned Halving Subtract 16                           UHSUB16 and UHSUB8 on page 103
UHSUB8    Unsigned Halving Subtract 8                            UHSUB16 and UHSUB8 on page 103
USAD8     Unsigned Sum of Absolute Differences                   USAD8 on page 105
USADA8    Unsigned Sum of Absolute Differences and accumulate    USADA8 on page 106
USUB16    Unsigned Subtract 16                                   USUB16 and USUB8 on page 107
USUB8     Unsigned Subtract 8                                    USUB16 and USUB8 on page 107
3.5.1 ADD, ADC, SUB, SBC, and RSB
Add, Add with Carry, Subtract, Subtract with Carry, and Reverse Subtract.
Syntax
op{S}{cond} {Rd,} Rn, Operand2
op{cond} {Rd,} Rn, #imm12 ; ADD and SUB only
Where:
•‘op’ is one of the following:
ADD: Add
ADC: Add with carry
SUB: Subtract
SBC: Subtract with carry
RSB: Reverse subtract
•‘S’ is an optional suffix. If S is specified, the condition code flags are updated on the
result of the operation (see Conditional execution on page 64)
•‘cond’ is an optional condition code (see Conditional execution on page 64)
•‘Rd’ is the destination register. If Rd is omitted, the destination register is Rn
•‘Rn’ is the register holding the first operand
•‘Operand2’ is a flexible second operand (see Flexible second operand on page 59 for
details of the options)
•‘imm12’ is any value in the range 0 to 4095
Operation
The ADD instruction adds the value of operand2 or imm12 to the value in Rn.
The ADC instruction adds the values in Rn and operand2, together with the carry flag.
The SUB instruction subtracts the value of operand2 or imm12 from the value in Rn.
The SBC instruction subtracts the value of operand2 from the value in Rn. If the carry flag is
clear, the result is reduced by one.
The RSB instruction subtracts the value in Rn from the value of operand2. This is useful
because of the wide range of options for operand2.
Use ADC and SBC to synthesize multiword arithmetic (see Multiword arithmetic examples
on page 83 and ADR on page 69).
ADDW is equivalent to the ADD syntax that uses the imm12 operand. SUBW is equivalent
to the SUB syntax that uses the imm12 operand.
Restrictions
In these instructions:
•Operand2 must be neither SP nor PC
•Rd can be SP only in ADD and SUB, and only with the following additional restrictions:
–Rn must also be SP.
–Any shift in operand2 must be limited to a maximum of three bits using LSL.
•Rn can be SP only in ADD and SUB.
•Rd can be PC only in the ADD{cond} PC, PC, Rm instruction where:
–You must not specify the S suffix.
–Rm must be neither PC nor SP.
–If the instruction is conditional, it must be the last instruction in the IT block.
•With the exception of the ADD{cond} PC, PC, Rm instruction, Rn can be PC only in
ADD and SUB, and only with the following additional restrictions:
–You must not specify the S suffix.
–The second operand must be a constant in the range 0 to 4095.
Note: 1. When using the PC for an addition or a subtraction, bits[1:0] of the PC are rounded to b00
before performing the calculation, making the base address for the calculation word-aligned.
2. If you want to generate the address of an instruction, you have to adjust the constant based
on the value of the PC. ARM recommends that you use the ADR instruction instead of ADD or
SUB with Rn equal to the PC, because your assembler automatically calculates the correct
constant for the ADR instruction.
When Rd is PC in the ADD{cond} PC, PC, Rm instruction:
•Bit[0] of the value written to the PC is ignored.
•A branch occurs to the address created by forcing bit[0] of that value to 0.
Condition flags
If S is specified, these instructions update the N, Z, C and V flags according to the result.
Examples
ADD R2, R1, R3
SUBS R8, R6, #240 ; sets the flags on the result
RSB R4, R4, #1280 ; subtracts contents of R4 from 1280
ADCHI R11, R0, R3 ; only executed if C flag set and Z flag clear
Multiword arithmetic examples
Specific example 4: 64-bit addition shows two instructions that add a 64-bit integer
contained in R2 and R3 to another 64-bit integer contained in R0 and R1, and place the
result in R4 and R5.
Specific example 4: 64-bit addition
ADDS R4, R0, R2 ; add the least significant words
ADC R5, R1, R3 ; add the most significant words with carry
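The carry propagation in the ADDS/ADC pair can be sketched in portable C. This is only an illustrative model, not generated code: the type u64_words and the function add64_words are hypothetical names, and the C flag is represented by an ordinary variable.

```c
#include <stdint.h>

/* Illustrative (hypothetical) container for a 64-bit value split into words. */
typedef struct { uint32_t lo, hi; } u64_words;

/* Models ADDS R4, R0, R2 followed by ADC R5, R1, R3. */
static u64_words add64_words(uint32_t lo_a, uint32_t hi_a,
                             uint32_t lo_b, uint32_t hi_b)
{
    u64_words r;
    r.lo = lo_a + lo_b;                 /* ADDS: low words, wraps modulo 2^32   */
    uint32_t carry = (r.lo < lo_a);     /* C flag: set when the low add wrapped */
    r.hi = hi_a + hi_b + carry;         /* ADC: high words plus the carry flag  */
    return r;
}
```

For instance, adding 1 to 0x00000000FFFFFFFF gives a low word of 0 and a high word of 1, because the carry out of the low word is consumed by the second addition.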
Multiword values do not have to use consecutive registers. Specific example 5: 96-bit
subtraction shows instructions that subtract a 96-bit integer contained in R9, R1, and R11
from another contained in R6, R2, and R8. The example stores the result in R6, R9, and R2.
Specific example 5: 96-bit subtraction
SUBS R6, R6, R9 ; subtract the least significant words
SBCS R9, R2, R1 ; subtract the middle words with carry
SBC R2, R8, R11 ; subtract the most significant words with carry
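The borrow chain of SUBS/SBCS/SBC can be modeled the same way. The sketch below assumes the usual identity a - b = a + NOT(b) + 1, with the carry flag acting as "no borrow"; sub96_words is a hypothetical helper name, and index 0 holds the least significant word.

```c
#include <stdint.h>

/* Subtracts the 96-bit value b from a, one 32-bit word at a time. */
static void sub96_words(const uint32_t a[3], const uint32_t b[3],
                        uint32_t out[3])
{
    uint32_t carry = 1u;  /* plain SUB behaves like SBC with C = 1 (no borrow) */
    for (int i = 0; i < 3; i++) {
        uint64_t t = (uint64_t)a[i] + (uint32_t)~b[i] + carry;
        out[i] = (uint32_t)t;
        carry = (uint32_t)(t >> 32);  /* C = 1 means no borrow from this word */
    }
}
```

When a word subtraction borrows, the carry becomes 0 and the next word is reduced by one, which matches the description of SBC above.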
3.5.2 AND, ORR, EOR, BIC, and ORN
Logical AND, OR, Exclusive OR, Bit Clear, and OR NOT.
Syntax
op{S}{cond} {Rd,} Rn, Operand2
Where:
•‘op’ is one of:
AND: Logical AND.
ORR: Logical OR or bit set.
EOR: Logical exclusive OR.
BIC: Logical AND NOT or bit clear.
ORN: Logical OR NOT.
•‘S’ is an optional suffix. If S is specified, the condition code flags are updated on the
result of the operation, see Conditional execution on page 64.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rd’ is the destination register.
•‘Rn’ is the register holding the first operand.
•‘Operand2’ is a flexible second operand, see Flexible second operand on page 59 for
details of the options.
Operation
The AND, EOR, and ORR instructions perform bitwise AND, exclusive OR, and OR
operations on the values in Rn and operand2.
The BIC instruction performs an AND operation on the bits in Rn with the complements of
the corresponding bits in the value of operand2.
The ORN instruction performs an OR operation on the bits in Rn with the complements of
the corresponding bits in the value of operand2.
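As a rough illustration, BIC and ORN correspond to the following C expressions. The function names bic_model and orn_model are hypothetical, and flag updates are not modeled.

```c
#include <stdint.h>

/* BIC: AND Rn with the complement of operand2 (clears the selected bits). */
static uint32_t bic_model(uint32_t rn, uint32_t op2) { return rn & ~op2; }

/* ORN: OR Rn with the complement of operand2. */
static uint32_t orn_model(uint32_t rn, uint32_t op2) { return rn | ~op2; }
```

With these definitions, bic_model(x, 0xAB) clears bits 0, 1, 3, 5 and 7 of x and leaves the other bits unchanged.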
Restrictions
Do not use either SP or PC.
Condition flags
If S is specified, these instructions:
•Update the N and Z flags according to the result.
•Can update the C flag during the calculation of operand2, see Flexible second operand
on page 59.
•Do not affect the V flag.
Examples
AND R9, R2, #0xFF00
ORREQ R2, R0, R5
ANDS R9, R8, #0x19
EORS R7, R11, #0x18181818
BIC R0, R1, #0xab
ORN R7, R11, R14, ROR #4
ORNS R7, R11, R14, ASR #32
3.5.3 ASR, LSL, LSR, ROR, and RRX
Arithmetic Shift Right, Logical Shift Left, Logical Shift Right, Rotate Right, and Rotate Right
with Extend.
Syntax
op{S}{cond} Rd, Rm, Rs
op{S}{cond} Rd, Rm, #n
RRX{S}{cond} Rd, Rm
Where:
•‘op’ is one of the following:
ASR: Arithmetic Shift Right
LSL: Logical Shift Left
LSR: Logical Shift Right
ROR: Rotate Right
•‘S’ is an optional suffix. If S is specified, the condition code flags are updated on the
result of the operation, see Conditional execution on page 64.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rd’ is the destination register.
•‘Rm’ is the register holding the value to be shifted.
•‘Rs’ is the register holding the shift length to apply to the value Rm. Only the least
significant byte is used and can be in the range 0 to 255.
•‘n’ is the shift length. The range of shift lengths depends on the instruction as follows:
ASR: Shift length from 1 to 32
LSL: Shift length from 0 to 31
LSR: Shift length from 1 to 32
ROR: Shift length from 1 to 31
Note: MOVS Rd, Rm is the preferred syntax for LSLS Rd, Rm, #0.
Operation
ASR, LSL, LSR, and ROR move the bits in the Rm register to the left or right by the number
of places specified by constant n or register Rs.
RRX moves the bits in the Rm register to the right by 1.
In all these instructions, the result is written to Rd, but the value in the Rm register remains
unchanged. For details on what result is generated by the different instructions see Shift
operations on page 61.
Restrictions
Do not use either SP or PC.
Condition flags
If S is specified:
•These instructions update the N and Z flags according to the result
•The C flag is updated to the last bit shifted out, except when the shift length is 0 (see
Shift operations on page 61).
Examples
ASR R7, R8, #9 ; arithmetic shift right by 9 bits
LSLS R1, R2, #3 ; logical shift left by 3 bits with flag update
LSR R4, R5, #6 ; logical shift right by 6 bits
ROR R4, R5, R6 ; rotate right by the value in the bottom byte of R6
RRX R4, R5 ; rotate right with extend
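The results produced by these instructions can be approximated in C as shown below. This is a sketch under stated assumptions: the helper names are hypothetical, the shift length n is restricted to 1 to 31 so that every C shift is well defined, and the update of the C flag is not modeled (RRX takes the incoming carry as a parameter).

```c
#include <stdint.h>

static uint32_t lsl_model(uint32_t x, unsigned n) { return x << n; }
static uint32_t lsr_model(uint32_t x, unsigned n) { return x >> n; }

/* ASR: copies of the original bit[31] fill the vacated upper bits. */
static uint32_t asr_model(uint32_t x, unsigned n)
{
    uint32_t fill = (x & 0x80000000u) ? ~(0xFFFFFFFFu >> n) : 0u;
    return (x >> n) | fill;
}

/* ROR: bits shifted out on the right re-enter on the left. */
static uint32_t ror_model(uint32_t x, unsigned n)
{
    return (x >> n) | (x << (32u - n));
}

/* RRX: shift right by one, the carry flag becomes the new bit[31]. */
static uint32_t rrx_model(uint32_t x, unsigned carry_in)
{
    return (x >> 1) | ((uint32_t)(carry_in & 1u) << 31);
}
```

The ASR helper avoids right-shifting a negative signed value, whose result is implementation-defined in C.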
3.5.4 CLZ
Count leading zeros.
Syntax
CLZ{cond} Rd, Rm
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rm’ is the operand register.
Operation
The CLZ instruction counts the number of leading zeros in the value in Rm and returns the
result in Rd. The result value is 32 if no bits are set in the source register, and zero if bit[31]
is set.
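The behavior described above can be written as a simple C loop. The function clz_model is a hypothetical name used only for illustration; it is not how the processor implements the instruction.

```c
#include <stdint.h>

/* Counts the zero bits above the most significant set bit of x. */
static unsigned clz_model(uint32_t x)
{
    unsigned n = 0;
    if (x == 0u)
        return 32u;                      /* no bits set in the source */
    while ((x & 0x80000000u) == 0u) {    /* stop when bit[31] is set  */
        x <<= 1;
        n++;
    }
    return n;
}
```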
Restrictions
Do not use either SP or PC.
Condition flags
This instruction does not change the flags.
Examples
CLZ R4, R9
CLZNE R2, R3
3.5.5 CMP and CMN
Compare and Compare Negative.
Syntax
CMP{cond} Rn, Operand2
CMN{cond} Rn, Operand2
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rn’ is the register holding the first operand.
•‘Operand2’ is a flexible second operand (see Flexible second operand on page 59 for
details of the options).
Operation
These instructions compare the value in a register with operand2. They update the condition
flags on the result, but do not write the result to a register.
The CMP instruction subtracts the value of operand2 from the value in Rn. This is the same
as a SUBS instruction, except that the result is discarded.
The CMN instruction adds the value of operand2 to the value in Rn. This is the same as an
ADDS instruction, except that the result is discarded.
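The flag results of CMP and CMN can be sketched in C by computing the sum or difference and deriving N, Z, C and V from it. The type flags_model and the function names are hypothetical; the subtraction is modeled as Rn + NOT(operand2) + 1, so C = 1 means that no borrow occurred.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct { bool n, z, c, v; } flags_model;  /* illustrative flag set */

static flags_model add_with_carry_flags(uint32_t a, uint32_t b, uint32_t cin)
{
    uint64_t wide = (uint64_t)a + b + cin;
    uint32_t res = (uint32_t)wide;
    flags_model f;
    f.n = (res >> 31) != 0u;                        /* result is negative     */
    f.z = (res == 0u);                              /* result is zero         */
    f.c = (wide >> 32) != 0u;                       /* unsigned carry out     */
    f.v = ((~(a ^ b) & (a ^ res)) >> 31) != 0u;     /* signed overflow        */
    return f;
}

/* CMP: flags of Rn - operand2, result discarded. */
static flags_model cmp_model(uint32_t rn, uint32_t op2)
{
    return add_with_carry_flags(rn, ~op2, 1u);
}

/* CMN: flags of Rn + operand2, result discarded. */
static flags_model cmn_model(uint32_t rn, uint32_t op2)
{
    return add_with_carry_flags(rn, op2, 0u);
}
```

Comparing two equal values sets Z and C, which is why the EQ and HS conditions both pass after such a CMP.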
Restrictions
In these instructions:
•Do not use PC.
•Operand2 must not be SP.
Condition flags
These instructions update the N, Z, C and V flags according to the result.
Examples
CMP R2, R9
CMN R0, #6400
CMPGT SP, R7, LSL #2
3.5.6 MOV and MVN
Move and Move NOT.
Syntax
MOV{S}{cond} Rd, Operand2
MOV{cond} Rd, #imm16
MVN{S}{cond} Rd, Operand2
Where:
•‘S’ is an optional suffix. If S is specified, the condition code flags are updated on the
result of the operation (see Conditional execution on page 64).
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Operand2’ is a flexible second operand (see Flexible second operand on page 59 for
details of the options).
•‘imm16’ is any value in the range 0 to 65535.
Operation
The MOV instruction copies the value of operand2 into Rd.
When operand2 in a MOV instruction is a register with a shift other than LSL #0, the
preferred syntax is the corresponding shift instruction:
•ASR{S}{cond} Rd, Rm, #n is the preferred syntax for MOV{S}{cond} Rd, Rm, ASR #n
•LSL{S}{cond} Rd, Rm, #n is the preferred syntax for MOV{S}{cond} Rd, Rm, LSL #n if n
!= 0
•LSR{S}{cond} Rd, Rm, #n is the preferred syntax for MOV{S}{cond} Rd, Rm, LSR #n
•ROR{S}{cond} Rd, Rm, #n is the preferred syntax for MOV{S}{cond} Rd, Rm, ROR #n
•RRX{S}{cond} Rd, Rm is the preferred syntax for MOV{S}{cond} Rd, Rm, RRX
Also, the MOV instruction permits additional forms of operand2 as synonyms for shift
instructions:
•MOV{S}{cond} Rd, Rm, ASR Rs is a synonym for ASR{S}{cond} Rd, Rm, Rs
•MOV{S}{cond} Rd, Rm, LSL Rs is a synonym for LSL{S}{cond} Rd, Rm, Rs
•MOV{S}{cond} Rd, Rm, LSR Rs is a synonym for LSR{S}{cond} Rd, Rm, Rs
•MOV{S}{cond} Rd, Rm, ROR Rs is a synonym for ROR{S}{cond} Rd, Rm, Rs
See ASR, LSL, LSR, ROR, and RRX on page 85.
The MVN instruction takes the value of operand2, performs a bitwise logical NOT operation
on the value, and places the result into Rd.
Note: The MOVW instruction provides the same function as MOV, but is restricted to use of the
imm16 operand.
Restrictions
You can use SP and PC only in the MOV instruction, with the following restrictions:
•The second operand must be a register without shift
•You must not specify the S suffix
When Rd is PC in a MOV instruction:
•Bit[0] of the value written to the PC is ignored.
•A branch occurs to the address created by forcing bit[0] of that value to 0.
Note: Though it is possible to use MOV as a branch instruction, ARM strongly recommends the
use of a BX or BLX instruction to branch for software portability to the ARM instruction set.
Condition flags
If S is specified, these instructions:
•Update the N and Z flags according to the result
•Can update the C flag during the calculation of operand2 (see Flexible second operand
on page 59).
•Do not affect the V flag
Examples
MOVS R11, #0x000B ; write value of 0x000B to R11, flags get updated
MOV R1, #0xFA05 ; write value of 0xFA05 to R1, flags not updated
MOVS R10, R12 ; write value in R12 to R10, flags get updated
MOV R3, #23 ; write value of 23 to R3
MOV R8, SP ; write value of stack pointer to R8
MVNS R2, #0xF ; write value of 0xFFFFFFF0 (bitwise inverse of 0xF)
; to R2 and update flags
3.5.7 MOVT
Move Top.
Syntax
MOVT{cond} Rd, #imm16
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘imm16’ is a 16-bit immediate constant.
Operation
MOVT writes a 16-bit immediate value, imm16, to the top halfword, Rd[31:16], of its
destination register. The write does not affect Rd[15:0].
The MOV, MOVT instruction pair enables you to generate any 32-bit constant.
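A minimal C model of the MOV/MOVT pair is shown below, assuming that MOV with an imm16 operand writes the zero-extended constant to Rd. The function names movw_model and movt_model are hypothetical.

```c
#include <stdint.h>

/* MOV Rd, #imm16: Rd receives the zero-extended 16-bit constant. */
static uint32_t movw_model(uint16_t imm16) { return imm16; }

/* MOVT Rd, #imm16: replaces Rd[31:16] and leaves Rd[15:0] unchanged. */
static uint32_t movt_model(uint32_t rd, uint16_t imm16)
{
    return (rd & 0x0000FFFFu) | ((uint32_t)imm16 << 16);
}
```

Under this model, loading 0x5678 with MOV and then applying MOVT with 0x1234 leaves 0x12345678 in the register.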
Restrictions
Rd must be neither SP nor PC.
Condition flags
This instruction does not change the flags.
Examples
MOVT R3, #0xF123 ; write 0xF123 to upper halfword of R3,
; lower halfword and APSR are unchanged
3.5.8 REV, REV16, REVSH, and RBIT
Reverse bytes and Reverse bits.
Syntax
op{cond} Rd, Rn
Where:
•‘op’ is one of the following:
REV: Reverse byte order in a word.
REV16: Reverse byte order in each halfword independently.
REVSH: Reverse byte order in the bottom halfword, and sign extend to 32 bits.
RBIT: Reverse the bit order in a 32-bit word.
•‘cond’ is an optional condition code, see Conditional execution on page 64.
•‘Rd’ is the destination register.
•‘Rn’ is the register holding the operand.
Operation
Use these instructions to change endianness of data:
•REV: Converts either:
–32-bit big-endian data into little-endian data
–or 32-bit little-endian data into big-endian data.
•REV16: Converts either:
–16-bit big-endian data into little-endian data
–or 16-bit little-endian data into big-endian data.
•REVSH: Converts either:
–16-bit signed big-endian data into 32-bit signed little-endian data
–or 16-bit signed little-endian data into 32-bit signed big-endian data.
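The four operations can be modeled in portable C as follows. The function names are hypothetical and the code only illustrates the resulting bit patterns.

```c
#include <stdint.h>

/* REV: reverse the order of the four bytes in a word. */
static uint32_t rev_model(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: swap the two bytes inside each halfword independently. */
static uint32_t rev16_model(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: swap the bytes of the bottom halfword, then sign extend to 32 bits. */
static uint32_t revsh_model(uint32_t x)
{
    uint32_t h = ((x >> 8) & 0xFFu) | ((x & 0xFFu) << 8);
    return (h & 0x8000u) ? (h | 0xFFFF0000u) : h;
}

/* RBIT: reverse the order of all 32 bits. */
static uint32_t rbit_model(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```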
Restrictions
Do not use either SP or PC.
Condition flags
These instructions do not change the flags.
Examples
REV R3, R7 ; reverse byte order of value in R7 and write it to R3
REV16 R0, R0 ; reverse byte order of each 16-bit halfword in R0
REVSH R0, R5 ; reverse Signed Halfword
REVHS R3, R7 ; reverse with Higher or Same condition
RBIT R7, R8 ; reverse bit order of value in R8 and write result to R7
3.5.9 SADD16 and SADD8
Signed Add 16 and Signed Add 8
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is any of the following:
SADD16: Performs two 16-bit signed integer additions.
SADD8: Performs four 8-bit signed integer additions.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’ is the register holding the first operand.
•‘Rm’ is the register holding the second operand.
Operation
Use these instructions to perform a halfword or byte add in parallel:
The SADD16 instruction:
1. Adds each halfword from the first operand to the corresponding halfword of the second
operand.
2. Writes the result in the corresponding halfwords of the destination register.
The SADD8 instruction:
1. Adds each byte of the first operand to the corresponding byte of the second operand.
2. Writes the result in the corresponding bytes of the destination register.
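The lane-wise behavior can be sketched in C. The point of the model is that each halfword or byte is added on its own, so a carry out of one lane never reaches the next one. The function names are hypothetical, and each lane result is assumed to wrap modulo the lane width.

```c
#include <stdint.h>

/* SADD16: two independent 16-bit additions. */
static uint32_t sadd16_model(uint32_t rn, uint32_t rm)
{
    uint32_t lo = (rn + rm) & 0x0000FFFFu;            /* bottom halfwords */
    uint32_t hi = ((rn >> 16) + (rm >> 16)) << 16;    /* top halfwords    */
    return hi | lo;
}

/* SADD8: four independent 8-bit additions. */
static uint32_t sadd8_model(uint32_t rn, uint32_t rm)
{
    uint32_t r = 0;
    for (unsigned s = 0; s < 32; s += 8)
        r |= (((rn >> s) + (rm >> s)) & 0xFFu) << s;  /* keep one lane */
    return r;
}
```

For example, adding the halfword 0xFFFF (-1) to 0x0001 gives 0x0000 in that lane and leaves the neighboring lane untouched.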
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not change the flags.
Examples
SADD16 R1, R0 ; Adds the halfwords in R0 to the corresponding halfword
; of R1 and writes to corresponding halfword of R1.
SADD8 R4, R0, R5 ; Adds bytes of R0 to the corresponding byte in R5 and
; writes to the corresponding byte in R4.
3.5.10 SHADD16 and SHADD8
Signed Halving Add 16 and Signed Halving Add 8
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is any of the following:
SHADD16: Signed halving add 16.
SHADD8: Signed halving add 8.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’ is the register holding the first operand.
•‘Rm’ is the register holding the second operand.
Operation
Use these instructions to add 16-bit and 8-bit data and then to halve the result before writing
the result to the destination register:
The SHADD16 instruction:
1. Adds each halfword from the first operand to the corresponding halfword of the second
operand.
2. Shifts the result by one bit to the right, halving the data.
3. Writes the halfword results in the destination register.
The SHADD8 instruction:
1. Adds each byte of the first operand to the corresponding byte of the second operand.
2. Shifts the result by one bit to the right, halving the data.
3. Writes the byte results in the destination register.
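A possible C model of the halving additions is given below. It assumes that each lane is sign extended before the addition, so the intermediate sum cannot overflow, and that the halved value is taken from bits [16:1] (or [8:1]) of that sum. All function names are hypothetical.

```c
#include <stdint.h>

/* Sign extend a 16-bit or 8-bit lane to 32 bits (two's complement). */
static uint32_t sext16(uint32_t h) { return (h & 0x8000u) ? (h | 0xFFFF0000u) : (h & 0xFFFFu); }
static uint32_t sext8(uint32_t b)  { return (b & 0x80u) ? (b | 0xFFFFFF00u) : (b & 0xFFu); }

/* SHADD16: per-halfword signed add, then halve. */
static uint32_t shadd16_model(uint32_t rn, uint32_t rm)
{
    uint32_t lo = ((sext16(rn & 0xFFFFu) + sext16(rm & 0xFFFFu)) >> 1) & 0xFFFFu;
    uint32_t hi = ((sext16(rn >> 16) + sext16(rm >> 16)) >> 1) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* SHADD8: per-byte signed add, then halve. */
static uint32_t shadd8_model(uint32_t rn, uint32_t rm)
{
    uint32_t r = 0;
    for (unsigned s = 0; s < 32; s += 8) {
        uint32_t sum = sext8((rn >> s) & 0xFFu) + sext8((rm >> s) & 0xFFu);
        r |= ((sum >> 1) & 0xFFu) << s;
    }
    return r;
}
```

Because of the halving, 0x7FFF + 0x7FFF yields 0x7FFF rather than wrapping to a negative value.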
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not change the flags.
Examples
SHADD16 R1, R0 ; Adds halfwords in R0 to corresponding halfword of R1 &
; writes halved result to corresponding halfword in R1
SHADD8 R4, R0, R5 ; Adds bytes of R0 to corresponding byte in R5 and
; writes halved result to corresponding byte in R4.
3.5.11 SHASX and SHSAX
Signed Halving Add and Subtract with Exchange / Signed Halving Subtract and Add with
Exchange.
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is any of the following:
SHASX: Add and subtract with exchange and halving.
SHSAX: Subtract and add with exchange and halving.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’, ‘Rm’ are the registers holding the first and second operands.
Operation
The SHASX instruction:
1. Adds the top halfword of the first operand to the bottom halfword of the second operand.
2. Writes the halfword result of the addition to the top halfword of the destination register,
shifted by one bit to the right, causing a divide by two, or halving.
3. Subtracts the top halfword of the second operand from the bottom halfword of the first
operand.
4. Writes the halfword result of the subtraction to the bottom halfword of the destination
register, shifted by one bit to the right, causing a divide by two, or halving.
The SHSAX instruction:
1. Subtracts the bottom halfword of the second operand from the top halfword of the first
operand.
2. Writes the halfword result of the subtraction to the top halfword of the destination
register, shifted by one bit to the right, causing a divide by two, or halving.
3. Adds the bottom halfword of the first operand to the top halfword of the second
operand.
4. Writes the halfword result of the addition to the bottom halfword of the destination register,
shifted by one bit to the right, causing a divide by two, or halving.
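The exchange pattern is easier to follow in code. The sketch below is a hypothetical C model, assuming that SHASX places the halved sum in the top halfword and the halved difference in the bottom halfword, and that SHSAX does the opposite.

```c
#include <stdint.h>

/* Sign extend a 16-bit lane to 32 bits (two's complement). */
static uint32_t sx16(uint32_t h) { return (h & 0x8000u) ? (h | 0xFFFF0000u) : (h & 0xFFFFu); }

static uint32_t shasx_model(uint32_t rn, uint32_t rm)
{
    uint32_t sum  = sx16(rn >> 16) + sx16(rm & 0xFFFFu);   /* top(Rn) + bottom(Rm) */
    uint32_t diff = sx16(rn & 0xFFFFu) - sx16(rm >> 16);   /* bottom(Rn) - top(Rm) */
    return (((sum >> 1) & 0xFFFFu) << 16) | ((diff >> 1) & 0xFFFFu);
}

static uint32_t shsax_model(uint32_t rn, uint32_t rm)
{
    uint32_t diff = sx16(rn >> 16) - sx16(rm & 0xFFFFu);   /* top(Rn) - bottom(Rm) */
    uint32_t sum  = sx16(rn & 0xFFFFu) + sx16(rm >> 16);   /* bottom(Rn) + top(Rm) */
    return (((diff >> 1) & 0xFFFFu) << 16) | ((sum >> 1) & 0xFFFFu);
}
```

With Rn = 0x00060002 and Rm = 0x00040002, the SHASX model returns 0x0004FFFF: (6 + 2) / 2 = 4 in the top halfword and (2 - 4) / 2 = -1 in the bottom halfword.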
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the condition code flags.
Examples
SHASX R7, R4, R2 ; Adds top halfword of R4 to bottom halfword of R2
; and writes halved result to top halfword of R7
; Subtracts top halfword of R2 from bottom halfword of
; R4 and writes halved result to bottom halfword of R7
SHSAX R0, R3, R5 ; Subtracts bottom halfword of R5 from top halfword
; of R3 and writes halved result to top halfword of R0
; Adds top halfword of R5 to bottom halfword of R3 and
; writes halved result to bottom halfword of R0.
3.5.12 SHSUB16 and SHSUB8
Signed Halving Subtract 16 and Signed Halving Subtract 8
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is any of the following:
SHSUB16: Signed halving subtract 16
SHSUB8: Signed halving subtract 8
•‘cond’ is an optional condition code (see Conditional execution on page 64)
•‘Rd’ is the destination register
•‘Rn’ is the register holding the first operand
•‘Rm’ is the register holding the second operand
Operation
Use these instructions to subtract 16-bit and 8-bit data and then to halve the result before
writing the result to the destination register:
The SHSUB16 instruction:
1. Subtracts each halfword of the second operand from the corresponding halfword of
the first operand.
2. Shifts the result by one bit to the right, halving the data.
3. Writes the halved halfword results in the destination register.
The SHSUB8 instruction:
1. Subtracts each byte of the second operand from the corresponding byte of the first
operand.
2. Shifts the result by one bit to the right, halving the data.
3. Writes the corresponding signed byte results in the destination register.
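Both instructions follow the same per-lane recipe, so one parameterized C helper can illustrate them. The names sext_lane and shsub_lanes are hypothetical; the lane width is passed as 16 for SHSUB16 and 8 for SHSUB8.

```c
#include <stdint.h>

/* Sign extend the low `bits` bits of v to 32 bits (two's complement). */
static uint32_t sext_lane(uint32_t v, unsigned bits)
{
    uint32_t m = 1u << (bits - 1);      /* sign bit of the lane   */
    v &= (m << 1) - 1u;                 /* keep only the lane     */
    return (v ^ m) - m;
}

/* Per lane: (first operand - second operand) >> 1, written back to the lane. */
static uint32_t shsub_lanes(uint32_t rn, uint32_t rm, unsigned bits)
{
    uint32_t mask = (1u << bits) - 1u;
    uint32_t r = 0;
    for (unsigned s = 0; s < 32; s += bits) {
        uint32_t diff = sext_lane(rn >> s, bits) - sext_lane(rm >> s, bits);
        r |= ((diff >> 1) & mask) << s;
    }
    return r;
}
```

The second operand is always subtracted from the first, so shsub_lanes(rn, rm, 8) corresponds to SHSUB8 Rd, Rn, Rm.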
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not change the flags.
Examples
SHSUB16 R1, R0 ; Subtracts halfwords in R0 from corresponding halfword
; of R1 and writes halved result to corresponding halfword of R1
SHSUB8 R4, R0, R5 ; Subtracts bytes of R5 from corresponding byte in R0,
; and writes halved result to corresponding byte in R4.
3.5.13 SSUB16 and SSUB8
Signed Subtract 16 and Signed Subtract 8
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is one of the following:
SSUB16: Performs two 16-bit signed integer subtractions.
SSUB8: Performs four 8-bit signed integer subtractions.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’ is the register holding the first operand.
•‘Rm’ is the register holding the second operand.
Operation
Use these instructions to perform a halfword or byte subtract in parallel:
The SSUB16 instruction:
1. Subtracts each halfword of the second operand from the corresponding halfword of
the first operand.
2. Writes the difference result of two signed halfwords in the corresponding halfword of
the destination register.
The SSUB8 instruction:
1. Subtracts each byte of the second operand from the corresponding byte of the first
operand.
2. Writes the difference result of four signed bytes in the corresponding byte of the
destination register.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not change the flags.
Examples
SSUB16 R1, R0 ; Subtracts halfwords in R0 from corresponding halfword
; of R1 and writes to corresponding halfword of R1
SSUB8 R4, R0, R5 ; Subtracts bytes of R5 from corresponding byte in
; R0, and writes to corresponding byte of R4.
3.5.14 SASX and SSAX
Signed Add and Subtract with Exchange and Signed Subtract and Add with Exchange.
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is any of the following:
SASX: Signed add and subtract with exchange.
SSAX: Signed subtract and add with exchange.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’, ‘Rm’ are the registers holding the first and second operands.
Operation
The SASX instruction:
1. Adds the signed top halfword of the first operand to the signed bottom halfword of
the second operand.
2. Writes the signed result of the addition to the top halfword of the destination register.
3. Subtracts the signed top halfword of the second operand from the signed bottom
halfword of the first operand.
4. Writes the signed result of the subtraction to the bottom halfword of the destination
register.
The SSAX instruction:
1. Subtracts the signed bottom halfword of the second operand from the signed top
halfword of the first operand.
2. Writes the signed result of the subtraction to the top halfword of the destination
register.
3. Adds the signed bottom halfword of the first operand to the signed top halfword of the
second operand.
4. Writes the signed result of the addition to the bottom halfword of the destination register.
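The data flow of the two instructions can be summarized by a short C model. It is a sketch only: the function names are hypothetical, each halfword result is assumed to wrap modulo 2^16, and the cross-wise pairing (top of Rn with bottom of Rm, and bottom of Rn with top of Rm) follows the architectural definition of the exchange operations.

```c
#include <stdint.h>

/* SASX: top = top(Rn) + bottom(Rm), bottom = bottom(Rn) - top(Rm). */
static uint32_t sasx_model(uint32_t rn, uint32_t rm)
{
    uint32_t sum  = (rn >> 16) + (rm & 0xFFFFu);
    uint32_t diff = (rn & 0xFFFFu) - (rm >> 16);
    return (sum << 16) | (diff & 0xFFFFu);
}

/* SSAX: top = top(Rn) - bottom(Rm), bottom = bottom(Rn) + top(Rm). */
static uint32_t ssax_model(uint32_t rn, uint32_t rm)
{
    uint32_t diff = (rn >> 16) - (rm & 0xFFFFu);
    uint32_t sum  = (rn & 0xFFFFu) + (rm >> 16);
    return (diff << 16) | (sum & 0xFFFFu);
}
```

With Rn = 0x00060002 and Rm = 0x00040003, the SASX model gives 0x0009FFFE (6 + 3 = 9 and 2 - 4 = -2) and the SSAX model gives 0x00030006 (6 - 3 = 3 and 2 + 4 = 6).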
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the condition code flags.
Examples
SASX R0, R4, R5 ; Adds top halfword of R4 to bottom halfword of R5 and
; writes to top halfword of R0
; Subtracts top halfword of R5 from bottom halfword of R4
; and writes to bottom halfword of R0
SSAX R7, R3, R2 ; Subtracts bottom halfword of R2 from top halfword of R3
; and writes to top halfword of R7
; Adds bottom halfword of R3 to top halfword of R2 and
; writes to bottom halfword of R7.
3.5.15 TST and TEQ
Test bits and Test Equivalence.
Syntax
TST{cond} Rn, Operand2
TEQ{cond} Rn, Operand2
Where:
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rn’ is the register holding the first operand.
•‘Operand2’ is a flexible second operand (see Flexible second operand on page 59 for
details of the options).
Operation
These instructions test the value in a register against operand2. They update the condition
flags based on the result, but do not write the result to a register.
The TST instruction performs a bitwise AND operation on the value in Rn and the value of
operand2. This is the same as the ANDS instruction, except that it discards the result.
To test whether a bit of Rn is 0 or 1, use the TST instruction with an operand2 constant that
has that bit set to 1 and all other bits cleared to 0.
The TEQ instruction performs a bitwise exclusive OR operation on the value in Rn and the
value of operand2. This is the same as the EORS instruction, except that it discards the
result.
Use the TEQ instruction to test if two values are equal without affecting the V or C flags.
TEQ is also useful for testing the sign of a value. After the comparison, the N flag is the
logical exclusive OR of the sign bits of the two operands.
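The N and Z results of these tests can be expressed in C as below. The helper names are hypothetical and only the flags mentioned in the text are modeled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Z flag after TST: set when no bit selected by operand2 is set in Rn. */
static bool tst_z(uint32_t rn, uint32_t op2) { return (rn & op2) == 0u; }

/* Z flag after TEQ: set when the two values are equal. */
static bool teq_z(uint32_t rn, uint32_t op2) { return (rn ^ op2) == 0u; }

/* N flag after TEQ: exclusive OR of the two sign bits. */
static bool teq_n(uint32_t rn, uint32_t op2) { return ((rn ^ op2) >> 31) != 0u; }
```

To test bit 4 of a value, for example, pass the mask 0x10 as operand2: tst_z returns false when the bit is 1 and true when it is 0.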
Restrictions
Do not use either SP or PC.
Condition flags
These instructions:
•Update the N and Z flags according to the result
•Can update the C flag during the calculation of operand2 (see Flexible second operand
on page 59).
•Do not affect the V flag
Examples
TST R0, #0x3F8 ; perform bitwise AND of R0 value with 0x3F8,
; APSR is updated but result is discarded
TEQEQ R10, R9 ; conditionally test if value in R10 is equal to
; value in R9, APSR is updated but result is discarded
3.5.16 UADD16 and UADD8
Unsigned Add 16 and Unsigned Add 8
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is one of the following:
UADD16: Performs two 16-bit unsigned integer additions.
UADD8: Performs four 8-bit unsigned integer additions.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’ is the register holding the first operand.
•‘Rm’ is the register holding the second operand.
Operation
Use these instructions to add 16- and 8-bit unsigned data:
The UADD16 instruction:
1. Adds each halfword from the first operand to the corresponding halfword of the second
operand.
2. Writes the unsigned result in the corresponding halfwords of the destination register.
The UADD8 instruction:
1. Adds each byte of the first operand to the corresponding byte of the second operand.
2. Writes the unsigned result in the corresponding byte of the destination register.
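As an illustration, the unsigned lane additions can be modeled in C by truncating each lane sum to the lane width. The function names are hypothetical, and the model assumes that a carry out of a lane is simply dropped from the lane result.

```c
#include <stdint.h>

/* UADD16: two independent unsigned 16-bit additions. */
static uint32_t uadd16_model(uint32_t rn, uint32_t rm)
{
    uint16_t lo = (uint16_t)((rn & 0xFFFFu) + (rm & 0xFFFFu));
    uint16_t hi = (uint16_t)((rn >> 16) + (rm >> 16));
    return ((uint32_t)hi << 16) | lo;
}

/* UADD8: four independent unsigned 8-bit additions. */
static uint32_t uadd8_model(uint32_t rn, uint32_t rm)
{
    uint32_t r = 0;
    for (unsigned s = 0; s < 32; s += 8) {
        uint8_t lane = (uint8_t)(((rn >> s) & 0xFFu) + ((rm >> s) & 0xFFu));
        r |= (uint32_t)lane << s;
    }
    return r;
}
```

In this model 0xFFFF + 0x0001 in the top lane wraps to 0x0000 and does not disturb the bottom lane.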
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not change the flags.
Examples
UADD16 R1, R0 ; Adds halfwords in R0 to corresponding halfword of R1,
; writes to corresponding halfword of R1
UADD8 R4, R0, R5 ; Adds bytes of R0 to corresponding byte in R5 and writes
; to corresponding byte in R4.
3.5.17 UASX and USAX
Unsigned Add and Subtract with Exchange and Unsigned Subtract and Add with Exchange.
Syntax
op{cond} {Rd,} Rn, Rm
Where:
•‘op’ is one of:
UASX: Add and subtract with exchange.
USAX: Subtract and add with exchange.
•‘cond’ is an optional condition code (see Conditional execution on page 64).
•‘Rd’ is the destination register.
•‘Rn’, ‘Rm’ are registers holding the first and second operands.
Operation
The UASX instruction:
1.Subtracts the top halfword of the second operand from the bottom halfword of the first
operand.
2. Writes the unsigned result from the subtraction to the bottom halfword of the
destination register.
3. Adds the top halfword of the first operand with bottom halfword of the second operand.
4. Writes the unsigned result of the addition to the top halfword of the destination register.
The USAX instruction:
1.Adds the bottom halfword of the first operand to the top halfword of the second
operand.
2. Writes the unsigned result of the addition to the bottom halfword of the destination
register.
3. Subtracts the bottom halfword of the second operand from the top halfword of the first
operand.
4. Writes the unsigned result from the subtraction to the top halfword of the destination
register.
Restrictions
Do not use SP and do not use PC.
Condition flags
These instructions do not affect the condition code flags.
Examples
UASX R0, R4, R5 ; Adds top halfword of R4 to bottom halfword of R5 and
; writes to top halfword of R0
; Subtracts top halfword of R5 from bottom halfword of R4
; and writes to bottom halfword of R0
USAX R7, R3, R2 ; Subtracts bottom halfword of R2 from top halfword of R3
; and writes to top halfword of R7
; Adds bottom halfword of R3 to top halfword of R2 and
; writes to bottom halfword of R7.