The STM8 family of HCMOS microcontrollers is designed and built around an enhanced
industry standard 8-bit core and a library of peripheral blocks, which include ROM, Flash,
RAM, EEPROM, I/O, Serial Interfaces (SPI, USART, I2C,...), 16-bit Timers, A/D converters,
comparators, power supervisors etc. These blocks may be assembled in various
combinations in order to provide cost-effective solutions for application-specific products.
The STM8 family forms a part of the STMicroelectronics 8-bit MCU product line, which finds
its place in a wide variety of applications such as automotive systems, remote controls,
video monitors, car radio and numerous other consumer, industrial, telecom, and multimedia
products.
The 8-bit STM8 Core is designed for high code efficiency. It contains 6 internal registers, 20
addressing modes and 80 instructions. The 6 internal registers include two 16-bit Index
registers, an 8-bit Accumulator, a 24-bit Program Counter, a 16-bit Stack Pointer and an 8-
bit Condition Code register. The two Index registers X and Y enable Indexed Addressing
modes with or without offset, along with read-modify-write type data manipulation. These
registers simplify branching routines and data/array modifications.
The 24-bit Program Counter is able to address up to 16-Mbyte of RAM, ROM or Flash
memory. The 16-bit Stack Pointer provides access to a 64K-level Stack. The Core also
includes a Condition Code register providing 7 Condition flags that indicate the result of the
last instruction executed.
The 20 Addressing modes, including Indirect Relative and Indexed addressing, allow
sophisticated branching routines or CASE-type functions. The Indexed Indirect Addressing
mode, for instance, permits look-up tables to be located anywhere in the address space,
thus enabling very flexible programming and compact C-based code. The stack pointer
relative addressing mode permits optimized C compiler stack model for local variables and
parameter passing.
The Instruction Set is 8-bit oriented with a 2-byte average instruction size. This Instruction
Set offers, in addition to standard data movement and logic/arithmetic functions, 8-bit by 8-
bit multiplication, 16-bit by 8-bit and 16-bit by 16-bit division, bit manipulation, data transfer
between Stack and Accumulator (Push / Pop) with direct stack access, as well as data
transfer using the X and Y registers or direct memory-to-memory transfers.
The number of Interrupt vectors can vary up to 32, and the interrupt priority level may be
managed by software providing hardware controlled nested capability. Some peripherals
include Direct Memory Access (DMA) between serial interfaces and memory. Support for
slow memories allows easy external code execution through serial or parallel interface
(ROMLESS products for instance).
The STM8 has a highly energy-efficient architecture, based on a Harvard architecture and
pipelined execution. A 32-bit wide program memory bus allows most of the instructions to be
fetched in 1 CPU cycle. Moreover, as the average instruction length is 2 bytes, the program
memory is accessed only half of the time on average, which reduces power consumption.
Pipelined execution minimizes the execution time, ensuring high system performance when
needed, while the different power-saving operating modes make it possible to reduce the
overall energy consumption. Power saving can be managed under program control by placing
the device in SLOW, WAIT, SLOW-WAIT, ACTIVE-HALT or HALT mode (see product datasheet for
more details).
PM0044, Doc ID 13590 Rev 3
Additional blocks
The additional blocks take the form of integrated hardware peripherals arranged around the
central processor core. The following (non-exhaustive) list details the features of some of the
currently available blocks:
Boot ROM        Memory area containing the bootloader code.
Flash           Flash-based devices.
RAM             Sizes up to several Kbytes.
Data EEPROM     Sizes up to several Kbytes. Erase/programming operations do not require
                additional external power sources.
Timers          Different versions based on 8/16-bit free-running or autoreload timer/counter
                are available. They can be coupled with either input captures, output
                compares or PWM facilities. PWM functions can have a software-programmable
                duty cycle from 0% to 100% in up to 256/65536 steps. The outputs can be
                filtered to provide D/A conversion.
A/D converter   The Analog to Digital Converter uses a sample and hold technique. It has
                12-bit resolution.
I2C             Multimaster, single master and single slave modes, DMA or 1-byte transfer,
                standard and fast I2C modes, 7-bit and 10-bit addressing.
SPI             The Serial Peripheral Interface is a fully synchronous 3/4-wire interface,
                ideal for Master and Slave applications such as driving devices with an
                input shift register (LCD driver, external memory,...).
USART           The USART is a fast synchronous/asynchronous interface which features duplex
                transmission, NRZ format, programmable baud rates and standard error
                detection. The USART can also emulate the RS232 protocol.
Watchdog        It has the ability to induce a full reset of the MCU if its counter counts
                down to zero prior to being reset by the software. This feature is
                especially useful in noisy applications.
I/O ports       They are programmable by software to act in several input or output
                configurations on an individual line basis, including high current and
                interrupt generation. The basic block has eight bit lines.
1.1 STM8 development support
The STM8 family of MCUs is supported by a comprehensive range of development tools.
This family presently comprises hardware tools (emulators, programmers), a software
package (assembler-linker, debugger, archiver) and a C-compiler development tool.
STM8 and ST7 CPUs are supported by a single toolchain allowing easy reuse and
portability of the applications between product lines.
1.2 Enhanced STM8 features
●16-Mbyte linear program memory space with 3 FAR instructions (CALLF, RETF, JPF)
●16-Mbyte linear data memory space with 1 FAR instruction (LDF)
●Up to 32 24-bit interrupt vectors with optimized context save management
●16-bit Stack Pointer (SP=SH:S) with stack manipulation instructions and addressing
modes
●New register and memory access instructions (EXG, MOV)
●New arithmetic instructions: DIV 16/8 and DIVW 16/16
●New bit handling instructions (CCF, BCPL, BCCM)
●2 x 16-bit index registers (X=XH:XL, Y=YH:YL). 8-bit data transfers address the low
byte. The high-byte is not affected, with a reset value of 0. This allows the use of X/Y as
8-bit values.
●Fast interrupt handling through alternate register files (up to 4 contexts) with standard
stack compatible mode (for real time OS kernels)
●16-bit/8-bit stack operations (X, Y, A, CC stacking)
●16-bit pointer direct update with 16-bit relative offset (ADDW/SUBW for X/Y/SP)
●8-bit & 16-bit arithmetic and signed arithmetic support
2 Glossary
mnem    mnemonic
src     source
dst     destination
cy      duration of the instruction in CPU clock cycles (internal clock)
The global configuration register is a memory mapped register. It controls the configuration
of the processor. It contains the AL control bit:
AL: Activation level
If the AL bit is 0 (main), the IRET will cause the context to be retrieved from stack and the
main program will continue after the WFI instruction.
If the AL bit is 1 (interrupt only active), the IRET will cause the CPU to go back to WFI/HALT
mode without restoring the context.
This bit is used to control the low power modes of the MCU. In a very low power application,
the MCU spends most of the time in WFI/HALT mode and is woken up (through interrupts)
at specific moments in order to execute a specific task. Some of these recurring tasks are
short enough to be treated directly in an ISR, rather than going back to the main program. In
this case, by programming the AL bit to 1 before going to low power (by executing WFI/HALT
instruction), the run time/ISR execution is reduced due to the fact that the register context is
not saved/restored each time.
Condition Code register (CC)
The Condition Code register is an 8-bit register which indicates the result of the instruction
just executed as well as the state of the processor. These bits can be individually tested by a
program, and a specified action taken as a result of their state. The following paragraphs
describe each bit.
●V: Overflow
When set, V indicates that an overflow occurred during the last signed arithmetic
operation, on the MSB operation result bit. See INC, INCW, DEC, DECW, NEG, NEGW,
ADD, ADC, SUB, SUBW, SBC, CP, CPW instructions.
●I1: Interrupt mask level 1
The I1 flag works in conjunction with the I0 flag to define the current interruptability level
as shown in the following table. These flags can be set and cleared by software through
the RIM, SIM, HALT, WFI, IRET, TRAP and POP instructions and are automatically set
by hardware when entering an interrupt service routine.
Table 1. Interruptability levels
Interruptability          Priority    I1    I0
Interruptable Main        Lowest      1     0
Interruptable Level 1        ↕        0     1
Interruptable Level 2        ↕        0     0
Non Interruptable         Highest     1     1
●H: Half carry bit
The H bit is set to 1 when a carry occurs between bits 3 and 4 of the ALU during an
ADD or ADC instruction. The H bit is useful in BCD arithmetic subroutines.
For ADDW and SUBW, it is set when a carry occurs from bit 7 to bit 8, which makes it
possible to implement byte arithmetic on 16-bit index registers.
●I0: Interrupt mask level 0
See Flag I1
●N: Negative
When set to 1, this bit indicates that the result of the last arithmetic, logical or data
manipulation is negative (i.e. the most significant bit is a logic 1).
●Z: Zero
When set to 1, this bit indicates that the result of the last arithmetic, logical or data
manipulation is zero.
●C: Carry
When set, C indicates that a carry or borrow out of the ALU occurred during the last
arithmetic operation on the MSB operation result bit (bit 7 for 8-bit result/destination or
bit 15 for 16-bit result). This bit is also affected during bit test, branch, shift, rotate and
load instructions. See ADD, ADC, SUB, SBC instructions.
In bit test operations, C is the copy of the tested bit. See BTJF, BTJT instructions.
In shift and rotate operations, the carry is updated. See RRC, RLC, SRL, SLL, SRA
instructions.
This bit can be set, reset or complemented by software using SCF, RCF, CCF
instructions.
Example: Addition
$B5 + $94 = "C" + $49 = $149
      C  7......0
      0  10110101    ($B5)
    + 0  10010100    ($94)
    = 1  01001001    (C = 1, result $49)
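The flag definitions above can be checked against the addition example with a small model. This is only an illustrative sketch in Python, not ST code: the function name `add8_flags` is hypothetical, and the V flag is computed with the usual two's-complement overflow rule, which is an assumption consistent with the description of V above.

```python
def add8_flags(a, b, carry_in=0):
    """Model an 8-bit ADD/ADC and return (result, flags).

    Hypothetical helper for illustration; flag meanings follow the
    Condition Code register description (V, H, N, Z, C).
    """
    total = a + b + carry_in
    result = total & 0xFF
    flags = {
        # C: carry out of bit 7 of the result
        "C": int(total > 0xFF),
        # H: carry between bits 3 and 4
        "H": int(((a & 0x0F) + (b & 0x0F) + carry_in) > 0x0F),
        # Z: result is zero
        "Z": int(result == 0),
        # N: most significant bit of the result is 1
        "N": int((result & 0x80) != 0),
        # V (assumed rule): both operands have a sign the result does not
        "V": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags


result, flags = add8_flags(0xB5, 0x94)
print(f"${result:02X}", flags)
```

For the operands of the example, the model gives the result $49 with C set, which matches the $149 sum shown above.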
The results of each instruction on the Condition Code register are shown by tables in
Section 7: STM8 instruction set. The following table is an example:
VI1HI0NZC
V00NZ1
where
Nothing    = Flag not affected
Flag name  = Flag affected
0          = Flag cleared
1          = Flag set
4 STM8 memory interface
4.1 Program space
The program space is 16-Mbyte and linear. To distinguish the 1, 2 and 3 byte wide
addressing modes, naming has been defined as shown in Figure 3:
●"Page" [0xXXXX00 to 0xXXXXFF]: 256-byte wide memory space with the same two
most significant address bytes (XXXX defines the page number).
●"Section" [0xXX0000 to 0xXXFFFF]: 64-Kbyte wide memory space with the same most
significant address byte (XX defines the section number).
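The page and section naming can be illustrated by splitting a 24-bit address into its fields. The sketch below is an illustration only; the function names `split_address` and `same_section` are hypothetical and not part of any ST tool.

```python
def split_address(addr):
    """Split a 24-bit address into (section, page, offset_in_page).

    section: most significant address byte (XX in 0xXX0000..0xXXFFFF)
    page:    two most significant bytes (XXXX in 0xXXXX00..0xXXXXFF)
    """
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    section = addr >> 16
    page = addr >> 8
    offset_in_page = addr & 0xFF
    return section, page, offset_in_page


def same_section(addr_a, addr_b):
    """True when both addresses share the same most significant byte."""
    return (addr_a >> 16) == (addr_b >> 16)


print(split_address(0x0080A7))
print(same_section(0x008000, 0x01FFFF))
```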
The reset and interrupt vector table is placed at address 0x8000 for the STM8 family.
(Note: the base address may be different for later implementations.) The table has 32 4-byte
entries: RESET, Trap, NMI and up to 29 normal user interrupts. Each entry consists of the
reserved op-code 0x82, followed by a 24-bit value: the PCE, PCH, PCL address of the
respective Interrupt Service Routine. The main program and ISRs can be mapped
anywhere in the 16-Mbyte memory space.
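The vector table layout described above can be sketched as follows. This is a minimal illustration: the names `VECTOR_BASE`, `vector_address` and `vector_entry` are hypothetical, and the base address 0x8000 is the one stated for the STM8 family (it may differ in later implementations).

```python
VECTOR_BASE = 0x8000   # table base stated in the text above
VECTOR_COUNT = 32      # RESET, Trap, NMI and up to 29 user interrupts
VECTOR_OPCODE = 0x82   # reserved op-code that starts every entry


def vector_address(index):
    """Address of entry `index` (0 = RESET, 1 = Trap, 2 = NMI, 3.. = user)."""
    if not 0 <= index < VECTOR_COUNT:
        raise ValueError("vector index out of range")
    return VECTOR_BASE + 4 * index


def vector_entry(isr_address):
    """Build the 4-byte entry: 0x82 followed by PCE, PCH, PCL."""
    if not 0 <= isr_address <= 0xFFFFFF:
        raise ValueError("ISR address must fit in 24 bits")
    return bytes([
        VECTOR_OPCODE,
        (isr_address >> 16) & 0xFF,  # PCE
        (isr_address >> 8) & 0xFF,   # PCH
        isr_address & 0xFF,          # PCL
    ])


print(hex(vector_address(31)), vector_entry(0x012345).hex())
```

With 32 entries of 4 bytes each, the last entry starts at 0x807C and the table ends at 0x807F.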
CALL/CALLR and RET must be used only within the same section. The effective address for the
CALL/RET is used as an offset to the current PCE register value. For the JP, the effective
address, 16 or 17 bits long (17 bits for indexed addressing), is added to the current PCE
value. In order to reach any address in the program space, the JPF jump and CALLF call
instructions are provided with a three-byte extended addressing mode, while RETF also pops
three bytes from the stack.
As the memory space is linear, sections can be crossed by the following CPU actions: next
instruction byte fetch (PC+1), relative jumps and, in some cases, JP (for indexed
addressing mode).
Note: For safe memory usage, a function which crosses sections MUST:
- be called by a CALLF
- include only far instructions for code operation (CALLF & JPF)
All label pointers are located in section 0 (JP [ptr.w] example: ptr.w is located in section 0
and the jump address in the current section).
Any illegal op-code read from the program space triggers an MCU reset.
4.2 Data space
The data space is 16-Mbyte and linear. As the stack must be located in section 0 and as
data access outside section 0/1 can be managed only with LDF instructions, frequently used
data should be located in section 0 to get the optimum code efficiency.
All data pointers are located in section 0 only.
Indexed addressing (with 16-bit index registers and long offset) allows data access over
sections 0 and 1.
All the peripherals are memory mapped in the data space.
[Figure 3 diagram, not reproducible in text. Program space: the vector table occupies
0x008000 to 0x00807F, with entries RESET, TRAP, NMI, INT0, INT1, ... INT28 (last entry at
0x00807C); each entry holds the op-code 0x82 followed by the E, H and L bytes of the handler
address. Data space: page 0 (0x000000 to 0x0000FF) is reached with the 1-byte addressing
mode; section 0 (0x000000 to 0x00FFFF), which holds the stack area and the pointers, with the
2-byte addressing mode; section 1 (0x010000 to 0x01FFFF) through the last section (0xFF0000
to 0xFFFFFF) with the 3-byte addressing mode.]
Figure 3. Address spaces
[Figure 4 diagram, not reproducible in text. The CPU, with its 24-bit program counter
(PCE:PCH:PCL), connects to two memory interfaces. RAM memory interface: address A15..0,
data D7..0, R/W and STALL. Flash memory interface: address A23..0, data D31..0 (fetch) and
STALL. Data accesses outside the RAM range use the LDF instruction.]
4.3 Memory interface architecture
The STM8 uses a Harvard architecture, with separate program and data memory buses.
However, the logical address space is unified: all memories share the same 16-Mbyte
space without overlapping. The memory interfaces are shown in Figure 4. The interface
consists of address and data buses, a read/write control signal (R/W) and a memory
acknowledge signal (STALL).
The STALL acknowledge signal makes the CPU compatible with slow serial or parallel
memory interfaces. When the memory interface is slow, the CPU waits for the memory
acknowledge before executing the instruction. In such a case, the instruction CPU cycle
time is prolonged compared to the value given in this manual.
The program memory bus is 32 bits wide, allowing most instructions to be fetched in one
cycle.
As the address space is unified, the architecture also allows data to be stored in the Flash
memory and program code to be fetched from RAM (data bus). In the latter case the
performance is impacted: data and fetch operations share the same bus, and the
instructions are fetched one byte at a time, thus taking longer (1 cycle/byte).
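The difference between the two fetch paths can be estimated with a short calculation. The sketch below is a simplified model under stated assumptions: it ignores alignment, stalls and bus sharing with data accesses, and the function name `fetch_cycles` is hypothetical.

```python
def fetch_cycles(n_bytes, source):
    """Minimum number of cycles needed to fetch n_bytes of code.

    Assumes 4 bytes per cycle on the 32-bit program memory bus ("flash")
    and 1 byte per cycle on the 8-bit data bus ("ram"), with no stalls.
    """
    bytes_per_cycle = {"flash": 4, "ram": 1}[source]
    # Ceiling division: a partially used bus word still costs one cycle
    return -(-n_bytes // bytes_per_cycle)


for size in (1, 4, 5, 12):
    print(size, fetch_cycles(size, "flash"), fetch_cycles(size, "ram"))
```

Under these assumptions, 12 bytes of code take 3 cycles to fetch from Flash and 12 cycles from RAM.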
The STM8 family uses a 3-stage pipeline to increase the speed of the flow of instructions
sent to the processor. Pipelined execution allows several operations to be performed
simultaneously, rather than serially:
●Fetch
●Decode and address
●Execute
The Program Counter (PC) always points to the instruction in the decode stage, as shown in
Figure 5.
Figure 5. Pipelined execution principle
5.1 Description of pipelined execution stages
Figure 6 and Section 5.1.1, Section 5.1.2, and Section 5.1.3 provide a detailed description
of each stage.
5.1.1 Fetch stage
The first pipeline stage includes a 64-bit fetch buffer and a 32-bit prefetch buffer, totalling 3
words named F1, F2 and F3. This buffer structure allows any instruction code (up to 5 bytes)
to be available for decoding immediately after F1 (and F2 when needed) is/are loaded.
The instruction access from Flash Program memory is 32 bits wide and it is performed from
an aligned address, i.e. 0xXXX0, 0xXXX4, 0xXXX8, or 0xXXXC.
Unlike the decode and execute stages, which are performed at every cycle, the fetch stage
accesses the program memory only when needed, and stops memory access when the
buffer is full. This reduces the core power consumption.
Reading program from RAM is similar to reading program from ROM. However, since the
RAM data bus is 8 bits wide, 4 consecutive read operations have to be performed to load one
word Fx, thus resulting in RAM execution being slower than Flash execution.
5.1.2 Decoding and addressing stage
The decoding stage includes an instruction alignment unit. The alignment unit uses the
64-bit input from the fetch unit and feeds an instruction (from 1 to 5 bytes depending on the
instruction) to the decoding unit.
The instruction code consists of 2 parts (see examples in Table 2):
●The op-code itself (1 or 2 bytes)
●and a data/address part (0 to 3 bytes).
The op-code is decoded in this stage. When present, the instruction address is used for
address computation, whilst the immediate operand is forwarded to the execution stage.
Table 2. Data/address decoding examples
Instruction                               Syntax               Op-code    Data/address
Register to register move                 LD A, XH             0x95       -
Register load                             LD A,($12,SP)        0x7B       0x12
Register store                            LD ($12,SP),A        0x6B       0x12
Data load / store with extended address   LDF A,($123456,Y)    0x90 AF    0x12 34 56
Long/unaligned instructions
For long instructions (i.e. 5-byte instructions), the fetch may need 2 program memory
accesses to be completed. In this case, the decoding stage (after decoding the op-code
part) is stalled waiting for the fetch stage to complete the 2nd fetch.
In case of shorter instructions, this may also happen when they cross a 32-bit boundary.
Indirect addressing
For indirect addressing, the CPU is stalled in this stage to read the pointer from the data
memory (i.e. RAM). The number of cycles during which the CPU is stalled depends on the
pointer size (short, long or extended addressing mode).
5.1.3 Execution stage
In the execution stage, the operation is executed and the result is stored in the accumulator,
index register or RAM.
5.2 Data memory conflicts
3 types of operations perform accesses to the data memory:
●Effective address computation in case of indirect addressing
●Data read: source operand
●Data write: destination for store or read-modify-write operations
In case of simultaneous accesses to the same memory area both in execution stage (write)
and decoding stage (read), the decode stage is stalled till the execution stage releases the
resource.
5.3 Pipelined execution examples
A few pipelined execution examples are given below. The numbers of cycles for the
decoding and execution stages correspond to the minimum number of cycles needed by the
instruction itself. In some cases, depending on the instruction sequence, the number of
cycles taken can be higher.
5.4 Conventions
Although the decode and/or execute stage of some instructions may take a different number
of cycles, a simplified convention providing a good match with reality has been used in this
section:
●The decode stage of each instruction takes one cycle only
●The execution stage takes a number of cycles equal to
    Cy = DecCy + ExeCy - 1
where
    Cy is the number of execution cycles. The decode and execute cycle counts
    correspond to the minimum number of cycles needed by the instruction itself, and
    do not take into account the impact of the instruction sequence.
    DecCy is the exact number of decode cycles.
    ExeCy is the exact number of execute cycles.
The decode stage of the next instruction starts during the last execution cycle. In
instructions performing a pipeline flush, the convention is that, in case the branch is taken,
the next fetch is performed during the last instruction execution cycle.
The exact number of cycles (see Table 3) and the number of cycles obtained using this
convention (see Table 4) are identical.
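The convention can be applied mechanically to the exact decode and execute counts. The sketch below is illustrative only: the function name `conventional_cycles` is hypothetical, and the three (DecCy, ExeCy) pairs are the ones listed for the instructions of Table 3.

```python
def conventional_cycles(dec_cy, exe_cy):
    """Execution cycles under the convention: Cy = DecCy + ExeCy - 1."""
    if dec_cy < 1 or exe_cy < 1:
        raise ValueError("each stage takes at least one cycle")
    return dec_cy + exe_cy - 1


# (instruction, exact decode cycles, exact execute cycles) from Table 3
TABLE_3 = [
    ("LDW X, [$50.w]", 4, 1),
    ("ADDW X, #20", 2, 2),
    ("LD A, [$30].w", 3, 1),
]

for name, dec_cy, exe_cy in TABLE_3:
    print(name, conventional_cycles(dec_cy, exe_cy))
```

With one decode cycle plus Cy execution cycles, each instruction occupies the same total number of cycles as with its exact decode and execute counts.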
Table 3. Example with exact number of cycles
Address    Instruction        Decode cycles    Execute cycles    lgth
0xC000     LDW X, [$50.w]     4                1                 3
0xC003     ADDW X, #20        2                2                 3
0xC006     LD A, [$30].w      3                1                 3
0xC009     ....
[Timing diagram not recoverable from the source: fetches F1 to F3 and the decode (D) and
execute (E) cycles of each instruction, plotted over cycles 1 to 14.]
Table 4. Example with conventional number of cycles
Address    Instruction        Decode cycles    Execute cycles    lgth
0xC000     LDW X, [$50.w]     4                3                 3
0xC003     ADDW X, #20        3                3                 3
0xC006     LD A, [$30].w      3                3                 3
0xC009     ....
[Timing diagram not recoverable from the source: fetches F1 to F3 and the decode (D) and
execute (E) cycles of each instruction, plotted over cycles 1 to 14.]
Table 5. Legend
Symbol/Color    Definition
F               Fetch
D               Decode stalled (distinguished from Decode by color in the original)
D               Decode
E               Execute
5.4.1 Optimized pipeline example – execution from Flash Program memory
In the example shown in Table 6, the code is stored in the Flash Program memory (32-bit
bus). As a result, 3 cycles are needed to fill the 96-bit prefetch buffer. At each cycle, one
word is loaded and stored in F1, F2 and F3. The next fetch operation can start only when all
the instructions contained in one of the Fx words are decoded. In fact, at cycle 9, the last
instruction contained in F3 (SWAP A) is decoded, and a fetch operation can start to fill the
F3 word.
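Which fetch word holds an instruction, and whether an instruction straddles a 32-bit boundary, follows directly from its address and length. The sketch below is illustrative: the helper names are hypothetical, and the addresses and lengths are taken from the instructions of Table 6.

```python
def fetch_word_base(addr):
    """Base address of the aligned 32-bit word that contains addr."""
    return addr & ~0x3


def crosses_word_boundary(addr, length):
    """True when an instruction of `length` bytes spans two 32-bit words."""
    return fetch_word_base(addr) != fetch_word_base(addr + length - 1)


# (address, instruction, length in bytes) from Table 6
CODE = [
    (0xC000, "NEG A", 1),
    (0xC001, "XOR A, $10", 2),
    (0xC003, "LD A, #20", 2),
    (0xC005, "SUB A,$1000", 3),
    (0xC00B, "SWAP A", 1),
]

for addr, name, length in CODE:
    print(hex(addr), name, hex(fetch_word_base(addr)),
          crosses_word_boundary(addr, length))
```

In this listing, SWAP A at 0xC00B is the last byte of the third word (base 0xC008), and LD A, #20 at 0xC003 is the only instruction shown that spans two words.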
Table 6. Optimized pipeline example - execution from Flash
Add.       Instruction        Decod. cycles    Exec. cycles    lgth
0xC000     NEG A              1                1               1
0xC001     XOR A, $10         1                1               2
0xC003     LD A, #20          1                1               2
0xC005     SUB A,$1000        1                1               3
0xC008     INC A              1                1               1
0xC009     LD XL, A           1                1               1
0xC00A     SRL A              1                1               1
0xC00B     SWAP A             1                1               1
0xC00C     SLA $15            1                1               2
0xC00E     CP A,#$FE          1                1               2
0xC010     MOV $100, #11      1                1               4
0xC014     MOV $101, #22      1                1               4
[Timing diagram not recoverable from the source: fetches F1, F2, F3 and the decode (D) and
execute (E) cycle of each instruction, plotted over cycles 1 to 14.]
Table 7. Legend
Symbol/Color    Definition
F               Fetch
D               Decode
E               Execute
5.4.2 Optimized pipeline example - execution from RAM
In the example shown in Table 8, the RAM is accessed through an 8-bit bus. As a result, 12
cycles are required to fill the 96-bit pre-fetch buffer. Every 4 cycles, one word is loaded and
stored in Fx. The decoding of the first word instruction can start only when the Fx word is
filled. This occurs for example till the 4th cycle, and the first instruction (NEG A) can be
decoded only at the 5th cycle.
In case of read/write access to the RAM, the fetch is stalled. This occurs during the 6th cycle
since RAM address $10 is read during the decode stage of XOR A, $10.
Table 8. Optimized pipeline example - execution from RAM
Add.       Instruction        Decode cycles    Execute cycles    lgth
0xC000     NEG A              1                1                 1
0xC001     XOR A, $10         1                1                 2
0xC003     LD A, #20          1                1                 2
0xC005     SUB A,$1000        1                1                 3
0xC008     INC A              1                1                 1
0xC009     LD XL, A           1                1                 1
0xC00A     SRL A              1                1                 1
0xC00B     SWAP A             1                1                 1
0xC00C     SLA $15            1                1                 2
0xC00E     CP A,#$FE          1                1                 2
[Timing diagram not recoverable from the source: byte fetches F1_1 to F3_4, fetch stalls (FS)
and the decode (D) and execute (E) cycles of each instruction, plotted over cycles 1 to 21.]
Table 9. Legend
Symbol/Color    Definition
F               Fetch
FS              Fetch stalled
D               Decode
D               Decode stalled (distinguished from Decode by color in the original)
E               Execute
5.4.3 Pipeline with Call/Jump
In the example shown in Table 10, a branch is taken after the JP/CALL instruction, and the
fetched instruction(s) are lost (flush). New instructions must be fetched. 3 fetch sequences
are required to refill the pre-fetch buffer. The fetch start depends on the instruction being
executed.
For a JP instruction, the fetch can start during the first cycle of the "dummy" execution.
For the CALL instruction, it starts after the last cycle of the CALL execution.
Table 10. Example of pipeline with Call/Jump
Add.       Instruction           Decode cycles    Execute cycles    lgth
0xC000     INC A                 1                1                 1
0xC001     JP label              1                1                 3
0xC004     LDW X,[$5432.w]       X                X                 4
0xD010     label: NEG A          1                1                 1
0xD011     CALL label2           1                2                 3
0xD014     LDW X,[$5432.w]       X                X                 4
0xD018     LDW X,[$7895.w]       X                X                 4
0xE030     label2: INCW X        1                1                 1
[Timing diagram not recoverable from the source: fetches (F), fetch stalls (FS), flushes and
the decode (D) and execute (E) cycles of each instruction, plotted over cycles 1 to 11.]
Table 11. Legend
Symbol/Color    Definition
F               Fetch
FS              Fetch stalled
D               Decode
E               Execute
Flush           Flush
5.4.4 Pipeline stalled
The decode stage can be stalled when the execution lasts more than one cycle.
The flush is due to the branch. Fetching the branch address is performed during the second
execution cycle of the BTJF instruction.
The decode operation can also be stalled when the memory target is modified during the
previous instruction. In the example given in Table 12, the INCW Y instruction writes the X
register during the first execution cycle. As a result, in this cycle, the next instruction
(LD A,(X)) cannot be decoded since it reads the X register.
Table 12. Example of stalled pipeline
Address    Instruction             Decode cycles    Execute cycles    lgth
0xC000     SUB SP, #20             1                1                 2
0xC002     LD A, #20               1                1                 2
0xC004     BTJT 0x10, #5, to       1                2                 5
0xC009     INC A                   1                1                 1
0xC00A     BTJF 0x20, #3, to       1                2                 5
0xC00F     NOP                     X                X                 1
0xC010     LDW X,[$5432.w]         X                X                 4
0xC014     LDW X,[$1234.w]         X                X                 4
0xD020     to: INCW Y              1                1                 2
0xD023     LD A,(X)                1                1                 2
[Timing diagram not recoverable from the source: fetches F1 to F3, the flush and the decode
(D), decode stalled and execute (E) cycles of each instruction, plotted over cycles 1 to 14.]
Table 13. Legend
Symbol/Color    Definition
F               Fetch
D               Decode stalled (distinguished from Decode by color in the original)
D               Decode
E               Execute
5.4.5 Pipeline with 1 wait state
In the example given in Table 14, performing the fetch takes 2 cycles, and there is no
overlap between the 2 fetch cycles.
If the instruction is decoded/executed during the last 2 fetch cycles, then the wait state is
transparent compared to the no-wait state execution.
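Table 14. Pipeline with 1 wait state
Address    Instruction        Decode cycles    Execute cycles    lgth
0xC000     NEG A              1                1                 1
0xC001     DEC ($10, X)       1                1                 3
0xC004     LDW X, #20         1                1                 3
0xC007     LD (X), A          1                1                 1
0xC008     INC A              1                1                 1
0xC009     NEG ($5A, Y)       1                1                 1
[Timing diagram not recoverable from the source: fetches F1 to F3, memory stalls (MS) and the
decode (D) and execute (E) cycles of each instruction, plotted over cycles 1 to 10.]
Table 15. Legend
Symbol/Color    Definition
F               Fetch
D               Decode stalled (distinguished from Decode by color in the original)
D               Decode
MS              Memory stalled
E               Execute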
6 STM8 addressing modes
The STM8 core features 18 different addressing modes which can be classified in 8 main
groups:
Table 16. STM8 core addressing modes
Addressing mode groups    Example
Inherent                  NOP
Immediate                 LD A,#$55
Direct                    LD A,$55
Indexed                   LD A,($55,X)
SP Indexed                LD A,($55,SP)
Indirect                  LD A,([$55],X)
Relative                  JRNE loop
Bit operation             BSET byte,#5
The STM8 Instruction set is designed to minimize the number of required bytes per
instruction. To do so, most of the addressing modes can be split in three sub-modes called
extended, long and short:
●The extended addressing mode ("e") can reach any byte in the 16-Mbyte addressing
space, but the instruction size is bigger than the short and long addressing mode.
Moreover, the number of instructions with this addressing mode (far) is limited (CALLF,
RETF, JPF and LDF)
●The long addressing mode ("w") is the most powerful for program management, when
the program is executed in the same section (same PCE value). The long addressing
mode is optimized for data management in the first 64-Kbyte addressing space (from
0x000000 to 0x00FFFF) with a complete set of instructions, but the instruction size is
bigger than the short addressing mode.
●The short addressing mode ("b") is less powerful because it can only access the page
zero (from 0x000000 to 0x0000FF), but the instruction size is more compact.
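The address ranges of the three sub-modes can be summarized by picking the most compact one able to reach a given data address. This is an illustrative sketch only: the function name `smallest_submode` is hypothetical, and it does not check whether a particular instruction actually offers that sub-mode (extended addressing is limited to CALLF, RETF, JPF and LDF).

```python
def smallest_submode(addr):
    """Most compact sub-mode whose range covers a data address.

    short    ("b"): page zero, 0x000000..0x0000FF
    long     ("w"): first 64 Kbytes, 0x000000..0x00FFFF
    extended ("e"): whole 16-Mbyte space, 0x000000..0xFFFFFF
    """
    if addr < 0:
        raise ValueError("address must be non-negative")
    if addr <= 0xFF:
        return "short"
    if addr <= 0xFFFF:
        return "long"
    if addr <= 0xFFFFFF:
        return "extended"
    raise ValueError("address must fit in 24 bits")


for addr in (0x55, 0x1000, 0x123456):
    print(hex(addr), smallest_submode(addr))
```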
Long      Direct  Indexed  (longoff,ndx)  (ptr.w + ndx)  op + 1..2  Word      000000..01FFFE
Extended  Direct  Indexed  (extoff,ndx)   (ptr.e + ndx)  op + 1..3  Ext Word  000000..FFFFFF
The data byte required for operation is found by its memory address, which is defined by the
unsigned addition of an index register (X or Y or SP) with an offset which follows the op-code.
The indexed addressing mode is made of five sub-modes:
Table 25. No Offset, Long, Short and SP Indexed instructions
Direct  Indexed  (shortoff,SP)  (ptr + SP)  op + 1  Byte  00..(FF+stacktop)
This is a combination of indirect and indexed addressing mode. The data byte required for
the operation is found by its memory address, which is defined by the unsigned addition of
an index register value (X or Y) with a pointer value located in memory. The pointer address
follows the op-code.
The indirect indexed addressing mode is made of four sub-modes:
Table 32. Available Long Pointer Long and Short Pointer Long Indirect
Direct  Relative  off  PC = PC + off  op + 1  ---  PC +127/-128
This addressing mode is used to modify the PC register value, by adding an 8-bit signed
offset to it. The offset added to the PC register value is relative to the start of the next
instruction.
Table 36. Available Relative Direct instructions
Instructions    Functions
JRxx            Conditional Jump
JRA             Jump Relative Always
CALLR           Call Relative
The offset follows the op-code.
Example:
    04A7  2717          jreq skip
    04A9  9D            nop
    04AA  9D            nop
    04C0  20FE    skip  jra *        ; Infinite loop
Action:
    if (Z == 1) then PC = PC + $17 = $04A9 + $17 = $04C0
    else             PC = PC = $04A9
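The target computation of the example can be reproduced with a few lines. The sketch below is illustrative; the helper names `signed8` and `relative_target` are hypothetical. It adds the sign-extended 8-bit offset to the address of the next instruction, as described above.

```python
def signed8(byte):
    """Interpret an 8-bit value as a two's-complement signed offset."""
    return byte - 0x100 if byte & 0x80 else byte


def relative_target(instr_addr, instr_len, offset_byte):
    """Branch target: start of the next instruction plus the signed offset."""
    next_pc = instr_addr + instr_len
    return (next_pc + signed8(offset_byte)) & 0xFFFFFF


# jreq skip at $04A7, encoded 27 17 (2 bytes)
print(hex(relative_target(0x04A7, 2, 0x17)))
# jra * at $04C0, encoded 20 FE (2 bytes): branches to itself
print(hex(relative_target(0x04C0, 2, 0xFE)))
```

The second call shows why the `jra *` line is an infinite loop: the offset $FE is -2, which brings the PC back from $04C2 to $04C0.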
[Figure 21 diagram, not reproducible in text. It shows JREQ SKIP at 04A7 (bytes 27 17) before
completion and after completion, for Z = 1 (branch taken, new PC = EA = 04C0) and for Z = 0
(no branch taken, new PC = 04A9).
Steps to determine the effective address:
    PC = 04A7
    PC = PC + 1 = 04A8
    TEMP = (PC) = 17
    PC = PC + 1 = 04A9
    Stop here if there is no branch, i.e. Z = 0
    EA = PC + TEMP = 04A9 + 17 = 04C0
    New PC = EA if the branch is taken]
Figure 21. Relative Direct addressing mode example
6.13 Bit Direct (Long) addressing mode
Table 37.Overview of Bit Direct addressing mode instruction
The data byte required for the operation is found by its memory address, which follows the
op-code. The bit used for the operation is selected by the bit selector which is encoded in
the instruction op-code.
Table 38. Available Bit Direct instructions
Instructions    Functions
BRES            Bit Reset
BSET            Bit Set
BCPL            Bit Complement
BCCM            Copy Carry Bit to Memory
The address is a word, thus allowing 0000 to FFFF addressing space, but requires 2 bytes
after the op-code. The bit selector #n (n=0 to 7) selects the nth bit from the byte pointed to by
the address.
Example:
    0408  721006E5           BCPL coeff, #0
    06E5  40          coeff  dc.b $40
Action:
    (coeff) = ($06E5) XOR 2**0 = $40 XOR $01 = $41
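The effect of the four bit instructions on the addressed byte can be modelled as plain bit operations. This is an illustrative sketch operating on a byte value rather than on real memory; the function names simply mirror the mnemonics and are not an ST API.

```python
def _check_bit(n):
    if not 0 <= n <= 7:
        raise ValueError("bit selector must be in 0..7")


def bset(value, n):
    """Bit Set: force bit n to 1."""
    _check_bit(n)
    return (value | (1 << n)) & 0xFF


def bres(value, n):
    """Bit Reset: force bit n to 0."""
    _check_bit(n)
    return value & ~(1 << n) & 0xFF


def bcpl(value, n):
    """Bit Complement: invert bit n (XOR with 2**n)."""
    _check_bit(n)
    return (value ^ (1 << n)) & 0xFF


def bccm(value, n, carry):
    """Copy Carry Bit to Memory: bit n takes the value of the carry flag."""
    return bset(value, n) if carry else bres(value, n)


coeff = 0x40
print(f"${bcpl(coeff, 0):02X}")
```

Applied to the byte $40 of the example, complementing bit 0 gives $41, as in the Action line above.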
[Figure 22 diagram, not reproducible in text. It shows BCPL Coeff,#0 at 0408 before and after
completion, with the byte Coeff at 06E5 changing from 40 to 41.
Steps to determine the effective address:
    PC = 0408
    PC = PC + 2 = 040A
    EA = (PC):(PC+1) = 06E5
    PC + 2 = 040C
    Instruction complete: new PC = 040C]
Figure 22. Bit Long Direct addressing mode example
6.14 Bit Direct (Long) Relative addressing mode
Table 39.Overview of Bit Direct (Long) Relative addressing mode
This addressing mode is a combination of the Bit Direct addressing mode (for data
addressing) and the Relative Direct mode (for PC computation).
The data byte required for the operation is found by its memory address, which follows the
op-code. The bit used for the test operation is selected by the bit selector which is encoded
in the instruction op-code. Following the logical test operation, the PC register value can be
modified by adding an 8-bit signed offset to it.
Table 40. Available Bit Direct Relative instructions
Instructions    Functions
BTJT, BTJF      Bit Test and Jump
The data address is a word, thus allowing 0000 to FFFF addressing space (requires 2 bytes
after the op-code). The bit selector #n (n=0 to 7) selects the nth bit from the byte pointed to
by the address. The offset follows the op-code and data address.
Example:
104B  00            DRA     dc.b $00                ; Port A data register (input value)
                    bit0    equ  $0                 ; data bit 0
04A7  7201104BFB    wait_1  BTJF DRA, bit0, wait_1
04AC  ....          cont_0
Action:
Test = select_bit(0, ($104B)) = select_bit(0, DRA)
if (Test /= 1) then PC = PC + $FB = $0004AC - $05 = $0004A7
else PC = PC = $0004AC
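The PC computation in this example can be sketched in Python. This is only a model: the function names are hypothetical, and the 5-byte length is that of the BTJF encoding in the example (7201104BFB).

```python
def sign_extend8(byte):
    """Interpret an 8-bit value as a signed offset in the range -128..+127."""
    return byte - 0x100 if byte & 0x80 else byte


def btjf_next_pc(pc, offset_byte, tested_bit, length=5):
    """Return the PC after a BTJF located at `pc`.

    The PC first advances past the instruction; the signed offset is added
    only when the tested bit is 0 (jump if false).
    """
    next_pc = pc + length
    if tested_bit == 0:
        next_pc += sign_extend8(offset_byte)
    return next_pc & 0xFFFFFF  # the program counter is 24 bits wide


taken = btjf_next_pc(0x04A7, 0xFB, tested_bit=0)
not_taken = btjf_next_pc(0x04A7, 0xFB, tested_bit=1)
print(f"{taken:06X} {not_taken:06X}")
```

With bit 0 of DRA cleared, the instruction branches back to itself at $0004A7, which is the polling loop shown in the example.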
Figure 23. Bit Long Direct Relative addressing mode example
Steps to determine the effective address for wait_1 BTJF DRA, #0, wait_1 (bytes 72 01 10 4B FB at 04A7, DRA at 104B):
PC = 04A7
PC = PC + 2 = 04A9
EA = (PC):(PC+1) = 104B
PC = PC + 2 = 04AB
Test = (EA).b0
TEMP = (PC) = FB
PC = PC + 1 = 04AC
Stop here if there is no branch, i.e. Test = TRUE (1): New PC = 04AC
EA = PC + TEMP = 04AC + FB = 04A7
New PC = EA = 04A7 if the branch is taken (b0 = 0)
7 STM8 instruction set
7.1 Introduction
This chapter describes all the STM8 instructions. There are 96 of them, described in
alphabetical order. They can also be classified into 13 main groups as follows:
Table 41. Instruction groups
Group                              Instructions
Load and Transfer                  LD LDF CLR MOV EXG LDW CLRW EXGW
Stack operation                    PUSH POP PUSHW POPW
Increment/Decrement                INC DEC INCW DECW
Compare and Tests                  CP TNZ BCP CPW TNZW
Logical operations                 AND OR XOR CPL CPLW
Bit Operation                      BSET BRES BCPL BCCM
Conditional Bit Test and Branch    BTJT BTJF
Arithmetic operations              NEG ADC ADD SUB SBC MUL DIV DIVW NEGW ADDW SUBW
Shift and Rotates                  SLL SRL SRA RLC RRC SWAP SLLW SRLW SRAW RLCW RRCW SWAPW RLWA RRWA
Unconditional Jump or Call         JRA JRT JRF JP JPF CALL CALLR CALLF RET RETF NOP
Conditional Branch/Execution       JRxx WFE
Interrupt management               TRAP WFI HALT IRET
Condition Code Flag modification   SIM RIM SCF RCF CCF RVF
Breakpoint/software break          BREAK
Each instruction is encoded in one to five bytes:
PC-1       End of previous instruction
PC         Op-code
PC+1..4    Additional bytes (0 to 4), according to the number of bytes required to compute the
           effective address(es)
Using a pre-code (two-byte op-codes)
In order to extend the number of available op-codes for an 8-bit CPU (256 op-codes), four
different pre-code bytes are defined. These pre-codes modify the meaning of the instruction
they precede.
The whole instruction becomes:
PC-1    End of previous instruction
PC      Pre-code
PC+1    Op-code
PC+2    Additional word (0 to 3) according to the number of bytes required to compute the
        effective address
These pre-bytes are:
0x90 = PDY     Replaces an X-based instruction using immediate, direct, indexed or
               inherent addressing mode with the corresponding Y-based one.
               It also provides read/modify/write instructions using Y indexed
               addressing mode with long offset, and two bit handling instructions
               (BCPL and BCCM).
0x92 = PIX     Replaces an instruction using direct, direct bit, or direct relative
               addressing mode with an instruction using the corresponding indirect
               addressing mode.
               It also changes an instruction using X indexed addressing mode to
               an instruction using indirect X indexed addressing mode.
0x91 = PIY     Replaces an instruction using indirect X indexed addressing mode with
               the corresponding Y-based one.
0x72 = PWSP    Provides long addressing mode for bit handling and read/modify/write
               instructions.
               It also provides indirect addressing mode with two-byte pointer for
               read/modify/write and register/memory instructions.
               Finally, it provides stack pointer indexed addressing mode on
               register/memory instructions.
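As an illustration of the pre-code layout, the following Python sketch splits an instruction's bytes into pre-code, op-code and remaining bytes. The pre-code values come from the list above; the function name split_instruction and the tuple layout are assumptions made for this sketch only.

```python
# Pre-code bytes and their names, as listed in this section.
PRE_CODES = {0x90: "PDY", 0x92: "PIX", 0x91: "PIY", 0x72: "PWSP"}


def split_instruction(code):
    """Return (pre-code name or None, op-code byte, remaining bytes)."""
    code = bytes(code)
    if not code:
        raise ValueError("an instruction needs at least one byte")
    if code[0] in PRE_CODES:
        if len(code) < 2:
            raise ValueError("a pre-code must be followed by an op-code")
        return PRE_CODES[code[0]], code[1], code[2:]
    return None, code[0], code[1:]


# BTJF DRA,#0,wait_1 from the Bit Direct (Long) Relative example.
print(split_instruction(bytes.fromhex("7201104BFB")))
```

The sketch only separates the fields; it does not check that the op-code is valid for the given pre-code.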
7.2 Nomenclature
7.2.1 Operators
←    is loaded with ...
↔    has its value exchanged with ...
7.2.2 CPU registers
A        accumulator
X        X index register (2 bytes)
XL       least significant byte of the X index register (1 byte)
XH       most significant byte of the X index register (1 byte)
Y        Y index register (2 bytes)
YL       least significant byte of the Y index register (1 byte)
YH       most significant byte of the Y index register (1 byte)
PC       program counter register (3 bytes)
PCL      low significant byte of the program counter register (1 byte)
PCH      high significant byte of the program counter register (1 byte)
PCE      extended significant byte of the program counter register (1 byte)
SP       stack pointer register (2 bytes)
CC       condition code register (1 byte)
CC.V     overflow flag of the condition code register (1 bit)
CC.I0    interrupt mask bit 0 of the condition code register (1 bit)
CC.H     half carry flag of the condition code register (1 bit)
CC.I1    interrupt mask bit 1 of the condition code register (1 bit)
CC.N     negative flag of the condition code register (1 bit)
CC.Z     zero flag of the condition code register (1 bit)
CC.C     carry flag of the condition code register (1 bit)
7.2.3 Code condition bit value notation
-    bit not affected by the instruction
1    bit forced to 1 by the instruction
0    bit forced to 0 by the instruction
X    bit modified by the instruction
7.2.4 Memory and addressing
M(...)          content of a memory location
R               8-bit operation result value
R(...)          8-bit operation result value stored into the register or memory shown inside parentheses
Rn              bit n of the operation result value (0≤n≤7)
XX.B            bit B of the XX register or memory location
imm.b           byte immediate value
imm.w           16-bit immediate value
shortmem        memory location with short addressing mode (1 byte)
longmem         memory location with long addressing mode (2 bytes)
extmem          memory location with extended addressing mode (3 bytes)
[shortptr.w]    short pointer (1 byte) on long memory location (2 bytes). Assembler notation = [$12.w]
[longptr.w]     long pointer (2 bytes) on long memory location (2 bytes). Assembler notation = [$1234.w]
[longptr.e]     long pointer (2 bytes) on extended memory location (3 bytes). Assembler notation = [$1234.e]
7.2.5 Operation code notation
ee    extended order byte of 24-bit extended address
ww    high order byte of 16-bit long address or middle order byte of 24-bit extended address
bb    short address or low order byte of 16-bit long address or 24-bit extended address
ii    immediate data byte or low order byte of 16-bit immediate data
iw    high order byte of 16-bit immediate data
rr    relative offset byte in a range of [-128..+127]
A    [longptr.w]         ADD A,[$1000.w]        4   4   72 CB MS LS
A    ([shortptr.w],X)    ADD A,([$10.w],X)      4   3   92 DB XX       ✗
A    ([longptr.w],X)     ADD A,([$1000.w],X)    4   4   72 DB MS LS
A    ([shortptr.w],Y)    ADD A,([$10.w],Y)      4   3   91 DB XX       ✗
See also: ADDW, ADC, SUB, SBC, MUL, DIV
ADDW
Word Addition with index registers
ADDW
Syntax       ADDW dst,src    e.g. ADDW X,#$1000
Operation    dst <= dst + src
Description  The source (16-bit) is added to the contents of the destination, which is an
             index register (X/Y), and the result is stored in the same index register. The
             source is a 16-bit memory or data word. The ADDW instruction can also be
             used to add an immediate value to the stack pointer (SP).
Instruction overview
mnem    dst    src    V  I1  H  I0  N  Z  C
ADDW    X      Mem    V  -   H  -   N  Z  C
ADDW    Y      Mem    V  -   H  -   N  Z  C
ADDW    SP     Imm    -  -   -  -   -  -  -
Affected condition flags
V ⇒ (A15.M15 + M15.¬R15 + ¬R15.A15) ⊕ (A14.M14 + M14.¬R14 + ¬R14.A14)
    Set if the signed operation generates an overflow, cleared otherwise.
H ⇒ X7.M7 + M7.¬R7 + ¬R7.X7
    Set if a carry occurred from bit 7 of the result, cleared otherwise.
N ⇒ R15
    Set if bit 15 of the result is set (negative value), cleared otherwise.
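The V and N expressions can be checked with a short Python model. This is only a sketch: it assumes that the R terms of the V expression are complemented (the usual carry-out form for an addition), and the function names are hypothetical.

```python
def addw_flags(a, m):
    """Return (result, V, N) for a 16-bit addition a + m."""
    r = (a + m) & 0xFFFF

    def carry_out(bit):
        # Carry out of the given bit position: A.M + M.notR + notR.A
        ab, mb = (a >> bit) & 1, (m >> bit) & 1
        not_rb = ((r >> bit) & 1) ^ 1
        return (ab & mb) | (mb & not_rb) | (not_rb & ab)

    v = carry_out(15) ^ carry_out(14)  # signed overflow
    n = (r >> 15) & 1                  # bit 15 of the result
    return r, v, n


def signed16(value):
    """Interpret a 16-bit value as a two's complement number."""
    return value - 0x10000 if value & 0x8000 else value


print(addw_flags(0x7FFF, 0x0001))
```

Under that assumption, V is set exactly when the signed sum of the two operands does not fit in 16 bits, which can be confirmed by comparing against signed16 over a range of operands.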
Description  Complements the bit position in destination location. Leaves all other bits
             unchanged.
             M(longmem).bit <- ¬M(longmem).bit
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
BCPL    Mem    -  -   -  -   -  -  -
Affected condition flags
Detailed description
dst        pos = 0..7    Asm              cy  lgth  Op-code(s)     ST7
longmem    n = 2*pos     BCPL $1000,#2    1   4     90 1n MS LS
See also: CPL, BRES, BSET
BREAK
Software break
BREAK
Syntax
Operation
Description  In debug mode, the CPU is stalled and can be restarted by the debugger.
             This instruction is equivalent to a NOP when the debugger is not connected.
Instruction overview
mnem     V  I1  H  I0  N  Z  C
BREAK    -  -   -  -   -  -  -
Affected condition flags
Detailed description
Addressing mode    Asm      cy  lgth  Op-code(s)  ST7
Inherent           BREAK    1   1     8B          ✗
BRES
Bit Reset
BRES
Syntax       BRES dst,#pos    pos = [0..7]    e.g. BRES PADR,#6
Operation    dst <= dst AND COMPLEMENT (2**pos)
Description  Read the destination byte, reset the corresponding bit (bit position), and
             write the result in destination byte. The destination is a memory byte. The
             bit position is a constant. This instruction is fast, compact, and does not
             affect any register. Very useful for boolean variable manipulation.
Instruction overview
mnem    dst    bit position    V  I1  H  I0  N  Z  C
BRES    Mem    #pos            -  -   -  -   -  -  -
Affected condition flags
Detailed description
dst        pos = 0..7     Asm              cy  lgth  Op-code(s)     ST7
longmem    n = 1+2*pos    BRES $1000,#7    1   4     72 1n MS LS
See also: BSET
BSET
Bit Set
BSET
Syntax       BSET dst,#pos    pos = [0..7]    e.g. BSET PADR,#7
Operation    dst <= dst OR (2**pos)
Description  Read the destination byte, set the corresponding bit (bit position), and write
             the result in destination byte. The destination is a memory byte. The bit
             position is a constant. This instruction is fast, compact, and does not affect
             any register. Very useful for boolean variable manipulation.
Instruction overview
mnem    dst    bit position    V  I1  H  I0  N  Z  C
BSET    Mem    #pos            -  -   -  -   -  -  -
Affected condition flags
Detailed description
dst        pos = 0..7    Asm              cy  lgth  Op-code(s)     ST7
longmem    n = 2*pos     BSET $1000,#1    1   4     72 1n MS LS
See also: BRES
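The op-code rule for BSET (n = 2*pos) and BRES (n = 1+2*pos) can be sketched as a small encoder in Python. The function name encode_bit_op is hypothetical; the byte layout 72 1n MS LS is taken from the detailed description tables.

```python
# Low bit of n per mnemonic: BSET uses n = 2*pos, BRES uses n = 1 + 2*pos.
BIT_OP_ODD = {"BSET": 0, "BRES": 1}


def encode_bit_op(mnemonic, address, pos):
    """Encode BSET/BRES longmem,#pos as the bytes 72 1n MS LS."""
    if mnemonic not in BIT_OP_ODD:
        raise ValueError("only BSET and BRES are handled in this sketch")
    if not 0 <= pos <= 7:
        raise ValueError("pos must be in 0..7")
    if not 0 <= address <= 0xFFFF:
        raise ValueError("long address must fit in 16 bits")
    n = 2 * pos + BIT_OP_ODD[mnemonic]
    return bytes([0x72, 0x10 | n, address >> 8, address & 0xFF])


print(encode_bit_op("BSET", 0x1000, 1).hex(" "))
print(encode_bit_op("BRES", 0x1000, 7).hex(" "))
```

Because the bit position is folded into the second op-code byte, both instructions stay at four bytes whatever bit is selected.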
BTJF
Bit Test and Jump if False
BTJF
Syntax       BTJF dst,#pos,rel    pos = [0..7], rel is relative jump label
             e.g.: BTJF PADR,#3,skip
Operation    PC = PC+lgth
             PC = PC + rel IF (dst AND (2**pos)) = 0
Description  Read the destination byte, test the corresponding bit (bit position), and
             jump to 'rel' label if the bit is false (0), else continue the program to the next
             instruction. The tested bit is saved in the C flag. The destination is a
             memory byte. The bit position is a constant. The jump label represents a
             signed offset to be added to the current PC/instruction address (relative
             jump). This instruction is used for boolean variable manipulation, hardware
             register flag tests, or I/O polling. It is fast and compact, and affects no
             register other than the C flag.
Instruction overview
mnem    dst    bit position    jump label    V  I1  H  I0  N  Z  C
BTJF    Mem    #pos            rel           -  -   -  -   -  -  C
Affected condition flags
C ⇒ Tested bit is saved in the C flag.
Detailed description
dst        pos = 0..7     Asm                   cy   lgth  Op-code(s)        ST7
longmem    n = 1+2*pos    BTJF $1000,#1,loop    2/3  5     72 0n MS LS XX
See also: BTJT
BTJT
Bit Test and Jump if True
BTJT
Syntax       BTJT dst,#pos,rel    pos = [0..7], rel is relative jump label
             e.g.: BTJT PADR,#7,skip
Operation    PC = PC+lgth
             PC = PC + rel IF (dst AND (2**pos)) <> 0
Description  Read the destination byte, test the corresponding bit (bit position), and
             jump to 'rel' label if the bit is true (1), else continue the program to the next
             instruction. The tested bit is saved in the C flag. The destination is a
             memory byte. The bit position is a constant. The jump label represents a
             signed offset to be added to the current PC/instruction address (relative
             jump). This instruction is used for boolean variable manipulation, hardware
             register flag tests, or I/O polling.
Instruction overview
mnem    dst    bit position    jump label    V  I1  H  I0  N  Z  C
BTJT    Mem    #pos            rel           -  -   -  -   -  -  C
Affected condition flags
C ⇒ Tested bit is saved in the C flag.
Detailed description
dst        pos = 0..7    Asm                   cy   lgth  Op-code(s)        ST7
longmem    n = 2*pos     BTJT $1000,#1,loop    2/3  5     72 0n MS LS XX
See also: BTJF
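The encoding 72 0n MS LS XX shared by BTJT (n = 2*pos) and BTJF (n = 1+2*pos) can be sketched in Python as follows. The function name encode_btj and its arguments are assumptions for this illustration; the relative offset is computed from the address of the next instruction, as in the Operation field.

```python
def encode_btj(mnemonic, address, pos, instr_addr, target):
    """Encode BTJT/BTJF longmem,#pos,rel as the bytes 72 0n MS LS XX."""
    if mnemonic not in ("BTJT", "BTJF"):
        raise ValueError("expected BTJT or BTJF")
    if not 0 <= pos <= 7:
        raise ValueError("pos must be in 0..7")
    n = 2 * pos + (1 if mnemonic == "BTJF" else 0)
    # The offset is relative to the PC after the 5-byte instruction.
    rel = target - (instr_addr + 5)
    if not -128 <= rel <= 127:
        raise ValueError("target is out of range for an 8-bit signed offset")
    return bytes([0x72, n, (address >> 8) & 0xFF, address & 0xFF, rel & 0xFF])


# wait_1 BTJF DRA,#0,wait_1 with DRA at $104B and wait_1 at $04A7.
print(encode_btj("BTJF", 0x104B, 0, 0x04A7, 0x04A7).hex().upper())
```

For the polling loop of the Bit Direct (Long) Relative example, this yields the same five bytes as the listing, 7201104BFB.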
CALL
CALL Subroutine
CALL
(Absolute)
Operation    PC = PC+lgth
             (SP--) = PCL
             (SP--) = PCH
             PC = dst
Description  The current PC register value is pushed onto the stack, then PC is loaded
             with the destination address in the same section of memory. The CALL
             destination and the instruction following the CALL should be in the same
             section, as PCE is not stacked. The corresponding RET instruction should
             be executed in the same section. Prefer this instruction over CALLR
             while developing a program.
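The stacking order given in the Operation field can be modelled in Python. This is a sketch under stated assumptions: the stack is a plain dictionary, the instruction length defaults to 3 bytes (an assumption, since the op-code table for CALL is not shown here), and PCE is kept unchanged because it is not stacked.

```python
def call(pc, sp, stack, dst, length=3):
    """Model CALL: push the return address (PCL, then PCH) and load PC with dst."""
    pc = (pc + length) & 0xFFFFFF      # PC = PC + lgth
    stack[sp] = pc & 0xFF              # (SP--) = PCL
    sp = (sp - 1) & 0xFFFF
    stack[sp] = (pc >> 8) & 0xFF       # (SP--) = PCH
    sp = (sp - 1) & 0xFFFF
    # PCE is not stacked, so the destination stays in the same section.
    new_pc = (pc & 0xFF0000) | (dst & 0xFFFF)
    return new_pc, sp


stack = {}
new_pc, new_sp = call(pc=0x008100, sp=0x03FF, stack=stack, dst=0x9000)
print(f"{new_pc:06X} {new_sp:04X}")
```

Only two bytes are pushed, which is why the destination and the return address must share the same PCE value.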
Description  The current PC register value is pushed onto the stack, then PC is loaded
             with the destination address. This instruction is used with extended memory
             addresses. For safe memory usage, a function which crosses sections
             must be called by CALLF.
Instruction overview
mnem     dst    V  I1  H  I0  N  Z  C
CALLF    Mem    -  -   -  -   -  -  -
Affected condition flags
Detailed description
dst            Asm                cy  lgth  Op-code(s)       ST7
extmem         CALLF $35AA00      5   4     8D ExtB MS LS
[longptr.e]    CALLF [$2FFC.e]    8   4     92 8D MS LS
See also: RETF, CALL, JPF
CALLR
CALL Subroutine Relative
CALLR
Syntax       CALLR dst    e.g. CALLR chk_pol
Operation    PC = PC+lgth
             (SP--) = PCL
             (SP--) = PCH
             PC = PC + dst
Description  The current PC register value is pushed onto the stack, then PC is loaded
             with the relative destination address. This instruction is used, once a
             program is debugged, to shrink the overall program size. The CALLR
             destination and the corresponding RET instruction address must be in the
             same section, as PCE is not stacked.
Instruction overview
mnem     dst    V  I1  H  I0  N  Z  C
CALLR    Mem    -  -   -  -   -  -  -
Affected condition flags
Detailed description
dst         Asm          cy  lgth  Op-code(s)  ST7
shortmem    CALLR $10    4   2     AD XX       ✗
See also: CALL, RET
CCF
Complement Carry Flag
CCF
Syntax       CCF
Operation    CC.C <- ¬CC.C
Description  Complements the Carry flag of the Condition Code (CC) register.
Instruction overview
mnem    V  I1  H  I0  N  Z  C
CCF     -  -   -  -   -  -  ¬C
Affected condition flags
C ⇒ ¬C
    Complements the carry flag of the CC register.
Detailed description
Addressing mode    Asm    cy  lgth  Op-code(s)  ST7
Inherent           CCF    1   1     8C
See also: RCF, SCF
CLR
Clear
CLR
Syntax       CLR dst    e.g. CLR A
Operation    dst <= 00
Description  The destination byte is forced to 00 value. The destination is either a
             memory byte location or the accumulator. This instruction is compact, and
             does not affect any register when used with RAM variables.
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
CLR     Mem    -  -   -  -   0  1  -
CLR     A      -  -   -  -   0  1  -
Affected condition flags
N: 0    Cleared
Z: 1    Set
Detailed description
dst                 Asm                  cy  lgth  Op-code(s)     ST7
A                   CLR A                1   1     4F             ✗
shortmem            CLR $10              1   2     3F XX          ✗
longmem             CLR $1000            1   4     72 5F MS LS
(X)                 CLR (X)              1   1     7F             ✗
(shortoff,X)        CLR ($10,X)          1   2     6F XX          ✗
(longoff,X)         CLR ($1000,X)        1   4     72 4F MS LS
(Y)                 CLR (Y)              1   2     90 7F          ✗
(shortoff,Y)        CLR ($10,Y)          1   3     90 6F XX       ✗
(longoff,Y)         CLR ($1000,Y)        1   4     90 4F MS LS
(shortoff,SP)       CLR ($10,SP)         1   2     0F XX
[shortptr.w]        CLR [$10]            4   3     92 3F XX       ✗
[longptr.w]         CLR [$1000.w]        4   4     72 3F MS LS
([shortptr.w],X)    CLR ([$10],X)        4   3     92 6F XX       ✗
([longptr.w],X)     CLR ([$1000.w],X)    4   4     72 6F MS LS
([shortptr.w],Y)    CLR ([$10],Y)        4   3     91 6F XX       ✗
See also: LD
CLRW
Clear word
CLRW
Syntax       CLRW dst    e.g. CLRW X
Operation    dst <= 0000
Description  The destination is forced to 0000 value. The destination is an index
             register.
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
CLRW    X      -  -   -  -   0  1  -
CLRW    Y      -  -   -  -   0  1  -
Affected condition flags
N: 0    Cleared
Z: 1    Set
Detailed description
dst    Asm       cy  lgth  Op-code(s)  ST7
X      CLRW X    1   1     5F
Y      CLRW Y    1   2     90 5F
See also: LD
CP
Compare
CP
Syntax       CP dst,src    e.g. CP A,(tbl,X)
Operation    {N, Z, C} = Test (dst - src)
Description  The source byte is subtracted from the destination byte and the result
             is lost. However, the N, Z, C flags of the Condition Code (CC) register are
             updated according to the result. The destination is a register, and the
             source is a memory or data byte. This instruction is generally used just
             before a conditional jump instruction.
Set if the signed subtraction of the source (src) value from the
destination (dst) value generates a signed overflow (signed result cannot be
represented on 8 bits).
N ⇒ R7
    Set if bit 7 of the result is set (negative value), cleared otherwise.
Z ⇒ ¬R7.¬R6.¬R5.¬R4.¬R3.¬R2.¬R1.¬R0
    Set if the result is zero (0x00), cleared otherwise.
C ⇒ (¬A7.M7 + ¬A7.R7 + A7.M7.R7)
    Set if the unsigned value of the contents of source (src) is larger than the
    unsigned value of the destination (dst), cleared otherwise.
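The flag rules for CP can be checked against a short Python model. This is a sketch only: it assumes that A7 is complemented in the first two terms of the C expression (the borrow form for a subtraction), and the function name cp_flags is hypothetical.

```python
def cp_flags(a, m):
    """Return (N, Z, C) for CP A,src with 8-bit operands a (dst) and m (src)."""
    r = (a - m) & 0xFF                 # the result itself is discarded by CP
    a7, m7, r7 = (a >> 7) & 1, (m >> 7) & 1, (r >> 7) & 1
    not_a7 = a7 ^ 1
    n = r7                                           # N = R7
    z = int(r == 0)                                  # Z = all result bits are 0
    c = (not_a7 & m7) | (not_a7 & r7) | (a7 & m7 & r7)  # borrow out of bit 7
    return n, z, c


print(cp_flags(0x10, 0x20))
```

Under that assumption the C expression agrees with the textual rule for every pair of operands: C is set exactly when the unsigned source is larger than the unsigned destination.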
A    [longptr.w]         CP A,[$1000.w]        4   4   72 C1 MS LS
A    ([shortptr.w],X)    CP A,([$10.w],X)      4   3   92 D1 XX       ✗
A    ([longptr.w],X)     CP A,([$1000.w],X)    4   4   72 D1 MS LS
A    ([shortptr.w],Y)    CP A,([$10.w],Y)      4   3   91 D1 XX       ✗
See also: CPW, TNZ, BCP
CPW
Compare word
CPW
Syntax       CPW dst,src    e.g. CPW Y,(tbl,X)
Operation    {N, Z, C} = Test (dst - src)
Description  The source word is subtracted from the destination word and the result is
             lost. However, the N, Z, C flags of the Condition Code (CC) register are
             updated according to the result. The destination is an index register, and
             the source is a memory or data word. This instruction is generally used just
             before a conditional jump instruction.
Set if the signed subtraction of the source (src) value from the destination (dst)
value generates a signed overflow (signed result cannot be represented on 16
bits).
N ⇒ R15
    Set if bit 15 of the result is set (negative value), cleared otherwise.
Z ⇒ set if the result is zero (0x0000), cleared otherwise.
C ⇒ (¬X15.M15 + ¬X15.R15 + X15.M15.R15)
    Set if the unsigned value of the contents of source (src) is larger than the
    unsigned value of the destination (dst), cleared otherwise.
Detailed description
dst    src                 Asm                  cy  lgth  Op-code(s)     ST7
X      #word               CPW X,#$10           2   3     A3 MS LS       ✗
X      shortmem            CPW X,$10            2   2     B3 XX          ✗
X      longmem             CPW X,$1000          2   3     C3 MS LS       ✗
X      (Y)                 CPW X,(Y)            2   2     90 F3          ✗
X      (shortoff,Y)        CPW X,($10,Y)        2   3     90 E3 XX       ✗
X      (longoff,Y)         CPW X,($1000,Y)      2   4     90 D3 MS LS    ✗
X      (shortoff,SP)       CPW X,($10,SP)       2   2     13 XX
X      [shortptr.w]        CPW X,[$10.w]        5   3     92 C3 XX       ✗
X      [longptr.w]         CPW X,[$1000.w]      5   4     72 C3 MS LS
X      ([shortptr.w],Y)    CPW X,([$10.w],Y)    5   3     91 D3 XX       ✗
Note: CPW Y,(shortoff,SP) is not implemented, but can be emulated through a macro using
EXGW X,Y and CPW X,(shortoff,SP).
See also: CP, TNZW, BCP
CPL
Logical 1’s Complement
CPL
Syntax       CPL dst    e.g. CPL (X)
Operation    dst <= dst XOR FF, or FF - dst
Description  The destination byte is read, then each bit is toggled (inverted) and the
             result is written to the destination byte. The destination is either a memory
             byte or a register. This instruction is compact, and does not affect any
             registers when used with RAM variables.
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
CPL     Mem    -  -   -  -   N  Z  1
CPL     Reg    -  -   -  -   N  Z  1
Affected condition flags
N ⇒ R7
    Set if bit 7 of the result is set (negative value), cleared otherwise.
Z ⇒ ¬R7.¬R6.¬R5.¬R4.¬R3.¬R2.¬R1.¬R0
    Set if the result is zero (0x00), cleared otherwise.
C ⇒ 1
    Set.
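The two forms of the operation (dst XOR FF and FF - dst) and the resulting flags can be illustrated with a short Python sketch. The function name cpl and the returned tuple layout are assumptions made for this example.

```python
def cpl(value):
    """Return (result, N, Z, C) for the one's complement of an 8-bit value."""
    result = (value ^ 0xFF) & 0xFF
    n = (result >> 7) & 1      # N = R7
    z = int(result == 0)       # Z = 1 only when the result is 0x00
    c = 1                      # C is always set by CPL
    return result, n, z, c


print(cpl(0x0F))
```

For 8-bit values the XOR form and the subtraction form always give the same byte, because subtracting from FF never generates a borrow between bits.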
Detailed description
dst                 Asm                  cy  lgth  Op-code(s)     ST7
A                   CPL A                1   1     43             ✗
shortmem            CPL $10              1   2     33 XX          ✗
longmem             CPL $1000            1   4     72 53 MS LS
(X)                 CPL (X)              1   1     73             ✗
(shortoff,X)        CPL ($10,X)          1   2     63 XX          ✗
(longoff,X)         CPL ($1000,X)        1   4     72 43 MS LS
(Y)                 CPL (Y)              1   2     90 73          ✗
(shortoff,Y)        CPL ($10,Y)          1   3     90 63 XX       ✗
(longoff,Y)         CPL ($1000,Y)        1   4     90 43 MS LS
(shortoff,SP)       CPL ($10,SP)         1   2     03 XX          ✗
[shortptr.w]        CPL [$10]            4   3     92 33 XX       ✗
[longptr.w]         CPL [$1000.w]        4   4     72 33 MS LS
([shortptr.w],X)    CPL ([$10],X)        4   3     92 63 XX       ✗
([longptr.w],X)     CPL ([$1000.w],X)    4   4     72 63 MS LS
([shortptr.w],Y)    CPL ([$10],Y)        4   3     91 63 XX       ✗
See also: NEG, XOR, AND, OR
CPLW
Logical 1’s Complement Word
CPLW
Syntax       CPLW dst    e.g. CPLW X
Operation    dst <= dst XOR FFFF, or FFFF - dst
Description  The destination index register is read, then each bit is toggled (inverted)
             and the result is written back to the destination index register.
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
CPLW    Reg    -  -   -  -   N  Z  1
Affected condition flags
N ⇒ R15
    Set if bit 15 of the result is set (negative value), cleared otherwise.
Z ⇒ set if the result is zero (0x0000), cleared otherwise.
C ⇒ 1
    Set.
Detailed description
dst    Asm       cy  lgth  Op-code(s)  ST7
X      CPLW X    2   1     53          ✗
Y      CPLW Y    2   2     90 53       ✗
See also: CPL, NEGW, XOR, AND, OR
DEC
Decrement
DEC
Syntax       DEC dst
Operation    dst <= dst - 1
Description  The destination byte is read, then decremented by one, and the result is
             written to the destination byte. The destination is either a memory byte or a
             register. This instruction is compact, and does not affect any registers when
             used with RAM variables.
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
DEC     Mem    V  -   -  -   N  Z  -
DEC     Reg    V  -   -  -   N  Z  -
Affected condition flags
V ⇒ (A7.M7 + M7.R7 + R7.A7) ⊕ (A6.M6 + M6.R6 + R6.A6)
    Set if the signed operation generates an overflow, cleared otherwise.
N ⇒ R7
    Set if bit 7 of the result is set (negative value), cleared otherwise.
Z ⇒ ¬R7.¬R6.¬R5.¬R4.¬R3.¬R2.¬R1.¬R0
    Set if the result is zero (0x00), cleared otherwise.
Detailed description
dst                 Asm                  cy  lgth  Op-code(s)     ST7
A                   DEC A                1   1     4A             ✗
shortmem            DEC $10              1   2     3A XX          ✗
longmem             DEC $1000            1   4     72 5A MS LS
(X)                 DEC (X)              1   1     7A             ✗
(shortoff,X)        DEC ($10,X)          1   2     6A XX          ✗
(longoff,X)         DEC ($1000,X)        1   4     72 4A MS LS
(Y)                 DEC (Y)              1   2     90 7A          ✗
(shortoff,Y)        DEC ($10,Y)          1   3     90 6A XX       ✗
(longoff,Y)         DEC ($1000,Y)        1   4     90 4A MS LS
(shortoff,SP)       DEC ($10,SP)         1   2     0A XX
[shortptr.w]        DEC [$10]            4   3     92 3A XX       ✗
[longptr.w]         DEC [$1000.w]        4   4     72 3A MS LS
([shortptr.w],X)    DEC ([$10],X)        4   3     92 6A XX       ✗
([longptr.w],X)     DEC ([$1000.w],X)    4   4     72 6A MS LS
([shortptr.w],Y)    DEC ([$10],Y)        4   3     91 6A XX       ✗
See also: DECW, INC
DECW
Decrement word
DECW
Syntax       DECW dst
Operation    dst <= dst - 1
Description  The value of the destination index register is decremented by one.
Instruction overview
mnem    dst    V  I1  H  I0  N  Z  C
DECW    Reg    V  -   -  -   N  Z  -
Affected condition flags
V ⇒ (A15.M15 + M15.R15 + R15.A15) ⊕ (A14.M14 + M14.R14 + R14.A14)
    Set if the signed operation generates an overflow, cleared otherwise.
N ⇒ R15
    Set if bit 15 of the result is set (negative value), cleared otherwise.