STMicroelectronics PM0044 Programming manual

PM0044
Programming manual
STM8 CPU programming manual
Introduction
The STM8 family of HCMOS microcontrollers is designed and built around an enhanced industry standard 8-bit core and a library of peripheral blocks, which include ROM, Flash, RAM, EEPROM, I/O, Serial Interfaces (SPI, USART, I2C,...), 16-bit Timers, A/D converters, comparators, power supervisors etc. These blocks may be assembled in various combinations in order to provide cost-effective solutions for application-specific products.
The STM8 family forms a part of the STMicroelectronics 8-bit MCU product line, which finds its place in a wide variety of applications such as automotive systems, remote controls, video monitors, car radio and numerous other consumer, industrial, telecom, and multimedia products.
September 2011 Doc ID 13590 Rev 3 1/162
www.st.com
Contents PM0044
Contents
1 STM8 architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.1 STM8 development support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2 Enhanced STM8 features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3 STM8 core description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 CPU registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
4 STM8 memory interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1 Program space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.2 Data space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.3 Memory interface architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
5 Pipelined execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
5.1 Description of pipelined execution stages . . . . . . . . . . . . . . . . . . . . . . . . 20
5.1.1 Fetch stage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.1.2 Decoding and addressing stage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.1.3 Execution stage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2 Data memory conflicts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.3 Pipelined execution examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.4 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.4.1 Optimized pipeline example – execution from Flash Program memory . 24
5.4.2 Optimize pipeline example – execution from RAM . . . . . . . . . . . . . . . . 26
5.4.3 Pipeline with Call/Jump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.4.4 Pipeline stalled . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.4.5 Pipeline with 1 wait state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
6 STM8 addressing modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
6.1 Inherent addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6.2 Immediate addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.3 Direct addressing mode (Short, Long, Extended) . . . . . . . . . . . . . . . . . . 34
2/162 Doc ID 13590 Rev 3
PM0044 Contents
6.3.1 Short Direct addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.3.2 Long Direct addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.3.3 Extended Direct addressing mode (only for CALLF and JPF) . . . . . . . . 38
6.4 Indexed addressing mode (No Offset, Short, SP, Long, Extended) . . . . . 39
6.4.1 No Offset Indexed addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.4.2 Short Indexed addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.4.3 SP Indexed addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
6.4.4 Long Indexed addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.4.5 Extended Indexed (only LDF instruction) . . . . . . . . . . . . . . . . . . . . . . . . 44
6.5 Indirect (Short Pointer Long, Long Pointer Long) . . . . . . . . . . . . . . . . . . . 45
6.6 Short Pointer Indirect Long addressing mode . . . . . . . . . . . . . . . . . . . . . 46
6.7 Long Pointer Indirect Long addressing mode . . . . . . . . . . . . . . . . . . . . . . 47
6.8 Indirect Indexed (Short Pointer Long, Long Pointer Long,
Long Pointer Extended) addressing mode . . . . . . . . . . . . . . . . . . . . . . . . 48
6.9 Short Pointer Indirect Long Indexed addressing mode . . . . . . . . . . . . . . 49
6.10 Long Pointer Indirect Long Indexed addressing mode . . . . . . . . . . . . . . . 51
6.11 Long Pointer Indirect Extended Indexed addressing mode . . . . . . . . . . . 53
6.12 Relative Direct addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.13 Bit Direct (Long) addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.14 Bit Direct (Long) Relative addressing mode . . . . . . . . . . . . . . . . . . . . . . . 59
7 STM8 instruction set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
7.2 Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.2.1 Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.2.2 CPU registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.2.3 Code condition bit value notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.2.4 Memory and addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.2.5 Operation code notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.3 Instruction set summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.4 Instruction set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
ADC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
ADD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
ADDW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
AND . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Doc ID 13590 Rev 3 3/162
Contents PM0044
BCCM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
BCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
BCPL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
BREAK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
BRES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
BSET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
BTJF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
BTJT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
CALL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
CALLF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
CALLR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
CCF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
CLR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
CLRW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
CP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
CPW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
CPL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
CPLW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
DEC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
DECW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
DIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
DIVW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
EXG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
EXGW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
HALT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
INC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
INCW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
INT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
IRET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
JP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
JPF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
JRA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
JRxx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4/162 Doc ID 13590 Rev 3
PM0044 Contents
LD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
LDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
LDW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
MOV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
MUL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
NEG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
NEGW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
NOP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
OR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
POP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
POPW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
PUSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
PUSHW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
RCF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
RET. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
RETF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
RIM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
RLC. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
RLCW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
RLWA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
RRC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
RRCW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
RRWA. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
RVF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
SBC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
SCF. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
SIM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
SLL/SLA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
SLLW/SLAW. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
SRA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
SRAW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
SRL. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
SRLW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Doc ID 13590 Rev 3 5/162
Contents PM0044
SUB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
SUBW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
SWAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
SWAPW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
TNZ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
TNZW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
TRAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
WFE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
WFI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
XOR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8 Revision history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6/162 Doc ID 13590 Rev 3
PM0044 List of tables
List of tables
Table 1. Interruptability levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Table 2. Data/address decoding examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Table 3. Example with exact number of cycles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Table 4. Example with conventional number of cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Table 5. Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Table 6. Optimized pipeline example - execution from Flash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Table 7. Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Table 8. Optimize pipeline example – execution from RAM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Table 9. Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Table 10. Example of pipeline with Call/Jump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Table 11. Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Table 12. Example of stalled pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Table 13. Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Table 14. Pipeline with 1 wait state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Table 15. Legend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Table 16. STM8 core addressing modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Table 17. STM8 addressing mode overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Table 18. Inherent addressing instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Table 19. Immediate addressing instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Table 20. Overview of Direct addressing mode instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Table 21. Available Long and Short Direct addressing mode instructions . . . . . . . . . . . . . . . . . . . . . 34
Table 22. Available Extended Direct addressing mode instructions. . . . . . . . . . . . . . . . . . . . . . . . . . 35
Table 23. Available Long Direct addressing mode instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Table 24. Overview Indexed addressing mode instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Table 25. No Offset, Long, Short and SP Indexed instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Table 26. No Offset, Long, Short Indexed Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Table 27. Extended Indexed Instructions only . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Table 28. Overview of Indirect addressing instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Table 29. Available Long Pointer Long and Short Pointer Long Indirect Instructions. . . . . . . . . . . . . 45
Table 30. Available Long Pointer Long Indirect Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Table 31. Overview of Indirect indexed instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Table 32. Available Long Pointer Long and Short Pointer Long Indirect
Indexed instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Table 33. Available Long Pointer Long Indirect Indexed instructions . . . . . . . . . . . . . . . . . . . . . . . . . 48
Table 34. Long Pointer Extended Indirect Indexed instructions instruction . . . . . . . . . . . . . . . . . . . . 48
Table 35. Overview of Relative Direct addressing mode instructions. . . . . . . . . . . . . . . . . . . . . . . . . 55
Table 36. Available Relative Direct instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Table 37. Overview of Bit Direct addressing mode instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Table 38. Available Bit Direct instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Table 39. Overview of Bit Direct (Long) Relative addressing mode . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Table 40. Available Bit Direct Relative instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Table 41. Instruction groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Table 42. Instruction set summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Table 43. Document revision history . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Doc ID 13590 Rev 3 7/162
List of figures PM0044
List of figures
Figure 1. Programming model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Figure 2. Context save/restore for interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Figure 3. Address spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Figure 4. Memory Interface Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Figure 5. Pipelined execution principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Figure 6. Pipelined execution stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Figure 7. Immediate addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Figure 8. Short Direct addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Figure 9. Long Direct addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Figure 10. Far Direct addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Figure 11. No Offset Indexed addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Figure 12. Short Indexed - 8-bit offset - addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Figure 13. SP Indexed - 8-bit offset - addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Figure 14. Long Indexed - 16-bit offset - addressing mode example. . . . . . . . . . . . . . . . . . . . . . . . . . 43
Figure 15. Far Indexed - 16-bit offset - addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Figure 16. Short Pointer Indirect Long addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Figure 17. Long Pointer Indirect Long addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Figure 18. Short Pointer Indirect Long Indexed addressing mode example . . . . . . . . . . . . . . . . . . . . 50
Figure 19. Long Pointer Indirect Long Indexed addressing mode example. . . . . . . . . . . . . . . . . . . . . 52
Figure 20. Long Pointer Indirect Extended Indexed addressing mode example . . . . . . . . . . . . . . . . . 54
Figure 21. Relative Direct addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Figure 22. Bit Long Direct addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Figure 23. Bit Long Direct Relative addressing mode example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
8/162 Doc ID 13590 Rev 3
PM0044 STM8 architecture

1 STM8 architecture

The 8-bit STM8 Core is designed for high code efficiency. It contains 6 internal registers, 20
addressing modes and 80 instructions. The 6 internal registers include two 16-bit Index
registers, an 8-bit Accumulator, a 24-bit Program Counter, a 16-bit Stack Pointer and an 8-
bit Condition Code register. The two Index registers X and Y enable Indexed Addressing
modes with or without offset, along with read-modify-write type data manipulation. These
registers simplify branching routines and data/arrays modifications.
The 24-bit Program Counter is able to address up to 16-Mbyte of RAM, ROM or Flash
memory. The 16-bit Stack Pointer provides access to a 64K-level Stack. The Core also
includes a Condition Code register providing 7 Condition flags that indicate the result of the
last instruction executed.
The 20 Addressing modes, including Indirect Relative and Indexed addressing, allow
sophisticated branching routines or CASE-type functions. The Indexed Indirect Addressing
mode, for instance, permits look-up tables to be located anywhere in the address space,
thus enabling very flexible programming and compact C-based code. The stack pointer
relative addressing mode permits optimized C compiler stack model for local variables and
parameter passing.
The Instruction Set is 8-bit oriented with a 2-byte average instruction size. This Instruction
Set offers, in addition to standard data movement and logic/arithmetic functions, 8-bit by 8-
bit multiplication, 16-bit by 8-bit and 16-bit by 16-bit division, bit manipulation, data transfer
between Stack and Accumulator (Push / Pop) with direct stack access, as well as data
transfer using the X and Y registers or direct memory-to-memory transfers.
The number of Interrupt vectors can vary up to 32, and the interrupt priority level may be
managed by software providing hardware controlled nested capability. Some peripherals
include Direct Memory Access (DMA) between serial interfaces and memory. Support for
slow memories allows easy external code execution through serial or parallel interface
(ROMLESS products for instance).
The STM8 has a high energy-efficient architecture, based on a Harvard architecture and
pipelined execution. A 32-bit wide program memory bus allows most of the instructions to be
fetched in 1 CPU cycle. Moreover, as the average instruction length is 2 bytes, this allows for
a reduction in the power consumption by only accessing the program memory half of the
time, on average. The pipelined execution allowed the execution time to be minimized,
ensuring high system performance, when needed, together with the possibility to reduce the
overall energy consumption, by using different power saving operating modes. Power-saving
can be managed under program control by placing the device in SLOW, WAIT, SLOW-WAIT,
ACTIVE-HALT or HALT mode (see product datasheet for more details).
Doc ID 13590 Rev 3 9/162
STM8 architecture PM0044
Additional blocks
The additional blocks take the form of integrated hardware peripherals arranged around the
central processor core. The following (non-exhaustive) list details the features of some of the
currently available blocks:
Boot ROM Memory area containing the bootloader code
Flash Flash-based devices
RAM Sizes up to several Kbytes
Data EEPROM
Timers
A/D converter
I2C
SPI
USART
Watchdog
I/O ports
Sizes up to several Kbytes. Erase/programming operations do not require additional external power sources.
Different versions based on 8/16-bit free running or autoreload timer/counter are available. They can be coupled with either input captures, output compares or PWM facilities. PWM functions can have software programmable duty cycle between 0% to 100% in up to 256/65536 steps. The outputs can be filtered to provide D/A conversion.
The Analog to Digital Converter uses a sample and hold technique. It has 12-bit resolution.
Multi/master, single master, single slave modes, DMA or 1byte transfer, standard and fast I2C modes, 7 and 10-bit addressing.
The Serial peripheral Interface is a fully synchronous 3/4 wire interface ideal for Master and Slave applications such as driving devices with input shift register (LCD driver, external memory,...).
The USART is a fast synchronous/asynchronous interface which features both duplex transmission, NRZ format, programmable baud rates and standard error detection. The USART can also emulate RS232 protocol.
It has the ability to induce a full reset of the MCU if its counter counts down to zero prior to being reset by the software. This feature is especially useful in noisy applications.
They are programmable by software to act in several input or output configurations on an individual line basis, including high current and interrupt generation. The basic block has eight bit lines.

1.1 STM8 development support

The STM8 family of MCUs is supported by a comprehensive range of development tools.
This family presently comprises hardware tools (emulators, programmers), a software
package (assembler-linker, debugger, archiver) and a C-compiler development tool.
STM8 and ST7 CPUs are supported by a single toolchain allowing easy reuse and
portability of the applications between product lines.
10/162 Doc ID 13590 Rev 3
PM0044 STM8 architecture

1.2 Enhanced STM8 features

16-Mbyte linear program memory space with 3 FAR instructions (CALLF, RETF, JPF)
16-Mbyte linear data memory space with 1 FAR instruction (LDF)
Up to 32 24-bit interrupt vectors with optimized context save management
16-bit Stack Pointer (SP=SH:S) with stack manipulation instructions and addressing
modes
New register and memory access instructions (EXG, MOV)
New arithmetic instructions: DIV 16/8 and DIVW 16/16
New bit handling instructions (CCF, BCPL, BCCM)
2 x 16-bit index registers (X=XH:XL, Y=YH:YL). 8-bit data transfers address the low
byte. The high-byte is not affected, with a reset value of 0. This allows the use of X/Y as 8-bit values.
Fast interrupt handling through alternate register files (up to 4 contexts) with standard
stack compatible mode (for real time OS kernels)
16-bit/8-bit stack operations (X, Y, A, CC stacking)
16-bit pointer direct update with 16-bit relative offset (ADDW/SUBW for X/Y/SP)
8-bit & 16-bit arithmetic and signed arithmetic support
Doc ID 13590 Rev 3 11/162
Glossary PM0044

2 Glossary

mnem mnemonic
src source
dst destination
cy duration of the instruction in CPU clock cycles (internal clock)
lgth length of the instruction in byte(s)
op-code instruction byte(s) implementation (1..4 bytes), operation code.
mem memory location
imm immediate value
off offset
ptr pointer
pos position
byte a byte
word 16-bit value
short represent a short 8-bit addressing mode
long represent a long 16-bit addressing mode
EA Effective Address: The final computed data byte address
Page Zero all data located at [00..FF] addressing space (single byte address)
(XX) content of a memory location XX
XX a byte value
ExtB Extended byte
MS Most Significant byte of a 16-bit value (MSB)
LS Least Significant byte of a 16-bit value (LSB)
A Accumulator register
X 16-bit X Index register
Y 16-bit Y Index register
reg A, XL or YL register (1-byte LS part of X/Y), XH or YH (1-byte MS part of X/Y)
ndx index register, either X or Y
PC 24-bit Program Counter register
SP 16-bit Stack Pointer
S Stack Pointer LSB
CC Condition Code register
12/162 Doc ID 13590 Rev 3
PM0044 STM8 core description
07
A ACCUMULATOR
015
SP STACK POINTER
X INDEX
Y INDEX
07815
PC PROGRAM COUNTER
PCH PCL
07
CC CODE CONDITION
V
-
I1 H I0 N Z C
1623
PCE
07815
XH XL
07815
YH YL

3 STM8 core description

3.1 Introduction

The CPU has a full 8-bit architecture, with 16-bit operations on index registers (for address
computation). Six internal registers allow efficient 8-bit data manipulation. The CPU is able
to execute 80 basic instructions. It features 20 addressing modes and can address 6 internal
registers and 16 Mbytes of memory/peripheral registers.

3.2 CPU registers

The 6 CPU registers are shown in the programming model in Figure 1. Following an
interrupt, the register context is saved. The context is saved by pushing registers onto the
stack in the order shown in Figure 2. They are popped from the stack in the reverse order.
Accumulator (A)
The accumulator is an 8-bit general purpose register used to hold operands and the results
of the arithmetic and logic calculations as well as data manipulations.
Index registers (X and Y)
These 16-bit registers are used to create effective addresses or as temporary storage area
for data manipulations. In most of the cases, the cross assembler generates a PRECODE
instruction (PRE) to indicate that the following instruction refers to the Y register. Both X and
Y are automatically saved on interrupt routine branch.
Program Counter (PC)
The program counter is a 24-bit register used to store the address of the next instruction to
be executed by the CPU. It is automatically refreshed after each processed instruction. As a
result, the STM8 core can access up to 16-Mbytes of memory.

Figure 1. Programming model

Doc ID 13590 Rev 3 13/162
STM8 core description PM0044
Stack Pointer (SP)
The stack pointer is a 16-bit register. It contains the address of the next free location of the
stack. Depending on the product, the most significant bits can be forced to a preset value.
The stack is used to save the CPU context on subroutines calls or interrupts. The user can
also directly use it through the POP and PUSH instructions.
After an MCU reset the Stack Pointer is set to its upper limit value. It is then decremented
after data has been pushed onto the stack and incremented after data is popped from the
stack. When the lower limit is exceeded, the stack pointer wraps around to the stack upper
limit. The previously stored information is then overwritten, and therefore lost.
A subroutine call occupies two or three locations.
When an interrupt occurs, the CPU registers (CC, X, Y, A, PC) are pushed onto the stack.
This operation takes 9 CPU cycles and uses 9 bytes in RAM.
Note: The WFI/HALT instructions save the context in advance. If an interrupt occurs while the CPU
is in one of these modes, the latency is reduced.

Figure 2. Context save/restore for interrupts

).4%22504'%.%2!4)/.EXECUTEPIPELINE
#OMPLETEINSTRUCTIONINEXECUTESTAGECYCLELATENCY
053(0#, 053(0#(
053(0#%
053(9 053(8 053(!
053(##
*5-04/).4%225042/54).%')6%."94(%).4%225046%#4/2
).4%225042/54).%
%8%#54)/.
)2%4).3425#4)/.
0/0##
0/0! 0/08
0/09 0/00#% 0/00#( 0/00#,
#05#9#,%3
5.34!#+ 0/0
#05#9#,%3
2%4 5
2.
0#,
0#,
0#(
0#,
0#%
0#,
9,
0#,
9(
0#,
8,
0#,
8(
0#,
!
0#,
##
0#,
).4%22504
34!#+ 053(
*5-04/4(%!$$2%33')6%."902/'2!-#/5.4%22ELOAD0IPELINE
14/162 Doc ID 13590 Rev 3
-36
PM0044 STM8 core description
Global configuration register (CFG_GCR)
The global configuration register is a memory mapped register. It controls the configuration of the processor. It contains the AL control bit:
AL: Activation level
If the AL bit is 0 (main), the IRET will cause the context to be retrieved from stack and the main program will continue after the WFI instruction.
If the AL bit is 1 (interrupt only active), the IRET will cause the CPU to go back to WFI/HALT mode without restoring the context.
This bit is used to control the low power modes of the MCU. In a very low power application, the MCU spends most of the time in WFI/HALT mode and is woken up (through interrupts) at specific moments in order to execute a specific task. Some of these recurring tasks are short enough to be treated directly in an ISR, rather than going back to the main program. In this case, by programming the AL bit to 1 before going to low power (by executing WFI/HALT instruction), the run time/ISR execution is reduced due to the fact that the register context is not saved/restored each time.
Condition Code register (CC)
The Condition Code register is a 8-bit register which indicates the result of the instruction just executed as well as the state of the processor. These bits can be individually tested by a program and specified action taken as a result of their state. The following paragraphs describe each bit.
V: Overflow
When set, V indicates that an overflow occurred during the last signed arithmetic operation, on the MSB operation result bit. See INC, INCW, DEC, DECW, NEG, NEGW, ADD, ADC, SUB, SUBW, SBC, CP, CPW instructions.
I1: Interrupt mask level 1
The I1 flag works in conjunction with the I0 flag to define the current interruptability level as shown in the following table. These flags can be set and cleared by software through the RIM, SIM, HALT, WFI, IRET, TRAP and POP instructions and are automatically set by hardware when entering an interrupt service routine.

Table 1. Interruptability levels

Interruptability Priority I1 I0
Interruptable Main
Interruptable Level 1 01
Interruptable Level 2 00
Non Interruptable 11
H: Half carry bit
Lowest
Highest
10
The H bit is set to 1 when a carry occurs between the bits 3 and 4 of the ALU during an ADD or ADC instruction. The H bit is useful in BCD arithmetic subroutines.
For ADDW, SUBW it is set when a carry occurs from bit 7 to 8, allowing to implement byte arithmetic on 16-bit index registers.
Doc ID 13590 Rev 3 15/162
STM8 core description PM0044
I0: Interrupt mask level 0
See Flag I1
N: Negative
When set to 1, this bit indicates that the result of the last arithmetic, logical or data manipulation is negative (i.e. the most significant bit is a logic 1).
Z: Zero
When set to 1, this bit indicates that the result of the last arithmetic, logical or data manipulation is zero.
C: Carry
When set, C indicates that a carry or borrow out of the ALU occurred during the last arithmetic operation on the MSB operation result bit (bit 7 for 8-bit result/destination or bit 15 for 16-bit result). This bit is also affected during bit test, branch, shift, rotate and load instructions. See ADD, ADC, SUB, SBC instructions. In bit test operations, C is the copy of the tested bit. See BTJF, BTJT instructions. In shift and rotates operations, the carry is updated. See RRC, RLC, SRL, SLL, SRA instructions.
This bit can be set, reset or complemented by software using SCF, RCF, CCF instructions.
Example: Addition
$B5 + $94 = "C" + $49 = $149
C7 0
0 10110101
C7 0
+0 10010100
C7 0
=1 01001001
The results of each instruction on the Condition Code register are shown by tables in
Section 7: STM8 instruction set. The following table is an example:
VI1HI0NZC
V0 0NZ1
where
Nothing = Flag not affected
Flag name =Flag affected
0 = Flag cleared
1 = Flag set
16/162 Doc ID 13590 Rev 3
PM0044 STM8 memory interface

4 STM8 memory interface

4.1 Program space

The program space is 16-Mbyte and linear. To distinguish the 1, 2 and 3 byte wide addressing modes, naming has been defined as shown in Figure 3:
"Page" [0xXXXX00 to 0xXXXXFF]: 256-byte wide memory space with the same two
most significant address bytes (XXXX defines the page number).
"Section" [0xXX0000 to 0xXXFFFF]: 64-Kbyte wide memory space with the same most
significant address byte (XX defines the section number).
The reset and interrupt vector table are placed at address 0x8000 for the STM8 family. (Note: the base address may be different for later implementations.) The table has 32 4-byte entries: RESET, Trap, NMI and up to 29 normal user interrupts. Each entry consists of the reserved op-code 0x82, followed by a 24-bit value: PCE, PCH, PCL address of the respective Interrupt Service Routine. The main program and ISRs can be mapped anywhere in the 16 Mbyte memory space.
CALL/CALLR and RET must be used only in the same section. The effective address for the CALL/RET is used as an offset to the current PCE register value. For the JP, the effective address 16 or 17-bit (for indexed addressing) long, is added to the current PCE value. In order to reach any address in the program space, the JPF jump and CALLF call instructions are provided with a three byte extended addressing mode while the RETF pops also three bytes from the stack.
As the memory space is linear, sections can be crossed by two CPU actions: next instruction byte fetch (PC+1), relative jumps and, in some cases, by JP (for indexed addressing mode).
Note: For safe memory usage, a function which crosses sections MUST:
- be called by a CALLF
- include only far instructions for code operation (CALLF & JPF)
All label pointers are located in section 0 (JP [ptr.w] example: ptr.w is located in section 0 and the jump address in current section)
Any illegal op-code read from the program space triggers a MCU reset.

4.2 Data space

The data space is 16-Mbyte and linear. As the stack must be located in section 0 and as data access outside section 0/1 can be managed only with LDF instructions, frequently used data should be located in section 0 to get the optimum code efficiency.
All data pointers are located in section 0 only.
Indexed addressing (with 16-bit index registers and long offset) allows data access over section 0 and 1.
All the peripherals are memory mapped in the data space.
Doc ID 13590 Rev 3 17/162
STM8 memory interface PM0044
VECTORS
PAGE 0
0x000000
0x0000FF
0x00807F
0x00FFFF
0x010000
0x01FFFF
0xFF0000
0xFFFFFF
1-BYTE ADDRESSING MODE
BIT HANDLING CAPABILITY
2-BYTE ADDRESSING MODE
3-BYTE ADDRESSING MODE
FAST DATA ACCESS WITH
DATA SPACE
SECTION 0
SECTION 1
SECTION 256
RESET
L
RESET
H
TRAP
L
TRAP
H
NMI
L
NMI
H
INT0
L
INT0
H
INT1
L
INT1
H
INT28
L
INT28
H
0x00807C
0x008000
PROGRAM SPACE
BIT HANDLING CAPABILITY POWERFUL DATA MANAGEMENT
ACCESSIBLE DATA
STACK AREA
SHORT GENERATED CODE
RESET
E
TRAP
E
NMI
E
INT0
E
INT1
E
INT28
E
0x008000
POINTERS
0x82
0x82
0x82
0x82
0x82
0x82

Figure 3. Address spaces

18/162 Doc ID 13590 Rev 3
PM0044 STM8 memory interface
PCE PCH PCL
PROGRAM COUNTER
Data@E Data@E0:H:L0x00
YN
"LDF" INSTRUCTION
@DATABUS
RAM FETCH INSTRUCTION
YN
CPU
Memory Interface (RAM)
STALL
A15..0
7
24
17
24
D7..0
R/W
DATABUS
@BUS
Memory Interface (Flash)
STALL
A23..0
D31..0
DATABUS
@BUS
(FETCH)
24
@DATABUS
24

4.3 Memory interface architecture

The STM8 uses a Harvard architecture, with separate program and data memory buses. However, the logical address space is unified, all memories sharing the same 16-Mbytes space, non-overlapped. The memory interfaces are shown in Figure 4. It consists of two buses: address, data, read/write control signal (R/W) and memory acknowledge signal (STALL).
The STALL acknowledge signal makes the CPU compatible with slow serial or parallel memory interfaces. When the memory interface is slow the CPU waits the memory acknowledge before executing the instruction. So in such a case, the instruction CPU cycle time is prolonged compare to the value given in this manual.
The program memory bus is 32-bit wide, allowing the fetch of most of the instructions in one cycle.
As the address space is unified, the architecture allows data to be stored also in the Flash memory and program to be fetched also from RAM (data bus). In this later case the performance is impacted, besides the fact that data and fetch operation share the same bus, the instructions will be fetched one byte at a time, thus taking longer (1 cycle /byte).

Figure 4. Memory Interface Architecture

Doc ID 13590 Rev 3 19/162
Pipelined execution PM0044
-36
%8%#54%
$%#/$%
&%4#(
0#N
0#
0#N
)NSTRUCTIONSSFETCHEDFROMMEMORY
)NSTRUCTIONSDECODINGANDDATAREADFROMMEMORYIFNEEDED
2EGISTERSDATAREADFROMREGISTERBANK 3HIFTAND!,5OPERATION 7RITEBACKREGISTERSDATATO2EGISTERBANK 7RITEBACKDATATOMEMORY

5 Pipelined execution

The STM8 family uses a 3-stage pipeline to increase the speed of the flow of instructions sent to the processor. Pipelined execution allows several operations to be performed simultaneously, rather than serially:
Fetch
Decode and address
Execute
The Program Counter (PC) points always to the instruction in decode stage as shown in
Figure 5.

Figure 5. Pipelined execution principle

5.1 Description of pipelined execution stages

Figure 6 and Section 5.1.1, Section 5.1.2, and Section 5.1.3 provide a detailed description
of each stage of the pipeline execution.
20/162 Doc ID 13590 Rev 3
PM0044 Pipelined execution
-36
&ETCH $ECODE-EM2EAD %XECUTE7RITEBACK
!,5
%XECUTION
2EGISTER
!LIGN
-%-/29
#/.42/,
2!-
0ERIPHERALS
&LASH
INSTRUCTION
MEMORY
BIT
BIT
0REFETCHBUFFER
BIT
!DDRESS
COMPUTATION
/PCODE
)MM
7"ADD
-RD
$ECODE
0#
)4#
7RITE"ACK

Figure 6. Pipelined execution stages

5.1.1 Fetch stage

The first pipeline stage includes a 64-bit fetch buffer and a 32-bit prefetch buffer, totalling 3 words named F to be available for decoding immediately after F
The instruction access from Flash Program memory is 32-bit wide and it is performed from an aligned address i.e. 0xXXX0, 0xXXX4, 0xXXX8, or 0xXXXC.
Unlike the decode and execute stages that are performed at every cycle, the fetch stage accesses the program memory only when needed, and stops memory access when the buffer is full. This allows reducing the core power consumption,
Reading program from RAM is similar to reading program from ROM. However, since the RAM data bus is 8-bit wide, 4 consecutive read operations have to be performed to load one
word, thus resulting in RAM execution being slower than Flash execution.
F
X

5.1.2 Decoding and addressing stage

The decoding stage includes an instruction alignment unit. The alignment unit uses the 64­bit input from the fetch unit and feeds an instruction (from 1 to 5 bytes depending on the instruction) to the decoding unit.
The instruction code consists of 2 parts (see examples in Table 2 ):
The op-code itself (1 or 2 bytes)
and a data/address part (0 to 3 bytes).
, F2 and F3. This buffer structure allows any instruction code (up to 5 bytes)
1
(and F2 when needed) is/are loaded.
1
Doc ID 13590 Rev 3 21/162
Pipelined execution PM0044
The op-code is decoded in this stage. When present, the instruction address is used for address computation, whilst the immediate operand is forwarded to the execution stage.
Table 2. Data/address decoding examples
Instruction Syntax Op-code Data/address
Register to register move
Register load LD A,($12,SP) 0x7B 0x12
Register store LD ($12,SP),A 0x6B 0x12
Data load / store with extended address
Long/unaligned instructions
For long instructions (i.e. 5-bytes instructions), the fetch may need 2 program memory accesses to be completed. In this case, the decoding stage (after decoding the op-code part), is stalled waiting for the fetch stage to complete the 2nd fetch.
In case of shorter instructions, this may also happen when they cross a 32-bit boundary.
Indirect addressing
For indirect addressing, the CPU is stalled in this stage to read the pointer from the data memory (i.e. RAM). The number of cycles during which the CPU is stalled depends on the pointer size (short, long or extended addressing mode).

5.1.3 Execution stage

In the execution stage, the operation is executed and the result is stored in the accumulator, index register or RAM.
LD A, XH 0x95 -
LDF A,($123456,Y) 0x90 AF 0x12 34 56

5.2 Data memory conflicts

3 types of operations perform accesses to the data memory:
Effective address computation in case of indirect addressing
Data read: source operand
Data write: destination for store or read-modify-write operations
In case of simultaneous accesses to the same memory area both in execution stage (write) and decoding stage (read), the decode stage is stalled till the execution stage releases the resource.
22/162 Doc ID 13590 Rev 3
PM0044 Pipelined execution
C
y
DecCy ExeCy 1+=

5.3 Pipelined execution examples

A few pipelined execution examples are reported below. The numbers of cycles for the decoding and execution stages correspond to the minimum number of cycles needed by the instruction itself. In some cases, depending on the instruction sequence, the cycle taken could be more than that number.

5.4 Conventions

Although the decode and/or execute stage of some instructions may take a different number of cycles, a simplified convention providing a good match with reality, has been used in this section:
The decode stage of each instruction takes one cycle only
The execution stage takes a number of cycles equal to
Where
C
is the number of execution cycles. In case of decode and execute cycles, It
y
corresponds to the minimum number of cycles needed by the instruction itself, and does not take into account the impact of the instruction sequence.
DecCy is the exact number of decode cycles. ExeCy is the exact number of execute cycles.
The decode stage of the next instruction starts during the last execution cycle. In instructions performing pipeline flush, the convention is that, in case the branch is taken, the next fetch are performed during the last instruction execution cycle.
The exact number of cycles (see Tab l e 3 ) and the number of cycles obtained using this convention (see Tab l e 4 ) are identical.

Table 3. Example with exact number of cycles

Address Instruction
0xC000 LDW X, [$50.w] 4 1 3
0xC003 ADDW X, #20 2 2 3
0xC006 LD A, [$30].w 3 1 3
0xC009 ….
Decode
cycles
Execute
cycles
lgth
F
Time (cycle)
1 2 3 4 5 6 7 8 9 10 11 12 13 14
D D D D E
1
D D D D D E E
F
2
F
3
D D D D D D E
Doc ID 13590 Rev 3 23/162
Pipelined execution PM0044

Table 4. Example with conventional number of cycles

Address Instruction
0xC000 LDW X, [$50.w] 4 3 3
0xC003 ADDW X, #20 3 3 3
0xC006 LD A, [$30].w 3 3 3
Decode
cycles
Execute
cycles
lgth
1 234567 8 91011121314
Time (cycle)
D E E E E
F
1
D D D D E E E
F
2
F
3
D D D D
E E E
0xC009 ….

Table 5. Legend

Symbol/Color Definition
FFetch
D Decode stalled
DDecode
EExecute

5.4.1 Optimized pipeline example – execution from Flash Program memory

In the example shown in Tab l e 6 , the code is stored in the Flash Program memory (32-bit bus). As a result, 3 cycles are needed to fill the 96-bit prefetch buffer. At each cycle, one word is loaded and stored in F the instructions contained in one of the F instruction contained in F
, F2 and F3. The next fetch operation can start only when all
1
(SWAP A) is decoded, and a fetch operation can start to fill F
3
word are decoded. In fact, at cycle 9, the last
x
word.
3
24/162 Doc ID 13590 Rev 3
PM0044 Pipelined execution
Table 6. Optimized pipeline example - execution from Flash
Add. Instruction
0xC000 NEG A 1 1 1
0xC001 XOR A, $10 1 1 2
0xC003 LD A, #20 1 1 2
0xC005 SUB A,$1000 1 1 3
Decod.
cycles
Exec.
cycles
lgth
1234567891011121314
D E
F
1
D E
D E
F
2
0xC008 INC A 1 1 1
0xC009 LD XL, A 1 1 1
F
0xC00A SRL A 1 1 1
3
0xC00B SWAP A 1 1 1
0xC00C SLA $15 1 1 2
F
0xC00E CP A,#$FE 1 1 2
1
0xC010 MOV $100, #11 1 1 4
0xC014 MOV $101, #22 1 1 4
Table 7. Legend
D E
F
2
D E
Cycle
D E
D E
D E
F
3
D E
D E
D E
D E
Symbol/Color Definition
FFetch
DDecode
EExecute
Doc ID 13590 Rev 3 25/162
Pipelined execution PM0044

5.4.2 Optimize pipeline example – execution from RAM

In the example shown in Tab l e 8 , the RAM is accessed through an 8-bit bus. As a result, 12 cycles are required to fill the 96-bit pre-fetch buffer. Every 4 cycles, one word is loaded and stored in F filled. This occurs for example till the 4 decoded only at the 5
In case of read/write access to the RAM, the fetch is stalled. This occurs during the 6 since RAM address 10 is read during the decode stage of XOR A, $10.
Table 8. Optimize pipeline example – execution from RAM
. The decoding of the first word instruction can start only when the Fx word is
x
th
cycle.
th
cycle, and the first instruction (NEG A) can be
Cycle
th
cycle
Add.
Instruction
0xC000 NEG A 1 1 1
0xC001
0xC003 LD A, #20 1 1 2
0xC005
0xC008 INC A 1 1 1
0xC009 LD XL, A 1 1 1
0xC00A SRL A 1 1 1
0xC00B SWAP A 1 1 1
0xC00C SLA $15 1 1 2
0xC00E
XOR A,
$10
SUB
A,$1000
CP
A,#$FE
Table 9. Legend
Decode cycles
Execute cycles
112
113
112
lgth
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
D D D D E
1_1
F
1_4F2_1
F
D E
D D D D E
FS
2_2F2_3F2_4
F
D E
D D D D E
3_1
F
FS
3_2
F
F
3_3
F
3_4
D E
1_1F1_2
F
D E
1_3F1_4
F
D E
D E
D E
1_2F1_3
F
Symbol/Color Definition
FFetch
FS Fetch stalled
DDecode
D Decode stalled
EExecute
26/162 Doc ID 13590 Rev 3
PM0044 Pipelined execution

5.4.3 Pipeline with Call/Jump

In the example shown in Tab l e 1 0, a branch is taken after the JP/CALL instruction, and the fetched instruction(s) are lost (flush). New instructions must be fetched. 3 fetch sequences are required to refill the pre-fetch buffer. The fetch start depends on the instruction being executed.
For a JP instruction, the fetch can start during the first cycle of the "dummy" execution.
For the CALL instruction, it starts after the last cycle of the CALL execution.
Table 10. Example of pipeline with Call/Jump
Cycle
D E
D E E
F
2
Add. Instruction
Decode
cycles
Execute
cycles
lgth
0xC000 INC A 1 1 1
0xC001 JP label 1 1 3
0xC004 LDW X,[$5432.w] X X 4
0xD010 label: NEG A 1 1 1
0xD011 CALL label2 1 2 3
0xD014 LDW X,[$5432.w] X X 4
1 23456 7 8 91011
D E
F
1
D E
F
2
Flush
F
1
0xD018 LDW X,[$7895.w] X X 4 F3FS
0xE030 label2: INCW X 1 1 1
Table 11. Legend
Symbol/Color Definition
FFetch
FS Fetch stalled
DDecode
EExecute
Flush
F1D E

5.4.4 Pipeline stalled

The decode stage can be stalled when the execution lasts more than one cycle.
The flush is due to the branch. Fetching the branch address is performed during the second execution cycle of the BTJF instruction.
The Decode operation can also be stalled when the memory target is modified during the previous instruction. In the example given in Ta bl e 1 2, the INCW Y instruction writes the X
Doc ID 13590 Rev 3 27/162
Pipelined execution PM0044
register during the first execution cycle. As a result, in this cycle, the next instruction (LD A,(X)) cannot be decoded since it reads the X register.
Table 12. Example of stalled pipeline
Time (cycles)
D E E
D D E
Address Instruction
Decode
cycles
Execute
cycles
0xC000 SUB SP, #20 1 1 2
0xC002 LD A, #20 1 1 2
0xC004 BTJT 0x10, #5, to 1 2 5
0xC009 INC A 1 1 1
lgth
1 2347 8 91011121314
D E
F
1
D E
F
2
F
3
0xC00A BTJF 0x20, #3, to 1 2 5
F
0xC00F NOP X X 1
0xC010 LDW X,[$5432.w] X X 4 F
1
2
0xC014 LDW X,[$1234.w] X X 4 F
0xD020 to: INCW Y 1 1 2
0xD023 LD A,(X) 1 1 2
Table 13. Legend
Symbol/Color Definition
FFetch
D Decode stalled
D E E
3
Flush
F
D E
1
D D E
DDecode
EExecute
28/162 Doc ID 13590 Rev 3
PM0044 Pipelined execution

5.4.5 Pipeline with 1 wait state

In the example given in Ta bl e 1 4 , performing the fetch takes 2 cycles, and there is no overlap between the 2 fetch cycles.
If the instruction is decoded/executed during the last 2 fetch cycles, then the wait state is transparent compared to the no-wait state execution.
Table 14. Pipeline with 1 wait state
Address Instruction
Decode
cycles
Execute
cycles
0xC000 NEG A 1 1 1
0xC001 DEC ($10, X) 1 1 3
0xC004 LDW X, #20 1 1 3
0xC007 LD (X), A 1 1 1
0xC008 INC A 1 1 1
0xC009 NEG ($5A, Y) 1 1 1
Table 15. Legend
Symbol/Color Definition
FFetch
D Decode stalled
DDecode
MS Memory stalled
EExecute
lgth
Time (cycle)
12345678910
MS
D E
F
1
MS
D E
F
2
D E E
D D E
MS
F
3
D E
D E
Doc ID 13590 Rev 3 29/162
STM8 addressing modes PM0044

6 STM8 addressing modes

The STM8 core features 18 different addressing modes which can be classified in 8 main groups:

Table 16. STM8 core addressing modes

Addressing mode groups Example
Inherent NOP
Immediate LD A,#$55
Direct LD A,$55
Indexed LD A,($55,X)
SP Indexed LD A,($55,SP)
Indirect LD A,([$55],X)
Relative JRNE loop
Bit operation BSET byte,#5
The STM8 Instruction set is designed to minimize the number of required bytes per instruction. To do so, most of the addressing modes can be split in three sub-modes called extended, long and short:
The extended addressing mode ("e") can reach any byte in the 16-Mbyte addressing
space, but the instruction size is bigger than the short and long addressing mode. Moreover, the number of instructions with this addressing mode (far) is limited (CALLF, RETF, JPF and LDF)
The long addressing mode ("w") is the most powerful for program management, when
the program is executed in the same section (same PCE value). The long addressing mode is optimized for data management in the first 64-Kbyte addressing space (from 0x000000 to 0x00FFFF) with a complete set of instructions, but the instruction size is bigger than the short addressing mode.
The short addressing mode ("b") is less powerful because it can only access the page
zero (from 0x000000 to 0x0000FF), but the instruction size is more compact.

Table 17. STM8 addressing mode overview

Inherent NOP
Immediate LD A,#$55
Short Direct LD A,$10 000000..0000FF
Long Direct LD A,$1000 000000..00FFFF
Extended Direct LDF A,$100000 000000..FFFFFF
Mode Syntax
Destination
address
Pointer
address
size
Pointer
No Offset Direct Indexed LD A,(X) 000000..00FFFF
Short Direct Indexed LD A,($10,X) 000000..0100FE
30/162 Doc ID 13590 Rev 3
Loading...
+ 132 hidden pages