STMicroelectronics PM0044 Programming manual

STM8 CPU programming manual

Introduction

The STM8 family of HCMOS microcontrollers is designed and built around an enhanced industry standard 8-bit core and a library of peripheral blocks, which include ROM, Flash, RAM, EEPROM, I/O, Serial Interfaces (SPI, USART, I2C,...), 16-bit Timers, A/D converters, comparators, power supervisors etc. These blocks may be assembled in various combinations in order to provide cost-effective solutions for application-specific products.

The STM8 family forms a part of the STMicroelectronics 8-bit MCU product line, which finds its place in a wide variety of applications such as automotive systems, remote controls, video monitors, car radio and numerous other consumer, industrial, telecom, and multimedia products.

September 2011

Doc ID 13590 Rev 3

www.st.com

Contents

1 STM8 architecture
    1.1 STM8 development support
    1.2 Enhanced STM8 features
2 Glossary
3 STM8 core description
    3.1 Introduction
    3.2 CPU registers
4 STM8 memory interface
    4.1 Program space
    4.2 Data space
    4.3 Memory interface architecture
5 Pipelined execution
    5.1 Description of pipelined execution stages
        5.1.1 Fetch stage
        5.1.2 Decoding and addressing stage
        5.1.3 Execution stage
    5.2 Data memory conflicts
    5.3 Pipelined execution examples
    5.4 Conventions
        5.4.1 Optimized pipeline example – execution from Flash Program memory
        5.4.2 Optimize pipeline example – execution from RAM
        5.4.3 Pipeline with Call/Jump
        5.4.4 Pipeline stalled
        5.4.5 Pipeline with 1 wait state
6 STM8 addressing modes
    6.1 Inherent addressing mode
    6.2 Immediate addressing mode
    6.3 Direct addressing mode (Short, Long, Extended)
        6.3.1 Short Direct addressing mode
        6.3.2 Long Direct addressing mode
        6.3.3 Extended Direct addressing mode (only for CALLF and JPF)
    6.4 Indexed addressing mode (No Offset, Short, SP, Long, Extended)
        6.4.1 No Offset Indexed addressing mode
        6.4.2 Short Indexed addressing mode
        6.4.3 SP Indexed addressing mode
        6.4.4 Long Indexed addressing mode
        6.4.5 Extended Indexed (only LDF instruction)
    6.5 Indirect (Short Pointer Long, Long Pointer Long)
    6.6 Short Pointer Indirect Long addressing mode
    6.7 Long Pointer Indirect Long addressing mode
    6.8 Indirect Indexed (Short Pointer Long, Long Pointer Long, Long Pointer Extended) addressing mode
    6.9 Short Pointer Indirect Long Indexed addressing mode
    6.10 Long Pointer Indirect Long Indexed addressing mode
    6.11 Long Pointer Indirect Extended Indexed addressing mode
    6.12 Relative Direct addressing mode
    6.13 Bit Direct (Long) addressing mode
    6.14 Bit Direct (Long) Relative addressing mode
7 STM8 instruction set
    7.1 Introduction
    7.2 Nomenclature
        7.2.1 Operators
        7.2.2 CPU registers
        7.2.3 Code condition bit value notation
        7.2.4 Memory and addressing
        7.2.5 Operation code notation
    7.3 Instruction set summary
    7.4 Instruction set
        ADC
        ADD
        ADDW
        AND
        BCCM
        BCP
        BCPL
        BREAK
        BRES
        BSET
        BTJF
        BTJT
        CALL
        CALLF
        CALLR
        CCF
        CLR
        CLRW
        CP
        CPW
        CPL
        CPLW
        DEC
        DECW
        DIV
        DIVW
        EXG
        EXGW
        HALT
        INC
        INCW
        INT
        IRET
        JP
        JPF
        JRA
        JRxx
        LD
        LDF
        LDW
        MOV
        MUL
        NEG
        NEGW
        NOP
        OR
        POP
        POPW
        PUSH
        PUSHW
        RCF
        RET
        RETF
        RIM
        RLC
        RLCW
        RLWA
        RRC
        RRCW
        RRWA
        RVF
        SBC
        SCF
        SIM
        SLL/SLA
        SLLW/SLAW
        SRA
        SRAW
        SRL
        SRLW
        SUB
        SUBW
        SWAP
        SWAPW
        TNZ
        TNZW
        TRAP
        WFE
        WFI
        XOR
8 Revision history

List of tables

Table 1. Interruptability levels
Table 2. Data/address decoding examples
Table 3. Example with exact number of cycles
Table 4. Example with conventional number of cycles
Table 5. Legend
Table 6. Optimized pipeline example - execution from Flash
Table 7. Legend
Table 8. Optimize pipeline example – execution from RAM
Table 9. Legend
Table 10. Example of pipeline with Call/Jump
Table 11. Legend
Table 12. Example of stalled pipeline
Table 13. Legend
Table 14. Pipeline with 1 wait state
Table 15. Legend
Table 16. STM8 core addressing modes
Table 17. STM8 addressing mode overview
Table 18. Inherent addressing instructions
Table 19. Immediate addressing instructions
Table 20. Overview of Direct addressing mode instructions
Table 21. Available Long and Short Direct addressing mode instructions
Table 22. Available Extended Direct addressing mode instructions
Table 23. Available Long Direct addressing mode instructions
Table 24. Overview Indexed addressing mode instructions
Table 25. No Offset, Long, Short and SP Indexed instructions
Table 26. No Offset, Long, Short Indexed Instructions
Table 27. Extended Indexed Instructions only
Table 28. Overview of Indirect addressing instructions
Table 29. Available Long Pointer Long and Short Pointer Long Indirect Instructions
Table 30. Available Long Pointer Long Indirect Instructions
Table 31. Overview of Indirect indexed instructions
Table 32. Available Long Pointer Long and Short Pointer Long Indirect Indexed instructions
Table 33. Available Long Pointer Long Indirect Indexed instructions
Table 34. Long Pointer Extended Indirect Indexed instructions instruction
Table 35. Overview of Relative Direct addressing mode instructions
Table 36. Available Relative Direct instructions
Table 37. Overview of Bit Direct addressing mode instruction
Table 38. Available Bit Direct instructions
Table 39. Overview of Bit Direct (Long) Relative addressing mode
Table 40. Available Bit Direct Relative instructions
Table 41. Instruction groups
Table 42. Instruction set summary
Table 43. Document revision history

List of figures

Figure 1. Programming model
Figure 2. Context save/restore for interrupts
Figure 3. Address spaces
Figure 4. Memory Interface Architecture
Figure 5. Pipelined execution principle
Figure 6. Pipelined execution stages
Figure 7. Immediate addressing mode example
Figure 8. Short Direct addressing mode example
Figure 9. Long Direct addressing mode example
Figure 10. Far Direct addressing mode example
Figure 11. No Offset Indexed addressing mode example
Figure 12. Short Indexed - 8-bit offset - addressing mode example
Figure 13. SP Indexed - 8-bit offset - addressing mode example
Figure 14. Long Indexed - 16-bit offset - addressing mode example
Figure 15. Far Indexed - 16-bit offset - addressing mode example
Figure 16. Short Pointer Indirect Long addressing mode example
Figure 17. Long Pointer Indirect Long addressing mode example
Figure 18. Short Pointer Indirect Long Indexed addressing mode example
Figure 19. Long Pointer Indirect Long Indexed addressing mode example
Figure 20. Long Pointer Indirect Extended Indexed addressing mode example
Figure 21. Relative Direct addressing mode example
Figure 22. Bit Long Direct addressing mode example
Figure 23. Bit Long Direct Relative addressing mode example

1 STM8 architecture

The 8-bit STM8 Core is designed for high code efficiency. It provides 6 internal registers, 20 addressing modes and 80 instructions. The 6 internal registers are two 16-bit Index registers, an 8-bit Accumulator, a 24-bit Program Counter, a 16-bit Stack Pointer and an 8-bit Condition Code register. The two Index registers, X and Y, enable Indexed Addressing modes with or without offset, along with read-modify-write type data manipulation. These registers simplify branching routines and data/array modifications.

The 24-bit Program Counter can address up to 16 Mbytes of RAM, ROM or Flash memory. The 16-bit Stack Pointer provides access to a 64K-level Stack. The Core also includes a Condition Code register providing 7 Condition flags that indicate the result of the last instruction executed.
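The address ranges quoted above follow directly from the register widths. The short Python sketch below only reproduces that arithmetic; the helper name `addressable_units` is illustrative and does not come from this manual.

```python
def addressable_units(width_bits: int) -> int:
    """Number of distinct values a register of the given width can hold."""
    return 1 << width_bits

PC_BITS = 24  # Program Counter width stated in the text
SP_BITS = 16  # Stack Pointer width stated in the text

pc_range = addressable_units(PC_BITS)  # bytes reachable through the PC
sp_range = addressable_units(SP_BITS)  # stack levels reachable through the SP

# 2**24 bytes is 16 Mbytes; 2**16 levels is 64K
print(pc_range, pc_range // (1024 * 1024))
print(sp_range, sp_range // 1024)
```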

The 20 Addressing modes, including Indirect Relative and Indexed addressing, allow sophisticated branching routines or CASE-type functions. The Indexed Indirect Addressing mode, for instance, permits look-up tables to be located anywhere in the address space, thus enabling very flexible programming and compact C-based code. The stack pointer relative addressing mode permits an optimized C compiler stack model for local variables and parameter passing.

The Instruction Set is 8-bit oriented with a 2-byte average instruction size. This Instruction Set offers, in addition to standard data movement and logic/arithmetic functions, 8-bit by 8-bit multiplication, 16-bit by 8-bit and 16-bit by 16-bit division, bit manipulation, data transfer between Stack and Accumulator (Push / Pop) with direct stack access, as well as data transfer using the X and Y registers or direct memory-to-memory transfers.
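As a rough illustration of the operand sizes mentioned above, the following Python sketch checks how wide the results of such operations can become. It is plain integer arithmetic under the stated widths; it does not model which registers hold the results, the condition flags, or division by zero.

```python
MAX_U8 = 0xFF     # largest unsigned 8-bit value
MAX_U16 = 0xFFFF  # largest unsigned 16-bit value

# 8-bit by 8-bit multiplication: the largest product still fits in 16 bits.
largest_product = MAX_U8 * MAX_U8
product_fits_16_bits = largest_product <= MAX_U16

# 16-bit by 8-bit division: the quotient can exceed 8 bits,
# while the remainder is always smaller than the 8-bit divisor.
quotient, remainder = divmod(MAX_U16, 0x02)

# 16-bit by 16-bit division: quotient and remainder both fit in 16 bits.
quotient16, remainder16 = divmod(MAX_U16, 0x0100)

print(largest_product, product_fits_16_bits)
print(quotient, remainder, quotient16, remainder16)
```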

There can be up to 32 interrupt vectors, and the interrupt priority level may be managed by software, providing hardware-controlled nesting capability. Some peripherals include Direct Memory Access (DMA) between serial interfaces and memory. Support for slow memories allows easy external code execution through a serial or parallel interface (ROMLESS products for instance).

The STM8 has a highly energy-efficient architecture, based on a Harvard architecture and pipelined execution. A 32-bit wide program memory bus allows most instructions to be fetched in 1 CPU cycle. Moreover, as the average instruction length is 2 bytes, the program memory needs to be accessed only half of the time on average, which reduces power consumption. Pipelined execution minimizes execution time, ensuring high system performance when needed, while the power-saving operating modes make it possible to reduce overall energy consumption. Power saving can be managed under program control by placing the device in SLOW, WAIT, SLOW-WAIT, ACTIVE-HALT or HALT mode (see the product datasheet for more details).
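The "half of the time" figure can be reproduced with a back-of-the-envelope calculation. The Python sketch below assumes an idealized case in which every fetch returns a full 32-bit word and one average-length instruction is consumed per cycle; real code with branches, wait states or longer instructions will deviate from it.

```python
BUS_WIDTH_BITS = 32        # program memory bus width stated in the text
AVG_INSTRUCTION_BYTES = 2  # average instruction length stated in the text

bytes_per_fetch = BUS_WIDTH_BITS // 8
instructions_per_fetch = bytes_per_fetch / AVG_INSTRUCTION_BYTES

# Idealized assumption: one instruction consumed per CPU cycle, so a new
# fetch is needed only once every `instructions_per_fetch` cycles.
fetch_fraction = 1 / instructions_per_fetch

print(bytes_per_fetch, instructions_per_fetch, fetch_fraction)
```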

Additional blocks

The additional blocks take the form of integrated hardware peripherals arranged around the central processor core. The following (non-exhaustive) list details the features of some of the currently available blocks:

Boot ROM

Memory area containing the bootloader code

Flash

Flash-based devices

RAM

Sizes up to several Kbytes

Data EEPROM

Sizes up to several Kbytes. Erase/programming operations do not require additional external power sources.

Timers

Different versions based on 8/16-bit free running or autoreload timer/counter are available. They can be coupled with either input captures, output compares or PWM facilities. PWM functions can have software programmable duty cycle from 0% to 100% in up to 256/65536 steps. The outputs can be filtered to provide D/A conversion.

A/D converter

The Analog to Digital Converter uses a sample and hold technique. It has 12-bit resolution.

I2C

Multimaster, single master, single slave modes, DMA or 1-byte transfer, standard and fast I2C modes, 7 and 10-bit addressing.

SPI

The Serial Peripheral Interface is a fully synchronous 3/4 wire interface ideal for Master and Slave applications such as driving devices with input shift register (LCD driver, external memory,...).

USART

The USART is a fast synchronous/asynchronous interface which features duplex transmission, NRZ format, programmable baud rates and standard error detection. The USART can also emulate RS232 protocol.

Watchdog

It has the ability to induce a full reset of the MCU if its counter counts down to zero prior to being reset by the software. This feature is especially useful in noisy applications.

I/O ports

They are programmable by software to act in several input or output configurations on an individual line basis, including high current and interrupt generation. The basic block has eight bit lines.
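To give a feel for the PWM step counts quoted for the Timers block, the Python sketch below computes the duty-cycle resolution for 256 and 65536 steps and the average level of an ideally filtered output. The linear model, the function names and the 5.0 V supply are assumptions made for illustration; the actual timer behavior is described in the product documentation.

```python
def duty_cycle_percent(compare: int, steps: int) -> float:
    """Duty cycle in percent for a compare value on a counter with `steps` steps."""
    if not 0 <= compare <= steps:
        raise ValueError("compare must lie between 0 and steps")
    return 100.0 * compare / steps

def filtered_output_volts(compare: int, steps: int, supply_volts: float) -> float:
    """Average level of an ideally low-pass filtered PWM output (assumed model)."""
    return supply_volts * compare / steps

resolution_8bit = duty_cycle_percent(1, 256)     # smallest step with 256 steps
resolution_16bit = duty_cycle_percent(1, 65536)  # smallest step with 65536 steps
half_scale = filtered_output_volts(128, 256, 5.0)  # hypothetical 5.0 V supply

print(resolution_8bit, resolution_16bit, half_scale)
```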

1.1 STM8 development support

The STM8 family of MCUs is supported by a comprehensive range of development tools. This range presently comprises hardware tools (emulators, programmers), a software package (assembler-linker, debugger, archiver) and a C-compiler development tool.

STM8 and ST7 CPUs are supported by a single toolchain allowing easy reuse and portability of the applications between product lines.

1.2 Enhanced STM8 features

16-Mbyte linear program memory space with 3 FAR instructions (CALLF, RETF, JPF)

16-Mbyte linear data memory space with 1 FAR instruction (LDF)

Up to 32 24-bit interrupt vectors with optimized context save management

16-bit Stack Pointer (SP=SH:S) with stack manipulation instructions and addressing modes

New register and memory access instructions (EXG, MOV)

New arithmetic instructions: DIV 16/8 and DIVW 16/16

New bit handling instructions (CCF, BCPL, BCCM)

2 x 16-bit index registers (X=XH:XL, Y=YH:YL). 8-bit data transfers address the low byte. The high-byte is not affected, with a reset value of 0. This allows the use of X/Y as 8-bit values.

Fast interrupt handling through alternate register files (up to 4 contexts) with standard stack compatible mode (for real time OS kernels)

16-bit/8-bit stack operations (X, Y, A, CC stacking)

16-bit pointer direct update with 16-bit relative offset (ADDW/SUBW for X/Y/SP)

8-bit & 16-bit arithmetic and signed arithmetic support


2 Glossary

mnem

mnemonic

src

source

dst

destination

cy

duration of the instruction in CPU clock cycles (internal clock)

lgth

length of the instruction in byte(s)

op-code

instruction byte(s) implementation (1..4 bytes), operation code.

mem

memory location

imm

immediate value

off

offset

ptr

pointer

pos

position

byte

a byte

word

16-bit value

short

represent a short 8-bit addressing mode

long

represent a long 16-bit addressing mode

EA

Effective Address: The final computed data byte address

Page Zero

all data located at [00..FF] addressing space (single byte address)

(XX)

content of a memory location XX

XX

a byte value

ExtB

Extended byte

MS

Most Significant byte of a 16-bit value (MSB)

LS

Least Significant byte of a 16-bit value (LSB)

A

Accumulator register

X

16-bit X Index register

Y

16-bit Y Index register

reg

A, XL or YL register (1-byte LS part of X/Y), XH or YH (1-byte MS part of X/Y)

ndx

index register, either X or Y

PC

24-bit Program Counter register

SP

16-bit Stack Pointer

S

Stack Pointer LSB

CC

Condition Code register


3 STM8 core description

3.1 Introduction

The CPU has a full 8-bit architecture, with 16-bit operations on index registers (for address computation). Six internal registers allow efficient 8-bit data manipulation. The CPU is able to execute 80 basic instructions. It features 20 addressing modes and can address 6 internal registers and 16 Mbytes of memory/peripheral registers.

3.2 CPU registers

The 6 CPU registers are shown in the programming model in Figure 1. Following an interrupt, the register context is saved. The context is saved by pushing registers onto the stack in the order shown in Figure 2. They are popped from the stack in the reverse order.

Accumulator (A)

The accumulator is an 8-bit general purpose register used to hold operands and the results of the arithmetic and logic calculations as well as data manipulations.

Index registers (X and Y)

These 16-bit registers are used to create effective addresses or as temporary storage areas for data manipulations. In most cases, the cross assembler generates a PRECODE instruction (PRE) to indicate that the following instruction refers to the Y register. Both X and Y are automatically saved on interrupt routine branch.

Program Counter (PC)

The program counter is a 24-bit register used to store the address of the next instruction to be executed by the CPU. It is automatically refreshed after each processed instruction. As a result, the STM8 core can access up to 16-Mbytes of memory.

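Figure 1. Programming model

Register                Bits     Fields
A  (accumulator)        7..0
X  (X index)            15..0    XH (15..8), XL (7..0)
Y  (Y index)            15..0    YH (15..8), YL (7..0)
SP (stack pointer)      15..0
PC (program counter)    23..0    PCE (23..16), PCH (15..8), PCL (7..0)
CC (condition code)     7..0     V, -, I1, H, I0, N, Z, C (bit 7 to bit 0)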


Stack Pointer (SP)

 

The stack pointer is a 16-bit register. It contains the address of the next free location of the stack. Depending on the product, the most significant bits can be forced to a preset value.

The stack is used to save the CPU context on subroutines calls or interrupts. The user can also directly use it through the POP and PUSH instructions.

After an MCU reset the Stack Pointer is set to its upper limit value. It is then decremented after data has been pushed onto the stack and incremented after data is popped from the stack. When the lower limit is exceeded, the stack pointer wraps around to the stack upper limit. The previously stored information is then overwritten, and therefore lost.

A subroutine call occupies two or three locations.

When an interrupt occurs, the CPU registers (CC, X, Y, A, PC) are pushed onto the stack. This operation takes 9 CPU cycles and uses 9 bytes in RAM.

Note: The WFI/HALT instructions save the context in advance. If an interrupt occurs while the CPU is in one of these modes, the latency is reduced.
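The stack behaviour described above can be illustrated with a small Python model. This is only a sketch: the class Stack, the helpers save_context and restore_context, and the limits 0x0300/0x03FF are hypothetical names and values chosen for illustration (the real stack limits are product dependent). The push order is the one shown in Figure 2.

```python
# Push order on interrupt entry, as listed in Figure 2 (first pushed first).
PUSH_ORDER = ["PCL", "PCH", "PCE", "YL", "YH", "XL", "XH", "A", "CC"]


class Stack:
    """Toy model of the stack: SP holds the address of the next free location."""

    def __init__(self, lower=0x0300, upper=0x03FF):
        # lower/upper are illustrative limits, not values from the manual.
        self.lower = lower
        self.upper = upper
        self.sp = upper          # after reset, SP is at its upper limit
        self.mem = {}

    def push(self, value):
        self.mem[self.sp] = value & 0xFF
        # SP is decremented after the write; below the lower limit it
        # wraps to the upper limit, so older data gets overwritten.
        if self.sp == self.lower:
            self.sp = self.upper
        else:
            self.sp -= 1

    def pop(self):
        # SP points to the next free location, so the last pushed byte
        # sits at SP + 1.
        self.sp += 1
        return self.mem[self.sp]


def save_context(stack, regs):
    """Push the 9 context bytes in the order of Figure 2."""
    for name in PUSH_ORDER:
        stack.push(regs[name])


def restore_context(stack):
    """Pop the context in the reverse order and return it as a dict."""
    return {name: stack.pop() for name in reversed(PUSH_ORDER)}
```

Saving a context with this model moves SP down by 9 locations, and restoring it brings SP back to its starting value, which matches the 9 bytes of RAM quoted above.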

Figure 2. Context save/restore for interrupts

Interrupt generation: the execute pipeline completes the instruction in the execute stage (cycle latency), then the context is stacked:

PUSH PCL, PUSH PCH, PUSH PCE, PUSH Y, PUSH X, PUSH A, PUSH CC (CPU cycles)

Jump to the interrupt routine given by the interrupt vector.

Interrupt routine execution.

IRET instruction: the context is unstacked:

POP CC, POP A, POP X, POP Y, POP PCE, POP PCH, POP PCL (CPU cycles)

Jump to the address given by the program counter and reload the pipeline.

Stack content after the interrupt (in push order): PCL, PCH, PCE, YL, YH, XL, XH, A, CC.


Global configuration register (CFG_GCR)

The global configuration register is a memory mapped register. It controls the configuration of the processor. It contains the AL control bit:

AL: Activation level

If the AL bit is 0 (main), the IRET will cause the context to be retrieved from stack and the main program will continue after the WFI instruction.

If the AL bit is 1 (interrupt only active), the IRET will cause the CPU to go back to WFI/HALT mode without restoring the context.

This bit is used to control the low power modes of the MCU. In a very low power application, the MCU spends most of the time in WFI/HALT mode and is woken up (through interrupts) at specific moments in order to execute a specific task. Some of these recurring tasks are short enough to be treated directly in an ISR, rather than going back to the main program. In this case, by programming the AL bit to 1 before going to low power (by executing WFI/HALT instruction), the run time/ISR execution is reduced due to the fact that the register context is not saved/restored each time.

Condition Code register (CC)

The Condition Code register is an 8-bit register which indicates the result of the instruction just executed as well as the state of the processor. These bits can be individually tested by a program and a specified action taken as a result of their state. The following paragraphs describe each bit.

V: Overflow

When set, V indicates that an overflow occurred during the last signed arithmetic operation, on the MSB operation result bit. See INC, INCW, DEC, DECW, NEG, NEGW, ADD, ADC, SUB, SUBW, SBC, CP, CPW instructions.

I1: Interrupt mask level 1

The I1 flag works in conjunction with the I0 flag to define the current interruptability level as shown in the following table. These flags can be set and cleared by software through the RIM, SIM, HALT, WFI, IRET, TRAP and POP instructions and are automatically set by hardware when entering an interrupt service routine.

Table 1. Interruptability levels

Interruptability         Priority   I1   I0
Interruptable Main       Lowest     1    0
Interruptable Level 1               0    1
Interruptable Level 2               0    0
Non Interruptable        Highest    1    1
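As an illustration of Table 1, the following Python sketch decodes the interruptability level from a CC register value. The bit positions (I1 in bit 5, I0 in bit 3) are read from the CC layout in Figure 1; the function name interruptability is a hypothetical helper, not part of any ST tool.

```python
# (I1, I0) -> interruptability level, copied from Table 1.
LEVELS = {
    (1, 0): "Interruptable Main",
    (0, 1): "Interruptable Level 1",
    (0, 0): "Interruptable Level 2",
    (1, 1): "Non Interruptable",
}


def interruptability(cc):
    """Return the Table 1 level for an 8-bit CC value (V - I1 H I0 N Z C)."""
    i1 = (cc >> 5) & 1   # I1 is bit 5 of CC
    i0 = (cc >> 3) & 1   # I0 is bit 3 of CC
    return LEVELS[(i1, i0)]
```

For example, a CC value of 0x28 has both I1 and I0 set and therefore decodes as "Non Interruptable".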

 

 

 

 

 

 

H: Half carry bit

The H bit is set to 1 when a carry occurs between the bits 3 and 4 of the ALU during an ADD or ADC instruction. The H bit is useful in BCD arithmetic subroutines.

For ADDW, SUBW it is set when a carry occurs from bit 7 to 8, allowing to implement byte arithmetic on 16-bit index registers.


I0: Interrupt mask level 0

See Flag I1

N: Negative

When set to 1, this bit indicates that the result of the last arithmetic, logical or data manipulation is negative (i.e. the most significant bit is a logic 1).

Z: Zero

When set to 1, this bit indicates that the result of the last arithmetic, logical or data manipulation is zero.

C: Carry

When set, C indicates that a carry or borrow out of the ALU occurred during the last arithmetic operation on the MSB operation result bit (bit 7 for 8-bit result/destination or bit 15 for 16-bit result). This bit is also affected during bit test, branch, shift, rotate and load instructions. See ADD, ADC, SUB, SBC instructions.

In bit test operations, C is the copy of the tested bit. See BTJF, BTJT instructions.

In shift and rotate operations, the carry is updated. See RRC, RLC, SRL, SLL, SRA instructions.

This bit can be set, reset or complemented by software using SCF, RCF, CCF instructions.

Example: Addition

$B5 + $94 = "C" + $49 = $149

        C   7      0
        0   10110101
    +   0   10010100
    =   1   01001001
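The flag definitions given in this section can be checked against the addition example with a short Python sketch. The function add8 is a hypothetical helper that models only an 8-bit addition using the generic definitions of C, H, V, N and Z; it is not a cycle-accurate model of the ALU.

```python
def add8(a, b, carry_in=0):
    """Add two 8-bit values and return (result, flags).

    Flags follow the descriptions in this section:
    C: carry out of bit 7, H: carry from bit 3 to bit 4,
    V: signed overflow, N: bit 7 of the result, Z: result is zero.
    """
    total = a + b + carry_in
    result = total & 0xFF
    flags = {
        "C": int(total > 0xFF),
        "H": int((a & 0x0F) + (b & 0x0F) + carry_in > 0x0F),
        # Overflow: both operands have the same sign and the result
        # has the opposite sign.
        "V": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "N": result >> 7,
        "Z": int(result == 0),
    }
    return result, flags
```

With the operands of the example, add8(0xB5, 0x94) gives the result $49 with C set, as in the diagram. V is also set in this model, because two negative signed operands produce a positive result.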

 

 

 

 

 

 

 

 

 

 

 

The results of each instruction on the Condition Code register are shown by tables in Section 7: STM8 instruction set. The following table is an example:

V    I1   H    I0   N    Z    C
V    0         0    N    Z    1

where

Nothing   = Flag not affected
Flag name = Flag affected
0         = Flag cleared
1         = Flag set


4 STM8 memory interface

4.1 Program space

The program space is 16-Mbyte and linear. To distinguish the 1, 2 and 3 byte wide addressing modes, naming has been defined as shown in Figure 3:

"Page" [0xXXXX00 to 0xXXXXFF]: 256-byte wide memory space with the same two most significant address bytes (XXXX defines the page number).

"Section" [0xXX0000 to 0xXXFFFF]: 64-Kbyte wide memory space with the same most significant address byte (XX defines the section number).
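The page and section naming can be expressed as simple bit-field extractions on a 24-bit address. The sketch below is illustrative only; section_of, page_of and same_section are hypothetical helper names.

```python
def section_of(addr):
    """Section number: the most significant byte of a 24-bit address."""
    return (addr >> 16) & 0xFF


def page_of(addr):
    """Page number: the two most significant bytes of a 24-bit address."""
    return (addr >> 8) & 0xFFFF


def same_section(a, b):
    """True when both addresses lie in the same 64-Kbyte section."""
    return section_of(a) == section_of(b)
```

For instance, address 0x012345 lies in section 0x01 and page 0x0123, while 0x00FFFF and 0x010000 are adjacent bytes that belong to different sections.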

 

The reset and interrupt vector table is placed at address 0x8000 for the STM8 family. (Note: the base address may be different for later implementations.) The table has 32 4-byte entries: RESET, Trap, NMI and up to 29 normal user interrupts. Each entry consists of the reserved op-code 0x82, followed by a 24-bit value: PCE, PCH, PCL address of the respective Interrupt Service Routine. The main program and ISRs can be mapped anywhere in the 16 Mbyte memory space.
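The layout of a vector table entry can be sketched in Python as follows. The names VECTOR_BASE, vector_address and encode_vector are hypothetical; the base address 0x8000, the entry size of 4 bytes, the count of 32 entries and the op-code 0x82 are taken from the paragraph above.

```python
VECTOR_BASE = 0x8000     # base address quoted for the STM8 family
ENTRY_SIZE = 4           # 0x82 op-code + PCE, PCH, PCL
ENTRY_COUNT = 32


def vector_address(index):
    """Address of table entry `index` (0 is the first entry of the table)."""
    if not 0 <= index < ENTRY_COUNT:
        raise ValueError("vector index out of range")
    return VECTOR_BASE + ENTRY_SIZE * index


def encode_vector(isr_address):
    """Build the 4 bytes of one entry: 0x82 followed by PCE, PCH, PCL."""
    return bytes([
        0x82,
        (isr_address >> 16) & 0xFF,   # PCE
        (isr_address >> 8) & 0xFF,    # PCH
        isr_address & 0xFF,           # PCL
    ])
```

The last entry (index 31) is located at 0x807C, which agrees with the address shown for the highest vector in Figure 3.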

 

CALL/CALLR and RET must be used only in the same section. The effective address for the CALL/RET is used as an offset to the current PCE register value. For the JP, the effective address, 16 or 17 bits long (for indexed addressing), is added to the current PCE value. In order to reach any address in the program space, the JPF jump and CALLF call instructions are provided with a three byte extended addressing mode while the RETF also pops three bytes from the stack.

 

As the memory space is linear, sections can be crossed by the following CPU actions: next instruction byte fetch (PC+1), relative jumps and, in some cases, JP (for indexed addressing mode).

Note: For safe memory usage, a function which crosses sections MUST:

- be called by a CALLF

- include only far instructions for code operation (CALLF & JPF)

All label pointers are located in section 0 (JP [ptr.w] example: ptr.w is located in section 0 and the jump address in current section)

Any illegal op-code read from the program space triggers an MCU reset.

4.2 Data space

The data space is 16-Mbyte and linear. As the stack must be located in section 0 and as data access outside section 0/1 can be managed only with LDF instructions, frequently used data should be located in section 0 to get the optimum code efficiency.

All data pointers are located in section 0 only.

Indexed addressing (with 16-bit index registers and long offset) allows data access over section 0 and 1.

All the peripherals are memory mapped in the data space.


Figure 3. Address spaces

PROGRAM SPACE: linear from 0x000000 to 0xFFFFFF, divided into 64-Kbyte sections (section 0, section 1 ending at 0x01FFFF, up to the last section from 0xFF0000 to 0xFFFFFF). Page 0 covers 0x000000 to 0x0000FF. The vectors are located in section 0, from 0x008000 to 0x00807C. Each vector entry is 0x82 followed by the E, H and L address bytes: RESET, TRAP, NMI, INT0, INT1, ... INT28 (at 0x00807C).

DATA SPACE:

3-byte addressing mode: accessible data

2-byte addressing mode: bit handling capability, powerful data management, stack area, pointers

1-byte addressing mode: bit handling capability, fast data access with short generated code


4.3 Memory interface architecture

The STM8 uses a Harvard architecture, with separate program and data memory buses. However, the logical address space is unified: all memories share the same non-overlapping 16-Mbyte space. The memory interfaces are shown in Figure 4. An interface consists of two buses (address and data), a read/write control signal (R/W) and a memory acknowledge signal (STALL).

The STALL acknowledge signal makes the CPU compatible with slow serial or parallel memory interfaces. When the memory interface is slow, the CPU waits for the memory acknowledge before executing the instruction. In such a case, the instruction CPU cycle time is longer than the value given in this manual.

The program memory bus is 32-bit wide, allowing the fetch of most of the instructions in one cycle.

As the address space is unified, the architecture also allows data to be stored in the Flash memory and program to be fetched from RAM (data bus). In this latter case the performance is impacted: besides the fact that data and fetch operations share the same bus, the instructions are fetched one byte at a time, thus taking longer (1 cycle/byte).

Figure 4. Memory Interface Architecture

Memory Interface (Flash): data bus (fetch) D31..0, address bus A23..0 (24 bits), STALL.

CPU: program counter PCE:PCH:PCL (24 bits) drives the fetch address. Data addresses are 0x00:H:L (17 bits) or E:H:L (24 bits) for the "LDF" instruction. A RAM fetch instruction is routed to the RAM interface.

Memory Interface (RAM): data bus D7..0, address bus A15..0, R/W, STALL.


5 Pipelined execution

The STM8 family uses a 3-stage pipeline to increase the speed of the flow of instructions sent to the processor. Pipelined execution allows several operations to be performed simultaneously, rather than serially:

Fetch

Decode and address

Execute

The Program Counter (PC) always points to the instruction in the decode stage, as shown in Figure 5.

Figure 5. Pipelined execution principle

FETCH: instruction(s) fetched from memory.

DECODE (PC): instruction decoding and data read from memory if needed.

EXECUTE: register(s) data read from register bank, shift and ALU operation, write back of register(s) data to register bank, write back of data to memory.

5.1 Description of pipelined execution stages

Figure 6 and Section 5.1.1, Section 5.1.2, and Section 5.1.3 provide a detailed description of each stage of the pipeline execution.


Figure 6. Pipelined execution stages

Block diagram of the three stages and their resources:

Fetch: Flash instruction memory, RAM, prefetch buffer.

Decode / Mem Read: align, decode (op-code, imm), address computation (PC), memory read.

Execute / Write back: register, ALU, execution, write back (WB address).

Shared blocks: memory control, peripherals, ITC.

5.1.1 Fetch stage

The first pipeline stage includes a 64-bit fetch buffer and a 32-bit prefetch buffer, totalling 3 words named F1, F2 and F3. This buffer structure allows any instruction code (up to 5 bytes) to be available for decoding immediately after F1 (and F2 when needed) is/are loaded.

The instruction access from Flash Program memory is 32-bit wide and it is performed from an aligned address i.e. 0xXXX0, 0xXXX4, 0xXXX8, or 0xXXXC.

Unlike the decode and execute stages that are performed at every cycle, the fetch stage accesses the program memory only when needed, and stops memory access when the buffer is full. This reduces the core power consumption.

Reading program from RAM is similar to reading program from ROM. However, since the RAM data bus is 8-bit wide, 4 consecutive read operations have to be performed to load one Fx word, thus resulting in RAM execution being slower than Flash execution.
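The difference between the two buses can be quantified with a short Python sketch. It assumes one bus access per cycle and no stalls, which is a simplification; fill_cycles is a hypothetical helper name.

```python
WORD_BITS = 32   # size of one Fx word


def fill_cycles(words, bus_bits):
    """Cycles needed to load `words` Fx words over a bus `bus_bits` wide.

    Assumes one access per cycle and no stall (simplified model).
    """
    accesses_per_word = WORD_BITS // bus_bits
    return words * accesses_per_word
```

Under these assumptions, loading the three words F1, F2 and F3 takes 3 cycles over the 32-bit Flash bus and 12 cycles over the 8-bit RAM bus.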

5.1.2 Decoding and addressing stage

The decoding stage includes an instruction alignment unit. The alignment unit uses the 64-bit input from the fetch unit and feeds an instruction (from 1 to 5 bytes depending on the instruction) to the decoding unit.

The instruction code consists of 2 parts (see examples in Table 2):

The op-code itself (1 or 2 bytes)

and a data/address part (0 to 3 bytes).


The op-code is decoded in this stage. When present, the instruction address is used for address computation, whilst the immediate operand is forwarded to the execution stage.

Table 2. Data/address decoding examples

Instruction                               Syntax               Op-code    Data/address
Register to register move                 LD A, XH             0x95       -
Register load                             LD A,($12,SP)        0x7B       0x12
Register store                            LD ($12,SP),A        0x6B       0x12
Data load / store with extended address   LDF A,($123456,Y)    0x90 AF    0x12 34 56

Long/unaligned instructions

For long instructions (i.e. 5-byte instructions), the fetch may need 2 program memory accesses to be completed. In this case, the decoding stage (after decoding the op-code part) is stalled waiting for the fetch stage to complete the 2nd fetch.

In case of shorter instructions, this may also happen when they cross a 32-bit boundary.
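Whether an instruction needs a second program memory access can be estimated from its address and length, since Flash is read in aligned 32-bit words. The following Python sketch lists the aligned word addresses an instruction occupies; fetch_words is a hypothetical helper and the model ignores the state of the prefetch buffer.

```python
def fetch_words(address, length):
    """Aligned 32-bit word addresses covered by an instruction.

    `address` is the address of the first instruction byte and `length`
    its size in bytes (1 to 5). More than one returned address means the
    instruction crosses a 32-bit boundary.
    """
    first = address & ~0x3                    # word holding the first byte
    last = (address + length - 1) & ~0x3      # word holding the last byte
    return list(range(first, last + 4, 4))
```

A 2-byte instruction at 0xC003 spans the words at 0xC000 and 0xC004, while a 4-byte instruction at the aligned address 0xC010 fits in a single word. A 5-byte instruction always spans at least two words.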

Indirect addressing

For indirect addressing, the CPU is stalled in this stage to read the pointer from the data memory (i.e. RAM). The number of cycles during which the CPU is stalled depends on the pointer size (short, long or extended addressing mode).

5.1.3 Execution stage

In the execution stage, the operation is executed and the result is stored in the accumulator, index register or RAM.

5.2 Data memory conflicts

3 types of operations perform accesses to the data memory:

Effective address computation in case of indirect addressing

Data read: source operand

Data write: destination for store or read-modify-write operations

In case of simultaneous accesses to the same memory area both in execution stage (write) and decoding stage (read), the decode stage is stalled till the execution stage releases the resource.


5.3 Pipelined execution examples

A few pipelined execution examples are reported below. The numbers of cycles for the decoding and execution stages correspond to the minimum number of cycles needed by the instruction itself. In some cases, depending on the instruction sequence, the cycle taken could be more than that number.

5.4 Conventions

Although the decode and/or execute stage of some instructions may take a different number of cycles, a simplified convention providing a good match with reality, has been used in this section:

The decode stage of each instruction takes one cycle only

The execution stage takes a number of cycles equal to

Cy = DecCy + ExeCy - 1

Where

Cy is the number of execution cycles. As for the decode and execute cycles, it corresponds to the minimum number of cycles needed by the instruction itself, and does not take into account the impact of the instruction sequence.

DecCy is the exact number of decode cycles.

ExeCy is the exact number of execute cycles.

The decode stage of the next instruction starts during the last execution cycle. In instructions performing a pipeline flush, the convention is that, in case the branch is taken, the next fetch is performed during the last instruction execution cycle.

The exact number of cycles (see Table 3) and the number of cycles obtained using this convention (see Table 4) are identical.
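The convention can be applied to the three instructions of Table 3 with a few lines of Python. The function name conventional_cycles is a hypothetical helper; the decode and execute cycle counts are the exact values listed in Table 3.

```python
def conventional_cycles(dec_cy, exe_cy):
    """Execution cycles under the convention: Cy = DecCy + ExeCy - 1."""
    return dec_cy + exe_cy - 1


# (instruction, exact decode cycles, exact execute cycles) from Table 3
TABLE_3 = [
    ("LDW X, [$50.w]", 4, 1),
    ("ADDW X, #20", 2, 2),
    ("LD A, [$30.w]", 3, 1),
]

CONVENTIONAL = {name: conventional_cycles(d, e) for name, d, e in TABLE_3}
```

The resulting values (4, 3 and 3 cycles) correspond to the number of E symbols shown for each instruction in Table 4.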

Table 3. Example with exact number of cycles

Address   Instruction       Decode cycles   Execute cycles   lgth   Stages over time (cycles 1 to 14)
0xC000    LDW X, [$50.w]    4               1                3      F1 D D D D E
0xC003    ADDW X, #20       2               2                3      F2 D D D D D E E
0xC006    LD A, [$30.w]     3               1                3      F3 D D D D D D E
0xC009    ....


Table 4. Example with conventional number of cycles

Address   Instruction       Decode cycles   Execute cycles   lgth   Stages over time (cycles 1 to 14)
0xC000    LDW X, [$50.w]    4               3                3      F1 D E E E E
0xC003    ADDW X, #20       3               3                3      F2 D D D D E E E
0xC006    LD A, [$30.w]     3               3                3      F3 D D D D E E E
0xC009    ....

Table 5. Legend

Symbol/Color   Definition
F              Fetch
D              Decode stalled
D              Decode
E              Execute

5.4.1 Optimized pipeline example – execution from Flash Program memory

In the example shown in Table 6, the code is stored in the Flash Program memory (32-bit bus). As a result, 3 cycles are needed to fill the 96-bit prefetch buffer. At each cycle, one word is loaded and stored in F1, F2 or F3. The next fetch operation can start only when all the instructions contained in one of the Fx words are decoded. For example, at cycle 9, the last instruction contained in F3 (SWAP A) is decoded, and a fetch operation can start to fill the F3 word.


Table 6. Optimized pipeline example - execution from Flash

Add.      Instruction      Decod. cycles   Exec. cycles   lgth   Stages over time (cycles 1 to 14)
0xC000    NEG A            1               1              1      D E
0xC001    XOR A, $10       1               1              2      F1 D E
0xC003    LD A, #20        1               1              2      F2 D E
0xC005    SUB A,$1000      1               1              3      D E
0xC008    INC A            1               1              1      D E
0xC009    LD XL, A         1               1              1      F3 D E
0xC00A    SRL A            1               1              1      D E
0xC00B    SWAP A           1               1              1      D E
0xC00C    SLA $15          1               1              2      F1 D E
0xC00E    CP A,#$FE        1               1              2      D E
0xC010    MOV $100, #11    1               1              4      F2 D E
0xC014    MOV $101, #22    1               1              4      F3 D E

Table 7. Legend

Symbol/Color   Definition
F              Fetch
D              Decode
E              Execute


5.4.2 Optimized pipeline example – execution from RAM

In the example shown in Table 8, the RAM is accessed through an 8-bit bus. As a result, 12 cycles are required to fill the 96-bit pre-fetch buffer. Every 4 cycles, one word is loaded and stored in Fx. The decoding of the first instruction of a word can start only when the Fx word is filled. For example, the first word is being filled until the 4th cycle, so the first instruction (NEG A) can be decoded only at the 5th cycle.

In case of read/write access to the RAM, the fetch is stalled. This occurs during the 6th cycle, since RAM address $10 is read during the decode stage of XOR A, $10.

Table 8. Optimized pipeline example – execution from RAM

Add.      Instruction     Decode cycles   Execute cycles   lgth   Stages over time (cycles 1 to 21)
0xC000    NEG A           1               1                1      F1_1 D D D D E
0xC001    XOR A, $10      1               1                2      F1_2 F1_3 D E
0xC003    LD A, #20       1               1                2      F1_4 F2_1 D D D D E
0xC005    SUB A,$1000     1               1                3      FS F2_2 F2_3 F2_4 D E
0xC008    INC A           1               1                1      F3_1 D D D D E
0xC009    LD XL, A        1               1                1      FS F3_2 D E
0xC00A    SRL A           1               1                1      F3_3 D E
0xC00B    SWAP A          1               1                1      F3_4 D E
0xC00C    SLA $15         1               1                2      F1_1 F1_2 D E
0xC00E    CP A,#$FE       1               1                2      F1_3 F1_4 D E

Table 9. Legend

Symbol/Color   Definition
F              Fetch
FS             Fetch stalled
D              Decode
D              Decode stalled
E              Execute


 

 

5.4.3 Pipeline with Call/Jump

In the example shown in Table 10, a branch is taken by the JP/CALL instruction, so the instructions that were already fetched are lost (flushed) and new instructions must be fetched. Three fetch sequences are required to refill the pre-fetch buffer. The cycle in which the fetch starts depends on the instruction being executed.

For a JP instruction, the fetch can start during the first cycle of the "dummy" execution.

For the CALL instruction, it starts after the last cycle of the CALL execution.
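The fetch-start rule above can be sketched as a small Python helper. This is an illustrative model only: the function name `target_fetch_cycle` and its parameters are hypothetical, and the cycle numbers in the usage lines are chosen for illustration rather than taken from a device measurement.

```python
def target_fetch_cycle(instruction: str, first_exec_cycle: int, exec_cycles: int) -> int:
    """Return the earliest cycle in which the branch-target fetch can start.

    Hypothetical helper that models only the rule stated in the text:
    - JP:   the fetch can start during the first cycle of the "dummy" execution.
    - CALL: the fetch starts after the last cycle of the CALL execution.
    """
    if exec_cycles < 1:
        raise ValueError("exec_cycles must be at least 1")
    if instruction == "JP":
        return first_exec_cycle
    if instruction == "CALL":
        # The last execution cycle is first_exec_cycle + exec_cycles - 1,
        # so the fetch starts one cycle later.
        return first_exec_cycle + exec_cycles
    raise ValueError(f"unsupported instruction: {instruction}")


# Illustrative values: a JP whose execution starts in cycle 4,
# and a CALL that executes during cycles 7 and 8.
jp_fetch = target_fetch_cycle("JP", first_exec_cycle=4, exec_cycles=1)
call_fetch = target_fetch_cycle("CALL", first_exec_cycle=7, exec_cycles=2)
```

With these assumed inputs, the JP target fetch starts in the same cycle as the JP execution, while the CALL target fetch has to wait until both execution cycles have completed.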

 

 

 

 

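Table 10. Example of pipeline with Call/Jump

| Add. | Instruction | Decode cycles | Execute cycles | lgth | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0xC000 | INC A | 1 | 1 | 1 | F1 | D | E | | | | | | | | |
| 0xC001 | JP label | 1 | 1 | 3 | | | D | E | | | | | | | |
| 0xC004 | LDW X,[$5432.w] | X | X | 4 | | F2 | Flush | | | | | | | | |
| 0xD010 | label: NEG A | 1 | 1 | 1 | | | | F1 | D | E | | | | | |
| 0xD011 | CALL label2 | 1 | 2 | 3 | | | | | | D | E | E | | | |
| 0xD014 | LDW X,[$5432.w] | X | X | 4 | | | | | F2 | | Flush | | | | |
| 0xD018 | LDW X,[$7895.w] | X | X | 4 | | | | | | F3 | FS | | | | |
| 0xE030 | label2: INCW X | 1 | 1 | 1 | | | | | | | | | F1 | D | E |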

 

 

Table 11. Legend

| Symbol/Color | Definition |
|---|---|
| F | Fetch |
| FS | Fetch stalled |
| D | Decode |
| E | Execute |

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

5.4.4 Pipeline stalled

The decode stage can be stalled when the execution of the preceding instruction lasts more than one cycle.

The flush is due to the branch. The branch address is fetched during the second execution cycle of the BTJF instruction.

The decode operation can also be stalled when the memory target is modified during the previous instruction. In the example given in Table 12, the INCW Y instruction writes the X register during the first execution cycle. As a result, in this cycle, the next instruction (LD A,(X)) cannot be decoded since it reads the X register.
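The register dependency described above can be sketched as a simple check. This is a minimal model, not a description of the actual core logic: the function name `decode_must_stall` is hypothetical, the model compares register names only, and it ignores which execution cycle performs the write.

```python
def decode_must_stall(written_by_previous: set, read_by_next: set) -> bool:
    """Return True when the next instruction reads a register that the
    previous instruction is still writing, so its decode has to wait.

    Hypothetical helper: registers are plain strings such as "A", "X", "Y".
    """
    return bool(written_by_previous & read_by_next)


# Case described in the text: the previous instruction writes X,
# and LD A,(X) reads X to form the address.
stall = decode_must_stall({"X"}, {"X"})

# Counter-example under the same model: no shared register, no stall.
no_stall = decode_must_stall({"Y"}, {"X"})
```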

 

 

 

 

 

 

 

 

Table 12. Example of stalled pipeline

| Address | Instruction | Decode cycles | Execute cycles | lgth | Pipeline stages, in order |
|---|---|---|---|---|---|
| 0xC000 | SUB SP, #20 | 1 | 1 | 2 | F1, D, E |
| 0xC002 | LD A, #20 | 1 | 1 | 2 | D, E |
| 0xC004 | BTJT 0x10, #5, to | 1 | 2 | 5 | F2, D, E, E |
| 0xC009 | INC A | 1 | 1 | 1 | F3, D, D, E |
| 0xC00A | BTJF 0x20, #3, to | 1 | 2 | 5 | F1, D, E, E |
| 0xC00F | NOP | X | X | 1 | Flush |
| 0xC010 | LDW X,[$5432.w] | X | X | 4 | F2 |
| 0xC014 | LDW X,[$1234.w] | X | X | 4 | F3 |
| 0xD020 | to: INCW Y | 1 | 1 | 2 | F1, D, E |
| 0xD023 | LD A,(X) | 1 | 1 | 2 | D, D, E |

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 13. Legend

| Symbol/Color | Definition |
|---|---|
| F | Fetch |
| D | Decode stalled |
| D | Decode |
| E | Execute |

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


 

 

5.4.5 Pipeline with 1 wait state

In the example given in Table 14, each fetch takes 2 cycles, and there is no overlap between the 2 fetch cycles.

If the instruction is decoded/executed during the last 2 fetch cycles, then the wait state is transparent compared to the no-wait state execution.
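The fetch cost described above can be sketched as simple arithmetic. This is a rough model under stated assumptions: the helper name `fetch_cycles` is hypothetical, a fetch is assumed to take one cycle when there is no wait state, and fetches are assumed never to overlap, as the text states for the 1 wait state case.

```python
def fetch_cycles(fetch_count: int, wait_states: int) -> int:
    """Total cycles spent on back-to-back, non-overlapping fetches.

    Hypothetical helper. Assumes each fetch takes one cycle plus one
    extra cycle per wait state.
    """
    if fetch_count < 0 or wait_states < 0:
        raise ValueError("arguments must be non-negative")
    return fetch_count * (1 + wait_states)


# Three fetches to refill the pre-fetch buffer:
cycles_no_wait = fetch_cycles(3, wait_states=0)
cycles_one_wait = fetch_cycles(3, wait_states=1)
```

Whether the extra cycles are visible in the total execution time depends on whether decode and execute work overlaps them, which this sketch does not model.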

Table 14. Pipeline with 1 wait state

| Address | Instruction | Decode cycles | Execute cycles | lgth | Pipeline stages, in order |
|---|---|---|---|---|---|
| 0xC000 | NEG A | 1 | 1 | 1 | MS, F1, D, E |
| 0xC001 | DEC ($10, X) | 1 | 1 | 3 | D, E |
| 0xC004 | LDW X, #20 | 1 | 1 | 3 | MS, F2, D, E, E |
| 0xC007 | LD (X), A | 1 | 1 | 1 | D, D, E |
| 0xC008 | INC A | 1 | 1 | 1 | MS, F3, D, E |
| 0xC009 | NEG ($5A, Y) | 1 | 1 | 1 | D, E |

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Table 15. Legend

| Symbol/Color | Definition |
|---|---|
| F | Fetch |
| D | Decode stalled |
| D | Decode |
| MS | Memory stalled |
| E | Execute |

 

 


 

 

6 STM8 addressing modes

The STM8 core features 18 different addressing modes, which can be classified into 8 main groups:

Table 16. STM8 core addressing modes

| Addressing mode groups | Example |
|---|---|
| Inherent | NOP |
| Immediate | LD A,#$55 |
| Direct | LD A,$55 |
| Indexed | LD A,($55,X) |
| SP Indexed | LD A,($55,SP) |
| Indirect | LD A,([$55],X) |
| Relative | JRNE loop |
| Bit operation | BSET byte,#5 |

 

 

The STM8 instruction set is designed to minimize the number of bytes required per instruction. To do so, most of the addressing modes can be split into three sub-modes called extended, long and short:

The extended addressing mode ("e") can reach any byte in the 16-Mbyte addressing space, but the instruction size is bigger than in the short and long addressing modes. Moreover, the number of instructions with this (far) addressing mode is limited (CALLF, RETF, JPF and LDF).

The long addressing mode ("w") is the most powerful for program management when the program is executed in the same section (same PCE value). The long addressing mode is optimized for data management in the first 64-Kbyte addressing space (from 0x000000 to 0x00FFFF) with a complete set of instructions, but the instruction size is bigger than in the short addressing mode.

The short addressing mode ("b") is less powerful because it can only access page zero (from 0x000000 to 0x0000FF), but the instruction size is more compact.
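The address ranges of the three sub-modes can be sketched as a range check. This is an illustrative helper, not an assembler algorithm: the name `smallest_submode` is hypothetical, and it considers the address range only, not whether a given instruction actually offers that sub-mode.

```python
def smallest_submode(address: int) -> str:
    """Return the most compact sub-mode whose range covers the address.

    Hypothetical helper based only on the ranges given in the text:
    short    0x000000..0x0000FF (page zero)
    long     0x000000..0x00FFFF (first 64 Kbytes)
    extended 0x000000..0xFFFFFF (full 16-Mbyte space)
    """
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address is outside the 16-Mbyte addressing space")
    if address <= 0xFF:
        return "short"
    if address <= 0xFFFF:
        return "long"
    return "extended"


mode_page_zero = smallest_submode(0x10)
mode_64k = smallest_submode(0x1000)
mode_far = smallest_submode(0x100000)
```

Because only a few instructions support the extended sub-mode (CALLF, RETF, JPF and LDF), a result of "extended" means the access needs one of those instructions.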

Table 17. STM8 addressing mode overview

| Mode | | | Syntax | Destination address | Pointer address | Pointer size |
|---|---|---|---|---|---|---|
| Inherent | | | NOP | | | |
| Immediate | | | LD A,#$55 | | | |
| Short | Direct | | LD A,$10 | 000000..0000FF | | |
| Long | Direct | | LD A,$1000 | 000000..00FFFF | | |
| Extended | Direct | | LDF A,$100000 | 000000..FFFFFF | | |
| No Offset | Direct | Indexed | LD A,(X) | 000000..00FFFF | | |
| Short | Direct | Indexed | LD A,($10,X) | 000000..0100FE | | |

 

 

 

 

 

 

 

 

 
