Renesas SuperH SH-4A Software Manual

REJ09B0003-0150Z
The revision list can be viewed directly by clicking the title page. The revision list summarizes the locations of revisions and additions. Details should always be checked by referring to the relevant text.
SH-4A
Software Manual
SuperH™ RISC engine Family
Rev.1.50 Revision Date: Oct. 29, 2004
Rev. 1.50, 10/04, page ii of xx

Keep safety first in your circuit designs!

1. Renesas Technology Corp. puts the maximum effort into making semiconductor products better and more reliable, but there is always the possibility that trouble may occur with them. Trouble with semiconductors may lead to personal injury, fire or property damage. Remember to give due consideration to safety when making your circuit designs, with appropriate measures such as (i) placement of substitutive, auxiliary circuits, (ii) use of nonflammable material or (iii) prevention against any malfunction or mishap.

Notes regarding these materials

1. These materials are intended as a reference to assist our customers in the selection of the Renesas Technology Corp. product best suited to the customer's application; they do not convey any license under any intellectual property rights, or any other rights, belonging to Renesas Technology Corp. or a third party.
2. Renesas Technology Corp. assumes no responsibility for any damage, or infringement of any third-party's rights, originating in the use of any product data, diagrams, charts, programs, algorithms, or circuit application examples contained in these materials.
3. All information contained in these materials, including product data, diagrams, charts, programs and algorithms, represents information on products at the time of publication of these materials, and is subject to change by Renesas Technology Corp. without notice due to product improvements or other reasons. It is therefore recommended that customers contact Renesas Technology Corp. or an authorized Renesas Technology Corp. product distributor for the latest product information before purchasing a product listed herein. The information described here may contain technical inaccuracies or typographical errors. Renesas Technology Corp. assumes no responsibility for any damage, liability, or other loss arising from these inaccuracies or errors. Please also pay attention to information published by Renesas Technology Corp. by various means, including the Renesas Technology Corp. Semiconductor home page (http://www.renesas.com).
4. When using any or all of the information contained in these materials, including product data, diagrams, charts, programs, and algorithms, please be sure to evaluate all information as a total system before making a final decision on the applicability of the information and products. Renesas Technology Corp. assumes no responsibility for any damage, liability or other loss resulting from the information contained herein.
5. Renesas Technology Corp. semiconductors are not designed or manufactured for use in a device or system that is used under circumstances in which human life is potentially at stake. Please contact Renesas Technology Corp. or an authorized Renesas Technology Corp. product distributor when considering the use of a product contained herein for any specific purposes, such as apparatus or systems for transportation, vehicular, medical, aerospace, nuclear, or undersea repeater use.
6. The prior written approval of Renesas Technology Corp. is necessary to reprint or reproduce in whole or in part these materials.
7. If these products or technologies are subject to the Japanese export control restrictions, they must be exported under a license from the Japanese government and cannot be imported into a country other than the approved destination. Any diversion or reexport contrary to the export control laws and regulations of Japan and/or the country of destination is prohibited.
8. Please contact Renesas Technology Corp. for further details on these materials or the products contained therein.

General Precautions on Handling of Product

1. Treatment of NC Pins
Note: Do not connect anything to the NC pins.
The NC (not connected) pins are either not connected to any of the internal circuitry or are used as test pins or to reduce noise. If something is connected to the NC pins, the operation of the LSI is not guaranteed.
2. Treatment of Unused Input Pins
Note: Fix all unused input pins to high or low level.
Generally, the input pins of CMOS products are high-impedance input pins. If unused pins are in their open states, intermediate levels are induced by noise in the vicinity, a pass-through current flows internally, and a malfunction may occur.
3. Processing before Initialization
Note: When power is first supplied, the product’s state is undefined.
The states of internal circuits are undefined until full power is supplied throughout the chip and a low level is input on the reset pin. During the period where the states are undefined, the register settings and the output state of each pin are also undefined. Design your system so that it does not malfunction because of processing while it is in this undefined state. For those products which have a reset function, reset the LSI immediately after the power supply has been turned on.
4. Prohibition of Access to Undefined or Reserved Addresses
Note: Access to undefined or reserved addresses is prohibited.
The undefined or reserved addresses may be used to expand functions, or test registers may have been allocated to these addresses. Do not access these registers; the system’s operation is not guaranteed if they are accessed.
5. Reading from/Writing to Reserved Bit of Each Register
Note: Treat the reserved bits of the registers used in each module as follows, except where the description specifies the values that are read from or written to the bit. The bit is always read as 0. The write value should be 0 or the value that was read immediately before writing. Writing back the value read immediately before writing has the advantage that the bit is not affected if an extended function is later assigned to it.
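The write rule for reserved bits amounts to a read-modify-write in which only the documented field changes. The following Python sketch models that rule on plain integers. The function name `merge_field`, the 32-bit default width, and the example mask are assumptions made for illustration; none of them comes from this manual, and a contiguous field mask is assumed.

```python
def merge_field(read_value: int, field_mask: int, field_value: int, width: int = 32) -> int:
    """Return the value to write back so that only the bits in field_mask change.

    Bits outside field_mask (including any reserved bits) keep the value that
    was read immediately before writing, one of the two options the note allows.
    The mask is assumed to be one contiguous run of bits (illustrative choice).
    """
    full = (1 << width) - 1
    if read_value < 0 or field_mask < 0 or (read_value | field_mask) & ~full:
        raise ValueError("value or mask does not fit in the register width")
    if field_mask == 0:
        return read_value  # nothing to change, write back what was read
    # Bit position of the lowest set bit in the mask.
    shift = (field_mask & -field_mask).bit_length() - 1
    shifted = field_value << shift
    if field_value < 0 or shifted & ~field_mask:
        raise ValueError("field_value does not fit in field_mask")
    # Clear the field, keep everything else as read, then set the new field.
    return (read_value & ~field_mask & full) | shifted
```

For example, with a hypothetical register read as `0x135` and a field occupying bits 7 to 4, `merge_field(0x135, 0xF0, 0xA)` gives `0x1A5`: the field changes while every other bit is written back as read.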

Configuration of This Manual

This manual comprises the following items:
1. General Precautions on Handling of Product
2. Configuration of This Manual
3. Preface
4. Contents
5. Overview
6. Description of Functional Modules
CPU and System-Control Modules
On-Chip Peripheral Modules
The configuration of the functional description of each module differs according to the module. However, the generic style includes the following items:
i) Feature
ii) Input/Output Pin
iii) Register Description
iv) Operation
v) Usage Note
When designing an application system that includes this LSI, take these notes into account. Each section includes notes related to the descriptions given, and usage notes are given, as required, at the end of each section.
7. List of Registers
8. Appendix
9. Index

Preface

The SH-4A is a RISC (Reduced Instruction Set Computer) microcomputer which includes a Renesas Technology-original RISC CPU as its core.
Target Users: This manual was written for users who will be using the SH-4A in the design of application systems. Users of this manual are expected to understand the fundamentals of electrical circuits, logical circuits, microcomputers, and assembly and C language programming.
Objective: This manual was written to explain the instructions of the SH-4A. For the hardware functions, refer to the corresponding hardware manual.
Notes on reading this manual:
In order to understand the overall functions of the chip
Read the manual according to the contents. This manual can be roughly categorized into parts on the CPU, system control functions, and instructions.
In order to understand the instructions
The instruction format and basic operation are explained in section 3, Instruction Set. For details on each instruction operation, read section 10, Instruction Descriptions.
Rules:
Register name: The following notation is used for cases when the same or a similar function, e.g. serial communication, is implemented on more than one channel: XXX_N (XXX is the register name and N is the channel number)
Bit order: The MSB is on the left and the LSB is on the right.
Number notation: Binary is B'xxxx, hexadecimal is H'xxxx, decimal is xxxx.
Signal notation: An overbar is added to a low-active signal: x̅x̅x̅x̅
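When values are copied from this manual into scripts or test tooling, the number notation above has to be converted by hand or by a small helper. The Python sketch below shows one way to do it. The names `parse_literal` and `format_hex` are hypothetical, and accepting lowercase hexadecimal digits is an assumption rather than something the manual states.

```python
# Prefix letter -> (numeric base, characters allowed after the apostrophe).
_PREFIXES = {
    "B": (2, "01"),
    "H": (16, "0123456789ABCDEFabcdef"),  # lowercase digits accepted by assumption
}


def parse_literal(text: str) -> int:
    """Convert B'xxxx (binary), H'xxxx (hexadecimal) or xxxx (decimal) to an int."""
    token = text.strip()
    base, allowed = 10, "0123456789"
    if len(token) > 2 and token[1] == "'" and token[0] in _PREFIXES:
        base, allowed = _PREFIXES[token[0]]
        token = token[2:]
    if not token or any(ch not in allowed for ch in token):
        raise ValueError(f"not a valid literal: {text!r}")
    return int(token, base)


def format_hex(value: int, digits: int = 8) -> str:
    """Format a non-negative int in the manual's H'xxxx style, zero-padded."""
    if value < 0:
        raise ValueError("only non-negative values are supported")
    return "H'" + format(value, f"0{digits}X")
```

With these helpers, `parse_literal("H'FF")` and `parse_literal("B'11111111")` both return 255, and `format_hex(255, 4)` returns `H'00FF`.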
Related Manuals: The latest versions of all related manuals are available from our web site. Please ensure you have the latest versions of all documents you require. http://www.renesas.com/
Abbreviations
ALU Arithmetic Logic Unit
ASID Address Space Identifier
CPU Central Processing Unit
FPU Floating Point Unit
LRU Least Recently Used
LSB Least Significant Bit
MMU Memory Management Unit
MSB Most Significant Bit
PC Program Counter
RISC Reduced Instruction Set Computer
TLB Translation Lookaside Buffer

Contents

Section 1 Overview
1.1 Features
1.2 Changes from SH-4 to SH-4A
Section 2 Programming Model
2.1 Data Formats
2.2 Register Descriptions
2.2.1 Privileged Mode and Banks
2.2.2 General Registers
2.2.3 Floating-Point Registers
2.2.4 Control Registers
2.2.5 System Registers
2.3 Memory-Mapped Registers
2.4 Data Formats in Registers
2.5 Data Formats in Memory
2.6 Processing States
2.7 Usage Notes
2.7.1 Notes on Self-Modified Codes
Section 3 Instruction Set
3.1 Execution Environment
3.2 Addressing Modes
3.3 Instruction Set
Section 4 Pipelining
4.1 Pipelines
4.2 Parallel-Executability
4.3 Issue Rates and Execution Cycles
Section 5 Exception Handling
5.1 Summary of Exception Handling
5.2 Register Descriptions
5.2.1 TRAPA Exception Register (TRA)
5.2.2 Exception Event Register (EXPEVT)
5.2.3 Interrupt Event Register (INTEVT)
5.3 Exception Handling Functions
5.3.1 Exception Handling Flow
5.3.2 Exception Handling Vector Addresses
5.4 Exception Types and Priorities
5.5 Exception Flow
5.5.1 Exception Flow
5.5.2 Exception Source Acceptance
5.5.3 Exception Requests and BL Bit
5.5.4 Return from Exception Handling
5.6 Description of Exceptions
5.6.1 Resets
5.6.2 General Exceptions
5.6.3 Interrupts
5.6.4 Priority Order with Multiple Exceptions
5.7 Usage Notes
Section 6 Floating-Point Unit (FPU)
6.1 Features
6.2 Data Formats
6.2.1 Floating-Point Format
6.2.2 Non-Numbers (NaN)
6.2.3 Denormalized Numbers
6.3 Register Descriptions
6.3.1 Floating-Point Registers
6.3.2 Floating-Point Status/Control Register (FPSCR)
6.3.3 Floating-Point Communication Register (FPUL)
6.4 Rounding
6.5 Floating-Point Exceptions
6.5.1 General FPU Disable Exceptions and Slot FPU Disable Exceptions
6.5.2 FPU Exception Sources
6.5.3 FPU Exception Handling
6.6 Graphics Support Functions
6.6.1 Geometric Operation Instructions
6.6.2 Pair Single-Precision Data Transfer
Section 7 Memory Management Unit (MMU)
7.1 Overview of MMU
7.1.1 Address Spaces
7.2 Register Descriptions
7.2.1 Page Table Entry High Register (PTEH)
7.2.2 Page Table Entry Low Register (PTEL)
7.2.3 Translation Table Base Register (TTB)
7.2.4 TLB Exception Address Register (TEA)
7.2.5 MMU Control Register (MMUCR)
7.2.6 Physical Address Space Control Register (PASCR)
7.2.7 Instruction Re-Fetch Inhibit Control Register (IRMCR)
7.3 TLB Functions
7.3.1 Unified TLB (UTLB) Configuration
7.3.2 Instruction TLB (ITLB) Configuration
7.3.3 Address Translation Method
7.4 MMU Functions
7.4.1 MMU Hardware Management
7.4.2 MMU Software Management
7.4.3 MMU Instruction (LDTLB)
7.4.4 Hardware ITLB Miss Handling
7.4.5 Avoiding Synonym Problems
7.5 MMU Exceptions
7.5.1 Instruction TLB Multiple Hit Exception
7.5.2 Instruction TLB Miss Exception
7.5.3 Instruction TLB Protection Violation Exception
7.5.4 Data TLB Multiple Hit Exception
7.5.5 Data TLB Miss Exception
7.5.6 Data TLB Protection Violation Exception
7.5.7 Initial Page Write Exception
7.6 Memory-Mapped TLB Configuration
7.6.1 ITLB Address Array
7.6.2 ITLB Data Array
7.6.3 UTLB Address Array
7.6.4 UTLB Data Array
7.7 32-Bit Address Extended Mode
7.7.1 Overview of 32-Bit Address Extended Mode
7.7.2 Transition to 32-Bit Address Extended Mode
7.7.3 Privileged Space Mapping Buffer (PMB) Configuration
7.7.4 PMB Function
7.7.5 Memory-Mapped PMB Configuration
7.7.6 Notes on Using 32-Bit Address Extended Mode
Section 8 Caches
8.1 Features
8.2 Register Descriptions
8.2.1 Cache Control Register (CCR)
8.2.2 Queue Address Control Register 0 (QACR0)
8.2.3 Queue Address Control Register 1 (QACR1)
8.2.4 On-Chip Memory Control Register (RAMCR)
8.3 Operand Cache Operation
8.3.1 Read Operation
8.3.2 Prefetch Operation
8.3.3 Write Operation
8.3.4 Write-Back Buffer
8.3.5 Write-Through Buffer
8.3.6 OC Two-Way Mode
8.4 Instruction Cache Operation
8.4.1 Read Operation
8.4.2 Prefetch Operation
8.4.3 IC Two-Way Mode
8.5 Cache Operation Instruction
8.5.1 Coherency between Cache and External Memory
8.5.2 Prefetch Operation
8.6 Memory-Mapped Cache Configuration
8.6.1 IC Address Array
8.6.2 IC Data Array
8.6.3 OC Address Array
8.6.4 OC Data Array
8.7 Store Queues
8.7.1 SQ Configuration
8.7.2 Writing to SQ
8.7.3 Transfer to External Memory
8.7.4 Determination of SQ Access Exception
8.7.5 Reading from SQ
8.8 Notes on Using 32-Bit Address Extended Mode
Section 9 L Memory
9.1 Features
9.2 Register Descriptions
9.2.1 On-Chip Memory Control Register (RAMCR)
9.2.2 L Memory Transfer Source Address Register 0 (LSA0)
9.2.3 L Memory Transfer Source Address Register 1 (LSA1)
9.2.4 L Memory Transfer Destination Address Register 0 (LDA0)
9.2.5 L Memory Transfer Destination Address Register 1 (LDA1)
9.3 Operation
9.3.1 Access from the CPU and FPU
9.3.2 Access from the SuperHyway Bus Master Module
9.3.3 Block Transfer
9.4 L Memory Protective Functions
9.5 Usage Notes
9.5.1 Page Conflict
9.5.2 L Memory Coherency
9.5.3 Sleep Mode
9.6 Notes on Using 32-Bit Address Extended Mode
Section 10 Instruction Descriptions
10.1 CPU instruction
10.1.1 ADD (Add binary): Arithmetic Instruction
10.1.2 ADDC (Add with Carry): Arithmetic Instruction
10.1.3 ADDV (Add with (V flag) Overflow Check): Arithmetic Instruction
10.1.4 AND (AND Logical): Logical Instruction
10.1.5 BF (Branch if False): Branch Instruction
10.1.6 BF/S (Branch if False with Delay Slot): Branch Instruction
10.1.7 BRA (Branch): Branch Instruction
10.1.8 BRAF (Branch Far): Branch Instruction (Delayed Branch Instruction)
10.1.9 BT (Branch if True): Branch Instruction
10.1.10 BT/S (Branch if True with Delay Slot): Branch Instruction
10.1.11 CLRMAC (Clear MAC Register): System Control Instruction
10.1.12 CLRS (Clear S Bit): System Control Instruction
10.1.13 CLRT (Clear T Bit): System Control Instruction
10.1.14 CMP/cond (Compare Conditionally): Arithmetic Instruction
10.1.15 DIV0S (Divide (Step 0) as Signed): Arithmetic Instruction
10.1.16 DIV0U (Divide (Step 0) as Unsigned): Arithmetic Instruction
10.1.17 DIV1 (Divide 1 Step): Arithmetic Instruction
10.1.18 DMULS.L (Double-length Multiply as Signed): Arithmetic Instruction
10.1.19 DMULU.L (Double-length Multiply as Unsigned): Arithmetic Instruction
10.1.20 DT (Decrement and Test): Arithmetic Instruction
10.1.21 EXTS (Extend as Signed): Arithmetic Instruction
10.1.22 EXTU (Extend as Unsigned): Arithmetic Instruction
10.1.23 ICBI (Instruction Cache Block Invalidate): Data Transfer Instruction
10.1.24 JMP (Jump): Branch Instruction
10.1.25 LDC (Load to Control Register): System Control Instruction
10.1.26 LDS (Load to System Register): System Control Instruction
10.1.27 LDTLB (Load PTEH/PTEL to TLB): System Control Instruction (Privileged Instruction)
10.1.28 MAC.L (Multiply and Accumulate Long): Arithmetic Instruction
10.1.29 MAC.W (Multiply and Accumulate Word): Arithmetic Instruction
10.1.30 MOV (Move data): Data Transfer Instruction
10.1.31 MOV (Move Constant Value): Data Transfer Instruction
10.1.32 MOV (Move Global Data): Data Transfer Instruction
10.1.33 MOV (Move Structure Data): Data Transfer Instruction
10.1.34 MOVA (Move Effective Address): Data Transfer Instruction
10.1.35 MOVCA.L (Move with Cache Block Allocation): Data Transfer Instruction
10.1.36 MOVCO (Move Conditional): Data Transfer Instruction
10.1.37 MOVLI (Move Linked): Data Transfer Instruction
10.1.38 MOVT (Move T Bit): Data Transfer Instruction
10.1.39 MOVUA (Move Unaligned): Data Transfer Instruction
10.1.40 MUL.L (Multiply Long): Arithmetic Instruction
10.1.41 MULS.W (Multiply as Signed Word): Arithmetic Instruction
10.1.42 MULU.W (Multiply as Unsigned Word): Arithmetic Instruction
10.1.43 NEG (Negate): Arithmetic Instruction
10.1.44 NEGC (Negate with Carry): Arithmetic Instruction............................................ 287
10.1.45 NOP (No Operation): System Control Instruction............................................... 288
10.1.46 NOT (Not-logical Complement): Logical Instruction ......................................... 289
10.1.47 OCBI (Operand Cache Block Invalidate): Data Transfer Instruction.................. 290
10.1.48 OCBP (Operand Cache Block Purge): Data Transfer Instruction........................291
10.1.49 OCBWB (Operand Cache Block Write Back): Data Transfer Instruction...........292
10.1.50 OR (OR Logical): Logical Instruction................................................................. 293
10.1.51 PREF (Prefetch Data to Cache): Data Transfer Instruction ................................. 296
10.1.52 PREFI (Prefetch Instruction Cache Block): Data Transfer Instruction................ 297
10.1.53 ROTCL (Rotate with Carry Left): Shift Instruction ............................................ 298
10.1.54 ROTCR (Rotate with Carry Right): Shift Instruction.......................................... 299
10.1.55 ROTL (Rotate Left): Shift Instruction ................................................................. 300
10.1.56 ROTR (Rotate Right): Shift Instruction............................................................... 301
10.1.57 RTE (Return from Exception): System Control Instruction ................................ 302
10.1.58 RTS (Return from Subroutine): Branch Instruction............................................. 304
10.1.59 SETS (Set S Bit): System Control Instruction ..................................................... 306
10.1.60 SETT (Set T Bit): System Control Instruction..................................................... 307
10.1.61 SHAD (Shift Arithmetic Dynamically): Shift Instruction ................................... 308
10.1.62 SHAL (Shift Arithmetic Left): Shift Instruction.................................................. 310
10.1.63 SHAR (Shift Arithmetic Right): Shift Instruction............................................... 311
10.1.64 SHLD (Shift Logical Dynamically): Shift Instruction......................................... 312
10.1.65 SHLL (Shift Logical Left ): Shift Instruction ...................................................... 314
10.1.66 SHLLn (n bits Shift Logical Left): Shift Instruction ........................................... 315
10.1.67 SHLR (Shift Logical Right): Shift Instruction..................................................... 317
10.1.68 SHLRn (n bits Shift Logical Right): Shift Instruction......................................... 318
10.1.69 SLEEP (Sleep): System Control Instruction (Privileged Instruction).................. 320
10.1.70 STC (Store Control Register): System Control Instruction
(Privileged Instruction)........................................................................................ 321
10.1.71 STS (Store System Register): System Control Instruction .................................. 325
10.1.72 SUB (Subtract Binary): Arithmetic Instruction ................................................... 327
10.1.73 SUBC (Subtract with Carry): Arithmetic Instruction .......................................... 328
10.1.74 SUBV (Subtract with (V flag) Underflow Check): Arithmetic Instruction......... 329
10.1.75 SWAP (Swap Register Halves): Data Transfer Instruction ................................. 331
10.1.76 SYNCO (Synchronize Data Operation): Data Transfer Instruction..................... 333
10.1.77 TAS (Test And Set): Logical Instruction............................................................. 334
10.1.78 TRAPA (Trap Always): System Control Instruction........................................... 336
10.1.79 TST (Test Logical): Logical Instruction .............................................................. 337
10.1.80 XOR (Exclusive OR Logical): Logical Instruction ............................................. 339
10.1.81 XTRCT (Extract): Data Transfer Instruction....................................................... 341
10.2 CPU Instructions (FPU related) ........................................................................................ 342
10.2.1 BSR (Branch to Subroutine): Branch Instruction
(Delayed Branch Instruction)............................................................................... 342
10.2.2 BSRF (Branch to Subroutine Far): Branch Instruction
(Delayed Branch Instruction)............................................................................... 344
10.2.3 JSR (Jump to Subroutine): Branch Instruction (Delayed Branch Instruction)..... 346
10.2.4 LDC (Load to Control Register): System Control Instruction
(Privileged Instruction) ........................................................................................ 348
10.2.5 LDS (Load to FPU System register): System Control Instruction....................... 349
10.2.6 STC (Store Control Register): System Control Instruction
(Privileged Instruction) ........................................................................................ 351
10.2.7 STS (Store from FPU System Register): System Control Instruction ................. 352
10.3 FPU Instruction.................................................................................................................354
10.3.1 FABS (Floating-point Absolute Value): Floating-Point Instruction.................... 365
10.3.2 FADD (Floating-point ADD): Floating-Point Instruction ...................................366
10.3.3 FCMP (Floating-point Compare): Floating-Point Instruction.............................. 369
10.3.4 FCNVDS (Floating-point Convert Double to Single Precision):
Floating-Point Instruction .................................................................................... 373
10.3.5 FCNVSD (Floating-point Convert Single to Double Precision):
Floating-Point Instruction .................................................................................... 376
10.3.6 FDIV (Floating-point Divide): Floating-Point Instruction................................... 378
10.3.7 FIPR (Floating-point Inner Product): Floating-Point Instruction......................... 382
10.3.8 FLDI0 (Floating-point Load Immediate 0.0): Floating-Point Instruction............ 384
10.3.9 FLDI1 (Floating-point Load Immediate 1.0): Floating-Point Instruction............ 385
10.3.10 FLDS (Floating-point Load to System register): Floating-Point Instruction...... 386
10.3.11 FLOAT (Floating-point Convert from Integer): Floating-Point Instruction........ 387
10.3.12 FMAC (Floating-point Multiply and Accumulate): Floating-Point Instruction... 389
10.3.13 FMOV (Floating-point Move): Floating-Point Instruction.................................. 395
10.3.14 FMOV (Floating-point Move Extension): Floating-Point Instruction................. 399
10.3.15 FMUL (Floating-point Multiply): Floating-Point Instruction.............................. 402
10.3.16 FNEG (Floating-point Negate Value): Floating-Point Instruction.......................405
10.3.17 FPCHG (Pr-bit Change): Floating-Point Instruction ........................................... 406
10.3.18 FRCHG (FR-bit Change): Floating-Point Instruction.......................................... 407
10.3.19 FSCA (Floating Point Sine And Cosine Approximate):
Floating-Point Instruction .................................................................................... 408
10.3.20 FSCHG (Sz-bit Change): Floating-Point Instruction........................................... 410
10.3.21 FSQRT (Floating-point Square Root): Floating-Point Instruction....................... 411
10.3.22 FSRRA (Floating Point Square Reciprocal Approximate):
Floating-Point Instruction ...................................................................................414
10.3.23 FSTS (Floating-point Store System Register): Floating-Point Instruction ..........416
10.3.24 FSUB (Floating-point Subtract): Floating-Point Instruction................................417
10.3.25 FTRC (Floating-point Truncate and Convert to integer):
Floating-Point Instruction .................................................................................... 420
10.3.26 FTRV (Floating-point Transform Vector): Floating-Point Instruction................ 423
Section 11 List of Registers............................................................................... 427
11.1 Register Addresses
(by functional module, in order of the corresponding section numbers) ..........................428
11.2 Register States in Each Operating Mode .......................................................................... 430
Appendix .........................................................................................................431
A. CPU Operation Mode Register (CPUOPM) ..................................................................... 431
B. Instruction Prefetching and Its Side Effects...................................................................... 433
C. Speculative Execution for Subroutine Return................................................................... 434
D. Version Registers (PVR, PRR)......................................................................................... 435
Main Revisions and Additions in this Edition..................................................... 437
Index .........................................................................................................445

Figures

Section 2 Programming Model
Figure 2.1 Data Formats ................................................................................................................. 7
Figure 2.2 CPU Register Configuration in Each Processing Mode.............................................. 10
Figure 2.3 General Registers ........................................................................................................11
Figure 2.4 Floating-Point Registers.............................................................................................. 13
Figure 2.5 Relationship between SZ bit and Endian..................................................................... 18
Figure 2.6 Formats of Byte Data and Word Data in Register....................................................... 20
Figure 2.7 Data Formats in Memory.............................................................................................21
Figure 2.8 Processing State Transitions........................................................................................ 21
Section 4 Pipelining
Figure 4.1 Basic Pipelines ............................................................................................................43
Figure 4.2 Instruction Execution Patterns (1)............................................................................... 45
Figure 4.2 Instruction Execution Patterns (2)............................................................................... 46
Figure 4.2 Instruction Execution Patterns (3)............................................................................... 47
Figure 4.2 Instruction Execution Patterns (4)............................................................................... 48
Figure 4.2 Instruction Execution Patterns (5)............................................................................... 49
Figure 4.2 Instruction Execution Patterns (6)............................................................................... 50
Figure 4.2 Instruction Execution Patterns (7)............................................................................... 51
Figure 4.2 Instruction Execution Patterns (8)............................................................................... 52
Figure 4.2 Instruction Execution Patterns (9)............................................................................... 53
Section 5 Exception Handling
Figure 5.1 Instruction Execution and Exception Handling...........................................................72
Figure 5.2 Example of General Exception Acceptance Order...................................................... 73
Section 6 Floating-Point Unit (FPU)
Figure 6.1 Format of Single-Precision Floating-Point Number....................................................98
Figure 6.2 Format of Double-Precision Floating-Point Number .................................................. 98
Figure 6.3 Single-Precision NaN Bit Pattern.............................................................................. 101
Figure 6.4 Floating-Point Registers............................................................................................ 104
Figure 6.5 Relation between SZ Bit and Endian......................................................................... 106
Section 7 Memory Management Unit (MMU)
Figure 7.1 Role of MMU ............................................................................................................ 115
Figure 7.2 Virtual Address Space (AT in MMUCR= 0)............................................................. 116
Figure 7.3 Virtual Address Space (AT in MMUCR= 1)............................................................. 116
Figure 7.4 P4 Area...................................................................................................................... 118
Figure 7.5 Physical Address Space.............................................................................................119
Figure 7.6 UTLB Configuration .................................................................................................131
Figure 7.7 Relationship between Page Size and Address Format............................................... 133
Figure 7.8 ITLB Configuration................................................................................................... 133
Figure 7.9 Flowchart of Memory Access Using UTLB..............................................................134
Figure 7.10 Flowchart of Memory Access Using ITLB ............................................................. 135
Figure 7.11 Operation of LDTLB Instruction............................................................................. 138
Figure 7.12 Memory-Mapped ITLB Address Array................................................................... 147
Figure 7.13 Memory-Mapped ITLB Data Array ........................................................................ 148
Figure 7.14 Memory-Mapped UTLB Address Array ................................................................. 150
Figure 7.15 Memory-Mapped UTLB Data Array....................................................................... 151
Figure 7.16 Physical Address Space (32-Bit Address Extended Mode)..................................... 151
Figure 7.17 PMB Configuration.................................................................................................152
Figure 7.18 Memory-Mapped PMB Address Array................................................................... 155
Figure 7.19 Memory-Mapped PMB Data Array......................................................................... 156
Section 8 Caches
Figure 8.1 Configuration of Operand Cache (OC) .....................................................................160
Figure 8.2 Configuration of Instruction Cache (IC) ...................................................................161
Figure 8.3 Configuration of Write-Back Buffer ......................................................................... 172
Figure 8.4 Configuration of Write-Through Buffer.................................................................... 172
Figure 8.5 Memory-Mapped IC Address Array .........................................................................178
Figure 8.6 Memory-Mapped IC Data Array............................................................................... 179
Figure 8.7 Memory-Mapped OC Address Array........................................................................ 180
Figure 8.8 Memory-Mapped OC Data Array ............................................................................. 181
Figure 8.9 Store Queue Configuration........................................................................................ 182
Appendix
Figure B.1 Instruction Prefetch................................................................................................... 433

Tables

Section 1 Overview
Table 1.1 Features..................................................................................................................... 1
Table 1.2 Changes from SH-4 to SH-4A .................................................................................. 4
Section 2 Programming Model
Table 2.1 Initial Register Values...............................................................................................9
Table 2.2 Bit Allocation for FPU Exception Handling........................................................... 19
Section 3 Instruction Set
Table 3.1 Execution Order of Delayed Branch Instructions ................................................... 23
Table 3.2 Addressing Modes and Effective Addresses........................................................... 25
Table 3.3 Notation Used in Instruction List............................................................................ 29
Table 3.4 Fixed-Point Transfer Instructions ........................................................................... 31
Table 3.5 Arithmetic Operation Instructions ..........................................................................33
Table 3.6 Logic Operation Instructions .................................................................................. 35
Table 3.7 Shift Instructions..................................................................................................... 36
Table 3.8 Branch Instructions................................................................................................. 37
Table 3.9 System Control Instructions.................................................................................... 37
Table 3.10 Floating-Point Single-Precision Instructions ..........................................................40
Table 3.11 Floating-Point Double-Precision Instructions......................................................... 41
Table 3.12 Floating-Point Control Instructions ........................................................................41
Table 3.13 Floating-Point Graphics Acceleration Instructions................................................. 42
Section 4 Pipelining
Table 4.1 Representations of Instruction Execution Patterns..................................................44
Table 4.2 Instruction Groups ..................................................................................................54
Table 4.3 Combination of Preceding and Following Instructions...........................................55
Table 4.4 Issue Rates and Execution Cycles........................................................................... 57
Section 5 Exception Handling
Table 5.1 Register Configuration............................................................................................ 65
Table 5.2 States of Register in Each Operating Mode............................................................ 65
Table 5.3 Exceptions...............................................................................................................70
Section 6 Floating-Point Unit (FPU)
Table 6.1 Floating-Point Number Formats and Parameters.................................................... 99
Table 6.2 Floating-Point Ranges........................................................................................... 100
Table 6.3 Bit Allocation for FPU Exception Handling......................................................... 107
Section 7 Memory Management Unit (MMU)
Table 7.1 Register Configuration.......................................................................................... 121
Table 7.2 Register States in Each Processing State ..............................................................121
Section 8 Caches
Table 8.1 Cache Features...................................................................................................... 159
Table 8.2 Store Queue Features............................................................................................ 159
Table 8.3 Register Configuration.......................................................................................... 162
Table 8.4 Register States in Each Processing State .............................................................. 162
Section 9 L Memory
Table 9.1 L Memory Addresses............................................................................................ 187
Table 9.2 Register Configuration.......................................................................................... 188
Table 9.3 Register Status in Each Processing State.............................................................. 188
Table 9.4 Protective Function Exceptions to Access L Memory.......................................... 199
Appendix
Table D.1 Register Configuration.......................................................................................... 435

Section 1 Overview

1.1 Features

The SH-4A is a 32-bit RISC (reduced instruction set computer) microprocessor that is upward compatible with the SH-1, SH-2, SH-3, and SH-4 microcomputers at instruction set code level. Its 16-bit fixed-length instruction set enables program code size to be reduced by almost 50% compared with 32-bit instructions. The features of the SH-4A are listed in table 1.1.
Table 1.1 Features
Item: CPU
  Renesas Technology original architecture
  32-bit internal data bus
  General-register files:
    Sixteen 32-bit general registers (eight 32-bit shadow registers)
    Seven 32-bit control registers
    Four 32-bit system registers
  RISC-type instruction set (upward compatible with the SH-1, SH-2, SH-3, and SH-4 microcomputers)
    Instruction length: 16-bit fixed length for improved code efficiency
    Load/store architecture
    Delayed branch instructions
    Instructions executed with conditions
    Instruction set based on the C language
  Superscalar architecture that executes two instructions simultaneously, including the FPU
  Instruction execution time: Two instructions per cycle (max)
  Virtual address space: 4 Gbytes
  Space identifier ASID: 8 bits, 256 virtual address spaces
  On-chip multiplier
  Seven-stage pipeline
Item: Floating-point unit (FPU)
  On-chip floating-point coprocessor
  Supports single-precision (32 bits) and double-precision (64 bits)
  Supports IEEE754-compliant data types and exceptions
  Two rounding modes: Round to Nearest and Round to Zero
  Handling of denormalized numbers: Truncation to zero or interrupt generation for IEEE754 compliance
  Floating-point registers: 32 bits × 16 words × 2 banks
    (single-precision × 16 words or double-precision × 8 words) × 2 banks
  32-bit CPU-FPU floating-point communication register (FPUL)
  Supports FMAC (multiply-and-accumulate) instruction
  Supports FDIV (divide) and FSQRT (square root) instructions
  Supports FLDI0/FLDI1 (load constant 0/1) instructions
  Instruction execution times
    Latency (FADD/FSUB): 3 cycles (single-precision), 5 cycles (double-precision)
    Latency (FMAC/FMUL): 5 cycles (single-precision), 7 cycles (double-precision)
    Pitch (FADD/FSUB): 1 cycle (single-precision/double-precision)
    Pitch (FMAC/FMUL): 1 cycle (single-precision), 3 cycles (double-precision)
    Note: FMAC is supported for single-precision only.
  3-D graphics instructions (single-precision only):
    4-dimensional vector conversion and matrix operations (FTRV): 4 cycles (pitch), 8 cycles (latency)
    4-dimensional vector (FIPR) inner product: 1 cycle (pitch), 5 cycles (latency)
  Ten-stage pipeline
Item: Memory management unit (MMU)
  4 Gbytes of physical address space, 256 address space identifiers (address space identifier ASID: 8 bits)
  Supports single virtual memory mode and multiple virtual memory mode
  Supports multiple page sizes: 1 Kbyte, 4 Kbytes, 64 Kbytes, or 1 Mbyte
  4-entry full associative TLB for instructions
  64-entry full associative TLB for instructions and operands
  Supports software selection of replacement method and random-counter replacement algorithms
  Contents of TLB are directly accessible through address mapping
Item: Cache memory
  Instruction cache (IC)
    4-way set associative
    32-byte block length
  Operand cache (OC)
    4-way set associative
    32-byte block length
    Selectable write method (copy-back or write-through)
  Store queue (32 bytes × 2 entries)
  Note: For the sizes of the instruction cache and operand cache, see the hardware manual of the target product.
Item: L memory
  Two independent read/write ports
    8-/16-/32-/64-bit access from the CPU
    8-/16-/32-/64-bit and 16-/32-byte access from the external devices
  Note: For the size of L memory, see the hardware manual of the target product.

1.2 Changes from SH-4 to SH-4A

Table 1.2 summarizes the changes from SH-4 to SH-4A based on the sections and sub-sections in this manual.
Table 1.2 Changes from SH-4 to SH-4A
Section No. and Name | Sub-section | Sub-section Name | Changes
1. Overview | | | Modified entirely. (Detailed differences are described in the following sections.)
2. Programming Model | 2.2 | Register Descriptions | The operations in SZ = 1 and PR = 1 are added to the floating-point status/control register (FPSCR).
3. Instruction Set | 3.3 | Instruction Set | 9 instructions are added as CPU instructions. 3 instructions are added as FPU instructions.
4. Pipelining | 4.1 | Pipelines | The number of stages in the pipeline is changed from five to seven.
 | 4.2 | Parallel-Executability | 9 instructions are added as CPU instructions. 3 instructions are added as FPU instructions. Instruction group and parallel execution combinations are modified.
 | 4.3 | Execution Cycles | The number of execution cycles is modified.
5. Exception Handling | | |
6. FPU | 6.3.2 | Floating-Point Status/Control Register (FPSCR) | Operations in SZ = 1 and PR = 1 and each endian are added.
 | 6.5 | Floating-Point Exceptions | Specification of FPU exception detection condition with FPU exception enabled is changed.
7. Memory Management Unit | 7.1.1 | Address Spaces | Area P4 configuration is modified. On-chip RAM space is deleted.
 | 7.2 | Register Descriptions | The page table entry assist register (PTEA) is deleted. A physical address space control register is added.
 | 7.2.6 | Physical Address Space Control Register (PASCR) | Newly added.
 | 7.2.7 | Instruction Re-Fetch Inhibit Control Register (IRMCR) | Newly added.
 | 7.3 | TLB Functions | Space attribute bits (SA [2:0]) and timing control bit (TC) are deleted from the TLB.
 | 7.4.5 | Avoiding Synonym Problems | The corresponding bits are modified according to the cache size change and the index mode deletion.
 | 7.5.1, 7.5.4 | Instruction TLB Multiple Hit Exception and Data TLB Multiple Hit Exception | Multiple hits during the UTLB search caused by ITLB miss handling are changed to be handled as an instruction TLB multiple hit exception.
 | 7.6 | Memory-Mapped TLB Configuration | Data array 2 in the ITLB and UTLB is deleted.
 | 7.6.3 | UTLB Address Array | Associative writes to the UTLB address array are changed to not generate data TLB multiple hit exceptions. Memory allocated addresses are changed from H'F6000000–H'F6FFFFFF to H'F6000000–H'F60FFFFF.
 | 7.6.4 | UTLB Data Array | Memory allocated addresses are changed from H'F7000000–H'F77FFFFF to H'F7000000–H'F70FFFFF.
 | 7.7 | 32-Bit Address Extended Mode | Newly added.
8. Caches | 8.1 | Features | Instruction cache capacity is changed to 32 Kbytes. The caching method is changed to a 4-way set-associative method.
 | 8.2 | Register Descriptions | An on-chip memory control register is added.
 | 8.2.1 | Cache Control Register (CCR) | Modified. (Descriptions in CCR are modified.)
 | 8.2.4 | On-Chip Memory Control Register (RAMCR) | Newly added.
 | 8.3 | Operand Cache Operation | RAM mode and OC index mode are deleted.
 | 8.3.6 | OC Two-Way Mode | Newly added.
 | 8.4 | Instruction Cache Operation | IC index mode is deleted.
 | 8.4.3 | IC Two-Way Mode | Newly added.
 | 8.5.1 | Coherency between Cache and External Memory | The ICBI, PREFI, and SYNCO instructions are added.
 | 8.6 | Memory-Mapped Cache Configuration | The entry bits and the way bits are modified according to the size modification and the change to a 4-way set-associative cache.
 | 8.8 | Notes on Using 32-Bit Address Extended Mode | Newly added.
9. L Memory | | | Newly added.
10. Instruction Descriptions | | | 9 instructions are added as CPU instructions. 3 instructions are added as FPU instructions.

Section 2 Programming Model

The programming model of the SH-4A is explained in this section. The SH-4A has registers and data formats as shown below.

2.1 Data Formats

The data formats supported in the SH-4A are shown in figure 2.1.
Byte (8 bits): bits 7 to 0
Word (16 bits): bits 15 to 0
Longword (32 bits): bits 31 to 0
Single-precision floating-point (32 bits): s = bit 31, e = bits 30 to 23, f = bits 22 to 0
Double-precision floating-point (64 bits): s = bit 63, e = bits 62 to 52, f = bits 51 to 0
[Legend]
s: Sign field
e: Exponent field
f: Fraction field
Figure 2.1 Data Formats
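As an illustration of the single-precision format in figure 2.1 (sign field in bit 31, exponent field in bits 30 to 23, fraction field in bits 22 to 0), the following C sketch splits a 32-bit pattern into its three fields. The type and function names are hypothetical and do not appear in this manual, and sp_bits assumes that the host compiler's float is a 32-bit IEEE754 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a single-precision value. */
typedef struct {
    uint32_t sign;      /* s: bit 31 */
    uint32_t exponent;  /* e: bits 30 to 23 (biased exponent) */
    uint32_t fraction;  /* f: bits 22 to 0 */
} sp_fields_t;

/* Split a raw 32-bit pattern into its sign, exponent, and fraction fields. */
sp_fields_t sp_split(uint32_t bits)
{
    sp_fields_t out;
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x007FFFFFu;
    return out;
}

/* Return the bit pattern of a host float.
 * Assumption: float on the host is a 32-bit IEEE754 value. */
uint32_t sp_bits(float value)
{
    uint32_t bits;
    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

For example, -1.5 has the bit pattern H'BFC00000, which splits into s = 1, e = 127, and f = H'400000.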

2.2 Register Descriptions

2.2.1 Privileged Mode and Banks

Processing Modes: This LSI has two processing modes, user mode and privileged mode. This LSI normally operates in user mode, and switches to privileged mode when an exception occurs or an interrupt is accepted. There are four kinds of registers (general registers, system registers, control registers, and floating-point registers), and the registers that can be accessed differ between the two processing modes.
General Registers: There are 16 general registers, designated R0 to R15. General registers R0 to R7 are banked registers which are switched by a processing mode change.
Privileged mode
In privileged mode, the register bank bit (RB) in the status register (SR) defines which banked register set is accessed as general registers, and which set is accessed only through the load control register (LDC) and store control register (STC) instructions.
When the RB bit is 1 (that is, when bank 1 is selected), the 16 registers comprising bank 1 general registers R0_BANK1 to R7_BANK1 and non-banked general registers R8 to R15 can be accessed as general registers R0 to R15. In this case, the eight registers comprising bank 0 general registers R0_BANK0 to R7_BANK0 are accessed by the LDC/STC instructions. When the RB bit is 0 (that is, when bank 0 is selected), the 16 registers comprising bank 0 general registers R0_BANK0 to R7_BANK0 and non-banked general registers R8 to R15 can be accessed as general registers R0 to R15. In this case, the eight registers comprising bank 1 general registers R0_BANK1 to R7_BANK1 are accessed by the LDC/STC instructions.
User mode
In user mode, the 16 registers comprising bank 0 general registers R0_BANK0 to R7_BANK0 and non-banked general registers R8 to R15 can be accessed as general registers R0 to R15. The eight registers comprising bank 1 general registers R0_BANK1 to R7_BANK1 cannot be accessed.
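The bank selection described above can be sketched as a small software model. This is only an illustration of the rules in the text, not a description of the hardware implementation; the structure and function names are hypothetical, and md and rb stand for the MD and RB bits of SR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the general-register file. */
typedef struct {
    uint32_t bank[2][8];  /* R0_BANK0..R7_BANK0 and R0_BANK1..R7_BANK1 */
    uint32_t upper[8];    /* non-banked R8..R15 */
    unsigned md;          /* SR.MD: 1 = privileged mode, 0 = user mode */
    unsigned rb;          /* SR.RB: bank selected in privileged mode */
} regfile_t;

/* Bank seen as R0..R7: bank 0 in user mode, the RB-selected bank otherwise. */
unsigned active_bank(const regfile_t *rf)
{
    return rf->md ? (rf->rb & 1u) : 0u;
}

/* Storage behind general register Rn (n = 0..15) in the current mode. */
uint32_t *general_reg(regfile_t *rf, unsigned n)
{
    assert(n < 16);
    return (n < 8) ? &rf->bank[active_bank(rf)][n] : &rf->upper[n - 8];
}

/* Storage behind Rn_BANK (n = 0..7) as reached by LDC/STC, that is, the
 * bank not selected by RB. Returns NULL in user mode, where the other
 * bank cannot be accessed. */
uint32_t *other_bank_reg(regfile_t *rf, unsigned n)
{
    assert(n < 8);
    if (!rf->md)
        return NULL;
    return &rf->bank[(rf->rb & 1u) ^ 1u][n];
}
```

With md = 1 and rb = 1, general_reg(rf, 3) refers to R3_BANK1 and other_bank_reg(rf, 3) refers to R3_BANK0; R8 to R15 resolve to the same storage in every mode.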
Control Registers: Control registers comprise the global base register (GBR) and status register (SR), which can be accessed in both processing modes, and the saved status register (SSR), saved program counter (SPC), vector base register (VBR), saved general register 15 (SGR), and debug base register (DBR), which can only be accessed in privileged mode. Some bits of the status register (such as the RB bit) can only be accessed in privileged mode.
System Registers: System registers comprise the multiply-and-accumulate registers (MACH/MACL), the procedure register (PR), and the program counter (PC). Access to these registers does not depend on the processing mode.
Floating-Point Registers and System Registers Related to FPU: There are thirty-two floating-point registers, FR0–FR15 and XF0–XF15. FR0–FR15 and XF0–XF15 can be assigned to either of two banks (FPR0_BANK0–FPR15_BANK0 or FPR0_BANK1–FPR15_BANK1).
FR0–FR15 can be used as the eight registers DR0/2/4/6/8/10/12/14 (double-precision floating-point registers, or pair registers) or the four registers FV0/4/8/12 (register vectors), while XF0–XF15 can be used as the eight registers XD0/2/4/6/8/10/12/14 (register pairs) or register matrix XMTRX.
System registers related to the FPU comprise the floating-point communication register (FPUL) and the floating-point status/control register (FPSCR). These registers are used for communication between the FPU and the CPU, and the exception handling setting.
Register values after a reset are shown in table 2.1.
Table 2.1 Initial Register Values
Type                      Registers                       Initial Value*
General registers         R0_BANK0 to R7_BANK0,           Undefined
                          R0_BANK1 to R7_BANK1,
                          R8 to R15
Control registers         SR                              MD bit = 1, RB bit = 1, BL bit = 1, FD bit = 0,
                                                          IMASK = B'1111, reserved bits = 0,
                                                          others = undefined
                          GBR, SSR, SPC, SGR, DBR         Undefined
                          VBR                             H'00000000
System registers          MACH, MACL, PR                  Undefined
                          PC                              H'A0000000
Floating-point registers  FR0 to FR15, XF0 to XF15, FPUL  Undefined
                          FPSCR                           H'00040001
Note: * Initialized by a power-on reset and manual reset.
The CPU register configuration in each processing mode is shown in figure 2.2.
User mode and privileged mode are switched by the processing mode bit (MD) in the status register.
Figure 2.2 CPU Register Configuration in Each Processing Mode
(a) Register configuration in user mode: R0_BANK0 to R7_BANK0 (notes 1, 2), R8 to R15, SR, GBR, MACH, MACL, PR, PC
(b) Register configuration in privileged mode (RB = 1): R0_BANK1 to R7_BANK1 (notes 1, 3), R8 to R15, R0_BANK0 to R7_BANK0 (note 4), SR, SSR, GBR, MACH, MACL, PR, VBR, PC, SPC, SGR, DBR
(c) Register configuration in privileged mode (RB = 0): R0_BANK0 to R7_BANK0 (notes 1, 4), R8 to R15, R0_BANK1 to R7_BANK1 (note 3), SR, SSR, GBR, MACH, MACL, PR, VBR, PC, SPC, SGR, DBR
Notes: 1. R0 is used as the index register in indexed register-indirect addressing mode and indexed GBR indirect addressing mode.
2. Banked registers
3. Banked registers. Accessed as general registers when the RB bit is set to 1 in SR. Accessed only by LDC/STC instructions when the RB bit is cleared to 0.
4. Banked registers. Accessed as general registers when the RB bit is cleared to 0 in SR. Accessed only by LDC/STC instructions when the RB bit is set to 1.

2.2.2 General Registers

Figure 2.3 shows the relationship between the processing modes and general registers. The SH-4A has twenty-four 32-bit general registers (R0_BANK0 to R7_BANK0, R0_BANK1 to R7_BANK1, and R8 to R15). However, only 16 of these can be accessed as general registers R0 to R15 in one processing mode. The SH-4A has two processing modes, user mode and privileged mode.
R0_BANK0 to R7_BANK0
Allocated to R0 to R7 in user mode (SR.MD = 0)
Allocated to R0 to R7 when SR.RB = 0 in privileged mode (SR.MD = 1).
R0_BANK1 to R7_BANK1
Cannot be accessed in user mode.
Allocated to R0 to R7 when SR.RB = 1 in privileged mode.
Figure 2.3 General Registers
SR.MD = 0 or (SR.MD = 1, SR.RB = 0): R0 to R7 are R0_BANK0 to R7_BANK0; R0_BANK1 to R7_BANK1 are not accessed as general registers.
(SR.MD = 1, SR.RB = 1): R0 to R7 are R0_BANK1 to R7_BANK1; R0_BANK0 to R7_BANK0 are not accessed as general registers.
R8 to R15 are common to both cases.
Note on Programming: As the user's R0 to R7 are assigned to R0_BANK0 to R7_BANK0, and after an exception or interrupt R0 to R7 are assigned to R0_BANK1 to R7_BANK1, it is not necessary for the interrupt handler to save and restore the user's R0 to R7 (R0_BANK0 to R7_BANK0).

2.2.3 Floating-Point Registers

Figure 2.4 shows the floating-point register configuration. There are thirty-two 32-bit floating-point registers, FPR0_BANK0 to FPR15_BANK0 and FPR0_BANK1 to FPR15_BANK1, comprising two banks. These registers are referenced as FR0 to FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0 to XF15, XD0/2/4/6/8/10/12/14, or XMTRX. Reference names of each register are defined depending on the state of the FR bit in FPSCR (see figure 2.4).
1. Floating-point registers, FPRn_BANKj (32 registers)
FPR0_BANK0 to FPR15_BANK0
FPR0_BANK1 to FPR15_BANK1
2. Single-precision floating-point registers, FRi (16 registers)
When FPSCR.FR = 0, FR0 to FR15 are assigned to FPR0_BANK0 to FPR15_BANK0;
when FPSCR.FR = 1, FR0 to FR15 are assigned to FPR0_BANK1 to FPR15_BANK1.
3. Double-precision floating-point registers or single-precision floating-point registers, DRi (8 registers): A DR register comprises two FR registers.
DR0 = {FR0, FR1}, DR2 = {FR2, FR3}, DR4 = {FR4, FR5}, DR6 = {FR6, FR7}, DR8 = {FR8, FR9}, DR10 = {FR10, FR11}, DR12 = {FR12, FR13}, DR14 = {FR14, FR15}
4. Single-precision floating-point vector registers, FVi (4 registers): An FV register comprises four FR registers.
FV0 = {FR0, FR1, FR2, FR3}, FV4 = {FR4, FR5, FR6, FR7}, FV8 = {FR8, FR9, FR10, FR11}, FV12 = {FR12, FR13, FR14, FR15}
5. Single-precision floating-point extended registers, XFi (16 registers)
When FPSCR.FR = 0, XF0 to XF15 are assigned to FPR0_BANK1 to FPR15_BANK1;
when FPSCR.FR = 1, XF0 to XF15 are assigned to FPR0_BANK0 to FPR15_BANK0.
6. Double-precision floating-point extended registers, XDi (8 registers): An XD register comprises two XF registers.
XD0 = {XF0, XF1}, XD2 = {XF2, XF3}, XD4 = {XF4, XF5}, XD6 = {XF6, XF7}, XD8 = {XF8, XF9}, XD10 = {XF10, XF11}, XD12 = {XF12, XF13}, XD14 = {XF14, XF15}
7. Single-precision floating-point extended register matrix, XMTRX: XMTRX comprises all 16 XF registers.
XMTRX = XF0 XF4 XF8  XF12
        XF1 XF5 XF9  XF13
        XF2 XF6 XF10 XF14
        XF3 XF7 XF11 XF15
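The naming rules in items 1 to 7 can be expressed as a few lookup helpers. The sketch below is illustrative only; the function names (`fr_physical`, `xf_physical`, `dr_members`, `fv_members`, `xmtrx_rows`) are assumptions made for this example and do not correspond to any instruction or tool.

```python
def fr_physical(i, fr_bit):
    """Physical register behind FRi for a given FPSCR.FR value (0 or 1)."""
    return f"FPR{i}_BANK{fr_bit}"

def xf_physical(i, fr_bit):
    """Physical register behind XFi: always the bank not used by FRi."""
    return f"FPR{i}_BANK{1 - fr_bit}"

def dr_members(i):
    """DRi = {FRi, FRi+1}; i must be an even number from 0 to 14."""
    if i % 2 or not 0 <= i <= 14:
        raise ValueError("DR index must be even, 0 to 14")
    return (f"FR{i}", f"FR{i + 1}")

def fv_members(i):
    """FVi = {FRi, FRi+1, FRi+2, FRi+3}; i must be 0, 4, 8, or 12."""
    if i not in (0, 4, 8, 12):
        raise ValueError("FV index must be 0, 4, 8, or 12")
    return tuple(f"FR{i + k}" for k in range(4))

def xmtrx_rows():
    """XMTRX laid out row by row, as in item 7 (columns step by 4)."""
    return [[f"XF{row + 4 * col}" for col in range(4)] for row in range(4)]
```

With FPSCR.FR = 0, `fr_physical(3, 0)` is `FPR3_BANK0` and `xf_physical(3, 0)` is `FPR3_BANK1`; setting the FR bit swaps the two.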
Figure 2.4 Floating-Point Registers
FPSCR.FR = 0: FR0 to FR15 (DR0 to DR14, FV0 to FV12) are FPR0_BANK0 to FPR15_BANK0; XF0 to XF15 (XD0 to XD14, XMTRX) are FPR0_BANK1 to FPR15_BANK1.
FPSCR.FR = 1: XF0 to XF15 (XD0 to XD14, XMTRX) are FPR0_BANK0 to FPR15_BANK0; FR0 to FR15 (DR0 to DR14, FV0 to FV12) are FPR0_BANK1 to FPR15_BANK1.

2.2.4 Control Registers

Status Register (SR)
Bit:            31   30   29   28   27 to 16
Bit name:       —    MD   RB   BL   —
Initial value:  0    1    1    1    All 0
R/W:            R    R/W  R/W  R/W  R

Bit:            15   14 to 10  9    8    7 to 4  3, 2   1    0
Bit name:       FD   —         M    Q    IMASK   —      S    T
Initial value:  0    All 0     0    0    All 1   All 0  0    0
R/W:            R/W  R         R/W  R/W  R/W     R      R/W  R/W

Bit       Bit Name  Initial Value  R/W  Description
31        —         0              R    Reserved
                                        For details on reading/writing this bit, see General Precautions on Handling of Product.
30        MD        1              R/W  Processing Mode
                                        Selects the processing mode.
                                        0: User mode (Some instructions cannot be executed and some resources cannot be accessed.)
                                        1: Privileged mode
                                        This bit is set to 1 by an exception or interrupt.
29        RB        1              R/W  Privileged Mode General Register Bank Specification Bit
                                        0: R0_BANK0 to R7_BANK0 are accessed as general registers R0 to R7 and R0_BANK1 to R7_BANK1 can be accessed using LDC/STC instructions
                                        1: R0_BANK1 to R7_BANK1 are accessed as general registers R0 to R7 and R0_BANK0 to R7_BANK0 can be accessed using LDC/STC instructions
                                        This bit is set to 1 by an exception or interrupt.
28        BL        1              R/W  Exception/Interrupt Block Bit
                                        This bit is set to 1 by a reset, an exception, or an interrupt. While this bit is set to 1, an interrupt request is masked. In this case, this processor enters the reset state when a general exception other than a user break occurs.
27 to 16  —         All 0          R    Reserved
                                        For details on reading/writing this bit, see General Precautions on Handling of Product.
15        FD        0              R/W  FPU Disable Bit
                                        When this bit is set to 1 and an FPU instruction is not in a delay slot, a general FPU disable exception occurs. When this bit is set to 1 and an FPU instruction is in a delay slot, a slot FPU disable exception occurs. (FPU instructions: H'F*** instructions and LDS(.L)/STS(.L) instructions using FPUL/FPSCR)
14 to 10  —         All 0          R    Reserved
                                        For details on reading/writing this bit, see General Precautions on Handling of Product.
9         M         0              R/W  M Bit
                                        Used by the DIV0S, DIV0U, and DIV1 instructions.
8         Q         0              R/W  Q Bit
                                        Used by the DIV0S, DIV0U, and DIV1 instructions.
7 to 4    IMASK     All 1          R/W  Interrupt Mask Level Bits
                                        An interrupt whose priority is equal to or less than the value of the IMASK bits is masked. Whether or not the level of IMASK is changed when an interrupt is accepted can be selected with the CPU operation mode register (CPUOPM). For details, see Appendix A, CPU Operation Mode Register (CPUOPM).
3, 2      —         All 0          R    Reserved
                                        For details on reading/writing this bit, see General Precautions on Handling of Product.
1         S         0              R/W  S Bit
                                        Used by the MAC instruction.
0         T         0              R/W  T Bit
                                        Indicates true/false condition, carry/borrow, or overflow/underflow.
                                        For details, see section 3, Instruction Set.
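As a reading aid for the SR bit assignments, the following sketch extracts the named fields from a 32-bit SR value. It is a host-side model, not code for the LSI; the names `SR_FIELDS`, `decode_sr`, and `interrupt_accepted` are assumptions made for this example, and the value H'700000F0 assumes that every bit listed with initial value 0 really reads as 0 after reset (table 2.1 lists the remaining bits as undefined).

```python
# Field name -> (least significant bit position, width in bits)
SR_FIELDS = {
    "MD": (30, 1),
    "RB": (29, 1),
    "BL": (28, 1),
    "FD": (15, 1),
    "M": (9, 1),
    "Q": (8, 1),
    "IMASK": (4, 4),
    "S": (1, 1),
    "T": (0, 1),
}

def decode_sr(value):
    """Split a 32-bit SR value into its named fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in SR_FIELDS.items()}

def interrupt_accepted(sr_value, level):
    """Model of the masking rule: an interrupt is masked while BL = 1,
    or when its priority level is equal to or less than IMASK."""
    fields = decode_sr(sr_value)
    return fields["BL"] == 0 and level > fields["IMASK"]
```

Decoding H'700000F0 yields MD = 1, RB = 1, BL = 1, and IMASK = 15, which matches the initial values in the table above.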
Saved Status Register (SSR) (32 bits, Privileged Mode, Initial Value = Undefined): The contents of SR are saved to SSR in the event of an exception or interrupt.
Saved Program Counter (SPC) (32 bits, Privileged Mode, Initial Value = Undefined): The address of an instruction at which an interrupt or exception occurs is saved to SPC.
Global Base Register (GBR) (32 bits, Initial Value = Undefined): GBR is referenced as the base address of addressing @(disp,GBR) and @(R0,GBR).
Vector Base Register (VBR) (32 bits, Privileged Mode, Initial Value = H'00000000): VBR is referenced as the branch destination base address in the event of an exception or interrupt. For details, see section 5, Exception Handling.
Saved General Register 15 (SGR) (32 bits, Privileged Mode, Initial Value = Undefined): The contents of R15 are saved to SGR in the event of an exception or interrupt.
Debug Base Register (DBR) (32 bits, Privileged Mode, Initial Value = Undefined): When the user break debugging function is enabled (CBCR.UBDE = 1), DBR is referenced as the branch destination address of the user break handler instead of VBR.

2.2.5 System Registers

Multiply-and-Accumulate Registers (MACH and MACL) (32 bits, Initial Value = Undefined): MACH and MACL are used for the added value in a MAC instruction, and to store the operation result of a MAC or MUL instruction.
Procedure Register (PR) (32 bits, Initial Value = Undefined): The return address is stored in PR in a subroutine call using a BSR, BSRF, or JSR instruction. PR is referenced by the subroutine return instruction (RTS).
Program Counter (PC) (32 bits, Initial Value = H'A0000000): PC indicates the address of the instruction currently being executed.
Floating-Point Status/Control Register (FPSCR)
Bit:            31 to 22  21   20   19   18   17, 16
Bit name:       —         FR   SZ   PR   DN   Cause
Initial value:  All 0     0    0    0    1    All 0
R/W:            R         R/W  R/W  R/W  R/W  R/W

Bit:            15 to 12  11 to 7      6 to 2  1, 0
Bit name:       Cause     Enable (EN)  Flag    RM
Initial value:  All 0     All 0        All 0   01
R/W:            R/W       R/W          R/W     R/W

Bit       Bit Name     Initial Value  R/W  Description
31 to 22  —            All 0          R    Reserved
                                           For details on reading/writing this bit, see General Precautions on Handling of Product.
21        FR           0              R/W  Floating-Point Register Bank
                                           0: FPR0_BANK0 to FPR15_BANK0 are assigned to FR0 to FR15 and FPR0_BANK1 to FPR15_BANK1 are assigned to XF0 to XF15
                                           1: FPR0_BANK0 to FPR15_BANK0 are assigned to XF0 to XF15 and FPR0_BANK1 to FPR15_BANK1 are assigned to FR0 to FR15
20        SZ           0              R/W  Transfer Size Mode
                                           0: Data size of FMOV instruction is 32 bits
                                           1: Data size of FMOV instruction is a 32-bit register pair (64 bits)
                                           For relationship between the SZ bit, PR bit, and endian, see figure 2.5.
19        PR           0              R/W  Precision Mode
                                           0: Floating-point instructions are executed as single-precision operations
                                           1: Floating-point instructions are executed as double-precision operations (graphics support instructions are undefined)
                                           For relationship between the SZ bit, PR bit, and endian, see figure 2.5.
18        DN           1              R/W  Denormalization Mode
                                           0: Denormalized number is treated as such
                                           1: Denormalized number is treated as zero
17 to 12  Cause        All 0          R/W  FPU Exception Cause Field
11 to 7   Enable (EN)  All 0          R/W  FPU Exception Enable Field
6 to 2    Flag         All 0          R/W  FPU Exception Flag Field
                                           Each time an FPU operation instruction is executed, the FPU exception cause field is cleared to 0. When an FPU exception occurs, the bits corresponding to FPU exception cause field and flag field are set to 1. The FPU exception flag field remains set to 1 until it is cleared to 0 by software.
                                           For bit allocations of each field, see table 2.2.
1, 0      RM           01             R/W  Rounding Mode
                                           These bits select the rounding mode.
                                           00: Round to Nearest
                                           01: Round to Zero
                                           10: Reserved
                                           11: Reserved
Figure 2.5 Relationship between SZ bit and Endian
Panels show, for big endian and little endian separately, how floating-point registers FR (2i), FR (2i+1), and DR (2i) correspond to memory areas in three cases:
(1) SZ = 0
(2) SZ = 1, PR = 0
(3) SZ = 1, PR = 1
Notes: 1. In the case of SZ = 0 and PR = 0, DR register can not be used.
2. The bit-location of DR register is used for double precision format when PR = 1. (In the case of (2), it is used when PR is changed from 0 to 1.)
Table 2.2 Bit Allocation for FPU Exception Handling
Field Name                          FPU Error (E)  Invalid Operation (V)  Division by Zero (Z)  Overflow (O)  Underflow (U)  Inexact (I)
Cause   FPU exception cause field   Bit 17         Bit 16                 Bit 15                Bit 14        Bit 13         Bit 12
Enable  FPU exception enable field  None           Bit 11                 Bit 10                Bit 9         Bit 8          Bit 7
Flag    FPU exception flag field    None           Bit 6                  Bit 5                 Bit 4         Bit 3          Bit 2
Floating-Point Communication Register (FPUL) (32 bits, Initial Value = Undefined): Information is transferred between the FPU and CPU via FPUL.
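The FPSCR layout and the cause bit positions of table 2.2 can be checked with a short decoding sketch. This is a host-side illustration under the bit positions given above; the names `FPSCR_FIELDS`, `ROUNDING_MODES`, `CAUSE_BITS`, `decode_fpscr`, and `cause_names` are assumptions made for this example.

```python
# Field name -> (least significant bit position, width in bits)
FPSCR_FIELDS = {
    "FR": (21, 1),
    "SZ": (20, 1),
    "PR": (19, 1),
    "DN": (18, 1),
    "Cause": (12, 6),
    "Enable": (7, 5),
    "Flag": (2, 5),
    "RM": (0, 2),
}

# Only the two defined rounding modes; B'10 and B'11 are reserved.
ROUNDING_MODES = {0b00: "Round to Nearest", 0b01: "Round to Zero"}

# Cause field bit positions from table 2.2.
CAUSE_BITS = {"E": 17, "V": 16, "Z": 15, "O": 14, "U": 13, "I": 12}

def decode_fpscr(value):
    """Split a 32-bit FPSCR value into its named fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FPSCR_FIELDS.items()}

def cause_names(value):
    """List the exception causes whose cause bits are set, in E, V, Z, O, U, I order."""
    return [name for name, bit in CAUSE_BITS.items() if (value >> bit) & 1]
```

Decoding the initial value H'00040001 gives DN = 1 and RM = B'01 (Round to Zero), with all other fields 0.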

2.3 Memory-Mapped Registers

Some control registers are mapped to the following memory areas. Each of the mapped registers has two addresses.
H'1C00 0000 to H'1FFF FFFF
H'FC00 0000 to H'FFFF FFFF
These two areas are used as follows.
H'1C00 0000 to H'1FFF FFFF
This area must be accessed using the address translation function of the MMU.
Setting the page number of this area to the corresponding field of the TLB enables access to a memory-mapped register.
The operation of an access to this area without using the address translation function of the MMU is not guaranteed.
H'FC00 0000 to H'FFFF FFFF
Access to area H'FC00 0000 to H'FFFF FFFF in user mode will cause an address error. Memory-mapped registers can be referenced in user mode by means of access that involves address translation.
Note: Do not access addresses to which registers are not mapped in either area. The operation of an access to an address with no register mapped is undefined. Also, memory-mapped registers must be accessed using a fixed data size. The operation of an access using an invalid data size is undefined.

2.4 Data Formats in Registers

Register operands are always longwords (32 bits). When a memory operand is only a byte (8 bits) or a word (16 bits), it is sign-extended into a longword when loaded into a register.
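Byte data: bits 7 to 0 hold the data; bits 31 to 8 are filled with the sign bit S (bit 7).
Word data: bits 15 to 0 hold the data; bits 31 to 16 are filled with the sign bit S (bit 15).
Figure 2.6 Formats of Byte Data and Word Data in Register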
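The sign extension described in this section can be modeled as follows. The helper name `sign_extend` is an assumption made for this sketch; it returns the 32-bit register image as an unsigned Python integer.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value (8 for a byte, 16 for a word)
    into a 32-bit register image."""
    value &= (1 << bits) - 1            # keep only the operand bits
    sign = 1 << (bits - 1)              # position of the sign bit S
    return ((value ^ sign) - sign) & 0xFFFFFFFF
```

A byte H'80 becomes H'FFFFFF80 in the register, while H'7F stays H'0000007F.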

2.5 Data Formats in Memory

Memory data formats are classified into bytes, words, and longwords. Memory can be accessed in an 8-bit byte, 16-bit word, or 32-bit longword form. A memory operand less than 32 bits in length is sign-extended before being loaded into a register.
A word operand must be accessed starting from a word boundary (even address of a 2-byte unit: address 2n), and a longword operand starting from a longword boundary (even address of a 4-byte unit: address 4n). An address error will result if this rule is not observed. A byte operand can be accessed from any address.
Big endian or little endian byte order can be selected for the data format. The endian should be set with the external pin after a power-on reset. The endian cannot be changed dynamically. Bit positions are numbered left to right from most-significant to least-significant. Thus, in a 32-bit longword, the leftmost bit, bit 31, is the most significant bit and the rightmost bit, bit 0, is the least significant bit.
The data format in memory is shown in figure 2.7.
Figure 2.7 Data Formats in Memory
Big endian: the bytes at addresses A, A + 1, A + 2, and A + 3 (Byte 0 to Byte 3) are drawn at bits 31 to 24, 23 to 16, 15 to 8, and 7 to 0; Word 0 is drawn at bits 31 to 16 and Word 1 at bits 15 to 0; a longword occupies bits 31 to 0.
Little endian: the bytes at addresses A + 11, A + 10, A + 9, and A + 8 (Byte 3 to Byte 0) are drawn at bits 31 to 24, 23 to 16, 15 to 8, and 7 to 0; Word 1 is drawn at bits 31 to 16 and Word 0 at bits 15 to 0; a longword occupies bits 31 to 0.
For the 64-bit data format, see figure 2.5.
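The boundary rule and the two byte orders can be illustrated with a short host-side sketch. The function names `is_aligned` and `longword_bytes` are assumptions made for this example; `struct` is the Python standard library module.

```python
import struct

def is_aligned(address, size):
    """True if an operand of `size` bytes (1, 2, or 4) may be accessed at
    `address` without an address error: words at 2n, longwords at 4n."""
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    return address % size == 0

def longword_bytes(value, big_endian):
    """Bytes of a 32-bit longword in increasing address order."""
    return struct.pack(">I" if big_endian else "<I", value)
```

For the longword H'11223344, big endian places H'11 at the lowest address, and little endian places H'44 there.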

2.6 Processing States

This LSI has three major processing states: the reset state, instruction execution state, and power-down state.
Reset State: In this state the CPU is reset. The reset state is divided into the power-on reset state and the manual reset state.
In the power-on reset state, the internal state of the CPU and the on-chip peripheral module registers are initialized. In the manual reset state, the internal state of the CPU and some registers of on-chip peripheral modules are initialized. For details, see register descriptions for each section.
Instruction Execution State: In this state, the CPU executes program instructions in sequence. The instruction execution state comprises the normal program execution state and the exception handling state.
Power-Down State: In the power-down state, the CPU halts operation and power consumption is reduced. The power-down state is entered by executing a SLEEP instruction. There are two modes in the power-down state: sleep mode and standby mode.
Figure 2.8 Processing State Transitions
Any state → reset state: reset/manual reset input
Reset state → instruction execution state: reset/manual reset clearance
Instruction execution state → power-down state: SLEEP instruction execution
Power-down state → instruction execution state: interrupt occurrence
Instruction execution state or power-down state → reset state: reset/manual reset input

2.7 Usage Notes

2.7.1 Notes on Self-Modified Codes

The SH-4A prefetches instructions to accelerate processing. Therefore, if an instruction in memory is modified and then executed immediately, the pre-modification code held in the prefetch buffer may be executed instead. In addition, because this LSI has separate instruction and operand caches, coherency between them must be considered. To ensure that the modified code is executed, one of the following sequences should be executed.
In Case the Modified Codes are in Non-Cacheable Area:
SYNCO
ICBI @Rn
The target for the ICBI instruction can be any address within the range where no address error exception occurs.
In Case the Modified Codes are in Cacheable Area (Write-Through):
SYNCO
ICBI @Rn
All instruction cache areas corresponding to the modified codes should be invalidated by the ICBI instruction. The ICBI instruction should be issued to each cache line. One cache line is 32 bytes.
In Case the Modified Codes are in Cacheable Area (Copy-Back):
OCBP @Rm or OCBWB @Rm
SYNCO
ICBI @Rn
All operand cache areas corresponding to the modified codes should be written back to the main memory by the OCBP or OCBWB instruction. Then all instruction cache areas corresponding to the modified codes should be invalidated by the ICBI instruction. The OCBP, OCBWB, and ICBI instructions should be issued to each cache line. One cache line is 32 bytes.
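Because the cache instructions must be issued once per 32-byte line, it can help to enumerate the line addresses that a modified region touches. The sketch below does only that address arithmetic on a host; the names `CACHE_LINE_BYTES` and `cache_line_addresses` are assumptions made for this example, and it does not itself issue OCBP, OCBWB, SYNCO, or ICBI.

```python
CACHE_LINE_BYTES = 32  # one cache line is 32 bytes

def cache_line_addresses(start, length):
    """Return the start address of every cache line overlapped by the
    byte range [start, start + length)."""
    if length <= 0:
        return []
    mask = ~(CACHE_LINE_BYTES - 1)
    first = start & mask
    last = (start + length - 1) & mask
    return list(range(first, last + 1, CACHE_LINE_BYTES))
```

An 8-byte patch at H'101C straddles a line boundary, so `cache_line_addresses(0x101C, 8)` returns two line addresses, H'1000 and H'1020, and each would need its own cache instruction.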

Section 3 Instruction Set

The SH-4A's instruction set is implemented with 16-bit fixed-length instructions. The SH-4A can use byte (8-bit), word (16-bit), longword (32-bit), and quadword (64-bit) data sizes for memory access. Single-precision floating-point data (32 bits) can be moved to and from memory using longword or quadword size. Double-precision floating-point data (64 bits) can be moved to and from memory using longword size. When the SH-4A moves byte-size or word-size data from memory to a register, the data is sign-extended.

3.1 Execution Environment

PC: At the start of instruction execution, the PC indicates the address of the instruction itself.
Load-Store Architecture: The SH-4A has a load-store architecture in which operations are basically executed using registers. Except for bit-manipulation operations such as logical AND that are executed directly in memory, operands in an operation that requires memory access are loaded into registers and the operation is executed between the registers.
Delayed Branches: Except for the two branch instructions BF and BT, the SH-4A's branch instructions and RTE are delayed branches. In a delayed branch, the instruction following the branch is executed before the branch destination instruction.
Delay Slot: This execution slot following a delayed branch is called a delay slot. For example, the BRA execution sequence is as follows:
Table 3.1 Execution Order of Delayed Branch Instructions
Instructions                                          Execution Order
BRA     TARGET       (Delayed branch instruction)     BRA
ADD                  (Delay slot)                      ↓
:                                                     ADD
:                                                      ↓
TARGET  target-inst  (Branch destination instruction) target-inst
A slot illegal instruction exception may occur when a specific instruction is executed in a delay slot. For details, see section 5, Exception Handling. The instruction following BF/S or BT/S for which the branch is not taken is also a delay slot instruction.
T Bit: The T bit in SR is used to show the result of a compare operation, and is referenced by a conditional branch instruction. An example of the use of a conditional branch instruction is shown below.
    ADD     #1, R0      ; T bit is not changed by ADD operation
    CMP/EQ  R1, R0      ; If R0 = R1, T bit is set to 1
    BT      TARGET      ; Branches to TARGET if T bit = 1 (R0 = R1)
In an RTE delay slot, the SR bits are referenced as follows. In instruction access, the MD bit is used before modification, and in data access, the MD bit is accessed after modification. The other bits (S, T, M, Q, FD, BL, and RB) are used after modification for delay slot instruction execution. The STC and STC.L SR instructions access all SR bits after modification.
Constant Values: An 8-bit constant value can be specified by the instruction code and an immediate value. 16-bit and 32-bit constant values can be defined as literal constant values in memory, and can be referenced by a PC-relative load instruction.
    MOV.W   @(disp, PC), Rn
    MOV.L   @(disp, PC), Rn
There are no PC-relative load instructions for floating-point operations. However, it is possible to set 0.0 or 1.0 by using the FLDI0 or FLDI1 instruction on a single-precision floating-point register.

3.2 Addressing Modes

Addressing modes and effective address calculation methods are shown in table 3.2. When a location in virtual memory space is accessed (AT in MMUCR = 1), the effective address is translated into a physical memory address. If multiple virtual memory space systems are selected (SV in MMUCR = 0), the least significant bit of PTEH is also referenced as the access ASID. For details, see section 7, Memory Management Unit (MMU).
Table 3.2 Addressing Modes and Effective Addresses
Each entry lists: Addressing Mode, Instruction Format, Effective Address Calculation Method, Calculation Formula (EA: effective address).

Register direct
    Instruction format: Rn
    Effective address calculation method: Effective address is register Rn. (Operand is register Rn contents.)
    Calculation formula: —
Register indirect
    Instruction format: @Rn
    Effective address calculation method: Effective address is register Rn contents.
    Calculation formula: Rn → EA
Register indirect with post-increment
    Instruction format: @Rn+
    Effective address calculation method: Effective address is register Rn contents. A constant is added to Rn after instruction execution: 1 for a byte operand, 2 for a word operand, 4 for a longword operand, 8 for a quadword operand.
    Calculation formula: Rn → EA
        After instruction execution
        Byte: Rn + 1 → Rn
        Word: Rn + 2 → Rn
        Longword: Rn + 4 → Rn
        Quadword: Rn + 8 → Rn
Register indirect with pre-decrement
    Instruction format: @–Rn
    Effective address calculation method: Effective address is register Rn contents, decremented by a constant beforehand: 1 for a byte operand, 2 for a word operand, 4 for a longword operand, 8 for a quadword operand.
    Calculation formula:
        Byte: Rn – 1 → Rn
        Word: Rn – 2 → Rn
        Longword: Rn – 4 → Rn
        Quadword: Rn – 8 → Rn
        Rn → EA (Instruction executed with Rn after calculation)
Register indirect with displacement
    Instruction format: @(disp:4, Rn)
    Effective address calculation method: Effective address is register Rn contents with 4-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size.
    Calculation formula:
        Byte: Rn + disp → EA
        Word: Rn + disp × 2 → EA
        Longword: Rn + disp × 4 → EA
Indexed register indirect
    Instruction format: @(R0, Rn)
    Effective address calculation method: Effective address is sum of register Rn and R0 contents.
    Calculation formula: Rn + R0 → EA
GBR indirect with displacement
    Instruction format: @(disp:8, GBR)
    Effective address calculation method: Effective address is register GBR contents with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size.
    Calculation formula:
        Byte: GBR + disp → EA
        Word: GBR + disp × 2 → EA
        Longword: GBR + disp × 4 → EA
Indexed GBR indirect
    Instruction format: @(R0, GBR)
    Effective address calculation method: Effective address is sum of register GBR and R0 contents.
    Calculation formula: GBR + R0 → EA
PC-relative with displacement
    Instruction format: @(disp:8, PC)
    Effective address calculation method: Effective address is PC + 4 with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 2 (word), or 4 (longword), according to the operand size. With a longword operand, the lower 2 bits of PC are masked.
    Calculation formula:
        Word: PC + 4 + disp × 2 → EA
        Longword: PC & H'FFFF FFFC + 4 + disp × 4 → EA
PC-relative
    Instruction format: disp:8
    Effective address calculation method: Effective address is PC + 4 with 8-bit displacement disp added after being sign-extended and multiplied by 2.
    Calculation formula: PC + 4 + disp × 2 → Branch-Target
PC-relative
    Instruction format: disp:12
    Effective address calculation method: Effective address is PC + 4 with 12-bit displacement disp added after being sign-extended and multiplied by 2.
    Calculation formula: PC + 4 + disp × 2 → Branch-Target
PC-relative
    Instruction format: Rn
    Effective address calculation method: Effective address is sum of PC + 4 and Rn.
    Calculation formula: PC + 4 + Rn → Branch-Target
Immediate
    Instruction format: #imm:8
    Effective address calculation method: 8-bit immediate data imm of TST, AND, OR, or XOR instruction is zero-extended.
    Calculation formula: —
Immediate
    Instruction format: #imm:8
    Effective address calculation method: 8-bit immediate data imm of MOV, ADD, or CMP/EQ instruction is sign-extended.
    Calculation formula: —
Immediate
    Instruction format: #imm:8
    Effective address calculation method: 8-bit immediate data imm of TRAPA instruction is zero-extended and multiplied by 4.
    Calculation formula: —
Note: For the addressing modes below that use a displacement (disp), the assembler descriptions in this manual show the value before scaling (×1, ×2, or ×4) is performed according to the operand size. This is done to clarify the operation of the LSI. Refer to the relevant assembler notation rules for the actual assembler descriptions.
@(disp:4, Rn) ; Register indirect with displacement
@(disp:8, GBR) ; GBR indirect with displacement
@(disp:8, PC) ; PC-relative with displacement
disp:8, disp:12 ; PC-relative
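The displacement-based calculation formulas of table 3.2 can be worked through with a small host-side model. In this sketch `disp` is the unscaled field value, as in the manual's notation; the function names, the `SCALE` table, and the size codes "B", "W", and "L" are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF
SCALE = {"B": 1, "W": 2, "L": 4}  # byte, word, longword

def ea_register_displacement(rn, disp, size):
    """@(disp:4, Rn): Rn + zero-extended 4-bit disp, scaled by operand size."""
    return (rn + (disp & 0xF) * SCALE[size]) & MASK32

def ea_gbr_displacement(gbr, disp, size):
    """@(disp:8, GBR): GBR + zero-extended 8-bit disp, scaled by operand size."""
    return (gbr + (disp & 0xFF) * SCALE[size]) & MASK32

def ea_pc_displacement(pc, disp, size):
    """@(disp:8, PC): word or longword only; the lower 2 bits of PC are
    masked for a longword operand."""
    disp &= 0xFF
    if size == "W":
        return (pc + 4 + disp * 2) & MASK32
    if size == "L":
        return ((pc & 0xFFFFFFFC) + 4 + disp * 4) & MASK32
    raise ValueError("PC-relative with displacement supports W and L only")

def branch_target(pc, disp, bits):
    """PC-relative branch (disp:8 or disp:12): PC + 4 + sign-extended disp x 2."""
    disp &= (1 << bits) - 1
    if disp & (1 << (bits - 1)):
        disp -= 1 << bits           # sign extension
    return (pc + 4 + disp * 2) & MASK32
```

For instance, with PC = H'1002 and disp = 1, a longword PC-relative load reads from H'1008 (H'1000 + 4 + 4), and a word load also reads from H'1008 (H'1002 + 4 + 2).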

3.3 Instruction Set

Table 3.3 shows the notation used in the SH instruction lists shown in tables 3.4 to 3.13.
Table 3.3 Notation Used in Instruction List
Item                  Format            Description
Instruction mnemonic  OP.Sz SRC, DEST   OP: Operation code
                                        Sz: Size
                                        SRC: Source operand
                                        DEST: Source and/or destination operand
                                        Rm: Source register
                                        Rn: Destination register
                                        imm: Immediate data
                                        disp: Displacement
Operation notation                      →, ←: Transfer direction
                                        (xx): Memory operand
                                        M/Q/T: SR flag bits
                                        &: Logical AND of individual bits
                                        |: Logical OR of individual bits
                                        ^: Logical exclusive-OR of individual bits
                                        ~: Logical NOT of individual bits
                                        <<n, >>n: n-bit shift
Instruction code      MSB ↔ LSB         mmmm: Register number (Rm, FRm)
                                        nnnn: Register number (Rn, FRn)
                                            0000: R0, FR0
                                            0001: R1, FR1
                                            :
                                            1111: R15, FR15
                                        mmm: Register number (DRm, XDm, Rm_BANK)
                                        nnn: Register number (DRn, XDn, Rn_BANK)
                                            000: DR0, XD0, R0_BANK
                                            001: DR2, XD2, R1_BANK
                                            :
                                            111: DR14, XD14, R7_BANK
                                        mm: Register number (FVm)
                                        nn: Register number (FVn)
                                            00: FV0
                                            01: FV4
                                            10: FV8
                                            11: FV12
                                        iiii: Immediate data
                                        dddd: Displacement
Privileged mode                         "Privileged" means the instruction can only be executed in privileged mode.
T bit                 Value of T bit after instruction execution
                                        —: No change
New                                     "New" means the instruction which is newly added in this LSI.
Note: Scaling (×1, ×2, ×4, or ×8) is executed according to the size of the instruction operand.
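To show how the instruction code notation is read, the following sketch fills a 16-character code pattern with field values. It is a host-side illustration, not an assembler; the function name `encode` and its keyword-argument convention (one argument per field letter, such as `n`, `m`, `i`, or `d`) are assumptions made for this example. The two patterns used in the usage note are taken from table 3.4.

```python
def encode(pattern, **fields):
    """Build a 16-bit instruction code from a pattern such as
    '1110nnnniiiiiiii'. Each letter run is filled, MSB first, with the
    low bits of the matching keyword argument."""
    if len(pattern) != 16:
        raise ValueError("pattern must be 16 characters")
    # Number of bits still to be emitted for each field letter.
    remaining = {name: pattern.count(name) for name in fields}
    code = 0
    for ch in pattern:
        code <<= 1
        if ch in "01":
            code |= int(ch)
        elif ch in remaining:
            remaining[ch] -= 1
            code |= (fields[ch] >> remaining[ch]) & 1
        else:
            raise ValueError(f"no value given for field '{ch}'")
    return code
```

With this helper, MOV #–1,R1 (pattern 1110nnnniiiiiiii) encodes as `encode("1110nnnniiiiiiii", n=1, i=-1)`, which is H'E1FF, and MOV R3,R2 (pattern 0110nnnnmmmm0011) encodes as H'6233.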
Table 3.4 Fixed-Point Transfer Instructions
Instruction Operation Instruction Code Privileged T Bit New
MOV     #imm,Rn            imm → sign extension → Rn                     1110nnnniiiiiiii  —  —
MOV.W   @(disp*,PC),Rn     (disp × 2 + PC + 4) → sign extension → Rn     1001nnnndddddddd  —  —
MOV.L   @(disp*,PC),Rn     (disp × 4 + PC & H'FFFF FFFC + 4) → Rn        1101nnnndddddddd  —  —
MOV     Rm,Rn              Rm → Rn                                       0110nnnnmmmm0011  —  —
MOV.B   Rm,@Rn             Rm → (Rn)                                     0010nnnnmmmm0000  —  —
MOV.W   Rm,@Rn             Rm → (Rn)                                     0010nnnnmmmm0001  —  —
MOV.L   Rm,@Rn             Rm → (Rn)                                     0010nnnnmmmm0010  —  —
MOV.B   @Rm,Rn             (Rm) → sign extension → Rn                    0110nnnnmmmm0000  —  —
MOV.W   @Rm,Rn             (Rm) → sign extension → Rn                    0110nnnnmmmm0001  —  —
MOV.L   @Rm,Rn             (Rm) → Rn                                     0110nnnnmmmm0010  —  —
MOV.B   Rm,@-Rn            Rn-1 → Rn, Rm → (Rn)                          0010nnnnmmmm0100  —  —
MOV.W   Rm,@-Rn            Rn-2 → Rn, Rm → (Rn)                          0010nnnnmmmm0101  —  —
MOV.L   Rm,@-Rn            Rn-4 → Rn, Rm → (Rn)                          0010nnnnmmmm0110  —  —
MOV.B   @Rm+,Rn            (Rm) → sign extension → Rn, Rm + 1 → Rm       0110nnnnmmmm0100  —  —
MOV.W   @Rm+,Rn            (Rm) → sign extension → Rn, Rm + 2 → Rm       0110nnnnmmmm0101  —  —
MOV.L   @Rm+,Rn            (Rm) → Rn, Rm + 4 → Rm                        0110nnnnmmmm0110  —  —
MOV.B   R0,@(disp*,Rn)     R0 → (disp + Rn)                              10000000nnnndddd  —  —
MOV.W   R0,@(disp*,Rn)     R0 → (disp × 2 + Rn)                          10000001nnnndddd  —  —
MOV.L   Rm,@(disp*,Rn)     Rm → (disp × 4 + Rn)                          0001nnnnmmmmdddd  —  —
MOV.B   @(disp*,Rm),R0     (disp + Rm) → sign extension → R0             10000100mmmmdddd  —  —
MOV.W   @(disp*,Rm),R0     (disp × 2 + Rm) → sign extension → R0         10000101mmmmdddd  —  —
MOV.L   @(disp*,Rm),Rn     (disp × 4 + Rm) → Rn                          0101nnnnmmmmdddd  —  —
MOV.B   Rm,@(R0,Rn)        Rm → (R0 + Rn)                                0000nnnnmmmm0100  —  —
MOV.W   Rm,@(R0,Rn)        Rm → (R0 + Rn)                                0000nnnnmmmm0101  —  —
MOV.L   Rm,@(R0,Rn)        Rm → (R0 + Rn)                                0000nnnnmmmm0110  —  —
MOV.B   @(R0,Rm),Rn        (R0 + Rm) → sign extension → Rn               0000nnnnmmmm1100  —  —
MOV.W   @(R0,Rm),Rn        (R0 + Rm) → sign extension → Rn               0000nnnnmmmm1101  —  —
MOV.L   @(R0,Rm),Rn        (R0 + Rm) → Rn                                0000nnnnmmmm1110  —  —
Instruction | Operation | Instruction Code | Privileged | T Bit | New
MOV.B R0,@(disp*,GBR) | R0 → (disp + GBR) | 11000000dddddddd | — | — | —
MOV.W R0,@(disp*,GBR) | R0 → (disp × 2 + GBR) | 11000001dddddddd | — | — | —
MOV.L R0,@(disp*,GBR) | R0 → (disp × 4 + GBR) | 11000010dddddddd | — | — | —
MOV.B @(disp*,GBR),R0 | (disp + GBR) → sign extension → R0 | 11000100dddddddd | — | — | —
MOV.W @(disp*,GBR),R0 | (disp × 2 + GBR) → sign extension → R0 | 11000101dddddddd | — | — | —
MOV.L @(disp*,GBR),R0 | (disp × 4 + GBR) → R0 | 11000110dddddddd | — | — | —
MOVA @(disp*,PC),R0 | disp × 4 + PC & H'FFFF FFFC + 4 → R0 | 11000111dddddddd | — | — | —
MOVCO.L R0,@Rn | LDST → T; if (T == 1) R0 → (Rn); 0 → LDST | 0000nnnn01110011 | — | LDST | New
MOVLI.L @Rm,R0 | 1 → LDST; (Rm) → R0; when interrupt/exception occurred, 0 → LDST | 0000mmmm01100011 | — | — | New
MOVUA.L @Rm,R0 | (Rm) → R0; load non-boundary alignment data | 0100mmmm10101001 | — | — | New
MOVUA.L @Rm+,R0 | (Rm) → R0, Rm + 4 → Rm; load non-boundary alignment data | 0100mmmm11101001 | — | — | New
MOVT Rn | T → Rn | 0000nnnn00101001 | — | — | —
SWAP.B Rm,Rn | Rm → swap lower 2 bytes → Rn | 0110nnnnmmmm1000 | — | — | —
SWAP.W Rm,Rn | Rm → swap upper/lower words → Rn | 0110nnnnmmmm1001 | — | — | —
XTRCT Rm,Rn | Rm:Rn middle 32 bits → Rn | 0010nnnnmmmm1101 | — | — | —
Note: * The Renesas assembler uses the value after scaling (×1, ×2, or ×4) as the displacement (disp).
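The displacement addressing in table 3.4 can be sketched in C as follows. This is an illustrative model, not a definitive implementation. It assumes the disp field is an unsigned (zero-extended) field value, and it reads "disp × 4 + PC & H'FFFF FFFC + 4" as masking only the PC before the addition. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* @(disp,Rn), @(disp,Rm), @(disp,GBR): base + disp x size,
   where disp_field is the raw field value and size is 1, 2, or 4. */
uint32_t sh_ea_base_disp(uint32_t base, uint32_t disp_field, unsigned size)
{
    assert(size == 1 || size == 2 || size == 4);
    return base + disp_field * size;
}

/* MOV.W @(disp,PC),Rn: disp x 2 + PC + 4. */
uint32_t sh_ea_pc_word(uint32_t pc, uint32_t disp_field)
{
    return disp_field * 2u + pc + 4u;
}

/* MOV.L @(disp,PC),Rn and MOVA: disp x 4 + (PC & H'FFFFFFFC) + 4. */
uint32_t sh_ea_pc_long(uint32_t pc, uint32_t disp_field)
{
    return disp_field * 4u + (pc & 0xFFFFFFFCu) + 4u;
}

/* Convert an assembler-level (already scaled) displacement into the
   field value. Returns 1 on success, 0 if it cannot be encoded. */
int sh_disp_to_field(uint32_t scaled_disp, unsigned size,
                     unsigned field_bits, uint32_t *field)
{
    assert(size == 1 || size == 2 || size == 4);
    assert(field_bits == 4 || field_bits == 8);
    if (scaled_disp % size != 0)
        return 0;                      /* not a multiple of the operand size */
    uint32_t value = scaled_disp / size;
    if (value >= (1u << field_bits))
        return 0;                      /* does not fit in the field */
    *field = value;
    return 1;
}
```

With these assumptions, a MOV.L @(disp,PC),Rn located at PC = H'1002 with a field value of 1 reads from H'1008, and an assembler-level displacement of 12 for a longword access with a 4-bit field is encoded as the field value 3.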
Table 3.5 Arithmetic Operation Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
ADD Rm,Rn | Rn + Rm → Rn | 0011nnnnmmmm1100 | — | — | —
ADD #imm,Rn | Rn + imm → Rn | 0111nnnniiiiiiii | — | — | —
ADDC Rm,Rn | Rn + Rm + T → Rn, carry → T | 0011nnnnmmmm1110 | — | Carry | —
ADDV Rm,Rn | Rn + Rm → Rn, overflow → T | 0011nnnnmmmm1111 | — | Overflow | —
CMP/EQ #imm,R0 | When R0 = imm, 1 → T; otherwise, 0 → T | 10001000iiiiiiii | — | Comparison result | —
CMP/EQ Rm,Rn | When Rn = Rm, 1 → T; otherwise, 0 → T | 0011nnnnmmmm0000 | — | Comparison result | —
CMP/HS Rm,Rn | When Rn ≥ Rm (unsigned), 1 → T; otherwise, 0 → T | 0011nnnnmmmm0010 | — | Comparison result | —
CMP/GE Rm,Rn | When Rn ≥ Rm (signed), 1 → T; otherwise, 0 → T | 0011nnnnmmmm0011 | — | Comparison result | —
CMP/HI Rm,Rn | When Rn > Rm (unsigned), 1 → T; otherwise, 0 → T | 0011nnnnmmmm0110 | — | Comparison result | —
CMP/GT Rm,Rn | When Rn > Rm (signed), 1 → T; otherwise, 0 → T | 0011nnnnmmmm0111 | — | Comparison result | —
CMP/PZ Rn | When Rn ≥ 0, 1 → T; otherwise, 0 → T | 0100nnnn00010001 | — | Comparison result | —
CMP/PL Rn | When Rn > 0, 1 → T; otherwise, 0 → T | 0100nnnn00010101 | — | Comparison result | —
CMP/STR Rm,Rn | When any bytes are equal, 1 → T; otherwise, 0 → T | 0010nnnnmmmm1100 | — | Comparison result | —
DIV1 Rm,Rn | 1-step division (Rn ÷ Rm) | 0011nnnnmmmm0100 | — | Calculation result | —
DIV0S Rm,Rn | MSB of Rn → Q, MSB of Rm → M, M^Q → T | 0010nnnnmmmm0111 | — | Calculation result | —
DIV0U | 0 → M/Q/T | 0000000000011001 | — | 0 | —
DMULS.L Rm,Rn | Signed, Rn × Rm → MAC, 32 × 32 → 64 bits | 0011nnnnmmmm1101 | — | — | —
DMULU.L Rm,Rn | Unsigned, Rn × Rm → MAC, 32 × 32 → 64 bits | 0011nnnnmmmm0101 | — | — | —
Instruction | Operation | Instruction Code | Privileged | T Bit | New
DT Rn | Rn – 1 → Rn; when Rn = 0, 1 → T; when Rn ≠ 0, 0 → T | 0100nnnn00010000 | — | Comparison result | —
EXTS.B Rm,Rn | Rm sign-extended from byte → Rn | 0110nnnnmmmm1110 | — | — | —
EXTS.W Rm,Rn | Rm sign-extended from word → Rn | 0110nnnnmmmm1111 | — | — | —
EXTU.B Rm,Rn | Rm zero-extended from byte → Rn | 0110nnnnmmmm1100 | — | — | —
EXTU.W Rm,Rn | Rm zero-extended from word → Rn | 0110nnnnmmmm1101 | — | — | —
MAC.L @Rm+,@Rn+ | Signed, (Rn) × (Rm) + MAC → MAC, Rn + 4 → Rn, Rm + 4 → Rm, 32 × 32 + 64 → 64 bits | 0000nnnnmmmm1111 | — | — | —
MAC.W @Rm+,@Rn+ | Signed, (Rn) × (Rm) + MAC → MAC, Rn + 2 → Rn, Rm + 2 → Rm, 16 × 16 + 64 → 64 bits | 0100nnnnmmmm1111 | — | — | —
MUL.L Rm,Rn | Rn × Rm → MACL, 32 × 32 → 32 bits | 0000nnnnmmmm0111 | — | — | —
MULS.W Rm,Rn | Signed, Rn × Rm → MACL, 16 × 16 → 32 bits | 0010nnnnmmmm1111 | — | — | —
MULU.W Rm,Rn | Unsigned, Rn × Rm → MACL, 16 × 16 → 32 bits | 0010nnnnmmmm1110 | — | — | —
NEG Rm,Rn | 0 – Rm → Rn | 0110nnnnmmmm1011 | — | — | —
NEGC Rm,Rn | 0 – Rm – T → Rn, borrow → T | 0110nnnnmmmm1010 | — | Borrow | —
SUB Rm,Rn | Rn – Rm → Rn | 0011nnnnmmmm1000 | — | — | —
SUBC Rm,Rn | Rn – Rm – T → Rn, borrow → T | 0011nnnnmmmm1010 | — | Borrow | —
SUBV Rm,Rn | Rn – Rm → Rn, underflow → T | 0011nnnnmmmm1011 | — | Underflow | —
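How ADDC, ADDV, and SUBC update the T bit in table 3.5 can be modeled in C. The sketch below is an illustration under stated assumptions: registers are modeled as 32-bit unsigned values, the T bit as 0 or 1, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Result register value plus the T bit after execution. */
typedef struct { uint32_t rn; unsigned t; } sh_alu_result;

/* ADDC: Rn + Rm + T -> Rn, carry -> T. */
sh_alu_result sh_addc(uint32_t rn, uint32_t rm, unsigned t)
{
    assert(t <= 1u);
    uint64_t wide = (uint64_t)rn + rm + t;          /* 33-bit sum */
    sh_alu_result r = { (uint32_t)wide, (unsigned)(wide >> 32) };
    return r;
}

/* ADDV: Rn + Rm -> Rn, signed overflow -> T. */
sh_alu_result sh_addv(uint32_t rn, uint32_t rm)
{
    uint32_t sum = rn + rm;
    /* Overflow: operands share a sign and the result sign differs. */
    unsigned overflow = ((~(rn ^ rm) & (rn ^ sum)) >> 31) & 1u;
    sh_alu_result r = { sum, overflow };
    return r;
}

/* SUBC: Rn - Rm - T -> Rn, borrow -> T. */
sh_alu_result sh_subc(uint32_t rn, uint32_t rm, unsigned t)
{
    assert(t <= 1u);
    uint64_t subtrahend = (uint64_t)rm + t;
    sh_alu_result r = { rn - rm - t, subtrahend > rn ? 1u : 0u };
    return r;
}
```

Chaining the T output of one sh_addc call into the next models a multiword addition, which is the usual purpose of an add-with-carry instruction.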
Table 3.6 Logic Operation Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
AND Rm,Rn | Rn & Rm → Rn | 0010nnnnmmmm1001 | — | — | —
AND #imm,R0 | R0 & imm → R0 | 11001001iiiiiiii | — | — | —
AND.B #imm,@(R0,GBR) | (R0 + GBR) & imm → (R0 + GBR) | 11001101iiiiiiii | — | — | —
NOT Rm,Rn | ~Rm → Rn | 0110nnnnmmmm0111 | — | — | —
OR Rm,Rn | Rn|Rm → Rn | 0010nnnnmmmm1011 | — | — | —
OR #imm,R0 | R0|imm → R0 | 11001011iiiiiiii | — | — | —
OR.B #imm,@(R0,GBR) | (R0 + GBR)|imm → (R0 + GBR) | 11001111iiiiiiii | — | — | —
TAS.B @Rn | When (Rn) = 0, 1 → T; otherwise, 0 → T; in both cases, 1 → MSB of (Rn) | 0100nnnn00011011 | — | Test result | —
TST Rm,Rn | Rn & Rm; when result = 0, 1 → T; otherwise, 0 → T | 0010nnnnmmmm1000 | — | Test result | —
TST #imm,R0 | R0 & imm; when result = 0, 1 → T; otherwise, 0 → T | 11001000iiiiiiii | — | Test result | —
TST.B #imm,@(R0,GBR) | (R0 + GBR) & imm; when result = 0, 1 → T; otherwise, 0 → T | 11001100iiiiiiii | — | Test result | —
XOR Rm,Rn | Rn ^ Rm → Rn | 0010nnnnmmmm1010 | — | — | —
XOR #imm,R0 | R0 ^ imm → R0 | 11001010iiiiiiii | — | — | —
XOR.B #imm,@(R0,GBR) | (R0 + GBR) ^ imm → (R0 + GBR) | 11001110iiiiiiii | — | — | —
Table 3.7 Shift Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
ROTL Rn | T ← Rn ← MSB | 0100nnnn00000100 | — | MSB | —
ROTR Rn | LSB → Rn → T | 0100nnnn00000101 | — | LSB | —
ROTCL Rn | T ← Rn ← T | 0100nnnn00100100 | — | MSB | —
ROTCR Rn | T → Rn → T | 0100nnnn00100101 | — | LSB | —
SHAD Rm,Rn | When Rm ≥ 0, Rn << Rm → Rn; when Rm < 0, Rn >> Rm → [MSB → Rn] | 0100nnnnmmmm1100 | — | — | —
SHAL Rn | T ← Rn ← 0 | 0100nnnn00100000 | — | MSB | —
SHAR Rn | MSB → Rn → T | 0100nnnn00100001 | — | LSB | —
SHLD Rm,Rn | When Rm ≥ 0, Rn << Rm → Rn; when Rm < 0, Rn >> Rm → [0 → Rn] | 0100nnnnmmmm1101 | — | — | —
SHLL Rn | T ← Rn ← 0 | 0100nnnn00000000 | — | MSB | —
SHLR Rn | 0 → Rn → T | 0100nnnn00000001 | — | LSB | —
SHLL2 Rn | Rn << 2 → Rn | 0100nnnn00001000 | — | — | —
SHLR2 Rn | Rn >> 2 → Rn | 0100nnnn00001001 | — | — | —
SHLL8 Rn | Rn << 8 → Rn | 0100nnnn00011000 | — | — | —
SHLR8 Rn | Rn >> 8 → Rn | 0100nnnn00011001 | — | — | —
SHLL16 Rn | Rn << 16 → Rn | 0100nnnn00101000 | — | — | —
SHLR16 Rn | Rn >> 16 → Rn | 0100nnnn00101001 | — | — | —
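The dynamic shifts SHAD and SHLD take both direction and amount from Rm. The C sketch below models them under one assumption that table 3.7 does not state: the shift amount is taken from the low 5 bits of Rm, so a negative Rm whose low 5 bits are zero shifts right by 32. Check the individual instruction descriptions before relying on this detail. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* SHLD: logical shift. Rm >= 0 shifts left; Rm < 0 shifts right with 0 fill. */
uint32_t sh_shld(uint32_t rn, int32_t rm)
{
    unsigned amount = (uint32_t)rm & 0x1Fu;   /* assumed 5-bit shift field */
    if (rm >= 0)
        return rn << amount;
    if (amount == 0)
        return 0u;                            /* right shift by 32 */
    return rn >> (32u - amount);
}

/* SHAD: arithmetic shift. Rm >= 0 shifts left; Rm < 0 shifts right,
   filling vacated bits with the MSB of Rn. */
uint32_t sh_shad(uint32_t rn, int32_t rm)
{
    unsigned amount = (uint32_t)rm & 0x1Fu;   /* assumed 5-bit shift field */
    uint32_t sign_fill = (rn & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (rm >= 0)
        return rn << amount;
    if (amount == 0)
        return sign_fill;                     /* right shift by 32 */
    unsigned k = 32u - amount;                /* right shift count, 1..31 */
    assert(k >= 1u && k <= 31u);
    return (rn >> k) | (sign_fill << (32u - k));
}
```

The right shifts are written with unsigned arithmetic and an explicit sign fill because right-shifting a negative signed value is implementation-defined in C.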
Table 3.8 Branch Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
BF label | When T = 0, disp × 2 + PC + 4 → PC; when T = 1, nop | 10001011dddddddd | — | — | —
BF/S label | Delayed branch; when T = 0, disp × 2 + PC + 4 → PC; when T = 1, nop | 10001111dddddddd | — | — | —
BT label | When T = 1, disp × 2 + PC + 4 → PC; when T = 0, nop | 10001001dddddddd | — | — | —
BT/S label | Delayed branch; when T = 1, disp × 2 + PC + 4 → PC; when T = 0, nop | 10001101dddddddd | — | — | —
BRA label | Delayed branch, disp × 2 + PC + 4 → PC | 1010dddddddddddd | — | — | —
BRAF Rn | Delayed branch, Rn + PC + 4 → PC | 0000nnnn00100011 | — | — | —
BSR label | Delayed branch, PC + 4 → PR, disp × 2 + PC + 4 → PC | 1011dddddddddddd | — | — | —
BSRF Rn | Delayed branch, PC + 4 → PR, Rn + PC + 4 → PC | 0000nnnn00000011 | — | — | —
JMP @Rn | Delayed branch, Rn → PC | 0100nnnn00101011 | — | — | —
JSR @Rn | Delayed branch, PC + 4 → PR, Rn → PC | 0100nnnn00001011 | — | — | —
RTS | Delayed branch, PR → PC | 0000000000001011 | — | — | —
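The branch destination formula disp × 2 + PC + 4 in table 3.8 can be worked through in C. This sketch assumes the displacement field is sign-extended before scaling (8 bits for BF, BF/S, BT, and BT/S; 12 bits for BRA and BSR), which the table itself does not spell out. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* BF, BF/S, BT, BT/S: 8-bit displacement in the low byte of the code. */
uint32_t sh_branch_target8(uint32_t pc, uint16_t code)
{
    assert((pc & 1u) == 0u);          /* instructions are 16 bits wide */
    int32_t disp = code & 0xFF;
    disp = (disp ^ 0x80) - 0x80;      /* assumed sign extension */
    return (uint32_t)(disp * 2) + pc + 4u;
}

/* BRA, BSR: 12-bit displacement in the low 12 bits of the code. */
uint32_t sh_branch_target12(uint32_t pc, uint16_t code)
{
    assert((pc & 1u) == 0u);
    int32_t disp = code & 0xFFF;
    disp = (disp ^ 0x800) - 0x800;    /* assumed sign extension */
    return (uint32_t)(disp * 2) + pc + 4u;
}

/* BRAF, BSRF: Rn + PC + 4. */
uint32_t sh_braf_target(uint32_t pc, uint32_t rn)
{
    assert((pc & 1u) == 0u);
    return rn + pc + 4u;
}
```

Under these assumptions, a BT with code H'89FE (displacement field H'FE, that is, -2) at PC = H'1000 branches back to H'1000, and a BRA with code H'A002 at the same address branches to H'1008.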
Table 3.9 System Control Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
CLRMAC | 0 → MACH, MACL | 0000000000101000 | — | — | —
CLRS | 0 → S | 0000000001001000 | — | — | —
CLRT | 0 → T | 0000000000001000 | — | 0 | —
ICBI @Rn | Invalidates instruction cache block indicated by logical address | 0000nnnn11100011 | — | — | New
LDC Rm,SR | Rm → SR | 0100mmmm00001110 | Privileged | LSB | —
LDC Rm,GBR | Rm → GBR | 0100mmmm00011110 | — | — | —
LDC Rm,VBR | Rm → VBR | 0100mmmm00101110 | Privileged | — | —
LDC Rm,SGR | Rm → SGR | 0100mmmm00111010 | Privileged | — | —
LDC Rm,SSR | Rm → SSR | 0100mmmm00111110 | Privileged | — | —
Instruction | Operation | Instruction Code | Privileged | T Bit | New
LDC Rm,SPC | Rm → SPC | 0100mmmm01001110 | Privileged | — | —
LDC Rm,DBR | Rm → DBR | 0100mmmm11111010 | Privileged | — | —
LDC Rm,Rn_BANK | Rm → Rn_BANK (n = 0 to 7) | 0100mmmm1nnn1110 | Privileged | — | —
LDC.L @Rm+,SR | (Rm) → SR, Rm + 4 → Rm | 0100mmmm00000111 | Privileged | LSB | —
LDC.L @Rm+,GBR | (Rm) → GBR, Rm + 4 → Rm | 0100mmmm00010111 | — | — | —
LDC.L @Rm+,VBR | (Rm) → VBR, Rm + 4 → Rm | 0100mmmm00100111 | Privileged | — | —
LDC.L @Rm+,SGR | (Rm) → SGR, Rm + 4 → Rm | 0100mmmm00110110 | Privileged | — | —
LDC.L @Rm+,SSR | (Rm) → SSR, Rm + 4 → Rm | 0100mmmm00110111 | Privileged | — | —
LDC.L @Rm+,SPC | (Rm) → SPC, Rm + 4 → Rm | 0100mmmm01000111 | Privileged | — | —
LDC.L @Rm+,DBR | (Rm) → DBR, Rm + 4 → Rm | 0100mmmm11110110 | Privileged | — | —
LDC.L @Rm+,Rn_BANK | (Rm) → Rn_BANK, Rm + 4 → Rm | 0100mmmm1nnn0111 | Privileged | — | —
LDS Rm,MACH | Rm → MACH | 0100mmmm00001010 | — | — | —
LDS Rm,MACL | Rm → MACL | 0100mmmm00011010 | — | — | —
LDS Rm,PR | Rm → PR | 0100mmmm00101010 | — | — | —
LDS.L @Rm+,MACH | (Rm) → MACH, Rm + 4 → Rm | 0100mmmm00000110 | — | — | —
LDS.L @Rm+,MACL | (Rm) → MACL, Rm + 4 → Rm | 0100mmmm00010110 | — | — | —
LDS.L @Rm+,PR | (Rm) → PR, Rm + 4 → Rm | 0100mmmm00100110 | — | — | —
LDTLB | PTEH/PTEL → TLB | 0000000000111000 | Privileged | — | —
MOVCA.L R0,@Rn | R0 → (Rn) (without fetching cache block) | 0000nnnn11000011 | — | — | —
NOP | No operation | 0000000000001001 | — | — | —
OCBI @Rn | Invalidates operand cache block | 0000nnnn10010011 | — | — | —
OCBP @Rn | Writes back and invalidates operand cache block | 0000nnnn10100011 | — | — | —
OCBWB @Rn | Writes back operand cache block | 0000nnnn10110011 | — | — | —
PREF @Rn | (Rn) → operand cache | 0000nnnn10000011 | — | — | —
PREFI @Rn | Reads 32-byte instruction block into instruction cache | 0000nnnn11010011 | — | — | New
RTE | Delayed branch, SSR/SPC → SR/PC | 0000000000101011 | Privileged | — | —
SETS | 1 → S | 0000000001011000 | — | — | —
SETT | 1 → T | 0000000000011000 | — | 1 | —
SLEEP | Sleep or standby | 0000000000011011 | Privileged | — | —
STC SR,Rn | SR → Rn | 0000nnnn00000010 | Privileged | — | —
STC GBR,Rn | GBR → Rn | 0000nnnn00010010 | — | — | —
Instruction | Operation | Instruction Code | Privileged | T Bit | New
STC VBR,Rn | VBR → Rn | 0000nnnn00100010 | Privileged | — | —
STC SSR,Rn | SSR → Rn | 0000nnnn00110010 | Privileged | — | —
STC SPC,Rn | SPC → Rn | 0000nnnn01000010 | Privileged | — | —
STC SGR,Rn | SGR → Rn | 0000nnnn00111010 | Privileged | — | —
STC DBR,Rn | DBR → Rn | 0000nnnn11111010 | Privileged | — | —
STC Rm_BANK,Rn | Rm_BANK → Rn (m = 0 to 7) | 0000nnnn1mmm0010 | Privileged | — | —
STC.L SR,@-Rn | Rn – 4 → Rn, SR → (Rn) | 0100nnnn00000011 | Privileged | — | —
STC.L GBR,@-Rn | Rn – 4 → Rn, GBR → (Rn) | 0100nnnn00010011 | — | — | —
STC.L VBR,@-Rn | Rn – 4 → Rn, VBR → (Rn) | 0100nnnn00100011 | Privileged | — | —
STC.L SSR,@-Rn | Rn – 4 → Rn, SSR → (Rn) | 0100nnnn00110011 | Privileged | — | —
STC.L SPC,@-Rn | Rn – 4 → Rn, SPC → (Rn) | 0100nnnn01000011 | Privileged | — | —
STC.L SGR,@-Rn | Rn – 4 → Rn, SGR → (Rn) | 0100nnnn00110010 | Privileged | — | —
STC.L DBR,@-Rn | Rn – 4 → Rn, DBR → (Rn) | 0100nnnn11110010 | Privileged | — | —
STC.L Rm_BANK,@-Rn | Rn – 4 → Rn, Rm_BANK → (Rn) (m = 0 to 7) | 0100nnnn1mmm0011 | Privileged | — | —
STS MACH,Rn | MACH → Rn | 0000nnnn00001010 | — | — | —
STS MACL,Rn | MACL → Rn | 0000nnnn00011010 | — | — | —
STS PR,Rn | PR → Rn | 0000nnnn00101010 | — | — | —
STS.L MACH,@-Rn | Rn – 4 → Rn, MACH → (Rn) | 0100nnnn00000010 | — | — | —
STS.L MACL,@-Rn | Rn – 4 → Rn, MACL → (Rn) | 0100nnnn00010010 | — | — | —
STS.L PR,@-Rn | Rn – 4 → Rn, PR → (Rn) | 0100nnnn00100010 | — | — | —
SYNCO | Prevents the next instruction from being issued until instructions issued before this instruction have been completed. | 0000000010101011 | — | — | New
TRAPA #imm | PC + 2 → SPC, SR → SSR, #imm << 2 → TRA, H'160 → EXPEVT, VBR + H'0100 → PC | 11000011iiiiiiii | — | — | —
Table 3.10 Floating-Point Single-Precision Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
FLDI0 FRn | H'0000 0000 → FRn | 1111nnnn10001101 | — | — | —
FLDI1 FRn | H'3F80 0000 → FRn | 1111nnnn10011101 | — | — | —
FMOV FRm,FRn | FRm → FRn | 1111nnnnmmmm1100 | — | — | —
FMOV.S @Rm,FRn | (Rm) → FRn | 1111nnnnmmmm1000 | — | — | —
FMOV.S @(R0,Rm),FRn | (R0 + Rm) → FRn | 1111nnnnmmmm0110 | — | — | —
FMOV.S @Rm+,FRn | (Rm) → FRn, Rm + 4 → Rm | 1111nnnnmmmm1001 | — | — | —
FMOV.S FRm,@Rn | FRm → (Rn) | 1111nnnnmmmm1010 | — | — | —
FMOV.S FRm,@-Rn | Rn-4 → Rn, FRm → (Rn) | 1111nnnnmmmm1011 | — | — | —
FMOV.S FRm,@(R0,Rn) | FRm → (R0 + Rn) | 1111nnnnmmmm0111 | — | — | —
FMOV DRm,DRn | DRm → DRn | 1111nnn0mmm01100 | — | — | —
FMOV @Rm,DRn | (Rm) → DRn | 1111nnn0mmmm1000 | — | — | —
FMOV @(R0,Rm),DRn | (R0 + Rm) → DRn | 1111nnn0mmmm0110 | — | — | —
FMOV @Rm+,DRn | (Rm) → DRn, Rm + 8 → Rm | 1111nnn0mmmm1001 | — | — | —
FMOV DRm,@Rn | DRm → (Rn) | 1111nnnnmmm01010 | — | — | —
FMOV DRm,@-Rn | Rn-8 → Rn, DRm → (Rn) | 1111nnnnmmm01011 | — | — | —
FMOV DRm,@(R0,Rn) | DRm → (R0 + Rn) | 1111nnnnmmm00111 | — | — | —
FLDS FRm,FPUL | FRm → FPUL | 1111mmmm00011101 | — | — | —
FSTS FPUL,FRn | FPUL → FRn | 1111nnnn00001101 | — | — | —
FABS FRn | FRn & H'7FFF FFFF → FRn | 1111nnnn01011101 | — | — | —
FADD FRm,FRn | FRn + FRm → FRn | 1111nnnnmmmm0000 | — | — | —
FCMP/EQ FRm,FRn | When FRn = FRm, 1 → T; otherwise, 0 → T | 1111nnnnmmmm0100 | — | Comparison result | —
FCMP/GT FRm,FRn | When FRn > FRm, 1 → T; otherwise, 0 → T | 1111nnnnmmmm0101 | — | Comparison result | —
FDIV FRm,FRn | FRn/FRm → FRn | 1111nnnnmmmm0011 | — | — | —
FLOAT FPUL,FRn | (float) FPUL → FRn | 1111nnnn00101101 | — | — | —
FMAC FR0,FRm,FRn | FR0*FRm + FRn → FRn | 1111nnnnmmmm1110 | — | — | —
FMUL FRm,FRn | FRn*FRm → FRn | 1111nnnnmmmm0010 | — | — | —
FNEG FRn | FRn ^ H'8000 0000 → FRn | 1111nnnn01001101 | — | — | —
FSQRT FRn | √FRn → FRn | 1111nnnn01101101 | — | — | —
FSUB FRm,FRn | FRn – FRm → FRn | 1111nnnnmmmm0001 | — | — | —
FTRC FRm,FPUL | (long) FRm → FPUL | 1111mmmm00111101 | — | — | —
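Table 3.10 defines FABS and FNEG as bit operations on the register contents (AND with H'7FFF FFFF, exclusive-OR with H'8000 0000), and FLDI1 as loading the pattern H'3F80 0000. The C sketch below reproduces those operations on a host machine. It assumes the host float type is a 32-bit IEEE 754 single-precision value, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a single-precision value. */
uint32_t sh_float_bits(float x)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Build a single-precision value from a raw 32-bit pattern. */
float sh_float_from_bits(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* FABS FRn: FRn & H'7FFFFFFF -> FRn (clears the sign bit). */
float sh_fabs_model(float x)
{
    return sh_float_from_bits(sh_float_bits(x) & 0x7FFFFFFFu);
}

/* FNEG FRn: FRn ^ H'80000000 -> FRn (inverts the sign bit). */
float sh_fneg_model(float x)
{
    return sh_float_from_bits(sh_float_bits(x) ^ 0x80000000u);
}
```

On such a host, the bit pattern of 1.0f is H'3F80 0000, which is the constant loaded by FLDI1.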
Table 3.11 Floating-Point Double-Precision Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
FABS DRn | DRn & H'7FFF FFFF FFFF FFFF → DRn | 1111nnn001011101 | — | — | —
FADD DRm,DRn | DRn + DRm → DRn | 1111nnn0mmm00000 | — | — | —
FCMP/EQ DRm,DRn | When DRn = DRm, 1 → T; otherwise, 0 → T | 1111nnn0mmm00100 | — | Comparison result | —
FCMP/GT DRm,DRn | When DRn > DRm, 1 → T; otherwise, 0 → T | 1111nnn0mmm00101 | — | Comparison result | —
FDIV DRm,DRn | DRn/DRm → DRn | 1111nnn0mmm00011 | — | — | —
FCNVDS DRm,FPUL | double_to_float(DRm) → FPUL | 1111mmm010111101 | — | — | —
FCNVSD FPUL,DRn | float_to_double(FPUL) → DRn | 1111nnn010101101 | — | — | —
FLOAT FPUL,DRn | (float)FPUL → DRn | 1111nnn000101101 | — | — | —
FMUL DRm,DRn | DRn*DRm → DRn | 1111nnn0mmm00010 | — | — | —
FNEG DRn | DRn ^ H'8000 0000 0000 0000 → DRn | 1111nnn001001101 | — | — | —
FSQRT DRn | √DRn → DRn | 1111nnn001101101 | — | — | —
FSUB DRm,DRn | DRn – DRm → DRn | 1111nnn0mmm00001 | — | — | —
FTRC DRm,FPUL | (long) DRm → FPUL | 1111mmm000111101 | — | — | —
Table 3.12 Floating-Point Control Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
LDS Rm,FPSCR | Rm → FPSCR | 0100mmmm01101010 | — | — | —
LDS Rm,FPUL | Rm → FPUL | 0100mmmm01011010 | — | — | —
LDS.L @Rm+,FPSCR | (Rm) → FPSCR, Rm + 4 → Rm | 0100mmmm01100110 | — | — | —
LDS.L @Rm+,FPUL | (Rm) → FPUL, Rm + 4 → Rm | 0100mmmm01010110 | — | — | —
STS FPSCR,Rn | FPSCR → Rn | 0000nnnn01101010 | — | — | —
STS FPUL,Rn | FPUL → Rn | 0000nnnn01011010 | — | — | —
STS.L FPSCR,@-Rn | Rn – 4 → Rn, FPSCR → (Rn) | 0100nnnn01100010 | — | — | —
STS.L FPUL,@-Rn | Rn – 4 → Rn, FPUL → (Rn) | 0100nnnn01010010 | — | — | —
Table 3.13 Floating-Point Graphics Acceleration Instructions
Instruction | Operation | Instruction Code | Privileged | T Bit | New
FMOV DRm,XDn | DRm → XDn | 1111nnn1mmm01100 | — | — | —
FMOV XDm,DRn | XDm → DRn | 1111nnn0mmm11100 | — | — | —
FMOV XDm,XDn | XDm → XDn | 1111nnn1mmm11100 | — | — | —
FMOV @Rm,XDn | (Rm) → XDn | 1111nnn1mmmm1000 | — | — | —
FMOV @Rm+,XDn | (Rm) → XDn, Rm + 8 → Rm | 1111nnn1mmmm1001 | — | — | —
FMOV @(R0,Rm),XDn | (R0 + Rm) → XDn | 1111nnn1mmmm0110 | — | — | —
FMOV XDm,@Rn | XDm → (Rn) | 1111nnnnmmm11010 | — | — | —
FMOV XDm,@-Rn | Rn – 8 → Rn, XDm → (Rn) | 1111nnnnmmm11011 | — | — | —
FMOV XDm,@(R0,Rn) | XDm → (R0 + Rn) | 1111nnnnmmm10111 | — | — | —
FIPR FVm,FVn | inner_product (FVm, FVn) → FR[n+3] | 1111nnmm11101101 | — | — | —
FTRV XMTRX,FVn | transform_vector (XMTRX, FVn) → FVn | 1111nn0111111101 | — | — | —
FRCHG | ~FPSCR.FR → FPSCR.FR | 1111101111111101 | — | — | —
FSCHG | ~FPSCR.SZ → FPSCR.SZ | 1111001111111101 | — | — | —
FPCHG | ~FPSCR.PR → FPSCR.PR | 1111011111111101 | — | — | New
FSRRA FRn | 1/sqrt(FRn) → FRn | 1111nnnn01111101 | — | — | New
FSCA FPUL,DRn | sin(FPUL) → FRn, cos(FPUL) → FR[n + 1] | 1111nnn011111101 | — | — | New
Note: * sqrt(FRn) is the square root of FRn.

Section 4 Pipelining

The SH-4A is a 2-ILP (instruction-level-parallelism) superscalar pipelining microprocessor. Instruction execution is pipelined, and two instructions can be executed in parallel.

4.1 Pipelines

Figure 4.1 shows the basic pipelines. Normally, a pipeline consists of seven stages: instruction fetch (I1/I2), decode and register read (ID), execution (E1/E2/E3), and write-back (WB). An instruction is executed as a combination of basic pipelines.
Figure 4.1 (diagram not reproduced) shows the following six basic pipelines and the functions performed in them:
1. General Pipeline: I1 I2 ID E1 E2 E3 WB (instruction fetch; instruction decode, issue, register read; forwarding; operation; write-back)
2. General Load/Store Pipeline: I1 I2 ID E1 E2 E3 WB (instruction fetch; instruction decode, issue, register read; address calculation; memory data access; write-back)
3. Special Pipeline: I1 I2 ID E1 E2 E3 WB (instruction fetch; instruction decode, issue, register read; forwarding; operation; write-back)
4. Special Load/Store Pipeline: I1 I2 ID E1 E2 E3 WB (instruction fetch; instruction decode, issue, register read)
5. Floating-Point Pipeline: I1 I2 ID FS1 FS2 FS3 FS4 FS (instruction fetch; instruction decode, issue, register read, forwarding; operation; write-back)
6. Floating-Point Extended Pipeline: I1 I2 ID FE1 FE2 FE3 FE4 FE5 FE6 FS (instruction fetch; instruction decode, issue, register read, forwarding; operation; write-back)
Figure 4.1 Basic Pipelines
Figure 4.2 shows the instruction execution patterns. Representations in figure 4.2 and their descriptions are listed in table 4.1.
Table 4.1 Representations of Instruction Execution Patterns
Representation | Description
E1 E2 E3 WB | CPU EX pipe is occupied
S1 S2 S3 WB | CPU LS pipe is occupied (with memory access)
s1 s2 s3 WB | CPU LS pipe is occupied (without memory access)
E1/S1 | Either CPU EX pipe or CPU LS pipe is occupied
E1S1, E1s1 | Both CPU EX pipe and CPU LS pipe are occupied
M2 M3 MS | CPU MULT operation unit is occupied
FE1 FE2 FE3 FE4 FE5 FE6 FS | FPU-EX pipe is occupied
FS1 FS2 FS3 FS4 FS | FPU-LS pipe is occupied
ID | ID stage is locked
(graphical marking, not reproduced) | Both CPU and FPU pipes are occupied
Figure 4.2 (1) (pipeline diagrams not reproduced) covers the following patterns:
(1-1) BF, BF/S, BT, BT/S, BRA, BSR: 1 issue cycle + 0 to 2 branch cycles
Note: In branch instructions that are categorized as (1-1), the number of branch cycles may be reduced by prefetching.
(1-2) JSR, JMP, BRAF, BSRF: 1 issue cycle + 3 branch cycles
(1-3) RTS: 1 issue cycle + 0 to 3 branch cycles
Note: The number of branch cycles may be 0 as a result of instruction prefetching.
(1-4) RTE: 4 issue cycles + 1 branch cycle
(1-5) TRAPA: 8 issue cycles + 5 cycles + 1 branch cycle
Note: It takes 14 cycles to reach the ID stage of the first instruction of the exception handler.
(1-6) SLEEP: 2 issue cycles
Note: The number of cycles until the clock is halted is not constant.
Figure 4.2 Instruction Execution Patterns (1)
Figure 4.2 (2) (pipeline diagrams not reproduced) covers the following patterns:
(2-1) 1-step operation (EX type): 1 issue cycle
EXT[SU].[BW], MOVT, SWAP, XTRCT, ADD*, CMP*, DIV*, DT, NEG*, SUB*, AND, AND#, NOT, OR, OR#, TST, TST#, XOR, XOR#, ROT*, SHA*, SHL*, CLRS, CLRT, SETS, SETT
Note: Except for AND#, OR#, TST#, and XOR# instructions using GBR relative addressing mode
(2-2) 1-step operation (LS type): 1 issue cycle
MOVA
(2-3) 1-step operation (MT type): 1 issue cycle
MOV#, NOP
(2-4) MOV (MT type): 1 issue cycle
MOV
Figure 4.2 Instruction Execution Patterns (2)
Figure 4.2 (3) (pipeline diagrams not reproduced) covers the following patterns:
(3-1) Load/store: 1 issue cycle
MOV.[BWL], MOV.[BWL] @(d,GBR)
(3-2) AND.B, OR.B, XOR.B, TST.B: 3 issue cycles
(3-3) TAS.B: 4 issue cycles
(3-4) PREF, OCBI, OCBP, OCBWB, MOVCA.L, SYNCO: 1 issue cycle
(3-5) LDTLB: 1 issue cycle
(3-6) ICBI: 8 issue cycles + 5 cycles (min.) + 3 branch cycles (branch to the instruction following ICBI)
(3-7) PREFI: 5 issue cycles + 5 cycles (min.) + 3 branch cycles (branch to the instruction following PREFI)
(3-8) MOVLI.L: 1 issue cycle
(3-9) MOVCO.L: 1 issue cycle
(3-10) MOVUA.L: 2 issue cycles
Figure 4.2 Instruction Execution Patterns (3)
Figure 4.2 (4) (pipeline diagrams not reproduced) covers the following patterns:
(4-1) LDC to Rp_BANK/SSR/SPC/VBR: 1 issue cycle
(4-2) LDC to DBR/SGR: 4 issue cycles
(4-3) LDC to GBR: 1 issue cycle
(4-4) LDC to SR: 4 issue cycles + 3 branch cycles (branch to the next instruction)
(4-5) LDC.L to Rp_BANK/SSR/SPC/VBR: 1 issue cycle
(4-6) LDC.L to DBR/SGR: 4 issue cycles
(4-7) LDC.L to GBR: 1 issue cycle
(4-8) LDC.L to SR: 6 issue cycles + 3 branch cycles (branch to the next instruction)
Figure 4.2 Instruction Execution Patterns (4)
Figure 4.2 (5) (pipeline diagrams not reproduced) covers the following patterns:
(4-9) STC from DBR/GBR/Rp_BANK/SSR/SPC/VBR/SGR: 1 issue cycle
(4-10) STC from SR: 1 issue cycle
(4-11) STC.L from DBR/GBR/Rp_BANK/SSR/SPC/VBR/SGR: 1 issue cycle
(4-12) STC.L from SR: 1 issue cycle
(4-13) LDS to PR: 1 issue cycle
(4-14) LDS.L to PR: 1 issue cycle
(4-15) STS from PR: 1 issue cycle
(4-16) STS.L from PR: 1 issue cycle
(4-17) BSRF, BSR, JSR delay slot instructions (PR set): 0 issue cycle
Note: The value of PR is changed in the E3 stage of the delay slot instruction. When the STS and STS.L instructions from PR are used as delay slot instructions, the changed PR value is used.
Figure 4.2 Instruction Execution Patterns (5)
Figure 4.2 (6) (pipeline diagrams not reproduced) covers the following patterns:
(5-1) LDS to MACH/L: 1 issue cycle
(5-2) LDS.L to MACH/L: 1 issue cycle
(5-3) STS from MACH/L: 1 issue cycle
(5-4) STS.L from MACH/L: 1 issue cycle
(5-5) MULS.W, MULU.W: 1 issue cycle
(5-6) DMULS.L, DMULU.L, MUL.L: 1 issue cycle
(5-7) CLRMAC: 1 issue cycle
(5-8) MAC.W: 2 issue cycles
(5-9) MAC.L: 2 issue cycles
Figure 4.2 Instruction Execution Patterns (6)
Figure 4.2 (7) (pipeline diagrams not reproduced) covers the following patterns:
(6-1) LDS to FPUL: 1 issue cycle
(6-2) STS from FPUL: 1 issue cycle
(6-3) LDS.L to FPUL: 1 issue cycle
(6-4) STS.L from FPUL: 1 issue cycle
(6-5) LDS to FPSCR: 1 issue cycle
(6-6) STS from FPSCR: 1 issue cycle
(6-7) LDS.L to FPSCR: 1 issue cycle
(6-8) STS.L from FPSCR: 1 issue cycle
(6-9) FPU load/store instruction FMOV: 1 issue cycle
(6-10) FLDS: 1 issue cycle
(6-11) FSTS: 1 issue cycle
Figure 4.2 Instruction Execution Patterns (7)
Figure 4.2 (8) (pipeline diagrams not reproduced) covers the following patterns:
(6-12) Single-precision FABS, FNEG/double-precision FABS, FNEG: 1 issue cycle
(6-13) FLDI0, FLDI1: 1 issue cycle
(6-14) Single-precision floating-point computation: 1 issue cycle
FCMP/EQ, FCMP/GT, FADD, FLOAT, FMAC, FMUL, FSUB, FTRC, FRCHG, FSCHG, FPCHG
(6-15) Single-precision FDIV/FSQRT: 1 issue cycle (the diagram marks the divider occupied cycles, FEDS)
(6-16) Double-precision floating-point computation: 1 issue cycle
FCMP/EQ, FCMP/GT, FADD, FLOAT, FSUB, FTRC, FCNVSD, FCNVDS
(6-17) Double-precision floating-point computation: 1 issue cycle
FMUL
(6-18) Double-precision FDIV/FSQRT: 1 issue cycle (the diagram marks the divider occupied cycles, FEDS)
Figure 4.2 Instruction Execution Patterns (8)
Figure 4.2 (9) (pipeline diagrams not reproduced) covers the following patterns; the diagrams also mark FEPL stages and function computing unit occupied cycles:
(6-19) FIPR: 1 issue cycle
(6-20) FTRV: 1 issue cycle
(6-21) FSRRA: 1 issue cycle
(6-22) FSCA: 1 issue cycle
Figure 4.2 Instruction Execution Patterns (9)

4.2 Parallel-Executability

Instructions are categorized into six groups according to the internal function blocks used, as shown in table 4.2. Table 4.3 shows the parallel-executability of pairs of instructions in terms of groups. For example, ADD in the EX group and BRA in the BR group can be executed in parallel.
Table 4.2 Instruction Groups
Instruction Group | Instructions
EX | ADD; ADDC; ADDV; AND #imm,R0; AND Rm,Rn; CLRMAC; CLRS; CLRT; CMP; DIV0S; DIV0U; DIV1; DMULS.L; DMULU.L; DT; EXTS; EXTU; MOVT; MUL.L; MULS.W; MULU.W; NEG; NEGC; NOT; OR #imm,R0; OR Rm,Rn; ROTCL; ROTCR; ROTL; ROTR; SETS; SETT; SHAD; SHAL; SHAR; SHLD; SHLL; SHLL2; SHLL8; SHLL16; SHLR; SHLR2; SHLR8; SHLR16; SUB; SUBC; SUBV; SWAP; TST #imm,R0; TST Rm,Rn; XOR #imm,R0; XOR Rm,Rn; XTRCT
MT | MOV #imm,Rn; MOV Rm,Rn; NOP
BR | BF; BF/S; BRA; BRAF; BSR; BSRF; BT; BT/S; JMP; JSR; RTS
LS | FABS; FNEG; FLDI0; FLDI1; FLDS; FMOV @adr,FR; FMOV FR,@adr; FMOV FR,FR; FMOV.S @adr,FR; FMOV.S FR,@adr; FSTS; LDC Rm,CR1; LDC.L @Rm+,CR1; LDS Rm,SR1; LDS Rm,SR2; LDS.L @adr,SR2; LDS.L @Rm+,SR1; LDS.L @Rm+,SR2; MOV.[BWL] @adr,R; MOV.[BWL] R,@adr; MOVA; MOVCA.L; MOVUA; OCBI; OCBP; OCBWB; PREF; STC CR2,Rn; STC.L CR2,@-Rn; STS SR2,Rn; STS.L SR2,@-Rn; STS SR1,Rn; STS.L SR1,@-Rn
FE | FADD; FSUB; FCMP (S/D); FCNVDS; FCNVSD; FDIV; FIPR; FLOAT; FMAC; FMUL; FRCHG; FSCHG; FSQRT; FTRC; FTRV; FSCA; FSRRA; FPCHG
CO | AND.B #imm,@(R0,GBR); ICBI; LDC Rm,DBR; LDC Rm,SGR; LDC Rm,SR; LDC.L @Rm+,DBR; LDC.L @Rm+,SGR; LDC.L @Rm+,SR; LDTLB; MAC.L; MAC.W; MOVCO; MOVLI; OR.B #imm,@(R0,GBR); PREFI; RTE; SLEEP; STC SR,Rn; STC.L SR,@-Rn; SYNCO; TAS.B; TRAPA; TST.B #imm,@(R0,GBR); XOR.B #imm,@(R0,GBR)
[Legend]
R: Rm/Rn
@adr: Address
SR1: MACH/MACL/PR
SR2: FPUL/FPSCR
CR1: GBR/Rp_BANK/SPC/SSR/VBR
CR2: CR1/DBR/SGR
FR: FRm/FRn/DRm/DRn/XDm/XDn
Two instructions can be executed in parallel under the following conditions.
1. Both addr (preceding instruction) and addr+2 (following instruction) lie within the same minimum-size page (1 Kbyte).
2. The combination of the two instructions is supported in table 4.3, Combination of Preceding and Following Instructions.
3. Data used by the instruction at addr does not conflict with data used by a previous instruction.
4. Data used by the instruction at addr+2 does not conflict with data used by a previous instruction.
5. Both instructions are valid.
Table 4.3 Combination of Preceding and Following Instructions
Columns: preceding instruction (addr). Rows: following instruction (addr+2).
Following \ Preceding | EX | MT | BR | LS | FE | CO
EX | No | Yes | Yes | Yes | Yes | No
MT | Yes | Yes | Yes | Yes | Yes | No
BR | Yes | Yes | No | Yes | Yes | No
LS | Yes | Yes | Yes | No | Yes | No
FE | Yes | Yes | Yes | Yes | No | No
CO | No | No | No | No | No | No
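The group combination rule of table 4.3 is regular enough to express as a small predicate. The C sketch below is one possible reading: it assumes that a CO-group instruction never pairs with anything, that an MT-group instruction pairs with any non-CO instruction, and that two instructions from the same non-MT group never pair. It models only the group condition; the page, data conflict, and validity conditions are not checked. The type and function names are hypothetical.

```c
#include <assert.h>

/* Instruction groups from table 4.2. */
typedef enum { SH_EX, SH_MT, SH_BR, SH_LS, SH_FE, SH_CO } sh_group;

/* Returns 1 if table 4.3 allows the preceding/following group pair
   to be executed in parallel, 0 otherwise. */
int sh_groups_can_pair(sh_group preceding, sh_group following)
{
    assert(preceding >= SH_EX && preceding <= SH_CO);
    assert(following >= SH_EX && following <= SH_CO);
    if (preceding == SH_CO || following == SH_CO)
        return 0;                 /* CO never executes in parallel */
    if (preceding == SH_MT || following == SH_MT)
        return 1;                 /* MT pairs with any non-CO group */
    return preceding != following; /* same-group pairs are not allowed */
}
```

For example, sh_groups_can_pair(SH_EX, SH_BR) returns 1, matching the case of ADD (EX group) followed by BRA (BR group).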

4.3 Issue Rates and Execution Cycles

Instruction execution cycles are summarized in table 4.4. The Instruction Group column in table 4.4 corresponds to the groups in table 4.2. Penalty cycles due to pipeline stalls are not included in the issue rates and execution cycles given in this section.
1. Issue Rate
The issue rate is the issue period between one instruction and the next instruction.
E.g. AND.B instruction: issue rate 3
E.g. MAC.W instruction: issue rate 2
2. Execution Cycles
The execution cycles value is the number of cycles for which an instruction occupies the pipeline, counted as shown in the following examples.
CPU instruction
E.g. AND.B instruction: execution cycles 3
E.g. MAC.W instruction: execution cycles 4
FPU instruction
E.g. FMUL instruction: execution cycles 3
E.g. FDIV instruction: execution cycles 14 (the diagram marks the divider occupation cycle)
(The pipeline diagrams for these examples are not reproduced.)
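As a rough illustration of how issue rates combine, the C sketch below sums the issue rates of a straight-line instruction sequence. It assumes that no two instructions are issued in parallel and that there are no pipeline stalls, so the result is only a simple serial estimate and not a cycle-accurate model. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of issue rates for a sequence issued strictly one at a time:
   the number of cycles from the issue of the first instruction to the
   point where the instruction after the last one can be issued. */
unsigned sh_serial_issue_cycles(const unsigned *issue_rates, size_t count)
{
    assert(issue_rates != NULL || count == 0);
    unsigned total = 0;
    for (size_t i = 0; i < count; i++)
        total += issue_rates[i];
    return total;
}
```

For a sequence of AND.B (issue rate 3), MAC.W (issue rate 2), and ADD (issue rate 1), this estimate gives 6 cycles.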
Table 4.4 Issue Rates and Execution Cycles
No. | Instruction | Instruction Group | Issue Rate | Execution Cycles | Execution Pattern
Data transfer instructions
1 | EXTS.B Rm,Rn | EX | 1 | 1 | 2-1
2 | EXTS.W Rm,Rn | EX | 1 | 1 | 2-1
3 | EXTU.B Rm,Rn | EX | 1 | 1 | 2-1
4 | EXTU.W Rm,Rn | EX | 1 | 1 | 2-1
5 | MOV Rm,Rn | MT | 1 | 1 | 2-4
6 | MOV #imm,Rn | MT | 1 | 1 | 2-3
7 | MOVA @(disp,PC),R0 | LS | 1 | 1 | 2-2
8 | MOV.W @(disp,PC),Rn | LS | 1 | 1 | 3-1
9 | MOV.L @(disp,PC),Rn | LS | 1 | 1 | 3-1
10 | MOV.B @Rm,Rn | LS | 1 | 1 | 3-1
11 | MOV.W @Rm,Rn | LS | 1 | 1 | 3-1
12 | MOV.L @Rm,Rn | LS | 1 | 1 | 3-1
13 | MOV.B @Rm+,Rn | LS | 1 | 1 | 3-1
14 | MOV.W @Rm+,Rn | LS | 1 | 1 | 3-1
15 | MOV.L @Rm+,Rn | LS | 1 | 1 | 3-1
16 | MOV.B @(disp,Rm),R0 | LS | 1 | 1 | 3-1
17 | MOV.W @(disp,Rm),R0 | LS | 1 | 1 | 3-1
18 | MOV.L @(disp,Rm),Rn | LS | 1 | 1 | 3-1
19 | MOV.B @(R0,Rm),Rn | LS | 1 | 1 | 3-1
20 | MOV.W @(R0,Rm),Rn | LS | 1 | 1 | 3-1
21 | MOV.L @(R0,Rm),Rn | LS | 1 | 1 | 3-1
22 | MOV.B @(disp,GBR),R0 | LS | 1 | 1 | 3-1
23 | MOV.W @(disp,GBR),R0 | LS | 1 | 1 | 3-1
24 | MOV.L @(disp,GBR),R0 | LS | 1 | 1 | 3-1
25 | MOV.B Rm,@Rn | LS | 1 | 1 | 3-1
26 | MOV.W Rm,@Rn | LS | 1 | 1 | 3-1
27 | MOV.L Rm,@Rn | LS | 1 | 1 | 3-1
28 | MOV.B Rm,@-Rn | LS | 1 | 1 | 3-1
29 | MOV.W Rm,@-Rn | LS | 1 | 1 | 3-1
30 | MOV.L Rm,@-Rn | LS | 1 | 1 | 3-1
31 | MOV.B R0,@(disp,Rn) | LS | 1 | 1 | 3-1
Functional Category
Data transfer instructions
Fixed-point arithmetic instructions
Instruction
No. Instruction
32 MOV.W R0,@(disp,Rn) LS 1 1 3-1
33 MOV.L Rm,@(disp,Rn) LS 1 1 3-1
34 MOV.B Rm,@(R0,Rn) LS 1 1 3-1
35 MOV.W Rm,@(R0,Rn) LS 1 1 3-1
36 MOV.L Rm,@(R0,Rn) LS 1 1 3-1
37 MOV.B R0,@(disp,GBR) LS 1 1 3-1
38 MOV.W R0,@(disp,GBR) LS 1 1 3-1
39 MOV.L R0,@(disp,GBR) LS 1 1 3-1
40 MOVCA.L R0,@Rn LS 1 1 3-4
41 MOVCO.L R0,@Rn CO 1 1 3-9
42 MOVLI.L @Rm,R0 CO 1 1 3-8
43 MOVUA.L @Rm,R0 LS 2 2 3-10
44 MOVUA.L @Rm+,R0 LS 2 2 3-10
45 MOVT Rn EX 1 1 2-1
46 OCBI @Rn LS 1 1 3-4
47 OCBP @Rn LS 1 1 3-4
48 OCBWB @Rn LS 1 1 3-4
49 PREF @Rn LS 1 1 3-4
50 SWAP.B Rm,Rn EX 1 1 2-1
51 SWAP.W Rm,Rn EX 1 1 2-1
52 XTRCT Rm,Rn EX 1 1 2-1
53 ADD Rm,Rn EX 1 1 2-1
54 ADD #imm,Rn EX 1 1 2-1
55 ADDC Rm,Rn EX 1 1 2-1
56 ADDV Rm,Rn EX 1 1 2-1
57 CMP/EQ #imm,R0 EX 1 1 2-1
58 CMP/EQ Rm,Rn EX 1 1 2-1
59 CMP/GE Rm,Rn EX 1 1 2-1
60 CMP/GT Rm,Rn EX 1 1 2-1
61 CMP/HI Rm,Rn EX 1 1 2-1
62 CMP/HS Rm,Rn EX 1 1 2-1
Functional Category: Fixed-point arithmetic instructions; Logical instructions
63 CMP/PL Rn EX 1 1 2-1
64 CMP/PZ Rn EX 1 1 2-1
65 CMP/STR Rm,Rn EX 1 1 2-1
66 DIV0S Rm,Rn EX 1 1 2-1
67 DIV0U EX 1 1 2-1
68 DIV1 Rm,Rn EX 1 1 2-1
69 DMULS.L Rm,Rn EX 1 2 5-6
70 DMULU.L Rm,Rn EX 1 2 5-6
71 DT Rn EX 1 1 2-1
72 MAC.L @Rm+,@Rn+ CO 2 5 5-9
73 MAC.W @Rm+,@Rn+ CO 2 4 5-8
74 MUL.L Rm,Rn EX 1 2 5-6
75 MULS.W Rm,Rn EX 1 1 5-5
76 MULU.W Rm,Rn EX 1 1 5-5
77 NEG Rm,Rn EX 1 1 2-1
78 NEGC Rm,Rn EX 1 1 2-1
79 SUB Rm,Rn EX 1 1 2-1
80 SUBC Rm,Rn EX 1 1 2-1
81 SUBV Rm,Rn EX 1 1 2-1
82 AND Rm,Rn EX 1 1 2-1
83 AND #imm,R0 EX 1 1 2-1
84 AND.B #imm,@(R0,GBR) CO 3 3 3-2
85 NOT Rm,Rn EX 1 1 2-1
86 OR Rm,Rn EX 1 1 2-1
87 OR #imm,R0 EX 1 1 2-1
88 OR.B #imm,@(R0,GBR) CO 3 3 3-2
89 TAS.B @Rn CO 4 4 3-3
90 TST Rm,Rn EX 1 1 2-1
91 TST #imm,R0 EX 1 1 2-1
92 TST.B #imm,@(R0,GBR) CO 3 3 3-2
93 XOR Rm,Rn EX 1 1 2-1
94 XOR #imm,R0 EX 1 1 2-1
Functional Category: Logical instructions; Shift instructions; Branch instructions; System control instructions
95 XOR.B #imm,@(R0,GBR) CO 3 3 3-2
96 ROTL Rn EX 1 1 2-1
97 ROTR Rn EX 1 1 2-1
98 ROTCL Rn EX 1 1 2-1
99 ROTCR Rn EX 1 1 2-1
100 SHAD Rm,Rn EX 1 1 2-1
101 SHAL Rn EX 1 1 2-1
102 SHAR Rn EX 1 1 2-1
103 SHLD Rm,Rn EX 1 1 2-1
104 SHLL Rn EX 1 1 2-1
105 SHLL2 Rn EX 1 1 2-1
106 SHLL8 Rn EX 1 1 2-1
107 SHLL16 Rn EX 1 1 2-1
108 SHLR Rn EX 1 1 2-1
109 SHLR2 Rn EX 1 1 2-1
110 SHLR8 Rn EX 1 1 2-1
111 SHLR16 Rn EX 1 1 2-1
112 BF disp BR 1+0 to 2 1 1-1
113 BF/S disp BR 1+0 to 2 1 1-1
114 BT disp BR 1+0 to 2 1 1-1
115 BT/S disp BR 1+0 to 2 1 1-1
116 BRA disp BR 1+0 to 2 1 1-1
117 BRAF Rm BR 1+3 1 1-2
118 BSR disp BR 1+0 to 2 1 1-1
119 BSRF Rm BR 1+3 1 1-2
120 JMP @Rn BR 1+3 1 1-2
121 JSR @Rn BR 1+3 1 1-2
122 RTS BR 1+0 to 3 1 1-3
123 NOP MT 1 1 2-3
124 CLRMAC EX 1 1 5-7
125 CLRS EX 1 1 2-1
Functional Category: System control instructions
126 CLRT EX 1 1 2-1
127 ICBI @Rn CO 8+5+3 13 3-6
128 SETS EX 1 1 2-1
129 SETT EX 1 1 2-1
130 PREFI CO 5+5+3 10 3-7
131 SYNCO @Rn CO Undefined Undefined 3-4
132 TRAPA #imm CO 8+5+1 13 1-5
133 RTE CO 4+1 4 1-4
134 SLEEP CO Undefined Undefined 1-6
135 LDTLB CO 1 1 3-5
136 LDC Rm,DBR CO 4 4 4-2
137 LDC Rm,SGR CO 4 4 4-2
138 LDC Rm,GBR LS 1 1 4-3
139 LDC Rm,Rp_BANK LS 1 1 4-1
140 LDC Rm,SR CO 4+3 4 4-4
141 LDC Rm,SSR LS 1 1 4-1
142 LDC Rm,SPC LS 1 1 4-1
143 LDC Rm,VBR LS 1 1 4-1
144 LDC.L @Rm+,DBR CO 4 4 4-6
145 LDC.L @Rm+,SGR CO 4 4 4-6
146 LDC.L @Rm+,GBR LS 1 1 4-7
147 LDC.L @Rm+,Rp_BANK LS 1 1 4-5
148 LDC.L @Rm+,SR CO 6+3 4 4-8
149 LDC.L @Rm+,SSR LS 1 1 4-5
150 LDC.L @Rm+,SPC LS 1 1 4-5
151 LDC.L @Rm+,VBR LS 1 1 4-5
152 LDS Rm,MACH LS 1 1 5-1
153 LDS Rm,MACL LS 1 1 5-1
154 LDS Rm,PR LS 1 1 4-13
155 LDS.L @Rm+,MACH LS 1 1 5-2
156 LDS.L @Rm+,MACL LS 1 1 5-2
157 LDS.L @Rm+,PR LS 1 1 4-14
Functional Category: System control instructions; Single-precision floating-point instructions
158 STC DBR,Rn LS 1 1 4-9
159 STC SGR,Rn LS 1 1 4-9
160 STC GBR,Rn LS 1 1 4-9
161 STC Rp_BANK,Rn LS 1 1 4-9
162 STC SR,Rn CO 1 1 4-10
163 STC SSR,Rn LS 1 1 4-9
164 STC SPC,Rn LS 1 1 4-9
165 STC VBR,Rn LS 1 1 4-9
166 STC.L DBR,@-Rn LS 1 1 4-11
167 STC.L SGR,@-Rn LS 1 1 4-11
168 STC.L GBR,@-Rn LS 1 1 4-11
169 STC.L Rp_BANK,@-Rn LS 1 1 4-11
170 STC.L SR,@-Rn CO 1 1 4-12
171 STC.L SSR,@-Rn LS 1 1 4-11
172 STC.L SPC,@-Rn LS 1 1 4-11
173 STC.L VBR,@-Rn LS 1 1 4-11
174 STS MACH,Rn LS 1 1 5-3
175 STS MACL,Rn LS 1 1 5-3
176 STS PR,Rn LS 1 1 4-15
177 STS.L MACH,@-Rn LS 1 1 5-4
178 STS.L MACL,@-Rn LS 1 1 5-4
179 STS.L PR,@-Rn LS 1 1 4-16
180 FLDI0 FRn LS 1 1 6-13
181 FLDI1 FRn LS 1 1 6-13
182 FMOV FRm,FRn LS 1 1 6-9
183 FMOV.S @Rm,FRn LS 1 1 6-9
184 FMOV.S @Rm+,FRn LS 1 1 6-9
185 FMOV.S @(R0,Rm),FRn LS 1 1 6-9
186 FMOV.S FRm,@Rn LS 1 1 6-9
187 FMOV.S FRm,@-Rn LS 1 1 6-9
188 FMOV.S FRm,@(R0,Rn) LS 1 1 6-9
Functional Category: Single-precision floating-point instructions; Double-precision floating-point instructions
189 FLDS FRm,FPUL LS 1 1 6-10
190 FSTS FPUL,FRn LS 1 1 6-11
191 FABS FRn LS 1 1 6-12
192 FADD FRm,FRn FE 1 1 6-14
193 FCMP/EQ FRm,FRn FE 1 1 6-14
194 FCMP/GT FRm,FRn FE 1 1 6-14
195 FDIV FRm,FRn FE 1 14 6-15
196 FLOAT FPUL,FRn FE 1 1 6-14
197 FMAC FR0,FRm,FRn FE 1 1 6-14
198 FMUL FRm,FRn FE 1 1 6-14
199 FNEG FRn LS 1 1 6-12
200 FSQRT FRn FE 1 30 6-15
201 FSUB FRm,FRn FE 1 1 6-14
202 FTRC FRm,FPUL FE 1 1 6-14
203 FMOV DRm,DRn LS 1 1 6-9
204 FMOV @Rm,DRn LS 1 1 6-9
205 FMOV @Rm+,DRn LS 1 1 6-9
206 FMOV @(R0,Rm),DRn LS 1 1 6-9
207 FMOV DRm,@Rn LS 1 1 6-9
208 FMOV DRm,@-Rn LS 1 1 6-9
209 FMOV DRm,@(R0,Rn) LS 1 1 6-9
210 FABS DRn LS 1 1 6-12
211 FADD DRm,DRn FE 1 1 6-16
212 FCMP/EQ DRm,DRn FE 1 1 6-16
213 FCMP/GT DRm,DRn FE 1 1 6-16
214 FCNVDS DRm,FPUL FE 1 1 6-16
215 FCNVSD FPUL,DRn FE 1 1 6-16
216 FDIV DRm,DRn FE 1 14 6-18
217 FLOAT FPUL,DRn FE 1 1 6-16
218 FMUL DRm,DRn FE 1 3 6-17
219 FNEG DRn LS 1 1 6-12
Functional Category: Double-precision floating-point instructions; FPU system control instructions; Graphics acceleration instructions
220 FSQRT DRn FE 1 30 6-18
221 FSUB DRm,DRn FE 1 1 6-16
222 FTRC DRm,FPUL FE 1 1 6-16
223 LDS Rm,FPUL LS 1 1 6-1
224 LDS Rm,FPSCR LS 1 1 6-5
225 LDS.L @Rm+,FPUL LS 1 1 6-3
226 LDS.L @Rm+,FPSCR LS 1 1 6-7
227 STS FPUL,Rn LS 1 1 6-2
228 STS FPSCR,Rn LS 1 1 6-6
229 STS.L FPUL,@-Rn LS 1 1 6-4
230 STS.L FPSCR,@-Rn LS 1 1 6-8
231 FMOV DRm,XDn LS 1 1 6-9
232 FMOV XDm,DRn LS 1 1 6-9
233 FMOV XDm,XDn LS 1 1 6-9
234 FMOV @Rm,XDn LS 1 1 6-9
235 FMOV @Rm+,XDn LS 1 1 6-9
236 FMOV @(R0,Rm),XDn LS 1 1 6-9
237 FMOV XDm,@Rn LS 1 1 6-9
238 FMOV XDm,@-Rn LS 1 1 6-9
239 FMOV XDm,@(R0,Rn) LS 1 1 6-9
240 FIPR FVm,FVn FE 1 1 6-19
241 FRCHG FE 1 1 6-14
242 FSCHG FE 1 1 6-14
243 FPCHG FE 1 1 6-14
244 FSRRA FRn FE 1 1 6-21
245 FSCA FPUL,DRn FE 1 3 6-22
246 FTRV XMTRX,FVn FE 1 4 6-20

Section 5 Exception Handling

5.1 Summary of Exception Handling

Exception handling is performed by a special routine that is started by a reset, a general exception, or an interrupt. For example, if an executing instruction ends abnormally, appropriate action must be taken either to return to the original program sequence or to report the abnormality before terminating the processing. The process of generating an exception handling request in response to such an abnormal termination, and of passing control to a user-written exception handling routine, is given the generic name of exception handling.
The exception handling in the SH-4A is of three kinds: resets, general exceptions, and interrupts.

5.2 Register Descriptions

Table 5.1 lists the configuration of registers related to exception handling.

Table 5.1 Register Configuration

Register Name             Abbr.    R/W   P4 Address*    Area 7 Address*   Access Size
TRAPA exception register  TRA      R/W   H'FF00 0020    H'1F00 0020       32
Exception event register  EXPEVT   R/W   H'FF00 0024    H'1F00 0024       32
Interrupt event register  INTEVT   R/W   H'FF00 0028    H'1F00 0028       32

Note: * P4 is the address when virtual address space P4 area is used. Area 7 is the address when physical address space area 7 is accessed by using the TLB.

Table 5.2 States of Register in Each Operating Mode

Register Name             Abbr.    Power-on Reset   Manual Reset   Sleep      Standby
TRAPA exception register  TRA      Undefined        Undefined      Retained   Retained
Exception event register  EXPEVT   H'0000 0000      H'0000 0020    Retained   Retained
Interrupt event register  INTEVT   Undefined        Undefined      Retained   Retained
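As a rough illustration of Table 5.1, the following C sketch records the three P4 addresses and derives the corresponding area 7 addresses. The macro and function names are hypothetical and do not come from this manual. The masking rule (clearing bits 31 to 29) is only an observation that reproduces the three address pairs in the table; it is not presented here as a general statement about the address map.

```c
#include <stdint.h>

/* P4 addresses from Table 5.1 (macro names are illustrative only). */
#define TRA_P4_ADDR    UINT32_C(0xFF000020)
#define EXPEVT_P4_ADDR UINT32_C(0xFF000024)
#define INTEVT_P4_ADDR UINT32_C(0xFF000028)

/* For the three registers above, the area 7 address in Table 5.1 equals
 * the P4 address with bits 31 to 29 cleared (assumption drawn from the
 * table values only). */
static inline uint32_t p4_to_area7(uint32_t p4_addr)
{
    return p4_addr & UINT32_C(0x1FFFFFFF);
}
```

On a real target the registers would be accessed through volatile 32-bit pointers, since Table 5.1 gives an access size of 32 bits; that part is omitted so the sketch can be compiled on a host.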

5.2.1 TRAPA Exception Register (TRA)

The TRAPA exception register (TRA) consists of 8-bit immediate data (imm) for the TRAPA instruction. TRA is set automatically by hardware when a TRAPA instruction is executed. TRA can also be modified by software.
Bit layout: bits 31 to 10 are reserved, bits 9 to 2 hold TRACODE, and bits 1 and 0 are reserved.

Bit        Bit Name   Initial Value   R/W   Description
31 to 10              All 0           R     Reserved
                                            For details on reading/writing this bit, see General Precautions on Handling of Product.
9 to 2     TRACODE    Undefined       R/W   TRAPA Code
                                            8-bit immediate data of TRAPA instruction is set
1, 0                  All 0           R     Reserved
                                            For details on reading/writing this bit, see General Precautions on Handling of Product.
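Because TRACODE occupies bits 9 to 2, the 8-bit immediate of the TRAPA instruction appears in TRA shifted left by two bits. A minimal sketch of the field arithmetic follows; the helper names are hypothetical and not defined by this manual.

```c
#include <stdint.h>

#define TRA_TRACODE_SHIFT 2                /* TRACODE starts at bit 2 */
#define TRA_TRACODE_MASK  UINT32_C(0xFF)   /* 8-bit field, bits 9 to 2 */

/* Extract the 8-bit TRAPA immediate from a TRA register value. */
static inline uint8_t tra_get_imm(uint32_t tra)
{
    return (uint8_t)((tra >> TRA_TRACODE_SHIFT) & TRA_TRACODE_MASK);
}

/* Build the TRA value that corresponds to a given TRAPA immediate,
 * with the reserved bits left at 0. */
static inline uint32_t tra_from_imm(uint8_t imm)
{
    return (uint32_t)imm << TRA_TRACODE_SHIFT;
}
```

For example, an immediate of H'21 corresponds to a TRA value of H'0000 0084.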

5.2.2 Exception Event Register (EXPEVT)

The exception event register (EXPEVT) consists of a 12-bit exception code. The exception code set in EXPEVT is that for a reset or general exception event. The exception code is set automatically by hardware when an exception occurs. EXPEVT can also be modified by software.
Bit layout: bits 31 to 12 are reserved, and bits 11 to 0 hold EXPCODE.

Bit        Bit Name   Initial Value    R/W   Description
31 to 12              All 0            R     Reserved
                                             For details on reading/writing this bit, see General Precautions on Handling of Product.
11 to 0    EXPCODE    H'000 or H'020   R/W   Exception Code
                                             The exception code for a reset or general exception is set. For details, see table 5.3.

5.2.3 Interrupt Event Register (INTEVT)

The interrupt event register (INTEVT) consists of a 14-bit exception code. The exception code is set automatically by hardware when an exception occurs. INTEVT can also be modified by software.
Bit layout: bits 31 to 14 are reserved, and bits 13 to 0 hold INTCODE.

Bit        Bit Name   Initial Value   R/W   Description
31 to 14              All 0           R     Reserved
                                            For details on reading/writing this bit, see General Precautions on Handling of Product.
13 to 0    INTCODE    Undefined       R/W   Exception Code
                                            The exception code for an interrupt is set. For details, see table 5.3.
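The field widths of EXPEVT (12 bits) and INTEVT (14 bits) differ, so a handler that reads both registers needs two masks. The following sketch shows the masking only; the function names are hypothetical.

```c
#include <stdint.h>

/* EXPCODE is bits 11 to 0 of EXPEVT. */
static inline uint32_t expevt_code(uint32_t expevt)
{
    return expevt & UINT32_C(0x0FFF);
}

/* INTCODE is bits 13 to 0 of INTEVT. */
static inline uint32_t intevt_code(uint32_t intevt)
{
    return intevt & UINT32_C(0x3FFF);
}
```

Since the reserved bits read as 0, the masks are defensive rather than strictly required.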

5.3 Exception Handling Functions

5.3.1 Exception Handling Flow

In exception handling, the contents of the program counter (PC), status register (SR), and R15 are saved in the saved program counter (SPC), saved status register (SSR), and saved general register 15 (SGR), respectively, and the CPU starts execution of the appropriate exception handling routine according to the vector address. An exception handling routine is a program written by the user to handle a specific exception. The routine is terminated, and control is returned to the original program, by executing a return-from-exception instruction (RTE). This instruction restores the PC and SR contents and returns control to the normal processing routine at the point at which the exception occurred. The SGR contents are not written back to R15 by an RTE instruction.
The basic processing flow is as follows. For the meaning of the SR bits, see section 2, Programming Model.
1. The PC, SR, and R15 contents are saved in SPC, SSR, and SGR, respectively.
2. The block bit (BL) in SR is set to 1.
3. The mode bit (MD) in SR is set to 1.
4. The register bank bit (RB) in SR is set to 1.
5. In a reset, the FPU disable bit (FD) in SR is cleared to 0.
6. The exception code is written to bits 11 to 0 of the exception event register (EXPEVT) or interrupt event register (INTEVT).
7. The CPU branches to the determined exception handling vector address, and the exception handling routine begins.
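The numbered steps above can be modeled in C as follows. This is a simplified software model, not a description of the hardware implementation: the struct and function names are hypothetical, and the SR bit positions (MD = bit 30, RB = bit 29, BL = bit 28) are taken from the SR description in the programming model and should be checked against section 2. Step 5 applies to resets only and is left out.

```c
#include <stdint.h>

/* SR bit positions (assumed from the programming model, section 2). */
#define SR_MD (UINT32_C(1) << 30)
#define SR_RB (UINT32_C(1) << 29)
#define SR_BL (UINT32_C(1) << 28)

/* Hypothetical model of the registers touched on exception entry. */
struct cpu_state {
    uint32_t pc, sr, r15;
    uint32_t spc, ssr, sgr;
    uint32_t vbr, expevt, intevt;
};

/* Steps 1 to 4, 6 and 7 for a general exception or an interrupt. */
static void enter_exception(struct cpu_state *cpu, uint32_t code,
                            uint32_t offset, int is_interrupt)
{
    cpu->spc = cpu->pc;                /* step 1: save PC, SR, R15 */
    cpu->ssr = cpu->sr;
    cpu->sgr = cpu->r15;
    cpu->sr |= SR_BL | SR_MD | SR_RB;  /* steps 2 to 4 */
    if (is_interrupt)
        cpu->intevt = code;            /* step 6: interrupt code */
    else
        cpu->expevt = code;            /* step 6: exception code */
    cpu->pc = cpu->vbr + offset;       /* step 7: branch to the vector */
}
```

The model ignores user breaks (which may branch to DBR) and the optional IMASK update for interrupts.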

5.3.2 Exception Handling Vector Addresses

The reset vector address is fixed at H'A0000000. Exception and interrupt vector addresses are determined by adding the offset for the specific event to the vector base address, which is set by software in the vector base register (VBR). In the case of the TLB miss exception, for example, the offset is H'00000400, so if H'9C080000 is set in VBR, the exception handling vector address will be H'9C080400. If a further exception occurs at the exception handling vector address, a duplicate exception will result, and recovery will be difficult; therefore, addresses that are not subject to address translation (that is, addresses in the P1 and P2 areas) should be specified for vector addresses.
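The address calculation in this section can be sketched as below. The names are hypothetical. The offsets H'100, H'400, and H'600 are those listed in Table 5.3, and the P1/P2 test assumes the usual ranges H'80000000 to H'9FFFFFFF (P1) and H'A0000000 to H'BFFFFFFF (P2), which should be confirmed against the address space description for the target product.

```c
#include <stdint.h>

#define RESET_VECTOR UINT32_C(0xA0000000)

/* Vector offsets as listed in Table 5.3. */
enum vector_offset {
    OFFSET_GENERAL   = 0x100,
    OFFSET_TLB_MISS  = 0x400,
    OFFSET_INTERRUPT = 0x600
};

/* Vector address = VBR + offset for the event. */
static inline uint32_t vector_address(uint32_t vbr, uint32_t offset)
{
    return vbr + offset;
}

/* Nonzero when addr lies in P1 or P2 (top two address bits are 10),
 * under the address ranges assumed in the lead-in. */
static inline int in_p1_or_p2(uint32_t addr)
{
    return (addr >> 30) == 2;
}
```

With VBR = H'9C080000, `vector_address(vbr, OFFSET_TLB_MISS)` gives H'9C080400, matching the example in the text, and that address passes the `in_p1_or_p2` check.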

5.4 Exception Types and Priorities

Table 5.3 shows the types of exceptions, with their relative priorities, vector addresses, and exception/interrupt codes.
Table 5.3 Exceptions

Exception                                           Priority  Priority  Vector         Offset*3  Exception
                                                    Level*2   Order*2   Address*3                Code*4
Reset (abort type)
  Power-on reset                                    1         1         H'A000 0000    —         H'000
  Manual reset                                      1         2         H'A000 0000    —         H'020
  H-UDI reset                                       1         1         H'A000 0000    —         H'000
  Instruction TLB multiple-hit exception            1         3         H'A000 0000    —         H'140
  Data TLB multiple-hit exception                   1         4         H'A000 0000    —         H'140
General exception (re-execution type)
  User break before instruction execution*1         2         0         (VBR/DBR)      H'100/—   H'1E0
  Instruction address error                         2         1         (VBR)          H'100     H'0E0
  Instruction TLB miss exception                    2         2         (VBR)          H'400     H'040
  Instruction TLB protection violation exception    2         3         (VBR)          H'100     H'0A0
  General illegal instruction exception             2         4         (VBR)          H'100     H'180
  Slot illegal instruction exception                2         4         (VBR)          H'100     H'1A0
  General FPU disable exception                     2         4         (VBR)          H'100     H'800
  Slot FPU disable exception                        2         4         (VBR)          H'100     H'820
  Data address error (read)                         2         5         (VBR)          H'100     H'0E0
  Data address error (write)                        2         5         (VBR)          H'100     H'100
  Data TLB miss exception (read)                    2         6         (VBR)          H'400     H'040
  Data TLB miss exception (write)                   2         6         (VBR)          H'400     H'060
  Data TLB protection violation exception (read)    2         7         (VBR)          H'100     H'0A0
  Data TLB protection violation exception (write)   2         7         (VBR)          H'100     H'0C0
  FPU exception                                     2         8         (VBR)          H'100     H'120
  Initial page write exception                      2         9         (VBR)          H'100     H'080
General exception (completion type)
  Unconditional trap (TRAPA)                        2         4         (VBR)          H'100     H'160
  User break after instruction execution*1          2         10        (VBR/DBR)      H'100/—   H'1E0
Interrupt (completion type)
  Nonmaskable interrupt                             3                   (VBR)          H'600     H'1C0
  General interrupt request                         4                   (VBR)          H'600     —

Notes: 1. When UBDE in CBCR = 1, PC = DBR. In other cases, PC = VBR + H'100.
       2. Priority is first assigned by priority level, then by priority order within each level (the lowest number represents the highest priority).
       3. Control passes to H'A000 0000 in a reset, and to [VBR + offset] in other cases.
       4. Stored in EXPEVT for a reset or general exception, and in INTEVT for an interrupt.
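A handler entered at VBR + H'100 typically has to tell the exception sources apart by reading EXPEVT. The following sketch turns the EXPEVT codes of Table 5.3 into a lookup table. It is an illustration only: the type and function names are hypothetical, and codes shared by several sources (for example H'040) are given a combined description.

```c
#include <stddef.h>
#include <stdint.h>

struct exc_entry {
    uint32_t code;     /* value of EXPEVT bits 11 to 0 */
    const char *name;  /* exception(s) listed for that code in Table 5.3 */
};

static const struct exc_entry expevt_table[] = {
    { 0x000, "Power-on reset or H-UDI reset" },
    { 0x020, "Manual reset" },
    { 0x040, "Instruction TLB miss or data TLB miss (read)" },
    { 0x060, "Data TLB miss (write)" },
    { 0x080, "Initial page write" },
    { 0x0A0, "Instruction TLB protection violation or data TLB protection violation (read)" },
    { 0x0C0, "Data TLB protection violation (write)" },
    { 0x0E0, "Instruction address error or data address error (read)" },
    { 0x100, "Data address error (write)" },
    { 0x120, "FPU exception" },
    { 0x140, "Instruction or data TLB multiple hit" },
    { 0x160, "Unconditional trap (TRAPA)" },
    { 0x180, "General illegal instruction" },
    { 0x1A0, "Slot illegal instruction" },
    { 0x1E0, "User break" },
    { 0x800, "General FPU disable" },
    { 0x820, "Slot FPU disable" },
};

/* Return the description for an EXPEVT value, or NULL if the code
 * is not listed in Table 5.3. */
static const char *expevt_name(uint32_t expevt)
{
    uint32_t code = expevt & UINT32_C(0x0FFF);
    for (size_t i = 0; i < sizeof expevt_table / sizeof expevt_table[0]; i++) {
        if (expevt_table[i].code == code)
            return expevt_table[i].name;
    }
    return NULL;
}
```

A linear search is used for clarity; since the listed codes are multiples of H'20, a real handler might instead index a jump table with the code shifted right by five bits.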

5.5 Exception Flow

5.5.1 Exception Flow

Figure 5.1 shows an outline flowchart of the basic operations in instruction execution and exception handling. For the sake of clarity, the following description assumes that instructions are executed sequentially, one by one. Figure 5.1 shows the relative priority order of the different kinds of exceptions (reset, general exception, and interrupt). Register settings in the event of an exception are shown only for SSR, SPC, SGR, EXPEVT/INTEVT, SR, and PC. However, other registers may be set automatically by hardware, depending on the exception. For details, see section 5.6, Description of Exceptions. Also, see section 5.6.4, Priority Order with Multiple Exceptions, for exception handling during execution of a delayed branch instruction and a delay slot instruction, or in the case of instructions in which two data accesses are performed.
[Figure 5.1: flowchart. The CPU checks whether a reset, a general exception, or an interrupt is requested; if none is requested, the next instruction is executed. When an exception other than a reset is accepted and the highest-priority exception is of the re-execution type, the instruction execution result is canceled.]

Register settings for a general exception or interrupt:
  SSR ← SR
  SPC ← PC
  SGR ← R15
  EXPEVT/INTEVT ← exception code
  SR.{MD, RB, BL} ← 111
  SR.IMASK ← received interrupt level (*)
  PC ← (CBCR.UBDE = 1 && User_Break ? DBR : (VBR + Offset))

Register settings for a reset:
  EXPEVT ← exception code
  SR.{MD, RB, BL, FD, IMASK} ← 11101111
  PC ← H'A000 0000

Note: * When the exception of the highest priority is an interrupt. Whether IMASK is updated or not can be set by software.

Figure 5.1 Instruction Execution and Exception Handling

5.5.2 Exception Source Acceptance

A priority ranking is provided for all exceptions for use in determining which of two or more simultaneously generated exceptions should be accepted. Five of the general exceptions—general illegal instruction exception, slot illegal instruction exception, general FPU disable exception, slot FPU disable exception, and unconditional trap exception—are detected in the process of instruction decoding, and do not occur simultaneously in the instruction pipeline. These exceptions therefore all have the same priority. General exceptions are detected in the order of instruction execution. However, exception handling is performed in the order of instruction flow (program order). Thus, an exception for an earlier instruction is accepted before that for a later instruction. An example of the order of acceptance for general exceptions is shown in figure 5.2.
[Figure 5.2: pipeline diagram. Instructions n, n + 1, n + 2, and n + 3 enter the pipeline in program order and pass through the stages I1, I2, ID, E1, E2, E3, and WB. A TLB miss (data access) occurs for instruction n, a general illegal instruction exception for instruction n + 1, and a TLB miss (instruction access) for instruction n + 2.]

Order of detection:
  General illegal instruction exception (instruction n + 1) and TLB miss (instruction n + 2) are detected simultaneously
  TLB miss (instruction n)

Order of exception handling:
  TLB miss (instruction n)
  Re-execution of instruction n
  General illegal instruction exception (instruction n + 1)
  Re-execution of instruction n + 1
  TLB miss (instruction n + 2)
  Re-execution of instruction n + 2
  Execution of instruction n + 3

[Legend]
I1, I2: Instruction fetch
ID: Instruction decode
E1, E2, E3: Instruction execution (E2, E3: memory access)
WB: Write-back

Figure 5.2 Example of General Exception Acceptance Order
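The acceptance rule of section 5.5.2 (program order first, then priority) can be expressed as a small selection function. This is a software model for illustration only and does not describe how the pipeline implements acceptance; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* One detected, not yet accepted, general exception. */
struct pending_exc {
    unsigned insn_index;  /* program order: 0 = instruction n, 1 = n + 1, ... */
    unsigned level;       /* priority level from Table 5.3 */
    unsigned order;       /* priority order within the level */
};

/* Return the index of the exception to accept next, or -1 if the list
 * is empty. The earliest instruction wins; among exceptions for the
 * same instruction, the lower level and then the lower order wins. */
static int next_to_accept(const struct pending_exc *p, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (best < 0 ||
            p[i].insn_index < p[best].insn_index ||
            (p[i].insn_index == p[best].insn_index &&
             (p[i].level < p[best].level ||
              (p[i].level == p[best].level && p[i].order < p[best].order))))
            best = (int)i;
    }
    return best;
}
```

Applied to the example of Figure 5.2, the TLB miss for instruction n is selected even though the exceptions for instructions n + 1 and n + 2 were detected earlier.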

5.5.3 Exception Requests and BL Bit

When the BL bit in SR is 0, exceptions and interrupts are accepted.
When the BL bit in SR is 1 and an exception other than a user break is generated, the CPU's internal registers and the registers of the other modules are set to their states following a manual reset, and the CPU branches to the same address as in a reset (H'A0000000). For the operation in the event of a user break, see the User Break Controller (UBC) section of the hardware manual of the target product. If an ordinary interrupt occurs, the interrupt request is held pending and is accepted after the BL bit has been cleared to 0 by software. If a nonmaskable interrupt (NMI) occurs, it can be held pending or accepted according to the setting made by software.
Thus, an exception handling routine normally saves SPC and SSR and then clears the BL bit in SR to 0, so that further exceptions can be accepted while the routine is running (multiple exception handling).
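The cases described in this section can be summarized as a decision function. The sketch below is a simplified reading of the text, with hypothetical enum and function names; the NMI behavior when BL = 1 is passed in as a parameter because the text says it depends on a software setting, and the user break case is only flagged, since its operation is described in the UBC section of the hardware manual.

```c
enum event_kind {
    EV_GENERAL_EXCEPTION,  /* any general exception other than a user break */
    EV_USER_BREAK,
    EV_INTERRUPT,          /* ordinary (maskable) interrupt */
    EV_NMI
};

enum action {
    ACT_ACCEPT,        /* normal exception handling */
    ACT_HOLD_PENDING,  /* accepted after software clears BL to 0 */
    ACT_RESET_LIKE,    /* manual-reset state, branch to H'A0000000 */
    ACT_SEE_UBC        /* refer to the UBC section of the hardware manual */
};

static enum action on_event(int sr_bl, enum event_kind ev,
                            int nmi_accepted_when_blocked)
{
    if (!sr_bl)
        return ACT_ACCEPT;
    switch (ev) {
    case EV_GENERAL_EXCEPTION:
        return ACT_RESET_LIKE;
    case EV_USER_BREAK:
        return ACT_SEE_UBC;
    case EV_INTERRUPT:
        return ACT_HOLD_PENDING;
    case EV_NMI:
        return nmi_accepted_when_blocked ? ACT_ACCEPT : ACT_HOLD_PENDING;
    }
    return ACT_ACCEPT;
}
```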

5.5.4 Return from Exception Handling

The RTE instruction is used to return from exception handling. When the RTE instruction is executed, the SPC contents are restored to PC and the SSR contents to SR, and the CPU returns from the exception handling routine by branching to the SPC address. If SPC and SSR were saved to external memory, set the BL bit in SR to 1 before restoring the SPC and SSR contents and issuing the RTE instruction.
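The save and restore sequence around a handler that allows nested exceptions can be modeled as follows. This is a host-side model with hypothetical names, not target code: the BL bit position (bit 28) is assumed from the SR description in section 2, and `model_rte` only captures the register effect of RTE stated above (SPC to PC, SSR to SR), ignoring details such as the delay slot.

```c
#include <stdint.h>

#define SR_BL (UINT32_C(1) << 28)  /* assumed from section 2 */

struct regs {
    uint32_t pc, sr, spc, ssr;
};

/* Register effect of RTE: SSR is restored to SR and SPC to PC. */
static void model_rte(struct regs *r)
{
    r->sr = r->ssr;
    r->pc = r->spc;
}

/* Save SPC and SSR to memory, then clear BL so that further
 * exceptions can be accepted. */
static void handler_prologue(struct regs *r, uint32_t saved[2])
{
    saved[0] = r->spc;
    saved[1] = r->ssr;
    r->sr &= ~SR_BL;
}

/* Set BL to 1 before restoring SPC and SSR, then return with RTE. */
static void handler_epilogue(struct regs *r, const uint32_t saved[2])
{
    r->sr |= SR_BL;
    r->spc = saved[0];
    r->ssr = saved[1];
    model_rte(r);
}
```

Setting BL before the restore keeps another exception from overwriting SPC and SSR between the restore and the RTE.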

5.6 Description of Exceptions

This section describes, for each exception, its source, the transition address when the exception occurs, and the processor operation when the transition is made.

5.6.1 Resets

Power-On Reset:
Condition: Power-on reset request
Operations:
Exception code H'000 is set in EXPEVT, initialization of the CPU and on-chip peripheral module is carried out, and then a branch is made to the reset vector (H'A0000000). For details, see the register descriptions in the relevant sections. A power-on reset should be executed when power is supplied.
Manual Reset:
Condition: Manual reset request
Operations:
Exception code H'020 is set in EXPEVT, initialization of the CPU and on-chip peripheral module is carried out, and then a branch is made to the reset vector (H'A0000000). The registers initialized by a power-on reset and by a manual reset are different. For details, see the register descriptions in the relevant sections.
H-UDI Reset:
Source: SDIR.TI[7:4] = B'0110 (negation) or B'0111 (assertion)
Transition address: H'A0000000
Transition operations:
Exception code H'000 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = H'A0000000.
CPU and on-chip peripheral module initialization is performed. For details, see the register descriptions in the relevant sections of the hardware manual of the target product.
Instruction TLB Multiple-Hit Exception:
Source: Multiple ITLB address matches
Transition address: H'A0000000
Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
Exception code H'140 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = H'A0000000.
CPU and on-chip peripheral module initialization is performed in the same way as in a manual reset. For details, see the register descriptions in the relevant sections of the hardware manual of the target product.
Data TLB Multiple-Hit Exception:
Source: Multiple UTLB address matches
Transition address: H'A0000000
Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
Exception code H'140 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = H'A0000000.
CPU and on-chip peripheral module initialization is performed in the same way as in a manual reset. For details, see the register descriptions in the relevant sections of the hardware manual of the target product.

5.6.2 General Exceptions

Data TLB Miss Exception:
Source: Address mismatch in UTLB address comparison
Transition address: VBR + H'00000400
Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.
Exception code H'040 (for a read access) or H'060 (for a write access) is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H'0400.
To speed up TLB miss processing, the offset is separate from that of other exceptions.
Data_TLB_miss_exception()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = read_access ? H'0000 0040 : H'0000 0060;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + H'0000 0400;
}
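In the TLB-related exceptions, the 22-bit virtual page number placed in PTEH[31:10] is simply the upper 22 bits of the faulting virtual address held in TEA. A minimal sketch of that relationship follows; the helper names are hypothetical, and the low 10 bits of PTEH (which include the ASID) are passed through unchanged.

```c
#include <stdint.h>

#define PTEH_VPN_MASK UINT32_C(0xFFFFFC00)  /* bits 31 to 10 */

/* 22-bit virtual page number of an address. */
static inline uint32_t vpn_of(uint32_t vaddr)
{
    return vaddr >> 10;
}

/* PTEH value after the exception: VPN field replaced by the page
 * number of the faulting address, other bits preserved. */
static inline uint32_t pteh_with_vpn(uint32_t pteh, uint32_t fault_addr)
{
    return (pteh & ~PTEH_VPN_MASK) | (fault_addr & PTEH_VPN_MASK);
}
```

A TLB miss handler would typically use this page number, together with the ASID in PTEH, to locate the page table entry to load.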
Instruction TLB Miss Exception:
Source: Address mismatch in ITLB address comparison
Transition address: VBR + H'00000400
Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.
Exception code H'040 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H'0400.
To speed up TLB miss processing, the offset is separate from that of other exceptions.
ITLB_miss_exception()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = H'0000 0040;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + H'0000 0400;
}
Initial Page Write Exception:
Source: TLB is hit in a store access, but dirty bit D = 0
Transition address: VBR + H'00000100
Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.
Exception code H'080 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H'0100.
Initial_write_exception()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = H'0000 0080;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + H'0000 0100;
}
Data TLB Protection Violation Exception:
Source: The access does not accord with the UTLB protection information (PR bits) shown below.
PR Privileged Mode User Mode
00 Only read access possible Access not possible
01 Read/write access possible Access not possible
10 Only read access possible Only read access possible
11 Read/write access possible Read/write access possible
Transition address: VBR + H'00000100
Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.
Exception code H'0A0 (for a read access) or H'0C0 (for a write access) is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + H'0100.
Data_TLB_protection_violation_exception()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = read_access ? H'0000 00A0 : H'0000 00C0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + H'0000 0100;
}
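The PR-bit table for the data TLB protection violation exception can be written as a predicate. The sketch below encodes only that table: the function names are hypothetical, and `pr` is the 2-bit PR field of the matching UTLB entry.

```c
#include <stdbool.h>

/* Return true when the access is permitted by the PR bits.
 *   PR = 00: privileged read only
 *   PR = 01: privileged read/write
 *   PR = 10: read only in both modes
 *   PR = 11: read/write in both modes */
static bool pr_allows(unsigned pr, bool privileged, bool is_write)
{
    bool user_ok  = (pr & 2u) != 0;  /* upper PR bit opens user mode */
    bool write_ok = (pr & 1u) != 0;  /* lower PR bit permits writes */

    if (!privileged && !user_ok)
        return false;
    if (is_write && !write_ok)
        return false;
    return true;
}

/* EXPEVT code set when the access is not permitted. */
static unsigned dtlb_protection_code(bool is_write)
{
    return is_write ? 0x0C0u : 0x0A0u;
}
```

For example, a user-mode write to a page with PR = 10 is refused, and the exception code is H'0C0.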