IBM SA14-2339-04 User Manual


PowerPC 405

Embedded Processor Core

User’s Manual

SA14-2339-04

Fifth Edition (December 2001)

This edition of IBM PPC405 Embedded Processor Core User’s Manual applies to the IBM PPC405 32-bit embedded processor core, until otherwise indicated in new versions or application notes.

The following paragraph does not apply to the United Kingdom or any country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS MANUAL “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions; therefore, this statement may not apply to you.

IBM does not warrant that the products in this publication, whether individually or as one or more groups, will meet your requirements or that the publication or the accompanying product descriptions are error-free.

This publication could contain technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or program(s) described in this publication at any time.

It is possible that this publication may contain references to, or information about, IBM products (machines and programs), programming, or services that are not announced in your country. Such references or information must not be construed to mean that IBM intends to announce such IBM products, programming, or services in your country. Any reference to an IBM licensed program in this publication is not intended to state or imply that you can use only IBM’s licensed program. You can use any functionally equivalent program instead.

No part of this publication may be reproduced or distributed in any form or by any means, or stored in a data base or retrieval system, without the written permission of IBM.

Requests for copies of this publication and for technical information about IBM products should be made to your IBM Authorized Dealer or your IBM Marketing Representative.

Address technical queries about this product to ppcsupp@us.ibm.com.

Address comments about this publication to:

IBM Corporation
Department YM5A
P.O. Box 12195
Research Triangle Park, NC 27709

IBM may use or distribute whatever information you supply in any way it believes appropriate without incurring any obligation to you.

Use, duplication, or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corporation.

Patents and Trademarks

IBM may have patents or pending patent applications covering the subject matter in this publication. The furnishing of this publication does not give you any license to these patents. You can send license inquiries, in writing, to the IBM Director of Licensing, IBM Corporation, 208 Harbor Drive, Stamford, CT 06904, United States of America.

The following terms are trademarks of IBM Corporation:

IBM
PowerPC
PowerPC Architecture
PowerPC Embedded Controllers
RISCWatch

Other terms which are trademarks are the property of their respective owners.

Figures ......................................................................................................................................xv

Tables .....................................................................................................................................xviii

About This Book .....................................................................................................................xxi

Who Should Use This Book .............................................................................................................................. xxi

How to Use This Book ...................................................................................................................................... xxi

Conventions ..................................................................................................................................................... xxii

Chapter 1. Overview ...............................................................................................................1-1

PPC405 Features ............................................................................................................................................ 1-1

PowerPC Architecture ...................................................................................................................................... 1-3

The PPC405 as a PowerPC Implementation ................................................................................................... 1-3

Processor Core Organization ........................................................................................................................... 1-4

Instruction and Data Cache Controllers ...................................................................................................... 1-4

Instruction Cache Unit ............................................................................................................................ 1-4

Data Cache Unit ..................................................................................................................................... 1-5

Memory Management Unit .......................................................................................................................... 1-5

Timer Facilities ............................................................................................................................................ 1-6

Debug .......................................................................................................................................................... 1-7

Development Tool Support ..................................................................................................................... 1-7

Debug Modes ......................................................................................................................................... 1-7

Core Interfaces ............................................................................................................................................ 1-7

Processor Local Bus ............................................................................................................................... 1-8

Device Control Register Bus ................................................................................................................... 1-8

Clock and Power Management ............................................................................................................... 1-8

JTAG ....................................................................................................................................................... 1-8

Interrupts ................................................................................................................................................ 1-8

Auxiliary Processor Unit .......................................................................................................................... 1-8

On-Chip Memory .................................................................................................................................... 1-8

Data Types .................................................................................................................................................. 1-8

Processor Core Register Set Summary ...................................................................................................... 1-9

General Purpose Registers .................................................................................................................... 1-9

Special Purpose Registers ..................................................................................................................... 1-9

Machine State Register .......................................................................................................................... 1-9

Condition Register .................................................................................................................................. 1-9

Device Control Registers ........................................................................................................................ 1-9

Addressing Modes ..................................................................................................................................... 1-10

Chapter 2. Programming Model ............................................................................................2-1

User and Privileged Programming Models ...................................................................................................... 2-1

Memory Organization and Addressing ............................................................................................................. 2-1

Storage Attributes ........................................................................................................................................ 2-2

Registers .......................................................................................................................................................... 2-2

General Purpose Registers (R0-R31) ......................................................................................................... 2-5

Special Purpose Registers .......................................................................................................................... 2-5

Count Register (CTR) ............................................................................................................................. 2-6

Link Register (LR) .................................................................................................................................. 2-7

Fixed Point Exception Register (XER) .................................................................................................... 2-7

Special Purpose Register General (SPRG0–SPRG7) ............................................................................ 2-9

Processor Version Register (PVR) ....................................................................................................... 2-10

Condition Register (CR) ............................................................................................................................ 2-10

CR Fields after Compare Instructions ................................................................................................... 2-11


The CR0 Field ...................................................................................................................................... 2-12

The Time Base .......................................................................................................................................... 2-13

Machine State Register (MSR) ................................................................................................................. 2-13

Device Control Registers .......................................................................................................................... 2-15

Data Types and Alignment ............................................................................................................................ 2-16

Alignment for Storage Reference and Cache Control Instructions ........................................................... 2-16

Alignment and Endian Operation .............................................................................................................. 2-17

Summary of Instructions Causing Alignment Exceptions ......................................................................... 2-17

Byte Ordering ............................................................................................................................................... 2-17

Structure Mapping Examples .................................................................................................................... 2-18

Big Endian Mapping ............................................................................................................................. 2-19

Little Endian Mapping ........................................................................................................................... 2-19

Support for Little Endian Byte Ordering .................................................................................................... 2-19

Endian (E) Storage Attribute ..................................................................................................................... 2-19

Fetching Instructions from Little Endian Storage Regions ................................................................... 2-20

Accessing Data in Little Endian Storage Regions ................................................................................ 2-21

PowerPC Byte-Reverse Instructions .................................................................................................... 2-21

Instruction Processing ................................................................................................................................... 2-23

Branch Processing ........................................................................................................................................ 2-24

Unconditional Branch Target Addressing Options .................................................................................... 2-24

Conditional Branch Target Addressing Options ........................................................................................ 2-24

Conditional Branch Condition Register Testing ........................................................................................ 2-25

BO Field on Conditional Branches ............................................................................................................ 2-25

Branch Prediction ...................................................................................................................................... 2-26

Speculative Accesses .................................................................................................................................... 2-27

Speculative Accesses in the PPC405 ....................................................................................................... 2-27

Prefetch Distance Down an Unresolved Branch Path .......................................................................... 2-28

Prefetch of Branches to the CTR and Branches to the LR ................................................................... 2-28

Preventing Inappropriate Speculative Accesses ....................................................................................... 2-28

Fetching Past an Interrupt-Causing or Interrupt-Returning Instruction ................................................. 2-28

Fetching Past tw or twi Instructions ...................................................................................................... 2-29

Fetching Past an Unconditional Branch ............................................................................................... 2-29

Suggested Locations of Memory-Mapped Hardware ........................................................................... 2-29

Summary ................................................................................................................................................... 2-30

Privileged Mode Operation ............................................................................................................................ 2-30

MSR Bits and Exception Handling ............................................................................................................ 2-31

Privileged Instructions ............................................................................................................................... 2-31

Privileged SPRs ........................................................................................................................................ 2-32

Privileged DCRs ........................................................................................................................................ 2-32

Synchronization ............................................................................................................................................. 2-33

Context Synchronization ........................................................................................................................... 2-33

Execution Synchronization ........................................................................................................................ 2-35

Storage Synchronization ........................................................................................................................... 2-35

Instruction Set ................................................................................................................................................ 2-36

Instructions Specific to the IBM PowerPC Embedded Environment ...................................................... 2-37

Storage Reference Instructions ................................................................................................................ 2-37

Arithmetic Instructions ............................................................................................................................... 2-38

Logical Instructions ................................................................................................................................... 2-39

Compare Instructions ................................................................................................................................ 2-39

Branch Instructions ................................................................................................................................... 2-40

CR Logical Instructions ........................................................................................................................ 2-40

Rotate Instructions ............................................................................................................................... 2-40

Shift Instructions ................................................................................................................................... 2-41

Cache Management Instructions .......................................................................................................... 2-41

Interrupt Control Instructions ..................................................................................................................... 2-41

TLB Management Instructions .................................................................................................................. 2-42


Processor Management Instructions ......................................................................................................... 2-42

Extended Mnemonics ................................................................................................................................ 2-42

Chapter 3. Initialization ..........................................................................................................3-1

Processor State After Reset ............................................................................................................................ 3-1

Machine State Register Contents after Reset ............................................................................................. 3-2

Contents of Special Purpose Registers after Reset .................................................................................... 3-3

PPC405 Initial Processor Sequencing ............................................................................................................. 3-3

Initialization Requirements ............................................................................................................................... 3-4

Initialization Code Example .............................................................................................................................. 3-5

Chapter 4. Cache Operations ................................................................................................4-1

ICU and DCU Organization and Sizes ............................................................................................................. 4-2

ICU Overview ................................................................................................................................................... 4-3

ICU Operations ............................................................................................................................................ 4-4

Instruction Cachability Control ..................................................................................................................... 4-5

Instruction Cache Synonyms ....................................................................................................................... 4-5

ICU Coherency ............................................................................................................................................ 4-6

DCU Overview ................................................................................................................................................. 4-6

DCU Operations .......................................................................................................................................... 4-6

DCU Write Strategies .................................................................................................................................. 4-7

DCU Load and Store Strategies .................................................................................................................. 4-8

Data Cachability Control .............................................................................................................................. 4-8

DCU Coherency .......................................................................................................................................... 4-9

Cache Instructions ........................................................................................................................................... 4-9

ICU Instructions ........................................................................................................................................... 4-9

DCU Instructions ....................................................................................................................................... 4-10

Cache Control and Debugging Features ....................................................................................................... 4-11

CCR0 Programming Guidelines ................................................................................................................ 4-13

ICU Debugging .......................................................................................................................................... 4-14

DCU Debugging ........................................................................................................................................ 4-15

DCU Performance .......................................................................................................................................... 4-16

Pipeline Stalls ............................................................................................................................................ 4-16

Cache Operation Priorities ........................................................................................................................ 4-17

Simultaneous Cache Operations ............................................................................................................... 4-17

Sequential Cache Operations ................................................................................................................... 4-18

Chapter 5. Fixed-Point Interrupts and Exceptions ..............................................................5-1

Architectural Definitions and Behavior ............................................................................................................. 5-1

Behavior of the PPC405 Processor Core Implementation ............................................................................... 5-2

Interrupt Handling Priorities ............................................................................................................................. 5-3

Critical and Noncritical Interrupts ..................................................................................................................... 5-5

General Interrupt Handling Registers .............................................................................................................. 5-7

Machine State Register (MSR) .................................................................................................................... 5-7

Save/Restore Registers 0 and 1 (SRR0–SRR1) ......................................................................................... 5-9

Save/Restore Registers 2 and 3 (SRR2–SRR3) ......................................................................................... 5-9

Exception Vector Prefix Register (EVPR) ................................................................................................ 5-10

Exception Syndrome Register (ESR) ........................................................................................................ 5-11

Data Exception Address Register (DEAR) ................................................................................................ 5-13

Critical Input Interrupts ................................................................................................................................... 5-13

Machine Check Interrupts .............................................................................................................................. 5-14

Instruction Machine Check Handling ......................................................................................................... 5-14

Data Machine Check Handling .................................................................................................................. 5-15

Data Storage Interrupt ................................................................................................................................... 5-16

Instruction Storage Interrupt .......................................................................................................................... 5-17

External Interrupt ........................................................................................................................................... 5-18

External Interrupt Handling ........................................................................................................................ 5-18


Alignment Interrupt ........................................................................................................................................ 5-19

Program Interrupt .......................................................................................................................................... 5-20

FPU Unavailable Interrupt ............................................................................................................................. 5-21

System Call Interrupt ..................................................................................................................................... 5-22

APU Unavailable Interrupt ............................................................................................................................. 5-22

Programmable Interval Timer (PIT) Interrupt ................................................................................................. 5-22

Fixed Interval Timer (FIT) Interrupt ................................................................................................................ 5-23

Watchdog Timer Interrupt .............................................................................................................................. 5-24

Data TLB Miss Interrupt ................................................................................................................................. 5-25

Instruction TLB Miss Interrupt ........................................................................................................................ 5-25

Debug Interrupt .............................................................................................................................................. 5-26

Chapter 6. Timer Facilities ....................................................................................................6-1

Time Base ....................................................................................................................................................... 6-1

Reading the Time Base .............................................................................................................................. 6-3

Writing the Time Base ................................................................................................................................. 6-3

Programmable Interval Timer (PIT) ................................................................................................................. 6-4

Fixed Interval Timer (FIT) ........................................................................................................................... 6-5

Watchdog Timer .............................................................................................................................................. 6-6

Timer Status Register (TSR) ........................................................................................................................... 6-8

Timer Control Register (TCR) .......................................................................................................................... 6-9

Chapter 7. Memory Management ..........................................................................................7-1

MMU Overview ................................................................................................................................................ 7-1

Address Translation ......................................................................................................................................... 7-1

Translation Lookaside Buffer (TLB) ................................................................................................................. 7-2

Unified TLB ................................................................................................................................................. 7-2

TLB Fields ................................................................................................................................................... 7-3

Page Identification Fields ....................................................................................................................... 7-3

Translation Field ..................................................................................................................................... 7-4

Access Control Fields ............................................................................................................................. 7-5

Storage Attribute Fields .......................................................................................................................... 7-5

Shadow Instruction TLB .............................................................................................................................. 7-6

ITLB Accesses ....................................................................................................................................... 7-7

Shadow Data TLB ....................................................................................................................................... 7-7

DTLB Accesses ...................................................................................................................................... 7-7

Shadow TLB Consistency ........................................................................................................................... 7-7

TLB-Related Interrupts .................................................................................................................................... 7-9

Data Storage Interrupt .............................................................................................................................. 7-10

Instruction Storage Interrupt ..................................................................................................................... 7-10

Data TLB Miss Interrupt ............................................................................................................................ 7-11

Instruction TLB Miss Interrupt ................................................................................................................... 7-11

Program Interrupt ...................................................................................................................................... 7-11

TLB Management .......................................................................................................................................... 7-11

TLB Search Instructions (tlbsx/tlbsx.) ....................................................................................................... 7-12

TLB Read/Write Instructions (tlbre/tlbwe) ................................................................................................. 7-12

TLB Invalidate Instruction (tlbia) ............................................................................................................... 7-12

TLB Sync Instruction (tlbsync) .................................................................................................................. 7-12

Recording Page References and Changes ................................................................................................... 7-12

Access Protection .......................................................................................................................................... 7-13

Access Protection Mechanisms in the TLB ............................................................................................... 7-13

General Access Protection ................................................................................................................... 7-13

Execute Permissions ............................................................................................................................ 7-14

Write Permissions ................................................................................................................................ 7-14

Zone Protection .................................................................................................................................... 7-14

Access Protection for Cache Control Instructions ..................................................................................... 7-16

Access Protection for String Instructions .................................................................................................. 7-17

Real-Mode Storage Attribute Control ............................................................................................................. 7-17

Storage Attribute Control Registers ........................................................................................................... 7-19

Data Cache Write-through Register (DCWR) ....................................................................................... 7-19

Data Cache Cachability Register (DCCR) ............................................................................................ 7-20

Instruction Cache Cachability Register (ICCR) ..................................................................................... 7-20

Storage Guarded Register (SGR) ......................................................................................................... 7-20

Storage User-defined 0 Register (SU0R) ............................................................................................. 7-20

Storage Little-Endian Register (SLER) ................................................................................................. 7-20

Chapter 8. Debugging ............................................................................................................ 8-1

Development Tool Support .............................................................................................................................. 8-1

Debug Modes ................................................................................................................................................... 8-1

Internal Debug Mode ................................................................................................................................... 8-1

External Debug Mode .................................................................................................................................. 8-2

Debug Wait Mode ........................................................................................................................................ 8-2

Real-time Trace Debug Mode ..................................................................................................................... 8-3

Processor Control ............................................................................................................................................ 8-3

Processor Status .............................................................................................................................................. 8-4

Debug Registers .............................................................................................................................................. 8-4

Debug Control Registers ............................................................................................................................. 8-4

Debug Control Register 0 (DBCR0) ........................................................................................................ 8-4

Debug Control Register 1 (DBCR1) ........................................................................................................ 8-6

Debug Status Register (DBSR) .................................................................................................................. 8-7

Instruction Address Compare Registers (IAC1–IAC4) ................................................................................ 8-9

Data Address Compare Registers (DAC1–DAC2) .................................................................................... 8-9

Data Value Compare Registers (DVC1–DVC2) ........................................................................................ 8-10

Debug Events ............................................................................................................................................ 8-10

Instruction Complete Debug Event ............................................................................................................ 8-11

Branch Taken Debug Event ...................................................................................................................... 8-11

Exception Taken Debug Event .................................................................................................................. 8-11

Trap Taken Debug Event .......................................................................................................................... 8-12

Unconditional Debug Event ....................................................................................................................... 8-12

IAC Debug Event ....................................................................................................................................... 8-12

IAC Exact Address Compare ................................................................................................................ 8-12

IAC Range Address Compare .............................................................................................................. 8-12

DAC Debug Event ..................................................................................................................................... 8-13

DAC Exact Address Compare .............................................................................................................. 8-13

DAC Range Address Compare ............................................................................................................. 8-14

DAC Applied to Cache Instructions ....................................................................................................... 8-15

DAC Applied to String Instructions ........................................................................................................ 8-16

Data Value Compare Debug Event ........................................................................................................... 8-16

Imprecise Debug Event ............................................................................................................................. 8-19

Debug Interface ............................................................................................................................................. 8-19

IEEE 1149.1 Test Access Port (JTAG Debug Port) ..................................................................................8-19

JTAG Connector ............................................................................................................................................ 8-20

JTAG Instructions ...................................................................................................................................... 8-21

JTAG Boundary Scan ................................................................................................................................ 8-21

Trace Port ...................................................................................................................................................... 8-22

Chapter 9. Instruction Set ..................................................................................................... 9-1

Instruction Set Portability ................................................................................................................................. 9-1

Instruction Formats .......................................................................................................................................... 9-2

Pseudocode ..................................................................................................................................................... 9-2

Operator Precedence .................................................................................................................................. 9-5

Register Usage ................................................................................................................................................ 9-5

Alphabetical Instruction Listing ........................................................................................................................ 9-5

add .............................................................................................................................................................. 9-6

addc ............................................................................................................................................................ 9-7

adde ............................................................................................................................................................ 9-8

addi ............................................................................................................................................................. 9-9

addic ......................................................................................................................................................... 9-10

addic. ........................................................................................................................................................ 9-11

addis ......................................................................................................................................................... 9-12

addme ....................................................................................................................................................... 9-13

addze ........................................................................................................................................................ 9-14

and ............................................................................................................................................................ 9-15

andc .......................................................................................................................................................... 9-16

andi. .......................................................................................................................................................... 9-17

andis. ........................................................................................................................................................ 9-18

b ................................................................................................................................................................ 9-19

bc .............................................................................................................................................................. 9-20

bcctr .......................................................................................................................................................... 9-26

bclr ............................................................................................................................................................ 9-30

cmp ........................................................................................................................................................... 9-34

cmpi .......................................................................................................................................................... 9-35

cmpl .......................................................................................................................................................... 9-36

cmpli .......................................................................................................................................................... 9-37

cntlzw ........................................................................................................................................................ 9-38

crand ......................................................................................................................................................... 9-39

crandc ....................................................................................................................................................... 9-40

creqv ......................................................................................................................................................... 9-41

crnand ....................................................................................................................................................... 9-42

crnor .......................................................................................................................................................... 9-43

cror ............................................................................................................................................................ 9-44

crorc .......................................................................................................................................................... 9-45

crxor .......................................................................................................................................................... 9-46

dcba .......................................................................................................................................................... 9-47

dcbf ........................................................................................................................................................... 9-49

dcbi ........................................................................................................................................................... 9-50

dcbst ......................................................................................................................................................... 9-51

dcbt ........................................................................................................................................................... 9-52

dcbtst ........................................................................................................................................................ 9-53

dcbz .......................................................................................................................................................... 9-54

dccci .......................................................................................................................................................... 9-56

dcread ....................................................................................................................................................... 9-57

divw ........................................................................................................................................................... 9-59

divwu ......................................................................................................................................................... 9-60

eieio .......................................................................................................................................................... 9-61

eqv ............................................................................................................................................................ 9-62

extsb ......................................................................................................................................................... 9-63

extsh ......................................................................................................................................................... 9-64

icbi ............................................................................................................................................................. 9-65

icbt ............................................................................................................................................................ 9-66

iccci ........................................................................................................................................................... 9-67

icread ........................................................................................................................................................ 9-68

isync .......................................................................................................................................................... 9-70

lbz ............................................................................................................................................................. 9-71

lbzu ........................................................................................................................................................... 9-72

lbzux .......................................................................................................................................................... 9-73

lbzx ............................................................................................................................................................ 9-74

lha ............................................................................................................................................................. 9-75

lhau ............................................................................................................................................................ 9-76

lhaux .......................................................................................................................................................... 9-77

lhax ............................................................................................................................................................ 9-78

lhbrx ........................................................................................................................................................... 9-79

lhz .............................................................................................................................................................. 9-80

lhzu ............................................................................................................................................................ 9-81

lhzux .......................................................................................................................................................... 9-82

lhzx ............................................................................................................................................................ 9-83

lmw ............................................................................................................................................................ 9-84

lswi ............................................................................................................................................................ 9-85

lswx ........................................................................................................................................................... 9-87

lwarx .......................................................................................................................................................... 9-89

lwbrx .......................................................................................................................................................... 9-90

lwz ............................................................................................................................................................. 9-91

lwzu ........................................................................................................................................................... 9-92

lwzux ......................................................................................................................................................... 9-93

lwzx ........................................................................................................................................................... 9-94

macchw ..................................................................................................................................................... 9-95

macchws ................................................................................................................................................... 9-96

macchwsu ................................................................................................................................................. 9-97

macchwu ................................................................................................................................................... 9-98

machhw ..................................................................................................................................................... 9-99

machhws ................................................................................................................................................. 9-100

machhwsu ............................................................................................................................................... 9-101

machhwu ................................................................................................................................................. 9-102

maclhw .................................................................................................................................................... 9-103

maclhws .................................................................................................................................................. 9-104

maclhwsu ................................................................................................................................................ 9-105

maclhwu .................................................................................................................................................. 9-106

mcrf ......................................................................................................................................................... 9-107

mcrxr ....................................................................................................................................................... 9-108

mfcr ......................................................................................................................................................... 9-109

mfdcr ....................................................................................................................................................... 9-110

mfmsr ...................................................................................................................................................... 9-111

mfspr ....................................................................................................................................................... 9-112

mftb ......................................................................................................................................................... 9-114

mtcrf ........................................................................................................................................................ 9-116

mtdcr ....................................................................................................................................................... 9-117

mtmsr ...................................................................................................................................................... 9-118

mtspr ....................................................................................................................................................... 9-119

mulchw .................................................................................................................................................... 9-121

mulchwu .................................................................................................................................................. 9-122

mulhhw .................................................................................................................................................... 9-123

mulhhwu .................................................................................................................................................. 9-124

mulhw ...................................................................................................................................................... 9-125

mulhwu .................................................................................................................................................... 9-126

mullhw ..................................................................................................................................................... 9-127

mullhwu ................................................................................................................................................... 9-128

mulli ......................................................................................................................................................... 9-129

mullw ....................................................................................................................................................... 9-130

nand ........................................................................................................................................................ 9-131

neg .......................................................................................................................................................... 9-132

nmacchw ................................................................................................................................................. 9-133

nmacchws ............................................................................................................................................... 9-134

nmachhw ................................................................................................................................................. 9-135

nmachhws ............................................................................................................................................... 9-136

nmaclhw .................................................................................................................................................. 9-137

nmaclhws ................................................................................................................................................ 9-138

nor ........................................................................................................................................................... 9-139

or ............................................................................................................................................................. 9-140

orc ........................................................................................................................................................... 9-141

ori ............................................................................................................................................................ 9-142

oris .......................................................................................................................................................... 9-143

rfci ........................................................................................................................................................... 9-144

rfi ............................................................................................................................................................. 9-145

rlwimi ....................................................................................................................................................... 9-146

rlwinm ...................................................................................................................................................... 9-147

rlwnm ...................................................................................................................................................... 9-150

sc ............................................................................................................................................................ 9-151

slw ........................................................................................................................................................... 9-152

sraw ........................................................................................................................................................ 9-153

srawi ........................................................................................................................................................ 9-154

srw .......................................................................................................................................................... 9-155

stb ........................................................................................................................................................... 9-156

stbu ......................................................................................................................................................... 9-157

stbux ....................................................................................................................................................... 9-158

stbx ......................................................................................................................................................... 9-159

sth ........................................................................................................................................................... 9-160

sthbrx ...................................................................................................................................................... 9-161

sthu ......................................................................................................................................................... 9-162

sthux ....................................................................................................................................................... 9-163

sthx ......................................................................................................................................................... 9-164

stmw ........................................................................................................................................................ 9-165

stswi ........................................................................................................................................................ 9-166

stswx ....................................................................................................................................................... 9-167

stw ........................................................................................................................................................... 9-169

stwbrx ...................................................................................................................................................... 9-170

stwcx. ...................................................................................................................................................... 9-171

stwu ......................................................................................................................................................... 9-173

stwux ....................................................................................................................................................... 9-174

stwx ......................................................................................................................................................... 9-175

subf ......................................................................................................................................................... 9-176

subfc ....................................................................................................................................................... 9-177

subfe ....................................................................................................................................................... 9-178

subfic ....................................................................................................................................................... 9-179

subfme .................................................................................................................................................... 9-180

subfze ..................................................................................................................................................... 9-181

sync ......................................................................................................................................................... 9-182

tlbia ......................................................................................................................................................... 9-183

tlbre ......................................................................................................................................................... 9-184

tlbsx ......................................................................................................................................................... 9-186

tlbsync ..................................................................................................................................................... 9-187

tlbwe ........................................................................................................................................................ 9-188

tw ............................................................................................................................................................ 9-190

twi ............................................................................................................................................................ 9-193

wrtee ....................................................................................................................................................... 9-196

wrteei ...................................................................................................................................................... 9-197

xor ........................................................................................................................................................... 9-198

xori ........................................................................................................................................................... 9-199

xoris ......................................................................................................................................................... 9-200

Chapter 10. Register Summary ..........................................................................................10-1

Reserved Registers ....................................................................................................................................... 10-1

Reserved Fields ............................................................................................................................................. 10-1

General Purpose Registers ............................................................................................................................ 10-1

Machine State Register and Condition Register ............................................................................................ 10-1

Special Purpose Registers ............................................................................................................................. 10-2

Time Base Registers ...................................................................................................................................... 10-4

Device Control Registers ............................................................................................................................... 10-4

Alphabetical Listing of PPC405 Registers ..................................................................................................... 10-5

CCR0 ......................................................................................................................................................... 10-6

CR ............................................................................................................................................................. 10-8

CTR ........................................................................................................................................................... 10-9

DAC1–DAC2 ........................................................................................................................................... 10-10

DBCR0 .................................................................................................................................................... 10-11

DBCR1 .................................................................................................................................................... 10-13

DBSR ...................................................................................................................................................... 10-15

DCCR ...................................................................................................................................................... 10-17

DCWR ..................................................................................................................................................... 10-19

DEAR ...................................................................................................................................................... 10-21

DVC1–DVC2 .......................................................................................................................................... 10-22

ESR ......................................................................................................................................................... 10-23

EVPR ....................................................................................................................................................... 10-25

GPR0–GPR31 ......................................................................................................................................... 10-26

IAC1–IAC4 .............................................................................................................................................. 10-27

ICCR ........................................................................................................................................................ 10-28

ICDBDR ................................................................................................................................................... 10-30

LR ............................................................................................................................................................ 10-31

MSR ........................................................................................................................................................ 10-32

PID .......................................................................................................................................................... 10-34

PIT ........................................................................................................................................................... 10-35

PVR ......................................................................................................................................................... 10-36

SGR ......................................................................................................................................................... 10-37

SLER ....................................................................................................................................................... 10-39

SPRG0–SPRG7 ...................................................................................................................................... 10-41

SRR0 ....................................................................................................................................................... 10-42

SRR1 ....................................................................................................................................................... 10-43

SRR2 ....................................................................................................................................................... 10-44

SRR3 ....................................................................................................................................................... 10-45

SU0R ....................................................................................................................................................... 10-46

TBL .......................................................................................................................................................... 10-48

TBU ......................................................................................................................................................... 10-49

TCR ......................................................................................................................................................... 10-50

TSR ......................................................................................................................................................... 10-51

USPRG0 .................................................................................................................................................. 10-52

XER ......................................................................................................................................................... 10-53

ZPR ......................................................................................................................................................... 10-54

A. Instruction Summary ........................................................................................................ A-1

Instruction Set and Extended Mnemonics – Alphabetical ................................................................................ A-1

Instructions Sorted by Opcode ....................................................................................................................... A-33

Instruction Formats ........................................................................................................................................ A-41

Instruction Fields ....................................................................................................................................... A-41

Instruction Format Diagrams ..................................................................................................................... A-43

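I-Form ..................................................................................................................................................... A-44

B-Form .................................................................................................................................................... A-44

SC-Form .................................................................................................................................................. A-44

D-Form .................................................................................................................................................... A-44

X-Form .................................................................................................................................................... A-45

XL-Form .................................................................................................................................................. A-45

XFX-Form ................................................................................................................................................ A-46

XO-Form .................................................................................................................................................. A-46

M-Form .................................................................................................................................................... A-46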

B. Instructions by Category ................................................................................................. B-1

Implementation-Specific Instructions ............................................................................................................... B-1

Instructions in the IBM PowerPC Embedded Environment ............................................................................. B-5

Privileged Instructions ..................................................................................................................................... B-7

Assembler Extended Mnemonics .................................................................................................................... B-9

Storage Reference Instructions ..................................................................................................................... B-29

Arithmetic and Logical Instructions ................................................................................................................ B-33

Condition Register Logical Instructions ......................................................................................................... B-37

Branch Instructions ........................................................................................................................................ B-38

Comparison Instructions ................................................................................................................................ B-39

Rotate and Shift Instructions ......................................................................................................................... B-40

Cache Control Instructions ............................................................................................................................ B-41

Interrupt Control Instructions ......................................................................................................................... B-42

TLB Management Instructions ....................................................................................................................... B-42

Processor Management Instructions ............................................................................................................. B-44

C. Code Optimization and Instruction Timings ..................................................................C-1

Code Optimization Guidelines ......................................................................................................................... C-1

Condition Register Bits for Boolean Variables ............................................................................................ C-1

CR Logical Instruction for Compound Branches ......................................................................................... C-1

Floating-Point Emulation ............................................................................................................................. C-1

Cache Usage .............................................................................................................................................. C-2

CR Dependencies ....................................................................................................................................... C-2

Branch Prediction ........................................................................................................................................ C-2

Alignment .................................................................................................................................................... C-2

Instruction Timings .......................................................................................................................................... C-3

General Rules ............................................................................................................................................. C-3

Branches ..................................................................................................................................................... C-3

Multiplies ..................................................................................................................................................... C-4

Scalar Load Instructions ............................................................................................................................. C-5

Scalar Store Instructions ............................................................................................................................. C-6

Alignment in Scalar Load and Store Instructions ........................................................................................ C-6

String and Multiple Instructions ................................................................................................................... C-6

Load and Store Misses ............................................................................................................................... C-7

Instruction Cache Misses ............................................................................................................................ C-7

Index ........................................................................................................................................ X-1

Figures

Figure 1-1. PPC405 Block Diagram ................................................................................................................1-4

Figure 2-1. PPC405 Programming Model—Registers ....................................................................................2-4

Figure 2-2. General Purpose Registers (R0-R31) ..........................................................................................2-5

Figure 2-3. Count Register (CTR) ...................................................................................................................2-7

Figure 2-4. Link Register (LR) .........................................................................................................................2-7

Figure 2-5. Fixed Point Exception Register (XER) ..........................................................................................2-8

Figure 2-6. Special Purpose Register General (SPRG0–SPRG7) ...............................................................2-10

Figure 2-7. Processor Version Register (PVR) .............................................................................................2-10

Figure 2-8. Condition Register (CR) .............................................................................................................2-11

Figure 2-9. Machine State Register (MSR) ...................................................................................................2-14

Figure 2-10. PPC405 Data Types .................................................................................................................2-16

Figure 2-11. Normal Word Load or Store (Big Endian Storage Region) .......................................................2-22

Figure 2-12. Byte-Reverse Word Load or Store (Little Endian Storage Region) ..........................................2-22

Figure 2-13. Byte-Reverse Word Load or Store (Big Endian Storage Region) .............................................2-22

Figure 2-14. Normal Word Load or Store (Little Endian Storage Region) ....................................................2-23

Figure 2-15. PPC405 Instruction Pipeline .....................................................................................................2-24

Figure 4-1. Instruction Flow ............................................................................................................................4-4

Figure 4-2. Core Configuration Register 0 (CCR0) .......................................................................................4-11

Figure 4-3. Instruction Cache Debug Data Register (ICDBDR) ....................................................................4-14

Figure 5-1. Machine State Register (MSR) .....................................................................................................5-7

Figure 5-2. Save/Restore Register 0 (SRR0) .................................................................................................5-9

Figure 5-3. Save/Restore Register 1 (SRR1) .................................................................................................5-9

Figure 5-4. Save/Restore Register 2 (SRR2) ...............................................................................................5-10

Figure 5-5. Save/Restore Register 3 (SRR3) ...............................................................................................5-10

Figure 5-6. Exception Vector Prefix Register (EVPR) ...................................................................................5-11

Figure 5-7. Exception Syndrome Register (ESR) .........................................................................................5-11

Figure 5-8. Data Exception Address Register (DEAR) .................................................................................5-13

Figure 6-1. Relationship of Timer Facilities to the Time Base ........................................................................6-1

Figure 6-2. Time Base Lower (TBL) ................................................................................................................6-2

Figure 6-3. Time Base Upper (TBU) ...............................................................................................................6-2

Figure 6-4. Programmable Interval Timer (PIT) ..............................................................................................6-5

Figure 6-5. Watchdog Timer State Machine ..................................................................................................6-7

Figure 6-6. Timer Status Register (TSR) ........................................................................................................6-8

Figure 6-7. Timer Control Register (TCR) .......................................................................................................6-9

Figure 7-1. Effective to Real Address Translation Flow ..................................................................................7-2

Figure 7-2. TLB Entries ...................................................................................................................................7-3

Figure 7-3. ITLB/DTLB/UTLB Address Resolution .........................................................................................7-9

Figure 7-4. Process ID (PID) .........................................................................................................................7-14

Figure 7-5. Zone Protection Register (ZPR) .................................................................................................7-15

Figure 7-6. Generic Storage Attribute Control Register ................................................................................7-19

Figure 8-1. Debug Control Register 0 (DBCR0) .............................................................................................8-4

Figure 8-2. Debug Control Register 1 (DBCR1) .............................................................................................8-6

Figure 8-3. Debug Status Register (DBSR) .................................................................................................... 8-8

Figure 8-4. Instruction Address Compare Registers (IAC1–IAC4) ................................................................. 8-9

Figure 8-5. Data Address Compare Registers (DAC1–DAC2) ..................................................................... 8-10

Figure 8-6. Data Value Compare Registers (DVC1–DVC2) ......................................................................... 8-10

Figure 8-7. Inclusive IAC Range Address Compares ................................................................................... 8-13

Figure 8-8. Exclusive IAC Range Address Compares .................................................................................. 8-13

Figure 8-9. Inclusive DAC Range Address Compares ................................................................................. 8-15

Figure 8-10. Exclusive DAC Range Address Compares .............................................................................. 8-15

Figure 8-11. JTAG Connector Physical Layout (Top View) .......................................................................... 8-20

Figure 10-1. Core Configuration Register 0 (CCR0) .................................................................................... 10-6

Figure 10-2. Condition Register (CR) ........................................................................................................... 10-8

Figure 10-3. Count Register (CTR) .............................................................................................................. 10-9

Figure 10-4. Data Address Compare Registers (DAC1–DAC2) ................................................................. 10-10

Figure 10-5. Debug Control Register 0 (DBCR0) ....................................................................................... 10-11

Figure 10-6. Debug Control Register 1 (DBCR1) ....................................................................................... 10-13

Figure 10-7. Debug Status Register (DBSR) .............................................................................................. 10-15

Figure 10-8. Data Cache Cachability Register (DCCR) ............................................................................. 10-17

Figure 10-9. Data Cache Write-through Register (DCWR) ........................................................................ 10-19

Figure 10-10. Data Exception Address Register (DEAR) ........................................................................... 10-21

Figure 10-11. Data Value Compare Registers (DVC1–DVC2) ................................................................... 10-22

Figure 10-12. Exception Syndrome Register (ESR) ................................................................................... 10-23

Figure 10-13. Exception Vector Prefix Register (EVPR) ............................................................................ 10-25

Figure 10-14. General Purpose Registers (R0-R31) .................................................................................. 10-26

Figure 10-15. Instruction Address Compare Registers (IAC1–IAC4) ......................................................... 10-27

Figure 10-16. Instruction Cache Cachability Register (ICCR) .................................................................... 10-28

Figure 10-17. Instruction Cache Debug Data Register (ICDBDR) ............................................................. 10-30

Figure 10-18. Link Register (LR) ................................................................................................................ 10-31

Figure 10-19. Machine State Register (MSR) ............................................................................................ 10-32

Figure 10-20. Process ID (PID) .................................................................................................................. 10-34

Figure 10-21. Programmable Interval Timer (PIT) ...................................................................................... 10-35

Figure 10-22. Processor Version Register (PVR) ....................................................................................... 10-36

Figure 10-23. Storage Guarded Register (SGR) ........................................................................................ 10-37

Figure 10-24. Storage Little-Endian Register (SLER) ................................................................................ 10-39

Figure 10-25. Special Purpose Registers General (SPRG0–SPRG7) ....................................................... 10-41

Figure 10-26. Save/Restore Register 0 (SRR0) ......................................................................................... 10-42

Figure 10-27. Save/Restore Register 1 (SRR1) ......................................................................................... 10-43

Figure 10-28. Save/Restore Register 2 (SRR2) ......................................................................................... 10-44

Figure 10-29. Save/Restore Register 3 (SRR3) ......................................................................................... 10-45

Figure 10-30. Storage User-defined 0 Register (SU0R) ............................................................................. 10-46

Figure 10-31. Time Base Lower (TBL) ....................................................................................................... 10-48

Figure 10-32. Time Base Upper (TBU) ....................................................................................................... 10-49

Figure 10-33. Timer Control Register (TCR) .............................................................................................. 10-50

Figure 10-34. Timer Status Register (TSR) ................................................................................................ 10-51

Figure 10-35. User SPR General 0 (USPRG0) .......................................................................................... 10-52

Figure 10-36. Fixed Point Exception Register (XER) ................................................................................. 10-53

Figure 10-37. Zone Protection Register (ZPR) ........................................................................................... 10-54


Figure A-1. I Instruction Format ....................................................................................................................A-44

Figure A-2. B Instruction Format ...................................................................................................................A-44

Figure A-3. SC Instruction Format ................................................................................................................A-44

Figure A-4. D Instruction Format ...................................................................................................................A-44

Figure A-5. X Instruction Format ...................................................................................................................A-45

Figure A-6. XL Instruction Format .................................................................................................................A-45

Figure A-7. XFX Instruction Format ..............................................................................................................A-46

Figure A-8. XO Instruction Format ................................................................................................................A-46

Figure A-9. M Instruction Format ..................................................................................................................A-46


Tables

Table 2-1. PPC405 SPRs ................................................................................................................................ 2-6

Table 2-2. XER[CA] Updating Instructions ...................................................................................................... 2-9

Table 2-3. XER[SO,OV] Updating Instructions ................................................................................................ 2-9

Table 2-4. Time Base Registers..................................................................................................................... 2-13

Table 2-5. Alignment Exception Summary .................................................................................................... 2-17

Table 2-6. Bits of the BO Field ...................................................................................................................... 2-25

Table 2-7. Conditional Branch BO Field ........................................................................................................ 2-26

Table 2-8. Example Memory Mapping............................................................................................................ 2-30

Table 2-9. Privileged Instructions .................................................................................................................. 2-31

Table 2-10. PPC405 Instruction Set Summary............................................................................................... 2-36

Table 2-11. Implementation-specific Instructions........................................................................................... 2-37

Table 2-12. Storage Reference Instructions .................................................................................................. 2-37

Table 2-13. Arithmetic Instructions ................................................................................................................ 2-38

Table 2-14. Multiply-Accumulate and Multiply Halfword Instructions ............................................................. 2-39

Table 2-15. Logical Instructions ..................................................................................................................... 2-39

Table 2-16. Compare Instructions ................................................................................................................. 2-39

Table 2-17. Branch Instructions ..................................................................................................................... 2-40

Table 2-18. CR Logical Instructions .............................................................................................................. 2-40

Table 2-19. Rotate Instructions ..................................................................................................................... 2-40

Table 2-20. Shift Instructions ......................................................................................................................... 2-41

Table 2-21. Cache Management Instructions ................................................................................................ 2-41

Table 2-22. Interrupt Control Instructions ...................................................................................................... 2-41

Table 2-23. TLB Management Instructions ................................................................................................... 2-42

Table 2-24. Processor Management Instructions .......................................................................................... 2-42

Table 3-1. MSR Contents after Reset .............................................................................................................. 3-2

Table 3-2. SPR Contents After Reset .............................................................................................................. 3-3

Table 4-1. Available Cache Array Sizes........................................................................................................... 4-2

Table 4-2. ICU and DCU Cache Array Organization........................................................................................ 4-3

Table 4-3. Cache Sizes, Tag Fields, and Lines................................................................................................ 4-3

Table 4-4. Priority Changes With Different Data Cache Operations .............................................................. 4-17

Table 5-1. Interrupt Handling Priorities ............................................................................................................ 5-4

Table 5-2. Interrupt Vector Offsets .................................................................................................................. 5-6

Table 5-3. ESR Alteration by Various Interrupts ............................................................................................ 5-13

Table 5-4. Register Settings during Critical Input Interrupts .......................................................................... 5-14

Table 5-5. Register Settings during Machine Check—Instruction Interrupts ................................................. 5-15

Table 5-6. Register Settings during Machine Check—Data Interrupts .......................................................... 5-15

Table 5-7. Register Settings during Data Storage Interrupts ......................................................................... 5-17

Table 5-8. Register Settings during Instruction Storage Interrupts ................................................................ 5-18

Table 5-9. Register Settings during External Interrupts ................................................................................. 5-19

Table 5-10. Alignment Interrupt Summary ..................................................................................................... 5-19

Table 5-11. Register Settings during Alignment Interrupts ............................................................................ 5-19

Table 5-12. ESR Usage for Program Interrupts ............................................................................................ 5-20


Table 5-13. Register Settings during Program Interrupts ..............................................................................5-21

Table 5-14. Register Settings during FPU Unavailable Interrupts .................................................................5-21

Table 5-15. Register Settings during System Call Interrupts .........................................................................5-22

Table 5-16. Register Settings during APU Unavailable Interrupts .................................................................5-22

Table 5-17. Register Settings during Programmable Interval Timer Interrupts ..............................................5-23

Table 5-18. Register Settings during Fixed Interval Timer Interrupts ............................................................5-24

Table 5-19. Register Settings during Watchdog Timer Interrupts ..................................................................5-24

Table 5-20. Register Settings during Data TLB Miss Interrupts .....................................................................5-25

Table 5-21. Register Settings during Instruction TLB Miss Interrupts ............................................................5-25

Table 5-22. SRR2 during Debug Interrupts ....................................................................................................5-26

Table 5-23. Register Settings during Debug Interrupts ..................................................................................5-26

Table 6-1. Time Base Access ..........................................................................................................................6-3

Table 6-2. FIT Controls ....................................................................................................................................6-5

Table 6-3. Watchdog Timer Controls ...............................................................................................................6-6

Table 7-1. TLB Fields Related to Page Size ....................................................................................................7-4

Table 7-2. Protection Applied to Cache Control Instructions .........................................................................7-16

Table 8-1. Debug Events................................................................................................................................8-11

Table 8-2. DAC Applied to Cache Instructions ..............................................................................................8-15

Table 8-3. Setting of DBSR Bits for DAC and DVC Events............................................................................8-17

Table 8-4. Comparisons Based on DBCR1[DVnM]........................................................................................8-18

Table 8-5. Comparisons for Aligned DVC Accesses ......................................................................................8-18

Table 8-6. Comparisons for Misaligned DVC Accesses.................................................................................8-19

Table 8-7. JTAG Connector Signals ..............................................................................................................8-20

Table 8-8. JTAG Instructions..........................................................................................................................8-21

Table 9-1. Implementation-Specific Instructions...............................................................................................9-1

Table 9-2. Operator Precedence ......................................................................................................................9-5

Table 9-3. Extended Mnemonics for addi ........................................................................................................9-9

Table 9-4. Extended Mnemonics for addic ....................................................................................................9-10

Table 9-5. Extended Mnemonics for addic. ...................................................................................................9-11

Table 9-6. Extended Mnemonics for addis ....................................................................................................9-12

Table 9-7. Extended Mnemonics for bc, bca, bcl, bcla ..................................................................................9-21

Table 9-8. Extended Mnemonics for bcctr, bcctrl ...........................................................................................9-27

Table 9-9. Extended Mnemonics for bclr, bclrl ...............................................................................................9-30

Table 9-10. Extended Mnemonics for cmp ....................................................................................................9-34

Table 9-11. Extended Mnemonics for cmpi ...................................................................................................9-35

Table 9-12. Extended Mnemonics for cmpl ...................................................................................................9-36

Table 9-13. Extended Mnemonics for cmpli ...................................................................................................9-37

Table 9-14. Extended Mnemonics for creqv ..................................................................................................9-41

Table 9-15. Extended Mnemonics for crnor ...................................................................................................9-43

Table 9-16. Extended Mnemonics for cror .....................................................................................................9-44

Table 9-17. Extended Mnemonics for crxor ...................................................................................................9-46

Table 9-18. Transfer Bit Mnemonic Assignment...........................................................................................9-108

Table 9-19. Extended Mnemonics for mfspr ................................................................................................9-113

Table 9-20. Extended Mnemonics for mftb...................................................................................................9-114

Table 9-21. Extended Mnemonics for mftb ..................................................................................................9-115

Table 9-22. Extended Mnemonics for mtcrf .................................................................................................9-116


Table 9-23. Extended Mnemonics for mtspr ................................................................................................ 9-120

Table 9-24. Extended Mnemonics for nor, nor. ........................................................................................... 9-139

Table 9-25. Extended Mnemonics for or, or. ............................................................................................... 9-140

Table 9-26. Extended Mnemonics for ori ..................................................................................................... 9-142

Table 9-27. Extended Mnemonics for rlwimi, rlwimi. ................................................................................... 9-146

Table 9-28. Extended Mnemonics for rlwinm, rlwinm. ................................................................................. 9-147

Table 9-29. Extended Mnemonics for rlwnm, rlwnm. .................................................................................. 9-150

Table 9-30. Extended Mnemonics for subf, subf., subfo, subfo. ................................................................. 9-176

Table 9-31. Extended Mnemonics for subfc, subfc., subfco, subfco. .......................................................... 9-177

Table 9-32. Extended Mnemonics for tlbre .................................................................................................. 9-185

Table 9-33. Extended Mnemonics for tlbwe ................................................................................................ 9-189

Table 9-34. Extended Mnemonics for tw ..................................................................................................... 9-191

Table 9-35. Extended Mnemonics for twi .................................................................................................... 9-194

Table 10-1. PPC405 General Purpose Registers........................................................................................... 10-1

Table 10-2. Special Purpose Registers ......................................................................................................... 10-2

Table 10-3. Time Base Registers................................................................................................................... 10-4

Table A-1. PPC405 Instruction Syntax Summary ........................................................................................... A-1

Table A-2. PPC405 Instructions by Opcode ................................................................................................. A-33

Table B-1. PPC405 Instruction Set Categories............................................................................................... B-1

Table B-2. Implementation-specific Instructions ............................................................................................. B-1

Table B-3. Instructions in the IBM PowerPC Embedded Environment ........................................................... B-5

Table B-4. Privileged Instructions ................................................................................................................... B-7

Table B-5. Extended Mnemonics for PPC405 .............................................................................................. B-10

Table B-6. Storage Reference Instructions .................................................................................................. B-29

Table B-7. Arithmetic and Logical Instructions ............................................................................................. B-33

Table B-8. Condition Register Logical Instructions ....................................................................................... B-37

Table B-9. Branch Instructions ..................................................................................................................... B-38

Table B-10. Comparison Instructions ........................................................................................................... B-39

Table B-11. Rotate and Shift Instructions ..................................................................................................... B-40

Table B-12. Cache Control Instructions ........................................................................................................ B-41

Table B-13. Interrupt Control Instructions ..................................................................................................... B-42

Table B-14. TLB Management Instructions .................................................................................................. B-42

Table B-15. Processor Management Instructions ........................................................................................ B-44

Table C-1. Cache Sizes, Tag Fields, and Lines.............................................................................................. C-2

Table C-2. Multiply and MAC Instruction Timing............................................................................................. C-5

Table C-3. Instruction Cache Miss Penalties................................................................................................... C-7


About This Book

This user’s manual provides the architectural overview, programming model, and detailed information about the registers, the instruction set, and operations of the IBM™ PowerPC™ 405 (PPC405 core) 32-bit RISC embedded processor core.

The PPC405 RISC embedded processor core features:

• PowerPC Architecture™

• Single-cycle execution for most instructions

• Instruction cache unit and data cache unit

• Support for little endian operation

• Interrupt interface for one critical and one non-critical interrupt signal

• JTAG interface

• Extensive development tool support

Who Should Use This Book

This book is for system hardware and software developers, and for application developers who need to understand the PPC405 core. The audience should understand embedded processor design, embedded system design, operating systems, RISC processing, and design for testability.

How to Use This Book

This book describes the PPC405 device architecture, programming model, external interfaces, internal registers, and instruction set. This book contains the following chapters, arranged in parts:

Chapter 1   Overview
Chapter 2   Programming Model
Chapter 3   Initialization
Chapter 4   Cache Operations
Chapter 5   Fixed-Point Interrupts and Exceptions
Chapter 6   Timer Facilities
Chapter 7   Memory Management
Chapter 8   Debugging
Chapter 9   Instruction Set
Chapter 10  Register Summary

This book contains the following appendixes:

Appendix A  Instruction Summary
Appendix B  Instructions by Category
Appendix C  Code Optimization and Instruction Timings

To help readers find material in these chapters, the book contains:

Contents, on page v.
Figures, on page xv.
Tables, on page xviii.
Index, on page X-1.

Conventions

The following is a list of notational conventions frequently used in this manual.

In this list, an underscore marks a subscript (FLD_b is bit b of FLD) and a caret marks a superscript.

ActiveLow            An overbar indicates an active-low signal.
n                    A decimal number
0xn                  A hexadecimal number
0bn                  A binary number
=                    Assignment
∧                    AND logical operator
¬                    NOT logical operator
∨                    OR logical operator
⊕                    Exclusive-OR (XOR) logical operator
+                    Twos complement addition
–                    Twos complement subtraction, unary minus
×                    Multiplication
÷                    Division yielding a quotient
%                    Remainder of an integer division; (33 % 32) = 1.
||                   Concatenation
=, ≠                 Equal, not equal relations
<, >                 Signed comparison relations
<^u, >^u             Unsigned comparison relations
if...then...else...  Conditional execution; if condition then a else b, where a and b represent one or more pseudocode statements. Indenting indicates the ranges of a and b. If b is null, the else does not appear.
do                   Do loop. “to” and “by” clauses specify incrementing an iteration variable; “while” and “until” clauses specify terminating conditions. Indenting indicates the scope of a loop.
leave                Leave innermost do loop or do loop specified in a leave statement.
FLD                  An instruction or register field
FLD_b                A bit in a named instruction or register field
FLD_b:b              A range of bits in a named instruction or register field
FLD_b,b, . . .       A list of bits, by number or name, in a named instruction or register field
REG_b                A bit in a named register
REG_b:b              A range of bits in a named register
REG_b,b, . . .       A list of bits, by number or name, in a named register
REG[FLD]             A field in a named register
REG[FLD, FLD . . .]  A list of fields in a named register
REG[FLD:FLD]         A range of fields in a named register
GPR(r)               General Purpose Register (GPR) r, where 0 ≤ r ≤ 31.
(GPR(r))             The contents of GPR r, where 0 ≤ r ≤ 31.
DCR(DCRN)            A Device Control Register (DCR) specified by the DCRF field in an mfdcr or mtdcr instruction
SPR(SPRN)            An SPR specified by the SPRF field in an mfspr or mtspr instruction
TBR(TBRN)            A Time Base Register (TBR) specified by the TBRF field in an mftb instruction
RA, RB, . . .        GPRs
(Rx)                 The contents of a GPR, where x is A, B, S, or T
(RA|0)               The contents of the register RA or 0, if the RA field is 0.
CR_FLD               The field in the condition register pointed to by a field of an instruction.
c_0:3                A 4-bit object used to store condition results in compare instructions.
^n b                 The bit or bit value b is replicated n times.
xx                   Bit positions which are don’t-cares.
CEIL(x)              Least integer ≥ x.
EXTS(x)              The result of extending x on the left with sign bits.
PC                   Program counter.
RESERVE              Reserve bit; indicates whether a process has reserved a block of storage.
CIA                  Current instruction address; the 32-bit address of the instruction being described by a sequence of pseudocode. This address is used to set the next instruction address (NIA). Does not correspond to any architected register.
NIA                  Next instruction address; the 32-bit address of the next instruction to be executed. In pseudocode, a successful branch is indicated by assigning a value to NIA. For instructions that do not branch, the NIA is CIA + 4.
MS(addr, n)          The number of bytes represented by n at the location in main storage represented by addr.
EA                   Effective address; the 32-bit address, derived by applying indexing or indirect addressing rules to the specified operand, that specifies a location in main storage.
EA_b                 A bit in an effective address.
EA_b:b               A range of bits in an effective address.
ROTL((RS),n)         Rotate left; the contents of RS are shifted left the number of bits specified by n.
MASK(MB,ME)          Mask having 1s in positions MB through ME (wrapping if MB > ME) and 0s elsewhere.
instruction(EA)      An instruction operating on a data or instruction cache block associated with an EA.
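The ROTL, MASK, and EXTS conventions can be modeled in a few lines of Python, which may help when reading the instruction pseudocode. This is only an illustrative sketch, not part of the manual's notation: the function names and the 32-bit word width are choices made here, and the mask assumes the PowerPC bit-numbering convention in which bit 0 is the most significant bit of the word.

```python
WORD_BITS = 32
WORD_MASK = 0xFFFFFFFF


def rotl(value, n):
    """Model of ROTL((RS), n): rotate a 32-bit value left by n bits."""
    n %= WORD_BITS
    value &= WORD_MASK
    return ((value << n) | (value >> (WORD_BITS - n))) & WORD_MASK


def mask(mb, me):
    """Model of MASK(MB, ME), assuming bit 0 is the most significant bit."""
    ones_from_mb = WORD_MASK >> mb                     # 1s in positions mb..31
    ones_to_me = (WORD_MASK << (31 - me)) & WORD_MASK  # 1s in positions 0..me
    if mb <= me:
        return ones_from_mb & ones_to_me               # contiguous run mb..me
    return ones_from_mb | ones_to_me                   # wrapped run: mb..31 and 0..me


def exts(value, bits):
    """Model of EXTS(x): sign-extend a `bits`-wide value to 32 bits."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & WORD_MASK


# Rotate-then-mask, as used by the rotate instructions: move the top byte
# into the low byte position and keep only bits 24..31.
top_byte = rotl(0x12345678, 8) & mask(24, 31)
```

Combining `rotl` with `mask` in this way mirrors how the pseudocode for the rotate-and-mask instructions is typically written; the hypothetical `top_byte` variable is there only to show the composition.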


Chapter 1. Overview

The IBM PowerPC 405 32-bit reduced instruction set computer (RISC) processor core, referred to as the PPC405 core, implements the PowerPC Architecture with extensions for embedded applications.

This chapter describes:

• PPC405 core features

• The PowerPC Architecture

• The PPC405 implementation of the IBM PowerPC Embedded Environment, an extension of the PowerPC Architecture for embedded applications

• PPC405 organization, including a block diagram and descriptions of the functional units

• PPC405 registers

• PPC405 addressing modes

1.1 PPC405 Features

The PPC405 core provides high performance and low power consumption. The PPC405 RISC CPU executes at sustained speeds approaching one cycle per instruction. On-chip instruction and data cache arrays can be implemented to reduce chip count and design complexity in systems and improve system throughput.

The PowerPC RISC fixed-point CPU features:

• PowerPC User Instruction Set Architecture (UISA) and extensions for embedded applications

• Thirty-two 32-bit general purpose registers (GPRs)

• Static branch prediction

• Five-stage pipeline with single-cycle execution of most instructions, including loads/stores

• Unaligned load/store support to cache arrays, main memory, and on-chip memory (OCM)

• Hardware multiply/divide for faster integer arithmetic (4-cycle multiply, 35-cycle divide)

• Multiply-accumulate instructions

• Enhanced string and multiple-word handling

• True little endian operation

• Programmable Interval Timer (PIT), Fixed Interval Timer (FIT), and watchdog timer

• Forward and reverse trace from a trigger event

• Storage control
  – Separate, configurable, two-way set-associative instruction and data cache units; for the PPC405B3, the instruction cache array is 16KB and the data cache array is 8KB
  – Eight words (32 bytes) per cache line
  – Support for any combination of 0KB, 4KB, 8KB, 16KB, and 32KB instruction and data cache arrays, depending on model
  – Instruction cache unit (ICU) non-blocking during line fills, data cache unit (DCU) non-blocking during line fills and flushes
  – Read and write line buffers
  – Instruction fetch hits are supplied from line buffer
  – Data load/store hits are supplied to line buffer
  – Programmable ICU prefetching of next sequential line into line buffer
  – Programmable ICU prefetching of non-cacheable instructions, full line (eight words) or half line (four words)
  – Write-back or write-through DCU write strategies
  – Programmable allocation on loads and stores
  – Operand forwarding during cache line fills

• Memory Management
  – Translation of the 4GB logical address space into physical addresses
  – Independent enabling of instruction and data translation/protection
  – Page level access control using the translation mechanism
  – Software control of page replacement strategy
  – Additional control over protection using zones
  – WIU0GE (write-through, cachability, user-defined 0, guarded, endian) storage attribute control for each virtual memory region

• WIU0GE storage attribute control for thirty-two real 128MB regions in real mode

• Support for OCM that provides memory access performance identical to cache hits

• Full PowerPC floating-point unit (FPU) support using the auxiliary processor unit (APU) interface (the PPC405 does not include an FPU)

• PowerPC timer facilities
  – 64-bit time base
  – PIT, FIT, and watchdog timers
  – Synchronous external time base clock input

• Debug Support
  – Enhanced debug support with logical operators
  – Four instruction address compares (IACs)
  – Two data address compares (DACs)
  – Two data value compares (DVCs)
  – JTAG instruction to write to ICU
  – Forward or backward instruction tracing

• Minimized interrupt latency

• Advanced power management support
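Two of the figures in the feature list follow from simple arithmetic, sketched below in Python. The function names are hypothetical, and the second function assumes that the thirty-two 128MB real-mode regions are contiguous and aligned, so that the region index is given by the upper five bits of a 32-bit address; the list above states only the region count and size.

```python
def cache_sets(array_bytes, ways=2, line_bytes=32):
    """Number of sets (lines per way) in a set-associative cache array.

    Defaults follow the feature list: two-way set-associative,
    eight words (32 bytes) per cache line.
    """
    return array_bytes // (ways * line_bytes)


def real_mode_region(address):
    """Index (0..31) of the 128MB region containing a 32-bit address.

    Assumes contiguous, aligned regions: 128MB = 2**27 bytes, so the
    index is the upper five bits of the address.
    """
    return (address & 0xFFFFFFFF) >> 27


# PPC405B3 array sizes quoted in the feature list.
icache_sets = cache_sets(16 * 1024)   # 16KB instruction cache array
dcache_sets = cache_sets(8 * 1024)    # 8KB data cache array

# Thirty-two 128MB regions exactly cover the 4GB address space.
covers_4gb = 32 * 128 * 2**20 == 2**32
```

Under these assumptions a 16KB two-way array holds 256 sets and an 8KB array holds 128, and every 32-bit address falls in exactly one of the 32 regions.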


1.2 PowerPC Architecture

The PowerPC Architecture comprises three levels of standards:

• PowerPC User Instruction Set Architecture (UISA), including the base user-level instruction set, user-level registers, programming model, data types, and addressing modes. This is referred to as Book I of the PowerPC Architecture.

• PowerPC Virtual Environment Architecture, describing the memory model, cache model, cache-control instructions, address aliasing, and related issues. While accessible from the user level, these features are intended to be accessed from within library routines provided by the system software. This is referred to as Book II of the PowerPC Architecture.

• PowerPC Operating Environment Architecture, including the memory management model, supervisor-level registers, and the exception model. These features are not accessible from the user level. This is referred to as Book III of the PowerPC Architecture.

Book I and Book II define the instruction set and facilities available to the application programmer. Book III defines features, such as system-level instructions, that are not directly accessible by user applications. The PowerPC Architecture is described in The PowerPC Architecture: A Specification for a New Family of RISC Processors.

The PowerPC Architecture provides compatibility of PowerPC Book I application code across all PowerPC implementations to help maximize the portability of applications developed for PowerPC processors. This is accomplished through compliance with the first level of the architectural definition, the PowerPC UISA, which is common to all PowerPC implementations.

1.3 The PPC405 as a PowerPC Implementation

The PPC405 implements the PowerPC UISA, user-level registers, programming model, data types, addressing modes, and 32-bit ﬁxed-point operations. The PPC405 fully complies with the PowerPC UISA. The UISA 64-bit operations are not implemented, nor are the ﬂoating point operations, unless a ﬂoating point unit (FPU) is implemented. The ﬂoating point operations, which cause exceptions, can then be emulated by software.

Most of the features of the PPC405 are compatible with the PowerPC Virtual Environment and Operating Environment Architectures, as implemented in PowerPC processors such as the 6xx/7xx family. The PPC405 also provides a number of optimizations and extensions to these layers of the PowerPC Architecture. The full architecture of the PPC405 is deﬁned by the PowerPC Embedded Environment and the PowerPC User Instruction Set Architecture.

The primary extensions of the PowerPC Architecture deﬁned in the Embedded Environment are:

• A simpliﬁed memory management mechanism with enhancements for embedded applications

• An enhanced, dual-level interrupt structure

• An architected DCR address space for integrated peripheral control

• The addition of several instructions to support these modified and extended resources

Finally, some of the specific implementation features of the PPC405 are beyond the scope of the PowerPC Architecture. These features are included to enhance performance, integrate functionality, and reduce system complexity in embedded control applications.

1.4 Processor Core Organization

The processor core consists of a 5-stage pipeline, separate instruction and data cache units, virtual memory management unit (MMU), three timers, debug, and interfaces to other functions.

Figure 1-1 illustrates the logical organization of the PPC405.

Figure 1-1. PPC405 Block Diagram

[Block diagram. Blocks shown: instruction cache unit and data cache unit (each with a cache controller, cache array, PLB master interface, and OCM interface); MMU (4-entry instruction shadow TLB, 64-entry unified TLB, 8-entry data shadow TLB); 405 CPU (fetch and decode logic with a 3-element fetch queue (PFB1, PFB0, DCD), and an execute unit (EXU) with a 32 x 32 GPR file, ALU, and MAC); APU/FPU interface; timers (FIT, PIT, watchdog); debug logic (4 IAC, 2 DAC, 2 DVC); JTAG; and instruction trace.]

1.4.1 Instruction and Data Cache Controllers

The instruction cache unit (ICU) and data cache unit (DCU) enable concurrent accesses and minimize pipeline stalls. The storage capacity of the cache units, which can range from 0KB–32KB, depends upon the implementation. Both cache units are two-way set-associative and use a 32-byte line size. The instruction set provides a rich assortment of cache control instructions, including instructions to read tag information and data arrays. See Chapter 4, “Cache Operations,” for detailed information about the ICU and DCU.

The cache units are PLB-compliant for use in the IBM Core+ASIC program.

1.4.1.1 Instruction Cache Unit

The ICU provides one or two instructions per cycle to the execution unit (EXU) over a 64-bit bus. A line buffer (built into the output of the array for manufacturing test) enables the ICU to be accessed only once for every four instructions, to reduce power consumption by the array.

The ICU can forward any or all of the words of a line fill to the EXU to minimize pipeline stalls caused by cache misses. The ICU aborts speculative fetches abandoned by the EXU, eliminating unnecessary line fills and enabling the ICU to handle the next EXU fetch. Aborting abandoned requests also eliminates unnecessary external bus activity to increase external bus utilization.

1.4.1.2 Data Cache Unit

The DCU transfers 1, 2, 3, 4, or 8 bytes per cycle, depending on the number of byte enables presented by the CPU. The DCU contains a single-element command and store data queue to reduce pipeline stalls; this queue enables the DCU to independently process load/store and cache control instructions. Dynamic PLB request prioritization reduces pipeline stalls even further. When the DCU is busy with a low-priority request while a subsequent storage operation requested by the CPU is stalled, the DCU automatically increases the priority of the current request to the PLB.

The DCU uses a two-line ﬂush queue to minimize pipeline stalls caused by cache misses. Line ﬂushes are postponed until after a line ﬁll is completed. Registers comprise the ﬁrst position of the ﬂush queue; the line buffer built into the output of the array for manufacturing test serves as the second position of the ﬂush queue. Pipeline stalls are further reduced by forwarding the requested word to the CPU during the line ﬁll. Single-queued ﬂushes are non-blocking. When a ﬂush operation is pending, the DCU can continue to access the array to determine subsequent load or store hits. Under these conditions, load hits can occur concurrently with store hits to write-back memory without stalling the pipeline. Requests abandoned by the CPU can also be aborted by the cache controller.

Additional DCU features enable the programmer to tailor performance for a given application. The DCU can function in write-back or write-through mode, as controlled by the Data Cache Write-through Register (DCWR) or the translation look-aside buffer (TLB). DCU performance can be tuned to balance performance and memory coherency. Store-without-allocate, controlled by the SWOA field of the Core Configuration Register 0 (CCR0), can inhibit line fills caused by store misses to further reduce potential pipeline stalls and unwanted external bus traffic. Similarly, load-without-allocate, controlled by CCR0[LWOA], can inhibit line fills caused by load misses.

1.4.2 Memory Management Unit

The 4GB address space of the PPC405 is presented as a flat address space. The MMU provides address translation, protection functions, and storage attribute control for embedded applications. The MMU supports demand paged virtual memory and other management schemes that require precise control of logical to physical address mapping and flexible memory protection. Working with appropriate system level software, the MMU provides the following functions:

• Translation of the 4GB logical address space into physical addresses

• Independent enabling of instruction and data translation/protection

• Page level access control using the translation mechanism

• Software control of page replacement strategy

• Additional control over protection using zones

• Storage attributes for cache policy and speculative memory access control

The MMU can be disabled under software control. If the MMU is not used, the PPC405 core provides other storage control mechanisms.

The translation lookaside buffer (TLB) is the hardware resource that controls translation and protection. It consists of 64 entries, each specifying a page to be translated. The TLB is fully associative; a page entry can be placed anywhere in the TLB. The translation function of the MMU occurs pre-cache for data accesses. Cache tags and indexing use physical addresses for data accesses; instruction fetches are virtually indexed and physically tagged.

Software manages the establishment and replacement of TLB entries. This gives system software significant flexibility in implementing a custom page replacement strategy. For example, to reduce TLB thrashing or translation delays, software can reserve several TLB entries for globally accessible static mappings. The instruction set provides several instructions to manage TLB entries. These instructions are privileged and require the software to be executing in supervisor state. Additional TLB instructions are provided to move TLB entry fields to and from GPRs.
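As an illustration of a software-chosen replacement strategy, the following C sketch keeps a round-robin cursor over the 64 TLB entries while skipping a block of entries reserved for static mappings. The number of reserved entries, the type tlb_policy_t, and both function names are hypothetical; the sketch only selects an index and does not execute any TLB instruction.

```c
#include <stdint.h>

#define TLB_ENTRIES   64u  /* unified TLB size stated in the text */
#define TLB_RESERVED   4u  /* hypothetical: entries 0..3 hold static mappings */

/* Hypothetical software-maintained replacement cursor. */
typedef struct {
    uint32_t next;         /* index of the next entry to replace */
} tlb_policy_t;

/* Start replacing at the first entry that is not reserved. */
static void tlb_policy_init(tlb_policy_t *p)
{
    p->next = TLB_RESERVED;
}

/* Return the index of the entry to overwrite and advance the cursor,
 * wrapping back to the first non-reserved entry after the last one. */
static uint32_t tlb_policy_pick_victim(tlb_policy_t *p)
{
    uint32_t victim = p->next;
    p->next = (victim + 1u < TLB_ENTRIES) ? victim + 1u : TLB_RESERVED;
    return victim;
}
```

A TLB-miss handler built this way would call tlb_policy_pick_victim() to obtain the index and then write the new translation into that entry with the privileged TLB instructions.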

The MMU divides logical storage into pages. Eight page sizes (1KB, 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, 16MB) are simultaneously supported, so that, at any given time, the TLB can contain entries for any combination of page sizes. For a logical to physical translation to occur, a valid entry for the page containing the logical address must be in the TLB. Addresses for which no TLB entry exists cause TLB-Miss exceptions.
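The eight page sizes form a geometric series in which each size is four times the previous one. The C sketch below computes a page size from a size code and splits an address into page base and offset. The mapping of codes 0 through 7 onto 1KB through 16MB is an assumption made for illustration; this section does not define the encoding of the size in a TLB entry.

```c
#include <stdint.h>

/* Page size in bytes for a size code 0..7.
 * Assumed encoding (not defined in this section): 0 = 1KB, and each
 * step multiplies the size by 4, up to 7 = 16MB. Returns 0 if invalid. */
static uint32_t page_size_bytes(unsigned size_code)
{
    if (size_code > 7u)
        return 0u;
    return 1024u << (2u * size_code);
}

/* Base address of the page containing addr (0 for an invalid code). */
static uint32_t page_base(uint32_t addr, unsigned size_code)
{
    uint32_t size = page_size_bytes(size_code);
    return size ? (addr & ~(size - 1u)) : 0u;
}

/* Offset of addr within its page (0 for an invalid code). */
static uint32_t page_offset(uint32_t addr, unsigned size_code)
{
    uint32_t size = page_size_bytes(size_code);
    return size ? (addr & (size - 1u)) : 0u;
}
```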

To improve performance, 4 instruction-side and 8 data-side TLB entries are kept in shadow arrays. The shadow arrays prevent TLB contention. Hardware manages the replacement and invalidation of shadow-TLB entries; no system software action is required. The shadow arrays can be thought of as level 1 TLBs, with the main TLB serving as a level 2 TLB.

When address translation is enabled, the translation mechanism provides a basic level of protection. Physical addresses not mapped by a page entry are inaccessible when translation is enabled. Read access is implied by the existence of the valid entry in the TLB. The EX and WR bits in the TLB entry further deﬁne levels of access for the page, by permitting execute and write access, respectively.

The Zone Protection Register (ZPR) enables the system software to override the TLB access controls. For example, the ZPR provides a way to deny read access to application programs. The ZPR can be used to classify storage by type; access by type can be changed without manipulating individual TLB entries.

The PowerPC Architecture provides WIU0GE (write-back/write through, cachability, user-deﬁned 0, guarded, endian) storage attributes that control memory accesses, using bits in the TLB or, when address translation is disabled, storage attribute control registers.

When address translation is enabled (MSR[IR, DR] = 1), storage attribute control bits in the TLB control the storage attributes associated with the current page. When address translation is disabled (MSR[IR, DR] = 0), bits in each storage attribute control register control the storage attributes associated with storage regions. Each storage attribute control register contains 32 ﬁelds. Each ﬁeld sets the associated storage attribute for a 128MB memory region. See “Real-Mode Storage Attribute Control” on page 7-17 for more information about the storage attribute control registers.

1.4.3 Timer Facilities

The processor core contains a time base and three timers:

• Programmable Interval Timer (PIT)

• Fixed Interval Timer (FIT)

• Watchdog timer

The time base is a 64-bit counter incremented either by an internal signal equal to the CPU clock rate or by a separate external timer clock signal. No interrupts are generated when the time base rolls over.

The PIT is a 32-bit register that is decremented at the same rate as the time base is incremented. The user loads the PIT register with a value to create the desired delay. When a decrement occurs on a PIT count of 1, the timer stops decrementing, a bit is set in the Timer Status Register (TSR), and a PIT interrupt is generated. Optionally, the PIT can be programmed to reload automatically the last value written to the PIT register, after which the PIT begins decrementing again. The Timer Control Register (TCR) contains the interrupt enable for the PIT interrupt.
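Because the PIT counts down once per time base increment, the value to load for a given delay is the delay multiplied by the timer clock frequency. The C sketch below shows that arithmetic only; the function name is hypothetical, and the clock frequency is a parameter the caller must supply for a particular system.

```c
#include <stdint.h>

/* Number of PIT decrements needed for a delay of delay_us microseconds
 * when the timer is clocked at timer_hz. Returns 0 when the delay rounds
 * down to zero ticks or does not fit in the 32-bit PIT register. */
static uint32_t pit_value_for_delay_us(uint32_t timer_hz, uint32_t delay_us)
{
    uint64_t ticks = ((uint64_t)timer_hz * delay_us) / 1000000u;

    if (ticks == 0u || ticks > UINT32_MAX)
        return 0u;
    return (uint32_t)ticks;
}
```

For example, with an assumed 200 MHz timer clock, a 1 ms delay corresponds to a PIT value of 200000; a 30 s delay would need more than 32 bits and is rejected.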

The FIT generates periodic interrupts based on selected bits in the time base. Users can select one of four intervals for the timer period by setting the appropriate bits in the TCR. When the selected bit in the time base changes from 0 to 1, a bit is set in the TSR and a FIT interrupt is generated. The FIT interrupt enable is contained in the TCR.
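The 0-to-1 transition that triggers a FIT event can be modeled on a plain 64-bit counter. In the C sketch below, the bit index is counted from the least significant bit (LSB = 0), which differs from the PowerPC bit numbering convention, and the bit chosen in any example is arbitrary; the four bits actually selectable through the TCR are not listed in this section. Both function names are hypothetical.

```c
#include <stdint.h>

/* Nonzero if the selected counter bit changed from 0 to 1 between two
 * samples of the time base. 'bit' is counted from the LSB (LSB = 0)
 * and must be in the range 0..63. */
static int timebase_bit_rose(uint64_t before, uint64_t after, unsigned bit)
{
    uint64_t mask = (uint64_t)1 << bit;

    return ((before & mask) == 0u) && ((after & mask) != 0u);
}

/* Number of counter increments between successive 0-to-1 transitions
 * of the selected bit. 'bit' must be in the range 0..62. */
static uint64_t bit_rise_period_clocks(unsigned bit)
{
    return (uint64_t)1 << (bit + 1u);
}
```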

The watchdog timer generates a periodic interrupt based on selected bits in the time base. Users can select one of four time periods for the interval and the type of reset generated if the watchdog timer expires twice without an intervening clear from software.

1.4.4 Debug

The processor core debug facilities include debug modes for the various types of debugging used during hardware and software development. Also included are debug events that allow developers to control the debug process. Debug modes and debug events are controlled using debug registers in the chip. The debug registers are accessed either through software running on the processor, or through the JTAG port. The JTAG port can also be used for board test.

The debug modes, events, controls, and interfaces provide a powerful combination of debug facilities for hardware and software development tools.

1.4.4.1 Development Tool Support

The PPC405 supports a wide range of hardware and software development tools.

An operating system debugger is an example of an operating system-aware debugger, implemented using software traps.

RISCWatch is an example of a development tool that uses the external debug mode, debug events, and the JTAG port to support hardware and software development and debugging.

The RISCTrace™ feature of RISCWatch is an example of a development tool that uses the real-time trace capability of the processor core.

1.4.4.2 Debug Modes

The internal, external, real-time-trace, and debug wait modes support a variety of debug tools used in embedded systems development. These debug modes are described in detail in “Debug Modes” on page 8-1.

1.4.5 Core Interfaces

The core provides a range of I/O interfaces that simplify the attachment of on-chip and off-chip devices.

1.4.5.1 Processor Local Bus

The PLB-compliant interface provides separate 32-bit address and 64-bit data buses for the instruction and data sides.

1.4.5.2 Device Control Register Bus

The Device Control Register (DCR) bus supports the attachment of on-chip registers for device control. These registers are accessed using the mfdcr and mtdcr instructions.

1.4.5.3 Clock and Power Management

This interface supports several methods of clock distribution and power management.

1.4.5.4 JTAG

The JTAG port is enhanced to support the attachment of a debug tool such as the RISCWatch product from IBM Microelectronics. Through the JTAG test access port, a debug tool can single-step the processor and interrogate internal processor state to facilitate software debugging. The enhancements comply with the IEEE 1149.1 speciﬁcation for vendor-speciﬁc extensions, and are therefore compatible with standard JTAG hardware for boundary-scan system testing.

1.4.5.5 Interrupts

The processor core provides an interface to an on-chip interrupt controller that is logically outside the core. The interrupt controller combines asynchronous interrupt inputs from on-chip and off-chip sources and presents them to the core using a pair of interrupt signals: critical and non-critical. The sources of asynchronous interrupts are external signals, the JTAG/debug unit, and any implemented peripherals.

1.4.5.6 Auxiliary Processor Unit

The auxiliary processor unit (APU) interface supports the attachment of auxiliary processor hardware and the implementation of the associated instructions for improved performance in specialized applications.

1.4.5.7 On-Chip Memory

The on-chip memory (OCM) interface supports the implementation of instruction- and data-side memory that can be accessed at performance levels matching the cache arrays.

1.4.6 Data Types

Processor core operands are bytes, halfwords, and words. Multiple words or strings of bytes can be transferred using the load/store multiple and load/store string instructions. Data is represented in twos complement notation or in unsigned fixed-point format.

The address of a multibyte operand is always the lowest memory address occupied by that operand. Byte ordering can be selected as big endian (the lowest memory address of an operand contains its most significant byte) or as little endian (the lowest memory address of an operand contains its least significant byte). See “Byte Ordering” on page 2-17 for more information about big and little endian operation.
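The two byte orderings can be illustrated by assembling a word from the same four bytes in memory. The following C sketch is a portable model of the definitions above, not a description of the load hardware; the function names are hypothetical.

```c
#include <stdint.h>

/* Big endian: the lowest address holds the most significant byte. */
static uint32_t load_word_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Little endian: the lowest address holds the least significant byte. */
static uint32_t load_word_le(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

Given the bytes 0x12, 0x34, 0x56, 0x78 at ascending addresses, the big endian interpretation is 0x12345678 and the little endian interpretation is 0x78563412.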

1.4.7 Processor Core Register Set Summary

The processor core registers can be grouped into basic categories based on function and access mode: general purpose registers (GPRs), special purpose registers (SPRs), the machine state register (MSR), the condition register (CR), and, in Core+ASIC implementations, device control registers (DCRs).

Chapter 10, “Register Summary,” provides a register diagram and a register ﬁeld description table for each register.

1.4.7.1 General Purpose Registers

The processor core contains 32 GPRs; each register contains 32 bits. The contents of the GPRs can be transferred from memory using load instructions and stored to memory using store instructions. GPRs, which are speciﬁed as operands in many instructions, can also receive instruction results and the contents of other registers.

1.4.7.2 Special Purpose Registers

Special Purpose Registers (SPRs), which are part of the PowerPC Architecture, are accessed using the mtspr and mfspr instructions. SPRs control the use of the debug facilities, timers, interrupts, storage control attributes, and other architected processor resources.

All SPRs are privileged (unavailable to user-mode programs), except the Count Register (CTR), the Link Register (LR), SPR General Purpose Registers (SPRG4–SPRG7, read-only), and the Fixed-point Exception Register (XER). Note that access to the Time Base Lower (TBL) and Time Base Upper (TBU) registers, when addressed as SPRs, is write-only and privileged. However, when addressed as Time Base Registers (TBRs), read access to these registers is not privileged. See “Time Base Registers” on page 10-4 for more information.

1.4.7.3 Machine State Register

The PPC405 contains a 32-bit Machine State Register (MSR). The contents of a GPR can be written to the MSR using the mtmsr instruction, and the MSR contents can be read into a GPR using the mfmsr instruction. The MSR contains fields that control the operation of the processor core.

1.4.7.4 Condition Register

The PPC405 contains a 32-bit Condition Register (CR). These bits are grouped into eight 4-bit ﬁelds, CR[CR0]–CR[CR7]. Instructions are provided to perform logical operations on CR ﬁelds and bits within ﬁelds and to test CR bits within ﬁelds. The CR ﬁelds, which are set by compare instructions, can be used to control branches. CR[CR0] can be set implicitly by arithmetic instructions.
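The grouping of the CR into eight 4-bit fields can be sketched in C as follows. The sketch assumes the PowerPC convention that bit 0 is the most significant bit, so that CR0 occupies the four most significant bits. The names of the bits within a field (LT, GT, EQ, SO, as set by fixed-point compares) come from the PowerPC Architecture and are not stated in this section; the function and macro names are hypothetical.

```c
#include <stdint.h>

/* Bits within one 4-bit CR field, most significant first
 * (names per the PowerPC Architecture, assumed here). */
#define CR_FIELD_LT 0x8u
#define CR_FIELD_GT 0x4u
#define CR_FIELD_EQ 0x2u
#define CR_FIELD_SO 0x1u

/* Extract CR field n (0..7) from a 32-bit CR image.
 * Field 0 is the most significant 4 bits, field 7 the least. */
static unsigned cr_field(uint32_t cr, unsigned n)
{
    return (unsigned)(cr >> (28u - 4u * (n & 7u))) & 0xFu;
}
```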

1.4.7.5 Device Control Registers

DCRs, which are architecturally outside of the processor core, are accessed using the mtdcr and mfdcr instructions. DCRs are used to control, conﬁgure, and hold status for various functional units that are not part of the processor core. Although the PPC405 does not contain DCRs, the mtdcr and mfdcr instructions are provided.

The mtdcr and mfdcr instructions are privileged, for all DCRs. Therefore, all accesses to DCRs are privileged. See “Privileged Mode Operation” on page 2-30.

All DCR numbers are reserved, and should be neither read nor written, unless they are part of an IBM Core+ASIC implementation.

1.4.8 Addressing Modes

The processor core supports the following addressing modes, which enable efﬁcient retrieval and storage of data in memory:

• Base plus displacement addressing

• Indexed addressing

• Base plus displacement addressing and indexed addressing, with update

In the base plus displacement addressing mode, an effective address (EA) is formed by adding a displacement to a base address contained in a GPR (or to an implied base of 0). The displacement is an immediate field in an instruction.

In the indexed addressing mode, the EA is formed by adding an index contained in a GPR to a base address contained in a GPR (or to an implied base of 0).

The base plus displacement and the indexed addressing modes also have a “with update” mode. In “with update” mode, the effective address calculated for the current operation is saved in the base GPR, and can be used as the base in the next operation. The “with update” mode relieves the processor from repeatedly loading a GPR with an address for each piece of data, regardless of the proximity of the data in memory.
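The effective address calculations for these modes can be modeled in C as shown below. The sketch makes two assumptions that this section does not spell out: the displacement is treated as a signed 16-bit immediate, and specifying register 0 as the base selects the implied base of 0. The update forms assume a nonzero base register. The type regs_t and the function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the 32 general purpose registers. */
typedef struct {
    uint32_t gpr[32];
} regs_t;

/* Base value: assumed convention that base register 0 means literal 0. */
static uint32_t base_or_zero(const regs_t *r, unsigned ra)
{
    return (ra == 0u) ? 0u : r->gpr[ra];
}

/* Base plus displacement: EA = (RA|0) + sign-extended displacement. */
static uint32_t ea_disp(const regs_t *r, unsigned ra, int16_t d)
{
    return base_or_zero(r, ra) + (uint32_t)(int32_t)d;
}

/* Indexed: EA = (RA|0) + (RB). */
static uint32_t ea_indexed(const regs_t *r, unsigned ra, unsigned rb)
{
    return base_or_zero(r, ra) + r->gpr[rb];
}

/* Base plus displacement with update: EA is saved back into the base GPR. */
static uint32_t ea_disp_update(regs_t *r, unsigned ra, int16_t d)
{
    uint32_t ea = r->gpr[ra] + (uint32_t)(int32_t)d;
    r->gpr[ra] = ea;
    return ea;
}

/* Indexed with update: EA is saved back into the base GPR. */
static uint32_t ea_indexed_update(regs_t *r, unsigned ra, unsigned rb)
{
    uint32_t ea = r->gpr[ra] + r->gpr[rb];
    r->gpr[ra] = ea;
    return ea;
}
```

Calling ea_disp_update() repeatedly with the same displacement steps the base register through consecutive elements, which is how the update forms avoid a separate address-increment instruction.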

Chapter 2. Programming Model

The programming model of the PPC405 embedded processor core describes the following features and operations:

• Memory organization and addressing

• Registers

• Data types and alignment

• Byte ordering

• Instruction processing

• Branching control

• Speculative accesses

• Privileged mode operation

• Synchronization

• Instruction set

2.1 User and Privileged Programming Models

The PPC405 executes programs in two modes, also referred to as states. Programs running in privileged mode (also referred to as the supervisor state) can access any register and execute any instruction. These instructions and registers comprise the privileged programming model. In user mode, certain registers and instructions are unavailable to programs. This is also called the problem state. Those registers and instructions that are available comprise the user programming model.

Privileged mode provides operating system software access to all processor resources. Because access to certain processor resources is denied in user mode, application software runs in user mode. Operating system software and other application software is protected from the effects of an errant application program.

Throughout this book, the terms user program and privileged programs are used to associate programs with one of the programming models. Registers and instructions are described as user or privileged. Privileged mode operation is described in detail in “Privileged Mode Operation” on page 2-30.

2.2 Memory Organization and Addressing

The PowerPC Architecture defines a 32-bit, 4-gigabyte (GB) flat address space for instructions and data.

User’s manuals for standard products containing a PPC405 core describe the memory organizations and physical address maps of the standard products.

2.2.1 Storage Attributes

The PowerPC Architecture deﬁnes storage attributes that control data and instruction accesses. Storage attributes are provided to control cache write-through policy (the W storage attribute), cachability (the I storage attribute), memory coherency in multiprocessor environments (the M storage attribute), and guarding against speculative memory accesses (the G storage attribute). The IBM PowerPC Embedded Environment deﬁnes additional storage attributes for storage compression (the U0 storage attribute) and byte ordering (the E storage attribute).

The PPC405 core provides two control mechanisms for the W, I, U0, G, and E attributes. Because the PPC405 core does not provide hardware support for multiprocessor environments, the M storage attribute, when present, has no effect.

When the PPC405 core operates in virtual mode (address translation is enabled), each storage attribute is controlled by the W, I, U0, G, and E ﬁelds in the translation lookaside buffer (TLB) entry for each memory page. The size of memory pages, and hence the size of storage attribute control regions, is variable. Multiple sizes can be in effect simultaneously on different pages.

When the PPC405 core operates in real mode (address translation is disabled), storage attribute control registers control the corresponding storage attributes. These registers are:

• Data Cache Write-through Register (DCWR)

• Data Cache Cachability Register (DCCR)

• Instruction Cache Cachability Register (ICCR)

• Storage Guarded Register (SGR)

• Storage Little-Endian Register (SLER)

• Storage User-defined 0 Register (SU0R)

Each storage attribute control register contains 32 bits; each bit controls one of thirty-two 128MB storage attribute control regions. Bit 0 of each register controls the lowest-order region, with ascending bits controlling ascending regions in memory. The storage attributes in each storage attribute region are set independently of each other and of the storage attributes for other regions.
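The mapping from an address to its controlling bit can be sketched in C. A 128MB region spans 2^27 bytes, so the upper five address bits select the region. The sketch assumes the PowerPC convention that bit 0 is the most significant bit of the register, so region 0 corresponds to mask 0x80000000. The function names are hypothetical, and reg_value stands for the contents of any one of the registers listed above.

```c
#include <stdint.h>

/* Index (0..31) of the 128MB region containing a 32-bit address. */
static unsigned sacr_region(uint32_t addr)
{
    return (unsigned)(addr >> 27);
}

/* Mask for the register bit that controls a region, assuming
 * PowerPC numbering in which bit 0 is the most significant bit. */
static uint32_t sacr_bit_mask(unsigned region)
{
    return 0x80000000u >> (region & 31u);
}

/* Nonzero if the attribute is set for the region containing addr. */
static int sacr_attr_is_set(uint32_t reg_value, uint32_t addr)
{
    return (reg_value & sacr_bit_mask(sacr_region(addr))) != 0u;
}
```

For example, a register value of 0x80000001 sets the attribute for addresses 0x00000000 through 0x07FFFFFF and for addresses 0xF8000000 through 0xFFFFFFFF only.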

2.3 Registers

All PPC405 registers are listed in this section. Some of the frequently-used registers are described in detail. Other registers are covered in their respective topic chapters (for example, the cache registers are described in Chapter 4, “Cache Operations”). All registers are summarized in Chapter 10, “Register Summary.”

The registers are grouped into categories: General Purpose Registers (GPRs), Special Purpose Registers (SPRs), Time Base Registers (TBRs), the Machine State Register (MSR), the Condition Register (CR), and, in standard products, Device Control Registers (DCRs). Different instructions are used to access each category of registers.

For all registers with fields marked as reserved, the reserved fields should be written as 0 and read as undefined. That is, when writing to a register with a reserved field, write a 0 to the reserved field. When reading from a register with a reserved field, ignore that field.

Programming Note: A good coding practice is to perform the initial write to a register with reserved fields as described, and to perform all subsequent writes to the register using a read-modify-write strategy: read the register, use logical instructions to alter defined fields, leaving reserved fields unmodified, and write the register.
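The logical step of the read-modify-write strategy can be written as a small C helper. This is a minimal sketch: the function name is hypothetical, and the register read and write themselves (for example with mfspr and mtspr) are outside the sketch.

```c
#include <stdint.h>

/* Compute the value to write back: bits selected by field_mask are taken
 * from new_bits, all other bits (including reserved fields) keep the
 * value that was read from the register. */
static uint32_t rmw_field(uint32_t current, uint32_t field_mask,
                          uint32_t new_bits)
{
    return (current & ~field_mask) | (new_bits & field_mask);
}
```

In use, current is the value just read from the register, field_mask covers only the defined field being changed, and the result is written back to the same register.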

Figure 2-1 on page 2-4 illustrates the registers in the user and supervisor programming models.

User Model

General-Purpose Registers: GPR0–GPR31

SPR General Registers (read-only): SPRG4 (SPR 0x104), SPRG5 (SPR 0x105), SPRG6 (SPR 0x106), SPRG7 (SPR 0x107)

User SPR General Register 0 (read/write): USPRG0 (SPR 0x100)

Condition Register: CR

Fixed-Point Exception Register: XER (SPR 0x001)

Link Register: LR (SPR 0x008)

Count Register: CTR (SPR 0x009)

Time Base Registers (read-only): TBL (TBR 0x10C), TBU (TBR 0x10D)

Supervisor Model

Machine State Register: MSR

Core Configuration Register: CCR0 (SPR 0x3B3)

SPR General Registers: SPRG0 (SPR 0x110), SPRG1 (SPR 0x111), SPRG2 (SPR 0x112), SPRG3 (SPR 0x113), SPRG4 (SPR 0x114), SPRG5 (SPR 0x115), SPRG6 (SPR 0x116), SPRG7 (SPR 0x117)

Processor Version Register: PVR (SPR 0x11F)

Timer Facilities

Time Base Registers: TBL (SPR 0x11C), TBU (SPR 0x11D)

Timer Control Register: TCR (SPR 0x3DA)

Timer Status Register: TSR (SPR 0x3D8)

Programmable Interval Timer: PIT (SPR 0x3DB)

Storage Attribute Control Registers: DCCR (SPR 0x3FA), DCWR (SPR 0x3BA), ICCR (SPR 0x3FB), SGR (SPR 0x3B9), SLER (SPR 0x3BB), SU0R (SPR 0x3BC)

Exception Handling Registers

Exception Vector Prefix Register: EVPR (SPR 0x3D6)

Exception Syndrome Register: ESR (SPR 0x3D4)

Data Exception Address Register: DEAR (SPR 0x3D5)

Save/Restore Registers: SRR0 (SPR 0x01A), SRR1 (SPR 0x01B), SRR2 (SPR 0x3DE), SRR3 (SPR 0x3DF)

Memory Management Registers

Process ID: PID (SPR 0x3B1)

Zone Protection Register: ZPR (SPR 0x3B0)

Debug Registers

Debug Status Register: DBSR (SPR 0x3F0)

Debug Control Registers: DBCR0 (SPR 0x3F2), DBCR1 (SPR 0x3BD)

Data Address Compares: DAC1 (SPR 0x3F6), DAC2 (SPR 0x3F7)

Data Value Compares: DVC1 (SPR 0x3B6), DVC2 (SPR 0x3B7)

Instruction Address Compares: IAC1 (SPR 0x3F4), IAC2 (SPR 0x3F5), IAC3 (SPR 0x3B4), IAC4 (SPR 0x3B5)

Instruction Cache Debug Data Register: ICDBDR (SPR 0x3D3)

Figure 2-1. PPC405 Programming Model: Registers

2.3.1 General Purpose Registers (R0-R31)

The PPC405 core contains thirty-two 32-bit general purpose registers (GPRs). Data from memory can be read into GPRs using load instructions and the contents of GPRs can be written to memory using store instructions. Most integer instructions use GPRs for source and destination operands. See Table 10, “Register Summary,” on page 10-1 for the numbering of the GPRs.

0 31

Figure 2-2. General Purpose Registers (R0-R31)

0:31 General Purpose Register data

2.3.2 Special Purpose Registers

Special purpose registers (SPRs), which are part of the PowerPC Architecture and the IBM PowerPC Embedded Environment, are accessed using the mtspr and mfspr instructions.

SPRs control the operation of debug facilities, timers, interrupts, storage control attributes, and other architected processor resources. Table 10, “Register Summary,” on page 10-1 shows the mnemonic, name, and number for each SPR. Table 2-1, “PPC405 SPRs,” on page 2-6 lists the PPC405 SPRs by function and indicates the pages where the SPRs are described more fully.

Except for the Link Register (LR), the Count Register (CTR), the Fixed-point Exception Register (XER), User SPR General 0 (USPRG0), and read access to SPR General 4–7 (SPRG4–SPRG7), all SPRs are privileged. As SPRs, the registers TBL and TBU are privileged write-only; as TBRs, these registers can be read in user mode. Unless used to access non-privileged SPRs, attempts to execute mfspr and mtspr instructions while in user mode cause privileged violation program interrupts. See “Privileged SPRs” on page 2-32.
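The access rule above can be summarized as a small C predicate, of the kind an instruction-set simulator might use to decide whether a user-mode mfspr or mtspr is allowed. The SPR numbers are those shown for the user model in Figure 2-1; the enumerator and function names are hypothetical.

```c
#include <stdbool.h>

/* SPR numbers of the user-model registers, as shown in Figure 2-1. */
enum {
    SPRN_XER      = 0x001,
    SPRN_LR       = 0x008,
    SPRN_CTR      = 0x009,
    SPRN_USPRG0   = 0x100,
    SPRN_SPRG4_RO = 0x104,  /* first of the user read-only SPRG numbers */
    SPRN_SPRG7_RO = 0x107   /* last of the user read-only SPRG numbers */
};

/* True if a user-mode access to SPR number sprn is permitted.
 * is_write selects mtspr (true) or mfspr (false). */
static bool spr_user_access_ok(unsigned sprn, bool is_write)
{
    switch (sprn) {
    case SPRN_XER:
    case SPRN_LR:
    case SPRN_CTR:
    case SPRN_USPRG0:
        return true;            /* readable and writable in user mode */
    default:
        break;
    }
    if (sprn >= SPRN_SPRG4_RO && sprn <= SPRN_SPRG7_RO)
        return !is_write;       /* SPRG4-SPRG7: user read access only */
    return false;               /* all other SPRs are privileged */
}
```

An access for which this predicate returns false corresponds to the case that causes a privileged violation program interrupt.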

Table 2-1. PPC405 SPRs

Function                    Register        Access                        Page

Configuration               CCR0            Privileged                    4-11

Branch Control              CTR             User                          2-6
                            LR              User                          2-7

Debug                       DAC1, DAC2      Privileged                    8-9
                            DBCR0, DBCR1    Privileged                    8-4
                            DBSR            Privileged                    8-7
                            DVC1, DVC2      Privileged                    8-10
                            IAC1–IAC4       Privileged                    8-9
                            ICDBDR          Privileged                    4-14

Fixed-point Exception       XER             User                          2-7

General-Purpose SPR         SPRG0–SPRG3     Privileged                    2-9
                            SPRG4–SPRG7     User read, privileged write   2-9
                            USPRG0          User                          2-9

Interrupts and Exceptions   DEAR            Privileged                    5-13
                            ESR             Privileged                    5-11
                            EVPR            Privileged                    5-10
                            SRR0, SRR1      Privileged                    5-9
                            SRR2, SRR3      Privileged                    5-9

Processor Version           PVR             Privileged, read-only         2-10

Storage Attribute Control   DCCR            Privileged                    7-17
                            DCWR            Privileged                    7-17
                            ICCR            Privileged                    7-17
                            SGR             Privileged                    7-17
                            SLER            Privileged                    7-17
                            SU0R            Privileged                    7-17

Timer Facilities            TBL, TBU        Privileged, write-only        6-1
                            PIT             Privileged                    6-4
                            TCR             Privileged                    6-9
                            TSR             Privileged                    6-8

Zone Protection             ZPR             Privileged                    7-14

2.3.2.1 Count Register (CTR)

The CTR is written from a GPR using mtspr. The CTR contents can be used as a loop count that is decremented and tested by some branch instructions. Alternatively, the CTR contents can specify a target address for the bcctr instruction, enabling branching to any address.

The CTR is in the user programming model.

0 31

Figure 2-3. Count Register (CTR)

0:31 Count Used as count for branch conditional with decrement instructions, or as address for branch-to-counter instructions.

2.3.2.2 Link Register (LR)

The LR is written from a GPR using mtspr, and by branch instructions that have the LK bit set to 1. Such branch instructions load the LR with the address of the instruction following the branch instruction. Thus, the LR contents can be used as the return address for a subroutine that was called using the branch.

The LR contents can be used as a target address for the bclr instruction. This allows branching to any address.

When the LR contents represent an instruction address, LR[30:31] are assumed to be 0, because all instructions must be word-aligned. However, when LR is read using mfspr, all 32 bits are returned as written.

The LR is in the user programming model.

0 31

Figure 2-4. Link Register (LR)

0:31 Link Register contents If (LR) represents an instruction address, LR[30:31] should be 0.

2.3.2.3 Fixed Point Exception Register (XER)

The XER records overflow and carry conditions generated by integer arithmetic instructions. The Summary Overflow (SO) field is set to 1 when instructions cause the Overflow (OV) field to be set to 1. The SO field does not necessarily indicate that an overflow occurred on the most recent arithmetic operation, but that an overflow occurred since the last clearing of XER[SO]. mtspr(XER) sets XER[SO, OV] to the value of bit positions 0 and 1 in the source register, respectively.

Once set, XER[SO] is not reset until an mtspr(XER) is executed with data that explicitly puts a 0 in the SO bit, or until an mcrxr instruction is executed.

XER[OV] is set to indicate whether an instruction that updates XER[OV] produces a result that “overﬂows” the 32-bit target register. XER[OV] = 1 indicates overﬂow. For arithmetic operations, this occurs when an operation has a carry-in to the most-signiﬁcant bit of the result that does not equal the carry-out of the most-signiﬁcant bit (that is, the exclusive-or of the carry-in and the carry-out is 1).

The following instructions set XER[OV] differently. The speciﬁc behavior is indicated in the instruction descriptions in Chapter 9, “Instruction Set.”

• Move instructions: mcrxr, mtspr(XER)

• Multiply and divide instructions: mullwo, mullwo., divwo, divwo., divwuo, divwuo.

The Carry (CA) ﬁeld is set to indicate whether an instruction that updates XER[CA] produces a result that has a carry-out of the most-signiﬁcant bit. XER[CA] = 1 indicates a carry.

The following instructions set XER[CA] differently. The specific behavior is indicated in the instruction descriptions in Chapter 9, “Instruction Set.”

• Move instructions: mcrxr, mtspr(XER)

• Shift-algebraic operations: sraw, srawi
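The general overflow and carry rules described above can be sketched in C. This is a minimal model, not PPC405 code: the `add_result` type and the `add32` function are hypothetical names introduced for illustration, and the sketch covers only a plain add, not the instructions listed above that set XER[OV] or XER[CA] differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result record for this sketch; not a PPC405 data structure. */
typedef struct {
    uint32_t sum; /* 32-bit value written to the target register */
    bool ca;      /* carry-out of the most-significant bit */
    bool ov;      /* carry-in to the MSB XOR carry-out of the MSB */
} add_result;

/* Model a 32-bit add with an optional carry-in of 0 or 1. */
static add_result add32(uint32_t a, uint32_t b, unsigned carry_in)
{
    assert(carry_in <= 1u);

    uint64_t wide = (uint64_t)a + b + carry_in;
    /* The carry into the MSB is the carry out of the low 31 bits. */
    uint32_t low = (a & 0x7FFFFFFFu) + (b & 0x7FFFFFFFu) + carry_in;
    bool carry_into_msb = (low >> 31) != 0u;
    bool carry_out = (wide >> 32) != 0u;

    add_result r = { (uint32_t)wide, carry_out, carry_into_msb != carry_out };
    return r;
}
```

For instance, adding 1 to 0x7FFFFFFF produces an overflow without a carry, while adding 1 to 0xFFFFFFFF produces a carry without an overflow.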

The Transfer Byte Count (TBC) ﬁeld is the byte count for load/store string instructions. The XER is part of the user programming model.

Figure 2-5. Fixed Point Exception Register (XER)

0 SO Summary Overflow
    0 No overflow has occurred.
    1 Overflow has occurred.
    Can be set by mtspr or by using “o” form instructions; can be reset by mtspr or by mcrxr.
1 OV Overflow
    0 No overflow has occurred.
    1 Overflow has occurred.
    Can be set by mtspr or by using “o” form instructions; can be reset by mtspr, by mcrxr, or “o” form instructions.
2 CA Carry
    0 Carry has not occurred.
    1 Carry has occurred.
    Can be set by mtspr or arithmetic instructions that update the CA field; can be reset by mtspr, by mcrxr, or by arithmetic instructions that update the CA field.
3:24 Reserved
25:31 TBC Transfer Byte Count
    Used by lswx and stswx; written by mtspr.

Table 2-2 and Table 2-3 list the PPC405 instructions that update the XER. In the tables, the syntax “[o]” indicates that the instruction has an “o” form that updates XER[SO,OV], and a “non-o” form. The syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0] (see “Condition Register (CR)” on page 2-10), and a “non-record” form.

Table 2-2. XER[CA] Updating Instructions

Integer Arithmetic
    Add: addc[o][.], adde[o][.], addic[.], addme[o][.], addze[o][.]
    Subtract: subfc[o][.], subfe[o][.], subfic, subfme[o][.], subfze[o][.]
Integer Shift
    Shift Right Algebraic: sraw[.], srawi[.]
Processor Control
    Management: mtspr, mcrxr

Table 2-3. XER[SO,OV] Updating Instructions

Integer Arithmetic
    Add: addo[.], addco[.], addeo[.], addmeo[.], addzeo[.]
    Subtract: subfo[.], subfco[.], subfeo[.], subfmeo[.], subfzeo[.]
    Multiply: mullwo[.]
    Divide: divwo[.], divwuo[.]
    Negate: nego[.]
Auxiliary Processor
    Multiply-Accumulate: macchwo[.], macchwso[.], macchwsuo[.], macchwuo[.], machhwo[.], machhwso[.], machhwsuo[.], machhwuo[.], maclhwo[.], maclhwso[.], maclhwsuo[.], maclhwuo[.]
    Negative Multiply-Accumulate: nmacchwo[.], nmacchwso[.], nmachhwo[.], nmachhwso[.], nmaclhwo[.], nmaclhwso[.]
Processor Control
    Management: mtspr, mcrxr

2.3.2.4 Special Purpose Register General (SPRG0–SPRG7)

USPRG0 and SPRG0–SPRG7 are provided for general-purpose software use, typically as temporary storage locations. For example, an interrupt handler might save the contents of a GPR to an SPRG, and later restore the GPR from it. This is faster than a save/restore to a memory location. These registers are written using mtspr and read using mfspr.

Access to USPRG0 is non-privileged for both read and write.



Access to SPRG0–SPRG7 is privileged, except for read access to SPRG4–SPRG7. See “Privileged SPRs” on page 2-32 for more information.

Figure 2-6. Special Purpose Register General (SPRG0–SPRG7)

0:31 General data Software value; no hardware usage.

2.3.2.5 Processor Version Register (PVR)

The PVR is a read-only register that uniquely identifies a standard product or Core+ASIC implementation. Software can examine the PVR to recognize implementation-dependent features and determine available hardware resources.

Access to the PVR is privileged. See “Privileged SPRs” on page 2-32 for more information.

Figure 2-7. Processor Version Register (PVR)

0:11 OWN Owner Identifier
    Identifies the owner of a core.
12:15 PCF Processor Core Family
    Identifies the processor core family.
16:21 CAS Cache Array Sizes
    Identifies the cache array sizes.
22:25 PCL Processor Core Version
    Identifies the core version for a specific combination of PVR[PCF] and PVR[CAS].
26:31 AID ASIC Identifier
    Assigned sequentially; identifies an ASIC function, version, and technology.
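The PVR field boundaries listed above can be decoded with shifts and masks once the PowerPC bit numbering (bit 0 is the most-significant bit) is taken into account. The following is a minimal sketch; `field32`, `pvr_fields`, and `decode_pvr` are hypothetical helper names, and reading the PVR itself (a privileged mfspr) is outside the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits first:last of a 32-bit value, where bit 0 is the MSB. */
static uint32_t field32(uint32_t value, unsigned first, unsigned last)
{
    assert(first <= last && last <= 31u);
    unsigned width = last - first + 1u;
    uint32_t mask = (width == 32u) ? 0xFFFFFFFFu : ((UINT32_C(1) << width) - 1u);
    return (value >> (31u - last)) & mask;
}

/* Hypothetical container for the decoded PVR fields. */
typedef struct {
    uint32_t own; /* bits 0:11  */
    uint32_t pcf; /* bits 12:15 */
    uint32_t cas; /* bits 16:21 */
    uint32_t pcl; /* bits 22:25 */
    uint32_t aid; /* bits 26:31 */
} pvr_fields;

static pvr_fields decode_pvr(uint32_t pvr)
{
    pvr_fields f = {
        field32(pvr, 0, 11),
        field32(pvr, 12, 15),
        field32(pvr, 16, 21),
        field32(pvr, 22, 25),
        field32(pvr, 26, 31)
    };
    return f;
}
```

Passing an arbitrary test pattern such as 0x12345678 (not a real PVR value) yields OWN = 0x123, PCF = 0x4, CAS = 0x15, PCL = 0x9, and AID = 0x38.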

2.3.3 Condition Register (CR)

The CR contains eight 4-bit fields (CR0–CR7), as shown in Figure 2-8. The fields contain conditions detected during the execution of integer or logical compare instructions, as indicated in the instruction descriptions in Chapter 9, “Instruction Set.” The CR contents can be used in conditional branch instructions.

The CR can be modiﬁed in any of the following ways:

• mtcrf sets specified CR fields by writing to the CR from a GPR, under control of a mask specified as an instruction field.

• mcrf sets a specified CR field by copying another CR field to it.

• mcrxr copies certain bits of the XER into a designated CR field, and then clears the corresponding XER bits.

• The record (“dot”) forms of integer instructions implicitly update CR[CR0].

• Integer compare instructions update a specified CR field.

• Auxiliary processor instructions can update a specified CR field (including the implicit update of CR[CR1] by certain floating-point operations).

• The CR-logical instructions update a specified CR bit with the result of a logical operation on a specified pair of CR bit fields.

Conditional branch instructions can test a CR bit as one of the branch conditions.

If a CR field is set by a compare instruction, the bits are set as described in “CR Fields after Compare Instructions.”

The CR is part of the user programming model.

Figure 2-8. Condition Register (CR)

0:3 CR0 Condition Register Field 0
4:7 CR1 Condition Register Field 1
8:11 CR2 Condition Register Field 2
12:15 CR3 Condition Register Field 3
16:19 CR4 Condition Register Field 4
20:23 CR5 Condition Register Field 5
24:27 CR6 Condition Register Field 6
28:31 CR7 Condition Register Field 7

2.3.3.1 CR Fields after Compare Instructions

Compare instructions compare the values of two registers. The two types of compare instructions, arithmetic and logical, are distinguished by the interpretation given to the 32-bit values. For arithmetic compares, the values are considered to be signed, where 31 bits represent the magnitude and the most-significant bit is a sign bit. For logical compares, the values are considered to be unsigned, so all 32 bits represent magnitude. There is no sign bit. As an example, consider the comparison of 0 with 0xFFFFFFFF. In an arithmetic compare, 0 is larger, because 0xFFFFFFFF represents –1; in a logical compare, 0xFFFFFFFF is larger.

A compare instruction can direct its CR update to any CR ﬁeld. The ﬁrst data operand of a compare instruction speciﬁes a GPR. The second data operand speciﬁes another GPR, or immediate data derived from the IM ﬁeld of the immediate instruction form. The contents of the GPR speciﬁed by the ﬁrst data operand are compared with the contents of the GPR speciﬁed by the second data operand (or with the immediate data). See descriptions of the compare instructions (page 9-34 through page 9-37) for precise details.

After a compare, the speciﬁed CR ﬁeld is interpreted as follows:

LT (bit 0) The first operand is less than the second operand.
GT (bit 1) The first operand is greater than the second operand.
EQ (bit 2) The first operand is equal to the second operand.
SO (bit 3) Summary overflow; a copy of XER[SO].
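The difference between arithmetic and logical compares, and the placement of the 4-bit result in any CR field, can be modeled in C. This is a sketch under stated assumptions: the function names and the `CR_*` masks are hypothetical, with LT treated as the most-significant bit of the 4-bit field to match the bit numbering above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks within one 4-bit CR field; LT is bit 0, the most significant. */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2, CR_SO = 0x1 };

/* Arithmetic compare: operands are treated as signed 32-bit values. */
static unsigned cr_compare_arithmetic(int32_t a, int32_t b, bool xer_so)
{
    unsigned field = (a < b) ? CR_LT : (a > b) ? CR_GT : CR_EQ;
    return field | (xer_so ? (unsigned)CR_SO : 0u);
}

/* Logical compare: operands are treated as unsigned 32-bit values. */
static unsigned cr_compare_logical(uint32_t a, uint32_t b, bool xer_so)
{
    unsigned field = (a < b) ? CR_LT : (a > b) ? CR_GT : CR_EQ;
    return field | (xer_so ? (unsigned)CR_SO : 0u);
}

/* Place a 4-bit result in CR field n (CR0 occupies bits 0:3, CR7 bits 28:31). */
static uint32_t cr_insert(uint32_t cr, unsigned n, unsigned field)
{
    assert(n <= 7u && field <= 0xFu);
    unsigned shift = 28u - 4u * n;
    return (cr & ~(UINT32_C(0xF) << shift)) | ((uint32_t)field << shift);
}
```

Comparing 0 with the bit pattern 0xFFFFFFFF gives GT in the arithmetic case (the pattern is –1) and LT in the logical case.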

2.3.3.2 The CR0 Field

After the execution of compare instructions that update CR[CR0], CR[CR0] is interpreted as described in “CR Fields after Compare Instructions” on page 2-11. The “dot” forms of arithmetic and logical instructions also alter CR[CR0]. After most instructions that update CR[CR0], the bits of CR0 are interpreted as follows:

LT (bit 0) Less than 0; set if the most-significant bit of the 32-bit result is 1.
GT (bit 1) Greater than 0; set if the 32-bit result is non-zero and the most-significant bit of the result is 0.
EQ (bit 2) Equal to 0; set if the 32-bit result is 0.
SO (bit 3) Summary overflow; a copy of XER[SO] at instruction completion.

The LT, GT, and EQ subfields of CR[CR0] are set as the result of an algebraic comparison of the instruction result to 0, regardless of the type of instruction that sets CR[CR0]. If the instruction result is 0, the EQ subfield is set to 1. If the result is not 0, either LT or GT is set, depending on the value of the most-significant bit of the result.

When updating CR[CR0], the most significant bit of an instruction result is considered a sign bit, even for instructions that produce results that are not usually thought of as signed. For example, logical instructions such as and., or., and nor. update the LT, GT, and EQ subfields of CR[CR0] using such an arithmetic comparison to 0, although the result of such a logical operation is not actually an arithmetic result. If an arithmetic overflow occurs, the “sign” of an instruction result indicated in the LT, GT, and EQ subfields of CR[CR0] might not represent the “true” (infinitely precise) algebraic result of the instruction that set CR0. For example, if an add. instruction adds two large positive numbers and the magnitude of the result cannot be represented as a twos-complement number in a 32-bit register, an overflow occurs and the LT and SO subfields of CR[CR0] are set, although the infinitely precise result of the add is positive.

Adding the most negative 32-bit twos-complement number, 0x8000 0000, to itself results in an arithmetic overflow and 0x0000 0000 is recorded in the target register. The EQ and SO subfields of CR[CR0] are set, indicating a result of 0, but the infinitely precise result is negative.

The SO subfield of CR[CR0] is a copy of XER[SO]. Instructions that do not alter the XER[SO] bit cannot cause an overflow, but even for these instructions the SO subfield of CR[CR0] is a copy of XER[SO].
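The CR0 rules above can be checked with a small C model. It is a sketch only: `cr0_from_result` and `add_with_overflow_record` are hypothetical names, and the second function assumes an add that both records overflow in XER[SO] and updates CR[CR0], which is a simplification of the instruction forms described in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks within the 4-bit CR0 field; LT is bit 0, the most significant. */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2, CR_SO = 0x1 };

/* Compare the 32-bit result to 0 as if it were signed, then copy in SO. */
static unsigned cr0_from_result(uint32_t result, bool xer_so)
{
    unsigned field;
    if (result == 0u)
        field = CR_EQ;
    else if (result & 0x80000000u)
        field = CR_LT;
    else
        field = CR_GT;
    /* Exactly one of LT, GT, EQ is set at this point. */
    assert(field == CR_LT || field == CR_GT || field == CR_EQ);
    return field | (xer_so ? (unsigned)CR_SO : 0u);
}

/* Hypothetical record of what one overflow-recording add leaves behind. */
typedef struct {
    uint32_t result;
    bool so;
    unsigned cr0;
} record_result;

static record_result add_with_overflow_record(uint32_t a, uint32_t b, bool so_in)
{
    uint32_t sum = a + b;
    /* Overflow: both operands share a sign that the result does not. */
    bool ov = ((~(a ^ b)) & (a ^ sum) & 0x80000000u) != 0u;
    bool so = so_in || ov; /* SO is sticky once set */
    record_result r = { sum, so, cr0_from_result(sum, so) };
    return r;
}
```

With these definitions, 0x80000000 + 0x80000000 leaves EQ and SO set, and 0x7FFFFFFF + 0x7FFFFFFF leaves LT and SO set, matching the two overflow cases described above.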

Some instructions set CR[CR0] differently or do not speciﬁcally set any of the subﬁelds. These instructions include:

• Compare instructions: cmp, cmpi, cmpl, cmpli

• CR logical instructions: crand, crandc, creqv, crnand, crnor, cror, crorc, crxor, mcrf

• Move CR instructions: mtcrf, mcrxr

• stwcx.

The instruction descriptions provide detailed information about how the listed instructions alter CR[CR0].

2.3.4 The Time Base

The PowerPC Architecture provides a 64-bit time base. “Time Base” on page 6-1 describes the architected time base. Access to the time base is through two 32-bit time base registers (TBRs). The least-signiﬁcant 32 bits of the time base are read from the Time Base Lower (TBL) register and the most-signiﬁcant 32 bits are read from the Time Base Upper (TBU) register.

User-mode access to the time base is read-only, and there is no explicitly privileged read access to the time base.

The mftb instruction reads from TBL and TBU. Writing the time base is accomplished by moving the contents of a GPR to a pair of SPRs, which are also called TBL and TBU, using mtspr.

Table 2-4 shows the mnemonics and names of the TBRs.

Table 2-4. Time Base Registers

Mnemonic    Register Name                   Access
TBL         Time Base Lower (Read-only)     Read-only
TBU         Time Base Upper (Read-only)     Read-only
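Because the 64-bit time base is read as two separate 32-bit halves, TBL can carry into TBU between the two reads. A common software idiom, which is not taken from this section, is to read TBU, then TBL, then TBU again, and retry if the upper half changed. The sketch below models that idiom in C; the `tb_read_fn` accessors stand in for mftb reads, and the `sim_tb` counter is a hypothetical simulation used only to exercise the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accessor type standing in for an mftb read of TBU or TBL. */
typedef uint32_t (*tb_read_fn)(void *ctx);

/* Read a consistent 64-bit value from two 32-bit halves. */
static uint64_t read_time_base(tb_read_fn read_tbu, tb_read_fn read_tbl, void *ctx)
{
    assert(read_tbu != 0 && read_tbl != 0);
    uint32_t upper, lower, check;
    do {
        upper = read_tbu(ctx);
        lower = read_tbl(ctx);
        check = read_tbu(ctx); /* differs only if TBL carried into TBU */
    } while (upper != check);
    return ((uint64_t)upper << 32) | lower;
}

/* Simulated time base that advances by `step` on every read (illustration only). */
typedef struct {
    uint64_t now;
    uint64_t step;
} sim_tb;

static uint32_t sim_read_tbu(void *ctx)
{
    sim_tb *tb = ctx;
    tb->now += tb->step;
    return (uint32_t)(tb->now >> 32);
}

static uint32_t sim_read_tbl(void *ctx)
{
    sim_tb *tb = ctx;
    tb->now += tb->step;
    return (uint32_t)tb->now;
}
```

Starting the simulated counter just below a 32-bit rollover forces one retry, after which the two halves belong to the same 64-bit value.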

2.3.5 Machine State Register (MSR)

The Machine State Register (MSR) controls processor core functions, such as the enabling or disabling of interrupts and address translation.

The MSR is written from a GPR using the mtmsr instruction. The contents of the MSR can be read into a GPR using the mfmsr instruction. MSR[EE] is set or cleared using the wrtee or wrteei instructions.


The MSR contents are automatically saved, altered, and restored by the interrupt-handling mechanism. See “Machine State Register (MSR)” on page 5-7.

Figure 2-9. Machine State Register (MSR)

0:5 Reserved
6 AP Auxiliary Processor Available
    0 APU not available.
    1 APU available.
7:11 Reserved
12 APE APU Exception Enable
    0 APU exception disabled.
    1 APU exception enabled.
13 WE Wait State Enable
    0 The processor is not in the wait state.
    1 The processor is in the wait state.
    If MSR[WE] = 1, the processor remains in the wait state until an interrupt is taken, a reset occurs, or an external debug tool clears WE.
14 CE Critical Interrupt Enable
    0 Critical interrupts are disabled.
    1 Critical interrupts are enabled.
    Controls the critical interrupt input and watchdog timer first time-out interrupts.
15 Reserved
16 EE External Interrupt Enable
    0 Asynchronous interrupts are disabled.
    1 Asynchronous interrupts are enabled.
    Controls the non-critical external interrupt input, PIT, and FIT interrupts.
17 PR Problem State
    0 Supervisor state (all instructions allowed).
    1 Problem state (some instructions not allowed).
18 FP Floating Point Available
    0 The processor cannot execute floating-point instructions.
    1 The processor can execute floating-point instructions.
19 ME Machine Check Enable
    0 Machine check interrupts are disabled.
    1 Machine check interrupts are enabled.
20 FE0 Floating-point exception mode 0
    0 If MSR[FE1] = 0, ignore exceptions mode; if MSR[FE1] = 1, imprecise nonrecoverable mode.
    1 If MSR[FE1] = 0, imprecise recoverable mode; if MSR[FE1] = 1, precise mode.
21 DWE Debug Wait Enable
    0 Debug wait mode is disabled.
    1 Debug wait mode is enabled.
22 DE Debug Interrupts Enable
    0 Debug interrupts are disabled.
    1 Debug interrupts are enabled.
23 FE1 Floating-point exception mode 1
    0 If MSR[FE0] = 0, ignore exceptions mode; if MSR[FE0] = 1, imprecise recoverable mode.
    1 If MSR[FE0] = 0, imprecise nonrecoverable mode; if MSR[FE0] = 1, precise mode.
24:25 Reserved
26 IR Instruction Relocate
    0 Instruction address translation is disabled.
    1 Instruction address translation is enabled.
27 DR Data Relocate
    0 Data address translation is disabled.
    1 Data address translation is enabled.
28:31 Reserved
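Because MSR bit numbers count from the most-significant bit, converting a bit number from the table above into a mask requires a shift of 31 minus the bit number. The following sketch shows that conversion and decodes the floating-point exception mode from FE0 and FE1; the helper names and the `fp_mode` enumeration are hypothetical, and reading or writing the MSR itself (mfmsr, mtmsr) is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask for bit n of a 32-bit register in PowerPC numbering (bit 0 = MSB). */
static uint32_t ppc_bit(unsigned n)
{
    assert(n <= 31u);
    return UINT32_C(1) << (31u - n);
}

/* MSR bit positions taken from the field list above. */
enum msr_bit {
    MSR_AP = 6, MSR_APE = 12, MSR_WE = 13, MSR_CE = 14, MSR_EE = 16,
    MSR_PR = 17, MSR_FP = 18, MSR_ME = 19, MSR_FE0 = 20, MSR_DWE = 21,
    MSR_DE = 22, MSR_FE1 = 23, MSR_IR = 26, MSR_DR = 27
};

static bool msr_test(uint32_t msr, enum msr_bit bit)
{
    return (msr & ppc_bit((unsigned)bit)) != 0u;
}

/* Hypothetical names for the four FE0/FE1 combinations. */
enum fp_mode {
    FP_IGNORE_EXCEPTIONS,
    FP_IMPRECISE_NONRECOVERABLE,
    FP_IMPRECISE_RECOVERABLE,
    FP_PRECISE
};

static enum fp_mode msr_fp_mode(uint32_t msr)
{
    bool fe0 = msr_test(msr, MSR_FE0);
    bool fe1 = msr_test(msr, MSR_FE1);
    if (!fe0)
        return fe1 ? FP_IMPRECISE_NONRECOVERABLE : FP_IGNORE_EXCEPTIONS;
    return fe1 ? FP_PRECISE : FP_IMPRECISE_RECOVERABLE;
}
```

Under this numbering, MSR[EE] (bit 16) corresponds to the mask 0x00008000 and MSR[DR] (bit 27) to 0x00000010.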

2.3.6 Device Control Registers

Device Control Registers (DCRs), on-chip registers that exist architecturally outside the processor core, are not part of the IBM PowerPC Embedded Environment. The Embedded Environment simply deﬁnes the existence of a DCR address space and the instructions that access the DCRs, but does not deﬁne any DCRs. The instructions that access the DCRs are mtdcr (move to device control register) and mfdcr (move from device control register).

DCRs are used to control the operations of on-chip buses, peripherals, and some processor behavior.


2.4 Data Types and Alignment

The data types consist of bytes (eight bits), halfwords (two bytes), words (four bytes), and strings (1 to 128 bytes). Figure 2-10 shows the byte, halfword, and word data types and their bit and byte deﬁnitions for big endian representations of values. Note that PowerPC bit numbering is reversed from industry conventions; bit 0 represents the most signiﬁcant bit of a value.

Figure 2-10. PPC405 Data Types

Data is represented in either twos-complement notation or in an unsigned integer format; data representation is independent of alignment issues.

The address of a data object is always the lowest address of any byte comprising the object. All instructions are words, and are word-aligned (the lowest byte address is divisible by 4).

2.4.1 Alignment for Storage Reference and Cache Control Instructions

The storage reference instructions (loads and stores; see Table 2-12, “Storage Reference Instructions,” on page 2-37) move data to and from storage. The data cache control instructions listed in Table 2-21, “Cache Management Instructions,” on page 2-41, control the contents and operation of the data cache unit (DCU). Both types of instructions form an effective address (EA). The method of calculating the EA for the storage reference and cache control instructions is detailed in the description of those instructions. See Chapter 9, “Instruction Set,” for more information.

Cache control instructions ignore the ﬁve least signiﬁcant bits of the EA; no alignment restrictions exist in the DCU because of EAs. However, storage control attributes can cause alignment exceptions. When data address translation is disabled and a dcbz instruction references a storage region that is non-cachable, or for which write-through caching is the write strategy, an alignment exception is taken. Such exceptions result from the storage control attributes, not from EA alignment. The alignment exception enables system software to emulate the write-through function.

Alignment requirements for the storage reference instructions and the dcread instruction depend on the particular instruction. Table 2-5, “Alignment Exception Summary,” on page 2-17, summarizes the instructions that cause alignment exceptions.

The data targets of instructions are of types that depend upon the instruction. The load/store instructions have the following “natural” alignments:

• Load/store word instructions have word targets, word-aligned.

• Load/store halfword instructions have halfword targets, halfword-aligned.

• Load/store byte instructions have byte targets, byte-aligned (that is, any alignment).


Misalignments are addresses that are not naturally aligned on data type boundaries. An address not divisible by four is misaligned with respect to word instructions. An address not divisible by two is misaligned with respect to halfword instructions. The PPC405 core implementation handles misalignments within and across word boundaries, but there is a performance penalty because additional cycles are required.
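The natural-alignment rule and the cache control instructions' treatment of the five least-significant EA bits both reduce to simple mask operations. The sketch below illustrates them in C; `is_misaligned` and `cache_line_address` are hypothetical helper names, not part of any PPC405 software interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if ea is not naturally aligned for an access of 1, 2, or 4 bytes. */
static bool is_misaligned(uint32_t ea, unsigned size)
{
    assert(size == 1u || size == 2u || size == 4u);
    return (ea & (size - 1u)) != 0u;
}

/* Address as seen by cache control instructions: the five
   least-significant bits of the EA are ignored. */
static uint32_t cache_line_address(uint32_t ea)
{
    return ea & ~UINT32_C(0x1F);
}
```

For example, address 0x1002 is misaligned for a word access but aligned for a halfword access, and byte accesses are never misaligned.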

2.4.2 Alignment and Endian Operation

The endian storage control attribute does not affect alignment behavior. In little endian storage regions, the alignment of data is treated as it is in big endian storage regions; no special alignment exceptions occur when accessing data in little endian storage regions. Note that the alignment exceptions that apply to big endian region accesses also apply to little endian storage region accesses.

2.4.3 Summary of Instructions Causing Alignment Exceptions

Table 2-5 summarizes the instructions that cause alignment exceptions and the conditions under which the alignment exceptions occur.

Table 2-5. Alignment Exception Summary

Instructions Causing Alignment Exceptions    Conditions
dcbz                                         EA in non-cachable or write-through storage
dcread, lwarx, stwcx.                        EA not word-aligned
APU load/store halfword                      EA not halfword-aligned
APU load/store word                          EA not word-aligned
APU load/store doubleword                    EA not word-aligned

2.5 Byte Ordering

The following discussion describes the “endianness” of the PPC405, which, by default and in normal use, is “big endian.”

If scalars (individual data items and instructions) were indivisible, “byte ordering” would not be a concern. It is meaningless to consider the order of bits or groups of bits within a byte, the smallest addressable unit of storage; nothing can be observed about such order. Only when scalars, which the programmer and processor regard as indivisible quantities, can comprise more than one addressable unit of storage does the question of byte order arise.

For a machine in which the smallest addressable unit of storage is the 32-bit word, there is no question of the ordering of bytes within words. All transfers of individual scalars between registers and storage are of words, and the address of the byte containing the high-order eight bits of a scalar is the same as the address of any other byte of the scalar.

For the PowerPC Architecture, as for most computer architectures currently implemented, the smallest addressable unit of storage is the 8-bit byte. Other scalars are halfwords, words, or doublewords, which consist of groups of bytes. When a word-length scalar is moved from a register to storage, the scalar is stored in four consecutive byte addresses. It thus becomes meaningful to discuss the order of the byte addresses with respect to the value of the scalar: that is, which byte contains the highest-order eight bits of the scalar, which byte contains the next-highest-order eight bits, and so on.

Given a scalar that contains multiple bytes, the choice of byte ordering is essentially arbitrary. There are 4! = 24 ways to specify the ordering of four bytes within a word, but only two of these orderings are commonly used:

• The ordering that assigns the lowest address to the highest-order (“leftmost”) eight bits of the scalar, the next sequential address to the next-highest-order eight bits, and so on.

This ordering is called big endian because the “big end” of the scalar, considered as a binary number, comes first in storage.

• The ordering that assigns the lowest address to the lowest-order (“rightmost”) eight bits of the scalar, the next sequential address to the next-lowest-order eight bits, and so on.

This ordering is called little endian because the “little end” of the scalar, considered as a binary number, comes first in storage.
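The two orderings can be written out as byte-by-byte stores. The following C sketch is host-independent because it uses shifts rather than pointer casts; `store_big_endian` and `store_little_endian` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low `size` bytes of value with the highest-order byte at mem[0]. */
static void store_big_endian(uint32_t value, unsigned char *mem, size_t size)
{
    assert(mem != NULL && size >= 1 && size <= 4);
    for (size_t i = 0; i < size; i++)
        mem[i] = (unsigned char)(value >> (8 * (size - 1 - i)));
}

/* Store the low `size` bytes of value with the lowest-order byte at mem[0]. */
static void store_little_endian(uint32_t value, unsigned char *mem, size_t size)
{
    assert(mem != NULL && size >= 1 && size <= 4);
    for (size_t i = 0; i < size; i++)
        mem[i] = (unsigned char)(value >> (8 * i));
}
```

Storing the word 0x11121314 produces the byte sequence 11 12 13 14 in big endian order and 14 13 12 11 in little endian order.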

2.5.1 Structure Mapping Examples

The following C language structure, s, contains an assortment of scalars and a character string. The comments show the value assumed to be in each structure element; these values show how the bytes comprising each structure element are mapped into storage.

struct {
    int a;          /* 0x1112_1314 word */
    long long b;    /* 0x2122_2324_2526_2728 doubleword */
    char *c;        /* 0x3132_3334 word */
    char d[7];      /* 'A','B','C','D','E','F','G' array of bytes */
    short e;        /* 0x5152 halfword */
    int f;          /* 0x6162_6364 word */
} s;

C structure mapping rules permit the use of padding (skipped bytes) to align scalars on desirable boundaries. The structure mapping examples show each scalar aligned at its natural boundary. This alignment introduces padding of four bytes between a and b, one byte between d and e, and two bytes between e and f. The same amount of padding is present in both big endian and little endian mappings.
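The padding amounts stated above can be checked with `offsetof`. The sketch below uses a fixed-width stand-in for structure s, because on many current hosts a pointer is 8 bytes rather than the 4 bytes assumed in the manual; `struct s_layout`, the `uint32_t` replacement for the pointer member, the explicit 8-byte alignment of `b`, and the `PAD_BETWEEN` macro are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width stand-in for structure s (assumes C11). */
struct s_layout {
    int32_t a;              /* word */
    _Alignas(8) int64_t b;  /* doubleword, forced to its natural boundary */
    uint32_t c;             /* stands in for the 32-bit char * */
    char d[7];              /* array of bytes */
    int16_t e;              /* halfword */
    int32_t f;              /* word */
};

/* Padding bytes between the end of member prev and the start of member next. */
#define PAD_BETWEEN(type, prev, next) \
    (offsetof(type, next) - (offsetof(type, prev) + sizeof(((type *)0)->prev)))

/* Offsets expected from the mapping examples. */
static_assert(offsetof(struct s_layout, b) == 0x08, "b starts at 0x08");
static_assert(offsetof(struct s_layout, d) == 0x14, "d starts at 0x14");
static_assert(offsetof(struct s_layout, e) == 0x1C, "e starts at 0x1C");
static_assert(offsetof(struct s_layout, f) == 0x20, "f starts at 0x20");
```

The member offsets do not depend on byte order, which is why the same padding appears in both mappings.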

2.5.1.1 Big Endian Mapping

The big endian mapping of structure s follows. The contents of each byte, as defined in structure s, are shown as a hexadecimal number or, for the string elements, as a character. Addresses, in hexadecimal, are below the data stored at the address. Bytes that hold no data (padding) are shown as “--”.

  11   12   13   14   --   --   --   --
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07

  21   22   23   24   25   26   27   28
0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F

  31   32   33   34  'A'  'B'  'C'  'D'
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17

 'E'  'F'  'G'   --   51   52   --   --
0x18 0x19 0x1A 0x1B 0x1C 0x1D 0x1E 0x1F

  61   62   63   64   --   --   --   --
0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27

2.5.1.2 Little Endian Mapping

Structure s is shown mapped little endian. Bytes that hold no data (padding) are shown as “--”.

  14   13   12   11   --   --   --   --
0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07

  28   27   26   25   24   23   22   21
0x08 0x09 0x0A 0x0B 0x0C 0x0D 0x0E 0x0F

  34   33   32   31  'A'  'B'  'C'  'D'
0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17

 'E'  'F'  'G'   --   52   51   --   --
0x18 0x19 0x1A 0x1B 0x1C 0x1D 0x1E 0x1F

  64   63   62   61   --   --   --   --
0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27

2.5.2 Support for Little Endian Byte Ordering

This book describes the processor as if it operated only in a big endian fashion. In fact, the IBM PowerPC Embedded Environment also supports little endian operation.

The PowerPC little endian mode, deﬁned in the PowerPC Architecture, is not implemented.

2.5.3 Endian (E) Storage Attribute

The endian (E) storage attribute supports direct connection of the PPC405 core to little endian peripherals and to memory containing little endian instructions and data. For every storage reference (instruction fetch or load/store access), an E storage attribute is associated with the storage region of the reference. The E attribute speciﬁes whether that region is organized as big endian (E = 0) or little endian (E = 1).


When address translation is enabled (MSR[IR] = 1 or MSR[DR] = 1), the E ﬁeld in the corresponding TLB entry controls the endianness of a memory region. When address translation is disabled (MSR[IR] = 0 or MSR[DR] = 0), the SLER controls the endianness of a memory region.

Bytes in storage that are accessed as little endian are arranged in true little endian format. The PPC405 does not support the little endian mode deﬁned in the PowerPC architecture and used in PPC401xx and PPC403xx processors. Furthermore, no address modiﬁcation is performed when accessing storage regions programmed as little endian. Instead, the PPC405 reorders the bytes as they are transferred between the processor and memory.

The on-the-ﬂy reversal of bytes in little endian storage regions is handled in one of two ways, depending on whether the storage access is an instruction fetch or a data access (load/store). The following sections describe byte reordering for the two kinds of storage accesses.

2.5.3.1 Fetching Instructions from Little Endian Storage Regions

Instructions are words (four bytes) that are aligned on word boundaries in memory. As such, instructions in a big endian memory region are arranged with the most signiﬁcant byte (MSB) of the instruction word at the lowest address.

Consider the big endian mapping of an instruction at address 00, for example, add r7, r7, r4:

 MSB            LSB
0x00 0x01 0x02 0x03

On the other hand, in the little endian mapping, the instruction is arranged with the least significant byte (LSB) of the instruction word at the lowest numbered address:

 LSB            MSB
0x00 0x01 0x02 0x03

When an instruction is fetched from memory, the instruction must be placed in the instruction queue in the proper order. The execution unit assumes that the MSB of an instruction word is at the lowest address. Therefore, when instructions are fetched from little endian storage regions, the four bytes of an instruction word are reversed before the instruction is decoded. In the PPC405 core, the byte reversal occurs between memory and the instruction cache unit (ICU). The ICU always stores instructions in big endian format, regardless of whether the memory region containing the instruction is programmed as big endian or little endian. Thus, the bytes are already in the proper order when an instruction is transferred from the ICU to the decode stage of the pipeline.
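The effect of the byte reversal on instruction fetch can be sketched as follows. The function `fetch_instruction` is a hypothetical model of the path from memory to the decoder, with the `little_endian` flag standing in for the E storage attribute. The test value 0x7CE72214 is assumed to be the encoding of add r7, r7, r4, computed by hand from the instruction format; it should be checked against Chapter 9 before being relied on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble the instruction word seen by the decoder from four bytes in
   memory. The decoder always expects the MSB of the word first, so bytes
   fetched from a little endian region are reversed on the way in. */
static uint32_t fetch_instruction(const unsigned char *mem, bool little_endian)
{
    assert(mem != NULL);
    if (little_endian) {
        return ((uint32_t)mem[3] << 24) | ((uint32_t)mem[2] << 16) |
               ((uint32_t)mem[1] << 8)  |  (uint32_t)mem[0];
    }
    return ((uint32_t)mem[0] << 24) | ((uint32_t)mem[1] << 16) |
           ((uint32_t)mem[2] << 8)  |  (uint32_t)mem[3];
}
```

The same instruction word results whether it is stored as 7C E7 22 14 in a big endian region or as 14 22 E7 7C in a little endian region.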

If a storage region is reprogrammed from one endian format to the other, the storage region must be reloaded with program and data structures in the appropriate endian format. If the endian format of instruction memory changes, the ICU must be made coherent with the updates. The ICU must be invalidated and the updated instruction memory using the new endian format must be fetched so that the proper byte ordering occurs before the new instructions are placed in the ICU.


2.5.3.2 Accessing Data in Little Endian Storage Regions

Unlike instruction fetches from little endian storage regions, data accesses from little endian storage regions are not byte-reversed between memory and the DCU. Data byte ordering, in memory, depends on the data type (byte, halfword, or word) of a specific data item. It is only when moving a data item of a specific type from or to a GPR that it becomes known what type of byte reversal is required. Therefore, byte reversal during load/store accesses is performed between the DCU and the GPR.

When accessing data in a little endian storage region:

• For byte loads/stores, no reordering occurs.

• For halfword loads/stores, bytes are reversed within the halfword.

• For word loads/stores, bytes are reversed within the word.

Note that this applies, regardless of data alignment.

The big endian and little endian mappings of the structure s, shown in “Structure Mapping Examples” on page 2-18, demonstrate how the size of an item determines its byte ordering. For example:

• The word a has its four bytes reversed within the word spanning addresses 0x00–0x03.

• The halfword e has its two bytes reversed within the halfword spanning addresses 0x1C–0x1D.

Note that the array of bytes d, where each data item is a byte, is not reversed when the big endian and little endian mappings are compared. For example, the character 'A' is located at address 0x14 in both the big endian and little endian mappings.

In little endian storage regions, the alignment of data is treated as it is in big endian storage regions. Unlike PowerPC little endian mode, no special alignment exceptions occur when accessing data in little endian storage regions.
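The size-dependent reordering can be illustrated by loading items from the little endian image of structure s. In this sketch, `load_from_little_endian_region` is a hypothetical model of a byte, halfword, or word load, and `s_little_endian` reproduces the little endian mapping shown earlier, with padding bytes set to 0 as an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little endian image of structure s; padding bytes are arbitrary (0 here). */
static const unsigned char s_little_endian[0x24] = {
    0x14, 0x13, 0x12, 0x11, 0, 0, 0, 0,             /* a, 4 padding bytes     */
    0x28, 0x27, 0x26, 0x25, 0x24, 0x23, 0x22, 0x21, /* b                      */
    0x34, 0x33, 0x32, 0x31, 'A', 'B', 'C', 'D',     /* c, d[0..3]             */
    'E', 'F', 'G', 0, 0x52, 0x51, 0, 0,             /* d[4..6], pad, e, 2 pad */
    0x64, 0x63, 0x62, 0x61                          /* f                      */
};

/* Value placed in the GPR by a load of 1, 2, or 4 bytes from a little
   endian region: the byte at the lowest address is the least significant. */
static uint32_t load_from_little_endian_region(const unsigned char *mem, size_t size)
{
    assert(mem != NULL && (size == 1 || size == 2 || size == 4));
    uint32_t value = 0;
    for (size_t i = 0; i < size; i++)
        value |= (uint32_t)mem[i] << (8 * i);
    return value;
}
```

A word load at 0x00 returns 0x11121314, a halfword load at 0x1C returns 0x5152, and a byte load at 0x14 returns 'A' with no reordering.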

2.5.3.3 PowerPC Byte-Reverse Instructions

For big endian storage regions, normal load/store instructions move the more signiﬁcant bytes of a register to and from the lower-numbered memory addresses. The load/store with byte-reverse instructions move the more signiﬁcant bytes of the register to and from the higher numbered memory addresses.

As Figure 2-11 through Figure 2-14 illustrate, a normal store to a big endian storage region is the same as a byte-reverse store to a little endian storage region. Conversely, a normal store to a little endian storage region is the same as a byte-reverse store to a big endian storage region.


Figure 2-11 illustrates the contents of a GPR and memory (starting at address 00) after a normal load/store in a big endian storage region.

GPR (MSB to LSB):   11   12   13   14
Memory:             11   12   13   14
Address:          0x00 0x01 0x02 0x03

Figure 2-11. Normal Word Load or Store (Big Endian Storage Region)

Note that the results are identical to the results of a load/store with byte-reverse in a little endian storage region, as illustrated in Figure 2-12.

GPR (MSB to LSB):   11   12   13   14
Memory:             11   12   13   14
Address:          0x00 0x01 0x02 0x03

Figure 2-12. Byte-Reverse Word Load or Store (Little Endian Storage Region)

Figure 2-13 illustrates the contents of a GPR and memory (starting at address 00) after a load/store with byte-reverse in a big endian storage region.

GPR (MSB to LSB):   11   12   13   14
Memory:             14   13   12   11
Address:          0x00 0x01 0x02 0x03

Figure 2-13. Byte-Reverse Word Load or Store (Big Endian Storage Region)


Note that the results are identical to the results of a normal load/store in a little endian storage region, as illustrated in Figure 2-14.

GPR (MSB to LSB):   11   12   13   14
Memory:             14   13   12   11
Address:          0x00 0x01 0x02 0x03

Figure 2-14. Normal Word Load or Store (Little Endian Storage Region)
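The equivalences shown in Figure 2-11 through Figure 2-14 follow from the fact that the byte-reverse instruction and the little endian storage attribute each reverse the bytes once, so applying both cancels out. The sketch below models a word store under both controls; `store_word` and its two flags are hypothetical and do not correspond to any PPC405 software interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model a word store of gpr to mem[0..3].
   little_endian_region: the E storage attribute of the target region.
   byte_reverse: true for a store with byte-reverse, false for a normal store. */
static void store_word(uint32_t gpr, unsigned char *mem,
                       bool little_endian_region, bool byte_reverse)
{
    assert(mem != NULL);
    /* Two reversals cancel; the LSB goes first only when exactly one applies. */
    bool lsb_first = (little_endian_region != byte_reverse);
    for (int i = 0; i < 4; i++) {
        int shift = lsb_first ? 8 * i : 8 * (3 - i);
        mem[i] = (unsigned char)(gpr >> shift);
    }
}
```

A normal store to a big endian region and a byte-reverse store to a little endian region both leave 11 12 13 14 in memory; the other two combinations leave 14 13 12 11.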

The E storage attribute augments the byte-reverse load/store instructions in two important ways:

• The load/store with byte-reverse instructions do not solve the problem of fetching instructions from a storage region in little endian format. Only the endian storage attribute mechanism supports the fetching of little endian program images.

• Typical compilers cannot make general use of the byte-reverse load/store instructions, so these instructions are ordinarily used only in device drivers written in hand-coded assembler. Compilers can, however, take full advantage of the endian storage attribute mechanism, enabling application programmers working in a high-level language, such as C, to compile programs and data structures into little endian format.

2.6 Instruction Processing

The instruction pipeline, illustrated in Figure 2-15, contains three queue locations: prefetch buffer 1 (PFB1), prefetch buffer 0 (PFB0), and decode (DCD). This queue implements a pipeline with the following functional stages: fetch, decode, execute, write-back and load write-back. Instructions are fetched from the instruction cache unit (ICU), placed in the instruction queue, and eventually dispatched to the execution unit (EXU).

Instructions are fetched from the ICU at the request of the EXU. Cachable instructions are forwarded directly to the instruction queue and stored in the ICU cache array. Non-cachable instructions are also forwarded directly to the instruction queue, but are not stored in the ICU cache array. Fetched instructions drop to the empty queue location closest to the EXU. When there is room in the queue, instructions can be returned from the ICU two at a time. If the queue is empty and the ICU is returning two instructions, one instruction drops into DCD while the other drops into PFB0. PFB1 buffers instructions when the pipeline stalls.


Branch instructions are examined in DCD and PFB0 while all other instructions are decoded in DCD. All instructions must pass through DCD before entering the EXU. The EXU contains the execute, write-back and load write-back stages of the pipe. The results of most instructions are calculated during the execute stage and written to the GPR file during the write-back stage. Load instructions write the GPR file during the load write-back stage.

[Figure: instructions flow from the ICU through Fetch into the instruction queue (PFB1, PFB0, DCD), then through Dispatch to the EXU.]

Figure 2-15. PPC405 Instruction Pipeline

2.7 Branch Processing

The PPC405, which provides a variety of conditional and unconditional branching instructions, uses the branch prediction techniques described in “Branch Prediction” on page 3-35.

2.7.1 Unconditional Branch Target Addressing Options

The unconditional branches (b, ba, bl, bla) carry the displacement to the branch target address as a signed 26-bit value (the 24-bit LI ﬁeld right-extended with 0b00). The displacement enables unconditional branches to cover an address range of ±32MB.

For the relative (AA = 0) forms (b, bl), the target address is the current instruction address (CIA, the address of the branch instruction) plus the signed displacement.

For the absolute (AA = 1) forms (ba, bla), the target address is 0 plus the signed displacement. If the sign bit (LI[0]) is 0, the displacement is the target address. If the sign bit is 1, the displacement is a negative value and wraps to the highest memory addresses. For example, if the displacement is 0x3FF FFFC (the 26-bit representation of –4), the target address is 0xFFFF FFFC (0 – 4B, or 4 bytes below the top of memory).
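The target calculation for unconditional branches can be sketched as follows. This Python fragment is a minimal illustration, not a definitive decoder: the function name is an assumption, and it relies on the standard PowerPC I-form layout (LI in instruction bits 6:29, AA in bit 30), which this section does not spell out.

```python
def unconditional_branch_target(insn, cia):
    """Target address of a 32-bit I-form branch word (b, ba, bl, bla)."""
    disp = insn & 0x03FFFFFC            # 24-bit LI field with 0b00 appended
    if disp & 0x02000000:               # sign bit LI[0] is set
        disp -= 0x04000000              # sign-extend the 26-bit value
    aa = (insn >> 1) & 1                # AA = 1 selects the absolute form
    base = 0 if aa else cia             # absolute forms add to 0, relative to CIA
    return (base + disp) & 0xFFFFFFFF   # wrap to a 32-bit address

# ba with a displacement of -4 wraps to 4 bytes below the top of memory.
print(hex(unconditional_branch_target(0x4BFFFFFE, 0x00001000)))
# b with a displacement of -4 lands one word before the branch itself.
print(hex(unconditional_branch_target(0x4BFFFFFC, 0x00001000)))
```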

2.7.2 Conditional Branch Target Addressing Options

The conditional branches (bc, bca, bcl, bcla) carry the displacement to the branch target address as a signed 16-bit value (the 14-bit BD ﬁeld right-extended with 0b00). The displacement enables conditional branches to cover an address range of ±32KB.


For the relative (AA = 0) forms (bc, bcl), the target address is the CIA plus the signed displacement.

For the absolute (AA = 1) forms (bca, bcla), the target address is 0 plus the signed displacement. If the sign bit (BD[0]) is 0, the displacement is the target address. If the sign bit is 1, the displacement is negative and wraps to the highest memory addresses. For example, if the displacement is 0xFFFC (the 16-bit representation of –4), the target address is 0xFFFF FFFC (0 – 4B, or 4 bytes below the top of memory).
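The relationship between the 14-bit BD field and the 16-bit displacement can be sketched as below. The helper names are hypothetical and exist only for this example; the range check reflects the fact that a signed 16-bit, word-aligned displacement spans –32768 to +32764 bytes.

```python
def relative_bd_field(cia, target):
    """14-bit BD field for a relative conditional branch from cia to target."""
    disp = target - cia
    if disp % 4 != 0:
        raise ValueError("branch displacements are multiples of 4 bytes")
    if not -0x8000 <= disp <= 0x7FFC:
        raise ValueError("target is outside the 16-bit displacement range")
    return (disp >> 2) & 0x3FFF         # drop the implied 0b00, keep 14 bits

def conditional_displacement(bd):
    """Signed byte displacement encoded by a 14-bit BD field."""
    disp = (bd & 0x3FFF) << 2           # right-extend BD with 0b00
    return disp - 0x10000 if disp & 0x8000 else disp

bd = relative_bd_field(0x1000, 0x0FFC)  # branch back by one word
print(hex(bd), conditional_displacement(bd))
# Absolute form: target is 0 plus the displacement, wrapped to 32 bits.
print(hex((0 + conditional_displacement(bd)) & 0xFFFFFFFF))
```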

2.7.3 Conditional Branch Condition Register Testing

Conditional branch instructions can test a CR bit. The value of the BI field specifies the bit to be tested (bits 0–31). The BO field controls whether the CR bit is tested, as described in the following section.

2.7.4 BO Field on Conditional Branches

The BO ﬁeld of the conditional branch instruction speciﬁes the conditions used to control branching, and speciﬁes how the branch affects the CTR.

Conditional branch instructions can test one bit in the CR. This option is selected when BO[0] = 0; if BO[0] = 1, the CR does not participate in the branch condition test. If this option is selected, the condition is satisﬁed (branch can occur) if CR[BI] = BO[1].

Conditional branch instructions can decrement the CTR by one, and after the decrement, test the CTR value. This option is selected when BO[2] = 0. If this option is selected, BO[3] speciﬁes the condition that must be satisﬁed to allow a branch to be taken. If BO[3] = 0, CTR ≠ 0 is required for a branch to occur. If BO[3] = 1, CTR = 0 is required for a branch to occur.

If BO[2] = 1, the contents of the CTR are left unchanged, and the CTR does not participate in the branch condition test.

Table 2-6 summarizes the usage of the bits of the BO ﬁeld. BO[4] is further discussed in “Branch Prediction.”

Table 2-6. Bits of the BO Field

BO Bit   Description

BO[0]    CR Test Control
           0  Test CR bit specified by BI field for value specified by BO[1].
           1  Do not test CR.

BO[1]    CR Test Value
           0  Test for CR[BI] = 0.
           1  Test for CR[BI] = 1.

BO[2]    CTR Test Control
           0  Decrement CTR by one and test whether CTR satisfies the condition specified by BO[3].
           1  Do not change CTR, do not test CTR.

BO[3]    CTR Test Value
           0  Test for CTR ≠ 0.
           1  Test for CTR = 0.

BO[4]    Branch Prediction Reversal
           0  Apply standard branch prediction.
           1  Reverse the standard branch prediction.


Table 2-7 lists specific BO field contents, and the resulting actions; z represents a mandatory value of 0, and y is a branch prediction option discussed in “Branch Prediction.”

Table 2-7. Conditional Branch BO Field

BO Value   Description

0000y      Decrement the CTR, then branch if the decremented CTR ≠ 0 and CR[BI] = 0.
0001y      Decrement the CTR, then branch if the decremented CTR = 0 and CR[BI] = 0.
001zy      Branch if CR[BI] = 0.
0100y      Decrement the CTR, then branch if the decremented CTR ≠ 0 and CR[BI] = 1.
0101y      Decrement the CTR, then branch if the decremented CTR = 0 and CR[BI] = 1.
011zy      Branch if CR[BI] = 1.
1z00y      Decrement the CTR, then branch if the decremented CTR ≠ 0.
1z01y      Decrement the CTR, then branch if the decremented CTR = 0.
1z1zz      Branch always.
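The BO rules in Table 2-6 and Table 2-7 can be checked with a small model. The following Python sketch is illustrative only: the function name and argument layout are assumptions, not part of the PPC405 definition. It treats BO as a 5-bit value with BO[0] as the most-significant bit, and CR bit 0 as the most-significant bit of a 32-bit CR, following the bit numbering used in this manual.

```python
def evaluate_conditional_branch(bo, bi, cr, ctr):
    """Return (taken, new_ctr) for a 5-bit BO, a BI index, a 32-bit CR and CTR."""
    bo0, bo1, bo2, bo3 = ((bo >> shift) & 1 for shift in (4, 3, 2, 1))
    if not bo2:
        ctr = (ctr - 1) & 0xFFFFFFFF            # BO[2] = 0: decrement CTR first
    # BO[2] = 1 skips the CTR test; otherwise BO[3] picks CTR = 0 or CTR != 0.
    ctr_ok = bool(bo2) or ((ctr == 0) == bool(bo3))
    cr_bit = (cr >> (31 - bi)) & 1              # CR[BI], with bit 0 as the MSB
    # BO[0] = 1 skips the CR test; otherwise CR[BI] must equal BO[1].
    cr_ok = bool(bo0) or cr_bit == bo1
    return ctr_ok and cr_ok, ctr

# BO = 0b10000 (1z00y): decrement CTR, branch while the result is nonzero.
print(evaluate_conditional_branch(0b10000, 0, 0, 3))
# BO = 0b01100 (011zy): branch if CR[2] = 1, CTR untouched.
print(evaluate_conditional_branch(0b01100, 2, 0x20000000, 5))
```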

2.7.5 Branch Prediction

Conditional branches present a problem to the instruction fetcher. A branch might be taken. The branch EXU attempts to predict whether or not a branch is taken before all information necessary to determine the branch direction is available. This decision is called a branch prediction. The fetcher can then prefetch instructions starting at the predicted branch target address. If the prediction is correct, time is saved because the branched-to instruction is available in the instruction queue. Otherwise, the instruction pipeline stalls while the correct instruction is fetched into the instruction queue. To be effective, branch prediction must be correct most of the time.

The PowerPC Architecture enables software to reverse the default branch prediction, which is defined as follows:

Predict that the branch is to be taken if ((BO[0] ∧ BO[2]) ∨ s) = 1

where s is the sign bit of the displacement for conditional branch (bc) instructions, and 0 for bclr and bcctr instructions.

(BO[0] ∧ BO[2]) = 1 only when the conditional branch tests nothing (the “branch always” condition). Obviously, the branch should be predicted taken for this case.

If the branch tests anything, (BO[0] ∧ BO[2]) = 0, and s entirely controls the prediction. The default prediction for this case was decided by considering the relative form of bc, which is commonly used at the end of loops to control the number of times that a loop is executed. The branch is taken every time the loop is executed except the last, so it is best if the branch is predicted taken. The branch target is the beginning of the loop, so the branch displacement is negative and s = 1.

If branch displacements are positive (s = 0), the branch is predicted not taken. If the branch instruction is any form of bclr or bcctr except the “branch always” forms, then s = 0, and the branch is predicted not taken.

There is a peculiar consequence of this prediction algorithm for the absolute forms of bc (bca and bcla). As described in “Unconditional Branch Target Addressing Options” on page 2-24, if the algebraic sign of the displacement is negative (s = 1), the branch target address is in high memory. If the algebraic sign of the displacement is positive (s = 0), the branch target address is in low memory. Because these are absolute-addressing forms, there is no reason to treat high and low memory differently. Nevertheless, for the high memory case the default prediction is taken, and for the low memory case the default prediction is not taken.

BO[4] is the prediction reversal bit. If BO[4] = 0, the default prediction is applied. If BO[4] = 1, the reverse of the standard prediction is applied. For the cases in Table 3-17 where BO[4] = y, software can reverse the default prediction. This should only be done when the default prediction is likely to be wrong. Note that for the “branch always” condition, reversal of the default prediction is not allowed.

The PowerPC Architecture requires assemblers to provide a way to conveniently control branch prediction. For any conditional branch mnemonic, a suffix may be added to the mnemonic to control prediction, as follows:

+ Predict branch to be taken

− Predict branch to be not taken

For example, bcctr+ causes BO[4] to be set appropriately to force the branch to be predicted taken.
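The default prediction rule and the effect of the prediction reversal bit can be sketched as follows. This is a hedged Python model under stated assumptions: the function names are hypothetical, BO is a 5-bit value with BO[0] as the most-significant bit, and s is supplied by the caller (the displacement sign bit for bc forms, 0 for bclr and bcctr).

```python
def predicted_taken(bo, s):
    """Predicted direction for a conditional branch with 5-bit BO and sign bit s."""
    bo0, bo2, bo4 = (bo >> 4) & 1, (bo >> 2) & 1, bo & 1
    if bo0 and bo2:
        return True                     # "branch always": reversal is not allowed
    default = bool(s)                   # otherwise s alone sets the default
    return default != bool(bo4)         # BO[4] = 1 reverses the default

def bo4_for_hint(want_taken, s):
    """BO[4] value an assembler could emit for a '+' (taken) or '-' suffix."""
    return int(bool(want_taken) != bool(s))

# Loop-closing branch with a negative displacement: predicted taken by default.
print(predicted_taken(0b10000, s=1))
# A forward branch is predicted not taken unless BO[4] reverses the default.
print(predicted_taken(0b01100, s=0), predicted_taken(0b01101, s=0))
```

Keeping the reversal as a separate comparison mirrors the text: the hardware rule depends only on BO[0], BO[2], and s, while BO[4] is a software override.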

2.8 Speculative Accesses

The PowerPC Architecture permits implementations to perform speculative accesses to memory, either for instruction fetching, or for data loads. A speculative access is deﬁned as any access which is not required by a sequential execution model.

For example, prefetching instructions beyond an undetermined conditional branch is a speculative fetch; if the branch is not in the predicted direction, the program, as executed, never needs the instructions from the predicted path.

Sometimes speculative accesses are inappropriate. For example, attempting to fetch instructions from addresses that cannot contain instructions can cause problems. To protect against errant accesses to “sensitive” memory or I/O devices, the PowerPC Architecture provides the G (guarded) storage attribute, which can be used to specify memory pages from which speculative accesses are prohibited. (Actually, speculative accesses to guarded storage are allowed in certain limited circumstances; if an instruction in a cache block will be executed, the rest of the cache block can be speculatively accessed.)

2.8.1 Speculative Accesses in the PPC405

The PPC405 does not perform speculative loads. Two methods control speculative instruction fetching. If instruction address translation is enabled (MSR[IR] = 1), the G (guarded) field in the translation lookaside buffer (TLB) entries controls speculative accesses.

If instruction address translation is disabled (MSR[IR] = 0), the Storage Guarded Register (SGR) controls speculative accesses for regions of memory. When a region is guarded (speculative fetching is disallowed), instruction prefetching is disabled for that region. A fetch request must be completely resolved (no longer speculative) before it is issued. There is a considerable performance penalty for fetching from guarded storage, so guarding should be used only when required.

Note that, following any reset, the PPC405 core operates with all of storage guarded.


Note that when address translation is enabled, attempts to fetch from guarded storage result in instruction storage exceptions. Guarded memory is most often needed with peripheral status registers that are cleared automatically after being read, because an unintended access resulting from a speculative fetch would cause the loss of status information. Because the MMU provides 64 pages with a wide range of page sizes as small as 1KB, fetching instructions from guarded storage should be unnecessary.

2.8.1.1 Prefetch Distance Down an Unresolved Branch Path

The fetcher will speculatively access up to 19 instructions down a predicted branch path, whether taken or sequential, regardless of cachability.

2.8.1.2 Prefetch of Branches to the CTR and Branches to the LR

When the instruction fetcher predicts that a bctr or blr instruction will be taken, the fetcher does not attempt to fetch an instruction from the target address in the CTR or LR if an executing instruction updates the register ahead of the branch. (See “Instruction Processing” on page 2-23 for a description of the instruction pipeline.) The fetcher recognizes that the CTR or LR contains data left from an earlier use and that such data is probably not valid.

In such cases, the fetcher does not fetch the instruction at the target address until the instruction that is updating the CTR or LR completes. Only then are the “correct” CTR or LR contents known. This prevents the fetcher from speculatively accessing a completely “random” address. After the CTR or LR contents are known to be correct, the fetcher accesses no more than ﬁve instructions down the sequential or taken path of an unresolved branch, or at the address contained in the CTR or LR.

2.8.2 Preventing Inappropriate Speculative Accesses

A memory-mapped I/O peripheral, such as a serial port having a status register that is automatically reset when read, provides a simple example of storage that should not be speculatively accessed. If code is in memory at an address adjacent to the peripheral (for example, code goes from 0x0000 0000 to 0x0000 0FFF, and the peripheral is at 0x0000 1000), prefetching past the end of the code will read the peripheral.

Guarding storage also prevents prefetching past the end of memory. If the highest memory address is left unguarded, the fetcher could attempt to fetch past the last valid address, potentially causing machine checks on the fetches from invalid addresses. While the machine checks do not actually cause an exception until the processor attempts to execute an instruction at an invalid address, some systems could suffer from the attempt to access such an invalid address. For example, an external memory controller might log an error.

System designers can avoid problems from speculative fetching without using the guarded storage attributes. The rest of this section describes ways to prevent speculative instruction fetches to sensitive addresses in unguarded memory regions.

2.8.2.1 Fetching Past an Interrupt-Causing or Interrupt-Returning Instruction

Suppose a bctr or blr instruction closely follows an interrupt-causing or interrupt-returning instruction (sc, rfi, or rfci). The fetcher does not prevent speculatively fetching past one of these instructions. In other words, the fetcher does not treat the interrupt-causing and interrupt-returning instructions specially when deciding whether to predict down a branch path. Instructions after an rfi, for example, are considered to be on the determined branch path.


To understand the implications of this situation, consider the code sequence:

handler:     aaa
             bbb
             rfi
subroutine:  bctr

When executing the interrupt handler, the fetcher does not recognize the rfi as a break in the program flow, and speculatively fetches the target of the bctr, which is really the first instruction of a subroutine that has not been called. Therefore, the CTR might contain an invalid pointer.

To protect against such a prefetch, the software must insert an unconditional branch hang (b $) just after the rfi. This prevents the hardware from prefetching the invalid target address used by bctr.

Consider also the above code sequence, with the rfi instruction replaced by an sc instruction used to initialize the CTR with the appropriate value for the bctr to branch to, upon return from the system call. The sc handler returns to the instruction following the sc, which can’t be a branch hang. Instead, software could put a mtctr just before the sc to load a non-sensitive address into the CTR. This address will be used as the prediction address before the sc executes. An alternative would be to put a mfctr or mtctr between the sc and the bctr; the mtctr prevents the fetcher from speculatively accessing the address contained in the CTR before initialization.

2.8.2.2 Fetching Past tw or twi Instructions

The interrupt-causing instructions, tw and twi, do not require the special handling described in “Fetching Past an Interrupt-Causing or Interrupt-Returning Instruction” on page 2-28. These instructions are typically used by debuggers, which implement software breakpoints by substituting a trap instruction for the instruction originally at the breakpoint address. In a code sequence mtlr followed by blr (or mtctr followed by bctr), replacement of mtlr/mtctr by tw or twi leaves the LR/CTR uninitialized. It would be inappropriate to fetch from the blr/bctr target address. This situation is common, and the fetcher is designed to prevent the problem.

2.8.2.3 Fetching Past an Unconditional Branch

When an unconditional branch is in DCD in the instruction queue, the fetcher recognizes that the sequential instructions following the branch are unnecessary. These sequential addresses are not accessed. Addresses at the branch target are accessed instead.

Therefore, placing an unconditional branch just before the start of a sensitive address space (for example, at the “end” of a memory area that borders an I/O device) guarantees that addresses in the sensitive area will not be speculatively fetched.

2.8.2.4 Suggested Locations of Memory-Mapped Hardware

The preferred method of protecting memory-mapped hardware from inadvertent access is to use address translation, with hardware isolated to guarded pages (the G storage attribute in the associated TLB entry is set to 1). The pages can be as small as 1KB. Code should never be stored in such pages.

If address translation is disabled, the preferred protection method is to isolate memory-mapped hardware into regions guarded using the SGR. Code should never be stored in such regions. The disadvantage of this method, compared to guarded pages under address translation, is that each region guarded by the SGR consumes 128MB of the address space.


Table 2-8 shows two address regions of the PPC405 core. Suppose a system designer can map all I/O devices and all ROM and SRAM devices into any location in either region. The choices made by the designer can prevent speculative accesses to the memory-mapped I/O devices.

Table 2-8. Example Memory Mapping

0x7800 0000 – 0x7FFF FFFF (SGR bit 15)   128MB   Region 2
0x7000 0000 – 0x77FF FFFF (SGR bit 14)   128MB   Region 1

A simple way to avoid the problem of speculative reads to peripherals is to map all storage containing code into Region 2, and all I/O devices into Region 1. Thus, accesses to Region 2 would only be for code and program data. Speculative fetches occurring in Region 2 would never access addresses in Region 1. Note that this hardware organization eliminates the need to use the G storage attribute to protect Region 1. However, Region 1 could be set as guarded with no performance penalty, because there is no code to execute or variable data to access in Region 1.

The use of these regions could be reversed (code in Region 1 and I/O devices in Region 2), if Region 2 is set as guarded. Prefetching from the highest addresses of Region 1 could cause an attempt to speculatively access the bottom of Region 2, but guarding prevents this from occurring. The performance penalty is slight, under the assumption that code infrequently executes the instructions in the highest addresses of Region 1.
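The mapping from an address to its SGR bit in Table 2-8 follows from the 128MB region size. The following Python sketch is illustrative only: the helper names are assumptions, and it assumes SGR bit 0 is the most-significant bit of the 32-bit register, in line with the bit numbering used elsewhere in this manual.

```python
REGION_SIZE = 128 * 1024 * 1024         # each SGR bit covers 128MB

def sgr_bit_index(address):
    """SGR bit number (0 to 31) that covers a 32-bit address."""
    return (address & 0xFFFFFFFF) // REGION_SIZE

def is_guarded(sgr, address):
    """True if the SGR value marks the region containing address as guarded."""
    shift = 31 - sgr_bit_index(address) # bit 0 is the most-significant bit
    return bool((sgr >> shift) & 1)

# Guard only Region 2 of Table 2-8 (SGR bit 15).
sgr = 1 << (31 - 15)
print(sgr_bit_index(0x78000000), sgr_bit_index(0x70000000))
print(is_guarded(sgr, 0x7FFFFFFF), is_guarded(sgr, 0x77FFFFFF))
```

With every SGR bit set (the "all of storage guarded" state described for reset), is_guarded returns True for any address.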

2.8.3 Summary

Software should take the following actions to prevent speculative accesses to sensitive data areas, if the sensitive data areas are not in guarded storage:

• Protect against accesses to “random” values in the LR or CTR on blr or bctr branches following rfi, rfci, or sc instructions by putting appropriate instructions before or after the rfi, rfci, or sc instruction. See “Fetching Past an Interrupt-Causing or Interrupt-Returning Instruction” on page 2-28.

• Protect against “running past” the end of memory into a bordering I/O device by putting an unconditional branch at the end of the memory area. See “Fetching Past an Unconditional Branch” on page 2-29.

• Recognize that a maximum of 19 words can be prefetched past an unresolved conditional branch, either down the target path or the sequential path. See “Prefetch Distance Down an Unresolved Branch Path” on page 2-28.

Of course, software should not code branches with known unsafe targets (either relative to the instruction counter, or to addresses contained in the LR or CTR), on the assumption that the targets are “protected” by code guaranteeing that the unsafe direction is not taken. The fetcher assumes that if a branch is predicted to be taken, it is safe to fetch down the target path.

2.9 Privileged Mode Operation

In the PowerPC Architecture, several terms describe two operating modes that have different instruction execution privileges. When a processor is in “privileged mode,” it can execute all instructions in the instruction set. This mode is also called the “supervisor state.” The other mode, in which certain instructions cannot be executed, is called the “user mode,” or “problem state.” These terms are used in pairs:

Privileged          Non-privileged

Privileged Mode     User Mode
Supervisor State    Problem State

The architecture uses MSR[PR] to control the execution mode. When MSR[PR] = 1, the processor is in user mode (problem state); when MSR[PR] = 0, the processor is in privileged mode (supervisor state).

After a reset, MSR[PR] = 0.

2.9.1 MSR Bits and Exception Handling

The current value of MSR[PR] is saved, along with all other MSR bits, in the SRR1 (for non-critical interrupts) or SRR3 (for critical interrupts) upon any interrupt, and MSR[PR] is set to 0. Therefore, all exception handlers operate in privileged mode.

Attempting to execute a privileged instruction while in user mode causes a privileged violation program exception (see “Program Interrupt” on page 5-20). The PPC405 core does not execute the instruction, and the program counter is loaded with EVPR[0:15] || 0x0700, the address of an exception processing routine.
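The concatenation EVPR[0:15] || 0x0700 can be written out as a short calculation. This Python fragment is a minimal sketch; the function name and the sample EVPR value are assumptions chosen for illustration.

```python
def program_interrupt_vector(evpr):
    """Address formed by EVPR[0:15] || 0x0700 for a 32-bit EVPR value."""
    high_half = evpr & 0xFFFF0000       # EVPR[0:15] are the 16 most-significant bits
    return high_half | 0x0700           # low 16 bits are the fixed vector offset

# The low half of EVPR does not contribute to the vector address.
print(hex(program_interrupt_vector(0xFFF01234)))
```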

The PPR field of the Exception Syndrome Register (ESR) is set when an interrupt was caused by a privileged instruction program exception. Software is not required to clear ESR[PPR].

2.9.2 Privileged Instructions

The instructions listed in Table 2-9 are privileged and cannot be executed while in user mode (MSR[PR] = 1).

Table 2-9. Privileged Instructions

dcbi
dccci
dcread
iccci
icread
mfdcr
mfmsr
mfspr     For all SPRs except CTR, LR, SPRG4–SPRG7, and XER. See “Privileged SPRs” on page 2-32
mtdcr
mtmsr
mtspr     For all SPRs except CTR, LR, XER. See “Privileged SPRs” on page 2-32
rfci
rfi
tlbia
tlbre
tlbsx
tlbsync
tlbwe
wrtee
wrteei

2.9.3 Privileged SPRs

All SPRs are privileged, except for the LR, the CTR, the XER, USPRG0, and read access to SPRG4– SPRG7. Reading from the time base registers Time Base Lower (TBL) and Time Base Upper (TBU) is not privileged. These registers are read using the mftb instruction, rather than the mfspr instruction. TBL and TBU are written (with different addresses) using mtspr, which is privileged for these registers. Except for moves to and from non-privileged SPRs, attempts to execute mfspr and mtspr instructions while in user mode result in privileged violation program exceptions.

In a mfspr or mtspr instruction, the 10-bit SPRN field specifies the SPR number of the source or destination SPR. The SPRN field contains two five-bit subfields, SPRN[0:4] and SPRN[5:9]. The assembler handles the unusual register number encoding to generate the SPRF field. In the machine code for the mfspr and mtspr instructions, the SPRN subfields are reversed (ending up as SPRF[5:9] and SPRF[0:4]) for compatibility with the POWER Architecture.

In the PowerPC Architecture, SPR numbers having a 1 in the most-significant bit of the SPRF field are privileged.

The following example illustrates how SPR numbers appear in assembler language coding and in machine coding of the mfspr and mtspr instructions.

In assembler language coding, SRR0 is SPR 26. Note that the assembler handles the unusual register number encoding to generate the SPRF ﬁeld.

mfspr r5,26

When the SPR number is considered as a binary number (0b0000011010), the most-significant bit is 0. However, the machine code for the instruction reverses the subfields, resulting in the following SPRF field: 0b1101000000. The most-significant bit is 1; SRR0 is privileged.

When an SPR number is considered as a hexadecimal number, the second digit of the three-digit hexadecimal number indicates whether an SPR is privileged. If the second digit is odd (1, 3, 5, 7, 9, B, D, F), the SPR is privileged.

For example, the SPR number of SRR0 is 26 (0x01A). The second hexadecimal digit is odd; SRR0 is privileged. In contrast, the LR is SPR 8 (0x008); the second hexadecimal digit is not odd; the LR is non-privileged.
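The subfield swap and the two privilege tests described above can be cross-checked with a short model. The following Python sketch uses hypothetical helper names and covers only the encoding rule stated in this section, not the per-register exceptions listed under “Privileged SPRs.”

```python
def sprf_field(sprn):
    """10-bit SPRF machine-code field for a 10-bit SPR number."""
    low = sprn & 0x1F                   # SPRN[5:9], the low five bits
    high = (sprn >> 5) & 0x1F           # SPRN[0:4], the high five bits
    return (low << 5) | high            # the two subfields trade places

def is_privileged_spr(sprn):
    """True if the most-significant bit of the SPRF field is 1."""
    return bool(sprf_field(sprn) & 0x200)

def second_hex_digit_is_odd(sprn):
    """Equivalent test on the three-digit hexadecimal SPR number."""
    return ((sprn >> 4) & 0xF) % 2 == 1

print(format(sprf_field(26), "010b"))             # SRR0 is SPR 26 (0x01A)
print(is_privileged_spr(26), is_privileged_spr(8))  # SRR0, then LR (SPR 8)
```

Both tests look at the same bit (the value 0x10 in the SPR number), which is why the hexadecimal shortcut agrees with the SPRF rule for every SPR number.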

2.9.4 Privileged DCRs

The mtdcr and mfdcr instructions themselves are privileged, in all cases. All DCRs are privileged.


2.10 Synchronization

The PPC405 core supports the synchronization operations of the PowerPC Architecture. The following book, chapter, and section numbers refer to related information in The PowerPC Architecture: A Specification for a New Family of RISC Processors:

• Book II, Section 1.8.1, “Storage Access Ordering” and “Enforce In-order Execution of I/O”

• Book III, Section 1.7, “Synchronization”

• Book III, Chapter 7, “Synchronization Requirements for Special Registers and Lookaside Buffers”

2.10.1 Context Synchronization

The context of a program is the environment (for example, privilege and address translation) in which the program executes. Context is controlled by the content of certain registers, such as the Machine State Register (MSR), and includes the content of all GPRs and SPRs.

An instruction or event is context synchronizing if it satisfies the following requirements:

1. All instructions that precede a context synchronizing operation must complete in the context that existed before the context synchronizing operation.

2. All instructions that follow a context synchronizing operation must complete in the context that exists after the context synchronizing operation.

Such instructions and events are called “context synchronizing operations.” In the PPC405 core, these include any interrupt, except a non-recoverable instruction machine check, and the isync, rfci, rfi, and sc instructions.

However, context specifically excludes the contents of memory. A context synchronizing operation does not guarantee that subsequent instructions observe the memory context established by previous instructions. To guarantee memory access ordering in the PPC405 core, one must use either an eieio instruction or a sync instruction. Note that for the PPC405 core, the eieio and sync instructions are implemented identically. See “Storage Synchronization” on page 2-35.

The contents of DCRs are not considered as part of the processor “context” managed by a context synchronizing operation. DCRs are not part of the processor core, and are analogous to memory-mapped registers. Their context is managed in a manner similar to that of memory contents.

Finally, implementations of the PowerPC Architecture can exempt the machine check exception from context synchronization control. If the machine check exception is exempted, an instruction that precedes a context synchronizing operation can cause a machine check exception after the context synchronizing operation occurs and additional instructions have completed.

The following scenarios use pseudocode examples to illustrate these limitations of context synchronization. Subsequent text explains how software can further guarantee “storage ordering.”

1. Consider the following instruction sequence:

STORE non-cachable to address XYZ
isync
XYZ instruction


In this sequence, the isync instruction does not guarantee that the XYZ instruction is fetched after the STORE has occurred to memory. There is no guarantee which XYZ instruction will execute; either the old version or the new (stored) version might.

2. Consider the following instruction sequence, which assumes that a PPC405 core is part of a standard product that uses DCRs to provide bus region control:

STORE non-cachable to address XYZ
isync
MTDCR to change a bus region containing XYZ

In this sequence, there is no guarantee that the STORE will occur before the mtdcr changing the bus region control DCR. The STORE could fail because of a conﬁguration error.

Consider an interrupt that changes privileged mode. An interrupt is a context synchronizing operation, because interrupts cause the MSR to be updated. The MSR is part of the processor context; the context synchronizing operation guarantees that all instructions that precede the interrupt complete using the preinterrupt value of MSR[PR], and that all instructions that follow the interrupt complete using the postinterrupt value.

Consider, on the other hand, some code that uses mtmsr to change the value of MSR[PR], which changes the privileged mode. In this case, the MSR is changed, changing the context. It is possible, for example, that prefetched privileged instructions expect to execute after the mtmsr has changed the operating mode from privileged mode to user mode. To prevent privileged instruction program exceptions, the code must execute a context synchronization operation, such as isync, immediately after the mtmsr instruction to prevent further instruction execution until the mtmsr completes.

eieio or sync can ensure that the contents of memory and DCRs are synchronized in the instruction stream. These instructions guarantee storage ordering because all memory accesses that precede eieio or sync are completed before subsequent memory accesses. Neither eieio nor sync guarantees that instruction prefetching is delayed until the eieio or sync completes. The instructions do not cause the prefetch queues to be purged and instructions to be refetched. See “Storage Synchronization” on page 2-35 for more information.

Instruction cache state is part of context. A context synchronization operation is required to guarantee instruction cache access ordering.

3. Consider the following instruction sequence, which is required for creating self-modifying code:

STORE   Change data cache contents
dcbst   Flush the new data cache contents to memory
sync    Guarantee that dcbst completes before subsequent instructions begin
icbi    Context changing operation; invalidates instruction cache contents
isync   Context synchronizing operation; causes refetch using new instruction cache
        context and new memory context, due to the previous sync

If software wishes to ensure that all storage accesses are complete before executing a mtdcr to change a bus region (Example 2), the software must issue a sync after all storage accesses and before the mtdcr. Likewise, if the software is to ensure that all instruction fetches after the mtdcr use the new bank register contents, the software must issue an isync, after the mtdcr and before the ﬁrst instruction that should be fetched in the new context.


isync guarantees that all subsequent instructions are fetched and executed using the context established by all previous instructions. isync is a context synchronizing operation; isync causes all subsequently prefetched instructions to be discarded and refetched.

The following example illustrates the use of isync with debug exceptions:

mtdbcr0  Enable an instruction address compare (IAC) event
isync    Wait for the new Debug Control Register 0 (DBCR0) context to be established
XYZ      This instruction is at the IAC address; an isync was necessary to guarantee that the IAC event occurs at the execution of this instruction

2.10.2 Execution Synchronization

For completeness, consider the deﬁnition of execution synchronizing as it relates to context synchronization. Execution synchronization is architecturally a subset of context synchronization.

Execution synchronization guarantees that the following requirement is met:

All instructions that precede an execution synchronizing operation must complete in the context that existed before the execution synchronizing operation.

The following requirement need not be met:

All instructions that follow an execution synchronizing operation must complete in the context that exists after the execution synchronizing operation.

Execution synchronization ensures that preceding instructions execute in the old context; subsequent instructions might execute in either the new or old context (indeterminate). The PPC405 core provides three execution synchronizing operations: the eieio, mtmsr, and sync instructions.

Because mtmsr is execution synchronizing, it guarantees that previous instructions complete using the old MSR value (for example, when mtmsr is used to change the endian mode). However, to guarantee that subsequent instructions use the new MSR value, software must insert a context synchronization operation, such as isync.

Note that the PowerPC Architecture requires MSR[EE] (the external interrupt bit) to be, in effect, execution synchronizing: if a mtmsr sets MSR[EE] = 1, and an external interrupt is pending, the exception must be taken before the instruction that follows mtmsr is executed. However, the mtmsr instruction is not a context synchronizing operation, so the PPC405 core does not, for example, discard prefetched instructions and refetch. Note that the wrtee and wrteei instructions can change the value of MSR[EE], but are not execution synchronizing.

Finally, while sync and eieio are execution synchronizing, they are also more restrictive in their requirement of memory ordering. Stating that an operation is execution synchronizing does not imply storage ordering. This is an additional specific requirement of sync and eieio.

2.10.3 Storage Synchronization

The sync instruction guarantees that all previous storage references complete with respect to the PPC405 core before the sync instruction completes (therefore, before any subsequent instructions begin to execute). The sync instruction is execution synchronizing.

Consider the following use of sync:

stw      Store to peripheral
sync     Wait for store to actually complete
mtdcr    Reconfigure device

The eieio instruction guarantees the order of storage accesses. All storage accesses that precede eieio complete before any storage accesses that follow the instruction, as in the following example:

stb X    Store to peripheral, address X; this resets a status bit in the device
eieio    Guarantee stb X completes before next instruction
lbz Y    Load from peripheral, address Y; this is the status register updated by stb X. eieio was necessary, because the read and write addresses are different, but affect each other

The PPC405 core implements both sync and eieio identically, in the manner described above for sync. In the PowerPC Architecture, sync can function across all processors in a multiprocessor environment; eieio functions only within its executing processor. The PPC405 does not provide hardware support for multiprocessor memory coherency, so sync does not guarantee memory ordering across multiple processors.

2.11 Instruction Set

The PPC405 instruction set contains instructions deﬁned in the PowerPC Architecture and instructions speciﬁc to the IBM PowerPC 400 family of embedded processors.

Chapter 9, “Instruction Set,” contains detailed descriptions of each instruction.

Appendix A, “Instruction Summary,” alphabetically lists each instruction and extended mnemonic and provides a short-form description. Appendix B, “Instructions by Category,” provides short-form descriptions of instructions, grouped by the instruction categories listed in Table 2-10, “PPC405 Instruction Set Summary,” on page 2-36.

Table 2-10 summarizes the PPC405 instruction set functions by categories. Instructions within each category are described in subsequent sections.

Table 2-10. PPC405 Instruction Set Summary

Storage Reference      load, store
Arithmetic             add, subtract, negate, multiply, multiply-accumulate, multiply halfword, divide
Logical                and, andc, or, orc, xor, nand, nor, xnor, sign extension, count leading zeros
Comparison             compare, compare logical, compare immediate
Branch                 branch, branch conditional, branch to LR, branch to CTR
CR Logical             crand, crandc, cror, crorc, crnand, crnor, crxor, crxnor, move CR field
Rotate                 rotate and insert, rotate and mask, shift left, shift right
Shift                  shift left, shift right, shift right algebraic
Cache Management       invalidate, touch, zero, flush, store, read
Interrupt Control      write to external interrupt enable bit, move to/from MSR, return from interrupt, return from critical interrupt
Processor Management   system call, synchronize, trap, move to/from DCRs, move to/from SPRs, move to/from CR

2.11.1 Instructions Speciﬁc to the IBM PowerPC Embedded Environment

To support functions required in embedded real-time applications, the IBM PowerPC 400 family of embedded processors deﬁnes instructions that are not deﬁned in the PowerPC Architecture.

Table 2-11 lists the instructions specific to IBM PowerPC embedded processors. Programs using these instructions are not portable to PowerPC implementations that are not part of the IBM PowerPC 400 family of embedded processors.

In the table, the syntax “[s]” indicates that the instruction has a signed form. The syntax “[u]” indicates that the instruction has an unsigned form. The syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-11. Implementation-speciﬁc Instructions

dccci dcread iccci icread

macchw[s][u] machhw[s][u] maclhw[s][u] nmacchw[s] nmachhw[s] nmaclhw[s]

mulchw[u] mulhhw[u] mullhw[u]

mfdcr mtdcr rfci tlbre tlbsx[.] tlbwe wrtee wrteei

2.11.2 Storage Reference Instructions

Table 2-12 lists the PPC405 storage reference instructions. Load/store instructions transfer data between memory and the GPRs. These instructions operate on bytes, halfwords, and words. Storage reference instructions also support loading or storing multiple registers, character strings, and byte-reversed data.

In the table, the syntax “[u]” indicates that an instruction has an “update” form that updates the RA addressing register with the calculated address, and a “non-update” form. The syntax “[x]” indicates that an instruction has an “indexed” form, which forms the address by adding the contents of the RA and RB GPRs and a “base + displacement” form, in which the address is formed by adding a 16-bit signed immediate value (included as part of the instruction word) to the contents of RA GPR.

Table 2-12. Storage Reference Instructions

Loads

Byte         Halfword     Word         Multiple/String
lbz[u][x]    lha[u][x]    lwarx        lmw
             lhbrx        lwbrx        lswi
             lhz[u][x]    lwz[u][x]    lswx

Stores

Byte         Halfword     Word         Multiple/String
stb[u][x]    sth[u][x]    stw[u][x]    stmw
             sthbrx       stwbrx       stswi
                          stwcx.       stswx

2.11.3 Arithmetic Instructions

Arithmetic operations are performed on integer operands stored in GPRs. Instructions that perform operations on two operands are defined in a three-operand format; an operation is performed on the operands, which are stored in two GPRs. The result is placed in a third operand, which is stored in a GPR. Instructions that perform operations on one operand are defined using a two-operand format; the operation is performed on the operand in a GPR and the result is placed in another GPR. Several instructions also have immediate formats in which an operand is contained in a field in the instruction word.

Most arithmetic instructions have versions that can update CR[CR0] and XER[SO, OV], based on the result of the instruction. Some arithmetic instructions also update XER[CA] implicitly. See “Condition Register (CR)” on page 2-10 and “Fixed Point Exception Register (XER)” on page 2-7 for more information.

Table 2-13 lists the PPC405 arithmetic instructions. In the table, the syntax “[o]” indicates that an instruction has an “o” form that updates XER[SO,OV], and a “non-o” form. The syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-13. Arithmetic Instructions

Add           Subtract       Multiply      Divide        Negate
add[o][.]     subf[o][.]     mulhw[.]      divw[o][.]    neg[o][.]
addc[o][.]    subfc[o][.]    mulhwu[.]     divwu[o][.]
adde[o][.]    subfe[o][.]    mulli
addi          subfic         mullw[o][.]
addic[.]      subfme[o][.]
addis         subfze[o][.]
addme[o][.]
addze[o][.]
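The bracket notation used in these tables can be expanded mechanically into the individual mnemonics. The following Python sketch is illustrative only: the helper name expand_forms is hypothetical and is not part of any IBM tool, and it assumes that every bracketed suffix is optional and independent of the others, as the table notation describes.

```python
import re
from itertools import product


def expand_forms(entry):
    """Expand table notation such as 'add[o][.]' into concrete mnemonics."""
    # Everything before the first '[' is the base mnemonic.
    base = re.match(r"[^\[]+", entry).group(0)
    # Each bracketed item is an optional suffix.
    suffixes = re.findall(r"\[([^\]]+)\]", entry)
    forms = []
    # Every suffix is either absent ('') or present, independently.
    for choice in product(*[("", s) for s in suffixes]):
        forms.append(base + "".join(choice))
    return forms


if __name__ == "__main__":
    for entry in ("add[o][.]", "addi", "macchw[s][u]"):
        print(entry, "->", expand_forms(entry))
```

An entry with no brackets, such as addi, expands to itself, so the same helper can be applied to every cell of a table.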


Table 2-14 lists additional arithmetic instructions for multiply-accumulate and multiply halfword operations. In the table, the syntax “[o]” indicates that an instruction has an “o” form that updates XER[SO,OV], and a “non-o” form. The syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-14. Multiply-Accumulate and Multiply Halfword Instructions

Multiply-Accumulate   Negative Multiply-Accumulate   Multiply Halfword
macchw[o][.]          nmacchw[o][.]                  mulchw[.]
macchws[o][.]         nmacchws[o][.]                 mulchwu[.]
macchwsu[o][.]        nmachhw[o][.]                  mulhhw[.]
macchwu[o][.]         nmachhws[o][.]                 mulhhwu[.]
machhw[o][.]          nmaclhw[o][.]                  mullhw[.]
machhws[o][.]         nmaclhws[o][.]                 mullhwu[.]
machhwsu[o][.]
machhwu[o][.]
maclhw[o][.]
maclhws[o][.]
maclhwsu[o][.]
maclhwu[o][.]

2.11.4 Logical Instructions

Table 2-15 lists the PPC405 logical instructions. In the table, the syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-15. Logical Instructions

And                    and[.], andi., andis.
And with complement    andc[.]
Nand                   nand[.]
Or                     or[.], ori, oris
Or with complement     orc[.]
Nor                    nor[.]
Xor                    xor[.], xori, xoris
Equivalence            eqv[.]
Extend sign            extsb[.], extsh[.]
Count leading zeros    cntlzw[.]

2.11.5 Compare Instructions

These instructions perform arithmetic or logical comparisons between two operands and update the CR with the result of the comparison.

Table 2-16 lists the PPC405 core compare instructions.

Table 2-16. Compare Instructions

Arithmetic   Logical
cmp          cmpl
cmpi         cmpli

2.11.6 Branch Instructions

These instructions unconditionally or conditionally branch to an address. Conditional branch instructions can test condition codes set by a previous instruction and branch accordingly. Conditional branch instructions can also decrement and test the CTR as part of branch determination, and can save the return address in the LR. The target address for a branch can be a displacement from the current instruction address (a relative address), an absolute address, or contained in the CTR or LR.

See “Branch Processing” on page 2-24 for more information on branch operations.

Table 2-17 lists the PPC405 branch instructions. In the table, the syntax “[l]” indicates that the instruction has a “link update” form that updates LR with the address of the instruction after the branch, and a “non-link update” form. The syntax “[a]” indicates that the instruction has an “absolute address” form, in which the target address is formed directly using the immediate field specified as part of the instruction, and a “relative” form, in which the target address is formed by adding the immediate field to the address of the branch instruction.

Table 2-17. Branch Instructions

Branch

b[l][a] bc[l][a] bcctr[l] bclr[l]
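The difference between the “[a]” and “[l]” forms can be illustrated with a small model. This is a minimal Python sketch, not a description of the hardware: the function names are hypothetical, the immediate is assumed to be already sign-extended, and addresses are treated as 32-bit values.

```python
ADDRESS_MASK = 0xFFFFFFFF  # 32-bit effective addresses


def branch_target(branch_address, immediate, absolute):
    """Target of a branch with an immediate field.

    absolute=True models the "[a]" form: the immediate is the target.
    absolute=False models the relative form: the immediate is added to
    the address of the branch instruction itself.
    """
    if absolute:
        return immediate & ADDRESS_MASK
    return (branch_address + immediate) & ADDRESS_MASK


def link_register_value(branch_address):
    """Value placed in LR by the "[l]" forms: the next instruction address."""
    return (branch_address + 4) & ADDRESS_MASK


if __name__ == "__main__":
    print(hex(branch_target(0x1000, 0x40, absolute=False)))
    print(hex(branch_target(0x1000, 0x40, absolute=True)))
    print(hex(link_register_value(0x1000)))
```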

2.11.6.1 CR Logical Instructions

These instructions perform logical operations on a speciﬁed pair of bits in the CR, placing the result in another speciﬁed bit. These instructions can logically combine the results of several comparisons without incurring the overhead of conditional branch instructions. Software performance can signiﬁcantly improve if multiple conditions are tested at once as part of a branch decision.

Table 2-18 lists the PPC405 condition register logical instructions.

Table 2-18. CR Logical Instructions

crand crandc creqv crnand

crnor cror crorc crxor mcrf

2.11.6.2 Rotate Instructions

These instructions rotate operands stored in the GPRs. Rotate instructions can also mask rotated operands.

Table 2-19 lists the PPC405 rotate instructions. In the table, the syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-19. Rotate Instructions

Rotate and Insert   Rotate and Mask
rlwimi[.]           rlwinm[.]
                    rlwnm[.]

2.11.6.3 Shift Instructions

These instructions shift operands stored in the GPRs.

Table 2-20 lists the PPC405 shift instructions. Shift right algebraic instructions implicitly update XER[CA]. In the table, the syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-20. Shift Instructions

Shift Left   Shift Right   Shift Right Algebraic
slw[.]       srw[.]        sraw[.]
                           srawi[.]

2.11.6.4 Cache Management Instructions

These instructions control the operation of the ICU and DCU. Instructions are provided to ﬁll or invalidate instruction cache blocks. Instructions are also provided to ﬁll, ﬂush, invalidate, or zero data cache blocks, where a block is deﬁned as a 32-byte cache line.

Table 2-21 lists the PPC405 core cache management instructions.

Table 2-21. Cache Management Instructions

DCU       ICU
dcba      icbi
dcbf      icbt
dcbi      iccci
dcbst     icread
dcbt
dcbtst
dcbz
dccci
dcread

2.11.7 Interrupt Control Instructions

mfmsr and mtmsr read and write data between the MSR and a GPR to enable and disable interrupts. wrtee and wrteei enable and disable external interrupts. rfi and rfci return from interrupt handlers. Table 2-22 lists the PPC405 core interrupt control instructions.

Table 2-22. Interrupt Control Instructions

mfmsr mtmsr rfi rfci wrtee wrteei

2.11.8 TLB Management Instructions

The TLB management instructions read and write entries of the TLB array in the MMU, search the TLB array for an entry which will translate a given address, and invalidate all TLB entries. There is also an instruction for synchronizing TLB updates with other processors, but because the PPC405 core is for use in uniprocessor environments, this instruction performs no operation.

Table 2-23 lists the TLB management instructions. In the table, the syntax “[.]” indicates that the instruction has a “record” form that updates CR[CR0], and a “non-record” form.

Table 2-23. TLB Management Instructions

tlbia tlbre tlbsx[.] tlbsync tlbwe

2.11.9 Processor Management Instructions

These instructions move data between the GPRs and SPRs, the CR, and DCRs in the PPC405 core, and provide traps, system calls, and synchronization controls.

Table 2-24 lists the processor management instructions in the PPC405 core.

Table 2-24. Processor Management Instructions

eieio isync sync

mcrxr mfcr mfdcr mfspr

mtcrf mtdcr mtspr sc tw twi

2.11.10 Extended Mnemonics

In addition to mnemonics for instructions supported directly by hardware, the PowerPC Architecture defines numerous extended mnemonics.

An extended mnemonic translates directly into the mnemonic of a hardware instruction, typically with carefully specified operands. For example, the PowerPC Architecture does not define a “shift right word immediate” instruction, because the “rotate left word immediate then AND with mask” (rlwinm) instruction can accomplish the same result:

rlwinm RA,RS,32-n,n,31

However, because the required operands are not obvious, the PowerPC Architecture deﬁnes an extended mnemonic:

srwi RA,RS,n

Extended mnemonics transfer the problem of remembering complex or frequently used operand combinations to the assembler, and can more clearly reﬂect a programmer’s intentions. Thus, programs can be more readable.
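The equivalence between srwi and its rlwinm expansion can be checked with a small software model. The following Python sketch is an illustration under stated assumptions: the helper names are hypothetical, register values are modeled as 32-bit unsigned integers, bit 0 is the most significant bit (PowerPC numbering), and the shift count is restricted to 1 through 31.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by shift bits."""
    shift %= 32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def mask_mb_me(mb, me):
    """Mask with ones in bits mb..me (bit 0 is the most significant bit)."""
    ones_from_mb = MASK32 >> mb
    ones_through_me = (MASK32 << (31 - me)) & MASK32
    if mb <= me:
        return ones_from_mb & ones_through_me
    # mb > me selects a mask that wraps around from bit 31 to bit 0.
    return ones_from_mb | ones_through_me


def rlwinm(rs, sh, mb, me):
    """Model of rlwinm: rotate left by sh, then AND with the mb..me mask."""
    return rotl32(rs, sh) & mask_mb_me(mb, me)


def srwi(rs, n):
    """srwi RA,RS,n expressed as rlwinm RA,RS,32-n,n,31 (assumes 1 <= n <= 31)."""
    return rlwinm(rs, 32 - n, n, 31)


if __name__ == "__main__":
    print(hex(srwi(0x80000000, 4)))
```

Rotating left by 32-n is the same as rotating right by n; the mask from bit n through bit 31 then clears the n high-order bits that wrapped around, which leaves a logical right shift.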


Refer to the following chapter and appendixes for lists of the extended mnemonics:

• Chapter 9, “Instruction Set,” lists extended mnemonics under the associated hardware instruction mnemonics.

• Appendix A, “Instruction Summary,” lists extended mnemonics alphabetically, along with the hardware instruction mnemonics.

• Table B-5 in Appendix B, “Instructions by Category,” lists all extended mnemonics.


Chapter 3. Initialization

This chapter describes reset operations, the initial state of the PPC405 core after a reset, and an example of the initialization code required to begin executing application code. Initialization of external system components or system-specific chip facilities may also be performed, in addition to the basic initialization described in this chapter.

Reset operations affect the PPC405 at power-on time as well as during normal operation, if programmed to do so. To understand how these operations work, it is necessary to first understand the signal pins involved as well as the terminology of core, chip, and system resets.

Three types of reset, each with different scope, are possible in the PPC405. A core reset affects only the processor core. Chip resets affect the processor core and all on-chip peripherals. System resets affect the processor core, all on-chip peripherals, and any off-chip devices connected to the chip reset net. Only the processor core can request a core or chip reset.

The processor core can request three types of processor resets: core, chip, and system. Each type of reset can be generated by a JTAG debug tool, by the second expiration of the watchdog timer, or by writing a non-zero value to the Reset (RST) field of Debug Control Register 0 (DBCR0). In Core+ASIC and system on chip (SOC) designs, reset signals from on-chip and external peripherals can initiate system resets.

Core reset     Resets the processor core, including the data cache unit (DCU) and instruction cache unit (ICU).

Chip reset     Resets the processor core, including the DCU and ICU. This type of reset is provided in the IBM PowerPC 400 Series Embedded controllers as a means of resetting on-chip peripherals, and is provided on the PPC405 for compatibility.

System reset   Resets the entire chip. The reset signal is driven active by the PPC405 during system reset.

The effects of core and chip resets on the processor core are identical. To determine which reset type occurred, the most-recent reset (MRR) ﬁeld of the Debug Status Register (DBSR) can be examined.

3.1 Processor State After Reset

After a reset, the contents of the Machine State Register (MSR) and the Special Purpose Registers (SPRs) control the initial processor state. The contents of Device Control Registers (DCRs) control the initial states of on-chip devices. Chapter 10, “Register Summary,” contains descriptions of the registers.

In general, the contents of SPRs are undefined after a reset. Reset initializes the minimum number of SPR fields required to allow successful instruction fetching. “Contents of Special Purpose Registers after Reset” on page 3-3 describes these initial values. System software fully configures the processor.

“Machine State Register Contents after Reset” on page 3-2 describes the MSR contents.

The MCI field of the Exception Syndrome Register (ESR) is cleared so that it can be determined if there has been a machine check during initialization, before machine check exceptions are enabled.

Two SPRs contain status on the type of reset that has occurred. The Debug Status Register (DBSR) contains the most recent reset type. The Timer Status Register (TSR) contains the most recent watchdog reset.

3.1.1 Machine State Register Contents after Reset

After all resets, all ﬁelds of the Machine State Register (MSR) contain zeros. Table 3-1 shows how this affects core operation.

Table 3-1. MSR Contents after Reset

Register  Field  Core Reset  Chip Reset  System Reset  Comment
MSR       AP     0           0           0             APU unavailable
          APE    0           0           0             Auxiliary processor exception disabled
          WE     0           0           0             Wait state disabled
          CE     0           0           0             Critical interrupts disabled
          EE     0           0           0             External interrupts disabled
          PR     0           0           0             Supervisor mode
          FP     0           0           0             Floating point unavailable
          ME     0           0           0             Machine check exceptions disabled
          FE0    0           0           0             Floating point exception disabled
          DWE    0           0           0             Debug wait mode disabled
          DE     0           0           0             Debug interrupts disabled
          FE1    0           0           0             Floating point exceptions disabled
          DR     0           0           0             Data translation disabled
          IR     0           0           0             Instruction translation disabled

3.1.2 Contents of Special Purpose Registers after Reset

In general, the contents of Special Purpose Registers (SPRs) are undeﬁned after a core, chip, or system reset. Some SPRs retain the contents they had before a reset occurred.

Table 3-2 shows the contents of SPRs that are deﬁned or unchanged after core, chip, and system resets.

Table 3-2. SPR Contents After Reset

Register  Field    Core Reset        Chip Reset        System Reset  Comment
CCR0      0:31     0x00700000        0x00700000        0x00700000    Sets ICU and DCU PLB priorities
DBCR0     EDM      0                 0                 0             External debug mode disabled
          RST      00                00                00            No reset action
DBCR1     0:31     0x00000000        0x00000000        0x00000000    Data compares disabled
DBSR      MRR      01                10                11            Most recent reset
DCCR      S0:S31   0x00000000        0x00000000        0x00000000    Data cache disabled
DCWR      W0:W31   0x00000000        0x00000000        0x00000000    Data cache write-through disabled
ESR       0:31     0x00000000        0x00000000        0x00000000    No exception syndromes
ICCR      S0:S31   0x00000000        0x00000000        0x00000000    Instruction cache disabled
PVR       0:31                                                       Processor version
SGR       G0:G31   0xFFFFFFFF        0xFFFFFFFF        0xFFFFFFFF    Storage is guarded
SLER      S0:S31   0x00000000        0x00000000        0x00000000    Storage is big endian
SU0R      K0:K31   0x00000000        0x00000000        0x00000000    Storage is uncompressed
TCR       WRC      00                00                00            Watchdog timer reset disabled
TSR       WRS      Copy of TCR[WRC]  Copy of TCR[WRC]                Watchdog reset status
          PIS      Undefined         Undefined         Undefined     After POR
          FIS      Unchanged         Unchanged         Unchanged     If reset not caused by watchdog timer
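Software that inspects DBSR[MRR] after a reset can translate the field using the encodings listed for DBSR in Table 3-2 (01 after a core reset, 10 after a chip reset, 11 after a system reset). The following Python sketch is illustrative only: the function name is hypothetical, and it assumes the caller has already extracted the 2-bit MRR field from the DBSR, because the bit position of the field is not given in this section.

```python
# Encodings taken from the DBSR[MRR] row of Table 3-2.
MRR_ENCODING = {
    0b01: "core reset",
    0b10: "chip reset",
    0b11: "system reset",
}


def most_recent_reset(mrr_field):
    """Translate a 2-bit DBSR[MRR] field value into a reset type."""
    if not 0 <= mrr_field <= 0b11:
        raise ValueError("MRR is a 2-bit field")
    # The value 00 is not listed in Table 3-2, so it is reported as such.
    return MRR_ENCODING.get(mrr_field, "not listed in Table 3-2")


if __name__ == "__main__":
    for value in range(4):
        print(format(value, "02b"), most_recent_reset(value))
```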

3.2 PPC405 Initial Processor Sequencing

After any reset, the processor core fetches the word at address 0xFFFFFFFC and attempts to execute it. The instruction at 0xFFFFFFFC is typically a branch to initialization code. Unless the instruction at 0xFFFFFFFC is an unconditional branch, fetching can wrap to address 0x00000000 and attempt to execute the instruction at this location.


Because the processor is initially in big endian mode, initialization code must be in big endian format until the endian storage attribute for the addressed region is changed, or until code branches to a region deﬁned as little endian storage.

Before a reset operation begins, the system must provide non-volatile memory, or memory initialized by some mechanism external to the processor. This memory must be located at address 0xFFFFFFFC.
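The address arithmetic behind the reset vector can be sketched in a few lines. The following Python example is a hedged illustration, not production tooling: the function names are hypothetical, and the encoding shown is the standard PowerPC I-form branch (primary opcode 18, with AA = 1 and LK = 0 for ba). It shows that the word after 0xFFFFFFFC wraps to 0x00000000, and how a ba to a hypothetical initialization address could be encoded for placement at the reset vector.

```python
RESET_VECTOR = 0xFFFFFFFC
ADDRESS_MASK = 0xFFFFFFFF


def next_sequential_fetch(address):
    """Address of the next word; wraps from the top of the address space to 0."""
    return (address + 4) & ADDRESS_MASK


def encode_ba(target):
    """Encode 'ba target' (I-form: opcode 18, AA=1, LK=0)."""
    if target & 0x3:
        raise ValueError("branch targets are word aligned")
    # The LI field and two implied zero bits form a 26-bit signed value;
    # with AA=1, the sign-extended value is the target address itself.
    li = target & 0x03FFFFFC
    sign_extended = li | 0xFC000000 if li & 0x02000000 else li
    if sign_extended != target & ADDRESS_MASK:
        raise ValueError("target is not reachable with ba")
    return (18 << 26) | li | 0b10


if __name__ == "__main__":
    print(hex(next_sequential_fetch(RESET_VECTOR)))
    # 0xFFFFF000 is a hypothetical init_code address used for illustration.
    print(hex(encode_ba(0xFFFFF000)))
```

Because the absolute form sign-extends a 26-bit value, a ba at the reset vector can reach only the top 32MB or the bottom 32MB of the address space in this model.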

3.3 Initialization Requirements

When any reset is performed, the processor is initialized to a minimum conﬁguration to start executing initialization code. Initialization code is necessary to complete the processor and system conﬁguration.

The initialization code example in this section performs the conﬁguration tasks required to prepare the PPC405 core to boot an operating system or run an application program.

Some portions of the initialization code work with system components that are beyond the scope of this manual.

Initialization code should perform the following tasks to configure the processor resources.

To improve instruction fetching performance: initialize the SGR appropriately for guarded or unguarded storage. Since all storage is initially guarded and speculative fetching is inhibited to guarded storage, reprogramming the SGR will improve performance for unguarded regions.

1. Before executing instructions as cachable:
   – Invalidate the instruction cache.
   – Initialize the ICCR to configure instruction cachability.

2. Before using storage access instructions:
   – Invalidate the data cache.
   – Initialize CCR0 to determine if a store miss results in a line fill (SWOA).
   – Initialize the DCWR to select copy-back or write-through caching.
   – Initialize the DCCR to configure data cachability.

3. Before allowing interrupts (synchronous or asynchronous):
   – Initialize the EVPR to point to vector table.
   – Provide vector table with branches to interrupt handlers.

4. Before enabling asynchronous interrupts:
   – Initialize timer facilities.
   – Initialize MSR to enable appropriate interrupts.

5. Initialize other processor features, such as the MMU, APU (if implemented), debug, and trace.

6. Initialize non-processor resources.
   – Initialize system memory as required by the operating system or application code.
   – Initialize off-chip system facilities.

7. Start the execution of operating system or application code.


3.4 Initialization Code Example

The following initialization code illustrates the steps that should be taken to initialize the processor before an operating system or user programs begin execution. The example is presented in pseudocode; function calls are named similarly to PPC405 mnemonics where appropriate. Speciﬁc implementations may require different ordering of these sections to ensure proper operation.

/*—————————————————————————————————————— */

/* PPC405 Initialization Pseudo Code */

/*—————————————————————————————————————— */

@0xFFFFFFFC: /* initial instruction fetch from 0xFFFFFFFC */

ba(init_code); /* branch to initialization code */

@init_code:

/* ———————————————————————————————————— */

/* Conﬁgure guarded attribute for performance. */

/* ———————————————————————————————————— */

mtspr(SGR, guarded_attribute);

/* ———————————————————————————————————— */

/* Conﬁgure endianness and compression. */

/* ———————————————————————————————————— */

mtspr(SLER, endianness);
mtspr(SU0R, compression_attribute);

/* ------------------------------------------------------------ */
/* Invalidate the instruction cache and enable cachability */
/* ------------------------------------------------------------ */

iccci; /* invalidate i-cache */

mtspr(ICCR, i_cache_cachability); /* enable I-cache*/

isync;

/* ———————————————————————————————————— */

/* Invalidate the data cache and enable cachability */

/* ———————————————————————————————————— */

address = 0; /* start at first line */
for (line = 0; line < m_lines; line++) /* D-cache has m_lines congruence classes */
{
    dccci(address); /* invalidate congruence class */
    address += 32; /* point to the next congruence class */
}
mtspr(CCR0, store_miss_line_fill);
mtspr(DCWR, copy_back_write_thru);
mtspr(DCCR, d_cache_cachability); /* enable D-cache */
isync;

/* ———————————————————————————————————— */

/* Prepare system for synchronous interrupts. */

/* ———————————————————————————————————— */

mtspr(EVPR, prefix_addr); /* initialize exception vector prefix */
/* Initialize vector table and interrupt handlers if not already done */
/* Initialize and configure timer facilities */
mtspr(PIT, 0); /* clear PIT so no PIT indication after TSR cleared */
mtspr(TSR, 0xFFFFFFFF); /* clear TSR */
mtspr(TCR, timer_enable); /* enable desired timers */
mtspr(TBL, 0); /* reset time base low first to avoid ripple */
mtspr(TBU, time_base_u); /* set time base, hi first to catch possible ripple */
mtspr(TBL, time_base_l); /* set time base, low */
mtspr(PIT, pit_count); /* set desired PIT count */

/* Initialize the MSR */

/*———————————————————————————————————— */

/* Exceptions must be enabled immediately after timer facilities to avoid missing a */
/* timer exception. */
/* */
/* The MSR also controls privileged/user mode, translation, and the wait state. */
/* These must be initialized by the operating system or application code. */
/* If enabling translation, code must initialize the TLB. */

/*———————————————————————————————————— */

mtmsr(machine_state);

/*———————————————————————————————————— */

/* Initialization of other processor facilities should be performed at this time. */

/*———————————————————————————————————— */

/* Initialization of non-processor facilities should be performed at this time. */

/*———————————————————————————————————— */

/* Branch to operating system or application code can occur at this time. */

/*———————————————————————————————————— */
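The time base writes in the example above (TBL cleared first, then TBU, then TBL) guard against a carry from the lower half into the upper half between the two writes. The following Python sketch models that hazard. It is a simplified illustration under stated assumptions: the TimeBase class and both helper functions are hypothetical, and the model advances the counter by one tick after every register write, which is not a statement about actual PPC405 timing.

```python
class TimeBase:
    """64-bit counter exposed as two 32-bit halves (TBU:TBL)."""

    def __init__(self, tbu=0, tbl=0):
        self.value = (tbu << 32) | tbl

    @property
    def tbu(self):
        return self.value >> 32

    @property
    def tbl(self):
        return self.value & 0xFFFFFFFF

    def write_tbu(self, upper):
        self.value = (upper << 32) | self.tbl

    def write_tbl(self, lower):
        self.value = (self.tbu << 32) | lower

    def tick(self, count=1):
        self.value = (self.value + count) & 0xFFFFFFFFFFFFFFFF


def naive_set(tb, upper, lower):
    """Write TBU, then TBL; a carry out of the old TBL can corrupt TBU."""
    tb.write_tbu(upper)
    tb.tick()
    tb.write_tbl(lower)
    tb.tick()


def safe_set(tb, upper, lower):
    """Clear TBL first so no carry can reach TBU before TBL is rewritten."""
    tb.write_tbl(0)
    tb.tick()
    tb.write_tbu(upper)
    tb.tick()
    tb.write_tbl(lower)
    tb.tick()


if __name__ == "__main__":
    hazard = TimeBase(tbl=0xFFFFFFFF)
    naive_set(hazard, 5, 0x100)
    print("naive TBU:", hazard.tbu)

    fixed = TimeBase(tbl=0xFFFFFFFF)
    safe_set(fixed, 5, 0x100)
    print("safe TBU:", fixed.tbu)
```

In the naive ordering the old lower half overflows after the upper half has been written, so the upper half ends one count too high; clearing the lower half first removes that window.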


Chapter 4. Cache Operations

The PPC405 core incorporates two internal cache units, an instruction cache unit (ICU) and a data cache unit (DCU). Instructions and data can be accessed in the caches much faster than in main memory, if instruction and data cache arrays are implemented. The PPC405B3 core has a 16KB instruction cache array and an 8KB data cache array.

The ICU controls instruction accesses to main memory and, if an instruction cache array is implemented, stores frequently used instructions to reduce the overhead of instruction transfers between the instruction pipeline and external memory. Using the instruction cache minimizes access latency for frequently executed instructions.

The DCU controls data accesses to main memory and, if a data cache array is implemented, stores frequently used data to reduce the overhead of data transfers between the GPRs and external memory. Using the data cache minimizes access latency for frequently used data.

The ICU features:

• Programmable address pipelining and prefetching for cache misses and non-cachable lines

• Support for non-cachable hits from lines contained in the line ﬁll buffer

• Programmable non-cachable requests to memory as 4 or 8 words (or half line or line)

• Bypass path for critical words

• Non-blocking cache for hits during ﬁlls

• Flash invalidate (one instruction invalidates entire cache)

• Programmable allocation for fetch ﬁlls, enabling program control of cache contents using the icbt instruction

• Virtually indexed, physically tagged cache arrays

• A rich set of cache control instructions

The DCU features:

• Address pipelining for line ﬁlls

• Support for load hits from non-cachable and non-allocated lines contained in the line ﬁll buffer

• Bypass path for critical words

• Non-blocking cache for hits during ﬁlls

• Write-back and write-through write strategies controlled by storage attributes

• Programmable non-cachable load requests to memory as lines or words

• Handling of up to two pending line flushes

• Holding of up to three stores before stalling the core pipeline

• Physically indexed, physically tagged cache arrays

• A rich set of cache control instructions


The PPC405 core can include an instruction cache array and a data cache array. The size of the cache arrays can vary by core implementation, as shown in Table 4-1.

Table 4-1. Available Cache Array Sizes

ICU Cache Array Size   DCU Cache Array Size
0KB                    0KB
4KB                    4KB
8KB                    8KB
16KB                   16KB
32KB                   32KB

Programming Note: If the ICU cache array or the DCU cache array is not present (0KB), the I (cachability) storage attribute must be turned off for instruction-side or data-side memory, respectively.

“ICU and DCU Organization and Sizes” describes the organization and sizes of the ICU and the DCU. “ICU Overview” on page 4-3 and “DCU Overview” on page 4-6 provide overviews of the ICU and DCU.

4.1 ICU and DCU Organization and Sizes

The ICU and DCU contain control logic and, in some implementations, cache arrays. The control logic, which handles data transfers between the cache units, main memory, and the RISC core, differs significantly between the ICU and DCU. The ICU and DCU cache arrays, which (when implemented) store instructions and data from main memory, respectively, are almost identical. (The DCU array adds a “dirty” bit to mark modified lines.)

The ICU and DCU cache arrays are two-way set-associative. In both cache units, a cache line can be in one of two locations in the cache array. The two locations are members of a set of locations. Each set is divided into two ways, way A and way B; a cache line can be located in either way. Each way is organized as n lines of eight words each, where n is the cache size, in kilobytes, multiplied by 16. For example, each way of a 4KB cache array contains 64 lines.

Cache lines are addressed using a tag field and an index. The tag fields are also two-way set-associative. As shown in Table 4-2, the tag fields in ways A and B store address bits A0:21 for each cache line. The remaining address bits (through A27) serve as an index to the cache array. The two cache lines that correspond with the same line index are called a congruence class.

Table 4-2. ICU and DCU Cache Array Organization

Tags (Two-way Set)                  Cache Lines (Two-way Set)
Way A             Way B             Way A         Way B
Tag, Line 0       Tag, Line 0       Line 0        Line 0
Tag, Line 1       Tag, Line 1       Line 1        Line 1
•                 •                 •             •
Tag, Line n–2     Tag, Line n–2     Line n–2      Line n–2
Tag, Line n–1     Tag, Line n–1     Line n–1      Line n–1

Table 4-3 shows the tag field bits and the number of lines, n, for various cache array sizes.

Table 4-3. Cache Sizes, Tag Fields, and Lines

              Instruction Cache Array        Data Cache Array
Array Size    Tag Field Bits    Lines        Tag Field Bits    Lines
0KB           —                 —            —                 —
4KB           22 (0:21)         64           20 (0:19)         64
8KB           22 (0:21)         128          20 (0:19)         128
16KB          22 (0:21)         256          20 (0:19)         256
32KB          22 (0:21)         512          20 (0:19)         512
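The relationship between cache array size and the number of lines per way can be checked with a little arithmetic. The following is a minimal C sketch, not part of the core documentation; the function names are hypothetical, and it assumes 4-byte words, which gives 32-byte lines.

```c
#include <assert.h>
#include <stdint.h>

enum { WAYS = 2, WORDS_PER_LINE = 8, BYTES_PER_WORD = 4 };

/* Lines per way (n) for a cache array of size_kb kilobytes: n = size_kb * 16. */
uint32_t lines_per_way(uint32_t size_kb)
{
    return size_kb * 16u;
}

/* Total bytes held by the array: two ways of n lines, eight 4-byte words per line. */
uint32_t array_bytes(uint32_t size_kb)
{
    return WAYS * lines_per_way(size_kb) * WORDS_PER_LINE * BYTES_PER_WORD;
}
```

For a 4KB array, lines_per_way(4) yields 64, and two ways of 64 lines at 32 bytes each account for the full 4096 bytes.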

When the ICU or DCU requests a cache line from main memory (an operation called a cache line fill), a least-recently-used (LRU) policy determines which way will receive the requested line. The index, determined by the instruction or data address, selects a congruence class. Within a congruence class, the most recently accessed line (in either way A or way B) is retained and the LRU bit in the associated tag array marks the other line as LRU. The LRU line then receives the requested instruction or data words. After the cache line fill, the LRU bit is set to identify as LRU the line opposite the line just filled.
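The replacement policy described above can be modeled in software. The following C sketch is illustrative only: the structure and function names are hypothetical, and it models a single congruence class with one LRU indicator. It does not model any preference the hardware may give to invalid lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { WAY_A = 0, WAY_B = 1 };

/* Hypothetical model of one congruence class: two tags plus one LRU indicator. */
struct congruence_class {
    uint32_t tag[2];
    bool     valid[2];
    int      lru_way;   /* way currently marked least recently used */
};

/* Record an access to `way`: the opposite way becomes LRU. */
void touch(struct congruence_class *cc, int way)
{
    assert(way == WAY_A || way == WAY_B);
    cc->lru_way = (way == WAY_A) ? WAY_B : WAY_A;
}

/* Look up `tag`. On a miss, the line fill goes to the LRU way.
   Returns the way that holds the line afterward; *hit reports hit or miss. */
int access_line(struct congruence_class *cc, uint32_t tag, bool *hit)
{
    for (int way = WAY_A; way <= WAY_B; way++) {
        if (cc->valid[way] && cc->tag[way] == tag) {
            *hit = true;
            touch(cc, way);
            return way;
        }
    }
    int victim = cc->lru_way;       /* the LRU line receives the fill */
    cc->tag[victim] = tag;
    cc->valid[victim] = true;
    *hit = false;
    touch(cc, victim);              /* the line opposite the fill is now LRU */
    return victim;
}
```

Because a two-way class needs to remember only which way was used less recently, a single bit per congruence class is enough to implement true LRU.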

4.2 ICU Overview

The ICU manages instruction transfers between external cachable memory and the instruction queue in the execution unit.


Figure 4-1 shows the relationships between the ICU and the instruction pipeline.

Figure 4-1. Instruction Flow (diagram not reproduced; labeled elements: Addresses from Fetcher, Tag Arrays, Instruction Arrays, Bypass Path, Instruction Queue, PFB1, PFB0, Decode, Execute)

4.2.1 ICU Operations

Instructions from cachable memory regions are copied into the instruction cache array, if an array is present. The fetcher can access instructions much more quickly from a cache array than from memory. Cache lines can be loaded target-word-first, sequentially, or in any other order. Target-word-first fills start at the requested word, continue to the end of the line, and then wrap to fill the remaining words at the beginning of the line. Sequential fills start at the first word of the cache line and proceed sequentially to the last word of the line.
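The two named fill orders can be expressed as word sequences within an eight-word line. The following C sketch is illustrative only; the function names are hypothetical, and words are numbered 0 through 7 from the start of the line.

```c
#include <assert.h>

enum { WORDS_PER_LINE = 8 };

/* Target-word-first: start at the requested word, run to the end of the
   line, then wrap to the words at the beginning of the line. */
void target_word_first_order(unsigned target, unsigned order[WORDS_PER_LINE])
{
    assert(target < WORDS_PER_LINE);
    for (unsigned i = 0; i < WORDS_PER_LINE; i++)
        order[i] = (target + i) % WORDS_PER_LINE;
}

/* Sequential: start at the first word and proceed to the last. */
void sequential_order(unsigned order[WORDS_PER_LINE])
{
    for (unsigned i = 0; i < WORDS_PER_LINE; i++)
        order[i] = i;
}
```

For a request that targets word 5, the target-word-first order is 5, 6, 7, 0, 1, 2, 3, 4.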

The bypass path handles instructions in cache-inhibited memory and improves performance during line fill operations. If a request from the fetcher obtains an entire line from memory, the queue does not have to wait for the entire line to reach the cache. The target word (the word requested by the fetcher) is sent on the bypass path to the queue while the line fill proceeds, even if the selected line fill order is not target-word-first.

Cache line fills always run to completion, even if the instruction stream branches away from the rest of the line. As requested instructions are received, they go to the fetcher from the fill register before the line fills in the cache. The filled line is always placed in the ICU unless an external memory subsystem error occurs during the fill, in which case the line is not written to the cache. During a clock cycle, the ICU can send two instructions to the fetcher.


4.2.2 Instruction Cachability Control

When instruction address translation is enabled (MSR[IR] = 1), instruction cachability is controlled by the I storage attribute in the translation lookaside buffer (TLB) entry for the memory page. If TLB_entry[I] = 1, caching is inhibited; otherwise caching is enabled. Cachability is controlled separately for each page, which can range in size from 1KB to 16MB. “Translation Lookaside Buffer (TLB)” on page 7-2 describes the TLB.

When instruction address translation is disabled (MSR[IR] = 0), instruction cachability is controlled by the Instruction Cache Cachability Register (ICCR). Each field in the ICCR (ICCR[S0:S31]) controls the cachability of a 128MB region (see “Real-Mode Storage Attribute Control” on page 7-17). If ICCR[Sn] = 1, caching is enabled for the specified region; otherwise, caching is inhibited.
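The mapping from a real address to an ICCR field can be sketched in C. The sketch is illustrative only and the function names are hypothetical. It assumes that field Sn covers the nth 128MB region counting up from address 0, and that S0 is the most significant bit of the 32-bit register (PowerPC bit numbering); confirm both against “Real-Mode Storage Attribute Control” before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index n (0..31) of the 128MB region containing addr: 128MB = 2^27 bytes. */
unsigned iccr_region(uint32_t addr)
{
    return addr >> 27;
}

/* Mask for field Sn. Assumption: S0 is the most significant register bit. */
uint32_t iccr_field_mask(unsigned n)
{
    assert(n < 32);
    return UINT32_C(0x80000000) >> n;
}

/* True if a real-mode (MSR[IR] = 0) fetch from addr is cachable
   under the given ICCR value: ICCR[Sn] = 1 enables caching. */
bool real_mode_fetch_cachable(uint32_t iccr, uint32_t addr)
{
    return (iccr & iccr_field_mask(iccr_region(addr))) != 0;
}
```

Under these assumptions, an ICCR value of 0x80000000 makes only the lowest 128MB region cachable, and the reset value of 0 makes no region cachable.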

The performance of the PPC405 core is significantly lower while fetching instructions from cache-inhibited regions.

Following system reset, address translation is disabled and all ICCR bits are reset to 0 so that no memory regions are cachable. Before regions can be designated as cachable, the ICU cache array must be invalidated, if an array is present. The iccci instruction must execute before the cache is enabled. Address translation can then be enabled, if required, and the TLB or the ICCR can then be configured for the required cachability.

4.2.3 Instruction Cache Synonyms

The following information applies only if instruction address translation is enabled (MSR[IR] = 1) and 1KB or 4KB page sizes are used. See Chapter 7, “Memory Management,” for information about address translation and page sizes.

An instruction cache synonym occurs when the instruction cache array contains multiple cache lines from the same real address. Such synonyms result from combinations of:

• Cache array size

• Cache associativity

• Page size

• The use of effective addresses (EAs) to index the cache array

For example, the instruction cache array has a "way size" of 8KB (16KB array/2 ways). Thus, 11 bits (EA19:29) are needed to select a word (instruction) in each way. For the minimum page size of 1KB, the low order 8 bits (EA22:29) address a word in a page. The high order address bits (EA