IBM PowerPC 750GX and 750GL RISC Microprocessor

User’s Manual

Version 1.2

March 27, 2006

© Copyright International Business Machines Corporation 2004, 2006

All Rights Reserved

Printed in the United States of America March 2006.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both:

IBM
POWER
PowerPC 750
IBM Logo
PowerPC
PowerPC Architecture
PowerPC Logo

IEEE is a registered trademark in the United States, owned by the Institute of Electrical and Electronics Engineers.

Other company, product, and service names may be trademarks or service marks of others.

All information contained in this document is subject to change without notice. The products described in this document are NOT intended for use in applications such as implantation, life support, or other hazardous uses where malfunction could result in death, bodily injury, or catastrophic property damage. The information contained in this document does not affect or change IBM product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of IBM or third parties. All information contained in this document was obtained in specific environments, and is presented as an illustration. The results obtained in other operating environments may vary.

THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN “AS IS” BASIS. In no event will IBM be liable for damages arising directly or indirectly from any use of the information contained in this document.

IBM Microelectronics Division

2070 Route 52, Bldg. 330

Hopewell Junction, NY 12533-6351

The IBM home page can be found at ibm.com

The IBM Microelectronics Division home page can be found at ibm.com/chips

Contents

List of Figures
List of Tables
About This Manual
Who Should Read This Manual
Related Publications
Conventions Used in This Manual
Using This Manual with the Programming Environments Manual
1. PowerPC 750GX Overview
1.1 750GX Microprocessor Overview
1.2 750GX Microprocessor Features
1.2.1 Instruction Flow
1.2.1.1 Instruction Queue and Dispatch Unit
1.2.1.2 Branch Processing Unit (BPU)
1.2.1.3 Completion Unit
1.2.2 Independent Execution Units
1.2.2.1 Integer Units (IUs)
1.2.2.2 Floating-Point Unit (FPU)
1.2.2.3 Load/Store Unit (LSU)
1.2.2.4 System Register Unit (SRU)
1.2.3 Memory Management Units (MMUs)
1.2.4 On-Chip Level 1 Instruction and Data Caches
1.2.5 On-Chip Level 2 Cache Implementation
1.2.6 System Interface/Bus Interface Unit (BIU)
1.2.7 Signals
1.2.8 Signal Configuration
1.2.9 Clocking
1.3 750GX Microprocessor Implementation
1.4 PowerPC Registers and Programming Model
1.5 Instruction Set
1.5.1 PowerPC Instruction Set
1.5.2 750GX Microprocessor Instruction Set
1.6 On-Chip Cache Implementation
1.6.1 PowerPC Cache Model
1.6.2 750GX Microprocessor Cache Implementation
1.7 Exception Model
1.7.1 PowerPC Exception Model
1.7.2 750GX Microprocessor Exception Implementation
1.8 Memory Management
1.8.1 PowerPC Memory-Management Model
1.8.2 750GX Microprocessor Memory-Management Implementation
1.9 Instruction Timing
1.10 Power Management
1.11 Thermal Management
1.12 Performance Monitor

2. Programming Model
2.1 PowerPC 750GX Processor Register Set
2.1.1 Register Set
2.1.2 PowerPC 750GX-Specific Registers
2.1.2.1 Instruction Address Breakpoint Register (IABR)
2.1.2.2 Hardware-Implementation-Dependent Register 0 (HID0)
2.1.2.3 Hardware-Implementation-Dependent Register 1 (HID1)
2.1.2.4 Hardware-Implementation-Dependent Register 2 (HID2)
2.1.2.5 Performance-Monitor Registers
2.1.3 Instruction Cache Throttling Control Register (ICTC)
2.1.4 Thermal-Management Registers (THRMn)
2.1.4.1 Thermal-Management Registers 1–2 (THRM1–THRM2)
2.1.4.2 Thermal-Management Register 3 (THRM3)
2.1.4.3 Thermal-Management Register 4 (THRM4)
2.1.5 L2 Cache Control Register (L2CR)
2.2 Operand Conventions
2.2.1 Data Organization in Memory and Data Transfers
2.2.2 Alignment and Misaligned Accesses
2.2.3 Floating-Point Operand and Execution Models—UISA
2.2.3.1 Denormalized Number Support
2.2.3.2 Non-IEEE Mode (Nondenormalized Mode)
2.2.3.3 Time-Critical Floating-Point Operation
2.2.3.4 Floating-Point Storage Access Alignment
2.2.3.5 Optional Floating-Point Graphics Instructions
2.3 Instruction Set Summary
2.3.1 Classes of Instructions
2.3.1.1 Definition of Boundedly Undefined
2.3.1.2 Defined Instruction Class
2.3.1.3 Illegal Instruction Class
2.3.1.4 Reserved Instruction Class
2.3.2 Addressing Modes
2.3.2.1 Memory Addressing
2.3.2.2 Memory Operands
2.3.2.3 Effective Address Calculation
2.3.2.4 Synchronization
2.3.3 Instruction Set Overview
2.3.4 PowerPC UISA Instructions
2.3.4.1 Integer Instructions
2.3.4.2 Floating-Point Instructions
2.3.4.3 Load-and-Store Instructions
2.3.4.4 Branch and Flow-Control Instructions
2.3.4.5 System Linkage Instruction—UISA
2.3.4.6 Processor Control Instructions—UISA
2.3.4.7 Memory Synchronization Instructions—UISA
2.3.5 PowerPC VEA Instructions
2.3.5.1 Processor Control Instructions—VEA
2.3.5.2 Memory Synchronization Instructions—VEA
2.3.5.3 Memory Control Instructions—VEA
2.3.5.4 Optional External Control Instructions
2.3.6 PowerPC OEA Instructions

 

2.3.6.1 System Linkage Instructions—OEA
2.3.6.2 Processor Control Instructions—OEA
2.3.6.3 Memory Control Instructions—OEA
2.3.7 Recommended Simplified Mnemonics
3. Instruction-Cache and Data-Cache Operation
3.1 Data-Cache Organization
3.2 Instruction-Cache Organization
3.3 Memory and Cache Coherency
3.3.1 Memory/Cache Access Attributes (WIMG Bits)
3.3.2 MEI Protocol
3.3.2.1 MEI Hardware Considerations
3.3.3 Coherency Precautions in Single-Processor Systems
3.3.4 Coherency Precautions in Multiprocessor Systems
3.3.5 PowerPC 750GX-Initiated Load/Store Operations
3.3.5.1 Performed Loads and Stores
3.3.5.2 Sequential Consistency of Memory Accesses
3.3.5.3 Atomic Memory References
3.4 Cache Control
3.4.1 Cache-Control Parameters in HID0
3.4.1.1 Data-Cache Flash Invalidation
3.4.1.2 Enabling and Disabling the Data Cache
3.4.1.3 Locking the Data Cache
3.4.1.4 Instruction-Cache Flash Invalidation
3.4.1.5 Enabling and Disabling the Instruction Cache
3.4.1.6 Locking the Instruction Cache
3.4.2 Cache-Control Instructions
3.4.2.1 Data Cache Block Touch (dcbt) and Data Cache Block Touch for Store (dcbtst)
3.4.2.2 Data Cache Block Zero (dcbz)
3.4.2.3 Data Cache Block Store (dcbst)
3.4.2.4 Data Cache Block Flush (dcbf)
3.4.2.5 Data Cache Block Invalidate (dcbi)
3.4.2.6 Instruction Cache Block Invalidate (icbi)
3.5 Cache Operations
3.5.1 Cache-Block-Replacement/Castout Operations
3.5.2 Cache Flush Operations
3.5.3 Data-Cache Block-Fill Operations
3.5.4 Instruction-Cache Block-Fill Operations
3.5.5 Data-Cache Block-Push Operations
3.6 L1 Caches and 60x Bus Transactions
3.6.1 Read Operations and the MEI Protocol
3.6.2 Bus Operations Caused by Cache-Control Instructions
3.6.3 Snooping
3.6.4 Snoop Response to 60x Bus Transactions
3.6.5 Transfer Attributes
3.7 MEI State Transactions
4. Exceptions
4.1 PowerPC 750GX Microprocessor Exceptions

4.2 Exception Recognition and Priorities
4.3 Exception Processing
4.3.1 Machine Status Save/Restore Register 0 (SRR0)
4.3.2 Machine Status Save/Restore Register 1 (SRR1)
4.3.3 Machine State Register (MSR)
4.3.4 Enabling and Disabling Exceptions
4.3.5 Steps for Exception Processing
4.3.6 Setting MSR[RI]
4.3.7 Returning from an Exception Handler
4.4 Process Switching
4.5 Exception Definitions
4.5.1 System Reset Exception (0x00100)
4.5.1.1 Soft Reset
4.5.1.2 Hard Reset
4.5.2 Machine-Check Exception (0x00200)
4.5.2.1 Machine-Check Exception Enabled (MSR[ME] = 1)
4.5.2.2 Checkstop State (MSR[ME] = 0)
4.5.3 DSI Exception (0x00300)
4.5.4 ISI Exception (0x00400)
4.5.5 External Interrupt Exception (0x00500)
4.5.6 Alignment Exception (0x00600)
4.5.7 Program Exception (0x00700)
4.5.8 Floating-Point Unavailable Exception (0x00800)
4.5.9 Decrementer Exception (0x00900)
4.5.10 System Call Exception (0x00C00)
4.5.11 Trace Exception (0x00D00)
4.5.12 Floating-Point Assist Exception (0x00E00)
4.5.13 Performance-Monitor Interrupt (0x00F00)
4.5.14 Instruction Address Breakpoint Exception (0x01300)
4.5.15 System Management Interrupt (0x01400)
4.5.16 Thermal-Management Interrupt Exception (0x01700)
4.5.17 Data Address Breakpoint Exception
4.5.17.1 Data Address Breakpoint Register (DABR)
4.5.18 Soft Stops
4.5.19 Exception Latencies
4.5.20 Summary of Front-End Exception Handling
4.5.21 Timer Facilities
4.5.22 External Access Instructions
5. Memory Management
5.1 MMU Overview
5.1.1 Memory Addressing
5.1.2 MMU Organization
5.1.3 Address-Translation Mechanisms
5.1.4 Memory-Protection Facilities
5.1.5 Page History Information
5.1.6 General Flow of MMU Address Translation
5.1.6.1 Real-Addressing Mode and Block-Address-Translation Selection
5.1.6.2 Page-Address-Translation Selection
5.1.7 MMU Exceptions Summary

 

5.1.8 MMU Instructions and Register Summary
5.2 Real-Addressing Mode
5.3 Block-Address Translation
5.4 Memory Segment Model
5.4.1 Page History Recording
5.4.1.1 Referenced Bit
5.4.1.2 Changed Bit
5.4.1.3 Scenarios for Referenced and Changed Bit Recording
5.4.2 Page Memory Protection
5.4.3 TLB Description
5.4.3.1 TLB Organization
5.4.3.2 TLB Invalidation
5.4.4 Page-Address-Translation Summary
5.4.5 Page Table-Search Operation
5.4.6 Page Table Updates
5.4.7 Segment Register Updates
6. Instruction Timing
6.1 Terminology and Conventions
6.2 Instruction Timing Overview
6.3 Timing Considerations
6.3.1 General Instruction Flow
6.3.2 Instruction Fetch Timing
6.3.2.1 Cache Arbitration
6.3.2.2 Cache Hit
6.3.2.3 Cache Miss
6.3.2.4 L2 Cache Access Timing Considerations
6.3.2.5 Instruction Dispatch and Completion Considerations
6.3.2.6 Rename Register Operation
6.3.2.7 Instruction Serialization
6.4 Execution-Unit Timings
6.4.1 Branch Processing Unit Execution Timing
6.4.1.1 Branch Folding
6.4.1.2 Branch Instructions and Completion
6.4.1.3 Branch Prediction and Resolution
6.4.2 Integer Unit Execution Timing
6.4.3 Floating-Point Unit Execution Timing
6.4.4 Effect of Floating-Point Exceptions on Performance
6.4.5 Load/Store Unit Execution Timing
6.4.6 Effect of Operand Placement on Performance
6.4.7 Integer Store Gathering
6.4.8 System Register Unit Execution Timing
6.5 Memory Performance Considerations
6.5.1 Caching and Memory Coherency
6.5.2 Effect of TLB Miss
6.6 Instruction Scheduling Guidelines
6.6.1 Branch, Dispatch, and Completion-Unit Resource Requirements
6.6.1.1 Branch-Resolution Resource Requirements
6.6.1.2 Dispatch-Unit Resource Requirements

237

6.6.1.3 Completion-Unit Resource Requirements ..... 237
6.7 Instruction Latency Summary ..... 238
7. Signal Descriptions ..... 249
7.1 Signal Configuration ..... 250
7.2 Signal Descriptions ..... 251
7.2.1 Address-Bus Arbitration Signals ..... 251
7.2.1.1 Bus Request (BR)—Output ..... 251
7.2.1.2 Bus Grant (BG)—Input ..... 252
7.2.1.3 Address Bus Busy (ABB) ..... 252
7.2.2 Address Transfer Start Signals ..... 253
7.2.2.1 Transfer Start (TS) ..... 253
7.2.3 Address Transfer Signals ..... 254
7.2.3.1 Address Bus (A[0–31]) ..... 254
7.2.3.2 Address-Bus Parity (AP[0–3]) ..... 255
7.2.4 Address Transfer Attribute Signals ..... 255
7.2.4.1 Transfer Type (TT[0–4]) ..... 256
7.2.4.2 Transfer Size (TSIZ[0–2])—Output ..... 258
7.2.4.3 Transfer Burst (TBST) ..... 259
7.2.4.4 Cache Inhibit (CI)—Output ..... 260
7.2.4.5 Write-Through (WT)—Output ..... 260
7.2.4.6 Global (GBL) ..... 261
7.2.5 Address Transfer Termination Signals ..... 262
7.2.5.1 Address Acknowledge (AACK)—Input ..... 262
7.2.5.2 Address Retry (ARTRY) ..... 263
7.2.6 Data-Bus Arbitration Signals ..... 264
7.2.6.1 Data-Bus Grant (DBG)—Input ..... 264
7.2.6.2 Data-Bus Write-Only (DBWO) ..... 265
7.2.6.3 Data Bus Busy (DBB) ..... 265
7.2.7 Data-Transfer Signals ..... 266
7.2.7.1 Data Bus (DH[0–31], DL[0–31]) ..... 266
7.2.7.2 Data-Bus Parity (DP[0–7]) ..... 267
7.2.7.3 Data Bus Disable (DBDIS)—Input ..... 268
7.2.8 Data-Transfer Termination Signals ..... 268
7.2.8.1 Transfer Acknowledge (TA)—Input ..... 268
7.2.8.2 Data Retry (DRTRY)—Input ..... 269
7.2.8.3 Transfer Error Acknowledge (TEA)—Input ..... 269
7.2.9 System Status Signals ..... 270
7.2.9.1 Interrupt (INT)—Input ..... 270
7.2.9.2 System Management Interrupt (SMI)—Input ..... 270
7.2.9.3 Machine-Check Interrupt (MCP)—Input ..... 271
7.2.9.4 Checkstop Input (CKSTP_IN)—Input ..... 271
7.2.9.5 Checkstop Output (CKSTP_OUT)—Output ..... 271
7.2.10 Reset Signals ..... 272
7.2.10.1 Hard Reset (HRESET)—Input ..... 272
7.2.10.2 Soft Reset (SRESET)—Input ..... 272
7.2.11 Processor Status Signals ..... 273
7.2.11.1 Quiescent Request (QREQ)—Output ..... 273
7.2.11.2 Quiescent Acknowledge (QACK)—Input ..... 273
7.2.11.3 Reservation (RSRV)—Output ..... 273

7.2.11.4 Time Base Enable (TBEN)—Input ..... 274
7.2.11.5 TLB Invalidate Synchronize (TLBISYNC)—Input ..... 274
7.2.12 Processor Mode Selection Signals ..... 274
7.2.13 I/O Voltage Select Signals ..... 275
7.2.14 Test Interface Signals ..... 275
7.2.14.1 IEEE 1149.1a-1993 Interface Description ..... 275
7.2.14.2 LSSD_MODE ..... 275
7.2.14.3 L1_TSTCLK ..... 276
7.2.14.4 L2_TSTCLK ..... 276
7.2.14.5 BVSEL ..... 276
7.2.15 Clock Signals ..... 276
7.2.15.1 System Clock (SYSCLK)—Input ..... 277
7.2.15.2 Clock Out (CLK_OUT)—Output ..... 277
7.2.15.3 PLL Configuration (PLL_CFG[0:4])—Input ..... 277
7.2.15.4 PLL Range (PLL_RNG[0:1])—Input ..... 278
7.2.16 Power and Ground Signals ..... 278
8. Bus Interface Operation ..... 279
8.1 Bus Interface Overview ..... 280
8.1.1 Operation of the Instruction and Data L1 Caches ..... 281
8.1.2 Operation of the Bus Interface ..... 282
8.1.3 Bus Signal Clocking ..... 282
8.1.4 Optional 32-Bit Data Bus Mode ..... 282
8.1.5 Direct-Store Accesses ..... 283
8.2 Memory-Access Protocol ..... 284
8.2.1 Arbitration Signals ..... 285
8.2.2 Miss-under-Miss ..... 286
8.2.2.1 Miss-under-Miss and System Performance ..... 287
8.2.2.2 Speculative Loads and Conditional Branches ..... 290
8.3 Address-Bus Tenure ..... 290
8.3.1 Address-Bus Arbitration ..... 290
8.3.2 Address Transfer ..... 292
8.3.2.1 Address-Bus Parity ..... 294
8.3.2.2 Address Transfer Attribute Signals ..... 294
8.3.2.3 Burst Ordering During Data Transfers ..... 295
8.3.2.4 Effect of Alignment in Data Transfers ..... 296
8.3.2.5 Alignment of External Control Instructions ..... 300
8.3.3 Address Transfer Termination ..... 300
8.4 Data-Bus Tenure ..... 301
8.4.1 Data-Bus Arbitration ..... 301
8.4.1.1 Using the DBB Signal ..... 302
8.4.2 Data-Bus Write-Only ..... 303
8.4.3 Data Transfer ..... 303
8.4.4 Data-Transfer Termination ..... 303
8.4.4.1 Normal Single-Beat Termination ..... 304
8.4.4.2 Data-Transfer Termination Due to a Bus Error ..... 307
8.4.5 Memory Coherency—MEI Protocol ..... 308
8.5 Timing Examples ..... 309
8.6 Optional Bus Configuration ..... 316
8.6.1 32-Bit Data Bus Mode ..... 316

8.6.2 No-DRTRY Mode ..... 318
8.7 Processor State Signals ..... 319
8.7.1 Support for the lwarx and stwcx. Instruction Pair ..... 319
8.7.2 TLBISYNC Input ..... 319
8.8 IEEE 1149.1a-1993 Compliant Interface ..... 319
8.8.1 JTAG/COP Interface ..... 319
8.9 Using Data-Bus Write-Only ..... 320
9. L2 Cache ..... 323
9.1 L2 Cache Overview ..... 323
9.2 L2 Cache Operation ..... 323
9.3 L2 Cache Control Register (L2CR) ..... 329
9.4 L2 Cache Initialization ..... 329
9.5 L2 Cache Global Invalidation ..... 329
9.6 L2 Cache Used as On-Chip Memory ..... 330
9.6.1 Locking the L2 Cache ..... 330
9.6.1.1 Loading the Locked L2 Cache ..... 331
9.6.1.2 Locked Cache Operation ..... 331
9.7 Data-Only and Instruction-Only Modes ..... 332
9.8 L2 Cache Test Features and Methods ..... 332
9.8.1 L2CR Support for L2 Cache Testing ..... 332
9.8.2 L2 Cache Testing ..... 333
9.9 L2 Cache Timing ..... 333
10. Power and Thermal Management ..... 335
10.1 Dynamic Power Management ..... 335
10.2 Programmable Power Modes ..... 335
10.2.1 Power Management Modes ..... 337
10.2.1.1 Full On Mode ..... 337
10.2.1.2 Doze Mode ..... 337
10.2.1.3 Nap Mode ..... 337
10.2.1.4 Sleep Mode ..... 339
10.2.1.5 Dynamic Power Reduction ..... 339
10.2.2 Power Management Software Considerations ..... 340
10.3 750GX Dual PLL Feature ..... 340
10.3.1 Overview ..... 340
10.3.2 Configuration Restriction on Frequency Transitions ..... 341
10.3.3 Dual PLL Implementation ..... 342
10.4 Thermal Assist Unit ..... 343
10.4.1 Thermal Assist Unit Overview ..... 343
10.4.2 Thermal Assist Unit Operation ..... 344
10.4.2.1 TAU Single-Threshold Mode ..... 345
10.4.2.2 TAU Dual-Threshold Mode ..... 346
10.4.2.3 750GX Junction Temperature Determination ..... 346
10.4.2.4 Power Saving Modes and TAU Operation ..... 347
10.5 Instruction-Cache Throttling ..... 347
11. Performance Monitor and System Related Features ..... 349

11.1 Performance-Monitor Interrupt ..... 349
11.2 Special-Purpose Registers Used by Performance Monitor ..... 350
11.2.1 Performance-Monitor Registers ..... 351
11.2.1.1 Monitor Mode Control Register 0 (MMCR0) ..... 351
11.2.1.2 User Monitor Mode Control Register 0 (UMMCR0) ..... 351
11.2.1.3 Monitor Mode Control Register 1 (MMCR1) ..... 351
11.2.1.4 User Monitor Mode Control Register 1 (UMMCR1) ..... 351
11.2.1.5 Performance-Monitor Counter Registers (PMCn) ..... 351
11.2.1.6 User Performance-Monitor Counter Registers (UPMC1–UPMC4) ..... 354
11.2.1.7 Sampled Instruction Address Register (SIA) ..... 355
11.2.1.8 User Sampled Instruction Address Register (USIA) ..... 355
11.3 Event Counting ..... 355
11.4 Event Selection ..... 356
11.5 Notes ..... 356
11.6 Debug Support ..... 357
11.6.1 Overview ..... 357
11.6.2 Data-Address Breakpoint ..... 357
11.7 JTAG/COP Functions ..... 357
11.7.1 Introduction ..... 357
11.7.2 Processor Resources Available through JTAG/COP Serial Interface ..... 357
11.8 Resets ..... 359
11.8.1 Hard Reset ..... 359
11.8.2 Soft Reset ..... 359
11.8.3 Reset Sequence ..... 360
11.9 Checkstops ..... 361
11.9.1 Checkstop Sources ..... 361
11.9.2 Checkstop Control Bits ..... 361
11.9.3 Open-Collector-Driver States during Checkstop ..... 362
11.9.4 Vacancy Slot Application ..... 362
11.10 750GX Parity ..... 363
11.10.1 Parity Control and Status ..... 364
11.10.2 Enabling Parity Error Detection ..... 364
11.10.3 Parity Errors ..... 364
Acronyms and Abbreviations ..... 365
Index ..... 369
Revision Log ..... 377

List of Figures

 

Figure 1-1. 750GX Microprocessor Block Diagram ..... 25
Figure 1-2. L1 Cache Organization ..... 34
Figure 1-3. System Interface ..... 37
Figure 1-4. 750GX Microprocessor Signal Groups ..... 39
Figure 1-5. Pipeline Diagram ..... 53
Figure 2-1. PowerPC 750GX Microprocessor Programming Model—Registers ..... 58
Figure 3-1. Cache Integration ..... 122
Figure 3-2. Data-Cache Organization ..... 123
Figure 3-3. Instruction-Cache Organization ..... 125
Figure 3-4. MEI Cache-Coherency Protocol—State Diagram (WIM = 001) ..... 128
Figure 3-5. PLRU Replacement Algorithm ..... 137
Figure 3-6. 750GX Cache Addresses ..... 140
Figure 4-1. SRESET Asserted During HRESET ..... 164
Figure 5-1. MMU Conceptual Block Diagram ..... 183
Figure 5-2. PowerPC 750GX Microprocessor IMMU Block Diagram ..... 184
Figure 5-3. 750GX Microprocessor DMMU Block Diagram ..... 185
Figure 5-4. Address-Translation Types ..... 187
Figure 5-5. General Flow of Address Translation (Real-Addressing Mode and Block) ..... 189
Figure 5-6. General Flow of Page and Direct-Store Interface Address Translation ..... 191
Figure 5-7. Segment Register and DTLB Organization ..... 200
Figure 5-8. Page-Address-Translation Flow—TLB Hit ..... 203
Figure 5-9. Primary Page Table Search ..... 205
Figure 5-10. Secondary Page-Table-Search Flow ..... 206
Figure 6-1. Pipelined Execution Unit ..... 212
Figure 6-2. Superscalar/Pipeline Diagram ..... 212
Figure 6-3. PowerPC 750GX Microprocessor Pipeline Stages ..... 214
Figure 6-4. Instruction Flow Diagram ..... 218
Figure 6-5. Instruction Timing—Cache Hit ..... 220
Figure 6-6. Instruction Timing—Cache Miss ..... 223
Figure 6-7. Branch Taken ..... 227
Figure 6-8. Removal of Fall-Through Branch Instruction ..... 227
Figure 6-9. Branch Completion ..... 228
Figure 6-10. Branch Instruction Timing ..... 231
Figure 7-1. 750GX Signal Groups ..... 250
Figure 8-1. Bus Interface Address Buffers ..... 280
Figure 8-2. Timing Diagram Legend ..... 283
Figure 8-3. Overlapping Tenures on the 750GX Bus for a Single-Beat Transfer ..... 284
Figure 8-4. Cache Diagram for Miss-under-Miss Feature ..... 286

Figure 8-5.

First Level Address Pipelining ..............................................................................................

287

Figure 8-6.

Address-Bus Arbitration ........................................................................................................

290

Figure 8-7.

Address-Bus Arbitration Showing Bus Parking ....................................................................

291

Figure 8-8.

Address-Bus Transfer ...........................................................................................................

293

Figure 8-9.

Snooped Address Cycle with ARTRY ..................................................................................

301

Figure 8-10.

Data-Bus Arbitration .............................................................................................................

302

Figure 8-11.

Normal Single-Beat Read Termination .................................................................................

304

Figure 8-12.

Normal Single-Beat Write Termination .................................................................................

305

Figure 8-13.

Normal Burst Transaction .....................................................................................................

305

Figure 8-14.

Termination with DRTRY ......................................................................................................

306

Figure 8-15.

Read Burst with TA Wait States and DRTRY .......................................................................

307

Figure 8-16.

MEI Cache-Coherency Protocol—State Diagram (WIM = 001) ...........................................

309

Figure 8-17.

Fastest Single-Beat Reads ...................................................................................................

310

Figure 8-18.

Fastest Single-Beat Writes ...................................................................................................

311

Figure 8-19.

Single-Beat Reads Showing Data-Delay Controls ...............................................................

312

Figure 8-20.

Single-Beat Writes Showing Data-Delay Controls ................................................................

313

Figure 8-21.

Burst Transfers with Data-Delay Controls ............................................................................

314

Figure 8-22.

Use of Transfer Error Acknowledge (TEA) ...........................................................................

315

Figure 8-23.

32-Bit Data-Bus Transfer (8-Beat Burst) ..............................................................................

317

Figure 8-24.

32-Bit Data-Bus Transfer (2-Beat Burst with DRTRY) ..........................................................

317

Figure 8-25.

IEEE 1149.1a-1993 Compliant Boundary-Scan Interface ....................................................

320

Figure 8-26.

Data-Bus Write-Only Transaction .........................................................................................

320

Figure 9-1.

L2 Cache ..............................................................................................................................

327

Figure 10-1.

750GX Power States ............................................................................................................

336

Figure 10-2.

Dual PLL Block Diagram ......................................................................................................

342

Figure 10-3.

Dual PLL Switching Example, 3X to 4X ................................................................................

343

Figure 10-4.

Thermal Assist Unit Block Diagram ......................................................................................

344

Figure 10-5.

Instruction Cache Throttling Control SPR Diagram ..............................................................

347

Figure 11-1.

750GX IEEE 1149.1a-1993/COP Organization ....................................................................

358

Figure 11-2.

Reset Sequence ...................................................................................................................

360

List of Tables

 

Table 1-1.

Architecture-Defined Registers (Excluding SPRs) .................................................................

42

Table 1-2.

Architecture-Defined SPRs Implemented ..............................................................................

43

Table 1-3.

Implementation-Specific Registers .........................................................................................

44

Table 1-4.

750GX Microprocessor Exception Classifications ..................................................................

49

Table 1-5.

Exceptions and Conditions .....................................................................................................

50

Table 2-1.

Additional MSR Bits ...............................................................................................................

60

Table 2-2.

Additional SRR1 Bits ..............................................................................................................

62

Table 2-3.

Valid THRM1/THRM2 Bit Settings .........................................................................................

79

Table 2-4.

Memory Operands .................................................................................................................

82

Table 2-5.

Floating-Point Operand Data-Type Behavior .........................................................................

84

Table 2-6.

Floating-Point Result Data-Type Behavior .............................................................................

85

Table 2-7.

Integer Arithmetic Instructions ................................................................................................

92

Table 2-8.

Integer Compare Instructions .................................................................................................

93

Table 2-9.

Integer Logical Instructions ....................................................................................................

94

Table 2-10.

Integer Rotate Instructions .....................................................................................................

95

Table 2-11.

Integer Shift Instructions ........................................................................................................

95

Table 2-12.

Floating-Point Arithmetic Instructions .....................................................................................

96

Table 2-13.

Floating-Point Multiply/Add Instructions .................................................................................

96

Table 2-14.

Floating-Point Rounding and Conversion Instructions ...........................................................

97

Table 2-15.

Floating-Point Compare Instructions ......................................................................................

97

Table 2-16.

Floating-Point Status and Control Register Instructions ........................................................

97

Table 2-17.

Floating-Point Move Instructions ............................................................................................

98

Table 2-18.

Integer Load Instructions ........................................................................................................

99

Table 2-19.

Integer Store Instructions .....................................................................................................

101

Table 2-20.

Integer Load-and-Store with Byte-Reverse Instructions ......................................................

102

Table 2-21.

Integer Load-and-Store Multiple Instructions .......................................................................

102

Table 2-22.

Integer Load-and-Store String Instructions ..........................................................................

103

Table 2-23.

Floating-Point Load Instructions ...........................................................................................

104

Table 2-24.

Floating-Point Store Instructions ..........................................................................................

105

Table 2-25.

Store Floating-Point Single Behavior ...................................................................................

105

Table 2-26.

Store Floating-Point Double Behavior ..................................................................................

105

Table 2-27.

Branch Instructions ..............................................................................................................

107

Table 2-28.

Condition Register Logical Instructions ................................................................................

107

Table 2-29.

Trap Instructions ..................................................................................................................

108

Table 2-30.

System Linkage Instruction—UISA ......................................................................................

108

Table 2-31.

Move-to/Move-from Condition Register Instructions ............................................................

108

Table 2-32.

Move-to/Move-from Special-Purpose Register Instructions (UISA) .....................................

109

Table 2-33.

PowerPC Encodings ............................................................................................................

109

Table 2-34.

SPR Encodings for 750GX-Defined Registers (mfspr) ........................................................

112

Table 2-35.

Memory Synchronization Instructions—UISA .......................................................................

113

Table 2-36.

Move-from Time Base Instruction .........................................................................................

114

Table 2-37.

Memory Synchronization Instructions—VEA ........................................................................

115

Table 2-38.

User-Level Cache Instructions .............................................................................................

116

Table 2-39.

External Control Instructions ................................................................................................

117

Table 2-40.

System Linkage Instructions—OEA .....................................................................................

118

Table 2-41.

Move-to/Move-from Machine State Register Instructions .....................................................

118

Table 2-42.

Move-to/Move-from Special-Purpose Register Instructions (OEA) ......................................

118

Table 2-43.

Supervisor-Level Cache-Management Instruction ...............................................................

119

Table 2-44.

Segment Register Manipulation Instructions ........................................................................

119

Table 2-45.

Translation Lookaside Buffer Management Instruction ........................................................

120

Table 3-1.

MEI State Definitions ............................................................................................................

127

Table 3-2.

PLRU Bit Update Rules ........................................................................................................

138

Table 3-3.

PLRU Replacement Block Selection ....................................................................................

138

Table 3-4.

Bus Operations Caused by Cache-Control Instructions (WIM = 001) ..................................

141

Table 3-5.

Response to Snooped Bus Transactions .............................................................................

143

Table 3-6.

Address/Transfer Attribute Summary ...................................................................................

146

Table 3-7.

MEI State Transitions ...........................................................................................................

147

Table 4-1.

PowerPC 750GX Microprocessor Exception Classifications ................................................

152

Table 4-2.

Exceptions and Conditions ...................................................................................................

152

Table 4-3.

Exception Priorities ...............................................................................................................

155

Table 4-4.

IEEE Floating-Point Exception Mode Bits ............................................................................

160

Table 4-5.

MSR Setting Due to Exception .............................................................................................

162

Table 4-6.

System Reset Exception–Register Settings .........................................................................

163

Table 4-7.

Settings Caused by Hard Reset ...........................................................................................

166

Table 4-8.

HID0 Machine-Check Enable Bits ........................................................................................

167

Table 4-9.

Machine-Check Exception—Register Settings .....................................................................

168

Table 4-10.

Performance-Monitor Interrupt Exception—Register Settings ..............................................

172

Table 4-11.

Instruction Address Breakpoint Exception—Register Settings .............................................

173

Table 4-12.

System Management Interrupt Exception—Register Settings .............................................

174

Table 4-13.

Thermal-Management Interrupt Exception—Register Settings ............................................

174

Table 4-14.

Front-End Exception Handling Summary .............................................................................

176

Table 5-1.

MMU Feature Summary .......................................................................................................

180

Table 5-2.

Access Protection Options for Pages ...................................................................................

188

Table 5-3.

Translation Exception Conditions .........................................................................................

192

Table 5-4.

Other MMU Exception Conditions for the 750GX Processor ................................................

193

Table 5-5.

750GX Microprocessor Instruction Summary—Control MMUs ............................................

194

Table 5-6.

750GX Microprocessor MMU Registers ...............................................................................

195

Table 5-7.

Table-Search Operations to Update History Bits—TLB Hit Case ........................................

197

Table 5-8.

Model for Guaranteed R and C Bit Settings .........................................................................

198

Table 6-1.

Notation Conventions for Instruction Timing ........................................................................

214

Table 6-2.

Performance Effects of Memory Operand Placement ..........................................................

233

Table 6-3.

TLB Miss Latencies ..............................................................................................................

236

Table 6-4.

Branch Instructions ..............................................................................................................

238

Table 6-5.

System-Register Instructions ...............................................................................................

238

Table 6-6.

Condition Register Logical Instructions ................................................................................

240

Table 6-7.

Integer Instructions ...............................................................................................................

240

Table 6-8.

Floating-Point Instructions ....................................................................................................

242

Table 6-9.

Load-and-Store Instructions .................................................................................................

244

Table 7-1.

Transfer Type Encodings for PowerPC 750GX Bus Master ................................................

256

Table 7-2.

PowerPC 750GX Snoop Hit Response ................................................................................

257

Table 7-3.

Data-Transfer Size ...............................................................................................................

259

Table 7-4.

Data-Bus Lane Assignments ................................................................................................

266

Table 7-5.

DP[0–7] Signal Assignments ................................................................................................

267

Table 7-6.

Summary of Mode Select Signals ........................................................................................

274

Table 7-7.

Bus Voltage Selection Settings ............................................................................................

275

Table 7-8.

IEEE Interface Pin Descriptions ...........................................................................................

275

Table 8-1.

Transfer Size Signal Encodings ...........................................................................................

294

Table 8-2.

Burst Ordering—64-Bit Bus ..................................................................................................

295

Table 8-3.

Burst Ordering—32-Bit Bus ..................................................................................................

296

Table 8-4.

Aligned Data Transfers ........................................................................................................

296

Table 8-5.

Misaligned Data Transfers (4-Byte Examples) .....................................................................

298

Table 8-6.

Aligned Data Transfers (32-Bit Bus Mode) ..........................................................................

298

Table 8-7.

Misaligned 32-Bit Data-Bus Transfer (4-Byte Examples) .....................................................

299

Table 9-1.

Interpretation of LRU Bits .....................................................................................................

324

Table 9-2.

Modification of LRU Bits .......................................................................................................

325

Table 9-3.

Effect of Locked Ways on LRU Interpretation ......................................................................

325

Table 10-1.

750GX Microprocessor Programmable Power Modes .........................................................

336

Table 10-2.

HID0 Power Saving Mode Bit Settings .................................................................................

337

Table 10-3.

Valid THRM1 and THRM2 Bit Settings ................................................................................

345

Table 10-4.

ICTC Bit Field Settings .........................................................................................................

348

Table 11-1.

Performance Monitor SPRs .................................................................................................

350

Table 11-2.

PMC1 Events—MMCR0[19:25] Select Encodings ...............................................................

352

Table 11-3.

PMC2 Events—MMCR0[26:31] Select Encodings ...............................................................

352

Table 11-4.

PMC3 Events—MMCR1[0:4] Select Encodings ...................................................................

353

Table 11-5.

PMC4 Events—MMCR1[5:9] Select Encodings ...................................................................

354

Table 11-6.

HID0 Checkstop Control Bits ...............................................................................................

361

Table 11-7.

HID2 Checkstop Control Bits ................................................................................................

362

Table 11-8.

L2CR Checkstop Control Bits ...............................................................................................

362

About This Manual

This user’s manual defines the functionality of the PowerPC® 750GX and 750GL RISC microprocessors. It describes features of the 750GX and 750GL that are not defined by the architecture. This book is intended as a companion to the PowerPC Microprocessor Family: The Programming Environments (referred to as The Programming Environments Manual).

Note: Soft copies of the latest version of this manual and documents referred to in this manual that are produced by IBM can be accessed on the world wide web as follows: http://www-3.ibm.com/chips/techlib.

Note: All information contained in this document referring to the PowerPC 750GX RISC Microprocessor also pertains to the IBM PowerPC 750GL RISC Microprocessor.

Who Should Read This Manual

This manual is intended for system software developers, hardware developers, and applications programmers designing products for the 750GX. Readers should understand operating systems, microprocessor system design, basic principles of RISC processing, and details of the PowerPC Architecture™.

Related Publications

PowerPC Architecture

May, Cathy, et al., eds. The PowerPC Architecture: A Specification for a New Family of RISC Processors, Second Edition. San Francisco, CA: Morgan Kaufmann, 1994.

McClanahan, Kip. PowerPC Programming for Intel Programmers. Foster City, CA: Hungry Minds, 1995.

Shanley, Tom. PowerPC System Architecture, Second Edition. Richardson, TX: Addison-Wesley, 1995.

PowerPC Microprocessor Documentation

The latest version of this manual, errata, and other IBM documents referred to in this manual can be found at: http://www.ibm.com/chips/techlib.

PowerPC 750GX RISC Microprocessor Datasheet. Provides data about bus timing, signal behavior, electrical and thermal characteristics, and other design considerations for each PowerPC implementation.

PowerPC Microprocessor Family: The Programming Environments Manual (G522-0290-01). Provides information about resources defined by the PowerPC Architecture that are common to PowerPC processors.

Implementation Variances Relative to Rev. 1 of The Programming Environments Manual.

PowerPC Microprocessor Family: The Programmer’s Pocket Reference Guide (SA14-2093-00). This foldout card provides an overview of the PowerPC registers, instructions, and exceptions for 32-bit implementations.

PowerPC Microprocessor Family: The Programmer’s Reference Guide (MPRPPCPRG-01). Includes the register summary, memory control model, exception vectors, and the PowerPC instruction set.

Application notes. These short documents contain information about specific design issues useful to programmers and engineers working with PowerPC processors.

Conventions Used in This Manual

Notational Conventions

mnemonics

Instruction mnemonics are shown in lowercase bold.

italics

Italics indicate variable command parameters. For example: bcctrx. Book titles in text are set in italics.

0x0

Prefix to denote a hexadecimal number.

0b0

Prefix to denote a binary number.

crfD

Instruction syntax used to identify a destination Condition Register (CR) field.

rA, rB

Instruction syntax used to identify a source General Purpose Register (GPR).

rD

Instruction syntax used to identify a destination GPR.

frA, frB, frC

Instruction syntax used to identify a source Floating Point Register (FPR).

frD

Instruction syntax used to identify a destination FPR.

REG[FIELD]

Abbreviations or acronyms for registers are shown in uppercase text. Specific bits, fields, or ranges appear in brackets. For example, MSR[LE] refers to the little-endian mode enable bit in the Machine State Register.

x

In certain contexts, such as a signal encoding, this indicates a don’t care.

n

Used to express an undefined numerical value.

¬

NOT logical operator.

&

AND logical operator.

|

OR logical operator.

 

 

0 0 0 0

Indicates reserved bits or bit fields in a register. Although these bits can be written to as either ones or zeros, they are always read as zeros.
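The REG[FIELD] notation above can be mirrored in C with a small mask helper. The following is a minimal sketch, not part of the manual. It assumes the PowerPC convention that bit 0 is the most-significant bit of a 32-bit register, and it assumes the bit positions MSR[EE] = 16 and MSR[LE] = 31, which come from the architecture definition rather than from this section. The names ppc_bit, reg_field_is_set, MSR_EE_BIT, and MSR_LE_BIT are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: PowerPC numbers register bits from the most-significant end,
 * so bit 0 is the MSB and bit 31 is the LSB of a 32-bit register. */
static uint32_t ppc_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << (31u - n);
}

/* Assumed bit positions (from the architecture definition, not from this
 * section): MSR[EE] is bit 16 and MSR[LE] is bit 31. */
enum { MSR_EE_BIT = 16, MSR_LE_BIT = 31 };

/* Returns 1 if the single-bit field REG[bit] is set, 0 otherwise. */
static int reg_field_is_set(uint32_t reg, unsigned bit)
{
    return (reg & ppc_bit(bit)) != 0;
}
```

With this numbering, a test such as reg_field_is_set(msr, MSR_LE_BIT) reads the same way as the MSR[LE] notation used in the text.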

 

 

 

Terminology Conventions

The following table describes terminology conventions used in this manual and the equivalent terminology used in the PowerPC Architecture specification.

PowerPC Architecture Specification      750GX User’s Manual

Data-storage interrupt (DSI)            DSI exception
Extended mnemonics                      Simplified mnemonics
Fixed-point unit (FXU)                  Integer unit (IU)
Instruction storage interrupt (ISI)     ISI exception
Interrupt                               Exception
Privileged mode (or privileged state)   Supervisor-level privilege
Problem mode (or problem state)         User-level privilege
Real address                            Physical address
Relocation                              Translation
Storage (locations)                     Memory
Storage (the act of)                    Access
Store in                                Write back
Store through                           Write through

 

 

Instruction Field Conventions

The following table describes instruction field conventions used in this manual and the equivalent conventions from the PowerPC Architecture specification.

PowerPC Architecture Specification      750GX User’s Manual

BA, BB, BT                              crbA, crbB, crbD (respectively)
BF, BFA                                 crfD, crfS (respectively)
D                                       d
DS                                      ds
FLM                                     FM
FRA, FRB, FRC, FRT, FRS                 frA, frB, frC, frD, frS (respectively)
FXM                                     CRM
RA, RB, RT, RS                          rA, rB, rD, rS (respectively)
SI                                      SIMM
U                                       IMM
UI                                      UIMM
/, //, ///                              0...0 (shaded)
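To show how the rD, rA, and SIMM field names map onto an instruction word, the following sketch encodes and decodes a D-form instruction. It is an illustration under stated assumptions, not text from the manual. The layout (primary opcode in bits 0 to 5, rD in bits 6 to 10, rA in bits 11 to 15, SIMM in bits 16 to 31, with bit 0 as the most-significant bit) and the primary opcode 14 for addi are taken from the PowerPC Architecture definition. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a D-form instruction, using this manual's field names. */
typedef struct {
    unsigned opcode; /* primary opcode, bits 0-5  */
    unsigned rD;     /* destination GPR, bits 6-10 */
    unsigned rA;     /* source GPR, bits 11-15     */
    int32_t  simm;   /* SIMM, bits 16-31, sign-extended */
} d_form_t;

/* Pack the fields into a 32-bit instruction word. */
static uint32_t encode_d_form(unsigned opcode, unsigned rD, unsigned rA,
                              int32_t simm)
{
    assert(opcode < 64 && rD < 32 && rA < 32);
    assert(simm >= -32768 && simm <= 32767);
    return ((uint32_t)opcode << 26) | ((uint32_t)rD << 21) |
           ((uint32_t)rA << 16) | ((uint32_t)simm & 0xFFFFu);
}

/* Split a 32-bit instruction word back into its D-form fields. */
static d_form_t decode_d_form(uint32_t insn)
{
    d_form_t f;
    uint32_t low = insn & 0xFFFFu;

    f.opcode = (insn >> 26) & 0x3Fu;
    f.rD     = (insn >> 21) & 0x1Fu;
    f.rA     = (insn >> 16) & 0x1Fu;
    /* Sign-extend the 16-bit immediate without relying on
     * implementation-defined narrowing conversions. */
    f.simm   = (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
    return f;
}
```

For example, under these assumptions addi r3,r4,1 corresponds to encode_d_form(14, 3, 4, 1), which yields 0x38640001.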

 

 

Using This Manual with the Programming Environments Manual

Because the PowerPC Architecture is designed to be flexible to support a broad range of processors, the PowerPC Microprocessor Family: The Programming Environments Manual provides a general description of features that are common to PowerPC processors and indicates those features that are optional or that might be implemented differently in the design of each processor.

This document and The Programming Environments Manual describe three levels, or programming environments, of the PowerPC Architecture:

PowerPC user instruction set architecture (UISA)—The UISA defines the level of the architecture to which user-level software should conform. The UISA defines the base user-level instruction set, user-level registers, data types, memory conventions, and the memory and programming models seen by application programmers.

PowerPC virtual environment architecture (VEA)—The VEA, which is the smallest component of the PowerPC Architecture, defines additional user-level functionality that falls outside typical user-level software requirements. The VEA describes the memory model for an environment in which multiple processors or other devices can access external memory and defines aspects of the cache model and cache-control instructions from a user-level perspective. The resources defined by the VEA are particularly useful for optimizing memory accesses and for managing resources in an environment in which other processors and other devices can access external memory.

Implementations that conform to the PowerPC VEA also conform to the PowerPC UISA, but might not necessarily adhere to the OEA.

PowerPC operating environment architecture (OEA)—The OEA defines supervisor-level resources typically required by an operating system. The OEA defines the PowerPC memory-management model, supervisor-level registers, and the exception model.

Implementations that conform to the PowerPC OEA also conform to the PowerPC UISA and VEA.

Some resources are defined more generally at one level in the architecture and more specifically at another. For example, conditions that cause a floating-point exception are defined by the UISA, while the exception mechanism itself is defined by the OEA.

Because it is important to distinguish between the levels of the architecture in order to ensure compatibility across multiple platforms, those distinctions are shown clearly throughout this book.

For ease in reference, the arrangement of topics in this book follows that of The Programming Environments Manual. Topics build upon one another, beginning with a description and complete summary of 750GX-specific registers and instructions and progressing to more specialized topics such as 750GX-specific details regarding the cache, exception, and memory-management models. Therefore, chapters can include information from multiple levels of the architecture. (For example, the discussion of the cache model uses information from both the VEA and the OEA.)

The PowerPC Architecture: A Specification for a New Family of RISC Processors defines the architecture from the perspective of the three programming environments and remains the defining document for the PowerPC Architecture. For information about PowerPC documentation, see Related Publications on page 19.

 

1. PowerPC 750GX Overview

The IBM PowerPC 750GX reduced instruction set computer (RISC) Microprocessor is an implementation of the PowerPC Architecture™ with enhancements based on the IBM PowerPC 750™, 750CXe, and 750FX RISC microprocessor designs. This chapter provides an overview of the PowerPC 750GX microprocessor features, including a block diagram that shows the major functional components. It also describes how the 750GX implementation complies with the PowerPC Architecture definition.

Note: In this document, the IBM PowerPC 750GX RISC Microprocessor is abbreviated as 750GX or 750GX RISC Microprocessor.

1.1 750GX Microprocessor Overview

The 750GX is a 32-bit implementation of the PowerPC Architecture in a 0.13 micron CMOS technology with six levels of copper interconnect. The 750GX is designed for high performance and low power consumption. It provides a superset of functionality to the PowerPC 750 processor, including a complete 60x bus interface, and enhancements such as an integrated 1-MB L2 cache.

The 750GX implements the 32-bit portion of the PowerPC Architecture, which provides 32-bit effective addresses, integer data types of 8, 16, and 32 bits, and single- and double-precision floating-point data types. The 750GX is a superscalar processor that can complete two instructions simultaneously.

It incorporates the following six execution units:

Floating-point unit (FPU)

Branch processing unit (BPU)

System register unit (SRU)

Load/store unit (LSU)

Two integer units (IUs): IU1 executes all integer instructions. IU2 executes all integer instructions except multiply and divide instructions.

The ability to execute several instructions in parallel and the use of simple instructions with rapid execution times yield high efficiency and throughput for 750GX-based systems. Most integer instructions execute in one clock cycle. The FPU is pipelined; it breaks the tasks it performs into subtasks, and then executes in three successive stages. Typically, a floating-point instruction can occupy only one of the three stages at a time, freeing the previous stage to work on the next floating-point instruction. Thus, three single-precision floating-point instructions can be in the FPU execute stage at a time. Double-precision add instructions have a 3-cycle latency; double-precision multiply and multiply/add instructions have a 4-cycle latency.

Figure 1-1, 750GX Microprocessor Block Diagram, on page 25 shows the parallel organization of the execution units (shaded in the diagram). The instruction unit fetches, dispatches, and predicts branch instructions. Note that this is a conceptual model that shows basic features rather than attempting to show how features are implemented physically.

750GX has independent on-chip, 32-KB, 8-way set-associative, physically addressed caches for instructions and data, and independent instruction and data memory management units (MMUs). Each memory management unit has a 128-entry, 2-way set-associative translation lookaside buffer (DTLB and ITLB) that saves recently used page-address translations. Block-address translation is done through the 8-entry instruction and data block-address-translation (IBAT and DBAT) arrays, defined by the PowerPC Architecture. During block translation, effective addresses are compared simultaneously with all eight block-address-translation (BAT) entries.

For information about the L1 cache, see Chapter 3, Instruction-Cache and Data-Cache Operation, on page 121. The L2 cache is implemented with an on-chip, 4-way set-associative tag memory, and an on-chip 1-MB SRAM with error correction code (ECC) protection for data storage. For more information on the L2 cache, see Chapter 9 on page 323.

The 750GX has a 32-bit address bus and a 64-bit data bus. Multiple devices compete for system resources through a central external arbiter. The 750GX’s 3-state cache-coherency protocol (MEI) supports the modified, exclusive, and invalid states, a compatible subset of the MESI (modified/exclusive/shared/invalid) 4-state protocol, and it operates coherently in systems with 4-state caches. The 750GX supports single-beat and burst data transfers for external memory accesses and memory-mapped I/O operations. The system interface is described in Chapter 7, Signal Descriptions, on page 249 and Chapter 8, Bus Interface Operation, on page 279.
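The three MEI states can be pictured as a small per-block state machine. The following Python sketch is a simplified illustration, not the 750GX implementation: the event names (local_read, local_write, snoop_hit) and the function next_state are hypothetical, and the transitions follow common textbook MEI behavior rather than anything specified in this chapter.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    INVALID = "I"


def next_state(state, event):
    """Return (new_state, copyback_needed) for one cache block.

    Simplified model: bus operation types and WIMG attributes are ignored.
    """
    if event == "local_read":
        # A read miss allocates the block as exclusive; a read hit keeps its state.
        return (State.EXCLUSIVE if state is State.INVALID else state), False
    if event == "local_write":
        # Any local write leaves the block modified.
        return State.MODIFIED, False
    if event == "snoop_hit":
        # There is no shared state, so the block is given up.
        # Modified data must be copied back to memory first.
        return State.INVALID, state is State.MODIFIED
    raise ValueError(f"unknown event: {event}")
```

Because there is no shared state in this model, a block is valid in at most one cache at a time, which is what lets a 3-state cache coexist with 4-state caches on the same bus.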

The 750GX has four software-controllable power-saving modes. The three static modes (doze, nap, and sleep) progressively reduce power dissipation. When functional units are idle, a dynamic power management mode causes those units to enter a low-power mode automatically without affecting operational performance, software execution, or external hardware. The 750GX also provides a thermal assist unit (TAU) and a way to reduce the instruction fetch rate to limit power dissipation. Power management is described in Chapter 10, Power and Thermal Management, on page 335.


Figure 1-1. 750GX Microprocessor Block Diagram

[Block diagram not reproduced. Labeled blocks: instruction control unit (instruction fetcher with 128-bit, 4-instruction fetch path; branch processing unit with 64-entry BTIC, BHT, CTR, LR, and CR; 6-word instruction queue; dispatch unit, 2 instructions); reservation stations feeding the system register unit, integer unit 1, integer unit 2, load/store unit (2-entry station, EA calculation, store queue), and floating-point unit (2-entry station, FPSCR); GPR file and FPR file, each with six rename buffers; completion unit with 6-entry reorder buffer; instruction MMU (shadow SRs, IBAT array, ITLB) with 32-KB instruction cache and tags; data MMU (original SRs, DBAT array, DTLB) with 32-KB data cache and tags; 60x bus interface unit (instruction fetch queue, L1 castout queue, data load queue) driving the 32-bit address bus and 64-bit data bus; L2 cache (L2CR, L2 tag, 1-MB SRAM); Time Base Counter/Decrementer, clock multiplier, JTAG/COP interface, thermal/power management, performance monitor, and interrupt logic.]

1.2 750GX Microprocessor Features

This section lists features of the 750GX. The interrelationship of these features is shown in Figure 1-1 on page 25.

Major features of 750GX are:

High-performance, superscalar microprocessor.

As many as four instructions can be fetched from the instruction cache per clock cycle.

As many as two instructions can be dispatched and completed per clock.

As many as six instructions can execute per clock (including two integer instructions).

Single-clock-cycle execution for most instructions.

Six independent execution units and two register files.

BPU featuring both static and dynamic branch prediction.

64-entry (16-set, 4-way set-associative) branch target instruction cache (BTIC), a cache of branch instructions that have been encountered in branch/loop code sequences. If a target instruction is in the BTIC, it is fetched into the instruction queue a cycle sooner than it can be made available from the instruction cache. Typically, if a fetch access hits the BTIC, it provides the first two instructions in the target stream, effectively yielding a zero-cycle branch.

512-entry branch history table (BHT) with two bits per entry for four levels of prediction: not-taken, strongly not-taken, taken, strongly taken.

Removal of Branch instructions that do not update the Count Register (CTR) or Link Register (LR) from the instruction stream.

Two integer units (IUs) that share 32 general purpose registers (GPRs) for integer operands.

IU1 can execute any integer instruction.

IU2 can execute all integer instructions (shift, rotate, arithmetic, and logical instructions) except multiply and divide instructions. Most instructions that execute in the IU2 take one cycle to execute. The IU2 has a single-entry reservation station.

3-stage floating-point unit (FPU).

FPU fully compliant with IEEE® 754-1985 for both single-precision and double-precision operations.

Support for non-IEEE mode for time-critical operations.

Hardware support for denormalized numbers.

Hardware support for divide.

2-entry reservation station.

Thirty-two 64-bit Floating Point Registers (FPRs) for single and double-precision operations.

2-stage load/store unit (LSU).

2-entry reservation station.

4-entry load queue.

Single-cycle, pipelined cache access.

Dedicated adder performs effective address (EA) calculations.

Performs alignment and precision conversion for floating-point data.

Performs alignment and sign extension for integer data.

3-entry store queue.

Supports both big-endian and little-endian modes.

System register unit (SRU) handles miscellaneous instructions.

Executes Condition Register (CR) logical and Move-to/Move-from SPR instructions (mtspr and mfspr).

Single-entry reservation station.

Rename buffers.

Six GPR rename buffers.

Six FPR rename buffers.

Condition Register buffering supports two CR writes per clock.

Completion unit.

The completion unit retires an instruction from the 6-entry reorder buffer (completion queue) when all instructions ahead of it have been completed, the instruction has finished execution, and no exceptions are pending.

Guarantees a sequential programming model and a precise-exception model.

Monitors all dispatched instructions and retires them in order.

Tracks unresolved branches and flushes instructions from the mispredicted branch path.


Retires as many as two instructions per clock.

Separate on-chip L1 instruction and data caches (Harvard architecture).

32-KB, 8-way set-associative instruction and data caches.

Pseudo least-recently-used (PLRU) replacement algorithm.

32-byte (8-word) cache block.

Physically indexed/physical tags.

Note: The PowerPC Architecture refers to physical address space as real address space.

Cache write-back or write-through operation programmable on a virtual-page or BAT-block basis.

Instruction cache can provide four instructions per clock; data cache can provide two words per clock.

Caches can be disabled in software.

Caches can be locked in software.

Data-cache coherency (MEI) maintained in hardware.

The critical double word is made available to the requesting unit when it is read into the line-fill buffer. The cache is nonblocking, so it can be accessed during block reload.

Nonblocking instruction cache (one outstanding miss).

Nonblocking data cache (four outstanding misses).

No snooping of instruction cache.

Parity for L1 tags and caches.

Integrated L2 cache.

1-MB on-chip ECC SRAMs.

On-chip 4-way set-associative tag memory.

ECC error correction for most single-bit errors; detection of remaining single-bit errors and all double-bit errors.

Copy-back or write-through data cache on a page basis, or for entire L2.

64-byte line size, two sectors per line.

L2 frequency at core speed.

On-board ECC; parity for L2 tags.

Supports up to four outstanding misses (three data and one instruction or four data).

Cache locking by way.

Separate memory management units (MMUs) for instructions and data.

52-bit virtual address; 32-bit physical address.

Address translation for virtual pages or variable-sized BAT blocks.

Memory programmable as write-back or write-through, cacheable or noncacheable, and coherency enforced or coherency not enforced on a virtual-page or BAT block basis.

Separate IBAT and DBAT arrays (eight each) for instructions and data, respectively.

Separate virtual instruction and data translation lookaside buffers (TLBs).

Both TLBs are 128-entry, 2-way set associative, and use an LRU replacement algorithm.


TLBs are hardware-reloadable (the page table search is performed by hardware).

Bus interface features:

Enhanced 60x bus that pipelines back-to-back reads to a depth of four. A dedicated snoop queue allows snoop copybacks to be pipelined along with as many as four reads. Enveloped write transactions are supported with the assertion of DBWO.

Selectable bus-to-core clock frequency ratios of 2x, 2.5x, 3x, 3.5x, 4x, 4.5x, 5x, 5.5x, 6x, 6.5x, 7x, 7.5x, 8x, 8.5x, 9x, 9.5x, 10x, 11x, 12x, 13x, 14x, 15x, 16x, 17x, 18x, 19x, and 20x supported (2x, 2.5x, 3x, and 3.5x not supported with bus pipelining enabled).

A 64-bit, split-transaction external data bus with burst transfers.

Support for address pipelining and limited out-of-order bus transactions.

8-word reload buffer for the L1 data cache.

Single-entry instruction fetch queue.

2-entry L2 cache castout queue.

No-DRTRY mode eliminates the DRTRY signal from the qualified bus grant. This allows the forwarding of data during load operations to the internal core one bus cycle sooner than if the use of DRTRY is enabled.

Selectable I/O interface voltages of 1.8 V, 2.5 V, or 3.3 V.

Multiprocessing support features:

Hardware-enforced, 3-state cache-coherency protocol (MEI) for data cache.

Load/store with reservation instruction pair for atomic memory references, semaphores, and other multiprocessor operations.

Power and thermal management:

Three static modes, doze, nap, and sleep, progressively reduce power dissipation:

Doze—All the functional units are disabled except for the Time Base/Decrementer Registers and the bus snooping logic.

Nap—The nap mode further reduces power consumption by disabling bus snooping, leaving only the Time Base Register and the PLL in a powered state.

Sleep—All internal functional units are disabled, after which external system logic can disable the PLL and SYSCLK.

Software-controllable thermal management. Thermal management is performed through the use of three supervisor-level registers and a 750GX-specific thermal-management exception.

Software-controlled frequency switching (dual PLL mode) to allow toggling between minimum and maximum frequencies to manage power consumption based on computational load.

Instruction-cache throttling provides control to slow instruction fetching to limit power consumption.

Hardware-assist features for fault-tolerant systems including L2 ECC correction, parity checking on internal arrays, and dual-processor lockstep operation.

Performance monitor can be used to help debug system designs and improve software efficiency.

In-system testability and debugging features through Joint Test Action Group (JTAG) boundary-scan capability.


1.2.1 Instruction Flow

As shown in Figure 1-1, 750GX Microprocessor Block Diagram, on page 25, the 750GX instruction control unit provides centralized control of instruction flow to the execution units. The instruction unit contains a sequential instruction fetch (Ifetch), 6-entry instruction queue (IQ), dispatch unit, and BPU. It determines the address of the next instruction to be fetched based on information from the sequential instruction fetcher and from the BPU. See Chapter 6, Instruction Timing, on page 209 for more information.

The sequential instruction fetcher loads instructions from the instruction cache into the instruction queue. The BPU extracts branch instructions from the sequential instruction fetcher. Branch instructions that cannot be resolved immediately are predicted using either 750GX-specific dynamic branch prediction or the architecture-defined static branch prediction.

Branch instructions that do not update the LR or CTR are removed from (folded out of) the instruction stream. Instruction fetching continues along the predicted path of the branch instruction.

Instructions issued to execution units beyond a predicted branch can be executed but are not retired until the branch is resolved. If branch prediction is incorrect, the completion unit flushes all instructions fetched on the predicted path, and instruction fetching resumes along the correct path.

1.2.1.1 Instruction Queue and Dispatch Unit

The instruction queue (IQ), shown in Figure 1-1 on page 25, holds as many as six instructions and loads up to four instructions from the instruction cache during a single processor clock cycle. The instruction fetcher continuously attempts to load as many instructions as there were vacancies created in the IQ in the previous clock cycle. All instructions except branches are dispatched to their respective execution units from the bottom two positions in the instruction queue (IQ0 and IQ1) at a maximum rate of two instructions per cycle. Reservation stations are provided for the IU1, IU2, FPU, LSU, and SRU for dispatched instructions. The dispatch unit checks for source and destination register dependencies, allocates rename buffers, determines whether a position is available in the completion queue, and inhibits subsequent instruction dispatching if these resources are not available.

Branch instructions can be detected, decoded, and predicted from anywhere in the instruction queue. For a more detailed discussion of instruction dispatch, see Section 6.6.1, Branch, Dispatch, and Completion-Unit Resource Requirements, on page 237.
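The interaction between the fetch and dispatch limits can be illustrated with a small cycle-by-cycle model. The Python sketch below is an illustration only: the function simulate and its constants are hypothetical names, and the model ignores branches, register dependencies, rename-buffer availability, completion-queue vacancies, and cache misses. It assumes that in each cycle dispatch happens first and the fetcher then fills whatever room is left.

```python
from collections import deque

IQ_SIZE = 6        # instruction queue entries
MAX_FETCH = 4      # instructions fetched per cycle, at most
MAX_DISPATCH = 2   # instructions dispatched per cycle, at most


def simulate(num_instructions, cycles):
    """Return the number of instructions dispatched in each cycle."""
    iq = deque()
    fetched = 0
    dispatched_per_cycle = []
    for _cycle in range(cycles):
        # Dispatch from the bottom of the queue (IQ0 and IQ1).
        n_dispatch = min(MAX_DISPATCH, len(iq))
        for _slot in range(n_dispatch):
            iq.popleft()
        dispatched_per_cycle.append(n_dispatch)
        # Fetch into the free entries, limited to four per cycle.
        n_fetch = min(MAX_FETCH, IQ_SIZE - len(iq), num_instructions - fetched)
        iq.extend(range(fetched, fetched + n_fetch))
        fetched += n_fetch
    return dispatched_per_cycle
```

In this model the fetcher outruns the dispatcher, so the queue fills after the first few cycles and the sustained rate is set by the two-instruction dispatch limit.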

1.2.1.2 Branch Processing Unit (BPU)

The BPU receives branch instructions from the sequential instruction fetcher and performs CR lookahead operations on conditional branches to resolve them early, achieving the effect of a zero-cycle branch in many cases.

Unconditional branch instructions and conditional branch instructions in which the condition is known can be resolved immediately. For unresolved conditional branch instructions, the branch path is predicted using either the architecture-defined static branch prediction or 750GX-specific dynamic branch prediction. Dynamic branch prediction is enabled if the BHT bit in Hardware-Implementation-Dependent Register 0 is set (HID0[BHT] = 1).

When a prediction is made, instruction fetching, dispatching, and execution continue along the predicted path, but instructions cannot be retired and write results back to architected registers until the prediction is determined to be correct (resolved). When a prediction is incorrect, the instructions from the incorrect path are flushed from the processor, and instruction fetching resumes along the correct path. The 750GX allows a second branch instruction to be predicted; instructions from the second predicted branch instruction stream can be fetched but cannot be dispatched. These instructions are held in the instruction queue.

Dynamic prediction is implemented using a 512-entry BHT. The BHT is a cache that provides two bits per entry that together indicate four levels of prediction for a branch instruction—not-taken, strongly not-taken, taken, strongly taken. When dynamic branch prediction is disabled, the BPU uses a bit in the instruction encoding to predict the direction of the conditional branch. Therefore, when an unresolved conditional branch instruction is encountered, the 750GX executes instructions from the predicted path although the results are not committed to architected registers until the conditional branch is resolved. This execution can continue until a second unresolved branch instruction is encountered.
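A 2-bit-per-entry history table of this kind is commonly modeled as an array of saturating counters. The Python sketch below is a generic illustration under stated assumptions, not the 750GX design: the class name, the counter encoding (0 = strongly not-taken through 3 = strongly taken), the initial counter value, and the way a branch address selects one of the 512 entries are all hypothetical.

```python
class BranchHistoryTable:
    ENTRIES = 512

    def __init__(self):
        # Assumed encoding: 0 = strongly not-taken, 1 = not-taken,
        # 2 = taken, 3 = strongly taken. Initial value is an assumption.
        self.counters = [1] * self.ENTRIES

    def _index(self, branch_address):
        # Assumed indexing: instructions are 4 bytes, so drop the two
        # low-order address bits and use the next nine bits.
        return (branch_address >> 2) % self.ENTRIES

    def predict_taken(self, branch_address):
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        # Saturating increment on taken, saturating decrement on not taken.
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

With two bits per entry, a single mispredicted iteration (for example, a loop exit) moves a strongly-taken entry only to taken, so the next encounter of the same branch is still predicted taken.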

When a branch is taken (or predicted as taken), the instructions from the untaken path must be flushed, and the target instruction stream must be fetched into the IQ. The BTIC is a 64-entry cache that contains the most recently used branch target instructions, typically in pairs. When an instruction fetch hits in the BTIC, the instructions arrive in the instruction queue in the next clock cycle, a clock cycle sooner than they would arrive from the instruction cache. Additional instructions arrive from the instruction cache in the next clock cycle. The BTIC reduces the number of missed opportunities to dispatch instructions and gives the processor a 1-cycle head start on processing the target stream. With the use of the BTIC, the 750GX achieves a zero-cycle delay for branches taken. Coherency of the BTIC table is maintained by table reset on an instruction-cache flash invalidate, Instruction Cache Block Invalidate (icbi) or Return from Interrupt (rfi) instruction execution, or when an exception is taken.

The BPU contains an adder to compute branch target addresses and three user-control registers—the Link Register (LR), the Count Register (CTR), and the CR. The BPU calculates the return pointer for subroutine calls and saves it into the LR for certain types of branch instructions. The LR also contains the branch target address for the Branch Conditional to Link Register (bclrx) instruction. The CTR contains the branch target address for the Branch Conditional to Count Register (bcctrx) instruction. Because the LR and CTR are special purpose registers (SPRs), their contents can be copied to or from any GPR. Since the BPU uses dedicated registers rather than GPRs or FPRs, execution of branch instructions is largely independent from execution of fixed-point and floating-point instructions.

1.2.1.3 Completion Unit

The completion unit operates closely with the dispatch unit. Instructions are fetched and dispatched in program order. At the point of dispatch, the program order is maintained by assigning each dispatched instruction a successive entry in the 6-entry completion queue. The completion unit tracks instructions from dispatch through execution and retires them in program order from the two bottom entries in the completion queue (CQ0 and CQ1).

Instructions cannot be dispatched to an execution unit unless there is a vacancy in the completion queue and rename buffers are available. Branch instructions that do not update the CTR or LR are removed from the instruction stream and do not occupy a space in the completion queue. Branch instructions that update the CTR or LR follow the same dispatch and completion procedures as nonbranch instructions, except that they are not issued to an execution unit.

An instruction is retired when it is removed from the completion queue and its results are written to architected registers (GPRs, FPRs, LR, and CTR) from the rename buffers. The rename buffers assigned to the instruction by the dispatch unit are then returned to the available rename buffer pool, where the dispatch unit reuses them as subsequent instructions are dispatched. In-order completion ensures program integrity and the correct architectural state when the 750GX must recover from a mispredicted branch or any exception.

For a more detailed discussion of instruction completion, see Section 6.6.1, Branch, Dispatch, and Completion-Unit Resource Requirements, on page 237.
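In-order retirement from the bottom of the completion queue can be sketched in a few lines. The Python example below is a minimal illustration, not the hardware algorithm: the function retire and the [name, finished] entry layout are hypothetical, and pending exceptions and rename-buffer write-back are not modeled.

```python
from collections import deque

MAX_RETIRE = 2  # instructions retired per clock, at most


def retire(completion_queue):
    """Retire up to two finished instructions, oldest first.

    completion_queue is a deque of [name, finished] entries in program
    order. Retirement stops at the first entry that has not finished,
    even if younger entries behind it have.
    """
    retired = []
    while (completion_queue
           and len(retired) < MAX_RETIRE
           and completion_queue[0][1]):
        retired.append(completion_queue.popleft()[0])
    return retired
```

For example, with a queue holding a finished add, an unfinished lwz, and a finished or, only the add retires in the first call; the or must wait until the lwz ahead of it has finished.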

1.2.2 Independent Execution Units

In addition to the BPU, the 750GX has the following five execution units:

Two integer units (IUs)

Floating-point unit (FPU)

Load/store unit (LSU)

System register unit (SRU)

1.2.2.1 Integer Units (IUs)

The integer units, IU1 and IU2, are shown in Figure 1-1 on page 25. IU1 can execute any integer instruction; IU2 can execute any integer instruction except multiplication and division instructions. Each IU has a single-entry reservation station that can receive instructions from the dispatch unit and operands from the GPRs or the rename buffers. The output of the IU is latched in the rename buffer assigned to the instruction by the dispatch unit.

Each IU consists of three single-cycle subunits—a fast adder/comparator, a subunit for logical operations, and a subunit for performing rotates, shifts, and count-leading-zero operations. These subunits handle all 1-cycle arithmetic and logical integer instructions; only one subunit can execute an instruction at a time.

The IU1 has a 32-bit integer multiplier/divider, as well as the adder, shift, and logical units of the IU2. The multiplier supports early exit for operations that do not require full 32 × 32-bit multiplication. Multiply and divide instructions spend several cycles in the execution stage before the results are written to the output rename buffer.

1.2.2.2 Floating-Point Unit (FPU)

The FPU, shown in Figure 1-1 on page 25, is designed as a 3-stage pipelined processing unit, where the first stage is for multiply, the second stage is for add, and the third stage is for normalize. A single-precision multiply/add operation is processed with 1-cycle throughput and 3-cycle latency. (A single-precision instruction spends one cycle in each stage of the FPU.) A double-precision multiply requires two cycles in the multiply stage and one cycle in each additional stage. A double-precision multiply/add has a 2-cycle throughput and a 4-cycle latency. As instructions are dispatched to the FPU reservation station, source operand data can be accessed from the FPRs or from the FPR rename buffers. Results, in turn, are written to the rename buffers and are made available to subsequent instructions. Instructions pass through the reservation station and the pipeline stages in program order. Stalls due to contention for FPRs are minimized by automatic allocation of the six floating-point rename buffers. The completion unit writes the contents of the rename buffer to the appropriate FPR when floating-point instructions are retired.
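The throughput and latency figures above can be checked with a small timing model of the three stages. The Python sketch below is illustrative only: the function fpu_finish_cycles is a hypothetical name, and the model assumes an otherwise idle in-order pipeline with no operand stalls, in which only the multiply stage can take more than one cycle.

```python
def fpu_finish_cycles(multiply_stage_cycles):
    """Return the cycle count at which each instruction leaves the FPU.

    multiply_stage_cycles[i] is the number of cycles instruction i spends
    in the multiply stage: 1 for single-precision, 2 for a double-precision
    multiply or multiply/add. The add and normalize stages take one cycle.
    """
    stage_free = [0, 0, 0]  # first cycle at which each stage is free
    finish = []
    for mul_cycles in multiply_stage_cycles:
        t = 0
        for stage, cycles in enumerate((mul_cycles, 1, 1)):
            start = max(t, stage_free[stage])
            t = start + cycles
            stage_free[stage] = t
        finish.append(t)
    return finish
```

Three back-to-back single-precision operations finish after 3, 4, and 5 cycles (3-cycle latency, one result per cycle), while two back-to-back double-precision multiply/adds finish after 4 and 6 cycles (4-cycle latency, one result every two cycles).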

The 750GX supports all IEEE 754-1985 floating-point data types (normalized, denormalized, not a number (NaN), zero, and infinity) in hardware, eliminating the latency incurred by software exception routines. (Note that “exception” is also referred to as “interrupt” in the architecture specification.)


1.2.2.3 Load/Store Unit (LSU)

The LSU executes all load-and-store instructions and provides the data-transfer interface between the GPRs, FPRs, and the data-cache/memory subsystem. The LSU functions as a 2-stage pipelined unit, which calculates effective addresses in the first stage. In the second stage, the address is translated, the cache is accessed, and the data is aligned if necessary. Unless extensive data alignment is required (for example, to cross a double-word boundary), the instructions complete in two cycles with a 1-cycle throughput. The LSU also provides sequencing for load/store string and multiple register transfer instructions.

Load-and-store instructions are translated and issued in program order. However, some memory accesses can occur out of order. Synchronizing instructions can be used to enforce strict ordering if necessary. When there are no data dependencies and the guard bit for the page or block is cleared, a maximum of one out-of-order cacheable load operation can execute per cycle, with a 2-cycle total latency on a cache hit. Data returned from the cache is held in a rename buffer until the completion logic commits the value to a GPR or FPR. Stores cannot be executed out of order and are held in the store queue until the completion logic signals that the store operation is to be completed to memory. The 750GX executes store instructions with a maximum throughput of one per cycle and a 3-cycle latency to the data cache. The time required to perform the actual load or store operation depends on the processor/bus clock ratio and whether the operation involves the L1 cache, the L2 cache, system memory, or an I/O device.

The LSU has two reservation stations, Eib0 and Eib1. For loads, there is also a hold queue and a miss queue. A load that misses in the data cache advances from Eib0 to the miss queue, where only the state necessary for instruction completion, such as the instruction ID and register rename ID, is stored. If another load misses under an outstanding miss, it is held in the hold queue and Eib0 is freed. Two more load instructions can then be dispatched to Eib0 and Eib1. The miss-under-miss feature allows the hold, Eib0, and Eib1 load requests to proceed out to the bus, even though there is an outstanding miss that would normally stall the pending loads.

1.2.2.4 System Register Unit (SRU)

The SRU executes various system-level instructions, as well as Condition Register logical operations and Move-to/Move-from Special-Purpose Register instructions. To maintain system state, most instructions executed by the SRU are execution-serialized with other instructions; that is, the instruction is held for execution in the SRU until all previously issued instructions have been retired. Results from execution-serialized instructions executed by the SRU are not available or forwarded for subsequent instructions until the instruction completes.

1.2.3 Memory Management Units (MMUs)

The 750GX’s MMUs support up to 4 petabytes (2^52 bytes) of virtual memory and 4 gigabytes (2^32 bytes) of physical memory for instructions and data. The MMUs also control access privileges for these spaces on block and page granularities. Referenced and changed status is maintained by the processor for each page to support demand-paged virtual memory systems.

The LSU, with the aid of the MMU, translates effective addresses for data loads and stores. The effective address is calculated on the first cycle, and the MMU translates it to a physical address at the same time it is accessing the L1 cache on the second cycle. The MMU also provides the necessary control and protection information to complete the access. By the end of the second cycle, the data and control information is available if no miss conditions for translate and cache access were encountered. This yields a 1-cycle throughput and a 2-cycle latency.


The 750GX supports the following types of memory translation:

Real-addressing mode: Translation is disabled (control bit MSR[IR] = 0 for instructions and control bit MSR[DR] = 0 for data). The effective address is used as the physical address to access memory.

Virtual-page-address translation: Translates from an effective address to a physical address by using the Segment Registers and the TLB, and accesses data from a 4-KB virtual page. This page is either in physical memory or on disk. If the latter, a page-fault exception occurs.

Block-address translation: Translates the effective address into a physical address by using the BAT Registers and accesses a block (128 KB to 256 MB) in memory.

If translation is enabled, the appropriate MMU translates the higher-order bits of the effective address into physical address bits by using either BATs or the page translation method. The lower-order address bits, which are untranslated and therefore considered both logical and physical, are directed to the L1 caches where they form the index into the 8-way set-associative tag and data arrays. After translating the address, the MMU passes the higher-order physical address bits to the cache, and the cache lookup completes. For caching-inhibited accesses or accesses that miss in the cache, the untranslated lower-order address bits are concatenated with the translated higher-order address bits. The resulting 32-bit physical address is used to access the L2 cache or system memory via the 60x bus.
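The reason the cache index can come entirely from untranslated bits follows from the cache geometry. The Python sketch below works through the arithmetic using the 32-KB, 8-way, 32-byte-block parameters listed in Section 1.2; the function split_address and the constant names are hypothetical, and the model treats an address as a plain integer with bit 0 as the least significant bit (the manual numbers bits in the opposite direction).

```python
CACHE_BYTES = 32 * 1024
WAYS = 8
BLOCK_BYTES = 32
SETS = CACHE_BYTES // (WAYS * BLOCK_BYTES)  # 128 sets


def split_address(address):
    """Split a 32-bit address into (tag, set_index, byte_offset)."""
    byte_offset = address % BLOCK_BYTES            # low 5 bits
    set_index = (address // BLOCK_BYTES) % SETS    # next 7 bits
    tag = address // (BLOCK_BYTES * SETS)          # remaining 20 bits
    return tag, set_index, byte_offset
```

The byte offset and set index together cover 5 + 7 = 12 bits, which is exactly the offset within a 4-KB page. Those bits are the same in the effective and physical address, so the set can be selected while the MMU is still translating the upper 20 bits that form the tag.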

If the BAT Registers are enabled and the address translates via this method, the page translation is canceled and the high-order physical address bits from the BAT Register are forwarded to the cache/memory access system. There are eight 8-byte BAT Registers, which function like an associative memory. These registers provide cache-control and protection information as well as address translation. Only one of the eight BAT entries should translate a given effective address.
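Block-address translation can be modeled as a match against a set of aligned, power-of-two-sized blocks. The Python sketch below is a simplified illustration: the BatEntry fields and the function bat_translate are hypothetical and do not reflect the actual BAT Register encoding, valid bits, privilege checks, or protection bits. The hardware compares all entries simultaneously; the loop here is only a sequential stand-in for that comparison.

```python
from typing import NamedTuple, Optional, Sequence


class BatEntry(NamedTuple):
    effective_base: int  # block-aligned effective start address
    physical_base: int   # block-aligned physical start address
    size: int            # block size in bytes, a power of two


def bat_translate(ea: int, entries: Sequence[BatEntry]) -> Optional[int]:
    """Return the physical address for ea, or None if no block matches."""
    for entry in entries:
        offset_mask = entry.size - 1
        if (ea & ~offset_mask & 0xFFFFFFFF) == entry.effective_base:
            # High-order bits come from the entry, low-order bits from the EA.
            return entry.physical_base | (ea & offset_mask)
    return None
```

A return value of None corresponds to the case in which the effective address does not translate via the BAT method and page translation is used instead.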

If address relocation is enabled and the effective address does not translate via the BAT method, the virtual-page method is used. The four high-order bits of the effective address are used to access the 16-entry Segment Register array. From this array, a 24-bit Segment Register is accessed and used to form the high-order bits of a 52-bit virtual address. The low-order 28 bits of the effective address are used to form the low-order bits of the virtual address. This 52-bit virtual address is translated into a physical address by doing a lookup in the TLB. If the lookup is successful, a physical address is formed by using the 12 low-order bits from the virtual address (the byte offset within the 4-KB page) and 20 high-order bits from the TLB. The TLB also provides cache-control and protection information to be used by the cache/memory system.
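As a rough illustration of how the 52-bit virtual address is assembled, the following sketch selects a Segment Register with the four high-order effective-address bits and concatenates its 24-bit value with the low-order 28 bits. The function name and the list-based Segment Register array are hypothetical; the TLB lookup itself is not modeled.

```python
def virtual_address(ea, segment_registers):
    """Form the 52-bit virtual address for a 32-bit effective address.

    segment_registers is a hypothetical list of 16 integers standing in
    for the Segment Register array; only the low 24 bits of each are used.
    """
    sr_index = (ea >> 28) & 0xF                 # four high-order EA bits
    segment_value = segment_registers[sr_index] & 0xFF_FFFF   # 24-bit value
    low_28 = ea & 0x0FFF_FFFF                   # low-order 28 EA bits
    return sr_index, (segment_value << 28) | low_28


srs = [0] * 16
srs[8] = 0xABCDEF
index, va = virtual_address(0x8123_4567, srs)
print(index, hex(va))
```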

TLBs are 128-entry, 2-way, set-associative caches that contain information about recently translated virtual addresses. When an address translation is not in a TLB, the 750GX automatically generates a page table search in memory to update the TLB. This search could find the desired entry in the L1 or L2 cache or in the page table in memory. The time to reload a TLB entry depends on where it is found; it could be completed in just a few cycles. If memory is searched, a maximum of 16 bus cycles would be needed before a page-fault exception is signaled.

1.2.4 On-Chip Level 1 Instruction and Data Caches

The 750GX implements separate instruction and data caches. Each cache is 32 KB and 8-way set-associative. The caches are physically indexed. Each cache block contains eight contiguous words from memory that are loaded from an 8-word boundary (bits EA[27–31] are zeros); thus, a cache block never crosses a page boundary. A miss in the L1 cache causes a block reload from either the L2 cache, if the block is in the L2 cache, or from main memory. The critical double word is accessed first, forwarded to the load/store unit, and written into an 8-word buffer. Subsequent double words are fetched from either the L2 cache or the system memory and written into the buffer. Once the total block is in the buffer, the line is written into the L1 cache in a single cycle. This minimizes write cycles into the L1 cache, leaving more read/write cycles available to the LSU. The L1 is nonblocking and supports hits under misses during this block reload sequence. Misaligned accesses across a block or page boundary can incur a performance penalty. The 750GX L1 data cache supports miss-under-miss access, meaning that with one miss outstanding, the cache can continue to be accessed for up to three more misses. The 750GX L1 data cache also allows the additional misses to initiate a transaction in the bus interface unit, while the first miss is pending.

The 750GX L1 cache organization is shown in Figure 1-2, L1 Cache Organization.

Figure 1-2. L1 Cache Organization

[Figure: each cache has 128 sets. Each set holds eight ways (Way 0 through Way 7), and each way consists of an address tag, state bits, and one 8-word block (Words [0–7]).]
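The 128-set figure follows from the stated cache parameters, and the same arithmetic shows how an address divides into tag, set index, and byte offset. The sketch below is illustrative; the function name `l1_decompose` and the constant names are hypothetical.

```python
CACHE_BYTES = 32 * 1024   # 32-KB cache
WAYS = 8                  # 8-way set-associative
BLOCK_BYTES = 32          # eight 4-byte words per block

SETS = CACHE_BYTES // (WAYS * BLOCK_BYTES)     # 128 sets
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1     # 5 bits select a byte in the block
INDEX_BITS = SETS.bit_length() - 1             # 7 bits select the set


def l1_decompose(addr):
    """Split a 32-bit address into (tag, set index, byte offset)."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


print(SETS, OFFSET_BITS + INDEX_BITS)
print(l1_decompose(0x0001_2345))
```

The index and offset together occupy 12 bits, which matches the 4-KB page offset. That is why the untranslated low-order address bits are enough to start the set lookup while the MMU translates the high-order bits.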

The data cache provides double-word accesses to the LSU each cycle. Like the instruction cache, the data cache can be invalidated all at once or on a per-cache-block basis. The data cache can be disabled and invalidated by clearing the data-cache enable bit (HID0[DCE]) and setting the data-cache flash invalidate bit (HID0[DCFI]). The data cache can be locked by setting HID0[DLOCK]. To ensure cache coherency, the data cache supports the 3-state MEI protocol. The data-cache tags are single-ported, so a simultaneous load or store and a snoop access represent a resource collision, and an LSU access is delayed for one cycle. If a snoop hit occurs and a castout is required, the LSU is blocked internally for one cycle to allow the 8-word block of data to be copied to the write-back buffer.

The instruction cache provides up to four instructions to the instruction queue in a single cycle. Like the data cache, the instruction cache can be invalidated all at once or on a cache-block basis. The instruction cache can be disabled and invalidated by clearing the instruction-cache enable bit (HID0[ICE]) and setting the instruction-cache flash invalidate bit (HID0[ICFI]). The instruction cache can be locked by setting HID0[ILOCK]. The instruction cache supports only the valid and invalid states, and requires software to maintain coherency if the underlying program changes.
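The HID0 updates described for the two caches amount to clearing one bit and setting another. The sketch below models that on a plain integer. The bit positions are an assumption taken from the HID0 layout commonly used across the 750 family (ICE = bit 16, DCE = bit 17, ILOCK = bit 18, DLOCK = bit 19, ICFI = bit 20, DCFI = bit 21, with bit 0 as the most significant bit); this section does not state them, so confirm them against the HID0 description before relying on them. The function names are hypothetical.

```python
def hid0_bit(n):
    """Mask for HID0 bit n, using PowerPC numbering (bit 0 is the MSB)."""
    return 1 << (31 - n)


# Assumed bit positions; verify against the HID0 register description.
HID0_ICE = hid0_bit(16)
HID0_DCE = hid0_bit(17)
HID0_ILOCK = hid0_bit(18)
HID0_DLOCK = hid0_bit(19)
HID0_ICFI = hid0_bit(20)
HID0_DCFI = hid0_bit(21)


def disable_and_invalidate_dcache(hid0):
    """Clear HID0[DCE] and set HID0[DCFI]."""
    return ((hid0 & ~HID0_DCE) | HID0_DCFI) & 0xFFFF_FFFF


def disable_and_invalidate_icache(hid0):
    """Clear HID0[ICE] and set HID0[ICFI]."""
    return ((hid0 & ~HID0_ICE) | HID0_ICFI) & 0xFFFF_FFFF


def lock_dcache(hid0):
    """Set HID0[DLOCK]."""
    return (hid0 | HID0_DLOCK) & 0xFFFF_FFFF


hid0 = HID0_ICE | HID0_DCE
print(hex(disable_and_invalidate_dcache(hid0)))
```

On real hardware the value would be written with mtspr and followed by the synchronization the manual requires; the model only shows the bit arithmetic.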

The 750GX also implements a 64-entry (16-set, 4-way set-associative) branch target instruction cache (BTIC). The BTIC is a cache of branch instructions that have been encountered in branch/loop code sequences. If the target instruction is in the BTIC, it is fetched into the instruction queue a cycle sooner than it can be made available from the instruction cache. Typically, the BTIC contains the first two instructions in the target stream. The BTIC can be disabled and invalidated through software.

Coherency of the BTIC is transparent to the running software and is coupled with various functions in the 750GX processor. When the BTIC is enabled and loaded with instruction pairs to support zero-cycle delay on branches taken, the table must be invalidated if the underlying program changes. (This is also true for the instruction cache.) The BTIC is invalidated on an instruction-cache flash invalidate, an icbi or rfi instruction, and any exception.

For more information and timing examples showing cache hit and cache miss latencies, see Section 6.3.2, Instruction Fetch Timing, on page 216.

1.2.5 On-Chip Level 2 Cache Implementation

The L2 cache is a unified cache that receives memory requests from both the L1 instruction and data caches independently. The L2 cache is implemented with an L2 Cache Control Register (L2CR), an on-chip, 4-way, set-associative tag array, and with a 1-MB, integrated SRAM for data storage. The L2 cache normally operates in write-back mode and supports cache coherency through snooping. The access interface to the L2 is 64 bits for writes and requires four cycles to write a single cache block. The access interface to the L2 is 256 bits for reads and requires one cycle to read a single cache block. The L2 uses ECC on a double word, corrects most single-bit errors, and detects the remaining single-bit errors and all double-bit errors. See Figure 9-1, L2 Cache, on page 327.

The L2 cache is organized with 64-byte lines, which in turn are subdivided into 32-byte blocks, the unit at which cache coherency is maintained. This reduces the size of the tag array, and one tag supports two cache blocks. Each 32-byte cache block has its own valid and modified status bits. When a cache line is removed, the contents of both blocks and the tag are removed from the L2 cache. The cache block is only written to system memory if the modified bit is set.
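The line and block organization can be checked with a little arithmetic. The sketch below derives the number of sets from the stated sizes (1 MB, 4-way, 64-byte lines) and models which blocks of an evicted line need a write to memory. The names `l2_decompose`, `L2Block`, and `castout_blocks` are hypothetical, and the bit split assumes a conventional power-of-two index, which this section does not spell out.

```python
from dataclasses import dataclass

L2_BYTES = 1024 * 1024   # 1-MB data SRAM
L2_WAYS = 4              # 4-way set-associative tags
LINE_BYTES = 64          # one tag per 64-byte line
BLOCK_BYTES = 32         # two 32-byte blocks per line

L2_SETS = L2_BYTES // (L2_WAYS * LINE_BYTES)   # 4096 sets


def l2_decompose(addr):
    """Split an address into (tag, set index, block within the line)."""
    block = (addr >> 5) & 0x1                  # which 32-byte block of the line
    index = (addr >> 6) & (L2_SETS - 1)        # 12-bit set index
    tag = addr >> 18                           # remaining high-order bits
    return tag, index, block


@dataclass
class L2Block:
    valid: bool = False
    modified: bool = False


def castout_blocks(line):
    """Return the block numbers that must be written to memory on eviction."""
    return [i for i, b in enumerate(line) if b.valid and b.modified]


print(L2_SETS, l2_decompose(0x0012_3460))
print(castout_blocks([L2Block(True, False), L2Block(True, True)]))
```

Sharing one tag between two blocks halves the number of tags, while the per-block valid and modified bits keep coherency and write-back decisions at 32-byte granularity.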

Requests from the L1 cache generally result from instruction misses, data load or store misses, write-through operations, or cache-management instructions. Misses from the L1 cache are looked up in the L2 tags and serviced by the L2 cache if they hit; they are forwarded to the 60x bus interface if they miss.

The L2 cache can accept multiple, simultaneous accesses. However, they are serialized and processed one per cycle. The L1 instruction cache can request an instruction at the same time that the L1 data cache requests one load and two store operations. The L2 cache also services snoop requests from the bus. If there are multiple pending requests to the L2 cache, snoop requests have highest priority. Load-and-store requests from the L1 data cache have the next highest priority. The last priority consists of instruction fetch requests from the L1 instruction cache.
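The priority order among pending L2 requests can be modeled as a fixed-priority selection that serves one request per step. This is a behavioral sketch only; the queue names and the `next_request` and `drain` functions are hypothetical and say nothing about how the hardware implements arbitration.

```python
# Highest priority first: bus snoops, then L1 data cache, then L1 instruction cache.
PRIORITY = ("snoop", "data", "instruction")


def next_request(pending):
    """Pop and return (source, request) for the highest-priority pending request."""
    for source in PRIORITY:
        queue = pending.get(source)
        if queue:
            return source, queue.pop(0)
    return None


def drain(pending):
    """Serve all pending requests one at a time and return the service order."""
    order = []
    while (item := next_request(pending)) is not None:
        order.append(item)
    return order


pending = {
    "instruction": ["ifetch"],
    "data": ["load", "store0", "store1"],
    "snoop": ["snoop0"],
}
for source, request in drain(pending):
    print(source, request)
```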

1.2.6 System Interface/Bus Interface Unit (BIU)

The PowerPC 750GX uses a reduced system signal set, which eliminates some optional 60x bus protocol pins. The system designer needs to make note of these differences.

The address and data buses operate independently. Address and data tenures of a memory access are decoupled to provide more flexible control of bus traffic. The primary activity of the system interface is transferring data and instructions between the processor and system memory. There are two types of memory accesses:

Single-beat transfers: Allow transfer sizes of 8, 16, 24, 32, or 64 bits in one bus clock cycle. Single-beat transactions are caused by uncacheable read and write operations that access memory directly when caches are disabled, for cache-inhibited accesses, and for stores in write-through mode. The two latter accesses are defined by control bits provided by the MMU during address translation.

4-beat burst (32-byte) data transfers: Burst transactions, which always transfer an entire cache block (32 bytes), are initiated when an entire cache block is transferred. If the caches on the 750GX are enabled and using write-back mode, burst-read operations are the most common memory accesses, followed by burst-write memory operations.
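The two access types above can be summarized in a small classifier. This is a simplified sketch: the function name and its parameters are hypothetical, and the 8-byte beat width is inferred from a 32-byte block moving in four beats.

```python
DATA_BUS_BYTES = 8                      # inferred: 32 bytes in 4 beats
CACHE_BLOCK_BYTES = 32
SINGLE_BEAT_BITS = (8, 16, 24, 32, 64)  # sizes listed for single-beat transfers


def bus_transfer(size_bytes, cacheable, write_through_store=False):
    """Return (kind, beats) for one memory access.

    cacheable is False when caches are disabled or the access is
    caching-inhibited; write_through_store marks a store in write-through mode.
    """
    if cacheable and not write_through_store:
        # Whole cache block moves as a 4-beat burst.
        return ("burst", CACHE_BLOCK_BYTES // DATA_BUS_BYTES)
    if size_bytes * 8 not in SINGLE_BEAT_BITS:
        raise ValueError(f"unsupported single-beat size: {size_bytes} bytes")
    return ("single-beat", 1)


print(bus_transfer(4, cacheable=True))
print(bus_transfer(1, cacheable=False))
print(bus_transfer(4, cacheable=True, write_through_store=True))
```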

The 750GX also supports address-only operations, which are variants of the burst and single-beat operations (for example, atomic memory operations and global memory operations that are snooped), and address retry activity (for example, when a snooped read access hits a modified block in the cache). The broadcast of some address-only operations is controlled through the address broadcast enable bit (HID0[ABE]). I/O accesses use the same protocol as memory accesses.

Access to the system interface is granted through an external arbitration mechanism that allows devices to compete for bus mastership. This arbitration mechanism is flexible, allowing the 750GX to be integrated into systems that implement various fairness and bus-parking procedures to avoid arbitration overhead.

Typically, memory accesses are weakly ordered—sequences of operations, including load/store string and multiple instructions, do not necessarily complete in the order they begin. This maximizes the efficiency of the bus without sacrificing data coherency. The 750GX allows read operations to go ahead of store operations except when a dependency exists, or when a noncacheable access is performed. It also allows a write operation to go ahead of a previously queued read data tenure (for example, letting a snoop push be enveloped between address and data tenures of a read operation). Because the 750GX can dynamically optimize runtime ordering of load/store traffic, overall performance is improved.

The system interface is specific for each PowerPC microprocessor implementation.

The 750GX signals are grouped as shown in Figure 1-3, System Interface. Test and control signals provide diagnostics for selected internal circuits.

Figure 1-3. System Interface

[Figure: block diagram of the 750GX and its signal groups: Address Arbitration, Address Start, Address Transfer, Transfer Attribute, Address Termination, Clocks, Data Arbitration, Data Transfer, Data Termination, Test and Control, Interrupt, Processor Status/Control, VDD, and VDD (I/O).]

The system interface supports address pipelining, which allows the address tenure of one transaction to overlap the data tenure of another. The 750GX can support up to five outstanding transactions on the bus, including up to one snoop copyback, up to four loads, and up to four stores. The extent of the pipelining depends on external arbitration and control circuitry. Similarly, the 750GX supports split-bus transactions for systems with multiple potential bus masters—one device can be master of the address bus while another is master of the data bus. Allowing multiple bus transactions to occur simultaneously increases the available bus bandwidth for other activity.

The 750GX’s clocking structure supports a wide range of processor-to-bus clock ratios.

1.2.7 Signals

The 750GX’s signals are grouped as follows:

Address arbitration: The 750GX uses these signals to arbitrate for address-bus mastership.

Address start: This signal indicates that a bus master has begun a transaction on the address bus.

Address transfer: These signals include the address bus and are used to transfer the address.

Transfer attribute: These signals provide information about the type of transfer, such as the transfer size and whether the transaction is burst, write-through, or caching-inhibited.

Address termination: These signals are used to acknowledge the end of the address phase of the transaction. They also indicate whether a condition exists that requires the address phase to be repeated.

Data arbitration: The 750GX uses these signals to arbitrate for data-bus mastership.

Data transfer: These signals include the data bus and are used to transfer the data.

Data termination: These signals are required after each data beat in a data transfer. In a single-beat transaction, a data termination signal also indicates the end of the tenure. In burst accesses, data termination signals apply to individual beats and indicate the end of the tenure only after the final data beat.

Interrupt: These signals include the interrupt signal, checkstop signals, and both soft reset and hard reset signals. These signals are used to generate interrupt exceptions and, under various conditions, to reset the processor.

Processor status/control: These signals are used to indicate miscellaneous bus functions.

Clocks: These signals determine the system clock frequency. These signals can also be used to synchronize multiprocessor systems.

Test and control: The common on-chip processor (COP) unit provides a serial interface to the system for performing board-level boundary scan interconnect tests.

Note: A bar over a signal name indicates that the signal is active low—for example, ARTRY (address retry) and TS (transfer start). Active-low signals are referred to as asserted (active) when they are low and as negated when they are high. Signals that are not active low, such as A[0–31] (address-bus signals) and TT[0–4] (transfer type signals) are referred to as asserted when they are high and as negated when they are low.
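The asserted/negated convention in the note can be captured in a few lines. The sketch is illustrative; the set of active-low names contains only the two examples the note gives, and the function name is hypothetical.

```python
# Only the two active-low examples named in the note; a real list would be longer.
ACTIVE_LOW = {"ARTRY", "TS"}


def is_asserted(signal, level):
    """Return True if the signal is asserted at the given logic level (0 or 1)."""
    if level not in (0, 1):
        raise ValueError("level must be 0 or 1")
    if signal in ACTIVE_LOW:
        return level == 0      # active-low: asserted when low
    return level == 1          # active-high: asserted when high


print(is_asserted("TS", 0), is_asserted("TT[0-4]", 0))
```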

1.2.8 Signal Configuration

Figure 1-4 shows the 750GX’s logical pin configuration. The signals are grouped by function.

Figure 1-4. 750GX Microprocessor Signal Groups

[Figure: logical pin diagram of the 750GX. The signal names recoverable from the figure, by group, are listed below; bus widths appear as bracketed ranges.]

Address arbitration: BR, BG, ABB

Address start, address transfer, and transfer attribute: TS, A[0:31], AP[0:3], TT[0:4], TBST, TSIZ[0:2], GBL, WT, CI

Address termination: AACK, ARTRY

Data arbitration: DBG, DBWO, DBB

Data transfer: D[0:63], DP[0:7], DBDIS

Data termination: TA, DRTRY, TEA

Interrupts/resets: INT, SMI, MCP, SRESET, HRESET

Processor status/control: RSRVR, TBEN, TLBISYNC, QREQ, QACK, CKSTP_IN, CKSTP_OUT

Clock control: SYSCLK, PLL_CFG[0:4], CLK_OUT, PLL_RNG[0:1]

Test interface: JTAG/COP, factory test

Signal functionality is described in detail in Chapter 7, Signal Descriptions, on page 249 and Chapter 8, Bus Interface Operation, on page 279.

Note: See the PowerPC 750GX Datasheet for a complete list of signal pins.

1.2.9 Clocking

The 750GX requires a single system clock input, SYSCLK, that represents the bus interface frequency. Internally, the processor uses a phase-locked loop (PLL) circuit to generate a master core clock that is frequency-multiplied and phase-locked to the SYSCLK input. This core frequency is used to operate the internal circuitry.

The PLL is configured by the PLL_CFG[0:4] signals, which select the multiplier that the PLL uses to multiply the SYSCLK frequency up to the internal core frequency. In addition, the 750GX has two PLL_RNG bits that set the proper operation frequency range. The feedback in the PLL guarantees that the processor clock is phase locked to the bus clock, regardless of process variations, temperature changes, or parasitic capacitances.

The PLL also ensures a 50% duty cycle for the processor clock.

The 750GX supports various processor-to-bus clock frequency ratios, although not all ratios are available for all frequencies. Configuration of the processor/bus clock ratios is displayed through a 750GX-specific register, HID1. For information about supported clock frequencies, see the PowerPC 750GX Datasheet.
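The relationship between SYSCLK and the core clock is a simple multiplication by the selected ratio. The sketch below shows the arithmetic only. The ratios and frequencies used are hypothetical examples, not values from the datasheet, and the PLL_CFG[0:4] encoding that selects a ratio is not modeled.

```python
from fractions import Fraction


def core_frequency_hz(sysclk_hz, ratio):
    """Core clock = SYSCLK multiplied by the processor-to-bus ratio.

    ratio may be an int, a Fraction, or a decimal string such as "7.5".
    """
    return sysclk_hz * Fraction(ratio)


# Hypothetical examples; supported ratios and ranges come from the datasheet.
print(core_frequency_hz(100_000_000, 10))
print(core_frequency_hz(133_000_000, "7.5"))
```

Using `Fraction` keeps half-step ratios exact, which avoids floating-point rounding when comparing a computed frequency against a limit.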

1.3 750GX Microprocessor Implementation

The PowerPC Architecture is derived from the Performance Optimized with Enhanced RISC (POWER™) architecture. The PowerPC Architecture shares the benefits of the POWER architecture optimized for single-chip implementations. The PowerPC Architecture design facilitates parallel instruction execution, and is scalable to take advantage of future technological gains.

The remainder of this chapter describes the PowerPC Architecture in general, and specific details about the implementation of 750GX as a low-power, 32-bit member of the PowerPC processor family. The structure of the remainder of this chapter reflects the organization of the user’s manual; each section provides an overview of the corresponding chapter. The following sections summarize the features of the 750GX, distinguishing those that are defined by the architecture from those that are unique to the 750GX implementation.

Registers and programming model: Section 1.4, PowerPC Registers and Programming Model, on page 42 describes the registers for the operating environment architecture common among PowerPC processors and describes the programming model. It also describes the registers that are unique to the 750GX. The information in this section is described more fully in Chapter 2, Programming Model, on page 57.

Instruction set and addressing modes: Section 1.5, Instruction Set, on page 45 describes the PowerPC instruction set and addressing modes for the PowerPC operating environment architecture, defines the PowerPC instructions implemented in the 750GX, and describes new instruction set extensions to improve the performance of single-precision floating-point operations and the capability of data transfer. The information in this section is described more fully in Section 2.3, Instruction Set Summary, on page 86.

Cache implementation: Section 1.6, On-Chip Cache Implementation, on page 47 describes the cache model that is defined generally for PowerPC processors by the virtual environment architecture. It also provides specific details about the 750GX L2 cache implementation. The information in this section is described more fully in Chapter 3, Instruction-Cache and Data-Cache Operation, on page 121.

Exception model: Section 1.7, Exception Model, on page 48 describes the exception model of the PowerPC operating environment architecture and the differences in the 750GX exception model. The information in this section is described more fully in Chapter 4, Exceptions, on page 151.

Memory management: Section 1.8, Memory Management, on page 51 describes in general terms the conventions for memory management among the PowerPC processors. This section also describes the 750GX’s implementation of the 32-bit PowerPC memory-management specification. The information in this section is described more fully in Chapter 5, Memory Management, on page 179.

Instruction timing: Section 1.9, Instruction Timing, on page 52 provides a general description of the instruction timing provided by the superscalar, parallel execution supported by the PowerPC Architecture and the 750GX. The information in this section is described in more detail in Chapter 6, Instruction Timing, on page 209.

Power management: Section 1.10, Power Management, on page 54 describes how power management can be used to reduce power consumption when the processor, or portions of it, are idle. The information in this section is described more fully in Chapter 10, Power and Thermal Management, on page 335.

Thermal management: Section 1.11, Thermal Management, on page 55 describes how the thermal-management unit and its associated registers (THRM1–THRM4) and exception processing can be used to manage system activity in a way that prevents exceeding system and junction temperature thresholds. This is particularly useful in high-performance portable systems, which cannot use the same cooling mechanisms (such as fans) that control overheating in desktop systems. The information in this section is described more fully in Chapter 10, Power and Thermal Management, on page 335.

Performance monitor: Section 1.12, Performance Monitor, on page 56 describes the performance-monitor facility, which system designers can use to help bring up, debug, and optimize software performance. The information in this section is described more fully in Chapter 11, Performance Monitor and System Related Features, on page 349.

The PowerPC Architecture consists of the following layers, and adherence to the PowerPC Architecture can be described in terms of which of the following levels of the architecture is implemented.

PowerPC user instruction set architecture (UISA): Defines the base user-level instruction set, user-level registers, data types, floating-point exception model, memory models for a uniprocessor environment, and programming model for a uniprocessor environment.

PowerPC virtual environment architecture (VEA): Describes the memory model for a multiprocessor environment, defines cache-control instructions, and describes other aspects of virtual environments. Implementations that conform to the VEA also adhere to the UISA, but might not necessarily adhere to the OEA.

PowerPC operating environment architecture (OEA): Defines the memory-management model, supervisor-level registers, synchronization requirements, and the exception model. Implementations that conform to the OEA also adhere to the UISA and the VEA.

1.4 PowerPC Registers and Programming Model

The PowerPC Architecture defines register-to-register operations for most computational instructions. Source operands for these instructions are accessed from the registers or are provided as immediate values embedded in the instruction itself. The 3-register instruction format allows specification of a target register distinct from the two source operands. Only load-and-store instructions transfer data between registers and memory.

PowerPC processors have two levels of privilege: supervisor mode and user mode. The supervisor mode of operation is typically used by the operating system. The user mode of operation, also called the problem state, is typically used by the application software. The programming models incorporate 32 GPRs, 32 FPRs, Special-Purpose Registers (SPRs), and several miscellaneous registers. Each PowerPC microprocessor also has its own unique set of Hardware-Implementation-Dependent (HID) Registers.

While running in supervisor mode, the operating system is able to execute all instructions and access all registers defined in the PowerPC Architecture. In this mode, the operating system establishes all address translations and protection mechanisms, loads all Processor State Registers, and sets up all other control mechanisms defined in the PowerPC 750GX processor. While running in user mode (problem state), many of these registers and facilities are not accessible, and any attempt to read or write these registers results in a program exception.
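The privilege check can be pictured with a toy model in which a register read in user mode raises an exception for supervisor-level registers. This is a behavioral sketch under stated assumptions: the MSR[PR] mask assumes the usual 32-bit PowerPC position (bit 17, with 1 meaning user mode), the register set is a small hypothetical sample, and `ProgramException` and `read_register` are made-up names.

```python
MSR_PR = 0x0000_4000   # assumed position: MSR bit 17; 1 = user mode (problem state)

# A small hypothetical sample of supervisor-level registers.
SUPERVISOR_ONLY = {"SDR1", "SRR0", "SRR1", "DEC", "PVR"}


class ProgramException(Exception):
    """Stands in for the program exception taken on a privilege violation."""


def read_register(msr, name, regs):
    """Return a register value, enforcing the privilege level in msr."""
    user_mode = bool(msr & MSR_PR)
    if user_mode and name in SUPERVISOR_ONLY:
        raise ProgramException(f"{name} is not accessible in user mode")
    return regs[name]


regs = {"SRR0": 0x100, "LR": 0x200}
print(hex(read_register(0, "SRR0", regs)))        # supervisor mode: allowed
print(hex(read_register(MSR_PR, "LR", regs)))     # user mode, user-level register
```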

Figure 2-1, PowerPC 750GX Microprocessor Programming Model—Registers, on page 58 shows all the 750GX registers available at the user and supervisor levels. The numbers to the right of the SPRs indicate the number that is used in the syntax of the instruction operands to access the register. For more information, see Chapter 2, Programming Model, on page 57.

The following tables summarize the PowerPC registers implemented in 750GX, and describe registers (excluding SPRs) defined by the architecture.

Table 1-1. Architecture-Defined Registers (Excluding SPRs)

Register | Level | Function
CR | User | The Condition Register (CR) consists of eight 4-bit fields that reflect the results of certain operations, such as move, integer and floating-point compare, arithmetic, and logical instructions. The register provides a mechanism for testing and branching.
FPRs | User | The 32 Floating Point Registers (FPRs) serve as the data source or destination for floating-point instructions. These 64-bit registers can hold single-precision or double-precision floating-point values.
FPSCR | User | The Floating-Point Status and Control Register (FPSCR) contains the floating-point exception signal bits, exception summary bits, exception enable bits, and rounding control bits needed for compliance with the IEEE 754-1985 standard.
GPRs | User | The 32 GPRs contain the address and data arguments addressed from source or destination fields in integer instructions. Also, floating-point load-and-store instructions use GPRs to address memory.
MSR | Supervisor | The Machine State Register (MSR) defines the processor state. Its contents are saved when an exception is taken and restored when exception handling completes. The 750GX implements MSR[POW], defined by the architecture as optional, which is used to enable the power management feature. The 750GX-specific MSR[PM] bit is used to mark a process for the performance monitor.
SR0–SR15 | Supervisor | The sixteen 32-bit Segment Registers (SRs) define the 4-GB space as sixteen 256-MB segments. The 750GX implements Segment Registers as two arrays: a main array for data accesses and a shadow array for instruction accesses (see Figure 1-1 on page 25). Loading a segment entry with the Move-to Segment Register (mtsr) instruction loads both arrays. The mfsr instruction reads the master register, shown as part of the data MMU in Figure 1-1 on page 25.
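As an illustration of the Condition Register layout in Table 1-1, the sketch below extracts one of the eight 4-bit fields from a 32-bit CR value. It assumes the usual PowerPC numbering in which field 0 occupies the four most significant bits, and the LT/GT/EQ/SO bit meanings shown apply to integer compare results; neither detail is stated in this section. The function names are hypothetical.

```python
def cr_field(cr, n):
    """Return 4-bit CR field n (0 to 7); field 0 is the most significant."""
    if not 0 <= n <= 7:
        raise ValueError("CR field number must be 0 to 7")
    return (cr >> (4 * (7 - n))) & 0xF


def decode_compare(field):
    """Name the bits set in a CR field after an integer compare (assumed layout)."""
    names = (("LT", 0x8), ("GT", 0x4), ("EQ", 0x2), ("SO", 0x1))
    return [name for name, mask in names if field & mask]


cr = 0x2000_0004
print(cr_field(cr, 0), decode_compare(cr_field(cr, 0)))
print(cr_field(cr, 7), decode_compare(cr_field(cr, 7)))
```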

 

 

 

 

 

 

 

The OEA defines numerous Special-Purpose Registers that serve a variety of functions, such as providing controls, indicating status, configuring the processor, and performing special operations. During normal execution, a program can access the registers shown in Figure 2-1 on page 58, depending on the program’s access privilege (supervisor or user, determined by the privilege-level (PR) bit in the MSR). GPRs and FPRs are accessed through operands that are defined in the instructions. Access to registers can be explicit (that is, through the use of specific instructions for that purpose, such as the Move-to Special-Purpose Register (mtspr) and Move-from Special-Purpose Register (mfspr) instructions) or implicit, as part of the execution of an instruction. Some registers can be accessed both explicitly and implicitly.

In the 750GX, all SPRs are 32 bits wide. Table 1-2 describes the architecture-defined SPRs implemented by the 750GX. In the PowerPC Microprocessor Family: The Programming Environments Manual, these registers are described in detail, including bit descriptions. Section 2.1.1, Register Set, on page 57 describes how these registers are implemented in the 750GX. In particular, that section describes those features defined as optional in the PowerPC Architecture that are implemented on the 750GX.

Table 1-2. Architecture-Defined SPRs Implemented

Register | Level | Function
LR | User | The Link Register (LR) can be used to provide the branch target address and to hold the return address after branch and link instructions.
BATs | Supervisor | The architecture defines eight Block Address Translation Registers (BATs), each implemented as a pair of 32-bit SPRs. In the 750GX, the BAT facility has been extended to include 16 BATs (32 total SPRs), eight for instruction translation and eight for data translation. BATs are used to define and configure blocks of memory.
CTR | User | The Count Register (CTR) is decremented and tested by branch-and-count instructions.
DABR | Supervisor | The optional Data Address Breakpoint Register (DABR) supports the data address breakpoint facility.
DAR | User | The Data Address Register (DAR) holds the address of an access after an alignment or data-storage interrupt (DSI) exception.
DEC | Supervisor | The Decrementer Register (DEC) is a 32-bit decrementing counter that provides a way to schedule time-delayed exceptions.
DSISR | User | The Data Storage Interrupt Status Register (DSISR) defines the cause of data access and alignment exceptions.
EAR | Supervisor | The External Access Register (EAR) controls access to the external access facility through the External Control In Word Indexed (eciwx) and External Control Out Word Indexed (ecowx) instructions.
PVR | Supervisor | The Processor Version Register (PVR) is a read-only register that identifies the processor version and revision level.
SDR1 | Supervisor | Storage Description Register 1 (SDR1) specifies the page table address and size used in virtual-to-physical page-address translation.
SRR0 | Supervisor | The Machine Status Save/Restore Register 0 (SRR0) saves the address at which an interrupted program restarts when an rfi instruction executes. (Interrupts are also known as exceptions.)
SRR1 | Supervisor | The Machine Status Save/Restore Register 1 (SRR1) is used to save machine status on exceptions and to restore machine status when an rfi instruction is executed.
SPRG0–SPRG3 | Supervisor | The general-purpose SPRs (SPRG0–SPRG3) are provided for operating system use.

 

 

 

 

User: read

The Time Base Register (TB) is a 64-bit register that maintains the time and date vari-

TB

Supervisor:

able. The TB consists of two 32-bit fields—time-base upper (TBU) and time-base lower

 

read/write

(TBL).

 

 

 

 

 

The Integer Exception Register (XER) contains the summary overflow bit, integer carry

XER

User

bit, overflow bit, and a field specifying the number of bytes to be transferred by a Load

 

 

String Word Indexed (lswx) or Store String Word Indexed (stswx) instruction.
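Because the 64-bit TB is read as two 32-bit halves, software commonly rereads TBU to detect a carry out of TBL between the two reads. The following Python sketch models that read loop; the FakeTimeBase class and the function names are hypothetical stand-ins for the hardware reads, and the retry technique is a common convention rather than something this section specifies.

```python
class FakeTimeBase:
    """Hypothetical stand-in for the TB: it advances by one on every read."""

    def __init__(self, value):
        self.value = value

    def read_tbu(self):
        upper = (self.value >> 32) & 0xFFFFFFFF
        self.value += 1
        return upper

    def read_tbl(self):
        lower = self.value & 0xFFFFFFFF
        self.value += 1
        return lower


def read_time_base(read_tbu, read_tbl):
    """Return a consistent 64-bit value assembled from two 32-bit reads."""
    while True:
        upper = read_tbu()
        lower = read_tbl()
        if read_tbu() == upper:  # TBL did not carry into TBU in between
            return (upper << 32) | lower


tb = FakeTimeBase(0x1_FFFF_FFFF)  # TBL is about to wrap
print(hex(read_time_base(tb.read_tbu, tb.read_tbl)))
```

Without the second TBU read, the first attempt in this example would pair the old upper half with the wrapped lower half and return a value that is too small by almost 2^32.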

 

 

 

Table 1-3 describes the SPRs in the 750GX that are not defined by the PowerPC Architecture. Section 2.1.2, PowerPC 750GX-Specific Registers, on page 64 gives detailed descriptions of these registers, including bit descriptions.

Table 1-3. Implementation-Specific Registers

| Register | Level | Function |
|---|---|---|
| HID0 | Supervisor | The Hardware-Implementation-Dependent Register 0 (HID0) provides checkstop enables and other functions. |
| HID1 | Supervisor | The Hardware-Implementation-Dependent Register 1 (HID1) controls the dual PLLs. |
| HID2 | Supervisor | The Hardware-Implementation-Dependent Register 2 (HID2) provides control and status of special cache-related parity functions. |
| IABR | Supervisor | The Instruction Address Breakpoint Register (IABR) supports instruction address breakpoint exceptions. It can hold an address to compare with instruction addresses in the IQ. An address match causes an instruction address breakpoint exception. |
| ICTC | Supervisor | The Instruction Cache-Throttling Control Register (ICTC) has bits for controlling the interval at which instructions are fetched into the instruction buffer in the instruction unit. This helps control the 750GX’s overall junction temperature. |
| L2CR | Supervisor | The L2 Cache Control Register (L2CR) is used to configure and operate the L2 cache. |
| MMCR0–MMCR1 | Supervisor | The Monitor Mode Control Registers (MMCR0–MMCR1) are used to enable various performance monitoring interrupt functions. UMMCR0–UMMCR1 provide user-level read access to MMCR0–MMCR1. |
| PMC1–PMC4 | Supervisor | The Performance-Monitor Counter Registers (PMC1–PMC4) are used to count specified events. UPMC1–UPMC4 provide user-level read access to these registers. |
| SIA | Supervisor | The Sampled Instruction Address Register (SIA) holds the EA of an instruction executing at or around the time the processor signals the performance-monitor interrupt condition. The USIA register provides user-level read access to the SIA. |
| THRM1, THRM2 | Supervisor | THRM1 and THRM2 provide a way to compare the junction temperature against two user-provided thresholds. The thermal assist unit (TAU) can be operated so that the thermal sensor output is compared to only one threshold, selected in THRM1 or THRM2. |
| THRM3 | Supervisor | THRM3 is used to enable the TAU and to control the output sample time. |
| THRM4 | Supervisor | THRM4 provides the temperature offset to junction temperature for accurate operation of the thermal assist unit. |
| UMMCR0–UMMCR1 | User | The User Monitor Mode Control Registers (UMMCR0–UMMCR1) provide user-level read access to MMCR0–MMCR1. |
| UPMC1–UPMC4 | User | The User Performance-Monitor Counter Registers (UPMC1–UPMC4) provide user-level read access to PMC1–PMC4. |
| USIA | User | The User Sampled Instruction Address Register (USIA) provides user-level read access to the SIA register. |


1.5 Instruction Set

All PowerPC instructions are encoded as single-word (32-bit) instructions. Instruction formats are consistent among all instruction types (the primary operation code is always 6 bits, register operands are always specified in the same bit fields in the instruction), permitting efficient decoding to occur in parallel with operand accesses. This fixed instruction length and consistent format greatly simplify instruction pipelining.

For more information, see Chapter 2, Programming Model, on page 57.
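The fixed field positions described above can be illustrated with a small Python sketch that extracts the 6-bit primary opcode and the 5-bit register fields from an instruction word. The function name is hypothetical, and the two sample words (add r3,r4,r5 and lwz r3,8(r4)) were encoded by hand from the PowerPC Architecture, not taken from this manual.

```python
def decode_fields(word):
    """Extract the fields that sit at the same bit positions in every format.

    Bit 0 is the most-significant bit in PowerPC numbering, so the primary
    opcode (bits 0-5) is the top six bits of the 32-bit word.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 0-5
        "rD": (word >> 21) & 0x1F,      # bits 6-10
        "rA": (word >> 16) & 0x1F,      # bits 11-15
        "rB": (word >> 11) & 0x1F,      # bits 16-20 (register only in X/XO forms)
    }


print(decode_fields(0x7C642A14))  # add r3,r4,r5
print(decode_fields(0x80640008))  # lwz r3,8(r4); the rB bits belong to the displacement
```

Because the register fields never move, a decoder can start reading the register file before it has finished decoding the rest of the instruction.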

1.5.1 PowerPC Instruction Set

The PowerPC instructions are divided into the following categories.

• Integer instructions: These include computational and logical instructions.
   – Integer arithmetic instructions
   – Integer compare instructions
   – Integer logical instructions
   – Integer rotate and shift instructions

• Floating-point instructions: These include floating-point computational instructions, as well as instructions that affect the FPSCR.
   – Floating-point arithmetic instructions
   – Floating-point multiply/add instructions
   – Floating-point rounding and conversion instructions
   – Floating-point compare instructions
   – Floating-point status and control instructions

• Load/store instructions: These include integer and floating-point load-and-store instructions.
   – Integer load-and-store instructions
   – Integer load-and-store multiple instructions
   – Floating-point load-and-store instructions
   – Primitives used to construct atomic memory operations (Load Word and Reserve Indexed [lwarx] and Store Word Conditional Indexed [stwcx.] instructions)

• Flow-control instructions: These include branching instructions, Condition Register logical instructions, trap instructions, and other instructions that affect the instruction flow.
   – Branch and trap instructions
   – Condition Register logical instructions (set conditions for branches)
   – System call

• Processor control instructions: These instructions are used to synchronize memory accesses and to manage caches, TLBs, and the Segment Registers.
   – Move-to/Move-from SPR instructions
   – Move-to/Move-from MSR
   – Synchronize (processor and memory system)
   – Instruction synchronize
   – Order loads and stores

• Memory control instructions: These instructions provide control of caches, TLBs, and SRs.
   – Supervisor-level cache-management instructions
   – User-level cache instructions
   – Segment Register manipulation instructions
   – Translation-lookaside-buffer management instructions

These categories do not indicate the execution unit that executes a particular instruction or group of instructions.
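The lwarx and stwcx. primitives listed under the load/store category are normally used as a load, modify, conditional-store loop that retries until the store succeeds. The following Python sketch is a toy, single-reservation model of that pattern; the class, its methods, and the `between` hook are hypothetical, and real hardware tracks reservations with a granularity and rules that this model does not attempt to reproduce.

```python
class ReservationMemory:
    """Toy model: word memory plus a single reservation address."""

    def __init__(self):
        self.words = {}
        self.reservation = None

    def lwarx(self, addr):
        """Load a word and establish a reservation on its address."""
        self.reservation = addr
        return self.words.get(addr, 0)

    def store(self, addr, value):
        """Ordinary store by another agent; clears a matching reservation."""
        if self.reservation == addr:
            self.reservation = None
        self.words[addr] = value & 0xFFFFFFFF

    def stwcx(self, addr, value):
        """Store only if the reservation is still held; report success."""
        held = self.reservation == addr
        self.reservation = None
        if held:
            self.words[addr] = value & 0xFFFFFFFF
        return held


def atomic_add(mem, addr, delta, between=None):
    """Add delta atomically; `between` simulates interference on the first try."""
    while True:
        old = mem.lwarx(addr)
        if between is not None:
            between()
            between = None
        if mem.stwcx(addr, old + delta):
            return old


mem = ReservationMemory()
mem.store(0x100, 5)
print(atomic_add(mem, 0x100, 1), mem.words[0x100])
```

If another agent stores to the reserved word between the load and the conditional store, the conditional store fails and the loop simply reloads the new value and tries again.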

Integer instructions operate on byte, half-word, and word operands. Floating-point instructions operate on single-precision (one word) and double-precision (two words) floating-point operands. The PowerPC Architecture uses instructions that are four bytes long and word-aligned. It provides for integer byte, half-word, and word operand loads and stores between memory and a set of 32 GPRs. It also provides for single and double-precision loads and stores between memory and a set of 32 Floating Point Registers (FPRs).

Computational instructions do not access memory. To use a memory operand in a computation and then modify the same or another memory location, the memory contents must be loaded into a register, modified, and then written back to the target location using three or more instructions.

PowerPC processors follow the program flow when they are in the normal execution state; however, the flow of instructions can be interrupted directly by the execution of an instruction or by an asynchronous event. Either type of exception will cause the associated exception handler to be invoked.

Effective address computations for both data and instruction accesses use 32-bit signed two’s complement binary arithmetic. A carry from bit 0 and overflow are ignored.
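As a worked illustration of the wrap-around behavior, the following Python sketch computes a base-plus-displacement effective address modulo 2^32. It assumes a 16-bit signed displacement, as used by the D-form load-and-store instructions in the PowerPC Architecture; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, displacement16):
    """Add base and displacement, discarding any carry out of bit 0."""
    return (base + sign_extend16(displacement16)) & MASK32


print(hex(effective_address(0x00001000, 0xFFFC)))  # base minus 4
print(hex(effective_address(0xFFFFFFFC, 0x0008)))  # wraps past the top of memory
```

The second call shows the case the text describes: the sum exceeds 32 bits, the carry is dropped, and the address wraps to a low value.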


1.5.2 750GX Microprocessor Instruction Set

The 750GX instruction set is defined as follows:

• The 750GX provides hardware support for all PowerPC instructions.

• The 750GX implements the following instructions, which are optional in the PowerPC Architecture:
   – External Control In Word Indexed (eciwx)
   – External Control Out Word Indexed (ecowx)
   – Floating Select (fsel)
   – Floating Reciprocal Estimate Single-Precision (fres)
   – Floating Reciprocal Square Root Estimate (frsqrte)
   – Store Floating-Point as Integer Word Indexed (stfiwx)

Note: The fres and frsqrte instructions are implemented in the 750GX with 12-bit precision (better than one part in 4000), which significantly exceeds the minimum precision required by the architecture.
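To put the 12-bit figure in context, a relative error of 2^-12 is about 1 part in 4096, which is indeed better than 1 part in 4000. Software that needs more precision than an estimate provides often refines it with a Newton-Raphson step, which roughly squares the relative error. The sketch below illustrates that general numerical technique in Python; it is not described in this manual, and the `rough` value is only a stand-in for an estimate with 12-bit accuracy.

```python
ESTIMATE_RELATIVE_ERROR = 2.0 ** -12  # "12-bit precision" from the note above


def refine_reciprocal(a, estimate):
    """One Newton-Raphson step for 1/a: x1 = x0 * (2 - a * x0)."""
    return estimate * (2.0 - a * estimate)


a = 3.0
rough = (1.0 / a) * (1.0 + ESTIMATE_RELATIVE_ERROR)  # stand-in for an estimate
better = refine_reciprocal(a, rough)
print(abs(rough * a - 1.0), abs(better * a - 1.0))  # relative error before and after
```

If the estimate is x0 = (1 + e)/a, the refined value is (1 - e^2)/a, so one step takes a relative error near 2^-12 to one near 2^-24.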

1.6 On-Chip Cache Implementation

The following subsections describe the PowerPC Architecture’s treatment of cache in general, and the 750GX-specific implementation. A detailed description of the 750GX L1 cache implementation is provided in Chapter 3, Instruction-Cache and Data-Cache Operation, on page 121. A detailed description of the L2 cache is provided in Chapter 9, L2 Cache, on page 323.

1.6.1 PowerPC Cache Model

The PowerPC Architecture does not define hardware aspects of cache implementations. For example, PowerPC processors can have unified caches, separate instruction and data caches (Harvard architecture), or no cache at all. PowerPC microprocessors control the following memory-access modes on a virtual-page or block (BAT) basis:

• Write-back/write-through mode

• Caching-inhibited mode

• Memory coherency

The caches are physically addressed, and the data cache can operate in either write-back or write-through mode, as specified by the PowerPC Architecture.

The PowerPC Architecture defines the term ‘cache block’ as the cacheable unit. The VEA and OEA define cache-management instructions that a programmer can use to affect cache contents.

1.6.2 750GX Microprocessor Cache Implementation

The 750GX cache implementation is described in Section 1.2.4, On-Chip Level 1 Instruction and Data Caches, on page 33 and Section 1.2.5, On-Chip Level 2 Cache Implementation, on page 35.

The BPU also contains a cache, the 64-entry BTIC, that provides immediate access to an instruction pair for taken branches. For more information, see Section 1.2.1.2, Branch Processing Unit (BPU), on page 29.


1.7 Exception Model

The following sections describe the PowerPC exception model and the 750GX implementation. A detailed description of the 750GX exception model is provided in Chapter 4, Exceptions, on page 151 in this manual.

1.7.1 PowerPC Exception Model

The PowerPC exception model allows the processor to interrupt the instruction flow to handle certain situations caused by external signals, errors, or unusual conditions arising from the instruction execution. When exceptions occur, information about the state of the processor is saved to certain registers, and the processor begins execution at an address (exception vector) predetermined for each exception. System software must complete the saving of the processor state prior to servicing the exception. Exception processing proceeds in supervisor mode.

Although multiple exception conditions can map to a single exception vector, a more specific condition can be determined by examining a register associated with the exception. For example, the MSR, DSISR, and FPSCR contain status bits that further identify the exception condition. Additionally, some exception conditions can be explicitly enabled or disabled by software.

The PowerPC Architecture requires that exceptions be handled in specific priority and program order. Therefore, although a particular implementation might recognize exception conditions out of order, they are handled in program order. When an instruction-caused exception is recognized, any unexecuted instructions that appear earlier in the instruction stream, including any that are not dispatched, must complete before the exception is taken. Any exceptions those instructions cause must also be handled first. Likewise, asynchronous, precise exceptions are recognized when they occur. However, they are not handled until the instructions currently in the completion queue successfully retire or generate an exception, and the completion queue is emptied.

Unless a catastrophic condition causes a system reset or machine-check exception, only one exception is handled at a time. For example, if one instruction encounters multiple exception conditions, those conditions are handled sequentially in priority order. After the exception handler completes, the instruction processing continues until the next exception condition is encountered. Recognizing and handling exception conditions sequentially guarantees system integrity.

When an exception is taken, information about the processor state before the exception was taken is saved in SRR0 and SRR1. Exception handlers must save the information stored in SRR0 and SRR1 early, before re-enabling external interrupts, to prevent the program state from being lost due to a system reset or machine-check exception or due to an instruction-caused exception in the exception handler. The exception handler must also save and restore any GPRs used by the handler.


The PowerPC Architecture supports four types of exceptions:

Synchronous, precise: These are caused by instructions. All instruction-caused exceptions are handled precisely. That is, the machine state at the time the exception occurs is known and can be completely restored. This means that (excluding the trap and system call exceptions) the address of the faulting instruction is provided to the exception handler and that neither the faulting instruction nor subsequent instructions in the code stream will complete execution before the exception is taken. Once the exception is processed, execution resumes at the address of the faulting instruction (or at an alternate address provided by the exception handler). When an exception is taken due to a trap or system call instruction, execution resumes at an address provided by the handler.

Synchronous, imprecise: The PowerPC Architecture defines two imprecise floating-point exception modes, recoverable and nonrecoverable. Even though the 750GX provides a means to enable the imprecise modes, it implements these modes identically to the precise mode (that is, enabled floating-point exceptions are always precise).

Asynchronous, maskable: The PowerPC Architecture defines external and decrementer interrupts as maskable, asynchronous exceptions. When these exceptions occur, their handling is postponed until the next instruction, and any exceptions associated with that instruction, complete execution. If no instructions are in the execution units, the exception is taken immediately upon determination of the correct restart address (for loading SRR0). As shown in Table 1-4, 750GX Microprocessor Exception Classifications, the 750GX implements additional asynchronous, maskable exceptions.

Asynchronous, nonmaskable: There are two nonmaskable asynchronous exceptions: system reset and the machine-check exception. These exceptions might not be recoverable, or might provide a limited degree of recoverability. Exceptions report recoverability through the MSR[RI] bit.

1.7.2 750GX Microprocessor Exception Implementation

The 750GX exception classes described above are shown in Table 1-4. Although exceptions have other characteristics, such as priority and recoverability, Table 1-4 describes only the precise or imprecise characteristics of the exceptions that the 750GX handles. Table 1-4 includes no synchronous imprecise exceptions; although the PowerPC Architecture supports imprecise handling of floating-point exceptions, the 750GX implements these exception modes precisely.

Table 1-4. 750GX Microprocessor Exception Classifications

| Synchronous/Asynchronous | Precise/Imprecise | Exception Type |
|---|---|---|
| Asynchronous, nonmaskable | Imprecise | Machine check, system reset |
| Asynchronous, maskable | Precise | External, decrementer, system-management, performance-monitor, and thermal-management interrupts |
| Synchronous | Precise | Instruction-caused exceptions |

 

 

 

Table 1-5 on page 50 lists the 750GX exceptions and conditions that cause them. Exceptions specific to the 750GX are indicated.


Table 1-5. Exceptions and Conditions

| Exception Type | Vector Offset (hex) | Causing Conditions |
|---|---|---|
| Reserved | 00000 | |
| System reset | 00100 | Assertion of either HRESET or SRESET, or a power-on reset. |
| Machine check | 00200 | Assertion of the transfer error acknowledge (TEA) during a data-bus transaction, assertion of a machine-check interrupt (MCP), or an address, data, or L2 double-bit error. MSR[ME] must be set. |
| Data storage interrupt | 00300 | As defined in the PowerPC Architecture (for example, a page fault occurs). |
| Instruction storage interrupt (ISI) | 00400 | As defined by the PowerPC Architecture (for example, a page fault occurs). |
| External interrupt | 00500 | MSR[EE] = 1 and interrupt (INT) is asserted. |
| Alignment | 00600 | • A floating-point load/store, Store Multiple Word (stmw), Store Word Conditional Indexed (stwcx.), Load Multiple Word (lmw), Load Word and Reserve Indexed (lwarx), eciwx, or ecowx instruction operand is not word-aligned. • A multiple/string load/store operation is attempted in little-endian mode. • The operand of Data Cache Block Zero (dcbz) is in memory that is write-through-required or caching-inhibited, or the cache is disabled. |
| Program | 00700 | As defined by the PowerPC Architecture. |
| Floating-point unavailable | 00800 | As defined by the PowerPC Architecture. |
| Decrementer | 00900 | As defined by the PowerPC Architecture, when the most significant bit of the DEC register changes from 0 to 1 and MSR[EE] = 1. |
| Reserved | 00A00–00BFF | |
| System call | 00C00 | Execution of the System Call (sc) instruction. |
| Trace | 00D00 | MSR[SE] = 1 or a branch instruction completes and MSR[BE] = 1. Unlike the architecture definition, Instruction Synchronization (isync) does not cause a trace exception. |
| Reserved | 00E00 | The 750GX does not generate an exception to this vector. Other PowerPC processors might use this vector for floating-point assist exceptions. |
| Reserved | 00E10–00EFF | |
| Performance monitor [1] | 00F00 | The limit specified in a Performance-Monitor Counter (PMC) register is reached and MMCR0[ENINT] = 1. |
| Instruction address breakpoint [1] | 01300 | IABR[0–29] matches EA[0–29] of the next instruction to complete, IABR[TE] matches MSR[IR], and IABR[BE] = 1. |
| System management exception | 01400 | A system management exception is enabled if MSR[EE] = 1 and is signaled to the 750GX by the assertion of an input signal pin (SMI). |
| Reserved | 01500–016FF | |
| Thermal-management interrupt [1] | 01700 | Thermal management is enabled, the junction temperature exceeds the threshold specified in THRM1 or THRM2, and MSR[EE] = 1. |
| Reserved | 01800–02FFF | |

1. 750GX-specific
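The decrementer condition in Table 1-5 (the most significant bit of DEC changing from 0 to 1) means that a value of n loaded into DEC produces the exception condition after n + 1 decrements, when the counter passes from 0 to 0xFFFFFFFF. The following Python sketch models this; the function names are hypothetical, and the model ignores MSR[EE] masking and the rate at which DEC is actually decremented.

```python
MASK32 = 0xFFFFFFFF


def step_decrementer(dec):
    """Decrement DEC once; return (new value, exception condition raised)."""
    new_dec = (dec - 1) & MASK32
    raised = (dec >> 31) == 0 and (new_dec >> 31) == 1  # MSB went from 0 to 1
    return new_dec, raised


def ticks_until_exception(initial):
    """Count decrements until the 0-to-1 transition (use small values only)."""
    dec, ticks = initial & MASK32, 0
    while True:
        dec, raised = step_decrementer(dec)
        ticks += 1
        if raised:
            return ticks


print(ticks_until_exception(9))      # n + 1 decrements for an initial value of n
print(step_decrementer(0x80000000))  # MSB going from 1 to 0 raises nothing
```

The exception is actually taken only while MSR[EE] = 1, as the table states.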

1.8 Memory Management

The following subsections describe the memory-management features of the PowerPC Architecture, and the 750GX implementation. A detailed description of the 750GX MMU implementation is provided in Chapter 5, Memory Management, on page 179.

1.8.1 PowerPC Memory-Management Model

The primary functions of the MMU are to translate logical (effective) addresses to physical addresses for memory accesses and to provide access protection on blocks and pages of memory. There are two types of accesses generated by the 750GX that require address translation—instruction fetches, and data accesses to memory generated by load, store, and cache-control instructions.

The PowerPC Architecture defines different resources for 32-bit and 64-bit processors. The 750GX implements the 32-bit memory-management model. The memory management unit provides two types of memory-access models: a block-address translation (BAT) model and a virtual address model. The BAT block sizes range from 128 KB to 256 MB, are selectable from high-order effective address bits, and have priority over the virtual model. The virtual model employs a 52-bit virtual address space made up of a 24-bit segment address space and a 28-bit effective address space. The virtual model uses a demand paging method with a 4-KB page size. In both models, address translation is done completely by hardware, in parallel with cache accesses, with no additional cycles incurred.
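The 52-bit virtual address can be pictured as a 24-bit segment identifier followed by the low-order 28 bits of the effective address (a 16-bit page index and a 12-bit byte offset for 4-KB pages). The Python sketch below assembles one under the assumption, taken from the 32-bit PowerPC Architecture rather than from this section, that the four high-order effective address bits select one of 16 Segment Registers holding the 24-bit value. The function and parameter names are hypothetical.

```python
def virtual_address(ea, segment_ids):
    """Build the 52-bit virtual address for a 32-bit effective address.

    segment_ids is a list of 16 values standing in for the 24-bit segment
    identifiers held in the Segment Registers (an assumption of this sketch).
    """
    sr_index = (ea >> 28) & 0xF          # high-order 4 bits select a segment
    segment = segment_ids[sr_index] & 0xFFFFFF
    page_index = (ea >> 12) & 0xFFFF     # 16-bit page index
    byte_offset = ea & 0xFFF             # 12-bit offset within a 4-KB page
    return (segment << 28) | (page_index << 12) | byte_offset


segments = [0] * 16
segments[3] = 0xABCDEF
print(hex(virtual_address(0x30012345, segments)))
```

The 28 low-order bits pass through unchanged; only the segment portion is replaced, which is why the result fits in 24 + 28 = 52 bits.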

The 750GX MMU provides independent 8-entry BAT arrays for instructions and data that maintain address translations for blocks of memory. These entries define blocks that can vary from 128 KB to 256 MB. The BAT arrays are maintained by system software. Instructions and data share the same virtual address model, but could operate in separate segment spaces.
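The range of 128 KB to 256 MB corresponds to twelve block sizes if each step doubles the previous one, which is how the 32-bit PowerPC Architecture encodes the block length (that doubling is an assumption of this sketch, not a statement from this section). The Python sketch below lists the sizes and shows how the high-order effective address bits identify a block; the function names are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def bat_block_sizes():
    """Return the assumed power-of-two block sizes from 128 KB to 256 MB."""
    return [(128 * KB) << n for n in range(12)]


def block_base(ea, block_size):
    """Keep only the high-order address bits that select the block."""
    return ea & ~(block_size - 1) & 0xFFFFFFFF


sizes = bat_block_sizes()
print(sizes[0] // KB, "KB to", sizes[-1] // MB, "MB")
print(hex(block_base(0x12345678, 128 * KB)))
```

Larger blocks use fewer address bits for the comparison, so the same effective address falls into a different block base depending on the configured size.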

The PowerPC 750GX MMU and exception model support demand-paged virtual memory. Virtual memory management permits execution of programs larger than the size of physical memory. Demand-paged implies that individual pages for data and instructions are loaded into physical memory from the system disk only when they are required by an executing program. Infrequently used pages in memory are returned to disk or discarded if they have not been modified.

The hashed page table is a fixed-size data structure that contains 8-byte page table entries (PTEs), which define the mapping between virtual pages and physical pages. Its size should be determined by the amount of physical memory available to the system. The page table size is a power of two and is boundary aligned in memory based on the size of the table. The page table contains a number of page-table-entry groups (PTEGs). Since a PTEG contains eight PTEs of eight bytes each, each PTEG is 64 bytes long. PTEG addresses are entry points for table-search operations. A given page translation can be found in one of two possible PTEGs. The size and location in memory of the page table are defined in the SDR1 register.

Setting MSR[IR] enables instruction address translations and setting MSR[DR] enables data address translations. If either bit is cleared, the respective effective address is used as the physical address.
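The two candidate PTEGs for a page can be located with a short calculation. The Python sketch below is a simplified model: it assumes the hashing scheme of the 32-bit PowerPC Architecture (the low-order 19 bits of the segment identifier XORed with the 16-bit page index for the primary hash, and its ones' complement for the secondary hash), and it takes the table base and size directly as parameters instead of decoding SDR1. All names are hypothetical.

```python
PTE_SIZE = 8
PTES_PER_PTEG = 8
PTEG_SIZE = PTE_SIZE * PTES_PER_PTEG  # 64 bytes, as stated above


def pteg_addresses(segment_id, page_index, table_base, table_size):
    """Return the (primary, secondary) PTEG addresses for one virtual page."""
    if table_size & (table_size - 1) or table_base % table_size:
        raise ValueError("table must be a power of two in size and aligned to it")
    primary = (segment_id & 0x7FFFF) ^ (page_index & 0xFFFF)
    secondary = ~primary & 0x7FFFF
    groups = table_size // PTEG_SIZE
    return tuple(table_base + (h % groups) * PTEG_SIZE for h in (primary, secondary))


first, second = pteg_addresses(0x000001, 0x0002, 0x00100000, 64 * 1024)
print(hex(first), hex(second))
```

Because the table is aligned to its own size, adding the scaled hash to the base is equivalent to inserting the hash bits into the low-order part of the table address.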


1.8.2 750GX Microprocessor Memory-Management Implementation

The 750GX implements separate MMUs for instructions and data. It implements a copy of the Segment Registers in the instruction MMU. However, read and write accesses (Move-from Segment Register [mfsr] and Move-to Segment Register [mtsr]) are handled through the Segment Registers implemented as part of the data MMU. The 750GX MMU is described in Section 1.2.3, Memory Management Units (MMUs), on page 32.

The R (referenced) bit is set in the PTE in memory during a page table search due to a TLB miss. Updates to the changed (C) bit are treated like TLB misses. The page table is searched again to find the correct PTE to update when the C bit changes from 0 to 1.

1.9 Instruction Timing

The 750GX is a pipelined, superscalar processor. A pipelined processor is one in which instruction processing is divided into discrete stages, allowing work to be done on multiple instructions in each stage. For example, after an instruction completes one stage, it can pass on to the next stage leaving the previous stage available to a subsequent instruction. This improves overall instruction throughput.

A superscalar processor is one that issues multiple independent instructions to separate execution units in a single cycle, allowing multiple instructions to execute in parallel. The 750GX has six independent execution units, two for integer instructions, and one each for floating-point instructions, branch instructions, load-and-store instructions, and system-register instructions. Having separate GPRs and FPRs allows integer calculations, floating-point calculations, and load-and-store operations to occur simultaneously without interference. Additionally, rename buffers are provided to allow operations to post completed results for use by subsequent instructions without committing them to the architected FPR and GPR register files.

As shown in Figure 1-5 on page 53, the common pipeline of the 750GX has four stages through which all instructions must pass: fetch, decode/dispatch, execute, and complete/write back. Instructions flow sequentially through each stage. However, when an instruction is dispatched, it is assigned a position in the completion queue at the same time that it enters the execution stage. This simplifies the completion operation when instructions are retired in program order. Both the load/store and floating-point units have multiple stages to execute their instructions. An instruction occupies only one stage at a time in all execution units. At each stage, an instruction might proceed without delay or might stall. Stalls are caused by the requirement for additional processing or other events. For example, divide instructions require multiple cycles to complete the operation; load-and-store instructions might stall waiting for address translation (during TLB reload or page fault, for example).


Figure 1-5. Pipeline Diagram

[Figure not reproduced. Labels in the diagram: Fetch (maximum 4-instruction fetch per clock cycle); BPU; Dispatch (maximum 3-instruction dispatch per clock cycle, includes one branch instruction); Execute Stage (FPU1, FPU2, FPU3, LSU1, LSU2, IU1, IU2, SRU); Complete (Write-Back) (maximum 2-instruction completion per clock cycle).]

Note: Figure 1-5 does not show features such as reservation stations and rename buffers that reduce stalls and improve instruction throughput.

The instruction pipeline in the 750GX has four major pipeline stages. They are fetch, dispatch, execute, and complete:

The fetch pipeline stage primarily involves fetching instructions from the memory system and keeping the instruction queue full. The BPU decodes branches after they are fetched and removes (folds out) those that do not update the CTR or LR from the instruction stream. If the branch is taken or predicted as taken, the fetch unit is informed of the new address and fetching resumes along the taken path. For branches not taken or predicted as not taken, sequential fetching continues.

The dispatch unit is responsible for taking instructions from the bottom two locations of the instruction queue and delivering them to an execution unit for further processing. Dispatch is responsible for decoding the instructions and determining which instructions can be dispatched. To qualify for dispatch, a reservation station, a rename buffer, and a position in the completion queue all must be available. A branch instruction could be processed by the BPU on the same clock cycle for a maximum of three instructions dispatched per cycle.

The dispatch stage accesses operands, assigns a rename buffer for operands that update architected registers, which include the GPRs, FPRs, and CR, and delivers the instruction to the reservation registers of the respective execution units. If a source operand is not available because a previous instruction is updating the item in a rename buffer, dispatch provides a tag that indicates which rename buffer will supply the operand when it becomes available. At the end of the dispatch stage, the instructions are removed from the instruction queue, latched into reservation stations at the appropriate execution unit, and assigned positions in the completion buffers in sequential program order.


The execution units process instructions from their reservation stations using the operands provided from dispatch, and notify the completion stage when the instruction has finished execution. With the exception of multiply and divide, integer instructions complete execution in a single cycle.

The FPU has three stages (multiply, add, and normalize) for processing floating-point arithmetic. All single-precision arithmetic (add, subtract, multiply, and multiply/add) instructions are processed without stalls at each stage. They have a 1-cycle throughput and a 3-cycle latency. Three different arithmetic instructions can be in the execution unit at one time, with one instruction completing execution each cycle. Double-precision arithmetic multiply requires two cycles in the multiply stage, one cycle in the add stage, and one cycle in the normalize stage, which yields a 2-cycle throughput and a 4-cycle latency. All divide instructions require multiple cycles in the first stage for processing.

The load/store unit has two reservation registers and two pipeline stages. The first stage is for effective address calculation and the second stage is for MMU translation and accessing the L1 data cache. Load instructions have a 1-cycle throughput and a 2-cycle latency.

In the case of an internal exception, the execution unit reports the exception to the completion pipeline stage and (except for the FPU) discontinues instruction execution until the exception is handled. The exception is not signaled until it is determined that all previous instructions have completed to a point where they will not signal an exception.

The completion unit retires instructions from the bottom two positions of the completion queue in program order. This maintains the correct architectural machine state and transfers execution results from the rename buffers to the GPRs and FPRs (and CTR and LR, for some instructions) as instructions are retired. If the completion logic detects an instruction causing an exception, all subsequent instructions are cancelled, their execution results in rename buffers are discarded, and instructions are fetched from the appropriate exception vector.

Because the PowerPC Architecture can be applied to such a wide variety of implementations, instruction timing varies among PowerPC processors. For a detailed discussion of instruction timing with examples and a table of latencies for each execution unit, see Chapter 6, Instruction Timing, on page 209.

1.10 Power Management

The 750GX provides the following four power modes, selectable by setting the appropriate control bits in the MSR and HID0 registers:

Full-power
This is the default power state of the 750GX. The 750GX is fully powered, and the internal functional units are operating at the full processor clock speed. If the dynamic power management mode is enabled, functional units that are idle will automatically enter a low-power state without affecting performance, software execution, or external hardware.

Doze
All the functional units of the 750GX are disabled except for the Time Base/Decrementer Registers and the bus snooping logic. When the processor is in doze mode, an external asynchronous interrupt, a system management interrupt, a decrementer exception, a hard or soft reset, or a machine check brings the 750GX into the full-power state. The 750GX in doze mode maintains the PLL in a fully powered state and locked to the system external clock input (SYSCLK) so a transition to the full-power state takes only a few processor clock cycles.

Nap
The nap mode further reduces power consumption by disabling bus snooping, leaving only the Time Base Register and the PLL in a powered state. The 750GX returns to the full-power state upon receipt of an external asynchronous interrupt, a system management interrupt, a decrementer exception, a hard or soft reset, or a machine-check interrupt (MCP). A return to full-power state from nap state takes only a few processor clock cycles. When the processor is in nap mode, if QACK is negated, the processor is put in doze mode to support snooping.

Sleep
Sleep mode minimizes power consumption by disabling all internal functional units, after which external system logic can disable the PLL and SYSCLK. Returning the 750GX to the full-power state requires enabling the PLL and SYSCLK, followed by the assertion of an external asynchronous interrupt, a system management interrupt, a hard or soft reset, or a machine-check interrupt (MCP) signal after the time required to relock the PLL.

In addition, the 750GX allows software-controlled toggling between two operating frequencies. During periods of processor inactivity or for applications requiring reduced computing performance, the processor may be toggled to a lower frequency to conserve power.
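
The wake-up conditions listed for the doze, nap, and sleep modes can be summarized as a small lookup. The following is a sketch for illustration only; the enumeration and function names are hypothetical, and the table records nothing beyond the events named in the mode descriptions in this section.

```c
#include <stdbool.h>

enum power_mode { MODE_FULL, MODE_DOZE, MODE_NAP, MODE_SLEEP };

enum wake_event {
    EV_EXTERNAL_INTERRUPT,     /* external asynchronous interrupt */
    EV_SYSTEM_MGMT_INTERRUPT,  /* system management interrupt */
    EV_DECREMENTER,            /* decrementer exception */
    EV_RESET,                  /* hard or soft reset */
    EV_MACHINE_CHECK           /* machine check (MCP) */
};

/* Returns true if `ev` is listed as returning the processor to the
 * full-power state from `mode`. For sleep mode, the PLL and SYSCLK
 * must already have been re-enabled and relocked by external logic. */
bool wakes_processor(enum power_mode mode, enum wake_event ev)
{
    switch (mode) {
    case MODE_DOZE:
    case MODE_NAP:
        return true;                 /* all five events are listed */
    case MODE_SLEEP:
        return ev != EV_DECREMENTER; /* decrementer is not listed */
    default:
        return false;                /* already at full power */
    }
}

/* Bus snooping stays active only in the full-power and doze modes. */
bool snoops_bus(enum power_mode mode)
{
    return mode == MODE_FULL || mode == MODE_DOZE;
}
```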

Chapter 10, Power and Thermal Management, on page 335 provides information about power-saving and thermal-management modes for the 750GX.

1.11 Thermal Management

The 750GX’s thermal assist unit (TAU) provides a way to control heat dissipation. This ability is particularly useful in portable computers, which, due to power consumption and size limitations, cannot use desktop cooling solutions such as fans. Therefore, better heat sink designs coupled with intelligent thermal management are of critical importance for high-performance portable systems.

Primarily, the thermal-management system monitors and regulates the system’s operating temperature. For example, if the temperature is about to exceed a set limit, the system can be made to slow down or even suspend operations temporarily in order to lower the temperature.

The thermal-management facility also ensures that the processor’s junction temperature does not exceed the operating specification. To avoid the inaccuracies that arise from measuring junction temperature with an external thermal sensor, the 750GX’s on-chip thermal sensor and logic tightly couple the thermal-management implementation.

The TAU consists of a thermal sensor, digital-to-analog converter, comparator, control logic, and the dedicated SPRs described in Section 1.4, PowerPC Registers and Programming Model, on page 42. The TAU does the following:

Compares the junction temperature against user-programmable thresholds.

Generates a thermal-management interrupt if the temperature crosses the threshold.

Enables the user to estimate the junction temperature by using a software successive approximation routine.
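
A software successive approximation routine of the kind mentioned in the last item can be sketched as a binary search over the comparison threshold. The code below is illustrative only: the comparator is modeled as a hypothetical callback (tau_compare_fn) so that the search can be exercised on a host machine, and the threshold range passed by the caller is an assumption rather than a value taken from this section. On real hardware, the callback would program a threshold through the THRM registers and read back the comparator result.

```c
/* Hypothetical comparator hook: returns nonzero when the junction
 * temperature is at or above `threshold`. */
typedef int (*tau_compare_fn)(int threshold, void *ctx);

/* Binary search for the largest threshold in [lo, hi] that the junction
 * temperature still meets or exceeds. Returns lo - 1 if the temperature
 * is below the whole range. */
int tau_estimate(int lo, int hi, tau_compare_fn at_or_above, void *ctx)
{
    int best = lo - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (at_or_above(mid, ctx)) {
            best = mid;      /* temperature >= mid: search upward */
            lo = mid + 1;
        } else {
            hi = mid - 1;    /* temperature < mid: search downward */
        }
    }
    return best;
}

/* Simulated sensor for host-side experiments: ctx points to an int
 * holding the "true" junction temperature. */
int sim_compare(int threshold, void *ctx)
{
    return *(const int *)ctx >= threshold;
}
```

Each step halves the remaining range, so a range of 128 threshold values needs at most seven or eight comparisons. Each comparison on hardware would also have to wait for the comparator output to settle.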


The TAU is controlled through the privileged mtspr and mfspr instructions to the four SPRs provided for configuring and controlling the sensor control logic. The SPRs function as follows.

THRM1 and THRM2 provide the ability to compare the junction temperature against two user-provided thresholds. Having dual thresholds gives the thermal-management software finer control of the junction temperature. In single-threshold mode, the thermal sensor output is compared to only one threshold in either THRM1 or THRM2.

THRM3 is used to enable the TAU and to control the comparator output sample time. The thermal-management logic manages the thermal-management interrupt generation and time multiplexed comparisons in the dual-threshold mode, as well as other control functions.

THRM4 is used to improve accuracy in determining the actual junction temperature.

Instruction-cache throttling provides control of the 750GX’s overall junction temperature by determining the interval at which instructions are fetched. This feature is accessed through the ICTC register. Chapter 10, Power and Thermal Management, on page 335 provides information about power-saving and thermal-management modes for the 750GX.

1.12 Performance Monitor

The 750GX incorporates a performance-monitor facility that system designers can use to help bring up, debug, and optimize software performance. The performance monitor counts events during execution of code, which relate to dispatch, execution, completion, and memory accesses.

The performance monitor incorporates several registers that can be read and written to by supervisor-level software. User-level versions of these registers provide read-only access for user-level applications. These registers are described in Section 1.4, PowerPC Registers and Programming Model, on page 42. Performance-Monitor Control Registers, MMCR0 or MMCR1, can be used to specify which events are to be counted and the conditions for which a performance-monitoring interrupt is taken. Additionally, the Sampled Instruction Address Register, SIA (USIA), holds the address of the first instruction to complete after the counter overflowed.

Attempting to write to a user-read-only Performance-Monitor Register causes a program exception, regardless of the MSR[PR] setting. When a performance-monitoring interrupt occurs, program execution continues from vector offset 0x00F00.

Chapter 11, Performance Monitor and System Related Features, on page 349 describes the operation of the performance-monitor diagnostic tool incorporated in the 750GX.


2. Programming Model

This chapter describes the 750GX programming model, emphasizing those features specific to the 750GX processor and summarizing those that are common to PowerPC processors. It consists of three major sections, which describe the following topics.

Registers implemented in the 750GX

Operand conventions

750GX instruction set

For detailed information about architecture-defined features, see the PowerPC Microprocessor Family: The Programming Environments Manual.

2.1 PowerPC 750GX Processor Register Set

This section describes the registers implemented in the 750GX. It includes an overview of registers defined by the PowerPC Architecture, highlighting differences in how these registers are implemented in the 750GX, and a detailed description of 750GX-specific registers. Full descriptions of the architecture-defined register set are provided in Chapter 2, “PowerPC Register Set” in the PowerPC Microprocessor Family: The Programming Environments Manual.

Registers are defined at all three levels of the PowerPC Architecture: user instruction set architecture (UISA), virtual environment architecture (VEA), and operating environment architecture (OEA). The PowerPC Architecture defines register-to-register operations for all computational instructions. Source data for these instructions are accessed from the on-chip registers or are provided as immediate values embedded in the opcode. The 3-register instruction format allows specification of a target register distinct from the two source registers, thus preserving the original data for use by other instructions and reducing the number of instructions required for certain operations. Data is transferred between memory and registers with explicit load-and-store instructions only.

2.1.1 Register Set

The registers implemented on the 750GX are shown in Figure 2-1 on page 58. The number to the right of the special-purpose registers (SPRs) indicates the number that is used in the syntax of the instruction operands to access the register (for example, the number used to access the Integer Exception Register (XER) is SPR 1). These registers can be accessed using the Move-to Special Purpose Register (mtspr) and Move-from Special Purpose Register (mfspr) instructions.
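
To show how an SPR number ends up in an instruction, the sketch below builds the 32-bit encodings of mfspr and mtspr. The encoding details (primary opcode 31, extended opcodes 339 and 467, and the 10-bit SPR field stored with its two 5-bit halves swapped) come from the PowerPC Architecture rather than from this section. The function and constant names are hypothetical; the SPR numbers are those shown in Figure 2-1.

```c
#include <stdint.h>

/* A few SPR numbers from Figure 2-1. */
enum { SPR_XER = 1, SPR_LR = 8, SPR_CTR = 9, SPR_HID0 = 1008 };

/* The 10-bit SPR number is stored with its 5-bit halves swapped:
 * the low five bits come first in the instruction word. */
static uint32_t spr_field(unsigned spr)
{
    return ((spr & 0x1Fu) << 16) | (((spr >> 5) & 0x1Fu) << 11);
}

/* mfspr rD,SPR: primary opcode 31, extended opcode 339. */
uint32_t encode_mfspr(unsigned rd, unsigned spr)
{
    return 0x7C0002A6u | ((rd & 0x1Fu) << 21) | spr_field(spr);
}

/* mtspr SPR,rS: primary opcode 31, extended opcode 467. */
uint32_t encode_mtspr(unsigned spr, unsigned rs)
{
    return 0x7C0003A6u | ((rs & 0x1Fu) << 21) | spr_field(spr);
}
```

For example, encode_mfspr(0, SPR_LR) produces 0x7C0802A6, which assemblers also accept under the simplified mnemonic mflr r0.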


Figure 2-1. PowerPC 750GX Microprocessor Programming Model—Registers

[Figure 2-1 is a block diagram. The registers it shows are listed below by group, with the SPR or TBR number used to access each one. Groups marked [1] are covered by note 1 below.]

USER MODEL (UISA)

General Purpose Registers: GPR0 through GPR31
Floating Point Registers: FPR0 through FPR31
Condition Register: CR
Floating-Point Status and Control Register: FPSCR
XER: XER (SPR 1)
Link Register: LR (SPR 8)
Count Register: CTR (SPR 9)
Performance Monitor Registers (For Reading)
    Performance Counters [1]: UPMC1 (SPR 937), UPMC2 (SPR 938), UPMC3 (SPR 941), UPMC4 (SPR 942)
    Sampled Instruction Address [1]: USIA (SPR 939)
    Monitor Control [1]: UMMCR0 (SPR 936), UMMCR1 (SPR 940)

USER MODEL (VEA)

Time Base Facility (For Reading): TBL (TBR 268), TBU (TBR 269)

SUPERVISOR MODEL (OEA)

Configuration Registers
    Hardware Implementation Registers [1]: HID0 (SPR 1008), HID1 (SPR 1009), HID2 (SPR 1016)
    Processor Version Register: PVR (SPR 287)
    Machine State Register: MSR

Memory-Management Registers
    Instruction BAT Registers [1]: IBAT0U (SPR 528), IBAT0L (SPR 529), IBAT1U (SPR 530), IBAT1L (SPR 531), IBAT2U (SPR 532), IBAT2L (SPR 533), IBAT3U (SPR 534), IBAT3L (SPR 535), IBAT4U (SPR 560), IBAT4L (SPR 561), IBAT5U (SPR 562), IBAT5L (SPR 563), IBAT6U (SPR 564), IBAT6L (SPR 565), IBAT7U (SPR 566), IBAT7L (SPR 567)
    Data BAT Registers [1]: DBAT0U (SPR 536), DBAT0L (SPR 537), DBAT1U (SPR 538), DBAT1L (SPR 539), DBAT2U (SPR 540), DBAT2L (SPR 541), DBAT3U (SPR 542), DBAT3L (SPR 543), DBAT4U (SPR 568), DBAT4L (SPR 569), DBAT5U (SPR 570), DBAT5L (SPR 571), DBAT6U (SPR 572), DBAT6L (SPR 573), DBAT7U (SPR 574), DBAT7L (SPR 575)
    Segment Registers: SR0 through SR15
    SDR1: SDR1 (SPR 25)

Exception Handling Registers
    Data Address Register: DAR (SPR 19)
    DSISR: DSISR (SPR 18)
    SPRGs: SPRG0 (SPR 272), SPRG1 (SPR 273), SPRG2 (SPR 274), SPRG3 (SPR 275)
    Save and Restore Registers: SRR0 (SPR 26), SRR1 (SPR 27)

Miscellaneous Registers
    External Access Register: EAR (SPR 282)
    Time Base (For Writing): TBL (SPR 284), TBU (SPR 285)
    Decrementer: DEC (SPR 22)
    Instruction Address Breakpoint Register [1]: IABR (SPR 1010)
    Data Address Breakpoint Register: DABR (SPR 1013)
    L2 Control Register [1]: L2CR (SPR 1017)

Performance Monitor Registers
    Performance Counters [1]: PMC1 (SPR 953), PMC2 (SPR 954), PMC3 (SPR 957), PMC4 (SPR 958)
    Sampled Instruction Address [1]: SIA (SPR 955)
    Monitor Control [1]: MMCR0 (SPR 952), MMCR1 (SPR 956)

Power/Thermal Management Registers
    Thermal Assist Unit Registers [1]: THRM1 (SPR 1020), THRM2 (SPR 1021), THRM3 (SPR 1022), THRM4 (SPR 920)
    Instruction-Cache Throttling Control Register [1]: ICTC (SPR 1019)

1. These are processor-specific registers. They might not be supported by other PowerPC processors.


The PowerPC UISA registers are user-level. General Purpose Registers (GPRs) and Floating Point Registers (FPRs) are accessed through instruction operands. Access to registers can be explicit (by using instructions for that purpose such as mtspr and mfspr instructions) or implicit as part of the execution of an instruction.

Some registers are accessed both explicitly and implicitly.

Implementation Note: The 750GX fully decodes the SPR field of the instruction. If the SPR specified is undefined, an illegal instruction program exception occurs.

Descriptions of the PowerPC user-level registers follow:

User-level registers (UISA)—The user-level registers can be accessed by all software with either user or supervisor privileges. They include the following registers:

General Purpose Registers (GPRs). The 32 GPRs (GPR0–GPR31) serve as data source or destination registers for integer instructions and provide data for generating addresses. See “General Purpose Registers (GPRs)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Floating Point Registers (FPRs). The 32 FPRs (FPR0–FPR31) serve as the data source or destination for all floating-point instructions. See “Floating Point Registers (FPRs)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

Condition Register (CR). The 32-bit CR consists of eight 4-bit fields, CR0–CR7, that reflect results of certain arithmetic operations and provide a mechanism for testing and branching. See “Condition Register (CR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

Floating-Point Status and Control Register (FPSCR). The FPSCR contains all floating-point exception signal bits, exception summary bits, exception enable bits, and rounding control bits needed for compliance with the IEEE 754-1985 standard. See “Floating-Point Status and Control Register (FPSCR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

The remaining user-level registers are SPRs. Note that the PowerPC Architecture provides a separate mechanism for accessing SPRs (the mtspr and mfspr instructions). These instructions are commonly used to explicitly access certain registers, while other SPRs are more typically accessed as the side effect of executing other instructions.

Integer Exception Register (XER). The XER indicates overflow and carries for integer operations. See “XER Register (XER)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Implementation Note: To allow emulation of the Load String and Compare Byte Indexed (lscbx) instruction defined by the POWER architecture, XER[16–23] is implemented so that it can be read with mfspr and written with Move-to Fixed-Point Exception Register (mtxer) instructions.

Link Register (LR). The LR provides the branch target address for the Branch Conditional to Link Register (bclrx) instruction, and can be used to hold the logical address of the instruction that follows a branch and link instruction, typically used for linking to subroutines. See “Link Register (LR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

Count Register (CTR). The CTR holds a loop count that can be decremented during execution of appropriately coded branch instructions. The CTR can also provide the branch target address for the Branch Conditional to Count Register (bcctrx) instruction. See “Count Register (CTR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.
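
As an illustration of the Condition Register layout described above (eight 4-bit fields, CR0 through CR7), the following sketch extracts one field from a 32-bit CR image. It assumes the PowerPC bit-numbering convention in which bit 0 is the most significant bit, so CR0 occupies the top four bits. The bit meanings for integer compares (LT, GT, EQ, SO) come from the PowerPC Architecture, not from this section, and the names used here are hypothetical.

```c
#include <stdint.h>

/* Bit meanings within one 4-bit CR field after an integer compare. */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2, CR_SO = 0x1 };

/* Returns field CRn (n = 0..7) from a 32-bit CR image. CR0 is the
 * most significant nibble and CR7 the least significant. */
unsigned cr_field(uint32_t cr, unsigned n)
{
    return (unsigned)(cr >> (28u - 4u * (n & 7u))) & 0xFu;
}
```

A CR image would typically be obtained with the mfcr instruction; here it is simply passed in as a value so the helper can be tried on any host.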

User-level registers (VEA)—The PowerPC VEA defines the time-base facility (TB), which consists of two 32-bit registers—Time Base Upper (TBU) and Time Base Lower (TBL). The Time Base Registers can be written to only by supervisor-level instructions, but can be read by both user-level and supervisor-level software. For more information, see “PowerPC VEA Register Set—Time Base” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.
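
Because the time base is read as two separate 32-bit registers, TBL can carry into TBU between the two reads. A common way to obtain a consistent 64-bit value is to read TBU, then TBL, then TBU again, and retry if the two TBU values differ. The sketch below shows that loop. The register reads are modeled as hypothetical callbacks so the logic can be run on a host; on the processor they would be mftb reads of TBR 269 and TBR 268.

```c
#include <stdint.h>

/* Hypothetical register-read hook. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/* Reads TBU, TBL, TBU and retries until both TBU reads agree, so the
 * returned value never mixes halves from either side of a carry. */
uint64_t read_time_base(read_reg_fn read_tbu, read_reg_fn read_tbl, void *ctx)
{
    uint32_t hi, lo, hi2;
    do {
        hi  = read_tbu(ctx);
        lo  = read_tbl(ctx);
        hi2 = read_tbu(ctx);
    } while (hi != hi2);
    return ((uint64_t)hi << 32) | lo;
}

/* Simulated time base for host-side experiments: every register read
 * advances the counter by `step` ticks. */
struct sim_tb { uint64_t now; uint64_t step; };

uint32_t sim_read_tbu(void *ctx)
{
    struct sim_tb *tb = ctx;
    uint32_t v = (uint32_t)(tb->now >> 32);
    tb->now += tb->step;
    return v;
}

uint32_t sim_read_tbl(void *ctx)
{
    struct sim_tb *tb = ctx;
    uint32_t v = (uint32_t)tb->now;
    tb->now += tb->step;
    return v;
}
```

With the simulated counter started one tick before a carry out of TBL, the first pass sees mismatched TBU values and the loop retries, which is the case the extra read exists to catch.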

Supervisor-level registers (OEA)—The OEA defines the registers an operating system uses for memory management, configuration, exception handling, and other operating system functions. The OEA defines the following supervisor-level registers for 32-bit implementations:

Configuration registers

Machine State Register (MSR). The MSR defines the state of the processor. The MSR can be modified by the Move-to Machine State Register (mtmsr), System Call (sc), and Return from Exception (rfi) instructions. It can be read by the Move-from Machine State Register (mfmsr) instruction. When an exception is taken, the contents of the MSR are saved to the Machine Status Save/Restore Register 1 (SRR1), which is described below. See “Machine State Register (MSR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Implementation Note: Table 2-1 describes MSR bits the 750GX implements that are not required by the PowerPC Architecture.

Table 2-1. Additional MSR Bits

Bit   Name   Description

13    POW    Power management enable. Optional in the PowerPC Architecture.
             0   Power management is disabled.
             1   Power management is enabled.
             The processor can enter a power-saving mode when additional conditions are present. The mode chosen is determined by the DOZE, NAP, and SLEEP bits in the Hardware-Implementation-Dependent Register 0 (HID0), described in Section 2.1.2.2 on page 65.
             To set the POW bit, see Table 10-2, HID0 Power Saving Mode Bit Settings, on page 337. The 750GX will clear the POW bit when it leaves a power saving mode.

29    PM     Performance-monitor marked mode. This bit is specific to the 750GX, and is defined as reserved by the PowerPC Architecture. See Chapter 10, Power and Thermal Management, on page 335.
             0   Process is not a marked process.
             1   Process is a marked process.
             The MSR[PM] bit is used by the Performance-Monitor to help determine when it should count events. For a description of the Performance-Monitor, see Chapter 11, Performance Monitor and System Related Features, on page 349.

Note: Clearing MSR[EE] (setting it to 0) masks not only the architecture-defined external interrupt and decrementer exceptions, but also the 750GX-specific system management, performance-monitor, and thermal-management exceptions.
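
The bit numbers in Table 2-1 follow the PowerPC convention in which bit 0 is the most significant bit of the 32-bit register. A minimal sketch of how those bit numbers translate into C masks follows; the macro and function names are hypothetical, and the MSR value is passed in as a plain integer rather than read with mfmsr.

```c
#include <stdint.h>

/* PowerPC numbers bits from the most significant end, so bit n of a
 * 32-bit register corresponds to the mask 1 << (31 - n). */
#define MSR_BIT(n)  (UINT32_C(1) << (31 - (n)))

#define MSR_POW  MSR_BIT(13)   /* power management enable */
#define MSR_PM   MSR_BIT(29)   /* performance-monitor marked mode */

/* Nonzero if power management is enabled in this MSR image. */
int msr_power_management_enabled(uint32_t msr)
{
    return (msr & MSR_POW) != 0;
}

/* Nonzero if the current process is a marked process. */
int msr_process_marked(uint32_t msr)
{
    return (msr & MSR_PM) != 0;
}
```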

Processor Version Register (PVR). This register is a read-only register that identifies the version (model) and revision level of the PowerPC processor. For more information, see “Processor Version Register (PVR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

Note: The Processor Version Number is x’7002’ for the 750GX. The processor revision level will start at x’0100’ and will be incremented for each revision of the chip.
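
A PVR value can be split into its version and revision parts as sketched below. The split into an upper 16-bit version field and a lower 16-bit revision field is the layout defined by the PowerPC Architecture, not something stated in this section. The function names are hypothetical, and the PVR value is passed in rather than read with mfspr from SPR 287.

```c
#include <stdint.h>

#define PVR_VERSION_750GX 0x7002u  /* version number given in the note above */

/* Upper 16 bits: processor version (model). */
unsigned pvr_version(uint32_t pvr)
{
    return (unsigned)(pvr >> 16);
}

/* Lower 16 bits: revision level. */
unsigned pvr_revision(uint32_t pvr)
{
    return (unsigned)(pvr & 0xFFFFu);
}

/* Nonzero if the PVR image identifies a 750GX. */
int pvr_is_750gx(uint32_t pvr)
{
    return pvr_version(pvr) == PVR_VERSION_750GX;
}
```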


Memory-management registers

Block-Address Translation (BAT) Registers. The PowerPC OEA includes an array of Block Address Translation Registers that can be used to specify eight blocks of instruction space and eight blocks of data space. The BAT registers are implemented in pairs: eight pairs of instruction BATs (IBAT0U–IBAT7U and IBAT0L–IBAT7L) and eight pairs of data BATs (DBAT0U–DBAT7U and DBAT0L–DBAT7L). Figure 2-1, PowerPC 750GX Microprocessor Programming Model - Registers, lists the SPR numbers for the BAT registers. For more information, see “BAT Registers” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual. Because the upper and lower words of a BAT pair are loaded separately, software must ensure that BAT translations remain correct while both words of the pair are being loaded.

The 750GX implements the G bit in the IBAT registers. However, attempting to execute code from an IBAT area with G = 1 causes an instruction storage interrupt (ISI) exception. This complies with the revision of the architecture described in the PowerPC Microprocessor Family: The Programming Environments Manual.

SDR1. The SDR1 register specifies the page table base address used in virtual-to-physical address translation. See “SDR1” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

Segment Registers (SR). The PowerPC OEA defines sixteen 32-bit Segment Registers (SR0–SR15). Note that the SRs are implemented on 32-bit implementations only. The fields in the Segment Register are interpreted differently depending on the value of bit 0. See “Segment Registers” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Note: The 750GX implements separate memory management units (MMUs) for instruction and data. It associates the architecture-defined SRs with the data MMU (DMMU). It reflects the values of the SRs in separate, so-called ‘shadow’ Segment Registers in the instruction MMU (IMMU).

Exception-handling registers

Data Address Register (DAR). After a data-storage interrupt (DSI) exception or an alignment exception, DAR is set to the effective address (EA) generated by the instruction at fault. See “Data Address Register (DAR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

SPRG0–SPRG3. The SPRG0–SPRG3 registers are provided for operating system use. See “SPRG0–SPRG3” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

DSISR. The Data Storage Interrupt Status Register (DSISR) defines the cause of DSI and alignment exceptions. See “DSISR” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Machine Status Save/Restore Register 0 (SRR0). The SRR0 register is used to save the address of the instruction at which execution continues when an rfi executes at the end of an exception handler routine. See “Machine Status Save/Restore Register 0 (SRR0)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Machine Status Save/Restore Register 1 (SRR1). The SRR1 register is used to save machine status on exceptions and to restore machine status when rfi executes. See “Machine Status Save/Restore Register 1 (SRR1)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Note: When a machine-check exception occurs, the 750GX sets one or more error bits in SRR1. Table 2-2 describes the SRR1 bits that the 750GX implements and that are not required by the PowerPC Architecture.

Table 2-2. Additional SRR1 Bits

Bit | Name | Description
4 | CP | Internal cache parity error.
11 | L2DBERR | Set by a double-bit error checking and correction (ECC) error in the L2.
12 | MCpin | Set by the assertion of the machine-check interrupt (MCP).
13 | TEA | Set by a transfer error acknowledge (TEA) assertion on the 60x bus.
14 | DP | Set by a data-parity error on the 60x bus.
15 | AP | Set by an address-parity error on the 60x bus.
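A machine-check handler typically needs to turn the SRR1 image into a list of causes. The following sketch shows one way to do that using the bit positions in Table 2-2 and the PowerPC convention that bit 0 is the most-significant bit. The macro, structure, and function names are hypothetical, and the SRR1 value is assumed to have been read by the handler before any further exception can overwrite it.

```c
#include <stddef.h>
#include <stdint.h>

/* PowerPC bit numbering: bit 0 is the most-significant bit. */
#define SRR1_BIT(n)   (UINT32_C(1) << (31 - (n)))

/* Machine-check cause bits from Table 2-2. */
#define SRR1_CP       SRR1_BIT(4)   /* internal cache parity error    */
#define SRR1_L2DBERR  SRR1_BIT(11)  /* double-bit ECC error in the L2 */
#define SRR1_MCPIN    SRR1_BIT(12)  /* MCP assertion                  */
#define SRR1_TEA      SRR1_BIT(13)  /* TEA assertion on the 60x bus   */
#define SRR1_DP       SRR1_BIT(14)  /* data-parity error, 60x bus     */
#define SRR1_AP       SRR1_BIT(15)  /* address-parity error, 60x bus  */

struct srr1_cause {
    uint32_t    mask;
    const char *name;
};

static const struct srr1_cause srr1_causes[] = {
    { SRR1_CP,      "CP"      },
    { SRR1_L2DBERR, "L2DBERR" },
    { SRR1_MCPIN,   "MCpin"   },
    { SRR1_TEA,     "TEA"     },
    { SRR1_DP,      "DP"      },
    { SRR1_AP,      "AP"      },
};

/* Store the names of the cause bits set in srr1 into out[] (at most cap
 * entries) and return the total number of causes found. */
static size_t srr1_decode(uint32_t srr1, const char *out[], size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < sizeof srr1_causes / sizeof srr1_causes[0]; i++) {
        if (srr1 & srr1_causes[i].mask) {
            if (n < cap)
                out[n] = srr1_causes[i].name;
            n++;
        }
    }
    return n;
}
```

Because more than one error bit can be set at a time, the function reports every match instead of stopping at the first one.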

 

 

 

 

 

 

 

Miscellaneous registers

Time Base (TB). The TB is a 64-bit structure provided for maintaining the time of day and operating interval timers. The TB consists of two 32-bit registers: Time Base Upper (TBU) and Time Base Lower (TBL). The Time Base Registers can be written to only by supervisor-level software, but can be read by both user- and supervisor-level software. See “Time Base Facility (TB) - OEA” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Decrementer Register (DEC). This register is a 32-bit decrementing counter that provides a mechanism for causing a decrementer exception after a programmable delay; the frequency is a subdivision of the processor clock. See “Decrementer Register (DEC)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

Note: In the 750GX, the Decrementer Register is decremented and the time base is incremented at a speed that is one-fourth the speed of the bus clock.
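The fixed ratio to the bus clock makes it straightforward to convert between ticks and wall-clock time. The sketch below is one possible set of helpers, assuming the bus clock frequency in hertz is known to software (for example, from board configuration); the function names are hypothetical. The limit on the decrementer start value is an assumption made here to keep the count positive as a signed 32-bit number, not a requirement stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Per the note above, the time base and the decrementer tick at
 * one-fourth of the bus clock frequency. */
static inline uint64_t tb_hz(uint64_t bus_hz)
{
    return bus_hz / 4u;
}

/* Combine separately read TBU and TBL images into one 64-bit count. */
static inline uint64_t tb_combine(uint32_t tbu, uint32_t tbl)
{
    return ((uint64_t)tbu << 32) | tbl;
}

/* Convert a time-base tick count to microseconds, rounding down. */
static inline uint64_t tb_ticks_to_us(uint64_t ticks, uint64_t bus_hz)
{
    uint64_t hz = tb_hz(bus_hz);
    assert(hz != 0);
    /* Handle whole seconds and the remainder separately so the
     * intermediate product stays small. */
    return (ticks / hz) * 1000000u + (ticks % hz) * 1000000u / hz;
}

/* Decrementer start value for a delay in microseconds, rounding down.
 * Assumption: the result must stay below 2^31. */
static inline uint32_t dec_count_for_us(uint64_t us, uint64_t bus_hz)
{
    uint64_t hz = tb_hz(bus_hz);
    uint64_t count = (us / 1000000u) * hz + (us % 1000000u) * hz / 1000000u;
    assert(count < (UINT64_C(1) << 31));
    return (uint32_t)count;
}
```

With a 200 MHz bus clock, for instance, both counters tick at 50 MHz, so a 1 ms delay corresponds to a decrementer start value of 50,000.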

Data Address Breakpoint Register (DABR). This optional register is used to cause a breakpoint exception if a specified data address is encountered. See “Data Address Breakpoint Register (DABR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual.

External Access Register (EAR). This optional register is used in conjunction with the External Control In Word Indexed (eciwx) and External Control Out Word Indexed (ecowx) instructions. Note that the EAR and the eciwx and ecowx instructions are optional in the PowerPC Architecture and might not be supported in all PowerPC processors that implement the OEA. See “External Access Register (EAR)” in Chapter 2, “PowerPC Register Set” of the PowerPC Microprocessor Family: The Programming Environments Manual for more information.

750GX-specific registers. The PowerPC Architecture allows implementation-specific SPRs. Those described below are incorporated in the 750GX. Note that, in the 750GX, these registers are all supervisor-level registers.

Instruction Address Breakpoint Register (IABR)—This register can be used to cause a breakpoint exception if a specified instruction address is encountered.


Hardware-Implementation-Dependent Register 0 (HID0)—This register controls various functions, such as enabling checkstop conditions, and locking, enabling, and invalidating the instruction and data caches, power modes, miss-under-miss, and others.

Hardware-Implementation-Dependent Register 1 (HID1)—This register reflects the state of PLL_CFG[0:4] clock signals, and phase-locked loop (PLL) selection and range bits.

Hardware-Implementation-Dependent Register 2 (HID2)—This register controls parity enablement.

L2 Cache Control Register (L2CR)—This register is used to configure and operate the L2 cache.

Performance-monitor registers. The following registers are used to define and count events for use by the performance monitor:

The Performance-Monitor Counter Registers (PMC1–PMC4) are used to record the number of times a certain event has occurred. UPMC1–UPMC4 provide user-level read access to these registers.

The Monitor Mode Control Registers (MMCR0–MMCR1) are used to enable various performance-monitor interrupt functions. UMMCR0–UMMCR1 provide user-level read access to these registers.

The Sampled Instruction Address Register (SIA) contains the effective address of an instruction executing at or around the time that the processor signals the performance-monitor interrupt condition. USIA provides user-level read access to the SIA.

The 750GX does not implement the Sampled Data Address Register (SDA) or the user-level, read-only USDA registers. However, for compatibility with processors that do, those registers can be written to by boot code without causing an exception. SDA is SPR 959; USDA is SPR 943.

Instruction Cache Throttling Control Register (ICTC). This register has bits for enabling the instruction-cache throttling feature and for controlling the interval at which instructions are forwarded to the instruction buffer in the fetch unit. This provides control over the processor’s overall junction temperature.

Thermal-Management Registers (THRM1, THRM2, THRM3, and THRM4)—Used to enable and set thresholds for the thermal-management facility.

THRM1 and THRM2 provide the ability to compare the junction temperature against two user-provided thresholds. The dual thresholds allow the thermal-management software differing degrees of action in lowering the junction temperature. The TAU can also be operated in a single-threshold mode in which the thermal sensor output is compared to only one threshold in either THRM1 or THRM2.

THRM3 is used to enable the thermal-management assist unit (TAU) and to control the comparator output sample time.

THRM4 is a read-only register containing a temperature offset (determined at the factory) applied to junction temperature measurements for improved accuracy.

Note: While it is not guaranteed that the implementation of 750GX-specific registers is consistent among PowerPC processors, other processors may implement similar or identical registers.


2.1.2 PowerPC 750GX-Specific Registers

This section describes registers that are defined for the 750GX but are not included in the PowerPC Architecture.

2.1.2.1 Instruction Address Breakpoint Register (IABR)

The Instruction Address Breakpoint Register (IABR) supports the instruction address breakpoint exception. When this exception is enabled, instruction fetch addresses are compared with an effective address stored in the IABR. If the word specified in the IABR is fetched, the instruction breakpoint handler is invoked. The instruction that triggers the breakpoint does not execute before the handler is invoked. For more information, see Section 4.5.14, Instruction Address Breakpoint Exception (0x01300), on page 173. The IABR can be accessed with mtspr and mfspr using SPR 1010.

IABR register layout (bit 0 is the most-significant bit): Address (bits 0:29), BE (bit 30), TE (bit 31).

Bits | Field Name | Description
0:29 | Address | Word address to be compared.
30 | BE | Breakpoint enabled. Setting this bit indicates that breakpoint checking is to be done.
31 | TE | Translation enabled. An IABR match is signaled if this bit matches MSR[IR].
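The IABR fields described above can be packed and checked with a few masks. The following is a minimal sketch, assuming bit 0 is the most-significant bit so that the word address occupies the upper 30 bits, BE is mask 0x2, and TE is mask 0x1. The names iabr_make and iabr_matches are hypothetical; iabr_matches is only a software model of the comparison the text describes, not a description of the hardware implementation.

```c
#include <stdint.h>

#define IABR_SPR        1010u        /* SPR number used with mtspr/mfspr */
#define IABR_ADDR_MASK  0xFFFFFFFCu  /* bits 0:29, word address          */
#define IABR_BE         0x00000002u  /* bit 30, breakpoint enabled       */
#define IABR_TE         0x00000001u  /* bit 31, translation enabled      */

/* Build an IABR value for the word containing effective address ea. */
static inline uint32_t iabr_make(uint32_t ea, int enable, int translated)
{
    uint32_t value = ea & IABR_ADDR_MASK;
    if (enable)
        value |= IABR_BE;
    if (translated)
        value |= IABR_TE;
    return value;
}

/* Model of the match rule: checking is enabled, the word addresses are
 * equal, and TE has the same value as MSR[IR]. */
static inline int iabr_matches(uint32_t iabr, uint32_t fetch_ea, int msr_ir)
{
    if (!(iabr & IABR_BE))
        return 0;
    if ((iabr ^ fetch_ea) & IABR_ADDR_MASK)
        return 0;
    return ((iabr & IABR_TE) != 0) == (msr_ir != 0);
}
```

Setting TE to the current value of MSR[IR] would make the breakpoint apply to code running in the same translation mode as the caller; whether that is the desired behavior depends on the debugger design.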

2.1.2.2 Hardware-Implementation-Dependent Register 0 (HID0)

The Hardware-Implementation-Dependent Register 0 (HID0) controls the state of several functions within the 750GX. HID0 can be accessed with mtspr and mfspr using SPR 1008.
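Software usually updates HID0 with a read-modify-write sequence so that unrelated control bits are preserved. The sketch below illustrates the masking step for the power-saving mode bits, using the bit positions from the HID0 register layout in this section (EMCP = 0, DOZE = 8, NAP = 9, SLEEP = 10, ICE = 16, DCE = 17) and assuming bit 0 is the most-significant bit. The macro and function names are hypothetical. Keeping at most one of DOZE, NAP, and SLEEP set is a choice made here for clarity; Table 10-2 gives the actual mode settings.

```c
#include <assert.h>
#include <stdint.h>

#define HID0_SPR     1008u                        /* SPR number for HID0 */
#define HID0_BIT(n)  (UINT32_C(1) << (31 - (n)))  /* bit 0 is the MSB    */

#define HID0_EMCP    HID0_BIT(0)
#define HID0_DOZE    HID0_BIT(8)
#define HID0_NAP     HID0_BIT(9)
#define HID0_SLEEP   HID0_BIT(10)
#define HID0_ICE     HID0_BIT(16)
#define HID0_DCE     HID0_BIT(17)

#define HID0_POWER_MODES  (HID0_DOZE | HID0_NAP | HID0_SLEEP)

/* Return a copy of hid0 with the three mode bits replaced by mode_bit,
 * which must be 0 or exactly one of HID0_DOZE, HID0_NAP, HID0_SLEEP.
 * All other HID0 bits are left unchanged. */
static inline uint32_t hid0_select_power_mode(uint32_t hid0, uint32_t mode_bit)
{
    assert(mode_bit == 0 || mode_bit == HID0_DOZE ||
           mode_bit == HID0_NAP || mode_bit == HID0_SLEEP);
    return (hid0 & ~HID0_POWER_MODES) | mode_bit;
}
```

The returned value would then be written back with mtspr by supervisor-level code before MSR[POW] is set; those privileged steps are outside the scope of this sketch.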

HID0 register layout (bit 0 is the most-significant bit): 0 EMCP, 1 DBP, 2 EBA, 3 EBD, 4:6 Reserved, 7 PAR, 8 DOZE, 9 NAP, 10 SLEEP, 11 DPM, 12 RISEG, 13 Reserved, 14 MUM, 15 NHR, 16 ICE, 17 DCE, 18 ILOCK, 19 DLOCK, 20 ICFI, 21 DCFI, 22 SPD, 23 IFEM, 24 SGE, 25 DCFA, 26 BTIC, 27 Reserved, 28 ABE, 29 BHT, 30 Reserved, 31 NOOPTI.

Bits | Field Name | Description
0 | EMCP | Enable MCP. The primary purpose of this bit is to mask out further machine-check exceptions caused by assertion of MCP, similar to how MSR[EE] can mask external interrupts. 0: Masks MCP. Asserting MCP does not generate a machine-check exception or a