IBM PowerPC 750GX and 750GL RISC Microprocessor
User’s Manual
Version 1.2
March 27, 2006
© Copyright International Business Machines Corporation 2004, 2006
All Rights Reserved
Printed in the United States of America March 2006.
The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both:
IBM
POWER
PowerPC 750
IBM Logo
PowerPC
PowerPC Architecture
PowerPC Logo
IEEE is a registered trademark in the United States, owned by the Institute of Electrical and Electronics Engineers.
Other company, product, and service names may be trademarks or service marks of others.
All information contained in this document is subject to change without notice. The products described in this document are NOT intended for use in applications such as implantation, life support, or other hazardous uses where malfunction could result in death, bodily injury, or catastrophic property damage. The information contained in this document does not affect or change IBM product specifications or warranties. Nothing in this document shall operate as an express or implied license or indemnity under the intellectual property rights of IBM or third parties. All information contained in this document was obtained in specific environments, and is presented as an illustration. The results obtained in other operating environments may vary.
THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN “AS IS” BASIS. In no event will IBM be liable for damages arising directly or indirectly from any use of the information contained in this document.
IBM Microelectronics Division
2070 Route 52, Bldg. 330
Hopewell Junction, NY 12533-6351
The IBM home page can be found at ibm.com
The IBM Microelectronics Division home page can be found at ibm.com/chips
gx_title.fm.(1.2) March 27, 2006
List of Figures ..... 13
List of Tables ..... 15
About This Manual ..... 19
Who Should Read This Manual ..... 19
Related Publications ..... 19
Conventions Used in This Manual ..... 20
Using This Manual with the Programming Environments Manual ..... 22
1. PowerPC 750GX Overview ..... 23
1.1 750GX Microprocessor Overview ..... 23
1.2 750GX Microprocessor Features ..... 25
1.2.1 Instruction Flow ..... 29
1.2.1.1 Instruction Queue and Dispatch Unit ..... 29
1.2.1.2 Branch Processing Unit (BPU) ..... 29
1.2.1.3 Completion Unit ..... 30
1.2.2 Independent Execution Units ..... 31
1.2.2.1 Integer Units (IUs) ..... 31
1.2.2.2 Floating-Point Unit (FPU) ..... 31
1.2.2.3 Load/Store Unit (LSU) ..... 32
1.2.2.4 System Register Unit (SRU) ..... 32
1.2.3 Memory Management Units (MMUs) ..... 32
1.2.4 On-Chip Level 1 Instruction and Data Caches ..... 33
1.2.5 On-Chip Level 2 Cache Implementation ..... 35
1.2.6 System Interface/Bus Interface Unit (BIU) ..... 35
1.2.7 Signals ..... 37
1.2.8 Signal Configuration ..... 38
1.2.9 Clocking ..... 40
1.3 750GX Microprocessor Implementation ..... 40
1.4 PowerPC Registers and Programming Model ..... 42
1.5 Instruction Set ..... 45
1.5.1 PowerPC Instruction Set ..... 45
1.5.2 750GX Microprocessor Instruction Set ..... 47
1.6 On-Chip Cache Implementation ..... 47
1.6.1 PowerPC Cache Model ..... 47
1.6.2 750GX Microprocessor Cache Implementation ..... 47
1.7 Exception Model ..... 48
1.7.1 PowerPC Exception Model ..... 48
1.7.2 750GX Microprocessor Exception Implementation ..... 49
1.8 Memory Management ..... 51
1.8.1 PowerPC Memory-Management Model ..... 51
1.8.2 750GX Microprocessor Memory-Management Implementation ..... 52
1.9 Instruction Timing ..... 52
1.10 Power Management ..... 54
1.11 Thermal Management ..... 55
1.12 Performance Monitor ..... 56
750gx_umTOC.fm.(1.2), March 27, 2006, Page 3 of 377
2. Programming Model ..... 57
2.1 PowerPC 750GX Processor Register Set ..... 57
2.1.1 Register Set ..... 57
2.1.2 PowerPC 750GX-Specific Registers ..... 64
2.1.2.1 Instruction Address Breakpoint Register (IABR) ..... 64
2.1.2.2 Hardware-Implementation-Dependent Register 0 (HID0) ..... 65
2.1.2.3 Hardware-Implementation-Dependent Register 1 (HID1) ..... 70
2.1.2.4 Hardware-Implementation-Dependent Register 2 (HID2) ..... 71
2.1.2.5 Performance-Monitor Registers ..... 72
2.1.3 Instruction Cache Throttling Control Register (ICTC) ..... 77
2.1.4 Thermal-Management Registers (THRMn) ..... 78
2.1.4.1 Thermal-Management Registers 1–2 (THRM1–THRM2) ..... 78
2.1.4.2 Thermal-Management Register 3 (THRM3) ..... 79
2.1.4.3 Thermal-Management Register 4 (THRM4) ..... 80
2.1.5 L2 Cache Control Register (L2CR) ..... 81
2.2 Operand Conventions ..... 82
2.2.1 Data Organization in Memory and Data Transfers ..... 82
2.2.2 Alignment and Misaligned Accesses ..... 82
2.2.3 Floating-Point Operand and Execution Models—UISA ..... 83
2.2.3.1 Denormalized Number Support ..... 83
2.2.3.2 Non-IEEE Mode (Nondenormalized Mode) ..... 83
2.2.3.3 Time-Critical Floating-Point Operation ..... 84
2.2.3.4 Floating-Point Storage Access Alignment ..... 84
2.2.3.5 Optional Floating-Point Graphics Instructions ..... 84
2.3 Instruction Set Summary ..... 86
2.3.1 Classes of Instructions ..... 87
2.3.1.1 Definition of Boundedly Undefined ..... 87
2.3.1.2 Defined Instruction Class ..... 87
2.3.1.3 Illegal Instruction Class ..... 88
2.3.1.4 Reserved Instruction Class ..... 89
2.3.2 Addressing Modes ..... 89
2.3.2.1 Memory Addressing ..... 89
2.3.2.2 Memory Operands ..... 89
2.3.2.3 Effective Address Calculation ..... 90
2.3.2.4 Synchronization ..... 90
2.3.3 Instruction Set Overview ..... 91
2.3.4 PowerPC UISA Instructions ..... 92
2.3.4.1 Integer Instructions ..... 92
2.3.4.2 Floating-Point Instructions ..... 95
2.3.4.3 Load-and-Store Instructions ..... 98
2.3.4.4 Branch and Flow-Control Instructions ..... 106
2.3.4.5 System Linkage Instruction—UISA ..... 108
2.3.4.6 Processor Control Instructions—UISA ..... 108
2.3.4.7 Memory Synchronization Instructions—UISA ..... 113
2.3.5 PowerPC VEA Instructions ..... 113
2.3.5.1 Processor Control Instructions—VEA ..... 113
2.3.5.2 Memory Synchronization Instructions—VEA ..... 114
2.3.5.3 Memory Control Instructions—VEA ..... 115
2.3.5.4 Optional External Control Instructions ..... 117
2.3.6 PowerPC OEA Instructions ..... 118
2.3.6.1 System Linkage Instructions—OEA ..... 118
2.3.6.2 Processor Control Instructions—OEA ..... 118
2.3.6.3 Memory Control Instructions—OEA ..... 119
2.3.7 Recommended Simplified Mnemonics ..... 120
3. Instruction-Cache and Data-Cache Operation ..... 121
3.1 Data-Cache Organization ..... 123
3.2 Instruction-Cache Organization ..... 124
3.3 Memory and Cache Coherency ..... 125
3.3.1 Memory/Cache Access Attributes (WIMG Bits) ..... 125
3.3.2 MEI Protocol ..... 126
3.3.2.1 MEI Hardware Considerations ..... 128
3.3.3 Coherency Precautions in Single-Processor Systems ..... 129
3.3.4 Coherency Precautions in Multiprocessor Systems ..... 129
3.3.5 PowerPC 750GX-Initiated Load/Store Operations ..... 130
3.3.5.1 Performed Loads and Stores ..... 130
3.3.5.2 Sequential Consistency of Memory Accesses ..... 130
3.3.5.3 Atomic Memory References ..... 130
3.4 Cache Control ..... 131
3.4.1 Cache-Control Parameters in HID0 ..... 131
3.4.1.1 Data-Cache Flash Invalidation ..... 132
3.4.1.2 Enabling and Disabling the Data Cache ..... 132
3.4.1.3 Locking the Data Cache ..... 132
3.4.1.4 Instruction-Cache Flash Invalidation ..... 133
3.4.1.5 Enabling and Disabling the Instruction Cache ..... 133
3.4.1.6 Locking the Instruction Cache ..... 133
3.4.2 Cache-Control Instructions ..... 133
3.4.2.1 Data Cache Block Touch (dcbt) and Data Cache Block Touch for Store (dcbtst) ..... 134
3.4.2.2 Data Cache Block Zero (dcbz) ..... 134
3.4.2.3 Data Cache Block Store (dcbst) ..... 135
3.4.2.4 Data Cache Block Flush (dcbf) ..... 135
3.4.2.5 Data Cache Block Invalidate (dcbi) ..... 135
3.4.2.6 Instruction Cache Block Invalidate (icbi) ..... 136
3.5 Cache Operations ..... 136
3.5.1 Cache-Block-Replacement/Castout Operations ..... 136
3.5.2 Cache Flush Operations ..... 138
3.5.3 Data-Cache Block-Fill Operations ..... 139
3.5.4 Instruction-Cache Block-Fill Operations ..... 139
3.5.5 Data-Cache Block-Push Operations ..... 139
3.6 L1 Caches and 60x Bus Transactions ..... 139
3.6.1 Read Operations and the MEI Protocol ..... 140
3.6.2 Bus Operations Caused by Cache-Control Instructions ..... 141
3.6.3 Snooping ..... 142
3.6.4 Snoop Response to 60x Bus Transactions ..... 143
3.6.5 Transfer Attributes ..... 145
3.7 MEI State Transactions ..... 147
4. Exceptions ..... 151
4.1 PowerPC 750GX Microprocessor Exceptions ..... 152
4.2 Exception Recognition and Priorities ..... 153
4.3 Exception Processing ..... 156
4.3.1 Machine Status Save/Restore Register 0 (SRR0) ..... 156
4.3.2 Machine Status Save/Restore Register 1 (SRR1) ..... 157
4.3.3 Machine State Register (MSR) ..... 158
4.3.4 Enabling and Disabling Exceptions ..... 160
4.3.5 Steps for Exception Processing ..... 160
4.3.6 Setting MSR[RI] ..... 161
4.3.7 Returning from an Exception Handler ..... 161
4.4 Process Switching ..... 162
4.5 Exception Definitions ..... 162
4.5.1 System Reset Exception (0x00100) ..... 163
4.5.1.1 Soft Reset ..... 164
4.5.1.2 Hard Reset ..... 164
4.5.2 Machine-Check Exception (0x00200) ..... 167
4.5.2.1 Machine-Check Exception Enabled (MSR[ME] = 1) ..... 168
4.5.2.2 Checkstop State (MSR[ME] = 0) ..... 169
4.5.3 DSI Exception (0x00300) ..... 169
4.5.4 ISI Exception (0x00400) ..... 169
4.5.5 External Interrupt Exception (0x00500) ..... 169
4.5.6 Alignment Exception (0x00600) ..... 170
4.5.7 Program Exception (0x00700) ..... 170
4.5.8 Floating-Point Unavailable Exception (0x00800) ..... 171
4.5.9 Decrementer Exception (0x00900) ..... 171
4.5.10 System Call Exception (0x00C00) ..... 171
4.5.11 Trace Exception (0x00D00) ..... 171
4.5.12 Floating-Point Assist Exception (0x00E00) ..... 171
4.5.13 Performance-Monitor Interrupt (0x00F00) ..... 172
4.5.14 Instruction Address Breakpoint Exception (0x01300) ..... 173
4.5.15 System Management Interrupt (0x01400) ..... 173
4.5.16 Thermal-Management Interrupt Exception (0x01700) ..... 174
4.5.17 Data Address Breakpoint Exception ..... 175
4.5.17.1 Data Address Breakpoint Register (DABR) ..... 175
4.5.18 Soft Stops ..... 175
4.5.19 Exception Latencies ..... 176
4.5.20 Summary of Front-End Exception Handling ..... 176
4.5.21 Timer Facilities ..... 177
4.5.22 External Access Instructions ..... 177
5. Memory Management ..... 179
5.1 MMU Overview ..... 179
5.1.1 Memory Addressing ..... 181
5.1.2 MMU Organization ..... 181
5.1.3 Address-Translation Mechanisms ..... 186
5.1.4 Memory-Protection Facilities ..... 187
5.1.5 Page History Information ..... 188
5.1.6 General Flow of MMU Address Translation ..... 189
5.1.6.1 Real-Addressing Mode and Block-Address-Translation Selection ..... 189
5.1.6.2 Page-Address-Translation Selection ..... 190
5.1.7 MMU Exceptions Summary ..... 192
5.1.8 MMU Instructions and Register Summary ..... 194
5.2 Real-Addressing Mode ..... 195
5.3 Block-Address Translation ..... 196
5.4 Memory Segment Model ..... 196
5.4.1 Page History Recording ..... 196
5.4.1.1 Referenced Bit ..... 197
5.4.1.2 Changed Bit ..... 198
5.4.1.3 Scenarios for Referenced and Changed Bit Recording ..... 198
5.4.2 Page Memory Protection ..... 199
5.4.3 TLB Description ..... 199
5.4.3.1 TLB Organization ..... 199
5.4.3.2 TLB Invalidation ..... 201
5.4.4 Page-Address-Translation Summary ..... 202
5.4.5 Page Table-Search Operation ..... 204
5.4.6 Page Table Updates ..... 207
5.4.7 Segment Register Updates ..... 207
6. Instruction Timing ..... 209
6.1 Terminology and Conventions ..... 209
6.2 Instruction Timing Overview ..... 211
6.3 Timing Considerations ..... 215
6.3.1 General Instruction Flow ..... 215
6.3.2 Instruction Fetch Timing ..... 216
6.3.2.1 Cache Arbitration ..... 217
6.3.2.2 Cache Hit ..... 217
6.3.2.3 Cache Miss ..... 222
6.3.2.4 L2 Cache Access Timing Considerations ..... 224
6.3.2.5 Instruction Dispatch and Completion Considerations ..... 224
6.3.2.6 Rename Register Operation ..... 224
6.3.2.7 Instruction Serialization ..... 225
6.4 Execution-Unit Timings ..... 225
6.4.1 Branch Processing Unit Execution Timing ..... 225
6.4.1.1 Branch Folding ..... 226
6.4.1.2 Branch Instructions and Completion ..... 227
6.4.1.3 Branch Prediction and Resolution ..... 228
6.4.2 Integer Unit Execution Timing ..... 232
6.4.3 Floating-Point Unit Execution Timing ..... 232
6.4.4 Effect of Floating-Point Exceptions on Performance ..... 232
6.4.5 Load/Store Unit Execution Timing ..... 233
6.4.6 Effect of Operand Placement on Performance ..... 233
6.4.7 Integer Store Gathering ..... 234
6.4.8 System Register Unit Execution Timing ..... 234
6.5 Memory Performance Considerations ..... 235
6.5.1 Caching and Memory Coherency ..... 235
6.5.2 Effect of TLB Miss ...................................................................................................... |
......... 236 |
6.6 Instruction Scheduling Guidelines ......................................................................................... |
........ 236 |
6.6.1 Branch, Dispatch, and Completion-Unit Resource Requirements ....................................... |
237 |
6.6.1.1 Branch-Resolution Resource Requirements ................................................................ |
237 |
6.6.1.2 Dispatch-Unit Resource Requirements ........................................................................ |
237 |
6.6.1.3 Completion-Unit Resource Requirements .................................................................... |
237 |
6.7 Instruction Latency Summary ........................................................................................................ |
238 |
7. Signal Descriptions ................................................................................................. |
249 |
7.1 Signal Configuration ...................................................................................................................... |
250 |
7.2 Signal Descriptions ........................................................................................................................ |
251 |
7.2.1 Address-Bus Arbitration Signals .......................................................................................... |
251 |
7.2.1.1 Bus Request (BR)—Output .......................................................................................... |
251 |
7.2.1.2 Bus Grant (BG)—Input ................................................................................................. |
252 |
7.2.1.3 Address Bus Busy (ABB) .............................................................................................. |
252 |
7.2.2 Address Transfer Start Signals ............................................................................................ |
253 |
7.2.2.1 Transfer Start (TS) ........................................................................................................ |
253 |
7.2.3 Address Transfer Signals ..................................................................................................... |
254 |
7.2.3.1 Address Bus (A[0–31]) ................................................................................................. |
254 |
7.2.3.2 Address-Bus Parity (AP[0–3]) ....................................................................................... |
255 |
7.2.4 Address Transfer Attribute Signals ...................................................................................... |
255 |
7.2.4.1 Transfer Type (TT[0–4]) ............................................................................................... |
256 |
7.2.4.2 Transfer Size (TSIZ[0–2])—Output ............................................................................... |
258 |
7.2.4.3 Transfer Burst (TBST) .................................................................................................. |
259 |
7.2.4.4 Cache Inhibit (CI)—Output ........................................................................................... |
260 |
7.2.4.5 Write-Through (WT)—Output ....................................................................................... |
260 |
7.2.4.6 Global (GBL) ................................................................................................................. |
261 |
7.2.5 Address Transfer Termination Signals ................................................................................. |
262 |
7.2.5.1 Address Acknowledge (AACK)—Input ......................................................................... |
262 |
7.2.5.2 Address Retry (ARTRY) ............................................................................................... |
263 |
7.2.6 Data-Bus Arbitration Signals ................................................................................................ |
264 |
7.2.6.1 Data-Bus Grant (DBG)—Input ...................................................................................... |
264 |
7.2.6.2 Data-Bus Write-Only (DBWO) ...................................................................................... |
265 |
7.2.6.3 Data Bus Busy (DBB) ................................................................................................... |
265 |
7.2.7 Data-Transfer Signals .......................................................................................................... |
266 |
7.2.7.1 Data Bus (DH[0–31], DL[0–31]) .................................................................................... |
266 |
7.2.7.2 Data-Bus Parity (DP[0–7]) ............................................................................................ |
267 |
7.2.7.3 Data Bus Disable (DBDIS)—Input ................................................................................ |
268 |
7.2.8 Data-Transfer Termination Signals ...................................................................................... |
268 |
7.2.8.1 Transfer Acknowledge (TA)—Input .............................................................................. |
268 |
7.2.8.2 Data Retry (DRTRY)—Input ......................................................................................... |
269 |
7.2.8.3 Transfer Error Acknowledge (TEA)—Input ................................................................... |
269 |
7.2.9 System Status Signals ......................................................................................................... |
270 |
7.2.9.1 Interrupt (INT)—Input ................................................................................................... |
270 |
7.2.9.2 System Management Interrupt (SMI)—Input ................................................................ |
270 |
7.2.9.3 Machine-Check Interrupt (MCP)—Input ....................................................................... |
271 |
7.2.9.4 Checkstop Input (CKSTP_IN)—Input ........................................................................... |
271 |
7.2.9.5 Checkstop Output (CKSTP_OUT)—Output ................................................................. |
271 |
7.2.10 Reset Signals ..................................................................................................................... |
272 |
7.2.10.1 Hard Reset (HRESET)—Input .................................................................................... |
272 |
7.2.10.2 Soft Reset (SRESET)—Input ..................................................................................... |
272 |
7.2.11 Processor Status Signals ................................................................................................... |
273 |
7.2.11.1 Quiescent Request (QREQ)—Output ......................................................................... |
273 |
7.2.11.2 Quiescent Acknowledge (QACK)—Input .................................................................... |
273 |
7.2.11.3 Reservation (RSRV)—Output ..................................................................................... |
273 |
7.2.11.4 Time Base Enable (TBEN)—Input ............................................................................. |
274 |
7.2.11.5 TLB Invalidate Synchronize (TLBISYNC)—Input ....................................................... |
274 |
7.2.12 Processor Mode Selection Signals .................................................................................... |
274 |
7.2.13 I/O Voltage Select Signals ............................................................................................. |
.... 275 |
7.2.14 Test Interface Signals ................................................................................................. |
....... 275 |
7.2.14.1 IEEE 1149.1a-1993 Interface Description .................................................................. |
275 |
7.2.14.2 LSSD_MODE ............................................................................................................. |
275 |
7.2.14.3 L1_TSTCLK ............................................................................................................ |
.... 276 |
7.2.14.4 L2_TSTCLK ............................................................................................................ |
.... 276 |
7.2.14.5 BVSEL ................................................................................................................ |
........ 276 |
7.2.15 Clock Signals .......................................................................................................... |
........... 276 |
7.2.15.1 System Clock (SYSCLK)—Input ................................................................................ |
277 |
7.2.15.2 Clock Out (CLK_OUT)—Output ................................................................................. |
277 |
7.2.15.3 PLL Configuration (PLL_CFG[0:4])—Input ................................................................. |
277 |
7.2.15.4 PLL Range (PLL_RNG[0:1])—Input ........................................................................... |
278 |
7.2.16 Power and Ground Signals ............................................................................................... |
. 278 |
8. Bus Interface Operation ......................................................................................... |
279 |
8.1 Bus Interface Overview ................................................................................................................. |
280 |
8.1.1 Operation of the Instruction and Data L1 Caches ............................................................... |
281 |
8.1.2 Operation of the Bus Interface .......................................................................................... |
... 282 |
8.1.3 Bus Signal Clocking ..................................................................................................... |
........ 282 |
8.1.4 Optional 32-Bit Data Bus Mode ........................................................................................... |
282 |
8.1.5 Direct-Store Accesses ................................................................................................... |
...... 283 |
8.2 Memory-Access Protocol .................................................................................................... |
.......... 284 |
8.2.1 Arbitration Signals ..................................................................................................... |
.......... 285 |
8.2.2 Miss-under-Miss ......................................................................................................... |
......... 286 |
8.2.2.1 Miss-under-Miss and System Performance ................................................................. |
287 |
8.2.2.2 Speculative Loads and Conditional Branches .............................................................. |
290 |
8.3 Address-Bus Tenure ..................................................................................................................... |
290 |
8.3.1 Address-Bus Arbitration ................................................................................................. |
...... 290 |
8.3.2 Address Transfer ........................................................................................................ |
......... 292 |
8.3.2.1 Address-Bus Parity .................................................................................................... |
... 294 |
8.3.2.2 Address Transfer Attribute Signals ............................................................................... |
294 |
8.3.2.3 Burst Ordering During Data Transfers .......................................................................... |
295 |
8.3.2.4 Effect of Alignment in Data Transfers ........................................................................... |
296 |
8.3.2.5 Alignment of External Control Instructions ................................................................... |
300 |
8.3.3 Address Transfer Termination ............................................................................................ |
. 300 |
8.4 Data-Bus Tenure ........................................................................................................................... |
301 |
8.4.1 Data-Bus Arbitration .................................................................................................... |
........ 301 |
8.4.1.1 Using the DBB Signal ................................................................................................... |
302 |
8.4.2 Data-Bus Write-Only ..................................................................................................... |
....... 303 |
8.4.3 Data Transfer ........................................................................................................... |
............ 303 |
8.4.4 Data-Transfer Termination ............................................................................................... |
... 303 |
8.4.4.1 Normal Single-Beat Termination .................................................................................. |
304 |
8.4.4.2 Data-Transfer Termination Due to a Bus Error ............................................................ |
307 |
8.4.5 Memory Coherency—MEI Protocol ..................................................................................... |
308 |
8.5 Timing Examples ........................................................................................................................... |
309 |
8.6 Optional Bus Configuration ................................................................................................ |
........... 316 |
8.6.1 32-Bit Data Bus Mode .................................................................................................... |
..... 316 |
8.6.2 No-DRTRY Mode ................................................................................................................. |
318 |
8.7 Processor State Signals ................................................................................................................ |
319 |
8.7.1 Support for the lwarx and stwcx. Instruction Pair ............................................................... |
319 |
8.7.2 TLBISYNC Input .................................................................................................................. |
319 |
8.8 IEEE 1149.1a-1993 Compliant Interface ....................................................................................... |
319 |
8.8.1 JTAG/COP Interface ............................................................................................................ |
319 |
8.9 Using Data-Bus Write-Only ........................................................................................................... |
320 |
9. L2 Cache ................................................................................................................... |
323 |
|
9.1 L2 Cache Overview ....................................................................................................................... |
323 |
9.2 L2 Cache Operation ...................................................................................................................... |
323 |
9.3 L2 Cache Control Register (L2CR) ............................................................................................... |
329 |
9.4 L2 Cache Initialization ................................................................................................................... |
329 |
9.5 L2 Cache Global Invalidation ........................................................................................................ |
329 |
9.6 L2 Cache Used as On-Chip Memory ............................................................................................ |
330 |
9.6.1 Locking the L2 Cache .......................................................................................................... |
330 |
9.6.1.1 Loading the Locked L2 Cache ...................................................................................... |
331 |
9.6.1.2 Locked Cache Operation .............................................................................................. |
331 |
9.7 Data-Only and Instruction-Only Modes ......................................................................................... |
332 |
9.8 L2 Cache Test Features and Methods .......................................................................................... |
332 |
9.8.1 L2CR Support for L2 Cache Testing .................................................................................... |
332 |
9.8.2 L2 Cache Testing ................................................................................................................. |
333 |
9.9 L2 Cache Timing ........................................................................................................................... |
333 |
10. Power and Thermal Management ........................................................................ |
335 |
|
10.1 Dynamic Power Management ..................................................................................................... |
335 |
|
10.2 Programmable Power Modes ...................................................................................................... |
335 |
|
|
10.2.1 Power Management Modes ............................................................................................... |
337 |
|
10.2.1.1 Full On Mode .............................................................................................................. |
337 |
|
10.2.1.2 Doze Mode ................................................................................................................. |
337 |
|
10.2.1.3 Nap Mode ................................................................................................................... |
337 |
|
10.2.1.4 Sleep Mode ................................................................................................................ |
339 |
|
10.2.1.5 Dynamic Power Reduction ......................................................................................... |
339 |
|
10.2.2 Power Management Software Considerations ................................................................... |
340 |
10.3 750GX Dual PLL Feature ............................................................................................................ |
340 |
|
|
10.3.1 Overview ............................................................................................................................ |
340 |
|
10.3.2 Configuration Restriction on Frequency Transitions .......................................................... |
341 |
|
10.3.3 Dual PLL Implementation ................................................................................................... |
342 |
10.4 Thermal Assist Unit ..................................................................................................................... |
343 |
|
|
10.4.1 Thermal Assist Unit Overview ............................................................................................ |
343 |
|
10.4.2 Thermal Assist Unit Operation ........................................................................................... |
344 |
|
10.4.2.1 TAU Single-Threshold Mode ...................................................................................... |
345 |
|
10.4.2.2 TAU Dual-Threshold Mode ......................................................................................... |
346 |
|
10.4.2.3 750GX Junction Temperature Determination ............................................................. |
346 |
|
10.4.2.4 Power Saving Modes and TAU Operation .................................................................. |
347 |
10.5 Instruction-Cache Throttling ........................................................................................................ |
347 |
|
11. Performance Monitor and System Related Features ......................................... |
349 |
11.1 Performance-Monitor Interrupt .................................................................................................... |
349 |
11.2 Special-Purpose Registers Used by Performance Monitor ......................................................... |
350 |
11.2.1 Performance-Monitor Registers ......................................................................................... |
351 |
|
|
11.2.1.1 Monitor Mode Control Register 0 (MMCR0) ............................................................... |
351 |
|
11.2.1.2 User Monitor Mode Control Register 0 (UMMCR0) .................................................... |
351 |
|
11.2.1.3 Monitor Mode Control Register 1 (MMCR1) ............................................................... |
351 |
|
11.2.1.4 User Monitor Mode Control Register 1 (UMMCR1) .................................................... |
351 |
|
11.2.1.5 Performance-Monitor Counter Registers (PMCn) ...................................................... |
351 |
|
11.2.1.6 User Performance-Monitor Counter Registers (UPMC1–UPMC4) ............................ |
354 |
|
11.2.1.7 Sampled Instruction Address Register (SIA) .............................................................. |
355 |
|
11.2.1.8 User Sampled Instruction Address Register (USIA) ................................................... |
355 |
11.3 Event Counting ............................................................................................................................ |
355 |
11.4 Event Selection ........................................................................................................................... |
356 |
11.5 Notes ........................................................................................................................................... |
356 |
11.6 Debug Support ............................................................................................................................ |
357 |
11.6.1 Overview ............................................................................................................................ |
357 |
|
11.6.2 Data-Address Breakpoint ................................................................................................ |
.. 357 |
|
11.7 JTAG/COP Functions .................................................................................................................. |
357 |
11.7.1 Introduction ........................................................................................................................ |
357 |
|
11.7.2 Processor Resources Available through JTAG/COP Serial Interface ............................... |
357 |
|
11.8 Resets ......................................................................................................................................... |
359 |
11.8.1 Hard Reset ............................................................................................................. |
........... 359 |
|
11.8.2 Soft Reset .......................................................................................................................... |
359 |
|
11.8.3 Reset Sequence ......................................................................................................... |
....... 360 |
|
11.9 Checkstops ................................................................................................................................. |
361 |
11.9.1 Checkstop Sources ...................................................................................................... |
..... 361 |
|
11.9.2 Checkstop Control Bits ................................................................................................. |
..... 361 |
|
11.9.3 Open-Collector-Driver States during Checkstop ............................................................... |
362 |
|
11.9.4 Vacancy Slot Application ............................................................................................... |
.... 362 |
|
11.10 750GX Parity ............................................................................................................................. |
363 |
|
11.10.1 Parity Control and Status ............................................................................................. |
.... 364 |
|
11.10.2 Enabling Parity Error Detection ....................................................................................... |
364 |
|
11.10.3 Parity Errors ......................................................................................................... |
............ 364 |
|
Acronyms and Abbreviations ................................................................................... |
365 |
|
Index ............................................................................................................................ |
369 |
Revision Log .............................................................................................................. |
377 |
List of Figures |
|
|
Figure 1-1. |
750GX Microprocessor Block Diagram .................................................................................. |
25 |
Figure 1-2. |
L1 Cache Organization .......................................................................................................... |
34 |
Figure 1-3. |
System Interface .................................................................................................................... |
37 |
Figure 1-4. |
750GX Microprocessor Signal Groups ................................................................................... |
39 |
Figure 1-5. |
Pipeline Diagram .................................................................................................................... |
53 |
Figure 2-1. |
PowerPC 750GX Microprocessor Programming Model—Registers ...................................... |
58 |
Figure 3-1. |
Cache Integration ................................................................................................................. |
122 |
Figure 3-2. |
Data-Cache Organization ..................................................................................................... |
123 |
Figure 3-3. |
Instruction-Cache Organization ............................................................................................ |
125 |
Figure 3-4. |
MEI Cache-Coherency Protocol—State Diagram (WIM = 001) ........................................... |
128 |
Figure 3-5. |
PLRU Replacement Algorithm ............................................................................................. |
137 |
Figure 3-6. |
750GX Cache Addresses ..................................................................................................... |
140 |
Figure 4-1. |
SRESET Asserted During HRESET .................................................................................... |
164 |
Figure 5-1. |
MMU Conceptual Block Diagram ......................................................................................... |
183 |
Figure 5-2. |
PowerPC 750GX Microprocessor IMMU Block Diagram ..................................................... |
184 |
Figure 5-3. |
750GX Microprocessor DMMU Block Diagram .................................................................... |
185 |
Figure 5-4. Address-Translation Types
Figure 5-5. General Flow of Address Translation (Real-Addressing Mode and Block)
Figure 5-6. General Flow of Page and Direct-Store Interface Address Translation
Figure 5-7. Segment Register and DTLB Organization
Figure 5-8. Page-Address-Translation Flow—TLB Hit
Figure 5-9. Primary Page Table Search
Figure 5-10. Secondary Page-Table-Search Flow
Figure 6-1. Pipelined Execution Unit
Figure 6-2. Superscalar/Pipeline Diagram
Figure 6-3. PowerPC 750GX Microprocessor Pipeline Stages
Figure 6-4. Instruction Flow Diagram
Figure 6-5. Instruction Timing—Cache Hit
Figure 6-6. Instruction Timing—Cache Miss
Figure 6-7. Branch Taken
Figure 6-8. Removal of Fall-Through Branch Instruction
Figure 6-9. Branch Completion
Figure 6-10. Branch Instruction Timing
Figure 7-1. 750GX Signal Groups
Figure 8-1. Bus Interface Address Buffers
Figure 8-2. Timing Diagram Legend
Figure 8-3. Overlapping Tenures on the 750GX Bus for a Single-Beat Transfer
Figure 8-4. Cache Diagram for Miss-under-Miss Feature
Figure 8-5. First Level Address Pipelining
Figure 8-6. Address-Bus Arbitration
Figure 8-7. Address-Bus Arbitration Showing Bus Parking
Figure 8-8. Address-Bus Transfer
Figure 8-9. Snooped Address Cycle with ARTRY
Figure 8-10. Data-Bus Arbitration
Figure 8-11. Normal Single-Beat Read Termination
Figure 8-12. Normal Single-Beat Write Termination
Figure 8-13. Normal Burst Transaction
Figure 8-14. Termination with DRTRY
Figure 8-15. Read Burst with TA Wait States and DRTRY
Figure 8-16. MEI Cache-Coherency Protocol—State Diagram (WIM = 001)
Figure 8-17. Fastest Single-Beat Reads
Figure 8-18. Fastest Single-Beat Writes
Figure 8-19. Single-Beat Reads Showing Data-Delay Controls
Figure 8-20. Single-Beat Writes Showing Data-Delay Controls
Figure 8-21. Burst Transfers with Data-Delay Controls
Figure 8-22. Use of Transfer Error Acknowledge (TEA)
Figure 8-23. 32-Bit Data-Bus Transfer (8-Beat Burst)
Figure 8-24. 32-Bit Data-Bus Transfer (2-Beat Burst with DRTRY)
Figure 8-25. IEEE 1149.1a-1993 Compliant Boundary-Scan Interface
Figure 8-26. Data-Bus Write-Only Transaction
Figure 9-1. L2 Cache
Figure 10-1. 750GX Power States
Figure 10-2. Dual PLL Block Diagram
Figure 10-3. Dual PLL Switching Example, 3X to 4X
Figure 10-4. Thermal Assist Unit Block Diagram
Figure 10-5. Instruction Cache Throttling Control SPR Diagram
Figure 11-1. 750GX IEEE 1149.1a-1993/COP Organization
Figure 11-2. Reset Sequence
List of Tables

Table 1-1. Architecture-Defined Registers (Excluding SPRs)
Table 1-2. Architecture-Defined SPRs Implemented
Table 1-3. Implementation-Specific Registers
Table 1-4. 750GX Microprocessor Exception Classifications
Table 1-5. Exceptions and Conditions
Table 2-1. Additional MSR Bits
Table 2-2. Additional SRR1 Bits
Table 2-3. Valid THRM1/THRM2 Bit Settings
Table 2-4. Memory Operands
Table 2-5. Floating-Point Operand Data-Type Behavior
Table 2-6. Floating-Point Result Data-Type Behavior
Table 2-7. Integer Arithmetic Instructions
Table 2-8. Integer Compare Instructions
Table 2-9. Integer Logical Instructions
Table 2-10. Integer Rotate Instructions
Table 2-11. Integer Shift Instructions
Table 2-12. Floating-Point Arithmetic Instructions
Table 2-13. Floating-Point Multiply/Add Instructions
Table 2-14. Floating-Point Rounding and Conversion Instructions
Table 2-15. Floating-Point Compare Instructions
Table 2-16. Floating-Point Status and Control Register Instructions
Table 2-17. Floating-Point Move Instructions
Table 2-18. Integer Load Instructions
Table 2-19. Integer Store Instructions
Table 2-20. Integer Load-and-Store with Byte-Reverse Instructions
Table 2-21. Integer Load-and-Store Multiple Instructions
Table 2-22. Integer Load-and-Store String Instructions
Table 2-23. Floating-Point Load Instructions
Table 2-24. Floating-Point Store Instructions
Table 2-25. Store Floating-Point Single Behavior
Table 2-26. Store Floating-Point Double Behavior
Table 2-27. Branch Instructions
Table 2-28. Condition Register Logical Instructions
Table 2-29. Trap Instructions
Table 2-30. System Linkage Instruction—UISA
Table 2-31. Move-to/Move-from Condition Register Instructions
Table 2-32. Move-to/Move-from Special-Purpose Register Instructions (UISA)
Table 2-33. PowerPC Encodings
Table 2-34. SPR Encodings for 750GX-Defined Registers (mfspr)
Table 2-35. Memory Synchronization Instructions—UISA
Table 2-36. Move-from Time Base Instruction
Table 2-37. Memory Synchronization Instructions—VEA
Table 2-38. User-Level Cache Instructions
Table 2-39. External Control Instructions
Table 2-40. System Linkage Instructions—OEA
Table 2-41. Move-to/Move-from Machine State Register Instructions
Table 2-42. Move-to/Move-from Special-Purpose Register Instructions (OEA)
Table 2-43. Supervisor-Level Cache-Management Instruction
Table 2-44. Segment Register Manipulation Instructions
Table 2-45. Translation Lookaside Buffer Management Instruction
Table 3-1. MEI State Definitions
Table 3-2. PLRU Bit Update Rules
Table 3-3. PLRU Replacement Block Selection
Table 3-4. Bus Operations Caused by Cache-Control Instructions (WIM = 001)
Table 3-5. Response to Snooped Bus Transactions
Table 3-6. Address/Transfer Attribute Summary
Table 3-7. MEI State Transitions
Table 4-1. PowerPC 750GX Microprocessor Exception Classifications
Table 4-2. Exceptions and Conditions
Table 4-3. Exception Priorities
Table 4-4. IEEE Floating-Point Exception Mode Bits
Table 4-5. MSR Setting Due to Exception
Table 4-6. System Reset Exception–Register Settings
Table 4-7. Settings Caused by Hard Reset
Table 4-8. HID0 Machine-Check Enable Bits
Table 4-9. Machine-Check Exception—Register Settings
Table 4-10. Performance-Monitor Interrupt Exception—Register Settings
Table 4-11. Instruction Address Breakpoint Exception—Register Settings
Table 4-12. System Management Interrupt Exception—Register Settings
Table 4-13. Thermal-Management Interrupt Exception—Register Settings
Table 4-14. Front-End Exception Handling Summary
Table 5-1. MMU Feature Summary
Table 5-2. Access Protection Options for Pages
Table 5-3. Translation Exception Conditions
Table 5-4. Other MMU Exception Conditions for the 750GX Processor
Table 5-5. 750GX Microprocessor Instruction Summary—Control MMUs
Table 5-6. 750GX Microprocessor MMU Registers
Table 5-7. Table-Search Operations to Update History Bits—TLB Hit Case
Table 5-8. Model for Guaranteed R and C Bit Settings
Table 6-1. Notation Conventions for Instruction Timing
Table 6-2. Performance Effects of Memory Operand Placement
Table 6-3. TLB Miss Latencies
Table 6-4. Branch Instructions
Table 6-5. System-Register Instructions
Table 6-6. Condition Register Logical Instructions
Table 6-7. Integer Instructions
Table 6-8. Floating-Point Instructions
Table 6-9. Load-and-Store Instructions
Table 7-1. Transfer Type Encodings for PowerPC 750GX Bus Master
Table 7-2. PowerPC 750GX Snoop Hit Response
Table 7-3. Data-Transfer Size
Table 7-4. Data-Bus Lane Assignments
Table 7-5. DP[0–7] Signal Assignments
Table 7-6. Summary of Mode Select Signals
Table 7-7. Bus Voltage Selection Settings
Table 7-8. IEEE Interface Pin Descriptions
Table 8-1. Transfer Size Signal Encodings
Table 8-2. Burst Ordering—64-Bit Bus
Table 8-3. Burst Ordering—32-Bit Bus
Table 8-4. Aligned Data Transfers
Table 8-5. Misaligned Data Transfers (4-Byte Examples)
Table 8-6. Aligned Data Transfers (32-Bit Bus Mode)
Table 8-7. Misaligned 32-Bit Data-Bus Transfer (4-Byte Examples)
Table 9-1. Interpretation of LRU Bits
Table 9-2. Modification of LRU Bits
Table 9-3. Effect of Locked Ways on LRU Interpretation
Table 10-1. 750GX Microprocessor Programmable Power Modes
Table 10-2. HID0 Power Saving Mode Bit Settings
Table 10-3. Valid THRM1 and THRM2 Bit Settings
Table 10-4. ICTC Bit Field Settings
Table 11-1. Performance Monitor SPRs
Table 11-2. PMC1 Events—MMCR0[19:25] Select Encodings
Table 11-3. PMC2 Events—MMCR0[26:31] Select Encodings
Table 11-4. PMC3 Events—MMCR1[0:4] Select Encodings
Table 11-5. PMC4 Events—MMCR1[5:9] Select Encodings
Table 11-6. HID0 Checkstop Control Bits
Table 11-7. HID2 Checkstop Control Bits
Table 11-8. L2CR Checkstop Control Bits
This user’s manual defines the functionality of the PowerPC® 750GX and 750GL RISC microprocessors. It describes features of the 750GX and 750GL that are not defined by the architecture. This book is intended as a companion to the PowerPC Microprocessor Family: The Programming Environments (referred to as The Programming Environments Manual).
Note: Soft copies of the latest version of this manual, and of the IBM documents it refers to, can be accessed on the World Wide Web at http://www-3.ibm.com/chips/techlib.
Note: All information contained in this document referring to the PowerPC 750GX RISC Microprocessor also pertains to the IBM PowerPC 750GL RISC Microprocessor.
This manual is intended for system software developers, hardware developers, and applications programmers designing products for the 750GX. Readers should understand operating systems, microprocessor system design, basic principles of RISC processing, and details of the PowerPC Architecture™.
PowerPC Architecture
• May, Cathy, et al., eds. The PowerPC Architecture: A Specification for a New Family of RISC Processors, Second Edition. San Francisco, CA: Morgan-Kaufmann, 1994.
• McClanahan, Kip. PowerPC Programming for Intel Programmers. Foster City, CA: Hungry Minds, 1995.
• Shanley, Tom. PowerPC System Architecture, Second Edition. Richardson, TX: Addison-Wesley, 1995.
PowerPC Microprocessor Documentation
The latest version of this manual, errata, and other IBM documents referred to in this manual can be found at: http://www.ibm.com/chips/techlib.
• PowerPC 750GX RISC Microprocessor Datasheet. Provides data about bus timing, signal behavior, electrical and thermal characteristics, and other design considerations for each PowerPC implementation.
• PowerPC Microprocessor Family: The Programming Environments Manual (G522-0290-01). Provides information about resources defined by the PowerPC Architecture that are common to PowerPC processors.
• Implementation Variances Relative to Rev. 1 of The Programming Environments Manual.
• PowerPC Microprocessor Family: The Programmer’s Pocket Reference Guide (SA14-2093-00). This foldout card provides an overview of the PowerPC registers, instructions, and exceptions for 32-bit implementations.
• PowerPC Microprocessor Family: The Programmer’s Reference Guide (MPRPPCPRG-01). Includes the register summary, memory control model, exception vectors, and the PowerPC instruction set.
• Application notes. These short documents contain information about specific design issues useful to programmers and engineers working with PowerPC processors.
Notational Conventions
mnemonics       Instruction mnemonics are shown in lowercase bold.
italics         Italics indicate variable command parameters. For example: bcctrx. Book titles in text are set in italics.
0x0             Prefix to denote a hexadecimal number.
0b0             Prefix to denote a binary number.
crfD            Instruction syntax used to identify a destination Condition Register (CR) field.
rA, rB          Instruction syntax used to identify a source General Purpose Register (GPR).
rD              Instruction syntax used to identify a destination GPR.
frA, frB, frC   Instruction syntax used to identify a source Floating Point Register (FPR).
frD             Instruction syntax used to identify a destination FPR.
REG[FIELD]      Abbreviations or acronyms for registers are shown in uppercase text. Specific bits, fields, or ranges appear in brackets. For example, MSR[LE] refers to the little-endian mode enable bit in the Machine State Register.
x               In certain contexts, such as a signal encoding, this indicates a don’t care.
n               Used to express an undefined numerical value.
¬               NOT logical operator.
&               AND logical operator.
|               OR logical operator.
0 0 0 0         Indicates reserved bits or bit fields in a register. Although these bits can be written to as either ones or zeros, they are always read as zeros.
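The REG[FIELD] notation and the 0x and 0b prefixes can be illustrated with a short Python sketch. It assumes the PowerPC bit-numbering convention in which bit 0 is the most-significant bit of a 32-bit register, and it assumes that MSR[LE] is bit 31 (that position is not stated in this section). The helper name reg_field is hypothetical and is not part of any IBM tool.

```python
REG_WIDTH = 32  # the 750GX is a 32-bit implementation


def reg_field(value, first, last=None, width=REG_WIDTH):
    """Return bits first..last of value, where bit 0 is the most-significant bit."""
    if last is None:
        last = first  # single-bit field such as MSR[LE]
    if not (0 <= first <= last < width):
        raise ValueError("bit range is out of order or outside the register")
    shift = width - 1 - last               # distance of the field from the LSB
    mask = (1 << (last - first + 1)) - 1   # one mask bit per bit in the field
    return (value >> shift) & mask


MSR_LE_BIT = 31  # assumption: MSR[LE] is the least-significant bit of the MSR

msr = 0x00000001                   # hexadecimal literal, as in the 0x0 convention
little_endian = reg_field(msr, MSR_LE_BIT)

mmcr0 = 0b1010101 << 6             # binary literal placed in bits 19 through 25
pmc1_select = reg_field(mmcr0, 19, 25)
```

A range such as MMCR0[19:25] therefore names a 7-bit field whose last bit sits 6 positions above the least-significant bit of the register.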
Terminology Conventions
The following table describes terminology conventions used in this manual and the equivalent terminology used in the PowerPC Architecture specification.
PowerPC Architecture Specification      750GX User’s Manual
Data-storage interrupt (DSI)            DSI exception
Extended mnemonics                      Simplified mnemonics
Fixed-point unit (FXU)                  Integer unit (IU)
Instruction storage interrupt (ISI)     ISI exception
Interrupt                               Exception
Privileged mode (or privileged state)   Supervisor-level privilege
Problem mode (or problem state)         User-level privilege
Real address                            Physical address
Relocation                              Translation
Storage (locations)                     Memory
Storage (the act of)                    Access
Store in                                Write back
Store through                           Write through
Instruction Field Conventions
The following table describes instruction field conventions used in this manual and the equivalent conventions from the PowerPC Architecture specification.
PowerPC Architecture Specification      750GX User’s Manual
BA, BB, BT                              crbA, crbB, crbD (respectively)
BF, BFA                                 crfD, crfS (respectively)
D                                       d
DS                                      ds
FLM                                     FM
FRA, FRB, FRC, FRT, FRS                 frA, frB, frC, frD, frS (respectively)
FXM                                     CRM
RA, RB, RT, RS                          rA, rB, rD, rS (respectively)
SI                                      SIMM
U                                       IMM
UI                                      UIMM
/, //, ///                              0...0 (shaded)
Because the PowerPC Architecture is designed to be flexible to support a broad range of processors, the PowerPC Microprocessor Family: The Programming Environments Manual provides a general description of features that are common to PowerPC processors and indicates those features that are optional or that might be implemented differently in the design of each processor.
This document and The Programming Environments Manual describe three levels, or programming environments, of the PowerPC Architecture:
• PowerPC user instruction set architecture (UISA)—The UISA defines the level of the architecture to which user-level software should conform. The UISA defines the base user-level instruction set, user-level registers, data types, memory conventions, and the memory and programming models seen by application programmers.
• PowerPC virtual environment architecture (VEA)—The VEA, which is the smallest component of the PowerPC Architecture, defines additional user-level functionality that falls outside typical user-level software requirements. The VEA describes the memory model for an environment in which multiple processors or other devices can access external memory and defines aspects of the cache model and cache-control instructions from a user-level perspective. The resources defined by the VEA are particularly useful for optimizing memory accesses and for managing resources in an environment in which other processors and other devices can access external memory.
Implementations that conform to the PowerPC VEA also conform to the PowerPC UISA, but might not necessarily adhere to the OEA.
• PowerPC operating environment architecture (OEA)—The OEA defines supervisor-level resources typically required by an operating system. The OEA defines the PowerPC memory-management model, supervisor-level registers, and the exception model.
Implementations that conform to the PowerPC OEA also conform to the PowerPC UISA and VEA.
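The conformance rules for the three programming environments can be summarized as a simple ordering. The following Python sketch is only an illustration of those rules as stated above; the names Level and implied_levels are hypothetical, and the numeric values carry no meaning beyond the ordering.

```python
from enum import IntEnum


class Level(IntEnum):
    """The three programming environments, ordered by what conformance implies."""
    UISA = 1  # user instruction set architecture
    VEA = 2   # virtual environment architecture
    OEA = 3   # operating environment architecture


def implied_levels(level):
    """Return every level an implementation conforms to if it conforms to `level`."""
    return [candidate for candidate in Level if candidate <= level]


vea_implies = implied_levels(Level.VEA)   # UISA and VEA, but not OEA
oea_implies = implied_levels(Level.OEA)   # UISA, VEA, and OEA
```

Modeling the levels as an ordered enumeration captures the one-directional rule: conformance to the VEA implies the UISA but says nothing about the OEA.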
Some resources are defined more generally at one level in the architecture and more specifically at another. For example, conditions that cause a floating-point exception are defined by the UISA, while the exception mechanism itself is defined by the OEA.
Because it is important to distinguish between the levels of the architecture in order to ensure compatibility across multiple platforms, those distinctions are shown clearly throughout this book.
For ease of reference, the arrangement of topics in this book follows that of The Programming Environments Manual. Topics build upon one another, beginning with a description and complete summary of 750GX-specific registers and instructions and progressing to more specialized topics such as 750GX-specific details regarding the cache, exception, and memory-management models. Therefore, chapters can include information from multiple levels of the architecture. (For example, the discussion of the cache model uses information from both the VEA and the OEA.)
The PowerPC Architecture: A Specification for a New Family of RISC Processors defines the architecture from the perspective of the three programming environments and remains the defining document for the PowerPC Architecture. For information about PowerPC documentation, see Related Publications on page 19.
The IBM PowerPC 750GX reduced instruction set computer (RISC) Microprocessor is an implementation of the PowerPC Architecture™ with enhancements based on the IBM PowerPC 750™, 750CXe, and 750FX RISC microprocessor designs. This chapter provides an overview of the PowerPC 750GX microprocessor features, including a block diagram that shows the major functional components. It also describes how the 750GX implementation complies with the PowerPC Architecture definition.
Note: In this document, the IBM PowerPC 750GX RISC Microprocessor is abbreviated as 750GX or 750GX RISC Microprocessor.
The 750GX is a 32-bit implementation of the PowerPC Architecture in a 0.13 micron CMOS technology with six levels of copper interconnect. The 750GX is designed for high performance and low power consumption. It provides a superset of functionality to the PowerPC 750 processor, including a complete 60x bus interface, and enhancements such as an integrated 1-MB L2 cache.
750GX implements the 32-bit portion of the PowerPC Architecture, which provides 32-bit effective addresses, integer data types of 8, 16, and 32 bits, and floating-point data types of single and double-precision. 750GX is a superscalar processor that can complete two instructions simultaneously.
It incorporates the following six execution units:
•Floating-point unit (FPU)
•Branch processing unit (BPU)
•System register unit (SRU)
•Load/store unit (LSU)
•Two integer units (IUs): IU1 executes all integer instructions. IU2 executes all integer instructions except multiply and divide instructions.
The ability to execute several instructions in parallel and the use of simple instructions with rapid execution times yield high efficiency and throughput for 750GX-based systems. Most integer instructions execute in one clock cycle. The FPU is pipelined; it breaks the tasks it performs into subtasks, and then executes in three successive stages. Typically, a floating-point instruction can occupy only one of the three stages at a time, freeing the previous stage to work on the next floating-point instruction. Thus, three single-precision floating-point instructions can be in the FPU execute stage at a time. Double-precision add instructions have a 3-cycle latency; double-precision multiply and multiply/add instructions have a 4-cycle latency.
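The throughput benefit of the three-stage pipeline can be illustrated with a small model. The following Python sketch is illustrative only: it assumes an idealized pipeline in which every instruction spends exactly one cycle in each stage and never stalls, so it does not account for the longer double-precision multiply latency noted above. The function names are hypothetical.

```python
def fpu_cycles(num_instructions, stages=3):
    """Cycles needed to finish a run of independent FP instructions.

    Idealized model: one cycle per stage, no stalls, and one new
    instruction entering the first stage each cycle.
    """
    if num_instructions == 0:
        return 0
    # The first instruction needs `stages` cycles; each later one
    # finishes one cycle after its predecessor.
    return stages + (num_instructions - 1)


def unpipelined_cycles(num_instructions, stages=3):
    """Same work if each instruction had to finish before the next began."""
    return stages * num_instructions
```

Under these assumptions, three back-to-back instructions finish in five cycles instead of nine, because up to three instructions overlap in the execute stages.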
Figure 1-1, 750GX Microprocessor Block Diagram, on page 25 shows the parallel organization of the execution units (shaded in the diagram). The instruction unit fetches and dispatches instructions and predicts branches. Note that this is a conceptual model that shows basic features rather than attempting to show how features are implemented physically.
750GX has independent on-chip, 32-KB, 8-way set-associative, physically addressed caches for instructions and data, and independent instruction and data memory management units (MMUs). Each memory management unit has a 128-entry, 2-way set-associative translation lookaside buffer (DTLB and ITLB) that saves recently used page-address translations. Block-address translation is done through the 8-entry instruction and data block-address-translation (IBAT and DBAT) arrays, defined by the PowerPC Architecture. During block translation, effective addresses are compared simultaneously with all eight block-address-translation (BAT) entries.
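The TLB organization above implies 64 sets of two entries each. The Python sketch below works through that arithmetic and shows one plausible way an effective address could select a set. It assumes the 4-KB page size of the 32-bit PowerPC Architecture; the choice of the low-order page-number bits as the set index is an assumption made for illustration, not something stated in this section, and the function names are hypothetical.

```python
PAGE_SIZE = 4096                      # 4-KB pages
TLB_ENTRIES = 128
TLB_WAYS = 2
TLB_SETS = TLB_ENTRIES // TLB_WAYS    # 64 sets, two entries per set


def split_effective_address(ea):
    """Split a 32-bit EA into (upper 20 bits, 12-bit byte offset)."""
    ea &= 0xFFFFFFFF
    return ea // PAGE_SIZE, ea % PAGE_SIZE


def tlb_set_index(ea):
    """Set selected for this EA (assumed: low-order page-number bits)."""
    page_number, _ = split_effective_address(ea)
    return page_number % TLB_SETS
```

With this indexing, two pages whose page numbers differ by a multiple of 64 compete for the same two entries.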
For information about the L1 cache, see Chapter 3, Instruction-Cache and Data-Cache Operation, on page 121. The L2 cache is implemented with an on-chip, 4-way set-associative tag memory, and an on-chip 1-MB SRAM with error correction code (ECC) protection for data storage. For more information on the L2 cache, see Chapter 9 on page 323.
The 750GX has a 32-bit address bus and a 64-bit data bus. Multiple devices compete for system resources through a central external arbiter. The 750GX’s 3-state cache-coherency protocol (MEI) supports the modified, exclusive, and invalid states, a compatible subset of the MESI (modified/exclusive/shared/invalid) 4-state protocol, and it operates coherently in systems with 4-state caches. The 750GX supports single-beat and burst data transfers for external memory accesses and memory-mapped I/O operations. The system interface is described in Chapter 7, Signal Descriptions, on page 249 and Chapter 8, Bus Interface Operation, on page 279.
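The three MEI states can be pictured as a small state machine for a single cache block. The Python sketch below is a simplified model, not the 750GX transition table: the event names are hypothetical, and it ignores write-through and caching-inhibited accesses as well as the different bus transaction types. It assumes that, because there is no shared state, any snoop hit from another caching master causes the block to be given up.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    INVALID = "I"


def next_state(state, event):
    """Return (new_state, needs_copyback) for one cache block.

    Events (hypothetical names): "local_read", "local_write",
    "snoop_read", "snoop_write".
    """
    if event == "local_read":
        # A miss fills the block as exclusive; hits keep their state.
        return (State.EXCLUSIVE if state is State.INVALID else state), False
    if event == "local_write":
        # Any write leaves the block modified.
        return State.MODIFIED, False
    if event in ("snoop_read", "snoop_write"):
        # With no shared state, a snoop hit gives up the block;
        # modified data must be copied back to memory first.
        return State.INVALID, state is State.MODIFIED
    raise ValueError(f"unknown event: {event}")
```

In a 4-state system, the other caches can still use their shared state among themselves; in this model, the MEI cache simply never holds a block in that state.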
The 750GX has four software-controllable power-saving modes. The three static modes (doze, nap, and sleep) progressively reduce power dissipation. The fourth, a dynamic power management mode, causes functional units that are idle to enter a low-power mode automatically without affecting operational performance, software execution, or external hardware. The 750GX also provides a thermal assist unit (TAU) and a way to reduce the instruction fetch rate to limit power dissipation. Power management is described in Chapter 10, Power and Thermal Management, on page 335.
Figure 1-1. 750GX Microprocessor Block Diagram
[Block diagram not reproduced. Labeled components: instruction control unit (instruction fetcher; branch processing unit with 64-entry BTIC, BHT, CTR, LR, and CR; 6-word instruction queue; dispatch unit, 2 instructions); instruction MMU (shadow SRs, IBAT array, ITLB); 32-KB instruction cache with tags; reservation stations; integer unit 1 (+ x ÷); integer unit 2 (+); system register unit; GPR file with six rename buffers; load/store unit (EA calculation, store queue); FPR file with six rename buffers; floating-point unit (+ x ÷, FPSCR); completion unit with 6-entry reorder buffer; data MMU (original SRs, DBAT array, DTLB); 32-KB data cache with tags; 60x bus interface unit (instruction fetch queue, L1 castout queue, data load queue); L2 cache (L2CR, L2 tag, 1-MB SRAM); interrupt logic; 32-bit address bus and 64-bit data bus on the 60x bus. Additional features: time base counter/decrementer, clock multiplier, JTAG/COP interface, thermal/power management, performance monitor.]
This section lists features of the 750GX. The interrelationship of these features is shown in Figure 1-1 on page 25.
Major features of 750GX are:
•High-performance, superscalar microprocessor.
–As many as four instructions can be fetched from the instruction cache per clock cycle.
–As many as two instructions can be dispatched and completed per clock.
–As many as six instructions can execute per clock (including two integer instructions).
–Single-clock-cycle execution for most instructions.
•Six independent execution units and two register files.
–BPU featuring both static and dynamic branch prediction.
•64-entry (16-set, 4-way set-associative) branch target instruction cache (BTIC), a cache of branch instructions that have been encountered in branch/loop code sequences. If a target instruction is in the BTIC, it is fetched into the instruction queue a cycle sooner than it can be made available from the instruction cache. Typically, if a fetch access hits the BTIC, it provides the first two instructions in the target stream, effectively yielding a zero-cycle branch.
•512-entry branch history table (BHT) with two bits per entry for four levels of prediction—not-taken, strongly not-taken, taken, strongly taken.
•Removal of Branch instructions that do not update the Count Register (CTR) or Link Register (LR) from the instruction stream.
–Two integer units (IUs) that share 32 general purpose registers (GPRs) for integer operands.
•IU1 can execute any integer instruction.
•IU2 can execute all integer instructions except multiply and divide instructions (that is, shift, rotate, arithmetic, and logical instructions). Most instructions that execute in the IU2 take one cycle to execute. The IU2 has a single-entry reservation station.
–3-stage floating-point unit (FPU).
•FPU fully compliant with IEEE® 754-1985 for both single-precision and double-precision operations.
•Support for non-IEEE mode for time-critical operations.
•Hardware support for denormalized numbers.
•Hardware support for divide.
•2-entry reservation station.
•Thirty-two 64-bit Floating Point Registers (FPRs) for single and double-precision operations.
–2-stage load/store unit (LSU).
•2-entry reservation station.
•4-entry load queue.
•Single-cycle, pipelined cache access.
•Dedicated adder performs effective address (EA) calculations.
•Performs alignment and precision conversion for floating-point data.
•Performs alignment and sign extension for integer data.
•3-entry store queue.
•Supports both big-endian and little-endian modes.
–System register unit (SRU) handles miscellaneous instructions.
•Executes Condition Register (CR) logical and Move-to/Move-from SPR instructions (mtspr and mfspr).
•Single-entry reservation station.
•Rename buffers.
–Six GPR rename buffers.
–Six FPR rename buffers.
–Condition Register buffering supports two CR writes per clock.
•Completion unit.
–The completion unit retires an instruction from the 6-entry reorder buffer (completion queue) when all instructions ahead of it have been completed, the instruction has finished execution, and no exceptions are pending.
–Guarantees a sequential programming model and a precise-exception model.
–Monitors all dispatched instructions and retires them in order.
–Tracks unresolved branches and flushes instructions from the mispredicted branch path.
PowerPC 750GX Overview |
gx_01.fm.(1.2) |
Page 26 of 377 |
March 27,2006 |
User’s Manual
IBM PowerPC 750GX and 750GL RISC Microprocessor
–Retires as many as two instructions per clock.
•Separate on-chip L1 instruction and data caches (Harvard architecture).
–32-KB, 8-way set-associative instruction and data caches.
–Pseudo least-recently-used (PLRU) replacement algorithm.
–32-byte (8-word) cache block.
–Physically indexed/physical tags.
Note: The PowerPC Architecture refers to physical address space as real address space.
–Cache write-back or write-through operation programmable on a virtual-page or BAT-block basis.
–Instruction cache can provide four instructions per clock; data cache can provide two words per clock.
–Caches can be disabled in software.
–Caches can be locked in software.
–Data-cache coherency (MEI) maintained in hardware.
–The critical double word is made available to the requesting unit when it is read into the line-fill buffer. The cache is nonblocking, so it can be accessed during block reload.
–Nonblocking instruction cache (one outstanding miss).
–Nonblocking data cache (four outstanding misses).
–No snooping of instruction cache.
–Parity for L1 tags and caches.
•Integrated L2 cache.
–1-MB on-chip ECC SRAMs.
–On-chip 4-way set-associative tag memory.
–ECC error correction for most single-bit errors; detection of remaining single-bit errors and all double-bit errors.
–Copy-back or write-through data cache on a page basis, or for entire L2.
–64-byte line size, two sectors per line.
–L2 frequency at core speed.
–On-board ECC; parity for L2 tags.
–Supports up to four outstanding misses (three data and one instruction or four data).
–Cache locking by way.
•Separate memory management units (MMUs) for instructions and data.
–52-bit virtual address; 32-bit physical address.
–Address translation for virtual pages or variable-sized BAT blocks.
–Memory programmable as write-back or write-through, cacheable or noncacheable, and coherency enforced or coherency not enforced on a virtual-page or BAT block basis.
–Separate IBAT and DBAT arrays (eight each) for instructions and data, respectively.
–Separate virtual instruction and data translation lookaside buffers (TLBs).
•Both TLBs are 128-entry, 2-way set associative, and use an LRU replacement algorithm.
•TLBs are hardware-reloadable (the page table search is performed by hardware).
•Bus interface features:
–Enhanced 60x bus that pipelines back-to-back reads to a depth of four. A dedicated snoop queue allows snoop copybacks to pipeline along with up to the maximum of four reads. Enveloped write transactions are supported with the assertion of DBWO.
–Selectable bus-to-core clock frequency ratios of 2x, 2.5x, 3x, 3.5x, 4x, 4.5x, 5x, 5.5x, 6x, 6.5x, 7x, 7.5x, 8x, 8.5x, 9x, 9.5x, 10x, 11x, 12x, 13x, 14x, 15x, 16x, 17x, 18x, 19x, and 20x supported (2x, 2.5x, 3x, and 3.5x not supported with bus pipelining enabled).
–A 64-bit, split-transaction external data bus with burst transfers.
–Support for address pipelining and limited out-of-order bus transactions.
–8-word reload buffer for the L1 data cache.
–Single-entry instruction fetch queue.
–2-entry L2 cache castout queue.
–No-DRTRY mode eliminates the DRTRY signal from the qualified bus grant. This allows the forwarding of data during load operations to the internal core one bus cycle sooner than if the use of DRTRY is enabled.
–Selectable I/O interface voltages of 1.8 V, 2.5 V, or 3.3 V.
•Multiprocessing support features:
–Hardware-enforced, 3-state cache-coherency protocol (MEI) for data cache.
–Load/store with reservation instruction pair for atomic memory references, semaphores, and other multiprocessor operations.
•Power and thermal management:
–Three static modes, doze, nap, and sleep, progressively reduce power dissipation:
•Doze—All the functional units are disabled except for the Time Base/Decrementer Registers and the bus snooping logic.
•Nap—The nap mode further reduces power consumption by disabling bus snooping, leaving only the Time Base Register and the PLL in a powered state.
•Sleep—All internal functional units are disabled, after which external system logic can disable the PLL and SYSCLK.
–Software-controllable thermal management. Thermal management is performed through the use of three supervisor-level registers and a 750GX-specific thermal-management exception.
–Software-controlled frequency switching (dual PLL mode) to allow toggling between minimum and maximum frequencies to manage power consumption based on computational load.
–Instruction-cache throttling provides control to slow instruction fetching to limit power consumption.
•Hardware-assist features for fault-tolerant systems including L2 ECC correction, parity checking on internal arrays, and dual-processor lockstep operation.
•Performance monitor can be used to help debug system designs and improve software efficiency.
•In-system testability and debugging features through Joint Test Action Group (JTAG) boundary-scan capability.
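The cache sizes, associativities, and line sizes listed above determine how many sets each structure has. The Python sketch below works through that arithmetic using only the figures from the feature list; the helper name `num_sets` is hypothetical.

```python
KB = 1024
MB = 1024 * KB


def num_sets(capacity_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return capacity_bytes // (ways * line_bytes)


# L1 instruction or data cache: 32 KB, 8-way, 32-byte blocks.
l1_sets = num_sets(32 * KB, ways=8, line_bytes=32)

# L2 cache: 1 MB, 4-way tags, 64-byte lines (two 32-byte sectors each).
l2_sets = num_sets(1 * MB, ways=4, line_bytes=64)

# BTIC: 64 entries, 4-way, so 16 sets, matching the feature list.
btic_sets = 64 // 4
```

The results are 128 sets for each L1 cache and 4096 sets for the L2, which correspond to 7 and 12 index bits, respectively.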
As shown in Figure 1-1, 750GX Microprocessor Block Diagram, on page 25, the 750GX instruction control unit provides centralized control of instruction flow to the execution units. The instruction unit contains a sequential instruction fetcher (Ifetch), a 6-entry instruction queue (IQ), a dispatch unit, and the BPU. It determines the address of the next instruction to be fetched based on information from the sequential instruction fetcher and from the BPU. See Chapter 6, Instruction Timing, on page 209 for more information.
The sequential instruction fetcher loads instructions from the instruction cache into the instruction queue. The BPU extracts branch instructions from the sequential instruction fetcher. Branch instructions that cannot be resolved immediately are predicted using either 750GX-specific dynamic branch prediction or the architecture-defined static branch prediction.
Branch instructions that do not update the LR or CTR are removed from (folded out of) the instruction stream. Instruction fetching continues along the predicted path of the branch instruction.
Instructions issued to execution units beyond a predicted branch can be executed but are not retired until the branch is resolved. If branch prediction is incorrect, the completion unit flushes all instructions fetched on the predicted path, and instruction fetching resumes along the correct path.
The instruction queue (IQ), shown in Figure 1-1 on page 25, holds as many as six instructions and loads up to four instructions from the instruction cache during a single processor clock cycle. The instruction fetcher continuously attempts to load as many instructions as there were vacancies created in the IQ in the previous clock cycle. All instructions except branches are dispatched to their respective execution units from the bottom two positions in the instruction queue (IQ0 and IQ1) at a maximum rate of two instructions per cycle. Reservation stations are provided for the IU1, IU2, FPU, LSU, and SRU for dispatched instructions. The dispatch unit checks for source and destination register dependencies, allocates rename buffers, determines whether a position is available in the completion queue, and inhibits subsequent instruction dispatching if these resources are not available.
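The interaction between fetch and dispatch rates can be sketched with a short simulation. The Python model below is illustrative only: it assumes no branches, no cache misses, and no dispatch stalls from missing rename buffers or completion-queue entries. The function name `simulate` and the representation of instructions as list items are hypothetical.

```python
from collections import deque

IQ_DEPTH = 6        # instruction queue entries
MAX_FETCH = 4       # instructions fetched per cycle
MAX_DISPATCH = 2    # instructions dispatched per cycle (IQ0 and IQ1)


def simulate(program, cycles):
    """Return, for each cycle, the list of instructions dispatched."""
    pending = deque(program)   # instructions not yet fetched
    iq = deque()               # iq[0] is IQ0, the bottom entry
    trace = []
    for _ in range(cycles):
        # Free slots left over from earlier cycles.
        vacancies = IQ_DEPTH - len(iq)
        # Dispatch from the bottom two positions.
        dispatched = [iq.popleft() for _ in range(min(MAX_DISPATCH, len(iq)))]
        # Fetch only into the slots that were already vacant.
        for _ in range(min(MAX_FETCH, vacancies, len(pending))):
            iq.append(pending.popleft())
        trace.append(dispatched)
    return trace
```

In this model, a ten-instruction straight-line sequence fills the queue with four instructions in the first cycle and then dispatches two per cycle, so the fetcher only needs to replace the two entries vacated in the previous cycle to keep the dispatcher busy.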
Branch instructions can be detected, decoded, and predicted from anywhere in the instruction queue. For a more detailed discussion of instruction dispatch, see Section 6.6.1, Branch, Dispatch, and Completion-Unit Resource Requirements, on page 237.
The BPU receives branch instructions from the sequential instruction fetcher and performs CR lookahead operations on conditional branches to resolve them early, achieving the effect of a zero-cycle branch in many cases.
Unconditional branch instructions and conditional branch instructions in which the condition is known can be resolved immediately. For unresolved conditional branch instructions, the branch path is predicted using either the architecture-defined static branch prediction or 750GX-specific dynamic branch prediction. Dynamic branch prediction is enabled if the BHT bit in Hardware-Implementation-Dependent Register 0 is set (HID0[BHT] = 1).
When a prediction is made, instruction fetching, dispatching, and execution continue along the predicted path, but instructions cannot be retired and write results back to architected registers until the prediction is determined to be correct (resolved). When a prediction is incorrect, the instructions from the incorrect path are flushed from the processor, and instruction fetching resumes along the correct path. The 750GX allows a second branch instruction to be predicted; instructions from the second predicted branch instruction stream can be fetched but cannot be dispatched. These instructions are held in the instruction queue.
Dynamic prediction is implemented using a 512-entry BHT. The BHT is a cache that provides two bits per entry that together indicate four levels of prediction for a branch instruction—not-taken, strongly not-taken, taken, strongly taken. When dynamic branch prediction is disabled, the BPU uses a bit in the instruction encoding to predict the direction of the conditional branch. Therefore, when an unresolved conditional branch instruction is encountered, the 750GX executes instructions from the predicted path although the results are not committed to architected registers until the conditional branch is resolved. This execution can continue until a second unresolved branch instruction is encountered.
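A two-bit entry with four prediction levels is commonly realized as a saturating counter. The Python sketch below shows that general technique. It is an assumption that the 750GX updates its entries this way; the counter encoding, the initial value, and the way a branch address selects one of the 512 entries are likewise hypothetical choices made for illustration.

```python
BHT_ENTRIES = 512

# Counter values 0..3; this mapping to the four levels is an assumption.
LEVELS = ["strongly not-taken", "not-taken", "taken", "strongly taken"]


class BranchHistoryTable:
    def __init__(self):
        # Start every entry at "not-taken" (arbitrary choice).
        self.counters = [1] * BHT_ENTRIES

    def _index(self, branch_address):
        # Instructions are 4 bytes long, so drop the two low bits
        # before selecting an entry (hypothetical indexing).
        return (branch_address >> 2) % BHT_ENTRIES

    def predict(self, branch_address):
        """True means predict taken."""
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        """Move the counter one step toward the actual outcome."""
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

With a saturating counter, a branch in the "strongly taken" state must be mispredicted twice before the prediction flips, which keeps a single loop exit from disturbing the prediction for the next pass through the loop.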
When a branch is taken (or predicted as taken), the instructions from the untaken path must be flushed, and the target instruction stream must be fetched into the IQ. The BTIC is a 64-entry cache that contains the most recently used branch target instructions, typically in pairs. When an instruction fetch hits in the BTIC, the instructions arrive in the instruction queue in the next clock cycle, a clock cycle sooner than they would arrive from the instruction cache. Additional instructions arrive from the instruction cache in the next clock cycle. The BTIC reduces the number of missed opportunities to dispatch instructions and gives the processor a 1-cycle head start on processing the target stream. With the use of the BTIC, the 750GX achieves a zero-cycle delay for branches taken. Coherency of the BTIC table is maintained by table reset on an instruction-cache flash invalidate, Instruction Cache Block Invalidate (icbi) or Return from Interrupt (rfi) instruction execution, or when an exception is taken.
The BPU contains an adder to compute branch target addresses and three user-control registers—the Link Register (LR), the Count Register (CTR), and the CR. The BPU calculates the return pointer for subroutine calls and saves it into the LR for certain types of branch instructions. The LR also contains the branch target address for the Branch Conditional to Link Register (bclrx) instruction. The CTR contains the branch target address for the Branch Conditional to Count Register (bcctrx) instruction. Because the LR and CTR are special purpose registers (SPRs), their contents can be copied to or from any GPR. Since the BPU uses dedicated registers rather than GPRs or FPRs, execution of branch instructions is largely independent from execution of fixed-point and floating-point instructions.
The completion unit operates closely with the dispatch unit. Instructions are fetched and dispatched in program order. At the point of dispatch, the program order is maintained by assigning each dispatched instruction a successive entry in the 6-entry completion queue. The completion unit tracks instructions from dispatch through execution and retires them in program order from the two bottom entries in the completion queue (CQ0 and CQ1).
Instructions cannot be dispatched to an execution unit unless there is a vacancy in the completion queue and rename buffers are available. Branch instructions that do not update the CTR or LR are removed from the instruction stream and do not occupy a space in the completion queue. Instructions that update the CTR and LR follow the same dispatch and completion procedures as nonbranch instructions, except that they are not issued to an execution unit.
An instruction is retired when it is removed from the completion queue and its results are written to architected registers (GPRs, FPRs, LR, and CTR) from the rename buffers. At that point, the rename buffers assigned to the instruction by the dispatch unit are returned to the available rename buffer pool, where the dispatch unit reuses them as subsequent instructions are dispatched. In-order completion ensures program integrity and the correct architectural state when the 750GX must recover from a mispredicted branch or any exception.
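The in-order retirement described above can be sketched as a small queue model. The Python class below is a simplified illustration, not the hardware algorithm: it tracks only whether each instruction has finished execution, and it ignores exceptions, rename buffers, and branch flushes. The class and method names are hypothetical.

```python
from collections import deque

CQ_DEPTH = 6       # completion queue entries
MAX_RETIRE = 2     # instructions retired per clock (CQ0 and CQ1)


class CompletionQueue:
    def __init__(self):
        self.entries = deque()   # entries[0] is CQ0, the oldest instruction

    def dispatch(self, name):
        """Allocate the next entry; False means dispatch must stall."""
        if len(self.entries) >= CQ_DEPTH:
            return False
        self.entries.append({"name": name, "finished": False})
        return True

    def finish(self, name):
        """Mark an instruction as having finished execution."""
        for entry in self.entries:
            if entry["name"] == name:
                entry["finished"] = True

    def retire(self):
        """Retire up to two finished instructions from the bottom, in order."""
        retired = []
        while (len(retired) < MAX_RETIRE and self.entries
               and self.entries[0]["finished"]):
            retired.append(self.entries.popleft()["name"])
        return retired
```

An instruction that finishes early still waits in the queue until every older instruction has retired, which is what preserves the sequential programming model even though execution itself can proceed out of order.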