Intel Pentium II Developer's Manual

Pentium® II Processor
Developer’s Manual
243502-001
October 1997
Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined."
Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
The Pentium® II processor may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained from:
Intel Corporation P.O. Box 5937 Denver, CO 80217-9808
or call 1-800-548-4725 or visit Intel’s website at http://www.intel.com
*Third-party brands and names are the property of their respective owners.
COPYRIGHT © INTEL CORPORATION, 1995, 1996, 1997
TABLE OF CONTENTS
CHAPTER 1 COMPONENT INTRODUCTION
1.1. SYSTEM OVERVIEW.............................................................................................. 1-1
1.2. TERMINOLOGY...................................................................................................... 1-2
1.2.1. S.E.C. Cartridge Terminology ..............................................................................1-3
1.3. REFERENCES........................................................................................................ 1-3
CHAPTER 2 MICRO-ARCHITECTURE OVERVIEW
2.1. FULL CORE UTILIZATION...................................................................................... 2-2
2.2. THE PENTIUM® II PROCESSOR PIPELINE........................................................... 2-3
2.2.1. The Fetch/Decode Unit........................................................................................ 2-4
2.2.2. The Dispatch/Execute Unit................................................................................... 2-5
2.2.3. The Retire Unit.................................................................................................... 2-6
2.2.4. The Bus Interface Unit......................................................................................... 2-7
2.3. MMX™ TECHNOLOGY AND THE PENTIUM® II PROCESSOR.............................. 2-9
2.3.1. MMX™ Technology in the Pentium® II Processor Pipeline................................... 2-9
2.3.2. Caches...............................................................................................................2-13
2.4. WRITE BUFFERS ..................................................................................................2-14
2.5. ADDITIONAL INFORMATION ................................................................................2-14
2.6. ARCHITECTURE SUMMARY.................................................................................2-14
CHAPTER 3 SYSTEM BUS OVERVIEW
3.1. SIGNALING ON THE PENTIUM® II PROCESSOR SYSTEM BUS .......................... 3-1
3.2. SIGNAL OVERVIEW............................................................................................... 3-2
3.2.1. Execution Control Signals.................................................................................... 3-2
3.2.2. Arbitration Signals................................................................................................ 3-3
3.2.3. Request Signals .................................................................................................. 3-5
3.2.4. Snoop Signals .....................................................................................................3-5
3.2.5. Response Signals................................................................................................ 3-6
3.2.6. Data Response Signals........................................................................................ 3-7
3.2.7. Error Signals........................................................................................................3-7
3.2.8. Compatibility Signals............................................................................................ 3-9
3.2.9. Diagnostic Signals ..............................................................................................3-10
CHAPTER 4 DATA INTEGRITY
4.1. ERROR CLASSIFICATION...................................................................................... 4-1
4.2. PENTIUM® II PROCESSOR SYSTEM BUS DATA INTEGRITY ARCHITECTURE... 4-2
4.2.1. Bus Signals Protected Directly............................................................................. 4-2
4.2.2. Bus Signals Protected Indirectly........................................................................... 4-3
4.2.3. Unprotected Bus Signals...................................................................................... 4-3
4.2.4. Hard-Error Response........................................................................................... 4-4
4.2.5. Pentium® II Processor System Bus Error Code Algorithms.................................. 4-4
4.2.5.1. PARITY ALGORITHM...................................................................................... 4-4
4.2.5.2. PENTIUM® II SYSTEM BUS ECC ALGORITHM.............................................. 4-4
CHAPTER 5 CONFIGURATION
5.1. DESCRIPTION........................................................................................................ 5-1
5.1.1. Output Tristate .................................................................................................... 5-2
5.1.2. Built-in Self Test ..................................................................................................5-2
5.1.3. Data Bus Error Checking Policy........................................................................... 5-3
5.1.4. Response Signal Parity Error Checking Policy...................................................... 5-3
5.1.5. AERR# Driving Policy .......................................................................................... 5-3
5.1.6. AERR# Observation Policy .................................................................................. 5-3
5.1.7. BERR# Driving Policy for Initiator Bus Errors........................................................5-3
5.1.8. BERR# Driving Policy for Target Bus Errors......................................................... 5-3
5.1.9. Bus Error Driving Policy for Initiator Internal Errors............................................... 5-4
5.1.10. BINIT# Driving Policy........................................................................................... 5-4
5.1.11. BINIT# Observation Policy................................................................................... 5-4
5.1.12. In-Order Queue Pipelining ...................................................................................5-4
5.1.13. Power-On Reset Vector....................................................................................... 5-4
5.1.14. FRC Mode Enable............................................................................................... 5-4
5.1.15. APIC Mode.......................................................................................................... 5-5
5.1.16. APIC Cluster ID................................................................................................... 5-5
5.1.17. Symmetric Agent Arbitration ID............................................................................ 5-5
5.1.18. Low Power Standby Enable................................................................................. 5-6
5.2. CLOCK FREQUENCIES AND RATIOS.................................................................... 5-6
5.3. SOFTWARE-PROGRAMMABLE OPTIONS............................................................. 5-7
5.4. INITIALIZATION PROCESS.................................................................................... 5-9
CHAPTER 6 TEST ACCESS PORT (TAP)
6.1. INTERFACE............................................................................................................ 6-1
6.2. ACCESSING THE TAP LOGIC................................................................................ 6-2
6.2.1. Accessing the Instruction Register....................................................................... 6-4
6.2.2. Accessing the Data Registers.............................................................................. 6-6
6.3. INSTRUCTION SET................................................................................................ 6-7
6.4. DATA REGISTER SUMMARY................................................................................. 6-8
6.4.1. Bypass Register ..................................................................................................6-8
6.4.2. Device ID Register............................................................................................... 6-8
6.4.3. BIST Result Boundary Scan Register................................................................... 6-9
6.4.4. Boundary Scan Register...................................................................................... 6-9
6.5. RESET BEHAVIOR................................................................................................. 6-9
CHAPTER 7 ELECTRICAL SPECIFICATIONS
7.1. THE PENTIUM® II PROCESSOR SYSTEM BUS AND VREF.................................. 7-1
7.2. CLOCK CONTROL AND LOW POWER STATES.................................................... 7-2
7.2.1. Normal State — State 1....................................................................................... 7-3
7.2.2. Auto HALT Power Down State — State 2............................................................. 7-3
7.2.3. Stop-Grant State — State 3................................................................................. 7-3
7.2.4. HALT/Grant Snoop State — State 4..................................................................... 7-4
7.2.5. Sleep State — State 5......................................................................................... 7-4
7.2.6. Deep Sleep State — State 6......................................................................... 7-5
7.2.7. Clock Control and Low Power Modes................................................................... 7-5
7.3. POWER AND GROUND PINS................................................................................. 7-5
7.4. DECOUPLING GUIDELINES................................................................................... 7-6
7.4.1. Pentium® II Processor VccCORE Decoupling...................................................... 7-6
7.4.2. System Bus GTL+ Decoupling............................................................................. 7-6
7.5. SYSTEM BUS CLOCK AND PROCESSOR CLOCKING.......................................... 7-7
7.5.1. Mixing Processors of Different Frequencies ......................................................... 7-9
7.6. VOLTAGE IDENTIFICATION................................................................................... 7-9
7.7. PENTIUM® II PROCESSOR SYSTEM BUS UNUSED PINS...................................7-11
7.8. PENTIUM® II PROCESSOR SYSTEM BUS SIGNAL GROUPS..............................7-12
7.8.1. Asynchronous vs. Synchronous for System Bus Signals .....................................7-12
7.9. TEST ACCESS PORT (TAP) CONNECTION..........................................................7-14
7.10. MAXIMUM RATINGS.............................................................................................7-14
7.11. PROCESSOR SYSTEM BUS DC SPECIFICATIONS..............................................7-14
7.12. PENTIUM® II PROCESSOR SYSTEM BUS AC SPECIFICATIONS........................7-19
CHAPTER 8 GTL+ INTERFACE SPECIFICATIONS
8.1. SYSTEM SPECIFICATION...................................................................................... 8-1
8.1.1. System Bus Specifications................................................................................... 8-2
8.1.2. System AC Parameters: Signal Quality................................................................ 8-3
8.1.2.1. RINGBACK TOLERANCE................................................................................ 8-5
8.1.3. AC Parameters: Flight Time................................................................................. 8-7
8.2. GENERAL GTL+ I/O BUFFER SPECIFICATION ....................................................8-13
8.2.1. I/O Buffer DC Specification.................................................................................8-13
8.2.2. I/O Buffer AC Specifications................................................................................8-14
8.2.3. Determining Clock-to-Out, Setup and Hold..........................................................8-14
8.2.3.1. CLOCK-TO-OUTPUT TIME, TCO...................................................................8-14
8.2.3.2. MINIMUM SETUP AND HOLD TIMES ............................................................8-16
8.2.3.3. RECEIVER RINGBACK TOLERANCE ............................................................8-19
8.2.4. System-Based Calculation of Required Input and Output Timings .......................8-19
8.2.4.1. CALCULATING TARGET TFLIGHT_MAX .......................................................8-19
8.2.4.2. CALCULATING TARGET THOLD ...................................................................8-20
8.3. PACKAGE SPECIFICATION ..................................................................................8-20
CHAPTER 9 SIGNAL QUALITY SPECIFICATIONS
9.1. SYSTEM BUS CLOCK (BCLK) SIGNAL QUALITY SPECIFICATIONS..................... 9-1
9.2. GTL+ SIGNAL QUALITY SPECIFICATIONS............................................................ 9-3
9.3. NON-GTL+ SIGNAL QUALITY SPECIFICATIONS................................................... 9-3
9.3.1. Overshoot/Undershoot Guidelines ....................................................................... 9-3
9.3.2. Ringback Specification......................................................................................... 9-4
9.3.3. Settling Limit Guideline ........................................................................................9-5
CHAPTER 10 THERMAL SPECIFICATIONS AND DESIGN CONSIDERATIONS
10.1. THERMAL SPECIFICATIONS ................................................................................10-1
10.2. PENTIUM® II PROCESSOR THERMAL ANALYSIS...............................................10-2
10.2.1. Thermal Solution Performance............................................................................10-2
10.2.2. Measurements for Thermal Specifications...........................................................10-3
10.2.2.1. THERMAL PLATE TEMPERATURE MEASUREMENT....................................10-3
10.2.2.2. COVER TEMPERATURE MEASUREMENT....................................................10-5
10.3. THERMAL SOLUTION ATTACH METHODS ..........................................................10-6
10.3.1. Heatsink Clip Attach ...........................................................................................10-7
10.3.2. Rivscrew* Attach ................................................................................................10-9
CHAPTER 11 S.E.C. CARTRIDGE MECHANICAL SPECIFICATIONS
11.1. S.E.C. CARTRIDGE MATERIALS INFORMATION .................................................11-1
11.2. PROCESSOR EDGE FINGER SIGNAL LISTING..................................................11-13
CHAPTER 12 BOXED PROCESSOR SPECIFICATIONS
12.1. INTRODUCTION....................................................................................................12-1
12.2. MECHANICAL SPECIFICATIONS..........................................................................12-2
12.2.1. Boxed Processor Fan/Heatsink Dimensions........................................................12-2
12.2.2. Boxed Processor Fan/Heatsink Weight...............................................................12-4
12.2.3. Boxed Processor Retention Mechanism and Fan/Heatsink Support.....................12-4
12.3. BOXED PROCESSOR REQUIREMENTS...............................................................12-8
12.3.1. Fan/Heatsink Power Supply................................................................................12-8
12.4. THERMAL SPECIFICATIONS ..............................................................................12-10
12.4.1. Boxed Processor Cooling Requirements...........................................................12-10
CHAPTER 13 INTEGRATION TOOLS
13.1. IN-TARGET PROBE (ITP) FOR THE PENTIUM® II PROCESSOR.........................13-1
13.1.1. Primary Function ................................................................................................13-1
13.1.2. Debug Port Connector Description......................................................................13-2
13.1.3. Debug Port Signal Descriptions...........................................................................13-2
13.1.4. Debug Port Signal Notes.....................................................................................13-3
13.1.4.1. SIGNAL NOTE 1: DBRESET#.........................................................................13-3
13.1.4.2. SIGNAL NOTE 5: TDO AND TDI.....................................................................13-3
13.1.4.3. SIGNAL NOTE 7: TCK....................................................................................13-7
13.1.5. Debug Port Layout..............................................................................................13-8
13.1.5.1. SIGNAL QUALITY NOTES............................................................................13-10
13.1.5.2. DEBUG PORT CONNECTOR.......................................................................13-10
13.1.6. Using Boundary Scan to Communicate to the Processor...................................13-11
13.2. INTEGRATION TOOL CONSIDERATIONS ..........................................................13-11
13.2.1. Integration Tool Mechanical Keepouts...............................................................13-11
13.2.2. Pentium® II Processor LAI System Design Considerations ...............................13-11
CHAPTER 14 ADVANCED FEATURES
14.1. ADDITIONAL INFORMATION ................................................................................14-1
APPENDIX A SIGNALS REFERENCE
Figures
Figure Title Page
1-1. Second Level Cache Implementations ................................................................. 1-2
2-1. Three Engines Communicating Using an Instruction Pool..................................... 2-1
2-2. A Typical Pseudo Code Fragment........................................................................ 2-2
2-3. The Three Core Engines Interface with Memory via Unified Caches..................... 2-3
2-4. Inside the Fetch/Decode Unit............................................................................... 2-4
2-5. Inside the Dispatch/Execute Unit.......................................................................... 2-5
2-6. Inside the Retire Unit........................................................................................... 2-7
2-7. Inside the Bus Interface Unit................................................................................ 2-8
2-8. Out of Order Core and Retirement Pipeline.........................................................2-10
2-9. Out-of-Order Core and Retirement Pipeline ........................................................2-12
3-1. Latched Bus Protocol........................................................................................... 3-1
5-1. Hardware Configuration Signal Sampling ............................................................. 5-1
6-1. Simplified Block Diagram of Processor TAP Logic................................................ 6-2
6-2. TAP Controller Finite State Machine .................................................................... 6-3
6-3. Processor TAP Instruction Register ..................................................................... 6-5
6-4. Operation of the Processor TAP Instruction Register............................................ 6-5
6-5. TAP Instruction Register Access.......................................................................... 6-6
7-1. GTL+ Bus Topology............................................................................................. 7-1
7-2. Stop Clock State Machine.................................................................................... 7-2
7-3. Timing Diagram of Clock Ratio Signals................................................................. 7-7
7-4. Example Schematic for Clock Ratio Pin Sharing................................................... 7-8
7-5. BCLK to Core Logic Offset..................................................................................7-25
7-6. BCLK, PICCLK, TCK Generic Clock Waveform...................................................7-25
7-7. System Bus Valid Delay Timings.........................................................................7-26
7-8. System Bus Setup and Hold Timings ..................................................................7-26
7-9. FRC Mode BCLK to PICCLK Timing ...................................................................7-27
7-10. System Bus Reset and Configuration Timings.....................................................7-27
7-11. Power-On Reset and Configuration Timings........................................................7-28
7-12. Test Timings (TAP Connection) ..........................................................................7-29
7-13. Test Reset Timings.............................................................................................7-29
8-1. Example Terminated Bus with GTL+ Transceivers............................................... 8-2
8-2. Receiver Waveform Showing Signal Quality Parameters...................................... 8-3
8-3. Low to High GTL+ Receiver Ringback Tolerance ................................................. 8-5
8-4. Standard Input Hi-to-Lo Waveform for Characterizing Receiver Ringback Tolerance............................................................................................. 8-6
8-5. Measuring Nominal Flight Time............................................................................ 8-8
8-6. Flight Time of a Rising Edge Slower than 0.3V/ns ................................................ 8-9
8-7. Extrapolated Flight Time of a Non-Monotonic Rising Edge ..................................8-10
8-8. Extrapolated Flight Time of a Non-Monotonic Falling Edge..................................8-11
8-9. Test Load for Measuring Output AC Timings.......................................................8-15
8-10. Clock to Output Data Timing (TCO) ....................................................................8-15
8-11. Standard Input Lo-to-Hi Waveform for Characterizing Receiver Setup Time........8-17
8-12. Standard Input Hi-to-Lo Waveform for Characterizing Receiver Setup Time........8-18
9-1. BCLK, TCK, PICCLK Generic Clock Waveform at the Processor Edge Fingers..... 9-2
9-2. Non-GTL+ Overshoot/Undershoot and Ringback Tolerance................................. 9-4
10-1. Processor S.E.C. Cartridge Thermal Plate ..........................................................10-1
10-2. Processor Thermal Plate Temperature Measurement Location ...........................10-4
10-3. Technique for Measuring TPLATE with 0° Angle Attachment...............................10-4
10-4. Technique for Measuring TPLATE with 90° Angle Attachment.............................10-5
10-5. Guideline Locations for Cover Temperature (TCOVER) Thermocouple Placement..........................................................................................................10-6
10-6. Heatsink Attachment Mechanism Design Space..................................................10-7
10-7. Processor with an Example Low Profile Heatsink Attached using Spring Clips.....10-8
10-8. Processor with an Example Full Height Heatsink Attached using Spring Clips......10-8
10-9. Heatsink Recommendations and Guidelines for Use with Rivscrews* ..................10-9
10-10. Heatsink, Rivscrew* and Thermal Plate Recommendations and Guidelines.........10-9
10-11. General Rivscrew* Heatsink Mechanical Recommendations .............................10-10
11-1. S.E.C. Cartridge—Thermal Plate and Cover Side Views......................................11-3
11-2. S.E.C. Cartridge Top and Side Views..................................................................11-4
11-3. S.E.C. Cartridge Bottom Side View.....................................................................11-5
11-4. S.E.C. Cartridge Thermal Plate Side Dimensions................................................11-6
11-5. S.E.C. Cartridge Thermal Plate Flatness Dimensions..........................................11-6
11-6. S.E.C. Cartridge Thermal Plate Attachment Detail Dimensions............................11-7
11-7. S.E.C. Cartridge Latch Arm, Thermal Plate Lug and Cover Lug Dimensions........11-8
11-8. S.E.C. Cartridge Latch Arm, Cover and Thermal Plate Detail Dimensions ...........11-9
11-9. S.E.C. Cartridge Substrate Dimensions (Skirt not shown for clarity) ..................11-10
11-10. S.E.C. Cartridge Substrate Dimensions, Cover Side View.................................11-10
11-11. S.E.C. Cartridge Substrate—Detail A................................................................11-11
11-12. S.E.C. Cartridge Mark Locations (Processor Markings).....................................11-12
12-1. Conceptual Boxed Pentium® II Processor in Retention Mechanism.....................12-2
12-2. Side View Space Requirements for the Boxed Processor (Fan/heatsink supports not shown) ...........................................................................................12-3
12-3. Front View Space Requirements for the Boxed Processor...................................12-3
12-4. Top View Space Requirements for the Boxed Processor.....................................12-4
12-5. Heatsink Support Hole Locations and Sizes........................................................12-6
12-6. Side View Space Requirements for Boxed Processor Fan/Heatsink Supports......12-7
12-7. Top View Space Requirements for Boxed Processor Fan/Heatsink Supports.......12-8
12-8. Boxed Processor Fan/Heatsink Power Cable Connector Description...................12-9
12-9. Recommended Motherboard Power Header Placement Relative to Fan Power Connector and Slot 1........................................................................................12-10
13-1. Hardware Components of the ITP.......................................................................13-2
13-2. GTL+ Signal Termination....................................................................................13-3
13-3. TCK/TMS with Series and Parallel Termination, Single Processor Configuration..13-6
13-4. TCK/TMS with Daisy Chain Configuration, 2-Way MP Configuration....................13-7
13-5. TCK with Daisy Chain Configuration....................................................................13-8
13-6. Generic DP System Layout for Debug Port Connection.......................................13-9
13-7. Debug Port Connector on Thermal Plate Side of Circuit Board.......................... 13-10
13-8. Hole Positioning for Connector on Thermal Plate Side of Circuit Board..............13-10
13-9. Processor System where Boundary Scan is Not Used.......................................13-11
13-10. LAI Probe Input Circuit......................................................................................13-12
13-11. Pentium® II Processor Integration Tool Mechanical Keep Out Volume—Thermal Plate Side View...................................................................................13-13
13-12. Pentium® II Processor Integration Tool Mechanical Keep Out Volume—Cover Side View...............................................................................................13-14
13-13. Pentium® II Processor Integration Tool Mechanical Keep Out Volume—Side View .........................................................................................................13-15
A-1. PWRGOOD Relationship at Power-On...............................................................A-11
Tables
Table Title Page
2-1. Pentium® II Processor Execution Unit Pipelines..................................................2-13
3-1. Execution Control Signals.................................................................................... 3-2
3-2. Arbitration Signals................................................................................................ 3-4
3-3. Request Signals .................................................................................................. 3-5
3-4. Snoop Signals ..................................................................................................... 3-5
3-5. Response Signals................................................................................................ 3-6
3-6. Data Phase Signals .............................................................................................3-7
3-7. Error Signals........................................................................................................3-7
3-8. PC Compatibility Signals...................................................................................... 3-9
3-9. Diagnostic Support Signals.................................................................................3-10
4-1. Direct Bus Signal Protection................................................................................. 4-2
5-1. APIC Cluster ID Configuration for the Pentium® II Processor Family 1................. 5-5
5-2. Pentium® II Processor Bus BREQ[1:0]# Interconnect (Two Agents)..................... 5-5
5-3. Arbitration ID Configuration with Processors Supporting BR[1:0]# 1..................... 5-6
5-4. Pentium® II Processor Family Power-On Configuration Register.......................... 5-7
5-5. Pentium® II Processor Family Power-On Configuration Register APIC Cluster ID Bit Field............................................................................................... 5-8
5-6. Pentium® II Processor Family Power-On Configuration Register Arbitration ID Configuration................................................................................................... 5-8
5-7. Pentium® II Processor Family Power-On Configuration Register Bus Frequency to Core Frequency Ratio Bit Field ........................................................................ 5-8
6-1. 1149.1 Instructions in the Processor TAP............................................................. 6-7
6-2. TAP Data Registers............................................................................................. 6-8
6-3. Device ID Register............................................................................................... 6-9
6-4. TAP Reset Actions............................................................................................... 6-9
7-1. Core Frequency to System Bus Multiplier Configuration....................................... 7-7
7-2. Voltage Identification Definition (1, 2, 3)..............................................7-10
7-3. Recommended Pull-Up Resistor Values (Approximate) for CMOS
Signals ...............................................................................................................7-11
7-4. Pentium® II Processor/Slot 1 System Bus Signal Groups....................................7-13
7-5. Pentium® II Processor Absolute Maximum Ratings.............................................7-15
7-6. Pentium® II Processor/Slot 1 Connector Voltage/Current Specifications..............7-16
7-7. GTL+ Signal Groups DC Specifications...............................................................7-18
7-8. Non-GTL+ Signal Groups DC Specifications .......................................................7-18
7-9. System Bus AC Specifications (Clock) (1, 2).......................................................7-20
7-10. Valid Pentium® II Processor System Bus, Core Frequency and Cache Bus
Frequencies (1, 2)..............................................................................................7-21
7-11. Pentium® II Processor System Bus AC Specifications (GTL+ Signal Group).......7-21
7-12. Pentium® II Processor System Bus AC Specifications (CMOS Signal Group)......7-22
7-13. System Bus AC Specifications (Reset Conditions)...............................................7-22
7-14. System Bus AC Specifications (APIC Clock and APIC I/O) (1, 2).........................7-23
7-15. System Bus AC Specifications (TAP Connection) (1)..........................................7-24
8-1. Pentium® II Processor GTL+ Bus Specifications (1)............................................. 8-3
8-2. Specifications for Signal Quality........................................................................... 8-4
8-3. I/O Buffer DC Parameters...................................................................................8-13
8-4. I/O Buffer AC Parameters...................................................................................8-14
9-1. BCLK Signal Quality Specifications...................................................................... 9-1
9-2. GTL+ Signal Groups Ringback Tolerance ............................................................ 9-3
9-3. Signal Ringback Specifications for Non-GTL+ Signals.......................................... 9-5
10-1. Pentium® II Processor Thermal Design Specifications (1)...................................10-2
10-2. Example Thermal Solution Performance for 266 MHz Pentium® II
Processor at Thermal Plate Power of 37.0 Watts................................................10-3
11-1. S.E.C. Cartridge Materials..................................................................................11-2
11-2. Description Table for Processor Markings.........................................................11-12
11-3. Signal Listing in Order by Pin Number...............................................................11-13
11-4. Signal Listing in Order by Signal Name .............................................................11-18
12-1. Boxed Processor Fan/Heatsink Spatial Dimensions.............................................12-4
12-2. Boxed Processor Fan/Heatsink Support Dimensions...........................................12-5
12-3. Fan/Heatsink Power and Signal Specifications....................................................12-9
13-1. Debug Port Pinout Description and Requirements 1............................................13-4
A-1. BR0#(I/O), BR1#, BR2#, BR3# Signals Rotating Interconnect..............................A-4
A-2. BR[3:0]# Signal Agent IDs ...................................................................................A-4
A-3. Burst Order Used for Pentium® II Processor Bus Line Transfers..........................A-5
A-4. Slot 1 Occupation Truth Table............................................................................A-13
A-5. Output Signals (1)..............................................................................................A-16
A-6. Input Signals (1)................................................................................................A-17
A-7. Input/Output Signals (Single Driver)...................................................................A-18
A-8. Input/Output Signals (Multiple Drivers)...............................................................A-18
CHAPTER 1
COMPONENT INTRODUCTION
1.1. SYSTEM OVERVIEW
The Pentium® II processor is the next in the Intel386™, Intel486™, Pentium and Pentium Pro line of Intel processors. The Pentium II and Pentium Pro processors are members of the P6 family of processors, which includes all of the Intel Architecture processors that implement Intel’s dynamic execution micro-architecture. The dynamic execution micro-architecture incorporates a unique combination of multiple branch prediction, data flow analysis, and speculative execution, which enables the Pentium II processor to deliver higher performance than the Pentium family of processors, while maintaining binary compatibility with all previous Intel Architecture processors. The Pentium II processor also incorporates Intel’s MMX™ technology, for enhanced media and communication performance. To aid in the design of energy efficient computer systems, the Pentium II processor offers multiple low-power states such as AutoHALT, Stop-Grant, Sleep and Deep Sleep, to conserve power during idle times.
The Pentium II processor utilizes the same multi-processing system bus technology as the Pentium Pro processor. This allows for a higher level of performance for both uni-processor and two-way multi-processor (2-way MP) systems. Memory is cacheable for up to 512 MB of addressable memory space, allowing significant headroom for business desktop systems.
The Pentium II processor system bus operates in the same manner as the Pentium Pro processor system bus and uses GTL+ signal technology. The Pentium II processor deviates from the Pentium Pro processor by using commercially available die for the L2 cache. The L2 cache (the TagRAM and pipelined burst synchronous static RAM (BSRAM) memories) is now implemented as multiple die. Transfers between the Pentium II processor core and the L2 cache are clocked at one-half the processor core clock frequency, so the transfer rate scales with the processor core frequency. Both the TagRAM and BSRAM receive clocked data directly from the Pentium II processor core. As with the Pentium Pro processor, the L2 cache does not connect to the Pentium II processor system bus; the Pentium II processor has a dedicated cache bus, thus maintaining the dual independent bus architecture to deliver high bus bandwidth and high performance (see Figure 1-1).
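As a small worked example of the one-half ratio described above, the sketch below computes the cache bus clock for a few core clocks. The function name and the listed core frequencies are illustrative assumptions (nominal values), not figures taken from the specification tables.

```python
def cache_bus_mhz(core_mhz):
    """L2 cache bus clock, assuming the one-half core ratio stated in the text."""
    return core_mhz / 2


# Nominal core clocks chosen only for illustration.
for core in (233, 266, 300):
    print(core, "MHz core ->", cache_bus_mhz(core), "MHz cache bus")
```

Because the ratio is fixed, a faster core clock raises the cache bus clock by the same factor.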
The Pentium II processor utilizes Single Edge Contact (S.E.C.) cartridge packaging technology. The S.E.C. cartridge allows the L2 cache to remain tightly coupled to the processor, while enabling use of high volume commercial SRAM components. The L2 cache is performance optimized and tested at the package level. The S.E.C. cartridge utilizes surface mount technology and a substrate with an edge finger connection. The S.E.C. cartridge introduced on the Pentium II processor will also be used in future Slot 1 processors.
[Figure 1-1, schematic only: the Pentium® Pro processor dual die cavity package (processor core and L2 die) shown beside the Pentium II processor substrate and components (processor core, Tag and L2 die).]
Figure 1-1. Second Level Cache Implementations
The S.E.C. cartridge has the following features: a thermal plate, a cover and a substrate with an edge finger connection. The thermal plate allows standardized heatsink attachment or customized thermal solutions. The full enclosure also protects the surface mount components. The edge finger connection maintains socketability for system configuration. The edge finger connector is referred to as the ‘Slot 1 connector’ in this and other documentation.
1.2. TERMINOLOGY
In this document, a ‘#’ symbol after a signal name refers to an active low signal. This means that a signal is in the active state (based on the name of the signal) when driven to a low level. For example, when FLUSH# is low, a flush has been requested. When NMI is high, a non-maskable interrupt has occurred. In the case of signals where the name does not imply an active state but describes part of a binary sequence (such as address or data), the ‘#’ symbol implies that the signal is inverted. For example, D[3:0] = ‘HLHL’ refers to a hex ‘A’, and D#[3:0] = ‘LHLH’ also refers to a hex ‘A’ (H= High logic level, L= Low logic level).
The term “system bus” refers to the interface between the processor, system core logic (a.k.a. the core logic components) and other bus agents. The system bus is a multiprocessing interface to processors, memory and I/O. The term “cache bus” refers to the interface between the processor and the L2 cache components (TagRAM and BSRAMs). The cache bus does NOT connect to the system bus, and is not visible to other agents on the system bus.
When signal values are referenced in tables, a 0 indicates inactive and a 1 indicates active. 0 and 1 do not reflect voltage levels. A # after a signal name indicates active low. An entry of 1 for ADS# means that ADS# is active, with a low voltage level.
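The naming convention above can be sketched in a few lines of code. This is an illustration only: the helper names `is_active` and `bus_value` are hypothetical, and levels are written as the characters 'H' and 'L' as in the D[3:0] example.

```python
def is_active(signal_name, level):
    """Return True if the signal is asserted at the given level ('H' or 'L')."""
    if level not in ("H", "L"):
        raise ValueError("level must be 'H' or 'L'")
    active_low = "#" in signal_name      # '#' marks an active low signal
    return level == ("L" if active_low else "H")


def bus_value(bus_name, levels):
    """Decode MSB-first levels into an integer, inverting when the name has '#'."""
    inverted = "#" in bus_name
    value = 0
    for level in levels:
        bit = 1 if level == "H" else 0
        if inverted:
            bit ^= 1                     # '#' on a data/address bus means inverted
        value = (value << 1) | bit
    return value


print(is_active("FLUSH#", "L"))          # flush requested when FLUSH# is low
print(is_active("NMI", "H"))             # NMI is active high
print(hex(bus_value("D[3:0]", "HLHL")))
print(hex(bus_value("D#[3:0]", "LHLH")))
```

Both bus calls decode to the same hex digit, matching the D[3:0] and D#[3:0] example in the text. In table notation, an entry of 1 for ADS# corresponds to `is_active("ADS#", "L")`.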
1.2.1. S.E.C. Cartridge Terminology
The following terms are used often in this document and are explained here for clarification:
Pentium® II Processor — The entire product including internal components, substrate, thermal plate and cover.
S.E.C. Cartridge — The new processor packaging technology is called a “Single Edge Contact cartridge.”
Processor Substrate — The structure on which the components are mounted inside the S.E.C. cartridge (with or without components attached).
Processor Core — The processor’s execution engine.
Thermal Plate — The surface used to connect a heatsink or other thermal solutions to the processor.
Cover — The processor casing on the opposite side of the thermal plate.
Latch Arms — A processor feature that can be utilized as a means for securing the processor in the retention mechanism.
Additional terms referred to in this and other related documentation:
Slot 1 — The connector that the S.E.C. cartridge plugs into, just as the Pentium® Pro processor uses Socket 8.
Retention Mechanism — A mechanical piece which holds the package in the Slot 1 connector.
Heatsink Support — The support pieces that are mounted on the motherboard to provide added support for heatsinks.
The L2 cache (TagRAM, BSRAM) dies keep standard industry names.
1.3. REFERENCES
The reader of this specification should also be familiar with material and concepts presented in the following documents:
AP-485, Intel Processor Identification with the CPUID Instruction (Order Number 241618)
AP-585, Pentium® II Processor GTL+ Guidelines (Order Number 243330)
AP-586, Pentium® II Processor Thermal Design Guidelines (Order Number 243331)
AP-587, Pentium® II Processor Power Distribution Guidelines (Order Number 243332)
AP-588, Mechanical and Assembly Technology for S.E.C. Cartridge Processors (Order Number 243333)
AP-589, Pentium® II Processor Electro-Magnetic Interference (Order Number 243334)
Pentium® II Processor Specification Update (Order Number 243337)
Pentium® II Processor I/O Buffer Models, IBIS Format (Electronic Form)
Intel Architecture Software Developer’s Manual
Volume I: Basic Architecture (Order Number 243190)
Volume II: Instruction Set Reference (Order Number 243191)
Volume III: System Programming Guide (Order Number 243192)
CHAPTER 2
MICRO-ARCHITECTURE OVERVIEW
The Pentium II processor uses the same dynamic execution micro-architecture as the other members of the P6 family of Intel Architecture processors. This three-way superscalar, pipelined micro-architecture features a decoupled, multi-stage superpipeline, which trades less work per pipestage for more stages. The Pentium II processor, for example, has twelve stages with a pipestage time 33 percent less than that of the Pentium processor, which helps achieve a higher clock rate on any given manufacturing process.
The approach used in the P6 family micro-architecture removes the constraint of linear instruction sequencing between the traditional “fetch” and “execute” phases, and opens up a wide instruction window using an instruction pool. This approach allows the “execute” phase of the processor to have much more visibility into the program instruction stream so that better scheduling may take place. It requires the instruction “fetch/decode” phase of the processor to be much more efficient in terms of predicting program flow. Optimized scheduling requires the fundamental “execute” phase to be replaced by decoupled “dispatch/execute” and “retire” phases. This allows instructions to be started in any order but always be completed in the original program order. Processors in the P6 family may be thought of as three independent engines coupled with an instruction pool as shown in Figure 2-1.
[Figure 2-1: block diagram of the Fetch/Decode Unit, the Dispatch/Execute Unit and the Retire Unit, each connected to a shared Instruction Pool.]
Figure 2-1. Three Engines Communicating Using an Instruction Pool
2.1. FULL CORE UTILIZATION
The three independent-engine approach was taken to more fully utilize the processor core. Consider the pseudo code fragment in Figure 2-2:
r1 <= mem [r0]   /* Instruction 1 */
r2 <= r1 + r2    /* Instruction 2 */
r5 <= r5 + 1     /* Instruction 3 */
r6 <= r6 - r3    /* Instruction 4 */
Figure 2-2. A Typical Pseudo Code Fragment
The first instruction in this example is a load of r1 that, at run time, causes a cache miss. A traditional processor core must wait for its bus interface unit to read this data from main memory and return it before moving on to instruction 2. This processor stalls while waiting for this data and is thus being under-utilized.
To avoid this memory latency problem, a P6 family processor “looks ahead” into the instruction pool at subsequent instructions and does useful work rather than stalling. In the example in Figure 2-2, instruction 2 is not executable since it depends upon the result of instruction 1; however, both instructions 3 and 4 have no prior dependencies and are therefore executable. The processor executes instructions 3 and 4 out-of-order. The results of this out-of-order execution cannot be committed to permanent machine state (i.e., the programmer-visible registers) immediately since the original program order must be maintained. The results are instead stored back in the instruction pool awaiting in-order retirement. The core executes instructions depending upon their readiness to execute, and not on their original program order, and is therefore a true dataflow engine. This approach has the side effect that instructions are typically executed out-of-order.
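The behavior described for the Figure 2-2 fragment can be illustrated with a minimal dataflow model. This is a sketch under stated assumptions, not a description of the hardware: the latencies (10 cycles for the missing load, 1 cycle otherwise), the unlimited execution resources, and the names `Instr` and `simulate` are all hypothetical, and only true (read-after-write) dependencies are tracked.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    num: int        # position in original program order
    dst: str        # destination register
    srcs: tuple     # source registers
    latency: int    # assumed cycles from dispatch to result


# Figure 2-2 fragment; latencies are illustrative assumptions.
PROGRAM = [
    Instr(1, "r1", ("r0",), 10),       # load that misses the cache
    Instr(2, "r2", ("r1", "r2"), 1),
    Instr(3, "r5", ("r5",), 1),
    Instr(4, "r6", ("r6", "r3"), 1),
]


def simulate(program):
    finish = {}        # instruction number -> cycle its result is ready
    last_writer = {}   # register -> latest earlier instruction writing it
    for ins in program:
        producers = [last_writer[s] for s in ins.srcs if s in last_writer]
        start = max((finish[p] for p in producers), default=0)
        finish[ins.num] = start + ins.latency
        last_writer[ins.dst] = ins.num
    # Execution completes in dataflow order ...
    executed = sorted(finish, key=lambda n: (finish[n], n))
    # ... but results are committed strictly in program order.
    retired, clock = [], 0
    for ins in program:
        clock = max(clock, finish[ins.num])
        retired.append((ins.num, clock))
    return executed, retired


executed, retired = simulate(PROGRAM)
print("completion order:", executed)
print("retirement (instruction, cycle):", retired)
```

In this model instructions 3 and 4 complete while the load is still outstanding, yet neither retires until instructions 1 and 2 have retired.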
The cache miss on instruction 1 will take many internal clocks, so the core continues to look ahead for other instructions that could be speculatively executed, and is typically looking 20 to 30 instructions in front of the instruction pointer. Within this 20 to 30 instruction window there will be, on average, five branches that the fetch/decode unit must correctly predict if the dispatch/execute unit is to do useful work. The sparse register set of an Intel Architecture (IA) processor will create many false dependencies on registers so the dispatch/execute unit will rename the Intel Architecture registers into a larger register set to enable additional forward progress. The Retire Unit owns the programmer’s Intel Architecture register set and results are only committed to permanent machine state in these registers when it removes completed instructions from the pool in original program order.
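To illustrate how renaming removes false dependencies, the sketch below maps each destination register to a fresh entry in a larger physical register set. It is a simplified model only: the function `rename`, the `p0`, `p1`, ... naming, and the three-instruction example are hypothetical and do not describe the processor's actual renaming hardware.

```python
import itertools


def rename(program, arch_regs):
    """program: list of (dst, srcs) tuples using architectural register names."""
    # Initial mapping: one physical register per architectural register.
    table = {reg: f"p{i}" for i, reg in enumerate(arch_regs)}
    fresh = itertools.count(len(arch_regs))
    renamed = []
    for dst, srcs in program:
        phys_srcs = tuple(table[s] for s in srcs)  # read current mappings first
        table[dst] = f"p{next(fresh)}"             # every write gets a new register
        renamed.append((table[dst], phys_srcs))
    return renamed, table


program = [
    ("eax", ("ebx",)),   # eax <= f(ebx)
    ("ecx", ("eax",)),   # ecx <= g(eax)   true dependency on the line above
    ("eax", ("edx",)),   # eax <= h(edx)   reuses eax: a false dependency
]
renamed, table = rename(program, ["eax", "ebx", "ecx", "edx"])
for line in renamed:
    print(line)
```

After renaming, the third instruction writes a different physical register from the one the second instruction reads, so it no longer has to wait for it.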
Dynamic Execution technology can be summarized as optimally adjusting instruction execution by predicting program flow, having the ability to speculatively execute instructions in any order, and then analyzing the program’s dataflow graph to choose the best order to execute the instructions.
2.2. THE PENTIUM® II PROCESSOR PIPELINE
In order to get a closer look at how the P6 family micro-architecture implements Dynamic Execution, Figure 2-3 shows a block diagram of the Pentium II processor with cache and memory interfaces. The “Units” shown in Figure 2-3 represent stages of the Pentium II processor pipeline.
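[Figure 2-3: block diagram. The System Bus and the L2 Cache connect to the Bus Interface Unit, which serves the L1 ICache and L1 DCache. Fetch, Load and Store paths link the caches to the Fetch/Decode Unit, Dispatch/Execute Unit and Retire Unit, which all communicate through the Instruction Pool.]
Figure 2-3. The Three Core Engines Interface with Memory via Unified Caches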
The FETCH/DECODE unit: An in-order unit that takes as input the user program instruction stream from the instruction cache, and decodes it into a series of µoperations (µops) that represent the dataflow of that instruction stream. The pre-fetch is speculative.
The DISPATCH/EXECUTE unit: An out-of-order unit that accepts the dataflow stream, schedules execution of the µops subject to data dependencies and resource availability and temporarily stores the results of these speculative executions.
The RETIRE unit: An in-order unit that knows how and when to commit (“retire”) the temporary, speculative results to permanent architectural state.
The BUS INTERFACE unit: A partially ordered unit responsible for connecting the three internal units to the real world. The bus interface unit communicates directly with the L2 (second level) cache supporting up to four concurrent cache accesses. The bus interface unit also controls a transaction bus, with MESI snooping protocol, to system memory.
2.2.1. The Fetch/Decode Unit
Figure 2-4 shows a more detailed view of the Fetch/Decode unit.
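[Figure 2-4: block diagram. Instructions arrive from the Bus Interface Unit into the ICache, which is indexed by Next_IP with input from the Branch Target Buffer. The Instruction Decoder (x3) and the Microcode Instruction Sequencer feed the Register Alias Table and Allocate stage, whose output goes to the Instruction Pool (ReOrder Buffer).]
Figure 2-4. Inside the Fetch/Decode Unit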
The L1 Instruction Cache is a local instruction cache. The Next_IP unit provides the L1 Instruction Cache index, based on inputs from the Branch Target Buffer (BTB), trap/interrupt status, and branch-misprediction indications from the integer execution section.
The L1 Instruction Cache fetches the cache line corresponding to the index from the Next_IP, and the next line, and presents 16 aligned bytes to the decoder. The prefetched bytes are rotated so that they are justified for the instruction decoders (ID). The beginning and end of the Intel Architecture instructions are marked.
Three parallel decoders accept this stream of marked bytes, and proceed to find and decode the Intel Architecture instructions contained therein. The decoder converts the Intel Architecture instructions into triadic µops (two logical sources, one logical destination per µop). Most Intel Architecture instructions are converted directly into single µops, some instructions are decoded into one to four µops, and the complex instructions require microcode (the box labeled Microcode Instruction Sequencer in Figure 2-4). This microcode is just a set of preprogrammed sequences of normal µops. The µops are queued, and sent to the Register Alias Table (RAT) unit, where the logical Intel Architecture-based register references are converted into P6 family processor physical register references, and to the Allocator stage, which adds status information to the µops and enters them into the instruction pool. The instruction pool is implemented as an array of Content Addressable Memory called the ReOrder Buffer (ROB).
2.2.2. The Dispatch/Execute Unit
The Dispatch unit selects µops from the instruction pool depending upon their status. If the status indicates that a µop has all of its operands then the dispatch unit checks to see if the execution resource needed by that µop is also available. If both are true, the Reservation Station removes that µop and sends it to the resource where it is executed. The results of the µop are later returned to the pool. There are five ports on the Reservation Station, and the multiple resources are accessed as shown in Figure 2-5.
[Figure 2-5: block diagram. The Reservation Station exchanges µops with the Instruction Pool (ReOrder Buffer) and dispatches through five ports. Port 0 and Port 1 serve the MMX™, floating-point, integer and jump execution units; Port 2 serves the Load Unit (loads); Ports 3 and 4 serve the Store Unit (stores).]
Figure 2-5. Inside the Dispatch/Execute Unit
The Pentium II processor can schedule at a peak rate of 5 µops per clock, one to each resource port, but a sustained rate of 3 µops per clock is more typical. The activity of this scheduling process is the out-of-order process; µops are dispatched to the execution resources strictly according to dataflow constraints and resource availability, without regard to the original ordering of the program.
Note that the actual algorithm employed by this execution-scheduling process is vitally important to performance. If only one µop per resource becomes data-ready per clock cycle, then there is no choice. But if several are available, the scheduler must choose. The P6 family micro-architecture uses a pseudo FIFO scheduling algorithm favoring back-to-back µops.
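The choice among several data-ready µops can be sketched as follows. The text does not specify the pseudo FIFO algorithm, so this example substitutes a plain oldest-first policy per port as a stand-in; the function `select_for_dispatch` and the entry fields `age`, `port` and `ready` are hypothetical.

```python
def select_for_dispatch(rs_entries, ports=(0, 1, 2, 3, 4)):
    """Pick at most one data-ready µop per port, oldest first.

    rs_entries: dicts with 'age' (lower is older), 'port' and 'ready'.
    Returns a dict mapping port number to the chosen entry.
    """
    chosen = {}
    for entry in sorted(rs_entries, key=lambda e: e["age"]):
        port = entry["port"]
        if entry["ready"] and port in ports and port not in chosen:
            chosen[port] = entry
    return chosen


entries = [
    {"age": 0, "port": 0, "ready": False},  # still waiting for an operand
    {"age": 1, "port": 0, "ready": True},
    {"age": 2, "port": 0, "ready": True},   # loses port 0 to the older µop
    {"age": 3, "port": 2, "ready": True},
]
picked = select_for_dispatch(entries)
print({port: e["age"] for port, e in picked.items()})
```

With five ports, at most five µops are selected in one cycle, which matches the peak dispatch rate quoted above.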
Note that many of the µops are branches. The Branch Target Buffer will correctly predict most of these branches but it can’t correctly predict them all. Consider a BTB that is correctly predicting the backward branch at the bottom of a loop; eventually that loop is going to terminate, and when it does, that branch will be mispredicted. Branch µops are tagged (in the in-order pipeline) with their fall-through address and the destination that was predicted for them. When the branch executes, what the branch actually did is compared against what the prediction hardware said it would do. If those coincide, then the branch eventually retires and the speculatively executed work between it and the next branch instruction in the instruction pool is good.
But if they do not coincide, then the Jump Execution Unit (JEU) changes the status of all of the µops behind the branch to remove them from the instruction pool. In that case the proper branch destination is provided to the BTB which restarts the whole pipeline from the new target address.
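The check performed when a branch executes can be sketched as below, using the loop-exit case from the text. This is a simplified illustration: the function `resolve_branch`, the dictionary fields and the addresses are hypothetical.

```python
def resolve_branch(pool, branch_index, actual_target):
    """Compare the predicted destination with the actual one.

    pool: µops in program order (oldest first).
    Returns (surviving µops, restart address or None).
    """
    branch = pool[branch_index]
    if branch["predicted"] == actual_target:
        return pool, None                      # speculative work is good
    # Mispredicted: discard every µop behind the branch and refetch.
    return pool[: branch_index + 1], actual_target


pool = [
    {"op": "add"},
    # Backward loop branch, tagged with its fall-through and predicted target.
    {"op": "jcc", "fall_through": 0x120, "predicted": 0x100},
    {"op": "load"},                            # speculative
    {"op": "add"},                             # speculative
]

# The loop terminates, so the branch actually falls through.
survivors, restart = resolve_branch(pool, 1, actual_target=0x120)
print(len(survivors), hex(restart))
```

When the prediction is correct the pool is returned unchanged and no restart address is produced.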
2.2.3. The Retire Unit
Figure 2-6 shows a more detailed view of the Retire Unit.
[Figure 2-6: block diagram showing the Retirement Register File with From/To paths to the Instruction Pool, alongside the Reservation Station and the Memory Interface Unit (To/From DCache).]
Figure 2-6. Inside the Retire Unit
The Retire Unit is also checking the status of µops in the instruction pool. It is looking for µops that have executed and can be removed from the pool. Once removed, the original architectural target of the µops is written as per the original Intel Architecture instruction. The Retire Unit must not only notice which µops are complete, it must also re-impose the original program order on them. It must also do this in the face of interrupts, traps, faults, breakpoints and mispredictions.
The Retire Unit must first read the instruction pool to find the potential candidates for retirement and determine which of these candidates are next in the original program order. Then it writes the results of this cycle’s retirements to the Retirement Register File (RRF). The Retire Unit is capable of retiring 3 µops per clock.
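In-order retirement at up to three µops per clock can be sketched as below. The model is deliberately minimal and hypothetical: the pool is a Python list held oldest first, each µop carries only a `done` flag, and the function name `retire_cycle` is invented for illustration.

```python
def retire_cycle(pool, max_retire=3):
    """Retire completed µops from the head of the pool, in program order.

    Stops at the first µop that has not finished executing, or after
    max_retire µops. Retired µops are removed from the pool.
    """
    retired = []
    while pool and pool[0]["done"] and len(retired) < max_retire:
        retired.append(pool.pop(0))
    return retired


pool = [
    {"id": 1, "done": True},
    {"id": 2, "done": True},
    {"id": 3, "done": False},   # blocks everything younger
    {"id": 4, "done": True},    # executed early, must wait its turn
]
first = retire_cycle(pool)
print([u["id"] for u in first], [u["id"] for u in pool])
```

µop 4 has finished executing but stays in the pool until µop 3 completes, which is how the original program order is re-imposed.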
2.2.4. The Bus Interface Unit
Figure 2-7 shows a more detailed view of the Bus Interface Unit.
[Figure 2-7: block diagram showing the Memory Order Buffer, fed from the Address Generation Unit and connected to/from the Instruction Pool (ReOrder Buffer), and the Memory I/F linking the DCache to the L2 Cache and System Memory.]
Figure 2-7. Inside the Bus Interface Unit
There are two types of memory access: loads and stores. Loads only need to specify the memory address to be accessed, the width of the data being retrieved, and the destination register. Loads are encoded into a single µop.
Stores need to provide a memory address, a data width, and the data to be written. Stores therefore require two µops, one to generate the address and one to generate the data. These µops must later re-combine for the store to complete.
Stores are never performed speculatively since there is no transparent way to undo them. Stores are also never re-ordered among themselves. A store is dispatched only when both the address and the data are available and there are no older stores awaiting dispatch.
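The store dispatch rule stated above can be sketched as a single check. Since stores are never re-ordered among themselves, only the oldest pending store is a candidate. The function name and the `addr_ready` and `data_ready` fields are hypothetical, and the model does not cover the separate requirement that stores are non-speculative.

```python
def next_store_to_dispatch(pending_stores):
    """pending_stores: stores awaiting dispatch, oldest first.

    A store may dispatch only when its address and data µops have both
    produced results and no older store is still waiting.
    """
    if not pending_stores:
        return None
    oldest = pending_stores[0]
    if oldest["addr_ready"] and oldest["data_ready"]:
        return oldest
    return None


stores = [
    {"id": "A", "addr_ready": True, "data_ready": False},  # data not yet produced
    {"id": "B", "addr_ready": True, "data_ready": True},   # ready, but younger
]
print(next_store_to_dispatch(stores))   # None: B may not pass A

stores[0]["data_ready"] = True
print(next_store_to_dispatch(stores)["id"])
```

Store B is fully ready in the first call, yet nothing dispatches because the older store A is still waiting for its data.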
A study of the importance of memory access reordering concluded:
Stores must be constrained from passing other stores, for only a small impact on performance.
Stores can be constrained from passing loads, for an inconsequential performance loss.
Constraining loads from passing other loads or stores has a significant impact on performance.
The Memory Order Buffer (MOB) allows loads to pass other loads and stores by acting like a reservation station and re-order buffer. It holds suspended loads and stores and re-dispatches them when a blocking condition (dependency or resource) disappears.
2.3. MMX™ TECHNOLOGY AND THE PENTIUM® II PROCESSOR
2.3.1. MMX™ Technology in the Pentium® II Processor Pipeline
Pentium II processors use a Dynamic Execution architecture that blends out-of-order and speculative execution with hardware register renaming and branch prediction. These processors feature an in-order issue pipeline, which breaks Intel386 processor macro-instructions up into simple µoperations called µops (or uops), and an out-of-order, superscalar processor core, which executes the µops. The out-of-order core of the processor contains several pipelines to which integer, jump, floating-point, and memory execution units are attached. Several different execution units may be clustered on the same pipeline: for example, an integer arithmetic logic unit and the floating-point execution units (adder, multiplier, and divider) share a pipeline. The data cache is pseudo-dual ported via interleaving, with one port dedicated to loads and the other to stores. Most simple operations (integer ALU, floating-point add, even floating-point multiply) can be pipelined with a throughput of one or two operations per clock cycle. Floating-point divide is not pipelined. Long latency operations can proceed in parallel with short latency operations.
The Pentium II pipeline comprises three parts: (1) the In-Order Issue Front-end, (2) the Out-of-Order Core, and (3) the In-Order Retirement unit. Details about the In-Order Issue Front-end follow below.
Since the dynamic execution processors execute instructions out of order, the most important consideration in performance tuning is making sure enough µops are ready for execution. Correct branch prediction and fast decoding are essential to getting the most performance out of the In-Order Front-End. Branch prediction and the branch target buffer are discussed below and are detailed in the MMX™ Technology Developer’s Guide at the Intel website: http://developer.intel.com.
[Figure 2-8: pipeline diagram. Pipestages in order: BTB0, BTB1, IFU0, IFU1, IFU2, ID0, ID1, RAT, ROB Rd. The annotations are listed below.]
IFU: Instruction Cache Unit
IFU1: In this stage, 16-byte instruction packets are fetched. The packets are aligned on 16-byte boundaries.
IFU2: Instruction Pre-decode: double buffered: 16-byte packets aligned on any boundary.
ID0: Instruction Decode
ID1: Decode 1 stage: decoder limits
= at most 3 macro-instructions per cycle
= at most 6 µops (4-1-1) per cycle
= at most 3 µops per cycle exit the queue
= instructions up to 7 bytes in length
RAT: Register Allocation
Decode IP relative branches
= at most one per cycle
= branch information sent to BTB0 pipestage
Rename
= partial and flag stalls
Allocate resources
= the pipeline stalls if the ROB is full
ROB Rd: Re-order Buffer Read
= at most 2 completed physical register reads per cycle
Figure 2-8. Out of Order Core and Retirement Pipeline
During every clock cycle, up to three Intel Architecture macro-instructions can be decoded in the ID1 pipestage. However, if the instructions are complex or are over seven bytes long, then the decoder is limited to decoding fewer instructions.
The decoders can decode:
1. Up to three macro-instructions per clock cycle.
2. Up to six µops per clock cycle.
3. Macro-instructions up to seven bytes in length.
Pentium II processors have three decoders in the ID1 pipestage. The first decoder is capable of decoding one Intel Architecture macro-instruction of four or fewer µops in each clock cycle. The other two decoders can each decode an Intel Architecture instruction of one µop in each clock cycle. Instructions composed of more than four µops will take multiple cycles to decode. When programming in assembly language, scheduling the instructions in a 4-1-1 µop sequence increases the number of instructions that can be decoded each clock cycle. In general:
Simple instructions of the register-register form are only one µop.
Load instructions are only one µop.
Store instructions have two µops.
Simple read-modify instructions are two µops.
Simple instructions of the register-memory form have two to three µops.
Simple read-modify-write instructions are four µops.
Complex instructions generally have more than four µops, therefore they will take multiple cycles to decode.
For the purpose of counting µops, MMX technology instructions are simple instructions. See Appendix D in AP-526, Optimizations for Intel’s 32-bit Processors (Order Number 242816) for a table that specifies the number of µops for each instruction in the Intel Architecture instruction set.
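The effect of 4-1-1 ordering on decode throughput can be estimated with a simple model of the three decoders. This is a sketch under stated assumptions: the function `decode_cycles` is hypothetical, the seven-byte length limit is ignored, and an instruction of more than four µops is assumed to occupy the first decoder alone at four µops per cycle.

```python
def decode_cycles(uop_counts):
    """Estimate decode cycles for instructions given as µop counts, in order."""
    cycles = 0
    i = 0
    n = len(uop_counts)
    while i < n:
        cycles += 1
        first = uop_counts[i]
        i += 1
        if first > 4:
            # Assumed: complex instruction, extra cycles at 4 µops per cycle.
            cycles += (first - 1) // 4
            continue
        # Decoders 1 and 2 each accept only a single-µop instruction,
        # and decoding stays in program order.
        for _ in range(2):
            if i < n and uop_counts[i] == 1:
                i += 1
            else:
                break
    return cycles


print(decode_cycles([4, 1, 1, 4, 1, 1]))   # 4-1-1 ordering
print(decode_cycles([1, 1, 4, 1, 1, 4]))   # same instructions, poorer ordering
print(decode_cycles([2, 2, 2]))            # only the first decoder can take these
```

In this model the same six instructions take fewer cycles when each multi-µop instruction is placed where it reaches the first decoder.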
Once the µops are decoded, they will be issued from the In-Order Front-End into the Reservation Station (RS), which is the beginning pipestage of the Out-of-Order core. In the RS, the µops wait until their data operands are available. Once a µop has all data sources available, it will be dispatched from the RS to an execution unit. If a µop enters the RS in a data-ready state (that is, all data is available), then the µop will be immediately dispatched to an appropriate execution unit, if one is available. In this case, the µop will spend very few clock cycles in the RS. All of the execution units are clustered on ports coming out of the RS. Once the µop has been executed it returns to the ROB, and waits for retirement.
In the retirement pipestage, all data values are written back to memory and all µops are retired in-order, three at a time. Figure 2-9 provides details about the Out-of-Order core and the In-Order retirement pipestages.
[Figure 2-9: pipeline diagram. µops pass from ROB rd to the RS, dispatch through Port 0 to Port 4, then proceed to ROB wb and RRF. The annotations are listed below.]
Reservation station (RS): A µop can remain in the RS for many cycles or simply move past to an execution unit. On average, a µop will remain in the RS for three cycles or pipestages.
Execution pipelines coming out of the RS are multiple pipelines grouped into five clusters. Additional information regarding each pipeline is in the following table.
Re-order Buffer Writeback (ROB wb)
Retirement (RRF): At most, three µops are retired per cycle. Taken branches must retire in the first slot.
Figure 2-9. Out-of-Order Core and Retirement Pipeline