Intel® Celeron® Mobile Processor
P4000 and U3000 Series
Datasheet
Revision 001
October 2010
Document Number: 324471-001
Legal Lines and Disclaimers
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED,
BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS
PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER,
AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING
LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY
PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY
APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH
MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the
absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future
definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The
information here is subject to change without notice. Do not finalize a design with this information.
Δ Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor
family, not across different processor families. See http://www.intel.com/products/processor_number
Intel, Intel SpeedStep, Celeron, Intel vPro and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.
*Other names and brands may be claimed as the property of others.
Table 7-43. DDR3 Signal Group DC Specifications
Table 7-44. Control Sideband and TAP Signal Group DC Specifications
Table 7-45. PCI Express DC Specifications
Table 7-46. eDP DC Specifications
Table 7-47. PECI DC Electrical Limits
Table 8-48. rPGA988A Processor Pin List by Pin Number
Table 8-49. rPGA988A Processor Pin List by Pin Name
Table 8-50. BGA1288 Processor Ball List by Ball Name
Table 8-51. BGA1288 Processor Ball List by Ball Number
Revision History
Revision Number    Description        Revision Date
001                Initial release    October 2010
§
1 Features Summary
1.1 Introduction
The Intel® Celeron® P4000 and U3000 mobile processor series is the next generation of
64-bit, multi-core mobile processors built on 32-nanometer process technology. Based
on the low-power/high-performance Nehalem micro-architecture, the processor is
designed for a two-chip platform as opposed to the traditional three-chip platforms
(processor, GMCH, and ICH). The two-chip platform consists of a processor and the
Platform Controller Hub (PCH) and enables higher performance, lower cost, easier
validation, and improved x-y footprint. The PCH may also be referred to as the Mobile
Intel® 5 Series Chipset (formerly Ibex Peak-M). The Intel® Celeron® P4000 and U3000
mobile processor series is designed for the Calpella platform and is offered in the
rPGA988A and BGA1288 packages, respectively.
Processors in this family include an Intel® HD graphics and memory controller die
on the same package as the processor core die. This two-chip solution of a processor
core die with an integrated graphics and memory controller die is known as a multi-chip
package (MCP) processor.
Note:
1. Throughout this document, the Intel® Celeron® P4000 and U3000 mobile processor
series is referred to as the processor.
2. Throughout this document, Intel® HD graphics is referred to as integrated graphics.
3. The integrated graphics and memory controller die is built on 45-nanometer process
technology.
4. The Intel® Celeron® P4000 and U3000 mobile processor series is not Intel® vPro™
eligible.
Figure 1-1. Intel® Celeron® P4000 and U3000 mobile processor series on the Calpella
Platform
[Figure: platform block diagram. Processor labels: Dual Core Processor; GPU, Memory
Controller; DDR3 SO-DIMMs (800/1066 MT/s, 2 Channels, 1 SO-DIMM / Channel); PCI Express* x16
to Discrete Graphics (PEG) OR Embedded DisplayPort* (eDP); DMI2 (x4), Intel® Flexible Display
Interface, and PECI to the PCH. PCH (Mobile Intel® 5 Series Chipset) labels: Analog CRT,
LVDS Flat Panel, Digital Display x 3, Serial ATA (6 Ports, 3 Gb/s), USB 2.0 (14 Ports),
Intel® HD Audio, 8 PCI Express* x1 Ports (2.5 GT/s), Gigabit Network Connection, WiFi, WiMax,
PCI, SPI Flash, LPC, FWH, Super I/O, SMBUS 2.0, GPIO, Dual Channel NAND Interface,
Controller Link 1, Intel® Management Engine.]
1.2 Processor Feature Details
• Two execution cores
• A 32-KB instruction and 32-KB data first-level cache (L1) for each core
• A 512-KB instruction/data second-level cache (L2) in total, 256 KB for each core
• Up to 2-MB instruction/data third-level cache (L3), shared among all cores
1.2.1 Supported Technologies
• Intel® Virtualization Technology (Intel® VT-x)
• Intel® 64 architecture
• Execute Disable Bit
Note: Refer to the Intel® Celeron® P4000 and U3000 mobile processor series
Specification Update for feature support details.
1.3 Interfaces
1.3.1 System Memory Support
• One or two channels of DDR3 memory with a maximum of one SO-DIMM per
channel
• Single- and dual-channel memory organization modes
• Data burst length of eight for all memory organization modes
• Memory DDR3 data transfer rates of 800 MT/s (SV/ULV) and 1066 MT/s (SV)
• 64-bit wide channels
• DDR3 I/O Voltage of 1.5 V
• Non-ECC, unbuffered DDR3 SO-DIMMs only
• Theoretical maximum memory bandwidth of:
— 12.8 GB/s in dual-channel mode assuming DDR3 800 MT/s
• 1-Gb and 2-Gb DDR3 DRAM technologies are supported for x8 and x16 devices.
• Using 2-Gb device technologies, the largest memory capacity possible is 8 GB,
assuming dual-channel mode with two x8, double-sided, unbuffered, non-ECC
SO-DIMMs.
• Up to 32 simultaneous open pages, 16 per channel (assuming 4 ranks of 8-bank devices)
• Partial Writes to memory using Data Mask (DM) signals
• On-Die Termination (ODT)
• Intel® Fast Memory Access (Intel® FMA):
— Just-in-Time Command Scheduling
— Command Overlap
— Out-of-Order Scheduling
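The bandwidth and capacity figures in the list above follow from simple arithmetic. The sketch below reproduces them in Python. The function names are illustrative and not part of any Intel tool, and 1 GB is taken as 10^9 bytes for bandwidth and as 8 Gb for capacity.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: int, channels: int,
                        bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8          # 64-bit channel -> 8 bytes
    return transfer_rate_mt_s * 1_000_000 * bytes_per_transfer * channels / 1e9


def max_capacity_gb(device_gbit: int, device_width_bits: int,
                    ranks_per_dimm: int, dimms: int,
                    bus_width_bits: int = 64) -> float:
    """Total capacity in GB for identical SO-DIMMs built from one device type."""
    devices_per_rank = bus_width_bits // device_width_bits   # x8 -> 8 devices
    rank_gbit = device_gbit * devices_per_rank
    return rank_gbit * ranks_per_dimm * dimms / 8             # Gb -> GB


# Dual-channel DDR3 at 800 MT/s
print(peak_bandwidth_gb_s(800, channels=2))                    # 12.8
# Two double-sided (2-rank) SO-DIMMs built from 2-Gb x8 devices
print(max_capacity_gb(2, 8, ranks_per_dimm=2, dimms=2))        # 8.0
```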
1.3.2 PCI Express*
• The processor PCI Express ports are fully compliant with the PCI Express Base
Specification, Revision 2.0.
— One 16-lane PCI Express* port intended for graphics attach.
• Gen1 (2.5 GT/s) PCI Express* frequency is supported.
• Gen1 Raw bit-rate on the data pins of 2.5 Gb/s, resulting in a real bandwidth per
pair of 250 MB/s given the 8b/10b encoding used to transmit data across this
interface. This also does not account for packet overhead and link maintenance.
• Maximum theoretical bandwidth on interface of 4 GB/s in each direction
simultaneously, for an aggregate of 8 GB/s when x16 Gen 1.
• Hierarchical PCI-compliant configuration mechanism for downstream devices.
• Traditional PCI style traffic (asynchronous snooped, PCI ordering).
• PCI Express extended configuration space. The first 256 bytes of configuration
space aliases directly to the PCI compatibility configuration space. The remaining
portion of the fixed 4-KB block of memory-mapped space above that (starting at
100h) is known as “extended configuration space”.
• PCI Express Enhanced Access Mechanism, which accesses the device configuration space
in a flat memory-mapped fashion.
• Automatic discovery, negotiation, and training of link out of reset.
• Traditional AGP style traffic (asynchronous non-snooped, PCI-X Relaxed ordering).
• Peer segment destination posted write traffic (no peer-to-peer read traffic) in
Virtual Channel 0:
— DMI -> PCI Express Port 0
• 64-bit downstream address format, but the processor never generates an address
above 64 GB (Bits 63:36 will always be zeros).
• 64-bit upstream address format, but the processor responds to upstream read
transactions to addresses above 64 GB (addresses where any of Bits 63:36 are
non-zero) with an Unsupported Request response. Upstream write transactions to
addresses above 64 GB will be dropped.
• Re-issues configuration cycles that have been previously completed with the
Configuration Retry status.
• PCI Express reference clock is 100-MHz differential clock buffered out of system
clock generator.
• Message Signaled Interrupt (MSI and MSI-X) messages
• PEG Lanes shared with Embedded DisplayPort* (see eDP, Section 1.3.6).
• Polarity inversion
• Static lane numbering reversal
— Does not support dynamic lane reversal, as defined (optional) by the PCI
Express Base Specification.
— PCI Express 1x16 configuration:
Normal (1x16): PEG_RX[15:0]; PEG_TX[15:0]
Reversal (1x16): PEG_RX[0:15]; PEG_TX[0:15]
1.3.3 Direct Media Interface (DMI)
• Compliant to Direct Media Interface second generation (DMI2).
• Four lanes in each direction.
• 2.5 GT/s point-to-point DMI interface to PCH is supported.
• Raw bit-rate on the data pins of 2.5 Gb/s, resulting in a real bandwidth per pair of
250 MB/s given the 8b/10b encoding used to transmit data across this interface.
Does not account for packet overhead and link maintenance.
• Maximum theoretical bandwidth on interface of 1 GB/s in each direction
simultaneously, for an aggregate of 2 GB/s when DMI x4.
• Shares 100-MHz PCI Express reference clock.
• 64-bit downstream address format, but the processor never generates an address
above 64 GB (Bits 63:36 will always be zeros).
• 64-bit upstream address format, but the processor responds to upstream read
transactions to addresses above 64 GB (addresses where any of Bits 63:36 are
nonzero) with an Unsupported Request response. Upstream write transactions to
addresses above 64 GB will be dropped.
• Supports the following traffic types to or from the PCH:
— DMI -> PCI Express Port 0 write traffic
— DMI -> DRAM
— DMI -> processor core (Virtual Legacy Wires (VLWs), Resetwarn, or MSIs only)
— Processor core -> DMI
• APIC and MSI interrupt messaging support:
— Message Signaled Interrupt (MSI and MSI-X) messages
• Downstream SMI, SCI and SERR error indication.
• Legacy support for ISA regime protocol (PHOLD/PHOLDA) required for parallel port
DMA, floppy drive, and LPC bus masters.
• DC coupling – no capacitors between the processor and the PCH.
• Polarity inversion.
• PCH end-to-end lane reversal across the link.
• Supports Half Swing “low-power/low-voltage.”
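The 64-GB address rule stated for both PCI Express and DMI amounts to testing whether any of address bits 63:36 is set. The following Python sketch models that rule only. The function names and the returned strings ("accepted", "unsupported_request", "dropped") are hypothetical labels chosen for illustration, not signal or register names from this datasheet.

```python
ADDRESS_LIMIT_BITS = 36            # 2**36 bytes = 64 GB
MASK_64 = (1 << 64) - 1            # addresses use a 64-bit format


def is_above_64gb(address: int) -> bool:
    """True when any of bits 63:36 of the 64-bit address is non-zero."""
    return ((address & MASK_64) >> ADDRESS_LIMIT_BITS) != 0


def upstream_response(address: int, is_read: bool) -> str:
    """Model the documented handling of upstream transactions."""
    if not is_above_64gb(address):
        return "accepted"
    # Reads above 64 GB get an Unsupported Request response; writes are dropped.
    return "unsupported_request" if is_read else "dropped"


print(upstream_response(0x0_FFFF_FFFF, is_read=True))    # accepted
print(upstream_response(1 << 36, is_read=True))          # unsupported_request
print(upstream_response(1 << 36, is_read=False))         # dropped
```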
1.3.4 Platform Environment Control Interface (PECI)
The PECI is a one-wire interface that provides a communication channel between a
PECI client (the processor) and a PECI master (the PCH).
1.3.5 Intel® HD Graphics Controller
• The integrated graphics controller contains a refresh of the fifth generation graphics
core
• Intel® Dynamic Video Memory Technology (Intel® DVMT) support
Term                  Description
[...]                 [...]® Display Power Saving Technology
[...]                 Technology that provides power management capabilities to laptops.
[...]                 [...] non-executable, when combined with a supporting operating system.
                      If code attempts to run in non-executable memory the processor
                      raises an error to the operating system. This feature can prevent some
                      classes of viruses or worms that exploit buffer overrun vulnerabilities
                      and can thus help improve the overall security of the system. See the
                      Intel® 64 and IA-32 Architectures Software Developer's Manuals for
                      more detailed information.
ICH                   The legacy I/O Controller Hub component that contains the main PCI
                      interface, LPC interface, USB2, Serial ATA, and other I/O functions. It
                      communicates with the legacy (G)MCH over a proprietary interconnect
                      called DMI.
IMC                   Integrated Memory Controller
Intel® 64 Technology  64-bit memory extensions to the IA-32 architecture
ITPM                  Integrated Trusted Platform Module
IOV                   I/O Virtualization
LCD                   Liquid Crystal Display
LVDS                  Low Voltage Differential Signaling. A high speed, low power data
                      transmission standard used for display connections to LCD panels.
MCP                   Multi-Chip Package.
NCTF                  Non-Critical to Function. NCTF locations are typically redundant
                      ground or non-critical reserved, so the loss of the solder joint
                      continuity at end of life conditions will not affect the overall product
                      functionality.
Nehalem               Intel’s 45-nm processor design, follow-on to the 45-nm Penryn design.
PCH                   Platform Controller Hub. The new, 2009 chipset with centralized
                      platform capabilities including the main I/O interfaces along with
                      display connectivity, audio features, power management,
                      manageability, security and storage features. The PCH may also be
                      referred to using the name (Mobile) Intel® 5 Series Chipset.
PECI                  Platform Environment Control Interface.
PEG                   PCI Express* Graphics. External Graphics using PCI Express
                      Architecture. A high-speed serial interface whose configuration is
                      software compatible with the existing PCI specifications.
Processor             The 64-bit, single-core or multi-core component (package).
Processor Core        The term “processor core” refers to the Si die itself, which can contain
                      multiple execution cores. Each execution core has an instruction
                      cache, data cache, and 256-KB L2 cache. All execution cores share the
                      L3 cache.
Rank                  A unit of DRAM corresponding to four to eight devices in parallel, ignoring
                      ECC. These devices are usually, but not always, mounted on a single
                      side of a SO-DIMM.
SCI                   System Control Interrupt. Used in ACPI protocol.
Storage Conditions    A non-operational state. The processor may be installed in a platform,
                      in a tray, or loose. Processors may be sealed in packaging or exposed
                      to free air. Under these conditions, processor landings should not be
                      connected to any supply voltages, have any I/Os biased or receive any
                      clocks. Upon exposure to “free air” (i.e., unsealed packaging or a
                      device removed from packaging material) the processor must be
                      handled in accordance with moisture sensitivity labeling (MSL) as
                      indicated on the packaging material.
TAC                   Thermal Averaging Constant.
TDP                   Thermal Design Power.
VCC                   Processor core power supply.
VSS                   Processor ground.
VAXG                  Graphics core power supply.
VTT                   L3 shared cache, memory controller, and processor I/O power rail.
VDDQ                  DDR3 power rail.
VLD                   Variable Length Decoding.
x1                    Refers to a Link or Port with one Physical Lane.
x4                    Refers to a Link or Port with four Physical Lanes.
x8                    Refers to a Link or Port with eight Physical Lanes.
x16                   Refers to a Link or Port with sixteen Physical Lanes.
1.8 Related Documents
Document                                                          Document Number/Location
Public Specifications
Advanced Configuration and Power Interface Specification 3.0     http://www.acpi.info/
PCI Local Bus Specification 3.0                                   http://www.pcisig.com/specifications
PCI Express Base Specification 2.0                                http://www.pcisig.com
DDR3 SDRAM Specification                                          http://www.jedec.org
DisplayPort Specification                                         http://www.vesa.org
Intel® 64 and IA-32 Architectures Software Developer's Manuals    http://www.intel.com/products/processor/manuals/index.htm
  Volume 1: Basic Architecture                                    253665
  Volume 2A: Instruction Set Reference, A-M                       253666
  Volume 2B: Instruction Set Reference, N-Z                       253667
  Volume 3A: System Programming Guide                             253668
  Volume 3B: System Programming Guide                             253669
§
2 Interfaces
This chapter describes the interfaces supported by the processor.
2.1 System Memory Interface
2.1.1 System Memory Technology Supported
The Integrated Memory Controller (IMC) supports DDR3 protocols with two independent,
64-bit wide channels, each accessing one SO-DIMM. It supports a maximum of one
unbuffered, non-ECC DDR3 SO-DIMM per channel, thus allowing up to two device ranks
per channel.
• DDR3 Data Transfer Rates:
— 800 MT/s (PC3-6400) and 1066 MT/s (PC3-8500)
• DDR3 SO-DIMM Modules:
— Raw Card A – double-sided x16 unbuffered non-ECC
— Raw Card B – single-sided x8 unbuffered non-ECC
— Raw Card C – single-sided x16 unbuffered non-ECC
— Raw Card D – double-sided x8 (stacked) unbuffered non-ECC
— Raw Card F – double-sided x8 (planar) unbuffered non-ECC
• DDR3 DRAM Device Technology:
— Standard 1-Gb and 2-Gb technologies and addressing are supported for x16
and x8 devices. There is no support for memory modules with different
technologies or capacities on opposite sides of the same memory module. If one
side of a memory module is populated, the other side is either identical or
empty.
DIMM      DRAM Device  DRAM          # of DRAM  # of Physical  # of Row/Col  # of Banks   Page
Capacity  Technology   Organization  Devices    Device Ranks   Address Bits  Inside DRAM  Size
4 GB      2 Gb         256 M x 8     16         2              15/10         8            8K
NOTES:
1. System memory configurations are based on availability and are subject to change.
2. Only Raw Card D SO-DIMMs at 1066 MT/s are supported.
2.1.2 System Memory Timing Support
The IMC supports the following DDR3 Speed Bin, CAS Write Latency (CWL), and
command signal mode timings on the main memory interface:
• tCL = CAS Latency
• tRCD = Activate Command to READ or WRITE Command delay
• tRP = PRECHARGE Command Period
• CWL = CAS Write Latency
• Command Signal modes = 1n indicates a new command may be issued every clock
and 2n indicates a new command may be issued every 2 clocks. Command launch
mode programming depends on the transfer rate and memory configuration.
Table 2-2. DDR3 System Memory Timing Support
Transfer Rate  tCL    tRCD   tRP    CWL    CMD Mode  Notes
(MT/s)         (tCK)  (tCK)  (tCK)  (tCK)
800            6      6      6      5      1n        1
1066           7      7      7      6      1n        1
               8      8      8
NOTES:
1. System memory timing support is based on availability and is subject to change.
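The timings in Table 2-2 are expressed in clock cycles (tCK). As a rough aid, the Python sketch below converts them to nanoseconds, assuming the DDR3 clock runs at half the transfer rate (two transfers per clock). The helper names are illustrative. Note that "1066" is the rounded label for a 533.33-MHz clock, so the result for that speed is approximate.

```python
def tck_ns(transfer_rate_mt_s: float) -> float:
    """Clock period in ns; DDR transfers data twice per clock."""
    clock_mhz = transfer_rate_mt_s / 2
    return 1000.0 / clock_mhz


def latency_ns(cycles: int, transfer_rate_mt_s: float) -> float:
    """Convert a timing parameter given in tCK to nanoseconds."""
    return cycles * tck_ns(transfer_rate_mt_s)


# tCL = 6 at 800 MT/s: 6 cycles of 2.5 ns each
print(latency_ns(6, 800))              # 15.0
# tCL = 7 at 1066 MT/s: about 13.13 ns with the rounded transfer rate
print(round(latency_ns(7, 1066), 2))
```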
2.1.3 System Memory Organization Modes
The IMC supports two memory organization modes, single-channel and dual-channel.
Depending upon how the SO-DIMM Modules are populated in each memory channel, a
number of different configurations can exist.
2.1.3.1 Single-Channel Mode
In this mode, all memory cycles are directed to a single channel. Single-channel mode
is used when either Channel A or Channel B SO-DIMM connectors are populated in any
order, but not both.
The IMC supports Intel Flex Memory Technology Mode. This mode combines the
advantages of the Dual-Channel Symmetric (Interleaved) and Dual-Channel
Asymmetric Modes. Memory is divided into a symmetric and an asymmetric zone. The
symmetric zone starts at the lowest address in each channel and is contiguous until the
asymmetric zone begins or until the top address of the channel with the smaller
capacity is reached. In this mode, the system runs with one zone of dual-channel mode
and one zone of single-channel mode, simultaneously, across the whole memory array.
Figure 2-2. Intel Flex Memory Technology Operation
[Figure: CH A and CH B address space up to TOM, split into a dual channel interleaved access
zone (B on each channel) and a non interleaved access zone (C).]
B – The largest physical memory amount of the smaller size memory module
C – The remaining physical memory amount of the larger size memory module
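The split between the symmetric and asymmetric zones can be worked out from the two channel capacities. The Python sketch below follows the B and C definitions in the Figure 2-2 legend. The function name and dictionary keys are illustrative, and the model ignores any address ranges reserved by the platform.

```python
def flex_memory_zones(channel_a_mb: int, channel_b_mb: int) -> dict:
    """Split total memory into an interleaved zone and a single-channel zone.

    B is the capacity of the smaller channel (present on both channels),
    C is what remains on the larger channel.
    """
    b = min(channel_a_mb, channel_b_mb)
    c = max(channel_a_mb, channel_b_mb) - b
    return {
        "dual_channel_interleaved_mb": 2 * b,   # B from each channel
        "single_channel_mb": c,                 # remainder of the larger channel
    }


# 4 GB on one channel and 2 GB on the other
print(flex_memory_zones(4096, 2048))
# {'dual_channel_interleaved_mb': 4096, 'single_channel_mb': 2048}
```

With equal capacities the single-channel zone is empty, which corresponds to fully symmetric operation.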
2.1.3.2.1 Dual-Channel Symmetric Mode
Dual-Channel Symmetric mode, also known as interleaved mode, provides maximum
performance on real world applications. Addresses alternate between the
channels after each cache line (64-byte boundary). If there are two requests, and the
second request is to an address on the opposite channel from the first, that request can
be sent before data from the first request has returned. If two consecutive cache lines
are requested, both may be retrieved simultaneously, since they are guaranteed to be on
opposite channels. Use Dual-Channel Symmetric mode when both Channel A and
Channel B SO-DIMM connectors are populated in any order, with the total amount of
memory in each channel being the same.
[Figure 2-3 panels: Dual Channel Interleaved (memory sizes must match); Dual Channel
Asymmetric (memory sizes can differ). Labels: CH. A, CH. B, CL, 0, CH.A-top DRB, Top of Memory.]
When both channels are populated with the same memory capacity and the boundary
between the dual channel zone and the single channel zone is the top of memory, IMC
operates completely in Dual-Channel Symmetric mode.
Note: The DRAM device technology and width may vary from one channel to the other.
2.1.3.2.2 Dual-Channel Asymmetric Mode
This mode trades performance for system design flexibility. Unlike the previous mode,
addresses start at the bottom of Channel A and stay there until the end of the highest
rank in Channel A, and then addresses continue from the bottom of Channel B to the
top. Real world applications are unlikely to make requests that alternate between
addresses that sit on opposite channels with this memory organization, so in most
cases, bandwidth is limited to a single channel.
This mode is used when Intel Flex Memory Technology is disabled and both Channel A
and Channel B SO-DIMM connectors are populated in any order with the total amount
of memory in each channel being different.
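The difference between the two dual-channel modes can be seen in how a physical address selects a channel. The Python sketch below is a simplified model under stated assumptions: even-numbered cache lines are assumed to map to Channel A in symmetric mode (the datasheet does not say which channel comes first), and the function names are illustrative.

```python
CACHE_LINE_BYTES = 64


def symmetric_channel(address: int) -> str:
    """Interleaved mode: consecutive 64-byte cache lines alternate channels."""
    line_index = address // CACHE_LINE_BYTES
    return "A" if line_index % 2 == 0 else "B"   # assumption: even lines on A


def asymmetric_channel(address: int, channel_a_bytes: int) -> str:
    """Asymmetric mode: Channel A is filled first, then Channel B."""
    return "A" if address < channel_a_bytes else "B"


# Two consecutive cache lines land on opposite channels in symmetric mode ...
print(symmetric_channel(0x0000), symmetric_channel(0x0040))                    # A B
# ... but on the same channel in asymmetric mode (Channel A holds 2 GB here).
print(asymmetric_channel(0x0000, 2 << 30), asymmetric_channel(0x0040, 2 << 30))  # A A
```

This illustrates why bandwidth in asymmetric mode is usually limited to a single channel: neighboring addresses rarely sit on opposite channels.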
Figure 2-3. Dual-Channel Symmetric (Interleaved) and Dual-Channel Asymmetric Modes
2.1.4 Rules for Populating Memory Slots
In all modes, the frequency of system memory is the lowest frequency of all memory
modules placed in the system, as determined through the SPD registers on the
memory modules. The system memory controller supports only one SO-DIMM
connector per channel. For dual-channel modes both channels must have an SO-DIMM
connector populated. For single-channel mode, only a single channel can have an
SO-DIMM connector populated.
2.1.5 Technology Enhancements of Intel® Fast Memory Access (Intel® FMA)
The following sections describe the Just-in-Time Scheduling, Command Overlap, and
Out-of-Order Scheduling Intel FMA technology enhancements.
2.1.5.1 Just-in-Time Command Scheduling
The memory controller has an advanced command scheduler where all pending
requests are examined simultaneously to determine the most efficient request to be
issued next. The most efficient request is picked from all pending requests and issued
to system memory Just-in-Time to make optimal use of Command Overlapping. Thus,
instead of having all memory access requests go individually through an arbitration
mechanism forcing requests to be executed one at a time, they can be started without
interfering with the current request allowing for concurrent issuing of requests. This
allows for optimized bandwidth and reduced latency while maintaining appropriate
command spacing to meet system memory protocol.
2.1.5.2 Command Overlap
Command Overlap allows the insertion of the DRAM commands between the Activate,
Precharge, and Read/Write commands normally used, as long as the inserted
commands do not affect the currently executing command. Multiple commands can be
issued in an overlapping manner, increasing the efficiency of system memory protocol.
2.1.5.3 Out-of-Order Scheduling
While leveraging the Just-in-Time Scheduling and Command Overlap enhancements,
the IMC continuously monitors pending requests to system memory for the best use of
bandwidth and reduction of latency. If there are multiple requests to the same open
page, these requests are launched back to back to make optimum
use of the open memory page. This ability to reorder requests on the fly allows the IMC
to further reduce latency and increase bandwidth efficiency.
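The idea of favoring requests to an already open page can be illustrated with a toy scheduler. The Python sketch below is not the IMC's actual algorithm, which this datasheet does not describe in detail. It is a minimal model, assuming one open row per bank, in which the oldest request that hits an open row is issued first and the oldest pending request is issued otherwise. All names are hypothetical.

```python
from collections import namedtuple

Request = namedtuple("Request", "id bank row")


def pick_next(pending, open_rows):
    """Return the oldest request that hits an open row, else the oldest request."""
    for request in pending:
        if open_rows.get(request.bank) == request.row:
            return request
    return pending[0]


def schedule(requests):
    """Return request ids in issue order for the toy open-page policy."""
    pending = list(requests)
    open_rows = {}                      # bank -> currently open row
    order = []
    while pending:
        request = pick_next(pending, open_rows)
        pending.remove(request)
        open_rows[request.bank] = request.row   # issuing opens (or keeps) the row
        order.append(request.id)
    return order


# Requests 0 and 2 target the same page, so 2 is issued right after 0.
queue = [Request(0, bank=0, row=1), Request(1, bank=0, row=2), Request(2, bank=0, row=1)]
print(schedule(queue))   # [0, 2, 1]
```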
2.1.6 DRAM Clock Generation
Every supported SO-DIMM has two differential clock pairs. There are a total of four clock
pairs driven directly by the processor to two SO-DIMMs.
2.1.7 System Memory Pre-Charge Power Down Support Details
The IMC supports and enables slow exit DDR3 DRAM Device pre-charge power down
DLL control. During a pre-charge power down, a slow exit is where the DRAM device
DLL is disabled after entering pre-charge power down for potential power savings.
2.2 PCI Express Interface
This section describes the PCI Express interface capabilities of the processor. See the
PCI Express Base Specification for details of PCI Express.
The processor has one PCI Express controller that can support one external x16 PCI
Express Graphics Device or two external x8 PCI Express Graphics Devices. The primary
PCI Express Graphics port is referred to as PEG 0 and the secondary PCI Express
Graphics port is referred to as PEG 1.
2.2.1 PCI Express Architecture
Compatibility with the PCI addressing model is maintained to ensure that all existing
applications and drivers operate unchanged.
The PCI Express configuration uses standard mechanisms as defined in the PCI
Plug-and-Play specification. The initial recovered clock speed of 1.25 GHz results in
2.5 Gb/s/direction which provides a 250 MB/s communications channel in each
direction (500 MB/s total). That is close to twice the data rate of classic PCI. The
8b/10b encoding accounts for the 250 MB/s figure, where a quick calculation
(2.5 Gb/s divided by 8 bits per byte) would imply roughly 300 MB/s.
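The effect of 8b/10b encoding on usable bandwidth can be checked numerically. The Python sketch below derives the 250 MB/s per-lane figure and scales it to the x16 PEG link and the x4 DMI link quoted in Chapter 1. The function names are illustrative, and packet overhead and link maintenance traffic are ignored, as in the datasheet figures.

```python
def lane_bandwidth_mb_s(gt_per_s: float = 2.5,
                        payload_bits: int = 8,
                        encoded_bits: int = 10) -> float:
    """Usable bandwidth of one lane, one direction, in MB/s (1 MB = 10**6 bytes)."""
    raw_bits_per_s = gt_per_s * 1e9
    data_bits_per_s = raw_bits_per_s * payload_bits / encoded_bits   # 8b/10b
    return data_bits_per_s / 8 / 1e6


def link_bandwidth_gb_s(lanes: int) -> float:
    """Usable bandwidth of a Gen1 link, one direction, in GB/s."""
    return lanes * lane_bandwidth_mb_s() / 1000


print(lane_bandwidth_mb_s())        # 250.0
print(link_bandwidth_gb_s(16))      # 4.0  (PEG x16, each direction)
print(link_bandwidth_gb_s(4))       # 1.0  (DMI x4, each direction)
```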
The PCI Express architecture is specified in three layers: Transaction Layer, Data Link
Layer, and Physical Layer. The partitioning in the component is not necessarily along
these same boundaries. Refer to Figure 2-4 for the PCI Express Layering Diagram.
Figure 2-4. PCI Express Layering Diagram
PCI Express uses packets to communicate information between components. Packets
are formed in the Transaction and Data Link Layers to carry the information from the
transmitting component to the receiving component. As the transmitted packets flow
through the other layers, they are extended with additional information necessary to
handle packets at those layers. At the receiving side, the reverse process occurs and
packets get transformed from their Physical Layer representation to the Data Link
Layer representation and finally (for Transaction Layer Packets) to the form that can be
processed by the Transaction Layer of the receiving device.
Figure 2-5. Packet Flow through the Layers
2.2.1.1 Transaction Layer
The upper layer of the PCI Express architecture is the Transaction Layer. The
Transaction Layer's primary responsibility is the assembly and disassembly of
Transaction Layer Packets (TLPs). TLPs are used to communicate transactions, such as
read and write, as well as certain types of events. The Transaction Layer also manages
flow control of TLPs.
2.2.1.2 Data Link Layer
The middle layer in the PCI Express stack, the Data Link Layer, serves as an
intermediate stage between the Transaction Layer and the Physical Layer.
Responsibilities of Data Link Layer include link management, error detection, and error
correction.
The transmission side of the Data Link Layer accepts TLPs assembled by the
Transaction Layer, calculates and applies data protection code and TLP sequence
number, and submits them to Physical Layer for transmission across the Link. The
receiving Data Link Layer is responsible for checking the integrity of received TLPs and
for submitting them to the Transaction Layer for further processing. On detection of TLP
error(s), this layer is responsible for requesting retransmission of TLPs until information
is correctly received, or the Link is determined to have failed. The Data Link Layer also
generates and consumes packets which are used for Link management functions.
2.2.1.3 Physical Layer
The Physical Layer includes all circuitry for interface operation, including driver and
input buffers, parallel-to-serial and serial-to-parallel conversion, PLL(s), and impedance
matching circuitry. It also includes logical functions related to interface initialization and
maintenance. The Physical Layer exchanges data with the Data Link Layer in an
implementation-specific format, and is responsible for converting this to an appropriate
serialized format and transmitting it across the PCI Express Link at a frequency and
width compatible with the remote device.
2.2.2 PCI Express Configuration Mechanism
The PCI Express (external graphics) link is mapped through a PCI-to-PCI bridge
structure.
Figure 2-6. PCI Express Related Register Structures in the Processor
[Figure labels: PCI Compatible Host Bridge Device (Device 0); PCI-PCI Bridge representing
root PCI Express port (Device 1); PCI Express Device; PEG0; DMI.]
PCI Express extends the configuration space to 4096 bytes per-device/function, as
compared to 256 bytes allowed by the Conventional PCI Specification. PCI Express
configuration space is divided into a PCI-compatible region (which consists of the first
256 bytes of a logical device's configuration space) and an extended PCI Express region
(which consists of the remaining configuration space). The PCI-compatible region can
be accessed using either the mechanisms defined in the PCI specification or using the
enhanced PCI Express configuration access mechanism described in the PCI Express Enhanced Configuration Mechanism section.
The PCI Express Host Bridge is required to translate the memory-mapped PCI Express
configuration space accesses from the host processor to PCI Express configuration
cycles. To maintain compatibility with PCI configuration addressing mechanisms, it is
recommended that system software access the enhanced configuration space using
32-bit operations (32-bit aligned) only. See the PCI Express Base Specification for
details of both the PCI-compatible and PCI Express Enhanced configuration
mechanisms and transaction rules.
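The flat memory-mapped access described above can be sketched as an address calculation. The layout used below (20 bits of bus, 15 of device, 12 of function above a 4-KB register window) comes from the enhanced configuration access mechanism in the PCI Express Base Specification, not from this datasheet. The base address is a hypothetical value chosen only for the example; on a real system it is programmed by firmware.

```python
ECAM_BASE = 0xE000_0000   # hypothetical base address, for illustration only


def ecam_address(base: int, bus: int, device: int, function: int, offset: int) -> int:
    """Memory-mapped address of a configuration register (4 KB per function)."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= offset < 4096 or offset % 4 != 0:
        # The datasheet recommends 32-bit aligned accesses only.
        raise ValueError("offset must be 32-bit aligned and below 4096")
    return base + ((bus << 20) | (device << 15) | (function << 12) | offset)


def is_extended_region(offset: int) -> bool:
    """True for offsets beyond the 256-byte PCI-compatible region."""
    return offset >= 0x100


# First extended register of Device 1 (the PEG root port), Function 0, on Bus 0
print(hex(ecam_address(ECAM_BASE, bus=0, device=1, function=0, offset=0x100)))  # 0xe0008100
print(is_extended_region(0x100))   # True
```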
2.2.3 PCI Express Ports and Bifurcation
The external graphics attach (PEG) on the processor is a single, 16-lane (x16) port that
can be:
• configured at narrower widths
• bifurcated into two x8 PCI Express ports that may train to narrower widths
The PEG port is designed to be compliant with the PCI Express Base
Specification, Revision 2.0.
2.2.3.1 PCI Express Bifurcated Mode
When bifurcated, the signals which had previously been assigned to Lanes 15:8 of the
single x16 Primary port are reassigned to lanes 7:0 of the x8 Secondary Port. This
assignment applies whether the lane numbering is reversed or not. PCI Express Port 0
is mapped to PCI Device 1 and PCI Express Port 1 is mapped to PCI Device 6.
2.2.3.2 Static Lane Numbering Reversal
The processor does not support dynamic lane reversal, as defined (optional) by the
PCI Express Base Specification.
PCI Express 1x16 configuration:
• Normal (1x16): PEG_RX[15:0]; PEG_TX[15:0]
• Reversal (1x16): PEG_RX[0:15]; PEG_TX[0:15]
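The lane assignments in Sections 2.2.3.1 and 2.2.3.2 can be written as two small mappings. The Python sketch below is one reading of the text: in reversed mode, logical lane i is assumed to use signal index 15 - i, and in bifurcated mode, lanes 15:8 of the x16 port become lanes 7:0 of the secondary port. The function names and the "primary"/"secondary" labels are illustrative.

```python
def x16_signal_index(logical_lane: int, reversed_numbering: bool = False) -> int:
    """Index n of PEG_RX[n]/PEG_TX[n] that carries the given logical lane."""
    if not 0 <= logical_lane <= 15:
        raise ValueError("lane must be in 0..15")
    return 15 - logical_lane if reversed_numbering else logical_lane


def bifurcated_lane(x16_lane: int) -> tuple:
    """Map a lane of the single x16 port to (port, lane) after bifurcation."""
    if not 0 <= x16_lane <= 15:
        raise ValueError("lane must be in 0..15")
    if x16_lane >= 8:
        return ("secondary", x16_lane - 8)   # lanes 15:8 -> secondary lanes 7:0
    return ("primary", x16_lane)


print(x16_signal_index(0, reversed_numbering=True))   # 15
print(bifurcated_lane(15))                            # ('secondary', 7)
print(bifurcated_lane(3))                             # ('primary', 3)
```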
2.3 DMI
DMI connects the processor and the PCH chip-to-chip. DMI2 is supported. The DMI is
similar to a four-lane PCI Express supporting up to 1 GB/s of bandwidth in each
direction.
Note: Only DMI x4 configuration is supported.
2.3.1 DMI Error Flow
DMI can only generate SERR in response to errors, never SCI, SMI, MSI, PCI INT, or
GPE. Any DMI related SERR activity is associated with Device 0.
2.3.2 Processor/PCH Compatibility Assumptions
The processor is compatible with the PCH and is not compatible with any previous
(G)MCH or ICH products.
2.3.3 DMI Link Down
The DMI link going down is a fatal, unrecoverable error. If the DMI data link goes
down after the link was up, the DMI link hangs the system by not allowing the link to
retrain, which prevents data corruption. This is controlled by the PCH.
Downstream transactions that had been successfully transmitted across the link prior
to the link going down may be processed as normal. No completions from downstream,
non-posted transactions are returned upstream over the DMI link after a link down
event.
2.4 Intel® HD Graphics Controller
This section details the 2D, 3D and video pipeline and their respective capabilities.
The integrated graphics is powered by a refresh of the fifth generation graphics core
and supports twelve fully programmable execution cores. Full-precision, floating-point
operations are supported to enhance the visual experience of compute-intensive
applications. The integrated graphics controller contains several types of components:
the graphics engines, planes, pipes, ports, and the Intel FDI. The integrated graphics has
a 3D/2D Instruction Processing unit to control the 3D and 2D engines respectively. The
integrated graphics controller’s 3D and 2D engines are fed with data through the IMC.
The outputs of the graphics engine are surfaces sent to memory, which are then
retrieved and processed by the planes. The surfaces are then blended in the pipes and
the display timings are transitioned from display core clock to the pixel (dot) clock.
Figure 2-7. Integrated Graphics Controller Unit Block Diagram
[Figure labels: Memory; VGA; Video Engine; 2D Engine; 3D Engine (Vertex Fetch/Vertex Shader,
Geometry Shader, Clipper, Strip & Fan/Setup, Windower/IZ); Plane A, Cursor A, Sprite A;
Plane B, Cursor B, Sprite B; Alpha Blend/Gamma/Panel Fitter; Pipe A, Pipe B; MUX;
Intel® FDI; eDP.]
2.4.1 3D and Video Engines for Graphics Processing
The 3D graphics pipeline architecture simultaneously operates on different primitives or
on different portions of the same primitive. All the cores are fully programmable,
increasing the versatility of the 3D Engine. The Gen 5.75 3D engine provides the
following performance and power-management enhancements:
• Execution units (EUs) increased to 12 from the previous 10 EUs in Gen 5.0.
• Includes Hierarchical-Z
• Includes video quality enhancements
2.4.1.1 3D Engine Execution Units
• Support 12 EUs. The EUs perform 128-bit wide execution per clock.
• Support SIMD8 instructions for vertex processing and SIMD16 instructions for pixel
processing.
2.4.1.2 3D Pipeline
2.4.1.2.1 Vertex Fetch (VF) Stage
The VF stage executes 3DPRIMITIVE commands. Some enhancements have been
included to better support legacy D3D APIs as well as SGI OpenGL*.
2.4.1.2.2 Vertex Shader (VS) Stage
The VS stage performs shading of vertices output by the VF function. The VS unit
produces an output vertex reference for every input vertex reference received from the
VF unit, in the order received.
2.4.1.2.3 Geometry Shader (GS) Stage
The GS stage receives inputs from the VS stage. It runs compiled, application-provided
GS programs, each specifying an algorithm to convert the vertices of an input object
into some output primitives. For example, a GS shader may convert lines of a line strip into
polygons representing a corresponding segment of a blade of grass centered on the
line. Or it could use adjacency information to detect silhouette edges of triangles and
output polygons extruding out from the edges.
2.4.1.2.4 Clip Stage
The Clip stage performs general processing on incoming 3D objects. However, it also
includes specialized logic to perform a Clip Test function on incoming objects. The Clip
Test optimizes generalized 3D Clipping. The Clip unit examines the position of incoming
vertices, and accepts/rejects 3D objects based on its Clip algorithm.
2.4.1.2.5 Strips and Fans (SF) Stage
The SF stage performs setup operations required to rasterize 3D objects. The outputs
from the SF stage to the Windower stage contain implementation-specific information
required for the rasterization of objects. The SF stage also supports clipping of
primitives to some extent.
2.4.1.2.6 Windower/IZ (WIZ) Stage
The WIZ unit performs an early depth test, which removes failing pixels and eliminates
unnecessary processing overhead.
The Windower uses the parameters provided by the SF unit in the object-specific
rasterization algorithms. The WIZ unit rasterizes objects into the corresponding set of
pixels. The Windower is also capable of performing dithering, which creates the illusion
of a higher resolution when low-bpp channels are used in color buffers. Color
dithering diffuses the sharp color bands seen on smooth-shaded objects.
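An early depth test of the kind attributed to the WIZ unit can be sketched in a few lines. The Python example below is a generic software illustration of the technique, not a description of the hardware. It assumes a "less than" depth comparison, where smaller z means closer to the viewer, and all names are hypothetical.

```python
def early_depth_test(fragments, depth_buffer):
    """Discard fragments hidden behind previously stored depth values.

    fragments: iterable of (x, y, z) tuples, processed in order.
    depth_buffer: dict mapping (x, y) to the nearest depth seen so far.
    Returns the fragments that pass and would go on to pixel processing.
    """
    survivors = []
    for x, y, z in fragments:
        stored = depth_buffer.get((x, y), float("inf"))   # empty pixel: infinitely far
        if z < stored:                                    # assumed "less than" test
            depth_buffer[(x, y)] = z
            survivors.append((x, y, z))
    return survivors


buffer = {}
incoming = [(0, 0, 0.5), (0, 0, 0.8), (1, 0, 0.3), (0, 0, 0.2)]
print(early_depth_test(incoming, buffer))
# [(0, 0, 0.5), (1, 0, 0.3), (0, 0, 0.2)]  -> the fragment at z=0.8 is rejected early
```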