1. Datasheet
This document describes the Altera® IP Compiler for PCI Express IP core. PCI Express
is a high-performance interconnect protocol for use in a variety of applications
including network adapters, storage area networks, embedded controllers, graphic
accelerator boards, and audio-video products. The PCI Express protocol is software
backwards-compatible with the earlier PCI and PCI-X protocols, but is significantly
different from its predecessors. It is a packet-based, serial, point-to-point interconnect
between two devices. The performance is scalable based on the number of lanes and
the generation that is implemented. Altera offers both endpoints and root ports that
are compliant with PCI Express Base Specification 1.0a or 1.1 for Gen1 and PCI Express
Base Specification 2.0 for Gen1 or Gen2. Both endpoints and root ports can be
implemented as a configurable hard IP block rather than programmable logic, saving
significant FPGA resources. The IP Compiler for PCI Express is available in ×1, ×2, ×4,
and ×8 configurations. Table 1–1 shows the aggregate bandwidth of a PCI Express
link for Gen1 and Gen2 IP Compilers for PCI Express for 1, 2, 4, and 8 lanes. The
protocol specifies 2.5 giga-transfers per second for Gen1 and 5 giga-transfers per
second for Gen2. Because the PCI Express protocol uses 8B/10B encoding, there is a
20% overhead, which is accounted for in the figures in Table 1–1. Table 1–1 provides
bandwidths for a single TX or RX channel; the numbers double for duplex operation.
Table 1–1. IP Compiler for PCI Express Throughput (Gbps)

                                     Link Width
                                  ×1    ×2    ×4    ×8
PCI Express Gen1 (1.x compliant)   2     4     8    16
PCI Express Gen2 (2.0 compliant)   4     8    16    32
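As a quick check of these figures, the per-direction throughput is the transfer rate
times the lane count times 8/10 (8B/10B coding carries 8 data bits per 10 line bits).
The short Tcl sketch below reproduces two table entries:

    # Per-direction PCI Express throughput in Gbps.
    # 8B/10B coding carries 8 data bits per 10 line bits (20% overhead).
    proc pcie_gbps {rate_gt lanes} {
        expr {$rate_gt * $lanes * 8.0 / 10.0}
    }
    puts [pcie_gbps 2.5 4]   ;# Gen1 x4 -> 8.0, matching Table 1-1
    puts [pcie_gbps 5.0 8]   ;# Gen2 x8 -> 32.0, matching Table 1-1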
Refer to the PCI Express High Performance Reference Design for bandwidth numbers
for the hard IP implementation in Stratix IV GX and Arria II GX devices.
Features
Altera’s IP Compiler for PCI Express offers extensive support across multiple device
families. It supports the following key features:
■ Hard IP implementation—PCI Express Base Specification 1.1 or 2.0. The PCI Express
protocol stack including the transaction, data link, and physical layers is hardened
in the device.
■ Soft IP implementation:
■ PCI Express Base Specification 1.0a or 1.1.
■ Many device families supported. Refer to Table 1–4.
■ The PCI Express protocol stack including the transaction, data link, and physical
layers is implemented using FPGA fabric logic elements.
■ Feature rich:
■ Support for ×1, ×2, ×4, and ×8 configurations. You can select the ×2 lane
configuration for Cyclone® IV GX devices without down-configuring a ×4
configuration.
■ Optional end-to-end cyclic redundancy code (ECRC) generation and checking
and advanced error reporting (AER) for high-reliability applications.
■ Extensive maximum payload size support:
Stratix IV GX hard IP—Up to 2 KBytes (128, 256, 512, 1,024, or 2,048 bytes).
Arria II GX, Arria II GZ, and Cyclone IV GX hard IP—Up to 256 bytes (128 or
256 bytes).
Soft IP implementations—Up to 2 KBytes (128, 256, 512, 1,024, or 2,048 bytes).
■ Easy to use:
■ Easy parameterization.
■ Substantial on-chip resource savings and guaranteed timing closure using the
IP Compiler for PCI Express hard IP implementation.
■ Easy adoption with no license requirement for the hard IP implementation.
■ Example designs to get started.
■ Qsys support.
Note: Stratix V support is provided by the separate Stratix V Hard IP for PCI Express
and is not available with the IP Compiler for PCI Express. The Stratix V Hard IP for
PCI Express is documented in the Stratix V Hard IP for PCI Express User Guide.
Different features are available for the soft and hard IP implementations and for the
three possible design flows. Table 1–2 outlines these different features.
Table 1–2. IP Compiler for PCI Express Features

Feature                                            Hard IP           Soft IP (1)
MegaCore License                                   Free              Required
Root port                                          Not supported     Not supported
Gen1                                               ×1, ×2, ×4, ×8    ×1, ×4
Gen2                                               ×1, ×4            No
Avalon Memory-Mapped (Avalon-MM) Interface         Supported         Supported
64-bit Avalon Streaming (Avalon-ST) Interface      Not supported     Not supported
Transaction layer packet type (TLP) (2)            Memory read request, memory write
                                                   request, and completion with or
                                                   without data (both implementations)
Maximum payload size                               128–256 bytes     128–256 bytes
Number of virtual channels                         1                 1
Reordering of out-of-order completions
(transparent to the application layer)             Supported         Supported
Requests that cross a 4 KByte address boundary
(transparent to the application layer)             Supported         Supported
Number of tags supported for non-posted requests   16                16
ECRC forwarding on RX and TX                       Not supported     Not supported
MSI-X                                              Not supported     Not supported

Notes to Table 1–2:
(1) Not recommended for new designs.
(2) Refer to Appendix A, Transaction Layer Packet (TLP) Header Formats for the layout of TLP headers.
Release Information

Table 1–3 provides information about this release of the IP Compiler for PCI Express.

Table 1–3. IP Compiler for PCI Express Release Information

Item            Description
Version         14.0
Release Date    June 2014
Ordering Codes  IP-PCIE/1, IP-PCIE/4, IP-PCIE/8, IP-AGX-PCIE/1, IP-AGX-PCIE/4.
                No ordering code is required for the hard IP implementation.
Product IDs     Hard IP implementation: FFFF
                Soft IP implementation: ×1–00A9, ×4–00AA, ×8–00AB
Vendor ID       Hard IP implementation: 6AF7
                Soft IP implementation: 6A66
Device Family Support

Altera verifies that the current version of the Quartus® II software compiles the
previous version of each IP core. Any exceptions to this verification are reported in the
MegaCore IP Library Release Notes and Errata. Altera does not verify compilation with
IP core versions older than one release. Table 1–4 shows the level of support offered by
the IP Compiler for PCI Express for each Altera device family.
Table 1–4. Device Family Support

Device Family          Support (1)
Arria II GX            Final
Arria II GZ            Final
Cyclone IV GX          Final
Stratix IV E, GX       Final
Stratix IV GT          Final
Other device families  No support

Note to Table 1–4:
(1) Refer to the What's New for IP in Quartus II page for device support level information.
In the Quartus II 11.0 release, support for Stratix V devices is offered with the Stratix V
Hard IP for PCI Express, and not with the IP Compiler for PCI Express. For more
information, refer to the Stratix V Hard IP for PCI Express User Guide.
General Description
The IP Compiler for PCI Express generates customized variations that you can use to
design PCI Express root ports or endpoints, including non-transparent bridges, or
unique designs that combine multiple IP Compiler for PCI Express variations in a
single Altera device. The IP Compiler for PCI Express implements all required and most
optional features of the PCI Express specification for the transaction, data link, and
physical layers.
The hard IP implementation includes all of the required and most of the optional
features of the specification for the transaction, data link, and physical layers.
Depending upon the device you choose, one to four instances of the IP Compiler for
PCI Express hard IP implementation are available. These instances can be configured to
include any combination of root port and endpoint designs to meet your system
requirements. A single device can also use instances of both the soft and hard
implementations of the IP Compiler for PCI Express. Figure 1–1 provides a high-level
block diagram of the hard IP implementation.
Figure 1–1. IP Compiler for PCI Express Hard IP Implementation High-Level Block Diagram (Notes 1, 2)

[Figure: the PCIe hard IP block (transaction, data link, and physical layers, with retry
and virtual channel buffers and the TL, PIPE, LMI, and PCIe reconfiguration
interfaces) connects the transceivers (PMA/PCS) to the application layer and the test,
debug, and configuration logic in the FPGA fabric.]

Notes to Figure 1–1:
(1) Stratix IV GX devices have two virtual channels.
(2) LMI stands for Local Management Interface.
This user guide includes a design example and testbench that you can configure as a
root port (RP) or endpoint (EP). You can use these design examples as a starting point
to create and test your own root port and endpoint designs.
The purpose of the IP Compiler for PCI Express User Guide is to explain how to use the
IP Compiler for PCI Express and not to explain the PCI Express protocol. Although
there is inevitable overlap between the two documents, this document should be used
in conjunction with an understanding of the following PCI Express specifications:
PHY Interface for the PCI Express Architecture PCI Express 3.0 and PCI Express Base
Specification 1.0a, 1.1, or 2.0.
Support for IP Compiler for PCI Express Hard IP
If you target an Arria II GX, Arria II GZ, Cyclone IV GX, or Stratix IV GX device, you
can parameterize the IP core to include a full hard IP implementation of the PCI
Express stack including the following layers:
■ Physical (PHY)
■ Physical Media Attachment (PMA)
■ Physical Coding Sublayer (PCS)
■ Media Access Control (MAC)
■ Data link
■ Transaction
Optimized for Altera devices, the hard IP implementation supports all memory, I/O,
configuration, and message transactions. The IP cores have a highly optimized
application interface to achieve maximum effective throughput. Because the compiler
is parameterizable, you can customize the IP cores to meet your design
requirements. Table 1–5 lists the configurations that are available for the IP Compiler
for PCI Express hard IP implementation.
Table 1–5. Hard IP Configurations for the IP Compiler for PCI Express in Quartus II Software Version 11.0

Device          Link Rate (Gbps)   ×1    ×2 (1)   ×4       ×8

Avalon Streaming (Avalon-ST) Interface
Arria II GX     2.5                yes   no       yes      yes (2)
                5.0                no    no       no       no
Arria II GZ     2.5                yes   no       yes      yes (2)
                5.0                yes   no       yes (2)  no
Cyclone IV GX   2.5                yes   yes      yes      no
                5.0                no    no       no       no
Stratix IV GX   2.5                yes   no       yes      yes
                5.0                yes   no       yes      yes

Avalon-MM Interface using Qsys Design Flow (3)
Arria II GX     2.5                yes   no       yes      no
Cyclone IV GX   2.5                yes   yes      yes      no
Stratix IV GX   2.5                yes   no       yes      yes
                5.0                yes   no       yes      no

Notes to Table 1–5:
(1) For devices that do not offer a ×2 initial configuration, you can use a ×4 configuration with the upper two lanes left unconnected at the device pins. The link will negotiate to ×2 if the attached device is ×2 native or capable of negotiating to ×2.
(2) The ×8 support uses a 128-bit bus at 125 MHz.
(3) The Qsys design flow supports the generation of endpoint variations only.
Table 1–6 lists the total RX buffer space, retry buffer size, and maximum payload
size for device families that include the hard IP implementation. You can find these
parameters on the Buffer Setup page of the parameter editor.

Table 1–6. IP Compiler for PCI Express Buffer and Payload Information
The IP Compiler for PCI Express supports ×1, ×2, ×4, and ×8 variations (Table 1–7 on
page 1–8) that are suitable for either root port or endpoint applications. You can use
the parameter editor to customize the IP core. The Qsys design flow does not support
root port variations. Figure 1–2 shows a relatively simple application that includes
two IP Compilers for PCI Express, one configured as a root port and the other as an
endpoint.
Figure 1–2. PCI Express Application with a Single Root Port and Endpoint
Figure 1–3 illustrates a heterogeneous topology, including an Altera device with two
PCIe hard IP root ports. One root port connects directly to a second FPGA that
includes an endpoint implemented using the hard IP IP core. The second root port
connects to a switch that multiplexes among three PCI Express endpoints.
Figure 1–3. PCI Express Application with Two Root Ports
If you target a device that includes an internal transceiver, you can parameterize the
IP Compiler for PCI Express to include a complete PHY layer, including the MAC,
PCS, and PMA layers. If you target other device architectures, the IP Compiler for PCI
Express generates the IP core with the Intel-designed PIPE interface, making the IP
core usable with other PIPE-compliant external PHY devices.
Table 1–7 lists the protocol support for devices that include HSSI transceivers.
Table 1–7. Operation in Devices with HSSI Transceivers (Note 1)

Device Family                              ×1       ×4       ×8
Stratix IV GX hard IP–Gen1                 Yes      Yes      Yes
Stratix IV GX hard IP–Gen2                 Yes (2)  Yes (2)  Yes (3)
Stratix IV soft IP–Gen1                    Yes      Yes      No
Cyclone IV GX hard IP–Gen1                 Yes      Yes      No
Arria II GX–Gen1 hard IP implementation    Yes      Yes      Yes
Arria II GX–Gen1 soft IP implementation    Yes      Yes      No
Arria II GZ–Gen1 hard IP implementation    Yes      Yes      Yes
Arria II GZ–Gen2 hard IP implementation    Yes      Yes      No

Notes to Table 1–7:
(1) Refer to Table 1–2 on page 1–2 for a list of features available in the different implementations and design flows.
(2) Not available in -4 speed grade. Requires -2 or -3 speed grade.
(3) Gen2 ×8 is only available in the -2 and I3 speed grades.
Note: The device names and part numbers for Altera FPGAs that include internal
transceivers always include the letters GX, GT, or GZ. If you select a device that does
not include an internal transceiver, you can use the PIPE interface to connect to an
external PHY. Table 3–9 on page 3–8 lists the available external PHY types.
You can customize the payload size, buffer sizes, and configuration space (base
address registers support and other registers). Additionally, the IP Compiler for PCI
Express supports end-to-end cyclic redundancy code (ECRC) and advanced error
reporting for ×1, ×2, ×4, and ×8 configurations.
External PHY Support
Altera IP Compiler for PCI Express variations support a wide range of PHYs,
including the TI XIO1100 PHY in 8-bit DDR/SDR mode or 16-bit SDR mode; NXP
PX1011A for 8-bit SDR mode, a serial PHY, and a range of custom PHYs using
8-bit/16-bit SDR with or without source synchronous transmit clock modes and 8-bit
DDR with or without source synchronous transmit clock modes. You can constrain TX
I/Os by turning on the Fast Output Enable Register option in the parameter editor,
or by editing this setting in the Quartus II Settings File (.qsf). This constraint ensures
the fastest tCO (clock-to-output) timing.
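For reference, a .qsf entry of the following form applies this kind of constraint; the
pin name here is purely illustrative, so substitute the TX outputs that drive your
external PHY:

    # Hypothetical pin name; use the TX data/clock outputs from your own design.
    set_instance_assignment -name FAST_OUTPUT_ENABLE_REGISTER ON -to phy_txdata[0]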
Debug Features
The IP Compiler for PCI Express also includes debug features that allow observation
and control of the IP cores for faster debugging of system-level problems.
For more information about debugging, refer to Chapter 17, Debugging.
IP Core Verification
To ensure compliance with the PCI Express specification, Altera performs extensive
validation of the IP Compiler for PCI Express. Validation includes both simulation
and hardware testing.
Simulation Environment
Altera’s verification simulation environment for the IP Compiler for PCI Express uses
multiple testbenches that consist of industry-standard BFMs driving the PCI Express
link interface. A custom BFM connects to the application-side interface.
Altera performs the following tests in the simulation environment:
■ Directed tests that test all types and sizes of transaction layer packets and all bits of
the configuration space
■ Error injection tests that inject errors in the link, transaction layer packets, and data
link layer packets, and check for the proper response from the IP cores
■ PCI-SIG® Compliance Checklist tests that specifically test the items in the checklist
■ Random tests that test a wide range of traffic patterns across one or more virtual
channels
Compatibility Testing Environment
Altera has performed significant hardware testing of the IP Compiler for PCI Express
to ensure a reliable solution. The IP cores have been tested at various PCI-SIG PCI
Express Compliance Workshops in 2005–2009 with Arria GX, Arria II GX,
Cyclone IV GX, Stratix II GX, and Stratix IV GX devices and various external PHYs.
They have passed all PCI-SIG gold tests and interoperability tests with a wide
selection of motherboards and test equipment. In addition, Altera internally tests
every release with motherboards and switch chips from a variety of manufacturers.
All PCI-SIG compliance tests are also run with each IP core release.
Performance and Resource Utilization
The hard IP implementation of the IP Compiler for PCI Express is available in
Arria II GX, Arria II GZ, Cyclone IV GX, and Stratix IV GX devices.
Table 1–8 shows the resource utilization for the hard IP implementation using either
the Avalon-ST or Avalon-MM interface with a maximum payload of 256 bytes and 32
tags for the Avalon-ST interface and 16 tags for the Avalon-MM interface.

Table 1–8. Performance and Resource Utilization in Arria II GX, Arria II GZ, Cyclone IV GX, and
Stratix IV GX Devices (Part 1 of 2)

Lane    Internal      Virtual    Combinational   Dedicated   Memory Blocks
Width   Clock (MHz)   Channels   ALUTs           Registers   (M9K)

Avalon-ST Interface
×1      125           1          100             100         0
×1      125           2          100             100         0
×4      125           1          200             200         0
×4      125           2          200             200         0
×8      250           1          200             200         0
×8      250           2          200             200         0
Table 1–8. Performance and Resource Utilization in Arria II GX, Arria II GZ, Cyclone IV GX, and
Stratix IV GX Devices (Part 2 of 2)

Lane    Internal      Virtual    Combinational   Dedicated   Memory Blocks
Width   Clock (MHz)   Channel    ALUTs           Registers   (M9K)

Avalon-MM Interface–Qsys Design Flow (1)
×1      125           1
×4      125           1          1600            1600        18
×8      250           1

Avalon-MM Interface–Qsys Design Flow - Completer Only
×1      125           1
×4      125           1          1000            1150        10

Avalon-MM Interface–Qsys Design Flow - Completer Only Single Dword
×1      125           1
×4      250           1          430             450         0

Note to Table 1–8:
(1) The transaction layer of the Avalon-MM implementation is implemented in programmable logic to improve latency.
Refer to Appendix C, Performance and Resource Utilization Soft IP Implementation,
for performance and resource utilization for the soft IP implementation.
Recommended Speed Grades
Table 1–9 shows the recommended speed grades for each device family for the
supported link widths and internal clock frequencies. For soft IP implementations of
the IP Compiler for PCI Express, the table lists speed grades that are likely to meet
timing; it may be possible to close timing in a slower speed grade. For the hard IP
implementation, the speed grades listed are the only speed grades that close timing.
When the internal clock frequency is 125 MHz or 250 MHz, Altera recommends
setting the Quartus II Analysis & Synthesis Settings Optimization Technique to Speed.
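In .qsf terms, this typically corresponds to the following global assignment (shown
here as a hedged example; confirm the exact assignment name in your Quartus II
version):

    # Optimize Analysis & Synthesis for speed rather than area or balance.
    set_global_assignment -name OPTIMIZATION_TECHNIQUE SPEED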
Refer to "Setting Up and Running Analysis and Synthesis" in Quartus II Help and to
Area and Timing Optimization in volume 2 of the Quartus II Handbook for more
information about how to apply this setting.
Table 1–9. Recommended Device Family Speed Grades

Device Family                              Link Width   Internal Clock    Recommended
                                                        Frequency (MHz)   Speed Grades

Avalon-ST Hard IP Implementation
Arria II GX Gen1 with ECC Support (1)      ×1           62.5 (2)          –4, –5, –6
                                           ×1           125               –4, –5, –6
                                           ×4           125               –4, –5, –6
                                           ×8           125               –4, –5, –6
Arria II GZ Gen1 with ECC Support          ×1           125               –3, –4
                                           ×4           125               –3, –4
                                           ×8           125               –3, –4
Arria II GZ Gen2 with ECC Support          ×1           125               –3
                                           ×4           125               –3
Cyclone IV GX Gen1 with ECC Support        ×1           62.5 (2)          all speed grades
                                           ×1, ×2, ×4   125               all speed grades
Stratix IV GX Gen1 with ECC Support (1)    ×1           62.5 (2)          –2, –3 (3)
                                           ×1           125               –2, –3, –4
                                           ×4           125               –2, –3, –4
                                           ×8           250               –2, –3, –4 (3)
Stratix IV GX Gen2 with ECC Support (1)    ×1           125               –2, –3 (3)
                                           ×4           250               –2, –3 (3)
Stratix IV GX Gen2 without ECC Support     ×8           500               –2, I3 (4)

Avalon-MM Interface–Qsys Flow
Arria II GX                                ×1, ×4       125               –6
Cyclone IV GX                              ×1, ×2, ×4   125               –6, –7
                                           ×1           62.5              –6, –7, –8
Stratix IV GX Gen1                         ×1, ×4       125               –2, –3, –4
                                           ×8           250               –2, –3
Stratix IV GX Gen2                         ×1           125               –2, –3
                                           ×4           250               –2, –3

Avalon-ST or Descriptor/Data Interface Soft IP Implementation
Arria II GX                                ×1, ×4       125               –4, –5 (5)
Cyclone IV GX                              ×1           125               –6, –7 (5)
Stratix IV E Gen1                          ×1           62.5              all speed grades
                                           ×1, ×4       125               all speed grades
Stratix IV GX Gen1                         ×1           62.5              all speed grades
                                           ×4           125               all speed grades

Notes to Table 1–9:
(1) The RX Buffer and Retry Buffer ECC options are only available in the hard IP implementation.
(2) This is a power-saving mode of operation.
(3) Final results pending characterization by Altera for speed grades -2, -3, and -4. Refer to the .fit.rpt file generated by the Quartus II software.
(4) Closing timing for the –3 speed grades in the provided endpoint example design requires seed sweeping.
(5) You must turn on the following Physical Synthesis settings in the Quartus II Fitter Settings to achieve timing closure for these speed grades and variations: Perform physical synthesis for combinational logic, Perform register duplication, and Perform register retiming. In addition, you can use the Quartus II Design Space Explorer or Quartus II seed sweeping methodology. Refer to the Netlist Optimizations and Physical Synthesis chapter in volume 2 of the Quartus II Handbook for more information about how to set these options.
(6) Altera recommends disabling the OpenCore Plus feature for the ×8 soft IP implementation because including this feature makes it more difficult to close timing.
2. Getting Started
This section provides step-by-step instructions to help you quickly set up and
simulate the IP Compiler for PCI Express testbench. The IP Compiler for PCI Express
provides numerous configuration options. The parameters chosen in this chapter are
the same as those chosen in the PCI Express High-Performance Reference Design
available on the Altera website.
Installing and Licensing IP Cores
The Altera IP Library provides many useful IP core functions for production use
without purchasing an additional license. You can evaluate any Altera IP core in
simulation and compilation in the Quartus II software using the OpenCore evaluation
feature.
Some Altera IP cores, such as MegaCore® functions, require that you purchase a
separate license for production use. You can use the OpenCore Plus feature to
evaluate IP that requires purchase of an additional license until you are satisfied with
the functionality and performance. After you purchase a license, visit the Self Service
Licensing Center to obtain a license number for any Altera product. For additional
information, refer to Altera Software Installation and Licensing.

Figure 2–1. IP Core Installation Path

acds
    quartus - Contains the Quartus II software
    ip - Contains the Altera IP Library and third-party IP cores
        altera - Contains the Altera IP Library source code
            <IP core name> - Contains the IP core source files

Note: The default installation directory on Windows is <drive>:\altera\<version number>;
on Linux it is <home directory>/altera/<version number>.
OpenCore Plus IP Evaluation
Altera's free OpenCore Plus feature allows you to evaluate licensed MegaCore IP
cores in simulation and hardware before purchase. You need only purchase a license
for MegaCore IP cores if you decide to take your design to production. OpenCore Plus
supports the following evaluations:
■ Simulate the behavior of a licensed IP core in your system.
■ Verify the functionality, size, and speed of the IP core quickly and easily.
■ Generate time-limited device programming files for designs that include IP cores.
■ Program a device with your IP core and verify your design in hardware.
OpenCore Plus evaluation supports the following two operation modes:
■ Untethered—run the design containing the licensed IP for a limited time.
■ Tethered—run the design containing the licensed IP for a longer time or
indefinitely. This requires a connection between your board and the host
computer.
All IP cores that use OpenCore Plus time out simultaneously when any IP core in the
design times out.
IP Catalog and Parameter Editor
The Quartus II IP Catalog (Tools > IP Catalog) and parameter editor help you easily
customize and integrate IP cores into your project. You can use the IP Catalog and
parameter editor to select, customize, and generate files representing your custom IP
variation.
Note: The IP Catalog (Tools > IP Catalog) and parameter editor replace the MegaWizard™
Plug-In Manager for IP selection and parameterization, beginning in Quartus II
software version 14.0. Use the IP Catalog and parameter editor to locate and
parameterize Altera IP cores.
The IP Catalog lists IP cores available for your design. Double-click any IP core to
launch the parameter editor and generate files representing your IP variation. The
parameter editor prompts you to specify an IP variation name, optional ports, and
output file generation options. The parameter editor generates a top level Qsys
system file (.qsys) or Quartus II IP file (.qip) representing the IP core in your project.
You can also parameterize an IP variation without an open project.
Use the following features to help you quickly locate and select an IP core:
■ Filter IP Catalog to Show IP for active device family or Show IP for all device
families.
■ Search to locate any full or partial IP core name in IP Catalog. Click Search for
Partner IP to access partner IP information on the Altera website.
■ Right-click an IP core name in IP Catalog to display details about supported
devices, installation location, and links to documentation.
Figure 2–2. Quartus II IP Catalog
Note: The IP Catalog is also available in Qsys (View > IP Catalog). The Qsys IP Catalog
includes exclusive system interconnect, video and image processing, and other
system-level IP that are not available in the Quartus II IP Catalog.
Using the Parameter Editor
The parameter editor helps you to configure your IP variation ports, parameters,
architecture features, and output file generation options:
■ Use preset settings in the parameter editor (where provided) to instantly apply
preset parameter values for specific applications.
■ View port and parameter descriptions and links to detailed documentation.
■ Generate testbench systems or example designs (where provided).

Figure 2–3. IP Parameter Editors
Modifying an IP Variation
You can easily modify the parameters of any Altera IP core variation in the parameter
editor to match your design requirements. Use any of the following methods to
modify an IP variation in the parameter editor.
Table 2–1. Modifying an IP Variation

Menu Command                          Action
File > Open                           Select the top-level HDL (.v or .vhd) IP variation file to
                                      launch the parameter editor and modify the IP variation.
                                      Regenerate the IP variation to implement your changes.
View > Utility Windows >              Double-click the IP variation to launch the parameter
Project Navigator > IP Components     editor and modify the IP variation. Regenerate the IP
                                      variation to implement your changes.
Project > Upgrade IP Components       Select the IP variation and click Upgrade in Editor to
                                      launch the parameter editor and modify the IP variation.
                                      Regenerate the IP variation to implement your changes.

Upgrading Outdated IP Cores

IP core variants generated with a previous version of the Quartus II software may
require upgrading before use in the current version of the Quartus II software. Click
Project > Upgrade IP Components to identify and upgrade IP core variants.

The Upgrade IP Components dialog box provides instructions when IP upgrade is
required, optional, or unsupported for specific IP cores in your design. You must
upgrade IP cores that require it before you can compile the IP variation in the current
version of the Quartus II software. Many Altera IP cores support automatic upgrade.
The upgrade process renames and preserves the existing variation file (.v, .sv, or .vhd)
as <my_ip>_BAK.v, .sv, or .vhd in the project directory.
Table 2–2. IP Core Upgrade Status

IP Core Status                  Corrective Action
Required Upgrade IP Components  You must upgrade the IP variation before compiling in
                                the current version of the Quartus II software.
Optional Upgrade IP Components  Upgrade is optional for this IP variation in the current
                                version of the Quartus II software. You can upgrade this
                                IP variation to take advantage of the latest development
                                of this IP core. Alternatively, you can retain previous IP
                                core characteristics by declining to upgrade.
Upgrade Unsupported             Upgrade of the IP variation is not supported in the
                                current version of the Quartus II software due to IP core
                                end of life or incompatibility with the current version of
                                the Quartus II software. You are prompted to replace the
                                obsolete IP core with a current equivalent IP core from
                                the IP Catalog.
Before you begin
■ Archive the Quartus II project containing outdated IP cores in the original version
of the Quartus II software: Click Project > Archive Project to save the project in
your previous version of the Quartus II software. This archive preserves your
original design source and project files.
■ Restore the archived project in the latest version of the Quartus II software: Click
Project > Restore Archived Project. Click OK if prompted to change to a
supported device or overwrite the project database. File paths in the archive must
be relative to the project directory. File paths in the archive must reference the IP
variation .v or .vhd file or .qsys file (not the .qip file).
1. In the latest version of the Quartus II software, open the Quartus II project
containing an outdated IP core variation. The Upgrade IP Components dialog
automatically displays the status of IP cores in your project, along with
instructions for upgrading each core. Click Project > Upgrade IP Components to
access this dialog box manually.
2. To simultaneously upgrade all IP cores that support automatic upgrade, click
Perform Automatic Upgrade. The Status and Version columns update when
upgrade is complete. Example designs provided with any Altera IP core
regenerate automatically whenever you upgrade the IP core.

Figure 2–4. Upgrading IP Cores
Upgrading IP Cores at the Command Line

You can upgrade IP cores that support auto upgrade at the command line. IP cores
that do not support automatic upgrade do not support command line upgrade.

■ To upgrade a single IP core that supports auto-upgrade, type the following
command:

quartus_sh --ip_upgrade --variation_files <my_ip_filepath/my_ip>.<hdl> <qii_project>

■ To simultaneously upgrade multiple IP cores that support auto-upgrade, type the
following command:

quartus_sh --ip_upgrade --variation_files "<my_ip_filepath/my_ip1>.<hdl>;<my_ip_filepath/my_ip2>.<hdl>" <qii_project>

IP cores older than Quartus II software version 12.0 do not support upgrade. Altera
verifies that the current version of the Quartus II software compiles the previous
version of each IP core. The MegaCore IP Library Release Notes reports any verification
exceptions for MegaCore IP. The Quartus II Software and Device Support Release Notes
reports any verification exceptions for other IP cores. Altera does not verify
compilation for IP cores older than the previous two releases.
Parameterizing the IP Compiler for PCI Express
This section guides you through the process of parameterizing the IP Compiler for
PCI Express as an endpoint, using the same options that are chosen in Chapter 15,
Testbench and Design Example. Complete the following steps to specify the
parameters:
1. In the IP Catalog (Tools > IP Catalog), locate and double-click the name of the IP
core to customize. The parameter editor appears.
2. Specify a top-level name for your custom IP variation. This name identifies the IP
core variation files in your project. For this walkthrough, specify top.v for the
name of the IP core file: <working_dir>\top.v.
3. Specify the following values in the parameter editor:
Table 2–3. System Settings Parameters

Parameter                    Value
PCIe Core Type               PCI Express hard IP
PHY type                     Stratix IV GX
PHY interface                serial
Configure transceiver block  Use default settings
Lanes                        ×8
Xcvr ref_clk                 100 MHz
Application interface        Avalon-ST 128-bit
Port type                    Native Endpoint
PCI Express version          2.0
Application clock            250 MHz
Max rate                     Gen 2 (5.0 Gbps)
Test out width               64 bits
HIP reconfig                 Disable
4. To enable all of the tests in the provided testbench and chaining DMA example
design, make the base address register (BAR) assignments. BAR2 or BAR3 is
required. Table 2–4 provides the BAR assignments in tabular format.

Table 2–4. PCI Registers

PCI Base Registers (Type 0 Configuration Space)
Register              Value
Device ID             0xE001
Subsystem ID          0x2801
Revision ID           0x01
Vendor ID             0x1172
Subsystem vendor ID   0x5BDE
Class code            0xFF0000
5. Specify the following settings for the Capabilities parameters.

Table 2–5. Capabilities Parameters

Parameter                                 Value
Device Capabilities
    Tags supported                        32
    Implement completion timeout disable  On
    Completion timeout range              ABCD
Error Reporting
    Implement advanced error reporting    Off
    Implement ECRC check                  Off
    Implement ECRC generation             Off
    Implement ECRC forwarding             Off
MSI Capabilities
    MSI messages requested                4
    MSI message 64-bit address capable    On
Link Capabilities
    Link common clock                     On
    Data link layer active reporting      Off
    Surprise down reporting               Off
    Link port number                      0x01
Slot Capabilities
    Enable slot capability                Off
    Slot capability register              0x0000000
MSI-X Capabilities
    Implement MSI-X                       Off
    Table size                            0x000
    Offset                                0x00000000
    BAR indicator (BIR)                   0
    Pending Bit Array (PBA) Offset        0x00000000
    PBA BAR Indicator                     0
6. Click the Buffer Setup tab to specify settings on the Buffer Setup page.
Table 2–6. Buffer Setup Parameters

Parameter                                      Value
Maximum payload size                           512 bytes
Number of virtual channels                     1
Number of low-priority VCs                     None
Auto configure retry buffer size               On
Retry buffer size                              16 KBytes
Maximum retry packets                          64
Desired performance for received requests      Maximum
Desired performance for received completions   Maximum

Note: For the PCI Express hard IP implementation, the RX Buffer Space Allocation is fixed
at Maximum performance. This setting determines the values for a read-only table
that lists the number of posted header credits, posted data credits, non-posted header
credits, completion header credits, completion data credits, total header credits, and
total RX buffer space.
7. Specify the following power management settings.
Table 2–7. Power Management Parameters

Parameter                                   Value
L0s Active State Power Management (ASPM)
    Idle threshold for L0s entry            8,192 ns
    Endpoint L0s acceptable latency         < 64 ns
Number of fast training sequences (N_FTS)
    Common clock                            Gen2: 255
    Separate clock                          Gen2: 255
    Electrical idle exit (EIE) before FTS   4
L1s Active State Power Management (ASPM)
    Enable L1 ASPM                          Off
    Endpoint L1 acceptable latency          < 1 µs
    L1 Exit Latency Common clock            > 64 µs
    L1 Exit Latency Separate clock          > 64 µs
8. On the EDA tab, turn on Generate simulation model to generate an IP functional
simulation model for the IP core. An IP functional simulation model is a
cycle-accurate VHDL or Verilog HDL model produced by the Quartus II software.
Caution: Use the simulation models only for simulation and not for synthesis or any
other purposes. Using these models for synthesis creates a nonfunctional design.
9. On the Summary tab, select the files you want to generate. A gray checkmark
indicates a file that is automatically generated. All other files are optional.
10. Click Finish to generate the IP core, testbench, and supporting files.
Note: A report file, <variation name>.html, in your project directory lists each file
generated and provides a description of its contents.
Viewing the Generated Files
Figure 2–5 illustrates the directory structure created for this design after you generate
the IP Compiler for PCI Express. The directories include the following files:
■ The IP Compiler for PCI Express design files, stored in <working_dir>.
■ The chaining DMA design example file, stored in the
<working_dir>\top_examples\chaining_dma directory. This design example tests
your generated IP Compiler for PCI Express variation. For detailed information
about this design example, refer to Chapter 15, Testbench and Design Example.
■ The simulation files for the chaining DMA design example, stored in the
<working_dir>\top_examples\chaining_dma\testbench directory. The Quartus II
software generates the testbench files if you turn on Generate simulation model
on the EDA tab while generating the IP Compiler for PCI Express.
Figure 2–5. Directory Structure for IP Compiler for PCI Express and Testbench

<working_dir>
    <variation>.v = top.v, the parameterized PCI Express IP core
    <variation>.sdc = top.sdc, the timing constraints file
    <variation>.tcl = top.tcl, general Quartus II settings
    <variation>_plus.v = top_plus.v, the parameterized PCI Express IP core
        including reset and calibration circuitry (2)
    ip_compiler_for_pci_express-library - contains a local copy of the PCI Express
        library files needed for simulation, or compilation, or both
    <variation>_examples = top_examples - includes testbench and incremental
        compile directories
        common
        chaining_dma - files to implement the chaining DMA (1)
            top_example_chaining_top.qpf, the Quartus II project file
            top_example_chaining_top.qsf, the Quartus II settings file
            testbench - scripts to run the testbench
                runtb.do, script to run the testbench
                <variation>_chaining_testbench = top_chaining_testbench.v
                altpcietb_bfm_driver_chaining.v, provides test stimulus

Notes to Figure 2–5:
(1) The chaining_dma directory contains the Quartus II project and settings files.
(2) <variation>_plus.v is only available for the hard IP implementation.
Figure 2–6 illustrates the top-level modules of this design. As this figure illustrates,
the IP Compiler for PCI Express connects to a basic root port bus functional model
(BFM) and an application layer high-performance DMA engine. These two modules,
when combined with the IP Compiler for PCI Express, comprise the complete
example design. The test stimulus is contained in altpcietb_bfm_driver_chaining.v.
The script to run the tests is runtb.do. For a detailed explanation of this example
design, refer to Chapter 15, Testbench and Design Example.
Figure 2–6. Testbench for the Chaining DMA Design Example
The design files used in this design example are the same files that are used for the
PCI Express High-Performance Reference Design. You can download the required
files on the PCI Express High-Performance Reference Design product page. This
product page includes design files for various devices. The example in this document
uses the Stratix IV GX files. You can generate, simulate, and compile the design
example with the files and capabilities provided in your Quartus II software and IP
installation. However, to configure the example on a device, you must also download
altpcie_demo.zip, which includes a software driver that the example design uses,
from the PCI Express High-Performance Reference Design.
August 2014 Altera CorporationIP Compiler for PCI Express User Guide
2–12Chapter 2: Getting Started
Simulating the Design
The Stratix IV .zip file includes files for Gen1 and Gen2 ×1, ×4, and ×8 variants. The
example in this document demonstrates the Gen2 ×8 variant. After you download
and unzip this .zip file, you can copy the files for this variant to your project directory,
<working_dir>. The files for the example in this document are included in the
hip_s4gx_gen2x8_128 directory. The Quartus II project file, top.qsf, is contained in <working_dir>. You can use this project file as a reference for the .qsf file for your own
design.
Simulating the Design
As Figure 2–5 illustrates, the scripts to run the simulation files are located in the
<working_dir>\top_examples\chaining_dma\testbench directory. Follow these
steps to run the chaining DMA testbench.
1. Start your simulation tool. This example uses the ModelSim® software.

Note: The endpoint chaining DMA design example DMA controller requires the
use of BAR2 or BAR3.

2. In the testbench directory,
<working_dir>\top_examples\chaining_dma\testbench, type the following
command:

do runtb.do

This script compiles the testbench for simulation and runs the chaining DMA
tests.
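If you prefer to launch the simulation from a shell rather than from the simulator
GUI, an invocation along the following lines should work (a hedged example; it
assumes ModelSim's vsim is on your PATH and that you run it from the testbench
directory):

    vsim -c -do runtb.do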
Example 2–1 shows the partial transcript from a successful simulation. As this
transcript illustrates, the simulation includes the following stages:
■ Link training
■ Configuration
■ DMA reads and writes
■ Root port to endpoint memory reads and writes
Example 2–1. Excerpts from Transcript of Successful Simulation Run
Time: 56000 Instance: top_chaining_testbench.ep.epmap.pll_250mhz_to_500mhz.
altpll_component.pll0
# INFO: 464 ns Completed initial configuration of Root Port.
# INFO: Core Clk Frequency: 251.00 Mhz
# INFO: 3608 ns EP LTSSM State: DETECT.ACTIVE
# INFO: 3644 ns EP LTSSM State: POLLING.ACTIVE
# INFO: 3660 ns RP LTSSM State: DETECT.ACTIVE
# INFO: 3692 ns RP LTSSM State: POLLING.ACTIVE
# INFO: 6012 ns RP LTSSM State: POLLING.CONFIG
# INFO: 6108 ns EP LTSSM State: POLLING.CONFIG
# INFO: 7388 ns EP LTSSM State: CONFIG.LINKWIDTH.START
# INFO: 7420 ns RP LTSSM State: CONFIG.LINKWIDTH.START
# INFO: 7900 ns EP LTSSM State: CONFIG.LINKWIDTH.ACCEPT
# INFO: 8316 ns RP LTSSM State: CONFIG.LINKWIDTH.ACCEPT
# INFO: 8508 ns RP LTSSM State: CONFIG.LANENUM.WAIT
# INFO: 9004 ns EP LTSSM State: CONFIG.LANENUM.WAIT
# INFO: 9196 ns EP LTSSM State: CONFIG.LANENUM.ACCEPT
# INFO: 9356 ns RP LTSSM State: CONFIG.LANENUM.ACCEPT
# INFO: 9548 ns RP LTSSM State: CONFIG.COMPLETE
# INFO: 9964 ns EP LTSSM State: CONFIG.COMPLETE
# INFO: 11052 ns EP LTSSM State: CONFIG.IDLE
# INFO: 11276 ns RP LTSSM State: CONFIG.IDLE
# INFO: 11356 ns RP LTSSM State: L0
# INFO: 11580 ns EP LTSSM State: L0
## INFO: 12536 ns
# INFO: 15896 ns EP PCI Express Link Status Register (1081):
# INFO: 15896 ns Negotiated Link Width: x8
# INFO: 15896 ns Slot Clock Config: System Reference Clock Used
# INFO: 16504 ns RP LTSSM State: RECOVERY.RCVRLOCK
# INFO: 16840 ns EP LTSSM State: RECOVERY.RCVRLOCK
# INFO: 17496 ns EP LTSSM State: RECOVERY.RCVRCFG
# INFO: 18328 ns RP LTSSM State: RECOVERY.RCVRCFG
# INFO: 20440 ns RP LTSSM State: RECOVERY.SPEED
# INFO: 20712 ns EP LTSSM State: RECOVERY.SPEED
# INFO: 21600 ns EP LTSSM State: RECOVERY.RCVRLOCK
# INFO: 21614 ns RP LTSSM State: RECOVERY.RCVRLOCK
# INFO: 22006 ns RP LTSSM State: RECOVERY.RCVRCFG
# INFO: 22052 ns EP LTSSM State: RECOVERY.RCVRCFG
# INFO: 22724 ns EP LTSSM State: RECOVERY.IDLE
# INFO: 22742 ns RP LTSSM State: RECOVERY.IDLE
# INFO: 22846 ns RP LTSSM State: L0
# INFO: 22900 ns EP LTSSM State: L0
# INFO: 23152 ns Current Link Speed: 5.0GT/s
# INFO: 27936 ns ---------
# INFO: 27936 ns TASK:dma_set_header READ
# INFO: 27936 ns Writing Descriptor header
# INFO: 27976 ns data content of the DT header
# INFO: 27976 ns
# INFO: 27976 ns Shared Memory Data Display:
# INFO: 27976 ns Address Data
# INFO: 27976 ns ------- ----
# INFO: 27976 ns 00000900 00000003 00000000 00000900 CAFEFADE
# INFO: 27976 ns ---------
# INFO: 27976 ns TASK:dma_set_rclast
# INFO: 27976 ns Start READ DMA : RC issues MWr (RCLast=0002)
# INFO: 27992 ns ---------
# INFO: 33130 ns TASK:rcmem_poll Polling RC Address0000080C current data (0000FADE)
expected data (00000002)
# INFO: 34130 ns TASK:rcmem_poll Polling RC Address0000080C current data (00000000)
expected data (00000002)
# INFO: 35910 ns TASK:msi_poll Received DMA Write MSI(0000) : B0FD
# INFO: 35930 ns TASK:rcmem_poll Polling RC Address0000080C current data (00000002)
expected data (00000002)
# INFO: 35930 ns TASK:rcmem_poll ---> Received Expected Data (00000002)
# INFO: 35938 ns ---------
# INFO: 35938 ns Completed DMA Write
# INFO: 35938 ns ---------
# INFO: 35938 ns TASK:check_dma_data
# INFO: 35938 ns Passed : 0644 identical dwords.
# INFO: 35938 ns ---------
# INFO: 35938 ns TASK:downstream_loop
# INFO: 36386 ns Passed: 0004 same bytes in BFM mem addr 0x00000040
and 0x00000840
# INFO: 36826 ns Passed: 0008 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 37266 ns Passed: 0012 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 37714 ns Passed: 0016 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 38162 ns Passed: 0020 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 38618 ns Passed: 0024 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 39074 ns Passed: 0028 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 39538 ns Passed: 0032 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 40010 ns Passed: 0036 same bytes in BFM mem addr 0x00000040 and 0x00000840
# INFO: 40482 ns Passed: 0040 same bytes in BFM mem addr 0x00000040 and 0x00000840
# SUCCESS: Simulation stopped due to successful completion!
Constraining the Design
The Quartus project directory for the chaining DMA design example is in
<working_dir>\top_examples\chaining_dma\. Before compiling the design using
the Quartus II software, you must apply appropriate design constraints, such as
timing constraints. The Quartus II software automatically generates the constraint
files when you generate the IP Compiler for PCI Express.
General constraints: <working_dir>/<variation>.tcl (top.tcl). This file includes various
Quartus II constraints. In particular, it includes virtual pin assignments. Virtual pin
assignments allow you to avoid making specific pin assignments for top-level signals
while you are simulating and not yet ready to map the design to hardware.

Timing constraints: <working_dir>/<variation>.sdc (top.sdc). This file is the Synopsys
Design Constraints (.sdc) file, which includes timing constraints.
If you want to perform an initial compilation to check any potential issues without
creating pin assignments for a specific board, you can do so after running the
following two steps that constrain the chaining DMA design example:
1. To apply Quartus II constraint files, type the following command at the Tcl
console command prompt:

source ../../top.tcl

Note: To display the Quartus II Tcl Console, on the View menu, point to Utility
Windows and click Tcl Console.
2. To add the Synopsys timing constraints to your design, follow these steps:
a. On the Assignments menu, click Settings.
b. Click TimeQuest Timing Analyzer.
c. Under SDC files to include in the project, click the Browse button. Browse to
your <working_dir> to add top.sdc.
d. Click Add.
e. Click OK.
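Alternatively, the same effect can be achieved with a single line in the project's .qsf
file (this is the standard Quartus II assignment for adding TimeQuest SDC files; the
file name here matches this walkthrough):

    # Add the generated timing constraints to the project.
    set_global_assignment -name SDC_FILE top.sdc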
Example 2–2 illustrates the Synopsys timing constraints.
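The listing itself is not reproduced here; as a rough, hypothetical sketch of the kind of
constraints the generated top.sdc contains (the clock name, port name, and period
below are illustrative assumptions, not the actual generated contents):

    # 100 MHz PCIe reference clock on a hypothetical input port named refclk.
    create_clock -name refclk -period 10.000 [get_ports refclk]
    # Derive generated clocks and clock uncertainty for the PLL outputs.
    derive_pll_clocks
    derive_clock_uncertainty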
If you want to download the design to a board, you must specify the device and pin
assignments for the chaining DMA example design. To make device and pin
assignments, follow these steps:
1. To select the device, on the Assignments menu, click Device.
2. In the Family list, select Stratix IV (GT/GX/E).
3. Scroll through the Available devices to select EP4SGX230KF40C2.
4. To add pin assignments for the EP4SGX230KF40C2 device, copy all the text
included in Example 2–3 from the chaining DMA design example .qsf file,
<working_dir>\top_examples\chaining_dma\top_example_chaining_top.qsf, to
your project .qsf file.
Note: The pin assignments provided in the .qsf are valid for the Stratix IV GX
FPGA Development Board and the EP4SGX230KF40C2 device. If you are
using different hardware you must determine the correct pin assignments.
Example 2–3. Pin Assignments for the Stratix IV GX (EP4SGX230KF40C2) FPGA Development Board

■ This constraint aligns the PIPE clocks (core_clk_out) from each quad to reduce
clock skew in ×8 variants.

■ Constraints for designs running at frequencies higher than 250 MHz:

set_global_assignment -name PHYSICAL_SYNTHESIS_ASYNCHRONOUS_SIGNAL_PIPELINING ON

This constraint improves performance for designs in which asynchronous signals
in very fast clock domains cannot be distributed across the FPGA fast enough due
to long global network delays. This optimization performs automatic pipelining of
these signals, while attempting to minimize the total number of registers inserted.

Compiling the Design

To test your IP Compiler for PCI Express in hardware, your initial Quartus II
compilation includes all of the directories shown in Figure 2–5. After you have fully
tested your customized design, you can exclude the testbench directory from the
Quartus II compilation.

On the Processing menu, click Start Compilation to compile your design.
2–20Chapter 2: Getting Started
Reusing the Example Design
Reusing the Example Design
To use this example design as the basis of your own design, replace the endpoint
application layer example shown in Figure 2–6 with your own application layer
design. Then, modify the BFM driver to generate the transactions needed to test your
application layer.
3. Parameter Settings
You customize the IP Compiler for PCI Express by specifying parameters in the IP
Compiler for PCI Express parameter editor, which you access from the IP Catalog.
Some IP Compiler for PCI Express variations are supported in only one or two of the
design flows. Soft IP implementations are supported only in the Quartus II IP Catalog.
For more information about the hard IP implementation variations available in the
different design flows, refer to Table 1–5 on page 1–6.
This chapter describes the parameters and how they affect the behavior of the IP core.
The IP Compiler for PCI Express parameter editor that appears in the Qsys flow is
different from the IP Compiler for PCI Express parameter editor that appears in the
other two design flows. Because the Qsys design flow supports only a subset of the
variations supported in the other two flows, and generates only hard IP
implementations with specific characteristics, the Qsys flow parameter editor
supports only a subset of the parameters described in this chapter.
Parameters in the Qsys Design Flow
The following sections describe the IP Compiler for PCI Express parameters available
in the Qsys design flow. Separate sections describe the parameters available in
different sections of the IP Compiler for PCI Express parameter editor.
The available parameters reflect the fact that the Qsys design flow supports only the
following functionality:
■ Hard IP implementation
■ Native endpoint, with no support for:
    ■ I/O space BAR
    ■ 32-bit prefetchable memory
■ 16 tags
■ 1 Message Signaled Interrupt (MSI)
■ 1 virtual channel
■ Up to 256 bytes maximum payload
System Settings
The first parameter section of the IP Compiler for PCI Express parameter editor in the
Qsys flow contains the parameters for the overall system settings. Table 3–1 describes
these settings.
Table 3–1. Qsys Flow System Settings Parameters

Parameter: Gen2 Lane Rate Mode
Value: Off/On
Description: Specifies the maximum data rate at which the link can operate. Turning
on Gen2 Lane Rate Mode sets the Gen2 rate, and turning it off sets the Gen1 rate.
Refer to Table 1–5 on page 1–6 for a complete list of Gen1 and Gen2 support.

Parameter: Number of Lanes
Value: ×1, ×2, ×4, ×8
Description: Specifies the maximum number of lanes supported. Refer to Table 1–5
on page 1–6 for a complete list of device support for numbers of lanes.

Parameter: Reference clock frequency
Value: 100 MHz, 125 MHz
Description: You can select either a 100 MHz or 125 MHz reference clock for Gen1
operation; Gen2 requires a 100 MHz clock.

Parameter: Use 62.5 MHz application clock
Value: Off/On
Description: Specifies whether the application interface clock operates at the slower
62.5 MHz frequency to support power saving. This parameter can only be turned on
for some Gen1 ×1 variations. Refer to Table 4–1 on page 4–4 for a list of the supported
application interface clock frequencies in different device families.

Parameter: Test out width
Value: None, 9 bits, or 64 bits
Description: Indicates the width of the test_out signal. Most of these signals are
reserved. Refer to Table 5–33 on page 5–59 for more information. Altera recommends
that you configure the 64-bit width.
PCI Base Address Registers
The ×1 and ×4 IP cores support memory space BARs ranging in size from 128 bytes to
the maximum allowed by a 32-bit or 64-bit BAR. The ×8 IP cores support memory
space BARs from 4 KBytes to the maximum allowed by a 32-bit or 64-bit BAR.
The available BARs reflect the fact that the Qsys design flow supports only native
endpoints, with no support for I/O space BARs or 32-bit prefetchable memory.
The Avalon-MM address is the translated base address corresponding to a BAR hit of
a received request from the PCI Express link.
In the Qsys design flow, the PCI Base Address Registers (Type 0 Configuration
Space) Bar Size and Avalon Base Address information populates from Qsys. You
cannot enter this information in the IP Compiler for PCI Express parameter editor.
After you set the base addresses in Qsys, either automatically or by entering them
manually, the values appear when you reopen the parameter editor.
Altera recommends using the Qsys option—on the System menu, click Assign Base Addresses—to set the base addresses automatically. If you decide to enter the address
translation entries manually, then you must avoid conflicts in address assignment
when adding other components, making interconnections, and assigning base
addresses.
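To make the BAR-hit translation concrete, the following minimal C sketch shows how
a received PCI Express address maps to an Avalon-MM address. The BAR size, BAR
base, and Avalon-MM base values are hypothetical placeholders, not values from this
guide.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical example values: a 4 KByte memory BAR and the Avalon-MM
 * base address assigned to it in Qsys. */
#define BAR_SIZE     0x1000u       /* 4 KBytes, power of two */
#define BAR_BASE     0xF8000000u   /* base programmed into the BAR */
#define AVALON_BASE  0x00100000u   /* Avalon-MM base assigned in Qsys */

/* Return true on a BAR hit and compute the translated Avalon-MM address:
 * the BAR base is replaced by the Avalon-MM base, and the offset within
 * the BAR is carried through unchanged. */
static bool translate_bar_hit(uint32_t pcie_addr, uint32_t *avalon_addr)
{
    if ((pcie_addr & ~(BAR_SIZE - 1)) != BAR_BASE)
        return false;                       /* address misses this BAR */
    *avalon_addr = AVALON_BASE | (pcie_addr & (BAR_SIZE - 1));
    return true;
}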
Table 3–2 describes the PCI register parameters. You can configure a BAR with a value
other than Not used only if the preceding BARs are configured. When an
even-numbered BAR is set to 64 bit Prefetchable, the following BAR is labelled
Occupied and forced to the value Not used.
Table 3–2. PCI Registers (Note 1), (2)
PCI Base Address Registers (0x10, 0x14, 0x18, 0x1C, 0x20, 0x24)
■ BAR Table (BAR0), BAR Type (64 bit Prefetchable, 32 bit Non-Prefetchable, or Not
used): BAR0 size and type mapping (memory space). BAR0 and BAR1 can be
combined to form a 64-bit prefetchable BAR. BAR0 and BAR1 can be configured
separately as 32-bit non-prefetchable memories. (2)
■ BAR Table (BAR1), BAR Type (32 bit Non-Prefetchable or Not used): BAR1 size and
type mapping (memory space). BAR0 and BAR1 can be combined to form a 64-bit
prefetchable BAR. BAR0 and BAR1 can be configured separately as 32-bit
non-prefetchable memories.
■ BAR Table (BAR2), BAR Type (64 bit Prefetchable, 32 bit Non-Prefetchable, or Not
used): BAR2 size and type mapping (memory space). BAR2 and BAR3 can be
combined to form a 64-bit prefetchable BAR. BAR2 and BAR3 can be configured
separately as 32-bit non-prefetchable memories. (2)
■ BAR Table (BAR3), BAR Type (32 bit Non-Prefetchable or Not used): BAR3 size and
type mapping (memory space). BAR2 and BAR3 can be combined to form a 64-bit
prefetchable BAR. BAR2 and BAR3 can be configured separately as 32-bit
non-prefetchable memories.
■ BAR Table (BAR4), BAR Type (64 bit Prefetchable, 32 bit Non-Prefetchable, or Not
used): BAR4 size and type mapping (memory space). BAR4 and BAR5 can be
combined to form a 64-bit BAR. BAR4 and BAR5 can be configured separately as
32-bit non-prefetchable memories. (2)
■ BAR Table (BAR5), BAR Type (32 bit Non-Prefetchable or Not used): BAR5 size and
type mapping (memory space). BAR4 and BAR5 can be combined to form a 64-bit
BAR. BAR4 and BAR5 can be configured separately as 32-bit non-prefetchable
memories.
Notes to Table 3–2:
(1) A prefetchable 64-bit BAR is supported. A non-prefetchable 64-bit BAR is not supported because in a typical
system, the root port Type 1 configuration register sets the maximum non-prefetchable memory window to 32 bits.
(2) The Qsys design flow does not support I/O space for BAR type mapping. I/O space is only supported for legacy
endpoint port types.
Device Identification Registers
The device identification registers are part of the PCI Type 0 configuration space
header. You can set these register values only at device configuration. Table 3–3
describes the PCI read-only device identification registers.
Table 3–3. PCI Registers
■ Vendor ID (0x000), default value 0x1172: Sets the read-only value of the vendor ID
register. This parameter cannot be set to 0xFFFF, per the PCI Express Specification.
■ Device ID (0x000), default value 0x0004: Sets the read-only value of the device ID
register.
■ Revision ID (0x008), default value 0x01: Sets the read-only value of the revision ID
register.
■ Class code (0x008), default value 0xFF0000: Sets the read-only value of the class
code register.
■ Subsystem ID (0x02C), default value 0x0004: Sets the read-only value of the
subsystem device ID register.
■ Subsystem vendor ID (0x02C), default value 0x1172: Sets the read-only value of the
subsystem vendor ID register. This parameter cannot be set to 0xFFFF, per the PCI
Express Base Specification 1.1 or 2.0.
Link Capabilities
Table 3–4 describes the capabilities parameter available in the Link Capabilities
section of the IP Compiler for PCI Express parameter editor in the Qsys design flow.
Table 3–4. Link Capabilities Parameter
■ Link port number, value 1: Sets the read-only value of the port number field in the
link capabilities register (offset 0x08C in the PCI Express capability structure, or PCI
Express Capability List register).
Error Reporting
The parameters in the Error Reporting section control settings in the PCI Express
advanced error reporting extended capability structure, at byte offsets 0x800 through
0x834. Table 3–5 describes the error reporting parameters available in the Qsys design
flow.
Table 3–5. Error Reporting Parameters
■ Implement advanced error reporting (On/Off): Implements the advanced error
reporting (AER) capability.
■ Implement ECRC check (On/Off): Enables ECRC checking capability. Sets the
read-only value of the ECRC check capable bit in the advanced error capabilities and
control register. This parameter requires you to implement the advanced error
reporting capability.
■ Implement ECRC generation (On/Off): Enables ECRC generation capability. Sets the
read-only value of the ECRC generation capable bit in the advanced error capabilities
and control register. This parameter requires you to implement the advanced error
reporting capability.
Buffer Configuration
The Buffer Configuration section of the IP Compiler for PCI Express parameter editor
in the Qsys design flow includes parameters for the receive and retry buffers. The IP
Compiler for PCI Express parameter editor also displays the read-only RX buffer
space allocation information. Table 3–6 describes the parameters and information in
this section of the parameter editor in the Qsys design flow.
Table 3–6. Buffer Configuration Parameters
■ Maximum payload size (0x084), value 128 bytes or 256 bytes: Specifies the
maximum payload size supported. This parameter sets the read-only value of the
max payload size supported field of the device capabilities register (0x084[2:0]) and
optimizes the IP core for this size payload. Maximum payload size is 128 bytes or
256 bytes, depending on the device.
■ RX buffer credit allocation – performance for received requests, value Maximum,
High, Medium, or Low: Specifies how much RX buffer space and how many flow
control credits are allocated to received requests.
■Low—Provides the minimal amount of space for desired traffic. Select this option
when the throughput of the received requests is not critical to the system design.
This setting minimizes the device resource utilization.
■Because the Arria II GX and Stratix IV hard IP implementations have a fixed RX
buffer size, the only available value for these devices is Maximum.
Note that the read-only values for header and data credits update as you change this
setting. For more information, refer to Chapter 11, Flow Control.
■ Posted header credit, Posted data credit, Non-posted header credit, Completion
header credit, and Completion data credit (read-only entries): These values show the
credits and space allocated for each flow-controllable type, based on the RX buffer
size setting. All virtual channels use the same RX buffer space allocation. The entries
show header and data credits for RX posted (memory writes) and completion
requests, and header credits for non-posted requests (memory reads). The table does
not show non-posted data credits because the IP core always advertises infinite
non-posted data credits and automatically has room for the maximum number of
dwords of data that can be associated with each non-posted header. The numbers
shown for completion headers and completion data indicate how much space is
reserved in the RX buffer for completions. However, infinite completion credits are
advertised on the PCI Express link, as is required for endpoints. The application
layer must manage the rate of non-posted requests to ensure that the RX buffer
completion space does not overflow. The hard IP RX buffer is fixed at 16 KBytes for
Stratix IV GX devices and 4 KBytes for Arria II GX devices.
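Because completion credits are advertised as infinite while the RX completion space
is finite, the application layer must throttle its own reads. The following C sketch
illustrates one way to do that bookkeeping; the credit totals and the read completion
boundary are hypothetical placeholders, not values from this guide.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical completion space reserved in the RX buffer, taken from
 * the read-only completion credit entries in the parameter editor. One
 * PCI Express flow control data credit covers 16 bytes (4 dwords). */
#define CPL_HDR_CREDITS   56u
#define CPL_DATA_CREDITS  224u   /* 224 credits = 3,584 bytes of space */
#define RCB_BYTES         64u    /* read completion boundary (64 or 128) */

static uint32_t hdr_used, data_used;  /* credits consumed by reads in flight */

/* Issue a non-posted read only if its worst-case completions still fit
 * in the reserved completion space; otherwise the caller must retry.
 * The counts are released as the matching completions arrive. */
static bool try_issue_read(uint32_t read_bytes)
{
    /* Worst case: the completer returns one completion per RCB-sized
     * chunk, each consuming one header credit plus data credits. */
    uint32_t hdr  = (read_bytes + RCB_BYTES - 1) / RCB_BYTES;
    uint32_t data = (read_bytes + 15u) / 16u;

    if (hdr_used + hdr > CPL_HDR_CREDITS ||
        data_used + data > CPL_DATA_CREDITS)
        return false;
    hdr_used  += hdr;
    data_used += data;
    return true;
}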
Avalon-MM Settings
The Avalon-MM Settings section of the Qsys design flow IP Compiler for PCI
Express parameter editor contains configuration settings for the PCI Express
Avalon-MM bridge. Table 3–7 describes these parameters.
Table 3–7. Avalon-MM Configuration Settings
■ Peripheral Mode (Requester/Completer, Completer-Only, or Completer-Only single
dword): Specifies whether the IP Compiler for PCI Express component is capable of
sending requests to the upstream PCI Express devices, and whether the incoming
requests are pipelined.
■Requester/Completer—Enables the IP Compiler for PCI Express to send request
packets on the PCI Express TX link as well as receive request packets on the PCI
Express RX link.
■Completer-Only—In this mode, the IP Compiler for PCI Express can receive
requests, but cannot initiate upstream requests. However, it can transmit
completion packets on the PCI Express TX link. This mode removes the
Avalon-MM TX slave port and thereby reduces logic utilization.
■Completer-Only single dword—Non-pipelined version of Completer-Only mode.
At any time, only a single request can be outstanding. Completer-Only single
dword uses fewer resources than Completer-Only.
■ Control Register Access (CRA) Avalon slave port (Off/On): Allows read/write access
to bridge registers from the Avalon interconnect fabric using a specialized slave port.
Disabling this option disallows read/write access to bridge registers, except in the
Completer-Only single dword variations.
■ Auto Enable PCIe Interrupt (enabled at power-on) (Off/On): Turning this option on
enables the IP Compiler for PCI Express interrupt register at power-up. Turning it off
disables the interrupt register at power-up. The setting does not affect run-time
configurability of the interrupt enable register.
Address Translation
The Address Translation section of the Qsys design flow IP Compiler for PCI Express
parameter editor contains parameter settings for address translation in the PCI
Express Avalon-MM bridge. Table 3–8 describes these parameters.
Table 3–8. Avalon-MM Address Translation Settings
■ Address translation table configuration (Dynamic translation table, Fixed
translation table): Sets the Avalon-MM-to-PCI Express address translation scheme to
dynamic or fixed.
■Dynamic translation table—Enables application software to write the address
translation table contents using the control register access slave port. On-chip
memory stores the table. Requires that the Avalon-MM CRA Port be enabled. Use
several address translation table entries to avoid updating a table entry before
outstanding requests complete. This option supports up to 512 address pages.
■Fixed translation table—Configures the address translation table contents to
hardwired fixed values at the time of system generation. This option supports up
to 16 address pages.
■ Number of address pages: Specifies the number of PCI Express base address pages
of memory that the bridge can access. This value corresponds to the number of
entries in the address translation table. The Avalon address range is segmented into
one or more equal-sized pages that are individually mapped to PCI Express
addresses. Select the number and size of the address pages. If you select Dynamic
translation table, use several address translation table entries to avoid updating a
table entry before outstanding requests complete. Dynamic translation table supports
up to 512 address pages, and fixed translation table supports up to 16 address pages.
■ Size of address pages: Specifies the size of each PCI Express memory segment
accessible by the bridge. This value is common for all address translation entries.
Address Translation Table Contents
The address translation table in the Qsys design flow IP Compiler for PCI Express
parameter editor is valid only for the fixed translation table configuration. The table
provides information for translating Avalon-MM addresses to PCI Express addresses.
The number of address pages available in the table is the number of address pages
you specify in the Address Translation section of the parameter editor.
The table entries specify the PCI Express base addresses of memory that the bridge
can access. In translation of Avalon-MM addresses to PCI Express addresses, the
upper bits of the Avalon-MM address are replaced with part of a specific entry. The
most significant bits of the Avalon-MM address index the table, selecting the address
page to use for each request.
The PCIe address field comprises two parameters, bits [31:0] and bits [63:32] of the
address. The Size of address pages value you specify in the Address Translation
section of the parameter editor determines the number of least significant bits in the
address that are replaced by the lower bits of the incoming Avalon-MM address.
However, bit 0 of PCIe Address 31:0 has the following special significance:
■ If bit 0 of PCIe Address 31:0 has value 0, the PCI Express memory accessed
through this address page is 32-bit addressable.
■ If bit 0 of PCIe Address 31:0 has value 1, the PCI Express memory accessed
through this address page is 64-bit addressable.
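A minimal C sketch of this translation, assuming a hypothetical fixed table of 4 pages
of 1 MByte each; the table contents and page geometry are placeholders, not values
from this guide.

#include <stdint.h>

/* Hypothetical fixed translation table: 4 pages of 1 MByte each.
 * Bit 0 of the low dword flags the page as 64-bit addressable. */
#define PAGE_BITS 20u                    /* 1 MByte pages */
#define NUM_PAGES 4u
static const uint64_t xlat_table[NUM_PAGES] = {
    0x00000000C0000001ull,   /* bit 0 = 1: 64-bit addressable target */
    0x0000000010000000ull,   /* bit 0 = 0: 32-bit addressable target */
    0x0000000020000000ull,
    0x0000000030000000ull,
};

/* Translate an Avalon-MM address: the MSBs select the table entry, the
 * entry supplies the PCI Express page base, and the low PAGE_BITS bits
 * pass through unchanged. */
static uint64_t avalon_to_pcie(uint32_t avalon_addr, int *is_64bit)
{
    uint32_t page  = (avalon_addr >> PAGE_BITS) % NUM_PAGES;
    uint64_t entry = xlat_table[page];

    *is_64bit = (int)(entry & 1u);          /* bit 0: addressability flag */
    entry &= ~((1ull << PAGE_BITS) - 1);    /* strip page-offset bits     */
    return entry | (avalon_addr & ((1u << PAGE_BITS) - 1));
}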
IP Core Parameters
The following sections describe the IP Compiler for PCI Express parameters
available in the other two design flows.
System Settings
The first page of the Parameter Settings tab contains the parameters for the overall
system settings. Table 3–9 describes these settings.
The IP Compiler for PCI Express parameter editor that appears in the Qsys flow
provides only the Gen2 Lane Rate Mode, Number of lanes, Reference clock
frequency, Use 62.5 MHz application clock, and Test out width system settings
parameters. For more information, refer to “Parameters in the Qsys Design Flow” on
page 3–1.
Table 3–9. System Settings Parameters (Part 1 of 4)
■ PCIe Core Type (Hard IP for PCI Express or Soft IP for PCI Express): The hard IP
implementation uses embedded dedicated logic to implement the PCI Express
protocol stack, including the physical layer, data link layer, and transaction layer. The
soft IP implementation uses optimized PLD logic to implement the PCI Express
protocol stack, including the physical layer, data link layer, and transaction layer. The
Qsys design flow supports only the hard IP implementation.
Table 3–9. System Settings Parameters (Part 2 of 4)
PCIe System Parameters
■ PHY type (1):
■Custom—Allows all types of external PHY interfaces (except serial). The number
of lanes can be ×1 or ×4. This option is only available for the soft IP
implementation.
■Stratix II GX—Serial interface where Stratix II GX uses the Stratix II GX device
family's built-in transceiver. Selecting this PHY allows only a serial PHY interface
with the lane configuration set to Gen1 ×1, ×4, or ×8.
■Stratix IV GX—Serial interface where Stratix IV GX uses the Stratix IV GX device
family's built-in transceiver to support PCI Express Gen1 and Gen2 ×1, ×4, and
×8. For designs that may target HardCopy IV GX, the HardCopy IV GX setting
must be used even when initially compiling for Stratix IV GX devices. This
procedure ensures that you only apply HardCopy IV GX compatible settings in
the Stratix IV GX implementation.
■Cyclone IV GX—Serial interface where Cyclone IV GX uses the Cyclone IV GX
device family's built-in transceiver. Selecting this PHY allows only a serial PHY
interface with the lane configuration set to Gen1 ×1, ×2, or ×4.
■HardCopy IV GX—Serial interface where HardCopy IV GX uses the
HardCopy IV GX device family's built-in transceiver to support PCI Express Gen1
and Gen2 ×1, ×4, and ×8. For designs that may target HardCopy IV GX, the
HardCopy IV GX setting must be used even when initially compiling for
Stratix IV GX devices. This procedure ensures HardCopy IV GX compatible
settings in the Stratix IV GX implementation. For Gen2 ×8 variations, this
procedure sets the RX buffer and retry buffer to 8 KBytes, which is the
HardCopy IV GX compatible implementation.
■Arria GX—Serial interface where Arria GX uses the Arria GX device family's
built-in transceiver. Selecting this PHY allows only a serial PHY interface with the
lane configuration set to Gen1 ×1 or ×4.
■Arria II GX—Serial interface where Arria II GX uses the Arria II GX device
family's built-in transceiver to support PCI Express Gen1 ×1, ×4, and ×8.
■Arria II GZ—Serial interface where Arria II GZ uses the Arria II GZ device
family's built-in transceiver to support PCI Express Gen1 ×1, ×4, and ×8, and
Gen2 ×1 and ×4.
■TI XIO1100—TI XIO1100 uses an 8-bit DDR/SDR with a TXClk or a 16-bit SDR
with a transmit clock PHY interface. Both of these options restrict the number of
lanes to ×1. This option is only available for the soft IP implementation.
■NXP PX1011A—Philips NPX1011A uses an 8-bit SDR with a TXClk PHY
interface. This option restricts the number of lanes to ×1. This option is only
available for the soft IP implementation.
Table 3–9. System Settings Parameters (Part 3 of 4)
■ PHY interface (8-bit SDR w/TxClk, serial): Selects the specific type of external PHY
interface based on the interface datapath width and clocking mode. Refer to
Chapter 14, External PHYs for additional detail on specific PHY modes. The PHY
interface setting only applies to the soft IP implementation.
■ Configure transceiver block: Clicking this button brings up the transceiver
parameter editor, allowing you to access a much greater subset of the transceiver
parameters than was available in earlier releases. The parameters that you can access
are different for the soft and hard IP versions of the IP Compiler for PCI Express and
may change from release to release. (2) For Arria II GX, Cyclone IV GX,
Stratix II GX, and Stratix IV GX transceivers, refer to the “Protocol Settings for PCI
Express (PIPE)” in the ALTGX Transceiver Setup Guide for an explanation of these
settings.
■ Lanes (×1, ×2, ×4, ×8): Specifies the maximum number of lanes supported. The ×8
soft IP configuration is only supported for Stratix II GX devices. For information
about ×8 support in hard IP configurations, refer to Table 1–5 on page 1–6.
■ Xcvr ref_clk / PHY pclk (100 MHz, 125 MHz): For Arria II GX, Cyclone IV GX,
HardCopy IV GX, and Stratix IV GX, you can select either a 100 MHz or 125 MHz
reference clock for Gen1 operation; Gen2 requires a 100 MHz clock. The Arria GX
and Stratix II GX devices require a 100 MHz clock. If you use a PIPE interface (and
the PHY type is not Arria GX, Arria II GX, Cyclone IV GX, HardCopy IV GX,
Stratix II GX, or Stratix IV GX), the refclk is not required. For Custom and
TI XIO1100 PHYs, the PHY pclk frequency is 125 MHz. For the NXP PX1011A PHY,
the pclk value is 250 MHz.
■ Application interface (Avalon-ST, Descriptor/Data): Specifies the interface between
the PCI Express transaction layer and the application layer. When using the
parameter editor, this parameter can be set to Avalon-ST or Descriptor/Data. Altera
recommends the Avalon-ST option for all new designs. 128-bit Avalon-ST is only
available when using the hard IP implementation.
■ Port type (Native Endpoint, Legacy Endpoint, Root Port): Specifies the port type.
Altera recommends Native Endpoint for all new endpoint designs. Select Legacy
Endpoint only when you require I/O transaction support for compatibility. The Qsys
design flow only supports Native Endpoint and the Avalon-MM interface to the user
application. The Root Port option is available in the hard IP implementations. The
endpoint stores parameters in the Type 0 configuration space, which is outlined in
Table 6–2 on page 6–2. The root port stores parameters in the Type 1 configuration
space, which is outlined in Table 6–3 on page 6–3.
■ PCI Express version (1.0A, 1.1, 2.0): Selects the PCI Express specification with
which the variation is compatible. Depending on the device that you select, the IP
Compiler for PCI Express hard IP implementation supports PCI Express versions 1.1
and 2.0. The IP Compiler for PCI Express soft IP implementation supports PCI
Express versions 1.0a and 1.1.
Table 3–9. System Settings Parameters (Part 4 of 4)
■ Application clock (62.5 MHz, 125 MHz, 250 MHz): Specifies the frequency at which
the application interface clock operates. This frequency can only be set to 62.5 MHz
or 125 MHz for some Gen1 ×1 variations. For all other variations, this field displays
the frequency of operation, which is controlled by the number of lanes, application
interface width, and Max rate setting. Refer to Table 4–1 on page 4–4 for a list of the
supported combinations.
■ Max rate (Gen 1 (2.5 Gbps), Gen 2 (5.0 Gbps)): Specifies the maximum data rate at
which the link can operate. The Gen2 rate is only supported in the hard IP
implementations. Refer to Table 1–5 on page 1–6 for a complete list of Gen1 and
Gen2 support in the hard IP implementation.
■ Test out width (0, 9, 64, 128, or 512 bits): Indicates the width of the test_out signal.
The following widths are possible:
■Hard IP test_out width: None, 9 bits, or 64 bits
■Soft IP ×1 or ×4 test_out width: None, 9 bits, or 512 bits
■Soft IP ×8 test_out width: None, 9 bits, or 128 bits
Most of these signals are reserved. Refer to Table 5–33 on page 5–59 for more
information. Altera recommends the 64-bit width for the hard IP implementation.
■ HIP reconfig (Enable/Disable): Enables reconfiguration of the hard IP PCI Express
read-only configuration registers. This parameter is only available for the hard IP
implementation.
Notes to Table 3–9:
(1) To specify an IP Compiler for PCI Express that targets a Stratix IV GT device, select Stratix IV GX as the PHY
type. You must make sure that any transceiver settings you specify in the transceiver parameter editor are valid for
Stratix IV GT devices; otherwise, errors result during Quartus II compilation.
(2) When you configure the ALT2GXB transceiver for an Arria GX device, the Currently selected device family entry
is Stratix II GX. However, you must make sure that any transceiver settings applied in the ALT2GXB parameter
editor are valid for Arria GX devices; otherwise, errors result during Quartus II compilation.
PCI Registers
The ×1 and ×4 IP cores support memory space BARs ranging in size from 128 bytes to
the maximum allowed by a 32-bit or 64-bit BAR.
The ×1 and ×4 IP cores in legacy endpoint mode support I/O space BARs sized from
16 Bytes to 4 KBytes. The ×8 IP core only supports I/O space BARs of 4 KBytes.
Table 3–10 describes the PCI register parameters.
Table 3–10. PCI Registers
PCI Base Address Registers (0x10, 0x14, 0x18, 0x1C, 0x20, 0x24)
■ BAR Table (BAR0), BAR type and size: BAR0 size and type mapping (I/O space (1),
memory space). BAR0 and BAR1 can be combined to form a 64-bit prefetchable BAR.
BAR0 and BAR1 can be configured separately as 32-bit non-prefetchable
memories. (2)
■ BAR Table (BAR1), BAR type and size: BAR1 size and type mapping (I/O space (1),
memory space). BAR0 and BAR1 can be combined to form a 64-bit prefetchable BAR.
BAR0 and BAR1 can be configured separately as 32-bit non-prefetchable memories.
■ BAR Table (BAR2) (3), BAR type and size: BAR2 size and type mapping (I/O
space (1), memory space). BAR2 and BAR3 can be combined to form a 64-bit
prefetchable BAR. BAR2 and BAR3 can be configured separately as 32-bit
non-prefetchable memories. (2)
■ BAR Table (BAR3) (3), BAR type and size: BAR3 size and type mapping (I/O
space (1), memory space). BAR2 and BAR3 can be combined to form a 64-bit
prefetchable BAR. BAR2 and BAR3 can be configured separately as 32-bit
non-prefetchable memories.
■ BAR Table (BAR4) (3), BAR type and size: BAR4 size and type mapping (I/O
space (1), memory space). BAR4 and BAR5 can be combined to form a 64-bit BAR.
BAR4 and BAR5 can be configured separately as 32-bit non-prefetchable
memories. (2)
■ BAR Table (BAR5) (3), BAR type and size: BAR5 size and type mapping (I/O
space (1), memory space). BAR4 and BAR5 can be combined to form a 64-bit BAR.
BAR4 and BAR5 can be configured separately as 32-bit non-prefetchable memories.
■ BAR Table (EXP-ROM) (4), Disable/Enable: Expansion ROM BAR size and type
mapping (I/O space, memory space, non-prefetchable).
PCIe Read-Only Registers
■ Vendor ID (0x000), default value 0x1172: Sets the read-only value of the vendor ID
register. This parameter cannot be set to 0xFFFF, per the PCI Express Specification.
■ Device ID (0x000), default value 0x0004: Sets the read-only value of the device ID
register.
■ Revision ID (0x008), default value 0x01: Sets the read-only value of the revision ID
register.
■ Class code (0x008), default value 0xFF0000: Sets the read-only value of the class
code register.
■ Subsystem ID (0x02C) (3), default value 0x0004: Sets the read-only value of the
subsystem device ID register.
■ Subsystem vendor ID (0x02C) (3), default value 0x1172: Sets the read-only value of
the subsystem vendor ID register. This parameter cannot be set to 0xFFFF, per the
PCI Express Base Specification 1.1 or 2.0.
■ Input/Output (5): Specifies what address widths are supported for the IO base and
IO limit registers.
■ Prefetchable memory (5): Specifies what address widths are supported for the
prefetchable memory base register and prefetchable memory limit register.
Notes to Table 3–10:
(1) A prefetchable 64-bit BAR is supported. A non-prefetchable 64-bit BAR is not supported because in a typical
system, the root port Type 1 configuration register sets the maximum non-prefetchable memory window to 32 bits.
(2) The Qsys design flows do not support I/O space for BAR type mapping. I/O space is only supported for legacy
endpoint port types.
(3) Only available for EP designs which require the use of the Header type 0 PCI configuration register.
(4) The Qsys design flows do not support the expansion ROM.
(5) Only available for RP designs which require the use of the Header type 1 PCI configuration register. Therefore,
this option is not available in the Qsys design flow.
Capabilities Parameters
The Capabilities page contains the parameters setting various capability properties of
the IP core. These parameters are described in Table 3–11. Some of these parameters
are stored in the Common Configuration Space Header. The byte offset within the
Common Configuration Space Header indicates the parameter address.
The IP Compiler for PCI Express parameter editor that appears in the Qsys flow
provides only the Link port number, Implement advanced error reporting,
Implement ECRC check, and Implement ECRC generation capabilities parameters.
For more information, refer to “Parameters in the Qsys Design Flow” on page 3–1.
Table 3–11. Capabilities Parameters (Part 1 of 4)
Device Capabilities (0x084)
■ Tags supported (4–256): Indicates the number of tags supported for non-posted
requests transmitted by the application layer. The following options are available:
■Hard IP: 32 or 64 tags for ×1, ×4, and ×8
■Soft IP: 4–256 tags for ×1 and ×4; 4–32 tags for ×8
■Qsys design flows: 16 tags
This parameter sets the values in the Device Control register (0x088) of the PCI
Express capability structure described in Table 6–7 on page 6–4. The transaction layer
tracks all outstanding completions for non-posted requests made by the application.
This parameter configures the transaction layer for the maximum number to track.
The application layer must set the tag values in all non-posted PCI Express headers
to be less than this value. Values greater than 32 also set the extended tag field
supported bit in the configuration space device capabilities register. The application
can only use tag numbers greater than 31 if configuration software sets the extended
tag field enable bit of the device control register. This bit is available to the
application as cfg_devcsr[8].
■ Implement completion timeout disable (0x0A8) (On/Off): This option is only
selectable for PCI Express version 2.0 and higher root ports. For PCI Express version
2.0 and higher endpoints, this option is forced to On. For PCI Express version 1.0a
and 1.1 variations, this option is forced to Off. The timeout range is selectable. When
On, the core supports the completion timeout disable mechanism via the PCI Express
Device Control Register 2. The application layer logic must implement the actual
completion timeout mechanism for the required ranges.
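As a concrete illustration of the tag rules above, the following hedged C sketch
computes the highest tag number the application layer may use; the helper and its
parameters are illustrative, not part of the core's interface.

#include <stdint.h>
#include <stdbool.h>

/* Pick the highest tag number the application may use for non-posted
 * requests. Tags above 31 are legal only when the Tags supported
 * parameter exceeds 32 and configuration software has set the extended
 * tag field enable bit, visible to the application as cfg_devcsr[8]. */
static uint32_t max_usable_tag(uint32_t tags_supported, bool ext_tag_enabled)
{
    uint32_t limit = tags_supported;    /* core tracks this many tags */
    if (!ext_tag_enabled && limit > 32)
        limit = 32;                     /* restrict to 5-bit tag values */
    return limit - 1;                   /* tags run 0 .. limit-1 */
}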
Table 3–11. Capabilities Parameters (Part 2 of 4)
■ Completion timeout range (Ranges A–D): This option is only available for PCI
Express version 2.0 and higher. It indicates device function support for the optional
completion timeout programmability mechanism. This mechanism allows system
software to modify the completion timeout value. This field is applicable only to root
ports and endpoints that issue requests on their own behalf. Completion timeouts
are specified and enabled via the Device Control 2 register (0x0A8) of the PCI
Express Capability Structure Version 2.0 described in Table 6–8 on page 6–5. For all
other functions, this field is reserved and must be hardwired to 0x0. Four time value
ranges are defined:
■Range A: 50 µs to 10 ms
■Range B: 10 ms to 250 ms
■Range C: 250 ms to 4 s
■Range D: 4 s to 64 s
Bits are set according to the list below to show the timeout value ranges supported.
The value 0x0 indicates that completion timeout programming is not supported and
the function must implement a timeout value in the range 50 µs to 50 ms. Each range
is turned on or off to specify the full range value. Bit 0 controls Range A, bit 1
controls Range B, bit 2 controls Range C, and bit 3 controls Range D. The following
values are supported:
■0x1: Range A
■0x2: Range B
■0x3: Ranges A and B
■0x6: Ranges B and C
■0x7: Ranges A, B, and C
■0xE: Ranges B, C, and D
■0xF: Ranges A, B, C, and D
All other values are reserved. This parameter is not available for PCIe version 1.0.
Altera recommends that the completion timeout mechanism expire in no less than
10 ms.
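The range encoding is a simple bit mask. A minimal C sketch of the decode (the
helper name is illustrative):

#include <stdio.h>

/* Decode the 4-bit completion timeout ranges field: bit 0 = Range A
 * (50 us-10 ms), bit 1 = Range B (10 ms-250 ms), bit 2 = Range C
 * (250 ms-4 s), bit 3 = Range D (4 s-64 s). A value of 0x0 means
 * programmable completion timeouts are not supported. */
static void print_timeout_ranges(unsigned ranges)
{
    static const char *name[4] = {
        "A (50 us-10 ms)", "B (10 ms-250 ms)",
        "C (250 ms-4 s)",  "D (4 s-64 s)",
    };
    if (ranges == 0) {
        puts("completion timeout programming not supported");
        return;
    }
    for (int bit = 0; bit < 4; bit++)
        if (ranges & (1u << bit))
            printf("Range %s supported\n", name[bit]);
}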
Error Reporting (0x800–0x834)
■ Implement advanced error reporting (On/Off): Implements the advanced error
reporting (AER) capability.
■ Implement ECRC check (On/Off): Enables ECRC checking capability. Sets the
read-only value of the ECRC check capable bit in the advanced error capabilities and
control register. This parameter requires you to implement the advanced error
reporting capability.
■ Implement ECRC generation (On/Off): Enables ECRC generation capability. Sets the
read-only value of the ECRC generation capable bit in the advanced error
capabilities and control register. This parameter requires you to implement the
advanced error reporting capability.
■ Implement ECRC forwarding (On/Off): Available for hard IP implementation only.
Forwards the ECRC to the application layer. On the Avalon-ST receive path, the
incoming TLP contains the ECRC dword and the TD bit is set if an ECRC exists. On
the Avalon-ST transmit path, the TLP from the application must contain the ECRC
dword and have the TD bit set.
Table 3–11. Capabilities Parameters (Part 3 of 4)
MSI Capabilities (0x050–0x05C)
■ MSI messages requested (1, 2, 4, 8, 16, 32): Indicates the number of messages the
application requests. Sets the value of the multiple message capable field of the
message control register, 0x050[31:16]. The Qsys design flow supports only 1 MSI.
■ MSI message 64-bit address capable (On/Off): Indicates whether the MSI capability
message control register is 64-bit addressing capable. PCI Express native endpoints
always support MSI 64-bit addressing.
Link Capabilities (0x090)
■ Link common clock (On/Off): Indicates if the common reference clock supplied by
the system is used as the reference clock for the PHY. This parameter sets the
read-only value of the slot clock configuration bit in the link status register.
■ Data link layer active reporting (0x094) (On/Off): Turn this option On for a
downstream port if the component supports the optional capability of reporting the
DL_Active state of the Data Link Control and Management State Machine. For a
hot-plug capable downstream port (as indicated by the Hot-Plug Capable field of the
Slot Capabilities register), this option must be turned on. For upstream ports and
components that do not support this optional capability, turn this option Off.
Endpoints do not support this option.
■ Surprise down reporting (On/Off): When this option is On, a downstream port
supports the optional capability of detecting and reporting the surprise down error
condition.
■ Link port number (0x01): Sets the read-only value of the port number field in the
link capabilities register.
Slot Capabilities (0x094)
■ Enable slot capability (On/Off): The slot capability is required for root ports if a slot
is implemented on the port. Slot status is recorded in the PCI Express Capabilities
register. This capability is only available for root port variants. Therefore, this option
is not available in the Qsys design flow.
■ Slot capability register (0x00000000): Defines the characteristics of the slot. You turn
this option on by selecting Enable slot capability. The register fields are: Attention
Button Present (bit 0), Power Controller Present (bit 1), MRL Sensor Present (bit 2),
Attention Indicator Present (bit 3), Power Indicator Present (bit 4), Hot-Plug Surprise
(bit 5), Hot-Plug Capable (bit 6), Slot Power Limit Value (bits 14:7), Slot Power Limit
Scale (bits 16:15), Electromechanical Interlock Present (bit 17), No Command
Completed Support (bit 18), and Physical Slot Number (bits 31:19).
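The field layout above follows the PCI Express slot capabilities register format. The
following C sketch packs these fields into a 32-bit value; it illustrates the bit layout
only and is not part of the IP core's interface (the parameter editor composes this
value for you).

#include <stdint.h>

/* Pack the slot capability register fields using the bit layout listed
 * above. All arguments are masked to their field widths. */
static uint32_t pack_slot_caps(unsigned attn_button, unsigned pwr_ctrl,
                               unsigned mrl_sensor, unsigned attn_ind,
                               unsigned pwr_ind, unsigned hp_surprise,
                               unsigned hp_capable, unsigned limit_value,
                               unsigned limit_scale, unsigned em_interlock,
                               unsigned no_cmd_completed, unsigned slot_num)
{
    return (attn_button      & 0x1u)
         | (pwr_ctrl         & 0x1u)    << 1
         | (mrl_sensor       & 0x1u)    << 2
         | (attn_ind         & 0x1u)    << 3
         | (pwr_ind          & 0x1u)    << 4
         | (hp_surprise      & 0x1u)    << 5
         | (hp_capable       & 0x1u)    << 6
         | (limit_value      & 0xFFu)   << 7    /* bits 14:7  */
         | (limit_scale      & 0x3u)    << 15   /* bits 16:15 */
         | (em_interlock     & 0x1u)    << 17
         | (no_cmd_completed & 0x1u)    << 18
         | (slot_num         & 0x1FFFu) << 19;  /* bits 31:19 */
}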
MSI-X Capabilities (0x68, 0x6C, 0x70)
The MSI-X functionality is only available in the hard IP implementation. The Qsys
design flow does not support MSI-X functionality.
Table 3–11. Capabilities Parameters (Part 4 of 4)
■ MSI-X Table size (0x068[26:16]), bits [10:0]: System software reads this field to
determine the MSI-X Table size <N>, which is encoded as <N–1>. For example, a
returned value of 10'b00000000011 indicates a table size of 4. This field is read-only.
■ MSI-X Table Offset, bits [31:3]: Points to the base of the MSI-X Table. The lower 3
bits of the table BAR indicator (BIR) are set to zero by software to form a 32-bit
qword-aligned offset. This field is read-only.
■ MSI-X Table BAR Indicator, bits [<5–1>:0]: Indicates which one of a function's Base
Address registers, located beginning at 0x10 in configuration space, is used to map
the MSI-X table into memory space. This field is read-only.
■ Pending Bit Array (PBA) Offset, bits [31:3]: Used as an offset from the address
contained in one of the function's Base Address registers to point to the base of the
MSI-X PBA. The lower 3 bits of the PBA BIR are set to zero by software to form a
32-bit qword-aligned offset. This field is read-only.
■ PBA BAR Indicator (BIR), bits [<5–1>:0]: Indicates which of a function's Base
Address registers, located beginning at 0x10 in configuration space, is used to map
the function's MSI-X PBA into memory space. This field is read-only.
Note to Table 3–11:
(1) Throughout this user guide, the terms word, dword, and qword have the same meaning that they have in the PCI
Express Base Specification Revision 1.0a, 1.1, or 2.0. A word is 16 bits, a dword is 32 bits, and a qword is 64 bits.
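Decoding these read-only fields is mechanical. The following hedged C sketch shows
how system software might extract them, assuming the raw capability dwords have
already been read from configuration space.

#include <stdint.h>

/* Decode the read-only MSI-X capability fields described in Table 3-11.
 * The dword at capability offset 0x068 holds the table size in bits
 * [26:16], encoded as N-1; the table and PBA dwords each combine a
 * qword-aligned offset (bits [31:3]) with a BAR indicator in the low
 * bits. */
static unsigned msix_table_entries(uint32_t msg_ctl_dword)
{
    return ((msg_ctl_dword >> 16) & 0x7FFu) + 1u;  /* N encoded as N-1 */
}

static unsigned msix_bir(uint32_t table_dword)
{
    return table_dword & 0x7u;      /* which BAR maps the table or PBA */
}

static uint32_t msix_offset(uint32_t table_dword)
{
    return table_dword & ~0x7u;     /* qword-aligned byte offset */
}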
Buffer Setup
The Buffer Setup page contains the parameters for the receive and retry buffers.
Table 3–12 describes the parameters you can set on this page.
The IP Compiler for PCI Express parameter editor that appears in the Qsys flow
provides only the Maximum payload size and RX buffer credit allocation –
performance for received requests buffer setup parameters. This parameter editor
also displays the read-only RX buffer space allocation information without the space
usage or totals information. For more information, refer to “Parameters in the Qsys
Design Flow” on page 3–1.
Table 3–12. Buffer Setup Parameters (Part 1 of 3)
■ Maximum payload size (0x084): Specifies the maximum payload size supported.
This parameter sets the read-only value of the max payload size supported field of
the device capabilities register (0x084[2:0]) and optimizes the IP core for this size
payload.
■ Number of virtual channels: Specifies the number of virtual channels supported.
This parameter sets the read-only extended virtual channel count field of port virtual
channel capability register 1 and controls how many virtual channel transaction
layer interfaces are implemented. The number of virtual channels supported depends
upon the configuration, as follows:
■Hard IP: 1–2 channels for Stratix IV GX devices; 1 channel for Arria II GX,
Arria II GZ, Cyclone IV GX, and HardCopy IV GX devices
■Soft IP: 2 channels
■Qsys: 1 channel
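The max payload size supported field at 0x084[2:0] uses the standard PCI Express
encoding, a power of two times 128 bytes. A minimal C sketch of the conversion (the
helper names are illustrative):

#include <stdint.h>

/* The max payload size supported field (device capabilities register
 * 0x084[2:0]) encodes the size as 128 bytes times a power of two:
 * 0 = 128 bytes, 1 = 256 bytes, ... 4 = 2048 bytes. */
static uint32_t max_payload_bytes(uint32_t devcap)
{
    return 128u << (devcap & 0x7u);
}

static uint32_t max_payload_encoding(uint32_t bytes)
{
    uint32_t code = 0;
    while ((128u << code) < bytes)  /* e.g. 2048 bytes -> code 4 */
        code++;
    return code;
}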
Table 3–12. Buffer Setup Parameters (Part 2 of 3)
■ Number of low-priority VCs (0x104) (None, 1): Specifies the number of virtual
channels in the low-priority arbitration group. The virtual channels numbered less
than this value are low priority. Virtual channels numbered greater than or equal to
this value are high priority. Refer to “Transmit Virtual Channel Arbitration” on
page 4–10 for more information. This parameter sets the read-only low-priority
extended virtual channel count field of the port virtual channel capability register 1.
■ Auto configure retry buffer size (On/Off): Controls automatic configuration of the
retry buffer based on the maximum payload size. For the hard IP implementation,
this is set to On.
■ Retry buffer size (256 Bytes–16 KBytes, powers of 2): Sets the size of the retry buffer
for storing transmitted PCI Express packets until acknowledged. This option is only
available if you do not turn on Auto configure retry buffer size. The hard IP retry
buffer is fixed at 4 KBytes for Arria II GX and Cyclone IV GX devices and at
16 KBytes for Stratix IV GX devices.
■ Maximum retry packets (4–256, powers of 2): Sets the maximum number of packets
that can be stored in the retry buffer. For the hard IP implementation, this parameter
is set to 64.
■ Desired performance for received requests (Maximum, High, Medium, Low):
Specifies how to configure the RX buffer size and the flow control credits for
received requests.
■Low—Provides the minimal amount of space for desired traffic. Select this option
when the throughput of the received requests is not critical to the system design.
This setting minimizes the device resource utilization.
Because the Arria II GX and Stratix IV hard IP have a fixed RX buffer size, the
choices for this parameter are limited to a subset of these values. For a Max payload
size of 512 bytes or less, the only available value is Maximum. For a Max payload
size of 1 KByte or 2 KBytes, a tradeoff has to be made between how much space is
allocated to requests versus completions. At 1 KByte and 2 KByte Max payload size,
selecting a lower value for this setting forces a higher setting for the Desired
performance for received completions.
Note that the read-only values for header and data credits update as you change this
setting.
For more information, refer to Chapter 11, Flow Control. This analysis explains how
the Maximum payload size and Desired performance for received completions that
you choose affect the allocation of flow control credits.
Table 3–12. Buffer Setup Parameters (Part 3 of 3)
■ Desired performance for received completions (Maximum, High, Medium, Low):
Specifies how to configure the RX buffer size and the flow control credits:
■Maximum—Provides additional space to allow for additional external delays (link
side and application side) and still allows full throughput. If you need more buffer
space than this parameter supplies, select a larger payload size and this setting.
The maximum setting increases the buffer size and slightly increases the number
of logic elements (LEs), to support a larger payload size than is used. This is the
default setting for the hard IP implementation.
■Medium—Provides a moderate amount of space for received completions. Select
this option when the received completion traffic does not need to use the full link
bandwidth, but is expected to occasionally use short bursts of maximum-sized
payload packets.
■Low—Provides the minimal amount of space for received completions. Select this
option when the throughput of the received completions is not critical to the
system design. This is used when your application is never expected to initiate
read requests on the PCI Express links. Selecting this option minimizes the device
resource utilization.
For the hard IP implementation, this parameter is not directly adjustable. The value
set is derived from the values of Max payload size and the Desired performance for
received requests parameter.
For more information, refer to Chapter 11, Flow Control. This analysis explains how
the Maximum payload size and Desired performance for received completions that
you choose affect the allocation of flow control credits.
■ RX Buffer Space Allocation (per VC) (read-only table): Shows the credits and space
allocated for each flow-controllable type, based on the RX buffer size setting. All
virtual channels use the same RX buffer space allocation. The table shows header
and data credits for RX posted (memory writes) and completion requests, and
header credits for non-posted requests (memory reads). The table does not show
non-posted data credits because the IP core always advertises infinite non-posted
data credits and automatically has room for the maximum number of dwords of data
that can be associated with each non-posted header. The numbers shown for
completion headers and completion data indicate how much space is reserved in the
RX buffer for completions. However, infinite completion credits are advertised on
the PCI Express link, as is required for endpoints. The application layer must manage
the rate of non-posted requests to ensure that the RX buffer completion space does
not overflow. The hard IP RX buffer is fixed at 16 KBytes for Stratix IV GX devices
and 4 KBytes for Arria II GX devices.
Power Management
The Power Management page contains the parameters for setting various power
management properties of the IP core. These parameters are not available in the Qsys
design flow.
Table 3–13 describes the parameters you can set on this page.
Table 3–13. Power Management Parameters (Part 1 of 2)
L0s Active State Power Management (ASPM)
■ Idle threshold for L0s entry (256 ns–8,192 ns, in 256 ns increments): This design
parameter indicates the idle threshold for L0s entry. This parameter specifies the
amount of time the link must be idle before the transmitter transitions to the L0s
state. The PCI Express specification states that this time should be no more than
7 µs, but the exact value is implementation-specific. If you select the Arria GX,
Arria II GX, Cyclone IV GX, Stratix II GX, or Stratix IV GX PHY, this parameter is
disabled and set to its maximum value. If you are using an external PHY, consult the
PHY vendor's documentation to determine the correct value for this parameter.
■ Endpoint L0s acceptable latency (< 64 ns to > 4 µs): This design parameter indicates
the acceptable endpoint L0s latency for the device capabilities register. Sets the
read-only value of the endpoint L0s acceptable latency field of the device capabilities
register (0x084). This value should be based on how much latency the application
layer can tolerate. This setting is disabled for root ports.
Number of fast training sequences (N_FTS)
■ Common clock (Gen1: 0–255; Gen2: 0–255): Indicates the number of fast training
sequences needed in common clock mode. The number of fast training sequences
required is transmitted to the other end of the link during link initialization and is
also used to calculate the L0s exit latency field of the device capabilities register
(0x084). If you select the Arria GX, Arria II GX, Stratix II GX, or Stratix IV GX PHY,
this parameter is disabled and set to its maximum value. If you are using an external
PHY, consult the PHY vendor's documentation to determine the correct value for
this parameter.
■ Separate clock (Gen1: 0–255; Gen2: 0–255): Indicates the number of fast training
sequences needed in separate clock mode. The number of fast training sequences
required is transmitted to the other end of the link during link initialization and is
also used to calculate the L0s exit latency field of the device capabilities register
(0x084). If you select the Arria GX, Arria II GX, Stratix II GX, or Stratix IV GX PHY,
this parameter is disabled and set to its maximum value. If you are using an external
PHY, consult the PHY vendor's documentation to determine the correct value for
this parameter.
■ Electrical idle exit (EIE) before FTS (3:0): Sets the number of EIE symbols sent
before sending the N_FTS sequence. Legal values are 4–8. N_FTS is disabled for
Arria II GX and Stratix IV GX devices pending device characterization.
L1 Active State Power Management (ASPM)
■ Enable L1 ASPM (On/Off): Sets the L1 active state power management support bit
in the link capabilities register (0x08C). If you select the Arria GX, Arria II GX,
Cyclone IV GX, Stratix II GX, or Stratix IV GX PHY, this option is turned off and
disabled.
Table 3–13. Power Management Parameters (Part 2 of 2)
■ Endpoint L1 acceptable latency (< 1 µs to > 64 µs): This value indicates the
acceptable latency that an endpoint can withstand in the transition from the L1 to L0
state. It is an indirect measure of the endpoint's internal buffering. This setting is
disabled for root ports. Sets the read-only value of the endpoint L1 acceptable
latency field of the device capabilities register. It provides information to other
devices which have turned On the Enable L1 ASPM option. If you select the
Arria GX, Arria II GX, Cyclone IV GX, Stratix II GX, or Stratix IV GX PHY, this
option is turned off and disabled.
■ L1 Exit Latency Common clock (< 1 µs to > 64 µs): Indicates the L1 exit latency for
the common clock. Used to calculate the value of the L1 exit latency field of the
device capabilities register (0x084). If you select the Arria GX, Arria II GX,
Cyclone IV GX, Stratix II GX, or Stratix IV GX PHY, this parameter is disabled and
set to its maximum value. If you are using an external PHY, consult the PHY
vendor's documentation to determine the correct value for this parameter.
■ L1 Exit Latency Separate clock (< 1 µs to > 64 µs): Indicates the L1 exit latency for
the separate clock. Used to calculate the value of the L1 exit latency field of the
device capabilities register (0x084). If you select the Arria GX, Arria II GX,
Cyclone IV GX, Stratix II GX, or Stratix IV GX PHY, this parameter is disabled and
set to its maximum value. If you are using an external PHY, consult the PHY
vendor's documentation to determine the correct value for this parameter.
Avalon-MM Configuration
The Avalon Configuration page contains parameter settings for the PCI Express
Avalon-MM bridge. The bridge is available only in the Qsys design flow. For more
information about the Avalon-MM configuration parameters in the Qsys design flow,
refer to “Parameters in the Qsys Design Flow” on page 3–1. Table 3–14 describes
these settings.
Table 3–14. Avalon Configuration Settings (Part 1 of 2)
■ Avalon Clock Domain (Use PCIe core clock, Use separate clock): Allows you to
specify one or two clock domains for your application and the IP Compiler for PCI
Express. The single clock domain is higher performance because it avoids the clock
crossing logic that separate clock domains require.
■Use PCIe core clock—In this mode, the IP Compiler for PCI Express provides a
clock output, clk125_out or pcie_clk_out, to be used as the single clock for the IP
Compiler for PCI Express and the system application clock.
■Use separate clock—In this mode, the protocol layers of the IP Compiler for PCI
Express operate on an internally generated clock. The IP Compiler for PCI Express
exports clk125_out; however, this clock is not visible and cannot drive the
components. The Avalon-MM bridge logic of the IP Compiler for PCI Express
operates on a different clock.
For more information about these two modes, refer to “Avalon-MM Interface–Hard
IP and Soft IP Implementations” on page 7–11.
■ PCIe Peripheral Mode (Requester/Completer, Completer-Only, Completer-Only
single dword): Specifies whether the IP Compiler for PCI Express component is
capable of sending requests to the upstream PCI Express devices, and whether the
incoming requests are pipelined.
■Requester/Completer—Enables the IP Compiler for PCI Express to send request
packets on the PCI Express TX link as well as receive request packets on the PCI
Express RX link.
■Completer-Only—In this mode, the IP Compiler for PCI Express can receive
requests, but cannot initiate upstream requests. However, it can transmit
completion packets on the PCI Express TX link. This mode removes the
Avalon-MM TX slave port and thereby reduces logic utilization. When selecting
this option, you should also select Low for the Desired performance for received
completions option on the Buffer Setup page to minimize the device resources
consumed. Completer-Only is only available in hard IP implementations.
■Completer-Only single dword—Non-pipelined version of Completer-Only mode.
At any time, only a single request can be outstanding. Completer-Only single
dword uses fewer resources than Completer-Only and is only available in hard IP
implementations.
■ Address translation table configuration (Dynamic translation table, Fixed
translation table): Sets the Avalon-MM-to-PCI Express address translation scheme to
dynamic or fixed.
■Dynamic translation table—Enables application software to write the address
translation table contents using the control register access slave port. On-chip
memory stores the table. Requires that the Avalon-MM CRA Port be enabled. Use
several address translation table entries to avoid updating a table entry before
outstanding requests complete.
■Fixed translation table—Configures the address translation table contents to
hardwired fixed values at the time of system generation.
Table 3–14. Avalon Configuration Settings (Part 2 of 2)
■ Address translation table size: Sets the Avalon-MM-to-PCI Express address
translation windows and size.
■Number of address pages (1, 2, 4, 8, 16, 32, 64, 128, 256, 512)—Specifies the
number of PCI Express base address pages of memory that the bridge can access.
This value corresponds to the number of entries in the address translation table.
The Avalon address range is segmented into one or more equal-sized pages that
are individually mapped to PCI Express addresses. Select the number and size of
the address pages. If you select Dynamic translation table, use several address
translation table entries to avoid updating a table entry before outstanding
requests complete.
■Size of address pages (1 MByte–2 GBytes)—Specifies the size of each PCI Express
memory segment accessible by the bridge. This value is common for all address
translation entries.
Fixed Address Translation Table Contents
■ PCIe base address (32-bit, 64-bit) and Type (32-bit Memory, 64-bit Memory):
Specifies the type and PCI Express base addresses of memory that the bridge can
access. The upper bits of the Avalon-MM address are replaced with part of a specific
entry. The MSBs of the Avalon-MM address, used to index the table, select the entry
to use for each request. The values of the lower bits (as specified in the size of
address pages parameter) entered in this table are ignored. Those lower bits are
replaced by the lower bits of the incoming Avalon-MM addresses.
■ Avalon-MM CRA port (Enable/Disable): Allows read/write access to bridge registers
from Avalon using a specialized slave port. Disabling this option disallows
read/write access to bridge registers.
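The page count and page size together determine how the bridge splits an incoming
Avalon-MM address into a table index and a pass-through offset. A minimal C sketch
of that split, using hypothetical values (16 pages of 1 MByte):

#include <stdint.h>

/* Hypothetical translation window: 16 pages of 1 MByte each. */
enum { NUM_PAGES = 16, PAGE_BYTES = 1 << 20 };

/* Count the bits needed to address a power-of-two quantity. */
static unsigned bits_for(uint32_t n)
{
    unsigned b = 0;
    while ((1u << b) < n)
        b++;
    return b;
}

/* Split an Avalon-MM address into the table index (MSBs) and the
 * pass-through page offset (LSBs), as the bridge does for each request. */
static void split_address(uint32_t avalon_addr,
                          uint32_t *index, uint32_t *offset)
{
    unsigned offset_bits = bits_for(PAGE_BYTES);        /* 20 bits */
    *index  = (avalon_addr >> offset_bits) & (NUM_PAGES - 1);
    *offset = avalon_addr & (PAGE_BYTES - 1);
}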
4. IP Core Architecture
This chapter describes the architecture of the IP Compiler for PCI Express. For the
hard IP implementation, you can design an endpoint using the Avalon-ST interface or
Avalon-MM interface, or a root port using the Avalon-ST interface. For the soft IP
implementation, you can design an endpoint using the Avalon-ST, Avalon-MM, or
Descriptor/Data interface. All configurations contain a transaction layer, a data link
layer, and a PHY layer with the following functions:
■ Transaction Layer—The transaction layer contains the configuration space, which
manages communication with the application layer: the receive and transmit
channels, the receive buffer, and flow control credits. You can choose one of the
following two options for the application layer interface from the parameter editor:
■Avalon-ST Interface
■Descriptor/Data Interface (not recommended for new designs)
■ Data Link Layer—The data link layer, located between the physical layer and the
transaction layer, manages packet transmission and maintains data integrity at the
link level. Specifically, the data link layer performs the following tasks:
■Manages transmission and reception of data link layer packets
■Generates all transmission cyclical redundancy code (CRC) values and checks
all CRCs during reception
■Manages the retry buffer and retry mechanism according to received
ACK/NAK data link layer packets
■Initializes the flow control mechanism for data link layer packets and routes
flow control credits to and from the transaction layer
■ Physical Layer—The physical layer initializes the speed, lane numbering, and lane
width of the PCI Express link according to packets received from the link and
directives received from higher layers.
1 IP Compiler for PCI Express soft IP endpoints comply with the PCI Express Base
Specification 1.0a or 1.1. IP Compiler for PCI Express hard IP endpoints and root ports
comply with the PCI Express Base Specification 1.1, 2.0, or 2.1.
Figure 4–1 broadly describes the roles of each layer of the PCI Express IP core. On the
transmit path, with information sent by the application layer, the transaction layer
generates a TLP, which includes a header and, optionally, a data payload; the data
link layer ensures packet integrity, and adds a sequence number and link cyclic
redundancy code (LCRC) check to the packet; the physical layer encodes the packet
and transmits it to the receiving device on the other side of the link. On the receive
path, the physical layer decodes the packet and transfers it to the data link layer; the
data link layer verifies the packet's sequence number and checks for errors; the
transaction layer disassembles the transaction and transfers data to the application
layer in a form that it recognizes.
Figure 4–1. IP Compiler for PCI Express Layers (figure: the Avalon-ST,
Descriptor/Data, or Avalon-MM application interface TX and RX ports, followed by
the transaction, data link, and physical layers, connecting the application layer to the
link)
This chapter provides an overview of the architecture of the Altera IP Compiler for
PCI Express. It includes the following sections:
■ Application Interfaces
■ Transaction Layer
■ Data Link Layer
■ Physical Layer
■ PCI Express Avalon-MM Bridge
■ Completer Only PCI Express Endpoint Single DWord
Application Interfaces
You can generate the IP Compiler for PCI Express with the following application
interfaces:
■ Avalon-ST Application Interface
■ Avalon-MM Interface
Appendix B describes the Descriptor/Data interface.
Avalon-ST Application Interface
You can create an IP Compiler for PCI Express root port or endpoint using the
parameter editor to specify the Avalon-ST interface. The resulting variation includes a
PCI Express Avalon-ST adapter module in addition to the three PCI Express layers.
The PCI Express Avalon-ST adapter maps PCI Express transaction layer packets
(TLPs) to the user application RX and TX buses. Figure 4–2 illustrates this interface.

Figure 4–2. IP Core with PCI Express Avalon-ST Interface Adapter. The figure shows the Avalon-ST adapter between the application layer and the transaction, data link, and physical layers; the adapter presents Avalon-ST TX and RX ports to the application layer.
In both the hard IP and soft IP implementations of the IP Compiler for PCI Express,
the adapter maps the user application Avalon-ST interface to PCI Express TLPs. The
hard IP and soft IP implementations differ in the following respects:
■ The hard IP implementation includes dedicated clock domain crossing logic
between the PHYMAC and data link layers. In the soft IP implementation you can
specify one or two clock domains for the IP core.
■ The hard IP implementation includes the following interfaces to access the
configuration space registers:
■The LMI interface
■The Avalon-MM PCIe reconfig bus, which can access any read-only
configuration space register
■In root port configuration, you can also access the configuration space registers
with a configuration type TLP using the Avalon-ST interface. A type 0
configuration TLP is used to access the RP configuration space registers, and a
type 1 configuration TLP is used to access the configuration space registers of
downstream nodes, typically endpoints on the other side of the link.
Figure 4–3 and Figure 4–4 illustrate the hard IP and soft IP implementations of the IP
Compiler for PCI Express with an Avalon-ST interface.

Figure 4–3. PCI Express Hard IP Implementation with Avalon-ST Interface to User Application. The hard IP implementation contains the PHYMAC, clock domain crossing (CDC) logic, data link layer (DLL), transaction layer (TL), configuration space, and the Avalon-ST adapter. It presents Avalon-ST RX, Avalon-ST TX, side band, LMI (Avalon-MM), and PCIe reconfig interfaces to the application layer, and connects to the transceiver through the PIPE interface, with LMI, reconfig block, and clock and reset selection logic.

Figure 4–4. PCI Express Soft IP Implementation with Avalon-ST Interface to User Application. The soft IP implementation contains the PHYMAC, data link layer (DLL), transaction layer (TL), and the Avalon-ST adapter. It presents Avalon-ST RX, Avalon-ST TX, side band, and test (test_in/test_out) interfaces to the application layer, and connects to the transceiver through the PIPE interface, with clock and reset selection logic.
Table 4–1 provides the application clock frequencies for the hard IP and soft IP
implementations. As this table indicates, the Avalon-ST interface can be either 64 or
128 bits for the hard IP implementation. For the soft IP implementation, the Avalon-ST
interface is 64 bits.
Table 4–1. Application Clock Frequencies
Hard IP Implementation—Stratix IV GX and HardCopy IV GX Devices
(1) The 62.5 MHz application clock is available in parameter editor-generated Gen1 ×1 hard IP implementations in any device.
The following sections introduce the functionality of the interfaces shown in
Figure 4–3 and Figure 4–4. For more detailed information, refer to "64- or 128-Bit
Avalon-ST RX Port" on page 5–6 and "64- or 128-Bit Avalon-ST TX Port" on page 5–15.
RX Datapath
The RX datapath transports data from the transaction layer to the Avalon-ST interface.
A FIFO buffers the RX data from the transaction layer until the streaming interface
accepts it. The adapter autonomously acknowledges all packets it receives from the
PCI Express IP core. The rx_abort and rx_retry signals of the transaction layer
interface are not used. Masking of non-posted requests is partially supported. Refer to
the description of the rx_st_mask<n> signal for further information about masking.
TX Datapath
The TX datapath transports data from the application's Avalon-ST interface to the
transaction layer. In the hard IP implementation, a FIFO buffers the Avalon-ST data
until the transaction layer accepts it.

If required, TLP ordering should be implemented by the application layer. The TX
datapath provides a TX credit (tx_cred) vector which reflects the number of credits
available. For non-posted requests, this vector accounts for credits pending in the
Avalon-ST adapter. For example, if the tx_cred value is 5, the application layer has 5
credits available to it. For completions and posted requests, the tx_cred vector reflects
the credits available in the transaction layer of the IP Compiler for PCI Express. For
example, for completions and posted requests, if tx_cred is 5, the actual number of
credits available to the application is (5 – <the number of credits in the adaptor>). You
must account for completion and posted credits which may be pending in the
Avalon-ST adapter. You can use the read and write FIFO pointers and the FIFO empty
flag to track packets as they are popped from the adaptor FIFO and transferred to the
transaction layer.
TLP Reordering
Applications that use the non-posted tx_cred signal must ensure they never send
more packets than tx_cred allows. While the IP core always obeys PCI Express flow
control rules, the behavior of the tx_cred signal itself is unspecified if the credit limit
is violated. When evaluating tx_cred, the application must take into account TLPs
that are in flight, and not yet reflected in tx_cred. Altera recommends your
application implement the following procedure, beginning from a state in which the
application has not yet issued any TLPs:
1. For calibration, ensure this application has issued no TLPs.
2. Wait for tx_cred to indicate that credits are available.
3. Send as many TLPs as are allowed by tx_cred. For example, if tx_cred indicates 3
credits of non-posted headers are available, the application sends 3 non-posted
TLPs, then stops. In this step, the application exhausts tx_cred before waiting for
more credits to free. This step is required.
4. Wait for the TLPs to cross the Avalon-ST TX interface.
5. Wait at least 3 more clock cycles for tx_cred to reflect the consumed credits.
6. Repeat from Step 2.
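The following C sketch models this calibration loop for non-posted credits. It is illustrative only: poll_tx_cred_np(), send_nonposted_tlp(), wait_tx_interface_idle(), and wait_clocks() are hypothetical stand-ins for logic the application would implement in hardware.

```c
#include <stdint.h>

/* Hypothetical helpers standing in for application logic: */
extern uint32_t poll_tx_cred_np(void);    /* sample non-posted header credits in tx_cred  */
extern void send_nonposted_tlp(void);     /* drive one non-posted TLP onto Avalon-ST TX   */
extern void wait_tx_interface_idle(void); /* wait for TLPs to cross the Avalon-ST TX port */
extern void wait_clocks(int n);           /* wait n application clock cycles              */

/* Behavioral model of the recommended tx_cred procedure (Steps 1 through 6). */
void nonposted_credit_loop(void)
{
    /* Step 1: calibration state, no TLPs issued yet. */
    for (;;) {
        /* Step 2: wait for tx_cred to indicate available credits. */
        uint32_t credits;
        while ((credits = poll_tx_cred_np()) == 0)
            ;
        /* Step 3: exhaust tx_cred before waiting for more credits to free. */
        for (uint32_t i = 0; i < credits; i++)
            send_nonposted_tlp();
        /* Step 4: wait for the TLPs to cross the Avalon-ST TX interface. */
        wait_tx_interface_idle();
        /* Step 5: wait at least 3 more clocks for tx_cred to update. */
        wait_clocks(3);
        /* Step 6: repeat from Step 2. */
    }
}
```

The essential design point is Step 3: the application always spends every credit that tx_cred reports before it begins waiting again, so stale credit values cannot accumulate.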
1 The value of the non-posted tx_cred represents that there are at least that number of
credits available. The non-posted credits displayed may be less than what is actually
available to the IP core.
LMI Interface (Hard IP Only)
The LMI bus provides access to the PCI Express configuration space in the transaction
layer. For more LMI details, refer to the “LMI Signals—Hard IP Implementation” on
page 5–37.
PCI Express Reconfiguration Block Interface (Hard IP Only)
The PCI Express reconfiguration bus allows you to dynamically change the read-only
values stored in the configuration registers. For detailed information refer to the “IP
Core Reconfiguration Block Signals—Hard IP Implementation” on page 5–38.
MSI (Message Signal Interrupt) Datapath
The MSI datapath contains the MSI boundary registers for incremental compilation.
The interface uses the transaction layer's request–acknowledge handshaking protocol.
You use the TX FIFO empty flag from the TX datapath FIFO for TX/MSI
synchronization. When the TX block application drives a packet to the Avalon-ST
adapter, the packet remains in the TX datapath FIFO as long as the IP core throttles
this interface. When you must send an MSI request after a specific TX packet, you can
use the TX FIFO empty flag to determine when the IP core receives the TX packet.
For example, you may want to send an MSI request only after all TX packets are
issued to the transaction layer. Alternatively, if you cannot interrupt traffic flow to
synchronize the MSI, you can use a counter to count 16 writes (the depth of the FIFO)
after a TX packet has been written to the FIFO (or until the FIFO becomes empty) to
ensure that the transaction layer interface receives the packet before you issue the
MSI request.
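As an illustration, the following C sketch models both synchronization options under the stated FIFO depth of 16. The helpers tx_fifo_empty0(), tx_fifo_write_observed(), wait_one_clock(), and issue_msi_request() are hypothetical stand-ins for the application's hardware logic.

```c
#include <stdbool.h>

extern bool tx_fifo_empty0(void);          /* TX datapath FIFO empty flag from the IP core   */
extern bool tx_fifo_write_observed(void);  /* hypothetical: true in a cycle with a FIFO write */
extern void wait_one_clock(void);          /* advance one application clock                  */
extern void issue_msi_request(void);       /* assert app_msi_req toward the MSI datapath     */

/* Option 1: wait until all TX packets have drained to the transaction layer. */
void msi_after_fifo_empty(void)
{
    while (!tx_fifo_empty0())
        wait_one_clock();
    issue_msi_request();
}

/* Option 2: without interrupting traffic, count 16 writes (the FIFO depth)
 * after the TX packet of interest was written, or stop early if the FIFO
 * empties, before issuing the MSI request. */
void msi_after_16_writes(void)
{
    int writes = 0;
    while (writes < 16 && !tx_fifo_empty0()) {
        if (tx_fifo_write_observed())
            writes++;
        wait_one_clock();
    }
    issue_msi_request();
}
```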
Figure 4–5 illustrates the Avalon-ST TX and MSI datapaths.

Figure 4–5. Avalon-ST TX and MSI Datapaths. The figure shows the FIFO buffer between tx_st_data0 and the transaction layer, the app_msi_req path with its registers, the tx_cred0 signals for non-posted requests and for completion and posted requests (from the transaction layer), and the tx_fifo_empty0, tx_fifo_wrptr0, and tx_fifo_rdptr0 flags presented to the application layer.
Incremental Compilation
The IP core with Avalon-ST interface includes a fully registered interface between the
user application and the PCI Express transaction layer. For the soft IP implementation,
you can use incremental compilation to lock down the placement and routing of the
IP Compiler for PCI Express with the Avalon-ST interface to preserve placement and
timing while changes are made to your application.
1 Incremental recompilation is not necessary for the PCI Express hard IP
implementation. This implementation is fixed. All signals in the hard IP
implementation are fully registered.
Avalon-MM Interface
IP Compiler for PCI Express variations generated in the Qsys design flow are PCI
Express Avalon-MM bridges: PCI Express endpoints with an Avalon-MM interface to
the application layer. The hard IP implementation of the PHYMAC and data link
layers communicates with a soft IP implementation of the transaction layer optimized
for the Avalon-MM protocol.
Figure 4–6 shows the block diagram of an IP Compiler for PCI Express with an
Avalon-MM interface.

Figure 4–6. IP Compiler for PCI Express with Avalon-MM Interface. The figure shows the transaction, data link, and physical layers of the IP Compiler for PCI Express with an Avalon-MM master port, an Avalon-MM slave port for control register access, and a further Avalon-MM slave port toward the application layer. Through these ports, the Qsys component controls the upstream PCI Express devices and access to internal control and status registers, while the root port controls the downstream Qsys component.
The PCI Express Avalon-MM bridge provides an interface between the PCI Express
transaction layer and other components across the system interconnect fabric.
Transaction Layer
The transaction layer sits between the application layer and the data link layer. It
generates and receives transaction layer packets. Figure 4–7 illustrates the transaction
layer of a component with two initialized virtual channels (VCs). The transaction
layer contains three general subblocks: the transmit datapath, the configuration space,
and the receive datapath, which are shown with vertical braces in Figure 4–7.
1 You can parameterize the Stratix IV GX IP core to include one or two virtual channels.
The Arria II GX and Cyclone IV GX implementations include a single virtual channel.
Tracing a transaction through the receive datapath includes the following steps:
1. The transaction layer receives a TLP from the data link layer.
2. The configuration space determines whether the transaction layer packet is well
formed and directs the packet to the appropriate virtual channel based on traffic
class (TC)/virtual channel (VC) mapping.
3. Within each virtual channel, transaction layer packets are stored in a specific part
of the receive buffer depending on the type of transaction (posted, non-posted, or
completion transaction).
4. The transaction layer packet FIFO block stores the address of the buffered
transaction layer packet.
5. The receive sequencing and reordering block shuffles the order of waiting
transaction layer packets as needed, fetches the address of the priority transaction
layer packet from the transaction layer packet FIFO block, and initiates the transfer
of the transaction layer packet to the application layer.
Figure 4–7. Architecture of the Transaction Layer: Dedicated Receive Buffer per Virtual Channel. Toward the data link layer, the transmit datapath provides, per virtual channel, TX data, descriptor, and control interfaces with request sequencing and flow control check and reordering, followed by virtual channel arbitration and TX sequencing. The configuration space implements the Type 0 configuration space. The receive datapath provides, per virtual channel, a receive buffer (with separate storage for posted and completion transactions and for non-posted transactions), a transaction layer packet FIFO, flow control updates, and RX sequencing and reordering toward the application layer, together with the RX and TX flow control credit interfaces.
Tracing a transaction through the transmit datapath involves the following steps:
1. The IP core informs the application layer that sufficient flow control credits exist
for a particular type of transaction. The IP core uses tx_cred[21:0] for the soft IP
implementation and tx_cred[35:0] for the hard IP implementation. The
application layer may choose to ignore this information.
2. The application layer requests a transaction layer packet transmission. The
application layer must provide the PCI Express transaction and must be prepared
to provide the entire data payload in consecutive cycles.
3. The IP core verifies that sufficient flow control credits exist, and acknowledges or
postpones the request.
4. The application layer forwards the transaction layer packet. The transaction layer
arbitrates among virtual channels, and then forwards the priority transaction layer
packet to the data link layer.
Transmit Virtual Channel Arbitration
For Stratix IV GX devices, the IP Compiler for PCI Express allows you to specify a
high and low priority virtual channel as specified in Chapter 6 of the PCI Express Base
Specification 1.0a, 1.1, or 2.0. You can use the settings on the Buffer Setup page,
accessible from the Parameter Settings tab, to specify the number of virtual channels.
Refer to “Buffer Setup Parameters” on page 3–16.
Configuration Space
The configuration space implements the following configuration registers and
associated functions:
■ Header Type 0 Configuration Space for Endpoints
■ Header Type 1 Configuration Space for Root Ports
The configuration space also generates all messages (PME#, INT, error, slot power
limit), MSI requests, and completion packets from configuration requests that flow in
the direction of the root complex, except slot power limit messages, which are
generated by a downstream port in the direction of the PCI Express link. All such
transactions are dependent upon the content of the PCI Express configuration space
as described in the PCI Express Base Specification 1.0a, 1.1, or 2.0.
f Refer to "Configuration Space Register Content" on page 6–1 or Chapter 7 in the PCI
Express Base Specification 1.0a, 1.1, or 2.0 for the complete content of these registers.
Data Link Layer
The data link layer is located between the transaction layer and the physical layer. It is
responsible for maintaining packet integrity and for communication (by data link
layer packet transmission) at the PCI Express link level (as opposed to component
communication by transaction layer packet transmission in the interconnect fabric).
The data link layer is responsible for the following functions:
■ Link management through the reception and transmission of data link layer
packets, which are used for the following functions:
■To initialize and update flow control credits for each virtual channel
■For power management of data link layer packet reception and transmission
■To transmit and receive ACK/NACK packets
■ Data integrity through generation and checking of CRCs for transaction layer
packets and data link layer packets
■ Transaction layer packet retransmission in case of NAK data link layer packet
reception using the retry buffer
■ Management of the retry buffer
■ Link retraining requests in case of error through the LTSSM of the physical layer

Figure 4–8 illustrates the architecture of the data link layer.

Figure 4–8. Data Link Layer. The figure shows the transmit datapath (transaction layer packet generator, retry buffer, DLLP generator, and TX arbitration toward the physical layer), the receive datapath (DLLP checker and transaction layer packet checker), and the data link control and management state machine with the power management function, connected to the configuration space and to the TX and RX flow control credits.
The data link layer has the following subblocks:
■ Data Link Control and Management State Machine—This state machine is
synchronized with the physical layer’s LTSSM state machine and is also connected
to the configuration space registers. It initializes the link and virtual channel flow
control credits and reports status to the configuration space. (Virtual channel 0 is
initialized by default, as is a second virtual channel if it has been physically
enabled and the software permits it.)
■ Power Management—This function handles the handshake to enter low power
mode. Such a transition is based on register values in the configuration space and
received PM DLLPs.
■ Data Link Layer Packet Generator and Checker—This block is associated with the
data link layer packet’s 16-bit CRC and maintains the integrity of transmitted
packets.
■ Transaction Layer Packet Generator—This block generates transaction layer packets,
generating a sequence number and a 32-bit CRC. The packets are also sent to the
retry buffer for internal storage. In retry mode, the transaction layer packet
generator receives the packets from the retry buffer and generates the CRC for the
transmit packet.
■ Retry Buffer—The retry buffer stores transaction layer packets and retransmits all
unacknowledged packets in the case of NAK DLLP reception. For ACK DLLP
reception, the retry buffer discards all acknowledged packets.
■ ACK/NAK Packets—The ACK/NAK block handles ACK/NAK data link layer
packets and generates the sequence number of transmitted packets.
■ Transaction Layer Packet Checker—This block checks the integrity of the received
transaction layer packet and generates a request for transmission of an ACK/NAK
data link layer packet.
■ TX Arbitration—This block arbitrates transactions, basing priority on the
following order:
1. Initialize FC data link layer packet
2. ACK/NAK data link layer packet (high priority)
3. Update FC data link layer packet (high priority)
4. PM data link layer packet
5. Retry buffer transaction layer packet
6. Transaction layer packet
7. Update FC data link layer packet (low priority)
8. ACK/NAK data link layer packet (low priority)
Physical Layer
The physical layer is the lowest level of the IP core. It is the layer closest to the link. It
encodes and transmits packets across a link and accepts and decodes received
packets. The physical layer connects to the link through a high-speed SERDES
interface running at 2.5 Gbps for Gen1 implementations and at 2.5 or 5.0 Gbps for
Gen2 implementations. Only the hard IP implementation supports the Gen2 rate of
5.0 Gbps.
The physical layer is responsible for the following actions:
■ Initializing the link
■ Scrambling and descrambling and 8B/10B encoding and decoding at 2.5 Gbps
(Gen1) or 5.0 Gbps (Gen2) per lane
■ Serializing and deserializing data
The hard IP implementation includes the following additional functionality:
Physical Layer Architecture

Figure 4–9 illustrates the physical layer architecture.

Figure 4–9. Physical Layer. The figure is divided into a MAC layer and a PHY layer (bracketed horizontally), between the data link layer and the link. The MAC layer contains the link serializers for an ×8 link, per-lane scramblers and descramblers, RX MAC lane blocks, the LTSSM state machine, SKIP generation, control and status, and multilane deskew. The PHY layer contains, per lane, the 8B10B encoder and decoder, an elastic buffer, and the device transceiver with a 2.5 or 5.0 Gbps SERDES and PLL, reached through the PIPE interface with PIPE emulation logic.
The physical layer is subdivided by the PIPE Interface Specification into two layers
(bracketed horizontally in Figure 4–9):
■ Media Access Controller (MAC) Layer—The MAC layer includes the Link
Training and Status state machine (LTSSM) and the scrambling/descrambling and
multilane deskew functions.
■ PHY Layer—The PHY layer includes the 8B/10B encode/decode functions, elastic
buffering, and serialization/deserialization functions.
The physical layer integrates both digital and analog elements. Intel designed the
PIPE interface to separate the MAC from the PHY. The IP core is compliant with the
PIPE interface, allowing integration with other PIPE-compliant external PHY devices.
Depending on the parameters you set in the parameter editor, the IP core can
automatically instantiate a complete PHY layer when targeting an Arria II GX, Arria
II GZ, Cyclone IV GX, HardCopy IV GX, Stratix II GX, or Stratix IV GX device.
The PHYMAC block is divided into four main sub-blocks:
■ MAC Lane—Both the receive and the transmit path use this block.
■On the receive side, the block decodes the physical layer packet (PLP) and
reports to the LTSSM the type of TS1/TS2 received and the number of TS1s
received since the LTSSM entered the current state. The LTSSM also reports the
reception of FTS, SKIP and IDL ordered sets and the reception of eight
consecutive D0.0 symbols.
■On the transmit side, the block multiplexes data from the data link layer and
the LTSTX sub-block. It also adds lane specific information, including the lane
number and the force PAD value when the LTSSM disables the lane during
initialization.
■ LTSSM—This block implements the LTSSM and logic that tracks what is received
and transmitted on each lane.
■For transmission, it interacts with each MAC lane sub-block and with the
LTSTX sub-block by asserting both global and per-lane control bits to generate
specific physical layer packets.
■On the receive path, it receives the PLPs reported by each MAC lane sub-block.
It also enables the multilane deskew block and the delay required before the TX
alignment sub-block can move to the recovery or low power state. A higher
layer can direct this block to move to the recovery, disable, hot reset or low
power states through a simple request/acknowledge protocol. This block
reports the physical layer status to higher layers.
■ LTSTX (Ordered Set and SKP Generation)—This sub-block generates the physical
layer packet (PLP). It receives control signals from the LTSSM block and generates
PLP for each lane of the core. It generates the same PLP for all lanes and PAD
symbols for the link or lane number in the corresponding TS1/TS2 fields.
The block also handles the receiver detection operation to the PCS sub-layer by
asserting predefined PIPE signals and waiting for the result. It also generates a
SKIP ordered set at every predefined timeslot and interacts with the TX alignment
block to prevent the insertion of a SKIP ordered set in the middle of a packet.
■ Deskew—This sub-block performs the multilane deskew function and the RX
alignment between the number of initialized lanes and the 64-bit data path.
The multilane deskew implements an eight-word FIFO for each lane to store
symbols. Each symbol includes eight data bits and one control bit. The FTS, COM,
and SKP symbols are discarded by the FIFO; the PAD and IDL are replaced by
D0.0 data. When all eight FIFOs contain data, a read can occur.
When the multilane deskew block is first enabled, each FIFO begins writing
after the first COM is detected. If all lanes have not detected a COM symbol after
seven clock cycles, the FIFOs are reset and the resynchronization process restarts;
otherwise, the RX alignment function recreates a 64-bit data word which is sent to
the data link layer.
Reverse Parallel Loopback
In Arria II GX, Arria II GZ, Cyclone IV GX, and Stratix IV GX devices, the IP
Compiler for PCI Express hard IP implementation supports a reverse parallel
loopback path you can use to test the IP Compiler for PCI Express endpoint link
implementation from a PCI Express root complex. When this path is enabled, data
that the IP Compiler for PCI Express endpoint receives on the PCI Express link passes
through the RX PMA and the word aligner and rate matching FIFO buffer in the RX
PCS as usual. From the rate matching FIFO buffer, it passes along both of the
following two paths:
■ The usual data path through the IP Compiler for PCI Express hard IP block.
■ A reverse parallel loopback path to the TX PMA block and out to the PCI Express
link. The input path to the TX PMA is gated by a multiplexor that controls whether
the TX PMA receives data from the TX PCS or from the reverse parallel loopback
path.
f For information about the reverse parallel loopback mode and an illustrative block
diagram, refer to “PCIe (Reverse Parallel Loopback)” in the Transceiver Architecture in
Arria II Devices chapter of the Arria II Device Handbook, “Reverse Parallel Loopback” in
the Cyclone IV Transceivers Architecture chapter of the Cyclone IV Device Handbook, or
“PCIe Reverse Parallel Loopback” in the Transceiver Architecture in Stratix IV Devices
chapter of the Stratix IV Device Handbook.
For information about configuring and using the reverse parallel loopback path for
testing, refer to “Link and Transceiver Testing” on page 17–3.
PCI Express Avalon-MM Bridge
The IP Compiler for PCI Express uses the IP Compiler for PCI Express Avalon-MM
bridge module to connect the PCI Express link to the system interconnect fabric. The
bridge facilitates the design of PCI Express endpoints that include Qsys components.
The full-featured PCI Express Avalon-MM bridge provides three possible Avalon-MM
ports: a bursting master, an optional bursting slave, and an optional non-bursting
slave. The PCI Express Avalon-MM bridge comprises the following three modules:
■ TX Slave Module—This optional bursting Avalon-MM slave port propagates read
and write requests of up to 4 KBytes in size from the system interconnect fabric to
the PCI Express link. The bridge translates requests from the interconnect fabric to
PCI Express request packets.
■ RX Master Module—This bursting Avalon-MM master port propagates PCI
Express requests, converting them to bursting read or write requests to the system
interconnect fabric.
■ Control Register Access (CRA) Slave Module—This optional, 32-bit Avalon-MM
dynamic addressing slave port provides access to internal control and status
registers from upstream PCI Express devices and external Avalon-MM masters.
Implementations that use MSI or dynamic address translation require this port.
Figure 4–10 shows the block diagram of a full-featured PCI Express Avalon-MM
bridge.
Figure 4–10. PCI Express Avalon-MM Bridge. The bridge spans the Avalon clock domain and the PCI Express clock domain, with clock domain crossing logic at the boundary. On the Avalon side, the CRA slave module (control register access slave, address translator, control and status registers, and the MSI or legacy interrupt generator), the TX slave module (Avalon-MM TX slave, TX read response, and address translator), and the RX master module (Avalon-MM RX master and RX read response) connect to the system interconnect fabric. On the PCI Express side, the PCI Express TX and RX controllers connect to the transaction, data link, and physical layers of the PCI link.
The PCI Express Avalon-MM bridge supports the following TLPs:
■ Memory write requests
■ Received downstream memory read requests of up to 512 bytes in size
■ Transmitted upstream memory read requests of up to 256 bytes in size
■ Completions
1 The PCI Express Avalon-MM bridge supports native PCI Express endpoints, but not
legacy PCI Express endpoints. Therefore, the bridge does not support I/O space BARs,
and I/O space requests cannot be generated.
The bridge has the following additional characteristics:
■Type 0 and Type 1 vendor-defined incoming messages are discarded
■Completion-to-a-flush request is generated, but not propagated to the system
interconnect fabric
Each PCI Express base address register (BAR) in the transaction layer maps to a
specific, fixed Avalon-MM address range. You can use separate BARs to map to
various Avalon-MM slaves connected to the RX Master port.
The following sections describe the modes of operation of the bridge:
■ Avalon-MM-to-PCI Express Write Requests
■ Avalon-MM-to-PCI Express Upstream Read Requests
■ PCI Express-to-Avalon-MM Read Completions
■ PCI Express-to-Avalon-MM Write Requests
■ PCI Express-to-Avalon-MM Downstream Read Requests
■ Avalon-MM-to-PCI Express Read Completions
■ PCI Express-to-Avalon-MM Address Translation
■ Avalon-MM-to-PCI Express Address Translation
■ Generation of PCI Express Interrupts
■ Generation of Avalon-MM Interrupts

Avalon-MM-to-PCI Express Write Requests
A Qsys-generated PCI Express Avalon-MM bridge accepts Avalon-MM burst write
requests with a burst size of up to 512 bytes.
The PCI Express Avalon-MM bridge converts the write requests to one or more PCI
Express write packets with 32– or 64–bit addresses based on the address translation
configuration, the request address, and the maximum payload size.
The Avalon-MM write requests can start on any address in the range defined in the
PCI Express address table parameters. The bridge splits incoming burst writes that
cross a 4 KByte boundary into at least two separate PCI Express packets. The bridge
also considers the root complex requirement for maximum payload on the PCI
Express side by further segmenting the packets if needed.
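As a rough illustration of these segmentation rules (a behavioral sketch, not the bridge's actual logic), the following C function splits one burst write into PCI Express packet lengths that respect both the 4 KByte boundary and a max_payload parameter:

```c
#include <stdint.h>
#include <stdio.h>

/* Split one Avalon-MM burst write into PCI Express packet lengths that
 * never cross a 4 KByte boundary and never exceed the maximum payload.
 * Illustrative sketch only. */
void split_burst_write(uint64_t addr, uint32_t len, uint32_t max_payload)
{
    while (len > 0) {
        uint32_t to_boundary = 4096 - (uint32_t)(addr & 0xFFF); /* bytes left in this 4 KB region */
        uint32_t chunk = len;
        if (chunk > to_boundary) chunk = to_boundary;
        if (chunk > max_payload) chunk = max_payload;
        printf("TLP: addr=0x%llx len=%u\n", (unsigned long long)addr, chunk);
        addr += chunk;
        len  -= chunk;
    }
}
```

For example, a 512-byte burst starting at address 0xF80 with a 256-byte maximum payload yields packets of 128, 256, and 128 bytes.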
The bridge requires Avalon-MM write requests with a burst count of greater than one
to adhere to the following byte enable rules:
■ The Avalon-MM byte enable must be asserted in the first qword of the burst.
■ All subsequent byte enables must be asserted until the deasserting byte enable.
■ The Avalon-MM byte enable may deassert, but only in the last qword of the burst.
1 To improve PCI Express throughput, Altera recommends using an Avalon-MM burst
master without any byte-enable restrictions.
Avalon-MM-to-PCI Express Upstream Read Requests
The PCI Express Avalon-MM bridge converts read requests from the system
interconnect fabric to PCI Express read requests with 32-bit or 64-bit addresses based
on the address translation configuration, the request address, and the maximum read
size.
The Avalon-MM TX slave interface of a Qsys-generated PCI Express Avalon-MM
bridge can receive read requests with burst sizes of up to 512 bytes sent to any
address. However, the bridge limits read requests sent to the PCI Express link to a
maximum of 256 bytes. Additionally, the bridge must prevent each PCI Express read
request packet from crossing a 4 KByte address boundary. Therefore, the bridge may
split an Avalon-MM read request into multiple PCI Express read packets based on the
address and the size of the read request.
For Avalon-MM read requests with a burst count greater than one, all byte enables
must be asserted. There are no restrictions on byte enable for Avalon-MM read
requests with a burst count of one. An invalid Avalon-MM request can adversely
affect system functionality, resulting in a completion with abort status set. An
example of an invalid request is one with an incorrect address.
PCI Express-to-Avalon-MM Read Completions
The PCI Express Avalon-MM bridge returns read completion packets to the initiating
Avalon-MM master in the order in which the requests were issued; it can accept
multiple outstanding requests and out-of-order completion packets from the link.
PCI Express-to-Avalon-MM Write Requests
When the PCI Express Avalon-MM bridge receives PCI Express write requests, it
converts them to burst write requests before sending them to the system interconnect
fabric. The bridge translates the PCI Express address to the Avalon-MM address space
based on the BAR hit information and on address translation table values configured
during the IP core parameterization. Malformed write packets are dropped, and
therefore do not appear on the Avalon-MM interface.
For downstream write and read requests, if more than one byte enable is asserted, the
byte lanes must be adjacent. In addition, the byte enables must be aligned to the size
of the read or write request.
PCI Express-to-Avalon-MM Downstream Read Requests
The PCI Express Avalon-MM bridge sends PCI Express read packets to the system
interconnect fabric as burst reads with a maximum burst size of 512 bytes. The bridge
converts the PCI Express address to the Avalon-MM address space based on the BAR
hit information and address translation lookup table values. The address translation
lookup table values are user configurable. Unsupported read requests generate a
completer abort response.
1 IP Compiler for PCI Express variations using the Avalon-ST interface can handle burst
reads up to the specified Maximum Payload Size.
As an example, Table 4–2 lists the byte enables for 32-bit data.
Table 4–2. Valid Byte Enable Configurations
Byte Enable Value   Description
4’b1111             Write full 32 bits
4’b0011             Write the lower 2 bytes
4’b1100             Write the upper 2 bytes
4’b0001             Write byte 0 only
4’b0010             Write byte 1 only
4’b0100             Write byte 2 only
4’b1000             Write byte 3 only
In burst mode, the IP Compiler for PCI Express supports only byte enable values that
correspond to a contiguous data burst. For the 32-bit data width example, valid values
in the first data phase are 4’b1111, 4’b1100, and 4’b1000, and valid values in the final
data phase of the burst are 4’b1111, 4’b0011, and 4’b0001. Intermediate data phases in
the burst can only have byte enable value 4’b1111.
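A small C helper can make these rules concrete. The following sketch (an illustration, not part of the IP core) validates the byte enables for one data phase of a 32-bit burst:

```c
#include <stdbool.h>
#include <stdint.h>

/* Check byte enables for one data phase of a 32-bit burst.
 * first/last flag the position of the phase within the burst.
 * Illustrative sketch of the contiguity rules around Table 4-2. */
bool byte_enable_valid(uint8_t be, bool first, bool last)
{
    if (first && last)          /* single-phase access: any Table 4-2 value is contiguous */
        return be == 0xF || be == 0x3 || be == 0xC ||
               be == 0x1 || be == 0x2 || be == 0x4 || be == 0x8;
    if (first)                  /* first phase may omit only the lower lanes */
        return be == 0xF || be == 0xC || be == 0x8;
    if (last)                   /* final phase may omit only the upper lanes */
        return be == 0xF || be == 0x3 || be == 0x1;
    return be == 0xF;           /* intermediate phases must be full */
}
```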
Avalon-MM-to-PCI Express Read Completions
The PCI Express Avalon-MM bridge converts read response data from the external
Avalon-MM slave to PCI Express completion packets and sends them to the
transaction layer.
A single read request may produce multiple completion packets based on the
Maximum Payload Size and the size of the received read request. For example, if the
read is 512 bytes but the Maximum Payload Size is 128 bytes, the bridge produces four
completion packets of 128 bytes each. The bridge does not generate out-of-order
completions. You can specify the Maximum Payload Size parameter on the Buffer
Setup page of the IP Compiler for PCI Express parameter editor. Refer to “Buffer
Setup Parameters” on page 3–16.
PCI Express-to-Avalon-MM Address Translation
The PCI Express address of a received request packet is translated to the Avalon-MM
address before the request is sent to the system interconnect fabric. This address
translation proceeds by replacing the MSB bits of the PCI Express address with the
value from a specific translation table entry; the LSB bits remain unchanged. The
number of MSB bits to replace is calculated from the total memory allocation of all
Avalon-MM slaves connected to the RX Master Module port. Six possible address
translation entries in the address translation table are configurable manually by Qsys.
Each entry corresponds to a PCI Express BAR. The BAR hit information from the
request header determines the entry that is used for address translation.
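A behavioral C sketch of this MSB-replacement step might look as follows; the six-entry table comes from the description above, while the function itself is purely illustrative:

```c
#include <stdint.h>

/* Translate a PCI Express address to an Avalon-MM address by replacing
 * the MSBs with the translation table entry selected by the BAR hit.
 * n is the number of low-order pass-through bits (BAR specific, n < 64).
 * Illustrative sketch only. */
uint64_t pcie_to_avmm(uint64_t pcie_addr, unsigned bar_hit,
                      const uint64_t table[6], unsigned n)
{
    uint64_t low_mask = (1ULL << n) - 1;                       /* low bits pass through */
    return (table[bar_hit] & ~low_mask) | (pcie_addr & low_mask);
}
```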
Figure 4–11. PCI Express-to-Avalon-MM Address Translation. Inside the IP Compiler for PCI Express, the matched BAR selects the Avalon-MM address; the low address bits (a BAR-specific number of bits) remain unchanged.
(1) N is the number of pass-through bits (BAR specific). M is the number of Avalon-MM address bits. P is the number of PCI Express address bits (32 or 64).
The Avalon-MM RX master module port has an 8-byte datapath. The Qsys
interconnect fabric does not support native addressing. Instead, it supports dynamic
bus sizing. In this method, the interconnect fabric handles mismatched port widths
transparently.
f For more information about both native addressing and dynamic bus sizing, refer to
the “Address Alignment” section in the “Avalon Memory-Mapped Interfaces”
chapter of the Avalon Interface Specifications.
Avalon-MM-to-PCI Express Address Translation
The Avalon-MM address of a received request on the TX Slave Module port is
translated to the PCI Express address before the request packet is sent to the
transaction layer. This address translation process proceeds by replacing the MSB bits
of the Avalon-MM address with the value from a specific translation table entry; the
LSB bits remain unchanged. The number of MSB bits to be replaced is calculated
based on the total address space of the upstream PCI Express devices that the IP
Compiler for PCI Express can access.
The address translation table contains up to 512 possible address translation entries
that you can configure. Each entry corresponds to a base address of the PCI Express
memory segment of a specific size. The segment size of each entry must be identical.
The total size of all the memory segments is used to determine the number of address
MSB bits to be replaced. In addition, each entry has a 2-bit field,
Sp[1:0], that specifies 32-bit or 64-bit PCI Express addressing for the translated address. Refer to
Figure 4–12 on page 4–24. The most significant bits of the Avalon-MM address are
used by the system interconnect fabric to select the slave port and are not available to
the slave. The next most significant bits of the Avalon-MM address index the address
translation entry to be used for the translation process of MSB replacement.
For example, if the core is configured with an address translation table with the
following attributes:
■ Number of Address Pages—16
■ Size of Address Pages—1MByte
■ PCI Express Address Size—64 bits
then the values in Figure 4–12 are:
■ N = 20 (due to the 1 MByte page size)
■ Q = 16 (number of pages)
■ M = 24 (20 + 4 bit page selection)
■ P = 64
In this case, the Avalon address is interpreted as follows:
■ Bits [31:24] select the TX slave module port from among other slaves connected to
the same master by the system interconnect fabric. The decode is based on the base
addresses assigned in Qsys.
■ Bits [23:20] select the address translation table entry.
■ Bits [63:20] of the address translation table entry become PCI Express address bits
[63:20].
■ Bits [19:0] are passed through and become PCI Express address bits [19:0].
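Using the example values above (N = 20 pass-through bits, Q = 16 pages), the translation can be modeled in C as follows; the table contents and the function are illustrative only:

```c
#include <stdint.h>

#define N_PASS  20   /* pass-through bits (1 MByte pages)  */
#define Q_PAGES 16   /* address translation table entries  */

/* Translate an Avalon-MM TX slave address to a 64-bit PCI Express address
 * for the example configuration above. Bits [23:20] index the table;
 * bits [19:0] pass through. Illustrative sketch only. */
uint64_t avmm_to_pcie(uint32_t avmm_addr, const uint64_t table[Q_PAGES])
{
    uint32_t entry = (avmm_addr >> N_PASS) & (Q_PAGES - 1);     /* Avalon-MM bits [23:20] */
    uint64_t low   = avmm_addr & ((1u << N_PASS) - 1);          /* Avalon-MM bits [19:0]  */
    return (table[entry] & ~(((uint64_t)1 << N_PASS) - 1)) | low; /* entry bits [63:20] + low */
}
```

Bits [31:24] of the Avalon-MM address are consumed by the system interconnect fabric to select the slave port, so the function masks the table index to the Q_PAGES entries.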
The address translation table can be hardwired or dynamically configured at run
time. When the IP core is parameterized for dynamic address translation, the address
translation table is implemented in memory and can be accessed through the CRA
slave module. This access mode is useful in a typical PCI Express system where
address allocation occurs after BIOS initialization.
For more information about how to access the dynamic address translation table
through the control register access slave, refer to the “Avalon-MM-to-PCI Express
Address Translation Table” on page 6–9.
Figure 4–12 depicts the Avalon-MM-to-PCI Express address translation process.

Figure 4–12. Avalon-MM-to-PCI Express Address Translation. The high Avalon-MM address bits above the slave base address index the address translation table (Q entries by P–N bits wide, updated from the control register port). The PCI Express address from the selected table entry becomes the high PCI Express address bits, the low address bits pass through unchanged, and each entry carries a space indication, Sp[1:0].
(1) N is the number of pass-through bits.
(2) M is the number of Avalon-MM address bits.
(3) P is the number of PCI Express address bits.
(4) Q is the number of translation table entries.
(5) Sp[1:0] is the space indication for each entry.
Generation of PCI Express Interrupts
The PCI Express Avalon-MM bridge supports MSI or legacy interrupts. The completer
only, single dword variant includes an interrupt generation module. For other
variants with the Avalon-MM interface, interrupt support requires instantiation of the
CRA slave module where the interrupt registers and control logic are implemented.
The Qsys-generated PCI Express Avalon-MM bridge supports the Avalon-MM
individual requests interrupt scheme: multiple input signals indicate incoming
interrupt requests, and software must determine priorities for servicing simultaneous
interrupts the IP Compiler for PCI Express receives on the Avalon-MM interface.
In the Qsys-generated IP Compiler for PCI Express, the RX master module port has as
many as 16 Avalon-MM interrupt input signals (RXmirq_irq[<n>:0], where <n> ≤ 15).
Each interrupt signal indicates a distinct interrupt source. Assertion of any of these
signals, or a PCI Express mailbox register write access, sets a bit in the PCI Express
interrupt status register. Multiple bits can be set at the same time; software determines
priorities for servicing simultaneous incoming interrupt requests. Each set bit in the
PCI Express interrupt status register generates a PCI Express interrupt, if enabled,
when software determines its turn.
Software can enable the individual interrupts by writing to the IP Compiler for PCI
Express “Avalon-MM to PCI Express Interrupt Enable Register Address: 0x0050” on
page 6–8 through the CRA slave.
In Qsys-generated systems, when any interrupt input signal is asserted, the
corresponding bit is written in the “Avalon-MM to PCI Express Interrupt Status
Register Address: 0x0040” on page 6–7. Software reads this register and decides
priority on servicing requested interrupts.
After servicing the interrupt, software must clear the appropriate serviced interrupt
status
bit and ensure that no other interrupts are pending. For interrupts caused by
“Avalon-MM to PCI Express Interrupt Status Register Address: 0x0040” mailbox
writes, the status bits should be cleared in the “Avalon-MM to PCI Express Interrupt
Status Register Address: 0x0040”. For interrupts due to the incoming interrupt
signals on the Avalon-MM interface, the interrupt status should be cleared in the
Avalon-MM component that sourced the interrupt. This sequence prevents interrupt
requests from being lost during interrupt servicing.
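A hypothetical C service routine for this sequence might look as follows. The cra_read32()/cra_write32() accessors and the write-one-to-clear status semantics are assumptions for illustration; only the register offsets (0x0040 and 0x0050) come from the text above.

```c
#include <stdint.h>

#define A2P_INT_STATUS 0x0040  /* Avalon-MM to PCI Express Interrupt Status register */
#define A2P_INT_ENABLE 0x0050  /* Avalon-MM to PCI Express Interrupt Enable register */

/* Hypothetical accessors for the CRA slave port. */
extern uint32_t cra_read32(uint32_t offset);
extern void     cra_write32(uint32_t offset, uint32_t value);
extern void     service_source(int bit);   /* application-specific handler */

/* Enable the individual interrupt sources software wants to observe. */
void pcie_irq_setup(uint32_t enable_mask)
{
    cra_write32(A2P_INT_ENABLE, enable_mask);
}

/* Service pending interrupts in software-decided priority order. */
void pcie_irq_service(void)
{
    uint32_t status = cra_read32(A2P_INT_STATUS);
    for (int bit = 0; bit < 32; bit++) {
        if (status & (1u << bit)) {
            service_source(bit);
            /* Assumed write-one-to-clear semantics for mailbox-caused bits;
             * interrupts from Avalon-MM inputs must instead be cleared in
             * the component that sourced them. */
            cra_write32(A2P_INT_STATUS, 1u << bit);
        }
    }
    /* Re-read to ensure no further interrupts are pending before returning. */
    (void)cra_read32(A2P_INT_STATUS);
}
```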
Figure 4–13 shows the logic for the entire PCI Express interrupt generation process.

Figure 4–13. IP Compiler for PCI Express Avalon-MM Interrupt Propagation to the PCI Express Link. The Avalon-MM-to-PCI-Express interrupt status and interrupt enable register bits (A2P_MAILBOX_INT7 through A2P_MAILBOX_INT0, gated by A2P_MB_IRQ7 through A2P_MB_IRQ0, and AVL_IRQ gated by AV_IRQ_ASSERTED) combine into a single interrupt request. When the MSI Enable bit (Configuration Space Message Control Register[0]) is set, the request generates an MSI; otherwise, gated by the Interrupt Disable bit (Configuration Space Command Register[10]), it drives PCI Express virtual INTA signalling: an Assert_INTA message is sent when the signal rises, and a Deassert_INTA message is sent when it falls.
The PCI Express Avalon-MM bridge selects either MSI or legacy interrupts
automatically based on the standard interrupt controls in the PCI Express
configuration space registers. The Interrupt Disable bit, which is bit 10 of the
Command register (at configuration space offset 0x4), can be used to disable legacy
interrupts. The MSI Enable bit, which is bit 0 of the MSI Control Status register in the
MSI capability register (bit 16 at configuration space offset 0x50), can be used to
enable MSI interrupts.
August 2014 Altera CorporationIP Compiler for PCI Express User Guide
4–26Chapter 4: IP Core Architecture
Completer Only PCI Express Endpoint Single DWord
Only one type of interrupt can be enabled at a time. However, to change the selection
of MSI or legacy interrupts during operation, software must ensure that no interrupt
request is dropped. Therefore, software must first enable the new selection and then
disable the old selection. To set up legacy interrupts, software must first clear the
Interrupt Disable bit and then clear the MSI enable bit. To set up MSI interrupts,
software must first set the MSI enable bit and then set the Interrupt Disable bit.
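The following C sketch shows both switch sequences, assuming hypothetical cfg_set_bits()/cfg_clear_bits() helpers; the register offsets and bit positions are those cited above (Command register offset 0x4 with Interrupt Disable at bit 10, and MSI capability offset 0x50 with MSI Enable at bit 16):

```c
#include <stdint.h>

#define CFG_COMMAND   0x04   /* Command register; Interrupt Disable is bit 10 */
#define CFG_MSI_CTRL  0x50   /* MSI capability; MSI Enable is bit 16          */
#define INT_DISABLE   (1u << 10)
#define MSI_ENABLE    (1u << 16)

/* Hypothetical read-modify-write helpers for the configuration space. */
extern void cfg_set_bits(uint32_t offset, uint32_t bits);
extern void cfg_clear_bits(uint32_t offset, uint32_t bits);

/* Enable the new selection first, then disable the old one. */
void select_legacy_interrupts(void)
{
    cfg_clear_bits(CFG_COMMAND,  INT_DISABLE);  /* enable legacy (INTx) */
    cfg_clear_bits(CFG_MSI_CTRL, MSI_ENABLE);   /* then disable MSI     */
}

void select_msi_interrupts(void)
{
    cfg_set_bits(CFG_MSI_CTRL, MSI_ENABLE);     /* enable MSI           */
    cfg_set_bits(CFG_COMMAND,  INT_DISABLE);    /* then disable legacy  */
}
```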
Generation of Avalon-MM Interrupts
Generation of Avalon-MM interrupts requires the instantiation of the CRA slave
module where the interrupt registers and control logic are implemented. The CRA
slave port has an Avalon-MM Interrupt (CraIrq_o, or CraIrq_irq in Qsys systems)
output signal. A write access to an Avalon-MM mailbox register sets one of the
P2A_MAILBOX_INT<n> bits in the “PCI Express to Avalon-MM Interrupt Status
Register Address: 0x3060” on page 6–11 and asserts the CraIrq_o or CraIrq_irq
output, if enabled. Software can enable the interrupt by writing to the “PCI Express to
Avalon-MM Interrupt Enable Register Address: 0x3070” on page 6–11 through the
CRA slave. After servicing the interrupt, software must clear the appropriate serviced
interrupt status bit in the PCI-Express-to-Avalon-MM Interrupt Status register and
ensure that there is no other interrupt pending.
Completer Only PCI Express Endpoint Single DWord
The completer only single dword endpoint is intended for applications that use the
PCI Express protocol to perform simple read and write register accesses from a host
CPU. The completer only single dword endpoint is a hard IP implementation
available for Qsys systems, and includes an Avalon-MM interface to the application
layer. The Avalon-MM interface connection in this variation is 32 bits wide. This
endpoint is not pipelined; at any time a single request can be outstanding.
The completer-only single dword endpoint supports the following requests:
■ Read and write requests of a single dword (32 bits) from the root complex
■ Completion with completer abort status generation for other types of non-posted
requests
■ INTX or MSI support with one Avalon-MM interrupt source
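From the host side, accesses to this endpoint are ordinary 32-bit memory reads and writes to a BAR. The following minimal Linux user-space sketch is illustrative only; the device path and register offset are hypothetical:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Illustrative path: BAR0 resource file of the endpoint. */
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
    volatile uint32_t *bar = (volatile uint32_t *)p;

    bar[0] = 0xDEADBEEF;                  /* single-dword write to offset 0x0 */
    printf("reg0 = 0x%08x\n", bar[0]);    /* single-dword read back           */

    munmap(p, 4096);
    close(fd);
    return 0;
}
```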
In this configuration, the IP Compiler for PCI Express links to a PCI Express root
complex. A bridge component in the IP Compiler for PCI Express includes IP
Compiler for PCI Express TX and RX blocks, an Avalon-MM RX master, and an
interrupt handler. The bridge connects to the FPGA fabric using an Avalon-MM
interface. The following sections provide an overview of each block in the bridge.
IP Compiler for PCI Express RX Block
The IP Compiler for PCI Express RX control logic interfaces to the hard IP block to
process requests from the root complex. It supports memory reads and writes of a
single dword. It generates a completion with Completer Abort (CA) status for reads
greater than four bytes and discards all write data without further action for write
requests greater than four bytes.
The RX block passes header information to the Avalon-MM master, which generates
the corresponding transaction to the Avalon-MM interface. The bridge accepts no
additional requests while a request is being processed. While processing a read
request, the RX block deasserts the ready signal until the TX block sends the
corresponding completion packet to the hard IP block. While processing a write
request, the RX block sends the request to the Avalon-MM system interconnect fabric
before accepting the next request.
Avalon-MM RX Master Block
The 32-bit Avalon-MM master connects to the Avalon-MM system interconnect fabric.
It drives read and write requests to the connected Avalon-MM slaves, performing the
required address translation. The RX master supports all legal combinations of byte
enables for both read and write requests.
f For more information about legal combinations of byte enables, refer to Chapter 3,
Avalon Memory-Mapped Interfaces in the Avalon Interface Specifications.
IP Compiler for PCI Express TX Block
The TX block sends completion information to the IP Compiler for PCI Express hard
IP block. The IP core then sends this information to the root complex. The TX
completion block generates a completion packet with Completer Abort (CA) status
and no completion data for unsupported requests. The TX completion block also
supports the zero-length read (flush) command.
Interrupt Handler Block
The interrupt handler implements both INTX and MSI interrupts. The msi_enable bit
in the configuration register specifies the interrupt type. The msi_enable bit is part of
the MSI message control portion in the MSI Capability structure; it is bit[16] of 0x050
in the configuration space registers. If the msi_enable bit is on, an MSI request is sent
to the IP Compiler for PCI Express when an interrupt is received; otherwise INTX is
signaled. The interrupt handler block supports a single interrupt source, so that
software may assume the source. You can disable interrupts by leaving the interrupt
signals unconnected in the IRQ column of Qsys.

When the MSI registers in the configuration space of the completer only single dword
IP Compiler for PCI Express are updated, there is a delay before this information is
propagated to the Bridge module. You must allow time for the Bridge module to
update the MSI register information. Under normal operation, initialization of the
MSI registers should occur substantially before any interrupt is generated. However,
failure to wait until the update completes may result in any of the following
behaviors:
■ Sending a legacy interrupt instead of an MSI interrupt
■ Sending an MSI interrupt instead of a legacy interrupt
■ Loss of an interrupt request
5. IP Core Interfaces

This chapter describes the signals that are part of the IP Compiler for PCI Express for
each of the following primary configurations:
■ Signals in the Hard IP Implementation Root Port with Avalon-ST Interface
■ Signals in the Hard IP Implementation Endpoint with Avalon-ST Interface
■ Signals in the Soft IP Implementation with Avalon-ST Interface
■ Signals in the Soft or Hard Full-Featured IP Core with Avalon-MM Interface
■ Signals in the Qsys Hard Full-Featured IP Core with Avalon-MM Interface
■ Signals in the Completer-Only, Single Dword, IP Core with Avalon-MM Interface
■ Signals in the Qsys Completer-Only, Single Dword, IP Core with Avalon-MM
Interface

1 Altera does not recommend the Descriptor/Data interface for new designs.
Avalon-ST Interface
The main functional differences between the hard IP and soft IP implementations
using an Avalon-ST interface are the configuration and clocking schemes. In addition,
the hard IP implementation offers a 128-bit Avalon-ST bus for some configurations. In
128-bit mode, the streaming interface clock, pld_clk, is one-half the frequency of the
core clock, core_clk, and the streaming data width is 128 bits. In 64-bit mode, the
streaming interface clock, pld_clk, is the same frequency as the core clock, core_clk,
and the streaming data width is 64 bits.

Figure 5–1, Figure 5–2, and Figure 5–3 illustrate the top-level signals for IP cores that
use the Avalon-ST interface.
Figure 5–1. Signals in the Hard IP Implementation Root Port with Avalon-ST Interface
Notes to Figure 5–1:
(1) Available in Arria II GX, Arria II GZ, Cyclone IV GX, and Stratix IV GX devices. For Stratix IV GX devices, <n> = 16 for ×1 and ×4 IP cores and <n> = 33 in the ×8 IP core.
(2) Available in Arria II GX, Arria II GZ, Cyclone IV GX, and Stratix IV GX devices. For Stratix IV GX reconfig_togxb, <n> = 3.

Figure 5–2. Signals in the Hard IP Implementation Endpoint with Avalon-ST Interface
Notes to Figure 5–2:
(1) Available in Stratix IV GX devices. For Stratix IV GX devices, <n> = 16 for ×1 and ×4 IP cores and <n> = 33 in the ×8 IP core.
(2) Available in Stratix IV GX devices. For Stratix IV GX reconfig_togxb, <n> = 3.
Figure 5–3. Signals in the Soft IP Implementation with Avalon-ST Interface
Notes to Figure 5–3:
(1) Available in Stratix IV GX devices. For Stratix IV GX devices, <n> = 16 for ×1 and ×4 IP cores and <n> = 33 in the ×8 IP core.
(2) Available in Stratix IV GX devices. For Stratix IV GX reconfig_togxb, <n> = 3.
Table 5–1 lists the interfaces of both the hard IP and soft IP implementations with
links to the subsequent sections that describe each interface.
Table 5–1. Signal Groups in the IP Compiler for PCI Express with Avalon-ST Interface

Signal Group | Hard IP Endpoint | Hard IP Root Port | Soft IP | Description

Logical
Avalon-ST RX | v | v | v | "64- or 128-Bit Avalon-ST RX Port" on page 5–6
Avalon-ST TX | v | v | v | "64- or 128-Bit Avalon-ST TX Port" on page 5–15
Clock | v | v | — | "Clock Signals—Hard IP Implementation" on page 5–23
Clock | — | — | v | "Clock Signals—Soft IP Implementation" on page 5–23
Reset and link training | v | v | v | "Reset and Link Training Signals" on page 5–24
ECC error | v | v | — | "ECC Error Signals" on page 5–27
Interrupt | v | — | v | "PCI Express Interrupts for Endpoints" on page 5–27
Interrupt and global error | — | v | — | "PCI Express Interrupts for Root Ports" on page 5–29
Configuration space | v | v | — | "Configuration Space Signals—Hard IP Implementation" on page 5–29
Configuration space | — | — | v | "Configuration Space Signals—Soft IP Implementation" on page 5–36
LMI | v | v | — | "LMI Signals—Hard IP Implementation" on page 5–37
PCI Express reconfiguration block | v | v | — | "IP Core Reconfiguration Block Signals—Hard IP Implementation" on page 5–38
Power management | v | v | v | "Power Management Signals" on page 5–39
Completion | v | v | v | "Completion Side Band Signals" on page 5–41

Physical
Transceiver control | v | v | v | "Transceiver Control Signals" on page 5–53
Serial | v | v | v | "Serial Interface Signals" on page 5–55
PIPE | (1) | (1) | v | "PIPE Interface Signals" on page 5–56

Test
Test | v | v | — | "Test Interface Signals—Hard IP Implementation" on page 5–59
Test | — | — | v | "Test Interface Signals—Soft IP Implementation" on page 5–61

Note to Table 5–1:
(1) Provided for simulation only.
64- or 128-Bit Avalon-ST RX Port
Table 5–2 describes the signals that comprise the Avalon-ST RX datapath.
Table 5–2. 64- or 128-Bit Avalon-ST RX Datapath (Part 1 of 3)

rx_st_ready<n> (1)(2) | Width: 1 | Dir: I | Avalon-ST type: ready
Indicates that the application is ready to accept data. The application deasserts this signal to throttle the data stream.

rx_st_valid<n> (2) | Width: 1 | Dir: O | Avalon-ST type: valid
Clocks rx_st_data<n> into the application. Deasserts within 3 clocks of rx_st_ready<n> deassertion and reasserts within 3 clocks of rx_st_ready<n> assertion if more data is available to send. rx_st_valid can be deasserted between the rx_st_sop and rx_st_eop even if rx_st_ready is asserted. Refer to Figure 5–15 for the timing.

rx_st_data<n> | Width: 64, 128 | Dir: O | Avalon-ST type: data
Receive data bus. Refer to Figure 5–5 through Figure 5–13 for the mapping of the transaction layer's TLP information to rx_st_data. Note that the position of the first payload dword depends on whether the TLP address is qword aligned. The mapping of message TLPs is the same as the mapping of transaction layer TLPs with 4 dword headers. When using a 64-bit Avalon-ST bus, the width of rx_st_data<n> is 64. When using a 128-bit Avalon-ST bus, the width of rx_st_data<n> is 128.

rx_st_sop<n> | Width: 1 | Dir: O | Avalon-ST type: start of packet
When asserted with rx_st_valid<n>, indicates that this is the first cycle of the TLP.

rx_st_eop<n> | Width: 1 | Dir: O | Avalon-ST type: end of packet
When asserted with rx_st_valid<n>, indicates that this is the final cycle of the TLP.

rx_st_empty<n> | Width: 1 | Dir: O | Avalon-ST type: empty
Indicates that the TLP ends in the lower 64 bits of rx_st_data. Valid only when rx_st_eop<n> is asserted. This signal applies only to 128-bit mode in the hard IP implementation. When rx_st_eop<n> is asserted and rx_st_empty<n> has value 1, rx_st_data[63:0] holds valid data but rx_st_data[127:64] does not hold valid data. When rx_st_eop<n> is asserted and rx_st_empty<n> has value 0, rx_st_data[127:0] holds valid data.
Table 5–2. 64- or 128-Bit Avalon-ST RX Datapath (Part 2 of 3)

rx_st_err<n> | Width: 1 | Dir: O | Avalon-ST type: error
Indicates that there is an uncorrectable error correction coding (ECC) error in the core's internal RX buffer of the associated VC. This signal is only active for the hard IP implementations when ECC is enabled. ECC is automatically enabled by the Quartus II assembler in memory blocks, the retry buffer, and the RX buffer for all hard IP variants with the exception of Gen2 ×8. ECC corrects single-bit errors and detects double-bit errors on a per-byte basis.
When an uncorrectable ECC error is detected, rx_st_err is asserted for at least 1 cycle while rx_st_valid is asserted. If the error occurs before the end of a TLP payload, the packet may be terminated early with an rx_st_eop and with rx_st_valid deasserted on the cycle after the eop.
Altera recommends resetting the IP Compiler for PCI Express when an uncorrectable (double-bit) ECC error is detected and the TLP cannot be terminated early. Resetting guarantees that the Configuration Space Registers are not corrupted by an errant packet.
This signal is not available for the hard IP implementation in Arria II GX devices.

Component Specific Signals

rx_st_mask<n> | Width: 1 | Dir: I | Avalon-ST type: component specific
The application asserts this signal to tell the IP core to stop sending non-posted requests. This signal does not affect non-posted requests that have already been transferred from the transaction layer to the Avalon-ST Adaptor module. This signal can be asserted at any time. The total number of non-posted requests that can be transferred to the application after rx_st_mask is asserted is not more than 26 for 128-bit mode and not more than 14 for 64-bit mode.
Do not design your application layer logic so that rx_st_mask remains asserted until certain posted requests or completions are received. To function correctly, rx_st_mask must eventually be deasserted without waiting for posted requests or completions.

rx_st_bardec<n> | Width: 8 | Dir: O | Avalon-ST type: component specific
The decoded BAR bits for the TLP. They correspond to the transaction layer's rx_desc[135:128]. Valid for MRd, MWr, IOWR, and IORD TLPs; ignored for the CPL or message TLPs. They are valid on the 2nd cycle of rx_st_data<n> for a 64-bit datapath. For a 128-bit datapath, rx_st_bardec<n> is valid on the first cycle. Figure 5–8 and Figure 5–10 illustrate the timing of this signal for 64- and 128-bit data, respectively.
Table 5–2. 64- or 128-Bit Avalon-ST RX Datapath (Part 3 of 3)

rx_st_be<n> | Width: 8, 16 | Dir: O | Avalon-ST type: component specific
These are the byte enables corresponding to the transaction layer's rx_be. The byte enables apply only to PCI Express TLP payload fields. When using a 64-bit Avalon-ST bus, the width of rx_st_be<n> is 8. When using a 128-bit Avalon-ST bus, the width of rx_st_be<n> is 16. You can derive the same information by decoding the first and last byte enable fields in the TLP header. The correspondence between byte enables and data is as follows when the data is aligned: each bit of rx_st_be<n> qualifies the corresponding byte lane of rx_st_data<n>, so that rx_st_be[0] corresponds to rx_st_data[7:0], rx_st_be[1] to rx_st_data[15:8], and so on.

Notes to Table 5–2:
(1) In Stratix IV GX devices, <n> is the virtual channel number, which can be 0 or 1.
(2) The RX interface supports a readyLatency of 2 cycles for the hard IP implementation and 3 cycles for the soft IP implementation.
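Given this readyLatency and the rx_st_valid behavior in Table 5–2 (valid may persist for up to 3 clocks after rx_st_ready deasserts), the application must keep capturing beats after it throttles. The following Verilog sketch shows one way an application-side front end might handle this; it is a minimal illustration, not part of the IP core: the module name, the buffer, and the HEADROOM threshold are all assumptions.

// Hypothetical application-side front end for the 64-bit Avalon-ST RX port.
// The rx_st_* names follow Table 5-2; everything else is illustrative.
module rx_frontend #(
    parameter DEPTH    = 64,  // buffer depth in beats (illustrative)
    parameter HEADROOM = 4    // >= beats that can arrive after throttling
) (
    input  wire        pld_clk,
    input  wire        rst_n,
    output wire        rx_st_ready,
    input  wire        rx_st_valid,
    input  wire [63:0] rx_st_data,
    input  wire        rx_st_sop,
    input  wire        rx_st_eop,
    input  wire        app_rd      // downstream consumer pops one beat
);
    reg [6:0] level;  // current buffer fill level (0..DEPTH)

    // Throttle early enough that beats still in flight fit in the buffer.
    assign rx_st_ready = (level <= DEPTH - HEADROOM);

    // Every beat presented with rx_st_valid high must be captured, even if
    // rx_st_ready is already low, because of the ready latency.
    always @(posedge pld_clk or negedge rst_n) begin
        if (!rst_n)
            level <= 7'd0;
        else
            level <= level + (rx_st_valid ? 7'd1 : 7'd0)
                           - ((app_rd && level != 7'd0) ? 7'd1 : 7'd0);
    end
    // Storage for rx_st_data/rx_st_sop/rx_st_eop is omitted for brevity.
endmodule

Sizing HEADROOM to cover at least the 3 beats that can arrive after rx_st_ready deasserts (4 here, for margin) prevents buffer overflow.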
To facilitate the interface to 64-bit memories, the IP core always aligns data to the qword, or 64 bits; consequently, if the header presents an address that is not qword aligned, the IP core shifts the data within the qword to achieve the correct alignment. Figure 5–4 shows how an address that is not qword aligned, 0x4, is stored in memory. The byte enables only qualify data that is being written. This means that the byte enables are undefined for 0x0–0x3. This example corresponds to Figure 5–5 on page 5–9. Qword alignment is a feature of the IP core that cannot be turned off. Qword alignment applies to all types of request TLPs with data, including memory writes, configuration writes, and I/O writes. The alignment of the request TLP depends on bit 2 of the request address. For completion TLPs with data, alignment depends on bit 2 of the lower address field. This bit is always 0 (aligned to qword boundary) for completion-with-data TLPs that are for configuration read or I/O read requests.
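As a concrete illustration of this rule, the following Verilog sketch derives the expected alignment from bit 2 of the relevant field. The module and port names are assumptions for illustration; only the use of bit 2 comes from the behavior described above.

module tlp_alignment_decode (
    input  wire [31:0] req_addr,       // address dword of a request TLP (assumed already extracted)
    input  wire [6:0]  cpl_lower_addr, // Lower Address field of a completion with data
    input  wire        is_completion,  // 1: decode a completion, 0: a request
    output wire        qword_aligned   // 1: first payload dword is qword aligned
);
    // Bit 2 selects between the two halves of the qword: when it is 0 the
    // address is a multiple of 8 and the payload starts in rx_st_data[31:0];
    // when it is 1 (for example, address 0x4) the first dword is shifted to
    // rx_st_data[63:32].
    wire align_bit = is_completion ? cpl_lower_addr[2] : req_addr[2];
    assign qword_aligned = ~align_bit;
endmodule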
Figure 5–5 illustrates the mapping of Avalon-ST RX packets to PCI Express TLPs for a three dword header with non-qword aligned addresses with a 64-bit bus. In this example, the byte address is unaligned and ends with 0x4, causing the first data to correspond to rx_st_data[63:32].
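Based on this description, the 64-bit cycles for such a TLP would be laid out as follows (an illustrative summary, with Hn denoting header dwords and Dn payload dwords):
Cycle 1 (rx_st_sop asserted): rx_st_data[31:0] = H0, rx_st_data[63:32] = H1
Cycle 2: rx_st_data[31:0] = H2, rx_st_data[63:32] = D0
Later cycles: two payload dwords per cycle until rx_st_eop is asserted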
f For more information about the Avalon-ST protocol, refer to the Avalon Interface Specifications.
1 The Avalon-ST protocol, as defined in the Avalon Interface Specifications, is big endian, but the IP Compiler for PCI Express packs symbols into words in little endian format. Consequently, you cannot use the standard data format adapters that use the Avalon-ST interface.
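For example, the first payload byte of a TLP (the byte at the lowest address) arrives on rx_st_data[7:0] and the second byte on rx_st_data[15:8]; a standard big-endian adapter would instead expect that first symbol on the most significant byte lane, which is why the standard adapters cannot be used.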
Figure 5–5. 64-Bit Avalon-ST rx_st_data<n> Cycle Definition for 3-DWord Header TLPs with Non-QWord Aligned Address
[Timing diagrams not reproduced. Each shows clk, rx_st_data[63:32], rx_st_data[31:0], rx_st_sop, rx_st_eop, rx_st_be[7:4], and rx_st_be[3:0] across the header and payload cycles of a TLP transfer.]
Figure 5–6 illustrates the mapping of Avalon-ST RX packets to PCI Express TLPs for a
three dword header with qword aligned addresses. Note that the byte enables
indicate the first byte of data is not valid and the last dword of data has a single valid
byte.
Figure 5–6. 64-Bit Avalon-ST rx_st_data<n> Cycle Definition for 3-DWord Header TLPs with QWord Aligned Address (Note 1)
Note to Figure 5–6:
(1) rx_st_be[7:4] corresponds to rx_st_data[63:32]. rx_st_be[3:0] corresponds to rx_st_data[31:0].
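For example, if the only invalid byte is byte 0 of the first payload dword, and the single valid byte of the final dword is its byte 0, those beats would carry rx_st_be[3:0] = 4'hE (first dword, on rx_st_data[31:0]) and rx_st_be[7:4] = 4'h1 (final dword, on rx_st_data[63:32]). The exact values depend on which bytes the request actually enables.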
Figure 5–7 shows the mapping of Avalon-ST RX packets to PCI Express TLPs for a four dword header with qword aligned addresses with a 64-bit bus.
Figure 5–7. 64-Bit Avalon-ST rx_st_data<n> Cycle Definitions for 4-DWord Header TLPs with QWord Aligned Addresses
[Timing diagram not reproduced. Shows clk, rx_st_data[63:32], rx_st_data[31:0], rx_st_sop, rx_st_eop, rx_st_bardec[7:0], rx_st_be[7:4], and rx_st_be[3:0] across the header and payload cycles of a TLP transfer.]
Figure 5–8 shows the mapping of Avalon-ST RX packets to PCI Express TLPs for a four dword header with non-qword aligned addresses with a 64-bit bus. Note that the address of the first dword is 0x4. The address of the first enabled byte is 0x6. This example shows one valid word in the first dword, as indicated by the rx_st_be signal.
Figure 5–8. 64-Bit Avalon-ST rx_st_data<n> Cycle Definitions for 4-DWord Header TLPs with Non-QWord Aligned Addresses (Note 1)
Note to Figure 5–8:
(1) rx_st_be[7:4] corresponds to rx_st_data[63:32]. rx_st_be[3:0] corresponds to rx_st_data[31:0].
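As a worked example of the byte enables in Figure 5–8: because the address is not qword aligned, the first dword sits on rx_st_data[63:32] and covers byte addresses 0x4–0x7. With the first enabled byte at 0x6, only bytes 0x6 and 0x7 (one 16-bit word) are valid, so that beat carries rx_st_be[7:4] = 4'hC.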
Figure 5–9 illustrates the timing of the RX interface when the application
backpressures the IP Compiler for PCI Express by deasserting