Achronix Speedster22i User Manual SerDes

Speedster22i SerDes
User Guide
UG028 (v2.1) – July 1, 2014
Table of Contents
List of Figures
List of Tables
Physical Media Attachment (PMA)
Clocking
Physical Coding Sublayer (PCS)
Debug and Test
Major standards supported
SerDes Placement
SerDes Architecture Overview
1. Common
2. Receiver (RX)/Transmitter (TX)
3. Digital PMA (DPMA)
PCS Self Test Logic
Polarity bit reversal (PBR) #0 and #1
    Polarity and Bit Inversion – 10/20 bit Operation
    Polarity and Bit Inversion – 8/16 bit Operation
Interface Encapsulation
8b/10b Encoder
    Symbols and Comma Character
Running Disparity
Transition Density Checker (TDC)
Polarity Bit Reversal (PBR)
Symbol Alignment
    Modes of Operation
Deskew FIFO
    Functional Description
Lane-to-Lane Deskew Modes of Operation
    The deskew module can work in three modes:
    Standards Supported by Deskew Module
Elastic FIFO (Elastic Buffer)
    EFIFO Standards and Skip Characters
    EFIFO Operation
    Overflow/Underflow
8b/10b Decoder
Bit Slider
Interface Encapsulation
PCS Self Test Checker
Gigabit Ethernet Interface
XAUI
PIPE Interface
Loopback Modes
    PMA loopback modes:
    PCS loopback modes:
PMA Test Pattern Generator
PMA Test Pattern Checker
PCS Test Pattern Generator
PRBS Generator
PCS Test Pattern Checker
PMA Latency
PCS Latency
Generating SerDes Wrapper using ACE GUI
Single-Lane Serdes Wrapper
    Overview Section:
    Section on PMA Settings:
    RX PMA Equalization
    RX PMA PLL
    TX PMA Driver
    TX PMA PLL
    Section on PCS Settings:
    RX PCS Settings
    RX PCS Symbol Alignment
    TX PCS Settings
    Section on Manually Overriding PMA/PCS Register Values:
    Generation of Wrapper Files:
Files Generated by ACE-GUI
Integration of SerDes Wrapper in a Design
    Design and Wrapper Files
    Dynamically Changing the SerDes Register Values
    Using sBus module to enable internal loopback
Placement of SerDes
    Timing Constraints
    Test bench Setup for Simulation
Design Guidelines
    Reset Sequence
    SerDes Placement and Clocking Limitations
    Wide Bus
    Design Tips
Variants of the Simple Design
    Design Bypassing PCS:
    Bypassing PCS by Manually Overriding Corresponding Register
Overview
    Alternatives for using SBUS interface for SerDes register access:
ACX_SERDES_SBUS_IF Module
    The Ports of ACX_SERDES_SBUS_IF Module:
    Loopback Modes
Operating Conditions
Transmitter
Receiver
    Eye Diagram
Reference Clock
    Jitter Specification

List of Figures


List of Tables

Table 10: Supported Transmitter (TX) Features
Chapter 1 – SerDes Architecture

Overview

Achronix Speedster22i FPGAs provide high core fabric and I/O performance that exceeds the system bandwidth requirements of many high-end applications. The Speedster22i device family supports up to 64 full-duplex SerDes lanes, each supporting data rates up to 11.3 Gbps.
The Physical Coding Sublayer (PCS) and Physical Media Attachment (PMA) sub-blocks together comprise a single SerDes block. The SerDes PCS has explicit support for PCIe, 10GBASE-R, 1G Ethernet and XAUI. Through the PCS, it also provides some support for other interconnect protocols such as Interlaken, SPI4.2, InfiniBand, Fiber Channel, SAS/SATA, SONET, OC, OBSAI and CPRI. The SerDes can be connected either to the embedded hard IP blocks (PCIe, Interlaken, and 10/40/100G MAC) or to the FPGA fabric for soft implementation of any other supported protocol.

Physical Media Attachment (PMA)

Data rates supported
    o 1.0625 – 11.3 Gbps
    o 531.25 – 1062.5 Mbps using 2X over-sampling
    o 265.625 – 531.25 Mbps using 4X over-sampling
Independent lane architecture with dedicated synthesizer for each lane with no off-chip components required
Low power architecture (<100 mW at 10 Gbps)
Support for both AC and DC coupling
Input driver with Continuous Time Linear Equalizer (CTLE) and Decision Feedback Equalizer (DFE)
    o Input voltage: 50 – 2000 mVp-p differential
    o Auto-calibrating CTLE and DFE
    o CTLE with up to 20 dB gain tuned for key data rates
    o Pulse-shaped 5-tap DFE
Output driver with 4-tap Finite Impulse Response (FIR) filter with Feed Forward Equalizer (FFE)
    o Output voltage: 400 – 1500 mVp-p differential
    o Slew rate: 31 – 170 ps
Highly digital PLL architecture for the Synthesizer and CDR
    o Accuracy & low jitter of an analog PLL
    o Tuning range of a digital PLL
    o Programmable spread spectrum generation
    o Support for 16-bit fractional multiplication factors
    o Programmable spread spectrum clocking
    o Support for fast lock mode for EPON/GPON
On-chip scope in the receiver for measuring eye width, eye height and BER for the incoming signal
On-chip calibrated 100 ohm termination
Transparent calibration engine to compensate for PVT variation
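The relationship between the native and over-sampled data-rate ranges listed above can be sketched in a few lines of Python. This is an illustrative helper only, not an Achronix API: the name `select_oversampling`, the table layout, and the assumption that the lane runs at the over-sampling factor times the data rate are all hypothetical.

```python
# Data-rate ranges from the PMA feature list, in Mbps.
# Each entry: (over-sampling factor, minimum data rate, maximum data rate).
RATE_RANGES_MBPS = [
    (1, 1062.5, 11300.0),   # native: 1.0625 - 11.3 Gbps
    (2, 531.25, 1062.5),    # 2X over-sampling
    (4, 265.625, 531.25),   # 4X over-sampling
]

def select_oversampling(data_rate_mbps):
    """Return (factor, line_rate_mbps) for a requested data rate.

    Hypothetical helper: assumes the lane runs at factor * data rate.
    Raises ValueError if the rate is outside every listed range.
    """
    for factor, low, high in RATE_RANGES_MBPS:
        if low <= data_rate_mbps <= high:
            return factor, factor * data_rate_mbps
    raise ValueError(f"{data_rate_mbps} Mbps is outside the supported range")
```

For example, an OC-12 link at 622.08 Mbps falls in the 2X range, so under this assumption the lane itself would run at 1244.16 Mbps.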

Clocking

Support for external reference clock from 50 MHz – 300 MHz
Support for recovered reference clock for loop timing and re-timer type applications that eliminates the need for a cleanup PLL

Physical Coding Sublayer (PCS)

Bypassable and Modular PCS architecture
Support for 8b/10b and 128b/130b encoding
Symbol alignment
Clock and phase compensation FIFO
Lane to lane de-skew
Polarity inversion
Bit reversal
Lane bonding
Low/Deterministic latency modes for protocols such as CPRI and OBSAI

Debug and Test

Up to seven different near-end and far-end loopback modes in PMA and PCS
Built-in self test (BIST)
    o PRBS 7, 15, 23, 31 and 40-bit user defined pattern generators and checkers in the PCS
    o PRBS 7, 23, 31 and 40-bit user defined pattern generators and checkers in the PMA

Major standards supported

Table 1: SerDes Standards
Standards              Variation               Data Rate(s)
PCI Express            Gen1                    2.5 Gbps
PCI Express            Gen2                    5.0 Gbps
PCI Express            Gen3                    8.0 Gbps
Gigabit Ethernet       1000BASE-CX             1.25 Gbps
Gigabit Ethernet       SGMII                   1.25 Gbps
10 Gigabit Ethernet    XAUI (802.3ae)          3.125 Gbps
10 Gigabit Ethernet    10GBASE-R (802.3ae)     10.3125 Gbps
10 Gigabit Ethernet    XFI                     10.3125 Gbps
10 Gigabit Ethernet    10GBase-KR (802.3ae)    10.3125 Gbps
10 Gigabit Ethernet    XLAUI/CAUI (802.3ae)    10.3125 Gbps
Interlaken             --                      3.125 – 10.3125 Gbps
OIF                    SPI5                    3.125 Gbps
OIF                    SFI4.2                  3.125 Gbps
OIF                    SFI5.1                  3.125 Gbps
OIF                    SFI5.2                  9.1 – 10.3125 Gbps
OIF                    SFI-S                   11.1 Gbps
OIF                    CEI 6G                  4.976 – 6.375 Gbps
OIF                    CEI 11G                 9.95 – 11.2 Gbps
Fiber Channel          FC-1                    1.0625 Gbps
Fiber Channel          FC-2                    2.125 Gbps
Fiber Channel          FC-4                    4.25 Gbps
Fiber Channel          FC-8                    8.5 Gbps
Fiber Channel          FC-10                   10.52 Gbps
SONET                  OC-12                   622.08 Mbps
SONET                  OC-24                   1244.16 Mbps
SONET                  OC-48                   2488.32 Mbps
SONET                  OC-192                  9953.28 Mbps
QPI                    --                      4.8 Gbps
QPI                    --                      6.4 Gbps
SATA                   SATA-1                  1.5 Gbps
SATA                   SATA-2                  3.0 Gbps
SATA                   SATA-3                  6.0 Gbps
SAS                    SAS-1                   3.0 Gbps
SAS                    SAS-2                   6.0 Gbps
SAS                    SAS-3                   12.0 Gbps
Serial Rapid I/O       Gen1                    1.25 Gbps
Serial Rapid I/O       Gen1                    2.5 Gbps
Serial Rapid I/O       Gen1                    3.125 Gbps
Serial Rapid I/O       Gen2                    5.0 Gbps
Serial Rapid I/O       Gen2                    6.125 Gbps
E-PON                  802.3av                 1.25 Gbps
E-PON                  802.3av                 2.5 Gbps
E-PON                  802.3av                 10 Gbps
GPON                   --                      1.25 Gbps
GPON                   --                      2.5 Gbps
GPON                   --                      10 Gbps
InfiniBand             SDR                     2.5 Gbps
InfiniBand             DDR                     5.0 Gbps
InfiniBand             QDR                     10.0 Gbps
JESD204B               --                      Up to 12.5 Gbps
CPRI                   --                      614.4 – 9830.4 Mbps
OBSAI                  --                      768 – 6144 Mbps
USB                    3.0                     5.0 Gbps
USB                    3.1                     10.0 Gbps

SerDes Placement

The Speedster22i device supports up to sixty-four (64) 11.3 Gbps SerDes lanes. Each side (top and bottom) has thirty-two (32) 11.3 Gbps SerDes lanes. The lanes are organized by channel and are placed as illustrated in “Figure 1: Location of SerDes Lanes” below.
Figure 1: Location of SerDes Lanes

SerDes Architecture Overview

The SerDes has an independent lane architecture. Each lane has a Physical Media Attachment (PMA), Synthesizer (Transmit PLL), Clock and Data Recovery (CDR) and Physical Coding Sublayer (PCS). The Receiver PMA and Transmitter PMA block diagrams are shown in “Figure 2: SerDes Architecture” below.
Figure 2: SerDes Architecture
The SerDes primarily consists of the following blocks:
PMA
PCS
PCS interface to FPGA fabric
Clocking
Debug and Test

Physical Media Attachment (PMA)

The PMA architecture is shown in “Figure 3: PMA Architecture” below.
Figure 3: PMA Architecture
The PMA consists of three major blocks:
1. Common
2. Receiver/Transmitter (RX/TX)
3. Digital PMA (DPMA)

1. Common

The common block consists of the following circuits:
Reference clock: This circuit performs reference clock buffering and division before feeding it to the Synthesizer.
Synthesizer: The synthesizer (transmit PLL) generates the high-speed clock for the serializer of the Transmitter. It also has a built-in circuit for spread-spectrum clocking.
Bias: The biasing circuit is responsible for controlling the offsets and biasing for all the analog circuits in the PMA.
Analog Test Port: This port is used by Achronix for manufacturing tests and for debugging purposes.

2. Receiver (RX)/Transmitter (TX)

The RX/TX block consists of the following circuits:
TX buffer: Converts a single-ended signal to differential and performs equalization (or pre-emphasis) on the outgoing serial signal
RX buffer: Converts a differential signal to single-ended and performs equalization on the incoming signal using the Continuous Time Linear Equalizer (CTLE) and Decision Feedback Equalizer (DFE)
Clock Data Recovery (CDR): Recovers clock and data from the incoming signal for deserialization
On-Chip Scope: Used for plotting an eye of the incoming signal post equalization for debug
Serializer/Deserializer: Converts parallel data to serial data using a high-speed clock from the synthesizer

3. Digital PMA (DPMA)

The DPMA block consists of the following circuits:
Calibration: Performs calibration of all the analog circuits using trim settings and offsets
PMA BIST: Includes PRBS 7, 23, 31 and 40-bit user defined pattern generators and checkers
Power management
Configuration registers (Memory)
JTAG and Boundary Scan
Figure 4: Synthesizer Architecture
Figure 5: Receiver Architecture

PCS Blocks in the Transmitter (TX)

This section presents the transmitter (TX) data path within a PCS. The key blocks within the SerDes transmitter are:
Encoder: Encodes the data for the transmission line. The primary goal is to ensure DC balance by eliminating long sequences of 1’s or 0’s.
Polarity Bit Reversal (PBR): Inverts the polarity and the ordering of the data to be transmitted.
The building block for the SerDes IP is the single-lane configuration. A simplified block diagram of the TX data path is shown in “Figure 6: PCS Transmitter Block Overview”. The functional blocks shown in the diagram represent the functionality supported by a single SerDes lane. A summary of the supported standards is given in “Table 1: SerDes Standards”.
Figure 6: PCS Transmitter Block Overview
* SerDes configured in Generic mode supports only 8b/10b encoding.
** Either of PBR#0 or PBR#1 can be used or both may be bypassed.
Note: The PCS block supports lane bonding across multiple SerDes lanes (maximum 12). The chapter “Design Flow: Creating a SerDes Design” presents the ground-up steps for preparing a design that supports lane bonding.
The PCS blocks on TX path are detailed below.

PCS Self Test Logic

This block generates transmit data for PCS self test, detailed in “PCS Test Pattern Generator” and “PCS Test Pattern Checker”.

Polarity bit reversal (PBR) #0 and #1

This block can invert the polarity of the incoming data. It can also reverse the bits of the incoming data such that effectively the most significant bit is sent first, rather than the least significant bit (default). For 16/20bit (2 words) bit streams, the word order can also be inverted such that effectively the most significant byte is sent first, rather than the least significant byte (default).
There are two PBR blocks on transmission data path, as shown in “Figure 6: PCS Transmitter Block Overview”. PBR0 is used before the protocol encapsulation block and PBR1 is used on encoded data. Either PBR0 or PBR1 can be used. Alternatively, both of these two blocks can be bypassed.

Polarity and Bit Inversion – 10/20 bit Operation

When operating in 10bit/20bit mode, the bit order within each 10-bit word can be inverted. This is illustrated in “Figure 7: 20 bit Order Reversal”. Effectively the most significant bit of the least significant byte is transmitted first (i.e. bit 9 of byte 0 is transmitted first).
Figure 7: 20 bit Order Reversal
When the word order is reversed in 20-bit mode, the most significant byte (byte 1) is swapped with the least significant byte (byte 0). This is illustrated in “Figure 8: 20-bit Byte Order Swap/Reversal”. In this case, the most significant byte is transmitted first.
Figure 8: 20-bit Byte Order Swap/Reversal
The polarity can be inverted as well. Polarity inversion applies to the entire word (10 bits or 20 bits).
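The three 20-bit operations described in this section (bit order reversal within each 10-bit word, byte swap, and full-word polarity inversion) can be modeled with a short Python sketch. The function names and the order in which the operations are applied are assumptions made for illustration; the hardware ordering is not stated here.

```python
def reverse_bits(value, width):
    """Reverse the bit order of a width-bit value."""
    result = 0
    for i in range(width):
        if (value >> i) & 1:
            result |= 1 << (width - 1 - i)
    return result

def pbr_20bit(word, bit_reverse=False, word_swap=False, invert=False):
    """Model the PBR operations on a 20-bit word (two 10-bit words).

    Hypothetical model: applies bit reversal, then swap, then inversion.
    """
    lo = word & 0x3FF          # byte 0, bits [9:0]
    hi = (word >> 10) & 0x3FF  # byte 1, bits [19:10]
    if bit_reverse:            # reverse bit order within each 10-bit word
        lo, hi = reverse_bits(lo, 10), reverse_bits(hi, 10)
    if word_swap:              # swap byte 1 and byte 0
        lo, hi = hi, lo
    out = (hi << 10) | lo
    if invert:                 # polarity inversion covers all 20 bits
        out ^= 0xFFFFF
    return out
```

With only bit 0 set, bit reversal moves it to bit 9 of byte 0, and a byte swap moves it to bit 10 (bit 0 of byte 1).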

Polarity and Bit Inversion – 8/16 bit Operation

When the polarity is inverted in 8-bit/16-bit mode, only bits [17:10] and [7:0] are inverted; bits [19:18] and [9:8] are not inverted. This is illustrated in “Figure 9: Polarity Inversion (16-bit Word)”.
Figure 9: Polarity Inversion (16-bit Word)
When the bit order is inverted in 8bit/16bit mode, bits [7:0] of byte 0 are swapped while bits [9:8] are not swapped. Similarly bits [17:10] of byte 1 are swapped. This is illustrated in “Figure 10: Bit Order Inversion (16-bit Word)”. In this mode, the most significant bit of the least significant byte is transmitted first.
Figure 10: Bit Order Inversion (16-bit Word)
When the word order is inverted in 16-bit mode, byte 1 is swapped with byte 0. This is illustrated in “Figure 11: Word Order Inversion (16-bit Word)”.
Figure 11: Word Order Inversion (16-bit Word)
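The polarity and bit-order rules for 8-bit/16-bit operation differ from the 10/20-bit case in that bits [19:18] and [9:8] pass through unchanged. The Python sketch below models just those two operations; the function names are hypothetical. Word-order inversion is left out because the text does not say whether bits [19:18] and [9:8] move together with their bytes.

```python
DATA_MASK_16 = (0xFF << 10) | 0xFF   # bits [17:10] and [7:0]

def reverse8(value):
    """Reverse the bit order of an 8-bit value."""
    result = 0
    for i in range(8):
        if (value >> i) & 1:
            result |= 1 << (7 - i)
    return result

def invert_polarity_16(word):
    """Invert only the data bits; bits [19:18] and [9:8] pass through."""
    return word ^ DATA_MASK_16

def reverse_bit_order_16(word):
    """Reverse bits [7:0] and [17:10] in place; bits [9:8] and [19:18] stay."""
    byte0 = reverse8(word & 0xFF)
    byte1 = reverse8((word >> 10) & 0xFF)
    untouched = word & ~DATA_MASK_16 & 0xFFFFF
    return untouched | (byte1 << 10) | byte0
```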

Interface Encapsulation

This block encapsulates the protocols supported by the SerDes in Achronix FPGAs. Refer to the section “PCS Interface” for details on the supported protocols. Note again that a SerDes configured in Generic mode supports only 8b/10b encoding.

8b/10b Encoder

The 8b/10b encoder generates 10-bit code groups from 8-bit data and a 1-bit control input. It uses the code group mapping specified in IEEE 802.3 clause 36. If the fabric interface is a 16-bit data path, then two 8b/10b encoders are cascaded to produce a 20-bit code group output to the PMA for serialization.
The 8b/10b encoder essentially translates 8-bit words to 10-bit symbols. This encoding scheme achieves DC balance and bounded running disparity while providing sufficient transitions for clock recovery. (See the later sections for more information on DC balance, running disparity and clock recovery.) The 10-bit encoded output TX_dataout[9:0] maps to bits {jhgf iedcba} per the labeling used in IEEE 802.3-2005 clause 36.

Symbols and Comma Character

While translating 8-bit words into 10-bit symbols, the 8b/10b encoder (in the SerDes PCS) forms two groups of data. The lower 5 bits of data are encoded into a 6-bit group and the upper 3 bits of data are encoded into a 4-bit group. Furthermore, there are 12 control symbols, called K-symbols, that the 8b/10b encoding scheme uses for special purposes. For instance, three of these control symbols can be used to define the boundary between data packets. These three control symbols are called comma symbols.
The 1-bit control input (datak signal) is used to identify whether the data being transmitted is a comma symbol. An asserted datak signal on the control line indicates that the symbol on the data line is a comma symbol.
In Section-“Design and Wrapper Files” of the Chapter – “Design Flow: Creating a SerDes Design”, details are provided on how to transmit 8’hBC (K.28.5) as comma symbol and 1’b1 as control signal, for a sample design. For a 20-bit data width, that design essentially uses {2’h1, 8’hBC, 2’h1, 8’hBC}. In other words, while sending a comma symbol, TX_data[8:8] = TX_data[18:18] = 1’b1 is sent through the control-line.
Note: On the receiver end, when the decoder finds an ‘asserted’ control-bit on control-line, it will consider the symbol on data-line as a comma symbol. Error conditions occur if the datak signal is asserted while there is no comma symbol on the data line (e.g. K21.5).
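The 20-bit comma word {2’h1, 8’hBC, 2’h1, 8’hBC} mentioned above can be assembled as follows. This Python sketch only illustrates the bit packing; the helper names `tx_lane_word` and `tx_word_20bit` are hypothetical and are not part of the generated wrapper.

```python
K28_5 = 0xBC  # comma symbol used in the sample design

def tx_lane_word(data_byte, datak):
    """Pack one 10-bit TX word: bits [9:8] carry control, bits [7:0] data.

    Bit 8 is the datak control bit; bit 9 is left at 0, as in 2'h1.
    """
    return ((1 if datak else 0) << 8) | (data_byte & 0xFF)

def tx_word_20bit(byte1, k1, byte0, k0):
    """Concatenate two 10-bit words, with byte 1 in bits [19:10]."""
    return (tx_lane_word(byte1, k1) << 10) | tx_lane_word(byte0, k0)

# Two comma symbols with the control bit asserted on both halves.
comma_word = tx_word_20bit(K28_5, True, K28_5, True)
```

The resulting word has bit 8 and bit 18 set, matching TX_data[8:8] = TX_data[18:18] = 1’b1.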

Running Disparity

A non-encoded data stream may have differences between the number of 1’s and the number of 0’s. The primary goal of using running disparity in the encoding scheme is to limit the difference between the number of 1’s and the number of 0’s that are being transmitted. This ensures DC balance on the transmission line. A side benefit is that running disparity information can be used to locate transmission errors. The maximum run length for 8b/10b words is 5 bits.
The input disparity for the 6 bit block is based on the disparity of previous word’s 4 bit block while the disparity for the 4 bit block is the disparity of the current word’s 6 bit block. This is illustrated in “Figure 12: 8b/10b Encoding Process”.
Figure 12: 8b/10b Encoding Process
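The way disparity is passed from the 6-bit block to the 4-bit block, and on to the next word, can be sketched in Python. The sketch follows the running disparity rules of IEEE 802.3 clause 36 and represents sub-blocks as bit strings in transmission order; the function names are illustrative only and this is not a model of the Achronix encoder itself.

```python
def next_rd(rd, sub_block):
    """Return the running disparity (+1 or -1) after one sub-block.

    sub_block is a bit string in transmission order: 6 bits (abcdei)
    or 4 bits (fghj).
    """
    ones = sub_block.count("1")
    zeros = len(sub_block) - ones
    if ones > zeros or sub_block in ("000111", "0011"):
        return +1
    if zeros > ones or sub_block in ("111000", "1100"):
        return -1
    return rd  # balanced sub-block: disparity carries over

def rd_after_symbol(rd, symbol):
    """Apply the 6-bit block, then the 4-bit block, of a 10-bit symbol."""
    rd = next_rd(rd, symbol[:6])
    return next_rd(rd, symbol[6:])

def max_run_length(bits):
    """Longest run of identical consecutive bits in a bit string."""
    longest = current = 0
    previous = None
    for bit in bits:
        current = current + 1 if bit == previous else 1
        longest = max(longest, current)
        previous = bit
    return longest
```

For example, K28.5 sent with negative running disparity (0011111010) leaves the disparity positive, so the next K28.5 is sent as 1100000101; the longest run across the pair is 5 bits.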

PCS Blocks in the Receiver (RX)

This section describes the PCS components on the receiver data path. The functional block diagram of the receiver is shown in “Figure 13: PCS Receive Block Overview”. The key blocks in the RX PCS include:
Transition Density Checker (TDC): Generates a trigger bit when the number of consecutive 1’s or 0’s reaches a pre-defined value.
Polarity Bit Reversal (PBR): Inverts data, swaps byte ordering and reverses bit ordering, if used on the TX data path.
Symbol Alignment: Uses alignment characters and sequences to define the symbol boundary on the incoming data stream.
Decoders: Generate an 8-bit code group and a 1-bit control signal from the 10-bit encoded (received) data.
Deskew First-In-First-Out (FIFO): Synchronizes the data received across the lanes when lane bonding is used.
Clock Compensation (Elastic FIFO): Synchronizes the data received from the PMA in the recovered clock domain with a system clock (typically the transmit clock).
Bit Slider: Takes care of bit-wise skew from the fabric, when used.
PCS Interface Encapsulation: Provides the interface with the fabric. Supports Gigabit Ethernet, XAUI, PIPE and 10G Ethernet interfaces.
PCS Self Test Checker: Self-checking module, detailed in “PCS Test Pattern Generator” and “PCS Test Pattern Checker”.
The main features for the supported standards on the PCS side can be found in “Major standards supported”.
Figure 13: PCS Receive Block Overview

Transition Density Checker (TDC)

The transition density checker monitors the parallel RX data bus from the PMA and counts the number of consecutive 0s or 1s, called the run length. If the run length reaches a pre-configured value, the checker sets a trigger bit to indicate a transition density violation. This pre-configured value is called the threshold, and the minimum threshold that can be programmed is half the width of the data path. If scaling is used, the effective threshold is the one given by “Equation 1”.
Equation 1: Effective threshold when scaling is used
The assert signal from the Transition Density Checker can be taken to the fabric.
Note: Any bit transition causes the counter to clear and the count to restart.
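The run-length counting performed by the transition density checker can be illustrated with a small Python model. This is a behavioral sketch under stated assumptions, not the hardware implementation: the class name, the LSB-first bit order, and the sticky trigger flag are all hypothetical, and threshold scaling is not modeled.

```python
class TransitionDensityChecker:
    """Counts consecutive identical bits across successive parallel words."""

    def __init__(self, threshold):
        self.threshold = threshold  # run length that raises the trigger
        self.run = 0                # current run length
        self.last_bit = None        # most recent bit seen
        self.trigger = False        # violation flag (sticky in this model)

    def push_word(self, word, width):
        """Feed one parallel word, LSB first (bit order is an assumption)."""
        for i in range(width):
            bit = (word >> i) & 1
            if bit == self.last_bit:
                self.run += 1
            else:
                self.run = 1        # any transition restarts the count
                self.last_bit = bit
            if self.run >= self.threshold:
                self.trigger = True
        return self.trigger
```

Because the counter carries over between words, a run that straddles a word boundary is detected as well.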

Polarity Bit Reversal (PBR)

The polarity bit reversal block is used to invert data, swap byte ordering, and reverse bit ordering. There are two such PCS blocks on the receive path, corresponding to the two polarity bit reversal blocks on the transmit path.
When the polarity bit reversal on the transmit path is performed before protocol encapsulation (PBR #0 in “Figure 6: PCS Transmitter Block Overview”), the PBR block after protocol encapsulation is used on the receive path (PBR #0 in “Figure 13: PCS Receive Block Overview”). In contrast, if the PBR operation is performed on encoded data on the transmit path (PBR #1 in “Figure 6: PCS Transmitter Block Overview”), the PBR block before the symbol alignment/decoder block is used on the receive path (PBR #1 in “Figure 13: PCS Receive Block Overview”). As noted earlier, both of these blocks can be disabled, on both the transmit and the receive paths.

Symbol Alignment

Symbol alignment uses alignment and sequence characters to identify the correct symbol boundary in the received data stream. Attributes for alignment and sequence-detect symbols are specified as 10 bits wide. However, when the receive data path is in 8-bit (or 16-bit) mode, only the lower 8 bits of each attribute are considered.
The symbol alignment block can be configured to support a variety of standards. Some of these standards are listed below:
PCIe
XAUI
GigE
Infiniband
Serial Rapid IO
SPI-5 (lock to training pattern)
CPRI
OBSAI
Fiber Channel
Symbol alignment can be programmed to function in the following modes:
Manual Mode
Bit Slip Mode
Automatic Mode

Modes of Operation

Manual Mode:
In manual alignment mode, the symbol alignment block attempts to identify a pre-configured pattern and lock to the incoming de-serialized data stream from the output of the PMA or phase picking block. The alignment operation is triggered by user logic in the FPGA on the rising edge of RX_com_det_en. The symbol alignment block then searches for the pre-configured alignment pattern, with or without a trailing sequence pattern, while the fabric waits for the lock status. Once lock to the incoming stream is achieved, the fabric can monitor the error status from the 8b/10b decoder, or employ any other mechanism in the fabric, to identify loss of lock. To start a new alignment cycle, the fabric asserts another rising edge on RX_com_det_en.
Bit Slip Mode:
In bit slip mode, the user logic controls symbol alignment using the RX_bit_slip_en signal. Each rising edge of RX_bit_slip_en causes the symbol alignment logic to shift the word boundary by 1 bit, and the block then attempts to match the alignment pattern within the new word boundary. If the pattern is not matched, the user logic can assert RX_bit_slip_en again, possibly after waiting for a timeout, which shifts the word boundary by another bit position. This loop continues until lock is achieved. Once lock to the incoming stream is achieved, logic in the fabric can monitor the error status from the 8b/10b decoder, or employ some other mechanism in the fabric, to identify loss of lock. Bit slip mode supports all attributes used for manual alignment mode. The maximum number of slips that cause a true change in alignment is limited to the data path width.
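The bit slip loop can be sketched as a search over candidate word boundaries. This is an illustrative model, not fabric RTL: the stream is represented as a Python list of bits in arrival order, the function names are hypothetical, and the timeout between slips is not modeled.

```python
def word_at(bits, offset, width):
    """Assemble `width` bits starting at `offset`, first-arriving bit as MSB."""
    value = 0
    for b in bits[offset:offset + width]:
        value = (value << 1) | b
    return value


def bit_slip_align(bits, pattern, width):
    """Return the number of slips needed to see `pattern` on a word boundary.

    Each loop iteration models one rising edge of RX_bit_slip_en. At most
    `width` distinct boundaries exist, so the search stops after that many.
    Returns None if the pattern is never found.
    """
    for slips in range(width):
        for start in range(slips, len(bits) - width + 1, width):
            if word_at(bits, start, width) == pattern:
                return slips
    return None
```

With a 10-bit alignment pattern that arrives three bit positions after the initial word boundary, the function returns 3, meaning the fabric would issue three slip pulses before the pattern lines up.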
Automatic Mode:
In automatic alignment mode, the symbol alignment block automatically determines the location of the word boundary based on the pre-configured alignment characters. It also establishes a lock-acquired condition after receiving a pre-configured count of alignment characters (hysteresis). A loss-of-lock condition can also be detected by this block, based on a pre-configured count of bad code words (or alignment characters at a different word boundary). Instead of counting every bad code word, the user can choose to increment the unlock count only on every n-th bad code word. The user can also use decode/disparity errors, as per clause 36 of IEEE 802.3, to increment and decrement the unlock counter. Support for the Fiber Channel protocol involves synchronization with the 4-symbol-wide transmission word (a special code word K28.5 followed by 3 data code words). In the case of Fiber Channel, any malformed transmission word causes the symbol alignment to go out of lock based on the programmed unlock count.
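The lock hysteresis described above can be sketched as a small state tracker. The class and parameter names are hypothetical, and two behaviors are assumptions rather than documented facts: a bad code word seen before lock restarts acquisition, and the decrement path driven by clause 36 decode/disparity errors is not modeled.

```python
class LockTracker:
    """Illustrative model of lock acquisition and loss with hysteresis."""

    def __init__(self, lock_count, unlock_count, bad_divider=1):
        self.lock_count = lock_count      # alignment characters needed for lock
        self.unlock_count = unlock_count  # unlock-counter value that drops lock
        self.bad_divider = bad_divider    # count only every n-th bad code word
        self.locked = False
        self._good = 0
        self._bad_seen = 0
        self._unlock = 0

    def on_alignment_char(self):
        """Record an alignment character at the current word boundary."""
        if not self.locked:
            self._good += 1
            if self._good >= self.lock_count:
                self.locked = True
                self._bad_seen = 0
                self._unlock = 0
        return self.locked

    def on_bad_code_word(self):
        """Record a bad code word (or a misplaced alignment character)."""
        if not self.locked:
            self._good = 0  # assumption: restart acquisition
            return self.locked
        self._bad_seen += 1
        if self._bad_seen % self.bad_divider == 0:
            self._unlock += 1
        if self._unlock >= self.unlock_count:
            self.locked = False
            self._good = 0
        return self.locked
```

With `LockTracker(3, 2, bad_divider=2)`, lock is declared on the third alignment character and lost on the fourth bad code word, because only every second bad word increments the unlock counter.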
Comma symbols are used to identify the correct symbol boundary. The section “Symbols and Comma Character” introduces comma symbols and discusses how they are used in the data output from the 8b/10b encoder on the TX side of a SerDes. At the receiver end, the incoming data is scanned for comma symbols. Once a comma symbol is found, the deserializer resets the word boundary of the received data. The received data is then continuously scanned for subsequent comma symbols.
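A comma scan can be sketched as a sliding-window search over the serial bit stream. The sketch assumes the standard 8b/10b comma sequences (0011111 and 1100000 in arrival order), a 10-bit symbol, and a stream modeled as a list of bits; the function name is hypothetical and the actual alignment characters in the device are programmable.

```python
COMMA_PLUS = (0, 0, 1, 1, 1, 1, 1)   # comma bit sequence, first-arriving bit first
COMMA_MINUS = (1, 1, 0, 0, 0, 0, 0)  # same sequence with opposite disparity


def find_symbol_boundary(bits, symbol_bits=10):
    """Return the bit offset (modulo the symbol width) of the first comma.

    Returns None if no comma sequence is present in `bits`.
    """
    for i in range(len(bits) - len(COMMA_PLUS) + 1):
        window = tuple(bits[i:i + len(COMMA_PLUS)])
        if window in (COMMA_PLUS, COMMA_MINUS):
            return i % symbol_bits
    return None
```

The returned offset is the position at which the word boundary would be reset; every subsequent symbol starts a multiple of 10 bits after it.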

Deskew FIFO

The deskew block provides support for standards that require bonding of multiple lanes and de-skewing of received data across those lanes. Lane bonding is required when users want to transmit data faster than is possible over a single serial link (lane). In such cases, the received data must be aligned across the lanes. The deskew module within the SerDes takes care of this.
Figure 14: Operating principle of deskew technique
“Figure 14 - Operating principle of deskew technique” shows the operating principle of the deskew operation. In this figure, data is sent over four lanes. On the receiver side, before lane bonding, the data at time t+2 on lane-1 is aligned with the data at time t+1 on lane-2, and so on. The deskew technique aims to align the data with respect to the clock cycles. In other words, data at time t+2 on lane-2 should be aligned with data at time t+2 on the other lanes. The red lines for the clock at the receiver end demonstrate this.
For lane bonding, all lanes should use the same reference clock and insert de-skew characters at the same time on each lane. Skew between lanes is introduced by both active (CDR) and passive (board) elements of the link. The deskew operation can result in some loss of data when it aligns characters to the same clock cycle.

Functional Description

The de-skew block uses a deskew FIFO on each lane. Writes to the deskew FIFO are performed in the recovered clock domain of each lane. The read side of the deskew FIFO is clocked by the clock from the initiator lane. The lanes are categorized as initiator and followers. Any lane can be the initiator, and skew is always calculated between the initiator and each of the follower lanes.
Once deskew is enabled, the skew between the initiator and follower lanes is calculated continuously by sensing deskew characters on the read side of the FIFO. The read threshold for the FIFO needs to be programmed appropriately, based on skew tolerance, to avoid FIFO underrun or overrun. Once a deskew character is sensed, each lane starts a skew window equal to the maximum skew allowed in the system. Depending on how the lanes are skewed, each follower lane is either lagging or leading, and adjusts its read clock cycles accordingly. Once the initiator gets an indication from all lanes of the bonding group that the skew calculation is over, it declares that all lanes are aligned and asserts data valid for the downstream logic. The same data valid is used by the follower lanes to assert their respective lane data valid. When the initiator does not find an overlap of the skew windows, it issues a reset to all FIFOs in the bonding group and restarts the de-skew operation.
To summarize, the initiator lane generates various control signals for the follower lanes, and the follower lanes send various status signals back to the initiator. Status signals are AND-ed (e.g. for checking whether the skew calculation has completed in all lanes) or OR-ed (e.g. for checking whether any follower lane's window has not started), whereas control signals are used directly. These signals go from one lane to another. The status and control signals are registered at time intervals determined by the number of lanes bonded.
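The skew calculation can be sketched with a simplified software model. Each lane is represented as a list of received symbols containing one deskew character, and alignment is done by advancing each lane's read position. The function name, the list-based lane model, and the use of symbol indices instead of clock-domain FIFOs are all assumptions made for illustration.

```python
def deskew_lanes(lanes, marker, max_skew, initiator=0):
    """Align lanes on the first occurrence of `marker` (the deskew character).

    Returns (skews, aligned), where `skews` holds each lane's offset relative
    to the initiator lane, or None when the lane-to-lane spread exceeds
    `max_skew` (the hardware would reset the FIFOs and restart in that case).
    Raises ValueError if a lane contains no deskew character.
    """
    positions = [lane.index(marker) for lane in lanes]
    if max(positions) - min(positions) > max_skew:
        return None
    ref = positions[initiator]
    skews = [p - ref for p in positions]
    earliest = min(positions)
    # Advance each lane's read position so that all markers line up.
    shifted = [lane[p - earliest:] for lane, p in zip(lanes, positions)]
    common = min(len(s) for s in shifted)
    aligned = [s[:common] for s in shifted]
    return skews, aligned
```

For three lanes whose deskew character sits at positions 2, 1 and 3, the skews relative to lane 0 are 0, -1 and +1, and after alignment the character appears in the same column on every lane.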

Lane-to-Lane Deskew Modes of Operation

The deskew module can work in three modes:

Manual Mode:
A rising edge on i_dskew_start starts one round of the deskew operation. Lanes are declared aligned either immediately after the deskew operation completes, or after an additional check that a programmed number of aligned deskew characters has been seen on all bonded lanes at the same time. The fabric needs to monitor the received data to identify any misalignment and, if necessary, restart the deskew operation. Infiniband uses the manual mode of deskew operation.
Auto Mode:
The deskew module is always active. Once lanes are deskewed, all lanes continuously look for deskew characters in the data read from the FIFO. The initiator should see deskew characters on all lanes of the bonding group at the same time. The initiator looks for aligned deskew characters on all lanes a certain number of times, based on the value programmed in the register, and once this count is reached it declares the bonded lanes aligned. Any time the initiator finds that deskew characters are not aligned on all lanes, it increments an unlock count. If the unlock count reaches the value programmed in the register, the initiator declares that the lanes are out of lock and restarts the de-skew operation. While the unlock count is non-zero, if the initiator finds that de-skew characters are aligned on all lanes again, it starts decrementing the unlock counter. This decrement can happen once in every n occurrences of aligned de-skew characters (n is programmed in the register), to make sure the link has overcome the error condition. If the unlock counter reaches zero, the link remains aligned.
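The unlock counter behavior in auto mode can be sketched as follows. The class and parameter names are hypothetical, and resetting the aligned-occurrence count whenever a misalignment is seen is an assumption; the source does not state how partial progress toward a decrement is handled.

```python
class DeskewLockMonitor:
    """Illustrative model of the auto-mode unlock counter."""

    def __init__(self, unlock_limit, decrement_every):
        self.unlock_limit = unlock_limit        # programmed out-of-lock value
        self.decrement_every = decrement_every  # the programmed 'n'
        self.unlock_count = 0
        self.locked = True
        self._aligned_seen = 0

    def observe(self, aligned):
        """Process one deskew-character event; return True while locked."""
        if not self.locked:
            return False
        if aligned:
            if self.unlock_count > 0:
                self._aligned_seen += 1
                if self._aligned_seen >= self.decrement_every:
                    self.unlock_count -= 1
                    self._aligned_seen = 0
        else:
            self.unlock_count += 1
            self._aligned_seen = 0  # assumption: restart the 1-in-n count
            if self.unlock_count >= self.unlock_limit:
                self.locked = False  # hardware would restart deskew here
        return self.locked
```

With `DeskewLockMonitor(3, 2)`, a single misaligned event followed by two aligned events returns the counter to zero, while three misaligned events in a row drop the lock.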
Symbol Slip Mode:
The deskew module does not actively remove skew across lanes; instead, each lane is controlled by the fabric. The fabric continuously monitors incoming data and employs its own mechanism to determine the skew across lanes. Based on that calculation, it instructs each lane to adjust the read pointer of its FIFO. On each adjustment, the read pointer is incremented by 0, 1 or 2, depending on the combination of rising edges on symbol_slip_up and symbol_slip_dn (see Table 2). Depending on the skew computed, the fabric may need to provide multiple transitions on symbol_slip_up and symbol_slip_dn to obtain the required number of pointer adjustments.
Table 2: Symbol Slip Parameters
symbol_slip_up    symbol_slip_dn    Comments
0                 0                 Increment read pointer by 1
0                 1                 No increment
1                 0                 Increment read pointer by 2
1                 1                 Increment read pointer by 1
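The symbol slip encoding can be expressed as a lookup from the (symbol_slip_up, symbol_slip_dn) pair to a read pointer step. The function name and the `depth` parameter (FIFO depth used for wrap-around) are hypothetical; only the four step values come from the table.

```python
# (symbol_slip_up, symbol_slip_dn) -> read pointer increment
READ_POINTER_STEP = {
    (0, 0): 1,  # normal read
    (0, 1): 0,  # hold the pointer
    (1, 0): 2,  # skip one extra entry
    (1, 1): 1,  # both asserted: normal read
}


def next_read_pointer(ptr, slip_up, slip_dn, depth):
    """Return the next FIFO read pointer, wrapping at `depth` entries."""
    return (ptr + READ_POINTER_STEP[(slip_up, slip_dn)]) % depth
```

Relative to the normal increment of 1, asserting symbol_slip_dn alone holds the pointer for one read and asserting symbol_slip_up alone advances it one extra position.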

Standards Supported by Deskew Module

The deskew module in the Achronix SerDes has explicit support for XAUI and Infiniband. For XAUI, align (||A||) characters are sent periodically as per clause 48 of IEEE 802.3. For Infiniband, training sequences (TS1/TS2) are used as deskew characters. Though each of TS1/TS2 is 16 code words long, the de-skew module forms a de-skew ordered set from COM and four data symbols (D10.2). The distance (gap) between COM and the data symbols should be programmed to 'd1 for Infiniband. With a 10-bit data path, the maximum skew handled is 6 bytes; with a 20-bit data path, it is 2 bytes. For training in Infiniband, data valid is initially asserted to pass TS1/TS2/TS3 to the fabric. Data valid is then removed when link training is completed and the fabric decides to de-skew the bonded lanes. Once the de-skew operation is completed, data valid is asserted again.
Besides these two protocols, the user can use this module for the deskew functions of any protocol, provided that the minimum spacing between de-skew characters is maintained.

Elastic FIFO (Elastic Buffer)

An elastic FIFO is used to synchronize the received data from the PMA recovered clock to a system clock, typically the transmit clock. The elastic FIFO also compensates for any frequency offset between the recovered clock and the system clock. It does so by adding or deleting pre-configured skip (or pad) characters in the received data stream. The elastic FIFO in the Achronix SerDes indicates to the downstream logic when skip (or pad) characters have been added or deleted. For PCIe, the elastic FIFO also includes the appropriate status encoding to indicate the add/delete operation.
The elastic FIFO can also be configured to be used as a simple phase compensation FIFO for synchronizing data. When used as a phase compensation FIFO, it is left to the user to guarantee that there is no frequency offset (jitter) between the read and write clocks.

EFIFO Standards and Skip Characters

PCIe Gen3: To support PCIe Gen3, 4 bytes of skip are added at byte positions 4-7 from the sync header associated with the skip ordered set. Skip removal takes place at bytes 0-3 from the sync header associated with the skip ordered set. Due to this particular removal rule, the sync header and receive start block indications are delayed by 4 bytes.
PCIe Gen1/Gen2: For PCIe Gen1/Gen2, the skip ordered set is two 10-bit words; the elastic buffer adds or deletes only the second word.
Fiber Channel: To support Fiber Channel, 4 bytes of skip are added and deleted. The PCS operates in 16-bit data-path mode at the fabric interface and uses 20-bit encoding internally.
XAUI: To support XAUI, the skip ordered set is one 10-bit word, which is added or deleted by the elastic buffer.
GigE: For GigE, the skip ordered set is two 10-bit words: control followed by data. The elastic FIFO adds or removes both of these 10-bit words.
Other Standards: Besides these specific standards, the elastic FIFO can handle generic protocols of a similar kind, because the SKIP and inverted SKIP ordered set of length 2 is programmable. The user has the flexibility to include an alternate (usually inverted) word in the ordered set. Beyond two-word skip ordered sets, only four-word skip ordered sets can be used, and these are specific to Fiber Channel. The elastic FIFO generates the final data valid from the PCS, which is used by the fabric to register data.

EFIFO Operation

“Figure 15: EFIFO SKP Addition/Removal” illustrates the process of SKP addition/removal.
Figure 15: EFIFO SKP Addition/Removal
In “Figure 15: EFIFO SKP Addition/Removal”, upon reset, the difference between the write and read counters is equal to fifo_mid (half the size of the buffer; default 16).
If clk_in is operating at a lower frequency than clk_out, then the read operation is faster than the write operation and the difference between the write and read counters will be less than fifo_mid. In this case, to compensate for clk_in being slower, an SKP is added to the data stream.
If clk_in is operating at a higher frequency than clk_out, then the read operation is slower than the write operation and the difference between the write and read counters will be greater than fifo_mid. In this case, to compensate for clk_out being slower, an SKP is removed from the data stream.
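The add/remove decision can be sketched as a comparison of the FIFO fill level against fifo_mid. The function name, the string return values, and the modular counter model are assumptions; the default depth of 32 follows from fifo_mid being half the buffer size with a default of 16. In hardware, an add or remove can only take place when a skip ordered set is present in the data stream, which this sketch does not model.

```python
def efifo_action(write_count, read_count, fifo_mid=16, depth=32):
    """Decide whether a SKP should be added, removed, or left alone.

    The fill level is the write/read counter difference, taken modulo the
    buffer depth so that counter wrap-around is handled.
    """
    fill = (write_count - read_count) % depth
    if fill < fifo_mid:
        return "add_skp"     # clk_in slower than clk_out: FIFO is draining
    if fill > fifo_mid:
        return "remove_skp"  # clk_in faster than clk_out: FIFO is filling
    return "none"
```

For example, with the counters at 18 and 4 the fill level is 14, below fifo_mid, so a SKP is added; at 22 and 4 the fill level is 18, so a SKP is removed.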
“Figure 16: EFIFO SKP Addition/Removal: PCIE, GigE (802.3) and XAUI (802.3)” illustrates SKP additions and removals for PCIe, GigE (802.3), and XAUI (802.3ae). Note that in the figure, data_i and data_o are not actually aligned; they are depicted that way only for clarity.
Figure 16: EFIFO SKP Addition/Removal: PCIE, GigE (802.3) and XAUI (802.3)