ALTERA FPGA DATA SHEET

Cyclone® FPGA Family Data Sheet
March 2003, ver. 1.1

Introduction

The Cyclone™ field programmable gate array family is based on a 1.5-V, 0.13-µm, all-layer copper SRAM process, with densities up to 20,060 logic elements (LEs) and up to 288 Kbits of RAM. With features like phase-locked loops (PLLs) for clocking and a dedicated double data rate (DDR) interface to meet DDR SDRAM and fast cycle RAM (FCRAM) memory requirements, Cyclone devices are a cost-effective solution for data-path applications. Cyclone devices support various I/O standards, including LVDS at data rates up to 311 megabits per second (Mbps) and 66-MHz, 32-bit peripheral component interconnect (PCI), for interfacing with and supporting ASSP and ASIC devices. Altera also offers new low-cost serial configuration devices to configure Cyclone devices.

Features

2,910 to 20,060 LEs (see Table 1)
Up to 294,912 RAM bits (36,864 bytes)
Supports configuration through low-cost serial configuration device
Support for LVTTL, LVCMOS, SSTL-2, and SSTL-3 I/O standards
Support for 66-MHz, 32-bit PCI standard
Low speed (311 Mbps) LVDS I/O support
Up to two PLLs per device provide clock multiplication and phase shifting
Up to eight global clock lines with six clock resources available per logic array block (LAB) row
Support for external memory, including DDR SDRAM (133 MHz), FCRAM, and single data rate (SDR) SDRAM
Support for multiple intellectual property (IP) cores, including MegaCore functions and Altera Megafunctions Partners Program (AMPP℠) megafunctions
Table 1. Cyclone Device Features
Feature                          EP1C3    EP1C4    EP1C6    EP1C12    EP1C20
LEs                              2,910    4,000    5,980    12,060    20,060
M4K RAM blocks (128 × 36 bits)   13       17       20       52        64
Total RAM bits                   59,904   78,336   92,160   239,616   294,912
PLLs                             1        2        2        2         2
Maximum user I/O pins (1)        104      301      185      249       301
Note to Table 1:
(1) This parameter includes global clock pins.
Altera Corporation 1
DS-CYCLONE-1.1
Cyclone FPGA Family Data Sheet Preliminary Information
Cyclone devices are available in quad flat pack (QFP) and space-saving FineLine BGA packages (see Tables 2 and 3).
Table 2. Cyclone Package Options & I/O Pin Counts
Device   100-Pin    144-Pin         240-Pin    256-Pin        324-Pin        400-Pin
         TQFP (1)   TQFP (1), (2)   PQFP (1)   FineLine BGA   FineLine BGA   FineLine BGA
EP1C3    65         104             -          -              -              -
EP1C4    -          -               -          -              249            301
EP1C6    -          98              185        185            -              -
EP1C12   -          -               173        185            249            -
EP1C20   -          -               -          -              233            301
Notes to Table 2:
(1) TQFP: thin quad flat pack.
PQFP: plastic quad flat pack.
(2) Cyclone devices support vertical migration within the same package (i.e., designers can migrate between the EP1C3 device in the 144-pin TQFP package and the EP1C6 device in the same package).
Table 3. Cyclone QFP & FineLine BGA Package Sizes
Dimension                  100-Pin   144-Pin   240-Pin       256-Pin        324-Pin        400-Pin
                           TQFP      TQFP      PQFP          FineLine BGA   FineLine BGA   FineLine BGA
Pitch (mm)                 0.5       0.5       0.5           1.0            1.0            1.0
Area (mm²)                 256       484       1,024         289            361            441
Length × width (mm × mm)   16 × 16   22 × 22   34.6 × 34.6   17 × 17        19 × 19        21 × 21

Table of Contents

Introduction
Features
Table of Contents
Functional Description
Logic Array Blocks
Logic Elements
MultiTrack Interconnect
Embedded Memory
Global Clock Network & Phase-Locked Loops
I/O Structure
Power Sequencing & Hot Socketing
IEEE Std. 1149.1 (JTAG) Boundary Scan Support
SignalTap II Embedded Logic Analyzer
Configuration
Operating Conditions
Power Consumption
Timing Model
Software
Device Pin-Outs
Ordering Information

Functional Description

Cyclone devices contain a two-dimensional row- and column-based architecture to implement custom logic. Column and row interconnects of varying speeds provide signal interconnects between LABs and embedded memory blocks.
The logic array consists of LABs, with 10 LEs in each LAB. An LE is a small unit of logic providing efficient implementation of user logic functions. LABs are grouped into rows and columns across the device. Cyclone devices range from 2,910 to 20,060 LEs.
M4K RAM blocks are true dual-port memory blocks with 4K bits of memory plus parity (4,608 bits). These blocks provide dedicated true dual-port, simple dual-port, or single-port memory up to 36 bits wide at up to 200 MHz. These blocks are grouped into columns across the device in between certain LABs. Cyclone devices offer from 60 to 288 Kbits of embedded RAM.
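As a quick arithmetic check of the figures above, the total embedded RAM of a device is its M4K block count multiplied by 4,608 bits. The sketch below uses the block counts listed in Table 1; the function and variable names are illustrative only and are not part of any Altera tool.

```python
# Bits per M4K block: 128 words x 36 bits (4,096 data bits plus 512 parity bits).
M4K_BITS = 128 * 36

# M4K block counts per device, as listed in Table 1.
M4K_BLOCKS = {"EP1C3": 13, "EP1C4": 17, "EP1C6": 20, "EP1C12": 52, "EP1C20": 64}


def total_ram_bits(device: str) -> int:
    """Return the total embedded RAM bits for the named Cyclone device."""
    return M4K_BLOCKS[device] * M4K_BITS


if __name__ == "__main__":
    for name in M4K_BLOCKS:
        bits = total_ram_bits(name)
        print(f"{name}: {bits:,} bits ({bits / 1024:.1f} Kbits)")
```

The results reproduce the "Total RAM bits" row of Table 1, from 59,904 bits for the EP1C3 to 294,912 bits (288 Kbits) for the EP1C20.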
Each Cyclone device I/O pin is fed by an I/O element (IOE) located at the ends of LAB rows and columns around the periphery of the device. I/O pins support various single-ended and differential I/O standards, such as the 66-MHz, 32-bit PCI standard and the LVDS I/O standard at up to 311 Mbps. Each IOE contains a bidirectional I/O buffer and three registers for registering input, output, and output-enable signals. Dual-purpose DQS, DQ, and DM pins along with delay chains (used to phase-align DDR signals) provide interface support with external memory devices such as DDR SDRAM and FCRAM devices at up to 133 MHz (266 Mbps).
Cyclone devices provide a global clock network and up to two PLLs. The global clock network consists of eight global clock lines that drive throughout the entire device. The global clock network can provide clocks for all resources within the device, such as IOEs, LEs, and memory blocks. The global clock lines can also be used for control signals. Cyclone PLLs provide general-purpose clocking with clock multiplication and phase shifting as well as external outputs for high-speed differential I/O support.
Figure 1 shows a diagram of the Cyclone EP1C12 device.
Figure 1. Cyclone EP1C12 Device Block Diagram
[Figure: EP1C12 device showing the IOEs, logic array, PLLs, and M4K blocks.]
The number of M4K RAM blocks, PLLs, rows, and columns varies per device. Table 4 lists the resources available in each Cyclone device.
Table 4. Cyclone Device Resources
Device   M4K RAM Columns   M4K RAM Blocks   PLLs   LAB Columns   LAB Rows
EP1C3    1                 13               1      24            13
EP1C4    1                 17               2      26            17
EP1C6    1                 20               2      32            20
EP1C12   2                 52               2      48            26
EP1C20   2                 64               2      64            32

Logic Array Blocks

Each LAB consists of 10 LEs, LE carry chains, LAB control signals, a local interconnect, look-up table (LUT) chain, and register chain connection lines. The local interconnect transfers signals between LEs in the same LAB. LUT chain connections transfer the output of one LE’s LUT to the adjacent LE for fast sequential LUT connections within the same LAB. Register chain connections transfer the output of one LE’s register to the adjacent LE’s register within an LAB. The Quartus® II Compiler places associated logic within an LAB or adjacent LABs, allowing the use of local, LUT chain, and register chain connections for performance and area efficiency. Figure 2 details the Cyclone LAB.
Figure 2. Cyclone LAB Structure
[Figure: an LAB and its local interconnect, bordered by the row and column interconnects, with direct link interconnects to and from the adjacent block on each side.]

LAB Interconnects

The LAB local interconnect can drive LEs within the same LAB. The LAB local interconnect is driven by column and row interconnects and LE outputs within the same LAB. Neighboring LABs, PLLs, and M4K RAM blocks from the left and right can also drive an LAB’s local interconnect through the direct link connection. The direct link connection feature minimizes the use of row and column interconnects, providing higher performance and flexibility. Each LE can drive 30 other LEs through fast local and direct link interconnects. Figure 3 shows the direct link connection.
Figure 3. Direct Link Connection
[Figure: an LAB whose local interconnect is driven by direct link interconnects from the left and right LAB, M4K memory block, PLL, or IOE output, and which drives direct link interconnects to the left and to the right.]

LAB Control Signals

Each LAB contains dedicated logic for driving control signals to its LEs. The control signals include two clocks, two clock enables, two asynchronous clears, synchronous clear, asynchronous preset/load, synchronous load, and add/subtract control signals. This gives a maximum of 10 control signals at a time. Although synchronous load and clear signals are generally used when implementing counters, they can also be used with other functions.
Each LAB can use two clocks and two clock enable signals. Each LAB’s clock and clock enable signals are linked. For example, any LE in a particular LAB using the labclk1 signal will also use labclkena1. If the LAB uses both the rising and falling edges of a clock, it also uses both LAB-wide clock signals. De-asserting the clock enable signal will turn off the LAB-wide clock.
Each LAB can use two asynchronous clear signals and an asynchronous load/preset signal. The asynchronous load acts as a preset when the asynchronous load data input is tied high.
With the LAB-wide addnsub control signal, a single LE can implement a one-bit adder and subtractor. This saves LE resources and improves performance for logic functions such as DSP correlators and signed multipliers that alternate between addition and subtraction depending on data.
The LAB row clocks [5..0] and LAB local interconnect generate the LAB-wide control signals. The MultiTrack™ interconnect’s inherent low skew allows clock and control signal distribution in addition to data. Figure 4 shows the LAB control signal generation circuit.
Figure 4. LAB-Wide Control Signals
[Figure: the six dedicated LAB row clocks and the local interconnect generate labclk1, labclk2, labclkena1, labclkena2, labclr1, labclr2, asyncload or labpre, syncload, synclr, and addnsub.]

Logic Elements

The smallest unit of logic in the Cyclone architecture, the LE, is compact and provides advanced features with efficient logic utilization. Each LE contains a four-input LUT, which is a function generator that can implement any function of four variables. In addition, each LE contains a programmable register and carry chain with carry select capability. A single LE also supports dynamic single bit addition or subtraction mode selectable by an LAB-wide control signal. Each LE drives all types of interconnects: local, row, column, LUT chain, register chain, and direct link interconnects. See Figure 5.
Figure 5. Cyclone LE
[Figure: LE block diagram showing the look-up table (LUT), carry chain, synchronous load and clear logic, asynchronous clear/preset/load logic, clock and clock enable select, and the programmable register, with outputs to row, column, and direct link routing, local routing, the LUT chain, and the register chain.]
Each LE’s programmable register can be configured for D, T, JK, or SR operation. Each register has data, true asynchronous load data, clock, clock enable, clear, and asynchronous load/preset inputs. Global signals, general-purpose I/O pins, or any internal logic can drive the register’s clock and clear control signals. Either general-purpose I/O pins or internal logic can drive the clock enable, preset, asynchronous load, and asynchronous data. The asynchronous load data input comes from the data3 input of the LE. For combinatorial functions, the LUT output bypasses the register and drives directly to the LE outputs.
Each LE has three outputs that drive the local, row, and column routing resources. The LUT or register output can drive these three outputs independently. Two LE outputs drive column or row and direct link routing connections and one drives local interconnect resources. This allows the LUT to drive one output while the register drives another output. This feature, called register packing, improves device utilization because the device can use the register and the LUT for unrelated functions. Another special packing mode allows the register output to feed back into the LUT of the same LE so that the register is packed with its own fan-out LUT. This provides another mechanism for improved fitting. The LE can also drive out registered and unregistered versions of the LUT output.

LUT Chain & Register Chain

In addition to the three general routing outputs, the LEs within an LAB have LUT chain and register chain outputs. LUT chain connections allow LUTs within the same LAB to cascade together for wide input functions. Register chain outputs allow registers within the same LAB to cascade together. The register chain output allows an LAB to use LUTs for a single combinatorial function and the registers to be used for an unrelated shift register implementation. These resources speed up connections between LABs while saving local interconnect resources. See “MultiTrack Interconnect” for more information on LUT chain and register chain connections.

addnsub Signal

The LE’s dynamic adder/subtractor feature saves logic resources by using one set of LEs to implement both an adder and a subtractor. This feature is controlled by the LAB-wide control signal addnsub. The addnsub signal sets the LAB to perform either A + B or A − B. The LUT computes addition; subtraction is computed by adding the two’s complement of the subtrahend. The LAB-wide signal converts to two’s complement by inverting the B bits within the LAB and setting carry-in = 1 to add one to the least significant bit (LSB). The LSB of an adder/subtractor must be placed in the first LE of the LAB, where the LAB-wide addnsub signal automatically sets the carry-in to 1. The Quartus II Compiler automatically places and uses the adder/subtractor feature when using adder/subtractor parameterized functions.
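The two's complement step described above can be illustrated with a small behavioral sketch. It is not a model of the actual LE circuitry: the function name is hypothetical, and the polarity (addnsub = 1 for addition, 0 for subtraction) is an assumption inferred from the signal name.

```python
def add_sub(a: int, b: int, addnsub: int, width: int = 10) -> int:
    """Compute A + B or A - B on `width`-bit words using a single adder.

    Assumption: addnsub = 1 selects addition and addnsub = 0 selects
    subtraction. Subtraction inverts the B bits and sets the carry-in
    of the least significant bit to 1, which adds the two's complement of B.
    """
    mask = (1 << width) - 1
    if addnsub:
        b_eff, carry_in = b & mask, 0
    else:
        b_eff, carry_in = ~b & mask, 1
    return (a + b_eff + carry_in) & mask


if __name__ == "__main__":
    print(add_sub(5, 3, addnsub=1))  # addition
    print(add_sub(5, 3, addnsub=0))  # subtraction
```

A negative result wraps modulo 2^width, as it would in a fixed-width hardware adder.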

LE Operating Modes

The Cyclone LE can operate in one of the following modes:
Normal mode
Dynamic arithmetic mode
Each mode uses LE resources differently. In each mode, the eight available inputs to the LE (the four data inputs from the LAB local interconnect, carry-in0 and carry-in1 from the previous LE, the LAB carry-in from the previous carry-chain LAB, and the register chain connection) are directed to different destinations to implement the desired logic function. LAB-wide signals provide clock, asynchronous clear, asynchronous preset/load, synchronous clear, synchronous load, and clock enable control for the register. These LAB-wide signals are available in all LE modes. The addnsub control signal is allowed in arithmetic mode.
The Quartus II software, in conjunction with parameterized functions such as library of parameterized modules (LPM) functions, automatically chooses the appropriate mode for common functions such as counters, adders, subtractors, and arithmetic functions. If required, the designer can also create special-purpose functions that specify which LE operating mode to use for optimal performance.

Normal Mode

The normal mode is suitable for general logic applications and combinatorial functions. In normal mode, four data inputs from the LAB local interconnect are inputs to a four-input LUT (see Figure 6). The Quartus II Compiler automatically selects the carry-in or the data3 signal as one of the inputs to the LUT. Each LE can use LUT chain connections to drive its combinatorial output directly to the next LE in the LAB. Asynchronous load data for the register comes from the data3 input of the LE. LEs in normal mode support packed registers.
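A four-input LUT can be pictured as a 16-entry truth table addressed by the four data inputs. The following is a minimal sketch of that idea only; the helper names and the bit ordering of the mask (data1 as the least significant address bit) are assumptions made for illustration, not the device's configuration format.

```python
def make_lut4(func) -> int:
    """Build a 16-bit truth-table mask from any Boolean function of four inputs."""
    mask = 0
    for index in range(16):
        d1 = index & 1
        d2 = (index >> 1) & 1
        d3 = (index >> 2) & 1
        d4 = (index >> 3) & 1
        if func(d1, d2, d3, d4):
            mask |= 1 << index
    return mask


def eval_lut4(mask: int, d1: int, d2: int, d3: int, d4: int) -> int:
    """Look up the output bit selected by the four data inputs."""
    index = d1 | (d2 << 1) | (d3 << 2) | (d4 << 3)
    return (mask >> index) & 1


# Example: a four-input XOR (parity) function stored as a truth table.
xor_mask = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)

if __name__ == "__main__":
    print(hex(xor_mask))
    print(eval_lut4(xor_mask, 1, 0, 0, 0))
```

Because any of the 65,536 possible masks can be stored, the same structure implements any function of four variables.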
Figure 6. LE in Normal Mode
[Figure: data1, data2, data3, data4, and cin (from cout of previous LE) feed a 4-input LUT; the LUT and the register drive row, column, and direct link routing, local routing, the LUT chain connection, and the register chain output. LAB-wide signals shown: aload, sload, sclear, addnsub (1), clock, ena, and aclr.]
Note to Figure 6:
(1) This signal is only allowed in normal mode if the LE is at the end of an adder/subtractor chain.

Dynamic Arithmetic Mode

The dynamic arithmetic mode is ideal for implementing adders, counters, accumulators, wide parity functions, and comparators. An LE in dynamic arithmetic mode uses four 2-input LUTs configurable as a dynamic adder/subtractor. The first two 2-input LUTs compute two summations based on a possible carry-in of 1 or 0; the other two LUTs generate carry outputs for the two chains of the carry select circuitry. As shown in Figure 7, the LAB carry-in signal selects either the carry-in0 or carry-in1 chain. The selected chain’s logic level in turn determines which parallel sum is generated as a combinatorial or registered output. For example, when implementing an adder, the sum output is the selection of two possible calculated sums:
data1 + data2 + carry-in0 or data1 + data2 + carry-in1.
The other two LUTs use the data1 and data2 signals to generate two possible carry-out signals: one for a carry of 1 and the other for a carry of 0. The carry-in0 signal acts as the carry select for the carry-out0 output and carry-in1 acts as the carry select for the carry-out1 output. LEs in arithmetic mode can drive out registered and unregistered versions of the LUT output.
The dynamic arithmetic mode also offers clock enable, counter enable, synchronous up/down control, synchronous clear, synchronous load, and dynamic adder/subtractor options. The LAB local interconnect data inputs generate the counter enable and synchronous up/down control signals. The synchronous clear and synchronous load options are LAB-wide signals that affect all registers in the LAB. The Quartus II software automatically places any registers that are not used by the counter into other LABs. The addnsub LAB-wide signal controls whether the LE acts as an adder or subtractor.
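The adder case of dynamic arithmetic mode can be sketched behaviorally as two precomputed sums and two precomputed carries, with the carry-in signals acting as selects. This is a simplified illustration under stated assumptions: the function name is hypothetical, the register and addnsub logic are omitted, and a LAB carry-in of 1 is assumed to select the carry-in1 chain.

```python
def le_arithmetic(data1, data2, carry_in0, carry_in1, lab_carry_in):
    """Model one LE in dynamic arithmetic mode (adder case, 1-bit signals).

    Returns (sum_out, carry_out0, carry_out1).
    """
    # Two sum LUTs: one assumes a carry-in of 0, the other a carry-in of 1.
    sum_if_0 = data1 ^ data2
    sum_if_1 = sum_if_0 ^ 1

    # Two carry LUTs: carry-out for a carry-in of 0 and for a carry-in of 1.
    carry_if_0 = data1 & data2
    carry_if_1 = data1 | data2

    # The LAB carry-in selects the active chain (assumed: 1 selects carry-in1);
    # the logic level of that chain then picks one of the precomputed sums.
    selected = carry_in1 if lab_carry_in else carry_in0
    sum_out = sum_if_1 if selected else sum_if_0

    # Each carry-in acts as the carry select for its own chain's carry-out.
    carry_out0 = carry_if_1 if carry_in0 else carry_if_0
    carry_out1 = carry_if_1 if carry_in1 else carry_if_0
    return sum_out, carry_out0, carry_out1


if __name__ == "__main__":
    print(le_arithmetic(1, 1, carry_in0=0, carry_in1=1, lab_carry_in=0))
```

Both chains are evaluated for every LE, so the later choice between them is only a multiplexer delay rather than a ripple through each bit.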
Figure 7. LE in Dynamic Arithmetic Mode
[Figure: data1, data2, and data3 feed four LUTs; LAB carry-in, carry-in0, and carry-in1 select between the precomputed sums and generate carry-out0 and carry-out1; the register drives row, column, and direct link routing, local routing, the LUT chain connection, and the register chain output. LAB-wide signals shown: addnsub (1), sload, sclear, aload, clock, ena, and aclr.]
Note to Figure 7:
(1) The addnsub signal is tied to the carry input for the first LE of a carry chain only.

Carry-Select Chain

The carry-select chain provides a very fast carry-select function between LEs in dynamic arithmetic mode. The carry-select chain uses the redundant carry calculation to increase the speed of carry functions. The LE is configured to calculate outputs for a possible carry-in of 0 and carry-in of 1 in parallel. The carry-in0 and carry-in1 signals from a lower-order bit feed forward into the higher-order bit via the parallel carry chain and feed into both the LUT and the next portion of the carry chain. Carry-select chains can begin in any LE within an LAB.
The speed advantage of the carry-select chain is in the parallel precomputation of carry chains. Since the LAB carry-in selects the precomputed carry chain, not every LE is in the critical path. Only the propagation delays between LAB carry-in generation (LE 5 and LE 10) are now part of the critical path. This feature allows the Cyclone architecture to implement high-speed counters, adders, multipliers, parity functions, and comparators of arbitrary width.
Figure 8 shows the carry-select circuitry in an LAB for a 10-bit full adder.
One portion of the LUT generates the sum of two bits using the input signals and the appropriate carry-in bit; the sum is routed to the output of the LE. The register can be bypassed for simple adders or used for accumulator functions. Another portion of the LUT generates carry-out bits. An LAB-wide carry-in bit selects which chain is used for the addition of given inputs. The carry-in signal for each chain, carry-in0 or carry-in1, selects the carry-out to carry forward to the carry-in signal of the next-higher-order bit. The final carry-out signal is routed to an LE, where it is fed to local, row, or column interconnects.
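The carry-select scheme for a 10-bit adder in one LAB can be sketched as two groups of five LEs, each computing its sums and carry-out for both possible carry-ins, with the real carry choosing between them. This is a behavioral illustration, not a timing or netlist model; the function names are hypothetical and the five-LE group size is an assumption drawn from the reference to LE 5 and LE 10.

```python
def ripple(a_bits, b_bits, carry):
    """Ripple-add two equal-length bit lists (LSB first) from a given carry-in."""
    sums = []
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sums, carry


def lab_adder(a, b, lab_carry_in=0, width=10, group=5):
    """Add two `width`-bit numbers with carry select; return (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    sums, select = [], lab_carry_in
    for start in range(0, width, group):
        ga = a_bits[start:start + group]
        gb = b_bits[start:start + group]
        sums0, cout0 = ripple(ga, gb, 0)  # chain that assumes a carry-in of 0
        sums1, cout1 = ripple(ga, gb, 1)  # chain that assumes a carry-in of 1
        # The incoming carry only selects between the precomputed results.
        sums += sums1 if select else sums0
        select = cout1 if select else cout0
    total = sum(bit << i for i, bit in enumerate(sums))
    return total, select


if __name__ == "__main__":
    print(lab_adder(300, 200))
    print(lab_adder(1023, 1))
```

In hardware the two chains of a group are evaluated in parallel, so only the select at each group boundary depends on the incoming carry.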
Figure 8. Carry Select Chain
[Figure: LE1 through LE10 in one LAB add inputs A1 to A10 and B1 to B10 to produce Sum1 to Sum10; the LAB carry-in selects between the carry-in0 and carry-in1 chains at LE1 through LE5 and again at LE6 through LE10, and LE10 produces the LAB carry-out. An inset shows one LE with data1 and data2 feeding four LUTs that produce the sum, carry-out0, and carry-out1.]
The Quartus II Compiler automatically creates carry chain logic during design processing, or the designer can create it manually during design entry. Parameterized functions such as LPM functions automatically take advantage of carry chains for the appropriate functions.
The Quartus II Compiler creates carry chains longer than 10 LEs by linking LABs together automatically. For enhanced fitting, a long carry chain runs vertically, allowing fast horizontal connections to M4K memory blocks. A carry chain can continue as far as a full column.

Clear & Preset Logic Control

LAB-wide signals control the logic for the register’s clear and preset signals. The LE directly supports an asynchronous clear and preset function. The register preset is achieved through the asynchronous load of a logic high. The direct asynchronous preset does not require a NOT-gate push-back technique. Cyclone devices support simultaneous preset/asynchronous load and clear signals. An asynchronous clear signal takes precedence if both signals are asserted simultaneously. Each LAB supports up to two clears and one preset signal.
In addition to the clear and preset ports, Cyclone devices provide a chip-wide reset pin (DEV_CLRn) that resets all registers in the device. An option set before compilation in the Quartus II software controls this pin. This chip-wide reset overrides all other control signals.
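The precedence among the asynchronous controls described above can be summarized in a short behavioral sketch. The function name is hypothetical, and two points are assumptions rather than statements from the text: DEV_CLRn is treated as active low (suggested by the trailing "n"), and a chip-wide reset is modeled as clearing the register to 0.

```python
def async_register_value(q, dev_clrn, aclr, aload, adata):
    """Return the register output after the asynchronous controls are applied.

    q        current register output (0 or 1)
    dev_clrn chip-wide reset, assumed active low
    aclr     asynchronous clear, active high in this sketch
    aload    asynchronous load/preset, active high in this sketch
    adata    asynchronous load data; tying it high makes aload act as a preset
    """
    if dev_clrn == 0:   # chip-wide reset overrides all other control signals
        return 0
    if aclr:            # clear takes precedence over preset/asynchronous load
        return 0
    if aload:           # load the asynchronous data (preset when adata = 1)
        return adata
    return q            # no asynchronous control asserted: hold the value


if __name__ == "__main__":
    # Clear and preset asserted together: the clear wins.
    print(async_register_value(q=1, dev_clrn=1, aclr=1, aload=1, adata=1))
```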

MultiTrack Interconnect

In the Cyclone architecture, connections between LEs, M4K memory blocks, and device I/O pins are provided by the MultiTrack interconnect structure with DirectDrive™ technology. The MultiTrack interconnect consists of continuous, performance-optimized routing lines of different speeds used for inter- and intra-design block connectivity. The Quartus II Compiler automatically places critical design paths on faster interconnects to improve design performance.
DirectDrive technology is a deterministic routing technology that ensures identical routing resource usage for any function regardless of placement within the device. The MultiTrack interconnect and DirectDrive technology simplify the integration stage of block-based designing by eliminating the re-optimization cycles that typically follow design changes and additions.
The MultiTrack interconnect consists of row and column interconnects that span fixed distances. A routing structure with fixed length resources for all devices allows predictable and repeatable performance when migrating through different device densities. Dedicated row interconnects route signals to and from LABs, PLLs, and M4K memory blocks within the same row. These row resources include:
Direct link interconnects between LABs and adjacent blocks
R4 interconnects traversing four blocks to the right or left
The direct link interconnect allows an LAB or M4K memory block to drive into the local interconnect of its left and right neighbors. Only one side of a PLL block interfaces with direct link and row interconnects. The direct link interconnect provides fast communication between adjacent LABs and/or blocks without using row interconnect resources.
The R4 interconnects span four LABs, or two LABs and one M4K RAM block. These resources are used for fast row connections in a four-LAB region. Every LAB has its own set of R4 interconnects to drive either left or right. Figure 9 shows R4 interconnect connections from an LAB. R4 interconnects can drive and be driven by M4K memory blocks, PLLs, and row IOEs. For LAB interfacing, a primary LAB or LAB neighbor can drive a given R4 interconnect. For R4 interconnects that drive to the right, the primary LAB and its right neighbor can drive onto the interconnect. For R4 interconnects that drive to the left, the primary LAB and its left neighbor can drive onto the interconnect. R4 interconnects can drive other R4 interconnects to extend the range of LABs they can drive. R4 interconnects can also drive C4 interconnects for connections from one row to another.
Figure 9. R4 Interconnect Connections
[Figure: a primary LAB (2) between two LAB neighbors, with R4 interconnects driving left and right, C4 column interconnects (1), and an adjacent LAB driving onto another LAB's R4 interconnect.]
Notes to Figure 9:
(1) C4 interconnects can drive R4 interconnects.
(2) This pattern is repeated for every LAB in the LAB row.
The column interconnect operates similarly to the row interconnect. Each column of LABs is served by a dedicated column interconnect, which vertically routes signals to and from LABs, M4K memory blocks, and row and column IOEs. These column resources include:
LUT chain interconnects within an LAB
Register chain interconnects within an LAB
C4 interconnects traversing a distance of four blocks in an up and down direction
Cyclone devices include an enhanced interconnect structure within LABs for routing LE output to LE input connections faster using LUT chain connections and register chain connections. The LUT chain connection allows the combinatorial output of an LE to directly drive the fast input of the LE right below it, bypassing the local interconnect. These resources can be used as a high-speed connection for wide fan-in functions from LE 1 to LE 10 in the same LAB. The register chain connection allows the register output of one LE to connect directly to the register input of the next LE in the LAB for fast shift registers. The Quartus II Compiler automatically takes advantage of these resources to improve utilization and performance. Figure 10 shows the LUT chain and register chain interconnects.
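The register chain behaves like a shift register built from the registers of consecutive LEs in an LAB. The class below is a minimal behavioral sketch with a hypothetical name; it models only the data path on each clock and ignores the LAB-wide enable, clear, and load signals.

```python
class RegisterChain:
    """Shift register formed by chaining the registers of adjacent LEs."""

    def __init__(self, length: int = 10):
        # One register per LE; an LAB holds 10 LEs.
        self.regs = [0] * length

    def clock(self, data_in: int) -> int:
        """Apply one clock edge; return the bit leaving the last register."""
        out = self.regs[-1]
        # Each register takes the previous LE's register output directly,
        # without passing through the local interconnect.
        self.regs = [data_in] + self.regs[:-1]
        return out


if __name__ == "__main__":
    chain = RegisterChain(length=3)
    print([chain.clock(bit) for bit in [1, 0, 1, 1, 0]])
```

Because the chain has its own routing, the LUTs of the same LEs remain free for an unrelated combinatorial function.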
Figure 10. LUT Chain & Register Chain Interconnects
[Figure: LE 1 through LE 10 in an LAB, showing local interconnect routing among LEs in the LAB, LUT chain routing to the adjacent LE, and register chain routing to the adjacent LE's register input.]
The C4 interconnects span four LABs or M4K blocks up or down from a source LAB. Every LAB has its own set of C4 interconnects to drive either up or down. Figure 11 shows the C4 interconnect connections from an LAB in a column. The C4 interconnects can drive and be driven by all types of architecture blocks, including PLLs, M4K memory blocks, and column and row IOEs. For LAB interconnection, a primary LAB or its LAB neighbor can drive a given C4 interconnect. C4 interconnects can drive each other to extend their range as well as drive row interconnects for column-to-column connections.
Figure 11. C4 Interconnect Connections Note (1)
[Figure: C4 interconnects driving up and down from an LAB; a C4 interconnect drives local and R4 interconnects up to four rows away, and an adjacent LAB can drive onto a neighboring LAB's C4 interconnect.]
Note to Figure 11:
(1) Each C4 interconnect can drive either up or down four rows.
All embedded blocks communicate with the logic array similarly to LAB-to-LAB interfaces. Each block (i.e., M4K memory or PLL) connects to row and column interconnects and has local interconnect regions driven by row and column interconnects. These blocks also have direct link interconnects for fast connections to and from a neighboring LAB.
Table 5 shows the Cyclone device’s routing scheme.
Table 5. Cyclone Device Routing Scheme
[Table: matrix of source resources (rows) against destination resources (columns). Both axes list LUT chain, register chain, local interconnect, direct link interconnect, R4 interconnect, C4 interconnect, LE, M4K RAM block, PLL, column IOE, and row IOE. The positions of the individual check marks are not recoverable here.]

Embedded Memory

The Cyclone embedded memory consists of columns of M4K memory blocks. EP1C3, EP1C4, and EP1C6 devices have one column of M4K blocks, while EP1C12 and EP1C20 devices have two columns (see Table 1 for total RAM bits per density). Each M4K block can implement various types of memory with or without parity, including true dual-port, simple dual-port, and single-port RAM, ROM, and FIFO buffers. The M4K blocks support the following features:
4,608 RAM bits
200 MHz performance
True dual-port memory
Simple dual-port memory
Single-port memory
Byte enable
Parity bits
Shift register
FIFO buffer
ROM
Mixed clock mode
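The relationship between port width and depth for a 4,608-bit block can be worked out with simple arithmetic. In the sketch below, the set of supported widths is an assumption made for illustration (this section states only the 128 × 36 organization and the 36-bit maximum width), and the function name is hypothetical. Widths that are multiples of 9 are taken to use the parity bits; the others use only the 4,096 data bits.

```python
DATA_BITS = 4096    # 4K bits of memory
TOTAL_BITS = 4608   # 4K bits plus parity

# Assumed port widths; only the 36-bit case is stated in this section.
PARITY_WIDTHS = (9, 18, 36)
PLAIN_WIDTHS = (1, 2, 4, 8, 16, 32)


def m4k_depth(width: int) -> int:
    """Return the number of words one M4K block holds at the given width."""
    if width in PARITY_WIDTHS:
        return TOTAL_BITS // width
    if width in PLAIN_WIDTHS:
        return DATA_BITS // width
    raise ValueError(f"unsupported width in this sketch: {width}")


if __name__ == "__main__":
    for w in sorted(PLAIN_WIDTHS + PARITY_WIDTHS):
        print(f"{m4k_depth(w)} x {w}")
```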

Memory Modes

The M4K memory blocks include input registers that synchronize writes and output registers to pipeline designs and improve system performance. M4K blocks offer a true dual-port mode to support any combination of two-port operations: two reads, two writes, or one read and one write at two different clock frequencies. Figure 12 shows true dual-port memory.
Figure 12. True Dual-Port Memory Configuration
Port A: dataA[ ], addressA[ ], wrenA, clockA, clockenA, qA[ ], aclrA
Port B: dataB[ ], addressB[ ], wrenB, clockB, clockenB, qB[ ], aclrB
In addition to true dual-port memory, the M4K memory blocks support simple dual-port and single-port RAM. Simple dual-port memory supports a simultaneous read and write. Single-port memory supports non-simultaneous reads and writes. Figure 13 shows these different M4K RAM memory port configurations.
Figure 13. Simple Dual-Port & Single-Port Memory Configurations
Simple Dual-Port Memory: data[ ], wraddress[ ], wren, inclock, inclocken, inaclr, rdaddress[ ], rden, q[ ], outclock, outclocken, outaclr
Single-Port Memory (1): data[ ], address[ ], wren, inclock, inclocken, inaclr, q[ ], outclock, outclocken, outaclr
Note to Figure 13:
(1) Two single-port memory blocks can be implemented in a single M4K block as long
as each of the two independent block sizes is equal to or less than half of the M4K block size.
The memory blocks also enable mixed-width data ports for reading and writing to the RAM ports in dual-port RAM configuration. For example,
the memory block can be written in ×1 mode at port A and read out in ×16
mode from port B.
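The mixed-width behavior can be illustrated with a small software model. This is only a sketch: the class name MixedWidthRam is hypothetical, and the bit ordering (a wide word covers consecutive narrow addresses, lowest address in the least significant bit) is an assumption that the text above does not specify.

```python
TOTAL_BITS = 4096  # data bits of one M4K block when parity bits are not used


class MixedWidthRam:
    """Toy bit-array model of one M4K block with ports of different widths."""

    def __init__(self):
        self.bits = [0] * TOTAL_BITS

    def _base(self, width, address):
        # ASSUMPTION: word `address` of a `width`-bit port starts at bit address * width.
        base = address * width
        if address < 0 or base + width > TOTAL_BITS:
            raise IndexError("address out of range for this port width")
        return base

    def write(self, width, address, value):
        base = self._base(width, address)
        for i in range(width):
            self.bits[base + i] = (value >> i) & 1

    def read(self, width, address):
        base = self._base(width, address)
        return sum(self.bits[base + i] << i for i in range(width))


ram = MixedWidthRam()
pattern = 0xA5C3
# Port A writes one bit per address (x1 mode) into bit addresses 16..31.
for i in range(16):
    ram.write(1, 16 + i, (pattern >> i) & 1)
# Port B reads the same storage as a single x16 word at word address 1.
word = ram.read(16, 1)
```

Under the assumed ordering, sixteen ×1 writes on port A fill exactly one ×16 word on port B.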
The Cyclone memory architecture can implement fully synchronous RAM by registering both the input and output signals to the M4K RAM block. All M4K memory block inputs are registered, providing synchronous write cycles. In synchronous operation, the memory block generates its own self-timed strobe write enable (wren) signal derived from a global clock. In contrast, a circuit using asynchronous RAM must generate the RAM wren signal while ensuring its data and address signals meet setup and hold time specifications relative to the wren signal. The output registers can be bypassed. Pseudo-asynchronous reading is possible in the simple dual-port mode of M4K blocks by clocking the read enable and read address registers on the negative clock edge and bypassing the output registers.
When an M4K block is configured as RAM or ROM, the designer can use an initialization file to pre-load the memory contents.
Two single-port memory blocks can be implemented in a single M4K block as long as each of the two independent block sizes is equal to or less than half of the M4K block size.
The Quartus II software automatically implements larger memory by
combining multiple M4K memory blocks. For example, two 256 × 16-bit RAM blocks can be combined to form a 256 × 32-bit RAM block. Memory
performance does not degrade for memory blocks using the maximum number of words allowed. Logical memory blocks using less than the maximum number of words use physical blocks in parallel, eliminating any external control logic that would increase delays. To create a larger high-speed memory block, the Quartus II software automatically combines memory blocks with LE control logic.
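As a rough illustration of how block counts grow when memories are combined, the sketch below estimates the number of M4K blocks needed for a logical memory. The function name m4k_blocks_needed is hypothetical, and the tiling strategy (try every M4K configuration listed in this data sheet and keep the smallest count) is an assumption; the Quartus II software may choose differently.

```python
# M4K depth x width configurations, as listed under "Memory Configuration Sizes".
M4K_CONFIGS = [
    (4096, 1), (2048, 2), (1024, 4), (512, 8), (512, 9),
    (256, 16), (256, 18), (128, 32), (128, 36),
]


def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def m4k_blocks_needed(depth, width):
    """Estimate the M4K blocks for a depth x width memory (ASSUMED tiling rule)."""
    if depth < 1 or width < 1:
        raise ValueError("depth and width must be positive")
    return min(ceil_div(depth, d) * ceil_div(width, w) for d, w in M4K_CONFIGS)


# The example from the text: a 256 x 32-bit RAM built from two 256 x 16-bit blocks.
blocks_256x32 = m4k_blocks_needed(256, 32)
```

The estimate ignores any LE control logic needed to address blocks when memories are stacked in depth.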

Parity Bit Support

The M4K blocks support a parity bit for each byte. The parity bit, along with internal LE logic, can implement parity checking for error detection to ensure data integrity. Designers can also use parity-size data words to store user-specified control bits. Byte enables are also available for data input masking during write operations.
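The parity check performed in LE logic can be sketched in software as follows. The helper names are hypothetical, and both the even-parity convention and the placement of the parity bit in bit 8 of a 9-bit word are assumptions; the M4K block only stores the ninth bit and does not define how it is generated.

```python
def parity_bit(byte):
    """Return the bit that makes the number of ones in byte plus parity even."""
    return bin(byte & 0xFF).count("1") & 1


def encode(byte):
    """Pack 8 data bits and their parity bit into a 9-bit word (parity in bit 8)."""
    return (parity_bit(byte) << 8) | (byte & 0xFF)


def check(word9):
    """Return True if the stored parity bit matches the stored data bits."""
    return parity_bit(word9 & 0xFF) == (word9 >> 8) & 1


stored = encode(0b10110010)   # four ones, so the parity bit is 0
corrupted = stored ^ 0x01     # flip one data bit to simulate an error
```

A single flipped bit makes check() return False; as with any single parity bit, two flipped bits in the same byte go undetected.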

Shift Register Support

The designer can configure M4K memory blocks to implement shift registers for DSP applications such as pseudo-random number generators, multi-channel filtering, auto-correlation, and cross-correlation functions. These and other DSP applications require local data storage, traditionally implemented with standard flip-flops, which can quickly consume many logic cells and routing resources for large shift registers. A more efficient alternative is to use embedded memory as a shift register block, which saves logic cell and routing resources and provides a more efficient implementation with the dedicated circuitry.
The size of a w × m × n shift register is determined by the input data width
(w), the length of the taps (m), and the number of taps (n). The size of a
w × m × n shift register must be less than or equal to the maximum number
of memory bits in the M4K block (4,608 bits). The total number of shift
register outputs (number of taps n × width w) must be less than the maximum data width of the M4K RAM block (×36). To create larger shift
registers, multiple memory blocks are cascaded together.
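The two sizing limits above can be checked with a few lines of arithmetic. The function name fits_in_one_m4k is hypothetical, and the strict "less than" comparison on the tap outputs follows the wording of the text literally, which is an assumption about the exact boundary.

```python
M4K_BITS = 4608       # total memory bits in one M4K block
M4K_MAX_WIDTH = 36    # maximum data width of one M4K block


def fits_in_one_m4k(w, m, n):
    """Check a w x m x n shift register against the two limits stated in the text."""
    total_bits = w * m * n    # storage needed
    tap_output_bits = n * w   # total width of all tap outputs
    # ASSUMPTION: strict comparison on outputs, taken literally from the text.
    return total_bits <= M4K_BITS and tap_output_bits < M4K_MAX_WIDTH


ok = fits_in_one_m4k(w=8, m=128, n=4)        # 4,096 bits, 32 tap outputs
too_deep = fits_in_one_m4k(w=8, m=256, n=4)  # 8,192 bits exceeds 4,608
too_wide = fits_in_one_m4k(w=16, m=16, n=3)  # 48 tap outputs exceed the x36 width
```

A shift register that fails either check would need multiple cascaded memory blocks.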
Data is written into each address location at the falling edge of the clock and read from the address at the rising edge of the clock. The shift register mode logic automatically controls the positive and negative edge clocking to shift the data in one clock cycle. Figure 14 shows the M4K memory block in the shift register mode.
Figure 14. Shift Register Memory Configuration
w × m × n Shift Register: n cascaded m-bit shift registers (n = number of taps), each with a w-bit input and a w-bit tap output.

Memory Configuration Sizes

The memory address depths and output widths can be configured as 4,096 × 1, 2,048 × 2, 1,024 × 4, 512 × 8 (or 512 × 9 bits), 256 × 16 (or 256 × 18 bits), and 128 × 32 (or 128 × 36 bits). The 128 × 32- or 36-bit configuration is not available in the true dual-port mode. Mixed-width configurations are also possible, allowing different read and write widths. Tables 6 and 7 summarize the possible M4K RAM block configurations.
Table 6. M4K RAM Block Configurations (Simple Dual-Port)
Read Port   Write Port
            4K × 1    2K × 2    1K × 4    512 × 8   256 × 16  128 × 32  512 × 9   256 × 18  128 × 36
4K × 1      v         v         v         v         v         v
2K × 2      v         v         v         v         v         v
1K × 4      v         v         v         v         v         v
512 × 8     v         v         v         v         v         v
256 × 16    v         v         v         v         v         v
128 × 32    v         v         v         v         v         v
512 × 9                                                                 v         v         v
256 × 18                                                                v         v         v
128 × 36                                                                v         v         v
Table 7. M4K RAM Block Configurations (True Dual-Port)
Port A      Port B
            4K × 1    2K × 2    1K × 4    512 × 8   256 × 16  512 × 9   256 × 18
4K × 1      v         v         v         v         v
2K × 2      v         v         v         v         v
1K × 4      v         v         v         v         v
512 × 8     v         v         v         v         v
256 × 16    v         v         v         v         v
512 × 9                                                       v         v
256 × 18                                                      v         v
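The port-width combinations in Tables 6 and 7 can be expressed as a short rule. The sketch below assumes that the tables follow one pattern: both ports use widths without parity (×1 through ×32), or both use widths with parity (×9, ×18, ×36). It also applies the stated restriction that the ×32 and ×36 widths are unavailable in true dual-port mode. The function name is_valid_pair is hypothetical.

```python
# Port width -> address depth for one M4K block, from the configuration list above.
WIDTH_TO_DEPTH = {1: 4096, 2: 2048, 4: 1024, 8: 512, 16: 256, 32: 128,
                  9: 512, 18: 256, 36: 128}
PARITY_WIDTHS = {9, 18, 36}


def is_valid_pair(width_a, width_b, true_dual_port=False):
    """Return True if the two port widths can share one M4K block (ASSUMED rule)."""
    if width_a not in WIDTH_TO_DEPTH or width_b not in WIDTH_TO_DEPTH:
        return False
    # The 128 x 32 and 128 x 36 configurations are not available in true dual-port mode.
    if true_dual_port and (width_a >= 32 or width_b >= 32):
        return False
    # ASSUMPTION: parity and non-parity widths cannot be mixed across ports.
    return (width_a in PARITY_WIDTHS) == (width_b in PARITY_WIDTHS)


widths = list(WIDTH_TO_DEPTH)
simple_count = sum(is_valid_pair(a, b) for a in widths for b in widths)
true_count = sum(is_valid_pair(a, b, True) for a in widths for b in widths)
```

Under this rule there are 45 valid combinations in simple dual-port mode and 29 in true dual-port mode.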
When the M4K RAM block is configured as a shift register block, the
designer can create a shift register up to 4,608 bits (w × m × n).

Byte Enables

M4K blocks support byte writes when the write port has a data width of 16, 18, 32, or 36 bits. The byte enables allow the input data to be masked so the device can write to specific bytes. The unwritten bytes retain their previously written values. Table 8 summarizes the byte selection.
Table 8. Byte Enable for M4K Blocks   Notes (1), (2)
byteena[3..0]   datain ×18   datain ×36
[0] = 1         [8..0]       [8..0]
[1] = 1         [17..9]      [17..9]
[2] = 1         -            [26..18]
[3] = 1         -            [35..27]
Notes to Table 8:
(1) Any combination of byte enables is possible.
(2) Byte enables can be used in the same manner with 8-bit words, i.e., in ×16 and ×32 modes.
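The byte selection in Table 8 amounts to a per-lane mask on the write data. The following sketch models one write; the function name masked_write and its parameters are hypothetical, with lane_bits=9 for the ×18 and ×36 modes and lane_bits=8 for the ×16 and ×32 modes.

```python
def masked_write(old, new, byteena, lane_bits=9, lanes=4):
    """Return the stored word after a write with the given byte enables.

    Bit k of byteena enables lane k, i.e. bits [k*lane_bits + lane_bits - 1 .. k*lane_bits].
    Lanes whose enable bit is 0 keep their previously written value.
    """
    result = old
    for lane in range(lanes):
        if (byteena >> lane) & 1:
            mask = ((1 << lane_bits) - 1) << (lane * lane_bits)
            result = (result & ~mask) | (new & mask)
    return result


# x36 mode: byteena = 0101 writes bits [8..0] and [26..18] only.
x36_result = masked_write(0x000000000, 0xFFFFFFFFF, 0b0101)
# x32 mode (8-bit lanes): byteena = 1001 writes the lowest and highest bytes.
x32_result = masked_write(0x11223344, 0xAABBCCDD, 0b1001, lane_bits=8)
```

For a ×16 or ×18 port, the same function can be called with lanes=2 so that only byteena[1..0] is examined.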

Control Signals & M4K Interface

The M4K blocks allow for different clocks on their inputs and outputs. Either of the two clocks feeding the block can clock M4K block registers (renwe, address, byte enable, datain, and output registers). Only the output register can be bypassed. The six labclk signals or local interconnects can drive the control signals for the A and B ports of the M4K block. LEs can also control the clock_a, clock_b, renwe_a, renwe_b, clr_a, clr_b, clocken_a, and clocken_b signals, as shown in Figure 15.
The R4, C4, and direct link interconnects from adjacent LABs drive the M4K block local interconnect. The M4K blocks can communicate with LABs on either the left or right side through these row resources, or with LAB columns on either the left or right side through the column resources. Up to 10 direct link input connections to the M4K block are possible from the left adjacent LAB and another 10 from the right adjacent LAB. M4K block outputs can also connect to left and right LABs through 10 direct link interconnects each. Figure 16 shows the M4K block to logic array interface.
Figure 15. M4K RAM Block Control Signals
Sources: dedicated LAB row clocks (6), local interconnect
Control signals: clock_a, clocken_a, renwe_a, alcr_a, clock_b, clocken_b, renwe_b, alcr_b
Figure 16. M4K RAM Block LAB Row Interface
Routing: C4 interconnects, R4 interconnects, direct link interconnects to and from adjacent LABs (10), LAB row clocks (6), M4K RAM block local interconnect region
M4K RAM block connections: datain, address, byte enable, clocks, control signals, dataout