The Cyclone™ field programmable gate array family is based on a 1.5-V,
0.13-µm, all-layer copper SRAM process, with densities up to 20,060 logic
elements (LEs) and up to 288 Kbits of RAM. With features like phase-locked
loops (PLLs) for clocking and a dedicated double data rate (DDR)
interface to meet DDR SDRAM and fast cycle RAM (FCRAM) memory
requirements, Cyclone devices are a cost-effective solution for data-path
applications. Cyclone devices support various I/O standards, including
LVDS at data rates up to 311 megabits per second (Mbps) and 66-MHz,
32-bit peripheral component interconnect (PCI), for interfacing with and
supporting ASSP and ASIC devices. Altera also offers new low-cost serial
configuration devices to configure Cyclone devices.
■2,910 to 20,060 LEs, see Table 1
■Up to 294,912 RAM bits (36,864 bytes)
■Supports configuration through low-cost serial configuration device
■Support for LVTTL, LVCMOS, SSTL-2, and SSTL-3 I/O standards
■Support for 66-MHz, 32-bit PCI standard
■Low speed (311 Mbps) LVDS I/O support
■Up to two PLLs per device provide clock multiplication and phase
shifting
■Up to eight global clock lines with six clock resources available per
logic array block (LAB) row
■Support for external memory, including DDR SDRAM (133 MHz),
FCRAM, and single data rate (SDR) SDRAM
■Support for multiple intellectual property (IP) cores, including
MegaCore functions and Altera Megafunctions Partners Program (AMPP℠)
megafunctions
Table 1. Cyclone Device Features

Feature                          EP1C3    EP1C4    EP1C6    EP1C12   EP1C20
LEs                              2,910    4,000    5,980    12,060   20,060
M4K RAM blocks (128 × 36 bits)   13       17       20       52       64
Total RAM bits                   59,904   78,336   92,160   239,616  294,912
PLLs                             1        2        2        2        2
Maximum user I/O pins (1)        104      301      185      249      301

Note to Table 1:
(1) This parameter includes global clock pins.
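The "Total RAM bits" row of Table 1 follows from the block count: each M4K block holds 128 × 36 = 4,608 bits. The short Python sketch below re-derives the totals from the table values. The dictionary and function names are illustrative only and are not part of any Altera tool.

```python
# Values copied from Table 1; the data layout and names are illustrative.
M4K_BITS = 128 * 36  # 4,608 bits per M4K block, parity bits included

M4K_BLOCKS = {"EP1C3": 13, "EP1C4": 17, "EP1C6": 20, "EP1C12": 52, "EP1C20": 64}
TOTAL_RAM_BITS = {"EP1C3": 59_904, "EP1C4": 78_336, "EP1C6": 92_160,
                  "EP1C12": 239_616, "EP1C20": 294_912}


def total_ram_bits(device: str) -> int:
    """Return the embedded RAM capacity of a device in bits."""
    return M4K_BLOCKS[device] * M4K_BITS


# Every table entry matches blocks x 4,608.
for device, expected in TOTAL_RAM_BITS.items():
    assert total_ram_bits(device) == expected
```

For the largest device this gives 294,912 bits, which is 36,864 bytes or 288 Kbits, matching the figures quoted in the feature list.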
Altera Corporation 1
DS-CYCLONE-1.1
Cyclone FPGA Family Data SheetPreliminary Information
Cyclone devices are available in quad flat pack (QFP) and space-saving
Functional Description
Cyclone devices contain a two-dimensional row- and column-based
architecture to implement custom logic. Column and row interconnects of
varying speeds provide signal interconnects between LABs and
embedded memory blocks.
The logic array consists of LABs, with 10 LEs in each LAB. An LE is a small
unit of logic providing efficient implementation of user logic functions.
LABs are grouped into rows and columns across the device. Cyclone
devices range from 2,910 to 20,060 LEs.
M4K RAM blocks are true dual-port memory blocks with 4K bits of
memory plus parity (4,608 bits). These blocks provide dedicated true
dual-port, simple dual-port, or single-port memory up to 36-bits wide at
up to 200 MHz. These blocks are grouped into columns across the device
between certain LABs. Cyclone devices offer from 60 to 288 Kbits of
embedded RAM.
Each Cyclone device I/O pin is fed by an I/O element (IOE) located at the
ends of LAB rows and columns around the periphery of the device. I/O
pins support various single-ended and differential I/O standards, such as
the 66-MHz, 32-bit PCI standard and the LVDS I/O standard at up to
311 Mbps. Each IOE contains a bidirectional I/O buffer and three registers
for registering input, output, and output-enable signals. Dual-purpose
DQS, DQ, and DM pins along with delay chains (used to phase-align DDR
signals) provide interface support with external memory devices such as
DDR SDRAM and FCRAM devices at up to 133 MHz (266 Mbps).
Cyclone devices provide a global clock network and up to two PLLs. The
global clock network consists of eight global clock lines that drive
throughout the entire device. The global clock network can provide clocks
for all resources within the device, such as IOEs, LEs, and memory blocks.
The global clock lines can also be used for control signals. Cyclone PLLs
provide general-purpose clocking with clock multiplication and phase
shifting as well as external outputs for high-speed differential I/O
support.
Figure 1 shows a diagram of the Cyclone EP1C12 device.
Figure 1. Cyclone EP1C12 Device Block Diagram
[Figure: EP1C12 device showing the IOEs, logic array, PLL, and M4K blocks.]
The number of M4K RAM blocks, PLLs, rows, and columns varies per
device. Table 4 lists the resources available in each Cyclone device.
Logic Array Blocks
Each LAB consists of 10 LEs, LE carry chains, LAB control signals, a local
interconnect, look-up table (LUT) chain, and register chain connection
lines. The local interconnect transfers signals between LEs in the same
LAB. LUT chain connections transfer the output of one LE’s LUT to the
adjacent LE for fast sequential LUT connections within the same LAB.
Register chain connections transfer the output of one LE’s register to the
adjacent LE’s register within an LAB. The Quartus® II Compiler places
associated logic within an LAB or adjacent LABs, allowing the use of local,
LUT chain, and register chain connections for performance and area
efficiency. Figure 2 details the Cyclone LAB.
Figure 2. Cyclone LAB Structure
[Figure: an LAB and its local interconnect, with the row interconnect, the column interconnect, and direct link interconnects to and from the adjacent block on each side.]
LAB Interconnects
The LAB local interconnect can drive LEs within the same LAB. The LAB
local interconnect is driven by column and row interconnects and LE
outputs within the same LAB. Neighboring LABs, PLLs, and M4K RAM
blocks from the left and right can also drive an LAB’s local interconnect
through the direct link connection. The direct link connection feature
minimizes the use of row and column interconnects, providing higher
performance and flexibility. Each LE can drive 30 other LEs through fast
local and direct link interconnects. Figure 3 shows the direct link
connection.
Figure 3. Direct Link Connection
[Figure: an LAB and its local interconnect. Direct link interconnects arrive from the left and from the right LAB, M4K memory block, PLL, or IOE output; the LAB in turn drives direct link interconnects to the left and to the right.]
LAB Control Signals
Each LAB contains dedicated logic for driving control signals to its LEs.
The control signals include two clocks, two clock enables, two
asynchronous clears, synchronous clear, asynchronous preset/load,
synchronous load, and add/subtract control signals. This gives a
maximum of 10 control signals at a time. Although synchronous load and
clear signals are generally used when implementing counters, they can
also be used with other functions.
Each LAB can use two clocks and two clock enable signals. Each LAB’s
clock and clock enable signals are linked. For example, any LE in a
particular LAB using the labclk1 signal will also use labclkena1. If
the LAB uses both the rising and falling edges of a clock, it also uses both
LAB-wide clock signals. De-asserting the clock enable signal will turn off
the LAB-wide clock.
Each LAB can use two asynchronous clear signals and an asynchronous
load/preset signal. The asynchronous load acts as a preset when the
asynchronous load data input is tied high.
With the LAB-wide addnsub control signal, a single LE can implement a
one-bit adder and subtractor. This saves LE resources and improves
performance for logic functions such as DSP correlators and signed
multipliers that alternate between addition and subtraction depending on
data.
The LAB row clocks [5..0] and LAB local interconnect generate the
LAB-wide control signals. The MultiTrack™ interconnect’s inherent low skew
allows clock and control signal distribution in addition to data. Figure 4
shows the LAB control signal generation circuit.
Figure 4. LAB-Wide Control Signals
[Figure: six dedicated LAB row clocks and the local interconnect generate labclk1, labclk2, labclkena1, labclkena2, labclr1, labclr2, asyncload or labpre, syncload, synclr, and addnsub.]
Logic Elements
The smallest unit of logic in the Cyclone architecture, the LE, is compact
and provides advanced features with efficient logic utilization. Each LE
contains a four-input LUT, which is a function generator that can
implement any function of four variables. In addition, each LE contains a
programmable register and carry chain with carry select capability. A
single LE also supports dynamic single-bit addition or subtraction mode
selectable by an LAB-wide control signal. Each LE drives all types of
interconnects: local, row, column, LUT chain, register chain, and direct
link interconnects. See Figure 5.
Figure 5. Cyclone LE
[Figure: LE block diagram. Inputs: data1 to data4, addnsub, LAB carry-in, carry-in0, carry-in1, labclr1, labclr2, labpre/aload, chip-wide reset, labclk1, labclk2, labclkena1, labclkena2, LAB-wide synchronous load, LAB-wide synchronous clear, and register chain routing from the previous LE. Blocks: look-up table (LUT), carry chain, synchronous load and clear logic, asynchronous clear/preset/load logic, clock and clock enable select, register bypass, packed register select, register feedback, and the programmable register (PRN/ALD, D, ADATA, ENA, CLRN, Q). Outputs: row, column, and direct link routing (two outputs); local routing; LUT chain routing to the next LE; register chain output; carry-out0, carry-out1, and LAB carry-out.]
Each LE’s programmable register can be configured for D, T, JK, or SR
operation. Each register has data, true asynchronous load data, clock,
clock enable, clear, and asynchronous load/preset inputs. Global signals,
general-purpose I/O pins, or any internal logic can drive the register’s
clock and clear control signals. Either general-purpose I/O pins or
internal logic can drive the clock enable, preset, asynchronous load, and
asynchronous data. The asynchronous load data input comes from the
data3 input of the LE. For combinatorial functions, the LUT output
bypasses the register and drives directly to the LE outputs.
Each LE has three outputs that drive the local, row, and column routing
resources. The LUT or register output can drive these three outputs
independently. Two LE outputs drive column or row and direct link
routing connections and one drives local interconnect resources. This
allows the LUT to drive one output while the register drives another
output. This feature, called register packing, improves device utilization
because the device can use the register and the LUT for unrelated
functions. Another special packing mode allows the register output to
feed back into the LUT of the same LE so that the register is packed with
its own fan-out LUT. This provides another mechanism for improved
fitting. The LE can also drive out registered and unregistered versions of
the LUT output.
LUT Chain & Register Chain
In addition to the three general routing outputs, the LEs within an LAB
have LUT chain and register chain outputs. LUT chain connections allow
LUTs within the same LAB to cascade together for wide input functions.
Register chain outputs allow registers within the same LAB to cascade
together. The register chain output allows an LAB to use LUTs for a single
combinatorial function and the registers to be used for an unrelated shift
register implementation. These resources speed up connections between
LABs while saving local interconnect resources. See “MultiTrack
Interconnect” on page 17 for more information on LUT chain and register
chain connections.
addnsub Signal
The LE’s dynamic adder/subtractor feature saves logic resources by using
one set of LEs to implement both an adder and a subtractor. This feature
is controlled by the LAB-wide control signal addnsub. The addnsub
signal sets the LAB to perform either A + B or A − B. The LUT computes
addition; subtraction is computed by adding the two’s complement of the
subtrahend (B). The LAB-wide signal converts to two’s complement
by inverting the B bits within the LAB and setting carry-in = 1 to add one
to the least significant bit (LSB). The LSB of an adder/subtractor must be
placed in the first LE of the LAB, where the LAB-wide addnsub signal
automatically sets the carry-in to 1. The Quartus II Compiler
automatically places and uses the adder/subtractor feature when using
adder/subtractor parameterized functions.
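The two's complement step described above (invert the B bits, then set the carry-in to 1) can be checked numerically. The Python sketch below is a behavioral illustration only, not a model of the silicon. The function name, the `subtract` flag, and the 10-bit default width are assumptions made for the example, and the polarity of the hardware addnsub signal is not modeled because this section does not state it.

```python
def add_or_subtract(a: int, b: int, subtract: bool, width: int = 10) -> int:
    """Return (a + b) or (a - b) modulo 2**width using a single adder."""
    mask = (1 << width) - 1
    # For subtraction, invert the B bits and inject a carry-in of 1,
    # which together add the two's complement of B.
    b_effective = (~b & mask) if subtract else (b & mask)
    carry_in = 1 if subtract else 0
    return ((a & mask) + b_effective + carry_in) & mask


print(add_or_subtract(5, 3, subtract=False))  # 8
print(add_or_subtract(5, 3, subtract=True))   # 2
print(add_or_subtract(3, 5, subtract=True))   # 1022, which is -2 modulo 2**10
```

The same adder hardware serves both operations; only the B inversion and the carry-in change, which is why one set of LEs can implement both functions.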
LE Operating Modes
The Cyclone LE can operate in one of the following modes:
■Normal mode
■Dynamic arithmetic mode
Each mode uses LE resources differently. In each mode, eight available
inputs to the LE (the four data inputs from the LAB local interconnect,
carry-in0 and carry-in1 from the previous LE, the LAB carry-in
from the previous carry-chain LAB, and the register chain
connection) are directed to different destinations to implement the
desired logic function. LAB-wide signals provide clock, asynchronous
clear, asynchronous preset/load, synchronous clear, synchronous load,
and clock enable control for the register. These LAB-wide signals are
available in all LE modes. The addnsub control signal is allowed in
arithmetic mode.
The Quartus II software, in conjunction with parameterized functions
such as library of parameterized modules (LPM) functions, automatically
chooses the appropriate mode for common functions such as counters,
adders, subtractors, and arithmetic functions. If required, the designer can
also create special-purpose functions that specify which LE operating
mode to use for optimal performance.
Normal Mode
The normal mode is suitable for general logic applications and
combinatorial functions. In normal mode, four data inputs from the LAB
local interconnect are inputs to a four-input LUT (see Figure 6). The
Quartus II Compiler automatically selects the carry-in or the data3 signal
as one of the inputs to the LUT. Each LE can use LUT chain connections to
drive its combinatorial output directly to the next LE in the LAB.
Asynchronous load data for the register comes from the data3 input of
the LE. LEs in normal mode support packed registers.
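A four-input LUT is, in effect, a 16-entry truth table addressed by its four inputs, which is why it can implement any function of four variables. The following Python sketch illustrates the idea. The helper names and the choice of data1 as the least significant address bit are assumptions for illustration, not a description of the Cyclone configuration format.

```python
def make_lut4(func) -> int:
    """Build a 16-bit truth-table mask for a Boolean function of four inputs."""
    mask = 0
    for index in range(16):
        # Assumed ordering: data1 is address bit 0, data4 is address bit 3.
        bits = [(index >> i) & 1 for i in range(4)]
        if func(*bits):
            mask |= 1 << index
    return mask


def eval_lut4(mask: int, data1: int, data2: int, data3: int, data4: int) -> int:
    """Look up the output bit selected by the four data inputs."""
    index = data1 | (data2 << 1) | (data3 << 2) | (data4 << 3)
    return (mask >> index) & 1


# Example: a 4-input XOR (odd parity) programmed into one LUT.
xor4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
print(hex(xor4))                   # 0x6996
print(eval_lut4(xor4, 1, 0, 0, 0))  # 1
```

Any other four-input function differs only in the 16-bit mask, not in the lookup mechanism.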
Figure 6. LE in Normal Mode
[Figure: LE in normal mode. Inputs: data1, data2, data3, cin (from cout of previous LE), data4, register chain connection, and the LAB-wide signals aload, sload, sclear, clock, ena, aclr, and addnsub (1). Blocks: 4-input LUT, register feedback, and the register (ALD/PRE, ADATA, D, ENA, CLRN, Q). Outputs: row, column, and direct link routing (two outputs); local routing; LUT chain connection; register chain output.]
Note to Figure 6:
(1) This signal is only allowed in normal mode if the LE is at the end of an adder/subtractor chain.
Dynamic Arithmetic Mode
The dynamic arithmetic mode is ideal for implementing adders, counters,
accumulators, wide parity functions, and comparators. An LE in dynamic
arithmetic mode uses four 2-input LUTs configurable as a dynamic
adder/subtractor. The first two 2-input LUTs compute two summations
based on a possible carry-in of 1 or 0; the other two LUTs generate carry
outputs for the two chains of the carry select circuitry. As shown in
Figure 7, the LAB carry-in signal selects either the carry-in0 or
carry-in1 chain. The selected chain’s logic level in turn determines
which parallel sum is generated as a combinatorial or registered output.
For example, when implementing an adder, the sum output is the
selection of two possible calculated sums:
data1 + data2 + carry-in0
or
data1 + data2 + carry-in1
The other two LUTs use the data1 and data2 signals to generate two
possible carry-out signals: one for a carry of 1 and the other for a carry of
0. The carry-in0 signal acts as the carry select for the carry-out0
output and carry-in1 acts as the carry select for the carry-out1
output. LEs in arithmetic mode can drive out registered and unregistered
versions of the LUT output.
The dynamic arithmetic mode also offers clock enable, counter enable,
synchronous up/down control, synchronous clear, synchronous load,
and dynamic adder/subtractor options. The LAB local interconnect data
inputs generate the counter enable and synchronous up/down control
signals. The synchronous clear and synchronous load options are
LAB-wide signals that affect all registers in the LAB. The Quartus II software
automatically places any registers that are not used by the counter into
other LABs. The addnsub LAB-wide signal controls whether the LE acts
as an adder or subtractor.
Figure 7. LE in Dynamic Arithmetic Mode
[Figure: LE in dynamic arithmetic mode. Inputs: LAB carry-in, carry-in0, carry-in1, data1, data2, data3, register chain connection, and the LAB-wide signals addnsub (1), sload, sclear, aload, clock, ena, and aclr. Blocks: four LUTs, register feedback, and the register (ALD/PRE, ADATA, D, ENA, CLRN, Q). Outputs: carry-out0 and carry-out1; row, column, and direct link routing (two outputs); local routing; LUT chain connection; register chain output.]
Note to Figure 7:
(1) The addnsub signal is tied to the carry input for the first LE of a carry chain only.
Carry-Select Chain
The carry-select chain provides a very fast carry-select function between
LEs in dynamic arithmetic mode. The carry-select chain uses the
redundant carry calculation to increase the speed of carry functions. The
LE is configured to calculate outputs for a possible carry-in of 0 and
carry-in of 1 in parallel. The carry-in0 and carry-in1 signals from a
lower-order bit feed forward into the higher-order bit via the parallel carry
chain and feed into both the LUT and the next portion of the carry chain.
Carry-select chains can begin in any LE within an LAB.
The speed advantage of the carry-select chain is in the parallel precomputation of carry chains. Since the LAB carry-in selects the
precomputed carry chain, not every LE is in the critical path. Only the
propagation delays between LAB carry-in generation (LE 5 and LE 10) are
now part of the critical path. This feature allows the Cyclone architecture
to implement high-speed counters, adders, multipliers, parity functions,
and comparators of arbitrary width.
Figure 8 shows the carry-select circuitry in an LAB for a 10-bit full adder.
One portion of the LUT generates the sum of two bits using the input
signals and the appropriate carry-in bit; the sum is routed to the output of
the LE. The register can be bypassed for simple adders or used for
accumulator functions. Another portion of the LUT generates carry-out
bits. An LAB-wide carry-in bit selects which chain is used for the addition
of given inputs. The carry-in signal for each chain, carry-in0 or carry-in1, selects the carry-out to carry forward to the carry-in signal of
the next-higher-order bit. The final carry-out signal is routed to an LE,
where it is fed to local, row, or column interconnects.
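The carry-select principle can be illustrated with a small behavioral model in Python. It is a sketch under stated assumptions, not the actual LE circuit. The function name is hypothetical, and the group size of five bits is taken from the statement that LAB carry-in is generated at LE 5 and LE 10. Within each group, two carry chains are computed speculatively, one assuming a carry-in of 0 and one assuming 1, and the incoming carry only selects between the finished results.

```python
def carry_select_add(a: int, b: int, lab_carry_in: int = 0,
                     width: int = 10, group: int = 5):
    """Add two width-bit values with carry select; return (sum, carry_out)."""
    total = 0
    select = lab_carry_in  # carry that selects between the two chains
    for base in range(0, width, group):
        c0, c1 = 0, 1      # speculative chains: carry-in of 0 and of 1
        sum0 = sum1 = 0
        for i in range(base, min(base + group, width)):
            x, y = (a >> i) & 1, (b >> i) & 1
            sum0 |= (x ^ y ^ c0) << i
            sum1 |= (x ^ y ^ c1) << i
            c0 = (x & y) | (c0 & (x ^ y))
            c1 = (x & y) | (c1 & (x ^ y))
        # Both results already exist; the incoming carry just picks one.
        total |= sum1 if select else sum0
        select = c1 if select else c0
    return total, select


print(carry_select_add(1023, 1))  # (0, 1): 10-bit overflow with carry-out
print(carry_select_add(300, 45))  # (345, 0)
```

In this model only the select step between groups depends on the previous group, which mirrors the statement that not every LE lies on the critical path.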
Figure 8. Carry Select Chain
[Figure: a 10-bit adder in one LAB. LE1 through LE10 take inputs A1/B1 through A10/B10 and produce Sum1 through Sum10. The LAB carry-in enters at the top, 0/1 selectors are drawn before LE1 and before LE6, and LE10 produces the LAB carry-out. Inset: within an LE, data1 and data2 feed four LUTs that, together with LAB carry-in, carry-in0, and carry-in1, produce the sum, carry-out0, and carry-out1.]
The Quartus II Compiler automatically creates carry chain logic during
design processing, or the designer can create it manually during design
entry. Parameterized functions such as LPM functions automatically take
advantage of carry chains for the appropriate functions.
The Quartus II Compiler creates carry chains longer than 10 LEs by
linking LABs together automatically. For enhanced fitting, a long carry
chain runs vertically allowing fast horizontal connections to M4K
memory blocks. A carry chain can continue as far as a full column.
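As a rough planning aid, the number of LABs a carry chain spans follows from the 10 LEs per LAB. The Python sketch below assumes one result bit per LE and lets the chain start at an arbitrary LE, since chains can begin in any LE within an LAB. The function name and parameters are hypothetical, and the actual placement is decided by the Quartus II Compiler.

```python
LES_PER_LAB = 10  # from the text: each LAB contains 10 LEs


def labs_for_carry_chain(width_bits: int, first_le: int = 1) -> int:
    """Estimate how many vertically linked LABs a carry chain occupies.

    Assumes one bit per LE; first_le is the 1-based LE position in the
    first LAB where the chain begins.
    """
    occupied_positions = (first_le - 1) + width_bits
    return -(-occupied_positions // LES_PER_LAB)  # ceiling division


print(labs_for_carry_chain(10))              # 1
print(labs_for_carry_chain(32))              # 4
print(labs_for_carry_chain(10, first_le=6))  # 2
```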
Clear & Preset Logic Control
LAB-wide signals control the logic for the register’s clear and preset
signals. The LE directly supports an asynchronous clear and preset
function. The register preset is achieved through the asynchronous load of
a logic high. The direct asynchronous preset does not require a NOT-gate
push-back technique. Cyclone devices support simultaneous preset/
asynchronous load and clear signals. An asynchronous clear signal takes
precedence if both signals are asserted simultaneously. Each LAB
supports up to two clears and one preset signal.
In addition to the clear and preset ports, Cyclone devices provide a
chip-wide reset pin (DEV_CLRn) that resets all registers in the device. An option
set before compilation in the Quartus II software controls this pin. This
chip-wide reset overrides all other control signals.
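The precedence rules in this section can be summarized as a small priority function. The Python sketch below models only the asynchronous behavior of one register. The function and argument names are hypothetical, and the signal polarities (DEV_CLRn active low, as its name suggests, and the other inputs active high) are assumptions, since this section does not state them.

```python
def async_register_value(q: int, *, dev_clrn: int = 1, aclr: int = 0,
                         aload: int = 0, adata: int = 1) -> int:
    """Return the register output after its asynchronous inputs are applied."""
    if dev_clrn == 0:
        return 0      # chip-wide reset overrides all other control signals
    if aclr:
        return 0      # asynchronous clear takes precedence over preset/load
    if aload:
        return adata  # preset is an asynchronous load of a logic high
    return q          # no asynchronous input asserted: value is held


print(async_register_value(0, aload=1))              # 1 (preset)
print(async_register_value(1, aclr=1, aload=1))      # 0 (clear wins)
print(async_register_value(1, dev_clrn=0, aload=1))  # 0 (chip-wide reset wins)
```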
MultiTrack Interconnect
In the Cyclone architecture, connections between LEs, M4K memory
blocks, and device I/O pins are provided by the MultiTrack interconnect
structure with DirectDrive™ technology. The MultiTrack interconnect
consists of continuous, performance-optimized routing lines of different
speeds used for inter- and intra-design block connectivity. The Quartus II
Compiler automatically places critical design paths on faster
interconnects to improve design performance.
DirectDrive technology is a deterministic routing technology that ensures
identical routing resource usage for any function regardless of placement
within the device. The MultiTrack interconnect and DirectDrive
technology simplify the integration stage of block-based designing by
eliminating the re-optimization cycles that typically follow design
changes and additions.
The MultiTrack interconnect consists of row and column interconnects
that span fixed distances. A routing structure with fixed length resources
for all devices allows predictable and repeatable performance when
migrating through different device densities. Dedicated row
interconnects route signals to and from LABs, PLLs, and M4K memory
blocks within the same row. These row resources include:
■Direct link interconnects between LABs and adjacent blocks
■R4 interconnects traversing four blocks to the right or left
The direct link interconnect allows an LAB or M4K memory block to drive
into the local interconnect of its left and right neighbors. Only one side of
a PLL block interfaces with direct link and row interconnects. The direct
link interconnect provides fast communication between adjacent LABs
and/or blocks without using row interconnect resources.
The R4 interconnects span four LABs, or two LABs and one M4K RAM
block. These resources are used for fast row connections in a four-LAB
region. Every LAB has its own set of R4 interconnects to drive either left
or right. Figure 9 shows R4 interconnect connections from an LAB. R4
interconnects can drive and be driven by M4K memory blocks, PLLs, and
row IOEs. For LAB interfacing, a primary LAB or LAB neighbor can drive
a given R4 interconnect. For R4 interconnects that drive to the right, the
primary LAB and right neighbor can drive on to the interconnect. For R4
interconnects that drive to the left, the primary LAB and its left neighbor
can drive on to the interconnect. R4 interconnects can drive other R4
interconnects to extend the range of LABs they can drive. R4 interconnects
can also drive C4 interconnects for connections from one row to another.
Figure 9. R4 Interconnect Connections
[Figure: a primary LAB (2) between two LAB neighbors, with an R4 interconnect driving left, an R4 interconnect driving right, and C4 column interconnects (1). An adjacent LAB can drive onto another LAB's R4 interconnect.]
Notes to Figure 9:
(1) C4 interconnects can drive R4 interconnects.
(2) This pattern is repeated for every LAB in the LAB row.
The column interconnect operates similarly to the row interconnect. Each
column of LABs is served by a dedicated column interconnect, which
vertically routes signals to and from LABs, M4K memory blocks, and row
and column IOEs. These column resources include:
■LUT chain interconnects within an LAB
■Register chain interconnects within an LAB
■C4 interconnects traversing a distance of four blocks in an up and
down direction
Cyclone devices include an enhanced interconnect structure within LABs
for routing LE output to LE input connections faster using LUT chain
connections and register chain connections. The LUT chain connection
allows the combinatorial output of an LE to directly drive the fast input of
the LE right below it, bypassing the local interconnect. These resources
can be used as a high-speed connection for wide fan-in functions from LE
1 to LE 10 in the same LAB. The register chain connection allows the
register output of one LE to connect directly to the register input of the
next LE in the LAB for fast shift registers. The Quartus II Compiler
automatically takes advantage of these resources to improve utilization
and performance. Figure 10 shows the LUT chain and register chain
interconnects.
[Figure 10 (LUT chain and register chain interconnects). Surviving label: register chain routing to adjacent LE's register input.]
The C4 interconnects span four LABs or M4K blocks up or down from a
source LAB. Every LAB has its own set of C4 interconnects to drive either
up or down. Figure 11 shows the C4 interconnect connections from an
LAB in a column. The C4 interconnects can drive and be driven by all
types of architecture blocks, including PLLs, M4K memory blocks, and
column and row IOEs. For LAB interconnection, a primary LAB or its LAB
neighbor can drive a given C4 interconnect. C4 interconnects can drive
each other to extend their range as well as drive row interconnects for
column-to-column connections.
Figure 11. C4 Interconnect Connections, Note (1)
[Figure: a column of LABs with row and local interconnects, a C4 interconnect driving up, and a C4 interconnect driving down. The C4 interconnect drives local and R4 interconnects up to four rows. An adjacent LAB can drive onto a neighboring LAB's C4 interconnect.]
Note to Figure 11:
(1) Each C4 interconnect can drive either up or down four rows.
All embedded blocks communicate with the logic array similar to
LAB-to-LAB interfaces. Each block (i.e., M4K memory or PLL) connects to row
and column interconnects and has local interconnect regions driven by
row and column interconnects. These blocks also have direct link
interconnects for fast connections to and from a neighboring LAB.
Table 5 shows the Cyclone device’s routing scheme.
Table 5. Cyclone Device Routing Scheme
[Table body not recoverable. Surviving labels, listed under the Source and Destination headings: LUT chain, register chain, local interconnect, direct link interconnect, R4 interconnect, C4 interconnect, LE, M4K RAM block, PLL, column IOE, row IOE.]
Embedded Memory
The Cyclone embedded memory consists of columns of M4K memory
blocks. EP1C3 and EP1C6 devices have one column of M4K blocks, while
EP1C12 and EP1C20 devices have two columns (see Table 1 on page 1 for
total RAM bits per density). Each M4K block can implement various types
of memory with or without parity, including true dual-port, simple
dual-port, and single-port RAM, ROM, and FIFO buffers. The M4K blocks
support the following features:
■4,608 RAM bits
■200 MHz performance
■True dual-port memory
■Simple dual-port memory
■Single-port memory
■Byte enable
■Parity bits
■Shift register
■FIFO buffer
■ROM
■Mixed clock mode
Memory Modes
The M4K memory blocks include input registers that synchronize writes
and output registers to pipeline designs and improve system
performance. M4K blocks offer a true dual-port mode to support any
combination of two-port operations: two reads, two writes, or one read
and one write at two different clock frequencies. Figure 12 shows true
dual-port memory.
Figure 12. True Dual-Port Memory Configuration
[Figure: M4K block with two ports. Port A: dataA[ ], addressA[ ], wrenA, clockA, clockenA, qA[ ], aclrA. Port B: dataB[ ], addressB[ ], wrenB, clockB, clockenB, qB[ ], aclrB.]
In addition to true dual-port memory, the M4K memory blocks support
simple dual-port and single-port RAM. Simple dual-port memory
supports a simultaneous read and write. Single-port memory supports
non-simultaneous reads and writes. Figure 13 shows these different M4K
RAM memory port configurations.
The memory blocks also enable mixed-width data ports for reading and
writing to the RAM ports in dual-port RAM configuration. For example,
the memory block can be written in ×1 mode at port A and read out in ×16
mode from port B.
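The ×1 write and ×16 read example can be pictured as two differently sized windows onto the same bit array. The Python sketch below illustrates this. The class name and the LSB-first packing (narrow address k maps to bit k mod 16 of word k div 16) are assumptions for the example, because the actual bit ordering is not given in this section.

```python
class MixedWidthRam:
    """Bit array accessed through ports of different widths (illustrative)."""

    def __init__(self, bits: int = 4096):
        self.cells = [0] * bits

    def write(self, address: int, value: int, width: int) -> None:
        """Write a width-bit word; bit i lands at cell address * width + i."""
        for i in range(width):
            self.cells[address * width + i] = (value >> i) & 1

    def read(self, address: int, width: int) -> int:
        """Read a width-bit word starting at cell address * width."""
        return sum(self.cells[address * width + i] << i for i in range(width))


ram = MixedWidthRam()
# Port A: sixteen x1 writes fill the bits of the second 16-bit word.
for bit in range(16):
    ram.write(16 + bit, (0xA5C3 >> bit) & 1, width=1)
# Port B: a single x16 read returns the assembled word.
print(hex(ram.read(1, width=16)))  # 0xa5c3
```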
The Cyclone memory architecture can implement fully synchronous RAM
by registering both the input and output signals to the M4K RAM block.
All M4K memory block inputs are registered, providing synchronous
write cycles. In synchronous operation, the memory block generates its
own self-timed strobe write enable (wren) signal derived from a global
clock. In contrast, a circuit using asynchronous RAM must generate the
RAM wren signal while ensuring its data and address signals meet setup
and hold time specifications relative to the wren signal. The output
registers can be bypassed. Pseudo-asynchronous reading is possible in the
simple dual-port mode of M4K blocks by clocking the read enable and
read address registers on the negative clock edge and bypassing the
output registers.
When an M4K block is configured as RAM or ROM, the designer can use an
initialization file to pre-load the memory contents.
Two single-port memory blocks can be implemented in a single M4K
block as long as each of the two independent block sizes is equal to or less
than half of the M4K block size.
The Quartus II software automatically implements larger memory by
combining multiple M4K memory blocks. For example, two 256 × 16-bit
RAM blocks can be combined to form a 256 × 32-bit RAM block. Memory
performance does not degrade for memory blocks using the maximum
number of words allowed. Logical memory blocks using less than the
maximum number of words use physical blocks in parallel, eliminating
any external control logic that would increase delays. To create a larger
high-speed memory block, the Quartus II software automatically
combines memory blocks with LE control logic.
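The 256 × 16 to 256 × 32 example amounts to placing two blocks side by side and sharing the address. The Python sketch below illustrates that arrangement. The class names are hypothetical and the model says nothing about how the Quartus II software actually maps the blocks.

```python
class RamBlock:
    """A depth x width memory standing in for one M4K configuration."""

    def __init__(self, depth: int, width: int):
        self.width = width
        self.words = [0] * depth

    def write(self, address: int, value: int) -> None:
        self.words[address] = value & ((1 << self.width) - 1)

    def read(self, address: int) -> int:
        return self.words[address]


class WideRam:
    """Blocks in parallel: same address to every block, data bits split."""

    def __init__(self, depth: int = 256, block_width: int = 16, blocks: int = 2):
        self.block_width = block_width
        self.blocks = [RamBlock(depth, block_width) for _ in range(blocks)]

    def write(self, address: int, value: int) -> None:
        for lane, block in enumerate(self.blocks):
            block.write(address, value >> (lane * self.block_width))

    def read(self, address: int) -> int:
        return sum(block.read(address) << (lane * self.block_width)
                   for lane, block in enumerate(self.blocks))


ram = WideRam()
ram.write(200, 0xDEADBEEF)
print(hex(ram.read(200)))  # 0xdeadbeef
```

Because each block sees the full address and only a slice of the data, no address decoding is needed in this case, which is consistent with the statement that parallel blocks avoid extra control logic.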
Parity Bit Support
The M4K blocks support a parity bit for each byte. The parity bit, along
with internal LE logic, can implement parity checking for error detection
to ensure data integrity. Designers can also use parity-size data words to
store user-specified control bits. Byte enables are also available for data
input masking during write operations.
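A per-byte parity scheme can be sketched as follows: the ninth bit stored with each byte is generated on write and checked on read, in the device by LE logic. The Python below is an illustration only. Even parity and the placement of the parity bit at bit 8 are assumptions; the M4K block simply stores whatever ninth bit it is given.

```python
def with_parity(byte: int) -> int:
    """Return a 9-bit word: the data byte plus an even-parity bit at bit 8."""
    data = byte & 0xFF
    parity = bin(data).count("1") & 1  # 1 when the byte has an odd number of ones
    return (parity << 8) | data


def parity_ok(word9: int) -> bool:
    """Check that a stored 9-bit word still has an even number of ones."""
    return bin(word9 & 0x1FF).count("1") % 2 == 0


stored = with_parity(0b00000111)
print(hex(stored))                # 0x107
print(parity_ok(stored))          # True
print(parity_ok(stored ^ 0x001))  # False: a single flipped bit is detected
```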
Shift Register Support
The designer can configure M4K memory blocks to implement shift
registers for DSP applications such as pseudo-random number
generators, multi-channel filtering, auto-correlation, and cross-correlation
functions. These and other DSP applications require local data storage,
traditionally implemented with standard flip-flops, which can quickly
consume many logic cells and routing resources for large shift registers. A
more efficient alternative is to use embedded memory as a shift register
block, which saves logic cell and routing resources and provides a more
efficient implementation with the dedicated circuitry.
The size of a w × m × n shift register is determined by the input data width
(w), the length of the taps (m), and the number of taps (n). The size of a
w × m × n shift register must be less than or equal to the maximum number
of memory bits in the M4K block (4,608 bits). The total number of shift
register outputs (number of taps n × width w) must be less than the
maximum data width of the M4K RAM block (×36). To create larger shift
registers, multiple memory blocks are cascaded together.
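The two limits above can be checked with a short helper. This is a sketch with a hypothetical function name. It applies the tap-output limit as a strict inequality, exactly as the text states it.

```python
M4K_BITS = 4608      # maximum memory bits in one M4K block
M4K_MAX_WIDTH = 36   # maximum data width of one M4K block (x36)


def shift_register_fits(w: int, m: int, n: int) -> bool:
    """Check whether a w x m x n shift register fits in a single M4K block.

    w: input data width, m: length of each tap, n: number of taps.
    """
    total_bits = w * m * n   # storage required
    tap_outputs = n * w      # total shift register output width
    return total_bits <= M4K_BITS and tap_outputs < M4K_MAX_WIDTH
```

For example, an 8 × 64 × 4 shift register needs 2,048 bits and 32 outputs, so it fits in one block. An 8 × 256 × 4 shift register needs 8,192 bits and must be cascaded across multiple blocks.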
Data is written into each address location at the falling edge of the clock
and read from the address at the rising edge of the clock. The shift register
mode logic automatically controls the positive and negative edge clocking
to shift the data in one clock cycle. Figure 14 shows the M4K memory
block in the shift register mode.
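At the cycle level, the tapped shift register can be modeled as follows. This sketch ignores the internal falling-edge write and rising-edge read sequencing and only shows what appears at the taps after each clock cycle. The class name and the tap indexing are assumptions made for illustration.

```python
from collections import deque


class TappedShiftRegister:
    """Cycle-level model of a w x m x n shift register with n taps."""

    def __init__(self, w: int, m: int, n: int) -> None:
        self.mask = (1 << w) - 1   # keep only w bits of each sample
        self.m = m
        self.n = n
        # A full deque with maxlen drops the oldest sample on appendleft.
        self.cells = deque([0] * (m * n), maxlen=m * n)

    def shift(self, value: int) -> list[int]:
        """Shift in one w-bit sample and return the n tap outputs."""
        self.cells.appendleft(value & self.mask)
        # Tap k is the last cell of the k-th m-stage segment.
        return [self.cells[(k + 1) * self.m - 1] for k in range(self.n)]


sr = TappedShiftRegister(w=8, m=2, n=2)
outputs = [sr.shift(v) for v in (1, 2, 3, 4)]
```

After the fourth shift in this example, the first tap holds 3 and the second tap holds 1, so each tap lags the previous one by m = 2 cycles.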
Figure 14. Shift Register Memory Configuration
Memory Configuration Sizes
The memory address depths and output widths can be configured as
4,096 × 1, 2,048 × 2, 1,024 × 4, 512 × 8 (or 512 × 9 bits), 256 × 16 (or 256 × 18
bits), and 128 × 32 (or 128 × 36 bits). The 128 × 32- or 36-bit configuration
is not available in the true dual-port mode. Mixed-width configurations
are also possible, allowing different read and write widths. Tables 6 and 7
summarize the possible M4K RAM block configurations.
When the M4K RAM block is configured as a shift register block, the
designer can create a shift register up to 4,608 bits (w × m × n).
Byte Enables
M4K blocks support byte writes when the write port has a data width of
16, 18, 32, or 36 bits. The byte enables allow the input data to be masked
so the device can write to specific bytes. The unwritten bytes retain the
previously written value. Table 8 summarizes the byte selection.
Notes to Table 8:
(1) Any combination of byte enables is possible.
(2) Byte enables can be used in the same manner with 8-bit words, i.e., in ×16 and ×32
modes.
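The masking behavior of byte enables can be sketched as a behavioral model. The function name is hypothetical. The lane ordering (enable 0 selects the least significant byte) and the 9-bit lane width used for ×18 and ×36 words are assumptions made for this example.

```python
def masked_write(old_word: int, new_word: int,
                 byte_enables: list[int], byte_bits: int = 9) -> int:
    """Return the stored word after a write with per-byte enables.

    byte_enables[k] == 1 writes byte lane k from new_word (lane 0 is assumed
    to be the least significant byte). Disabled lanes keep their old value.
    Use byte_bits=9 for x18/x36 words and byte_bits=8 for x16/x32 words.
    """
    result = old_word
    for lane, enabled in enumerate(byte_enables):
        if enabled:
            lane_mask = ((1 << byte_bits) - 1) << (lane * byte_bits)
            result = (result & ~lane_mask) | (new_word & lane_mask)
    return result


# x32 example: update only the lowest and highest bytes.
stored = masked_write(0x11223344, 0xAABBCCDD, [1, 0, 0, 1], byte_bits=8)
```

In this example `stored` becomes 0xAA2233DD: the two enabled bytes take the new data and the two middle bytes keep the previously written value.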
Control Signals & M4K Interface
The M4K blocks allow for different clocks on their inputs and outputs.
Either of the two clocks feeding the block can clock M4K block registers
(renwe, address, byte enable, datain, and output registers). Only the
output register can be bypassed. The six labclk signals or local
interconnects can drive the control signals for the A and B ports of the
M4K block. LEs can also control the clock_a, clock_b, renwe_a, renwe_b, clr_a, clr_b, clocken_a, and clocken_b signals, as
shown in Figure 15.
The R4, C4, and direct link interconnects from adjacent LABs drive the
M4K block local interconnect. The M4K blocks can communicate with
LABs on either the left or right side through these row resources or with
LAB columns on either the right or left with the column resources. Up to
10 direct link input connections to the M4K block are possible from the left
adjacent LAB and another 10 are possible from the right adjacent LAB. M4K
block outputs can also connect to left and right LABs through 10 direct
link interconnects each. Figure 16 shows the M4K block to logic array
interface.
Figure 15. M4K RAM Block Control Signals
Figure 16. M4K RAM Block LAB Row Interface