Intel Quartus Prime Pro User Manual

Intel® Quartus® Prime Pro Edition User Guide
Design Recommendations
Updated for Intel® Quartus® Prime Design Suite: 21.1
UG-20131 | 2021.03.29

Contents

1. Recommended HDL Coding Styles
1.1. Using Provided HDL Templates
1.1.1. Inserting HDL Code from a Provided Template
1.2. Instantiating IP Cores in HDL
1.3. Inferring Multipliers and DSP Functions
1.3.1. Inferring Multipliers
1.3.2. Inferring Multiply-Accumulator and Multiply-Adder Functions
1.4. Inferring Memory Functions from HDL Code
1.4.1. Inferring RAM functions from HDL Code
1.4.2. Inferring ROM Functions from HDL Code
1.4.3. Inferring Shift Registers in HDL Code
1.5. Register and Latch Coding Guidelines
1.5.1. Register Power-Up Values
1.5.2. Secondary Register Control Signals Such as Clear and Clock Enable
1.5.3. Latches
1.6. General Coding Guidelines
1.6.1. Tri-State Signals
1.6.2. Clock Multiplexing
1.6.3. Adder Trees
1.6.4. State Machine HDL Guidelines
1.6.5. Multiplexer HDL Guidelines
1.6.6. Cyclic Redundancy Check Functions
1.6.7. Comparator HDL Guidelines
1.6.8. Counter HDL Guidelines
1.7. Designing with Low-Level Primitives
1.8. Recommended HDL Coding Styles Revision History
2. Recommended Design Practices
2.1. Following Synchronous FPGA Design Practices
2.1.1. Implementing Synchronous Designs
2.1.2. Asynchronous Design Hazards
2.2. HDL Design Guidelines
2.2.1. Considerations for the Intel Hyperflex™ FPGA Architecture
2.2.2. Optimizing Combinational Logic
2.2.3. Optimizing Clocking Schemes
2.2.4. Optimizing Physical Implementation and Timing Closure
2.2.5. Optimizing Power Consumption
2.2.6. Managing Design Metastability
2.3. Use Clock and Register-Control Architectural Features
2.3.1. Use Global Reset Resources
2.3.2. Use Global Clock Network Resources
2.3.3. Use Clock Region Assignments to Optimize Clock Constraints
2.3.4. Avoid Asynchronous Register Control Signals
2.4. Implementing Embedded RAM
2.5. Design Assistant Design Rule Checking
2.5.1. Setting Up Design Assistant
2.5.2. Running Design Assistant During Compilation
2.5.3. Running Design Assistant in Analysis Mode
2.5.4. Cross-Probing from Design Assistant
2.5.5. Managing Design Assistant Rules
2.5.6. Design Assistant Rule Categories
2.6. Recommended Design Practices Revision History
3. Managing Metastability with the Intel Quartus Prime Software
3.1. Metastability Analysis in the Intel Quartus Prime Software
3.1.1. Synchronization Register Chains
3.1.2. Identify Synchronizers for Metastability Analysis
3.1.3. How Timing Constraints Affect Synchronizer Identification and Metastability Analysis
3.2. Metastability and MTBF Reporting
3.2.1. Metastability Reports
3.2.2. Synchronizer Data Toggle Rate in MTBF Calculation
3.3. MTBF Optimization
3.3.1. Synchronization Register Chain Length
3.4. Reducing Metastability Effects
3.4.1. Apply Complete System-Centric Timing Constraints for the Timing Analyzer
3.4.2. Force the Identification of Synchronization Registers
3.4.3. Set the Synchronizer Data Toggle Rate
3.4.4. Optimize Metastability During Fitting
3.4.5. Increase the Length of Synchronizers to Protect and Optimize
3.4.6. Increase the Number of Stages Used in Synchronizers
3.4.7. Select a Faster Speed Grade Device
3.5. Scripting Support
3.5.1. Identifying Synchronizers for Metastability Analysis
3.5.2. Synchronizer Data Toggle Rate in MTBF Calculation
3.5.3. report_metastability and Tcl Command
3.5.4. MTBF Optimization
3.5.5. Synchronization Register Chain Length
3.6. Managing Metastability
3.7. Managing Metastability with the Intel Quartus Prime Software Revision History
3.8. Intel Quartus Prime Pro Edition User Guide: Design Recommendations Archive
A. Intel Quartus Prime Pro Edition User Guides

1. Recommended HDL Coding Styles

This chapter provides Hardware Description Language (HDL) coding style recommendations to ensure optimal synthesis results when targeting Intel FPGA devices.
HDL coding styles have a significant effect on the quality of results for programmable logic designs. Synthesis tools optimize HDL code for both logic utilization and performance; however, synthesis tools cannot interpret the intent of your design. Therefore, the most effective optimizations require conformance to recommended coding styles.
Note: For style recommendations, options, or HDL attributes specific to your synthesis tool, refer to the synthesis tool vendor’s documentation.
Related Information
Advanced Synthesis Cookbook

1.1. Using Provided HDL Templates

The Intel® Quartus® Prime software provides Verilog HDL, SystemVerilog, and VHDL templates to start your HDL designs. Many of the HDL examples in this document correspond with the Full Designs examples in the Intel Quartus Prime Templates. You can insert HDL code into your own design using the templates or examples.

1.1.1. Inserting HDL Code from a Provided Template

1. Click File ➤ New.
2. In the New dialog box, select the HDL language for the design files: SystemVerilog HDL File, VHDL File, or Verilog HDL File; and click OK. A text editor tab with a blank file opens.
3. Right-click the blank file and click Insert Template.
4. In the Insert Template dialog box, expand the section corresponding to the appropriate HDL, then expand the Full Designs section.
5. Select a template. The template now appears in the Preview pane.
6. To paste the HDL design into the blank Verilog or VHDL file you created, click Insert.
7. Click Close to close the Insert Template dialog box.
Figure 1. Inserting a RAM Template
Note: Use the Intel Quartus Prime Text Editor to modify the HDL design or save the template as an HDL file to edit in your preferred text editor.

1.2. Instantiating IP Cores in HDL

Intel provides parameterizable IP cores that are optimized for Intel FPGA device architectures. Using IP cores instead of coding your own logic saves valuable design time.
Additionally, the Intel-provided IP cores offer more efficient logic synthesis and device implementation. Scale the IP core’s size and specify various options by setting parameters. To instantiate the IP core directly in your HDL code, invoke the IP core name and define its parameters as you would for any other module, component, or subdesign. Alternatively, you can use the IP Catalog (Tools ➤ IP Catalog) and parameter editor GUI to simplify customization of your IP core variation. You can infer or instantiate IP cores that optimize device architecture features, for example:
Transceivers
LVDS drivers
Memory and DSP blocks
Phase-locked loops (PLLs)
Double-data rate input/output (DDIO) circuitry
For some types of logic functions, such as memories and DSP functions, you can infer device-specific dedicated architecture blocks instead of instantiating an IP core. Intel Quartus Prime synthesis recognizes certain HDL code structures and automatically infers the appropriate IP core or maps the logic directly to device atoms.

1.3. Inferring Multipliers and DSP Functions

The following sections describe how to infer multiplier and DSP functions from generic HDL code, and, if applicable, how to target the dedicated DSP block architecture in Intel FPGA devices.
Related Information
DSP Solutions Center

1.3.1. Inferring Multipliers

To infer multiplier functions, synthesis tools detect multiplier logic and implement this in Intel FPGA IP cores, or map the logic directly to device atoms.
For devices with DSP blocks, Intel Quartus Prime synthesis can implement the function in a DSP block instead of logic, depending on device utilization. The Intel Quartus Prime fitter can also place input and output registers in DSP blocks (that is, perform register packing) to improve performance and area utilization.
The following Verilog HDL and VHDL code examples show that synthesis tools can infer signed and unsigned multipliers as IP cores or DSP block atoms. Each example fits into one DSP block element. In addition, when register packing occurs, no extra logic cells for registers are required.
Example 1. Verilog HDL Unsigned Multiplier
module unsigned_mult (out, a, b);
    output [15:0] out;
    input [7:0] a;
    input [7:0] b;

    assign out = a * b;
endmodule
Note: The signed declaration in Verilog HDL is a feature of the Verilog 2001 Standard.
Example 2. Verilog HDL Signed Multiplier with Input and Output Registers (Pipelining = 2)
module signed_mult (out, clk, a, b);
    output [15:0] out;
    input clk;
    input signed [7:0] a;
    input signed [7:0] b;

    reg signed [7:0] a_reg;
    reg signed [7:0] b_reg;
    reg signed [15:0] out;
    wire signed [15:0] mult_out;

    assign mult_out = a_reg * b_reg;

    always @ (posedge clk)
    begin
        a_reg <= a;
        b_reg <= b;
        out <= mult_out;
    end
endmodule
Example 3. VHDL Unsigned Multiplier with Input and Output Registers (Pipelining = 2)
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.numeric_std.all;

ENTITY unsigned_mult IS
    PORT (
        a: IN UNSIGNED (7 DOWNTO 0);
        b: IN UNSIGNED (7 DOWNTO 0);
        clk: IN STD_LOGIC;
        aclr: IN STD_LOGIC;
        result: OUT UNSIGNED (15 DOWNTO 0)
    );
END unsigned_mult;

ARCHITECTURE rtl OF unsigned_mult IS
    SIGNAL a_reg, b_reg: UNSIGNED (7 DOWNTO 0);
BEGIN
    PROCESS (clk, aclr)
    BEGIN
        IF (aclr ='1') THEN
            a_reg <= (OTHERS => '0');
            b_reg <= (OTHERS => '0');
            result <= (OTHERS => '0');
        ELSIF (rising_edge(clk)) THEN
            a_reg <= a;
            b_reg <= b;
            result <= a_reg * b_reg;
        END IF;
    END PROCESS;
END rtl;
Example 4. VHDL Signed Multiplier
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.numeric_std.all;

ENTITY signed_mult IS
    PORT (
        a: IN SIGNED (7 DOWNTO 0);
        b: IN SIGNED (7 DOWNTO 0);
        result: OUT SIGNED (15 DOWNTO 0)
    );
END signed_mult;

ARCHITECTURE rtl OF signed_mult IS
BEGIN
    result <= a * b;
END rtl;

1.3.2. Inferring Multiply-Accumulator and Multiply-Adder Functions

Synthesis tools detect multiply-accumulator or multiply-adder functions, and either implement them as Intel FPGA IP cores or map them directly to device atoms. During placement and routing, the Intel Quartus Prime software places multiply-accumulator and multiply-adder functions in DSP blocks.
Note: Synthesis tools infer multiply-accumulator and multiply-adder functions only if the Intel device family has dedicated DSP blocks that support these functions.
A simple multiply-accumulator consists of a multiplier feeding an addition operator. The addition operator feeds a set of registers that then feeds the second input to the addition operator. A simple multiply-adder consists of two to four multipliers feeding one or two levels of addition, subtraction, or addition/subtraction operators. Addition is always the second-level operator, if it is used. In addition to placing the multiply-accumulator and multiply-adder functions, the Intel Quartus Prime Fitter also packs input and output registers into the DSP blocks to improve performance and area utilization.
Some device families offer additional advanced multiply-adder and accumulator functions, such as complex multiplication, input shift register, or larger multiplications.
The Verilog HDL and VHDL code samples infer multiply-accumulator and multiply-adder functions with input, output, and pipeline registers, as well as an optional asynchronous clear signal. Using the three sets of registers provides the best performance through the function, with a latency of three. To reduce latency, remove the registers in your design.
Note: To obtain high performance in DSP designs, use register pipelining and avoid unregistered DSP functions.
Example 5. Verilog HDL Multiply-Accumulator
module sum_of_four_multiply_accumulate
    #(parameter INPUT_WIDTH=18, parameter OUTPUT_WIDTH=44)
    (
        input clk, ena,
        input [INPUT_WIDTH-1:0] dataa, datab, datac, datad,
        input [INPUT_WIDTH-1:0] datae, dataf, datag, datah,
        output reg [OUTPUT_WIDTH-1:0] dataout
    );
    // Each product can be up to 2*INPUT_WIDTH bits wide.
    // The sum of four of these products can be up to 2 bits wider.
    wire [2*INPUT_WIDTH+1:0] mult_sum;

    // Store the results of the operations on the current inputs
    assign mult_sum = (dataa * datab + datac * datad) +
                      (datae * dataf + datag * datah);

    // Store the value of the accumulation
    always @ (posedge clk)
    begin
        if (ena == 1)
        begin
            dataout <= dataout + mult_sum;
        end
    end
endmodule
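Example 5 registers only the accumulator output. As a minimal sketch of the three register stages described earlier in this section (input, pipeline, and output) together with an asynchronous clear, a single multiply-accumulator might be written as follows. The module name mac_three_stage, the signal names, and the signed 18-bit operands are illustrative assumptions, not an Intel-provided template, and whether every register packs into the DSP block depends on the target device family.

```verilog
// Hypothetical sketch: names and widths are assumptions for illustration.
module mac_three_stage #(
    parameter INPUT_WIDTH  = 18,
    parameter OUTPUT_WIDTH = 44
) (
    input                                clk,
    input                                aclr,   // asynchronous clear
    input                                ena,    // clock enable
    input  signed [INPUT_WIDTH-1:0]      dataa,
    input  signed [INPUT_WIDTH-1:0]      datab,
    output reg signed [OUTPUT_WIDTH-1:0] dataout
);
    reg signed [INPUT_WIDTH-1:0]   a_reg, b_reg;  // stage 1: input registers
    reg signed [2*INPUT_WIDTH-1:0] mult_reg;      // stage 2: pipeline register

    always @ (posedge clk or posedge aclr)
    begin
        if (aclr)
        begin
            a_reg    <= 0;
            b_reg    <= 0;
            mult_reg <= 0;
            dataout  <= 0;
        end
        else if (ena)
        begin
            a_reg    <= dataa;
            b_reg    <= datab;
            mult_reg <= a_reg * b_reg;
            dataout  <= dataout + mult_reg;       // stage 3: accumulator output
        end
    end
endmodule
```

In this sketch, an operand pair presented on dataa and datab affects dataout three enabled clock edges later, which corresponds to the latency of three mentioned above.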

1.4. Inferring Memory Functions from HDL Code

The following coding recommendations provide portable examples of generic HDL code targeting dedicated Intel FPGA memory IP cores. However, if you want to use some of the advanced memory features in Intel FPGA devices, consider using the IP core directly so that you can customize the ports and parameters easily.
You can also use the Intel Quartus Prime templates provided in the Intel Quartus Prime software as a starting point.
Table 1. Intel Memory HDL Language Templates
Language         Full Design Name
VHDL             Single-Port RAM
                 Single-Port RAM with Initial Contents
                 Simple Dual-Port RAM (single clock)
                 Simple Dual-Port RAM (dual clock)
                 True Dual-Port RAM (single clock)
                 True Dual-Port RAM (dual clock)
                 Mixed-Width RAM
                 Mixed-Width True Dual-Port RAM
                 Byte-Enabled Simple Dual-Port RAM
                 Byte-Enabled True Dual-Port RAM
                 Single-Port ROM
                 Dual-Port ROM
Verilog HDL      Single-Port RAM
                 Single-Port RAM with Initial Contents
                 Simple Dual-Port RAM (single clock)
                 Simple Dual-Port RAM (dual clock)
                 True Dual-Port RAM (single clock)
                 True Dual-Port RAM (dual clock)
                 Single-Port ROM
                 Dual-Port ROM
SystemVerilog    Mixed-Width Port RAM
                 Mixed-Width True Dual-Port RAM
                 Mixed-Width True Dual-Port RAM (new data on same port read during write)
                 Byte-Enabled Simple Dual Port RAM
                 Byte-Enabled True Dual-Port RAM
Related Information
Instantiating IP Cores in HDL In Introduction to Intel FPGA IP Cores
Memory In Intel Stratix® 10 High-Performance Design Handbook
Embedded Memory Blocks in Intel Arria® 10 Devices In Intel Arria® 10 Core Fabric and General Purpose I/Os Handbook

1.4.1. Inferring RAM functions from HDL Code

To infer RAM functions, synthesis tools recognize certain types of HDL code and map the detected code to technology-specific implementations. For device families that have dedicated RAM blocks, the Intel Quartus Prime software uses an Intel FPGA IP core to target the device memory architecture.
Synthesis tools typically consider all signals and variables that have a multi-dimensional array type and then create a RAM block, if applicable. This is based on the way the signals or variables are assigned or referenced in the HDL source description.
Standard synthesis tools recognize single-port and simple dual-port (one read port and one write port) RAM blocks. Some synthesis tools (such as the Intel Quartus Prime software) also recognize true dual-port (two read ports and two write ports) RAM blocks that map to the memory blocks in certain Intel FPGA devices.
Some tools (such as the Intel Quartus Prime software) also infer memory blocks for array variables and signals that are referenced (read/written) by two indexes, to recognize mixed-width and byte-enabled RAMs for certain coding styles.
Note: If your design contains a RAM block that your synthesis tool does not recognize and infer, the design might require a large amount of system memory that can potentially cause compilation problems.
1.4.1.1. Use Synchronous Memory Blocks
Memory blocks in Intel FPGAs are synchronous. Therefore, RAM designs must be synchronous to map directly into dedicated memory blocks. For these devices, Intel Quartus Prime synthesis implements asynchronous memory logic in regular logic cells.
Synchronous memory offers several advantages over asynchronous memory, including higher frequencies and thus higher memory bandwidth, increased reliability, and less standby power. To convert asynchronous memory, move registers from the datapath into the memory block.
A memory block is synchronous if it has one of the following read behaviors:
Memory read occurs in a Verilog HDL always block with a clock signal or a VHDL clocked process. The recommended coding style for synchronous memories is to create your design with a registered read output.
Memory read occurs outside a clocked block, but there is a synchronous read address (that is, the address used in the read statement is registered). Synthesis does not always infer this logic as a memory block, or may require external bypass logic, depending on the target device architecture. Avoid this coding style for synchronous memories.
Note: The synchronous memory structures in Intel FPGA devices can differ from the structures in other vendors’ devices. For best results, match your design to the target device architecture.
This chapter provides coding recommendations for various memory types. All the examples in this document are synchronous to ensure that they can be directly mapped into the dedicated memory architecture available in Intel FPGAs.
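The two read behaviors listed in this section can be sketched side by side. The following is a minimal illustration, assuming a 32 x 8 simple dual-port memory; the module and signal names are hypothetical and chosen only for this comparison.

```verilog
// Recommended style: the read occurs inside the clocked block,
// so q is a registered read output.
module read_registered_output (
    input            clk, we,
    input      [7:0] d,
    input      [4:0] waddr, raddr,
    output reg [7:0] q
);
    reg [7:0] mem [0:31];

    always @ (posedge clk)
    begin
        if (we)
            mem[waddr] <= d;
        q <= mem[raddr];
    end
endmodule

// Style to avoid: the read occurs outside the clocked block
// through a registered read address.
module read_registered_address (
    input        clk, we,
    input  [7:0] d,
    input  [4:0] waddr, raddr,
    output [7:0] q
);
    reg [7:0] mem [0:31];
    reg [4:0] raddr_reg;

    always @ (posedge clk)
    begin
        if (we)
            mem[waddr] <= d;
        raddr_reg <= raddr;
    end

    assign q = mem[raddr_reg];
endmodule
```

Both modules return data one clock cycle after the read address is presented. In the second module, however, a write to the address held in raddr_reg changes q immediately, which is why synthesis may need bypass logic or may not infer a memory block for it.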
1.4.1.2. Avoid Unsupported Reset and Control Conditions
To ensure correct implementation of HDL code in the target device architecture, avoid unsupported reset conditions or other control logic that does not exist in the device architecture.
The RAM contents of Intel FPGA memory blocks cannot be cleared with a reset signal during device operation. If your HDL code describes a RAM with a reset signal for the RAM contents, the logic is implemented in regular logic cells instead of a memory block. Do not place RAM read or write operations in an always block or process block with a reset signal. To specify memory contents, initialize the memory or write the data to the RAM during device operation.
In addition to reset signals, other control logic can prevent synthesis from inferring memory logic as a memory block. For example, if you use a clock enable on the read address registers, you can alter the output latch of the RAM, resulting in the synthesized RAM result not matching the HDL description. Use the address stall feature as a read address clock enable to avoid this limitation. Check the documentation for your FPGA device to ensure that your code matches the hardware available in the device.
Example 6. Verilog RAM with Reset Signal that Clears RAM Contents: Not Supported in Device Architecture
module clear_ram
(
    input clock, reset, we,
    input [7:0] data_in,
    input [4:0] address,
    output reg [7:0] data_out
);

    reg [7:0] mem [0:31];
    integer i;

    always @ (posedge clock or posedge reset)
    begin
        if (reset == 1'b1)
            mem[address] <= 0;
        else if (we == 1'b1)
            mem[address] <= data_in;

        data_out <= mem[address];
    end
endmodule
Related Information
Specifying Initial Memory Contents at Power-Up
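As a contrast to Example 6, the following minimal sketch removes the reset from the always block and specifies the memory contents with an initial block instead. The module name init_ram and the all-zero initial contents are assumptions for illustration; the exact initialization options available depend on the target device and memory block type.

```verilog
// Hypothetical sketch: same ports as Example 6, minus the reset input.
module init_ram
(
    input clock, we,
    input [7:0] data_in,
    input [4:0] address,
    output reg [7:0] data_out
);

    reg [7:0] mem [0:31];
    integer i;

    // Specify power-up contents rather than clearing the RAM with a reset.
    initial
    begin
        for (i = 0; i < 32; i = i + 1)
            mem[i] = 8'h00;
    end

    // No reset signal appears in the block that reads and writes the RAM.
    always @ (posedge clock)
    begin
        if (we == 1'b1)
            mem[address] <= data_in;

        data_out <= mem[address];
    end
endmodule
```

To change the contents during device operation, write new data through the normal write port.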
1.4.1.3. Check Read-During-Write Behavior
Ensure the read-during-write behavior of the memory block described in your HDL design is consistent with your target device architecture.
Your HDL source code specifies the memory behavior when you read and write from the same memory address in the same clock cycle. The read returns either the old data at the address, or the new data written to the address. This is referred to as the read-during-write behavior of the memory block. Intel FPGA memory blocks have different read-during-write behavior depending on the target device family, memory mode, and block type.
Synthesis tools preserve the functionality described in your source code. Therefore, if your source code specifies unsupported read-during-write behavior for the RAM blocks, the Intel Quartus Prime software implements the logic in regular logic cells as opposed to the dedicated RAM hardware.
Example 7. Continuous read in HDL code
One common problem occurs when there is a continuous read in the HDL code, as in the following examples. Avoid using these coding styles:
// Verilog HDL concurrent signal assignment
assign q = ram[raddr_reg];

-- VHDL concurrent signal assignment
q <= ram(raddr_reg);
This type of HDL implies that when a write operation takes place, the read immediately reflects the new data at the address independent of the read clock, which is the behavior of asynchronous memory blocks. Synthesis cannot directly map this behavior to a synchronous memory block. If the write clock and read clock are the same, synthesis can infer memory blocks and add extra bypass logic so that the device behavior matches the HDL behavior. If the write and read clocks are different, synthesis cannot reliably add bypass logic, so it implements the logic in regular logic cells instead of dedicated RAM blocks. The examples in the following sections discuss some of these differences for read-during-write conditions.
In addition, the MLAB memories in certain device logic array blocks (LABs) do not easily support old data or new data behavior for a read-during-write in the dedicated device architecture. Implementing the extra logic to support this behavior significantly reduces timing performance through the memory.
Note: For best performance in MLAB memories, ensure that your design does not depend on the read data during a write operation.
In many synthesis tools, you can declare that the read-during-write behavior is not important to your design (for example, if you never read from the same address to which you write in the same clock cycle). In Intel Quartus Prime Pro Edition synthesis, set the synthesis attribute ramstyle to no_rw_check to allow Intel Quartus Prime software to define the read-during-write behavior of a RAM, rather than use the behavior specified by your HDL code. This attribute can prevent the synthesis tool from using extra logic to implement the memory block, or can allow memory inference when it would otherwise be impossible.
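A minimal sketch of applying this attribute to an inferred simple dual-port RAM follows. The module name and signal names are hypothetical, and the sketch assumes that your design never relies on the data read from an address that is being written in the same clock cycle.

```verilog
// Hypothetical sketch: the ramstyle attribute is attached to the
// declaration of the array that represents the RAM.
module no_rw_check_ram (
    output reg [7:0] q,
    input      [7:0] d,
    input      [4:0] write_address, read_address,
    input            we, clk
);
    (* ramstyle = "no_rw_check" *) reg [7:0] mem [0:31];

    always @ (posedge clk)
    begin
        if (we)
            mem[write_address] <= d;
        q <= mem[read_address];
    end
endmodule
```

With the attribute set, the value of q during a simultaneous read and write of the same address is left to the software, so simulation results for that case may differ from the device behavior.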
1.4.1.4. Controlling RAM Inference and Implementation
Intel Quartus Prime synthesis provides options to control RAM inference and implementation for Intel FPGA devices with synchronous memory blocks. Synthesis tools usually do not infer small RAM blocks because small RAMs are typically implemented more efficiently using the registers in regular logic.
To direct the Intel Quartus Prime software to infer RAM blocks globally for all sizes, enable the Allow Any RAM Size for Recognition option in the Advanced Analysis & Synthesis Settings dialog box (Assignments ➤ Settings ➤ Compiler Settings ➤ Synthesis Settings (Advanced)).
Alternatively, use the ramstyle RTL attribute to specify how an inferred RAM is implemented, including the type of memory block or the use of regular logic instead of a dedicated memory block. Intel Quartus Prime synthesis does not map inferred memory into MLABs unless the HDL code specifies the appropriate ramstyle attribute, although the Fitter may map some memories to MLABs.
Set the ramstyle attribute in the RTL or in the .qsf file.
(* ramstyle = "mlab" *) my_shift_reg
set_instance_assignment -name RAMSTYLE_ATTRIBUTE LOGIC -to ram
You can also specify the maximum depth of memory blocks for RAM or ROM inference in RTL. Specify the max_depth synthesis attribute to the declaration of a variable that represents a RAM or ROM in your design file. For example:
// Limit the depth of the memory blocks that implement "ram" to 512.
// This forces the Intel Quartus Prime software to use two M512 blocks
// instead of one M4K block to implement this RAM.
(* max_depth = 512 *) reg [7:0] ram[0:1023];
In addition, you can specify the no_ram synthesis attribute to prevent RAM inference on a specific array. For example:
(* no_ram *) logic [11:0] my_array [0:12];
1.4.1.5. Single-Clock Synchronous RAM with Old Data Read-During-Write Behavior
The code examples in this section show Verilog HDL and VHDL code that infers simple dual-port, single-clock synchronous RAM. Single-port RAM blocks use a similar coding style.
The read-during-write behavior in these examples is to read the old data at the memory address. For best performance in MLAB memories, use the appropriate attribute so that your design does not depend on the read data during a write operation. The simple dual-port RAM code samples map directly into Intel synchronous memory.
Single-port versions of memory blocks (that is, using the same read address and write address signals) allow better RAM utilization than dual-port memory blocks, depending on the device family. Refer to the appropriate device handbook for recommendations on your target device.
Example 8. Verilog HDL Single-Clock, Simple Dual-Port Synchronous RAM with Old Data Read-During-Write Behavior
module single_clk_ram(
    output reg [7:0] q,
    input [7:0] d,
    input [4:0] write_address, read_address,
    input we, clk
);
    reg [7:0] mem [31:0];

    always @ (posedge clk) begin
        if (we)
            mem[write_address] <= d;
        q <= mem[read_address]; // q doesn't get d in this clock cycle
    end
endmodule
Example 9. VHDL Single-Clock, Simple Dual-Port Synchronous RAM with Old Data Read-
During-Write Behavior
LIBRARY ieee;
USE ieee.std_logic_1164.all;

ENTITY single_clock_ram IS
    PORT (
        clock: IN STD_LOGIC;
        data: IN STD_LOGIC_VECTOR (7 DOWNTO 0);
        write_address: IN INTEGER RANGE 0 to 31;
        read_address: IN INTEGER RANGE 0 to 31;
        we: IN STD_LOGIC;
        q: OUT STD_LOGIC_VECTOR (7 DOWNTO 0)
    );
END single_clock_ram;

ARCHITECTURE rtl OF single_clock_ram IS
    TYPE MEM IS ARRAY(0 TO 31) OF STD_LOGIC_VECTOR(7 DOWNTO 0);
    SIGNAL ram_block: MEM;
BEGIN
    PROCESS (clock)
    BEGIN
        IF (rising_edge(clock)) THEN
            IF (we = '1') THEN
                ram_block(write_address) <= data;
            END IF;
            q <= ram_block(read_address);
            -- VHDL semantics imply that q doesn't get data
            -- in this clock cycle
        END IF;
    END PROCESS;
END rtl;
Note:
The small size of this single_clock_ram causes the Compiler to infer the memory as MLAB memory blocks, rather than M20K memory blocks. If single_clock_ram specifies a larger width, the Compiler infers the memory as M20K memory blocks.
1.4.1.6. Single-Clock Synchronous RAM with New Data Read-During-Write Behavior
The examples in this section describe RAM blocks in which the read-during-write behavior returns the new value being written at the memory address.
To implement this behavior in the target device, synthesis tools add bypass logic around the RAM block. This bypass logic increases the area utilization of the design, and decreases the performance if the RAM block is part of the design’s critical path. If the device memory supports new data read-during-write behavior when in single-port mode (same clock, same read address, and same write address), the Verilog memory block doesn't require any bypass logic. Refer to the appropriate device handbook for specifications on your target device.
The following examples use a blocking assignment for the write so that the data is assigned immediately.
Example 10. Verilog HDL Single-Clock, Simple Dual-Port Synchronous RAM with New Data Read-During-Write Behavior
module single_clock_wr_ram(
    output reg [7:0] q,
    input [7:0] d,
    input [6:0] write_address, read_address,
    input we, clk
);
    reg [7:0] mem [127:0];

    always @ (posedge clk) begin
        if (we)
            mem[write_address] = d;
        q = mem[read_address]; // q does get d in this clock
                               // cycle if we is high
    end
endmodule
Example 11. VHDL Single-Clock, Simple Dual-Port Synchronous RAM with New Data Read-During-Write Behavior
LIBRARY ieee;
USE ieee.std_logic_1164.all;

ENTITY single_clock_ram IS
    PORT (
        clock: IN STD_LOGIC;
        data: IN STD_LOGIC_VECTOR (2 DOWNTO 0);
        write_address: IN INTEGER RANGE 0 to 31;
        read_address: IN INTEGER RANGE 0 to 31;
        we: IN STD_LOGIC;
        q: OUT STD_LOGIC_VECTOR (2 DOWNTO 0)
    );
END single_clock_ram;

ARCHITECTURE rtl OF single_clock_ram IS
    TYPE MEM IS ARRAY(0 TO 31) OF STD_LOGIC_VECTOR(2 DOWNTO 0);
BEGIN
    PROCESS (clock)
        VARIABLE ram_block: MEM;
    BEGIN
        IF (rising_edge(clock)) THEN
            IF (we = '1') THEN
                ram_block(write_address) := data;
            END IF;
            q <= ram_block(read_address);
            -- Because ram_block is a variable, q does get data
            -- in this clock cycle when the addresses match
        END IF;
    END PROCESS;
END rtl;
It is possible to create a single-clock RAM by using an assign statement to read the address of mem and create the output q. By itself, the RTL describes new data read-during-write behavior. However, if the RAM output feeds a register in another hierarchy, a read-during-write results in the old data. Synthesis tools may not infer a RAM block if the tool cannot determine which behavior is described, such as when the memory feeds a hard hierarchical partition boundary. Avoid this type of RTL.
Example 12. Avoid Verilog Coding Style with Vague read-during-write Behavior
reg [7:0] mem [127:0];
reg [6:0] read_address_reg;

always @ (posedge clk) begin
    if (we)
        mem[write_address] <= d;
    read_address_reg <= read_address;
end

assign q = mem[read_address_reg];
Example 13. Avoid VHDL Coding Style with Vague read-during-write Behavior
The following example uses a concurrent signal assignment to read from the RAM, and presents a similar behavior.
ARCHITECTURE rtl OF single_clock_rw_ram IS
    TYPE MEM IS ARRAY(0 TO 31) OF STD_LOGIC_VECTOR(2 DOWNTO 0);
    SIGNAL ram_block: MEM;
    SIGNAL read_address_reg: INTEGER RANGE 0 to 31;
BEGIN
    PROCESS (clock)
    BEGIN
        IF (rising_edge(clock)) THEN
            IF (we = '1') THEN
                ram_block(write_address) <= data;
            END IF;
            read_address_reg <= read_address;
        END IF;
    END PROCESS;
    q <= ram_block(read_address_reg);
END rtl;
1.4.1.7. Simple Dual-Port, Dual-Clock Synchronous RAM
With dual-clock designs, synthesis tools cannot accurately infer the read-during-write behavior because it depends on the timing of the two clocks within the target device. Therefore, the read-during-write behavior of the synthesized design is undefined and may differ from your original HDL code.
Example 14. Verilog HDL Simple Dual-Port, Dual-Clock Synchronous RAM
module simple_dual_port_ram_dual_clock
#(parameter DATA_WIDTH=8, parameter ADDR_WIDTH=6)
(
    input [(DATA_WIDTH-1):0] data,
    input [(ADDR_WIDTH-1):0] read_addr, write_addr,
    input we, read_clock, write_clock,
    output reg [(DATA_WIDTH-1):0] q
);
    // Declare the RAM variable
    reg [DATA_WIDTH-1:0] ram[2**ADDR_WIDTH-1:0];

    always @ (posedge write_clock) begin
        // Write
        if (we)
            ram[write_addr] <= data;
    end

    always @ (posedge read_clock) begin
        // Read
        q <= ram[read_addr];
    end
endmodule
Example 15. VHDL Simple Dual-Port, Dual-Clock Synchronous RAM
LIBRARY ieee;
USE ieee.std_logic_1164.all;

ENTITY dual_clock_ram IS
    PORT (
        clock1, clock2: IN STD_LOGIC;
        data: IN STD_LOGIC_VECTOR (3 DOWNTO 0);
        write_address: IN INTEGER RANGE 0 to 31;
        read_address: IN INTEGER RANGE 0 to 31;
        we: IN STD_LOGIC;
        q: OUT STD_LOGIC_VECTOR (3 DOWNTO 0)
    );
END dual_clock_ram;

ARCHITECTURE rtl OF dual_clock_ram IS
    TYPE MEM IS ARRAY(0 TO 31) OF STD_LOGIC_VECTOR(3 DOWNTO 0);
    SIGNAL ram_block: MEM;
    SIGNAL read_address_reg : INTEGER RANGE 0 to 31;
BEGIN
    PROCESS (clock1)
    BEGIN
        IF (rising_edge(clock1)) THEN
            IF (we = '1') THEN
                ram_block(write_address) <= data;
            END IF;
        END IF;
    END PROCESS;

    PROCESS (clock2)
    BEGIN
        IF (rising_edge(clock2)) THEN
            q <= ram_block(read_address_reg);
            read_address_reg <= read_address;
        END IF;
    END PROCESS;
END rtl;
Related Information
Check Read-During-Write Behavior on page 11
1.4.1.8. True Dual-Port Synchronous RAM
The code examples in this section show Verilog HDL and VHDL code that infers true dual-port synchronous RAM. Different synthesis tools may differ in their support for these types of memories.
Intel FPGA synchronous memory blocks have two independent address ports, allowing for operations on two unique addresses simultaneously. A read operation and a write operation can share the same port if they share the same address.
The Intel Quartus Prime software infers true dual-port RAMs in Verilog HDL and VHDL, with the following characteristics:
Any combination of independent read or write operations in the same clock cycle.
At most two unique port addresses.
In one clock cycle, with one or two unique addresses, they can perform:
— Two reads and one write
— Two writes and one read
— Two writes and two reads
In the synchronous RAM block architecture, there is no priority between the two ports. Therefore, if you write to the same location on both ports at the same time, the result is indeterminate in the device architecture. You must ensure your HDL code does not imply priority for writes to the memory block, if you want the design to be implemented in a dedicated hardware memory block. For example, if both ports are defined in the same process block, the code is synthesized and simulated sequentially so that there is a priority between the two ports. If your code does imply a priority, the logic cannot be implemented in the device RAM blocks and is implemented in regular logic cells. You must also consider the read-during-write behavior of the RAM block to ensure that it can be mapped directly to the device RAM architecture.
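The priority problem described above can be sketched as follows. This is a hypothetical counter-example written for illustration, not a provided template; the module name tdp_write_priority_avoid is an assumption. Both writes sit in one always block, so when addr_a equals addr_b and both write enables are high, the later nonblocking assignment wins and the RTL implies that port B has priority.

```verilog
// Hypothetical sketch of a coding style to AVOID for true dual-port RAM.
// Both ports are described in a single always block, so the statements
// are ordered and the second write overrides the first on a collision.
module tdp_write_priority_avoid #(
    parameter DATA_WIDTH = 8,
    parameter ADDR_WIDTH = 6
) (
    input      [DATA_WIDTH-1:0] data_a, data_b,
    input      [ADDR_WIDTH-1:0] addr_a, addr_b,
    input                       we_a, we_b, clk,
    output reg [DATA_WIDTH-1:0] q_a, q_b
);
    reg [DATA_WIDTH-1:0] ram [2**ADDR_WIDTH-1:0];

    always @(posedge clk) begin
        if (we_a)
            ram[addr_a] <= data_a;
        if (we_b)
            ram[addr_b] <= data_b; // wins over port A when addr_a == addr_b
        q_a <= ram[addr_a];
        q_b <= ram[addr_b];
    end
endmodule
```

Describing each port in its own always block, as the true dual-port examples in this section do, removes the implied ordering between the two writes.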
When a read and write operation occurs on the same port for the same address, the read operation may behave as follows:
Read new data—Intel Arria® 10 and Intel Stratix® 10 devices support this behavior.
Read old data—Not supported.
When a read and write operation occurs on different ports for the same address (also known as mixed port), the read operation may behave as follows:
Read new data—Intel Quartus Prime Pro Edition synthesis supports this mode by creating bypass logic around the synchronous memory block.
Read old data—Intel Arria 10 and Intel Cyclone® 10 devices support this behavior.
Read don’t care—Synchronous memory blocks support this behavior in simple dual-port mode.
The Verilog HDL single-clock code sample maps directly into synchronous Intel Arria 10 memory blocks. When a read and write operation occurs on the same port for the same address, the new data being written to the memory is read. When a read and write operation occurs on different ports for the same address, the old data in the memory is read. Simultaneous writes to the same location on both ports result in indeterminate behavior.
If you generate a dual-clock version of this design describing the same behavior, the inferred memory in the target device presents undefined mixed port read-during-write behavior, because it depends on the relationship between the clocks.
Example 16. Verilog HDL True Dual-Port RAM with Single Clock
// Quartus Prime Verilog Template
// True Dual Port RAM with single clock
//
// Read-during-write behavior is undefined for mixed ports
// and "new data" on the same port

module true_dual_port_ram_single_clock
#(parameter DATA_WIDTH=8, parameter ADDR_WIDTH=6)
(
    input [(DATA_WIDTH-1):0] data_a, data_b,
    input [(ADDR_WIDTH-1):0] addr_a, addr_b,
    input we_a, we_b, clk,
    output reg [(DATA_WIDTH-1):0] q_a, q_b
);
    // Declare the RAM variable
    reg [DATA_WIDTH-1:0] ram[2**ADDR_WIDTH-1:0];

    // Port A
    always @ (posedge clk) begin
        if (we_a) begin
            ram[addr_a] = data_a;
        end
        q_a <= ram[addr_a];
    end

    // Port B
    always @ (posedge clk) begin
        if (we_b) begin
            ram[addr_b] = data_b;
        end
        q_b <= ram[addr_b];
    end
endmodule
Example 17. VHDL Read Statement Example
-- Port A
process(clk)
begin
    if(rising_edge(clk)) then
        if(we_a = '1') then
            ram(addr_a) := data_a;
        end if;
        q_a <= ram(addr_a);
    end if;
end process;

-- Port B
process(clk)
begin
    if(rising_edge(clk)) then
        if(we_b = '1') then
            ram(addr_b) := data_b;
        end if;
        q_b <= ram(addr_b);
    end if;
end process;
The VHDL single-clock code sample maps directly into Intel FPGA synchronous memory. When a read and write operation occurs on the same port for the same address, the new data being written to the memory is read. When a read and write operation occurs on different ports for the same address, the behavior results in old data for Intel Arria 10 and Intel Cyclone 10 devices, and is undefined for Intel Stratix 10 devices. Simultaneous write operations to the same location on both ports result in indeterminate behavior.
If you generate a dual-clock version of this design describing the same behavior, the memory in the target device presents undefined mixed port read-during-write behavior because it depends on the relationship between the clocks.
Example 18. VHDL True Dual-Port RAM with Single Clock
-- Quartus Prime VHDL Template
-- True Dual-Port RAM with single clock
--
-- Read-during-write behavior is undefined for mixed ports
-- and "new data" on the same port
library ieee;
use ieee.std_logic_1164.all;

entity true_dual_port_ram_single_clock is
    generic (
        DATA_WIDTH : natural := 8;
        ADDR_WIDTH : natural := 6
    );
    port (
        clk : in std_logic;
        addr_a : in natural range 0 to 2**ADDR_WIDTH - 1;
        addr_b : in natural range 0 to 2**ADDR_WIDTH - 1;
        data_a : in std_logic_vector((DATA_WIDTH-1) downto 0);
        data_b : in std_logic_vector((DATA_WIDTH-1) downto 0);
        we_a : in std_logic := '1';
        we_b : in std_logic := '1';
        q_a : out std_logic_vector((DATA_WIDTH -1) downto 0);
        q_b : out std_logic_vector((DATA_WIDTH -1) downto 0)
    );
end true_dual_port_ram_single_clock;

architecture rtl of true_dual_port_ram_single_clock is
    -- Build a 2-D array type for the RAM
    subtype word_t is std_logic_vector((DATA_WIDTH-1) downto 0);
    type memory_t is array(2**ADDR_WIDTH-1 downto 0) of word_t;

    -- Declare the RAM
    shared variable ram : memory_t;
begin
    -- Port A
    process(clk)
    begin
        if(rising_edge(clk)) then
            if(we_a = '1') then
                ram(addr_a) := data_a;
            end if;
            q_a <= ram(addr_a);
        end if;
    end process;

    -- Port B
    process(clk)
    begin
        if(rising_edge(clk)) then
            if(we_b = '1') then
                ram(addr_b) := data_b;
            end if;
            q_b <= ram(addr_b);
        end if;
    end process;
end rtl;
Related Information
Guideline: Customize Read-During-Write Behavior
In Intel Arria 10 Core Fabric and General Purpose I/Os Handbook
1.4.1.9. Mixed-Width Dual-Port RAM
The RAM code examples in this section show SystemVerilog and VHDL code that infers RAM with data ports with different widths.
Verilog-1995 doesn't support mixed-width RAMs because the standard lacks a multi-dimensional array to model the different read width, write width, or both. Verilog-2001 doesn't support mixed-width RAMs because this type of logic requires multiple packed dimensions. Different synthesis tools may differ in their support for these memories. This section describes the inference rules for Intel Quartus Prime Pro Edition synthesis.
The first dimension of the multi-dimensional packed array represents the ratio of the wider port to the narrower port. The second dimension represents the narrower port width. The read and write port widths must specify a read or write ratio supported by the memory blocks in the target device. Otherwise, the synthesis tool does not infer a RAM.
Refer to the Intel Quartus Prime HDL templates for parameterized examples with supported combinations of read and write widths. You can also find examples of true dual port RAMs with two mixed-width read ports and two mixed-width write ports.
Example 19. SystemVerilog Mixed-Width RAM with Read Width Smaller than Write Width
module mixed_width_ram // 256x32 write and 1024x8 read
(
    input [7:0] waddr,
    input [31:0] wdata,
    input we, clk,
    input [9:0] raddr,
    output logic [7:0] q
);
    logic [3:0][7:0] ram[0:255];

    always_ff@(posedge clk) begin
        if(we) ram[waddr] <= wdata;
        q <= ram[raddr / 4][raddr % 4];
    end
endmodule : mixed_width_ram
Example 20. SystemVerilog Mixed-Width RAM with Read Width Larger than Write Width
module mixed_width_ram // 1024x8 write and 256x32 read
(
    input [9:0] waddr,
    input [7:0] wdata,       // 8-bit write data for the 1024x8 write port
    input we, clk,
    input [7:0] raddr,
    output logic [31:0] q    // 32-bit read data for the 256x32 read port
);
    logic [3:0][7:0] ram[0:255];

    always_ff@(posedge clk) begin
        if(we) ram[waddr / 4][waddr % 4] <= wdata;
        q <= ram[raddr];
    end
endmodule : mixed_width_ram
Example 21. VHDL Mixed-Width RAM with Read Width Smaller than Write Width
library ieee;
use ieee.std_logic_1164.all;

package ram_types is
    type word_t is array (0 to 3) of std_logic_vector(7 downto 0);
    type ram_t is array (0 to 255) of word_t;
end ram_types;

library ieee;
use ieee.std_logic_1164.all;
library work;
use work.ram_types.all;

entity mixed_width_ram is
    port (
        we, clk : in std_logic;
        waddr : in integer range 0 to 255;
        wdata : in word_t;
        raddr : in integer range 0 to 1023;
        q : out std_logic_vector(7 downto 0));
end mixed_width_ram;

architecture rtl of mixed_width_ram is
    signal ram : ram_t;
begin -- rtl
    process(clk, we)
    begin
        if(rising_edge(clk)) then
            if(we = '1') then
                ram(waddr) <= wdata;
            end if;
            q <= ram(raddr / 4 )(raddr mod 4);
        end if;
    end process;
end rtl;
Example 22. VHDL Mixed-Width RAM with Read Width Larger than Write Width
library ieee;
use ieee.std_logic_1164.all;

package ram_types is
    type word_t is array (0 to 3) of std_logic_vector(7 downto 0);
    type ram_t is array (0 to 255) of word_t;
end ram_types;

library ieee;
use ieee.std_logic_1164.all;
library work;
use work.ram_types.all;

entity mixed_width_ram is
    port (
        we, clk : in std_logic;
        waddr : in integer range 0 to 1023;
        wdata : in std_logic_vector(7 downto 0);
        raddr : in integer range 0 to 255;
        q : out word_t);
end mixed_width_ram;

architecture rtl of mixed_width_ram is
    signal ram : ram_t;
begin -- rtl
    process(clk, we)
    begin
        if(rising_edge(clk)) then
            if(we = '1') then
                ram(waddr / 4)(waddr mod 4) <= wdata;
            end if;
            q <= ram(raddr);
        end if;
    end process;
end rtl;
1.4.1.10. RAM with Byte-Enable Signals
The RAM code examples in this section show SystemVerilog and VHDL code that infers RAM with controls for writing single bytes into the memory word, or byte-enable signals.
Synthesis models byte-enable signals by creating write expressions with two indexes, and writing part of a RAM "word." With these implementations, you can also write more than one byte at once by enabling the appropriate byte enables.
Verilog-1995 doesn't support mixed-width RAMs because the standard lacks a multi-dimensional array to model the different read width, write width, or both. Verilog-2001 doesn't support mixed-width RAMs because this type of logic requires multiple packed dimensions. Different synthesis tools may differ in their support for these memories. This section describes the inference rules for Intel Quartus Prime Pro Edition synthesis.
Refer to the Intel Quartus Prime HDL templates for parameterized examples that you can use for different address widths, and true dual port RAM examples with two read ports and two write ports.
Example 23. SystemVerilog Simple Dual-Port Synchronous RAM with Byte Enable
module byte_enabled_simple_dual_port_ram
#(
    parameter ADDRESS_WIDTH = 6,
    parameter BYTE_WIDTH = 8,
    parameter NUM_BYTES = 4
)
(
    input we, clk,
    input [ADDRESS_WIDTH-1:0] waddr, raddr, // address width = 6
    input [NUM_BYTES-1:0] be, // 4 bytes per word
    input [(BYTE_WIDTH * NUM_BYTES -1):0] wdata, // byte width = 8, 4 bytes per word
    output reg [(BYTE_WIDTH * NUM_BYTES -1):0] q // byte width = 8, 4 bytes per word
);
    // The parameters that size the ports are declared in the parameter
    // port list above so that they are visible to the port declarations.
    localparam DEPTH = 2**ADDRESS_WIDTH; // # words = 1 << address width

    // use a multi-dimensional packed array
    // to model individual bytes within the word
    logic [NUM_BYTES-1:0][BYTE_WIDTH-1:0] ram[0:DEPTH-1];

    // port A
    always@(posedge clk) begin
        if(we) begin
            for (int i = 0; i < NUM_BYTES; i = i + 1) begin
                if(be[i]) ram[waddr][i] <= wdata[i*BYTE_WIDTH +: BYTE_WIDTH];
            end
        end
        q <= ram[raddr];
    end
endmodule
Example 24. VHDL Simple Dual-Port Synchronous RAM with Byte Enable
library ieee;
use ieee.std_logic_1164.all;
library work;

entity byte_enabled_simple_dual_port_ram is
    generic (
        DEPTH : integer := 64;
        NUM_BYTES : integer := 4;
        BYTE_WIDTH : integer := 8
    );
    port (
        we, clk : in std_logic;
        waddr, raddr : in integer range 0 to DEPTH -1 ; -- address width = 6
        be : in std_logic_vector (NUM_BYTES-1 downto 0); -- 4 bytes per word
        wdata : in std_logic_vector((NUM_BYTES * BYTE_WIDTH -1) downto 0); -- width = 32
        q : out std_logic_vector((NUM_BYTES * BYTE_WIDTH -1) downto 0) -- width = 32
    );
end byte_enabled_simple_dual_port_ram;

architecture rtl of byte_enabled_simple_dual_port_ram is
    -- build up 2D array to hold the memory
    type word_t is array (0 to NUM_BYTES-1) of std_logic_vector(BYTE_WIDTH-1 downto 0);
    type ram_t is array (0 to DEPTH-1) of word_t;

    signal ram : ram_t;
    signal q_local : word_t;
begin
    -- Re-organize the read data from the RAM to match the output
    unpack: for i in 0 to NUM_BYTES-1 generate
        q(BYTE_WIDTH*(i+1) - 1 downto BYTE_WIDTH*i) <= q_local(i);
    end generate unpack;

    -- port A
    process(clk)
    begin
        if(rising_edge(clk)) then
            if(we = '1') then
                for I in (NUM_BYTES-1) downto 0 loop
                    if(be(I) = '1') then
                        ram(waddr)(I) <= wdata(((I+1)*BYTE_WIDTH-1) downto I*BYTE_WIDTH);
                    end if;
                end loop;
            end if;
            q_local <= ram(raddr);
        end if;
    end process;
end rtl;
1.4.1.11. Specifying Initial Memory Contents at Power-Up
Your synthesis tool may offer various ways to specify the initial contents of an inferred memory. There are slight power-up and initialization differences between dedicated RAM blocks and the MLAB memory, due to the continuous read of the MLAB.
Intel FPGA dedicated RAM block outputs always power up to zero, and are set to the initial value on the first read. For example, if address 0 is pre-initialized to FF, the RAM block powers up with the output at 0. A subsequent read after power-up from address 0 outputs the pre-initialized value of FF. Therefore, if a RAM powers up and an enable (read enable or clock enable) is held low, the power-up output of 0 is maintained until the first valid read cycle. The synthesis tool implements MLAB using registers that power up to 0, but initialize to their initial value immediately at power-up or reset. Therefore, the initial value is seen, regardless of the enable status. The Intel Quartus Prime software maps inferred memory to MLABs when the HDL code specifies an appropriate ramstyle attribute.
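As a minimal sketch of the ramstyle attribute mentioned above, the following declaration requests MLAB implementation for a small inferred RAM. The module name mlab_ram is hypothetical, and the "MLAB" value is assumed to be valid for your target device family; check the attribute documentation for the values that your device supports.

```verilog
// Hypothetical sketch: request MLAB implementation for an inferred RAM.
// Assumption: the target device family provides MLAB memory and accepts
// the "MLAB" value for the ramstyle synthesis attribute.
module mlab_ram (
    output reg [7:0] q,
    input      [7:0] d,
    input      [4:0] write_address, read_address,
    input            we, clk
);
    (* ramstyle = "MLAB" *) reg [7:0] mem [0:31];

    always @(posedge clk) begin
        if (we)
            mem[write_address] <= d;
        q <= mem[read_address];
    end
endmodule
```

The attribute is attached to the memory array declaration rather than to the module, so other arrays in the same module keep their default mapping.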
In Verilog HDL, you can use an initial block to initialize the contents of an inferred memory. Intel Quartus Prime Pro Edition synthesis automatically converts the initial block into a Memory Initialization File (.mif) for the inferred RAM.
Example 25. Verilog HDL RAM with Initialized Contents
module ram_with_init(
    output reg [7:0] q,
    input [7:0] d,
    input [4:0] write_address, read_address,
    input we, clk
);
    reg [7:0] mem [0:31];
    integer i;

    initial begin
        for (i = 0; i < 32; i = i + 1)
            mem[i] = i[7:0];
    end

    always @ (posedge clk) begin
        if (we)
            mem[write_address] <= d;
        q <= mem[read_address];
    end
endmodule
Intel Quartus Prime Pro Edition synthesis and other synthesis tools also support the $readmemb and $readmemh system tasks. These system tasks allow RAM initialization and ROM initialization to work identically in synthesis and simulation.
Example 26. Verilog HDL RAM Initialized with the readmemb Command
reg [7:0] ram[0:15];

initial begin
    $readmemb("ram.txt", ram);
end
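The hexadecimal variant works the same way. The following is a minimal sketch, assuming a hypothetical file named rom_init.hex in the directory where the tool resolves relative paths; the module name readmemh_rom and the file contents shown in the comments are illustrative only.

```verilog
// Hypothetical sketch: 16 x 8 memory initialized with $readmemh.
// Assumed contents of rom_init.hex (hypothetical file). Values are
// hexadecimal and separated by white space; @<hex address> sets the
// address at which the following values are loaded:
//   @0 00 11 22 33 44 55 66 77
//   @8 88 99 aa bb cc dd ee ff
module readmemh_rom (
    input            clk,
    input      [3:0] addr,
    output reg [7:0] q
);
    reg [7:0] rom [0:15];

    initial begin
        $readmemh("rom_init.hex", rom);
    end

    // Registered read, as in the other synchronous memory examples.
    always @(posedge clk)
        q <= rom[addr];
endmodule
```

If the file is missing or shorter than the array, the uninitialized words typically read as unknown in simulation, so it is worth checking that the file covers every address the design reads.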
In VHDL, you can initialize the contents of an inferred memory by specifying a default value for the corresponding signal. Intel Quartus Prime Pro Edition synthesis automatically converts the default value into a .mif file for the inferred RAM.
Example 27. VHDL RAM with Initialized Contents
LIBRARY ieee;
USE ieee.std_logic_1164.all;
use ieee.numeric_std.all;

ENTITY ram_with_init IS
    PORT(
        clock: IN STD_LOGIC;
        data: IN UNSIGNED (7 DOWNTO 0);
        write_address: IN integer RANGE 0 to 31;
        read_address: IN integer RANGE 0 to 31;
        we: IN std_logic;
        q: OUT UNSIGNED (7 DOWNTO 0));
END;

ARCHITECTURE rtl OF ram_with_init IS
    TYPE MEM IS ARRAY(31 DOWNTO 0) OF unsigned(7 DOWNTO 0);

    FUNCTION initialize_ram return MEM is
        variable result : MEM;
    BEGIN
        FOR i IN 31 DOWNTO 0 LOOP
            result(i) := to_unsigned(natural(i), natural'(8));
        END LOOP;
        RETURN result;
    END initialize_ram;

    SIGNAL ram_block : MEM := initialize_ram;
BEGIN
    PROCESS (clock)
    BEGIN
        IF (rising_edge(clock)) THEN
            IF (we = '1') THEN
                ram_block(write_address) <= data;
            END IF;
            q <= ram_block(read_address);
        END IF;
    END PROCESS;
END rtl;

1.4.2. Inferring ROM Functions from HDL Code

Synthesis tools infer ROMs when a CASE statement exists in which a value is set to a constant for every choice in the CASE statement.
Because small ROMs typically achieve the best performance when they are implemented using the registers in regular logic, each ROM function must meet a minimum size requirement for inference and placement in memory.
For device architectures with synchronous RAM blocks, to infer a ROM block, synthesis must use registers for either the address or the output. When your design uses output registers, synthesis implements registers from the input registers of the RAM block without affecting the functionality of the ROM. If you register the address, the power-up state of the inferred ROM can be different from the HDL design. In this scenario, Intel Quartus Prime synthesis issues a warning.
The following ROM examples map directly to the Intel FPGA memory architecture.
Example 28. Verilog HDL Synchronous ROM
module sync_rom (clock, address, data_out);
    input clock;
    input [7:0] address;
    output [5:0] data_out;
    reg [5:0] data_out; // declared once as reg; the port itself is a plain output

    always @ (posedge clock) begin
        case (address)
            8'b00000000: data_out <= 6'b101111;
            8'b00000001: data_out <= 6'b110110;
            ...
            8'b11111110: data_out <= 6'b000001;
            8'b11111111: data_out <= 6'b101010;
        endcase
    end
endmodule
Example 29. VHDL Synchronous ROM
LIBRARY ieee;
USE ieee.std_logic_1164.all;

ENTITY sync_rom IS
    PORT (
        clock: IN STD_LOGIC;
        address: IN STD_LOGIC_VECTOR(7 downto 0);
        data_out: OUT STD_LOGIC_VECTOR(5 downto 0)
    );
END sync_rom;

ARCHITECTURE rtl OF sync_rom IS
BEGIN
    PROCESS (clock)
    BEGIN
        IF rising_edge (clock) THEN
            CASE address IS
                WHEN "00000000" => data_out <= "101111";
                WHEN "00000001" => data_out <= "110110";
                ...
                WHEN "11111110" => data_out <= "000001";
                WHEN "11111111" => data_out <= "101010";
                WHEN OTHERS => data_out <= "101111";
            END CASE;
        END IF;
    END PROCESS;
END rtl;
Example 30. Verilog HDL Dual-Port Synchronous ROM Using readmemb
module dual_port_rom
#(parameter data_width=8, parameter addr_width=8)
(
    input [(addr_width-1):0] addr_a, addr_b,
    input clk,
    output reg [(data_width-1):0] q_a, q_b
);
    reg [data_width-1:0] rom[2**addr_width-1:0];

    // Read the memory contents in the file
    // dual_port_rom_init.txt.
    initial begin
        $readmemb("dual_port_rom_init.txt", rom);
    end

    always @ (posedge clk) begin
        q_a <= rom[addr_a];
        q_b <= rom[addr_b];
    end
endmodule
Example 31. VHDL Dual-Port Synchronous ROM Using Initialization Function
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dual_port_rom is
    generic (
        DATA_WIDTH : natural := 8;
        ADDR_WIDTH : natural := 8
    );
    port (
        clk : in std_logic;
        addr_a : in natural range 0 to 2**ADDR_WIDTH - 1;
        addr_b : in natural range 0 to 2**ADDR_WIDTH - 1;
        q_a : out std_logic_vector((DATA_WIDTH -1) downto 0);
        q_b : out std_logic_vector((DATA_WIDTH -1) downto 0)
    );
end entity;

architecture rtl of dual_port_rom is
    -- Build a 2-D array type for the ROM
    subtype word_t is std_logic_vector((DATA_WIDTH-1) downto 0);
    type memory_t is array(2**ADDR_WIDTH - 1 downto 0) of word_t;

    function init_rom return memory_t is
        variable tmp : memory_t := (others => (others => '0'));
    begin
        for addr_pos in 0 to 2**ADDR_WIDTH - 1 loop
            -- Initialize each address with the address itself
            tmp(addr_pos) := std_logic_vector(to_unsigned(addr_pos, DATA_WIDTH));
        end loop;
        return tmp;
    end init_rom;

    -- Declare the ROM signal and specify a default initialization value.
    signal rom : memory_t := init_rom;
begin
    process(clk)
    begin
        if (rising_edge(clk)) then
            q_a <= rom(addr_a);
            q_b <= rom(addr_b);
        end if;
    end process;
end rtl;

1.4.3. Inferring Shift Registers in HDL Code

To infer shift registers in Intel Arria 10 devices, synthesis tools detect a group of shift registers of the same length, and convert them to an Intel FPGA shift register IP core.
For detection, all shift registers must have the following characteristics:
Use the same clock and clock enable
No other secondary signals
Equally spaced taps that are at least three registers apart
Synthesis recognizes shift registers only for device families with dedicated RAM blocks. Intel Quartus Prime Pro Edition synthesis uses the following guidelines:
The Intel Quartus Prime software determines whether to infer the Intel FPGA shift register IP core based on the width of the registered bus (W), the length between each tap (L), or the number of taps (N).
If the Auto Shift Register Recognition option is set to Auto, Intel Quartus Prime Pro Edition synthesis determines which shift registers are implemented in RAM blocks for logic by using:
— The Optimization Technique setting
— Logic and RAM utilization information about the design
— Timing information from Timing-Driven Synthesis
If the registered bus width is one (W = 1), Intel Quartus Prime synthesis infers the shift register IP if the number of taps times the length between each tap is greater than or equal to 64 (N × L ≥ 64).
If the registered bus width is greater than one (W > 1), and the registered bus width times the number of taps times the length between each tap is greater than or equal to 32 (W × N × L ≥ 32), Intel Quartus Prime synthesis infers the Intel FPGA shift register IP core.
If the length between each tap (L) is not a power of two, Intel Quartus Prime synthesis needs external logic (LEs or ALMs) to decode the read and write counters, because of different sizes of shift registers. This extra decode logic eliminates the performance and utilization advantages of implementing shift registers in memory.
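The size thresholds in the guidelines above can be expressed as a small simulation-only check. This is a rough sketch that follows the "greater than or equal to" wording of the text; the module and function names are hypothetical, and it deliberately ignores the other factors that synthesis weighs (the Auto Shift Register Recognition setting, the Optimization Technique setting, utilization, and timing).

```verilog
// Hypothetical simulation-only sketch of the documented size thresholds.
// w = registered bus width, n = number of taps, l = length between taps.
module shift_reg_inference_check;

    // Returns 1 when the size threshold described in the text is met.
    function automatic infer_shift_ip(input integer w,
                                      input integer n,
                                      input integer l);
        begin
            if (w == 1)
                infer_shift_ip = (n * l >= 64);
            else
                infer_shift_ip = (w * n * l >= 32);
        end
    endfunction

    initial begin
        $display("W=1 N=1 L=69 -> %0d", infer_shift_ip(1, 1, 69)); // 1
        $display("W=1 N=1 L=32 -> %0d", infer_shift_ip(1, 1, 32)); // 0
        $display("W=8 N=4 L=64 -> %0d", infer_shift_ip(8, 4, 64)); // 1
        $display("W=2 N=1 L=8  -> %0d", infer_shift_ip(2, 1, 8));  // 0
        $finish;
    end
endmodule
```

A result of 1 only means that the size condition is satisfied; the shift register must still meet the detection characteristics listed earlier (same clock and clock enable, no other secondary signals, equally spaced taps).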
The registers that Intel Quartus Prime synthesis maps to the Intel FPGA shift register IP core, and places in RAM are not available in a Verilog HDL or VHDL output file for simulation tools, because their node names do not exist after synthesis.
Note:
The Compiler cannot implement a shift register that uses a shift enable signal into MLAB memory; instead, the Compiler uses dedicated RAM blocks. To control the type of memory structure that implements the shift register, use the ramstyle attribute.
1.4.3.1. Simple Shift Register
The examples in this section show a simple, single-bit wide, 69-bit long shift register. Intel Quartus Prime synthesis implements the register (W = 1 and M = 69) in an ALTSHIFT_TAPS IP core and maps it to RAM in supported devices, which may be placed in dedicated RAM blocks or MLAB memory. If the length of the register is less than 69 bits, Intel Quartus Prime synthesis implements the shift register in logic.
Example 32. Verilog HDL Single-Bit Wide, 69-Bit Long Shift Register
module shift_1x69 (clk, shift, sr_in, sr_out);
    input clk, shift;
    input sr_in;
    output sr_out;

    reg [68:0] sr;

    always @ (posedge clk) begin
        if (shift == 1'b1) begin
            sr[68:1] <= sr[67:0];
            sr[0] <= sr_in;
        end
    end

    // Drive the output from the last stage with a continuous assignment
    assign sr_out = sr[68];
endmodule
Example 33. VHDL Single-Bit Wide, 69-Bit Long Shift Register
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.all;

ENTITY shift_1x69 IS
    PORT (
        clk: IN STD_LOGIC;
        shift: IN STD_LOGIC;
        sr_in: IN STD_LOGIC;
        sr_out: OUT STD_LOGIC
    );
END shift_1x69;

ARCHITECTURE arch OF shift_1x69 IS
    TYPE sr_length IS ARRAY (68 DOWNTO 0) OF STD_LOGIC;
    SIGNAL sr: sr_length;
BEGIN
    PROCESS (clk)
    BEGIN
        IF (rising_edge(clk)) THEN
            IF (shift = '1') THEN
                sr(68 DOWNTO 1) <= sr(67 DOWNTO 0);
                sr(0) <= sr_in;
            END IF;
        END IF;
    END PROCESS;
    -- Output the last stage so that the register is 69 bits long
    sr_out <= sr(68);
END arch;
1.4.3.2. Shift Register with Evenly Spaced Taps
The following examples show a Verilog HDL and VHDL 8-bit wide, 255-bit long shift register (W > 1 and M = 255) with evenly spaced taps at 64, 128, 192, and 254.
The synthesis software implements this function in a single ALTSHIFT_TAPS IP core and maps it to RAM in supported devices, where it can be placed in dedicated RAM blocks or MLAB memory.
Example 34. Verilog HDL 8-Bit Wide, 255-Bit Long Shift Register with Evenly Spaced Taps
module top (clk, shift, sr_in, sr_out, sr_tap_one, sr_tap_two, sr_tap_three);
    input clk, shift;
    input [7:0] sr_in;
    output [7:0] sr_tap_one, sr_tap_two, sr_tap_three, sr_out;

    reg [7:0] sr [254:0];
    integer n;

    always @ (posedge clk)
    begin
        if (shift == 1'b1)
        begin
            for (n = 254; n>0; n = n-1)
            begin
                sr[n] <= sr[n-1];
            end
            sr[0] <= sr_in;
        end
    end

    assign sr_tap_one = sr[64];
    assign sr_tap_two = sr[128];
    assign sr_tap_three = sr[192];
    assign sr_out = sr[254];
endmodule

1.5. Register and Latch Coding Guidelines

This section provides device-specific coding recommendations for Intel registers and latches. Understanding the architecture of the target Intel device helps ensure that your RTL produces the expected results and achieves the optimal quality of results.

1.5.1. Register Power-Up Values

Registers in the device core power up to a low (0) logic level on all Intel FPGA devices. However, for designs that specify a power-up level other than 0, synthesis tools can implement logic that directs registers to behave as if they were powering up to a high (1) logic level.
For designs that use preset signals on a target device that does not support presets in the register architecture, synthesis may convert the preset signal to a clear signal, which requires a NOT gate push-back optimization. NOT gate push-back adds an inverter to the input and the output of the register, so that the reset and power-up conditions appear high, and the device operates as expected. In this case, the synthesis tool may issue a message about the power-up condition. The register itself powers up low, but because the register output is inverted, the signal that arrives at all destinations is high.
Due to these effects, if you specify a non-zero reset value, the synthesis tool may use the asynchronous clear (aclr) signals available on the registers to implement the high bits with NOT gate push-back. In that case, the registers look as though they power up to the specified reset value.
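The following Verilog HDL is a minimal sketch of a register that requests a high power-up level and a high reset value, which is the case the paragraphs above describe. The module and signal names are hypothetical. On a device without preset support, synthesis may implement this register with an asynchronous clear and inverters on the data input and the output.

```verilog
// Sketch only: module and signal names are hypothetical.
module powerup_high_reg (
    input  wire clk,
    input  wire rst_n,  // active-low asynchronous reset
    input  wire d,
    output reg  q
);
    // Request a high (1) power-up level for the register.
    initial q = 1'b1;

    // The reset value is also high. On devices without presets,
    // synthesis may map this to aclr with NOT gate push-back.
    always @ (posedge clk or negedge rst_n)
    begin
        if (!rst_n)
            q <= 1'b1;
        else
            q <= d;
    end
endmodule
```

Keeping the power-up level and the reset value identical, as in this sketch, avoids asking the register to satisfy two conflicting initial conditions.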