Chelsio Communications Terminator Series, Terminator 6 Installation And User Manual

Chelsio Unified Wire for Linux i
This document and related products are distributed under licenses restricting their use, copying, distribution, and reverse-engineering.
No part of this document may be reproduced in any form or by any means without prior written permission by Chelsio Communications.
All third-party trademarks are copyright of their respective owners. THIS DOCUMENTATION IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
THE USE OF THE SOFTWARE AND ANY ASSOCIATED MATERIALS (COLLECTIVELY THE “SOFTWARE”) IS SUBJECT TO THE SOFTWARE LICENSE TERMS OF CHELSIO COMMUNICATIONS, INC.
Sales
For all sales inquiries please send email to sales@chelsio.com
Support
For all support related questions please send email to support@chelsio.com
Copyright © 2018. Chelsio Communications. All Rights Reserved.
Chelsio ® is a registered trademark of Chelsio Communications. All other marks and names mentioned herein may be trademarks of their respective companies.
Chelsio Communications (Headquarters) 209 North Fair Oaks Avenue, Sunnyvale, CA 94085 U.S.A
www.chelsio.com
Tel: 408.962.3600 Fax: 408.962.3661
Chelsio (India) Private Limited Subramanya Arcade, Floor 3, Tower B No. 12, Bannerghatta Road, Bangalore-560029 Karnataka, India
Tel: +91-80-4039-6800
Chelsio KK (Japan)
Yamato Building 8F, 5-27-3 Sendagaya, Shibuya-ku, Tokyo 151-0051, Japan
Document History
Version    Revision Date
1.0.0      12/08/2011
1.0.1      01/09/2013
1.0.2      01/27/2013
1.0.3      03/26/2013
1.0.4      04/12/2013
1.0.5      06/20/2013
1.0.6      08/17/2013
1.0.7      10/22/2013
1.0.8      03/08/2013
1.0.9      05/15/2013
1.1.0      07/26/2013
1.1.1      08/14/2013
1.1.2      12/06/2013
1.1.3      12/19/2013
1.1.4      03/13/2014
1.1.5      05/02/2014
1.1.6      06/30/2014
1.1.7      10/22/2014
1.1.8      11/04/2014
1.1.9      02/05/2015
1.2.0      03/04/2015
1.2.1      03/25/2015
1.2.2      06/03/2015
1.2.3      08/05/2015
1.2.4      02/29/2016
1.2.5      04/27/2016
1.2.6      08/25/2016
1.2.7      10/07/2016
1.2.8      10/18/2016
1.2.9      11/11/2016
1.3.0      12/05/2016
1.3.1      12/30/2016
1.3.2      01/30/2017
1.3.3      02/27/2017
1.3.4      05/11/2017
1.3.5      09/05/2017
1.3.6      09/29/2017
1.3.7      12/29/2017
1.3.8      02/28/2018
1.3.9      03/30/2018
1.4.0      04/18/2018
1.4.1      07/05/2018
1.4.2      07/18/2018
1.4.3      10/01/2018
TABLE OF CONTENTS
I. CHELSIO UNIFIED WIRE
Introduction
1.1. Features
1.2. Hardware Requirements
1.3. Software Requirements
1.4. Package Contents
Hardware Installation
Software/Driver Installation
3.1. Pre-requisites
3.2. Enabling RDMA on ARM Platforms
3.3. Mounting debugfs
3.4. Allowing unsupported modules on SLES
3.5. Installing Chelsio Unified Wire from source
3.6. Installing Chelsio Unified Wire from RPM
3.7. Firmware Update
3.8. Removing Drivers from initramfs
Configuring Chelsio Network Interfaces
4.1. Configuring Adapters
4.2. Configuring network-scripts
4.3. Creating network-scripts
4.4. Checking Link
Performance Tuning
5.1. Generic
5.2. Throughput
5.3. Latency
Software/Driver Update
Software/Driver Uninstallation
7.1. Uninstalling Chelsio Unified Wire from source
7.2. Uninstalling Chelsio Unified Wire from RPM
II. NETWORK (NIC/TOE)
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
3.1. Loading in NIC mode (without full offload support)
3.2. Loading in TOE mode (with full offload support)
Software/Driver Configuration
4.1. Enabling TCP Offload
4.2. Enabling Busy waiting
4.3. Precision Time Protocol (PTP)
4.4. VXLAN Offload
4.5. Performance Tuning
Software/Driver Unloading
5.1. Unloading the NIC Driver
5.2. Unloading the TOE Driver
III. VIRTUAL FUNCTION NETWORK (VNIC)
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Software/Driver Loading
3.1. Instantiate Virtual Functions (SR-IOV)
3.2. Loading the Driver
Software/Driver Configuration and Fine-tuning
4.1. VF Rate Limiting
4.2. Bonding
4.3. High Capacity VF Configuration
Software/Driver Unloading
5.1. Unloading the Driver
IV. IWARP (RDMA)
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Software/Driver Loading
3.1. Loading iWARP Driver
Software/Driver Configuration and Fine-tuning
4.1. Testing connectivity with ping and rping
4.2. Enabling various MPIs
4.3. Setting up NFS-RDMA
4.4. Performance Tuning
Software/Driver Unloading
V. ISER
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Kernel Installation
Software/Driver Installation
3.1. Pre-requisites
3.2. Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
5.1. HMA
5.2. Performance Tuning
Software/Driver Unloading
VI. WD-UDP
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Accelerating UDP Socket Communications
Software/Driver Unloading
VII. WD-TOE
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Running the Application
Software/Driver Unloading
VIII. NVME-OF
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Kernel Installation
Software/Driver Installation
3.1. Pre-requisites
3.2. Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
5.1. Target
5.2. Initiator
5.3. HMA
5.4. Performance Tuning
Software/Driver Unloading
IX. LIO ISCSI TARGET OFFLOAD
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Kernel Configuration
Software/Driver Installation
3.1. Pre-requisites
3.2. Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
5.1. Configuring LIO iSCSI Target
5.2. Offloading LIO iSCSI Connection
5.3. Running LIO iSCSI and Network Traffic Concurrently
5.4. Performance Tuning
Software/Driver Unloading
6.1. Unloading the LIO iSCSI Target Offload Driver
6.2. Unloading the NIC Driver
X. ISCSI PDU OFFLOAD TARGET
Introduction
1.1. Features
1.2. Hardware Requirements
1.3. Software Requirements
Software/Driver Installation
Software/Driver Loading
3.1. Latest iSCSI Software Stack Driver Software
Software/Driver Configuration and Fine-tuning
4.1. Command Line Tools
4.2. iSCSI Configuration File
4.3. A Quick Start Guide for Target
4.4. The iSCSI Configuration File
4.5. Challenge-Handshake Authenticate Protocol (CHAP)
4.6. Target Access Control List (ACL) Configuration
4.7. Target Storage Device Configuration
4.8. Target Redirection Support
4.9. The command line interface tools “iscsictl” & “chisns”
4.10. Rules of Target Reload (i.e. “on the fly” changes)
4.11. System Wide Parameters
4.12. Performance Tuning
Software/Driver Unloading
XI. ISCSI PDU OFFLOAD INITIATOR
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Accelerating open-iSCSI Initiator
4.2. HMA
4.3. Auto login from cxgb4i initiator at OS bootup
4.4. Performance Tuning
Software/Driver Unloading
XII. CRYPTO OFFLOAD
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Kernel Configuration
Software/Driver Installation
3.1. Pre-requisites
3.2. Installation
Software/Driver Loading
4.1. Co-processor
4.2. Inline
Software/Driver Configuration and Fine-tuning
5.1. Co-processor
5.2. Inline
5.3. Performance Tuning
Software/Driver Unloading
XIII. DATA CENTER BRIDGING (DCB)
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Configuring Cisco Nexus 5010 switch
4.2. Configuring the Brocade 8000 switch
Running NIC & iSCSI Traffic together with DCBx
XIV. FCOE FULL OFFLOAD INITIATOR
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Configuring Cisco Nexus 5010 and Brocade switch
4.2. FCoE fabric discovery verification
4.3. Formatting the LUNs and Mounting the Filesystem
4.4. Creating Filesystem
4.5. Mounting the formatted LUN
Software/Driver Unloading
XV. OFFLOAD BONDING
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Offloading TCP traffic over a bonded interface
Software/Driver Unloading
XVI. OFFLOAD MULTI-ADAPTER FAILOVER (MAFO)
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Offloading TCP traffic over a bonded interface
Software/Driver Unloading
XVII. UDP SEGMENTATION OFFLOAD AND PACING
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
4.1. Modifying the Application
4.2. Configuring UDP Pacing
Software/Driver Unloading
XVIII. OFFLOAD IPV6
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
Software/Driver Unloading
5.1. Unloading the NIC Driver
5.2. Unloading the TOE Driver
XIX. WD SNIFFING AND TRACING
Theory of Operation
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
Usage
3.1. Installing Basic Support
3.2. Using Sniffer (wd_sniffer)
3.3. Using Tracer (wd_tcpdump_trace)
XX. CLASSIFICATION AND FILTERING
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
LE-TCAM Filters
2.1. Configuration
2.2. Creating Filter Rules
2.3. Listing Filter Rules
2.4. Removing Filter Rules
2.5. Layer 3 Example
2.6. Layer 2 Example
2.7. Filtering VF traffic
Hash/DDR Filters
3.1. Configuration
3.2. Creating Filter Rules
3.3. Listing Filter Rules
3.4. Removing Filter Rules
3.5. Filter Priority
3.6. Swap MAC Feature
3.7. Traffic Mirroring
3.8. Packet Tracing and Hit Counters
NAT Filtering
XXI. OVS KERNEL DATAPATH OFFLOAD
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Software/Driver Configuration and Fine Tuning
3.1. Configuring OVS Machine
3.2. Creating OVS flows
3.3. Verifying OVS Flow Dump
3.4. Setting up ODL with OVS
Software/Driver Uninstallation
XXII. RING BACKBONE
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
1.3. Ring Connectivity
Software/Driver Installation
Software/Driver Configuration and Fine-tuning
XXIII. TRAFFIC MANAGEMENT
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Loading
Software/Driver Configuration and Fine-tuning
3.1. Traffic Management Rules
3.2. Configuring Traffic Management
Usage
4.1. Non-Offloaded Connections
4.2. Offloaded Connections
4.3. Offloaded Connections with Modified Application
Software/Driver Unloading
XXIV. DPDK DRIVER
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
Software/Driver Installation
2.1. Pre-requisites
2.2. Installation
Flashing Firmware Configuration File
Software/Driver Loading
Software/Driver Configuration and Fine Tuning
5.1. Huge Pages
5.2. Binding Network Ports
5.3. Unbinding Network Ports
5.4. Performance Tuning
Running DPDK Test Applications
6.1. Testpmd application
6.2. Pktgen Application
6.3. Runtime Options
Software/Driver Unloading
Software/Driver Uninstallation
XXV. UNIFIED BOOT
Introduction
1.1. Hardware Requirements
1.2. Software Requirements
1.3. Pre-requisites
Secure Boot
Flashing firmware and option ROM
3.1. Preparing USB flash drive
3.2. Legacy
3.3. uEFI
3.4. HP Firmware Management Protocol (FMP)
3.5. Default Option ROM Settings
Configuring PXE Server
PXE Boot Process
5.1. Legacy PXE Boot
5.2. uEFI PXE Boot
FCoE Boot Process
6.1. Legacy FCoE Boot
6.2. uEFI FCoE Boot
iSCSI Boot Process
7.1. Legacy iSCSI Boot
7.2. uEFI iSCSI Boot
Creating Driver Update Disk (DUD)
8.1. Creating DUD for RedHat Enterprise Linux
8.2. Creating DUD for Suse Enterprise Linux
OS Installation
9.1. Installation using Chelsio DUD
9.2. Installation on FCoE LUN
9.3. Installation on iSCSI LUN
XXVI. APPENDIX A
Troubleshooting
Chelsio End-User License Agreement (EULA)
Chapter I. Chelsio Unified Wire
I. Chelsio Unified Wire
Introduction
Thank you for choosing Chelsio Unified Wire adapters. These high speed, single chip, single firmware cards provide enterprises and data centers with high performance solutions for various Network and Storage related requirements.
The Terminator series is Chelsio’s next generation of highly integrated, hyper-virtualized 1/10/25/40/50/100GbE controllers. The adapters are built around a programmable protocol-processing engine, with full offload of a complete Unified Wire solution comprising NIC, TOE, iWARP RDMA, iSCSI, FCoE and NAT support. It scales to true 100Gb line rate operation from a single TCP connection to thousands of connections, and allows simultaneous low latency and high bandwidth operation thanks to multiple physical channels through the ASIC.
Ideal for all data, storage and high-performance clustering applications, the Unified Wire adapters enable a unified fabric over a single wire by simultaneously running all unmodified IP sockets, Fibre Channel and InfiniBand applications over Ethernet at line rate.
Designed for deployment in virtualized data centers, cloud service installations and high-performance computing environments, Chelsio adapters bring a new level of performance metrics and functional capabilities to the computer networking industry.
Chelsio Unified Wire software is distributed in two forms: a source package and an RPM package. Installing from source requires compiling the package to generate the necessary binaries; choose this method when you are using a custom-built kernel. The source package can also be installed using the interactive GUI installer. In other cases, download the RPM package specific to your operating system and follow the steps described in this document to install it. Note that the OFED software required to install the Chelsio iWARP driver is bundled in both the source and RPM packages.
This document describes the installation, use and maintenance of Unified Wire software and its various components.
1.1. Features
The Chelsio Unified Wire package uses a single command to install various drivers and utilities. It consists of the following software:
Network (NIC/TOE)
Virtual Function Network (vNIC)
iWARP (RDMA)
iSER
WD-UDP
WD-TOE
NVMe-oF
LIO iSCSI Target Offload
iSCSI PDU Offload Target
iSCSI PDU Offload Initiator
Crypto Offload
Data Center Bridging (DCB)
FCoE full offload Initiator
Offload Bonding
Offload Multi-Adapter Failover (MAFO)
UDP Segmentation Offload and Pacing
Offload IPv6
Classification and Filtering feature
OVS Kernel Datapath Offload
Ring Backbone
Traffic Management feature (TM)
DPDK
Unified Boot Software
Utility Tools (cop, cxgbtool, t4_perftune, benchmark tools, sniffer & tracer)
libs (iWARP, WD-UDP and WD-TOE libraries)
For detailed instructions on loading, unloading and configuring the drivers/tools please refer to their respective sections.
1.2. Hardware Requirements
The Chelsio Unified Wire software supports Chelsio Terminator series of Unified Wire adapters. To know more about the list of adapters supported by each driver, please refer to their respective sections.
1.3. Software Requirements
The Chelsio Unified Wire software has been developed to run on 64-bit Linux platforms only; a 64-bit Linux installation is therefore a base requirement for running the drivers. For the complete list of operating systems supported by each driver, please refer to the respective sections.
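As a quick pre-installation check of the 64-bit requirement, the machine architecture reported by uname can be compared against a list of 64-bit names. The following is a minimal sketch only: the helper name is_64bit_arch and the list of architecture strings are assumptions made for illustration and are not Chelsio's official support matrix.

```shell
# Return success when the given machine string names a 64-bit architecture.
# The architecture names below are illustrative assumptions, not a support list.
is_64bit_arch() {
    case "$1" in
        x86_64|aarch64|ppc64|ppc64le) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running system.
arch=$(uname -m)
if is_64bit_arch "$arch"; then
    echo "64-bit platform detected: $arch"
else
    echo "not a recognized 64-bit platform: $arch"
fi
```

Whether a particular distribution and kernel is supported still has to be confirmed in the chapter of the driver concerned.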
1.4. Package Contents
Source Package
The Chelsio Unified Wire source package consists of the following files/directories:
debrules: This directory contains packaging specification files required for building Debian packages.
docs: This directory contains support documents - README, Release Notes and User’s Guide (this document) for the software.
kernels: This directory contains kernel.org-4.9.105 installation files.
libs: This directory is for libraries required to install the WD-UDP, WD-TOE and iWARP drivers. The libibverbs library implements the RDMA verbs used by iWARP applications for data transfers. The librdmacm library works as an RDMA connection manager. The libcxgb4 library works as an interface between the above mentioned generic libraries and the Chelsio iWARP driver. The libcxgb4_sock library is an LD_PRELOAD-able library that accelerates UDP socket communications transparently and without recompilation of the user application.
OFED: This directory contains supported OFED packages.
RPM-Manager: This directory contains support scripts used for cluster deployment.
scripts: Support scripts used by the Unified Wire Installer.
specs: The packaging specification files required for building RPM packages.
src: Source code for different drivers.
support: This directory contains source files for the dialog utility.
tools:
  autoconf-x.xx: This directory contains the source for the Autoconf tool needed for WD-UDP and iWARP libraries.
  benchmarks: This directory contains various benchmarking tools to measure throughput and latency of various networks.
  chelsio_adapter_config: This directory contains scripts and binaries needed to configure Chelsio 40G adapters.
  cop: The cop tool compiles offload policies into a simple program form that can be loaded into the kernel and interpreted. These offload policies are used to determine the settings to be used for various connections. The connections to which the settings are applied are based on matching filter specifications. Please find more details on this tool in its manual page (run man cop command).
  cudbg: Chelsio Unified Debug tool which facilitates collection and viewing of various debug entities like register dump, Devlog, CIM LA, etc.
  cxgbtool: The cxgbtool queries or sets various aspects of Chelsio network interface cards. It complements standard tools used to configure network settings and provides functionality not available through such tools. Please find more details on this tool in its manual page (run man cxgbtool command). To use cxgbtool for the FCoE Initiator driver, run:
  [root@host~]# cxgbtool stor -h
  nvme_utils: This directory contains nvmecli, nvmetcli and targetcli installation files, and dependent components.
  rdma_tools: This directory contains iWARP benchmarking tools.
  t4_sniffer: This directory contains sniffer tracing and filtering libraries. See WD Sniffing and Tracing chapter for more information.
  90-rdma.rules: This file contains udev rules needed for running RDMA applications as a non-root user.
  chdebug: This script collects operating system environment details and debug information which can be sent to the support team, to troubleshoot Chelsio hardware/software related issues.
  chiscsi_set_affinity.sh: This shell script is used for mapping iSCSI Worker threads to different CPUs.
  chsetup: The chsetup tool loads NIC, TOE and iWARP drivers, and creates the WD-UDP configuration file.
  chstatus: This utility provides status information on any Chelsio NIC in the system.
  Makefile: The Makefile for building and installing tools.
  t4_latencytune.sh: Script used for latency tuning of Chelsio adapters.
  t4_perftune.sh: This shell script tunes the system for higher performance by modifying the IRQ-CPU binding. It can also be used to change Tx coalescing settings.
  t4-forward.sh: RFC2544 Forward test tuning script.
  uname_r: This file is used by the chstatus script to verify whether the Linux platform is supported.
  wdload: UDP acceleration tool.
  wdunload: Used to unload all the loaded Chelsio drivers.
Uboot: There are two sub-directories in the Uboot directory: OptionROM and LinuxDUD. The OptionROM directory contains Unified Boot Option ROM image (cubt4.bin), uEFI driver (ChelsioUD.efi), default boot configuration file (bootcfg) and a legacy flash utility (cfut4.exe), which can be used to flash the option ROM onto Chelsio adapters (CNAs). The LinuxDUD directory contains image (.img) files required to update drivers for Linux distributions.
chelsio-dkms.conf: DKMS configuration files for Ubuntu 16.04.1
install.py, dialog.py: Python scripts needed for the GUI installer.
EULA: Chelsio’s End User License Agreement.
install-dkms.sh: Installs necessary drivers to DKMS tree for Ubuntu 16.04.1
install.log: File containing installation summary.
Makefile: The Makefile for building and installing from the source.
sample_machinefile: Sample file used during iWARP installation on cluster nodes.
RPM Package
The Chelsio Unified Wire RPM package consists of the following:
config: This directory contains firmware configuration files.
docs: This directory contains support documents i.e. README, Release Notes and User’s Guide (this document) for the software.
DRIVER-RPMS: RPM packages of Chelsio drivers.
OFED-RPMS: OFED RPM packages required to install iWARP driver.
scripts: Support scripts used by the Unified Wire Installer.
EULA: Chelsio’s End User License Agreement.
install.py: Python script that installs the RPM package. See Chelsio Unified Wire’s Software/Driver Installation section for more information.
uninstall.py: Python script that uninstalls the RPM package. See Chelsio Unified Wire’s Software/Driver Uninstallation section for more information.
Uboot: There are two sub-directories in the Uboot directory: OptionROM and LinuxDUD. The OptionROM directory contains Unified Boot Option ROM image (cubt4.bin), uEFI driver (ChelsioUD.efi), default boot configuration file (bootcfg) and a legacy flash utility (cfut4.exe), which can be used to flash the option ROM onto Chelsio adapters (CNAs). The LinuxDUD directory contains image (.img) files required to update drivers for Linux distributions.
Hardware Installation
Follow these steps to install a Chelsio adapter in your system:
i. Shut down/power off your system.
ii. Power off all remaining peripherals attached to your system.
iii. Unpack the Chelsio adapter and place it on an anti-static surface.
iv. Remove the system case cover as per the system manufacturer’s instructions.
v. Remove the PCI filler plate from the slot where you will install the Ethernet adapter.
vi. For maximum performance, it is highly recommended to install the adapter into a PCIe x8/x16 slot.
vii. Holding the Chelsio adapter by the edges, align the edge connector with the PCI connector on the motherboard. Apply even pressure on both edges until the card is firmly seated. It may be necessary to remove the SFP (transceiver) modules prior to inserting the adapter.
viii. Secure the Chelsio adapter with a screw, or other securing mechanism, as described by the system manufacturer’s instructions. Replace the case cover.
ix. After securing the card, ensure that the card is still fully seated in the PCIe x8/x16 slot, as the process of securing the card sometimes causes it to become unseated.
x. Connect a fiber/twinax cable (multi-mode for short range (SR) optics, single-mode for long range (LR) optics) to the Ethernet adapter, or a regular Ethernet cable to the 1Gb Ethernet adapter.
xi. Power on your system.
xii. Run the update-pciids command to download the current version of the PCI ID list.
xiii. Verify that the adapter was installed successfully by using the lspci command.
Note: All 4 ports of the T6425-CR adapter will be functional only if PCIe x8 -> 2x PCIe x4 slot bifurcation is supported by the system and enabled in BIOS. Otherwise, only 2 ports will be functional.
For Chelsio adapters, the physical functions are currently assigned as:
Physical functions 0 - 3: for the SR-IOV functions of the adapter
Physical function 4: for all NIC functions of the adapter
Physical function 5: for iSCSI
Physical function 6: for FCoE
Physical function 7: Currently not assigned
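The physical function assignment listed above can be written as a small lookup, which may help when reading lspci output. This is an illustrative sketch: the helper names pf_role and pf_of_address and the address 03:00.4 are hypothetical, and the short role labels are paraphrased from the list.

```shell
# Map a PCI physical function number to the role given in the list above.
pf_role() {
    case "$1" in
        0|1|2|3) echo "SR-IOV" ;;
        4) echo "NIC" ;;
        5) echo "iSCSI" ;;
        6) echo "FCoE" ;;
        7) echo "unassigned" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Extract the function number (the part after the last dot) from a
# PCI address of the form bus:device.function.
pf_of_address() {
    echo "${1##*.}"
}

# Example with a hypothetical address.
pf_role "$(pf_of_address 03:00.4)"   # prints NIC
```

On an installed system, the addresses would typically come from lspci output; filtering with lspci -d 1425: (1425 being the PCI vendor ID registered to Chelsio) is one way to limit the listing to Chelsio devices.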
Once Unified Wire package is installed and loaded, examine the output of dmesg to see if the card is discovered. You should see a similar output:
The above outputs indicate the hardware configuration of the adapter as well as serial number.
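To narrow the kernel log down to adapter messages, the dmesg output can be filtered by driver name. The sketch below assumes the messages carry the string cxgb4, the name of the in-kernel Chelsio T4/T5/T6 network driver; the helper name chelsio_log_lines is hypothetical, and the exact message text depends on the driver version.

```shell
# Keep only kernel log lines that mention the cxgb4 driver family
# (case-insensitive). Reads from standard input.
chelsio_log_lines() {
    grep -i 'cxgb4'
}

# Usage, typically as root:
#   dmesg | chelsio_log_lines
```

An empty result usually means the driver is not loaded or the adapter was not detected.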
Note: Network device names for Chelsio’s physical ports are assigned using the following convention: the port farthest from the motherboard will appear as the first network interface. However, for T5 40G and T420-BT adapters, the association of physical Ethernet ports and their corresponding network device names is opposite. For these adapters, the port nearest to the motherboard will appear as the first network interface.
Software/Driver Installation
There are two main methods to install the Chelsio Unified Wire package: from source and RPM. If you decide to use source, you can install the package using CLI or GUI mode. If you decide to use RPM, you can install the package using Menu or CLI mode.
RPM packages support only distro base kernels. For updated or custom kernels, use the source package. Irrespective of the installation method chosen, the machine must be rebooted for the changes to take effect.
The following table describes the various configuration tuning options available during installation and drivers/software installed with each option by default:
Configuration Tuning Option | Description | Driver/Software installed
--- | --- | ---
Unified Wire (Default) | Default Configuration. Configures adapters to run all protocols simultaneously. | NIC/TOE, vNIC, iWARP, iSER, WD-UDP, NVMe-oF, LIO iSCSI Target, iSCSI Target, iSCSI Initiator, FCoE Initiator, Bonding, MAFO, IPv6, Sniffer & Tracer, Filtering, TM
Low latency Networking | Configures adapters to run TOE and iWARP traffic with low latency. | TOE, iWARP, WD-UDP, WD-TOE, IPv6, Bonding, MAFO
High capacity RDMA | Configures adapters to establish a large number of iWARP connections. | iWARP
RDMA Performance | Improves iWARP performance. | iWARP, iSER, NVMe-oF
High capacity TOE | Configures adapters to establish a large number of TOE connections. | TOE, Bonding, MAFO, IPv6
iSCSI Performance+ | Improves iSCSI performance. | LIO iSCSI Target, iSCSI Target, iSCSI Initiator, Bonding, DCB
UDP Seg. Offload & Pacing* | Configures adapters to establish a large number of UDP Segmentation Offload connections. | UDP-SO, Bonding
Wire Direct Latency# | Configures adapters to provide low Wire Direct latency. | TOE, iWARP, WD-UDP, WD-TOE
High Capacity WD | Configures adapters to establish a large number of WD-UDP connections. | WD-UDP, WD-TOE
Hash Filter# | Configures adapters to create more filters. | Filtering
Ring Backbone# | Configures adapters in a ring backbone. | NIC/TOE, iWARP, iSER, WD-UDP, NVMe-oF, LIO iSCSI Target, iSCSI Target, iSCSI Initiator, IPv6, Filtering, TM
NVMe Performance^ | Improves NVMe-oF performance. | iWARP, NVMe-oF
High Capacity VF# | Configures adapters to support more VFs. | NIC, vNIC
* Supported on T4/T5
+ Supported only on T5
# Supported on T5/T6
^ Supported only on T6
3.1. Pre-requisites
To install Unified Wire using GUI mode (with Dialog utility), the ncurses-devel package must be installed.
3.2. Enabling RDMA on ARM Platforms
RDMA is disabled by default in the RHEL 7.3 build for the ARM architecture. To enable this feature, follow the steps mentioned below:
i. Download the kernel source package and extract it.
ii. Create a kernel configuration file.
[root@host~]# make oldconfig
iii. The above command will create a configuration file .config in the same location. Edit the file and enable the following parameters:
CONFIG_NET_VENDOR_CHELSIO=y
CONFIG_INFINIBAND=y
iv. Compile the kernel.
v. During kernel compilation, please ensure that the following parameters are set as follows:
CONFIG_CHELSIO_T1=m
CONFIG_CHELSIO_T1_1G=y
CONFIG_CHELSIO_T3=m
CONFIG_CHELSIO_T4=m
CONFIG_CHELSIO_T4VF=m
CONFIG_INFINIBAND_USER_MAD=m
CONFIG_INFINIBAND_USER_ACCESS=m
CONFIG_INFINIBAND_USER_MEM=y
CONFIG_INFINIBAND_CXGB3=m
CONFIG_INFINIBAND_CXGB3_DEBUG=y
CONFIG_INFINIBAND_CXGB4=m
CONFIG_SCSI_CXGB3_ISCSI=m
CONFIG_SCSI_CXGB4_ISCSI=m
vi. Install the kernel.
vii. Reboot into the newly installed kernel.
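Before compiling, the edited .config can be checked against the parameters listed in the steps above. The following is a minimal sketch; the helper name check_kconfig is hypothetical, and it only reports options whose exact NAME=value line is absent from the file.

```shell
# Print every required option that is missing from, or set differently in,
# the given kernel configuration file. Returns non-zero if any option
# does not match exactly.
check_kconfig() {
    config_file=$1
    shift
    status=0
    for opt in "$@"; do
        # -x: match the whole line, -F: treat the option as a fixed string
        if ! grep -qxF "$opt" "$config_file"; then
            echo "missing or different: $opt"
            status=1
        fi
    done
    return $status
}

# Usage, from the kernel source directory:
#   check_kconfig .config CONFIG_NET_VENDOR_CHELSIO=y CONFIG_INFINIBAND=y \
#       CONFIG_CHELSIO_T4=m CONFIG_INFINIBAND_CXGB4=m
```

Any remaining parameters from step (v) can be appended to the argument list in the same way.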
3.3. Mounting debugfs
All driver debug data is stored in debugfs, which will be mounted in most cases. If not, mount it manually using:
[root@host~]# mount -t debugfs none /sys/kernel/debug
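Whether debugfs is already mounted can be checked first, so that the mount command is only run when needed. This is a sketch under the assumption that the kernel exposes its mount table at /proc/mounts (standard on Linux); the helper name debugfs_mounted is hypothetical.

```shell
# Succeed when the mount table (default: /proc/mounts) lists a filesystem
# of type debugfs. The third whitespace-separated field is the type.
debugfs_mounted() {
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Usage, as root:
#   debugfs_mounted || mount -t debugfs none /sys/kernel/debug
```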
3.4. Allowing unsupported modules on SLES
On SLES11 SPx platforms, edit the /etc/modprobe.d/unsupported-modules file and change allow_unsupported_modules to 1.
On SLES12 SPx platforms, edit the /etc/modprobe.d/10-unsupported-modules.conf file and change allow_unsupported_modules to 1.
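The edit described above can also be scripted. The sketch below assumes the file contains a line of the form "allow_unsupported_modules 0", which is the usual SLES layout but should be verified on the target system; the helper name allow_unsupported_modules is hypothetical.

```shell
# Change "allow_unsupported_modules 0" to "allow_unsupported_modules 1"
# in the given modprobe configuration file. Other lines are left untouched.
allow_unsupported_modules() {
    tmp=$(mktemp) || return 1
    if sed 's/^allow_unsupported_modules[[:space:]]\{1,\}0/allow_unsupported_modules 1/' "$1" > "$tmp"; then
        # Write back through the existing file to keep its owner and mode.
        cat "$tmp" > "$1"
    fi
    rm -f "$tmp"
}

# Usage on SLES12 SPx, as root:
#   allow_unsupported_modules /etc/modprobe.d/10-unsupported-modules.conf
```

On SLES11 SPx the same function would be pointed at /etc/modprobe.d/unsupported-modules instead.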
3.5. Installing Chelsio Unified Wire from source
GUI mode (with Dialog utility)
i. Download the Unified Wire driver package (tarball) from Chelsio Download Center.
ii. Untar the tarball using the following command:
[root@host~]# tar zxvf <driver_package>.tar.gz
iii. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
iv. Run the following script to start the GUI installer:
[root@host~]# ./install.py
v. If Dialog utility is present, you can skip to step (vi). If not, press ‘y’ to install it when the installer prompts for input.
vi. Select “install” under “Choose an action”
vii. Select Enable IPv6-Offload to install drivers with IPv6 Offload support or Disable IPv6-offload
to continue installation without IPv6 offload support.
viii. Select the required configuration tuning option:
ix. Under “Choose install components”, select “all” to install all the related components for the
option chosen in step (viii) or select “custom” to install specific components.
Important: To install Crypto Offload, WD-TOE, OVS, DPDK drivers and benchmark tools, please select “custom option”.
Note: The tuning options may vary depending on the Linux distribution.
x. Select the required performance tuning option.
a. Enable Binding IRQs to CPUs: Bind MSI-X interrupts to different CPUs and disable the IRQ balance daemon.
b. Retain IRQ balance daemon: Do not disable the IRQ balance daemon.
c. TX-Coalasce: Write tx_coal=2 to modprobe.d/conf.
xi. If you already have the required version of OFED software installed, select Skip-OFED. To
install OFED 4.8-2 choose the Install-OFED option.
xii. The selected components will now be installed:
Note: For more information on the performance tuning options in step (x), please refer to the Performance Tuning section of the Network (NIC/TOE) chapter.
Note: Step (xi) is prompted only on OFED-supported platforms.
xiii. After successful installation, a summary of the installed components will be displayed.
xiv. Select “View log” to view the installation log or “Exit” to continue.
xv. Select “Yes” to exit the installer or “No” to go back.
xvi. Reboot your machine for changes to take effect.
Note: Press Esc or Ctrl+C to exit the installer at any time.
CLI mode (without Dialog utility)
If your system does not have Dialog or you choose not to install it, follow the steps mentioned below to install the Unified Wire package:
i. Download the Unified Wire driver package from Chelsio Download Center.
ii. Untar the tarball using the following command:
[root@host~]# tar zxvf <driver_package>.tar.gz
iii. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
iv. Run the following script to start the installer:
[root@host~]# ./install.py -c <target>
v. Enter the number corresponding to the Configuration tuning option in the Input field and
press Enter.
vi. If you already have the required version of OFED software installed, select Skip-OFED. To
install OFED 4.8-2 choose the Install-OFED option.
vii. The selected components will now be installed.
After successful installation you can press 1 to view the installation log. Press any other key to exit from the installer.
viii. Reboot your machine for changes to take effect.
Important: To customize the installation, view the help by typing:
[root@host~]# ./install.py -h
Note: Step (vi) is prompted only on OFED-supported platforms.
iWARP driver installation on Cluster nodes
Important: Please make sure that you have enabled passwordless authentication with ssh on the peer nodes for this feature to work.
Chelsio’s Unified Wire package allows installing iWARP drivers on multiple Cluster nodes with a single command. Follow the procedure mentioned below:
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
ii. Create a file (machinefilename) containing the IP addresses or hostnames of the nodes in
the cluster. You can view the sample file, sample_machinefile, provided in the package to view the format in which the nodes have to be listed.
iii. Now, execute the following command:
[root@host~]# ./install.py -C -m <machinefilename>
iv. Select the required configuration tuning option. The tuning options may vary depending on
the Linux distribution.
v. Select the required Cluster Configuration.
vi. If you already have the required version of OFED software installed, select Skip-OFED. To
install OFED 4.8-2 choose the Install-OFED option.
vii. The selected components will now be installed.
The above commands will install iWARP (iw_cxgb4) and TOE (t4_tom) drivers on all the nodes listed in the machinefilename file.
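Since the cluster installation depends on passwordless ssh to every node, it may be worth probing the nodes before running the installer. The sketch below assumes the machine file lists one node per line (an assumption; consult sample_machinefile for the real format). unreachable_nodes is a hypothetical helper name.

```shell
# Print every node in the machine file for which the probe command fails.
# Usage: unreachable_nodes <machinefile> [probe-command...]
# Assumption: one hostname or IP address per line; blank lines are skipped.
# Default probe: non-interactive ssh, which fails if a password would be needed.
unreachable_nodes() {
    file=$1
    shift
    [ "$#" -gt 0 ] || set -- ssh -o BatchMode=yes -o ConnectTimeout=5
    while IFS= read -r node; do
        [ -n "$node" ] || continue
        # stdin is redirected so the probe cannot consume the node list
        "$@" "$node" true </dev/null >/dev/null 2>&1 || echo "$node"
    done < "$file"
}

# Example:
#   unreachable_nodes <machinefilename>
```

An empty result suggests every listed node accepted a key-based login; any printed node needs its ssh setup checked first.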
CLI mode
i. Download the Unified Wire driver package from Chelsio Download Center.
ii. Untar the tarball using the following command:
[root@host~]# tar zxvf ChelsioUwire-x.x.x.x.tar.gz
iii. Change your current working directory to Chelsio Unified Wire package directory and build
the source:
[root@host~]# cd ChelsioUwire-x.x.x.x
[root@host~]# make
iv. Install the drivers, tools and libraries using the following command:
[root@host~]# make install
v. The default configuration tuning option is Unified Wire. The configuration tuning can be
selected using the following commands:
[root@host~]# make CONF=<configuration_tuning>
[root@host~]# make CONF=<configuration_tuning> install
vi. Reboot your machine for changes to take effect.
Note: To view the different configuration tuning options, view help by typing:
[root@host~]# make help
Important: Steps (iii) and (iv) mentioned above will NOT install Crypto, DCB, WD-TOE, OVS, DPDK drivers, and benchmark tools. They will have to be installed manually. Please refer to section CLI mode (individual drivers) for instructions on installing them.
CLI mode (individual drivers)
You can also choose to install drivers/features individually. Provided here are steps to build and install some of them. For the complete list, view help by running make help.
Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
To build and install all drivers without IPv6 support:
[root@host~]# make ipv6_disable=1
[root@host~]# make ipv6_disable=1 install
The default configuration tuning option is Unified Wire. The configuration tuning can be
selected using the following commands:
[root@host~]# make CONF=<configuration_tuning> <Build Target>
[root@host~]# make CONF=<configuration_tuning> <Install Target>
To build and install drivers along with benchmarks:
[root@host~]# make BENCHMARKS=1
[root@host~]# make BENCHMARKS=1 install
The drivers will be installed as RPMs or Debian packages (for Ubuntu). To skip this and install drivers:
[root@host~]# make SKIP_RPM=1 install
The installer will remove the Chelsio specific drivers (inbox/outbox) from initramfs. To skip
this and install drivers:
[root@host~]# make SKIP_INIT=1 install
The installer will check for the required dependency packages and will install them if they
are missing from the machine. To skip this and install drivers:
[root@host~]# make SKIP_DEPS=1 install
Note: To view the different configuration tuning options, view the help by typing:
[root@host~]# make help
Note: If IPv6 is administratively disabled in the machine, the drivers will be built and installed without IPv6 Offload support by default.
3.6. Installing Chelsio Unified Wire from RPM
Note: IPv6 should be enabled in the machine to use the RPM Packages.
Note: Drivers installed from RPM Packages do not have DCB support.
Note: DPDK installation is not supported.
Menu Mode
i. Download the tarball specific to your operating system and architecture from Chelsio
Download Center.
ii. Untar the tarball:
E.g., for RHEL 6.9, untar using the following command:
[root@host~]# tar zxvf <driver_package>-RHEL6.9_x86_64.tar.gz
iii. Change your current working directory to Chelsio Unified Wire package directory
[root@host~]# cd ChelsioUwire-x.x.x.x-<OS>-<arch>
iv. Install Unified Wire:
[root@host~]# ./install.py
v. Select the Installation type as described below. Enter the corresponding number in the Input
field and press Enter.
a. Unified Wire: Install all the drivers in the Unified Wire software package. This option
will not install OFED and drivers built against OFED.
b. Wire Direct Latency: Install Wire Direct Latency drivers needed for Low latency
applications.
c. Custom: Customize the installation. Use this option to install drivers/software and
related components (like OFED 4.8-2) as per the tuning option selected.
d. EXIT: Exit the installer.
vi. The selected components will now be installed.
vii. Reboot your machine for changes to take effect.
Note: If the installation aborts with the message "Resolve the errors/dependencies manually and restart the installation", please go through the install.log to resolve errors/dependencies and then start the installation again.
Note: The Installation options may vary depending on the Configuration tuning option selected.
CLI mode
i. Download the tarball specific to your operating system and architecture from Chelsio
Download Center.
ii. Untar the tarball:
E.g., for RHEL 6.9, untar using the following command:
[root@host~]# tar zxvf ChelsioUwire-x.x.x.x-RHEL6.9_x86_64.tar.gz
iii. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x-<OS>-<arch>
iv. Install Unified Wire:
[root@host~]# ./install.py -i <nic_toe/all/udpso/wd/crypto/ovs>
Here,
nic_toe : NIC and TOE drivers only
all : All Chelsio drivers built against inbox OFED
udpso : UDP segmentation offload capable NIC and TOE drivers only
wd : Wire Direct drivers and libraries only
crypto : Crypto drivers and Chelsio Openssl modules.
ovs : OVS modules and NIC driver.
v. The default configuration tuning option is Unified Wire. The configuration tuning can be
selected using the following command:
[root@host~]# ./install.py -i <Installation mode> -c <configuration_tuning>
Note: To view the different configuration tuning options, view the help by typing:
[root@host~]# ./install.py -h
Note: The installation options may vary depending on the Linux distribution.
vi. To install OFED and Chelsio drivers built against OFED, run the above command with the -o option:
[root@host~]# ./install.py -i <Installation mode> -c <Configuration> -o
vii. Reboot your machine for changes to take effect.
iWARP driver installation on cluster nodes
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x-<OS>-<arch>
ii. Create a file (machinefilename) containing the IP addresses or hostnames of the nodes in the
cluster. You can view the sample file, sample_machinefile, provided in the package to view the format in which the nodes have to be listed.
iii. Now, execute the following command:
[root@host~]# ./install.py -C -m <machinefilename> -i <nic_toe/all/udpso/wd> -c <configuration_tuning> -o
Here, the -o parameter will install OFED and Chelsio drivers built against OFED.
The above command will install iWARP (iw_cxgb4) and TOE (t4_tom) drivers on all the nodes listed in the <machinefilename> file.
iv. Reboot your machine for changes to take effect.
Important: Please make sure that you have enabled passwordless authentication with ssh on the peer nodes for this feature to work.
3.7. Firmware Update
The firmware is installed on the system, typically in /lib/firmware/cxgb4, and the driver will auto-load the firmware if an update is required. The kernel must be configured to enable userspace firmware loading support:
Device Drivers -> Generic Driver Options -> Userspace firmware loading support
The firmware version can be verified using ethtool:
[root@host~]# ethtool -i <iface>
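When only the firmware version is needed, for example in a script that compares adapters across hosts, the relevant field can be filtered out of the ethtool output. This is a minimal sketch; fw_version is a hypothetical helper name, and it relies only on the "firmware-version:" field that ethtool -i prints.

```shell
# Extract the firmware-version field from "ethtool -i" output read on stdin.
# fw_version is a hypothetical helper, not part of the driver package.
fw_version() {
    sed -n 's/^firmware-version:[[:space:]]*//p'
}

# Example on a live system:
#   ethtool -i <iface> | fw_version
```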
3.8. Removing Drivers from initramfs
Chelsio drivers (cxgb4, cxgb4vf, iw_cxgb4, cxgb4i, csiostor, etc.) might exist in the initramfs/initrd image as inboxed or older versions of outbox drivers. It is highly recommended to remove them to ensure that the drivers installed using the current Unified Wire package are loaded at every boot.
Run the following commands:
[root@host~]# cd <source/rpm_package>
[root@host~]# ./scripts/fix_init.sh -r cxgb4,cxgb4vf,csiostor,iw_cxgb4,chcr,libcxgbi,cxgb4i,libcxgb,cxgbit -y
Configuring Chelsio Network Interfaces
To test the features of Chelsio adapters, two machines are required, each with a Chelsio network adapter installed. The two machines can be connected directly without a switch (back-to-back), or both connected to a switch. The interfaces have to be declared and configured. The configuration files for network interfaces on Red Hat Enterprise Linux (RHEL) distributions are kept under /etc/sysconfig/network-scripts.
4.1. Configuring Adapters
T6 Unified wire adapters support auto-negotiation (enabled by default) which allows link parameters like speed, duplex, FEC and Pause to be negotiated with the PEER.
Setting FEC
100G, 50G and 25G speeds support changing Forward Error Correction (FEC). The existing FEC settings can be viewed using:
[root@host~]# cxgbtool <ethX> fec
Below is a sample output on T6 100G port:
RS FEC is set by default for the T6 port at 100G speed. Configure the same FEC on the PEER for the link to come up. To set FEC:
[root@host~]# cxgbtool <ethX> fec <value>
Here, value can be:
rs: Reed-Solomon FEC
baser: Base-R/Reed-Solomon FEC
auto: Use standard FEC settings as specified by IEEE 802.3 interpretations of Cable Transceiver Module parameters.
off: Turn off FEC
Note: Some operating systems may attempt to auto-configure the detected hardware and some may not detect all ports on a multi-port adapter. If this happens, please refer to the operating system documentation for manually configuring the network device.
Note: For more information, refer to the cxgbtool man page by running man cxgbtool.
If auto-negotiation is disabled, ensure that the same FEC is set on both sides of the link, for the link to come up.
Setting Speed
T6 100G ports support multiple speeds viz. 100G, 50G, 40G, 25G, 10G and 1G. T6 25G ports support 25G, 10G and 1G speeds. The supported speeds can be seen using ethtool.
Below is a sample output for T6 100G port:
Optics
Optics do not support auto-negotiation. Use the following command to change the speed:
[root@host~]# ethtool -s <ethX> speed <speed> autoneg off
The speed, duplex and FEC (if applicable) should be manually set to the same values on the PEER for the link to come up.
Note: ethtool v4.8 or higher is required.
Important: Before setting 40G, 10G or 1G speeds, FEC should be disabled on the port using:
[root@host~]# cxgbtool <ethX> fec off
Example: To set 25G speed on a 100G port:
[root@host~]# ethtool -s <ethX> speed 25000 autoneg off
Twinax
Twinax cables support auto-negotiation. The following speeds can be set in the advertise field.
o Advertise only 100G
[root@host~]# ethtool --change <ethX> advertise 0x4000000000
o Advertise only 40G
[root@host~]# ethtool --change <ethX> advertise 0x2000000
o Advertise only 50G
[root@host~]# ethtool --change <ethX> advertise 0x400000000
o Advertise only 25G
[root@host~]# ethtool --change <ethX> advertise 0x80000000
o Advertise 100/50/40/25G
[root@host~]# ethtool --change <ethX> advertise 0x4482000000
o Auto-negotiation OFF
The advertise option is only supported with auto-negotiation enabled. If it is disabled, or to set 10G/1G speeds (which do not support auto-negotiation), use the following command:
[root@host~]# ethtool -s <ethX> speed <speed> autoneg off
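The advertise value for several speeds at once is the bitwise OR of the single-speed masks listed in this section. The following sketch shows the arithmetic in a POSIX shell; combine_masks is a hypothetical helper name used only for illustration.

```shell
# Combine ethtool "advertise" bit masks with a bitwise OR and print the result in hex.
# combine_masks is a hypothetical helper, not part of ethtool or the driver package.
combine_masks() {
    result=0
    for mask in "$@"; do
        result=$(( result | $mask ))
    done
    printf '0x%x\n' "$result"
}

# 100G | 50G | 40G | 25G, using the single-speed masks from this section
combine_masks 0x4000000000 0x400000000 0x2000000 0x80000000
```

The call prints 0x4482000000, which matches the value documented for advertising 100/50/40/25G together. Other combinations, such as 100G and 25G only, can be derived the same way.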
Setting Pause
Pause Autonegotiation is enabled by default. To override it and set Pause parameters, run:
[root@host ~]# ethtool -A <ethX> autoneg off tx on rx on
Spider and QSA Modes
T5 Adapters
Chelsio T5 40G adapters can be configured in the following 3 modes:
i. 2X40Gbps: This is the default mode of operation where each port functions as a 40Gbps link. The port nearest to the motherboard will appear as the first network interface (Port 0).
ii. 4X10Gbps (Spider): In this mode, port 0 functions as 4 10Gbps links and port 1 is disabled.
iii. QSA: This mode adds support for QSA (QSFP to SFP+) modules, enabling smooth, cost-effective connections between 40 Gigabit Ethernet adapters and 1 or 10 Gigabit Ethernet networks using existing SFP+ based cabling. The port farthest from the motherboard will appear as the first network interface (Port 0).
T6 Adapters
Chelsio T6 100G adapters can be configured in the following 2 modes:
i. 2X100Gbps: This is the default mode of operation where each port functions as a 100Gbps link. The port farthest from the motherboard will appear as the first network interface (Port 0).
ii. 2X25Gbps (Spider): In this mode, port 0 functions as 2 25Gbps links and port 1 is disabled.
To configure/change the mode of operation, use the following procedure:
i. Run the chelsio_adapter_config.py command to detect all Chelsio adapter(s) present in the system. Select the adapter to configure by specifying the adapter index.
ii. Select Change adapter mode
iii. Select the required mode.
Note: QSA modules will work in the default mode.
iv. Reload the network driver for changes to take effect.
[root@host~]# rmmod cxgb4
[root@host~]# modprobe cxgb4
4.2. Configuring network-scripts
A typical interface network-script (e.g., eth0) on RHEL 6.X looks like the following:
# file: /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
HWADDR=00:30:48:32:6A:AA
ONBOOT="yes"
NM_CONTROLLED="no"
BOOTPROTO="static"
IPADDR=10.192.167.111
NETMASK=255.255.240.0
In the case of DHCP addressing, the last two lines (IPADDR and NETMASK) should be removed and BOOTPROTO="static" should be changed to BOOTPROTO="dhcp".
The ifcfg-ethX files have to be created manually. They are required for bringing the interfaces up and down and for assigning the desired IP addresses.
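Because the ifcfg-ethX files are created by hand, a small helper can reduce typing errors. The sketch below writes the DHCP variant of the file described above; write_ifcfg_dhcp is a hypothetical helper name, and the optional directory argument exists only so the function can be tried outside /etc.

```shell
# Write a DHCP-based ifcfg file for one interface.
# Usage: write_ifcfg_dhcp <device> <mac-address> [directory]
# write_ifcfg_dhcp is a hypothetical helper, not part of the driver package.
write_ifcfg_dhcp() {
    dir=${3:-/etc/sysconfig/network-scripts}
    cat > "$dir/ifcfg-$1" <<EOF
DEVICE="$1"
HWADDR=$2
ONBOOT="yes"
NM_CONTROLLED="no"
BOOTPROTO="dhcp"
EOF
}

# Example (run as root):
#   write_ifcfg_dhcp eth1 <mac-address>
```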
4.3. Creating network-scripts
To spot the new interfaces, make sure the driver is unloaded first. At that point, ifconfig -a | grep HWaddr should display all non-Chelsio interfaces whose drivers are loaded, whether the interfaces are up or not.
[root@host~]# ifconfig -a | grep HWaddr
eth0 Link encap:Ethernet HWaddr 00:30:48:32:6A:AA
Note: On earlier versions of RHEL the NETMASK attribute is named IPMASK. Make sure you are using the right attribute name.
Note: When changing the adapter mode, if the default option is selected in step ii, reboot the machine for changes to take effect.
Then load the driver using the modprobe cxgb4 command (for the moment it does not make any difference whether we are using NIC-only or the TOE-enabling driver). The output of ifconfig should display the adapter interfaces as:
[root@host~]# ifconfig -a | grep HWaddr
eth0 Link encap:Ethernet HWaddr 00:30:48:32:6A:AA
eth1 Link encap:Ethernet HWaddr 00:07:43:04:6B:E9
eth2 Link encap:Ethernet HWaddr 00:07:43:04:6B:F1
For each interface you can write a configuration file in /etc/sysconfig/network-scripts. The ifcfg-eth1 could look like:
# file: /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE="eth1"
HWADDR=00:07:43:04:6B:E9
ONBOOT="no"
NM_CONTROLLED="no"
BOOTPROTO="static"
IPADDR=10.192.167.112
NETMASK=255.255.240.0
From now on, the eth1 interface of the adapter can be brought up and down through the ifup
eth1 and ifdown eth1 commands respectively. Note that it is of course not compulsory to create
a configuration file for every interface if you are not planning to use them all.
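Instead of comparing the two ifconfig listings by eye, the new interfaces can be picked out by MAC address prefix. The sketch below assumes the 00:07:43 prefix seen in the sample output above identifies the Chelsio ports, and that ifconfig prints the older "HWaddr" format shown in this section. ifaces_with_mac_prefix is a hypothetical helper name.

```shell
# Print the names of interfaces whose MAC address starts with the given prefix.
# Reads "ifconfig -a | grep HWaddr" style lines on stdin.
# ifaces_with_mac_prefix is a hypothetical helper, not part of the driver package.
ifaces_with_mac_prefix() {
    awk -v p="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == "HWaddr" && index(toupper($(i + 1)), toupper(p)) == 1)
                print $1
    }'
}

# Example on a live system:
#   ifconfig -a | grep HWaddr | ifaces_with_mac_prefix 00:07:43
```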
4.4. Checking Link
Once the network-scripts are created for the interfaces you should check the link i.e. make sure it is actually connected to the network. First, bring up the interface you want to test using ifup
eth1.
You should now be able to ping any other machine from your network provided it has ping response enabled.
Performance Tuning
The following section lists the steps to tune the system for optimal performance.
5.1. Generic
Install the adapter into a PCIe Gen3 x8/x16 slot. Ensure that T6 100G adapters are placed in
x16 slots and not in x8_in_x16 slots.
BIOS settings:
i. Disable virtualization, c-state technology, VT-d, Intel I/O AT and SR-IOV.
ii. Set the CPU Power setting to Performance.
Add intel_pstate=disable processor.max_cstate=1 intel_idle.max_cstate=0 to the kernel
command line to prevent the system from entering power-saving/idle states and avoid CPU
frequency changes.
Turn off irqbalance
[root@host~]# /etc/init.d/irqbalance stop
On RHEL7.X platforms, use the below command:
[root@host~]# systemctl stop irqbalance.service
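After rebooting with the modified kernel command line, it is easy to verify that the parameters from the generic settings actually took effect. A minimal sketch, assuming a POSIX shell; missing_cmdline_params is a hypothetical helper name.

```shell
# Print each parameter that is absent from a kernel command line string.
# Usage: missing_cmdline_params "<cmdline>" <param>...
# missing_cmdline_params is a hypothetical helper, not part of the driver package.
missing_cmdline_params() {
    cmdline=" $1 "
    shift
    for p in "$@"; do
        case $cmdline in
            *" $p "*) ;;      # parameter present as a whole word
            *) echo "$p" ;;   # parameter not found
        esac
    done
}

# Example on a live system:
#   missing_cmdline_params "$(cat /proc/cmdline)" \
#       intel_pstate=disable processor.max_cstate=1 intel_idle.max_cstate=0
```

No output means all listed parameters are present on the running kernel's command line.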
5.2. Throughput
In addition to the generic settings, set the below tuned-adm profile for RHEL platforms.
[root@host~]# tuned-adm profile network-throughput
5.3. Latency
In addition to the generic settings,
Disable Hyperthreading in BIOS.
Add idle=poll to the kernel command line.
Disable SELinux.
Set the below tuned-adm profile for RHEL7 platforms.
[root@host~]# tuned-adm profile network-latency
Disable a few services:
[root@host~]# t4_latencytune.sh <interface>
Set sysctl param net.ipv4.tcp_low_latency to 1
[root@host~]# sysctl -w net.ipv4.tcp_low_latency=1
To optimize your system for different protocols, please refer to their respective chapters.
Software/Driver Update
For any distribution-specific problems, please check the README and Release Notes included in the release for possible workarounds.
Please visit Chelsio Download Center for regular updates on various software/drivers. You can also subscribe to our newsletter for the latest software updates.
Software/Driver Uninstallation
Similar to installation, the Chelsio Unified Wire package can be uninstalled using two main methods: from the source and RPM, based on the method used for installation. If you decide to use source, you can uninstall the package using CLI or GUI mode.
7.1. Uninstalling Chelsio Unified Wire from source
GUI mode (with Dialog utility)
i. Change your current working directory to Chelsio Unified Wire package directory and run the
following script to start the GUI installer:
[root@host~]# ./install.py
ii. Select “uninstall” under “Choose an action”.
iii. Select “all” to uninstall all the installed drivers, libraries and tools or select “custom” to
remove specific components.
iv. The selected components will now be uninstalled.
v. After successful uninstallation, a summary of the uninstalled components will be displayed.
vi. Select “View log” to view uninstallation log or “Exit” to continue.
vii. Select “Yes” to exit the installer or “No” to go back.
CLI mode (without Dialog utility)
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
ii. Run the following script with the -u option to uninstall the Unified Wire Package:
[root@host~]# ./install.py -u <target>
iWARP driver uninstallation on Cluster nodes
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
ii. Uninstall iWARP drivers on multiple Cluster nodes using:
[root@host~]# ./install.py -C -m <machinefilename> -u all
The above command will remove Chelsio iWARP (iw_cxgb4) and TOE (t4_tom) drivers from all the nodes listed in the machinefilename file.
Note: Press Esc or Ctrl+C to exit the installer at any time.
Note: For more information, view help by typing:
[root@host~]# ./install.py -h
CLI mode
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
ii. Uninstall using the following command:
[root@host~]# make uninstall
CLI mode (individual drivers/software)
You can also choose to uninstall drivers/software individually. Provided here are steps to uninstall a few of them. For the complete list, view help by running make help.
Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
To uninstall NIC driver:
[root@host~]# make nic_uninstall
To uninstall offload driver:
[root@host~]# make toe_uninstall
To uninstall iWARP driver:
[root@host~]# make iwarp_uninstall
To uninstall UDP Segmentation Offload driver:
[root@host~]# make udp_offload_uninstall
7.2. Uninstalling Chelsio Unified Wire from RPM
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x-<OS>-<arch>
ii. Uninstall Unified Wire:
[root@host~]# ./uninstall.py <inbox/ofed>
inbox : for removing all Chelsio drivers.
ofed : for removing OFED and Chelsio drivers.
iWARP driver uninstallation on Cluster nodes
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x-<OS>-<arch>
ii. Uninstall iWARP drivers on multiple Cluster nodes using:
[root@host~]# ./install.py -C -m <machinefilename> -u
The above command will remove Chelsio iWARP (iw_cxgb4) and TOE (t4_tom) drivers from all the nodes listed in the machinefilename file.
Note: The uninstallation options may vary depending on the Linux distribution. For more information, view help by typing:
[root@host~]# ./uninstall.py -h
Chapter II. Network (NIC/TOE)
II. Network (NIC/TOE)
Introduction
Chelsio’s Unified Wire adapters provide extensive support for NIC operation, including all stateless offload mechanisms for both IPv4 and IPv6: IP, TCP and UDP checksum offload, Large Send Offload (LSO, also known as TCP Segmentation Offload or TSO), and assist mechanisms for accelerating Large Receive Offload (LRO).
A high performance fully offloaded and fully featured TCP/IP stack meets or exceeds software implementations in RFC compliance. Chelsio’s Terminator engine provides unparalleled performance through a specialized data flow processor implementation and a host of features designed for high throughput and low latency in demanding conditions and networking environments.
TCP offload is fully implemented in the hardware, thus freeing the CPU from TCP/IP overhead. The freed CPU can be used for any computing needs. The TCP offload in turn removes network bottlenecks and enables applications to take full advantage of the networking capabilities.
1.1. Hardware Requirements
Supported Adapters
The following are the currently shipping Chelsio adapters that are compatible with Chelsio Network driver:
T62100-CR
T62100-LP-CR
T62100-SO-CR*
T61100-OCP*
T6425-CR
T6225-CR
T6225-LL-CR
T6225-OCP^
T6225-SO-CR^
T580-CR
T580-LP-CR
T580-SO-CR*
T580-OCP-SO*
T540-CR
T540-LP-CR
T540-SO-CR*
T540-BT
T520-CR
T520-LL-CR
T520-SO-CR*
T520-OCP-SO*
T520-BT
T420-CR
T440-CR
T422-CR
T420-SO-CR*
T404-BT
T420-BCH
T440-LP-CR
T420-BT
T420-LL-CR
T420-CX
* Only NIC driver supported
^ Memory-free; 256 IPv4/128 IPv6 offload connections supported.
1.2. Software Requirements
Linux Requirements
Currently the Network driver is available for the following versions:
RHEL 7.5, 3.10.0-862.el7
RHEL 7.5, 3.10.0-862.el7.ppc64le (POWER8 LE)
RHEL 7.5, 4.14.0-49.el7a.aarch64 (ARM64)
RHEL 7.4, 3.10.0-693.el7
RHEL 7.4, 3.10.0-693.el7.ppc64le (POWER8 LE)
RHEL 7.3, 4.5.0-15.el7.aarch64 (ARM64)
RHEL 6.9, 2.6.32-696.el6
SLES 15, 4.12.14-23-default
SLES 12 SP3, 4.4.73-5-default
SLES 12 SP2, 4.4.21-69-default
Ubuntu 18.04.1, 4.15.0-29-generic
Ubuntu 16.04.4, 4.4.0-116-generic
Kernel.org linux-4.14.67
Kernel.org linux-4.9 (Minimum 4.9 kernel version supported is 4.9.13)
Other kernel versions have not been tested and are not guaranteed to work.
Software/Driver Installation
Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
To build and install NIC only driver (without offload support):
[root@host~]# make nic_install
To build and install drivers with offload support:
[root@host~]# make toe_install
Note: For more installation options, please run make help or install.py -h.
Software/Driver Loading
Important: Please ensure that all inbox drivers are unloaded before proceeding with unified wire drivers.
[root@host~]# rmmod csiostor cxgb4i cxgbit iw_cxgb4 chcr cxgb4vf cxgb4 libcxgbi libcxgb
The driver must be loaded by the root user. Any attempt to load the driver as a regular user will fail.
3.1. Loading in NIC mode (without full offload support)
To load the Network driver without full offload support, run the following command:
[root@host~]# modprobe cxgb4
3.2. Loading in TOE mode (with full offload support)
To enable full offload support, run the following command:
[root@host~]# modprobe t4_tom
In VMDirect Path environment, it is recommended to load the offload driver using the following command:
[root@host~]# modprobe t4_tom vmdirectio=1
Note: Offload support needs to be enabled upon each reboot of the system. This can be done manually as shown above.
Software/Driver Configuration
4.1. Enabling TCP Offload
Load the offload drivers and bring up the Chelsio interface.
[root@host~]# modprobe t4_tom
[root@host~]# ifconfig ethX <IP> up
All TCP traffic will be offloaded over the Chelsio interface now. To see the number of connections offloaded, run the following command:
[root@host~]# cat /sys/kernel/debug/cxgb4/<bus-id>/tids
Where,
TID is the number of offload connections.
STID is the number of offload servers.
T6 25G SO adapters support a limited number of offload connections (256 IPv4/128 IPv6).
Here is a sample output:
4.2. Enabling Busy waiting
Busy waiting/polling is a technique where a process repeatedly checks to see if an event has occurred, by spinning in a tight loop. By making use of similar technique, Linux kernel provides
the ability for the socket layer code to poll directly on an Ethernet device's Rx queue. This eliminates the cost of interrupts and context switching and, with proper tuning, makes it possible to achieve latency similar to that of the hardware.
Chelsio's NIC and TOE drivers support this feature and can be enabled on Chelsio supported devices to attain improved latency.
To make use of the BUSY_POLL feature, follow the steps mentioned below:
i. Enable BUSY_POLL support in the kernel config file by setting CONFIG_NET_RX_BUSY_POLL=y
ii. Enable BUSY_POLL globally in the system by setting the values of following sysctl
parameters depending on the number of connections:
sysctl -w net.core.busy_read=<value>
sysctl -w net.core.busy_poll=<value>
Set the values of the above parameters to 50 for 100 or less connections; and 100 for more than 100 connections.
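The rule above for choosing the sysctl value can be captured in a small helper so that scripts stay consistent with it. A minimal sketch; busy_poll_value is a hypothetical helper name, and the connection count passed to it is whatever the administrator expects for the workload.

```shell
# Suggested busy_read/busy_poll value for a given number of connections,
# following the rule stated above: 50 for 100 or fewer connections, otherwise 100.
# busy_poll_value is a hypothetical helper, not part of the driver package.
busy_poll_value() {
    if [ "$1" -le 100 ]; then
        echo 50
    else
        echo 100
    fi
}

# Example (run as root), for an expected 250 connections:
#   v=$(busy_poll_value 250)
#   sysctl -w net.core.busy_read="$v"
#   sysctl -w net.core.busy_poll="$v"
```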
4.3. Precision Time Protocol (PTP)
Precision Time Protocol (PTP) standard defines a protocol for precise synchronization of clock between master and slave devices in a local area network. It can provide timing accuracies in nanosecond units. The protocol is based on time stamping and measuring the send and receive times. Most of the implementation relies on time stamping of the packets in the software which reduces the accuracy of the time measured. One possible solution to this problem is time stamping the packet in the NIC hardware itself.
Chelsio’s Terminator hardware provides many features to support PTP implementations:
High precision timers which can be read through PIO registers.
Wall clock time based on the time of the day.
Time stamping of selected PTP packets on both ingress and egress direction.
Synchronizing Clocks
ptp4l tool (installed during Unified Wire installation) is used to synchronise clocks:
BUSY_POLL can also be enabled on a per-connection basis by making use of
SO_BUSY_POLL option in the socket application code. Refer socket man-page for more details.
Note
This feature is currently supported on RHEL 7.5, RHEL 7.4, SLES 12 SP3 and kernels 4.14 and 4.9.105.
Important
i. Load the network driver on all master and slave nodes:
[root@host~]# modprobe cxgb4
ii. Assign IP addresses and ensure that master and slave nodes are connected.
iii. Start the ptp4l tool on master using the Chelsio interface:
[root@host~]# ptp4l -i <interface> -H -m
iv. Start the ptp4l tool on slave nodes:
[root@host~]# ptp4l -i <interface> -H -m -s
Note: To view the complete list of available options, refer to the ptp4l help manual.
v. Synchronize the system clock to a PTP hardware clock (PHC) on slave nodes:
[root@host~]# phc2sys -s <interface> -c CLOCK_REALTIME -w -m
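The PTP description above says the protocol works by time stamping and measuring send and receive times. The following is a minimal sketch of the standard delay request-response arithmetic, assuming a symmetric path delay. The class name, function name and timestamp values are hypothetical illustrations and are not taken from ptp4l or the Chelsio driver.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PtpExchange:
    """Four timestamps (nanoseconds) from one Sync / Delay_Req exchange."""
    t1: int  # master sends Sync (master clock)
    t2: int  # slave receives Sync (slave clock)
    t3: int  # slave sends Delay_Req (slave clock)
    t4: int  # master receives Delay_Req (master clock)


def offset_and_delay(x: PtpExchange) -> Tuple[float, float]:
    """Return (slave offset from master, one-way path delay) in nanoseconds.

    Assumes the path delay is the same in both directions.
    """
    master_to_slave = x.t2 - x.t1  # equals delay + offset
    slave_to_master = x.t4 - x.t3  # equals delay - offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


if __name__ == "__main__":
    # Hypothetical values: slave clock 150 ns ahead, one-way delay 400 ns.
    sample = PtpExchange(t1=1_000, t2=1_550, t3=2_000, t4=2_250)
    print(offset_and_delay(sample))
```

Any error in the four timestamps feeds directly into the computed offset, which is why taking them in NIC hardware rather than in software improves accuracy.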
4.4. VXLAN Offload
Virtual Extensible LAN (VXLAN) is a network virtualization technique that uses an overlay encapsulation protocol to provide Ethernet Layer 2 network services with extended scalability and flexibility. VXLAN extends the virtual LAN (VLAN) address space by adding a 24-bit segment ID, increasing the number of available logical networks from 4096 to 16 million and thereby addressing the scalability and network segmentation issues associated with large cloud computing deployments. Chelsio’s Terminator-based adapters are uniquely capable of offloading the processing of VXLAN encapsulated frames such that all stateless offloads (checksums and TSO) are preserved, resulting in significant performance benefits. This is enabled by default on loading the driver.
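The figures quoted above can be checked with a short calculation. This sketch derives the logical network counts from the ID widths and estimates the underlay MTU needed to carry a full-size inner frame. The header sizes are the standard protocol values and do not come from this manual, the function names are hypothetical, and an inner VLAN tag (4 more bytes) is ignored.

```python
VNI_BITS = 24      # VXLAN segment ID width, from the text above
VLAN_ID_BITS = 12  # 802.1Q VLAN ID width, which gives the 4096 figure

# Header sizes in bytes (standard values, assumed here).
OUTER_IPV4 = 20
OUTER_IPV6 = 40
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14


def segment_count(bits: int) -> int:
    """Number of distinct IDs that fit in the given number of bits."""
    return 2 ** bits


def required_underlay_mtu(inner_mtu: int, outer_ipv6: bool = False) -> int:
    """Smallest underlay IP MTU that carries an inner packet of inner_mtu bytes."""
    outer_ip = OUTER_IPV6 if outer_ipv6 else OUTER_IPV4
    return inner_mtu + INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + outer_ip


if __name__ == "__main__":
    print(segment_count(VLAN_ID_BITS), segment_count(VNI_BITS))
    print(required_underlay_mtu(1500), required_underlay_mtu(1500, outer_ipv6=True))
```

Under these assumptions a 1500-byte inner MTU needs an underlay MTU of 1550 (IPv4) or 1570 (IPv6), so an MTU of 1600 on the Chelsio interface leaves headroom in both cases.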
Host Configuration
i. Load the network driver.
[root@host~]# modprobe cxgb4
ii. Configure a larger MTU on the Chelsio interface to accommodate the larger frame size due to VXLAN encapsulation. Assign an IP address and bring the interface up.
[root@host~]# ifconfig <interface> <IP address> mtu 1600 up
iii. Create the VXLAN interface, assign the VNI, multicast group and port number, and bring it up.
[root@host~]# modprobe vxlan
[root@host~]# ip link add <vxlan_interface> type vxlan id <vni> group 239.1.1.1 dev <interface> dstport 4789
[root@host~]# ifconfig <vxlan_interface> up
iv. Create the bridge interface and bring it up.
[root@host~]# brctl addbr <bridge_interface>
[root@host~]# ifconfig <bridge_interface> up
v. Add the VXLAN interface to the bridge interface.
[root@host~]# brctl addif <bridge_interface> <vxlan_interface>
vi. Tx UDP Tunnel Segmentation Offload is enabled by default when the network driver is loaded. To see the current settings:
[root@host~]# ethtool -k <interface>
...
tx-udp_tnl-segmentation: on
vii. For better performance, apply the NIC settings described in the Performance Tuning section.
Guest (VM) Configuration
i. Open the Virtual Machine Manager.
[root@host~]# virt-manager
ii. Add a Virtual Network Interface to the VM, specifying the bridge name configured in step iv of the Host Configuration section and virtio as the Device Model.
iii. Bring up the Virtual Network interface with the required IP address.
[root@host~]# ifconfig <virtual-interface> <IP address> up
Important: VXLAN offload is currently supported only on kernels 4.9.105 and 4.14.
For better performance, the following settings are recommended:
i. Increase the number of queues for the Virtual network interface to 8.
[root@host~]# virsh edit <VM>
<interface type='bridge'>
<mac address='52:54:00:34:8a:4a'/>
<source bridge='br0'/>
<model type='virtio'/>
<driver name='vhost' queues='8'/>
</interface>
ii. Map the Virtual CPUs of the VM to physical CPUs that are otherwise free.
Example: On a machine with 16 cores, the VM's Virtual CPUs were pinned to physical cores 8-15, leaving cores 0-7 for the host.
[root@host~]# virsh edit <VM>
<vcpu placement='static' cpuset='8-15'>8</vcpu>
iii. Restart the libvirtd services and Virtual Machine Manager.
[root@host~]# systemctl restart libvirtd.service
[root@host~]# systemctl restart libvirt-guests.service
[root@host~]# virt-manager
iv. Bind the Virtual Network Interface Queues to different CPUs.
v. Increase the TCP buffers by configuring the sysctl variables mentioned in the NIC settings of the Performance Tuning section.
4.5. Performance Tuning
Apply the performance settings mentioned in the Performance Tuning section in the Unified Wire chapter before proceeding.
TOE
i. Run the performance tuning script to map TOE queues to different CPUs:
[root@host~]# t4_perftune.sh -n -Q ofld
ii. Set the following sysctl parameters:
[root@host~]# sysctl -w toe.toe0_tom.delayed_ack=3
[root@host~]# sysctl -w net.ipv4.tcp_timestamps=0
[root@host~]# sysctl -w net.core.netdev_max_backlog=250000
[root@host~]# sysctl -w net.core.rmem_max=4194304
[root@host~]# sysctl -w net.core.wmem_max=4194304
[root@host~]# sysctl -w net.core.rmem_default=4194304
[root@host~]# sysctl -w net.core.wmem_default=4194304
[root@host~]# sysctl -w net.ipv4.tcp_rmem="4096 1048576 4194304"
[root@host~]# sysctl -w net.ipv4.tcp_wmem="4096 1048576 4194304"
For 100G performance, disable Nagle using the following steps:
a. Create a COP policy:
[root@host~]# cat <policy_file>
all=>offload !nagle
b. Compile the policy:
[root@host~]# cop -d -o <policy_out> <policy_file>
c. Apply the policy:
[root@host~]# cxgbtool ethX policy <policy_out>
NIC
i. Run the performance tuning script to map NIC queues to different CPUs:
[root@host~]# t4_perftune.sh -n -Q nic
ii. Enable adaptive-rx:
[root@host~]# ethtool -C enp2s0f4 adaptive-rx on
iii. Set the following sysctl parameters:
[root@host~]# sysctl -w net.ipv4.tcp_timestamps=0
[root@host~]# sysctl -w net.core.netdev_max_backlog=250000
[root@host~]# sysctl -w net.core.rmem_max=4194304
[root@host~]# sysctl -w net.core.wmem_max=4194304
[root@host~]# sysctl -w net.core.rmem_default=4194304
[root@host~]# sysctl -w net.core.wmem_default=4194304
[root@host~]# sysctl -w net.ipv4.tcp_rmem="4096 1048576 4194304"
[root@host~]# sysctl -w net.ipv4.tcp_wmem="4096 1048576 4194304"
NIC/TOE Latency
Enable the BUSY_POLL feature:
[root@host~]# sysctl -w net.core.busy_poll=50
[root@host~]# sysctl -w net.core.busy_read=50
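The value to use follows the sizing rule given in section 4.2 (50 for 100 or fewer connections, 100 otherwise). The sketch below encodes that rule and shows how an application might set SO_BUSY_POLL on a single socket instead of using the global sysctls. The helper names are hypothetical, and the numeric fallback for SO_BUSY_POLL is an assumption for Python builds that do not export the constant.

```python
import socket

# Linux option number for SO_BUSY_POLL, used only as a fallback when the
# socket module does not provide the constant (assumption).
SO_BUSY_POLL = getattr(socket, "SO_BUSY_POLL", 46)


def busy_poll_usecs(connections: int) -> int:
    """Value suggested in section 4.2 for the given connection count."""
    if connections < 0:
        raise ValueError("connections must be non-negative")
    return 50 if connections <= 100 else 100


def sysctl_commands(connections: int) -> list:
    """Global sysctl commands for the given connection count."""
    value = busy_poll_usecs(connections)
    return [
        f"sysctl -w net.core.busy_read={value}",
        f"sysctl -w net.core.busy_poll={value}",
    ]


def enable_busy_poll(sock: socket.socket, usecs: int) -> None:
    """Per-socket alternative to the global sysctls (Linux only).

    Raising the value may require the CAP_NET_ADMIN capability.
    """
    sock.setsockopt(socket.SOL_SOCKET, SO_BUSY_POLL, usecs)


if __name__ == "__main__":
    for command in sysctl_commands(connections=250):
        print(command)
```

Note that sysctl expects name=value with no spaces around the equals sign.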
Receiver Side Scaling (RSS)
Receiver Side Scaling enables the receiving network traffic to scale with the available number of processors on a modern networked computer. RSS enables parallel receive processing and dynamically balances the load among multiple processors. Chelsio’s network controller fully supports Receiver Side Scaling for IPv4 and IPv6.
The t4_perftune.sh script first determines the number of CPUs on the system. Each receive queue is then bound to an entry in the system interrupt table and assigned to a specific CPU, so that each receive queue interrupts a specific CPU through a specific interrupt. For example, on a 4-core system, t4_perftune.sh gives the following output:
[root@host~]# t4_perftune.sh
Discovering Chelsio T4/T5 devices ...
Configuring Chelsio T4/T5 devices ...
Tuning eth7
IRQ table length 4
Writing 1 in /proc/irq/62/smp_affinity
Writing 2 in /proc/irq/63/smp_affinity
Writing 4 in /proc/irq/64/smp_affinity
Writing 8 in /proc/irq/65/smp_affinity
eth7 now up and tuned ...
Because there are 4 CPUs on the system, 4 interrupt entries are assigned. For other network interfaces, you should see similar output.
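The values 1, 2, 4 and 8 in the output above are hexadecimal CPU bitmasks with a single bit set. The following sketch shows one way such a mapping could be computed. It is not the logic of t4_perftune.sh, the function name is hypothetical, and the round-robin assignment is an assumption. As written, the mask format is only valid for CPUs 0-31.

```python
def smp_affinity_plan(irqs, cpu_count):
    """Map each IRQ to a one-CPU hexadecimal mask, round-robin over CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    plan = {}
    for index, irq in enumerate(irqs):
        cpu = index % cpu_count
        # Bit N set in the mask means "deliver this IRQ to CPU N".
        plan[f"/proc/irq/{irq}/smp_affinity"] = format(1 << cpu, "x")
    return plan


if __name__ == "__main__":
    for path, mask in smp_affinity_plan([62, 63, 64, 65], cpu_count=4).items():
        print(f"Writing {mask} in {path}")
```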
Now the receiving traffic is dynamically assigned to one of the system’s CPUs through a Terminator queue. This achieves a balanced usage among all the processors. This can be verified, for example, by using the iperf tool. First set up a server on the receiver host:
[root@receiver_host~]# iperf -s
Then on the sender host, send data to the server using the iperf client mode. To emulate a moderate traffic workload, use -P option to request 20 TCP streams from the server:
[root@sender_host~]# iperf -c receiver_host_name_or_IP -P 20
Then on the receiver host, look at interrupt rate at /proc/interrupts:
[root@receiver_host~]# cat /proc/interrupts | grep eth6
Id CPU0 CPU1 CPU2 CPU3 type interface
36: 115229 0 0 1 PCI-MSI-edge eth6 (queue 0)
37: 0 121083 1 0 PCI-MSI-edge eth6 (queue 1)
38: 0 0 105423 1 PCI-MSI-edge eth6 (queue 2)
39: 0 0 0 115724 PCI-MSI-edge eth6 (queue 3)
Now interrupts from eth6 are evenly distributed among the 4 CPUs.
Without Terminator’s RSS support, the interrupts caused by network traffic may be distributed unevenly over CPUs. For comparison, the traffic produced by the same iperf commands gives the following output in /proc/interrupts:
[root@receiver_host~]# cat /proc/interrupts | grep eth6
Id CPU0 CPU1 CPU2 CPU3 type interface
36: 0 9 0 17418 PCI-MSI-edge eth6 (queue 0)
37: 0 0 21718 2063 PCI-MSI-edge eth6 (queue 1)
38: 0 7 391519 222 PCI-MSI-edge eth6 (queue 2)
39: 1 0 33 17798 PCI-MSI-edge eth6 (queue 3)
Here there are 4 receiving queues from the eth6 interface, but they are not bound to a specific CPU or interrupt entry. Queue 2 has caused a very large number of interrupts on CPU2 while CPU0 and CPU1 are barely used by any of the four queues. Enabling RSS is thus essential for best performance.
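The difference between the two /proc/interrupts listings can be quantified by summing the counters per CPU. This is a small sketch using the two samples from this section. The function names are hypothetical, and the parser assumes the simplified column layout shown here.

```python
BALANCED = """\
36: 115229 0 0 1 PCI-MSI-edge eth6 (queue 0)
37: 0 121083 1 0 PCI-MSI-edge eth6 (queue 1)
38: 0 0 105423 1 PCI-MSI-edge eth6 (queue 2)
39: 0 0 0 115724 PCI-MSI-edge eth6 (queue 3)
"""

UNBALANCED = """\
36: 0 9 0 17418 PCI-MSI-edge eth6 (queue 0)
37: 0 0 21718 2063 PCI-MSI-edge eth6 (queue 1)
38: 0 7 391519 222 PCI-MSI-edge eth6 (queue 2)
39: 1 0 33 17798 PCI-MSI-edge eth6 (queue 3)
"""


def per_cpu_totals(text, cpu_count):
    """Sum the interrupt counters of every IRQ line, per CPU column."""
    totals = [0] * cpu_count
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue  # skip header or blank lines
        counts = fields[1:1 + cpu_count]
        if len(counts) < cpu_count or not all(c.isdigit() for c in counts):
            continue
        for cpu, count in enumerate(counts):
            totals[cpu] += int(count)
    return totals


def busiest_share(totals):
    """Fraction of all interrupts handled by the busiest CPU."""
    total = sum(totals)
    return max(totals) / total if total else 0.0


if __name__ == "__main__":
    for name, sample in (("with RSS", BALANCED), ("without RSS", UNBALANCED)):
        totals = per_cpu_totals(sample, cpu_count=4)
        print(name, totals, round(busiest_share(totals), 2))
```

With four CPUs, a share close to 0.25 for the busiest CPU indicates an even spread.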
Note: Linux’s irqbalance may take charge of distributing interrupts among CPUs on a multiprocessor platform. However, irqbalance distributes interrupt requests from all hardware devices across processors. For a server with a Chelsio network card constantly receiving large volumes of data at 40/10Gbps, the network interrupt demand is significantly high. Under such circumstances, it is necessary to enable RSS to balance the network load across multiple processors and achieve the best performance.
Interrupt Coalescing
The idea behind Interrupt Coalescing (IC) is to avoid flooding the host CPUs with too many interrupts. Instead of raising one interrupt per incoming packet, IC waits until ‘n’ packets are available in the Rx queues and have been placed into host memory through DMA operations before raising an interrupt. This reduces the CPU load and improves throughput under heavy traffic; larger values of n can add per-packet latency. The value can be changed using the following command:
[root@host~]# ethtool -C ethX rx-frames n
Note: For more information, run the following command:
[root@host~]# ethtool -h
Large Receive Offload / Generic Receive Offload
Large Receive Offload or Generic Receive Offload is a performance improvement feature on the receiving side. LRO/GRO aggregates received packets that belong to the same stream and combines them into a larger packet before pushing them to the receiving host's network stack. By doing this, rather than processing every small packet, the receiver CPU works on fewer packet headers but the same amount of data. This helps reduce the receiving host's CPU load and improves throughput in a 40/10Gb network environment where the CPU can be the bottleneck.
LRO and GRO are different names for the same receive packet aggregation feature, but they differ in how the feature is implemented in the Linux kernel. The feature was first added to the Linux kernel in version 2.6.24 and named Large Receive Offload (LRO). However, LRO only works for TCP over IPv4. From kernel 2.6.29 onwards, a new protocol-independent implementation that removes this limitation was added to Linux and named Generic Receive Offload (GRO). The old LRO code is still available in the kernel sources, but whenever both GRO and LRO are present, GRO is the preferred one to use.
Please note that if your Linux system has IP forwarding enabled, i.e. it is acting as a bridge or router, LRO needs to be disabled. This is due to a known kernel issue.
Chelsio’s card supports both hardware assisted GRO/LRO and Linux-based GRO/LRO. t4_tom is the kernel module that enables the hardware assisted GRO/LRO. If it is not already in the kernel module list, use the following command to insert it:
[root@host~]# lsmod | grep t4_tom
[root@host~]# modprobe t4_tom
[root@host~]# lsmod | grep t4_tom
t4_tom 88378 0 [permanent]
toecore 21618 1 t4_tom
cxgb4 225342 1 t4_tom
Terminator’s hardware GRO/LRO implementation is then enabled.
If you would like to use the Linux GRO/LRO for any reason, the t4_tom kernel module first needs to be removed from the kernel module list. Please note that you might need to reboot your system. After removing the t4_tom module, you can use ethtool to check the status of the current GRO/LRO settings, for example:
[root@host~]# ethtool -k eth6
Offload parameters for eth6:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
Now the generic-receive-offload option is on. This means GRO is enabled. Please note that there are two offload options here: generic-receive-offload and large-receive-offload. This is because on this Linux system (RHEL6.0), the kernel supports both GRO and LRO. As mentioned earlier, GRO is always the preferred option when both of them are present. On other systems LRO might be the only available option. Then ethtool could be used to switch LRO on and off as well.
When Linux’s GRO is enabled, Chelsio’s driver provides two GRO-related statistics. They are displayed using the following command:
[root@host~]# ethtool -S eth6
...
GROPackets : 0
GROMerged : 897723
...
GROPackets is the number of held packets. These are candidate packets held by the kernel to be processed individually or to be merged into larger packets. This number is usually zero.
GROMerged is the number of packets that were merged into larger packets. This number usually increases if there is any continuous traffic stream present.
ethtool can also be used to switch off the GRO/LRO options when necessary:
[root@host~]# ethtool -K eth6 gro off
[root@host~]# ethtool -k eth6
Offload parameters for eth6:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
The output above shows GRO disabled.
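When checking these settings from a script, the ethtool -k listing can be turned into a dictionary. The following is a minimal sketch that assumes one "name: on|off" pair per line, as in the samples in this section. The function name is hypothetical, and real ethtool output may contain additional fields.

```python
SAMPLE = """\
Offload parameters for eth6:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: on
udp-fragmentation-offload: off
generic-segmentation-offload: on
generic-receive-offload: off
large-receive-offload: off
"""


def parse_offloads(text):
    """Return {feature name: True/False} for every 'name: on|off' line."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Only the first word is used, so a trailing marker such as
        # "[fixed]" after on/off is ignored.
        if sep and words and words[0] in ("on", "off"):
            settings[name.strip()] = words[0] == "on"
    return settings


if __name__ == "__main__":
    offloads = parse_offloads(SAMPLE)
    print("GRO enabled:", offloads["generic-receive-offload"])
    print("LRO enabled:", offloads["large-receive-offload"])
```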
Software/Driver Unloading
5.1. Unloading the NIC Driver
To unload the NIC driver, run the following command:
[root@host~]# rmmod cxgb4
5.2. Unloading the TOE Driver
A reboot is required to unload the TOE driver. To avoid rebooting, follow the steps mentioned below:
i. Load t4_tom driver with unsupported_allow_unload parameter.
[root@host~]# modprobe t4_tom unsupported_allow_unload=1
ii. Stop all the offloaded traffic, servers and connections. Check for the reference count.
[root@host~]# cat /sys/module/t4_tom/refcnt
If the reference count is 0, the driver can be unloaded directly; skip to step (iii). If the count is non-zero, load a COP policy which disables offload using the following procedure:
a. Create a policy file which will disable offload:
[root@host~]# cat policy_file
all => !offload
b. Compile and apply the output policy file:
[root@host~]# cop -o no-offload.cop policy_file
[root@host~]# cxgbtool ethX policy no-offload.cop
iii. Unload the driver:
[root@host~]# rmmod t4_tom
[root@host~]# rmmod toecore
[root@host~]# rmmod cxgb4
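The decision in step ii can be sketched as a small planning helper that only builds the command list and does not execute anything. The function names are hypothetical, the commands are the ones shown above, and policy_file is assumed to already contain the "all => !offload" rule. After applying the policy it is prudent to re-check the reference count before running rmmod.

```python
REFCNT_PATH = "/sys/module/t4_tom/refcnt"


def read_refcnt(path=REFCNT_PATH):
    """Read the module reference count (requires t4_tom to be loaded)."""
    with open(path) as handle:
        return int(handle.read().strip())


def toe_unload_plan(refcnt, interface="ethX"):
    """Return the commands to run, in order, for the given reference count."""
    if refcnt < 0:
        raise ValueError("refcnt must be non-negative")
    commands = []
    if refcnt != 0:
        # Offloaded connections still exist: disable offload via a COP policy.
        commands.append("cop -o no-offload.cop policy_file")
        commands.append(f"cxgbtool {interface} policy no-offload.cop")
    commands += ["rmmod t4_tom", "rmmod toecore", "rmmod cxgb4"]
    return commands


if __name__ == "__main__":
    for command in toe_unload_plan(refcnt=3, interface="eth6"):
        print(command)
```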
III. Virtual Function Network (vNIC)
Introduction
The ever-increasing network infrastructure of IT enterprises has led to a phenomenal increase in maintenance and operational costs. IT managers are forced to acquire more physical servers and other data center resources to satisfy storage and network demands. To reduce the network and I/O overhead, users are opting for server virtualization, which consolidates I/O workloads onto fewer physical servers, resulting in efficient, dynamic and economical data center environments. Other benefits of virtualization include improved disaster recovery, server portability, cloud computing, Virtual Desktop Infrastructure (VDI), etc.
Chelsio’s Unified Wire family of adapters delivers increased bandwidth, lower latency and lower power with virtualization features to maximize cloud scaling and utilization. The adapters also provide full support for PCI-SIG SR-IOV to improve I/O performance on a virtualized system. Users can configure up to 64 Virtual and 8 Physical functions (with 4 PFs as SR-IOV capable) along with 336 virtual MAC addresses.
1.1. Hardware Requirements
Supported adapters
The following are the currently shipping Chelsio adapters that are compatible with the Chelsio vNIC driver:
T62100-CR
T62100-LP-CR
T62100-SO-CR
T6425-CR
T6225-CR
T6225-LL-CR
T6225-OCP
T6225-SO-CR
T580-CR
T580-LP-CR
T540-CR
T540-BT
T520-CR
T520-LL-CR
T520-BT
T420-CR
T440-CR
T422-CR
T420-SO-CR
T404-BT
T440-LP-CR
T420-BT
T420-LL-CR
T420-CX
1.2. Software Requirements
Linux Requirements
Currently the vNIC driver is available for the following versions:
RHEL 7.5, 3.10.0-862.el7
RHEL 7.4, 3.10.0-693.el7
RHEL 6.9, 2.6.32-696.el6
SLES 15, 4.12.14-23-default
SLES 12 SP3, 4.4.73-5-default
SLES 12 SP2, 4.4.21-69-default
Ubuntu 18.04.1, 4.15.0-29-generic
Ubuntu 16.04.4, 4.4.0-116-generic
Kernel.org linux-4.14.67
Kernel.org linux-4.9 (Minimum 4.9 kernel version supported is 4.9.13)
Other kernel versions have not been tested and are not guaranteed to work.
Software/Driver Installation
The Virtual Function implementation for Chelsio adapters comprises two modules:
Standard NIC driver module, cxgb4, which runs on the base Hypervisor and is responsible for instantiation and management of the PCIe Virtual Functions (VFs) on the adapter.
VF NIC driver module, cxgb4vf, which runs on the Virtual Machine (VM) guest OS using VFs "attached" via Hypervisor VM initiation commands.
2.1. Pre-requisites
Please make sure that the following requirements are met before installation:
PCI Express Slot should be ARI capable.
SR-IOV should be enabled in the machine.
Intel Virtualization Technology for Directed I/O (VT-d) should be enabled in the BIOS.
Add intel_iommu=on to the kernel command line in grub/grub2 menu, to use VFs in VMs.
2.2. Installation
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
ii. On the host, install network driver:
[root@host~]# make nic_install
iii. On the guest (VM), install vNIC driver:
[root@host~]# make vnic_install
Note: For more installation options, please run make help or install.py -h
Software/Driver Loading
3.1. Instantiate Virtual Functions (SR-IOV)
To instantiate Virtual Functions (VFs) on the host, run the following commands:
[root@host~]# modprobe cxgb4
[root@host~]# echo n > /sys/class/net/ethX/device/driver/<bus_id>/sriov_numvfs
Here, ethX is the interface and n specifies the number of VFs to be instantiated per physical function (bus_id). VFs can be instantiated only from PFs 0 - 3 of the Chelsio adapter. A maximum of 64 virtual functions can be instantiated with 16 virtual functions per physical function.
Example: Instantiating 16 VFs on PF3 of Chelsio adapter.
Unload the vNIC driver on the host (if loaded):
[root@host~]# rmmod cxgb4vf
The virtual functions can now be assigned to virtual machines (guests).
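The limits described in this section (VFs only on PFs 0 - 3, at most 16 VFs per PF by default) can be checked before writing to sysfs. The sketch below only builds and validates the command string for the "16 VFs on PF3" case. The function names are hypothetical, and the interface name and PCI bus ID used in the example are illustrative values.

```python
SRIOV_CAPABLE_PFS = range(0, 4)  # PFs 0 - 3, per the text above
DEFAULT_VFS_PER_PF = 16


def sriov_numvfs_path(interface, bus_id):
    """sysfs file that controls the number of VFs for one physical function."""
    return f"/sys/class/net/{interface}/device/driver/{bus_id}/sriov_numvfs"


def check_vf_request(pf, vf_count, per_pf_limit=DEFAULT_VFS_PER_PF):
    """Validate a request against the limits stated in this section."""
    if pf not in SRIOV_CAPABLE_PFS:
        raise ValueError(f"PF{pf} cannot instantiate VFs (only PFs 0 - 3)")
    if not 0 <= vf_count <= per_pf_limit:
        raise ValueError(f"VF count must be between 0 and {per_pf_limit}")
    return vf_count


def instantiate_command(interface, bus_id, pf, vf_count):
    """Shell command that instantiates vf_count VFs on the given PF."""
    check_vf_request(pf, vf_count)
    return f"echo {vf_count} > {sriov_numvfs_path(interface, bus_id)}"


if __name__ == "__main__":
    # Hypothetical interface name and bus ID for PF3 (function number .3).
    print(instantiate_command("eth4", "0000:01:00.3", pf=3, vf_count=16))
```

With 16 VFs on each of the four SR-IOV capable PFs, the total matches the maximum of 64 VFs stated above.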
Note: To get familiar with physical and virtual function terminology, please refer to the PCI Express specification.
3.2. Loading the Driver
Important: Please ensure that all inbox drivers are unloaded before proceeding with the Unified Wire drivers:
[root@host~]# rmmod csiostor cxgb4i cxgbit iw_cxgb4 chcr cxgb4vf cxgb4 libcxgbi libcxgb
The vNIC driver must be loaded on the Guest OS by the root user. Any attempt to load the driver as a regular user will fail.
To load the driver, run the following command:
[root@host~]# modprobe cxgb4vf
Software/Driver Configuration and Fine-tuning
4.1. VF Rate Limiting
This section describes the method to rate-limit traffic passing through virtual functions (VFs).
i. The VF rate limit needs to be set on the Host (hypervisor). Apply rate-limiting using:
[root@host~]# ip link set dev mgmtpfXX vf <vf_number> rate <rate_in_mbps>
Here,
mgmtpfXX is the management interface to be used. For each PF on which VFs are instantiated, one management interface is created (listed by ifconfig -a).
vf_number is the VF on which rate-limiting is applied. Valid values are 0-15.
ii. Run traffic over the VF; the throughput should be rate-limited as per the values set in the previous step.
Example:
i. 4 VFs are instantiated on PF0.
[root@host~]# modprobe cxgb4
[root@host~]# echo 4 > /sys/class/net/ethX/device/driver/<bus_id>/sriov_numvfs
ii. 2 VMs are configured with 2 VFs each. 2 different networks are configured with the following IP configuration:
VM0: VF0 (102.1.1.2/24), VF1 (102.2.2.2/24)
VM1: VF2 (102.1.1.3/24), VF3 (102.2.2.3/24)
iii. VF rate-limiting is configured on the host:
[root@host~]# ip link set dev mgmtpf10 vf 0 rate 2000
[root@host~]# ip link set dev mgmtpf10 vf 1 rate 3000
The traffic on 102.1.1.X network will be rate-limited to 2Gbps whereas traffic on 102.2.2.X network will be rate-limited to 3Gbps.
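The rate argument is given in Mbps, so the 2 Gbps and 3 Gbps limits in the example become 2000 and 3000. This is a minimal sketch of a helper that performs the conversion and builds the command. The function name is hypothetical, and the VF range check reflects the 0-15 range given above.

```python
def vf_rate_command(mgmt_interface, vf_number, rate_gbps):
    """Build the ip link command that limits one VF to rate_gbps."""
    if not 0 <= vf_number <= 15:
        raise ValueError("vf_number must be in the range 0-15")
    rate_mbps = int(round(rate_gbps * 1000))  # ip link expects Mbps
    if rate_mbps < 0:
        raise ValueError("rate must not be negative")
    return f"ip link set dev {mgmt_interface} vf {vf_number} rate {rate_mbps}"


if __name__ == "__main__":
    print(vf_rate_command("mgmtpf10", 0, 2))
    print(vf_rate_command("mgmtpf10", 1, 3))
```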
4.2. Bonding
The VF network interfaces (assigned to a VM) can be aggregated into a single logical bonded interface, effectively combining their bandwidth into a single connection. Bonding also provides redundancy in case one of the links fails. Execute the following steps in the VM (attached with more than one VF interface):
i. Load the Virtual Function network driver using the force_link_up module parameter.
[root@host~]# modprobe cxgb4vf force_link_up=0
ii. Create a bonded interface:
[root@host~]# modprobe bonding mode=<bonding mode> <optional parameters>
iii. Bring up the bonded interface and enslave the VF interfaces to the bond:
[root@host~]# ifconfig bond0 up
[root@host~]# ifenslave bond0 ethX ethY
Note: ethX and ethY are the VF interfaces attached to the same VM. It is recommended to use VFs of different ports to achieve redundancy in case of link failures.
iv. Assign IPv4/IPv6 address to the bonded interface:
[root@host~]# ifconfig bond0 X.X.X.X/Y
[root@host~]# ifconfig bond0 inet6 add <128-bit IPv6 Address> up
Example:
i. 2 VFs are instantiated each on PF0 (Port 0) and PF1 (Port 1) on the host.
[root@host~]# modprobe cxgb4
[root@host~]# echo 2 > /sys/class/net/eth4/device/driver/0000\:01\:00.0/sriov_numvfs
[root@host~]# echo 2 > /sys/class/net/eth4/device/driver/0000\:01\:00.1/sriov_numvfs
ii. 1 VM was configured with VF0 of PF0 and VF1 of PF1.
[root@host~]# modprobe cxgb4vf force_link_up=0
[root@host~]# ifconfig enp8s1
enp8s1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 06:44:3c:a8:40:00 txqueuelen 1000 (Ethernet)
[root@host~]# ifconfig enp8s1f5d1
enp8s1f5d1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether 06:44:3c:a8:40:11 txqueuelen 1000 (Ethernet)
iii. Bonding mode=1 was configured in the VM.
[root@host~]# modprobe bonding mode=1 miimon=100
[root@host~]# ifconfig bond0 up
[root@host~]# ifenslave bond0 enp8s1 enp8s1f5d1
[root@host~]# ifconfig bond0 10.1.1.223/24
The traffic will run over the bond interface in Active-Backup mode. If the link fails on enp8s1, the traffic will fail over to enp8s1f5d1.
4.3. High Capacity VF Configuration
Chelsio adapters support 16 VFs per PF by default. To use more VFs per PF, follow the steps below on the host:
Important: Currently supported on T6225-SO-CR and T6225-OCP adapters.
i. Change your current working directory to the Chelsio Unified Wire package directory and install the driver:
[root@host~]# cd ChelsioUwire-x.x.x.x
[root@host~]# make CONF=HIGH_CAPACITY_VF install
Note: For more installation options, please run make help or install.py -h
ii. Update the adapter configuration and reboot the machine:
[root@host~]# chelsio_adapter_config.py
Chelsio adapter detected
|------------------------------------|
| Choose Chelsio card:               |
| 1. T580_SO_CR 3:0.0                |
| 2. T6225_SO 4:0.0                  |
|------------------------------------|
Select card: 2
Card T6225_SO(4:0.0) selected
|------------------------------------|
| Choose option                      |
| 1. Change to Default settings      |
| 2. Change Adapter Config settings  |
|------------------------------------|
Select option: 2
Changing Adapter Config settings
|------------------------------------|
| Possible Chelsio adapter settings: |
| 1: 248 VFs mode                    |
|------------------------------------|
248 VF setting selected
iii. Instantiate virtual functions:
[root@host~]# modprobe cxgb4
[root@host~]# echo n > /sys/class/net/ethX/device/driver/<bus_id>/sriov_numvfs
124 virtual functions can be instantiated on a T5 adapter, with 31 virtual functions per physical function (PF 0..3).
248 virtual functions can be instantiated on a T6 adapter, with 62 virtual functions per physical function (PF 0..3).
iv. Unload the vNIC driver on the host (if loaded):
[root@host~]# rmmod cxgb4vf
v. The virtual functions can now be assigned to virtual machines (guests).
vi. For each PF on which VFs are instantiated, one management interface (mgmtpfX,Y) will be created. You can see them using the ip link show command:
[root@host ~]# ip link show
14: mgmtpf1,0: <NOARP> mtu 0 qdisc noop state DOWN mode DEFAULT qlen 1 link/none vf 0 MAC 06:44:3c:b1:00:00, link-state auto
15: mgmtpf1,1: <NOARP> mtu 0 qdisc noop state DOWN mode DEFAULT qlen 1 link/none vf 0 MAC 06:44:3c:b1:80:10, link-state auto
16: mgmtpf1,2: <NOARP> mtu 0 qdisc noop state DOWN mode DEFAULT qlen 1 link/none vf 0 MAC 06:44:3c:b1:80:20, link-state auto
17: mgmtpf1,3: <NOARP> mtu 0 qdisc noop state DOWN mode DEFAULT qlen 1 link/none vf 0 MAC 06:44:3c:b1:80:30, link-state auto
vii. To set a VLAN ID on Virtual Function, use the following syntax:
[root@host ~]# ip link set <mgmtpfX,Y> vf <vf_index> vlan <vlan_id>
Example: The below command will set VLAN ID 20 to VF0 device instantiated on PF0 function.
[root@host ~]# ip link set mgmtpf1,0 vf 0 vlan 20
viii. To set a MAC address on the Virtual Function, use the syntax:
[root@host ~]# ip link set <mgmtpfX,Y> vf <vf_index> mac <vnic_mac>
Example:
[root@host ~]# ip link set mgmtpf1,0 vf 0 mac 06:44:3c:11:22:33
Note: The VF driver (cxgb4vf) needs to be reloaded on the VM for the new settings (VLAN or MAC address) to take effect.
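The VLAN and MAC steps above can be scripted with a little input validation. The following sketch builds the host-side commands and the guest-side reload sequence. The function names are hypothetical, and the 1-4094 VLAN range is the usable 802.1Q range assumed here, not a limit stated in this manual.

```python
import re

MAC_PATTERN = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def vf_vlan_command(mgmt_interface, vf_index, vlan_id):
    """Host-side command that assigns a VLAN ID to one VF."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("vlan_id must be in the range 1-4094")
    return f"ip link set {mgmt_interface} vf {vf_index} vlan {vlan_id}"


def vf_mac_command(mgmt_interface, vf_index, mac):
    """Host-side command that assigns a MAC address to one VF."""
    if not MAC_PATTERN.match(mac):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return f"ip link set {mgmt_interface} vf {vf_index} mac {mac.lower()}"


def guest_reload_commands():
    """Guest-side driver reload needed for the new settings to take effect."""
    return ["rmmod cxgb4vf", "modprobe cxgb4vf"]


if __name__ == "__main__":
    print(vf_vlan_command("mgmtpf1,0", 0, 20))
    print(vf_mac_command("mgmtpf1,0", 0, "06:44:3c:11:22:33"))
    print(guest_reload_commands())
```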
Software/Driver Unloading
5.1. Unloading the Driver
The vNIC driver must be unloaded on the Guest OS by the root user. Any attempt to unload the driver as a regular user will fail.
To unload the driver, execute the following command:
[root@host~]# rmmod cxgb4vf
IV. iWARP (RDMA)
Introduction
Chelsio’s Terminator engine implements a feature-rich RDMA implementation which adheres to the IETF standards with optional markers and MPA CRC-32C.
The iWARP RDMA operation benefits from the virtualization, traffic management and QoS mechanisms provided by the Terminator engine. iWARP RDMA packets can be subjected to ACL processing. It is also possible to rate control the iWARP traffic on a per-connection or per-class basis, and to give higher priority to QPs that implement distributed locking mechanisms. The iWARP operation also benefits from the high-performance, low-latency TCP implementation in the offload engine.
1.1. Hardware Requirements
Supported Adapters
The following are the currently shipping Chelsio adapters that are compatible with Chelsio iWARP driver:
T62100-CR
T62100-LP-CR
T6425-CR
T6225-CR
T6225-LL-CR
T6225-SO-CR^
T6225-OCP^
T580-CR
T580-LP-CR
T540-CR
T540-LP-CR
T540-BT
T520-CR
T520-LL-CR
T520-BT
T420-CR
T440-CR
T422-CR
T404-BT
T440-LP-CR
T420-LL-CR
T420-CX
^ Memory-free; 256 IPv4/128 IPv6 offload connections supported.
1.2. Software Requirements
Linux Requirements
Currently the iWARP driver is available for the following versions:
RHEL 7.5, 3.10.0-862.el7
RHEL 7.5, 3.10.0-862.el7.ppc64le (POWER8 LE)
RHEL 7.5, 4.14.0-49.el7a.aarch64 (ARM64)
RHEL 7.4, 3.10.0-693.el7
RHEL 7.4, 3.10.0-693.el7.ppc64le (POWER8 LE)
RHEL 7.3, 4.5.0-15.el7.aarch64 (ARM64)
RHEL 6.9, 2.6.32-696.el6
SLES 15, 4.12.14-23-default
SLES 12 SP3, 4.4.73-5-default
SLES 12 SP2, 4.4.21-69-default
Ubuntu 18.04.1, 4.15.0-29-generic
Ubuntu 16.04.4, 4.4.0-116-generic
Kernel.org linux-4.14.67
Kernel.org linux-4.9 (Minimum 4.9 kernel version supported is 4.9.13)
Other kernel versions have not been tested and are not guaranteed to work.
Software/Driver Installation
2.1. Pre-requisites
libnl-devel, libnl3-devel and valgrind-devel packages should be installed for libiwpm and OFED installation.
If you are planning to upgrade OFED on one member of the cluster, the upgrade needs to be installed on all the members.
If you want to install OFED with NFS-RDMA support, refer to Setting up NFS-RDMA section.
rdma-core-devel package should be installed on RHEL 7.4, RHEL 7.5, SLES 15 and SLES 12 SP3 systems.
2.2. Installation
i. Change your current working directory to Chelsio Unified Wire package directory:
[root@host~]# cd ChelsioUwire-x.x.x.x
ii. Install iWARP drivers and libraries:
[root@host~]# make iwarp_install
Note: For more installation options, please run make help or install.py -h
Software/Driver Loading
Important: Please ensure that all inbox drivers are unloaded before proceeding with the Unified Wire drivers:
[root@host~]# rmmod csiostor cxgb4i cxgbit iw_cxgb4 chcr cxgb4vf cxgb4 libcxgbi libcxgb
3.1. Loading iWARP Driver
The driver must be loaded by the root user. Any attempt to load the driver as a regular user will fail.
To load the iWARP driver, the NIC driver and core RDMA drivers must be loaded first. Run the following commands:
[root@host~]# modprobe cxgb4
[root@host~]# modprobe iw_cxgb4
[root@host~]# modprobe rdma_ucm
Optionally, you can start the iWARP Port Mapper daemon to enable port mapping:
[root@host~]# iwpmd
Software/Driver Configuration and Fine-tuning
4.1. Testing connectivity with ping and rping
Load the NIC, iWARP and core RDMA modules as described in the Software/Driver Loading section. You will then see two or four Ethernet interfaces for the Terminator device. Configure them with an appropriate IP address, netmask, etc. You can use the Linux ping command to test basic connectivity via the Terminator interface. To test RDMA, use the rping command that is included in the librdmacm-utils RPM:
Run the following command on the server machine:
[root@host~]# rping -s -a server_ip_addr -p 9999
Run the following command on the client machine:
[root@host~]# rping -c -Vv -C10 -a server_ip_addr -p 9999
You should see ping data like this on the client:
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
client DISCONNECT EVENT...
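The rping check above can be scripted by counting the "ping data" records on the client. The sketch below assumes rping writes one record per line to standard output; the helper names `count_rping_records` and `rping_ok` are illustrative.

```shell
#!/bin/sh
# Count the "ping data: rdma-ping-N:" records in rping client output on stdin.
count_rping_records() {
    grep -c 'rdma-ping-[0-9][0-9]*:'
}

# Succeed only if the number of records equals the expected iteration count.
rping_ok() {
    expected=$1
    [ "$(count_rping_records)" -eq "$expected" ]
}
```

For example, with the client command shown above: `rping -c -Vv -C10 -a server_ip_addr -p 9999 | rping_ok 10 && echo "RDMA ping OK"`.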
4.2. Enabling various MPIs
Setting shell for Remote Login
Set up authentication for the user account on all systems in the cluster so that the user can log on remotely or execute commands without a password.
Quick steps to set up user authentication:
i. Change to user home directory
[root@host~]# cd
ii. Generate authentication key
[root@host~]# ssh-keygen -t rsa
iii. Hit [Enter] upon prompting to accept default setup and empty password phrase
iv. Create authorization file
[root@host~]# cd .ssh
[root@host~]# cat *.pub > authorized_keys
[root@host~]# chmod 600 authorized_keys
v. Copy directory .ssh to all systems in the cluster
[root@host~]# cd
[root@host~]# scp -r /root/.ssh remotehostname-or-ipaddress:
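For clusters with more than a few nodes, the final copy step can be repeated from a list of hosts. This is a sketch only: the function name `push_ssh_dir`, the hosts file and the `RUN` dry-run hook are assumptions for illustration.

```shell
#!/bin/sh
# Copy /root/.ssh to every host listed (one per line) in a hosts file.
# RUN is a hypothetical hook: RUN=echo prints the scp commands instead of
# running them.
push_ssh_dir() {
    hosts_file=$1
    while read -r host; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        # </dev/null keeps scp from consuming the rest of the hosts file.
        ${RUN:-} scp -r /root/.ssh "$host:" </dev/null || return 1
    done < "$hosts_file"
}
```

`RUN=echo push_ssh_dir hosts.txt` shows what would be copied; without `RUN`, each host prompts for a password once, after which remote commands should run without one.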
Configuration of various MPIs (Installation and Setup)
Intel-MPI
i. Download the latest Intel MPI from the Intel website.
ii. Copy the license file (.lic file) into the l_mpi_p_x.y.z directory.
iii. Create machines.LINUX (list of node names) in l_mpi_p_x.y.z.
iv. Select advanced options during installation and register the MPI.
v. Install software on every node.
[root@host~]# ./install.py
vi. Set IntelMPI with mpi-selector (do this on all nodes).
[root@host~]# mpi-selector --register intelmpi --source-dir /opt/intel/impi/3.1/bin/
[root@host~]# mpi-selector --set intelmpi
vii. Edit .bashrc and add these lines:
export RSH=ssh
export DAPL_MAX_INLINE=64
export I_MPI_DEVICE=rdssm:chelsio
export MPIEXEC_TIMEOUT=180
export MPI_BIT_MODE=64
viii. Logout & log back in.
ix. Populate mpd.hosts with node names.
x. Contact Intel for obtaining their MPI with DAPL support.
xi. To run Intel MPI over RDMA interface, DAPL 2.0 should be set up as follows:
Enable the Chelsio device by adding an entry at the beginning of the /etc/dat.conf file for the Chelsio interface. For instance, if your Chelsio interface name is eth2, then the following line adds a DAT version 2.0 device named "chelsio2" for that interface:
chelsio2 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth2 0" ""
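The /etc/dat.conf entry follows a fixed pattern in which only the device name and the interface name change. A small generator, sketched here with the hypothetical function name `dat_entry`, keeps the quoting consistent:

```shell
#!/bin/sh
# Print a DAT 2.0 provider line for the given device name and interface,
# following the format of the example entry in this section.
dat_entry() {
    name=$1
    iface=$2
    printf '%s u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "%s 0" ""\n' "$name" "$iface"
}

# Reproduces the example line for interface eth2.
dat_entry chelsio2 eth2
```

Because the entry has to go at the beginning of the file, one option is to write the generated line followed by the existing contents to a temporary file, review it, and then replace /etc/dat.conf with it.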
Note: The hosts in the mpd.hosts file should be Chelsio interface IP addresses.
Note: I_MPI_DEVICE=rdssm:chelsio assumes you have an entry in /etc/dat.conf named chelsio.
Note: The MPIEXEC_TIMEOUT value may need to be increased if heavy traffic is going across the systems.
Open MPI (Installation and Setup)
Open MPI iWARP support is only available in Open MPI version 1.3 or greater. Open MPI will work without any specific configuration via the openib btl. Users who want to tune performance may wish to inspect the receive queue values, which can be found in the "Chelsio T4" section of mca-btl-openib-device-params.ini. Follow the steps mentioned below to install and configure Open MPI.
i. If not already done, install the mpi-selector tool.
ii. Download the latest stable/feature version of openMPI from the OpenMPI website.
iii. Untar and change your current working directory to the openMPI package directory.
iv. Configure and install as:
[root@host~]# ./configure --with-openib=/usr CC=gcc CXX=g++ F77=gfortran FC=gfortran --enable-mpirun-prefix-by-default --prefix=/usr/mpi/gcc/openmpi-x.y.z/ --with-openib-libdir=/usr/lib64/ --libdir=/usr/mpi/gcc/openmpi-x.y.z/lib64/ --with-contrib-vt-flags=--disable-iotrace
[root@host~]# make
[root@host~]# make install
The above step will install openMPI in /usr/mpi/gcc/openmpi-x.y.z/
Note: To enable multithreading, add the "--enable-mpi-thread-multiple" and "--with-threads=posix" parameters to the above configure command.
v. Next, create a shell script, mpivars.csh, with the following entry:
# path
if ("" == "`echo $path | grep /usr/mpi/gcc/openmpi-x.y.z/bin`") then
    set path=(/usr/mpi/gcc/openmpi-x.y.z/bin $path)
endif
# LD_LIBRARY_PATH
if ("1" == "$?LD_LIBRARY_PATH") then
    if ("$LD_LIBRARY_PATH" !~ */usr/mpi/gcc/openmpi-x.y.z/lib64*) then
        setenv LD_LIBRARY_PATH /usr/mpi/gcc/openmpi-x.y.z/lib64:${LD_LIBRARY_PATH}
    endif
else
    setenv LD_LIBRARY_PATH /usr/mpi/gcc/openmpi-x.y.z/lib64
endif
# MPI_ROOT
setenv MPI_ROOT /usr/mpi/gcc/openmpi-x.y.z
vi. Similarly, create another shell script, mpivars.sh, with the following entry:
# PATH
if test -z "`echo $PATH | grep /usr/mpi/gcc/openmpi-x.y.z/bin`"; then
    PATH=/usr/mpi/gcc/openmpi-x.y.z/bin:${PATH}
    export PATH
fi
# LD_LIBRARY_PATH
if test -z "`echo $LD_LIBRARY_PATH | grep /usr/mpi/gcc/openmpi-x.y.z/lib64`"; then
    LD_LIBRARY_PATH=/usr/mpi/gcc/openmpi-x.y.z/lib64${LD_LIBRARY_PATH:+:}${LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH
fi
# MPI_ROOT
MPI_ROOT=/usr/mpi/gcc/openmpi-x.y.z
export MPI_ROOT
vii. Next, copy the two files created in steps (v) and (vi) to /usr/mpi/gcc/openmpi-x.y.z/bin and /usr/mpi/gcc/openmpi-x.y.z/etc
viii. Register OpenMPI with MPI-selector:
[root@host~]# mpi-selector --register openmpi --source-dir /usr/mpi/gcc/openmpi-x.y.z/bin
ix. Verify if it is listed in mpi-selector:
[root@host~]# mpi-selector --list
x. Set OpenMPI:
[root@host~]# mpi-selector --set openmpi --yes
xi. Log out and log back in.
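The mpivars.sh script created in step (vi) guards against adding the same directory twice by grepping the current value. An alternative guard, sketched below with the hypothetical helper `prepend_once`, matches whole path components instead of substrings; it is not what the Open MPI or Chelsio packages ship.

```shell
#!/bin/sh
# Prepend a directory to a colon-separated list unless it is already present.
# Usage: new_list=$(prepend_once DIR "$LIST")
prepend_once() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;        # already present, unchanged
        *) printf '%s\n' "$dir${list:+:}$list" ;;   # add at the front
    esac
}
```

For example, `PATH=$(prepend_once /usr/mpi/gcc/openmpi-x.y.z/bin "$PATH")` leaves PATH unchanged on a second login, and the `${list:+:}` expansion avoids a trailing colon when the list starts out empty.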
MVAPICH2 (Installation and Setup)
i. Download the latest MVAPICH2 software package from http://mvapich.cse.ohio-state.edu/
ii. Untar and change your current working directory to the MVAPICH2 package directory.
iii. Configure and install as:
[root@host~]# ./configure --prefix=/usr/mpi/gcc/mvapich2-x.y/ --with-device=ch3:mrail --with-rdma=gen2 --enable-shared --with-ib-libpath=/usr/lib64/ --enable-rdma-cm --libdir=/usr/mpi/gcc/mvapich2-x.y/lib64
[root@host~]# make
[root@host~]# make install
The above step will install MVAPICH2 in /usr/mpi/gcc/mvapich2-x.y/
iv. Next, create a shell script, mpivars.csh, with the following entry:
# path
if ("" == "`echo $path | grep /usr/mpi/gcc/mvapich2-x.y/bin`") then
    set path=(/usr/mpi/gcc/mvapich2-x.y/bin $path)
endif
# LD_LIBRARY_PATH
if ("1" == "$?LD_LIBRARY_PATH") then
    if ("$LD_LIBRARY_PATH" !~ */usr/mpi/gcc/mvapich2-x.y/lib64*) then
        setenv LD_LIBRARY_PATH /usr/mpi/gcc/mvapich2-x.y/lib64:${LD_LIBRARY_PATH}
    endif
else
    setenv LD_LIBRARY_PATH /usr/mpi/gcc/mvapich2-x.y/lib64
endif
# MPI_ROOT
setenv MPI_ROOT /usr/mpi/gcc/mvapich2-x.y
v. Similarly, create another shell script, mpivars.sh, with the following entry:
# PATH
if test -z "`echo $PATH | grep /usr/mpi/gcc/mvapich2-x.y/bin`"; then
    PATH=/usr/mpi/gcc/mvapich2-x.y/bin:${PATH}
    export PATH
fi
# LD_LIBRARY_PATH
if test -z "`echo $LD_LIBRARY_PATH | grep /usr/mpi/gcc/mvapich2-x.y/lib64`"; then
    LD_LIBRARY_PATH=/usr/mpi/gcc/mvapich2-x.y/lib64${LD_LIBRARY_PATH:+:}${LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH
fi
# MPI_ROOT
MPI_ROOT=/usr/mpi/gcc/mvapich2-x.y
export MPI_ROOT
vi. Next, copy the two files created in steps (iv) and (v) to /usr/mpi/gcc/mvapich2-x.y/bin and /usr/mpi/gcc/mvapich2-x.y/etc
vii. Add the following entries in .bashrc file:
export MVAPICH2_HOME=/usr/mpi/gcc/mvapich2-x.y/
export MV2_USE_IWARP_MODE=1
export MV2_USE_RDMA_CM=1
viii. Register MPI:
[root@host~]# mpi-selector --register mvapich2 --source-dir /usr/mpi/gcc/mvapich2-x.y/bin/
ix. Verify if it is listed in mpi-selector:
[root@host~]# mpi-selector --list
x. Set MVAPICH2:
[root@host~]# mpi-selector --set mvapich2 --yes
xi. Log out and log back in.
xii. Populate mpd.hosts with node names.
xiii. On each node, create /etc/mv2.conf with a single line containing the IP address of the local adapter interface. This is how MVAPICH2 picks which interface to use for RDMA traffic.
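The single-line /etc/mv2.conf file can be generated from the interface's configured address rather than typed by hand. The sketch below assumes the iproute2 `ip` command and its one-line (`-o`) output format; the function name `first_ipv4` and the interface name eth2 in the usage note are illustrative.

```shell
#!/bin/sh
# Extract the first IPv4 address from `ip -o -4 addr show dev IFACE` output
# read on stdin. In that one-line format, field 3 is "inet" and field 4 is
# ADDRESS/PREFIX.
first_ipv4() {
    awk '$3 == "inet" { split($4, a, "/"); print a[1]; exit }'
}
```

On a node whose Chelsio interface is eth2, `ip -o -4 addr show dev eth2 | first_ipv4 > /etc/mv2.conf` would write the address; check the file afterwards, since an interface with no IPv4 address produces an empty file.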
Building MPI Tests
i. Download Intel's MPI Benchmarks from http://software.intel.com/en-us/articles/intel-mpi-benchmarks
ii. Untar and change your current working directory to src directory.
iii. Edit make_mpich file and set MPI_HOME variable to the MPI which you want to build the benchmarks tool against. For example, in case of openMPI-1.6.4 set the variable as:
MPI_HOME=/usr/mpi/gcc/openmpi-1.6.4/
iv. Next, build and install the benchmarks using:
[root@host~]# gmake -f make_mpich
The above step will install IMB-MPI1, IMB-IO and IMB-EXT benchmarks in the current working directory (i.e. src).
v. Change your working directory to the MPI installation directory. In case of OpenMPI, it will be /usr/mpi/gcc/openmpi-x.y.z/
vi. Create a directory called tests and then another directory called imb under tests.
vii. Copy the benchmarks built and installed in step (iv) to the imb directory.
viii. Follow steps (v), (vi) and (vii) for all the nodes.
Running MPI Applications
Run Intel MPI applications as:
mpdboot -n <no_of_nodes_in_cluster> -r ssh
mpdtrace
mpiexec -ppn -n 2 /opt/intel/impi/3.1/tests/IMB-3.1/IMB-MPI1
The performance is best with NIC MTU set to 9000 bytes.
Run Open MPI application as:
mpirun --host node1,node2 -mca btl openib,sm,self /usr/mpi/gcc/openmpi-x.y.z/tests/imb/IMB-MPI1
Important: openmpi-1.4.3 can cause IMB benchmark stalls due to a shared memory BTL issue. This issue is fixed in openmpi-1.4.5 and later releases. Hence, it is recommended that you download and install the latest stable release from Open MPI's official website, http://www.open-mpi.org
Note: For OpenMPI/RDMA clusters with node counts greater than or equal to 8 nodes, and process counts greater than or equal to 64, you may experience the following RDMA address resolution error when running MPI jobs with the default OpenMPI settings:
The RDMA CM returned an event error while attempting to make a connection. This type of error usually indicates a network configuration error.
Local host: core96n3.asicdesigners.com
Local device: Unknown
Error name: RDMA_CM_EVENT_ADDR_ERROR
Peer: core96n8
Workaround: Increase the OpenMPI rdma route resolution timeout. The default is 1000 (1000 ms). Increase it to 30000 with this parameter:
--mca btl_openib_connect_rdmacm_resolve_timeout 30000
Run MVAPICH2 application as:
mpirun_rsh -ssh -np 8 -hostfile mpd.hosts $MVAPICH2_HOME/tests/imb/IMB-MPI1
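For the Open MPI address-resolution workaround described in this section, the timeout parameter is simply appended to the usual launch line. The sketch below only assembles and prints the command; the function name `openmpi_cmd` is hypothetical, and the options are the ones quoted in this section.

```shell
#!/bin/sh
# Print an Open MPI launch command for the given hosts and benchmark binary,
# adding the rdmacm resolve timeout (in ms) when a third argument is supplied.
openmpi_cmd() {
    hosts=$1
    binary=$2
    timeout_ms=$3
    cmd="mpirun --host $hosts -mca btl openib,sm,self"
    if [ -n "$timeout_ms" ]; then
        cmd="$cmd --mca btl_openib_connect_rdmacm_resolve_timeout $timeout_ms"
    fi
    printf '%s %s\n' "$cmd" "$binary"
}

# Prints the launch line with the 30000 ms timeout from the workaround.
openmpi_cmd node1,node2 /usr/mpi/gcc/openmpi-x.y.z/tests/imb/IMB-MPI1 30000
```

Printing the command first makes it easy to review before running it on a large cluster; omit the third argument to get the default launch line.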
4.3. Setting up NFS-RDMA
Important: On RHEL 7 systems, uninstall OFED 4.8-2 if present on the machine.
Starting NFS-RDMA
Server-side settings
Follow the steps mentioned below to set up an NFS-RDMA server.
i. Make an entry in the /etc/exports file for each directory you need to export using NFS-RDMA on the server:
/share/rdma *(fsid=0,async,insecure,no_root_squash)
/share/rdma1 *(fsid=1,async,insecure,no_root_squash)
Note that each exported directory must have a different fsid.
ii. Load the iwarp modules and make sure peer2peer is set to 1.
iii. Load xprtrdma and svcrdma modules as:
[root@host~]# modprobe xprtrdma
[root@host~]# modprobe svcrdma
iv. Start the nfs service as:
[root@host~]# service nfs start
All services in NFS should start without errors.
v. Edit the portlist file in /proc/fs/nfsd/ to include the rdma port 2050:
[root@host~]# echo rdma 2050 > /proc/fs/nfsd/portlist
vi. Run exportfs to make local directories available for Network File System (NFS) clients to mount.
[root@host~]# exportfs
Now the NFS-RDMA server is ready.
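Since every exported directory needs its own fsid, a quick duplicate check on /etc/exports can catch copy-and-paste mistakes before the service is started. This is a sketch that assumes a grep with the `-o` option (GNU and BSD grep both provide it); the function name `duplicate_fsids` is illustrative.

```shell
#!/bin/sh
# Read /etc/exports-style lines on stdin and print every fsid=VALUE option
# that appears more than once.
duplicate_fsids() {
    grep -o 'fsid=[^,)]*' | sort | uniq -d
}
```

`duplicate_fsids < /etc/exports` prints nothing when all fsid values are distinct.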
Client-side settings
Follow the steps mentioned below at the client side.
i. Load the iwarp modules and make sure peer2peer is set to 1. Make sure you are able to ping and ssh to the server Chelsio interface through which directories will be exported.
ii. Load the xprtrdma module.
[root@host~]# modprobe xprtrdma
iii. Run the showmount command to show all directories from server as:
[root@host~]# showmount -e <server-chelsio-ip>
iv. Once the exported directories are listed, mount them as:
[root@host~]# mount.nfs <serverip>:<directory> <mountpoint-on-client> -o vers=3,rdma,port=2050,wsize=65536,rsize=65536
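After mounting, it is worth confirming that the mount is actually using the RDMA transport. The sketch below assumes the kernel reports `proto=rdma` among the options in /proc/mounts for such mounts; the function name `is_rdma_mount` and the mount point in the usage note are illustrative.

```shell
#!/bin/sh
# Read /proc/mounts-style lines on stdin and succeed if the given mount point
# is an NFS mount whose option list contains proto=rdma.
is_rdma_mount() {
    awk -v mp="$1" '
        $2 == mp && $3 ~ /^nfs/ && $4 ~ /(^|,)proto=rdma(,|$)/ { found = 1 }
        END { exit !found }'
}
```

For example: `is_rdma_mount /mnt/rdma < /proc/mounts && echo "mounted over RDMA"`.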
4.4. Performance Tuning
i. Apply the performance settings mentioned in the Performance Tuning section in the Unified Wire chapter before proceeding.
ii. Run the performance tuning script to map iWARP queues to different CPUs.
[root@host~]# t4_perftune.sh -Q rdma -n
Software/Driver Unloading
To unload the iWARP driver, run the following command:
[root@host~]# rmmod iw_cxgb4
V. iSER
Introduction
The iSCSI Extensions for RDMA (iSER) protocol is a translation layer for operating iSCSI over RDMA transports, such as iWARP/Ethernet or InfiniBand.
1.1. Hardware Requirements
Supported adapters
The following are the currently shipping Chelsio adapters that are compatible with the iSER driver:
T62100-CR
T62100-LP-CR
T6425-CR
T6225-CR
T6225-LL-CR
T6225-OCP^
T6225-SO-CR^
T580-CR
T580-LP-CR
T540-CR
T540-BT
T520-CR
T520-LL-CR
T520-BT
^ Memory-free; 256 IPv4/128 IPv6 offload connections supported.
1.2. Software Requirements
Linux Requirements
Currently the iSER driver is available for the following versions:
RHEL 7.5, 3.10.0-862.el7
RHEL 7.5, 4.14.0-49.el7a.aarch64 (ARM64)
RHEL 7.4, 3.10.0-693.el7
SLES 15, 4.12.14-23-default
SLES 12 SP3, 4.4.73-5-default
Ubuntu 18.04.1, 4.15.0-29-generic
Kernel.org linux-4.14.67 (kernel compiled on RHEL 7.3 and above)
Kernel.org linux-4.9 (Minimum 4.9 kernel version supported is 4.9.13; kernel compiled on RHEL 7.3 and above)
Other kernel versions have not been tested and are not guaranteed to work.
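A host's running kernel can be compared against the distribution kernels listed above before the iSER driver is installed. The sketch below does a simple prefix match on the output of `uname -r`; the function name `is_listed_kernel` is hypothetical, and the Kernel.org builds are deliberately left out because their release strings depend on how the kernel was compiled.

```shell
#!/bin/sh
# Succeed if a kernel release string (as printed by `uname -r`) starts with one
# of the distribution kernel versions listed above. Kernel.org builds are not
# covered by this sketch.
is_listed_kernel() {
    case $1 in
        3.10.0-862.el7*|4.14.0-49.el7a*|3.10.0-693.el7*) return 0 ;;
        4.12.14-23-default*|4.4.73-5-default*|4.15.0-29-generic*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: `is_listed_kernel "$(uname -r)" || echo "kernel not in the tested list"`.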