Abstract
Remote Direct Memory Access (RDMA) is a data exchange technology that improves network
performance by streamlining data processing operations. This technology brief describes how RDMA
can be applied to the two most common network interconnects, Ethernet and InfiniBand, to provide
efficient throughput in the data center.
Introduction
Advances in computing and storage technologies are placing a considerable burden on the data
center’s network infrastructure. As network speeds increase and greater amounts of data are moved,
more processing power is required to handle data communication.
A typical data center today uses a variety of disparate interconnects for server-to-server and server-to-storage links. The use of multiple system and peripheral bus interconnects decreases compatibility,
interoperability, and management efficiency and drives up the cost of equipment, software, training,
and the personnel needed to operate and maintain them. To increase efficiency and lower costs, data
center network infrastructure must be transformed into a unified, flexible, high-speed fabric.
Unified high-speed infrastructures require a high-bandwidth, low-latency fabric that can move data
efficiently and securely between servers, storage, and applications. Evolving fabric interconnects and
associated technologies provide more efficient and scalable computing and data transport within the
data center by reducing the overhead burden on processors and memory. More efficient
communication protocols and technologies, some of which run over existing infrastructures, free
processors for more useful work and improve infrastructure utilization. In addition, the ability of fabric
interconnects to converge functions in the data center over fewer, or possibly even one, industry-standard interconnect presents significant benefits.
Remote direct memory access (RDMA) is a data exchange technology that promises to accomplish
these goals and make iWARP (a protocol that specifies RDMA over TCP/IP) a reality. Applying RDMA
to switched-fabric infrastructures such as InfiniBand™ (IB) can enhance the performance of clustered
systems handling large data transfers.
Limitations of TCP/IP
Transmission Control Protocol and Internet Protocol (TCP/IP) represent the suite of protocols that drive
the Internet. Every computer connected to the Internet uses these protocols to send and receive
information. Information is transmitted in fixed data formats (packets), so that heterogeneous systems
can communicate. The TCP/IP stack of protocols was developed to be an internetworking language
for all types of computers to transfer data across different physical media. The TCP and IP protocol
suite includes over 70,000 software instructions that provide the necessary reliability mechanisms,
error detection and correction, sequencing, recovery, and other communications features.
Computers implement the TCP/IP protocol stack to process outgoing and incoming data packets.
Today, TCP/IP stacks are usually implemented in operating system software and packets are handled
by the main (host) processor. As a result, protocol processing of incoming and outgoing network
traffic consumes processor cycles—cycles that could otherwise be used for business and other
productivity applications. The processing work and associated time delays may also reduce the ability
of applications to scale across multiple servers. As network speeds move beyond 1 gigabit per
second (Gb/s) and larger amounts of data are transmitted, processors become burdened by TCP/IP
protocol processing and data movement.
The burden of protocol stack processing is compounded by a finite amount of memory bus
bandwidth. Incoming network data consumes the memory bus bandwidth because each data packet
must be transferred in and out of memory several times (Figure 1): received data is written to the
device driver buffer, copied into an operating system (OS) buffer, and then copied into application
memory space.
Figure 1. Typical flow of network data in the receiving host (network interface, chipset, memory, OS buffers, CPU). NOTE: The actual number of memory copies varies depending on the OS (for example, Linux uses 2).
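To make this copy path concrete, the following minimal sketch shows a conventional receive loop using standard POSIX sockets (error handling omitted; the port number is an arbitrary example). Each recv() call asks the kernel to copy data that the NIC and TCP/IP stack have already staged in OS buffers into the application's own buffer, consuming CPU cycles and memory bus bandwidth.

/* Conventional TCP receive path (illustrative sketch, error handling omitted).
 * The NIC and the kernel TCP/IP stack stage incoming packets in OS buffers;
 * each recv() call then copies that data into application memory, so the
 * host CPU touches the payload again before the application can use it. */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr = { 0 };
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(5000);   /* arbitrary example port */

    bind(listener, (struct sockaddr *)&addr, sizeof(addr));
    listen(listener, 1);

    int conn = accept(listener, NULL, NULL);

    char buf[4096];
    ssize_t n;
    /* A kernel-to-user copy happens on every recv() call. */
    while ((n = recv(conn, buf, sizeof(buf), 0)) > 0) {
        /* ... application processes n bytes ... */
    }

    close(conn);
    close(listener);
    return 0;
}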
These copy operations add latency, consume memory bus bandwidth, and require host processor
(CPU) intervention. In fact, the TCP/IP protocol overhead associated with 1 Gb of Ethernet traffic can
increase system processor utilization by 20 to 30 percent. Consequently, software overhead for
10 Gb Ethernet operation has the potential to overwhelm system processors. An InfiniBand network
using TCP operations for compatibility will suffer from the same processing overhead
problems that Ethernet networks have.
RDMA solution
Inherent processor overhead and constrained memory bandwidth are performance obstacles for
networks that use TCP, whether out of necessity (Ethernet) or compatibility (InfiniBand).
For Ethernet, the use of a TCP/IP offload engine (TOE) and RDMA can diminish these obstacles. A
network interface card (NIC) with a TOE assumes TCP/IP processing duties, freeing the host
processor for other tasks. The capability of a TOE is defined by its hardware design, the OS
programming interface, and the application being run.
RDMA technology was developed to move data from the memory of one computer directly into the
memory of another computer with minimal involvement from their processors. The RDMA protocol
includes information that allows a system to place transferred data directly into its final memory
destination without additional or interim data copies. This “zero copy” or “direct data placement”
(DDP) capability provides the most efficient network communication possible between systems.
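As a rough illustration of direct data placement, the following sketch posts an RDMA write using the open-source libibverbs programming interface (one common way applications access RDMA-capable adapters; shown here only as an example). It assumes that a protection domain (pd) and a connected queue pair (qp) have already been set up and that the peer's buffer address and remote key (rkey) were exchanged out of band; connection setup and completion handling are omitted.

/* Illustrative RDMA write with libibverbs (setup and completion handling omitted).
 * Assumes: pd (protection domain) and qp (connected queue pair) already exist,
 * and the peer's remote_addr/rkey were exchanged out of band. */
#include <stddef.h>
#include <stdint.h>
#include <infiniband/verbs.h>

int rdma_write_example(struct ibv_pd *pd, struct ibv_qp *qp,
                       void *buf, size_t len,
                       uint64_t remote_addr, uint32_t rkey)
{
    /* Register (pin) the local buffer so the adapter can DMA from it.
     * The region must stay registered until the work request completes. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };

    /* RDMA WRITE: the adapter places the data directly into the peer's
     * registered buffer, with no interim copies into OS buffers and no
     * remote CPU involvement in the data path. */
    struct ibv_send_wr wr = {
        .opcode     = IBV_WR_RDMA_WRITE,
        .sg_list    = &sge,
        .num_sge    = 1,
        .send_flags = IBV_SEND_SIGNALED,
    };
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    struct ibv_send_wr *bad_wr = NULL;
    return ibv_post_send(qp, &wr, &bad_wr);
}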
Since the intent of both a TOE and RDMA is to relieve host processors of network overhead, they are
sometimes confused with each other. However, the TOE is primarily a hardware solution that
specifically takes responsibility for TCP/IP operations, while RDMA is a protocol solution that operates
at the upper layers of the network communication stack. Consequently, TOEs and RDMA can work
together: a TOE can provide localized connectivity with a device while RDMA enhances the data
throughput with a more efficient protocol.
For InfiniBand, RDMA operations provide an even greater performance benefit since InfiniBand
architecture was designed with RDMA as a core capability (no TOE needed).
RDMA provides a faster path for applications to transmit messages between network devices and is
applicable to both Ethernet and InfiniBand. Both interconnects can support new and existing
upper-layer protocols and interfaces such as the Sockets Direct Protocol (SDP), iSCSI Extensions for RDMA (iSER), Network
File System (NFS), Direct Access File System (DAFS), and the Message Passing Interface (MPI).
RDMA over TCP
Ethernet is the most prevalent network interconnect in use today. IT organizations have invested
heavily in Ethernet technology and most are unwilling to tear out their networks and replace them.
Reliance on Ethernet is justified by its low cost, backward compatibility, and consistent bandwidth
upgrades over time. Today’s Ethernet networks, which use TCP/IP operations, commonly operate at
100 megabits per second (Mb/s) and 1 gigabit per second (Gb/s). Next-generation speeds will
increase to 10 Gb/s. Customer migration to 10-Gb Ethernet will be tempered by the input/output
(I/O) processing burden that TCP/IP operations place on servers.
The addition of RDMA capability to Ethernet will reduce host processor utilization and increase the
benefits realized by migrating to 10-Gb Ethernet. Adding RDMA capability to Ethernet will allow data
centers to expand the infrastructure with less effect on overall performance. This improves
infrastructure flexibility for adapting to future needs.
RDMA over TCP is a communication protocol that moves data directly between the memory of
applications on two systems (or nodes), with minimal work by the operating system kernel and without
interim data copying into system buffers (Figure 2). This capability enables RDMA over TCP to work
over standard TCP/IP-based networks (such as Ethernet) that are commonly used in data centers
today. Note that RDMA over TCP does not specify the physical layer and will work over any network
that uses TCP/IP.
Figure 2. Data flow with RDMA over TCP (Ethernet): data moves from the sending host's memory through its chipset and Ethernet NIC, across the Ethernet LAN, and through the receiving host's NIC and chipset directly into the receiving application's memory.
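As one illustration of how RDMA over TCP fits into an existing IP network, the following client-side sketch uses the open-source librdmacm connection manager, which resolves an ordinary IP address and route before any RDMA operations are posted; the same calls work over an iWARP Ethernet adapter or over InfiniBand. The peer address 192.168.1.10 and port 7471 are placeholders, and queue pair creation, connection establishment, and data transfer are indicated only in comments.

/* Client-side address and route resolution with librdmacm (illustrative sketch).
 * RDMA over TCP (iWARP) uses ordinary IP addressing, so the RDMA connection
 * manager resolves an IP address and route much as a TCP client would. */
#include <stdio.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <rdma/rdma_cma.h>

/* Block until the next connection-manager event arrives and verify that it
 * is the one expected. */
static int expect_event(struct rdma_event_channel *ec,
                        enum rdma_cm_event_type expected)
{
    struct rdma_cm_event *ev;
    if (rdma_get_cm_event(ec, &ev))
        return -1;
    int ok = (ev->event == expected) ? 0 : -1;
    rdma_ack_cm_event(ev);
    return ok;
}

int main(void)
{
    struct rdma_event_channel *ec = rdma_create_event_channel();
    struct rdma_cm_id *id = NULL;

    if (!ec || rdma_create_id(ec, &id, NULL, RDMA_PS_TCP))
        return 1;

    struct sockaddr_in dst = { 0 };
    dst.sin_family = AF_INET;
    dst.sin_port   = htons(7471);                       /* placeholder port */
    inet_pton(AF_INET, "192.168.1.10", &dst.sin_addr);  /* placeholder peer */

    /* Resolve the peer's IP address and the route to it (2-second timeouts). */
    if (rdma_resolve_addr(id, NULL, (struct sockaddr *)&dst, 2000) ||
        expect_event(ec, RDMA_CM_EVENT_ADDR_RESOLVED) ||
        rdma_resolve_route(id, 2000) ||
        expect_event(ec, RDMA_CM_EVENT_ROUTE_RESOLVED)) {
        fprintf(stderr, "address/route resolution failed\n");
        return 1;
    }

    /* Next steps (omitted): create completion queues and a queue pair with
     * rdma_create_qp(), call rdma_connect(), and then post RDMA operations
     * such as the RDMA write sketched earlier. */

    rdma_destroy_id(id);
    rdma_destroy_event_channel(ec);
    return 0;
}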
RDMA over TCP allows many classes of traffic (networking, I/O, file system and block storage, and
interprocess messaging) to share the same physical interconnect, enabling that physical interconnect
to become the single unifying data center fabric. RDMA over TCP provides more efficient network
communications, which can increase the scalability of processor-bound applications. RDMA over TCP
also leverages existing Ethernet infrastructures and the expertise of IT networking personnel.