Dell PowerFlex User Manual

H18391
Technical White Paper
Dell EMC PowerFlex: Introduction to Replication
Overview and basic configuration of PowerFlex replication
Abstract
PowerFlex 3.5 adds native asynchronous replication. This paper provides an overview of PowerFlex replication technology, deployment and configuration details, and design considerations for replicating PowerFlex clusters.
June 2020
Revisions
Date          Description
June 2020     Initial release
Acknowledgments
Author: Neil Gerren, Senior Principal Engineer, Storage Technical Marketing
Support: Brian Dean, Senior Principal Engineer, Storage Technical Marketing
Other: Matt Hobbs, Advisory Systems Engineer, APJ Presales
The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.
Use, copying, and distribution of any software described in this publication requires an applicable software license.
Copyright © 2020 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners. [7/8/2020] [Technical White Paper] [H18391]
Table of contents
Revisions
Acknowledgments
Table of contents
Executive summary
1 Introduction
2 PowerFlex 3.5 new features
2.1 Native asynchronous replication
2.2 Protected Maintenance Mode
2.3 SDC authentication
2.4 New WebUI
2.5 Secure snapshots
2.6 Core improvements
3 PowerFlex asynchronous replication architecture
3.1 Journaling and snapshotting
3.2 Journaling space reservations
3.3 Journal management
4 Deploying and configuring PowerFlex clusters for replication
4.1 Deployment and configuration
4.1.1 The exchange of storage cluster Certificate Authority root certificates
4.1.2 Peering storage clusters
4.2 Replication Consistency Groups
5 Replication monitoring and configuration
5.1 The replication dashboard
5.2 The Replication Consistency Group tab
5.3 Volume access
5.3.1 Access mode for mapping the target volume
5.3.2 Test failover behavior
5.3.3 Failover behavior
5.3.4 Create snapshots behavior
5.3.5 Monitoring journal capacity and health
6 PowerFlex 3.5 networking considerations
6.1 TCP/IP port considerations
6.2 Network bandwidth considerations
6.3 Native PowerFlex 3.5 IP load balancing
6.4 Remote replication networking
6.4.1 Routing and firewall considerations for remote replication
7 System component, network, and process failure
7.1 SDR failure scenarios
7.2 SDS failure scenarios
7.3 Network link failure scenarios
8 Conclusion
A Technical support and resources
Executive summary
As PowerFlex continues to evolve, the 3.5 release adds a variety of core features, including asynchronous replication. Customers require disaster recovery and replication features to meet business and compliance requirements. Replication also serves other use cases, such as offloading demanding workloads like analytics to isolate them from mission-critical systems such as ERP or MRP. Some customers also need more than two copies of certain data sets. This paper covers:
Key features of PowerFlex Release 3.5
The core design principles of PowerFlex replication
Configuration requirements for pairing storage clusters
Configuration requirements of Replication Consistency Groups
Networking considerations
Replication use cases
This paper includes diagrams of the PowerFlex replication architecture and screenshots of the new Web User Interface (WebUI) to illustrate the essential elements of replication.

1 Introduction

PowerFlex is a software-defined storage platform designed to significantly reduce operational and infrastructure complexity, empowering organizations to move faster by delivering flexibility, elasticity, and simplicity with predictable performance and resiliency at scale. The PowerFlex family provides a foundation that combines compute and high-performance storage resources in a managed, unified fabric. It is available in multiple hardware deployment options (integrated rack, appliance, or ready nodes), all of which support Server SAN, HCI, and storage-only architectures.
PowerFlex overview
PowerFlex provides the flexibility and scale demanded by a range of application deployments, whether they are on bare metal, virtualized, or containerized.
It provides the performance and resiliency required by the most demanding enterprises, demonstrating six nines (99.9999%) or greater availability for mission-critical workloads with stable and predictable latency.
Easily providing millions of IOPS at sub-millisecond latency, PowerFlex is ideal for both high-performance applications and private clouds that need a flexible foundation with synergies into public and hybrid cloud. It also suits organizations consolidating heterogeneous assets into a single system with a flexible, scalable architecture that provides the automation to manage both storage and compute infrastructure.

2 PowerFlex 3.5 new features

There is much more than replication in this release, so it is worth mentioning some additional features.
New features

2.1 Native asynchronous replication

This much-anticipated feature is the main subject of this paper and is covered in detail in the sections that follow.

2.2 Protected Maintenance Mode

Protected Maintenance Mode (PMM) offers better data protection than Instant Maintenance Mode (IMM). While IMM made it possible to perform node maintenance very quickly, it left some exposure to data unavailability or loss if an additional device or node failed during maintenance. PMM creates a temporary third copy of the node’s data, distributed across the system spare capacity, for the duration of the maintenance period. When maintenance is complete, the deltas are synced back to the maintained node. While PMM takes longer to perform than IMM, it preserves full data protection throughout the maintenance period and greatly reduces the risk of data loss or unavailability.

2.3 SDC authentication

SDC authentication is now secured with the Challenge-Handshake Authentication Protocol (CHAP). CHAP allows the Metadata Manager (MDM) to validate the authenticity of each Storage Data Client (SDC) when it is first attached and to establish secrets between the SDCs and the Storage Data Servers (SDSs) to regulate access to volumes. The MDM regularly refreshes the secrets, forcing the SDCs and SDSs to re-authenticate.
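As a rough illustration of the challenge-handshake idea, the following Python sketch follows the generic CHAP exchange defined in RFC 1994, where the response is an MD5 hash over the identifier, the shared secret, and the challenge. It is not the PowerFlex implementation, whose internals this paper does not describe. The function names and the sample secret are hypothetical.

```python
import hashlib
import hmac
import os


def make_challenge(length: int = 16) -> bytes:
    """Authenticator side: generate a random, single-use challenge."""
    return os.urandom(length)


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Peer side: hash the identifier octet, shared secret, and challenge
    (generic RFC 1994 construction, not PowerFlex-specific)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def verify_response(identifier: int, secret: bytes,
                    challenge: bytes, response: bytes) -> bool:
    """Authenticator side: recompute the expected hash and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical exchange: the secret itself never crosses the network.
secret = b"example-shared-secret"   # illustrative value only
challenge = make_challenge()
response = chap_response(1, secret, challenge)
print(verify_response(1, secret, challenge, response))
```

Because each challenge is random and used once, a captured response cannot be replayed, and refreshing the secret periodically limits the useful lifetime of any compromised value.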

2.4 New WebUI

The PowerFlex 3.5 release offers a new, streamlined HTML5-based user interface which is consistent with other Dell Technologies product solutions.
This primary dashboard view displays the majority of system activity at a single glance while also preserving the ability to drill into all PowerFlex elements to view or manage them.

2.5 Secure snapshots

Secure snapshots were added to meet customer business and statutory requirements for data retention.
Secure snapshot with 1 year expiration time.
Once a snapshot is created with the secure option, it cannot be deleted until the assigned expiration time is reached. For cases where secure snapshots are created by mistake, or must be removed for other reasons, there is a formal process integrated with Dell support that must be followed to delete them. Note also that in 3.5, snapshots now can be created with read-only access, whether they are secure or not.

2.6 Core improvements

There are several core improvements in 3.5, and a few merit special mention. Release 3.5 adds a Fine Granularity (FG) metadata cache, which eliminates the two-step metadata lookup otherwise required for FG volume read I/Os. Up to 32 GB of FG pool metadata can be cached per SDS. The cache resides in DRAM and is not persistent. It is populated on new reads after an SDS reboots or on a cache miss. This dramatically improves FG read performance for recently and frequently read data.
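The behavior described above is that of a volatile read-through cache. The following Python sketch illustrates the general pattern only. The class name, the entry-count capacity, and the least-recently-used eviction policy are assumptions made for illustration, since the paper does not describe how PowerFlex manages cache contents.

```python
from collections import OrderedDict


class MetadataCache:
    """Volatile (in-memory) read-through cache for metadata lookups."""

    def __init__(self, capacity, backing_lookup):
        self.capacity = capacity              # max entries (stand-in for the DRAM limit)
        self.backing_lookup = backing_lookup  # slow path: read metadata from the pool
        self.entries = OrderedDict()          # empty after every restart (not persistent)
        self.hits = 0
        self.misses = 0

    def lookup(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)     # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = self.backing_lookup(key)      # extra metadata read happens only on a miss
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used (assumed policy)
        return value
```

On a hit, the read proceeds with a single data access. Only a miss pays for the additional metadata lookup, which is why recently and frequently read data benefits most.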
Data resiliency has been improved with two new features. Persistent checksum is now available for data residing on Medium Granularity storage pools and is enabled by default on volumes created after the upgrade to 3.5. Additionally, new Partial Disk Error handling prevents the immediate ejection and full rebuild of a drive when only a few sectors fail. This extends the useful life of storage media.
For more information on the PowerFlex 3.5 release, refer to the Getting to Know document in the product documentation bundle found on the Dell EMC support site.

3 PowerFlex asynchronous replication architecture

To understand how replication works, we must first consider the basic architecture of PowerFlex itself.
PowerFlex basic architecture diagram
Servers contributing media to a storage cluster run the Storage Data Server (SDS) software element, which allows PowerFlex to aggregate the media while sharing these resources as one or more unified pools on which logical volumes are created.
Servers consuming storage run the Storage Data Client (SDC), which provides access to the logical volumes via the host SCSI layer. Note that iSCSI is not used; instead, the SDC uses a resilient, load-managing, load-balancing network service that runs on TCP/IP storage networks.
The Metadata Manager (MDM) controls the flow of data through the system but is not in the data path. Instead, it creates and maintains information about volume distribution across the SDS cluster and distributes the mapping to the SDCs, informing them where to place and retrieve data for each part of the address space.
These three base elements comprise the fundamental parts of the best software-defined storage solution available today, one that scales linearly to hundreds of SDS nodes.
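The division of labor between the MDM and the SDC can be sketched in Python as follows. This is a simplified model only: the 1 MiB chunk size, the round-robin placement, and all names are assumptions made for illustration and do not reflect the actual PowerFlex layout.

```python
from dataclasses import dataclass

CHUNK_SIZE = 1 << 20  # hypothetical 1 MiB mapping granularity


@dataclass(frozen=True)
class VolumeMap:
    volume_id: str
    chunk_owners: tuple  # chunk_owners[i] is the SDS holding chunk i


def build_map(volume_id, size_bytes, sds_nodes):
    """MDM role: decide where each part of the address space lives.
    Round-robin placement is an assumption made for this sketch."""
    n_chunks = -(-size_bytes // CHUNK_SIZE)  # ceiling division
    owners = tuple(sds_nodes[i % len(sds_nodes)] for i in range(n_chunks))
    return VolumeMap(volume_id, owners)


def locate(vmap, offset):
    """SDC role: use the distributed map to pick the SDS for an I/O,
    without consulting the MDM on the data path."""
    return vmap.chunk_owners[offset // CHUNK_SIZE]


vmap = build_map("vol1", 4 * CHUNK_SIZE, ["sds-a", "sds-b"])
print(locate(vmap, 0), locate(vmap, CHUNK_SIZE))
```

Because the SDC resolves every I/O from its local copy of the map, the MDM stays out of the data path, which is consistent with the scaling behavior described above.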
When considering architectural options for replication, maintaining the scalability and resiliency of PowerFlex was critical. The replication architecture in PowerFlex is a natural extension of the fundamentals just described.
Figure 6. PowerFlex basic replication architecture
PowerFlex 3.5 introduces a new storage software component called the Storage Data Replicator (SDR). Figure 6 depicts where it fits into the overall PowerFlex architecture. Its role is to proxy the I/O of replicated volumes between the SDC and the SDSs where data is ultimately stored. It splits write I/Os, sending one copy on to the SDSs and another to a replication journal volume. Because it sits between the SDS and the SDC, the SDR appears to the SDS as, and behaves as, an SDC sending writes. Conversely, to the SDC, the SDR appears as, and behaves as, an SDS to which writes can be sent.
The SDR only mediates the flow of traffic for replicated volumes, and as always, the MDM instructs each of these elements where to read and write data. Writes to non-replicated volumes pass directly from the SDC to the SDSs, as before. This is facilitated by the volume mapping that the MDM presents to the SDC, which determines which volumes' data is sent directly to the SDSs and which volumes' data is routed through the SDR and then to the SDSs.
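The write path described above can be modeled with a short Python sketch. It is a toy model under stated assumptions: the class and method names are hypothetical, storage is a dictionary, the journal is a list, and the order in which the SDR issues its two writes is arbitrary here.

```python
class SDS:
    """Stores data blocks (toy model)."""

    def __init__(self):
        self.blocks = {}

    def write(self, volume, offset, data):
        self.blocks[(volume, offset)] = data


class SDR:
    """Proxy for replicated volumes. It exposes the same write interface as
    an SDS, so the SDC can send writes to it, and it forwards writes to the
    SDS in the same way an SDC would."""

    def __init__(self, sds):
        self.sds = sds
        self.journal = []  # stand-in for the replication journal volume

    def write(self, volume, offset, data):
        self.sds.write(volume, offset, data)          # copy 1: local storage
        self.journal.append((volume, offset, data))   # copy 2: journal for replication


class SDC:
    """Routes each write according to the mapping it received from the MDM."""

    def __init__(self, sds, sdr, replicated_volumes):
        self.sds = sds
        self.sdr = sdr
        self.replicated_volumes = set(replicated_volumes)

    def write(self, volume, offset, data):
        target = self.sdr if volume in self.replicated_volumes else self.sds
        target.write(volume, offset, data)


sds = SDS()
sdr = SDR(sds)
sdc = SDC(sds, sdr, {"vol-replicated"})
sdc.write("vol-replicated", 0, b"a")   # goes through the SDR and is journaled
sdc.write("vol-plain", 0, b"b")        # goes straight to the SDS
```

Giving the SDR the same interface as the SDS is what lets the SDC remain unaware of replication: only the mapping changes, not the client's write logic.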

3.1 Journaling and snapshotting

There are two schools of thought concerning how replication is implemented. Many storage solutions leverage a snapshot approach. With snapshots, it is easy to identify the block change delta between two points in time. However, as Recovery Point Objectives (RPOs) get smaller, the number of snapshots increases dramatically, which places hard limits on how small RPOs can be. Instead, PowerFlex uses a journaling-based approach.
Journaling provides the possibility of very small RPOs, and, importantly, it is not constrained by the maximum number of available snapshots in the system, or on a given volume.
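A quick calculation shows why snapshot-based replication struggles with small RPOs. The sketch below assumes, for simplicity, that one snapshot is taken per RPO interval. The function name is hypothetical, and real systems also need time to transfer each delta, which makes the constraint tighter still.

```python
SECONDS_PER_DAY = 86_400


def snapshots_per_day(rpo_seconds: int) -> int:
    """Snapshot cycles needed per volume per day if one snapshot
    is taken every RPO interval (simplifying assumption)."""
    return SECONDS_PER_DAY // rpo_seconds


for rpo in (3600, 60, 15):
    print(f"RPO {rpo:>4} s -> {snapshots_per_day(rpo)} snapshot cycles per day")
```

Moving from a one-hour RPO to a 15-second RPO multiplies the number of snapshot cycles by 240, which is the pressure that a journal avoids by recording writes continuously.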
Checkpoints are maintained in journals, and those journals live as volumes in a storage pool. The journal volumes resize dynamically as writes ebb and flow, so the overall size of the journal buffer will vary over time.

3.2 Journaling space reservations

The reservation size of the journal volume is set by the user and is measured as a percentage of the storage pool in which it is contained. For cluster installations that do not require replication, no journal space reservations are made. This brings up our first system design consideration concerning replication. We need space in one or more storage SSD or NVMe pools to provide a home for journal files. To determine how much space, account for all the write I/O of your replicated volumes including the aggregated write size and rate. Add a margin of safety of 15% for the volume overhead, journal write timestamps, and the journal-