
WHITE PAPER

DELL EMC POWERSCALE ONEFS: A TECHNICAL OVERVIEW

Abstract

This white paper provides technical details on the key features and capabilities of the OneFS operating system that is used to power all Dell EMC PowerScale scale-out NAS storage solutions.

February 2020


Revisions

Version | Date | Comment
1.0 | November 2013 | Initial release for OneFS 7.1
2.0 | June 2014 | Updated for OneFS 7.1.1
3.0 | November 2014 | Updated for OneFS 7.2
4.0 | June 2015 | Updated for OneFS 7.2.1
5.0 | November 2015 | Updated for OneFS 8.0
6.0 | September 2016 | Updated for OneFS 8.0.1
7.0 | April 2017 | Updated for OneFS 8.1
8.0 | November 2017 | Updated for OneFS 8.1.1
9.0 | February 2019 | Updated for OneFS 8.1.3
10.0 | April 2019 | Updated for OneFS 8.2
11.0 | August 2019 | Updated for OneFS 8.2.1
12.0 | December 2019 | Updated for OneFS 8.2.2
13.0 | June 2020 | Updated for OneFS 9.0
14.0 | September 2020 | Updated for OneFS 9.1

Acknowledgements

This paper was produced by the following:

Author:

Nick Trimbee

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable software license.

Copyright © Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, Dell EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries.

Other trademarks may be trademarks of their respective owners.


TABLE OF CONTENTS

Introduction ................................................ 4
OneFS overview .............................................. 4
Platform nodes .............................................. 5
Network ..................................................... 6
OneFS software overview ..................................... 7
File system structure ....................................... 10
Data layout ................................................. 11
File writes ................................................. 12
OneFS caching ............................................... 15
OneFS cache coherency ....................................... 15
Level 1 cache ............................................... 15
Level 2 cache ............................................... 16
Level 3 cache ............................................... 17
File reads .................................................. 20
Locks and concurrency ....................................... 21
Multi-threaded IO ........................................... 21
Data protection ............................................. 22
Node compatibility .......................................... 30
Supported protocols ......................................... 31
Non-disruptive operations - protocol support ................ 32
File filtering .............................................. 32
Data deduplication - SmartDedupe ............................ 32
Small file storage efficiency ............................... 33
In-line Data Reduction ...................................... 34
Interfaces .................................................. 36
Authentication and access control ........................... 37
Active Directory ............................................ 37
Access zones ................................................ 38
Roles based administration .................................. 38
OneFS auditing .............................................. 39
Software upgrade ............................................ 40
OneFS data protection and management software ............... 40
Conclusion .................................................. 41
TAKE THE NEXT STEP .......................................... 42


Introduction

The three layers of the traditional storage model—file system, volume manager, and data protection—have evolved over time to suit the needs of small-scale storage architectures but introduce significant complexity and are not well adapted to petabyte-scale systems. The OneFS operating system replaces all of these, providing a unifying clustered file system with built-in scalable data protection, and obviating the need for volume management. OneFS is a fundamental building block for scale-out infrastructures, allowing for massive scale and tremendous efficiency, and is used to power all Dell EMC PowerScale and Isilon NAS storage solutions.

Crucially, OneFS is designed to scale not just in terms of machines, but also in human terms—allowing large-scale systems to be managed with a fraction of the personnel required for traditional storage systems. OneFS eliminates complexity and incorporates self-healing and self-managing functionality that dramatically reduces the burden of storage management. OneFS also incorporates parallelism at a very deep level of the OS, such that virtually every key system service is distributed across multiple units of hardware. This allows OneFS to scale in virtually every dimension as the infrastructure is expanded, ensuring that what works today will continue to work as the dataset grows.

OneFS is a fully symmetric file system with no single point of failure — taking advantage of clustering not just to scale performance and capacity, but also to allow for any-to-any failover and multiple levels of redundancy that go far beyond the capabilities of RAID. The trend for disk subsystems has been slowly increasing performance alongside rapidly increasing storage densities. OneFS responds to this reality by scaling the amount of redundancy as well as the speed of failure repair. This allows OneFS to grow to multi-petabyte scale while providing greater reliability than small, traditional storage systems.

PowerScale and Isilon hardware provides the appliance on which OneFS executes. Hardware components are best-of-breed, but commodity-based — ensuring the benefits of commodity hardware’s ever-improving cost and efficiency curves. OneFS allows hardware to be incorporated or removed from the cluster at will and at any time, abstracting the data and applications away from the hardware. Data is given infinite longevity, protected from the vicissitudes of evolving hardware generations. The cost and pain of data migrations and hardware refreshes are eliminated.

OneFS is ideally suited for file-based and unstructured “Big Data” applications in enterprise environments including large-scale home directories, file shares, archives, virtualization and business analytics. As such, OneFS is widely used in many data-intensive industries today, including energy, financial services, Internet and hosting services, business intelligence, engineering, manufacturing, media & entertainment, bioinformatics, scientific research and other high-performance computing environments.

Intended Audience

This paper presents information for deploying and managing Dell EMC PowerScale and Isilon clusters and provides a comprehensive background to the OneFS architecture.

The target audience for this white paper is anyone configuring and managing a PowerScale or Isilon clustered storage environment. It is assumed that the reader has a basic understanding of storage, networking, operating systems, and data management.

More information on OneFS commands and feature configuration is available in the OneFS Administration Guide.

OneFS overview

OneFS combines the three layers of traditional storage architectures—file system, volume manager, and data protection—into one unified software layer, creating a single intelligent distributed file system that runs on a OneFS powered storage cluster.


Figure 1: OneFS Combines File System, Volume Manager and Data Protection into One Single Intelligent, Distributed System.

This is the core innovation that directly enables enterprises to successfully utilize scale-out NAS in their environments today. It adheres to the key principles of scale-out: intelligent software, commodity hardware, and distributed architecture. OneFS is not only the operating system but also the underlying file system that drives and stores data in the cluster.

PowerScale and Isilon nodes

OneFS works exclusively with dedicated platform nodes, referred to collectively as a “cluster”. A single cluster consists of multiple nodes, which are rack-mountable enterprise appliances containing memory, CPU, networking, Ethernet or low-latency InfiniBand interconnects, disk controllers, and storage media. As such, each node in the distributed cluster provides both compute and storage (capacity) capabilities.

With the Isilon hardware (‘Gen6’), a single chassis of four nodes in a 4RU (rack unit) form factor is required to create a cluster, which scales up to 252 nodes in OneFS 8.2 and later. The stand-alone node platforms require a minimum of three nodes and 3RU of rack space to form a cluster. There are several different types of nodes, all of which can be incorporated into a single cluster, where different nodes provide varying ratios of capacity to throughput or input/output operations per second (IOPS). OneFS 9.0 also introduces support for the new 1RU all-flash F600 NVMe and F200 PowerScale nodes. Both the traditional Gen6 chassis and the PowerScale stand-alone nodes will happily co-exist within the same cluster.

Each node or chassis added to a cluster increases aggregate disk, cache, CPU, and network capacity. OneFS leverages each of the hardware building blocks, so that the whole becomes greater than the sum of the parts. The RAM is grouped together into a single coherent cache, allowing I/O on any part of the cluster to benefit from data cached anywhere. A file system journal ensures that writes are safe across power failures. Spindles and CPU are combined to increase throughput, capacity and IOPS as the cluster grows, for access to one file or for multiple files. A cluster’s storage capacity can range from tens of terabytes to tens of petabytes. The maximum capacity will continue to increase as storage media and node chassis continue to get denser.

The OneFS powered platform nodes are broken into several classes, or tiers, according to their functionality:

Table 1: Hardware Tiers and Node Types


Network

There are two types of networks associated with a cluster: internal and external.

Back-end network

All intra-cluster, node-to-node communication is performed across a dedicated back-end network, comprising either 10, 40, or 100 Gb Ethernet, or low-latency QDR InfiniBand (IB). This back-end network, which is configured with redundant switches for high availability, acts as the backplane for the cluster. This enables each node to act as a contributor in the cluster and isolates node-to-node communication to a private, high-speed, low-latency network. This back-end network uses Internet Protocol (IP) for node-to-node communication.

Front-end network

Clients connect to the cluster using Ethernet connections (1GbE, 10GbE, 25GbE, or 40GbE) that are available on all nodes. Because each node provides its own Ethernet ports, the amount of network bandwidth available to the cluster scales linearly with performance and capacity. A cluster supports standard network communication protocols to a customer network, including NFS, SMB, HTTP, FTP, HDFS, and S3. Additionally, OneFS provides full integration with both IPv4 and IPv6 environments.

Complete cluster view

The complete cluster combines hardware, software, and networks, as shown in the following view:

Figure 2: All Components of OneFS at Work

The diagram above depicts the complete architecture: software, hardware, and network all working together in your environment with servers to provide a completely distributed single file system that can scale dynamically as workloads and capacity or throughput needs change in a scale-out environment.

OneFS SmartConnect is a load balancer that works at the front-end Ethernet layer to evenly distribute client connections across the cluster. SmartConnect supports dynamic NFS failover and failback for Linux and UNIX clients and SMB3 continuous availability for Windows clients. This ensures that when a node failure occurs, or preventative maintenance is performed, all in-flight reads and writes are handed off to another node in the cluster to finish their operations without any user or application interruption.

During failover, clients are evenly redistributed across all remaining nodes in the cluster, ensuring minimal performance impact. If a node is brought down for any reason, including a failure, the virtual IP addresses on that node are seamlessly migrated to another node in the cluster. When the offline node is brought back online, SmartConnect automatically rebalances the NFS and SMB3 clients across the entire cluster to ensure maximum storage and performance utilization. For periodic system maintenance and software updates, this functionality allows for per-node rolling upgrades, affording full availability throughout the duration of the maintenance window.
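To make the failover behavior above more concrete, the following Python sketch redistributes a failed node's virtual IP addresses evenly across the surviving nodes and could be rerun when the node returns. The node names, addresses, and round-robin placement are hypothetical and do not represent SmartConnect's actual balancing policies or any OneFS API.

from itertools import cycle

def redistribute(vip_map: dict, offline_node: str) -> dict:
    # vip_map: {node: [virtual IPs]}; move the offline node's VIPs to the survivors.
    survivors = [node for node in vip_map if node != offline_node]
    new_map = {node: list(vip_map[node]) for node in survivors}
    for vip, node in zip(vip_map.get(offline_node, []), cycle(survivors)):
        new_map[node].append(vip)
    return new_map

vips = {"node1": ["10.0.0.11"], "node2": ["10.0.0.12"], "node3": ["10.0.0.13"]}
print(redistribute(vips, "node3"))
# -> {'node1': ['10.0.0.11', '10.0.0.13'], 'node2': ['10.0.0.12']}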


Further information is available in the OneFS SmartConnect white paper.

OneFS software overview

Operating system

OneFS is built on a BSD-based UNIX Operating System (OS) foundation. It supports both Linux/UNIX and Windows semantics natively, including hard links, delete-on-close, atomic rename, ACLs, and extended attributes. It uses BSD as its base OS because it is a mature and proven operating system, and the open source community can be leveraged for innovation. From OneFS 8.0 onwards, the underlying OS version is FreeBSD 10.

Client services

The front-end protocols that the clients can use to interact with OneFS are referred to as client services. Please refer to the Supported Protocols section for a detailed list of supported protocols. In order to understand how OneFS communicates with clients, we split the I/O subsystem into two halves: the top half, or ‘initiator’, and the bottom half, or ‘participant’. Every node in the cluster is a participant for a particular I/O operation. The node that the client connects to is the initiator, and that node acts as the ‘captain’ for the entire I/O operation. The read and write operations are detailed in later sections.

Cluster operations

In a clustered architecture, there are cluster jobs that are responsible for taking care of the health and maintenance of the cluster itself—these jobs are all managed by the OneFS job engine. The job engine runs across the entire cluster and is responsible for dividing and conquering large storage management and protection tasks. To achieve this, it reduces a task into smaller work items and then allocates, or maps, these portions of the overall job to multiple worker threads on each node. Progress is tracked and reported on throughout job execution and a detailed report and status is presented upon completion or termination.

Job Engine includes a comprehensive check-pointing system which allows jobs to be paused and resumed, in addition to stopped and started. The Job Engine framework also includes an adaptive impact management system.

The Job Engine typically executes jobs as background tasks across the cluster, using spare or specially reserved capacity and resources. The jobs themselves can be categorized into three primary classes:

File system maintenance jobs

These jobs perform background file system maintenance, and typically require access to all nodes. These jobs are required to run in default configurations, and often in degraded cluster conditions. Examples include file system protection and drive rebuilds.

Feature support jobs

The feature support jobs perform work that facilitates some extended storage management function, and typically only run when the feature has been configured. Examples include deduplication and anti-virus scanning.

User action jobs

These jobs are run directly by the storage administrator to accomplish some data management goal. Examples include parallel tree deletes and permissions maintenance.

The table below provides a comprehensive list of the exposed Job Engine jobs, the operations they perform, and their respective file system access methods:



Job Name | Job Description | Access Method
AutoBalance | Balances free space in the cluster. | Drive + LIN
AutoBalanceLin | Balances free space in the cluster. | LIN
AVScan | Virus scanning job that ICAP or CAVA/CEE server(s) run. | Tree
ChangelistCreate | Create a list of changes between two consecutive SyncIQ snapshots. | Tree
Collect | Reclaims disk space that could not be freed due to a node or drive being unavailable while they suffer from various failure conditions. | Drive + LIN
Dedupe | Deduplicates identical blocks in the file system. | Tree
DedupeAssessment | Dry run assessment of the benefits of deduplication. | Tree
DomainMark | Associates a path and its contents with a domain. | Tree
FlexProtect | Rebuilds and re-protects the file system to recover from a failure scenario. | Drive + LIN
FlexProtectLin | Re-protects the file system. | LIN
FSAnalyze | Gathers file system analytics data that is used in conjunction with InsightIQ. | LIN
IntegrityScan | Performs online verification and correction of any file system inconsistencies. | LIN
MediaScan | Scans drives for media-level errors. | Drive + LIN
MultiScan | Runs Collect and AutoBalance jobs concurrently. | LIN
PermissionRepair | Correct permissions of files and directories. | Tree
QuotaScan | Updates quota accounting for domains created on an existing directory path. | Tree
SetProtectPlus | Applies the default file policy. This job is disabled if SmartPools is activated on the cluster. | LIN
ShadowStoreDelete | Frees space associated with a shadow store. | LIN
SmartPools | Job that runs and moves data between the tiers of nodes within the same cluster. | LIN
SmartPoolsTree | Enforce SmartPools file policies on a subtree. | Tree
SnapRevert | Reverts an entire snapshot back to head. | LIN
SnapshotDelete | Frees disk space that is associated with deleted snapshots. | LIN
TreeDelete | Deletes a path in the file system directly from the cluster itself. | Tree
WormQueue | Scan the SmartLock LIN queue. | LIN

Figure 3: OneFS Job Engine Job Descriptions


Although the file system maintenance jobs are run by default, either on a schedule or in reaction to a particular file system event, any job engine job can be managed by configuring both its priority level (in relation to other jobs) and its impact policy.

An impact policy can consist of one or many impact intervals, which are blocks of time within a given week. Each impact interval can be configured to use a single pre-defined impact level, which specifies the amount of cluster resources to use for a particular cluster operation. The available job engine impact levels are:

Paused

Low

Medium

High

This degree of granularity allows impact intervals and levels to be configured per job, in order to ensure smooth cluster operation. The resulting impact policies dictate when a job runs and the resources it can consume.
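To make the structure of an impact policy more concrete, the following Python sketch models a hypothetical policy as a set of weekly intervals, each carrying one of the pre-defined impact levels listed above. The day groupings, hours, resource fractions, and fallback behavior are assumptions made purely for illustration; they are not OneFS configuration syntax or default values.

from dataclasses import dataclass

# Resource fractions are illustrative placeholders, not OneFS-defined values.
IMPACT_LEVELS = {"Paused": 0.0, "Low": 0.25, "Medium": 0.5, "High": 0.75}

@dataclass
class ImpactInterval:
    days: frozenset   # e.g. frozenset({"Sat", "Sun"})
    start_hour: int   # inclusive, 0-23
    end_hour: int     # exclusive, 1-24
    level: str        # one of IMPACT_LEVELS

# Hypothetical policy: gentle during weekday business hours, aggressive at weekends.
example_policy = [
    ImpactInterval(frozenset({"Mon", "Tue", "Wed", "Thu", "Fri"}), 8, 18, "Low"),
    ImpactInterval(frozenset({"Mon", "Tue", "Wed", "Thu", "Fri"}), 18, 24, "Medium"),
    ImpactInterval(frozenset({"Sat", "Sun"}), 0, 24, "High"),
]

def resource_share(policy, day, hour):
    # Return the fraction of cluster resources a job may consume at this time.
    for interval in policy:
        if day in interval.days and interval.start_hour <= hour < interval.end_hour:
            return IMPACT_LEVELS[interval.level]
    return IMPACT_LEVELS["Paused"]   # outside every interval: assume paused here

print(resource_share(example_policy, "Sat", 14))   # 0.75
print(resource_share(example_policy, "Wed", 10))   # 0.25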

Additionally, job engine jobs are prioritized on a scale of one to ten, with a lower value signifying a higher priority. This is similar in concept to the UNIX scheduling utility, ‘nice’.

The job engine allows up to three jobs to be run simultaneously. This concurrent job execution is governed by the following criteria, illustrated in the sketch after this list:

Job Priority

Exclusion Sets - jobs which cannot run together (for example, FlexProtect and AutoBalance)

Cluster health - most jobs cannot run when the cluster is in a degraded state.
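The Python sketch below is a simplified illustration of this selection process: candidate jobs are ordered by priority (lower number first), jobs that share an exclusion set with an already-selected job are skipped, most jobs are deferred while the cluster is degraded, and at most three jobs run at once. The job names, exclusion-set membership, and the set of jobs allowed to run on a degraded cluster are example assumptions, not the complete OneFS lists.

# Illustrative sketch only; not the actual Job Engine scheduler.
EXCLUSION_SETS = [
    {"FlexProtect", "AutoBalance", "MediaScan"},   # example restripe-style exclusion set
    {"TreeDelete", "PermissionRepair"},            # example of a second exclusion set
]

DEGRADED_SAFE = {"FlexProtect"}   # assumption: repair jobs may run on a degraded cluster
MAX_CONCURRENT = 3

def pick_jobs(candidates, cluster_degraded=False):
    # candidates: list of (job_name, priority); lower priority value = higher priority.
    running = []
    for name, priority in sorted(candidates, key=lambda c: c[1]):
        if cluster_degraded and name not in DEGRADED_SAFE:
            continue   # most jobs cannot run while the cluster is degraded
        conflict = any(name in exc and other in exc
                       for exc in EXCLUSION_SETS
                       for other, _ in running)
        if not conflict and len(running) < MAX_CONCURRENT:
            running.append((name, priority))
    return [name for name, _ in running]

print(pick_jobs([("MediaScan", 8), ("FlexProtect", 1), ("SnapshotDelete", 2)]))
# -> ['FlexProtect', 'SnapshotDelete']; MediaScan is held back because it shares
#    an exclusion set with the already-selected FlexProtect job.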


Figure 4: OneFS Job Engine Exclusion Sets

Further information is available in the OneFS Job Engine white paper.

File system structure

The OneFS file system is based on the UNIX file system (UFS) and, hence, is a very fast distributed file system. Each cluster creates a single namespace and file system. This means that the file system is distributed across all nodes in the cluster and is accessible by clients connecting to any node in the cluster. There is no partitioning, and no need for volume creation. Instead of limiting access to free space and to non-authorized files at the physical volume level, OneFS provides the same functionality in software via share and file permissions, and via the SmartQuotas service, which provides directory-level quota management.

Further information is available in the OneFS SmartQuotas white paper.

Because all information is shared among nodes across the internal network, data can be written to or read from any node, thus optimizing performance when multiple users are concurrently reading and writing to the same set of data.


Figure 5: Single File System with Multiple Access Protocols

OneFS is truly a single file system with one namespace. Data and metadata are striped across the nodes for redundancy and availability. The storage has been completely virtualized for the users and administrator. The file tree can grow organically without requiring planning or oversight about how the tree grows or how users use it. No special thought has to be applied by the administrator about tiering files to the appropriate disk, because OneFS SmartPools will handle that automatically without disrupting the single tree. No special consideration needs to be given to how one might replicate such a large tree, because the OneFS SyncIQ service automatically parallelizes the transfer of the file tree to one or more alternate clusters, without regard to the shape or depth of the file tree.

This design should be compared with namespace aggregation, which is a commonly-used technology to make traditional NAS “appear” to have a single namespace. With namespace aggregation, files still have to be managed in separate volumes, but a simple “veneer” layer allows for individual directories in volumes to be “glued” to a “top-level” tree via symbolic links. In that model, LUNs and volumes, as well as volume limits, are still present. Files have to be manually moved from volume-to-volume in order to load-balance. The administrator has to be careful about how the tree is laid out. Tiering is far from seamless and requires significant and continual intervention. Failover requires mirroring files between volumes, driving down efficiency and ramping up purchase cost, power and cooling. Overall, the administrator burden when using namespace aggregation is higher than it is for a simple traditional NAS device. This prevents such infrastructures from growing very large.

Data layout

OneFS uses physical pointers and extents for metadata and stores file and directory metadata in inodes. OneFS logical inodes (LINs) are typically 512 bytes in size, which allows them to fit into the native sectors with which the majority of hard drives are formatted. However, in OneFS 8.0 and onward, support is also provided for 8KB inodes, in order to support the denser classes of hard drive which are now formatted with 4KB sectors.

B-trees are used extensively in the file system, allowing scalability to billions of objects and near-instant lookups of data or metadata. OneFS is a completely symmetric and highly distributed file system. Data and metadata are always redundant across multiple hardware devices. Data is protected using erasure coding across the nodes in the cluster, creating a cluster with high efficiency: 80% or better raw-to-usable capacity on clusters of five nodes or more. Metadata (which makes up generally less than 1% of the system) is mirrored in the cluster for performance and availability. As OneFS is not reliant on RAID, the amount of redundancy is selectable by the administrator, at the file or directory level, beyond the defaults of the cluster. Metadata access and locking tasks are managed by all nodes collectively and equally in a peer-to-peer architecture. This symmetry is key to the simplicity and resiliency of the architecture. There is no single metadata server, lock manager or gateway node.


Because OneFS must access blocks from several devices simultaneously, the addressing scheme used for data and metadata is indexed at the physical level by a tuple of {node, drive, offset}. For example, if 12345 were a block address for a block that lived on disk 2 of node 3, then it would read {3,2,12345}. All metadata within the cluster is multiply mirrored for data protection, at least to the level of redundancy of the associated file. For example, if a file were at an erasure-code protection of “+2n”, implying the file could withstand two simultaneous failures, then all metadata needed to access that file would be 3x mirrored, so it too could withstand two failures. The file system inherently allows for any structure to use any and all blocks on any nodes in the cluster.
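As a hedged sketch of the addressing and mirroring arithmetic described above, the snippet below models a block address as a {node, drive, offset} tuple and derives the metadata mirror count from a simple “+Nn” protection level. The protection-string parsing and function names are assumptions made for illustration; they are not OneFS internals.

from collections import namedtuple

# A block address is indexed by node, drive, and offset on that drive.
BlockAddress = namedtuple("BlockAddress", ["node", "drive", "offset"])

# Block 12345 on drive 2 of node 3, as in the example above:
addr = BlockAddress(node=3, drive=2, offset=12345)
print(addr)   # BlockAddress(node=3, drive=2, offset=12345)

def metadata_mirror_count(protection: str) -> int:
    # For an erasure-code protection such as "+2n" (file survives 2 failures),
    # metadata for that file is mirrored failures + 1 times, so it can also
    # withstand the same number of failures. Only the simple "+Nn" form is parsed.
    failures = int(protection.strip("+n"))
    return failures + 1

print(metadata_mirror_count("+2n"))   # 3 -> metadata is 3x mirrored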

Other storage systems send data through RAID and volume management layers, introducing inefficiencies in data layout and providing non-optimized block access. OneFS controls the placement of files directly, down to the sector level on any drive anywhere in the cluster. This allows for optimized data placement and I/O patterns and avoids unnecessary read-modify-write operations. By laying data on disks in a file-by-file manner, OneFS is able to flexibly control the type of striping as well as the redundancy level of the storage system at the system, directory, and even file levels. Traditional storage systems would require that an entire RAID volume be dedicated to a particular performance type and protection setting. For example, a set of disks might be arranged in a RAID 1+0 protection for a database. This makes it difficult to optimize spindle use over the entire storage estate (since idle spindles cannot be borrowed) and also leads to inflexible designs that do not adapt to changing business requirements. OneFS allows for individual tuning and flexible changes at any time, fully online.

File writes

The OneFS software runs on all nodes equally - creating a single file system that runs across every node. No one node controls the cluster; all nodes are true peers.

Figure 6: Model of Node Components Involved in I/O

If we were to look at all the components within every node of a cluster that are involved in I/O from a high level, it would look like Figure 6 above. We have split the stack into a “top” layer, called the Initiator, and a “bottom” layer, called the Participant. This division is used as a “logical model” for the analysis of any one given read or write. At a physical level, CPUs and RAM cache in the nodes are simultaneously handling Initiator and Participant tasks for I/O taking place throughout the cluster. There are caches and a distributed lock manager that are excluded from the diagram above to keep it simple. They will be covered in later sections of the paper.

When a client connects to a node to write a file, it is connecting to the top half or Initiator of that node. Files are broken into smaller logical chunks called stripes before being written to the bottom half or Participant of a node (disk). Failure-safe buffering using a write coalescer is used to ensure that writes are efficient and read-modify-write operations are avoided. The size of each file chunk is referred to as the stripe unit size.


OneFS stripes data across all nodes—and not simply across disks—and protects the files, directories and associated metadata via software erasure-code or mirroring technology. For data, OneFS can use (at the administrator’s discretion) either the Reed-Solomon erasure coding system for data protection, or (less commonly) mirroring. Mirroring, when applied to user data, tends to be used more for high-transaction performance cases. The bulk of user data will generally use erasure coding, as it provides extremely high performance without sacrificing on-disk efficiency. Erasure coding can provide beyond 80% efficiency on raw disk with five nodes or more, and on large clusters can even do so while providing quadruple-level redundancy. The stripe width for any given file is the number of nodes (not disks) that a file is written across. It is determined by the number of nodes in the cluster, the size of the file, and the protection setting (for example, +2n).
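A minimal arithmetic sketch of the efficiency claim above, assuming a simple N+M erasure-code layout in which each stripe contains N data units and M protection (FEC) units. Real OneFS layout decisions also account for file size, node count, and protection policy, so treat this purely as an illustration of the raw-to-usable figures.

# Per-stripe raw-to-usable efficiency for a simple N+M erasure-code layout.
def raw_to_usable(data_units: int, protection_units: int) -> float:
    return data_units / (data_units + protection_units)

# Five nodes at +1n: a stripe of 4 data units plus 1 FEC unit.
print(f"{raw_to_usable(4, 1):.0%}")    # 80%

# A wider stripe on a larger cluster at +2n: e.g. 16 data units plus 2 FEC units.
print(f"{raw_to_usable(16, 2):.0%}")   # 89%

# Contrast with 2x mirroring, which cannot exceed 50% efficiency.
print(f"{raw_to_usable(1, 1):.0%}")    # 50%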

OneFS uses advanced algorithms to determine data layout for maximum efficiency and performance. When a client connects to a node, that node’s initiator acts as the “captain” for the write data layout of that file. Data, erasure code (ECC) protection, metadata and inodes are all distributed on multiple nodes within a cluster, and even across multiple drives within nodes.

Figure 7 below shows a file write happening across all nodes in a three-node cluster.

Figure 7: A File Write Operation on a 3-node Cluster

OneFS uses the back-end network to allocate and stripe data across all nodes in the cluster automatically, so no additional processing is required. As data is being written, it is being protected at the specified level. When writes take place, OneFS divides data out into atomic units called protection groups. Redundancy is built into protection groups, such that if every protection group is safe, then the entire file is safe. For files protected by erasure codes, a protection group consists of a series of data blocks as well as a set of erasure codes for those data blocks; for mirrored files, a protection group consists of all of the mirrors of a set of blocks. OneFS is capable of switching the type of protection group used in a file dynamically, as it is writing. This allows several additional capabilities. For example, the system can continue without blocking in situations when temporary node failures in the cluster would prevent the desired number of erasure codes from being used. Mirroring can be used temporarily in these cases to allow writes to continue. When nodes are restored to the cluster, these mirrored protection groups are converted back seamlessly and automatically to erasure-code protection, without administrator intervention.
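The following Python sketch is a hypothetical illustration of that fallback behavior: if fewer nodes are currently available than the desired erasure-code stripe requires, the write proceeds with mirrored protection groups that tolerate the same number of failures, and can later be converted back. The function, thresholds, and return values are assumptions for illustration only, not OneFS logic.

def choose_protection_group(available_nodes: int, data_units: int, fec_units: int):
    # Prefer the requested erasure-code layout; fall back to mirroring when a
    # temporary node outage makes the full stripe impossible, as described above.
    if available_nodes >= data_units + fec_units:
        return ("erasure", data_units, fec_units)
    # Mirror enough copies to still tolerate the same number of failures.
    return ("mirror", fec_units + 1)

print(choose_protection_group(available_nodes=5, data_units=4, fec_units=2))
# -> ('mirror', 3): not enough nodes for a 4+2 stripe, so write 3x mirrors for now.
print(choose_protection_group(available_nodes=6, data_units=4, fec_units=2))
# -> ('erasure', 4, 2): the full stripe fits, so write erasure-coded groups.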

