QLogic Mt. Rainier Technology
Accelerates the Enterprise
Shared, Scalable, and Transparent Caching for
Network Storage
Key Findings
A large and growing gap exists between the information access performance demands of critical applications running on high-performance servers and the performance capacity of subsystems that are based on traditional, mechanical storage. QLogic® findings indicate that:
• Flash-based caching offers the promise of substantial application performance improvements.
• The placement of Flash-based caches within the storage network profoundly impacts whether, and to what degree, those promises
can be realized. The current market offerings force architects to make difficult trade-offs.
• The QLogic Mt. Rainier technology delivers a unique, shared, server-based caching solution that delivers scalability and performance,
guarantees cache coherence, and improves the economics of enterprise-wide caching adoption.
Executive Summary
With Mt. Rainier technology, QLogic delivers a set of unique solutions optimized to address the growing performance gap between what the processor can compute and what the storage I/O subsystem can deliver. With a very simple deployment model, this approach seamlessly combines enterprise server I/O connectivity with shared, server-based I/O caching. Mt. Rainier delivers dramatic and, perhaps most importantly, smoothly scalable application performance improvements to the widest range of enterprise applications. In combination with the QLogic Cache Optimization Protocol™ (QCOP), these performance benefits are transparently extended to today's most demanding active-active clustered environments. This white paper provides a high-level introduction to the QLogic Mt. Rainier technology.
Introduction
Increased server performance, higher virtual machine density, advances in
network bandwidth, and more demanding business application workloads
create a critical I/O performance imbalance between servers, networks,
and storage subsystems. Storage I/O is the primary performance bottleneck
for most virtualized and data-intensive applications. While processor and
memory performance have grown in step with Moore’s Law (getting faster
and smaller), storage performance has lagged far behind, as shown in
Figure 1. This performance gap is further widened by the rapid growth in
data volumes that most organizations are experiencing today. For example,
IDC predicts that the volume of data in the digital universe will
increase by a factor of 44 over the next 10 years.
Figure 1. Growing Disparity Between CPU and Disk-Based Storage Performance
Following industry best practices, storage has been consolidated, centralized, and located on storage networks (for example, Fibre Channel, Fibre Channel over Ethernet [FCoE], and iSCSI) to enhance efficiency, compliance, and data protection. However, networked storage designs add many new points where latency can arise. Latency increases response times, reduces application access to information, and decreases overall performance. Simply put, any over-subscribed port in a network can become a point of congestion, as shown in Figure 2.
Figure 2. Sources of Latency on Storage Networks
As application workloads and virtual machine densities increase, so does the pressure on these potential hotspots, and with it the time required to access critical application information. Slower storage response times result in lower application performance, lost productivity, more frequent service disruptions, reduced customer satisfaction, and, ultimately, a loss of competitive advantage.
Over the past decade, IT organizations and suppliers have employed several approaches to relieve congested storage networks and avoid the risks and costly consequences of reduced access to information and the resulting underperforming applications.
The Traditional Approach: Refresh Storage Infrastructure
The traditional approach to meeting increased demands on enterprise
storage infrastructure is to periodically replace or “refresh” the storage
arrays with newer products. These infrastructure upgrades tend to focus on
higher-performance storage array controllers and disk drives that spread
data wider across a larger number of storage channels, increase the number
of array front- and back-end storage ports available, and increase network
bandwidth. Implementing a well-designed infrastructure refresh delivers
improved system performance, but also introduces significant costs and
risks. For example, installing new arrays requires migrating existing data
to those new arrays, and this migration generally requires a minimum of
one or two outages per attached server. Furthermore, as the sheer volume
of the data involved in these migrations grows, migration jobs take longer,
and cost, complexity, and risk all increase. Because performance demands are expected to keep growing geometrically, the improvements delivered by these "big bang" infrastructure upgrades are inherently temporary. The dynamic growth of application workloads at the edge of comparatively static storage networks and arrays eventually outstrips any feasible configuration at the core of those networks. This inherent guarantee of obsolescence results in excessive spending to optimize storage performance at the expense of efficient capacity, and it typically drives infrastructure refresh cycles every three to five years.
A New Option: Deploy Flash Memory
In the last few years, Flash memory has emerged as a valuable tool for
increasing storage performance. Flash memory outperforms rotating
magnetic media by orders of magnitude when processing random I/O
workloads. Because Flash is a new and rapidly expanding semiconductor technology, QLogic expects it, unlike mechanical disk drives, to track a Moore's Law-style curve of performance and capacity advances.
Flash memory has been packaged primarily as solid-state disk (SSD) drives, a form factor that simplified and accelerated early adoption.
Although originally packaged to be plug-compatible with traditional,
rotating, magnetic media disk drives, SSDs are now available in additional
form factors, most notably server-based PCI Express® boards.
SSD Caching Versus Tier 0 Data Storage
The defining characteristic of SSDs is that—independent of physical
form factor—they are accessed as if they are traditional disk drives. The
compatible behavior of SSDs enabled their rapid adoption as an alternative
to 10K to 15K RPM disk drives. In high-end, mission-critical applications,
the much higher performance of SSDs—coupled with much lower power
and cooling requirements—largely offset their initially high prices. As SSD
prices have decreased and capacities have increased, SSDs deployed as
primary storage have seen accelerated adoption for a relatively small set of
performance-critical applications.
Array-Based SSD Caching
Initial deployments of SSD caching were delivered by installing SSDs,
along with the required software and firmware functionality, within shared
storage arrays. Due to the plug-compatibility of early SSDs, these initial
implementations did not require extensive modifications to existing array
hardware or software and, in many cases, were available as upgrades to
existing equipment.
Advantages
Applying SSD caching to improve performance inside storage arrays offers
several advantages that closely parallel the fundamental advantages of
centralized network-attached storage arrays: efficient sharing of a valuable resource, preservation of existing data protection regimes, and a single point of change that leaves the existing network topology intact.
Drawbacks
Adding SSD caching to storage arrays requires upgrading and, in some
cases, replacing existing arrays (including data migration effort and
risk). Even if all of the disk drives are upgraded to SSDs, the expected
performance benefit is not fully realizable due to contention-induced
latency at over-subscribed network and array ports (see Figure 2). The
performance benefits of SSD caching in storage arrays may be short-lived, and performance may not scale smoothly. The initial per-server
performance improvements will decrease over time as the overall demands
on the arrays and storage networks increase with growing workloads, and
with server and virtual server attach rates.
Caching Appliances
Caching appliances—relatively new additions to storage networking—are
network-attached devices that are inserted into the data path between
servers and primary storage arrays.
Advantages
Like array-based caching, caching appliances efficiently share relatively
expensive and limited resources, but do not require upgrades to existing
arrays. Because these devices are independent of the primary storage
arrays, they can be distributed to multiple locations within a storage
network to optimize performance for specific servers or classes of servers.
Drawbacks
Like arrays, caching appliances are vulnerable to
congestion in the storage network and at busy appliance ports. The
appliance approach offers better scalability than array-based caching
because it is conceptually simple to add incremental appliances into a
storage network. However, each additional appliance represents a large
capital outlay, network topology changes, and outages.
In contrast to array-based caching, caching appliances are new elements in
the enterprise IT environment and require changes to policies, procedures,
run books, and staff training.
Lastly, bus and memory bandwidth limitations of the industry-standard
components at the heart of caching appliances restrict their ability to
smoothly scale performance. Because these appliances sit in-band on
shared network storage links, infrastructure architects and managers
should be concerned about the real-world scalability and stability of these
devices.
Server-Based Caching
The final option for SSD caching placement is at the server edge of the
storage network, directly attached to I/O intensive servers.
Advantages
Adding large caches to high I/O servers places the cache in a position
where it is insensitive to congestion in the storage infrastructure. The
cache is also in the best position to integrate application understanding
to optimize application performance. Server-based caching requires no storage array upgrades and no additional appliances in the data path of critical networks, and it lets storage I/O performance scale smoothly with increasing application demands. As a side benefit, by servicing a large
percentage of the I/O demand of critical servers at the network edge, SSD
caching in servers effectively reduces the demand on storage networks
and arrays. This demand reduction improves storage performance for
other attached servers, and can extend the useful life of existing storage
infrastructure.
Drawbacks
While the current implementations of server-based SSD caching are
very effective at improving the performance of individual servers (that is,
improving point performance), providing storage acceleration across a
broad range of applications in a storage network is beyond their reach.
However, the ever-increasing pressure on IT organizations to do more
with less dictates that, to be supportable, new configurations must be as
generally applicable as possible.
Several serious drawbacks exist with server-based SSD caching, as
currently deployed, including:
• They do not work for today's most important clustered applications and environments.
• They create silos of captive SSD, which makes achieving a specific performance level with SSD caching much more expensive.
• Their complex layers of driver software increase interoperability risks and consume server processor and memory resources.
To understand the clustering problem, consider the conditions required for current server-based caching solutions to succeed, starting with the idealized read-caching scenario illustrated in Figure 3.
Figure 3. Server Caching Reads from a Shared LUN
Figure 3 shows server-based Flash caching deployed on two servers that
are reading from overlapping regions on a shared LUN. On the initial reads,
the read data is returned to the requestor and also saved in the server-based cache for that LUN. All subsequent reads of the cached regions are
serviced from the server-based caches, providing a faster response and
reducing the workload on the storage network and arrays.
This scenario works very well, provided that read-only access can be guaranteed. However, if either server in Figure 3 executes a write I/O to the shared regions of the LUN, cache coherence is lost and the result is nearly certain data corruption, as shown in Figure 4.
Figure 4. Server Write to a Shared LUN Destroys Cache Coherency
In this case, one of the servers (Server 2) has written back data to the shared
LUN. However, without a mechanism to support coordination between the
server-based caches, Server 1 continues to read and process now-invalid
data from its own local cache. Furthermore, if Server 1 proceeds to write
processed data back to the shared LUN, data previously written by Server 2
is overwritten with logically corrupt data from Server 1. This corruption
occurs even if both servers are members of a host or application cluster
because, by design, server-based caching is transparent to servers and
applications.
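The failure is easy to reproduce in miniature. The following Python sketch is purely illustrative (it is not QLogic code): it models two servers with independent, uncoordinated read caches in front of one shared LUN, and shows Server 1 returning stale data after Server 2 writes, exactly as in Figure 4.

    class SharedLUN:
        """Stands in for a region of a shared LUN on the array."""
        def __init__(self):
            self.blocks = {}
        def read(self, lba):
            return self.blocks.get(lba, b"\x00")
        def write(self, lba, data):
            self.blocks[lba] = data

    class ServerCache:
        """Per-server read cache with no cross-server coordination."""
        def __init__(self, lun):
            self.lun = lun
            self.cache = {}
        def read(self, lba):
            if lba not in self.cache:          # miss: fetch from the shared LUN
                self.cache[lba] = self.lun.read(lba)
            return self.cache[lba]             # hit: served from local Flash
        def write(self, lba, data):            # write-through, but no peer invalidation
            self.cache[lba] = data
            self.lun.write(lba, data)

    lun = SharedLUN()
    server1, server2 = ServerCache(lun), ServerCache(lun)
    lun.write(100, b"v1")
    assert server1.read(100) == b"v1"          # both servers now cache LBA 100
    assert server2.read(100) == b"v1"
    server2.write(100, b"v2")                  # Server 2 updates the shared LUN...
    print(server1.read(100))                   # ...but prints b'v1': coherence is lost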
The scenario illustrated in Figure 4 assumes a write-through cache strategy
where data is synchronously written to the shared storage subsystem. An
equally dangerous situation arises if a write-back strategy is implemented
where writes to the shared LUN are cached.
Figure 5 shows the cache for two servers configured for write-back
caching. In this example, as soon as Server 1 performs a write operation,
cache coherence is lost because Server 2 has no indication that any data
has changed. Even if Server 2 has not yet cached the region that Server 1
has modified, if it does so before Server 1 flushes the write cache, the data
read from the shared LUN becomes invalid and logically corrupt data is
again processed.
Figure 5. Cached Server Writes Destroy Cache Coherency
The Challenge
In the current state of SSD caching technology, strong arguments exist both for and against placing these caches in storage arrays, appliances, and servers. This situation compels system architects to weigh complex and potentially painful trade-offs, and it greatly limits their ability to broadly apply the benefits of SSD caching.
An approach that delivers the scalable and sustainable performance advantages of server-based SSD caching, combined with the cache coherence and efficient resource allocation of array- and appliance-based caching, is essential. Delivering the performance benefits of SSD caching broadly requires a flexible, affordable, and scalable way to standardize configurations.
The QLogic Approach
To address this challenge, QLogic has leveraged the company's core expertise in network storage and multiprotocol SAN adapters, combined with over five years of developing and delivering high-performance enterprise data mobility solutions. Building on solutions that reliably move mission-critical applications and data over multiple storage protocols, QLogic developed the Mt. Rainier technology with QCOP. Mt. Rainier is a lightweight, elegant, and effective solution to the complex problem of critical I/O performance imbalance. With Mt. Rainier and QCOP, QLogic delivers a solution that simultaneously eliminates the threat of data corruption due to loss of cache coherence and enables efficient, cost-effective pooling of SSD cache resources among servers. Mt. Rainier combines the performance and scalability of server-based SSD caching with the economic efficiency, central management, and transparent support for operating systems and enterprise applications characteristic of appliance- and array-based SSD caching. QLogic enables IT organizations to specify fast, reliable, infrastructure-compatible, and cost-effective SSD caching as a standard for I/O-intensive servers.
Theory
The QLogic Mt. Rainier technology is a new class of host-based, intelligent,
I/O optimization engines that provide integrated storage network
connectivity, an SSD interface, and the embedded processing required to
make all SSD management and caching tasks entirely transparent to the
host. The only host-resident software required for Mt. Rainier operation
is a host operating system-specific driver. All “heavy lifting” is performed
transparently onboard Mt. Rainier by the embedded multicore processor.
The two most important mechanisms behind SSD caching with QLogic Mt. Rainier are Mt. Rainier clustering and LUN cache ownership.
Mt. Rainier Clustering
Mt. Rainier clustering creates a logical group that provides a single point of management and maintains cache coherence with high availability
and optimal allocation of cache resources. When Mt. Rainier clusters are
formed, one cluster member is assigned as the cluster control primary and
another as the cluster control secondary. The cluster control primary is
responsible for processing management and configuration requests, while
the cluster control secondary provides high availability as a passive backup
for the cluster control primary.
Figure 6 shows a four-node cache adapter cluster defined with member #1
as the cluster control primary and cluster member #2 as the cluster control
secondary (cluster members #3 and #4 have no cluster control functions).
Figure 6. Cluster Control Primary and Secondary Members
When a primary fails or is otherwise unable to communicate with the other
members of a cluster, the secondary is promoted to primary, and then
assigns a new secondary, as shown in Figure 7.
Figure 7. Cluster Control Primary Failure Recovery Sequence
When the former primary comes back online and rejoins the cluster, it
learns of the new primary and secondary configuration, and then rejoins as
a regular, non-control member, as shown in Figure 8.
Figure 8. Failed Cluster Control Primary Rejoins the Cluster as Non-Control Member
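The role transitions in Figures 7 and 8 reduce to a small state machine. The Python sketch below is a simplified model of the behavior described above, not QLogic's implementation; the member names and methods are invented for illustration.

    class AcceleratorCluster:
        def __init__(self, members):
            self.members = list(members)
            self.primary = members[0]          # cluster control primary
            self.secondary = members[1]        # passive backup for the primary
        def on_member_failed(self, node):
            self.members.remove(node)
            if node == self.primary:
                self.primary = self.secondary  # promote the secondary (Figure 7)
                self.secondary = self._pick_new_secondary()
            elif node == self.secondary:
                self.secondary = self._pick_new_secondary()
        def _pick_new_secondary(self):
            # The (new) primary assigns a new secondary from the remaining members.
            return next(m for m in self.members if m != self.primary)
        def on_member_rejoined(self, node):
            # A former primary learns the current configuration and rejoins
            # as a regular, non-control member (Figure 8).
            self.members.append(node)

    cluster = AcceleratorCluster(["MR1", "MR2", "MR3", "MR4"])
    cluster.on_member_failed("MR1")            # MR2 is promoted, MR3 becomes secondary
    cluster.on_member_rejoined("MR1")          # MR1 returns with no control role
    print(cluster.primary, cluster.secondary)  # MR2 MR3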
Independent of cluster control functions, pairs of Mt. Rainier cluster
members can also cooperate to provide synchronous mirroring of cache or
SSD data LUNs. In these relationships, only one member of the pair actively
serves requests and synchronizes data to the mirror partner (that is, the
mirroring relationship is active-passive).
LUN Cache Ownership
LUN cache ownership guarantees that, at any time, only one Mt. Rainier is actively caching any specified LUN. Any member of a Mt. Rainier cluster that requires access to a LUN that is cached by another cluster member redirects the I/O to that LUN's cache owner, as illustrated in Figure 9.
In Figure 9, QLogic Mt. Rainiers are installed in both servers and are clustered together. Server 2 is then configured as the LUN cache owner for a shared LUN. When Server 1 needs to read or write data on that shared LUN, participation in the storage accelerator cluster tells the Mt. Rainier on Server 1 to redirect I/O to the Mt. Rainier on Server 2. This I/O redirection:
• Guarantees cache coherence.
• Is completely transparent to servers and applications.
• Works for both read and write caching.
• Consumes no host processor or memory resources on either server.
These characteristics are essential to the active-active clustering
applications and environments that are most critical to enterprise
information processing. Without guaranteed cache coherence, applications
such as VMware® ESX® clusters, Oracle® Real Application Clusters (RAC),
and those that rely on the Microsoft® Windows® Failover Clustering feature
simply cannot run with server-based SSD caching. Application transparency
enables painless adoption and the shortest path to the benefits of cache-accelerated performance. For active-active applications with "bursty"
write profiles, write-back caching offers enormous performance benefits.
Encapsulating all SSD processing within Mt. Rainiers leaves host processor
and memory resources available to increase high-value application
processing density (for example, more transparent page sharing, more
virtual machines, and so on).
Because only one Mt. Rainier is ever actively caching a LUN, and all other
members of the accelerator cluster process all I/O requests for that LUN
through that LUN cache owner, all storage accelerator cluster members
work on the same copy of data. Cache coherence is maintained without
the complexity and overhead of coordinating multiple copies of the same
data. Furthermore, by enforcing a single-copy-of-data policy, QLogic Mt.
Rainier clustering efficiently uses the SSD cache; intra-cluster cache
coordination allows more storage data to be cached with fewer SSD
resources consumed. This cache utilization efficiency greatly improves the
economics of server cache deployment and, combined with guaranteed
cache coherence, enables adoption of server-based SSD caching as a
standard configuration. Because the caches are server-based, with proper
configuration they can take advantage of the abundant available bandwidth
at the edges of storage networks, allowing them to avoid the primary
sources of network latency: congestion and resource contention.
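As a rough sketch of the ownership rule, the Python model below (an illustrative assumption, not QCOP itself) routes every read and write for a LUN through that LUN's single cache owner, so all members see one copy of the cached data and coherence holds by construction.

    class OwnershipCluster:
        def __init__(self):
            self.owner_of = {}                     # LUN id -> owning cluster member
            self.caches = {}                       # member -> {(lun, lba): data}
        def assign_owner(self, lun, member):
            self.owner_of[lun] = member
            self.caches.setdefault(member, {})
        def write(self, requester, lun, lba, data):
            owner = self.owner_of[lun]             # any member's write is redirected
            self.caches[owner][(lun, lba)] = data  # to the owner's cache
        def read(self, requester, lun, lba):
            owner = self.owner_of[lun]             # reads are served from the same
            return self.caches[owner].get((lun, lba))  # single cached copy

    cluster = OwnershipCluster()
    cluster.assign_owner("LUN7", "Server2")        # Server 2 owns the shared LUN
    cluster.write("Server1", "LUN7", 100, b"v2")   # Server 1's write goes to the owner
    print(cluster.read("Server1", "LUN7", 100))    # b'v2': coherent by construction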
Managing Mt. Rainier Clusters
All Mt. Rainier configuration, management, and data collection
communication takes place in-band on the existing storage network; no
additional connectivity or configuration is required. QLogic provides a Web
console, as well as scriptable command line and server virtualization plug-in management tools.
Common management tasks, such as creating a Mt. Rainier cluster,
adding or removing cluster members, configuring cache parameters,
and configuring SSD data LUNs are easily accomplished by running the
appropriate wizards within the QLogic QConvergeConsole® (QCC) or an environment-specific plug-in. Optionally, the scriptable host command line
interface (HCLI) may also be used to perform any operation.
Standard management tools provide advanced operations to enable cluster
maintenance while maintaining continuous availability. For instance, if a
server with a cluster member needs to be taken down for maintenance,
all of its SSD data LUNs, SAN LUN caching, and mirroring responsibilities
can be seamlessly migrated to other cluster members. When the server is
back online, the SSD data LUNs, caching, and mirroring responsibilities can
optionally be migrated back.
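Such a maintenance drain would normally be scripted. The Python sketch below is hypothetical: the hcli subcommands and flags are placeholders invented for illustration, because this paper does not document the actual HCLI syntax.

    import subprocess

    def drain_member(member):
        # Placeholder command: migrate the member's SSD data LUNs, SAN LUN
        # caching, and mirroring duties to other cluster members.
        subprocess.run(["hcli", "cluster", "migrate", "--from", member], check=True)

    def restore_member(member):
        # Placeholder command: optionally migrate the responsibilities back.
        subprocess.run(["hcli", "cluster", "migrate", "--to", member], check=True)

    drain_member("server3")    # before taking server3 down for maintenance
    # ... perform the maintenance and bring server3 back online ...
    restore_member("server3")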
Configuration
The only topological requirement for establishing a cache adapter cluster is
that all members of that cluster must be able to “see” each other over two
independent storage network fabrics, as illustrated in Figure 10.
Figure 10. Storage Acceleration Cluster with All Mt. Rainiers Attached to the Same Edge Switch
For Fibre Channel SANs, all cluster members must be zoned together
in each fabric to which they are attached. To optimize performance, the
recommended best practice is to configure clusters with all members
connected to the same pair of top-of-rack or end-of-row switches. This
topology is a good match for a rack- or row-oriented storage network
layout, providing both direct and indirect system performance benefits.
Intra-cluster communication takes advantage of the availability of excess
bandwidth at the storage network edge to optimize storage accelerator
cluster performance. Much of the high-load storage I/O traffic is removed
from the network core, thus benefiting other, non-accelerated storage I/O
requirements.
Integration Tools
To support automation and integration with third-party tools, QLogic
provides full access to all cache configuration and policy options through
the host CLI, as well as an API.
Host Command Line Interface (HCLI)
Any configuration, management, or data-collection task supported by
QConvergeConsole can also be accomplished with the HCLI and may
optionally be scripted. For example, when write-back caching is used with network storage that supports LUN snapshots or cloning to create point-in-time images (for example, for consistent backups), you must ensure that the write cache is flushed before taking the snapshot or clone. To do so, insert HCLI cache flush commands into the snapshot or cloning pre-processing scripts. Similarly,
when mounting a snapshot or clone (for example, in a rollback operation) to
Mt. Rainier-enabled servers, HCLI commands may be scripted to first flush
the cache (if write-back is enabled), and then invalidate the cache for the
affected LUNs before proceeding with the rollback volume mount.
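A snapshot pre-processing step might look like the following Python sketch. The hcli flush command and the array-side snapshot command are placeholders for whatever your HCLI version and array vendor actually provide.

    import subprocess

    def snapshot_with_flush(lun_id, snap_name):
        # 1. Flush the write-back cache so the array holds a consistent image.
        #    ("hcli cache flush" is a placeholder for the real HCLI command.)
        subprocess.run(["hcli", "cache", "flush", "--lun", lun_id], check=True)
        # 2. Only then take the point-in-time snapshot on the array
        #    ("array-cli" stands in for the array vendor's tooling).
        subprocess.run(["array-cli", "snapshot", "create", lun_id, snap_name],
                       check=True)

    snapshot_with_flush("lun-42", "nightly-backup")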
The HCLI also provides advanced SSD data LUN and SAN LUN caching
commands that automate provisioning, decommissioning, and policy
changes. For instance, based on knowledge of application processing
cycles, you may need to modify write caching parameters. A LUN involved
in decision support typically has low, "bursty" write activity, and
would benefit from a write-back caching policy. However, when such LUNs
are refreshed (for example, during periodic Extract-Transform-Load [ETL]
activities), the profile changes to heavy write, and therefore write-back
should be disabled by inserting the HCLI cache policy commands into the
ETL control scripting.
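In script form, the policy change might look like the sketch below; again, the hcli cache policy syntax is an invented placeholder rather than documented HCLI usage.

    import subprocess

    def run_etl_with_write_through(lun_id, etl_command):
        # Heavy sequential ETL writes defeat write-back caching, so disable it
        # for the duration of the load (placeholder command syntax)...
        subprocess.run(["hcli", "cache", "policy", "--lun", lun_id,
                        "--write-back", "off"], check=True)
        try:
            subprocess.run(etl_command, check=True)
        finally:
            # ...and restore write-back for normal, bursty decision-support use.
            subprocess.run(["hcli", "cache", "policy", "--lun", lun_id,
                            "--write-back", "on"], check=True)

    run_etl_with_write_through("lun-7", ["python", "nightly_etl.py"])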
Mt. Rainier API
Third-party developers and end-users who write custom applications can
have direct access to all SSD data LUN and SAN LUN caching configuration,
management, and data collection facilities of Mt. Rainier clusters by means
of the QLogic Mt. Rainier API. In addition to the capabilities described for
the HCLI, API users have the opportunity to provide the Mt. Rainier caching
system with application-intimate cache hints. That is, when applications or
operating environments are performing specific operations, they can give
the Mt. Rainier caching system specific directives to optimize performance
for upcoming operations. For example, if a database application “knows”
that it is about to make heavy use of a specific index, it can direct the Mt.
Rainier cluster to pre-fetch and lock the hot areas in the cache. Similarly,
to maximize overall storage infrastructure performance, cache control
software running on storage arrays can use the Mt. Rainier API to coordinate
“who-caches-what” with the caches running in Mt. Rainier-enabled hosts.
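The cache-hint pattern might look like the Python sketch below. Because this paper does not document the actual API surface, the Extent and HintClient types are illustrative stand-ins that only print the hints they would issue.

    from dataclasses import dataclass

    @dataclass
    class Extent:
        lun: str
        offset_lba: int
        length_lba: int

    class HintClient:
        """Illustrative stand-in for a Mt. Rainier API session."""
        def prefetch(self, extent):
            print(f"hint: pre-fetch {extent} into the LUN owner's cache")
        def lock(self, extent):
            print(f"hint: pin {extent} in cache until unlocked")
        def unlock(self, extent):
            print(f"hint: release {extent}")

    client = HintClient()
    hot_index = Extent(lun="LUN7", offset_lba=1_000_000, length_lba=65_536)
    client.prefetch(hot_index)   # the database "knows" this index is about to be hot
    client.lock(hot_index)
    # ... run the index-intensive workload ...
    client.unlock(hot_index)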
Summary and Conclusion
The Mt. Rainier technology from QLogic provides a reliable, efficient, and
effective architecture that delivers the key benefits of server, appliance,
and array SSD caching as a standard feature of I/O-intensive servers. This
new technology guarantees cache coherence, even in the most demanding active-active clustering environments, and, through QCOP, simultaneously ensures efficient utilization of SSD cache resources. With
Mt. Rainier, QLogic delivers the key that unlocks the benefits of smooth-scaling, available, and cost-effective SSD caching that can be applied
uniformly across the enterprise.
To stay up to date on Mt. Rainier technology news, please go to
www.QLogic.com.
Disclaimer
Reasonable efforts have been made to ensure the validity and accuracy of these performance tests. QLogic Corporation is not liable for any error
in this published white paper or the results thereof. Variation in results may be a result of change in configuration or in the environment. QLogic
specifically disclaims any warranty, expressed or implied, relating to the test results and their accuracy, analysis, completeness, or quality.