Realize Significant Performance Gains
in Oracle RAC with QLogic FabricCache
Guide to Best Practices
Key Findings
• QLogic® recommends using the QLogic FabricCache™ 10000 Series Adapter in an Oracle® Real Application Cluster (RAC) database
with Oracle’s Automatic Storage Management (ASM) feature enabled and provisioning LUNs as a multiple of the quantity of nodes
in the RAC cluster. This configuration ensures distribution of the cache load over the QLogic 10000 Series Adapters in the cluster.
• Oracle recommends four LUNs for a disk group and, for best cache optimization, making the LUN quantity a multiple of the node quantity.
• Each 10000 Series Adapter may cache up to 256 LUNs.
• All 10000 Series Adapters in the cluster must be zoned to be visible to each other.
• QLogic FabricCache 10000 Series Adapters enable:
– OLTP workloads to complete almost twice the transactions at 56 percent of the response time.
– OLAP workloads to complete 3.25 times the transactions at 25 percent of the response time.
Executive Summary
QLogic FabricCache 10000 Series Adapters bring shared, server-based
caching to the SAN. The 10000 Series Adapters are purposely designed
to address the high I/O demands and distributed nature of clustered
applications and virtualized environments, including Oracle RAC. Using
the following best practice guidelines, 10000 Series Adapters have
been shown to deliver up to 3.25 times the transactions in one quarter
of the response time for Online Analytical Processing (OLAP) workloads
in Oracle RAC environments. Online Transaction Processing (OLTP)
workloads were accelerated to almost twice the transactions at
56 percent of the response time.
The 10000 Series Adapter integrates a flash-based cache with a Fibre
Channel Adapter that uses the existing SAN infrastructure to create a
shared cache resource distributed over multiple servers. This cache-coherent
implementation extends caching performance benefits to the widest range
of enterprise applications, including clustered applications that use
shared SAN storage.
10000 Series Adapters lower transaction latency while increasing
storage I/O capacity in Oracle RAC environments. The 10000 Series Adapter
offloads traffic from the SAN infrastructure, which lowers disk array IOPS,
while maintaining SAN data protection and compliance policies. These
benefits translate to improved resource utilization, increased return on
investment (ROI), extended useful life of the SAN infrastructure, reduced
costs, and overall improved customer satisfaction.
Considerations for Business Continuity, High Availability,
and Scalability
Oracle RAC databases ensure that the critical business application is
resilient because multiple instances (nodes) provide for business continuity
and high availability. The loss of a single node does not result in an
application outage. The Oracle RAC database can be scaled by adding
more nodes to provide the processing capacity the application may need
going forward.
With this scalability, some database applications require high I/O with
low latency to support the increasing demand for quick response from the
application. If the SAN cannot provide the required performance in IOPS
for high-value, critical applications, administrators have several options to
improve it, including:
• Select and install a new SAN
• Add a localized, server-based cache
• Add more nodes to the Oracle RAC cluster
• Add the QLogic FabricCache 10000 Series Adapter
The following sections examine each of the preceding options.
Installing a New SAN
Installing a new SAN is a major operation that requires significant time,
investment, and planning to minimize operational disruptions during the
installation. While a new SAN also increases the overall I/O pool and
performance for the entire SAN, these increases may be overkill if only a
few applications require the increased performance.
Adding a Cache
Caching devices, such as a solid-state disk (SSD) cache in each node,
require that each node be brought down to install the cache. This cache
is local, captive to the node, and not shared. This limited solution provides
fewer benefits because, within the Oracle RAC environment, the LUNs being
cached must be shared among all of the Oracle nodes. The biggest caching
benefit comes from full table scans (direct reads) that are not cached in the
Oracle block buffers, which requires that the cache be shared among the
servers in the cluster.
Adding RAC Nodes
Adding more RAC nodes to the existing cluster requires new physical
hardware, modifications to the existing SAN zoning and mapping, and
additional network connectivity. Doing so can increase performance, but it
does not provide increased I/O from the SAN.
Adding a QLogic FabricCache 10000 Series Adapter
Adding the 10000 Series Adapter increases available I/O by caching read
operations. I/O is improved by replacing the existing QLogic Host Bus
Adapters within the nodes with the QLogic 10000 Series Adapters. The
10000 Series Adapter uses the same QLogic drivers and incurs minimal
disruption to the hardware infrastructure. The 10000 Series Adapters
function as Host Bus Adapters within the SAN fabric, the same as the non-
caching Fibre Channel Host Bus Adapters. In an Oracle RAC environment,
Host Bus Adapters can be replaced using a process that is non-destructive
to application operation. The replacement process includes the following
general steps that must be performed on each server (node) in the Oracle
RAC cluster:
1. Shut down the node.
2. Place the new 10000 Series Adapter in the node, replacing the existing
Host Bus Adapter.
3. Add the 10000 Series Adapter to the fabric zone, enabling it to see
the storage.
4. Add the 10000 Series Adapter to a fabric zone for clustering the 10000
Series Adapter.
5. Bring the node back online.
After all the 10000 Series Adapters are in place in the nodes, configure the
adapters to enable LUN caching.
System Architecture and Requirements
At a high level, the Oracle RAC database comprises four nodes that are
connected to the SAN with the QLogic 10000 Series Fibre Channel Adapter.
Both the database public network and the private interconnect are gigabit
Ethernet (GbE).
The load-generation application simulates a business workload and is
LAN-connected to the RAC database.
The environment comprises:
• Database servers (with a QLogic 10000 Series Fibre Channel Adapter in
each server in the cluster)
• SAN:
– Storage array
– Fabric switch with support for 8Gb Fibre Channel
• Ethernet switch
QLogic FabricCache 10000 Series Adapter Best Practices
In the Oracle RAC database, multiple nodes (servers) are actively reading
and changing data on shared LUNs. The database engine manages this
with a global lock manager.
The 10000 Series Adapter communicates “in-band” through the Fibre
Channel fabric to coordinate the cache activity between the 10000 Series
Adapters. For these adapters to function effectively, each card must have
visibility to the other cards in the cluster. Visibility is accomplished by
creating a fabric zone that includes all of the 10000 Series Adapters.
Because the 10000 Series Adapter-enabled cluster has multiple nodes
sharing the same LUNs, this shared-cache environment can improve
the throughput of the application by distributing the cache across all of
the nodes. The distribution works well in a RAC environment because
the Oracle RAC database groups multiple LUNs into a single storage
pool called a disk group (or LUN set). The Oracle ASM distributes the I/O
across all of the disks (LUNs) in the disk group, of which there can be an
unlimited quantity.
The example used in this document includes four configured disk groups.
One of the disk groups (with four LUNs) is specified for cluster management;
this disk group is not cached because of the low I/O demand on it. The other
three disk groups have cache enabled in the best practice environment.
Figure 1 shows the mapping required by each of four servers for the LUNs
to support the database. Note that each node in the cluster is presented
with the same set of LUNs.
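To make this layout concrete, the following Python sketch models the example configuration (illustrative placeholders only, not output from any QLogic or Oracle tool): four disk groups, each backed by four LUNs, presented identically to every node, with the cluster-management disk group left uncached.

# Illustrative model of the example configuration: four ASM disk groups,
# each built from four LUNs, presented to every node in the cluster.
# Disk group labels and LUN names are hypothetical placeholders.
disk_groups = {
    "CLUSTER_MGMT": {"luns": ["grid01", "grid02", "grid03", "grid04"], "cached": False},
    "REDO":         {"luns": ["redo01", "redo02", "redo03", "redo04"], "cached": True},
    "DATA":         {"luns": ["data01", "data02", "data03", "data04"], "cached": True},
    "UNDO":         {"luns": ["undo01", "undo02", "undo03", "undo04"], "cached": True},
}

nodes = ["OraServer1", "OraServer2", "OraServer3", "OraServer4"]

# Every node in the cluster must be presented with the same set of LUNs.
all_luns = sorted(lun for group in disk_groups.values() for lun in group["luns"])
lun_mapping = {node: all_luns for node in nodes}

cached_luns = [lun for group in disk_groups.values() if group["cached"] for lun in group["luns"]]
print(f"{len(all_luns)} LUNs per node, {len(cached_luns)} of them cache-enabled")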
Oracle RAC Best Practices
QLogic suggests following Oracle’s best practice recommendations for
RAC, which include:
• Creating four LUNs for each disk group.
• Using the Oracle ASM feature to balance the I/O load for a disk group
over the LUNs in that disk group.
• Using the Oracle Flash Recovery Area (FRA) to hold the backups.
However, because the QLogic examples used in this document do not
use or create a FRA, the FRA does not require caching.
Online REDO Logs
The online REDO log files are in their own LUN set, which allows these
files, which require high write performance, to be placed on the highest
performing storage. Caching these LUNs permits the archive activity to read
from the cache rather than from the SAN storage.
Figure 1. Mapping Required to Support the Database
Data and Index
The tablespaces that support the data and indexes are the largest
tablespaces (use the most storage) and provide the largest gain with
the 10000 Series Adapter read caching. Distributing the LUN cache
ownership across all of the nodes in the RAC cluster increases the available
FabricCache for caching. When queries are performing full table scans (likely
in OLAP or data warehouse applications), Oracle RAC does not cache this
information unless it finds a block containing a needed row. If many users
are performing similar analysis, the 10000 Series Adapter cache eventually
contains the needed rows and the analysis becomes much faster.
Indexes are read by database queries and tend to stay in the database
block buffer, where the work is spread over many nodes. Index blocks are
also held in the 10000 Series Adapter cache, so subsequent access to the
index from other nodes is very fast.
UNDO and TEMP Tablespaces
Like the online REDO log files, UNDO and TEMP tablespaces are on a
different LUN set. This LUN set allows the tablespaces to be placed on the
highest performing storage, which enables the consistent read activity and
temporary sorts to perform as fast as possible.
QLogic FabricCache 10000 Series Adapter Deployment
The illustrations in this section show the LUN ownership that is assigned
to each 10000 Series Adapter for cache ownership. In the cluster, a
LUN is assigned to one and only one 10000 Series Adapter for cache
ownership, while each of the 10000 Series Adapters is the cache
owner for one of the LUNs in the defined LUN groups: REDO, Data,
and UNDO.
I/O Behavior with FabricCache
While each LUN is owned by one (and only one) 10000 Series Adapter, the
LUN is shared with the other 10000 Series Adapters in the cluster as a SAN
LUN. LUN sharing ensures that each server has visibility to all the LUNs
being cached. Figure 3 shows the mapping for the data LUN group, where
each data LUN:
• Is owned by a single 10000 Series Adapter (represented by the larger
disk in the foreground).
• Is shared with the other 10000 Series Adapters as a SAN LUN
(represented by the smaller disks of the same color in the background
on the other servers).
Figure 3. SAN LUN Cache Mapping Example for Data LUNs
Figure 4 shows the I/O behavior for one set of LUNs for the data. The
behavior is the same for all LUNs that have cache enabled.
Figure 4. LUN Cache Operation Example for Server 1 Data LUNs
Figure 4 depicts the following caching operation:
• OraServer1 requests data from the SAN. If the data is on the Data01
LUN, the local cache is accessed. A cache miss results in access to
the SAN.
• Data on Data02 (or Data03 or Data04) is accessed in-band, with
OraServer1's 10000 Series Adapter requesting the data from the remote
10000 Series Adapter that is the cache owner for that LUN. A cache miss
means that the remote 10000 Series Adapter with cache ownership for
that LUN accesses the SAN for the data before returning the requested
data to OraServer1.
• In this clustered, shared-cache environment, all data access is managed
by the 10000 Series Adapter that has cache ownership for that LUN.
• The benefit of this shared cache becomes apparent when full table
scans occur (Oracle does not cache these reads in the block buffer).
Repeated full table scans on any of the nodes may benefit from the
information already cached in the 10000 Series Adapters in the cluster.
NOTE: Because the I/O behavior is in-band, it is not visible to the database
or even the driver. Also, if a node is “evicted” (that is, ejected from the
cluster by Oracle’s I/O fencing, causing a reboot) and the 10000 Series
Adapter is not available, the other nodes immediately and directly access
the LUN by means of the storage array. Although the performance gain
from the cache is not realized, the application continues to perform without
interruption. In Figure 4, if OraServer3 is evicted from the cluster, the other
nodes directly access Data03 until OraServer3 has rebooted and the 10000
Series Adapter is active on the SAN. The 10000 Series Adapter then begins
managing the I/O for Data03 and rewarming the cache.
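The caching operation depicted in Figure 4, together with the eviction fallback described in the note above, reduces to a simple read path. The following Python sketch is an illustration only, using hypothetical names; the real logic runs in-band on the 10000 Series Adapters and is invisible to the host driver and the database.

# Simplified, illustrative model of the FabricCache read path (not adapter
# firmware): each LUN has one cache-owning adapter, reads are served from
# that owner's cache when possible, and the SAN is accessed on a cache miss
# or when the owning node is offline.
cache_owner = {"Data01": "OraServer1", "Data02": "OraServer2",
               "Data03": "OraServer3", "Data04": "OraServer4"}
adapter_cache = {node: {} for node in cache_owner.values()}  # owner -> {(lun, block): data}
online = {node: True for node in cache_owner.values()}

def read_from_san(lun, block):
    # Placeholder for a read serviced directly by the storage array.
    return f"data({lun},{block})"

def read(requesting_node, lun, block):
    owner = cache_owner[lun]
    if not online[owner]:
        # Owner evicted or rebooting: bypass the cache and go straight to the SAN.
        return read_from_san(lun, block)
    cache = adapter_cache[owner]
    if (lun, block) not in cache:
        # Cache miss: the owning adapter fetches the block from the SAN and caches it.
        cache[(lun, block)] = read_from_san(lun, block)
    # Cache hit: served locally if requesting_node is the owner, otherwise
    # served in-band by the remote owning adapter.
    return cache[(lun, block)]

# Example: OraServer1 reading a block on Data03 is served through OraServer3's cache.
print(read("OraServer1", "Data03", block=42))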
Testing the QLogic 10000 Series Fibre Channel Adapter
QLogic set up a test environment based on a four-node RAC cluster,
with four 10000 Series Adapters clustered together. The load generator,
SwingBench, is configured on four nodes to provide a significant load on the
database. The SwingBench nodes are coordinated as one load-generation
tool. The best practices described in this paper were followed to obtain the
best performance. The results of the cached versus non-cached runs reflect
an improvement of approximately 40 to 70 percent.
Figure 5. Performance Gains with Cache Enabled over Cache Disabled
Performance Testing—SwingBench Results
QLogic used SwingBench—a database load generator from Oracle that is
designed to provide load to an Oracle RAC database—to demonstrate the
significant performance gains. The largest gain occurs when running large,
complex queries:
• The sales history (online analytical processing, OLAP) workload
achieved approximately 3.25 times the transactions in a one-hour run
when 10000 Series Adapter caching was enabled with a 1MB maximum
cache block size, versus when the cache was not enabled. The average
response times were about 75 percent faster.
• The order entry (online transaction processing, OLTP) workload
completed approximately 95 percent more transactions in the one-hour
run with 10000 Series Adapter caching enabled at the default maximum
cache block size, compared to when the cache was not enabled. The
average response time was also about 45 percent faster.
Your results will vary depending on machine load, query composition, and
Oracle RAC configuration (number of nodes). SwingBench is designed
to stress the entire database environment, so this testing is not focused
specifically on storage access. The largest gains are observed where the
storage access is heaviest, as in sales history reporting.
Figure 6. Response Time Improvement with Cache Enabled over Cache Disabled
Test Scenarios
QLogic set up two test scenarios with the following workload types:
• Decision support system (DSS) and OLAP workload. This workload
depends on reading and analyzing large amounts of data from the
database. Analysis shows a marked benefit from the caching that is
shared between all the nodes in the Oracle RAC cluster. The OLAP I/O
pattern reads large blocks of data for the queries, and the LUN cache
was enabled to accept up to a 1MB block.
• OLTP workload. This workload moves much smaller amounts of data,
resulting in more targeted reads and a smaller caching benefit. The
OLTP I/O pattern reads and writes 8KB blocks of data, and the LUN
cache was enabled at the default size of 128KB.
Each test scenario was run with the following cache patterns:
• Cache not enabled on the 10000 Series Adapter
• Cache enabled for all LUNs and distributed to different 10000
Series Adapters
The cache patterns support the best practice of matching the LUNs in a
disk group across the nodes in the RAC cluster; that is, the quantity of
LUNs supporting an ASM disk group is a multiple of the nodes in the cluster.
Summary and Conclusions
QLogic recommends using the QLogic 10000 Series Fibre Channel Adapter
in an Oracle RAC database with Oracle's ASM feature enabled and
provisioning LUNs in a multiple of the nodes in the RAC cluster.
Oracle recommends four LUNs for a disk group and, for best cache
optimization, making the LUN quantity a multiple of the node quantity. At a
minimum, use the following LUN counts:
• Two-node RAC = four LUNs per disk group
– Server1 has cache ownership on LUNs 1 and 3
– Server2 has cache ownership on LUNs 2 and 4
• Three-node RAC = six LUNs per disk group
– Server1 has cache ownership on LUNs 1 and 4
– Server2 has cache ownership on LUNs 2 and 5
– Server3 has cache ownership on LUNs 3 and 6
• Four-node RAC = four LUNs per disk group
– Server1 has cache ownership on LUN 1
– Server2 has cache ownership on LUN 2
– Server3 has cache ownership on LUN 3
– Server4 has cache ownership on LUN 4
Repeat the pattern to ensure that LUN cache ownership is distributed evenly
across the 10000 Series Adapters. Because Oracle ASM balances the I/O load
across the LUNs, this configuration provides the best caching performance.
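The recommended LUN counts and the repeating ownership pattern above reduce to a simple calculation. The following Python sketch is illustrative only (the server names are placeholders): it chooses the smallest multiple of the node count that is at least four LUNs, then assigns cache ownership round-robin, reproducing the two-, three-, and four-node examples listed above.

# Illustrative only: pick a LUN quantity per disk group that is a multiple
# of the node count (minimum of four), then assign cache ownership
# round-robin across the servers in the RAC cluster.
def recommended_lun_count(nodes, minimum=4):
    count = nodes
    while count < minimum:
        count += nodes
    return count

def assign_cache_owners(nodes):
    luns = range(1, recommended_lun_count(nodes) + 1)
    return {lun: f"Server{(lun - 1) % nodes + 1}" for lun in luns}

for n in (2, 3, 4):
    print(f"{n}-node RAC:", assign_cache_owners(n))
# 2-node RAC: {1: 'Server1', 2: 'Server2', 3: 'Server1', 4: 'Server2'}
# 3-node RAC: {1: 'Server1', 2: 'Server2', 3: 'Server3', 4: 'Server1', 5: 'Server2', 6: 'Server3'}
# 4-node RAC: {1: 'Server1', 2: 'Server2', 3: 'Server3', 4: 'Server4'}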
Appendix A: Configuring a QLogic FabricCache 10000
Series Adapter Cluster
Follow these steps to create a cluster and enable a cache on the 10000
Series Adapter.
1. Insert the 10000 Series Adapter and PCIe® flash daughter card into the
server.
2. Connect the 10000 Series Adapter to the fabric switch.
3. On the fabric switch, zone all of the 10000 Series Adapters (at least one
in each server) into the same zone. This step is required to allow the
10000 Series Adapters to communicate with each other for the caching
and I/O functions.
Use the 10000 Series Adapter Host CLI1 utility to perform the
remaining steps.
4. Issue the list adapters command to present all of the adapters
that are visible in the fabric zone. This step provides the Fibre Channel
adapter ID for each of the 10000 Series Adapters that are in the zone.
5. Issue the create cluster command, where the -clusterid
keyword indicates the ID to be used for the new cluster and the
-secondary keyword indicates the ID of the Fibre Channel Adapter
that will be the secondary of the new cluster. This step creates a two-node cluster of 10000 Series Adapters.
6. Issue the add member command to add other members (10000
Series Adapters) to complete the cluster.
7. Set the cache pool for each 10000 Series Adapter as follows:
a. Issue the show pool command to view the available storage
for the 10000 Series Adapter cache pool.
b. Issue the set pool command to specify the largest amount
possible for each 10000 Series Adapter.
All of the QLogic 10000 Series Fibre Channel Adapters are now ready to
identify which LUNs each will “own” for caching purposes.
8. Issue the show lun command to list all of the LUNs that have been
mapped to the 10000 Series Adapters through the fabric.
9. Identify the LUN by one of the following:
[-lun_serial=<LUN Page 0x80 serial number>]
[-lun_wwuln=<WWULN>]
[-lun_any_wwuln=<LUN WWULN, or Page 0x80 serial number designator>]
[-lun_number=<logical unit number>]
[-fc_target=<Fibre Channel target name>]
The cluster is now configured.
CAUTION: To ensure data integrity, any LUN must be cached by one and
only one QLogic 10000 Series Fibre Channel Adapter. Each 10000 Series
Adapter may cache up to 256 LUNs. All 10000 Series Adapters in the cluster
must be zoned to be visible to each other.
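As a sanity check on these constraints, the following illustrative Python snippet (hypothetical data, not a QLogic utility) verifies that each LUN is assigned to exactly one cache owner and that no adapter exceeds the 256-LUN caching limit.

# Illustrative validation of the cache-ownership constraints described in
# the caution above; the assignment list is a hypothetical example.
MAX_LUNS_PER_ADAPTER = 256

assignments = [                      # (LUN, owning 10000 Series Adapter)
    ("Data01", "Adapter1"), ("Data02", "Adapter2"),
    ("Data03", "Adapter3"), ("Data04", "Adapter4"),
]

def validate(assignments):
    owners = {}
    per_adapter = {}
    for lun, adapter in assignments:
        if lun in owners:
            raise ValueError(f"{lun} is assigned to more than one cache owner")
        owners[lun] = adapter
        per_adapter[adapter] = per_adapter.get(adapter, 0) + 1
    for adapter, count in per_adapter.items():
        if count > MAX_LUNS_PER_ADAPTER:
            raise ValueError(f"{adapter} would cache {count} LUNs (limit is {MAX_LUNS_PER_ADAPTER})")
    return True

print(validate(assignments))   # True when the assignment satisfies both constraints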
1 The Host CLI utility is available for download from the QLogic Downloads and Documentation Web page:
http://driverdownloads.qlogic.com/
Disclaimer
Reasonable efforts have been made to ensure the validity and accuracy of these performance tests. QLogic Corporation is not liable for any error
in this published white paper or the results thereof. Variation in results may be a result of change in configuration or in the environment. QLogic
specifically disclaims any warranty, expressed or implied, relating to the test results and their accuracy, analysis, completeness or quality.