Business Continuity Clustering 1.2
Administration Guide for Novell Open Enterprise Server 2 Support Pack 1 for Linux
February 18, 2010
BCC 1.2: Administration Guide for OES 2 SP1 Linux
Legal Notices
Novell, Inc., makes no representations or warranties with respect to the contents or use of this documentation, and
specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose.
Further, Novell, Inc., reserves the right to revise this publication and to make changes to its content, at any time,
without obligation to notify any person or entity of such revisions or changes.
Further, Novell, Inc., makes no representations or warranties with respect to any software, and specifically disclaims
any express or implied warranties of merchantability or fitness for any particular purpose. Further, Novell, Inc.,
reserves the right to make changes to any and all parts of Novell software, at any time, without any obligation to
notify any person or entity of such changes.
Any products or technical information provided under this Agreement may be subject to U.S. export controls and the
trade laws of other countries. You agree to comply with all export control regulations and to obtain any required
licenses or classification to export, re-export or import deliverables. You agree not to export or re-export to entities on
the current U.S. export exclusion lists or to any embargoed or terrorist countries as specified in the U.S. export laws.
You agree to not use deliverables for prohibited nuclear, missile, or chemical biological weaponry end uses. See the
Novell International Trade Services Web page (http://www.novell.com/info/exports/) for more information on
exporting Novell software. Novell assumes no responsibility for your failure to obtain any necessary export
approvals.
Novell, Inc., has intellectual property rights relating to technology embodied in the product that is described in this
document. In particular, and without limitation, these intellectual property rights may include one or more of the U.S.
patents listed on the Novell Legal Patents Web page (http://www.novell.com/company/legal/patents/) and one or
more additional patents or pending patent applications in the U.S. and in other countries.
Novell, Inc.
404 Wyman Street, Suite 500
Waltham, MA 02451
U.S.A.
www.novell.com
Online Documentation: To access the latest online documentation for this and other Novell products, see
the Novell Documentation Web page (http://www.novell.com/documentation).
Novell Trademarks
For Novell trademarks, see the Novell Trademark and Service Mark list (http://www.novell.com/company/legal/
trademarks/tmlist.html).
Third-Party Materials
All third-party trademarks are the property of their respective owners.
About This Guide
This guide describes how to install, configure, and manage Novell® Business Continuity Clustering 1.2 for Novell Open Enterprise Server (OES) 2 Support Pack 1 (SP1) for Linux servers in combination with Novell Cluster Services 1.8.6 for Linux clusters (the version released in OES 2 SP1 Linux).
Chapter 1, “Overview of Business Continuity Clustering,” on page 13
Chapter 2, “What’s New for BCC 1.2,” on page 25
Chapter 3, “Planning a Business Continuity Cluster,” on page 29
Chapter 4, “Installing Business Continuity Clustering,” on page 37
Chapter 5, “Updating (Patching) BCC 1.2.0 on OES 2 SP1 Linux,” on page 57
Chapter 6, “Upgrading the Identity Manager Nodes to Identity Manager 3.6.1,” on page 61
Chapter 7, “Converting BCC Clusters from NetWare to Linux,” on page 63
Chapter 8, “Configuring the Identity Manager Drivers for BCC,” on page 67
Chapter 9, “Configuring BCC for Peer Clusters,” on page 81
Chapter 10, “Managing a Business Continuity Cluster,” on page 91
Chapter 11, “Configuring BCC for Cluster Resources,” on page 99
Chapter 12, “Troubleshooting Business Continuity Clustering,” on page 107
Chapter 13, “Security Considerations,” on page 121
Appendix A, “Console Commands for BCC,” on page 129
Appendix B, “Setting Up Auto-Failover,” on page 133
Appendix C, “Configuring Host-Based File System Mirroring for NSS Pools,” on page 137
Appendix D, “Configuration Worksheet for the BCC Drivers for Identity Manager,” on
page 143
Appendix E, “Using Dynamic DNS with BCC 1.2,” on page 149
Appendix F, “Using Virtual IP Addresses with BCC 1.2,” on page 163
Appendix G, “Removing Business Continuity Clustering Core Software,” on page 171
Audience
This guide is intended for anyone involved in installing, configuring, and managing Novell Cluster
Services for Linux in combination with Novell Business Continuity Clustering.
The Security Considerations section provides information of interest for security administrators or
anyone who is responsible for the security of the system.
Feedback
We want to hear your comments and suggestions about this manual and the other documentation
included with this product. Please use the User Comments feature at the bottom of each page of the
online documentation, or go to www.novell.com/documentation/feedback.html (http://
www.novell.com/documentation/feedback.html) and enter your comments there.
Documentation Updates
The latest version of this Novell Business Continuity Clustering 1.2 Administration Guide for Linux
is available on the Business Continuity Clustering Documentation Web site (http://www.novell.com/
documentation/bcc/index.html) under BCC 1.2.0 for OES 2 SP1 Linux.
Additional Documentation
For BCC 1.2.1 for OES 2 SP2 Linux, see:
BCC 1.2.1 Administration Guide for Linux (http://www.novell.com/documentation/bcc/
bcc121_admin_lx/data/bookinfo.html)
OES 2 SP2: Novell Cluster Services 1.8.7 for Linux Administration Guide
Identity Manager 3.5.1 Documentation Web site (http://www.novell.com/documentation/idm35/)
For information about OES 2 Linux, see the OES 2 Documentation Web site (http://
www.novell.com/documentation/oes2/index.html).
For information about NetWare 6.5 SP8, see the NetWare 6.5 SP8 Documentation Web site (http://
www.novell.com/documentation/nw65/index.html).
Documentation Conventions
In Novell documentation, a greater-than symbol (>) is used to separate actions within a step and
items in a cross-reference path.
A trademark symbol (®, ™, etc.) denotes a Novell trademark. An asterisk (*) denotes a third-party trademark.
1 Overview of Business Continuity Clustering
As corporations become more international, fueled in part by the reach of the Internet, the requirement for service availability has increased. Novell® Business Continuity Clustering (BCC) offers corporations the ability to maintain mission-critical (24x7x365) data and application services to their users while still being able to perform maintenance and upgrades on their systems.
In the past few years, natural disasters (ice storms, earthquakes, hurricanes, tornadoes, and fires)
have caused unplanned outages of entire data centers. In addition, U.S. federal agencies have
realized the disastrous effects that terrorist attacks could have on the U.S. economy when
corporations lose their data and the ability to perform critical business practices. This has resulted in
initial recommendations for corporations to build mirrored or replicated data centers that are
geographically separated by 300 kilometers (km) or more. (The minimum acceptable distance is 200
km.)
Many companies have built and deployed geographically mirrored data centers. The problem is that
setting up and maintaining the multiple centers is a manual process that takes a great deal of
planning and synchronizing. Even configuration changes must be carefully planned and replicated.
One mistake and the redundant site is no longer able to effectively take over in the event of a
disaster.
This section identifies the implications for disaster recovery, provides an overview of some of the
network implementations today that attempt to address disaster recovery, and describes how
Business Continuity Clustering can improve your disaster recovery solution by providing
specialized software that automates cluster configuration, maintenance, and synchronization across
two to four geographically separate sites.
Section 1.1, “Disaster Recovery Implications,” on page 13
Section 1.2, “Disaster Recovery Implementations,” on page 14
Section 1.3, “Business Continuity Clustering,” on page 20
Section 1.4, “BCC Deployment Scenarios,” on page 21
Section 1.5, “Key Concepts,” on page 24
1.1 Disaster Recovery Implications
The implications of disaster recovery are directly tied to your data. Is your data mission critical? In
many instances, critical systems and data drive the business. If these services stop, the business
stops. When calculating the cost of downtime, some things to consider are:
File transfers and file storage
E-mail, calendaring, and collaboration
Web hosting
Critical databases
Productivity
Reputation
Continuous availability of critical business systems is no longer a luxury; it is a competitive business requirement. The Gartner Group estimates that 40% of enterprises that experience a disaster will go out of business in five years, and that only 15% of enterprises have a full-fledged business continuity plan that goes beyond core technology and infrastructure.
The cost to the business for each one hour of service outage includes the following:
Income loss measured as the income-generating ability of the service, data, or impacted group
Productivity loss measured as the hourly cost of impacted employees
Recovery cost measured as the hourly cost of IT personnel to get services back online
Future lost revenue because of customer and partner perception
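As an illustration (the numbers here are invented for the example, not benchmarks): if an outage affects 200 employees at a loaded cost of $40 per hour, the service normally generates $5,000 of revenue per hour, and two administrators at $60 per hour need three hours to restore service, then a single three-hour outage costs roughly (200 × $40 × 3) + ($5,000 × 3) + (2 × $60 × 3) = $24,000 + $15,000 + $360, or about $39,000, before any future revenue lost to customer and partner perception.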
1.2 Disaster Recovery Implementations
Stretch clusters and cluster-of-clusters are two approaches for making shared resources available
across geographically distributed sites so that a second site can be called into action after one site
fails. To use these approaches, you must first understand how the applications you use and the storage subsystems in your network deployment determine whether a stretch cluster or a cluster-of-clusters solution is possible for your environment.
Section 1.2.1, “LAN-Based versus Internet-Based Applications,” on page 14
Section 1.2.2, “Host-Based versus Storage-Based Data Mirroring,” on page 14
Section 1.2.3, “Stretch Clusters vs. Cluster of Clusters,” on page 15
1.2.1 LAN-Based versus Internet-Based Applications
Traditional LAN applications require a LAN infrastructure that must be replicated at each site, and
might require relocation of employees to allow the business to continue. Internet-based applications
allow employees to work from any place that offers an Internet connection, including homes and
hotels. Moving applications and services to the Internet frees corporations from the restrictions of
traditional LAN-based applications.
By using Novell exteNd Director portal services, Novell Access Manager, and ZENworks®, all services, applications, and data can be rendered through the Internet, allowing for loss of service at one site but still providing full access to the services and data by virtue of the ubiquity of the Internet. Data and services continue to be available from the other mirrored sites.
1.2.2 Host-Based versus Storage-Based Data Mirroring
For clustering implementations that are deployed in data centers in different geographic locations,
the data must be replicated between the storage subsystems at each data center. Data-block
replication can be done by host-based mirroring for synchronous replication over short distances up
to 10 km. Typically, replication of data blocks between storage systems in the data centers is
performed by SAN hardware that allows synchronous mirrors over a greater distance.
For stretch clusters, host-based mirroring is required to provide synchronous mirroring of the SBD
(split-brain detector) partition between sites. This means that stretch-cluster solutions are limited to
distances of 10 km.
Table 1-1 compares the benefits and limitations of host-based and storage-based mirroring.
Table 1-1 Comparison of Host-Based and Storage-Based Data Mirroring
Distance between sites
  Host-based mirroring: Up to 10 km.
  Storage-based mirroring: Can be up to and over 300 km. The actual distance is limited only by the SAN hardware and media interconnects for your deployment.

Mirroring the SBD partition
  Host-based mirroring: An SBD can be mirrored between two sites.
  Storage-based mirroring: Yes, if mirroring is supported by the SAN hardware and media interconnects for your deployment.

Synchronous data-block replication of data between sites
  Host-based mirroring: Yes.
  Storage-based mirroring: Yes; requires a Fibre Channel SAN or iSCSI SAN.

Failover support
  Host-based mirroring: No additional configuration of the hardware is required.
  Storage-based mirroring: Requires additional configuration of the SAN hardware.

Failure of the site interconnect
  Host-based mirroring: LUNs can become primary at both locations (split brain problem).
  Storage-based mirroring: Clusters continue to function independently. Minimizes the chance of LUNs at both locations becoming primary (split brain problem).

SMI-S compliance
  Host-based mirroring: If the storage subsystems are not SMI-S compliant, the storage subsystems must be controllable by scripts running on the nodes of the cluster.
  Storage-based mirroring: If the storage subsystems are not SMI-S compliant, the storage subsystems must be controllable by scripts running on the nodes of the cluster.
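In practice, the scripts mentioned in the SMI-S row are typically small wrappers around the array vendor’s command-line tools that promote or demote the mirrored LUNs at a site. The following shell skeleton is a hypothetical sketch only; vendorcli, its subcommands, and the LUN name are placeholders for whatever interface your storage vendor actually provides:

  #!/bin/sh
  # Hypothetical wrapper for controlling a non-SMI-S storage array.
  # "vendorcli" is a placeholder for the array vendor's real CLI.
  ACTION="$1"        # promote or demote
  LUN="POOL1_LUN"    # placeholder LUN or mirror-group name

  case "$ACTION" in
      promote)
          # Make the local copy of the mirrored LUN the writable primary.
          vendorcli mirror promote --lun "$LUN" || exit 1
          ;;
      demote)
          # Return the local copy to read-only secondary mode.
          vendorcli mirror demote --lun "$LUN" || exit 1
          ;;
      *)
          echo "Usage: $0 promote|demote" >&2
          exit 2
          ;;
  esac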
1.2.3 Stretch Clusters vs. Cluster of Clusters
A stretch cluster and a cluster of clusters are two clustering implementations that you can use with Novell Cluster Services to achieve your desired level of disaster recovery. This section describes each deployment type, then compares the capabilities of each.
Novell Business Continuity Clustering automates some of the configuration and processes used in a cluster of clusters. For information, see Section 1.3, “Business Continuity Clustering,” on page 20.
“Stretch Clusters” on page 15
“Cluster of Clusters” on page 16
“Comparison of Stretch Clusters and Cluster of Clusters” on page 17
“Evaluating Disaster Recovery Implementations for Clusters” on page 19
Stretch Clusters
A stretch cluster consists of a single cluster where the nodes are located in two geographically separate data centers. All nodes in the cluster must be in the same Novell eDirectory tree, which requires the eDirectory replica ring to span data centers. The IP addresses for nodes and cluster resources in the cluster must share a common IP subnet.
At least one storage system must reside in each data center. The data is replicated between locations by using host-based mirroring or storage-based mirroring. For information about using mirroring solutions for data replication, see Section 1.2.2, “Host-Based versus Storage-Based Data Mirroring,” on page 14. Link latency can occur between nodes at different sites, so the heartbeat tolerance between nodes of the cluster must be increased to allow for the delay.
The split-brain detector (SBD) is mirrored between the sites. Failure of the site interconnect can result in LUNs becoming primary at both locations (split brain problem) if host-based mirroring is used.
In the stretch-cluster architecture shown in Figure 1-1, the data is mirrored between two data centers that are geographically separated. The server nodes in both data centers are part of one cluster, so that if a disaster occurs in one data center, the nodes in the other data center automatically take over.
Figure 1-1 Stretch Cluster (diagram of an 8-node cluster stretched between two sites: four servers in Building A and four in Building B, each building with an Ethernet switch, a Fibre Channel switch, and a Fibre Channel disk array; the cluster heartbeat crosses the WAN and disk blocks are mirrored across the SAN)
Cluster of Clusters
A cluster of clusters consists of multiple clusters in which each cluster is located in a geographically separate data center. Each cluster can be in different Organizational Unit (OU) containers in the same eDirectory tree, or in different eDirectory trees. Each cluster can be in a different IP subnet.
A cluster of clusters provides the ability to fail over selected cluster resources or all cluster resources from one cluster to another cluster. For example, the cluster resources in one cluster can fail over to separate clusters by using a multiple-site fan-out failover approach. A given service can be provided by multiple clusters. Resource configurations are replicated to each peer cluster and synchronized manually. Failover between clusters requires manual management of the storage systems and the cluster.
Nodes in each cluster access only the storage systems co-located in the same data center. Typically, data is replicated by using storage-based mirroring. Each cluster has its own SBD partition. The SBD partition is not mirrored across the sites, which minimizes the chance for a split-brain problem occurring when using host-based mirroring. For information about using mirroring solutions for data replication, see Section 1.2.2, “Host-Based versus Storage-Based Data Mirroring,” on page 14.
In the cluster-of-clusters architecture shown in Figure 1-2, the data is synchronized by the SAN hardware between two data centers that are geographically separated. If a disaster occurs in one data center, the cluster in the other data center takes over.
Figure 1-2 Cluster of Clusters (diagram of two independent four-node clusters at geographically separate sites: Cluster Site 1 in Building A and Cluster Site 2 in Building B, each with an Ethernet switch, a Fibre Channel switch, and Fibre Channel disk arrays; eDirectory and IDM synchronize information across the WAN while disk blocks are mirrored across the SAN)
Comparison of Stretch Clusters and Cluster of Clusters
Table 1-2 compares the capabilities of a stretch cluster and a cluster of clusters.
Table 1-2 Comparison of Stretch Cluster and Cluster of Clusters

Number of clusters
  Stretch cluster: One.
  Cluster of clusters: Two or more.

Number of geographically separated data centers
  Stretch cluster: Two.
  Cluster of clusters: Two or more.

eDirectory trees
  Stretch cluster: Single tree only; requires the replica ring to span data centers.
  Cluster of clusters: One or multiple trees.

eDirectory Organizational Units (OUs)
  Stretch cluster: Single OU container for all nodes. As a best practice, place the cluster container in an OU separate from the rest of the tree.
  Cluster of clusters: Each cluster can be in a different OU. Each cluster is in a single OU container. As a best practice, place each cluster container in an OU separate from the rest of the tree.

IP subnet
  Stretch cluster: IP addresses for nodes and cluster resources must be in a single IP subnet. Because the subnet spans multiple locations, you must ensure that your switches handle gratuitous ARP (Address Resolution Protocol).
  Cluster of clusters: IP addresses in a given cluster are in a single IP subnet. Each cluster can use the same or a different IP subnet. If you use the same subnet for all clusters in the cluster of clusters, you must ensure that your switches handle gratuitous ARP.

SBD partition
  Stretch cluster: A single SBD is mirrored between two sites by using host-based mirroring, which limits the distance between data centers to 10 km.
  Cluster of clusters: Each cluster has its own SBD. Each cluster can have an on-site mirror of its SBD for high availability. If the cluster of clusters uses host-based mirroring, the SBD is not mirrored between sites, which minimizes the chance of LUNs at both locations becoming primary.

Failure of the site interconnect if using host-based mirroring
  Stretch cluster: LUNs might become primary at both locations (split brain problem).
  Cluster of clusters: Clusters continue to function independently.

Storage subsystem
  Stretch cluster: Each cluster accesses only the storage subsystem on its own site.
  Cluster of clusters: Each cluster accesses only the storage subsystem on its own site.

Data-block replication between sites (for information about data replication solutions, see Section 1.2.2, “Host-Based versus Storage-Based Data Mirroring,” on page 14)
  Stretch cluster: Yes; typically uses storage-based mirroring, but host-based mirroring is possible for distances up to 10 km.
  Cluster of clusters: Yes; typically uses storage-based mirroring, but host-based mirroring is possible for distances up to 10 km.

Clustered services
  Stretch cluster: A single service instance runs in the cluster.
  Cluster of clusters: Each cluster can run an instance of the service.

Cluster resource failover
  Stretch cluster: Automatic failover to preferred nodes at the other site.
  Cluster of clusters: Manual failover to preferred nodes on one or multiple clusters (multiple-site fan-out failover). Failover requires additional configuration.

Cluster resource configurations
  Stretch cluster: Configured for a single cluster.
  Cluster of clusters: Configured for the primary cluster that hosts the resource; the configuration is then manually replicated to the peer clusters.

Cluster resource configuration synchronization
  Stretch cluster: Controlled by the master node.
  Cluster of clusters: Manual process that can be tedious and error-prone.

Failover of cluster resources between clusters
  Stretch cluster: Not applicable.
  Cluster of clusters: Manual management of the storage systems and the cluster.

Link latency between sites
  Stretch cluster: Can cause false failovers. The cluster heartbeat tolerance between master and slave must be increased to as high as 30 seconds. Monitor cluster heartbeat statistics, then tune down as needed.
  Cluster of clusters: Each cluster functions independently in its own geographical site.
Evaluating Disaster Recovery Implementations for Clusters
Table 1-3 illustrates why a cluster-of-clusters solution is less problematic to deploy than a stretch cluster solution. Manual configuration is not a problem when you use Novell Business Continuity Clustering for your cluster of clusters.
Table 1-3 Advantages and Disadvantages of Stretch Clusters versus Cluster of Clusters

Stretch cluster advantages:
  It automatically fails over when configured with host-based mirroring.
  It is easier to manage than separate clusters.
  Cluster resources can fail over to nodes in any site.

Stretch cluster disadvantages:
  The eDirectory partition must span the sites.
  Failure of the site interconnect can result in LUNs becoming primary at both locations (split brain problem) if host-based mirroring is used.
  An SBD partition must be mirrored between sites.
  It accommodates only two sites.
  All IP addresses must reside in the same subnet.

Stretch cluster other considerations:
  Host-based mirroring is required to mirror the SBD partition between sites.
  Link variations can cause false failovers.
  You could consider partitioning the eDirectory tree to place the cluster container in a partition separate from the rest of the tree.
  The cluster heartbeat tolerance between master and slave must be increased to accommodate link latency between sites. You can set this as high as 30 seconds, monitor cluster heartbeat statistics, and then tune down as needed.
  Because all IP addresses in the cluster must be on the same subnet, you must ensure that your switches handle gratuitous ARP. Contact your switch vendor or consult your switch documentation for more information.

Cluster of clusters advantages:
  eDirectory partitions don’t need to span the cluster.
  Each cluster can be in different OUs in the same eDirectory tree.
  IP addresses for each cluster can be on different IP subnets.
  Cluster resources can fail over to separate clusters (multiple-site fan-out failover support).
  Each cluster has its own SBD. Each cluster can have an on-site mirror of its SBD for high availability. If the cluster of clusters uses host-based mirroring, the SBD is not mirrored between sites, which minimizes the chance of LUNs at both locations becoming primary.

Cluster of clusters disadvantages:
  Resource configurations must be manually synchronized.
  Storage-based mirroring requires additional configuration steps.

Cluster of clusters other considerations:
  Depending on the platform used, storage arrays must be controllable by scripts that run on OES 2 Linux if the SANs are not SMI-S compliant.
1.3 Business Continuity Clustering
A Novell Business Continuity Clustering cluster is an automated cluster of Novell Cluster Services
clusters. It is similar to what is described in “Cluster of Clusters” on page 16, except that the cluster
configuration, maintenance, and synchronization have been automated by adding specialized
software.
BCC supports up to four peer clusters. The sites are geographically separated mirrored data centers,
with a high availability cluster located at each site. Configuration is automatically synchronized
between the sites. Data is replicated between sites. All cluster nodes and their cluster resources are
monitored at each site. If one site goes down, business continues through the mirrored sites.
The business continuity cluster configuration information is stored in eDirectory. eDirectory schema
extensions provide the additional attributes required to maintain the configuration and status
information of BCC-enabled cluster resources. This includes information about the peer clusters, the
cluster resources and their states, and storage control commands.
BCC is an integrated set of tools to automate the setup and maintenance of a business continuity
infrastructure. Unlike competitive solutions that attempt to build stretch clusters, BCC uses a cluster
of clusters. Each site has its own independent clusters, and the clusters in each of the geographically
separate sites are each treated as “nodes” in a larger cluster, allowing a whole site to fail over to multiple other sites (fan-out failover). Although this can currently be done manually with a cluster of
clusters, BCC automates the system by using eDirectory and policy-based management of the
resources and storage systems.
Novell Business Continuity Clustering software provides the following advantages over typical cluster-of-clusters solutions:
Supports up to four clusters with up to 32 nodes each.
Integrates with shared storage hardware devices to automate the failover process through standards-based mechanisms such as SMI-S.
Uses Identity Manager technology to automatically synchronize and transfer cluster-related eDirectory objects from one cluster to another.
Provides the capability to fail over as few as one cluster resource, or as many as all cluster resources.
Includes intelligent failover that allows you to perform site failover testing as a standard practice.
Provides scripting capability that allows enhanced storage management control and customization of migration and failover between clusters.
Provides simplified business continuity cluster configuration and management by using the browser-based Novell iManager management tool. iManager is used for the configuration and monitoring of the overall system and for the individual resources.
1.4 BCC Deployment Scenarios
There are several Business Continuity Clustering deployment scenarios that can be used to achieve
the desired level of disaster recovery. Three possible scenarios include:
Section 1.4.1, “Two-Site Business Continuity Cluster Solution,” on page 21
Section 1.4.2, “Multiple-Site Business Continuity Cluster Solution,” on page 22
Section 1.4.3, “Low-Cost Business Continuity Cluster Solution,” on page 23
1.4.1 Two-Site Business Continuity Cluster Solution
The two-site business continuity cluster deploys two independent clusters at geographically separate
sites. Each cluster can support up to 32 nodes. The clusters can be designed in one of two ways:
Active Site/Active Site: Two active sites where each cluster supports different applications
and services. Either site can take over for the other site at any time.
Active Site/Passive Site: A primary site in which all services are normally active, and a secondary site which is effectively idle. The data is mirrored to the secondary site, and the applications and services are ready to load if needed.
The active/active deployment option is typically used in a company that has more than one large site of operations. The active/passive deployment option is typically used when the purpose of the secondary site is primarily testing by the IT department. Replication of data blocks is typically done by SAN hardware, but it can be done by host-based mirroring for synchronous replication over short distances up to 10 km.
Figure 1-3 shows a two-site business continuity cluster that uses storage-based data replication between the sites. BCC uses eDirectory and Identity Manager to synchronize cluster information between the two clusters.
Figure 1-3 Two-Site Business Continuity Cluster (diagram of two independent four-node clusters at geographically separate sites, each with an Ethernet switch, a Fibre Channel switch, and Fibre Channel disk arrays; eDirectory and IDM synchronize information across the WAN while disk blocks are mirrored across the SAN)
1.4.2 Multiple-Site Business Continuity Cluster Solution
The multiple-site business continuity cluster is a large solution capable of supporting up to four
sites. Each cluster can support up to 32 nodes. Services and applications can do fan-out failover
between sites. Replication of data blocks is typically done by SAN hardware, but it can be done by
host-based mirroring for synchronous replication over short distances up to 10 km.
Figure 1-4 depicts a four-site business continuity cluster that uses storage-based data replication between the sites. BCC uses eDirectory and Identity Manager to synchronize cluster information between the clusters.
Figure 1-4 Four-Site Business Continuity Cluster (diagram of four independent clusters in geographically separate sites: a four-node cluster at Cluster Site 1 in Building A, with similar clusters at Cluster Sites 2, 3, and 4 in Buildings B, C, and D; eDirectory and IDM synchronize information across the WAN while disk blocks are mirrored across the SAN)
Using additional software, all services, applications, and data can be rendered through the Internet,
allowing for loss of service at one site but still providing full access to the services and data by virtue
of the ubiquity of the Internet. Data and services continue to be available from the other mirrored
sites. Moving applications and services to the Internet frees corporations from the restrictions of
traditional LAN-based applications. Traditional LAN applications require a LAN infrastructure that
must be replicated at each site, and might require relocation of employees to allow the business to
continue. Internet-based applications allow employees to work from any place that offers an Internet
connection, including homes and hotels.
1.4.3 Low-Cost Business Continuity Cluster Solution
The low-cost business continuity cluster solution is similar to the previous two solutions, but it replaces Fibre Channel storage arrays with iSCSI storage arrays. Data-block mirroring can be accomplished either with iSCSI-based block replication or with host-based mirroring. In either case, snapshot technology can allow for asynchronous replication over long distances. However, the low-cost solution does not necessarily have the performance associated with higher-end Fibre Channel storage arrays.
1.5 Key Concepts
The key concepts in this section can help you understand how Business Continuity Clustering
manages your business continuity cluster.
Section 1.5.1, “Business Continuity Clusters,” on page 24
Section 1.5.2, “Cluster Resources,” on page 24
Section 1.5.3, “Landing Zone,” on page 24
Section 1.5.4, “BCC Drivers for Identity Manager,” on page 24
1.5.1 Business Continuity Clusters
A cluster of two to four Novell Cluster Services clusters that are managed together by Business
Continuity Clustering software. All nodes in every peer cluster are running the same operating
system.
1.5.2 Cluster Resources
A cluster resource is a cluster-enabled shared disk that is configured for Novell Cluster Services. It
is also BCC-enabled so that it can be migrated and failed over between nodes in different peer
clusters.
1.5.3 Landing Zone
The landing zone is an eDirectory context in which the objects for the Virtual Server, the Cluster
Pool, and the Cluster Volume are placed when they are created for the peer clusters. You specify the
landing zone context when you configure the Identity Manager drivers for the business continuity
cluster.
1.5.4 BCC Drivers for Identity Manager
Business Continuity Clustering requires a special Identity Manager driver that uses an Identity Vault
to synchronize the cluster resource configuration information between the peer clusters. If the peer
clusters are in different eDirectory trees, an additional BCC driver helps synchronize user
information between the trees. For information, see Chapter 8, “Configuring the Identity Manager
Drivers for BCC,” on page 67.
2 What’s New for BCC 1.2
This section describes the changes and enhancements that were made to Novell® Business
Continuity Clustering (BCC) 1.2 for Novell Open Enterprise Server (OES) 2 Support Pack 1 (SP1)
since the initial release of BCC 1.2.
Section 2.1, “BCC 1.2.0 Patch (January 2010),” on page 25
Section 2.2, “Identity Manager 3.6.1 Support (June 2009),” on page 26
Section 2.3, “BCC 1.2 for OES 2 SP1 Linux,” on page 26
2.1 BCC 1.2.0 Patch (January 2010)
In January 2010, a BCC 1.2.0 patch was released through the OES 2 SP1 Linux patch channel (oes2sp1-January-2010-Scheduled-Maintenance-6749). For information about applying the patch,
see Chapter 5, “Updating (Patching) BCC 1.2.0 on OES 2 SP1 Linux,” on page 57.
The major changes for BCC 1.2.0 are described in the following sections:
Section 2.1.1, “BCC Engine,” on page 25
Section 2.1.2, “BCC Resource Driver Template for Identity Manager,” on page 25
2.1.1 BCC Engine
The BCC 1.2.0 patch includes the following major bug fixes for the BCC engine:
Improves the update process to wait for the adminfsd and bccd daemons to gracefully stop before running the BCC install scripts. (Bug 561055)
  Modified the novell-bcc init.d script to wait up to 15 seconds when bccd is stopped. Typically, the wait is less than 10 seconds.
  Modified the post-install script of the Novell BCC specification file to wait up to 5 seconds when adminfsd is stopped. Typically, the wait is about 1 second.
Improves memory management functions that might cause the bccd daemon to die. Overall, the code was simplified and clarified so that the shared memory functions now do exactly what their names describe. The bccd daemon was modified to generate unique keys, to verify that the keys are not in use by other processes, and then to use the verified unique keys for its processing threads. (Bug 553527)
Improves the detection and handling of No Memory exceptions. In addition, if an exception is not caught and handled where it occurs, the engine’s main thread detects the exception and gracefully shuts itself down. (Bug 428161)
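The wait behavior described above can be pictured as a simple poll loop. The following shell fragment is an illustrative sketch only, not code from the actual novell-bcc init.d script or the RPM post-install script; the function name and the use of pidof are assumptions:

  # Poll until the named daemon exits or the timeout expires.
  wait_for_stop() {
      daemon="$1"    # e.g., bccd or adminfsd
      timeout="$2"   # seconds to allow for a graceful stop
      elapsed=0
      while pidof "$daemon" >/dev/null && [ "$elapsed" -lt "$timeout" ]; do
          sleep 1
          elapsed=$((elapsed + 1))
      done
  }

  wait_for_stop bccd 15      # bccd gets up to 15 seconds
  wait_for_stop adminfsd 5   # adminfsd gets up to 5 seconds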
2.1.2 BCC Resource Driver Template for Identity Manager
The BCC 1.2.0 patch for OES 2 SP1 Linux includes a new BCC resource driver template for
Identity Manager that offers the following new feature and bug fixes:
Uses a newer policy linking format so that you are no longer prompted to update the driver in
iManager. (New)
Adds the host resource name and Novell Distributed File Services (DFS) GUID attributes to
the Volume objects that are synchronized for a BCC-enabled volume resource. (Bug 535127)
No longer creates duplicate NCP Server, Volume, and Pool objects when the landing zone is
not the same location as the cluster server’s container. (Found while debugging Bug 537981)
Ensures that a volume resource’s link to the virtual NCP Server object is updated to point to the
cluster where the resource is mounted. (Found while debugging Bug 537981)
The event for the IsClusterEnabled policy in a BCC resource driver now allows a resource’s peer list to be synchronized to the peer clusters, even if the cluster is disabled, if the current cluster’s name is being removed from the peer list. Only this specific change is allowed; other changes to a resource are dropped (vetoed) by the driver after a cluster is disabled. (Bug 434243)
The new BCC resource driver template is compatible with the following combinations of Identity Manager and operating systems:
  Identity Manager 3.6 (32-bit): OES 2 SP1 Linux (32-bit)
  Identity Manager 3.6.1 (32-bit or 64-bit): OES 2 SP1 Linux (32-bit or 64-bit)
  Identity Manager 3.6.1 (32-bit or 64-bit): OES 2 SP2 Linux (32-bit or 64-bit)
  Identity Manager 3.5.x: NetWare® 6.5 SP8
The new BCC resource driver template is not automatically applied to existing drivers. You can
continue to use your existing BCC resource drivers, or you can re-create the BCC resource drivers
with the new template in order to take advantage of the changes it offers. We recommend that you
re-create the drivers with the new template, but it is not required.
2.2 Identity Manager 3.6.1 Support (June 2009)
In June 2009, Identity Manager 3.6.1 was released to provide support for the 64-bit OES 2 SP1
Linux operating system. Previously, Identity Manager required a 32-bit operating system, even with
64-bit hardware. This means the Identity Manager node in a BCC peer cluster can now be installed
on a 64-bit operating system. Updating to Identity Manager 3.6.1 is needed only for 64-bit support,
or to take advantage of bug fixes that might be offered in 3.6.1.
For information about upgrading from Identity Manager 3.6 to Identity Manager 3.6.1 in a BCC
environment, see Chapter 6, “Upgrading the Identity Manager Nodes to Identity Manager 3.6.1,” on
page 61.
2.3 BCC 1.2 for OES 2 SP1 Linux
BCC 1.2 for OES 2 SP1 Linux provides the following enhancements and changes over BCC 1.1 SP2 for NetWare 6.5 SP8:
Support for OES 2 SP1 Linux
Support for Novell Cluster Services 1.8.6 for Linux
Support for Identity Manager 3.6 (32-bit). A 64-bit update is planned.
Support for 32-bit and 64-bit architectures
Support for Novell eDirectory 8.8
Support for Novell iManager 2.7.2
Preferred node failover between clusters
Enterprise data center capabilities
Geographical failover of virtual machines as cluster resources
Full support for CIM management in tools (requires OpenWBEM)
3 Planning a Business Continuity Cluster
Use the guidelines in this section to design your Novell® Business Continuity Clustering solution.
The success of your business continuity cluster depends on the stability and robustness of the
individual peer clusters. BCC cannot overcome weaknesses in a poorly designed cluster
environment.
Section 3.1, “Determining Design Criteria,” on page 29
Section 3.2, “Best Practices,” on page 29
Section 3.3, “LAN Connectivity Guidelines,” on page 30
Section 3.4, “SAN Connectivity Guidelines,” on page 31
Section 3.5, “Storage Design Guidelines,” on page 32
Section 3.6, “eDirectory Design Guidelines,” on page 32
Section 3.7, “Cluster Design Guidelines,” on page 34
3.1 Determining Design Criteria
The design goal for your business continuity cluster is to ensure that your critical data and services
can continue in the event of a disaster. Design the infrastructure based on your business needs.
Determine your design criteria by asking and answering the following questions:
What are the key services that drive your business?
Where are your major business sites, and how many are there?
What services are essential for business continuance?
What is the cost of down time for the essential services?
Based on their mission-critical nature and cost of down time, what services are the highest
priority for business continuance?
Where are the highest-priority services currently located?
Where should the highest-priority services be located for business continuance?
What data must be replicated to support the highest-priority services?
How much data is involved, and how important is it?
3.2 Best Practices
The following practices help you avoid potential problems with your BCC:
IP address changes should always be made on the Protocols page of the iManager cluster plug-
in, not in load and unload scripts.
TM
This is the only way to change the IP address on the virtual NCP
eDirectory
TM
.
server object in
Planning a Business Continuity Cluster
29
Ensure that eDirectory and your clusters are stable before implementing BCC.
Engage Novell Consulting.
Engage a consulting group from your SAN vendor.
The cluster node that hosts the Identity Manager driver should have a full read/write
eDirectory
Driver set container
Cluster container
(Parent) container where the servers reside
Landing zone container
User object container
Ensure that you have full read/write replicas of the entire tree at each data center.
TM
replica with the following containers in the replica:
3.3 LAN Connectivity Guidelines
The primary objective of LAN connectivity in a cluster is to provide uninterrupted heartbeat
communications. Use the guidelines in this section to design the LAN connectivity for each of the
peer clusters in the business continuity cluster:
Section 3.3.1, “VLAN,” on page 30
Section 3.3.2, “Channel Bonding,” on page 30
Section 3.3.3, “IP Addresses,” on page 31
Section 3.3.4, “Name Resolution,” on page 31
Section 3.3.5, “IP Addresses for BCC-Enabled Cluster Resources,” on page 31
3.3.1 VLAN
Use a dedicated VLAN (virtual local area network) for each cluster.
The cluster protocol is non-routable, so you cannot direct communications to specific IP addresses.
Using a VLAN for the cluster nodes provides a protected environment for the heartbeat process and
ensures that heartbeat packets are exchanged only between the nodes of a given cluster.
A VLAN also ensures that no foreign host can interfere with the heartbeat; for example, it avoids broadcast storms that slow traffic and can result in false split-brain abends.
3.3.2 Channel Bonding
Use channel bonding for adapters for LAN fault tolerance. Channel bonding combines Ethernet
interfaces on a host computer for redundancy or increased throughput. It helps increase the
availability of an individual cluster node, which helps avoid or reduce the occurrences of failover
caused by slow LAN traffic. For information, see
bonding.txt
.
/usr/src/linux/Documentation/
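For example, on an OES 2 Linux (SLES-based) node, a bonded interface is typically defined in /etc/sysconfig/network/ifcfg-bond0. The following is a minimal sketch under assumed conditions; the IP address and slave interface names are placeholders, and mode=active-backup is only one common choice for fault tolerance:

  STARTMODE='auto'
  BOOTPROTO='static'
  IPADDR='10.10.10.33/24'
  BONDING_MASTER='yes'
  BONDING_SLAVE0='eth0'
  BONDING_SLAVE1='eth1'
  BONDING_MODULE_OPTS='mode=active-backup miimon=100'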
When configuring Spanning Tree Protocol (STP), ensure that Portfast is enabled, or consider using Rapid Spanning Tree. The default settings for STP inhibit the heartbeat for over 30 seconds whenever there is a change in link status. Test your STP configuration with Novell Cluster Services running to make sure that a node is not cast out of the cluster when a broken link is restored.
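For example, on a Cisco IOS switch, enabling PortFast on the ports that face cluster nodes and using Rapid Spanning Tree looks like the following sketch. The interface name is a placeholder, and syntax varies by vendor and software version; consult your switch documentation:

  spanning-tree mode rapid-pvst
  interface GigabitEthernet0/1
   description Port facing an NCS cluster node
   spanning-tree portfast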