ProLiant Clusters HA/F100 and HA/F200
Administrator Guide
Third Edition (September 2000) Part Number 380362-003 Compaq Computer Corporation
Notice
© 2000 Compaq Computer Corporation
COMPAQ, the Compaq logo, Compaq Insight Manager, ProLiant, ROMPaq, SoftPaq, SmartStart, and ServerNet are registered in the U.S. Patent and Trademark Office. SANworks is a trademark of Compaq Information Technologies Group, L.P.
Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation.
Intel is a trademark of Intel Corporation.
All other product names mentioned herein may be trademarks of their respective companies.
Compaq shall not be liable for technical or editorial errors or omissions contained herein. The information in this document is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND. ANY RISK ARISING OUT OF THE USE OF THIS INFORMATION REMAINS WITH RECIPIENT. IN NO EVENT SHALL COMPAQ BE LIABLE FOR ANY DIRECT, INDIRECT, CONSEQUENTIAL OR OTHER DAMAGES WHATSOEVER (INCLUDING WITHOUT LIMITATION, DAMAGES FOR BUSINESS INTERRUPTION OR LOSS OF BUSINESS INFORMATION OR PROFITS), EVEN IF COMPAQ HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES AND WHETHER IN AN ACTION OF CONTRACT OR TORT, INCLUDING NEGLIGENCE.
The warranties for Compaq products are set forth in the express limited warranty statements accompanying such products. Nothing herein should be construed as constituting an additional warranty.
Compaq ProLiant Clusters HA/F100 and HA/F200 Administrator Guide Third Edition (September 2000) Part Number 380362-003
Contents
About This Guide
Text Conventions........................................................................................................ix
Symbols in Text...........................................................................................................x
Symbols on Equipment................................................................................................x
Rack Stability .............................................................................................................xi
Getting Help ...............................................................................................................xi
Compaq Technical Support ................................................................................xii
Compaq Website.................................................................................................xii
Compaq Authorized Reseller............................................................................ xiii
Chapter 1
Architecture of the Compaq ProLiant Clusters HA/F100 and HA/F200
Overview of Compaq ProLiant Clusters HA/F100 and HA/F200 Components...... 1-1
Compaq ProLiant Cluster HA/F100 ........................................................................ 1-2
Compaq ProLiant Cluster HA/F200 ........................................................................ 1-4
Compaq ProLiant Servers........................................................................................ 1-6
Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 .................................................. 1-6
Compaq StorageWorks RAID Array 4000 Controller...................................... 1-7
Connection Infrastructure for the RA4000/4100 .............................................. 1-8
Compaq StorageWorks Fibre Channel Host Adapter/P or Compaq StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter ......................... 1-9
Gigabit Interface Converter-Shortwave.......................................................... 1-10
Cluster Interconnect............................................................................................... 1-10
Client Network ............................................................................................... 1-10
Private or Public Interconnect ........................................................................ 1-11
Interconnect Adapters..................................................................................... 1-11
Redundant Interconnects ................................................................................ 1-12
Cables ............................................................................................................. 1-12
Microsoft Software ................................................................................................ 1-14
Compaq Software .................................................................................................. 1-15
Compaq SmartStart and Support Software CD............................................... 1-15
Compaq Redundancy Manager (Fibre Channel)............................................. 1-17
Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000/4100 ....................................................... 1-18
Compaq Cluster Verification Utility............................................................... 1-18
Compaq Insight Manager................................................................................ 1-19
Compaq Insight Manager XE ......................................................................... 1-20
Compaq Intelligent Cluster Administrator...................................................... 1-20
Resources for Application Installation............................................................ 1-21
Chapter 2
Designing the Compaq ProLiant Clusters HA/F100 and HA/F200
Planning Considerations .......................................................................................... 2-2
Cluster Configurations...................................................................................... 2-2
Cluster Groups.................................................................................................. 2-9
Reducing Single Points of Failure in the HA/F100 Configuration ................. 2-14
Enhanced High Availability Features of the HA/F200 ................................... 2-23
Capacity Planning .................................................................................................. 2-28
Server Capacity............................................................................................... 2-29
Shared Storage Capacity................................................................................. 2-31
Static Load Balancing..................................................................................... 2-35
Networking Capacity ...................................................................................... 2-37
Network Considerations......................................................................................... 2-37
Network Configuration ................................................................................... 2-37
Migrating Network Clients ............................................................................. 2-38
Failover/Failback Planning .................................................................................... 2-40
Performance After Failover ............................................................................ 2-40
Microsoft Clustering Software Thresholds and Periods ................................. 2-41
Failover of Directly Connected Devices......................................................... 2-42
Manual vs. Automatic Failback ...................................................................... 2-43
Failover and Failback Policies ........................................................................ 2-44
Chapter 3
Setting Up the Compaq ProLiant Clusters HA/F100 and HA/F200
Preinstallation Overview.......................................................................................... 3-1
Preinstallation Guidelines ........................................................................................ 3-4
Installing the Hardware............................................................................................ 3-7
Setting Up the Nodes ........................................................................................ 3-7
Setting Up the Compaq StorageWorks RAID Array 4000 and RAID Array 4100 Storage System ........................................ 3-9
Setting Up a Dedicated Interconnect .............................................................. 3-11
Setting Up a Public Interconnect .................................................................... 3-13
Redundant Interconnect .................................................................................. 3-13
Installing the Software ........................................................................................... 3-13
Assisted Integration Using SmartStart (Recommended) ................................ 3-14
Compaq Intelligent Cluster Administrator............................................................. 3-21
Installing Compaq Intelligent Cluster Administrator ..................................... 3-21
Additional Cluster Verification Steps.................................................................... 3-22
Verifying the Creation of the Cluster ............................................................. 3-22
Verifying Node Failover................................................................................. 3-23
Verifying Network Client Failover................................................................. 3-24
Chapter 4
Upgrading the ProLiant Clusters HA/F100 and HA/F200
Pre-migration Preparation........................................................................................ 4-2
Migration Process I: HA/F100 with Windows NTS/E to HA/F100 with Windows 2000 Advanced Server .................................. 4-5
Migration Process II: HA/F200 with Windows NTS/E to HA/F200 with Windows 2000 Advanced Server ................................. 4-9
Migration Process III: HA/F100 with Windows 2000 Advanced Server to HA/F200 with Windows 2000 Advanced Server ................. 4-12
Migration Process IV: HA/F100 Windows NTS/E to HA/F200 Windows 2000 Advanced Server ........................................... 4-15
Migration Process V: HA/F100 Windows NTS/E to HA/F200 Windows NTS/E. 4-19
Chapter 5
Managing the Compaq ProLiant Clusters HA/F100 and HA/F200
Managing a Cluster Without Interrupting Cluster Services..................................... 5-2
Managing a Cluster in a Degraded Condition.......................................................... 5-3
Managing Hardware Components of Individual Cluster Nodes .............................. 5-4
Managing Network Clients Connected to a Cluster................................................. 5-4
Managing a Cluster’s Shared Storage...................................................................... 5-5
Remotely Managing a Cluster ................................................................................. 5-5
Viewing Cluster Events ........................................................................................... 5-5
Modifying Physical Cluster Resources.................................................................... 5-6
Removing Shared Storage System.................................................................... 5-6
Adding Shared Storage System ........................................................................ 5-6
Adding or Removing Shared Storage Drives.................................................... 5-8
Physically Replacing a Cluster Node.............................................................. 5-10
Backing Up Your Cluster ...................................................................................... 5-11
Managing Cluster Performance ............................................................................. 5-12
Compaq Redundancy Manager.............................................................................. 5-13
Changing Paths............................................................................................... 5-14
Other Functions .............................................................................................. 5-15
RAID Array 4000 Controller Hot Replace ............................................................ 5-15
Secure Path Manager ............................................................................................. 5-16
Launching Secure Path Manager.................................................................... 5-16
Logging on to Secure Path Manager............................................................... 5-16
Managing Storagesets and Paths in a Clustered Environment........................ 5-18
Moving a Storageset From One Controller to the Other................................. 5-19
Verifying A Path............................................................................................. 5-19
RA4000 Controller Hot Replace..................................................................... 5-20
Compaq Insight Manager....................................................................................... 5-21
Cluster-Specific Features of Compaq Insight Manager .................................. 5-22
Compaq Insight Manager XE................................................................................. 5-23
Cluster Monitor............................................................................................... 5-24
Compaq Intelligent Cluster Administrator............................................................. 5-26
Monitoring and Managing an Active Cluster.................................................. 5-26
Managing Cluster History............................................................................... 5-27
Importing and Exporting Cluster Configurations............................................ 5-27
Microsoft Cluster Administrator............................................................................ 5-28
Chapter 6
Troubleshooting the Compaq ProLiant Clusters HA/F100 and HA/F200
Installation................................................................................................................ 6-2
Troubleshooting Node-to-Node Problems ............................................................... 6-4
Shared Storage ......................................................................................................... 6-6
Client-to-Cluster Connectivity............................................................................... 6-11
Cluster Groups and Cluster Resource .................................................................... 6-15
Troubleshooting Compaq Redundancy Manager................................................... 6-16
Event Logging................................................................................................. 6-16
Informational Messages.................................................................................. 6-16
Warning Message ........................................................................................... 6-19
Error Messages ............................................................................................... 6-19
Other Potential Problems ................................................................................ 6-21
Troubleshooting Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000/4100 ........................................... 6-21
Appendix A
Cluster Configuration Worksheets
Overview................................................................................................................. A-1
Cluster Group Definition Worksheet ...................................................................... A-2
Shared Storage Capacity Worksheet....................................................................... A-3
Group Failover/Failback Policy Worksheet............................................................ A-4
Preinstallation Worksheet ....................................................................................... A-5
Appendix B
Using Compaq Redundancy Manager in a Single-Server Environment
Overview..................................................................................................................B-1
Installing Redundancy Manager ..............................................................................B-4
Automatically Installing Redundancy Manager................................................B-5
Manually Installing Redundancy Manager.......................................................B-5
Managing Redundancy Manager .............................................................................B-6
Changing Paths .................................................................................................B-7
Expanding Capacity..........................................................................................B-8
Other Functions ................................................................................................B-9
Troubleshooting Redundancy Manager................................................................... B-9
Appendix C
Software and Firmware Versions
Glossary
Index
About This Guide
This guide provides step-by-step instructions for installation and serves as a reference for operation, troubleshooting, and future upgrades.
Text Conventions
This document uses the following conventions to distinguish elements of text:
Keys: Keys appear in boldface. A plus sign (+) between two keys indicates that they should be pressed simultaneously.
USER INPUT: User input appears in a different typeface and in uppercase.
FILENAMES: File names appear in uppercase italics.
Menu Options, Command Names, Dialog Box Names: These elements appear in initial capital letters.
COMMANDS, DIRECTORY NAMES, and DRIVE NAMES: These elements appear in uppercase.
Type: When you are instructed to type information, type the information without pressing the Enter key.
Enter: When you are instructed to enter information, type the information and then press the Enter key.
Page 10
x Compaq ProLiant Clusters HA/F100 and HA/F200 Administrator Guide
Symbols in Text
These symbols may be found in the text of this guide. They have the following meanings.
WARNING: Text set off in this manner indicates that failure to follow directions in the warning could result in bodily harm or loss of life.
CAUTION: Text set off in this manner indicates that failure to follow directions could result in damage to equipment or loss of information.
IMPORTANT: Text set off in this manner presents clarifying information or specific instructions.
NOTE: Text set off in this manner presents commentary, sidelights, or interesting points of information.
Symbols on Equipment
These icons may be located on equipment in areas where hazardous conditions may exist.
Any surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards. The enclosed area contains no operator-serviceable parts. WARNING: To reduce the risk of injury from electrical shock hazards, do not open this enclosure.
Any RJ-45 receptacle marked with these symbols indicates a Network Interface Connection. WARNING: To reduce the risk of electrical shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.
Any surface or area of the equipment marked with these symbols indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists. WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.
Power supplies or systems marked with these symbols indicate that the equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electrical shock, remove all power cords to completely disconnect power from the system.
Rack Stability
WARNING: To reduce the risk of personal injury or damage to the equipment, be sure that:
The leveling jacks are extended to the floor.
The full weight of the rack rests on the leveling jacks.
The stabilizing feet are attached to the rack in single-rack installations.
The racks are coupled together in multiple-rack installations.
A rack may become unstable if more than one component is extended for any reason. Extend only one component at a time.
Getting Help
If you have a problem and have exhausted the information in this guide, you
can get further information and other help in the following locations.
Compaq Technical Support
You are entitled to free hardware technical telephone support for your product for as long as you own the product. A technical support specialist will help you diagnose the problem or guide you to the next step in the warranty process.
In North America, call the Compaq Technical Phone Support Center at 1-800-OK-COMPAQ. This service is available 24 hours a day, 7 days a week.
Outside North America, call the nearest Compaq Technical Support Phone Center. Telephone numbers for worldwide Technical Support Centers are listed on the Compaq website. Access the Compaq website by logging on to the Internet at:
http://www.compaq.com
Be sure to have the following information available before you call Compaq:
Technical support registration number (if applicable)
Product serial numbers
Product model names and numbers
Applicable error messages
Add-on boards or hardware
Third-party hardware or software
Operating system type and revision level
Detailed, specific questions
Compaq Website
The Compaq website has information on this product as well as the latest drivers and Flash ROM images. You can access the Compaq website by logging on to the Internet at:
http://www.compaq.com.
Compaq Authorized Reseller
For the name of your nearest Compaq authorized reseller:
In the United States, call 1-800-345-1518.
In Canada, call 1-800-263-5868.
Elsewhere, see the Compaq website for locations and telephone
numbers.
Chapter 1
Architecture of the Compaq ProLiant
Clusters HA/F100 and HA/F200
Overview of Compaq ProLiant Clusters HA/F100 and HA/F200 Components
A cluster is a loosely coupled collection of servers and storage that acts as a
single system, presents a single-system image to clients, provides protection
against system failures, and provides configuration options for static load
balancing.
Clustering is an established technology that may provide one or more of the
following benefits:
Availability
Scalability
Manageability
Investment protection
Operational efficiency
This chapter discusses the role each of the cluster components plays in bringing a complete clustering solution to your computing environment.
Compaq ProLiant Cluster HA/F100
The Compaq ProLiant™ Cluster HA/F100 includes these hardware solution components:
Two Compaq ProLiant servers
One or more Compaq StorageWorks™ RAID Array 4000 or Compaq
StorageWorks RAID Array 4100 (RA4000/4100) storage systems.
One Compaq StorageWorks RAID Array 4000 Controller per
RA4000/4100 storage system
One of the following hubs or switches:
Compaq StorageWorks Fibre Channel Storage Hub (7- or 12-port)
Compaq StorageWorks FC-AL Switch 8 with or without a Compaq
StorageWorks FC-AL Switch 3-port Expansion Module
One of the following host bus adapters per server:
Compaq StorageWorks Fibre Channel Host Adapter/P
Compaq StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter
Network interface cards (NICs)
Gigabit Interface Converter-Shortwave (GBIC-SW) modules
Cables:
Multi-mode Fibre Channel cable
Ethernet crossover cable
Network (LAN) cable
The Compaq ProLiant Cluster HA/F100 uses these software solution
components:
One of the following Microsoft Windows operating systems:
Microsoft Windows NT Server 4.0, Enterprise Edition with
Microsoft Cluster Server (MSCS)
Microsoft Windows 2000 Advanced Server with Microsoft Cluster
Service (MSCS)
Compaq SmartStart and Support Software CD
Compaq Cluster Verification Utility (CCVU)
Compaq Insight Manager (optional)
Compaq Insight Manager XE (optional)
Compaq Intelligent Cluster Administrator (optional)
NOTE: See Appendix C, “Software and Firmware Versions,” for the necessary software version levels for your cluster.
The following illustration depicts the HA/F100 configuration:
Figure 1-1. Hardware components of the Compaq ProLiant Cluster HA/F100 (Node 1 and Node 2 joined by a dedicated interconnect and the LAN, with each node connected through a storage hub or switch to the RA4000/4100)
Compaq ProLiant Cluster HA/F200
The Compaq ProLiant Cluster HA/F200 adds Compaq Redundancy Manager (for Windows NTS/E) and Compaq SANworks™ Secure Path for Windows 2000 on RAID Array 4000/4100 (for Windows 2000 Advanced Server) software and a redundant Fibre Channel Arbitrated Loop (FC-AL) to the HA/F100 configuration. The redundancy manager software, in conjunction with redundant Fibre Channel loops, enhances the high availability features of the HA/F200.
The Compaq ProLiant Cluster HA/F200 includes these hardware solution components:
Two Compaq ProLiant servers
One or more Compaq StorageWorks RAID Array 4000 or Compaq
StorageWorks RAID Array 4100 (RA4000/4100) storage systems
Two Compaq StorageWorks RAID Array 4000 Controllers per
RA4000/4100 storage system
Two of the following hubs or switches:
Compaq StorageWorks Fibre Channel Storage Hub (7- or 12-port)
Compaq StorageWorks Fibre Channel FC-AL Switch 8 with or
without the Compaq StorageWorks FC-AL Switch 3-port Expansion Module
Two of the following host bus adapters per server:
Compaq StorageWorks Fibre Channel Host Adapter/P
Compaq StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter
Network interface cards (NICs)
Gigabit Interface Converter-Shortwave (GBIC-SW) modules
Cables:
Multi-mode Fibre Channel cable
Ethernet crossover cable
Network (LAN) cable
The Compaq ProLiant Cluster HA/F200 includes these software solution
components:
One of the following Microsoft Windows operating systems:
Microsoft Windows NT Server 4.0, Enterprise Edition with
Microsoft Cluster Server (MSCS)
Microsoft Windows 2000 Advanced Server with Microsoft Cluster
Service (MSCS)
Compaq SmartStart and Support Software CD
Compaq Redundancy Manager (Fibre Channel) for Windows NT
Compaq SANworks Secure Path for Windows 2000 on
RAID Array 4000/4100
Compaq Cluster Verification Utility (CCVU)
Compaq Insight Manager (optional)
Compaq Insight Manager XE (optional)
Compaq Intelligent Cluster Administrator (optional)
NOTE: See Appendix C, “Software and Firmware Versions,” for the necessary software version levels for your cluster.
The following illustration depicts the basic HA/F200 configuration.
Figure 1-2. Hardware components of the Compaq ProLiant Cluster HA/F200 (Node 1 and Node 2 joined by a dedicated interconnect and the LAN, with each node connected through redundant storage hubs or switches to the RA4000/4100)
Compaq ProLiant Servers
Compaq industry-standard servers are a primary component of all models of Compaq ProLiant Clusters. At the high end of the ProLiant server line, several high availability and manageability features are incorporated as a standard part of the server feature set. These include online backup processors, a PCI bus with hot-plug capabilities, redundant hot-pluggable fans, redundant processor power modules, redundant Network Interface Controller (NIC) support, dual-ported hot-pluggable 10/100 NICs, and redundant hot-pluggable power supplies (on most high-end models). Many of these features are also available in the low-end and midrange Compaq ProLiant server lines.
Compaq has logged thousands of hours testing multiple models of Compaq servers in clustered configurations and has successfully passed the Microsoft Hardware Cluster Certification Test Suite on numerous occasions. In fact, Compaq was the first vendor to be certified using a shared storage subsystem connected to ProLiant servers through Fibre Channel Arbitrated Loop technology.
NOTE: Visit the Compaq High Availability website (http://www.compaq.com/highavailability) to obtain a comprehensive list of cluster-certified servers.
The Microsoft clustering software (MSCS) is based on a cluster architecture known as shared storage clustering, in which clustered servers share access to a common set of hard drives. MSCS requires all clustered (shared) data to be stored in an external storage system.
The Compaq StorageWorks RA4000/4100 storage system is the shared storage system for the Compaq ProLiant Clusters HA/F100 and HA/F200.
Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100
The Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 (RA4000/4100) is the storage cabinet that contains the disk drives, power supplies, and array controllers. The RA4000/4100 supports the same hot-pluggable drives as Compaq servers and Compaq ProLiant storage systems, as well as online capacity expansion, online spares, and the RAID fault tolerance of SMART-2 Array Controller technology. The RA4000/4100 also supports hot-pluggable, redundant power supplies and fans, and hot-pluggable hard drives.
The HA/F100 and HA/F200 ProLiant Clusters must have at least one
RA4000/4100 set up as external shared storage. Consult the Order and
Configuration Guide for Compaq ProLiant Cluster HA/F100 and HA/F200 at
the Compaq ProLiant Clusters High Availability website
(http://www.compaq.com/highavailability) to determine the maximum supported
cluster configuration.
For more detailed information on the RA4000/4100, refer to the following
documents:
Compaq StorageWorks RAID Array 4000 User Guide
Compaq StorageWorks RAID Array 4100 User Guide
Compaq StorageWorks RAID Array 4000 Controller
The Compaq StorageWorks RAID Array 4000 Controller (RA4000
Controller) is fully RAID capable and manages all of the drives in the
RA4000/4100 storage array. Each RA4000/4100 is shipped with one controller
installed. In an HA/F100 cluster, each array controller is connected to both
servers through a single Fibre Channel storage hub or FC-AL switch. In an
HA/F200 cluster, the addition of a second Compaq StorageWorks RA4000
Redundant Controller is required to provide redundancy.
These redundant controllers are connected to each server through two separate,
and redundant, Fibre Channel storage hubs or FC-AL switches. This
dual-connection configuration implements a vital aspect of the enhanced high
availability features of the HA/F200 cluster. Each of these components is
discussed in the following sections. For more information, refer to the Compaq
StorageWorks RAID Array 4000 Redundant Array Controller Configuration
Poster.
For more information about shared storage clustering, refer to the Microsoft
clustering documentation.
Connection Infrastructure for the RA4000/4100
The servers in a Compaq ProLiant Cluster HA/F100 and HA/F200 are connected to one or more RA4000/4100 shared external storage systems using industry-standard Fibre Channel Arbitrated Loop (FC-AL) technology. The components used to implement the Fibre Channel Arbitrated Loop include shortwave (multi-mode) fiber optic cables, Gigabit Interface Converters-Shortwave (GBIC-SW) and Fibre Channel storage hubs or FC-AL switches.
Compaq StorageWorks Fibre Channel Storage Hubs
The Compaq StorageWorks Fibre Channel Storage Hub is a critical component of the FC-AL configuration and allows up to five RA4000/4100s to be connected to the cluster servers in a “star” topology. For the HA/F100, a single hub is used. For the HA/F200, two redundant Fibre Channel storage hubs are used. Either the 7-port or 12-port hub may be used in either type of cluster.
If the maximum number of supported RA4000/4100s (currently five) is connected to either type of cluster using a 12-port hub, there will be unused ports. Compaq does not currently support using these ports to connect additional RA4000/4100s. Other FC-AL capable devices, such as tape backup systems, should not be connected to these unused ports under any circumstances.
For more information, refer to the following guides:
Compaq StorageWorks Fibre Channel Storage Hub 7 Installation Guide
Compaq StorageWorks Fibre Channel Storage Hub 12 Installation
Guide
Compaq StorageWorks FC-AL Switch 8
The Compaq StorageWorks FC-AL Switch 8 is the core component of an affordable storage area network (SAN) solution that consolidates storage, simplifies storage management, helps manage explosive data growth, and reduces business downtime. The FC-AL Switch 8 is a high-performance switch-engine interconnect component that helps you take an important step toward building a low-cost SAN. Built on the stable, easy-to-use, and mature FC-AL protocol, the FC-AL Switch 8 offers eight ports with dedicated, non-blocking 100-MB/s point-to-point connections.
Using the StorageWorks FC-AL Switch 8 as the cornerstone of your SAN
deployment, you can start by combining your primary storage components,
such as the RA4000/4100 storage systems, with secondary storage
Enterprise Backup Solution components, such as tape libraries
(TL890/TL891/TL895), on the same departmental SAN. With a 12-Gb/s switch
engine, the StorageWorks FC-AL Switch 8 delivers the necessary resiliency
and speed to isolate your client-server network from heavier storage network
traffic. Furthermore, as your connection needs grow, the 8-port StorageWorks
FC-AL Switch 8 can be expanded to 11 ports using the StorageWorks FC-AL
Switch 3-Port Expansion Module.
The StorageWorks FC-AL Switch 8 can be easily managed using management
tools such as StorageWorks Command Console (SWCC), Compaq Insight
Manager XE (CIM-XE), Compaq Insight Manager (CIM), the Array
Configuration Utility (ACU), and the StorageWorks Switch Management
Utility.
For more information, refer to the Compaq StorageWorks Fibre Channel
FC-AL Switch 8 Installation Guide.
Compaq StorageWorks Fibre Channel Host Adapter/P or Compaq StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter
The Compaq StorageWorks Fibre Channel Host Adapter/P and the Compaq
StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter are the interfaces
between the servers and the RA4000/4100 storage system. At least two host
bus adapters, one for each cluster node, are required in the Compaq ProLiant
Cluster HA/F100. At least four host bus adapters, two for each cluster node,
are required in the HA/F200 configuration.
For more information, refer to the following documents:
Compaq StorageWorks Fibre Channel Host Bus Adapter Installation
Guide
Compaq StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter
Installation Guide
Gigabit Interface Converter-Shortwave
Two Gigabit Interface Converter-Shortwave (GBIC-SW) modules are required for each Fibre Channel cable installed. Two GBIC-SW modules are provided with each RA4000/4100, RA4000 Controller, and host bus adapter.
GBIC-SW modules hot-plug into Fibre Channel storage hubs, array controllers, and host bus adapters. These converters provide ease of expansion and 100-MB/s performance. GBIC-SW modules support distances up to 500 meters using multi-mode fiber optic cable.
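As a worked illustration of these requirements (a minimal HA/F100 with one storage system, assuming one Fibre Channel cable per device-to-hub link):

   2 host bus adapters x 1 cable each = 2 cables
   1 RA4000 Controller x 1 cable = 1 cable
   3 cables x 2 GBIC-SW modules per cable = 6 GBIC-SW modules

Because two modules ship with each host adapter, controller, and storage system, a minimal configuration of this kind can typically be built from the modules included with the components.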
Cluster Interconnect
The cluster interconnect is a data path over which nodes of a cluster communicate. This type of communication is termed intracluster communication. At a minimum, the interconnect consists of two network adapters (one in each server) and a cable connecting the adapters.
The cluster nodes use the interconnect data path to:
Communicate individual resource and overall cluster status
Send and receive heartbeat signals
Update modified registry information
IMPORTANT: MSCS requires TCP/IP as the cluster communication protocol. When configuring the interconnects, be sure to enable TCP/IP.
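As a quick check that TCP/IP is enabled and reachable over the interconnect, you can inspect each node's configuration and ping its partner from a command prompt. The addresses 10.0.0.1 and 10.0.0.2 below are hypothetical private interconnect addresses chosen for illustration, not values required by MSCS:

   C:\> ipconfig /all   (confirm the interconnect adapter has a TCP/IP address)
   C:\> ping 10.0.0.2   (from Node 1, confirm the path to Node 2)
   C:\> ping 10.0.0.1   (from Node 2, confirm the path to Node 1)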
Client Network
Every client/server application requires a local area network, or LAN, over which client machines and servers communicate. The components of the LAN are no different than with a stand-alone server configuration.
Because clients desiring the full advantage of the cluster will now connect to the cluster rather than to a specific server, configuring client connections will differ from those for a stand-alone server. Clients will connect to virtual servers, which are cluster groups that contain their own IP addresses.
Within this guide, communication between the network clients and the cluster is termed cluster-to-LAN communication.
Private or Public Interconnect
There are two types of interconnect paths:
A private interconnect (also known as a dedicated interconnect) is used
solely for intracluster (node-to-node) communication. Communication to and from network clients does not occur over this type of interconnect.
A public interconnect not only takes care of communication between the
cluster nodes, it also shares the data path with communication between the cluster and its network clients.
For more information about Compaq-recommended interconnect strategies,
refer to the White Paper, “Increasing Availability of Cluster Communications
in a Windows NT Cluster,” available from the Compaq High Availability
website (http://www.compaq.com/highavailability).
Interconnect Adapters
Ethernet adapters or Compaq ServerNet™ adapters can be used for the
interconnect between the servers in a Compaq ProLiant Cluster. Either
10-Mb/s or 100-Mb/s Ethernet may be used. ServerNet adapters have
built-in redundancy and provide a high-speed interconnect with 100-MB/s
aggregate throughput.
Ethernet adapters can be connected together using an Ethernet crossover cable
or a private Ethernet hub. Both of these options provide a dedicated
interconnect.
Implementing a direct Ethernet or ServerNet connection minimizes the
potential single points of failure.
Redundant Interconnects
To reduce potential disruptions of intracluster communication, use a redundant path over which communication can continue if the primary path is disrupted.
Compaq recommends configuring the client LAN as a backup path for intracluster communication. This provides a secondary path for the cluster heartbeat in case the dedicated primary path for intracluster communications fails. This is configured when installing the cluster software, or it can be added later using the MSCS Cluster Administrator.
It is also important to provide a redundant path to the client LAN. This can be done by using a second NIC as a hot standby for the primary client LAN NIC.
There are two ways to achieve this, and the method you choose depends on your hardware. One way is through the use of the Redundant NIC Utility available on all Compaq 10/100 Fast Ethernet products. The other is through the use of the Network Fault Tolerance feature designed to operate with the Compaq 10/100 Intel silicon-based NICs. Both features allow two NICs to be configured so that one is a hot backup for the other.
For detailed information about interconnect redundancy, refer to the Compaq White Paper, “Increasing Availability of Cluster Communications in a Windows NT Cluster,” available from the Compaq High Availability website (http://www.compaq.com/highavailability).
Cables
Three general categories of cables are used for Compaq ProLiant HA/F100 and HA/F200 clusters:
Server to Storage
Shortwave (multi-mode) fiber optic cables are used to connect the servers, Fibre Channel storage hubs and FC-AL switches, and RA4000/4100s in a Fibre Channel Arbitrated Loop configuration.
Cluster Interconnect
Two types of cluster interconnect cables may be used depending on the type of
devices used to implement the interconnect, and whether the interconnect is
dedicated or shared:
Ethernet
If Ethernet NICs are used to implement the interconnect, there are three options:
Dedicated Interconnect Using an Ethernet Crossover Cable:
An Ethernet crossover cable (supplied in both the HA/F100 and HA/F200 kits) can be used to connect the NICs directly together to create a dedicated interconnect.
Dedicated Interconnect Using Standard Ethernet Cables and a
Private Ethernet Hub: Standard Ethernet cables can be used to connect the NICs together through a private Ethernet hub to create another type of dedicated interconnect. Note that an Ethernet crossover cable should not be used with an Ethernet hub, because the hub performs the crossover function.
Shared Interconnect Using Standard Ethernet Cables and a
Public Hub: Standard Ethernet cables may also be used to connect the NICs to a public network to create a nondedicated interconnect.
ServerNet
If Compaq ServerNet adapters are used to implement the interconnect, special ServerNet cables must be used.
Network Interconnect
Standard Ethernet cables are used to provide this type of connection.
Microsoft Software
Microsoft Windows NT Server 4.0/Enterprise Edition (Windows NTS/E) and Microsoft Windows 2000 Advanced Server are the operating systems for the Compaq ProLiant Clusters HA/F100 and HA/F200. The Microsoft clustering software, Cluster Server for Windows NTS/E and Microsoft Cluster Service for Windows 2000 Advanced Server (MSCS), provides the underlying technology to:
Send and receive heartbeat signals between the cluster nodes.
Monitor the state of each cluster node.
Initiate failover and failback events.
NOTE: On Windows NT, MSCS runs only with Windows NTS/E. Previous versions of Windows NT are not supported.
NOTE: The HA/F200 only supports MSCS with Windows 2000 Advanced Server. Other versions of Windows 2000 are not supported.
Microsoft Cluster Administrator, another component of Windows NTS/E and Windows 2000 Advanced Server, allows you to do the following:
Define and modify cluster groups
Manually control the cluster
View the current state of the cluster
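Many of these tasks can also be performed from a command prompt with the Cluster.exe utility installed with MSCS. The following is a minimal sketch, not a complete command reference; the group name "File Share 1" is hypothetical:

   C:\> cluster group   (view the state and current owner of every cluster group)
   C:\> cluster group "File Share 1" /moveto:NODE2   (manually move a group to the other node)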
NOTE: Microsoft Windows NTS/E or Microsoft Windows 2000 Advanced Server must be purchased separately for your Compaq ProLiant Cluster, through your Microsoft reseller.
Compaq Software
Compaq offers an extensive set of features and optional tools to support the
configuration and management of your Compaq ProLiant Cluster:
Compaq SmartStart and Support Software CD
Compaq Redundancy Manager (Fibre Channel)
Compaq SANworks Secure Path for Windows 2000 on RAID Array
4000/4100
Compaq Insight Manager
Compaq Insight Manager XE
Compaq Intelligent Cluster Administrator
Compaq Cluster Verification Utility (CCVU)
Compaq SmartStart and Support Software CD
Compaq SmartStart is located on the SmartStart and Support Software CD
included in the Compaq Server Setup and Management Pack shipped with
ProLiant servers. SmartStart is the recommended way to configure the
Compaq ProLiant Cluster HA/F100 or HA/F200. SmartStart uses a step-by-step
process to configure the cluster and load the system software. For
information concerning SmartStart, refer to the Compaq Server Setup and
Management pack.
For information about using SmartStart to install the Compaq ProLiant
Cluster HA/F100 and HA/F200, see chapters 3 and 4 of this guide.
Compaq Array Configuration Utility
The Compaq Array Configuration Utility, found on the Compaq SmartStart
and Support Software CD, is used to configure the array controller, add disk
drives to an existing configuration, and expand capacity.
Compaq System Configuration Utility
The SmartStart and Support Software CD also contains the Compaq System Configuration Utility. This utility is the primary means to configure hardware devices in your server, such as I/O addresses, boot order of disk controllers, and so on.
For information concerning the Compaq System Configuration Utility, refer to the Compaq Server Setup and Management pack.
Compaq Server Support (SSD) for Microsoft Windows NT 4.0
The Compaq Server Support (SSD) for Microsoft Windows NT 4.0 contains device drivers and utilities that enable you to take advantage of specific capabilities offered on Compaq products. These drivers are provided for use with Compaq hardware only.
The SSD is included in the Compaq Server Setup and Management pack.
Compaq Support Paq for Microsoft Windows 2000
The Compaq Support Paq for Microsoft Windows 2000 is an advanced software delivery tool that replaces the familiar SSD utility vehicles used for support of Windows NT 3.51 and Windows NT 4.0. The Compaq Support Paq for Microsoft Windows 2000 includes an installer that analyzes system requirements and installs all drivers.
The Compaq Support Paq can be installed or downloaded from the Compaq website (www.compaq.com/support).
Options ROMPaq Utility
The SmartStart and Support Software CD also contains the Options ROMPaq™ utility. Options ROMPaq updates the firmware on the Compaq StorageWorks RA4000 Controllers and the hard drives.
Fibre Channel Fault Isolation Utility (FFIU)
The SmartStart and Support Software CD also contains the Fibre Channel Fault Isolation Utility (FFIU). The FFIU verifies the integrity of a new or existing FC-AL installation. This utility provides fault detection and help in locating a failing device on the FC-AL.
Compaq Redundancy Manager (Fibre Channel)
Compaq Redundancy Manager, a software component that works in
conjunction with the Windows NTS/E operating system and the Windows NT
file system (NTFS), increases the availability of both single-server and
clustered systems that use the Compaq StorageWorks RAID Array 4000 and
RAID Array 4100 Storage System and Compaq ProLiant servers. Redundancy
Manager can detect failures in the host bus adapter, array controller or other
Fibre Channel Arbitrated Loop components. When such a failure occurs, I/O
processing is rerouted through a redundant path, allowing applications to
continue processing. This rerouting is transparent to NTFS. Therefore, in an
HA/F200 configuration, it is not necessary for MSCS to fail resources over to
the other node. Redundancy Manager, in combination with redundant
hardware components, is the basis for the enhanced high availability features
of the HA/F200 running Windows NTS/E.
The Compaq Redundancy Manager (Fibre Channel) CD is included in the
Compaq ProLiant Cluster HA/F200 kit. Redundancy Manager is licensed on a
single server or single cluster of servers basis. For more information about
installing Redundancy Manager in a Compaq ProLiant Cluster HA/F200, see
Chapter 3 of this guide. For detailed information about the Redundancy
Manager software, refer to the Redundancy Manager documentation included
in your cluster kit.
Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000/4100
Compaq SANworks Secure Path is a software component that works in conjunction with the Windows 2000 Advanced Server operating system and the Windows NT file system (NTFS). Secure Path increases the availability of both single-server and clustered systems that use the Compaq StorageWorks RA4000/4100 storage system and Compaq ProLiant servers. Secure Path can detect failures in the host bus adapter, array controller or other Fibre Channel Arbitrated Loop components.
When such a failure occurs, I/O processing is rerouted through a redundant path, allowing applications to continue processing. This rerouting is transparent to Windows 2000 Advanced Server. Therefore, in an HA/F200 configuration, it is not necessary for MSCS to fail resources over to the other node. Secure Path, in combination with redundant hardware components, is the basis for the enhanced high availability features of the HA/F200 running Windows 2000 Advanced Server.
Two licenses of Secure Path are included in your Compaq ProLiant Cluster HA/F200 Cluster Kit. Secure Path is licensed on a per server basis and can be purchased separately or in the cluster kit.
For more information about installing Secure Path in a Compaq ProLiant Cluster HA/F200, see Chapter 3 of this guide. For detailed information about the Secure Path software, refer to the Secure Path documentation included in your cluster kit.
Compaq Cluster Verification Utility
CCVU is a software utility that can be used to validate several key aspects of the Compaq ProLiant Cluster HA/F100 and HA/F200 and their components.
The stand-alone utility can be run from either of the cluster nodes or remotely from a network client attached to the cluster. When CCVU is run remotely, it can validate any number of Windows NTS/E and Windows 2000 Advanced Server clusters to which the client is attached.
The CCVU tests your cluster configuration in the following categories:
A node test verifies that the clustered servers are supported in HA/F100
and HA/F200 cluster configurations.
Networking tests verify that your setup meets the minimum cluster
requirements for network cards, connectivity, and TCP/IP configuration.
Storage tests verify the presence and minimum configuration
requirements of supported host bus adapters, array controllers, and external storage subsystem.
System software tests verify that Microsoft Windows NTS/E or
Windows 2000 Advanced Server has been installed.
The Compaq Cluster Verification Utility CD is included in the HA/F100 and
HA/F200 cluster kits. For detailed information about the CCVU, refer to the
online documentation (CCVU.HLP) included on the CD.
Compaq Insight Manager
Compaq Insight Manager, loaded from the Compaq Management CD that is
shipped with each ProLiant server, is an easy-to-use, console-based software
utility for collecting server and cluster information. Compaq Insight Manager
performs the following functions:
Monitors fault conditions and system status
Monitors shared storage and interconnect adapters
Forwards server alert fault conditions
Remotely controls servers
The Integrated Management Log collects and feeds data to Compaq Insight
Manager. This log is used with the Insight Management Desktop (IMD),
Remote Insight (optional controller), and SmartStart.
In Compaq servers, each hardware subsystem, such as disk storage, system
memory, and system processor, has a robust set of management capabilities.
Compaq Full Spectrum Fault Management notifies of impending fault
conditions and keeps the server up and running in the unlikely event of a
hardware failure.
For information concerning Compaq Insight Manager, refer to the Compaq
Server Setup and Management pack shipped with each ProLiant server.
Compaq Insight Manager XE
Compaq Insight Manager XE is a Web-based management system and is located on the Compaq Management CD shipped with each ProLiant server. It can be used in conjunction with Compaq Insight Manager agents as well as its own Web-enabled agents. This browser-based utility provides increased flexibility and efficiency for the administrator. It extends the functionality of Compaq Insight Manager and works in conjunction with the Cluster Monitor subsystem, providing a common data repository and control point for enterprise servers and clusters, desktops, and other devices using either SNMP- or DMI-based messaging.
Cluster Monitor
Cluster Monitor is a Web-based monitoring subsystem of Compaq Insight Manager XE. With Cluster Monitor, you can view all clusters from a single browser and configure monitor points and specific operational performance thresholds that will alert you when these thresholds have been met or exceeded on your application systems. Cluster Monitor relies heavily on the Compaq Insight Manager agents for basic information about system health. It also has custom agents that are designed specifically for monitoring cluster health. Cluster Monitor provides access to the Compaq Insight Manager alarm, device, and configuration information.
Cluster Monitor allows the administrator to view some or all of the clusters, depending on administrative controls that are specified when clusters are discovered by Compaq Insight Manager XE.
Compaq Intelligent Cluster Administrator
Compaq Intelligent Cluster Administrator extends Compaq Insight Manager and Cluster Monitor by enabling administrators to configure and manage ProLiant clusters from a Web browser. With Compaq Intelligent Cluster Administrator, you can copy, modify, and dynamically install a cluster configuration on the same physical cluster or on any physical cluster anywhere in the system, through the Web.
Compaq Intelligent Cluster Administrator checks for any cluster destabilizing conditions, such as disk thresholds or application slowdowns, and reallocates cluster resources to meet processing demands. This software also performs dynamic allocation of cluster resources that may be failing without causing the cluster to fail over.
Compaq Intelligent Cluster Administrator also provides initialized cluster
configurations that allow rapid cluster generation as well as cluster
configuration builder wizards for extending the Compaq initialized
configurations.
Compaq Intelligent Cluster Administrator is included with the
HA/F200 cluster kit and can be purchased as a stand-alone component for the
HA/F100 cluster. Intelligent Cluster Administrator is licensed on a per cluster
basis.
Resources for Application Installation
The client/server software applications are among the key components of any
cluster. Compaq is working with its key software partners to ensure that
cluster-aware applications are available and that the applications work
seamlessly on Compaq ProLiant clusters.
Compaq provides a number of Integration TechNotes and White Papers to
assist you with installing these applications in a Compaq ProLiant Cluster
environment.
Visit the Compaq High Availability website
(http://www.compaq.com/highavailability) to download current versions of these
TechNotes and other technical documents.
IMPORTANT: Your software applications may need to be updated to take full advantage of clustering. Contact your software vendors to check whether their software supports MSCS and to ask whether any patches or updates are available for MSCS operation.
Chapter 2
Designing the Compaq ProLiant
Clusters HA/F100 and HA/F200
Before connecting any cables or powering up any machines, it is important to
understand how all of the cluster components and concepts fit together to meet
your information system needs. The major topics discussed in this chapter are:
Planning Considerations
Capacity Planning
Network Considerations
Failover/Failback Planning
In addition to reading this chapter, read the planning chapter in the Microsoft
documentation that came with your operating system.
Planning Considerations
To correctly assess capacity, network, and failover needs in your business environment, it is important to understand clustering and the things that affect the availability of clusters. The items detailed in this section will help you design your Compaq ProLiant Cluster so that it addresses your specific availability needs.
Cluster configuration design is addressed in “Cluster Configurations.”
A step-by-step approach to creating cluster groups is discussed in
“Cluster Groups.”
Recommendations regarding how to reduce or eliminate single points of failure are contained in the “Reducing Single Points of Failure in the HA/F100 Configuration” section of this chapter. By definition, a highly available system is not a continuously available system and therefore may still have single points of failure.
NOTE: The discussion in this chapter relating to single points of failure applies only to the Compaq ProLiant Cluster HA/F100. The HA/F200 includes dual redundant loops, which eliminate certain single points of failure present in the HA/F100.
Cluster Configurations
Although there are many ways to set up clusters, most configurations fall into two categories: active/active and active/standby.
Active/Active Configuration
The core definition of an active/active configuration is that each node is actively processing data when the cluster is in a normal operating state. Both the first and second nodes are “active.” Because both nodes are processing client requests, an active/active design maximizes the use of all hardware in both nodes.
An active/active configuration has two primary designs:
The first design uses MSCS failover capabilities on both nodes, enabling
Node 1 to fail over clustered applications to Node 2 and enabling Node 2 to fail over clustered applications to Node 1. This design optimizes availability since both nodes can fail over applications to each other.
The second design is a one-way failover. For example, the Microsoft
clustering software may be set up to allow Node 1 to fail over clustered applications to Node 2, but not to allow Node 2 to fail over clustered applications to Node 1. While this design increases availability, it does not maximize availability since failover is configured on only one node.
When designing cluster nodes to fail over to each other, ensure that each
server has enough capacity, memory, and processor power to run all
applications (all applications running on the first node plus all clustered
applications running on the other node).
When designing your cluster so that only one node (Node 1) fails over to the
other (Node 2), ensure that Node 2 has enough capacity, memory, and CPU
power to execute not only its own applications, but to run the clustered
applications that can fail over from Node 1.
Another consideration when determining your servers’ hardware is
understanding your clustered applications’ required level of performance when
the cluster is in a degraded state (when one or more clustered applications is
running on a secondary node). If Node 2 is running near peak performance
when the cluster is in a normal operating state, and if several clustered
applications are failed over from Node 1, Node 2 will likely execute the
clustered applications more slowly than when they were executed on Node 1.
Some level of performance degradation may be acceptable; how much degradation is acceptable is a decision each company must make.
Example 1: File & Print/File & Print
An example business scenario (Figure 2-1) involves two file and print servers. The Human Resources (HR) department uses one server, and the Marketing department uses the other. Both servers actively run their own file shares and print spoolers while the cluster is in its normal state (an active/active design).
If the HR server encounters a failure, it fails over its file and print services to the Marketing server. HR clients experience a slight disruption of service while the file shares and print spooler fail over to their secondary server. Any jobs that were in the print spooler before the failure event will now print from the Marketing server.
Figure 2-1. Active/active example 1 (Human Resources file and print services on one node and Marketing file and print services on the other, each node reserving capacity for its partner's services, with both nodes attached to shared storage)
When failover is complete, all of the HR clients have full access to their file shares and print spooler. Marketing clients do not experience any disruption of service. All clients may experience slowed performance while the cluster runs in a degraded state.
Example 2: Database/Database
Another scenario (Figure 2-2) has two distinct database applications running
on two separate cluster nodes. One database application maintains Human
Resources records, and its primary node is set to the HR database node. The
other database application is used for market research, and its primary node is
set to the Marketing database node.
Figure 2-2. Active/active example 2 (Node 1 and Node 2 each run a database application and attach to shared storage)
While in a normal state, both cluster nodes run at expected performance levels.
If the Marketing server encounters a failure, the market research application
and associated data resources fail over to their secondary node, the HR
database server. The Marketing clients experience a slight disruption of
service while the database resources are failed over, the database transaction
log is rolled back, and the information in the database is validated. When the
database validation is complete, the market research application is brought
online on the HR database node and the Marketing clients can reconnect to it.
While the Marketing database validation is occurring, the HR clients do not
experience any disruption of service.
Example 3: File & Print/Database
In this example (Figure 2-3), a business uses a single server to run its order entry department. The same department has a file and print server. While order entry is business-critical and requires maximum availability, the file and print server can be unavailable for several hours without impacting revenue. In this scenario, the order entry database is configured to use the file and print server as its secondary node. However, the file and print server will not be configured to fail over applications to the order entry server.
Figure 2-3. Active/active example 3 (one node runs the order entry database; the other runs file and print services and reserves capacity for the order entry database; both nodes attach to shared storage holding the order entry and file and print data)
If the node running the order entry database encounters a failure, the database
fails over to its secondary node. The order entry clients experience a slight
disruption of service while the database resources are failed over, the database
transaction log is rolled back, and the information in the database is validated.
When the database validation is complete, the order entry application is
brought online on the file and print server and the clients can reconnect to it.
While the database validation is occurring, file and print activities continue
without disruption.
If the file and print server encounters a failure, those services are not failed
over to the order entry server. File and print services are offline until the
problem is resolved and the node is brought back online.
Active/Standby Configuration
The primary difference between an active/active configuration and an
active/standby configuration is the number of servers actively processing data.
In active/standby, only one server is processing data (active) while the other
(the standby server) is in an idle state.
The standby server must be logged in to the Windows NT or Windows 2000
domain and the Microsoft clustering software must be up and running.
However, no applications are running. The standby server's only purpose is to take over failed clustered applications from its partner. The standby server is not a preferred node for any clustered applications and, therefore, does not fail over any applications to its partner server.
Because the standby server does not process data until it accepts failed-over applications, the limited use of the server may not justify its cost. However, the cost of standby servers is justified when performance and availability are paramount to a business's operations.
The standby server should be designed to run all of the clustered applications
with little or no performance degradation. Since the standby server is not
running any applications while the cluster is in a normal operating state, a
failed-over clustered application will likely execute with the same speed and
response time as if it were executing on the primary server.
Example 4: Database/Standby Server
An example business scenario describes a mail order business whose competitive edge is quick product delivery (Figure 2-4). If the product is not delivered on time, the order is void and the sale is terminated. The business uses a single server to perform queries and calculations on order entry information, translating sales orders into packaging and distribution instructions for the warehouse. With an estimated downtime cost of $1,000/hour, the company determines that the cost of a standby server is justified.
This mission-critical (active) server is clustered with a standby server. If the active server encounters a failure, this critical application and all its resources fail over to the standby server, which validates the database and brings it online. The standby server now becomes active and the application executes at an acceptable level of performance.
Figure 2-4. Active/standby server example (Node 1 runs the mail order system; Node 2 is a standby node reserving capacity for the mail order system; shared storage holds the mail order database)
Cluster Groups
Understanding the relationship between your company's business functions
and cluster groups is essential to getting the most from your cluster. Business
functions rely on computer systems to support activities such as transaction
processing, information distribution, and information retrieval. Each computer
activity relies on applications or services, and each application depends on
software and hardware subsystems. For example, most applications need a
storage subsystem to hold their data files.
This section is designed to help you understand which subsystems, or
resources, must be available for either cluster node to run a clustered
application properly.
Creating a Cluster Group
The easiest approach to creating a cluster group is to start by designing a
resource dependency tree. A resource dependency tree has as its top level the
business function for which cluster groups are created. Each cluster group has
branches that indicate the resources upon which the group is dependent.
Resource Dependency Tree
The following steps describe the process of creating a resource dependency tree. Each step is illustrated by adding information to a sample resource dependency tree. The sample is for a hypothetical Web Sales Order business function, which consists of two cluster groups: a database server (a Windows NT or Windows 2000 application) and a Web server (a Windows NT or Windows 2000 service).
NOTE: For this example, it is assumed that each cluster group can communicate with the other even if they are not executing on the same node, for example, by means of an IP address. With this assumption, one cluster group can fail over to the other node, while the remaining cluster group continues to execute on its primary node.
1. List each business function that requires a clustered application
or service (Figure 2-5).
Figure 2-5. Resource dependency tree: step 1 (the Web Sales Order business function at the top, with the Web Sales Order cluster groups, Cluster Group #1 and Cluster Group #2, beneath it)
2. List each application or service required for each business function
(Figure 2-6).
Figure 2-6. Resource dependency tree: step 2 (the Web Server Service becomes Cluster Group #1 and the Database Server Application becomes Cluster Group #2; each group has placeholder slots for its resources and dependent resources)
3. List the immediate dependencies for each application or service (Figure 2-7).
Figure 2-7. Resource dependency tree: step 3 (the Web Server Service group depends on a network name with its IP address, the Web server service itself, and a physical disk resource containing the Web pages and Web scripts; the Database Server Application group depends on a network name with its IP address, the database application, and physical disk resources containing the database log files and data files)
4. Transfer the resource dependency tree into a Cluster Group Definition
worksheet.
Figure 2-8 illustrates the worksheet for the Web Sales Order business function.
A blank copy of the worksheet is provided in Appendix A.
Cluster Group Definition Worksheet

Cluster Function: Web Sales Order
Group #1: Web Server Service
Group #2: Database Server Application

Resource Definitions

Group #1 (Web Server Service)
  Resource #1: Network Name
    Sub Resource 1: IP Address
  Resource #2: Physical Disk Resource (contains Web pages and Web scripts)
  Resource #3: Web Server Service
  Resource #4: N/A

Group #2 (Database Server Application)
  Resource #1: Network Name
    Sub Resource 1: IP Address
  Resource #2: Physical Disk Resource (contains database log files)
  Resource #3: Physical Disk Resource (contains database data files)
  Resource #4: Database Application

Figure 2-8. Cluster Group Definition Worksheet (example)
Use the resource dependency tree concept to review your company’s availability needs. It is a useful exercise, directing you to record the exact design and definition of each cluster group.
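For planning purposes, the same tree can be captured in a few lines of data. The following is a minimal sketch that mirrors the Web Sales Order example from Figures 2-7 and 2-8; all group and resource names are the hypothetical ones used in this chapter, not values from any real cluster.

```python
# Illustrative sketch: the Web Sales Order dependency tree from
# Figure 2-7 captured as plain Python data. Each group maps its
# resources to the resources they immediately depend on.

web_sales_order = {
    "Web Server Service (Cluster Group #1)": {
        "Network Name": ["IP Address"],        # the name depends on an IP address
        "Physical Disk (Web pages and scripts)": [],
        "Web Server Service": [],
    },
    "Database Server Application (Cluster Group #2)": {
        "Network Name": ["IP Address"],
        "Physical Disk (DB log files)": [],
        "Physical Disk (DB data files)": [],
        "Database Application": [],
    },
}

def print_tree(groups):
    """Print each cluster group with its resources and sub-resources."""
    for group, resources in groups.items():
        print(group)
        for resource, deps in resources.items():
            print(f"  {resource}")
            for dep in deps:
                print(f"    depends on: {dep}")

print_tree(web_sales_order)
```

Walking a structure like this group by group is exactly the exercise the worksheet asks you to do on paper: every leaf is a resource that must be available on whichever node hosts the group.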
Reducing Single Points of Failure in the HA/F100 Configuration
The final planning consideration is reducing single points of failure. Depending on your needs, you may leave all vulnerable areas alone, accepting the risk associated with a potential failure. Or, if the risk of failure is unacceptable for a given area, you may elect to use a redundant component to minimize, or remove, the single point of failure.
NOTE: Although not specifically covered in this section, redundant server components (such as power supplies and processor modules) should be used wherever possible. These features will vary based upon your specific server model.
The single points of failure described in this section are:
Cluster interconnect
Fibre Channel data paths
Non-shared disk drives
Shared disk drives
NOTE: The Compaq ProLiant Cluster HA/F200 addresses the single points of failure listed above with its dual redundant loop configuration. For more information, refer to the “Enhanced High Availability Features of the HA/F200” section of this chapter.
Cluster Interconnect
The interconnect is the primary means for the cluster nodes to communicate. Intracluster communication is crucial to the health of the cluster. If communication between the cluster nodes ceases, the Microsoft clustering software must determine the state of the cluster and take action, in most cases bringing the cluster groups offline on one of the nodes and failing over all cluster groups to the other node.
Following are two strategies for increasing the availability of intracluster communication. Combined, these strategies provide even more redundancy.
Microsoft Clustering Software Configuration
Microsoft Cluster Server for Windows NTS/E and Cluster Service for
Windows 2000 Advanced Server (MSCS) allow you to configure a primary
and backup path for intracluster communication, which will reduce the
possibility of an intracluster communication disruption. Any network interface
card (NIC) in the nodes can be configured to serve as a backup path for node-
to-node communication. When the primary path is disrupted, the transfer of
communication responsibilities goes undetected by applications running on the
cluster. Whether a dedicated or public interconnect has been set up, a separate
NIC should be configured to act as a redundant interconnect. This is an easy
and inexpensive way to add redundancy to intracluster communication.
Redundant Interconnect Card
Another strategy to increase availability is to use a redundant interconnect
card. This may be done for either the dedicated intracluster communication
path, or for the client LAN. If you are using a dedicated, direct-connection
interconnect configuration, you can install a second dedicated,
direct-connection interconnect.
NOTE: If you are using the ServerNet option as the interconnect, the card itself has a built-in level of redundancy. Each ServerNet PCI adapter has two data ports, thereby allowing two separate cables to be run to and from each cluster node. If the ServerNet adapter determines that data is being sent from one adapter but not received by the other, it will automatically route the information through its other port.
There are two implementations that provide identical redundant NIC
capability. The implementation you choose will depend on your hardware. The
Compaq TLAN Teaming and Configuration Utility is supported on all
Compaq TI-based Ethernet and Fast Ethernet NICs, such as NetFlex-3 and
Netelligent 10/100 TX PCI Ethernet NICs. The Compaq Network Teaming
and Configuration Utility is designed to operate with the Compaq Intel-based
10/100 NICs. Combining these utilities with the appropriate NICs will enable
a seamless, undetectable failover of the primary interconnect to the redundant
interconnect.
NOTE: These two methods of NIC redundancy cannot be combined in a single redundant NIC pair: TI-based NICs may not be paired with Intel-based NICs to create a redundant pair. For more information, refer to the Compaq White Paper, “High Availability Options Supported by Compaq Network Interface Controllers,” available at the Compaq High Availability website (http://www.compaq.com/).
Because the purpose of the redundant interconnect is to increase the availability of the cluster, it is important to monitor the status of your redundant NICs. Compaq Insight Manager and Compaq Insight Manager XE simplify management of the interconnect by monitoring the state of the NIC. You can view status information and alert conditions for all cards in each node. If a failover event occurs due to a disruption in the heartbeat, you can use the Compaq Insight Manager tools to determine where the disruption originated.
Cluster-to-LAN Communication
Each cluster node must have at least one NIC that connects to the LAN. Through this connection, network clients can access applications and data on the cluster. If the LAN NIC fails in one of the nodes, any clients connected directly to the cluster node by means of the computer name, cluster node IP address, or MAC address of the NIC will no longer have access to their applications. Clients connected to a virtual server on the cluster (via the IP address or network name of a cluster group) reconnect to the cluster through the surviving cluster node.
Failure of a LAN NIC in a cluster node may have serious repercussions. If your cluster is configured with a dedicated interconnect and a single LAN NIC, the failure of a LAN NIC will prevent network clients from accessing cluster groups running on that node. If the interconnect path is not disrupted, it is possible that a failover will not occur. The applications will continue to run on the node with the failed NIC; however, clients will be unable to access them.
Install redundant NICs and use the proper redundant NIC utility to reduce the possibility of LAN NIC failure. When your cluster nodes are configured with the utility, the redundant NIC automatically takes over operation if the primary NIC fails. Clients maintain their connection with their primary node and, without disruption, continue to have access to their applications.
Compaq offers a dual-port NIC that can utilize the Compaq Redundant NIC Utility. This also reduces the possibility of the failure scenario described above. However, if the entire NIC or the node slot into which the NIC is placed fails, the same failure scenario will occur.
Compaq Insight Manager and Compaq Insight Manager XE monitor the health of any network cards used for the LAN. If any of the cards experience a fault, the Compaq Insight Manager tools mark the card as “Offline” and change its condition to the appropriate status.
Recommended Cluster Communication Strategy
The preceding two sections discussed the redundancy of intracluster and cluster-to-LAN communication. However, to obtain the most benefit while minimizing cost and complexity, view cluster communications as a single entity.
To create redundancy for both intracluster and cluster-to-LAN
communication, first, employ physical hardware redundancy for the LAN
NICs. Second, configure the Microsoft clustering software to use both the
primary and redundant LAN NIC as backup for intracluster communication.
With this strategy, your cluster can continue normal operations (without a failover event) when any of the following points of failure is encountered:
Failure of the interconnect card
Failure of the interconnect cable
Failure of the port on the LAN NIC
Failure of the LAN NIC (if redundant NICs, as opposed to dual-ported
NICs, are used)
Failure of the Ethernet cable running from a cluster node to the Ethernet
hub (which connects to the LAN)
The following examples describe how to physically set up your cluster nodes
to employ the Compaq-recommended strategy.
Example 1
A Compaq dual-port NIC and a single-port NIC are used in this example (Figure 2-9). The first port of the dual-port NIC is a dedicated interconnect, and the second port is the backup path for the cluster-to-LAN network. The single-port NIC is configured as the primary network path for cluster-to-LAN communication.
The TLAN Teaming and Configuration Utility (for ThunderLAN NICs) and the Network Teaming and Configuration Utility (for Intel NICs) are used to configure the second port on the dual-port NIC as the backup port of a redundant pair. The single port on the other NIC is configured to be the primary port for cluster-to-LAN communication.
The interconnect retains its fully redundant status when MSCS is configured to use the other network ports as interconnect backup. Failure of the primary interconnect path results in intracluster communications occurring over the single-port NIC, since the single-port NIC was configured in MSCS as the backup for intracluster communication. If the entire dual-port NIC fails, the cluster nodes still have a working communication path over the single-port NIC.
With this configuration, even a failure of the dual-port NIC results in the transfer of the cluster-to-LAN communication to the single-port NIC. Other than a failure of the network hub, the failure of any cluster network component will be resolved by the redundancy of this configuration.
Figure 2-9. Use of dual-port NICs to increase redundancy (clients reach the cluster through a hub; Node 1 and Node 2 share a primary interconnect path, a primary cluster-to-LAN path that doubles as a backup interconnect path, and a backup cluster-to-LAN path that also serves as a backup interconnect path)
Example 2
The second example configuration consists of three single-port NICs (Figure
2-10). One NIC is dedicated to intracluster communication. The other two
NICs are used for cluster-to-LAN communication. The Compaq Advanced
Network Control Utility is used to configure two of the NICs: one as the
primary and one as the standby of a redundant pair.
The interconnect is fully redundant when the Microsoft clustering software is
configured to use the other network cards as backups for the interconnect.
Failure of the primary interconnect path results in intracluster communications
occurring over the primary NIC of the redundant pair. If the entire
interconnect card fails, the cluster nodes will still have a working
communication path.
The cluster-to-LAN communication is fully redundant up to the network hub.
With this configuration, even a failure of the primary NIC results only in the
transfer of the network path to the standby NIC. Other than a failure of the
network hub, any failure of any cluster network component will be resolved by
the redundancy of this configuration.
The primary disadvantage of this configuration as compared to Example 1 is
that an additional card slot is used by the third NIC.
Figure 2-10. Use of three NICs to increase redundancy (clients reach the cluster through a hub; Node 1 and Node 2 have a dedicated primary interconnect path, a primary cluster-to-LAN path that doubles as a backup interconnect path, and a backup cluster-to-LAN path that also serves as a backup interconnect path)
HA/F100 Fibre Channel Data Paths
The Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 storage system is the mechanism with which ProLiant Clusters implement shared storage. Generally, the storage system consists of a host bus adapter in each server, a storage hub or switch, a Compaq StorageWorks RA4000 Controller, and a Compaq StorageWorks RAID Array 4000 or Compaq StorageWorks RAID Array 4100 (RA4000/4100) into which the SCSI disks are placed.
The RA4000/4100 storage system has two distinct data paths, separated by the Fibre Channel storage hub or FC-AL switch:
The first data path runs from the host bus adapters in the servers to the
Fibre Channel storage hub or FC-AL switch.
The second data path runs from the Fibre Channel storage hub or
FC-AL switch to the RA4000/4100.
The effects of a failure will vary depending on whether the failure occurred on the first or second data path.
Failure of the Host Bus Adapter-to-Storage Hub Data Path
If the host bus adapter-to-storage hub path fails (Figure 2-11), it results in a
failover of all applications. For instance, if one server can no longer access the
storage hub (and by extension the shared storage), all of the cluster groups that
depend on shared storage will fail over to the second server. The cost of this failure is relatively minor: the downtime experienced by users while the failover event occurs.
Figure 2-11. Host bus adapter-to-storage hub data path (two ProLiant servers on the corporate LAN, joined by an interconnect, each connected to the storage hub or switch, which attaches to the RA4000/4100)
Note that the Compaq Insight Manager tools monitor the health of the
RA4000/4100 storage system. If any part of the Fibre Channel data path
disrupts a server's access to the RA4000/4100, the array controller status
changes to “Failed” and the condition is red. The red condition bubbles up to
higher-level Compaq Insight Manager screens and eventually to the device
list.
NOTE: The Compaq Insight Manager tools display a failure of physical hardware through the Mass Storage button on the View screen, marking the hardware “Failed.” A logical drive in the cluster is reported on the Cluster Shared Resources screen as a logical disk resource. Compaq Insight Manager and Compaq Insight Manager XE do not associate the logical drive with the physical hardware.
Failure of the Hub-to-RA4000/4100 Data Path
The second data path (Figure 2-12), from the storage hub to the RA4000/4100, has more severe implications when it fails. If this data path fails, all clustered applications become inoperable. Even attempting to fail the applications to another cluster node will not gain access to the RA4000/4100.
NOTE: This failure scenario can be avoided by deploying the redundant Fibre Channel loop configuration of the Compaq ProLiant Cluster HA/F200.
Figure 2-12. Hub-to-RA4000/4100 data path (the same configuration as Figure 2-11; the path at issue runs from the storage hub or switch to the RA4000/4100)
Without access to shared storage, clustered applications cannot reach their data or log files. The data, however, is unharmed and remains safely stored on the physical disks inside the RA4000/4100. If a database application was running when this failure occurred, some in-progress transactions will be lost. The database will need to be rolled back and the in-progress transactions reentered.
As with the server-to-storage hub data path, the Compaq Insight Manager tools detect this fault, change the RA4000/4100 status to “Failed,” and change its condition to red. The red condition bubbles up through Compaq Insight Manager screens, eventually reaching the device list.
Nonshared Disk Drives
Nonshared disk drives, or local storage, operate the same way in a cluster as
they do in a single-server environment. These drives can be in the server drive
bays or in an external storage cabinet. As long as they are not accessible by
both servers, they are considered nonshared.
Treat nonshared drives in a clustered environment as you would in a
nonclustered environment. Most likely, some form of RAID is used to protect
the drives and restore a failed drive. Since the operating system is stored on
these drives, use either hardware or software RAID to protect the information.
Hardware RAID is available with the Compaq SMART-2 Controller or by
using a nonshared storage system.
Shared Disk Drives
Shared disk drives are contained in the RA4000/4100, which is accessible by
each cluster node. Employ hardware RAID 1 or 5 on all of your shared disk
drives. This is configured using the Compaq Array Configuration Utility.
If RAID 1 or 5 is not used, failure of a shared disk drive will disrupt service to
all clustered applications and services that depend on the drive. Failover of a
cluster node will not resolve this failure, since neither server can read from a
failed drive.
NOTE: Windows NTS/E software RAID is not available for shared drives when using MSCS. Hardware RAID is the only available RAID option for shared storage.
As with other system failures, Compaq Insight Manager monitors the health of
disk drives and will mark a failed drive as “Failed.”
Enhanced High Availability Features of the HA/F200
A single point of failure refers to any component in the system that, should it
fail, prevents the system from functioning. Single points of failure in hardware
can be minimized, and in some cases eliminated, by using redundant
components. The most effective way of accomplishing this is by clustering.
The Compaq ProLiant Cluster HA/F100 reduces the single points of failure
that exist in a single-server environment by allowing two servers to share
storage and take over for each other in the event that one server fails. The
Compaq ProLiant Cluster HA/F200 goes one step further by implementing a
dual redundant Fibre Channel Arbitrated Loop configuration.
The Compaq ProLiant Cluster HA/F200 further enhances high availability through the use of additional redundant components in the server-to-storage connection and in the shared storage system itself. In the event of a failure, processing is switched to an alternate path without affecting applications and end users. In fact, this path switch is transparent even to the Windows NT and Windows 2000 file system (NTFS). The combination of multiple paths and redundant hardware components provided by the HA/F200 offers significantly enhanced high availability over non-redundant configurations.
A single component failure in the HA/F200 will result in an automatic failover to an alternate component, allowing end users to continue accessing their applications without interruption. Some typical failures and associated responses in an HA/F200 configuration are:
A server failure will cause the Microsoft clustering software to fail
application processing over to the second server.
A host bus adapter failure will cause I/O requests intended for the failed
adapter to be rerouted through the remaining adapter.
A storage hub, switch, or cable failure will be treated like a host bus adapter failure and a failover to the second host bus adapter, which is using a different storage hub and cables, will occur.
An array controller failure will cause the redundant array controller to
take over for the failed controller.
In all of the above examples, end users will experience minimal interruptions
while the failover occurs. In some cases, the interruptions may not even be
noticeable.
The following illustration depicts the HA/F200 configuration components.
Figure 2-13. HA/F200 configuration (Node 1 and Node 2 on the LAN with a dedicated interconnect; each node connects to two storage hubs or switches, which attach to the RA4000/4100)
HA/F200 Fibre Channel Data Paths
The Compaq StorageWorks RAID Array 4000/4100 storage system is the
mechanism with which the HA/F200 cluster implements shared storage. The
Compaq ProLiant Cluster HA/F200 minimum configuration consists of two
host bus adapters in each server, two Fibre Channel storage hubs or FC-AL
switches, two array controllers per RA4000/4100, and one or more
RA4000/4100s.
The RA4000/4100 storage system has active data paths and standby data paths, separated by two Fibre Channel storage hubs or FC-AL switches. Figure 2-14 and Figure 2-15 detail the active and standby paths of the minimum HA/F200 configuration.
Figure 2-14. Active host bus adapter-to-storage data paths (each server has an active (A) and a standby (S) host bus adapter; the active adapters connect through one storage hub or switch to the RA4000/4100, with a second hub or switch on the standby path)
The active data paths run from the active host bus adapters in the servers to the
active storage hub. If this path fails, the applications can seamlessly fail over
to the standby host bus adapter-to-storage hub data paths (Figure 2-15).
Figure 2-15. Active hub-to-storage data path (the active path runs from the active storage hub or switch to the RA4000/4100; the standby storage hub or switch provides the alternate path)
The second active data path runs from the active hub or switch to the
RA4000/4100. If this path fails, the applications can seamlessly fail over to the
standby hub-to-RA4000/4100 data path.
The dual redundant loop feature of the Compaq ProLiant Cluster HA/F200
increases the level of availability over clusters that have only one path to the
shared storage. In addition, the second path in the HA/F200 provides for
improved performance through static load balancing. Static load balancing
considerations are discussed in the “Static Load Balancing” section of this
chapter.
Capacity Planning
Capacity planning determines how much computer hardware is needed to support the applications and data on your clustered servers. Unlike conventional single-server capacity planning, capacity planning for clustered configurations must ensure that each node is capable of running any applications or services that may fail over from its partner node. To simplify the following discussion, the software running on each of the clustered nodes is divided into three generic categories:
Operating system
Nonclustered applications and services
Clustered applications and services
Figure 2-16 illustrates these categories in the cluster.
Figure 2-16. File locations in a Compaq ProLiant Cluster (Node1 and Node2 each hold the operating system, clustered applications and services, and nonclustered applications and services on local storage; the shared storage holds the data for Node1's and Node2's clustered applications and services)
For each server, determine the processor, memory, and disk storage requirements needed to support its operating system and nonclustered applications and services.
Determine the processor and memory requirements needed to support the
clustered applications and services that will run on each node while the cluster
is in a normal operating state.
If the program files of a clustered application and/or service will reside on
local storage, remember to add that capacity to the amount of local storage
needed on each node.
For all files that will reside on shared storage, see “Shared Storage Capacity”
later in this chapter.
Server Capacity
The capacity needed in each server depends on whether you design your
cluster as an active/active configuration or as an active/standby configuration.
Capacity planning for each configuration is discussed in the following
sections.
Active/Active Configuration
As described earlier in this chapter, an active/active configuration can be
designed in two ways:
Applications and services may be configured to fail over from each node
to its partner node.
Applications and services may be configured to fail over from just one
node to its partner node.
The following table details the capacity requirements that can be applied to either active/active design.
Table 2-1
Server Capacity* Requirements for Active/Active Configuration

Node 1:
  Operating system (with MSCS)
  Nonclustered applications and services
  Server1 clustered applications and services
  Server2 clustered applications and services (if Server2 is set up to fail applications and services over to Server1)

Node 2:
  Operating system (with MSCS)
  Nonclustered applications and services
  Server2 clustered applications and services
  Server1 clustered applications and services (if Server1 is set up to fail applications and services over to Server2)

* Processing power, memory, and nonshared storage
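The sizing rule behind Table 2-1 can be expressed as a short calculation. The following is a minimal sketch using placeholder memory figures (all numbers and names are illustrative, not measurements from any real configuration); the same rule applies to processing power and nonshared storage.

```python
# Minimal sizing sketch for an active/active design in which each node
# can fail over to its partner (Table 2-1). All figures are placeholder
# estimates in MB of memory; substitute your own measured requirements.

node1 = {"os_mscs": 256, "nonclustered": 128, "clustered": 512}
node2 = {"os_mscs": 256, "nonclustered": 64,  "clustered": 768}

def required_memory(own, partner, accepts_failover=True):
    """Memory a node needs: its own load plus, if it is a failover
    target, the partner's clustered applications and services."""
    total = own["os_mscs"] + own["nonclustered"] + own["clustered"]
    if accepts_failover:
        total += partner["clustered"]
    return total

print("Node 1 needs", required_memory(node1, node2), "MB")
print("Node 2 needs", required_memory(node2, node1), "MB")
```

For the one-way failover design, pass accepts_failover=False for the node that never receives its partner's applications.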
Active/Standby Configuration
In an active/standby configuration, only one node actively runs applications and services. The other node is in an idle, or standby, state. Assume Node 1 is the active node and Node 2 is the standby node.
Table 2-2
Server Capacity* Requirements for Active/Standby Configuration

Node 1 (active):
  Operating system (with MSCS)
  Nonclustered applications and services
  Server1 clustered applications and services

Node 2 (standby):
  Operating system (with MSCS)
  Server1 clustered applications and services

* Processing power, memory, and nonshared storage
Shared Storage Capacity
Each server is connected to shared storage (the Compaq StorageWorks RAID
Array 4000/4100 storage system), which mainly stores data files of clustered
applications and services. Follow the guidelines below to determine how much
capacity is needed for your shared storage.
NOTE: For some clustered applications, it may make sense to store the application program files on shared storage. If the application allows customization and the customized information is stored in program files, the program files should be placed on shared storage. When a failover event occurs, the secondary node will launch the application from shared storage. The application will execute with the same customizations that existed when executed on the primary node.
Two factors help to determine the required amount of shared storage disk
space:
The amount of space required for all clustered applications and their
dependencies.
The level of data protection (RAID) required for each type of data used
by each clustered application. Two factors driving RAID requirements are:
The performance required for each drive volume
The recovery time required for each drive volume
IMPORTANT: Windows software RAID is not available for shared drives when using MSCS. Hardware RAID is the only available RAID option for shared storage.
For more information about hardware RAID, see the following:
Compaq StorageWorks Fibre Channel RAID Array 4000 User Guide
Compaq StorageWorks Fibre Channel RAID Array 4100 User Guide
In the “Cluster Groups” section of this chapter, you created a resource dependency tree and then transferred that information into a Cluster Group Definition Worksheet (Figure 2-8). Under the resource dependencies in the worksheet, you listed at least one physical disk resource. For each physical disk resource, determine the capacity and level of protection required for the data to be stored on it.
For example, the Web Sales Order Database group depends on a log file, data files, and program files. It might be important for the log file and program files to have a quick recovery time, while performance would be a secondary concern. Together, the files do not take up much capacity; therefore, mirroring (RAID 1) would be an efficient use of disk space and would fulfill the recovery and performance characteristics. The data files, however, would need excellent performance and excellent protection. The data files are expected to be large; therefore, a mirrored configuration would require an unacceptably expensive number of disk drives. To minimize the amount of physical capacity and still meet the performance and protection requirements, the data files would be configured to use Distributed Data Guarding (RAID 5).
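The capacity arithmetic behind these choices is straightforward. Below is a small illustrative sketch assuming the usual capacity rules for these RAID levels (mirroring halves raw capacity; distributed data guarding gives up one drive's worth of space for parity); the drive counts and sizes are the example values used in the Shared Storage Capacity Worksheet later in this chapter.

```python
# Sketch of the usable-capacity arithmetic for shared-storage RAID
# levels. RAID 1 mirrors drives in pairs; RAID 5 distributes parity
# equal to one drive's capacity across the array.

def usable_gb(raid_level, drives, drive_gb):
    if raid_level == 1:
        return (drives // 2) * drive_gb    # mirrored pairs
    if raid_level == 5:
        return (drives - 1) * drive_gb     # one drive's worth of parity
    raise ValueError("only RAID 1 and RAID 5 are shown here")

# Worksheet examples: 4 x 4.3 GB in RAID 5 and 2 x 4.3 GB in RAID 1.
print(round(usable_gb(5, 4, 4.3), 1))   # 12.9 GB usable (Web files)
print(round(usable_gb(1, 2, 4.3), 1))   # 4.3 GB usable (database log)
```

The same function makes the trade-off visible: mirroring consumes half of the raw capacity regardless of array size, while the parity overhead of RAID 5 shrinks as a fraction of the array as more drives are added.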
Array Configuration
The Compaq Array Configuration Utility (ACU) is used to initially configure
the array controller, reconfigure the array controller, add additional disk drives
to an existing configuration, and expand capacity. The capacity expansion
feature provides the ability to add storage capacity to a configured array
without disturbing the existing data and to add a new physical drive to the
array.
An array is created by grouping disk drives together to share a common
RAID (Redundant Array of Inexpensive Disks) fault tolerance type. For
example, in a single RA4000/4100 storage system containing eight 18.2 GB
drives, you could configure two of the drives in a RAID 1 mirrored array and
the remaining six drives as a RAID 5 Distributed Data Guarding array.
Each array must be divided into at least one volume (up to a maximum of
eight volumes per array). Each volume is presented to the operating system as
an independent disk drive and can be independently controlled by the cluster
software. Using the previous example, you could configure the two-drive
RAID 1 array as a single volume (for example, drive F), and the six-drive
RAID 5 array as two volumes (for example, drives G and H). Because the
operating system views these as independent disks, it is possible for cluster
Node 1 to control drive G, while cluster Node 2 controls drives F and H.
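To make the relationship concrete, the following is a minimal sketch of the eight-drive example above. The structure mirrors the text; the drive letters and owning nodes are the example's, and the representation itself is illustrative only.

```python
# Sketch of the eight-drive example: two arrays in one RA4000/4100,
# divided into volumes that the cluster can assign to either node
# independently, because the OS sees each volume as its own disk.

arrays = [
    {"raid": 1, "drives": 2, "drive_gb": 18.2, "volumes": ["F"]},
    {"raid": 5, "drives": 6, "drive_gb": 18.2, "volumes": ["G", "H"]},
]

# Per-volume ownership, as in the example: Node 1 controls drive G
# while Node 2 controls drives F and H.
owner = {"F": "Node 2", "G": "Node 1", "H": "Node 2"}

for array in arrays:
    for volume in array["volumes"]:
        print(f"Drive {volume}: RAID {array['raid']} array "
              f"({array['drives']} x {array['drive_gb']} GB), "
              f"owned by {owner[volume]}")
```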
More information regarding cluster disk configuration can be found in the
Compaq TechNote, Planning Considerations for Compaq ProLiant Clusters
Using Microsoft Cluster Server, located on the Compaq website (http://www.compaq.com).
This capability provides a high level of flexibility in configuring your
RA4000/4100 storage system. However, minimize the number of volumes
configured in each array to improve performance. To achieve optimal
performance, each array should contain a single volume. In some cases (such
as for the Microsoft clustering software quorum drive), it may be desirable to
add a second, smaller volume to an array.
Shared Storage Capacity Worksheet
The following Shared Storage Capacity worksheet will assist in determining your shared storage capacity requirements. The following example illustrates the required shared storage capacity for the entire Web Sales Order business function. A blank worksheet is provided in Appendix A.
Shared Storage Capacity Worksheet

Disk Resource 1
  Description: Web files and Web scripts for Web Service Group
  Required Application Capacity: 12 GB
  Desired Level of Protection: RAID 5
  RAID Configuration: 4 x 4.3 GB
  Required Capacity With RAID: 17.2 GB
  Total Usable Capacity: 12.9 GB

Disk Resource 2
  Description: Log file(s) for Database
  Required Application Capacity: 4.3 GB
  Desired Level of Protection: RAID 1
  RAID Configuration: 2 x 4.3 GB
  Required Capacity With RAID: 8.6 GB
  Total Usable Capacity: 4.3 GB

Disk Resource 3
  Description: Data file(s) for Database
  Required Application Capacity: 27 GB
  Desired Level of Protection: RAID 5
  RAID Configuration: 4 x 9.0 GB
  Required Capacity With RAID: 36 GB
  Total Usable Capacity: 27 GB

Disk Resources 4 through 6: N/A

Figure 2-17. Shared Storage Capacity Worksheet (example)
Static Load Balancing
Static load balancing helps to attain enhanced performance from the cluster by balancing the system's workload. With cluster configurations, applications and data can be shared by all components so that no one component is working at its maximum capability.
There are two means of static load balancing. One balances a system's workload across the cluster. The other balances a server's workload across multiple data paths. The dual redundant loop of the Compaq ProLiant Cluster HA/F200 and an added RA4000/4100 storage system spread a system's applications and data across the data paths through an active/active host bus adapter configuration. This configuration can increase the functionality of the cluster.
IMPORTANT: Disk load balancing cannot be done when using a single RA4000/4100 in a Compaq ProLiant Cluster HA/F200. Add another RA4000/4100 to your HA/F200 configuration for host bus adapters in a single server to be active/active.
Figure 2-18 shows a Compaq ProLiant Cluster HA/F200 configuration with
only one RA4000/4100. Because there is only one RA4000/4100, the host bus
adapters are in active/standby HBA mode, which means that they do not have
load-balancing capability.
Figure 2-18. Compaq ProLiant Cluster HA/F200 with one RA4000/4100 (each server has one active (A) and one standby (S) host bus adapter connected through redundant storage hubs or switches to the single RA4000/4100)
Figure 2-19 depicts a Compaq ProLiant Cluster HA/F200 with dual RA4000/4100s. This configuration can accommodate static load balancing because the host bus adapters of one server can be in an active/active HBA mode to different storage systems.
Figure 2-19. Compaq ProLiant Cluster HA/F200 with dual RA4000/4100s (each server's host bus adapters can both be active (A), each adapter active to one RA4000/4100 and standby (S) to the other, through the redundant storage hubs or switches)
Networking Capacity
The final capacity planning section addresses networking. The cluster nodes
must have enough network capacity to handle requests from the client
machines and must gracefully handle failover/failback events.
Make sure both nodes can handle the maximum number of clients that can
attach to the cluster. If Node 1 encounters a failure and its applications and
services fail over to Node 2, then Node 2 needs to handle access from its own
network clients as well as those that normally connect to the failed node
(Node 1).
Note the effect of failover on network I/O bandwidth. When the cluster
encounters a server failover event, only one node is responding to network I/O
requests. Be sure the surviving node's network speed and protocol will
sufficiently handle the maximum number of network I/Os when the cluster is
running in a degraded state.
Network Considerations
This section addresses clustering items that affect the corporate LAN. The
Microsoft clustering software has specific requirements regarding which
protocol can be used and how IP address and network name resolution occurs.
Additionally, consider how network clients will interact within a clustering
environment. Some client-side applications may need modification to receive
the maximum availability benefits of operating a cluster.
Network Configuration
Network Protocols
TCP/IP and NBT (NetBIOS over TCP/IP) are the only transport protocols that are supported in a Microsoft clustering software failover environment. Other protocols, such as NetBEUI, IPX/SPX (Novell), NB/IPX, or DLC (IBM), may be used, but they cannot take advantage of the failover features of the Microsoft clustering software.
Applications that use these other protocols will function just as they would in a single-server environment: users can still use them, but they will connect directly to the individual servers rather than to the virtual servers on the cluster. If a failure occurs, any connections using these protocols will not switch over. Because these protocols cannot fail over to another server, avoid them if possible.
WINS and DNS
WINS (Windows Internet Name Service) and DNS (Domain Name Service) are supported in the Microsoft clustering software. Use WINS to register the network names and IP addresses of cluster resources. If WINS is not used, create an entry in the hosts or lmhosts file that lists each network name and IP address pair, as well as the cluster name and its IP address, since these function as virtual servers to the clients.
If clients are located on a different subnet than the cluster nodes, modify the DNS database to add a DNS address record for the cluster.
DHCP
Only use DHCP for the clients; it should not be used for the cluster node IP addresses or cluster resource IP addresses. DHCP cannot be used to assign IP addresses for virtual servers.
When configuring DHCP, exclude enough static IP addresses from the pool of dynamically leased addresses to account for the following:
Cluster node IP addresses
At least one static IP address for each virtual server
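As a simple illustration, the number of addresses to exclude can be totaled as follows. The counts below are hypothetical; note that the cluster name itself behaves as a virtual server and is counted as one.

```python
# Sketch of the static-address exclusion rule above. Counts are
# hypothetical placeholders for a two-node cluster.

cluster_nodes = 2       # one static IP per cluster node
virtual_servers = 3     # cluster name + each clustered group with an IP

static_ips_to_exclude = cluster_nodes + virtual_servers
print(static_ips_to_exclude, "addresses to exclude from the DHCP pool")
```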
Migrating Network Clients
One of the first steps in assessing the impact of a clustered environment on the network clients is to identify the various types of network functions and applications that are provided to the users. It is likely that several steps are necessary to migrate your clients to take full advantage of clustering.
File and Print Services
The main consideration for file and print services is the method clients use to connect to the shared resources. If clients use batch files to connect to shared directories on the server, the batch files may need to be updated to reflect the new path name and, possibly, the new share name.
Connecting to Shared Resources
In the traditional, command-driven connection to a shared resource, the user
needs to know the server name and the share name. In a clustered
environment, the command is changed to reflect the cluster network name and
file share that were configured as part of the failover group for that shared
directory.
Compare the command syntax in Table 2-3 for connecting to a shared resource
on a stand-alone server versus a clustered server.
Table 2-3
Comparison of Net Use Command Syntax

Server Environment      Command Syntax
Stand-alone server      Net use J: \\servername\sharename
Cluster node            Net use J: \\networkname\fileshare
Change client login scripts and profiles so that users connect to resources
using the cluster network name and file share.
Client/Server Applications
Reconfiguration of client applications in a client/server environment may also
be required. Some applications, such as many of the popular databases, require
the client to specify the IP address of the server that holds the database they
want to connect to. The IP addresses may be held in a special configuration
program or in a text file. Any references to the server's actual IP addresses must be changed to reflect the new IP Address Resource that has been configured for that application's cluster group.
Some databases allow you to specify the IP address of a backup server, which
the client database software attempts to use in case the database is not
accessible using the first IP address. The backup IP address scheme can be
used in a nonclustered environment to assist clients if the primary server fails.
This is no longer necessary when using MSCS.
In a clustered environment, IP addresses for the database are configured to fail over with the database application, making a backup IP address on the client unnecessary. When the database resources have failed over to the other server, the client can reconnect to the database using the same IP address as before the failure. This process may be automated if the client application software supports automatic connection retries.
IMPORTANT: Examine these client configuration issues in a pilot and testing phase before implementing a clustered system. This will help you to identify any client reconfiguration requirements and understand how client applications will behave in a clustered environment, especially after a failure.
Failover/Failback Planning
The final section of this chapter addresses several factors to consider when planning for failover and failback events.
Performance of clustered servers after failover
Cluster server thresholds and periods
Failover of directly connected devices
Automatic vs. manual failover
Failover/failback policies
Performance After Failover
As applications or resources fail from one server to another, the performance of the clustered servers may change dynamically. This is especially obvious after a server failure, when all of the cluster resources may move to the other cluster node.
Investigate server loads after a failover before fully implementing a clustered system. You may need additional hardware, such as memory or system processors, to support the additional workload incurred after a failover.
It is also important to understand the performance impact when configuring server pairs in a failover cluster. If a business-critical database is already running at peak performance, requiring the server to take on the additional workload of a failed server may adversely affect business operations. In some cases, you may find it appropriate to pair that server with a low-load server, or even with a no-load server (as in an active/standby cluster configuration).
You can use the Windows performance tool to observe and track system
performance. Some applications may also have their own internal performance
measurement capabilities.
Microsoft Clustering Software Thresholds and Periods
The Microsoft clustering software offers flexibility in configuring the
initiation of failover events. For resources, the Microsoft clustering software
allows you to set Restart Thresholds and Periods. For cluster groups, the
Microsoft clustering software allows you to set Failover Thresholds and
Periods.
Restart Threshold and Restart Period
A restart threshold defines the maximum number of times per restart period
that the Microsoft clustering software attempts to restart a resource before
failing over the resource and its corresponding cluster group. See the following
example:
Assume you have a disk resource (Disk1) that is part of a cluster group (Group1). You set the restart threshold to 5 and the restart period to 10 minutes. If the Disk1 resource fails, the Microsoft clustering software will attempt to restart the resource on the group's current cluster node five times within the 10-minute period. If the resource cannot be brought online within the 10-minute restart period, then Group1 will fail over to the partner cluster node.
Note that the Microsoft clustering software waits the length of the restart
period (for example, 10 minutes) before actually failing over the cluster group.
You must assess the likelihood that the group will successfully restart on its
present node against the time required to restart the cluster group before failing
it over. If it is appropriate to immediately fail over any group that encounters a
problem, set the restart threshold to 0 (zero). If the group will experience
severe performance limitations if failed over to a secondary server, set the
threshold and period so that the Microsoft clustering software attempts to
restart the group on its preferred server.
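If you prefer the command line to the Cluster Administrator interface, these values can also be set with the cluster.exe utility included with MSCS. The following sketch applies the Disk1 example above; note that the underlying RestartPeriod property is expressed in milliseconds, so confirm the values and units against your Microsoft clustering documentation before applying them:

    REM Restart Disk1 up to 5 times within a 10-minute (600,000 ms) restart period
    cluster resource "Disk1" /prop RestartThreshold=5
    cluster resource "Disk1" /prop RestartPeriod=600000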
Failover Threshold and Failover Period
The failover threshold and failover period are similar to the restart values. The failover threshold defines the maximum number of times per failover period that the Microsoft clustering software attempts to fail over a cluster group. If the cluster group exceeds the failover threshold in the allotted failover period, the group is left on its current node, in its current state, whether that is online, offline, or partially online.
The failover threshold and failover period prevent a cluster group from bouncing back and forth between servers. If a cluster group is so unstable that it cannot run properly on either cluster node, it will eventually be left in its current state on one of the nodes. The failover threshold and period determine the point at which the decision is made to leave the cluster group in its current state.
The following example illustrates the relationship between the restart threshold and period and the failover threshold and period.
Assume you have a cluster group (Group1) that is configured to have a preferred server (Server1). If Group1 encounters an event that forces it offline, the Microsoft clustering software first attempts to restart its resources. If Group1 cannot be restarted within the limits of the restart threshold and period, the Microsoft clustering software attempts to fail over Group1 to the other cluster node. If the failover threshold for Group1 is set to 10 and the failover period is set to 3 (hours), the Microsoft clustering software will fail over Group1 as many as 10 times in a 3-hour period. If a failure is still forcing Group1 offline after three hours, the Microsoft clustering software will no longer attempt to fail over the group.
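The group-level values from this example can likewise be set with cluster.exe. In this sketch, the group name is the example's hypothetical Group1; FailoverThreshold is a count and FailoverPeriod is expressed in hours:

    REM Attempt to fail over Group1 at most 10 times within a 3-hour period
    cluster group "Group1" /prop FailoverThreshold=10
    cluster group "Group1" /prop FailoverPeriod=3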
Failover of Directly Connected Devices
Devices that are physically connected to a server cannot move to the other cluster node. Therefore, any applications or resources dependent on these devices may be unable to restart on the other cluster node. Examples of direct-connect devices include printers, mainframe interfaces, modems, fax interfaces, and customized input devices such as bank card readers.
For example, if a server is providing print services to users, and the printer is directly connected to the parallel port of the server, there is no way to switch the physical connection to the other server, even though the print queue and spooler can be configured to fail over. The printer should be configured as a true network printer and connected to a hub that is accessible from either cluster node. In the event of a server failure, not only will the print queue and spooler fail over to the other server, but physical access to the printer will be maintained.
Another example of a direct-connect device is a directly connected mainframe
interface. If the first server is directly connected to the mainframe, as through
an SDLC (Synchronous Data Link Control) card in the server, there is no way
to switch the physical connection to a second server. In a case like this, you
may be able to use the client network to access the mainframe using TCP/IP.
Since TCP/IP addresses can be configured to fail over, you may be able to
reestablish the connection after a switch. However, many mainframe
connectivity applications use the Media Access Control (MAC) address that is
burned into the NIC to communicate with the server. This would cause a
problem because MAC addresses cannot be configured to fail over.
Carefully examine the direct-connect devices on each server to determine
whether you need to provide alternate solutions outside of what the cluster
hardware and software can accomplish. These devices can be considered
single points of failure because the cluster components may not be able to
provide failover capabilities for them.
Manual vs. Automatic Failback
Failback is the act of integrating a failed cluster node back into the cluster.
Specifically, it brings cluster groups and resources back to their preferred
server. The Microsoft clustering software offers automatic and manual failback
options. The automatic failback event will occur whenever the preferred server
is reintegrated into the cluster. If the reintegration occurs during normal
business hours, there may be a slight interruption in service for network clients
during the failback process. If the interruption needs to occur in nonpeak
hours, be sure to set the failback policy to “Allow” and set the “Between
Hours” settings to acceptable values. For full control over when a cluster node
is reintegrated, use manual failback by choosing “Prevent” as the failback
policy.
Many organizations prefer to use manual failback for business-critical clusters.
This prevents applications from automatically failing back to a server that has
failed, automatically rebooted, and automatically rejoined the cluster before
the root cause of the original error has been determined.
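These failback policies can also be applied with cluster.exe. In MSCS the underlying group properties are AutoFailbackType (0 = Prevent, 1 = Allow) and FailbackWindowStart/FailbackWindowEnd (hours of the day, 0 through 23); the group name and hours below are illustrative only, so verify the property names and values against your Microsoft clustering documentation:

    REM Allow automatic failback, but only between 1 a.m. and 5 a.m.
    cluster group "Group1" /prop AutoFailbackType=1
    cluster group "Group1" /prop FailbackWindowStart=1
    cluster group "Group1" /prop FailbackWindowEnd=5

    REM Or require manual failback for business-critical groups
    cluster group "Group1" /prop AutoFailbackType=0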
These terms are described and illustrated in the Group Failover/Failback
Policy Worksheet provided in the following section.
Failover and Failback Policies
In the Cluster Groups section of this chapter, you created one or more cluster group definition worksheets (Figure 2-8). For each cluster group defined in the worksheets, you will now determine its failover and failback policies by filling in the Group Failover/Failback Policy worksheet.
Terms and Definitions
The following terms and definitions are used in defining failover/failback policies for cluster groups.
Table 2-4
Group Failover/Failback Policy Terms and Definitions
Failover policy: The circumstances under which the Microsoft clustering software takes a group offline on the primary (preferred) node and brings it online on the secondary node.

Failback policy: The circumstances under which the Microsoft clustering software takes a group offline on the secondary node and brings it online on the primary (preferred) node.

Preferred owner: The cluster node you want the cluster group to run on when the cluster is in a normal state.

Failover threshold: The number of times the Microsoft clustering software will attempt to fail over a group within a specified failover period.
Failover period: The length of time in which the Microsoft clustering software attempts to fail over a cluster group. When the failover threshold count has been exceeded within the failover period, the Microsoft clustering software leaves the group on its current node, in its current state. Example: If the failover threshold = 5 and the failover period = 1, the Microsoft clustering software will attempt to fail over the group 5 times within a 1-hour period.

Prevent: Prevent automatic failback. This setting allows the administrator to fail back a group manually.

Allow: Allow automatic failback. This setting allows the Microsoft clustering software to fail back a group automatically.

Allow immediately: This setting allows automatic failback as soon as the preferred node is reintegrated into the cluster and brought back online.

Allow between hours: This setting allows the administrator to determine specific hours of the day during which automatic failback can occur.
Refer to the Microsoft clustering documentation for detailed information about
failover and failback policies of groups and resources.
Group Failover/Failback Policy
Use the Group Failover/Failback Policy worksheet to define the failover and failback policies for each cluster group. Figure 2-20 illustrates the failover/failback parameters for the Web Server Service of the Web Sales Order business function defined in previous examples. A blank copy of the worksheet is provided in Appendix A.
Group Failover/Failback Policy Worksheet

Group Name: Web Server Service

General Properties
Name: Web Server Service
Description: Group containing the Web Server Service used to operate the Web Sales Order business function
Preferred Owners: Server 1

Failover Properties
Threshold: 5
Period: 10

Failback Properties
Choose one: Prevent or Allow. Selected: Prevent (manual failback is preferred for this group)
If Allow, choose one: Immediately, or Between hours (Start: ____ End: ____)

Figure 2-20. Group Failover/Failback Policy Worksheet (example)
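As a cross-check, this example worksheet would translate into cluster.exe commands along the following lines. This is only a sketch: the group and server names come from the example (the node name is written without a space, because Windows computer names cannot contain spaces), and the worksheet's failover period is assumed to be in hours, the unit cluster.exe uses:

    REM Preferred owner, failover policy, and manual-failback policy for the example group
    cluster group "Web Server Service" /setowners:Server1
    cluster group "Web Server Service" /prop FailoverThreshold=5
    cluster group "Web Server Service" /prop FailoverPeriod=10
    cluster group "Web Server Service" /prop AutoFailbackType=0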
Chapter 3
Setting Up the Compaq ProLiant
Clusters HA/F100 and HA/F200
Preinstallation Overview
This chapter provides instructions for building a new Compaq ProLiant
Cluster HA/F100 or a Compaq ProLiant Cluster HA/F200.
If you are planning to migrate from an HA/F100 to an HA/F200 configuration
or you are planning to upgrade the operating system of an HA/F100 or
HA/F200, see Chapter 4 for more details.
The Compaq ProLiant Clusters HA/F100 and HA/F200 are combinations of
several individually available products. Have the following documents
available as you set up your cluster.
Documentation for the clustered Compaq ProLiant servers
Compaq shared external storage documentation
Compaq StorageWorks RAID Array 4000 User Guide
Compaq StorageWorks RAID Array 4100 User Guide
Compaq host bus adapter documentation
Compaq StorageWorks Fibre Channel Host Adapter Installation Guide
Compaq StorageWorks 64-Bit/66-MHz Fibre Channel Host Adapter Installation Guide
Installation guide for the NIC of your choice
Fibre Channel storage hub or FC-AL switch documentation
Compaq StorageWorks Fibre Channel Storage Hub 7 Installation Guide
Compaq StorageWorks Fibre Channel Storage Hub 12 Installation Guide
Compaq StorageWorks Fibre Channel FC-AL Switch 8 Installation Guide
Documentation received with your operating system
Microsoft Windows NT Server 4.0, Enterprise Edition (Windows NTS/E)
Microsoft Windows 2000 Advanced Server
Compaq SmartStart for Servers Setup Poster
Compaq Insight Manager Installation Poster
Compaq Intelligent Cluster Administrator Quick Setup Guide
Microsoft clustering documentation
The installation and setup of your ProLiant Cluster is described in the following sections:
Preinstallation guidelines
Installing the hardware, including:
Cluster nodes
Compaq StorageWorks RAID Array 4000 or 4100 storage system
Cluster interconnect
Installing the software, including:
Compaq SmartStart for Servers
Microsoft Windows NT Server 4.0, Enterprise Edition
Microsoft Windows 2000 Advanced Server
Compaq Redundancy Manager (Fibre Channel)
Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000/4100
Compaq Insight Manager (optional)
Compaq Insight Manager XE (optional)
Compaq Intelligent Cluster Administrator (optional)
Additional cluster verification steps, including:
Verifying creation of the cluster
Verifying node failover
Verifying network client failover
These installation and configuration steps are described in the following pages.
Preinstallation Guidelines
Using the worksheets in Appendix A, write down the answers to the following questions before installing MSCS on a cluster node.
Are you forming or joining a cluster?
What is the cluster name?
What are the username, password, and domain for the domain account that MSCS will run under?
What disks will you use for shared storage?
Which shared disk will you use to store permanent cluster files?
What are the adapter names and IP addresses of the network adapter cards you will use for client access to the cluster?
What are the adapter names and IP addresses of the network adapter cards you will use for the dedicated interconnect between the cluster nodes?
What IP address and subnet mask will you use to administer the cluster?
What are the slot numbers of the controllers to be managed by the cluster?
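A filled-in sample might look like the following. Every name and address here is a hypothetical placeholder, not a Compaq default; substitute the values for your own site:

    Forming or joining:       Forming a new cluster
    Cluster name:             CLUSTER1
    MSCS domain account:      EXAMPLEDOM\clustersvc
    Shared storage disks:     Drives E: and F: on the RA4000/4100
    Permanent cluster files:  Drive Q: (the 100-MB quorum logical drive)
    Client-access adapters:   Public NIC; 192.168.1.11 (Node 1), 192.168.1.12 (Node 2)
    Interconnect adapters:    Private NIC; 10.0.0.1 (Node 1), 10.0.0.2 (Node 2)
    Cluster administration:   192.168.1.10, subnet mask 255.255.255.0
    Controller slot numbers:  Per your PCI bus-loading worksheet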
Installing clustering software requires several specific steps and guidelines that may not be necessary when installing software on a single server. Read and understand the following items before proceeding with any software installation:
Ensure that you have sufficient software licensing rights to install the
Microsoft Windows operating system and software applications on each server.
Ensure that the Fibre Channel storage hub or FC-AL switch has AC power.
Power up the RA4000/4100 storage system before the cluster nodes are
powered up.
Log on to the domain using an account that has administrative
permissions on both cluster nodes. When installing MSCS, both cluster nodes must be in the same Microsoft Windows NT or Windows 2000 domain. The cluster nodes can be members of an existing Windows NT or Windows 2000 domain, they can both be member servers, they can make up their own domain by assigning one as Primary Domain Controller (PDC) and one as Backup Domain Controller (BDC), or they can both be BDCs in an existing Windows NT or Windows 2000 domain.
One of the utilities the SmartStart CD runs is the Compaq Array
Configuration Utility, which configures the drives in the RA4000/4100. The Array Configuration Utility stores the drive configuration information on the drives themselves. After you have configured the shared drives from one of the cluster nodes it is not necessary to configure the drives from the other cluster node.
When the Array Configuration Utility runs on the first cluster node,
configure the shared drives in the RA4000/4100 storage system. When SmartStart runs the utility on the second cluster node, it will display information on the shared drives that was entered when the Array Configuration Utility was run on the first node. Accept the information as displayed and continue.
For a manual software installation, use Disk Administrator
(Windows NTS/E) or Disk Management (Windows 2000 Advanced Server) on the first cluster node to configure the shared drives, and allow MSCS to synchronize information between the two nodes.
By running Disk Administrator or Disk Management from the first
node, you prevent potential problems caused by inconsistent drive configurations. When the second cluster node joins the cluster, the disk information in the Windows Registry is copied from the first node to the second node.
Only New Technology File System (NTFS) is supported on shared
drives.
MSCS software requires drive letters to remain constant throughout the life of the cluster; therefore, you must assign permanent drive letters to your shared drives. If you are performing manual software installation, use Disk Administrator or Disk Management to assign permanent drive letters.
Microsoft Windows NTS/E or Windows 2000 Advanced Server makes
dynamic drive letter assignments (when drives are added or removed, or when the boot order of drive controllers is changed), but Disk Administrator or Disk Management allows you to make permanent drive letter assignments.
Cluster nodes can be members of only one cluster.
When you set up the cluster interconnect, select TCP/IP as the network
protocol. MSCS requires the TCP/IP protocol. The cluster interconnect must be on its own subnet. The IP addresses of the interconnects must be static, not dynamically assigned by DHCP.
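After the interconnect NICs are configured, a quick check from Node 1 confirms the static addressing and the private path. The 10.0.0.x addresses below are hypothetical examples of a private interconnect subnet:

    REM Verify the interconnect adapter reports "DHCP Enabled: No" and its static address
    ipconfig /all

    REM Verify Node 1 can reach Node 2 (assumed here to be 10.0.0.2) across the interconnect
    ping 10.0.0.2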
Installing the Hardware
The following installation steps detail a new installation and setup of a
Compaq ProLiant Cluster HA/F100 or HA/F200.
Setting Up the Nodes
Physically preparing the nodes (servers) for use in a cluster is not very
different from preparing them for individual use. The primary difference will
be in setting up the shared storage:
1. Install all necessary adapter cards and insert all internal hard drives.
2. Attach network cables and plug in SCSI and/or Fibre Channel cables.
3. Set up one node completely, then set up the second node.
IMPORTANT: Do not load any software on either cluster node until all the hardware has been installed in both cluster nodes.
NOTE: Compaq recommends that Automatic Server Recovery (ASR) be left at the default values for clustered servers.
Follow the installation instructions in your Compaq ProLiant Server
documentation to set up the hardware. To install Compaq StorageWorks Fibre
Channel Host Adapters and any NICs, follow the instructions in the next
sections.
IMPORTANT: For the most up-to-date list of cluster-certified servers, access the Compaq High Availability website (http://www.compaq.com/highavailability).
Installing the Compaq StorageWorks
Fibre Channel Host Adapter
Follow the installation instructions in your Compaq StorageWorks Fibre
Channel Host Adapter Installation Guide or Compaq StorageWorks 64-Bit/66-
MHz Fibre Channel Host Adapter Installation Guide and your Compaq
ProLiant server documentation to install the host bus adapter in your servers.
Install one adapter per server for the HA/F100 configuration. Install two
adapters per server for the HA/F200 configuration.
The host bus adapters, which connect the two servers to the storage through Fibre Channel storage hubs or FC-AL switches, are installed in each server like any other PCI card. The HA/F100 cluster requires one host bus adapter per server, while the HA/F200 requires two host bus adapters per server. The extra host bus adapter in each server contributes to the enhanced high availability features of the HA/F200. The dual host bus adapters, in conjunction with dual Fibre Channel storage hubs or FC-AL switches and dual array controllers, form two completely independent paths to the storage, making the server-to-storage connection totally redundant. However, it is important to ensure that each host bus adapter in a particular server is connected to a different hub, because it is physically possible to connect the servers to the storage hubs in such a way that the cluster seems to be working correctly but will not be able to fail over properly.
NOTE: To determine the preferred slots for installing the host bus adapters, use PCI bus-loading techniques to balance the PCI bus for your hardware and configuration. For more information, refer to your server documentation and the Compaq white paper, “Where Do I Plug the Cable? Solving the Logical-Physical Slot Numbering Problem,” available from the Compaq website (http://www.compaq.com).
Installing the Cluster Interconnect
There are many ways to physically set up an interconnect. See Chapter 1 for a description of the types of interconnect strategies.
If you are using a dedicated interconnect, install an interconnect adapter card (Ethernet or ServerNet) in each cluster node. If you are sharing your LAN NIC with your interconnect, install the LAN NIC.
NOTE: To determine the preferred slot for installing the interconnect card, use PCI bus-loading techniques to balance the PCI bus for your hardware and configuration. If you are installing the ServerNet card, treat it as a NIC in determining the preferred installation slot for maximum performance. For more information, see your server documentation and the Compaq white paper, “Where Do I Plug the Cable? Solving the Logical-Physical Slot Numbering Problem,” available from the Compaq website (http://www.compaq.com).
For specific instructions on how to install an adapter card, refer to the documentation for the interconnect card you are installing or the Compaq ProLiant server you are using. The cabling of interconnects is outlined later in this chapter.
Setting Up the Compaq StorageWorks RAID Array 4000 and RAID Array 4100 Storage System
Follow the instructions in the Compaq shared external storage documentation
to set up the RA4000/4100, the Compaq StorageWorks Fibre Channel Storage
Hub 7 or 12, Compaq StorageWorks FC-AL Switch 8, the Compaq
StorageWorks RA4000 Controller, and the Fibre Channel cables.
Note that the Compaq shared external storage documentation explains how to
install these devices for a single server. Because clustering requires shared
storage, you will need to install these devices for two servers. This will require
running an extra Fibre Channel cable from the Fibre Channel storage hub or
FC-AL switch to the second server (Figure 3-1).
Figure 3-1. RA4000/4100 storage system connected to clustered servers in the HA/F100 configuration (the diagram shows Node 1 and Node 2 joined by a dedicated interconnect, both attached to the LAN, and both connected through a Fibre Channel storage hub or FC-AL switch to the RA4000/4100)
For optimum performance and stability, Compaq highly recommends setting port LIP propagation policies in a cluster environment with Compaq StorageWorks FC-AL Switches. Select the following settings:
Connected to Server (Host) = Disabled (NOLIP)
Connected to Storage (Target) = Enabled (LIP)
The ports on the Compaq StorageWorks FC-AL Switch 8 are configured using the FC-AL Switch Management Utility included with the switch. In this utility, select 8 Port FC-AL Switch in the left-hand column, then select the FC-AL Switch Port Detail tab, and then the Configuration tab. On this screen, you can set the LIP propagation policy to Enabled or Disabled on a per-port basis, as required by the ports in your cluster configuration.
The ports on the Compaq StorageWorks FC-AL 3-Port Expansion Module are configured in a similar fashion: select 3 Port Expansion Module in the left-hand column of the management utility screen, then select the PEM Port Detail tab, and finally the Configuration tab.
For more information on configuring port policies, refer to the Compaq StorageWorks FC-AL Switch User Guide.
IMPORTANT: Before running the Compaq Array Configuration Utility, ensure that all shared drives are in the storage box.
Powering Up
Before applying power to the RA4000/4100, ensure that all components are installed and connected to the Fibre Channel storage hub or FC-AL switch.
Power up the cluster in the following order:
1. Fibre Channel storage hubs or FC-AL switches. Power is applied when
the AC power cord is plugged in.
2. Storage systems
3. Servers
Configuring Shared Storage
The Compaq Array Configuration Utility sets up the hardware aspects of any
drives attached to an array controller, including the drives in the shared
RA4000/4100s. The Array Configuration Utility can initially configure the
array controller, reconfigure the array controller, add additional disk drives to
an existing configuration, and expand capacity. The Array Configuration
Utility stores the drive configuration information on the drives themselves;
therefore, after you have configured the drives from one of the cluster nodes, it
is not necessary to configure the drives from the other cluster node.
For detailed information about configuring the drives, refer to the section on
the Compaq Array Configuration Utility in the Compaq shared external
storage documentation.
NOTE: The Array Configuration Utility runs automatically during an automated SmartStart installation.
Setting Up a Dedicated Interconnect
There are four ways to set up a dedicated interconnect.
Ethernet direct connect
Ethernet direct connect using a private hub or switch
ServerNet direct connect
ServerNet direct connect using a switch
Ethernet Direct Connect
An Ethernet crossover cable is included with your Compaq ProLiant Cluster kit. This cable directly connects the two NICs that serve as the dedicated interconnect. Connect one end of the cable to the NIC in Node 1 and the other end of the cable to the NIC in Node 2.
IMPORTANT: Connect the cable to the dedicated interconnect NICs and not to the Ethernet connections used for the network clients (the public LAN).
NOTE: The crossover cable will not work in conjunction with a network hub or switch.
Ethernet Direct Connect Using a Private Hub or Switch
An Ethernet hub or switch requires standard Ethernet cables; Ethernet crossover cables will not work with a hub or switch. Follow these steps to cable the server interconnect using an Ethernet hub or switch:
1. Connect the end of one of the Ethernet cables to the NIC in Node 1.
2. Connect the other end of the cable to a port in the hub or switch.
3. Repeat steps 1 and 2 for the NIC in Node 2.
IMPORTANT: Place the cable into the dedicated interconnect NICs and not into the Ethernet connections used for the network clients (the public LAN).
ServerNet Direct Connect
To use the Compaq ServerNet option as the server interconnect for your ProLiant Cluster, you need the following:
Two ServerNet PCI adapter cards
Two ServerNet cables
Follow these steps to install the ServerNet interconnect:
1. Connect one end of a ServerNet cable to connector X on the ServerNet
card in Node 1.
2. Connect the other end of the ServerNet cable to connector X on the
ServerNet card in Node 2.
3. Connect the two ends of the second ServerNet cable to the Y connectors
on the ServerNet cards in Node 1 and Node 2.
IMPORTANT: Fasten the cable screws tightly. A loose cable could cause an unexpected fault in the interconnect path and an unnecessary failover event.
ServerNet Direct Connect Using a Switch
Although not necessary for a two-node cluster, the use of a ServerNet Switch allows for future growth. Refer to the Compaq ServerNet documentation for a description and detailed installation instructions.
Setting Up a Public Interconnect
It is possible—but not recommended—to use a public network as your
dedicated interconnect path. To set up a public Ethernet interconnect, connect
the NICs, hub, and cables as you would in a nonclustered environment. Then
configure the NICs for both network clients and for the dedicated interconnect.
IMPORTANT: Using a public network as your dedicated interconnect path is not recommended because it represents a potential single point of failure for cluster communication.
NOTE: ServerNet is designed to be used only as a private or dedicated interconnect. It cannot be used as a public interconnect.
Redundant Interconnect
MSCS allows you to configure any certified network card as a possible path
for intracluster communication. If you are employing a dedicated interconnect,
use MSCS to configure your LAN network cards to serve as a backup for your
interconnect.
See the “Recommended Cluster Communication Strategy” section in
Chapter 2 of this guide for more information about setting up redundancy for
intracluster and cluster-to-LAN communication.
Installing the Software
The following sections describe the software installation steps for the
HA/F100 and the HA/F200. Proceed with these steps once you have all
equipment installed and your hubs or switches, storage system, and one server
powered up.
You need the following during installation:
IMPORTANT: Refer to Appendix C for the software and firmware version levels your cluster requires.
Compaq SmartStart and Support Software
Compaq SmartStart Setup Poster
Server Profile Diskette (included with SmartStart)
One of the following operating systems:
Microsoft Windows NTS/E software and documentation
Microsoft Windows 2000 Advanced Server software and
documentation
Microsoft Service Packs
Compaq redundancy management software (HA/F200 only)
Compaq Redundancy Manager (Fibre Channel) (for Windows NTS/E)
Compaq SANworks Secure Path for Windows 2000 on RAID Array 4000/4100 for HA/F200
Monitoring and Management Software
Compaq Insight Manager software and documentation
Compaq Insight Manager XE software and documentation
Compaq Intelligent Cluster Administrator software and documentation
Compaq Cluster Verification Utility
At least 10 high-density diskettes
Assisted Integration Using SmartStart (Recommended)
IMPORTANT: Prior to the installation of Microsoft Windows 2000 Advanced Server,
upgrade the system ROM on each node with the latest System ROMPaq from the Compaq website at http://www.compaq.com/support.
Use the SmartStart Assisted Integration procedure to configure the servers (nodes) in the HA/F100 and HA/F200 configuration. You will set up two nodes during this process. Proceed through all of the steps on each of the nodes, with noted exceptions.
CAUTION: Installation using SmartStart assumes that SmartStart is being installed on new servers. Any existing data on the server's boot drive will be erased.
Cluster-Specific SmartStart Installation
The SmartStart Setup Poster describes the typical procedure for configuring
and installing software on a single server. The installation for Compaq
ProLiant Clusters HA/F100 and HA/F200 will be similar. The differences between running SmartStart on a stand-alone server and running SmartStart for a cluster are noted below:
Through the Compaq Array Configuration Utility, you can configure the
shared drives on both servers. For cluster configuration, configure the drives on the first server, then accept the same settings for the shared drives when given the option on the second server.
When configuring drives through the Array Configuration Utility, create
a logical drive with 100 MB of space to be used as the quorum disk.
Assisted Integration Installation Steps
IMPORTANT: Power down Node 2 when setting up Node 1.
1. Power up your hardware in the following manner:
a. Fibre Channel storage hub or FC-AL switch (power is applied when
the AC cord is plugged in).
b. Shared storage and wait for drives to spin up.
c. Node 1 and place the SmartStart CD in the CD-ROM drive. The CD
will automatically run.
2. Select the Assisted Integration installation path. Follow steps outlined in
the SmartStart Setup Poster.
3. Select one of the following when SmartStart prompts for the operating
system:
Microsoft Windows NT Server 4.0/Enterprise Edition (Retail)
Microsoft Windows NT Server 4.0/Enterprise Edition (Select)
Microsoft Windows 2000 Advanced Server
4. Press Enter after the hardware configuration utility has run. SmartStart
will automatically run the Array Configuration Utility.
IMPORTANT: Node 2 exception: In Step 5, when configuring Node 2, the Array Configuration Utility shows the results of the shared drives configured during Node 1 setup. Accept these settings for Node 2 by exiting the Array Configuration Utility.
NOTE: If the node being configured has an array controller attached to server-centric (local) hard drives, that array controller will also need to be configured at this time.
NOTE: Create a logical drive on one of the RA4000/4100 Arrays with 100 MB of space to be used as the quorum drive.
5. Choose the custom configuration option to create RAID sets on your
RA4000/4100 storage system. Refer to the user guide for the RA4000 or RA4100 for more details.
After you have completed using the Array Configuration Utility, the
system will reboot and SmartStart will automatically create your system partition.
6. Install additional Compaq software and utilities and choose the boot
partition. If installing Microsoft Windows NTS/E, install the Compaq Server Support for Microsoft Windows NT. SmartStart will guide you through the steps. Also, follow the instructions in the SmartStart setup poster.
IMPORTANT: In Step 7, when configuring Node 2, exit out of the Diskette Builder Utility and go to Step 8.
7. Create the Options ROMPaq in the Diskette Builder Utility. Label the
diskettes you create. The Options ROMPaq updates the firmware on the array controllers and the hard drives. For more information about Options ROMPaq, refer to the documentation that came with the RA4000/4100.
The node will reboot to prepare for the operating system installation.
8. Insert the Microsoft Windows CD when prompted.
9. If installing Windows NTS/E, install Service Pack 3 when prompted.
After Service Pack 3 is installed, the node reboots and Enterprise Edition Installer loads automatically. Exit the Enterprise Edition Installer.
IMPORTANT: In Step 10, when updating the firmware on the array controllers, make sure that Node 2 is powered off.
IMPORTANT: Node 2 Exception: Do not update the firmware on the array controllers for the external shared storage when setting up Node 2.
10. Power down Node 1, insert the Options ROMPaq diskette in Node 1, and
restart the node. Run Options ROMPaq from diskettes and choose to update the firmware on the array controllers.
11. Power down the storage and Node 1 after the firmware update
completes.
12. Power on the storage and wait for the drives to spin up.
13. Power on Node 1.
14. Open the Disk Administrator for Windows NTS/E or Disk Management
for Windows 2000 Advanced Server. If prompted for drive signature stamp, choose “Yes.” If prompted to upgrade disks, choose “No” because MSCS does not support dynamic disks in a cluster.
15. Power on Node 2 and repeat steps 2-13.
16. Open the Disk Administrator for Windows NTS/E or Disk Management
for Windows 2000 Advanced Server on Node 2. If prompted for drive signature stamp, choose “Yes.” If prompted to upgrade disks, choose “No” because MSCS does not support dynamic disks in a cluster.
17. If configuring an HA/F200 with Windows NTS/E, install the Compaq
Redundancy Manager on both nodes using the following steps:
a. Place the Compaq Redundancy Manager (Fibre Channel) CD in the
CD-ROM drive. It automatically loads the Install program.
b. Follow the instructions offered by the Redundancy Manager
installation screens.
c. Remove the Compaq Redundancy Manager (Fibre Channel) CD
from the CD-ROM drive.
d. Reboot the node.
To manually install Redundancy Manager:
a. Place the Compaq Redundancy Manager (Fibre Channel) CD into
the CD-ROM drive.
b. Select Settings from the Start menu.
c. Select Control Panel from the Settings menu.
d. Select Add/Remove Programs from the Control Panel.
e. Click Install from the Add/Remove Programs page.
f. Click Next from the Add/Remove Programs page.
g. Click Browse from the Add/Remove Programs page.
h. Locate the Redundancy Manager SETUP.EXE file on the Compaq
Redundancy Manager (Fibre Channel) CD.
i. Click Finish from the Add/Remove Programs page. The setup
program begins.
j. Follow the instructions displayed on the Redundancy Manager
installation screens.
k. Close the Control Panel.
l. Remove the Compaq Redundancy Manager (Fibre Channel) CD
from the CD-ROM drive.
m. Reboot the node.
To use Redundancy Manager, double-click the icon. For more
information about Redundancy Manager, refer to the online documentation (CPQDXCFG.HLP) included on the CD.
18. If configuring an HA/F200 with Windows 2000 Advanced Server,
install Secure Path on both nodes using the following steps:
a. Insert the Secure Path CD to automatically start the Secure Path
installation process. Alternatively, double-click the following file on the CD:
<CD-ROM drive>:\SPInstall\setup.exe
b. During the installation, you are required to configure your clients.
Remove the Compaq SANworks Secure Path CD from the CD-ROM
drive.
c. Reboot the node when prompted.
To use Secure Path, select Start, Programs, SecurePath, SPM.
NOTE: If you have problems authorizing client connections using Fully Qualified Domain Names (FQDN), the cause may be a Domain Name Service (DNS) resolution issue, which can be resolved by a HOSTS file entry containing the relevant FQDN-to-IP-address mapping.
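For example, an entry in the HOSTS file (%SystemRoot%\system32\drivers\etc\hosts) mapping a hypothetical client FQDN to its IP address would look like this:

    # Hypothetical Secure Path client; substitute your own FQDN and address
    192.168.1.21    client1.example.com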
For more detailed information on Secure Path, refer to the Secure Path documentation.
19. Run the Compaq Cluster Verification Utility (CCVU) from the CD in your cluster kit to ensure that your node is ready for cluster installation.
Refer to the CCVU online help for detailed information on running
CCVU.
NOTE: You must have administrative accounts with identical username and password on the computers selected.
IMPORTANT: When setting up the cluster, both nodes must have the operating system installed prior to installing and configuring MSCS.
20. Install MSCS for Node 1.
For Windows NTS/E, open the Enterprise Edition Installer and
install MSCS on both cluster nodes as outlined in MSCS documentation.
For Windows 2000 Advanced Server, install the Cluster Service
(MSCS) component in Add/Remove Programs. For more information on installing and configuring MSCS, refer to your Windows 2000 Advanced Server documentation.
21. Install MSCS for Node 2.
22. Run CCVU again to verify successful cluster installation.
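In addition to CCVU, you can spot-check the new cluster from a command prompt with cluster.exe. This is a quick sketch; see your Microsoft clustering documentation for full cluster.exe syntax:

    REM List each cluster node and its state (both nodes should report "Up")
    cluster node /status

    REM List each cluster group, the node that owns it, and its state
    cluster group /status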
23. Install the Microsoft Service Packs.
If installing Windows NTS/E, install Microsoft Windows NT Service Pack 6a after cluster installation completes.
If installing Microsoft Windows 2000 Advanced Server, install Microsoft Windows 2000 Service Pack 1 after cluster installation completes.
For the latest information on Service Packs, refer to your Microsoft provider or the Microsoft website (http://www.microsoft.com).
24. Run the appropriate support software.
For Microsoft Windows NTS/E, run Compaq Server Support for Microsoft Windows NT to verify that all installed drivers are current. This service can be run from the following path on the SmartStart CD:
x:\cpqsupsw\ntssd\setup.exe
For Microsoft Windows 2000 Advanced Server, run Compaq Support Paq for Windows 2000 to verify that all installed drivers are current. This service can be run from the following path on the SmartStart CD:
x:\cpqsupsw\ntcsp\setup.exe
For the latest versions of the support software for Microsoft Windows NTS/E or Microsoft Windows 2000 Advanced Server, refer to the Compaq support website (http://www.compaq.com/support).
25. Install your applications and managing and monitoring software.
Refer to the Compaq Insight Manager Installation Poster for information on installing Compaq Insight Manager on the management console and Insight Management Agents on servers and desktops.
The Compaq Intelligent Cluster Administrator CD is located in your HA/F200 cluster kit and is available as an orderable option for the HA/F100. Steps for installing Compaq Intelligent Cluster Administrator can be found later in this chapter and in the Compaq Intelligent Cluster Administrator Quick Setup Guide.