
Compaq ProLiant Clusters HA/F100 and HA/F200

Administrator Guide
Second Edition (September 1999) Part Number 380362-002 Compaq Computer Corporation

Notice

The information in this publication is subject to change without notice. COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR
EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED “AS IS” AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT.
This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.
© 1999 Compaq Computer Corporation. All rights reserved. Printed in the U.S.A. The software described in this guide is furnished under a license agreement or nondisclosure agreement.
The software may be used or copied only in accordance with the terms of the agreement. Compaq, Deskpro, Fastart, Compaq Insight Manager, Systempro, Systempro/LT, ProLiant, ROMPaq,
QVision, SmartStart, NetFlex, QuickFind, PaqFax, ProSignia, registered United States Patent and Trademark Office.
Netelligent, Systempro/XL, SoftPaq, QuickBlank, QuickLock are trademarks and/or service marks of Compaq Computer Corporation.
Microsoft, MS-DOS, Windows, and Windows NT are registered trademarks of Microsoft Corporation. Pentium is a registered trademark and Xeon is a trademark of Intel Corporation. Other product names mentioned herein may be trademarks and/or registered trademarks of their
respective companies.
Compaq ProLiant Clusters HA/F100 and HA/F200 Administrator Guide Second Edition (September 1999) Part Number 380362-002

Contents

About This Guide
Audience......................................................................................................................ix
Scope.............................................................................................................................x
Text Conventions.........................................................................................................xi
Symbols in Text..........................................................................................................xii
Getting Help................................................................................................................xii
Compaq Technical Support.................................................................................xii
Compaq Website................................................................................................ xiii
Compaq Authorized Reseller ............................................................................ xiii
Chapter 1
Architecture of the Compaq ProLiant Clusters HA/F100 and HA/F200
Overview of Compaq ProLiant Clusters HA/F100 and HA/F200 Components...... 1-1
Compaq ProLiant Cluster HA/F100......................................................................... 1-3
Compaq ProLiant Cluster HA/F200......................................................................... 1-5
Compaq ProLiant Servers......................................................................................... 1-7
Compaq StorageWorks RAID Array 4000 Storage System..................................... 1-7
Compaq StorageWorks RAID Array 4000 ....................................................... 1-8
Compaq StorageWorks Fibre Channel Storage Hubs....................................... 1-9
Compaq StorageWorks RA4000 Controller...................................................... 1-9
Compaq StorageWorks Fibre Channel Host Adapter ..................................... 1-10
Gigabit Interface Converter-Shortwave .......................................................... 1-10
Cables .............................................................................................................. 1-10
Cluster Interconnect................................................................................................ 1-12
Client Network ................................................................................................ 1-12
Private or Public Interconnect......................................................................... 1-12
Interconnect Adapters...................................................................................... 1-13
Redundant Interconnects ................................................................................. 1-13
Microsoft Software................................................................................................. 1-14
Compaq Software ....................................................................................................1-14
Compaq SmartStart and Support Software CD................................................1-15
Compaq Redundancy Manager (Fibre Channel)..............................................1-16
Compaq Cluster Verification Utility................................................................1-16
Compaq Insight Manager.................................................................................1-17
Compaq Insight Manager XE...........................................................................1-18
Compaq Intelligent Cluster Administrator.......................................................1-18
Resources for Application Installation.............................................................1-19
Chapter 2
Designing the Compaq ProLiant Clusters HA/F100 and HA/F200
Planning Considerations............................................................................................2-2
Cluster Configurations .......................................................................................2-2
Cluster Groups....................................................................................................2-8
Reducing Single Points of Failure in the HA/F100 Configuration..................2-13
Enhanced High Availability Features of the HA/F200....................................2-22
Capacity Planning....................................................................................................2-26
Server Capacity ................................................................................................2-27
Shared Storage Capacity ..................................................................................2-29
Load Balancing.................................................................................................2-32
Networking Capacity........................................................................................2-33
Network Considerations..........................................................................................2-34
Network Configuration.....................................................................................2-34
Migrating Network Clients...............................................................................2-35
Failover/Failback Planning......................................................................................2-37
Performance After Failover..............................................................................2-37
MSCS Thresholds and Periods.........................................................................2-38
Failover of Directly Connected Devices..........................................................2-39
Manual vs. Automatic Failback .......................................................................2-40
Failover and Failback Policies .........................................................................2-40
Chapter 3
Setting Up the Compaq ProLiant Clusters HA/F100 and HA/F200
Preinstallation Overview ...........................................................................................3-1
Preinstallation Guidelines..........................................................................................3-3
Installing the Hardware .............................................................................................3-6
Setting Up the Nodes..........................................................................................3-6
Setting Up the Compaq StorageWorks RAID Array 4000 Storage System........3-8
Setting Up a Private Interconnect.....................................................................3-10
Setting Up a Public Interconnect......................................................................3-12
Redundant Interconnect....................................................................................3-12
Installing the Software............................................................................................ 3-13
Assisted Integration Using SmartStart (Recommended)................................. 3-13
Manual Installation Using SmartStart............................................................. 3-19
Compaq Intelligent Cluster Administrator ............................................................. 3-22
Installing Compaq Intelligent Cluster Administrator...................................... 3-22
Additional Cluster Verification Steps..................................................................... 3-23
Verifying the Creation of the Cluster.............................................................. 3-23
Verifying Node Failover.................................................................................. 3-24
Verifying Network Client Failover ................................................................. 3-25
Chapter 4
Upgrading the HA/F100 to an HA/F200
Preinstallation Overview .......................................................................................... 4-1
Materials Required.................................................................................................... 4-2
Upgrade Procedures.................................................................................................. 4-3
Chapter 5
Managing the Compaq ProLiant Clusters HA/F100 and HA/F200
Managing a Cluster Without Interrupting Cluster Services ..................................... 5-2
Managing a Cluster in a Degraded Condition .......................................................... 5-2
Managing Hardware Components of Individual Cluster Nodes .............................. 5-3
Managing Network Clients Connected to a Cluster................................................. 5-3
Managing a Cluster’s Shared Storage....................................................................... 5-4
Remotely Managing a Cluster.................................................................................. 5-4
Viewing Cluster Events............................................................................................ 5-4
Modifying Physical Cluster Resources..................................................................... 5-5
Removing Shared Storage System .................................................................... 5-5
Adding Shared Storage System......................................................................... 5-5
Adding or Removing Shared Storage Drives.................................................... 5-7
Physically Replacing a Cluster Node................................................................ 5-9
Backing Up Your Cluster ....................................................................................... 5-10
Managing Cluster Performance.............................................................................. 5-11
Compaq Redundancy Manager............................................................................... 5-12
Changing Paths................................................................................................ 5-13
Other Functions ............................................................................................... 5-14
Compaq Insight Manager ....................................................................................... 5-15
Cluster-Specific Features of Compaq Insight Manager.................................. 5-16
Compaq Insight Manager XE................................................................................. 5-17
Cluster Monitor................................................................................................ 5-18
Compaq Intelligent Cluster Administrator..............................................................5-20
Monitoring and Managing an Active Cluster...................................................5-20
Managing Cluster History................................................................................5-21
Importing and Exporting Cluster Configurations.............................................5-21
Microsoft Cluster Administrator .............................................................................5-22
Chapter 6
Troubleshooting the Compaq ProLiant Clusters HA/F100 and HA/F200
Installation .................................................................................................................6-2
Troubleshooting Node-to-Node Problems ................................................................6-5
Shared Storage...........................................................................................................6-7
Client-to-Cluster Connectivity ................................................................................6-12
Cluster Groups and Cluster Resources......................................................6-16
Troubleshooting Compaq Redundancy Manager....................................................6-17
Event Logging..................................................................................................6-17
Informational Messages ...................................................................................6-17
Warning Message.............................................................................................6-20
Error Messages.................................................................................................6-20
Other Potential Problems..................................................................................6-22
Appendix A
Cluster Configuration Worksheets
Overview...................................................................................................................A-1
Cluster Group Definition Worksheet........................................................................A-2
Shared Storage Capacity Worksheet ........................................................................A-3
Group Failover/Failback Policy Worksheet.............................................................A-4
Preinstallation Worksheet.........................................................................................A-5
Appendix B
Using Compaq Redundancy Manager in a Single-Server Environment
Overview...................................................................................................................B-1
Installing Redundancy Manager...............................................................................B-4
Automatically Installing Redundancy Manager................................................B-4
Manually Installing Redundancy Manager.......................................................B-5
Managing Redundancy Manager..............................................................................B-6
Changing Paths..................................................................................................B-7
Expanding Capacity ..........................................................................................B-8
Other Functions.................................................................................................B-9
Troubleshooting Redundancy Manager ...................................................................B-9
Overview.................................................................................................................B-10
Informational Messages..........................................................................................B-10
Warning Message ...................................................................................................B-12
Error Messages .......................................................................................................B-13
Troubleshooting Redundancy Manager..................................................................B-16
Troubleshooting Potential Problems ...............................................................B-16
Appendix C
Software and Firmware Versions
Glossary Index

About This Guide

This guide is designed to be used as step-by-step instructions for installation and as a reference for operation, troubleshooting, and future upgrades of the cluster server. It provides information about the installation, configuration, and implementation of the Compaq ProLiant Cluster models HA/F100 and HA/F200.

Audience
The primary audience of this guide consists of MIS professionals whose jobs include designing, installing, configuring, and maintaining Compaq ProLiant clusters.
This guide contains information that may be used by network administrators, installation technicians, systems integrators, and other technical personnel in the enterprise environment for the purpose of cluster installation, implementation, and maintenance.
IMPORTANT: This guide contains installation, configuration, and maintenance information that can be valuable for a variety of users. If you are installing the Compaq ProLiant Cluster HA/F100 or HA/F200 but will not be administering the cluster on a daily basis, please make this guide available to the person who will be responsible for the clustered servers when you have completed the installation.

Scope

Some clustering topics are mentioned, but not detailed, in this guide. Be sure to obtain other Compaq documents that offer additional guidance. This guide does not describe how to install and configure specific applications on a cluster; however, several Compaq TechNotes provide this information for industry-leading applications.
This guide is designed to assist you in attaining the following objectives:
• Planning and designing the Compaq ProLiant Cluster HA/F100 or HA/F200 configuration to meet your business needs
• Installing and configuring the Compaq ProLiant Cluster HA/F100 or HA/F200 hardware and software
• Using Compaq Insight Manager, Compaq Insight Manager XE, Compaq Intelligent Cluster Administrator, and Compaq Redundancy Manager to manage your Compaq ProLiant Cluster HA/F100 or HA/F200
The contents of this guide are outlined below:
• Chapter 1, “Architecture of the Compaq ProLiant Clusters HA/F100 and HA/F200,” describes the hardware and software components of the Compaq ProLiant Clusters HA/F100 and HA/F200.
• Chapter 2, “Designing the Compaq ProLiant Clusters HA/F100 and HA/F200,” outlines a step-by-step approach to planning and designing a cluster configuration that meets your business needs. Included are several cluster planning worksheets that help in documenting the information necessary to configure your clustering solution.
• Chapter 3, “Setting Up the Compaq ProLiant Clusters HA/F100 and HA/F200,” outlines the steps you will take to install and configure the Compaq ProLiant Clusters HA/F100 and HA/F200.
• Chapter 4, “Upgrading the HA/F100 to an HA/F200,” illustrates procedures for upgrading the Compaq ProLiant Cluster HA/F100 to an HA/F200.
• Chapter 5, “Managing the Compaq ProLiant Clusters HA/F100 and HA/F200,” includes techniques for managing and maintaining the Compaq ProLiant Clusters HA/F100 and HA/F200.
• Chapter 6, “Troubleshooting the Compaq ProLiant Clusters HA/F100 and HA/F200,” contains high-level troubleshooting information for the Compaq ProLiant Clusters HA/F100 and HA/F200.
• Appendix A, “Cluster Configuration Worksheets,” contains blank worksheets to copy and use as directed in the cluster design and installation steps outlined in Chapters 2, 3, and 4.
• Appendix B, “Using Compaq Redundancy Manager in a Single-Server Environment,” explains how to implement Redundancy Manager in a nonclustered server environment.
• Appendix C, “Software and Firmware Versions,” provides the software and firmware version levels that are required for your Compaq ProLiant Cluster.
• The Glossary provides definitions for terms used throughout the guide.

Text Conventions

This document uses the following conventions to distinguish elements of text:
Keys: Keys appear in boldface. A plus sign (+) between two keys indicates that they should be pressed simultaneously.

USER INPUT: User input appears in a different typeface and in uppercase.

FILENAMES: File names appear in uppercase italics.

Menu Options, Command Names, Dialog Box Names: These elements appear with initial capital letters.

COMMANDS, DIRECTORY NAMES, and DRIVE NAMES: These elements appear in uppercase.

type: When you are instructed to type information, type the information without pressing the Enter key.

enter: When you are instructed to enter information, type the information and then press the Enter key.

Symbols in Text

These symbols may be found in the text of this guide. They have the following meanings:

WARNING: Text set off in this manner indicates that failure to follow directions in the warning could result in bodily harm or loss of life.

CAUTION: Text set off in this manner indicates that failure to follow directions could result in damage to equipment or loss of information.

IMPORTANT: Text set off in this manner presents clarifying information or specific instructions.

NOTE: Text set off in this manner presents commentary, sidelights, or interesting points of information.

Getting Help

If you have a problem and have exhausted the information in this guide, you can get further information and other help in the following locations.

Compaq Technical Support

You are entitled to free hardware technical telephone support for your product for as long as you own the product. A technical support specialist will help you diagnose the problem or guide you to the next step in the warranty process.
In North America, call the Compaq Technical Phone Support Center at 1-800-OK-COMPAQ. This service is available 24 hours a day, 7 days a week. For continuous quality improvement, calls may be recorded or monitored.
Outside North America, call the nearest Compaq Technical Support Phone Center. Telephone numbers for worldwide Technical Support Centers are listed on the Compaq website. Access the Compaq website by logging on to the Internet:
http://www.compaq.com
Be sure to have the following information available before you call Compaq:
• Technical support registration number (if applicable)
• Product serial numbers
• Product model name and number
• Applicable error messages
• Add-on boards or hardware
• Third-party hardware or software
• Operating system type and revision level
• Detailed, specific questions
For additional information, refer to documentation related to specific hardware and software components of the Compaq ProLiant Clusters HA/F100 and HA/F200, including, but not limited to, the following:
• Documentation related to the ProLiant servers you are clustering (for example, manuals, posters, and performance and tuning guides)
• Compaq RA4000 Array documentation
• Microsoft Windows NT Server 4.0/Enterprise Edition Administrator’s Guide

Compaq Website

The Compaq website has information on this product as well as the latest drivers and Flash ROM images. You can access the Compaq website by logging on to the Internet:
http://www.compaq.com

Compaq Authorized Reseller

For the name of your nearest Compaq authorized reseller:
• In the United States, call 1-800-345-1518.
• In Canada, call 1-800-263-5868.
• Elsewhere, see the Compaq website for locations and telephone numbers.
Chapter 1

Architecture of the Compaq ProLiant Clusters HA/F100 and HA/F200

Overview of Compaq ProLiant Clusters HA/F100 and HA/F200 Components

A cluster is a loosely coupled collection of servers and storage that acts as a single system, presents a single-system image to clients, provides protection against system failures, and provides configuration options for load balancing.
Clustering is an established technology that may provide one or more of the following benefits:
• Availability
• Scalability
• Manageability
• Investment protection
• Operational efficiency
Compaq ProLiant Clusters HA/F100 and HA/F200 platforms are composed of the following.

Hardware:
• Compaq ProLiant servers
• Compaq StorageWorks RAID Array 4000 Storage System (formerly Compaq Fibre Channel Storage System)
  ◦ Compaq StorageWorks RAID Array 4000
  ◦ Compaq StorageWorks Fibre Channel Storage Hub (7- or 12-port)
  ◦ Compaq StorageWorks Fibre Channel Host Adapter
  ◦ Compaq StorageWorks RA4000 Controller
  ◦ Gigabit Interface Converter-Shortwave (GBIC-SW) modules
• Cables
• Cluster interconnect adapters

Software:
• Microsoft Windows NT Server 4.0 Enterprise Edition
• Compaq SmartStart and Support Software CD
• Compaq Support Software Diskette for Windows NT (NT SSD)
• Compaq Redundancy Manager (Fibre Channel)
• Compaq Cluster Verification Utility
• Compaq Insight Manager
• Compaq Insight Manager XE
• Compaq Intelligent Cluster Administrator

This chapter discusses the role each of these products plays in bringing a complete clustering solution to your computing environment.

Compaq ProLiant Cluster HA/F100

The Compaq ProLiant Cluster HA/F100 includes these hardware solution components:
• 2 Compaq ProLiant servers
• 1 or more Compaq StorageWorks RAID Array 4000s
• 1 Compaq StorageWorks Fibre Channel Storage Hub (7- or 12-port)
• 1 Compaq StorageWorks RA4000 Controller per RA4000
• 1 Compaq StorageWorks Fibre Channel Host Adapter per server
• Network interface cards (NICs)
• Gigabit Interface Converter-Shortwave (GBIC-SW) modules
• Cables
  ◦ Multi-mode Fibre Channel cable
  ◦ Ethernet crossover cable
  ◦ Network (LAN) cable
The Compaq ProLiant Cluster HA/F100 includes these software solution components:
• Microsoft Windows NT Server 4.0 Enterprise Edition
• Compaq SmartStart and Support Software CD
• Compaq Support Software Diskette for Windows NT (NT SSD)
• Compaq Cluster Verification Utility
• Compaq Insight Manager (optional)
• Compaq Insight Manager XE (optional)
• Compaq Intelligent Cluster Administrator (optional)
NOTE: See Appendix C, “Software and Firmware Versions,” for the necessary software version levels for your cluster.
The following illustration depicts the HA/F100 configuration.

Figure 1-1. Hardware components of the Compaq ProLiant Cluster HA/F100 (Node 1 and Node 2 joined by a dedicated interconnect and the client LAN, sharing access to the RA4000 through a single storage hub)

The Compaq ProLiant Cluster HA/F100 configuration is a cluster with a Compaq StorageWorks RAID Array 4000, a single Compaq StorageWorks Fibre Channel Storage Hub (7- or 12-port), two Compaq ProLiant servers (nodes), a single Compaq StorageWorks Fibre Channel Host Adapter per server, a single Compaq StorageWorks RA4000 Controller per RA4000, and a dedicated interconnect.

Compaq ProLiant Cluster HA/F200

The Compaq ProLiant Cluster HA/F200 adds Redundancy Manager software and a second, redundant Fibre Channel Arbitrated Loop (FC-AL) to the HA/F100 configuration. The Redundancy Manager software, in conjunction with redundant Fibre Channel loops, enhances the high availability features of the HA/F200.
The Compaq ProLiant Cluster HA/F200 includes these hardware solution components:
• 2 Compaq ProLiant servers
• 1 or more Compaq StorageWorks RAID Array 4000s
• 2 Compaq StorageWorks Fibre Channel Storage Hubs (7- or 12-port)
• 2 Compaq StorageWorks RA4000 Controllers per RA4000
• 2 Compaq StorageWorks Fibre Channel Host Adapters per server
• Network interface cards (NICs)
• Gigabit Interface Converter-Shortwave (GBIC-SW) modules
• Cables
  ◦ Multi-mode Fibre Channel cable
  ◦ Ethernet crossover cable
  ◦ Network (LAN) cable
The Compaq ProLiant Cluster HA/F200 includes these software solution components:
• Microsoft Windows NT Server 4.0 Enterprise Edition
• Compaq SmartStart and Support Software CD
• Compaq Support Software Diskette for Windows NT (NT SSD)
• Compaq Redundancy Manager (Fibre Channel)
• Compaq Cluster Verification Utility
• Compaq Insight Manager (optional)
• Compaq Insight Manager XE (optional)
• Compaq Intelligent Cluster Administrator (optional)
NOTE: See Appendix C, “Software and Firmware Versions,” for the necessary software version levels for your cluster.
The following illustration depicts the basic HA/F200 configuration.

Figure 1-2. Hardware components of the Compaq ProLiant Cluster HA/F200 (Node 1 and Node 2 joined by a dedicated interconnect and the client LAN, sharing access to the RA4000 through two redundant storage hubs)

The Compaq ProLiant Cluster HA/F200 configuration is a cluster with one or more Compaq StorageWorks RAID Array 4000s, two Compaq StorageWorks Fibre Channel Storage Hubs (7- or 12-port), two Compaq ProLiant servers, two Compaq StorageWorks Fibre Channel Host Adapters per server, two Compaq StorageWorks RA4000 Controllers per RA4000, and a dedicated interconnect.

Compaq ProLiant Servers

Compaq industry-standard servers are a primary component of all models of Compaq ProLiant Clusters. At the high end of the ProLiant server line, several high availability and manageability features are incorporated as a standard part of the server feature set. These include online backup processors, a PCI bus with hot-plug capabilities, redundant hot-pluggable fans, redundant processor power modules, redundant Network Interface Controller (NIC) support, dual-ported hot-pluggable 10/100 NICs, and redundant hot-pluggable power supplies (on most high-end models). Many of these features are available at the low end and midrange of the Compaq ProLiant server line as well.
Compaq has logged thousands of hours testing multiple models of Compaq servers in clustered configurations and has successfully passed the Microsoft Cluster Certification Test Suite on numerous occasions. In fact, Compaq was the first vendor to be certified using a shared storage subsystem connected to ProLiant servers through Fibre Channel Arbitrated Loop technology.
NOTE: Visit the Compaq High Availability website (http://www.compaq.com/highavailability) to obtain a comprehensive list of cluster-certified servers.

Compaq StorageWorks RAID Array 4000 Storage System

Microsoft Cluster Server (MSCS) is based on a cluster architecture known as shared storage clustering, in which clustered servers share access to a common set of hard drives. MSCS requires all clustered (shared) data to be stored in an external storage system.
The Compaq StorageWorks RAID Array 4000 storage system is the shared storage system for the Compaq ProLiant Clusters HA/F100 and HA/F200. The storage system consists of the following components and options:
• Compaq StorageWorks RAID Array 4000
• Compaq StorageWorks Fibre Channel Storage Hub (7- or 12-port)
• Compaq StorageWorks RA4000 Controllers
• Compaq StorageWorks Fibre Channel Host Adapters
• Gigabit Interface Converter-Shortwave (GBIC-SW) modules
• Cables
  ◦ Multi-mode Fibre Channel cable
  ◦ Ethernet crossover cable
  ◦ Network (LAN) cable
Each of these components is discussed in the following sections. For detailed information, refer to the following guides:
• Compaq StorageWorks RAID Array 4000 User Guide
• Compaq StorageWorks RAID Array 4000 Configuration Poster
• Compaq StorageWorks RAID Array 4000 Redundant Array Controller Configuration Poster
• Compaq StorageWorks Fibre Channel Host Adapter Installation Guide
• Compaq StorageWorks Fibre Channel Storage Hub 7 Installation Guide
• Compaq StorageWorks Fibre Channel Storage Hub 12 Installation Guide
• Compaq Fibre Channel Troubleshooting Guide

For more information about shared storage clustering, refer to the Microsoft Cluster Server Administrator’s Guide.

Compaq StorageWorks RAID Array 4000

The Compaq StorageWorks RAID Array 4000 (RA4000, previously the Compaq Fibre Channel Storage System) is the storage cabinet that contains the disk drives, power supply, and array controller. The RA4000 can hold twelve 1-inch or eight 1.6-inch Wide-Ultra SCSI drives. The RA4000 supports the same hot-pluggable drives as Compaq servers and Compaq ProLiant Storage Systems, as well as the online capacity expansion, online spares, and RAID fault tolerance of SMART-2 Array Controller technology. The RA4000 also supports hot-pluggable, redundant power supplies and fans, hot-pluggable hard drives, and MSCS.
The HA/F100 and HA/F200 ProLiant Clusters must have at least one RA4000 set up as external shared storage. Consult the Order and Configuration Guide for Compaq ProLiant Cluster HA/F100 and HA/F200 at the Compaq High Availability website (http://www.compaq.com/highavailability) to determine the maximum supported cluster configuration.

Compaq StorageWorks Fibre Channel Storage Hubs

The servers in a Compaq ProLiant Cluster HA/F100 and HA/F200 are connected to one or more Compaq StorageWorks RAID Array 4000 shared external storage systems using industry-standard Fibre Channel Arbitrated Loop (FC-AL) technology. The components used to implement the Fibre Channel Arbitrated Loop include shortwave (multi-mode) fiber optic cables, Gigabit Interface Converter-Shortwave (GBIC-SW) modules, and Fibre Channel storage hubs. The Compaq StorageWorks Fibre Channel Storage Hub is a critical component of the FC-AL configuration and allows up to five RA4000s to be connected to the cluster servers in a “star” topology. For the HA/F100, a single hub is used. For the HA/F200, two redundant hubs are used. Either the 7-port or the 12-port hub may be used in both types of clusters. For the HA/F200 cluster, 7-port and 12-port hubs may be combined, if desired. If the maximum number of supported RA4000s (currently five) are connected to either type of cluster using a 12-port hub, there will be unused ports. Compaq does not currently support using these ports to connect additional RA4000s. Other FC-AL-capable devices, such as tape backup systems, should not be connected to these unused ports under any circumstances.

Compaq StorageWorks RA4000 Controller

The Compaq Stor ageWorks RA4000 Controller is fully RAID capable and manages all of t he drives in the RA4000 storage array. Each RA4000 is shipped with one controller installed. In an HA/F100 cluster, each array controller is connected to both servers through a single Fibre Channel storage hub. In an HA/F200 cluster, the addition of a second Compaq StorageWorks RA4000 Redundant Controller is required to provide redundancy. These redundant controllers are connected to each server through two separate, and redundant, Fibre Channel storage hubs. This dual-connection configur ation implements a vital aspect of the enhanced high availability features of the HA/F200 cluster.

Compaq StorageWorks Fibre Channel Host Adapter

Compaq StorageWorks Fibre Channel Host Adapters (host bus adapters) are the interface between the server and the RA4000 storage system. At least two host bus adapters (PCI or EISA), one for each cluster node, are required in the Compaq ProLiant Cluster HA/F100. At least four host bus adapters (PCI only), two for each cluster node, are required in the HA/F200 configuration.
For more information about this product, refer to the Compaq StorageWorks Fibre Channel Host Adapter Installation Guide.

Gigabit Interface Converter-Shortwave

Two Gigabit Interface Converter-Shortwave (GBIC-SW) modules are required for each Fibre Channel cable installed. Two GBIC-SW modules are provided with each RA4000 and host bus adapter.
GBIC-SW modules hot-plug into Fibre Channel storage hubs, array controllers, and host bus adapters. These converters provide ease of expansion and 100-MB/s performance. GBIC-SW modules support distances up to 500 meters using multi-mode fiber optic cable.

Cables

Three general categories of cables are used for Compaq ProLiant HA/F100 and HA/F200 clusters:

Server to Storage

Shortwave (multi-mode) fiber optic cables are used to connect the servers, hubs, and RA4000s in a Fibre Channel Arbitrated Loop configuration.
Cluster Interconnect

Two types of cluster interconnect cables may be used, depending on the type of devices used to implement the interconnect and on whether the interconnect is dedicated or shared:

1. Ethernet. If Ethernet NICs are used to implement the interconnect, there are three options:
• Dedicated Interconnect Using an Ethernet Crossover Cable: An Ethernet crossover cable (supplied in both the HA/F100 and HA/F200 kits) can be used to connect the NICs directly together to create a dedicated interconnect.
• Dedicated Interconnect Using Standard Ethernet Cables and a Private Ethernet Hub: Standard Ethernet cables can be used to connect the NICs together through a private Ethernet hub to create another type of dedicated interconnect. Note that an Ethernet crossover cable should not be used when using an Ethernet hub, because the hub performs the crossover function.
• Shared Interconnect Using Standard Ethernet Cables and a Public Hub: Standard Ethernet cables may also be used to connect the NICs to a public network to create a nondedicated interconnect.

2. ServerNet. If Compaq ServerNet adapters are used to implement the interconnect, special ServerNet cables must be used.

Network Interconnect

Standard Ethernet cables are used to provide this type of connection.

Cluster Interconnect

The cluster interconnect is a data path over which nodes of a cluster communicate. This type of communication is termed intracluster communication. At a minimum, the interconnect consists of two network adapters (one in each server) and a cable connecting the adapters.
The cluster nodes use the interconnect data path to:
• Communicate individual resource and overall cluster status
• Send and receive heartbeat signals
• Update modified registry information
IMPORTANT: MSCS requires TCP/IP as the cluster communication protocol. When configuring the interconnects, be sure to enable TCP/IP.
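To make the heartbeat concept concrete, the following minimal sketch shows how a node might announce its health to its peer over the TCP/IP interconnect. This is an illustration only, not MSCS code; the peer address, port number, and interval are hypothetical.

    # Illustrative sketch only -- MSCS implements its heartbeat internally.
    # The peer address, port, and interval here are hypothetical values.
    import socket
    import time

    PEER = ("10.0.0.2", 3343)   # hypothetical address of the other node on the interconnect
    INTERVAL = 1.2              # hypothetical seconds between heartbeats

    def send_heartbeats() -> None:
        """Periodically tell the peer node that this node is alive."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        while True:
            sock.sendto(b"heartbeat", PEER)
            time.sleep(INTERVAL)

    def receive_heartbeats() -> None:
        """Listen for the peer's heartbeats; silence suggests a node failure."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", PEER[1]))
        sock.settimeout(INTERVAL * 3)      # tolerate a few missed beats before alarming
        while True:
            try:
                sock.recvfrom(64)          # datagram received: peer is alive
            except socket.timeout:
                print("peer heartbeat lost; cluster service would begin failure checks")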

Client Network

Every client/server application requires a local area network, or LAN, over which client machines and servers communicate. The components of the LAN are no different than with a stand-alone server configuration.
Because clients desiring the full advantage of the cluster will now connect to the cluster rather than to a specific server, configuring client connections will differ from those for a stand-alone server. Clients will connect to virtual servers, which are cluster groups that contain their own IP addresses.
Within this guide, communication between the network clients and the cluster is termed cluster-to-LAN communication.
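As a concrete illustration of the difference, a client addresses the virtual server’s IP address rather than either node’s own address. The sketch below assumes a hypothetical virtual IP address and service port; because the address belongs to the cluster group, the same connection code keeps working after the group fails over to the other node.

    # Minimal sketch, assuming a virtual server published at the hypothetical
    # address below. Clients address the cluster group's IP, not a physical
    # node, so this code is unchanged by a failover.
    import socket

    VIRTUAL_SERVER = ("192.168.1.50", 1433)  # hypothetical virtual IP and service port

    def connect_to_cluster() -> socket.socket:
        sock = socket.create_connection(VIRTUAL_SERVER, timeout=10)
        return sock  # caller uses the socket exactly as with a stand-alone server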

Private or Public Interconnect

There are two types of interconnect paths:
• A private interconnect (also known as a dedicated interconnect) is used solely for intracluster (node-to-node) communication. Communication to and from network clients does not occur over this type of interconnect.
• A public interconnect not only takes care of communication between the cluster nodes, it also shares the data path with communication between the cluster and its network clients.
For more information about Compaq-recommended interconnect strategies, refer to the White Paper, “Increasing Availability of Cluster Communications in a Windows NT Cluster,” available from the Compaq High Availability website (http://www.compaq.com/highavailability).

Interconnect Adapters

Ethernet adapters or Compaq ServerNet adapters can be used for the interconnect between the servers in a Compaq ProLiant Cluster. Either 10-Mb/sec or 100-Mb/sec Ethernet may be used. ServerNet adapters have built-in redundancy and provide a high-speed interconnect with 100-MB/sec aggregate throughput.
Ethernet adapters can be connected together using an Ethernet crossover cable or a private Ethernet hub. Both of these options provide a dedicated interconnect.
Implementing a direct Ethernet or ServerNet connection minimizes the potential single points of failure.

Redundant Interconnects

To reduce potential disruptions of intracluster communication, use a redundant path over which communication can continue if the primary path is disrupted.
Compaq recommends configuring the client LAN as a backup path for intracluster communication. This provides a secondary path for the cluster heartbeat in case the dedicated primary path for intracluster communications fails. This is configured when installing the cluster software, or it can be added later using the MSCS Cluster Administrator.
It is also important to provide a redundant path to the client LAN. This can be done by using a second NIC as a hot standby for the primary client LAN NIC.
There are two ways to achieve this, and the method you choose is dependent on your hardware. One way is through use of the Redundant NIC Utility available on all Compaq 10/100 Fast Ethernet products. The other option is through the use of the Network Fault Tolerance feature designed to operate with the Compaq 10/100 Intel silicon-based NICs. These features allow two NICs to be configured so that one is a hot backup for the other.
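The policy behind both features can be summarized in a few lines. The sketch below is not the actual interface of the Redundant NIC Utility or the Network Fault Tolerance feature; it is only an illustration of the hot-standby rule they implement.

    # Conceptual sketch of the hot-standby NIC policy: traffic uses the
    # primary adapter until it fails, then moves to the standby adapter.
    # Not Compaq's utility code; adapter status flags are hypothetical inputs.
    def choose_adapter(primary_ok: bool, standby_ok: bool) -> str:
        if primary_ok:
            return "primary NIC"
        if standby_ok:
            return "standby NIC"   # hot backup takes over the client LAN path
        raise RuntimeError("client LAN unreachable on both adapters")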
For detailed information about interconnect redundancy, refer to the Compaq White Paper, “Increasing Availability of Cluster Communications in a Windows NT Cluster,” available from the Compaq High Availability website (http://www.compaq.com/highavailability).

Microsoft Software

Microsoft Windows NT Server 4.0/Enterprise Edition (Windows NTS/E) is the operating system for the Compaq ProLiant Clusters HA/F100 and HA/F200. Microsoft Cluster Server (MSCS) is part of Windows NTS/E. As the core component of Windows NT clustering, MSCS provides the underlying technology to:
• Send and receive heartbeat signals between the cluster nodes.
• Monitor the state of each cluster node.
• Initiate failover and failback events.
NOTE: MSCS will only run with Windows NTS/E. Previous versions of Windows NT are not supported.
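Schematically, the failover decision rests on the heartbeat described above: when a node misses enough consecutive heartbeats, MSCS treats it as failed and moves its cluster groups to the surviving node. The threshold in this sketch is made up for illustration; the actual tunable MSCS thresholds and periods are discussed in Chapter 2.

    # Hedged sketch of the failover policy, not MSCS internals.
    # MISSED_LIMIT is a hypothetical consecutive-miss threshold.
    MISSED_LIMIT = 3

    def check_peer(missed_heartbeats: int) -> str:
        if missed_heartbeats >= MISSED_LIMIT:
            return "initiate failover of the peer's cluster groups"
        return "peer healthy; continue monitoring"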
Microsoft Cluster Administrator, another component of Windows NTS/E, allows you to do the following:
• Define and modify cluster groups
• Manually control the cluster
• View the current state of the cluster
NOTE: Microsoft Windows NTS/E must be purchased separately from your Compaq ProLiant Cluster, through your Microsoft reseller.

Compaq Software

Compaq offers an extensive set of features and optional tools to support the configuration and management of your Compaq ProLiant Cluster:
• Compaq SmartStart and Support Software CD
• Compaq Redundancy Manager (Fibre Channel)
• Compaq Insight Manager
• Compaq Insight Manager XE
• Compaq Intelligent Cluster Administrator
• Compaq Cluster Verification Utility

Compaq SmartStart and Support Software CD

Compaq SmartStart is located on the SmartStart and Support Software CD shipped with ProLiant servers. SmartStart is the recommended way to configure the Compaq ProLiant Cluster HA/F100 or HA/F200. SmartStart uses a step-by-step process to configure the cluster and load the system software. For information concerning SmartStart, refer to the Compaq Server Setup and Management pack.
For information about using SmartStart to install the Compaq ProLiant Clusters HA/F100 and HA/F200, see Chapters 3 and 4 of this guide.
Compaq Array Configuration Utility
The Compaq Array Configuration Utility, found on the Compaq SmartStart and Support Software CD, is used to configure the array controller, add disk drives to an existing configuration, and expand capacity.
Compaq System Configuration Utility
The SmartStart and Support Software CD also contains the Compaq System Configuration Utility. This utility is the primary means to configure hardware devices in your server, such as I/O addresses, boot order of disk controllers, and so on.
For information concerning the Compaq System Configuration Utility, refer to the Compaq Serve r Setup and Management pack.
Compaq Support Software Diskette (NT SSD)
The Compaq Support Software Diskette for Windows NT (NT SSD) contains device drivers and utilities that enable you to take advantage of specific capabilities offered on Compaq products. These drivers are provided for use with Compaq hardware only.
The NT SSD is included in the Compaq Server Setup and Management pack.
Options ROMPaq Utility
The SmartStart and Support Software CD also contains the Options ROMPaq utility. Options ROMPaq updates the firmware on the Compaq StorageWorks RA4000 Controllers and the hard drives.
Fibre Channel Fault Isolation Utility (FFIU)
The SmartStart and Support Software CD also contains the Fibre Channel Fault Isolation Utility (FFIU). The FFIU verifies the integrity of a new or existing FC-AL installation. This utility provides fault detection and help in locating a failing device on the FC-AL.

Compaq Redundancy Manager (Fibre Channel)

Compaq Redundancy Manager, a Compaq-written software component that works in conjunction with the Windows NT file system (NTFS), increases the availability of both single-server and clustered systems that use the Compaq StorageWorks RAID Array 4000 Storage System and Compaq ProLiant servers. Redundancy Manager can detect failures in the host bus adapter, array controller, or other Fibre Channel Arbitrated Loop components. When such a failure occurs, I/O processing is rerouted through a redundant path, allowing applications to continue processing. This rerouting is transparent to the Windows NT file system. Therefore, in an HA/F200 configuration, it is not necessary for MSCS to fail resources over to the other node. Redundancy Manager, in combination with redundant hardware components, is the basis for the enhanced high availability features of the HA/F200.
Compaq Redundancy Manager (Fibre Channel) CD is included in the Compaq ProLiant Cluster HA/F200 kit.
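Conceptually, the rerouting works like the sketch below: an I/O request that fails on the active Fibre Channel path is retried on the redundant path, so neither NTFS nor the application sees the failure. The path objects and their write() method are hypothetical stand-ins; Redundancy Manager performs this inside the Windows NT driver stack, not in application code.

    # Conceptual sketch only, not Redundancy Manager's actual interface.
    # 'paths' is a hypothetical list such as [primary_loop, standby_loop].
    def redundant_write(block: bytes, paths: list) -> None:
        for path in paths:
            try:
                path.write(block)          # hypothetical path object with a write() method
                return                     # success: the caller never sees the path failure
            except IOError:
                continue                   # path failed; fall through to the redundant loop
        raise IOError("all Fibre Channel paths failed")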

Compaq Cluster Verification Utility

The Compaq Cluster Verification Utility (CCVU) is a software utility that can be used to validate several key aspects of the Compaq ProLiant Clusters HA/F100 and HA/F200 and their components.
The stand-alone utility can be run from either of the cluster nodes or remotely from a network client attached to the cluster. When CCVU is run remotely, it can validate any number of Windows NT clusters to which the client is attached.
The CCVU tests your cluster configuration in the following categories:
• A node test verifies that the clustered servers are supported in HA/F100 and HA/F200 cluster configurations.
• Networking tests verify that your setup meets the minimum cluster requirements for network cards, connectivity, and TCP/IP configuration.
• Storage tests verify the presence and minimum configuration requirements of supported host bus adapters, array controllers, and the external storage subsystem.
• System software tests verify that Microsoft Windows NT Server 4.0/Enterprise Edition has been installed.
The Compaq Cluster Verification Utility CD is included in the HA/F100 and HA/F200 cluster kits. For detailed information about the CCVU, refer to the online documentation (CCVU.HLP) included on the CD.

Compaq Insight Manager

Compaq Insight Manager, loaded from the Compaq Management CD, is an easy-to-use, console-based software utility for collecting server and cluster information. Compaq Insight Manager performs the following functions:
• Monitors fault conditions and system status
• Monitors shared storage and interconnect adapters
• Forwards server alert fault conditions
• Remotely controls servers

The Integrated Management Log collects and feeds data to Compaq Insight Manager. This log is used with the Insight Management Desktop (IMD), Remote Insight (optional controller), and SmartStart.
In Compaq servers, each hardware subsystem, such as disk storage, system memory, and system processor, has a robust set of management capabilities. Compaq Full Spectrum Fault Management notifies of impending fault conditions and keeps the server up and running in the unlikely event of a hardware failure.
For information concerning Compaq Insight Manager, refer to the Compaq Server Setup and Management pack.

Compaq Insight Manager XE

Compaq Insight Manager XE is a Web-based management system. It can be used in conjunction with Compaq Insight Manager agents as well as its own Web-enabled agents. This browser-based utility provides increased flexibility and efficiency for the administrator. It extends the functionality of Compaq Insight Manager and works in conjunction with the Cluster Monitor subsystem, providing a common data repository and control point for enterprise servers and clusters, desktops, and other devices using either SNMP- or DMI-based messaging.
Compaq Insight Manager XE is an optional CD available upon request from the Compaq System Management website (http://www.compaq.com/sysmanage).

Cluster Monitor

Cluster Monitor is a Web-based monitoring subsystem of Compaq Insight Manager XE. With Cluster Monitor, you can view all clusters from a single browser and configure monitor points and specific operational performance thresholds that will alert you when these thresholds have been met or exceeded on your application systems. Cluster Monitor relies heavily on the Compaq Insight Manager agents for basic information about system health. It also has custom agents that are designed specifically for monitoring cluster health. Cluster Monitor provides access to the Compaq Insight Manager alarm, device, and configuration information.
Cluster Monitor allows the administrator to view some or all of the clusters, depending on administrative controls that are specified when clusters are discovered by Compaq Insight Manager XE.

Compaq Intelligent Cluster Administrator

Compaq Intelligent Cluster Administrator extends Compaq Insight Manager and Cluster Monitor by enabling administrators to configure and manage ProLiant clusters from a Web browser. With Compaq Intelligent Cluster Administrator, you can copy, modify, and dynamically install a cluster configuration on the same physical cluster or on any physical cluster anywhere in the system, through the Web.
Compaq Intelligent Cluster Administrator checks for any cluster-destabilizing conditions, such as disk thresholds or application slowdowns, and reallocates cluster resources to meet processing demands. This software also performs dynamic allocation of cluster resources that may be failing without causing the cluster to fail over.
Compaq Intelligent Cluster Administrator also provides initialized cluster configurations that allow rapid cluster generation, as well as cluster configuration builder wizards for extending the Compaq initialized configurations.
Compaq Intelligent Cluster Administrator is included with the HA/F200 cluster kit and can be purchased as a stand-alone component for the HA/F100 cluster.

Resources for Application Installation

The client/server software applications are among the key components of any cluster. Compaq is working with its key software partners to ensure that cluster-aware applications are available and that the applications work seamlessly on Compaq ProLiant clusters.
Compaq provides a number of Integration TechNotes and White Papers to assist you with installing these applications in a Compaq ProLiant Cluster environment.
Visit the Compaq High Availability website (http://www.compaq.com/highavailability) to download current versions of these TechNotes and other technical documents.
IMPORTANT: Your software applications may need to be updated to take full advantage of clustering. Contact your software vendors to check whether their software supports MSCS and to ask whether any patches or updates are available for MSCS operation.
Chapter 2

Designing the Compaq ProLiant Clusters HA/F100 and HA/F200
Before connecting any cables or powering on any machines, it is important to understand how all of the various cluster components and concepts fit together to meet your information system needs. The major topics discussed in this chapter are:
• Planning Considerations
• Capacity Planning
• Network Considerations
• Failover/Failback Planning
In addition to reading this chapter, read the planning chapter in Microsoft Cluster Server Administrator’s Guide.

Planning Considerations

To correctly assess capacity, network, and failover needs in your business environment, it is important to have a good understanding of clustering and the things that affect the availability of clusters. The items detailed in this section will help you design your Compaq ProLiant Cluster so that it addresses your specific availability needs.
Cluster configuration design is addressed in “Cluster Configurations.”
A step-by-step approach to creating cluster groups is discussed in “Cluster Groups.”
Recommendations regarding how to reduce or eliminate single points of failure are contained in the “Reducing Single Points of Failure in the HA/F100 Configuration” section of this chapter. By definition, a highly available system is not continuously available and therefore may have single points of failure.
NOTE: The discussion in this chapter relating to single points of failure applies only to the Compaq ProLiant Cluster HA/F100. The HA/F200 includes dual redundant loops, which eliminate certain single points of failure contained in the HA/F100.

Cluster Configurations

Although there are many ways to set up clusters, most configurations fall into two categories: active/active and active/standby.
Active/Active Configuration
The core definition of an active/active configuration is that each node is actively processing data when the cluster is in a normal operating state. Both the first and second nodes are “active.” Because both nodes are processing client requests, an active/active design maximizes the use of all hardware in both nodes.
An active/active configuration has two primary designs:
The first design uses Microsoft Cluster Server (MSCS) failover capabilities on both nodes, enabling Node1 to fail over clustered applications to Node2 and enabling Node2 to fail over clustered applications to Node1. This design optimizes availability since both nodes can fail over applications to each other.
The second design is a one-way failover. For example, MSCS may be set up to allow Node1 to fail over clustered applications to Node2, but not to allow Node2 to fail over clustered applications to Node1. While this design increases availability, it does not maximize availability since failover is configured on only one node.
When designing cluster nodes to fail over to each other, ensure that each server has enough capacity, memory, and processor power to run all applications (all applications running on the first node plus all clustered applications running on the other node).
When designing your cluster so that only one node (Node1) fails over to the other (Node2), ensure that Node2 has enough capacity, memory, and CPU power to execute not only its own applications, but to run the clustered applications that can fail over from Node1.
Another consideration when determining your servers’ hardware is understanding your clustered applications’ required level of performance when the cluster is in a degraded state (when one or more clustered applications is running on a secondary node). If Node2 is running near peak performance when the cluster is in a normal operating state, and if several clustered applications are failed over from Node1, Node2 will likely execute the clustered applications more slowly than when they were executed on Node1. Some level of performance degradation may be acceptable. Determining how much degradation is acceptable depends on the company.
Example 1: File & Print/File & Print
An example business scenario involves two file and print servers. The Human Resources (HR) department uses one server, and the Marketing department uses the other. Both servers actively run their own file shares and print spoolers while the cluster is in its normal state (an active/active design).
If the HR server encounters a failure, it fails over its file and print services to the Marketing server. HR clients experience a slight disruption of service while the file shares and print spooler fail over to their secondary server. Any jobs that were in the print spooler before the failure event will now print from the Marketing server.
Figure 2-1. Active/active example 1
When failover is complete, all of the HR clients have full access to their file shares and print spooler. Marketing clients do not experience any disruption of service. All clients may experience slowed performance while the cluster runs in a degraded state.
Example 2: Database/Database
Another scenario has two distinct database applications running on two separate cluster nodes. One database application maintains Human Resources records, and its primary node is set to the HR database node. The other database application is used for market research, and its primary node is set to the Marketing database node.
While in a normal state, both cluster nodes run at expected performance levels. If the Marketing server encounters a failure, the market research application and associated data resources fail over to their secondary node, the HR database server. The Marketing clients experience a slight disruption of service while the database resources are failed over, the database transaction log is rolled back, and the information in the database is validated. When the database validation is complete, the market research application is brought online on the HR database node and the Marketing clients can reconnect to it. While the Marketing database validation is occurring, the HR clients do not experience any disruption of service.
Example 3: File & Print/Database
In this example, a business uses a single server to run its order entry department. The same department has a file and print server. While order entry is business-critical and requires maximum availability, the file and print server can be unavailable for several hours without impacting revenue. In this scenario, the order entry database is configured to use the file and print server as its secondary node. However, the file and print server will not be configured to fail over applications to the order entry server.
Figure 2-2. Active/active example 3
If the node running the order entry database encounters a failure, the database fails over to its secondary node. The order entry clients experience a slight disruption of service while the database resources are failed over, the database transaction log is rolled back, and the information in the database is validated. When the database validation is complete, the order entry application is brought online on the file and print server and the clients can reconnect to it. While the database validation is occurring, file and print activities continue without disruption.
If the file and print server encounters a failure, those services are not failed over to the order entry server. File and print services are offline until the problem is resolved and the node is brought back online.
Active/Standby Configuration
The primary difference between an active/active configuration and an active/standby configuration is the number of servers actively processing data. In active/standby, only one server is processing data (active) while the other (the standby server) is in an idle state.
The standby server must be logged in to the Windows NT domain, and MSCS must be up and running. However, no applications are running. The standby server’s only purpose is to take over failed clustered applications from its partner. The standby server is not a preferred node for any clustered applications and, therefore, does not fail over any applications to its partner server.
Because the standby server does not process data until it accepts failed over applications, the limited use of the server may not justify the cost of the server. However, the cost of standby servers is justified when performance and availability are paramount to a business’ operations.
The standby server should be designed to run all of the clustered applications with little or no performance degradation. Since the standby server is not running any applications while the cluster is in a normal operating state, a failed-over clustered application will likely execute with the same speed and response time as if it were executing on the primary server.
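If you prefer to administer this relationship from the command line, it can also be expressed with the cluster.exe utility installed with MSCS. The following is a minimal sketch, not a required procedure; the group name "Group1" and the node names are hypothetical, and the /setowners option syntax should be verified against the cluster.exe version on your system:

    REM Make Node1 the only preferred owner of Group1 so that the
    REM standby node (Node2) hosts the group only after a failover.
    cluster group "Group1" /setowners:Node1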
Example – Database/Standby Server
An example business scenario describes a mail order business whose competitive edge is quick product delivery. If the product is not delivered on time, the order is void and the sale is terminated. The business uses a single server to perform queries and calculations on order entry information, translating sales orders into packaging and distribution instructions for the warehouse. With an estimated downtime cost of $1,000/hour, the company determines that the cost of a standby server is justified.
This mission-critical (active) server is clustered with a standby server. If the active server encounters a failure, this critical application and all its resources fail over to the standby server, which validates the database and brings it online. The standby server now becomes active and the application executes at an acceptable level of performance.
Figure 2-3. Active/standby example

Cluster Groups

Understanding the relationship between your company’s business functions and cluster groups is essential to getting the most from your cluster. Business functions rely on computer systems to support activities such as transaction processing, information distribution, and information retrieval. Each computer activity relies on applications or services, and each application depends on software and hardware subsystems. For example, most applications need a storage subsystem to hold their data files.
This section is designed to help you understand which subsystems, or resources, must be available for either cluster node to run a clustered application properly.
Creating a Cluster Group
The easiest approach to creating a cluster group is to start by designing a resource dependency tree. A resource dependency tree has as its top level the business function for which cluster groups are created. Each cluster group has branches that indicate the resources upon which the group is dependent.
Resource Dependency Tree
The following steps describe the process of creating a resource dependency tree. Each step is illustrated by adding information to a sample resource dependency tree. The sample is for a hypothetical Web Sales Order business function, which consists of two cluster groups: a database server (a Windows NT application) and a Web server (a Windows NT service).
NOTE: For this example, it is assumed that each cluster group can communicate with the other even if they are not executing on the same node, for example, by means of an IP address. With this assumption, one cluster group can fail over to the other node, while the remaining cluster group continues to execute on its primary node.
1. List each business function that requires a clustered application or service.
Figure 2-4. Resource dependency tree: step 1
2. List each application or service required for each business function.
Figure 2-5. Resource dependency tree: step 2
3. List the immediate dependencies for each application (or service).
Figure 2-6. Resource dependency tree: step 3
4. Transfer the resource dependency tree into a Cluster Group Definition worksheet.
Figure 2-7 illustrates the worksheet for the Web Sales Order business function. A blank copy of the worksheet is provided in Appendix A.
Cluster Group Definition Worksheet

Cluster Function: Web Sales Order
Group #1: Web Server Service
Group #2: Database Server Application

Resource Definitions

Group #1 (Web Server Service)
Resource #1: Network Name
    Sub Resource 1: IP Address
Resource #2: Physical Disk Resource - contains Web pages and Web scripts
Resource #3: Web Server Service
Resource #4: N/A

Group #2 (Database Server Application)
Resource #1: Network Name
    Sub Resource 1: IP Address
Resource #2: Physical Disk Resource - contains database log files
Resource #3: Physical Disk Resource - contains database data files
Resource #4: Database Application

Figure 2-7. Cluster Group Definition Worksheet (example)
Use the resource dependency tree concept to review your company’s availability needs. It is a useful exercise, directing you to record the exact design and definition of each cluster group.
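When you are ready to implement the design, the groups, resources, and dependencies recorded in the worksheet can be created in Cluster Administrator or with the cluster.exe command-line utility installed with MSCS. The following is a minimal sketch using hypothetical names based on the Web Sales Order example; verify the exact resource type names and option syntax on your system:

    REM Create the Web server group, its resources, and the dependency
    REM recorded in the worksheet (the network name depends on the IP address).
    cluster group "Web Server Service" /create
    cluster resource "Web IP Address" /create /group:"Web Server Service" /type:"IP Address"
    cluster resource "Web Network Name" /create /group:"Web Server Service" /type:"Network Name"
    cluster resource "Web Network Name" /adddep:"Web IP Address"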

Reducing Single Points of Failure in the HA/F100 Configuration

The final planning consideration is reducing single points of failure. Depending on your needs, you may leave all vulnerable areas alone, accepting the risk associated with a potential failure. Or, if the risk of failure is unacceptable for a given area, you may elect to use a redundant component to minimize, or remove, the single point of failure.
NOTE: Although not specifically covered in this section, redundant server components (such as power supplies and processor modules) should be used wherever possible. These features will vary based upon your specific server model.
The single points of failure described in this section are:
Cluster interconnect
Fibre Channel data paths
Non-shared disk drives
Shared disk drives
NOTE: The Compaq ProLiant Cluster HA/F200 addresses the single points of failure listed above with its dual redundant loop configuration. For more information, refer to the “Enhanced High Availability Features of the HA/F200” section of this chapter.
Cluster Interconnect
The interconnect is the primary means for the cluster nodes to communicate. Intracluster communication is crucial to the health of the cluster. If communication between the cluster nodes ceases, MSCS must determine the state of the cluster and take action, in most cases bringing the cluster groups offline on one of the nodes and failing over all cluster groups to the other node.
Following are two strategies for increasing the availability of intracluster communication. Combined, these strategies provide even more redundancy.
MSCS Configuration
MSCS allows you to configure a primary and backup path for intracluster communication, which will reduce the possibility of an intracluster communication disruption. Any network interface card (NIC) in the nodes can be configured to serve as a backup path for node-to-node communication. When the primary path is disrupted, the transfer of communication responsibilities goes undetected by applications running on the cluster. Whether a dedicated or public interconnect has been set up, a separate NIC should be configured to act as a redundant interconnect. This is an easy and inexpensive way to add redundancy to intracluster communication.
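Network roles are normally assigned in Cluster Administrator. If the cluster.exe version on your system supports the NETWORK command, the same setting can be scripted; the sketch below is illustrative only, where "Public LAN" is a hypothetical network name and the Role property is assumed to use the MSCS values 1 (internal cluster communication only), 2 (client access only), and 3 (both):

    REM Allow the client LAN to carry intracluster (heartbeat) traffic
    REM as a backup to the dedicated interconnect.
    cluster network "Public LAN" /prop Role=3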
Redundant Interconnect Card
Another strategy to increase availability is to use a redundant interconnect card. This may be done for either the dedicated intracluster communication path, or for the client LAN. If you are using a dedicated, direct-connection interconnect configuration, you can install a second dedicated, direct-connection interconnect.
NOTE: If you are using the ServerNet option as the interconnect, the card itself has a built-in level of redundancy. Each ServerNet PCI adapter has two data ports, thereby allowing two separate cables to be run to and from each cluster node. If the ServerNet adapter determines that data is being sent from one adapter but not received by the other, it will automatically route the information through its other port.
There are two implementations that provide identical redundant NIC capability. The implementation you choose will depend on your hardware. The Compaq Redundant NIC Utility (originally called Advanced Network Fault Detection and Correction Feature) is supported on all Compaq TI-based Ethernet and Fast Ethernet NICs, such as NetFlex-3 and Netelligent 10/100 TX PCI Ethernet NICs. The Network Fault Tolerance feature is designed to operate with the Compaq Intel-based 10/100 NICs. Combining these utilities with the appropriate NICs will enable a seamless, undetectable failover of the primary interconnect to the redundant interconnect.
NOTE: These two methods of NIC redundancy cannot be combined in a single redundant NIC pair: TI-based NICs may not be paired with Intel-based NICs to create a redundant pair. For more information, refer to the Compaq White Paper, “High Availability Options Supported by Compaq Network Interface Controllers,” available at the Compaq High Availability website (http://www.compaq.com/highavailability).
Because the purpose of the redundant interconnect is to increase the availability of the cluster, it is important to monitor the status of your redundant NICs. Compaq Insight Manager and Compaq Insight Manager XE simplify management of the interconnect by monitoring the state of the interconnect card. You can view status information and alert conditions for all cards in each node. If a failover event occurs due to a disruption in the heartbeat, you can use the Compaq Insight Manager tools to determine where the disruption originated.
Cluster-to-LAN Communication
Each cluster node must have at least one NIC that connects to the LAN. Through this connection, network clients can access applications and data on the cluster. If the LAN NIC fails in one of the nodes, any clients connected directly to the cluster node by means of the computer name, cluster node IP address, or MAC address of the NIC no longer have access to their applications. Clients connected to a virtual server on the cluster (via the IP address or network name of a cluster group) reconnect to the cluster through the surviving cluster node.
Failure of a LAN NIC in a cluster node may have serious repercussions. If your cluster is configured with a dedicated interconnect and a single LAN NIC, the failure of a LAN NIC will prevent network clients from accessing cluster groups running on that node. If the interconnect path is not disrupted, it is possible that a failover will not occur. The applications will continue to run on the node with the failed NIC; however, clients will be unable to access them.
Install redundant NICs and use the Compaq Redundant NIC Utility to reduce the possibility of LAN NIC failure. When your cluster nodes are configured with the utility, the redundant NIC automatically takes over operation if the primary NIC fails. Clients maintain their connection with their primary node and, without disruption, continue to have access to their applications.
Compaq offers a dual-port NIC that can utilize the Compaq Redundant NIC Utility. This also reduces the possibility of the failure scenario described above. However, if the entire NIC or the node slot into which the NIC is placed fails, the same failure scenario will occur.
Compaq Insight Manager and Compaq Insight Manager XE monitor the health of any network cards used for the LAN. If any of the cards experience a fault, the Compaq Insight Manager tools mark the card as “Offline” and change its condition to the appropriate status.
Recommended Cluster Communication Strategy
The past two sections discussed the redundancy of intracluster and cluster-to-LAN communication. However, to obtain the most benefit while minimizing cost and complexity, view cluster communications as a single entity.
To create redundancy for both intracluster and cluster-to-LAN communication, first, employ physical hardware redundancy for the LAN NICs. Second, configure MSCS to use both the primary and redundant LAN NIC as backup for intracluster communication.
With this strategy, your cluster can continue normal operations (without a failover event) when each of the following points of failure are encountered:
Failure of the interconnect card
Failure of the port on the interconnect card
Failure of the interconnect cable
Failure of the port on the LAN NIC
Failure of the LAN NIC (if redundant NICs, as opposed to dual-ported NICs, are used)
Failure of the Ethernet cable running from a cluster node to the Ethernet hub (which connects to the LAN)
The following examples describe how to physically set up your cluster nodes to employ the Compaq-recommended strategy. For more information about the strategy, refer to the Compaq White Paper, “Increasing Availability of Cluster Communications in a Windows NT Cluster.”
Example 1
A Compaq dual-port NIC and a single-port NIC are used in this example. The first port of the dual-port NIC is a dedicated interconnect, and the second port is the backup path for the cluster-to-LAN network. The single-port NIC is configured as the primary network path for cluster-to-LAN communication.
The Compaq Advanced Network Control Utility is used to configure the second port on the dual-port NIC as the backup port of a redundant pair. The single port on the other NIC is configured to be the primary port for cluster-to-LAN communication.
The interconnect retains its fully redundant status when MSCS is configured to use the other network ports as interconnect backup. Failure of the primary interconnect path results in intracluster communications occurring over the single-port NIC, since it was configured in MSCS as the backup for intracluster communication. If the entire dual-port NIC fails, the cluster nodes still have a working communication path over the single-port NIC.
With this configuration, even a failure of the dual-port NIC results in the transfer of the cluster-to-LAN communication to the single-port NIC. Other than a failure of the network hub, the failure of any cluster network component will be resolved by the redundancy of this configuration.
Figure 2-8. Use of dual-port NICs to increase redundancy
Example 2
The second example configuration consists of three single-port NICs. One NIC is dedicated to intracluster communication. The other two NICs are used for cluster-to-LAN communication. The Compaq Advanced Network Control Utility is used to configure two of the NICs—one as the primary and one as the standby of a redundant pair.
The interconnect is fully redundant when MSCS is configured to use the other network cards as backups for the interconnect. Failure of the primary interconnect path results in intracluster communications occurring over the primary NIC of the redundant pair. If the entire interconnect card fails, the cluster nodes will still have a working communication path.
The cluster-to-LAN communication is fully redundant up to the network hub. With this configuration, even a failure of the primary NIC results only in the transfer of the network path to the standby NIC. Other than a failure of the network hub, any failure of any cluster network component will be resolved by the redundancy of this configuration.
The primary disadvantage of this configuration as compared to Example 1 is that an additional card slot is used by the third NIC.
Figure 2-9. Use of three NICs to increase redundancy
HA/F100 Fibre Channel Data Paths
The Compaq StorageWorks RAID Array 4000 storage system (formerly Compaq Fibre Channel storage system) is the mechanism with which ProLiant Clusters implement shared storage. Generally, the storage system consists of Compaq StorageWorks Host Adapters (host bus adapters) in each server, a Compaq StorageWorks Fibre Channel Storage Hub, a Compaq StorageWorks RA4000 Controller (array controller), and a Compaq StorageWorks RAID Array 4000 into which the SCSI disks are placed.
The RA4000 storage system has two distinct data paths, separated by the storage hub:
The first data path runs from the host bus adapters in the servers to the storage hub.
The second data path runs from the storage hub to the RA4000.
The effects of a failure will vary depending on whether the failure occurred on the first or second data path.
Failure of the Host Bus Adapter-to-Storage Hub Data Path
Figure 2-10. Host bus adapter-to-storage hub data path
If the host bus adapter-to-storage hub path fails, all applications fail over. For instance, if one server can no longer access the storage hub (and, by extension, the shared storage), all of the cluster groups that depend on shared storage will fail over to the second server. The cost of failure is relatively minor: the downtime experienced by users while the failover event occurs.
Note that the Compaq Insight Manager tools monitor the health of the RA4000 storage system. If any part of the Fibre Channel data path disrupts a server’s access to the RA4000, the array controller status changes to “Failed” and the condition is red. The red condition bubbles up to higher-level Compaq Insight Manager screens and eventually to the device list.
NOTE: The Compaq Insight Manager tools display a failure of physical hardware through the Mass Storage button on the View screen, marking the hardware “Failed.” A logical drive in the cluster is reported on the Cluster Shared Resources screen as a logical disk resource. Compaq Insight Manager and Compaq Insight Manager XE do not associate the logical drive with the physical hardware.
Failure of the Hub-to-RA4000 Data Path
Figure 2-11. Hub-to-RA4000 data path
The second data path, from the storage hub to the RA4000, has more severe implications when it fails. If this data path fails, all clustered applications become inoperable. Even attempting to fail the applications to another cluster node will not gain access to the RA4000.
NOTE: This failure scenario can be avoided by deploying the redundant Fibre Channel loop configuration of the Compaq ProLiant Cluster HA/F200.
Without access to shared storage, clustered applications cannot reach their data or log files. The data, however, is unharmed and remains safely stored on the physical disks inside the RA4000. If a database application was running when this failure occurred, some in-progress transactions will be lost. The database will need to be rolled back and the in-progress transactions re-entered.
Like the server-to-storage hub data path, the Compaq Insight Manager tools detect this fault, change the RA4000 status to “Failed,” and change its condition to red. The red condition bubbles up through Compaq Insight Manager screens, eventually to the device list.
Nonshared Disk Drives
Nonshared disk drives, or local storage, operate the same way in a cluster as they do in a single-server environment. These drives can be in the server drive bays or in an external storage cabinet. As long as they are not accessible by both servers, they are considered nonshared.
Treat nonshared drives in a clustered environment as you would in a nonclustered environment. Most likely, some form of RAID is used to protect the drives and restore a failed drive. Since the operating system is stored on these drives, use either hardware or software RAID to protect the information. Hardware RAID is available with Compaq’s SMART-2 Controller or by using a nonshared storage system.
Shared Disk Drives
Shared disk drives are contained in the RA4000, which is accessible by each cluster node. Employ hardware RAID 1 or 5 on all of your shared disk drives. This is configured using the Compaq Array Configuration Utility.
If RAID 1 or 5 is not used, failure of a shared disk drive will disrupt service to all clustered applications and services that depend on the drive. Failover of a cluster node will not resolve this failure, since neither server can read from a failed drive.
NOTE: Windows NT software RAID is not available for shared drives when using MSCS. Hardware RAID is the only available RAID option for shared storage.
As with other system failures, Compaq Insight Manager monitors the health of disk drives and will mark a failed drive as “Failed.”

Enhanced High Availability Features of the HA/F200

A “single point of failure” refers to any component in the system that, should it fail, prevents the system from functioning. Single points of failure in hardware can be minimized, and in some cases eliminated, by using redundant components. The most effective way of accomplishing this is by clustering.
The Compaq ProLiant Cluster HA/F100 reduces the single points of failure that exist in a single-server environment by allowing two servers to share storage and take over for each other in the event that one server fails. The Compaq ProLiant Cluster HA/F200 goes one step further by implementing a dual redundant Fibre Channel Arbitrated Loop configuration.
The Compaq ProLiant Cluster HA/F200 further enhances high availability through the use of additional, redundant, components in the server-to-storage connection and in the shared storage system itself. In the event of a failure, processing is switched to an alternate path without affecting applications and end users. In fact, this path switch is transparent even to the Windows NT file system (NTFS). The combination of multiple paths and redundant hardware components provided by the HA/F200 offers significantly enhanced high availability over non-redundant configurations.
A single component failure in the HA/F200 will result in an automatic failover to an alternate component, allowing end users to continue accessing their applications without interruption. Some typical failures and associated responses in an HA/F200 configuration are:
A server failure will cause MSCS to fail application processing over to the second server.
A host bus adapter failure will cause I/O requests intended for the failed adapter to be re-routed through the remaining adapter.
A storage hub, or cable, failure will be treated like a host bus adapter failure, and a failover to the second host bus adapter, which is using a different storage hub and cables, will occur.
An array controller failure will cause the redundant array controller to take over for the failed controller.
In all of the above examples, end users will experience minimal interruptions while the failover occurs. In some cases, the interruptions may not even be noticeable.
The following illustration depicts the HA/F200 configuration components.
Figure 2-12. HA/F200 configuration
HA/F200 Fibre Channel Data Paths
The Compaq StorageWorks RAID Array 4000 storage system is the mechanism with which the HA/F200 cluster implements shared storage. The Compaq ProLiant Cluster HA/F200 minimum configuration consists of two host bus adapters in each server, two 7- or 12-port storage hubs, two array controllers per RA4000, and one or more RA4000s.
The RA4000 storage system has active data paths and standby data paths, separated by two storage hubs. Figure 2-13 and Figure 2-14 detail the active and standby paths of the minimum HA/F200 configuration.
Figure 2-13. Active host bus adapter-to-storage data paths
The active data paths run from the active host bus adapters in the servers to the active storage hub. If this path fails, the applications can seamlessly fail over to the standby host bus adapter-to-storage hub data paths.
Figure 2-14. Active hub-to-storage data path
The second active data path runs from the active hub to the RA4000. If this path fails, the applications can seamlessly fail over to the standby hub-to-RA4000 data path.
The dual redundant loop feature of the Compaq ProLiant Cluster HA/F200 increases the level of availability over clusters that have only one path to the shared storage. In addition, the second loop in the HA/F200 provides for improved performance through load balancing. Load balancing considerations are discussed in the “Load Balancing” section of this chapter.

Capacity Planning

Capacity planning determines how much computer hardware is needed to support the applications and data on your clustered servers. Unlike conventional, single-server capacity planning, clustered configurations must ensure that each node is capable of running any applications or services that may fail over from its partner node. To simplify the following discussion, the software running on each of the clustered nodes is divided into three generic categories:
Operating system
Nonclustered applications and services
Clustered applications and services
Figure 2-15. File locations in a Compaq ProLiant Cluster
For each server, determine the processor, memory, and disk storage requirements needed to support its operating system and nonclustered applications and services.
Determine the processor and memory requirements needed to support the clustered applications and services that will run on each node while the cluster is in a normal operating state.
If the program files of a clustered application and/or service will reside on local storage, remember to add that capacity to the amount of local storage needed on each node.
For all files that will reside on shared storage, see “Shared Storage Capacity” later in this chapter.

Server Capacity

The capacity needed in each server depends on whether you design your cluster as an active/active configuration or as an active/standby configuration. Capacity planning for each configuration is discussed in the following sections.
Active/Active Configuration
As described earlier in this chapter, an active/active configuration can be designed in two ways:
Applications and services may be configured to fail over from each node to its partner node, or
Applications and services may be configured to fail over from just one node to its partner node.
The following table details the capacity requirements that can be applied to either active/active design.
Table 2-1
Server Capacity* Requirements for Active/Active Configuration

Node1:
Operating system (Windows NT and MSCS)
Nonclustered applications and services
Server1 clustered applications and services
Server2 clustered applications and services (if Server2 is set up to fail applications and services to Server1)

Node2:
Operating system (Windows NT and MSCS)
Nonclustered applications and services
Server2 clustered applications and services
Server1 clustered applications and services (if Server1 is set up to fail applications and services to Server2)

* Processing power, memory, and nonshared storage
Active/Standby Configuration
In an active/standby configuration, only one node actively runs applications and services. The other node is in an idle, or standby, state. Assume Node1 is the active node and Node2 is the standby node.
Table 2-2
Server Capacity* Requirements for Active/Standby Configuration

Node1 (active):
Operating system (Windows NT and MSCS)
Nonclustered applications and services
Server1 clustered applications and services

Node2 (standby):
Operating system (Windows NT and MSCS)
Server1 clustered applications and services (if Server1 is set up to fail applications and services to Server2)

* Processing power, memory, and nonshared storage

Shared Storage Capacity

Each server is connected to shared storage (the Compaq StorageWorks RAID Array 4000 storage system), which mainly stores data files of clustered applications and services. Follow the guidelines below to determine how much capacity is needed for your shared storage.
NOTE: For some clustered applications, it may make sense to store the application program files on shared storage. If the application allows customization and the customized information is stored in program files, the program files should be placed on shared storage. When a failover event occurs, the secondary node will launch the application from shared storage. The application will execute with the same customizations that existed when executed on the primary node.
Two factors help to determine the required amount of shared storage disk space:
The amount of space required for all clustered applications and their dependencies.
The level of data protection (RAID) required for each type of data used by each clustered application. Two factors driving RAID requirements are:
The performance required for each drive volume
The recovery time required for each drive volume
IMPORTANT: Windows NT software RAID is not available for shared drives when using MSCS. Hardware RAID is the only available RAID option for shared storage.
For more information about hardware RAID, see the following:
Compaq StorageWorks Fibre Channel RAID Array 4000 User Guide
Configuring Compaq RAID Technology for Database Servers (TechNote 184206-001), available at the Compaq website (http://www.compaq.com).
In the “Cluster Groups” section of this chapter, you created a resource dependency tree, then transferred that information into a Cluster Group Definition Worksheet (Figure 2-7). Under the resource dependencies in the worksheet, you listed at least one physical disk resource. For each physical disk resource, determine the capacity and level of protection required for the data to be stored on it.
For example, the Web Sales Order Database group depends on a log file, data files, and program files. It might be important for the log file and program files to have a quick recovery time, while performance would be a secondary concern. Together, the files do not take up much capacity; therefore, mirroring (RAID 1) would be an efficient use of disk space and would fulfill the recovery and performance characteristics. The data files, however, would need excellent performance and excellent protection. The data files are expected to be large; therefore, a mirrored configuration would require an unacceptably expensive number of disk drives. To minimize the amount of physical capacity and still meet the performance and protection requirements, the data files would be configured to use Distributed Data Guarding (RAID 5).
Array Configuration
The Compaq Array Configuration Utility (ACU) is used to initially configure the array controller, reconfigure the array controller, add additional disk drives to an existing configuration, and expand capacity. The capacity expansion feature provides the ability to add storage capacity to a configured array without disturbing the existing data and to add a new physical drive to the array.
An array is created by grouping disk drives together to share a common RAID (Redundant Array of Inexpensive Disks) fault tolerance type. For example, in a single RA4000 storage system containing eight 18.2 GB drives, you could configure two of the drives in a RAID 1 mirrored array and the remaining six drives as a RAID 5 Distributed Data Guarding array.
Each array must be divided into at least one volume (up to a maximum of eight volumes per array). Each volume is presented to the operating system as an independent disk drive and can be independently controlled by the cluster software. Using the previous example, you could configure the two-drive RAID 1 array as a single volume (for example, drive F), and the six-drive RAID 5 array as two volumes (for example, drives G and H). Because the operating system views these as independent disks, it is possible for cluster Node1 to control drive G, while cluster Node2 controls drives F and H.
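To check the arithmetic behind an array layout such as this, recall how each fault tolerance type consumes raw capacity. The following worked example uses the drive sizes given above:

    RAID 1 (mirroring), 2 x 18.2-GB drives:
        Usable capacity = 18.2 GB (half of the 36.4-GB raw total)

    RAID 5 (Distributed Data Guarding), 6 x 18.2-GB drives:
        Usable capacity = (6 - 1) x 18.2 GB = 91 GB
        (the equivalent of one drive's capacity holds parity information)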
More information regarding cluster disk configuration can be found in the Compaq TechNote, Planning Considerations for Compaq ProLiant Clusters Using Microsoft Cluster Server, located on the Compaq website (http://www.compaq.com).
This capability provides a high level of flexibility in configuring your RA4000 storage system. However, minimize the number of volumes configured in each array to improve performance. To achieve optimal performance, each array should contain a single volume. In some cases (such as for the MSCS quorum drive), it may be desirable to add a second, smaller volume to an array.
Shared Storage Capacity Worksheet
The following Shared Storage Capacity worksheet will assist in determining your shared storage capacity requirements. The following example illustrates the required shared storage capacity for the entire Web Sales Order business function. A blank worksheet is provided in Appendix A.
Shared Storage Capacity Worksheet

Disk Resource 1
Description: Web files and Web scripts for Web Service Group
Required Application Capacity: 12 GB
Desired Level of Protection: RAID 5
RAID Configuration: 4 x 4.3 GB
Required Capacity With RAID: 17.2 GB
Total Usable Capacity: 12.9 GB

Disk Resource 2
Description: Log file(s) for Database
Required Application Capacity: 4.3 GB
Desired Level of Protection: RAID 1
RAID Configuration: 2 x 4.3 GB
Required Capacity With RAID: 8.6 GB
Total Usable Capacity: 4.3 GB

Disk Resource 3
Description: Data file(s) for Database
Required Application Capacity: 27 GB
Desired Level of Protection: RAID 5
RAID Configuration: 4 x 9.0 GB
Required Capacity With RAID: 36 GB
Total Usable Capacity: 27 GB

Disk Resource 4: N/A
Disk Resource 5: N/A
Disk Resource 6: N/A

Figure 2-16. Shared Storage Capacity Worksheet (example)

Load Balancing

Load balancing helps to attain enhanced performance from the cluster by balancing the system’s workload. With cluster configurations, applications and data can be shared by all components so that no one component is working at its maximum capability.
There are two means of load balancing. One way balances a system’s workload across the cluster. The other balances a server’s workload across multiple data paths. The dual redundant loop of the Compaq ProLiant Cluster HA/F200 and an added RA4000 storage system spread a system’s applications and data across the data paths through an active/active host bus adapter. This configuration can increase the functionality of the cluster.
IMPORTANT: Disk load balancing cannot be done when using a single RA4000 in a Compaq ProLiant Cluster HA/F200. Add another RA4000 to your HA/F200 configuration for host bus adapters in a single server to be active/active.
Figure 2-17 shows a Compaq ProLiant Cluster HA/F200 configuration with only one RA4000. Because there is only one RA4000, the host bus adapters are in active/standby mode, which means that they do not have load-balancing capability.
Figure 2-17. Compaq ProLiant Cluster HA/F200 with one RA4000
Figure 2-18 depicts a Compaq ProLiant Cluster HA/F200 with dual RA4000s. This configuration can accommodate load balancing because the host bus adapters are in active/active mode to different storage systems.
Figure 2-18. Compaq ProLiant Cluster HA/F200 with dual RA4000s
Networking Capacity
The final capacity planning section addresses networking. The cluster nodes must have enough network capacity to handle requests from the client machines and must gracefully handle failover/failback events.
Make sure both nodes can handle the maximum number of clients that can attach to the cluster. If Node1 encounters a failure and its applications and services fail over to Node2, then Node2 needs to handle access from its own network clients as well as those that normally connect to the failed node (Node1).
Note the effect of failover on network I/O bandwidth. When the cluster encounters a server failover event, only one node is responding to network I/O requests. Be sure the surviving node’s network speed and protocol will sufficiently handle the maximum number of network I/Os when the cluster is running in a degraded state.

Network Considerations

This section addresses clustering items that affect the corporate LAN. MSCS has specific requirements regarding which protocol can be used and how IP address and network name resolution occurs. Additionally, consider how network clients will interact within a clustering environment. Some client-side applications may need modification to receive the maximum availability benefits of operating a cluster.

Network Configuration

Network Protocols
TCP/IP and NBT (NetBIOS over TCP/IP) are the only transport protocols that are supported in an MSCS failover environment. Other protocols, such as NetBEUI, IPX/SPX (Novell), NB/IPX, or DLC (IBM) may be used, but they cannot take advantage of the failover features of MSCS.
Applications that use these other protocols will function identically to a single-server environment. Users can still use these protocols, but they will connect directly to the individual servers and not to the virtual servers on the cluster, just as in a single-server environment. If a failure occurs, any connections using these protocols will not switch over. Since these protocols cannot fail over to another server, avoid these protocols, if possible.
WINS and DNS
WINS (Windows Internet Name Service) and DNS (Domain Name Service) are supported in MSCS. Use WINS to register the network names and IP addresses of cluster resources. If WINS is not used, create an entry in the hosts or lmhosts file that lists each network name and IP address pair, as well as the cluster name and its IP address, since these function as virtual servers to the clients.
If clients are located on a different subnet than the cluster nodes, modify the DNS database to add a DNS address record for the cluster.
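For example, hosts file entries of the following form could be used; the addresses and names below are illustrative only and are not values prescribed by this guide:

    # Cluster name (functions as a virtual server to clients)
    172.20.1.10    CLUSTER1
    # Network name and IP address pair for a cluster group
    172.20.1.11    WEBSALES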
DHCP
Only use DHCP for the clients; it should not be used for the cluster node IP addresses or cluster resource IP addresses. DHCP cannot be used to assign IP addresses for virtual servers.
When configuring DHCP, exclude enough static IP addresses from the pool of dynamically leased addresses to account for the following:
Cluster node IP addresses
At least one static IP address for each virtual server

Migrating Network Clients

One of the first steps in assessing the impact of a clustered environment on the network clients is to identify the various types of network functions and applications that are provided to the users. It is likely that several steps are necessary to migrate your clients to take full advantage of clustering.
File and Print Services
The main consideration for file and print services is the method clients use to connect to the shared resources. If clients use batch files to connect to shared directories on the server, the batch files may need to be updated to reflect the new path name and, possibly, the new share name.
Connecting to Shared Resources
In the traditional, command-driven connection to a shared resource, the user needs to know the server name and the share name. In a clustered environment, the command is changed to reflect the cluster network name and file share that were configured as part of the failover group for that shared directory.
Compare the command syntax in Table 2-3 for connecting to a shared resource on a stand-alone server versus a clustered server.
Table 2-3
Comparison of Net Use Command Syntax

Server Environment      Command Syntax
Stand-alone server      net use J: \\servername\sharename
Cluster node            net use J: \\networkname\fileshare
Change client login scripts and profiles so that users connect to resources using the cluster network name and file share.
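For example, a login script fragment might change as follows. This is an illustrative sketch only; \\SERVER1, \\WEBSALES, and SALESDOCS are hypothetical names, not values from this guide:

    REM Old mapping, tied to a physical server name:
    REM   net use J: \\SERVER1\SALESDOCS
    REM New mapping, tied to the cluster network name (virtual server),
    REM which follows the file share when it fails over:
    net use J: \\WEBSALES\SALESDOCS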
Client/Server Applications
Reconfiguration of client applications in a client/server environment may also be required. Some applications, such as many of the popular databases, require the client to specify the IP address of the server that holds the database they want to connect to. The IP addresses may be held in a special configuration program or in a text file. Any references to the server’s actual IP addresses must be changed to reflect the new IP Address Resource that has been configured for that application’s cluster group.
Some databases allow you to specify the IP address of a backup server, which the client database software attempts to use in case the database is not accessible using the first IP address. The backup IP address scheme can be used in a nonclustered environment to assist clients if the primary server fails. This is no longer necessary when using MSCS.
In a clustered environment, IP addresses for the database are configured to fail over with the database application, making a backup IP address on the client unnecessary. When the database resources have failed over to the other server, the client can reconnect to the database using the same IP address as before the failure. This process may be automated if the client application software supports automatic connection retries.
IMPORTANT: Examine these client configuration issues in a pilot and testing phase before implementing a clustered system. This will help you to identify any client reconfiguration requirements and understand how client applications will behave in a clustered environment, especially after a failure.

Failover/Failback Planning

The final section of this chapter addresses several factors to consider when planning for failover and failback events.
Performance of clustered servers after failover
Cluster server thresholds and periods
Failover of directly connected devices
Automatic vs. manual failover
Failover/failback policies

Performance After Failover

As applications or resources fail from one server to another, the performance of the clustered servers may change dynamically. This is especially obvious after a server failure, when all of the cluster resources may move to the other cluster node.
Performance monitoring of server loads after a failure should be investigated prior to a full clustered system implementation. You may need additional hardware, such as memory or system processors, to support the additional workload incurred after a failover.
It is also important to understand the performance impact when configuring server pairs in a failover cluster. If a business-critical database is already running at peak performance, requiring the server to take on the additional workload of a failed server may adversely affect business operations. In some cases, you may find it appropriate to pair that server with a low-load server, or even with a no-load server (as in an active/standby cluster configuration).
You can use the Windows NT Performance Monitor to observe and track system performance. Some applications may also have their own internal performance measurement capabilities.

MSCS Thresholds and Periods

MSCS offers flexibility in configuring the initiation of failover events. For resources, MSCS allows you to set Restart Thresholds and Periods. For cluster groups, MSCS allows you to set Failover Thresholds and Periods.
Restart Threshold and Restart Period
A restart threshold defines the maximum number of times per restart period that MSCS attempts to restart a resource before failing over the resource and its corresponding cluster group. See the following example:
Assume you have a disk resource (Disk1) that is part of a cluster group (Group1). You set the restart threshold to 5 and the restart period to 10. If the Disk1 resource fails, MSCS will attempt to restart the resource on the group’s current cluster node five times within a 10-minute period. If the resource cannot be brought online within the 10-minute restart period, then Group1 will fail over to the partner cluster node.
Note that MSCS waits the length of the restart period (for example, 10 minutes) before actually failing over the cluster group. You must assess the likelihood that the group will successfully restart on its present node against the time required to restart the cluster group before failing it over. If it is appropriate to immediately fail over any group that encounters a problem, set the restart threshold to 0 (zero). If the group will experience severe performance limitations if failed over to a secondary server, set the threshold and period so that MSCS attempts to restart the group on its preferred server.
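If you script these settings rather than using Cluster Administrator, they correspond to the MSCS resource common properties RestartThreshold and RestartPeriod. The following cluster.exe sketch assumes that RestartPeriod is specified in milliseconds (10 minutes = 600000 ms) and that your cluster.exe version accepts this property syntax:

    REM Attempt up to 5 restarts of Disk1 within a 10-minute period
    REM before failing over its group.
    cluster resource "Disk1" /prop RestartThreshold=5
    cluster resource "Disk1" /prop RestartPeriod=600000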
Failover Threshold and Failover Period
The failover threshold and failover period are similar to the restart values. The failover threshold defines the maximum number of times per failover period that MSCS attempts to fail over a cluster group. If the cluster group exceeds the failover threshold in the allotted failover period, the group is left on its current node, in its current state, whether that is online, offline, or partially online.
The failover threshold and failover period prevent a cluster group from bouncing back and forth between servers. If a cluster group is so unstable that it cannot run properly on either cluster node, it will eventually be left in its current state on one of the nodes. The failover threshold and period determine the point at which the decision is made to leave the cluster group in its current state.
The following example illustrates the relationship between the restart threshold and period and the failover threshold and period.
Assume you have a cluster group (Group1) that is configured with a preferred server (Server1). If Group1 encounters an event that forces it offline, MSCS attempts to restart it. If Group1 cannot be restarted within the limits of the restart threshold and period, MSCS attempts to fail over Group1 to the partner cluster node. If the failover threshold for Group1 is set to 10 and the failover period is set to 3 (hours), MSCS will fail over Group1 as many as 10 times in a 3-hour period. If a failure is still forcing Group1 offline after three hours, MSCS will no longer attempt to fail over the group.
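The group-level counterparts can be set the same way from the command line. This is a sketch only, reusing the hypothetical group name from the example; the FailoverPeriod property is expressed in hours (confirm against your MSCS documentation):

    REM Allow Group1 to fail over at most 10 times within a 3-hour period
    CLUSTER GROUP "Group1" /PROP FailoverThreshold=10
    CLUSTER GROUP "Group1" /PROP FailoverPeriod=3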

Failover of Directly Connected Devices

Devices that are physically connected to a server cannot move to the other cluster node. Therefore, any applications or resources dependent on these devices may be unable to restart on the other cluster node. Examples of direct-connect devices include printers, mainframe interfaces, modems, fax interfaces, and customized input devices such as bank card readers.
For example, if a server is providing print services to users, and the printer is directly connected to the parallel port of the server, there is no way to switch the physical connection to the other server, even though the print queue and spooler can be configured to fail over. The printer should be configured as a true network printer and connected to a hub that is accessible from either cluster node. In the event of a server failure, not only will the print queue and spooler fail over to the other server, but physical access to the printer will be maintained.
Another example of a direct-connect device is a directly connected mainframe interface. If the first server is directly connected to the mainframe, such as through an SDLC (Synchronous Data Link Control) card in the server, there is no way to switch the physical connection to a second server. In a case like this, you may be able to use the client network to access the mainframe using TCP/IP. Because TCP/IP addresses can be configured to fail over, you may be able to reestablish the connection after a failover. However, many mainframe connectivity applications use the Media Access Control (MAC) address that is burned into the NIC to communicate with the server. This causes a problem because MAC addresses cannot be configured to fail over.
Carefully examine the direct-connect devices on each server to determine whether you need to provide alternate solutions outside of what the cluster hardware and software can accomplish. These devices can be considered single points of failure because the cluster components may not be able to provide failover capabilities for them.

Manual vs. Automatic Failback

Failback is the act of reintegrating a failed cluster node into the cluster; specifically, it returns cluster groups and resources to their preferred server. MSCS offers automatic and manual failback options. An automatic failback occurs whenever the preferred server is reintegrated into the cluster. If the reintegration occurs during normal business hours, there may be a slight interruption in service for network clients during the failback process. If the interruption needs to occur during nonpeak hours, set the failback policy to "Allow" and set the "Between Hours" settings to acceptable values. For full control over when a cluster node is reintegrated, use manual failback by choosing "Prevent" as the failback policy.
Many organizations prefer to use manual failback for business-critical clusters. This prevents applications from automatically failing back to a server that has failed, automatically rebooted, and automatically rejoined the cluster before the root cause of the original error has been determined.
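The same failback choices can be made from a command prompt with CLUSTER.EXE. In this sketch the group name is hypothetical; AutoFailbackType 0 corresponds to Prevent and 1 to Allow, and the failback window properties take hours of the day. Verify the exact property names and value ranges in your MSCS documentation:

    REM Allow automatic failback, but only between 1:00 a.m. and 5:00 a.m.
    CLUSTER GROUP "Group1" /PROP AutoFailbackType=1
    CLUSTER GROUP "Group1" /PROP FailbackWindowStart=1
    CLUSTER GROUP "Group1" /PROP FailbackWindowEnd=5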
These terms are described and illustrated in the Group Failover/Failback Policy Worksheet provided in the following section.

Failover and Failback Policies

In the “Cluster Groups” section of this chapter, you created one or more cluster group definition worksheets (Figure 2-7). For each cluster group defined in the worksheets, you will now determine its failover and failback policies by filling in the Group Failover/Failback Policy worksheet.
Terms and Definitions
The following terms and definitions are used in defining failover/failback policies for cluster groups.
Table 2-4
Group Failover/Failback Policy Terms and Definitions
Term                  Definition

Failover policy       The circumstances under which MSCS takes a group offline on the primary (preferred) node and brings it online on the secondary node.

Failback policy       The circumstances under which MSCS takes a group offline on the secondary node and brings it online on the primary (preferred) node.

Preferred owner       The cluster node you want the cluster group to run on when the cluster is in a normal state.

Failover threshold    The number of times MSCS will attempt to fail over a group within a specified failover period.

Failover period       The length of time in which MSCS attempts to fail over a cluster group. When the failover threshold count has been exceeded within the failover period, MSCS leaves the group on its current node, in its current state. Example: if the failover threshold = 5 and the failover period = 1, MSCS will attempt to fail over the group 5 times within a 1-hour period.

Prevent               Prevent automatic failback. This setting allows the administrator to fail back a group manually.

Allow                 Allow automatic failback. This setting allows MSCS to fail back a group automatically.

Allow immediately     Allows automatic failback as soon as the preferred node is reintegrated into the cluster and brought back online.

Allow between hours   Allows the administrator to determine specific hours of the day during which automatic failback can occur.
Refer to the Microsoft Cluster Server Administrator’s Guide for detailed information about failover and failback policies of groups and resources.
Group Failover/Failback Policy
Use the Group Failover/Failback Policy worksheet to define the failover and failback policies for each cluster group. Figure 2-19 illustrates the failover/failback parameters for the Web Server Service of the Web Sales Order business function defined in previous examples. A blank copy of the worksheet is provided in Appendix A.
Group Failover/Failback Policy Worksheet

Group Name: Web Server Service

General Properties
Name: Web Server Service
Description: Group containing the Web Server Service used to operate the Web Sales Order business function
Preferred Owners: Server 1

Failover Properties
Threshold: 5
Period: 10

Failback Properties
Prevent (manual failback preferred for this group)
Allow (choose one):
    Immediately
    Between hours: Start ____  End ____

Figure 2-19. Group Failover/Failback Policy Worksheet (example)

Chapter 3
Setting Up the Compaq ProLiant Clusters HA/F100 and HA/F200

Preinstallation Overview

This chapter provides instructions for building a new Compaq ProLiant Cluster HA/F100 or a Compaq ProLiant Cluster HA/F200.
If you already have an HA/F100 and wish to upgrade to an HA/F200 configuration, turn to Chapter 4. If you begin with Chapter 4, first become familiar with the guidelines outlined at the beginning of Chapter 3; they will help ensure that the configuration is set up properly. Chapter 4 assumes you are familiar with the HA/F100 configuration hardware and software.
The Compaq ProLiant Clusters HA/F100 and HA/F200 are combinations of several individually available products. Have the following documents available as you set up your cluster:
Documentation for the clustered Compaq ProLiant Servers
Compaq StorageWorks RAID Array 4000 User Guide
Compaq StorageWorks Fibre Channel Host Adapter Installation Guide
Installation guide for the interconnect card of your choice
Compaq SmartStart for Servers Setup Poster
Compaq Insight Manager Installation Poster
Compaq Intelligent Cluster Administrator Quick Setup Guide
Microsoft Windows NT Server 4.0/Enterprise Edition Administrator's Guide
Microsoft Cluster Server Administrator's Guide
The installation and setup of your ProLiant Cluster can be described in the following phases:

Preinstallation guidelines

Installing the hardware, including:
- Cluster nodes
- Compaq StorageWorks RAID Array 4000 storage system
- Cluster interconnect

Installing the software, including:
- Compaq SmartStart for Servers
- Compaq Redundancy Manager (Fibre Channel)
- Microsoft Windows NT Server 4.0/Enterprise Edition
- Compaq Insight Manager (optional)
- Compaq Insight Manager XE (optional)
- Compaq Intelligent Cluster Administrator (optional)

Additional cluster verification steps, including:
- Verifying creation of the cluster
- Verifying node failover
- Verifying network client failover

These installation and configuration steps are described in the following pages.

Preinstallation Guidelines

When setting up the cluster, you will need to answer each of the following questions. Using the Preinstallation Worksheet in Appendix A, write down the answers to these questions before installing Microsoft Cluster Server (MSCS) on either cluster node.
Are you forming or joining a cluster?
What is the cluster name?
What are the username, password, and domain for the domain account that MSCS will run under?
What disks will you use for shared storage?
Which shared disk will you use to store permanent cluster files?
What are the adapter names and IP addresses of the network adapter cards you will use for client access to the cluster?
What are the adapter names and IP addresses of the network adapter cards you will use for the private interconnect between the cluster nodes?
What are the IP address and subnet mask of the address you will use to administer the cluster?
Installing clustering software requires several specific steps and guidelines that may not be necessary when installing software on a single server. Read and understand the following items before proceeding with any software installation:
Have sufficient software licensing rights to install Windows NTS/E and software applications on each server, because a cluster configuration uses more than one server.
Install the clustering hardware before installing the software.
The storage hub must have AC power.
The RA4000 storage system must be turned on before the cluster nodes are powered on.
Log on to the domain using an account that has administrative permissions on both cluster nodes. When installing MSCS, both cluster nodes must be in the same Windows NT domain. The cluster nodes can be members of an existing Windows NT domain, they can both be member servers, they can make up their own domain by assigning one as Primary Domain Controller (PDC) and one as Backup Domain Controller (BDC), or they can both be BDCs in an existing Windows NT domain.
One of the utilities the SmartStart CD runs is the Compaq Array Configuration Utility, which configures the drives in the RA4000. The Array Configuration Utility stores the drive configuration information on the drives themselves. After you have configured the shared drives from one of the cluster nodes, it is not necessary to configure the drives from the other cluster node.
When the Array Configuration Utility runs on the first cluster node, configure the shared drives in the RA4000 storage system. When SmartStart runs the utility on the second cluster node, you will be presented with the information on the shared drives that was entered when the Array Configuration Utility was run on the first node. Accept the information as presented and continue.
For a manual software installation, use Windows NT Disk Administrator on the first cluster node to configure the shared drives, and allow MSCS to synchronize information between the two nodes. By running Disk Administrator from the first node, you prevent potential problems caused by inconsistent drive configurations. When the second cluster node joins the cluster, the disk information in the Windows NT Registry is copied from the first node to the second node.
Only New Technology File System (NTFS) is supported on shared drives.
MSCS requires drive letters to remain constant throughout the life of the cluster; therefore, you must assign permanent drive letters to your shared drives. If you are performing a manual software installation, use Windows NT Disk Administrator to assign drive letters and ensure the assignments are permanent.
Windows NT makes dynamic drive letter assignments (when drives are added or removed, or when the boot order of drive controllers is changed), but Windows NT Disk Administrator allows you to make permanent drive letter assignments.
NOTE: This step does not apply if you are using the SmartStart Assisted Integration installation.
Cluster nodes can be members of only one cluster.
NOTE: It is possible to use Hosts and LMHosts files, as well as DNS (Domain Name System), for intracluster communication. However, WINS is the recommended method.
When you set up the cluster interconnect, select TCP/IP as the network protocol; MSCS requires the TCP/IP protocol. The cluster interconnect must be on its own subnet, and the IP addresses of the interconnects must be static, not dynamically assigned by DHCP.
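As an illustration only, a common scheme gives the interconnect adapters static addresses such as 10.0.0.1 and 10.0.0.2 with a 255.255.255.0 subnet mask; these addresses are hypothetical. You can confirm the addressing from a command prompt on each node:

    REM Display adapter names, IP addresses, and subnet masks on this node
    IPCONFIG /ALL
    REM From Node1, verify the private path by pinging Node2's interconnect address
    PING 10.0.0.2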

Installing the Hardware

The following installation steps detail setting up a Compaq ProLiant Cluster HA/F100 or HA/F200.

Setting Up the Nodes

Physically preparing the nodes (servers) for use in a cluster is not very different from preparing them for individual use. The primary difference is setting up the shared storage:
1. Install all necessary adapter cards and insert all internal hard drives.
2. Attach network cables and plug in SCSI and/or Fibre Channel cables.
3. Set up one node completely, then set up the second node.
IMPORTANT: Do not load any software on either cluster node until all the hardware has been installed in both cluster nodes.
NOTE: Compaq recommends that Automatic Server Recovery (ASR) be left at the default values for clustered servers.
Follow the installation instructions in your Compaq ProLiant Server documentation to set up the hardware. To install Compaq StorageWorks Fibre Channel Host Adapters (host bus adapters) and any interconnect cards, follow the instructions in the next sections.
IMPORTANT: For the most up-to-date list of cluster-certified servers, access the Compaq High Availability website (http://www.compaq.com/highavailability).
Installing the Compaq StorageWorks Fibre Channel Host Adapter
Follow the installation instructions in your Compaq StorageWorks Fibre Channel Host Adapter Installation Guide and your Compaq ProLiant Server documentation to install the host bus adapter in your servers. Install one adapter per server for the HA/F100 configuration. Install two adapters per server for the HA/F200 configuration.
The host bus adapters, which connect the two servers to the storage through storage hubs, are installed in each server like any other PCI card. The HA/F100 cluster requires one host bus adapter per server, while the HA/F200 requires two host bus adapters per server. The extra host bus adapter in each server contributes to the enhanced high availability features of the HA/F200. The dual host bus adapters, in conjunction with dual hubs and dual array controllers, form two completely independent paths to the storage, making the server-to-storage connection totally redundant. However, it is important to ensure that each host bus adapter in a particular server is connected to a different hub, because it is physically possible to connect the servers to the storage hubs in such a way that the cluster will appear to be working correctly but will not be able to fail over properly.
NOTE: To determine the preferred slots for installing the host bus adapters, use PCI bus-loading techniques to balance the PCI bus for your hardware and configuration. For more information, refer to your server documentation, Performance and Tuning TechNotes, and the Compaq White Paper, "Where Do I Plug the Cable? Solving the Logical-Physical Slot Numbering Problem," available from the Compaq website (http://www.compaq.com).
Installing the Cluster Interconnect
There are many ways to physically set up an interconnect. Chapter 1 discusses the various types of interconnect strategies.
If you are using a dedicated interconnect, install an interconnect adapter card (Ethernet or ServerNet) in each cluster node. If you are sharing your local area network (LAN) network interface card (NIC) with your interconnect, install the LAN NIC.
NOTE: To determine the preferred slot for installing the interconnect card, use PCI bus-loading techniques to balance the PCI bus for your hardware and configuration. If you are installing the ServerNet card, treat it as a NIC in determining the preferred installation slot for maximum performance. For more information, see your server documentation, Performance and Tuning TechNotes, and the Compaq White Paper, “Where Do I Plug the Cable? Solving the Logical-Physical Slot Numbering Problem,” available from the Compaq website (http://www.compaq.com).
For specific instructions on how to install an adapter card, refer to the documentation for the interconnect card you are installing or the Compaq ProLiant Server you are using. The cabling of interconnects is outlined later in this chapter.

Setting Up the Compaq StorageWorks RAID Array 4000 Storage System

Follow the instructions in the Compaq StorageWorks RAID Array 4000 User Guide to set up the RA4000s, the Compaq StorageWorks Fibre Channel Storage Hub 7 or 12, the Compaq StorageWorks RA4000 Controller (array controller), and the Fibre Channel cables.
Note that the Compaq StorageWorks RAID Array 4000 User Guide explains how to install these devices for a single server. Because clustering requires shared storage, you will need to install these devices for two servers. This will require running an extra Fibre Channel cable from the storage hub to the second server.
Figure 3-1. RA4000 storage system connected to clustered servers in the HA/F100 configuration (Node 1 and Node 2 are joined by a dedicated interconnect and are attached to the LAN and, through the storage hub, to the RA4000)

IMPORTANT: It is strongly recommended for cluster configuration that all shared drives be in the storage box before running the Compaq Array Configuration Utility.
Powering Up
Before applying power to the RA4000, all components must be installed and connected to the storage hub. Hard drives should be installed in the RA4000 so they can be identified and configured.
The cluster must be powered up in the following order:
1. Storage hub (power is applied when the AC power cord is plugged in)
2. Storage system
3. Servers
Configuring Shared Storage
The Compaq Array Configuration Utility sets up the hardware aspects of any drives attached to an array controller, including the drives in the shared RA4000s. The Array Configuration Utility can initially configure the array controller, reconfigure the array controller, add additional disk drives to an existing configuration, and expand capacity. The Array Configuration Utility stores the drive configuration information on the drives themselves; therefore, after you have configured the drives from one of the cluster nodes, it is not necessary to configure the drives from the other cluster node.
For detailed information about configuring the drives, refer to "Running the Array Configuration Utility" in the Compaq StorageWorks RAID Array 4000 User Guide.
The Array Configuration Utility runs automatically during an Automated SmartStart cluster installation.

Setting Up a Private Interconnect

There are four ways to set up a private interconnect.
Ethernet direct connect
Ethernet direct connect using a private hub
ServerNet direct connect
ServerNet direct connect using a switch
Ethernet Direct Connect
An Ethernet crossover cable is included with your Compaq ProLiant Cluster. This cable directly connects two Ethernet cards. Place one end of the cable in the interconnect (Ethernet) card in Node1 and the other end of the cable in the interconnect card in Node2.
IMPORTANT: Place the cable into the interconnect cards and not into the Ethernet connections used for the network clients (the LAN).
NOTE: The crossover cable will not work in conjunction with a storage hub.
Ethernet Direct Connect Using a Private Hub
An Ethernet hub requires standard Ethernet cables; Ethernet crossover cables will not work with a hub. Follow these steps to cable the server interconnect using an Ethernet hub:
1. Connect the end of one of the Ethernet cables to the interconnect (Ethernet) card in Node1.
2. Connect the other end of the cable to a port in the hub.
3. Repeat steps 1 and 2 for the interconnect card in Node2.
IMPORTANT: Place the cable into the interconnect cards and not into the Ethernet connections used for the network clients (the LAN).
ServerNet Direct Connect
If you have chosen the Compaq ServerNet option as the server interconnect for your ProLiant Cluster, you will need the following:
Two ServerNet PCI adapter cards
Two ServerNet cables
Follow these steps to install the ServerNet interconnect:
1. Connect one end of a ServerNet cable to connector X on the ServerNet card in Node1.
2. Connect the other end of the ServerNet cable to connector X on the ServerNet card in Node2.
3. Connect the two ends of the second ServerNet cable to the Y connectors on the ServerNet cards in Node1 and Node2.
IMPORTANT: Fasten the cable screws tightly. A loose cable could cause an unexpected fault in the interconnect path and potentially cause an unnecessary failover event.
ServerNet Direct Connect Using a Switch
Although not necessary for a two-node cluster, the use of a ServerNet Switch allows for future growth. Refer to the Compaq ServerNet documentation for a description and detailed installation instructions.

Setting Up a Public Interconnect

It is possible, but not recommended, to use a public network as your dedicated interconnect path. To set up a public Ethernet interconnect, connect the Ethernet cards, hub, and cables as you would in a nonclustered environment. Then configure the Ethernet cards both for network clients and for the cluster interconnect.
IMPORTANT: Using a public network as your dedicated interconnect path is not recommended because it represents a potential single point of failure for cluster communication.
NOTE: ServerNet is designed to be used only as a private or dedicated interconnect. It cannot be used as a public interconnect.

Redundant Interconnect

MSCS allows you to configure any certified network card as a possible path for intracluster communication. If you are employing a dedicated interconnect, use MSCS to configure your LAN network cards to serve as a backup for your interconnect.
See the "Recommended Cluster Communication Strategy" section in Chapter 2 of this guide for more information about setting up redundancy for intracluster and cluster-to-LAN communication.

Installing the Software

The following information describes the software installation steps for the HA/F100 and the HA/F200. Proceed with these steps once you have all equipment installed and your hubs, storage system, and one server powered up.
You need the following during installation:

IMPORTANT: Refer to Appendix C for the software and firmware version levels your cluster requires.

Compaq SmartStart for Servers CD
Compaq SmartStart for Servers Setup Poster
Server Profile Diskette (included with SmartStart)
Microsoft Windows NT Server 4.0/Enterprise Edition software and documentation
Compaq Redundancy Manager (Fibre Channel) software
Monitoring and management software:
- Compaq Insight Manager software and documentation
- Compaq Insight Manager XE software and documentation
- Compaq Intelligent Cluster Administrator software and documentation
At least ten high-density diskettes
Windows NT 4.0 Service Packs

Assisted Integration Using SmartStart (Recommended)

Use the SmartStart Assisted Integration procedure to configure the servers (nodes) in the HA/F100 and HA/F200 configurations. You will set up two nodes during this process. Go through all steps on each of the nodes, with noted exceptions. The following steps will take you through the SmartStart Assisted Integration procedure.

CAUTION: Installation using SmartStart assumes that SmartStart is being installed on new servers. Any existing data on the servers' boot drives will be erased.
Cluster-Specific SmartStart Installation
The SmartStart setup poster describes the general flow of configuring and installing software on a single server. The installation for a Compaq ProLiant Cluster HA/F100 and HA/F200 will be very similar. The differences between running SmartStart on a stand-alone server and running SmartStart for a cluster are noted below:
Through the Compaq Array Configuration Utility, you can configure the shared drives on both servers. For cluster configuration, configure the drives on the first server, then accept the same settings for the shared drives when given the option on the second server.
When configuring drives through the Array Configuration Utility, create a logical drive with 100 MB of space to be used as the quorum disk.
Assisted Installation Steps
IMPORTANT: Power off Node2 when setting up Node1.
1. Power on the shared storage. Place the SmartStart CD in the CD-ROM drive of the cluster node and power on the node. The CD will automatically run.
2. Select the Assisted Integration installation path.
Follow the steps outlined in the SmartStart Setup Poster. Noted below are some steps specific to configuring a server for use in the HA/F100 and HA/F200.
3. When SmartStart prompts for the operating system, select Microsoft Windows NT Server 4.0/Enterprise Edition (Retail) or Microsoft Windows NT Server 4.0/Enterprise Edition (Select) as the operating system. The edition of Windows NT that you select will be determined by the version of the software you have.
4. After the hardware configuration has run, restart the computer.
5. When you restart the computer, SmartStart will automatically run the Array Configuration Utility. Choose the custom configuration option to create RAID sets on your RA4000 storage system. Refer to the Compaq StorageWorks RAID Array 4000 User Guide for more details.
IMPORTANT: Node2 Exception: When configuring Node2, the Array Configuration Utility shows the results of the shared drives configured during Node1 setup. Accept these changes for Node2 by exiting.
NOTE: Create a logical drive with 100MB of space to be used as the quorum disk.
6. After you have completed using the Array Configuration Utility, the system will reboot and SmartStart will automatically create your system partition.
7. Next, you will be guided through steps to install additional Compaq software and utilities, including choosing the NT boot partition size and installing the Compaq Support Software Disk (SSD). Follow the instructions in the SmartStart setup poster.
IMPORTANT: Node2 Exception: When configuring Node2, exit out of the Diskette Builder Utility and go to step 9.
8. After the Diskette Builder Utility has loaded, create the Options ROMPaq utility in Diskette Builder. Label the diskettes you create. The Options ROMPaq utility updates the firmware on the array controllers and the hard drives. For more information about Options ROMPaq, refer to the Compaq StorageWorks RAID Array 4000 User Guide. You can also create the SSD during this process.
9. At this time, the system will reboot to prepare for the operating system installation. When prompted, insert the Windows NT Server 4.0/Enterprise Edition (Retail) or Windows NT Server 4.0/Enterprise Edition (Select) CD.
10. When prompted, install Service Pack 3. After Service Pack 3 is installed, the server reboots and Enterprise Edition Installer loads automatically.
11. Exit the Enterprise Edition Installer.
IMPORTANT: Node2 Exception: Open Disk Administrator and when prompted for drive signature stamp, choose yes. After this process is complete, exit Disk Administrator.
12. Open Disk Administrator and create disk partitions.
13. Power down the server, insert the Options ROMPaq diskette in Node1, and restart the system.
IMPORTANT: When updating the firmware on the array controllers, make sure that Node2 is powered off.
IMPORTANT: Node2 Exception: Do not update the firmware on the array controllers when setting up Node 2. Skip to step 17.
14. Run Options ROMPaq from the diskettes and choose to update the firmware on the array controllers.
15. After the firmware update completes, power down the storage and server.
16. Power on the storage and wait for the drives to spin up, then power on the server.
17. If setting up an HA/F200, install Redundancy Manager.
To automatically install Redundancy Manager:
a. Place the Compaq Redundancy Manager (Fibre Channel) CD in the CD-ROM drive. It automatically loads the Install program.
b. Follow the instructions offered by the Redundancy Manager installation screens.
c. Remove the Compaq Redundancy Manager (Fibre Channel) CD from the CD-ROM drive.
d. Reboot the server.
Redundancy Manager is now installed on your computer. To use Redundancy Manager, double-click the icon.
To manually install Redundancy Manager:
a. Place the Compaq Redundancy Manager (Fibre Channel) CD into the CD-ROM drive.
b. Select Settings from the Start menu.
c. Select Control Panel from the Settings menu.
d. Select Add/Remove Programs from the Control Panel.
e. Click Install from the Add/Remove Programs page.
f. Click Next from the Add/Remove Programs page.
g. Click Browse from the Add/Remove Programs page.
h. Locate the Redundancy Manager SETUP.EXE file on the Compaq Redundancy Manager (Fibre Channel) CD.
i. Click Finish from the Add/Remove Programs page. The setup program begins.
j. Follow the instructions displayed on the Redundancy Manager installation screens.
k. Close the Control Panel.
l. Remove the Compaq Redundancy Manager (Fibre Channel) CD from the CD-ROM drive.
m. Reboot the server.
Redundancy Manager is now installed on your computer. To use Redundancy Manager, double-click the icon.
IMPORTANT: Node2 Exception: Repeat SmartStart Assisted Integration steps 1 through 11 for Node2. Then proceed to step 17.
IMPORTANT: Node1 Exception: Execute step 18 only after Node2 is set up.
18. Run the Compaq Cluster Verification Utility (CCVU) on both nodes to ensure that your system is ready for cluster installation.
Use the Compaq Cluster Verification Utility CD in your cluster kit to install the Compaq Cluster Verification Utility, then follow the steps listed below.
To automatically install CCVU, insert the Compaq Cluster Verification Utility CD in the CD-ROM drive. Installation should automatically run.
If manual installation is necessary:
a. Select Start, Run.
b. Click Browse and select the CD-ROM drive.
c. Double-click SETUP.EXE.
Follow these steps to run the Compaq Cluster Verification Utility:
a. On the Microsoft Windows NT desktop, select Start, Programs, Compaq Utilities, Compaq Cluster Verification Utility.
b. Select a test by highlighting the cluster model you are verifying.
c. Click Next.
d. Select one or two computers on which to perform the verification tests.
e. Click Add to add a computer to the selected list.
f. Click Finish. You must have at least one computer on the selected list before clicking Finish.
NOTE: You must have administrative accounts with identical usernames and passwords on the computers selected.
g. Click Remove to remove a computer from the selected list.
For more information about CCVU, refer to the online documentation (CCVU.HLP) included on the CD.
19. Open the Enterprise Edition Installer and install MSCS on both cluster nodes as outlined in the MSCS documentation.
20. Run CCVU again to verify successful cluster installation.
21. After cluster installation completes, install Windows NT Service Pack 5. For the latest information on Service Packs, please refer to the release notes.
22. Run the Compaq Support Software Disk (SSD) through the diskettes you created or from the SmartStart CD and verify that all installed drivers are current.
23. Install your applications and your management and monitoring software. Refer to the Compaq Insight Manager Installation Poster for information on installing Compaq Insight Manager on the management console and Insight Management Agents on servers and desktops. The Compaq Intelligent Cluster Administrator CD is located in your HA/F200 cluster kit and is available as an orderable option for the HA/F100. Steps for installing Compaq Intelligent Cluster Administrator can be found later in this chapter and in the Compaq Intelligent Cluster Administrator Quick Setup Guide.

Manual Installation Using SmartStart

To perform a manual installation, follow these steps:

IMPORTANT: Power off Node2 when setting up Node1.

1. Power up the shared storage. Place the SmartStart CD in the CD-ROM drive of the cluster node.
2. Follow the steps on the SmartStart Setup Poster, making sure you select Manual Configuration as the SmartStart installation path.
3. Select Microsoft Windows NT Server 4.0/Enterprise Edition (Retail) or Microsoft Windows NT Server 4.0/Enterprise Edition (Select) as the operating system. The edition of Windows NT that you select will be determined by the version of the software you have.
4. After this process is complete, restart the computer.
5. When you restart the computer, SmartStart will automatically create your system partition.
IMPORTANT: Node2 Exception: When configuring Node2, the Array Configuration Utility shows the results of the shared drives configured during Node1 setup. Accept these changes for Node2 by exiting.
NOTE: Create a logical drive with 100MB of space to be used as the quorum disk.
6. Next, SmartStart will automatically run the Array Configuration Utility. Refer to the Compaq StorageWorks RAID Array 4000 User Guide for instructions about creating RAID sets.

IMPORTANT: Node2 Exception: When configuring Node2, exit out of the Diskette Builder Utility, remove the SmartStart CD, and go to step 12.

7. After you have completed using the Array Configuration Utility, the system will reboot and present a menu screen. Choose to run Create Support Software. This will load the Diskette Builder Utility. After the Diskette Builder Utility has loaded, create the Options ROMPaq utility and the Compaq Support Software Disk (SSD). Label the diskettes you create. The Options ROMPaq utility updates the firmware on the array controllers and the hard drives. For more information about Options ROMPaq, refer to the Compaq StorageWorks RAID Array 4000 User Guide.
8. Exit and remove the SmartStart CD.
9. Power down the server, insert the Options ROMPaq diskette in Node1, and restart the system.
IMPORTANT: When updating the firmware on the array controllers, make sure that one server is powered off.
IMPORTANT: Node2 Exception: Do not update the firmware on the array controllers when setting up Node 2. Skip to step 12.
10. Run Options ROMPaq and choose to update the firmware on the array controllers.
11. After the firmware update completes, power down the storage and server. Power on the storage and wait for the drives to spin up, then power on the server.
12. Insert the Microsoft Windows NT Server 4.0/Enterprise Edition (Retail) or Microsoft Windows NT Server 4.0/Enterprise Edition (Select) CD.
13. When prompted, install Service Pack 3. After Service Pack 3 is installed, the server reboots and Enterprise Edition Installer loads automatically.
14. Exit the Enterprise Edition Installer.
15. Run the Compaq Support Software Disk using the diskettes you created in Step 7 or insert the SmartStart CD and run the SSD directly from there. Install all necessary drivers and utilities. Refer to the Support Software Disk online help for more details. Reboot when prompted.
IMPORTANT: Node2 Exception: Open Disk Administrator and when prompted for drive signature stamp, choose yes. After this process is complete, exit Disk Administrator.
16. Open Disk Administrator and create disk partitions.
17. If setting up a Compaq ProLiant HA/F200, install Redundancy Manager, which is described in step 17 of the SmartStart Assisted Integration steps, then continue with the next step.
IMPORTANT: Node1 Exception: Execute step 18 only after Node2 is set up.
18. Run the Compaq Cluster Verification Utility (CCVU), which is described in step 18 of the SmartStart Assisted Integration steps, then continue with the next step.
IMPORTANT: Node2 Exception: Repeat SmartStart Assisted Integration steps 1 through 17 for Node2. Then proceed to step 19.
19. Open the Enterprise Edition Installer and install MSCS on both cluster nodes as outlined in the MSCS documentation.
20. Run the CCVU to verify a successful cluster installation.
21. After cluster installation completes, install Windows NT Service Pack 5. Refer to the release notes for the latest Service Pack information.
22. Run the Compaq Support Software Disk (SSD) through the diskettes you created or from the SmartStart CD and verify that all installed drivers are current.
23. Install your applications and your management and monitoring software. Refer to the Compaq Insight Manager Installation Poster for information on installing Compaq Insight Manager on the management console and Insight Management Agents on servers and desktops. The Compaq Intelligent Cluster Administrator CD is located in your HA/F200 cluster kit and is available as an orderable option for the HA/F100. Steps for installing Compaq Intelligent Cluster Administrator can be found later in this chapter and in the Compaq Intelligent Cluster Administrator Quick Setup Guide.

Compaq Intelligent Cluster Administrator

Compaq Intelligent Cluster Administrator (CICA) supports a variety of preconfigured cluster options. These options can be initialized on your cluster if you have the appropriate software installed. After you have installed Compaq Intelligent Cluster Administrator, you can select from a menu of preconfigured cluster configurations, and they will automatically be applied to your cluster.
The Compaq Intelligent Cluster Administrator Setup Guide and CD are located in your HA/F200 cluster kit. If you are setting up an HA/F100 configuration, you can order Compaq Intelligent Cluster Administrator separately.

Installing Compaq Intelligent Cluster Administrator

To install Compaq Intelligent Cluster Administrator on your system:
1. Insert the Intelligent Cluster Administrator CD.
2. Click the Explore button.
3. Double-click the CICA folder.
4. Double-click SETUP.EXE.
5. The Compaq Intelligent Cluster Administrator will begin installation. If a previous version of the product is installed, the service will be stopped and the new version will be installed.
6. Double-click the Setup icon on the installation disk and follow the instructions. The program will be deployed into the C:\COMPAQ\CICA directory. If this directory does not exist, the installation program will create it. Once installed, the files should not be moved.
7. Set the effective User ID for the Compaq Intelligent Cluster Administrator service to the Windows NT domain administrator user account.
8. Repeat these steps to install the software on the other cluster node.
For more specific instructions about using Compaq Intelligent Cluster Administrator, refer to the Compaq Intelligent Cluster Administrator Quick Setup Guide, which is included in your HA/F200 cluster kit.
Compaq has worked directly with several application vendors throughout the development of Compaq ProLiant clusters. As a result of these efforts, Compaq provides a number of Integration TechNotes and White Papers to assist you with installing these applications in a Compaq ProLiant Cluster environment.
Visit the Compaq High Availability website (http://www.compaq.com/highavailability) to download the most current Integration TechNotes and other technical documents.

Additional Cluster Verification Steps

The following information describes several Microsoft Cluster Administrator steps for verifying the creation of the cluster, verifying node failover, and verifying network client failover.

Verifying the Creation of the Cluster

After you have installed the software and placed the servers in a fresh state, verify creation of the cluster using the following steps:
1. Shut down and power off both servers.
2. Power off and then power on the RA4000.
3. Power both servers back on.
When Windows NTS/E finishes booting on both servers, follow these steps to use Microsoft Cluster Administrator to verify creation of the cluster:
1. From the Windows NTS/E desktop on either cluster server, select Start, Programs, Administrative Tools (Common), Cluster Administrator.
2. When you are prompted for the Cluster or Server Name, enter the name or IP address of one of the cluster nodes. If the cluster has been created correctly, the computer names of both cluster nodes appear on the left side of the Cluster Administrator window (see Figure 3-2).

Figure 3-2. Microsoft Cluster Administrator

3. If the cluster is not working correctly, see the installation troubleshooting tips in Chapter 6.
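You can also confirm from a command prompt that both nodes have joined the cluster, using the CLUSTER.EXE utility installed with MSCS. This is a minimal check; the cluster name shown is hypothetical, and you can run CLUSTER /? for the full syntax:

    REM List the member nodes and their current state (both should report Up)
    CLUSTER "MYCLUSTER" NODE /STATUS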

Verifying Node Failover

NOTE: Do not run any client activity while testing failover events.
Follow these steps to verify failover of a cluster node:
1. From the Windows NTS/E desktop on both servers, select Start, Programs, Administrative Tools (Common), Cluster Administrator.
2. When you are prompted for the Cluster or Server Name, enter the name or IP address of one of the cluster nodes.
3. Make sure all predefined resources and groups are online. Verify that some of the resources and groups are owned by the server you will be powering off, so that a failure event will result in failover of resources and/or groups.
4. Power off one of the cluster nodes. Within several seconds, Microsoft Cluster Administrator will bring online all of the predefined resources and groups that were previously owned by the powered-off server. If, after a minute, nothing appears to have happened, refresh the screen by selecting Refresh (F5).
5. If failover is not working correctly, see the installation troubleshooting tips in Chapter 6.
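As a supplement to the Cluster Administrator display, group ownership can also be watched from a command prompt on the surviving node. A sketch using CLUSTER.EXE; the exact output format may vary by version:

    REM List all cluster groups with their status and current owner node
    CLUSTER GROUP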

Verifying Network Client Failover

After you have verified that each server is correctly running as a cluster node, the next step is to verify that network clients can interact with the cluster. The following steps will lead you through this validation procedure:
1. Ensure both cluster nodes are running, and verify, by means of Microsoft Cluster Administrator, that all groups and resources are online.
2. For each hard disk in the shared storage, MSCS automatically creates a cluster group that consists of a single resource, the disk drive. Using Microsoft Cluster Administrator, add an existing IP address as another resource to one of these groups. (Do NOT use the Cluster Group.) Save the changes and return to the main Cluster Administrator screen.
3. Open a DOS window on a network client machine.
4. Ensure the network client can access the IP address. Regardless of whether you are using WINS or DHCP, you can execute the DOS command ping to check the connection. Execute a ping command from the network client, using the cluster IP address as the argument.
The client has successfully accessed the cluster resource if you get a response similar to:
Reply from <IP Address>: bytes=xx time=xxxms TTL=xx
The client has not successfully accessed the cluster resource if you get a response of:
Reply from <IP Address>: Destination host unreachable
5. Use Microsoft Cluster Administrator to perform a manual failover of the cluster group that contains the IP address. (A command-line alternative is shown after these steps.)
6. After the manual failover completes, execute the ping command again.
7. As soon as the other node brings the cluster group online, a response similar to the one noted in step 4 should be returned. If the client successfully accessed the failed-over IP address, your cluster is working. If the client was unsuccessful, either the cluster group was not configured correctly, the failover did not occur, or the ping command was executed before the failover activity completed.
8. If network client failover is not working correctly, see the installation troubleshooting tips in Chapter 6.
To verify a more extreme case, instead of failing over the IP address, power off the primary cluster node and verify that the resource fails over to the other node.
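The manual failover in step 5 can also be initiated from the command line with CLUSTER.EXE rather than from Microsoft Cluster Administrator. This is a sketch only; the group name is hypothetical, and Node2 stands for the partner node's computer name:

    REM Move the group containing the test IP address to the other cluster node
    CLUSTER GROUP "Disk Group 1" /MOVETO:Node2
    REM From the network client, confirm the address is reachable again
    PING <IP Address>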
Chapter 4
Upgrading the HA/F100 to an HA/F200

Preinstallation Overview

The difference between the Compaq ProLiant Cluster HA/F100 and the HA/F200 is the addition of a dual redundant loop. This redundant loop provides the HA/F200 cluster with no single point of failure in the connection between the servers and the storage subsystem(s).
Adding this loop to an existing HA/F100 configuration requires the following:
One additional Compaq StorageWorks RA4000 Controller (array controller) per storage subsystem
One additional Compaq StorageWorks Fibre Channel Storage Hub
One additional Compaq StorageWorks Fibre Channel Host Adapter (host bus adapter) per server
Compaq Redundancy Manager (Fibre Channel) software
Appropriate firmware and drivers
If you already have a Compaq ProLiant Cluster HA/F100 up and running, you do not need to repeat the installation of the hardware components described in Chapter 3. However, the upgrade from the HA/F100 to the HA/F200 requires the addition of the dual redundant loop hardware components, the Redundancy Manager software, and updated Windows NT device drivers and firmware. These additional steps are described in this chapter.