
Red Hat Cluster Suite: Configuring and Managing a Cluster
Red Hat Cluster Suite: Configuring and Managing a Cluster
Copyright © 2000-2006 by Red Hat, Inc., Mission Critical Linux, Inc., and K.M. Sorenson
Red Hat, Inc.
rh-cs(EN)-4-Print-RHI (2007-01-05T17:28)
For Part I Using the Red Hat Cluster Manager and Part III Appendixes, permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation. A copy of the license is available at http://www.gnu.org/licenses/fdl.html. The content described in this paragraph is copyrighted by © Mission Critical Linux, Inc. (2000), K.M. Sorenson (2000), and Red Hat, Inc. (2000-2003).
The material in Part II Configuring a Linux Virtual Server Cluster may be distributed only subject to the terms and conditions set forth in the Open Publication License, V1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/). Distribution of substantively modified versions of this material is prohibited without the explicit permission of the copyright holder. Distribution of the work or derivative of the work in any standard (paper) book form for commercial purposes is prohibited unless prior permission is obtained from the copyright holder. The content described in this paragraph is copyrighted by © Red Hat, Inc. (2000-2003).
Red Hat and the Red Hat "Shadow Man" logo are registered trademarks of Red Hat, Inc. in the United States and other countries. All other trademarks referenced herein are the property of their respective owners.
The GPG fingerprint of the security@redhat.com key is: CA 20 86 86 2B D6 9D FC 65 F6 EC C4 21 91 80 CD DB 42 A6 0E
Table of Contents
Introduction........................................................................................................................ i
1. How To Use This Manual .................................................................................... i
2. Document Conventions....................................................................................... ii
3. More to Come ......................................................................................................v
3.1. Send in Your Feedback .........................................................................v
4. Activate Your Subscription ................................................................................ vi
4.1. Provide a Red Hat Login..................................................................... vi
4.2. Provide Your Subscription Number ................................................... vii
4.3. Connect Your System......................................................................... vii
I. Using the Red Hat Cluster Manager ............................................................................ i
1. Red Hat Cluster Manager Overview....................................................................1
1.1. Red Hat Cluster Manager Features .......................................................2
2. Hardware Installation and Operating System Configuration ...............................9
2.1. Choosing a Hardware Configuration ....................................................9
2.2. Cluster Hardware Components ...........................................................14
2.3. Setting Up the Nodes ..........................................................................18
2.4. Installing and Configuring Red Hat Enterprise Linux ........................22
2.5. Setting Up and Connecting the Cluster Hardware..............................27
3. Installing and Configuring Red Hat Cluster Suite Software ..............................35
3.1. Software Installation and Configuration Tasks ...................................35
3.2. Overview of the Cluster Configuration Tool....................................36
3.3. Installing the Red Hat Cluster Suite Packages....................................39
3.4. Starting the Cluster Configuration Tool...........................................40
3.5. Naming The Cluster............................................................................43
3.6. Configuring Fence Devices.................................................................44
3.7. Adding and Deleting Members...........................................................49
3.8. Configuring a Failover Domain ..........................................................55
3.9. Adding Cluster Resources...................................................................60
3.10. Adding a Cluster Service to the Cluster............................................62
3.11. Propagating The Configuration File: New Cluster ...........................65
3.12. Starting the Cluster Software ............................................................65
4. Cluster Administration.......................................................................................67
4.1. Overview of the Cluster Status Tool .................................................67
4.2. Displaying Cluster and Service Status................................................68
4.3. Starting and Stopping the Cluster Software........................................71
4.4. Modifying the Cluster Configuration..................................................71
4.5. Backing Up and Restoring the Cluster Database................................72
4.6. Updating the Cluster Software............................................................74
4.7. Changing the Cluster Name ................................................................74
4.8. Disabling the Cluster Software ...........................................................74
4.9. Diagnosing and Correcting Problems in a Cluster..............................75
5. Setting Up Apache HTTP Server.......................................................................77
5.1. Apache HTTP Server Setup Overview ...............................................77
5.2. Configuring Shared Storage ................................................................77
5.3. Installing and Configuring the Apache HTTP Server.........................78
II. Configuring a Linux Virtual Server Cluster ............................................................81
6. Introduction to Linux Virtual Server..................................................................83
6.1. Technology Overview .........................................................................83
6.2. Basic Configurations ...........................................................................84
7. Linux Virtual Server Overview..........................................................................85
7.1. A Basic LVS Configuration ................................................................85
7.2. A Three Tiered LVS Configuration.....................................................87
7.3. LVS Scheduling Overview..................................................................89
7.4. Routing Methods.................................................................................91
7.5. Persistence and Firewall Marks ..........................................................93
7.6. LVS Cluster — A Block Diagram ......................................................94
8. Initial LVS Configuration...................................................................................97
8.1. Configuring Services on the LVS Routers ..........................................97
8.2. Setting a Password for the Piranha Configuration Tool ..................98
8.3. Starting the Piranha Configuration Tool Service.............................99
8.4. Limiting Access To the Piranha Configuration Tool .....................100
8.5. Turning on Packet Forwarding..........................................................101
8.6. Configuring Services on the Real Servers ........................................101
9. Setting Up a Red Hat Enterprise Linux LVS Cluster.......................................103
9.1. The NAT LVS Cluster.......................................................................103
9.2. Putting the Cluster Together .............................................................106
9.3. Multi-port Services and LVS Clustering...........................................107
9.4. FTP In an LVS Cluster......................................................................109
9.5. Saving Network Packet Filter Settings .............................................112
10. Configuring the LVS Routers with Piranha Configuration Tool ................115
10.1. Necessary Software.........................................................................115
10.2. Logging Into the Piranha Configuration Tool .............................115
10.3. CONTROL/MONITORING........................................................116
10.4. GLOBAL SETTINGS ..................................................................118
10.5. REDUNDANCY............................................................................120
10.6. VIRTUAL SERVERS ...................................................................122
10.7. Synchronizing Configuration Files .................................................133
10.8. Starting the Cluster .........................................................................135
III. Appendixes...............................................................................................................137
A. Supplementary Hardware Information............................................................139
A.1. Attached Storage Requirements.......................................................139
A.2. Setting Up a Fibre Channel Interconnect .........................................139
A.3. SCSI Storage Requirements.............................................................141
B. Selectively Installing Red Hat Cluster Suite Packages ...................................147
B.1. Installing the Red Hat Cluster Suite Packages .................................147
C. Multipath-usage.txt File for Red Hat Enterprise Linux 4 Update 3 ......157
Index................................................................................................................................165
Colophon.........................................................................................................................171
Introduction
The Red Hat Cluster Suite is a collection of technologies working together to provide data integrity and the ability to maintain application availability in the event of a failure. Administrators can deploy enterprise cluster solutions using a combination of hardware redundancy along with the failover and load-balancing technologies in Red Hat Cluster Suite.
Red Hat Cluster Manager is a high-availability cluster solution specifically suited for database applications, network file servers, and World Wide Web (Web) servers with dynamic content. A Red Hat Cluster Manager system features data integrity and application availability using redundant hardware, shared disk storage, power management, robust cluster communication, and robust application failover mechanisms.
Administrators can also deploy highly available application services using Piranha, a load-balancing and advanced routing cluster solution based on Linux Virtual Server (LVS) technology. Using Piranha, administrators can build highly available e-commerce sites that feature complete data integrity and service availability, in addition to load balancing capabilities. Refer to Part II Configuring a Linux Virtual Server Cluster for more information.
This guide assumes that the user has an advanced working knowledge of Red Hat Enterprise Linux and understands the concepts of server computing. For more information about using Red Hat Enterprise Linux, refer to the following resources:
Red Hat Enterprise Linux Installation Guide for information regarding installation.
Red Hat Enterprise Linux Introduction to System Administration for introductory information for new Red Hat Enterprise Linux system administrators.
Red Hat Enterprise Linux System Administration Guide for more detailed information about configuring Red Hat Enterprise Linux to suit your particular needs as a user.
Red Hat Enterprise Linux Reference Guide provides detailed information suited for more experienced users to reference when needed, as opposed to step-by-step instructions.
Red Hat Enterprise Linux Security Guide details the planning and the tools involved in creating a secured computing environment for the data center, workplace, and home.
HTML, PDF, and RPM versions of the manuals are available on the Red Hat Enterprise Linux Documentation CD and online at:
http://www.redhat.com/docs/
1. How To Use This Manual
This manual contains information about setting up a Red Hat Cluster Manager system. These tasks are described in the following chapters:
Chapter 2 Hardware Installation and Operating System Configuration
Chapter 3 Installing and Configuring Red Hat Cluster Suite Software
Part II Configuring a Linux Virtual Server Cluster describes how to achieve load balancing in a Red Hat Enterprise Linux cluster by using the Linux Virtual Server.
Appendix A Supplementary Hardware Information contains detailed configuration information on specific hardware devices and shared storage configurations.
Appendix B Selectively Installing Red Hat Cluster Suite Packages contains information about custom installation of Red Hat Cluster Suite and Red Hat GFS RPMs.
Appendix C Multipath-usage.txt File for Red Hat Enterprise Linux 4 Update 3 contains information from the Multipath-usage.txt file. The file provides guidelines for using dm-multipath with Red Hat Cluster Suite for Red Hat Enterprise Linux 4 Update 3.
This guide assumes you have a thorough understanding of Red Hat Enterprise Linux system administration concepts and tasks. For detailed information on Red Hat Enterprise Linux system administration, refer to the Red Hat Enterprise Linux System Administration Guide. For reference information on Red Hat Enterprise Linux, refer to the Red Hat Enterprise Linux Reference Guide.
2. Document Conventions
In this manual, certain words are represented in different fonts, typefaces, sizes, and weights. This highlighting is systematic; different words are represented in the same style to indicate their inclusion in a specific category. The types of words that are represented this way include the following:
command
Linux commands (and other operating system commands, when used) are represented this way. This style should indicate to you that you can type the word or phrase on the command line and press [Enter] to invoke a command. Sometimes a command contains words that would be displayed in a different style on their own (such as file names). In these cases, they are considered to be part of the command, so the entire phrase is displayed as a command. For example:
Use the cat testfile command to view the contents of a file, named testfile, in the current working directory.
file name
File names, directory names, paths, and RPM package names are represented this way. This style indicates that a particular file or directory exists with that name on your system. Examples:
The .bashrc file in your home directory contains bash shell definitions and aliases for your own use.
The /etc/fstab file contains information about different system devices and file systems.
Install the webalizer RPM if you want to use a Web server log file analysis program.
application
This style indicates that the program is an end-user application (as opposed to system software). For example:
Use Mozilla to browse the Web.
[key]
A key on the keyboard is shown in this style. For example:
To use [Tab] completion, type in a character and then press the [Tab] key. Your terminal displays the list of files in the directory that start with that letter.
[key]-[combination]
A combination of keystrokes is represented in this way. For example:
The [Ctrl]-[Alt]-[Backspace] key combination exits your graphical session and returns you to the graphical login screen or the console.
text found on a GUI interface
A title, word, or phrase found on a GUI interface screen or window is shown in this style. Text shown in this style identifies a particular GUI screen or an element on a GUI screen (such as text associated with a checkbox or field). Example:
Select the Require Password checkbox if you would like your screensaver to require a password before stopping.
top level of a menu on a GUI screen or window
A word in this style indicates that the word is the top level of a pulldown menu. If you click on the word on the GUI screen, the rest of the menu should appear. For example:
Under File on a GNOME terminal, the New Tab option allows you to open multiple shell prompts in the same window.
Instructions to type in a sequence of commands from a GUI menu look like the following example:
Go to Applications (the main menu on the panel) => Programming => Emacs Text Editor to start the Emacs text editor.
button on a GUI screen or window
This style indicates that the text can be found on a clickable button on a GUI screen. For example:
Click on the Back button to return to the webpage you last viewed.
computer output
Text in this style indicates text displayed to a shell prompt such as error messages and responses to commands. For example:
The ls command displays the contents of a directory. For example:
Desktop    about.html    logs    paulwesterberg.png
Mail       backupfiles   mail    reports
The output returned in response to the command (in this case, the contents of the directory) is shown in this style.
prompt
A prompt, which is a computer’s way of signifying that it is ready for you to input something, is shown in this style. Examples:
$
#
[stephen@maturin stephen]$
leopard login:
user input
Text that the user types, either on the command line or into a text box on a GUI screen, is displayed in this style. In the following example, text is displayed in this style:
To boot your system into the text based installation program, you must type in the text command at the boot: prompt.
replaceable
Text used in examples that is meant to be replaced with data provided by the user is displayed in this style. In the following example, version-number is displayed in this style:
The directory for the kernel source is /usr/src/kernels/version-number/, where version-number is the version and type of kernel installed on this system.
Additionally, we use several different strategies to draw your attention to certain pieces of information. In order of urgency, these items are marked as a note, tip, important, caution, or warning. For example:
Note
Remember that Linux is case sensitive. In other words, a rose is not a ROSE is not a rOsE.
Tip
The directory /usr/share/doc/ contains additional documentation for packages installed on your system.
Important
If you modify the DHCP configuration file, the changes do not take effect until you restart the DHCP daemon.
Caution
Do not perform routine tasks as root — use a regular user account unless you need to use the root account for system administration tasks.
Warning
Be careful to remove only the necessary partitions. Removing other partitions could result in data loss or a corrupted system environment.
3. More to Come
This manual is part of Red Hat’s growing commitment to provide useful and timely support to Red Hat Enterprise Linux users.
3.1. Send in Your Feedback
If you spot a typo, or if you have thought of a way to make this manual better, we would love to hear from you. Please submit a report in Bugzilla (http://bugzilla.redhat.com/bugzilla/) against the component rh-cs.
Be sure to mention the manual’s identifier:
rh-cs(EN)-4-Print-RHI (2007-01-05T17:28)
By mentioning this manual’s identifier, we know exactly which version of the guide you have.
If you have a suggestion for improving the documentation, try to be as specific as possible. If you have found an error, please include the section number and some of the surrounding text so we can find it easily.
4. Activate Your Subscription
Before you can access service and software maintenance information, and the support documentation included in your subscription, you must activate your subscription by registering with Red Hat. Registration includes these simple steps:
Provide a Red Hat login
Provide a subscription number
Connect your system
The first time you boot your installation of Red Hat Enterprise Linux, you are prompted to register with Red Hat using the Setup Agent. If you follow the prompts during the Setup Agent, you can complete the registration steps and activate your subscription.
If you can not complete registration during the Setup Agent (which requires network access), you can alternatively complete the Red Hat registration process online at http://www.redhat.com/register/.
4.1. Provide a Red Hat Login
If you do not have an existing Red Hat login, you can create one when prompted during the Setup Agent or online at:
https://www.redhat.com/apps/activate/newlogin.html
A Red Hat login enables your access to:
Software updates, errata and maintenance via Red Hat Network
Red Hat technical support resources, documentation, and Knowledgebase
If you have forgotten your Red Hat login, you can search for your Red Hat login online at:
https://rhn.redhat.com/help/forgot_password.pxt
4.2. Provide Your Subscription Number
Your subscription number is located in the package that came with your order. If your package did not include a subscription number, your subscription was activated for you and you can skip this step.
You can provide your subscription number when prompted during the Setup Agent or by visiting http://www.redhat.com/register/.
4.3. Connect Your System
The Red Hat Network Registration Client helps you connect your system so that you can begin to get updates and perform systems management. There are three ways to connect:
1. During the Setup Agent — Check the Send hardware information and Send system package list options when prompted.
2. After the Setup Agent has been completed — From Applications (the main menu on the panel), go to System Tools, then select Red Hat Network.
3. After the Setup Agent has been completed — Enter the following command from the command line as the root user:
/usr/bin/up2date --register
I. Using the Red Hat Cluster Manager
Clustered systems provide reliability, scalability, and availability to critical production services. Using the Red Hat Cluster Manager, administrators can create high availability clusters for filesharing, Web servers, and more. This part discusses the installation and configuration of cluster systems using the recommended hardware and Red Hat Enterprise Linux.
This section is licensed under the GNU Free Documentation License. For details refer to the Copyright page.
Table of Contents
1. Red Hat Cluster Manager Overview............................................................................1
2. Hardware Installation and Operating System Configuration ...................................9
3. Installing and Configuring Red Hat Cluster Suite Software ...................................35
4. Cluster Administration................................................................................................67
5. Setting Up Apache HTTP Server ...............................................................................77
Chapter 1.
Red Hat Cluster Manager Overview
Red Hat Cluster Manager allows administrators to connect separate systems (called members or nodes) together to create failover clusters that ensure application availability and data integrity under several failure conditions. Administrators can use Red Hat Cluster Manager with database applications, file sharing services, web servers, and more.
To set up a failover cluster, you must connect the nodes to the cluster hardware, and configure the nodes into the cluster environment. The foundation of a cluster is an advanced host membership algorithm. This algorithm ensures that the cluster maintains complete data integrity by using the following methods of inter-node communication:
Network connections between the cluster systems
A Cluster Configuration System daemon (ccsd) that synchronizes configuration between cluster nodes
To make an application and the data it uses highly available in a cluster, you must configure a cluster service for that application. A cluster service is made up of cluster resources, components that can be failed over from one node to another, such as an IP address, an application initialization script, or a Red Hat GFS shared partition. Building a cluster using Red Hat Cluster Manager allows transparent client access to cluster services. For example, you can provide clients with access to highly-available database applications by building a cluster service using Red Hat Cluster Manager to manage service availability and shared Red Hat GFS storage partitions for the database data and end-user applications.
You can associate a cluster service with a failover domain, a subset of cluster nodes that are eligible to run a particular cluster service. In general, any eligible, properly-configured node can run the cluster service. However, each cluster service can run on only one cluster node at a time in order to maintain data integrity. You can specify whether or not the nodes in a failover domain are ordered by preference. You can also specify whether or not a cluster service is restricted to run only on nodes of its associated failover domain. (When associated with an unrestricted failover domain, a cluster service can be started on any cluster node in the event no member of the failover domain is available.)
You can set up an active-active configuration in which the members run different cluster services simultaneously, or a hot-standby configuration in which primary members run all the cluster services, and a backup member takes over only if a primary member fails.
If a hardware or software failure occurs, the cluster automatically restarts the failed node’s cluster services on the functional node. This cluster-service failover capability ensures that no data is lost, and there is little disruption to users. When the failed node recovers, the cluster can re-balance the cluster services across the nodes.
In addition, you can cleanly stop the cluster services running on a cluster system and then restart them on another system. This cluster-service relocation capability allows you to maintain application and data availability when a cluster node requires maintenance.
1.1. Red Hat Cluster Manager Features
Cluster systems deployed with Red Hat Cluster Manager include the following features:
No-single-point-of-failure hardware configuration
Clusters can include a dual-controller RAID array, multiple bonded network channels, multiple paths between cluster members and storage, and redundant uninterruptible power supply (UPS) systems to ensure that no single failure results in application down time or loss of data.
Note
For information about using dm-multipath with Red Hat Cluster Suite, refer to Appendix C Multipath-usage.txt File for Red Hat Enterprise Linux 4 Update 3.
Alternatively, a low-cost cluster can be set up to provide less availability than a no-single-point-of-failure cluster. For example, you can set up a cluster with a single-controller RAID array and only a single Ethernet channel.
Certain low-cost alternatives, such as host RAID controllers, software RAID without cluster support, and multi-initiator parallel SCSI configurations are not compatible or appropriate for use as shared cluster storage.
Cluster configuration and administration framework
Red Hat Cluster Manager allows you to easily configure and administer cluster services to make resources such as applications, server daemons, and shared data highly available. To create a cluster service, you specify the resources used in the cluster service as well as the properties of the cluster service, such as the cluster service name, application initialization (init) scripts, disk partitions, mount points, and the cluster nodes on which you prefer the cluster service to run. After you add a cluster service, the cluster management software stores the information in a cluster configuration file, and the configuration data is aggregated to all cluster nodes using the Cluster Configuration System (or CCS), a daemon installed on each cluster node that allows retrieval of changes to the XML-based /etc/cluster/cluster.conf configuration file. (A minimal sketch of this file appears after this feature list.)
Red Hat Cluster Manager provides an easy-to-use framework for database applications. For example, a database cluster service serves highly-available data to a database application. The application running on a cluster node provides network access to database client systems, such as Web applications. If the cluster service fails over to another node, the application can still access the shared database data. A network-accessible database cluster service is usually assigned an IP address, which is failed over along with the cluster service to maintain transparent access for clients.
The cluster-service framework can also easily extend to other applications through the use of customized init scripts.
Cluster administration user interface
The Red Hat Cluster Suite management graphical user interface (GUI) facilitates the administration and monitoring tasks of cluster resources such as the following: creating, starting, and stopping cluster services; relocating cluster services from one node to another; modifying the cluster service configuration; and monitoring the cluster nodes. The CMAN interface allows administrators to individually control the cluster on a per-node basis.
Failover domains
By assigning a cluster service to a restricted failover domain, you can limit the nodes that are eligible to run a cluster service in the event of a failover. (A cluster service that is assigned to a restricted failover domain cannot be started on a cluster node that is not included in that failover domain.) You can order the nodes in a failover domain by preference to ensure that a particular node runs the cluster service (as long as that node is active). If a cluster service is assigned to an unrestricted failover domain, the cluster service can start on any available cluster node when no node in the failover domain is available.
Data integrity assurance
To ensure data integrity, only one node can run a cluster service and access cluster-service data at one time. The use of power switches in the cluster hardware configuration enables a node to power-cycle another node before restarting that node's cluster services during the failover process. This prevents any two systems from simultaneously accessing the same data and corrupting it. It is strongly recommended that fence devices (hardware or software solutions that remotely power, shutdown, and reboot cluster nodes) are used to guarantee data integrity under all failure conditions. Watchdog timers are an alternative used to ensure correct operation of cluster service failover.
Ethernet channel bonding
To monitor the health of the other nodes, each node monitors the health of the remote power switch, if any, and issues heartbeat pings over network channels. With Ethernet channel bonding, multiple Ethernet interfaces are configured to behave as one, reducing the risk of a single point of failure in the typical switched Ethernet connection between systems.
Cluster-service failover capability
If a hardware or software failure occurs, the cluster takes the appropriate action to maintain application availability and data integrity. For example, if a node completely fails, a healthy node (in the associated failover domain, if used) starts the service or services that the failed node was running prior to failure. Cluster services already running on the healthy node are not significantly disrupted during the failover process.
Note
For Red Hat Cluster Suite 4, node health is monitored through a cluster network heartbeat. In previous versions of Red Hat Cluster Suite, node health was monitored on shared disk. Shared disk is not required for node-health monitoring in Red Hat Cluster Suite 4.
When a failed node reboots, it can rejoin the cluster and resume running the cluster service. Depending on how the cluster services are configured, the cluster can rebalance services among the nodes.
Manual cluster-service relocation capability
In addition to automatic cluster-service failover, a cluster allows you to cleanly stop cluster services on one node and restart them on another node. You can perform planned maintenance on a node system while continuing to provide application and data availability.
Event logging facility
To ensure that problems are detected and resolved before they affect cluster-service availability, the cluster daemons log messages by using the conventional Linux syslog subsystem.
Application monitoring
The infrastructure in a cluster monitors the state and health of an application. In this manner, should an application-specific failure occur, the cluster automatically restarts the application. In response to the application failure, the cluster first attempts to restart the application on the node it was initially running on; failing that, the application is restarted on another cluster node. You can specify which nodes are eligible to run a cluster service by assigning a failover domain to the cluster service.
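The following is a minimal, hypothetical sketch of the /etc/cluster/cluster.conf file referenced above, as it might look for a two-node cluster with an APC fence device and a single service. All node names, addresses, credentials, and service details are placeholder values, and the exact elements and attributes vary by Red Hat Cluster Suite release and fence hardware; in practice the file is generated and modified with the Cluster Configuration Tool rather than written by hand.

<?xml version="1.0"?>
<cluster name="example_cluster" config_version="1">
  <!-- Member nodes and the fencing method used for each -->
  <clusternodes>
    <clusternode name="node1.example.com" votes="1">
      <fence>
        <method name="1">
          <device name="apc-switch" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2.example.com" votes="1">
      <fence>
        <method name="1">
          <device name="apc-switch" port="2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <!-- Fence device definitions referenced by the methods above -->
  <fencedevices>
    <fencedevice name="apc-switch" agent="fence_apc"
                 ipaddr="10.0.0.5" login="apc" passwd="apc"/>
  </fencedevices>
  <!-- Resource manager: failover domain, resources, and the cluster service -->
  <rm>
    <failoverdomains>
      <failoverdomain name="webdomain" ordered="0" restricted="0">
        <failoverdomainnode name="node1.example.com" priority="1"/>
        <failoverdomainnode name="node2.example.com" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <service name="webservice" domain="webdomain" autostart="1">
      <ip address="10.0.0.100" monitor_link="1"/>
      <script name="httpd-init" file="/etc/init.d/httpd"/>
    </service>
  </rm>
</cluster>

The structure mirrors the concepts described in this section: clusternode entries with per-node fencing methods, fencedevice definitions used by those methods, and a resource manager (rm) section that ties resources and a failover domain into a cluster service.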
1.1.1. Red Hat Cluster Manager Subsystem Overview
Table 1-1 summarizes the Red Hat Cluster Manager software subsystems and their components.
Cluster Configuration Tool
    system-config-cluster: Command used to manage cluster configuration in a graphical setting.

Cluster Configuration System (CCS)
    ccs_tool: Notifies ccsd of an updated cluster.conf file. Also used for upgrading a configuration file from a Red Hat GFS 6.0 (or earlier) cluster to the format of the Red Hat Cluster Suite 4 configuration file.
    ccs_test: Diagnostic and testing command that is used to retrieve information from configuration files through ccsd.
    ccsd: CCS daemon that runs on all cluster nodes and provides configuration file data to cluster software.

Resource Group Manager (rgmanager)
    clusvcadm: Command used to manually enable, disable, relocate, and restart user services in a cluster.
    clustat: Command used to display the status of the cluster, including node membership and services running.
    clurgmgrd: Daemon used to handle user service requests, including service start, service disable, service relocate, and service restart.

Fence
    fence_ack_manual: User interface for the fence_manual agent.
    fence_apc: Fence agent for APC power switch.
    fence_bladecenter: Fence agent for IBM BladeCenters with Telnet interface.
    fence_brocade: Fence agent for Brocade Fibre Channel switch.
    fence_bullpap: Fence agent for Bull NovaScale Platform Administration Processor (PAP) interface.
    fence_drac: Fence agent for Dell Remote Access Controller/Modular Chassis (DRAC/MC).
    fence_egenera: Fence agent used with Egenera BladeFrame system.
    fence_gnbd: Fence agent used with GNBD storage.
    fence_ilo: Fence agent for HP iLO interfaces (formerly fence_rib).
    fence_ipmilan: Fence agent for Intelligent Platform Management Interface (IPMI).
    fence_manual: Fence agent for manual interaction. Note: Manual fencing is not supported for production environments.
    fence_mcdata: Fence agent for McData Fibre Channel switch.
    fence_node: Command used by lock_gulmd when a fence operation is required. This command takes the name of a node and fences it based on the node's fencing configuration.
    fence_rps10: Fence agent for WTI Remote Power Switch, Model RPS-10 (only used with two-node clusters).
    fence_rsa: Fence agent for IBM Remote Supervisor Adapter II (RSA II).
    fence_sanbox2: Fence agent for SANBox2 Fibre Channel switch.
    fence_vixel: Fence agent for Vixel Fibre Channel switch.
    fence_wti: Fence agent for WTI power switch.
    fenced: The fence daemon. Manages the fence domain.

DLM
    libdlm.so.1.0.0: Library for Distributed Lock Manager (DLM) support.
    dlm.ko: Kernel module that is installed on cluster nodes for Distributed Lock Manager (DLM) support.

LOCK_GULM
    lock_gulm.o: Kernel module that is installed on GFS nodes using the LOCK_GULM lock module.
    lock_gulmd: Server/daemon that runs on each node and communicates with all nodes in a GFS cluster.
    libgulm.so.xxx: Library for GULM lock manager support.
    gulm_tool: Command that configures and debugs the lock_gulmd server.

LOCK_NOLOCK
    lock_nolock.o: Kernel module installed on a node using GFS as a local file system.

GNBD
    gnbd.o: Kernel module that implements the GNBD device driver on clients.
    gnbd_serv.o: Kernel module that implements the GNBD server. It allows a node to export local storage over the network.
    gnbd_export: Command to create, export, and manage GNBDs on a GNBD server.
    gnbd_import: Command to import and manage GNBDs on a GNBD client.

Table 1-1. Red Hat Cluster Manager Software Subsystem Components
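As an illustrative example of how the rgmanager commands listed above are typically used (the service and node names below are placeholders, not part of any default configuration), an administrator might check cluster status and move a service as follows:

# Display node membership and the status of configured services.
clustat

# Relocate the service "webservice" to the node "node2.example.com".
clusvcadm -r webservice -m node2.example.com

# Disable the service, then enable it again.
clusvcadm -d webservice
clusvcadm -e webservice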
Chapter 2.
Hardware Installation and Operating System Configuration
To set up the hardware configuration and install Red Hat Enterprise Linux, follow these steps:
Choose a cluster hardware configuration that meets the needs of applications and users; refer to Section 2.1 Choosing a Hardware Configuration.
Set up and connect the members and the optional console switch and network switch or hub; refer to Section 2.3 Setting Up the Nodes.
Install and configure Red Hat Enterprise Linux on the cluster members; refer to Section 2.4 Installing and Configuring Red Hat Enterprise Linux.
Set up the remaining cluster hardware components and connect them to the members; refer to Section 2.5 Setting Up and Connecting the Cluster Hardware.
After setting up the hardware configuration and installing Red Hat Enterprise Linux, install the cluster software.
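Installing the cluster software is covered in Chapter 3 and Appendix B; purely as a hedged sketch, on a node registered with Red Hat Network the core Red Hat Cluster Suite packages can usually be pulled in with up2date. The package list below is typical for Red Hat Cluster Suite 4, but verify the exact set (including the kernel-module package variant that matches your kernel) against Appendix B Selectively Installing Red Hat Cluster Suite Packages.

# Install the core cluster packages from Red Hat Network (run as root).
up2date ccs cman cman-kernel dlm dlm-kernel fence magma magma-plugins \
    rgmanager system-config-cluster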
2.1. Choosing a Hardware Configuration
The Red Hat Cluster Manager allows administrators to use commodity hardware to set up a cluster configuration that meets the performance, availability, and data integrity needs of applications and users. Cluster hardware ranges from low-cost minimum configurations that include only the components required for cluster operation, to high-end configurations that include redundant Ethernet channels, hardware RAID, and power switches.
Regardless of configuration, the use of high-quality hardware in a cluster is recommended, as hardware malfunction is a primary cause of system down time.
Although all cluster configurations provide availability, only some configurations protect against every single point of failure. In addition, all cluster configurations provide data integrity, but only some configurations protect data under every failure condition. Therefore, administrators must fully understand the needs of their computing environment and also the availability and data integrity features of different hardware configurations to choose the cluster hardware that meets the requirements.
When choosing a cluster hardware configuration, consider the following:
Performance requirements of applications and users
Choose a hardware configuration that provides adequate memory, CPU, and I/O resources. Be sure that the configuration chosen can handle any future increases in workload as well.
Cost restrictions
The hardware configuration chosen must meet budget requirements. For example, systems with multiple I/O ports usually cost more than low-end systems with fewer expansion capabilities.
Availability requirements
In a mission-critical production environment, a cluster hardware configuration must protect against all single points of failure, including: disk, storage interconnect, Ethernet channel, and power failure. Environments that can tolerate an interruption in availability (such as development environments) may not require as much protection.
Data integrity under all failure conditions requirement
Using fence devices in a cluster configuration ensures that service data is protected under every failure condition. These devices enable a node to power cycle another node before restarting its services during failover. Power switches protect against data corruption in cases where an unresponsive (or hung) node tries to write data to the disk after its replacement node has taken over its services.
If you are not using power switches in the cluster, cluster service failures can result in services being run on more than one node, which can cause data corruption. Refer to Section 2.5.2 Configuring a Fence Device for more information about the benefits of using power switches in a cluster. It is required that production environments use power switches in the cluster hardware configuration.
2.1.1. Minimum Hardware Requirements
A minimum hardware configuration includes only the hardware components that are required for cluster operation, as follows:
At least two servers to run cluster services
Ethernet connection for sending heartbeat pings and for client network access
Network switch or hub to connect cluster nodes and resources
A fence device
The hardware components described in Table 2-1 can be used to set up a minimum cluster configuration. This configuration does not ensure data integrity under all failure conditions, because it does not include power switches. Note that this is a sample configuration; it is possible to set up a minimum configuration using other hardware.
Warning
The minimum cluster configuration is not a supported solution and should not be used in a production environment, as it does not ensure data integrity under all failure conditions.
At least two server systems: Each system becomes a node exclusively for use in the cluster; system hardware requirements are similar to that of Red Hat Enterprise Linux 4.
One network interface card (NIC) for each node: One network interface connects to a hub or switch for cluster connectivity.
Network cables with RJ45 connectors: Network cables connect to the network interface on each node for client access and heartbeat packets.
RAID storage enclosure: The RAID storage enclosure contains one controller with at least two host ports.
Two HD68 SCSI cables: Each cable connects one host bus adapter to one port on the RAID controller, creating two single-initiator SCSI buses.

Table 2-1. Example of Minimum Cluster Configuration
The minimum hardware configuration is a cost-effective cluster configuration for development purposes; however, it contains components that can cause a service outage if they fail. For example, if the RAID controller fails, then all cluster services become unavailable.
To improve availability, protect against component failure, and ensure data integrity under all failure conditions, more hardware is required. Refer to Table 2-2.
Disk failure: Hardware RAID to replicate data across multiple disks.
RAID controller failure: Dual RAID controllers to provide redundant access to disk data.
Network interface failure: Ethernet channel bonding and failover.
Power source failure: Redundant uninterruptible power supply (UPS) systems.
Machine failure: Power switches.

Table 2-2. Improving Availability and Data Integrity
Figure 2-1 illustrates a hardware configuration with improved availability. This configuration uses a fence device (in this case, a network-attached power switch) and the nodes are configured for Red Hat GFS storage attached to a Fibre Channel SAN switch. For more information about configuring and using Red Hat GFS, refer to the Red Hat GFS Administrator's Guide.
Figure 2-1. Hardware Configuration for Improved Availability
A hardware configuration that ensures data integrity under failure conditions can include the following components:
At least two servers to run cluster services
Switched Ethernet connection between each node for heartbeat pings and for client network access
Dual-controller RAID array or redundant access to SAN or other storage
Network power switches to enable each node to power-cycle the other nodes during the failover process
Ethernet interfaces configured to use channel bonding
At least two UPS systems for a highly-available source of power
The components described in Table 2-3 can be used to set up a no single point of failure cluster configuration that includes two single-initiator SCSI buses and power switches to ensure data integrity under all failure conditions. Note that this is a sample configuration; it is possible to set up a no single point of failure configuration using other hardware.
Two servers (up to 16 supported): Each node includes two network interfaces, one for client network access and one for the fence device connection.
One network switch: A network switch enables the connection of multiple nodes to a network.
Three network cables (each node): Two cables to connect each node to the redundant network switches and a cable to connect to the fence device.
Two RJ45 to DB9 crossover cables: RJ45 to DB9 crossover cables connect a serial port on each node to the Cyclades terminal server.
Two power switches: Power switches enable each node to power-cycle the other node before restarting its services. Two RJ45 Ethernet cables for a node are connected to each switch.
FlashDisk RAID Disk Array with dual controllers: Dual RAID controllers protect against disk and controller failure. The RAID controllers provide simultaneous access to all the logical units on the host ports.
Two HD68 SCSI cables: HD68 cables connect each host bus adapter to a RAID enclosure "in" port, creating two single-initiator SCSI buses.
Two terminators: Terminators connected to each "out" port on the RAID enclosure terminate both single-initiator SCSI buses.
Redundant UPS systems: UPS systems provide a highly-available source of power. The power cables for the power switches and the RAID enclosure are connected to two UPS systems.

Table 2-3. Example of a No Single Point of Failure Configuration
Cluster hardware configurations can also include other optional hardware components that are common in a computing environment. For example, a cluster can include a network switch or network hub, which enables the connection of the nodes to a network. A cluster may also include a console switch, which facilitates the management of multiple nodes and eliminates the need for separate monitors, mice, and keyboards for each node.
One type of console switch is a terminal server, which enables connection to serial consoles and management of many nodes from one remote location. As a low-cost alternative, you can use a KVM (keyboard, video, and mouse) switch, which enables multiple nodes to share one keyboard, monitor, and mouse. A KVM switch is suitable for configurations in which access to a graphical user interface (GUI) to perform system management tasks is preferred.
When choosing a system, be sure that it provides the required PCI slots, network slots, and serial ports. For example, a no single point of failure configuration requires multiple bonded Ethernet ports. Refer to Section 2.3.1 Installing the Basic Cluster Hardware for more information.
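Ethernet channel bonding itself is configured after the operating system is installed (refer to Section 2.5.1 Configuring Ethernet Channel Bonding); the following is only a sketch of what that configuration typically looks like on Red Hat Enterprise Linux 4, with placeholder device names and IP address, so that you can budget the required Ethernet ports when selecting systems.

# /etc/modprobe.conf -- load the bonding driver; mode=1 is active-backup.
alias bond0 bonding
options bonding miimon=100 mode=1

# /etc/sysconfig/network-scripts/ifcfg-bond0 -- the bonded interface (example address).
DEVICE=bond0
IPADDR=10.0.0.11
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth0 -- first slave; ifcfg-eth1 is identical
# except for the DEVICE line.
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none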
2.1.2. Choosing the Type of Fence Device
The Red Hat Cluster Manager implementation consists of a generic power management layer and a set of device-specific modules which accommodate a range of power management types. When selecting the appropriate type of fence device to deploy in the cluster, it is important to recognize the implications of specific device types.
Important
Use of a fencing method is an integral part of a production cluster environment. Configuration of a cluster without a fence device is not supported.
Red Hat Cluster Manager supports several types of fencing methods, including network power switches, fabric switches, and Integrated Power Management hardware. Table 2-5 summarizes the supported types of fence devices and some examples of brands and models that have been tested with Red Hat Cluster Manager.
Ultimately, choosing the right type of fence device to deploy in a cluster environment depends on the data integrity requirements versus the cost and availability of external power switches.
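Whichever fence device you choose, it is worth confirming that it can actually be driven from each node before relying on it. As a hypothetical example only (the address, credentials, and outlet number are placeholders, and options differ between agents; consult the agent's man page), a network-attached APC switch could be exercised from the command line like this:

# Power-cycle the machine plugged into outlet 1 of the APC switch (example values).
fence_apc -a 10.0.0.5 -l apc -p apc -n 1 -o reboot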
2.2. Cluster Hardware Components
Use the following section to identify the hardware components required for the cluster configuration.
Cluster nodes
    Quantity: 16 (maximum supported)
    Required: Yes
    Description: Each node must provide enough PCI slots, network slots, and storage adapters for the cluster hardware configuration. Because attached storage devices must have the same device special file on each node, it is recommended that the nodes have symmetric I/O subsystems. It is also recommended that the processor speed and amount of system memory be adequate for the processes run on the cluster nodes. Refer to Section 2.3.1 Installing the Basic Cluster Hardware for more information.

Table 2-4. Cluster Node Hardware
Table 2-5 includes several different types of fence devices.
A single cluster requires only one type of power switch.
Network-attached power switches
    Description: Remote (LAN, Internet) fencing using RJ45 Ethernet connections and remote terminal access to the device.
    Models: APC MasterSwitch 92xx/96xx; WTI NPS-115/NPS-230, IPS-15, IPS-800/IPS-800-CE, and TPS-2

Fabric switches
    Description: Fence control interface integrated in several models of fabric switches used for Storage Area Networks (SANs). Used as a way to fence a failed node from accessing shared data.
    Models: Brocade Silkworm 2x00, McData Sphereon, Vixel 9200

Integrated power management interfaces
    Description: Remote power management features in various brands of server systems; can be used as a fencing agent in cluster systems.
    Models: HP Integrated Lights-Out (iLO), IBM BladeCenter with firmware dated 7-22-04 or later

Table 2-5. Fence Devices
Table 2-6 through Table 2-9 show a variety of hardware components for an administrator to choose from. An individual cluster does not require all of the components listed in these tables.
Network interface
    Quantity: One for each network connection
    Required: Yes
    Description: Each network connection requires a network interface installed in a node.

Network switch or hub
    Quantity: One
    Required: Yes
    Description: A network switch or hub allows connection of multiple nodes to a network.

Network cable
    Quantity: One for each network interface
    Required: Yes
    Description: A conventional network cable, such as a cable with an RJ45 connector, connects each network interface to a network switch or a network hub.

Table 2-6. Network Hardware Table
Host bus adapter
    Quantity: One per node
    Required: Yes
    Description: To connect to shared disk storage, install either a parallel SCSI or a Fibre Channel host bus adapter in a PCI slot in each cluster node. For parallel SCSI, use a low voltage differential (LVD) host bus adapter. Adapters have either HD68 or VHDCI connectors.

External disk storage enclosure
    Quantity: At least one
    Required: Yes
    Description: Use Fibre Channel or single-initiator parallel SCSI to connect the cluster nodes to a single or dual-controller RAID array. To use single-initiator buses, a RAID controller must have multiple host ports and provide simultaneous access to all the logical units on the host ports. To use a dual-controller RAID array, a logical unit must fail over from one controller to the other in a way that is transparent to the operating system. SCSI RAID arrays that provide simultaneous access to all logical units on the host ports are recommended. To ensure symmetry of device IDs and LUNs, many RAID arrays with dual redundant controllers must be configured in an active/passive mode. Refer to Appendix A Supplementary Hardware Information for more information.

SCSI cable
    Quantity: One per node
    Required: Only for parallel SCSI configurations
    Description: SCSI cables with 68 pins connect each host bus adapter to a storage enclosure port. Cables have either HD68 or VHDCI connectors. Cables vary based on adapter type.

SCSI terminator
    Quantity: As required by hardware configuration
    Required: Only for parallel SCSI configurations and only as necessary for termination
    Description: For a RAID storage enclosure that uses "out" ports (such as FlashDisk RAID Disk Array) and is connected to single-initiator SCSI buses, connect terminators to the "out" ports to terminate the buses.

Fibre Channel hub or switch
    Quantity: One or two
    Required: Only for some Fibre Channel configurations
    Description: A Fibre Channel hub or switch may be required.

Fibre Channel cable
    Quantity: As required by hardware configuration
    Required: Only for Fibre Channel configurations
    Description: A Fibre Channel cable connects a host bus adapter to a storage enclosure port, a Fibre Channel hub, or a Fibre Channel switch. If a hub or switch is used, additional cables are needed to connect the hub or switch to the storage adapter ports.

Table 2-7. Shared Disk Storage Hardware Table
UPS system
    Quantity: One or more
    Required: Strongly recommended for availability
    Description: Uninterruptible power supply (UPS) systems protect against downtime if a power outage occurs. UPS systems are highly recommended for cluster operation. Connect the power cables for the shared storage enclosure and both power switches to redundant UPS systems. Note that a UPS system must be able to provide voltage for an adequate period of time, and should be connected to its own power circuit.

Table 2-8. UPS System Hardware Table
Terminal server
    Quantity: One
    Required: No
    Description: A terminal server enables you to manage many nodes remotely.

KVM switch
    Quantity: One
    Required: No
    Description: A KVM switch enables multiple nodes to share one keyboard, monitor, and mouse. Cables for connecting nodes to the switch depend on the type of KVM switch.

Table 2-9. Console Switch Hardware Table
2.3. Setting Up the Nodes
After identifying the cluster hardware components described in Section 2.1 Choosing a Hardware Configuration, set up the basic cluster hardware and connect the nodes to the optional console switch and network switch or hub. Follow these steps:
1. In all nodes, install the required network adapters and host bus adapters. Refer to Section 2.3.1 Installing the Basic Cluster Hardware for more information about performing this task.
2. Set up the optional console switch and connect it to each node. Refer to Section 2.3.3 Setting Up a Console Switch for more information about performing this task.
If a console switch is not used, then connect each node to a console terminal.
3. Set up the network switch or hub and use network cables to connect it to the nodes and the terminal server (if applicable). Refer to Section 2.3.4 Setting Up a Network Switch or Hub for more information about performing this task.
After performing the previous tasks, install Red Hat Enterprise Linux as described in Section 2.4 Installing and Configuring Red Hat Enterprise Linux.
2.3.1. Installing the Basic Cluster Hardware
Nodes must provide the CPU processing power and memory required by applications.
In addition, nodes must be able to accommodate the SCSI or Fibre Channel adapters, network interfaces, and serial ports that the hardware configuration requires. Systems have a limited number of pre-installed serial and network ports and PCI expansion slots. Table 2-10 helps determine how much capacity the employed node systems require.
Cluster Hardware Component: SCSI or Fibre Channel adapter to shared disk storage
PCI Slots: One for each bus adapter

Cluster Hardware Component: Network connection for client access and Ethernet heartbeat pings
Ethernet Ports: One for each network connection

Cluster Hardware Component: Point-to-point Ethernet connection for 2-node clusters (optional)
Ethernet Ports: One for each connection

Cluster Hardware Component: Terminal server connection (optional)
Serial Ports: One
Table 2-10. Installing the Basic Cluster Hardware
Most systems come with at least one serial port. If a system has graphics display capability, it is possible to use the serial console port for a power switch connection. To expand your serial port capacity, use multi-port serial PCI cards. For multiple-node clusters, use a network power switch.
Also, ensure that local system disks are not on the same SCSI bus as the shared disks. For example, use two-channel SCSI adapters, such as the Adaptec 39160-series cards, and put the internal devices on one channel and the shared disks on the other channel. Using multiple SCSI cards is also possible.
Refer to the system documentation supplied by the vendor for detailed installation information. Refer to Appendix A Supplementary Hardware Information for hardware-specific information about using host bus adapters in a cluster.
2.3.2. Shared Storage Considerations
In a cluster, shared disks can be used to store cluster service data. Because this storage must be available to all nodes running the cluster service configured to use the storage, it cannot be located on disks that depend on the availability of any one node.
There are some factors to consider when setting up shared disk storage in a cluster:
• It is recommended to use a clustered file system such as Red Hat GFS to configure Red Hat Cluster Manager storage resources, as it offers shared storage that is suited for high-availability cluster services. For more information about installing and configuring Red Hat GFS, refer to the Red Hat GFS Administrator’s Guide.
• Whether you are using Red Hat GFS, local, or remote (for example, NFS) storage, it is strongly recommended that you connect any storage systems or enclosures to redundant UPS systems for a highly-available source of power. Refer to Section 2.5.3 Configuring UPS Systems for more information.
• The use of software RAID or Logical Volume Management (LVM) for shared storage is not supported. This is because these products do not coordinate access to shared storage from multiple hosts. Software RAID or LVM may be used on non-shared storage on cluster nodes (for example, boot and system partitions, and other file systems that are not associated with any cluster services).
  An exception to this rule is CLVM, the daemon and library that supports clustering of LVM2. CLVM allows administrators to configure shared storage for use as a resource in cluster services when used in conjunction with the CMAN cluster manager and the Distributed Lock Manager (DLM) mechanism for prevention of simultaneous node access to data and possible corruption. In addition, CLVM works with GULM as its cluster manager and lock manager.
• For remote file systems such as NFS, you may use gigabit Ethernet for improved bandwidth over 10/100 Ethernet connections. Consider redundant links or channel bonding for improved remote file system availability. Refer to Section 2.5.1 Configuring Ethernet Channel Bonding for more information.
• Multi-initiator SCSI configurations are not supported due to the difficulty in obtaining proper bus termination. Refer to Appendix A Supplementary Hardware Information for more information about configuring attached storage.
• A shared partition can be used by only one cluster service.
• Do not include any file systems used as a resource for a cluster service in the node’s local /etc/fstab files, because the cluster software must control the mounting and unmounting of service file systems.
• For optimal performance of shared file systems, make sure to specify a 4 KB block size with the mke2fs -b command. A smaller block size can cause long fsck times. Refer to Section 2.5.3.2 Creating File Systems.
After setting up the shared disk storage hardware, partition the disks and create file systems on the partitions. Refer to Section 2.5.3.1 Partitioning Disks, and Section 2.5.3.2 Creating File Systems for more information on configuring disks.
2.3.3. Setting Up a Console Switch
Although a console switch is not required for cluster operation, it can be used to facilitate node management and eliminate the need for separate monitors, mice, and keyboards for each cluster node. There are several types of console switches.
For example, a terminal server enables connection to serial consoles and management of many nodes from a remote location. For a low-cost alternative, use a KVM (keyboard, video, and mouse) switch, which enables multiple nodes to share one keyboard, monitor, and mouse. A KVM switch is suitable for configurations in which GUI access to perform system management tasks is preferred.
Set up the console switch according to the documentation provided by the vendor.
After the console switch has been set up, connect it to each cluster node. The cables used depend on the type of console switch. For example, a Cyclades terminal server uses RJ45 to DB9 crossover cables to connect a serial port on each node to the terminal server.
2.3.4. Setting Up a Network Switch or Hub
A network switch or hub, although not required for operating a two-node cluster, can be used to facilitate cluster and client system network operations. Clusters of more than two nodes require a switch or hub.
Set up a network switch or hub according to the documentation provided by the vendor.
After setting up the network switch or hub, connect it to each node by using conventional network cables. A terminal server, if used, is connected to the network switch or hub through a network cable.
2.4. Installing and Configuring Red Hat Enterprise Linux
After the setup of basic cluster hardware, proceed with installation of Red Hat Enterprise Linux on each node and ensure that all systems recognize the connected devices. Follow these steps:
1. Install Red Hat Enterprise Linux on all cluster nodes. Refer to the Red Hat Enterprise Linux Installation Guide for instructions.
In addition, when installing Red Hat Enterprise Linux, it is strongly recommended to do the following:
• Gather the IP addresses for the nodes and for the bonded Ethernet ports before installing Red Hat Enterprise Linux. Note that the IP addresses for the bonded Ethernet ports can be private IP addresses (for example, 10.x.x.x).
• Do not place local file systems (such as /, /etc, /tmp, and /var) on shared disks or on the same SCSI bus as shared disks. This helps prevent the other cluster nodes from accidentally mounting these file systems, and also reserves the limited number of SCSI identification numbers on a bus for cluster disks.
• Place /tmp and /var on different file systems. This may improve node performance.
• When a node boots, be sure that the node detects the disk devices in the same order in which they were detected during the Red Hat Enterprise Linux installation. If the devices are not detected in the same order, the node may not boot.
• When using certain RAID storage configured with Logical Unit Numbers (LUNs) greater than zero, it may be necessary to enable LUN support by adding the following to /etc/modprobe.conf:
options scsi_mod max_scsi_luns=255
2. Reboot the nodes.
3. When using a terminal server, configure Red Hat Enterprise Linux to send console messages to the console port.
4. Edit the /etc/hosts file on each cluster node and include the IP addresses used in the cluster or ensure that the addresses are in DNS. Refer to Section 2.4.1 Editing the /etc/hosts File for more information about performing this task.
5. Decrease the alternate kernel boot timeout limit to reduce boot time for nodes. Refer to Section 2.4.2 Decreasing the Kernel Boot Timeout Limit for more information about performing this task.
6. Ensure that no login (or getty) programs are associated with the serial ports that are being used for the remote power switch connection (if applicable). To perform this task, edit the /etc/inittab file and use a hash symbol (#) to comment out the entries that correspond to the serial ports used for the remote power switch. Then, invoke the init q command. (An illustrative example follows this procedure.)
7. Verify that all systems detect all the installed hardware:
• Use the dmesg command to display the console startup messages. Refer to Section 2.4.3 Displaying Console Startup Messages for more information about performing this task.
• Use the cat /proc/devices command to display the devices configured in the kernel. Refer to Section 2.4.4 Displaying Devices Configured in the Kernel for more information about performing this task.
8. Verify that the nodes can communicate over all the network interfaces by using the
ping command to send test packets from one node to another.
9. If intending to configure Samba services, verify that the required RPM packages for Samba services are installed.
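As an illustration of step 6, if a node's /etc/inittab contained a getty entry for a serial port used by the power switch connection, it would be commented out and the change applied as shown below. The entry name, speed, and terminal type vary by system and are shown here only as assumptions:

# s1:2345:respawn:/sbin/agetty ttyS1 9600 vt100

init q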
2.4.1. Editing the /etc/hosts File
The /etc/hosts file contains the IP address-to-hostname translation table. The
/etc/hosts file on each node must contain entries for IP addresses and associated
hostnames for all cluster nodes.
As an alternative to the /etc/hosts file, name services such as DNS or NIS can be used to define the host names used by a cluster. However, to limit the number of dependencies and optimize availability, it is strongly recommended to use the /etc/hosts file to define IP addresses for cluster network interfaces.
The following is an example of an /etc/hosts file on a node of a cluster that does not use DNS-assigned hostnames:
127.0.0.1 localhost.localdomain localhost
192.168.1.81 node1.example.com node1
192.168.1.82 node2.example.com node2
192.168.1.83 node3.example.com node3
The previous example shows the IP addresses and hostnames for three nodes (node1, node2, and node3).
Important
Do not assign the node hostname to the localhost (127.0.0.1) address, as this causes issues with the CMAN cluster management system.
Verify correct formatting of the local host entry in the /etc/hosts file to ensure that it does not include non-local systems in the entry for the local host. An example of an incorrect local host entry that includes a non-local system (server1) is shown next:
127.0.0.1 localhost.localdomain localhost server1
An Ethernet connection may not operate properly if the format of the /etc/hosts file is not correct. Check the /etc/hosts file and correct the file format by removing non-local systems from the local host entry, if necessary.
Note that each network adapter must be configured with the appropriate IP address and netmask.
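For example, on a Red Hat Enterprise Linux 4 node the address for eth0 is normally set in /etc/sysconfig/network-scripts/ifcfg-eth0; a minimal file matching the node1 address used above might look like the following (all values are illustrative and should be adjusted to your network):

DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.1.81
NETMASK=255.255.255.0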
The following example shows a portion of the output from the /sbin/ip addr list command on a cluster node:
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1356 qdisc pfifo_fast qlen 1000
    link/ether 00:05:5d:9a:d8:91 brd ff:ff:ff:ff:ff:ff
    inet 10.11.4.31/22 brd 10.11.7.255 scope global eth0
    inet6 fe80::205:5dff:fe9a:d891/64 scope link
       valid_lft forever preferred_lft forever
You may also add the IP addresses for the cluster nodes to your DNS server. Refer to the Red Hat Enterprise Linux System Administration Guide for information on configuring DNS, or consult your network administrator.
2.4.2. Decreasing the Kernel Boot Timeout Limit
It is possible to reduce the boot time for a node by decreasing the kernel boot timeout limit. During the Red Hat Enterprise Linux boot sequence, the boot loader allows for specifying an alternate kernel to boot. The default timeout limit for specifying a kernel is ten seconds.
To modify the kernel boot timeout limit for a node, edit the appropriate files as follows:
When using the GRUB boot loader, the timeout parameter in /boot/grub/grub.conf should be modified to specify the appropriate number of seconds for the timeout parameter. To set this interval to 3 seconds, edit the parameter to the following:
timeout = 3
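In context, the top of a typical /boot/grub/grub.conf would then begin similar to the following; the remaining boot entries are left unchanged, and the splash image line is shown only as a common default:

default=0
timeout=3
splashimage=(hd0,0)/grub/splash.xpm.gz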
When using the LILO or ELILO boot loaders, edit the /etc/lilo.conf file (on x86 systems) or the elilo.conf file (on Itanium systems) and specify the desired value (in tenths of a second) for the timeout parameter. The following example sets the timeout limit to three seconds:
timeout = 30
To apply any changes made to the /etc/lilo.conf file, invoke the /sbin/lilo command.
On an Itanium system, to apply any changes made to the
/boot/efi/efi/redhat/elilo.conf file, invoke the /sbin/elilo command.
2.4.3. Displaying Console Startup Messages
Use the dmesg command to display the console startup messages. Refer to the dmesg(8) man page for more information.
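Because the startup output can be lengthy, it is sometimes convenient to filter it for a particular device class; for example, the following command (an optional convenience, not a required step) shows only the SCSI-related messages:

dmesg | grep -i scsi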
The following example of output from the dmesg command shows that two external SCSI buses and nine disks were detected on the node. (Lines with backslashes display as one line on most screens):
May 22 14:02:10 storage3 kernel: scsi0 : Adaptec AHA274x/284x/294x \
    (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
May 22 14:02:10 storage3 kernel:
May 22 14:02:10 storage3 kernel: scsi1 : Adaptec AHA274x/284x/294x \
    (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4
May 22 14:02:10 storage3 kernel:
May 22 14:02:10 storage3 kernel: scsi : 2 hosts.
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST39236LW Rev: 0004
May 22 14:02:11 storage3 kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdc at scsi1, channel 0, id 1, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdd at scsi1, channel 0, id 2, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sde at scsi1, channel 0, id 3, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdf at scsi1, channel 0, id 8, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdg at scsi1, channel 0, id 9, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdh at scsi1, channel 0, id 10, lun 0
May 22 14:02:11 storage3 kernel: Vendor: SEAGATE Model: ST318203LC Rev: 0001
May 22 14:02:11 storage3 kernel: Detected scsi disk sdi at scsi1, channel 0, id 11, lun 0
May 22 14:02:11 storage3 kernel: Vendor: Dell Model: 8 BAY U2W CU Rev: 0205
May 22 14:02:11 storage3 kernel: Type: Processor \
    ANSI SCSI revision: 03
May 22 14:02:11 storage3 kernel: scsi1 : channel 0 target 15 lun 1 request sense \
    failed, performing reset.
May 22 14:02:11 storage3 kernel: SCSI bus is being reset for host 1 channel 0.
May 22 14:02:11 storage3 kernel: scsi : detected 9 SCSI disks total.
The following example of the dmesg command output shows that a quad Ethernet card was detected on the node:
May 22 14:02:11 storage3 kernel: 3c59x.c:v0.99H 11/17/98 Donald Becker
May 22 14:02:11 storage3 kernel: tulip.c:v0.91g-ppc 7/16/99
May 22 14:02:11 storage3 kernel: eth0: Digital DS21140 Tulip rev 34 at 0x9800, \
    00:00:BC:11:76:93, IRQ 5.
May 22 14:02:12 storage3 kernel: eth1: Digital DS21140 Tulip rev 34 at 0x9400, \
    00:00:BC:11:76:92, IRQ 9.
May 22 14:02:12 storage3 kernel: eth2: Digital DS21140 Tulip rev 34 at 0x9000, \
    00:00:BC:11:76:91, IRQ 11.
May 22 14:02:12 storage3 kernel: eth3: Digital DS21140 Tulip rev 34 at 0x8800, \
    00:00:BC:11:76:90, IRQ 10.
2.4.4. Displaying Devices Configured in the Kernel
To be sure that the installed devices (such as network interfaces) are configured in the kernel, use the cat /proc/devices command on each node. For example:
Character devices:
  1 mem
  4 /dev/vc/0
  4 tty
  4 ttyS
  5 /dev/tty
  5 /dev/console
  5 /dev/ptmx
  6 lp
  7 vcs
 10 misc
 13 input
 14 sound
 29 fb
 89 i2c
116 alsa
128 ptm
136 pts
171 ieee1394
180 usb
216 rfcomm
226 drm
254 pcmcia

Block devices:
  1 ramdisk
  2 fd
  3 ide0
  8 sd
  9 md
 65 sd
 66 sd
 67 sd
 68 sd
 69 sd
 70 sd
 71 sd
128 sd
129 sd
130 sd
131 sd
132 sd
133 sd
134 sd
135 sd
253 device-mapper
The previous example shows:
• Onboard serial ports (ttyS)
• USB devices (usb)
• SCSI devices (sd)
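As a quick check on each node, the same information can be filtered for the SCSI disk driver entry; this is an optional convenience, not a required step:

grep -w sd /proc/devices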
2.5. Setting Up and Connecting the Cluster Hardware
After installing Red Hat Enterprise Linux, set up the cluster hardware components and verify the installation to ensure that the nodes recognize all the connected devices. Note that the exact steps for setting up the hardware depend on the type of configuration. Refer to Section 2.1 Choosing a Hardware Configuration for more information about cluster configurations.
To set up the cluster hardware, follow these steps:
1. Shut down the nodes and disconnect them from their power source.
2. When using power switches, set up the switches and connect each node to a power switch. Refer to Section 2.5.2 Configuring a Fence Device for more information.
In addition, it is recommended to connect each power switch (or each node’s power cord if not using power switches) to a different UPS system. Refer to Section 2.5.3 Configuring UPS Systems for information about using optional UPS systems.
3. Set up shared disk storage according to the vendor instructions and connect the nodes to the external storage enclosure. Refer to Section 2.3.2 Shared Storage Considerations.
In addition, it is recommended to connect the storage enclosure to redundant UPS systems. Refer to Section 2.5.3 Configuring UPS Systems for more information about using optional UPS systems.
4. Turn on power to the hardware, and boot each cluster node. During the boot-up process, enter the BIOS utility to modify the node setup, as follows:
• Ensure that the SCSI identification number used by the host bus adapter is unique for the SCSI bus it is attached to. Refer to Section A.3.4 SCSI Identification Numbers for more information about performing this task.
• Enable or disable the onboard termination for each host bus adapter, as required by the storage configuration. Refer to Section A.3.2 SCSI Bus Termination for more information about performing this task.
• Enable the node to automatically boot when it is powered on.
5. Exit from the BIOS utility, and continue to boot each node. Examine the startup messages to verify that the Red Hat Enterprise Linux kernel has been configured and can recognize the full set of shared disks. Use the dmesg command to display console startup messages. Refer to Section 2.4.3 Displaying Console Startup Messages for more information about using the dmesg command.
6. Set up the bonded Ethernet channels, if applicable. Refer to Section 2.5.1 Configuring Ethernet Channel Bonding for more information.
7. Run the ping command to verify packet transmission between all cluster nodes.
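For example, assuming the node names used earlier in this chapter, the following commands run from node1 verify connectivity to the other members; the -c option limits the number of test packets sent:

ping -c 3 node2
ping -c 3 node3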
2.5.1. Configuring Ethernet Channel Bonding
Ethernet channel bonding in a no-single-point-of-failure cluster system allows for a fault-tolerant network connection by combining two Ethernet devices into one virtual device. The resulting channel-bonded interface ensures that if one Ethernet device fails, the other device becomes active. This type of channel bonding, called an active-backup policy, allows connection of both bonded devices to one switch, or allows each Ethernet device to be connected to a separate hub or switch, which eliminates the single point of failure in the network hub/switch.
Channel bonding requires each cluster node to have two Ethernet devices installed. When it is loaded, the bonding module uses the MAC address of the first enslaved network device and assigns that MAC address to the other network device if the first device fails link detection.
To configure two network devices for channel bonding, perform the following:
1. Create a bonding device in /etc/modprobe.conf. For example:
alias bond0 bonding
options bonding miimon=100 mode=1
This loads the bonding device with the bond0 interface name and passes options to the bonding driver to configure it as an active-backup master device for the enslaved network interfaces.
2. Edit the /etc/sysconfig/network-scripts/ifcfg-ethX configuration file for both eth0 and eth1 so that the files show identical contents. For example:
DEVICE=ethX
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
This will enslave ethX (replace X with the assigned number of the Ethernet devices) to the bond0 master device.
3. Create a network script for the bonding device (for example,
/etc/sysconfig/network-scripts/ifcfg-bond0), which would appear like
the following example:
DEVICE=bond0
USERCTL=no
ONBOOT=yes
BROADCAST=192.168.1.255
NETWORK=192.168.1.0
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
IPADDR=192.168.1.10
4. Reboot the system for the changes to take effect.
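After the reboot, the state of the bond and its slave devices can be inspected through the status file provided by the bonding driver; for example:

cat /proc/net/bonding/bond0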
2.5.2. Configuring a Fence Device
Fence devices enable a node to power-cycle another node before restarting its services as part of the failover process. The ability to remotely disable a node ensures data integrity is maintained under any failure condition. Deploying a cluster in a production environment requires the use of a fence device. Only development (test) environments should use a configuration without a fence device. Refer to Section 2.1.2 Choosing the Type of Fence Device for a description of the various types of power switches.
In a cluster configuration that uses fence devices such as power switches, each node is connected to a switch through either a serial port (for two-node clusters) or network connection (for multi-node clusters). When failover occurs, a node can use this connection to power-cycle another node before restarting its services.
Fence devices protect against data corruption if an unresponsive (or hanging) node becomes responsive after its services have failed over, and issues I/O to a disk that is also receiving I/O from another node. In addition, if CMAN detects node failure, the failed node will be removed from the cluster. If a fence device is not used in the cluster, then a failed node may result in cluster services being run on more than one node, which can cause data corruption and possibly system crashes.
A node may appear to hang for a few seconds if it is swapping or has a high system workload. For this reason, adequate time is allowed prior to concluding that a node has failed.
If a node fails, and a fence device is used in the cluster, the fencing daemon power-cycles the hung node before restarting its services. This causes the hung node to reboot in a clean state and prevents it from issuing I/O and corrupting cluster service data.
When used, fence devices must be set up according to the vendor instructions; however, some cluster-specific tasks may be required to use them in a cluster. Consult the manufacturer documentation on configuring the fence device. Note that the cluster-specific information provided in this manual supersedes the vendor information.
When cabling a physical fence device such as a power switch, take special care to ensure that each cable is plugged into the appropriate port and configured correctly. This is crucial because there is no independent means for the software to verify correct cabling. Failure to cable correctly can lead to an incorrect node being power cycled or fenced off from shared storage via fabric-level fencing, or to a node inappropriately concluding that it has successfully power cycled a failed node.
2.5.3. Configuring UPS Systems
Uninterruptible power supplies (UPS) provide a highly-available source of power. Ideally, a redundant solution should be used that incorporates multiple UPS systems (one per server). For maximal fault-tolerance, it is possible to incorporate two UPS systems per server as well as APC Automatic Transfer Switches to manage the power and shutdown of the server. Which solution to use depends solely on the level of availability desired.
It is not recommended to use a single UPS infrastructure as the sole source of power for the cluster. A UPS solution dedicated to the cluster is more flexible in terms of manageability and availability.
A complete UPS system must be able to provide adequate voltage and current for a prolonged period of time. While there is no single UPS to fit every power requirement, a solution can be tailored to fit a particular configuration.
If the cluster disk storage subsystem has two power supplies with separate power cords, set up two UPS systems, and connect one power switch (or one node’s power cord if not using power switches) and one of the storage subsystem’s power cords to each UPS system. A redundant UPS system configuration is shown in Figure 2-2.
Figure 2-2. Redundant UPS System Configuration
An alternative redundant power configuration is to connect the power switches (or the nodes’ power cords) and the disk storage subsystem to the same UPS system. This is the most cost-effective configuration, and provides some protection against power failure. However, if a power outage occurs, the single UPS system becomes a possible single point of failure. In addition, one UPS system may not be able to provide enough power to all the attached devices for an adequate amount of time. A single UPS system configuration is shown in Figure 2-3.
Figure 2-3. Single UPS System Configuration
Many vendor-supplied UPS systems include Red Hat Enterprise Linux applications that monitor the operational status of the UPS system through a serial port connection. If the battery power is low, the monitoring software initiates a clean system shutdown. As this occurs, the cluster software is properly stopped, because it is controlled by a SysV runlevel script (for example, /etc/rc.d/init.d/rgmanager).
Refer to the UPS documentation supplied by the vendor for detailed installation information.
2.5.3.1. Partitioning Disks
After shared disk storage has been set up, partition the disks so they can be used in the cluster. Then, create file systems or raw devices on the partitions.
Use parted to modify a disk partition table and divide the disk into partitions. While in parted, use the p command to display the partition table and the mkpart command to create new partitions. The following example shows how to use parted to create a partition on a disk:
Invoke parted from the shell using the command parted and specifying an available shared disk device. At the (parted) prompt, use the p command to display the current partition table. The output should be similar to the following:
Disk geometry for /dev/sda: 0.000-4340.294 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
Decide how large a partition is required. Create a partition of this size using the mkpart command in parted. Although mkpart does not create a file system, it normally requires a file system type at partition creation time. parted uses a range on the disk to determine partition size; the size is the space between the end and the beginning of the given range. The following example shows how to create two partitions of 20 MB each on an empty disk.
(parted) mkpart primary ext3 0 20
(parted) mkpart primary ext3 20 40
(parted) p
Disk geometry for /dev/sda: 0.000-4340.294 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
1          0.030     21.342  primary
2         21.343     38.417  primary
When more than four partitions are required on a single disk, it is necessary to create an extended partition. If an extended partition is required, the mkpart command also performs this task. In this case, it is not necessary to specify a file system type.
Note
Only one extended partition may be created, and the extended partition must be one of the four primary partitions.
(parted) mkpart extended 40 2000
(parted) p
Disk geometry for /dev/sda: 0.000-4340.294 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
1          0.030     21.342  primary
2         21.343     38.417  primary
3         38.417   2001.952  extended
An extended partition allows the creation of logical partitions inside of it. The following example shows the division of the extended partition into two logical partitions.
(parted) mkpart logical ext3 40 1000
(parted) p
Disk geometry for /dev/sda: 0.000-4340.294 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
1          0.030     21.342  primary
2         21.343     38.417  primary
3         38.417   2001.952  extended
5         38.447    998.841  logical
(parted) mkpart logical ext3 1000 2000
(parted) p
Disk geometry for /dev/sda: 0.000-4340.294 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
1          0.030     21.342  primary
2         21.343     38.417  primary
3         38.417   2001.952  extended
5         38.447    998.841  logical
6        998.872   2001.952  logical
A partition may be removed using parted’s rm command. For example:
(parted) rm 1
(parted) p
Disk geometry for /dev/sda: 0.000-4340.294 megabytes
Disk label type: msdos
Minor    Start       End     Type      Filesystem  Flags
2         21.343     38.417  primary
3         38.417   2001.952  extended
5         38.447    998.841  logical
6        998.872   2001.952  logical
After all required partitions have been created, exit parted using the quit command.
If a partition was added, removed, or changed while both nodes are powered on and connected to the shared storage, reboot the other node for it to recognize the modifications. After partitioning a disk, format the partition for use in the cluster. For example, create the file systems for shared partitions. Refer to Section 2.5.3.2 Creating File Systems for more information on configuring file systems.
For basic information on partitioning hard disks at installation time, refer to the Red Hat Enterprise Linux Installation Guide.
2.5.3.2. Creating File Systems
Use the mke2fs command to create an ext3 file system. For example:
mke2fs -j -b 4096 /dev/sde3
For optimal performance of shared file systems, make sure to specify a 4 KB block size with the mke2fs -b command. A smaller block size can cause long fsck times.
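If you want to confirm the block size after the file system has been created, the superblock can be inspected with tune2fs; for example, using the same illustrative device as above:

tune2fs -l /dev/sde3 | grep "Block size"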
Chapter 3. Installing and Configuring Red Hat Cluster Suite Software
This chapter describes how to install and configure Red Hat Cluster Suite software and consists of the following sections:
Section 3.1 Software Installation and Configuration Tasks
Section 3.2 Overview of the Cluster Configuration Tool
Section 3.3 Installing the Red Hat Cluster Suite Packages
Section 3.4 Starting the Cluster Configuration Tool
Section 3.5 Naming The Cluster
Section 3.6 Configuring Fence Devices
Section 3.7 Adding and Deleting Members
Section 3.8 Configuring a Failover Domain
Section 3.9 Adding Cluster Resources
Section 3.10 Adding a Cluster Service to the Cluster
Section 3.11 Propagating The Configuration File: New Cluster
Section 3.12 Starting the Cluster Software
3.1. Software Installation and Configuration Tasks
Installing and configuring Red Hat Cluster Suite software consists of the following steps:
1. Installing Red Hat Cluster Suite software.
Refer to Section 3.3 Installing the Red Hat Cluster Suite Packages.
2. Starting the Cluster Configuration Tool.
a. Creating a new configuration file or using an existing one.
b. Choosing a lock method: either DLM or GULM.
Refer to Section 3.4 Starting the Cluster Configuration Tool.
3. Naming the cluster. Refer to Section 3.5 Naming The Cluster.
4. Creating fence devices. Refer to Section 3.6 Configuring Fence Devices.
5. Creating cluster members. Refer to Section 3.7 Adding and Deleting Members.
6. Creating failover domains. Refer to Section 3.8 Configuring a Failover Domain.
7. Creating resources. Refer to Section 3.9 Adding Cluster Resources.
8. Creating cluster services.
Refer to Section 3.10 Adding a Cluster Service to the Cluster.
9. Propagating the configuration file to the other nodes in the cluster.
Refer to Section 3.11 Propagating The Configuration File: New Cluster.
10. Starting the cluster software. Refer to Section 3.12 Starting the Cluster Software.
3.2. Overview of the Cluster Configuration Tool
The Cluster Configuration Tool (Figure 3-1) is a graphical user interface (GUI) for creating, editing, saving, and propagating the cluster configuration file,
/etc/cluster/cluster.conf. The Cluster Configuration Tool is part of the Red
Hat Cluster Suite management GUI, (the system-config-cluster package) and is accessed by the Cluster Configuration tab in the Red Hat Cluster Suite management GUI.
Figure 3-1. Cluster Configuration Tool
The Cluster Configuration Tool uses a hierarchical structure to show relationships among components in the cluster configuration. A triangle icon to the left of a component name indicates that the component has one or more subordinate components assigned to it. To expand or collapse the portion of the tree below a component, click the triangle icon.
The Cluster Configuration Tool represents the cluster configuration with the following components in the left frame:
• Cluster Nodes — Defines cluster nodes. Nodes are represented by name as subordinate elements under Cluster Nodes. Using configuration buttons at the bottom of the right frame (below Properties), you can add nodes, delete nodes, edit node properties, and configure fencing methods for each node.
• Fence Devices — Defines fence devices. Fence devices are represented as subordinate elements under Fence Devices. Using configuration buttons at the bottom of the right frame (below Properties), you can add fence devices, delete fence devices, and edit fence-device properties. Fence devices must be defined before you can configure fencing (with the Manage Fencing For This Node button) for each node.
• Managed Resources — Defines failover domains, resources, and services.
  • Failover Domains — Use this section to configure one or more subsets of cluster nodes used to run a service in the event of a node failure. Failover domains are represented as subordinate elements under Failover Domains. Using configuration buttons at the bottom of the right frame (below Properties), you can create failover domains (when Failover Domains is selected) or edit failover domain properties (when a failover domain is selected).
  • Resources — Use this section to configure resources to be managed by the system. Choose from the available list of file systems, IP addresses, NFS mounts and exports, and user-created scripts and configure them individually. Resources are represented as subordinate elements under Resources. Using configuration buttons at the bottom of the right frame (below Properties), you can create resources (when Resources is selected) or edit resource properties (when a resource is selected).
  • Services — Use this section to create and configure services that combine cluster resources, nodes, and failover domains as needed. Services are represented as subordinate elements under Services. Using configuration buttons at the bottom of the right frame (below Properties), you can create services (when Services is selected) or edit service properties (when a service is selected).
Warning
Do not manually edit the contents of the /etc/cluster/cluster.conf file without guidance from an authorized Red Hat representative or unless you fully understand the consequences of editing the /etc/cluster/cluster.conf file manually.
Figure 3-2 shows the hierarchical relationship among cluster configuration components. The cluster comprises cluster nodes. The cluster nodes are connected to one or more fencing devices. Nodes can be grouped into failover domains for a cluster service. The services comprise managed resources such as NFS exports, IP addresses, and shared GFS partitions. The structure is ultimately reflected in the /etc/cluster/cluster.conf XML structure. The Cluster Configuration Tool provides a convenient way to create and manipulate the /etc/cluster/cluster.conf file.
Figure 3-2. Cluster Configuration Structure
3.3. Installing the Red Hat Cluster Suite Packages
You can install Red Hat Cluster Suite and (optionally) Red Hat GFS RPMs automatically by running the up2date utility at each node for the Red Hat Cluster Suite and Red Hat GFS products.
Tip
You can access the Red Hat Cluster Suite and Red Hat GFS products by using Red Hat Network to subscribe to and access the channels containing the Red Hat Cluster Suite and Red Hat GFS packages. From the Red Hat Network channel, you can manage entitlements for your cluster nodes and upgrade packages for each node within the Red Hat Network Web-based interface. For more information on using Red Hat Network, visit http://rhn.redhat.com.
To automatically install RPMs, follow these steps at each node:
1. Log on as the root user.
Note
The following steps specify using up2date --installall with the --force option. Using the --force option includes kernels that are required for successful installation of Red Hat Cluster Suite and Red Hat GFS. (Without the --force option,
up2date skips kernels by default.)
2. Run up2date --force --installall=channel-label for Red Hat Cluster Suite. The following example shows running the command for i386 RPMs:
# up2date --force --installall=rhel-i386-as-4-cluster
3. (Optional) If you are installing Red Hat GFS, run up2date --force
--installall=channel-label for Red Hat GFS. The following example shows
running the command for i386 RPMs:
# up2date --force --installall=rhel-i386-as-4-gfs-6.1
Note
The preceding procedure accommodates most installation requirements. However, if your installation has extreme limitations on storage and RAM, refer to Appendix B Selectively Installing Red Hat Cluster Suite Packages for more detailed information about Red Hat Cluster Suite and Red Hat GFS RPM packages and customized installation of those packages.
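To spot-check that the core cluster packages are present on a node after the installation, you can query the RPM database. The package names below are the typical RHEL 4 Cluster Suite package names; adjust the list to match your installation:

rpm -q ccs cman fence rgmanager system-config-cluster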
3.4. Starting the Cluster Configuration Tool
You can start the Cluster Configuration Tool by logging in to a cluster node as root with the ssh -Y command and issuing the system-config-cluster command. For example, to start the Cluster Configuration Tool on cluster node nano-01, do the following:
1. Log in to a cluster node and run system-config-cluster. For example:
$ ssh -Y root@nano-01
. . .
# system-config-cluster
a. If this is the first time you have started the Cluster Configuration Tool, the
program prompts you to either open an existing configuration or create a new one. Click Create New Configuration to start a new configuration file (refer to Figure 3-3).
Figure 3-3. Starting a New Configuration File
Note
The Cluster Management tab for the Red Hat Cluster Suite management GUI is available after you save the configuration file with the Cluster Configuration Tool, exit, and restart the Red Hat Cluster Suite management GUI (system-config-cluster). (The Cluster Management tab displays the status of the cluster service manager, cluster nodes, and resources, and shows statistics concerning cluster service operation. To manage the cluster system further, choose the Cluster Configuration tab.)
b. For a new configuration, a Lock Method dialog box is displayed requesting
a choice of either the GULM or DLM lock method (and multicast address for DLM).
Figure 3-4. Choosing a Lock Method
2. Starting the Cluster Configuration Tool displays a graphical representation of the configuration (Figure 3-5) as specified in the cluster configuration file,
/etc/cluster/cluster.conf.
Figure 3-5. The Cluster Configuration Tool
3.5. Naming The Cluster
Naming the cluster consists of specifying a cluster name, a configuration version (optional), and values for Post-Join Delay and Post-Fail Delay. Name the cluster as follows:
1. At the left frame, click Cluster.
2. At the bottom of the right frame (labeled Properties), click the Edit Cluster Properties button. Clicking that button causes a Cluster Properties dialog box to be displayed. The Cluster Properties dialog box presents text boxes for Name, Config Version, and two Fence Daemon Properties parameters: Post-Join Delay and Post-Fail Delay.
3. At the Name text box, specify a name for the cluster. The name should be descriptive enough to distinguish it from other clusters and systems on your network (for example, nfs_cluster or httpd_cluster). The cluster name cannot exceed 15 characters.
Tip
Choose the cluster name carefully. The only way to change the name of a Red Hat cluster is to create a new cluster configuration with the new name.
4. (Optional) The Config Version value is set to 1 by default and is automatically incremented each time you save your cluster configuration. However, if you need to set it to another value, you can specify it at the Config Version text box.
5. Specify the Fence Daemon Properties parameters: Post-Join Delay and Post-Fail Delay.
a. The Post-Join Delay parameter is the number of seconds the fence daemon
(fenced) waits before fencing a node after the node joins the fence domain. The Post-Join Delay default value is 3. A typical setting for Post-Join Delay is between 20 and 30 seconds, but can vary according to cluster and network performance.
b. The Post-Fail Delay parameter is the number of seconds the fence daemon
(fenced) waits before fencing a node (a member of the fence domain) after the node has failed. The Post-Fail Delay default value is 0. Its value may be varied to suit cluster and network performance.
Note
For more information about Post-Join Delay and Post-Fail Delay, refer to the fenced(8) man page.
6. Save cluster configuration changes by selecting File => Save.
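For reference, the cluster name, configuration version, and fence daemon delays set in this procedure are stored as attributes in /etc/cluster/cluster.conf. A fragment written by the Cluster Configuration Tool might look roughly like the following; the name and delay values are illustrative only:

<cluster name="nfs_cluster" config_version="1">
    <fence_daemon post_join_delay="20" post_fail_delay="0"/>
    ...
</cluster>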
3.6. Configuring Fence Devices
Configuring fence devices for the cluster consists of selecting one or more fence devices and specifying fence-device-dependent parameters (for example, name, IP address, login, and password).
To configure fence devices, follow these steps:
1. Click Fence Devices. At the bottom of the right frame (labeled Properties), click the Add a Fence Device button. Clicking Add a Fence Device causes the Fence
Device Configuration dialog box to be displayed (refer to Figure 3-6).
Figure 3-6. Fence Device Configuration
2. At the Fence Device Configuration dialog box, click the drop-down box under Add a New Fence Device and select the type of fence device to configure.
3. Specify the information in the Fence Device Configuration dialog box according to the type of fence device. Refer to the following tables for more information.
Field Description
Name A name for the APC device connected to the cluster.
IP Address The IP address assigned to the device.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Table 3-1. Configuring an APC Fence Device
Field Description
Name A name for the Brocade device connected to the cluster.
IP Address The IP address assigned to the device.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Table 3-2. Configuring a Brocade Fibre Channel Switch
Field Description
IP Address The IP address assigned to the PAP console.
Login The login name used to access the PAP console.
Password The password used to authenticate the connection to the PAP
console.
Table 3-3. Configuring a Bull Platform Administration Processor (PAP) Interface
Field Description
Name The name assigned to the DRAC.
IP Address The IP address assigned to the DRAC.
Login The login name used to access the DRAC.
Password The password used to authenticate the connection to the DRAC.
Table 3-4. Configuring a Dell Remote Access Controller/Modular Chassis (DRAC/MC) Interface
Field Description
Name A name for the BladeFrame device connected to the cluster.
CServer The hostname (and optionally the username in the form of
username@hostname) assigned to the device. Refer to the fence_egenera(8) man page.
Table 3-5. Configuring an Egenera BladeFrame
Field Description
Name A name for the GNBD device used to fence the cluster. Note that
the GFS server must be accessed via GNBD for cluster node fencing support.
Server The hostname of each GNBD to disable. For multiple hostnames,
separate each hostname with a space.
Table 3-6. Configuring a Global Network Block Device (GNBD) fencing agent
Field Description
Name A name for the server with HP iLO support.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Hostname The hostname assigned to the device.
Table 3-7. Configuring an HP Integrated Lights Out (iLO) card
Field Description
Name A name for the IBM Bladecenter device connected to the cluster.
IP Address The IP address assigned to the device.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Table 3-8. Configuring an IBM Blade Center that Supports Telnet
Field Description
Name A name for the RSA device connected to the cluster.
IP Address The IP address assigned to the device.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Table 3-9. Configuring an IBM Remote Supervisor Adapter II (RSA II)
Field Description
IP Address The IP address assigned to the IPMI port.
Login The login name of a user capable of issuing power on/off
commands to the given IPMI port.
Password The password used to authenticate the connection to the IPMI port.
Table 3-10. Configuring an Intelligent Platform Management Interface (IPMI)
Field Description
Name A name to assign the Manual fencing agent. Refer to
fence_manual(8) for more information.
Table 3-11. Configuring Manual Fencing
Note
Manual fencing is not supported for production environments.
Field Description
Name A name for the McData device connected to the cluster.
IP Address The IP address assigned to the device.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Table 3-12. Configuring a McData Fibre Channel Switch
Field Description
Name A name for the WTI RPS-10 power switch connected to the cluster.
Device The device the switch is connected to on the controlling host (for
example, /dev/ttys2).
Port The switch outlet number.
Table 3-13. Configuring an RPS-10 Power Switch (two-node clusters only)
Field Description
Name A name for the SANBox2 device connected to the cluster.
IP Address The IP address assigned to the device.
Login The login name used to access the device.
Password The password used to authenticate the connection to the device.
Table 3-14. Configuring a QLogic SANBox2 Switch
Field Description
Name A name for the Vixel switch connected to the cluster.
IP Address The IP address assigned to the device.
Password The password used to authenticate the connection to the device.
Table 3-15. Configuring a Vixel SAN Fibre Channel Switch
Field Description
Name A name for the WTI power switch connected to the cluster.
IP Address The IP address assigned to the device.
Password The password used to authenticate the connection to the device.
Table 3-16. Configuring a WTI Network Power Switch
4. Click OK.
5. Choose File => Save to save the changes to the cluster configuration.
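Fence devices configured here appear in the fencedevices section of /etc/cluster/cluster.conf. For an APC power switch, the resulting fragment might look roughly like the following; all values are illustrative:

<fencedevices>
    <fencedevice agent="fence_apc" name="apc1" ipaddr="10.0.0.50" login="apc" passwd="apc"/>
</fencedevices>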
3.7. Adding and Deleting Members
The procedure to add a member to a cluster varies depending on whether the cluster is a newly-configured cluster or a cluster that is already configured and running. To add a member to a new cluster, refer to Section 3.7.1 Adding a Member to a Cluster. To add a member to an existing cluster, refer to Section 3.7.2 Adding a Member to a Running
Cluster. To delete a member from a cluster, refer to Section 3.7.3 Deleting a Member from a Cluster.
3.7.1. Adding a Member to a Cluster
To add a member to a new cluster, follow these steps:
1. Click Cluster Node.
2. At the bottom of the right frame (labeled Properties), click the Add a Cluster Node button. Clicking that button causes a Node Properties dialog box to be displayed. For a DLM cluster, the Node Properties dialog box presents text boxes for Cluster
Node Name and Quorum Votes (refer to Figure 3-7). For a GULM cluster, the Node Properties dialog box presents text boxes for Cluster Node Name and Quorum Votes, and presents a checkbox for GULM Lockserver (refer to Figure 3-8).
Figure 3-7. Adding a Member to a New DLM Cluster
Figure 3-8. Adding a Member to a New GULM Cluster
3. At the Cluster Node Name text box, specify a node name. The entry can be a name or an IP address of the node on the cluster subnet.
Note
Each node must be on the same subnet as the node from which you are running the Cluster Configuration Tool and must be defined either in DNS or in the
/etc/hosts file of each cluster node.
Note
The node on which you are running the Cluster Configuration Tool must be explicitly added as a cluster member; the node is not automatically added to the cluster configuration as a result of running the Cluster Configuration Tool.
4. Optionally, at the Quorum Votes text box, you can specify a value; however in most configurations you can leave it blank. Leaving the Quorum Votes text box blank causes the quorum votes value for that node to be set to the default value of 1.
5. If the cluster is a GULM cluster and you want this node to be a GULM lock server, click the GULM Lockserver checkbox (marking it as checked).
6. Click OK.
7. Configure fencing for the node:
a. Click the node that you added in the previous step.
b. At the bottom of the right frame (below Properties), click Manage Fencing
For This Node. Clicking Manage Fencing For This Node causes the Fence Configuration dialog box to be displayed.
c. At the Fence Configuration dialog box, at the bottom of the right frame (below Properties), click Add a New Fence Level. Clicking Add a New Fence Level causes a fence-level element (for example, Fence-Level-1, Fence-Level-2, and so on) to be displayed below the node in the left frame of the Fence Configuration dialog box.
d. Click the fence-level element.
e. At the bottom of the right frame (below Properties), click Add a New Fence
to this Level. Clicking Add a New Fence to this Level causes the Fence Properties dialog box to be displayed.
f. At the Fence Properties dialog box, click the Fence Device Type drop-down
box and select the fence device for this node. Also, provide additional information required (for example, Port and Switch for an APC Power Device).
g. At the Fence Properties dialog box, click OK. Clicking OK causes a fence
device element to be displayed below the fence-level element.
h. To create additional fence devices at this fence level, return to step 7d. Otherwise, proceed to the next step.
i. To create additional fence levels, return to step 7c. Otherwise, proceed to the next step.
j. If you have configured all the fence levels and fence devices for this node, click
Close.
8. Choose File => Save to save the changes to the cluster configuration.
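As a rough sketch of the result, a node added with one fence level that uses the APC device defined earlier would appear in /etc/cluster/cluster.conf similar to the following; the node name, port, and switch values are illustrative:

<clusternode name="node1.example.com" votes="1">
    <fence>
        <method name="1">
            <device name="apc1" switch="1" port="1"/>
        </method>
    </fence>
</clusternode>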
3.7.2. Adding a Member to a Running Cluster
The procedure for adding a member to a running cluster depends on whether the cluster contains only two nodes or more than two nodes. To add a member to a running cluster, follow the steps in one of the following sections according to the number of nodes in the cluster:
For clusters with only two nodes —
Section 3.7.2.1 Adding a Member to a Running Cluster That Contains Only Two Nodes
For clusters with more than two nodes —
Section 3.7.2.2 Adding a Member to a Running Cluster That Contains More Than Two Nodes
3.7.2.1. Adding a Member to a Running Cluster That Contains Only Two Nodes
To add a member to an existing cluster that is currently in operation, and contains only two nodes, follow these steps:
1. Add the node and configure fencing for it as in
Section 3.7.1 Adding a Member to a Cluster.
2. Click Send to Cluster to propagate the updated configuration to other running nodes in the cluster.
3. Use the scp command to send the updated /etc/cluster/cluster.conf file from one of the existing cluster nodes to the new node.
4. At the Red Hat Cluster Suite management GUI Cluster Status Tool tab, disable each service listed under Services.
5. Stop the cluster software on the two running nodes by running the following commands at each node in this order:
a. service rgmanager stop
b. service gfs stop, if you are using Red Hat GFS
c. service clvmd stop
d. service fenced stop
e. service cman stop
f. service ccsd stop
6. Start cluster software on all cluster nodes (including the added one) by running the following commands in this order:
a. service ccsd start
b. service cman start
c. service fenced start
d. service clvmd start
e. service gfs start, if you are using Red Hat GFS
f. service rgmanager start
7. Start the Red Hat Cluster Suite management GUI. At the Cluster Configuration Tool tab, verify that the configuration is correct. At the Cluster Status Tool tab verify that the nodes and services are running as expected.
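In addition to the GUI, cluster membership and service status can be checked from a shell on any member; both commands are provided by the cluster packages:

cman_tool nodes
clustat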
3.7.2.2. Adding a Member to a Running Cluster That Contains More Than Two Nodes
To add a member to an existing cluster that is currently in operation, and contains more than two nodes, follow these steps:
1. Add the node and configure fencing for it as in
Section 3.7.1 Adding a Member to a Cluster.
2. Click Send to Cluster to propagate the updated configuration to other running nodes in the cluster.
3. Use the scp command to send the updated /etc/cluster/cluster.conf file from one of the existing cluster nodes to the new node.
4. Start cluster services on the new node by running the following commands in this order:
a. service ccsd start
b. service lock_gulmd start or service cman start according to the
type of lock manager used
c. service fenced start (DLM clusters only)
d. service clvmd start
e. service gfs start, if you are using Red Hat GFS
f. service rgmanager start
5. Start the Red Hat Cluster Suite management GUI. At the Cluster Configuration Tool tab, verify that the configuration is correct. At the Cluster Status Tool tab
verify that the nodes and services are running as expected.
3.7.3. Deleting a Member from a Cluster
To delete a member from an existing cluster that is currently in operation, follow these steps:
1. At one of the running nodes (not to be removed), run the Red Hat Cluster Suite management GUI. At the Cluster Status Tool tab, under Services, disable or relocate each service that is running on the node to be deleted.
2. Stop the cluster software on the node to be deleted by running the following commands at that node in this order:
a. service rgmanager stop
b. service gfs stop, if you are using Red Hat GFS
c. service clvmd stop
d. service fenced stop (DLM clusters only)
e. service lock_gulmd stop or service cman stop according to the
type of lock manager used
f. service ccsd stop
3. At the Cluster Configuration Tool (on one of the running members), delete the member as follows:
a. If necessary, click the triangle icon to expand the Cluster Nodes property.
b. Select the cluster node to be deleted. At the bottom of the right frame (labeled
Properties), click the Delete Node button.
c. Clicking the Delete Node button causes a warning dialog box to be displayed
requesting confirmation of the deletion (Figure 3-9).
Figure 3-9. Confirm Deleting a Member
d. At that dialog box, click Yes to confirm deletion.
e. Propagate the updated configuration by clicking the Send to Cluster button.
(Propagating the updated configuration automatically saves the configuration.)
4. Stop the cluster software on all remaining running nodes (including GULM lock-server nodes for GULM clusters) by running the following commands at each node in this order:
a. service rgmanager stop
b. service gfs stop, if you are using Red Hat GFS
c. service clvmd stop
d. service fenced stop (DLM clusters only)
e. service lock_gulmd stop or service cman stop according to the
type of lock manager used
f. service ccsd stop
5. Start cluster software on all remaining cluster nodes (including the GULM lock-server nodes for a GULM cluster) by running the following commands in this order:
a. service ccsd start
b. service lock_gulmd start or service cman start according to the
type of lock manager used
c. service fenced start (DLM clusters only)
d. service clvmd start
e. service gfs start, if you are using Red Hat GFS
f. service rgmanager start
6. Start the Red Hat Cluster Suite management GUI. At the Cluster Configuration Tool tab, verify that the configuration is correct. At the Cluster Status Tool tab verify that the nodes and services are running as expected.
3.8. Configuring a Failover Domain
A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
Unrestricted — Allows you to specify that a subset of members are preferred, but that a
cluster service assigned to this domain can run on any available member.
Restricted — Allows you to restrict the members that can run a particular cluster service.
If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
Unordered — When a cluster service is assigned to an unordered failover domain, the
member on which the cluster service runs is chosen from the available failover domain members with no priority ordering.
Ordered — Allows you to specify a preference order among the members of a failover
domain. The member at the top of the list is the most preferred, followed by the second member in the list, and so on.
By default, failover domains are unrestricted and unordered.
In a cluster with several members, using a restricted failover domain can minimize the work of setting up the cluster to run a cluster service (such as httpd, which requires you to set up the configuration identically on all members that run the cluster service). Instead of setting up the entire cluster to run the cluster service, you need to set up only the members in the restricted failover domain that you associate with the cluster service.
Tip
To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members.
The following sections describe adding a failover domain, removing a failover domain, and removing members from a failover domain:
Section 3.8.1 Adding a Failover Domain
Section 3.8.2 Removing a Failover Domain
Section 3.8.3 Removing a Member from a Failover Domain
3.8.1. Adding a Failover Domain
To add a failover domain, follow these steps:
1. At the left frame of the Cluster Configuration Tool, click Failover Domains.
2. At the bottom of the right frame (labeled Properties), click the Create a Failover Domain button. Clicking the Create a Failover Domain button causes the Add Failover Domain dialog box to be displayed.
3. At the Add Failover Domain dialog box, specify a failover domain name at the Name for new Failover Domain text box and click OK. Clicking OK causes the Failover Domain Configuration dialog box to be displayed (Figure 3-10).
Note
The name should be descriptive enough to distinguish its purpose relative to other names used in your cluster.
Figure 3-10. Failover Domain Configuration: Configuring a Failover Domain
4. Click the Available Cluster Nodes drop-down box and select the members for this failover domain.
5. To restrict failover to members in this failover domain, click (check) the Restrict
Failover To This Domains Members checkbox. (With Restrict Failover To This Domains Members checked, services assigned to this failover domain fail over only
to nodes in this failover domain.)
6. To prioritize the order in which the members in the failover domain assume control of a failed cluster service, follow these steps:
a. Click (check) the Prioritized List checkbox (Figure 3-11). Clicking Prioritized List causes the Priority column to be displayed next to the Member Node column.
Figure 3-11. Failover Domain Configuration: Adjusting Priority
b. For each node that requires a priority adjustment, click the node listed in the
Member Node/Priority columns and adjust priority by clicking one of the Adjust Priority arrows. Priority is indicated by the position in the Member Node column and the value in the Priority column. The node priorities are listed highest to lowest, with the highest priority node at the top of the Member Node column (having the lowest Priority number).
7. Click Close to create the domain.
8. At the Cluster Configuration Tool, perform one of the following actions depending on whether the configuration is for a new cluster or for one that is operational and running:
New cluster — If this is a new cluster, choose File => Save to save the changes to
the cluster configuration.
Running cluster — If this cluster is operational and running, and you want to
propagate the change immediately, click the Send to Cluster button. Clicking Send to Cluster automatically saves the configuration change. If you do not want
to propagate the change immediately, choose File => Save to save the changes to the cluster configuration.
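The Cluster Configuration Tool records failover domains in the /etc/cluster/cluster.conf file. The following is a rough sketch of what a restricted, prioritized failover domain looks like in that file; the element and attribute names shown here are typical for this release but should be verified against the file the tool generates on your system, and the domain and node names are examples only:

<rm>
    <failoverdomains>
        <failoverdomain name="httpd-domain" ordered="1" restricted="1">
            <failoverdomainnode name="node-01" priority="1"/>
            <failoverdomainnode name="node-02" priority="2"/>
        </failoverdomain>
    </failoverdomains>
</rm>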
3.8.2. Removing a Failover Domain
To remove a failover domain, follow these steps:
1. At the left frame of the Cluster Configuration Tool, click the failover domain that you want to delete (listed under Failover Domains).
2. At the bottom of the right frame (labeled Properties), click the Delete Failover Domain button. Clicking the Delete Failover Domain button causes a warning dialog box to be displayed asking if you want to remove the failover domain. Confirm that the failover domain identified in the warning dialog box is the one you want to delete and click Yes. Clicking Yes causes the failover domain to be removed from the list of failover domains under Failover Domains in the left frame of the Cluster Configuration Tool.
3. At the Cluster Configuration Tool, perform one of the following actions depending on whether the configuration is for a new cluster or for one that is operational and running:
New cluster — If this is a new cluster, choose File => Save to save the changes to
the cluster configuration.
Running cluster — If this cluster is operational and running, and you want to
propagate the change immediately, click the Send to Cluster button. Clicking Send to Cluster automatically saves the configuration change. If you do not want
to propagate the change immediately, choose File => Save to save the changes to the cluster configuration.
3.8.3. Removing a Member from a Failover Domain
To remove a member from a failover domain, follow these steps:
1. At the left frame of the Cluster Configuration Tool, click the failover domain that you want to change (listed under Failover Domains).
2. At the bottom of the right frame (labeled Properties), click the Edit Failover Domain Properties button. Clicking the Edit Failover Domain Properties button causes the Failover Domain Configuration dialog box to be displayed (Figure 3-10).
3. At the Failover Domain Configuration dialog box, in the Member Node column, click the node name that you want to delete from the failover domain and click the Remove Member from Domain button. Clicking Remove Member from Domain removes the node from the Member Node column. Repeat this step for each node that is to be deleted from the failover domain. (Nodes must be deleted one at a time.)
4. When finished, click Close.
5. At the Cluster Configuration Tool, perform one of the following actions depending
on whether the configuration is for a new cluster or for one that is operational and running:
New cluster — If this is a new cluster, choose File => Save to save the changes to
the cluster configuration.
Running cluster — If this cluster is operational and running, and you want to
propagate the change immediately, click the Send to Cluster button. Clicking Send to Cluster automatically saves the configuration change. If you do not want
to propagate the change immediately, choose File => Save to save the changes to the cluster configuration.
3.9. Adding Cluster Resources
To specify a device for a cluster service, follow these steps:
1. On the Resources property of the Cluster Configuration Tool, click the Create
a Resource button. Clicking the Create a Resource button causes the Resource Configuration dialog box to be displayed.
2. At the Resource Configuration dialog box, under Select a Resource Type, click the
drop-down box. At the drop-down box, select a resource to configure. The resource options are described as follows:
GFS
Name — Create a name for the file system resource.
Mount Point — Choose the path to which the file system resource is mounted.
Device — Specify the device file associated with the file system resource.
Options — Options to pass to the mount call for the new file system.
File System ID — When creating a new file system resource, you can leave this field blank. Leaving the field blank causes a file system ID to be assigned automatically after you click OK at the Resource Configuration dialog box. If you need to assign a file system ID explicitly, specify it in this field.
Force Unmount checkbox — If checked, forces the file system to unmount. The default setting is unchecked.
File System
Name — Create a name for the file system resource.
File System Type — Choose the file system for the resource using the drop-
down menu.
Mount Point — Choose the path to which the file system resource is mounted.
Device — Specify the device file associated with the file system resource.
Options — Options to pass to the mount call for the new file system.
File System ID — When creating a new file system resource, you can leave
this field blank. Leaving the field blank causes a file system ID to be assigned automatically after you click OK at the Resource Configuration dialog box. If you need to assign a file system ID explicitly, specify it in this field.
Checkboxes — Specify mount and unmount actions when a service is stopped (for example, when disabling or relocating a service):
Force unmount — If checked, forces the file system to unmount. The default
setting is unchecked.
Reboot host node if unmount fails — If checked, reboots the node if un-
mounting this file system fails. The default setting is unchecked.
Check file system before mounting — If checked, causes fsck to be run on
the file system before mounting it. The default setting is unchecked.
IP Address
IP Address — Type the IP address for the resource.
Monitor Link checkbox — Check the box to enable or disable link status monitoring of the IP address resource.
NFS Mount
Name — Create a symbolic name for the NFS mount.
Mount Point — Choose the path to which the file system resource is mounted.
Host — Specify the NFS server name.
Export Path — NFS export on the server.
NFS and NFS4 options — Specify NFS protocol:
NFS — Specifies using NFSv3 protocol. The default setting is NFS.
NFS4 — Specifies using NFSv4 protocol.
Options — NFS-specific options to pass to the mount call for the new file system. For more information, refer to the nfs(5) man page.
Force Unmount checkbox — If checked, forces the file system to unmount. The default setting is unchecked.
NFS Client
Name — Enter a name for the NFS client resource.
Target — Enter a target for the NFS client resource. Supported targets are host-
names, IP addresses (with wild-card support), and netgroups.
Read-Write and Read Only options — Specify the type of access rights for this NFS client resource:
Read-Write — Specifies that the NFS client has read-write access. The de-
fault setting is Read-Write.
Read Only — Specifies that the NFS client has read-only access.
Options — Additional client access rights. For more information, refer to the General Options section of the exports(5) man page.
NFS Export
Name — Enter a name for the NFS export resource.
Script
Name — Enter a name for the custom user script.
File (with path) — Enter the path where this custom script is located (for example, /etc/init.d/userscript).
Samba Service
Name — Enter a name for the Samba server.
Work Group — Enter the Windows workgroup name or Windows NT domain
of the Samba service.
Note
When creating or editing a cluster service, connect a Samba-service resource directly to the service, not to a resource within a service. That is, at the Ser-
vice Management dialog box, use either Create a new resource for this service or Add a Shared Resource to this service; do not use Attach a new Private Resource to the Selection or Attach a Shared Resource to the selection.
3. When finished, click OK.
4. Choose File => Save to save the change to the /etc/cluster/cluster.conf configuration file.
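Resources created with the Cluster Configuration Tool are written to the resources section of /etc/cluster/cluster.conf. The following is a rough sketch of what file system, IP address, and script resources look like in that file; the element and attribute names are typical for this release but should be checked against the file generated on your system, and all names, devices, paths, and addresses are examples only:

<rm>
    <resources>
        <fs name="httpd-content" fstype="ext3" mountpoint="/var/www/html"
            device="/dev/sda3" force_unmount="1"/>
        <ip address="10.11.4.240" monitor_link="1"/>
        <script name="httpd-init" file="/etc/rc.d/init.d/httpd"/>
    </resources>
</rm>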
3.10. Adding a Cluster Service to the Cluster
To add a cluster service to the cluster, follow these steps:
1. At the left frame, click Services.
2. At the bottom of the right frame (labeled Properties), click the Create a Service button. Clicking Create a Service causes the Add a Service dialog box to be displayed.
3. At the Add a Service dialog box, type the name of the service in the Name text box and click OK. Clicking OK causes the Service Management dialog box to be displayed (refer to Figure 3-12).
Tip
Use a descriptive name that clearly distinguishes the service from other services in the cluster.
Figure 3-12. Adding a Cluster Service
4. If you want to restrict the members on which this cluster service is able to run, choose a failover domain from the Failover Domain drop-down box. (Refer to Section 3.8 Configuring a Failover Domain for instructions on how to configure a failover domain.)
5. Autostart This Service checkbox — This is checked by default. If Autostart This Service is checked, the service is started automatically when a cluster is started and running. If Autostart This Service is not checked, the service must be started manually any time the cluster comes up from the stopped state.
6. Run Exclusive checkbox — This sets a policy wherein the service runs only on nodes that have no other services running on them. For example, for a very busy web server that is clustered for high availability, it would be advisable to keep that service on a node alone, with no other services competing for its resources — that is, with Run Exclusive checked. On the other hand, services that consume few resources (like NFS and Samba) can run together on the same node with little concern over contention for resources. For those types of services you can leave Run Exclusive unchecked.
7. Select a recovery policy to specify how the resource manager should recover from a service failure. At the upper right of the Service Management dialog box, there are three Recovery Policy options available:
Restart — Restart the service on the node where the service is currently located. The default setting is Restart. If the service cannot be restarted on the current node, the service is relocated.
Relocate — Relocate the service to another node rather than restarting it on the node where it is currently located.
Disable — Do not restart the service at all.
8. Click the Add a Shared Resource to this service button and choose a resource from the list of resources that you configured in Section 3.9 Adding Cluster Resources.
Note
If you are adding a Samba-service resource, connect a Samba-service resource directly to the service, not to a resource within a service. That is, at the Service
Management dialog box, use either Create a new resource for this service or Add a Shared Resource to this service; do not use Attach a new Private Resource to the Selection or Attach a Shared Resource to the selection.
9. If needed, you may also create a private resource, which becomes a subordinate resource, by clicking the Attach a new Private Resource to the Selection button. The process is the same as creating a shared resource, as described in Section 3.9 Adding Cluster Resources. The private resource appears as a child of the shared resource with which you associated it. Click the triangle icon next to the shared resource to display any associated private resources.
10. When finished, click OK.
11. Choose File => Save to save the changes to the cluster configuration.
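In /etc/cluster/cluster.conf, the resulting cluster service ties the shared resources together under a single service entry. The following is a rough sketch of what such an entry can look like; the attribute names and the ref-style references are typical for this release but should be verified against the file generated on your system, and the service, domain, and resource names are examples only:

<service name="webserver" domain="httpd-domain" autostart="1" recovery="relocate">
    <ip ref="10.11.4.240"/>
    <fs ref="httpd-content"/>
    <script ref="httpd-init"/>
</service>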
Note
To verify the existence of the IP service resource used in a cluster service, you must use the /sbin/ip addr list command on a cluster node. The following output shows the
/sbin/ip addr list command executed on a node running a cluster service:
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1356 qdisc pfifo_fast qlen 1000
    link/ether 00:05:5d:9a:d8:91 brd ff:ff:ff:ff:ff:ff
    inet 10.11.4.31/22 brd 10.11.7.255 scope global eth0
    inet6 fe80::205:5dff:fe9a:d891/64 scope link
    inet 10.11.4.240/22 scope global secondary eth0
       valid_lft forever preferred_lft forever
3.11. Propagating The Configuration File: New Cluster
For newly defined clusters, you must propagate the configuration file to the cluster nodes as follows:
1. Log in to the node where you created the configuration file.
2. Using the scp command, copy the /etc/cluster/cluster.conf file to all nodes in the cluster.
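For example, from the node where the configuration file was created, the copy can be done in a short loop (the node names node2 and node3 are assumptions for this example; list the actual members of your cluster):

for node in node2 node3; do
    scp /etc/cluster/cluster.conf root@$node:/etc/cluster/cluster.conf
done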
Note
Propagating the cluster configuration file this way is necessary for the first time a cluster is created. Once a cluster is installed and running, the cluster configuration file is propagated using the Red Hat cluster management GUI Send to Cluster button. For more information about propagating the cluster configuration using the GUI Send to Cluster button, refer to Section 4.4 Modifying the Cluster Configuration.
3.12. Starting the Cluster Software
After you have propagated the cluster configuration to the cluster nodes, you can either reboot each node or start the cluster software on each cluster node by running the following commands at each node in this order:
1. service ccsd start
2. service lock_gulmd start or service cman start according to the type of lock manager used
3. service fenced start (DLM clusters only)
4. service clvmd start
5. service gfs start, if you are using Red Hat GFS
6. service rgmanager start
7. Start the Red Hat Cluster Suite management GUI. At the Cluster Configuration Tool tab, verify that the configuration is correct. At the Cluster Status Tool tab verify that the nodes and services are running as expected.
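The start sequence above can also be run as a short loop on each node. The following is a sketch only, assuming a DLM cluster that uses Red Hat GFS; substitute lock_gulmd for cman and omit fenced on a GULM cluster, and omit gfs if GFS is not used:

for svc in ccsd cman fenced clvmd gfs rgmanager; do
    service $svc start
done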
Chapter 4.
Cluster Administration
This chapter describes the various administrative tasks for maintaining a cluster after it has been installed and configured.
4.1. Overview of the Cluster Status Tool
The Cluster Status Tool is part of the Red Hat Cluster Suite management GUI (the system-config-cluster package) and is accessed by a tab in the Red Hat Cluster Suite management GUI. The Cluster Status Tool displays the status of cluster members and services and provides control of cluster services.
The members and services displayed in the Cluster Status Tool are determined by the cluster configuration file (/etc/cluster/cluster.conf). The cluster configuration file is maintained via the Cluster Configuration Tool in the cluster management GUI.
Warning
Do not manually edit the contents of the /etc/cluster/cluster.conf file without guidance from an authorized Red Hat representative or unless you fully understand the consequences of editing the /etc/cluster/cluster.conf file manually.
You can access the Cluster Status Tool by clicking the Cluster Management tab at the cluster management GUI (refer to Figure 4-1).
Use the Cluster Status Tool to enable, disable, restart, or relocate a service. To enable a service, select the service in the Services area and click Enable. To disable a service, select the service in the Services area and click Disable. To restart a service, select the service in the Services area and click Restart. To relocate a service from one member to another, drag the service to the other member and drop the service onto that member; the service is then restarted on that member. (Relocating a service to its current member — that is, dragging a service to its current member and dropping the service onto that member — restarts the service.)
Figure 4-1. Cluster Status Tool
4.2. Displaying Cluster and Service Status
Monitoring cluster and application service status can help identify and resolve problems in the cluster environment. The following tools assist in displaying cluster status information:
The Cluster Status Tool
The clustat utility
Important
Members that are not running the cluster software cannot determine or report the status of other members of the cluster.
Cluster and service status includes the following information:
Cluster member system status
Service status and which cluster system is running the service or owns the service
The following tables describe how to analyze the status information shown by the Cluster Status Tool and the clustat utility.
Member Status Description
Member The node is part of the cluster.
Note: A node can be a member of a cluster; however, the node may be inactive and incapable of running services. For example, if rgmanager is not running on the node, but all other cluster software components are running in the node, the node appears as a Member in the Cluster Status Tool. However, without
rgmanager running, the node does not appear in the clustat
display.
Dead The member system is unable to participate as a cluster member.
The most basic cluster software is not running on the node.
Table 4-1. Member Status for the Cluster Status Tool
Member Status Description
Online The node is communicating with other nodes in the cluster.
Inactive The node is unable to communicate with the other nodes in the
cluster. If the node is inactive, clustat does not display the node. If rgmanager is not running in a node, the node is inactive.
Note: Although a node is inactive, it may still appear as a Member in the Cluster Status Tool. However, if the node is inactive, it is incapable of running services.
Table 4-2. Member Status for clustat
Service Status Description
Started The service resources are configured and available on the cluster
system that owns the service.
Pending The service has failed on a member and is pending start on
another member.
Disabled The service has been disabled, and does not have an assigned
owner. A disabled service is never restarted automatically by the cluster.
Stopped The service is not running; it is waiting for a member capable of
starting the service. A service remains in the stopped state if autostart is disabled.
Failed The service has failed to start on the cluster, and the cluster cannot
successfully stop the service. A failed service is never restarted automatically by the cluster.
Table 4-3. Service Status
The Cluster Status Tool displays the current cluster status in the Services area and automatically updates the status every 10 seconds. Additionally, you can display a snapshot of the current cluster status from a shell prompt by invoking the clustat utility. Example 4-1 shows the output of the clustat utility.
# clustat
Member Status: Quorate, Group Member

  Member Name                          State      ID
  ------ ----                          -----      --
  tng3-2                               Online     0x0000000000000002
  tng3-1                               Online     0x0000000000000001

  Service Name        Owner (Last)                State
  ------- ----        ----- ------                -----
  webserver           (tng3-1)                    failed
  email               tng3-2                      started
Example 4-1. Output of clustat
To monitor the cluster and display status at specific time intervals from a shell prompt, invoke clustat with the -i time option, where time specifies the number of seconds between status snapshots. The following example causes the clustat utility to display cluster status every 10 seconds:
# clustat -i 10
4.3. Starting and Stopping the Cluster Software
To start the cluster software on a member, type the following commands in this order:
1. service ccsd start
2. service lock_gulmd start or service cman start according to the type of lock manager used
3. service fenced start (DLM clusters only)
4. service clvmd start
5. service gfs start, if you are using Red Hat GFS
6. service rgmanager start
To stop the cluster software on a member, type the following commands in this order:
1. service rgmanager stop
2. service gfs stop, if you are using Red Hat GFS
3. service clvmd stop
4. service fenced stop (DLM clusters only)
5. service lock_gulmd stop or service cman stop according to the type of lock manager used
6. service ccsd stop
Stopping the cluster services on a member causes its services to fail over to an active member.
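As with starting, the stop sequence can be run as a short loop. The following is a sketch only, assuming a DLM cluster that uses Red Hat GFS; substitute lock_gulmd for cman and omit fenced on a GULM cluster, and omit gfs if GFS is not used:

for svc in rgmanager gfs clvmd fenced cman ccsd; do
    service $svc stop
done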
4.4. Modifying the Cluster Configuration
To modify the cluster configuration (the cluster configuration file, /etc/cluster/cluster.conf), use the Cluster Configuration Tool. For more information about using the Cluster Configuration Tool, refer to Chapter 3 Installing and Configuring Red Hat Cluster Suite Software.
Warning
Do not manually edit the contents of the /etc/cluster/cluster.conf file without guidance from an authorized Red Hat representative or unless you fully understand the consequences of editing the /etc/cluster/cluster.conf file manually.
Important
Although the Cluster Configuration Tool provides a Quorum Votes parameter in the Properties dialog box of each cluster member, that parameter is intended only for use
during initial cluster configuration. Furthermore, it is recommended that you retain the default Quorum Votes value of 1. For more information about using the Cluster Configuration Tool, refer to Chapter 3 Installing and Configuring Red Hat Cluster Suite Software.
To edit the cluster configuration file, click the Cluster Configuration tab in the cluster configuration GUI. Clicking the Cluster Configuration tab displays a graphical representation of the cluster configuration. Change the configuration file according to the following steps:
1. Make changes to cluster elements (for example, create a service).
2. Propagate the updated configuration file throughout the cluster by clicking Send to Cluster.
Note
The Cluster Configuration Tool does not display the Send to Cluster button if the cluster is new and has not been started yet, or if the node from which you are running the Cluster Configuration Tool is not a member of the cluster. If the Send to Cluster button is not displayed, you can still use the Cluster Configuration Tool; however, you cannot propagate the configuration. You can still save the configuration file. For information about using the Cluster Configuration Tool for a new cluster configuration, refer to Chapter 3 Installing and Configuring Red Hat Cluster Suite Software.
3. Clicking Send to Cluster causes a Warning dialog box to be displayed. Click Yes to save and propagate the configuration.
4. Clicking Yes causes an Information dialog box to be displayed, confirming that the current configuration has been propagated to the cluster. Click OK.
5. Click the Cluster Management tab and verify that the changes have been propagated to the cluster members.
4.5. Backing Up and Restoring the Cluster Database
The Cluster Configuration Tool automatically retains backup copies of the three most recently used configuration files (besides the currently used configuration file). Retaining the backup copies is useful if the cluster does not function correctly because of misconfiguration and you need to return to a previous working configuration.
Each time you save a configuration file, the Cluster Configuration Tool saves backup copies of the three most recently used configuration files as
/etc/cluster/cluster.conf.bak.1, /etc/cluster/cluster.conf.bak.2,
and /etc/cluster/cluster.conf.bak.3. The backup file
/etc/cluster/cluster.conf.bak.1 is the newest backup, /etc/cluster/cluster.conf.bak.2 is the second newest backup, and /etc/cluster/cluster.conf.bak.3 is the third newest backup.
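Before restoring, it can be useful to compare the current configuration file with a backup from a shell prompt; for example, using the newest backup described above:

diff -u /etc/cluster/cluster.conf.bak.1 /etc/cluster/cluster.conf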
If a cluster member becomes inoperable because of misconfiguration, restore the configu­ration file according to the following steps:
1. At the Cluster Configuration Tool tab of the Red Hat Cluster Suite management GUI, click File => Open.
2. Clicking File => Open causes the system-config-cluster dialog box to be displayed.
3. At the system-config-cluster dialog box, select a backup file (for example,
/etc/cluster/cluster.conf.bak.1). Verify the file selection in the Selection
box and click OK.
4. Increment the configuration version beyond the current working version number as follows:
a. Click Cluster => Edit Cluster Properties.
b. At the Cluster Properties dialog box, change the Config Version value and
click OK.
5. Click File => Save As.
6. Clicking File => Save As causes the system-config-cluster dialog box to be displayed.
7. At the system-config-cluster dialog box, select
/etc/cluster/cluster.conf and click OK. (Verify the file selection in the
Selection box.)
8. Clicking OK causes an Information dialog box to be displayed. At that dialog box, click OK.
9. Propagate the updated configuration file throughout the cluster by clicking Send to Cluster.
Note
The Cluster Configuration Tool does not display the Send to Cluster button if the cluster is new and has not been started yet, or if the node from which you are running the Cluster Configuration Tool is not a member of the cluster. If the Send to Cluster button is not displayed, you can still use the Cluster Configuration Tool; however, you cannot propagate the configuration. You can still save the configuration file. For information about using the Cluster Configuration Tool for a new cluster configuration, refer to Chapter 3 Installing and Configuring Red Hat Cluster Suite Software.
10. Clicking Send to Cluster causes a Warning dialog box to be displayed. Click Yes
to propagate the configuration.
11. Click the Cluster Management tab and verify that the changes have been propagated to the cluster members.
4.6. Updating the Cluster Software
For information about updating the cluster software, contact an authorized Red Hat support representative.
4.7. Changing the Cluster Name
Although the Cluster Configuration Tool provides a Cluster Properties dialog box with a cluster Name parameter, the parameter is intended only for use during initial cluster configuration. The only way to change the name of a Red Hat cluster is to create a new cluster with the new name. For more information about using the Cluster Configuration Tool, refer to Chapter 3 Installing and Configuring Red Hat Cluster Suite Software.
4.8. Disabling the Cluster Software
It may become necessary to temporarily disable the cluster software on a cluster member. For example, if a cluster member experiences a hardware failure, you may want to reboot that member, but prevent it from rejoining the cluster to perform maintenance on the system.
Use the /sbin/chkconfig command to stop the member from joining the cluster at boot-up as follows:
chkconfig --level 2345 rgmanager off
chkconfig --level 2345 gfs off
chkconfig --level 2345 clvmd off
chkconfig --level 2345 fenced off
chkconfig --level 2345 lock_gulmd off
chkconfig --level 2345 cman off
chkconfig --level 2345 ccsd off
Once the problems with the disabled cluster member have been resolved, use the following commands to allow the member to rejoin the cluster:
chkconfig --level 2345 rgmanager on
chkconfig --level 2345 gfs on
chkconfig --level 2345 clvmd on
chkconfig --level 2345 fenced on
chkconfig --level 2345 lock_gulmd on
chkconfig --level 2345 cman on
chkconfig --level 2345 ccsd on
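The same chkconfig changes can also be applied in a short loop. The following is a sketch only; include only the services that are actually installed on the member, and replace on with off to disable them again:

for svc in rgmanager gfs clvmd fenced lock_gulmd cman ccsd; do
    chkconfig --level 2345 $svc on
done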
You can then reboot the member for the changes to take effect or run the following commands in the order shown to restart cluster software:
1. service ccsd start
2. service lock_gulmd start or service cman start according to the type of lock manager used
3. service fenced start (DLM clusters only)
4. service clvmd start
5. service gfs start, if you are using Red Hat GFS
6. service rgmanager start
4.9. Diagnosing and Correcting Problems in a Cluster
For information about diagnosing and correcting problems in a cluster, contact an autho­rized Red Hat support representative.
Chapter 5.
Setting Up Apache HTTP Server
This chapter contains instructions for configuring Red Hat Enterprise Linux to make the Apache HTTP Server highly available.
The following is an example of setting up a cluster service that fails over an Apache HTTP Server. Although the actual variables used in the service depend on the specific configuration, the example may assist in setting up a service for a particular environment.
5.1. Apache HTTP Server Setup Overview
First, configure Apache HTTP Server on all nodes in the cluster. If using a failover domain, assign the service to all cluster nodes configured to run the Apache HTTP Server. Refer to Section 3.8 Configuring a Failover Domain for instructions. The cluster software ensures that only one cluster system runs the Apache HTTP Server at one time. The example configuration consists of installing the httpd RPM package on all cluster nodes (or on nodes in the failover domain, if used) and configuring a shared GFS resource for the Web content.
When installing the Apache HTTP Server on the cluster systems, run the following command to ensure that the cluster nodes do not automatically start the service when the system boots:
chkconfig --del httpd
Rather than having the system init scripts spawn the httpd daemon, the cluster infrastructure initializes the service on the active cluster node. This ensures that the corresponding IP address and file system mounts are active on only one cluster node at a time.
When adding an httpd service, a floating IP address must be assigned to the service so that the IP address will transfer from one cluster node to another in the event of failover or service relocation. The cluster infrastructure binds this IP address to the network interface on the cluster system that is currently running the Apache HTTP Server. This IP address ensures that the cluster node running httpd is transparent to the clients accessing the service.
The file systems that contain the Web content cannot be automatically mounted on the shared storage resource when the cluster nodes boot. Instead, the cluster software must mount and unmount the file system as the httpd service is started and stopped. This prevents the cluster systems from accessing the same data simultaneously, which may result in data corruption. Therefore, do not include the file systems in the /etc/fstab file.
5.2. Configuring Shared Storage
To set up the shared file system resource, perform the following tasks as root on one cluster system:
1. On one cluster node, use the interactive parted utility to create a partition to use for the document root directory. Note that it is possible to create multiple document root directories on different disk partitions. Refer to Section 2.5.3.1 Partitioning Disks for more information.
2. Use the mkfs command to create an ext3 file system on the partition you created in the previous step. Specify the drive letter and the partition number. For example:
mkfs -t ext3 /dev/sde3
3. Mount the file system that contains the document root directory. For example:
mount /dev/sde3 /var/www/html
Do not add this mount information to the /etc/fstab file because only the cluster software can mount and unmount file systems used in a service.
4. Copy all the required files to the document root directory.
5. If you have CGI files or other files that must be in different directories or in separate partitions, repeat these steps, as needed.
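For step 1 above, a parted session to create the partition might look like the following (the device name /dev/sde and the partition size are assumptions for illustration; adjust them for your storage):

parted /dev/sde
(parted) mkpart primary ext3 0 20000
(parted) print
(parted) quit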
5.3. Installing and Configuring the Apache HTTP Server
The Apache HTTP Server must be installed and configured on all nodes in the assigned failover domain, if used, or in the cluster. The basic server configuration must be the same on all nodes on which it runs for the service to fail over correctly. The following example shows a basic Apache HTTP Server installation that includes no third-party modules or performance tuning.
On all nodes in the cluster (or nodes in the failover domain, if used), install the httpd RPM package. For example:
rpm -Uvh httpd-version.arch.rpm
To configure the Apache HTTP Server as a cluster service, perform the following tasks:
1. Edit the /etc/httpd/conf/httpd.conf configuration file and customize the file according to your configuration. For example:
Specify the directory that contains the HTML files. Also specify this mount point
when adding the service to the cluster configuration. It is only required to change this field if the mountpoint for the website’s content differs from the default setting of /var/www/html/. For example:
DocumentRoot "/mnt/httpdservice/html"
Specify a unique IP address to which the service will listen for requests. For ex-
ample:
Listen 192.168.1.100:80
This IP address then must be configured as a cluster resource for the service using the Cluster Configuration Tool.
If the script directory resides in a non-standard location, specify the directory that
contains the CGI programs. For example:
ScriptAlias /cgi-bin/ "/mnt/httpdservice/cgi-bin/"
Specify the path that was used in the previous step, and set the access permissions
to default to that directory. For example:
<Directory "/mnt/httpdservice/cgi-bin">
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>
Additional changes may need to be made to tune the Apache HTTP Server or add module functionality. For information on setting up other options, refer to the Red
Hat Enterprise Linux System Administration Guide and the Red Hat Enterprise Linux Reference Guide.
2. The standard Apache HTTP Server start script, /etc/rc.d/init.d/httpd is also used within the cluster framework to start and stop the Apache HTTP Server on the active cluster node. Accordingly, when configuring the service, specify this script by adding it as a Script resource in the Cluster Configuration Tool.
3. Copy the configuration file over to the other nodes of the cluster (or nodes of the failover domain, if configured).
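For example, to copy the customized configuration file from the node where it was edited to another node (the host name node2 is an assumption for this example):

scp /etc/httpd/conf/httpd.conf root@node2:/etc/httpd/conf/httpd.conf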
Before the service is added to the cluster configuration, ensure that the Apache HTTP Server directories are not mounted. Then, on one node, invoke the Cluster Configuration Tool to add the service, as follows. This example assumes a failover domain named
httpd-domain was created for this service.
1. Add the init script for the Apache HTTP Server service.
Select the Resources tab and click Create a Resource. The Resource Configuration properties dialog box is displayed.
Select Script from the drop-down menu.
Enter a Name to be associated with the Apache HTTP Server service.
Specify the path to the Apache HTTP Server init script (for example,
/etc/rc.d/init.d/httpd) in the File (with path) field.
Click OK.
2. Add a device for the Apache HTTP Server content files and/or custom scripts.
Click Create a Resource.
In the Resource Configuration dialog, select File System from the drop-down
menu.
Enter the Name for the resource (for example, httpd-content).
Choose ext3 from the File System Type drop-down menu.
Enter the mount point in the Mount Point field (for example,
/var/www/html/).
Enter the device special file name in the Device field (for example, /dev/sda3).
3. Add an IP address for the Apache HTTP Server service.
Click Create a Resource.
Choose IP Address from the drop-down menu.
Enter the IP Address to be associated with the Apache HTTP Server service.
Make sure that the Monitor Link checkbox is left checked.
Click OK.
4. Click the Services property.
5. Create the Apache HTTP Server service.
Click Create a Service. Type a Name for the service in the Add a Service dialog.
In the Service Management dialog, select a Failover Domain from the drop-
down menu or leave it as None.
Click the Add a Shared Resource to this service button. From the available list,
choose each resource that you created in the previous steps. Repeat this step until all resources have been added.
Click OK.
6. Choose File => Save to save your changes.
II. Configuring a Linux Virtual Server Cluster
Building a Linux Virtual Server (LVS) system offers a highly available and scalable solution for production services using specialized routing and load-balancing techniques configured through the Piranha Configuration Tool. This part discusses the configuration of high-performance systems and services with Red Hat Enterprise Linux and LVS.
This section is licensed under the Open Publication License, V1.0 or later. For details refer to the Copyright page.
Table of Contents
6. Introduction to Linux Virtual Server.........................................................................83
7. Linux Virtual Server Overview ..................................................................................85
8. Initial LVS Configuration............................................................................................97
9. Setting Up a Red Hat Enterprise Linux LVS Cluster.............................................103
10. Configuring the LVS Routers with Piranha Configuration Tool.........................115
Chapter 6.
Introduction to Linux Virtual Server
Using Red Hat Enterprise Linux, it is possible to create highly available server clustering solutions able to withstand many common hardware and software failures with little or no interruption of critical services. By allowing multiple computers to work together in offering these critical services, system administrators can plan and execute system maintenance and upgrades without service interruption.
The chapters in this part guide you through the following steps in understanding and deploying a clustering solution based on the Red Hat Enterprise Linux Linux Virtual Server (LVS) technology:
Explains the Linux Virtual Server technology used by Red Hat Enterprise Linux to create
a load-balancing cluster
Explains how to configure a Red Hat Enterprise Linux LVS cluster
Guides you through the Piranha Configuration Tool, a graphical interface used for
configuring and monitoring an LVS cluster
6.1. Technology Overview
Red Hat Enterprise Linux implements highly available server solutions via clustering. It is important to note that cluster computing consists of three distinct branches:
Compute clustering (such as Beowulf) uses multiple machines to provide greater computing power for computationally intensive tasks. This type of clustering is not addressed by Red Hat Enterprise Linux.
High-availability (HA) clustering uses multiple machines to add an extra level of reliability for a service or group of services.
Load-balance clustering uses specialized routing techniques to dispatch traffic to a pool
of servers.
Red Hat Enterprise Linux addresses the latter two types of clustering technology, using a collection of programs to monitor the health of the systems and services in the cluster.
Note
The clustering technology included in Red Hat Enterprise Linux is not synonymous with fault tolerance. Fault tolerant systems use highly specialized and often very expensive
hardware to implement a fully redundant environment in which services can run uninterrupted by hardware failures.
However, fault tolerant systems do not account for operator and software errors, which Red Hat Enterprise Linux can address through service redundancy. Also, since Red Hat Enterprise Linux is designed to run on commodity hardware, it creates an environment with a high level of system availability at a fraction of the cost of fault tolerant hardware.
6.2. Basic Configurations
While Red Hat Enterprise Linux can be configured in a variety of different ways, the configurations can be broken into two major categories:
High-availability clusters using Red Hat Cluster Manager
Load-balancing clusters using Linux Virtual Servers
This part explains what a load-balancing cluster system is and how to configure a load-balancing system using Linux Virtual Servers on Red Hat Enterprise Linux.
6.2.1. Load-Balancing Clusters Using Linux Virtual Servers
To an outside user accessing a hosted service (such as a website or database application), a Linux Virtual Server (LVS) cluster appears as one server. In reality, however, the user is actually accessing a cluster of two or more servers behind a pair of redundant LVS routers that distribute client requests evenly throughout the cluster system. Load-balanced clustered services allow administrators to use commodity hardware and Red Hat Enterprise Linux to create continuous and consistent access to all hosted services while also addressing availability requirements.
An LVS cluster consists of at least two layers. The first layer is composed of a pair of similarly configured Linux machines or cluster members. One of these machines acts as the LVS router, configured to direct requests from the Internet to the cluster. The second layer consists of a cluster of machines called real servers. The real servers provide the critical services to the end-user while the LVS router balances the load on these servers.
For a detailed overview of LVS clustering, refer to Chapter 7 Linux Virtual Server Overview.