
HP IBRIX X9720/StoreAll 9730 Storage Administrator Guide
Abstract
This guide describes tasks related to cluster configuration and monitoring, system upgrade and recovery, hardware component replacement, and troubleshooting. It does not document StoreAll file system features or standard Linux administrative tools and commands. For information about configuring and using StoreAll file system features, see the HP StoreAll Storage File System User Guide.
This guide is intended for system administrators and technicians who are experienced with installing and administering networks, and with performing Linux operating and administrative tasks. For the latest StoreAll guides, browse to http://www.hp.com/support/StoreAllManuals.
HP Part Number: AW549-96073 Published: July 2013 Edition: 14
© Copyright 2009, 2013 Hewlett-Packard Development Company, L.P.
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.
The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Acknowledgments
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.
Warranty
WARRANTY STATEMENT: To obtain a copy of the warranty for this product, see the warranty information website:
http://www.hp.com/go/storagewarranty
Revision History
Description | Software Version | Date | Edition
Initial release of the IBRIX X9720 Storage. | 5.3.1 | December 2009 | 1
Added network management and Support ticket. | 5.4 | April 2010 | 2
Added Fusion Manager backup, migration to an agile Fusion Manager configuration, software upgrade procedures, and system recovery procedures. | 5.4.1 | August 2010 | 3
Revised upgrade procedure. | 5.4.1 | August 2010 | 4
Added information about NDMP backups and configuring virtual interfaces, and updated cluster procedures. | 5.5 | December 2010 | 5
Updated segment evacuation information. | 5.5 | March 2011 | 6
Revised upgrade procedure. | 5.6 | April 2011 | 7
Added or updated information about the agile Fusion Manager, Statistics tool, Ibrix Collect, event notification, capacity block installation, NTP servers, upgrades. | 6.0 | September 2011 | 8
Added or updated information about 9730 systems, hardware monitoring, segment evacuation, HP Insight Remote Support, software upgrades, events, Statistics tool. | 6.1 | June 2012 | 9
Added or updated information about High Availability, failover, server tuning, VLAN tagging, segment migration and evacuation, upgrades, SNMP. | 6.2 | December 2012 | 10
Updated information on upgrades, remote support, collection logs, phone home and troubleshooting. Now point users to website for the latest spare parts list instead of shipping the list. Added before and after upgrade steps for Express Query when going from 6.2 to 6.3. | 6.3 | March 2013 | 11
Removed post upgrade step that tells users to modify the /etc/hosts file on every StoreAll node. Changed firmware version to 4.0.0-13 in “Upgrading IBRIX X9720 chassis firmware.” In the “Cascading Upgrades” appendix, added a section that tells users to ensure that the NFS exports option subtree_check is the default export option for every NFS export when upgrading from a StoreAll 5.x release. Also changed ibrix_fm -m nofmfailover -A to ibrix_fm -m maintenance -A in the “Cascading Upgrades” appendix. Updated information about SMB share creation. | 6.3 | April 2013 | 12
Updated the example in the section “Enabling collection and synchronization.” | 6.3 | June 2013 | 13
Contents
1 Upgrading the StoreAll software to the 6.3 release.......................................10
Upgrading 9720 chassis firmware............................................................................................12
Online upgrades for StoreAll software.......................................................................................12
Preparing for the upgrade...................................................................................................13
Performing the upgrade......................................................................................................13
After the upgrade..............................................................................................................14
Automated offline upgrades for StoreAll software 6.x to 6.3.........................................................14
Preparing for the upgrade...................................................................................................14
Performing the upgrade......................................................................................................14
After the upgrade..............................................................................................................15
Manual offline upgrades for StoreAll software 6.x to 6.3.............................................................15
Preparing for the upgrade...................................................................................................15
Performing the upgrade manually.........................................................................................17
After the upgrade..............................................................................................................17
Upgrading Linux StoreAll clients................................................................................................18
Installing a minor kernel update on Linux clients.....................................................................18
Upgrading Windows StoreAll clients.........................................................................................19
Upgrading pre-6.3 Express Query enabled file systems...............................................................19
Required steps before the StoreAll Upgrade for pre-6.3 Express Query enabled file systems.........19
Required steps after the StoreAll Upgrade for pre-6.3 Express Query enabled file systems...........20
Troubleshooting upgrade issues................................................................................................21
Automatic upgrade............................................................................................................21
Manual upgrade...............................................................................................................22
Offline upgrade fails because iLO firmware is out of date........................................................22
Node is not registered with the cluster network .....................................................................22
File system unmount issues...................................................................................................23
File system in MIF state after StoreAll software 6.3 upgrade.....................................................23
2 Product description...................................................................................25
System features.......................................................................................................................25
System components.................................................................................................................25
HP StoreAll software features....................................................................................................25
High availability and redundancy.............................................................................................26
3 Getting started.........................................................................................27
Setting up the X9720/9730 Storage.........................................................................................27
Installation steps................................................................................................................27
Additional configuration steps.............................................................................................27
Logging in to the system..........................................................................................................28
Using the network..............................................................................................................28
Using the TFT keyboard/monitor..........................................................................................28
Using the serial link on the Onboard Administrator.................................................................29
Booting the system and individual server blades.........................................................................29
Management interfaces...........................................................................................................29
Using the StoreAll Management Console..............................................................................29
Customizing the GUI..........................................................................................................32
Adding user accounts for Management Console access..........................................................33
Using the CLI.....................................................................................................................33
Starting the array management software...............................................................................33
StoreAll client interfaces......................................................................................................34
StoreAll software manpages.....................................................................................................34
Changing passwords..............................................................................................................34
Configuring ports for a firewall.................................................................................................35
Configuring NTP servers..........................................................................................................36
Configuring HP Insight Remote Support on StoreAll systems..........................................................36
Configuring the StoreAll cluster for Insight Remote Support......................................................38
Configuring Insight Remote Support for HP SIM 7.1 and IRS 5.7...............................................41
Configuring Insight Remote Support for HP SIM 6.3 and IRS 5.6..............................................44
Testing the Insight Remote Support configuration....................................................................47
Updating the Phone Home configuration...............................................................................47
Disabling Phone Home.......................................................................................................47
Troubleshooting Insight Remote Support................................................................................48
4 Configuring virtual interfaces for client access..............................................49
Network and VIF guidelines.....................................................................................................49
Creating a bonded VIF............................................................................................................50
Configuring backup servers......................................................................................................50
Configuring NIC failover.........................................................................................................50
Configuring automated failover................................................................................................51
Example configuration.............................................................................................................51
Specifying VIFs in the client configuration...................................................................................51
Configuring VLAN tagging......................................................................................................52
Support for link state monitoring...............................................................................................52
5 Configuring failover..................................................................................53
Agile management consoles....................................................................................................53
Agile Fusion Manager modes..............................................................................................53
Agile Fusion Manager and failover......................................................................................53
Viewing information about Fusion Managers.........................................................................54
Configuring High Availability on the cluster................................................................................54
What happens during a failover..........................................................................................55
Configuring automated failover with the HA Wizard...............................................................55
Configuring automated failover manually..............................................................................62
Changing the HA configuration manually.........................................................................63
Failing a server over manually.............................................................................................64
Failing back a server .........................................................................................................64
Setting up HBA monitoring..................................................................................................65
Checking the High Availability configuration.........................................................................66
Capturing a core dump from a failed node................................................................................68
Prerequisites for setting up the crash capture..........................................................................68
Setting up nodes for crash capture.......................................................................................69
6 Configuring cluster event notification...........................................................70
Cluster events.........................................................................................................................70
Setting up email notification of cluster events..............................................................................70
Associating events and email addresses................................................................................70
Configuring email notification settings..................................................................................71
Dissociating events and email addresses...............................................................................71
Testing email addresses......................................................................................................71
Viewing email notification settings........................................................................................72
Setting up SNMP notifications..................................................................................................72
Configuring the SNMP agent...............................................................................................73
Configuring trapsink settings................................................................................................73
Associating events and trapsinks..........................................................................................74
Defining views...................................................................................................................74
Configuring groups and users..............................................................................................75
Deleting elements of the SNMP configuration........................................................................75
Listing SNMP configuration information.................................................................................75
7 Configuring system backups.......................................................................76
Backing up the Fusion Manager configuration............................................................................76
Using NDMP backup applications............................................................................................76
Configuring NDMP parameters on the cluster........................................................................77
NDMP process management...............................................................................................78
Viewing or canceling NDMP sessions..............................................................................78
Starting, stopping, or restarting an NDMP Server..............................................................78
Viewing or rescanning tape and media changer devices.........................................................79
NDMP events....................................................................................................................79
8 Creating host groups for StoreAll clients.......................................................80
How host groups work.............................................................................................................80
Creating a host group tree.......................................................................................................80
Adding a StoreAll client to a host group....................................................................................81
Adding a domain rule to a host group.......................................................................................81
Viewing host groups................................................................................................................82
Deleting host groups...............................................................................................................82
Other host group operations....................................................................................................82
9 Monitoring cluster operations.....................................................................83
Monitoring X9720/9730 hardware..........................................................................................83
Monitoring servers.............................................................................................................83
Monitoring hardware components........................................................................................87
Monitoring blade enclosures...........................................................................................88
Obtaining server details.................................................................................................91
Monitoring storage and storage components.........................................................................94
Monitoring storage clusters.............................................................................................96
Monitoring drive enclosures for a storage cluster...........................................................96
Monitoring pools for a storage cluster.........................................................................99
Monitoring storage controllers for a storage cluster.....................................................100
Monitoring storage switches in a storage cluster..............................................................101
Managing LUNs in a storage cluster..............................................................................101
Monitoring the status of file serving nodes................................................................................102
Monitoring cluster events.......................................................................................................103
Viewing events................................................................................................................103
Removing events from the events database table..................................................................104
Monitoring cluster health.......................................................................................................104
Health checks..................................................................................................................104
Health check reports........................................................................................................104
Viewing logs........................................................................................................................106
Viewing operating statistics for file serving nodes......................................................................106
10 Using the Statistics tool..........................................................................108
Installing and configuring the Statistics tool..............................................................................108
Installing the Statistics tool.................................................................................................108
Enabling collection and synchronization..............................................................................108
Upgrading the Statistics tool from StoreAll software 6.0.............................................................109
Using the Historical Reports GUI.............................................................................................109
Generating reports...........................................................................................................111
Deleting reports...............................................................................................................111
Maintaining the Statistics tool.................................................................................................112
Space requirements..........................................................................................................112
Updating the Statistics tool configuration.............................................................................112
Changing the Statistics tool configuration............................................................................112
Fusion Manager failover and the Statistics tool configuration.................................................112
Checking the status of Statistics tool processes.....................................................................113
Controlling Statistics tool processes.....................................................................................113
Troubleshooting the Statistics tool............................................................................................114
Log files...............................................................................................................................114
Uninstalling the Statistics tool.................................................................................................114
11 Maintaining the system..........................................................................115
Shutting down the system.......................................................................................................115
Shutting down the StoreAll software....................................................................................115
Powering off the system hardware......................................................................................116
Starting up the system...........................................................................................................117
Powering on the system hardware......................................................................................117
Powering on after a power failure......................................................................................117
Starting the StoreAll software.............................................................................................117
Powering file serving nodes on or off.......................................................................................117
Performing a rolling reboot....................................................................................................118
Starting and stopping processes.............................................................................................118
Tuning file serving nodes and StoreAll clients............................................................................118
Managing segments.............................................................................................................122
Migrating segments..........................................................................................................123
Evacuating segments and removing storage from the cluster ..................................................125
Removing a node from a cluster..............................................................................................128
Maintaining networks............................................................................................................129
Cluster and user network interfaces....................................................................................129
Adding user network interfaces..........................................................................................129
Setting network interface options in the configuration database..............................................131
Preferring network interfaces..............................................................................................131
Unpreferring network interfaces.........................................................................................132
Making network changes..................................................................................................132
Changing the IP address for a Linux StoreAll client...........................................................132
Changing the cluster interface.......................................................................................133
Managing routing table entries.....................................................................................133
Adding a routing table entry....................................................................................133
Deleting a routing table entry...................................................................................133
Deleting a network interface.........................................................................................133
Viewing network interface information................................................................................134
12 Licensing.............................................................................................135
Viewing license terms............................................................................................................135
Retrieving a license key.........................................................................................................135
Using AutoPass to retrieve and install permanent license keys......................................................135
13 Upgrading firmware..............................................................................136
Components for firmware upgrades.........................................................................................136
Steps for upgrading the firmware............................................................................................137
Finding additional information on FMT...............................................................................140
Adding performance modules on 9730 systems...................................................................140
Adding new server blades on 9720 systems........................................................................141
14 Troubleshooting....................................................................................143
Collecting information for HP Support with the IbrixCollect.........................................................143
Collecting logs................................................................................................................143
Downloading the archive file.............................................................................................144
Deleting the archive file....................................................................................................144
Configuring Ibrix Collect...................................................................................................145
Obtaining custom logging from ibrix_collect add-on scripts....................................................146
Creating an add-on script.............................................................................................146
Running an add-on script.............................................................................................147
Viewing the output from an add-on script........................................................................147
Viewing data collection information....................................................................................149
Adding/deleting commands or logs in the XML file..............................................................149
Viewing software version numbers..........................................................................................149
Troubleshooting specific issues................................................................................................150
Software services.............................................................................................................150
Failover..........................................................................................................................150
Windows StoreAll clients...................................................................................................151
Synchronizing information on file serving nodes and the configuration database...........................151
Troubleshooting an Express Query Manual Intervention Failure (MIF)...........................................152
15 Recovering the X9720/9730 Storage......................................................154
Obtaining the latest StoreAll software release...........................................................................154
Preparing for the recovery......................................................................................................154
Restoring an X9720 node with StoreAll 6.1 or later...............................................................155
Recovering an X9720 or 9730 file serving node.......................................................................155
Completing the restore .........................................................................................................162
Troubleshooting....................................................................................................................165
Manually recovering bond1 as the cluster...........................................................................165
iLO remote console does not respond to keystrokes...............................................................169
The ibrix_auth command fails after a restore........................................................................169
16 Support and other resources...................................................................170
Contacting HP......................................................................................................................170
Related information...............................................................................................................170
Obtaining spare parts...........................................................................................................171
HP websites.........................................................................................................................171
Rack stability........................................................................................................................171
Product warranties................................................................................................................171
Subscription service..............................................................................................................171
17 Documentation feedback.......................................................................173
A Cascading Upgrades.............................................................................174
Upgrading the StoreAll software to the 6.1 release....................................................................174
Upgrading 9720 chassis firmware.....................................................................................175
Online upgrades for StoreAll software 6.x to 6.1..................................................................175
Preparing for the upgrade............................................................................................175
Performing the upgrade................................................................................................176
After the upgrade........................................................................................................176
Offline upgrades for StoreAll software 5.6.x or 6.0.x to 6.1...................................................176
Preparing for the upgrade............................................................................................176
Performing the upgrade................................................................................................178
After the upgrade........................................................................................................178
Upgrading Linux StoreAll clients.........................................................................................179
Installing a minor kernel update on Linux clients..............................................................179
Upgrading Windows StoreAll clients..................................................................................180
Upgrading pre-6.0 file systems for software snapshots..........................................................180
Upgrading pre-6.1.1 file systems for data retention features....................................................181
Troubleshooting upgrade issues.........................................................................................182
Automatic upgrade......................................................................................................182
Manual upgrade.........................................................................................................182
Offline upgrade fails because iLO firmware is out of date.................................................182
Node is not registered with the cluster network ...............................................................183
File system unmount issues............................................................................................183
Moving the Fusion Manager VIF to bond1......................................................................184
Upgrading the StoreAll software to the 5.6 release...................................................................185
Automatic upgrades.........................................................................................................185
Manual upgrades............................................................................................................186
Preparing for the upgrade............................................................................................186
Saving the node configuration......................................................................................186
Performing the upgrade................................................................................................186
Restoring the node configuration...................................................................................187
Completing the upgrade..............................................................................................187
Troubleshooting upgrade issues.........................................................................................188
Automatic upgrade......................................................................................................188
Manual upgrade.........................................................................................................188
Upgrading the StoreAll software to the 5.5 release....................................................................188
Automatic upgrades.........................................................................................................189
Manual upgrades............................................................................................................190
Standard upgrade for clusters with a dedicated Management Server machine or blade........190
Standard online upgrade.........................................................................................190
Standard offline upgrade.........................................................................................192
Agile upgrade for clusters with an agile management console configuration.......................194
Agile online upgrade..............................................................................................194
Agile offline upgrade..............................................................................................198
Troubleshooting upgrade issues.........................................................................................200
B StoreAll 9730 component and cabling diagrams........................................201
Back view of the main rack....................................................................................................201
Back view of the expansion rack.............................................................................................202
StoreAll 9730 CX I/O modules and SAS port connectors...........................................................202
StoreAll 9730 CX 1 connections to the SAS switches.................................................................203
StoreAll 9730 CX 2 connections to the SAS switches.................................................................204
StoreAll 9730 CX 3 connections to the SAS switches.................................................................205
StoreAll 9730 CX 7 connections to the SAS switches in the expansion rack..................................206
C The IBRIX X9720 component and cabling diagrams....................................207
Base and expansion cabinets.................................................................................................207
Front view of a base cabinet..............................................................................................207
Back view of a base cabinet with one capacity block...........................................................208
Front view of a full base cabinet.........................................................................................209
Back view of a full base cabinet.........................................................................................210
Front view of an expansion cabinet ...................................................................................211
Back view of an expansion cabinet with four capacity blocks.................................................212
Performance blocks (c-Class Blade enclosure)............................................................................212
Front view of a c-Class Blade enclosure...............................................................................212
Rear view of a c-Class Blade enclosure...............................................................................213
Flex-10 networks...............................................................................................................213
Capacity blocks...................................................................................................................214
X9700c (array controller with 12 disk drives).......................................................................215
Front view of an X9700c..............................................................................................215
Rear view of an X9700c..............................................................................................215
X9700cx (dense JBOD with 70 disk drives)..........................................................................215
Front view of an X9700cx............................................................................................216
Rear view of an X9700cx.............................................................................................216
Cabling diagrams................................................................................................................216
Capacity block cabling—Base and expansion cabinets........................................................216
Virtual Connect Flex-10 Ethernet module cabling—Base cabinet.............................................217
SAS switch cabling—Base cabinet.....................................................................................218
SAS switch cabling—Expansion cabinet..............................................................................218
D Warnings and precautions......................................................................220
Electrostatic discharge information..........................................................................................220
Preventing electrostatic discharge.......................................................................................220
Grounding methods.....................................................................................................220
Grounding methods.........................................................................................................220
Equipment symbols...............................................................................................................221
Weight warning...................................................................................................................221
Rack warnings and precautions..............................................................................................221
Device warnings and precautions...........................................................................................222
E Regulatory information............................................................................224
Belarus Kazakhstan Russia marking.........................................................................................224
Turkey RoHS material content declaration.................................................................................224
Ukraine RoHS material content declaration..............................................................................224
Warranty information............................................................................................................224
Glossary..................................................................................................225
Index.......................................................................................................227
1 Upgrading the StoreAll software to the 6.3 release
This chapter describes how to upgrade to the 6.3 StoreAll software release. You can also use this procedure for any subsequent 6.3.x patches.
IMPORTANT: Print the following table and check off each step as you complete it.
NOTE: (Upgrades from version 6.0.x) CIFS share permissions are granted on a global basis in
v6.0.X. When upgrading from v6.0.X, confirm that the correct share permissions are in place.
Table 1 Prerequisites checklist for all upgrades
Step 1. Verify that the entire cluster is currently running StoreAll 6.0 or later by entering the following command:
ibrix_version -l
IMPORTANT: All StoreAll nodes must be at the same release.
If you are running a version of StoreAll earlier than 6.0, upgrade the product as described in “Cascading Upgrades” (page 174). If you are running StoreAll 6.0 or later, proceed with the upgrade steps in this section.
Step 2. Verify that the /local partition has at least 4 GB of free space for the upgrade by using the following command:
df -kh /local
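If you prefer to script this check, the following is a minimal sketch; the 4 GB threshold comes from the step above, and the comparison uses the available-space column reported by df:
avail_kb=$(df -Pk /local | awk 'NR==2 {print $4}')
if [ "$avail_kb" -ge $((4 * 1024 * 1024)) ]; then
  echo "/local has enough free space for the upgrade"
else
  echo "/local has less than 4 GB free; free up space before upgrading"
fi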
Step 3. For 9720 systems, enable password-less access among the cluster nodes before starting the upgrade.
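The guide does not specify a method here; one common way to enable password-less access between Linux nodes is key-based SSH. A minimal sketch, run as root on one node (the node names are placeholders):
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa   # generate a key pair if one does not exist
for node in node2 node3; do        # placeholder host names; list every other node in the cluster
  ssh-copy-id root@$node           # copy the public key so SSH no longer prompts for a password
done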
Step 4. The 6.3 release requires that nodes hosting the agile Fusion Manager be registered on the cluster network. Run the following command to verify that nodes hosting the agile Fusion Manager have IP addresses on the cluster network:
ibrix_fm -l
If a node is configured on the user network, see “Node is not registered with the cluster network” (page 22) for a workaround.
NOTE: The Fusion Manager and all file serving nodes must be upgraded to the new release at the same time. Do not change the active/passive Fusion Manager configuration during the upgrade.
Step 5. Verify that the crash kernel parameter on all nodes has been set to 256M by viewing the default boot entry in the /etc/grub.conf file, as shown in the following example:
kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/vg1/lv1 crashkernel=256M@16M
The /etc/grub.conf file might contain multiple instances of the crash kernel parameter. Make sure you modify each instance that appears in the file.
If you must modify the /etc/grub.conf file, follow the steps in this section:
1. Use SSH to access the active Fusion Manager (FM).
2. Do one of the following:
(Versions 6.2 and later) Place all passive FMs into nofmfailover mode:
ibrix_fm -m nofmfailover -A
(Versions earlier than 6.2) Place all passive FMs into maintenance mode:
ibrix_fm -m maintenance -A
3. Disable Segment Server Failover on each node in the cluster:
ibrix_server -m -U -h <node>
4. Set the crash kernel to 256M in the /etc/grub.conf file. The /etc/grub.conf
file might contain multiple instances of the crash kernel parameter. Make sure you modify each instance that appears in the file.
NOTE: Save a copy of the /etc/grub.conf file before you modify it.
The following example shows the crash kernel set to 256M:
kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/vg1/lv1 crashkernel=256M@16M
5. Reboot the active FM.
6. Use SSH to access each passive FM and do the following:
a. Modify the /etc/grub.conf file as described in the previous steps.
b. Reboot the node.
7. After all nodes in the cluster are back up, use SSH to access the active FM.
8. Place all disabled FMs back into passive mode:
ibrix_fm -m passive -A
9. Re-enable Segment Server Failover on each node:
ibrix_server -m -h <node>
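To verify the setting on a node without opening the file in an editor, the following is a minimal sketch; it lists every kernel boot entry in /etc/grub.conf and counts how many already carry the 256M setting:
grep "^[[:space:]]*kernel" /etc/grub.conf      # every line shown should contain crashkernel=256M@16M as in the example above
grep -c "crashkernel=256M" /etc/grub.conf      # number of entries already set to 256M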
Step 6. If your cluster includes G6 servers, check the iLO2 firmware version. This issue does not affect G7 servers. The firmware must be at version 2.05 for HA to function properly. If your servers have an earlier version of the iLO2 firmware, run the CP014256.scexe script as described in the following steps:
1. Ensure that the /local/ibrix/ folder is empty prior to copying the contents of pkgfull. When you upgrade the StoreAll software later in this chapter, this folder must contain only .rpm packages listed in the build manifest for the upgrade or the upgrade will fail.
2. Mount the pkg-full ISO image and copy the entire directory structure to the /local/ibrix/ directory (a copy sketch follows this list). The following example shows the mount command:
mount -o loop /local/pkg/ibrix-pkgfull-FS_6.3.72+IAS_6.3.72-x86_64.signed.iso /mnt/
3. Execute the firmware binary at the following location:
/local/ibrix/distrib/firmware/CP014256.scexe
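The example in substep 2 shows only the mount command. A minimal sketch of the full mount-and-copy sequence, assuming the ISO path shown above and a temporary mount point of /mnt/:
mount -o loop /local/pkg/ibrix-pkgfull-FS_6.3.72+IAS_6.3.72-x86_64.signed.iso /mnt/
cp -a /mnt/* /local/ibrix/      # copy the entire directory structure, preserving permissions and layout
umount /mnt/                    # release the ISO once the copy completes
After the copy, run the firmware binary as described in substep 3.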
Step 7. Make sure StoreAll is running the latest firmware. For information on how to find the version of firmware that StoreAll is running, see the Administrator Guide for your release.
Step 8. Verify that every file serving node can see and access every segment logical volume that it is configured for as either the owner or the backup by entering the following commands:
1. To view all segments, their logical volume names, and their owners, enter the following command on one line:
ibrix_fs -i | egrep -e OWNER -e MIXED|awk '{ print $1, $3, $6, $2, $14, $5}' | tr " " "\t"
2. To verify that the correct segments are visible on the current node, enter the following command on each file serving node:
lvm lvs | awk '{print $1}'
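To make the comparison easier to repeat on each node, the outputs can be captured to files; a minimal sketch (the file names under /tmp are arbitrary):
ibrix_fs -i | egrep -e OWNER -e MIXED | awk '{ print $1, $3, $6, $2, $14, $5}' | tr " " "\t" > /tmp/segments_expected.txt
lvm lvs | awk '{print $1}' > /tmp/lvs_visible.txt
# Review the two files side by side; every segment logical volume owned or backed up by this node must appear in lvs_visible.txt.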
Step 9. Ensure that no active tasks are running. Stop any active remote replication, data tiering, or rebalancer tasks running on the cluster. (Use ibrix_task -l to list active tasks.) When the upgrade is complete, you can start the tasks again. For additional information on how to stop a task, see the ibrix_task command help.
Step 10. For 9720 systems, delete the existing vendor storage by entering the following command:
ibrix_vs -d -n EXDS
The vendor storage is registered automatically after the upgrade.
Step 11. Record all host tunings, FS tunings, and FS mounting options by using the following commands:
1. To display file system tunings, enter: ibrix_fs_tune -l >/local/ibrix_fs_tune-l.txt
2. To display default StoreAll tunings and settings, enter: ibrix_host_tune -L >/local/ibrix_host_tune-L.txt
3. To display all non-default configuration tunings and settings, enter: ibrix_host_tune -q >/local/ibrix_host_tune-q.txt
Ensure that the "ibrix" local user account exists and it has the same UID number on all the servers in the cluster. If they do not have the same UID number, create the account and change the UIDs as needed to make them the same on all the servers. Similarly, ensure that the "ibrix-user" local user group exists and has the same GID number on all servers.
12
Enter the following commands on each node:
grep ibrix /etc/passwd
grep ibrix-user /etc/group
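A minimal sketch for comparing the values across servers from one node, assuming the password-less access set up in step 3 and placeholder node names:
for node in node1 node2 node3; do     # placeholder host names; list every server in the cluster
  echo "== $node =="
  ssh root@$node "grep '^ibrix:' /etc/passwd; grep '^ibrix-user:' /etc/group"
done
# The UID field (third column of /etc/passwd) and GID field (third column of /etc/group) must match on every server.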
Step 13. Ensure that all nodes are up and running. To determine the status of your cluster nodes, check the health of each server by either using the dashboard on the Management Console or entering the ibrix_health -S -i -h <hostname> command for each node in the cluster. At the top of the output, look for “PASSED.”
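A minimal sketch that loops over the cluster nodes and reports the overall result (node names are placeholders; the PASSED string is the one quoted above):
for node in node1 node2 node3; do     # placeholder host names
  if ibrix_health -S -i -h "$node" | grep -q "PASSED"; then
    echo "$node: PASSED"
  else
    echo "$node: did not report PASSED; review the full ibrix_health output"
  fi
done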
Step 14. If you are running StoreAll 6.2.x or earlier and you have one or more Express Query enabled file systems, each one must be manually upgraded as described in “Upgrading pre-6.3 Express Query enabled file systems” (page 19).
IMPORTANT: Run the steps in “Required steps before the StoreAll Upgrade for pre-6.3 Express Query enabled file systems” (page 19) before the upgrade. That section provides steps for saving your custom metadata and audit log. After you upgrade the StoreAll software, run the steps in “Required steps after the StoreAll Upgrade for pre-6.3 Express Query enabled file systems” (page 20). These post-upgrade steps are required to preserve your custom metadata and audit log data.
Upgrading 9720 chassis firmware
Before upgrading 9720 systems to StoreAll software 6.3, the 9720 chassis firmware must be at version 4.0.0-13. If the firmware is not at this level, upgrade it before proceeding with the StoreAll upgrade.
To upgrade the firmware, complete the following steps:
1. Go to http://www.hp.com/go/StoreAll.
2. On the HP StoreAll Storage page, select HP Support & Drivers from the Support section.
3. On the Business Support Center, select Download Drivers and Software and then select HP 9720 Base Rack > Red Hat Enterprise Linux 5 Server (x86-64).
4. Click HP 9720 Storage Chassis Firmware version 4.0.0-13.
5. Download the firmware and install it as described in the HP 9720 Network Storage System
4.0.0-13 Release Notes.
Online upgrades for StoreAll software
Online upgrades are supported only from the StoreAll 6.x release. Upgrades from earlier StoreAll releases must use the appropriate offline upgrade procedure.
When performing an online upgrade, note the following:
• File systems remain mounted and client I/O continues during the upgrade.
• The upgrade process takes approximately 45 minutes, regardless of the number of nodes.
• The total I/O interruption per node IP is four minutes, allowing for a failover time of two minutes and a failback time of two additional minutes.
• Client I/O having a timeout of more than two minutes is supported.
Preparing for the upgrade
To prepare for the upgrade, first ensure that high availability is enabled on each node in the cluster by running the following command:
ibrix_haconfig -l
If the command displays an Overall HA Configuration Checker Results - PASSED status, high availability is enabled on each node in the cluster. If the command returns Overall HA Configuration Checker Results - FAILED, complete the following list items based on the result returned for each component (a filter sketch follows the list):
1. Make sure you have completed all steps in the upgrade checklist (Table 1 (page 10)).
2. If Failed was displayed for the HA Configuration or Auto Failover columns or both, perform
the steps described in the section “Configuring High Availability on the cluster” in the
administrator guide for your current release.
3. If Failed was displayed for the NIC or HBA Monitored columns, see the sections for
ibrix_nic -m -h <host> -A node_2/node_interface and ibrix_hba -m -h
<host> -p <World_Wide_Name> in the CLI guide for your current release.
Performing the upgrade
The online upgrade is supported only from the StoreAll 6.x releases.
IMPORTANT: Complete all steps provided in the Table 1 (page 10).
Complete the following steps:
1. This release is only available through the registered release process. To obtain the ISO image, contact HP Support to register for the release and obtain access to the software dropbox.
2. Ensure that the /local/ibrix/ folder is empty prior to copying the contents of pkgfull. The upgrade will fail if the /local/ibrix/ folder contains leftover .rpm packages not listed in the build manifest.
3. Mount the pkg-full ISO image and copy the entire directory structure to the /local/ibrix/ directory, as shown in the following example:
mount -o loop /local/pkg/ibrix-pkgfull-FS_6.3.72+IAS_6.3.72-x86_64.signed.iso /mnt/
4. Change the permissions of all components in the /local/ibrix/ directory structure by entering the following command:
chmod -R 777 /local/ibrix/
5. Change to the /local/ibrix/ directory.
cd /local/ibrix/
6. Run the upgrade script and follow the on-screen directions:
./auto_online_ibrixupgrade
7. Upgrade Linux StoreAll clients. See “Upgrading Linux StoreAll clients” (page 18).
8. If you received a new license from HP, install it as described in “Licensing” (page 135).
After the upgrade
Complete these steps:
1. If your cluster nodes contain any 10Gb NICs, reboot these nodes to load the new driver. You must do this step before you upgrade the server firmware, as requested later in this procedure.
2. Upgrade your firmware as described in “Upgrading firmware” (page 136).
3. Start any remote replication, rebalancer, or data tiering tasks that were stopped before the upgrade.
4. If you have a file system version prior to version 6, you might have to make changes for snapshots and data retention, as mentioned in the following list:
Snapshots. Files used for snapshots must either be created on StoreAll software 6.0 or
later, or the pre-6.0 file system containing the files must be upgraded for snapshots. To upgrade a file system, use the upgrade60.sh utility. For more information, see
“Upgrading pre-6.0 file systems for software snapshots” (page 180).
Data retention. Files used for data retention (including WORM and auto-commit) must be
created on StoreAll software 6.1.1 or later, or the pre-6.1.1 file system containing the files must be upgraded for retention features. To upgrade a file system, use the ibrix_reten_adm -u -f FSNAME command. Additional steps are required before and after you run the ibrix_reten_adm -u -f FSNAME command. For more information, see “Upgrading pre-6.1.1 file systems for data retention features” (page 181).
5. If you have an Express Query enabled file system prior to version 6.3, manually complete each file system upgrade as described in “Required steps after the StoreAll Upgrade for pre-6.3
Express Query enabled file systems” (page 20).
Automated offline upgrades for StoreAll software 6.x to 6.3
Preparing for the upgrade
To prepare for the upgrade, complete the following steps:
1. Make sure you have completed all steps in the upgrade checklist (Table 1 (page 10)).
2. Stop all client I/O to the cluster or file systems. On the Linux client, use lsof </mountpoint> to show open files belonging to active processes.
3. Verify that all StoreAll file systems can be successfully unmounted from all FSN servers:
ibrix_umount -f fsname
Performing the upgrade
This upgrade method is supported only for upgrades from StoreAll software 6.x to the 6.3 release. Complete the following steps:
1. This release is only available through the registered release process. To obtain the ISO image, contact HP Support to register for the release and obtain access to the software dropbox.
2. Ensure that the /local/ibrix/ folder is empty prior to copying the contents of pkgfull. The upgrade will fail if the /local/ibrix/ folder contains leftover .rpm packages not listed in the build manifest.
3. Mount the pkg-full ISO image and copy the entire directory structure to the /local/ibrix/ directory, as shown in the following example:
mount -o loop /local/pkg/ibrix-pkgfull-FS_6.3.72+IAS_6.3.72-x86_64.signed.iso /mnt/
4. Change the permissions of all components in the /local/ibrix/ directory structure by entering the following command:
chmod -R 777 /local/ibrix/
5. Change to the /local/ibrix/ directory.
cd /local/ibrix/
6. Run the following upgrade script:
./auto_ibrixupgrade
The upgrade script automatically stops the necessary services and restarts them when the upgrade is complete. The upgrade script installs the Fusion Manager on all file serving nodes. The Fusion Manager is in active mode on the node where the upgrade was run, and is in passive mode on the other file serving nodes. If the cluster includes a dedicated Management Server, the Fusion Manager is installed in passive mode on that server.
7. Upgrade Linux StoreAll clients. See “Upgrading Linux StoreAll clients” (page 18).
8. If you received a new license from HP, install it as described in “Licensing” (page 135).
After the upgrade
Complete the following steps:
1. If your cluster nodes contain any 10Gb NICs, reboot these nodes to load the new driver. You must do this step before you upgrade the server firmware, as requested later in this procedure.
2. Upgrade your firmware as described in “Upgrading firmware” (page 136).
3. Mount file systems on Linux StoreAll clients.
4. If you have a file system version prior to version 6, you might have to make changes for snapshots and data retention, as mentioned in the following list:
Snapshots. Files used for snapshots must either be created on StoreAll software 6.0 or
later, or the pre-6.0 file system containing the files must be upgraded for snapshots. To upgrade a file system, use the upgrade60.sh utility. For more information, see
“Upgrading pre-6.0 file systems for software snapshots” (page 180).
Data retention. Files used for data retention (including WORM and auto-commit) must be
created on StoreAll software 6.1.1 or later, or the pre-6.1.1 file system containing the files must be upgraded for retention features. To upgrade a file system, use the ibrix_reten_adm -u -f FSNAME command. Additional steps are required before and after you run the ibrix_reten_adm -u -f FSNAME command. For more information, see “Upgrading pre-6.1.1 file systems for data retention features” (page 181).
5. If you have an Express Query enabled file system prior to version 6.3, manually complete each file system upgrade as described in “Required steps after the StoreAll Upgrade for pre-6.3
Express Query enabled file systems” (page 20).
Manual offline upgrades for StoreAll software 6.x to 6.3
Preparing for the upgrade
To prepare for the upgrade, complete the following steps:
1. Make sure you have completed all steps in the upgrade checklist (Table 1 (page 10)).
2. Verify that ssh shared keys have been set up. To do this, run the following command on the node hosting the active instance of the agile Fusion Manager:
ssh <server_name>
Repeat this command for each node in the cluster.
3. Verify that all file serving nodes (FSN servers) have separate file systems mounted on the following partitions by using the df command:
/
/local
/stage
/alt
4. Verify that all FSN servers have a minimum of 4 GB of free space on the /local partition by using the df command.
5. Verify that no FSN server reports any partition as 100% full (each partition should have at least 5% free space) by using the df command (a combined df example follows this procedure).
6. Note any custom tuning parameters, such as file system mount options. When the upgrade is complete, you can reapply the parameters.
7. Stop all client I/O to the cluster or file systems. On the Linux client, use lsof </mountpoint> to show open files belonging to active processes.
8. On the active Fusion Manager, enter the following command to place the Fusion Manager into maintenance mode:
<ibrixhome>/bin/ibrix_fm -m nofmfailover -P -A
9. On the active Fusion Manager node, disable automated failover on all file serving nodes:
<ibrixhome>/bin/ibrix_server -m -U
10. Run the following command to verify that automated failover is off. In the output, the HA column should display off.
<ibrixhome>/bin/ibrix_server -l
11. Unmount file systems on Linux StoreAll clients:
ibrix_umount -f MOUNTPOINT
12. Stop the SMB, NFS, and NDMP services on all nodes. Run the following commands on the node hosting the active Fusion Manager:
ibrix_server -s -t cifs -c stop
ibrix_server -s -t nfs -c stop
ibrix_server -s -t ndmp -c stop
If you are using SMB, verify that all likewise services are down on all file serving nodes:
ps -ef | grep likewise
Use kill -9 to stop any likewise services that are still running. If you are using NFS, verify that all NFS processes are stopped:
ps -ef | grep nfs
If necessary, use the following command to stop NFS services:
/etc/init.d/nfs stop
Use kill -9 to stop any NFS processes that are still running. If necessary, run the following command on all nodes to find any open file handles for the
mounted file systems:
lsof </mountpoint>
Use kill -9 to stop any processes that still have open file handles on the file systems.
13. Unmount each file system manually:
ibrix_umount -f FSNAME
Wait up to 15 minutes for the file systems to unmount. Troubleshoot any issues with unmounting file systems before proceeding with the upgrade.
See “File system unmount issues” (page 23).
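For the partition checks in steps 3 through 5, a single df invocation on each FSN server shows the required mount points and their free space (a sketch; the exact output format varies by release):
df -h / /local /stage /alt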
Performing the upgrade manually
This upgrade method is supported only for upgrades from StoreAll software 6.x to the 6.3 release. Complete the following steps first for the server running the active Fusion Manager and then for the servers running the passive Fusion Managers:
1. This release is only available through the registered release process. To obtain the ISO image, contact HP Support to register for the release and obtain access to the software dropbox.
2. Ensure that the /local/ibrix/ folder is empty prior to copying the contents of pkgfull. The upgrade will fail if the /local/ibrix/ folder contains leftover .rpm packages not listed in the build manifest.
3. Mount the pkg-full ISO image and copy the entire directory structure to the /local/ibrix/ directory, as shown in the following example:
mount -o loop /local/pkg/ibrix-pkgfull-FS_6.3.72+IAS_6.3.72-x86_64.signed.iso /mnt/
4. Change the permissions of all components in the /local/ibrix/ directory structure by entering the following command:
chmod -R 777 /local/ibrix/
5. Change to the /local/ibrix/ directory, and then run the upgrade script:
cd /local/ibrix/
./ibrixupgrade -f
The upgrade script automatically stops the necessary services and restarts them when the upgrade is complete. The upgrade script installs the Fusion Manager on the server.
6. After completing the previous steps for the server running the active Fusion Manager, repeat the steps for each of the servers running the passive Fusion Manager.
7. Upgrade Linux StoreAll clients. See “Upgrading Linux StoreAll clients” (page 18).
8. If you received a new license from HP, install it as described in “Licensing” (page 135).
After the upgrade
Complete the following steps:
1. If your cluster nodes contain any 10Gb NICs, reboot these nodes to load the new driver. You must do this step before you upgrade the server firmware, as requested later in this procedure.
2. Upgrade your firmware as described in “Upgrading firmware” (page 136).
3. Run the following command to rediscover physical volumes:
ibrix_pv -a
4. Apply any custom tuning parameters, such as mount options.
5. Remount all file systems:
ibrix_mount -f <fsname> -m </mountpoint>
6. Re-enable High Availability if used:
ibrix_server -m
7. Start any remote replication, rebalancer, or data tiering tasks that were stopped before the upgrade.
8. If you are using SMB, set the following parameters to synchronize the SMB software and the Fusion Manager database:
smb signing enabled
smb signing required
ignore_writethru
Use ibrix_cifsconfig to set the parameters, specifying the value appropriate for your cluster (1=enabled, 0=disabled). The following examples set the parameters to the default values for the 6.3 release:
ibrix_cifsconfig -t -S "smb_signing_enabled=0, smb_signing_required=0"
ibrix_cifsconfig -t -S "ignore_writethru=1"
The SMB signing feature specifies whether clients must support SMB signing to access SMB shares. See the HP StoreAll Storage File System User Guide for more information about this feature. When ignore_writethru is enabled, StoreAll software ignores writethru buffering to improve SMB write performance on some user applications that request it.
9. Mount file systems on Linux StoreAll clients.
10. If you have a file system version prior to version 6, you might have to make changes for snapshots and data retention, as mentioned in the following list:
Snapshots. Files used for snapshots must either be created on StoreAll software 6.0 or
later, or the pre-6.0 file system containing the files must be upgraded for snapshots. To upgrade a file system, use the upgrade60.sh utility. For more information, see
“Upgrading pre-6.0 file systems for software snapshots” (page 180).
Data retention. Files used for data retention (including WORM and auto-commit) must be
created on StoreAll software 6.1.1 or later, or the pre-6.1.1 file system containing the files must be upgraded for retention features. To upgrade a file system, use the ibrix_reten_adm -u -f FSNAME command. Additional steps are required before and after you run the ibrix_reten_adm -u -f FSNAME command. For more information, see “Upgrading pre-6.1.1 file systems for data retention features” (page 181).
11. If you have an Express Query enabled file system prior to version 6.3, manually complete each file system upgrade as described in “Required steps after the StoreAll Upgrade for pre-6.3
Express Query enabled file systems” (page 20).
Upgrading Linux StoreAll clients
Be sure to upgrade the cluster nodes before upgrading Linux StoreAll clients. Complete the following steps on each client:
1. Download the latest HP StoreAll client 6.3 package.
2. Expand the tar file.
3. Run the upgrade script:
./ibrixupgrade -tc -f
The upgrade software automatically stops the necessary services and restarts them when the upgrade is complete.
4. Execute the following command to verify the client is running StoreAll software:
/etc/init.d/ibrix_client status
IBRIX Filesystem Drivers loaded
IBRIX IAD Server (pid 3208) running...
The IAD service should be running, as shown in the previous sample output. If it is not, contact HP Support.
Installing a minor kernel update on Linux clients
The StoreAll client software is upgraded automatically when you install a compatible Linux minor kernel update.
If you are planning to install a minor kernel update, first run the following command to verify that the update is compatible with the StoreAll client software:
/usr/local/ibrix/bin/verify_client_update <kernel_update_version>
The following example is for a RHEL 4.8 client with kernel version 2.6.9-89.ELsmp:
# /usr/local/ibrix/bin/verify_client_update 2.6.9-89.35.1.ELsmp
Kernel update 2.6.9-89.35.1.ELsmp is compatible.
If the minor kernel update is compatible, install the update with the vendor RPM and reboot the system. The StoreAll client software is then automatically updated with the new kernel, and StoreAll client services start automatically. Use the ibrix_version -l -C command to verify the kernel version on the client.
NOTE: To use the verify_client_update command, the StoreAll client software must be installed.
Upgrading Windows StoreAll clients
Complete the following steps on each client:
1. Remove the old Windows StoreAll client software using the Add or Remove Programs utility
in the Control Panel.
2. Copy the Windows StoreAll client MSI file for the upgrade to the machine.
3. Launch the Windows Installer and follow the instructions to complete the upgrade.
4. Register the Windows StoreAll client again with the cluster and check the option to Start Service
after Registration.
5. Check Administrative Tools | Services to verify that the StoreAll client service is started.
6. Launch the Windows StoreAll client. On the Active Directory Settings tab, click Update to
retrieve the current Active Directory settings.
7. Mount file systems using the StoreAll Windows client GUI.
NOTE: If you are using Remote Desktop to perform an upgrade, you must log out and log back
in to see the drive mounted.
Upgrading pre-6.3 Express Query enabled file systems
The internal database schema format of Express Query enabled file systems changed between releases 6.2.x and 6.3. Each file system with Express Query enabled must be manually upgraded to 6.3. This section has instructions to be run before and after the StoreAll upgrade, on each of those file systems.
Required steps before the StoreAll Upgrade for pre-6.3 Express Query enabled file systems
These steps are required before the StoreAll Upgrade:
1. Mount all Express Query file systems on the cluster to be upgraded if they are not mounted yet.
2. Save your custom metadata by entering the following command:
/usr/local/ibrix/bin/MDExport.pl --dbconfig /usr/local/Metabox/scripts/startup.xml --database <FSNAME>
--outputfile /tmp/custAttributes.csv --user ibrix
3. Save your audit log data by entering the following commands:
ibrix_audit_reports -t time -f <FSNAME>
cp <path to report file printed from previous command> /tmp/auditData.csv
4. Disable auditing by entering the following command:
ibrix_fs -A -f <FSNAME> -oa audit_mode=off
In this instance <FSNAME> is the file system.
5. If any archive API shares exist for the file system, delete them.
NOTE: To list all HTTP shares, enter the following command:
ibrix_httpshare -l
To list only REST API (Object API) shares, enter the following command:
ibrix_httpshare -l -f <FSNAME> -v 1 | grep "objectapi: true" | awk '{ print $2 }'
In this instance <FSNAME> is the file system.
Delete all HTTP shares, whether regular or REST API (Object API), by entering the following command:
ibrix_httpshare -d -f <FSNAME>
In this instance <FSNAME> is the file system.
Delete a specific REST API (Object API) share by entering the following command:
ibrix_httpshare -d <SHARENAME> -c <PROFILENAME> -t <VHOSTNAME>
In this instance:
<SHARENAME> is the share name.
<PROFILENAME> is the profile name.
<VHOSTNAME> is the virtual host name.
6. Disable Express Query by entering the following command:
ibrix_fs -T -D -f <FSNAME>
7. Shut down Archiving daemons for Express Query by entering the following command:
ibrix_archiving -S -F
8. Delete the internal database files for this file system by entering the following command:
rm -rf <FS_MOUNTPOINT>/.archiving/database
In this instance <FS_MOUNTPOINT> is the file system mount point.
Required steps after the StoreAll Upgrade for pre-6.3 Express Query enabled file systems
These steps are required after the StoreAll Upgrade:
1. Restart the Archiving daemons for Express Query.
2. Re-enable Express Query on the file systems you disabled it from before by entering the following command:
ibrix_fs -T -E -f <FSNAME>
In this instance <FSNAME> is the file system. Express Query will begin resynchronizing (repopulating) a new database for this file system.
3. Re-enable auditing if you had it running before (the default) by entering the following command:
ibrix_fs -A -f <FSNAME> -oa audit_mode=on
In this instance <FSNAME> is the file system.
4. Re-create REST API (Object API) shares deleted before the upgrade on each node in the cluster (if desired) by entering the following command:
NOTE: The REST API (Object API) functionality has expanded, and any REST API (Object
API) shares you created in previous releases are now referred to as HTTP-StoreAll REST API shares in file-compatible mode. The 6.3 release is also introducing a new type of share called HTTP-StoreAll REST API share in Object mode.
ibrix_httpshare -a <SHARENAME> -c <PROFILENAME> -t <VHOSTNAME> -f <FSNAME> -p <DIRPATH> -P <URLPATH> -S
ibrixRestApiMode=filecompatible, anonymous=true
In this instance:
<SHARENAME> is the share name.
<PROFILENAME> is the profile name.
<VHOSTNAME> is the virtual host name.
<FSNAME> is the file system.
<DIRPATH> is the directory path.
<URLPATH> is the URL path.
The values passed to -S (ibrixRestApiMode=filecompatible, anonymous=true in this example) are the share settings.
5. Wait for the resynchronizer to complete by entering the following command until its output shows <FSNAME>: OK (a polling sketch follows this procedure):
ibrix_archiving -l
6. Restore your audit log data by entering the following command:
MDImport -f <FSNAME> -n /tmp/auditData.csv -t audit
In this instance <FSNAME> is the file system.
7. Restore your custom metadata by entering the following command:
MDImport -f <FSNAME> -n /tmp/custAttributes.csv -t custom
In this instance <FSNAME> is the file system.
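For step 5, one way to poll the resynchronizer status until it reports <FSNAME>: OK is the standard watch utility (a sketch; adjust the interval as needed):
watch -n 60 ibrix_archiving -l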
Troubleshooting upgrade issues
If the upgrade does not complete successfully, check the following items. For additional assistance, contact HP Support.
Automatic upgrade
Check the following:
If the initial execution of /usr/local/ibrix/setup/upgrade fails, check
/usr/local/ibrix/setup/upgrade.log for errors. It is imperative that all servers are
up and running the StoreAll software before you execute the upgrade script.
If the install of the new OS fails, try rebooting the node. If the install does not begin after the reboot, power cycle the machine and select the upgrade line from the grub boot menu.
After the upgrade, check /usr/local/ibrix/setup/logs/postupgrade.log for errors
or warnings.
If configuration restore fails on any node, look at
/usr/local/ibrix/autocfg/logs/appliance.log on that node to determine which feature restore failed. Look at the specific feature log file under /usr/local/ibrix/setup/logs/ for more detailed information.
To retry the copy of configuration, use the following command:
/usr/local/ibrix/autocfg/bin/ibrixapp upgrade -f -s
If the install of the new image succeeds, but the configuration restore fails and you need to
revert the server to the previous install, run the following command and then reboot the machine. This step causes the server to boot from the old version (the alternate partition).
/usr/local/ibrix/setup/boot_info -r
If the public network interface is down and inaccessible for any node, power cycle that node.
NOTE: Each node stores its ibrixupgrade.log file in /tmp.
Manual upgrade
Check the following:
If the restore script fails, check /usr/local/ibrix/setup/logs/restore.log for
details.
If configuration restore fails, look at /usr/local/ibrix/autocfg/logs/appliance.log
to determine which feature restore failed. Look at the specific feature log file under /usr/local/ibrix/setup/logs/ for more detailed information.
To retry the copy of configuration, use the following command:
/usr/local/ibrix/autocfg/bin/ibrixapp upgrade -f -s
Offline upgrade fails because iLO firmware is out of date
If the iLO2 firmware is out of date on a node, the auto_ibrixupgrade script will fail. The /usr/local/ibrix/setup/logs/auto_ibrixupgrade.log file reports the failure and describes how to update the firmware.
After updating the firmware, run the following command on the node to complete the StoreAll software upgrade:
/local/ibrix/ibrixupgrade -f
Node is not registered with the cluster network
Nodes hosting the agile Fusion Manager must be registered with the cluster network. If the ibrix_fm command reports that the IP address for a node is on the user network, you will need to reassign the IP address to the cluster network. For example, the following commands report that node ib51-101, which is hosting the active Fusion Manager, has an IP address on the user network (192.168.51.101) instead of the cluster network.
[root@ib51-101 ibrix]# ibrix_fm -i
FusionServer: ib51-101 (active, quorum is running)
==================================================
[root@ib51-101 ibrix]# ibrix_fm -l
NAME     IP ADDRESS
-------- --------------
ib51-101 192.168.51.101
ib51-102 10.10.51.102
1. If the node is hosting the active Fusion Manager, as in this example, stop the Fusion Manager on that node:
[root@ib51-101 ibrix]# /etc/init.d/ibrix_fusionmanager stop
Stopping Fusion Manager Daemon                             [ OK ]
[root@ib51-101 ibrix]#
2. On the node now hosting the active Fusion Manager (ib51-102 in the example), unregister node ib51-101:
[root@ib51-102 ~]# ibrix_fm -u ib51-101
Command succeeded!
3. On the node hosting the active Fusion Manager, register node ib51-101 and assign the correct IP address:
[root@ib51-102 ~]# ibrix_fm -R ib51-101 -I 10.10.51.101
Command succeeded!
NOTE: When registering a Fusion Manager, be sure the hostname specified with -R matches
the hostname of the server.
The ibrix_fm commands now show that node ib51-101 has the correct IP address and node
ib51-102 is hosting the active Fusion Manager.
[root@ib51-102 ~]# ibrix_fm -f
NAME     IP ADDRESS
-------- --------------
ib51-101 10.10.51.101
ib51-102 10.10.51.102
[root@ib51-102 ~]# ibrix_fm -i
FusionServer: ib51-102 (active, quorum is running)
==================================================
File system unmount issues
If a file system does not unmount successfully, perform the following steps on all servers:
1. Run the following commands (a sketch for running them on all servers at once follows this procedure):
chkconfig ibrix_server off
chkconfig ibrix_ndmp off
chkconfig ibrix_fusionmanager off
2. Reboot all servers.
3. Run the following commands to move the services back to the on state. The commands do not start the services.
chkconfig ibrix_server on
chkconfig ibrix_ndmp on
chkconfig ibrix_fusionmanager on
4. Run the following commands to start the services:
service ibrix_fusionmanager start
service ibrix_server start
5. Unmount the file systems and continue with the upgrade procedure.
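The following sketch shows one way to run the step 1 commands on all servers from a single node. The hostnames are hypothetical, and password-less ssh between the nodes is assumed:
for h in node1 node2 node3 node4; do
  ssh $h "chkconfig ibrix_server off; chkconfig ibrix_ndmp off; chkconfig ibrix_fusionmanager off"
done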
File system in MIF state after StoreAll software 6.3 upgrade
If an Express Query enabled file system ends up in the MIF state after the StoreAll software upgrade process completes (ibrix_archiving -l prints <FSNAME>: MIF), check the MIF status by running the following command:
cat /<FSNAME>/.archiving/database/serialization/ManualInterventionFailure
If the command’s output displays Version mismatch, upgrade needed (as shown in the following output), the required steps were not performed as described in “Required steps after the StoreAll Upgrade for pre-6.3 Express Query enabled file systems” (page 20).
MIF:Version mismatch, upgrade needed. (error code 14)
If the command’s output does not include Version mismatch, upgrade needed, see “Troubleshooting an Express Query Manual Intervention Failure (MIF)” (page 152).
Perform the following steps only if the command’s output includes Version mismatch, upgrade needed:
1. Disable auditing by entering the following command:
ibrix_fs -A -f <FSNAME> -oa audit_mode=off
In this instance <FSNAME> is the file system.
2. Disable Express Query by entering the following command:
ibrix_fs -T -D -f <FSNAME>
In this instance <FSNAME> is the file system.
3. Delete the internal database files for this file system by entering the following command:
rm -rf <FS_MOUNTPOINT>/.archiving/database
In this instance <FS_MOUNTPOINT> is the file system mount point.
4. Clear the MIF condition by running the following command:
ibrix_archiving -C <FSNAME>
In this instance <FSNAME> is the file system.
5. Re-enable Express Query on the file systems:
ibrix_fs -T -E -f <FSNAME>
In this instance <FSNAME> is the file system. Express Query will begin resynchronizing (repopulating) a new database for this file system.
6. Re-enable auditing if you had it running before (the default).
ibrix_fs -A -f <FSNAME> -oa audit_mode=on
In this instance <FSNAME> is the file system.
7. Restore your audit log data:
MDImport -f <FSNAME> -n /tmp/auditData.csv -t audit
In this instance <FSNAME> is the file system.
8. Restore your custom metadata:
MDImport -f <FSNAME> -n /tmp/custAttributes.csv -t custom
In this instance <FSNAME> is the file system.
2 Product description
The HP X9720 and 9730 Storage systems are scalable, network-attached storage (NAS) products. Each system combines HP StoreAll software with HP server and storage hardware to create a cluster of file serving nodes.
System features
The X9720 and 9730 Storage provide the following features:
Segmented, scalable file system under a single namespace
NFS, SMB (Server Message Block), FTP, and HTTP support for accessing file system data
Centralized CLI and GUI for cluster management
Policy management
Continuous remote replication
Dual redundant paths to all storage components
Gigabytes per second of throughput
IMPORTANT: It is important to keep regular backups of the cluster configuration. See “Backing
up the Fusion Manager configuration” (page 76) for more information.
System components
IMPORTANT: All software included with the X9720/9730 Storage is for the sole purpose of
operating the system. Do not add, remove, or change any software unless instructed to do so by HP-authorized personnel.
For information about 9730 system components and cabling, see “StoreAll 9730 component and
cabling diagrams” (page 201).
For information about X9720 system components and cabling, see “The IBRIX X9720 component
and cabling diagrams” (page 207).
For a complete list of system components, see the HP StoreAll Storage QuickSpecs, which are available at:
http://www.hp.com/go/StoreAll
HP StoreAll software features
HP StoreAll software is a scale-out, network-attached storage solution including a parallel file system for clusters, an integrated volume manager, high-availability features such as automatic failover of multiple components, and a centralized management interface. StoreAll software can scale to thousands of nodes.
Based on a segmented file system architecture, StoreAll software integrates I/O and storage systems into a single clustered environment that can be shared across multiple applications and managed from a central Fusion Manager.
StoreAll software is designed to operate with high-performance computing applications that require high I/O bandwidth, high IOPS throughput, and scalable configurations.
Some of the key features and benefits are as follows:
Scalable configuration. You can add servers to scale performance and add storage devices
to scale capacity.
Single namespace. All directories and files are contained in the same namespace.
Multiple environments. Operates in both the SAN and DAS environments.
High availability. The high-availability software protects servers.
Tuning capability. The system can be tuned for large or small-block I/O.
Flexible configuration. Segments can be migrated dynamically for rebalancing and data
tiering.
High availability and redundancy
The segmented architecture is the basis for fault resilience—loss of access to one or more segments does not render the entire file system inaccessible. Individual segments can be taken offline temporarily for maintenance operations and then returned to the file system.
To ensure continuous data access, StoreAll software provides manual and automated failover protection at various points:
Server. A failed node is powered down and a designated standby server assumes all of its
segment management duties.
Segment. Ownership of each segment on a failed node is transferred to a designated standby
server.
Network interface. The IP address of a failed network interface is transferred to a standby
network interface until the original network interface is operational again.
Storage connection. For servers with HBA-protected Fibre Channel access, failure of the HBA
triggers failover of the node to a designated standby server.
3 Getting started
This chapter describes how to log in to the system, boot the system and individual server blades, change passwords, and back up the Fusion Manager configuration. It also describes the StoreAll software management interfaces.
IMPORTANT: Follow these guidelines when using your system:
Do not modify any parameters of the operating system or kernel, or update any part of the
X9720/9730 Storage unless instructed to do so by HP; otherwise, the system could fail to operate properly.
File serving nodes are tuned for file serving operations. With the exception of supported
backup programs, do not run other applications directly on the nodes.
Setting up the X9720/9730 Storage
An HP service specialist sets up the system at your site, including the following tasks:
Installation steps
Before starting the installation, ensure that the product components are in the location where
they will be installed. Remove the product from the shipping cartons, confirm the contents of each carton against the list of included items, check for any physical damage to the exterior of the product, and connect the product to the power and network connections that you provide.
Review your server, network, and storage environment relevant to the HP Enterprise NAS
product implementation to validate that prerequisites have been met.
Validate that your file system performance, availability, and manageability requirements have
not changed since the service planning phase. Finalize the HP Enterprise NAS product implementation plan and software configuration.
Implement the documented and agreed-upon configuration based on the information you
provided on the pre-delivery checklist.
Document configuration details.
Additional configuration steps
When your system is up and running, you can continue configuring the cluster and file systems. The Management Console and CLI are used to perform most operations. (Some features described here may be configured for you as part of the system installation.)
Cluster. Configure the following as needed:
Firewall ports. See “Configuring ports for a firewall” (page 35)
HP Insight Remote Support and Phone Home. See “Configuring HP Insight Remote Support
on StoreAll systems” (page 36).
Virtual interfaces for client access. See “Configuring virtual interfaces for client access”
(page 49).
Cluster event notification through email or SNMP. See “Configuring cluster event notification”
(page 70).
Fusion Manager backups. See “Backing up the Fusion Manager configuration” (page 76).
NDMP backups. See “Using NDMP backup applications” (page 76).
Statistics tool. See “Using the Statistics tool” (page 108).
Ibrix Collect. See “Collecting information for HP Support with the IbrixCollect” (page 143).
File systems. Set up the following features as needed:
NFS, SMB (Server Message Block), FTP, or HTTP. Configure the methods you will use to access
file system data.
Quotas. Configure user, group, and directory tree quotas as needed.
Remote replication. Use this feature to replicate changes in a source file system on one cluster
to a target file system on either the same cluster or a second cluster.
Data retention and validation. Use this feature to manage WORM and retained files.
Antivirus support. This feature is used with supported Antivirus software, allowing you to scan
files on a StoreAll file system.
StoreAll software snapshots. This feature allows you to capture a point-in-time copy of a file
system or directory for online backup purposes and to simplify recovery of files from accidental deletion. Users can access the file system or directory as it appeared at the instant of the snapshot.
File allocation. Use this feature to specify the manner in which segments are selected for storing
new files and directories.
Data tiering. Use this feature to move files to specific tiers based on file attributes.
For more information about these file system features, see the HP StoreAll Storage File System User
Guide.
Localization support
Red Hat Enterprise Linux 5 uses the UTF-8 (8-bit Unicode Transformation Format) encoding for supported locales. This allows you to create, edit and view documents written in different locales using UTF-8. StoreAll software supports modifying the /etc/sysconfig/i18n configuration file for your locale. The following example sets the LANG and SUPPORTED variables for multiple character sets:
LANG="ko_KR.utf8" SUPPORTED="en_US.utf8:en_US:en:ko_KR.utf8:ko_KR:ko:zh_CN.utf8:zh_CN:zh" SYSFONT="lat0-sun16" SYSFONTACM="iso15"
Logging in to the system
Using the network
Use ssh to log in remotely from another host. You can log in to any server using any configured site network interface (eth1, eth2, or bond1).
With ssh and the root user, after you log in to any server, your .ssh/known_hosts file will work with any server in the cluster.
The original server blades in your cluster are configured to support password-less ssh. After you have connected to one server, you can connect to the other servers without specifying the root password again. To enable the same support for other server blades, or to access the system itself without specifying a password, add the keys of the other servers to .ssh/authorized_keys on each server blade.
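For example, running the following command from one server blade copies that blade's public key into the .ssh/authorized_keys file on another server (ssh-copy-id is part of standard OpenSSH client installations; <server_name> is a placeholder for the target server):
ssh-copy-id root@<server_name>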
Using the TFT keyboard/monitor
If the site network is down, you can log in to the console as follows:
1. Pull out the keyboard monitor (See “Front view of a base cabinet” (page 207)).
2. Access the on-screen display (OSD) main dialog box by pressing Print Scrn or by pressing Ctrl twice within one second.
3. Double-click the first server name.
4. Log in as normal.
NOTE: By default, the first port is connected with the dongle to the front of blade 1 (that is, server
1). If server 1 is down, move the dongle to another blade.
Using the serial link on the Onboard Administrator
If you are connected to a terminal server, you can log in through the serial link on the Onboard Administrator.
Booting the system and individual server blades
Before booting the system, ensure that all of the system components other than the server blades—the capacity blocks or performance modules and so on—are turned on. By default, server blades boot whenever power is applied to the system performance chassis (c-Class Blade enclosure). If all server blades are powered off, you can boot the system as follows:
1. Press the power button on server blade 1.
2. Log in as root to server 1.
3. Power on the remaining server blades:
ibrix_server -P on -h <hostname>
NOTE: Alternatively, press the power button on all of the remaining servers. There is no
need to wait for the first server blade to boot.
Management interfaces
Cluster operations are managed through the StoreAll Fusion Manager, which provides both a Management Console and a CLI. Most operations can be performed from either the StoreAll Management Console or the CLI.
The following operations can be performed only from the CLI:
SNMP configuration (ibrix_snmpagent, ibrix_snmpgroup, ibrix_snmptrap,
ibrix_snmpuser, ibrix_snmpview)
Health checks (ibrix_haconfig, ibrix_health, ibrix_healthconfig)
Raw storage management (ibrix_pv, ibrix_vg, ibrix_lv)
Fusion Manager operations (ibrix_fm) and Fusion Manager tuning (ibrix_fm_tune)
File system checks (ibrix_fsck)
Kernel profiling (ibrix_profile)
Cluster configuration (ibrix_clusterconfig)
Configuration database consistency (ibrix_dbck)
Shell task management (ibrix_shell)
The following operations can be performed only from the StoreAll Management Console:
Scheduling recurring data validation scans
Scheduling recurring software snapshots
Using the StoreAll Management Console
The StoreAll Management Console is a browser-based interface to the Fusion Manager. See the release notes for the supported browsers and other software required to view charts on the dashboard. You can open multiple Management Console windows as necessary.
If you are using HTTP to access the Management Console, open a web browser and navigate to the following location, specifying port 80:
http://<management_console_IP>:80/fusion
If you are using HTTPS to access the Management Console, navigate to the following location, specifying port 443:
https://<management_console_IP>:443/fusion
In these URLs, <management_console_IP> is the IP address of the Fusion Manager user VIF. The Management Console prompts for your user name and password. The default administrative
user is ibrix. Enter the password that was assigned to this user when the system was installed. (You can change the password using the Linux passwd command.) To allow other users to access the Management Console, see “Adding user accounts for Management Console access” (page 33).
Upon login, the Management Console dashboard opens, allowing you to monitor the entire cluster. (See the online help for information about all Management Console displays and operations.) There are three parts to the dashboard: System Status, Cluster Overview, and the Navigator.
System Status
The System Status section lists the number of cluster events that have occurred in the last 24 hours. There are three types of events:
Alerts. Disruptive events that can result in loss of access to file system data. Examples are a segment that is unavailable or a server that cannot be accessed.
Warnings. Potentially disruptive conditions where file system access is not lost, but if the situation is not addressed, it can escalate to an alert condition. Examples are a very high server CPU utilization level or a quota limit close to the maximum.
Information. Normal events that change the cluster. Examples are mounting a file system or creating a segment.
Cluster Overview
The Cluster Overview provides the following information:
Capacity
The amount of cluster storage space that is currently free or in use.
File systems
The current health status of the file systems in the cluster. The overview reports the number of file systems in each state (healthy, experiencing a warning, experiencing an alert, or unknown).
Segment Servers
The current health status of the file serving nodes in the cluster. The overview reports the number of nodes in each state (healthy, experiencing a warning, experiencing an alert, or unknown).
Services
Whether the specified file system services are currently running:
One or more tasks are running.
No tasks are running.
Statistics
Historical performance graphs for the following items:
Network I/O (MB/s)
Disk I/O (MB/s)
CPU usage (%)
Memory usage (%)
On each graph, the X-axis represents time and the Y-axis represents performance. Use the Statistics menu to select the servers to monitor (up to two), to change the maximum
value for the Y-axis, and to show or hide resource usage distribution for CPU and memory.
Recent Events
The most recent cluster events. Use the Recent Events menu to select the type of events to display.
You can also access certain menu items directly from the Cluster Overview. Mouse over the Capacity, Filesystems or Segment Server indicators to see the available options.
Navigator
The Navigator appears on the left side of the window and displays the cluster hierarchy. You can use the Navigator to drill down in the cluster configuration to add, view, or change cluster objects such as file systems or storage, and to initiate or view tasks such as snapshots or replication. When you select an object, a details page shows a summary for that object. The lower Navigator allows you to view details for the selected object, or to initiate a task. In the following example, we selected Filesystems in the upper Navigator and Mountpoints in the lower Navigator to see details about the mounts for file system ifs1.
NOTE: When you perform an operation on the GUI, a spinning finger is displayed until the
operation is complete. However, if you use Windows Remote Desktop to access the GUI, the spinning finger is not displayed.
Customizing the GUI
For most tables in the GUI, you can specify the columns that you want to display and the sort order of each column. When this feature is available, mousing over a column causes the label to change color and a pointer to appear. Click the pointer to see the available options. In the following
example, you can sort the contents of the Mountpoint column in ascending or descending order, and you can select the columns that you want to appear in the display.
Adding user accounts for Management Console access
StoreAll software supports administrative and user roles. When users log in under the administrative role, they can configure the cluster and initiate operations such as remote replication or snapshots. When users log in under the user role, they can view the cluster configuration and status, but cannot make configuration changes or initiate operations. The default administrative user name is ibrix. The default regular username is ibrixuser.
User names for the administrative and user roles are defined in the /etc/group file. Administrative users are specified in the ibrix-admin group, and regular users are specified in the ibrix-user group. These groups are created when StoreAll software is installed. The following entries in the
/etc/group file show the default users in these groups:
ibrix-admin:x:501:root,ibrix
ibrix-user:x:502:ibrix,ibrixUser,ibrixuser
You can add other users to these groups as needed, using Linux procedures. For example:
adduser -G ibrix-<groupname> <username>
When using the adduser command, be sure to include the -G option.
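For example, the following command adds a hypothetical account named jsmith to the regular-user group:
adduser -G ibrix-user jsmith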
Using the CLI
The administrative commands described in this guide must be executed on the Fusion Manager host and require root privileges. The commands are located in $IBRIXHOME/bin. For complete information about the commands, see the HP StoreAll Network Storage System CLI Reference Guide.
When using ssh to access the machine hosting the Fusion Manager, specify the IP address of the Fusion Manager user VIF.
Starting the array management software
Depending on the array type, you can launch the array management software from the GUI. In the Navigator, select Vendor Storage, select your array from the Vendor Storage page, and click Launch Storage Management.
StoreAll client interfaces
StoreAll clients can access the Fusion Manager as follows:
Linux clients. Use Linux client commands for tasks such as mounting or unmounting file systems
and displaying statistics. See the HP StoreAll Storage CLI Reference Guide for details about these commands.
Windows clients. Use the Windows client GUI for tasks such as mounting or unmounting file
systems and registering Windows clients.
Using the Windows StoreAll client GUI
The Windows StoreAll client GUI is the client interface to the Fusion Manager. To open the GUI, double-click the desktop icon or select the StoreAll client program from the Start menu on the client. The client program contains tabs organized by function.
NOTE: The Windows StoreAll client GUI can be started only by users with Administrative
privileges.
Status. Shows the client’s Fusion Manager registration status and mounted file systems, and
provides access to the IAD log for troubleshooting.
Registration. Registers the client with the Fusion Manager, as described in the HP StoreAll
Storage Installation Guide.
Mount. Mounts a file system. Select the Cluster Name from the list (the cluster name is the
Fusion Manager name), enter the name of the file system to mount, select a drive, and then click Mount. (If you are using Remote Desktop to access the client and the drive letter does not appear, log out and log in again.)
Umount. Unmounts a file system.
Tune Host. Tunable parameters include the NIC to prefer (the client uses the cluster interface
by default unless a different network interface is preferred for it), the communications protocol (UDP or TCP), and the number of server threads to use.
Active Directory Settings. Displays current Active Directory settings.
For more information, see the client GUI online help.
StoreAll software manpages
StoreAll software provides manpages for most of its commands. To view the manpages, set the MANPATH variable to include the path to the manpages and then export it. The manpages are in the $IBRIXHOME/man directory. For example, if $IBRIXHOME is /usr/local/ibrix (the default), set the MANPATH variable as follows and then export the variable:
MANPATH=$MANPATH:/usr/local/ibrix/man
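Then export the variable so the setting takes effect in the current shell:
export MANPATH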
Changing passwords
IMPORTANT: The hpspAdmin user account is added during the StoreAll software installation
and is used internally. Do not remove this account or change its password.
You can change the following passwords on your system:
Hardware passwords. See the documentation for the specific hardware for more information.
Root password. Use the passwd(8) command on each server.
StoreAll software user password. This password is created during installation and is used to
log in to the GUI. The default is ibrix. You can change the password using the Linux passwd command.
# passwd ibrix
You will be prompted to enter the new password.
Configuring ports for a firewall
IMPORTANT: To avoid unintended consequences, HP recommends that you configure the firewall
during scheduled maintenance times.
When configuring a firewall, you should be aware of the following:
SELinux should be disabled.
By default, NFS uses random port numbers for operations such as mounting and locking. These ports must be fixed so that they can be listed as exceptions in a firewall configuration file. For example, you will need to lock specific ports for rpc.statd, rpc.lockd, rpc.mountd, and rpc.quotad (a sketch of one way to pin these ports follows the port list).
It is best to allow all ICMP types on all networks; however, you can limit ICMP to types 0, 3,
8, and 11 if necessary.
Be sure to open the following ports:
SSH: 22/tcp
SSH for Onboard Administrator (OA); only for X9720/9730 blades: 9022/tcp
NTP: 123/tcp, 123/udp
Multicast DNS (224.0.0.251): 5353/udp
netperf tool: 12865/tcp
Fusion Manager to file serving nodes: 80/tcp, 443/tcp
Fusion Manager and StoreAll file system: 5432/tcp, 8008/tcp, 9002/tcp, 9005/tcp, 9008/tcp, 9009/tcp, 9200/tcp
Between file serving nodes and NFS clients (user network): 2049/tcp, 2049/udp
NFS: 111/tcp, 111/udp
RPC: 875/tcp, 875/udp
quota: 32803/tcp
lockmanager: 32769/udp
lockmanager: 892/tcp, 892/udp
mount daemon: 662/tcp, 662/udp
stat: 2020/tcp, 2020/udp
stat outgoing (reserved for use by a custom application (CMU); can be disabled if not used): 4000:4003/tcp
Between file serving nodes and SMB clients (user network): 137/udp, 138/udp, 139/tcp, 445/tcp
Between file serving nodes and StoreAll clients (user network): 9000:9002/tcp, 9000:9200/udp
Between file serving nodes and FTP clients (user network): 20/tcp, 20/udp, 21/tcp, 21/udp
Between GUI and clients that need to access the GUI: 7777/tcp, 8080/tcp
Dataprotector: 5555/tcp, 5555/udp
Internet Printing Protocol (IPP): 631/tcp, 631/udp
ICAP: 1344/tcp, 1344/udp
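On the Red Hat Enterprise Linux releases used on the file serving nodes, one common way to fix the NFS-related ports is to set them in /etc/sysconfig/nfs. The following is a sketch only; the variable names are the standard Red Hat ones, and each <port> is a placeholder that must match the exceptions in the port list above and your firewall rules:
STATD_PORT=<port>
LOCKD_TCPPORT=<port>
LOCKD_UDPPORT=<port>
MOUNTD_PORT=<port>
RQUOTAD_PORT=<port>
After editing the file, restart the NFS services on that node:
service nfslock restart
service nfs restart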
Configuring NTP servers
When the cluster is initially set up, primary and secondary NTP servers are configured to provide time synchronization with an external time source. The list of NTP servers is stored in the Fusion Manager configuration. The active Fusion Manager node synchronizes its time with the external source. The other file serving nodes synchronize their time with the active Fusion Manager node. In the absence of an external time source, the local hardware clock on the agile Fusion Manager node is used as the time source. This configuration method ensures that the time is synchronized on all cluster nodes, even in the absence of an external time source.
On StoreAll clients, the time is not synchronized with the cluster nodes. You will need to configure NTP servers on StoreAll clients.
List the currently configured NTP servers:
ibrix_clusterconfig -i -N
Specify a new list of NTP servers:
ibrix_clusterconfig -c -N SERVER1[,...,SERVERn]
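For example, to replace the current list with two hypothetical NTP servers:
ibrix_clusterconfig -c -N 10.1.1.10,10.1.1.11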
Configuring HP Insight Remote Support on StoreAll systems
IMPORTANT: In the StoreAll software 6.1 release, the default port for the StoreAll SNMP agent
changed from 5061 to 161. This port number cannot be changed.
NOTE: Configuring Phone Home enables the hp-snmp-agents service internally. As a result, a
large number of error messages, such as the following, could occasionally appear in
/var/log/hp-snmp-agents/cma.log:
Feb 08 13:05:54 x946s1 cmahostd[25579]: cmahostd: Can't update OS filesys object: /ifs1 (PEER3023)
The cmahostd daemon is part of the hp-snmp-agents service. This error message occurs because the file system exceeds <n> TB. If this occurs, HP recommends that before you perform operations such as unmounting a file system or stopping services on a file serving node (using the
ibrix_server command), you disable the hp-snmp-agents service on each server first:
service hp-snmp-agents stop
After remounting the file system or restarting services on the file serving node, restart the hp-snmp-agents service on each server:
service hp-snmp-agents start
Prerequisites
The required components for supporting StoreAll systems are preinstalled on the file serving nodes. You must install HP Insight Remote Support on a separate Windows system termed the Central Management Server (CMS):
HP Systems Insight Manager (HP SIM). This software manages HP systems and is the easiest and least
expensive way to maximize system uptime and health.
Insight Remote Support Advanced (IRSA). This version is integrated with HP Systems Insight
Manager (SIM). It provides comprehensive remote monitoring, notification/advisories, dispatch, and proactive service support. IRSA and HP SIM together are referred to as the CMS.
The Phone Home configuration does not support backup or standby NICs that are used for
NIC failover. If backup NICs are currently configured, remove the backup NICs from all nodes before configuring Phone Home. After a successful Phone Home configuration, you can reconfigure the backup NICs.
The following versions of the software are supported.
HP SIM 6.3 and IRSA 5.6
HP SIM 7.1 and IRSA 5.7
IMPORTANT: Keep in mind the following:
For each file serving node, add the physical user network interfaces (by entering the
ibrix_nic command or selecting the Server > NICs tab in the GUI) so the interfaces can communicate with HP SIM.
Ensure that all user network interfaces on each file serving node can communicate with the
CMS.
IMPORTANT: Insight Remote Support Standard (IRSS) is not supported with StoreAll software
6.1 and later.
For product descriptions and information about downloading the software, see the HP Insight Remote Support Software web page:
http://www.hp.com/go/insightremotesupport
For information about HP SIM:
http://www.hp.com/products/systeminsightmanager
For IRSA documentation:
http://www.hp.com/go/insightremoteadvanced-docs
IMPORTANT: You must compile and manually register the StoreAll MIB file by using HP Systems
Insight Manager:
1. Download ibrixMib.txt from /usr/local/ibrix/doc/.
2. Rename the file to ibrixMib.mib.
3. In HP Systems Insight Manager, complete the following steps:
a. Unregister the existing MIB by entering the following command:
<BASE>\mibs>mxmib -d ibrixMib.mib
b. Copy the ibrixMib.mib file to the <BASE>\mibs directory, and then enter the following
commands:
<BASE>\mibs>mcompile ibrixMib.mib
<BASE>\mibs>mxmib -a ibrixMib.cfg
For more information about the MIB, see the "Compiling and customizing MIBs" chapter in the HP Systems Insight Manager User Guide, which is available at:
http://www.hp.com/go/insightmanagement/sim/
Click Support & Documents and then click Manuals. Navigate to the user guide.
Limitations
Note the following:
For StoreAll systems, the HP Insight Remote Support implementation is limited to hardware
events.
The X9720 CX storage device is not supported for HP Insight Remote Support.
Configuring the StoreAll cluster for Insight Remote Support
To enable X9720/9730 systems for remote support, you need to configure the Virtual SAS Manager, Virtual Connect Manager, and Phone Home settings. All nodes in the cluster should be up when you perform this step.
NOTE: Configuring Phone Home removes any previous StoreAll snmp configuration details and
populates the SNMP configuration with Phone Home configuration details. When Phone Home is enabled, you cannot use ibrix_snmpagent to edit or change the snmp agent configuration. However, you can use ibrix_snmptrap to add trapsink IPs and you can use ibrix_event to associate events to the trapsink IPs.
Registering Onboard Administrator
The Onboard Administrator is registered automatically.
Configuring the Virtual SAS Manager
On 9730 systems, the SNMP service is disabled by default on the SAS switches. To enable the SNMP service manually and provide the trapsink IP on all SAS switches, complete these steps:
1. Open the Virtual SAS Manager from the OA. Select OA IP > Interconnect Bays > SAS Switch > Management Console.
2. On the Virtual SAS Manager, open the Maintain tab, click SAS Blade Switch, and select SNMP Settings. On the dialog box, enable the SNMP service and supply the information needed for alerts.
Configuring the Virtual Connect Manager
To configure the Virtual Connect Manager on an X9720/9730 system, complete the following steps:
1. From the Onboard Administrator, select OA IP > Interconnect Bays > HP VC Flex-10 > Management Console.
2. On the HP Virtual Connect Manager, open the SNMP Configuration tab.
3. Configure the SNMP Trap Destination:
Enter the Destination Name and IP Address (the CMS IP).
Select SNMPv1 as the SNMP Trap Format.
Specify public as the Community String.
4. Select all trap categories, VCM traps, and trap severities.
Configuring Phone Home settings
To configure Phone Home on the GUI, select Cluster Configuration in the upper Navigator and then select Phone Home in the lower Navigator. The Phone Home Setup panel shows the current configuration.
Click Enable to configure the settings on the Phone Home Settings dialog box. Skip the Software Entitlement ID field; it is not currently used.
The time required to enable Phone Home depends on the number of devices in the cluster, with larger clusters requiring more time.
To configure Phone Home settings from the CLI, use the following command:
ibrix_phonehome -c -i <IP Address of the Central Management Server> -P Country Name [-z Software Entitlement ID] [-r Read Community] [-w Write Community] [-t System Contact] [-n System Name] [-o System Location]
For example:
ibrix_phonehome -c -i 99.2.4.75 -P US -r public -w private -t Admin -n SYS01.US -o Colorado
Next, configure Insight Remote Support for the version of HP SIM you are using:
HP SIM 7.1 and IRS 5.7. See “Configuring Insight Remote Support for HP SIM 7.1 and IRS
5.7” (page 41).
HP SIM 6.3 and IRS 5.6. See “Configuring Insight Remote Support for HP SIM 6.3 and IRS
5.6” (page 44).
Configuring Insight Remote Support for HP SIM 7.1 and IRS 5.7
To configure Insight Remote Support, complete these steps:
1. Configure Entitlements for the servers and chassis in your system.
2. Discover devices on HP SIM.
Configuring Entitlements for servers and chassis
Expand Phone Home in the lower Navigator. When you select Chassis or Servers, the GUI displays the current Entitlements for that type of device. The following example shows Entitlements for the servers in the cluster.
To configure Entitlements, select a device and click Modify to open the dialog box for that type of device. The following example shows the Server Entitlement dialog box. The customer-entered serial number and product number are used for warranty checks at HP Support.
Use the following commands to entitle devices from the CLI. The commands must be run for each device present in the cluster.
Entitle a server:
ibrix_phonehome -e -h <Host Name> -b <Customer Entered Serial Number>
-g <Customer Entered Product Number>
Enter the Host Name parameter exactly as it is listed by the ibrix_fm -l command.
Entitle a chassis:
ibrix_phonehome -e -C <OA IP Address of the Chassis> -b <Customer Entered Serial Number> -g <Customer Entered Product Number>
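For example, a sketch using hypothetical values (the host name node1, OA IP 99.2.4.80, and the serial and product numbers are placeholders; substitute the values for your own hardware):
ibrix_phonehome -e -h node1 -b SGH123XYZ1 -g AB123A
ibrix_phonehome -e -C 99.2.4.80 -b SGH456XYZ2 -g AB456A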
NOTE: The Phone Home > Storage selection on the GUI does not apply to X9720/9730 systems.
Discovering devices on HP SIM
HP Systems Insight Manager (SIM) uses the SNMP protocol to discover and identify StoreAll systems automatically. On HP SIM, open Options > Discovery > New. Select Discover a group of systems, and then enter the discovery name and the Fusion Manager IP address on the New Discovery dialog box.
Enter the read community string on the Credentials > SNMP tab. This string should match the Phone Home read community string. If the strings are not identical, the Fusion Manager IP might be discovered as “Unknown.”
Devices are discovered as described in the following table.
Device               Discovered as
Fusion Manager IP    System Type: Fusion Manager
                     System Subtype: 9000
                     Product Model: HP 9000 Solution
File serving nodes   System Type: Storage Device
                     System Subtype: 9000, Storage, HP ProLiant
                     Product Model: HP X9720 NetStor FSN (ProLiant BL460 G6) or HP 9730 NetStor FSN (ProLiant BL460 G7)
The following example shows discovered devices on HP SIM 7.1.
File serving nodes and the OA IP are associated with the Fusion Manager IP address. In HP SIM, select Fusion Manager and open the Systems tab. Then select Associations to view the devices.
You can view all StoreAll devices under Systems by Type > Storage System > Scalable Storage Solutions > All X9000 Systems
Configuring Insight Remote Support for HP SIM 6.3 and IRS 5.6
Discovering devices in HP SIM
HP Systems Insight Manager (SIM) uses the SNMP protocol to discover and identify StoreAll systems automatically. On HP SIM, open Options > Discovery > New, and then select Discover a group of systems. On the New Discovery dialog box, enter the discovery name and the IP addresses of the devices to be monitored. For more information, see the HP SIM 6.3 documentation.
NOTE: Each device in the cluster should be discovered separately.
Enter the read community string on the Credentials > SNMP tab. This string should match the Phone Home read community string. If the strings are not identical, the device will be discovered as “Unknown.”
The following example shows discovered devices on HP SIM 6.3. File serving nodes are discovered as ProLiant servers.
Configuring device Entitlements
Configure the CMS software to enable remote support for StoreAll systems. For more information, see "Using the Remote Support Setting Tab to Update Your Client and CMS Information” and “Adding Individual Managed Systems” in the HP Insight Remote Support Advanced A.05.50 Operations Guide.
Enter the following custom field settings in HP SIM:
Custom field settings for X9720/9730 Onboard Administrator
The Onboard Administrator (OA) is discovered with OA IP addresses. When the OA is discovered, edit the system properties on the HP Systems Insight Manager. Locate the Entitlement Information section of the Contract and Warranty Information page and update the following:
Enter the StoreAll enclosure product number as the Customer-Entered product number
Enter X9000 as the Custom Delivery ID
Select the System Country Code
Enter the appropriate Customer Contact and Site Information details
Contract and Warranty Information
Under Entitlement Information, specify the Customer-Entered serial number, Customer-Entered product number, System Country code, and Custom Delivery ID.
Verifying device entitlements
To verify the entitlement information in HP SIM, complete the following steps:
1. Go to Remote Support Configuration and Services and select the Entitlement tab.
2. Check the devices discovered.
NOTE: If the system discovered on HP SIM does not appear on the Entitlement tab, click
Synchronize RSE.
3. Select Entitle Checked from the Action List.
4. Click Run Action.
5. When the entitlement check is complete, click Refresh.
NOTE: If the system discovered on HP SIM does not appear on the Entitlement tab, click
Synchronize RSE.
The devices you entitled should be displayed as green in the ENT column on the Remote Support System List dialog box.
If a device is red, verify that the customer-entered serial number and part number are correct and then rediscover the devices.
Testing the Insight Remote Support configuration
To determine whether the traps are working properly, send a generic test trap with the following command:
snmptrap -v1 -c public <CMS IP> .1.3.6.1.4.1.232 <Managed System IP> 6 11003 1234 .1.3.6.1.2.1.1.5.0 s test .1.3.6.1.4.1.232.11.2.11.1.0 i 0 .1.3.6.1.4.1.232.11.2.8.1.0 s "IBRIX remote support testing"
For example, if the CMS IP address is 99.2.2.2 and the StoreAll node is 99.2.2.10, enter the following:
snmptrap -v1 -c public 99.2.2.2 .1.3.6.1.4.1.232 99.2.2.10 6 11003 1234 .1.3.6.1.2.1.1.5.0 s test .1.3.6.1.4.1.232.11.2.11.1.0 i 0 .1.3.6.1.4.1.232.11.2.8.1.0 s "IBRIX remote support testing"
Updating the Phone Home configuration
The Phone Home configuration should be synchronized after you add or remove devices in the cluster. The operation enables Phone Home on newly added devices (servers, storage, and chassis) and removes details for devices that are no longer in the cluster. On the GUI, select Cluster Configuration in the upper Navigator, select Phone Home in the lower Navigator, and click Rescan on the Phone Home Setup panel.
On the CLI, run the following command:
ibrix_phonehome -s
Disabling Phone Home
When Phone Home is disabled, all Phone Home information is removed from the cluster and hardware and software are no longer monitored. To disable Phone Home on the GUI, click Disable on the Phone Home Setup panel. On the CLI, run the following command:
ibrix_phonehome -d
Troubleshooting Insight Remote Support
Devices are not discovered on HP SIM
Verify that cluster networks and devices can access the CMS. Devices will not be discovered properly if they cannot access the CMS.
The maximum number of SNMP trap hosts has already been configured
If this error is reported, the maximum number of trapsink IP addresses have already been configured. For OA devices, the maximum number of trapsink IP addresses is 8. Manually remove a trapsink IP address from the device and then rerun the Phone Home configuration to allow Phone Home to add the CMS IP address as a trapsink IP address.
A cluster node was not configured in Phone Home
If a cluster node was down during the Phone Home configuration, the log file will include the following message:
SEVERE: Sent event server.status.down: Server <server name> down
When the node is up, rescan Phone Home to add the node to the configuration. See “Updating
the Phone Home configuration” (page 47).
Fusion Manager IP is discovered as “Unknown”
Verify that the read community string entered in HP SIM matches the Phone Home read community string.
Also run snmpwalk on the VIF IP and verify the information:
# snmpwalk -v 1 -c <read community string> <FM VIF IP> .1.3.6.1.4.1.18997
Critical failures occur when discovering X9720 OA
The 3Gb SAS switches have internal IPs in the range 169.x.x.x, which cannot be reached from HP SIM. These switches will not be monitored; however, other OA components are monitored.
Discovered device is reported as unknown on CMS
Run the following command on the file serving node to determine whether the Insight Remote Support services are running:
# service snmpd status
# service hpsmhd status
# service hp-snmp-agents status
If the services are not running, start them:
# service snmpd start
# service hpsmhd start
# service hp-snmp-agents start
Alerts are not reaching the CMS
If nodes are configured and the system is discovered properly but alerts are not reaching the CMS, verify that a trapif entry exists in the cma.conf configuration file on the file serving nodes.
Device Entitlement tab does not show GREEN
If the Entitlement tab does not show GREEN, verify the Customer-Entered serial number and part number for the device.
SIM Discovery
On SIM discovery, use the option Discover a Group of Systems for any device discovery.
4 Configuring virtual interfaces for client access
StoreAll software uses a cluster network interface to carry Fusion Manager traffic and traffic between file serving nodes. This network is configured as bond0 when the cluster is installed. To provide failover support for the Fusion Manager, a virtual interface is created for the cluster network interface.
Although the cluster network interface can carry traffic between file serving nodes and clients, HP recommends that you configure one or more user network interfaces for this purpose.
To provide high availability for a user network, you should configure a bonded virtual interface (VIF) for the network and then set up failover for the VIF. This method prevents interruptions to client traffic. If necessary, the file serving node hosting the VIF can fail over to its backup server, and clients can continue to access the file system through the backup server.
StoreAll systems also support the use of VLAN tagging on the cluster and user networks. See
“Configuring VLAN tagging” (page 52) for an example.
Network and VIF guidelines
To provide high availability, the user interfaces used for client access should be configured as bonded virtual interfaces (VIFs). Note the following:
Nodes needing to communicate for file system coverage or for failover must be on the same
network interface. Also, nodes set up as a failover pair must be connected to the same network interface.
Use a Gigabit Ethernet port (or faster) for user networks.
NFS, SMB, FTP, and HTTP clients can use the same user VIF. The servers providing the VIF
should be configured in backup pairs, and the NICs on those servers should also be configured for failover. See “Configuring High Availability on the cluster” in the administrator guide for information about performing this configuration from the GUI.
For Linux and Windows StoreAll clients, the servers hosting the VIF should be configured in
backup pairs. However, StoreAll clients do not support backup NICs. Instead, StoreAll clients should connect to the parent bond of the user VIF or to a different VIF.
Ensure that your parent bonds, for example bond0, have a defined route:
1. Check for the default Linux OS route/gateway for each parent interface/bond that was
defined during the HP StoreAll installation by entering the following command at the command prompt:
# route
Review the command output. The default destination is the default gateway/route for the Linux operating system. This default route was defined for the operating system during the HP StoreAll installation, but a corresponding route was not defined for StoreAll.
2. Display network interfaces controlled by StoreAll by entering the following command at
the command prompt:
# ibrix_nic -l
Note whether the ROUTE column is unpopulated for the IFNAME.
3. To assign the IFNAME a default route for the parent cluster bond and the user VIFs assigned to FSNs for use with SMB/NFS, enter the following ibrix_nic command at the command prompt (see the example after this list):
# ibrix_nic -r -n IFNAME -h HOSTNAME -A -R <ROUTE_IP>
4. Configure backup monitoring, as described in “Configuring backup servers” (page 50).
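The following is a sketch of the ibrix_nic -r command from step 3, assuming a hypothetical host node1 whose parent bond is bond0 and whose gateway is 16.123.200.1:
# ibrix_nic -r -n bond0 -h node1 -A -R 16.123.200.1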
Creating a bonded VIF
NOTE: The examples in this chapter use the unified network and create a bonded VIF on bond0.
If your cluster uses a different network layout, create the bonded VIF on a user network bond such as bond1.
Use the following procedure to create a bonded VIF (bond0:1 in this example):
1. If high availability (automated failover) is configured on the servers, disable it. Run the following command on the Fusion Manager:
# ibrix_server -m -U
2. Identify the bond0:1 VIF:
# ibrix_nic -a -n bond0:1 -h node1,node2,node3,node4
3. Assign an IP address to the bond0:1 VIFs on each node. In the command, -I specifies the IP address, -M specifies the netmask, and -B specifies the broadcast address:
# ibrix_nic -c -n bond0:1 -h node1 -I 16.123.200.201 -M 255.255.255.0 -B 16.123.200.255
# ibrix_nic -c -n bond0:1 -h node2 -I 16.123.200.202 -M 255.255.255.0 -B 16.123.200.255
# ibrix_nic -c -n bond0:1 -h node3 -I 16.123.200.203 -M 255.255.255.0 -B 16.123.200.255
# ibrix_nic -c -n bond0:1 -h node4 -I 16.123.200.204 -M 255.255.255.0 -B 16.123.200.255
Configuring backup servers
The servers in the cluster are configured in backup pairs. If this step was not done when your cluster was installed, assign backup servers for the bond0:1 interface. In the following example, node1 is the backup for node2, node2 is the backup for node1, node3 is the backup for node4, and node4 is the backup for node3.
1. Add the VIF:
# ibrix_nic -a -n bond0:2 -h node1,node2,node3,node4
2. Set up a backup server for each VIF:
# ibrix_nic -b -H node1/bond0:1,node2/bond0:2
# ibrix_nic -b -H node2/bond0:1,node1/bond0:2
# ibrix_nic -b -H node3/bond0:1,node4/bond0:2
# ibrix_nic -b -H node4/bond0:1,node3/bond0:2
Configuring NIC failover
NIC monitoring should be configured on VIFs that will be used by NFS, SMB, FTP, or HTTP.
IMPORTANT: When configuring NIC monitoring, use the same backup pairs that you used when
configuring standby servers.
For example:
# ibrix_nic -m -h node1 -A node2/bond0:1
# ibrix_nic -m -h node2 -A node1/bond0:1
# ibrix_nic -m -h node3 -A node4/bond0:1
# ibrix_nic -m -h node4 -A node3/bond0:1
Configuring automated failover
To enable automated failover for your file serving nodes, execute the following command:
ibrix_server -m [-h SERVERNAME]
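For example, to enable automated failover on all file serving nodes:
ibrix_server -m
To enable it only on a hypothetical node named node1:
ibrix_server -m -h node1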
Example configuration
This example uses two nodes, ib50-81 and ib50-82. These nodes are backups for each other, forming a backup pair.
[root@ib50-80 ~]# ibrix_server -l
Segment Servers
===============
SERVER_NAME BACKUP  STATE HA ID                                   GROUP
----------- ------- ----- -- ------------------------------------ -------
ib50-81     ib50-82 Up    on 132cf61a-d25b-40f8-890e-e97363ae0d0b servers
ib50-82     ib50-81 Up    on 7d258451-4455-484d-bf80-75c94d17121d servers
All VIFs on ib50-81 have backup (standby) VIFs on ib50-82. Similarly, all VIFs on ib50-82 have backup (standby) VIFs on ib50-81. NFS, SMB, FTP, and HTTP clients can connect to bond0:1 on either host. If necessary, the selected server will fail over to bond0:2 on the opposite host. StoreAll clients could connect to bond1 on either host, as these clients do not support or require NIC failover. (The preceding sample output shows only the relevant fields.)
Specifying VIFs in the client configuration
When you configure your clients, you may need to specify the VIF that should be used for client access.
NFS/SMB. Specify the VIF IP address of the servers (for example, bond0:1) to establish connection. You can also configure DNS round robin to ensure NFS or SMB client-to-server distribution. In both cases, the NFS/SMB clients will cache the initial IP they used to connect to the respective share, usually until the next reboot.
FTP. When you add an FTP share on the Add FTP Shares dialog box or with the ibrix_ftpshare command, specify the VIF as the IP address that clients should use to access the share.
HTTP. When you create a virtual host on the Create Vhost dialog box or with the ibrix_httpvhost command, specify the VIF as the IP address that clients should use to access shares associated with the Vhost.
StoreAll clients. Use the following command to prefer the appropriate user network. Execute the command once for each destination host that the client should contact using the specified interface.
ibrix_client -n -h SRCHOST -A DESTHOST/IFNAME
For example:
ibrix_client -n -h client12.mycompany.com -A ib50-81.mycompany.com/bond1
NOTE: Because the backup NIC cannot be used as a preferred network interface for StoreAll
clients, add one or more user network interfaces to ensure that HA and client communication work together.
Configuring VLAN tagging
VLAN capabilities provide hardware support for running multiple logical networks over the same physical networking hardware. To allow multiple packets for different VLANs to traverse the same physical interface, each packet must have a field added that contains the VLAN tag. The tag is a small integer number that identifies the VLAN to which the packet belongs. When an intermediate switch receives a “tagged” packet, it can make the appropriate forwarding decisions based on the value of the tag.
When set up properly, StoreAll systems support VLAN tags being transferred all of the way to the file serving node network interfaces. The ability of file serving nodes to handle the VLAN tags natively in this manner makes it possible for the nodes to support multiple VLAN connections simultaneously over a single bonded interface.
Linux networking tools such as ifconfig display a network interface with an associated VLAN tag using a device label with the form bond#.<VLAN_id>. For example, if the first bond created by StoreAll has a VLAN tag of 30, it will be labeled bond0.30.
It is also possible to add a VIF on top of an interface that has an associated VLAN tag. In this case, the device label of the interface takes the form bond#.<VLAN_id>:<VIF_label>. For example, if a VIF with a label of 2 is added for the bond0.30 interface, the new interface device label will be bond0.30:2.
The following commands show how to configure a bonded VIF and backup nodes for a unified network topology using the 10.10.x.y subnet. VLAN tagging is configured for hosts ib142-129 and ib142-131 on the 51 subnet.
Add the bond0.51 interface with the VLAN tag:
# ibrix_nic -a -n bond0.51 -h ib142-129
# ibrix_nic -a -n bond0.51 -h ib142-131
Assign an IP address to the bond0.51 VIFs on each node:
# ibrix_nic -c -n bond0.51 -h ib142-129 -I 192.168.51.101 -M 255.255.255.0
# ibrix_nic -c -n bond0.51 -h ib142-131 -I 192.168.51.102 -M 255.255.255.0
Add the bond0.51:2 VIF on top of the interface:
# ibrix_nic -a -n bond0.51:2 -h ib142-131
# ibrix_nic -a -n bond0.51:2 -h ib142-129
Configure backup nodes:
# ibrix_nic -b -H ib142-129/bond0.51,ib142-131/bond0.51:2
# ibrix_nic -b -H ib142-131/bond0.51,ib142-129/bond0.51:2
Create the user FM VIF:
ibrix_fm -c 192.168.51.125 -d bond0.51:1 -n 255.255.255.0 -v user
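As an optional verification step, you can list the configured interfaces and confirm that the tagged interfaces, VIFs, and backup assignments appear as expected:
# ibrix_nic -l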
For more information about VLAN tagging, see the HP StoreAll Storage Network Best Practices Guide.
Support for link state monitoring
Do not configure link state monitoring for user network interfaces or VIFs that will be used for SMB or NFS. Link state monitoring is supported only for use with iSCSI storage network interfaces, such as those provided with 9300 Gateway systems.
5 Configuring failover
This chapter describes how to configure failover for agile management consoles, file serving nodes, network interfaces, and HBAs.
Agile management consoles
The agile Fusion Manager maintains the cluster configuration and provides graphical and command-line user interfaces for managing and monitoring the cluster. The agile Fusion Manager is installed on all file serving nodes when the cluster is installed. The Fusion Manager is active on one node, and is passive on the other nodes. This is called an agile Fusion Manager configuration.
Agile Fusion Manager modes
An agile Fusion Manager can be in one of the following modes:
active. In this mode, the Fusion Manager controls console operations. All cluster administration
and configuration commands must be run from the active Fusion Manager.
passive. In this mode, the Fusion Manager monitors the health of the active Fusion Manager.
If the active Fusion Manager fails, a passive Fusion Manager is selected to become the active console.
nofmfailover. In this mode, the Fusion Manager does not participate in console operations.
Use this mode for operations such as manual failover of the active Fusion Manager, StoreAll software upgrades, and server blade replacements.
Changing the mode
Use the following command to move a Fusion Manager to passive or nofmfailover mode:
ibrix_fm -m passive | nofmfailover [-P] [-A | -h <FMLIST>]
If the Fusion Manager was previously the active console, StoreAll software will select a new active console. A Fusion Manager currently in active mode can be moved to either passive or nofmfailover mode. A Fusion Manager in nofmfailover mode can be moved only to passive mode.
With the exception of the local node running the active Fusion Manager, the -A option moves all instances of the Fusion Manager to the specified mode. The -h option moves the Fusion Manager instances in <FMLIST> to the specified mode.
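For example, to move all Fusion Manager instances except the local active one to passive mode:
ibrix_fm -m passive -A
To move only the instance on a hypothetical node named fm2 to nofmfailover mode:
ibrix_fm -m nofmfailover -h fm2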
Agile Fusion Manager and failover
Using an agile Fusion Manager configuration provides high availability for Fusion Manager services. If the active Fusion Manager fails, the cluster virtual interface will go down. When the passive Fusion Manager detects that the cluster virtual interface is down, it will become the active console. This Fusion Manager rebuilds the cluster virtual interface, starts Fusion Manager services locally, transitions into active mode, and takes over Fusion Manager operation.
Failover of the active Fusion Manager affects the following features:
User networks. The virtual interface used by clients will also fail over. Users may notice a brief
reconnect while the newly active Fusion Manager takes over management of the virtual interface.
GUI. You must reconnect to the Fusion Manager VIF after the failover.
Failing over the Fusion Manager manually
To fail over the active Fusion Manager manually, place the console into nofmfailover mode. Enter the following command on the node hosting the console:
ibrix_fm -m nofmfailover
The command takes effect immediately. The failed-over Fusion Manager remains in nofmfailover mode until it is moved to passive mode
using the following command:
ibrix_fm -m passive
NOTE: A Fusion Manager cannot be moved from nofmfailover mode to active mode.
Viewing information about Fusion Managers
To view mode information, use the following command:
ibrix_fm -i
NOTE: If the Fusion Manager was not installed in an agile configuration, the output will report
FusionServer: fusion manager name not set! (active, quorum is not configured).
When a Fusion Manager is installed, it is registered in the Fusion Manager configuration. To view a list of all registered management consoles, use the following command:
ibrix_fm -l
Configuring High Availability on the cluster
StoreAll High Availability provides monitoring for servers, NICs, and HBAs.
Server HA. Servers are configured in backup pairs, with each server in the pair acting as a backup for the other server. The servers in the backup pair must see the same storage. When a server is failed over, the ownership of its segments and its Fusion Manager services (if the server is hosting the active FM) move to the backup server.
NIC HA. When server HA is enabled, NIC HA provides additional triggers that cause a server to fail over to its backup server. For example, you can create a user VIF such as bond0:2 to service SMB requests on a server and then designate the backup server as a standby NIC for bond0:2. If an issue occurs with bond0:2 on a server, the server, including its segment ownership and FM services, will fail over to the backup server, and that server will now handle SMB requests going through bond0:2. You can also fail over just the NIC to its standby NIC on the backup server.
HBA monitoring. This method protects server access to storage through an HBA. Most servers ship with an HBA that has two controllers, providing redundancy by design. Setting up StoreAll HBA monitoring is not commonly used for these servers. However, if a server has only a single HBA, you might want to monitor the HBA; then, if the server cannot see its storage because the single HBA goes offline or faults, the server and its segments will fail over.
You can set up automatic server failover and perform a manual failover if needed. If a server fails over, you must fail back the server manually.
When automatic HA is enabled, the Fusion Manager listens for heartbeat messages that the servers broadcast at one-minute intervals. The Fusion Manager initiates a server failover when it fails to receive five consecutive heartbeats. Failover conditions are detected more quickly when NIC HA is also enabled; server failover is initiated when the Fusion Manager receives a heartbeat message indicating that a monitored NIC might be down and the Fusion Manager cannot reach that NIC. If HBA monitoring is enabled, the Fusion Manager fails over the server when a heartbeat message indicates that a monitored HBA or pair of HBAs has failed.
What happens during a failover
The following actions occur when a server is failed over to its backup:
1. The Fusion Manager verifies that the backup server is powered on and accessible.
2. The Fusion Manager migrates ownership of the server’s segments to the backup and notifies all servers and StoreAll clients about the migration. This is a persistent change. If the failed-over server is hosting the active FM, the active Fusion Manager role transitions to another server.
3. If NIC monitoring is configured, the Fusion Manager activates the standby NIC and transfers
the IP address (or VIF) to it.
Clients that were mounted on the failed-over server may experience a short service interruption while server failover takes place. Depending on the protocol in use, clients can continue operations after the failover or may need to remount the file system using the same VIF. In either case, clients will be unaware that they are now accessing the file system on a different server.
To determine the progress of a failover, view the Status tab on the GUI or execute the ibrix_server -l command. While the Fusion Manager is migrating segment ownership, the operational status of the node is Up-InFailover or Down-InFailover, depending on whether the node was powered up or down when failover was initiated. When failover is complete, the operational status changes to Up-FailedOver or Down-FailedOver. For more information about operational states, see “Monitoring the status of file serving nodes” (page 102).
Both automated and manual failovers trigger an event that is reported on the GUI. Automated failover can be configured with the HA Wizard or from the command line.
Configuring automated failover with the HA Wizard
IMPORTANT: On the X9720 platform, the ibrixpwr iLO user must be created on each node
before the cluster HA can be fully functional. Enter the following command on each cluster node to create an iLO user with the username ibrixpwr and with the password hpinvent:
ibrix_ilo -c -u ibrixpwr -p hpinvent
The HA wizard configures a backup server pair and, optionally, standby NICs on each server in the pair. It also configures a power source such as an iLO on each server. The Fusion Manager uses the power source to power down the server during a failover.
On the GUI, select Servers from the Navigator.
Click High Availability to start the wizard. Typically, backup servers are configured and server HA is enabled when your system is installed, and the Server HA Pair dialog box shows the backup pair configuration for the server selected on the Servers panel.
If necessary, you can configure the backup pair for the server. The wizard identifies the servers in the cluster that see the same storage as the selected server. Choose the appropriate server from the list.
The wizard also attempts to locate the IP addresses of the iLOs on each server. If it cannot locate an IP address, you will need to enter the address on the dialog box. When you have completed the information, click Enable HA Monitoring and Auto-Failover for both servers.
Use the NIC HA Setup dialog box to configure NICs that will be used for data services such as SMB or NFS. You can also designate NIC HA pairs on the server and its backup and enable monitoring of these NICs.
For example, you can create a user VIF that clients will use to access an SMB share serviced by server ib69s1. The user VIF is based on an active physical network on that server. To do this, click Add NIC in the section of the dialog box for ib69s1.
On the Add NIC dialog box, enter a NIC name. In our example, the cluster uses the unified network and has only bond0, the active cluster FM/IP. We cannot use bond0:0, which is the management IP/VIF. We will create the VIF bond0:1, using bond0 as the base. When you click OK, the user VIF is created.
The new, active user NIC appears on the NIC HA setup dialog box.
Next, enable NIC monitoring on the VIF. Select the new user NIC and click NIC HA. On the NIC HA Config dialog box, check Enable NIC Monitoring.
In the Standby NIC field, select New Standby NIC to create the standby on backup server ib69s2. The standby you specify must be available and valid. To keep the organization simple, we specified bond0:1 as the Name; this matches the name assigned to the NIC on server ib69s1. When you click OK, the NIC HA configuration is complete.
You can create additional user VIFs and assign standby NICs as needed. For example, you might want to add a user VIF for another share on server ib69s2 and assign a standby NIC on server ib69s1. You can also specify a physical interface such as eth4 and create a standby NIC on the backup server for it.
The NICs panel on the GUI shows the NICs on the selected server. In the following example, there are four NICs on server ib69s1: bond0, the active cluster FM/IP; bond0:0, the management IP/VIF (this server is hosting the active FM); bond0:1, the NIC created in this example; and bond0:2, a standby NIC for an active NIC on server ib69s2.
The NICs panel for ib69s2, the backup server, shows that bond0:1 is an inactive, standby NIC and bond0:2 is an active NIC.
Changing the HA configuration
To change the configuration of a NIC, select the server on the Servers panel, and then select NICs from the lower Navigator. Click Modify on the NICs panel. The General tab on the Modify NIC Properties dialog box allows you to change the IP address and other NIC properties. The NIC HA tab allows you to enable or disable HA monitoring and failover on the NIC and to change or remove the standby NIC.
To view the power source for a server, select the server on the Servers panel, and then select Power from the lower Navigator. The Power Source panel shows the power source configured on the server when HA was configured. You can add or remove power sources on the server, and can power the server on or off, or reset the server.
Configuring automated failover manually
To configure automated failover manually, complete these steps:
1. Configure file serving nodes in backup pairs.
2. Identify power sources for the servers in the backup pair.
3. Configure NIC monitoring.
4. Enable automated failover.
1. Configure server backup pairs
File serving nodes are configured in backup pairs, where each server in a pair is the backup for the other. This step is typically done when the cluster is installed. The following restrictions apply:
The same file system must be mounted on both servers in the pair and the servers must see the
same storage.
In a SAN environment, a server and its backup must use the same storage infrastructure to
access a segment’s physical volumes (for example, a multiported RAID array).
For a cluster using the unified network configuration, assign backup nodes for the bond0:1 interface. For example, node1 is the backup for node2, and node2 is the backup for node1.
1. Add the VIF:
ibrix_nic -a -n bond0:2 -h node1,node2,node3,node4
2. Set up a standby server for each VIF:
# ibrix_nic -b -H node1/bond0:1,node2/bond0:2
# ibrix_nic -b -H node2/bond0:1,node1/bond0:2
# ibrix_nic -b -H node3/bond0:1,node4/bond0:2
# ibrix_nic -b -H node4/bond0:1,node3/bond0:2
2. Identify power sources
To implement automated failover, perform a forced manual failover, or remotely power a file serving node up or down, you must set up programmable power sources for the nodes and their backups. Using programmable power sources prevents a “split-brain scenario” between a failing file serving node and its backup, allowing the failing server to be centrally powered down by the Fusion Manager in the case of automated failover, and manually in the case of a forced manual failover.
StoreAll software works with iLO, IPMI, OpenIPMI, and OpenIPMI2 integrated power sources. The following configuration steps are required when setting up integrated power sources:
For automated failover, ensure that the Fusion Manager has LAN access to the power sources.
Install the environment and any drivers and utilities, as specified by the vendor documentation.
If you plan to protect access to the power sources, set up the UID and password to be used.
Use the following command to identify a power source:
ibrix_powersrc -a -t {ipmi|openipmi|openipmi2|ilo} -h HOSTNAME -I IPADDR
-u USERNAME -p PASSWORD
For example, to identify an iLO power source at IP address 192.168.3.170 for node ss01:
ibrix_powersrc -a -t ilo -h ss01 -I 192.168.3.170 -u Administrator -p password
3. Configure NIC monitoring
NIC monitoring should be configured on user VIFs that will be used by NFS, SMB, FTP, or HTTP.
IMPORTANT: When configuring NIC monitoring, use the same backup pairs that you used when
configuring backup servers.
Identify the servers in a backup pair as NIC monitors for each other. Because the monitoring must be declared in both directions, enter a separate command for each server in the pair.
ibrix_nic -m -h MONHOST -A DESTHOST/IFNAME
The following example sets up monitoring for NICs over bond0:1:
ibrix_nic -m -h node1 -A node2/bond0:1
ibrix_nic -m -h node2 -A node1/bond0:1
ibrix_nic -m -h node3 -A node4/bond0:1
ibrix_nic -m -h node4 -A node3/bond0:1
The next example sets up server s2.hp.com to monitor server s1.hp.com over user network interface
eth1:
ibrix_nic -m -h s2.hp.com -A s1.hp.com/eth1
4. Enable automated failover
Automated failover is turned off by default. When automated failover is turned on, the Fusion Manager starts monitoring heartbeat messages from file serving nodes. You can turn automated failover on and off for all file serving nodes or for selected nodes.
Turn on automated failover:
ibrix_server -m [-h SERVERNAME]
Changing the HA configuration manually
Update a power source:
If you change the IP address or password for a power source, you must update the configuration database with the changes. The user name and password options are needed only for remotely managed power sources. Include the -s option to have the Fusion Manager skip BMC.
ibrix_powersrc -m [-I IPADDR] [-u USERNAME] [-p PASSWORD] [-s] -h POWERSRCLIST
The following command changes the IP address for power source ps1:
ibrix_powersrc -m -I 192.168.3.153 -h ps1
Disassociate a server from a power source:
You can dissociate a file serving node from a power source by dissociating it from slot 1 (its default association) on the power source. Use the following command:
ibrix_hostpower -d -s POWERSOURCE -h HOSTNAME
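For example, to dissociate a hypothetical node s1.hp.com from power source ps1:
ibrix_hostpower -d -s ps1 -h s1.hp.com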
Delete a power source:
To conserve storage, delete power sources that are no longer in use. If you are deleting multiple power sources, use commas to separate them.
ibrix_powersrc -d -h POWERSRCLIST
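For example, to delete two hypothetical power sources named ps1 and ps2:
ibrix_powersrc -d -h ps1,ps2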
Delete NIC monitoring:
To delete NIC monitoring, use the following command:
ibrix_nic -m -h MONHOST -D DESTHOST/IFNAME
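For example, to remove the monitoring that the earlier example configured for node1 over bond0:1 on node2:
ibrix_nic -m -h node1 -D node2/bond0:1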
Delete NIC standbys:
To delete a standby for a NIC, use the following command:
ibrix_nic -b -U HOSTNAME1/IFNAME1
For example, to delete the standby that was assigned to interface eth2 on file serving node s1.hp.com:
ibrix_nic -b -U s1.hp.com/eth2
Turn off automated failover:
ibrix_server -m -U [-h SERVERNAME]
To specify a single file serving node, include the -h SERVERNAME option.
Failing a server over manually
The server to be failed over must belong to a backup pair. The server can be powered down or remain up during the procedure. You can perform a manual failover at any time, regardless of whether automated failover is in effect. Manual failover does not require the use of a programmable power supply. However, if you have identified a power supply for the server, you can power it down before the failover.
Use the GUI or the CLI to fail over a file serving node:
On the GUI, select the node on the Servers panel and then click Failover on the Summary
panel.
On the CLI, run ibrix_server -f, specifying the node to be failed over as the HOSTNAME.
If appropriate, include the -p option to power down the node before segments are migrated:
ibrix_server -f [-p] -h HOSTNAME
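For example, to power down a hypothetical node node2 and fail it over to its backup:
ibrix_server -f -p -h node2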
Check the Summary panel or run the following command to determine whether the failover was successful:
ibrix_server -l
The STATE field indicates the status of the failover. If the field persistently shows Down-InFailover or Up-InFailover, the failover did not complete; contact HP Support for assistance. For information about the values that can appear in the STATE field, see “What happens during a
failover” (page 55).
Failing back a server
After an automated or manual failover of a server, you must manually fail back the server, which restores ownership of the failed-over segments and network interfaces to the server. Before failing back the server, confirm that it can see all of its storage resources and networks. The segments owned by the server will not be accessible if the server cannot see its storage.
To fail back a node from the GUI, select the node on the Servers panel and then click Failback on the Summary panel.
To fail back a node from the CLI, run the following command, where HOSTNAME is the failed-over node:
ibrix_server -f -U -h HOSTNAME
After failing back the node, check the Summary panel or run the ibrix_server -l command to determine whether the failback completed fully. If the failback is not complete, contact HP Support.
NOTE: A failback might not succeed if the time period between the failover and the failback is
too short, and the primary server has not fully recovered. HP recommends ensuring that both servers are up and running and then waiting 60 seconds before starting the failback. Use the ibrix_server -l command to verify that the primary server is up and running. The status should be Up-FailedOver before performing the failback.
Setting up HBA monitoring
You can configure High Availability to initiate automated failover upon detection of a failed HBA. HBA monitoring can be set up for either dual-port HBAs with built-in standby switching or single-port HBAs, whether standalone or paired for standby switching via software. The StoreAll software does not play a role in vendor- or software-mediated HBA failover; traffic moves to the remaining functional port with no Fusion Manager involvement.
HBAs use worldwide names for some parameter values. These are either worldwide node names (WWNN) or worldwide port names (WWPN). The WWPN is the name an HBA presents when logging in to a SAN fabric. Worldwide names consist of 16 hexadecimal digits grouped in pairs. In StoreAll software, these are written as dot-separated pairs (for example,
21.00.00.e0.8b.05.05.04). To set up HBA monitoring, first discover the HBAs, and then perform the procedure that matches
your HBA hardware:
For single-port HBAs without built-in standby switching: Turn on HBA monitoring for all ports
that you want to monitor for failure.
For dual-port HBAs with built-in standby switching and single-port HBAs that have been set
up as standby pairs in a software operation: Identify the standby pairs of ports to the configuration database and then turn on HBA monitoring for all paired ports. If monitoring is turned on for just one port in a standby pair and that port fails, the Fusion Manager will fail over the server even though the HBA has automatically switched traffic to the surviving port. When monitoring is turned on for both ports, the Fusion Manager initiates failover only when both ports in a pair fail.
When both HBA monitoring and automated failover for file serving nodes are configured, the Fusion Manager will fail over a server in two situations:
Both ports in a monitored set of standby-paired ports fail. Because all standby pairs were
identified in the configuration database, the Fusion Manager knows that failover is required only when both ports fail.
A monitored single-port HBA fails. Because no standby has been identified for the failed port,
the Fusion Manager knows to initiate failover immediately.
Discovering HBAs
You must discover HBAs before you set up HBA monitoring, when you replace an HBA, and when you add a new HBA to the cluster. Discovery adds the WWPN for the port to the configuration database.
ibrix_hba -a [-h HOSTLIST]
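For example, to discover the HBAs on two hypothetical nodes:
ibrix_hba -a -h s1.hp.com,s2.hp.com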
Adding standby-paired HBA ports
Identifying standby-paired HBA ports to the configuration database allows the Fusion Manager to apply the following logic when they fail:
If one port in a pair fails, do nothing. Traffic will automatically switch to the surviving port, as
configured by the HBA vendor or the software.
If both ports in a pair fail, fail over the server’s segments to the standby server.
Use the following command to identify two HBA ports as a standby pair:
ibrix_hba -b -P WWPN1:WWPN2 -h HOSTNAME
Enter each WWPN as dot-separated pairs of hexadecimal digits. The following command identifies port 20.00.12.34.56.78.9a.bc as the standby for port 42.00.12.34.56.78.9a.bc for the HBA on file serving node s1.hp.com:
ibrix_hba -b -P 20.00.12.34.56.78.9a.bc:42.00.12.34.56.78.9a.bc -h s1.hp.com
Turning HBA monitoring on or off
If your cluster uses single-port HBAs, turn on monitoring for all of the ports to set up automated failover in the event of HBA failure. Use the following command:
ibrix_hba -m -h HOSTNAME -p PORT
For example, to turn on HBA monitoring for port 20.00.12.34.56.78.9a.bc on node s1.hp.com:
ibrix_hba -m -h s1.hp.com -p 20.00.12.34.56.78.9a.bc
To turn off HBA monitoring for an HBA port, include the -U option:
ibrix_hba -m -U -h HOSTNAME -p PORT
Deleting standby port pairings
Deleting port pairing information from the configuration database does not remove the standby pairing of the ports. The standby pairing is either built in by the HBA vendor or implemented by software.
To delete standby-paired HBA ports from the configuration database, enter the following command:
ibrix_hba -b -U -P WWPN1:WWPN2 -h HOSTNAME
For example, to delete the pairing of ports 20.00.12.34.56.78.9a.bc and
42.00.12.34.56.78.9a.bc on node s1.hp.com:
ibrix_hba -b -U -P 20.00.12.34.56.78.9a.bc:42.00.12.34.56.78.9a.bc
-h s1.hp.com
Deleting HBAs from the configuration database
Before switching an HBA to a different machine, delete the HBA from the configuration database:
ibrix_hba -d -h HOSTNAME -w WWNN
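For example, to delete an HBA with a hypothetical WWNN from node s1.hp.com:
ibrix_hba -d -h s1.hp.com -w 20.00.00.e0.8b.05.05.04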
Displaying HBA information
Use the following command to view information about the HBAs in the cluster. To view information for all hosts, omit the -h HOSTLIST argument.
ibrix_hba -l [-h HOSTLIST]
The output includes the following fields:
Field            Description
Host             Server on which the HBA is installed.
Node WWN         This HBA’s WWNN.
Port WWN         This HBA’s WWPN.
Port State       Operational state of the port.
Backup Port WWN  WWPN of the standby port for this port (standby-paired HBAs only).
Monitoring       Whether HBA monitoring is enabled for this port.
Checking the High Availability configuration
Use the ibrix_haconfig command to determine whether High Availability features have been configured for specific file serving nodes. The command checks for the following features and provides either a summary or a detailed report of the results:
Programmable power source
Standby server or standby segments
Cluster and user network interface monitors
Standby network interface for each user network interface
HBA port monitoring
Status of automated failover (on or off)
For each High Availability feature, the summary report returns status for each tested file serving node and optionally for their standbys:
Passed. The feature has been configured.
Warning. The feature has not been configured, but the significance of the finding is not clear.
For example, the absence of discovered HBAs can indicate either that the HBA monitoring feature was not configured or that HBAs are not physically present on the tested servers.
Failed. The feature has not been configured.
The detailed report includes an overall result status for all tested file serving nodes and describes details about the checks performed on each High Availability feature. By default, the report includes details only about checks that received a Failed or a Warning result. You can expand the report to include details about checks that received a Passed result.
Viewing a summary report
Use the ibrix_haconfig -l command to see a summary of all file serving nodes. To check specific file serving nodes, include the -h HOSTLIST argument. To check standbys, include the
-b argument. To view results only for file serving nodes that failed a check, include the -f argument.
ibrix_haconfig -l [-h HOSTLIST] [-f] [-b]
For example, to view a summary report for file serving nodes xs01.hp.com and xs02.hp.com:
ibrix_haconfig -l -h xs01.hp.com,xs02.hp.com
Host         HA Configuration  Power Sources  Backup Servers  Auto Failover  Nics Monitored  Standby Nics  HBAs Monitored
xs01.hp.com  FAILED            PASSED         PASSED          PASSED         FAILED          PASSED        FAILED
xs02.hp.com  FAILED            PASSED         FAILED          FAILED         FAILED          WARNED        WARNED
Viewing a detailed report
Execute the ibrix_haconfig -i command to view the detailed report:
ibrix_haconfig -i [-h HOSTLIST] [-f] [-b] [-s] [-v]
The -h HOSTLIST option lists the nodes to check. To also check standbys, include the -b option. To view results only for file serving nodes that failed a check, include the -f argument. The -s option expands the report to include information about the file system and its segments. The -v option produces detailed information about configuration checks that received a Passed result.
For example, to view a detailed report for file serving node xs01.hp.com:
ibrix_haconfig -i -h xs01.hp.com
--------------- Overall HA Configuration Checker Results ---------------
FAILED

--------------- Overall Host Results ---------------
Host         HA Configuration  Power Sources  Backup Servers  Auto Failover  Nics Monitored  Standby Nics  HBAs Monitored
xs01.hp.com  FAILED            PASSED         PASSED          PASSED         FAILED          PASSED        FAILED

--------------- Server xs01.hp.com FAILED Report ---------------
Check Description                                 Result Result Information
================================================  ====== ==================
Power source(s) configured                        PASSED
Backup server or backups for segments configured  PASSED
Automatic server failover configured              PASSED
Cluster & User Nics monitored
  Cluster nic xs01.hp.com/eth1 monitored          FAILED Not monitored
User nics configured with a standby nic           PASSED
HBA ports monitored
  Hba port 21.01.00.e0.8b.2a.0d.6d monitored      FAILED Not monitored
  Hba port 21.00.00.e0.8b.0a.0d.6d monitored      FAILED Not monitored
Capturing a core dump from a failed node
The crash capture feature collects a core dump from a failed node when the Fusion Manager initiates failover of the node. You can use the core dump to analyze the root cause of the node failure. When enabled, crash capture is supported for both automated and manual failover. Failback is not affected by this feature. By default, crash capture is disabled. This section provides the prerequisites and steps for enabling crash capture.
NOTE: Enabling crash capture adds a delay (up to 240 seconds) to the failover to allow the
crash kernel to load. The failover process ensures that the crash kernel is loaded before continuing.
When crash capture is enabled, the system takes the following actions when a node fails:
1. The Fusion Manager triggers a core dump on the failed node when failover starts, changing the state of the node to Up, InFailover.
2. The failed node boots into the crash kernel. The state of the node changes to Dumping, InFailover.
3. The failed node continues with the failover, changing state to Dumping, FailedOver.
4. After the core dump is created, the failed node reboots and its state changes to Up, FailedOver.
IMPORTANT: Complete the steps in “Prerequisites for setting up the crash capture” (page 68)
before setting up the crash capture.
Prerequisites for setting up the crash capture
The following parameters must be configured in the ROM-based setup utility (RBSU) before a crash can be captured automatically on a failed file serving node:
1. Start RBSU: reboot the server, and then press the F9 key.
2. Highlight the System Options option in the main menu, and then press the Enter key. Highlight the Virtual Serial Port option, and then press the Enter key. Select the COM1 port, and then press the Enter key.
3. Highlight the BIOS Serial Console & EMS option in the main menu, and then press the Enter key. Highlight the BIOS Serial Console Port option, and then press the Enter key. Select the COM1 port, and then press the Enter key.
4. Highlight the BIOS Serial Console Baud Rate option, and then press the Enter key. Select the 115200 serial baud rate.
5. Highlight the Server Availability option in the main menu, and then press the Enter key. Highlight the ASR Timeout option, and then press the Enter key. Select 30 Minutes, and then press the Enter key.
6. To exit RBSU, press Esc until the main menu is displayed. Then, at the main menu, press F10. The server automatically restarts.
Setting up nodes for crash capture
IMPORTANT: Complete the steps in “Prerequisites for setting up the crash capture” (page 68)
before starting the steps in this section.
To set up nodes for crash capture, complete the following steps:
1. Enable crash capture. Run the following command (see the example after these steps):
ibrix_host_tune -S { -h HOSTLIST | -g GROUPLIST } -o trigger_crash_on_failover=1
2. Tune Fusion Manager to set the DUMPING status timeout by entering the following command:
ibrix_fm_tune -S -o dumpingStatusTimeout=240
This command is required to delay the failover until the crash kernel is loaded; otherwise, Fusion Manager will bring down the failed node.
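As a sketch of step 1, assuming two hypothetical hosts named node1 and node2:
ibrix_host_tune -S -h node1,node2 -o trigger_crash_on_failover=1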
6 Configuring cluster event notification
Cluster events
There are three categories for cluster events:
Alerts. Disruptive events that can result in loss of access to file system data.
Warnings. Potentially disruptive conditions where file system access is not lost, but if the situation is not
addressed, it can escalate to an alert condition.
Information. Normal events that change the cluster.
The following table lists examples of events included in each category.
Name                    Trigger Point                                                    Event Type
login.failure           User fails to log into GUI                                      ALERT
filesystem.unmounted    File system is unmounted
server.status.down      File serving node is down/restarted
server.unreachable      File serving node terminated unexpectedly
segment.migrated        User migrates segment using GUI                                  WARN
login.success           User successfully logs in to GUI                                 INFO
filesystem.cmd          File system is created
server.deregistered     File serving node is deleted
nic.added               NIC is added using GUI
nic.removed             NIC is removed using GUI
physicalvolume.added    Physical storage is discovered and added using management console
physicalvolume.deleted  Physical storage is deleted using management console
You can be notified of cluster events by email or SNMP traps. To view the list of supported events, use the command ibrix_event -q.
Setting up email notification of cluster events
You can set up event notifications by event type or for one or more specific events. To set up automatic email notification of cluster events, associate the events with email recipients and then configure email settings to initiate the notification process.
Associating events and email addresses
You can associate any combination of cluster events with email addresses: all Alert, Warning, or Info events, all events of one type plus a subset of another type, or a subset of all types.
The notification threshold for Alert events is 90% of capacity. Threshold-triggered notifications are sent when a monitored system resource exceeds the threshold and are reset when the resource
utilization dips 10% below the threshold. For example, a notification is sent the first time usage reaches 90% or more. The next notice is sent only if the usage declines to 80% or less (event is reset), and subsequently rises again to 90% or above.
To associate all types of events with recipients, omit the -e argument in the following command:
ibrix_event -c [-e ALERT|WARN|INFO|EVENTLIST] -m EMAILLIST
Use the ALERT, WARN, and INFO keywords to make specific type associations or use EVENTLIST to associate specific events.
The following command associates all types of events to admin@hp.com:
ibrix_event -c -m admin@hp.com
The next command associates all Alert events and two Info events to admin@hp.com:
ibrix_event -c -e ALERT,server.registered,filesystem.space.full
-m admin@hp.com
Configuring email notification settings
To configure email notification settings, specify the SMTP server and header information and turn the notification process on or off.
ibrix_event -m on|off -s SMTP -f from [-r reply-to] [-t subject]
The server must be able to receive and send email and must recognize the From and Reply-to addresses. Be sure to specify valid email addresses, especially for the SMTP server. If an address is not valid, the SMTP server will reject the email.
The following command configures email settings to use the mail.hp.com SMTP server and turns on notifications:
ibrix_event -m on -s mail.hp.com -f FM@hp.com -r MIS@hp.com -t Cluster1 Notification
NOTE: The state of the email notification process has no effect on the display of cluster events
in the GUI.
Dissociating events and email addresses
To remove the association between events and email addresses, use the following command:
ibrix_event -d [-e ALERT|WARN|INFO|EVENTLIST] -m EMAILLIST
For example, to dissociate event notifications for admin@hp.com:
ibrix_event -d -m admin@hp.com
To turn off all Alert notifications for admin@hp.com:
ibrix_event -d -e ALERT -m admin@hp.com
To turn off the server.registered and filesystem.created notifications for admin1@hp.com and admin2@hp.com:
ibrix_event -d -e server.registered,filesystem.created -m admin1@hp.com,admin2@hp.com
Testing email addresses
To test an email address with a test message, notifications must be turned on. If the address is valid, the command signals success and sends an email containing the settings to the recipient. If the address is not valid, the command returns an address failed exception.
ibrix_event -u -n EMAILADDRESS
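For example, to send a test message to admin@hp.com:
ibrix_event -u -n admin@hp.com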
Viewing email notification settings
The ibrix_event -L command provides comprehensive information about email settings and configured notifications.
ibrix_event -L
Email Notification : Enabled
SMTP Server        : mail.hp.com
From               : FM@hp.com
Reply To           : MIS@hp.com

EVENT                                 LEVEL  TYPE   DESTINATION
------------------------------------- -----  -----  ------------
asyncrep.completed                    ALERT  EMAIL  admin@hp.com
asyncrep.failed                       ALERT  EMAIL  admin@hp.com
Setting up SNMP notifications
StoreAll software supports SNMP (Simple Network Management Protocol) V1, V2, and V3. Whereas SNMPv2 security was enforced by use of community password strings, V3 introduces the User-based Security Model (USM) and the View-based Access Control Model (VACM). Discussion of these models is beyond the scope of this document. Refer to RFCs 3414 and 3415 at http://www.ietf.org for more information. Note the following:
In the SNMPV3 environment, every message contains a user name. The function of the USM
is to authenticate users and ensure message privacy through message encryption and decryption. Both authentication and privacy, and their passwords, are optional and will use default settings where security is less of a concern.
With users validated, the VACM determines which managed objects these users are allowed
to access. The VACM includes an access scheme to control user access to managed objects; context matching to define which objects can be accessed; and MIB views, defined by subsets of IOD subtree and associated bitmask entries, which define what a particular user can access in the MIB.
Steps for setting up SNMP include:
Agent configuration (all SNMP versions)
Trapsink configuration (all SNMP versions)
Associating event notifications with trapsinks (all SNMP versions)
View definition (V3 only)
Group and user configuration (V3 only)
StoreAll software implements an SNMP agent that supports the private StoreAll software MIB. The agent can be polled and can send SNMP traps to configured trapsinks.
Setting up SNMP notifications is similar to setting up email notifications. You must associate events to trapsinks and configure SNMP settings for each trapsink to enable the agent to send a trap when an event occurs.
NOTE: When Phone Home is enabled, you cannot edit or change the configuration of the StoreAll
SNMP agent with the ibrix_snmpagent. However, you can add trapsink IPs with ibrix_snmptrap and can associate events to the trapsink IP with ibrix_event.
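As a roadmap, the following hedged sequence strings together the commands described in the remaining subsections to set up SNMPv3 notifications end to end. All host, view, group, and user names and passwords are illustrative only; in particular, the trapsink user (trapsender) must already be defined on the trap receiver, and passwords must contain at least eight characters.
# turn on the SNMPv3 agent (the engine ID defaults to the agent host name)
ibrix_snmpagent -u -v 3 -n agenthost.domain.com -o DevLab-B3-U6 -s on
# register a v3 trapsink and send all Alert events to it
ibrix_snmptrap -c -h lab13-114 -v 3 -n trapsender -k auth-passwd -z priv-passwd
ibrix_event -c -y SNMP -e ALERT -m lab13-114
# expose the StoreAll private MIB in a view and grant a group read access to it
ibrix_snmpview -a -v hp -o .1.3.6.1.4.1.18997 -m .1.1.1.1.1.1.1
ibrix_snmpgroup -c -g group2 -s authNoPriv -r hp
# create an agent-side user in that group for the management station
ibrix_snmpuser -c -n user3 -g group2 -k auth-passwd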
72 Configuring cluster event notification
Page 73
Configuring the SNMP agent
The SNMP agent is created automatically when the Fusion Manager is installed. It is initially configured as an SNMPv2 agent and is off by default.
Some SNMP parameters and the SNMP default port are the same, regardless of SNMP version. The default agent port is 161. SYSCONTACT, SYSNAME, and SYSLOCATION are optional MIB-II agent parameters that have no default values.
NOTE: The default SNMP agent port was changed from 5061 to 161 in the StoreAll 6.1 release.
This port number cannot be changed.
The -c and -s options are also common to all SNMP versions. The -c option turns the encryption of community names and passwords on or off. There is no encryption by default. Using the -s option toggles the agent on and off; it turns the agent on by starting a listener on the SNMP port, and turns it off by shutting off the listener. The default is off.
The format for a v1 or v2 update command follows:
ibrix_snmpagent -u -v {1|2} [-p PORT] [-r READCOMMUNITY] [-w WRITECOMMUNITY] [-t SYSCONTACT] [-n SYSNAME] [-o SYSLOCATION] [-c {yes|no}] [-s {on|off}]
The update command for SNMPv1 and v2 uses optional community names. By convention, the default READCOMMUNITY name used for read-only access and assigned to the agent is public. No default WRITECOMMUNITY name is set for read-write access (although the name private is often used).
The following command updates a v2 agent with the write community name private, the agent’s system name, and that system’s physical location:
ibrix_snmpagent -u -v 2 -w private -n agenthost.domain.com -o DevLab-B3-U6
The SNMPv3 format adds an optional engine id that overrides the default value of the agent’s host name. The format also provides the -y and -z options, which determine whether a v3 agent can process v1/v2 read and write requests from the management station. The format is:
ibrix_snmpagent -u -v 3 [-e engineId] [-p PORT] [-r READCOMMUNITY] [-w WRITECOMMUNITY] [-t SYSCONTACT] [-n SYSNAME] [-o SYSLOCATION] [-y {yes|no}] [-z {yes|no}] [-c {yes|no}] [-s {on|off}]
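For example, the following hedged command updates a v3 agent so that it accepts v1/v2 read requests but rejects v1/v2 write requests (the system name and location values are illustrative):
ibrix_snmpagent -u -v 3 -n agenthost.domain.com -o DevLab-B3-U6 -y yes -z no -s on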
Configuring trapsink settings
A trapsink is the host destination where agents send traps, which are asynchronous notifications sent by the agent to the management station. A trapsink is specified either by name or IP address. StoreAll software supports multiple trapsinks; you can define any number of trapsinks of any SNMP version, but you can define only one trapsink per host, regardless of the version.
At a minimum, trapsink configuration requires a destination host and SNMP version. All other parameters are optional and many assume the default value if no value is specified.
The format for creating a v1/v2 trapsink is:
ibrix_snmptrap -c -h HOSTNAME -v {1|2} [-p PORT] [-m COMMUNITY] [-s {on|off}]
If a port is not specified, the command defaults to port 162. If a community is not specified, the command defaults to the community name public. The -s option toggles agent trap transmission on and off. The default is on. For example, to create a v2 trapsink with a new community name, enter:
ibrix_snmptrap -c -h lab13-116 -v 2 -m private
For a v3 trapsink, additional options define security settings. USERNAME is a v3 user defined on the trapsink host and is required. The security level associated with the trap message depends on which passwords are specified—the authentication password, both the authentication and privacy passwords, or no passwords. The CONTEXT_NAME is required if the trap receiver has defined subsets of managed objects. The format is:
Setting up SNMP notifications 73
Page 74
ibrix_snmptrap -c -h HOSTNAME -v 3 [-p PORT] -n USERNAME [-j {MD5|SHA}] [-k AUTHORIZATION_PASSWORD] [-y {DES|AES}] [-z PRIVACY_PASSWORD] [-x CONTEXT_NAME] [-s {on|off}]
The following command creates a v3 trapsink with a named user and specifies the passwords to be applied to the default algorithms. If specified, passwords must contain at least eight characters.
ibrix_snmptrap -c -h lab13-114 -v 3 -n trapsender -k auth-passwd -z priv-passwd
Associating events and trapsinks
Associating events with trapsinks is similar to associating events with email recipients, except that you specify the host name or IP address of the trapsink instead of an email address.
Use the ibrix_event command to associate SNMP events with trapsinks. The format is:
ibrix_event -c -y SNMP [-e ALERT|INFO|EVENTLIST] -m TRAPSINK
For example, to associate all Alert events and two Info events with a trapsink at IP address
192.168.2.32, enter:
ibrix_event -c -y SNMP -e ALERT,server.registered,filesystem.created -m 192.168.2.32
Use the ibrix_event -d command to dissociate events and trapsinks:
ibrix_event -d -y SNMP [-e ALERT|INFO|EVENTLIST] -m TRAPSINK
Defining views
A MIB view is a collection of paired OID subtrees and associated bitmasks that identify which subidentifiers are significant to the view’s definition. Using the bitmasks, individual OID subtrees can be included in or excluded from the view.
An instance of a managed object belongs to a view if:
The OID of the instance has at least as many sub-identifiers as the OID subtree in the view.
Each sub-identifier in the instance and the subtree match when the bitmask of the corresponding
sub-identifier is nonzero.
The Fusion Manager automatically creates the excludeAll view, which blocks access to all OIDs. This view cannot be deleted; it is the default read and write view if none is specified for a group with the ibrix_snmpgroup command. Its catch-all OID and mask are:
OID = .1 Mask = .1
Consider these example OID subtree and mask pairs:
OID = .1.3.6.1.4.1.18997 Mask = .1.1.1.1.1.1.1
OID = .1.3.6.1.2.1 Mask = .1.1.0.1.0.1
For the second pair, instance .1.3.6.1.2.1.1 matches, instance .1.3.6.1.4.1 matches (the differing fifth sub-identifier is masked out), and instance .1.2.6.1.2.1 does not match.
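The matching rule can also be expressed as a short shell sketch. This is for illustration only and is not part of the StoreAll CLI; it simply shows how a subtree and bitmask decide view membership.
# Illustration only: succeeds (returns 0) if INSTANCE belongs to the view entry
# defined by SUBTREE and MASK, following the rule described above.
oid_in_view() {                      # usage: oid_in_view .SUBTREE .MASK .INSTANCE
    local IFS=.
    local -a sub=(${1#.}) mask=(${2#.}) inst=(${3#.})
    (( ${#inst[@]} >= ${#sub[@]} )) || return 1    # instance must be at least as long
    local i
    for ((i = 0; i < ${#sub[@]}; i++)); do
        # compare a sub-identifier only when its mask entry is nonzero
        if [[ ${mask[i]:-1} != 0 && ${inst[i]} != "${sub[i]}" ]]; then
            return 1
        fi
    done
    return 0
}
oid_in_view .1.3.6.1.2.1 .1.1.0.1.0.1 .1.3.6.1.2.1.1 && echo "matches"
oid_in_view .1.3.6.1.2.1 .1.1.0.1.0.1 .1.2.6.1.2.1 || echo "does not match"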
To add a pairing of an OID subtree value and a mask value to a new or existing view, use the following format:
ibrix_snmpview -a -v VIEWNAME [-t {include|exclude}] -o OID_SUBTREE [-m MASK_BITS]
The subtree is added in the named view. For example, to add the StoreAll software private MIB to the view named hp, enter:
ibrix_snmpview -a -v hp -o .1.3.6.1.4.1.18997 -m .1.1.1.1.1.1.1
74 Configuring cluster event notification
Page 75
Configuring groups and users
A group defines the access control policy on managed objects for one or more users. All users must belong to a group. Groups and users exist only in SNMPv3. Groups are assigned a security level, which enforces use of authentication and privacy, and specific read and write views to identify which managed objects group members can read and write.
The command to create a group assigns its SNMPv3 security level, read and write views, and context name. A context is a collection of managed objects that can be accessed by an SNMP entity. A related option, -m, determines how the context is matched. The format follows:
ibrix_snmpgroup -c -g GROUPNAME [-s {noAuthNoPriv|authNoPriv|authPriv}] [-r READVIEW] [-w WRITEVIEW]
For example, to create the group group2 to require authorization, no encryption, and read access to the hp view, enter:
ibrix_snmpgroup -c -g group2 -s authNoPriv -r hp
The format to create a user and add that user to a group follows:
ibrix_snmpuser -c -n USERNAME -g GROUPNAME [-j {MD5|SHA}] [-k AUTHORIZATION_PASSWORD] [-y {DES|AES}] [-z PRIVACY_PASSWORD]
Authentication and privacy settings are optional. An authentication password is required if the group has a security level of either authNoPriv or authPriv. The privacy password is required if the group has a security level of authPriv. If unspecified, MD5 is used as the authentication algorithm and DES as the privacy algorithm, with no passwords assigned.
For example, to create user3, add that user to group2, and specify an authorization password for authorization and no encryption, enter:
ibrix_snmpuser -c -n user3 -g group2 -k auth-passwd -s authNoPriv
Deleting elements of the SNMP configuration
All SNMP commands use the same syntax for delete operations: the -d option indicates the object to be deleted. For example, the following command deletes a list of trapsink hosts:
ibrix_snmptrap -d -h lab15-12.domain.com,lab15-13.domain.com,lab15-14.domain.com
There are two restrictions on SNMP object deletions:
A view cannot be deleted if it is referenced by a group.
A group cannot be deleted if it is referenced by a user.
Listing SNMP configuration information
All SNMP commands employ the same syntax for list operations, using the -l flag. For example:
ibrix_snmpgroup -l
This command lists the defined group settings for all SNMP groups. Specifying an optional group name lists the defined settings for that group only.
Setting up SNMP notifications 75
Page 76
7 Configuring system backups
Backing up the Fusion Manager configuration
The Fusion Manager configuration is automatically backed up whenever the cluster configuration changes. The backup occurs on the node hosting the active Fusion Manager. The backup file is stored at <ibrixhome>/tmp/fmbackup.zip on that node.
The active Fusion Manager notifies the passive Fusion Manager when a new backup file is available. The passive Fusion Manager then copies the file to <ibrixhome>/tmp/fmbackup.zip on the node on which it is hosted. If a Fusion Manager is in maintenance mode, it will also be notified when a new backup file is created, and will retrieve it from the active Fusion Manager.
You can create an additional copy of the backup file at any time. Run the following command, which creates a fmbackup.zip file in the $IBRIXHOME/log directory:
$IBRIXHOME/bin/db_backup.sh
Once each day, a cron job rotates the $IBRIXHOME/log directory into the $IBRIXHOME/log/daily subdirectory. The cron job also creates a new backup of the Fusion Manager configuration in both $IBRIXHOME/tmp and $IBRIXHOME/log. To force a backup, use the following command:
ibrix_fm -B
IMPORTANT: You will need the backup file to recover from server failures or to undo unwanted
configuration changes. Whenever the cluster configuration changes, be sure to save a copy of fmbackup.zip in a safe, remote location such as a node on another cluster.
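For example, the following hedged sequence forces a backup and copies it off the cluster (the remote host and destination path are illustrative):
ibrix_fm -B
scp $IBRIXHOME/tmp/fmbackup.zip admin@backuphost.example.com:/backups/cluster1/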
Using NDMP backup applications
The NDMP backup feature can be used to back up and recover entire StoreAll software file systems or portions of a file system. You can use any supported NDMP backup application to perform the backup and recovery operations. (In NDMP terminology, the backup application is referred to as a Data Management Application, or DMA.) The DMA is run on a management station separate from the cluster and communicates with the cluster's file serving nodes over a configurable socket port.
The NDMP backup feature supports the following:
NDMP protocol versions 3 and 4
Two-way NDMP operations
Three-way NDMP operations between two network storage systems
Each file serving node functions as an NDMP Server and runs the NDMP Server daemon (ndmpd) process. When you start a backup or restore operation on the DMA, you can specify the node and tape device to be used for the operation.
Following are considerations for configuring and using the NDMP feature:
When configuring your system for NDMP operations, attach your tape devices to a SAN and
then verify that the file serving nodes to be used for backup/restore operations can see the appropriate devices.
When performing backup operations, take snapshots of your file systems and then back up
the snapshots.
When directory tree quotas are enabled, an NDMP restore to the original location fails if the hard quota limit is exceeded. The NDMP restore operation first creates a temporary file and then restores the file data to it. After this succeeds, the restore operation overwrites the existing file (if it is present in the same destination directory) with the temporary file. When the hard quota limit for the directory tree has been exceeded, NDMP cannot create the temporary file and the restore operation fails.
76 Configuring system backups
Page 77
Configuring NDMP parameters on the cluster
Certain NDMP parameters must be configured to enable communications between the DMA and the NDMP Servers in the cluster. To configure the parameters on the GUI, select Cluster Configuration from the Navigator, and then select NDMP Backup. The NDMP Configuration Summary shows the default values for the parameters. Click Modify to configure the parameters for your cluster on the Configure NDMP dialog box. See the online help for a description of each field.
Using NDMP backup applications 77
Page 78
To configure NDMP parameters from the CLI, use the following command:
ibrix_ndmpconfig -c [-d IP1,IP2,IP3,...] [-m MINPORT] [-x MAXPORT] [-n LISTENPORT] [-u USERNAME] [-p PASSWORD] [-e {0=disable,1=enable}] [-v {0-10}] [-w BYTES] [-z NUMSESSIONS]
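For example, the following hedged command registers a single DMA address and sets a port range and credentials (all values are illustrative and must match your DMA configuration):
ibrix_ndmpconfig -c -d 192.168.10.80 -m 1025 -x 2048 -n 10000 -u ndmpuser -p ndmppass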
NDMP process management
Normally all NDMP actions are controlled from the DMA. However, if the DMA cannot resolve a problem or you suspect that the DMA may have incorrect information about the NDMP environment, take the following actions from the GUI or CLI:
Cancel one or more NDMP sessions on a file serving node. Canceling a session stops all spawned session processes and frees their resources if necessary.
Reset the NDMP server on one or more file serving nodes. This step stops all spawned session
processes, stops the ndmpd and session monitor daemons, frees all resources held by NDMP, and restarts the daemons.
Viewing or canceling NDMP sessions
To view information about active NDMP sessions, select Cluster Configuration from the Navigator, and then select NDMP Backup > Active Sessions. For each session, the Active NDMP Sessions panel lists the host used for the session, the identifier generated by the backup application, the status of the session (backing up data, restoring data, or idle), the start time, and the IP address used by the DMA.
To cancel a session, select that session and click Cancel Session. Canceling a session kills all spawned session processes and frees their resources if necessary.
To see similar information for completed sessions, select NDMP Backup > Session History.
View active sessions from the CLI:
ibrix_ndmpsession -l
View completed sessions:
ibrix_ndmpsession -l -s [-t YYYY-MM-DD]
The -t option restricts the history to sessions occurring on or before the specified date.
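For example, to list sessions that completed on or before June 30, 2013:
ibrix_ndmpsession -l -s -t 2013-06-30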
Cancel sessions on a specific file serving node:
ibrix_ndmpsession -c SESSION1,SESSION2,SESSION3,... -h HOST
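For example, to cancel two sessions on file serving node ib1 (the session identifiers and host name are hypothetical):
ibrix_ndmpsession -c 1045,1046 -h ib1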
Starting, stopping, or restarting an NDMP Server
When a file serving node is booted, the NDMP Server is started automatically. If necessary, you can use the following command to start, stop, or restart the NDMP Server on one or more file serving nodes:
ibrix_server -s -t ndmp -c { start | stop | restart} [-h SERVERNAMES]
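For example, to restart the NDMP Server on two file serving nodes (host names are illustrative):
ibrix_server -s -t ndmp -c restart -h ib1,ib2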
78 Configuring system backups
Page 79
Viewing or rescanning tape and media changer devices
To view the tape and media changer devices currently configured for backups, select Cluster Configuration from the Navigator, and then select NDMP Backup > Tape Devices.
If you add a tape or media changer device to the SAN, click Rescan Device to update the list. If you remove a device and want to delete it from the list, reboot all of the servers to which the device is attached.
To view tape and media changer devices from the CLI, use the following command:
ibrix_tape -l
To rescan for devices, use the following command:
ibrix_tape -r
NDMP events
An NDMP Server can generate three types of events: INFO, WARN, and ALERT. These events are displayed on the GUI and can be viewed with the ibrix_event command.
INFO events. Identify when major NDMP operations start and finish, and also report progress. For example:
7012:Level 3 backup of /mnt/ibfs7 finished at Sat Nov 7 21:20:58 PST 2011
7013:Total Bytes = 38274665923, Average throughput = 236600391 bytes/sec.
WARN events. Indicate an issue with NDMP access, the environment, or NDMP operations. Be sure to review these events and take any necessary corrective actions. Following are some examples:
0000:Unauthorized NDMP Client 16.39.40.201 trying to connect
4002:User [joe] md5 mode login failed.
ALERT events. Indicate that an NDMP action has failed. For example:
1102:Cannot start the session_monitor daemon, ndmpd exiting.
7009:Level 6 backup of /mnt/shares/accounts1 failed (writing eod header error).
8001:Restore Failed to read data stream signature.
You can configure the system to send email or SNMP notifications when these types of events occur.
Using NDMP backup applications 79
Page 80
8 Creating host groups for StoreAll clients
A host group is a named set of StoreAll clients. Host groups provide a convenient way to centrally manage clients. You can put different sets of clients into host groups and then perform the following operations on all members of the group:
Create and delete mount points
Mount file systems
Prefer a network interface
Tune host parameters
Set allocation policies
Host groups are optional. If you do not choose to set them up, you can mount file systems on clients and tune host settings and allocation policies on an individual level.
How host groups work
In the simplest case, the host groups functionality allows you to perform an allowed operation on all StoreAll clients by executing a command on the default clients host group with the CLI or the GUI. The clients host group includes all StoreAll clients configured in the cluster.
NOTE: The command intention is stored on the Fusion Manager until the next time the clients
contact the Fusion Manager. (To force this contact, restart StoreAll software services on the clients, reboot the clients, or execute ibrix_lwmount -a or ibrix_lwhost --a.) When contacted, the Fusion Manager informs the clients about commands that were executed on host groups to which they belong. The clients then use this information to perform the operation.
You can also use host groups to perform different operations on different sets of clients. To do this, create a host group tree that includes the necessary host groups. You can then assign the clients manually, or the Fusion Manager can automatically perform the assignment when you register a StoreAll client, based on the client's cluster subnet. To use automatic assignment, create a domain rule that specifies the cluster subnet for the host group.
Creating a host group tree
The clients host group is the root element of the host group tree. Each host group in a tree can have only one parent, but a parent can have multiple children. In a host group tree, operations performed on lower-level nodes take precedence over operations performed on higher-level nodes. This means that you can effectively establish global client settings that you can override for specific clients.
For example, suppose that you want all clients to be able to mount file system ifs1 and to implement a set of host tunings denoted as Tuning 1, but you want to override these global settings for certain host groups. To do this, mount ifs1 on the clients host group, ifs2 on host group A, ifs3 on host group C, and ifs4 on host group D, in any order. Then, set Tuning 1 on the clients host group and Tuning 2 on host group B. The end result is that all clients in host group B will mount ifs1 and implement Tuning 2. The clients in host group A will mount ifs2 and implement Tuning 1. The clients in host groups C and D, respectively, will mount ifs3 and ifs4 and implement Tuning 1. The following diagram shows an example of these settings in a host group tree.
80 Creating host groups for StoreAll clients
Page 81
To create one level of host groups beneath the root, simply create the new host groups. You do not need to declare that the root node is the parent. To create lower levels of host groups, declare a parent element for each host group. Do not use a host name as a group name.
To create a host group tree using the CLI:
1. Create the first level of the tree:
ibrix_hostgroup -c -g GROUPNAME
2. Create all other levels by specifying a parent for the group:
ibrix_hostgroup -c -g GROUPNAME [-p PARENT]
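For example, the following hedged sequence builds a tree like the one in the earlier mounting and tuning example, assuming host groups A and B sit directly under the clients root and C and D are children of A:
ibrix_hostgroup -c -g A
ibrix_hostgroup -c -g B
ibrix_hostgroup -c -g C -p A
ibrix_hostgroup -c -g D -p A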
Adding a StoreAll client to a host group
You can add a StoreAll client to a host group or move a client to a different host group. All clients belong to the default clients host group.
To add or move a host to a host group, use the ibrix_hostgroup command as follows:
ibrix_hostgroup -m -g GROUP -h MEMBER
For example, to add the specified host to the finance group:
ibrix_hostgroup -m -g finance -h cl01.hp.com
Adding a domain rule to a host group
To configure automatic host group assignments, define a domain rule for host groups. A domain rule restricts host group membership to clients on a particular cluster subnet. When you register a client, the Fusion Manager performs a subnet match on the IP address that you specify for the client and sorts the client into the appropriate host group based on the domain rules.
Setting domain rules on host groups provides a convenient way to centrally manage mounting, tuning, allocation policies, and preferred networks on different subnets of clients. A domain rule is a subnet IP address that corresponds to a client network. Adding a domain rule to a host group restricts its members to StoreAll clients that are on the specified subnet. You can add a domain rule at any time.
To add a domain rule to a host group, use the ibrix_hostgroup command as follows:
ibrix_hostgroup -a -g GROUPNAME -D DOMAIN
Adding a StoreAll client to a host group 81
Page 82
For example, to add the domain rule 192.168 to the finance group:
ibrix_hostgroup -a -g finance -D 192.168
Viewing host groups
To view all host groups or a specific host group, use the following command:
ibrix_hostgroup -l [-g GROUP]
Deleting host groups
When you delete a host group, its members are reassigned to the parent of the deleted group. To force the reassigned StoreAll clients to implement the mounts, tunings, network interface
preferences, and allocation policies that have been set on their new host group, either restart StoreAll software services on the clients or execute the following commands locally:
ibrix_lwmount -a to force the client to pick up mounts or allocation policies
ibrix_lwhost --a to force the client to pick up host tunings
To delete a host group using the CLI:
ibrix_hostgroup -d -g GROUPNAME
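For example, to delete the finance host group created earlier and then have its reassigned clients pick up the settings of their new host group without a service restart (a hedged sketch; the second and third commands run locally on each affected client):
ibrix_hostgroup -d -g finance
ibrix_lwmount -a
ibrix_lwhost --a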
Other host group operations
Additional host group operations are described in the following locations:
Creating or deleting a mountpoint, and mounting or unmounting a file system (see “Creating
and mounting file systems” in the HP StoreAll Storage File System User Guide)
Changing host tuning parameters (see “Tuning file serving nodes and StoreAll clients”
(page 118))
Preferring a network interface (see “Preferring network interfaces” (page 131))
Setting allocation policy (see “Using file allocation” in the HP StoreAll Storage File System
User Guide)
82 Creating host groups for StoreAll clients
Page 83
9 Monitoring cluster operations
This chapter describes how to monitor the operational state of the cluster and how to monitor cluster health.
Monitoring X9720/9730 hardware
The GUI displays status, firmware versions, and device information for the servers, chassis, and system storage included in X9720 and 9730 systems. The Management Console displays a top-level status of the chassis, server, and storage hardware components. You can also drill down to view the status of individual chassis, server, and storage sub-components.
Monitoring servers
To view information about the servers and chassis included in your system:
1. Select Servers from the Navigator tree. The Servers panel lists the servers included in each chassis.
2. Select the server you want to obtain more information about. Information about the servers in the chassis is displayed in the right pane.
To view summary information for the selected server, select the Summary node in the lower Navigator tree.
Monitoring X9720/9730 hardware 83
Page 84
Select the server component that you want to view from the lower Navigator panel, such as NICs.
84 Monitoring cluster operations
Page 85
The following are the top-level options provided for the server:
NOTE: Information about the Hardware node can be found in “Monitoring hardware components”
(page 87).
HBAs. The HBAs panel displays the following information:
Node WWN
Port WWN
Backup
Monitoring
State
NICs. The NICs panel shows all NICs on the server, including offline NICs. The NICs panel
displays the following information:
Name
IP
Type
State
Monitoring X9720/9730 hardware 85
Page 86
Route
Standby Server
Standby Interface
Mountpoints. The Mountpoints panel displays the following information:
Mountpoint
Filesystem
Access
NFS. The NFS panel displays the following information:
Host
Path
Options
CIFS. The CIFS panel displays the following information:
NOTE: CIFS in the GUI has not been rebranded to SMB yet. CIFS is just a different name
for SMB.
Name
Value
Power. The Power panel displays the following information:
Host
Name
Type
IP Address
Slot ID
Events. The Events panel displays the following information:
Level
Time
Event
Hardware. The Hardware panel displays the following information:
The name of the hardware component.
The information gathered regarding that hardware component.
See “Monitoring hardware components” (page 87) for detailed information about the Hardware panel.
86 Monitoring cluster operations
Page 87
Monitoring hardware components
The front of the chassis includes server bays and the rear of the chassis includes components such as fans, power supplies, Onboard Administrator modules, and interconnect modules (VC modules and SAS switches). The following Onboard Administrator view shows a chassis enclosure on a StoreAll 9730 system.
To monitor these components from the GUI:
1. Click Servers from the upper Navigator tree.
2. Click Hardware from the lower Navigator tree for information about the chassis that contains the server selected on the Servers panel, as shown in the following image.
Monitoring X9720/9730 hardware 87
Page 88
Monitoring blade enclosures
To view summary information about the blade enclosures in the chassis:
1. Expand the Hardware node.
2. Select the Blade Enclosure node under the Hardware node. The following summary information is displayed for the blade enclosure:
Status
Type
Name
UUID
Serial number
Detailed information of the hardware components in the blade enclosure is provided by expanding the Blade Enclosure node and clicking one of the sub-nodes.
When you select one of the sub-nodes under the Blade Enclosure node, additional information is provided. For example, when you select the Fan node, additional information about the Fan for the blade enclosure is provided in the Fan panel.
88 Monitoring cluster operations
Page 89
The sub-nodes under the Blade Enclosure node provide information about the hardware components within the blade enclosure:
Monitoring X9720/9730 hardware 89
Page 90
Table 2 Obtaining detailed information about a blade enclosure

Panel name: Bay
Information provided: Status, Type, Name, UUID, Serial number, Model, Properties

Panel name: Temperature Sensor (displays information for a bay, an OA module, or the blade enclosure)
Information provided: Status, Type, UUID, Properties

Panel name: Fan (displays information for a blade enclosure)
Information provided: Status, Type, Name, UUID, Location, Properties

Panel name: OA Module
Information provided: Status, Type, Name, UUID, Serial number, Model, Firmware version, Location, Properties

Panel name: Power Supply
Information provided: Status, Type, Name, UUID, Serial number, Location

Panel name: Shared Interconnect
Information provided: Status, Type, Name, UUID, Serial number, Model, Firmware version, Location, Properties
90 Monitoring cluster operations
Page 91
Obtaining server details
The Management Console provides detailed information for each server in the chassis. To obtain summary information for a server, select the Server node under the Hardware node.
The following overview information is provided for each server:
Status
Type
Name
UUID
Serial number
Model
Firmware version
Message¹
Diagnostic Message¹
¹ This column appears dynamically, depending on the situation.
Obtain detailed information for hardware components in the server by clicking the nodes under the Server node.
Monitoring X9720/9730 hardware 91
Page 92
Table 3 Obtaining detailed information about a server

Panel name: CPU
Information provided: Status, Type, Name, UUID, Model, Location

Panel name: ILO Module
Information provided: Status, Type, Name, UUID, Serial Number, Model, Firmware Version, Properties

Panel name: Memory DIMM
Information provided: Status, Type, Name, UUID, Location, Properties

Panel name: NIC
Information provided: Status, Type, Name, UUID, Properties

Panel name: Power Management Controller
Information provided: Status, Type, Name, UUID, Firmware Version

Panel name: Storage Cluster
Information provided: Status, Type, Name, UUID
92 Monitoring cluster operations
Page 93
Table 3 Obtaining detailed information about a server (continued)

Panel name: Drive (displays information about each drive in a storage cluster)
Information provided: Status, Type, Name, UUID, Serial Number, Model, Firmware Version, Location, Properties

Panel name: Storage Controller (displayed for a server)
Information provided: Status, Type, Name, UUID, Serial Number, Model, Firmware Version, Location, Message, Diagnostic Message

Panel name: Volume (displays volume information for each server)
Information provided: Status, Type, Name, UUID, Properties

Panel name: Storage Controller (displayed for a storage cluster)
Information provided: Status, Type, UUID, Serial Number, Model, Firmware Version, Message, Diagnostic Message

Panel name: Battery (displayed for each storage controller)
Information provided: Status, Type, UUID, Properties

Panel name: IO Cache Module (displayed for a storage controller)
Information provided: Status, Type, UUID, Properties
Monitoring X9720/9730 hardware 93
Page 94
Table 3 Obtaining detailed information about a server (continued)

Panel name: Temperature Sensor (displays information for each temperature sensor)
Information provided: Status, Type, Name, UUID, Location, Properties
Monitoring storage and storage components
Select Vendor Storage from the Navigator tree to display status and device information for storage and storage components. The Vendor Storage panel lists the HP 9730 CX storage systems included in the system.
The Summary panel shows details for a selected vendor storage, as shown in the following image:
94 Monitoring cluster operations
Page 95
The Management Console provides a wide range of information about vendor storage, as shown in the following image.
Drill down into the following components in the lower Navigator tree to obtain additional details:
Servers. The Servers panel lists the host names for the attached storage.
Storage Cluster. The Storage Cluster panel provides detailed information about the storage
cluster. See “Monitoring storage clusters” (page 96) for more information.
Storage Switch. The Storage Switch panel provides detailed information about the storage
switches. See “Monitoring storage switches in a storage cluster” (page 101) for more information.
LUNs. The LUNs panel provides information about the LUNs in a storage cluster. See “Managing LUNs in a storage cluster” (page 101) for more information.
Monitoring X9720/9730 hardware 95
Page 96
Monitoring storage clusters
The Management Console provides detailed information for each storage cluster. Click one of the following sub-nodes displayed under the Storage Clusters node to obtain additional information:
Drive Enclosure. The Drive Enclosure panel provides detailed information about the drive
enclosure. Expand the Drive Enclosure node to view information about the power supply and sub enclosures. See “Monitoring drive enclosures for a storage cluster” (page 96) for more information.
Pool. The Pool panel provides detailed information about a pool in a storage cluster. Expand
the Pool node to view information about the volumes in the pool. See “Monitoring pools for
a storage cluster” (page 99) for more information.
Storage Controller. The Storage Controller panel provides detailed information about the
storage controller. Expand the Storage Controller node to view information about batteries and IO cache modules for a storage controller. See “Monitoring storage controllers for a
storage cluster” (page 100) for more information.
Monitoring drive enclosures for a storage cluster
Each 9730 CX has a single drive enclosure. That enclosure includes two sub-enclosures, which are shown under the Drive Enclosure node. Select one of the Sub Enclosure nodes to display information about one of the sub-enclosures.
96 Monitoring cluster operations
Page 97
Expand the Drive Enclosure node to provide additional information about the power supply and sub enclosures.
Table 4 Details provided for the drive enclosure

Node: Power Supply
Where to find detailed information: “Monitoring the power supply for a storage cluster” (page 97)

Node: Sub Enclosure
Where to find detailed information: “Monitoring sub enclosures” (page 98)
Monitoring the power supply for a storage cluster
Each drive enclosure also has power supplies. Select the Power Supply node to view the following information for each power supply in the drive enclosure:
Status
Type
Name
UUID
The Power Supply panel displayed in the following image provides information about four power supplies in an enclosure:
Monitoring X9720/9730 hardware 97
Page 98
Monitoring sub enclosures
Expand the Sub Enclosure node to obtain information about the following components for each sub-enclosure:
Drive. The Drive panel provides the following information about the drives in a sub-enclosure:
Status
Volume Name
Type
UUID
Serial Number
Model
Firmware Version
Location. This column displays where the drive is located. For example, assume the
location for a drive in the list is Port: 52 Box 1 Bay: 7. To find the drive, go to Bay 7. The port number specifies the switch number and switch port. For port 52, the drive is connected to port 2 on switch 5. For location Port: 72 Box 1, Bay 6, the drive is connected to port 2 on switch 7 in bay 6.
Properties
Fan. The Fan panel provides the following information about the fans in a sub-enclosure:
Status
Type
Name
UUID
Properties
SEP. The SEP panel provides the following information about the storage enclosure processors (SEPs)
in the sub-enclosure:
Status
Type
Name
UUID
Serial Number
Model
Firmware Version
Temperature Sensor. The Temperature Sensor panel provides the following information about
the temperature sensors in the sub-enclosure:
Status
Type
98 Monitoring cluster operations
Page 99
Name
UUID
Properties
Monitoring pools for a storage cluster
The Management Console lists a Pool node for each pool in the storage cluster. Select one of the Pool nodes to display information about that pool.
When you select the Pool node, the following information is displayed in the Pool panel:
Status
Type
Name
UUID
Properties
To obtain details on the volumes in the pool, expand the Pool node and then select the Volume node. The following information is displayed for the volume in the pool:
Status
Type
Name
Monitoring X9720/9730 hardware 99
Page 100
UUID
Properties
The following image shows information for two volumes named LUN_15 and LUN_16 on the Volume panel.
Monitoring storage controllers for a storage cluster
The Management Console displays a Storage Controller node for each storage controller in the storage cluster. Select the Storage Controller node to view the following information for the selected storage controller:
Status
Type
UUID
Serial Number
Model
Firmware Version
Location
Message
Diagnostic Message
Expand the Storage Controller node to obtain information about the battery and IO cache module for the storage controller.
Monitoring the batteries for a storage controller
The Battery panel displays the following information:
Status
Type
UUID
Properties. Provides information about the remaining charge and the charge status.
In the following image, the Battery panel shows the information about a battery that has 100 percent of its charge remaining.
Monitoring the IO Cache Modules for a storage controller
The IO Cache Module panel displays the following information about the IO cache module for a storage controller:
Status
Type
100 Monitoring cluster operations