Avago Technologies Syncro CS 9286-8e User Manual

Download

Syncro® CS 9286-8e Solution

User Guide

Version 3.0

November 2014

DB15-001017-02

Syncro CS 9286-8e Solution User Guide November 2014

For a comprehensive list of changes to this document, see the Revision History.

Corporate Headquarters Email Website

San Jose, CA globalsupport.pdl@avagotech.com www.lsi.com

800-372-2447

Avago, Avago Technologies, the A logo, LSI, Storage by LSI, CacheCade, CacheVault, Dimmer Switch, MegaRAID, MegaRAID Storage Manager, and Syncro are trademarks of Avago Technologies in the United States and other countries. All other brand and product names may be trademarks of their respective companies.

Syncro CS 9286-8e Solution User Guide November 2014

Table of Contents

Chapter 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.1 Concepts of High-Availability DAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.2 HA-DAS Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.3 Syncro CS 9286-8e Solution Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4 Hardware Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.5 Overview of the Cluster Setup and Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.6 Performance Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

Chapter 2: Hardware and Software Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.1 Syncro CS 9286-8e Two-Server Cluster Hardware Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2.2 Cabling Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Chapter 3: Creating the Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

3.1 Creating Virtual Drives on the Controller Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

3.1.1 Creating Shared or Exclusive VDs with the WebBIOS Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

3.1.2 Creating Shared or Exclusive VDs with StorCLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

3.1.3 Creating Shared or Exclusive VDs with MegaRAID Storage Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.1.3.1 Unsupported Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.2 HA-DAS CacheCade Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.3 Creating the Cluster in Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.1 Prerequisites for Cluster Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.1.1 Clustered RAID Controller Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.1.2 Enable Failover Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.1.3 Configure Network Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.3.2 Creating the Failover Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.3.3 Validating the Failover Cluster Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.4 Creating the Cluster in Red Hat Enterprise Linux (RHEL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.4.1 Prerequisites for Cluster Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.4.1.1 Configure Network Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.4.1.2 Install and Configure the High Availability Add-On Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.4.1.3 Configure SELinux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3.4.2 Creating the Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3.4.3 Configure the Logical Volumes and Apply GFS2 File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.4.4 Add a Fence Device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.4.5 Create a Failover Domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

3.4.6 Add Resources to the Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.4.7 Create Service Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

3.4.8 Mount the NFS Resource from the Remote Client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.5 Creating the Cluster in SuSE Linux Enterprise Server (SLES) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.5.1 Prerequisites for Cluster Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.5.1.1 Prepare the Operating System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

3.5.1.2 Configure Network Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

3.5.1.3 Connect to the NTP Server for Time Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

3.5.2 Creating the Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.5.2.1 Cluster Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3.5.3 Bringing the Cluster Online . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.5.4 Configuring the NFS Resource with STONITH SBD Fencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.5.4.1 Install NFSSERVER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.5.4.2 Configure the Partition and the File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

3.5.4.3 Configure stonith_sbd Fencing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3.5.5 Adding NFS Cluster Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

3.5.6 Mounting NFS in the Remote Client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Avago Technologies

- 3 -

Syncro CS 9286-8e Solution User Guide November 2014

Table of Contents

Chapter 4: System Administration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.1 High Availability Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

4.2 Understanding Failover Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.2.1 Understanding and Using Planned Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

4.2.1.1 Planned Failover in Windows Server 2012 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

4.2.1.2 Planned Failover in Windows Server 2008 R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

4.2.1.3 Planned Failover in Red Hat Enterprise Linux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.2.2 Understanding Unplanned Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.3 Updating the Syncro CS 9286-8e Controller Firmware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4.4 Updating the MegaRAID Driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

4.4.1 Updating the MegaRAID Driver in Windows Server 2008 R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.4.2 Updating the MegaRAID Driver in Windows Server 2012 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4.4.3 Updating the Red Hat Linux System Driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.4.4 Updating the SuSE Linux Enterprise Server 11 Driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

4.5 Performing Preventive Measures on Disk Drives and VDs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

Chapter 5: Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5.1 Verifying HA-DAS Support in Tools and the OS Driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5.2 Confirming SAS Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5.2.1 Using WebBIOS to View Connections for Controllers, Expanders, and Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

5.2.2 Using WebBIOS to Verify Dual-Ported SAS Addresses to Disk Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5.2.3 Using StorCLI to Verify Dual-Ported SAS Addresses to Disk Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

5.2.4 Using MegaRAID Storage Manager to Verify Dual-Ported SAS Addresses to Disk Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.3 Understanding CacheCade Behavior During a Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.4 Error Situations and Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

5.5 Event Messages and Error Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

5.5.1 Error Level Meaning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Avago Technologies

- 4 -

Syncro CS 9286-8e Solution User Guide November 2014

Chapter 1: Introduction

This document explains how to set up and configure the hardware and software for the Syncro® CS 9286-8e high-availability direct-attached storage (HA-DAS) solution.

The Syncro CS 9286-8e solution provides fault tolerance capabilities as a key part of a high-availability data storage system. The Syncro CS 9286-8e solution combines redundant servers, LSI® HA-DAS RAID controllers, computer nodes, cable connections, common SAS JBOD enclosures, and dual-ported SAS storage devices.

The redundant components and software technologies provide a high-availability system with ongoing service that is not interrupted by the following events:

 The failure of a single internal node does not interrupt service because the solution has multiple nodes with

cluster failover.

 An expander failure does not interrupt service because the dual expanders in every enclosure provide redundant

data paths.

 A drive failure does not interrupt service because RAID fault tolerance is part of the configuration.  A system storage expansion or maintenance activity can be completed without requiring an interruption of

service because of redundant components, management software, and maintenance procedures.

Concepts of High-Availability DAS

Chapter 1: Introduction

1.1 Concepts of High-Availability DAS

In terms of data storage and processing, High Availability (HA) means a computer system design that ensures a high level of operational continuity and data access reliability over a long period of time. High-availability systems are critical to the success and business needs of small and medium-sized business (SMB) customers, such as retail outlets and health care offices, who cannot afford to have their computer systems go down. An HA-DAS solution enables customers to maintain continuous access to and use of their computer system. Shared direct-attached drives are accessible to multiple servers, thereby maintaining ease of use and reducing storage costs.

A cluster is a group of computers working together to run a common set of applications and to present a single logical system to the client and application. Failover clustering provides redundancy to the cluster group to maximize up-time by utilizing fault-tolerant components. In the example of two servers with shared storage that comprise a failover cluster, when a server fails, the failover cluster automatically moves control of the shared resources to the surviving server with no interruption of processing. This configuration allows seamless failover capabilities in the event of planned failover (maintenance mode) for maintenance or upgrade, or in the event of a failure of the CPU, memory, or other server failures.

Because multiple initiators exist in a clustered pair of servers (nodes) with a common shared storage domain, there is a concept of device reservations in which physical drives, drive groups, and virtual drives (VDs) are managed by a selected single initiator. For HA-DAS, I/O transactions and RAID management operations are normally processed by a single Syncro CS 9286-8e controller, and the associated physical drives, drive groups, and VDs are only visible to that controller. To assure continued operation, all other physical drives, drive groups, and VDs are also visible to, though not normally controlled by, the Syncro CS controller. This key functionality allows the Syncro CS solution to share VDs among multiple initiators as well as exclusively constrain VD access to a particular initiator without the need for SAS zoning.

Node downtime in an HA system can be either planned and unplanned. Planned node downtime is the result of management-initiated events, such as upgrades and maintenance. Unplanned node downtime results from events that are not within the direct control of IT administrators, such as failed software, drivers, or hardware. The Syncro CS 9286-8e solution protects your data and maintains system up-time from both planned and unplanned node downtime. It also enables you to schedule node downtime to update hardware or firmware, and so on. When you bring one controller node down for scheduled maintenance, the other node takes over with no interruption of service.

Avago Technologies

- 5 -

Syncro CS 9286-8e Solution User Guide November 2014

1.2 HA-DAS Terminology

This section defines some additional important HA-DAS terms.

 Cache Mirror: A cache coherency term that describes the duplication of write-back cached data across

two controllers.

 Exclusive Access: A host access policy in which a VD is only exposed to, and accessed by, a single specified server.  Failover: The process in which the management of drive groups and VDs transitions from one controller to the

peer controller to maintain data access and availability.

 HA Domain: A type of storage domain that consists of a set of HA controllers, cables, shared disk resources, and

storage media.

 Peer Controller: A relative term to describe the HA controller in the HA domain that acts as the failover controller.  Server/Controller Node: A processing entity composed of a single host processor unit or multiple host processor

units that is characterized by having a single instance of a host operating system.

 Server Storage Cluster: An HA storage topology in which a common pool of storage devices is shared by two

computer nodes through dedicated Syncro CS 9286-8e controllers.

 Shared Access: A host access policy in which a VD is exposed to, and can be accessed by, all servers in the

HA domain.

 Virtual Drive (VD): A storage unit created by a RAID controller from one or more physical drives. Although a

virtual drive can consist of multiple drives, it is seen by the operating system as a single drive. Depending on the RAID level used, the virtual drive might retain redundant data in case of a drive failure.

Chapter 1: Introduction

HA-DAS Terminology

1.3 Syncro CS 9286-8e Solution Features

The Syncro CS 9286-8e solution supports the following HA features.

 Server storage cluster topology, enabled by the following supported operating systems:

— Microsoft® Windows Server® 2008 R2 — Microsoft Windows Server 2012 — Red Hat® Enterprise Linux® 6.3 — Red Hat Enterprise Linux 6.4 — SuSE® Linux Enterprise Server 11 SP3 — SuSE Linux Enterprise Server 11 SP2

 Clustering/HA services support:

— Microsoft failover clustering — Red Hat High Availability Add-on — SuSE High Availability Extensions

 Dual-active HA with shared storage  Controller-to-controller intercommunication over SAS  Write-back cache coherency  CacheCade® 1.0 (Read)  Shared and exclusive VD I/O access policies  Operating system boot from the controller (exclusive access)  Controller hardware and property mismatch detection, handling, and reporting  Global hot spare support for all volumes in the HA domain  Planned and unplanned failover modes  CacheVault® provides cache cached data protection in case of host power loss or server failure

Avago Technologies

- 6 -

Syncro CS 9286-8e Solution User Guide November 2014

 Full MegaRAID® features, with the following exceptions.

— T10 Data Integrity Field (DIF) is not supported. — Self-encrypting drives (SEDs) and full disk encryption (FDE) are not supported. — CacheCade 2.0 (write back) is not supported. — Dimmer Switch® is not supported. — SGPIO sideband signaling for enclosure management is not supported.

1.4 Hardware Compatibility

The servers, disk drives, and JBOD enclosures that you use in the Syncro CS 9286-8e solution must be selected from the list of approved components that LSI has tested for compatibility. Refer to the web page for the compatibility lists at http://www.lsi.com/channel/support/pages/interoperability.aspx.

1.5 Overview of the Cluster Setup and Configuration

Chapter 2 and Chapter 3 describe how to install the hardware and software so that you can use the fault tolerance

capabilities of HA-DAS to provide continuous service in event of drive failure or server failure, expand the system storage.

Chapter 1: Introduction

Hardware Compatibility

Chapter 2 describes how to install the Syncro CS 9286-8e controllers and connect them by cable to an external drive

enclosure. In addition, it lists the steps required after controller installation and cable connection, which include the following:

 Configure the drive groups and the virtual drives on the two controllers  Install the operating system driver on both server nodes.  Install the operating system on both server nodes, following the instructions from the manufacturer  Install StorCLI and MegaRAID Storage Manager™ utilities

Chapter 3 describes how to perform the following actions while using a supported OS:

 Install and enable the cluster feature on both servers.  Set up a cluster under the supported operating systems  Configure drive groups and virtual drives  Create a CacheCade 1.0 virtual drive as part of a Syncro CS 9286-8e configuration.

1.6 Performance Considerations

SAS technology offers throughput-intensive data transfers and low latency times. Throughput is crucial during failover periods where the system needs to process reconfiguration activity in a fast, efficient manner. SAS offers a throughput rate of 124 Gb/s over a single lane. SAS controllers and enclosures typically aggregate four lanes into an x4-wide link, giving an available bandwidth of 48 Gb/s across a single connector, which makes SAS ideal for HA environments.

Syncro CS controllers work together across a shared SAS Fabric to achieve sharing, cache coherency, heartbeat monitoring and redundancy by using a set of protocols to carry out these functions. At any point in time, a particular VD is accessed or owned by a single controller. This owned VD is a termed a local VD. The second controller is aware of the VD on the first controller, but it has only indirect access to the VD. The VD is a remote VD for the second controller.

Avago Technologies

- 7 -

Syncro CS 9286-8e Solution User Guide November 2014

In a configuration with multiple VDs, the workload is typically balanced across controllers to provide a higher degree of efficiency.

When a controller requires access to a remote VD, the I/Os are shipped to the remote controller, which processes the I/O locally. I/O requests that are handled by local VDs are much faster than those handled by remote VDs.

The preferred configuration is for the controller to own the VD that hosts the clustered resource (the MegaRAID Storage Manager utility shows which controller owns this VD). If the controller does not own this VD, it must issue a request to the peer controller to ship the data to it, which affects performance. This situation can occur if the configuration has been configured incorrectly or if the system is in a failover situation.

MegaRAID Storage Manager has no visibility to remote VDs, so all VD management operations must be performed locally. A controller that has no direct access to a VD must use I/O shipping to access the data if it receives a client data request. Accessing the remote VD affects performance because of the I/O shipping overhead.

Chapter 1: Introduction

Performance Considerations

NOTE Performance tip: You can reduce the impact of I/O shipping by

locating the VD or drive groups with the server node that is primarily driving the I/O load. Avoid drive group configurations with multiple VDs whose I/O load is split between the server nodes.

NOTE Performance tip: Use the MegaRAID Storage Manager utility to verify

the correct resource ownership and load balancing. Load balancing is a method of spreading work between two or more computers, network links, CPUs, drives, or other resources. Load balancing is used to maximize resource use, throughput, or response time. Load balancing is the key to ensuring that client requests are handled in a timely, efficient manner.

Avago Technologies

- 8 -

Syncro CS 9286-8e Solution User Guide November 2014

Chapter 2: Hardware and Software Setup

This chapter describes how to set up the hardware and software for a Syncro CS 9286-8e solution with two controller nodes and shared storage. For this implementation, you use two standard server modules with Syncro CS 9286-8e controllers that provide access to disks in one or more JBOD enclosures for reliable, high-access redundancy.

The LSI Syncro CS 9286-8e controller kit includes the following items:

 Two Syncro CS 9286-8e controllers  Two CacheVault Flash Modules 03 (CVFM03) (pre-installed on the controllers)  Two CacheVault Power Modules 02 (CVPM02)  Two CVPM02 mounting clips and hardware  Two CVPM02 extension cables  Two low-profile brackets  Syncro CS 9286-8e Controller Quick Installation Guide  Syncro CS Resource CD

The hardware configuration for the Syncro CS 9286-8e solution requires the following additional hardware that is not included in the kit:

 Two server modules from the approved compatibility list from LSI. The servers must include network cards.  One or more JBOD enclosures with SAS disk drives from the approved compatibility list from LSI.  A monitor and a pointing device for each server node.  Network cabling and SAS cabling to connect the servers and JBOD enclosures.

Chapter 2: Hardware and Software Setup

The following figure shows a high-level diagram of a Syncro CS 9286-8e solution connected to a network.

Avago Technologies

- 9 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 1 Two-Server Syncro CS 9286-8e Configuration

Syncro CS 9286-8e Two-Server Cluster Hardware Setup

Chapter 2: Hardware and Software Setup

I3#3)

#LIENTWITH

!PPLICATIONS

I3#3)

#LIENTWITH

!PPLICATIONS

3XEOLF/RFDO$UHD1HWZRUN/$1

6HUYHU1RGH$

&KDVVLV

3YNCRO#3E

4O P

#ONNECTOR

#ONNECTOR#ONNECTOR

#ONNECTOR



"OTTOM

#ONNECTOR



%XPANDER!

3ULYDWH/$1

I3#3)

#LIENTWITH

!PPLICATIONS

6HUYHU1RGH%

3YNCRO#3E

4O P

#ONNECTOR

#ONNECTOR#ONNECTOR

%XPANDER"

$.33ERVER

&KDVVLV

#ONNECTOR

.ETWORK

"OTTOM





([WHUQDO-%2'6$6'ULYH(QFORVXUH

.OTE %XPANDER!CONNECTSTO0ORT!OFTHE3!3DRIVES

%XPANDER"CONNECTSTO0ORT"OFTHE3!3DRIVES

3!33INGLE,ANES 3!3&OUR,ANES

?

2.1 Syncro CS 9286-8e Two-Server Cluster Hardware Setup

Follow these steps to set up the hardware for a Syncro CS 9286-8e configuration.

1. Unpack the Syncro CS 9286-8e controllers and the CVPM02 modules from the kit, and inspect them for damage.

If any components of the kit appear to be damaged, or if any items are missing from the kit, contact your LSI support representative.

2. Turn off the power to the server units, disconnect the power cords, and disconnect any network cables.

3. Remove the covers from the two server units.

Refer to the server documentation for instructions on how to do this.

4. Review the Syncro CS 9286-8e jumper settings and change them if necessary. Also note the location of the two external Mini-SAS SFF8088 connectors J1A4 and J1B1.

You usually do not need to change the default factory settings of the jumpers. The following figure shows the location of the jumpers and connectors on the controller board.

Avago Technologies

- 10 -

Syncro CS 9286-8e Solution User Guide November 2014

The CVFM03 module comes preinstalled on the Syncro CS controller; however, the module is not included in the following figure so that you can see all of the connectors and headers on the controller board. Figure 3 and

Figure 4 show the controller with the CVFM03 module installed.

Figure 2 Syncro CS 9286-8e Jumpers and Connectors

J1A1

J1A4

J1B1

J1A3

Ports

0 - 3

Ports

4 - 7

J2A1

J2A2

J2A4

Syncro CS 9286-8e Two-Server Cluster Hardware Setup

Chapter 2: Hardware and Software Setup

J4A1 J4A2 J5A1 J6A1

J5B1

3_00922-01

J2B1

In the figure, Pin 1 is highlighted in red for each jumper.

The following table describes the jumpers and connectors on the Syncro CS 9286-8e controller.

Table 1 Syncro CS 9286-8e Controller Jumpers

Jumper/

Connector

J1A4 External SFF-8088 4-port SAS connector In the cabling figures later in this chapter, this connector is called the

J1B1 External SFF-8088 4-port SAS connector In the cabling figures later in this chapter, this connector is called the

J1A1 Write-Pending LED header 2-pin connector

J1A3 Global Drive Fault LED header 2-pin connector

J2A1 Activity LED header 2-pin connector

J2A2 Advanced Software Options Hardware Key

header

J2A4

I2O Mode jumper

J4A1 Serial EEPROM 2-pin connector

Typ e Description

"top" connector.

"bottom" connector.

Connects to an LED that indicates when the data in the cache has yet to be written to the storage devices. Used when the write-back feature is enabled.

Connects to an LED that indicates whether a drive is in a fault condition.

Connects to an LED that indicates activity on the drives connected to the controller.

3-pin header

Enables support for the Advanced Software Options features.

2-pin connector

Installing this jumper causes the RAID controller to run in I The default, recommended mode of operation is without the shunt and running in Fusion mode.

Provides controller information, such as the serial number, revision, and manufacturing date. The default is no shunt installed.

O mode.

Avago Technologies

- 11 -

Syncro CS 9286-8e Solution User Guide November 2014

Table 1 Syncro CS 9286-8e Controller Jumpers (Continued)

Syncro CS 9286-8e Two-Server Cluster Hardware Setup

Chapter 2: Hardware and Software Setup

Jumper/

Connector

Typ e Description

J4A2 LSI Test header Reserved for Avago use.

J5A1 Serial UART connector for the expander Reserved for Avago use.

J6A1 Serial UART connector for the expander Reserved for Avago use.

5. Ground yourself, and then place the Syncro CS controller on a flat, clean, static-free surface.

NOTE If you want to replace a bracket, refer to "Replacing Brackets" on

MegaRAID SAS+SATA RAID Controllers Quick Installation Guide

for instructions.

6. Take the cable included with the kit, and insert one end of it into the 9-pin connector on the remote CVPM02 module, as shown in the following figure.

NOTE The CVPM02 module is a super-capacitor pack that provides power for

the cache offload capability to protect cached data in case of host power loss or server failure.

Figure 3 Connecting the Cable to the Remote CVPM02 Module

CVPM02

Module

3_01853-00

CVFM03

Module

J2A1

7. Mount the CVPM02 module to the inside of the server chassis, as shown in Figure 4. Refer to the server documentation to determine the exact method of mounting the module.

NOTE Because server chassis design varies from vendor to vendor, no

standard mounting option exists for the CVPM02 module that is compatible with all chassis configurations. Authorized resellers and chassis manufacturers might have recommendations for the location

Avago Technologies

- 12 -

Syncro CS 9286-8e Solution User Guide November 2014

8. Make sure that the power to the server is still turned off, that the power cords are unplugged, and that the chassis is grounded and has no AC power.

9. Insert the Syncro CS 9286-8e controller into a PCI Express® (PCIe®) slot on the motherboard, as shown in the following figure.

Press down gently, but firmly, to seat the Syncro CS 9286-8e controller correctly in the slot.

Figure 4 Installing the Syncro CS 9286-8e Controller and Connecting the Cable

Syncro CS 9286-8e Two-Server Cluster Hardware Setup

Chapter 2: Hardware and Software Setup

of the power module to provide the most flexibility within various environments.

NOTE The Syncro CS 9286-8e controller is a PCIe x8 card that can operate in

x8 or x16 slots. Some x16 PCIe slots support only PCIe graphics cards;

if you install a Syncro CS 9286-8e in one of these slots, the controller will not function. Refer to the motherboard documentation for information about the configuration of the PCIe slots.

Bracket Screw

Press

Here

Press

Here

3_01852-00

PCIe Slot

Edge of

Motherboard

10. Secure the controller to the computer chassis with the bracket screw.

11. Insert the other 9-pin cable connector on the cable into the J2A1 connector on the CVFM03 module, as shown in

Figure 3.

12. Repeat step 5 to step 11 to install the second Syncro CS 9286-8e controller in the second server module.

13. Install SAS disk drives in the JBOD enclosure or enclosures.

NOTE In a Syncro CS configuration, the expanders in the JBOD enclosure

must have two four-lane IN ports. As an option, the expanders can be

Avago Technologies

- 13 -

Syncro CS 9286-8e Solution User Guide November 2014

Refer to the drive documentation to determine any pre-installation configuration requirements. Be sure to use SAS disk drives that are listed on the approved list from LSI. (To view this list, follow the URL listed in Section 1.4,

Hardware Compatibility.)

14. If necessary, install network boards in the two server modules and install the cabling between them.

15. Reinstall the covers of the two servers.

16. Install the two server modules and the JBOD enclosure in an industry-standard cabinet, if appropriate, following the manufacturer’s instructions.

17. Use SAS cables to connect the two external connectors on the Syncro CS 9286-8e controller to the JBOD enclosure or enclosures. See Figure 2 to view the location of the external connectors.

See Section 2.2, Cabling Configurations, for specific cabling instructions for one or two JBOD enclosures.

18. Reconnect the power cords and turn on the power to the servers and the JBOD enclosure or enclosures.

Follow the generally accepted best practice by turning on the power on to the JBOD enclosure before you power the two servers. If you power the servers before you power the JBOD enclosure, the servers might not recognize the disk drives.

When the servers boot, a BIOS message appears. The firmware takes several seconds to initialize. The configuration utility prompt times out after several seconds. The second portion of the BIOS message shows the Syncro CS 9286-8e number, firmware version, and cache SDRAM size. The numbering of the controllers follows the PCI slot scanning order used by the host motherboard.

19. Configure the drive groups and the virtual drives on the two controllers.

For specific instructions, see Section 3.1, Creating Virtual Drives on the Controller Nodes. You can use the WebBIOS, StorCLI, or MegaRAID Storage Manager configuration utilities to create the groups and virtual drives.

20. Install the operating system driver on both server nodes.

You must install the software drivers first, before you install the operating system.

You can view the supported operating systems and download the latest drivers for the Syncro CS controllers from the LSI website at http://www.lsi.com/support/Pages/download-search.aspx. Access the download center, and follow the steps to download the appropriate driver.

Refer to the MegaRAID SAS Device Driver Installation User Guide on the Syncro CS Resource CD for more information about installing the driver. Be sure to review the readme file that accompanies the driver.

21. Install the operating system on both server nodes, following the instructions from the operating system vendor.

Make sure you apply all of the latest operating system updates and service packs to ensure proper functionality.

You have two options for installing the operating system for each controller node:

— Install it on a private volume connected to the system-native storage controller. The recommended best

practice is to install the operating system on this private volume because the disks in the clustering configuration cannot see this volume. Therefore, no danger exists of accidentally overwriting the operating system disk when you set up clustering.

— Install it on an exclusive virtual drive connected to the Syncro CS 9286-8e controller. exclusive host access is

required for a boot volume so the volume is not overwritten accidentally when you create virtual drives for data storage. For instructions on creating exclusive virtual drives using the WebBIOS utility, see Section 3.1.1,

Creating Shared or Exclusive VDs with the WebBIOS Utility.

Syncro CS 9286-8e Two-Server Cluster Hardware Setup

Chapter 2: Hardware and Software Setup

configured with a third four-lane port to connect to other cascaded expanders, as shown in Figure 1. JBOD enclosures with dual expanders can support split mode or unified mode. For fault-tolerant cabling configurations, you typically configure the JBOD enclosure in unified mode. (Check with the JBOD enclosure vendor to determine the appropriate settings.)

NOTE The Syncro CS 9286-8e solution does not support booting from a

shared operating system volume.

Avago Technologies

- 14 -

Syncro CS 9286-8e Solution User Guide November 2014

22. Install StorCLI and MegaRAID Storage Manager for the Windows® and Linux operating systems following the installation steps outlined in the StorCLI Reference Manual and MegaRAID SAS Software User Guide on the Syncro CS Resource CD.

2.2 Cabling Configurations

This section has information about initially setting up a Syncro CS configuration with one or two JBOD enclosures. It also explains how to add a second JBOD enclosure to an operational single-JBOD configuration without interrupting service on the configuration.

System throughput problems can occur if you use the wrong kind of SAS cables. To minimize the potential for problems, use high-quality cables that meet SAS 2.1 standards and that are less than 6 meters long. See the list of approved cables and vendors on the web link listed at the end of Section 1.4, Hardware Compatibility.

The following figure shows the SAS cable connections for a two-controller-node configuration with a single JBOD enclosure.

NOTE In the figures in this section, Top Connector means the external

connector closest to the bracket screw support end of the controller bracket (connector J1A4 in Figure 2). Bottom Connector means the other external connector (connector J1B1 in Figure 2).

Chapter 2: Hardware and Software Setup

Cabling Configurations

Figure 5 Two-Controller-Node Configuration with Single JBOD Enclosure

6HUYHU1RGH$

&KDVVLV

3YNCRO#3E

4O P

#ONNECTOR

#ONNECTOR#ONNECTOR

#ONNECTOR



"OTTOM

#ONNECTOR



%XPANDER!

([WHUQDO-%2'6$6'ULYH(QFORVXUH

6HUYHU1RGH%

&KDVVLV

3YNCRO#3E

4O P

#ONNECTOR

#ONNECTOR#ONNECTOR

%XPANDER"

"OTTOM

#ONNECTOR





?

The cross-connections between the controllers provide redundant paths that safeguard against expander, cable, or expander failure.

To retain consistent device reporting, the corresponding port numbers for both controllers must be connected to a common enclosure expander. In this example, the top connector on each controllers is connected to Expander A and the bottom connector on each controller is connected to Expander B.

Avago Technologies

- 15 -

Syncro CS 9286-8e Solution User Guide November 2014

The following figure shows the SAS cable connections for a two-controller-node configuration with two JBOD enclosures.

Figure 6 Two-Controller-Node Configuration with Dual JBOD Enclosures

Chapter 2: Hardware and Software Setup

Cabling Configurations

NOTE To save space, the figure does not show the disk drives that are in the

JBOD enclosures.

6HUYHU1RGH$

6\QFUR&6H

4OP

#ONNECTOR

#ONNECTOR

)N

([SDQGHU$

#ONNECTOR

/UT

#ONNECTOR

)N

([SDQGHU$

#ONNECTOR

/UT

"OTTOM

#ONNECTOR

#ONNECTOR

)N

'ULYH(QFORVXUH$

#ONNECTOR

)N

'ULYH(QFORVXUH%

6HUYHU1RGH%

6\QFUR&6H

4OP

#ONNECTOR

)N

([SDQGHU%

#ONNECTOR

/UT

#ONNECTOR

)N

([SDQGHU%

#ONNECTOR

/UT

"OTTOM

#ONNECTOR

#ONNECTOR

)N

#ONNECTOR

)N

?

The recommended method shown in the preceding figure is preferable to simply daisy-chaining a second JBOD enclosure from the single-JBOD configuration shown in Figure 5, because a single power failure in the first JBOD enclosure could interrupt all data access. Instead, connect the second Syncro CS controller envisioning the JBOD enclosures in reverse order. The resulting top-down/bottom-up cabling approach shown in the preceding figure is preferred because it assures continued access to operating drives if either of the JBOD enclosures fails or is removed.

The following figure shows how to hot-add a second JBOD enclosure to an existing two-server cluster without interrupting service on the HA configuration.

NOTE To save space, the figure does not show the disk drives that are in the

JBOD enclosures.

Avago Technologies

- 16 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 7 Adding a Second JBOD Enclosure – Redundant Configuration

Chapter 2: Hardware and Software Setup

Cabling Configurations

6HUYHU1RGH$

6\QFUR&6H

4OP

#ONNECTOR

)N

([SDQGHU$

#ONNECTOR

/UT



#ONNECTOR

)N

([SDQGHU$

#ONNECTOR

/UT

"OTTOM

#ONNECTOR

)N

'ULYH(QFORVXUH$

#ONNECTOR

)N

'ULYH(QFORVXUH%

6HUYHU1RGH%

6\QFUR&6H

4OP

#ONNECTOR

)N

([SDQGHU%

#ONNECTOR

/UT

#ONNECTOR

)N

([SDQGHU%

#ONNECTOR

/UT

"OTTOM

#ONNECTOR

)N



#ONNECTOR

)N





?

The steps for adding the second JBOD enclosure are as follows:

1. Connect a link from connector 2 on Expander B of JBOD enclosure A to connector 1 on expander B of JBOD enclosure B.

2. Disconnect the link from connector 1 on expander A of JBOD enclosure A and reconnect it to connector 1 on expander A of JBOD enclosure B.

3. Disconnect the link from connector 0 on expander A of JBOD enclosure A and reconnect it to connector 0 on expander A of JBOD enclosure B.

4. Connect the link from connector 2 on expander A of JBOD enclosure B to connector 0 on expander A of JBOD enclosure A.

Avago Technologies

- 17 -

Syncro CS 9286-8e Solution User Guide November 2014

Chapter 3: Creating the Cluster

This chapter explains how to set up HA-DAS clustering on a Syncro CS 9286-8e configuration after the hardware is fully configured and the operating system is installed.

3.1 Creating Virtual Drives on the Controller Nodes

This section describes that next step in setup, which is creating VDs on the controller nodes.

The HA-DAS cluster configuration requires a minimum of one shared VD to be used as a quorum disk to enable the operating system support for clusters. Refer to the MegaRAID SAS Software User Guide for information about the available RAID levels and the advantages of each one.

As explained in the instructions in the following sections, VDs created for storage in an HA-DAS configuration must be shared. If you do not designate them as shared, the VDs are visible only from the controller node from which they were created.

You can use the WebBIOS pre-boot utility to create the VDs. You can also use the LSI MegaRAID Storage Manager (MSM) utility or the StorCLI utility to create VDs after the OS has booted. Refer to the MegaRAID SAS Software User Guide for complete instructions on using these utilities.

Creating Virtual Drives on the Controller Nodes

Chapter 3: Creating the Cluster

3.1.1 Creating Shared or Exclusive VDs with the WebBIOS Utility

To coordinate the configuration of the two controller nodes, both nodes must be booted into the WebBIOS pre-boot utility. The two nodes in the cluster system boot simultaneously after power on, so you must rapidly access both consoles. One of the systems is used to create the VDs; the other system simply remains in the pre-boot utility. This approach keeps the second system in a state that does not fail over while the VDs are being created on the first system.

NOTE The WebBIOS utility cannot see boot sectors on the disks. Therefore,

be careful not to select the boot disk for a VD. Preferably, unshare the boot disk before doing any configuration with the pre-boot utility. To do this, select Logical Drive Properties and deselect the Shared Virtual Disk property.

Follow these steps to create VDs with the WebBIOS utility.

1. When prompted during the POST on the two systems, use the keyboard to access the WebBIOS pre-boot BIOS utility (on both systems) by pressing Ctrl-H.

Respond quickly, because the system boot times are very similar and the time-out period is short. When both controller nodes are running the WebBIOS utility, follow these steps to create RAID 5 arrays.

NOTE To create a RAID 0, RAID 1, or RAID 6 array, modify the instructions to

select the appropriate number of disks.

2. Click Start.

3. On the WebBIOS main page, click Configuration Wizard, as shown in the following figure.

Avago Technologies

- 18 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 8 WebBIOS Main Page

Creating Virtual Drives on the Controller Nodes

Chapter 3: Creating the Cluster

The first Configuration Wizard window appears.

4. Select Add Configuration and click Next.

5. On the next wizard screen, select Manual Configuration and click Next.

The Drive Group Definition window appears.

6. In the Drives panel on the left, select the first drive, then hold down the Ctrl key and select more drives for the array, as shown in the following figure.

Figure 9 Selecting Drives

7. Click Add To Array, click ACCEPT, and click Next.

Avago Technologies

- 19 -

Syncro CS 9286-8e Solution User Guide November 2014

8. On the next screen, click Add to SPAN, and then click Next.

9. On the next screen, click Update Size.

10. Select Provide Shared Access on the bottom left of the window, as shown in the following figure.

Alternatively, deselect this option to create an exclusive VD as a boot volume for this cluster node.

Figure 10 Virtual Drive Definition

Creating Virtual Drives on the Controller Nodes

Chapter 3: Creating the Cluster

The Provide Shared Access option enables a shared VD that both controller nodes can access. If you uncheck this box, the VD has a status of exclusive, and only the controller node that created this VD can access it.

11. On this same page, click Accept, and then click Next.

12. On the next page, click Next.

13. Click Ye s to accept the configuration.

14. Repeat the previous steps to create the other VDs.

As the VDs are configured on the first controller node, the other controller node’s drive listing is updated to reflect the use of the drives.

15. When prompted, click Ye s to save the configuration, and click Yes to confirm that you want to initialize it.

16. Define hot spare disks for the VDs to maximize the level of data protection.

NOTE The Syncro CS 9286-8e solution supports global hot spares and

dedicated hot spares. Global hot spares are global for the cluster, not for a controller.

17. When all VDs are configured, reboot both systems as a cluster.

Avago Technologies

- 20 -

Syncro CS 9286-8e Solution User Guide November 2014

3.1.2 Creating Shared or Exclusive VDs with StorCLI

The Storage Command Line Tool, StorCLI, is the command-line management utility for MegaRAID, that you can use to create and manage VDs. StorCLI can run in any directory on the server. The following procedure assumes that a current copy of the 64-bit version of StorCLI is located on the server in a common directory as the StorCLI executable and the commands are run with administrator privileges.

1. At the command prompt, run the following command:

storcli /c0/vall show

The c0 parameter presumes that there is only one Syncro CS 9286-8e controller in the system or that these steps reference the first Syncro CS 9286-8e controller in a system with multiple controllers.

The following figure shows some sample configuration information that appears in response to the command.

Figure 11 Sample Configuration Information

Creating Virtual Drives on the Controller Nodes

Chapter 3: Creating the Cluster

The command generates many lines of information that scroll down in the window. You must use some of this information to create the shared VD.

2. Find the Device ID for the JBOD enclosure for the system and the Device IDs of the available physical drives for the VD that you will create.

In the second table in the preceding figure, the enclosure device ID of 252 appears under the heading EID, and the device ID of 0 appears under the heading DID. Use the scroll bar to find the device IDs for the other physical drives for the VD.

Detailed drive information, such as the drive group, capacity, and sector size, follows the device ID in the table and is explained in the text below the table.

3. Create the shared VD using the enclosure and drive device IDs with the following command line syntax:

Storcli /c0 add vd rX drives=e:s

The HA-DAS version of StorCLI creates, by default, a shared VD that is visible to all cluster nodes.

Avago Technologies

- 21 -

Syncro CS 9286-8e Solution User Guide November 2014

The following notes explain the command line parameters.

— The /c0 parameter selects the first Syncro CS 9286-8e controller in the system. — The add vd parameter configures and adds a VD (logical disk). — The rX parameter selects the RAID level, where X is the level. — The opening and closing square brackets define the list of drives for the VD. Each drive is listed in the form

enclosure device ID: [slot]drive device ID.

NOTE To create a VD that is visible only to the node that created it (such as

creating a boot volume for this cluster node), add the [ExclusiveAccess] parameter to the command line.

For more information about StorCLI command line parameters, refer to the MegaRAID SAS Software User Guide.

3.1.3 Creating Shared or Exclusive VDs with MegaRAID Storage Manager

Follow these steps to create VDs for data storage with MegaRAID Storage Manager. When you create the VDs, you assign the Share Virtual Drive property to them to make them visible from both controller nodes. This example assumes you are creating a RAID 5 redundant VD. Modify the instructions as needed for other RAID levels.

NOTE Not all versions of MegaRAID Storage Manager support HA-DAS.

Check the release notes to determine if your version of MegaRAID Storage Manager supports HA-DAS. Also, see Section 5.1, Verifying

HA-DAS Support in Tools and the OS Driver.

1. In the left panel of the MegaRAID Storage Manager Logical pane, right-click the Syncro CS 9286-8e controller and select Create Virtual Drive from the pop-up menu.

The Create Virtual Drive wizard appears.

2. Select the Advanced option and click Next.

3. In the next wizard screen, select RAID 5 as the RAID level, and select unconfigured drives for the VD, as shown in the following figure.

Creating Virtual Drives on the Controller Nodes

Chapter 3: Creating the Cluster

Figure 12 Drive Group Settings

Avago Technologies

- 22 -

Syncro CS 9286-8e Solution User Guide November 2014

4. Click Add to add the VD to the drive group.

The selected drives appear in the Drive groups window on the right.

5. Click Create Drive Group. Then click Next to continue to the next window.

The Virtual Drive Settings window appears.

6. Enter a name for the VD.

7. Select Always Write Back as the Write policy option, and select other VD settings as required.

8. Select the Provide Shared Access option, as shown in the following figure.

Figure 13 Provide Shared Access Option

Creating Virtual Drives on the Controller Nodes

Chapter 3: Creating the Cluster

NOTE If you do not select Provide Shared Access, the VD is visible only from

the server node on which it is created. Leave this option unselected if you are creating a boot volume for this cluster node.

9. Click Create Virtual Drive to create the virtual drive with the settings you specified.

The new VD appears in the Drive groups window on the right of the window.

10. Click Next to continue.

The Create Virtual Drive Summary window appears, as shown in the following figure.

Avago Technologies

- 23 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 14 Create Virtual Drive Summary

Chapter 3: Creating the Cluster

HA-DAS CacheCade Support

11. Click Finish to complete the VD creation process.

12. Click OK when the Create Virtual Drive - complete message appears.

3.1.3.1 Unsupported Drives

Drives that are used in the Syncro CS 9286-8e solution must selected from the list of approved drives listed on the LSI website (see the URL in Section 1.4, Hardware Compatibility). If the MegaRAID Storage Manager (MSM) utility finds a drive that does not meet this requirement, it marks the drive as Unsupported, as shown in the following figure.

Figure 15 Unsupported Drive in MegaRAID Storage Manager

3.2 HA-DAS CacheCade Support

The Syncro CS 9286-8e controller includes support for CacheCade 1.0, a feature that uses SAS SSD devices for read caching of frequently accessed read data. When a VD is enabled for the CacheCade feature, frequently read data regions of the VD are copied into the SSD when the CacheCade algorithm determines the region is a good candidate. When the data region is in the CacheCade SSD volume, the firmware can service related reads from the faster-access SSD volume instead of the higher-latency VD. The CacheCade feature uses a single SSD to service multiple VDs.

Avago Technologies

- 24 -

Syncro CS 9286-8e Solution User Guide November 2014

The Syncro CS 9286-8e solution requires the use of SAS SSDs that support SCSI-3 persistent reservations (PR) for CacheCade VDs. LSI maintains a list of SAS SSD drives that meet the HA-DAS requirements

Follow these steps to create a CacheCade 1.0 VD as part of a Syncro CS 9286-8e configuration. The procedure automatically associates the CacheCade volume with all existing shared VDs in the configuration. Be sure that one or more SAS SSD drives are already installed in the system. Also, be sure you are using a version of MegaRAID Storage Manager that supports HA-DAS.

1. In MegaRAID Storage Manager, open the physical view, right-click on the controller name and select Create CacheCade SSD Caching.

2. In the Drive Group window, set the CacheCade RAID level and select one or more unconfigured SSD drives. Use the Add button to place the selected drives into the drive group.

RAID 0 is the recommended RAID level for the CacheCade volume.

The following figure shows the CacheCade drive group.

Chapter 3: Creating the Cluster

HA-DAS CacheCade Support

NOTE A CacheCade VD is not presented to the host operating system, and it

does not move to the peer controller node when a failover occurs. A CacheCade VD possesses properties that are similar to a VD with exclusive host access. Therefore, the CacheCade volume does not cache read I/Os for VDs that are managed by the peer controller node.

Figure 16 Creating a CacheCade Drive Group: 1

3. Click Create Drive Group and then click Next.

4. In the Create CacheCade SSD Caching Virtual Drive window, update the SSD Caching VD name and set the size as necessary.

The maximum allowable size for the CacheCade volume is 512 GB. To achieve optimal read cache performance, the recommended best practice is to make the size as large as possible with the available SSDs, up to this limit.

Avago Technologies

- 25 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 17 Creating a CacheCade Drive Group: 2

Chapter 3: Creating the Cluster

HA-DAS CacheCade Support

5. Click Create Virtual Drive, and then click Next.

6. In the Create CacheCade SSD Caching Summary window, review the configuration, and then click Finish.

Figure 18 Reviewing the Configuration

7. In the Create CacheCade SSD Caching Complete box, click OK.

The CacheCade VD now appears on the Logical tab of MegaRAID Storage Manager, as shown in the following figure. The CacheCade volume association with the drive groups appears in this view.

Avago Technologies

- 26 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 19 New CacheCade Drive Group

Chapter 3: Creating the Cluster

Creating the Cluster in Windows

3.3 Creating the Cluster in Windows

The following subsections describe how to validate the failover configuration and configure the cluster setup while running a Windows operating system.

3.3.1 Prerequisites for Cluster Setup

3.3.1.1 Clustered RAID Controller Support

Support for clustered RAID controllers is not enabled by default in Microsoft Windows Server 2012 or Microsoft Windows Server 2008 R2.

To enable support for this feature, please consult with your server vendor. For additional information, visit the Cluster in a Box Validation Kit for Windows Server site on the Microsoft Windows Server TechCenter website for Knowledge Base (KB) article 2839292 on enabling this support.

3.3.1.2 Enable Failover Clustering

The Microsoft Server 2012 operating system installation does not enable the clustering feature by default. Follow these steps to view the system settings, and, if necessary, to enable clustering.

1. From the desktop, launch Server Manager.

2. Click Manage and select Add Roles and Features.

3. If the Introduction box is enabled (and appears), click Next.

4. In the Select Installation Type box, select Role Based or Feature Based.

5. In the Select Destination Server box, select the system and click Next.

6. In the Select Server Roles list, click Next to present the Features list.

7. Make sure that failover clustering is installed, including the tools. If necessary, run the Add Roles and Features wizard to install the features dynamically from this user interface.

8. If the cluster nodes need to support I/O as iSCSI targets, expand File and Storage Services, File Services and check for iSCSI Target Server and Server for NFS.

Avago Technologies

- 27 -

Syncro CS 9286-8e Solution User Guide November 2014

During creation of the cluster, Windows automatically defines and creates the quorum, a configuration database that contains metadata required for the operation of the cluster. To create a shared VD for the quorum, see the instructions in Section 3.1, Creating Virtual Drives on the Controller Nodes.

To determine if the cluster is active, run MegaRAID Storage Manager and look at the Dashboard tab for the controller. The first of two nodes that boots shows the cluster status as Inactive until the second node is running and the MegaRAID Storage Manager dashboard on the first node has been refreshed.

The following figure shows the controller dashboard with Active peer controller status.

Figure 20 Controller Dashboard: Active Cluster Status

Chapter 3: Creating the Cluster

Creating the Cluster in Windows

NOTE The recommended best practice is to create a small redundant VD for

the quorum. A size of 500 MB is adequate for this purpose.

NOTE To refresh the MegaRAID Storage Manager dashboard, press F5 or

select Manage > Refresh on the menu.

3.3.1.3 Configure Network Settings

To establish inter-server node communication within the cluster, each server node is contained within a common network domain served by a DNS.

1. Set the IP addresses of each server node within the same domain.

2. Use the same DNS and log on as members of the same domain name.

See the following example network configuration settings.

Server 1:

IP address: 135.15.194.21

Subnet mask: 255.255.255.0

Default gateway: 135.15.194.1

DNS server: 135.15.194.23

Server 2:

IP address: 135.15.194.22

Subnet mask: 255.255.255.0

Default gateway: 135.15.194.1

DNS server: 135.15.194.23

Avago Technologies

- 28 -

Syncro CS 9286-8e Solution User Guide November 2014

3.3.2 Creating the Failover Cluster

After all of the cluster prerequisites have been fulfilled, you can a create Failover Cluster by performing the following steps.

1. Launch the Failover Cluster Manager Tool from Server Manager: Select Server Manager > Tools > Failover Cluster Manager.

2. Launch the Create Cluster wizard: Click Create Cluster... from the Actions panel.

3. Select Servers: Use the Select Server wizard to add the two servers you want to use for clustering.

4. Validation Warning: To ensure the proper operation of the cluster, Microsoft recommends validating the configuration of your cluster.

See Section 3.3.3, Validating the Failover Cluster Configuration, for additional details.

5. Access Point for Administering the Cluster: Enter the name that you want to assign to the Cluster in the Cluster Name field.

6. Confirmation: A brief report containing the cluster properties appears. If no other changes are required, you have the option to specify available storage by selecting the Add all eligible Storage to the cluster check box.

7. Creating the New Cluster: Failover Cluster Manager uses the selected parameters to create the cluster.

8. Summary: A cluster creation report summary appears; this report includes any errors or warnings encountered.

Click on the View Report button for additional details about the report.

Chapter 3: Creating the Cluster

Creating the Cluster in Windows

3.3.3 Validating the Failover Cluster Configuration

Microsoft recommends that you validate the failover configuration before you set up failover clustering. To do this, run the Validate a Configuration wizard for Windows Server 2008 R2 or Windows Server 2012, following the instructions from Microsoft. The tests in the validation wizard include simulations of cluster actions. The tests fall into the following categories:

 System Configuration tests. These tests analyze whether the two server modules meet specific requirements,

such as running the same version of the operating system version using the same software updates.

 Network tests. These tests analyze whether the planned cluster networks meet specific requirements, such as

requirements for network redundancy.

 Storage tests. These tests analyze whether the storage meets specific requirements, such as whether the storage

correctly supports the required SCSI commands and handles simulated cluster actions correctly.

Follow these steps to run the Validate a Configuration wizard.

NOTE You can also run the Validate a Configuration wizard after you create

the cluster.

1. Perform the following steps to add a registry key to enable support for shared storage that uses direct-attached clustered RAID controllers.

NOTE You can refer to the Microsoft Support page at

http://support.microsoft.com/kb/2839292 for more information

about the Windows Registry Entry procedure required to pass cluster validation.

a. Open the Registry editor (regedit.exe). b. Locate and then select the following registry subkey:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ClusDIsk\Parameters

c. Right-click the Parameters key and select New. d. Select DWORD and name it AllowBusTypeRAID.

Avago Technologies

- 29 -

Syncro CS 9286-8e Solution User Guide November 2014

e. After you create the key, assign it a value of 0x01. f. Click OK. g. Exit the Registry editor. h. Restart the computer.

2. In the failover cluster snap-in, in the console tree, make sure Failover Cluster Management is selected and then, under Management, click Validate a Configuration.

The Validate a Configuration wizard starts.

3. Follow the instructions for the wizard and run the tests.

Microsoft recommends that you run all available tests in the wizard.

4. When you arrive at the Summary page, click View Reports to view the results of the tests.

5. If any of the validation tests fails or results in a warning, correct the problems that were uncovered and run the test again.

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

NOTE Storage Spaces does not currently support Clustered RAID controllers.

Therefore, do not include the Validate Storage Spaces Persistent Reservation storage test in the storage test suite. For additional information, visit the Cluster in a Box Validation Kit for Windows Server site on the Microsoft Windows Server TechCenter website.

3.4 Creating the Cluster in Red Hat Enterprise Linux (RHEL)

The following subsections describe how to enable cluster support, create a two-node cluster and configure NFS-clustered resources for a Red Hat operating system.

Note that the Syncro CS solution requires the Red Hat Enterprise Linux High Availability add-on for the dual-active HA functionality to operate properly and ensure data integrity through fencing. Product information regarding the Red Hat Enterprise Linux High Availability add-on can be found at

http://www.redhat.com/products/enterprise-linux-add-ons/high-availability/.

3.4.1 Prerequisites for Cluster Setup

Before you create a cluster, perform the following tasks so that all of the necessary modules and settings are pre-configured. Additional details regarding Red Hat High Availability Add-On configuration and management can be found at:

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/pdf/Cluster_Administration/Red _Hat_Enterprise_Linux-6-Cluster_Administration-en-US.pdf.

3.4.1.1 Configure Network Settings

Perform the following steps to configure the network settings.

1. Activate the network connections for node eth0 and node eth1 by selecting the following paths:

System > Preferences > Network Connections > System eth0 > Edit > Check Connect automatically

System > Preferences > Network Connections > System eth1 > Edit > Check Connect automatically

2. NetworkManager is not supported on cluster nodes and should be removed or disabled. To disable, enter the following commands at the command prompt for both nodes:

service NetworkManager stop chkconfig NetworkManager off

Avago Technologies

- 30 -

Syncro CS 9286-8e Solution User Guide November 2014

3. Perform the following steps to assign static IP addresses for both nodes (a total of four IP addresses).

a. Select Setup > Network Configuration > Device Configuration. b. Use DNS: YOUR_IP_ADDRESS. c. Edit the /etc/hosts file to include the IP address and the hostname for both nodes and the client. d. Make sure you can ping hostname from both nodes.

The following are examples of the hosts file:

YOUR_IP_ADDRESS Node1

YOUR_IP_ADDRESS Node2

YOUR_IP_ADDRESS Client

4. Configure the following iptables firewall settings to allow cluster services communication:

— cman (Cluster Manager): UDP ports 5405 and 5405 — dlm (Distributed Lock Manager): TCP port 21064 — ricci (part of Conga remote agent): TCP port 11111 — modclustered (part of Conga remote agent): TCP port 16851 — luci (Conga User Interface server): TCP port 8084

3.4.1.2 Install and Configure the High Availability Add-On Features

The Syncro CS solution requires that the Red Hat Enterprise Linux High Availability add-on be applied to the base RHEL OS.

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Perform the following steps to install and configure the add-on feature.

1. Install the Red Hat Cluster Resource Group Manager, Logical Volume Manager (LVM), and GFS2 utilities, and then update to the latest version by typing the following commands at the command prompt:

yum install rgmanager lvm2-cluster gfs2-utils yum update

NOTE This step assumes that both nodes have been registered with Red Hat

using the Red Hat Subscription Manager.

2. Ricci is a daemon that runs on both server nodes and allows the cluster configuration commands to communicate with each cluster node.

Perform the following steps to change the ricci password for both server nodes.

a. Enter the following at the command prompt:

passwd ricci

b. Specify your password when prompted. c. Start the ricci service by entering the following at the command prompt for both nodes:

service ricci start

d. (Optional) Configure the ricci service to start on boot for both nodes by entering the following at the

command prompt:

chkconfig ricci on

3. Luci is a user interface server that allows you to configure the cluster using the High Availability management web interface, Conga.

NOTE Best Practice: You can run the luci web interface on either node but it is

best to run it on a remote management system.

Install luci by entering the following at the command prompt:

yum install luci service luci start

Avago Technologies

- 31 -

Syncro CS 9286-8e Solution User Guide November 2014

3.4.1.3 Configure SELinux

You need to configure SELinux policies to allow for clustering. Refer to Red Hat documentation to properly configure your application.

3.4.2 Creating the Cluster

Configuring cluster software often occurs on a single node and is then pushed to the remaining nodes in the cluster. Multiple methods exist to configure the cluster, such as using the command line, directly editing configuration files, and using a GUI. The procedures in this document use the Conga GUI tool to configure the cluster. After the Cluster is created, the following steps allow you to specify cluster resources, configure fencing, create a failover domain, and add cluster service groups.

Perform the following steps to create the Cluster.

1. Launch the luci web interface by going to https://YOUR_LUCI_SERVER_HOSTNAME:8084 from your web browser.

The following window appears.

Figure 21 Create New Cluster Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

2. Log in as root for the user, and enter the associated root password for the host server node.

3. Go to the Manage Cluster tab.

4. Click Create.

5. Enter a name in the Cluster Name field.

NOTE The Cluster Name field identifies the cluster and will be referenced in

proceeding steps.

6. Add each server node in the Node Name field.

7. In the password field, enter the ricci password for each server node participating in the cluster.

Avago Technologies

- 32 -

Syncro CS 9286-8e Solution User Guide November 2014

8. Select the check box for the Enable Share Storage Support option.

9. Click Create Cluster to complete.

3.4.3 Configure the Logical Volumes and Apply GFS2 File System

Perform the following steps to create a virtual drive volume that can be managed by the Linux kernel Logical Volume Manager. All of the commands in the following procedure are entered in the command line prompt.

1. Create a virtual drive with Shared access policy based on the steps defined in Section 3.1, Creating Virtual Drives

on the Controller Nodes.

2. Create a physical volume label for use with LVM by entering the following command:

pvcreate /dev/sdb

3. Create a volume group (vol_grp0) and map /dev/sdb to the volume group by entering the following command:

create vol_grp0 /dev/sdb

4. Create a virtual drive volume from the volume group of size X (gigabytes) by entering the following command:

lvcreate --size XXXG vol_grp0

NOTE Best Practice: Use the command vgdisplay to display X size

information of the volume group.

The system now has the following device file (BlockDevice): /dev/vol_grp0/lvol0. The GFS2 file system is a cluster file system that allows for shared storage access.

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

NOTE The Cluster Name is the name that you specified in Section 3.4.2,

Creating the Cluster.

5. Perform the following step to apply this file system to the virtual drives created in the previous step.

mkfs.gfs2 -p lock_dlm -t ClusterName:FSName -j NumberJournals BlockDevice

For example, using the virtual drive created in the previous step, the result is:

mkfs.gfs2 -p lock_dlm -t YOUR_CLUSTER_NAME:V1 -j 2 /dev/vol_grp0/lvol0.

6. Create mount points from each server node. For example, you can create a mount point by entering the following command: /root/mnt/vol1.

Avago Technologies

- 33 -

Syncro CS 9286-8e Solution User Guide November 2014

3.4.4 Add a Fence Device

Fencing ensures data integrity on the shared storage file system by removing (by a power-down) any problematic node from the cluster before the node compromises a shared resource.

Perform the following steps to add a Fence Device.

1. Select the Fence Device tab and then click Add on the following window.

2. Select SCSI Reservation Fencing.

3. Select the Nodes tab and then perform the following steps for both nodes.

4. Select a cluster node name.

5. Under the section for Fencing Devices, select Add Fence Method > Submit.

6. Select Add Fence Instance > Choose Fence Devices.

7. Select Create > Submit.

Figure 22 Fence Devices Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Avago Technologies

- 34 -

Syncro CS 9286-8e Solution User Guide November 2014

3.4.5 Create a Failover Domain

By default, all of the nodes can run any cluster service. To provide better administrative control over cluster services, Failover Domains limit which nodes are permitted to run a service or establish node preference.

Perform the following steps to create a failover domain.

1. Click the Failover Domains tab, and click Add on the following window.

2. Enter a failover domain name in the Name text box, and click the No Failback check box.

3. Select the nodes that you want to make members of the failover domain.

4. Specify any options needed for this resource in the Prioritized, Restricted, and No Failback check boxes.

5. Click Create to complete.

Figure 23 Add Failover Domain to Cluster Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Avago Technologies

- 35 -

Syncro CS 9286-8e Solution User Guide November 2014

3.4.6 Add Resources to the Cluster

Shared resources can be shared directories or properties, such as the IP address, that are tied to the cluster. These resources can be referenced by clients as though the cluster were a single server/entity. This section describes how to add GFS2 and IP address cluster resources.

Perform the following steps to create a GFS2 cluster resource:

1. Select the Resources tab, and click Add on the following window.

2. Select GFS2 from the pull-down menu.

3. Specify the name of the GF2 resource in the Name field.

4. Specify the mount point of the resource by using the mount point that you created for the shared storage logical volume in the Mount Point field.

5. Specify an appropriate reference for this resource in the Device, FS label, or UUID field.

6. Select GFS2 from the pull-down menu for the Filesystem Type field.

7. Specify any options needed for this volume in the Mount Options field.

8. Specify any options needed for this resource in the Filesystem ID field, and the Force Unmount, Enable NFS daemon and lockd workaround, or Reboot Host Node if Unmount Fails check boxes.

9. Select Submit to complete.

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Figure 24 Add GFS2 Resource to Cluster Window

Avago Technologies

- 36 -

Syncro CS 9286-8e Solution User Guide November 2014

Perform the following steps to create an IP Address cluster resource:

1. Select the Resources tab, and click Add on the following window.

2. Select IP Address from the pull-down menu.

3. Specify the address of the cluster resource in the IP Address field.

4. Specify any options needed for this resource in the Netmask Bites, Monitor Link, Disable Updates to Static Routes, and Number of Seconds to Sleep After Removing an IP Address fields.

5. Select Submit to complete.

Figure 25 Add IP Address Resource to Cluster Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Avago Technologies

- 37 -

Syncro CS 9286-8e Solution User Guide November 2014

Perform the following steps to create an NFSv3 Export cluster resource:

1. Select the Resources tab, and click Add on the following window.

2. Select NFS v3 Export from the pull-down menu.

3. Specify the name of the resource in the Name field.

Select Submit to complete.

Figure 26 Add the NFSv3 Export Resource to Cluster Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Avago Technologies

- 38 -

Syncro CS 9286-8e Solution User Guide November 2014

Perform the following steps to create an NFS Client cluster resource:

1. Select the Resources tab, and click Add on the following window.

2. Select NFS Client from the pull-down menu.

3. Specify the name of the resource in the Name field.

4. Specify the address of the resource in the Target Hostname, Wildcard, or Netgroup field.

5. Specify any options needed for this resource in the Allow Recovery of This NFS Client check box and Options field.

6. Select Submit to complete.

Figure 27 Add NFS Client Resource to Cluster Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Avago Technologies

- 39 -

Syncro CS 9286-8e Solution User Guide November 2014

3.4.7 Create Service Groups

Service groups allow for greater organization and management of cluster resources and services that are associated with the cluster.

Perform the following steps to create service groups:

1. Select the Services Group tab and click Add on the following window.

2. Choose a service name that describes the function for which you are creating the service.

3. Select a previously created failover domain from the pull-down menu.

4. Click the Add a Resource tab.

5. From the drop down menu, select the IP Address resource that you created earlier (all of the created resources appear on top).

6. Click Add Resource, and then select the GFS File System resource created earlier from drop-down menu.

7. Click Add a child resource to the added GFS File System resource, and select the NFS Export resource created earlier.

8. Click Add a child resource to the newly added NFS Export resource, and select the NFS Client resource created earlier.

Figure 28 Create Service Group Window

Creating the Cluster in Red Hat Enterprise Linux (RHEL)

Chapter 3: Creating the Cluster

Avago Technologies

- 40 -

Syncro CS 9286-8e Solution User Guide November 2014

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

3.4.8 Mount the NFS Resource from the Remote Client

Mount the NFS volume from the remote client by entering the following at the command line prompt:

mount –t nfs –o rw,nfsvers=3 exportname:/pathname /mntpoint

3.5 Creating the Cluster in SuSE Linux Enterprise Server (SLES)

The following subsections describe how to enable cluster support, create a two node cluster and configure a NFS clustered resource for SLES 11 SP2/SP3 operating system.

NOTE The Syncro CS solution requires the SuSE Linux Enterprise High

Availability (SLE-HA) extensions to operate properly. Additional product details regarding SuSE High Availability Extensions can be found at https://www.suse.com/products/highavailability/.

3.5.1 Prerequisites for Cluster Setup

Chapter 3: Creating the Cluster

Before you create a cluster, you need to perform the following tasks to ensure that all of the necessary modules and settings are pre-configured.

3.5.1.1 Prepare the Operating System

Perform the following steps to prepare the operating system:

1. Make sure that all of the maintenance updates for the SLES 11 Service Pack 2/3 are installed.

2. Install the SLE-HA extension by performing the following steps.

a. Download the SLE-HA Extension ISO to each node. b. To install this SLE-HA add-on, start YaST and select Software > Add-On Products. c. Select the local ISO image, and then enter the path to ISO Image. d. From the filter list, select Patterns, and activate the High Availability pattern in the pattern list. e. Click Accept to start installing the packages. f. Install the High Availability pattern on node 2 that is part of the cluster.

Avago Technologies

- 41 -

Syncro CS 9286-8e Solution User Guide November 2014

3.5.1.2 Configure Network Settings

Each node should have two Ethernet ports, with one (em1) connected to the network switch, and another (em2) connected to the em2 ethernet port on the other node.

Perform the following steps to configure network settings:

1. Perform the following steps to assign the static IP addresses:

a. In each node, set up the static IP address for both em1 and em2 ethernet ports by selecting Application >

yast > Network settings. b. Select the ethernet port em1, and then select Edit. c. Select the statically assigned IP address, input the IP Address/ Subnet Mask / Hostname, and then confirm

the changes. d. Repeat steps b and c for ethernet port em2.

2. You need to open the following ports for each node in the firewall for communication of cluster services between nodes:

— TCP Ports – 30865, 5560, 7630, 21064 — UDP Port – 5405

3. Create the file /etc/sysconfig/network/routes, and enter the following text:

default YOUR_GATEWAY_IPADDRESS - -

4. Edit the file /etc/resolv.conf to change the DNS IP address to the IP address of the DNS server in your network in the following format:

nameserver YOUR_DNS_IPADDRESS

Alternatively, you can set the DNS and Default gateway in the Network Settings screen Global Options tab, Host/DNS tab, or Routing tab, as shown in the following figures.

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

Figure 29 Network Settings on Global Options Tab

Avago Technologies

- 42 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 30 Network Settings on Hostname/DNS Tab

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

Avago Technologies

- 43 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 31 Network Settings on Routing Tab

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

5. Restart the network service if needed by entering the following command:

/etc/init.d/network restart

6. Edit the /etc/hosts file to include the IP address and the hostname for both node 1, node 2, and the remote client. Make sure that you can access both nodes through the public and private IP addresses.

The following examples use sles-ha1 to denote node 1 and sles-ha2 to denote node 2.

YOUR_IP_ADDRESS_1 sles-ha1.yourdomain.com sles-ha1 YOUR_IP_ADDRESS_2 sles-ha1.yourdomain.com sles-ha1 YOUR_IP_ADDRESS_3 sles-ha2.yourdomain.com sles-ha2 YOUR_IP_ADDRESS_4 sles-ha2.yourdomain.com sles-ha2 YOUR_CLIENT_IP_ADDRESS client.yourdomain client

7. Establish ssh keyless entry from node1 to node 2, and vice versa, by entering the following commands:

ssh-keygen ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1

3.5.1.3 Connect to the NTP Server for Time Synchronization

Perform the following steps to configure both the nodes to use the NTP server in the network for synchronizing the time.

1. Access Yast > Network Services > NTP configuration, and then select Now & on boot.

2. Select Add, check Server, and select the local NTP server.

3. Add the IP address of the NTP server in your network.

Avago Technologies

- 44 -

Syncro CS 9286-8e Solution User Guide November 2014

3.5.2 Creating the Cluster

You can use multiple methods to configure the cluster directly, such as using the command line, editing configuration files, and using a GUI. The procedures in this document use a combination of the Yast GUI tool and the command line to configure the cluster. After the cluster is online, you can perform the following steps to add NFS cluster resources.

3.5.2.1 Cluster Setup

Perform the following step to install the cluster setup automatically.

1. On node1, start the bootstrap script by entering the following command:

sleha-init

NOTE If NTP is not configured on the nodes, a warning appears. You can

2. Specify the Worldwide Identifier (WWID) for a shared virtual drive in your node when prompted.

The following WWID is shown as an example.

/dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343

3. On Node 2, start the bootstrap script by entering the following command.

sleha-join

4. Complete the cluster setup on node 2 by specifying the Worldwide Identifier (WWID) for a shared virtual drive in your node when prompted.

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

address the warning by configuring NTP by following the steps in

Section 3.5.1.3, Connect to the NTP Server for Time Synchronization.

After you perform the initial cluster setup using the bootstrap scripts, you need to make changes to the cluster settings that you could not make during bootstrap.

Perform the following steps to revise the cluster Communication Channels, Security, Service, Csync2, and conntrackd settings.

1. Start the cluster module from command line by entering the following command.

yast2 cluster

2. After the fields in the following screen display the information (according to your setup), click the check box next to the Auto Generate Node ID field to automatically generate a unique ID for every cluster node.

3. If you modified any options for an existing cluster, confirm your changes, and close the cluster module. YaST writes the configuration to /etc/corosync/corosync.conf.

Avago Technologies

- 45 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 32 Cluster Setup on the Communication Channels Tab

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

4. Click Finish.

5. After the fields in the following screen display the information, click Generate Auth Key File on Node1 only. This action creates an authentication key that is written to /etc/corosync/authkey.

To make node2 join the existing cluster, do not generate a new key file. Instead, copy the /etc/corosync/authkey from node1 to the node2 manually.

Avago Technologies

- 46 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 33 Cluster Setup on Security Tab

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

6. Click Finish.

The following screen appears.

Figure 34 Cluster Setup on Service Tab

Avago Technologies

- 47 -

Syncro CS 9286-8e Solution User Guide November 2014

7. Click Finish.

The following screen appears.

Figure 35 Cluster Setup on Csync2 Tab

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

8. To specify the synchronization group, click Add in the Sync Host Group, and enter the local hostnames of all nodes in your cluster. For each node, you must use exactly the strings that are returned by the hostname command.

9. Click Generate Pre-Shared-Keys to create a key file for the synchronization group. The key file is written to /etc/csync2/key_hagroup. After the key file has been created, you must copy it

manually to node2 of the cluster by performing the following steps:

a. Make sure the same Csync2 configuration is available on all nodes. To do so, copy the file

/etc/csync2/csync2.cfg manually to all node2 after you complete cluster configuration in node1.

Include this file in the list of files to be synchronized with Csync2.

b. Copy the file /etc/csync2/key_hagroup that you generated on node1 to node2 in the cluster, as it is

needed for authentication by Csync2. However, do not regenerate the file on node2; it needs to be the same file on all nodes.

c. Both Csync2 and xinetd must be running on all nodes. Execute the following commands on all nodes to make

both services start automatically at boot time and to start xinetd now:

chkconfig csync2 on chkconfig xinetd on rcxinetd start

d. Copy the configuration from node1 to node2 by using the following command:

csync2 -xv

This action places all of the files on all of the nodes. If all files are copied successfully, Csync2 finishes with no errors.

Avago Technologies

- 48 -

Syncro CS 9286-8e Solution User Guide November 2014

10. Activate Csync2 by clicking Turn Csync2 ON.

This action executes chkconfig csync2 to start Csync2 automatically at boot time.

11. Click Finish.

The following screen appears.

Figure 36 Cluster Setup on Conntrackd Tab

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

12. After the information appears, click Generate /etc/conntrackd/conntrackd.conf to create the configuration file for conntrackd.

13. Confirm your changes and close the cluster module.

If you set up the initial cluster exclusively with the YaST cluster module, you have now completed the basic configuration steps.

14. Starting at step 1 in this procedure, perform the steps for Node2.

Some keys do not need to be regenerated on Node2, but they have to be copied from Node1.

Avago Technologies

- 49 -

Syncro CS 9286-8e Solution User Guide November 2014

3.5.3 Bringing the Cluster Online

Perform the following steps to bring the cluster online.

1. Check if the openais service is already running by entering the following command:

rcopenais status

2. If the openais service is already running, go to step 3. If not, start OpenAIS/Corosync now by entering the following command:

rcopenais start

3. Repeat the steps above for each of the cluster nodes. On each of the nodes, check the cluster status with the following command:

crm_mon

If all of the nodes are online, the output should be similar to the following:

============ Last updated: Thu May 23 04:28:26 2013 Last change: Mon May 20 09:05:29 2013 by hacluster via crmd on sles-ha1 Stack: openais Current DC: sles-ha2 - partition with quorum Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e 2 Nodes configured, 2 expected votes 1 Resources configured.

Online: [ sles-ha2 sles-ha1 ] stonith-sbd (stonith:external/sbd): Started sles-ha2 ============

This output indicates that the cluster resource manager is started and is ready to manage resources.

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

3.5.4 Configuring the NFS Resource with STONITH SBD Fencing

The following subsections describe how to set up an NFS resource by installing the NFS kernel server, configuring the shared VD by partitioning, applying the ext3 file system, and configuring the stonith_sbd fencing.

3.5.4.1 Install NFSSERVER

Use Yast to install nfs-kernel-server and all of the required dependencies.

3.5.4.2 Configure the Partition and the File System

Perform the following steps to configure the partition and the file system.

1. Use fdisk or any other partition modification tool to create partitions on the virtual drive. For this example, /dev/sda is on a shared virtual drive with two partitions created: sda1 (part1) for sbd and sda2

(part2) for NFS mount (the actual data sharing partition)

2. Use mkfs to apply the ext3 file system to the partition(s)/

sles-ha1:~ # mkfs.ext3 /dev/sda1 sles-ha1:~ # mkfs.ext3 /dev/sda2

Avago Technologies

- 50 -

Syncro CS 9286-8e Solution User Guide November 2014

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

3.5.4.3 Configure stonith_sbd Fencing

Stonith_sbd is the fencing mechanism used in SLES-HA. Fencing ensures data integrity on the shared storage by not allowing problematic nodes from accessing the cluster resources. Before you create another resource, you have to configure this mechanism correctly.

For this example, the World Wide Name (WWN - 0x600605b00316386019265c4910e9a343) refers to /dev/sda1.

NOTE Use only the wwn-xyz device handle to configure stonith_sbd. The

/dev/sda1 device handle is not persistent, and using it can cause sbd unavailability after a reboot.

Perform the following step to set up the stonith_sbd fencing mechanism.

1. Create the sbd header and set the watchdog timeout to 52 seconds and mgswait timeout to 104 seconds by entering the following at the command prompt:

sles-ha1:~ # sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 -4 104 -1 52 create

The following output appears.

Initializing device /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 Creating version 2 header on device 3 Initializing 255 slots on device 3 Device /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 is initialized.

2. Verify that the sbd header was created and timeout set properly by entering the following at the command prompt:

sles-ha1:~ # sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 dump

The following output appears.

==Dumping header on disk /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 Header version : 2 Number of slots : 255 Sector size : 512 Timeout (watchdog) : 10 Timeout (allocate) : 2 Timeout (loop) : 1 Timeout (msgwait) : 104 ==Header on disk /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 is dumped

3. Add the contents to /etc/sysconfig/sbd by entering the following at the command prompt:

sles-ha1:~ # cat /etc/sysconfig/sbd

The following output appears.

SBD_DEVICE="/dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1" SBD_OPTS="-W"

4. Allocate a slot for the node 1 for sbd by entering the following at the command prompt:

sles-ha1: # scp /etc/sysconfig/sbd root@sles-ha2:/etc/sysconfig/ sles-ha1:/etc/sysconfig # sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 allocate sles-ha1

The following output appears.

Trying to allocate slot for sles-ha1 on device /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1. slot 0 is unused - trying to own Slot for sles-ha1 has been allocated on /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1.

Avago Technologies

- 51 -

Syncro CS 9286-8e Solution User Guide November 2014

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

5. Allocate a slot for the node 2 for sbd by entering the following at the command prompt:

sles-ha1:/etc/sysconfig # sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 allocate sles-ha2

The following output appears.

Trying to allocate slot for sles-ha2 on device /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1. slot 1 is unused - trying to own Slot for sles-ha2 has been allocated on /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1.

6. Verify that the both nodes have allocated slots for sbd by entering the following at the command prompt:

sles-ha1:/etc/sysconfig # sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 list

The following output appears:

0 sles-ha1 clear 1 sles-ha2 clear

7. Restart the corosync daemon on node 1 by entering the following at the command prompt:

sles-ha1: # rcopenais restart

The following output appears.

Stopping OpenAIS/corosync daemon (corosync): Stopping SBD - done OK Starting OpenAIS/Corosync daemon (corosync): Starting SBD - starting... OK

8. Restart the corosync daemon on node 2 by entering the following at the command prompt:

sles-ha2:# rcopenais restart

The following output appears.

Stopping OpenAIS/corosync daemon (corosync): Stopping SBD - done OK Starting OpenAIS/Corosync daemon (corosync): Starting SBD - starting... OK

9. Check whether both the nodes are able to communicate with each other through sbd by entering the following commands.

sles-ha2:# sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 message sles-ha1 test sles-ha1: # tail -f /var/log/messages

Output from node 1 similar to the following appears.

Jun 4 07:45:15 sles-ha1 sbd: [8066]: info: Received command test from sles-ha2 on disk /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1

10. Send a message from node 1 to node 2 to confirm that the message can be sent both ways by entering the following command.

sles-ha1: # sbd -d /dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1 message sles-ha2 test

11. After you confirm the message can be sent either way, configure stonith_sbd as a resource by entering the following commands in crm, the command line utility.

sles-ha1:# crm configure crm(live)configure# property stonith-enabled="true" crm(live)configure# property stonith-timeout="120s" crm(live)configure# primitive stonith_sbd stonith:external/sbd params sbd_device="/dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1" crm(live)configure# commit crm(live)configure# quit

Avago Technologies

- 52 -

Syncro CS 9286-8e Solution User Guide November 2014

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

12. Revise the global cluster policy settings by entering the following commands.

sles-ha1:# crm configure crm(live)configure# property no-quorum-policy="ignore" crm(live)configure# rsc_defaults resource-stickiness="100" crm(live)configure# commit crm(live)configure# quit

3.5.5 Adding NFS Cluster Resources

This section describes the method you can use to add NFS cluster resources using command line method. Alternatively, you can use the Pacemaker GUI tool.

1. Create the mount folders in both sles-ha1 and sles-ha2 according to your requirements, by using the following commands.

sles-ha1:# mkdir /nfs sles-ha1:# mkdir /nfs/part2 sles-ha2:# mkdir /nfs sles-ha2:# mkdir /nfs/part2

ATTENTION Do not manually mount the ext3 partition on this folder. The cluster

takes care of that action automatically. Mounting the partition manually would corrupt the file system.

2. Add the following contents to /etc/exports on both sles-ha1 and sles-ha2 by using the following command.

sles-ha2:~ # cat /etc/exports

The following output appears.

/nfs/part2 YOUR_SUBNET/YOUR_NETMASK (fsid=1,rw,no_root_squash,mountpoint)

3. Configure the NFSSERVER to be started and stopped by the cluster by using the following commands.

sles-ha2:~ # crm configure crm(live)configure# primitive lsb_nfsserver lsb:nfsserver op monitor interval="15s" timeout=”15s”

4. Configure a Filesystem service by using the following command.

crm(live)configure# primitive p_fs_part2 ocf:heartbeat:Filesystem params device=/dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part2 directory=/nfs/part2 fstype=ext3 op monitor interval="10s"

5. Configure a Virtual IP address. This IP address is different from the IP address that connects to the Ethernet ports. This IP address can move between both nodes. Also, enter the netmask according to your network.

crm(live)configure# primitive p_ip_nfs ocf:heartbeat:IPaddr2 params ip="YOUR_VIRTUAL_IPADDRESS" cidr_netmask="YOUR_NETMASK" op monitor interval="30s"

6. Create a group and add the resources part of the same group by using the following commands.

NOTE The stonith_sbd should not be part of this group. Make sure that all

added shared storage is listed at the beginning of the group order because migration of the storage resource is a dependency for other resources.

crm(live)configure# group g_nfs p_fs_part2 p_ip_nfs lsb_nfsserver crm(live)configure# edit g_nfs

The following output appears.

group g_nfs p_fs_part2 p_ip_nfs lsb_nfsserver \ meta target-role="Started"

Avago Technologies

- 53 -

Syncro CS 9286-8e Solution User Guide November 2014

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

7. Commit the changes and exit crm by entering the following commands:

crm(live)configure# commit crm(live)configure# quit

8. Check whether the resources are added and the parameters are set to the correct values by using the following command. If the output is not correct, modify the resources and parameters accordingly.

sles-ha2:~ # crm configure show

The following output appears.

node sles-ha1 node sles-ha2 primitive lsb_nfsserver lsb:nfsserver \ operations $id="lsb_nfsserver-operations" \ op monitor interval="15" timeout="15" primitive p_fs_part2 ocf:heartbeat:Filesystem \ params device="/dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part2" directory="/nfs/part2" fstype="ext3" \ op monitor interval="10s" primitive p_ip_nfs ocf:heartbeat:IPaddr2 \ params ip="YOUR_VIRTUAL_IPADDRESS" cidr_netmask="YOUR_NETMASK" \ op monitor interval="30s" primitive stonith_sbd stonith:external/sbd \ params sbd_device="/dev/disk/by-id/wwn-0x600605b00316386019265c4910e9a343-part1" group g_nfs p_ip_nfs p_fs_part2 lsb_nfsserver \ meta target-role="Started" property $id="cib-bootstrap-options" \ dc-version="1.1.6-b988976485d15cb702c9307df55512d323831a5e" \ cluster-infrastructure="openais" \ expected-quorum-votes="2" \ stonith-timeout="120s" \ no-quorum-policy="ignore" \ last-lrm-refresh="1370644577" \ default-action-timeout="120s" \ default-resource-stickiness="100"

9. (Optional) You can use the Pacemaker GUI as an alternative to configure the CRM parameters.

Perform the following steps to use the Pacemaker GUI: a. Before you use the cluster GUI for the first time, set the passwd for the hacluster account on both the

nodes by entering the following commands. This password is needed to connect to the GUI.

sles-ha2:~ # passwd hacluster sles-ha1:~ # passwd hacluster

b. Enter the following command to launch the GUI.

sles-ha2:~ # crm_gui

The Pacemaker GUI appears as shown in the following figures.

Avago Technologies

- 54 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 37 Pacemaker GUI on Policy Engine Tab (CRM Config)

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

The following figure shows the Pacemaker GUI with the CRM Daemon Engine tab selected.

Figure 38 Pacemaker GUI on CRM Daemon Engine Tab (CRM Config)

Avago Technologies

- 55 -

Syncro CS 9286-8e Solution User Guide November 2014

10. Check to confirm that the CRM is running by entering the following command.

sles-ha2:# crm_mon

The following output appears.

============ Last updated: Mon Jun 10 12:19:47 2013 Last change: Fri Jun 7 17:13:20 2013 by hacluster via mgmtd on sles-ha1 Stack: openais Current DC: sles-ha2 - partition with quorum Version: 1.1.6-b988976485d15cb702c9307df55512d323831a5e 2 Nodes configured, 2 expected votes 4 Resources configured. ============

3.5.6 Mounting NFS in the Remote Client

In the remote system, use the following command to mount the exported NFS partition:

mount -t nfs "YOUR_VIRTUAL_IPADDRESS":/nfs/part2 /srv/nfs/part2

Creating the Cluster in SuSE Linux Enterprise Server (SLES)

Chapter 3: Creating the Cluster

Avago Technologies

- 56 -

Syncro CS 9286-8e Solution User Guide November 2014

Chapter 4: System Administration

This chapter explains how to perform system administration tasks, such as planned failovers and updates of the Syncro CS 9286-8e controller firmware.

4.1 High Availability Properties

The following figure shows the high availability properties that MegaRAID Storage Manager displays on the Controller Properties tab for a Syncro CS 9286-8e controller.

Figure 39 Controller Properties: High Availability Properties

Chapter 4: System Administration

High Availability Properties

Following is a description of each high availability property:

 Topology Type – A descriptor of the HA topology for which the Syncro CS 9286-8e controller is currently

configured (the default is Server Storage Cluster).

 Maximum Controller Nodes – The maximum number of concurrent Syncro CS 9286-8e controllers within the HA

domain that the controller supports.

 Domain ID – A unique number that identifies the HA domain in which the controller is currently included. This

field has a number if the cluster or peer controller is in active state.

 Peer Controller Status – The current state of the peer controller.

Active: The peer controller is present and is participating in the HA domain. Inactive: The peer controller is missing or has failed. Incompatible: The peer controller is detected, but it has an incompatibility with the controller.

 Incompatibility Details – If the peer controller is incompatible, this field lists the cause of the incompatibility.

Avago Technologies

- 57 -

Syncro CS 9286-8e Solution User Guide November 2014

4.2 Understanding Failover Operations

A failover operation in HA-DAS is the process by which VD management transitions from one server node to the peer server node. A failover operation might result from a user-initiated, planned action to move an application to a different controller node so that maintenance activities can be performed, or the failover might be unintended and unplanned, resulting from hardware component failure that blocks access to the storage devices. Figure 40 and

Figure 41 show an example of a failover operation of various drive groups and VDs from Server A to Server B. The

following figure shows the condition of the two server nodes before the failover.

Figure 40 Before Failover from Server A to Server B

Chapter 4: System Administration

Understanding Failover Operations

Before failover, the cluster status is as follows in terms of managing the drive group and VDs:

 All VDs in A-DG0 (Server A - Drive Group 0) are managed by Server A.  VD3 in B-DG0 (Server B – Drive Group 0) is managed by Server B.  The CacheCade VD (CC-VD) in A-CC is managed by Server A and services VDs in drive group A-DG0.

Before failover, the operating system perspective is as follows:

 The operating system on Server A only sees VDs with shared host access and exclusive host access to Server A.  The operating system on Server B only sees VDs with shared host access and exclusive host access to Server B.

Before failover, the operating system perspective of I/O transactions is as follows:

 Server A is handling I/O transactions that rely on A-DG0:VD1 and A-DG0:VD2.  Server B is handling I/O transactions that rely on A-DG0:VD0 and B-DG0:VD3.

The following figure shows the condition of the two server nodes after the failover.

Avago Technologies

- 58 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 41 After Failover from Server A to Server B

Chapter 4: System Administration

Understanding Failover Operations

After failover, the cluster status is as follows, in terms of managing the drive group and the VDs:

 All shared VDs in A-DG0 have failed over and are now managed by Server B.  VD3 in B-DG0 is still managed by Server B.  The CacheCade VD (CC-VD) in A-CC now appears as a foreign VD on Server B but does not service any VDs in

A-DG0 or B-DG0.

After failover, the operating system perspective is as follows:

 The operating system on Server B manages all shared VDs and any exclusive Server B VDs.

After failover, the operating system perspective of I/O transactions is as follows:

 Failover Cluster Manager has moved the I/O transactions for VD2 on A-DG0 to Server B.  Server B continues to run I/O transactions on B-DG0:VD3.  I/O transactions that rely on the exclusive A-DG0:VD1 on Server A fail because exclusive volumes do not move

with a failover

NOTE When Server A returns, the management and I/O paths of the

pre-failover configurations are automatically restored.

The following sections provide more detailed information about planned failover and unplanned failover.

Avago Technologies

- 59 -

Syncro CS 9286-8e Solution User Guide November 2014

4.2.1 Understanding and Using Planned Failover

A planned failover occurs when you deliberately transfer control of the drive groups from one controller node to the other. The usual reason for initiating a planned failover is to perform some kind of maintenance or upgrade on one of the controller nodes—for example, upgrading the controller firmware, as described in the following section. A planned failover can occur when there is active data access to the shared drive groups.

Before you start a planned failover on a Syncro CS system, be sure that no processes are scheduled to run during that time. Be aware that system performance might be impacted during the planned failover.

NOTE Failed-over VDs with exclusive host access cannot be accessed unless

the VD host access is set to SHARED. Do not transition operating system boot volumes from exclusive to SHARED access.

4.2.1.1 Planned Failover in Windows Server 2012

Follow these steps to perform a planned failover on a Syncro CS 9286-8e system running Windows Server 2012.

1. Create a backup of the data on the Syncro CS 9286-8e system.

2. In the Failover Cluster Manager snap-in, if the cluster that you want to manage is not displayed in the console tree, right-click Failover Cluster Manager, click Manage a Cluster, and then select or specify the cluster that you want.

3. If the console tree is collapsed, expand the tree under the cluster that you want to configure.

4. Expand Services and Applications, and click the name of the virtual machine.

5. On the right-hand side of the screen, under Actions, click Move this service or application to another node, and click the name of the other node.

Chapter 4: System Administration

Understanding Failover Operations

As the virtual machine is moved, the status is displayed in the results panel (center panel). Verify that the move succeeded by inspecting the details of each node in the RAID management utility.

Avago Technologies

- 60 -

Syncro CS 9286-8e Solution User Guide November 2014

4.2.1.2 Planned Failover in Windows Server 2008 R2

Follow these steps to perform a planned failover on a Syncro CS 9286-8e system running Windows Server 2008 R2.

1. Create a backup of the data on the Syncro CS 9286-8e system.

2. Open the Failover Cluster Manager, as shown in the following figure.

Figure 42 Failover Cluster Manager

Chapter 4: System Administration

Understanding Failover Operations

3. In the left panel, expand the tree to display the disks, as shown in the following figure.

Figure 43 Expand Tree

4. Right-click on the entry in the Assigned To column in the center panel of the window.

5. On the pop-up menu, select Move > Select Node, as shown in the following figure.

6. Select the node for the planned failover.

Avago Technologies

- 61 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 44 Expand Tree

Chapter 4: System Administration

Understanding Failover Operations

Avago Technologies

- 62 -

Syncro CS 9286-8e Solution User Guide November 2014

4.2.1.3 Planned Failover in Red Hat Enterprise Linux

Follow these steps to perform a planned failover on a Syncro CS 9286-8e system running Red Hat Enterprise Linux.

1. Create a backup of the data on the Syncro CS 9286-8e system.

2. From the High Availability management web interface, select the Service Groups tab and select the service group that you want to migrate to the other node.

3. Select the node to migrate the service group to from the drop-down menu by the Status field.

4. Click the play radio button to migrate the service group

Figure 45 Migrate Service Groups

Chapter 4: System Administration

Understanding Failover Operations

4.2.2 Understanding Unplanned Failover

An unplanned failover might occur if the controller in one of the server nodes fails, or if the cable from one controller node to the JBOD enclosure is accidentally disconnected. The Syncro CS 9286-8e solution is designed to automatically switch to the other controller node when such an event occurs, without any disruption of access to the data on the drive groups.

NOTE When the failed controller node returns, the management and I/O

paths of the pre-failover configurations are automatically restored.

Avago Technologies

- 63 -

Syncro CS 9286-8e Solution User Guide November 2014

4.3 Updating the Syncro CS 9286-8e Controller Firmware

Follow these steps to update the firmware on the Syncro CS 9286-8e controller board. You must perform the update only on the controller node that is not currently accessing the drive groups.

NOTE Be sure that the version of firmware selected for the update is specified

for Syncro CS controllers. If you updating to a version of controller firmware that does not support Syncro CS controllers, you will experience a loss of HA-DAS functionality.

1. If necessary, perform a planned failover as described in the previous section to transfer control of the drive groups to the other controller node.

2. Start the MegaRAID Storage Manager utility on the controller node that does not currently own the cluster.

NOTE To determine which node currently owns the cluster in Windows

Server 2012, follow the steps in Section 4.2.1.2, Planned Failover in

Windows Server 2008 R2, up to step 3, where information about the

cluster disks is displayed in the center panel. The current owner of the cluster is listed in the Owner Node column.

3. In the left panel of the MegaRAID Storage Manager window, click the icon of the controller that requires an upgrade.

4. In the MegaRAID Storage Manager window, select Go To > Controller > Update Controller Firmware.

5. Click Browse to locate the .rom update file.

6. After you locate the file, click Ok.

The MegaRAID Storage Manager software displays the version of the existing firmware and the version of the new firmware file.

7. When you are prompted to indicate whether you want to upgrade the firmware, click Yes . The controller is updated with the new firmware code contained in the .rom file.

8. Reboot the controller node after the new firmware is flashed.

The new firmware does not take effect until reboot.

9. If desired, use planned failover to transfer control of the drive groups back to the controller node you just upgraded.

10. Repeat this process for the other controller.

11. Restore the cluster to its non-failed-over mode.

Updating the Syncro CS 9286-8e Controller Firmware

Chapter 4: System Administration

4.4 Updating the MegaRAID Driver

To update the MegaRAID driver used in the clustering configuration, download the latest version of the driver from the LSI website. Then follow these instructions for Windows Server 2008 R2, Windows Server 2012, Red Hat Linux, or SuSE Enterprise Linux.

Avago Technologies

- 64 -

Syncro CS 9286-8e Solution User Guide November 2014

4.4.1 Updating the MegaRAID Driver in Windows Server 2008 R2

As a recommended best practice, always back up system data before updating the driver, and then perform a planned failover. These steps are recommended because a driver update requires a system reboot.

1. Right-click on Computer and select Properties.

2. Click Change Settings, as shown in the following figure.

Figure 46 Windows Server 2008 R2 System Properties

Chapter 4: System Administration

Updating the MegaRAID Driver

3. Select the Hardware tab and click Device Manager.

4. Click Storage to expose the Syncro CS 9286-8e controller.

5. Right-click the Syncro CS 9286-8e controller and select Update Driver Software to start the Driver Update wizard, as shown in the following figure.

Avago Technologies

- 65 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 47 Updating the Driver Software

Chapter 4: System Administration

Updating the MegaRAID Driver

Syncro CS 92

6. Follow the instructions in the wizard.

Avago Technologies

- 66 -

Syncro CS 9286-8e Solution User Guide November 2014

4.4.2 Updating the MegaRAID Driver in Windows Server 2012

1. Run Server Manager and select Local Server on the left panel.

2. Click the Tasks selection list on the right-hand side of the window, as shown in the following figure.

Figure 48 Updating the Driver Software

Chapter 4: System Administration

Updating the MegaRAID Driver

3. Select Computer Management, then click Device Manager.

4. Click Storage to expose the Syncro CS controller.

5. Right-click the Syncro CS controller and select Update Driver Software, as shown in the following figure, to start the Driver Update wizard.

6. Follow the instructions in the wizard.

Avago Technologies

- 67 -

Syncro CS 9286-8e Solution User Guide November 2014

Figure 49 Updating the Driver Software

Chapter 4: System Administration

Updating the MegaRAID Driver

Syncro CS 92

4.4.3 Updating the Red Hat Linux System Driver

Perform the following steps to install or update to the latest version of the MegaSAS driver:

1. Boot the system.

2. Go to Console (your terminal GUI).

3. Install the Dynamic Kernel Module Support (DKMS) driver RPM.

Uninstall the earlier version first, if needed.

4. Install the MegaSAS driver RPM.

Uninstall the earlier version first, if needed.

5. Reboot the system to load the driver.

4.4.4 Updating the SuSE Linux Enterprise Server 11 Driver

Perform the following steps to install or upgrade to the latest version of the MegaSAS driver:

1. Boot the system.

2. Go to Console (your terminal GUI).

3. Run Dynamic Kernel Module Support (DKMS) driver RPM.

Uninstall the earlier version first, if needed.

4. Install the MegaSAS driver RPM.

Uninstall the earlier version first, if needed.

5. Reboot the system to load the driver.

Avago Technologies

- 68 -

Syncro CS 9286-8e Solution User Guide November 2014

Performing Preventive Measures on Disk Drives and VDs

4.5 Performing Preventive Measures on Disk Drives and VDs

The following drive and VD-level operations help to proactively detect disk drive and VD errors that could potentially cause the failure of a controller node. For more information about these operations, refer to the MegaRAID SAS Software User Guide.

 Patrol Read – A patrol read periodically verifies all sectors of disk drives that are connected to a controller,

including the system reserved area in the RAID configured drives. You can run a patrol read for all RAID levels and for all hot spare drives. A patrol read is initiated only when the controller is idle for a defined time period and has no other background activities.

 Consistency Check – You should periodically run a consistency check on fault-tolerant VDs (RAID 1, RAID 5,

RAID 6, RAID 10, RAID 50, and RAID 60 configurations; RAID 0 does not provide data redundancy). A consistency check scans the VDs to determine whether the data has become corrupted and needs to be restored.

For example, in a VD with parity, a consistency check computes the data on one drive and compares the results to the contents of the parity drive. You must run a consistency check if you suspect that the data on the VD might be corrupted.

NOTE Be sure to back up the data before running a consistency check if you

think the data might be corrupted.

Chapter 4: System Administration

Avago Technologies

- 69 -

Syncro CS 9286-8e Solution User Guide November 2014

Chapter 5: Troubleshooting

This chapter has information about troubleshooting a Syncro CS system.

5.1 Verifying HA-DAS Support in Tools and the OS Driver

Not all versions of MegaRAID Storage Manager (MSM) support HA-DAS. The MegaRAID Storage Manager versions that include support for HA-DAS have specific references to clustering. It is not always possible to determine the level of support from the MegaRAID Storage Manager version number. Instead, look for the MegaRAID Storage Manager user interface features that indicate clustering support. If the second item in the MegaRAID Storage Manager Properties box on the dashboard for the HA-DAS controller is High Availability Cluster status, the version supports HA-DAS. This entry does not appear on versions of MegaRAID Storage Manager without HA-DAS support.

You can also verify HA-DAS support in the MegaRAID Storage Manager Create Virtual Drive wizard. A Provide Shared Access check box appears only if the MegaRAID Storage Manager version supports clustering, as shown in the following figure.

Figure 50 Provide Shared Access Property

Verifying HA-DAS Support in Tools and the OS Driver

Chapter 5: Troubleshooting

Versions of MegaRAID Storage Manager that support HA-DAS also require an HA-DAS-capable OS driver to present HA-DAS features. The in-box driver for Windows Server 2012, RHEL 6.4, and SLES 11 SP3 do not present HA-DAS features in MegaRAID Storage Manager.

To determine if your version of StorCLI supports HA-DAS, enter this help command:

Storcli /c0 add vd ?

Avago Technologies

- 70 -

Syncro CS 9286-8e Solution User Guide November 2014

If the help text that is returned includes information about the Host Access Policy: Exclusive to peer Controller / Exclusive / Shared parameter, your version of StorCLI supports HA-DAS.

5.2 Confirming SAS Connections

The high availability functionality of HA-DAS is based on redundant SAS data paths between the clustered nodes and the disk drives. If all of the components in the SAS data path are configured and connected properly, each HA-DAS controller has two SAS addresses for every drive, when viewed from the HA-DAS controllers.

This section explains how to use three tools (StorCLI, WebBIOS, and MegaRAID Storage Manager) to confirm the correctness of the SAS data paths.

5.2.1 Using WebBIOS to View Connections for Controllers, Expanders, and Drives

Use the Physical View in WebBIOS to confirm the connections between the controllers and expanders in the Syncro CS system. As shown in the following figure, if both expanders are running, the view in WebBIOS from one of the nodes includes the other HA-DAS RAID controller (Processor 8 in the figure), the two expanders, and any drives, as shown in the following figure.

Chapter 5: Troubleshooting

Confirming SAS Connections

Figure 51 WebBIOS Physical View

If the other node is powered off, the other RAID controller does not appear in WebBIOS. Devices can appear and disappear while the system is running, as connections are changed. Use the WebBIOS rescan feature to rediscover the devices and topology after a connection change.

Avago Technologies

- 71 -

Syncro CS 9286-8e Solution User Guide November 2014

5.2.2 Using WebBIOS to Verify Dual-Ported SAS Addresses to Disk Drives

Use the Drive Properties View in WebBIOS to confirm that each SAS drive displays two SAS addresses. In a Syncro CS 9286-8e system that is properly cabled and configured, every drive should have two SAS addresses. If the system lacks redundant SAS data paths, the WebBIOS shows only one SAS address on the screen. For information about redundant cabling configurations, see Section 2.2, Cabling Configurations.

To check the drive SAS addresses, open the Physical View on the home page of WebBIOS and click a drive link. On the Disk Properties page, click Next. When the redundant SAS data paths are missing, this second view of drive properties shows only one SAS address in the left panel, as shown in the following figure.

Figure 52 Redundant SAS Data Paths Are Missing

Chapter 5: Troubleshooting

Confirming SAS Connections

The following figure shows the correct view with two drive SAS addresses.

Figure 53 Redundant SAS Data Paths Are Present

Avago Technologies

- 72 -

Syncro CS 9286-8e Solution User Guide November 2014

5.2.3 Using StorCLI to Verify Dual-Ported SAS Addresses to Disk Drives

The StorCLI configuration display command (show all) returns many lines of information, including a summary for each physical disk. To confirm the controller discovery of both SAS addresses for a single drive, examine the StorCLI configuration text for the drive information following the Physical Disk line. If only one of the drive’s SAS ports was discovered, the second SAS address is listed as 0x0. If both drive SAS ports were discovered, the second drive port SAS address is identical to the first except for the last hexadecimal digit, which always has a value of plus 1 or minus 1, relative to SAS Address(0).

The syntax of the StorCLI command is as follows:

Storcli /c0/ex/sx show all

The returned information relating to the physical disk is as follows. Some of the other preceding text is removed for brevity. The SAS addresses are listed at the end. In this example, only one of the drive’s SAS ports is discovered, so the second SAS address is listed as 0x0.

Drive /c0/e14/s9 : ================

-------------------------------------------------------------------------

EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp

-------------------------------------------------------------------------

14:9 8 Onln 2 278.875 GB SAS HDD N N 512B ST3300657SS U

-------------------------------------------------------------------------

Chapter 5: Troubleshooting

Confirming SAS Connections

Drive /c0/e14/s9 - Detailed Information : =======================================

Drive /c0/e14/s9 State : ====================== Shield Counter = 0 Media Error Count = 0

Drive /c0/e14/s9 Device attributes : ================================== SN = 6SJ2VR1N WWN = 5000C5004832228C Firmware Revision = 0008

Drive /c0/e14/s9 Policies/Settings : ================================== Drive position = DriveGroup:2, Span:1, Row:1 Enclosure position = 0 Connected Port Number = 0(path1)

Port Information : ================

-----------------------------------------

Port Status Linkspeed SAS address

-----------------------------------------

0 Active 6.0Gb/s 0x0 1 Active 6.0Gb/s 0x5000c5004832228e

-----------------------------------------

Avago Technologies

- 73 -

Syncro CS 9286-8e Solution User Guide November 2014

Understanding CacheCade Behavior During a Failover

Chapter 5: Troubleshooting

5.2.4 Using MegaRAID Storage Manager to Verify Dual-Ported SAS Addresses to Disk Drives

When the Syncro CS system is running, you can use MegaRAID Storage Manager to verify the dual SAS paths to disk drives in the HA-DAS configuration by following these steps:

1. Start MegaRAID Storage Manager, and access the Physical tab for the controller.

2. Click a drive in the left panel to view the Properties tab for the drive.

3. Look at the SAS Address fields.

As shown in the following figure, a correctly configured and running HA-DAS cluster with both nodes active displays dual SAS addresses on the drives and dual 4-lane SAS connections on the controller.

Figure 54 Redundant SAS Connections Displayed in MegaRAID Storage Manager

5.3 Understanding CacheCade Behavior During a Failover

A CacheCade VD possesses properties that are similar to a VD with exclusive host access, and it is not presented to the other host operating system. Therefore, the CacheCade volume does not cache read I/Os for VDs that are managed by the peer controller node.

Foreign import of a CacheCade VD is not permitted. To migrate a CacheCade VD from one controller node to another, you must delete it from the controller node that currently manages it and then recreate the CacheCade VD on the peer controller node.

Avago Technologies

- 74 -

Syncro CS 9286-8e Solution User Guide November 2014

5.4 Error Situations and Solutions

The following table lists some problems that you might encounter in a Syncro CS configuration, along with possible causes and solutions.

Table 2 Error Situations and Solutions

Problem Possible Cause Solution

Chapter 5: Troubleshooting

Error Situations and Solutions

A drive is reported as Unsupported, and the drive cannot be used in a drive group.

One or more of the following error messages appear after you run the Microsoft Cluster Validation tool:

 Disk bus type does not support

clustering. Disk partition style is MBR. Disk partition type is BASIC.

 No disks were found on which to

perform cluster validation tests.

When booting a controller node, the controller reports that it is entering Safe Mode. After entering Safe Mode, the controller does not report the presence of any drives or devices.

The LSI management applications do not present or report the HA options and properties.

Drives are not reported in a consistent manner.

The management application does not report a VD or disk group, but the VD or disk group is visible to the OS.

In Windows clustered environments, I/O stops on the remote client when both SAS cable connections from one controller node are severed. The clustered shared volumes appear in offline state even when both cables have been reconnected.

The drive is not a SAS drive, or it does not support SCSI-3 PR.

 Two I/O paths are not established

between the controller and drive.

 This build of the Windows

operating system does not natively support RAID controllers for clustering.

 An incompatible peer controller

parameter is detected. The peer controller is prevented from entering the HA domain.

 A peer controller is not compatible

with the controller in the HA domain. Entering Safe Mode protects the VDs by blocking access to the controller to allow for correction of the incompatibility.

The version of the management applications might not be HA-compatible.

Improper connections might impact the order in which the drives are discovered.

The shared VD is managed by the peer controller.

Behavior outlined in Microsoft Knowledge Base article 2842111 might be encountered.

Be sure you are using SAS drives that are included on the list of compatible SAS drives on the LSI web site, or ask your drive vendor.

 Confirm that device ports and all cabling connections

between the controller and drive are correct and are functioning properly. See Section 5.2, Confirming SAS

Connections.

 Confirm that the version (or the current settings) of the

operating system supports clustered RAID controllers.

 The peer controller might have settings that do not

match the controller. To correct this situation, update the firmware for the peer controller and the other controller, or both, to ensure that they are at the same firmware version.

 The peer controller hardware does not exactly match

the controller. To correct this situation, replace the peer controller with a unit that matches the controller hardware.

Obtain an HA-compatible version of the management application from the LSI web site, or contact an LSI support representative.

Make sure you are following the cabling configuration guidelines listed in Section 2.2, Cabling Configurations.

The VD or drive group can be seen and managed on the other controller node. Log in or to open a terminal on the other controller node.

1. Reconnect the severed SAS cables.

2. Open Failover Cluster Manager > Storage > Disk > Cluster Disk to check the status of the cluster.

3. If the disks are online, you can restart your client application.

If the disks are not online, right click on Disks > Refresh and bring them online manually. If the disks do not go online through manual methods, reboot the server node.

4. Restart the Server Role associated with the disks.

5. Apply hotfix for Microsoft Knowledge Base article 2842111 to both server nodes.

Avago Technologies

- 75 -

Syncro CS 9286-8e Solution User Guide November 2014

5.5 Event Messages and Error Messages

Each message that appears in the MegaRAID Storage Manager event log has an error level that indicates the severity of the event, as listed in the following table.

Table 3 Event Error Levels

Error Level Meaning

Information Informational message. No user action is necessary.

Warning Some component might be close to a failure point.

Critical A component has failed, but the system has not lost data.

Fatal A component has failed, and data loss has occurred or will occur.

5.5.1 Error Level Meaning

The following table lists MegaRAID Storage Manager event messages that might appear in the MegaRAID Storage Manager event log when the Syncro CS system is running.

Table 4 HA-DAS MegaRAID Storage Manager Events and Messages

Chapter 5: Troubleshooting

Event Messages and Error Messages

Number

0x01cc Information Peer controller entered HA

0x01cd Information Peer controller exited HA

0x01ce Information Peer controller now manages

0x01cf Information Controller ID: <Controller

0x01d0 Information Peer controller now manages

0x01d1 Information Controller ID: <Controller

0x01d2 Critical Target ID conflict detected.

0x01d3 Information Shared access set for VD: <VD

0x01d4 Information Exclusive access set for VD:

Severity

Level

Event Text Cause Resolution

Domain

PD: <PD identifier>

identifier> now manages PD: <PD identifier>

VD: <VD identifier>

identifier> now manages VD: <VD identifier>

VD: <VD identifier> access is restricted from Peer controller

identifier>

A compatible peer controller entered the HA domain.

A peer controller is not detected or has left the HA domain.

A PD is now managed by the peer controller.

A PD is now managed by the controller.

A VD is now managed by the peer controller.

A VD is now managed by the controller.

Multiple VD target IDs are in conflict due to scenarios that might occur when the HA domain has a missing cross-link that establishes direct controller-to-controller communication (called split-brain condition).

A VD access policy is set to Shared. None - Informational

A VD access policy is set to Exclusive. None - Informational

None - Informational

Planned conditions such as a system restart due to scheduled node maintenance are normal. Unplanned conditions must be further investigated to resolve.

None - Informational

The Peer controller cannot access VDs with conflicting IDs. To resolve, re-establish the controller-to-controller communication path to both controllers and perform a reset of one system.

Avago Technologies

- 76 -

Syncro CS 9286-8e Solution User Guide November 2014

Table 4 HA-DAS MegaRAID Storage Manager Events and Messages (Continued)

Chapter 5: Troubleshooting

Event Messages and Error Messages

Number

0x01d5 Warning VD: <VD identifier> is

Severity

Level

Event Text Cause Resolution

incompatible in the HA

The controller or peer controller does not support the VD type.

domain

0x01d6 Warning Peer controller settings are

incompatible

An incompatible peer controller parameter is detected. The peer controller is rejected from entering the HA domain.

0x01d7 Warning Peer Controller hardware is

incompatible with HA Domain ID: <Domain identifier>

An incompatible peer controller is detected. The peer controller is rejected from entering the HA domain.

0x01d8 Warning Controller property mismatch

detected with Peer controller

A mismatch exists between the controller properties and the peer controller properties.

0x01d9 Warning FW version does not match

Peer controller

A mismatch exists between the controller and peer controller firmware versions.

0x01da Warning Advanced Software Option(s)

<option names> mismatch detected with Peer controller

A mismatch exists between the controller and peer controller advanced software options.

0x01db Information Cache mirroring is online Cache mirroring is established

between the controller and the peer controller. VDs with write-back cache enabled are transitioned from write-through mode to write-back mode.

Attempts to create a VD that is not supported by the peer controller result in a creation failure. To resolve, create a VD that aligns with the peer controller VD support level.

Attempts to introduce an unsupported VD that is managed by the peer controller result in rejection of the VD by the controller. To resolve, convert the unsupported VD to one that is supported by both controllers, or migrate the data to a VD that is supported by both controllers.

The peer controller might have settings that do not match the controller. These settings can be corrected by a firmware update. To resolve, update the firmware for the peer controller and/or controller to ensure that they are at the same version.

The peer controller hardware does not exactly match the controller. To resolve, replace the peer controller with a unit that matches the controller hardware

Controller properties do not match between the controller and peer controller. To resolve, set the mismatched controller property to a common value.

This condition can occur when an HA controller is introduced to the HA domain during a controller firmware update. To resolve, upgrade or downgrade the controller or peer controller firmware to the same version.

This case does not result in an incompatibility that can affect HA functionality, but it can impact the effectiveness of the advanced software options. To resolve, enable an identical level of advanced software options on both controllers.

None - Informational

Avago Technologies

- 77 -

Syncro CS 9286-8e Solution User Guide November 2014

Table 4 HA-DAS MegaRAID Storage Manager Events and Messages (Continued)

Chapter 5: Troubleshooting

Event Messages and Error Messages

Number

Severity

Level

Event Text Cause Resolution

0x01dc Warning Cache mirroring is offline Cache mirroring is not active between

the controller and the peer controller. VDs with write-back cache enabled are transitioned from write-back mode to write-through mode.

0x01dd Critical Cached data from peer

controller is unavailable. VD: <VD identifier> access policy is set to Blocked.

The peer controller has cached data for the affected VDs, but is not present in the HA domain. The VD access policy is set to Blocked until the peer controller can flush the cache data to the VD.

0x01e9 Critical Direct communication with

peer controller(s) was not established. Please check proper cable connections.

The peer controller might be passively detected, but direct controller-to-controller communication could not be established due to a split brain condition.

A split brain condition occurs when the two server nodes are not aware of each other’s existence but can access the same end device/drive.

This condition can occur if cache coherency is lost, such as a communication failure with the peer controller or VD in write-back mode when pending writes go offline, or with pinned cache scenarios. To resolve, reestablish proper cabling and hardware connections to the peer controller or disposition a controller's pinned cache.

This condition can occur when cache coherency is lost due to failure of communication with the peer controller. To resolve, bring the peer controller online and reestablish communication paths to the peer controller. If the peer controller is unrecoverable, restore data from a backup or manually set the access policy (data is unrecoverable).

Attempts to create a VD that is not supported by the peer controller result in a creation failure. To resolve, create a VD that aligns with the peer controller VD support level.

The following table shows HA-DAS boot events and messages.

Table 5 HA-DAS Boot Events and Messages

Boot Event Text Generic Conditions When Each Event Occurs Actions to Resolve

Peer controller firmware is not HA compatible. Please resolve firmware version/settings incompatibility or press 'C' to continue in Safe Mode (all drives will be hidden from this controller).

Peer controller hardware is not HA compatible. Please replace peer controller with compatible unit or press 'C' to continue in Safe Mode (all drives will be hidden from this controller).

Direct communication with peer controller(s) was not established. Please check proper cable connections.

An incompatible peer controller parameter is detected. The peer controller is rejected from entering the HA domain.

A peer controller is not compatible with the controller in the HA domain. Entering Safe Mode protects the VDs by blocking access to the controller to allow the incompatibility to be corrected.

The peer controller can be passively detected, but direct controller-to-controller communication could not be established due to split-brain conditions caused by a missing cross link.

The peer controller might have settings that do not match the controller. These settings might be corrected by a firmware update. To resolve, update the firmware for the peer controller and/or controller to ensure that they are at the same version.

The peer controller hardware does not exactly match the controller. To resolve, replace the peer controller with a unit that matches the controller hardware

A cross link to establish direct peer controller communication is not present. To resolve, check all SAS links in the topology for proper routing and connectivity.

Avago Technologies

- 78 -

Syncro CS 9286-8e Solution User Guide November 2014

Revision History

Version 3.0, November 2014

Revised instructions for creating clusters in the Windows® and Linux® sections.

Version 2.0, October 2013

Added sections for Linux® operating system support.

Version 1.0, March 2013

Initial release of this document.

Version 3.0, November 2014

Revision History

Avago Technologies Confidential

- 79 -

Avago Technologies Syncro CS 9286-8e User Manual

Specifications and Main Features

Frequently Asked Questions

User Manual

Chapter 1: Introduction

1.1 Concepts of High-Availability DAS

1.2 HA-DAS Terminology

1.3 Syncro CS 9286-8e Solution Features

1.4 Hardware Compatibility

1.5 Overview of the Cluster Setup and Configuration

1.6 Performance Considerations

Chapter 2: Hardware and Software Setup

2.1 Syncro CS 9286-8e Two-Server Cluster Hardware Setup

2.2 Cabling Configurations

Chapter 3: Creating the Cluster

3.1 Creating Virtual Drives on the Controller Nodes

3.1.1 Creating Shared or Exclusive VDs with the WebBIOS Utility

3.1.2 Creating Shared or Exclusive VDs with StorCLI

3.1.3 Creating Shared or Exclusive VDs with MegaRAID Storage Manager

3.1.3.1 Unsupported Drives

3.2 HA-DAS CacheCade Support

3.3 Creating the Cluster in Windows

3.3.1 Prerequisites for Cluster Setup

3.3.1.1 Clustered RAID Controller Support

3.3.1.2 Enable Failover Clustering

3.3.1.3 Configure Network Settings

3.3.2 Creating the Failover Cluster

3.3.3 Validating the Failover Cluster Configuration

3.4 Creating the Cluster in Red Hat Enterprise Linux (RHEL)

3.4.1 Prerequisites for Cluster Setup

3.4.1.1 Configure Network Settings

3.4.1.2 Install and Configure the High Availability Add-On Features

3.4.1.3 Configure SELinux

3.4.2 Creating the Cluster

3.4.3 Configure the Logical Volumes and Apply GFS2 File System

3.4.4 Add a Fence Device

3.4.5 Create a Failover Domain

3.4.6 Add Resources to the Cluster

3.4.7 Create Service Groups

3.4.8 Mount the NFS Resource from the Remote Client

3.5 Creating the Cluster in SuSE Linux Enterprise Server (SLES)

3.5.1 Prerequisites for Cluster Setup

3.5.1.1 Prepare the Operating System

3.5.1.2 Configure Network Settings

3.5.1.3 Connect to the NTP Server for Time Synchronization

3.5.2 Creating the Cluster

3.5.2.1 Cluster Setup

3.5.3 Bringing the Cluster Online

3.5.4 Configuring the NFS Resource with STONITH SBD Fencing

3.5.4.1 Install NFSSERVER

3.5.4.2 Configure the Partition and the File System

3.5.4.3 Configure stonith_sbd Fencing

3.5.5 Adding NFS Cluster Resources

3.5.6 Mounting NFS in the Remote Client

Chapter 4: System Administration

4.1 High Availability Properties

4.2 Understanding Failover Operations

4.2.1 Understanding and Using Planned Failover

4.2.1.1 Planned Failover in Windows Server 2012

4.2.1.2 Planned Failover in Windows Server 2008 R2

4.2.1.3 Planned Failover in Red Hat Enterprise Linux

4.2.2 Understanding Unplanned Failover

4.3 Updating the Syncro CS 9286-8e Controller Firmware

4.4 Updating the MegaRAID Driver

4.4.1 Updating the MegaRAID Driver in Windows Server 2008 R2

4.4.2 Updating the MegaRAID Driver in Windows Server 2012

4.4.3 Updating the Red Hat Linux System Driver

4.4.4 Updating the SuSE Linux Enterprise Server 11 Driver

4.5 Performing Preventive Measures on Disk Drives and VDs

Chapter 5: Troubleshooting

5.1 Verifying HA-DAS Support in Tools and the OS Driver

5.2 Confirming SAS Connections

5.2.1 Using WebBIOS to View Connections for Controllers, Expanders, and Drives

5.2.2 Using WebBIOS to Verify Dual-Ported SAS Addresses to Disk Drives

5.2.3 Using StorCLI to Verify Dual-Ported SAS Addresses to Disk Drives

5.2.4 Using MegaRAID Storage Manager to Verify Dual-Ported SAS Addresses to Disk Drives

5.3 Understanding CacheCade Behavior During a Failover

5.4 Error Situations and Solutions

5.5 Event Messages and Error Messages

5.5.1 Error Level Meaning