HP P700m User Manual


HP Smart Array P700m Controller for HP ProLiant Servers User Guide

Part Number 456549-001 October 2007 (First Edition)


© Copyright 2007 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Bluetooth is a trademark owned by its proprietor and used by Hewlett-Packard Company under license.

Audience assumptions

This document is for the person who installs, administers, and troubleshoots servers and storage systems. HP assumes you are qualified in the servicing of computer equipment and trained in recognizing hazards in products with hazardous energy levels.


Contents

Hardware features ... 5
    Main components on the controller board ... 5
    Controller specifications ... 5
Overview of the installation procedure ... 7
    Installing the controller in an unconfigured server blade ... 7
    Installing the controller in a previously-configured server ... 7
Installing the controller hardware ... 9
    Preparing the server blade ... 9
    Installing the controller board ... 9
Updating the firmware ... 10
    Methods for updating the firmware ... 10
Configuring an array ... 11
    Utilities available for configuring an array ... 11
Setting the boot controller and controller order ... 12
    Setting a controller as the boot controller ... 12
    Setting the controller order ... 12
Installing device drivers and Management Agents ... 14
    Installing device drivers ... 14
    Installing Management Agents ... 14
Upgrading or replacing controller options ... 15
    Replacing the battery ... 15
Replacing, moving, or adding hard drives ... 18
    Identifying the status of a hard drive ... 18
    Recognizing hard drive failure ... 19
        Effects of a hard drive failure ... 20
        Compromised fault tolerance ... 20
        Recovering from compromised fault tolerance ... 20
    Replacing hard drives ... 21
        Factors to consider before replacing hard drives ... 21
        Automatic data recovery (rebuild) ... 22
        Upgrading hard drive capacity ... 24
    Moving drives and arrays ... 24
    Adding drives ... 25
Diagnosing array problems ... 27
    Controller board runtime LEDs ... 27
    Battery pack LEDs ... 28
    Diagnostic tools ... 29
Electrostatic discharge ... 31
    Preventing electrostatic discharge ... 31
    Grounding methods to prevent electrostatic discharge ... 31
Regulatory compliance notices ... 32
    European Union regulatory notice ... 32
    BSMI notice ... 32
    Korean class A notice ... 33
    Battery replacement notice ... 33
    Taiwan battery recycling notice ... 33
Acronyms and abbreviations ... 34
Index ... 35


Hardware features

Main components on the controller board

Item ID  Description
1        Status LEDs
2        Connector (not used on HP ProLiant servers)
3        Cache module (also known as BBWC or array accelerator)
4        Connector for cache battery
5        Mezzanine connector to system board

Controller specifications

Feature                             Details
Board type                          Type I 4-port PCIe mezzanine board
Dimensions                          11.3 cm × 10.0 cm × 2.0 cm (4.5 in × 4.0 in × 0.8 in)
Type of drives supported            3 Gb/s SAS or 1.5 Gb/s SATA
Maximum power required              Approximately 9.3 W
Temperature range                   Operating, 10° to 55°C (50° to 131°F)
                                    Storage, -30° to 60°C (-22° to 140°F)
Relative humidity (noncondensing)   Operating, 10% to 90%
                                    Storage, 5% to 90%
RAID levels supported               0, 1, 1+0, and 5; if the battery is used, RAID 6 is also supported
Type of connector                   Grid array mezzanine connector
Transfer rate                       Up to 2 GB/s in each direction
Maximum number of physical drives   108
Maximum number of logical drives    32
Maximum size of a logical drive     More than 2 TB
Cache size                          72 bits, 512 MB (64 MB is used by the onboard processor)
Spare battery part number           453779-001
Time required to recharge battery   From 15 minutes to 2 hours, depending on the initial battery charge level
Duration of battery backup          If the batteries are fully charged and less than 3 years old, more than 2 days
Battery life expectancy             More than 3 years

For more information about the controller features and specifications, and for information about system requirements, refer to the HP website (http://www.hp.com/products/smartarray).


Overview of the installation procedure

Installing the controller in an unconfigured server blade

New HP ProLiant server models autoconfigure when they are powered up for the first time. For more information about the autoconfiguration process, see the server-specific setup and installation guide or the HP ROM-Based Setup Utility User Guide. These guides are available on the server Documentation CD.

IMPORTANT: Do not power up the server until the hardware configuration is satisfactory, as described in the procedure given in this section.

To install the controller in an unconfigured server blade:

1. Install the controller hardware in the server blade.

2. Install the server blade in a blade enclosure.

3. Install an HP 3Gb SAS BL-c Pass-Thru Module in the blade enclosure.

4. Connect the server blade to the pass-thru module.

5. Connect the pass-thru module to a drive enclosure.

6. If necessary, install physical drives in the drive enclosure.

The number of drives connected to the pass-thru module determines the RAID level that is autoconfigured when the server is powered up. For details, see the server-specific setup and installation guide or the HP ROM-Based Setup Utility User Guide.

7. Power up the server. The autoconfiguration process runs.

8. Update the server firmware ("Methods for updating the firmware" on page 10).

9. Update the controller firmware ("Methods for updating the firmware" on page 10).

10. Install the operating system and device drivers ("Installing device drivers" on page 14). Instructions are provided with the CD that is supplied in the controller kit.

11. (Optional) Create additional logical drives ("Configuring an array" on page 11).

The server is now ready to use.

Installing the controller in a previously-configured server

1. Back up data on the system.

2. Update the server blade firmware ("Methods for updating the firmware" on page 10).

3. If the new controller will be the boot device, install the device drivers ("Installing device drivers" on page 14). Otherwise, go directly to step 4.

4. Power down the server blade.

5. Remove the server blade from the enclosure.

6. Remove the access panel from the server blade.

7. Install the controller hardware ("Installing the controller hardware" on page 9).

8. Reinstall the access panel.

9. Reinstall the server blade in the enclosure.

10. Install an HP 3Gb SAS BL-c Pass-Thru Module in the enclosure.

11. Connect the server blade to the pass-thru module.

12. Connect the pass-thru module to a drive enclosure.

13. Power up the server blade.

14. Update the controller firmware ("Methods for updating the firmware" on page 10).

15. (Optional) Set this controller as the boot controller using ORCA ("Setting a controller as the boot controller" on page 12).

16. (Optional) Change the controller boot order using RBSU ("Setting the controller order" on page 12).

17. If the controller will not be the boot device, install the device drivers ("Installing device drivers" on page 14).

18. If new versions of the Management Agents are available, update the Management Agents ("Installing Management Agents" on page 14).

Installing the controller hardware

Preparing the server blade

1. Back up all data.

2. Close all applications.

3. Power down the server blade.

CAUTION: In systems that use external data storage, be sure that the server is the first unit to be powered down and the last to be powered back up. Taking this precaution ensures that the system does not erroneously mark the drives as failed when the server is powered up.

4. Remove the server blade from the enclosure.

Installing the controller board

WARNING: To reduce the risk of personal injury or damage to the equipment, consult the safety information and user documentation provided with the server before attempting the installation.

Many servers are capable of providing energy levels that are considered hazardous and are intended to be serviced only by qualified personnel who have been trained to deal with these hazards. Do not remove enclosures or attempt to bypass any interlocks that may be provided for the purpose of removing these hazardous conditions.

1. Remove the access panel from the server blade.

2. Select an available mezzanine socket on the system board.

3. Remove the socket cover, and then save it for future use.

4. Plug the controller into the socket.

5. Tighten the three spring-loaded captive screws at the corners of the controller.

WARNING: To reduce the risk of personal injury from hot surfaces, allow the drives and the internal system components to cool before touching them.

6. Reinstall the access panel.

CAUTION: Do not operate the server for long periods with the access panel open or removed. Operating the server in this manner results in improper airflow and improper cooling that can lead to thermal damage.

7. Reinstall the server blade in the enclosure.

8. Install an HP 3Gb SAS BL-c Pass-Thru Module in the enclosure.

9. Connect the server blade to the pass-thru module.

10. Connect the pass-thru module to a drive enclosure.


Updating the firmware

Methods for updating the firmware

To update the firmware on the server, controller, or hard drives, use Smart Components. These components are available on the Firmware Maintenance CD. A more recent version of a particular server or controller component might be available on the support page of the HP website (http://www.hp.com/support). Components for controller and hard drive firmware updates are also available from the software and drivers page for storage products (http://www.hp.com/support/proliantstorage).

1. Find the most recent version of the component that you require. Components for controller firmware updates are available in offline and online formats.

2. Follow the instructions for installing the component on the server. These instructions are given with the CD and are provided on the same Web page as the component.

3. Follow the additional instructions that describe how to use the component to flash the ROM. These instructions are provided with each component.

For more information about updating the firmware, refer to the HP ProLiant Storage Firmware Maintenance User Guide (for controller and hard drive firmware) or the HP Online ROM Flash User Guide (for server firmware).


Configuring an array

Utilities available for configuring an array

Two utilities are available for configuring an array on this controller:

• ORCA is a simple utility that is used mainly to configure the first logical drive in a new server before the operating system is loaded.

• ACU is an advanced utility that enables you to perform many complex configuration tasks.

For more information about the features of these utilities and for instructions for using the utilities, see the Configuring Arrays on HP Smart Array Controllers Reference Guide. This guide is available on the Documentation CD that is provided in the controller kit.

Whichever utility you use, remember the following factors when you build an array:

• All drives in an array must be of the same type (for example, all SAS or all SATA).

• For the most efficient use of drive space, all drives within an array should have approximately the same capacity. Each configuration utility treats every physical drive in an array as if it had the same capacity as the smallest drive in the array. Any excess capacity of a particular drive cannot be used in the array, and is unavailable for data storage.

• The more physical drives that there are in an array, the greater the probability that the array will experience a drive failure during any given period. To guard against the data loss that occurs when a drive fails, configure all logical drives in an array with a suitable fault-tolerance (RAID) method.
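The capacity rule above can be illustrated with a short calculation. The following Python sketch is not part of any HP utility; the function name and the drive sizes are hypothetical, and the figures are raw capacity before any RAID overhead is subtracted.

```python
def counted_capacity(drive_sizes_gb):
    """Apply the smallest-drive rule to a list of physical drive sizes.

    Returns (per_drive_gb, counted_gb, unused_gb), where every drive is
    treated as if it were the size of the smallest drive in the array.
    """
    if not drive_sizes_gb:
        raise ValueError("an array needs at least one physical drive")
    per_drive = min(drive_sizes_gb)                # capacity counted for each drive
    counted = per_drive * len(drive_sizes_gb)      # raw capacity the array can use
    unused = sum(drive_sizes_gb) - counted         # excess that cannot be used
    return per_drive, counted, unused


# Hypothetical example: two 146 GB drives and one 300 GB drive.
per_drive, counted, unused = counted_capacity([146, 146, 300])
print(per_drive, counted, unused)  # 146 438 154
```

In this example, 154 GB of the 300 GB drive is unavailable for data storage, which is why drives of approximately equal capacity make the most efficient use of space.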


Setting the boot controller and controller order

Setting a controller as the boot controller

The following procedure only sets a controller as the boot controller. If you also want to adjust the boot order settings of other controllers in the system, use RBSU instead ("Setting the controller order" on page 12).

1. Confirm that the controller is connected to a logical drive. (If it is not, it cannot be set as the boot controller.)

2. Perform a normal system shutdown.

3. Restart the server.

POST runs, and all controllers in the server are initialized one at a time in the current boot order sequence. If a controller is connected to one or more hard drives, an ORCA prompt message appears during the initialization process for that controller.

As soon as you see the ORCA prompt for the controller that you want to set as the boot controller, continue with the next step.

4. Press the F8 key.

The ORCA main menu appears. If the controller is configured with a logical drive, one of the menu options is to set the controller as the boot controller.

5. Select the appropriate menu option, and follow any subsequent on-screen instructions. If prompted to save the settings, do so.

6. If you want to configure or reconfigure an array on this controller, you can do this while you are still in ORCA. (For more information, see the Configuring Arrays on HP Smart Array Controllers Reference Guide. This guide is available on the Documentation CD that is provided in the controller kit.) If you do not want to configure an array at this time or if you intend to use a different utility to configure the array, exit from ORCA, and then restart the server for the new boot controller setting to take effect.

Setting the controller order

1. Power up the server.

The server runs the POST sequence and briefly displays an RBSU prompt.

2. At the prompt, press the F9 key to start RBSU.

3. Follow the on-screen instructions to set the boot order for the different controllers in the system.

4. Save the settings.

5. Exit from the utility.


For more information about using RBSU, refer to the HP ROM-Based Setup Utility User Guide or the server setup and installation guide. These documents are both available on the Documentation CD supplied in the server kit.


Installing device drivers and Management Agents

Installing device drivers

The drivers for the controller are located on the Support Software CD or the SmartStart CD that is provided in the controller kit. Updates are posted to the HP website (http://www.hp.com/support).

Using the Support Software CD: Instructions for installing the drivers from the Support Software CD are given in the leaflet that is supplied with the CD.

Using the SmartStart CD: If you use the Assisted Installation path feature of SmartStart to install the operating system on a new server, the drivers are automatically installed at the same time.

You can also use SmartStart to update the drivers manually on systems that are already configured. For more information, refer to the SmartStart documentation.

Installing Management Agents

If you use the Assisted Installation path feature of SmartStart to install the operating system on a new server, the Management Agents are automatically installed at the same time.

You can update the Management Agents by using the latest versions of the agents from the HP website (http://www.hp.com/servers/manage). The procedure for updating the agents is provided on the same Web page.

If the new agents do not function correctly, you might also need to update Systems Insight Manager. The latest version of Systems Insight Manager is available for download at the HP website (http://www.hp.com/servers/manage).


Upgrading or replacing controller options

Replacing the battery

CAUTION: Electrostatic discharge can damage electronic components. Be sure you are properly grounded before beginning this procedure.

The method for replacing a battery depends on whether the battery case is mounted on the inner wall of the server chassis by a hook-and-loop strip or located in a hard drive slot.

If the battery case is mounted on the inner wall of the server chassis:

1. Back up all data.

2. Close all applications.

3. Power down the server.

4. Remove the server from the enclosure.

5. Remove the server access panel.

6. Remove the battery case from the chassis wall.

7. Remove the battery from the battery case.

8. Install the replacement battery in the battery case.

9. Mount the battery case on the chassis wall.

10. Close the server access panel.

11. Reinstall the server in the enclosure.

NOTE: After installing a battery pack, you might see a POST message during reboot indicating that the array accelerator (cache) is temporarily disabled. This is normal, because the new battery pack is likely to have a low charge. You do not need to take any action, because the recharge process begins automatically when the battery pack is installed. The controller will operate properly while the battery pack recharges, although the performance advantage of the array accelerator will be absent. When the battery pack has been charged to a satisfactory level, the array accelerator will automatically be enabled.

If the battery case is located in a hard drive slot:

1. Back up all data.

2. Close all applications.

3. Power down the server.

4. Remove the server from the enclosure.

5. Remove the server access panel.

6. Remove the battery case from the hard drive slot.

7. Invert the battery case.


8. Pull the right-hand portion of the battery case away from the battery pack and simultaneously rotate the battery out of the opening.

9. Position the replacement battery pack in the opening in the battery case as shown. The upper left edge of the battery is under the flanges on the pillars at the left edge of the opening, and the right side of the battery rests on the right pillars.

10. Pull the right-hand portion of the battery case away from the battery, and simultaneously rotate the battery pack into the opening.

11. Connect the battery cable to the battery and the cache. Route the battery cable so that the cache and battery can be removed together. (If you need to remove the cache to transfer data, the battery must remain connected to it so that the data is preserved.)

12. Insert the battery case into the hard drive slot.

13. Close the server access panel.

14. Reinstall the server in the enclosure.

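NOTE: After installing a battery pack, you might see a POST message during reboot indicating that the array accelerator (cache) is temporarily disabled. This is normal, because the new battery pack is likely to have a low charge. You do not need to take any action, because the recharge process begins automatically when the battery pack is installed. The controller will operate properly while the battery pack recharges, although the performance advantage of the array accelerator will be absent. When the battery pack has been charged to a satisfactory level, the array accelerator will automatically be enabled.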


Replacing, moving, or adding hard drives

Identifying the status of a hard drive

When a drive is configured as a part of an array and connected to a powered-up controller, the condition of the drive can be determined from the illumination pattern of the hard drive status lights (LEDs).

Item  Description
1     Fault/UID LED (amber/blue)
2     Online LED (green)

Online/activity LED (green)  Fault/UID LED (amber/blue)        Interpretation
On, off, or flashing         Alternating amber and blue        The drive has failed, or a predictive failure alert has been
                                                               received for this drive; it also has been selected by a
                                                               management application.
On, off, or flashing         Steadily blue                     The drive is operating normally, and it has been selected by
                                                               a management application.
On                           Amber, flashing regularly (1 Hz)  A predictive failure alert has been received for this drive.
                                                               Replace the drive as soon as possible.
On                           Off                               The drive is online, but it is not active currently.
Flashing regularly (1 Hz)    Amber, flashing regularly (1 Hz)  Do not remove the drive. Removing a drive may terminate the
                                                               current operation and cause data loss. The drive is part of
                                                               an array that is undergoing capacity expansion or stripe
                                                               migration, but a predictive failure alert has been received
                                                               for this drive. To minimize the risk of data loss, do not
                                                               replace the drive until the expansion or migration is
                                                               complete.
Flashing regularly (1 Hz)    Off                               Do not remove the drive. Removing a drive may terminate the
                                                               current operation and cause data loss. The drive is
                                                               rebuilding, or it is part of an array that is undergoing
                                                               capacity expansion or stripe migration.
Flashing irregularly         Amber, flashing regularly (1 Hz)  The drive is active, but a predictive failure alert has been
                                                               received for this drive. Replace the drive as soon as
                                                               possible.
Flashing irregularly         Off                               The drive is active, and it is operating normally.
Off                          Steadily amber                    A critical fault condition has been identified for this
                                                               drive, and the controller has placed it offline. Replace the
                                                               drive as soon as possible.
Off                          Amber, flashing regularly (1 Hz)  A predictive failure alert has been received for this drive.
                                                               Replace the drive as soon as possible.
Off                          Off                               The drive is offline, a spare, or not configured as part of
                                                               an array.
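The LED combinations in this section can also be expressed as a lookup table. The following Python sketch is only an illustration: the state strings, the ANY wildcard, and the function name are hypothetical, and the interpretations are shortened paraphrases of the table entries rather than output from any HP tool.

```python
# Wildcard for rows where the green LED may be on, off, or flashing.
ANY = "any"

# Keys are (online/activity LED state, fault/UID LED state).
LED_TABLE = {
    (ANY, "alternating amber and blue"):
        "Failed or predictive failure alert; selected by a management application.",
    (ANY, "steadily blue"):
        "Operating normally; selected by a management application.",
    ("on", "amber flashing"):
        "Predictive failure alert. Replace the drive as soon as possible.",
    ("on", "off"):
        "Online, but not active currently.",
    ("flashing regularly", "amber flashing"):
        "Do not remove. Expansion or migration in progress, with a predictive failure alert.",
    ("flashing regularly", "off"):
        "Do not remove. Rebuilding, or expansion or migration in progress.",
    ("flashing irregularly", "amber flashing"):
        "Active, but predictive failure alert. Replace the drive as soon as possible.",
    ("flashing irregularly", "off"):
        "Active and operating normally.",
    ("off", "steadily amber"):
        "Critical fault; placed offline by the controller. Replace the drive as soon as possible.",
    ("off", "amber flashing"):
        "Predictive failure alert. Replace the drive as soon as possible.",
    ("off", "off"):
        "Offline, a spare, or not configured as part of an array.",
}


def interpret_leds(online, fault):
    """Return the interpretation for an LED pair, or None if it is not listed."""
    # Rows that apply to any green LED state are checked first.
    return LED_TABLE.get((ANY, fault)) or LED_TABLE.get((online, fault))


print(interpret_leds("flashing regularly", "off"))
```

A combination that does not appear in the table, such as a steadily lit green LED with a steadily amber fault LED, returns None in this sketch.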

Recognizing hard drive failure

A steadily glowing Fault LED indicates that the drive has failed. Other means by which hard drive failure is revealed include:

• The amber LED on the front of a storage system illuminates if failed drives are inside. This LED also illuminates when other problems occur, such as when a fan fails, a redundant power supply fails, or the system overheats.

• A POST message lists failed drives when the system is restarted, as long as the controller detects at least one functional drive.

• ACU represents failed drives with a distinctive icon.

• HP Systems Insight Manager can detect failed drives remotely across a network. (For more information about HP Systems Insight Manager, see the documentation on the Management CD.)

• The Event Notification Service posts an event to the Microsoft® Windows® system event log and the IML.

• ADU lists all failed drives.

For additional information about diagnosing hard drive problems, see the HP Servers Troubleshooting Guide.


CAUTION: Sometimes, a drive that has previously been failed by the controller may seem to be operational after the system is power-cycled or (for a hot-pluggable drive) after the drive has been removed and reinserted. However, continued use of such marginal drives may eventually result in data loss. Replace the marginal drive as soon as possible.

Effects of a hard drive failure

When a hard drive fails, all logical drives that are in the same array are affected. Each logical drive in an array might be using a different fault-tolerance method, so each logical drive can be affected differently.

• RAID 0 configurations cannot tolerate drive failure. If any physical drive in the array fails, all non-fault-tolerant (RAID 0) logical drives in the same array will also fail.

• RAID 1+0 configurations can tolerate multiple drive failures as long as no failed drives are mirrored to one another.

• RAID 5 configurations can tolerate one drive failure.

• RAID 6 (ADG) configurations can tolerate the simultaneous failure of two drives.
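The tolerance rules listed above can be sketched as a simple check. The following Python example is illustrative only and does not reflect controller firmware logic; the function name, the drive identifiers, and the way mirrored pairs are passed in are assumptions made for the example.

```python
def logical_drive_survives(raid_level, failed, mirror_pairs=None):
    """Return True if a logical drive tolerates the given set of failed drives.

    raid_level   -- "0", "1+0", "5", or "6"
    failed       -- set of identifiers of failed physical drives
    mirror_pairs -- for RAID 1+0 only: list of (drive_a, drive_b) mirrored pairs
    """
    count = len(failed)
    if raid_level == "0":
        return count == 0          # no fault tolerance
    if raid_level == "5":
        return count <= 1          # one drive failure
    if raid_level == "6":
        return count <= 2          # two simultaneous drive failures
    if raid_level == "1+0":
        if mirror_pairs is None:
            raise ValueError("RAID 1+0 needs the list of mirrored pairs")
        # Survives as long as no two failed drives are mirrored to one another.
        return not any(a in failed and b in failed for a, b in mirror_pairs)
    raise ValueError(f"unsupported RAID level: {raid_level}")


pairs = [("d1", "d2"), ("d3", "d4")]
print(logical_drive_survives("1+0", {"d1", "d3"}, pairs))  # True
print(logical_drive_survives("1+0", {"d1", "d2"}, pairs))  # False
```

Because each logical drive in an array can use a different fault-tolerance method, the same set of failed drives would be checked once per logical drive.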

Compromised fault tolerance

If more hard drives fail than the fault-tolerance method allows, fault tolerance is compromised, and the logical drive fails. In this case, all requests from the operating system are rejected with unrecoverable errors. You are likely to lose data, although it can sometimes be recovered (refer to "Recovering from compromised fault tolerance" on page 20).

One example of a situation in which compromised fault tolerance may occur is when a drive in an array fails while another drive in the array is being rebuilt. If the array has no online spare, any logical drives in this array that are configured with RAID 5 fault tolerance will fail.

Compromised fault tolerance can also be caused by non-drive problems, such as a faulty cable or temporary power loss to a storage system. In such cases, you do not need to replace the physical drives. However, you may still have lost data, especially if the system was busy at the time that the problem occurred.

Recovering from compromised fault tolerance

If fault tolerance is compromised, inserting replacement drives does not improve the condition of the logical volume. Instead, if the screen displays unrecoverable error messages, perform the following procedure to recover data:

1. Power down the entire system, and then power it back up. In some cases, a marginal drive will work again for long enough to enable you to make copies of important files.

   If a 1779 POST message is displayed, press the F2 key to re-enable the logical volumes. Remember that data loss has probably occurred and any data on the logical volume is suspect.

2. Make copies of important data, if possible.

3. Replace any failed drives.

4. After you have replaced the failed drives, fault tolerance may again be compromised. If so, cycle the power again. If the 1779 POST message is displayed:

   a. Press the F2 key to re-enable the logical drives.

   b. Recreate the partitions.

   c. Restore all data from backup.

To minimize the risk of data loss that is caused by compromised fault tolerance, make frequent backups of all logical volumes.

Replacing hard drives

The most common reason for replacing a hard drive is that it has failed. However, another reason is to gradually increase the storage capacity of the entire system.

If you insert a hot-pluggable drive into a drive bay while the system power is on, all disk activity in the array pauses for a second or two while the new drive is spinning up. When the drive has achieved its normal spin rate, data recovery to the replacement drive begins automatically (as indicated by the blinking Online/Activity LED on the replacement drive) if the array is in a fault-tolerant configuration.

If you replace a drive belonging to a fault-tolerant configuration while the system power is off, a POST message appears when the system is next powered up. This message prompts you to press the F1 key to start automatic data recovery. If you do not enable automatic data recovery, the logical volume remains in a ready-to-recover condition and the same POST message appears whenever the system is restarted.

Factors to consider before replacing hard drives

Before replacing a degraded drive:

• Open Systems Insight Manager, and inspect the Error Counter window for each physical drive in the same array to confirm that no other drives have any errors. (For details, refer to the Systems Insight Manager documentation on the Management CD.)

• Be sure that the array has a current, valid backup.

• Confirm that the replacement drive is of the same type (SAS or SATA) as the degraded drive.

• Use replacement drives that have a capacity at least as great as that of the smallest drive in the array. The controller immediately fails drives that have insufficient capacity.
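The type and capacity checks in the list above can be sketched in a few lines of Python. This is only an illustration: the `Drive` record and the `replacement_problems` function are hypothetical names, not part of any HP utility, and the capacities are assumed to be expressed in gigabytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    interface: str      # "SAS" or "SATA"
    capacity_gb: float  # capacity in gigabytes (assumed unit)


def replacement_problems(replacement, degraded, array_drives):
    """Return the reasons, if any, that the replacement drive should not be used."""
    problems = []
    # The replacement must be of the same type (SAS or SATA) as the degraded drive.
    if replacement.interface != degraded.interface:
        problems.append("interface type differs from the degraded drive")
    # The replacement must be at least as large as the smallest drive in the array.
    smallest = min(d.capacity_gb for d in array_drives)
    if replacement.capacity_gb < smallest:
        problems.append("capacity is below the smallest drive in the array")
    return problems
```

An empty list means that the drive passes these two checks; it says nothing about the backup and error-counter checks, which still have to be done by hand.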

In systems that use external data storage, be sure that the server is the first unit to be powered down and the last to be powered back up. Taking this precaution ensures that the system does not erroneously mark the drives as failed when the server is powered up.

To minimize the likelihood of fatal system errors, take these precautions when removing failed drives:

• Do not remove a degraded drive if any other drive in the array is offline (the Online/Activity LED is off). In this situation, no other drive in the array can be removed without data loss. The following cases are exceptions:

o When RAID 1+0 is used, drives are mirrored in pairs. Several drives can be in a failed condition simultaneously (and they can all be replaced simultaneously) without data loss, as long as no two failed drives belong to the same mirrored pair.

o When RAID 6 (ADG) is used, two drives can fail simultaneously (and be replaced simultaneously) without data loss.

o If the offline drive is a spare, the degraded drive can be replaced.

• Do not remove a second drive from an array until the first failed or missing drive has been replaced and the rebuild process is complete. (The rebuild is complete when the Online/Activity LED on the front of the drive stops blinking.) The following cases are exceptions:

o In RAID 6 (ADG) configurations, any two drives in the array can be replaced simultaneously.

o In RAID 1+0 configurations, any drives that are not mirrored to other removed or failed drives can be simultaneously replaced offline without data loss.
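The removal rules above amount to a small decision procedure. The following Python sketch is illustrative only: `removal_is_safe` is a hypothetical helper, the drive IDs and mirror pairings are supplied by the caller, and the one-drive limit for other fault-tolerant levels (for example RAID 5) is an assumption drawn from the general rule rather than a statement in this guide. The spare-drive exception is not modeled.

```python
def removal_is_safe(raid_level, unavailable, mirror_pairs=()):
    """Return True if the array can tolerate losing the given drives.

    raid_level   -- "0", "1+0", "6", or another fault-tolerant level
    unavailable  -- IDs of drives that are failed, offline, or about to be removed
    mirror_pairs -- for RAID 1+0, an iterable of (drive_a, drive_b) pairs
    """
    unavailable = set(unavailable)
    if raid_level == "0":
        # RAID 0 has no fault tolerance, so no drive can be lost.
        return len(unavailable) == 0
    if raid_level == "1+0":
        # Safe as long as no mirrored pair loses both of its members.
        return not any(a in unavailable and b in unavailable
                       for a, b in mirror_pairs)
    if raid_level == "6":
        # RAID 6 (ADG) tolerates two simultaneous drive losses.
        return len(unavailable) <= 2
    # Assumption: other fault-tolerant levels tolerate a single loss.
    return len(unavailable) <= 1
```

For example, with mirrored pairs (0, 1) and (2, 3), losing drives 0 and 2 is tolerated, but losing drives 0 and 1 is not.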

Automatic data recovery (rebuild)

When you replace a hard drive in an array, the controller uses the fault-tolerance information on the remaining drives in the array to reconstruct the missing data (the data that was originally on the replaced drive) and write it to the replacement drive. This process is called automatic data recovery, or rebuild. If fault tolerance is compromised, this data cannot be reconstructed and is likely to be permanently lost.

If another drive in the array fails while fault tolerance is unavailable during rebuild, a fatal system error can occur, and all data on the array is then lost. In exceptional cases, however, failure of another drive need not lead to a fatal system error. These exceptions include:

• Failure after activation of a spare drive

• Failure of a drive that is not mirrored to any other failed drives (in a RAID 1+0 configuration)

• Failure of a second drive in a RAID 6 (ADG) configuration

Time required for a rebuild

The time required for a rebuild varies considerably, depending on several factors:

• The priority that the rebuild is given over normal I/O operations (you can change the priority setting by using ACU)

• The amount of I/O activity during the rebuild operation

• The rotational speed of the hard drives

• The availability of drive cache

• The brand, model, and age of the drives

• The amount of unused capacity on the drives

• For RAID 5 and RAID 6 (ADG), the number of drives in the array

Allow approximately 15 minutes per gigabyte for the rebuild process to be completed. This figure is conservative; the actual time required is usually less than this.
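As a rough planning aid, the 15-minutes-per-gigabyte figure can be turned into a conservative upper bound. The sketch below is only arithmetic on that figure; the function name is hypothetical, and it assumes that `capacity_gb` is the amount of data to be rebuilt on the replacement drive.

```python
from datetime import timedelta

MINUTES_PER_GB = 15  # conservative figure quoted in the text


def rebuild_time_upper_bound(capacity_gb, minutes_per_gb=MINUTES_PER_GB):
    """Return a conservative rebuild-time estimate as a timedelta."""
    if capacity_gb < 0:
        raise ValueError("capacity_gb must be non-negative")
    return timedelta(minutes=capacity_gb * minutes_per_gb)


# A 72 GB drive: 72 * 15 = 1080 minutes
print(rebuild_time_upper_bound(72))  # prints 18:00:00
```

Because the figure is conservative, the result is best read as a worst-case window in which the array is unprotected, not as a prediction.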

System performance is affected during the rebuild, and the system is unprotected against further drive failure until the rebuild has finished. Therefore, replace drives during periods of low activity when possible.

When automatic data recovery has finished, the Online/Activity LED of the replacement drive stops blinking steadily at 1 Hz and begins to either glow steadily (if the drive is inactive) or flash irregularly (if the drive is active).

CAUTION: If the Online/Activity LED on the replacement drive does not light up while the corresponding LEDs on other drives in the array are active, the rebuild process has abnormally terminated. The amber Fault LED of one or more drives might also be illuminated. Refer to "Abnormal termination of a rebuild" to determine what action you must take.

Abnormal termination of a rebuild

If the Online/Activity LED on the replacement drive permanently ceases to be illuminated even while other drives in the array are active, the rebuild process has abnormally terminated. The following table indicates the three possible causes of abnormal termination of a rebuild.

Observation | Cause of rebuild termination
None of the drives in the array have an illuminated amber Fault LED. | One of the drives in the array has experienced an uncorrectable read error.
The replacement drive has an illuminated amber Fault LED. | The replacement drive has failed.
One of the other drives in the array has an illuminated amber Fault LED. | The drive with the illuminated Fault LED has now failed.

Each of these situations requires a different remedial action.
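The three observations map directly onto the three cases that follow, which can be sketched as a small Python function. The function name and its boolean inputs are hypothetical. The guide does not say what to conclude if both the replacement drive and another drive show a Fault LED, so checking the replacement drive first is an assumption of this sketch.

```python
def rebuild_termination_cause(replacement_fault_led_on, other_fault_led_on):
    """Map amber Fault LED observations to the likely cause of termination."""
    if replacement_fault_led_on:
        # Assumption: this case is reported first if both LEDs are lit.
        return "Case 2: the replacement drive has failed"
    if other_fault_led_on:
        return "Case 3: another drive in the array has failed"
    # No Fault LED is lit anywhere in the array.
    return "Case 1: an uncorrectable read error has occurred"
```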

Case 1: An uncorrectable read error has occurred.

1. Back up as much data as possible from the logical drive.

CAUTION: Do not remove the drive that has the media error. Doing so causes the logical drive to fail.

2. Restore data from backup. Writing data to the location of the unreadable sector often eliminates the error.

3. Remove and reinsert the replacement drive. This action restarts the rebuild process.

If the rebuild process still terminates abnormally:

1. Delete and recreate the logical drive.

2. Restore data from backup.

Case 2: The replacement drive has failed.

Verify that the replacement drive is of the correct capacity and is a supported model. If these factors are not the cause of the problem, use a different drive as the replacement.

Case 3: Another drive in the array has failed.

A drive that has recently failed can sometimes be made temporarily operational again by cycling the server power.

1. Power down the server.

2. Remove the replacement physical drive (the one undergoing a rebuild), and reinstall the drive that it is replacing.

3. Power up the server.

If the newly failed drive seems to be operational again:

1. Back up any unsaved data.

2. Remove the drive that was originally to be replaced, and reinsert the replacement physical drive. The rebuild process automatically restarts.

3. When the rebuild process has finished, replace the newly failed drive.

However, if the newly failed drive has not recovered:

1. Remove the drive that was originally to be replaced, and reinsert the replacement physical drive.

2. Replace the newly failed drive.

3. Restore data from backup.

Upgrading hard drive capacity

You can increase the storage capacity on a system even if there are no available drive bays by swapping drives one at a time for higher capacity drives. This method is viable as long as a fault-tolerance method is running.

CAUTION: Because it can take up to 15 minutes per gigabyte to rebuild the data in the new configuration, the system is unprotected against drive failure for many hours while a given drive is upgraded. Perform drive capacity upgrades only during periods of minimal system activity.

To upgrade hard drive capacity:

1. Back up all data.

2. Replace any drive. The data on the new drive is re-created from redundant information on the remaining drives.

CAUTION: Do not replace any other drive until data rebuild on this drive is complete.

When data rebuild on the new drive is complete, the Online/Activity LED stops flashing steadily and either flashes irregularly or glows steadily.

3. Repeat the previous step for the other drives in the array, one at a time.

When you have replaced all drives, you can use the extra capacity to either create new logical drives or extend existing logical drives. For more information about these procedures, refer to the HP Array Configuration Utility User Guide.

Moving drives and arrays

You can move drives to other ID positions on the same array controller. You can also move a complete array from one controller to another, even if the controllers are on different servers.

Before you move drives, the following conditions must be met:

• The server must be powered down.

• If moving the drives to a different server, the new server must have enough empty bays to accommodate all the drives simultaneously.

• The array has no failed or missing drives, and no spare drive in the array is acting as a replacement for a failed drive.

• The controller is not running capacity expansion, capacity extension, or RAID or stripe size migration.

• The controller is using the latest firmware version (recommended).
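The required conditions in the list above can be expressed as a simple pre-flight check. This Python sketch is purely illustrative: the function and parameter names are hypothetical, the caller must supply the values after inspecting the system, and the firmware recommendation is left out because it is not a hard requirement.

```python
def unmet_move_conditions(server_powered_down, failed_or_missing_drives,
                          spare_in_use, controller_busy,
                          drives_to_move=0, empty_bays_in_new_server=None):
    """Return the preconditions that are not yet satisfied.

    empty_bays_in_new_server is None when the drives stay in the same server.
    """
    unmet = []
    if not server_powered_down:
        unmet.append("server is still powered up")
    if (empty_bays_in_new_server is not None
            and empty_bays_in_new_server < drives_to_move):
        unmet.append("new server lacks enough empty bays")
    if failed_or_missing_drives > 0:
        unmet.append("array has failed or missing drives")
    if spare_in_use:
        unmet.append("a spare is standing in for a failed drive")
    if controller_busy:
        unmet.append("expansion, extension, or migration is in progress")
    return unmet
```

An empty result means that the listed conditions are met and the move can proceed.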

If you want to move an array to another controller, all drives in the array must be moved at the same time. When all the conditions have been met:

1. Back up all data before removing any drives or changing configuration. This step is required if you are moving data-containing drives from a controller that does not have a battery-backed cache.

2. Power down the system.

3. Move the drives.

4. Power up the system. If a 1724 POST message appears, drive positions were changed successfully and the configuration was updated. If a 1785 (Not Configured) POST message appears:

a. Power down the system immediately to prevent data loss.

b. Return the drives to their original locations.

c. Restore the data from backup, if necessary.

5. Verify the new drive configuration by running ORCA or ACU (see "Configuring an array").
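The two POST outcomes named in the procedure above can be summarized as a lookup table. The Python sketch below is a hypothetical helper for illustration; it covers only the two codes mentioned in this procedure and defers everything else to the troubleshooting documentation.

```python
# POST message numbers named in the move procedure and the action each calls for.
POST_ACTIONS_AFTER_MOVE = {
    1724: "Drive positions changed successfully; "
          "verify the configuration with ORCA or ACU.",
    1785: "Power down immediately, return the drives to their original "
          "locations, and restore the data from backup if necessary.",
}


def action_after_move(post_code):
    """Return the follow-up action for a POST message seen after moving drives."""
    return POST_ACTIONS_AFTER_MOVE.get(
        post_code,
        "Not covered by this procedure; see the HP Servers Troubleshooting Guide.",
    )
```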

Adding drives

You can add hard drives to a system at any time, as long as you do not exceed the maximum number of drives that the controller supports. You can then either build a new array from the added drives or use the extra storage capacity to expand the capacity of an existing array.

To perform an array capacity expansion, use ACU. If the system is using hot-pluggable drives and ACU is running in the same environment as the normal server applications, you can expand array capacity without shutting down the operating system. For more information, see the Configuring Arrays on HP Smart Array Controllers Reference Guide.

The expansion process is illustrated in the following figure, in which the original array (containing data) is shown with a dashed border, and the newly added drives (containing no data) are shown unshaded. The array controller adds the new drives to the array and redistributes the original logical drives over the enlarged array one logical drive at a time. This process liberates some storage capacity on each physical drive in the array. Each logical drive keeps the same fault-tolerance method in the enlarged array that it had in the smaller array.

When the expansion process has finished, you can use the liberated storage capacity on the enlarged array to create new logical drives. Alternatively, you can use ACU to enlarge (extend) one of the original logical drives.

Diagnosing array problems

Controller board runtime LEDs

Immediately after you power up the server, the controller runtime LEDs illuminate briefly in a predetermined pattern as part of the POST sequence. At all other times during server operation, the illumination pattern of the runtime LEDs indicates the status of the controller, as described in the following table.

LED ID | Color | LED name and interpretation
1 | Amber | CR10: Thermal Alert LED. Not used on this controller.
2 | Amber | CR9: System Error LED. The controller ASIC has locked up and cannot process any commands.
3 | Amber | CR1: Diagnostics Error LED. One of the diagnostics utilities in the server has detected a controller error.
4 | Amber | CR2: Drive Failure LED. A physical drive connected to the controller has failed. Check the Fault LED on each drive to determine which drive has failed.
5 | Green | CR3: Activity LED for port 2.
6 | Green | CR4: Activity LED for port 1.
7 | Green | CR5: Command Outstanding LED. The controller is working on a command from the host driver.
8 | Green | CR6: Controller Heartbeat LED. This LED flashes every 2 seconds to indicate the controller health.
9 | Green | CR7: Gas Pedal LED. This LED, together with the Idle Task LED, indicates the amount of controller CPU activity. For details, see the following table.
10 | Green | CR8: Idle Task LED. This LED, together with the Gas Pedal LED, indicates the amount of controller CPU activity. For details, see the following table.
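The Gas Pedal and Idle Task LED combinations in the CPU activity table can be read as a simple lookup. The Python sketch below is illustrative only: the state strings ("off", "blinking", "on") and the function name are hypothetical, and someone still has to observe the LEDs on the board.

```python
# Keys are (Gas Pedal LED state, Idle Task LED state).
CPU_ACTIVITY_LEVELS = {
    ("off", "blinking"): "0-25%",
    ("blinking", "off"): "25-50%",
    ("on", "off"): "50-75%",
    ("on", "on"): "75-100%",
}


def controller_cpu_activity(gas_pedal, idle_task):
    """Return the controller CPU activity range for an observed LED pair."""
    return CPU_ACTIVITY_LEVELS.get((gas_pedal, idle_task),
                                   "combination not listed")
```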

Gas Pedal LED status | Idle Task LED status | Controller CPU activity level
Off | Blinking | 0–25%
Blinking | Off | 25–50%
On steadily | Off | 50–75%
On steadily | On steadily | 75–100%

Battery pack LEDs

Item ID | Color | Description
1 | Green | System Power LED. This LED glows steadily when the system is powered up and 12 V system power is available. This power supply is used to maintain the battery charge and provide supplementary power to the cache microcontroller.
2 | Green | Auxiliary Power LED. This LED glows steadily when 3.3 V auxiliary voltage is detected. The auxiliary voltage is used to preserve BBWC data and is available any time that the system power cords are connected to a power supply.
3 | Amber | Battery Health LED. To interpret the illumination patterns of this LED, see the following table.
4 | Green | BBWC Status LED. To interpret the illumination patterns of this LED, see the following table.

LED3 pattern | LED4 pattern | Interpretation
— | One blink every two seconds | The system is powered down, and the cache contains data that has not yet been written to the drives. Restore system power as soon as possible to prevent data loss. Data preservation time is extended any time that 3.3 V auxiliary power is available, as indicated by LED 2. In the absence of auxiliary power, battery power alone preserves the data. A fully charged battery can normally preserve data for at least two days. The battery lifetime also depends on the cache module size. For further information, refer to the controller QuickSpecs on the HP website (http://www.hp.com).
— | Double blink, then pause | The cache microcontroller is waiting for the host controller to communicate.
— | One blink per second | The battery pack is below the minimum charge level and is being charged. Features that require a battery (such as write cache, capacity expansion, stripe size migration, and RAID migration) are temporarily unavailable until charging is complete. The recharge process takes between 15 minutes and two hours, depending on the initial capacity of the battery.
— | Steady glow | The battery pack is fully charged, and posted write data is stored in the cache.
— | Off | The battery pack is fully charged, and there is no posted write data in the cache.
One blink per second | One blink per second | An alternating green and amber blink pattern indicates that the cache microcontroller is executing from within its boot loader and receiving new flash code from the host controller.
Steady glow | — | There is a short circuit across the battery terminals or within the battery pack. BBWC features are disabled until the battery pack is replaced. The life expectancy of a battery pack is typically more than three years.
One blink per second | — | There is an open circuit across the battery terminals or within the battery pack. BBWC features are disabled until the battery pack is replaced. The life expectancy of a battery pack is typically more than three years.
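The LED 3 and LED 4 patterns can also be kept as a lookup table for quick reference. The Python sketch below is illustrative: the pattern strings, the shortened interpretations, and the function name are hypothetical. The entry in which both LEDs blink is inferred from the "alternating green and amber" wording and is marked as an assumption.

```python
# Keys are (LED 3 amber pattern, LED 4 green pattern); None means "not applicable".
BATTERY_LED_STATES = {
    (None, "blink every 2 s"): "System powered down with unwritten data in cache",
    (None, "double blink"): "Cache microcontroller waiting for host controller",
    (None, "blink every 1 s"): "Battery below minimum charge and charging",
    (None, "steady"): "Battery fully charged, posted write data in cache",
    (None, "off"): "Battery fully charged, no posted write data in cache",
    # Assumption: both LEDs blink in this state (alternating green and amber).
    ("blink every 1 s", "blink every 1 s"):
        "Cache microcontroller receiving new flash code",
    ("steady", None): "Short circuit in battery pack",
    ("blink every 1 s", None): "Open circuit in battery pack",
}


def interpret_battery_leds(led3, led4):
    """Return a short interpretation for an observed LED 3 / LED 4 pattern pair."""
    return BATTERY_LED_STATES.get((led3, led4), "Pattern not listed in the table")
```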

Diagnostic tools

Several diagnostic tools provide feedback about problems with arrays.

• ADU

This utility is available on the SmartStart CD in the controller kit and also on the HP website (http://www.hp.com/support). For more information about the meanings of the various ADU error messages, see the HP Servers Troubleshooting Guide.

• Event Notification Service

This utility reports array events to the Microsoft® Windows® system event log and IML. You can obtain the utility from the SmartStart CD or the HP website (http://www.hp.com/support).

• POST messages

Smart Array controllers produce diagnostic error messages (POST messages) at reboot. Many POST messages are self-explanatory and suggest corrective actions. For more information about POST messages, see the HP Servers Troubleshooting Guide.

• HP Insight Diagnostics

HP Insight Diagnostics is a tool that displays information about the system hardware configuration and performs tests on the system and its components (including hard drives if they are connected to Smart Array controllers). This utility is available on the SmartStart CD and also on the HP website (http://www.hp.com/servers/diags).

Electrostatic discharge

Preventing electrostatic discharge

To prevent damaging the system, be aware of the precautions you need to follow when setting up the system or handling parts. A discharge of static electricity from a finger or other conductor may damage system boards or other static-sensitive devices. This type of damage may reduce the life expectancy of the device.

To prevent electrostatic damage:

• Avoid hand contact by transporting and storing products in static-safe containers.

• Keep electrostatic-sensitive parts in their containers until they arrive at static-free workstations.

• Place parts on a grounded surface before removing them from their containers.

• Avoid touching pins, leads, or circuitry.

• Always be properly grounded when touching a static-sensitive component or assembly.

Grounding methods to prevent electrostatic discharge

Several methods are used for grounding. Use one or more of the following methods when handling or installing electrostatic-sensitive parts:

• Use a wrist strap connected by a ground cord to a grounded workstation or computer chassis. Wrist straps are flexible straps with a minimum of 1 megohm ±10 percent resistance in the ground cords. To provide proper ground, wear the strap snug against the skin.

• Use heel straps, toe straps, or boot straps at standing workstations. Wear the straps on both feet when standing on conductive floors or dissipating floor mats.

• Use conductive field service tools.

• Use a portable field service kit with a folding static-dissipating work mat.

If you do not have any of the suggested equipment for proper grounding, have an authorized reseller install the part.

For more information on static electricity or assistance with product installation, contact an authorized reseller.

Regulatory compliance notices

European Union regulatory notice

This product complies with the following EU Directives:

• Low Voltage Directive 2006/95/EC

• EMC Directive 2004/108/EC

Compliance with these directives implies conformity to applicable harmonized European standards (European Norms) which are listed on the EU Declaration of Conformity issued by Hewlett-Packard for this product or product family.

This compliance is indicated by the following conformity marking placed on the product:

This marking is valid for non-Telecom products and EU harmonized Telecom products (e.g. Bluetooth).

This marking is valid for EU non-harmonized Telecom products.

*Notified body number (used only if applicable; refer to the product label)

Hewlett-Packard GmbH, HQ-TRE, Herrenberger Strasse 140, 71034 Boeblingen, Germany

BSMI notice

Korean class A notice

Battery replacement notice

This component uses a nickel metal hydride (NiMH) battery pack.

WARNING: There is a risk of explosion, fire, or personal injury if a battery pack is mishandled. To reduce this risk:

• Do not attempt to recharge the batteries if they are disconnected from the controller.

• Do not expose the battery pack to water, or to temperatures higher than 60°C (140°F).

• Do not abuse, disassemble, crush, or puncture the battery pack.

• Do not short the external contacts.

• Replace the battery pack only with the designated HP spare.

• Battery disposal should comply with local regulations.

Batteries, battery packs, and accumulators should not be disposed of together with the general household waste. To forward them to recycling or proper disposal, use the public collection system or return them to HP, an authorized HP Partner, or their agents.

For more information about battery replacement or proper disposal, contact an authorized reseller or an authorized service provider.

Taiwan battery recycling notice

The Taiwan EPA requires dry battery manufacturing or importing firms in accordance with Article 15 of the Waste Disposal Act to indicate the recovery marks on the batteries used in sales, giveaway or promotion. Contact a qualified Taiwanese recycler for proper battery disposal.

Acronyms and abbreviations

ACU

Array Configuration Utility

ADG

Advanced Data Guarding (also known as RAID 6)

ADU

Array Diagnostics Utility

ASIC

Application Specific Integrated Circuit

BBWC

battery-backed write cache

IML

Integrated Management Log

ORCA

Option ROM Configuration for Arrays

POST

Power-On Self Test

RBSU

ROM-Based Setup Utility
