HP (Hewlett-Packard) ProLiant Server User Manual

HP ProLiant Servers
Troubleshooting Guide
June 2006 (Fifth Edition)
Part Number 375445-005
© Copyright 2004-2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
Microsoft, Windows, and Windows NT are U.S. registered trademarks of Microsoft Corporation. Windows Server 2003 is a trademark of Microsoft Corporation. Intel and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group. Linux is a U.S. registered trademark of Linus Torvalds.
Audience assumptions
This document is for the person who installs, administers, and troubleshoots servers and storage systems. HP assumes you are qualified in the servicing of computer equipment and trained in recognizing hazards in products with hazardous energy levels.

Contents

Introduction
What's new
Revision history
375445-xx4 (May 2006)
375445-xx3 (September 2005)
Getting started
Pre-diagnostic steps
Important safety information
Symptom information
Prepare the server for diagnosis
Common problem resolution
Loose connections
Service notifications
Updating firmware
Hard drive guidelines
SAS and SATA hard drive guidelines
SCSI hard drive guidelines
Hot-plug SCSI hard drive LED combinations
SAS and SATA hard drive LED combinations
Diagnostic flowcharts
Troubleshooting flowcharts
Start diagnosis flowchart
General diagnosis flowchart
Power-on problems flowchart
POST problems flowchart
Operating system boot problems flowchart
Server fault indications flowchart
Hardware problems
Procedures for all ProLiant servers
Power problems
Power source problems
Power supply problems
UPS problems
General hardware problems
Problems with new hardware
Unknown problem
Third-party device problems
Internal system problems
CD-ROM and DVD drive problems
Diskette drive problems
Tape drive problems
Hard drive problems
Fan problems
Memory problems
PPM problems
Processor problems
System open circuits and short circuits
External device problems
Video problems
Mouse and keyboard problems
Audio problems
Printer problems
Local I/O cable problems
Modem problems
Network controller problems
Software problems
Operating system problems and resolutions
Operating system problems
Operating system updates
Restoring to a backed-up version
When to Reconfigure or Reload Software
Linux operating systems
Application software problems
Software locks up
Errors occur after a software setting is changed
Errors occur after the system software is changed
Errors occur after an application is installed
Remote ROM flash problems
General remote ROM flash problems are occurring
Command-line syntax error
Access denied on target computer
Invalid or incorrect command-line parameters
Network connection fails on remote communication
Failure occurs during ROM flash
Target system is not supported
Software tools and solutions
Configuration tools
Array Configuration Utility
SmartStart software
SmartStart Scripting Toolkit
HP ROM-Based Setup Utility
Option ROM Configuration for Arrays
HP ProLiant Essentials Rapid Deployment Pack
Re-entering the server serial number and product ID
Management CD
Management tools
Automatic Server Recovery
ROMPaq utility
Remote Insight Lights-Out Edition II
Integrated Lights-Out technology
Erase Utility
StorageWorks library and tape tools
HP Systems Insight Manager
Management Agents
HP ProLiant Essentials Virtualization Management Software
HP ProLiant Essentials Server Migration Pack - Physical to ProLiant Edition
HP BladeSystem Essentials Insight Control Data Center Edition
HP Control Tower
System Management homepage
USB support
Clustering software
Diagnostic tools
HP Insight Diagnostics
Survey Utility
Integrated Management Log
Array Diagnostic Utility
Remote support and analysis tools
HP Instant Support Enterprise Edition
Web-Based Enterprise Service
Open Services Event Manager
Keeping the system current
Drivers
Version control
Resource Paqs
ProLiant Support Packs
Operating system version support
SoftPaqs
Change control and proactive notification
Care Pack
Firmware maintenance
Types of ROM
Methods for updating firmware
Current firmware versions
Updating firmware
HP resources for troubleshooting
Online resources
HP website
Server documentation
Service notifications
Subscriber's choice
Natural language search assistant
Care Pack
White papers
General server resources
Additional product information
Device driver information
External cabling information
Fault tolerance, security, care and maintenance, configuration and setup
Installation and configuration information for the server management system
Installation and configuration information for the server setup software
iLO information
Key features, option part numbers
Management of the server
Operating system installation and configuration information (for factory-installed operating systems)
Operating system version support
Overview of server features and installation instructions
Power capacity
Registering the server
Server configuration information
Software installation and configuration of the server
Switch settings, LED functions, drive, memory, expansion board and processor installation instructions, and board layouts
Server and option specifications, symbols, installation warnings, and notices
Teardown procedures, part numbers, specifications
Technical topics
Error messages
ADU error messages
Introduction to ADU error messages
Accelerator Board not Detected
Accelerator Error Log
Accelerator Parity Read Errors: X
Accelerator Parity Write Errors: X
Accelerator Status: Cache was Automatically Configured During Last Controller Reset
Accelerator Status: Data in the Cache was Lost...
Accelerator Status: Dirty Data Detected has Reached Limit...
Accelerator Status: Dirty Data Detected...
Accelerator Status: Excessive ECC Errors Detected in at Least One Cache Line...
Accelerator Status: Excessive ECC Errors Detected in Multiple Cache Lines...
Accelerator Status: Obsolete Data Detected
Accelerator Status: Obsolete Data was Discarded
Accelerator Status: Obsolete Data was Flushed (Written) to Drives
Accelerator Status: Permanently Disabled
Accelerator Status: Possible Data Loss in Cache
Accelerator Status: Temporarily Disabled
Accelerator Status: Unrecognized Status
Accelerator Status: Valid Data Found at Reset
Accelerator Status: Warranty Alert
Adapter/NVRAM ID Mismatch
Array Accelerator Battery Pack X not Fully Charged
Array Accelerator Battery Pack X Below Reference Voltage (Recharging)
Board in Use by Expand Operation
Board not Attached
Cache Has Been Disabled Because ADG Enabler Dongle is Broken or Missing
Cache Has Been Disabled; Likely Caused By a Loose Pin on One of the RAM Chips
Configuration Signature is Zero
Configuration Signature Mismatch
Controller Communication Failure Occurred
Controller Detected. NVRAM Configuration not Present
Controller Firmware Needs Upgrading
Controller is Located in Special "Video" Slot
Controller Is Not Configured
Controller Reported POST Error. Error Code: X
Controller Restarted with a Signature of Zero
Disable Command Issued
Drive (Bay) X Firmware Needs Upgrading
Drive (Bay) X has Insufficient Capacity for its Configuration
Drive (Bay) X has Invalid M&P Stamp
Drive (Bay) X Has Loose Cable
Drive (Bay) X is a Replacement Drive
Drive (Bay) X is a Replacement Drive Marked OK
Drive (Bay) X is Failed
Drive (Bay) X is Undergoing Drive Recovery
Drive (Bay) X Upload Code Not Readable
Drive (Bay) X Was Inadvertently Replaced
Drive Monitoring Features Are Unobtainable
Drive Monitoring is NOT Enabled for SCSI Port X Drive ID Y
Drive Time-Out Occurred on Physical Drive Bay X
Drive X Indicates Position Y
Duplicate Write Memory Error
Error Occurred Reading RIS Copy from SCSI Port X Drive ID
FYI: Drive (Bay) X is Third-Party Supplied
Identify Logical Drive Data did not Match with NVRAM
Insufficient adapter resources
Inter-Controller Link Connection Could Not Be Established
Less Than 75% Batteries at Sufficient Voltage
Less Than 75% of Batteries at Sufficient Voltage Battery Pack X Below Reference Voltage
Logical Drive X Failed Due to Cache Error
Logical Drive X Status = Failed
Logical Drive X Status = Interim Recovery (Volume Functional, but not Fault Tolerant)
Logical Drive X Status = Loose Cable Detected...
Logical Drive X Status = Overheated
Logical Drive X Status = Overheating
Logical Drive X Status = Recovering (rebuilding data on a replaced drive)
Logical Drive X Status = Wrong Drive Replaced
Loose Cable Detected - Logical Drives May Be Marked FAILED Until Corrected
Mirror Data Miscompare
No Configuration for Array Accelerator Board
One or More Drives is Unable to Support Redundant Controller Operation
Other Controller Indicates Different Hardware Model
Other Controller Indicates Different Firmware Version
Other Controller Indicates Different Cache Size
Processor Reduced Power Mode Enabled in RBSU
Processor Not Started (Processor Stalled)
Processor Not Started (Stepping Does Not Match)
Processor Not Started (Unsupported Processor Stepping)
Processor Not Supported (Unsupported Core Speed)
RIS Copies Between Drives Do Not Match
SCSI Port X Drive ID Y Failed - REPLACE (failure message)
SCSI Port X, Drive ID Y Firmware Needs Upgrading
SCSI Port X, Drive ID Y Has Exceeded the Following Threshold(s)
SCSI Port X, Drive ID Y is not Stamped for Monitoring........................................................................ 85
SCSI Port X, Drive ID Y May Have a Loose Connection....................................................................... 85
SCSI Port X, Drive ID Y RIS Copies Within This Drive Do Not Match..................................................... 85
SCSI Port X, Drive ID Y...S.M.A.R.T. Predictive Failure Errors Have Been Detected in the Factory Monitor and
Performance Data.......................................................................................................................... 85
SCSI Port X, Drive ID Y...S.M.A.R.T. Predictive Failure Errors Have Been Detected in the Power Monitor and
Performance Data.......................................................................................................................... 86
SCSI Port X, Drive ID Y Was Replaced On a Good Volume: (failure message)....................................... 86
Set Configuration Command Issued .................................................................................................86
Soft firmware upgrade required....................................................................................................... 86
Storage Enclosure on SCSI Bus X has a Cabling Error (Bus Disabled)... ................................................86
Storage Enclosure on SCSI Bus X Indicated a Door Alert.....................................................................86
Storage Enclosure on SCSI Bus X Indicated a Power Supply Failure...................................................... 86
Storage Enclosure on SCSI Bus X Indicated an Overheated Condition... ...............................................87
Storage enclosure on SCSI Bus X is unsupported with its current firmware version... ............................... 87
Storage Enclosure on SCSI Bus X Indicated that the Fan Failed... ......................................................... 87
Storage Enclosure on SCSI Bus X Indicated that the Fan is Degraded... ................................................87
Storage Enclosure on SCSI Bus X Indicated that the Fan Module is Unplugged... ...................................87
Storage Enclosure on SCSI Bus X - Wide SCSI Transfer Failed............................................................. 87
Swapped cables or configuration error detected. A configured array of drives...................................... 88
Swapped Cables or Configuration Error Detected. A Drive Rearrangement........................................... 88
Swapped Cables or Configuration Error Detected. An Unsupported Drive Arrangement Was Attempted... 88
Swapped cables or configuration error detected. The cables appear to be interchanged... .....................88
Swapped cables or configuration error detected. The configuration information on the attached drives... .89
Swapped Cables or Configuration Error Detected. The Maximum Logical Volume Count X...................... 89
System Board is Unable to Identify which Slots the Controllers are in.................................................... 89
The Redundant Controllers Installed are not the Same Model...............................................................89
This Controller Can See the Drives but the Other Controller Can't ........................................................90
This Controller Can't See the Drives but the Other Controller Can ........................................................90
Unable to Communicate with Drive on SCSI Port X, Drive ID Y ............................................................ 90
Unable to Retrieve Identify Controller Data. Controller May be Disabled or Failed ................................. 90
Unknown Disable Code.................................................................................................................. 90
Unrecoverable Read Error ..............................................................................................................90
Unsupported Processor Configuration (Processor Required in Slot #1) .................................................. 91
Warning Bit Detected..................................................................................................................... 91
WARNING - Drive Write Cache is Enabled on X............................................................................... 91
WARNING - Mixed Feature Processors Were Detected...................................................................... 91
WARNING - Resetting Corrupted CMOS ......................................................................................... 91
WARNING - Resetting Corrupted NVRAM........................................................................................ 91
WARNING - Resetting Corrupted System Environment........................................................................ 91
WARNING - Restoring Default Configurations as Requested ...............................................................91
WARNING: Storage Enclosure on SCSI Bus X Indicated it is Operating in Single Ended Mode... ............92
Write Memory Error....................................................................................................................... 92
Wrong Accelerator........................................................................................................................92
POST error messages and beep codes....................................................................................................... 92
Introduction to POST error messages................................................................................................ 92
Non-numeric messages or beeps only ..............................................................................................93
100 Series ................................................................................................................................. 101
200 Series ................................................................................................................................. 103
300 Series ................................................................................................................................. 106
400 Series ................................................................................................................................. 107
600 Series ................................................................................................................................. 108
1100 Series ............................................................................................................................... 109
1600 Series ............................................................................................................................... 109
1700 Series ............................................................................................................................... 112
Event list error messages ........................................................................................................................124
Introduction to event list error messages..........................................................................................124
A CPU Power Module (System Board, Socket X)... ........................................................................... 124
ASR Lockup Detected: Cause ........................................................................................................124
Automatic operating system shutdown initiated due to fan failure.......................................................125
Automatic Operating System Shutdown Initiated Due to Overheat Condition....................................... 125
Blue Screen Trap: Cause [NT]... .................................................................................................... 125
Corrected Memory Error Threshold Passed (Slot X, Memory Module Y)...............................................125
EISA Expansion Bus Master Timeout (Slot X).................................................................................... 125
PCI Bus Error (Slot X, Bus Y, Device Z, Function X) ...........................................................................125
Processor Correctable Error Threshold Passed (Slot X, Socket Y)......................................................... 125
Processor Uncorrectable Internal Error (Slot X, Socket Y)................................................................... 125
Real-Time Clock Battery Failing...................................................................................................... 126
System AC Power Overload (Power Supply X).................................................................................126
System AC Power Problem (Power Supply X)...................................................................................126
System Fan Failure (Fan X, Location) ..............................................................................................126
System Fans Not Redundant.......................................................................................................... 126
System Overheating (Zone X, Location) ..........................................................................................126
System Power Supplies Not Redundant........................................................................................... 126
System Power Supply Failure (Power Supply X)................................................................................ 126
Unrecoverable Host Bus Data Parity Error... .................................................................................... 126
Uncorrectable Memory Error (Slot X, Memory Module Y).................................................................. 127
HP BladeSystem infrastructure error codes ................................................................................................ 127
Server blade management module error codes................................................................................127
Power management module error codes......................................................................................... 130
Port 85 codes and iLO messages ............................................................................................................131
Troubleshooting the system using port 85 codes ..............................................................................131
Processor-related port 85 codes..................................................................................................... 131
Memory-related port 85 codes ...................................................................................................... 132
Expansion board-related port 85 codes.......................................................................................... 133
Miscellaneous port 85 codes ........................................................................................................ 133
Windows® Event Log processor error codes............................................................................................. 134
Message ID: 4137 ......................................................................................................................134
Message ID: 4140 ......................................................................................................................134
Message ID: 4141 ......................................................................................................................134
Message ID: 4169 ......................................................................................................................134
Message ID: 4190 ......................................................................................................................134
Insight Diagnostics processor error codes .................................................................................................135
MSG_CPU_RR_1......................................................................................................................... 135
MSG_CPU_RR_2......................................................................................................................... 135
MSG_CPU_RR_3......................................................................................................................... 135
MSG_CPU_RR_5......................................................................................................................... 135
MSG_CPU_RR_6......................................................................................................................... 135
MSG_CPU_RR_7......................................................................................................................... 136
MSG_CPU_RR_8......................................................................................................................... 136
MSG_CPU_RR_9......................................................................................................................... 136
MSG_CPU_RR_10....................................................................................................................... 136
MSG_CPU_RR_11....................................................................................................................... 136
MSG_CPU_RR_12....................................................................................................................... 136
MSG_CPU_RR_13....................................................................................................................... 136
MSG_CPU_RR_14....................................................................................................................... 136
MSG_CPU_RR_15....................................................................................................................... 136
MSG_CPU_RR_16....................................................................................................................... 136
MSG_CPU_RR_17....................................................................................................................... 137
Contacting HP.......................................................................................................................... 138
Contacting HP technical support or an authorized reseller ..........................................................................138
Customer self repair............................................................................................................................... 138
Server information you need................................................................................................................... 139
Operating system information you need ................................................................................................... 139
Microsoft operating systems.......................................................................................................... 139
Linux operating systems................................................................................................................ 140
Novell NetWare operating systems ............................................................................................... 141
SCO operating systems ................................................................................................................ 141
IBM OS/2 operating systems........................................................................................................ 142
Sun Solaris operating systems .......................................................................................................142
Acronyms and abbreviations...................................................................................................... 144
Index....................................................................................................................................... 148

Introduction

In this section
What's new........................................................................................................................................... 10
Revision history ...................................................................................................................................... 10

What's new

The fifth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-xx5, includes the following additions:
c-Class server blade power-on problems flowchart (on page 25)
c-Class server blade POST problems flowchart (on page 28)
c-Class server blade fault indications flowchart (on page 31)
Windows® Event Log processor error codes (on page 134)
Insight Diagnostics processor error codes (on page 135)

Revision history

375445-xx4 (May 2006)

The fourth edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-xx4, included the following additions:
Hot-plug SAS and SATA hard drive LED combinations (on page 17)
Operating system issues with Intel® dual-core processors (Hyper-Threading enabled) (on page 50)
Tape drive problems (on page 37)
New error messages in ADU error messages (on page 73) and POST error messages and beep
codes (on page 92)

375445-xx3 (September 2005)

The third edition of the HP ProLiant Servers Troubleshooting Guide, part number 375445-xx3, included the following changes:
Updated SCSI hard drive guidelines
Added hot-plug SCSI hard drive LED combinations (on page 16)
Updated diagnostic flowcharts (on page 19)
Added operating system problems (on page 49)
Added Port 85 codes and iLO messages (on page 131)
Added new error messages to ADU error messages and POST error messages and beep codes
Updated contacting HP:
Contacting HP technical support or an authorized reseller
Server information you need

Getting started

NOTE: For common troubleshooting procedures, the term "server" is used to mean servers and server
blades.
This guide provides common procedures and solutions for the many levels of troubleshooting a ProLiant server—from the most basic connector issues to complex software configuration problems.
To understand the sections of this guide and to identify the best starting point for a problem, use the following descriptions:
Common problem resolution (on page 15)
Many server problems are caused by loose connections (on page 15), outdated firmware ("Updating firmware" on page 15), and other issues. Use this section to perform basic troubleshooting for common problems.
Problem diagnosis
When a server exhibits symptoms that do not immediately pinpoint the problem, use this section to begin troubleshooting. The section contains a series of flowcharts that provide a common process for troubleshooting ProLiant servers. The flowcharts identify a diagnostic tool or a process to solve the problem.
Hardware problems (on page 32)
When the symptoms point to a specific component, use this section to find solutions for problems with power, general components, system boards, system open circuits and short circuits, and external devices.
Software problems (on page 49)
When you have a known, specific software problem, use this section to identify a solution to the problem.
Software tools and solutions (on page 54)
Use this section as a reference for software tools and utilities.
HP resources for troubleshooting (on page 69)
When additional information becomes necessary, use this section to identify websites and supplemental documents that contain troubleshooting information.
Error messages
Use this section to locate a complete list of ADU error messages (on page 73), POST error messages and beep codes (on page 92), event list error messages (on page 124), HP BladeSystem infrastructure error codes (on page 127), and Port 85 codes and iLO messages (on page 131).

Pre-diagnostic steps

WARNING: To avoid potential problems, ALWAYS read the warnings and cautionary
information in the server documentation before removing, replacing, reseating, or modifying system components.
IMPORTANT: This guide provides information for multiple servers. Some information may not apply to the
server you are troubleshooting. Refer to the server documentation for information on procedures, hardware options, software tools, and operating systems supported by the server.
1. Review the important safety information (on page 12).
2. Gather symptom information (on page 14).
3. Prepare the server for diagnosis.
4. Use the Start diagnosis flowchart (on page 20) to begin the diagnostic process.

Important safety information

Familiarize yourself with the safety information in the following sections before troubleshooting the server.
Important safety information
Before servicing this product, read the Important Safety Information document provided with the server.
Symbols on equipment
The following symbols may be placed on equipment to indicate the presence of potentially hazardous conditions.
This symbol indicates the presence of hazardous energy circuits or electric shock hazards. Refer all servicing to qualified personnel.
WARNING: To reduce the risk of injury from electric shock hazards, do not open this enclosure. Refer all maintenance, upgrades, and servicing to qualified personnel.
This symbol indicates the presence of electric shock hazards. The area contains no user or field serviceable parts. Do not open for any reason.
WARNING: To reduce the risk of injury from electric shock hazards, do not open this enclosure.
This symbol on an RJ-45 receptacle indicates a network interface connection.
WARNING: To reduce the risk of electric shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.
This symbol indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists.
WARNING: To reduce the risk of injury from a hot component, allow the surface to cool before touching.
This symbol indicates that the component exceeds the recommended weight for one individual to handle safely.
[Symbol label: weight in kg, weight in lb]
WARNING: To reduce the risk of personal injury or damage to the equipment, observe local occupational health and safety requirements and guidelines for manual material handling.
These symbols, on power supplies or systems, indicate that the equipment is supplied by multiple sources of power.
WARNING: To reduce the risk of injury from electric shock, remove all power cords to completely disconnect power from the system.
Warnings and cautions
WARNING: Only authorized technicians trained by HP should attempt to repair this
equipment. All troubleshooting and repair procedures are detailed to allow only subassembly/module-level repair. Because of the complexity of the individual boards and subassemblies, no one should attempt to make repairs at the component level or to make modifications to any printed wiring board. Improper repairs can create a safety hazard.
WARNING: To reduce the risk of personal injury or damage to the equipment, be sure
that:
The leveling feet are extended to the floor.
The full weight of the rack rests on the leveling feet.
The stabilizing feet are attached to the rack if it is a single-rack installation.
The racks are coupled together in multiple-rack installations.
Only one component is extended at a time. A rack may become unstable if more than
one component is extended for any reason.
WARNING: To reduce the risk of electric shock or damage to the equipment:
Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times.
Unplug the power cord from the power supply to disconnect power to the equipment.
Do not route the power cord where it can be walked on or pinched by items placed
against it. Pay particular attention to the plug, electrical outlet, and the point where the cord extends from the server.
WARNING: To reduce the risk of personal injury or damage to the equipment:
[Symbol label: weight in kg, weight in lb]
Observe local occupational health and safety requirements and guidelines for manual handling.
Obtain adequate assistance to lift and stabilize the chassis during installation or removal.
The server is unstable when not fastened to the rails.
When mounting the server in a rack, remove the power supplies and any other
removable module to reduce the overall weight of the product.
CAUTION: To properly ventilate the system, you must provide at least 7.6 cm (3.0 in) of clearance at the
front and back of the server.
CAUTION: The server is designed to be electrically grounded (earthed). To ensure proper operation, plug
the AC power cord into a properly grounded AC outlet only.

Symptom information

Before troubleshooting a server problem, collect the following information:
What events preceded the failure? After which steps does the problem occur?
What has been changed since the time the server was working?
Did you recently add or remove hardware or software? If so, did you remember to change the
appropriate settings in the server setup utility, if necessary?
How long has the server exhibited problem symptoms?
If the problem occurs randomly, what is the duration or frequency?
To answer these questions, the following information may be useful:
Run HP Insight Diagnostics (on page 61) and use the survey page to view the current configuration
or to compare it to previous configurations.
Refer to your hardware and software records for information.
Refer to server LEDs and their statuses.
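The questions above can be captured in a simple record so that none are overlooked before diagnosis begins. The following Python sketch is illustrative only. The SymptomReport class and its field names are hypothetical and are not part of any HP tool.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class SymptomReport:
    """Hypothetical record of the pre-diagnostic questions in this section."""
    events_before_failure: Optional[str] = None
    changes_since_working: Optional[str] = None
    hardware_or_software_changed: Optional[bool] = None
    setup_utility_updated: Optional[bool] = None
    symptom_duration: Optional[str] = None
    random_occurrence_frequency: Optional[str] = None


def unanswered(report: SymptomReport) -> List[str]:
    """Return the names of the questions that still have no answer."""
    return [f.name for f in fields(report) if getattr(report, f.name) is None]


# Example: only two of the six questions have been answered so far.
report = SymptomReport(
    events_before_failure="Problem appears after POST completes",
    hardware_or_software_changed=True,
)
print(unanswered(report))
```

Any field still listed by unanswered() marks information to collect before moving on to the diagnostic flowcharts.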

Prepare the server for diagnosis

1. Be sure the server is in the proper operating environment with adequate power, air conditioning,
and humidity control. Refer to the server documentation for required environmental conditions.
2. Record any error messages displayed by the system.
3. Remove all diskettes and CDs from the media drives.
4. Power down the server and peripheral devices if you will be diagnosing the server offline. Always
perform an orderly shutdown, if possible. This means you must:
a. Exit any applications.
b. Exit the operating system.
c. Power down the server.
5. Disconnect any peripheral devices not required for testing (any devices not necessary to power up
the server). Do not disconnect the printer if you want to use it to print error messages.
6. Collect all tools and utilities, such as a Torx screwdriver, loopback adapters, ESD wrist strap, and
software utilities, necessary to troubleshoot the problem.
You must have the appropriate Health Drivers and Management Agents installed on the server.
NOTE: To verify the server configuration, connect to the System Management homepage (on page 61) and
select Version Control Agent. The VCA gives you a list of names and versions of all installed HP drivers, Management Agents, and utilities, and whether they are up to date.
HP recommends you have access to the server documentation for server-specific information.
HP recommends you have access to the SmartStart CD for value-added software and drivers
required during the troubleshooting process.
NOTE: Download the current version of SmartStart from the HP website (http://www.hp.com/servers/smartstart).

Common problem resolution

In this section
Loose connections .................................................................................................................................. 15
Service notifications................................................................................................................................ 15
Updating firmware ................................................................................................................................. 15
Hard drive guidelines ............................................................................................................................. 16
Hot-plug SCSI hard drive LED combinations............................................................................................... 16
SAS and SATA hard drive LED combinations............................................................................................. 17

Loose connections

Action:
Be sure all power cords are securely connected.
Be sure all cables are properly aligned and securely connected for all external and internal
components.
Remove and check all data and power cables for damage. Be sure no cables have bent pins or
damaged connectors.
If a fixed cable tray is available for the server, be sure the cords and cables connected to the server
are correctly routed through the tray.
Be sure each device is properly seated.
If a device has latches, be sure they are completely closed and locked.
Check any interlock or interconnect LEDs that may indicate a component is not connected properly.
If problems continue to occur, remove and reinstall each device, checking the connectors and sockets
for bent pins or other damage.

Service notifications

To view the latest service notifications, refer to the HP website (http://www.hp.com/go/bizsupport). Select the appropriate server model, and then click the Troubleshoot a Problem link on the product page.

Updating firmware

To update the system ROM or option firmware, use HP Smart Components. These components are available on the Firmware Maintenance CD and the HP website (http://www.hp.com/support). The most recent version of a particular server or option firmware is available on the following:
HP Support website (http://www.hp.com/support)
HP ROM-BIOS/Firmware Updates website (http://h18023.www1.hp.com/support/files/server/us/romflash.html)
Components for option firmware updates are also available from the HP Storage Products Software and Drivers website (http://www.hp.com/support/proliantstorage).
1. Find the most recent version of the component that you require. Components for controller firmware updates are available in offline and online formats.
2. Follow the instructions for installing the component on the server. These instructions are included with the CD and on the component website.
3. Follow the additional instructions that describe how to use the component to flash the ROM. These instructions are provided with each component.
View additional documentation on updating firmware, such as the Regular Firmware Updates Essential for Optimal Performance and Functionality of HP ProLiant Servers white paper, on the HP ROM-BIOS/Firmware Updates website (http://h18023.www1.hp.com/support/files/server/us/romflash.html).

Hard drive guidelines

SAS and SATA hard drive guidelines

When adding hard drives to the server, observe the following general guidelines:
The system automatically sets all drive numbers.
If only one hard drive is used, install it in the bay with the lowest drive number.
Drives must be the same capacity to provide the greatest storage space efficiency when drives are grouped together into the same drive array.
NOTE: ACU does not support mixing SAS and SATA drives in the same logical volume.

SCSI hard drive guidelines

Each SCSI drive must have a unique ID.
The system automatically sets all SCSI IDs.
If only one SCSI hard drive is used, install it in the bay with the lowest number.
Drives must be the same capacity to provide the greatest storage space efficiency when drives are grouped together into the same drive array.
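
The capacity guideline can be illustrated with a little arithmetic. The sketch below assumes that every drive in an array contributes only as much space as the smallest drive in that array, so any excess on larger drives goes unused. The function name and the sample capacities are hypothetical, and RAID overhead (parity or mirroring) is not modelled.

```python
def array_space_efficiency(capacities_gb):
    """Return (usable_gb, unused_gb, efficiency) for drives grouped in one array.

    Assumption: each drive contributes only the capacity of the smallest
    drive in the array. RAID parity/mirroring overhead is ignored.
    """
    if not capacities_gb:
        raise ValueError("at least one drive is required")
    smallest = min(capacities_gb)
    total = sum(capacities_gb)
    usable = smallest * len(capacities_gb)   # space the array can address
    unused = total - usable                  # excess on the larger drives
    return usable, unused, usable / total


# Hypothetical drive sizes in GB.
print(array_space_efficiency([72, 72, 72]))    # matched drives: nothing unused
print(array_space_efficiency([36, 72, 146]))   # mixed drives: larger drives lose space
```

With matched drives the efficiency is 1.0; with mixed drives it falls in proportion to the capacity stranded on the larger drives.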

Hot-plug SCSI hard drive LED combinations

Activity LED (1) | Online LED (2) | Fault LED (3) | Interpretation
On, off, or flashing | On or off | Flashing | A predictive failure alert has been received for this drive. Replace the drive as soon as possible.
On, off, or flashing | On | Off | The drive is online and is configured as part of an array. If the array is configured for fault tolerance and all other drives in the array are online, and a predictive failure alert is received or a drive capacity upgrade is in progress, you may replace the drive online.
On or flashing | Flashing | Off | Do not remove the drive. Removing a drive may terminate the current operation and cause data loss. The drive is rebuilding or undergoing capacity expansion.
On | Off | Off | Do not remove the drive. The drive is being accessed, but (1) it is not configured as part of an array; (2) it is a replacement drive and rebuild has not yet started; or (3) it is spinning up during the POST sequence.
Flashing | Flashing | Flashing | Do not remove the drive. Removing a drive may cause data loss in non-fault-tolerant configurations. One or more of the following conditions may exist: the drive is part of an array being selected by an array configuration utility; Drive Identification has been selected in HP SIM; the drive firmware is being updated.
Off | Off | On | The drive has been placed offline due to hard disk drive failure or subsystem communication failure. You may need to replace the drive.
Off | Off | Off | One or more of the following conditions may exist: the drive is not configured as part of an array; the drive is configured as part of an array, but it is a replacement drive that is not being accessed or being rebuilt yet; the drive is configured as an online spare. If the drive is connected to an array controller, you may replace the drive online.
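
For scripted checklists, the hot-plug SCSI LED combinations can be expressed as an ordered rule table. The following is a minimal sketch, not an HP tool: the names `SCSI_LED_RULES` and `interpret_scsi_leds`, the lowercase state strings, and the shortened interpretation texts are all assumptions made for illustration. Refer to the table for the full wording.

```python
# Each rule: (allowed activity states, allowed online states,
#             allowed fault states, shortened interpretation).
ANY = {"on", "off", "flashing"}

SCSI_LED_RULES = [
    (ANY, {"on", "off"}, {"flashing"},
     "Predictive failure alert received. Replace the drive as soon as possible."),
    (ANY, {"on"}, {"off"},
     "Drive is online and configured as part of an array."),
    ({"on", "flashing"}, {"flashing"}, {"off"},
     "Do not remove the drive. It is rebuilding or undergoing capacity expansion."),
    ({"on"}, {"off"}, {"off"},
     "Do not remove the drive. It is being accessed but is unconfigured, "
     "awaiting rebuild, or spinning up during POST."),
    ({"flashing"}, {"flashing"}, {"flashing"},
     "Do not remove the drive. It is selected in a configuration utility, "
     "identified in HP SIM, or having its firmware updated."),
    ({"off"}, {"off"}, {"on"},
     "Drive is offline due to drive or subsystem communication failure. "
     "It may need to be replaced."),
    ({"off"}, {"off"}, {"off"},
     "Drive is unconfigured, an idle replacement, or an online spare."),
]


def interpret_scsi_leds(activity, online, fault):
    """Return the interpretation for one LED combination, or a fallback."""
    for activity_ok, online_ok, fault_ok, meaning in SCSI_LED_RULES:
        if activity in activity_ok and online in online_ok and fault in fault_ok:
            return meaning
    return "Combination not listed in the table."


print(interpret_scsi_leds("flashing", "flashing", "off"))
```

Combinations that the table does not list fall through to the fallback string rather than being guessed.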

SAS and SATA hard drive LED combinations

NOTE: Predictive failure alerts can occur only when the server is connected to a Smart Array controller.
Online/activity LED (green) | Fault/UID LED (amber/blue) | Interpretation
On, off, or flashing | Alternating amber and blue | The drive has failed, or a predictive failure alert has been received for this drive; it also has been selected by a management application.
On, off, or flashing | Steadily blue | The drive is operating normally, and it has been selected by a management application.
On | Amber, flashing regularly (1 Hz) | A predictive failure alert has been received for this drive. Replace the drive as soon as possible.
On | Off | The drive is online, but it is not active currently.
Flashing regularly (1 Hz) | Amber, flashing regularly (1 Hz) | Do not remove the drive. Removing a drive may terminate the current operation and cause data loss. The drive is part of an array that is undergoing capacity expansion or stripe migration, but a predictive failure alert has been received for this drive. To minimize the risk of data loss, do not replace the drive until the expansion or migration is complete.
Flashing regularly (1 Hz) | Off | Do not remove the drive. Removing a drive may terminate the current operation and cause data loss. The drive is rebuilding, or it is part of an array that is undergoing capacity expansion or stripe migration.
Flashing irregularly | Amber, flashing regularly (1 Hz) | The drive is active, but a predictive failure alert has been received for this drive. Replace the drive as soon as possible.
Flashing irregularly | Off | The drive is active, and it is operating normally.
Off | Steadily amber | A critical fault condition has been identified for this drive, and the controller has placed it offline. Replace the drive as soon as possible.
Off | Amber, flashing regularly (1 Hz) | A predictive failure alert has been received for this drive. Replace the drive as soon as possible.
Off | Off | The drive is offline, a spare, or not configured as part of an array.
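
The SAS and SATA combinations involve only two LEDs, so a direct dictionary lookup is enough. This is a hedged sketch: the function and dictionary names, the state strings, and the shortened interpretation texts are hypothetical and are not part of any HP utility.

```python
# Rows whose meaning depends only on the fault/UID LED; the green LED may be
# on, off, or flashing.
FAULT_ONLY = {
    "alternating amber/blue": "Drive has failed or a predictive failure alert "
                              "was received; it is also selected by a management application.",
    "steady blue": "Drive is operating normally and is selected by a management application.",
}

# Remaining rows, keyed by (online/activity LED, fault/UID LED).
SAS_SATA_LEDS = {
    ("on", "flashing amber"): "Predictive failure alert received. Replace the drive as soon as possible.",
    ("on", "off"): "Drive is online but not currently active.",
    ("flashing regularly", "flashing amber"): "Do not remove the drive. Expansion or migration is in "
                                              "progress and a predictive failure alert was received.",
    ("flashing regularly", "off"): "Do not remove the drive. Rebuild, capacity expansion, or stripe "
                                   "migration is in progress.",
    ("flashing irregularly", "flashing amber"): "Drive is active, but a predictive failure alert was "
                                                "received. Replace the drive as soon as possible.",
    ("flashing irregularly", "off"): "Drive is active and operating normally.",
    ("off", "steady amber"): "Critical fault; the controller has placed the drive offline. "
                             "Replace the drive as soon as possible.",
    ("off", "flashing amber"): "Predictive failure alert received. Replace the drive as soon as possible.",
    ("off", "off"): "Drive is offline, a spare, or not configured as part of an array.",
}


def interpret_sas_sata_leds(online, fault):
    """Look up one (green LED, amber/blue LED) combination."""
    if fault in FAULT_ONLY:
        return FAULT_ONLY[fault]
    return SAS_SATA_LEDS.get((online, fault), "Combination not listed in the table.")


print(interpret_sas_sata_leds("flashing regularly", "off"))
```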

Diagnostic flowcharts

In this section
Troubleshooting flowcharts

Troubleshooting flowcharts

To effectively troubleshoot a problem, HP recommends that you start with the first flowchart in this section, "Start diagnosis flowchart," and follow the appropriate diagnostic path. If the other flowcharts do not provide a troubleshooting solution, follow the diagnostic steps in "General diagnosis flowchart." The General diagnosis flowchart is a generic troubleshooting process to be used when the problem is not server-specific or is not easily categorized into the other flowcharts.
The available flowcharts include:
Start diagnosis flowchart
General diagnosis flowchart
Power-on problems flowchart
  Server power-on problems flowchart
  p-Class server blade power-on problems flowchart
  c-Class server blade power-on problems flowchart
POST problems flowchart
  Server and p-Class server blade POST problems flowchart
  c-Class server blade POST problems flowchart
Operating system boot problems flowchart
Server fault indications flowchart
  Server and p-Class server blade fault indications flowchart
  c-Class server blade fault indications flowchart
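
As a rough illustration of how the flowcharts relate to one another, the sketch below picks a flowchart from the symptoms listed in the sections that follow. It is not a transcription of the Start diagnosis flowchart. The function name, its boolean parameters, and the order of the checks are assumptions made for illustration.

```python
def choose_flowchart(powers_on, completes_post_cleanly, boots_os, fault_reported):
    """Pick a flowchart name from four yes/no observations (hypothetical triage)."""
    if not powers_on:
        # Symptom: the server does not power on.
        return "Power-on problems flowchart"
    if not completes_post_cleanly:
        # Symptom: POST does not complete, or completes with errors.
        return "POST problems flowchart"
    if not boots_os:
        # Symptom: a previously installed OS or SmartStart does not boot.
        return "Operating system boot problems flowchart"
    if fault_reported:
        # Symptom: the server boots, but a fault event or health LED is reported.
        return "Server fault indications flowchart"
    # Anything else is handled by the generic process.
    return "General diagnosis flowchart"


print(choose_flowchart(powers_on=True, completes_post_cleanly=True,
                       boots_os=False, fault_reported=False))
```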

Start diagnosis flowchart

Use the following flowchart to start the diagnostic process.

General diagnosis flowchart

The General diagnosis flowchart provides a generic approach to troubleshooting. If you are unsure of the problem, or if the other flowcharts do not fix the problem, use the following flowchart.

Power-on problems flowchart

Server power-on problems flowchart
Symptoms:
The server does not power on.
The system power LED is off or amber.
The external health LED is red or amber.
The internal health LED is red or amber.
NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.
Possible causes:
Improperly seated or faulty power supply
Loose or faulty power cord
Power source problem
Power-on circuit problem
Improperly seated component or interlock problem
Faulty internal component
p-Class server blade power-on problems flowchart
Symptoms:
The server does not power on.
The system power LED is off or amber.
The health LED is red or amber.
NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.
Possible causes:
Improperly seated or faulty power supply
Loose or faulty power cord
Power source problem
Power-on circuit problem
Improperly seated component or interlock problem
Faulty internal component
c-Class server blade power-on problems flowchart
Symptoms:
The server does not power on.
The system power LED is off or amber.
The health LED is red or amber.
NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.
Possible causes:
Improperly seated or faulty power supply
Loose or faulty power cord
Power source problem
Power-on circuit problem
Improperly seated component or interlock problem
Faulty internal component

POST problems flowchart

Symptoms:
Server does not complete POST
NOTE: The server has completed POST when the system attempts to access the boot device.
Server completes POST with errors
Possible problems:
Improperly seated or faulty internal component
Faulty KVM device
Faulty video device
Server and p-Class server blade POST problems flowchart
c-Class server blade POST problems flowchart

Operating system boot problems flowchart

Symptoms:
Server does not boot a previously installed OS
Server does not boot SmartStart
Possible causes:
Corrupted OS
Hard drive subsystem problem
Incorrect boot order setting in RBSU
There are two ways to use SmartStart when diagnosing OS boot problems on a server blade:
Use iLO to remotely attach virtual devices to mount the SmartStart CD onto the server blade.
Use a local I/O cable and drive to connect to the server blade, and then restart the server blade.

Server fault indications flowchart

Symptoms:
Server boots, but a fault event is reported by Insight Management Agents (on page 59)
Server boots, but the internal health LED, external health LED, or component health LED is red or amber
NOTE: For the location of server LEDs and information on their statuses, refer to the server documentation.
Possible causes:
Improperly seated or faulty internal or external component
Unsupported component installed
Redundancy failure
System overtemperature condition
Server and p-Class server blade fault indications flowchart