Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation in the U.S.
and other countries.
Intel, Pentium, and Itanium are trademarks of Intel Corporation in the U.S. and other
countries.
UNIX is a trademark of The Open Group in the U.S. and other countries.
Hewlett-Packard Company shall not be liable for technical or editorial errors or omissions
contained herein. The information in this document is provided “as is” without warranty of
any kind and is subject to change without notice. The warranties for HP products are set forth
in the express limited warranty statements accompanying such products. Nothing herein
should be construed as constituting an additional warranty.
HP Servers Troubleshooting Guide
January 2003 (Seventh Edition)
Part Number 161759-007
Contents
About This Guide
    Who Should Use This Guide
    How to Use This Guide
    Key Terms
    Symbols in Text
    Reader’s Comments
    HP Resources
Chapter 1
Diagnosing the Problem
    Developing a Troubleshooting Plan
    Preparing to Troubleshoot the Server
        Preparing the Server for Diagnosis
        Using a Troubleshooting Methodology
About This Guide
This guide provides troubleshooting information for ProLiant and TaskSmart servers.
For convenience, this guide includes a complete list of Power-On Self-Test (POST)
error messages, Diagnostics test error codes, Integrated Management Log (IML)
event list error messages, and Array Diagnostic Utility (ADU) error messages.
IMPORTANT: The chapters in this guide provide information for multiple servers. Some of the
hardware or software information covered may not apply to your specific server. You may
need to modify some of the examples or procedures in this guide for your work environment.
Refer to your server-specific user documentation for information on procedures, hardware
options, software tools, and operating systems supported by, and specific to, your server.
WARNING: To reduce the risk of personal injury or damage to the equipment,
refer to the user documentation supplied with the server and observe the
appropriate safety precautions.
Who Should Use This Guide
This guide is for two types of users:
• The novice user interested in learning troubleshooting methods such as how to
record what happened before a problem, procedures for troubleshooting, tools to
use for problem resolution, and general information to help you avoid future
problems
• The advanced user already familiar with troubleshooting techniques who is
interested in specific information to troubleshoot server problems
How to Use This Guide
To learn and use proper troubleshooting methods, follow the procedures described
throughout Chapter 1, which helps you isolate the problem and refers you to the part
of this guide containing the information necessary to solve the problem.
To immediately find help for the specific problem you are troubleshooting, refer to
“Locating Troubleshooting Information” in Chapter 1, which lists the location of
information in this guide.
Because this guide contains information covering multiple servers, refer to your
server-specific user documentation to find information about the system
specifications, switch settings, and status and LED indicators for your server.
Key Terms
• Boot—The process of initializing a server, beginning when the power switch is
pressed, including the running of self-tests, and concluding with the loading of
the operating system.
• Reboot—To restart a server by reloading the operating system.
• Power up—To apply power to the server by pressing the power switch. Powering
up a server is the first step of the boot process.
• Power down—To turn off a server by pressing the power switch or as required by
the operating system.
WARNING: Live circuits may still be present when the server is powered
down. To reduce the risk of injury or equipment damage, remove power
from the server by disconnecting all power cords from the power supplies.
• Server-specific user documentation—The set of documents that apply
specifically to a server, such as the setup and installation guide, maintenance and
service guide, and installation poster.
• Shut down—To completely remove all sources of power from a server.
• Server setup utility—A utility designed to set up and configure your server,
including ROM-Based Setup Utility (RBSU), System Configuration Utility
(SCU), and BIOS Setup Utility.
Symbols in Text
These symbols may be found in the text of this guide. They have the following
meanings.
WARNING: Text set off in this manner indicates that failure to follow directions
in the warning could result in bodily harm or loss of life.
CAUTION: Text set off in this manner indicates that failure to follow directions could
result in damage to equipment or loss of information.
IMPORTANT: Text set off in this manner presents essential information to explain a concept
or complete a task.
NOTE: Text set off in this manner presents additional information to emphasize or supplement
important points of the main text.
Reader’s Comments
HP welcomes your comments on this guide. Please send your comments and
suggestions by e-mail to
ServerDocumentation@hp.com.
HP Resources
For information on additional HP resources, refer to Appendix A, “HP Resources.”
Chapter 1
Diagnosing the Problem
This chapter covers the steps recommended when an error occurs. Working through
a structured set of tasks helps you isolate the problem quickly.
IMPORTANT: This guide provides information for multiple servers. Some of the hardware or
software information may not apply to your specific server. You may need to modify some of
the examples or procedures in this guide for your work environment. Refer to your
server-specific user documentation for information on procedures, hardware options, software
tools, and operating systems supported by, and specific to, your server.
The following sections are outlined in this chapter:
• Developing a Troubleshooting Plan
• Preparing to Troubleshoot the Server
• Gathering Information
• Locating Troubleshooting Information
• Contacting HP
Even if you are experienced in troubleshooting, consider skimming through this
chapter before using the remainder of the book and the documentation that shipped
with your server. Otherwise refer to “Locating Troubleshooting Information” in this
chapter, which points you to the appropriate section of this guide.
WARNING: To avoid potential problems, ALWAYS read the warnings and
cautionary information in your server-specific user documentation before
removing, replacing, reseating, or modifying system components.
Developing a Troubleshooting Plan
Evaluate all of the information and symptoms to:
• Identify the problem.
— Prepare the server for diagnosis and familiarize yourself with appropriate
troubleshooting methods using the following section, “Preparing to
Troubleshoot the Server.”
— Collect the facts related to the problem you want to troubleshoot using the
“Gathering Information” section later in this chapter.
— If the problem has not been identified after following the procedures in this
guide, refer to the “Contacting HP” section of this chapter.
• Plan your solution to each problem.
— Identify all steps necessary for implementation of each solution.
— Balance the time and cost required for implementing each solution against
the likelihood of resolving the problem.
— Gather the documentation that shipped with your server. Server-specific user
documentation is also located on the following website:
www.compaq.com/support/servers
Select your server, and then look in the Manuals section.
— Compile a master plan to be sure that you manipulate one variable at a time.
• Identify and collect all tools, such as a Torx screwdriver, electrostatic discharge
(ESD) wrist strap, and software utilities, necessary to troubleshoot the problem.
• Troubleshoot the problem using the information in this guide. Record each action
you take and list the results.
• Test your actions to be sure that the problem is truly resolved.
• Perform preventive steps to stop the problem from recurring. Refer to Chapter 6,
“Error Prevention,” for prevention information.
IMPORTANT: Familiarize yourself with the appropriate warnings for your server by referring
to your server-specific user documentation.
Preparing to Troubleshoot the Server
Before troubleshooting, follow the steps to prepare the server for diagnosis. Also,
read the proper troubleshooting procedures to increase troubleshooting effectiveness.
Preparing the Server for Diagnosis
Before troubleshooting the server:
1. Record any error messages displayed by the system.
2. Remove all diskettes and CDs from the media drives.
3. Power down the server and peripheral devices. Always perform an orderly server
shutdown if possible. This means that you must:
a. Exit the applications.
b. Exit the operating system.
c. Power down the server.
4. Disconnect any peripheral devices not required for testing (any devices not
necessary to power up the server). Do not disconnect the printer if you want to
use it to log error messages.
At this point, you can attempt to boot the server using the steps provided in your
server-specific user documentation to determine if the server is starting as it should.
First, however, read through the proper troubleshooting procedures in the “Using a
Troubleshooting Methodology” section.
Using a Troubleshooting Methodology
As you follow the troubleshooting steps in this guide and your server-specific user
documentation, use the methods described in Table 1-1. When troubleshooting, some
results are obvious, such as error messages or significant changes in functionality.
Other changes may not be as obvious, requiring you to check system logs for new
events recorded after the change was made.
After familiarizing yourself with these troubleshooting methods, follow the steps
outlined in “Gathering Information” in this chapter to troubleshoot your server.
Table 1-1: Troubleshooting Methodology

What to Check: What are the results of each troubleshooting step?
Troubleshooting Method: Look for and record new symptoms, such as error messages or
informational messages. Were the results logical, consistent, and expected?

What to Check: Did anything change? If so, what?
Troubleshooting Method: Check system logs. Look for any type of change, no matter how
insignificant.

What to Check: Was any functionality gained or diminished?
Troubleshooting Method: Look for functionality changes to judge the effectiveness of each
troubleshooting step.

What to Check: Were any errors made in implementing a step?
Troubleshooting Method: Look for and record any mistakes made while executing a step.

What to Check: Was more than one variable changed at a time?
Troubleshooting Method: To be sure that the specific cause of the problem is isolated, be
sure that during each step only one variable is changed at a time.

What to Check: Were any steps skipped or completed out of order?
Troubleshooting Method: Place checkmarks against the steps as they are executed, and
circle the steps not executed. Look for skipped steps or steps executed out of order.

What to Check: Were any steps accidentally added? Were any steps added intentionally to
complete or correct another step?
Troubleshooting Method: If steps had to be added in order to proceed, record why, and
note the preceding step.
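The checks in Table 1-1 can also be applied to a written log after the fact. The following is a minimal sketch in Python, not part of any HP utility: the StepRecord structure, its field names, and the review_log function are hypothetical names chosen for illustration. It assumes you number the steps of your plan and record each executed step by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepRecord:
    """One executed troubleshooting step (hypothetical structure)."""
    number: int                   # position of the step in the written plan
    variables_changed: List[str]  # what was altered during this step
    result: str                   # observed outcome, including new symptoms
    added_reason: str = ""        # filled in only for steps not in the plan


def review_log(planned_steps: int, log: List[StepRecord]) -> List[str]:
    """Return findings that mirror three of the checks in Table 1-1."""
    findings = []
    # Steps added during troubleshooting are tracked separately from the plan.
    executed = [r.number for r in log if not r.added_reason]

    # Was more than one variable changed at a time?
    for record in log:
        if len(record.variables_changed) > 1:
            findings.append(
                f"step {record.number}: more than one variable changed"
            )

    # Were any steps skipped or completed out of order?
    skipped = sorted(set(range(1, planned_steps + 1)) - set(executed))
    if skipped:
        findings.append(f"skipped steps: {skipped}")
    if executed != sorted(executed):
        findings.append("steps executed out of order")
    return findings
```

For example, a three-step plan in which step 2 was never run and step 3 changed both the memory and a PPM would produce two findings, one for the skipped step and one for the double change.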
Gathering Information
If you encounter a problem with your server, follow the guidelines in this section and
record your findings in a notebook. Having these details available reduces
troubleshooting time. This information also helps the authorized service provider to
diagnose and solve your problem if you need their assistance.
Preliminary Information
Before troubleshooting your specific server problem, collect the following
information:
• What events preceded the failure? After which steps does the problem occur?
• What has been changed between the time the server was working and now?
• Did you recently add or remove hardware or software? If so, did you remember
to change the appropriate settings in the server setup utility, if necessary?
• Was the server recently installed or moved?
• Has the server exhibited problem symptoms for a period of time?
• If the problem occurs randomly, what is the duration or frequency?
To answer these questions, the following information may be useful:
• Run the Survey Utility and compare what has changed (for servers running the
Microsoft Windows NT, Linux, or Novell NetWare operating system).
• Refer to your software and hardware records for information.
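Comparing "what has changed" amounts to a difference between two configuration snapshots. This guide does not describe the Survey Utility's output format, so the sketch below assumes you have transcribed two snapshots by hand into Python dictionaries. The function name diff_snapshots and the example keys are hypothetical.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Compare two configuration snapshots and report what changed."""
    # Items present only in the newer snapshot.
    added = {k: after[k] for k in after.keys() - before.keys()}
    # Items present only in the older snapshot.
    removed = {k: before[k] for k in before.keys() - after.keys()}
    # Items present in both snapshots whose values differ.
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken while the server worked and after it failed.
working = {"memory_mb": 1024, "nic_driver": "1.0", "array_controller": "slot 2"}
failing = {"memory_mb": 2048, "nic_driver": "1.0", "scsi_card": "slot 4"}
report = diff_snapshots(working, failing)
```

Each entry in the report is a candidate answer to the question of what changed between the time the server was working and now.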
After collecting this information, refer to the appropriate section in this chapter:
• When the Server Does Not Start
• When the Self-Tests Fail
• When the Operating System Does Not Load
When the Server Does Not Start
The following visual and audio clues indicate that the server is not starting:
• The LEDs are off.
• The fans are not spinning.
• Something seems, looks, or sounds wrong or different.
• There is physical damage to the system.
• Something is cool that should be warm.
• There are frayed cables.
• The system does not follow the normal power-up sequence, as described in your
server-specific user documentation.
When a ProLiant ML, ProLiant DL, TaskSmart, or Previously Released Server
Does Not Start
Use the information in Table 1-2 to troubleshoot problems with a ProLiant ML,
ProLiant DL, TaskSmart, or previously released server.
Table 1-2: When the Server Does Not Start

What to Check: Check for connection problems:
• Is the server power cord plugged into a working grounded (earthed) AC outlet?
• Has the Power On/Standby switch been firmly pressed?
• Are there unconnected or loose plugs or cables?
• Are any connections loose or improperly seated?
What to Do: Refer to “Loose Connections” in Chapter 2.

What to Check: Check for incorrect system settings:
• Are switches set correctly?
What to Do: Refer to your server-specific user documentation to verify switch settings.

What to Check: Check for faulty power delivery:
• Is the power cord working?
• Is the power strip working?
• Is the power outlet working, and at the correct voltage level?
What to Do: Refer to “Power Source” in Chapter 2.

What to Check: Check for power supply problems:
• Is each power supply fan spinning?
• Are the power supplies’ LEDs indicating that each power supply is working?
• Have you recently added hardware which might be overburdening the power supplies?
• Is the uninterruptible power supply (UPS) starting and working correctly?
What to Do: Refer to:
• “Power Supply” in Chapter 2
• “Uninterruptible Power Supply” in Chapter 2
• Your server-specific user documentation for more information on LEDs

What to Check: Check for a system short circuit:
• Is the power status LED blinking intermittently, turning amber, or staying off?
What to Do: Refer to:
• “System Short Circuit” in Chapter 2
• Your server-specific user documentation for more information on LEDs

What to Check: Check for Processor Power Module (PPM) problems:
• Has a PPM failed and forced the server into a reset condition?
What to Do: Refer to “Processor Power Modules” in Chapter 2.

What to Check: Check for automatic server recovery-2 (ASR-2) reboot:
• Is your server rebooting repeatedly?
What to Do: Be sure that the server is not rebooting due to a problem that initiates an
ASR-2 reboot. Refer to “Automatic Server Recovery-2” in Chapter 5 for more information.
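Table 1-2 is effectively a lookup from a category of observed problem to the parts of this guide that cover it. The sketch below shows one way to keep that lookup at hand in Python. The category labels (such as "power_supply") and the function name references_for are hypothetical; the reference text is transcribed from the table.

```python
# Problem categories and references transcribed from Table 1-2, in table order.
# The short labels on the left are hypothetical names chosen for this sketch.
TABLE_1_2 = [
    ("connection", ["Loose Connections (Chapter 2)"]),
    ("system_settings", ["server-specific user documentation (switch settings)"]),
    ("power_delivery", ["Power Source (Chapter 2)"]),
    ("power_supply", [
        "Power Supply (Chapter 2)",
        "Uninterruptible Power Supply (Chapter 2)",
        "server-specific user documentation (LEDs)",
    ]),
    ("short_circuit", [
        "System Short Circuit (Chapter 2)",
        "server-specific user documentation (LEDs)",
    ]),
    ("ppm", ["Processor Power Modules (Chapter 2)"]),
    ("asr2_reboot", ["Automatic Server Recovery-2 (Chapter 5)"]),
]


def references_for(problems):
    """Return the references for each observed problem, in table order."""
    unknown = set(problems) - {label for label, _ in TABLE_1_2}
    if unknown:
        raise ValueError(f"unknown category: {sorted(unknown)}")
    refs = []
    for label, sources in TABLE_1_2:
        if label in problems:
            for source in sources:
                if source not in refs:  # list each reference only once
                    refs.append(source)
    return refs
```

Calling references_for({"power_supply", "short_circuit"}) returns the three power supply references followed by the short circuit section, with the shared LED documentation listed once.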
When a ProLiant BL Server Does Not Start
Use the information in Table 1-3 to troubleshoot problems with a ProLiant BL server.
Table 1-3: When a ProLiant BL Server Does Not Start

Check the enclosure(s):

What to Check: Check for connection problems:
• Are all power cords properly connected throughout the system? Are there unconnected
or loose plugs or cables?
What to Do: Refer to:
• “Loose Connections” in Chapter 2
• Your server-specific user documentation for more information about the cabling
necessary for enclosures

What to Check: Check for power delivery problems:
• Are all power cords working?
• Is the power outlet working, and at the correct voltage level?
• If applicable to your system, are the circuit breakers set in their appropriate
positions?
• Do the LEDs on the system indicate that power delivery is working?
What to Do: Refer to:
• “Power Source” in Chapter 2
• Your server-specific user documentation for LED information

What to Check: Check for power supply problems:
• Is each power supply fan spinning?
• Do the power supplies’ LEDs indicate that each power supply is working?
• Have you recently added hardware which might be overburdening the power supplies?
• If you have an uninterruptible power supply (UPS), is it starting and working
correctly?
What to Do: Refer to:
• “Power Supply” in Chapter 2
• Your server-specific user documentation for more information on LEDs
Refer to “Uninterruptible Power Supply” in Chapter 2.

What to Check: Check for a system short circuit:
• Is the power status LED blinking intermittently, turning amber, or staying off?
What to Do: Refer to “System Short Circuit” in Chapter 2.

What to Check: If your system supports the Integrated Administrator:
• Is the Integrated Administrator rebooting repeatedly?
What to Do: Be sure that the server is not rebooting due to a problem that initiates an
enclosure self recovery (ESR) reboot. Refer to the ProLiant BLe-Series Integrated
Administrator User Guide for more information.

Check each server blade:

What to Check: Check for power delivery problems:
• Do all appropriate LEDs indicate that the server blade is receiving power?
• If applicable to your system, has the server blade power button been firmly pressed?
What to Do: Refer to your server-specific user documentation for LED information.

What to Check: Check for connection problems:
• Is the server blade seated properly in the enclosure?
• Are any connections loose or improperly seated? Are there unconnected or loose plugs
or cables?
What to Do: Refer to “Loose Connections” in Chapter 2.

What to Check: Check for incorrect system settings:
• Are switches set correctly?
What to Do: Refer to your server-specific user documentation to verify switch settings.

What to Check: If applicable, check for Processor Power Module (PPM) problems:
• Has a PPM failed and forced the server blade into a reset condition?
What to Do: Refer to “Processor Power Modules” in Chapter 2.

What to Check: Check for memory problems:
• Is memory working and properly seated?
• Is memory set up correctly for your server?
What to Do: Refer to “Memory” in Chapter 2.

What to Check: If applicable, check for automatic server recovery-2 (ASR-2) reboot:
• Is your server rebooting repeatedly?
What to Do: Be sure that there is not a problem that is initiating an ASR-2 reboot. Refer
to “Automatic Server Recovery-2” in Chapter 5.
When the Self-Tests Fail
This section provides steps to follow if the system starts, but fails to complete the
self-tests without error. The following visual and audio clues indicate that the system
is not completing the self-tests:
• The system begins to boot, then suddenly shuts down.
• The system keeps restarting.
• Random errors are occurring during the boot process.
• There are intermittent problems during the boot process.
• Error messages are appearing on the screen.
• Your server has an Integrated Management Display (IMD), but the IMD does not
display a twirling baton and checkmarks during POST, or the twirling baton
appears but continues twirling for an excessive amount of time.
Table 1-4: When the Self-Tests Fail

Check for failure information:

What to Check: Are there error messages such as:
• Power-On Self-Test (POST) messages?
• Stop/Abend/Trap?
• Integrated Management Log (IML) messages?
• Insight Manager 7 (or previous version) detail?
What to Do: Record the full error message. Refer to:
• For POST messages: Appendix C, “POST Error Messages”
• For Stop/Abend/Trap: Chapter 3, “Software Problems,” and Chapter 2, “Hardware
Problems”
• For IML messages: “Integrated Management Log” in Chapter 4
• For Insight Manager detail: “Server Management” in Chapter 4

Check the system configuration:

What to Check: Are all required switch settings set correctly?
What to Do: Refer to your server-specific user documentation.

What to Check: What is the system configuration for the:
• Memory
• Processors; check speed, type, and location
• Cache memory
• Controllers
• Shadow RAM
• Free space on the hard drive
What to Do: Run the Inspect Utility. Refer to:
• “Inspect Utility” in Chapter 4
• Your system configuration settings; check the server setup utility

What to Check: Check component information, such as:
• IRQ settings
• I/O address
• Direct memory access (DMA) channels
• Connector type
What to Do: Run the Survey Utility (for servers running the Windows NT, Linux, or NetWare
operating system) and Insight Manager. Refer to:
• “Survey Utility” in Chapter 4
• “Server Management” in Chapter 4

What to Check: Do you have third-party devices that may have caused a conflict?
What to Do: Refer to:
• “Third-Party Devices” in Chapter 2
• Your third-party documentation

Check for system failures:

What to Check:
• Be sure that all expansion boards, drives, and processors are firmly seated and that
all latches are firmly closed.
• Be sure that all system cables are properly connected and not damaged.
What to Do: Refer to “Loose Connections” in Chapter 2.

What to Check: If your server includes one or more Processor Power Modules (PPMs), check
each PPM.
What to Do: Refer to “Processor Power Modules” in Chapter 2 for information on testing
PPMs.

What to Check: Be sure that there are no processor problems.
What to Do: Refer to “Processors” in Chapter 2.

What to Check: Be sure that there are no memory problems.
What to Do: Refer to “Memory” in Chapter 2.

What to Check: For configure-to-order servers:
• Check the initial factory-installed configuration.
• Note any changes that have been made to the original system.
• Note configuration changes made before or after completing the operating system
installation.
What to Do: Refer to your server-specific user documentation.
When the Operating System Does Not Load
This section provides steps to follow if the server starts and completes the self-tests
without error, but encounters errors while loading the operating system. Make note of
the following information before following the steps in this section:
• What operating system version is installed?
• Was the operating system factory installed?
• Has the operating system ever started?
• What version of the diagnostic utilities is installed?
• If applicable, what file system is used (example for Windows NT: NTFS, FAT)?
• In addition to the operating system, what other software has been added?
Table 1-5: When the Operating System Does Not Load

Check for any errors detected by the system:

What to Check: For Microsoft Windows NT users, were there any errors in the event log?
What to Do: Refer to the Microsoft Windows NT user documentation.

What to Check: Check for any Survey errors.
What to Do: Run the Survey Utility (for servers running Windows NT, Linux, or NetWare
operating systems). Refer to:
• “Survey Utility” in Chapter 4
• “List of Events” in Chapter 4

What to Check: Were there any test errors?
What to Do: Run Diagnostics. Refer to:
• “Diagnostics” in Chapter 4
• Appendix B, “Test Error Codes”

Check for any incorrect, conflicting, or out-of-date software versions:

What to Check: Are you running the latest ROM version?
What to Do: Refer to your server-specific user documentation.

What to Check: What version of the diagnostic utilities is installed?
What to Do: Refer to “Diagnostics” in Chapter 4.

What to Check: If the problem is with a particular device, what version of the driver is
installed?
What to Do: Refer to:
• Your device-specific user documentation
• The section for the specific device in Chapter 2

What to Check: Is the Insight Manager console software version different from the
Management Agents version?
What to Do: Refer to the user documentation on the Management CD.

What to Check: If your server uses the Rapid Deployment Pack, has the system been
configured correctly with this software?
What to Do: Refer to:
• Your server-specific user documentation
• The documentation that ships with the Rapid Deployment Pack

If your server uses EFI boot manager, check the EFI settings:

What to Check: Is the operating system configured as the default operating system in EFI
boot manager?
What to Do: Refer to your server-specific user documentation for more information.

Check the utilization rate/traffic:

What to Check:
• Is the utilization rate/traffic shown in Insight Manager 7 (or previous version)
appropriate?
• How does the current utilization differ from the historical?
What to Do: Refer to the utilization information provided by your third-party tools.
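The question of how current utilization differs from the historical can be answered with simple arithmetic once you have the readings. This guide does not specify how Insight Manager or third-party tools export utilization data, so the following Python sketch assumes you have copied the readings by hand as percentages. The function name compare_utilization and the sample values are hypothetical.

```python
from statistics import mean


def compare_utilization(history, current):
    """Compare a current utilization reading with the historical average."""
    if not history:
        raise ValueError("history must contain at least one reading")
    baseline = mean(history)
    delta = current - baseline
    # A relative change is undefined when the baseline is zero.
    percent_change = (delta / baseline * 100) if baseline else None
    return {
        "baseline": baseline,
        "delta": delta,
        "percent_change": percent_change,
    }


# Hypothetical readings: three past utilization values and one current value.
summary = compare_utilization([40, 50, 60], 75)
```

Here the baseline is 50, so a current reading of 75 is 25 points, or 50 percent, above the historical average. Whether that difference is appropriate for your server is a judgment this sketch does not make.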