Problem Determination and Service
Guide for the
IBM Power PS700 (8406-70Y)
GI11-9831-00
Power Systems
Note
Before using this information and the product it supports, read the information in “Notices,” on page 271, “Safety notices”
on page v, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125–5823.
This edition applies to IBM Power Systems servers that contain the POWER7 processor and to all associated
models.
Installing the blade server in a BladeCenter unit ..... 236
Removing and replacing Tier 1 CRUs ..... 237
Removing the blade server cover ..... 237
Installing and closing the blade server cover ..... 239
Removing the bezel assembly ..... 240
Installing the bezel assembly ..... 240
Removing a drive ..... 241
Installing a drive ..... 242
Removing a memory module ..... 244
Installing a memory module ..... 245
Removing and installing an I/O expansion card ..... 246
Removing a CIOv form-factor expansion card ..... 247
Installing a CIOv form-factor expansion card ..... 247
Removing a combination-form-factor expansion card ..... 249
Installing a combination-form-factor expansion card ..... 249
Removing the battery ..... 250
Installing the battery ..... 251
Removing the disk drive tray ..... 252
Installing the disk drive tray ..... 253
Removing the tier 2 management card ..... 255
Installing the tier 2 management card ..... 256
Obtaining a PowerVM Virtualization Engine system technologies activation code ..... 257
Replacing the FRU system-board and chassis assembly ..... 260
Chapter 5. Configuring ..... 263
Updating the firmware ..... 263
Configuring the blade server ..... 264
Using the SMS utility ..... 265
Starting the SMS utility ..... 265
SMS utility menu choices ..... 265
Creating a CE login ..... 265
Configuring the Gigabit Ethernet controllers ..... 266
Blade server Ethernet controller enumeration ..... 267
MAC addresses for host Ethernet adapters ..... 267
Configuring a RAID array ..... 268
Updating IBM Director ..... 268
Appendix. Notices ..... 271
Trademarks ..... 272
Electronic emission notices ..... 273
Class A Notices ..... 273
Class B Notices ..... 277
Terms and conditions ..... 280
Safety notices
Safety notices may be printed throughout this guide:
v DANGER notices call attention to a situation that is potentially lethal or extremely hazardous to
people.
v CAUTION notices call attention to a situation that is potentially hazardous to people because of some
existing condition.
v Attention notices call attention to the possibility of damage to a program, device, system, or data.
World Trade safety information
Several countries require the safety information contained in product publications to be presented in their
national languages. If this requirement applies to your country, a safety information booklet is included
in the publications package shipped with the product. The booklet contains the safety information in
your national language with references to the U.S. English source. Before using a U.S. English publication
to install, operate, or service this product, you must first become familiar with the related safety
information in the booklet. You should also refer to the booklet any time you do not clearly understand
any safety information in the U.S. English publications.
German safety information
Das Produkt ist nicht für den Einsatz an Bildschirmarbeitsplätzen im Sinne § 2 der
Bildschirmarbeitsverordnung geeignet.
Laser safety information
IBM® servers can use I/O cards or features that are fiber-optic based and that utilize lasers or LEDs.
Laser compliance
IBM servers may be installed inside or outside of an IT equipment rack.
When working on or around the system, observe the following precautions:
Electrical voltage and current from power, telephone, and communication cables are hazardous. To
avoid a shock hazard:
v Connect power to this unit only with the IBM provided power cord. Do not use the IBM
provided power cord for any other product.
v Do not open or service any power supply assembly.
v Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration
of this product during an electrical storm.
v The product might be equipped with multiple power cords. To remove all hazardous voltages,
disconnect all power cords.
v Connect all power cords to a properly wired and grounded electrical outlet. Ensure that the outlet
supplies proper voltage and phase rotation according to the system rating plate.
v Connect any equipment that will be attached to this product to properly wired outlets.
v When possible, use one hand only to connect or disconnect signal cables.
v Never turn on any equipment when there is evidence of fire, water, or structural damage.
v Disconnect the attached power cords, telecommunications systems, networks, and modems before
you open the device covers, unless instructed otherwise in the installation and configuration
procedures.
v Connect and disconnect cables as described in the following procedures when installing, moving,
or opening covers on this product or attached devices.
To Disconnect:
1. Turn off everything (unless instructed otherwise).
2. Remove the power cords from the outlets.
3. Remove the signal cables from the connectors.
4. Remove all cables from the devices.
To Connect:
1. Turn off everything (unless instructed otherwise).
2. Attach all cables to the devices.
3. Attach the signal cables to the connectors.
4. Attach the power cords to the outlets.
5. Turn on the devices.
(D005)
DANGER
Observe the following precautions when working on or around your IT rack system:
v Heavy equipment–personal injury or equipment damage might result if mishandled.
v Always lower the leveling pads on the rack cabinet.
v Always install stabilizer brackets on the rack cabinet.
v To avoid hazardous conditions due to uneven mechanical loading, always install the heaviest
devices in the bottom of the rack cabinet. Always install servers and optional devices starting
from the bottom of the rack cabinet.
v Rack-mounted devices are not to be used as shelves or work spaces. Do not place objects on top
of rack-mounted devices.
v Each rack cabinet might have more than one power cord. Be sure to disconnect all power cords in
the rack cabinet when directed to disconnect power during servicing.
v Connect all devices installed in a rack cabinet to power devices installed in the same rack
cabinet. Do not plug a power cord from a device installed in one rack cabinet into a power
device installed in a different rack cabinet.
v An electrical outlet that is not correctly wired could place hazardous voltage on the metal parts of
the system or the devices that attach to the system. It is the responsibility of the customer to
ensure that the outlet is correctly wired and grounded to prevent an electrical shock.
CAUTION
v Do not install a unit in a rack where the internal rack ambient temperatures will exceed the
manufacturer's recommended ambient temperature for all your rack-mounted devices.
v Do not install a unit in a rack where the air flow is compromised. Ensure that air flow is not
blocked or reduced on any side, front, or back of a unit used for air flow through the unit.
v Consideration should be given to the connection of the equipment to the supply circuit so that
overloading of the circuits does not compromise the supply wiring or overcurrent protection. To
provide the correct power connection to a rack, refer to the rating labels located on the
equipment in the rack to determine the total power requirement of the supply circuit.
v (For sliding drawers.) Do not pull out or install any drawer or feature if the rack stabilizer brackets
are not attached to the rack. Do not pull out more than one drawer at a time. The rack might
become unstable if you pull out more than one drawer at a time.
v (For fixed drawers.) This drawer is a fixed drawer and must not be moved for servicing unless
specified by the manufacturer. Attempting to move the drawer partially or completely out of the
rack might cause the rack to become unstable or cause the drawer to fall out of the rack.
(R001)
CAUTION:
Removing components from the upper positions in the rack cabinet improves rack stability during
relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a
room or building:
v Reduce the weight of the rack cabinet by removing equipment starting at the top of the rack
cabinet. When possible, restore the rack cabinet to the configuration of the rack cabinet as you
received it. If this configuration is not known, you must observe the following precautions:
– Remove all devices in the 32U position and above.
– Ensure that the heaviest devices are installed in the bottom of the rack cabinet.
– Ensure that there are no empty U-levels between devices installed in the rack cabinet below the
32U level.
v If the rack cabinet you are relocating is part of a suite of rack cabinets, detach the rack cabinet from
the suite.
v Inspect the route that you plan to take to eliminate potential hazards.
v Verify that the route that you choose can support the weight of the loaded rack cabinet. Refer to the
documentation that comes with your rack cabinet for the weight of a loaded rack cabinet.
v Verify that all door openings are at least 760 x 2030 mm (30 x 80 in.).
v Ensure that all devices, shelves, drawers, doors, and cables are secure.
v Ensure that the four leveling pads are raised to their highest position.
v Ensure that there is no stabilizer bracket installed on the rack cabinet during movement.
v Do not use a ramp inclined at more than 10 degrees.
v When the rack cabinet is in the new location, complete the following steps:
– Lower the four leveling pads.
– Install stabilizer brackets on the rack cabinet.
– If you removed any devices from the rack cabinet, repopulate the rack cabinet from the lowest
position to the highest position.
v If a long-distance relocation is required, restore the rack cabinet to the configuration of the rack
cabinet as you received it. Pack the rack cabinet in the original packaging material, or equivalent.
Also lower the leveling pads to raise the casters off of the pallet and bolt the rack cabinet to the
pallet.
(R002)
[Figures: safety labels (L001), (L002), and (L003)]
All lasers are certified in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for class
1 laser products. Outside the U.S., they are certified to be in compliance with IEC 60825 as a class 1 laser
product. Consult the label on each part for laser certification numbers and approval information.
CAUTION:
This product might contain one or more of the following devices: CD-ROM drive, DVD-ROM drive,
DVD-RAM drive, or laser module, which are Class 1 laser products. Note the following information:
v Do not remove the covers. Removing the covers of the laser product could result in exposure to
hazardous laser radiation. There are no serviceable parts inside the device.
v Use of the controls or adjustments or performance of procedures other than those specified herein
might result in hazardous radiation exposure.
(C026)
CAUTION:
Data processing environments can contain equipment transmitting on system links with laser modules
that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical
fiber cable or open receptacle. (C027)
CAUTION:
This product contains a Class 1M laser. Do not view directly with optical instruments. (C028)
CAUTION:
Some laser products contain an embedded Class 3A or Class 3B laser diode. Note the following
information: laser radiation when open. Do not stare into the beam, do not view directly with optical
instruments, and avoid direct exposure to the beam. (C030)
Power and cabling information for NEBS (Network Equipment-Building System) GR-1089-CORE
The following comments apply to the IBM servers that have been designated as conforming to NEBS
(Network Equipment-Building System) GR-1089-CORE:
The equipment is suitable for installation in the following:
v Network telecommunications facilities
v Locations where the NEC (National Electrical Code) applies
The intrabuilding ports of this equipment are suitable for connection to intrabuilding or unexposed
wiring or cabling only. The intrabuilding ports of this equipment must not be metallically connected to the
interfaces that connect to the OSP (outside plant) or its wiring. These interfaces are designed for use as
intrabuilding interfaces only (Type 2 or Type 4 ports as described in GR-1089-CORE) and require isolation
from the exposed OSP cabling. The addition of primary protectors is not sufficient protection to connect
these interfaces metallically to OSP wiring.
Note: All Ethernet cables must be shielded and grounded at both ends.
The ac-powered system does not require the use of an external surge protection device (SPD).
The dc-powered system employs an isolated DC return (DC-I) design. The DC battery return terminal
shall not be connected to the chassis or frame ground.
Chapter 1. Introduction
This problem determination and service information helps you solve problems that might occur in your
PS700 blade server. The information describes the diagnostic tools that come with the blade server, error
codes and suggested actions, and instructions for replacing failing components.
Replaceable components are of three types:
v Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM
installs a Tier 1 CRU at your request, you are charged for the installation.
v Tier 2 customer replaceable unit: You can install a Tier 2 CRU yourself or request IBM to install it, at
no additional charge, under the type of warranty service that is designated for your blade server.
v Field replaceable unit (FRU): FRUs must be installed only by trained service technicians.
The serial number for the PS700 blade server can be found in the following locations:
v The bottom front of the blade server in the right corner on the 1S label.
v The bottom rear of the blade server in the right corner.
v Under the front cover door.
For information about the terms of the warranty and getting service and assistance, see the information
center or the Warranty and Support Information document on the IBM BladeCenter® Documentation CD.
Related documentation
Documentation for the PS700 blade server includes documents in Portable Document Format (PDF) on
the IBM BladeCenter Documentation CD and the online information center.
The most recent version of all BladeCenter documentation is in the BladeCenter information center.
The BladeCenter information center is available online at
http://publib.boulder.ibm.com/infocenter/bladectr/documentation/index.jsp.
PDF versions of the following documents are on the IBM BladeCenter Documentation CD and in the online
information center:
v Installation and User's Guide
This document contains general information about the blade server, including how to install supported
options and how to configure the blade server.
v Safety Information
This document contains translated caution and danger statements. Each caution and danger statement
that appears in the documentation has a number that you can use to locate the corresponding
statement in your language in the Safety Information document.
v Warranty and Support Information
This document contains information about the terms of the warranty and about getting service and
assistance.
Additional documents might be included in the online information center and on the IBM BladeCenter
Documentation CD.
The blade server might have features that are not described in the documentation that comes with the
blade server. Occasional updates to the documentation might include information about those features, or
technical updates might be available to provide additional information that is not included in the
documentation that comes with the blade server.
Review the online information or the Planning Guide and the Installation Guide for your IBM BladeCenter
unit. The information can help you prepare for system installation and configuration. The most current
version of each document is available in the BladeCenter information center.
Notices and statements
The caution and danger statements in this document are also in the multilingual Safety Information. Each
statement is numbered for reference to the corresponding statement in your language in the Safety
Information document.
The following notices and statements are used in this document:
v Note: These notices provide important tips, guidance, or advice.
v Important: These notices provide information or advice that might help you avoid inconvenient or
problem situations.
v Attention: These notices indicate potential damage to programs, devices, or data. An attention notice is
placed just before the instruction or situation in which damage might occur.
v Caution: These statements indicate situations that can be potentially hazardous to you. A caution
statement is placed just before the description of a potentially hazardous procedure step or situation.
v Danger: These statements indicate situations that can be potentially lethal or extremely hazardous to
you. A danger statement is placed just before the description of a potentially lethal or extremely
hazardous procedure step or situation.
Features and specifications
Features and specifications of the IBM BladeCenter PS700 blade server are summarized in this overview.
The PS700 Type 8406 is a single-wide (non-expandable) blade server. The PS700 blade server is used in an
IBM BladeCenter H (8852 and 7989), BladeCenter HT (8740 and 8750), or BladeCenter S (8886 and 7779)
chassis unit.
Notes:
v Power, cooling, removable-media drives, external ports, and advanced system management are
provided by the BladeCenter unit.
v The operating system in the blade server must provide support for the Universal Serial Bus (USB), to
enable the blade server to recognize and communicate internally with the removable-media drives and
front-panel USB ports.
Core electronics:
v 64-bit POWER7 processors (12S technology)
v Four-core, single-socket (4-way) processors at 3.0 GHz
v 64 GB maximum in 8 very low profile (VLP) DIMM slots; supports 4 GB DDR3 at 1066 MHz and
8 GB DDR3 at 800 MHz
v P5IOC2 I/O hub
On-board, integrated features:
v Two 1 Gb Ethernet ports (HEA) (two on each side)
v SAS controller
v USB 2.0
v 1 Serial over LAN (SOL) console using FSP
FSP1 service processor (IPMI and SOL):
v The baseboard management controller (BMC) is a flexible service processor (FSP1) with
Intelligent Platform Management Interface (IPMI), Serial over LAN (SOL), and Wake on LAN (WOL)
firmware support.
Local storage:
v First DASD bay: zero or one 2.5" SAS HDD
v Second DASD bay: zero or one 2.5" SAS HDD
v SAS HDDs are 300 GB and 600 GB
v Hardware mirroring
Daughter card I/O options:
v 1 1Xe expansion card (CIOv)
v SAS pass-through using 1Xe
v 1 High-Speed expansion card (CFFh)
Integrated functions:
v RS-485 interface for communication with the management module
v Automatic server restart (ASR)
v SOL through FSP
v Two Universal Serial Bus (USB 2.0) buses on base planar for communication with
removable-media drives
v Optical media available by shared chassis feature
Environment:
v Air temperature:
– Blade server on: 10° to 35°C (50° to 95°F). Altitude: 0 to 914 m (3000 ft)
– Blade server on: 10° to 32°C (50° to 90°F). Altitude: 914 m to 2133 m (3000 ft to 7000 ft)
– Blade server off: -40° to 60°C (-40° to 140°F)
v Humidity:
– Blade server on: 8% to 80%
– Blade server off: 8% to 80%
PS700 size:
v Height: 24.5 cm (9.7 inches)
v Depth: 44.6 cm (17.6 inches)
v Width: 30 mm (1.14 inches)
Systems management:
v Supported by BladeCenter chassis management module
v Front panel LEDs
v IBM Director
v Hardware Management Console (HMC)
v Integrated Virtualization Manager (IVM)
v Energy Scale thermal management for power management/oversubscription (throttling) and
environmental sensing
v Active Energy Manager
Clusters support for:
v IBM Director
v xCAT
Virtualization support for:
v PowerVM® Standard Edition hardware feature, which provides the Integrated Virtualization Manager,
Virtual I/O Server, and Director Power Systems™ Manager (DPSM).
Reliability and service features:
v Dual alternating current power supply
v BladeCenter chassis redundant and hot plug power and cooling modules
v Boot-time processor deallocation
v Blade server hot plug
v Customer setup and expansion
v Automatic reboot on power loss
v Internal and ambient temperature monitors
v ECC, chipkill memory
v System management alerts
Electrical input: 12 V dc
See the ServerProven Web site for information about supported operating-system versions and all PS700
blade server optional devices.
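The environmental limits listed above can be expressed as a small range check, which may be convenient when comparing sensor readings against the specification. The following Python sketch is illustrative only: the function name, the table layout, and the choice to treat exactly 914 m as part of the lower altitude band are assumptions, not part of the IBM documentation.

```python
# Limits transcribed from the "Environment" entries in the specifications.
# Each powered-on band: (min altitude m, max altitude m, min temp C, max temp C).
POWERED_ON_BANDS = [
    (0, 914, 10.0, 35.0),
    (914, 2133, 10.0, 32.0),
]
POWERED_OFF_TEMP_C = (-40.0, 60.0)
HUMIDITY_PCT = (8.0, 80.0)  # same range whether the blade server is on or off


def within_limits(temp_c, altitude_m, humidity_pct, powered_on=True):
    """Return True if the readings fall inside the published limits.

    Hypothetical helper. The specification gives no altitude limit for a
    powered-off blade server, so altitude is ignored in that case.
    """
    if not HUMIDITY_PCT[0] <= humidity_pct <= HUMIDITY_PCT[1]:
        return False
    if not powered_on:
        low, high = POWERED_OFF_TEMP_C
        return low <= temp_c <= high
    for alt_low, alt_high, t_low, t_high in POWERED_ON_BANDS:
        # The first matching band wins, so 914 m uses the 35 C limit (assumption).
        if alt_low <= altitude_m <= alt_high:
            return t_low <= temp_c <= t_high
    return False  # altitude is outside every listed band


if __name__ == "__main__":
    print(within_limits(34.0, 500, 50.0))   # inside the 0 to 914 m band
    print(within_limits(34.0, 1500, 50.0))  # too warm above 914 m
```

The check is deliberately conservative: an altitude above 2133 m is reported as out of range because the specification does not cover it.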
Supported DIMMs
Each planar in the PS700 blade server contains eight very low profile (VLP) memory connectors for
registered dual inline memory modules (RDIMMs). The maximum size for a single DIMM is 8 GB. Total
memory capacity for the PS700 ranges from a minimum of 4 GB to a maximum of 64 GB.
See Chapter 3, “Parts listing, Type 8406,” on page 229 for memory modules that you can order from IBM.
Memory module rules:
v Install DIMM fillers in unused DIMM slots for proper cooling.
v Install DIMMs in pairs (1 and 3, 6 and 8, 2 and 4, 5 and 7).
v Both DIMMs in a pair must be the same size, speed, type, and technology. You can mix compatible
DIMMs from different manufacturers.
v Each DIMM within a processor-support group (1-4 and 5-8) must be the same size and speed.
v Install only supported DIMMs, as described on the ServerProven® Web site. See http://www.ibm.com/
servers/eserver/serverproven/compat/us/.
v Installing or removing DIMMs changes the configuration of the blade server. After you install or
remove a DIMM, the blade server is automatically re-configured, and the new configuration
information is stored.
v See “System-board connectors” on page 8 for DIMM connector locations.
Table 1 shows allowable placement of DIMMs:
Table 1. Memory module combinations
DIMM     PS700 base blade planar (P1) DIMM slots
count    1    2    3    4    5    6    7    8
2        X         X
4        X         X              X         X
6        X    X    X    X         X         X
8        X    X    X    X    X    X    X    X
Figure 1. DIMM connectors. Base unit connectors
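The memory module rules above lend themselves to a mechanical check before DIMMs are installed. The following Python sketch is a minimal illustration under stated assumptions: the function names and the data layout (slot number mapped to size and speed) are hypothetical, only size and speed are modeled (not DIMM type or technology), and ServerProven support is not verified.

```python
from typing import Dict, List, Tuple

Dimm = Tuple[int, int]  # (size in GB, speed in MHz); hypothetical representation

PAIRS = [(1, 3), (6, 8), (2, 4), (5, 7)]   # pairs named in the rules
GROUPS = [(1, 2, 3, 4), (5, 6, 7, 8)]      # processor-support groups
MAX_DIMM_GB = 8                            # largest single DIMM


def check_dimm_layout(slots: Dict[int, Dimm]) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    problems: List[str] = []
    for slot, (size_gb, _speed) in sorted(slots.items()):
        if slot not in range(1, 9):
            problems.append(f"slot {slot} does not exist (valid slots are 1-8)")
        elif size_gb > MAX_DIMM_GB:
            problems.append(f"slot {slot}: {size_gb} GB exceeds the 8 GB maximum")
    for a, b in PAIRS:
        if (a in slots) != (b in slots):
            problems.append(f"slots {a} and {b} must be populated together")
        elif a in slots and slots[a] != slots[b]:
            problems.append(f"slots {a} and {b} must hold matching DIMMs")
    for group in GROUPS:
        kinds = {slots[s] for s in group if s in slots}
        if len(kinds) > 1:
            problems.append(f"slots {group[0]}-{group[-1]} mix DIMM sizes or speeds")
    return problems


def total_memory_gb(slots: Dict[int, Dimm]) -> int:
    """Sum the installed capacity in GB."""
    return sum(size_gb for size_gb, _speed in slots.values())


if __name__ == "__main__":
    layout = {1: (4, 1066), 3: (4, 1066), 6: (8, 800), 8: (8, 800)}
    print(check_dimm_layout(layout), total_memory_gb(layout))
```

Different sizes in the two processor-support groups pass the check, because the rules only require matching DIMMs within a pair and within a group.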
Blade server control panel buttons and LEDs
Blade server control panel buttons and LEDs provide operational controls and status indicators.
Note: Figure 2 shows the control-panel door in the closed (normal) position. To access the power-control
button, you must open the control-panel door.
Figure 2. Blade server control panel buttons and LEDs
1 Media-tray select button: Press this button to associate the shared BladeCenter unit media tray
(removable-media drives and front-panel USB ports) with the blade server. The LED on the button flashes
while the request is being processed, then is lit when the ownership of the media tray has been
transferred to the blade server. It can take approximately 20 seconds for the operating system in the blade
server to recognize the media tray.
If there is no response when you press the media-tray select button, use the management module to
determine whether local control has been disabled on the blade server.
Note: The operating system in the blade server must provide USB support for the blade server to
recognize and use the removable-media drives and USB ports.
2 Information LED: When this amber LED is lit, it indicates that information about a system error for
the blade server has been placed in the management-module event log. The information LED can be
turned off through the Web interface of the management module or through IBM Director Console.
3 Blade-error LED: When this amber LED is lit, it indicates that a system error has occurred in the
blade server. The blade-error LED will turn off after one of the following events:
v Correcting the error
v Reseating the blade server in the BladeCenter unit
v Cycling the BladeCenter unit power
4 Power-control button: This button is behind the control panel door. Press this button to turn on or
turn off the blade server.
The power-control button has effect only if local power control is enabled for the blade server. Local
power control is enabled and disabled through the Web interface of the management module.
Press the power button for 5 seconds to begin powering down the blade server.
5 NMI reset (recessed): The nonmaskable interrupt (NMI) reset dumps the partition. Use this recessed
button only as directed by IBM Support.
6 Power-on LED: This green LED indicates the power status of the blade server in the following
manner:
v Flashing rapidly: The service processor is initializing the blade server.
v Flashing slowly: The blade server has completed initialization and is waiting for a power-on command.
v Lit continuously: The blade server has power and is turned on.
Note: The enhanced service processor can take as long as three minutes to initialize after you install the
BladeCenter PS700 blade server, at which point the LED begins to flash slowly.
7 Activity LED: When this green LED is lit, it indicates that there is activity on the hard disk drive or
network.
8 Location LED: When this blue LED is lit, it has been turned on by the system administrator to aid in
visually locating the blade server. The location LED can be turned off through the Web interface of the
management module or through IBM Director Console.
Turning on the blade server
After you connect the blade server to power through the BladeCenter unit, you can start the blade server
after the discovery and initialization process is complete.
You can start the blade server in any of the following ways.
v Start the blade server by pressing the power-control button on the front of the blade server.
The power-control button is behind the control panel door, as described in “Blade server control panel
buttons and LEDs” on page 5.
After you push the power-control button, the power-on LED continues to blink slowly for about 15
seconds, then is lit solidly when the power-on process is complete.
Wait until the power-on LED on the blade server flashes slowly before you press the blade server
power-control button. If the power-on LED is flashing rapidly, the service processor is initializing the
blade server. The power-control button does not respond during initialization.
Note: The enhanced service processor can take as long as three minutes to initialize after you install
the BladeCenter PS700 blade server, at which point the LED begins to flash slowly.
v Start the blade server automatically when power is restored after a power failure.
If a power failure occurs, the BladeCenter unit and then the blade server can start automatically when
power is restored. You must configure the blade server to restart through the management module.
v Start the blade server remotely using the management module.
After you initiate the power-on process, the power-on LED blinks slowly for about 15 seconds, then is
lit solidly when the power-on process is complete.
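The relationship between the power-on LED and the power-control button can be summarized as a small state table. The following Python sketch is illustrative only; the enum and function names are hypothetical, and the states are the three LED behaviors described for the power-on LED.

```python
from enum import Enum


class PowerOnLed(Enum):
    """Power-on LED behaviors described for the blade server control panel."""
    FLASHING_RAPIDLY = "service processor is initializing the blade server"
    FLASHING_SLOWLY = "initialization complete, waiting for a power-on command"
    LIT_CONTINUOUSLY = "blade server has power and is turned on"


def can_power_on(led: PowerOnLed, local_power_control: bool = True) -> bool:
    """Return True if pressing the power-control button should start the server.

    The button has no effect while the service processor is initializing or
    when local power control is disabled through the management module.
    """
    return local_power_control and led is PowerOnLed.FLASHING_SLOWLY


def advice(led: PowerOnLed, local_power_control: bool = True) -> str:
    """Return a short, human-readable next step (hypothetical wording)."""
    if led is PowerOnLed.LIT_CONTINUOUSLY:
        return "Blade server is already on."
    if led is PowerOnLed.FLASHING_RAPIDLY:
        return "Wait for the LED to flash slowly (can take up to three minutes)."
    if not local_power_control:
        return "Enable local power control or start the server remotely."
    return "Press the power-control button."


if __name__ == "__main__":
    for state in PowerOnLed:
        print(state.name, "->", advice(state))
```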
Turning off the blade server
When you turn off the blade server, it is still connected to power through the BladeCenter unit. The blade
server can respond to requests from the service processor, such as a remote request to turn on the blade
server. To remove all power from the blade server, you must remove it from the BladeCenter unit.
Shut down the operating system before you turn off the blade server. See the operating-system
documentation for information about shutting down the operating system.
You can turn off the blade server in one of the following ways.
v Turn off the blade server by pressing the power-control button for at least 5 seconds.
The power-control button is on the blade server behind the control panel door. See “Blade server
control panel buttons and LEDs” on page 5 for the location.
Note: The power-on LED can remain on solidly for up to 1 minute after you push the
power-control button. After you turn off the blade server, wait until the power-on LED is blinking
slowly before you press the power-control button to turn on the blade server again.
If the operating system stops functioning, press and hold the power-control button for more than 5
seconds to force the blade server to turn off.
v Use the management module to turn off the blade server.
The power-on LED can remain on solidly for up to 1 minute after you initiate the power-off
process. After you turn off the blade server, wait until the power-on LED is blinking slowly before
you initiate the power-on process from the management module to turn on the blade server again.
Use the management-module Web interface to configure the management module to turn off the blade
server if the system is not operating correctly.
For additional information, see the online documentation or the User's Guide for the management
module.
System-board layouts
Illustrations show the connectors and LEDs on the system board. The illustrations might differ slightly
from your hardware.
System-board connectors
Blade server components attach to the connectors on the system board.
Figure 3 shows the connectors on the base unit system board in the blade server.
Figure 3. PS700 system-board connectors
Table 2 shows connector descriptions.
Table 2. PS700 connectors
Callout  PS700 blade server connectors
1        Operator panel connector
2        DIMM 1-4 connectors (See Figure 4 on page 9 for individual connectors.) Expansion unit
9        DIMM 5-8 connectors (See Figure 4 on page 9 for individual connectors.)
10       3V lithium battery connector (P1-E1)
Figure 4 on page 9 shows individual DIMM connectors.
Figure 4. DIMM connectors. Base unit connectors
System-board LEDs
Use the illustration of the LEDs on the system board to identify a light emitting diode (LED).
Remove the blade server from the BladeCenter unit, open the cover, press the blue button to see any
error LEDs that were turned on during error processing, and use Figure 5 to identify the failing
component.
Figure 5 shows the locations of LEDs on the system board.
Table 3 shows LED descriptions.
Figure 5. LED locations on the system board of the PS700 blade server
Table 3. PS700 LEDs
Callout  Base unit LEDs
1        3V lithium battery LED
2        DIMM 1-4 LEDs
3        Management card LED
4        Light path power LED
5        System board LED
6        HDD1 LED
7        Interposer LED
8        CIOv (1Xe) expansion card connector LED
9        High-Speed (CFFh) expansion card connector LED
10       HDD2 LED
11       DIMM 5-8 LEDs
Chapter 2. Diagnostics
Use the available diagnostic tools to help solve any problems that might occur in the blade server.
The first and most crucial component of a solid serviceability strategy is the ability to accurately and
effectively detect errors when they occur. While not all errors are a threat to system availability, those that
go undetected are dangerous because the system does not have the opportunity to evaluate and act if
necessary. POWER7® processor-based systems are specifically designed with error-detection mechanisms
that extend from processor cores and memory to power supplies and hard drives.
POWER7 processor-based systems contain specialized hardware detection circuitry for detecting
erroneous hardware operations. Error checking hardware ranges from parity error detection coupled with
processor instruction retry and bus retry, to ECC correction on caches and system buses.
IBM hardware error checkers have these distinct attributes:
v Continuous monitoring of system operations to detect potential calculation errors
v Attempted isolation of physical faults based on runtime detection of each unique failure
v Initiation of a wide variety of recovery mechanisms designed to correct a problem
POWER7 processor-based systems include extensive hardware and firmware recovery logic.
Machine check handling
Machine checks are handled by firmware. When a machine check occurs, the firmware analyzes the error
to identify the failing device and creates an error log entry.
If the system degrades to the point that the service processor cannot reach standby state, the ability to
analyze the error does not exist. If the error occurs during POWER® hypervisor (PHYP) activities, the
PHYP initiates a system reboot.
In partitioned mode, an error that occurs during partition activity is reported to the operating system in
the partition.
Diagnostic tools
Tools are available to help you diagnose and solve hardware-related problems.
• Power-on self-test (POST) progress codes (checkpoints), error codes, and isolation procedures
The POST checks out the hardware at system initialization. IPL diagnostic functions test some system
components and interconnections. The POST generates eight-digit checkpoints to mark the progress of
powering up the blade server.
Use the management module to view progress codes.
The documentation of a progress code includes recovery actions for system hangs. See “POST progress
codes (checkpoints)” on page 84 for more information.
If the service processor detects a problem during POST, an error code is logged in the management
module event log. Error codes are also logged in the Linux syslog or AIX diagnostic log, if possible.
See “System reference codes (SRCs)” on page 16.
The service processor can generate codes that point to specific isolation procedures. See “Service
processor problems” on page 200.
• Light path diagnostics
Use the light path diagnostic LEDs on the system board to identify failing hardware. If the system
error LED on the system LED panel on the front or rear of the BladeCenter unit is lit, one or more
error LEDs on the BladeCenter unit components also might be lit.
Light path diagnostics help identify failing customer replaceable units (CRUs). CRU location codes are
included in error codes and the event log.
LED locations
See “System-board LEDs” on page 9.
Front panel
See “Blade server control panel buttons and LEDs” on page 5.
• Troubleshooting tables
Use the troubleshooting tables to find solutions to problems that have identifiable symptoms.
See “Troubleshooting tables” on page 191.
• Dump data collection
In some circumstances, an error might require a dump to show more data. The Integrated
Virtualization Manager (IVM) or Hardware Management Console (HMC) sets up a dump area. Specific
IVM or HMC information is included as part of the information that can optionally be sent to IBM
support for analysis.
See “Collecting dump data” on page 13 for more information.
• Stand-alone diagnostics
The AIX-based stand-alone diagnostics CD is in the ship package and is also available from the IBM
Web site. Boot the diagnostics from a CD drive or from an AIX network installation manager (NIM)
server if the blade server cannot boot to an operating system, no matter which operating system is
installed.
Functions provided by the stand-alone diagnostics include:
– Analysis of errors reported by platform, such as microprocessor and memory errors
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and Raid Manager
• Diagnostic utilities for the AIX operating system
If AIX is functioning, run AIX concurrent diagnostics instead of the stand-alone diagnostics. Functions
provided by disk-based AIX diagnostics include:
– Automatic error log analysis
– Analysis of errors reported by platform, such as microprocessor and memory errors
– Testing of resources, such as I/O adapters and devices
– Service aids, such as firmware update, format disk, and Raid Manager
• Diagnostic utilities for Linux operating systems
Linux on POWER service and productivity tools include hardware diagnostic aids and productivity
tools, and installation aids. The installation aids are provided in the IBM Installation Toolkit for Linux
on POWER, a set of tools that aids the installation of Linux on IBM servers with POWER architecture.
You can also use the tools to update the PS700 blade server firmware.
Diagnostic utilities for the Linux operating system are available from IBM at
https://www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html.
• Diagnostic utilities for other operating systems
You can use the stand-alone diagnostics CD to perform diagnostics on the PS700 blade server, no matter
which operating system is loaded on the blade server. However, other supported operating systems
might have diagnostic tools that are available through the operating system. See the documentation for
your operating system for more information.
Collecting dump data
A dump might be critical for fault isolation when the built-in First Failure Data Capture (FFDC)
mechanisms are not capturing sufficient fault data. Even when a fault is identified, dump data can
provide additional information that is useful in problem determination.
All hardware state information is part of the dump if a hardware checkstop occurs. When a checkstop
occurs, the service processor attempts to dump data that is necessary to analyze the error from
appropriate parts of the system.
Note: If you power off the blade through the management module while the service processor is
performing a dump, platform dump data is lost.
You might be asked to retrieve a dump to send it to IBM Support for analysis. The location of the dump
data varies by operating system.
• Collect an AIX dump from the /var/adm/platform directory.
• Collect a Linux dump from the /var/log/dump directory.
• Collect an Integrated Virtualization Manager (IVM) dump from the IVM-managed PS700 blade server
through the Manage Dumps task in the IVM console.
• To collect a system dump by using the Hardware Management Console (HMC), complete these steps:
1. Perform a controlled shutdown of all partitions.
Note: A system dump will abnormally terminate any running partitions.
2. In the navigation area, open Systems Management.
3. Select the server and open it.
4. Select Serviceability > Manage Dumps > Action > Initiate System Dump. The dump is
automatically saved to the HMC. For details on how to copy, report, or delete a dump after you
have completed a dump, see Managing dumps.
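The operating-system dump locations listed above can be captured in a small lookup. The following Python sketch is illustrative only: the function names `dump_directory` and `list_dump_files` and the dictionary keys are assumptions made for this example, and only the two directory paths come from this guide. IVM and HMC dumps are retrieved through their Manage Dumps tasks, so they have no entry here.

```python
from pathlib import Path
from typing import List

# Directories named in this guide; the keys are illustrative labels.
DUMP_DIRECTORIES = {
    "aix": Path("/var/adm/platform"),
    "linux": Path("/var/log/dump"),
}


def dump_directory(os_name: str) -> Path:
    """Return the directory where dump data is expected for an operating system."""
    key = os_name.strip().lower()
    if key not in DUMP_DIRECTORIES:
        raise ValueError(
            f"No file-system dump location is documented for {os_name!r}; "
            "use the IVM or HMC Manage Dumps task instead."
        )
    return DUMP_DIRECTORIES[key]


def list_dump_files(os_name: str) -> List[str]:
    """List file names in the dump directory, or an empty list if it is absent."""
    directory = dump_directory(os_name)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())
```

Calling `list_dump_files("linux")` on the blade server would return the names of any files found in /var/log/dump, which can then be sent to IBM Support.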
Location codes
Location codes identify components of the blade server. Location codes are displayed with some error
codes to identify the blade server component that is causing the error.
See “System-board connectors” on page 8 for component locations.
Notes:
1. Location codes do not indicate the location of the blade server within the BladeCenter unit. The codes
identify components of the blade server only.
2. For checkpoints with no associated location code, see “Light path diagnostics” on page 214 to identify
the failing component when there is a hang condition.
3. For checkpoints with location codes, use the following table to identify the failing component when
there is a hang condition.
4. For 8-digit codes not listed in Table 4, see the “Checkout procedure” on page 184.
Table 4. Location codes
Components                       Physical Location Code  CRU LED
Un location codes are for enclosure and VPD locations.
Un = Utttt.mmm.sssssss
tttt = system machine type
mmm = system model number
sssssss = system serial number
DIMM 1                           Un-P1-C1                Yes
DIMM 2                           Un-P1-C2                Yes
DIMM 3                           Un-P1-C3                Yes
DIMM 4                           Un-P1-C4                Yes
DIMM 5                           Un-P1-C5                Yes
DIMM 6                           Un-P1-C6                Yes
DIMM 7                           Un-P1-C7                Yes
DIMM 8                           Un-P1-C8                Yes
2.5" SAS HDD1                    Un-P1-D1                Yes
2.5" SAS HDD2                    Un-P1-D2                Yes
Management Card                  Un-P1-C9                Yes
Battery                          Un-P1-E1                Yes
PCIe High Speed Expansion Card   Un-P1-C12               Yes
1Xe Card                         Un-P1-C11               Yes
USB Port 1 (CDROM/FDD)           Un-P1-T1                No
USB Port 2 (CDROM/FDD)           Un-P1-T2                No
SAS controller                   Un-P1-T3                No
Ethernet HEA0_A                  Un-P1-T4                No
Ethernet HEA0_B                  Un-P1-T5                No
Machine Location Code            Utttt.mmm.sssssss       No
Um codes are for firmware. The format is the same as for a Un location code.
Um = Utttt.mmm.sssssss
Firmware version                 Um-Y1
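The Un = Utttt.mmm.sssssss layout in Table 4 can be decoded mechanically. The Python sketch below is a minimal illustration, not an IBM tool: the name `parse_location_code`, the assumption that each field contains only digits and uppercase letters, and the sample serial number are all hypothetical. The component names are a subset of Table 4.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: U + 4-char machine type + 3-char model + 7-char serial,
# optionally followed by a component suffix such as -P1-C1.
LOCATION_RE = re.compile(
    r"^U(?P<mtype>[0-9A-Z]{4})\.(?P<model>[0-9A-Z]{3})\.(?P<serial>[0-9A-Z]{7})"
    r"(?:-(?P<suffix>[0-9A-Z-]+))?$"
)

# Subset of Table 4, keyed by the part of the code that follows "Un-".
COMPONENTS = {
    "P1-C1": "DIMM 1",
    "P1-D1": '2.5" SAS HDD1',
    "P1-C9": "Management Card",
    "P1-E1": "Battery",
    "P1-T3": "SAS controller",
}


class LocationCode(NamedTuple):
    machine_type: str
    model: str
    serial: str
    suffix: Optional[str]
    component: Optional[str]


def parse_location_code(code: str) -> LocationCode:
    """Split a physical location code into its enclosure and component parts."""
    match = LOCATION_RE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"Not a Utttt.mmm.sssssss location code: {code!r}")
    suffix = match.group("suffix")
    component = COMPONENTS.get(suffix) if suffix else None
    return LocationCode(
        match.group("mtype"), match.group("model"), match.group("serial"),
        suffix, component,
    )
```

For example, `parse_location_code("U8406.70Y.1234567-P1-C1")` (with a made-up serial number) separates machine type 8406, model 70Y, and the P1-C1 suffix that Table 4 lists as DIMM 1.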
Reference codes
Reference codes are diagnostic aids that help you determine the source of a hardware or operating
system problem. To use reference codes effectively, use them in conjunction with other service and
support procedures.
The BladeCenter PS700 Type 8406 blade server produces several types of codes.
Progress codes: The power-on self-test (POST) generates eight-digit status codes that are known as
checkpoints or progress codes, which are recorded in the management-module event log. The checkpoints
indicate which blade server resource is initializing.
Error codes: The First Failure Data Capture (FFDC) error checkers capture fault data, which the service
processor then analyzes. For unrecoverable errors (UEs), for recoverable events that meet or exceed their
service thresholds, and for fatal system errors, an unrecoverable checkstop service event triggers the
service processor to analyze the error, log the system reference code (SRC), and turn on the system
attention LED.
The service processor logs the nine-word, eight-digit per word error code in the BladeCenter
management-module event log. Error codes are either system reference codes (SRCs) or service request
numbers (SRNs). A location code might also be included.
Isolation procedures: If the fault analysis does not determine a definitive cause, the service processor
might indicate a fault isolation procedure that you can use to isolate the failing component.
Viewing the codes
The PS700 blade server does not display checkpoints or error codes on the remote console. The shared
BladeCenter unit video also does not display the codes.
If the POST detects a problem, a 9-word, 8-digit error code is logged in the BladeCenter
management-module event log. A location code that identifies a component might also be included. See
“Error logs” on page 183 for information about viewing the management-module event log.
Service request numbers can be viewed using the AIX diagnostics CD, or various operating system
utilities, such as AIX diagnostics or the Linux service aid “diagela”, if it is installed.
System reference codes (SRCs)
System reference codes indicate a server hardware or software problem that can originate in hardware, in
firmware, or in the operating system.
A blade server component generates an error code when it detects a problem. An SRC identifies the
component that generated the error code and describes the error. Use the SRC information to identify a
list of possibly failing items and to find information about any additional isolation procedures.
The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event
log of the management module.
The first word of the SRC in this example is the message identifier, B7001111. This example numbers each
word after the first word to show relative word positions. The seventh word is the direct select address,
which is 77777777 in the example.
Table 5. Nine-word system reference code in the management-module event log
Index  Sev  Source    Date/Time             Text
1      E    Blade_05  01/21/2008, 17:15:14
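The word positions described above can be picked out programmatically. The following Python sketch assumes the nine words appear as whitespace-separated, eight-character alphanumeric strings; that layout, the function name `parse_nine_word_src`, and the sample string (built from the B7001111 and 77777777 values mentioned in the text) are assumptions for illustration.

```python
import re

WORD_RE = re.compile(r"^[0-9A-Z]{8}$")


def parse_nine_word_src(text: str) -> dict:
    """Return the message identifier and direct select address of a nine-word SRC."""
    words = text.upper().split()
    if len(words) != 9 or not all(WORD_RE.match(w) for w in words):
        raise ValueError("Expected nine 8-character alphanumeric words")
    return {
        "message_identifier": words[0],     # first word
        "direct_select_address": words[6],  # seventh word
        "words": words,
    }


# Sample layout assumed from the description in the text.
sample = ("B7001111 22222222 33333333 44444444 55555555 "
          "66666666 77777777 88888888 99999999")
parsed = parse_nine_word_src(sample)
```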
Depending on your operating system and the utilities you have installed, error messages might also be
stored in an operating system log. See the documentation that comes with the operating system for more
information.
The management module can display the most recent 32 SRCs and time stamps. Manually refresh the list
to update it.
Select Blade Service Data > blade_name in the management module to see a list of the 32 most recent
SRCs.
Table 6. Management module reference code listing
Unique ID  System Reference Code  Timestamp
00040001   D1513901               2005-11-13 19:30:20
00000016   D1513801               2005-11-13 19:30:16
Any message with more detail is highlighted as a link in the System Reference Code column. Click the
message to cause the management module to present the additional message detail:
D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
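If the additional message detail is copied out of the management module as text, it can be split into fields for logging or comparison. This Python sketch assumes the detail has the line-per-field shape shown above, with the SRC on the first line followed by `label: value` lines; the function name `parse_src_detail` is hypothetical.

```python
def parse_src_detail(text: str) -> dict:
    """Parse the SRC detail text into a dictionary of fields."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    detail = {"src": lines[0]}
    for line in lines[1:]:
        # Split at the first colon only, so time stamps stay intact.
        key, _, value = line.partition(":")
        detail[key.strip()] = value.strip()
    detail["hex_words"] = detail.get("Hex Words 2-5", "").split()
    return detail


sample_detail = """D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF"""
detail = parse_src_detail(sample_detail)
```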
SRC formats
SRCs are strings of either six or eight alphanumeric characters. The first two characters designate the
reference code type.
The first character indicates the type of error. In a few cases, the first two characters indicate the type of
error:
• 1xxxxxxx - System power control network (SPCN) error
• 6xxxxxxx - Virtual optical device error
• A1xxxxxx - Attention required (Service processor)
• AAxxxxxx - Attention required (Partition firmware)
• B1xxxxxx - Service processor error, such as a boot problem
• B6xxxxxx - Licensed Internal Code or hardware event error
• B9xxxxxx - Software installation error or IBM i IPL error. See "Recovering from IPL or system failures"
in the IBM i Information Center at
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha5_p5/iplprocedure.htm.
• BAxxxxxx - Partition firmware error
• Cxxxxxxx - Checkpoint (must hang to indicate an error)
• Dxxxxxxx - Dump checkpoint (must hang to indicate an error)
To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the
POWER7 Reference Code Lookup page at
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha8/codefinder.htm.
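The prefix rules in the list above lend themselves to a simple lookup. The Python sketch below is illustrative: the function name `classify_src` is hypothetical, the descriptions are taken from the list, and the choice to check two-character prefixes before one-character prefixes is an assumption about how to apply it.

```python
# Prefixes and descriptions taken from the SRC formats list.
TWO_CHAR_TYPES = {
    "A1": "Attention required (service processor)",
    "AA": "Attention required (partition firmware)",
    "B1": "Service processor error, such as a boot problem",
    "B6": "Licensed Internal Code or hardware event error",
    "B9": "Software installation error or IBM i IPL error",
    "BA": "Partition firmware error",
}
ONE_CHAR_TYPES = {
    "1": "System power control network (SPCN) error",
    "6": "Virtual optical device error",
    "C": "Checkpoint (must hang to indicate an error)",
    "D": "Dump checkpoint (must hang to indicate an error)",
}


def classify_src(src: str) -> str:
    """Return the error type for an SRC, or a fallback if the prefix is not listed."""
    code = src.strip().upper()
    if len(code) not in (6, 8) or not code.isalnum():
        raise ValueError("An SRC is six or eight alphanumeric characters")
    # Two-character prefixes are more specific, so they are checked first.
    return (
        TWO_CHAR_TYPES.get(code[:2])
        or ONE_CHAR_TYPES.get(code[:1])
        or "Not listed; use the POWER7 Reference Code Lookup page"
    )
```

With this lookup, `classify_src("D1513901")` reports a dump checkpoint and `classify_src("632CC000")` reports a virtual optical device error.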
1xxxyyyy SRCs
The 1xxxyyyy system reference codes are system power control network (SPCN) reference codes.
Look for the rightmost 4 characters (yyyy in 1xxxyyyy) in the error code; this is the reference code. Find
the reference code in Table 7.
Perform all actions before exchanging failing items.
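Extracting the reference code from a 1xxxyyyy SRC is a matter of taking the rightmost four characters. The Python sketch below shows that step with a few descriptions copied from Table 7; the function names and the sample SRC 11002610 (in particular its xxx portion) are hypothetical.

```python
# A few entries copied from Table 7; the full table has more.
SPCN_REFERENCE_CODES = {
    "00AC": "Informational message: AC loss was reported",
    "2610": "pGood fault",
    "2649": "Blade power fault",
    "8413": "Invalid processor 1 VPD",
}


def spcn_reference_code(src: str) -> str:
    """Return the yyyy portion of a 1xxxyyyy SRC."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("1"):
        raise ValueError("Expected an 8-character SRC that starts with 1")
    return code[-4:]


def describe_spcn_src(src: str) -> str:
    """Look up the Table 7 description for a 1xxxyyyy SRC, if it is in the excerpt."""
    reference = spcn_reference_code(src)
    return SPCN_REFERENCE_CODES.get(reference, "Not in this excerpt; see Table 7")
```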
Table 7. 1xxxyyyy SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
1xxxyyyy Error Codes  Description  Action
00AC  Informational message: AC loss was reported
      No action is required.
00AD  Informational message: A service processor reset caused the blade server to power off
      No action is required.
1F02  Informational message: The trace logs reached 1K of data.
      No action is required.
1F03  Informational message: Invalid TMS of location code.
      No action is required.
2600  Power good (pGood) master fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2610  pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2620  12V dc pGood input fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2629  1.5V reg_pgood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262B  1.8V reg_pgood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262C  5V reg_pgood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262D  3.3V reg_pgood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
262E  2.5V reg_pgood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2630  VRM CP0 core pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2632  VRM CP0 cache pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2647  12V "or-ing" FET short
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2648  Blade power latch fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2649  Blade power fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2670  The BladeCenter encountered a problem, and the blade server was automatically shut down as a result
      1. Check the management-module event log for entries that were made around the time that the PS700 blade server shut down.
      2. Resolve any problems that are found.
      3. Reboot the blade server.
      4. If the problem is not resolved, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2671  12V power fault in the blade server
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2672  Blades PEU3 voltage alert
      Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps:
      1. Reseat the PCIe expansion card.
      2. If the problem persists, replace the expansion card.
      3. If the problem persists, go to the “Checkout procedure” on page 184.
      4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
      The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
2675  1.1 Reg_CPU0_P5IO2C_Vio_pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2676  VTTA pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2677  VTTA pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2678  PROC_Vmem_controller_pGood 1.0V fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2679  Vmem_controller_pGood 1.5V reg fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
267A  HSDC/4xel_A0_pGood fault
      Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps:
      1. Reseat the PCIe expansion card.
      2. If the problem persists, replace the expansion card.
      3. If the problem persists, go to the “Checkout procedure” on page 184.
      4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
      The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
267B  HSDC/4xel_B0_pGood fault
      Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps:
      1. Reseat the PCIe expansion card.
      2. If the problem persists, replace the expansion card.
      3. If the problem persists, go to the “Checkout procedure” on page 184.
      4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
      The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
267C  REG_P5IO2C_core 1.2V pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
267D  2.0_PLL_pGood fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2710  pGood output/P7_VRM_PVID_gate fault
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
3120  VRM voltage adjustment failure
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
3134  Fault on the hardware monitoring chip
      Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps:
      1. Reseat the PCIe expansion card.
      2. If the problem persists, replace the expansion card.
      3. If the problem persists, go to the “Checkout procedure” on page 184.
      4. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
      The DTRCARD symbolic CRU isolation procedure is in “Service processor problems” on page 200.
8400  Invalid configuration decode
      1. Check for server firmware updates.
      2. Apply any available updates.
      3. If the problem persists:
         a. Go to “Checkout procedure” on page 184.
         b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8402  Unable to get VPD from the concentrator
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8413  Invalid processor 1 VPD
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8414  Invalid processor 2 VPD
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8423  No processor VPD was found
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
8480  Bad or missing memory controller VID
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
84A0  No backplane VPD was found
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6xxxyyyy SRCs
The 6xxxyyyy system reference codes are virtual optical reference codes.
Look for the rightmost 4 characters (yyyy in 6xxxyyyy) in the error code; this is the reference code. Find
the reference code in Table 8.
Table 8. 6xxxyyyy SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
6xxxyyyy Error Codes  Description  Action
632Byyyy codes are Network File System (NFS) virtual optical SRCs
632BCFC1  A virtual optical device cannot access the file containing the list of volumes.
      On this partition and on the Network File System server, verify that the proper file is specified and that the proper authority is granted.
632BCFC2  A non-recoverable error was detected while reading the list of volumes.
      Resolve any errors on the Network File System server.
632BCFC3  The data in the list of volumes is not valid.
      On the Network File System server, verify that the proper file is specified, that all files are entered correctly, that there are no blank lines, and that the character set used is valid.
632BCFC4  A virtual optical device cannot access the file containing the specified optical volume.
      On the Network File System server, verify that the proper file is specified in the list of volumes, and that the proper authority is granted.
632BCFC5  A non-recoverable error was detected while reading a virtual optical volume.
      Resolve any errors on the Network File System server.
632BCFC6  The file specified does not contain data that can be processed as a virtual optical volume.
      On the Network File System server, verify that all the files specified in the list of optical volumes are correct.
632BCFC7  A virtual optical device detected an error reported by the Network File System server that cannot be recovered.
      Resolve any errors on the Network File System server.
632BCFC8  A virtual optical device encountered a non-recoverable error.
      Install any available operating system updates.
632Cyyyy codes are virtual optical SRCs
632CC000  Informational system log entry only.
      No corrective action is required.
632CC002  Self-configuring SCSI device (SCSD) selection or reselection timeout occurred.
      Refer to the hosting partition for problem analysis.
632CC010  Undefined sense key returned by device.
      Refer to the hosting partition for problem analysis.
632CC020  Configuration error.
      Refer to the hosting partition for problem analysis.
632CC100  SCSD bus error occurred.
      Refer to the hosting partition for problem analysis.
632CC110  SCSD command timeout occurred.
      Refer to the hosting partition for problem analysis.
632CC210  Informational system log entry only.
      No corrective action is required.
632CC300  Media or device error occurred.
      Refer to the hosting partition for problem analysis.
632CC301  Media or device error occurred.
      Refer to the hosting partition for problem analysis.
632CC302  Media or device error occurred.
      Refer to the hosting partition for problem analysis.
632CC303  Media has an unknown format.
      No corrective action is required.
632CC333  Incompatible media.
      1. Verify that the disk has a supported format.
      2. If the format is supported, clean the disk and attempt the failing operation again.
      3. If the operation fails again with the same system reference code, ask your media source for a replacement disk.
632CC400  Physical link error detected by device.
      Refer to the hosting partition for problem analysis.
632CC402  An internal program error occurred.
      Install any available operating system updates.
632CCFF2  Informational system log entry only.
      No corrective action is required.
632CCFF4  Internal device error occurred.
      Refer to the hosting partition for problem analysis.
632CCFF6  Informational system log entry only.
      No corrective action is required.
632CCFF7  Informational system log entry only.
      No corrective action is required.
632CCFFE  Informational system log entry only.
      No corrective action is required.
632CFF3D  Informational system log entry only.
      No corrective action is required.
632CFF6D  Informational system log entry only.
      No corrective action is required.
A1xxyyyy service processor SRCs
An A1xxyyyy system reference code (SRC) is an attention code that offers information about a platform
or service processor dump or confirms a control panel function request. Take the steps in the Action
column only if the BladeSystem appears to hang on an attention code.
Table 9 shows A1xxyyyy SRCs.
Table 9. A1xxyyyy service processor SRCs
Attention code  Description  Action
A1xxyyyy  Attention code
      1. Go to “Checkout procedure” on page 184.
      2. Replace the system board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
A2xxyyyy Logical partition SRCs
An A2xxyyyy SRC is a logical partition reference code that is related to logical partitioning.
Table 10. A2xxyyyy Logical partition SRCs
Reference Code  Description  Action
A2xxyyyy  See the description for the B200yyyy error code with the same yyyy value.
      Perform the action described in the B200yyyy error code with the same yyyy value.
A2D03000  User-initiated immediate termination and MSD of a partition.
      No corrective action is required.
A2D03001  User-initiated RSCDUMP of RPA partition's PFW content.
      No corrective action is required.
A2D03002  User-initiated RSCDUMP of IBM i partition's SLIC bootloader and PFW content.
      No corrective action is required.
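The A2xxyyyy rule in Table 10 (use the B200yyyy entry with the same yyyy value, unless the code has its own row) can be expressed as a short lookup. This Python sketch is illustrative only: the function name `resolve_a2_src` is hypothetical, the sample code A2001234 is made up, and giving the specific Table 10 rows priority over the general rule is an assumption.

```python
# Rows that Table 10 lists explicitly; none requires corrective action.
SPECIFIC_A2_CODES = {
    "A2D03000": "User-initiated immediate termination and MSD of a partition.",
    "A2D03001": "User-initiated RSCDUMP of RPA partition's PFW content.",
    "A2D03002": "User-initiated RSCDUMP of IBM i partition's SLIC bootloader and PFW content.",
}


def resolve_a2_src(src: str) -> str:
    """Return guidance for an A2xxyyyy logical partition SRC."""
    code = src.strip().upper()
    if len(code) != 8 or not code.startswith("A2"):
        raise ValueError("Expected an 8-character A2xxyyyy code")
    if code in SPECIFIC_A2_CODES:
        return f"{code}: {SPECIFIC_A2_CODES[code]} No corrective action is required."
    # General rule: same yyyy value under the B200 prefix.
    return f"{code}: see the B200{code[4:]} entry and perform its action."
```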
A700yyyy Licensed internal code SRCs
An A700yyyy system reference code (SRC) is an error/event code that is related to licensed internal code.
Table 11. A700yyyy Licensed internal code SRCs
Reference Code  Description  Action
A700173C  Informational system log entry only.
      No corrective action is required.
A7003000  A user-initiated platform dump occurred.
      No service action required.
A7004700  Informational system log entry only.
      No corrective action is required.
A7004712  A problem occurred when initializing, reading, or using system VPD.
      Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.
A7004713  A problem occurred when initializing, reading, or using system VPD.
      Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.
A7004715  A problem occurred when initializing, reading, or using system VPD.
      Replace the management card, as described in “Removing the tier 2 management card” on page 255 and “Installing the tier 2 management card” on page 256.
Perform the action in the B7xxyyyy error code
with the same xxyyyy value.
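The cross-references from A2xxyyyy and A7xxyyyy codes to their B-prefixed counterparts can be sketched in a few lines of Python. This is a minimal illustration only: the function name `related_b_code` is hypothetical and not part of any IBM tool, and it assumes an SRC is written as eight hexadecimal characters. Codes that have their own table entry (for example A2D03000 or A7004712) should be looked up in the tables first; the mapping covers only the generic rule.

```python
import re

# Assumption: an SRC is written as eight hexadecimal characters, e.g. "A2001130".
_SRC_PATTERN = re.compile(r"^[0-9A-F]{8}$")


def related_b_code(src: str) -> str:
    """Return the B-prefixed code whose table entry holds the action.

    A2xxyyyy -> B200yyyy (same yyyy value)
    A7xxyyyy -> B7xxyyyy (same xxyyyy value)
    """
    code = src.strip().upper()
    if not _SRC_PATTERN.match(code):
        raise ValueError(f"not an 8-character hexadecimal SRC: {src!r}")
    if code.startswith("A2"):
        # Keep only the last four characters (yyyy) and prepend B200.
        return "B200" + code[4:]
    if code.startswith("A7"):
        # Keep the last six characters (xxyyyy) and prepend B7.
        return "B7" + code[2:]
    raise ValueError(f"no B-code cross-reference rule for {code}")


if __name__ == "__main__":
    print(related_b_code("A2001130"))
    print(related_b_code("A7000103"))
```

Input is normalized to upper case, so codes copied from a log in lower case are accepted as well.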
AA00E1A8 to AA260005 Partition firmware attention codes
AAxx attention codes provide information about the next target state for the platform firmware. These
codes might indicate that you need to perform an action.
Table 12 describes the partition firmware codes that might be displayed if the POST detects a problem.
Each message description includes a suggested action to correct the problem.
Table 12. AA00E1A8 to AA260005 Partition firmware attention codes
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.
Attention code: AA00E1A8
Description: The system is booting to the open firmware prompt.
Action: At the open firmware prompt, type dev /packages/gui obe and press Enter; then, type 1 to select SMS Menu.

Attention code: AA00E1A9
Description: The system is booting to the System Management Services (SMS) menus.
Action:
1. If the system or partition returns to the SMS menus after a boot attempt failed, use the SMS menus to check the progress indicator history for a BAxx xxxx error, which may indicate why the boot attempt failed. Follow the actions for that error code to resolve the boot problem.
2. Use the SMS menus to establish the boot list and restart the blade server.

Attention code: AA00E1B0
Description: Waiting for the user to select the language and keyboard. The menu should be visible on the console.
Action:
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
Attention code: AA00E1B1
Description: Waiting for the user to accept or decline the license.
Action:
1. Check for server firmware updates.
2. Apply any available updates.
3. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

Attention code: AA060007
Description: A keyboard was not found.
Action: Verify that a keyboard is attached to the USB port that is assigned to the partition.

Attention code: AA06000B
Description: The system or partition was not able to find an operating system on any of the devices in the boot list.
Action:
1. Use the SMS menus to modify the boot list so that it includes devices that have a known-good operating system and restart the blade server.
2. If the problem remains, go to “Boot problem resolution” on page 190.

Attention code: AA06000C
Description: The media in a device in the boot list was not bootable.
Action:
1. Replace the media in the device with known-good media or modify the boot list to boot from another bootable device.
2. If the problem remains, go to “Boot problem resolution” on page 190.

Attention code: AA06000D
Description: The media in the device in the bootlist was not found under the I/O adapter specified by the bootlist.
Action:
1. Verify that the media from which you are trying to boot is bootable or modify the boot list to boot from another bootable device.
2. If the problem remains, go to “Boot problem resolution” on page 190.

Attention code: AA06000E
Description: The adapter specified in the boot list is not present or is not functioning.
Action:
• For an AIX operating system:
1. Try booting the blade server from another bootable device; then, run AIX online diagnostics against the failing adapter.
2. If AIX cannot be booted from another device, boot the blade server using the stand-alone diagnostics CD or a NIM server; then, run diagnostics against the failing adapter.
• For a Linux operating system, boot the blade server using the stand-alone diagnostics CD or a NIM server; then, run diagnostics against the failing adapter.
Attention code: AA060011
Description: The firmware did not find an operating system image and at least one hard disk in the boot list was not detected by the firmware. The firmware is retrying the entries in the boot list.
Action: This might occur if a disk enclosure that contains the boot disk is not fully initialized or if the boot disk belongs to another partition. Verify that:
• The boot disk belongs to the partition from which you are trying to boot.
• The boot list in the SMS menus is correct.

Attention code: AA130013
Description: Bootable media is missing from a USB CD-ROM.
Action: Verify that a bootable CD is properly inserted in the CD or DVD drive and retry the boot operation.

Attention code: AA130014
Description: The media in a USB CD-ROM has changed.
Action:
1. Retry the operation.
2. Check for server firmware updates; then, install the updates if available and retry the operation.

Attention code: AA170210
Description: Setenv/$setenv parameter error - the name contains a null character.
Action:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

Attention code: AA170211
Description: Setenv/$setenv parameter error - the value contains a null character.
Action:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

Attention code: AA190001
Description: The hypervisor function to get or set the time-of-day clock reported an error.
Action:
1. Use the operating system to set the system clock.
2. If the problem persists, check for server firmware updates.
3. Install any available updates and retry the operation.

Attention code: AA260001
Description: Enter the Type Model Number (Must be 8 characters)
Action: Enter the machine type and model of the blade server at the prompt.

Attention code: AA260002
Description: Enter the Serial Number (Must be 7 characters)
Action: Enter the serial number of the blade server at the prompt.

Attention code: AA260003
Description: Enter System Unique ID (Must be 12 characters)
Action: Enter the system unique ID number at the prompt.

Attention code: AA260004
Description: Enter WorldWide Port Number (Must be 12 characters)
Action: Enter the worldwide port number of the blade server at the prompt.

Attention code: AA260005
Description: Enter Brand (Must be 2 characters)
Action: Enter the brand number of the blade server at the prompt.
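The AA260001 to AA260005 prompts each require a value of a fixed length. As a rough aid when preparing those values, the following Python sketch checks a value against the lengths given in the table. The names `EXPECTED_LENGTHS` and `check_prompt_value` are hypothetical, the sample values are placeholders, and the firmware may apply further rules beyond length that are not documented here.

```python
# Required lengths, taken from the AA260001-AA260005 rows of Table 12.
EXPECTED_LENGTHS = {
    "AA260001": ("Type Model Number", 8),
    "AA260002": ("Serial Number", 7),
    "AA260003": ("System Unique ID", 12),
    "AA260004": ("WorldWide Port Number", 12),
    "AA260005": ("Brand", 2),
}


def check_prompt_value(code: str, value: str) -> bool:
    """Return True if value has the length required by the given prompt code."""
    if code not in EXPECTED_LENGTHS:
        raise KeyError(f"{code} is not one of the AA26000x prompt codes")
    _field_name, required_length = EXPECTED_LENGTHS[code]
    return len(value) == required_length


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(check_prompt_value("AA260002", "1234567"))
    print(check_prompt_value("AA260005", "ABC"))
```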
Bxxxxxxx Service processor early termination SRCs
A Bxxxxxxx system reference code (SRC) is an error code that is related to an event or exception that
occurred in the service processor firmware.
To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the
POWER7 Reference Code Lookup page at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/ipha8/codefinder.htm.
Table 13 describes error codes that might occur if POST detects a problem. The description also includes
suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Solving undetermined
problems” on page 227.
Table 13. B181xxxx Service processor early termination SRCs
B181 xxxx Error Code  Description  Action
7200  Invalid boot request
7201  Service processor failure
7202  The permanent and temporary firmware sides are both marked invalid
7203  Error setting boot parameters
7204  Error reading boot parameters
7205  Boot code error
7206  Unit check timer was reset
7207  Error reading from NVRAM
7208  Error writing to NVRAM
7209  The service processor boot watchdog timer expired and forced the service processor to attempt a boot from the other firmware image in the service processor flash memory
720A  Power-off reset occurred. FipsDump should be analyzed: Possible software problem
Go to “Checkout procedure” on page 184.
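A B181xxxx code is the fixed prefix B181 followed by a four-character value from Table 13. The following Python sketch shows one way to split a code and look up its description. The dictionary contents are copied from Table 13; the names `B181_DESCRIPTIONS` and `describe_b181` are hypothetical and do not belong to any IBM utility.

```python
from typing import Optional

# Descriptions keyed by the last four characters of a B181xxxx SRC (Table 13).
B181_DESCRIPTIONS = {
    "7200": "Invalid boot request",
    "7201": "Service processor failure",
    "7202": "The permanent and temporary firmware sides are both marked invalid",
    "7203": "Error setting boot parameters",
    "7204": "Error reading boot parameters",
    "7205": "Boot code error",
    "7206": "Unit check timer was reset",
    "7207": "Error reading from NVRAM",
    "7208": "Error writing to NVRAM",
    "7209": "The service processor boot watchdog timer expired",
    "720A": "Power-off reset occurred",
}


def describe_b181(src: str) -> Optional[str]:
    """Return the Table 13 description for a B181xxxx SRC, or None if unlisted."""
    code = src.replace(" ", "").upper()
    if len(code) != 8 or not code.startswith("B181"):
        return None
    return B181_DESCRIPTIONS.get(code[4:])


if __name__ == "__main__":
    print(describe_b181("B181 7201"))
    print(describe_b181("B1817300"))
```

A return value of None means the code is not in Table 13; in that case the Reference Code Lookup page mentioned above is the place to look.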
B200xxxx Logical partition SRCs
A B200xxxx SRC is a logical partition reference code that is related to logical partitioning.
Table 14 describes system reference codes that might be displayed if system firmware detects a problem.
Suggested actions to correct the problem are also listed.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page
184 and “Solving undetermined problems” on page 227.
Table 14. B200xxxx Logical partition SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.
Reference Code: B2001130
Description: A problem occurred during the migration of a partition. You attempted to migrate a partition to a system that has a power or thermal problem. The migration will not continue.
Action: Look for and fix power or thermal problems and then retry the migration.

Reference Code: B2001131
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001132
Description: A problem occurred during the startup of a partition. A platform firmware error occurred while it was trying to allocate memory. The startup will not continue.
Action: Collect a platform dump and then go to “Isolating firmware problems” on page 218.

Reference Code: B2001133
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001134
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001140
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001141
Description: A problem occurred during the migration of a partition.
Action: Check for server firmware updates; then, install the updates if available.
Reference Code: B2001142
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001143
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001144
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001148
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001150
Description: During the startup of a partition, a partitioning configuration problem occurred.
Action: Go to “Verifying the partition configuration” on page 186.

Reference Code: B2001151
Description: A problem occurred during the migration of a partition. The migration of a partition did not complete.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2001170
Description: During the startup of a partition, a failure occurred due to a validation error.
Action: Go to “Verifying the partition configuration” on page 186.

Reference Code: B2001225
Description: A problem occurred during the startup of a partition. The partition attempted to start up prior to the platform fully initializing. Restart the partition after the platform has fully completed and the platform is not in standby mode.
Action: Restart the partition.

Reference Code: B2001230
Description: During the startup of a partition, a partitioning configuration problem occurred; the partition is lacking the necessary resources to start up.
Action: Go to “Verifying the partition configuration” on page 186.

Reference Code: B2001260
Description: A problem occurred during the startup of a partition. The partition could not start at the Timed Power On setting because the partition was not set to Normal.
Action: Set the partition to Normal.
Reference Code: B2001265
Description: The partition could not start up. An operating system Main Storage Dump startup was attempted with the startup side on D-mode, which is not a valid operating system startup scenario. The startup will be halted. This SRC can occur when a D-mode SLIC installation fails and attempts a Main Storage Dump.
Action: Correct the startup settings.

Reference Code: B2001266
Description: The partition could not start up. You are attempting to start up an operating system that is not supported.
Action: Install a supported operating system and restart the partition.

Reference Code: B2001280
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a configuration mismatch.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2001281
Description: A partition memory error occurred. The failed memory will no longer be used.
Action: Restart the partition.

Reference Code: B2001282
Description: A problem occurred during the startup of a partition.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2001320
Description: A problem occurred during the startup of a partition. No default load source was selected. The startup will attempt to continue, but there may not be enough information to find the correct load source.
Action: Configure a load source for the partition. Then restart the partition.

Reference Code: B2001321
Description: A problem occurred during the startup of a partition.
Action: Verify that the correct slot is specified for the load source. Then restart the partition.

Reference Code: B2001322
Description: In the partition startup, code failed during a check of the load source path.
Action: Verify that the path for the load source is specified

Reference Code: B2002048
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump startup did not complete due to a copy error.

Reference Code: B2002054
Description: A problem occurred during a partition Main Storage Dump. A mainstore dump IPL did not complete due to a configuration mismatch.
Reference Code: B200308C
Description: A problem occurred during the startup of a partition. The adapter type cannot be determined.
Action: Verify that a valid I/O Load Source is tagged.

Reference Code: B2003090
Description: A problem occurred during the startup of a partition.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2003110
Description: A problem occurred during the startup of a partition.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2003113
Description: A problem occurred during the startup of a partition.
Action: Look for B7xx xxxx errors and resolve them.

Reference Code: B2003114
Description: A problem occurred during the startup of a partition.
Action: Look for other errors and resolve them.

Reference Code: B2003120
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003123
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003125
Description: During the startup of a partition, the blade server firmware could not obtain a segment of main storage within the blade server to use for managing the creation of a partition.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2003128
Description: A problem occurred during the startup of a partition. A return code for an unexpected failure was returned when attempting to query the load source path.
Action: Look for and resolve B700 69xx errors.

Reference Code: B2003130
Description: A problem occurred during the startup of a partition.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2003135
Description: A problem occurred during the startup of a partition.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2003140
Description: A problem occurred during the startup of a partition. This is a configuration problem in the partition.
Action: Reconfigure the partition to include the intended load source path.

Reference Code: B2003141
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003142
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003143
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003144
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003145
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2003200
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2004158
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2004400
Description: A problem occurred during the startup of a partition.
Action: Check for server firmware updates; then, install the
Reference Code: B2005106
Description: A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue.
Action: Verify that there is sufficient memory available to start the partition as it is configured. If there is already enough memory, then go to “Isolating firmware problems” on page 218.

Reference Code: B2005109
Description: A problem occurred during the startup of a partition. There was a partition main storage dump problem. The startup will not continue.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2005114
Description: A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2005115
Description: A problem occurred during the startup of a partition. There was an error reading the partition main storage dump from the partition load source into main storage. The startup will attempt to continue.
Action: If the startup does not continue, look for and resolve other errors.

Reference Code: B2005117
Description: A problem occurred during the startup of a partition. A partition main storage dump has occurred but cannot be written to the load source device because a valid dump already exists.
Action: Use the Main Storage Dump Manager to rename or copy the current main storage dump.

Reference Code: B2005121
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The startup will not continue.
Action: Look for related errors in the "Product Activity Log" and fix any problems found. Use virtual control panel function 34 to retry the current Main Store Dump startup while the partition is still in the failed state.

Reference Code: B2005122
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2005123
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2005135
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Action: Look for other errors and resolve them.

Reference Code: B2005137
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Action: Look for other errors and resolve them.
Reference Code: B2005145
Description: A problem occurred during the startup of a partition. There was an error writing the partition main storage dump to the partition load source. The main store dump startup will continue.
Action: Look for other errors and resolve them.

Reference Code: B2005148
Description: A problem occurred during the startup of a partition. An error occurred while doing a main storage dump that would have caused another main storage dump. The startup will not continue.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2005149
Description: A problem occurred during the startup of a partition while doing a Firmware Assisted Dump that would have caused another Firmware Assisted Dump.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200514A
Description: A Firmware Assisted Dump did not complete due to a copy error.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200542A
Description: A Firmware Assisted Dump did not complete due to a read error.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200542B
Description: A Firmware Assisted Dump did not complete due to a copy error.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200543A
Description: A Firmware Assisted Dump did not complete due to a copy error.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200543B
Description: A Firmware Assisted Dump did not complete due to a copy error.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200543C
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200543D
Description: A Firmware Assisted Dump did not complete due to a copy error.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B2006006
Description: During the startup of a partition, a system firmware error occurred when the partition memory was being initialized; the startup will not continue.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B2006006
Description: A problem occurred during the startup of a partition. The partition could not reserve the memory required for IPL.
Action: Contact IBM support.

Reference Code: B2006012
Description: During the startup of a partition, the partition LID failed to completely load into the partition main storage area.
Action: Go to “Isolating firmware problems” on page 218.
Reference Code: B2008151
Description: System firmware detected an error.
Action: Use the Integrated Virtualization Manager (IVM) or management console to increase the Logical Memory Block (LMB) size, and to reduce the number of virtual devices for the partition.

Reference Code: B2008152
Description: No active system processors.
Action: Verify that processor resources are assigned to the partition.

Reference Code: B2008160
Description: A problem occurred during the migration of a partition.
Action: Contact IBM support.

Reference Code: B2008161
Description: A problem occurred during the migration of a partition.
Action: Contact IBM support.

Reference Code: B200A100
Description: A partition ended abnormally; the partition could not stay running and shut itself down.
Action:
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 218.

Reference Code: B200A101
Description: A partition ended abnormally; the partition could not stay running and shut itself down.
Action:
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 218.

Reference Code: B200A140
Description: A lower priority partition lost a usable processor to supply it to a higher priority partition with a bad processor.
Action: Evaluate the entire LPAR configuration. Adjust partition profiles with the new number of processors available in the system.

Reference Code: B200B07B
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200B215
Description: A problem occurred after a partition ended abnormally. There was a communications problem between this partition's service processor and the platform's service processor.
Action: Restart the platform.

Reference Code: B2005127
Description: Timeout occurred during a main store dump IPL.
Action: There was not enough memory available for the dump to complete before the timeout occurred. Retry the main store dump IPL, or else power on the partition normally.

Reference Code: B2D03001
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B2D03002
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200C1F0
Description: An internal system firmware error occurred during a partition shutdown or a restart.
Action: Go to “Isolating firmware problems” on page 218.

Reference Code: B200D150
Description: A partition ended abnormally; there was a communications problem between this partition and the code that handles resource allocation.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200E0AA
Description: A problem occurred during the power off of a partition.
Action: Go to “Isolating firmware problems” on page 218.
Reference Code: B200F001
Description: A problem occurred during the startup of a partition. An operation has timed out.
Action: Look for other errors and resolve them.

Reference Code: B200F003
Description: During the startup of a partition, the partition processor(s) did not start the firmware within the time-out window.
Action: Collect the partition dump information; then, go to “Isolating firmware problems” on page 218.

Reference Code: B200F004
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F005
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F006
Description: During the startup of a partition, the code load operation for the partition startup timed out.
Action:
1. Check the error logs and take the actions for the error codes that are found.
2. Go to “Isolating firmware problems” on page 218.

Reference Code: B200F007
Description: During a shutdown of the partition, a time-out occurred while trying to stop a partition.
Action: Check for server firmware updates; then, install the updates if available.

Reference Code: B200F008
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F009
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F00A
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F00B
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F00C
Description: Informational system log entry only.
Action: No corrective action is required.

Reference Code: B200F00D
Description: Informational system log entry only.
Action: No corrective action is required.
B700xxxx Licensed internal code SRCs
A B700xxxx system reference code (SRC) is an error code or event code that is related to licensed internal
code.
Table 15 describes the system reference codes that might be displayed if system firmware detects a
problem. Suggested actions to correct the problem are also listed.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page
184 and “Solving undetermined problems” on page 227.
Table 15. B700xxxx Licensed internal code SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs.
B700 xxxx error code: 0103
Description: System firmware detected a failure.
Action:
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0104
Description: System firmware failure. Machine check, undefined error occurred.
Action:
1. Check for server firmware updates.
2. Update the firmware.

B700 xxxx error code: 0105
Description: System firmware detected an error. More than one request to terminate the system was issued.
Action: Go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0106
Description: System firmware failure.
Action:
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0107
Description: System firmware failure. The system detected an unrecoverable machine check condition.
Action:
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0200
Description: System firmware has experienced a low storage condition.
Action: No immediate action is necessary. Continue running the system normally. At the earliest convenient time or service window, work with IBM Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0201
Description: System firmware detected an error.
Action: No immediate action is necessary. Continue running the system normally. At the earliest convenient time or service window, work with IBM Support to collect a platform dump and restart the system; then, go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0202
Description: Informational system log entry only.
Action: No corrective action is required.

B700 xxxx error code: 0302
Description: System firmware failure.
Action:
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.

B700 xxxx error code: 0441
Description: Service processor failure. The platform encountered an error early in the startup or termination process.
Action: Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.

B700 xxxx error code: 0443
Description: Service processor failure.
Action: Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
0601 Informational system log entry only. No corrective action is required.
Note: This code and associated data can be used to determine why the time of day for a partition was lost.
0602 System firmware detected an error condition.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on
page 218.
0611 There is a problem with the system hardware clock; the clock time is invalid.
Use the operating system to set the system clock.
0621 Informational system log entry only. No corrective action is required.
0641 System firmware detected an error.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on
page 218.
0650 System firmware detected an error.
Resource management was unable to
allocate main storage. A platform dump
was initiated.
1. Collect the event log.
2. Collect the platform dump data.
3. Collect the partition configuration
information.
4. Go to “Isolating firmware problems” on
page 218.
0651 The system detected an error in the system clock hardware.
Replace the system-board and chassis
assembly, as described in “Replacing the FRU
system-board and chassis assembly” on page
260.
0803 Informational system log entry only. No corrective action is required.
0804 Informational system log entry only. No corrective action is required.
0A00 Informational system log entry only. No corrective action is required.
0A01 Informational system log entry only. No corrective action is required.
0A10 Informational system log entry only. No corrective action is required.
1150 Informational system log entry only. No corrective action is required.
1151 Informational system log entry only. No corrective action is required.
1152 Informational system log entry only. No corrective action is required.
1160 Service processor failure
1. Go to “Isolating firmware problems” on
page 218.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
1161 Informational system log entry only. No corrective action is required.
1730 The VPD for the system is not what was expected at startup.
Replace the management card, as described in
“Removing the tier 2 management card” on
page 255 and “Installing the tier 2 management
card” on page 256.
1731 The VPD on a memory DIMM is not correct and the memory on the DIMM cannot be used, resulting in reduced memory.
Replace the MEMDIMM symbolic CRU, as described in “Service processor problems” on page 200.
1732 The VPD on a processor card is not correct and the processor card cannot be used, resulting in reduced processing power.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
1733 System firmware failure. The startup will not continue.
Look for and correct B1xxxxxx errors. If there are no serviceable B1xxxxxx errors, or if correcting the errors does not correct the problem, contact IBM support to reset the server firmware settings.
Attention: Resetting the server firmware settings results in the loss of all of the partition data that is stored on the service processor. Before continuing with this operation, manually record all settings that you intend to preserve.
The service processor reboots after IBM Support resets the server firmware settings.
If the problem persists, replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
173A A VPD collection overflow occurred.
1. Look for and resolve other errors.
2. If there are no other errors:
a. Update the firmware to the current level, as described in “Updating the firmware” on page 263.
b. You might also have to update the management module firmware to a compatible level.
173B A system firmware failure occurred during VPD collection.
Look for and correct other B1xxxxxx errors.
4091 Informational system log entry only. No corrective action is required.
4400 There is a platform dump to collect
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
4401 System firmware failure. The system firmware detected an internal problem.
Go to “Isolating firmware problems” on page 218.
4402 A system firmware error occurred while attempting to allocate the memory necessary to create a platform dump.
Go to “Isolating firmware problems” on page 218.
6971 PCI bus failure
1. Use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 to determine the failing component.
2. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6972 System bus error
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6973 System bus error
1. Use the “PCI expansion card (PIOCARD) problem isolation procedure” on page 194 to determine the failing component.
2. If the problem persists, replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6974 Informational system log entry only. No corrective action is required.
6978 Informational system log entry only. No corrective action is required.
6979 Informational system log entry only. No corrective action is required.
697C Connection from service processor to system processor failed.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6980 RIO, HSL or 12X controller failure
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6981 System bus error.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6984 Informational system log entry only. No corrective action is required.
6985 Remote I/O (RIO), high-speed link (HSL), or 12X loop status message.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6987 Remote I/O (RIO), high-speed link (HSL), or 12X connection failure.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6990 Service processor failure.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6991 System firmware failure
Go to “Isolating firmware problems” on page 218.
6993 Service processor failure
1. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
2. Go to “Isolating firmware problems” on page 218.
6994 Service processor failure.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
6995 Informational system log entry only. No corrective action is required.
69C2 Informational system log entry only. No corrective action is required.
69C3 Informational system log entry only. No corrective action is required.
69D9 Host Ethernet Adapter (HEA) failure.
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
69DA Informational system log entry only. No corrective action is required.
69DB System firmware failure.
1. Collect the platform dump information.
2. Go to “Isolating firmware problems” on page 218.
BAD1 The platform firmware detected an error.
Go to “Isolating firmware problems” on page 218.
BAD2 System firmware detected an error.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F103 System firmware failure
1. Collect the event log information.
2. Collect the platform dump information.
3. Go to “Isolating firmware problems” on page 218.
F104 Operating system error. System firmware terminated a partition.
Check the management-module event log for partition firmware error codes (especially BA00F104); then, take the appropriate actions for those error codes.
F106 System firmware detected an error
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
F107 System firmware detected an error.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F108 A firmware error caused the system to terminate.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F10A System firmware detected an error
Look for and correct B1xxxxxx errors.
F10B A processor resource has been disabled due to hardware problems
Replace the system-board and chassis assembly, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
F10C The platform LIC detected an internal problem performing Partition Mobility.
1. Collect the event log information.
2. Go to “Isolating firmware problems” on page 218.
F120 Informational system log entry only. No corrective action is required.
F130 Thermal Power Management Device firmware error was detected.
Check for server firmware updates; then, install the updates if available.
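When scanning a long event log, it can help to filter out the B700 xxxx codes that the table marks as informational before working through the rest. The following is a minimal sketch of that idea, not an IBM tool: the function name `triage_b700` and the set `INFORMATIONAL_SUFFIXES` are hypothetical, and the set holds only the informational codes listed in this part of the table, so a code that is absent from it still has to be looked up.

```python
# Hypothetical helper: separates informational B700 xxxx SRCs from the rest.
# The suffix set covers only the rows in this part of the table that read
# "Informational system log entry only. No corrective action is required."
INFORMATIONAL_SUFFIXES = frozenset({
    "0202", "0601", "0621", "0803", "0804", "0A00", "0A01", "0A10",
    "1150", "1151", "1152", "1161", "4091", "6974", "6978", "6979",
    "6984", "6995", "69C2", "69C3", "69DA", "F120",
})

HEX_DIGITS = set("0123456789ABCDEF")


def triage_b700(src: str) -> str:
    """Return 'informational' or 'look up action' for a B700 xxxx SRC."""
    # Accept both "B700 0621" and "B7000621", in any letter case.
    code = "".join(src.split()).upper()
    if len(code) != 8 or not code.startswith("B700"):
        raise ValueError(f"not a B700 xxxx SRC: {src!r}")
    suffix = code[4:]
    if not set(suffix) <= HEX_DIGITS:
        raise ValueError(f"suffix is not hexadecimal: {src!r}")
    if suffix in INFORMATIONAL_SUFFIXES:
        return "informational"
    # Everything else needs the Action column of the table.
    return "look up action"


if __name__ == "__main__":
    for example in ("B700 0621", "B7006990"):
        print(example, "->", triage_b700(example))
```

A result of "look up action" only means that the code is not in the informational set above; follow the Action column for that code in the listed order.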
BA000010 to BA400002 Partition firmware SRCs
The power-on self-test (POST) might display an error code that the partition firmware detects.
Table 16 describes error codes that might be displayed if POST detects a problem. The description also
includes suggested actions to correct the problem.
Note: For problems persisting after completing the suggested actions, see “Checkout procedure” on page
184 and “Solving undetermined problems” on page 227.
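To decide quickly whether a displayed SRC belongs to Table 16, the code can be compared numerically against the range in the table title. The sketch below assumes that SRCs are eight hexadecimal digits; the function name `in_table_16_range` is hypothetical. A value inside the range is not guaranteed to have a row in the table, so the check only indicates where to look.

```python
import string

# Range taken from the table title: BA000010 to BA400002.
TABLE_16_FIRST = 0xBA000010
TABLE_16_LAST = 0xBA400002


def in_table_16_range(src: str) -> bool:
    """Return True if src is an 8-hex-digit SRC within BA000010..BA400002."""
    code = src.strip().upper()
    if len(code) != 8 or any(c not in string.hexdigits for c in code):
        return False
    return TABLE_16_FIRST <= int(code, 16) <= TABLE_16_LAST


if __name__ == "__main__":
    for example in ("BA010001", "B7000201"):
        print(example, in_table_16_range(example))
```

Codes that fall outside the range, such as the B700 xxxx codes, are covered by their own tables.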
Table 16. BA000010 to BA400002 Partition firmware SRCs
• Follow the suggested actions in the order in which they are listed in the Action column until the problem is
solved. If an action solves the problem, then you can stop performing the remaining actions.
• See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Error code  Description  Action
BA000010 The device data structure is corrupted
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000020 Incompatible firmware levels were found
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000030 An lpevent communication failure occurred
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000031 An lpevent communication failure occurred
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA000032 The firmware failed to register the lpevent queues
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000034 The firmware failed to exchange capacity and allocate lpevents
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000038 The firmware failed to exchange virtual continuation events
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000040 The firmware was unable to obtain the RTAS code lid details
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA000050 The firmware was unable to load the RTAS code lid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000060 The firmware was unable to obtain the open firmware code lid details
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000070 The firmware was unable to load the open firmware code lid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000080 The user did not accept the license agreement
Accept the license agreement and restart the blade server.
If the problem persists:
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000081 Failed to get the firmware license policy
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000082 Failed to set the firmware license policy
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA000091 Unable to load a firmware code update module
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA00E820 An lpevent communication failure occurred
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E830 Failure when initializing ibm,event-scan
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E840 Failure when initializing PCI hot-plug
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E843 Failure when initializing the interface to AIX or Linux
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E850 Failure when initializing dynamic reconfiguration
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA00E860 Failure when initializing sensors
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA010000 There is insufficient information to boot the systems
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA010001 The client IP address is already in use by another network device
Verify that all of the IP addresses on the
network are unique; then, retry the operation.
BA010002 Cannot get gateway IP address
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010003 Cannot get server hardware address
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010004 Bootp failed
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010005 File transmission (TFTP) failed
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010006 The boot image is too large
Start up from another device with a bootable image.
BA010007 The device does not have the required device_type property.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010008 The device_type property for this device is not supported by the iSCSI initiator configuration specification.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA010009 The arguments specified for the ping function are invalid.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000A The itname parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs)
help provide iSCSI, which is supported by
iSCSI software device drivers on either AIX or
Linux. Verify that all of the iSCSI configuration
arguments on the operating system comply
with the configuration for the iSCSI Host Bus
Adapter (HBA), which is the iSCSI initiator.
BA01000B The ichapid parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000C The ichappw parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000D The iname parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000E The LUN specified is not valid.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA01000F The chapid parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA010010 The chappw parameter string exceeds the maximum length allowed.
The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the configuration for the iSCSI Host Bus Adapter (HBA), which is the iSCSI initiator.
BA010011 SET-ROOT-PROP could not find / (root) package
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA010013 The information in the error log entry for this SRC provides network trace data.
Informational message. No action is required.
BA010014 The information in the error log entry for this SRC provides network trace data.
Informational message. No action is required.
BA010015 The information in the error log entry for this SRC provides network trace data.
Informational message. No action is required.
BA010020 A trace entry addition failed because of a bad trace type.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012010 Opening the TCP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012011 TCP failed to read from the network
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012012 TCP failed to write to the network.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA012013 Closing TCP failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA017020 Failed to open the TFTP package
Verify that the Trivial File Transfer Protocol (TFTP) parameters are correct.
BA017021 Failed to load the TFTP file
Verify that the TFTP server and network connections are correct.
BA01B010 Opening the BOOTP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B011 BOOTP failed to read from the network
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B012 BOOTP failed to write to the network
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B013 The discover mode is invalid
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA01B014 Closing the BOOTP node failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01B015 The BOOTP discover server timed out
Perform the following actions that checkpoint CA00E174 describes:
1. Verify that:
• The bootp server is correctly configured; then, retry the operation.
• The network connections are correct; then, retry the operation.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D001 Opening the DHCP node failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D020 DHCP failed to read from the network
1. Verify that the network cable is connected,
and that the network is active.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA01D030 DHCP failed to write to the network
1. Verify that the network cable is connected,
and that the network is active.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
Chapter 2. Diagnostics57
BA01D040 The DHCP discover server timed out
1. Verify that the DHCP server has addresses
available.
2. Verify that the DHCP server configuration
file is not overly constrained. An
over-constrained file might prevent a server
from meeting the configuration requested
by the client.
3. Perform the following actions that
checkpoint CA00E174 describes:
a. Verify that:
v The bootp server is correctly
configured; then, retry the operation.
v The network connections are correct;
then, retry the operation.
b. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA01D050 DHCP::discover no good offer
DHCP discovery did not receive any DHCP
offers from the servers that meet the client
requirements.
Verify that the DHCP server configuration file
is not overly constrained. An over-constrained
file might prevent a server from meeting the
configuration requested by the client.
BA01D051 DHCP::discover DHCP request timed
out
DHCP discovery did receive a DHCP offer
from a server that met the client requirements,
but the server did not send the DHCP
acknowledgement (DHCP ack) to the client
DHCP request.
Another client might have used the address
that was served.
Verify that the DHCP server has addresses
available.
BA01D052 DHCP::discover: 10 incapable servers
were found
Ten DHCP servers have sent DHCP offers,
none of which met the requirements of the
client. Check the compatibility of the
configuration that the client is requesting and
the server DHCP configuration files.
BA01D053 DHCP::discover received a reply, but
without a message type
Verify that the DHCP server is properly
configured.
BA01D054 DHCP::discover: DHCP nak received
DHCP discovery did receive a DHCP offer
from a server that meets the client
requirements, but the server sent a DHCP not
acknowledged (DHCP nak) to the client DHCP
request.
Another client might be using the address that
was served.
This situation can occur when there are
multiple DHCP servers on the same network,
and server A does not know the subnet
configuration of server B, and vice versa.
This situation can also occur when the pool of
addresses is not truly divided.
Set the DHCP server configuration file to
"authoritative".
Verify that the DHCP server is functioning
properly.
BA01D055 DHCP::discover: DHCP decline
DHCP discovery did receive a DHCP offer
from one or more servers that meet the client
requirements. However, the client performed
an ARP test on the address and found that
another client was using the address.
The client sent a DHCP decline to the server,
but the client did not receive an additional
DHCP offer from a server. The client still does
not have a valid address.
Verify that the DHCP server is functioning
properly.
BA01D056 DHCP::discover: unknown DHCP
message
DHCP discovery received an unknown DHCP
message type. Verify that the DHCP server is
functioning properly.
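When triaging a failed network boot, it can help to keep the DHCP::discover SRCs in one small lookup. The sketch below copies the descriptions of BA01D050 through BA01D056 from this table. The dictionary and function names are illustrative assumptions and do not correspond to any IBM utility.

```python
# Descriptions taken from the BA01D05x rows of Table 16; the names used here
# are hypothetical and exist only for this sketch.
DHCP_DISCOVER_SRCS = {
    "BA01D050": "no good offer",
    "BA01D051": "DHCP request timed out",
    "BA01D052": "10 incapable servers were found",
    "BA01D053": "reply received without a message type",
    "BA01D054": "DHCP nak received",
    "BA01D055": "DHCP decline",
    "BA01D056": "unknown DHCP message",
}


def describe_dhcp_src(src: str) -> str:
    """Return the table description for a DHCP::discover SRC, ignoring case and padding."""
    code = src.strip().upper()
    return DHCP_DISCOVER_SRCS.get(code, "not a DHCP::discover SRC in this sketch")
```

For example, `describe_dhcp_src("ba01d054")` returns the nak description, which points to the checks listed for BA01D054.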
BA01D0FF Closing the DHCP node failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA030011 RTAS attempt to allocate memory failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA04000F Self test failed on device; no error or
location code information available
1. If a location code is identified with the
error, replace the device specified by the
location code.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA040010 Self test failed on device; can't locate
package.
1. If a location code is identified with the
error, replace the device specified by the
location code.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA040020 The machine type and model are not
recognized by the blade server
firmware.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA040030 The firmware was not able to build the
UID properly for this system. As a
result, problems may occur with the
licensing of the AIX operating system.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA040035 The firmware was unable to find the
“plant of manufacture” in the VPD. This
may cause problems with the licensing
of the AIX operating system.
Verify that the machine type, model, and serial
number are correct for this server. If this is a
new server, check for server firmware updates;
then, install the updates if available.
BA040040 Setting the machine type, model, and
serial number failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA040050 The h-call to switch off the boot
watchdog timer failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA040060 Setting the firmware boot side for the
next boot failed.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA050001 Failed to reboot a partition in logical
partition mode
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA050004 Failed to locate service processor device
tree node.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA05000A Failed to send boot failed message
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060008 No configurable adapters found by the
Remote IPL menu in the SMS utilities
This error occurs when the firmware cannot
locate any LAN adapters that are supported by
the remote IPL function. Verify that the devices
in the remote IPL device list are correct using
the SMS menus.
BA06000B The system was not able to find an
operating system on the devices in the
boot list.
Go to “Boot problem resolution” on page 190.
BA06000C A pointer to the operating system was
found in non-volatile storage.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060020 The environment variable “boot-device”
exceeded the allowed character limit.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060021 The environment variable “boot-device”
contained more than five entries.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
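SRCs BA060020, BA060021, and BA060022 all report limits on the “boot-device” environment variable. The sketch below shows how a candidate value could be pre-checked against the two limits that this table states (five entries, 255 characters per entry). It assumes that entries are separated by whitespace, and it treats the overall character limit behind BA060020 as a caller-supplied value, because this guide does not state that limit. The function and constant names are hypothetical.

```python
from typing import List, Optional

MAX_ENTRIES = 5          # BA060021: more than five entries is reported
MAX_ENTRY_LENGTH = 255   # BA060022: an entry longer than 255 characters is reported


def check_boot_device(value: str, max_total_length: Optional[int] = None) -> List[str]:
    """Return the SRCs that a candidate boot-device value might trigger.

    max_total_length is an assumption: the guide does not give the overall
    limit behind BA060020, so that check runs only when a limit is supplied.
    Entries are assumed to be separated by whitespace.
    """
    problems = []
    entries = value.split()
    if max_total_length is not None and len(value) > max_total_length:
        problems.append("BA060020")
    if len(entries) > MAX_ENTRIES:
        problems.append("BA060021")
    if any(len(entry) > MAX_ENTRY_LENGTH for entry in entries):
        problems.append("BA060022")
    return problems
```

An empty result means that none of the three conditions was detected by this sketch; it does not replace resetting the boot list through the SMS menus.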
BA060022 The environment variable “boot-device”
contained an entry that exceeded 255
characters in length
1. Using the SMS menus, set the boot list to
the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060030 Logical partitioning with shared
processors is enabled and the operating
system does not support it.
1. Install or boot a level of the operating
system that supports shared processors.
2. Disable logical partitioning with shared
processors in the operating system.
3. If the problem remains:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060060 The operating system expects an IOSP
partition, but it failed to make the
transition to alpha mode.
1. Verify that:
v The alpha-mode operating system image
is intended for this partition.
v The configuration of the partition
supports an alpha-mode operating
system.
2. If the problem remains:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060061 The operating system expects a
non-IOSP partition, but it failed to make
the transition to MGC mode.
1. Verify that:
v The alpha-mode operating system image
is intended for this partition.
v The configuration of the partition
supports an alpha-mode operating
system.
2. If the problem remains:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060070 The operating system does not support
this system's processor(s)
Boot a supported version of the operating
system.
BA060071 An invalid number of vectors was
received from the operating system
Boot a supported version of the operating
system.
BA060072 Client-arch-support hcall error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060075 Client-arch-support firmware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060200 Failed to set the operating system boot
list from the management module boot
list
1. Using the SMS menus, set the boot list to
the default boot list.
2. Shut down; then, start up the blade server.
3. Use SMS menus to customize the boot list
as required.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA060201 Failed to read the VPD "boot path" field
value
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060202 Failed to update the VPD with the new
"boot path" field value
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA060300 An I/O error on the adapter from which
the boot was attempted prevented the
operating system from being booted.
1. Using the SMS menus, select another
adapter from which to boot the operating
system, and reboot the system.
2. Attempt to reboot the system.
3. Go to “Boot problem resolution” on page
190.
BA07xxxx Self-configuring SCSI device (SCSD)
controller failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
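Some entries, such as BA07xxxx, cover a family of SRCs. The sketch below shows one way to test whether a reported SRC belongs to such a family. It assumes that each “x” stands for any single hexadecimal digit, which is an interpretation of the table notation and is not stated in this guide. The function name is hypothetical.

```python
HEX_DIGITS = "0123456789ABCDEF"


def src_matches(pattern: str, src: str) -> bool:
    """Match an SRC against a table pattern in which 'x' is assumed to be any hex digit."""
    pattern, src = pattern.upper(), src.upper()
    if len(pattern) != len(src):
        return False
    for p, s in zip(pattern, src):
        if p == "X":
            # Wildcard position: accept any hexadecimal digit.
            if s not in HEX_DIGITS:
                return False
        elif p != s:
            # Literal position: the characters must be identical.
            return False
    return True
```

Under that assumption, `src_matches("BA07xxxx", "BA070123")` is true, while an SRC such as BA090001 does not match and is looked up under its own row.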
BA090001 SCSD DASD: test unit ready failed;
hardware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090002 SCSD DASD: test unit ready failed;
sense data available
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090003 SCSD DASD: send diagnostic failed;
sense data available
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA090004 SCSD DASD: send diagnostic failed:
devofl cmd
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA09000A There was a vendor specification error.
1. Check the vendor specification for
additional information.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000B Generic SCSD sense error
1. Verify that the SCSD cables and devices are
properly plugged.
2. Correct any problems that are found.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000C The media is write-protected
1. Change the setting of the media to allow
writing, then retry the operation.
2. Insert new media of the correct type.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000D The media is unsupported or not
recognized.
1. Insert new media of the correct type.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000E The media is not formatted correctly.
1. Insert the media.
2. Insert new media of the correct type.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA09000F Media is not present
1. Insert new media with the correct format.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA090010 The request sense command failed.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA090011 The retry limit has been exceeded.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA090012 There is a SCSD device that is not
supported.
1. Replace the SCSD device that is not
supported with a supported device.
2. If the problem persists:
a. Troubleshoot the SCSD devices.
b. Verify that the SCSD cables and devices
are properly plugged. Correct any
problems that are found.
c. Replace the SCSD cables and devices.
d. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA120001 On an undetermined SCSD device, test
unit ready failed; hardware error
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120002 On an undetermined SCSD device, test
unit ready failed; sense data available
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120003 On an undetermined SCSD device, send
diagnostic failed; sense data available
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120004 On an undetermined SCSD device, send
diagnostic failed; devofl command
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA120010 Failed to generate the SAS device
physical location code. The event log
entry has the details.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130010 USB CD-ROM in the media tray: device
remained busy longer than the time-out
period
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130011 USB CD-ROM in the media tray:
execution of ATA/ATAPI command was
not completed within the allowed time.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130012 USB CD-ROM in the media tray:
execution of ATA/ATAPI command
failed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130013 USB CD-ROM in the media tray:
bootable media is missing from the
drive
1. Insert a bootable CD in the drive and retry
the operation.
2. If the problem persists:
a. Retry the operation.
b. Reboot the blade server.
c. Troubleshoot the media tray and
CD-ROM drive.
d. Replace the USB CD or DVD drive.
e. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA130014 USB CD-ROM in the media tray: the
media in the USB CD-ROM drive has
been changed.
1. Retry the operation.
2. Reboot the blade server.
3. Troubleshoot the media tray and CD-ROM
drive.
4. Replace the USB CD or DVD drive.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA130015 USB CD-ROM in the media tray:
ATA/ATAPI packet command execution
failed.
1. Remove the CD or DVD in the drive and
replace it with a known-good disk.
2. If the problem persists:
a. Retry the operation.
b. Reboot the blade server.
c. Troubleshoot the media tray and
CD-ROM drive.
d. Replace the USB CD or DVD drive.
e. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA131010 The USB keyboard has been removed.
1. Reseat the keyboard cable in the
management module USB port.
2. Check for server firmware updates; then,
install the updates if available.
BA140001 The SCSD read/write optical test unit
ready failed; hardware error.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA140002 The SCSD read/write optical test unit
ready failed; sense data available.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA140003 The SCSD read/write optical send
diagnostic failed; sense data available.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA140004 The SCSD read/write optical send
diagnostic failed; devofl command.
1. Troubleshoot the SCSD devices.
2. Verify that the SCSD cables and devices are
properly plugged. Correct any problems
that are found.
3. Replace the SCSD cables and devices.
4. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA150001 PCI Ethernet BNC/RJ-45 or PCI
Ethernet AUI/RJ-45 adapter: internal
wrap test failure
Replace the adapter specified by the location
code.
BA151001 10/100 Mbps Ethernet PCI adapter:
internal wrap test failure
Replace the adapter specified by the location
code.
BA151002 10/100 Mbps Ethernet card failure
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA153002 Gigabit Ethernet adapter failure
Verify that the MAC address programmed in
the FLASH/EEPROM is correct.
BA153003 Gigabit Ethernet adapter failure
1. Check for server firmware updates; then,
install the updates if available.
2. Replace the Gigabit Ethernet adapter.
BA154010 HEA software error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA154020 The required open firmware property
was not found.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154030 Invalid parameters were passed to the
HEA device driver.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154040 The TFTP package open failed
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154050 The transmit operation failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154060 Failed to initialize the HEA port or
queue
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA154070 The receive operation failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170000 NVRAMRC initialization failed; device
test failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA170100 NVRAM data validation check failed
1. Shut down the blade server; then, restart it.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170201 The firmware was unable to expand
target partition - saving configuration
variable
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170202 The firmware was unable to expand
target partition - writing event log entry
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170203 The firmware was unable to expand
target partition - writing VPD data
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170210 Setenv/$Setenv parameter error - name
contains a null character
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA170211 Setenv/$Setenv parameter error - value contains a null character
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA170220 Unable to write a variable value to NVRAM due to lack of free memory in NVRAM.
1. Reduce the number of partitions, if
possible, to add more NVRAM memory to
this partition.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA170221 Setenv/$setenv had to delete stored firmware network boot settings to free memory in NVRAM.
Enter the adapter and network parameters again for the network boot or network installation.
BA170998 NVRAMRC script evaluation error - command line execution error.
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180008 PCI device Fcode evaluation error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180009 The Fcode on a PCI adapter left a data stack imbalance
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then,
install the updates if available.
3. Check for server firmware updates; then,
install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA180010 PCI probe error, bridge in freeze state
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180011 PCI bridge probe error, bridge is not usable
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA180012 PCI device runtime error, bridge in freeze state
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180014 MSI software error
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180020 No response was received from a slot during PCI probing.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180099 PCI probe error; bridge in freeze state, slot in reset state
1. Reseat the PCI adapter card.
2. Check for adapter firmware updates; then, install the updates if available.
3. Check for server firmware updates; then, install the updates if available.
4. Replace the PCI adapter card.
5. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA180100 The FDDI adapter Fcode driver is not supported on this server.
IBM may produce a compatible driver in the future, but does not guarantee one.
BA180101 Stack underflow from fibre-channel adapter
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA190001 Firmware function to get/set time-of-day reported an error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA201001 The serial interface dropped data packets
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA201002 The serial interface failed to open
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA201003 The firmware failed to handshake properly with the serial interface
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210000 Partition firmware reports a default catch
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210001 Partition firmware reports a stack underflow was caught
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210002 Partition firmware was ready before standout was ready
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210003 A data storage error was caught by partition firmware
1. If the location code reported with the error points to an adapter, check for adapter firmware updates.
2. Apply any available updates.
3. Check for server firmware updates.
4. Apply any available updates.
5. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210004 An open firmware stack-depth assert failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210010 The transfer of control to the SLIC loader failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210011 The transfer of control to the IO Reporter failed
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210012 There was an NVRAMRC forced-boot problem; unable to load the previous boot's operating system image
1. Use the SMS menus to verify that the partition firmware can still detect the operating system image.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210013 There was a partition firmware error when in the SMS menus.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA210020 I/O configuration exceeded the maximum size allowed by partition firmware.
1. Increase the logical memory block size to
256 MB and restart the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210100 An error may not have been sent to the management module event log.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210101 The partition firmware event log queue is full
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA210102 There was a communication failure between partition firmware and the hypervisor. The lpevent that was expected from the hypervisor was not received.
1. Review the event log for errors that
occurred around the time of this error.
2. Correct any errors that are found and
reboot the blade server.
3. If the problem persists:
a. Reboot the blade server.
b. If the problem persists:
1) Go to “Checkout procedure” on
page 184.
2) Replace the system-board, as
described in “Replacing the FRU
system-board and chassis assembly”
on page 260.
BA210103 There was a communication failure between partition firmware and the hypervisor. There was a failing return code with the lpevent acknowledgement from the hypervisor.
1. Review the event log for errors that occurred around the time of this error.
2. Correct any errors that are found and reboot the blade server.
3. If the problem persists:
a. Reboot the blade server.
b. If the problem persists:
1) Go to “Checkout procedure” on page 184.
2) Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA220010 There was a partition firmware error during a USB hotplug probing. USB hotplug may not work properly on this partition.
1. Look for EEH-related errors in the event log.
2. Resolve any EEH event log entries that are found.
3. Correct any errors that are found and reboot the blade server.
4. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA220020 CRQ registration error; partner vslot may not be valid
Verify that this client virtual slot device has a valid server virtual slot device in a hosting partition.
BA278001 Failed to flash firmware: invalid image file
Download a new firmware update image and retry the update.
BA278002 Flash file is not designed for this platform
Download a new firmware update image and retry the update.
BA278003 Unable to lock the firmware update lid manager
1. Restart the blade server.
2. Verify that the operating system is authorized to update the firmware. If the system is running multiple partitions, verify that this partition has service authority.
BA278004 An invalid firmware update lid was requested
Download a new firmware update image and retry the update.
BA278005 Failed to flash a firmware update lid
Download a new firmware update image and retry the update.
BA278006 Unable to unlock the firmware update lid manager
Restart the blade server.
BA278007 Failed to reboot the system after a firmware flash update
Restart the blade server.
BA278009 The operating system's server firmware update management tools are incompatible with this system.
Go to the IBM download site at www14.software.ibm.com/webapp/set2/sas/f/lopdiags/home.html to download the latest version of the service aids package for Linux.
BA27800A The firmware installation failed due to a hardware error that was reported.
1. Look for hardware errors in the event log.
2. Resolve any hardware errors that are found.
3. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA280000 RTAS discovered an invalid operation that may cause a hardware error
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA290000 RTAS discovered an internal stack overflow
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
BA290001 RTAS low memory corruption was detected
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA290002 RTAS low memory corruption was detected
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA310010 Unable to obtain the SRC history
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA310020 An invalid SRC history was obtained.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA310030 Writing the MAC address to the VPD failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330000 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330001 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330002 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330003 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA330004 Memory allocation error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340001 There was a logical partition event communication failure reading the BladeCenter open fabric manager parameter data structure from the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340002 There was a logical partition event communication failure reading the BladeCenter open fabric manager location code mapping data from the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340003 An internal firmware error occurred; unable to allocate memory for the open fabric manager location code mapping data.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340004 An internal firmware error occurred; the open fabric manager parameter data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340005 An internal firmware error occurred; the location code mapping table was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340006 An LP event communication failure occurred reading the system initiator capability data from the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340007 An internal firmware error occurred; the open fabric manager system initiator capability data was corrupted.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340008 An internal firmware error occurred; the open fabric manager system initiator capability data version was not correct.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340009 An internal firmware error occurred; the open fabric manager system initiator capability processing encountered an unexpected error.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340010 An internal firmware error was detected during open fabric manager processing.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340011 Assignment of fabric ID to the I/O adapter failed.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page 184.
b. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
BA340020 A logical partition event communication failure occurred when writing the BladeCenter open fabric manager parameter data to the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA340021 A logical partition event communication failure occurred when writing the BladeCenter open fabric manager system initiator capabilities data to the service processor.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA400001 Informational message: DMA trace buffer full.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
BA400002 Informational message: DMA map-out size mismatch.
1. Reboot the blade server.
2. If the problem persists:
a. Go to “Checkout procedure” on page
184.
b. Replace the system-board, as described
in “Replacing the FRU system-board
and chassis assembly” on page 260.
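The table's rule that actions are tried in the listed order, stopping as soon as one solves the problem, can be sketched in a few lines of Python. This is only an illustration: the `ACTIONS` mapping holds three entries copied from Table 16, the `work_through` function and its `try_action` callback are hypothetical names, and falling back to the checkout procedure for an unlisted SRC is an assumption.

```python
# Action wording shortened from the Action column of Table 16.
REBOOT = "Reboot the blade server"
CHECKOUT = 'Go to "Checkout procedure"'
REPLACE = "Replace the system-board and chassis assembly"

# A small excerpt of the table; a real lookup would hold every SRC.
ACTIONS = {
    "BA154030": [REBOOT, CHECKOUT, REPLACE],
    "BA170211": [CHECKOUT, REPLACE],
    "BA400002": [REBOOT, CHECKOUT, REPLACE],
}

def work_through(src, try_action):
    """Perform the actions for `src` in order and stop at the first success.

    `try_action` is a hypothetical callback that carries out one action and
    returns True if the problem is solved. Returns the actions performed.
    """
    performed = []
    # Assumption: an SRC missing from the excerpt falls back to the checkout procedure.
    for action in ACTIONS.get(src.upper(), [CHECKOUT]):
        performed.append(action)
        if try_action(action):
            break
    return performed

# Example: the reboot does not help, but the checkout procedure does.
print(work_through("BA154030", lambda action: action == CHECKOUT))
```

Keeping the actions as an ordered list per SRC mirrors the Action column, so the remaining actions are skipped once one of them succeeds.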
POST progress codes (checkpoints)
When you turn on the blade server, the power-on self-test (POST) performs a series of tests to check the
operation of the blade server components. Use the management module to view progress codes that offer
information about the stages involved in powering on and performing an initial program load (IPL).
Progress codes do not indicate an error, although in some cases, the blade server can pause indefinitely
(hang). Progress codes for blade servers are 8-digit hexadecimal numbers that start with C or D.
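As a rough illustration of that format, the following Python sketch checks whether a string looks like a progress code. The function name `is_progress_code` is hypothetical, and the check assumes nothing beyond what the paragraph states: eight hexadecimal digits, the first being C or D.

```python
import re

# Assumption: a progress code is exactly 8 hexadecimal digits starting with C or D.
_PROGRESS_RE = re.compile(r"[CD][0-9A-F]{7}")

def is_progress_code(code: str) -> bool:
    """Return True if `code` has the shape of a POST progress code (checkpoint)."""
    return _PROGRESS_RE.fullmatch(code.strip().upper()) is not None

print(is_progress_code("C1001F00"))  # a service processor checkpoint
print(is_progress_code("BA154030"))  # a partition firmware SRC, not a progress code
```

A check like this only separates progress codes from other SRCs, such as the BA-prefixed error codes; it says nothing about whether the system is hung.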
Checkpoints are generated by various components. The baseboard management controller (BMC) service
processor and the partitioning firmware are key contributors. The service processor provides additional
isolation procedure codes for troubleshooting.
A checkpoint might have an associated location code as part of the message. The location code provides
information that identifies the failing component when there is a hang condition.
Notes:
1. For checkpoints with no associated location code, see “Light path diagnostics” on page 214 to identify
the failing component when there is a hang condition.
2. For checkpoints with location codes, see “Location codes” on page 14 to identify the failing
component when there is a hang condition.
3. For eight-digit codes not listed here, see “Checkout procedure” on page 184 for information.
The management module can display the 32 most recent SRCs and their time stamps. Select Blade Service
Data > blade_name in the management module to see the list, and manually refresh the list to update it.
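A list that keeps only the 32 most recent entries behaves like a bounded buffer. The sketch below models that behavior in Python with `collections.deque`; the `SrcHistory` class is hypothetical and is not how the management module is implemented.

```python
from collections import deque
from datetime import datetime

MAX_SRCS = 32  # the management module shows the 32 most recent SRCs

class SrcHistory:
    """Hypothetical model of a list that keeps only the newest SRCs."""

    def __init__(self, limit=MAX_SRCS):
        # A deque with maxlen drops the oldest entry when a new one arrives.
        self._entries = deque(maxlen=limit)

    def record(self, src, when):
        self._entries.append((when, src))

    def most_recent_first(self):
        return list(reversed(self._entries))

history = SrcHistory()
history.record("D1513901", datetime(2007, 11, 13, 19, 30, 20))
print(history.most_recent_first())
```

The practical consequence is that older SRCs scroll out of view once more than 32 have been logged.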
Any message with more detail is highlighted as a link in the System Reference Code column. Click the
message to cause the management module to present the additional message detail:
D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF
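When the detail text is copied out of the management module, it can be split into fields for logging or comparison. The following Python sketch parses the sample shown above; the function name `parse_src_detail` and the dictionary keys are hypothetical, and it assumes that the date and time are separated by a space and that the field labels appear exactly as in the sample.

```python
from datetime import datetime

SAMPLE = """D1513901
Created at: 2007-11-13 19:30:20
SRC Version: 0x02
Hex Words 2-5: 020110F0 52298910 C1472000 200000FF"""

def parse_src_detail(text):
    """Parse an SRC detail block into a dictionary (labels assumed as in SAMPLE)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    detail = {"src": lines[0]}  # the first line is the SRC itself
    for line in lines[1:]:
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Created at":
            detail["created_at"] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        elif key == "SRC Version":
            detail["src_version"] = int(value, 16)
        elif key.startswith("Hex Words"):
            detail["hex_words"] = [int(word, 16) for word in value.split()]
    return detail

print(parse_src_detail(SAMPLE))
```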
C1001F00 to C1645300 Service processor checkpoints
The C1xx progress codes, or checkpoints, offer information about the initialization of both the service
processor and the server. Service processor checkpoints are typical reference codes that occur during the
initial program load (IPL) of the server.
Table 18 lists the progress codes that might be displayed during the power-on self-test (POST), along with
suggested actions to take if the system hangs on the progress code. Only when you experience a hang
condition should you take any of the actions described for a progress code.
In the following progress codes, x can be any number or letter.
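To look up an observed checkpoint against table entries that contain the x placeholder, the placeholder can be treated as a single-character wildcard. The Python sketch below shows one way to do that; the function names are hypothetical, and it assumes that x is always written in lowercase in the table and that fixed digits are uppercase.

```python
import re

def pattern_to_regex(table_code):
    """Turn a table entry such as C1009x01 into a compiled regular expression."""
    # Assumption: a lowercase x stands for any single number or letter.
    parts = ["[0-9A-Z]" if ch == "x" else re.escape(ch) for ch in table_code]
    return re.compile("".join(parts))

def matches(table_code, observed):
    """Return True if the observed code fits the table entry."""
    return pattern_to_regex(table_code).fullmatch(observed.upper()) is not None

print(matches("C1009x01", "C1009A01"))
print(matches("C10010xx", "C1001042"))
print(matches("C1009x01", "C1009A02"))
```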
Table 18. C1001F00 to C1645300 checkpoints
v If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in
the Action column until the problem is solved. If an action solves the problem, you can stop performing the
remaining actions.
v See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which
components are FRUs.
Progress code    Description    Action
C10010xx Pre-standby
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001F00 Pre-standby: starting initial transition file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001F0D Pre-standby: discovery completed in initial transition file
While the blade server displays this checkpoint, the service processor reads the system vital product data (VPD). The service processor must complete reading the system VPD before the system displays the next progress code.
1. Wait at least 15 minutes for this checkpoint to change before you decide that the system is hung. Reading the system VPD might take as long as 15 minutes on systems with maximum configurations or many disk drives.
2. Go to “Checkout procedure” on page 184.
3. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001F0F Pre-standby: waiting for standby synchronization from initial transition file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1001FFF Pre-standby: completed initial transition file
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x01 Hardware object manager (HOM): the cancontinue flag is being cleared
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x02 Hardware object manager (HOM): erase HOM IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x04 Hardware object manager (HOM): build cards IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x08 Hardware object manager (HOM): build processors IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x0C Hardware object manager (HOM): build chips IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x10 Hardware object manager (HOM): initialize HOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x14 Hardware object manager (HOM): validate HOM
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x18 Hardware object manager (HOM): GARD in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x1C Hardware object manager (HOM): clock test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x20 Frequency control IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x24 Asset protection IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in “Replacing the FRU system-board and chassis assembly” on page 260.
C1009x28 Memory configuration IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x2C Processor CFAM initialization in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x30 Processor self-synchronization in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x34 Processor mask attentions being initialized
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x38 Processor check ring IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x39 Processor L2 line delete in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x3A Load processor gptr IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x3C Processor ABIST step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x40 Processor LBIST step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x44 Processor array initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x46 Processor AVP initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x48 Processor flush IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x4C Processor wiretest IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x50 Processor long scan IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x54 Start processor clocks IPL step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x58 Processor SCOM initialization step in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x5C Processor interface alignment procedure in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x5E Processor AVP L2 test case in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x60 Processor random data test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
C1009x64 Processor enable machine check test in progress
1. Go to “Checkout procedure” on page 184.
2. Replace the system-board, as described in
“Replacing the FRU system-board and
chassis assembly” on page 260.
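As a rough illustration of how a displayed progress code can be matched against the table rows, the following Python sketch looks up a code and returns its description. It assumes that the lowercase x in the Progress code column stands for any single hexadecimal digit; this part of the guide does not define the placeholder, so treat that as an assumption. The names CHECKPOINTS, pattern_to_regex, and describe are hypothetical, and only a few rows are included.

```python
import re

# A small subset of the Table 18 rows. Extend as needed.
CHECKPOINTS = {
    "C1009x28": "Memory configuration IPL step in progress",
    "C1009x3C": "Processor ABIST step in progress",
    "C1009x40": "Processor LBIST step in progress",
    "C1009x64": "Processor enable machine check test in progress",
}


def pattern_to_regex(pattern):
    """Turn a table entry such as 'C1009x28' into a compiled regex.

    Assumption: a lowercase 'x' is a placeholder for one hex digit.
    Every other character must match literally.
    """
    parts = ["[0-9A-F]" if ch == "x" else re.escape(ch) for ch in pattern]
    return re.compile("".join(parts))


def describe(code):
    """Return the description for a progress code, or None if no row matches."""
    normalized = code.strip().upper()
    for pattern, description in CHECKPOINTS.items():
        if pattern_to_regex(pattern).fullmatch(normalized):
            return description
    return None


if __name__ == "__main__":
    print(describe("C100913C"))
```

Every row in this part of the table lists the same two actions, so a lookup like this only identifies the IPL step on which the system stopped; the actions themselves are still performed in the order given in the Action column.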